[2024-06-21 13:59:45,691][15132] Saving configuration to /workspace/metta/train_dir/p2.dr6/config.json... [2024-06-21 13:59:45,725][15132] Rollout worker 0 uses device cpu [2024-06-21 13:59:45,726][15132] Rollout worker 1 uses device cpu [2024-06-21 13:59:45,726][15132] Rollout worker 2 uses device cpu [2024-06-21 13:59:45,727][15132] Rollout worker 3 uses device cpu [2024-06-21 13:59:45,727][15132] Rollout worker 4 uses device cpu [2024-06-21 13:59:45,728][15132] Rollout worker 5 uses device cpu [2024-06-21 13:59:45,728][15132] Rollout worker 6 uses device cpu [2024-06-21 13:59:45,729][15132] Rollout worker 7 uses device cpu [2024-06-21 13:59:45,729][15132] Rollout worker 8 uses device cpu [2024-06-21 13:59:45,730][15132] Rollout worker 9 uses device cpu [2024-06-21 13:59:45,730][15132] Rollout worker 10 uses device cpu [2024-06-21 13:59:45,730][15132] Rollout worker 11 uses device cpu [2024-06-21 13:59:45,731][15132] Rollout worker 12 uses device cpu [2024-06-21 13:59:45,731][15132] Rollout worker 13 uses device cpu [2024-06-21 13:59:45,732][15132] Rollout worker 14 uses device cpu [2024-06-21 13:59:45,733][15132] Rollout worker 15 uses device cpu [2024-06-21 13:59:45,733][15132] Rollout worker 16 uses device cpu [2024-06-21 13:59:45,733][15132] Rollout worker 17 uses device cpu [2024-06-21 13:59:45,733][15132] Rollout worker 18 uses device cpu [2024-06-21 13:59:45,733][15132] Rollout worker 19 uses device cpu [2024-06-21 13:59:45,734][15132] Rollout worker 20 uses device cpu [2024-06-21 13:59:45,734][15132] Rollout worker 21 uses device cpu [2024-06-21 13:59:45,734][15132] Rollout worker 22 uses device cpu [2024-06-21 13:59:45,734][15132] Rollout worker 23 uses device cpu [2024-06-21 13:59:45,735][15132] Rollout worker 24 uses device cpu [2024-06-21 13:59:45,735][15132] Rollout worker 25 uses device cpu [2024-06-21 13:59:45,735][15132] Rollout worker 26 uses device cpu [2024-06-21 13:59:45,735][15132] Rollout worker 27 uses device cpu [2024-06-21 13:59:45,735][15132] Rollout worker 28 uses device cpu [2024-06-21 13:59:45,736][15132] Rollout worker 29 uses device cpu [2024-06-21 13:59:45,736][15132] Rollout worker 30 uses device cpu [2024-06-21 13:59:45,736][15132] Rollout worker 31 uses device cpu [2024-06-21 13:59:46,328][15132] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-21 13:59:46,328][15132] InferenceWorker_p0-w0: min num requests: 10 [2024-06-21 13:59:46,410][15132] Starting all processes... [2024-06-21 13:59:46,410][15132] Starting process learner_proc0 [2024-06-21 13:59:46,643][15132] Starting all processes... [2024-06-21 13:59:46,645][15132] Starting process inference_proc0-0 [2024-06-21 13:59:46,646][15132] Starting process rollout_proc0 [2024-06-21 13:59:46,700][15132] Starting process rollout_proc1 [2024-06-21 13:59:46,702][15132] Starting process rollout_proc2 [2024-06-21 13:59:46,703][15132] Starting process rollout_proc3 [2024-06-21 13:59:46,703][15132] Starting process rollout_proc4 [2024-06-21 13:59:46,703][15132] Starting process rollout_proc5 [2024-06-21 13:59:46,703][15132] Starting process rollout_proc6 [2024-06-21 13:59:46,704][15132] Starting process rollout_proc7 [2024-06-21 13:59:46,705][15132] Starting process rollout_proc8 [2024-06-21 13:59:46,705][15132] Starting process rollout_proc9 [2024-06-21 13:59:46,705][15132] Starting process rollout_proc10 [2024-06-21 13:59:46,706][15132] Starting process rollout_proc11 [2024-06-21 13:59:46,707][15132] Starting process rollout_proc12 [2024-06-21 13:59:46,707][15132] Starting process rollout_proc13 [2024-06-21 13:59:46,707][15132] Starting process rollout_proc14 [2024-06-21 13:59:46,708][15132] Starting process rollout_proc15 [2024-06-21 13:59:46,720][15132] Starting process rollout_proc16 [2024-06-21 13:59:46,728][15132] Starting process rollout_proc17 [2024-06-21 13:59:46,736][15132] Starting process rollout_proc18 [2024-06-21 13:59:46,739][15132] Starting process rollout_proc19 [2024-06-21 13:59:46,748][15132] Starting process rollout_proc20 [2024-06-21 13:59:46,750][15132] Starting process rollout_proc21 [2024-06-21 13:59:46,759][15132] Starting process rollout_proc22 [2024-06-21 13:59:46,772][15132] Starting process rollout_proc23 [2024-06-21 13:59:46,772][15132] Starting process rollout_proc24 [2024-06-21 13:59:46,773][15132] Starting process rollout_proc25 [2024-06-21 13:59:46,778][15132] Starting process rollout_proc26 [2024-06-21 13:59:46,779][15132] Starting process rollout_proc27 [2024-06-21 13:59:46,779][15132] Starting process rollout_proc28 [2024-06-21 13:59:46,782][15132] Starting process rollout_proc29 [2024-06-21 13:59:46,786][15132] Starting process rollout_proc30 [2024-06-21 13:59:46,787][15132] Starting process rollout_proc31 [2024-06-21 13:59:48,666][15412] Worker 11 uses CPU cores [11] [2024-06-21 13:59:48,832][15369] Worker 0 uses CPU cores [0] [2024-06-21 13:59:48,940][15408] Worker 6 uses CPU cores [6] [2024-06-21 13:59:48,945][15406] Worker 2 uses CPU cores [2] [2024-06-21 13:59:48,948][15423] Worker 25 uses CPU cores [25] [2024-06-21 13:59:49,008][15426] Worker 22 uses CPU cores [22] [2024-06-21 13:59:49,009][15349] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-21 13:59:49,009][15349] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-21 13:59:49,015][15414] Worker 8 uses CPU cores [8] [2024-06-21 13:59:49,020][15407] Worker 9 uses CPU cores [9] [2024-06-21 13:59:49,023][15402] Worker 4 uses CPU cores [4] [2024-06-21 13:59:49,023][15349] Num visible devices: 1 [2024-06-21 13:59:49,036][15424] Worker 24 uses CPU cores [24] [2024-06-21 13:59:49,044][15432] Worker 30 uses CPU cores [30] [2024-06-21 13:59:49,044][15349] Setting fixed seed 0 [2024-06-21 13:59:49,045][15349] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-21 13:59:49,045][15349] Initializing actor-critic model on device cuda:0 [2024-06-21 13:59:49,052][15418] Worker 16 uses CPU cores [16] [2024-06-21 13:59:49,064][15417] Worker 15 uses CPU cores [15] [2024-06-21 13:59:49,067][15428] Worker 28 uses CPU cores [28] [2024-06-21 13:59:49,092][15422] Worker 18 uses CPU cores [18] [2024-06-21 13:59:49,106][15411] Worker 10 uses CPU cores [10] [2024-06-21 13:59:49,196][15427] Worker 23 uses CPU cores [23] [2024-06-21 13:59:49,200][15420] Worker 21 uses CPU cores [21] [2024-06-21 13:59:49,208][15430] Worker 26 uses CPU cores [26] [2024-06-21 13:59:49,215][15410] Worker 14 uses CPU cores [14] [2024-06-21 13:59:49,224][15404] Worker 1 uses CPU cores [1] [2024-06-21 13:59:49,225][15405] Worker 7 uses CPU cores [7] [2024-06-21 13:59:49,232][15431] Worker 31 uses CPU cores [31] [2024-06-21 13:59:49,240][15419] Worker 20 uses CPU cores [20] [2024-06-21 13:59:49,253][15401] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-21 13:59:49,253][15401] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-21 13:59:49,261][15401] Num visible devices: 1 [2024-06-21 13:59:49,284][15409] Worker 13 uses CPU cores [13] [2024-06-21 13:59:49,308][15403] Worker 3 uses CPU cores [3] [2024-06-21 13:59:49,308][15425] Worker 27 uses CPU cores [27] [2024-06-21 13:59:49,331][15429] Worker 29 uses CPU cores [29] [2024-06-21 13:59:49,335][15413] Worker 5 uses CPU cores [5] [2024-06-21 13:59:49,346][15421] Worker 19 uses CPU cores [19] [2024-06-21 13:59:49,349][15416] Worker 12 uses CPU cores [12] [2024-06-21 13:59:49,353][15415] Worker 17 uses CPU cores [17] [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,914][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,915][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,918][15349] RunningMeanStd input shape: (1,) [2024-06-21 13:59:49,918][15349] RunningMeanStd input shape: (1,) [2024-06-21 13:59:49,918][15349] RunningMeanStd input shape: (1,) [2024-06-21 13:59:49,919][15349] RunningMeanStd input shape: (1,) [2024-06-21 13:59:49,919][15349] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:49,958][15349] RunningMeanStd input shape: (1,) [2024-06-21 13:59:49,962][15349] Created Actor Critic model with architecture: [2024-06-21 13:59:49,962][15349] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-21 13:59:50,023][15349] Using optimizer [2024-06-21 13:59:50,208][15349] No checkpoints found [2024-06-21 13:59:50,208][15349] No checkpoints found, init from checkpoint train_dir/baseline.v0.5.4/checkpoint_p0/checkpoint_000349863_5732155392.pth [2024-06-21 13:59:50,223][15349] Loaded experiment state at self.train_step=0, self.env_steps=0 [2024-06-21 13:59:50,223][15349] Initialized policy 0 weights for model version 0 [2024-06-21 13:59:50,225][15349] LearnerWorker_p0 finished initialization! [2024-06-21 13:59:50,225][15349] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-21 13:59:50,986][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,987][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,988][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:50,991][15401] RunningMeanStd input shape: (1,) [2024-06-21 13:59:50,991][15401] RunningMeanStd input shape: (1,) [2024-06-21 13:59:50,991][15401] RunningMeanStd input shape: (1,) [2024-06-21 13:59:50,991][15401] RunningMeanStd input shape: (1,) [2024-06-21 13:59:50,991][15401] RunningMeanStd input shape: (11, 11) [2024-06-21 13:59:51,031][15401] RunningMeanStd input shape: (1,) [2024-06-21 13:59:51,053][15132] Inference worker 0-0 is ready! [2024-06-21 13:59:51,053][15132] All inference workers are ready! Signal rollout workers to start! [2024-06-21 13:59:53,389][15132] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 13:59:53,769][15424] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,811][15419] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,820][15410] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,828][15418] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,844][15421] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,866][15405] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,883][15412] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,885][15415] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,896][15417] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,914][15427] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,917][15408] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,920][15423] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,924][15409] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,940][15413] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,951][15428] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,952][15429] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,970][15422] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,981][15414] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,993][15407] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,994][15404] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,998][15369] Decorrelating experience for 0 frames... [2024-06-21 13:59:53,998][15432] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,001][15420] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,012][15426] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,015][15416] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,015][15411] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,018][15403] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,027][15406] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,029][15402] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,037][15430] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,061][15431] Decorrelating experience for 0 frames... [2024-06-21 13:59:54,081][15425] Decorrelating experience for 0 frames... [2024-06-21 13:59:55,012][15412] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,031][15410] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,057][15424] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,068][15418] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,080][15415] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,087][15427] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,100][15429] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,115][15421] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,124][15405] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,159][15413] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,193][15408] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,201][15369] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,209][15417] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,227][15402] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,234][15419] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,243][15406] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,266][15423] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,285][15403] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,288][15409] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,289][15426] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,290][15414] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,294][15422] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,309][15432] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,330][15407] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,337][15428] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,349][15411] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,351][15416] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,383][15404] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,401][15430] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,439][15431] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,446][15420] Decorrelating experience for 256 frames... [2024-06-21 13:59:55,553][15425] Decorrelating experience for 256 frames... [2024-06-21 13:59:58,391][15132] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 1031.8. Samples: 5160. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 14:00:19,368][15132] Heartbeat connected on RolloutWorker_w26 [2024-06-21 14:00:19,368][15132] Heartbeat connected on RolloutWorker_w24 [2024-06-21 14:00:19,368][15132] Heartbeat connected on RolloutWorker_w29 [2024-06-21 14:00:19,368][15132] Heartbeat connected on RolloutWorker_w30 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w28 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w20 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w5 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w12 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w11 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w22 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w10 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w13 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w14 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w6 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w19 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w9 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w31 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w27 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w25 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w23 [2024-06-21 14:00:19,369][15132] Heartbeat connected on RolloutWorker_w7 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w18 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w3 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w1 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w0 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w4 [2024-06-21 14:00:19,370][15132] Heartbeat connected on LearnerWorker_p0 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w16 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w2 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w17 [2024-06-21 14:00:19,370][15132] Heartbeat connected on RolloutWorker_w21 [2024-06-21 14:00:19,370][15132] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6296.2. Samples: 163580. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 14:00:19,371][15132] Heartbeat connected on Batcher_0 [2024-06-21 14:00:19,374][15132] Heartbeat connected on RolloutWorker_w8 [2024-06-21 14:00:19,374][15132] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6295.2. Samples: 163580. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 14:00:19,376][15132] Heartbeat connected on RolloutWorker_w15 [2024-06-21 14:00:19,376][15132] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6294.7. Samples: 163580. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 14:00:19,377][15132] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6294.6. Samples: 163580. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-21 14:00:19,387][15132] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-21 14:00:21,244][15412] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-21 14:00:21,331][15410] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-21 14:00:21,345][15413] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-21 14:00:21,356][15424] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-21 14:00:21,410][15415] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-21 14:00:21,437][15405] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-21 14:00:21,438][15404] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-21 14:00:21,457][15406] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-21 14:00:21,468][15408] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-21 14:00:21,485][15411] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-21 14:00:21,494][15429] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-21 14:00:21,499][15427] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-21 14:00:21,503][15349] Signal inference workers to stop experience collection... [2024-06-21 14:00:21,513][15401] InferenceWorker_p0-w0: stopping experience collection [2024-06-21 14:00:21,527][15418] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-21 14:00:22,063][15349] Signal inference workers to resume experience collection... [2024-06-21 14:00:22,063][15401] InferenceWorker_p0-w0: resuming experience collection [2024-06-21 14:00:22,082][15402] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-21 14:00:22,101][15409] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-21 14:00:22,102][15414] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-21 14:00:22,436][15419] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-21 14:00:22,436][15421] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-21 14:00:22,559][15416] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-21 14:00:22,609][15432] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-21 14:00:22,677][15403] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-21 14:00:22,685][15426] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-21 14:00:22,705][15417] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-21 14:00:22,705][15407] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-21 14:00:22,801][15430] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-21 14:00:22,815][15422] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-21 14:00:22,873][15423] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-21 14:00:23,051][15428] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-21 14:00:23,197][15431] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-21 14:00:23,259][15420] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-21 14:00:23,332][15425] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-21 14:00:23,389][15132] Fps is (10 sec: 40825.8, 60 sec: 5461.4, 300 sec: 5461.4). Total num frames: 163840. Throughput: 0: 9906.0. Samples: 297180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-21 14:00:23,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:23,390][15349] Saving new best policy, reward=0.000! [2024-06-21 14:00:23,403][15401] Updated weights for policy 0, policy_version 10 (0.0018) [2024-06-21 14:00:26,151][15404] Worker 1 awakens! [2024-06-21 14:00:28,389][15132] Fps is (10 sec: 18178.6, 60 sec: 4681.1, 300 sec: 4681.1). Total num frames: 163840. Throughput: 0: 9566.9. Samples: 334840. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-21 14:00:28,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:30,879][15406] Worker 2 awakens! [2024-06-21 14:00:33,389][15132] Fps is (10 sec: 1638.4, 60 sec: 4505.6, 300 sec: 4505.6). Total num frames: 180224. Throughput: 0: 8823.0. Samples: 352920. Policy #0 lag: (min: 0.0, avg: 6.4, max: 10.0) [2024-06-21 14:00:33,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:36,810][15403] Worker 3 awakens! [2024-06-21 14:00:38,390][15132] Fps is (10 sec: 3276.8, 60 sec: 4369.1, 300 sec: 4369.1). Total num frames: 196608. Throughput: 0: 8093.3. Samples: 364200. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-21 14:00:38,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:40,844][15402] Worker 4 awakens! [2024-06-21 14:00:43,389][15132] Fps is (10 sec: 6553.6, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 245760. Throughput: 0: 8853.6. Samples: 403560. Policy #0 lag: (min: 0.0, avg: 4.3, max: 13.0) [2024-06-21 14:00:43,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:44,883][15413] Worker 5 awakens! [2024-06-21 14:00:48,389][15132] Fps is (10 sec: 11468.9, 60 sec: 5659.9, 300 sec: 5659.9). Total num frames: 311296. Throughput: 0: 11033.4. Samples: 483760. Policy #0 lag: (min: 0.0, avg: 1.5, max: 4.0) [2024-06-21 14:00:48,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:48,697][15401] Updated weights for policy 0, policy_version 20 (0.0014) [2024-06-21 14:00:49,692][15408] Worker 6 awakens! [2024-06-21 14:00:53,389][15132] Fps is (10 sec: 16383.9, 60 sec: 6826.7, 300 sec: 6826.7). Total num frames: 409600. Throughput: 0: 10865.7. Samples: 533180. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2024-06-21 14:00:53,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 14:00:54,344][15405] Worker 7 awakens! [2024-06-21 14:00:57,504][15401] Updated weights for policy 0, policy_version 30 (0.0012) [2024-06-21 14:00:58,389][15132] Fps is (10 sec: 19660.8, 60 sec: 8465.2, 300 sec: 7813.9). Total num frames: 507904. Throughput: 0: 12323.5. Samples: 644360. Policy #0 lag: (min: 0.0, avg: 3.4, max: 7.0) [2024-06-21 14:00:58,390][15132] Avg episode reward: [(0, '0.157')] [2024-06-21 14:00:58,398][15349] Saving new best policy, reward=0.157! [2024-06-21 14:00:59,700][15414] Worker 8 awakens! [2024-06-21 14:01:03,389][15132] Fps is (10 sec: 19660.7, 60 sec: 13771.4, 300 sec: 8660.1). Total num frames: 606208. Throughput: 0: 13649.6. Samples: 764340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 32.0) [2024-06-21 14:01:03,390][15132] Avg episode reward: [(0, '0.157')] [2024-06-21 14:01:04,894][15407] Worker 9 awakens! [2024-06-21 14:01:05,535][15401] Updated weights for policy 0, policy_version 40 (0.0012) [2024-06-21 14:01:08,389][15132] Fps is (10 sec: 19660.8, 60 sec: 14373.3, 300 sec: 9393.5). Total num frames: 704512. Throughput: 0: 11932.4. Samples: 834140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 37.0) [2024-06-21 14:01:08,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-21 14:01:08,390][15349] Saving new best policy, reward=0.272! [2024-06-21 14:01:08,460][15411] Worker 10 awakens! [2024-06-21 14:01:11,326][15401] Updated weights for policy 0, policy_version 50 (0.0013) [2024-06-21 14:01:12,907][15412] Worker 11 awakens! [2024-06-21 14:01:13,389][15132] Fps is (10 sec: 26214.4, 60 sec: 16076.6, 300 sec: 10854.4). Total num frames: 868352. Throughput: 0: 14800.0. Samples: 1000840. Policy #0 lag: (min: 0.0, avg: 4.3, max: 8.0) [2024-06-21 14:01:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 14:01:13,398][15349] Saving new best policy, reward=0.537! [2024-06-21 14:01:17,158][15401] Updated weights for policy 0, policy_version 60 (0.0016) [2024-06-21 14:01:18,390][15132] Fps is (10 sec: 31129.5, 60 sec: 17213.3, 300 sec: 11950.7). Total num frames: 1015808. Throughput: 0: 18353.7. Samples: 1178840. Policy #0 lag: (min: 0.0, avg: 4.3, max: 11.0) [2024-06-21 14:01:18,390][15132] Avg episode reward: [(0, '0.148')] [2024-06-21 14:01:18,894][15416] Worker 12 awakens! [2024-06-21 14:01:22,375][15401] Updated weights for policy 0, policy_version 70 (0.0018) [2024-06-21 14:01:23,136][15409] Worker 13 awakens! [2024-06-21 14:01:23,390][15132] Fps is (10 sec: 31129.4, 60 sec: 16930.1, 300 sec: 13107.2). Total num frames: 1179648. Throughput: 0: 20227.1. Samples: 1274420. Policy #0 lag: (min: 1.0, avg: 4.1, max: 9.0) [2024-06-21 14:01:23,390][15132] Avg episode reward: [(0, '0.037')] [2024-06-21 14:01:27,059][15410] Worker 14 awakens! [2024-06-21 14:01:27,480][15401] Updated weights for policy 0, policy_version 80 (0.0021) [2024-06-21 14:01:28,390][15132] Fps is (10 sec: 34406.3, 60 sec: 19933.8, 300 sec: 14314.4). Total num frames: 1359872. Throughput: 0: 23766.6. Samples: 1473060. Policy #0 lag: (min: 0.0, avg: 5.0, max: 11.0) [2024-06-21 14:01:28,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-21 14:01:32,153][15401] Updated weights for policy 0, policy_version 90 (0.0022) [2024-06-21 14:01:33,116][15417] Worker 15 awakens! [2024-06-21 14:01:33,390][15132] Fps is (10 sec: 34406.5, 60 sec: 22391.4, 300 sec: 15237.1). Total num frames: 1523712. Throughput: 0: 26301.3. Samples: 1667320. Policy #0 lag: (min: 0.0, avg: 22.3, max: 92.0) [2024-06-21 14:01:33,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-21 14:01:36,624][15418] Worker 16 awakens! [2024-06-21 14:01:37,059][15401] Updated weights for policy 0, policy_version 100 (0.0029) [2024-06-21 14:01:38,390][15132] Fps is (10 sec: 32767.8, 60 sec: 24849.0, 300 sec: 16071.9). Total num frames: 1687552. Throughput: 0: 27561.7. Samples: 1773460. Policy #0 lag: (min: 0.0, avg: 6.4, max: 12.0) [2024-06-21 14:01:38,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 14:01:41,200][15415] Worker 17 awakens! [2024-06-21 14:01:41,829][15401] Updated weights for policy 0, policy_version 110 (0.0028) [2024-06-21 14:01:43,389][15132] Fps is (10 sec: 32768.2, 60 sec: 26760.5, 300 sec: 16830.8). Total num frames: 1851392. Throughput: 0: 29586.2. Samples: 1975740. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-06-21 14:01:43,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 14:01:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000114_1867776.pth... [2024-06-21 14:01:46,673][15401] Updated weights for policy 0, policy_version 120 (0.0031) [2024-06-21 14:01:47,288][15422] Worker 18 awakens! [2024-06-21 14:01:48,390][15132] Fps is (10 sec: 36044.7, 60 sec: 28945.0, 300 sec: 17808.7). Total num frames: 2048000. Throughput: 0: 31503.0. Samples: 2181980. Policy #0 lag: (min: 0.0, avg: 4.4, max: 12.0) [2024-06-21 14:01:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 14:01:51,599][15421] Worker 19 awakens! [2024-06-21 14:01:51,604][15401] Updated weights for policy 0, policy_version 130 (0.0024) [2024-06-21 14:01:53,389][15132] Fps is (10 sec: 36044.7, 60 sec: 30037.3, 300 sec: 18432.0). Total num frames: 2211840. Throughput: 0: 32485.8. Samples: 2296000. Policy #0 lag: (min: 0.0, avg: 4.6, max: 13.0) [2024-06-21 14:01:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 14:01:56,055][15401] Updated weights for policy 0, policy_version 140 (0.0026) [2024-06-21 14:01:56,244][15419] Worker 20 awakens! [2024-06-21 14:01:58,390][15132] Fps is (10 sec: 32768.2, 60 sec: 31129.6, 300 sec: 19005.4). Total num frames: 2375680. Throughput: 0: 33768.4. Samples: 2520420. Policy #0 lag: (min: 0.0, avg: 6.0, max: 14.0) [2024-06-21 14:01:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 14:01:59,138][15401] Updated weights for policy 0, policy_version 150 (0.0030) [2024-06-21 14:02:01,796][15420] Worker 21 awakens! [2024-06-21 14:02:03,390][15132] Fps is (10 sec: 36044.7, 60 sec: 32768.0, 300 sec: 19786.8). Total num frames: 2572288. Throughput: 0: 34775.6. Samples: 2743740. Policy #0 lag: (min: 0.0, avg: 6.9, max: 13.0) [2024-06-21 14:02:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 14:02:03,443][15349] Saving new best policy, reward=0.632! [2024-06-21 14:02:04,214][15401] Updated weights for policy 0, policy_version 160 (0.0029) [2024-06-21 14:02:05,911][15426] Worker 22 awakens! [2024-06-21 14:02:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 34406.4, 300 sec: 20510.3). Total num frames: 2768896. Throughput: 0: 35147.1. Samples: 2856040. Policy #0 lag: (min: 0.0, avg: 6.4, max: 15.0) [2024-06-21 14:02:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 14:02:08,399][15349] Saving new best policy, reward=0.728! [2024-06-21 14:02:08,645][15401] Updated weights for policy 0, policy_version 170 (0.0029) [2024-06-21 14:02:09,412][15427] Worker 23 awakens! [2024-06-21 14:02:12,945][15401] Updated weights for policy 0, policy_version 180 (0.0024) [2024-06-21 14:02:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 34679.5, 300 sec: 21065.1). Total num frames: 2949120. Throughput: 0: 35990.7. Samples: 3092640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 16.0) [2024-06-21 14:02:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 14:02:13,894][15424] Worker 24 awakens! [2024-06-21 14:02:16,624][15401] Updated weights for policy 0, policy_version 190 (0.0031) [2024-06-21 14:02:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 35498.7, 300 sec: 21694.7). Total num frames: 3145728. Throughput: 0: 36901.8. Samples: 3327900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 16.0) [2024-06-21 14:02:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 14:02:20,160][15423] Worker 25 awakens! [2024-06-21 14:02:21,182][15401] Updated weights for policy 0, policy_version 200 (0.0030) [2024-06-21 14:02:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 36591.0, 300 sec: 22500.7). Total num frames: 3375104. Throughput: 0: 37192.5. Samples: 3447120. Policy #0 lag: (min: 1.0, avg: 7.8, max: 17.0) [2024-06-21 14:02:23,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 14:02:24,776][15430] Worker 26 awakens! [2024-06-21 14:02:24,982][15401] Updated weights for policy 0, policy_version 210 (0.0039) [2024-06-21 14:02:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 36591.0, 300 sec: 22937.6). Total num frames: 3555328. Throughput: 0: 38108.9. Samples: 3690640. Policy #0 lag: (min: 0.0, avg: 7.8, max: 18.0) [2024-06-21 14:02:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 14:02:28,966][15401] Updated weights for policy 0, policy_version 220 (0.0031) [2024-06-21 14:02:29,992][15425] Worker 27 awakens! [2024-06-21 14:02:33,334][15401] Updated weights for policy 0, policy_version 230 (0.0029) [2024-06-21 14:02:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 37410.1, 300 sec: 23552.0). Total num frames: 3768320. Throughput: 0: 39140.9. Samples: 3943320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 18.0) [2024-06-21 14:02:33,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 14:02:34,400][15428] Worker 28 awakens! [2024-06-21 14:02:36,763][15401] Updated weights for policy 0, policy_version 240 (0.0031) [2024-06-21 14:02:37,532][15429] Worker 29 awakens! [2024-06-21 14:02:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 38229.3, 300 sec: 24129.1). Total num frames: 3981312. Throughput: 0: 39241.2. Samples: 4061860. Policy #0 lag: (min: 0.0, avg: 82.1, max: 242.0) [2024-06-21 14:02:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 14:02:40,900][15401] Updated weights for policy 0, policy_version 250 (0.0033) [2024-06-21 14:02:43,334][15432] Worker 30 awakens! [2024-06-21 14:02:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 38775.5, 300 sec: 24576.0). Total num frames: 4177920. Throughput: 0: 39784.9. Samples: 4310740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 14:02:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 14:02:44,442][15401] Updated weights for policy 0, policy_version 260 (0.0030) [2024-06-21 14:02:48,390][15132] Fps is (10 sec: 42595.3, 60 sec: 39321.1, 300 sec: 25184.4). Total num frames: 4407296. Throughput: 0: 40350.8. Samples: 4559560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 14:02:48,391][15132] Avg episode reward: [(0, '0.285')] [2024-06-21 14:02:48,608][15431] Worker 31 awakens! [2024-06-21 14:02:48,874][15401] Updated weights for policy 0, policy_version 270 (0.0033) [2024-06-21 14:02:52,543][15401] Updated weights for policy 0, policy_version 280 (0.0040) [2024-06-21 14:02:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 39867.8, 300 sec: 25577.2). Total num frames: 4603904. Throughput: 0: 40595.2. Samples: 4682820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:02:53,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-21 14:02:56,342][15349] Signal inference workers to stop experience collection... (50 times) [2024-06-21 14:02:56,344][15349] Signal inference workers to resume experience collection... (50 times) [2024-06-21 14:02:56,387][15401] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-21 14:02:56,387][15401] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-21 14:02:56,478][15401] Updated weights for policy 0, policy_version 290 (0.0036) [2024-06-21 14:02:58,390][15132] Fps is (10 sec: 40963.3, 60 sec: 40687.0, 300 sec: 26037.3). Total num frames: 4816896. Throughput: 0: 41010.2. Samples: 4938100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:02:58,396][15132] Avg episode reward: [(0, '0.211')] [2024-06-21 14:03:00,296][15401] Updated weights for policy 0, policy_version 300 (0.0033) [2024-06-21 14:03:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 26386.9). Total num frames: 5013504. Throughput: 0: 41518.6. Samples: 5196240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 14:03:03,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 14:03:04,104][15401] Updated weights for policy 0, policy_version 310 (0.0038) [2024-06-21 14:03:07,979][15401] Updated weights for policy 0, policy_version 320 (0.0038) [2024-06-21 14:03:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 26886.6). Total num frames: 5242880. Throughput: 0: 41604.9. Samples: 5319340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 14:03:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 14:03:12,087][15401] Updated weights for policy 0, policy_version 330 (0.0024) [2024-06-21 14:03:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 27197.4). Total num frames: 5439488. Throughput: 0: 41812.0. Samples: 5572180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:03:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 14:03:15,615][15401] Updated weights for policy 0, policy_version 340 (0.0034) [2024-06-21 14:03:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 27573.1). Total num frames: 5652480. Throughput: 0: 41932.1. Samples: 5830260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 14:03:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 14:03:19,732][15401] Updated weights for policy 0, policy_version 350 (0.0046) [2024-06-21 14:03:23,222][15401] Updated weights for policy 0, policy_version 360 (0.0030) [2024-06-21 14:03:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 28086.8). Total num frames: 5898240. Throughput: 0: 42053.8. Samples: 5954280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:03:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 14:03:27,537][15401] Updated weights for policy 0, policy_version 370 (0.0036) [2024-06-21 14:03:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 28271.9). Total num frames: 6078464. Throughput: 0: 42074.2. Samples: 6204080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 14:03:28,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 14:03:30,949][15401] Updated weights for policy 0, policy_version 380 (0.0038) [2024-06-21 14:03:33,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 28523.0). Total num frames: 6275072. Throughput: 0: 42138.9. Samples: 6455780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 14:03:33,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 14:03:35,283][15401] Updated weights for policy 0, policy_version 390 (0.0032) [2024-06-21 14:03:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 28908.7). Total num frames: 6504448. Throughput: 0: 42238.2. Samples: 6583540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 14:03:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 14:03:39,174][15401] Updated weights for policy 0, policy_version 400 (0.0030) [2024-06-21 14:03:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 29135.0). Total num frames: 6701056. Throughput: 0: 42199.1. Samples: 6837060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 14:03:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 14:03:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000409_6701056.pth... [2024-06-21 14:03:43,677][15401] Updated weights for policy 0, policy_version 410 (0.0037) [2024-06-21 14:03:46,823][15401] Updated weights for policy 0, policy_version 420 (0.0030) [2024-06-21 14:03:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.8, 300 sec: 29421.5). Total num frames: 6914048. Throughput: 0: 41961.5. Samples: 7084500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 14:03:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 14:03:51,518][15401] Updated weights for policy 0, policy_version 430 (0.0042) [2024-06-21 14:03:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 29696.0). Total num frames: 7127040. Throughput: 0: 42098.3. Samples: 7213760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-21 14:03:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-21 14:03:53,453][15349] Saving new best policy, reward=0.751! [2024-06-21 14:03:54,521][15401] Updated weights for policy 0, policy_version 440 (0.0031) [2024-06-21 14:03:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 29959.0). Total num frames: 7340032. Throughput: 0: 42113.8. Samples: 7467400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 14:03:58,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 14:03:59,171][15401] Updated weights for policy 0, policy_version 450 (0.0038) [2024-06-21 14:04:02,881][15401] Updated weights for policy 0, policy_version 460 (0.0040) [2024-06-21 14:04:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 30212.1). Total num frames: 7553024. Throughput: 0: 41787.1. Samples: 7710680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:04:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 14:04:06,783][15401] Updated weights for policy 0, policy_version 470 (0.0042) [2024-06-21 14:04:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.3, 300 sec: 30455.0). Total num frames: 7766016. Throughput: 0: 41892.6. Samples: 7839440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 14:04:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 14:04:10,563][15401] Updated weights for policy 0, policy_version 480 (0.0043) [2024-06-21 14:04:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 30688.5). Total num frames: 7979008. Throughput: 0: 41991.9. Samples: 8093720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 14:04:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 14:04:14,585][15401] Updated weights for policy 0, policy_version 490 (0.0030) [2024-06-21 14:04:18,383][15401] Updated weights for policy 0, policy_version 500 (0.0030) [2024-06-21 14:04:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 30913.2). Total num frames: 8192000. Throughput: 0: 41827.2. Samples: 8338000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:04:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 14:04:22,366][15401] Updated weights for policy 0, policy_version 510 (0.0038) [2024-06-21 14:04:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 31008.2). Total num frames: 8372224. Throughput: 0: 41843.8. Samples: 8466520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-21 14:04:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 14:04:23,708][15349] Signal inference workers to stop experience collection... (100 times) [2024-06-21 14:04:23,744][15401] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-21 14:04:23,759][15349] Signal inference workers to resume experience collection... (100 times) [2024-06-21 14:04:23,764][15401] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-21 14:04:26,349][15401] Updated weights for policy 0, policy_version 520 (0.0026) [2024-06-21 14:04:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 31278.5). Total num frames: 8601600. Throughput: 0: 41785.3. Samples: 8717400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 14:04:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 14:04:30,182][15401] Updated weights for policy 0, policy_version 530 (0.0045) [2024-06-21 14:04:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 31480.7). Total num frames: 8814592. Throughput: 0: 41835.0. Samples: 8967080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:04:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 14:04:34,268][15401] Updated weights for policy 0, policy_version 540 (0.0044) [2024-06-21 14:04:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 31560.8). Total num frames: 8994816. Throughput: 0: 41741.3. Samples: 9092120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 14:04:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 14:04:38,627][15401] Updated weights for policy 0, policy_version 550 (0.0024) [2024-06-21 14:04:42,205][15401] Updated weights for policy 0, policy_version 560 (0.0052) [2024-06-21 14:04:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 31807.6). Total num frames: 9224192. Throughput: 0: 41606.6. Samples: 9339600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:04:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 14:04:46,271][15401] Updated weights for policy 0, policy_version 570 (0.0031) [2024-06-21 14:04:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 41777.5, 300 sec: 31934.7). Total num frames: 9420800. Throughput: 0: 41742.7. Samples: 9589200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:04:48,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 14:04:50,133][15401] Updated weights for policy 0, policy_version 580 (0.0031) [2024-06-21 14:04:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 32657.1). Total num frames: 9633792. Throughput: 0: 41678.6. Samples: 9714980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 14:04:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 14:04:54,233][15401] Updated weights for policy 0, policy_version 590 (0.0037) [2024-06-21 14:04:57,954][15401] Updated weights for policy 0, policy_version 600 (0.0043) [2024-06-21 14:04:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 41780.8, 300 sec: 35290.7). Total num frames: 9846784. Throughput: 0: 41665.0. Samples: 9968640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 14:04:58,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 14:05:02,158][15401] Updated weights for policy 0, policy_version 610 (0.0044) [2024-06-21 14:05:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 35477.5). Total num frames: 10076160. Throughput: 0: 41595.9. Samples: 10209820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 14:05:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 14:05:05,767][15401] Updated weights for policy 0, policy_version 620 (0.0031) [2024-06-21 14:05:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 35487.6). Total num frames: 10256384. Throughput: 0: 41631.2. Samples: 10339920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:05:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 14:05:09,893][15401] Updated weights for policy 0, policy_version 630 (0.0030) [2024-06-21 14:05:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.3, 300 sec: 35608.6). Total num frames: 10469376. Throughput: 0: 41614.3. Samples: 10590040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 14:05:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 14:05:13,526][15401] Updated weights for policy 0, policy_version 640 (0.0035) [2024-06-21 14:05:17,742][15401] Updated weights for policy 0, policy_version 650 (0.0031) [2024-06-21 14:05:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 35656.0). Total num frames: 10682368. Throughput: 0: 41572.9. Samples: 10837860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 14:05:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 14:05:21,280][15401] Updated weights for policy 0, policy_version 660 (0.0029) [2024-06-21 14:05:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 36322.5). Total num frames: 10878976. Throughput: 0: 41496.7. Samples: 10959480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:05:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 14:05:25,512][15401] Updated weights for policy 0, policy_version 670 (0.0039) [2024-06-21 14:05:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 36933.4). Total num frames: 11075584. Throughput: 0: 41548.1. Samples: 11209260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:05:28,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 14:05:29,137][15401] Updated weights for policy 0, policy_version 680 (0.0038) [2024-06-21 14:05:33,332][15401] Updated weights for policy 0, policy_version 690 (0.0045) [2024-06-21 14:05:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 37655.4). Total num frames: 11304960. Throughput: 0: 41604.0. Samples: 11461280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:05:33,399][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 14:05:37,103][15401] Updated weights for policy 0, policy_version 700 (0.0034) [2024-06-21 14:05:38,396][15132] Fps is (10 sec: 42570.9, 60 sec: 41774.7, 300 sec: 38154.4). Total num frames: 11501568. Throughput: 0: 41578.9. Samples: 11586300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:05:38,405][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 14:05:39,397][15349] Signal inference workers to stop experience collection... (150 times) [2024-06-21 14:05:39,399][15349] Signal inference workers to resume experience collection... (150 times) [2024-06-21 14:05:39,435][15401] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-21 14:05:39,435][15401] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-21 14:05:41,270][15401] Updated weights for policy 0, policy_version 710 (0.0039) [2024-06-21 14:05:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 38655.1). Total num frames: 11714560. Throughput: 0: 41355.9. Samples: 11829660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 14:05:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 14:05:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000715_11714560.pth... [2024-06-21 14:05:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000114_1867776.pth [2024-06-21 14:05:45,249][15401] Updated weights for policy 0, policy_version 720 (0.0039) [2024-06-21 14:05:48,390][15132] Fps is (10 sec: 39346.5, 60 sec: 41234.7, 300 sec: 38932.8). Total num frames: 11894784. Throughput: 0: 41551.1. Samples: 12079620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:05:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 14:05:49,208][15401] Updated weights for policy 0, policy_version 730 (0.0038) [2024-06-21 14:05:52,995][15401] Updated weights for policy 0, policy_version 740 (0.0028) [2024-06-21 14:05:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 39432.7). Total num frames: 12140544. Throughput: 0: 41307.2. Samples: 12198740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 14:05:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 14:05:57,200][15401] Updated weights for policy 0, policy_version 750 (0.0038) [2024-06-21 14:05:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41233.2, 300 sec: 39710.4). Total num frames: 12320768. Throughput: 0: 41233.8. Samples: 12445560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 14:05:58,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 14:05:58,512][15349] Saving new best policy, reward=0.752! [2024-06-21 14:06:00,705][15401] Updated weights for policy 0, policy_version 760 (0.0032) [2024-06-21 14:06:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 40687.0, 300 sec: 40043.6). Total num frames: 12517376. Throughput: 0: 41239.1. Samples: 12693620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 14:06:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 14:06:05,119][15401] Updated weights for policy 0, policy_version 770 (0.0040) [2024-06-21 14:06:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41506.1, 300 sec: 40265.8). Total num frames: 12746752. Throughput: 0: 41255.2. Samples: 12815960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:06:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 14:06:08,943][15401] Updated weights for policy 0, policy_version 780 (0.0044) [2024-06-21 14:06:13,121][15401] Updated weights for policy 0, policy_version 790 (0.0043) [2024-06-21 14:06:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 40432.4). Total num frames: 12943360. Throughput: 0: 41245.2. Samples: 13065300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-21 14:06:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 14:06:16,809][15401] Updated weights for policy 0, policy_version 800 (0.0038) [2024-06-21 14:06:18,392][15132] Fps is (10 sec: 39312.5, 60 sec: 40958.3, 300 sec: 40543.1). Total num frames: 13139968. Throughput: 0: 41207.6. Samples: 13315720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 14:06:18,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 14:06:21,019][15401] Updated weights for policy 0, policy_version 810 (0.0025) [2024-06-21 14:06:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 40654.5). Total num frames: 13352960. Throughput: 0: 41118.6. Samples: 13436380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:06:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 14:06:24,594][15401] Updated weights for policy 0, policy_version 820 (0.0046) [2024-06-21 14:06:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 41506.1, 300 sec: 40821.2). Total num frames: 13565952. Throughput: 0: 41233.4. Samples: 13685160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:06:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 14:06:29,009][15401] Updated weights for policy 0, policy_version 830 (0.0035) [2024-06-21 14:06:32,560][15401] Updated weights for policy 0, policy_version 840 (0.0036) [2024-06-21 14:06:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 13778944. Throughput: 0: 41003.7. Samples: 13924780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 26.0) [2024-06-21 14:06:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 14:06:36,956][15401] Updated weights for policy 0, policy_version 850 (0.0045) [2024-06-21 14:06:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41237.5, 300 sec: 41098.8). Total num frames: 13975552. Throughput: 0: 41209.3. Samples: 14053160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-21 14:06:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 14:06:40,532][15401] Updated weights for policy 0, policy_version 860 (0.0037) [2024-06-21 14:06:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 14172160. Throughput: 0: 41325.2. Samples: 14305200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 14:06:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 14:06:44,871][15401] Updated weights for policy 0, policy_version 870 (0.0052) [2024-06-21 14:06:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 14401536. Throughput: 0: 41030.9. Samples: 14540020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 14:06:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 14:06:48,774][15401] Updated weights for policy 0, policy_version 880 (0.0028) [2024-06-21 14:06:53,337][15401] Updated weights for policy 0, policy_version 890 (0.0035) [2024-06-21 14:06:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 14581760. Throughput: 0: 41286.3. Samples: 14673840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:06:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 14:06:56,699][15401] Updated weights for policy 0, policy_version 900 (0.0044) [2024-06-21 14:06:58,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 14794752. Throughput: 0: 41156.5. Samples: 14917340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:06:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 14:07:01,083][15401] Updated weights for policy 0, policy_version 910 (0.0034) [2024-06-21 14:07:01,992][15349] Signal inference workers to stop experience collection... (200 times) [2024-06-21 14:07:01,993][15349] Signal inference workers to resume experience collection... (200 times) [2024-06-21 14:07:02,014][15401] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-21 14:07:02,014][15401] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-21 14:07:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41505.9, 300 sec: 41487.6). Total num frames: 15007744. Throughput: 0: 41250.4. Samples: 15171900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:07:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 14:07:04,515][15401] Updated weights for policy 0, policy_version 920 (0.0030) [2024-06-21 14:07:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 15204352. Throughput: 0: 41443.2. Samples: 15301320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 14:07:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 14:07:08,834][15401] Updated weights for policy 0, policy_version 930 (0.0032) [2024-06-21 14:07:12,425][15401] Updated weights for policy 0, policy_version 940 (0.0034) [2024-06-21 14:07:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 15433728. Throughput: 0: 41480.0. Samples: 15551760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 14:07:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 14:07:16,755][15401] Updated weights for policy 0, policy_version 950 (0.0039) [2024-06-21 14:07:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41780.9, 300 sec: 41598.7). Total num frames: 15646720. Throughput: 0: 41541.4. Samples: 15794140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:07:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 14:07:20,321][15401] Updated weights for policy 0, policy_version 960 (0.0044) [2024-06-21 14:07:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 15826944. Throughput: 0: 41511.0. Samples: 15921160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:07:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 14:07:24,503][15401] Updated weights for policy 0, policy_version 970 (0.0032) [2024-06-21 14:07:28,106][15401] Updated weights for policy 0, policy_version 980 (0.0027) [2024-06-21 14:07:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 16072704. Throughput: 0: 41583.6. Samples: 16176460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:07:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 14:07:32,352][15401] Updated weights for policy 0, policy_version 990 (0.0029) [2024-06-21 14:07:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 16269312. Throughput: 0: 41969.5. Samples: 16428640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:07:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 14:07:35,978][15401] Updated weights for policy 0, policy_version 1000 (0.0025) [2024-06-21 14:07:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 16465920. Throughput: 0: 41724.4. Samples: 16551440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:07:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 14:07:40,205][15401] Updated weights for policy 0, policy_version 1010 (0.0037) [2024-06-21 14:07:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41543.3). Total num frames: 16662528. Throughput: 0: 41697.4. Samples: 16793720. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-21 14:07:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 14:07:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001018_16678912.pth... [2024-06-21 14:07:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000409_6701056.pth [2024-06-21 14:07:44,087][15401] Updated weights for policy 0, policy_version 1020 (0.0028) [2024-06-21 14:07:48,020][15401] Updated weights for policy 0, policy_version 1030 (0.0030) [2024-06-21 14:07:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 16875520. Throughput: 0: 41707.3. Samples: 17048720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-21 14:07:48,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 14:07:48,434][15349] Saving new best policy, reward=0.758! [2024-06-21 14:07:51,932][15401] Updated weights for policy 0, policy_version 1040 (0.0042) [2024-06-21 14:07:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 17088512. Throughput: 0: 41600.0. Samples: 17173320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 14:07:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 14:07:56,141][15401] Updated weights for policy 0, policy_version 1050 (0.0040) [2024-06-21 14:07:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 17317888. Throughput: 0: 41406.6. Samples: 17415060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:07:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 14:08:00,019][15401] Updated weights for policy 0, policy_version 1060 (0.0031) [2024-06-21 14:08:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 17498112. Throughput: 0: 41665.6. Samples: 17669100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-21 14:08:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 14:08:04,047][15401] Updated weights for policy 0, policy_version 1070 (0.0026) [2024-06-21 14:08:07,952][15401] Updated weights for policy 0, policy_version 1080 (0.0035) [2024-06-21 14:08:08,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41777.5, 300 sec: 41598.4). Total num frames: 17711104. Throughput: 0: 41555.2. Samples: 17791240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:08:08,393][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 14:08:11,917][15401] Updated weights for policy 0, policy_version 1090 (0.0033) [2024-06-21 14:08:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 17940480. Throughput: 0: 41474.5. Samples: 18042820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:08:13,391][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 14:08:15,630][15401] Updated weights for policy 0, policy_version 1100 (0.0038) [2024-06-21 14:08:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 40959.9, 300 sec: 41376.6). Total num frames: 18104320. Throughput: 0: 41495.5. Samples: 18295940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:08:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 14:08:19,765][15401] Updated weights for policy 0, policy_version 1110 (0.0030) [2024-06-21 14:08:23,320][15349] Signal inference workers to stop experience collection... (250 times) [2024-06-21 14:08:23,332][15349] Signal inference workers to resume experience collection... (250 times) [2024-06-21 14:08:23,347][15401] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-21 14:08:23,377][15401] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-21 14:08:23,389][15132] Fps is (10 sec: 39322.6, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 18333696. Throughput: 0: 41370.0. Samples: 18413080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 14:08:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 14:08:23,475][15401] Updated weights for policy 0, policy_version 1120 (0.0037) [2024-06-21 14:08:27,676][15401] Updated weights for policy 0, policy_version 1130 (0.0047) [2024-06-21 14:08:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 18546688. Throughput: 0: 41650.5. Samples: 18668000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 14:08:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 14:08:31,215][15401] Updated weights for policy 0, policy_version 1140 (0.0040) [2024-06-21 14:08:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 18743296. Throughput: 0: 41406.7. Samples: 18912020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 14:08:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 14:08:35,565][15401] Updated weights for policy 0, policy_version 1150 (0.0039) [2024-06-21 14:08:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 18956288. Throughput: 0: 41317.3. Samples: 19032600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:08:38,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 14:08:39,051][15401] Updated weights for policy 0, policy_version 1160 (0.0040) [2024-06-21 14:08:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 19152896. Throughput: 0: 41456.4. Samples: 19280600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 14:08:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 14:08:43,724][15401] Updated weights for policy 0, policy_version 1170 (0.0038) [2024-06-21 14:08:46,963][15401] Updated weights for policy 0, policy_version 1180 (0.0025) [2024-06-21 14:08:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 19349504. Throughput: 0: 41366.2. Samples: 19530580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 14:08:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 14:08:51,531][15401] Updated weights for policy 0, policy_version 1190 (0.0031) [2024-06-21 14:08:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 19595264. Throughput: 0: 41467.5. Samples: 19657180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:08:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 14:08:55,062][15401] Updated weights for policy 0, policy_version 1200 (0.0030) [2024-06-21 14:08:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 19791872. Throughput: 0: 41445.9. Samples: 19907880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 14:08:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 14:08:59,632][15401] Updated weights for policy 0, policy_version 1210 (0.0031) [2024-06-21 14:09:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 19972096. Throughput: 0: 41268.4. Samples: 20153020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 14:09:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 14:09:03,579][15401] Updated weights for policy 0, policy_version 1220 (0.0035) [2024-06-21 14:09:07,539][15401] Updated weights for policy 0, policy_version 1230 (0.0022) [2024-06-21 14:09:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41507.9, 300 sec: 41432.1). Total num frames: 20201472. Throughput: 0: 41401.4. Samples: 20276140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-21 14:09:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 14:09:11,571][15401] Updated weights for policy 0, policy_version 1240 (0.0036) [2024-06-21 14:09:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 41376.5). Total num frames: 20398080. Throughput: 0: 41262.8. Samples: 20524820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:09:13,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-21 14:09:15,371][15401] Updated weights for policy 0, policy_version 1250 (0.0030) [2024-06-21 14:09:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 20611072. Throughput: 0: 41060.8. Samples: 20759760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:09:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 14:09:19,546][15401] Updated weights for policy 0, policy_version 1260 (0.0039) [2024-06-21 14:09:23,119][15401] Updated weights for policy 0, policy_version 1270 (0.0048) [2024-06-21 14:09:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 20807680. Throughput: 0: 41331.6. Samples: 20892520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:09:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-21 14:09:27,486][15401] Updated weights for policy 0, policy_version 1280 (0.0033) [2024-06-21 14:09:28,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 20987904. Throughput: 0: 41310.2. Samples: 21139560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 14:09:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 14:09:31,232][15401] Updated weights for policy 0, policy_version 1290 (0.0049) [2024-06-21 14:09:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 21233664. Throughput: 0: 41162.7. Samples: 21382900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-21 14:09:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 14:09:35,210][15401] Updated weights for policy 0, policy_version 1300 (0.0040) [2024-06-21 14:09:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 21430272. Throughput: 0: 41222.7. Samples: 21512200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:09:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 14:09:39,031][15401] Updated weights for policy 0, policy_version 1310 (0.0032) [2024-06-21 14:09:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40960.1, 300 sec: 41321.3). Total num frames: 21610496. Throughput: 0: 40929.4. Samples: 21749700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 14:09:43,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 14:09:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001319_21610496.pth... [2024-06-21 14:09:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000000715_11714560.pth [2024-06-21 14:09:43,631][15401] Updated weights for policy 0, policy_version 1320 (0.0039) [2024-06-21 14:09:46,764][15349] Signal inference workers to stop experience collection... (300 times) [2024-06-21 14:09:46,766][15349] Signal inference workers to resume experience collection... (300 times) [2024-06-21 14:09:46,784][15401] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-21 14:09:46,784][15401] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-21 14:09:46,922][15401] Updated weights for policy 0, policy_version 1330 (0.0035) [2024-06-21 14:09:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 21839872. Throughput: 0: 41045.9. Samples: 22000080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 14:09:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 14:09:51,350][15401] Updated weights for policy 0, policy_version 1340 (0.0046) [2024-06-21 14:09:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 22052864. Throughput: 0: 41124.7. Samples: 22126760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:09:53,390][15132] Avg episode reward: [(0, '0.045')] [2024-06-21 14:09:54,605][15401] Updated weights for policy 0, policy_version 1350 (0.0046) [2024-06-21 14:09:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 22249472. Throughput: 0: 40965.7. Samples: 22368280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-21 14:09:58,390][15132] Avg episode reward: [(0, '0.045')] [2024-06-21 14:09:59,053][15401] Updated weights for policy 0, policy_version 1360 (0.0038) [2024-06-21 14:10:02,351][15401] Updated weights for policy 0, policy_version 1370 (0.0036) [2024-06-21 14:10:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 22446080. Throughput: 0: 41367.5. Samples: 22621300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:10:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 14:10:06,880][15401] Updated weights for policy 0, policy_version 1380 (0.0031) [2024-06-21 14:10:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 22642688. Throughput: 0: 41152.5. Samples: 22744380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:10:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 14:10:10,835][15401] Updated weights for policy 0, policy_version 1390 (0.0041) [2024-06-21 14:10:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 22872064. Throughput: 0: 41177.4. Samples: 22992540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-21 14:10:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 14:10:14,882][15401] Updated weights for policy 0, policy_version 1400 (0.0031) [2024-06-21 14:10:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 23068672. Throughput: 0: 41001.3. Samples: 23227960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:10:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 14:10:18,732][15401] Updated weights for policy 0, policy_version 1410 (0.0038) [2024-06-21 14:10:23,001][15401] Updated weights for policy 0, policy_version 1420 (0.0037) [2024-06-21 14:10:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 23265280. Throughput: 0: 40927.5. Samples: 23353940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:10:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 14:10:26,621][15401] Updated weights for policy 0, policy_version 1430 (0.0037) [2024-06-21 14:10:28,393][15132] Fps is (10 sec: 40945.9, 60 sec: 41503.8, 300 sec: 41265.0). Total num frames: 23478272. Throughput: 0: 41189.2. Samples: 23603360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 14:10:28,394][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 14:10:30,921][15401] Updated weights for policy 0, policy_version 1440 (0.0039) [2024-06-21 14:10:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41321.9). Total num frames: 23691264. Throughput: 0: 41139.4. Samples: 23851360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:10:33,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 14:10:35,016][15401] Updated weights for policy 0, policy_version 1450 (0.0027) [2024-06-21 14:10:38,389][15132] Fps is (10 sec: 40974.4, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 23887872. Throughput: 0: 41131.2. Samples: 23977660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 14:10:38,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 14:10:38,732][15401] Updated weights for policy 0, policy_version 1460 (0.0037) [2024-06-21 14:10:43,002][15401] Updated weights for policy 0, policy_version 1470 (0.0036) [2024-06-21 14:10:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.4, 300 sec: 41376.2). Total num frames: 24100864. Throughput: 0: 41181.8. Samples: 24221560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:10:43,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 14:10:46,767][15401] Updated weights for policy 0, policy_version 1480 (0.0046) [2024-06-21 14:10:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 24297472. Throughput: 0: 40981.5. Samples: 24465460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 14:10:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 14:10:50,869][15401] Updated weights for policy 0, policy_version 1490 (0.0044) [2024-06-21 14:10:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 24494080. Throughput: 0: 40997.3. Samples: 24589260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:10:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 14:10:54,554][15401] Updated weights for policy 0, policy_version 1500 (0.0041) [2024-06-21 14:10:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 24707072. Throughput: 0: 40828.9. Samples: 24829840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 14:10:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 14:10:58,689][15401] Updated weights for policy 0, policy_version 1510 (0.0040) [2024-06-21 14:11:02,738][15401] Updated weights for policy 0, policy_version 1520 (0.0031) [2024-06-21 14:11:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 24920064. Throughput: 0: 41209.3. Samples: 25082380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:11:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 14:11:05,337][15349] Signal inference workers to stop experience collection... (350 times) [2024-06-21 14:11:05,339][15349] Signal inference workers to resume experience collection... (350 times) [2024-06-21 14:11:05,350][15401] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-21 14:11:05,377][15401] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-21 14:11:06,514][15401] Updated weights for policy 0, policy_version 1530 (0.0032) [2024-06-21 14:11:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 25116672. Throughput: 0: 41143.6. Samples: 25205400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:11:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 14:11:10,630][15401] Updated weights for policy 0, policy_version 1540 (0.0026) [2024-06-21 14:11:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 41432.4). Total num frames: 25362432. Throughput: 0: 41167.2. Samples: 25455740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 14:11:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-21 14:11:13,413][15349] Saving new best policy, reward=0.782! [2024-06-21 14:11:14,806][15401] Updated weights for policy 0, policy_version 1550 (0.0034) [2024-06-21 14:11:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 25542656. Throughput: 0: 41096.5. Samples: 25700700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:11:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 14:11:18,546][15401] Updated weights for policy 0, policy_version 1560 (0.0037) [2024-06-21 14:11:22,611][15401] Updated weights for policy 0, policy_version 1570 (0.0036) [2024-06-21 14:11:23,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 25739264. Throughput: 0: 40975.0. Samples: 25821540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 14:11:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 14:11:26,559][15401] Updated weights for policy 0, policy_version 1580 (0.0037) [2024-06-21 14:11:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41508.5, 300 sec: 41321.0). Total num frames: 25968640. Throughput: 0: 41274.6. Samples: 26078820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:11:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 14:11:30,258][15401] Updated weights for policy 0, policy_version 1590 (0.0023) [2024-06-21 14:11:33,392][15132] Fps is (10 sec: 42588.5, 60 sec: 41231.5, 300 sec: 41320.7). Total num frames: 26165248. Throughput: 0: 41305.3. Samples: 26324300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:11:33,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 14:11:34,392][15401] Updated weights for policy 0, policy_version 1600 (0.0031) [2024-06-21 14:11:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 26361856. Throughput: 0: 41357.0. Samples: 26450320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 14:11:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 14:11:38,479][15401] Updated weights for policy 0, policy_version 1610 (0.0029) [2024-06-21 14:11:42,283][15401] Updated weights for policy 0, policy_version 1620 (0.0033) [2024-06-21 14:11:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41234.7, 300 sec: 41265.5). Total num frames: 26574848. Throughput: 0: 41481.7. Samples: 26696520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:11:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 14:11:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001623_26591232.pth... [2024-06-21 14:11:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001018_16678912.pth [2024-06-21 14:11:46,448][15401] Updated weights for policy 0, policy_version 1630 (0.0037) [2024-06-21 14:11:48,392][15132] Fps is (10 sec: 42587.7, 60 sec: 41504.4, 300 sec: 41376.2). Total num frames: 26787840. Throughput: 0: 41239.2. Samples: 26938240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:11:48,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 14:11:50,194][15401] Updated weights for policy 0, policy_version 1640 (0.0038) [2024-06-21 14:11:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 26968064. Throughput: 0: 41233.4. Samples: 27060900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 14:11:53,390][15132] Avg episode reward: [(0, '0.086')] [2024-06-21 14:11:54,422][15401] Updated weights for policy 0, policy_version 1650 (0.0030) [2024-06-21 14:11:58,379][15401] Updated weights for policy 0, policy_version 1660 (0.0029) [2024-06-21 14:11:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41506.2, 300 sec: 41321.1). Total num frames: 27197440. Throughput: 0: 41107.2. Samples: 27305560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 14:11:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 14:12:02,556][15401] Updated weights for policy 0, policy_version 1670 (0.0043) [2024-06-21 14:12:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 27394048. Throughput: 0: 41103.5. Samples: 27550360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-21 14:12:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 14:12:06,327][15401] Updated weights for policy 0, policy_version 1680 (0.0031) [2024-06-21 14:12:08,389][15132] Fps is (10 sec: 37682.8, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 27574272. Throughput: 0: 41267.7. Samples: 27678580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:12:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 14:12:10,435][15401] Updated weights for policy 0, policy_version 1690 (0.0037) [2024-06-21 14:12:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 40685.2, 300 sec: 41209.6). Total num frames: 27803648. Throughput: 0: 40895.6. Samples: 27919220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 14:12:13,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 14:12:14,243][15401] Updated weights for policy 0, policy_version 1700 (0.0024) [2024-06-21 14:12:18,375][15401] Updated weights for policy 0, policy_version 1710 (0.0035) [2024-06-21 14:12:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 28016640. Throughput: 0: 41045.2. Samples: 28171240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:12:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 14:12:22,814][15401] Updated weights for policy 0, policy_version 1720 (0.0045) [2024-06-21 14:12:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 28213248. Throughput: 0: 40893.2. Samples: 28290520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 14:12:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 14:12:26,382][15401] Updated weights for policy 0, policy_version 1730 (0.0033) [2024-06-21 14:12:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 28426240. Throughput: 0: 40953.2. Samples: 28539420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 14:12:28,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 14:12:30,551][15401] Updated weights for policy 0, policy_version 1740 (0.0032) [2024-06-21 14:12:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40415.5, 300 sec: 41098.9). Total num frames: 28590080. Throughput: 0: 41147.1. Samples: 28789760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 14:12:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 14:12:34,524][15401] Updated weights for policy 0, policy_version 1750 (0.0024) [2024-06-21 14:12:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41232.8, 300 sec: 41265.4). Total num frames: 28835840. Throughput: 0: 41047.6. Samples: 28908060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:12:38,399][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 14:12:38,395][15401] Updated weights for policy 0, policy_version 1760 (0.0030) [2024-06-21 14:12:42,304][15401] Updated weights for policy 0, policy_version 1770 (0.0031) [2024-06-21 14:12:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 29032448. Throughput: 0: 41231.4. Samples: 29160980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 14:12:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 14:12:46,348][15401] Updated weights for policy 0, policy_version 1780 (0.0036) [2024-06-21 14:12:48,390][15132] Fps is (10 sec: 39322.8, 60 sec: 40688.5, 300 sec: 41154.4). Total num frames: 29229056. Throughput: 0: 41245.0. Samples: 29406380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:12:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 14:12:50,225][15401] Updated weights for policy 0, policy_version 1790 (0.0034) [2024-06-21 14:12:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 29458432. Throughput: 0: 41173.7. Samples: 29531400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:12:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 14:12:54,134][15401] Updated weights for policy 0, policy_version 1800 (0.0036) [2024-06-21 14:12:57,371][15349] Signal inference workers to stop experience collection... (400 times) [2024-06-21 14:12:57,371][15349] Signal inference workers to resume experience collection... (400 times) [2024-06-21 14:12:57,397][15401] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-21 14:12:57,397][15401] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-21 14:12:58,311][15401] Updated weights for policy 0, policy_version 1810 (0.0041) [2024-06-21 14:12:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 29655040. Throughput: 0: 41359.9. Samples: 29780320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 14:12:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 14:13:02,387][15401] Updated weights for policy 0, policy_version 1820 (0.0047) [2024-06-21 14:13:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41210.3). Total num frames: 29868032. Throughput: 0: 41068.5. Samples: 30019320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) [2024-06-21 14:13:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 14:13:06,113][15401] Updated weights for policy 0, policy_version 1830 (0.0032) [2024-06-21 14:13:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.5, 300 sec: 41154.1). Total num frames: 30081024. Throughput: 0: 41199.6. Samples: 30144600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:13:08,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 14:13:10,260][15401] Updated weights for policy 0, policy_version 1840 (0.0043) [2024-06-21 14:13:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 40960.0, 300 sec: 41209.6). Total num frames: 30261248. Throughput: 0: 41332.1. Samples: 30399460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 14:13:13,393][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 14:13:14,042][15401] Updated weights for policy 0, policy_version 1850 (0.0039) [2024-06-21 14:13:18,275][15401] Updated weights for policy 0, policy_version 1860 (0.0041) [2024-06-21 14:13:18,390][15132] Fps is (10 sec: 39330.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 30474240. Throughput: 0: 41187.9. Samples: 30643220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-21 14:13:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 14:13:22,144][15401] Updated weights for policy 0, policy_version 1870 (0.0048) [2024-06-21 14:13:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 30654464. Throughput: 0: 41277.6. Samples: 30765540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-21 14:13:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 14:13:26,047][15401] Updated weights for policy 0, policy_version 1880 (0.0022) [2024-06-21 14:13:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 30883840. Throughput: 0: 41091.6. Samples: 31010100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 14:13:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 14:13:30,189][15401] Updated weights for policy 0, policy_version 1890 (0.0040) [2024-06-21 14:13:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 31096832. Throughput: 0: 41115.1. Samples: 31256560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 14:13:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 14:13:33,824][15401] Updated weights for policy 0, policy_version 1900 (0.0052) [2024-06-21 14:13:38,085][15401] Updated weights for policy 0, policy_version 1910 (0.0031) [2024-06-21 14:13:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.3, 300 sec: 41154.4). Total num frames: 31293440. Throughput: 0: 41055.2. Samples: 31378880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-21 14:13:38,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-21 14:13:41,923][15401] Updated weights for policy 0, policy_version 1920 (0.0036) [2024-06-21 14:13:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 31506432. Throughput: 0: 40993.4. Samples: 31625020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 14:13:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 14:13:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001923_31506432.pth... [2024-06-21 14:13:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001319_21610496.pth [2024-06-21 14:13:45,938][15401] Updated weights for policy 0, policy_version 1930 (0.0040) [2024-06-21 14:13:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 31703040. Throughput: 0: 41234.5. Samples: 31874880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 14:13:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 14:13:49,806][15401] Updated weights for policy 0, policy_version 1940 (0.0036) [2024-06-21 14:13:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 31916032. Throughput: 0: 41120.4. Samples: 31994920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-21 14:13:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 14:13:54,541][15401] Updated weights for policy 0, policy_version 1950 (0.0030) [2024-06-21 14:13:57,693][15401] Updated weights for policy 0, policy_version 1960 (0.0052) [2024-06-21 14:13:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 32145408. Throughput: 0: 41070.2. Samples: 32247520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 14:13:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 14:14:02,278][15401] Updated weights for policy 0, policy_version 1970 (0.0049) [2024-06-21 14:14:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 32325632. Throughput: 0: 41175.1. Samples: 32496100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-21 14:14:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 14:14:05,443][15401] Updated weights for policy 0, policy_version 1980 (0.0043) [2024-06-21 14:14:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40961.7, 300 sec: 41154.4). Total num frames: 32538624. Throughput: 0: 41146.7. Samples: 32617140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:14:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 14:14:10,155][15349] Signal inference workers to stop experience collection... (450 times) [2024-06-21 14:14:10,156][15349] Signal inference workers to resume experience collection... (450 times) [2024-06-21 14:14:10,186][15401] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-21 14:14:10,186][15401] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-21 14:14:10,322][15401] Updated weights for policy 0, policy_version 1990 (0.0040) [2024-06-21 14:14:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41507.7, 300 sec: 41154.4). Total num frames: 32751616. Throughput: 0: 41312.3. Samples: 32869160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:14:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 14:14:13,681][15401] Updated weights for policy 0, policy_version 2000 (0.0037) [2024-06-21 14:14:18,075][15401] Updated weights for policy 0, policy_version 2010 (0.0033) [2024-06-21 14:14:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 32931840. Throughput: 0: 41352.6. Samples: 33117420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 14:14:18,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 14:14:21,683][15401] Updated weights for policy 0, policy_version 2020 (0.0034) [2024-06-21 14:14:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41321.0). Total num frames: 33177600. Throughput: 0: 41402.6. Samples: 33242000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-21 14:14:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 14:14:25,810][15401] Updated weights for policy 0, policy_version 2030 (0.0028) [2024-06-21 14:14:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41231.4, 300 sec: 41098.5). Total num frames: 33357824. Throughput: 0: 41523.6. Samples: 33493680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 14:14:28,392][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 14:14:29,422][15401] Updated weights for policy 0, policy_version 2040 (0.0032) [2024-06-21 14:14:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 33554432. Throughput: 0: 41539.7. Samples: 33744160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 14:14:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 14:14:33,897][15401] Updated weights for policy 0, policy_version 2050 (0.0047) [2024-06-21 14:14:37,143][15401] Updated weights for policy 0, policy_version 2060 (0.0035) [2024-06-21 14:14:38,392][15132] Fps is (10 sec: 44236.7, 60 sec: 41777.5, 300 sec: 41320.7). Total num frames: 33800192. Throughput: 0: 41585.8. Samples: 33866380. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-21 14:14:38,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 14:14:41,676][15401] Updated weights for policy 0, policy_version 2070 (0.0030) [2024-06-21 14:14:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 33964032. Throughput: 0: 41475.5. Samples: 34113920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 14:14:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 14:14:45,129][15401] Updated weights for policy 0, policy_version 2080 (0.0037) [2024-06-21 14:14:48,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 34193408. Throughput: 0: 41547.2. Samples: 34365720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 14:14:48,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 14:14:49,559][15401] Updated weights for policy 0, policy_version 2090 (0.0035) [2024-06-21 14:14:52,937][15401] Updated weights for policy 0, policy_version 2100 (0.0032) [2024-06-21 14:14:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 34422784. Throughput: 0: 41682.7. Samples: 34492860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:14:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 14:14:57,479][15401] Updated weights for policy 0, policy_version 2110 (0.0041) [2024-06-21 14:14:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 34586624. Throughput: 0: 41570.3. Samples: 34739820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-21 14:14:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 14:15:00,721][15401] Updated weights for policy 0, policy_version 2120 (0.0034) [2024-06-21 14:15:03,391][15132] Fps is (10 sec: 40953.7, 60 sec: 41778.2, 300 sec: 41320.8). Total num frames: 34832384. Throughput: 0: 41412.7. Samples: 34981060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-21 14:15:03,392][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 14:15:05,882][15401] Updated weights for policy 0, policy_version 2130 (0.0034) [2024-06-21 14:15:08,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42052.3, 300 sec: 41321.0). Total num frames: 35061760. Throughput: 0: 41568.2. Samples: 35112560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 14:15:08,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 14:15:08,397][15401] Updated weights for policy 0, policy_version 2140 (0.0043) [2024-06-21 14:15:13,390][15132] Fps is (10 sec: 36050.1, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 35192832. Throughput: 0: 41398.1. Samples: 35356500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-21 14:15:13,392][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 14:15:13,549][15349] Saving new best policy, reward=0.817! [2024-06-21 14:15:13,842][15401] Updated weights for policy 0, policy_version 2150 (0.0042) [2024-06-21 14:15:16,241][15401] Updated weights for policy 0, policy_version 2160 (0.0031) [2024-06-21 14:15:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 35438592. Throughput: 0: 41263.2. Samples: 35601000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 14:15:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 14:15:21,809][15401] Updated weights for policy 0, policy_version 2170 (0.0030) [2024-06-21 14:15:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 41233.0, 300 sec: 41265.9). Total num frames: 35651584. Throughput: 0: 41539.4. Samples: 35735560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 14:15:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 14:15:24,415][15401] Updated weights for policy 0, policy_version 2180 (0.0027) [2024-06-21 14:15:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 41234.7, 300 sec: 41154.4). Total num frames: 35831808. Throughput: 0: 41485.7. Samples: 35980780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:15:28,396][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 14:15:29,482][15401] Updated weights for policy 0, policy_version 2190 (0.0035) [2024-06-21 14:15:32,308][15401] Updated weights for policy 0, policy_version 2200 (0.0034) [2024-06-21 14:15:33,371][15349] Signal inference workers to stop experience collection... (500 times) [2024-06-21 14:15:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 36061184. Throughput: 0: 41332.5. Samples: 36225680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:15:33,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 14:15:33,416][15401] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-21 14:15:33,425][15349] Signal inference workers to resume experience collection... (500 times) [2024-06-21 14:15:33,432][15401] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-21 14:15:37,226][15401] Updated weights for policy 0, policy_version 2210 (0.0037) [2024-06-21 14:15:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41234.7, 300 sec: 41265.8). Total num frames: 36274176. Throughput: 0: 41437.7. Samples: 36357560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 14:15:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 14:15:40,047][15401] Updated weights for policy 0, policy_version 2220 (0.0031) [2024-06-21 14:15:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 36470784. Throughput: 0: 41290.2. Samples: 36597880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 14:15:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 14:15:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002226_36470784.pth... [2024-06-21 14:15:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001623_26591232.pth [2024-06-21 14:15:45,253][15401] Updated weights for policy 0, policy_version 2230 (0.0034) [2024-06-21 14:15:47,810][15401] Updated weights for policy 0, policy_version 2240 (0.0034) [2024-06-21 14:15:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 36700160. Throughput: 0: 41379.7. Samples: 36843080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:15:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 14:15:53,168][15401] Updated weights for policy 0, policy_version 2250 (0.0047) [2024-06-21 14:15:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 36864000. Throughput: 0: 41329.7. Samples: 36972400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-21 14:15:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 14:15:55,950][15401] Updated weights for policy 0, policy_version 2260 (0.0034) [2024-06-21 14:15:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 37109760. Throughput: 0: 41454.7. Samples: 37221960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:15:58,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 14:16:01,003][15401] Updated weights for policy 0, policy_version 2270 (0.0034) [2024-06-21 14:16:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 41507.1, 300 sec: 41376.5). Total num frames: 37322752. Throughput: 0: 41425.1. Samples: 37465140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 14:16:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-21 14:16:03,647][15401] Updated weights for policy 0, policy_version 2280 (0.0041) [2024-06-21 14:16:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40413.7, 300 sec: 41098.8). Total num frames: 37486592. Throughput: 0: 41275.1. Samples: 37592940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 14:16:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-21 14:16:08,930][15401] Updated weights for policy 0, policy_version 2290 (0.0037) [2024-06-21 14:16:11,844][15401] Updated weights for policy 0, policy_version 2300 (0.0031) [2024-06-21 14:16:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41321.0). Total num frames: 37732352. Throughput: 0: 41293.8. Samples: 37839000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 14:16:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 14:16:16,875][15401] Updated weights for policy 0, policy_version 2310 (0.0035) [2024-06-21 14:16:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 37928960. Throughput: 0: 41484.8. Samples: 38092500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 14:16:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 14:16:19,543][15401] Updated weights for policy 0, policy_version 2320 (0.0035) [2024-06-21 14:16:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 38109184. Throughput: 0: 41382.7. Samples: 38219780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 14:16:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 14:16:24,690][15401] Updated weights for policy 0, policy_version 2330 (0.0033) [2024-06-21 14:16:27,289][15401] Updated weights for policy 0, policy_version 2340 (0.0028) [2024-06-21 14:16:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41321.3). Total num frames: 38354944. Throughput: 0: 41442.7. Samples: 38462800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 14:16:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 14:16:32,596][15401] Updated weights for policy 0, policy_version 2350 (0.0029) [2024-06-21 14:16:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 38551552. Throughput: 0: 41644.0. Samples: 38717060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 14:16:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 14:16:35,062][15401] Updated weights for policy 0, policy_version 2360 (0.0044) [2024-06-21 14:16:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 38748160. Throughput: 0: 41536.4. Samples: 38841540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 14:16:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 14:16:40,401][15401] Updated weights for policy 0, policy_version 2370 (0.0030) [2024-06-21 14:16:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41376.9). Total num frames: 38993920. Throughput: 0: 41468.5. Samples: 39088040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 14:16:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 14:16:43,392][15401] Updated weights for policy 0, policy_version 2380 (0.0036) [2024-06-21 14:16:45,671][15349] Signal inference workers to stop experience collection... (550 times) [2024-06-21 14:16:45,672][15349] Signal inference workers to resume experience collection... (550 times) [2024-06-21 14:16:45,693][15401] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-21 14:16:45,693][15401] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-21 14:16:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 39141376. Throughput: 0: 41778.4. Samples: 39345160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:16:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 14:16:48,580][15401] Updated weights for policy 0, policy_version 2390 (0.0039) [2024-06-21 14:16:51,194][15401] Updated weights for policy 0, policy_version 2400 (0.0033) [2024-06-21 14:16:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 39370752. Throughput: 0: 41456.9. Samples: 39458500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-21 14:16:53,392][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 14:16:56,343][15401] Updated weights for policy 0, policy_version 2410 (0.0033) [2024-06-21 14:16:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 39616512. Throughput: 0: 41540.4. Samples: 39708320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:16:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 14:16:59,135][15401] Updated weights for policy 0, policy_version 2420 (0.0023) [2024-06-21 14:17:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 39763968. Throughput: 0: 41638.8. Samples: 39966240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:17:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 14:17:04,186][15401] Updated weights for policy 0, policy_version 2430 (0.0039) [2024-06-21 14:17:07,047][15401] Updated weights for policy 0, policy_version 2440 (0.0037) [2024-06-21 14:17:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41376.9). Total num frames: 40009728. Throughput: 0: 41349.9. Samples: 40080520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 14:17:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 14:17:12,011][15401] Updated weights for policy 0, policy_version 2450 (0.0037) [2024-06-21 14:17:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 40206336. Throughput: 0: 41566.2. Samples: 40333280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 14:17:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 14:17:14,883][15401] Updated weights for policy 0, policy_version 2460 (0.0028) [2024-06-21 14:17:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 40402944. Throughput: 0: 41378.7. Samples: 40579100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:17:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 14:17:19,901][15401] Updated weights for policy 0, policy_version 2470 (0.0040) [2024-06-21 14:17:23,219][15401] Updated weights for policy 0, policy_version 2480 (0.0035) [2024-06-21 14:17:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41376.6). Total num frames: 40632320. Throughput: 0: 41295.1. Samples: 40699820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 14:17:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 14:17:27,607][15401] Updated weights for policy 0, policy_version 2490 (0.0035) [2024-06-21 14:17:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 40812544. Throughput: 0: 41347.0. Samples: 40948660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:17:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 14:17:31,358][15401] Updated weights for policy 0, policy_version 2500 (0.0026) [2024-06-21 14:17:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 41025536. Throughput: 0: 40994.9. Samples: 41189940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:17:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 14:17:35,598][15401] Updated weights for policy 0, policy_version 2510 (0.0043) [2024-06-21 14:17:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 41238528. Throughput: 0: 41259.1. Samples: 41315160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 14:17:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 14:17:39,573][15401] Updated weights for policy 0, policy_version 2520 (0.0035) [2024-06-21 14:17:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 41435136. Throughput: 0: 41234.1. Samples: 41563860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-21 14:17:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 14:17:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002529_41435136.pth... [2024-06-21 14:17:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000001923_31506432.pth [2024-06-21 14:17:43,660][15401] Updated weights for policy 0, policy_version 2530 (0.0039) [2024-06-21 14:17:48,066][15401] Updated weights for policy 0, policy_version 2540 (0.0043) [2024-06-21 14:17:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 41631744. Throughput: 0: 40984.0. Samples: 41810520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 14:17:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 14:17:51,635][15401] Updated weights for policy 0, policy_version 2550 (0.0044) [2024-06-21 14:17:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 41877504. Throughput: 0: 41235.1. Samples: 41936100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-21 14:17:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 14:17:55,967][15401] Updated weights for policy 0, policy_version 2560 (0.0047) [2024-06-21 14:17:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 42057728. Throughput: 0: 41053.3. Samples: 42180680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:17:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 14:17:59,557][15401] Updated weights for policy 0, policy_version 2570 (0.0033) [2024-06-21 14:18:03,389][15132] Fps is (10 sec: 36044.8, 60 sec: 41233.0, 300 sec: 41210.3). Total num frames: 42237952. Throughput: 0: 41000.9. Samples: 42424140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 14:18:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 14:18:03,997][15401] Updated weights for policy 0, policy_version 2580 (0.0042) [2024-06-21 14:18:05,047][15349] Signal inference workers to stop experience collection... (600 times) [2024-06-21 14:18:05,047][15349] Signal inference workers to resume experience collection... (600 times) [2024-06-21 14:18:05,070][15401] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-21 14:18:05,070][15401] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-21 14:18:07,926][15401] Updated weights for policy 0, policy_version 2590 (0.0035) [2024-06-21 14:18:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 41376.9). Total num frames: 42467328. Throughput: 0: 40945.3. Samples: 42542360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:18:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 14:18:11,773][15401] Updated weights for policy 0, policy_version 2600 (0.0035) [2024-06-21 14:18:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 42663936. Throughput: 0: 40904.6. Samples: 42789360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:18:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-21 14:18:15,767][15401] Updated weights for policy 0, policy_version 2610 (0.0049) [2024-06-21 14:18:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 42876928. Throughput: 0: 41033.5. Samples: 43036440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:18:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 14:18:19,641][15401] Updated weights for policy 0, policy_version 2620 (0.0034) [2024-06-21 14:18:23,390][15132] Fps is (10 sec: 40958.3, 60 sec: 40686.7, 300 sec: 41321.0). Total num frames: 43073536. Throughput: 0: 40992.2. Samples: 43159820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:18:23,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 14:18:23,617][15401] Updated weights for policy 0, policy_version 2630 (0.0032) [2024-06-21 14:18:28,077][15401] Updated weights for policy 0, policy_version 2640 (0.0038) [2024-06-21 14:18:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 43270144. Throughput: 0: 41148.1. Samples: 43415520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 14:18:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 14:18:31,517][15401] Updated weights for policy 0, policy_version 2650 (0.0033) [2024-06-21 14:18:33,395][15132] Fps is (10 sec: 44215.6, 60 sec: 41502.7, 300 sec: 41431.4). Total num frames: 43515904. Throughput: 0: 40804.2. Samples: 43646920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:18:33,395][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 14:18:35,773][15401] Updated weights for policy 0, policy_version 2660 (0.0047) [2024-06-21 14:18:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 43696128. Throughput: 0: 41044.9. Samples: 43783120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:18:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 14:18:39,146][15401] Updated weights for policy 0, policy_version 2670 (0.0049) [2024-06-21 14:18:43,390][15132] Fps is (10 sec: 37702.3, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 43892736. Throughput: 0: 41113.3. Samples: 44030780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 14:18:43,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 14:18:43,621][15401] Updated weights for policy 0, policy_version 2680 (0.0034) [2024-06-21 14:18:46,965][15401] Updated weights for policy 0, policy_version 2690 (0.0034) [2024-06-21 14:18:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.0, 300 sec: 41432.1). Total num frames: 44138496. Throughput: 0: 41203.9. Samples: 44278320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 14:18:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 14:18:51,481][15401] Updated weights for policy 0, policy_version 2700 (0.0031) [2024-06-21 14:18:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 44318720. Throughput: 0: 41419.2. Samples: 44406220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:18:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 14:18:55,025][15401] Updated weights for policy 0, policy_version 2710 (0.0044) [2024-06-21 14:18:58,390][15132] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 44515328. Throughput: 0: 41283.4. Samples: 44647120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 14:18:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 14:18:59,358][15401] Updated weights for policy 0, policy_version 2720 (0.0030) [2024-06-21 14:19:02,980][15401] Updated weights for policy 0, policy_version 2730 (0.0041) [2024-06-21 14:19:03,396][15132] Fps is (10 sec: 42570.2, 60 sec: 41774.6, 300 sec: 41375.6). Total num frames: 44744704. Throughput: 0: 41401.5. Samples: 44899780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 14:19:03,396][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 14:19:07,240][15401] Updated weights for policy 0, policy_version 2740 (0.0036) [2024-06-21 14:19:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41321.0). Total num frames: 44941312. Throughput: 0: 41489.2. Samples: 45026820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:19:08,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 14:19:10,874][15401] Updated weights for policy 0, policy_version 2750 (0.0039) [2024-06-21 14:19:13,389][15132] Fps is (10 sec: 40987.0, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 45154304. Throughput: 0: 41241.8. Samples: 45271400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 14:19:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 14:19:15,181][15401] Updated weights for policy 0, policy_version 2760 (0.0041) [2024-06-21 14:19:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 45350912. Throughput: 0: 41672.7. Samples: 45521980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 14:19:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 14:19:18,739][15401] Updated weights for policy 0, policy_version 2770 (0.0036) [2024-06-21 14:19:20,119][15349] Signal inference workers to stop experience collection... (650 times) [2024-06-21 14:19:20,119][15349] Signal inference workers to resume experience collection... (650 times) [2024-06-21 14:19:20,157][15401] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-21 14:19:20,157][15401] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-21 14:19:23,272][15401] Updated weights for policy 0, policy_version 2780 (0.0032) [2024-06-21 14:19:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.4, 300 sec: 41321.4). Total num frames: 45547520. Throughput: 0: 41212.1. Samples: 45637660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:19:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 14:19:26,726][15401] Updated weights for policy 0, policy_version 2790 (0.0037) [2024-06-21 14:19:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 45793280. Throughput: 0: 41327.2. Samples: 45890500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 14:19:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 14:19:31,097][15401] Updated weights for policy 0, policy_version 2800 (0.0043) [2024-06-21 14:19:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 40690.4, 300 sec: 41210.3). Total num frames: 45957120. Throughput: 0: 41357.0. Samples: 46139380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 14:19:33,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 14:19:34,660][15401] Updated weights for policy 0, policy_version 2810 (0.0041) [2024-06-21 14:19:38,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 46170112. Throughput: 0: 41089.7. Samples: 46255260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 14:19:38,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 14:19:38,950][15401] Updated weights for policy 0, policy_version 2820 (0.0037) [2024-06-21 14:19:42,954][15401] Updated weights for policy 0, policy_version 2830 (0.0047) [2024-06-21 14:19:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 46383104. Throughput: 0: 41245.4. Samples: 46503160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-21 14:19:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 14:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002831_46383104.pth... [2024-06-21 14:19:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002226_36470784.pth [2024-06-21 14:19:47,195][15401] Updated weights for policy 0, policy_version 2840 (0.0030) [2024-06-21 14:19:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40414.0, 300 sec: 41154.4). Total num frames: 46563328. Throughput: 0: 41027.8. Samples: 46745760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-21 14:19:48,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-21 14:19:50,702][15401] Updated weights for policy 0, policy_version 2850 (0.0039) [2024-06-21 14:19:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 40958.3, 300 sec: 41320.7). Total num frames: 46776320. Throughput: 0: 40908.4. Samples: 46867800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-21 14:19:53,393][15132] Avg episode reward: [(0, '0.295')] [2024-06-21 14:19:55,278][15401] Updated weights for policy 0, policy_version 2860 (0.0044) [2024-06-21 14:19:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41506.2, 300 sec: 41265.7). Total num frames: 47005696. Throughput: 0: 41034.2. Samples: 47117940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-21 14:19:58,391][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 14:19:58,879][15401] Updated weights for policy 0, policy_version 2870 (0.0037) [2024-06-21 14:20:03,057][15401] Updated weights for policy 0, policy_version 2880 (0.0040) [2024-06-21 14:20:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 40691.4, 300 sec: 41098.8). Total num frames: 47185920. Throughput: 0: 40862.7. Samples: 47360800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 14:20:03,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 14:20:06,755][15401] Updated weights for policy 0, policy_version 2890 (0.0034) [2024-06-21 14:20:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 47398912. Throughput: 0: 40945.2. Samples: 47480200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 14:20:08,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 14:20:10,834][15401] Updated weights for policy 0, policy_version 2900 (0.0036) [2024-06-21 14:20:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 47595520. Throughput: 0: 40871.5. Samples: 47729720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 14:20:13,399][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 14:20:14,945][15401] Updated weights for policy 0, policy_version 2910 (0.0041) [2024-06-21 14:20:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 47824896. Throughput: 0: 40730.1. Samples: 47972240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:20:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 14:20:18,569][15401] Updated weights for policy 0, policy_version 2920 (0.0040) [2024-06-21 14:20:22,862][15401] Updated weights for policy 0, policy_version 2930 (0.0034) [2024-06-21 14:20:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 48021504. Throughput: 0: 40960.4. Samples: 48098480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:20:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 14:20:26,433][15401] Updated weights for policy 0, policy_version 2940 (0.0035) [2024-06-21 14:20:28,389][15132] Fps is (10 sec: 37683.7, 60 sec: 40140.8, 300 sec: 41154.4). Total num frames: 48201728. Throughput: 0: 40856.5. Samples: 48341700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:20:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 14:20:30,883][15401] Updated weights for policy 0, policy_version 2950 (0.0038) [2024-06-21 14:20:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 48431104. Throughput: 0: 40923.8. Samples: 48587340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:20:33,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-21 14:20:34,510][15401] Updated weights for policy 0, policy_version 2960 (0.0040) [2024-06-21 14:20:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 48627712. Throughput: 0: 41092.8. Samples: 48716880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 14:20:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 14:20:38,990][15401] Updated weights for policy 0, policy_version 2970 (0.0031) [2024-06-21 14:20:42,515][15401] Updated weights for policy 0, policy_version 2980 (0.0034) [2024-06-21 14:20:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 48840704. Throughput: 0: 40902.3. Samples: 48958540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 14:20:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 14:20:46,979][15401] Updated weights for policy 0, policy_version 2990 (0.0036) [2024-06-21 14:20:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 49020928. Throughput: 0: 41010.2. Samples: 49206260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:20:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 14:20:50,566][15401] Updated weights for policy 0, policy_version 3000 (0.0028) [2024-06-21 14:20:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.8, 300 sec: 41154.4). Total num frames: 49250304. Throughput: 0: 40960.6. Samples: 49323420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:20:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 14:20:55,087][15401] Updated weights for policy 0, policy_version 3010 (0.0035) [2024-06-21 14:20:56,494][15349] Signal inference workers to stop experience collection... (700 times) [2024-06-21 14:20:56,540][15401] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-21 14:20:56,561][15349] Signal inference workers to resume experience collection... (700 times) [2024-06-21 14:20:56,562][15401] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-21 14:20:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 49463296. Throughput: 0: 41005.8. Samples: 49574980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:20:58,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 14:20:58,504][15401] Updated weights for policy 0, policy_version 3020 (0.0029) [2024-06-21 14:21:03,238][15401] Updated weights for policy 0, policy_version 3030 (0.0036) [2024-06-21 14:21:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 49643520. Throughput: 0: 41210.7. Samples: 49826720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 14:21:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 14:21:06,514][15401] Updated weights for policy 0, policy_version 3040 (0.0049) [2024-06-21 14:21:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 49856512. Throughput: 0: 41005.5. Samples: 49943720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 14:21:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 14:21:11,604][15401] Updated weights for policy 0, policy_version 3050 (0.0049) [2024-06-21 14:21:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 50069504. Throughput: 0: 41008.7. Samples: 50187100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 14:21:13,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 14:21:14,741][15401] Updated weights for policy 0, policy_version 3060 (0.0039) [2024-06-21 14:21:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 50266112. Throughput: 0: 41024.5. Samples: 50433440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 14:21:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 14:21:19,359][15401] Updated weights for policy 0, policy_version 3070 (0.0034) [2024-06-21 14:21:22,647][15401] Updated weights for policy 0, policy_version 3080 (0.0036) [2024-06-21 14:21:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 50479104. Throughput: 0: 40860.1. Samples: 50555580. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-21 14:21:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 14:21:27,331][15401] Updated weights for policy 0, policy_version 3090 (0.0031) [2024-06-21 14:21:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 50675712. Throughput: 0: 41044.4. Samples: 50805540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 14:21:28,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 14:21:30,784][15401] Updated weights for policy 0, policy_version 3100 (0.0040) [2024-06-21 14:21:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 40958.5, 300 sec: 41154.1). Total num frames: 50888704. Throughput: 0: 40985.5. Samples: 51050700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 14:21:33,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 14:21:34,964][15401] Updated weights for policy 0, policy_version 3110 (0.0033) [2024-06-21 14:21:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 51085312. Throughput: 0: 41218.1. Samples: 51178240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:21:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 14:21:38,581][15401] Updated weights for policy 0, policy_version 3120 (0.0038) [2024-06-21 14:21:42,875][15401] Updated weights for policy 0, policy_version 3130 (0.0035) [2024-06-21 14:21:43,396][15132] Fps is (10 sec: 40943.2, 60 sec: 40955.5, 300 sec: 41209.0). Total num frames: 51298304. Throughput: 0: 41149.2. Samples: 51426960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 14:21:43,397][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 14:21:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003131_51298304.pth... [2024-06-21 14:21:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002529_41435136.pth [2024-06-21 14:21:46,480][15401] Updated weights for policy 0, policy_version 3140 (0.0035) [2024-06-21 14:21:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 41504.5, 300 sec: 41154.1). Total num frames: 51511296. Throughput: 0: 40879.6. Samples: 51666400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:21:48,401][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 14:21:50,952][15401] Updated weights for policy 0, policy_version 3150 (0.0053) [2024-06-21 14:21:53,390][15132] Fps is (10 sec: 40986.4, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 51707904. Throughput: 0: 41075.5. Samples: 51792120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:21:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 14:21:54,480][15401] Updated weights for policy 0, policy_version 3160 (0.0031) [2024-06-21 14:21:58,390][15132] Fps is (10 sec: 39329.6, 60 sec: 40686.7, 300 sec: 41154.3). Total num frames: 51904512. Throughput: 0: 41094.4. Samples: 52036360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:21:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 14:21:58,939][15401] Updated weights for policy 0, policy_version 3170 (0.0034) [2024-06-21 14:22:02,364][15401] Updated weights for policy 0, policy_version 3180 (0.0034) [2024-06-21 14:22:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 52133888. Throughput: 0: 41081.4. Samples: 52282100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 14:22:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 14:22:06,782][15401] Updated weights for policy 0, policy_version 3190 (0.0037) [2024-06-21 14:22:08,390][15132] Fps is (10 sec: 40961.5, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 52314112. Throughput: 0: 41177.3. Samples: 52408560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:22:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 14:22:10,622][15401] Updated weights for policy 0, policy_version 3200 (0.0050) [2024-06-21 14:22:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 52543488. Throughput: 0: 41039.9. Samples: 52652340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:22:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 14:22:14,902][15401] Updated weights for policy 0, policy_version 3210 (0.0043) [2024-06-21 14:22:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 52740096. Throughput: 0: 41100.5. Samples: 52900120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:22:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 14:22:18,449][15401] Updated weights for policy 0, policy_version 3220 (0.0044) [2024-06-21 14:22:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 52920320. Throughput: 0: 40950.8. Samples: 53021020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:22:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 14:22:23,390][15401] Updated weights for policy 0, policy_version 3230 (0.0035) [2024-06-21 14:22:23,937][15349] Signal inference workers to stop experience collection... (750 times) [2024-06-21 14:22:23,946][15349] Signal inference workers to resume experience collection... (750 times) [2024-06-21 14:22:23,951][15401] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-21 14:22:23,971][15401] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-21 14:22:26,579][15401] Updated weights for policy 0, policy_version 3240 (0.0026) [2024-06-21 14:22:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 53149696. Throughput: 0: 40948.1. Samples: 53269360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:22:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 14:22:31,124][15401] Updated weights for policy 0, policy_version 3250 (0.0035) [2024-06-21 14:22:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40961.6, 300 sec: 41043.3). Total num frames: 53346304. Throughput: 0: 41159.5. Samples: 53518480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:22:33,405][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 14:22:34,829][15401] Updated weights for policy 0, policy_version 3260 (0.0028) [2024-06-21 14:22:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 53559296. Throughput: 0: 41038.2. Samples: 53638840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-21 14:22:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 14:22:38,936][15401] Updated weights for policy 0, policy_version 3270 (0.0030) [2024-06-21 14:22:42,890][15401] Updated weights for policy 0, policy_version 3280 (0.0034) [2024-06-21 14:22:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41510.6, 300 sec: 41209.9). Total num frames: 53788672. Throughput: 0: 41208.3. Samples: 53890720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:22:43,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-21 14:22:46,733][15401] Updated weights for policy 0, policy_version 3290 (0.0050) [2024-06-21 14:22:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 40686.9, 300 sec: 40931.9). Total num frames: 53952512. Throughput: 0: 41272.9. Samples: 54139480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:22:48,393][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 14:22:50,647][15401] Updated weights for policy 0, policy_version 3300 (0.0029) [2024-06-21 14:22:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 54198272. Throughput: 0: 41060.8. Samples: 54256300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-21 14:22:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 14:22:54,843][15401] Updated weights for policy 0, policy_version 3310 (0.0042) [2024-06-21 14:22:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 40960.3, 300 sec: 41098.9). Total num frames: 54362112. Throughput: 0: 41257.0. Samples: 54508900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-21 14:22:58,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 14:22:58,659][15401] Updated weights for policy 0, policy_version 3320 (0.0028) [2024-06-21 14:23:02,663][15401] Updated weights for policy 0, policy_version 3330 (0.0034) [2024-06-21 14:23:03,390][15132] Fps is (10 sec: 37683.5, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 54575104. Throughput: 0: 41140.3. Samples: 54751440. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-21 14:23:03,392][15132] Avg episode reward: [(0, '0.211')] [2024-06-21 14:23:06,953][15401] Updated weights for policy 0, policy_version 3340 (0.0035) [2024-06-21 14:23:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 54820864. Throughput: 0: 41227.9. Samples: 54876280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 14:23:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 14:23:10,514][15401] Updated weights for policy 0, policy_version 3350 (0.0042) [2024-06-21 14:23:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 54968320. Throughput: 0: 41266.3. Samples: 55126340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:23:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 14:23:14,755][15401] Updated weights for policy 0, policy_version 3360 (0.0023) [2024-06-21 14:23:18,357][15401] Updated weights for policy 0, policy_version 3370 (0.0030) [2024-06-21 14:23:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 55214080. Throughput: 0: 41140.1. Samples: 55369780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 14:23:18,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 14:23:22,596][15401] Updated weights for policy 0, policy_version 3380 (0.0031) [2024-06-21 14:23:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 55410688. Throughput: 0: 41301.7. Samples: 55497420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 14:23:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 14:23:26,217][15401] Updated weights for policy 0, policy_version 3390 (0.0036) [2024-06-21 14:23:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 40988.5). Total num frames: 55607296. Throughput: 0: 41102.7. Samples: 55740340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-21 14:23:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 14:23:30,627][15401] Updated weights for policy 0, policy_version 3400 (0.0033) [2024-06-21 14:23:31,374][15349] Signal inference workers to stop experience collection... (800 times) [2024-06-21 14:23:31,381][15349] Signal inference workers to resume experience collection... (800 times) [2024-06-21 14:23:31,402][15401] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-21 14:23:31,402][15401] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-21 14:23:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 55836672. Throughput: 0: 40861.3. Samples: 55978140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:23:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 14:23:34,310][15401] Updated weights for policy 0, policy_version 3410 (0.0048) [2024-06-21 14:23:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 56000512. Throughput: 0: 41064.1. Samples: 56104180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 14:23:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 14:23:38,875][15401] Updated weights for policy 0, policy_version 3420 (0.0037) [2024-06-21 14:23:42,259][15401] Updated weights for policy 0, policy_version 3430 (0.0039) [2024-06-21 14:23:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40413.9, 300 sec: 40932.3). Total num frames: 56213504. Throughput: 0: 40860.4. Samples: 56347620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 14:23:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 14:23:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003431_56213504.pth... [2024-06-21 14:23:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000002831_46383104.pth [2024-06-21 14:23:46,874][15401] Updated weights for policy 0, policy_version 3440 (0.0041) [2024-06-21 14:23:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 41507.7, 300 sec: 41098.8). Total num frames: 56442880. Throughput: 0: 40705.6. Samples: 56583200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:23:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 14:23:50,456][15401] Updated weights for policy 0, policy_version 3450 (0.0038) [2024-06-21 14:23:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40413.9, 300 sec: 41043.3). Total num frames: 56623104. Throughput: 0: 40908.9. Samples: 56717180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:23:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 14:23:54,639][15401] Updated weights for policy 0, policy_version 3460 (0.0029) [2024-06-21 14:23:58,373][15401] Updated weights for policy 0, policy_version 3470 (0.0040) [2024-06-21 14:23:58,392][15132] Fps is (10 sec: 40951.7, 60 sec: 41504.5, 300 sec: 41043.9). Total num frames: 56852480. Throughput: 0: 40758.8. Samples: 56960580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:23:58,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 14:24:02,770][15401] Updated weights for policy 0, policy_version 3480 (0.0039) [2024-06-21 14:24:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 57065472. Throughput: 0: 40904.3. Samples: 57210480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 14:24:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 14:24:06,531][15401] Updated weights for policy 0, policy_version 3490 (0.0036) [2024-06-21 14:24:08,390][15132] Fps is (10 sec: 40968.9, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 57262080. Throughput: 0: 40885.8. Samples: 57337280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:24:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 14:24:10,629][15401] Updated weights for policy 0, policy_version 3500 (0.0039) [2024-06-21 14:24:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 57458688. Throughput: 0: 40813.3. Samples: 57576940. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-21 14:24:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 14:24:14,554][15401] Updated weights for policy 0, policy_version 3510 (0.0039) [2024-06-21 14:24:18,390][15132] Fps is (10 sec: 37683.5, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 57638912. Throughput: 0: 41261.4. Samples: 57834900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:24:18,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 14:24:18,612][15401] Updated weights for policy 0, policy_version 3520 (0.0047) [2024-06-21 14:24:22,547][15401] Updated weights for policy 0, policy_version 3530 (0.0027) [2024-06-21 14:24:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 57868288. Throughput: 0: 41106.7. Samples: 57953980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 14:24:23,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 14:24:26,450][15401] Updated weights for policy 0, policy_version 3540 (0.0034) [2024-06-21 14:24:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 58097664. Throughput: 0: 41196.0. Samples: 58201440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:24:28,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 14:24:30,339][15401] Updated weights for policy 0, policy_version 3550 (0.0039) [2024-06-21 14:24:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 58277888. Throughput: 0: 41578.5. Samples: 58454220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:24:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 14:24:34,135][15401] Updated weights for policy 0, policy_version 3560 (0.0039) [2024-06-21 14:24:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 58474496. Throughput: 0: 41220.9. Samples: 58572120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:24:38,402][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 14:24:38,404][15401] Updated weights for policy 0, policy_version 3570 (0.0038) [2024-06-21 14:24:41,799][15401] Updated weights for policy 0, policy_version 3580 (0.0040) [2024-06-21 14:24:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 58703872. Throughput: 0: 41399.5. Samples: 58823460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:24:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 14:24:46,143][15401] Updated weights for policy 0, policy_version 3590 (0.0035) [2024-06-21 14:24:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 40960.1, 300 sec: 41099.2). Total num frames: 58900480. Throughput: 0: 41407.5. Samples: 59073820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 14:24:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 14:24:50,359][15401] Updated weights for policy 0, policy_version 3600 (0.0036) [2024-06-21 14:24:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41098.9). Total num frames: 59129856. Throughput: 0: 41228.5. Samples: 59192560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 14:24:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 14:24:53,813][15401] Updated weights for policy 0, policy_version 3610 (0.0039) [2024-06-21 14:24:58,057][15401] Updated weights for policy 0, policy_version 3620 (0.0038) [2024-06-21 14:24:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41234.6, 300 sec: 41154.4). Total num frames: 59326464. Throughput: 0: 41571.1. Samples: 59447640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:24:58,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-21 14:24:58,552][15349] Signal inference workers to stop experience collection... (850 times) [2024-06-21 14:24:58,559][15349] Signal inference workers to resume experience collection... (850 times) [2024-06-21 14:24:58,604][15401] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-21 14:24:58,604][15401] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-21 14:25:01,513][15401] Updated weights for policy 0, policy_version 3630 (0.0029) [2024-06-21 14:25:03,389][15132] Fps is (10 sec: 36045.1, 60 sec: 40414.0, 300 sec: 40987.8). Total num frames: 59490304. Throughput: 0: 41447.6. Samples: 59700040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-21 14:25:03,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-21 14:25:05,750][15401] Updated weights for policy 0, policy_version 3640 (0.0039) [2024-06-21 14:25:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 59752448. Throughput: 0: 41350.6. Samples: 59814760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:25:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 14:25:09,575][15401] Updated weights for policy 0, policy_version 3650 (0.0039) [2024-06-21 14:25:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 59932672. Throughput: 0: 41497.8. Samples: 60068840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:25:13,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 14:25:14,163][15401] Updated weights for policy 0, policy_version 3660 (0.0033) [2024-06-21 14:25:17,494][15401] Updated weights for policy 0, policy_version 3670 (0.0033) [2024-06-21 14:25:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41098.9). Total num frames: 60145664. Throughput: 0: 41497.7. Samples: 60321620. Policy #0 lag: (min: 2.0, avg: 12.1, max: 23.0) [2024-06-21 14:25:18,391][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 14:25:21,928][15401] Updated weights for policy 0, policy_version 3680 (0.0038) [2024-06-21 14:25:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.0, 300 sec: 41265.4). Total num frames: 60375040. Throughput: 0: 41675.0. Samples: 60447500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-21 14:25:23,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 14:25:25,443][15401] Updated weights for policy 0, policy_version 3690 (0.0024) [2024-06-21 14:25:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 40958.4, 300 sec: 41098.5). Total num frames: 60555264. Throughput: 0: 41564.4. Samples: 60693960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 14:25:28,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 14:25:29,897][15401] Updated weights for policy 0, policy_version 3700 (0.0043) [2024-06-21 14:25:33,323][15401] Updated weights for policy 0, policy_version 3710 (0.0036) [2024-06-21 14:25:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 60784640. Throughput: 0: 41476.9. Samples: 60940280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:25:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 14:25:37,788][15401] Updated weights for policy 0, policy_version 3720 (0.0043) [2024-06-21 14:25:38,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 60997632. Throughput: 0: 41660.0. Samples: 61067260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:25:38,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 14:25:41,066][15401] Updated weights for policy 0, policy_version 3730 (0.0038) [2024-06-21 14:25:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 61194240. Throughput: 0: 41598.7. Samples: 61319580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:25:43,396][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 14:25:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003735_61194240.pth... [2024-06-21 14:25:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003131_51298304.pth [2024-06-21 14:25:45,629][15401] Updated weights for policy 0, policy_version 3740 (0.0049) [2024-06-21 14:25:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41265.4). Total num frames: 61423616. Throughput: 0: 41401.2. Samples: 61563100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 14:25:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 14:25:49,181][15401] Updated weights for policy 0, policy_version 3750 (0.0035) [2024-06-21 14:25:53,335][15401] Updated weights for policy 0, policy_version 3760 (0.0037) [2024-06-21 14:25:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 61603840. Throughput: 0: 41695.9. Samples: 61691080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 14:25:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 14:25:56,915][15401] Updated weights for policy 0, policy_version 3770 (0.0037) [2024-06-21 14:25:58,391][15132] Fps is (10 sec: 39316.9, 60 sec: 41505.3, 300 sec: 41265.3). Total num frames: 61816832. Throughput: 0: 41550.0. Samples: 61938640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-21 14:25:58,391][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 14:26:01,385][15401] Updated weights for policy 0, policy_version 3780 (0.0030) [2024-06-21 14:26:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 41265.4). Total num frames: 62029824. Throughput: 0: 41421.2. Samples: 62185580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:26:03,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 14:26:04,599][15401] Updated weights for policy 0, policy_version 3790 (0.0044) [2024-06-21 14:26:08,389][15132] Fps is (10 sec: 37688.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 62193664. Throughput: 0: 41408.2. Samples: 62310860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 14:26:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 14:26:09,557][15401] Updated weights for policy 0, policy_version 3800 (0.0047) [2024-06-21 14:26:12,318][15401] Updated weights for policy 0, policy_version 3810 (0.0025) [2024-06-21 14:26:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 62439424. Throughput: 0: 41326.6. Samples: 62553560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 14:26:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 14:26:17,277][15349] Signal inference workers to stop experience collection... (900 times) [2024-06-21 14:26:17,277][15349] Signal inference workers to resume experience collection... (900 times) [2024-06-21 14:26:17,281][15401] Updated weights for policy 0, policy_version 3820 (0.0038) [2024-06-21 14:26:17,292][15401] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-21 14:26:17,292][15401] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-21 14:26:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 62652416. Throughput: 0: 41436.5. Samples: 62804920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 14:26:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 14:26:20,137][15401] Updated weights for policy 0, policy_version 3830 (0.0039) [2024-06-21 14:26:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 62832640. Throughput: 0: 41451.0. Samples: 62932560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:26:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 14:26:25,044][15401] Updated weights for policy 0, policy_version 3840 (0.0047) [2024-06-21 14:26:27,908][15401] Updated weights for policy 0, policy_version 3850 (0.0044) [2024-06-21 14:26:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42052.3, 300 sec: 41321.0). Total num frames: 63078400. Throughput: 0: 41275.6. Samples: 63177080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 14:26:28,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 14:26:33,303][15401] Updated weights for policy 0, policy_version 3860 (0.0031) [2024-06-21 14:26:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 63242240. Throughput: 0: 41488.5. Samples: 63430080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:26:33,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 14:26:35,994][15401] Updated weights for policy 0, policy_version 3870 (0.0028) [2024-06-21 14:26:38,389][15132] Fps is (10 sec: 37692.4, 60 sec: 40960.0, 300 sec: 41210.8). Total num frames: 63455232. Throughput: 0: 41126.3. Samples: 63541760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-21 14:26:38,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-21 14:26:41,151][15401] Updated weights for policy 0, policy_version 3880 (0.0034) [2024-06-21 14:26:43,392][15132] Fps is (10 sec: 45863.9, 60 sec: 41777.5, 300 sec: 41321.0). Total num frames: 63700992. Throughput: 0: 41368.7. Samples: 63800280. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-21 14:26:43,392][15132] Avg episode reward: [(0, '0.214')] [2024-06-21 14:26:43,836][15401] Updated weights for policy 0, policy_version 3890 (0.0039) [2024-06-21 14:26:48,394][15132] Fps is (10 sec: 39301.9, 60 sec: 40410.6, 300 sec: 41153.7). Total num frames: 63848448. Throughput: 0: 41503.6. Samples: 64053440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 14:26:48,395][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 14:26:48,910][15401] Updated weights for policy 0, policy_version 3900 (0.0040) [2024-06-21 14:26:51,512][15401] Updated weights for policy 0, policy_version 3910 (0.0028) [2024-06-21 14:26:53,390][15132] Fps is (10 sec: 39330.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 64094208. Throughput: 0: 41182.6. Samples: 64164080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:26:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 14:26:56,838][15401] Updated weights for policy 0, policy_version 3920 (0.0043) [2024-06-21 14:26:58,389][15132] Fps is (10 sec: 44258.9, 60 sec: 41233.9, 300 sec: 41209.9). Total num frames: 64290816. Throughput: 0: 41437.8. Samples: 64418260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 14:26:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 14:26:59,509][15401] Updated weights for policy 0, policy_version 3930 (0.0041) [2024-06-21 14:27:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 64487424. Throughput: 0: 41591.1. Samples: 64676520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 14:27:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 14:27:04,578][15401] Updated weights for policy 0, policy_version 3940 (0.0033) [2024-06-21 14:27:07,605][15401] Updated weights for policy 0, policy_version 3950 (0.0037) [2024-06-21 14:27:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 41321.0). Total num frames: 64733184. Throughput: 0: 41342.8. Samples: 64792980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 14:27:08,396][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 14:27:12,751][15401] Updated weights for policy 0, policy_version 3960 (0.0043) [2024-06-21 14:27:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 64880640. Throughput: 0: 41447.1. Samples: 65042100. Policy #0 lag: (min: 0.0, avg: 14.3, max: 23.0) [2024-06-21 14:27:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 14:27:15,367][15401] Updated weights for policy 0, policy_version 3970 (0.0046) [2024-06-21 14:27:18,390][15132] Fps is (10 sec: 37682.5, 60 sec: 40959.9, 300 sec: 41321.0). Total num frames: 65110016. Throughput: 0: 41248.3. Samples: 65286260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-21 14:27:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 14:27:20,977][15401] Updated weights for policy 0, policy_version 3980 (0.0035) [2024-06-21 14:27:23,302][15401] Updated weights for policy 0, policy_version 3990 (0.0035) [2024-06-21 14:27:23,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42325.3, 300 sec: 41432.1). Total num frames: 65372160. Throughput: 0: 41706.1. Samples: 65418540. Policy #0 lag: (min: 0.0, avg: 13.9, max: 26.0) [2024-06-21 14:27:23,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-21 14:27:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 40415.4, 300 sec: 41209.9). Total num frames: 65503232. Throughput: 0: 41319.0. Samples: 65659540. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-21 14:27:28,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-21 14:27:29,043][15401] Updated weights for policy 0, policy_version 4000 (0.0036) [2024-06-21 14:27:31,383][15401] Updated weights for policy 0, policy_version 4010 (0.0035) [2024-06-21 14:27:33,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 65748992. Throughput: 0: 41034.2. Samples: 65899780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-21 14:27:33,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 14:27:35,464][15349] Signal inference workers to stop experience collection... (950 times) [2024-06-21 14:27:35,503][15401] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-21 14:27:35,520][15349] Signal inference workers to resume experience collection... (950 times) [2024-06-21 14:27:35,522][15401] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-21 14:27:36,981][15401] Updated weights for policy 0, policy_version 4020 (0.0035) [2024-06-21 14:27:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 65945600. Throughput: 0: 41535.7. Samples: 66033180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 14:27:38,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-21 14:27:39,559][15401] Updated weights for policy 0, policy_version 4030 (0.0044) [2024-06-21 14:27:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40688.5, 300 sec: 41321.3). Total num frames: 66142208. Throughput: 0: 41354.5. Samples: 66279220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 14:27:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 14:27:43,447][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004038_66158592.pth... [2024-06-21 14:27:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003431_56213504.pth [2024-06-21 14:27:44,893][15401] Updated weights for policy 0, policy_version 4040 (0.0041) [2024-06-21 14:27:47,631][15401] Updated weights for policy 0, policy_version 4050 (0.0030) [2024-06-21 14:27:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42328.8, 300 sec: 41321.0). Total num frames: 66387968. Throughput: 0: 40846.6. Samples: 66514620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:27:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 14:27:52,617][15401] Updated weights for policy 0, policy_version 4060 (0.0032) [2024-06-21 14:27:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 66551808. Throughput: 0: 41271.6. Samples: 66650200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:27:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 14:27:55,442][15401] Updated weights for policy 0, policy_version 4070 (0.0040) [2024-06-21 14:27:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41504.5, 300 sec: 41376.2). Total num frames: 66781184. Throughput: 0: 41237.8. Samples: 66897900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 14:27:58,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-21 14:28:00,344][15401] Updated weights for policy 0, policy_version 4080 (0.0046) [2024-06-21 14:28:03,185][15401] Updated weights for policy 0, policy_version 4090 (0.0034) [2024-06-21 14:28:03,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42325.2, 300 sec: 41376.5). Total num frames: 67026944. Throughput: 0: 41293.3. Samples: 67144460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:28:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 14:28:07,977][15401] Updated weights for policy 0, policy_version 4100 (0.0036) [2024-06-21 14:28:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 40686.9, 300 sec: 41376.6). Total num frames: 67174400. Throughput: 0: 41289.5. Samples: 67276560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:28:08,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-21 14:28:11,165][15401] Updated weights for policy 0, policy_version 4110 (0.0036) [2024-06-21 14:28:13,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 41376.5). Total num frames: 67420160. Throughput: 0: 41312.0. Samples: 67518580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 14:28:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 14:28:15,719][15401] Updated weights for policy 0, policy_version 4120 (0.0029) [2024-06-21 14:28:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.4, 300 sec: 41376.6). Total num frames: 67616768. Throughput: 0: 41554.9. Samples: 67769740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 14:28:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 14:28:19,092][15401] Updated weights for policy 0, policy_version 4130 (0.0028) [2024-06-21 14:28:23,390][15132] Fps is (10 sec: 37682.5, 60 sec: 40413.8, 300 sec: 41321.0). Total num frames: 67796992. Throughput: 0: 41366.4. Samples: 67894680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:28:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 14:28:23,550][15401] Updated weights for policy 0, policy_version 4140 (0.0036) [2024-06-21 14:28:26,978][15401] Updated weights for policy 0, policy_version 4150 (0.0036) [2024-06-21 14:28:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 41376.5). Total num frames: 68042752. Throughput: 0: 41418.3. Samples: 68143040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:28:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 14:28:31,427][15401] Updated weights for policy 0, policy_version 4160 (0.0040) [2024-06-21 14:28:33,389][15132] Fps is (10 sec: 42599.8, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 68222976. Throughput: 0: 41758.8. Samples: 68393760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 14:28:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 14:28:35,323][15401] Updated weights for policy 0, policy_version 4170 (0.0036) [2024-06-21 14:28:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 68419584. Throughput: 0: 41464.8. Samples: 68516120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 14:28:38,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-21 14:28:39,548][15401] Updated weights for policy 0, policy_version 4180 (0.0031) [2024-06-21 14:28:43,023][15401] Updated weights for policy 0, policy_version 4190 (0.0029) [2024-06-21 14:28:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 68648960. Throughput: 0: 41505.2. Samples: 68765540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 14:28:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 14:28:47,479][15401] Updated weights for policy 0, policy_version 4200 (0.0040) [2024-06-21 14:28:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 68861952. Throughput: 0: 41397.8. Samples: 69007360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:28:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 14:28:51,334][15401] Updated weights for policy 0, policy_version 4210 (0.0034) [2024-06-21 14:28:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41376.8). Total num frames: 69058560. Throughput: 0: 41289.2. Samples: 69134580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:28:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 14:28:55,376][15401] Updated weights for policy 0, policy_version 4220 (0.0030) [2024-06-21 14:28:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41234.6, 300 sec: 41321.0). Total num frames: 69255168. Throughput: 0: 41380.0. Samples: 69380680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 14:28:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 14:28:59,455][15401] Updated weights for policy 0, policy_version 4230 (0.0048) [2024-06-21 14:29:00,781][15349] Signal inference workers to stop experience collection... (1000 times) [2024-06-21 14:29:00,812][15401] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-21 14:29:00,836][15349] Signal inference workers to resume experience collection... (1000 times) [2024-06-21 14:29:00,837][15401] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-21 14:29:03,210][15401] Updated weights for policy 0, policy_version 4240 (0.0046) [2024-06-21 14:29:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 69468160. Throughput: 0: 41280.8. Samples: 69627380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 14:29:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 14:29:07,260][15401] Updated weights for policy 0, policy_version 4250 (0.0031) [2024-06-21 14:29:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 69648384. Throughput: 0: 41197.1. Samples: 69748540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:29:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 14:29:11,123][15401] Updated weights for policy 0, policy_version 4260 (0.0036) [2024-06-21 14:29:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 69877760. Throughput: 0: 41349.7. Samples: 70003780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 19.0) [2024-06-21 14:29:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 14:29:15,034][15401] Updated weights for policy 0, policy_version 4270 (0.0035) [2024-06-21 14:29:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 70090752. Throughput: 0: 41319.5. Samples: 70253140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-21 14:29:18,390][15132] Avg episode reward: [(0, '0.132')] [2024-06-21 14:29:19,384][15401] Updated weights for policy 0, policy_version 4280 (0.0038) [2024-06-21 14:29:23,047][15401] Updated weights for policy 0, policy_version 4290 (0.0030) [2024-06-21 14:29:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 70287360. Throughput: 0: 41319.1. Samples: 70375480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-21 14:29:23,390][15132] Avg episode reward: [(0, '0.106')] [2024-06-21 14:29:27,106][15401] Updated weights for policy 0, policy_version 4300 (0.0037) [2024-06-21 14:29:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 70500352. Throughput: 0: 41361.4. Samples: 70626800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 14:29:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 14:29:31,062][15401] Updated weights for policy 0, policy_version 4310 (0.0037) [2024-06-21 14:29:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 70713344. Throughput: 0: 41339.9. Samples: 70867660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-21 14:29:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 14:29:34,980][15401] Updated weights for policy 0, policy_version 4320 (0.0042) [2024-06-21 14:29:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 70893568. Throughput: 0: 41426.8. Samples: 70998780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:29:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 14:29:39,043][15401] Updated weights for policy 0, policy_version 4330 (0.0039) [2024-06-21 14:29:42,891][15401] Updated weights for policy 0, policy_version 4340 (0.0046) [2024-06-21 14:29:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 71106560. Throughput: 0: 41360.6. Samples: 71241900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 14:29:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 14:29:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004341_71122944.pth... [2024-06-21 14:29:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000003735_61194240.pth [2024-06-21 14:29:47,126][15401] Updated weights for policy 0, policy_version 4350 (0.0029) [2024-06-21 14:29:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 71335936. Throughput: 0: 41361.7. Samples: 71488660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:29:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 14:29:51,089][15401] Updated weights for policy 0, policy_version 4360 (0.0038) [2024-06-21 14:29:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 41376.6). Total num frames: 71532544. Throughput: 0: 41490.7. Samples: 71615620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:29:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 14:29:55,148][15401] Updated weights for policy 0, policy_version 4370 (0.0043) [2024-06-21 14:29:58,396][15132] Fps is (10 sec: 40934.1, 60 sec: 41501.8, 300 sec: 41542.3). Total num frames: 71745536. Throughput: 0: 41267.6. Samples: 71861080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:29:58,396][15132] Avg episode reward: [(0, '0.355')] [2024-06-21 14:29:58,749][15401] Updated weights for policy 0, policy_version 4380 (0.0047) [2024-06-21 14:30:02,896][15401] Updated weights for policy 0, policy_version 4390 (0.0039) [2024-06-21 14:30:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 71942144. Throughput: 0: 41383.9. Samples: 72115420. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-21 14:30:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 14:30:06,450][15401] Updated weights for policy 0, policy_version 4400 (0.0036) [2024-06-21 14:30:08,389][15132] Fps is (10 sec: 40985.9, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 72155136. Throughput: 0: 41340.4. Samples: 72235800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 14:30:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 14:30:10,979][15401] Updated weights for policy 0, policy_version 4410 (0.0026) [2024-06-21 14:30:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 72384512. Throughput: 0: 41076.1. Samples: 72475220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 14:30:13,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-21 14:30:14,631][15401] Updated weights for policy 0, policy_version 4420 (0.0047) [2024-06-21 14:30:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 72548352. Throughput: 0: 41389.0. Samples: 72730160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 14:30:18,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 14:30:18,790][15401] Updated weights for policy 0, policy_version 4430 (0.0037) [2024-06-21 14:30:22,788][15401] Updated weights for policy 0, policy_version 4440 (0.0051) [2024-06-21 14:30:23,390][15132] Fps is (10 sec: 37682.4, 60 sec: 41233.0, 300 sec: 41376.9). Total num frames: 72761344. Throughput: 0: 41084.7. Samples: 72847600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 14:30:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 14:30:26,639][15401] Updated weights for policy 0, policy_version 4450 (0.0035) [2024-06-21 14:30:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 72990720. Throughput: 0: 41272.9. Samples: 73099180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 27.0) [2024-06-21 14:30:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 14:30:30,571][15401] Updated weights for policy 0, policy_version 4460 (0.0031) [2024-06-21 14:30:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 73154560. Throughput: 0: 41464.4. Samples: 73354560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:30:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 14:30:34,577][15401] Updated weights for policy 0, policy_version 4470 (0.0028) [2024-06-21 14:30:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 73383936. Throughput: 0: 41168.8. Samples: 73468220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 14:30:38,395][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 14:30:38,445][15401] Updated weights for policy 0, policy_version 4480 (0.0047) [2024-06-21 14:30:41,256][15349] Signal inference workers to stop experience collection... (1050 times) [2024-06-21 14:30:41,256][15349] Signal inference workers to resume experience collection... (1050 times) [2024-06-21 14:30:41,299][15401] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-21 14:30:41,299][15401] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-21 14:30:42,575][15401] Updated weights for policy 0, policy_version 4490 (0.0042) [2024-06-21 14:30:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 73596928. Throughput: 0: 41492.9. Samples: 73728000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 14:30:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-21 14:30:46,486][15401] Updated weights for policy 0, policy_version 4500 (0.0032) [2024-06-21 14:30:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 73809920. Throughput: 0: 41260.1. Samples: 73972120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:30:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 14:30:50,806][15401] Updated weights for policy 0, policy_version 4510 (0.0033) [2024-06-21 14:30:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41432.3). Total num frames: 74039296. Throughput: 0: 41247.5. Samples: 74091940. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-21 14:30:53,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 14:30:54,553][15401] Updated weights for policy 0, policy_version 4520 (0.0046) [2024-06-21 14:30:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 40691.2, 300 sec: 41209.9). Total num frames: 74186752. Throughput: 0: 41453.3. Samples: 74340620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:30:58,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-21 14:30:58,738][15401] Updated weights for policy 0, policy_version 4530 (0.0033) [2024-06-21 14:31:02,555][15401] Updated weights for policy 0, policy_version 4540 (0.0030) [2024-06-21 14:31:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 74416128. Throughput: 0: 41150.7. Samples: 74581940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:31:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 14:31:06,641][15401] Updated weights for policy 0, policy_version 4550 (0.0034) [2024-06-21 14:31:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 74629120. Throughput: 0: 41341.5. Samples: 74707960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:31:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 14:31:10,458][15401] Updated weights for policy 0, policy_version 4560 (0.0032) [2024-06-21 14:31:13,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 74809344. Throughput: 0: 41037.3. Samples: 74945860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:31:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 14:31:14,695][15401] Updated weights for policy 0, policy_version 4570 (0.0040) [2024-06-21 14:31:18,336][15401] Updated weights for policy 0, policy_version 4580 (0.0039) [2024-06-21 14:31:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 75038720. Throughput: 0: 40852.5. Samples: 75192920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 14:31:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 14:31:22,513][15401] Updated weights for policy 0, policy_version 4590 (0.0026) [2024-06-21 14:31:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.2, 300 sec: 41210.3). Total num frames: 75235328. Throughput: 0: 41221.5. Samples: 75323180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 14:31:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 14:31:26,611][15401] Updated weights for policy 0, policy_version 4600 (0.0041) [2024-06-21 14:31:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 75431936. Throughput: 0: 40827.1. Samples: 75565220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:31:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 14:31:30,619][15401] Updated weights for policy 0, policy_version 4610 (0.0035) [2024-06-21 14:31:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 75661312. Throughput: 0: 40774.1. Samples: 75806960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-21 14:31:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 14:31:34,385][15401] Updated weights for policy 0, policy_version 4620 (0.0038) [2024-06-21 14:31:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41154.7). Total num frames: 75841536. Throughput: 0: 40972.4. Samples: 75935700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:31:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 14:31:38,492][15401] Updated weights for policy 0, policy_version 4630 (0.0030) [2024-06-21 14:31:42,270][15401] Updated weights for policy 0, policy_version 4640 (0.0052) [2024-06-21 14:31:43,389][15132] Fps is (10 sec: 37684.0, 60 sec: 40687.0, 300 sec: 41321.7). Total num frames: 76038144. Throughput: 0: 40859.2. Samples: 76179280. Policy #0 lag: (min: 0.0, avg: 13.5, max: 25.0) [2024-06-21 14:31:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 14:31:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004642_76054528.pth... [2024-06-21 14:31:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004038_66158592.pth [2024-06-21 14:31:46,740][15401] Updated weights for policy 0, policy_version 4650 (0.0039) [2024-06-21 14:31:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 76267520. Throughput: 0: 40942.4. Samples: 76424340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:31:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 14:31:50,082][15401] Updated weights for policy 0, policy_version 4660 (0.0035) [2024-06-21 14:31:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 40413.8, 300 sec: 41265.5). Total num frames: 76464128. Throughput: 0: 40861.3. Samples: 76546720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:31:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 14:31:54,609][15401] Updated weights for policy 0, policy_version 4670 (0.0038) [2024-06-21 14:31:57,691][15349] Signal inference workers to stop experience collection... (1100 times) [2024-06-21 14:31:57,705][15349] Signal inference workers to resume experience collection... (1100 times) [2024-06-21 14:31:57,710][15401] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-21 14:31:57,743][15401] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-21 14:31:57,845][15401] Updated weights for policy 0, policy_version 4680 (0.0034) [2024-06-21 14:31:58,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41504.5, 300 sec: 41320.7). Total num frames: 76677120. Throughput: 0: 40996.5. Samples: 76790800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 14:31:58,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 14:32:02,442][15401] Updated weights for policy 0, policy_version 4690 (0.0035) [2024-06-21 14:32:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 76890112. Throughput: 0: 41194.1. Samples: 77046660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 14:32:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 14:32:05,669][15401] Updated weights for policy 0, policy_version 4700 (0.0049) [2024-06-21 14:32:08,390][15132] Fps is (10 sec: 40969.5, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 77086720. Throughput: 0: 40993.2. Samples: 77167880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 14:32:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 14:32:10,199][15401] Updated weights for policy 0, policy_version 4710 (0.0038) [2024-06-21 14:32:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 77316096. Throughput: 0: 41149.8. Samples: 77416960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:32:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 14:32:13,421][15401] Updated weights for policy 0, policy_version 4720 (0.0029) [2024-06-21 14:32:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 77479936. Throughput: 0: 41291.8. Samples: 77665080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:32:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 14:32:18,488][15401] Updated weights for policy 0, policy_version 4730 (0.0042) [2024-06-21 14:32:22,193][15401] Updated weights for policy 0, policy_version 4740 (0.0035) [2024-06-21 14:32:23,390][15132] Fps is (10 sec: 36044.4, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 77676544. Throughput: 0: 41006.2. Samples: 77780980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 14:32:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 14:32:26,237][15401] Updated weights for policy 0, policy_version 4750 (0.0042) [2024-06-21 14:32:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 77922304. Throughput: 0: 41121.7. Samples: 78029760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 14:32:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-21 14:32:30,291][15401] Updated weights for policy 0, policy_version 4760 (0.0031) [2024-06-21 14:32:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 40687.1, 300 sec: 41209.9). Total num frames: 78102528. Throughput: 0: 41133.3. Samples: 78275340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 14:32:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-21 14:32:34,229][15401] Updated weights for policy 0, policy_version 4770 (0.0040) [2024-06-21 14:32:38,160][15401] Updated weights for policy 0, policy_version 4780 (0.0037) [2024-06-21 14:32:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 78315520. Throughput: 0: 41086.3. Samples: 78395600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 14:32:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-21 14:32:42,336][15401] Updated weights for policy 0, policy_version 4790 (0.0042) [2024-06-21 14:32:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 78544896. Throughput: 0: 41146.2. Samples: 78642280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 14:32:43,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 14:32:45,900][15401] Updated weights for policy 0, policy_version 4800 (0.0032) [2024-06-21 14:32:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 40959.8, 300 sec: 41265.4). Total num frames: 78725120. Throughput: 0: 41055.4. Samples: 78894160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 14:32:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 14:32:50,283][15401] Updated weights for policy 0, policy_version 4810 (0.0038) [2024-06-21 14:32:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41210.2). Total num frames: 78938112. Throughput: 0: 40850.2. Samples: 79006140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 14:32:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 14:32:54,280][15401] Updated weights for policy 0, policy_version 4820 (0.0043) [2024-06-21 14:32:58,121][15401] Updated weights for policy 0, policy_version 4830 (0.0034) [2024-06-21 14:32:58,390][15132] Fps is (10 sec: 42599.1, 60 sec: 41234.7, 300 sec: 41098.9). Total num frames: 79151104. Throughput: 0: 41135.9. Samples: 79268080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:32:58,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 14:33:01,983][15401] Updated weights for policy 0, policy_version 4840 (0.0039) [2024-06-21 14:33:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 79331328. Throughput: 0: 41132.3. Samples: 79516040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:33:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 14:33:06,074][15401] Updated weights for policy 0, policy_version 4850 (0.0024) [2024-06-21 14:33:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 79577088. Throughput: 0: 41259.5. Samples: 79637660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:33:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 14:33:09,977][15401] Updated weights for policy 0, policy_version 4860 (0.0034) [2024-06-21 14:33:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40140.7, 300 sec: 41043.3). Total num frames: 79724544. Throughput: 0: 41265.2. Samples: 79886700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:33:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 14:33:14,051][15401] Updated weights for policy 0, policy_version 4870 (0.0033) [2024-06-21 14:33:17,857][15401] Updated weights for policy 0, policy_version 4880 (0.0037) [2024-06-21 14:33:18,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41233.1, 300 sec: 41210.0). Total num frames: 79953920. Throughput: 0: 41177.4. Samples: 80128320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 14:33:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 14:33:21,939][15401] Updated weights for policy 0, policy_version 4890 (0.0037) [2024-06-21 14:33:23,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 41209.9). Total num frames: 80199680. Throughput: 0: 41384.8. Samples: 80257920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 14:33:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 14:33:25,846][15401] Updated weights for policy 0, policy_version 4900 (0.0039) [2024-06-21 14:33:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 80379904. Throughput: 0: 41445.7. Samples: 80507340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 14:33:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 14:33:29,845][15401] Updated weights for policy 0, policy_version 4910 (0.0046) [2024-06-21 14:33:33,391][15132] Fps is (10 sec: 37676.3, 60 sec: 41231.7, 300 sec: 41209.7). Total num frames: 80576512. Throughput: 0: 41241.1. Samples: 80750080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 14:33:33,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 14:33:33,698][15401] Updated weights for policy 0, policy_version 4920 (0.0029) [2024-06-21 14:33:35,809][15349] Signal inference workers to stop experience collection... (1150 times) [2024-06-21 14:33:35,857][15401] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-21 14:33:35,866][15349] Signal inference workers to resume experience collection... (1150 times) [2024-06-21 14:33:35,876][15401] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-21 14:33:37,764][15401] Updated weights for policy 0, policy_version 4930 (0.0043) [2024-06-21 14:33:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 80822272. Throughput: 0: 41544.5. Samples: 80875640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 14:33:38,391][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 14:33:41,572][15401] Updated weights for policy 0, policy_version 4940 (0.0045) [2024-06-21 14:33:43,389][15132] Fps is (10 sec: 39329.2, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 80969728. Throughput: 0: 41392.0. Samples: 81130720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-21 14:33:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 14:33:43,484][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004943_80986112.pth... [2024-06-21 14:33:43,552][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004341_71122944.pth [2024-06-21 14:33:45,524][15401] Updated weights for policy 0, policy_version 4950 (0.0030) [2024-06-21 14:33:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41233.3, 300 sec: 41154.4). Total num frames: 81199104. Throughput: 0: 41275.2. Samples: 81373420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 14:33:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 14:33:49,683][15401] Updated weights for policy 0, policy_version 4960 (0.0035) [2024-06-21 14:33:53,372][15401] Updated weights for policy 0, policy_version 4970 (0.0038) [2024-06-21 14:33:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 81428480. Throughput: 0: 41531.7. Samples: 81506580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-21 14:33:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 14:33:57,933][15401] Updated weights for policy 0, policy_version 4980 (0.0031) [2024-06-21 14:33:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 81592320. Throughput: 0: 41404.0. Samples: 81749880. Policy #0 lag: (min: 0.0, avg: 13.5, max: 28.0) [2024-06-21 14:33:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 14:34:01,241][15401] Updated weights for policy 0, policy_version 4990 (0.0042) [2024-06-21 14:34:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 81854464. Throughput: 0: 41314.5. Samples: 81987480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:34:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 14:34:05,860][15401] Updated weights for policy 0, policy_version 5000 (0.0025) [2024-06-21 14:34:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 82018304. Throughput: 0: 41412.1. Samples: 82121460. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 14:34:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 14:34:09,298][15401] Updated weights for policy 0, policy_version 5010 (0.0026) [2024-06-21 14:34:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 82231296. Throughput: 0: 41329.7. Samples: 82367180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:34:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 14:34:13,675][15401] Updated weights for policy 0, policy_version 5020 (0.0038) [2024-06-21 14:34:17,338][15401] Updated weights for policy 0, policy_version 5030 (0.0037) [2024-06-21 14:34:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 82477056. Throughput: 0: 41441.8. Samples: 82614880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:34:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 14:34:21,454][15401] Updated weights for policy 0, policy_version 5040 (0.0042) [2024-06-21 14:34:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 82657280. Throughput: 0: 41625.7. Samples: 82748800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 14:34:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 14:34:25,103][15401] Updated weights for policy 0, policy_version 5050 (0.0030) [2024-06-21 14:34:28,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41504.5, 300 sec: 41209.6). Total num frames: 82870272. Throughput: 0: 41260.5. Samples: 82987540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 14:34:28,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 14:34:29,263][15401] Updated weights for policy 0, policy_version 5060 (0.0038) [2024-06-21 14:34:32,883][15401] Updated weights for policy 0, policy_version 5070 (0.0041) [2024-06-21 14:34:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41778.9, 300 sec: 41320.7). Total num frames: 83083264. Throughput: 0: 41576.8. Samples: 83244480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:34:33,393][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 14:34:37,123][15401] Updated weights for policy 0, policy_version 5080 (0.0036) [2024-06-21 14:34:38,390][15132] Fps is (10 sec: 39331.0, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 83263488. Throughput: 0: 41338.1. Samples: 83366800. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-21 14:34:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 14:34:40,966][15401] Updated weights for policy 0, policy_version 5090 (0.0040) [2024-06-21 14:34:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 41265.5). Total num frames: 83509248. Throughput: 0: 41180.4. Samples: 83603000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 14:34:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 14:34:45,325][15401] Updated weights for policy 0, policy_version 5100 (0.0039) [2024-06-21 14:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 83689472. Throughput: 0: 41611.5. Samples: 83860000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:34:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 14:34:48,947][15401] Updated weights for policy 0, policy_version 5110 (0.0038) [2024-06-21 14:34:49,074][15349] Signal inference workers to stop experience collection... (1200 times) [2024-06-21 14:34:49,088][15401] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-21 14:34:49,200][15349] Signal inference workers to resume experience collection... (1200 times) [2024-06-21 14:34:49,201][15401] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-21 14:34:53,263][15401] Updated weights for policy 0, policy_version 5120 (0.0037) [2024-06-21 14:34:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40959.9, 300 sec: 41155.3). Total num frames: 83886080. Throughput: 0: 41200.8. Samples: 83975500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-21 14:34:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 14:34:56,933][15401] Updated weights for policy 0, policy_version 5130 (0.0030) [2024-06-21 14:34:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 41376.5). Total num frames: 84148224. Throughput: 0: 41342.2. Samples: 84227580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-21 14:34:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 14:35:01,116][15401] Updated weights for policy 0, policy_version 5140 (0.0038) [2024-06-21 14:35:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 40685.3, 300 sec: 41154.0). Total num frames: 84295680. Throughput: 0: 41484.8. Samples: 84481800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:35:03,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 14:35:04,902][15401] Updated weights for policy 0, policy_version 5150 (0.0034) [2024-06-21 14:35:08,390][15132] Fps is (10 sec: 36044.7, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 84508672. Throughput: 0: 40969.2. Samples: 84592420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 14:35:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 14:35:08,981][15401] Updated weights for policy 0, policy_version 5160 (0.0036) [2024-06-21 14:35:12,786][15401] Updated weights for policy 0, policy_version 5170 (0.0041) [2024-06-21 14:35:13,390][15132] Fps is (10 sec: 44247.7, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 84738048. Throughput: 0: 41352.8. Samples: 84848320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 14:35:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 14:35:16,809][15401] Updated weights for policy 0, policy_version 5180 (0.0047) [2024-06-21 14:35:18,390][15132] Fps is (10 sec: 39322.2, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 84901888. Throughput: 0: 41203.1. Samples: 85098520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 14:35:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 14:35:20,681][15401] Updated weights for policy 0, policy_version 5190 (0.0036) [2024-06-21 14:35:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41504.5, 300 sec: 41209.6). Total num frames: 85147648. Throughput: 0: 41055.6. Samples: 85214400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 14:35:23,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 14:35:24,753][15401] Updated weights for policy 0, policy_version 5200 (0.0041) [2024-06-21 14:35:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41234.7, 300 sec: 41321.0). Total num frames: 85344256. Throughput: 0: 41454.2. Samples: 85468440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 14:35:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 14:35:28,538][15401] Updated weights for policy 0, policy_version 5210 (0.0051) [2024-06-21 14:35:33,036][15401] Updated weights for policy 0, policy_version 5220 (0.0047) [2024-06-21 14:35:33,390][15132] Fps is (10 sec: 37692.2, 60 sec: 40688.5, 300 sec: 41154.4). Total num frames: 85524480. Throughput: 0: 41087.6. Samples: 85708940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 14:35:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-21 14:35:36,374][15401] Updated weights for policy 0, policy_version 5230 (0.0038) [2024-06-21 14:35:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 85770240. Throughput: 0: 41159.1. Samples: 85827660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 14:35:38,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 14:35:40,850][15401] Updated weights for policy 0, policy_version 5240 (0.0033) [2024-06-21 14:35:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 85950464. Throughput: 0: 41241.0. Samples: 86083420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 14:35:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 14:35:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005247_85966848.pth... [2024-06-21 14:35:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004642_76054528.pth [2024-06-21 14:35:44,669][15401] Updated weights for policy 0, policy_version 5250 (0.0035) [2024-06-21 14:35:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 86163456. Throughput: 0: 40839.6. Samples: 86319480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:35:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 14:35:48,690][15401] Updated weights for policy 0, policy_version 5260 (0.0027) [2024-06-21 14:35:52,672][15401] Updated weights for policy 0, policy_version 5270 (0.0035) [2024-06-21 14:35:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 86360064. Throughput: 0: 41226.8. Samples: 86447620. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 14:35:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 14:35:56,664][15401] Updated weights for policy 0, policy_version 5280 (0.0034) [2024-06-21 14:35:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 39867.8, 300 sec: 41098.9). Total num frames: 86540288. Throughput: 0: 41068.0. Samples: 86696380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 14:35:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 14:36:00,568][15401] Updated weights for policy 0, policy_version 5290 (0.0043) [2024-06-21 14:36:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41780.9, 300 sec: 41265.5). Total num frames: 86802432. Throughput: 0: 40834.2. Samples: 86936060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:36:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 14:36:05,060][15401] Updated weights for policy 0, policy_version 5300 (0.0041) [2024-06-21 14:36:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 86982656. Throughput: 0: 41296.4. Samples: 87072640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-21 14:36:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 14:36:08,450][15401] Updated weights for policy 0, policy_version 5310 (0.0035) [2024-06-21 14:36:12,979][15401] Updated weights for policy 0, policy_version 5320 (0.0038) [2024-06-21 14:36:13,390][15132] Fps is (10 sec: 36044.9, 60 sec: 40413.9, 300 sec: 41098.8). Total num frames: 87162880. Throughput: 0: 41048.0. Samples: 87315600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:36:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 14:36:16,464][15401] Updated weights for policy 0, policy_version 5330 (0.0042) [2024-06-21 14:36:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 87408640. Throughput: 0: 41066.2. Samples: 87556920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:36:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 14:36:20,786][15401] Updated weights for policy 0, policy_version 5340 (0.0045) [2024-06-21 14:36:23,273][15349] Signal inference workers to stop experience collection... (1250 times) [2024-06-21 14:36:23,273][15349] Signal inference workers to resume experience collection... (1250 times) [2024-06-21 14:36:23,307][15401] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-21 14:36:23,307][15401] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-21 14:36:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 40961.7, 300 sec: 41265.5). Total num frames: 87605248. Throughput: 0: 41408.0. Samples: 87691020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-21 14:36:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 14:36:24,297][15401] Updated weights for policy 0, policy_version 5350 (0.0043) [2024-06-21 14:36:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 87801856. Throughput: 0: 41065.4. Samples: 87931360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 14:36:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 14:36:28,622][15401] Updated weights for policy 0, policy_version 5360 (0.0046) [2024-06-21 14:36:32,118][15401] Updated weights for policy 0, policy_version 5370 (0.0032) [2024-06-21 14:36:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 88031232. Throughput: 0: 41315.9. Samples: 88178700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:36:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 14:36:36,596][15401] Updated weights for policy 0, policy_version 5380 (0.0050) [2024-06-21 14:36:38,392][15132] Fps is (10 sec: 39311.6, 60 sec: 40412.2, 300 sec: 41209.6). Total num frames: 88195072. Throughput: 0: 41279.0. Samples: 88305280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 14:36:38,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 14:36:39,971][15401] Updated weights for policy 0, policy_version 5390 (0.0037) [2024-06-21 14:36:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 88440832. Throughput: 0: 41202.6. Samples: 88550500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:36:43,392][15132] Avg episode reward: [(0, '0.249')] [2024-06-21 14:36:44,403][15401] Updated weights for policy 0, policy_version 5400 (0.0043) [2024-06-21 14:36:48,067][15401] Updated weights for policy 0, policy_version 5410 (0.0032) [2024-06-21 14:36:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 88637440. Throughput: 0: 41217.4. Samples: 88790840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-21 14:36:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 14:36:52,376][15401] Updated weights for policy 0, policy_version 5420 (0.0045) [2024-06-21 14:36:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40959.9, 300 sec: 41154.7). Total num frames: 88817664. Throughput: 0: 41016.0. Samples: 88918360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:36:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 14:36:56,062][15401] Updated weights for policy 0, policy_version 5430 (0.0032) [2024-06-21 14:36:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 89047040. Throughput: 0: 41052.9. Samples: 89162980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 14:36:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 14:37:00,257][15401] Updated weights for policy 0, policy_version 5440 (0.0038) [2024-06-21 14:37:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 89243648. Throughput: 0: 41189.7. Samples: 89410460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 14:37:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 14:37:03,973][15401] Updated weights for policy 0, policy_version 5450 (0.0045) [2024-06-21 14:37:08,110][15401] Updated weights for policy 0, policy_version 5460 (0.0041) [2024-06-21 14:37:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 89473024. Throughput: 0: 41036.0. Samples: 89537640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:37:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 14:37:12,141][15401] Updated weights for policy 0, policy_version 5470 (0.0043) [2024-06-21 14:37:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 89653248. Throughput: 0: 41171.0. Samples: 89784060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:37:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 14:37:15,925][15401] Updated weights for policy 0, policy_version 5480 (0.0042) [2024-06-21 14:37:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 89866240. Throughput: 0: 41084.4. Samples: 90027500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 14:37:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 14:37:20,168][15401] Updated weights for policy 0, policy_version 5490 (0.0037) [2024-06-21 14:37:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 90062848. Throughput: 0: 41001.9. Samples: 90150260. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-21 14:37:23,390][15132] Avg episode reward: [(0, '0.100')] [2024-06-21 14:37:23,886][15401] Updated weights for policy 0, policy_version 5500 (0.0032) [2024-06-21 14:37:28,174][15401] Updated weights for policy 0, policy_version 5510 (0.0038) [2024-06-21 14:37:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 90275840. Throughput: 0: 40948.0. Samples: 90393160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 14:37:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-21 14:37:32,147][15401] Updated weights for policy 0, policy_version 5520 (0.0031) [2024-06-21 14:37:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 90488832. Throughput: 0: 41005.3. Samples: 90636080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:37:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 14:37:35,949][15401] Updated weights for policy 0, policy_version 5530 (0.0026) [2024-06-21 14:37:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41234.8, 300 sec: 41098.8). Total num frames: 90669056. Throughput: 0: 40926.3. Samples: 90760040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:37:38,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 14:37:40,052][15401] Updated weights for policy 0, policy_version 5540 (0.0034) [2024-06-21 14:37:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 90898432. Throughput: 0: 41112.1. Samples: 91013020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:37:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 14:37:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005548_90898432.pth... [2024-06-21 14:37:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000004943_80986112.pth [2024-06-21 14:37:43,968][15401] Updated weights for policy 0, policy_version 5550 (0.0031) [2024-06-21 14:37:48,065][15401] Updated weights for policy 0, policy_version 5560 (0.0041) [2024-06-21 14:37:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 91095040. Throughput: 0: 41102.8. Samples: 91260080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:37:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 14:37:51,749][15401] Updated weights for policy 0, policy_version 5570 (0.0047) [2024-06-21 14:37:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 41504.5, 300 sec: 41209.6). Total num frames: 91308032. Throughput: 0: 41053.8. Samples: 91385160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-21 14:37:53,393][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 14:37:56,186][15401] Updated weights for policy 0, policy_version 5580 (0.0035) [2024-06-21 14:37:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 41231.4, 300 sec: 41320.7). Total num frames: 91521024. Throughput: 0: 41078.3. Samples: 91632680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 14:37:58,393][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 14:37:59,669][15401] Updated weights for policy 0, policy_version 5590 (0.0045) [2024-06-21 14:38:03,389][15132] Fps is (10 sec: 39331.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 91701248. Throughput: 0: 41384.1. Samples: 91889780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:38:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 14:38:04,199][15401] Updated weights for policy 0, policy_version 5600 (0.0050) [2024-06-21 14:38:07,719][15401] Updated weights for policy 0, policy_version 5610 (0.0030) [2024-06-21 14:38:08,390][15132] Fps is (10 sec: 40970.0, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 91930624. Throughput: 0: 41248.4. Samples: 92006440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 14:38:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 14:38:12,043][15401] Updated weights for policy 0, policy_version 5620 (0.0026) [2024-06-21 14:38:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 92127232. Throughput: 0: 41395.4. Samples: 92255960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:38:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 14:38:13,577][15349] Signal inference workers to stop experience collection... (1300 times) [2024-06-21 14:38:13,578][15349] Signal inference workers to resume experience collection... (1300 times) [2024-06-21 14:38:13,615][15401] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-21 14:38:13,615][15401] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-21 14:38:15,628][15401] Updated weights for policy 0, policy_version 5630 (0.0027) [2024-06-21 14:38:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 92340224. Throughput: 0: 41600.9. Samples: 92508120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 14:38:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 14:38:20,021][15401] Updated weights for policy 0, policy_version 5640 (0.0032) [2024-06-21 14:38:23,327][15401] Updated weights for policy 0, policy_version 5650 (0.0035) [2024-06-21 14:38:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 92569600. Throughput: 0: 41637.8. Samples: 92633740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-21 14:38:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 14:38:28,132][15401] Updated weights for policy 0, policy_version 5660 (0.0041) [2024-06-21 14:38:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41210.2). Total num frames: 92733440. Throughput: 0: 41501.2. Samples: 92880580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-21 14:38:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 14:38:31,274][15401] Updated weights for policy 0, policy_version 5670 (0.0038) [2024-06-21 14:38:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 92946432. Throughput: 0: 41442.1. Samples: 93124980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-21 14:38:33,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 14:38:36,348][15401] Updated weights for policy 0, policy_version 5680 (0.0046) [2024-06-21 14:38:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 93192192. Throughput: 0: 41526.7. Samples: 93253760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:38:38,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 14:38:39,679][15401] Updated weights for policy 0, policy_version 5690 (0.0039) [2024-06-21 14:38:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 93339648. Throughput: 0: 41353.3. Samples: 93493480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:38:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 14:38:44,166][15401] Updated weights for policy 0, policy_version 5700 (0.0037) [2024-06-21 14:38:47,474][15401] Updated weights for policy 0, policy_version 5710 (0.0036) [2024-06-21 14:38:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 93585408. Throughput: 0: 41115.4. Samples: 93739980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:38:48,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-21 14:38:51,916][15401] Updated weights for policy 0, policy_version 5720 (0.0039) [2024-06-21 14:38:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 41507.8, 300 sec: 41376.5). Total num frames: 93798400. Throughput: 0: 41516.4. Samples: 93874680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 14:38:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 14:38:55,238][15401] Updated weights for policy 0, policy_version 5730 (0.0030) [2024-06-21 14:38:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40961.6, 300 sec: 41098.9). Total num frames: 93978624. Throughput: 0: 41385.9. Samples: 94118320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:38:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 14:38:59,638][15401] Updated weights for policy 0, policy_version 5740 (0.0044) [2024-06-21 14:39:03,043][15401] Updated weights for policy 0, policy_version 5750 (0.0032) [2024-06-21 14:39:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 94224384. Throughput: 0: 41329.3. Samples: 94367940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:39:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 14:39:07,431][15401] Updated weights for policy 0, policy_version 5760 (0.0032) [2024-06-21 14:39:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 94388224. Throughput: 0: 41488.8. Samples: 94500740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:39:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 14:39:10,943][15401] Updated weights for policy 0, policy_version 5770 (0.0038) [2024-06-21 14:39:13,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41504.6, 300 sec: 41154.1). Total num frames: 94617600. Throughput: 0: 41452.5. Samples: 94746040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:39:13,393][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 14:39:15,371][15401] Updated weights for policy 0, policy_version 5780 (0.0032) [2024-06-21 14:39:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 94830592. Throughput: 0: 41505.3. Samples: 94992720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:39:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 14:39:18,774][15401] Updated weights for policy 0, policy_version 5790 (0.0031) [2024-06-21 14:39:23,197][15401] Updated weights for policy 0, policy_version 5800 (0.0028) [2024-06-21 14:39:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 40960.0, 300 sec: 41210.3). Total num frames: 95027200. Throughput: 0: 41446.3. Samples: 95118840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:39:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 14:39:26,639][15401] Updated weights for policy 0, policy_version 5810 (0.0038) [2024-06-21 14:39:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 41265.5). Total num frames: 95256576. Throughput: 0: 41519.1. Samples: 95361940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:39:28,393][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 14:39:31,449][15401] Updated weights for policy 0, policy_version 5820 (0.0029) [2024-06-21 14:39:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 95453184. Throughput: 0: 41657.5. Samples: 95614560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 14:39:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 14:39:34,446][15401] Updated weights for policy 0, policy_version 5830 (0.0032) [2024-06-21 14:39:38,390][15132] Fps is (10 sec: 36053.1, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 95617024. Throughput: 0: 41371.0. Samples: 95736380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 14:39:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 14:39:39,319][15401] Updated weights for policy 0, policy_version 5840 (0.0045) [2024-06-21 14:39:40,681][15349] Signal inference workers to stop experience collection... (1350 times) [2024-06-21 14:39:40,682][15349] Signal inference workers to resume experience collection... (1350 times) [2024-06-21 14:39:40,701][15401] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-21 14:39:40,702][15401] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-21 14:39:42,276][15401] Updated weights for policy 0, policy_version 5850 (0.0037) [2024-06-21 14:39:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41321.0). Total num frames: 95879168. Throughput: 0: 41439.1. Samples: 95983080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:39:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 14:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005852_95879168.pth... [2024-06-21 14:39:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005247_85966848.pth [2024-06-21 14:39:47,109][15401] Updated weights for policy 0, policy_version 5860 (0.0037) [2024-06-21 14:39:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 96059392. Throughput: 0: 41427.6. Samples: 96232180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:39:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 14:39:50,266][15401] Updated weights for policy 0, policy_version 5870 (0.0049) [2024-06-21 14:39:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 96256000. Throughput: 0: 41096.6. Samples: 96350080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:39:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 14:39:55,141][15401] Updated weights for policy 0, policy_version 5880 (0.0035) [2024-06-21 14:39:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41321.3). Total num frames: 96485376. Throughput: 0: 41274.6. Samples: 96603300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:39:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 14:39:58,570][15401] Updated weights for policy 0, policy_version 5890 (0.0040) [2024-06-21 14:40:03,054][15401] Updated weights for policy 0, policy_version 5900 (0.0033) [2024-06-21 14:40:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 96681984. Throughput: 0: 41171.3. Samples: 96845420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-21 14:40:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 14:40:06,505][15401] Updated weights for policy 0, policy_version 5910 (0.0040) [2024-06-21 14:40:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 96911360. Throughput: 0: 41069.6. Samples: 96966980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 14:40:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 14:40:11,045][15401] Updated weights for policy 0, policy_version 5920 (0.0040) [2024-06-21 14:40:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41234.7, 300 sec: 41321.0). Total num frames: 97091584. Throughput: 0: 41278.2. Samples: 97219360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 14:40:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 14:40:14,740][15401] Updated weights for policy 0, policy_version 5930 (0.0035) [2024-06-21 14:40:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40960.0, 300 sec: 41154.7). Total num frames: 97288192. Throughput: 0: 41068.2. Samples: 97462640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 14:40:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 14:40:18,918][15401] Updated weights for policy 0, policy_version 5940 (0.0039) [2024-06-21 14:40:22,544][15401] Updated weights for policy 0, policy_version 5950 (0.0038) [2024-06-21 14:40:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 97517568. Throughput: 0: 41225.4. Samples: 97591520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:40:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 14:40:26,695][15401] Updated weights for policy 0, policy_version 5960 (0.0039) [2024-06-21 14:40:28,389][15132] Fps is (10 sec: 42599.6, 60 sec: 40961.7, 300 sec: 41321.0). Total num frames: 97714176. Throughput: 0: 41236.1. Samples: 97838700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:40:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 14:40:30,351][15401] Updated weights for policy 0, policy_version 5970 (0.0046) [2024-06-21 14:40:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 97910784. Throughput: 0: 41173.8. Samples: 98085000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 14:40:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 14:40:34,451][15401] Updated weights for policy 0, policy_version 5980 (0.0036) [2024-06-21 14:40:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 98123776. Throughput: 0: 41262.2. Samples: 98206880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:40:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 14:40:38,635][15401] Updated weights for policy 0, policy_version 5990 (0.0039) [2024-06-21 14:40:42,441][15401] Updated weights for policy 0, policy_version 6000 (0.0040) [2024-06-21 14:40:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 40958.4, 300 sec: 41265.1). Total num frames: 98336768. Throughput: 0: 41054.8. Samples: 98450860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:40:43,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 14:40:46,883][15401] Updated weights for policy 0, policy_version 6010 (0.0047) [2024-06-21 14:40:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 98533376. Throughput: 0: 41114.5. Samples: 98695580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 14:40:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 14:40:50,259][15401] Updated weights for policy 0, policy_version 6020 (0.0049) [2024-06-21 14:40:53,389][15132] Fps is (10 sec: 37692.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 98713600. Throughput: 0: 41145.4. Samples: 98818520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:40:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 14:40:54,661][15401] Updated weights for policy 0, policy_version 6030 (0.0040) [2024-06-21 14:40:58,126][15401] Updated weights for policy 0, policy_version 6040 (0.0040) [2024-06-21 14:40:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 98959360. Throughput: 0: 41073.3. Samples: 99067660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:40:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 14:41:02,479][15401] Updated weights for policy 0, policy_version 6050 (0.0049) [2024-06-21 14:41:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 99139584. Throughput: 0: 41243.8. Samples: 99318600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 14:41:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 14:41:05,991][15401] Updated weights for policy 0, policy_version 6060 (0.0030) [2024-06-21 14:41:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 40413.9, 300 sec: 41265.5). Total num frames: 99336192. Throughput: 0: 40932.9. Samples: 99433500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:41:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 14:41:10,255][15401] Updated weights for policy 0, policy_version 6070 (0.0032) [2024-06-21 14:41:11,555][15349] Signal inference workers to stop experience collection... (1400 times) [2024-06-21 14:41:11,555][15349] Signal inference workers to resume experience collection... (1400 times) [2024-06-21 14:41:11,603][15401] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-21 14:41:11,604][15401] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-21 14:41:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 99565568. Throughput: 0: 41111.1. Samples: 99688700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 14:41:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 14:41:14,176][15401] Updated weights for policy 0, policy_version 6080 (0.0044) [2024-06-21 14:41:17,975][15401] Updated weights for policy 0, policy_version 6090 (0.0042) [2024-06-21 14:41:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 99778560. Throughput: 0: 41035.1. Samples: 99931580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 14:41:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 14:41:22,346][15401] Updated weights for policy 0, policy_version 6100 (0.0030) [2024-06-21 14:41:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 99991552. Throughput: 0: 41176.9. Samples: 100059840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 14:41:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 14:41:25,762][15401] Updated weights for policy 0, policy_version 6110 (0.0039) [2024-06-21 14:41:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 100188160. Throughput: 0: 41365.8. Samples: 100312220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:41:28,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-21 14:41:28,428][15349] Saving new best policy, reward=0.884! [2024-06-21 14:41:30,140][15401] Updated weights for policy 0, policy_version 6120 (0.0045) [2024-06-21 14:41:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41376.9). Total num frames: 100401152. Throughput: 0: 41424.0. Samples: 100559660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:41:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-21 14:41:34,265][15401] Updated weights for policy 0, policy_version 6130 (0.0035) [2024-06-21 14:41:37,758][15401] Updated weights for policy 0, policy_version 6140 (0.0042) [2024-06-21 14:41:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41504.5, 300 sec: 41265.1). Total num frames: 100614144. Throughput: 0: 41486.6. Samples: 100685520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 14:41:38,393][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 14:41:41,923][15401] Updated weights for policy 0, policy_version 6150 (0.0030) [2024-06-21 14:41:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.7, 300 sec: 41265.5). Total num frames: 100810752. Throughput: 0: 41476.0. Samples: 100934080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:41:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-21 14:41:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006153_100810752.pth... [2024-06-21 14:41:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005548_90898432.pth [2024-06-21 14:41:45,679][15401] Updated weights for policy 0, policy_version 6160 (0.0030) [2024-06-21 14:41:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 101023744. Throughput: 0: 41416.4. Samples: 101182340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 14:41:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-21 14:41:50,108][15401] Updated weights for policy 0, policy_version 6170 (0.0039) [2024-06-21 14:41:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41321.0). Total num frames: 101236736. Throughput: 0: 41661.2. Samples: 101308260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-21 14:41:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 14:41:53,600][15401] Updated weights for policy 0, policy_version 6180 (0.0036) [2024-06-21 14:41:57,991][15401] Updated weights for policy 0, policy_version 6190 (0.0034) [2024-06-21 14:41:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 101416960. Throughput: 0: 41570.6. Samples: 101559380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:41:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 14:42:01,580][15401] Updated weights for policy 0, policy_version 6200 (0.0045) [2024-06-21 14:42:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 101646336. Throughput: 0: 41619.0. Samples: 101804440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:42:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 14:42:06,043][15401] Updated weights for policy 0, policy_version 6210 (0.0027) [2024-06-21 14:42:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 101842944. Throughput: 0: 41591.1. Samples: 101931440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:42:08,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-21 14:42:09,199][15401] Updated weights for policy 0, policy_version 6220 (0.0041) [2024-06-21 14:42:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 102055936. Throughput: 0: 41368.8. Samples: 102173820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 14:42:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 14:42:13,732][15401] Updated weights for policy 0, policy_version 6230 (0.0039) [2024-06-21 14:42:17,240][15401] Updated weights for policy 0, policy_version 6240 (0.0044) [2024-06-21 14:42:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 102252544. Throughput: 0: 41481.4. Samples: 102426320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 14:42:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-21 14:42:21,473][15401] Updated weights for policy 0, policy_version 6250 (0.0033) [2024-06-21 14:42:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 102481920. Throughput: 0: 41467.6. Samples: 102551460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 14:42:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 14:42:25,179][15401] Updated weights for policy 0, policy_version 6260 (0.0041) [2024-06-21 14:42:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 102694912. Throughput: 0: 41575.6. Samples: 102804980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:42:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 14:42:29,290][15401] Updated weights for policy 0, policy_version 6270 (0.0027) [2024-06-21 14:42:33,137][15401] Updated weights for policy 0, policy_version 6280 (0.0029) [2024-06-21 14:42:33,390][15132] Fps is (10 sec: 42596.1, 60 sec: 41778.9, 300 sec: 41487.5). Total num frames: 102907904. Throughput: 0: 41471.9. Samples: 103048600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 14:42:33,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 14:42:37,116][15401] Updated weights for policy 0, policy_version 6290 (0.0034) [2024-06-21 14:42:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 40961.6, 300 sec: 41265.5). Total num frames: 103071744. Throughput: 0: 41445.9. Samples: 103173320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 14:42:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 14:42:39,796][15349] Signal inference workers to stop experience collection... (1450 times) [2024-06-21 14:42:39,796][15349] Signal inference workers to resume experience collection... (1450 times) [2024-06-21 14:42:39,811][15401] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-21 14:42:39,811][15401] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-21 14:42:41,019][15401] Updated weights for policy 0, policy_version 6300 (0.0029) [2024-06-21 14:42:43,389][15132] Fps is (10 sec: 39324.1, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 103301120. Throughput: 0: 41446.7. Samples: 103424480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-21 14:42:43,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-21 14:42:44,887][15401] Updated weights for policy 0, policy_version 6310 (0.0025) [2024-06-21 14:42:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41321.3). Total num frames: 103497728. Throughput: 0: 41493.8. Samples: 103671660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 14:42:48,394][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 14:42:49,415][15401] Updated weights for policy 0, policy_version 6320 (0.0034) [2024-06-21 14:42:52,827][15401] Updated weights for policy 0, policy_version 6330 (0.0028) [2024-06-21 14:42:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.3, 300 sec: 41376.9). Total num frames: 103727104. Throughput: 0: 41354.2. Samples: 103792380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 14:42:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 14:42:57,346][15401] Updated weights for policy 0, policy_version 6340 (0.0040) [2024-06-21 14:42:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 103923712. Throughput: 0: 41505.3. Samples: 104041560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:42:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 14:43:00,655][15401] Updated weights for policy 0, policy_version 6350 (0.0039) [2024-06-21 14:43:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41321.0). Total num frames: 104120320. Throughput: 0: 41407.2. Samples: 104289640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-21 14:43:03,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-21 14:43:05,376][15401] Updated weights for policy 0, policy_version 6360 (0.0025) [2024-06-21 14:43:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41504.5, 300 sec: 41376.2). Total num frames: 104333312. Throughput: 0: 41238.7. Samples: 104407300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-21 14:43:08,393][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 14:43:08,951][15401] Updated weights for policy 0, policy_version 6370 (0.0031) [2024-06-21 14:43:13,270][15401] Updated weights for policy 0, policy_version 6380 (0.0037) [2024-06-21 14:43:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 104529920. Throughput: 0: 41288.3. Samples: 104662960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 14:43:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 14:43:16,710][15401] Updated weights for policy 0, policy_version 6390 (0.0026) [2024-06-21 14:43:18,390][15132] Fps is (10 sec: 40969.6, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 104742912. Throughput: 0: 41477.3. Samples: 104915060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:43:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 14:43:20,847][15401] Updated weights for policy 0, policy_version 6400 (0.0029) [2024-06-21 14:43:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 104955904. Throughput: 0: 41387.1. Samples: 105035740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:43:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 14:43:24,551][15401] Updated weights for policy 0, policy_version 6410 (0.0049) [2024-06-21 14:43:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 105152512. Throughput: 0: 41392.4. Samples: 105287140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:43:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 14:43:28,627][15401] Updated weights for policy 0, policy_version 6420 (0.0027) [2024-06-21 14:43:32,325][15401] Updated weights for policy 0, policy_version 6430 (0.0029) [2024-06-21 14:43:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.3, 300 sec: 41265.5). Total num frames: 105365504. Throughput: 0: 41260.4. Samples: 105528380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:43:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 14:43:36,653][15401] Updated weights for policy 0, policy_version 6440 (0.0045) [2024-06-21 14:43:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 105545728. Throughput: 0: 41273.8. Samples: 105649700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:43:38,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 14:43:40,164][15401] Updated weights for policy 0, policy_version 6450 (0.0042) [2024-06-21 14:43:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40959.8, 300 sec: 41265.5). Total num frames: 105758720. Throughput: 0: 41314.6. Samples: 105900720. Policy #0 lag: (min: 2.0, avg: 11.6, max: 20.0) [2024-06-21 14:43:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 14:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006455_105758720.pth... [2024-06-21 14:43:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000005852_95879168.pth [2024-06-21 14:43:44,506][15401] Updated weights for policy 0, policy_version 6460 (0.0037) [2024-06-21 14:43:48,373][15401] Updated weights for policy 0, policy_version 6470 (0.0039) [2024-06-21 14:43:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 41777.6, 300 sec: 41376.2). Total num frames: 106004480. Throughput: 0: 41054.7. Samples: 106137200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:43:48,393][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 14:43:52,698][15401] Updated weights for policy 0, policy_version 6480 (0.0038) [2024-06-21 14:43:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 106184704. Throughput: 0: 41445.8. Samples: 106272260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 14:43:53,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 14:43:54,872][15349] Signal inference workers to stop experience collection... (1500 times) [2024-06-21 14:43:54,872][15349] Signal inference workers to resume experience collection... (1500 times) [2024-06-21 14:43:54,924][15401] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-21 14:43:54,924][15401] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-21 14:43:56,267][15401] Updated weights for policy 0, policy_version 6490 (0.0029) [2024-06-21 14:43:58,389][15132] Fps is (10 sec: 36053.8, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 106364928. Throughput: 0: 40951.3. Samples: 106505760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 14:43:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 14:44:00,481][15401] Updated weights for policy 0, policy_version 6500 (0.0035) [2024-06-21 14:44:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 106594304. Throughput: 0: 41027.2. Samples: 106761280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 14:44:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 14:44:03,994][15401] Updated weights for policy 0, policy_version 6510 (0.0024) [2024-06-21 14:44:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41234.7, 300 sec: 41321.3). Total num frames: 106807296. Throughput: 0: 41140.9. Samples: 106887080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-21 14:44:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 14:44:08,583][15401] Updated weights for policy 0, policy_version 6520 (0.0028) [2024-06-21 14:44:11,915][15401] Updated weights for policy 0, policy_version 6530 (0.0026) [2024-06-21 14:44:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 107020288. Throughput: 0: 40867.9. Samples: 107126200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:44:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 14:44:16,733][15401] Updated weights for policy 0, policy_version 6540 (0.0032) [2024-06-21 14:44:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 107200512. Throughput: 0: 41203.6. Samples: 107382540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 14:44:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 14:44:19,924][15401] Updated weights for policy 0, policy_version 6550 (0.0044) [2024-06-21 14:44:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41265.8). Total num frames: 107429888. Throughput: 0: 41219.1. Samples: 107504560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:44:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 14:44:24,633][15401] Updated weights for policy 0, policy_version 6560 (0.0039) [2024-06-21 14:44:28,287][15401] Updated weights for policy 0, policy_version 6570 (0.0033) [2024-06-21 14:44:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 107642880. Throughput: 0: 41168.2. Samples: 107753280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 14:44:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 14:44:32,641][15401] Updated weights for policy 0, policy_version 6580 (0.0049) [2024-06-21 14:44:33,390][15132] Fps is (10 sec: 39320.8, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 107823104. Throughput: 0: 41458.9. Samples: 108002760. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 14:44:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 14:44:36,152][15401] Updated weights for policy 0, policy_version 6590 (0.0037) [2024-06-21 14:44:38,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 108019712. Throughput: 0: 41022.6. Samples: 108118280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 14:44:38,394][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 14:44:40,684][15401] Updated weights for policy 0, policy_version 6600 (0.0039) [2024-06-21 14:44:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 108265472. Throughput: 0: 41438.6. Samples: 108370500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 14:44:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 14:44:43,844][15401] Updated weights for policy 0, policy_version 6610 (0.0043) [2024-06-21 14:44:48,395][15132] Fps is (10 sec: 42575.3, 60 sec: 40684.8, 300 sec: 41320.2). Total num frames: 108445696. Throughput: 0: 41320.2. Samples: 108620920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 14:44:48,395][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 14:44:48,515][15401] Updated weights for policy 0, policy_version 6620 (0.0035) [2024-06-21 14:44:51,751][15401] Updated weights for policy 0, policy_version 6630 (0.0039) [2024-06-21 14:44:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 108675072. Throughput: 0: 41177.0. Samples: 108740040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 14:44:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 14:44:56,424][15401] Updated weights for policy 0, policy_version 6640 (0.0036) [2024-06-21 14:44:58,389][15132] Fps is (10 sec: 40983.1, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 108855296. Throughput: 0: 41518.4. Samples: 108994520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 14:44:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 14:44:59,875][15401] Updated weights for policy 0, policy_version 6650 (0.0034) [2024-06-21 14:45:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 109068288. Throughput: 0: 41275.6. Samples: 109239940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:45:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 14:45:04,270][15401] Updated weights for policy 0, policy_version 6660 (0.0043) [2024-06-21 14:45:07,754][15401] Updated weights for policy 0, policy_version 6670 (0.0028) [2024-06-21 14:45:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 109297664. Throughput: 0: 41372.9. Samples: 109366340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 14:45:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 14:45:12,026][15401] Updated weights for policy 0, policy_version 6680 (0.0029) [2024-06-21 14:45:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 109461504. Throughput: 0: 41199.5. Samples: 109607260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 14:45:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 14:45:13,978][15349] Signal inference workers to stop experience collection... (1550 times) [2024-06-21 14:45:13,979][15349] Signal inference workers to resume experience collection... (1550 times) [2024-06-21 14:45:13,995][15401] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-21 14:45:13,995][15401] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-21 14:45:15,782][15401] Updated weights for policy 0, policy_version 6690 (0.0029) [2024-06-21 14:45:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 109690880. Throughput: 0: 41266.4. Samples: 109859740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 14:45:18,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-21 14:45:20,384][15401] Updated weights for policy 0, policy_version 6700 (0.0037) [2024-06-21 14:45:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 109903872. Throughput: 0: 41535.2. Samples: 109987360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 14:45:23,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 14:45:23,656][15401] Updated weights for policy 0, policy_version 6710 (0.0036) [2024-06-21 14:45:28,354][15401] Updated weights for policy 0, policy_version 6720 (0.0037) [2024-06-21 14:45:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41321.0). Total num frames: 110100480. Throughput: 0: 41444.8. Samples: 110235520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 19.0) [2024-06-21 14:45:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 14:45:31,534][15401] Updated weights for policy 0, policy_version 6730 (0.0038) [2024-06-21 14:45:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 110313472. Throughput: 0: 41258.8. Samples: 110477340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 14:45:33,394][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 14:45:36,391][15401] Updated weights for policy 0, policy_version 6740 (0.0041) [2024-06-21 14:45:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 41321.3). Total num frames: 110526464. Throughput: 0: 41447.5. Samples: 110605180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:45:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 14:45:39,711][15401] Updated weights for policy 0, policy_version 6750 (0.0061) [2024-06-21 14:45:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 40413.9, 300 sec: 41209.9). Total num frames: 110690304. Throughput: 0: 41034.9. Samples: 110841100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:45:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 14:45:43,611][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006758_110723072.pth... [2024-06-21 14:45:43,664][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006153_100810752.pth [2024-06-21 14:45:44,321][15401] Updated weights for policy 0, policy_version 6760 (0.0038) [2024-06-21 14:45:47,521][15401] Updated weights for policy 0, policy_version 6770 (0.0035) [2024-06-21 14:45:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41783.1, 300 sec: 41487.6). Total num frames: 110952448. Throughput: 0: 41052.6. Samples: 111087300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 14:45:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 14:45:52,146][15401] Updated weights for policy 0, policy_version 6780 (0.0043) [2024-06-21 14:45:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 40686.8, 300 sec: 41209.9). Total num frames: 111116288. Throughput: 0: 41209.2. Samples: 111220760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 14:45:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-21 14:45:55,836][15401] Updated weights for policy 0, policy_version 6790 (0.0039) [2024-06-21 14:45:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 111345664. Throughput: 0: 41228.4. Samples: 111462540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-21 14:45:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 14:46:00,063][15401] Updated weights for policy 0, policy_version 6800 (0.0043) [2024-06-21 14:46:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 111542272. Throughput: 0: 41156.3. Samples: 111711780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 14:46:03,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 14:46:03,838][15401] Updated weights for policy 0, policy_version 6810 (0.0036) [2024-06-21 14:46:08,151][15401] Updated weights for policy 0, policy_version 6820 (0.0031) [2024-06-21 14:46:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 40958.4, 300 sec: 41320.7). Total num frames: 111755264. Throughput: 0: 41028.9. Samples: 111833760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-21 14:46:08,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 14:46:11,792][15401] Updated weights for policy 0, policy_version 6830 (0.0041) [2024-06-21 14:46:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41376.5). Total num frames: 111984640. Throughput: 0: 40885.8. Samples: 112075380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-21 14:46:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 14:46:16,001][15401] Updated weights for policy 0, policy_version 6840 (0.0044) [2024-06-21 14:46:18,390][15132] Fps is (10 sec: 39330.6, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 112148480. Throughput: 0: 41079.5. Samples: 112325920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:46:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 14:46:19,735][15401] Updated weights for policy 0, policy_version 6850 (0.0047) [2024-06-21 14:46:23,389][15132] Fps is (10 sec: 36044.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 112345088. Throughput: 0: 40765.3. Samples: 112439620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 14:46:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 14:46:23,483][15349] Signal inference workers to stop experience collection... (1600 times) [2024-06-21 14:46:23,486][15349] Signal inference workers to resume experience collection... (1600 times) [2024-06-21 14:46:23,495][15401] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-21 14:46:23,531][15401] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-21 14:46:23,883][15401] Updated weights for policy 0, policy_version 6860 (0.0035) [2024-06-21 14:46:27,715][15401] Updated weights for policy 0, policy_version 6870 (0.0032) [2024-06-21 14:46:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 112607232. Throughput: 0: 41339.1. Samples: 112701360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-21 14:46:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 14:46:32,042][15401] Updated weights for policy 0, policy_version 6880 (0.0033) [2024-06-21 14:46:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 40960.0, 300 sec: 41210.2). Total num frames: 112771072. Throughput: 0: 41313.1. Samples: 112946400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-21 14:46:33,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-21 14:46:35,742][15401] Updated weights for policy 0, policy_version 6890 (0.0055) [2024-06-21 14:46:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 112984064. Throughput: 0: 40899.5. Samples: 113061240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 14:46:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 14:46:39,840][15401] Updated weights for policy 0, policy_version 6900 (0.0039) [2024-06-21 14:46:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 113197056. Throughput: 0: 41256.5. Samples: 113319080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:46:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 14:46:43,485][15401] Updated weights for policy 0, policy_version 6910 (0.0035) [2024-06-21 14:46:47,665][15401] Updated weights for policy 0, policy_version 6920 (0.0038) [2024-06-21 14:46:48,390][15132] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 113393664. Throughput: 0: 41072.1. Samples: 113560020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:46:48,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 14:46:51,627][15401] Updated weights for policy 0, policy_version 6930 (0.0038) [2024-06-21 14:46:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 113623040. Throughput: 0: 41176.4. Samples: 113686600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:46:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 14:46:55,913][15401] Updated weights for policy 0, policy_version 6940 (0.0044) [2024-06-21 14:46:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 113786880. Throughput: 0: 41251.2. Samples: 113931680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 14:46:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 14:46:59,516][15401] Updated weights for policy 0, policy_version 6950 (0.0038) [2024-06-21 14:47:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 114016256. Throughput: 0: 41272.9. Samples: 114183200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:47:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 14:47:03,934][15401] Updated weights for policy 0, policy_version 6960 (0.0035) [2024-06-21 14:47:07,622][15401] Updated weights for policy 0, policy_version 6970 (0.0042) [2024-06-21 14:47:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 40961.6, 300 sec: 41209.9). Total num frames: 114212864. Throughput: 0: 41491.5. Samples: 114306740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 14:47:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 14:47:11,863][15401] Updated weights for policy 0, policy_version 6980 (0.0034) [2024-06-21 14:47:13,392][15132] Fps is (10 sec: 40950.3, 60 sec: 40685.3, 300 sec: 41265.1). Total num frames: 114425856. Throughput: 0: 41080.1. Samples: 114550060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 14:47:13,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 14:47:15,490][15401] Updated weights for policy 0, policy_version 6990 (0.0023) [2024-06-21 14:47:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 114655232. Throughput: 0: 41052.0. Samples: 114793740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 14:47:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 14:47:19,717][15401] Updated weights for policy 0, policy_version 7000 (0.0036) [2024-06-21 14:47:23,390][15132] Fps is (10 sec: 39330.8, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 114819072. Throughput: 0: 41323.6. Samples: 114920800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 14:47:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 14:47:23,587][15401] Updated weights for policy 0, policy_version 7010 (0.0035) [2024-06-21 14:47:27,558][15401] Updated weights for policy 0, policy_version 7020 (0.0048) [2024-06-21 14:47:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41154.5). Total num frames: 115048448. Throughput: 0: 41061.7. Samples: 115166860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 14:47:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 14:47:31,649][15401] Updated weights for policy 0, policy_version 7030 (0.0035) [2024-06-21 14:47:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 115261440. Throughput: 0: 41163.6. Samples: 115412380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-21 14:47:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 14:47:35,388][15401] Updated weights for policy 0, policy_version 7040 (0.0045) [2024-06-21 14:47:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 115425280. Throughput: 0: 41114.1. Samples: 115536740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-21 14:47:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 14:47:39,689][15401] Updated weights for policy 0, policy_version 7050 (0.0039) [2024-06-21 14:47:43,265][15401] Updated weights for policy 0, policy_version 7060 (0.0034) [2024-06-21 14:47:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 115671040. Throughput: 0: 41252.8. Samples: 115788060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 14:47:43,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-21 14:47:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007061_115687424.pth... [2024-06-21 14:47:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006455_105758720.pth [2024-06-21 14:47:47,486][15401] Updated weights for policy 0, policy_version 7070 (0.0031) [2024-06-21 14:47:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 115867648. Throughput: 0: 41260.9. Samples: 116039940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-21 14:47:48,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-21 14:47:49,637][15349] Signal inference workers to stop experience collection... (1650 times) [2024-06-21 14:47:49,662][15401] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-21 14:47:49,753][15349] Signal inference workers to resume experience collection... (1650 times) [2024-06-21 14:47:49,754][15401] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-21 14:47:50,995][15401] Updated weights for policy 0, policy_version 7080 (0.0044) [2024-06-21 14:47:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 116080640. Throughput: 0: 41228.1. Samples: 116162000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-21 14:47:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 14:47:55,221][15401] Updated weights for policy 0, policy_version 7090 (0.0042) [2024-06-21 14:47:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 116293632. Throughput: 0: 41275.1. Samples: 116407340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-21 14:47:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 14:47:59,296][15401] Updated weights for policy 0, policy_version 7100 (0.0031) [2024-06-21 14:48:03,037][15401] Updated weights for policy 0, policy_version 7110 (0.0035) [2024-06-21 14:48:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41265.8). Total num frames: 116506624. Throughput: 0: 41440.5. Samples: 116658560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 14:48:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 14:48:07,078][15401] Updated weights for policy 0, policy_version 7120 (0.0030) [2024-06-21 14:48:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41504.5, 300 sec: 41265.1). Total num frames: 116703232. Throughput: 0: 41286.3. Samples: 116778780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:48:08,393][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 14:48:10,977][15401] Updated weights for policy 0, policy_version 7130 (0.0038) [2024-06-21 14:48:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41507.8, 300 sec: 41265.5). Total num frames: 116916224. Throughput: 0: 41457.8. Samples: 117032460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:48:13,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 14:48:14,961][15401] Updated weights for policy 0, policy_version 7140 (0.0040) [2024-06-21 14:48:18,390][15132] Fps is (10 sec: 40969.4, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 117112832. Throughput: 0: 41453.2. Samples: 117277780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-21 14:48:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 14:48:18,766][15401] Updated weights for policy 0, policy_version 7150 (0.0044) [2024-06-21 14:48:22,782][15401] Updated weights for policy 0, policy_version 7160 (0.0033) [2024-06-21 14:48:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 117325824. Throughput: 0: 41355.2. Samples: 117397720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-21 14:48:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 14:48:26,582][15401] Updated weights for policy 0, policy_version 7170 (0.0029) [2024-06-21 14:48:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 117506048. Throughput: 0: 41423.6. Samples: 117652120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 14:48:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-21 14:48:30,667][15401] Updated weights for policy 0, policy_version 7180 (0.0041) [2024-06-21 14:48:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 117735424. Throughput: 0: 41263.1. Samples: 117896780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:48:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 14:48:34,918][15401] Updated weights for policy 0, policy_version 7190 (0.0042) [2024-06-21 14:48:38,315][15401] Updated weights for policy 0, policy_version 7200 (0.0029) [2024-06-21 14:48:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 41376.6). Total num frames: 117964800. Throughput: 0: 41471.2. Samples: 118028200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 14:48:38,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 14:48:42,699][15401] Updated weights for policy 0, policy_version 7210 (0.0047) [2024-06-21 14:48:43,396][15132] Fps is (10 sec: 40933.7, 60 sec: 41228.7, 300 sec: 41153.8). Total num frames: 118145024. Throughput: 0: 41523.9. Samples: 118276180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 14:48:43,397][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 14:48:46,127][15401] Updated weights for policy 0, policy_version 7220 (0.0035) [2024-06-21 14:48:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 118341632. Throughput: 0: 41475.2. Samples: 118524940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 14:48:48,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-21 14:48:50,659][15401] Updated weights for policy 0, policy_version 7230 (0.0040) [2024-06-21 14:48:53,389][15132] Fps is (10 sec: 40986.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 118554624. Throughput: 0: 41566.7. Samples: 118649180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-21 14:48:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 14:48:54,311][15401] Updated weights for policy 0, policy_version 7240 (0.0040) [2024-06-21 14:48:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 118751232. Throughput: 0: 41334.7. Samples: 118892520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:48:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 14:48:58,611][15401] Updated weights for policy 0, policy_version 7250 (0.0045) [2024-06-21 14:49:02,517][15401] Updated weights for policy 0, policy_version 7260 (0.0034) [2024-06-21 14:49:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 118996992. Throughput: 0: 41340.6. Samples: 119138100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:49:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 14:49:06,795][15401] Updated weights for policy 0, policy_version 7270 (0.0044) [2024-06-21 14:49:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40961.6, 300 sec: 41154.4). Total num frames: 119160832. Throughput: 0: 41538.6. Samples: 119266960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 14:49:08,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-21 14:49:10,175][15401] Updated weights for policy 0, policy_version 7280 (0.0041) [2024-06-21 14:49:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 119390208. Throughput: 0: 41357.7. Samples: 119513220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 14:49:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 14:49:14,402][15349] Signal inference workers to stop experience collection... (1700 times) [2024-06-21 14:49:14,403][15349] Signal inference workers to resume experience collection... (1700 times) [2024-06-21 14:49:14,421][15401] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-21 14:49:14,422][15401] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-21 14:49:14,557][15401] Updated weights for policy 0, policy_version 7290 (0.0035) [2024-06-21 14:49:17,985][15401] Updated weights for policy 0, policy_version 7300 (0.0036) [2024-06-21 14:49:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 119603200. Throughput: 0: 41388.4. Samples: 119759260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 14:49:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 14:49:22,368][15401] Updated weights for policy 0, policy_version 7310 (0.0025) [2024-06-21 14:49:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 119799808. Throughput: 0: 41371.5. Samples: 119889920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 14:49:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 14:49:26,186][15401] Updated weights for policy 0, policy_version 7320 (0.0043) [2024-06-21 14:49:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41376.6). Total num frames: 120029184. Throughput: 0: 41232.5. Samples: 120131380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:49:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 14:49:30,247][15401] Updated weights for policy 0, policy_version 7330 (0.0033) [2024-06-21 14:49:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 120225792. Throughput: 0: 41375.0. Samples: 120386820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:49:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 14:49:34,067][15401] Updated weights for policy 0, policy_version 7340 (0.0033) [2024-06-21 14:49:38,121][15401] Updated weights for policy 0, policy_version 7350 (0.0049) [2024-06-21 14:49:38,392][15132] Fps is (10 sec: 40951.5, 60 sec: 41231.6, 300 sec: 41265.2). Total num frames: 120438784. Throughput: 0: 41220.7. Samples: 120504200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 14:49:38,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 14:49:41,808][15401] Updated weights for policy 0, policy_version 7360 (0.0047) [2024-06-21 14:49:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41510.6, 300 sec: 41321.8). Total num frames: 120635392. Throughput: 0: 41476.4. Samples: 120758960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 14:49:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 14:49:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007364_120651776.pth... [2024-06-21 14:49:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000006758_110723072.pth [2024-06-21 14:49:45,960][15401] Updated weights for policy 0, policy_version 7370 (0.0046) [2024-06-21 14:49:48,389][15132] Fps is (10 sec: 37691.2, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 120815616. Throughput: 0: 41516.9. Samples: 121006360. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-21 14:49:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-21 14:49:49,833][15401] Updated weights for policy 0, policy_version 7380 (0.0049) [2024-06-21 14:49:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 121044992. Throughput: 0: 41288.9. Samples: 121124960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:49:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 14:49:53,736][15401] Updated weights for policy 0, policy_version 7390 (0.0032) [2024-06-21 14:49:57,801][15401] Updated weights for policy 0, policy_version 7400 (0.0034) [2024-06-21 14:49:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 121274368. Throughput: 0: 41474.7. Samples: 121379580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 14:49:58,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 14:50:02,075][15401] Updated weights for policy 0, policy_version 7410 (0.0030) [2024-06-21 14:50:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 121454592. Throughput: 0: 41469.7. Samples: 121625400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 14:50:03,404][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 14:50:05,657][15401] Updated weights for policy 0, policy_version 7420 (0.0031) [2024-06-21 14:50:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 121667584. Throughput: 0: 41183.5. Samples: 121743180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 14:50:08,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-21 14:50:09,855][15401] Updated weights for policy 0, policy_version 7430 (0.0034) [2024-06-21 14:50:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 121864192. Throughput: 0: 41486.8. Samples: 121998280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:50:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-21 14:50:13,632][15401] Updated weights for policy 0, policy_version 7440 (0.0039) [2024-06-21 14:50:17,651][15401] Updated weights for policy 0, policy_version 7450 (0.0021) [2024-06-21 14:50:18,396][15132] Fps is (10 sec: 42571.7, 60 sec: 41501.7, 300 sec: 41320.1). Total num frames: 122093568. Throughput: 0: 41309.8. Samples: 122246020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:50:18,396][15132] Avg episode reward: [(0, '0.809')] [2024-06-21 14:50:21,474][15401] Updated weights for policy 0, policy_version 7460 (0.0037) [2024-06-21 14:50:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 122306560. Throughput: 0: 41444.1. Samples: 122369100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:50:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 14:50:25,844][15401] Updated weights for policy 0, policy_version 7470 (0.0044) [2024-06-21 14:50:28,390][15132] Fps is (10 sec: 39346.7, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 122486784. Throughput: 0: 41337.8. Samples: 122619160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 14:50:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 14:50:28,899][15349] Signal inference workers to stop experience collection... (1750 times) [2024-06-21 14:50:28,952][15401] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-21 14:50:29,013][15349] Signal inference workers to resume experience collection... (1750 times) [2024-06-21 14:50:29,013][15401] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-21 14:50:29,340][15401] Updated weights for policy 0, policy_version 7480 (0.0027) [2024-06-21 14:50:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 122699776. Throughput: 0: 41309.4. Samples: 122865280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-21 14:50:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 14:50:33,525][15401] Updated weights for policy 0, policy_version 7490 (0.0051) [2024-06-21 14:50:37,327][15401] Updated weights for policy 0, policy_version 7500 (0.0035) [2024-06-21 14:50:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41234.5, 300 sec: 41432.1). Total num frames: 122912768. Throughput: 0: 41338.7. Samples: 122985200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-21 14:50:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 14:50:41,370][15401] Updated weights for policy 0, policy_version 7510 (0.0039) [2024-06-21 14:50:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 123092992. Throughput: 0: 41169.3. Samples: 123232200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 14:50:43,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 14:50:45,385][15401] Updated weights for policy 0, policy_version 7520 (0.0038) [2024-06-21 14:50:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 123305984. Throughput: 0: 41165.9. Samples: 123477860. Policy #0 lag: (min: 2.0, avg: 12.3, max: 24.0) [2024-06-21 14:50:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 14:50:49,314][15401] Updated weights for policy 0, policy_version 7530 (0.0041) [2024-06-21 14:50:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 123518976. Throughput: 0: 41306.7. Samples: 123601980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 14:50:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 14:50:53,742][15401] Updated weights for policy 0, policy_version 7540 (0.0041) [2024-06-21 14:50:57,430][15401] Updated weights for policy 0, policy_version 7550 (0.0033) [2024-06-21 14:50:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 40958.4, 300 sec: 41320.7). Total num frames: 123731968. Throughput: 0: 40984.4. Samples: 123842680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 14:50:58,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 14:51:01,455][15401] Updated weights for policy 0, policy_version 7560 (0.0038) [2024-06-21 14:51:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 123944960. Throughput: 0: 40998.2. Samples: 124090680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 14:51:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 14:51:05,429][15401] Updated weights for policy 0, policy_version 7570 (0.0041) [2024-06-21 14:51:08,389][15132] Fps is (10 sec: 39331.0, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 124125184. Throughput: 0: 40996.5. Samples: 124213940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:51:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 14:51:09,286][15401] Updated weights for policy 0, policy_version 7580 (0.0046) [2024-06-21 14:51:13,292][15401] Updated weights for policy 0, policy_version 7590 (0.0033) [2024-06-21 14:51:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41376.6). Total num frames: 124354560. Throughput: 0: 41053.8. Samples: 124466580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 14:51:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 14:51:17,073][15401] Updated weights for policy 0, policy_version 7600 (0.0030) [2024-06-21 14:51:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41237.4, 300 sec: 41432.1). Total num frames: 124567552. Throughput: 0: 41007.0. Samples: 124710600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 14:51:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 14:51:21,175][15401] Updated weights for policy 0, policy_version 7610 (0.0038) [2024-06-21 14:51:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40413.9, 300 sec: 41098.9). Total num frames: 124731392. Throughput: 0: 41100.0. Samples: 124834700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 14:51:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 14:51:25,239][15401] Updated weights for policy 0, policy_version 7620 (0.0041) [2024-06-21 14:51:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 124960768. Throughput: 0: 41156.1. Samples: 125084220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 14:51:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 14:51:29,321][15401] Updated weights for policy 0, policy_version 7630 (0.0037) [2024-06-21 14:51:33,107][15401] Updated weights for policy 0, policy_version 7640 (0.0044) [2024-06-21 14:51:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 125173760. Throughput: 0: 41110.1. Samples: 125327820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 14:51:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 14:51:37,609][15401] Updated weights for policy 0, policy_version 7650 (0.0046) [2024-06-21 14:51:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 125353984. Throughput: 0: 41219.1. Samples: 125456840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:51:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 14:51:41,189][15401] Updated weights for policy 0, policy_version 7660 (0.0047) [2024-06-21 14:51:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 125583360. Throughput: 0: 41266.2. Samples: 125699560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 14:51:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 14:51:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007666_125599744.pth... [2024-06-21 14:51:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007061_115687424.pth [2024-06-21 14:51:45,470][15401] Updated weights for policy 0, policy_version 7670 (0.0037) [2024-06-21 14:51:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 125796352. Throughput: 0: 41296.5. Samples: 125949020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:51:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 14:51:49,061][15401] Updated weights for policy 0, policy_version 7680 (0.0033) [2024-06-21 14:51:53,142][15401] Updated weights for policy 0, policy_version 7690 (0.0029) [2024-06-21 14:51:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 125992960. Throughput: 0: 41282.1. Samples: 126071640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:51:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 14:51:56,834][15401] Updated weights for policy 0, policy_version 7700 (0.0035) [2024-06-21 14:51:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41780.9, 300 sec: 41432.1). Total num frames: 126238720. Throughput: 0: 41249.0. Samples: 126322780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 14:51:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 14:52:00,814][15401] Updated weights for policy 0, policy_version 7710 (0.0028) [2024-06-21 14:52:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 126386176. Throughput: 0: 41560.4. Samples: 126580820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 14:52:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-21 14:52:04,608][15401] Updated weights for policy 0, policy_version 7720 (0.0046) [2024-06-21 14:52:05,736][15349] Signal inference workers to stop experience collection... (1800 times) [2024-06-21 14:52:05,737][15349] Signal inference workers to resume experience collection... (1800 times) [2024-06-21 14:52:05,758][15401] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-21 14:52:05,758][15401] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-21 14:52:08,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41321.3). Total num frames: 126615552. Throughput: 0: 41335.1. Samples: 126694780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-21 14:52:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-21 14:52:08,817][15401] Updated weights for policy 0, policy_version 7730 (0.0033) [2024-06-21 14:52:12,424][15401] Updated weights for policy 0, policy_version 7740 (0.0054) [2024-06-21 14:52:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 126828544. Throughput: 0: 41354.6. Samples: 126945180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-06-21 14:52:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 14:52:16,909][15401] Updated weights for policy 0, policy_version 7750 (0.0035) [2024-06-21 14:52:18,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40414.0, 300 sec: 41265.5). Total num frames: 126992384. Throughput: 0: 41538.4. Samples: 127197040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-21 14:52:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 14:52:20,427][15401] Updated weights for policy 0, policy_version 7760 (0.0036) [2024-06-21 14:52:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 127238144. Throughput: 0: 41213.0. Samples: 127311420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 14:52:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 14:52:24,956][15401] Updated weights for policy 0, policy_version 7770 (0.0045) [2024-06-21 14:52:28,219][15401] Updated weights for policy 0, policy_version 7780 (0.0038) [2024-06-21 14:52:28,390][15132] Fps is (10 sec: 47512.8, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 127467520. Throughput: 0: 41519.0. Samples: 127567920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-21 14:52:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 14:52:32,873][15401] Updated weights for policy 0, policy_version 7790 (0.0034) [2024-06-21 14:52:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 127631360. Throughput: 0: 41407.0. Samples: 127812340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-21 14:52:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 14:52:36,343][15401] Updated weights for policy 0, policy_version 7800 (0.0035) [2024-06-21 14:52:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 127860736. Throughput: 0: 41403.3. Samples: 127934780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 14:52:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 14:52:40,978][15401] Updated weights for policy 0, policy_version 7810 (0.0034) [2024-06-21 14:52:43,392][15132] Fps is (10 sec: 44226.5, 60 sec: 41504.5, 300 sec: 41376.2). Total num frames: 128073728. Throughput: 0: 41421.3. Samples: 128186840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:52:43,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 14:52:44,394][15401] Updated weights for policy 0, policy_version 7820 (0.0037) [2024-06-21 14:52:48,396][15132] Fps is (10 sec: 39296.2, 60 sec: 40955.6, 300 sec: 41264.6). Total num frames: 128253952. Throughput: 0: 41103.1. Samples: 128430720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:52:48,397][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 14:52:48,923][15401] Updated weights for policy 0, policy_version 7830 (0.0038) [2024-06-21 14:52:52,324][15401] Updated weights for policy 0, policy_version 7840 (0.0040) [2024-06-21 14:52:53,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 128466944. Throughput: 0: 41236.4. Samples: 128550420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 14:52:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 14:52:56,741][15401] Updated weights for policy 0, policy_version 7850 (0.0038) [2024-06-21 14:52:58,390][15132] Fps is (10 sec: 40986.0, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 128663552. Throughput: 0: 41230.3. Samples: 128800540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:52:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 14:53:00,310][15401] Updated weights for policy 0, policy_version 7860 (0.0040) [2024-06-21 14:53:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41265.8). Total num frames: 128876544. Throughput: 0: 41084.0. Samples: 129045820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 14:53:03,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 14:53:04,710][15401] Updated weights for policy 0, policy_version 7870 (0.0035) [2024-06-21 14:53:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 129105920. Throughput: 0: 41370.3. Samples: 129173080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 14:53:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 14:53:08,399][15401] Updated weights for policy 0, policy_version 7880 (0.0045) [2024-06-21 14:53:12,457][15401] Updated weights for policy 0, policy_version 7890 (0.0045) [2024-06-21 14:53:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 129269760. Throughput: 0: 41041.0. Samples: 129414760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 14:53:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 14:53:16,316][15401] Updated weights for policy 0, policy_version 7900 (0.0033) [2024-06-21 14:53:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 129515520. Throughput: 0: 41061.8. Samples: 129660120. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-21 14:53:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 14:53:20,269][15401] Updated weights for policy 0, policy_version 7910 (0.0033) [2024-06-21 14:53:22,999][15349] Signal inference workers to stop experience collection... (1850 times) [2024-06-21 14:53:23,004][15349] Signal inference workers to resume experience collection... (1850 times) [2024-06-21 14:53:23,020][15401] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-21 14:53:23,051][15401] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-21 14:53:23,396][15132] Fps is (10 sec: 44208.5, 60 sec: 41228.7, 300 sec: 41375.6). Total num frames: 129712128. Throughput: 0: 41197.2. Samples: 129788920. Policy #0 lag: (min: 2.0, avg: 9.9, max: 24.0) [2024-06-21 14:53:23,397][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 14:53:24,311][15401] Updated weights for policy 0, policy_version 7920 (0.0028) [2024-06-21 14:53:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 129908736. Throughput: 0: 41007.9. Samples: 130032100. Policy #0 lag: (min: 2.0, avg: 9.9, max: 24.0) [2024-06-21 14:53:28,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-21 14:53:28,699][15401] Updated weights for policy 0, policy_version 7930 (0.0030) [2024-06-21 14:53:32,298][15401] Updated weights for policy 0, policy_version 7940 (0.0033) [2024-06-21 14:53:33,389][15132] Fps is (10 sec: 42625.6, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 130138112. Throughput: 0: 41008.9. Samples: 130275860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:53:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 14:53:36,648][15401] Updated weights for policy 0, policy_version 7950 (0.0040) [2024-06-21 14:53:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40686.9, 300 sec: 41210.8). Total num frames: 130301952. Throughput: 0: 41192.5. Samples: 130404080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 14:53:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 14:53:40,668][15401] Updated weights for policy 0, policy_version 7960 (0.0033) [2024-06-21 14:53:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41234.7, 300 sec: 41376.5). Total num frames: 130547712. Throughput: 0: 40984.8. Samples: 130644860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 14:53:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 14:53:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007968_130547712.pth... [2024-06-21 14:53:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007364_120651776.pth [2024-06-21 14:53:44,536][15401] Updated weights for policy 0, policy_version 7970 (0.0026) [2024-06-21 14:53:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41237.5, 300 sec: 41265.5). Total num frames: 130727936. Throughput: 0: 41268.0. Samples: 130902880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 14:53:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 14:53:48,551][15401] Updated weights for policy 0, policy_version 7980 (0.0037) [2024-06-21 14:53:52,387][15401] Updated weights for policy 0, policy_version 7990 (0.0042) [2024-06-21 14:53:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 130940928. Throughput: 0: 40946.6. Samples: 131015680. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 14:53:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-21 14:53:56,398][15401] Updated weights for policy 0, policy_version 8000 (0.0033) [2024-06-21 14:53:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 41504.5, 300 sec: 41209.6). Total num frames: 131153920. Throughput: 0: 41245.3. Samples: 131270900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:53:58,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 14:54:00,279][15401] Updated weights for policy 0, policy_version 8010 (0.0034) [2024-06-21 14:54:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 131334144. Throughput: 0: 41315.0. Samples: 131519300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 14:54:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 14:54:04,453][15401] Updated weights for policy 0, policy_version 8020 (0.0040) [2024-06-21 14:54:08,337][15401] Updated weights for policy 0, policy_version 8030 (0.0046) [2024-06-21 14:54:08,392][15132] Fps is (10 sec: 40959.9, 60 sec: 40958.3, 300 sec: 41265.1). Total num frames: 131563520. Throughput: 0: 40944.0. Samples: 131631240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 14:54:08,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 14:54:12,652][15401] Updated weights for policy 0, policy_version 8040 (0.0041) [2024-06-21 14:54:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 131760128. Throughput: 0: 41197.0. Samples: 131885960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:54:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 14:54:16,143][15401] Updated weights for policy 0, policy_version 8050 (0.0030) [2024-06-21 14:54:18,389][15132] Fps is (10 sec: 37692.4, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 131940352. Throughput: 0: 41172.5. Samples: 132128620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 14:54:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 14:54:20,737][15401] Updated weights for policy 0, policy_version 8060 (0.0036) [2024-06-21 14:54:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 41235.8, 300 sec: 41209.6). Total num frames: 132186112. Throughput: 0: 41004.5. Samples: 132249380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-21 14:54:23,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 14:54:23,944][15401] Updated weights for policy 0, policy_version 8070 (0.0029) [2024-06-21 14:54:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 132349952. Throughput: 0: 41210.3. Samples: 132499320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 14:54:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 14:54:28,466][15349] Signal inference workers to stop experience collection... (1900 times) [2024-06-21 14:54:28,518][15401] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-21 14:54:28,525][15349] Signal inference workers to resume experience collection... (1900 times) [2024-06-21 14:54:28,541][15401] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-21 14:54:28,666][15401] Updated weights for policy 0, policy_version 8080 (0.0038) [2024-06-21 14:54:31,986][15401] Updated weights for policy 0, policy_version 8090 (0.0044) [2024-06-21 14:54:33,392][15132] Fps is (10 sec: 37683.1, 60 sec: 40412.2, 300 sec: 41098.8). Total num frames: 132562944. Throughput: 0: 40900.0. Samples: 132743480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 14:54:33,393][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 14:54:36,664][15401] Updated weights for policy 0, policy_version 8100 (0.0037) [2024-06-21 14:54:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 132808704. Throughput: 0: 41326.6. Samples: 132875380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 14:54:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 14:54:39,784][15401] Updated weights for policy 0, policy_version 8110 (0.0033) [2024-06-21 14:54:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 40413.9, 300 sec: 41209.9). Total num frames: 132972544. Throughput: 0: 40856.8. Samples: 133109360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:54:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 14:54:44,880][15401] Updated weights for policy 0, policy_version 8120 (0.0031) [2024-06-21 14:54:47,719][15401] Updated weights for policy 0, policy_version 8130 (0.0038) [2024-06-21 14:54:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 133201920. Throughput: 0: 40705.9. Samples: 133351060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:54:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 14:54:53,105][15401] Updated weights for policy 0, policy_version 8140 (0.0026) [2024-06-21 14:54:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 133382144. Throughput: 0: 41108.5. Samples: 133481020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 14:54:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 14:54:55,586][15401] Updated weights for policy 0, policy_version 8150 (0.0032) [2024-06-21 14:54:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40688.5, 300 sec: 41154.4). Total num frames: 133595136. Throughput: 0: 40886.6. Samples: 133725860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 14:54:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 14:55:00,969][15401] Updated weights for policy 0, policy_version 8160 (0.0032) [2024-06-21 14:55:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.3, 300 sec: 41210.0). Total num frames: 133824512. Throughput: 0: 40917.0. Samples: 133969880. Policy #0 lag: (min: 2.0, avg: 10.0, max: 23.0) [2024-06-21 14:55:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 14:55:03,632][15401] Updated weights for policy 0, policy_version 8170 (0.0030) [2024-06-21 14:55:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 40415.4, 300 sec: 41098.8). Total num frames: 133988352. Throughput: 0: 41120.8. Samples: 134099720. Policy #0 lag: (min: 2.0, avg: 10.0, max: 23.0) [2024-06-21 14:55:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 14:55:08,727][15401] Updated weights for policy 0, policy_version 8180 (0.0041) [2024-06-21 14:55:11,998][15401] Updated weights for policy 0, policy_version 8190 (0.0036) [2024-06-21 14:55:13,392][15132] Fps is (10 sec: 39311.9, 60 sec: 40958.3, 300 sec: 41099.4). Total num frames: 134217728. Throughput: 0: 40839.6. Samples: 134337200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 14:55:13,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 14:55:16,583][15401] Updated weights for policy 0, policy_version 8200 (0.0037) [2024-06-21 14:55:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 134430720. Throughput: 0: 40969.3. Samples: 134587000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 14:55:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 14:55:19,691][15401] Updated weights for policy 0, policy_version 8210 (0.0042) [2024-06-21 14:55:23,389][15132] Fps is (10 sec: 39331.0, 60 sec: 40415.5, 300 sec: 41098.9). Total num frames: 134610944. Throughput: 0: 40795.7. Samples: 134711180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 14:55:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 14:55:24,423][15401] Updated weights for policy 0, policy_version 8220 (0.0029) [2024-06-21 14:55:27,748][15401] Updated weights for policy 0, policy_version 8230 (0.0042) [2024-06-21 14:55:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 134840320. Throughput: 0: 41070.7. Samples: 134957540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 14:55:28,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 14:55:32,167][15401] Updated weights for policy 0, policy_version 8240 (0.0045) [2024-06-21 14:55:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41507.9, 300 sec: 41154.4). Total num frames: 135053312. Throughput: 0: 41276.5. Samples: 135208500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 14:55:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 14:55:35,441][15401] Updated weights for policy 0, policy_version 8250 (0.0029) [2024-06-21 14:55:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 135233536. Throughput: 0: 41190.6. Samples: 135334600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:55:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 14:55:40,263][15401] Updated weights for policy 0, policy_version 8260 (0.0036) [2024-06-21 14:55:41,156][15349] Signal inference workers to stop experience collection... (1950 times) [2024-06-21 14:55:41,156][15349] Signal inference workers to resume experience collection... (1950 times) [2024-06-21 14:55:41,190][15401] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-21 14:55:41,190][15401] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-21 14:55:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 41777.5, 300 sec: 41265.1). Total num frames: 135479296. Throughput: 0: 41211.1. Samples: 135580460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 14:55:43,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 14:55:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008269_135479296.pth... [2024-06-21 14:55:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007666_125599744.pth [2024-06-21 14:55:43,742][15401] Updated weights for policy 0, policy_version 8270 (0.0039) [2024-06-21 14:55:48,019][15401] Updated weights for policy 0, policy_version 8280 (0.0034) [2024-06-21 14:55:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 135659520. Throughput: 0: 41218.7. Samples: 135824720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 14:55:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 14:55:51,701][15401] Updated weights for policy 0, policy_version 8290 (0.0038) [2024-06-21 14:55:53,390][15132] Fps is (10 sec: 39330.9, 60 sec: 41506.0, 300 sec: 41154.7). Total num frames: 135872512. Throughput: 0: 41076.9. Samples: 135948180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 14:55:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 14:55:55,775][15401] Updated weights for policy 0, policy_version 8300 (0.0035) [2024-06-21 14:55:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 136085504. Throughput: 0: 41414.1. Samples: 136200740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 14:55:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 14:55:59,729][15401] Updated weights for policy 0, policy_version 8310 (0.0051) [2024-06-21 14:56:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 136282112. Throughput: 0: 41398.8. Samples: 136449940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 14:56:03,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 14:56:03,698][15401] Updated weights for policy 0, policy_version 8320 (0.0044) [2024-06-21 14:56:07,789][15401] Updated weights for policy 0, policy_version 8330 (0.0037) [2024-06-21 14:56:08,392][15132] Fps is (10 sec: 40948.9, 60 sec: 41777.3, 300 sec: 41154.0). Total num frames: 136495104. Throughput: 0: 41317.4. Samples: 136570580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 14:56:08,393][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 14:56:11,747][15401] Updated weights for policy 0, policy_version 8340 (0.0037) [2024-06-21 14:56:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40961.7, 300 sec: 41043.3). Total num frames: 136675328. Throughput: 0: 41256.9. Samples: 136814100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:56:13,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 14:56:15,650][15401] Updated weights for policy 0, policy_version 8350 (0.0040) [2024-06-21 14:56:18,390][15132] Fps is (10 sec: 37693.8, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 136871936. Throughput: 0: 41309.2. Samples: 137067420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 14:56:18,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-21 14:56:19,775][15401] Updated weights for policy 0, policy_version 8360 (0.0038) [2024-06-21 14:56:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 137101312. Throughput: 0: 41238.4. Samples: 137190320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:56:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 14:56:23,554][15401] Updated weights for policy 0, policy_version 8370 (0.0035) [2024-06-21 14:56:27,688][15401] Updated weights for policy 0, policy_version 8380 (0.0036) [2024-06-21 14:56:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 137314304. Throughput: 0: 41261.7. Samples: 137437140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 14:56:28,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 14:56:31,450][15401] Updated weights for policy 0, policy_version 8390 (0.0029) [2024-06-21 14:56:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 137510912. Throughput: 0: 41271.6. Samples: 137681940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 14:56:33,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-21 14:56:36,007][15401] Updated weights for policy 0, policy_version 8400 (0.0039) [2024-06-21 14:56:38,392][15132] Fps is (10 sec: 40950.7, 60 sec: 41504.5, 300 sec: 41154.0). Total num frames: 137723904. Throughput: 0: 41257.4. Samples: 137804860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 14:56:38,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 14:56:39,351][15401] Updated weights for policy 0, policy_version 8410 (0.0035) [2024-06-21 14:56:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 40688.5, 300 sec: 41098.8). Total num frames: 137920512. Throughput: 0: 41186.2. Samples: 138054120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 14:56:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 14:56:43,860][15401] Updated weights for policy 0, policy_version 8420 (0.0040) [2024-06-21 14:56:47,513][15401] Updated weights for policy 0, policy_version 8430 (0.0035) [2024-06-21 14:56:48,390][15132] Fps is (10 sec: 40969.2, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 138133504. Throughput: 0: 41072.2. Samples: 138298200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 14:56:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 14:56:51,719][15401] Updated weights for policy 0, policy_version 8440 (0.0033) [2024-06-21 14:56:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 138330112. Throughput: 0: 41074.6. Samples: 138418820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 14:56:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 14:56:55,675][15401] Updated weights for policy 0, policy_version 8450 (0.0038) [2024-06-21 14:56:58,390][15132] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 138526720. Throughput: 0: 41035.5. Samples: 138660700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:56:58,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-21 14:56:59,581][15401] Updated weights for policy 0, policy_version 8460 (0.0041) [2024-06-21 14:57:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 138739712. Throughput: 0: 40933.4. Samples: 138909420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:57:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 14:57:03,556][15401] Updated weights for policy 0, policy_version 8470 (0.0038) [2024-06-21 14:57:07,304][15401] Updated weights for policy 0, policy_version 8480 (0.0040) [2024-06-21 14:57:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41235.0, 300 sec: 41154.4). Total num frames: 138969088. Throughput: 0: 40972.8. Samples: 139034100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 14:57:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 14:57:11,509][15401] Updated weights for policy 0, policy_version 8490 (0.0029) [2024-06-21 14:57:12,994][15349] Signal inference workers to stop experience collection... (2000 times) [2024-06-21 14:57:12,996][15349] Signal inference workers to resume experience collection... (2000 times) [2024-06-21 14:57:13,009][15401] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-21 14:57:13,040][15401] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-21 14:57:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 139149312. Throughput: 0: 41011.7. Samples: 139282660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 14:57:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 14:57:15,259][15401] Updated weights for policy 0, policy_version 8500 (0.0031) [2024-06-21 14:57:18,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 139345920. Throughput: 0: 41164.3. Samples: 139534340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 14:57:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 14:57:19,509][15401] Updated weights for policy 0, policy_version 8510 (0.0040) [2024-06-21 14:57:23,291][15401] Updated weights for policy 0, policy_version 8520 (0.0029) [2024-06-21 14:57:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 139591680. Throughput: 0: 41074.5. Samples: 139653120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 14:57:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 14:57:27,500][15401] Updated weights for policy 0, policy_version 8530 (0.0048) [2024-06-21 14:57:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 139771904. Throughput: 0: 41004.0. Samples: 139899300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:57:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 14:57:31,202][15401] Updated weights for policy 0, policy_version 8540 (0.0038) [2024-06-21 14:57:33,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 139984896. Throughput: 0: 41133.0. Samples: 140149180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 14:57:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 14:57:35,287][15401] Updated weights for policy 0, policy_version 8550 (0.0041) [2024-06-21 14:57:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40688.6, 300 sec: 40988.1). Total num frames: 140165120. Throughput: 0: 41266.2. Samples: 140275800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 14:57:38,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 14:57:39,277][15401] Updated weights for policy 0, policy_version 8560 (0.0033) [2024-06-21 14:57:43,009][15401] Updated weights for policy 0, policy_version 8570 (0.0040) [2024-06-21 14:57:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41210.8). Total num frames: 140410880. Throughput: 0: 41390.2. Samples: 140523260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 14:57:43,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 14:57:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008570_140410880.pth... [2024-06-21 14:57:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000007968_130547712.pth [2024-06-21 14:57:47,418][15401] Updated weights for policy 0, policy_version 8580 (0.0034) [2024-06-21 14:57:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.2, 300 sec: 41098.9). Total num frames: 140591104. Throughput: 0: 41352.1. Samples: 140770260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 14:57:48,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-21 14:57:51,061][15401] Updated weights for policy 0, policy_version 8590 (0.0042) [2024-06-21 14:57:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 140820480. Throughput: 0: 41227.4. Samples: 140889340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:57:53,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 14:57:55,495][15401] Updated weights for policy 0, policy_version 8600 (0.0037) [2024-06-21 14:57:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 141033472. Throughput: 0: 41318.5. Samples: 141142000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 14:57:58,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 14:57:59,008][15401] Updated weights for policy 0, policy_version 8610 (0.0034) [2024-06-21 14:58:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 141213696. Throughput: 0: 41255.0. Samples: 141390820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 14:58:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 14:58:03,443][15401] Updated weights for policy 0, policy_version 8620 (0.0028) [2024-06-21 14:58:07,038][15401] Updated weights for policy 0, policy_version 8630 (0.0028) [2024-06-21 14:58:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 141443072. Throughput: 0: 41208.1. Samples: 141507480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:58:08,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-21 14:58:11,591][15401] Updated weights for policy 0, policy_version 8640 (0.0028) [2024-06-21 14:58:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 141639680. Throughput: 0: 41325.4. Samples: 141758940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 14:58:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 14:58:14,939][15401] Updated weights for policy 0, policy_version 8650 (0.0032) [2024-06-21 14:58:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41099.7). Total num frames: 141836288. Throughput: 0: 41210.7. Samples: 142003660. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-21 14:58:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 14:58:19,477][15401] Updated weights for policy 0, policy_version 8660 (0.0042) [2024-06-21 14:58:23,298][15401] Updated weights for policy 0, policy_version 8670 (0.0028) [2024-06-21 14:58:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 142049280. Throughput: 0: 41097.2. Samples: 142125180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 14:58:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 14:58:27,325][15401] Updated weights for policy 0, policy_version 8680 (0.0039) [2024-06-21 14:58:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 142245888. Throughput: 0: 41165.4. Samples: 142375700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 14:58:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 14:58:31,011][15401] Updated weights for policy 0, policy_version 8690 (0.0039) [2024-06-21 14:58:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 142458880. Throughput: 0: 41107.3. Samples: 142620100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 14:58:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 14:58:35,517][15401] Updated weights for policy 0, policy_version 8700 (0.0042) [2024-06-21 14:58:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41154.4). Total num frames: 142688256. Throughput: 0: 41303.7. Samples: 142748000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 14:58:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 14:58:38,869][15401] Updated weights for policy 0, policy_version 8710 (0.0030) [2024-06-21 14:58:43,335][15401] Updated weights for policy 0, policy_version 8720 (0.0031) [2024-06-21 14:58:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 142868480. Throughput: 0: 41196.9. Samples: 142995860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:58:43,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 14:58:46,684][15401] Updated weights for policy 0, policy_version 8730 (0.0049) [2024-06-21 14:58:48,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41504.4, 300 sec: 41154.1). Total num frames: 143081472. Throughput: 0: 41113.5. Samples: 143241020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 14:58:48,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 14:58:51,198][15401] Updated weights for policy 0, policy_version 8740 (0.0038) [2024-06-21 14:58:53,325][15349] Signal inference workers to stop experience collection... (2050 times) [2024-06-21 14:58:53,327][15349] Signal inference workers to resume experience collection... (2050 times) [2024-06-21 14:58:53,370][15401] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-21 14:58:53,371][15401] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-21 14:58:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41154.7). Total num frames: 143294464. Throughput: 0: 41383.5. Samples: 143369740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 14:58:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 14:58:54,264][15401] Updated weights for policy 0, policy_version 8750 (0.0037) [2024-06-21 14:58:58,390][15132] Fps is (10 sec: 40969.6, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 143491072. Throughput: 0: 41315.1. Samples: 143618120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 14:58:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 14:58:59,041][15401] Updated weights for policy 0, policy_version 8760 (0.0035) [2024-06-21 14:59:02,217][15401] Updated weights for policy 0, policy_version 8770 (0.0031) [2024-06-21 14:59:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41154.7). Total num frames: 143704064. Throughput: 0: 41315.5. Samples: 143862860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 14:59:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 14:59:06,939][15401] Updated weights for policy 0, policy_version 8780 (0.0036) [2024-06-21 14:59:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 143884288. Throughput: 0: 41395.2. Samples: 143987960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 14:59:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 14:59:09,954][15401] Updated weights for policy 0, policy_version 8790 (0.0023) [2024-06-21 14:59:13,390][15132] Fps is (10 sec: 40956.5, 60 sec: 41232.5, 300 sec: 41265.3). Total num frames: 144113664. Throughput: 0: 41390.7. Samples: 144238320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 14:59:13,391][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 14:59:14,837][15401] Updated weights for policy 0, policy_version 8800 (0.0043) [2024-06-21 14:59:18,025][15401] Updated weights for policy 0, policy_version 8810 (0.0043) [2024-06-21 14:59:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41210.3). Total num frames: 144343040. Throughput: 0: 41149.0. Samples: 144471800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 14:59:18,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 14:59:22,831][15401] Updated weights for policy 0, policy_version 8820 (0.0042) [2024-06-21 14:59:23,392][15132] Fps is (10 sec: 40954.1, 60 sec: 41231.5, 300 sec: 41265.1). Total num frames: 144523264. Throughput: 0: 41219.6. Samples: 144602980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 14:59:23,392][15132] Avg episode reward: [(0, '0.153')] [2024-06-21 14:59:26,575][15401] Updated weights for policy 0, policy_version 8830 (0.0039) [2024-06-21 14:59:28,389][15132] Fps is (10 sec: 36045.3, 60 sec: 40960.0, 300 sec: 41154.7). Total num frames: 144703488. Throughput: 0: 41052.1. Samples: 144843200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 14:59:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-21 14:59:31,224][15401] Updated weights for policy 0, policy_version 8840 (0.0044) [2024-06-21 14:59:33,389][15132] Fps is (10 sec: 40969.7, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 144932864. Throughput: 0: 41169.7. Samples: 145093560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 14:59:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 14:59:34,367][15401] Updated weights for policy 0, policy_version 8850 (0.0044) [2024-06-21 14:59:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 145145856. Throughput: 0: 41069.8. Samples: 145217880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:59:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 14:59:38,931][15401] Updated weights for policy 0, policy_version 8860 (0.0044) [2024-06-21 14:59:42,330][15401] Updated weights for policy 0, policy_version 8870 (0.0034) [2024-06-21 14:59:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 145342464. Throughput: 0: 40968.5. Samples: 145461700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 14:59:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 14:59:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008871_145342464.pth... [2024-06-21 14:59:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008269_135479296.pth [2024-06-21 14:59:46,991][15401] Updated weights for policy 0, policy_version 8880 (0.0035) [2024-06-21 14:59:48,391][15132] Fps is (10 sec: 39315.9, 60 sec: 40960.6, 300 sec: 41209.7). Total num frames: 145539072. Throughput: 0: 41130.3. Samples: 145713780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 14:59:48,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 14:59:50,218][15401] Updated weights for policy 0, policy_version 8890 (0.0041) [2024-06-21 14:59:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 145752064. Throughput: 0: 41115.8. Samples: 145838180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 14:59:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 14:59:54,848][15401] Updated weights for policy 0, policy_version 8900 (0.0037) [2024-06-21 14:59:58,128][15401] Updated weights for policy 0, policy_version 8910 (0.0030) [2024-06-21 14:59:58,390][15132] Fps is (10 sec: 44242.7, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 145981440. Throughput: 0: 41110.9. Samples: 146088280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 14:59:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 15:00:02,662][15401] Updated weights for policy 0, policy_version 8920 (0.0036) [2024-06-21 15:00:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 146161664. Throughput: 0: 41423.2. Samples: 146335840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 15:00:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 15:00:06,229][15401] Updated weights for policy 0, policy_version 8930 (0.0035) [2024-06-21 15:00:08,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41210.3). Total num frames: 146374656. Throughput: 0: 41199.5. Samples: 146456860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 15:00:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 15:00:10,566][15401] Updated weights for policy 0, policy_version 8940 (0.0026) [2024-06-21 15:00:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.7, 300 sec: 41209.9). Total num frames: 146587648. Throughput: 0: 41452.9. Samples: 146708580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 15:00:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 15:00:14,142][15401] Updated weights for policy 0, policy_version 8950 (0.0045) [2024-06-21 15:00:18,326][15401] Updated weights for policy 0, policy_version 8960 (0.0039) [2024-06-21 15:00:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 146800640. Throughput: 0: 41491.1. Samples: 146960660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:00:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 15:00:22,499][15401] Updated weights for policy 0, policy_version 8970 (0.0035) [2024-06-21 15:00:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41234.7, 300 sec: 41209.9). Total num frames: 146997248. Throughput: 0: 41453.3. Samples: 147083280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:00:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 15:00:26,272][15349] Signal inference workers to stop experience collection... (2100 times) [2024-06-21 15:00:26,323][15401] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-21 15:00:26,327][15349] Signal inference workers to resume experience collection... (2100 times) [2024-06-21 15:00:26,333][15401] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-21 15:00:26,341][15401] Updated weights for policy 0, policy_version 8980 (0.0038) [2024-06-21 15:00:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 41265.4). Total num frames: 147226624. Throughput: 0: 41635.9. Samples: 147335320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:00:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 15:00:30,260][15401] Updated weights for policy 0, policy_version 8990 (0.0040) [2024-06-21 15:00:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 147406848. Throughput: 0: 41685.3. Samples: 147589560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:00:33,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 15:00:34,280][15401] Updated weights for policy 0, policy_version 9000 (0.0044) [2024-06-21 15:00:37,854][15401] Updated weights for policy 0, policy_version 9010 (0.0035) [2024-06-21 15:00:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41233.2, 300 sec: 41154.7). Total num frames: 147619840. Throughput: 0: 41429.1. Samples: 147702480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:00:38,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-21 15:00:42,121][15401] Updated weights for policy 0, policy_version 9020 (0.0031) [2024-06-21 15:00:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 41504.5, 300 sec: 41265.1). Total num frames: 147832832. Throughput: 0: 41468.6. Samples: 147954460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:00:43,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:00:45,975][15401] Updated weights for policy 0, policy_version 9030 (0.0040) [2024-06-21 15:00:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41234.1, 300 sec: 41154.4). Total num frames: 148013056. Throughput: 0: 41515.6. Samples: 148204040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:00:48,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 15:00:50,004][15401] Updated weights for policy 0, policy_version 9040 (0.0042) [2024-06-21 15:00:53,390][15132] Fps is (10 sec: 40966.7, 60 sec: 41505.7, 300 sec: 41209.8). Total num frames: 148242432. Throughput: 0: 41443.8. Samples: 148321860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:00:53,391][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 15:00:54,089][15401] Updated weights for policy 0, policy_version 9050 (0.0029) [2024-06-21 15:00:58,069][15401] Updated weights for policy 0, policy_version 9060 (0.0040) [2024-06-21 15:00:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 148439040. Throughput: 0: 41397.7. Samples: 148571480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:00:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 15:01:02,249][15401] Updated weights for policy 0, policy_version 9070 (0.0046) [2024-06-21 15:01:03,390][15132] Fps is (10 sec: 40963.0, 60 sec: 41506.1, 300 sec: 41210.3). Total num frames: 148652032. Throughput: 0: 41359.5. Samples: 148821840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 15:01:03,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-21 15:01:05,895][15401] Updated weights for policy 0, policy_version 9080 (0.0049) [2024-06-21 15:01:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 148881408. Throughput: 0: 41280.5. Samples: 148940900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 15:01:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 15:01:10,088][15401] Updated weights for policy 0, policy_version 9090 (0.0035) [2024-06-21 15:01:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 149045248. Throughput: 0: 41204.1. Samples: 149189500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:01:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 15:01:13,979][15401] Updated weights for policy 0, policy_version 9100 (0.0031) [2024-06-21 15:01:17,809][15401] Updated weights for policy 0, policy_version 9110 (0.0030) [2024-06-21 15:01:18,392][15132] Fps is (10 sec: 37673.9, 60 sec: 40958.3, 300 sec: 41209.6). Total num frames: 149258240. Throughput: 0: 41011.2. Samples: 149435160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 15:01:18,393][15132] Avg episode reward: [(0, '0.808')] [2024-06-21 15:01:22,222][15401] Updated weights for policy 0, policy_version 9120 (0.0042) [2024-06-21 15:01:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 149504000. Throughput: 0: 41351.8. Samples: 149563320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 15:01:23,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 15:01:26,061][15401] Updated weights for policy 0, policy_version 9130 (0.0037) [2024-06-21 15:01:28,390][15132] Fps is (10 sec: 42607.9, 60 sec: 40959.9, 300 sec: 41265.4). Total num frames: 149684224. Throughput: 0: 41178.9. Samples: 149807420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 15:01:28,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 15:01:29,990][15401] Updated weights for policy 0, policy_version 9140 (0.0033) [2024-06-21 15:01:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41265.8). Total num frames: 149897216. Throughput: 0: 41062.5. Samples: 150051860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 15:01:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 15:01:33,695][15401] Updated weights for policy 0, policy_version 9150 (0.0036) [2024-06-21 15:01:37,870][15401] Updated weights for policy 0, policy_version 9160 (0.0038) [2024-06-21 15:01:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41232.9, 300 sec: 41265.5). Total num frames: 150093824. Throughput: 0: 41355.7. Samples: 150182840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 15:01:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 15:01:38,428][15349] Signal inference workers to stop experience collection... (2150 times) [2024-06-21 15:01:38,428][15349] Signal inference workers to resume experience collection... (2150 times) [2024-06-21 15:01:38,467][15401] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-21 15:01:38,467][15401] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-21 15:01:41,359][15401] Updated weights for policy 0, policy_version 9170 (0.0035) [2024-06-21 15:01:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.7, 300 sec: 41265.5). Total num frames: 150306816. Throughput: 0: 41248.9. Samples: 150427680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 15:01:43,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 15:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009174_150306816.pth... [2024-06-21 15:01:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008570_140410880.pth [2024-06-21 15:01:45,754][15401] Updated weights for policy 0, policy_version 9180 (0.0051) [2024-06-21 15:01:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 150536192. Throughput: 0: 41005.3. Samples: 150667080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 15:01:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 15:01:49,518][15401] Updated weights for policy 0, policy_version 9190 (0.0034) [2024-06-21 15:01:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.5, 300 sec: 41265.5). Total num frames: 150700032. Throughput: 0: 41267.5. Samples: 150797940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 15:01:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 15:01:53,844][15401] Updated weights for policy 0, policy_version 9200 (0.0035) [2024-06-21 15:01:57,272][15401] Updated weights for policy 0, policy_version 9210 (0.0039) [2024-06-21 15:01:58,392][15132] Fps is (10 sec: 37674.4, 60 sec: 41231.4, 300 sec: 41265.1). Total num frames: 150913024. Throughput: 0: 41130.3. Samples: 151040460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-21 15:01:58,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 15:02:01,850][15401] Updated weights for policy 0, policy_version 9220 (0.0039) [2024-06-21 15:02:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 151142400. Throughput: 0: 41197.7. Samples: 151288960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-21 15:02:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 15:02:05,014][15401] Updated weights for policy 0, policy_version 9230 (0.0048) [2024-06-21 15:02:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 151322624. Throughput: 0: 41215.7. Samples: 151418020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 15:02:08,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 15:02:09,859][15401] Updated weights for policy 0, policy_version 9240 (0.0031) [2024-06-21 15:02:13,202][15401] Updated weights for policy 0, policy_version 9250 (0.0036) [2024-06-21 15:02:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 151552000. Throughput: 0: 41241.4. Samples: 151663280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:02:13,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 15:02:17,648][15401] Updated weights for policy 0, policy_version 9260 (0.0043) [2024-06-21 15:02:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41780.8, 300 sec: 41265.5). Total num frames: 151764992. Throughput: 0: 41414.2. Samples: 151915500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:02:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 15:02:20,957][15401] Updated weights for policy 0, policy_version 9270 (0.0045) [2024-06-21 15:02:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 151961600. Throughput: 0: 41260.1. Samples: 152039540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 15:02:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 15:02:25,556][15401] Updated weights for policy 0, policy_version 9280 (0.0036) [2024-06-21 15:02:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 152174592. Throughput: 0: 41236.4. Samples: 152283320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:02:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 15:02:28,839][15401] Updated weights for policy 0, policy_version 9290 (0.0043) [2024-06-21 15:02:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 152354816. Throughput: 0: 41729.8. Samples: 152544920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:02:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 15:02:33,498][15401] Updated weights for policy 0, policy_version 9300 (0.0039) [2024-06-21 15:02:36,538][15401] Updated weights for policy 0, policy_version 9310 (0.0031) [2024-06-21 15:02:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 152567808. Throughput: 0: 41383.4. Samples: 152660200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 15:02:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 15:02:41,240][15401] Updated weights for policy 0, policy_version 9320 (0.0040) [2024-06-21 15:02:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 152797184. Throughput: 0: 41540.4. Samples: 152909680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 15:02:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 15:02:44,471][15401] Updated weights for policy 0, policy_version 9330 (0.0028) [2024-06-21 15:02:48,390][15132] Fps is (10 sec: 40960.6, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 152977408. Throughput: 0: 41529.4. Samples: 153157780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 15:02:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 15:02:49,028][15401] Updated weights for policy 0, policy_version 9340 (0.0044) [2024-06-21 15:02:52,261][15401] Updated weights for policy 0, policy_version 9350 (0.0037) [2024-06-21 15:02:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 153206784. Throughput: 0: 41407.0. Samples: 153281340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:02:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 15:02:56,864][15401] Updated weights for policy 0, policy_version 9360 (0.0027) [2024-06-21 15:02:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40961.6, 300 sec: 41209.9). Total num frames: 153370624. Throughput: 0: 41413.9. Samples: 153526900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:02:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 15:03:00,045][15401] Updated weights for policy 0, policy_version 9370 (0.0034) [2024-06-21 15:03:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 153600000. Throughput: 0: 41480.1. Samples: 153782100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:03:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 15:03:04,606][15401] Updated weights for policy 0, policy_version 9380 (0.0046) [2024-06-21 15:03:07,461][15349] Signal inference workers to stop experience collection... (2200 times) [2024-06-21 15:03:07,515][15401] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-21 15:03:07,524][15349] Signal inference workers to resume experience collection... (2200 times) [2024-06-21 15:03:07,529][15401] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-21 15:03:08,080][15401] Updated weights for policy 0, policy_version 9390 (0.0031) [2024-06-21 15:03:08,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41376.6). Total num frames: 153845760. Throughput: 0: 41535.2. Samples: 153908620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:03:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 15:03:12,521][15401] Updated weights for policy 0, policy_version 9400 (0.0045) [2024-06-21 15:03:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41231.5, 300 sec: 41320.7). Total num frames: 154025984. Throughput: 0: 41444.9. Samples: 154148440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:03:13,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 15:03:16,193][15401] Updated weights for policy 0, policy_version 9410 (0.0044) [2024-06-21 15:03:18,390][15132] Fps is (10 sec: 37681.3, 60 sec: 40959.8, 300 sec: 41265.4). Total num frames: 154222592. Throughput: 0: 40885.4. Samples: 154384780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:03:18,391][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 15:03:21,243][15401] Updated weights for policy 0, policy_version 9420 (0.0046) [2024-06-21 15:03:23,392][15132] Fps is (10 sec: 37683.4, 60 sec: 40685.3, 300 sec: 41209.6). Total num frames: 154402816. Throughput: 0: 41154.4. Samples: 154512240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 15:03:23,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 15:03:24,360][15401] Updated weights for policy 0, policy_version 9430 (0.0034) [2024-06-21 15:03:28,390][15132] Fps is (10 sec: 40961.8, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 154632192. Throughput: 0: 40977.3. Samples: 154753660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 15:03:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 15:03:29,050][15401] Updated weights for policy 0, policy_version 9440 (0.0037) [2024-06-21 15:03:32,257][15401] Updated weights for policy 0, policy_version 9450 (0.0049) [2024-06-21 15:03:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 154845184. Throughput: 0: 40917.3. Samples: 154999060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:03:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 15:03:36,796][15401] Updated weights for policy 0, policy_version 9460 (0.0047) [2024-06-21 15:03:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40960.2, 300 sec: 41209.9). Total num frames: 155025408. Throughput: 0: 41038.4. Samples: 155128060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:03:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 15:03:40,328][15401] Updated weights for policy 0, policy_version 9470 (0.0041) [2024-06-21 15:03:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41210.3). Total num frames: 155238400. Throughput: 0: 41007.5. Samples: 155372240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:03:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 15:03:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009475_155238400.pth... [2024-06-21 15:03:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000008871_145342464.pth [2024-06-21 15:03:45,034][15401] Updated weights for policy 0, policy_version 9480 (0.0029) [2024-06-21 15:03:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 155451392. Throughput: 0: 40744.9. Samples: 155615620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:03:48,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 15:03:48,627][15401] Updated weights for policy 0, policy_version 9490 (0.0037) [2024-06-21 15:03:52,845][15401] Updated weights for policy 0, policy_version 9500 (0.0033) [2024-06-21 15:03:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40687.1, 300 sec: 41209.9). Total num frames: 155648000. Throughput: 0: 40729.4. Samples: 155741440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:03:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 15:03:56,579][15401] Updated weights for policy 0, policy_version 9510 (0.0044) [2024-06-21 15:03:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 155860992. Throughput: 0: 40882.1. Samples: 155988040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:03:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 15:04:01,385][15401] Updated weights for policy 0, policy_version 9520 (0.0045) [2024-06-21 15:04:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 156073984. Throughput: 0: 41146.2. Samples: 156236340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 15:04:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 15:04:04,646][15401] Updated weights for policy 0, policy_version 9530 (0.0036) [2024-06-21 15:04:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40413.7, 300 sec: 41210.0). Total num frames: 156270592. Throughput: 0: 41028.7. Samples: 156358440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 15:04:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 15:04:09,248][15401] Updated weights for policy 0, policy_version 9540 (0.0045) [2024-06-21 15:04:12,663][15401] Updated weights for policy 0, policy_version 9550 (0.0034) [2024-06-21 15:04:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40961.7, 300 sec: 41154.4). Total num frames: 156483584. Throughput: 0: 41177.8. Samples: 156606660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 15:04:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 15:04:17,155][15401] Updated weights for policy 0, policy_version 9560 (0.0039) [2024-06-21 15:04:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 40960.3, 300 sec: 41210.3). Total num frames: 156680192. Throughput: 0: 41279.2. Samples: 156856620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 15:04:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 15:04:20,624][15401] Updated weights for policy 0, policy_version 9570 (0.0029) [2024-06-21 15:04:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41507.7, 300 sec: 41321.0). Total num frames: 156893184. Throughput: 0: 41151.4. Samples: 156979880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 15:04:23,395][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 15:04:25,040][15401] Updated weights for policy 0, policy_version 9580 (0.0040) [2024-06-21 15:04:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 157089792. Throughput: 0: 41120.1. Samples: 157222640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 15:04:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 15:04:28,761][15401] Updated weights for policy 0, policy_version 9590 (0.0042) [2024-06-21 15:04:32,894][15401] Updated weights for policy 0, policy_version 9600 (0.0039) [2024-06-21 15:04:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 157286400. Throughput: 0: 41271.6. Samples: 157472840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 15:04:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 15:04:36,629][15401] Updated weights for policy 0, policy_version 9610 (0.0039) [2024-06-21 15:04:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 157515776. Throughput: 0: 41172.4. Samples: 157594200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 15:04:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:04:38,660][15349] Signal inference workers to stop experience collection... (2250 times) [2024-06-21 15:04:38,660][15349] Signal inference workers to resume experience collection... (2250 times) [2024-06-21 15:04:38,688][15401] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-21 15:04:38,688][15401] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-21 15:04:40,807][15401] Updated weights for policy 0, policy_version 9620 (0.0026) [2024-06-21 15:04:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41321.2). Total num frames: 157728768. Throughput: 0: 41206.4. Samples: 157842320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:04:43,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 15:04:44,452][15401] Updated weights for policy 0, policy_version 9630 (0.0036) [2024-06-21 15:04:48,392][15132] Fps is (10 sec: 37674.3, 60 sec: 40685.3, 300 sec: 41154.1). Total num frames: 157892608. Throughput: 0: 41086.7. Samples: 158085340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:04:48,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 15:04:48,867][15401] Updated weights for policy 0, policy_version 9640 (0.0036) [2024-06-21 15:04:52,281][15401] Updated weights for policy 0, policy_version 9650 (0.0039) [2024-06-21 15:04:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 158121984. Throughput: 0: 41008.5. Samples: 158203820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 15:04:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 15:04:56,941][15401] Updated weights for policy 0, policy_version 9660 (0.0027) [2024-06-21 15:04:58,390][15132] Fps is (10 sec: 40969.7, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 158302208. Throughput: 0: 40974.7. Samples: 158450520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 15:04:58,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 15:05:00,235][15401] Updated weights for policy 0, policy_version 9670 (0.0028) [2024-06-21 15:05:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 158547968. Throughput: 0: 40853.7. Samples: 158695040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 15:05:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 15:05:04,820][15401] Updated weights for policy 0, policy_version 9680 (0.0038) [2024-06-21 15:05:08,158][15401] Updated weights for policy 0, policy_version 9690 (0.0028) [2024-06-21 15:05:08,391][15132] Fps is (10 sec: 45870.1, 60 sec: 41505.5, 300 sec: 41265.3). Total num frames: 158760960. Throughput: 0: 41103.5. Samples: 158829580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 15:05:08,391][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 15:05:13,023][15401] Updated weights for policy 0, policy_version 9700 (0.0044) [2024-06-21 15:05:13,390][15132] Fps is (10 sec: 37683.3, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 158924800. Throughput: 0: 41220.8. Samples: 159077580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 15:05:13,400][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 15:05:16,152][15401] Updated weights for policy 0, policy_version 9710 (0.0038) [2024-06-21 15:05:18,389][15132] Fps is (10 sec: 40964.8, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 159170560. Throughput: 0: 40893.4. Samples: 159313040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-21 15:05:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 15:05:21,485][15401] Updated weights for policy 0, policy_version 9720 (0.0039) [2024-06-21 15:05:23,391][15132] Fps is (10 sec: 42590.6, 60 sec: 40958.8, 300 sec: 41098.6). Total num frames: 159350784. Throughput: 0: 41223.2. Samples: 159449320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-21 15:05:23,392][15132] Avg episode reward: [(0, '0.190')] [2024-06-21 15:05:24,105][15401] Updated weights for policy 0, policy_version 9730 (0.0040) [2024-06-21 15:05:28,389][15132] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 159547392. Throughput: 0: 40978.2. Samples: 159686340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-21 15:05:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 15:05:29,399][15401] Updated weights for policy 0, policy_version 9740 (0.0047) [2024-06-21 15:05:31,903][15401] Updated weights for policy 0, policy_version 9750 (0.0037) [2024-06-21 15:05:33,390][15132] Fps is (10 sec: 42605.7, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 159776768. Throughput: 0: 41080.7. Samples: 159933880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-21 15:05:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 15:05:37,380][15401] Updated weights for policy 0, policy_version 9760 (0.0043) [2024-06-21 15:05:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.1, 300 sec: 41154.7). Total num frames: 159973376. Throughput: 0: 41298.8. Samples: 160062260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:05:38,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-21 15:05:39,633][15349] Signal inference workers to stop experience collection... (2300 times) [2024-06-21 15:05:39,633][15349] Signal inference workers to resume experience collection... (2300 times) [2024-06-21 15:05:39,676][15401] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-21 15:05:39,676][15401] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-21 15:05:39,771][15401] Updated weights for policy 0, policy_version 9770 (0.0039) [2024-06-21 15:05:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 160202752. Throughput: 0: 41135.5. Samples: 160301620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:05:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 15:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009778_160202752.pth... [2024-06-21 15:05:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009174_150306816.pth [2024-06-21 15:05:45,215][15401] Updated weights for policy 0, policy_version 9780 (0.0028) [2024-06-21 15:05:47,982][15401] Updated weights for policy 0, policy_version 9790 (0.0023) [2024-06-21 15:05:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41780.9, 300 sec: 41210.0). Total num frames: 160399360. Throughput: 0: 41203.2. Samples: 160549180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 23.0) [2024-06-21 15:05:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 15:05:53,361][15401] Updated weights for policy 0, policy_version 9800 (0.0042) [2024-06-21 15:05:53,390][15132] Fps is (10 sec: 36044.8, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 160563200. Throughput: 0: 41006.3. Samples: 160674820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 23.0) [2024-06-21 15:05:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 15:05:56,049][15401] Updated weights for policy 0, policy_version 9810 (0.0036) [2024-06-21 15:05:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 41265.1). Total num frames: 160825344. Throughput: 0: 40989.4. Samples: 160922200. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-21 15:05:58,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 15:06:01,168][15401] Updated weights for policy 0, policy_version 9820 (0.0044) [2024-06-21 15:06:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 161005568. Throughput: 0: 41238.6. Samples: 161168780. Policy #0 lag: (min: 1.0, avg: 9.1, max: 23.0) [2024-06-21 15:06:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 15:06:04,454][15401] Updated weights for policy 0, policy_version 9830 (0.0050) [2024-06-21 15:06:08,389][15132] Fps is (10 sec: 36053.5, 60 sec: 40414.6, 300 sec: 41154.4). Total num frames: 161185792. Throughput: 0: 40762.6. Samples: 161283560. Policy #0 lag: (min: 1.0, avg: 9.1, max: 23.0) [2024-06-21 15:06:08,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 15:06:08,858][15401] Updated weights for policy 0, policy_version 9840 (0.0026) [2024-06-21 15:06:12,260][15401] Updated weights for policy 0, policy_version 9850 (0.0044) [2024-06-21 15:06:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41265.8). Total num frames: 161431552. Throughput: 0: 41186.7. Samples: 161539740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 15:06:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 15:06:16,574][15401] Updated weights for policy 0, policy_version 9860 (0.0039) [2024-06-21 15:06:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 161595392. Throughput: 0: 41359.7. Samples: 161795060. Policy #0 lag: (min: 2.0, avg: 9.0, max: 22.0) [2024-06-21 15:06:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 15:06:20,037][15401] Updated weights for policy 0, policy_version 9870 (0.0042) [2024-06-21 15:06:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41234.3, 300 sec: 41154.4). Total num frames: 161824768. Throughput: 0: 41105.6. Samples: 161912020. Policy #0 lag: (min: 2.0, avg: 9.0, max: 22.0) [2024-06-21 15:06:23,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 15:06:24,326][15401] Updated weights for policy 0, policy_version 9880 (0.0039) [2024-06-21 15:06:28,102][15401] Updated weights for policy 0, policy_version 9890 (0.0045) [2024-06-21 15:06:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 162037760. Throughput: 0: 41444.1. Samples: 162166600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:06:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 15:06:32,675][15401] Updated weights for policy 0, policy_version 9900 (0.0039) [2024-06-21 15:06:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 162234368. Throughput: 0: 41420.8. Samples: 162413120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:06:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 15:06:36,281][15401] Updated weights for policy 0, policy_version 9910 (0.0026) [2024-06-21 15:06:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 162480128. Throughput: 0: 41365.8. Samples: 162536280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 15:06:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 15:06:40,453][15401] Updated weights for policy 0, policy_version 9920 (0.0030) [2024-06-21 15:06:41,537][15349] Signal inference workers to stop experience collection... (2350 times) [2024-06-21 15:06:41,538][15349] Signal inference workers to resume experience collection... (2350 times) [2024-06-21 15:06:41,559][15401] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-21 15:06:41,559][15401] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-21 15:06:43,396][15132] Fps is (10 sec: 39296.8, 60 sec: 40409.6, 300 sec: 40986.9). Total num frames: 162627584. Throughput: 0: 41320.3. Samples: 162781780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:06:43,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 15:06:44,242][15401] Updated weights for policy 0, policy_version 9930 (0.0032) [2024-06-21 15:06:48,385][15401] Updated weights for policy 0, policy_version 9940 (0.0033) [2024-06-21 15:06:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 162856960. Throughput: 0: 41369.3. Samples: 163030400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:06:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 15:06:52,181][15401] Updated weights for policy 0, policy_version 9950 (0.0043) [2024-06-21 15:06:53,390][15132] Fps is (10 sec: 47543.4, 60 sec: 42325.2, 300 sec: 41321.3). Total num frames: 163102720. Throughput: 0: 41617.6. Samples: 163156360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 15:06:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 15:06:56,213][15401] Updated weights for policy 0, policy_version 9960 (0.0057) [2024-06-21 15:06:58,390][15132] Fps is (10 sec: 40957.4, 60 sec: 40688.1, 300 sec: 41098.8). Total num frames: 163266560. Throughput: 0: 41375.3. Samples: 163401660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 15:06:58,391][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 15:07:00,222][15401] Updated weights for policy 0, policy_version 9970 (0.0037) [2024-06-21 15:07:03,390][15132] Fps is (10 sec: 37683.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 163479552. Throughput: 0: 41113.3. Samples: 163645160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:07:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 15:07:04,126][15401] Updated weights for policy 0, policy_version 9980 (0.0047) [2024-06-21 15:07:08,254][15401] Updated weights for policy 0, policy_version 9990 (0.0045) [2024-06-21 15:07:08,390][15132] Fps is (10 sec: 42601.2, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 163692544. Throughput: 0: 41320.1. Samples: 163771420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:07:08,391][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 15:07:11,927][15401] Updated weights for policy 0, policy_version 10000 (0.0047) [2024-06-21 15:07:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 163889152. Throughput: 0: 41123.1. Samples: 164017140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:07:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 15:07:16,366][15401] Updated weights for policy 0, policy_version 10010 (0.0029) [2024-06-21 15:07:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 164118528. Throughput: 0: 41045.9. Samples: 164260180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:07:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 15:07:20,114][15401] Updated weights for policy 0, policy_version 10020 (0.0052) [2024-06-21 15:07:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 164298752. Throughput: 0: 41137.8. Samples: 164387480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:07:23,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-21 15:07:24,284][15401] Updated weights for policy 0, policy_version 10030 (0.0037) [2024-06-21 15:07:27,922][15401] Updated weights for policy 0, policy_version 10040 (0.0026) [2024-06-21 15:07:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 164495360. Throughput: 0: 41021.3. Samples: 164627480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 15:07:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 15:07:32,067][15401] Updated weights for policy 0, policy_version 10050 (0.0039) [2024-06-21 15:07:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 164708352. Throughput: 0: 41016.0. Samples: 164876120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 15:07:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 15:07:35,580][15401] Updated weights for policy 0, policy_version 10060 (0.0037) [2024-06-21 15:07:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 40140.7, 300 sec: 40987.8). Total num frames: 164888576. Throughput: 0: 41007.6. Samples: 165001700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 15:07:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 15:07:39,856][15401] Updated weights for policy 0, policy_version 10070 (0.0048) [2024-06-21 15:07:43,170][15401] Updated weights for policy 0, policy_version 10080 (0.0035) [2024-06-21 15:07:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42056.7, 300 sec: 41265.5). Total num frames: 165150720. Throughput: 0: 40920.6. Samples: 165243060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 15:07:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 15:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010080_165150720.pth... [2024-06-21 15:07:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009475_155238400.pth [2024-06-21 15:07:45,120][15349] Signal inference workers to stop experience collection... (2400 times) [2024-06-21 15:07:45,120][15349] Signal inference workers to resume experience collection... (2400 times) [2024-06-21 15:07:45,162][15401] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-21 15:07:45,162][15401] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-21 15:07:47,741][15401] Updated weights for policy 0, policy_version 10090 (0.0034) [2024-06-21 15:07:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 165314560. Throughput: 0: 41215.5. Samples: 165499860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 15:07:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 15:07:51,899][15401] Updated weights for policy 0, policy_version 10100 (0.0033) [2024-06-21 15:07:53,390][15132] Fps is (10 sec: 37682.6, 60 sec: 40413.9, 300 sec: 41209.9). Total num frames: 165527552. Throughput: 0: 41002.1. Samples: 165616520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 15:07:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 15:07:55,956][15401] Updated weights for policy 0, policy_version 10110 (0.0037) [2024-06-21 15:07:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 41779.7, 300 sec: 41265.5). Total num frames: 165773312. Throughput: 0: 41051.1. Samples: 165864440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 15:07:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 15:07:59,714][15401] Updated weights for policy 0, policy_version 10120 (0.0044) [2024-06-21 15:08:03,395][15132] Fps is (10 sec: 40938.5, 60 sec: 40956.3, 300 sec: 40987.0). Total num frames: 165937152. Throughput: 0: 41317.6. Samples: 166119700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 15:08:03,395][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 15:08:04,096][15401] Updated weights for policy 0, policy_version 10130 (0.0038) [2024-06-21 15:08:07,470][15401] Updated weights for policy 0, policy_version 10140 (0.0027) [2024-06-21 15:08:08,395][15132] Fps is (10 sec: 37663.8, 60 sec: 40956.6, 300 sec: 41098.5). Total num frames: 166150144. Throughput: 0: 40981.6. Samples: 166231860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:08:08,400][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 15:08:11,918][15401] Updated weights for policy 0, policy_version 10150 (0.0032) [2024-06-21 15:08:13,390][15132] Fps is (10 sec: 45899.6, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 166395904. Throughput: 0: 41323.6. Samples: 166487040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:08:13,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-21 15:08:15,291][15401] Updated weights for policy 0, policy_version 10160 (0.0045) [2024-06-21 15:08:18,390][15132] Fps is (10 sec: 39341.4, 60 sec: 40413.8, 300 sec: 41154.7). Total num frames: 166543360. Throughput: 0: 41297.7. Samples: 166734520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:08:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 15:08:19,805][15401] Updated weights for policy 0, policy_version 10170 (0.0040) [2024-06-21 15:08:23,105][15401] Updated weights for policy 0, policy_version 10180 (0.0024) [2024-06-21 15:08:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 166789120. Throughput: 0: 41033.8. Samples: 166848220. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 15:08:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 15:08:27,642][15401] Updated weights for policy 0, policy_version 10190 (0.0038) [2024-06-21 15:08:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 167002112. Throughput: 0: 41486.1. Samples: 167109940. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 15:08:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 15:08:31,015][15401] Updated weights for policy 0, policy_version 10200 (0.0036) [2024-06-21 15:08:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 167165952. Throughput: 0: 41285.3. Samples: 167357700. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 15:08:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 15:08:35,526][15401] Updated weights for policy 0, policy_version 10210 (0.0044) [2024-06-21 15:08:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41265.5). Total num frames: 167411712. Throughput: 0: 41378.7. Samples: 167478560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-21 15:08:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 15:08:38,761][15401] Updated weights for policy 0, policy_version 10220 (0.0033) [2024-06-21 15:08:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 167591936. Throughput: 0: 41335.4. Samples: 167724540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-21 15:08:43,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-21 15:08:43,694][15401] Updated weights for policy 0, policy_version 10230 (0.0040) [2024-06-21 15:08:47,330][15401] Updated weights for policy 0, policy_version 10240 (0.0031) [2024-06-21 15:08:48,389][15132] Fps is (10 sec: 36045.2, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 167772160. Throughput: 0: 41116.9. Samples: 167969740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 15:08:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 15:08:51,661][15401] Updated weights for policy 0, policy_version 10250 (0.0034) [2024-06-21 15:08:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 168017920. Throughput: 0: 41304.5. Samples: 168090360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 15:08:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 15:08:55,530][15401] Updated weights for policy 0, policy_version 10260 (0.0035) [2024-06-21 15:08:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 40413.8, 300 sec: 41098.8). Total num frames: 168198144. Throughput: 0: 41349.9. Samples: 168347780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:08:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 15:08:59,443][15401] Updated weights for policy 0, policy_version 10270 (0.0043) [2024-06-21 15:09:00,113][15349] Signal inference workers to stop experience collection... (2450 times) [2024-06-21 15:09:00,120][15349] Signal inference workers to resume experience collection... (2450 times) [2024-06-21 15:09:00,138][15401] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-21 15:09:00,170][15401] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-21 15:09:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41236.7, 300 sec: 41154.4). Total num frames: 168411136. Throughput: 0: 41145.2. Samples: 168586060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:09:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 15:09:03,926][15401] Updated weights for policy 0, policy_version 10280 (0.0037) [2024-06-21 15:09:07,303][15401] Updated weights for policy 0, policy_version 10290 (0.0036) [2024-06-21 15:09:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41509.6, 300 sec: 41209.9). Total num frames: 168640512. Throughput: 0: 41489.4. Samples: 168715240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:09:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 15:09:11,671][15401] Updated weights for policy 0, policy_version 10300 (0.0036) [2024-06-21 15:09:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 168837120. Throughput: 0: 41142.3. Samples: 168961340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:09:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 15:09:15,141][15401] Updated weights for policy 0, policy_version 10310 (0.0054) [2024-06-21 15:09:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 169050112. Throughput: 0: 40899.3. Samples: 169198160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:09:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 15:09:19,545][15401] Updated weights for policy 0, policy_version 10320 (0.0051) [2024-06-21 15:09:23,169][15401] Updated weights for policy 0, policy_version 10330 (0.0040) [2024-06-21 15:09:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 169246720. Throughput: 0: 41202.7. Samples: 169332680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:09:23,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 15:09:27,359][15401] Updated weights for policy 0, policy_version 10340 (0.0040) [2024-06-21 15:09:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 169443328. Throughput: 0: 41173.9. Samples: 169577360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:09:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 15:09:31,256][15401] Updated weights for policy 0, policy_version 10350 (0.0038) [2024-06-21 15:09:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 169672704. Throughput: 0: 40948.8. Samples: 169812440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:09:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 15:09:35,140][15401] Updated weights for policy 0, policy_version 10360 (0.0034) [2024-06-21 15:09:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40414.0, 300 sec: 41043.3). Total num frames: 169836544. Throughput: 0: 41145.5. Samples: 169941900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:09:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 15:09:39,606][15401] Updated weights for policy 0, policy_version 10370 (0.0038) [2024-06-21 15:09:43,207][15401] Updated weights for policy 0, policy_version 10380 (0.0036) [2024-06-21 15:09:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 41504.5, 300 sec: 41321.0). Total num frames: 170082304. Throughput: 0: 40795.6. Samples: 170183680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:09:43,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 15:09:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010381_170082304.pth... [2024-06-21 15:09:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000009778_160202752.pth [2024-06-21 15:09:47,547][15401] Updated weights for policy 0, policy_version 10390 (0.0024) [2024-06-21 15:09:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 170262528. Throughput: 0: 41084.1. Samples: 170434840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:09:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 15:09:51,088][15401] Updated weights for policy 0, policy_version 10400 (0.0038) [2024-06-21 15:09:53,390][15132] Fps is (10 sec: 37692.1, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 170459136. Throughput: 0: 40822.6. Samples: 170552260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:09:53,396][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 15:09:55,558][15401] Updated weights for policy 0, policy_version 10410 (0.0031) [2024-06-21 15:09:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 170688512. Throughput: 0: 40774.2. Samples: 170796180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 15:09:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 15:09:58,920][15401] Updated weights for policy 0, policy_version 10420 (0.0036) [2024-06-21 15:10:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 40687.1, 300 sec: 40987.9). Total num frames: 170852352. Throughput: 0: 41141.7. Samples: 171049540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 15:10:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 15:10:03,784][15401] Updated weights for policy 0, policy_version 10430 (0.0028) [2024-06-21 15:10:04,170][15349] Signal inference workers to stop experience collection... (2500 times) [2024-06-21 15:10:04,170][15349] Signal inference workers to resume experience collection... (2500 times) [2024-06-21 15:10:04,198][15401] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-21 15:10:04,198][15401] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-21 15:10:06,754][15401] Updated weights for policy 0, policy_version 10440 (0.0039) [2024-06-21 15:10:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 171081728. Throughput: 0: 40729.0. Samples: 171165480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:10:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 15:10:11,767][15401] Updated weights for policy 0, policy_version 10450 (0.0035) [2024-06-21 15:10:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 171311104. Throughput: 0: 40791.9. Samples: 171413000. Policy #0 lag: (min: 2.0, avg: 9.3, max: 22.0) [2024-06-21 15:10:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 15:10:15,140][15401] Updated weights for policy 0, policy_version 10460 (0.0033) [2024-06-21 15:10:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 41099.1). Total num frames: 171474944. Throughput: 0: 41101.1. Samples: 171661980. Policy #0 lag: (min: 2.0, avg: 9.3, max: 22.0) [2024-06-21 15:10:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 15:10:19,616][15401] Updated weights for policy 0, policy_version 10470 (0.0040) [2024-06-21 15:10:23,112][15401] Updated weights for policy 0, policy_version 10480 (0.0035) [2024-06-21 15:10:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 171704320. Throughput: 0: 40782.6. Samples: 171777120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-21 15:10:23,391][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 15:10:27,499][15401] Updated weights for policy 0, policy_version 10490 (0.0047) [2024-06-21 15:10:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 171917312. Throughput: 0: 40904.0. Samples: 172024260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-21 15:10:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 15:10:31,011][15401] Updated weights for policy 0, policy_version 10500 (0.0040) [2024-06-21 15:10:33,396][15132] Fps is (10 sec: 39296.6, 60 sec: 40409.6, 300 sec: 41097.9). Total num frames: 172097536. Throughput: 0: 40826.2. Samples: 172272280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:10:33,397][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 15:10:35,454][15401] Updated weights for policy 0, policy_version 10510 (0.0040) [2024-06-21 15:10:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 172326912. Throughput: 0: 40959.7. Samples: 172395440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 15:10:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 15:10:38,842][15401] Updated weights for policy 0, policy_version 10520 (0.0036) [2024-06-21 15:10:43,390][15132] Fps is (10 sec: 39346.8, 60 sec: 40142.4, 300 sec: 40987.8). Total num frames: 172490752. Throughput: 0: 41108.0. Samples: 172646040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 15:10:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 15:10:43,572][15401] Updated weights for policy 0, policy_version 10530 (0.0039) [2024-06-21 15:10:46,595][15401] Updated weights for policy 0, policy_version 10540 (0.0033) [2024-06-21 15:10:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 172720128. Throughput: 0: 40868.4. Samples: 172888620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:10:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 15:10:51,468][15401] Updated weights for policy 0, policy_version 10550 (0.0033) [2024-06-21 15:10:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 41506.2, 300 sec: 41099.2). Total num frames: 172949504. Throughput: 0: 41186.7. Samples: 173018880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:10:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 15:10:54,902][15401] Updated weights for policy 0, policy_version 10560 (0.0035) [2024-06-21 15:10:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 41043.3). Total num frames: 173113344. Throughput: 0: 41059.2. Samples: 173260660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:10:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 15:10:59,410][15401] Updated weights for policy 0, policy_version 10570 (0.0033) [2024-06-21 15:11:02,961][15401] Updated weights for policy 0, policy_version 10580 (0.0039) [2024-06-21 15:11:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 173342720. Throughput: 0: 40900.9. Samples: 173502520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:11:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 15:11:07,506][15401] Updated weights for policy 0, policy_version 10590 (0.0037) [2024-06-21 15:11:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 173555712. Throughput: 0: 41311.6. Samples: 173636140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:11:08,390][15132] Avg episode reward: [(0, '0.086')] [2024-06-21 15:11:10,997][15401] Updated weights for policy 0, policy_version 10600 (0.0031) [2024-06-21 15:11:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40687.1, 300 sec: 41209.9). Total num frames: 173752320. Throughput: 0: 41242.2. Samples: 173880160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 15:11:13,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-21 15:11:15,476][15401] Updated weights for policy 0, policy_version 10610 (0.0059) [2024-06-21 15:11:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 173965312. Throughput: 0: 41045.8. Samples: 174119080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 15:11:18,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-21 15:11:18,754][15401] Updated weights for policy 0, policy_version 10620 (0.0038) [2024-06-21 15:11:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 40685.3, 300 sec: 41043.0). Total num frames: 174145536. Throughput: 0: 41107.5. Samples: 174245380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 15:11:23,392][15132] Avg episode reward: [(0, '0.266')] [2024-06-21 15:11:23,451][15401] Updated weights for policy 0, policy_version 10630 (0.0028) [2024-06-21 15:11:26,505][15401] Updated weights for policy 0, policy_version 10640 (0.0042) [2024-06-21 15:11:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 174407680. Throughput: 0: 41098.7. Samples: 174495480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 15:11:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 15:11:31,141][15349] Signal inference workers to stop experience collection... (2550 times) [2024-06-21 15:11:31,148][15349] Signal inference workers to resume experience collection... (2550 times) [2024-06-21 15:11:31,192][15401] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-21 15:11:31,192][15401] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-21 15:11:31,275][15401] Updated weights for policy 0, policy_version 10650 (0.0038) [2024-06-21 15:11:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 41510.6, 300 sec: 41043.3). Total num frames: 174587904. Throughput: 0: 41381.4. Samples: 174750780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 15:11:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 15:11:34,463][15401] Updated weights for policy 0, policy_version 10660 (0.0050) [2024-06-21 15:11:38,390][15132] Fps is (10 sec: 36044.5, 60 sec: 40686.9, 300 sec: 41155.3). Total num frames: 174768128. Throughput: 0: 41045.7. Samples: 174865940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-21 15:11:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 15:11:39,174][15401] Updated weights for policy 0, policy_version 10670 (0.0036) [2024-06-21 15:11:42,345][15401] Updated weights for policy 0, policy_version 10680 (0.0041) [2024-06-21 15:11:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 174997504. Throughput: 0: 41315.1. Samples: 175119840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-21 15:11:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 15:11:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010681_174997504.pth... [2024-06-21 15:11:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010080_165150720.pth [2024-06-21 15:11:47,207][15401] Updated weights for policy 0, policy_version 10690 (0.0051) [2024-06-21 15:11:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 175194112. Throughput: 0: 41394.5. Samples: 175365280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 15:11:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 15:11:50,502][15401] Updated weights for policy 0, policy_version 10700 (0.0038) [2024-06-21 15:11:53,391][15132] Fps is (10 sec: 39317.6, 60 sec: 40686.2, 300 sec: 41098.8). Total num frames: 175390720. Throughput: 0: 40981.7. Samples: 175480360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 15:11:53,391][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 15:11:54,940][15401] Updated weights for policy 0, policy_version 10710 (0.0029) [2024-06-21 15:11:58,352][15401] Updated weights for policy 0, policy_version 10720 (0.0035) [2024-06-21 15:11:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 175636480. Throughput: 0: 41153.3. Samples: 175732060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:11:58,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-21 15:12:02,733][15401] Updated weights for policy 0, policy_version 10730 (0.0050) [2024-06-21 15:12:03,389][15132] Fps is (10 sec: 40964.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 175800320. Throughput: 0: 41421.4. Samples: 175983040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:12:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 15:12:06,488][15401] Updated weights for policy 0, policy_version 10740 (0.0034) [2024-06-21 15:12:08,392][15132] Fps is (10 sec: 39312.0, 60 sec: 41231.4, 300 sec: 41154.0). Total num frames: 176029696. Throughput: 0: 41270.2. Samples: 176102540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:12:08,392][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 15:12:10,614][15401] Updated weights for policy 0, policy_version 10750 (0.0046) [2024-06-21 15:12:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41232.9, 300 sec: 41043.3). Total num frames: 176226304. Throughput: 0: 41426.9. Samples: 176359700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 15:12:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 15:12:14,285][15401] Updated weights for policy 0, policy_version 10760 (0.0034) [2024-06-21 15:12:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 176439296. Throughput: 0: 41197.8. Samples: 176604680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 15:12:18,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 15:12:18,471][15401] Updated weights for policy 0, policy_version 10770 (0.0050) [2024-06-21 15:12:22,360][15401] Updated weights for policy 0, policy_version 10780 (0.0035) [2024-06-21 15:12:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42054.0, 300 sec: 41265.5). Total num frames: 176668672. Throughput: 0: 41443.6. Samples: 176730900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 15:12:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 15:12:26,275][15401] Updated weights for policy 0, policy_version 10790 (0.0036) [2024-06-21 15:12:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 41098.9). Total num frames: 176832512. Throughput: 0: 41296.5. Samples: 176978180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 15:12:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 15:12:30,229][15401] Updated weights for policy 0, policy_version 10800 (0.0048) [2024-06-21 15:12:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 177045504. Throughput: 0: 41205.5. Samples: 177219520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:12:33,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 15:12:34,854][15401] Updated weights for policy 0, policy_version 10810 (0.0058) [2024-06-21 15:12:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 177258496. Throughput: 0: 41415.6. Samples: 177344020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:12:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 15:12:38,495][15401] Updated weights for policy 0, policy_version 10820 (0.0032) [2024-06-21 15:12:42,652][15401] Updated weights for policy 0, policy_version 10830 (0.0037) [2024-06-21 15:12:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 177455104. Throughput: 0: 41311.6. Samples: 177591080. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-06-21 15:12:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 15:12:46,038][15349] Signal inference workers to stop experience collection... (2600 times) [2024-06-21 15:12:46,077][15401] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-21 15:12:46,100][15349] Signal inference workers to resume experience collection... (2600 times) [2024-06-21 15:12:46,101][15401] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-21 15:12:46,246][15401] Updated weights for policy 0, policy_version 10840 (0.0045) [2024-06-21 15:12:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 177668096. Throughput: 0: 41118.3. Samples: 177833360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-21 15:12:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 15:12:50,602][15401] Updated weights for policy 0, policy_version 10850 (0.0043) [2024-06-21 15:12:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41232.1, 300 sec: 40987.4). Total num frames: 177864704. Throughput: 0: 41330.2. Samples: 177962400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-21 15:12:53,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 15:12:54,009][15401] Updated weights for policy 0, policy_version 10860 (0.0031) [2024-06-21 15:12:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 41155.1). Total num frames: 178077696. Throughput: 0: 40952.6. Samples: 178202560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-21 15:12:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 15:12:58,992][15401] Updated weights for policy 0, policy_version 10870 (0.0049) [2024-06-21 15:13:02,063][15401] Updated weights for policy 0, policy_version 10880 (0.0044) [2024-06-21 15:13:03,393][15132] Fps is (10 sec: 45869.1, 60 sec: 42049.7, 300 sec: 41265.7). Total num frames: 178323456. Throughput: 0: 40993.5. Samples: 178449540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-21 15:13:03,394][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 15:13:06,796][15401] Updated weights for policy 0, policy_version 10890 (0.0031) [2024-06-21 15:13:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40688.6, 300 sec: 40932.2). Total num frames: 178470912. Throughput: 0: 41116.5. Samples: 178581140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 15:13:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 15:13:10,119][15401] Updated weights for policy 0, policy_version 10900 (0.0042) [2024-06-21 15:13:13,390][15132] Fps is (10 sec: 37697.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 178700288. Throughput: 0: 41001.7. Samples: 178823260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 15:13:13,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-21 15:13:14,774][15401] Updated weights for policy 0, policy_version 10910 (0.0033) [2024-06-21 15:13:18,018][15401] Updated weights for policy 0, policy_version 10920 (0.0036) [2024-06-21 15:13:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 178929664. Throughput: 0: 41136.4. Samples: 179070660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 15:13:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 15:13:22,776][15401] Updated weights for policy 0, policy_version 10930 (0.0035) [2024-06-21 15:13:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 179093504. Throughput: 0: 41180.5. Samples: 179197140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:13:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 15:13:26,081][15401] Updated weights for policy 0, policy_version 10940 (0.0051) [2024-06-21 15:13:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 179339264. Throughput: 0: 41071.1. Samples: 179439280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:13:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 15:13:30,410][15401] Updated weights for policy 0, policy_version 10950 (0.0038) [2024-06-21 15:13:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 179503104. Throughput: 0: 41478.7. Samples: 179699900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:13:33,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 15:13:34,043][15401] Updated weights for policy 0, policy_version 10960 (0.0034) [2024-06-21 15:13:38,045][15401] Updated weights for policy 0, policy_version 10970 (0.0028) [2024-06-21 15:13:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 179732480. Throughput: 0: 41187.0. Samples: 179815720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:13:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-21 15:13:41,978][15401] Updated weights for policy 0, policy_version 10980 (0.0028) [2024-06-21 15:13:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 179961856. Throughput: 0: 41377.3. Samples: 180064540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 15:13:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 15:13:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010985_179978240.pth... [2024-06-21 15:13:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010381_170082304.pth [2024-06-21 15:13:45,802][15401] Updated weights for policy 0, policy_version 10990 (0.0034) [2024-06-21 15:13:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 180125696. Throughput: 0: 41524.8. Samples: 180318000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-21 15:13:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 15:13:49,896][15401] Updated weights for policy 0, policy_version 11000 (0.0045) [2024-06-21 15:13:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41234.8, 300 sec: 41154.4). Total num frames: 180338688. Throughput: 0: 41159.1. Samples: 180433300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-21 15:13:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 15:13:54,250][15401] Updated weights for policy 0, policy_version 11010 (0.0040) [2024-06-21 15:13:57,780][15401] Updated weights for policy 0, policy_version 11020 (0.0032) [2024-06-21 15:13:58,393][15132] Fps is (10 sec: 44222.2, 60 sec: 41503.9, 300 sec: 41209.5). Total num frames: 180568064. Throughput: 0: 41293.9. Samples: 180681620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:13:58,393][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 15:14:02,632][15401] Updated weights for policy 0, policy_version 11030 (0.0045) [2024-06-21 15:14:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 40689.5, 300 sec: 41098.8). Total num frames: 180764672. Throughput: 0: 41456.8. Samples: 180936220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:14:03,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 15:14:05,531][15401] Updated weights for policy 0, policy_version 11040 (0.0033) [2024-06-21 15:14:08,390][15132] Fps is (10 sec: 42611.8, 60 sec: 42052.2, 300 sec: 41209.9). Total num frames: 180994048. Throughput: 0: 41257.6. Samples: 181053740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-21 15:14:08,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 15:14:10,458][15401] Updated weights for policy 0, policy_version 11050 (0.0044) [2024-06-21 15:14:11,689][15349] Signal inference workers to stop experience collection... (2650 times) [2024-06-21 15:14:11,740][15401] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-21 15:14:11,746][15349] Signal inference workers to resume experience collection... (2650 times) [2024-06-21 15:14:11,751][15401] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-21 15:14:13,374][15401] Updated weights for policy 0, policy_version 11060 (0.0030) [2024-06-21 15:14:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 181207040. Throughput: 0: 41472.8. Samples: 181305560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-21 15:14:13,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 15:14:18,110][15401] Updated weights for policy 0, policy_version 11070 (0.0048) [2024-06-21 15:14:18,389][15132] Fps is (10 sec: 37683.9, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 181370880. Throughput: 0: 41243.1. Samples: 181555840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 15:14:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 15:14:21,483][15401] Updated weights for policy 0, policy_version 11080 (0.0030) [2024-06-21 15:14:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41265.4). Total num frames: 181616640. Throughput: 0: 41383.6. Samples: 181677980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:14:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 15:14:25,717][15401] Updated weights for policy 0, policy_version 11090 (0.0036) [2024-06-21 15:14:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 181780480. Throughput: 0: 41389.0. Samples: 181927040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:14:28,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 15:14:29,267][15401] Updated weights for policy 0, policy_version 11100 (0.0038) [2024-06-21 15:14:33,363][15401] Updated weights for policy 0, policy_version 11110 (0.0043) [2024-06-21 15:14:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 41321.0). Total num frames: 182026240. Throughput: 0: 41248.3. Samples: 182174180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 15:14:33,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-21 15:14:37,105][15401] Updated weights for policy 0, policy_version 11120 (0.0029) [2024-06-21 15:14:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 41779.2, 300 sec: 41210.2). Total num frames: 182239232. Throughput: 0: 41619.4. Samples: 182306180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 15:14:38,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 15:14:41,412][15401] Updated weights for policy 0, policy_version 11130 (0.0048) [2024-06-21 15:14:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 182419456. Throughput: 0: 41508.2. Samples: 182549360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 15:14:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 15:14:45,174][15401] Updated weights for policy 0, policy_version 11140 (0.0043) [2024-06-21 15:14:48,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 182648832. Throughput: 0: 41383.1. Samples: 182798460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 15:14:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 15:14:49,433][15401] Updated weights for policy 0, policy_version 11150 (0.0044) [2024-06-21 15:14:53,047][15401] Updated weights for policy 0, policy_version 11160 (0.0032) [2024-06-21 15:14:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 182845440. Throughput: 0: 41596.5. Samples: 182925580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:14:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 15:14:57,339][15401] Updated weights for policy 0, policy_version 11170 (0.0040) [2024-06-21 15:14:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41235.3, 300 sec: 41321.0). Total num frames: 183042048. Throughput: 0: 41468.5. Samples: 183171640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 15:14:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 15:15:01,034][15401] Updated weights for policy 0, policy_version 11180 (0.0037) [2024-06-21 15:15:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 183255040. Throughput: 0: 41376.7. Samples: 183417800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 15:15:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 15:15:05,068][15401] Updated weights for policy 0, policy_version 11190 (0.0035) [2024-06-21 15:15:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 41210.0). Total num frames: 183468032. Throughput: 0: 41451.7. Samples: 183543300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:15:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 15:15:08,992][15401] Updated weights for policy 0, policy_version 11200 (0.0039) [2024-06-21 15:15:12,965][15401] Updated weights for policy 0, policy_version 11210 (0.0029) [2024-06-21 15:15:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 183664640. Throughput: 0: 41510.7. Samples: 183795020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:15:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 15:15:17,085][15401] Updated weights for policy 0, policy_version 11220 (0.0045) [2024-06-21 15:15:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41777.5, 300 sec: 41265.1). Total num frames: 183877632. Throughput: 0: 41396.1. Samples: 184037100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 15:15:18,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 15:15:19,037][15349] Signal inference workers to stop experience collection... (2700 times) [2024-06-21 15:15:19,037][15349] Signal inference workers to resume experience collection... (2700 times) [2024-06-21 15:15:19,077][15401] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-21 15:15:19,077][15401] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-21 15:15:20,819][15401] Updated weights for policy 0, policy_version 11230 (0.0033) [2024-06-21 15:15:23,390][15132] Fps is (10 sec: 40958.0, 60 sec: 40959.8, 300 sec: 41209.9). Total num frames: 184074240. Throughput: 0: 41192.6. Samples: 184159860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 15:15:23,391][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 15:15:24,930][15401] Updated weights for policy 0, policy_version 11240 (0.0037) [2024-06-21 15:15:28,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41779.2, 300 sec: 41321.9). Total num frames: 184287232. Throughput: 0: 41337.0. Samples: 184409520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:15:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 15:15:29,001][15401] Updated weights for policy 0, policy_version 11250 (0.0046) [2024-06-21 15:15:32,879][15401] Updated weights for policy 0, policy_version 11260 (0.0025) [2024-06-21 15:15:33,390][15132] Fps is (10 sec: 45877.2, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 184532992. Throughput: 0: 41347.1. Samples: 184659080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 15:15:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 15:15:36,611][15401] Updated weights for policy 0, policy_version 11270 (0.0031) [2024-06-21 15:15:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 184680448. Throughput: 0: 41309.9. Samples: 184784520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 15:15:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 15:15:40,794][15401] Updated weights for policy 0, policy_version 11280 (0.0030) [2024-06-21 15:15:43,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 184909824. Throughput: 0: 41474.3. Samples: 185037980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 15:15:43,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-21 15:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011287_184926208.pth... [2024-06-21 15:15:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010681_174997504.pth [2024-06-21 15:15:44,296][15401] Updated weights for policy 0, policy_version 11290 (0.0046) [2024-06-21 15:15:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 185106432. Throughput: 0: 41544.2. Samples: 185287280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 15:15:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 15:15:48,614][15401] Updated weights for policy 0, policy_version 11300 (0.0040) [2024-06-21 15:15:52,283][15401] Updated weights for policy 0, policy_version 11310 (0.0042) [2024-06-21 15:15:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41231.5, 300 sec: 41376.2). Total num frames: 185319424. Throughput: 0: 41442.6. Samples: 185408320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 15:15:53,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 15:15:56,617][15401] Updated weights for policy 0, policy_version 11320 (0.0044) [2024-06-21 15:15:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 185548800. Throughput: 0: 41242.6. Samples: 185650940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 15:15:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 15:16:00,258][15401] Updated weights for policy 0, policy_version 11330 (0.0052) [2024-06-21 15:16:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 185729024. Throughput: 0: 41607.5. Samples: 185909340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 15:16:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-21 15:16:04,465][15401] Updated weights for policy 0, policy_version 11340 (0.0037) [2024-06-21 15:16:08,045][15401] Updated weights for policy 0, policy_version 11350 (0.0038) [2024-06-21 15:16:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 185958400. Throughput: 0: 41470.2. Samples: 186026000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 15:16:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 15:16:12,721][15401] Updated weights for policy 0, policy_version 11360 (0.0035) [2024-06-21 15:16:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 186155008. Throughput: 0: 41512.8. Samples: 186277600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:16:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:16:16,184][15401] Updated weights for policy 0, policy_version 11370 (0.0032) [2024-06-21 15:16:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41234.8, 300 sec: 41376.9). Total num frames: 186351616. Throughput: 0: 41367.7. Samples: 186520620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:16:18,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-21 15:16:20,617][15401] Updated weights for policy 0, policy_version 11380 (0.0027) [2024-06-21 15:16:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.5, 300 sec: 41265.5). Total num frames: 186580992. Throughput: 0: 41349.8. Samples: 186645260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:16:23,392][15132] Avg episode reward: [(0, '0.225')] [2024-06-21 15:16:24,059][15401] Updated weights for policy 0, policy_version 11390 (0.0043) [2024-06-21 15:16:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 186761216. Throughput: 0: 41344.1. Samples: 186898460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 15:16:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 15:16:28,470][15401] Updated weights for policy 0, policy_version 11400 (0.0040) [2024-06-21 15:16:32,064][15401] Updated weights for policy 0, policy_version 11410 (0.0046) [2024-06-21 15:16:33,389][15132] Fps is (10 sec: 37683.3, 60 sec: 40413.9, 300 sec: 41321.0). Total num frames: 186957824. Throughput: 0: 41328.8. Samples: 187147080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 15:16:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 15:16:36,613][15401] Updated weights for policy 0, policy_version 11420 (0.0042) [2024-06-21 15:16:36,862][15349] Signal inference workers to stop experience collection... (2750 times) [2024-06-21 15:16:36,888][15401] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-21 15:16:36,926][15349] Signal inference workers to resume experience collection... (2750 times) [2024-06-21 15:16:36,926][15401] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-21 15:16:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 187187200. Throughput: 0: 41408.0. Samples: 187271580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 15:16:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-21 15:16:39,958][15401] Updated weights for policy 0, policy_version 11430 (0.0050) [2024-06-21 15:16:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 41231.4, 300 sec: 41320.7). Total num frames: 187383808. Throughput: 0: 41533.0. Samples: 187520020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 15:16:43,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 15:16:44,337][15401] Updated weights for policy 0, policy_version 11440 (0.0035) [2024-06-21 15:16:47,750][15401] Updated weights for policy 0, policy_version 11450 (0.0037) [2024-06-21 15:16:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41376.7). Total num frames: 187596800. Throughput: 0: 41243.6. Samples: 187765300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-21 15:16:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 15:16:52,238][15401] Updated weights for policy 0, policy_version 11460 (0.0036) [2024-06-21 15:16:53,390][15132] Fps is (10 sec: 44247.0, 60 sec: 41780.8, 300 sec: 41321.0). Total num frames: 187826176. Throughput: 0: 41570.1. Samples: 187896660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-21 15:16:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 15:16:56,070][15401] Updated weights for policy 0, policy_version 11470 (0.0044) [2024-06-21 15:16:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 188006400. Throughput: 0: 41400.8. Samples: 188140640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-21 15:16:58,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 15:17:00,122][15401] Updated weights for policy 0, policy_version 11480 (0.0034) [2024-06-21 15:17:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 188219392. Throughput: 0: 41541.3. Samples: 188389980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-21 15:17:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 15:17:04,031][15401] Updated weights for policy 0, policy_version 11490 (0.0032) [2024-06-21 15:17:08,123][15401] Updated weights for policy 0, policy_version 11500 (0.0030) [2024-06-21 15:17:08,392][15132] Fps is (10 sec: 40951.0, 60 sec: 40958.3, 300 sec: 41320.7). Total num frames: 188416000. Throughput: 0: 41593.4. Samples: 188517060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:17:08,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 15:17:11,908][15401] Updated weights for policy 0, policy_version 11510 (0.0026) [2024-06-21 15:17:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 188661760. Throughput: 0: 41500.4. Samples: 188765980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-21 15:17:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 15:17:16,036][15401] Updated weights for policy 0, policy_version 11520 (0.0033) [2024-06-21 15:17:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 188858368. Throughput: 0: 41425.3. Samples: 189011220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-21 15:17:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 15:17:19,866][15401] Updated weights for policy 0, policy_version 11530 (0.0043) [2024-06-21 15:17:23,390][15132] Fps is (10 sec: 37682.4, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 189038592. Throughput: 0: 41361.2. Samples: 189132840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 15:17:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 15:17:23,982][15401] Updated weights for policy 0, policy_version 11540 (0.0033) [2024-06-21 15:17:27,766][15401] Updated weights for policy 0, policy_version 11550 (0.0032) [2024-06-21 15:17:28,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41777.5, 300 sec: 41431.7). Total num frames: 189267968. Throughput: 0: 41460.5. Samples: 189385740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 15:17:28,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 15:17:31,737][15401] Updated weights for policy 0, policy_version 11560 (0.0043) [2024-06-21 15:17:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 189480960. Throughput: 0: 41491.5. Samples: 189632420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 15:17:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 15:17:35,704][15401] Updated weights for policy 0, policy_version 11570 (0.0035) [2024-06-21 15:17:38,390][15132] Fps is (10 sec: 39330.8, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 189661184. Throughput: 0: 41282.3. Samples: 189754360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 15:17:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 15:17:39,601][15401] Updated weights for policy 0, policy_version 11580 (0.0035) [2024-06-21 15:17:43,392][15132] Fps is (10 sec: 37674.3, 60 sec: 41233.0, 300 sec: 41320.7). Total num frames: 189857792. Throughput: 0: 41229.9. Samples: 189996080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 15:17:43,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 15:17:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011588_189857792.pth... [2024-06-21 15:17:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000010985_179978240.pth [2024-06-21 15:17:43,812][15401] Updated weights for policy 0, policy_version 11590 (0.0042) [2024-06-21 15:17:47,616][15401] Updated weights for policy 0, policy_version 11600 (0.0035) [2024-06-21 15:17:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41376.9). Total num frames: 190070784. Throughput: 0: 41307.9. Samples: 190248840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 15:17:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 15:17:51,519][15401] Updated weights for policy 0, policy_version 11610 (0.0034) [2024-06-21 15:17:53,392][15132] Fps is (10 sec: 44237.3, 60 sec: 41231.6, 300 sec: 41431.8). Total num frames: 190300160. Throughput: 0: 41244.9. Samples: 190373080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:17:53,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 15:17:55,473][15401] Updated weights for policy 0, policy_version 11620 (0.0033) [2024-06-21 15:17:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 41504.6, 300 sec: 41265.6). Total num frames: 190496768. Throughput: 0: 41230.6. Samples: 190621460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:17:58,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 15:17:59,778][15401] Updated weights for policy 0, policy_version 11630 (0.0035) [2024-06-21 15:17:59,784][15349] Signal inference workers to stop experience collection... (2800 times) [2024-06-21 15:17:59,784][15349] Signal inference workers to resume experience collection... (2800 times) [2024-06-21 15:17:59,801][15401] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-21 15:17:59,801][15401] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-21 15:18:03,389][15132] Fps is (10 sec: 39330.9, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 190693376. Throughput: 0: 41268.9. Samples: 190868320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 15:18:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 15:18:03,507][15401] Updated weights for policy 0, policy_version 11640 (0.0032) [2024-06-21 15:18:07,614][15401] Updated weights for policy 0, policy_version 11650 (0.0050) [2024-06-21 15:18:08,390][15132] Fps is (10 sec: 42605.7, 60 sec: 41780.4, 300 sec: 41432.0). Total num frames: 190922752. Throughput: 0: 41176.4. Samples: 190985800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-21 15:18:08,391][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 15:18:11,282][15401] Updated weights for policy 0, policy_version 11660 (0.0028) [2024-06-21 15:18:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 191119360. Throughput: 0: 41120.9. Samples: 191236080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-21 15:18:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 15:18:15,443][15401] Updated weights for policy 0, policy_version 11670 (0.0039) [2024-06-21 15:18:18,392][15132] Fps is (10 sec: 37676.8, 60 sec: 40685.3, 300 sec: 41376.2). Total num frames: 191299584. Throughput: 0: 41095.6. Samples: 191481820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:18:18,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 15:18:19,717][15401] Updated weights for policy 0, policy_version 11680 (0.0029) [2024-06-21 15:18:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 191496192. Throughput: 0: 41127.6. Samples: 191605100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:18:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 15:18:23,831][15401] Updated weights for policy 0, policy_version 11690 (0.0033) [2024-06-21 15:18:27,495][15401] Updated weights for policy 0, policy_version 11700 (0.0039) [2024-06-21 15:18:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 40961.6, 300 sec: 41432.1). Total num frames: 191725568. Throughput: 0: 41242.1. Samples: 191851880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 15:18:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:18:31,613][15401] Updated weights for policy 0, policy_version 11710 (0.0028) [2024-06-21 15:18:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 191938560. Throughput: 0: 41021.8. Samples: 192094820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 15:18:33,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 15:18:35,360][15401] Updated weights for policy 0, policy_version 11720 (0.0037) [2024-06-21 15:18:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 192151552. Throughput: 0: 41130.1. Samples: 192223840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:18:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 15:18:39,513][15401] Updated weights for policy 0, policy_version 11730 (0.0042) [2024-06-21 15:18:43,348][15401] Updated weights for policy 0, policy_version 11740 (0.0046) [2024-06-21 15:18:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41507.8, 300 sec: 41432.1). Total num frames: 192348160. Throughput: 0: 41107.9. Samples: 192471220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:18:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 15:18:47,163][15401] Updated weights for policy 0, policy_version 11750 (0.0052) [2024-06-21 15:18:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 192561152. Throughput: 0: 41166.6. Samples: 192720820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:18:48,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-21 15:18:51,153][15401] Updated weights for policy 0, policy_version 11760 (0.0036) [2024-06-21 15:18:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41234.6, 300 sec: 41377.0). Total num frames: 192774144. Throughput: 0: 41184.1. Samples: 192839060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:18:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-21 15:18:55,295][15401] Updated weights for policy 0, policy_version 11770 (0.0032) [2024-06-21 15:18:58,389][15132] Fps is (10 sec: 37683.7, 60 sec: 40688.6, 300 sec: 41265.5). Total num frames: 192937984. Throughput: 0: 41157.8. Samples: 193088180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 15:18:58,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 15:18:59,097][15401] Updated weights for policy 0, policy_version 11780 (0.0036) [2024-06-21 15:19:03,137][15401] Updated weights for policy 0, policy_version 11790 (0.0043) [2024-06-21 15:19:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 193167360. Throughput: 0: 41203.6. Samples: 193335880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 15:19:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 15:19:07,135][15401] Updated weights for policy 0, policy_version 11800 (0.0047) [2024-06-21 15:19:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 40960.4, 300 sec: 41265.5). Total num frames: 193380352. Throughput: 0: 41332.8. Samples: 193465080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:19:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 15:19:11,024][15401] Updated weights for policy 0, policy_version 11810 (0.0037) [2024-06-21 15:19:13,390][15132] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 193576960. Throughput: 0: 41171.5. Samples: 193704600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:19:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 15:19:15,126][15401] Updated weights for policy 0, policy_version 11820 (0.0037) [2024-06-21 15:19:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 41507.9, 300 sec: 41265.5). Total num frames: 193789952. Throughput: 0: 41268.5. Samples: 193951900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:19:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 15:19:19,081][15401] Updated weights for policy 0, policy_version 11830 (0.0038) [2024-06-21 15:19:23,059][15401] Updated weights for policy 0, policy_version 11840 (0.0033) [2024-06-21 15:19:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 193986560. Throughput: 0: 41133.0. Samples: 194074820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:19:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 15:19:27,023][15401] Updated weights for policy 0, policy_version 11850 (0.0035) [2024-06-21 15:19:28,394][15132] Fps is (10 sec: 40940.9, 60 sec: 41230.0, 300 sec: 41264.8). Total num frames: 194199552. Throughput: 0: 41102.1. Samples: 194321000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:19:28,395][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 15:19:30,959][15401] Updated weights for policy 0, policy_version 11860 (0.0045) [2024-06-21 15:19:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 194396160. Throughput: 0: 41024.3. Samples: 194566920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 15:19:33,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-21 15:19:34,951][15401] Updated weights for policy 0, policy_version 11870 (0.0031) [2024-06-21 15:19:38,390][15132] Fps is (10 sec: 39339.4, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 194592768. Throughput: 0: 41052.1. Samples: 194686400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 15:19:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 15:19:38,839][15401] Updated weights for policy 0, policy_version 11880 (0.0042) [2024-06-21 15:19:42,785][15401] Updated weights for policy 0, policy_version 11890 (0.0048) [2024-06-21 15:19:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 194805760. Throughput: 0: 41038.1. Samples: 194934900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:19:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 15:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011890_194805760.pth... [2024-06-21 15:19:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011287_184926208.pth [2024-06-21 15:19:43,503][15349] Signal inference workers to stop experience collection... (2850 times) [2024-06-21 15:19:43,547][15401] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-21 15:19:43,628][15349] Signal inference workers to resume experience collection... (2850 times) [2024-06-21 15:19:43,629][15401] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-21 15:19:47,216][15401] Updated weights for policy 0, policy_version 11900 (0.0035) [2024-06-21 15:19:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 195035136. Throughput: 0: 40852.4. Samples: 195174240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:19:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 15:19:51,049][15401] Updated weights for policy 0, policy_version 11910 (0.0039) [2024-06-21 15:19:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 195215360. Throughput: 0: 40730.2. Samples: 195297940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:19:53,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 15:19:55,098][15401] Updated weights for policy 0, policy_version 11920 (0.0049) [2024-06-21 15:19:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 195428352. Throughput: 0: 40776.9. Samples: 195539560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:19:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 15:19:59,167][15401] Updated weights for policy 0, policy_version 11930 (0.0038) [2024-06-21 15:20:03,195][15401] Updated weights for policy 0, policy_version 11940 (0.0030) [2024-06-21 15:20:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 195624960. Throughput: 0: 40796.4. Samples: 195787740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:20:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 15:20:07,245][15401] Updated weights for policy 0, policy_version 11950 (0.0053) [2024-06-21 15:20:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 40960.2, 300 sec: 41265.5). Total num frames: 195837952. Throughput: 0: 40880.5. Samples: 195914440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:20:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 15:20:11,361][15401] Updated weights for policy 0, policy_version 11960 (0.0034) [2024-06-21 15:20:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41210.2). Total num frames: 196034560. Throughput: 0: 40895.6. Samples: 196161120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-21 15:20:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 15:20:15,199][15401] Updated weights for policy 0, policy_version 11970 (0.0026) [2024-06-21 15:20:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 196247552. Throughput: 0: 40991.3. Samples: 196411520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:20:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 15:20:19,226][15401] Updated weights for policy 0, policy_version 11980 (0.0031) [2024-06-21 15:20:23,084][15401] Updated weights for policy 0, policy_version 11990 (0.0055) [2024-06-21 15:20:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41232.9, 300 sec: 41265.4). Total num frames: 196460544. Throughput: 0: 41031.1. Samples: 196532800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:20:23,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 15:20:27,191][15401] Updated weights for policy 0, policy_version 12000 (0.0033) [2024-06-21 15:20:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40690.1, 300 sec: 41043.3). Total num frames: 196640768. Throughput: 0: 41146.8. Samples: 196786500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 15:20:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 15:20:31,061][15401] Updated weights for policy 0, policy_version 12010 (0.0033) [2024-06-21 15:20:33,392][15132] Fps is (10 sec: 40950.7, 60 sec: 41231.5, 300 sec: 41320.7). Total num frames: 196870144. Throughput: 0: 41258.3. Samples: 197030960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 15:20:33,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 15:20:35,202][15401] Updated weights for policy 0, policy_version 12020 (0.0049) [2024-06-21 15:20:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 197066752. Throughput: 0: 41304.5. Samples: 197156640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:20:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 15:20:38,869][15401] Updated weights for policy 0, policy_version 12030 (0.0046) [2024-06-21 15:20:43,226][15401] Updated weights for policy 0, policy_version 12040 (0.0034) [2024-06-21 15:20:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 197263360. Throughput: 0: 41390.7. Samples: 197402140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:20:43,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 15:20:46,777][15401] Updated weights for policy 0, policy_version 12050 (0.0033) [2024-06-21 15:20:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 41210.2). Total num frames: 197476352. Throughput: 0: 41292.4. Samples: 197645900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:20:48,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-21 15:20:51,150][15401] Updated weights for policy 0, policy_version 12060 (0.0029) [2024-06-21 15:20:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 197689344. Throughput: 0: 41298.1. Samples: 197772860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:20:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 15:20:54,475][15401] Updated weights for policy 0, policy_version 12070 (0.0044) [2024-06-21 15:20:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 197885952. Throughput: 0: 41233.4. Samples: 198016620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:20:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 15:20:59,036][15401] Updated weights for policy 0, policy_version 12080 (0.0043) [2024-06-21 15:21:02,393][15401] Updated weights for policy 0, policy_version 12090 (0.0034) [2024-06-21 15:21:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 198098944. Throughput: 0: 41207.1. Samples: 198265840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:21:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 15:21:06,813][15401] Updated weights for policy 0, policy_version 12100 (0.0040) [2024-06-21 15:21:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 198311936. Throughput: 0: 41324.6. Samples: 198392400. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 15:21:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 15:21:10,608][15401] Updated weights for policy 0, policy_version 12110 (0.0043) [2024-06-21 15:21:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41504.6, 300 sec: 41265.1). Total num frames: 198524928. Throughput: 0: 41247.6. Samples: 198642740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 15:21:13,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 15:21:14,464][15349] Signal inference workers to stop experience collection... (2900 times) [2024-06-21 15:21:14,472][15349] Signal inference workers to resume experience collection... (2900 times) [2024-06-21 15:21:14,485][15401] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-21 15:21:14,498][15401] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-21 15:21:14,623][15401] Updated weights for policy 0, policy_version 12120 (0.0032) [2024-06-21 15:21:18,380][15401] Updated weights for policy 0, policy_version 12130 (0.0043) [2024-06-21 15:21:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41504.4, 300 sec: 41209.6). Total num frames: 198737920. Throughput: 0: 41181.8. Samples: 198884140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:21:18,393][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 15:21:22,903][15401] Updated weights for policy 0, policy_version 12140 (0.0025) [2024-06-21 15:21:23,389][15132] Fps is (10 sec: 37692.5, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 198901760. Throughput: 0: 41044.1. Samples: 199003620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:21:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 15:21:26,317][15401] Updated weights for policy 0, policy_version 12150 (0.0044) [2024-06-21 15:21:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 199147520. Throughput: 0: 41249.0. Samples: 199258340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:21:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 15:21:30,835][15401] Updated weights for policy 0, policy_version 12160 (0.0047) [2024-06-21 15:21:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40961.6, 300 sec: 41154.4). Total num frames: 199327744. Throughput: 0: 41404.5. Samples: 199509100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:21:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 15:21:34,472][15401] Updated weights for policy 0, policy_version 12170 (0.0044) [2024-06-21 15:21:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41210.2). Total num frames: 199540736. Throughput: 0: 41164.0. Samples: 199625240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 15:21:38,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-21 15:21:38,910][15401] Updated weights for policy 0, policy_version 12180 (0.0041) [2024-06-21 15:21:42,314][15401] Updated weights for policy 0, policy_version 12190 (0.0040) [2024-06-21 15:21:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 199770112. Throughput: 0: 41389.2. Samples: 199879140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:21:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 15:21:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012193_199770112.pth... [2024-06-21 15:21:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011588_189857792.pth [2024-06-21 15:21:46,906][15401] Updated weights for policy 0, policy_version 12200 (0.0050) [2024-06-21 15:21:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 199950336. Throughput: 0: 41325.9. Samples: 200125500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:21:48,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 15:21:50,391][15401] Updated weights for policy 0, policy_version 12210 (0.0035) [2024-06-21 15:21:53,390][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 200163328. Throughput: 0: 41243.1. Samples: 200248340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:21:53,399][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 15:21:54,890][15401] Updated weights for policy 0, policy_version 12220 (0.0029) [2024-06-21 15:21:58,354][15401] Updated weights for policy 0, policy_version 12230 (0.0037) [2024-06-21 15:21:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 200376320. Throughput: 0: 41160.5. Samples: 200494860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:21:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 15:22:02,765][15401] Updated weights for policy 0, policy_version 12240 (0.0038) [2024-06-21 15:22:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41210.3). Total num frames: 200572928. Throughput: 0: 41361.0. Samples: 200745280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 15:22:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 15:22:06,510][15401] Updated weights for policy 0, policy_version 12250 (0.0040) [2024-06-21 15:22:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 200802304. Throughput: 0: 41471.9. Samples: 200869860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 15:22:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 15:22:10,587][15401] Updated weights for policy 0, policy_version 12260 (0.0031) [2024-06-21 15:22:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40961.7, 300 sec: 41098.9). Total num frames: 200982528. Throughput: 0: 41355.6. Samples: 201119340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-21 15:22:13,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-21 15:22:14,251][15401] Updated weights for policy 0, policy_version 12270 (0.0030) [2024-06-21 15:22:18,344][15401] Updated weights for policy 0, policy_version 12280 (0.0045) [2024-06-21 15:22:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40961.7, 300 sec: 41210.0). Total num frames: 201195520. Throughput: 0: 41408.6. Samples: 201372480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-21 15:22:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 15:22:22,271][15401] Updated weights for policy 0, policy_version 12290 (0.0039) [2024-06-21 15:22:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41154.7). Total num frames: 201408512. Throughput: 0: 41424.1. Samples: 201489320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:22:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 15:22:26,190][15401] Updated weights for policy 0, policy_version 12300 (0.0040) [2024-06-21 15:22:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 201621504. Throughput: 0: 41281.1. Samples: 201736780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:22:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 15:22:30,137][15401] Updated weights for policy 0, policy_version 12310 (0.0044) [2024-06-21 15:22:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 201801728. Throughput: 0: 41334.6. Samples: 201985560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:22:33,395][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 15:22:34,078][15401] Updated weights for policy 0, policy_version 12320 (0.0034) [2024-06-21 15:22:37,900][15401] Updated weights for policy 0, policy_version 12330 (0.0032) [2024-06-21 15:22:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41265.8). Total num frames: 202031104. Throughput: 0: 41241.8. Samples: 202104220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:22:38,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-21 15:22:42,088][15401] Updated weights for policy 0, policy_version 12340 (0.0047) [2024-06-21 15:22:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 202227712. Throughput: 0: 41225.2. Samples: 202350000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:22:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 15:22:45,894][15401] Updated weights for policy 0, policy_version 12350 (0.0043) [2024-06-21 15:22:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41099.2). Total num frames: 202424320. Throughput: 0: 41208.4. Samples: 202599660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:22:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 15:22:50,060][15401] Updated weights for policy 0, policy_version 12360 (0.0026) [2024-06-21 15:22:53,391][15132] Fps is (10 sec: 40953.2, 60 sec: 41231.9, 300 sec: 41154.5). Total num frames: 202637312. Throughput: 0: 41232.7. Samples: 202725400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:22:53,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 15:22:53,701][15401] Updated weights for policy 0, policy_version 12370 (0.0039) [2024-06-21 15:22:54,812][15349] Signal inference workers to stop experience collection... (2950 times) [2024-06-21 15:22:54,816][15349] Signal inference workers to resume experience collection... (2950 times) [2024-06-21 15:22:54,838][15401] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-21 15:22:54,838][15401] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-21 15:22:58,012][15401] Updated weights for policy 0, policy_version 12380 (0.0024) [2024-06-21 15:22:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 202850304. Throughput: 0: 41224.4. Samples: 202974440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:22:58,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 15:23:01,408][15401] Updated weights for policy 0, policy_version 12390 (0.0036) [2024-06-21 15:23:03,391][15132] Fps is (10 sec: 42597.8, 60 sec: 41504.8, 300 sec: 41154.2). Total num frames: 203063296. Throughput: 0: 41030.3. Samples: 203218920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:23:03,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 15:23:05,806][15401] Updated weights for policy 0, policy_version 12400 (0.0033) [2024-06-21 15:23:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 203259904. Throughput: 0: 41216.0. Samples: 203344040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 15:23:08,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-21 15:23:09,365][15401] Updated weights for policy 0, policy_version 12410 (0.0044) [2024-06-21 15:23:13,392][15132] Fps is (10 sec: 39319.4, 60 sec: 41231.4, 300 sec: 41209.9). Total num frames: 203456512. Throughput: 0: 41223.6. Samples: 203591940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:23:13,392][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 15:23:13,878][15401] Updated weights for policy 0, policy_version 12420 (0.0042) [2024-06-21 15:23:17,203][15401] Updated weights for policy 0, policy_version 12430 (0.0046) [2024-06-21 15:23:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 203669504. Throughput: 0: 41103.9. Samples: 203835240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:23:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 15:23:21,861][15401] Updated weights for policy 0, policy_version 12440 (0.0055) [2024-06-21 15:23:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 203866112. Throughput: 0: 41320.4. Samples: 203963640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 15:23:23,395][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 15:23:25,422][15401] Updated weights for policy 0, policy_version 12450 (0.0023) [2024-06-21 15:23:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 204079104. Throughput: 0: 41230.7. Samples: 204205380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:23:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 15:23:29,691][15401] Updated weights for policy 0, policy_version 12460 (0.0035) [2024-06-21 15:23:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 204308480. Throughput: 0: 41269.9. Samples: 204456800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:23:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 15:23:33,392][15401] Updated weights for policy 0, policy_version 12470 (0.0037) [2024-06-21 15:23:37,448][15401] Updated weights for policy 0, policy_version 12480 (0.0032) [2024-06-21 15:23:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 40958.4, 300 sec: 41154.1). Total num frames: 204488704. Throughput: 0: 41290.0. Samples: 204583480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 15:23:38,393][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 15:23:41,113][15401] Updated weights for policy 0, policy_version 12490 (0.0040) [2024-06-21 15:23:43,396][15132] Fps is (10 sec: 40933.5, 60 sec: 41501.7, 300 sec: 41209.0). Total num frames: 204718080. Throughput: 0: 41093.2. Samples: 204823900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 15:23:43,396][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 15:23:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012495_204718080.pth... [2024-06-21 15:23:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000011890_194805760.pth [2024-06-21 15:23:45,627][15401] Updated weights for policy 0, policy_version 12500 (0.0033) [2024-06-21 15:23:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 204914688. Throughput: 0: 41367.4. Samples: 205080380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:23:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-21 15:23:49,059][15401] Updated weights for policy 0, policy_version 12510 (0.0041) [2024-06-21 15:23:53,390][15132] Fps is (10 sec: 39346.7, 60 sec: 41234.2, 300 sec: 41265.4). Total num frames: 205111296. Throughput: 0: 41191.0. Samples: 205197640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:23:53,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 15:23:53,569][15401] Updated weights for policy 0, policy_version 12520 (0.0043) [2024-06-21 15:23:57,337][15401] Updated weights for policy 0, policy_version 12530 (0.0035) [2024-06-21 15:23:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 205340672. Throughput: 0: 41324.8. Samples: 205451460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:23:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 15:24:01,738][15401] Updated weights for policy 0, policy_version 12540 (0.0028) [2024-06-21 15:24:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41234.3, 300 sec: 41209.9). Total num frames: 205537280. Throughput: 0: 41348.9. Samples: 205695940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:24:03,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-21 15:24:05,269][15401] Updated weights for policy 0, policy_version 12550 (0.0037) [2024-06-21 15:24:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 205733888. Throughput: 0: 41128.0. Samples: 205814400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:24:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-21 15:24:09,432][15401] Updated weights for policy 0, policy_version 12560 (0.0031) [2024-06-21 15:24:13,022][15401] Updated weights for policy 0, policy_version 12570 (0.0038) [2024-06-21 15:24:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41780.8, 300 sec: 41265.4). Total num frames: 205963264. Throughput: 0: 41387.0. Samples: 206067800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:24:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 15:24:17,432][15401] Updated weights for policy 0, policy_version 12580 (0.0039) [2024-06-21 15:24:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 206143488. Throughput: 0: 41281.7. Samples: 206314480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 15:24:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 15:24:20,937][15401] Updated weights for policy 0, policy_version 12590 (0.0040) [2024-06-21 15:24:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41210.6). Total num frames: 206356480. Throughput: 0: 41098.6. Samples: 206432820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 15:24:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 15:24:25,296][15401] Updated weights for policy 0, policy_version 12600 (0.0041) [2024-06-21 15:24:28,392][15132] Fps is (10 sec: 42587.6, 60 sec: 41504.3, 300 sec: 41265.1). Total num frames: 206569472. Throughput: 0: 41362.2. Samples: 206685040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:24:28,393][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 15:24:28,812][15401] Updated weights for policy 0, policy_version 12610 (0.0040) [2024-06-21 15:24:33,225][15401] Updated weights for policy 0, policy_version 12620 (0.0029) [2024-06-21 15:24:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 206766080. Throughput: 0: 41120.9. Samples: 206930820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 15:24:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 15:24:37,101][15401] Updated weights for policy 0, policy_version 12630 (0.0027) [2024-06-21 15:24:38,390][15132] Fps is (10 sec: 40970.6, 60 sec: 41507.8, 300 sec: 41265.5). Total num frames: 206979072. Throughput: 0: 41128.5. Samples: 207048420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 15:24:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 15:24:41,318][15401] Updated weights for policy 0, policy_version 12640 (0.0033) [2024-06-21 15:24:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41237.5, 300 sec: 41209.9). Total num frames: 207192064. Throughput: 0: 41238.8. Samples: 207307200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 15:24:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 15:24:44,906][15401] Updated weights for policy 0, policy_version 12650 (0.0039) [2024-06-21 15:24:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 207388672. Throughput: 0: 41276.5. Samples: 207553380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:24:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 15:24:48,968][15401] Updated weights for policy 0, policy_version 12660 (0.0043) [2024-06-21 15:24:51,756][15349] Signal inference workers to stop experience collection... (3000 times) [2024-06-21 15:24:51,757][15349] Signal inference workers to resume experience collection... (3000 times) [2024-06-21 15:24:51,784][15401] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-21 15:24:51,785][15401] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-21 15:24:52,759][15401] Updated weights for policy 0, policy_version 12670 (0.0032) [2024-06-21 15:24:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 207618048. Throughput: 0: 41483.2. Samples: 207681140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:24:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 15:24:56,985][15401] Updated weights for policy 0, policy_version 12680 (0.0037) [2024-06-21 15:24:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 207814656. Throughput: 0: 41466.6. Samples: 207933800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:24:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 15:25:00,426][15401] Updated weights for policy 0, policy_version 12690 (0.0050) [2024-06-21 15:25:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41265.4). Total num frames: 208011264. Throughput: 0: 41482.6. Samples: 208181200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:25:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 15:25:04,765][15401] Updated weights for policy 0, policy_version 12700 (0.0042) [2024-06-21 15:25:08,278][15401] Updated weights for policy 0, policy_version 12710 (0.0032) [2024-06-21 15:25:08,395][15132] Fps is (10 sec: 42575.0, 60 sec: 41775.3, 300 sec: 41375.8). Total num frames: 208240640. Throughput: 0: 41609.5. Samples: 208305480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 15:25:08,396][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 15:25:13,183][15401] Updated weights for policy 0, policy_version 12720 (0.0035) [2024-06-21 15:25:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 208420864. Throughput: 0: 41477.9. Samples: 208551440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 15:25:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 15:25:16,420][15401] Updated weights for policy 0, policy_version 12730 (0.0043) [2024-06-21 15:25:18,392][15132] Fps is (10 sec: 40973.5, 60 sec: 41777.6, 300 sec: 41320.7). Total num frames: 208650240. Throughput: 0: 41368.1. Samples: 208792480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:25:18,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 15:25:20,982][15401] Updated weights for policy 0, policy_version 12740 (0.0034) [2024-06-21 15:25:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41231.5, 300 sec: 41320.7). Total num frames: 208830464. Throughput: 0: 41722.7. Samples: 208926040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:25:23,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 15:25:24,283][15401] Updated weights for policy 0, policy_version 12750 (0.0031) [2024-06-21 15:25:28,389][15132] Fps is (10 sec: 36053.5, 60 sec: 40688.7, 300 sec: 41154.7). Total num frames: 209010688. Throughput: 0: 41390.7. Samples: 209169780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:25:28,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-21 15:25:28,893][15401] Updated weights for policy 0, policy_version 12760 (0.0034) [2024-06-21 15:25:32,259][15401] Updated weights for policy 0, policy_version 12770 (0.0044) [2024-06-21 15:25:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 209272832. Throughput: 0: 41338.7. Samples: 209413620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:25:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 15:25:36,668][15401] Updated weights for policy 0, policy_version 12780 (0.0046) [2024-06-21 15:25:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 209469440. Throughput: 0: 41606.1. Samples: 209553420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 15:25:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 15:25:40,251][15401] Updated weights for policy 0, policy_version 12790 (0.0038) [2024-06-21 15:25:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 209666048. Throughput: 0: 41353.3. Samples: 209794700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 15:25:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 15:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012797_209666048.pth... [2024-06-21 15:25:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012193_199770112.pth [2024-06-21 15:25:44,623][15401] Updated weights for policy 0, policy_version 12800 (0.0036) [2024-06-21 15:25:47,991][15401] Updated weights for policy 0, policy_version 12810 (0.0042) [2024-06-21 15:25:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 209895424. Throughput: 0: 41407.7. Samples: 210044540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:25:48,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 15:25:52,501][15401] Updated weights for policy 0, policy_version 12820 (0.0036) [2024-06-21 15:25:53,392][15132] Fps is (10 sec: 44227.8, 60 sec: 41504.6, 300 sec: 41431.8). Total num frames: 210108416. Throughput: 0: 41510.3. Samples: 210173300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 15:25:53,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 15:25:55,785][15401] Updated weights for policy 0, policy_version 12830 (0.0032) [2024-06-21 15:25:57,238][15349] Signal inference workers to stop experience collection... (3050 times) [2024-06-21 15:25:57,293][15401] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-21 15:25:57,295][15349] Signal inference workers to resume experience collection... (3050 times) [2024-06-21 15:25:57,306][15401] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-21 15:25:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 210305024. Throughput: 0: 41489.7. Samples: 210418480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:25:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 15:26:00,135][15401] Updated weights for policy 0, policy_version 12840 (0.0033) [2024-06-21 15:26:03,389][15132] Fps is (10 sec: 39330.3, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 210501632. Throughput: 0: 41792.4. Samples: 210673040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:26:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 15:26:03,595][15401] Updated weights for policy 0, policy_version 12850 (0.0039) [2024-06-21 15:26:08,210][15401] Updated weights for policy 0, policy_version 12860 (0.0028) [2024-06-21 15:26:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41237.0, 300 sec: 41321.3). Total num frames: 210714624. Throughput: 0: 41525.8. Samples: 210794600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:26:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 15:26:11,331][15401] Updated weights for policy 0, policy_version 12870 (0.0041) [2024-06-21 15:26:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41321.3). Total num frames: 210927616. Throughput: 0: 41552.3. Samples: 211039640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:26:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 15:26:15,907][15401] Updated weights for policy 0, policy_version 12880 (0.0041) [2024-06-21 15:26:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40961.7, 300 sec: 41376.6). Total num frames: 211107840. Throughput: 0: 41865.0. Samples: 211297540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:26:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 15:26:19,204][15401] Updated weights for policy 0, policy_version 12890 (0.0034) [2024-06-21 15:26:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41507.8, 300 sec: 41265.5). Total num frames: 211320832. Throughput: 0: 41453.4. Samples: 211418820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:26:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 15:26:23,727][15401] Updated weights for policy 0, policy_version 12900 (0.0035) [2024-06-21 15:26:27,373][15401] Updated weights for policy 0, policy_version 12910 (0.0036) [2024-06-21 15:26:28,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 41487.6). Total num frames: 211566592. Throughput: 0: 41692.2. Samples: 211670840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 15:26:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 15:26:31,639][15401] Updated weights for policy 0, policy_version 12920 (0.0040) [2024-06-21 15:26:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 211746816. Throughput: 0: 41699.1. Samples: 211921000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:26:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 15:26:35,449][15401] Updated weights for policy 0, policy_version 12930 (0.0029) [2024-06-21 15:26:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 211943424. Throughput: 0: 41390.0. Samples: 212035760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:26:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 15:26:39,929][15401] Updated weights for policy 0, policy_version 12940 (0.0030) [2024-06-21 15:26:43,259][15401] Updated weights for policy 0, policy_version 12950 (0.0037) [2024-06-21 15:26:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 212172800. Throughput: 0: 41606.3. Samples: 212290760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-21 15:26:43,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 15:26:47,770][15401] Updated weights for policy 0, policy_version 12960 (0.0034) [2024-06-21 15:26:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 212369408. Throughput: 0: 41504.1. Samples: 212540720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-21 15:26:48,396][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 15:26:51,061][15401] Updated weights for policy 0, policy_version 12970 (0.0048) [2024-06-21 15:26:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41234.5, 300 sec: 41376.5). Total num frames: 212582400. Throughput: 0: 41306.9. Samples: 212653420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:26:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 15:26:55,916][15401] Updated weights for policy 0, policy_version 12980 (0.0042) [2024-06-21 15:26:58,392][15132] Fps is (10 sec: 42587.7, 60 sec: 41504.5, 300 sec: 41431.7). Total num frames: 212795392. Throughput: 0: 41406.8. Samples: 212903040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:26:58,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 15:26:58,882][15401] Updated weights for policy 0, policy_version 12990 (0.0037) [2024-06-21 15:27:03,389][15132] Fps is (10 sec: 37683.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 212959232. Throughput: 0: 41241.7. Samples: 213153420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:27:03,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 15:27:03,815][15401] Updated weights for policy 0, policy_version 13000 (0.0028) [2024-06-21 15:27:06,658][15401] Updated weights for policy 0, policy_version 13010 (0.0031) [2024-06-21 15:27:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 213204992. Throughput: 0: 41094.2. Samples: 213268060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:27:08,396][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 15:27:11,710][15401] Updated weights for policy 0, policy_version 13020 (0.0044) [2024-06-21 15:27:13,396][15132] Fps is (10 sec: 44208.3, 60 sec: 41228.7, 300 sec: 41375.6). Total num frames: 213401600. Throughput: 0: 41143.9. Samples: 213522580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:27:13,397][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 15:27:14,614][15401] Updated weights for policy 0, policy_version 13030 (0.0025) [2024-06-21 15:27:16,044][15349] Signal inference workers to stop experience collection... (3100 times) [2024-06-21 15:27:16,091][15401] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-21 15:27:16,099][15349] Signal inference workers to resume experience collection... (3100 times) [2024-06-21 15:27:16,115][15401] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-21 15:27:18,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41232.9, 300 sec: 41265.4). Total num frames: 213581824. Throughput: 0: 41093.6. Samples: 213770220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 15:27:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 15:27:19,607][15401] Updated weights for policy 0, policy_version 13040 (0.0033) [2024-06-21 15:27:22,988][15401] Updated weights for policy 0, policy_version 13050 (0.0046) [2024-06-21 15:27:23,390][15132] Fps is (10 sec: 40986.0, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 213811200. Throughput: 0: 41148.8. Samples: 213887460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:27:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 15:27:27,586][15401] Updated weights for policy 0, policy_version 13060 (0.0034) [2024-06-21 15:27:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 214007808. Throughput: 0: 41138.3. Samples: 214141980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:27:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 15:27:30,857][15401] Updated weights for policy 0, policy_version 13070 (0.0047) [2024-06-21 15:27:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 214220800. Throughput: 0: 40941.6. Samples: 214383100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 15:27:33,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 15:27:35,417][15401] Updated weights for policy 0, policy_version 13080 (0.0045) [2024-06-21 15:27:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 41504.5, 300 sec: 41376.2). Total num frames: 214433792. Throughput: 0: 41358.4. Samples: 214514640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 15:27:38,392][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 15:27:38,991][15401] Updated weights for policy 0, policy_version 13090 (0.0027) [2024-06-21 15:27:43,182][15401] Updated weights for policy 0, policy_version 13100 (0.0039) [2024-06-21 15:27:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 214630400. Throughput: 0: 41350.5. Samples: 214763720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 15:27:43,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 15:27:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013100_214630400.pth... [2024-06-21 15:27:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012495_204718080.pth [2024-06-21 15:27:46,864][15401] Updated weights for policy 0, policy_version 13110 (0.0052) [2024-06-21 15:27:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41506.0, 300 sec: 41432.3). Total num frames: 214859776. Throughput: 0: 41194.5. Samples: 215007180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 15:27:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 15:27:50,986][15401] Updated weights for policy 0, policy_version 13120 (0.0029) [2024-06-21 15:27:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 215040000. Throughput: 0: 41385.4. Samples: 215130400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:27:53,390][15132] Avg episode reward: [(0, '0.071')] [2024-06-21 15:27:54,740][15401] Updated weights for policy 0, policy_version 13130 (0.0049) [2024-06-21 15:27:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40688.6, 300 sec: 41265.7). Total num frames: 215236608. Throughput: 0: 41260.6. Samples: 215379040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:27:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 15:27:58,739][15401] Updated weights for policy 0, policy_version 13140 (0.0052) [2024-06-21 15:28:02,737][15401] Updated weights for policy 0, policy_version 13150 (0.0036) [2024-06-21 15:28:03,391][15132] Fps is (10 sec: 44230.1, 60 sec: 42051.2, 300 sec: 41431.9). Total num frames: 215482368. Throughput: 0: 41110.8. Samples: 215620260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 15:28:03,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 15:28:06,758][15401] Updated weights for policy 0, policy_version 13160 (0.0047) [2024-06-21 15:28:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 41321.3). Total num frames: 215646208. Throughput: 0: 41428.5. Samples: 215751740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 15:28:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 15:28:10,756][15401] Updated weights for policy 0, policy_version 13170 (0.0031) [2024-06-21 15:28:13,390][15132] Fps is (10 sec: 37688.7, 60 sec: 40964.4, 300 sec: 41321.0). Total num frames: 215859200. Throughput: 0: 41145.7. Samples: 215993540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 15:28:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 15:28:14,903][15401] Updated weights for policy 0, policy_version 13180 (0.0038) [2024-06-21 15:28:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.3, 300 sec: 41376.6). Total num frames: 216072192. Throughput: 0: 41283.2. Samples: 216240840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 15:28:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 15:28:18,749][15401] Updated weights for policy 0, policy_version 13190 (0.0045) [2024-06-21 15:28:22,772][15401] Updated weights for policy 0, policy_version 13200 (0.0038) [2024-06-21 15:28:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 216268800. Throughput: 0: 41157.3. Samples: 216366620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:28:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 15:28:26,815][15401] Updated weights for policy 0, policy_version 13210 (0.0033) [2024-06-21 15:28:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 216498176. Throughput: 0: 41040.6. Samples: 216610540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:28:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-21 15:28:30,536][15401] Updated weights for policy 0, policy_version 13220 (0.0032) [2024-06-21 15:28:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41376.9). Total num frames: 216694784. Throughput: 0: 41207.2. Samples: 216861500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-21 15:28:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 15:28:34,728][15401] Updated weights for policy 0, policy_version 13230 (0.0044) [2024-06-21 15:28:37,168][15349] Signal inference workers to stop experience collection... (3150 times) [2024-06-21 15:28:37,216][15401] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-21 15:28:37,225][15349] Signal inference workers to resume experience collection... (3150 times) [2024-06-21 15:28:37,235][15401] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-21 15:28:38,368][15401] Updated weights for policy 0, policy_version 13240 (0.0027) [2024-06-21 15:28:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41506.1, 300 sec: 41377.1). Total num frames: 216924160. Throughput: 0: 41137.8. Samples: 216981700. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-21 15:28:38,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 15:28:42,659][15401] Updated weights for policy 0, policy_version 13250 (0.0047) [2024-06-21 15:28:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 217104384. Throughput: 0: 41245.7. Samples: 217235100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 15:28:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 15:28:46,059][15401] Updated weights for policy 0, policy_version 13260 (0.0035) [2024-06-21 15:28:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 217333760. Throughput: 0: 41490.7. Samples: 217487280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 15:28:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 15:28:50,334][15401] Updated weights for policy 0, policy_version 13270 (0.0035) [2024-06-21 15:28:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 217530368. Throughput: 0: 41329.8. Samples: 217611580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-21 15:28:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 15:28:54,128][15401] Updated weights for policy 0, policy_version 13280 (0.0027) [2024-06-21 15:28:58,378][15401] Updated weights for policy 0, policy_version 13290 (0.0044) [2024-06-21 15:28:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 217743360. Throughput: 0: 41492.5. Samples: 217860700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-21 15:28:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 15:29:02,232][15401] Updated weights for policy 0, policy_version 13300 (0.0039) [2024-06-21 15:29:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40961.0, 300 sec: 41376.6). Total num frames: 217939968. Throughput: 0: 41600.5. Samples: 218112860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 15:29:03,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-21 15:29:06,179][15401] Updated weights for policy 0, policy_version 13310 (0.0038) [2024-06-21 15:29:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 218152960. Throughput: 0: 41524.3. Samples: 218235220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 15:29:08,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 15:29:09,899][15401] Updated weights for policy 0, policy_version 13320 (0.0035) [2024-06-21 15:29:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 218349568. Throughput: 0: 41659.4. Samples: 218485220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-21 15:29:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-21 15:29:14,064][15401] Updated weights for policy 0, policy_version 13330 (0.0029) [2024-06-21 15:29:17,808][15401] Updated weights for policy 0, policy_version 13340 (0.0049) [2024-06-21 15:29:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 218578944. Throughput: 0: 41456.0. Samples: 218727020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-21 15:29:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 15:29:22,412][15401] Updated weights for policy 0, policy_version 13350 (0.0048) [2024-06-21 15:29:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 41376.9). Total num frames: 218775552. Throughput: 0: 41615.2. Samples: 218854280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-21 15:29:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 15:29:26,012][15401] Updated weights for policy 0, policy_version 13360 (0.0035) [2024-06-21 15:29:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 218972160. Throughput: 0: 41320.9. Samples: 219094540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-21 15:29:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 15:29:30,249][15401] Updated weights for policy 0, policy_version 13370 (0.0035) [2024-06-21 15:29:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 41504.4, 300 sec: 41376.2). Total num frames: 219185152. Throughput: 0: 41286.2. Samples: 219345260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:29:33,393][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 15:29:34,170][15401] Updated weights for policy 0, policy_version 13380 (0.0039) [2024-06-21 15:29:38,020][15401] Updated weights for policy 0, policy_version 13390 (0.0039) [2024-06-21 15:29:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40961.6, 300 sec: 41321.0). Total num frames: 219381760. Throughput: 0: 41291.1. Samples: 219469680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:29:38,396][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 15:29:41,816][15401] Updated weights for policy 0, policy_version 13400 (0.0042) [2024-06-21 15:29:43,390][15132] Fps is (10 sec: 40969.4, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 219594752. Throughput: 0: 41277.2. Samples: 219718180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 15:29:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 15:29:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013404_219611136.pth... [2024-06-21 15:29:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000012797_209666048.pth [2024-06-21 15:29:45,931][15401] Updated weights for policy 0, policy_version 13410 (0.0032) [2024-06-21 15:29:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 219807744. Throughput: 0: 41183.9. Samples: 219966140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 15:29:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 15:29:49,623][15401] Updated weights for policy 0, policy_version 13420 (0.0033) [2024-06-21 15:29:53,392][15132] Fps is (10 sec: 39312.6, 60 sec: 40958.3, 300 sec: 41265.2). Total num frames: 219987968. Throughput: 0: 41188.1. Samples: 220088780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 15:29:53,403][15132] Avg episode reward: [(0, '0.136')] [2024-06-21 15:29:54,152][15401] Updated weights for policy 0, policy_version 13430 (0.0048) [2024-06-21 15:29:57,938][15401] Updated weights for policy 0, policy_version 13440 (0.0041) [2024-06-21 15:29:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 220217344. Throughput: 0: 41043.4. Samples: 220332180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:29:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 15:30:01,944][15401] Updated weights for policy 0, policy_version 13450 (0.0035) [2024-06-21 15:30:03,389][15132] Fps is (10 sec: 44247.6, 60 sec: 41506.1, 300 sec: 41321.8). Total num frames: 220430336. Throughput: 0: 41166.6. Samples: 220579520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:30:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 15:30:05,760][15401] Updated weights for policy 0, policy_version 13460 (0.0029) [2024-06-21 15:30:08,392][15132] Fps is (10 sec: 39312.9, 60 sec: 40958.4, 300 sec: 41320.7). Total num frames: 220610560. Throughput: 0: 41091.1. Samples: 220703480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:30:08,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 15:30:09,719][15401] Updated weights for policy 0, policy_version 13470 (0.0032) [2024-06-21 15:30:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41504.5, 300 sec: 41321.0). Total num frames: 220839936. Throughput: 0: 41340.9. Samples: 220954980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:30:13,393][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 15:30:13,698][15401] Updated weights for policy 0, policy_version 13480 (0.0035) [2024-06-21 15:30:17,642][15401] Updated weights for policy 0, policy_version 13490 (0.0027) [2024-06-21 15:30:18,389][15132] Fps is (10 sec: 44247.6, 60 sec: 41233.0, 300 sec: 41432.4). Total num frames: 221052928. Throughput: 0: 41367.1. Samples: 221206680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-21 15:30:18,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 15:30:21,395][15401] Updated weights for policy 0, policy_version 13500 (0.0035) [2024-06-21 15:30:22,997][15349] Signal inference workers to stop experience collection... (3200 times) [2024-06-21 15:30:23,048][15401] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-21 15:30:23,055][15349] Signal inference workers to resume experience collection... (3200 times) [2024-06-21 15:30:23,066][15401] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-21 15:30:23,389][15132] Fps is (10 sec: 39331.5, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 221233152. Throughput: 0: 41411.2. Samples: 221333180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-21 15:30:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 15:30:25,526][15401] Updated weights for policy 0, policy_version 13510 (0.0034) [2024-06-21 15:30:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 221462528. Throughput: 0: 41413.0. Samples: 221581760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 15:30:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 15:30:29,125][15401] Updated weights for policy 0, policy_version 13520 (0.0033) [2024-06-21 15:30:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41234.8, 300 sec: 41321.0). Total num frames: 221659136. Throughput: 0: 41634.8. Samples: 221839700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 15:30:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 15:30:33,419][15401] Updated weights for policy 0, policy_version 13530 (0.0030) [2024-06-21 15:30:36,882][15401] Updated weights for policy 0, policy_version 13540 (0.0027) [2024-06-21 15:30:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 221855744. Throughput: 0: 41697.0. Samples: 221965040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:30:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 15:30:41,264][15401] Updated weights for policy 0, policy_version 13550 (0.0023) [2024-06-21 15:30:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 41777.6, 300 sec: 41376.2). Total num frames: 222101504. Throughput: 0: 41722.9. Samples: 222209800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:30:43,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 15:30:44,632][15401] Updated weights for policy 0, policy_version 13560 (0.0024) [2024-06-21 15:30:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 41265.8). Total num frames: 222281728. Throughput: 0: 41877.8. Samples: 222464020. Policy #0 lag: (min: 2.0, avg: 12.5, max: 23.0) [2024-06-21 15:30:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 15:30:49,154][15401] Updated weights for policy 0, policy_version 13570 (0.0035) [2024-06-21 15:30:52,345][15401] Updated weights for policy 0, policy_version 13580 (0.0038) [2024-06-21 15:30:53,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42053.9, 300 sec: 41376.5). Total num frames: 222511104. Throughput: 0: 41722.6. Samples: 222580900. Policy #0 lag: (min: 2.0, avg: 12.5, max: 23.0) [2024-06-21 15:30:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 15:30:56,772][15401] Updated weights for policy 0, policy_version 13590 (0.0048) [2024-06-21 15:30:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.4, 300 sec: 41432.1). Total num frames: 222724096. Throughput: 0: 41841.4. Samples: 222837740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 15:30:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 15:31:00,053][15401] Updated weights for policy 0, policy_version 13600 (0.0035) [2024-06-21 15:31:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 222904320. Throughput: 0: 41763.1. Samples: 223086020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 15:31:03,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 15:31:05,002][15401] Updated weights for policy 0, policy_version 13610 (0.0039) [2024-06-21 15:31:07,735][15401] Updated weights for policy 0, policy_version 13620 (0.0046) [2024-06-21 15:31:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.2, 300 sec: 41487.6). Total num frames: 223166464. Throughput: 0: 41742.7. Samples: 223211600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:31:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 15:31:12,824][15401] Updated weights for policy 0, policy_version 13630 (0.0033) [2024-06-21 15:31:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 41779.2, 300 sec: 41487.3). Total num frames: 223346688. Throughput: 0: 41877.3. Samples: 223466340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:31:13,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 15:31:15,623][15401] Updated weights for policy 0, policy_version 13640 (0.0038) [2024-06-21 15:31:18,389][15132] Fps is (10 sec: 36044.9, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 223526912. Throughput: 0: 41795.1. Samples: 223720480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-21 15:31:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 15:31:20,518][15401] Updated weights for policy 0, policy_version 13650 (0.0041) [2024-06-21 15:31:23,390][15132] Fps is (10 sec: 44243.8, 60 sec: 42597.7, 300 sec: 41432.0). Total num frames: 223789056. Throughput: 0: 41704.9. Samples: 223841800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-21 15:31:23,391][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 15:31:23,535][15401] Updated weights for policy 0, policy_version 13660 (0.0035) [2024-06-21 15:31:26,895][15349] Signal inference workers to stop experience collection... (3250 times) [2024-06-21 15:31:26,895][15349] Signal inference workers to resume experience collection... (3250 times) [2024-06-21 15:31:26,918][15401] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-21 15:31:26,918][15401] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-21 15:31:28,307][15401] Updated weights for policy 0, policy_version 13670 (0.0032) [2024-06-21 15:31:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 223969280. Throughput: 0: 41775.2. Samples: 224089580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:31:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 15:31:31,456][15401] Updated weights for policy 0, policy_version 13680 (0.0035) [2024-06-21 15:31:33,389][15132] Fps is (10 sec: 36048.1, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 224149504. Throughput: 0: 41508.8. Samples: 224331920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:31:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 15:31:36,686][15401] Updated weights for policy 0, policy_version 13690 (0.0037) [2024-06-21 15:31:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41376.6). Total num frames: 224378880. Throughput: 0: 41756.6. Samples: 224459940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:31:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 15:31:39,351][15401] Updated weights for policy 0, policy_version 13700 (0.0040) [2024-06-21 15:31:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 40960.0, 300 sec: 41320.6). Total num frames: 224559104. Throughput: 0: 41611.9. Samples: 224710380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:31:43,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 15:31:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013706_224559104.pth... [2024-06-21 15:31:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013100_214630400.pth [2024-06-21 15:31:44,669][15401] Updated weights for policy 0, policy_version 13710 (0.0035) [2024-06-21 15:31:47,177][15401] Updated weights for policy 0, policy_version 13720 (0.0046) [2024-06-21 15:31:48,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42047.7, 300 sec: 41431.2). Total num frames: 224804864. Throughput: 0: 41317.7. Samples: 224945580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:31:48,396][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 15:31:52,519][15401] Updated weights for policy 0, policy_version 13730 (0.0035) [2024-06-21 15:31:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 41233.1, 300 sec: 41321.3). Total num frames: 224985088. Throughput: 0: 41637.3. Samples: 225085280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:31:53,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 15:31:55,044][15401] Updated weights for policy 0, policy_version 13740 (0.0039) [2024-06-21 15:31:58,389][15132] Fps is (10 sec: 39346.8, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 225198080. Throughput: 0: 41490.3. Samples: 225333300. Policy #0 lag: (min: 0.0, avg: 13.0, max: 23.0) [2024-06-21 15:31:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 15:32:00,498][15401] Updated weights for policy 0, policy_version 13750 (0.0036) [2024-06-21 15:32:02,985][15401] Updated weights for policy 0, policy_version 13760 (0.0034) [2024-06-21 15:32:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 41543.2). Total num frames: 225460224. Throughput: 0: 41010.1. Samples: 225565940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 23.0) [2024-06-21 15:32:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 15:32:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 41321.9). Total num frames: 225591296. Throughput: 0: 41321.7. Samples: 225701240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 15:32:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 15:32:08,711][15401] Updated weights for policy 0, policy_version 13770 (0.0032) [2024-06-21 15:32:10,926][15401] Updated weights for policy 0, policy_version 13780 (0.0032) [2024-06-21 15:32:13,390][15132] Fps is (10 sec: 36044.6, 60 sec: 41234.7, 300 sec: 41487.6). Total num frames: 225820672. Throughput: 0: 41364.3. Samples: 225950980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 15:32:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 15:32:16,219][15401] Updated weights for policy 0, policy_version 13790 (0.0028) [2024-06-21 15:32:18,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 41543.2). Total num frames: 226066432. Throughput: 0: 41470.6. Samples: 226198100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:32:18,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 15:32:18,722][15401] Updated weights for policy 0, policy_version 13800 (0.0038) [2024-06-21 15:32:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40687.6, 300 sec: 41432.1). Total num frames: 226230272. Throughput: 0: 41530.2. Samples: 226328800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:32:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 15:32:23,822][15401] Updated weights for policy 0, policy_version 13810 (0.0033) [2024-06-21 15:32:27,039][15401] Updated weights for policy 0, policy_version 13820 (0.0042) [2024-06-21 15:32:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 226459648. Throughput: 0: 41391.6. Samples: 226572900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:32:28,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-21 15:32:31,675][15401] Updated weights for policy 0, policy_version 13830 (0.0044) [2024-06-21 15:32:32,826][15349] Signal inference workers to stop experience collection... (3300 times) [2024-06-21 15:32:32,848][15401] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-21 15:32:32,889][15349] Signal inference workers to resume experience collection... (3300 times) [2024-06-21 15:32:32,890][15401] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-21 15:32:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41488.0). Total num frames: 226672640. Throughput: 0: 41839.8. Samples: 226828100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:32:33,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 15:32:34,825][15401] Updated weights for policy 0, policy_version 13840 (0.0037) [2024-06-21 15:32:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 226836480. Throughput: 0: 41383.5. Samples: 226947540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:32:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 15:32:39,717][15401] Updated weights for policy 0, policy_version 13850 (0.0040) [2024-06-21 15:32:42,555][15401] Updated weights for policy 0, policy_version 13860 (0.0037) [2024-06-21 15:32:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42326.9, 300 sec: 41487.6). Total num frames: 227098624. Throughput: 0: 41311.9. Samples: 227192340. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-21 15:32:43,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-21 15:32:47,737][15401] Updated weights for policy 0, policy_version 13870 (0.0049) [2024-06-21 15:32:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 40964.3, 300 sec: 41432.1). Total num frames: 227262464. Throughput: 0: 41855.1. Samples: 227449420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-21 15:32:48,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-21 15:32:50,859][15401] Updated weights for policy 0, policy_version 13880 (0.0045) [2024-06-21 15:32:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 227491840. Throughput: 0: 41515.9. Samples: 227569460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 15:32:53,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 15:32:55,535][15401] Updated weights for policy 0, policy_version 13890 (0.0038) [2024-06-21 15:32:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42050.5, 300 sec: 41487.5). Total num frames: 227721216. Throughput: 0: 41505.4. Samples: 227818820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 15:32:58,393][15132] Avg episode reward: [(0, '0.220')] [2024-06-21 15:32:58,631][15401] Updated weights for policy 0, policy_version 13900 (0.0036) [2024-06-21 15:33:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40413.9, 300 sec: 41487.6). Total num frames: 227885056. Throughput: 0: 41717.8. Samples: 228075400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:33:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 15:33:03,630][15401] Updated weights for policy 0, policy_version 13910 (0.0037) [2024-06-21 15:33:06,365][15401] Updated weights for policy 0, policy_version 13920 (0.0035) [2024-06-21 15:33:08,391][15132] Fps is (10 sec: 39326.9, 60 sec: 42051.5, 300 sec: 41543.0). Total num frames: 228114432. Throughput: 0: 41218.5. Samples: 228183680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:33:08,391][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 15:33:11,536][15401] Updated weights for policy 0, policy_version 13930 (0.0038) [2024-06-21 15:33:13,391][15132] Fps is (10 sec: 44229.6, 60 sec: 41778.1, 300 sec: 41542.9). Total num frames: 228327424. Throughput: 0: 41527.8. Samples: 228441720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 15:33:13,392][15132] Avg episode reward: [(0, '0.714')] [2024-06-21 15:33:14,138][15401] Updated weights for policy 0, policy_version 13940 (0.0044) [2024-06-21 15:33:18,390][15132] Fps is (10 sec: 36048.4, 60 sec: 40140.7, 300 sec: 41376.5). Total num frames: 228474880. Throughput: 0: 41430.9. Samples: 228692500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 15:33:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 15:33:19,507][15401] Updated weights for policy 0, policy_version 13950 (0.0030) [2024-06-21 15:33:22,695][15401] Updated weights for policy 0, policy_version 13960 (0.0045) [2024-06-21 15:33:23,390][15132] Fps is (10 sec: 40966.4, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 228737024. Throughput: 0: 41247.1. Samples: 228803660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:33:23,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 15:33:27,449][15401] Updated weights for policy 0, policy_version 13970 (0.0041) [2024-06-21 15:33:28,389][15132] Fps is (10 sec: 47514.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 228950016. Throughput: 0: 41422.4. Samples: 229056340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 15:33:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 15:33:30,552][15401] Updated weights for policy 0, policy_version 13980 (0.0028) [2024-06-21 15:33:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40959.9, 300 sec: 41376.9). Total num frames: 229130240. Throughput: 0: 41164.9. Samples: 229301840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 15:33:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 15:33:35,216][15401] Updated weights for policy 0, policy_version 13990 (0.0040) [2024-06-21 15:33:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 229343232. Throughput: 0: 41140.1. Samples: 229420760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 15:33:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 15:33:38,714][15401] Updated weights for policy 0, policy_version 14000 (0.0035) [2024-06-21 15:33:43,107][15401] Updated weights for policy 0, policy_version 14010 (0.0046) [2024-06-21 15:33:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 229556224. Throughput: 0: 41295.6. Samples: 229677020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 15:33:43,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 15:33:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014011_229556224.pth... [2024-06-21 15:33:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013404_219611136.pth [2024-06-21 15:33:46,763][15401] Updated weights for policy 0, policy_version 14020 (0.0041) [2024-06-21 15:33:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 229736448. Throughput: 0: 40969.7. Samples: 229919040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 15:33:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 15:33:51,039][15401] Updated weights for policy 0, policy_version 14030 (0.0036) [2024-06-21 15:33:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 229965824. Throughput: 0: 41333.5. Samples: 230043640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 27.0) [2024-06-21 15:33:53,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 15:33:54,665][15401] Updated weights for policy 0, policy_version 14040 (0.0031) [2024-06-21 15:33:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 40415.5, 300 sec: 41376.5). Total num frames: 230146048. Throughput: 0: 41147.2. Samples: 230293280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 27.0) [2024-06-21 15:33:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 15:33:58,850][15401] Updated weights for policy 0, policy_version 14050 (0.0037) [2024-06-21 15:34:02,710][15401] Updated weights for policy 0, policy_version 14060 (0.0028) [2024-06-21 15:34:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 230391808. Throughput: 0: 40905.0. Samples: 230533220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:34:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 15:34:06,343][15349] Signal inference workers to stop experience collection... (3350 times) [2024-06-21 15:34:06,391][15401] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-21 15:34:06,398][15349] Signal inference workers to resume experience collection... (3350 times) [2024-06-21 15:34:06,400][15401] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-21 15:34:07,185][15401] Updated weights for policy 0, policy_version 14070 (0.0053) [2024-06-21 15:34:08,390][15132] Fps is (10 sec: 42597.3, 60 sec: 40960.6, 300 sec: 41432.1). Total num frames: 230572032. Throughput: 0: 41369.6. Samples: 230665300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:34:08,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 15:34:10,449][15401] Updated weights for policy 0, policy_version 14080 (0.0036) [2024-06-21 15:34:13,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40688.1, 300 sec: 41321.0). Total num frames: 230768640. Throughput: 0: 41098.7. Samples: 230905780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:34:13,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 15:34:15,227][15401] Updated weights for policy 0, policy_version 14090 (0.0025) [2024-06-21 15:34:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 230998016. Throughput: 0: 41051.9. Samples: 231149180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 15:34:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 15:34:18,682][15401] Updated weights for policy 0, policy_version 14100 (0.0032) [2024-06-21 15:34:23,101][15401] Updated weights for policy 0, policy_version 14110 (0.0039) [2024-06-21 15:34:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 231194624. Throughput: 0: 41226.2. Samples: 231275940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 15:34:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 15:34:26,606][15401] Updated weights for policy 0, policy_version 14120 (0.0044) [2024-06-21 15:34:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41432.4). Total num frames: 231407616. Throughput: 0: 40952.8. Samples: 231519900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 15:34:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 15:34:31,143][15401] Updated weights for policy 0, policy_version 14130 (0.0029) [2024-06-21 15:34:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 231604224. Throughput: 0: 41092.4. Samples: 231768200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 15:34:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 15:34:34,443][15401] Updated weights for policy 0, policy_version 14140 (0.0042) [2024-06-21 15:34:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 231800832. Throughput: 0: 41039.4. Samples: 231890420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:34:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 15:34:38,912][15401] Updated weights for policy 0, policy_version 14150 (0.0051) [2024-06-21 15:34:42,298][15401] Updated weights for policy 0, policy_version 14160 (0.0023) [2024-06-21 15:34:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 232046592. Throughput: 0: 41012.9. Samples: 232138860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:34:43,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 15:34:47,189][15401] Updated weights for policy 0, policy_version 14170 (0.0035) [2024-06-21 15:34:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 232243200. Throughput: 0: 41079.6. Samples: 232381800. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 15:34:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 15:34:50,291][15401] Updated weights for policy 0, policy_version 14180 (0.0039) [2024-06-21 15:34:53,390][15132] Fps is (10 sec: 36044.6, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 232407040. Throughput: 0: 40901.9. Samples: 232505880. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 15:34:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 15:34:54,981][15401] Updated weights for policy 0, policy_version 14190 (0.0037) [2024-06-21 15:34:58,276][15401] Updated weights for policy 0, policy_version 14200 (0.0029) [2024-06-21 15:34:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 232652800. Throughput: 0: 41108.4. Samples: 232755660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 15:34:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 15:35:02,822][15401] Updated weights for policy 0, policy_version 14210 (0.0037) [2024-06-21 15:35:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 41376.9). Total num frames: 232816640. Throughput: 0: 41272.1. Samples: 233006420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 15:35:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 15:35:06,103][15401] Updated weights for policy 0, policy_version 14220 (0.0026) [2024-06-21 15:35:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.2, 300 sec: 41376.9). Total num frames: 233046016. Throughput: 0: 41045.3. Samples: 233122980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-21 15:35:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 15:35:10,735][15401] Updated weights for policy 0, policy_version 14230 (0.0039) [2024-06-21 15:35:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41376.6). Total num frames: 233259008. Throughput: 0: 41217.8. Samples: 233374700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-21 15:35:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 15:35:14,316][15401] Updated weights for policy 0, policy_version 14240 (0.0046) [2024-06-21 15:35:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41376.5). Total num frames: 233439232. Throughput: 0: 41282.3. Samples: 233625900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:35:18,390][15132] Avg episode reward: [(0, '0.138')] [2024-06-21 15:35:19,102][15401] Updated weights for policy 0, policy_version 14250 (0.0041) [2024-06-21 15:35:22,213][15401] Updated weights for policy 0, policy_version 14260 (0.0041) [2024-06-21 15:35:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 233684992. Throughput: 0: 41107.1. Samples: 233740240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:35:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 15:35:27,058][15401] Updated weights for policy 0, policy_version 14270 (0.0029) [2024-06-21 15:35:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 40685.3, 300 sec: 41320.7). Total num frames: 233848832. Throughput: 0: 41190.7. Samples: 233992540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:35:28,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 15:35:28,502][15349] Signal inference workers to stop experience collection... (3400 times) [2024-06-21 15:35:28,528][15401] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-21 15:35:28,560][15349] Signal inference workers to resume experience collection... (3400 times) [2024-06-21 15:35:28,564][15401] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-21 15:35:30,416][15401] Updated weights for policy 0, policy_version 14280 (0.0032) [2024-06-21 15:35:33,390][15132] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 234061824. Throughput: 0: 41281.3. Samples: 234239460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:35:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 15:35:34,872][15401] Updated weights for policy 0, policy_version 14290 (0.0039) [2024-06-21 15:35:38,211][15401] Updated weights for policy 0, policy_version 14300 (0.0036) [2024-06-21 15:35:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 41506.1, 300 sec: 41321.3). Total num frames: 234291200. Throughput: 0: 41220.9. Samples: 234360820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:35:38,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-21 15:35:42,794][15401] Updated weights for policy 0, policy_version 14310 (0.0041) [2024-06-21 15:35:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 41265.5). Total num frames: 234455040. Throughput: 0: 41173.4. Samples: 234608460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:35:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 15:35:43,435][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014311_234471424.pth... [2024-06-21 15:35:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000013706_224559104.pth [2024-06-21 15:35:46,241][15401] Updated weights for policy 0, policy_version 14320 (0.0030) [2024-06-21 15:35:48,394][15132] Fps is (10 sec: 40943.7, 60 sec: 40957.3, 300 sec: 41320.5). Total num frames: 234700800. Throughput: 0: 40879.9. Samples: 234846180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 15:35:48,394][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 15:35:50,815][15401] Updated weights for policy 0, policy_version 14330 (0.0047) [2024-06-21 15:35:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 234897408. Throughput: 0: 41193.4. Samples: 234976680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:35:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 15:35:54,352][15401] Updated weights for policy 0, policy_version 14340 (0.0040) [2024-06-21 15:35:58,390][15132] Fps is (10 sec: 39337.4, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 235094016. Throughput: 0: 41187.9. Samples: 235228160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:35:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:35:58,781][15401] Updated weights for policy 0, policy_version 14350 (0.0052) [2024-06-21 15:36:02,299][15401] Updated weights for policy 0, policy_version 14360 (0.0038) [2024-06-21 15:36:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 235323392. Throughput: 0: 40937.8. Samples: 235468100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 15:36:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 15:36:06,755][15401] Updated weights for policy 0, policy_version 14370 (0.0035) [2024-06-21 15:36:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41265.8). Total num frames: 235520000. Throughput: 0: 41338.6. Samples: 235600480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 15:36:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 15:36:10,311][15401] Updated weights for policy 0, policy_version 14380 (0.0040) [2024-06-21 15:36:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 235716608. Throughput: 0: 41174.7. Samples: 235845300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 15:36:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 15:36:14,704][15401] Updated weights for policy 0, policy_version 14390 (0.0028) [2024-06-21 15:36:18,046][15401] Updated weights for policy 0, policy_version 14400 (0.0039) [2024-06-21 15:36:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41154.5). Total num frames: 235929600. Throughput: 0: 41146.7. Samples: 236091060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 15:36:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 15:36:22,567][15401] Updated weights for policy 0, policy_version 14410 (0.0033) [2024-06-21 15:36:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 236142592. Throughput: 0: 41289.4. Samples: 236218840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:36:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 15:36:26,350][15401] Updated weights for policy 0, policy_version 14420 (0.0037) [2024-06-21 15:36:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41780.9, 300 sec: 41376.5). Total num frames: 236355584. Throughput: 0: 41085.8. Samples: 236457320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 15:36:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 15:36:30,639][15401] Updated weights for policy 0, policy_version 14430 (0.0025) [2024-06-21 15:36:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 236552192. Throughput: 0: 41448.6. Samples: 236711200. Policy #0 lag: (min: 0.0, avg: 13.8, max: 29.0) [2024-06-21 15:36:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 15:36:34,107][15401] Updated weights for policy 0, policy_version 14440 (0.0042) [2024-06-21 15:36:38,324][15401] Updated weights for policy 0, policy_version 14450 (0.0052) [2024-06-21 15:36:38,390][15132] Fps is (10 sec: 39320.7, 60 sec: 40959.9, 300 sec: 41321.3). Total num frames: 236748800. Throughput: 0: 41289.1. Samples: 236834700. Policy #0 lag: (min: 0.0, avg: 13.8, max: 29.0) [2024-06-21 15:36:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 15:36:42,133][15401] Updated weights for policy 0, policy_version 14460 (0.0028) [2024-06-21 15:36:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.5, 300 sec: 41266.0). Total num frames: 236978176. Throughput: 0: 41194.3. Samples: 237082000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:36:43,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 15:36:46,134][15401] Updated weights for policy 0, policy_version 14470 (0.0027) [2024-06-21 15:36:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41235.9, 300 sec: 41321.0). Total num frames: 237174784. Throughput: 0: 41308.0. Samples: 237326960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:36:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 15:36:49,928][15401] Updated weights for policy 0, policy_version 14480 (0.0035) [2024-06-21 15:36:53,390][15132] Fps is (10 sec: 36053.4, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 237338624. Throughput: 0: 41129.0. Samples: 237451280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:36:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 15:36:53,784][15349] Signal inference workers to stop experience collection... (3450 times) [2024-06-21 15:36:53,784][15349] Signal inference workers to resume experience collection... (3450 times) [2024-06-21 15:36:53,832][15401] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-21 15:36:53,832][15401] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-21 15:36:53,926][15401] Updated weights for policy 0, policy_version 14490 (0.0032) [2024-06-21 15:36:57,713][15401] Updated weights for policy 0, policy_version 14500 (0.0038) [2024-06-21 15:36:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 237568000. Throughput: 0: 41120.8. Samples: 237695740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 15:36:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 15:37:01,938][15401] Updated weights for policy 0, policy_version 14510 (0.0036) [2024-06-21 15:37:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 237797376. Throughput: 0: 41228.9. Samples: 237946360. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 15:37:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 15:37:06,183][15401] Updated weights for policy 0, policy_version 14520 (0.0029) [2024-06-21 15:37:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 237977600. Throughput: 0: 41135.5. Samples: 238069940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:37:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 15:37:09,981][15401] Updated weights for policy 0, policy_version 14530 (0.0044) [2024-06-21 15:37:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 238206976. Throughput: 0: 41126.6. Samples: 238308020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:37:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 15:37:13,860][15401] Updated weights for policy 0, policy_version 14540 (0.0042) [2024-06-21 15:37:18,378][15401] Updated weights for policy 0, policy_version 14550 (0.0031) [2024-06-21 15:37:18,391][15132] Fps is (10 sec: 40955.4, 60 sec: 40959.2, 300 sec: 41209.8). Total num frames: 238387200. Throughput: 0: 41113.5. Samples: 238561360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:37:18,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 15:37:21,647][15401] Updated weights for policy 0, policy_version 14560 (0.0038) [2024-06-21 15:37:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 238600192. Throughput: 0: 41033.1. Samples: 238681180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 15:37:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 15:37:26,326][15401] Updated weights for policy 0, policy_version 14570 (0.0040) [2024-06-21 15:37:28,390][15132] Fps is (10 sec: 44242.0, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 238829568. Throughput: 0: 41156.9. Samples: 238933960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 15:37:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 15:37:29,243][15401] Updated weights for policy 0, policy_version 14580 (0.0043) [2024-06-21 15:37:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 239009792. Throughput: 0: 41241.3. Samples: 239182820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 15:37:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 15:37:34,206][15401] Updated weights for policy 0, policy_version 14590 (0.0047) [2024-06-21 15:37:37,609][15401] Updated weights for policy 0, policy_version 14600 (0.0034) [2024-06-21 15:37:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 239222784. Throughput: 0: 41144.3. Samples: 239302780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:37:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 15:37:42,052][15401] Updated weights for policy 0, policy_version 14610 (0.0040) [2024-06-21 15:37:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 40961.6, 300 sec: 41265.5). Total num frames: 239435776. Throughput: 0: 41395.5. Samples: 239558540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 15:37:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 15:37:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014615_239452160.pth... [2024-06-21 15:37:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014011_229556224.pth [2024-06-21 15:37:45,322][15401] Updated weights for policy 0, policy_version 14620 (0.0035) [2024-06-21 15:37:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 239632384. Throughput: 0: 41376.9. Samples: 239808320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:37:48,394][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 15:37:49,812][15401] Updated weights for policy 0, policy_version 14630 (0.0042) [2024-06-21 15:37:53,113][15401] Updated weights for policy 0, policy_version 14640 (0.0035) [2024-06-21 15:37:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41154.7). Total num frames: 239861760. Throughput: 0: 41327.2. Samples: 239929660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:37:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 15:37:57,595][15401] Updated weights for policy 0, policy_version 14650 (0.0044) [2024-06-21 15:37:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 240025600. Throughput: 0: 41602.7. Samples: 240180140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:37:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 15:38:00,907][15401] Updated weights for policy 0, policy_version 14660 (0.0024) [2024-06-21 15:38:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41233.0, 300 sec: 41210.1). Total num frames: 240271360. Throughput: 0: 41539.2. Samples: 240430580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:38:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 15:38:05,861][15401] Updated weights for policy 0, policy_version 14670 (0.0037) [2024-06-21 15:38:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 41210.1). Total num frames: 240484352. Throughput: 0: 41597.3. Samples: 240553060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:38:08,395][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 15:38:08,617][15349] Signal inference workers to stop experience collection... (3500 times) [2024-06-21 15:38:08,618][15349] Signal inference workers to resume experience collection... (3500 times) [2024-06-21 15:38:08,632][15401] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-21 15:38:08,632][15401] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-21 15:38:08,780][15401] Updated weights for policy 0, policy_version 14680 (0.0033) [2024-06-21 15:38:13,390][15132] Fps is (10 sec: 37683.6, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 240648192. Throughput: 0: 41485.3. Samples: 240800800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 15:38:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-21 15:38:13,827][15401] Updated weights for policy 0, policy_version 14690 (0.0034) [2024-06-21 15:38:16,711][15401] Updated weights for policy 0, policy_version 14700 (0.0039) [2024-06-21 15:38:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41780.0, 300 sec: 41209.9). Total num frames: 240893952. Throughput: 0: 41334.7. Samples: 241042880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 15:38:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 15:38:21,714][15401] Updated weights for policy 0, policy_version 14710 (0.0034) [2024-06-21 15:38:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 241090560. Throughput: 0: 41592.6. Samples: 241174440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:38:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 15:38:24,393][15401] Updated weights for policy 0, policy_version 14720 (0.0028) [2024-06-21 15:38:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 241287168. Throughput: 0: 41330.6. Samples: 241418420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:38:28,399][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 15:38:29,561][15401] Updated weights for policy 0, policy_version 14730 (0.0047) [2024-06-21 15:38:32,826][15401] Updated weights for policy 0, policy_version 14740 (0.0031) [2024-06-21 15:38:33,391][15132] Fps is (10 sec: 40955.4, 60 sec: 41505.4, 300 sec: 41209.8). Total num frames: 241500160. Throughput: 0: 41235.5. Samples: 241663960. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 15:38:33,391][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 15:38:37,350][15401] Updated weights for policy 0, policy_version 14750 (0.0029) [2024-06-21 15:38:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 241713152. Throughput: 0: 41443.1. Samples: 241794600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 15:38:38,391][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 15:38:40,438][15401] Updated weights for policy 0, policy_version 14760 (0.0033) [2024-06-21 15:38:43,390][15132] Fps is (10 sec: 40964.2, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 241909760. Throughput: 0: 41395.1. Samples: 242042920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 15:38:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 15:38:45,092][15401] Updated weights for policy 0, policy_version 14770 (0.0039) [2024-06-21 15:38:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 242122752. Throughput: 0: 41289.1. Samples: 242288580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 15:38:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 15:38:48,647][15401] Updated weights for policy 0, policy_version 14780 (0.0027) [2024-06-21 15:38:53,188][15401] Updated weights for policy 0, policy_version 14790 (0.0033) [2024-06-21 15:38:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 242319360. Throughput: 0: 41333.4. Samples: 242413060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 15:38:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 15:38:56,490][15401] Updated weights for policy 0, policy_version 14800 (0.0038) [2024-06-21 15:38:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 242515968. Throughput: 0: 41317.0. Samples: 242660060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 15:38:58,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-21 15:39:00,940][15401] Updated weights for policy 0, policy_version 14810 (0.0039) [2024-06-21 15:39:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 242745344. Throughput: 0: 41397.0. Samples: 242905740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 15:39:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 15:39:04,633][15401] Updated weights for policy 0, policy_version 14820 (0.0041) [2024-06-21 15:39:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 242941952. Throughput: 0: 41376.5. Samples: 243036380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 15:39:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 15:39:08,988][15401] Updated weights for policy 0, policy_version 14830 (0.0034) [2024-06-21 15:39:12,475][15401] Updated weights for policy 0, policy_version 14840 (0.0032) [2024-06-21 15:39:13,392][15132] Fps is (10 sec: 40949.8, 60 sec: 41777.5, 300 sec: 41209.6). Total num frames: 243154944. Throughput: 0: 41352.5. Samples: 243279380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 15:39:13,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 15:39:16,722][15401] Updated weights for policy 0, policy_version 14850 (0.0052) [2024-06-21 15:39:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 243351552. Throughput: 0: 41496.6. Samples: 243531260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 15:39:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 15:39:20,703][15401] Updated weights for policy 0, policy_version 14860 (0.0029) [2024-06-21 15:39:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 243564544. Throughput: 0: 41375.9. Samples: 243656520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 15:39:23,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 15:39:24,482][15401] Updated weights for policy 0, policy_version 14870 (0.0032) [2024-06-21 15:39:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41209.9). Total num frames: 243761152. Throughput: 0: 41342.3. Samples: 243903320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:39:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 15:39:28,714][15401] Updated weights for policy 0, policy_version 14880 (0.0033) [2024-06-21 15:39:32,195][15401] Updated weights for policy 0, policy_version 14890 (0.0038) [2024-06-21 15:39:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.9, 300 sec: 41376.5). Total num frames: 244006912. Throughput: 0: 41421.7. Samples: 244152560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:39:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 15:39:36,485][15401] Updated weights for policy 0, policy_version 14900 (0.0033) [2024-06-21 15:39:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 244203520. Throughput: 0: 41644.5. Samples: 244287060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:39:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 15:39:39,911][15401] Updated weights for policy 0, policy_version 14910 (0.0037) [2024-06-21 15:39:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 244400128. Throughput: 0: 41499.9. Samples: 244527560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:39:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 15:39:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014918_244416512.pth... [2024-06-21 15:39:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014311_234471424.pth [2024-06-21 15:39:44,655][15401] Updated weights for policy 0, policy_version 14920 (0.0045) [2024-06-21 15:39:47,438][15349] Signal inference workers to stop experience collection... (3550 times) [2024-06-21 15:39:47,494][15349] Signal inference workers to resume experience collection... (3550 times) [2024-06-21 15:39:47,494][15401] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-21 15:39:47,507][15401] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-21 15:39:47,957][15401] Updated weights for policy 0, policy_version 14930 (0.0039) [2024-06-21 15:39:48,393][15132] Fps is (10 sec: 40947.1, 60 sec: 41503.9, 300 sec: 41376.1). Total num frames: 244613120. Throughput: 0: 41454.9. Samples: 244771340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 15:39:48,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 15:39:52,431][15401] Updated weights for policy 0, policy_version 14940 (0.0028) [2024-06-21 15:39:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 244826112. Throughput: 0: 41475.8. Samples: 244902800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-21 15:39:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 15:39:56,097][15401] Updated weights for policy 0, policy_version 14950 (0.0044) [2024-06-21 15:39:58,390][15132] Fps is (10 sec: 40972.6, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 245022720. Throughput: 0: 41444.9. Samples: 245144300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-21 15:39:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 15:40:00,278][15401] Updated weights for policy 0, policy_version 14960 (0.0036) [2024-06-21 15:40:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 245235712. Throughput: 0: 41569.7. Samples: 245401900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 15:40:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 15:40:03,938][15401] Updated weights for policy 0, policy_version 14970 (0.0043) [2024-06-21 15:40:08,330][15401] Updated weights for policy 0, policy_version 14980 (0.0038) [2024-06-21 15:40:08,396][15132] Fps is (10 sec: 40933.9, 60 sec: 41501.6, 300 sec: 41264.6). Total num frames: 245432320. Throughput: 0: 41461.3. Samples: 245522540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 15:40:08,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 15:40:11,701][15401] Updated weights for policy 0, policy_version 14990 (0.0040) [2024-06-21 15:40:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42054.0, 300 sec: 41487.6). Total num frames: 245678080. Throughput: 0: 41508.5. Samples: 245771200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 15:40:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 15:40:16,250][15401] Updated weights for policy 0, policy_version 15000 (0.0038) [2024-06-21 15:40:18,389][15132] Fps is (10 sec: 40986.6, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 245841920. Throughput: 0: 41469.4. Samples: 246018680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 15:40:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 15:40:19,833][15401] Updated weights for policy 0, policy_version 15010 (0.0036) [2024-06-21 15:40:23,390][15132] Fps is (10 sec: 36044.1, 60 sec: 41233.0, 300 sec: 41321.3). Total num frames: 246038528. Throughput: 0: 41088.7. Samples: 246136060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 15:40:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 15:40:24,111][15401] Updated weights for policy 0, policy_version 15020 (0.0039) [2024-06-21 15:40:27,717][15401] Updated weights for policy 0, policy_version 15030 (0.0029) [2024-06-21 15:40:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 41487.6). Total num frames: 246300672. Throughput: 0: 41448.9. Samples: 246392760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 15:40:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 15:40:32,055][15401] Updated weights for policy 0, policy_version 15040 (0.0032) [2024-06-21 15:40:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 246464512. Throughput: 0: 41585.5. Samples: 246642560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-21 15:40:33,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 15:40:35,516][15401] Updated weights for policy 0, policy_version 15050 (0.0039) [2024-06-21 15:40:38,390][15132] Fps is (10 sec: 37682.2, 60 sec: 41232.9, 300 sec: 41432.0). Total num frames: 246677504. Throughput: 0: 41266.6. Samples: 246759800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-21 15:40:38,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 15:40:39,843][15401] Updated weights for policy 0, policy_version 15060 (0.0037) [2024-06-21 15:40:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41321.6). Total num frames: 246890496. Throughput: 0: 41642.3. Samples: 247018200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-21 15:40:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 15:40:43,474][15401] Updated weights for policy 0, policy_version 15070 (0.0031) [2024-06-21 15:40:47,746][15401] Updated weights for policy 0, policy_version 15080 (0.0039) [2024-06-21 15:40:48,389][15132] Fps is (10 sec: 44237.9, 60 sec: 41781.4, 300 sec: 41432.1). Total num frames: 247119872. Throughput: 0: 41308.1. Samples: 247260760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-21 15:40:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 15:40:51,416][15401] Updated weights for policy 0, policy_version 15090 (0.0042) [2024-06-21 15:40:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 247316480. Throughput: 0: 41501.4. Samples: 247389840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-06-21 15:40:53,394][15132] Avg episode reward: [(0, '0.788')] [2024-06-21 15:40:55,472][15401] Updated weights for policy 0, policy_version 15100 (0.0040) [2024-06-21 15:40:58,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 247496704. Throughput: 0: 41575.9. Samples: 247642120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:40:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-21 15:40:59,123][15401] Updated weights for policy 0, policy_version 15110 (0.0050) [2024-06-21 15:41:01,408][15349] Signal inference workers to stop experience collection... (3600 times) [2024-06-21 15:41:01,409][15349] Signal inference workers to resume experience collection... (3600 times) [2024-06-21 15:41:01,422][15401] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-21 15:41:01,440][15401] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-21 15:41:03,235][15401] Updated weights for policy 0, policy_version 15120 (0.0038) [2024-06-21 15:41:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 247726080. Throughput: 0: 41581.3. Samples: 247889840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:41:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 15:41:07,475][15401] Updated weights for policy 0, policy_version 15130 (0.0037) [2024-06-21 15:41:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41783.6, 300 sec: 41432.1). Total num frames: 247939072. Throughput: 0: 41789.0. Samples: 248016560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 15:41:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 15:41:11,098][15401] Updated weights for policy 0, policy_version 15140 (0.0027) [2024-06-21 15:41:13,392][15132] Fps is (10 sec: 37673.6, 60 sec: 40412.2, 300 sec: 41265.1). Total num frames: 248102912. Throughput: 0: 41422.6. Samples: 248256880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 15:41:13,392][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 15:41:15,298][15401] Updated weights for policy 0, policy_version 15150 (0.0027) [2024-06-21 15:41:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 248332288. Throughput: 0: 41316.0. Samples: 248501780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 15:41:18,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 15:41:18,903][15401] Updated weights for policy 0, policy_version 15160 (0.0030) [2024-06-21 15:41:23,030][15401] Updated weights for policy 0, policy_version 15170 (0.0029) [2024-06-21 15:41:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 248545280. Throughput: 0: 41624.2. Samples: 248632880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 15:41:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 15:41:26,831][15401] Updated weights for policy 0, policy_version 15180 (0.0030) [2024-06-21 15:41:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 248758272. Throughput: 0: 41192.0. Samples: 248871840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:41:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 15:41:30,957][15401] Updated weights for policy 0, policy_version 15190 (0.0038) [2024-06-21 15:41:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 248971264. Throughput: 0: 41405.7. Samples: 249124020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:41:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 15:41:35,069][15401] Updated weights for policy 0, policy_version 15200 (0.0040) [2024-06-21 15:41:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.3, 300 sec: 41321.3). Total num frames: 249167872. Throughput: 0: 41466.7. Samples: 249255840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:41:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 15:41:38,679][15401] Updated weights for policy 0, policy_version 15210 (0.0035) [2024-06-21 15:41:42,724][15401] Updated weights for policy 0, policy_version 15220 (0.0029) [2024-06-21 15:41:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41505.9, 300 sec: 41376.5). Total num frames: 249380864. Throughput: 0: 41304.2. Samples: 249500820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 15:41:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 15:41:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015221_249380864.pth... [2024-06-21 15:41:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014615_239452160.pth [2024-06-21 15:41:46,891][15401] Updated weights for policy 0, policy_version 15230 (0.0040) [2024-06-21 15:41:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 249593856. Throughput: 0: 41325.2. Samples: 249749480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 15:41:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 15:41:50,560][15401] Updated weights for policy 0, policy_version 15240 (0.0036) [2024-06-21 15:41:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 249790464. Throughput: 0: 41251.1. Samples: 249872860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:41:53,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-21 15:41:54,622][15401] Updated weights for policy 0, policy_version 15250 (0.0043) [2024-06-21 15:41:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 249987072. Throughput: 0: 41475.2. Samples: 250123160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:41:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 15:41:58,831][15401] Updated weights for policy 0, policy_version 15260 (0.0024) [2024-06-21 15:42:02,462][15401] Updated weights for policy 0, policy_version 15270 (0.0040) [2024-06-21 15:42:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 250232832. Throughput: 0: 41553.8. Samples: 250371700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 15:42:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 15:42:06,697][15401] Updated weights for policy 0, policy_version 15280 (0.0028) [2024-06-21 15:42:08,392][15132] Fps is (10 sec: 44225.6, 60 sec: 41504.4, 300 sec: 41431.7). Total num frames: 250429440. Throughput: 0: 41497.7. Samples: 250500380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 15:42:08,393][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 15:42:10,274][15401] Updated weights for policy 0, policy_version 15290 (0.0042) [2024-06-21 15:42:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42054.0, 300 sec: 41487.8). Total num frames: 250626048. Throughput: 0: 41746.2. Samples: 250750420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 15:42:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 15:42:14,887][15401] Updated weights for policy 0, policy_version 15300 (0.0027) [2024-06-21 15:42:18,268][15401] Updated weights for policy 0, policy_version 15310 (0.0037) [2024-06-21 15:42:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 250839040. Throughput: 0: 41612.5. Samples: 250996580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 15:42:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 15:42:22,579][15401] Updated weights for policy 0, policy_version 15320 (0.0037) [2024-06-21 15:42:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.2, 300 sec: 41321.0). Total num frames: 251019264. Throughput: 0: 41458.8. Samples: 251121480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 15:42:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 15:42:26,190][15401] Updated weights for policy 0, policy_version 15330 (0.0029) [2024-06-21 15:42:27,404][15349] Signal inference workers to stop experience collection... (3650 times) [2024-06-21 15:42:27,404][15349] Signal inference workers to resume experience collection... (3650 times) [2024-06-21 15:42:27,441][15401] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-21 15:42:27,441][15401] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-21 15:42:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41777.5, 300 sec: 41542.8). Total num frames: 251265024. Throughput: 0: 41716.2. Samples: 251378140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 15:42:28,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 15:42:30,347][15401] Updated weights for policy 0, policy_version 15340 (0.0040) [2024-06-21 15:42:33,390][15132] Fps is (10 sec: 44235.7, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 251461632. Throughput: 0: 41684.3. Samples: 251625280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 15:42:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 15:42:33,821][15401] Updated weights for policy 0, policy_version 15350 (0.0029) [2024-06-21 15:42:37,956][15401] Updated weights for policy 0, policy_version 15360 (0.0035) [2024-06-21 15:42:38,390][15132] Fps is (10 sec: 39330.6, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 251658240. Throughput: 0: 41751.0. Samples: 251751660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:42:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 15:42:42,041][15401] Updated weights for policy 0, policy_version 15370 (0.0049) [2024-06-21 15:42:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 251904000. Throughput: 0: 41853.7. Samples: 252006580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:42:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 15:42:45,870][15401] Updated weights for policy 0, policy_version 15380 (0.0041) [2024-06-21 15:42:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 252084224. Throughput: 0: 41735.6. Samples: 252249800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:42:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 15:42:49,793][15401] Updated weights for policy 0, policy_version 15390 (0.0045) [2024-06-21 15:42:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 252280832. Throughput: 0: 41573.4. Samples: 252371080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:42:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 15:42:53,754][15401] Updated weights for policy 0, policy_version 15400 (0.0049) [2024-06-21 15:42:57,701][15401] Updated weights for policy 0, policy_version 15410 (0.0039) [2024-06-21 15:42:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41487.7). Total num frames: 252510208. Throughput: 0: 41679.7. Samples: 252626000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:42:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 15:43:02,159][15401] Updated weights for policy 0, policy_version 15420 (0.0030) [2024-06-21 15:43:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 252706816. Throughput: 0: 41739.6. Samples: 252874860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:43:03,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-21 15:43:05,454][15401] Updated weights for policy 0, policy_version 15430 (0.0041) [2024-06-21 15:43:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41234.8, 300 sec: 41543.2). Total num frames: 252903424. Throughput: 0: 41684.0. Samples: 252997260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 15:43:08,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 15:43:09,808][15401] Updated weights for policy 0, policy_version 15440 (0.0037) [2024-06-21 15:43:13,159][15401] Updated weights for policy 0, policy_version 15450 (0.0032) [2024-06-21 15:43:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 253149184. Throughput: 0: 41649.3. Samples: 253252260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 15:43:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 15:43:17,516][15401] Updated weights for policy 0, policy_version 15460 (0.0037) [2024-06-21 15:43:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 253313024. Throughput: 0: 41846.4. Samples: 253508360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 15:43:18,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-21 15:43:20,895][15401] Updated weights for policy 0, policy_version 15470 (0.0033) [2024-06-21 15:43:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.1, 300 sec: 41543.2). Total num frames: 253542400. Throughput: 0: 41557.8. Samples: 253621760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:43:23,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 15:43:25,453][15401] Updated weights for policy 0, policy_version 15480 (0.0029) [2024-06-21 15:43:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41507.7, 300 sec: 41543.3). Total num frames: 253755392. Throughput: 0: 41588.4. Samples: 253878060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:43:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 15:43:28,615][15401] Updated weights for policy 0, policy_version 15490 (0.0029) [2024-06-21 15:43:33,360][15401] Updated weights for policy 0, policy_version 15500 (0.0034) [2024-06-21 15:43:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 253952000. Throughput: 0: 41900.0. Samples: 254135300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:43:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 15:43:36,618][15401] Updated weights for policy 0, policy_version 15510 (0.0030) [2024-06-21 15:43:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 254164992. Throughput: 0: 41714.3. Samples: 254248220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:43:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 15:43:41,151][15401] Updated weights for policy 0, policy_version 15520 (0.0038) [2024-06-21 15:43:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 254377984. Throughput: 0: 41723.6. Samples: 254503560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:43:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 15:43:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015526_254377984.pth... [2024-06-21 15:43:43,440][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000014918_244416512.pth [2024-06-21 15:43:44,431][15401] Updated weights for policy 0, policy_version 15530 (0.0036) [2024-06-21 15:43:45,479][15349] Signal inference workers to stop experience collection... (3700 times) [2024-06-21 15:43:45,480][15349] Signal inference workers to resume experience collection... (3700 times) [2024-06-21 15:43:45,497][15401] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-21 15:43:45,498][15401] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-21 15:43:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41543.1). Total num frames: 254574592. Throughput: 0: 41485.6. Samples: 254741720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:43:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 15:43:48,964][15401] Updated weights for policy 0, policy_version 15540 (0.0035) [2024-06-21 15:43:52,581][15401] Updated weights for policy 0, policy_version 15550 (0.0039) [2024-06-21 15:43:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 254803968. Throughput: 0: 41682.5. Samples: 254872980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 15:43:53,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 15:43:56,943][15401] Updated weights for policy 0, policy_version 15560 (0.0034) [2024-06-21 15:43:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 254967808. Throughput: 0: 41443.0. Samples: 255117200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:43:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 15:44:00,545][15401] Updated weights for policy 0, policy_version 15570 (0.0041) [2024-06-21 15:44:03,393][15132] Fps is (10 sec: 39309.5, 60 sec: 41503.9, 300 sec: 41542.7). Total num frames: 255197184. Throughput: 0: 41242.9. Samples: 255364420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 15:44:03,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 15:44:04,914][15401] Updated weights for policy 0, policy_version 15580 (0.0046) [2024-06-21 15:44:08,291][15401] Updated weights for policy 0, policy_version 15590 (0.0039) [2024-06-21 15:44:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 41599.0). Total num frames: 255426560. Throughput: 0: 41621.4. Samples: 255494720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:44:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 15:44:12,848][15401] Updated weights for policy 0, policy_version 15600 (0.0039) [2024-06-21 15:44:13,390][15132] Fps is (10 sec: 42611.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 255623168. Throughput: 0: 41469.8. Samples: 255744200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:44:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 15:44:16,676][15401] Updated weights for policy 0, policy_version 15610 (0.0032) [2024-06-21 15:44:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.6, 300 sec: 41598.4). Total num frames: 255836160. Throughput: 0: 41104.0. Samples: 255985080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:44:18,393][15132] Avg episode reward: [(0, '0.228')] [2024-06-21 15:44:20,739][15401] Updated weights for policy 0, policy_version 15620 (0.0029) [2024-06-21 15:44:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 256032768. Throughput: 0: 41364.0. Samples: 256109600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:44:23,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-21 15:44:24,536][15401] Updated weights for policy 0, policy_version 15630 (0.0040) [2024-06-21 15:44:28,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 256229376. Throughput: 0: 41226.5. Samples: 256358760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-21 15:44:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 15:44:28,578][15401] Updated weights for policy 0, policy_version 15640 (0.0036) [2024-06-21 15:44:32,339][15401] Updated weights for policy 0, policy_version 15650 (0.0040) [2024-06-21 15:44:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 256458752. Throughput: 0: 41313.7. Samples: 256600840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-21 15:44:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 15:44:36,851][15401] Updated weights for policy 0, policy_version 15660 (0.0024) [2024-06-21 15:44:38,392][15132] Fps is (10 sec: 42588.6, 60 sec: 41504.4, 300 sec: 41542.8). Total num frames: 256655360. Throughput: 0: 41341.4. Samples: 256733440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-21 15:44:38,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 15:44:40,236][15401] Updated weights for policy 0, policy_version 15670 (0.0038) [2024-06-21 15:44:43,390][15132] Fps is (10 sec: 36044.9, 60 sec: 40686.8, 300 sec: 41377.0). Total num frames: 256819200. Throughput: 0: 41132.4. Samples: 256968160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:44:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 15:44:44,776][15401] Updated weights for policy 0, policy_version 15680 (0.0044) [2024-06-21 15:44:48,126][15401] Updated weights for policy 0, policy_version 15690 (0.0048) [2024-06-21 15:44:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 257064960. Throughput: 0: 41142.3. Samples: 257215700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 15:44:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 15:44:52,649][15401] Updated weights for policy 0, policy_version 15700 (0.0037) [2024-06-21 15:44:53,107][15349] Signal inference workers to stop experience collection... (3750 times) [2024-06-21 15:44:53,108][15349] Signal inference workers to resume experience collection... (3750 times) [2024-06-21 15:44:53,133][15401] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-21 15:44:53,133][15401] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-21 15:44:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 257261568. Throughput: 0: 41134.3. Samples: 257345760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 15:44:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 15:44:56,429][15401] Updated weights for policy 0, policy_version 15710 (0.0034) [2024-06-21 15:44:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 257474560. Throughput: 0: 41082.6. Samples: 257592920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 15:44:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 15:45:00,795][15401] Updated weights for policy 0, policy_version 15720 (0.0024) [2024-06-21 15:45:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41508.3, 300 sec: 41544.1). Total num frames: 257687552. Throughput: 0: 41113.3. Samples: 257835080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 15:45:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 15:45:04,259][15401] Updated weights for policy 0, policy_version 15730 (0.0040) [2024-06-21 15:45:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 257867776. Throughput: 0: 41033.6. Samples: 257956120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 15:45:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 15:45:08,570][15401] Updated weights for policy 0, policy_version 15740 (0.0024) [2024-06-21 15:45:12,579][15401] Updated weights for policy 0, policy_version 15750 (0.0034) [2024-06-21 15:45:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 258064384. Throughput: 0: 41003.7. Samples: 258203920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 15:45:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 15:45:16,377][15401] Updated weights for policy 0, policy_version 15760 (0.0041) [2024-06-21 15:45:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 40961.6, 300 sec: 41543.2). Total num frames: 258293760. Throughput: 0: 41001.8. Samples: 258445920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:45:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 15:45:20,445][15401] Updated weights for policy 0, policy_version 15770 (0.0035) [2024-06-21 15:45:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 40685.3, 300 sec: 41265.1). Total num frames: 258473984. Throughput: 0: 40963.1. Samples: 258576780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 15:45:23,392][15132] Avg episode reward: [(0, '0.282')] [2024-06-21 15:45:24,318][15401] Updated weights for policy 0, policy_version 15780 (0.0057) [2024-06-21 15:45:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 258686976. Throughput: 0: 41118.8. Samples: 258818500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:45:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 15:45:28,471][15401] Updated weights for policy 0, policy_version 15790 (0.0042) [2024-06-21 15:45:32,635][15401] Updated weights for policy 0, policy_version 15800 (0.0037) [2024-06-21 15:45:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 40413.9, 300 sec: 41376.6). Total num frames: 258883584. Throughput: 0: 41148.0. Samples: 259067360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:45:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 15:45:36,778][15401] Updated weights for policy 0, policy_version 15810 (0.0039) [2024-06-21 15:45:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41234.7, 300 sec: 41487.6). Total num frames: 259129344. Throughput: 0: 41080.3. Samples: 259194380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:45:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 15:45:40,492][15401] Updated weights for policy 0, policy_version 15820 (0.0046) [2024-06-21 15:45:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 259325952. Throughput: 0: 40962.3. Samples: 259436220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:45:43,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-21 15:45:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015828_259325952.pth... [2024-06-21 15:45:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015221_249380864.pth [2024-06-21 15:45:44,576][15401] Updated weights for policy 0, policy_version 15830 (0.0050) [2024-06-21 15:45:48,287][15401] Updated weights for policy 0, policy_version 15840 (0.0039) [2024-06-21 15:45:48,392][15132] Fps is (10 sec: 39312.5, 60 sec: 40958.5, 300 sec: 41376.2). Total num frames: 259522560. Throughput: 0: 41148.5. Samples: 259686860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 15:45:48,392][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 15:45:52,622][15401] Updated weights for policy 0, policy_version 15850 (0.0038) [2024-06-21 15:45:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 259719168. Throughput: 0: 41107.3. Samples: 259805940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:45:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 15:45:56,284][15401] Updated weights for policy 0, policy_version 15860 (0.0032) [2024-06-21 15:45:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 259948544. Throughput: 0: 41226.2. Samples: 260059100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:45:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 15:46:00,547][15401] Updated weights for policy 0, policy_version 15870 (0.0038) [2024-06-21 15:46:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 260128768. Throughput: 0: 41457.0. Samples: 260311480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:46:03,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 15:46:04,020][15401] Updated weights for policy 0, policy_version 15880 (0.0038) [2024-06-21 15:46:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40960.1, 300 sec: 41432.4). Total num frames: 260325376. Throughput: 0: 41186.7. Samples: 260430080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 15:46:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 15:46:08,410][15401] Updated weights for policy 0, policy_version 15890 (0.0031) [2024-06-21 15:46:08,413][15349] Signal inference workers to stop experience collection... (3800 times) [2024-06-21 15:46:08,413][15349] Signal inference workers to resume experience collection... (3800 times) [2024-06-21 15:46:08,459][15401] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-21 15:46:08,459][15401] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-21 15:46:11,789][15401] Updated weights for policy 0, policy_version 15900 (0.0048) [2024-06-21 15:46:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 260554752. Throughput: 0: 41264.0. Samples: 260675380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:46:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 15:46:16,201][15401] Updated weights for policy 0, policy_version 15910 (0.0048) [2024-06-21 15:46:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 260751360. Throughput: 0: 41132.5. Samples: 260918320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:46:18,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 15:46:20,138][15401] Updated weights for policy 0, policy_version 15920 (0.0038) [2024-06-21 15:46:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41507.8, 300 sec: 41376.5). Total num frames: 260964352. Throughput: 0: 41128.5. Samples: 261045160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:46:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 15:46:23,990][15401] Updated weights for policy 0, policy_version 15930 (0.0028) [2024-06-21 15:46:27,873][15401] Updated weights for policy 0, policy_version 15940 (0.0041) [2024-06-21 15:46:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 261160960. Throughput: 0: 41268.0. Samples: 261293280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:46:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 15:46:32,070][15401] Updated weights for policy 0, policy_version 15950 (0.0053) [2024-06-21 15:46:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 261373952. Throughput: 0: 41371.6. Samples: 261548480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:46:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 15:46:35,633][15401] Updated weights for policy 0, policy_version 15960 (0.0035) [2024-06-21 15:46:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 261603328. Throughput: 0: 41516.8. Samples: 261674200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:46:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 15:46:39,921][15401] Updated weights for policy 0, policy_version 15970 (0.0033) [2024-06-21 15:46:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 261799936. Throughput: 0: 41484.4. Samples: 261925900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:46:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 15:46:43,731][15401] Updated weights for policy 0, policy_version 15980 (0.0039) [2024-06-21 15:46:47,887][15401] Updated weights for policy 0, policy_version 15990 (0.0035) [2024-06-21 15:46:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41234.7, 300 sec: 41376.5). Total num frames: 261996544. Throughput: 0: 41411.5. Samples: 262175000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:46:48,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 15:46:51,801][15401] Updated weights for policy 0, policy_version 16000 (0.0036) [2024-06-21 15:46:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 262242304. Throughput: 0: 41423.9. Samples: 262294160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:46:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 15:46:55,903][15401] Updated weights for policy 0, policy_version 16010 (0.0050) [2024-06-21 15:46:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 262422528. Throughput: 0: 41508.0. Samples: 262543240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:46:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 15:46:59,468][15401] Updated weights for policy 0, policy_version 16020 (0.0044) [2024-06-21 15:47:03,390][15132] Fps is (10 sec: 36043.9, 60 sec: 41232.9, 300 sec: 41265.8). Total num frames: 262602752. Throughput: 0: 41734.4. Samples: 262796380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-21 15:47:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 15:47:03,827][15401] Updated weights for policy 0, policy_version 16030 (0.0037) [2024-06-21 15:47:07,162][15401] Updated weights for policy 0, policy_version 16040 (0.0041) [2024-06-21 15:47:08,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42047.8, 300 sec: 41431.2). Total num frames: 262848512. Throughput: 0: 41591.4. Samples: 262917040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-21 15:47:08,396][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 15:47:11,606][15401] Updated weights for policy 0, policy_version 16050 (0.0032) [2024-06-21 15:47:13,390][15132] Fps is (10 sec: 42599.2, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 263028736. Throughput: 0: 41622.5. Samples: 263166300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:47:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 15:47:15,286][15401] Updated weights for policy 0, policy_version 16060 (0.0040) [2024-06-21 15:47:18,389][15132] Fps is (10 sec: 39346.6, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 263241728. Throughput: 0: 41266.2. Samples: 263405460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 15:47:18,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 15:47:19,881][15401] Updated weights for policy 0, policy_version 16070 (0.0028) [2024-06-21 15:47:23,350][15401] Updated weights for policy 0, policy_version 16080 (0.0038) [2024-06-21 15:47:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 263454720. Throughput: 0: 41252.1. Samples: 263530540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:47:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 15:47:27,699][15401] Updated weights for policy 0, policy_version 16090 (0.0035) [2024-06-21 15:47:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41232.9, 300 sec: 41265.5). Total num frames: 263634944. Throughput: 0: 41167.4. Samples: 263778440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:47:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 15:47:31,260][15401] Updated weights for policy 0, policy_version 16100 (0.0039) [2024-06-21 15:47:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41777.5, 300 sec: 41431.8). Total num frames: 263880704. Throughput: 0: 41058.3. Samples: 264022720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:47:33,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 15:47:35,471][15401] Updated weights for policy 0, policy_version 16110 (0.0028) [2024-06-21 15:47:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 264060928. Throughput: 0: 41278.2. Samples: 264151680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 15:47:38,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 15:47:39,035][15401] Updated weights for policy 0, policy_version 16120 (0.0040) [2024-06-21 15:47:43,390][15132] Fps is (10 sec: 37691.9, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 264257536. Throughput: 0: 41176.4. Samples: 264396180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 15:47:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 15:47:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016130_264273920.pth... [2024-06-21 15:47:43,521][15401] Updated weights for policy 0, policy_version 16130 (0.0044) [2024-06-21 15:47:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015526_254377984.pth [2024-06-21 15:47:46,149][15349] Signal inference workers to stop experience collection... (3850 times) [2024-06-21 15:47:46,156][15349] Signal inference workers to resume experience collection... (3850 times) [2024-06-21 15:47:46,164][15401] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-21 15:47:46,190][15401] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-21 15:47:46,919][15401] Updated weights for policy 0, policy_version 16140 (0.0033) [2024-06-21 15:47:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 264486912. Throughput: 0: 40991.4. Samples: 264640980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:47:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 15:47:51,521][15401] Updated weights for policy 0, policy_version 16150 (0.0040) [2024-06-21 15:47:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 264683520. Throughput: 0: 41133.9. Samples: 264767800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:47:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 15:47:54,692][15401] Updated weights for policy 0, policy_version 16160 (0.0037) [2024-06-21 15:47:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 264863744. Throughput: 0: 40980.9. Samples: 265010440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 15:47:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 15:47:59,259][15401] Updated weights for policy 0, policy_version 16170 (0.0026) [2024-06-21 15:48:02,787][15401] Updated weights for policy 0, policy_version 16180 (0.0032) [2024-06-21 15:48:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 265093120. Throughput: 0: 41127.1. Samples: 265256180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 15:48:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 15:48:07,494][15401] Updated weights for policy 0, policy_version 16190 (0.0043) [2024-06-21 15:48:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 40689.6, 300 sec: 41154.0). Total num frames: 265289728. Throughput: 0: 41205.7. Samples: 265384900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 15:48:08,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 15:48:10,537][15401] Updated weights for policy 0, policy_version 16200 (0.0032) [2024-06-21 15:48:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 265519104. Throughput: 0: 41080.6. Samples: 265627060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:48:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 15:48:15,221][15401] Updated weights for policy 0, policy_version 16210 (0.0032) [2024-06-21 15:48:18,389][15132] Fps is (10 sec: 42609.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 265715712. Throughput: 0: 41234.7. Samples: 265878180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-21 15:48:18,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 15:48:18,707][15401] Updated weights for policy 0, policy_version 16220 (0.0051) [2024-06-21 15:48:23,193][15401] Updated weights for policy 0, policy_version 16230 (0.0039) [2024-06-21 15:48:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 265912320. Throughput: 0: 41023.6. Samples: 265997740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 15:48:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 15:48:26,469][15401] Updated weights for policy 0, policy_version 16240 (0.0030) [2024-06-21 15:48:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 266141696. Throughput: 0: 41231.2. Samples: 266251580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 15:48:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 15:48:31,276][15401] Updated weights for policy 0, policy_version 16250 (0.0042) [2024-06-21 15:48:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40688.5, 300 sec: 41209.9). Total num frames: 266321920. Throughput: 0: 41377.8. Samples: 266502980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 15:48:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 15:48:34,315][15401] Updated weights for policy 0, policy_version 16260 (0.0031) [2024-06-21 15:48:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 266534912. Throughput: 0: 41174.2. Samples: 266620640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:48:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 15:48:39,027][15401] Updated weights for policy 0, policy_version 16270 (0.0037) [2024-06-21 15:48:42,697][15401] Updated weights for policy 0, policy_version 16280 (0.0035) [2024-06-21 15:48:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 266747904. Throughput: 0: 41305.7. Samples: 266869200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:48:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 15:48:46,850][15401] Updated weights for policy 0, policy_version 16290 (0.0042) [2024-06-21 15:48:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 266944512. Throughput: 0: 41391.5. Samples: 267118800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:48:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 15:48:50,784][15401] Updated weights for policy 0, policy_version 16300 (0.0045) [2024-06-21 15:48:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 267157504. Throughput: 0: 41296.4. Samples: 267243140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 15:48:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 15:48:54,915][15401] Updated weights for policy 0, policy_version 16310 (0.0028) [2024-06-21 15:48:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41210.3). Total num frames: 267354112. Throughput: 0: 41586.9. Samples: 267498480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:48:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 15:48:58,569][15401] Updated weights for policy 0, policy_version 16320 (0.0033) [2024-06-21 15:49:02,799][15401] Updated weights for policy 0, policy_version 16330 (0.0040) [2024-06-21 15:49:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 267567104. Throughput: 0: 41402.2. Samples: 267741280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:49:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 15:49:06,435][15401] Updated weights for policy 0, policy_version 16340 (0.0052) [2024-06-21 15:49:08,396][15132] Fps is (10 sec: 44208.6, 60 sec: 41776.4, 300 sec: 41264.6). Total num frames: 267796480. Throughput: 0: 41601.1. Samples: 267870060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 15:49:08,397][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 15:49:10,880][15401] Updated weights for policy 0, policy_version 16350 (0.0028) [2024-06-21 15:49:11,344][15349] Signal inference workers to stop experience collection... (3900 times) [2024-06-21 15:49:11,345][15349] Signal inference workers to resume experience collection... (3900 times) [2024-06-21 15:49:11,384][15401] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-21 15:49:11,384][15401] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-21 15:49:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41210.2). Total num frames: 267993088. Throughput: 0: 41473.2. Samples: 268117880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:49:13,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 15:49:14,408][15401] Updated weights for policy 0, policy_version 16360 (0.0051) [2024-06-21 15:49:18,389][15132] Fps is (10 sec: 37707.8, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 268173312. Throughput: 0: 41239.6. Samples: 268358760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 15:49:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 15:49:18,884][15401] Updated weights for policy 0, policy_version 16370 (0.0045) [2024-06-21 15:49:22,365][15401] Updated weights for policy 0, policy_version 16380 (0.0049) [2024-06-21 15:49:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 268419072. Throughput: 0: 41409.7. Samples: 268484080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:49:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 15:49:26,662][15401] Updated weights for policy 0, policy_version 16390 (0.0039) [2024-06-21 15:49:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 268599296. Throughput: 0: 41414.8. Samples: 268732860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 15:49:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 15:49:30,296][15401] Updated weights for policy 0, policy_version 16400 (0.0040) [2024-06-21 15:49:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 41265.5). Total num frames: 268828672. Throughput: 0: 41367.6. Samples: 268980440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 15:49:33,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 15:49:34,467][15401] Updated weights for policy 0, policy_version 16410 (0.0035) [2024-06-21 15:49:37,973][15401] Updated weights for policy 0, policy_version 16420 (0.0038) [2024-06-21 15:49:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 269025280. Throughput: 0: 41452.6. Samples: 269108500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 15:49:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 15:49:42,068][15401] Updated weights for policy 0, policy_version 16430 (0.0035) [2024-06-21 15:49:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41506.3, 300 sec: 41265.5). Total num frames: 269238272. Throughput: 0: 41354.4. Samples: 269359420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 15:49:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 15:49:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016433_269238272.pth... [2024-06-21 15:49:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000015828_259325952.pth [2024-06-21 15:49:45,874][15401] Updated weights for policy 0, policy_version 16440 (0.0023) [2024-06-21 15:49:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 269451264. Throughput: 0: 41340.2. Samples: 269601600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:49:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 15:49:49,951][15401] Updated weights for policy 0, policy_version 16450 (0.0026) [2024-06-21 15:49:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 269647872. Throughput: 0: 41323.3. Samples: 269729340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:49:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 15:49:53,822][15401] Updated weights for policy 0, policy_version 16460 (0.0036) [2024-06-21 15:49:57,962][15401] Updated weights for policy 0, policy_version 16470 (0.0049) [2024-06-21 15:49:58,392][15132] Fps is (10 sec: 39312.6, 60 sec: 41504.6, 300 sec: 41209.6). Total num frames: 269844480. Throughput: 0: 41286.8. Samples: 269975880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:49:58,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 15:50:01,667][15401] Updated weights for policy 0, policy_version 16480 (0.0024) [2024-06-21 15:50:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41504.4, 300 sec: 41320.7). Total num frames: 270057472. Throughput: 0: 41341.3. Samples: 270219220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:50:03,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 15:50:05,888][15401] Updated weights for policy 0, policy_version 16490 (0.0031) [2024-06-21 15:50:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 40964.5, 300 sec: 41321.0). Total num frames: 270254080. Throughput: 0: 41354.3. Samples: 270345020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:50:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 15:50:09,550][15401] Updated weights for policy 0, policy_version 16500 (0.0038) [2024-06-21 15:50:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 270483456. Throughput: 0: 41394.3. Samples: 270595600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-21 15:50:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-21 15:50:14,269][15401] Updated weights for policy 0, policy_version 16510 (0.0030) [2024-06-21 15:50:17,402][15401] Updated weights for policy 0, policy_version 16520 (0.0034) [2024-06-21 15:50:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41432.4). Total num frames: 270696448. Throughput: 0: 41197.8. Samples: 270834240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-21 15:50:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 15:50:22,014][15401] Updated weights for policy 0, policy_version 16530 (0.0028) [2024-06-21 15:50:23,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 270860288. Throughput: 0: 41135.9. Samples: 270959620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 15:50:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 15:50:25,467][15401] Updated weights for policy 0, policy_version 16540 (0.0047) [2024-06-21 15:50:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 271089664. Throughput: 0: 41119.9. Samples: 271209820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 15:50:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 15:50:29,723][15401] Updated weights for policy 0, policy_version 16550 (0.0040) [2024-06-21 15:50:33,304][15401] Updated weights for policy 0, policy_version 16560 (0.0037) [2024-06-21 15:50:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 41507.8, 300 sec: 41321.0). Total num frames: 271319040. Throughput: 0: 41287.7. Samples: 271459540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:50:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 15:50:37,601][15401] Updated weights for policy 0, policy_version 16570 (0.0038) [2024-06-21 15:50:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 271482880. Throughput: 0: 41268.6. Samples: 271586420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:50:38,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 15:50:41,214][15401] Updated weights for policy 0, policy_version 16580 (0.0044) [2024-06-21 15:50:43,394][15132] Fps is (10 sec: 40941.1, 60 sec: 41503.0, 300 sec: 41376.2). Total num frames: 271728640. Throughput: 0: 41213.2. Samples: 271830560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:50:43,395][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 15:50:46,059][15401] Updated weights for policy 0, policy_version 16590 (0.0039) [2024-06-21 15:50:48,390][15132] Fps is (10 sec: 42597.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 271908864. Throughput: 0: 41249.6. Samples: 272075360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:50:48,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 15:50:49,689][15401] Updated weights for policy 0, policy_version 16600 (0.0044) [2024-06-21 15:50:53,390][15132] Fps is (10 sec: 39339.1, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 272121856. Throughput: 0: 41236.8. Samples: 272200680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:50:53,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-21 15:50:53,738][15401] Updated weights for policy 0, policy_version 16610 (0.0041) [2024-06-21 15:50:55,027][15349] Signal inference workers to stop experience collection... (3950 times) [2024-06-21 15:50:55,080][15401] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-21 15:50:55,081][15349] Signal inference workers to resume experience collection... (3950 times) [2024-06-21 15:50:55,099][15401] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-21 15:50:57,629][15401] Updated weights for policy 0, policy_version 16620 (0.0037) [2024-06-21 15:50:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41780.8, 300 sec: 41432.1). Total num frames: 272351232. Throughput: 0: 41378.5. Samples: 272457640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 15:50:58,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 15:51:01,608][15401] Updated weights for policy 0, policy_version 16630 (0.0047) [2024-06-21 15:51:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41780.9, 300 sec: 41487.6). Total num frames: 272564224. Throughput: 0: 41565.8. Samples: 272704700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 15:51:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 15:51:05,150][15401] Updated weights for policy 0, policy_version 16640 (0.0032) [2024-06-21 15:51:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 272760832. Throughput: 0: 41593.8. Samples: 272831340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 15:51:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 15:51:09,601][15401] Updated weights for policy 0, policy_version 16650 (0.0049) [2024-06-21 15:51:12,944][15401] Updated weights for policy 0, policy_version 16660 (0.0026) [2024-06-21 15:51:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41376.6). Total num frames: 272957440. Throughput: 0: 41692.5. Samples: 273085980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 15:51:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 15:51:17,429][15401] Updated weights for policy 0, policy_version 16670 (0.0030) [2024-06-21 15:51:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 273170432. Throughput: 0: 41648.4. Samples: 273333720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 15:51:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 15:51:21,102][15401] Updated weights for policy 0, policy_version 16680 (0.0032) [2024-06-21 15:51:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 273367040. Throughput: 0: 41519.4. Samples: 273454800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-21 15:51:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 15:51:25,196][15401] Updated weights for policy 0, policy_version 16690 (0.0037) [2024-06-21 15:51:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 273580032. Throughput: 0: 41658.4. Samples: 273705000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-21 15:51:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 15:51:28,855][15401] Updated weights for policy 0, policy_version 16700 (0.0037) [2024-06-21 15:51:33,066][15401] Updated weights for policy 0, policy_version 16710 (0.0041) [2024-06-21 15:51:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 273776640. Throughput: 0: 41725.4. Samples: 273953000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-21 15:51:33,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 15:51:36,929][15401] Updated weights for policy 0, policy_version 16720 (0.0043) [2024-06-21 15:51:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41777.5, 300 sec: 41320.7). Total num frames: 273989632. Throughput: 0: 41632.5. Samples: 274074240. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-21 15:51:38,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 15:51:40,834][15401] Updated weights for policy 0, policy_version 16730 (0.0027) [2024-06-21 15:51:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41236.1, 300 sec: 41376.5). Total num frames: 274202624. Throughput: 0: 41383.1. Samples: 274319880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-21 15:51:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 15:51:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016736_274202624.pth... [2024-06-21 15:51:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016130_264273920.pth [2024-06-21 15:51:45,013][15401] Updated weights for policy 0, policy_version 16740 (0.0039) [2024-06-21 15:51:48,390][15132] Fps is (10 sec: 40969.6, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 274399232. Throughput: 0: 41564.3. Samples: 274575100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:51:48,392][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 15:51:48,696][15401] Updated weights for policy 0, policy_version 16750 (0.0035) [2024-06-21 15:51:52,895][15401] Updated weights for policy 0, policy_version 16760 (0.0039) [2024-06-21 15:51:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 274595840. Throughput: 0: 41307.5. Samples: 274690180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 15:51:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 15:51:57,017][15401] Updated weights for policy 0, policy_version 16770 (0.0041) [2024-06-21 15:51:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 274825216. Throughput: 0: 41166.2. Samples: 274938460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 15:51:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 15:52:01,070][15401] Updated weights for policy 0, policy_version 16780 (0.0037) [2024-06-21 15:52:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 41210.8). Total num frames: 275005440. Throughput: 0: 41193.3. Samples: 275187420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 15:52:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 15:52:04,795][15401] Updated weights for policy 0, policy_version 16790 (0.0042) [2024-06-21 15:52:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 275218432. Throughput: 0: 41163.2. Samples: 275307140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 15:52:08,390][15132] Avg episode reward: [(0, '0.092')] [2024-06-21 15:52:08,879][15401] Updated weights for policy 0, policy_version 16800 (0.0046) [2024-06-21 15:52:12,537][15401] Updated weights for policy 0, policy_version 16810 (0.0039) [2024-06-21 15:52:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 275431424. Throughput: 0: 41194.6. Samples: 275558760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:52:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 15:52:16,727][15401] Updated weights for policy 0, policy_version 16820 (0.0038) [2024-06-21 15:52:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 275644416. Throughput: 0: 41196.5. Samples: 275806840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:52:18,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 15:52:20,425][15401] Updated weights for policy 0, policy_version 16830 (0.0043) [2024-06-21 15:52:21,326][15349] Signal inference workers to stop experience collection... (4000 times) [2024-06-21 15:52:21,374][15401] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-21 15:52:21,384][15349] Signal inference workers to resume experience collection... (4000 times) [2024-06-21 15:52:21,387][15401] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-21 15:52:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 275857408. Throughput: 0: 41225.6. Samples: 275929300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 15:52:23,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 15:52:24,691][15401] Updated weights for policy 0, policy_version 16840 (0.0034) [2024-06-21 15:52:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41265.8). Total num frames: 276054016. Throughput: 0: 41284.2. Samples: 276177660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 15:52:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 15:52:28,629][15401] Updated weights for policy 0, policy_version 16850 (0.0045) [2024-06-21 15:52:32,606][15401] Updated weights for policy 0, policy_version 16860 (0.0045) [2024-06-21 15:52:33,389][15132] Fps is (10 sec: 37683.8, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 276234240. Throughput: 0: 41150.3. Samples: 276426860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 15:52:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 15:52:36,398][15401] Updated weights for policy 0, policy_version 16870 (0.0036) [2024-06-21 15:52:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41234.6, 300 sec: 41376.5). Total num frames: 276463616. Throughput: 0: 41272.8. Samples: 276547460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 15:52:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 15:52:40,659][15401] Updated weights for policy 0, policy_version 16880 (0.0032) [2024-06-21 15:52:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 276660224. Throughput: 0: 41379.4. Samples: 276800540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 15:52:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 15:52:44,086][15401] Updated weights for policy 0, policy_version 16890 (0.0040) [2024-06-21 15:52:48,389][15132] Fps is (10 sec: 39322.6, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 276856832. Throughput: 0: 41405.0. Samples: 277050640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 15:52:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 15:52:48,604][15401] Updated weights for policy 0, policy_version 16900 (0.0045) [2024-06-21 15:52:52,122][15401] Updated weights for policy 0, policy_version 16910 (0.0027) [2024-06-21 15:52:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 277102592. Throughput: 0: 41489.6. Samples: 277174180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 15:52:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 15:52:56,489][15401] Updated weights for policy 0, policy_version 16920 (0.0030) [2024-06-21 15:52:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 277299200. Throughput: 0: 41477.9. Samples: 277425260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:52:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 15:52:59,980][15401] Updated weights for policy 0, policy_version 16930 (0.0037) [2024-06-21 15:53:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41376.9). Total num frames: 277495808. Throughput: 0: 41368.5. Samples: 277668420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:53:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 15:53:04,278][15401] Updated weights for policy 0, policy_version 16940 (0.0054) [2024-06-21 15:53:08,081][15401] Updated weights for policy 0, policy_version 16950 (0.0043) [2024-06-21 15:53:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 277708800. Throughput: 0: 41334.3. Samples: 277789340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 15:53:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 15:53:12,146][15401] Updated weights for policy 0, policy_version 16960 (0.0035) [2024-06-21 15:53:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 277889024. Throughput: 0: 41435.8. Samples: 278042280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:53:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-21 15:53:15,685][15401] Updated weights for policy 0, policy_version 16970 (0.0057) [2024-06-21 15:53:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 278134784. Throughput: 0: 41356.3. Samples: 278287900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 15:53:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 15:53:20,763][15401] Updated weights for policy 0, policy_version 16980 (0.0032) [2024-06-21 15:53:23,389][15132] Fps is (10 sec: 45876.2, 60 sec: 41506.3, 300 sec: 41376.5). Total num frames: 278347776. Throughput: 0: 41622.4. Samples: 278420460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:53:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 15:53:23,604][15401] Updated weights for policy 0, policy_version 16990 (0.0040) [2024-06-21 15:53:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 278511616. Throughput: 0: 41541.4. Samples: 278669900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:53:28,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 15:53:28,521][15401] Updated weights for policy 0, policy_version 17000 (0.0045) [2024-06-21 15:53:31,238][15401] Updated weights for policy 0, policy_version 17010 (0.0037) [2024-06-21 15:53:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 41543.1). Total num frames: 278790144. Throughput: 0: 41343.8. Samples: 278911120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:53:33,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 15:53:36,410][15401] Updated weights for policy 0, policy_version 17020 (0.0044) [2024-06-21 15:53:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 278953984. Throughput: 0: 41613.4. Samples: 279046780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 15:53:38,395][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 15:53:39,255][15401] Updated weights for policy 0, policy_version 17030 (0.0033) [2024-06-21 15:53:43,390][15132] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 279150592. Throughput: 0: 41443.9. Samples: 279290240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 15:53:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 15:53:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017038_279150592.pth... [2024-06-21 15:53:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016433_269238272.pth [2024-06-21 15:53:44,095][15401] Updated weights for policy 0, policy_version 17040 (0.0030) [2024-06-21 15:53:47,268][15401] Updated weights for policy 0, policy_version 17050 (0.0034) [2024-06-21 15:53:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 41487.6). Total num frames: 279396352. Throughput: 0: 41461.8. Samples: 279534200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:53:48,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-21 15:53:51,782][15401] Updated weights for policy 0, policy_version 17060 (0.0034) [2024-06-21 15:53:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 279543808. Throughput: 0: 41708.9. Samples: 279666240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:53:53,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 15:53:55,223][15401] Updated weights for policy 0, policy_version 17070 (0.0043) [2024-06-21 15:53:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 279789568. Throughput: 0: 41419.7. Samples: 279906160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:53:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 15:53:59,819][15401] Updated weights for policy 0, policy_version 17080 (0.0033) [2024-06-21 15:54:02,936][15401] Updated weights for policy 0, policy_version 17090 (0.0044) [2024-06-21 15:54:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42052.3, 300 sec: 41433.0). Total num frames: 280018944. Throughput: 0: 41588.1. Samples: 280159360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 15:54:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 15:54:07,726][15401] Updated weights for policy 0, policy_version 17100 (0.0042) [2024-06-21 15:54:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 280166400. Throughput: 0: 41482.6. Samples: 280287180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 15:54:08,390][15132] Avg episode reward: [(0, '0.144')] [2024-06-21 15:54:10,105][15349] Signal inference workers to stop experience collection... (4050 times) [2024-06-21 15:54:10,126][15401] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-21 15:54:10,168][15349] Signal inference workers to resume experience collection... (4050 times) [2024-06-21 15:54:10,168][15401] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-21 15:54:10,741][15401] Updated weights for policy 0, policy_version 17110 (0.0047) [2024-06-21 15:54:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 41543.2). Total num frames: 280428544. Throughput: 0: 41263.1. Samples: 280526740. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-06-21 15:54:13,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 15:54:16,033][15401] Updated weights for policy 0, policy_version 17120 (0.0030) [2024-06-21 15:54:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 280625152. Throughput: 0: 41581.8. Samples: 280782300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-06-21 15:54:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 15:54:18,802][15401] Updated weights for policy 0, policy_version 17130 (0.0049) [2024-06-21 15:54:23,389][15132] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 280805376. Throughput: 0: 41273.4. Samples: 280904080. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-06-21 15:54:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 15:54:23,829][15401] Updated weights for policy 0, policy_version 17140 (0.0054) [2024-06-21 15:54:26,504][15401] Updated weights for policy 0, policy_version 17150 (0.0042) [2024-06-21 15:54:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 41488.0). Total num frames: 281067520. Throughput: 0: 41524.2. Samples: 281158820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:54:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 15:54:31,884][15401] Updated weights for policy 0, policy_version 17160 (0.0041) [2024-06-21 15:54:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40687.0, 300 sec: 41376.5). Total num frames: 281231360. Throughput: 0: 41781.3. Samples: 281414360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-21 15:54:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-21 15:54:34,436][15401] Updated weights for policy 0, policy_version 17170 (0.0035) [2024-06-21 15:54:38,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 281444352. Throughput: 0: 41356.8. Samples: 281527300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:54:38,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 15:54:39,705][15401] Updated weights for policy 0, policy_version 17180 (0.0050) [2024-06-21 15:54:42,701][15401] Updated weights for policy 0, policy_version 17190 (0.0031) [2024-06-21 15:54:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42323.7, 300 sec: 41487.3). Total num frames: 281690112. Throughput: 0: 41672.9. Samples: 281781540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:54:43,393][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 15:54:47,585][15401] Updated weights for policy 0, policy_version 17200 (0.0028) [2024-06-21 15:54:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 281837568. Throughput: 0: 41744.9. Samples: 282037880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 15:54:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 15:54:50,551][15401] Updated weights for policy 0, policy_version 17210 (0.0033) [2024-06-21 15:54:53,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 41488.0). Total num frames: 282083328. Throughput: 0: 41350.3. Samples: 282147940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-21 15:54:53,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 15:54:55,453][15401] Updated weights for policy 0, policy_version 17220 (0.0039) [2024-06-21 15:54:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 41506.1, 300 sec: 41432.4). Total num frames: 282279936. Throughput: 0: 41588.3. Samples: 282398220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-21 15:54:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 15:54:58,503][15401] Updated weights for policy 0, policy_version 17230 (0.0049) [2024-06-21 15:55:03,249][15401] Updated weights for policy 0, policy_version 17240 (0.0041) [2024-06-21 15:55:03,396][15132] Fps is (10 sec: 37658.8, 60 sec: 40682.6, 300 sec: 41375.6). Total num frames: 282460160. Throughput: 0: 41542.6. Samples: 282651980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 15:55:03,397][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 15:55:06,292][15401] Updated weights for policy 0, policy_version 17250 (0.0029) [2024-06-21 15:55:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 41432.1). Total num frames: 282705920. Throughput: 0: 41405.3. Samples: 282767320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 15:55:08,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-21 15:55:10,995][15401] Updated weights for policy 0, policy_version 17260 (0.0041) [2024-06-21 15:55:13,389][15132] Fps is (10 sec: 42625.8, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 282886144. Throughput: 0: 41378.6. Samples: 283020860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 15:55:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 15:55:14,180][15401] Updated weights for policy 0, policy_version 17270 (0.0035) [2024-06-21 15:55:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 283082752. Throughput: 0: 41281.0. Samples: 283272000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:55:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 15:55:18,838][15401] Updated weights for policy 0, policy_version 17280 (0.0043) [2024-06-21 15:55:22,345][15401] Updated weights for policy 0, policy_version 17290 (0.0038) [2024-06-21 15:55:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 283328512. Throughput: 0: 41394.3. Samples: 283390040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 15:55:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 15:55:26,733][15401] Updated weights for policy 0, policy_version 17300 (0.0032) [2024-06-21 15:55:27,979][15349] Signal inference workers to stop experience collection... (4100 times) [2024-06-21 15:55:28,022][15401] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-21 15:55:28,030][15349] Signal inference workers to resume experience collection... (4100 times) [2024-06-21 15:55:28,044][15401] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-21 15:55:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 283525120. Throughput: 0: 41340.1. Samples: 283641740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 15:55:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 15:55:30,126][15401] Updated weights for policy 0, policy_version 17310 (0.0036) [2024-06-21 15:55:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 283721728. Throughput: 0: 41181.3. Samples: 283891040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 15:55:33,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-21 15:55:34,490][15401] Updated weights for policy 0, policy_version 17320 (0.0037) [2024-06-21 15:55:38,243][15401] Updated weights for policy 0, policy_version 17330 (0.0047) [2024-06-21 15:55:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41777.6, 300 sec: 41432.4). Total num frames: 283951104. Throughput: 0: 41369.7. Samples: 284009680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 15:55:38,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 15:55:42,165][15401] Updated weights for policy 0, policy_version 17340 (0.0033) [2024-06-21 15:55:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 40961.6, 300 sec: 41487.6). Total num frames: 284147712. Throughput: 0: 41388.5. Samples: 284260700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:55:43,399][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 15:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017343_284147712.pth... [2024-06-21 15:55:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000016736_274202624.pth [2024-06-21 15:55:46,009][15401] Updated weights for policy 0, policy_version 17350 (0.0036) [2024-06-21 15:55:48,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 284344320. Throughput: 0: 41409.5. Samples: 284515140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 15:55:48,399][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 15:55:49,867][15401] Updated weights for policy 0, policy_version 17360 (0.0028) [2024-06-21 15:55:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 284557312. Throughput: 0: 41598.2. Samples: 284639240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:55:53,399][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 15:55:53,759][15401] Updated weights for policy 0, policy_version 17370 (0.0024) [2024-06-21 15:55:57,803][15401] Updated weights for policy 0, policy_version 17380 (0.0043) [2024-06-21 15:55:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 284770304. Throughput: 0: 41744.9. Samples: 284899380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:55:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 15:56:01,543][15401] Updated weights for policy 0, policy_version 17390 (0.0056) [2024-06-21 15:56:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42056.7, 300 sec: 41432.1). Total num frames: 284983296. Throughput: 0: 41591.0. Samples: 285143600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 15:56:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 15:56:05,471][15401] Updated weights for policy 0, policy_version 17400 (0.0042) [2024-06-21 15:56:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 285179904. Throughput: 0: 41793.2. Samples: 285270740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:56:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 15:56:09,598][15401] Updated weights for policy 0, policy_version 17410 (0.0046) [2024-06-21 15:56:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 285392896. Throughput: 0: 41813.2. Samples: 285523340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 15:56:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 15:56:13,404][15401] Updated weights for policy 0, policy_version 17420 (0.0032) [2024-06-21 15:56:17,436][15401] Updated weights for policy 0, policy_version 17430 (0.0038) [2024-06-21 15:56:18,390][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 285589504. Throughput: 0: 41764.4. Samples: 285770440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:56:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 15:56:21,500][15401] Updated weights for policy 0, policy_version 17440 (0.0035) [2024-06-21 15:56:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41504.5, 300 sec: 41487.3). Total num frames: 285818880. Throughput: 0: 41864.0. Samples: 285893560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:56:23,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 15:56:25,248][15401] Updated weights for policy 0, policy_version 17450 (0.0023) [2024-06-21 15:56:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 286031872. Throughput: 0: 41836.9. Samples: 286143360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 15:56:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 15:56:29,508][15401] Updated weights for policy 0, policy_version 17460 (0.0034) [2024-06-21 15:56:32,946][15401] Updated weights for policy 0, policy_version 17470 (0.0036) [2024-06-21 15:56:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41779.2, 300 sec: 41488.0). Total num frames: 286228480. Throughput: 0: 41803.6. Samples: 286396300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:56:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-21 15:56:37,327][15401] Updated weights for policy 0, policy_version 17480 (0.0036) [2024-06-21 15:56:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42054.0, 300 sec: 41598.7). Total num frames: 286474240. Throughput: 0: 41819.6. Samples: 286521120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 15:56:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 15:56:40,867][15401] Updated weights for policy 0, policy_version 17490 (0.0038) [2024-06-21 15:56:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 286670848. Throughput: 0: 41681.7. Samples: 286775060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 15:56:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 15:56:45,341][15401] Updated weights for policy 0, policy_version 17500 (0.0028) [2024-06-21 15:56:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42050.6, 300 sec: 41598.4). Total num frames: 286867456. Throughput: 0: 41804.9. Samples: 287024920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 15:56:48,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 15:56:48,626][15401] Updated weights for policy 0, policy_version 17510 (0.0025) [2024-06-21 15:56:53,018][15401] Updated weights for policy 0, policy_version 17520 (0.0033) [2024-06-21 15:56:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 287047680. Throughput: 0: 41672.9. Samples: 287146020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 15:56:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 15:56:56,774][15401] Updated weights for policy 0, policy_version 17530 (0.0025) [2024-06-21 15:56:58,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42052.1, 300 sec: 41654.2). Total num frames: 287293440. Throughput: 0: 41801.7. Samples: 287404420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-21 15:56:58,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 15:57:00,858][15401] Updated weights for policy 0, policy_version 17540 (0.0046) [2024-06-21 15:57:01,678][15349] Signal inference workers to stop experience collection... (4150 times) [2024-06-21 15:57:01,684][15349] Signal inference workers to resume experience collection... (4150 times) [2024-06-21 15:57:01,710][15401] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-21 15:57:01,710][15401] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-21 15:57:03,395][15132] Fps is (10 sec: 44212.3, 60 sec: 41775.3, 300 sec: 41597.9). Total num frames: 287490048. Throughput: 0: 41686.8. Samples: 287646580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-21 15:57:03,396][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 15:57:04,582][15401] Updated weights for policy 0, policy_version 17550 (0.0042) [2024-06-21 15:57:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 287670272. Throughput: 0: 41629.3. Samples: 287766780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:57:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 15:57:08,581][15401] Updated weights for policy 0, policy_version 17560 (0.0034) [2024-06-21 15:57:12,627][15401] Updated weights for policy 0, policy_version 17570 (0.0041) [2024-06-21 15:57:13,390][15132] Fps is (10 sec: 40980.3, 60 sec: 41778.7, 300 sec: 41543.1). Total num frames: 287899648. Throughput: 0: 41769.7. Samples: 288023020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:57:13,391][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 15:57:16,259][15401] Updated weights for policy 0, policy_version 17580 (0.0038) [2024-06-21 15:57:18,394][15132] Fps is (10 sec: 42581.5, 60 sec: 41776.4, 300 sec: 41487.1). Total num frames: 288096256. Throughput: 0: 41708.6. Samples: 288273360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 15:57:18,394][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 15:57:20,325][15401] Updated weights for policy 0, policy_version 17590 (0.0026) [2024-06-21 15:57:23,390][15132] Fps is (10 sec: 42601.0, 60 sec: 41780.8, 300 sec: 41598.7). Total num frames: 288325632. Throughput: 0: 41720.3. Samples: 288398540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 15:57:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 15:57:23,932][15401] Updated weights for policy 0, policy_version 17600 (0.0034) [2024-06-21 15:57:28,261][15401] Updated weights for policy 0, policy_version 17610 (0.0031) [2024-06-21 15:57:28,389][15132] Fps is (10 sec: 42615.9, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 288522240. Throughput: 0: 41603.2. Samples: 288647200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 15:57:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 15:57:32,027][15401] Updated weights for policy 0, policy_version 17620 (0.0032) [2024-06-21 15:57:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 288718848. Throughput: 0: 41616.8. Samples: 288897580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:57:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 15:57:35,975][15401] Updated weights for policy 0, policy_version 17630 (0.0040) [2024-06-21 15:57:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 288948224. Throughput: 0: 41567.2. Samples: 289016540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:57:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 15:57:40,191][15401] Updated weights for policy 0, policy_version 17640 (0.0041) [2024-06-21 15:57:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 289128448. Throughput: 0: 41238.7. Samples: 289260160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 15:57:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 15:57:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017648_289144832.pth... [2024-06-21 15:57:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017038_279150592.pth [2024-06-21 15:57:44,136][15401] Updated weights for policy 0, policy_version 17650 (0.0049) [2024-06-21 15:57:47,854][15401] Updated weights for policy 0, policy_version 17660 (0.0033) [2024-06-21 15:57:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41234.7, 300 sec: 41487.6). Total num frames: 289341440. Throughput: 0: 41371.9. Samples: 289508080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 15:57:48,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 15:57:51,999][15401] Updated weights for policy 0, policy_version 17670 (0.0037) [2024-06-21 15:57:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 289554432. Throughput: 0: 41569.5. Samples: 289637400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 15:57:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 15:57:55,614][15401] Updated weights for policy 0, policy_version 17680 (0.0034) [2024-06-21 15:57:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 40958.4, 300 sec: 41542.8). Total num frames: 289751040. Throughput: 0: 41274.9. Samples: 289880460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:57:58,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 15:57:59,906][15401] Updated weights for policy 0, policy_version 17690 (0.0039) [2024-06-21 15:58:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41510.0, 300 sec: 41598.7). Total num frames: 289980416. Throughput: 0: 41214.8. Samples: 290127860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:58:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 15:58:03,654][15401] Updated weights for policy 0, policy_version 17700 (0.0032) [2024-06-21 15:58:07,865][15401] Updated weights for policy 0, policy_version 17710 (0.0038) [2024-06-21 15:58:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 290177024. Throughput: 0: 41315.2. Samples: 290257720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 15:58:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 15:58:11,362][15401] Updated weights for policy 0, policy_version 17720 (0.0047) [2024-06-21 15:58:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.6, 300 sec: 41487.6). Total num frames: 290373632. Throughput: 0: 41334.3. Samples: 290507240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 15:58:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 15:58:15,441][15401] Updated weights for policy 0, policy_version 17730 (0.0038) [2024-06-21 15:58:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41782.0, 300 sec: 41543.1). Total num frames: 290603008. Throughput: 0: 41258.6. Samples: 290754220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 15:58:18,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-21 15:58:18,972][15401] Updated weights for policy 0, policy_version 17740 (0.0034) [2024-06-21 15:58:23,270][15401] Updated weights for policy 0, policy_version 17750 (0.0031) [2024-06-21 15:58:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 290816000. Throughput: 0: 41503.1. Samples: 290884180. Policy #0 lag: (min: 2.0, avg: 10.4, max: 23.0) [2024-06-21 15:58:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 15:58:26,938][15401] Updated weights for policy 0, policy_version 17760 (0.0029) [2024-06-21 15:58:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 290996224. Throughput: 0: 41437.4. Samples: 291124840. Policy #0 lag: (min: 2.0, avg: 10.4, max: 23.0) [2024-06-21 15:58:28,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-21 15:58:31,249][15401] Updated weights for policy 0, policy_version 17770 (0.0039) [2024-06-21 15:58:33,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.6, 300 sec: 41598.4). Total num frames: 291225600. Throughput: 0: 41622.7. Samples: 291381200. Policy #0 lag: (min: 2.0, avg: 10.4, max: 23.0) [2024-06-21 15:58:33,392][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 15:58:34,896][15401] Updated weights for policy 0, policy_version 17780 (0.0052) [2024-06-21 15:58:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 291389440. Throughput: 0: 41546.2. Samples: 291506980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 15:58:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 15:58:39,413][15401] Updated weights for policy 0, policy_version 17790 (0.0026) [2024-06-21 15:58:42,721][15401] Updated weights for policy 0, policy_version 17800 (0.0038) [2024-06-21 15:58:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 291635200. Throughput: 0: 41587.5. Samples: 291751800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 15:58:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 15:58:46,206][15349] Signal inference workers to stop experience collection... (4200 times) [2024-06-21 15:58:46,206][15349] Signal inference workers to resume experience collection... (4200 times) [2024-06-21 15:58:46,223][15401] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-21 15:58:46,224][15401] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-21 15:58:47,361][15401] Updated weights for policy 0, policy_version 17810 (0.0039) [2024-06-21 15:58:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 291831808. Throughput: 0: 41631.2. Samples: 292001260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 15:58:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 15:58:51,036][15401] Updated weights for policy 0, policy_version 17820 (0.0040) [2024-06-21 15:58:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 292028416. Throughput: 0: 41432.0. Samples: 292122160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 15:58:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 15:58:55,163][15401] Updated weights for policy 0, policy_version 17830 (0.0037) [2024-06-21 15:58:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42053.9, 300 sec: 41543.1). Total num frames: 292274176. Throughput: 0: 41420.3. Samples: 292371160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 15:58:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 15:58:58,750][15401] Updated weights for policy 0, policy_version 17840 (0.0036) [2024-06-21 15:59:03,069][15401] Updated weights for policy 0, policy_version 17850 (0.0041) [2024-06-21 15:59:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 292454400. Throughput: 0: 41475.2. Samples: 292620600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:59:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 15:59:06,804][15401] Updated weights for policy 0, policy_version 17860 (0.0028) [2024-06-21 15:59:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 292667392. Throughput: 0: 41371.1. Samples: 292745880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:59:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 15:59:10,723][15401] Updated weights for policy 0, policy_version 17870 (0.0039) [2024-06-21 15:59:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 292880384. Throughput: 0: 41615.2. Samples: 292997520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-21 15:59:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 15:59:14,607][15401] Updated weights for policy 0, policy_version 17880 (0.0047) [2024-06-21 15:59:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 293076992. Throughput: 0: 41419.5. Samples: 293244980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 15:59:18,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 15:59:18,992][15401] Updated weights for policy 0, policy_version 17890 (0.0035) [2024-06-21 15:59:22,655][15401] Updated weights for policy 0, policy_version 17900 (0.0038) [2024-06-21 15:59:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 293306368. Throughput: 0: 41310.2. Samples: 293365940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 15:59:23,392][15132] Avg episode reward: [(0, '0.296')] [2024-06-21 15:59:26,783][15401] Updated weights for policy 0, policy_version 17910 (0.0038) [2024-06-21 15:59:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 293486592. Throughput: 0: 41488.4. Samples: 293618780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 15:59:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 15:59:30,615][15401] Updated weights for policy 0, policy_version 17920 (0.0041) [2024-06-21 15:59:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40961.5, 300 sec: 41487.6). Total num frames: 293683200. Throughput: 0: 41530.1. Samples: 293870120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 15:59:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 15:59:34,935][15401] Updated weights for policy 0, policy_version 17930 (0.0035) [2024-06-21 15:59:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41432.4). Total num frames: 293912576. Throughput: 0: 41631.0. Samples: 293995560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 15:59:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-21 15:59:38,432][15401] Updated weights for policy 0, policy_version 17940 (0.0040) [2024-06-21 15:59:42,858][15401] Updated weights for policy 0, policy_version 17950 (0.0031) [2024-06-21 15:59:43,389][15132] Fps is (10 sec: 44237.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 294125568. Throughput: 0: 41629.5. Samples: 294244480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:59:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 15:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017952_294125568.pth... [2024-06-21 15:59:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017343_284147712.pth [2024-06-21 15:59:46,296][15401] Updated weights for policy 0, policy_version 17960 (0.0039) [2024-06-21 15:59:48,393][15132] Fps is (10 sec: 40944.3, 60 sec: 41503.4, 300 sec: 41487.1). Total num frames: 294322176. Throughput: 0: 41555.9. Samples: 294490780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 15:59:48,394][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 15:59:50,635][15401] Updated weights for policy 0, policy_version 17970 (0.0039) [2024-06-21 15:59:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 294535168. Throughput: 0: 41577.8. Samples: 294616880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:59:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 15:59:54,075][15401] Updated weights for policy 0, policy_version 17980 (0.0044) [2024-06-21 15:59:58,389][15132] Fps is (10 sec: 40976.3, 60 sec: 40960.1, 300 sec: 41599.6). Total num frames: 294731776. Throughput: 0: 41530.2. Samples: 294866380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 15:59:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 15:59:58,585][15401] Updated weights for policy 0, policy_version 17990 (0.0034) [2024-06-21 16:00:02,159][15401] Updated weights for policy 0, policy_version 18000 (0.0051) [2024-06-21 16:00:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 294961152. Throughput: 0: 41206.4. Samples: 295099260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 16:00:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 16:00:06,405][15401] Updated weights for policy 0, policy_version 18010 (0.0030) [2024-06-21 16:00:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 295124992. Throughput: 0: 41495.6. Samples: 295233240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 16:00:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 16:00:10,138][15401] Updated weights for policy 0, policy_version 18020 (0.0037) [2024-06-21 16:00:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 295354368. Throughput: 0: 41344.9. Samples: 295479300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 16:00:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 16:00:13,996][15401] Updated weights for policy 0, policy_version 18030 (0.0031) [2024-06-21 16:00:17,963][15401] Updated weights for policy 0, policy_version 18040 (0.0036) [2024-06-21 16:00:18,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 295600128. Throughput: 0: 41280.1. Samples: 295727720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:00:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 16:00:19,506][15349] Signal inference workers to stop experience collection... (4250 times) [2024-06-21 16:00:19,507][15349] Signal inference workers to resume experience collection... (4250 times) [2024-06-21 16:00:19,539][15401] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-21 16:00:19,539][15401] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-21 16:00:22,026][15401] Updated weights for policy 0, policy_version 18050 (0.0042) [2024-06-21 16:00:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 295763968. Throughput: 0: 41460.8. Samples: 295861300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:00:23,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-21 16:00:25,736][15401] Updated weights for policy 0, policy_version 18060 (0.0048) [2024-06-21 16:00:28,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 295976960. Throughput: 0: 41286.5. Samples: 296102380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:00:28,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-21 16:00:30,016][15401] Updated weights for policy 0, policy_version 18070 (0.0038) [2024-06-21 16:00:33,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42050.7, 300 sec: 41543.2). Total num frames: 296206336. Throughput: 0: 41448.5. Samples: 296355900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:00:33,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:00:33,526][15401] Updated weights for policy 0, policy_version 18080 (0.0036) [2024-06-21 16:00:37,687][15401] Updated weights for policy 0, policy_version 18090 (0.0042) [2024-06-21 16:00:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 296402944. Throughput: 0: 41439.1. Samples: 296481640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:00:38,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-21 16:00:41,561][15401] Updated weights for policy 0, policy_version 18100 (0.0028) [2024-06-21 16:00:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 296599552. Throughput: 0: 41296.8. Samples: 296724740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:00:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 16:00:45,631][15401] Updated weights for policy 0, policy_version 18110 (0.0044) [2024-06-21 16:00:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41508.9, 300 sec: 41543.2). Total num frames: 296812544. Throughput: 0: 41888.9. Samples: 296984260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:00:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 16:00:49,346][15401] Updated weights for policy 0, policy_version 18120 (0.0035) [2024-06-21 16:00:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.0, 300 sec: 41543.1). Total num frames: 297025536. Throughput: 0: 41550.6. Samples: 297103020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:00:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 16:00:53,571][15401] Updated weights for policy 0, policy_version 18130 (0.0035) [2024-06-21 16:00:57,421][15401] Updated weights for policy 0, policy_version 18140 (0.0028) [2024-06-21 16:00:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 297222144. Throughput: 0: 41606.7. Samples: 297351600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 18.0) [2024-06-21 16:00:58,399][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:01:01,332][15401] Updated weights for policy 0, policy_version 18150 (0.0029) [2024-06-21 16:01:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 297435136. Throughput: 0: 41707.5. Samples: 297604560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 18.0) [2024-06-21 16:01:03,391][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 16:01:05,169][15401] Updated weights for policy 0, policy_version 18160 (0.0043) [2024-06-21 16:01:08,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.8, 300 sec: 41597.8). Total num frames: 297664512. Throughput: 0: 41574.7. Samples: 297732420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 18.0) [2024-06-21 16:01:08,396][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 16:01:08,988][15401] Updated weights for policy 0, policy_version 18170 (0.0037) [2024-06-21 16:01:12,971][15401] Updated weights for policy 0, policy_version 18180 (0.0035) [2024-06-21 16:01:13,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42052.0, 300 sec: 41654.2). Total num frames: 297877504. Throughput: 0: 41758.4. Samples: 297981520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 16:01:13,391][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 16:01:16,788][15401] Updated weights for policy 0, policy_version 18190 (0.0041) [2024-06-21 16:01:18,389][15132] Fps is (10 sec: 39347.0, 60 sec: 40960.0, 300 sec: 41488.0). Total num frames: 298057728. Throughput: 0: 41858.7. Samples: 298239440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 16:01:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 16:01:21,008][15401] Updated weights for policy 0, policy_version 18200 (0.0027) [2024-06-21 16:01:23,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42052.4, 300 sec: 41543.2). Total num frames: 298287104. Throughput: 0: 41784.9. Samples: 298361960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:01:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:01:25,082][15401] Updated weights for policy 0, policy_version 18210 (0.0048) [2024-06-21 16:01:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 298500096. Throughput: 0: 41920.1. Samples: 298611140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:01:28,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:01:28,675][15401] Updated weights for policy 0, policy_version 18220 (0.0034) [2024-06-21 16:01:33,054][15401] Updated weights for policy 0, policy_version 18230 (0.0037) [2024-06-21 16:01:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41234.7, 300 sec: 41376.5). Total num frames: 298680320. Throughput: 0: 41772.8. Samples: 298864040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:01:33,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 16:01:36,829][15401] Updated weights for policy 0, policy_version 18240 (0.0045) [2024-06-21 16:01:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 298926080. Throughput: 0: 41821.0. Samples: 298984960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:01:38,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-21 16:01:41,031][15401] Updated weights for policy 0, policy_version 18250 (0.0042) [2024-06-21 16:01:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 41599.0). Total num frames: 299139072. Throughput: 0: 42114.6. Samples: 299246760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:01:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 16:01:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018258_299139072.pth... [2024-06-21 16:01:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017648_289144832.pth [2024-06-21 16:01:44,533][15401] Updated weights for policy 0, policy_version 18260 (0.0032) [2024-06-21 16:01:45,011][15349] Signal inference workers to stop experience collection... (4300 times) [2024-06-21 16:01:45,043][15401] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-21 16:01:45,124][15349] Signal inference workers to resume experience collection... (4300 times) [2024-06-21 16:01:45,124][15401] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-21 16:01:48,396][15132] Fps is (10 sec: 37658.9, 60 sec: 41501.7, 300 sec: 41542.3). Total num frames: 299302912. Throughput: 0: 42018.1. Samples: 299495640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:01:48,397][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 16:01:48,640][15401] Updated weights for policy 0, policy_version 18270 (0.0034) [2024-06-21 16:01:52,401][15401] Updated weights for policy 0, policy_version 18280 (0.0024) [2024-06-21 16:01:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41598.7). Total num frames: 299565056. Throughput: 0: 41913.0. Samples: 299618240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 16:01:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 16:01:56,682][15401] Updated weights for policy 0, policy_version 18290 (0.0032) [2024-06-21 16:01:58,390][15132] Fps is (10 sec: 45904.4, 60 sec: 42325.3, 300 sec: 41599.5). Total num frames: 299761664. Throughput: 0: 41943.9. Samples: 299868980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 16:01:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 16:02:00,136][15401] Updated weights for policy 0, policy_version 18300 (0.0032) [2024-06-21 16:02:03,390][15132] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 299925504. Throughput: 0: 41653.7. Samples: 300113860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:02:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 16:02:05,056][15401] Updated weights for policy 0, policy_version 18310 (0.0031) [2024-06-21 16:02:08,062][15401] Updated weights for policy 0, policy_version 18320 (0.0033) [2024-06-21 16:02:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41783.6, 300 sec: 41598.8). Total num frames: 300171264. Throughput: 0: 41621.7. Samples: 300234940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:02:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 16:02:12,831][15401] Updated weights for policy 0, policy_version 18330 (0.0031) [2024-06-21 16:02:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.3, 300 sec: 41543.7). Total num frames: 300351488. Throughput: 0: 41696.0. Samples: 300487460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:02:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 16:02:15,921][15401] Updated weights for policy 0, policy_version 18340 (0.0045) [2024-06-21 16:02:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 300564480. Throughput: 0: 41438.7. Samples: 300728780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:02:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 16:02:20,573][15401] Updated weights for policy 0, policy_version 18350 (0.0030) [2024-06-21 16:02:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 300777472. Throughput: 0: 41596.0. Samples: 300856780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:02:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:02:23,977][15401] Updated weights for policy 0, policy_version 18360 (0.0036) [2024-06-21 16:02:28,314][15401] Updated weights for policy 0, policy_version 18370 (0.0039) [2024-06-21 16:02:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 300974080. Throughput: 0: 41295.7. Samples: 301105060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:02:28,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-21 16:02:31,664][15401] Updated weights for policy 0, policy_version 18380 (0.0027) [2024-06-21 16:02:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 301203456. Throughput: 0: 41063.1. Samples: 301343220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 16:02:33,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-21 16:02:36,309][15401] Updated weights for policy 0, policy_version 18390 (0.0035) [2024-06-21 16:02:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 301400064. Throughput: 0: 41251.6. Samples: 301474560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 16:02:38,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 16:02:39,620][15401] Updated weights for policy 0, policy_version 18400 (0.0029) [2024-06-21 16:02:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 301580288. Throughput: 0: 41320.1. Samples: 301728380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:02:43,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 16:02:44,136][15401] Updated weights for policy 0, policy_version 18410 (0.0032) [2024-06-21 16:02:47,582][15401] Updated weights for policy 0, policy_version 18420 (0.0037) [2024-06-21 16:02:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42329.9, 300 sec: 41654.2). Total num frames: 301842432. Throughput: 0: 41105.9. Samples: 301963620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:02:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 16:02:52,164][15401] Updated weights for policy 0, policy_version 18430 (0.0041) [2024-06-21 16:02:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 40413.9, 300 sec: 41488.0). Total num frames: 301989888. Throughput: 0: 41353.0. Samples: 302095820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:02:53,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-21 16:02:55,383][15401] Updated weights for policy 0, policy_version 18440 (0.0044) [2024-06-21 16:02:58,389][15132] Fps is (10 sec: 36045.0, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 302202880. Throughput: 0: 41229.4. Samples: 302342780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:02:58,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-21 16:02:59,956][15401] Updated weights for policy 0, policy_version 18450 (0.0037) [2024-06-21 16:03:01,377][15349] Signal inference workers to stop experience collection... (4350 times) [2024-06-21 16:03:01,377][15349] Signal inference workers to resume experience collection... (4350 times) [2024-06-21 16:03:01,393][15401] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-21 16:03:01,393][15401] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-21 16:03:03,317][15401] Updated weights for policy 0, policy_version 18460 (0.0030) [2024-06-21 16:03:03,391][15132] Fps is (10 sec: 45867.9, 60 sec: 42051.2, 300 sec: 41598.5). Total num frames: 302448640. Throughput: 0: 41357.7. Samples: 302589940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:03:03,395][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 16:03:07,592][15401] Updated weights for policy 0, policy_version 18470 (0.0031) [2024-06-21 16:03:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 302612480. Throughput: 0: 41349.4. Samples: 302717500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:03:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 16:03:11,237][15401] Updated weights for policy 0, policy_version 18480 (0.0038) [2024-06-21 16:03:13,389][15132] Fps is (10 sec: 39327.8, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 302841856. Throughput: 0: 41165.7. Samples: 302957520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:03:13,396][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 16:03:15,687][15401] Updated weights for policy 0, policy_version 18490 (0.0026) [2024-06-21 16:03:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 303022080. Throughput: 0: 41620.1. Samples: 303216120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:03:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 16:03:19,159][15401] Updated weights for policy 0, policy_version 18500 (0.0037) [2024-06-21 16:03:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 303235072. Throughput: 0: 41249.4. Samples: 303330780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:03:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 16:03:23,890][15401] Updated weights for policy 0, policy_version 18510 (0.0038) [2024-06-21 16:03:27,356][15401] Updated weights for policy 0, policy_version 18520 (0.0028) [2024-06-21 16:03:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 303480832. Throughput: 0: 41158.7. Samples: 303580520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:03:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 16:03:31,539][15401] Updated weights for policy 0, policy_version 18530 (0.0041) [2024-06-21 16:03:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40687.0, 300 sec: 41543.2). Total num frames: 303644672. Throughput: 0: 41519.1. Samples: 303831980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:03:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 16:03:35,080][15401] Updated weights for policy 0, policy_version 18540 (0.0039) [2024-06-21 16:03:38,390][15132] Fps is (10 sec: 37682.3, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 303857664. Throughput: 0: 41275.9. Samples: 303953240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 16:03:38,390][15132] Avg episode reward: [(0, '0.110')] [2024-06-21 16:03:39,297][15401] Updated weights for policy 0, policy_version 18550 (0.0040) [2024-06-21 16:03:42,947][15401] Updated weights for policy 0, policy_version 18560 (0.0036) [2024-06-21 16:03:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 304087040. Throughput: 0: 41438.6. Samples: 304207520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 16:03:43,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-21 16:03:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018561_304103424.pth... [2024-06-21 16:03:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000017952_294125568.pth [2024-06-21 16:03:47,140][15401] Updated weights for policy 0, policy_version 18570 (0.0042) [2024-06-21 16:03:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 40413.9, 300 sec: 41487.6). Total num frames: 304267264. Throughput: 0: 41388.6. Samples: 304452360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 16:03:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 16:03:50,853][15401] Updated weights for policy 0, policy_version 18580 (0.0041) [2024-06-21 16:03:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 304496640. Throughput: 0: 41186.5. Samples: 304570900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 16:03:53,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-21 16:03:55,063][15401] Updated weights for policy 0, policy_version 18590 (0.0032) [2024-06-21 16:03:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 304693248. Throughput: 0: 41516.0. Samples: 304825740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 16:03:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 16:03:59,182][15401] Updated weights for policy 0, policy_version 18600 (0.0033) [2024-06-21 16:04:02,704][15401] Updated weights for policy 0, policy_version 18610 (0.0045) [2024-06-21 16:04:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40961.0, 300 sec: 41487.6). Total num frames: 304906240. Throughput: 0: 41193.2. Samples: 305069820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:04:07,264][15401] Updated weights for policy 0, policy_version 18620 (0.0025) [2024-06-21 16:04:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 305135616. Throughput: 0: 41489.2. Samples: 305197800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 16:04:11,045][15401] Updated weights for policy 0, policy_version 18630 (0.0030) [2024-06-21 16:04:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 305315840. Throughput: 0: 41333.6. Samples: 305440540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 16:04:15,240][15401] Updated weights for policy 0, policy_version 18640 (0.0034) [2024-06-21 16:04:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 305528832. Throughput: 0: 41159.6. Samples: 305684160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 16:04:18,391][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 16:04:19,016][15401] Updated weights for policy 0, policy_version 18650 (0.0044) [2024-06-21 16:04:22,190][15349] Signal inference workers to stop experience collection... (4400 times) [2024-06-21 16:04:22,221][15401] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-21 16:04:22,258][15349] Signal inference workers to resume experience collection... (4400 times) [2024-06-21 16:04:22,259][15401] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-21 16:04:23,131][15401] Updated weights for policy 0, policy_version 18660 (0.0039) [2024-06-21 16:04:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 305725440. Throughput: 0: 41401.9. Samples: 305816320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 16:04:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 16:04:26,857][15401] Updated weights for policy 0, policy_version 18670 (0.0052) [2024-06-21 16:04:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 305938432. Throughput: 0: 41138.2. Samples: 306058740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:28,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 16:04:30,974][15401] Updated weights for policy 0, policy_version 18680 (0.0027) [2024-06-21 16:04:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 306151424. Throughput: 0: 41147.9. Samples: 306304020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 16:04:34,907][15401] Updated weights for policy 0, policy_version 18690 (0.0030) [2024-06-21 16:04:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 306348032. Throughput: 0: 41255.1. Samples: 306427380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:04:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 16:04:39,135][15401] Updated weights for policy 0, policy_version 18700 (0.0028) [2024-06-21 16:04:42,705][15401] Updated weights for policy 0, policy_version 18710 (0.0047) [2024-06-21 16:04:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41488.2). Total num frames: 306561024. Throughput: 0: 41162.7. Samples: 306678060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:04:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 16:04:46,884][15401] Updated weights for policy 0, policy_version 18720 (0.0028) [2024-06-21 16:04:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 306757632. Throughput: 0: 41261.9. Samples: 306926600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:04:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 16:04:50,426][15401] Updated weights for policy 0, policy_version 18730 (0.0041) [2024-06-21 16:04:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 306970624. Throughput: 0: 41134.2. Samples: 307048840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:04:53,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-21 16:04:54,809][15401] Updated weights for policy 0, policy_version 18740 (0.0041) [2024-06-21 16:04:58,112][15401] Updated weights for policy 0, policy_version 18750 (0.0034) [2024-06-21 16:04:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 307200000. Throughput: 0: 41382.8. Samples: 307302760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-21 16:04:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 16:05:02,684][15401] Updated weights for policy 0, policy_version 18760 (0.0044) [2024-06-21 16:05:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 307380224. Throughput: 0: 41521.2. Samples: 307552620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-21 16:05:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 16:05:05,870][15401] Updated weights for policy 0, policy_version 18770 (0.0034) [2024-06-21 16:05:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 307576832. Throughput: 0: 41224.9. Samples: 307671440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:05:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 16:05:10,547][15401] Updated weights for policy 0, policy_version 18780 (0.0034) [2024-06-21 16:05:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 307822592. Throughput: 0: 41534.6. Samples: 307927800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:05:13,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 16:05:13,671][15401] Updated weights for policy 0, policy_version 18790 (0.0030) [2024-06-21 16:05:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41487.7). Total num frames: 308002816. Throughput: 0: 41678.3. Samples: 308179540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:05:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 16:05:18,709][15401] Updated weights for policy 0, policy_version 18800 (0.0045) [2024-06-21 16:05:21,825][15401] Updated weights for policy 0, policy_version 18810 (0.0036) [2024-06-21 16:05:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 308215808. Throughput: 0: 41529.9. Samples: 308296220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-21 16:05:23,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-21 16:05:26,625][15401] Updated weights for policy 0, policy_version 18820 (0.0029) [2024-06-21 16:05:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41432.4). Total num frames: 308428800. Throughput: 0: 41536.1. Samples: 308547180. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-21 16:05:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 16:05:29,715][15401] Updated weights for policy 0, policy_version 18830 (0.0030) [2024-06-21 16:05:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 308609024. Throughput: 0: 41587.6. Samples: 308798040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-06-21 16:05:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 16:05:34,377][15401] Updated weights for policy 0, policy_version 18840 (0.0041) [2024-06-21 16:05:37,663][15401] Updated weights for policy 0, policy_version 18850 (0.0028) [2024-06-21 16:05:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 308854784. Throughput: 0: 41579.5. Samples: 308919920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:05:38,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 16:05:42,306][15401] Updated weights for policy 0, policy_version 18860 (0.0032) [2024-06-21 16:05:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 309051392. Throughput: 0: 41414.2. Samples: 309166400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:05:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 16:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018863_309051392.pth... [2024-06-21 16:05:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018258_299139072.pth [2024-06-21 16:05:45,598][15401] Updated weights for policy 0, policy_version 18870 (0.0033) [2024-06-21 16:05:48,389][15132] Fps is (10 sec: 37684.1, 60 sec: 41233.2, 300 sec: 41376.6). Total num frames: 309231616. Throughput: 0: 41433.6. Samples: 309417120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:05:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 16:05:49,917][15349] Signal inference workers to stop experience collection... (4450 times) [2024-06-21 16:05:49,926][15349] Signal inference workers to resume experience collection... (4450 times) [2024-06-21 16:05:49,933][15401] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-21 16:05:49,958][15401] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-21 16:05:50,078][15401] Updated weights for policy 0, policy_version 18880 (0.0026) [2024-06-21 16:05:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41543.1). Total num frames: 309477376. Throughput: 0: 41490.1. Samples: 309538500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:05:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 16:05:53,451][15401] Updated weights for policy 0, policy_version 18890 (0.0044) [2024-06-21 16:05:57,845][15401] Updated weights for policy 0, policy_version 18900 (0.0036) [2024-06-21 16:05:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 309673984. Throughput: 0: 41456.1. Samples: 309793320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:05:58,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-21 16:06:01,382][15401] Updated weights for policy 0, policy_version 18910 (0.0037) [2024-06-21 16:06:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41506.3, 300 sec: 41377.4). Total num frames: 309870592. Throughput: 0: 41313.3. Samples: 310038640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:06:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 16:06:05,978][15401] Updated weights for policy 0, policy_version 18920 (0.0031) [2024-06-21 16:06:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 310083584. Throughput: 0: 41381.7. Samples: 310158400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:06:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 16:06:09,504][15401] Updated weights for policy 0, policy_version 18930 (0.0033) [2024-06-21 16:06:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 310280192. Throughput: 0: 41425.3. Samples: 310411320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:06:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 16:06:14,367][15401] Updated weights for policy 0, policy_version 18940 (0.0039) [2024-06-21 16:06:17,582][15401] Updated weights for policy 0, policy_version 18950 (0.0030) [2024-06-21 16:06:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 310509568. Throughput: 0: 41295.5. Samples: 310656340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:06:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 16:06:22,310][15401] Updated weights for policy 0, policy_version 18960 (0.0051) [2024-06-21 16:06:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 310706176. Throughput: 0: 41487.5. Samples: 310786860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:06:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 16:06:25,296][15401] Updated weights for policy 0, policy_version 18970 (0.0027) [2024-06-21 16:06:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 310919168. Throughput: 0: 41403.9. Samples: 311029580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:06:28,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 16:06:30,117][15401] Updated weights for policy 0, policy_version 18980 (0.0034) [2024-06-21 16:06:33,351][15401] Updated weights for policy 0, policy_version 18990 (0.0039) [2024-06-21 16:06:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 311132160. Throughput: 0: 41372.3. Samples: 311278880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:06:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 16:06:37,753][15401] Updated weights for policy 0, policy_version 19000 (0.0033) [2024-06-21 16:06:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 311312384. Throughput: 0: 41433.0. Samples: 311402980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:06:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 16:06:41,206][15401] Updated weights for policy 0, policy_version 19010 (0.0032) [2024-06-21 16:06:43,390][15132] Fps is (10 sec: 40958.7, 60 sec: 41505.9, 300 sec: 41488.5). Total num frames: 311541760. Throughput: 0: 41395.2. Samples: 311656120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:06:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 16:06:45,659][15401] Updated weights for policy 0, policy_version 19020 (0.0041) [2024-06-21 16:06:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 311754752. Throughput: 0: 41505.8. Samples: 311906400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:06:48,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 16:06:49,146][15401] Updated weights for policy 0, policy_version 19030 (0.0045) [2024-06-21 16:06:53,389][15132] Fps is (10 sec: 40961.9, 60 sec: 41233.2, 300 sec: 41321.0). Total num frames: 311951360. Throughput: 0: 41500.1. Samples: 312025900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:06:53,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-21 16:06:53,396][15401] Updated weights for policy 0, policy_version 19040 (0.0031) [2024-06-21 16:06:57,026][15401] Updated weights for policy 0, policy_version 19050 (0.0030) [2024-06-21 16:06:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 312180736. Throughput: 0: 41568.8. Samples: 312281920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:06:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 16:07:01,129][15401] Updated weights for policy 0, policy_version 19060 (0.0028) [2024-06-21 16:07:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 312377344. Throughput: 0: 41746.2. Samples: 312534920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:07:03,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-21 16:07:04,986][15401] Updated weights for policy 0, policy_version 19070 (0.0045) [2024-06-21 16:07:08,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41504.5, 300 sec: 41431.7). Total num frames: 312573952. Throughput: 0: 41517.9. Samples: 312655260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:07:08,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 16:07:09,034][15401] Updated weights for policy 0, policy_version 19080 (0.0031) [2024-06-21 16:07:12,976][15401] Updated weights for policy 0, policy_version 19090 (0.0040) [2024-06-21 16:07:13,161][15349] Signal inference workers to stop experience collection... (4500 times) [2024-06-21 16:07:13,213][15349] Signal inference workers to resume experience collection... (4500 times) [2024-06-21 16:07:13,213][15401] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-21 16:07:13,229][15401] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-21 16:07:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 312803328. Throughput: 0: 41762.3. Samples: 312908880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:07:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 16:07:17,053][15401] Updated weights for policy 0, policy_version 19100 (0.0041) [2024-06-21 16:07:18,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 312983552. Throughput: 0: 41830.3. Samples: 313161240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:07:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 16:07:20,906][15401] Updated weights for policy 0, policy_version 19110 (0.0025) [2024-06-21 16:07:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 313212928. Throughput: 0: 41828.3. Samples: 313285260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 16:07:23,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 16:07:24,769][15401] Updated weights for policy 0, policy_version 19120 (0.0045) [2024-06-21 16:07:28,396][15132] Fps is (10 sec: 42572.4, 60 sec: 41502.0, 300 sec: 41375.7). Total num frames: 313409536. Throughput: 0: 41840.0. Samples: 313539160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 16:07:28,396][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 16:07:28,581][15401] Updated weights for policy 0, policy_version 19130 (0.0028) [2024-06-21 16:07:32,536][15401] Updated weights for policy 0, policy_version 19140 (0.0043) [2024-06-21 16:07:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 313622528. Throughput: 0: 41927.8. Samples: 313793160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 16:07:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 16:07:36,366][15401] Updated weights for policy 0, policy_version 19150 (0.0036) [2024-06-21 16:07:38,389][15132] Fps is (10 sec: 42624.3, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 313835520. Throughput: 0: 42128.8. Samples: 313921700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:07:38,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 16:07:40,282][15401] Updated weights for policy 0, policy_version 19160 (0.0040) [2024-06-21 16:07:43,390][15132] Fps is (10 sec: 40956.9, 60 sec: 41505.8, 300 sec: 41320.9). Total num frames: 314032128. Throughput: 0: 41928.1. Samples: 314168720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:07:43,391][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 16:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019167_314032128.pth... [2024-06-21 16:07:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018561_304103424.pth [2024-06-21 16:07:44,118][15401] Updated weights for policy 0, policy_version 19170 (0.0038) [2024-06-21 16:07:48,155][15401] Updated weights for policy 0, policy_version 19180 (0.0032) [2024-06-21 16:07:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 314261504. Throughput: 0: 41973.3. Samples: 314423720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:07:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 16:07:51,957][15401] Updated weights for policy 0, policy_version 19190 (0.0027) [2024-06-21 16:07:53,390][15132] Fps is (10 sec: 42602.0, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 314458112. Throughput: 0: 42110.7. Samples: 314550140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:07:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 16:07:55,913][15401] Updated weights for policy 0, policy_version 19200 (0.0041) [2024-06-21 16:07:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 41432.0). Total num frames: 314671104. Throughput: 0: 41903.0. Samples: 314794620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:07:58,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 16:07:59,879][15401] Updated weights for policy 0, policy_version 19210 (0.0030) [2024-06-21 16:08:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 314867712. Throughput: 0: 41677.2. Samples: 315036720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 16:08:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 16:08:03,774][15401] Updated weights for policy 0, policy_version 19220 (0.0053) [2024-06-21 16:08:07,669][15401] Updated weights for policy 0, policy_version 19230 (0.0040) [2024-06-21 16:08:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41780.9, 300 sec: 41487.6). Total num frames: 315080704. Throughput: 0: 41800.2. Samples: 315166260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 16:08:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 16:08:11,644][15401] Updated weights for policy 0, policy_version 19240 (0.0049) [2024-06-21 16:08:13,391][15132] Fps is (10 sec: 40954.4, 60 sec: 41232.0, 300 sec: 41543.0). Total num frames: 315277312. Throughput: 0: 41645.6. Samples: 315413020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-21 16:08:13,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 16:08:15,675][15401] Updated weights for policy 0, policy_version 19250 (0.0035) [2024-06-21 16:08:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 41653.9). Total num frames: 315523072. Throughput: 0: 41333.9. Samples: 315653280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-21 16:08:18,393][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 16:08:19,730][15401] Updated weights for policy 0, policy_version 19260 (0.0049) [2024-06-21 16:08:23,389][15132] Fps is (10 sec: 42604.4, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 315703296. Throughput: 0: 41372.9. Samples: 315783480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-21 16:08:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 16:08:23,831][15401] Updated weights for policy 0, policy_version 19270 (0.0035) [2024-06-21 16:08:27,516][15401] Updated weights for policy 0, policy_version 19280 (0.0038) [2024-06-21 16:08:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 41783.4, 300 sec: 41598.7). Total num frames: 315916288. Throughput: 0: 41526.1. Samples: 316037360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-21 16:08:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-21 16:08:31,681][15401] Updated weights for policy 0, policy_version 19290 (0.0034) [2024-06-21 16:08:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 316112896. Throughput: 0: 41330.8. Samples: 316283600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 16:08:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 16:08:35,362][15401] Updated weights for policy 0, policy_version 19300 (0.0038) [2024-06-21 16:08:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 316309504. Throughput: 0: 41174.4. Samples: 316402980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 16:08:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 16:08:39,313][15401] Updated weights for policy 0, policy_version 19310 (0.0038) [2024-06-21 16:08:43,275][15401] Updated weights for policy 0, policy_version 19320 (0.0030) [2024-06-21 16:08:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.7, 300 sec: 41598.7). Total num frames: 316538880. Throughput: 0: 41311.5. Samples: 316653540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 16:08:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 16:08:47,447][15401] Updated weights for policy 0, policy_version 19330 (0.0038) [2024-06-21 16:08:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 316735488. Throughput: 0: 41453.4. Samples: 316902120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 16:08:48,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 16:08:51,359][15401] Updated weights for policy 0, policy_version 19340 (0.0036) [2024-06-21 16:08:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 316948480. Throughput: 0: 41356.8. Samples: 317027320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 16:08:53,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 16:08:55,245][15349] Signal inference workers to stop experience collection... (4550 times) [2024-06-21 16:08:55,245][15349] Signal inference workers to resume experience collection... (4550 times) [2024-06-21 16:08:55,287][15401] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-21 16:08:55,287][15401] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-21 16:08:55,387][15401] Updated weights for policy 0, policy_version 19350 (0.0038) [2024-06-21 16:08:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.8, 300 sec: 41487.7). Total num frames: 317145088. Throughput: 0: 41314.7. Samples: 317272120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:08:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 16:08:59,310][15401] Updated weights for policy 0, policy_version 19360 (0.0041) [2024-06-21 16:09:03,229][15401] Updated weights for policy 0, policy_version 19370 (0.0044) [2024-06-21 16:09:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 41501.7, 300 sec: 41431.2). Total num frames: 317358080. Throughput: 0: 41539.4. Samples: 317522720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:09:03,397][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 16:09:07,363][15401] Updated weights for policy 0, policy_version 19380 (0.0035) [2024-06-21 16:09:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 317571072. Throughput: 0: 41475.2. Samples: 317649860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:09:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 16:09:11,280][15401] Updated weights for policy 0, policy_version 19390 (0.0035) [2024-06-21 16:09:13,390][15132] Fps is (10 sec: 42625.7, 60 sec: 41780.2, 300 sec: 41543.2). Total num frames: 317784064. Throughput: 0: 41433.3. Samples: 317901860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:09:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 16:09:15,174][15401] Updated weights for policy 0, policy_version 19400 (0.0042) [2024-06-21 16:09:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 40961.6, 300 sec: 41543.2). Total num frames: 317980672. Throughput: 0: 41474.5. Samples: 318149960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:09:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 16:09:18,961][15401] Updated weights for policy 0, policy_version 19410 (0.0030) [2024-06-21 16:09:22,983][15401] Updated weights for policy 0, policy_version 19420 (0.0041) [2024-06-21 16:09:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 318210048. Throughput: 0: 41667.1. Samples: 318278000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 16:09:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 16:09:26,653][15401] Updated weights for policy 0, policy_version 19430 (0.0039) [2024-06-21 16:09:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 318390272. Throughput: 0: 41758.4. Samples: 318532660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 16:09:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 16:09:30,614][15401] Updated weights for policy 0, policy_version 19440 (0.0034) [2024-06-21 16:09:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 318619648. Throughput: 0: 41624.4. Samples: 318775220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 16:09:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 16:09:34,605][15401] Updated weights for policy 0, policy_version 19450 (0.0041) [2024-06-21 16:09:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 318816256. Throughput: 0: 41763.7. Samples: 318906680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:09:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 16:09:38,491][15401] Updated weights for policy 0, policy_version 19460 (0.0040) [2024-06-21 16:09:42,350][15401] Updated weights for policy 0, policy_version 19470 (0.0025) [2024-06-21 16:09:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 319012864. Throughput: 0: 41738.6. Samples: 319150360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:09:43,396][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 16:09:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019471_319012864.pth... [2024-06-21 16:09:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000018863_309051392.pth [2024-06-21 16:09:46,820][15401] Updated weights for policy 0, policy_version 19480 (0.0045) [2024-06-21 16:09:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 319242240. Throughput: 0: 41583.7. Samples: 319393720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:09:48,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 16:09:50,188][15401] Updated weights for policy 0, policy_version 19490 (0.0042) [2024-06-21 16:09:53,394][15132] Fps is (10 sec: 40943.3, 60 sec: 41230.3, 300 sec: 41431.5). Total num frames: 319422464. Throughput: 0: 41601.1. Samples: 319522080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:09:53,394][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 16:09:54,697][15401] Updated weights for policy 0, policy_version 19500 (0.0037) [2024-06-21 16:09:58,283][15401] Updated weights for policy 0, policy_version 19510 (0.0044) [2024-06-21 16:09:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.0, 300 sec: 41598.7). Total num frames: 319651840. Throughput: 0: 41367.5. Samples: 319763400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:09:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 16:10:02,323][15401] Updated weights for policy 0, policy_version 19520 (0.0028) [2024-06-21 16:10:03,390][15132] Fps is (10 sec: 44254.5, 60 sec: 41783.7, 300 sec: 41654.2). Total num frames: 319864832. Throughput: 0: 41398.7. Samples: 320012900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:10:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 16:10:06,202][15401] Updated weights for policy 0, policy_version 19530 (0.0035) [2024-06-21 16:10:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 320061440. Throughput: 0: 41422.5. Samples: 320142020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-21 16:10:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 16:10:10,001][15401] Updated weights for policy 0, policy_version 19540 (0.0026) [2024-06-21 16:10:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 320274432. Throughput: 0: 41228.9. Samples: 320387960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-21 16:10:13,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 16:10:14,150][15401] Updated weights for policy 0, policy_version 19550 (0.0041) [2024-06-21 16:10:17,713][15401] Updated weights for policy 0, policy_version 19560 (0.0038) [2024-06-21 16:10:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 320487424. Throughput: 0: 41304.8. Samples: 320633940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 16:10:18,404][15132] Avg episode reward: [(0, '0.888')] [2024-06-21 16:10:18,405][15349] Saving new best policy, reward=0.888! [2024-06-21 16:10:21,855][15401] Updated weights for policy 0, policy_version 19570 (0.0039) [2024-06-21 16:10:23,231][15349] Signal inference workers to stop experience collection... (4600 times) [2024-06-21 16:10:23,231][15349] Signal inference workers to resume experience collection... (4600 times) [2024-06-21 16:10:23,266][15401] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-21 16:10:23,266][15401] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-21 16:10:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 320667648. Throughput: 0: 41148.4. Samples: 320758360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 16:10:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 16:10:25,742][15401] Updated weights for policy 0, policy_version 19580 (0.0038) [2024-06-21 16:10:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 320880640. Throughput: 0: 41244.0. Samples: 321006340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 16:10:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 16:10:30,781][15401] Updated weights for policy 0, policy_version 19590 (0.0036) [2024-06-21 16:10:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 321093632. Throughput: 0: 41331.5. Samples: 321253640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:10:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:10:33,583][15401] Updated weights for policy 0, policy_version 19600 (0.0038) [2024-06-21 16:10:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 321273856. Throughput: 0: 41205.1. Samples: 321376140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:10:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 16:10:38,513][15401] Updated weights for policy 0, policy_version 19610 (0.0036) [2024-06-21 16:10:41,558][15401] Updated weights for policy 0, policy_version 19620 (0.0025) [2024-06-21 16:10:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 321519616. Throughput: 0: 41294.3. Samples: 321621640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:10:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 16:10:46,090][15401] Updated weights for policy 0, policy_version 19630 (0.0037) [2024-06-21 16:10:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 321699840. Throughput: 0: 41481.8. Samples: 321879580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 16:10:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 16:10:49,449][15401] Updated weights for policy 0, policy_version 19640 (0.0032) [2024-06-21 16:10:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41508.9, 300 sec: 41487.6). Total num frames: 321912832. Throughput: 0: 41200.0. Samples: 321996020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 16:10:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 16:10:53,975][15401] Updated weights for policy 0, policy_version 19650 (0.0038) [2024-06-21 16:10:57,543][15401] Updated weights for policy 0, policy_version 19660 (0.0040) [2024-06-21 16:10:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 322125824. Throughput: 0: 41326.0. Samples: 322247640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 16:10:58,391][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 16:11:01,991][15401] Updated weights for policy 0, policy_version 19670 (0.0042) [2024-06-21 16:11:03,391][15132] Fps is (10 sec: 40954.1, 60 sec: 40959.0, 300 sec: 41487.4). Total num frames: 322322432. Throughput: 0: 41430.2. Samples: 322498360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 16:11:03,392][15132] Avg episode reward: [(0, '0.355')] [2024-06-21 16:11:05,472][15401] Updated weights for policy 0, policy_version 19680 (0.0031) [2024-06-21 16:11:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41231.4, 300 sec: 41542.8). Total num frames: 322535424. Throughput: 0: 41379.9. Samples: 322620560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 16:11:08,393][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 16:11:10,084][15401] Updated weights for policy 0, policy_version 19690 (0.0040) [2024-06-21 16:11:13,327][15401] Updated weights for policy 0, policy_version 19700 (0.0031) [2024-06-21 16:11:13,390][15132] Fps is (10 sec: 44243.2, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 322764800. Throughput: 0: 41540.8. Samples: 322875680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-21 16:11:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 16:11:17,953][15401] Updated weights for policy 0, policy_version 19710 (0.0046) [2024-06-21 16:11:18,389][15132] Fps is (10 sec: 39331.7, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 322928640. Throughput: 0: 41535.2. Samples: 323122720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-21 16:11:18,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 16:11:21,491][15401] Updated weights for policy 0, policy_version 19720 (0.0041) [2024-06-21 16:11:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 323190784. Throughput: 0: 41531.5. Samples: 323245060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-21 16:11:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:11:26,022][15401] Updated weights for policy 0, policy_version 19730 (0.0044) [2024-06-21 16:11:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 323354624. Throughput: 0: 41544.9. Samples: 323491160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:11:28,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 16:11:29,486][15401] Updated weights for policy 0, policy_version 19740 (0.0034) [2024-06-21 16:11:33,392][15132] Fps is (10 sec: 36036.4, 60 sec: 40958.4, 300 sec: 41487.3). Total num frames: 323551232. Throughput: 0: 41294.2. Samples: 323737920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:11:33,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 16:11:33,848][15401] Updated weights for policy 0, policy_version 19750 (0.0043) [2024-06-21 16:11:37,214][15401] Updated weights for policy 0, policy_version 19760 (0.0027) [2024-06-21 16:11:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 323796992. Throughput: 0: 41458.7. Samples: 323861660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:11:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 16:11:41,734][15401] Updated weights for policy 0, policy_version 19770 (0.0034) [2024-06-21 16:11:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 40687.0, 300 sec: 41376.5). Total num frames: 323960832. Throughput: 0: 41390.0. Samples: 324110180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-21 16:11:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:11:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019774_323977216.pth... [2024-06-21 16:11:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019167_314032128.pth [2024-06-21 16:11:44,118][15349] Signal inference workers to stop experience collection... (4650 times) [2024-06-21 16:11:44,118][15349] Signal inference workers to resume experience collection... (4650 times) [2024-06-21 16:11:44,160][15401] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-21 16:11:44,161][15401] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-21 16:11:45,467][15401] Updated weights for policy 0, policy_version 19780 (0.0039) [2024-06-21 16:11:48,390][15132] Fps is (10 sec: 40958.1, 60 sec: 41778.9, 300 sec: 41543.1). Total num frames: 324206592. Throughput: 0: 41204.5. Samples: 324352520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-21 16:11:48,391][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 16:11:49,582][15401] Updated weights for policy 0, policy_version 19790 (0.0029) [2024-06-21 16:11:53,202][15401] Updated weights for policy 0, policy_version 19800 (0.0044) [2024-06-21 16:11:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 41779.4, 300 sec: 41487.7). Total num frames: 324419584. Throughput: 0: 41552.7. Samples: 324490320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-21 16:11:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 16:11:57,283][15401] Updated weights for policy 0, policy_version 19810 (0.0033) [2024-06-21 16:11:58,390][15132] Fps is (10 sec: 39323.3, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 324599808. Throughput: 0: 41363.6. Samples: 324737040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-21 16:11:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 16:12:01,040][15401] Updated weights for policy 0, policy_version 19820 (0.0035) [2024-06-21 16:12:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42053.4, 300 sec: 41599.1). Total num frames: 324845568. Throughput: 0: 41170.3. Samples: 324975380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-21 16:12:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 16:12:05,367][15401] Updated weights for policy 0, policy_version 19830 (0.0041) [2024-06-21 16:12:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40961.8, 300 sec: 41321.0). Total num frames: 324993024. Throughput: 0: 41383.3. Samples: 325107300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 16:12:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 16:12:08,960][15401] Updated weights for policy 0, policy_version 19840 (0.0034) [2024-06-21 16:12:13,076][15401] Updated weights for policy 0, policy_version 19850 (0.0040) [2024-06-21 16:12:13,389][15132] Fps is (10 sec: 37682.8, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 325222400. Throughput: 0: 41271.5. Samples: 325348380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 16:12:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 16:12:17,034][15401] Updated weights for policy 0, policy_version 19860 (0.0034) [2024-06-21 16:12:18,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 325451776. Throughput: 0: 41158.1. Samples: 325589940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 16:12:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 16:12:21,222][15401] Updated weights for policy 0, policy_version 19870 (0.0048) [2024-06-21 16:12:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40413.9, 300 sec: 41377.4). Total num frames: 325615616. Throughput: 0: 41292.4. Samples: 325719820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:12:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:12:25,115][15401] Updated weights for policy 0, policy_version 19880 (0.0028) [2024-06-21 16:12:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 325844992. Throughput: 0: 41216.0. Samples: 325964900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:12:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 16:12:28,956][15401] Updated weights for policy 0, policy_version 19890 (0.0032) [2024-06-21 16:12:32,743][15401] Updated weights for policy 0, policy_version 19900 (0.0031) [2024-06-21 16:12:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41780.8, 300 sec: 41432.1). Total num frames: 326057984. Throughput: 0: 41379.1. Samples: 326214560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:12:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 16:12:36,748][15401] Updated weights for policy 0, policy_version 19910 (0.0035) [2024-06-21 16:12:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41432.2). Total num frames: 326254592. Throughput: 0: 41095.4. Samples: 326339620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 16:12:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 16:12:40,976][15401] Updated weights for policy 0, policy_version 19920 (0.0032) [2024-06-21 16:12:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 326467584. Throughput: 0: 41046.7. Samples: 326584140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 16:12:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 16:12:44,683][15401] Updated weights for policy 0, policy_version 19930 (0.0046) [2024-06-21 16:12:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40414.2, 300 sec: 41265.5). Total num frames: 326631424. Throughput: 0: 41452.0. Samples: 326840720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 16:12:48,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-21 16:12:48,987][15401] Updated weights for policy 0, policy_version 19940 (0.0027) [2024-06-21 16:12:52,476][15401] Updated weights for policy 0, policy_version 19950 (0.0031) [2024-06-21 16:12:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41376.9). Total num frames: 326877184. Throughput: 0: 41100.3. Samples: 326956820. Policy #0 lag: (min: 2.0, avg: 8.9, max: 21.0) [2024-06-21 16:12:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 16:12:57,244][15401] Updated weights for policy 0, policy_version 19960 (0.0033) [2024-06-21 16:12:57,346][15349] Signal inference workers to stop experience collection... (4700 times) [2024-06-21 16:12:57,397][15401] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-21 16:12:57,404][15349] Signal inference workers to resume experience collection... (4700 times) [2024-06-21 16:12:57,420][15401] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-21 16:12:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 327090176. Throughput: 0: 41392.1. Samples: 327211020. Policy #0 lag: (min: 2.0, avg: 8.9, max: 21.0) [2024-06-21 16:12:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 16:13:00,166][15401] Updated weights for policy 0, policy_version 19970 (0.0043) [2024-06-21 16:13:03,392][15132] Fps is (10 sec: 39312.1, 60 sec: 40412.2, 300 sec: 41320.7). Total num frames: 327270400. Throughput: 0: 41584.1. Samples: 327461320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 16:13:03,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 16:13:04,984][15401] Updated weights for policy 0, policy_version 19980 (0.0050) [2024-06-21 16:13:07,862][15401] Updated weights for policy 0, policy_version 19990 (0.0042) [2024-06-21 16:13:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41487.8). Total num frames: 327516160. Throughput: 0: 41311.2. Samples: 327578820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 16:13:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 16:13:12,781][15401] Updated weights for policy 0, policy_version 20000 (0.0043) [2024-06-21 16:13:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41233.0, 300 sec: 41265.8). Total num frames: 327696384. Throughput: 0: 41692.7. Samples: 327841080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 16:13:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 16:13:15,663][15401] Updated weights for policy 0, policy_version 20010 (0.0031) [2024-06-21 16:13:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 327909376. Throughput: 0: 41646.6. Samples: 328088660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:13:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 16:13:20,553][15401] Updated weights for policy 0, policy_version 20020 (0.0040) [2024-06-21 16:13:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 41487.6). Total num frames: 328155136. Throughput: 0: 41639.6. Samples: 328213400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:13:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 16:13:23,551][15401] Updated weights for policy 0, policy_version 20030 (0.0030) [2024-06-21 16:13:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 328318976. Throughput: 0: 41718.6. Samples: 328461480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:13:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 16:13:28,433][15401] Updated weights for policy 0, policy_version 20040 (0.0039) [2024-06-21 16:13:31,425][15401] Updated weights for policy 0, policy_version 20050 (0.0040) [2024-06-21 16:13:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41504.5, 300 sec: 41487.3). Total num frames: 328548352. Throughput: 0: 41356.4. Samples: 328701860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:13:33,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 16:13:36,590][15401] Updated weights for policy 0, policy_version 20060 (0.0034) [2024-06-21 16:13:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 328744960. Throughput: 0: 41704.5. Samples: 328833520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:13:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 16:13:39,574][15401] Updated weights for policy 0, policy_version 20070 (0.0052) [2024-06-21 16:13:43,390][15132] Fps is (10 sec: 39330.9, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 328941568. Throughput: 0: 41633.7. Samples: 329084540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:13:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 16:13:43,548][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020078_328957952.pth... [2024-06-21 16:13:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019471_319012864.pth [2024-06-21 16:13:44,360][15401] Updated weights for policy 0, policy_version 20080 (0.0034) [2024-06-21 16:13:47,500][15401] Updated weights for policy 0, policy_version 20090 (0.0034) [2024-06-21 16:13:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 41487.6). Total num frames: 329187328. Throughput: 0: 41391.9. Samples: 329323860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:13:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 16:13:52,235][15401] Updated weights for policy 0, policy_version 20100 (0.0027) [2024-06-21 16:13:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 329383936. Throughput: 0: 41741.7. Samples: 329457200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:13:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 16:13:55,469][15401] Updated weights for policy 0, policy_version 20110 (0.0033) [2024-06-21 16:13:58,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.0, 300 sec: 41377.4). Total num frames: 329564160. Throughput: 0: 41402.7. Samples: 329704200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:13:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 16:13:59,983][15401] Updated weights for policy 0, policy_version 20120 (0.0036) [2024-06-21 16:14:03,275][15401] Updated weights for policy 0, policy_version 20130 (0.0037) [2024-06-21 16:14:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 41487.6). Total num frames: 329809920. Throughput: 0: 41386.3. Samples: 329951040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-21 16:14:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:14:07,614][15401] Updated weights for policy 0, policy_version 20140 (0.0036) [2024-06-21 16:14:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.0, 300 sec: 41376.6). Total num frames: 329990144. Throughput: 0: 41512.4. Samples: 330081460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-21 16:14:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 16:14:11,218][15401] Updated weights for policy 0, policy_version 20150 (0.0031) [2024-06-21 16:14:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 330203136. Throughput: 0: 41456.5. Samples: 330327020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 16:14:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 16:14:15,170][15401] Updated weights for policy 0, policy_version 20160 (0.0036) [2024-06-21 16:14:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 330416128. Throughput: 0: 41845.7. Samples: 330584820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 16:14:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 16:14:18,893][15401] Updated weights for policy 0, policy_version 20170 (0.0027) [2024-06-21 16:14:22,748][15401] Updated weights for policy 0, policy_version 20180 (0.0033) [2024-06-21 16:14:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 330645504. Throughput: 0: 41708.9. Samples: 330710420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 16:14:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 16:14:23,999][15349] Signal inference workers to stop experience collection... (4750 times) [2024-06-21 16:14:24,000][15349] Signal inference workers to resume experience collection... (4750 times) [2024-06-21 16:14:24,044][15401] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-21 16:14:24,044][15401] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-21 16:14:26,963][15401] Updated weights for policy 0, policy_version 20190 (0.0042) [2024-06-21 16:14:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 330825728. Throughput: 0: 41617.9. Samples: 330957340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 16:14:28,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:14:30,453][15401] Updated weights for policy 0, policy_version 20200 (0.0036) [2024-06-21 16:14:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41507.8, 300 sec: 41432.1). Total num frames: 331038720. Throughput: 0: 42038.8. Samples: 331215600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 16:14:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:14:34,806][15401] Updated weights for policy 0, policy_version 20210 (0.0040) [2024-06-21 16:14:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 331268096. Throughput: 0: 41796.5. Samples: 331338040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 16:14:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 16:14:38,854][15401] Updated weights for policy 0, policy_version 20220 (0.0034) [2024-06-21 16:14:42,926][15401] Updated weights for policy 0, policy_version 20230 (0.0049) [2024-06-21 16:14:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.6, 300 sec: 41431.7). Total num frames: 331464704. Throughput: 0: 41652.0. Samples: 331578640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 16:14:43,393][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 16:14:46,776][15401] Updated weights for policy 0, policy_version 20240 (0.0039) [2024-06-21 16:14:48,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 41432.6). Total num frames: 331644928. Throughput: 0: 41711.1. Samples: 331828040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 16:14:48,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-21 16:14:51,055][15401] Updated weights for policy 0, policy_version 20250 (0.0036) [2024-06-21 16:14:53,390][15132] Fps is (10 sec: 40970.0, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 331874304. Throughput: 0: 41621.7. Samples: 331954440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 16:14:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 16:14:54,293][15401] Updated weights for policy 0, policy_version 20260 (0.0028) [2024-06-21 16:14:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 332070912. Throughput: 0: 41618.2. Samples: 332199840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:14:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 16:14:58,943][15401] Updated weights for policy 0, policy_version 20270 (0.0037) [2024-06-21 16:15:02,204][15401] Updated weights for policy 0, policy_version 20280 (0.0046) [2024-06-21 16:15:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 332300288. Throughput: 0: 41252.1. Samples: 332441160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:15:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 16:15:07,116][15401] Updated weights for policy 0, policy_version 20290 (0.0032) [2024-06-21 16:15:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 332480512. Throughput: 0: 41330.1. Samples: 332570280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:15:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-21 16:15:10,163][15401] Updated weights for policy 0, policy_version 20300 (0.0040) [2024-06-21 16:15:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 332677120. Throughput: 0: 41385.8. Samples: 332819700. Policy #0 lag: (min: 2.0, avg: 13.1, max: 22.0) [2024-06-21 16:15:13,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-21 16:15:14,858][15401] Updated weights for policy 0, policy_version 20310 (0.0037) [2024-06-21 16:15:17,972][15401] Updated weights for policy 0, policy_version 20320 (0.0046) [2024-06-21 16:15:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.5, 300 sec: 41542.8). Total num frames: 332922880. Throughput: 0: 41015.1. Samples: 333061380. Policy #0 lag: (min: 2.0, avg: 13.1, max: 22.0) [2024-06-21 16:15:18,393][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 16:15:22,925][15401] Updated weights for policy 0, policy_version 20330 (0.0046) [2024-06-21 16:15:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 333086720. Throughput: 0: 41222.3. Samples: 333193040. Policy #0 lag: (min: 2.0, avg: 13.1, max: 22.0) [2024-06-21 16:15:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 16:15:25,795][15401] Updated weights for policy 0, policy_version 20340 (0.0038) [2024-06-21 16:15:28,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 333316096. Throughput: 0: 41330.7. Samples: 333438420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 16:15:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 16:15:31,038][15401] Updated weights for policy 0, policy_version 20350 (0.0039) [2024-06-21 16:15:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 333545472. Throughput: 0: 41304.5. Samples: 333686740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 16:15:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 16:15:33,732][15401] Updated weights for policy 0, policy_version 20360 (0.0045) [2024-06-21 16:15:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 333709312. Throughput: 0: 41288.5. Samples: 333812420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-21 16:15:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 16:15:38,747][15401] Updated weights for policy 0, policy_version 20370 (0.0025) [2024-06-21 16:15:41,567][15401] Updated weights for policy 0, policy_version 20380 (0.0034) [2024-06-21 16:15:43,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40961.6, 300 sec: 41432.1). Total num frames: 333922304. Throughput: 0: 41267.8. Samples: 334056900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 16:15:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 16:15:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020382_333938688.pth... [2024-06-21 16:15:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000019774_323977216.pth [2024-06-21 16:15:43,784][15349] Signal inference workers to stop experience collection... (4800 times) [2024-06-21 16:15:43,784][15349] Signal inference workers to resume experience collection... (4800 times) [2024-06-21 16:15:43,832][15401] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-21 16:15:43,832][15401] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-21 16:15:46,742][15401] Updated weights for policy 0, policy_version 20390 (0.0024) [2024-06-21 16:15:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 334151680. Throughput: 0: 41706.7. Samples: 334317960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 16:15:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 16:15:49,302][15401] Updated weights for policy 0, policy_version 20400 (0.0030) [2024-06-21 16:15:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 334348288. Throughput: 0: 41543.2. Samples: 334439720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 16:15:53,391][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 16:15:54,369][15401] Updated weights for policy 0, policy_version 20410 (0.0051) [2024-06-21 16:15:57,281][15401] Updated weights for policy 0, policy_version 20420 (0.0029) [2024-06-21 16:15:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41543.4). Total num frames: 334577664. Throughput: 0: 41319.5. Samples: 334679080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 16:15:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 16:16:02,739][15401] Updated weights for policy 0, policy_version 20430 (0.0023) [2024-06-21 16:16:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 40958.3, 300 sec: 41432.1). Total num frames: 334757888. Throughput: 0: 41687.1. Samples: 334937300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 16:16:03,392][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 16:16:05,551][15401] Updated weights for policy 0, policy_version 20440 (0.0033) [2024-06-21 16:16:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 334970880. Throughput: 0: 41485.2. Samples: 335059880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:16:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 16:16:10,436][15401] Updated weights for policy 0, policy_version 20450 (0.0039) [2024-06-21 16:16:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 335200256. Throughput: 0: 41549.8. Samples: 335308160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:16:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 16:16:13,570][15401] Updated weights for policy 0, policy_version 20460 (0.0035) [2024-06-21 16:16:18,151][15401] Updated weights for policy 0, policy_version 20470 (0.0034) [2024-06-21 16:16:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40961.7, 300 sec: 41321.0). Total num frames: 335380480. Throughput: 0: 41676.9. Samples: 335562200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:16:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 16:16:21,523][15401] Updated weights for policy 0, policy_version 20480 (0.0036) [2024-06-21 16:16:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 335593472. Throughput: 0: 41493.7. Samples: 335679640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:16:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 16:16:25,953][15401] Updated weights for policy 0, policy_version 20490 (0.0043) [2024-06-21 16:16:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 41654.6). Total num frames: 335839232. Throughput: 0: 41726.4. Samples: 335934580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:16:28,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 16:16:29,156][15401] Updated weights for policy 0, policy_version 20500 (0.0038) [2024-06-21 16:16:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 336003072. Throughput: 0: 41460.5. Samples: 336183680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 16:16:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 16:16:33,771][15401] Updated weights for policy 0, policy_version 20510 (0.0031) [2024-06-21 16:16:36,910][15401] Updated weights for policy 0, policy_version 20520 (0.0034) [2024-06-21 16:16:38,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 336216064. Throughput: 0: 41369.4. Samples: 336301340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:16:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:16:41,466][15401] Updated weights for policy 0, policy_version 20530 (0.0040) [2024-06-21 16:16:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41487.7). Total num frames: 336445440. Throughput: 0: 41686.1. Samples: 336554960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:16:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 16:16:45,159][15401] Updated weights for policy 0, policy_version 20540 (0.0040) [2024-06-21 16:16:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 336625664. Throughput: 0: 41508.8. Samples: 336805100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:16:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 16:16:49,718][15401] Updated weights for policy 0, policy_version 20550 (0.0030) [2024-06-21 16:16:50,214][15349] Signal inference workers to stop experience collection... (4850 times) [2024-06-21 16:16:50,214][15349] Signal inference workers to resume experience collection... (4850 times) [2024-06-21 16:16:50,230][15401] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-21 16:16:50,231][15401] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-21 16:16:52,793][15401] Updated weights for policy 0, policy_version 20560 (0.0038) [2024-06-21 16:16:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 336855040. Throughput: 0: 41379.1. Samples: 336921940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 16:16:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 16:16:57,487][15401] Updated weights for policy 0, policy_version 20570 (0.0032) [2024-06-21 16:16:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 337051648. Throughput: 0: 41469.8. Samples: 337174300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 16:16:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 16:17:01,065][15401] Updated weights for policy 0, policy_version 20580 (0.0044) [2024-06-21 16:17:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41234.7, 300 sec: 41487.6). Total num frames: 337231872. Throughput: 0: 41366.6. Samples: 337423700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-21 16:17:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 16:17:05,202][15401] Updated weights for policy 0, policy_version 20590 (0.0030) [2024-06-21 16:17:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 337477632. Throughput: 0: 41433.3. Samples: 337544140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:17:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 16:17:09,267][15401] Updated weights for policy 0, policy_version 20600 (0.0039) [2024-06-21 16:17:13,116][15401] Updated weights for policy 0, policy_version 20610 (0.0033) [2024-06-21 16:17:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 337674240. Throughput: 0: 41362.6. Samples: 337795900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:17:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 16:17:17,235][15401] Updated weights for policy 0, policy_version 20620 (0.0026) [2024-06-21 16:17:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 337887232. Throughput: 0: 41178.0. Samples: 338036700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:17:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 16:17:20,901][15401] Updated weights for policy 0, policy_version 20630 (0.0032) [2024-06-21 16:17:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 338083840. Throughput: 0: 41365.7. Samples: 338162800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:17:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 16:17:25,203][15401] Updated weights for policy 0, policy_version 20640 (0.0044) [2024-06-21 16:17:28,389][15132] Fps is (10 sec: 39322.4, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 338280448. Throughput: 0: 41213.0. Samples: 338409540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:17:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:17:29,287][15401] Updated weights for policy 0, policy_version 20650 (0.0039) [2024-06-21 16:17:33,005][15401] Updated weights for policy 0, policy_version 20660 (0.0040) [2024-06-21 16:17:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 338509824. Throughput: 0: 41209.9. Samples: 338659540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:17:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 16:17:36,942][15401] Updated weights for policy 0, policy_version 20670 (0.0047) [2024-06-21 16:17:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 41777.5, 300 sec: 41542.8). Total num frames: 338722816. Throughput: 0: 41425.8. Samples: 338786200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:17:38,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 16:17:41,001][15401] Updated weights for policy 0, policy_version 20680 (0.0036) [2024-06-21 16:17:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 338903040. Throughput: 0: 41363.5. Samples: 339035660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:17:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-21 16:17:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020685_338903040.pth... [2024-06-21 16:17:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020078_328957952.pth [2024-06-21 16:17:44,571][15401] Updated weights for policy 0, policy_version 20690 (0.0040) [2024-06-21 16:17:48,390][15132] Fps is (10 sec: 37692.3, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 339099648. Throughput: 0: 41486.3. Samples: 339290580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:17:48,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-21 16:17:48,825][15401] Updated weights for policy 0, policy_version 20700 (0.0053) [2024-06-21 16:17:52,760][15401] Updated weights for policy 0, policy_version 20710 (0.0038) [2024-06-21 16:17:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 339345408. Throughput: 0: 41595.6. Samples: 339415940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:17:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 16:17:56,687][15401] Updated weights for policy 0, policy_version 20720 (0.0032) [2024-06-21 16:17:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41543.5). Total num frames: 339525632. Throughput: 0: 41418.6. Samples: 339659740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 16:17:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 16:18:00,539][15401] Updated weights for policy 0, policy_version 20730 (0.0034) [2024-06-21 16:18:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 339755008. Throughput: 0: 41779.6. Samples: 339916780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:18:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-21 16:18:04,647][15401] Updated weights for policy 0, policy_version 20740 (0.0034) [2024-06-21 16:18:08,159][15401] Updated weights for policy 0, policy_version 20750 (0.0044) [2024-06-21 16:18:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 339968000. Throughput: 0: 41766.6. Samples: 340042300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:18:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 16:18:12,753][15401] Updated weights for policy 0, policy_version 20760 (0.0033) [2024-06-21 16:18:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 340164608. Throughput: 0: 42007.1. Samples: 340299860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:18:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 16:18:15,979][15401] Updated weights for policy 0, policy_version 20770 (0.0035) [2024-06-21 16:18:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 340377600. Throughput: 0: 41904.0. Samples: 340545220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:18:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 16:18:20,281][15401] Updated weights for policy 0, policy_version 20780 (0.0034) [2024-06-21 16:18:21,134][15349] Signal inference workers to stop experience collection... (4900 times) [2024-06-21 16:18:21,175][15401] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-21 16:18:21,183][15349] Signal inference workers to resume experience collection... (4900 times) [2024-06-21 16:18:21,193][15401] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-21 16:18:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 340606976. Throughput: 0: 42055.5. Samples: 340678600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:18:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 16:18:23,706][15401] Updated weights for policy 0, policy_version 20790 (0.0048) [2024-06-21 16:18:28,093][15401] Updated weights for policy 0, policy_version 20800 (0.0043) [2024-06-21 16:18:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41487.9). Total num frames: 340787200. Throughput: 0: 42027.9. Samples: 340926920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:18:28,399][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 16:18:31,244][15401] Updated weights for policy 0, policy_version 20810 (0.0037) [2024-06-21 16:18:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 341016576. Throughput: 0: 41855.9. Samples: 341174100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 16:18:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 16:18:35,894][15401] Updated weights for policy 0, policy_version 20820 (0.0036) [2024-06-21 16:18:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42053.9, 300 sec: 41709.8). Total num frames: 341245952. Throughput: 0: 41909.2. Samples: 341301860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 16:18:38,405][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 16:18:38,838][15401] Updated weights for policy 0, policy_version 20830 (0.0025) [2024-06-21 16:18:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 341426176. Throughput: 0: 42107.6. Samples: 341554580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 16:18:43,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-21 16:18:43,454][15401] Updated weights for policy 0, policy_version 20840 (0.0035) [2024-06-21 16:18:46,662][15401] Updated weights for policy 0, policy_version 20850 (0.0037) [2024-06-21 16:18:48,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 341622784. Throughput: 0: 42051.7. Samples: 341809100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:18:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 16:18:51,736][15401] Updated weights for policy 0, policy_version 20860 (0.0031) [2024-06-21 16:18:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 341868544. Throughput: 0: 42065.4. Samples: 341935240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:18:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 16:18:54,331][15401] Updated weights for policy 0, policy_version 20870 (0.0042) [2024-06-21 16:18:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 342032384. Throughput: 0: 41868.4. Samples: 342183940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:18:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 16:18:59,351][15401] Updated weights for policy 0, policy_version 20880 (0.0036) [2024-06-21 16:19:02,620][15401] Updated weights for policy 0, policy_version 20890 (0.0035) [2024-06-21 16:19:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 342261760. Throughput: 0: 41830.3. Samples: 342427580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:19:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 16:19:07,333][15401] Updated weights for policy 0, policy_version 20900 (0.0043) [2024-06-21 16:19:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 342474752. Throughput: 0: 41822.3. Samples: 342560600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:19:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:19:10,413][15401] Updated weights for policy 0, policy_version 20910 (0.0027) [2024-06-21 16:19:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 342671360. Throughput: 0: 41916.1. Samples: 342813140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:19:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 16:19:14,995][15401] Updated weights for policy 0, policy_version 20920 (0.0034) [2024-06-21 16:19:18,008][15401] Updated weights for policy 0, policy_version 20930 (0.0035) [2024-06-21 16:19:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 342917120. Throughput: 0: 41810.7. Samples: 343055580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:19:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 16:19:22,762][15401] Updated weights for policy 0, policy_version 20940 (0.0045) [2024-06-21 16:19:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 343097344. Throughput: 0: 41984.5. Samples: 343191160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:19:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 16:19:25,555][15401] Updated weights for policy 0, policy_version 20950 (0.0045) [2024-06-21 16:19:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 343293952. Throughput: 0: 41885.4. Samples: 343439420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:19:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 16:19:30,771][15401] Updated weights for policy 0, policy_version 20960 (0.0036) [2024-06-21 16:19:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.5, 300 sec: 41654.2). Total num frames: 343556096. Throughput: 0: 41666.7. Samples: 343684100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:19:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 16:19:33,399][15401] Updated weights for policy 0, policy_version 20970 (0.0042) [2024-06-21 16:19:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 41543.5). Total num frames: 343719936. Throughput: 0: 41878.7. Samples: 343819780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:19:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:19:38,493][15401] Updated weights for policy 0, policy_version 20980 (0.0042) [2024-06-21 16:19:41,993][15401] Updated weights for policy 0, policy_version 20990 (0.0041) [2024-06-21 16:19:43,389][15132] Fps is (10 sec: 36044.6, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 343916544. Throughput: 0: 41791.6. Samples: 344064560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:19:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 16:19:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020992_343932928.pth... [2024-06-21 16:19:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020382_333938688.pth [2024-06-21 16:19:44,543][15349] Signal inference workers to stop experience collection... (4950 times) [2024-06-21 16:19:44,543][15349] Signal inference workers to resume experience collection... (4950 times) [2024-06-21 16:19:44,590][15401] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-21 16:19:44,590][15401] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-21 16:19:46,447][15401] Updated weights for policy 0, policy_version 21000 (0.0032) [2024-06-21 16:19:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.7, 300 sec: 41709.4). Total num frames: 344178688. Throughput: 0: 41828.4. Samples: 344309960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:19:48,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 16:19:49,734][15401] Updated weights for policy 0, policy_version 21010 (0.0039) [2024-06-21 16:19:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 344342528. Throughput: 0: 41868.0. Samples: 344444660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:19:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 16:19:54,083][15401] Updated weights for policy 0, policy_version 21020 (0.0039) [2024-06-21 16:19:57,394][15401] Updated weights for policy 0, policy_version 21030 (0.0043) [2024-06-21 16:19:58,392][15132] Fps is (10 sec: 39322.9, 60 sec: 42323.9, 300 sec: 41598.4). Total num frames: 344571904. Throughput: 0: 41523.5. Samples: 344681780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-21 16:19:58,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 16:20:02,285][15401] Updated weights for policy 0, policy_version 21040 (0.0039) [2024-06-21 16:20:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 344784896. Throughput: 0: 41776.9. Samples: 344935540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-21 16:20:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 16:20:05,347][15401] Updated weights for policy 0, policy_version 21050 (0.0042) [2024-06-21 16:20:08,390][15132] Fps is (10 sec: 39329.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 344965120. Throughput: 0: 41548.9. Samples: 345060860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-21 16:20:08,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 16:20:10,016][15401] Updated weights for policy 0, policy_version 21060 (0.0042) [2024-06-21 16:20:13,168][15401] Updated weights for policy 0, policy_version 21070 (0.0038) [2024-06-21 16:20:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41654.6). Total num frames: 345210880. Throughput: 0: 41510.2. Samples: 345307380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-21 16:20:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 16:20:17,783][15401] Updated weights for policy 0, policy_version 21080 (0.0042) [2024-06-21 16:20:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 345391104. Throughput: 0: 41657.8. Samples: 345558700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-21 16:20:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 16:20:20,876][15401] Updated weights for policy 0, policy_version 21090 (0.0045) [2024-06-21 16:20:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 345587712. Throughput: 0: 41346.7. Samples: 345680380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-21 16:20:23,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 16:20:25,443][15401] Updated weights for policy 0, policy_version 21100 (0.0032) [2024-06-21 16:20:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 345833472. Throughput: 0: 41565.3. Samples: 345935000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:20:28,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 16:20:28,930][15401] Updated weights for policy 0, policy_version 21110 (0.0037) [2024-06-21 16:20:33,151][15401] Updated weights for policy 0, policy_version 21120 (0.0034) [2024-06-21 16:20:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 346030080. Throughput: 0: 41733.7. Samples: 346187880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:20:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 16:20:36,968][15401] Updated weights for policy 0, policy_version 21130 (0.0036) [2024-06-21 16:20:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 346210304. Throughput: 0: 41470.6. Samples: 346310840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 16:20:38,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-21 16:20:40,893][15401] Updated weights for policy 0, policy_version 21140 (0.0039) [2024-06-21 16:20:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 346456064. Throughput: 0: 41897.5. Samples: 346567080. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-21 16:20:43,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-21 16:20:44,825][15401] Updated weights for policy 0, policy_version 21150 (0.0036) [2024-06-21 16:20:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41234.7, 300 sec: 41709.8). Total num frames: 346652672. Throughput: 0: 41881.9. Samples: 346820220. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-21 16:20:48,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-21 16:20:48,645][15401] Updated weights for policy 0, policy_version 21160 (0.0035) [2024-06-21 16:20:52,752][15401] Updated weights for policy 0, policy_version 21170 (0.0036) [2024-06-21 16:20:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 346849280. Throughput: 0: 41727.5. Samples: 346938600. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-21 16:20:53,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:20:56,706][15401] Updated weights for policy 0, policy_version 21180 (0.0037) [2024-06-21 16:20:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41507.6, 300 sec: 41710.1). Total num frames: 347062272. Throughput: 0: 41821.4. Samples: 347189340. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-06-21 16:20:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 16:21:00,971][15401] Updated weights for policy 0, policy_version 21190 (0.0032) [2024-06-21 16:21:01,909][15349] Signal inference workers to stop experience collection... (5000 times) [2024-06-21 16:21:01,910][15349] Signal inference workers to resume experience collection... (5000 times) [2024-06-21 16:21:01,963][15401] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-21 16:21:01,968][15401] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-21 16:21:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 347291648. Throughput: 0: 41878.1. Samples: 347443220. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-06-21 16:21:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 16:21:04,621][15401] Updated weights for policy 0, policy_version 21200 (0.0034) [2024-06-21 16:21:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41654.3). Total num frames: 347488256. Throughput: 0: 41986.3. Samples: 347569760. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-06-21 16:21:08,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 16:21:08,547][15401] Updated weights for policy 0, policy_version 21210 (0.0029) [2024-06-21 16:21:12,625][15401] Updated weights for policy 0, policy_version 21220 (0.0039) [2024-06-21 16:21:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 347717632. Throughput: 0: 41851.1. Samples: 347818300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:21:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 16:21:16,246][15401] Updated weights for policy 0, policy_version 21230 (0.0038) [2024-06-21 16:21:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 347914240. Throughput: 0: 41680.5. Samples: 348063500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:21:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 16:21:20,456][15401] Updated weights for policy 0, policy_version 21240 (0.0034) [2024-06-21 16:21:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 348094464. Throughput: 0: 41744.0. Samples: 348189320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:21:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 16:21:24,438][15401] Updated weights for policy 0, policy_version 21250 (0.0038) [2024-06-21 16:21:28,138][15401] Updated weights for policy 0, policy_version 21260 (0.0029) [2024-06-21 16:21:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 348323840. Throughput: 0: 41557.4. Samples: 348437160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 16:21:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 16:21:32,251][15401] Updated weights for policy 0, policy_version 21270 (0.0030) [2024-06-21 16:21:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 41709.7). Total num frames: 348520448. Throughput: 0: 41578.0. Samples: 348691240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 16:21:33,391][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 16:21:36,131][15401] Updated weights for policy 0, policy_version 21280 (0.0037) [2024-06-21 16:21:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 348733440. Throughput: 0: 41624.4. Samples: 348811700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 16:21:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 16:21:40,374][15401] Updated weights for policy 0, policy_version 21290 (0.0026) [2024-06-21 16:21:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 348946432. Throughput: 0: 41657.2. Samples: 349063920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:21:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 16:21:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021298_348946432.pth... [2024-06-21 16:21:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020685_338903040.pth [2024-06-21 16:21:43,973][15401] Updated weights for policy 0, policy_version 21300 (0.0043) [2024-06-21 16:21:48,059][15401] Updated weights for policy 0, policy_version 21310 (0.0030) [2024-06-21 16:21:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.0, 300 sec: 41709.8). Total num frames: 349159424. Throughput: 0: 41598.6. Samples: 349315160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:21:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 16:21:52,154][15401] Updated weights for policy 0, policy_version 21320 (0.0036) [2024-06-21 16:21:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 349356032. Throughput: 0: 41481.2. Samples: 349436420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:21:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 16:21:55,787][15401] Updated weights for policy 0, policy_version 21330 (0.0033) [2024-06-21 16:21:58,392][15132] Fps is (10 sec: 40950.8, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 349569024. Throughput: 0: 41557.8. Samples: 349688500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 16:21:58,392][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 16:22:00,081][15401] Updated weights for policy 0, policy_version 21340 (0.0040) [2024-06-21 16:22:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 349782016. Throughput: 0: 41677.7. Samples: 349939000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 16:22:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 16:22:03,737][15401] Updated weights for policy 0, policy_version 21350 (0.0041) [2024-06-21 16:22:07,791][15401] Updated weights for policy 0, policy_version 21360 (0.0035) [2024-06-21 16:22:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 349978624. Throughput: 0: 41615.2. Samples: 350062000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 16:22:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 16:22:11,593][15401] Updated weights for policy 0, policy_version 21370 (0.0031) [2024-06-21 16:22:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 350208000. Throughput: 0: 41686.2. Samples: 350313040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:22:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 16:22:15,554][15401] Updated weights for policy 0, policy_version 21380 (0.0043) [2024-06-21 16:22:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 350388224. Throughput: 0: 41484.2. Samples: 350558020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:22:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 16:22:19,787][15401] Updated weights for policy 0, policy_version 21390 (0.0031) [2024-06-21 16:22:21,639][15349] Signal inference workers to stop experience collection... (5050 times) [2024-06-21 16:22:21,677][15401] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-21 16:22:21,758][15349] Signal inference workers to resume experience collection... (5050 times) [2024-06-21 16:22:21,758][15401] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-21 16:22:23,191][15401] Updated weights for policy 0, policy_version 21400 (0.0034) [2024-06-21 16:22:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.6, 300 sec: 41820.5). Total num frames: 350617600. Throughput: 0: 41544.9. Samples: 350681320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:22:23,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 16:22:28,027][15401] Updated weights for policy 0, policy_version 21410 (0.0033) [2024-06-21 16:22:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 350781440. Throughput: 0: 41419.6. Samples: 350927800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:22:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 16:22:30,938][15401] Updated weights for policy 0, policy_version 21420 (0.0038) [2024-06-21 16:22:33,390][15132] Fps is (10 sec: 37692.3, 60 sec: 41233.2, 300 sec: 41599.0). Total num frames: 350994432. Throughput: 0: 41475.7. Samples: 351181560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:22:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 16:22:35,781][15401] Updated weights for policy 0, policy_version 21430 (0.0040) [2024-06-21 16:22:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 351240192. Throughput: 0: 41472.9. Samples: 351302700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 16:22:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 16:22:38,805][15401] Updated weights for policy 0, policy_version 21440 (0.0037) [2024-06-21 16:22:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 351404032. Throughput: 0: 41333.7. Samples: 351548420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 20.0) [2024-06-21 16:22:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 16:22:43,653][15401] Updated weights for policy 0, policy_version 21450 (0.0042) [2024-06-21 16:22:46,837][15401] Updated weights for policy 0, policy_version 21460 (0.0037) [2024-06-21 16:22:48,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 351617024. Throughput: 0: 41201.8. Samples: 351793080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 20.0) [2024-06-21 16:22:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 16:22:51,993][15401] Updated weights for policy 0, policy_version 21470 (0.0050) [2024-06-21 16:22:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 351846400. Throughput: 0: 41316.4. Samples: 351921240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 20.0) [2024-06-21 16:22:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 16:22:54,569][15401] Updated weights for policy 0, policy_version 21480 (0.0034) [2024-06-21 16:22:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40961.5, 300 sec: 41598.7). Total num frames: 352026624. Throughput: 0: 41180.7. Samples: 352166180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:22:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 16:23:00,014][15401] Updated weights for policy 0, policy_version 21490 (0.0038) [2024-06-21 16:23:02,275][15401] Updated weights for policy 0, policy_version 21500 (0.0036) [2024-06-21 16:23:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 352272384. Throughput: 0: 41139.9. Samples: 352409320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:23:03,391][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 16:23:07,863][15401] Updated weights for policy 0, policy_version 21510 (0.0038) [2024-06-21 16:23:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 352452608. Throughput: 0: 41270.3. Samples: 352538380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 16:23:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 16:23:10,131][15401] Updated weights for policy 0, policy_version 21520 (0.0033) [2024-06-21 16:23:13,389][15132] Fps is (10 sec: 36045.2, 60 sec: 40413.8, 300 sec: 41543.2). Total num frames: 352632832. Throughput: 0: 41336.5. Samples: 352787940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 16:23:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 16:23:15,625][15401] Updated weights for policy 0, policy_version 21530 (0.0040) [2024-06-21 16:23:17,957][15401] Updated weights for policy 0, policy_version 21540 (0.0031) [2024-06-21 16:23:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 352911360. Throughput: 0: 40939.1. Samples: 353023820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 16:23:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 16:23:22,956][15349] Signal inference workers to stop experience collection... (5100 times) [2024-06-21 16:23:22,986][15401] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-21 16:23:23,022][15349] Signal inference workers to resume experience collection... (5100 times) [2024-06-21 16:23:23,022][15401] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-21 16:23:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 40688.6, 300 sec: 41598.7). Total num frames: 353058816. Throughput: 0: 41348.4. Samples: 353163380. Policy #0 lag: (min: 1.0, avg: 7.6, max: 20.0) [2024-06-21 16:23:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 16:23:23,627][15401] Updated weights for policy 0, policy_version 21550 (0.0034) [2024-06-21 16:23:26,339][15401] Updated weights for policy 0, policy_version 21560 (0.0042) [2024-06-21 16:23:28,390][15132] Fps is (10 sec: 34406.0, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 353255424. Throughput: 0: 41040.7. Samples: 353395260. Policy #0 lag: (min: 1.0, avg: 7.6, max: 20.0) [2024-06-21 16:23:28,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 16:23:31,522][15401] Updated weights for policy 0, policy_version 21570 (0.0031) [2024-06-21 16:23:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 353533952. Throughput: 0: 41068.8. Samples: 353641180. Policy #0 lag: (min: 1.0, avg: 7.6, max: 20.0) [2024-06-21 16:23:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 16:23:34,290][15401] Updated weights for policy 0, policy_version 21580 (0.0040) [2024-06-21 16:23:38,390][15132] Fps is (10 sec: 39322.2, 60 sec: 40140.8, 300 sec: 41432.1). Total num frames: 353648640. Throughput: 0: 41308.1. Samples: 353780100. Policy #0 lag: (min: 1.0, avg: 7.6, max: 20.0) [2024-06-21 16:23:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 16:23:39,326][15401] Updated weights for policy 0, policy_version 21590 (0.0031) [2024-06-21 16:23:42,214][15401] Updated weights for policy 0, policy_version 21600 (0.0036) [2024-06-21 16:23:43,389][15132] Fps is (10 sec: 36045.4, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 353894400. Throughput: 0: 41026.4. Samples: 354012360. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-21 16:23:43,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-21 16:23:43,476][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021601_353910784.pth... [2024-06-21 16:23:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000020992_343932928.pth [2024-06-21 16:23:47,327][15401] Updated weights for policy 0, policy_version 21610 (0.0031) [2024-06-21 16:23:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 354107392. Throughput: 0: 41232.5. Samples: 354264780. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-21 16:23:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 16:23:50,609][15401] Updated weights for policy 0, policy_version 21620 (0.0040) [2024-06-21 16:23:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 41543.2). Total num frames: 354287616. Throughput: 0: 41121.6. Samples: 354388860. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-21 16:23:53,394][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 16:23:55,447][15401] Updated weights for policy 0, policy_version 21630 (0.0043) [2024-06-21 16:23:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 354516992. Throughput: 0: 40793.2. Samples: 354623640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 16:23:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 16:23:59,055][15401] Updated weights for policy 0, policy_version 21640 (0.0031) [2024-06-21 16:24:03,307][15401] Updated weights for policy 0, policy_version 21650 (0.0031) [2024-06-21 16:24:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 354713600. Throughput: 0: 41354.8. Samples: 354884780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 16:24:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 16:24:07,273][15401] Updated weights for policy 0, policy_version 21660 (0.0035) [2024-06-21 16:24:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40686.8, 300 sec: 41432.1). Total num frames: 354893824. Throughput: 0: 40845.7. Samples: 355001440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 16:24:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 16:24:11,203][15401] Updated weights for policy 0, policy_version 21670 (0.0047) [2024-06-21 16:24:13,390][15132] Fps is (10 sec: 44233.5, 60 sec: 42051.7, 300 sec: 41487.5). Total num frames: 355155968. Throughput: 0: 41010.6. Samples: 355240760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:24:13,391][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 16:24:15,012][15401] Updated weights for policy 0, policy_version 21680 (0.0044) [2024-06-21 16:24:17,634][15349] Signal inference workers to stop experience collection... (5150 times) [2024-06-21 16:24:17,680][15401] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-21 16:24:17,685][15349] Signal inference workers to resume experience collection... (5150 times) [2024-06-21 16:24:17,698][15401] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-21 16:24:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 39867.8, 300 sec: 41376.6). Total num frames: 355303424. Throughput: 0: 41304.6. Samples: 355499880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:24:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 16:24:19,436][15401] Updated weights for policy 0, policy_version 21690 (0.0042) [2024-06-21 16:24:22,840][15401] Updated weights for policy 0, policy_version 21700 (0.0053) [2024-06-21 16:24:23,390][15132] Fps is (10 sec: 37685.7, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 355532800. Throughput: 0: 40732.4. Samples: 355613060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:24:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 16:24:27,275][15401] Updated weights for policy 0, policy_version 21710 (0.0030) [2024-06-21 16:24:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 355762176. Throughput: 0: 41220.8. Samples: 355867300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:24:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 16:24:30,665][15401] Updated weights for policy 0, policy_version 21720 (0.0038) [2024-06-21 16:24:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 39867.8, 300 sec: 41376.5). Total num frames: 355926016. Throughput: 0: 41213.3. Samples: 356119380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:24:33,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 16:24:35,145][15401] Updated weights for policy 0, policy_version 21730 (0.0036) [2024-06-21 16:24:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 356171776. Throughput: 0: 41049.4. Samples: 356236080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:24:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 16:24:38,608][15401] Updated weights for policy 0, policy_version 21740 (0.0037) [2024-06-21 16:24:43,102][15401] Updated weights for policy 0, policy_version 21750 (0.0028) [2024-06-21 16:24:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 40959.9, 300 sec: 41265.8). Total num frames: 356352000. Throughput: 0: 41608.1. Samples: 356496000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 16:24:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 16:24:46,665][15401] Updated weights for policy 0, policy_version 21760 (0.0038) [2024-06-21 16:24:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40687.0, 300 sec: 41376.5). Total num frames: 356548608. Throughput: 0: 41205.8. Samples: 356739040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 16:24:48,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-21 16:24:50,711][15401] Updated weights for policy 0, policy_version 21770 (0.0035) [2024-06-21 16:24:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41432.4). Total num frames: 356794368. Throughput: 0: 41442.4. Samples: 356866340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 16:24:53,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 16:24:54,274][15401] Updated weights for policy 0, policy_version 21780 (0.0032) [2024-06-21 16:24:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 356974592. Throughput: 0: 41618.8. Samples: 357113580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 16:24:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 16:24:59,031][15401] Updated weights for policy 0, policy_version 21790 (0.0040) [2024-06-21 16:25:02,404][15401] Updated weights for policy 0, policy_version 21800 (0.0039) [2024-06-21 16:25:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 357203968. Throughput: 0: 41379.9. Samples: 357361980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 16:25:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 16:25:06,898][15401] Updated weights for policy 0, policy_version 21810 (0.0030) [2024-06-21 16:25:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 41376.5). Total num frames: 357416960. Throughput: 0: 41686.7. Samples: 357488960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 16:25:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:25:10,027][15401] Updated weights for policy 0, policy_version 21820 (0.0040) [2024-06-21 16:25:13,392][15132] Fps is (10 sec: 39312.3, 60 sec: 40685.8, 300 sec: 41376.2). Total num frames: 357597184. Throughput: 0: 41606.2. Samples: 357739680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 16:25:13,393][15132] Avg episode reward: [(0, '0.190')] [2024-06-21 16:25:14,548][15401] Updated weights for policy 0, policy_version 21830 (0.0044) [2024-06-21 16:25:17,892][15401] Updated weights for policy 0, policy_version 21840 (0.0035) [2024-06-21 16:25:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 357826560. Throughput: 0: 41390.7. Samples: 357981960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 16:25:18,390][15132] Avg episode reward: [(0, '0.153')] [2024-06-21 16:25:22,417][15401] Updated weights for policy 0, policy_version 21850 (0.0035) [2024-06-21 16:25:23,389][15132] Fps is (10 sec: 44247.5, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 358039552. Throughput: 0: 41636.5. Samples: 358109720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 16:25:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 16:25:25,659][15401] Updated weights for policy 0, policy_version 21860 (0.0051) [2024-06-21 16:25:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 358236160. Throughput: 0: 41414.8. Samples: 358359660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 16:25:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 16:25:30,278][15401] Updated weights for policy 0, policy_version 21870 (0.0043) [2024-06-21 16:25:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 358449152. Throughput: 0: 41484.0. Samples: 358605820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 16:25:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 16:25:33,766][15401] Updated weights for policy 0, policy_version 21880 (0.0034) [2024-06-21 16:25:38,296][15401] Updated weights for policy 0, policy_version 21890 (0.0035) [2024-06-21 16:25:38,389][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 358645760. Throughput: 0: 41490.6. Samples: 358733420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 16:25:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 16:25:41,997][15401] Updated weights for policy 0, policy_version 21900 (0.0042) [2024-06-21 16:25:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 358858752. Throughput: 0: 41448.9. Samples: 358978780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 16:25:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-21 16:25:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021903_358858752.pth... [2024-06-21 16:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021298_348946432.pth [2024-06-21 16:25:45,871][15349] Signal inference workers to stop experience collection... (5200 times) [2024-06-21 16:25:45,872][15349] Signal inference workers to resume experience collection... (5200 times) [2024-06-21 16:25:45,889][15401] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-21 16:25:45,890][15401] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-21 16:25:46,021][15401] Updated weights for policy 0, policy_version 21910 (0.0034) [2024-06-21 16:25:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 359071744. Throughput: 0: 41575.5. Samples: 359232880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 16:25:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-21 16:25:50,144][15401] Updated weights for policy 0, policy_version 21920 (0.0035) [2024-06-21 16:25:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 359268352. Throughput: 0: 41536.5. Samples: 359358100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 16:25:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 16:25:53,699][15401] Updated weights for policy 0, policy_version 21930 (0.0031) [2024-06-21 16:25:57,986][15401] Updated weights for policy 0, policy_version 21940 (0.0022) [2024-06-21 16:25:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 359464960. Throughput: 0: 41470.2. Samples: 359605740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:25:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 16:26:01,549][15401] Updated weights for policy 0, policy_version 21950 (0.0034) [2024-06-21 16:26:03,394][15132] Fps is (10 sec: 44218.3, 60 sec: 41776.4, 300 sec: 41431.5). Total num frames: 359710720. Throughput: 0: 41488.2. Samples: 359849100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:26:03,394][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 16:26:05,776][15401] Updated weights for policy 0, policy_version 21960 (0.0038) [2024-06-21 16:26:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 359890944. Throughput: 0: 41556.8. Samples: 359979780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 16:26:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 16:26:09,585][15401] Updated weights for policy 0, policy_version 21970 (0.0050) [2024-06-21 16:26:13,389][15132] Fps is (10 sec: 37698.9, 60 sec: 41507.8, 300 sec: 41265.5). Total num frames: 360087552. Throughput: 0: 41472.8. Samples: 360225940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 16:26:13,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 16:26:14,008][15401] Updated weights for policy 0, policy_version 21980 (0.0036) [2024-06-21 16:26:17,687][15401] Updated weights for policy 0, policy_version 21990 (0.0037) [2024-06-21 16:26:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41233.2, 300 sec: 41376.6). Total num frames: 360300544. Throughput: 0: 41464.1. Samples: 360471700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 16:26:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 16:26:21,667][15401] Updated weights for policy 0, policy_version 22000 (0.0041) [2024-06-21 16:26:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 360513536. Throughput: 0: 41448.5. Samples: 360598600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 16:26:23,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-21 16:26:25,310][15401] Updated weights for policy 0, policy_version 22010 (0.0048) [2024-06-21 16:26:28,390][15132] Fps is (10 sec: 40955.8, 60 sec: 41232.3, 300 sec: 41320.9). Total num frames: 360710144. Throughput: 0: 41381.4. Samples: 360840980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:26:28,391][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 16:26:29,599][15401] Updated weights for policy 0, policy_version 22020 (0.0027) [2024-06-21 16:26:33,396][15132] Fps is (10 sec: 40933.2, 60 sec: 41228.6, 300 sec: 41320.1). Total num frames: 360923136. Throughput: 0: 41320.3. Samples: 361092560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:26:33,397][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 16:26:33,550][15401] Updated weights for policy 0, policy_version 22030 (0.0049) [2024-06-21 16:26:37,495][15401] Updated weights for policy 0, policy_version 22040 (0.0043) [2024-06-21 16:26:38,389][15132] Fps is (10 sec: 40963.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 361119744. Throughput: 0: 41295.5. Samples: 361216400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:26:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 16:26:41,504][15401] Updated weights for policy 0, policy_version 22050 (0.0039) [2024-06-21 16:26:43,390][15132] Fps is (10 sec: 42625.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 361349120. Throughput: 0: 41333.3. Samples: 361465740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:26:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 16:26:45,339][15401] Updated weights for policy 0, policy_version 22060 (0.0030) [2024-06-21 16:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 361545728. Throughput: 0: 41474.0. Samples: 361715260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:26:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 16:26:49,173][15401] Updated weights for policy 0, policy_version 22070 (0.0041) [2024-06-21 16:26:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41265.8). Total num frames: 361742336. Throughput: 0: 41347.7. Samples: 361840420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 16:26:53,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-21 16:26:53,501][15401] Updated weights for policy 0, policy_version 22080 (0.0045) [2024-06-21 16:26:57,258][15401] Updated weights for policy 0, policy_version 22090 (0.0034) [2024-06-21 16:26:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 361955328. Throughput: 0: 41387.0. Samples: 362088360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 16:26:58,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 16:27:01,288][15401] Updated weights for policy 0, policy_version 22100 (0.0038) [2024-06-21 16:27:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 41235.8, 300 sec: 41376.5). Total num frames: 362184704. Throughput: 0: 41397.1. Samples: 362334580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 16:27:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 16:27:04,868][15401] Updated weights for policy 0, policy_version 22110 (0.0033) [2024-06-21 16:27:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 362364928. Throughput: 0: 41501.6. Samples: 362466180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 16:27:08,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-21 16:27:09,098][15401] Updated weights for policy 0, policy_version 22120 (0.0041) [2024-06-21 16:27:12,729][15401] Updated weights for policy 0, policy_version 22130 (0.0039) [2024-06-21 16:27:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 362577920. Throughput: 0: 41634.1. Samples: 362714480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 16:27:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 16:27:17,001][15401] Updated weights for policy 0, policy_version 22140 (0.0042) [2024-06-21 16:27:17,364][15349] Signal inference workers to stop experience collection... (5250 times) [2024-06-21 16:27:17,364][15349] Signal inference workers to resume experience collection... (5250 times) [2024-06-21 16:27:17,387][15401] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-21 16:27:17,387][15401] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-21 16:27:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.1, 300 sec: 41376.9). Total num frames: 362823680. Throughput: 0: 41517.9. Samples: 362960600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 16:27:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 16:27:20,767][15401] Updated weights for policy 0, policy_version 22150 (0.0027) [2024-06-21 16:27:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 363020288. Throughput: 0: 41701.3. Samples: 363092960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 16:27:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 16:27:24,546][15401] Updated weights for policy 0, policy_version 22160 (0.0047) [2024-06-21 16:27:28,392][15132] Fps is (10 sec: 39312.5, 60 sec: 41778.2, 300 sec: 41431.7). Total num frames: 363216896. Throughput: 0: 41574.7. Samples: 363336700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 16:27:28,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:27:28,892][15401] Updated weights for policy 0, policy_version 22170 (0.0037) [2024-06-21 16:27:32,406][15401] Updated weights for policy 0, policy_version 22180 (0.0029) [2024-06-21 16:27:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41782.1, 300 sec: 41320.7). Total num frames: 363429888. Throughput: 0: 41709.8. Samples: 363592300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 16:27:33,393][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 16:27:36,369][15401] Updated weights for policy 0, policy_version 22190 (0.0036) [2024-06-21 16:27:38,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 363610112. Throughput: 0: 41816.8. Samples: 363722180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 16:27:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 16:27:40,085][15401] Updated weights for policy 0, policy_version 22200 (0.0036) [2024-06-21 16:27:43,392][15132] Fps is (10 sec: 42598.2, 60 sec: 41777.6, 300 sec: 41487.3). Total num frames: 363855872. Throughput: 0: 41850.7. Samples: 363971740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:27:43,393][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 16:27:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022208_363855872.pth... [2024-06-21 16:27:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021601_353910784.pth [2024-06-21 16:27:44,127][15401] Updated weights for policy 0, policy_version 22210 (0.0039) [2024-06-21 16:27:47,843][15401] Updated weights for policy 0, policy_version 22220 (0.0038) [2024-06-21 16:27:48,393][15132] Fps is (10 sec: 45859.4, 60 sec: 42049.8, 300 sec: 41431.6). Total num frames: 364068864. Throughput: 0: 41942.2. Samples: 364222120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:27:48,393][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 16:27:52,202][15401] Updated weights for policy 0, policy_version 22230 (0.0037) [2024-06-21 16:27:53,390][15132] Fps is (10 sec: 39330.8, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 364249088. Throughput: 0: 41794.2. Samples: 364346920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 16:27:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 16:27:55,592][15401] Updated weights for policy 0, policy_version 22240 (0.0044) [2024-06-21 16:27:58,389][15132] Fps is (10 sec: 40974.3, 60 sec: 42052.3, 300 sec: 41376.6). Total num frames: 364478464. Throughput: 0: 41836.1. Samples: 364597100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:27:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 16:27:59,878][15401] Updated weights for policy 0, policy_version 22250 (0.0042) [2024-06-21 16:28:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 364691456. Throughput: 0: 41934.7. Samples: 364847660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:28:03,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 16:28:03,454][15401] Updated weights for policy 0, policy_version 22260 (0.0028) [2024-06-21 16:28:08,110][15401] Updated weights for policy 0, policy_version 22270 (0.0034) [2024-06-21 16:28:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 364871680. Throughput: 0: 41856.4. Samples: 364976500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:28:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 16:28:11,606][15401] Updated weights for policy 0, policy_version 22280 (0.0036) [2024-06-21 16:28:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41376.5). Total num frames: 365117440. Throughput: 0: 41925.2. Samples: 365223240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 16:28:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 16:28:15,739][15401] Updated weights for policy 0, policy_version 22290 (0.0038) [2024-06-21 16:28:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 365314048. Throughput: 0: 41868.5. Samples: 365476280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 16:28:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 16:28:19,528][15401] Updated weights for policy 0, policy_version 22300 (0.0032) [2024-06-21 16:28:23,389][15132] Fps is (10 sec: 39322.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 365510656. Throughput: 0: 41697.4. Samples: 365598560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 16:28:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 16:28:23,507][15401] Updated weights for policy 0, policy_version 22310 (0.0028) [2024-06-21 16:28:27,369][15401] Updated weights for policy 0, policy_version 22320 (0.0046) [2024-06-21 16:28:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42053.9, 300 sec: 41376.5). Total num frames: 365740032. Throughput: 0: 41863.1. Samples: 365855480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:28:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 16:28:31,211][15401] Updated weights for policy 0, policy_version 22330 (0.0053) [2024-06-21 16:28:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41780.9, 300 sec: 41654.3). Total num frames: 365936640. Throughput: 0: 41735.3. Samples: 366100060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:28:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 16:28:35,171][15401] Updated weights for policy 0, policy_version 22340 (0.0026) [2024-06-21 16:28:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 41543.2). Total num frames: 366149632. Throughput: 0: 41812.6. Samples: 366228480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:28:38,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 16:28:38,994][15401] Updated weights for policy 0, policy_version 22350 (0.0029) [2024-06-21 16:28:43,216][15401] Updated weights for policy 0, policy_version 22360 (0.0039) [2024-06-21 16:28:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41507.8, 300 sec: 41487.6). Total num frames: 366346240. Throughput: 0: 41899.1. Samples: 366482560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:28:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-21 16:28:46,571][15401] Updated weights for policy 0, policy_version 22370 (0.0033) [2024-06-21 16:28:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41781.6, 300 sec: 41654.2). Total num frames: 366575616. Throughput: 0: 41774.7. Samples: 366727520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:28:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 16:28:50,180][15349] Signal inference workers to stop experience collection... (5300 times) [2024-06-21 16:28:50,181][15349] Signal inference workers to resume experience collection... (5300 times) [2024-06-21 16:28:50,218][15401] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-21 16:28:50,218][15401] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-21 16:28:51,323][15401] Updated weights for policy 0, policy_version 22380 (0.0045) [2024-06-21 16:28:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 366772224. Throughput: 0: 41709.7. Samples: 366853440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:28:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 16:28:54,953][15401] Updated weights for policy 0, policy_version 22390 (0.0045) [2024-06-21 16:28:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 366952448. Throughput: 0: 41673.4. Samples: 367098540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 16:28:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 16:28:59,140][15401] Updated weights for policy 0, policy_version 22400 (0.0040) [2024-06-21 16:29:02,751][15401] Updated weights for policy 0, policy_version 22410 (0.0024) [2024-06-21 16:29:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 367181824. Throughput: 0: 41595.0. Samples: 367348060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 16:29:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 16:29:07,199][15401] Updated weights for policy 0, policy_version 22420 (0.0038) [2024-06-21 16:29:08,391][15132] Fps is (10 sec: 44228.2, 60 sec: 42051.0, 300 sec: 41487.4). Total num frames: 367394816. Throughput: 0: 41609.2. Samples: 367471060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 16:29:08,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 16:29:10,800][15401] Updated weights for policy 0, policy_version 22430 (0.0042) [2024-06-21 16:29:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 367575040. Throughput: 0: 41440.1. Samples: 367720280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:29:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 16:29:14,950][15401] Updated weights for policy 0, policy_version 22440 (0.0040) [2024-06-21 16:29:18,390][15132] Fps is (10 sec: 40968.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 367804416. Throughput: 0: 41411.0. Samples: 367963560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:29:18,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 16:29:18,782][15401] Updated weights for policy 0, policy_version 22450 (0.0058) [2024-06-21 16:29:22,853][15401] Updated weights for policy 0, policy_version 22460 (0.0046) [2024-06-21 16:29:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.0, 300 sec: 41543.1). Total num frames: 368017408. Throughput: 0: 41408.7. Samples: 368091880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:29:23,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 16:29:26,378][15401] Updated weights for policy 0, policy_version 22470 (0.0048) [2024-06-21 16:29:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 368214016. Throughput: 0: 41187.6. Samples: 368336000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:29:28,391][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 16:29:30,667][15401] Updated weights for policy 0, policy_version 22480 (0.0037) [2024-06-21 16:29:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 368443392. Throughput: 0: 41269.4. Samples: 368584640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:29:33,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 16:29:34,215][15401] Updated weights for policy 0, policy_version 22490 (0.0025) [2024-06-21 16:29:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 368623616. Throughput: 0: 41438.5. Samples: 368718160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:29:38,398][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 16:29:38,422][15401] Updated weights for policy 0, policy_version 22500 (0.0035) [2024-06-21 16:29:42,165][15401] Updated weights for policy 0, policy_version 22510 (0.0045) [2024-06-21 16:29:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41709.7). Total num frames: 368852992. Throughput: 0: 41319.0. Samples: 368957900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:29:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 16:29:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022513_368852992.pth... [2024-06-21 16:29:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000021903_358858752.pth [2024-06-21 16:29:46,249][15401] Updated weights for policy 0, policy_version 22520 (0.0049) [2024-06-21 16:29:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41543.1). Total num frames: 369049600. Throughput: 0: 41363.6. Samples: 369209420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:29:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 16:29:49,932][15401] Updated weights for policy 0, policy_version 22530 (0.0041) [2024-06-21 16:29:53,390][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 369229824. Throughput: 0: 41367.9. Samples: 369332540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:29:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 16:29:54,307][15401] Updated weights for policy 0, policy_version 22540 (0.0033) [2024-06-21 16:29:57,571][15401] Updated weights for policy 0, policy_version 22550 (0.0038) [2024-06-21 16:29:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 41598.4). Total num frames: 369475584. Throughput: 0: 41426.7. Samples: 369584580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:29:58,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 16:30:02,065][15401] Updated weights for policy 0, policy_version 22560 (0.0046) [2024-06-21 16:30:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 369672192. Throughput: 0: 41727.6. Samples: 369841300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:30:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 16:30:05,391][15401] Updated weights for policy 0, policy_version 22570 (0.0033) [2024-06-21 16:30:08,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41234.4, 300 sec: 41599.0). Total num frames: 369868800. Throughput: 0: 41586.8. Samples: 369963280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:30:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 16:30:09,921][15401] Updated weights for policy 0, policy_version 22580 (0.0034) [2024-06-21 16:30:13,345][15401] Updated weights for policy 0, policy_version 22590 (0.0040) [2024-06-21 16:30:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 370114560. Throughput: 0: 41659.5. Samples: 370210680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:30:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 16:30:18,105][15401] Updated weights for policy 0, policy_version 22600 (0.0044) [2024-06-21 16:30:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 370294784. Throughput: 0: 41801.3. Samples: 370465700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:30:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 16:30:21,075][15401] Updated weights for policy 0, policy_version 22610 (0.0032) [2024-06-21 16:30:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 370507776. Throughput: 0: 41357.2. Samples: 370579240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:30:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 16:30:26,241][15401] Updated weights for policy 0, policy_version 22620 (0.0037) [2024-06-21 16:30:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 370704384. Throughput: 0: 41672.7. Samples: 370833160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:30:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 16:30:28,644][15349] Signal inference workers to stop experience collection... (5350 times) [2024-06-21 16:30:28,648][15349] Signal inference workers to resume experience collection... (5350 times) [2024-06-21 16:30:28,679][15401] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-21 16:30:28,679][15401] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-21 16:30:29,068][15401] Updated weights for policy 0, policy_version 22630 (0.0042) [2024-06-21 16:30:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41543.1). Total num frames: 370900992. Throughput: 0: 41710.1. Samples: 371086380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 16:30:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 16:30:34,179][15401] Updated weights for policy 0, policy_version 22640 (0.0028) [2024-06-21 16:30:36,787][15401] Updated weights for policy 0, policy_version 22650 (0.0033) [2024-06-21 16:30:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 371130368. Throughput: 0: 41722.4. Samples: 371210040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 16:30:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 16:30:41,915][15401] Updated weights for policy 0, policy_version 22660 (0.0043) [2024-06-21 16:30:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 371343360. Throughput: 0: 41701.4. Samples: 371461040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 16:30:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 16:30:44,577][15401] Updated weights for policy 0, policy_version 22670 (0.0035) [2024-06-21 16:30:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 371507200. Throughput: 0: 41563.6. Samples: 371711660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 16:30:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 16:30:49,619][15401] Updated weights for policy 0, policy_version 22680 (0.0040) [2024-06-21 16:30:52,333][15401] Updated weights for policy 0, policy_version 22690 (0.0047) [2024-06-21 16:30:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 41709.8). Total num frames: 371769344. Throughput: 0: 41412.1. Samples: 371826820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 16:30:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 16:30:57,387][15401] Updated weights for policy 0, policy_version 22700 (0.0034) [2024-06-21 16:30:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 41507.8, 300 sec: 41543.7). Total num frames: 371965952. Throughput: 0: 41668.1. Samples: 372085740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 16:30:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 16:31:00,450][15401] Updated weights for policy 0, policy_version 22710 (0.0030) [2024-06-21 16:31:03,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 372146176. Throughput: 0: 41494.2. Samples: 372332940. Policy #0 lag: (min: 0.0, avg: 13.5, max: 25.0) [2024-06-21 16:31:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 16:31:05,629][15401] Updated weights for policy 0, policy_version 22720 (0.0033) [2024-06-21 16:31:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 372391936. Throughput: 0: 41513.4. Samples: 372447340. Policy #0 lag: (min: 0.0, avg: 13.5, max: 25.0) [2024-06-21 16:31:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 16:31:08,499][15401] Updated weights for policy 0, policy_version 22730 (0.0052) [2024-06-21 16:31:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 41543.1). Total num frames: 372555776. Throughput: 0: 41551.5. Samples: 372702980. Policy #0 lag: (min: 0.0, avg: 13.5, max: 25.0) [2024-06-21 16:31:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 16:31:13,414][15401] Updated weights for policy 0, policy_version 22740 (0.0038) [2024-06-21 16:31:16,500][15401] Updated weights for policy 0, policy_version 22750 (0.0039) [2024-06-21 16:31:18,390][15132] Fps is (10 sec: 37682.5, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 372768768. Throughput: 0: 41380.9. Samples: 372948520. Policy #0 lag: (min: 1.0, avg: 13.1, max: 23.0) [2024-06-21 16:31:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 16:31:21,138][15401] Updated weights for policy 0, policy_version 22760 (0.0040) [2024-06-21 16:31:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41654.4). Total num frames: 372998144. Throughput: 0: 41458.2. Samples: 373075660. Policy #0 lag: (min: 1.0, avg: 13.1, max: 23.0) [2024-06-21 16:31:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 16:31:23,622][15349] Signal inference workers to stop experience collection... (5400 times) [2024-06-21 16:31:23,624][15349] Signal inference workers to resume experience collection... (5400 times) [2024-06-21 16:31:23,670][15401] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-21 16:31:23,670][15401] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-21 16:31:24,328][15401] Updated weights for policy 0, policy_version 22770 (0.0039) [2024-06-21 16:31:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.0, 300 sec: 41599.6). Total num frames: 373194752. Throughput: 0: 41557.7. Samples: 373331140. Policy #0 lag: (min: 1.0, avg: 13.1, max: 23.0) [2024-06-21 16:31:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 16:31:28,969][15401] Updated weights for policy 0, policy_version 22780 (0.0038) [2024-06-21 16:31:32,135][15401] Updated weights for policy 0, policy_version 22790 (0.0046) [2024-06-21 16:31:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 373424128. Throughput: 0: 41296.7. Samples: 373570020. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0) [2024-06-21 16:31:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 16:31:36,758][15401] Updated weights for policy 0, policy_version 22800 (0.0032) [2024-06-21 16:31:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 373620736. Throughput: 0: 41694.2. Samples: 373703060. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0) [2024-06-21 16:31:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 16:31:40,363][15401] Updated weights for policy 0, policy_version 22810 (0.0045) [2024-06-21 16:31:43,390][15132] Fps is (10 sec: 36044.8, 60 sec: 40686.8, 300 sec: 41487.6). Total num frames: 373784576. Throughput: 0: 41325.2. Samples: 373945380. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0) [2024-06-21 16:31:43,396][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 16:31:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022814_373784576.pth... [2024-06-21 16:31:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022208_363855872.pth [2024-06-21 16:31:44,952][15401] Updated weights for policy 0, policy_version 22820 (0.0029) [2024-06-21 16:31:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 374030336. Throughput: 0: 41164.0. Samples: 374185320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 16:31:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-21 16:31:48,567][15401] Updated weights for policy 0, policy_version 22830 (0.0027) [2024-06-21 16:31:52,978][15401] Updated weights for policy 0, policy_version 22840 (0.0027) [2024-06-21 16:31:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 374243328. Throughput: 0: 41593.7. Samples: 374319060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 16:31:53,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-21 16:31:56,632][15401] Updated weights for policy 0, policy_version 22850 (0.0034) [2024-06-21 16:31:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 374407168. Throughput: 0: 41233.7. Samples: 374558500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 16:31:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 16:32:00,852][15401] Updated weights for policy 0, policy_version 22860 (0.0031) [2024-06-21 16:32:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 374669312. Throughput: 0: 41172.9. Samples: 374801300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 16:32:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 16:32:04,562][15401] Updated weights for policy 0, policy_version 22870 (0.0038) [2024-06-21 16:32:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 40686.9, 300 sec: 41543.2). Total num frames: 374833152. Throughput: 0: 41237.4. Samples: 374931340. Policy #0 lag: (min: 0.0, avg: 13.8, max: 23.0) [2024-06-21 16:32:08,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 16:32:09,071][15401] Updated weights for policy 0, policy_version 22880 (0.0042) [2024-06-21 16:32:12,376][15401] Updated weights for policy 0, policy_version 22890 (0.0047) [2024-06-21 16:32:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 375046144. Throughput: 0: 40938.6. Samples: 375173380. Policy #0 lag: (min: 0.0, avg: 13.8, max: 23.0) [2024-06-21 16:32:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 16:32:16,889][15401] Updated weights for policy 0, policy_version 22900 (0.0034) [2024-06-21 16:32:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 375275520. Throughput: 0: 41091.3. Samples: 375419120. Policy #0 lag: (min: 0.0, avg: 13.8, max: 23.0) [2024-06-21 16:32:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 16:32:20,255][15401] Updated weights for policy 0, policy_version 22910 (0.0051) [2024-06-21 16:32:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 41432.4). Total num frames: 375439360. Throughput: 0: 41073.8. Samples: 375551380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:32:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 16:32:24,584][15401] Updated weights for policy 0, policy_version 22920 (0.0034) [2024-06-21 16:32:28,094][15401] Updated weights for policy 0, policy_version 22930 (0.0039) [2024-06-21 16:32:28,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41504.5, 300 sec: 41543.2). Total num frames: 375685120. Throughput: 0: 41053.4. Samples: 375792880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:32:28,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 16:32:32,530][15401] Updated weights for policy 0, policy_version 22940 (0.0041) [2024-06-21 16:32:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 375898112. Throughput: 0: 41315.1. Samples: 376044500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:32:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 16:32:34,665][15349] Signal inference workers to stop experience collection... (5450 times) [2024-06-21 16:32:34,703][15401] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-21 16:32:34,732][15349] Signal inference workers to resume experience collection... (5450 times) [2024-06-21 16:32:34,732][15401] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-21 16:32:36,052][15401] Updated weights for policy 0, policy_version 22950 (0.0031) [2024-06-21 16:32:38,389][15132] Fps is (10 sec: 39331.2, 60 sec: 40960.0, 300 sec: 41432.4). Total num frames: 376078336. Throughput: 0: 41069.9. Samples: 376167200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:32:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 16:32:40,456][15401] Updated weights for policy 0, policy_version 22960 (0.0033) [2024-06-21 16:32:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41488.1). Total num frames: 376307712. Throughput: 0: 41275.2. Samples: 376415880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:32:43,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 16:32:44,493][15401] Updated weights for policy 0, policy_version 22970 (0.0037) [2024-06-21 16:32:48,122][15401] Updated weights for policy 0, policy_version 22980 (0.0034) [2024-06-21 16:32:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 376520704. Throughput: 0: 41475.6. Samples: 376667700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:32:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 16:32:52,272][15401] Updated weights for policy 0, policy_version 22990 (0.0030) [2024-06-21 16:32:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 376684544. Throughput: 0: 41346.9. Samples: 376791960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:32:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 16:32:55,971][15401] Updated weights for policy 0, policy_version 23000 (0.0038) [2024-06-21 16:32:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 376913920. Throughput: 0: 41384.4. Samples: 377035680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:32:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:33:00,159][15401] Updated weights for policy 0, policy_version 23010 (0.0034) [2024-06-21 16:33:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 377126912. Throughput: 0: 41543.0. Samples: 377288560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:33:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 16:33:03,829][15401] Updated weights for policy 0, policy_version 23020 (0.0044) [2024-06-21 16:33:08,299][15401] Updated weights for policy 0, policy_version 23030 (0.0031) [2024-06-21 16:33:08,392][15132] Fps is (10 sec: 40950.9, 60 sec: 41504.4, 300 sec: 41376.2). Total num frames: 377323520. Throughput: 0: 41327.6. Samples: 377411220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:33:08,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 16:33:11,716][15401] Updated weights for policy 0, policy_version 23040 (0.0039) [2024-06-21 16:33:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 377552896. Throughput: 0: 41459.5. Samples: 377658460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:33:13,394][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 16:33:16,391][15401] Updated weights for policy 0, policy_version 23050 (0.0034) [2024-06-21 16:33:18,390][15132] Fps is (10 sec: 42608.0, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 377749504. Throughput: 0: 41393.3. Samples: 377907200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:33:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 16:33:19,636][15401] Updated weights for policy 0, policy_version 23060 (0.0028) [2024-06-21 16:33:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 377946112. Throughput: 0: 41444.8. Samples: 378032220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:33:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 16:33:24,332][15401] Updated weights for policy 0, policy_version 23070 (0.0043) [2024-06-21 16:33:27,581][15401] Updated weights for policy 0, policy_version 23080 (0.0047) [2024-06-21 16:33:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41507.8, 300 sec: 41487.6). Total num frames: 378175488. Throughput: 0: 41375.6. Samples: 378277780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:33:28,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-21 16:33:32,226][15401] Updated weights for policy 0, policy_version 23090 (0.0031) [2024-06-21 16:33:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41376.5). Total num frames: 378355712. Throughput: 0: 41351.7. Samples: 378528520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:33:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 16:33:35,549][15401] Updated weights for policy 0, policy_version 23100 (0.0040) [2024-06-21 16:33:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 378568704. Throughput: 0: 41173.9. Samples: 378644780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:33:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 16:33:39,947][15401] Updated weights for policy 0, policy_version 23110 (0.0032) [2024-06-21 16:33:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 378781696. Throughput: 0: 41389.4. Samples: 378898200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:33:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 16:33:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023119_378781696.pth... [2024-06-21 16:33:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022513_368852992.pth [2024-06-21 16:33:43,669][15401] Updated weights for policy 0, policy_version 23120 (0.0041) [2024-06-21 16:33:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 40413.9, 300 sec: 41265.5). Total num frames: 378945536. Throughput: 0: 41460.1. Samples: 379154260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:33:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 16:33:48,451][15401] Updated weights for policy 0, policy_version 23130 (0.0038) [2024-06-21 16:33:51,280][15401] Updated weights for policy 0, policy_version 23140 (0.0026) [2024-06-21 16:33:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 379174912. Throughput: 0: 41332.4. Samples: 379271080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:33:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 16:33:56,360][15401] Updated weights for policy 0, policy_version 23150 (0.0047) [2024-06-21 16:33:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 379404288. Throughput: 0: 41402.8. Samples: 379521580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:33:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 16:33:59,050][15401] Updated weights for policy 0, policy_version 23160 (0.0025) [2024-06-21 16:34:02,639][15349] Signal inference workers to stop experience collection... (5500 times) [2024-06-21 16:34:02,639][15349] Signal inference workers to resume experience collection... (5500 times) [2024-06-21 16:34:02,652][15401] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-21 16:34:02,653][15401] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-21 16:34:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 40958.4, 300 sec: 41320.9). Total num frames: 379584512. Throughput: 0: 41457.9. Samples: 379772900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:34:03,392][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 16:34:04,056][15401] Updated weights for policy 0, policy_version 23170 (0.0041) [2024-06-21 16:34:07,171][15401] Updated weights for policy 0, policy_version 23180 (0.0047) [2024-06-21 16:34:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41780.9, 300 sec: 41543.2). Total num frames: 379830272. Throughput: 0: 41442.3. Samples: 379897120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 16:34:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 16:34:11,723][15401] Updated weights for policy 0, policy_version 23190 (0.0043) [2024-06-21 16:34:13,390][15132] Fps is (10 sec: 45885.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 380043264. Throughput: 0: 41578.1. Samples: 380148800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 16:34:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 16:34:15,315][15401] Updated weights for policy 0, policy_version 23200 (0.0042) [2024-06-21 16:34:18,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 380207104. Throughput: 0: 41530.0. Samples: 380397380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 16:34:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 16:34:19,319][15401] Updated weights for policy 0, policy_version 23210 (0.0026) [2024-06-21 16:34:23,344][15401] Updated weights for policy 0, policy_version 23220 (0.0024) [2024-06-21 16:34:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 380436480. Throughput: 0: 41554.7. Samples: 380514740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 16:34:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 16:34:27,462][15401] Updated weights for policy 0, policy_version 23230 (0.0033) [2024-06-21 16:34:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 380649472. Throughput: 0: 41695.3. Samples: 380774480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:34:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 16:34:31,142][15401] Updated weights for policy 0, policy_version 23240 (0.0034) [2024-06-21 16:34:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 380862464. Throughput: 0: 41503.0. Samples: 381021900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:34:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 16:34:35,133][15401] Updated weights for policy 0, policy_version 23250 (0.0028) [2024-06-21 16:34:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 381075456. Throughput: 0: 41664.0. Samples: 381145960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 16:34:38,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 16:34:38,830][15401] Updated weights for policy 0, policy_version 23260 (0.0038) [2024-06-21 16:34:42,924][15401] Updated weights for policy 0, policy_version 23270 (0.0037) [2024-06-21 16:34:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 381272064. Throughput: 0: 41776.3. Samples: 381401520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:34:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 16:34:46,639][15401] Updated weights for policy 0, policy_version 23280 (0.0032) [2024-06-21 16:34:48,394][15132] Fps is (10 sec: 40943.4, 60 sec: 42322.5, 300 sec: 41542.6). Total num frames: 381485056. Throughput: 0: 41526.5. Samples: 381641660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:34:48,394][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 16:34:50,730][15401] Updated weights for policy 0, policy_version 23290 (0.0030) [2024-06-21 16:34:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41432.4). Total num frames: 381698048. Throughput: 0: 41480.9. Samples: 381763760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:34:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 16:34:54,448][15401] Updated weights for policy 0, policy_version 23300 (0.0039) [2024-06-21 16:34:58,389][15132] Fps is (10 sec: 37698.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 381861888. Throughput: 0: 41437.9. Samples: 382013500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 16:34:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:34:58,858][15401] Updated weights for policy 0, policy_version 23310 (0.0031) [2024-06-21 16:35:02,602][15401] Updated weights for policy 0, policy_version 23320 (0.0042) [2024-06-21 16:35:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41780.9, 300 sec: 41432.1). Total num frames: 382091264. Throughput: 0: 41408.1. Samples: 382260740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 16:35:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 16:35:06,603][15401] Updated weights for policy 0, policy_version 23330 (0.0034) [2024-06-21 16:35:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 382304256. Throughput: 0: 41717.1. Samples: 382392000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 16:35:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 16:35:10,331][15401] Updated weights for policy 0, policy_version 23340 (0.0046) [2024-06-21 16:35:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 382484480. Throughput: 0: 41391.9. Samples: 382637120. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 16:35:13,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 16:35:14,327][15401] Updated weights for policy 0, policy_version 23350 (0.0041) [2024-06-21 16:35:18,054][15401] Updated weights for policy 0, policy_version 23360 (0.0028) [2024-06-21 16:35:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 41487.6). Total num frames: 382746624. Throughput: 0: 41475.3. Samples: 382888280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-21 16:35:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 16:35:22,028][15401] Updated weights for policy 0, policy_version 23370 (0.0034) [2024-06-21 16:35:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 382926848. Throughput: 0: 41657.8. Samples: 383020560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-21 16:35:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 16:35:25,930][15401] Updated weights for policy 0, policy_version 23380 (0.0039) [2024-06-21 16:35:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 383139840. Throughput: 0: 41433.5. Samples: 383266020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-21 16:35:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 16:35:29,626][15349] Signal inference workers to stop experience collection... (5550 times) [2024-06-21 16:35:29,632][15349] Signal inference workers to resume experience collection... (5550 times) [2024-06-21 16:35:29,638][15401] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-21 16:35:29,680][15401] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-21 16:35:29,794][15401] Updated weights for policy 0, policy_version 23390 (0.0034) [2024-06-21 16:35:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 383352832. Throughput: 0: 41688.5. Samples: 383517480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-21 16:35:33,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-21 16:35:33,781][15401] Updated weights for policy 0, policy_version 23400 (0.0033) [2024-06-21 16:35:37,792][15401] Updated weights for policy 0, policy_version 23410 (0.0046) [2024-06-21 16:35:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 383565824. Throughput: 0: 41668.7. Samples: 383638860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-21 16:35:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 16:35:41,954][15401] Updated weights for policy 0, policy_version 23420 (0.0037) [2024-06-21 16:35:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 383762432. Throughput: 0: 41527.7. Samples: 383882240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-21 16:35:43,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-21 16:35:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023423_383762432.pth... [2024-06-21 16:35:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000022814_373784576.pth [2024-06-21 16:35:46,021][15401] Updated weights for policy 0, policy_version 23430 (0.0045) [2024-06-21 16:35:48,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41234.1, 300 sec: 41320.6). Total num frames: 383959040. Throughput: 0: 41550.6. Samples: 384130620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 16:35:48,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 16:35:50,127][15401] Updated weights for policy 0, policy_version 23440 (0.0037) [2024-06-21 16:35:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 384172032. Throughput: 0: 41405.6. Samples: 384255260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 16:35:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 16:35:54,195][15401] Updated weights for policy 0, policy_version 23450 (0.0024) [2024-06-21 16:35:58,048][15401] Updated weights for policy 0, policy_version 23460 (0.0035) [2024-06-21 16:35:58,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 384385024. Throughput: 0: 41656.4. Samples: 384511660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 16:35:58,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 16:36:02,079][15401] Updated weights for policy 0, policy_version 23470 (0.0024) [2024-06-21 16:36:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 384598016. Throughput: 0: 41469.1. Samples: 384754400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:36:03,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 16:36:05,765][15401] Updated weights for policy 0, policy_version 23480 (0.0040) [2024-06-21 16:36:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 384811008. Throughput: 0: 41378.7. Samples: 384882600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:36:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:36:09,911][15401] Updated weights for policy 0, policy_version 23490 (0.0042) [2024-06-21 16:36:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 384991232. Throughput: 0: 41455.9. Samples: 385131540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:36:13,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 16:36:13,814][15401] Updated weights for policy 0, policy_version 23500 (0.0034) [2024-06-21 16:36:17,631][15401] Updated weights for policy 0, policy_version 23510 (0.0031) [2024-06-21 16:36:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 385204224. Throughput: 0: 41271.7. Samples: 385374700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:18,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 16:36:21,851][15401] Updated weights for policy 0, policy_version 23520 (0.0043) [2024-06-21 16:36:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 385417216. Throughput: 0: 41339.6. Samples: 385499140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 16:36:25,450][15401] Updated weights for policy 0, policy_version 23530 (0.0030) [2024-06-21 16:36:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 385613824. Throughput: 0: 41435.5. Samples: 385746840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 16:36:30,088][15401] Updated weights for policy 0, policy_version 23540 (0.0032) [2024-06-21 16:36:33,396][15132] Fps is (10 sec: 40933.9, 60 sec: 41228.7, 300 sec: 41375.6). Total num frames: 385826816. Throughput: 0: 41247.0. Samples: 385986900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:33,397][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 16:36:33,689][15401] Updated weights for policy 0, policy_version 23550 (0.0038) [2024-06-21 16:36:38,139][15401] Updated weights for policy 0, policy_version 23560 (0.0042) [2024-06-21 16:36:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 386023424. Throughput: 0: 41325.4. Samples: 386114900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 16:36:41,668][15401] Updated weights for policy 0, policy_version 23570 (0.0035) [2024-06-21 16:36:43,390][15132] Fps is (10 sec: 40986.0, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 386236416. Throughput: 0: 41075.1. Samples: 386360040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:43,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-21 16:36:45,963][15401] Updated weights for policy 0, policy_version 23580 (0.0031) [2024-06-21 16:36:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41781.0, 300 sec: 41432.1). Total num frames: 386465792. Throughput: 0: 41065.9. Samples: 386602360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:36:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 16:36:49,416][15401] Updated weights for policy 0, policy_version 23590 (0.0030) [2024-06-21 16:36:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 386629632. Throughput: 0: 41188.4. Samples: 386736080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 16:36:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 16:36:53,799][15401] Updated weights for policy 0, policy_version 23600 (0.0044) [2024-06-21 16:36:56,990][15401] Updated weights for policy 0, policy_version 23610 (0.0040) [2024-06-21 16:36:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41504.5, 300 sec: 41376.2). Total num frames: 386875392. Throughput: 0: 41150.3. Samples: 386983400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 16:36:58,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 16:37:01,682][15401] Updated weights for policy 0, policy_version 23620 (0.0050) [2024-06-21 16:37:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 387072000. Throughput: 0: 41283.5. Samples: 387232460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:03,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-21 16:37:05,142][15401] Updated weights for policy 0, policy_version 23630 (0.0028) [2024-06-21 16:37:06,464][15349] Signal inference workers to stop experience collection... (5600 times) [2024-06-21 16:37:06,464][15349] Signal inference workers to resume experience collection... (5600 times) [2024-06-21 16:37:06,488][15401] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-21 16:37:06,488][15401] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-21 16:37:08,389][15132] Fps is (10 sec: 37692.5, 60 sec: 40686.9, 300 sec: 41376.6). Total num frames: 387252224. Throughput: 0: 41268.1. Samples: 387356200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 16:37:09,799][15401] Updated weights for policy 0, policy_version 23640 (0.0031) [2024-06-21 16:37:12,907][15401] Updated weights for policy 0, policy_version 23650 (0.0029) [2024-06-21 16:37:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 387497984. Throughput: 0: 41264.3. Samples: 387603740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 16:37:17,487][15401] Updated weights for policy 0, policy_version 23660 (0.0038) [2024-06-21 16:37:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 387678208. Throughput: 0: 41599.3. Samples: 387858600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 16:37:20,733][15401] Updated weights for policy 0, policy_version 23670 (0.0040) [2024-06-21 16:37:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41432.4). Total num frames: 387907584. Throughput: 0: 41350.3. Samples: 387975660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 16:37:25,475][15401] Updated weights for policy 0, policy_version 23680 (0.0037) [2024-06-21 16:37:28,355][15401] Updated weights for policy 0, policy_version 23690 (0.0029) [2024-06-21 16:37:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 388136960. Throughput: 0: 41582.3. Samples: 388231240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 16:37:33,181][15401] Updated weights for policy 0, policy_version 23700 (0.0039) [2024-06-21 16:37:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41237.5, 300 sec: 41432.1). Total num frames: 388300800. Throughput: 0: 41810.7. Samples: 388483840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:37:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 16:37:36,343][15401] Updated weights for policy 0, policy_version 23710 (0.0041) [2024-06-21 16:37:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 388530176. Throughput: 0: 41460.4. Samples: 388601800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:37:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 16:37:40,889][15401] Updated weights for policy 0, policy_version 23720 (0.0027) [2024-06-21 16:37:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 388710400. Throughput: 0: 41519.1. Samples: 388851660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:37:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 16:37:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023726_388726784.pth... [2024-06-21 16:37:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023119_378781696.pth [2024-06-21 16:37:44,327][15401] Updated weights for policy 0, policy_version 23730 (0.0033) [2024-06-21 16:37:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41487.7). Total num frames: 388923392. Throughput: 0: 41543.3. Samples: 389101900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:37:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 16:37:48,958][15401] Updated weights for policy 0, policy_version 23740 (0.0028) [2024-06-21 16:37:52,310][15401] Updated weights for policy 0, policy_version 23750 (0.0041) [2024-06-21 16:37:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 41543.2). Total num frames: 389169152. Throughput: 0: 41544.8. Samples: 389225720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 16:37:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 16:37:56,753][15401] Updated weights for policy 0, policy_version 23760 (0.0034) [2024-06-21 16:37:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 40960.0, 300 sec: 41376.2). Total num frames: 389332992. Throughput: 0: 41621.4. Samples: 389476800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:37:58,401][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 16:38:00,122][15401] Updated weights for policy 0, policy_version 23770 (0.0032) [2024-06-21 16:38:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 41432.4). Total num frames: 389545984. Throughput: 0: 41455.5. Samples: 389724100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:38:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 16:38:04,572][15401] Updated weights for policy 0, policy_version 23780 (0.0033) [2024-06-21 16:38:08,029][15401] Updated weights for policy 0, policy_version 23790 (0.0036) [2024-06-21 16:38:08,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42325.3, 300 sec: 41487.6). Total num frames: 389791744. Throughput: 0: 41689.3. Samples: 389851680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:38:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 16:38:12,700][15401] Updated weights for policy 0, policy_version 23800 (0.0029) [2024-06-21 16:38:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 389955584. Throughput: 0: 41548.4. Samples: 390100920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:38:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 16:38:16,071][15401] Updated weights for policy 0, policy_version 23810 (0.0046) [2024-06-21 16:38:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 390184960. Throughput: 0: 41281.2. Samples: 390341500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:38:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 16:38:20,641][15401] Updated weights for policy 0, policy_version 23820 (0.0034) [2024-06-21 16:38:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 390414336. Throughput: 0: 41505.5. Samples: 390469540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 16:38:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:38:23,881][15401] Updated weights for policy 0, policy_version 23830 (0.0039) [2024-06-21 16:38:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 390578176. Throughput: 0: 41565.8. Samples: 390722120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:38:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-21 16:38:28,414][15401] Updated weights for policy 0, policy_version 23840 (0.0045) [2024-06-21 16:38:31,836][15401] Updated weights for policy 0, policy_version 23850 (0.0033) [2024-06-21 16:38:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 390823936. Throughput: 0: 41292.4. Samples: 390960060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:38:33,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 16:38:36,314][15401] Updated weights for policy 0, policy_version 23860 (0.0043) [2024-06-21 16:38:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 41487.7). Total num frames: 391020544. Throughput: 0: 41591.7. Samples: 391097340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:38:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 16:38:39,615][15401] Updated weights for policy 0, policy_version 23870 (0.0039) [2024-06-21 16:38:43,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 391200768. Throughput: 0: 41573.8. Samples: 391347520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:38:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 16:38:43,937][15401] Updated weights for policy 0, policy_version 23880 (0.0040) [2024-06-21 16:38:47,505][15401] Updated weights for policy 0, policy_version 23890 (0.0036) [2024-06-21 16:38:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 41654.2). Total num frames: 391462912. Throughput: 0: 41461.8. Samples: 391589880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 16:38:48,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-21 16:38:51,577][15349] Signal inference workers to stop experience collection... (5650 times) [2024-06-21 16:38:51,577][15349] Signal inference workers to resume experience collection... (5650 times) [2024-06-21 16:38:51,599][15401] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-21 16:38:51,599][15401] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-21 16:38:51,734][15401] Updated weights for policy 0, policy_version 23900 (0.0036) [2024-06-21 16:38:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 391610368. Throughput: 0: 41586.2. Samples: 391723060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 16:38:53,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 16:38:55,553][15401] Updated weights for policy 0, policy_version 23910 (0.0041) [2024-06-21 16:38:58,392][15132] Fps is (10 sec: 37674.2, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 391839744. Throughput: 0: 41367.6. Samples: 391962560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 16:38:58,393][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 16:38:59,667][15401] Updated weights for policy 0, policy_version 23920 (0.0038) [2024-06-21 16:39:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 392052736. Throughput: 0: 41656.1. Samples: 392216020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:39:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 16:39:03,581][15401] Updated weights for policy 0, policy_version 23930 (0.0048) [2024-06-21 16:39:07,733][15401] Updated weights for policy 0, policy_version 23940 (0.0037) [2024-06-21 16:39:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 392249344. Throughput: 0: 41568.8. Samples: 392340140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:39:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 16:39:11,426][15401] Updated weights for policy 0, policy_version 23950 (0.0034) [2024-06-21 16:39:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 392478720. Throughput: 0: 41410.6. Samples: 392585600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:39:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 16:39:15,807][15401] Updated weights for policy 0, policy_version 23960 (0.0036) [2024-06-21 16:39:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 392675328. Throughput: 0: 41691.1. Samples: 392836160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:39:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 16:39:19,431][15401] Updated weights for policy 0, policy_version 23970 (0.0040) [2024-06-21 16:39:23,392][15132] Fps is (10 sec: 37674.5, 60 sec: 40685.2, 300 sec: 41376.2). Total num frames: 392855552. Throughput: 0: 41336.3. Samples: 392957580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:39:23,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 16:39:23,782][15401] Updated weights for policy 0, policy_version 23980 (0.0047) [2024-06-21 16:39:27,187][15401] Updated weights for policy 0, policy_version 23990 (0.0030) [2024-06-21 16:39:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 393101312. Throughput: 0: 41424.8. Samples: 393211640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 16:39:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 16:39:31,533][15401] Updated weights for policy 0, policy_version 24000 (0.0036) [2024-06-21 16:39:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 393297920. Throughput: 0: 41541.3. Samples: 393459240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:39:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 16:39:35,059][15401] Updated weights for policy 0, policy_version 24010 (0.0030) [2024-06-21 16:39:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 393510912. Throughput: 0: 41303.6. Samples: 393581720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:39:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 16:39:39,463][15401] Updated weights for policy 0, policy_version 24020 (0.0038) [2024-06-21 16:39:43,286][15401] Updated weights for policy 0, policy_version 24030 (0.0044) [2024-06-21 16:39:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41432.6). Total num frames: 393707520. Throughput: 0: 41504.9. Samples: 393830180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 16:39:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 16:39:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024031_393723904.pth... [2024-06-21 16:39:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023423_383762432.pth [2024-06-21 16:39:47,443][15401] Updated weights for policy 0, policy_version 24040 (0.0036) [2024-06-21 16:39:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 393920512. Throughput: 0: 41523.1. Samples: 394084560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:39:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 16:39:50,962][15401] Updated weights for policy 0, policy_version 24050 (0.0041) [2024-06-21 16:39:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 394133504. Throughput: 0: 41461.3. Samples: 394205900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:39:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-21 16:39:55,195][15401] Updated weights for policy 0, policy_version 24060 (0.0030) [2024-06-21 16:39:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41507.8, 300 sec: 41487.6). Total num frames: 394330112. Throughput: 0: 41632.5. Samples: 394459060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:39:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 16:39:58,549][15401] Updated weights for policy 0, policy_version 24070 (0.0041) [2024-06-21 16:40:03,042][15401] Updated weights for policy 0, policy_version 24080 (0.0040) [2024-06-21 16:40:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 394526720. Throughput: 0: 41666.7. Samples: 394711160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:40:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:40:06,305][15401] Updated weights for policy 0, policy_version 24090 (0.0024) [2024-06-21 16:40:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 394772480. Throughput: 0: 41744.0. Samples: 394835960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 23.0) [2024-06-21 16:40:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:40:10,919][15401] Updated weights for policy 0, policy_version 24100 (0.0033) [2024-06-21 16:40:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 394969088. Throughput: 0: 41767.2. Samples: 395091160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 23.0) [2024-06-21 16:40:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 16:40:14,085][15401] Updated weights for policy 0, policy_version 24110 (0.0029) [2024-06-21 16:40:14,940][15349] Signal inference workers to stop experience collection... (5700 times) [2024-06-21 16:40:14,994][15401] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-21 16:40:14,995][15349] Signal inference workers to resume experience collection... (5700 times) [2024-06-21 16:40:15,006][15401] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-21 16:40:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 395149312. Throughput: 0: 41757.3. Samples: 395338320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 23.0) [2024-06-21 16:40:18,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 16:40:18,649][15401] Updated weights for policy 0, policy_version 24120 (0.0028) [2024-06-21 16:40:21,760][15401] Updated weights for policy 0, policy_version 24130 (0.0033) [2024-06-21 16:40:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42327.0, 300 sec: 41543.1). Total num frames: 395395072. Throughput: 0: 41783.9. Samples: 395462000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 16:40:23,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-21 16:40:26,401][15401] Updated weights for policy 0, policy_version 24140 (0.0036) [2024-06-21 16:40:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 395558912. Throughput: 0: 41786.8. Samples: 395710580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 16:40:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 16:40:29,661][15401] Updated weights for policy 0, policy_version 24150 (0.0040) [2024-06-21 16:40:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 395804672. Throughput: 0: 41696.0. Samples: 395960880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 16:40:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 16:40:34,347][15401] Updated weights for policy 0, policy_version 24160 (0.0040) [2024-06-21 16:40:38,166][15401] Updated weights for policy 0, policy_version 24170 (0.0032) [2024-06-21 16:40:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 396017664. Throughput: 0: 41832.8. Samples: 396088380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:40:38,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-21 16:40:42,179][15401] Updated weights for policy 0, policy_version 24180 (0.0048) [2024-06-21 16:40:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41233.2, 300 sec: 41432.4). Total num frames: 396181504. Throughput: 0: 41593.5. Samples: 396330760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:40:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 16:40:45,789][15401] Updated weights for policy 0, policy_version 24190 (0.0035) [2024-06-21 16:40:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 396410880. Throughput: 0: 41481.7. Samples: 396577840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:40:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 16:40:49,919][15401] Updated weights for policy 0, policy_version 24200 (0.0031) [2024-06-21 16:40:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 396623872. Throughput: 0: 41631.1. Samples: 396709360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:40:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 16:40:53,881][15401] Updated weights for policy 0, policy_version 24210 (0.0039) [2024-06-21 16:40:58,068][15401] Updated weights for policy 0, policy_version 24220 (0.0033) [2024-06-21 16:40:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 396820480. Throughput: 0: 41410.1. Samples: 396954620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 16:40:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-21 16:41:01,585][15401] Updated weights for policy 0, policy_version 24230 (0.0044) [2024-06-21 16:41:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41543.1). Total num frames: 397066240. Throughput: 0: 41256.0. Samples: 397194840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 16:41:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 16:41:06,003][15401] Updated weights for policy 0, policy_version 24240 (0.0048) [2024-06-21 16:41:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 397246464. Throughput: 0: 41459.1. Samples: 397327660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 16:41:08,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 16:41:09,454][15401] Updated weights for policy 0, policy_version 24250 (0.0030) [2024-06-21 16:41:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 397443072. Throughput: 0: 41467.5. Samples: 397576620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:41:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 16:41:14,030][15401] Updated weights for policy 0, policy_version 24260 (0.0038) [2024-06-21 16:41:17,290][15401] Updated weights for policy 0, policy_version 24270 (0.0033) [2024-06-21 16:41:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 397672448. Throughput: 0: 41445.3. Samples: 397825920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:41:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 16:41:21,991][15401] Updated weights for policy 0, policy_version 24280 (0.0040) [2024-06-21 16:41:23,396][15132] Fps is (10 sec: 40933.5, 60 sec: 40955.7, 300 sec: 41486.7). Total num frames: 397852672. Throughput: 0: 41530.6. Samples: 397957520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 16:41:23,397][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 16:41:25,154][15401] Updated weights for policy 0, policy_version 24290 (0.0041) [2024-06-21 16:41:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.1, 300 sec: 41488.5). Total num frames: 398065664. Throughput: 0: 41446.5. Samples: 398195860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 16:41:28,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 16:41:29,485][15349] Signal inference workers to stop experience collection... (5750 times) [2024-06-21 16:41:29,536][15401] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-21 16:41:29,596][15349] Signal inference workers to resume experience collection... (5750 times) [2024-06-21 16:41:29,596][15401] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-21 16:41:29,760][15401] Updated weights for policy 0, policy_version 24300 (0.0022) [2024-06-21 16:41:32,827][15401] Updated weights for policy 0, policy_version 24310 (0.0033) [2024-06-21 16:41:33,390][15132] Fps is (10 sec: 44265.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 398295040. Throughput: 0: 41711.0. Samples: 398454840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 16:41:33,396][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 16:41:37,466][15401] Updated weights for policy 0, policy_version 24320 (0.0031) [2024-06-21 16:41:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 398508032. Throughput: 0: 41715.2. Samples: 398586540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 16:41:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 16:41:41,035][15401] Updated weights for policy 0, policy_version 24330 (0.0043) [2024-06-21 16:41:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 41543.2). Total num frames: 398721024. Throughput: 0: 41667.2. Samples: 398829640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 16:41:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 16:41:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024336_398721024.pth... [2024-06-21 16:41:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000023726_388726784.pth [2024-06-21 16:41:45,334][15401] Updated weights for policy 0, policy_version 24340 (0.0024) [2024-06-21 16:41:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 398934016. Throughput: 0: 42026.6. Samples: 399086040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 16:41:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 16:41:48,697][15401] Updated weights for policy 0, policy_version 24350 (0.0041) [2024-06-21 16:41:52,887][15401] Updated weights for policy 0, policy_version 24360 (0.0038) [2024-06-21 16:41:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 399130624. Throughput: 0: 41970.7. Samples: 399216340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 16:41:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 16:41:56,503][15401] Updated weights for policy 0, policy_version 24370 (0.0029) [2024-06-21 16:41:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 399327232. Throughput: 0: 41851.4. Samples: 399459940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 16:41:58,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-21 16:42:00,626][15401] Updated weights for policy 0, policy_version 24380 (0.0039) [2024-06-21 16:42:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 399556608. Throughput: 0: 41987.1. Samples: 399715340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:42:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 16:42:04,286][15401] Updated weights for policy 0, policy_version 24390 (0.0036) [2024-06-21 16:42:08,262][15401] Updated weights for policy 0, policy_version 24400 (0.0035) [2024-06-21 16:42:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 399769600. Throughput: 0: 41826.3. Samples: 399839440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:42:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-21 16:42:12,293][15401] Updated weights for policy 0, policy_version 24410 (0.0033) [2024-06-21 16:42:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 399949824. Throughput: 0: 42035.5. Samples: 400087460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:42:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 16:42:16,017][15401] Updated weights for policy 0, policy_version 24420 (0.0032) [2024-06-21 16:42:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 400179200. Throughput: 0: 41783.7. Samples: 400335100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-21 16:42:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 16:42:20,082][15401] Updated weights for policy 0, policy_version 24430 (0.0025) [2024-06-21 16:42:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41783.6, 300 sec: 41432.1). Total num frames: 400359424. Throughput: 0: 41619.9. Samples: 400459440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-21 16:42:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 16:42:24,168][15401] Updated weights for policy 0, policy_version 24440 (0.0042) [2024-06-21 16:42:28,107][15401] Updated weights for policy 0, policy_version 24450 (0.0029) [2024-06-21 16:42:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 400588800. Throughput: 0: 41801.7. Samples: 400710720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-21 16:42:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 16:42:32,227][15401] Updated weights for policy 0, policy_version 24460 (0.0039) [2024-06-21 16:42:33,391][15132] Fps is (10 sec: 44231.0, 60 sec: 41778.3, 300 sec: 41598.5). Total num frames: 400801792. Throughput: 0: 41549.0. Samples: 400955800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-21 16:42:33,391][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 16:42:36,206][15401] Updated weights for policy 0, policy_version 24470 (0.0034) [2024-06-21 16:42:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 400998400. Throughput: 0: 41562.7. Samples: 401086660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:42:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 16:42:39,779][15401] Updated weights for policy 0, policy_version 24480 (0.0033) [2024-06-21 16:42:43,390][15132] Fps is (10 sec: 39326.9, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 401195008. Throughput: 0: 41601.5. Samples: 401332000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:42:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 16:42:43,615][15349] Signal inference workers to stop experience collection... (5800 times) [2024-06-21 16:42:43,615][15349] Signal inference workers to resume experience collection... (5800 times) [2024-06-21 16:42:43,629][15401] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-21 16:42:43,629][15401] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-21 16:42:43,985][15401] Updated weights for policy 0, policy_version 24490 (0.0031) [2024-06-21 16:42:47,756][15401] Updated weights for policy 0, policy_version 24500 (0.0031) [2024-06-21 16:42:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 401424384. Throughput: 0: 41301.5. Samples: 401573900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 16:42:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 16:42:51,871][15401] Updated weights for policy 0, policy_version 24510 (0.0042) [2024-06-21 16:42:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41654.6). Total num frames: 401620992. Throughput: 0: 41547.3. Samples: 401709060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:42:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 16:42:55,637][15401] Updated weights for policy 0, policy_version 24520 (0.0038) [2024-06-21 16:42:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 401817600. Throughput: 0: 41531.2. Samples: 401956360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:42:58,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 16:42:59,769][15401] Updated weights for policy 0, policy_version 24530 (0.0047) [2024-06-21 16:43:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 402046976. Throughput: 0: 41499.1. Samples: 402202560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 16:43:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 16:43:03,543][15401] Updated weights for policy 0, policy_version 24540 (0.0035) [2024-06-21 16:43:07,735][15401] Updated weights for policy 0, policy_version 24550 (0.0036) [2024-06-21 16:43:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 402243584. Throughput: 0: 41542.2. Samples: 402328840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:43:08,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-21 16:43:11,352][15401] Updated weights for policy 0, policy_version 24560 (0.0038) [2024-06-21 16:43:13,392][15132] Fps is (10 sec: 40949.8, 60 sec: 41777.6, 300 sec: 41598.4). Total num frames: 402456576. Throughput: 0: 41366.7. Samples: 402572320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:43:13,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:43:15,535][15401] Updated weights for policy 0, policy_version 24570 (0.0032) [2024-06-21 16:43:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 402653184. Throughput: 0: 41524.9. Samples: 402824360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:43:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 16:43:19,230][15401] Updated weights for policy 0, policy_version 24580 (0.0032) [2024-06-21 16:43:23,392][15132] Fps is (10 sec: 40960.1, 60 sec: 41777.6, 300 sec: 41653.9). Total num frames: 402866176. Throughput: 0: 41322.2. Samples: 402946260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:43:23,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 16:43:23,569][15401] Updated weights for policy 0, policy_version 24590 (0.0035) [2024-06-21 16:43:27,193][15401] Updated weights for policy 0, policy_version 24600 (0.0043) [2024-06-21 16:43:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 403062784. Throughput: 0: 41319.0. Samples: 403191360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:43:28,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 16:43:31,611][15401] Updated weights for policy 0, policy_version 24610 (0.0036) [2024-06-21 16:43:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 40961.0, 300 sec: 41487.6). Total num frames: 403259392. Throughput: 0: 41602.2. Samples: 403446000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:43:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 16:43:35,167][15401] Updated weights for policy 0, policy_version 24620 (0.0035) [2024-06-21 16:43:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 403488768. Throughput: 0: 41330.6. Samples: 403568940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 16:43:38,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 16:43:39,695][15401] Updated weights for policy 0, policy_version 24630 (0.0039) [2024-06-21 16:43:43,347][15401] Updated weights for policy 0, policy_version 24640 (0.0024) [2024-06-21 16:43:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 403701760. Throughput: 0: 41359.1. Samples: 403817520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:43:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 16:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024640_403701760.pth... [2024-06-21 16:43:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024031_393723904.pth [2024-06-21 16:43:47,613][15401] Updated weights for policy 0, policy_version 24650 (0.0051) [2024-06-21 16:43:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 403881984. Throughput: 0: 41363.1. Samples: 404063900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:43:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 16:43:51,219][15401] Updated weights for policy 0, policy_version 24660 (0.0027) [2024-06-21 16:43:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41654.6). Total num frames: 404127744. Throughput: 0: 41244.5. Samples: 404184840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-21 16:43:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 16:43:55,451][15401] Updated weights for policy 0, policy_version 24670 (0.0034) [2024-06-21 16:43:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 404307968. Throughput: 0: 41451.6. Samples: 404437540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:43:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 16:43:59,029][15401] Updated weights for policy 0, policy_version 24680 (0.0033) [2024-06-21 16:44:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 404520960. Throughput: 0: 41434.6. Samples: 404688920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:44:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 16:44:03,401][15401] Updated weights for policy 0, policy_version 24690 (0.0043) [2024-06-21 16:44:06,866][15401] Updated weights for policy 0, policy_version 24700 (0.0032) [2024-06-21 16:44:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 404717568. Throughput: 0: 41514.2. Samples: 404814300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:44:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:44:11,167][15401] Updated weights for policy 0, policy_version 24710 (0.0037) [2024-06-21 16:44:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41234.7, 300 sec: 41543.1). Total num frames: 404930560. Throughput: 0: 41545.4. Samples: 405060900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:44:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 16:44:13,980][15349] Signal inference workers to stop experience collection... (5850 times) [2024-06-21 16:44:13,980][15349] Signal inference workers to resume experience collection... (5850 times) [2024-06-21 16:44:14,016][15401] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-21 16:44:14,016][15401] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-21 16:44:14,806][15401] Updated weights for policy 0, policy_version 24720 (0.0031) [2024-06-21 16:44:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 405143552. Throughput: 0: 41372.4. Samples: 405307760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 16:44:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:44:19,190][15401] Updated weights for policy 0, policy_version 24730 (0.0044) [2024-06-21 16:44:22,822][15401] Updated weights for policy 0, policy_version 24740 (0.0033) [2024-06-21 16:44:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41507.7, 300 sec: 41543.1). Total num frames: 405356544. Throughput: 0: 41507.9. Samples: 405436800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 16:44:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 16:44:26,831][15401] Updated weights for policy 0, policy_version 24750 (0.0028) [2024-06-21 16:44:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 405569536. Throughput: 0: 41588.3. Samples: 405689000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 16:44:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 16:44:30,674][15401] Updated weights for policy 0, policy_version 24760 (0.0030) [2024-06-21 16:44:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 405766144. Throughput: 0: 41648.8. Samples: 405938100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:44:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 16:44:34,870][15401] Updated weights for policy 0, policy_version 24770 (0.0039) [2024-06-21 16:44:38,389][15132] Fps is (10 sec: 39322.6, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 405962752. Throughput: 0: 41678.3. Samples: 406060360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:44:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 16:44:38,559][15401] Updated weights for policy 0, policy_version 24780 (0.0045) [2024-06-21 16:44:42,651][15401] Updated weights for policy 0, policy_version 24790 (0.0040) [2024-06-21 16:44:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 406175744. Throughput: 0: 41576.8. Samples: 406308500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:44:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:44:46,622][15401] Updated weights for policy 0, policy_version 24800 (0.0036) [2024-06-21 16:44:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 406372352. Throughput: 0: 41361.0. Samples: 406550160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 16:44:48,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-21 16:44:50,553][15401] Updated weights for policy 0, policy_version 24810 (0.0031) [2024-06-21 16:44:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 406585344. Throughput: 0: 41343.1. Samples: 406674740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 16:44:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 16:44:54,328][15401] Updated weights for policy 0, policy_version 24820 (0.0033) [2024-06-21 16:44:58,390][15132] Fps is (10 sec: 42597.1, 60 sec: 41505.9, 300 sec: 41598.7). Total num frames: 406798336. Throughput: 0: 41387.0. Samples: 406923320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 16:44:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 16:44:58,571][15401] Updated weights for policy 0, policy_version 24830 (0.0029) [2024-06-21 16:45:02,506][15401] Updated weights for policy 0, policy_version 24840 (0.0050) [2024-06-21 16:45:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 407011328. Throughput: 0: 41554.2. Samples: 407177700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 16:45:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 16:45:06,243][15401] Updated weights for policy 0, policy_version 24850 (0.0040) [2024-06-21 16:45:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 407207936. Throughput: 0: 41255.6. Samples: 407293300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 16:45:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 16:45:10,462][15401] Updated weights for policy 0, policy_version 24860 (0.0053) [2024-06-21 16:45:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 407420928. Throughput: 0: 41167.2. Samples: 407541520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 16:45:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 16:45:14,345][15401] Updated weights for policy 0, policy_version 24870 (0.0040) [2024-06-21 16:45:18,188][15401] Updated weights for policy 0, policy_version 24880 (0.0033) [2024-06-21 16:45:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 407633920. Throughput: 0: 41365.8. Samples: 407799560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 16:45:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 16:45:22,233][15401] Updated weights for policy 0, policy_version 24890 (0.0024) [2024-06-21 16:45:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.3, 300 sec: 41654.2). Total num frames: 407846912. Throughput: 0: 41348.8. Samples: 407921060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:45:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 16:45:25,937][15401] Updated weights for policy 0, policy_version 24900 (0.0039) [2024-06-21 16:45:26,574][15349] Signal inference workers to stop experience collection... (5900 times) [2024-06-21 16:45:26,600][15401] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-21 16:45:26,685][15349] Signal inference workers to resume experience collection... (5900 times) [2024-06-21 16:45:26,686][15401] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-21 16:45:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 408059904. Throughput: 0: 41367.1. Samples: 408170020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:45:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 16:45:29,959][15401] Updated weights for policy 0, policy_version 24910 (0.0040) [2024-06-21 16:45:33,390][15132] Fps is (10 sec: 39320.7, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 408240128. Throughput: 0: 41730.9. Samples: 408428060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:45:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:45:33,842][15401] Updated weights for policy 0, policy_version 24920 (0.0028) [2024-06-21 16:45:37,812][15401] Updated weights for policy 0, policy_version 24930 (0.0030) [2024-06-21 16:45:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 408469504. Throughput: 0: 41594.6. Samples: 408546500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:45:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:45:42,232][15401] Updated weights for policy 0, policy_version 24940 (0.0032) [2024-06-21 16:45:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 408698880. Throughput: 0: 41791.7. Samples: 408803940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:45:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 16:45:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024945_408698880.pth... [2024-06-21 16:45:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024336_398721024.pth [2024-06-21 16:45:45,845][15401] Updated weights for policy 0, policy_version 24950 (0.0043) [2024-06-21 16:45:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 408879104. Throughput: 0: 41591.5. Samples: 409049320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:45:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 16:45:50,183][15401] Updated weights for policy 0, policy_version 24960 (0.0047) [2024-06-21 16:45:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 409075712. Throughput: 0: 41659.1. Samples: 409167960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:45:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 16:45:53,707][15401] Updated weights for policy 0, policy_version 24970 (0.0029) [2024-06-21 16:45:58,054][15401] Updated weights for policy 0, policy_version 24980 (0.0042) [2024-06-21 16:45:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.6, 300 sec: 41431.8). Total num frames: 409288704. Throughput: 0: 41876.5. Samples: 409426060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:45:58,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:46:01,367][15401] Updated weights for policy 0, policy_version 24990 (0.0031) [2024-06-21 16:46:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 409501696. Throughput: 0: 41582.3. Samples: 409670760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:46:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 16:46:05,797][15401] Updated weights for policy 0, policy_version 25000 (0.0050) [2024-06-21 16:46:08,389][15132] Fps is (10 sec: 44247.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 409731072. Throughput: 0: 41802.6. Samples: 409802180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:46:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 16:46:09,111][15401] Updated weights for policy 0, policy_version 25010 (0.0040) [2024-06-21 16:46:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 409894912. Throughput: 0: 41668.8. Samples: 410045120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 16:46:13,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 16:46:13,997][15401] Updated weights for policy 0, policy_version 25020 (0.0039) [2024-06-21 16:46:16,903][15401] Updated weights for policy 0, policy_version 25030 (0.0026) [2024-06-21 16:46:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41504.5, 300 sec: 41599.3). Total num frames: 410124288. Throughput: 0: 41485.9. Samples: 410295020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 16:46:18,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 16:46:21,533][15401] Updated weights for policy 0, policy_version 25040 (0.0039) [2024-06-21 16:46:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 410353664. Throughput: 0: 41786.2. Samples: 410426880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 16:46:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 16:46:24,651][15401] Updated weights for policy 0, policy_version 25050 (0.0052) [2024-06-21 16:46:28,390][15132] Fps is (10 sec: 40969.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 410533888. Throughput: 0: 41506.2. Samples: 410671720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:46:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 16:46:29,153][15401] Updated weights for policy 0, policy_version 25060 (0.0046) [2024-06-21 16:46:32,403][15401] Updated weights for policy 0, policy_version 25070 (0.0036) [2024-06-21 16:46:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41543.2). Total num frames: 410763264. Throughput: 0: 41690.3. Samples: 410925380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:46:33,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 16:46:37,095][15401] Updated weights for policy 0, policy_version 25080 (0.0043) [2024-06-21 16:46:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 410976256. Throughput: 0: 41956.2. Samples: 411055980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:46:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 16:46:40,458][15401] Updated weights for policy 0, policy_version 25090 (0.0038) [2024-06-21 16:46:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 411172864. Throughput: 0: 41618.2. Samples: 411298780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:46:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 16:46:45,077][15401] Updated weights for policy 0, policy_version 25100 (0.0036) [2024-06-21 16:46:46,186][15349] Signal inference workers to stop experience collection... (5950 times) [2024-06-21 16:46:46,220][15401] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-21 16:46:46,238][15349] Signal inference workers to resume experience collection... (5950 times) [2024-06-21 16:46:46,239][15401] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-21 16:46:48,311][15401] Updated weights for policy 0, policy_version 25110 (0.0046) [2024-06-21 16:46:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.6, 300 sec: 41598.4). Total num frames: 411402240. Throughput: 0: 41703.5. Samples: 411547520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:46:48,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 16:46:52,697][15401] Updated weights for policy 0, policy_version 25120 (0.0035) [2024-06-21 16:46:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 411582464. Throughput: 0: 41714.2. Samples: 411679320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:46:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:46:56,355][15401] Updated weights for policy 0, policy_version 25130 (0.0026) [2024-06-21 16:46:58,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42053.9, 300 sec: 41543.2). Total num frames: 411811840. Throughput: 0: 41804.0. Samples: 411926300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-21 16:46:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 16:47:00,977][15401] Updated weights for policy 0, policy_version 25140 (0.0042) [2024-06-21 16:47:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 412008448. Throughput: 0: 41802.2. Samples: 412176020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-21 16:47:03,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-21 16:47:04,238][15401] Updated weights for policy 0, policy_version 25150 (0.0030) [2024-06-21 16:47:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 412205056. Throughput: 0: 41538.6. Samples: 412296120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-21 16:47:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 16:47:08,613][15401] Updated weights for policy 0, policy_version 25160 (0.0041) [2024-06-21 16:47:12,175][15401] Updated weights for policy 0, policy_version 25170 (0.0041) [2024-06-21 16:47:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 41543.2). Total num frames: 412434432. Throughput: 0: 41791.3. Samples: 412552320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-21 16:47:13,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 16:47:16,295][15401] Updated weights for policy 0, policy_version 25180 (0.0044) [2024-06-21 16:47:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42054.0, 300 sec: 41654.2). Total num frames: 412647424. Throughput: 0: 41649.7. Samples: 412799620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-21 16:47:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 16:47:19,798][15401] Updated weights for policy 0, policy_version 25190 (0.0041) [2024-06-21 16:47:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41504.5, 300 sec: 41542.8). Total num frames: 412844032. Throughput: 0: 41528.8. Samples: 412924880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:47:23,393][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 16:47:24,072][15401] Updated weights for policy 0, policy_version 25200 (0.0034) [2024-06-21 16:47:27,617][15401] Updated weights for policy 0, policy_version 25210 (0.0027) [2024-06-21 16:47:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41543.4). Total num frames: 413057024. Throughput: 0: 41694.6. Samples: 413175040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:47:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 16:47:31,653][15401] Updated weights for policy 0, policy_version 25220 (0.0041) [2024-06-21 16:47:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 413270016. Throughput: 0: 41727.5. Samples: 413425160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 16:47:33,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-21 16:47:35,746][15401] Updated weights for policy 0, policy_version 25230 (0.0044) [2024-06-21 16:47:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 413466624. Throughput: 0: 41561.7. Samples: 413549600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 16:47:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 16:47:39,300][15401] Updated weights for policy 0, policy_version 25240 (0.0028) [2024-06-21 16:47:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 413663232. Throughput: 0: 41678.3. Samples: 413801820. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 16:47:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 16:47:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025249_413679616.pth... [2024-06-21 16:47:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024640_403701760.pth [2024-06-21 16:47:43,718][15401] Updated weights for policy 0, policy_version 25250 (0.0034) [2024-06-21 16:47:47,269][15401] Updated weights for policy 0, policy_version 25260 (0.0032) [2024-06-21 16:47:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41234.7, 300 sec: 41543.2). Total num frames: 413876224. Throughput: 0: 41590.7. Samples: 414047600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 16:47:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 16:47:51,438][15401] Updated weights for policy 0, policy_version 25270 (0.0035) [2024-06-21 16:47:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 414105600. Throughput: 0: 41753.8. Samples: 414175040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 16:47:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 16:47:55,121][15401] Updated weights for policy 0, policy_version 25280 (0.0045) [2024-06-21 16:47:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 414285824. Throughput: 0: 41685.3. Samples: 414428160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 16:47:58,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 16:47:59,227][15401] Updated weights for policy 0, policy_version 25290 (0.0035) [2024-06-21 16:48:02,729][15401] Updated weights for policy 0, policy_version 25300 (0.0048) [2024-06-21 16:48:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 414531584. Throughput: 0: 41674.6. Samples: 414674980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 16:48:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 16:48:06,983][15401] Updated weights for policy 0, policy_version 25310 (0.0033) [2024-06-21 16:48:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41543.5). Total num frames: 414711808. Throughput: 0: 41797.9. Samples: 414805680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 16:48:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 16:48:10,450][15401] Updated weights for policy 0, policy_version 25320 (0.0030) [2024-06-21 16:48:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 414924800. Throughput: 0: 41825.7. Samples: 415057200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:48:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 16:48:14,841][15401] Updated weights for policy 0, policy_version 25330 (0.0042) [2024-06-21 16:48:18,214][15349] Signal inference workers to stop experience collection... (6000 times) [2024-06-21 16:48:18,214][15349] Signal inference workers to resume experience collection... (6000 times) [2024-06-21 16:48:18,226][15401] Updated weights for policy 0, policy_version 25340 (0.0046) [2024-06-21 16:48:18,245][15401] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-21 16:48:18,274][15401] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-21 16:48:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.4, 300 sec: 41765.7). Total num frames: 415186944. Throughput: 0: 41867.2. Samples: 415309180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:48:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 16:48:22,802][15401] Updated weights for policy 0, policy_version 25350 (0.0037) [2024-06-21 16:48:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41780.9, 300 sec: 41654.3). Total num frames: 415350784. Throughput: 0: 42070.8. Samples: 415442780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:48:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 16:48:25,920][15401] Updated weights for policy 0, policy_version 25360 (0.0034) [2024-06-21 16:48:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 415563776. Throughput: 0: 42056.8. Samples: 415694380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:48:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 16:48:30,529][15401] Updated weights for policy 0, policy_version 25370 (0.0031) [2024-06-21 16:48:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 415793152. Throughput: 0: 42045.3. Samples: 415939640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:48:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 16:48:33,768][15401] Updated weights for policy 0, policy_version 25380 (0.0033) [2024-06-21 16:48:38,187][15401] Updated weights for policy 0, policy_version 25390 (0.0038) [2024-06-21 16:48:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41654.2). Total num frames: 415989760. Throughput: 0: 42201.9. Samples: 416074120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:48:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 16:48:41,655][15401] Updated weights for policy 0, policy_version 25400 (0.0042) [2024-06-21 16:48:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 416186368. Throughput: 0: 42104.0. Samples: 416322840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 16:48:43,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-21 16:48:45,827][15401] Updated weights for policy 0, policy_version 25410 (0.0038) [2024-06-21 16:48:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 416432128. Throughput: 0: 42131.7. Samples: 416570900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 16:48:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 16:48:49,753][15401] Updated weights for policy 0, policy_version 25420 (0.0044) [2024-06-21 16:48:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 416628736. Throughput: 0: 42145.3. Samples: 416702220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 16:48:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 16:48:53,924][15401] Updated weights for policy 0, policy_version 25430 (0.0043) [2024-06-21 16:48:57,422][15401] Updated weights for policy 0, policy_version 25440 (0.0037) [2024-06-21 16:48:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 416825344. Throughput: 0: 42045.0. Samples: 416949220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 16:48:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 16:49:01,803][15401] Updated weights for policy 0, policy_version 25450 (0.0035) [2024-06-21 16:49:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 417054720. Throughput: 0: 42090.6. Samples: 417203260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:49:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 16:49:05,299][15401] Updated weights for policy 0, policy_version 25460 (0.0033) [2024-06-21 16:49:08,394][15132] Fps is (10 sec: 44219.0, 60 sec: 42595.5, 300 sec: 41820.3). Total num frames: 417267712. Throughput: 0: 42059.8. Samples: 417335640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:49:08,394][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 16:49:09,669][15401] Updated weights for policy 0, policy_version 25470 (0.0041) [2024-06-21 16:49:13,046][15401] Updated weights for policy 0, policy_version 25480 (0.0036) [2024-06-21 16:49:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 41765.0). Total num frames: 417464320. Throughput: 0: 42076.0. Samples: 417587900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:49:13,393][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 16:49:17,340][15401] Updated weights for policy 0, policy_version 25490 (0.0034) [2024-06-21 16:49:18,389][15132] Fps is (10 sec: 40976.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 417677312. Throughput: 0: 42055.6. Samples: 417832140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 16:49:18,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-21 16:49:21,051][15401] Updated weights for policy 0, policy_version 25500 (0.0024) [2024-06-21 16:49:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 417890304. Throughput: 0: 41925.2. Samples: 417960760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:49:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 16:49:24,950][15401] Updated weights for policy 0, policy_version 25510 (0.0030) [2024-06-21 16:49:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 418086912. Throughput: 0: 42014.7. Samples: 418213500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:49:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 16:49:28,905][15401] Updated weights for policy 0, policy_version 25520 (0.0034) [2024-06-21 16:49:32,621][15401] Updated weights for policy 0, policy_version 25530 (0.0035) [2024-06-21 16:49:33,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42047.8, 300 sec: 41875.5). Total num frames: 418316288. Throughput: 0: 42105.5. Samples: 418465920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 16:49:33,396][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 16:49:36,771][15401] Updated weights for policy 0, policy_version 25540 (0.0035) [2024-06-21 16:49:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 418512896. Throughput: 0: 42069.3. Samples: 418595340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:49:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 16:49:40,192][15401] Updated weights for policy 0, policy_version 25550 (0.0040) [2024-06-21 16:49:43,389][15132] Fps is (10 sec: 39346.8, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 418709504. Throughput: 0: 42184.9. Samples: 418847540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:49:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 16:49:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025556_418709504.pth... [2024-06-21 16:49:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000024945_408698880.pth [2024-06-21 16:49:43,629][15349] Signal inference workers to stop experience collection... (6050 times) [2024-06-21 16:49:43,658][15401] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-21 16:49:43,690][15349] Signal inference workers to resume experience collection... (6050 times) [2024-06-21 16:49:43,692][15401] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-21 16:49:45,091][15401] Updated weights for policy 0, policy_version 25560 (0.0035) [2024-06-21 16:49:47,819][15401] Updated weights for policy 0, policy_version 25570 (0.0035) [2024-06-21 16:49:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 418938880. Throughput: 0: 41865.3. Samples: 419087200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 16:49:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 16:49:52,696][15401] Updated weights for policy 0, policy_version 25580 (0.0035) [2024-06-21 16:49:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 419135488. Throughput: 0: 41863.3. Samples: 419219420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:49:53,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 16:49:55,613][15401] Updated weights for policy 0, policy_version 25590 (0.0041) [2024-06-21 16:49:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 419332096. Throughput: 0: 41764.0. Samples: 419467180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:49:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 16:50:00,531][15401] Updated weights for policy 0, policy_version 25600 (0.0035) [2024-06-21 16:50:03,390][15132] Fps is (10 sec: 44246.0, 60 sec: 42052.0, 300 sec: 41931.9). Total num frames: 419577856. Throughput: 0: 41709.9. Samples: 419709100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:50:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 16:50:03,572][15401] Updated weights for policy 0, policy_version 25610 (0.0038) [2024-06-21 16:50:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40962.7, 300 sec: 41709.8). Total num frames: 419725312. Throughput: 0: 41841.4. Samples: 419843620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 16:50:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 16:50:08,780][15401] Updated weights for policy 0, policy_version 25620 (0.0038) [2024-06-21 16:50:11,671][15401] Updated weights for policy 0, policy_version 25630 (0.0035) [2024-06-21 16:50:13,390][15132] Fps is (10 sec: 37684.4, 60 sec: 41507.8, 300 sec: 41765.3). Total num frames: 419954688. Throughput: 0: 41584.4. Samples: 420084800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 16:50:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 16:50:16,663][15401] Updated weights for policy 0, policy_version 25640 (0.0037) [2024-06-21 16:50:18,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 420200448. Throughput: 0: 41487.7. Samples: 420332600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 16:50:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 16:50:19,449][15401] Updated weights for policy 0, policy_version 25650 (0.0039) [2024-06-21 16:50:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 420380672. Throughput: 0: 41712.8. Samples: 420472420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 16:50:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 16:50:24,235][15401] Updated weights for policy 0, policy_version 25660 (0.0034) [2024-06-21 16:50:26,977][15401] Updated weights for policy 0, policy_version 25670 (0.0032) [2024-06-21 16:50:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 420593664. Throughput: 0: 41453.2. Samples: 420712940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:50:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 16:50:31,843][15401] Updated weights for policy 0, policy_version 25680 (0.0032) [2024-06-21 16:50:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41783.7, 300 sec: 41876.4). Total num frames: 420823040. Throughput: 0: 41830.8. Samples: 420969580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:50:33,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 16:50:34,918][15401] Updated weights for policy 0, policy_version 25690 (0.0039) [2024-06-21 16:50:38,390][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 420986880. Throughput: 0: 41656.0. Samples: 421093840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:50:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 16:50:39,643][15401] Updated weights for policy 0, policy_version 25700 (0.0047) [2024-06-21 16:50:42,651][15401] Updated weights for policy 0, policy_version 25710 (0.0035) [2024-06-21 16:50:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 421232640. Throughput: 0: 41651.1. Samples: 421341480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:50:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 16:50:47,562][15401] Updated weights for policy 0, policy_version 25720 (0.0041) [2024-06-21 16:50:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 421429248. Throughput: 0: 41973.1. Samples: 421597880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 16:50:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 16:50:50,529][15401] Updated weights for policy 0, policy_version 25730 (0.0028) [2024-06-21 16:50:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41234.7, 300 sec: 41765.6). Total num frames: 421609472. Throughput: 0: 41680.0. Samples: 421719220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 16:50:53,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 16:50:55,407][15349] Signal inference workers to stop experience collection... (6100 times) [2024-06-21 16:50:55,407][15349] Signal inference workers to resume experience collection... (6100 times) [2024-06-21 16:50:55,421][15401] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-21 16:50:55,421][15401] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-21 16:50:55,563][15401] Updated weights for policy 0, policy_version 25740 (0.0035) [2024-06-21 16:50:58,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 421871616. Throughput: 0: 41795.6. Samples: 421965600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 16:50:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 16:50:58,651][15401] Updated weights for policy 0, policy_version 25750 (0.0035) [2024-06-21 16:51:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40960.2, 300 sec: 41709.8). Total num frames: 422035456. Throughput: 0: 41999.6. Samples: 422222580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:51:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 16:51:03,522][15401] Updated weights for policy 0, policy_version 25760 (0.0053) [2024-06-21 16:51:06,397][15401] Updated weights for policy 0, policy_version 25770 (0.0040) [2024-06-21 16:51:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 422248448. Throughput: 0: 41548.0. Samples: 422342080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:51:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 16:51:11,333][15401] Updated weights for policy 0, policy_version 25780 (0.0037) [2024-06-21 16:51:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 41987.8). Total num frames: 422510592. Throughput: 0: 41744.5. Samples: 422591440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:51:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 16:51:14,209][15401] Updated weights for policy 0, policy_version 25790 (0.0039) [2024-06-21 16:51:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 422658048. Throughput: 0: 41776.9. Samples: 422849540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 16:51:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 16:51:19,110][15401] Updated weights for policy 0, policy_version 25800 (0.0037) [2024-06-21 16:51:22,567][15401] Updated weights for policy 0, policy_version 25810 (0.0040) [2024-06-21 16:51:23,391][15132] Fps is (10 sec: 37678.5, 60 sec: 41778.3, 300 sec: 41876.2). Total num frames: 422887424. Throughput: 0: 41553.0. Samples: 422963780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:51:23,391][15132] Avg episode reward: [(0, '0.759')] [2024-06-21 16:51:27,004][15401] Updated weights for policy 0, policy_version 25820 (0.0038) [2024-06-21 16:51:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 423116800. Throughput: 0: 41808.9. Samples: 423222880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:51:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 16:51:30,339][15401] Updated weights for policy 0, policy_version 25830 (0.0031) [2024-06-21 16:51:33,390][15132] Fps is (10 sec: 40964.9, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 423297024. Throughput: 0: 41596.5. Samples: 423469720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 16:51:33,399][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 16:51:35,132][15401] Updated weights for policy 0, policy_version 25840 (0.0033) [2024-06-21 16:51:38,031][15401] Updated weights for policy 0, policy_version 25850 (0.0036) [2024-06-21 16:51:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 423526400. Throughput: 0: 41554.3. Samples: 423589160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:51:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 16:51:42,879][15401] Updated weights for policy 0, policy_version 25860 (0.0034) [2024-06-21 16:51:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41765.6). Total num frames: 423723008. Throughput: 0: 41638.2. Samples: 423839320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:51:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 16:51:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025862_423723008.pth... [2024-06-21 16:51:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025249_413679616.pth [2024-06-21 16:51:45,973][15401] Updated weights for policy 0, policy_version 25870 (0.0030) [2024-06-21 16:51:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 423903232. Throughput: 0: 41563.1. Samples: 424092920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:51:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 16:51:50,674][15401] Updated weights for policy 0, policy_version 25880 (0.0022) [2024-06-21 16:51:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 424148992. Throughput: 0: 41595.6. Samples: 424213880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 16:51:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 16:51:54,300][15401] Updated weights for policy 0, policy_version 25890 (0.0046) [2024-06-21 16:51:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 424345600. Throughput: 0: 41663.1. Samples: 424466280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:51:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 16:51:58,392][15401] Updated weights for policy 0, policy_version 25900 (0.0024) [2024-06-21 16:52:02,150][15401] Updated weights for policy 0, policy_version 25910 (0.0052) [2024-06-21 16:52:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 424525824. Throughput: 0: 41496.4. Samples: 424716880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:52:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 16:52:06,507][15401] Updated weights for policy 0, policy_version 25920 (0.0022) [2024-06-21 16:52:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 424755200. Throughput: 0: 41612.3. Samples: 424836280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 16:52:08,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-21 16:52:10,072][15401] Updated weights for policy 0, policy_version 25930 (0.0038) [2024-06-21 16:52:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 40685.3, 300 sec: 41709.4). Total num frames: 424951808. Throughput: 0: 41455.5. Samples: 425088480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:52:13,393][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 16:52:14,319][15401] Updated weights for policy 0, policy_version 25940 (0.0028) [2024-06-21 16:52:17,882][15401] Updated weights for policy 0, policy_version 25950 (0.0056) [2024-06-21 16:52:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 41821.2). Total num frames: 425181184. Throughput: 0: 41504.0. Samples: 425337400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:52:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 16:52:21,986][15401] Updated weights for policy 0, policy_version 25960 (0.0040) [2024-06-21 16:52:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41507.0, 300 sec: 41765.3). Total num frames: 425377792. Throughput: 0: 41747.9. Samples: 425467820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:52:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 16:52:25,779][15401] Updated weights for policy 0, policy_version 25970 (0.0034) [2024-06-21 16:52:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 425574400. Throughput: 0: 41687.1. Samples: 425715240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 16:52:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 16:52:29,704][15401] Updated weights for policy 0, policy_version 25980 (0.0025) [2024-06-21 16:52:33,313][15349] Signal inference workers to stop experience collection... (6150 times) [2024-06-21 16:52:33,315][15349] Signal inference workers to resume experience collection... (6150 times) [2024-06-21 16:52:33,332][15401] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-21 16:52:33,332][15401] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-21 16:52:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 425803776. Throughput: 0: 41490.2. Samples: 425960080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:52:33,393][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 16:52:33,463][15401] Updated weights for policy 0, policy_version 25990 (0.0032) [2024-06-21 16:52:37,390][15401] Updated weights for policy 0, policy_version 26000 (0.0030) [2024-06-21 16:52:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 426000384. Throughput: 0: 41600.5. Samples: 426085900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:52:38,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-21 16:52:41,334][15401] Updated weights for policy 0, policy_version 26010 (0.0038) [2024-06-21 16:52:43,389][15132] Fps is (10 sec: 40970.4, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 426213376. Throughput: 0: 41519.3. Samples: 426334640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 16:52:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 16:52:45,680][15401] Updated weights for policy 0, policy_version 26020 (0.0045) [2024-06-21 16:52:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41765.4). Total num frames: 426426368. Throughput: 0: 41633.0. Samples: 426590360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 16:52:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 16:52:49,282][15401] Updated weights for policy 0, policy_version 26030 (0.0033) [2024-06-21 16:52:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 426639360. Throughput: 0: 41664.9. Samples: 426711200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 16:52:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 16:52:53,390][15401] Updated weights for policy 0, policy_version 26040 (0.0033) [2024-06-21 16:52:57,058][15401] Updated weights for policy 0, policy_version 26050 (0.0041) [2024-06-21 16:52:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 426835968. Throughput: 0: 41654.7. Samples: 426962840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 16:52:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 16:53:01,092][15401] Updated weights for policy 0, policy_version 26060 (0.0037) [2024-06-21 16:53:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 427032576. Throughput: 0: 41836.2. Samples: 427220020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 16:53:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 16:53:04,858][15401] Updated weights for policy 0, policy_version 26070 (0.0036) [2024-06-21 16:53:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 427261952. Throughput: 0: 41599.5. Samples: 427339800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:53:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 16:53:08,730][15401] Updated weights for policy 0, policy_version 26080 (0.0038) [2024-06-21 16:53:12,676][15401] Updated weights for policy 0, policy_version 26090 (0.0037) [2024-06-21 16:53:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42054.1, 300 sec: 41654.2). Total num frames: 427474944. Throughput: 0: 41615.6. Samples: 427587940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:53:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 16:53:16,435][15401] Updated weights for policy 0, policy_version 26100 (0.0037) [2024-06-21 16:53:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 427671552. Throughput: 0: 41899.6. Samples: 427845460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 16:53:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 16:53:20,835][15401] Updated weights for policy 0, policy_version 26110 (0.0030) [2024-06-21 16:53:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 427884544. Throughput: 0: 41739.3. Samples: 427964180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:53:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 16:53:24,253][15401] Updated weights for policy 0, policy_version 26120 (0.0030) [2024-06-21 16:53:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 428097536. Throughput: 0: 41882.5. Samples: 428219360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:53:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 16:53:28,551][15401] Updated weights for policy 0, policy_version 26130 (0.0023) [2024-06-21 16:53:32,062][15401] Updated weights for policy 0, policy_version 26140 (0.0054) [2024-06-21 16:53:33,390][15132] Fps is (10 sec: 39322.2, 60 sec: 41234.7, 300 sec: 41654.2). Total num frames: 428277760. Throughput: 0: 41757.6. Samples: 428469460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:53:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 16:53:36,773][15401] Updated weights for policy 0, policy_version 26150 (0.0030) [2024-06-21 16:53:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 428523520. Throughput: 0: 41738.5. Samples: 428589440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 16:53:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 16:53:40,520][15401] Updated weights for policy 0, policy_version 26160 (0.0033) [2024-06-21 16:53:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.0, 300 sec: 41654.2). Total num frames: 428720128. Throughput: 0: 41914.6. Samples: 428849000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 16:53:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-21 16:53:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026167_428720128.pth... [2024-06-21 16:53:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025556_418709504.pth [2024-06-21 16:53:44,299][15401] Updated weights for policy 0, policy_version 26170 (0.0034) [2024-06-21 16:53:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 428916736. Throughput: 0: 41539.4. Samples: 429089300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 16:53:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 16:53:48,862][15401] Updated weights for policy 0, policy_version 26180 (0.0037) [2024-06-21 16:53:52,226][15401] Updated weights for policy 0, policy_version 26190 (0.0044) [2024-06-21 16:53:52,751][15349] Signal inference workers to stop experience collection... (6200 times) [2024-06-21 16:53:52,788][15401] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-21 16:53:52,811][15349] Signal inference workers to resume experience collection... (6200 times) [2024-06-21 16:53:52,811][15401] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-21 16:53:53,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42323.6, 300 sec: 41876.0). Total num frames: 429178880. Throughput: 0: 41690.2. Samples: 429215960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 16:53:53,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 16:53:56,654][15401] Updated weights for policy 0, policy_version 26200 (0.0028) [2024-06-21 16:53:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 429326336. Throughput: 0: 41819.9. Samples: 429469840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 16:53:58,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 16:54:00,037][15401] Updated weights for policy 0, policy_version 26210 (0.0052) [2024-06-21 16:54:03,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42052.3, 300 sec: 41654.8). Total num frames: 429555712. Throughput: 0: 41399.6. Samples: 429708440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 16:54:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 16:54:04,594][15401] Updated weights for policy 0, policy_version 26220 (0.0034) [2024-06-21 16:54:07,839][15401] Updated weights for policy 0, policy_version 26230 (0.0044) [2024-06-21 16:54:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.2, 300 sec: 41765.6). Total num frames: 429785088. Throughput: 0: 41762.8. Samples: 429843500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 16:54:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 16:54:12,462][15401] Updated weights for policy 0, policy_version 26240 (0.0030) [2024-06-21 16:54:13,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 429932544. Throughput: 0: 41560.1. Samples: 430089560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 16:54:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 16:54:15,960][15401] Updated weights for policy 0, policy_version 26250 (0.0040) [2024-06-21 16:54:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 430194688. Throughput: 0: 41320.5. Samples: 430328880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:54:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 16:54:20,039][15401] Updated weights for policy 0, policy_version 26260 (0.0029) [2024-06-21 16:54:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 430358528. Throughput: 0: 41694.2. Samples: 430465680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:54:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 16:54:23,793][15401] Updated weights for policy 0, policy_version 26270 (0.0036) [2024-06-21 16:54:27,676][15401] Updated weights for policy 0, policy_version 26280 (0.0024) [2024-06-21 16:54:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.1, 300 sec: 41544.1). Total num frames: 430571520. Throughput: 0: 41358.8. Samples: 430710140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 16:54:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 16:54:31,635][15401] Updated weights for policy 0, policy_version 26290 (0.0029) [2024-06-21 16:54:33,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 41765.3). Total num frames: 430833664. Throughput: 0: 41453.9. Samples: 430954720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:54:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 16:54:35,270][15401] Updated weights for policy 0, policy_version 26300 (0.0036) [2024-06-21 16:54:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 430981120. Throughput: 0: 41615.5. Samples: 431088560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:54:38,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-21 16:54:39,512][15401] Updated weights for policy 0, policy_version 26310 (0.0044) [2024-06-21 16:54:43,039][15401] Updated weights for policy 0, policy_version 26320 (0.0026) [2024-06-21 16:54:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 431226880. Throughput: 0: 41422.7. Samples: 431333860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:54:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 16:54:47,486][15401] Updated weights for policy 0, policy_version 26330 (0.0043) [2024-06-21 16:54:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 41710.1). Total num frames: 431439872. Throughput: 0: 41507.9. Samples: 431576300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 16:54:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 16:54:52,029][15401] Updated weights for policy 0, policy_version 26340 (0.0045) [2024-06-21 16:54:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40415.4, 300 sec: 41598.7). Total num frames: 431603712. Throughput: 0: 41344.4. Samples: 431704000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:54:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 16:54:55,430][15401] Updated weights for policy 0, policy_version 26350 (0.0030) [2024-06-21 16:54:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 431833088. Throughput: 0: 41269.2. Samples: 431946680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:54:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 16:54:59,837][15401] Updated weights for policy 0, policy_version 26360 (0.0045) [2024-06-21 16:55:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 432029696. Throughput: 0: 41556.9. Samples: 432198940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 16:55:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 16:55:03,503][15401] Updated weights for policy 0, policy_version 26370 (0.0027) [2024-06-21 16:55:07,669][15401] Updated weights for policy 0, policy_version 26380 (0.0050) [2024-06-21 16:55:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41654.2). Total num frames: 432242688. Throughput: 0: 41165.5. Samples: 432318120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:55:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 16:55:11,182][15401] Updated weights for policy 0, policy_version 26390 (0.0032) [2024-06-21 16:55:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.7, 300 sec: 41598.4). Total num frames: 432472064. Throughput: 0: 41372.1. Samples: 432571980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:55:13,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 16:55:15,385][15401] Updated weights for policy 0, policy_version 26400 (0.0036) [2024-06-21 16:55:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 432652288. Throughput: 0: 41667.9. Samples: 432829780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:55:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 16:55:18,593][15349] Signal inference workers to stop experience collection... (6250 times) [2024-06-21 16:55:18,594][15349] Signal inference workers to resume experience collection... (6250 times) [2024-06-21 16:55:18,617][15401] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-21 16:55:18,617][15401] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-21 16:55:18,917][15401] Updated weights for policy 0, policy_version 26410 (0.0052) [2024-06-21 16:55:23,335][15401] Updated weights for policy 0, policy_version 26420 (0.0050) [2024-06-21 16:55:23,390][15132] Fps is (10 sec: 39330.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 432865280. Throughput: 0: 41233.4. Samples: 432944060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 16:55:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 16:55:27,083][15401] Updated weights for policy 0, policy_version 26430 (0.0042) [2024-06-21 16:55:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 433078272. Throughput: 0: 41371.6. Samples: 433195580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:55:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 16:55:31,379][15401] Updated weights for policy 0, policy_version 26440 (0.0032) [2024-06-21 16:55:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 40686.9, 300 sec: 41654.3). Total num frames: 433274880. Throughput: 0: 41685.9. Samples: 433452160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:55:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 16:55:34,929][15401] Updated weights for policy 0, policy_version 26450 (0.0033) [2024-06-21 16:55:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41777.6, 300 sec: 41542.8). Total num frames: 433487872. Throughput: 0: 41540.5. Samples: 433573420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:55:38,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 16:55:39,067][15401] Updated weights for policy 0, policy_version 26460 (0.0034) [2024-06-21 16:55:42,678][15401] Updated weights for policy 0, policy_version 26470 (0.0040) [2024-06-21 16:55:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 433700864. Throughput: 0: 41838.4. Samples: 433829400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 16:55:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 16:55:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026471_433700864.pth... [2024-06-21 16:55:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000025862_423723008.pth [2024-06-21 16:55:47,110][15401] Updated weights for policy 0, policy_version 26480 (0.0040) [2024-06-21 16:55:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 433897472. Throughput: 0: 41691.5. Samples: 434075060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:55:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 16:55:50,785][15401] Updated weights for policy 0, policy_version 26490 (0.0031) [2024-06-21 16:55:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 434126848. Throughput: 0: 41743.0. Samples: 434196560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:55:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 16:55:54,691][15401] Updated weights for policy 0, policy_version 26500 (0.0045) [2024-06-21 16:55:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 434307072. Throughput: 0: 41703.9. Samples: 434448560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:55:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 16:55:58,564][15401] Updated weights for policy 0, policy_version 26510 (0.0033) [2024-06-21 16:56:02,247][15401] Updated weights for policy 0, policy_version 26520 (0.0031) [2024-06-21 16:56:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 434520064. Throughput: 0: 41525.4. Samples: 434698420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:56:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 16:56:06,429][15401] Updated weights for policy 0, policy_version 26530 (0.0035) [2024-06-21 16:56:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 434749440. Throughput: 0: 41769.7. Samples: 434823700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:56:08,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 16:56:09,886][15401] Updated weights for policy 0, policy_version 26540 (0.0033) [2024-06-21 16:56:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40961.7, 300 sec: 41598.7). Total num frames: 434929664. Throughput: 0: 41762.7. Samples: 435074900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:56:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-21 16:56:14,327][15401] Updated weights for policy 0, policy_version 26550 (0.0041) [2024-06-21 16:56:17,813][15401] Updated weights for policy 0, policy_version 26560 (0.0030) [2024-06-21 16:56:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41598.9). Total num frames: 435159040. Throughput: 0: 41460.9. Samples: 435317900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 16:56:18,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 16:56:22,210][15401] Updated weights for policy 0, policy_version 26570 (0.0026) [2024-06-21 16:56:23,393][15132] Fps is (10 sec: 42581.4, 60 sec: 41503.4, 300 sec: 41487.1). Total num frames: 435355648. Throughput: 0: 41715.0. Samples: 435450660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:56:23,394][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 16:56:25,455][15401] Updated weights for policy 0, policy_version 26580 (0.0039) [2024-06-21 16:56:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 435568640. Throughput: 0: 41527.4. Samples: 435698140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:56:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 16:56:30,180][15401] Updated weights for policy 0, policy_version 26590 (0.0026) [2024-06-21 16:56:33,244][15401] Updated weights for policy 0, policy_version 26600 (0.0035) [2024-06-21 16:56:33,390][15132] Fps is (10 sec: 45893.2, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 435814400. Throughput: 0: 41500.9. Samples: 435942600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 16:56:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 16:56:37,902][15401] Updated weights for policy 0, policy_version 26610 (0.0035) [2024-06-21 16:56:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41507.8, 300 sec: 41543.2). Total num frames: 435978240. Throughput: 0: 41820.0. Samples: 436078460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-21 16:56:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 16:56:39,290][15349] Signal inference workers to stop experience collection... (6300 times) [2024-06-21 16:56:39,339][15349] Signal inference workers to resume experience collection... (6300 times) [2024-06-21 16:56:39,340][15401] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-21 16:56:39,352][15401] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-21 16:56:41,169][15401] Updated weights for policy 0, policy_version 26620 (0.0031) [2024-06-21 16:56:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 436191232. Throughput: 0: 41579.6. Samples: 436319640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-21 16:56:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 16:56:45,692][15401] Updated weights for policy 0, policy_version 26630 (0.0036) [2024-06-21 16:56:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 436420608. Throughput: 0: 41568.0. Samples: 436568980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-21 16:56:48,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 16:56:49,190][15401] Updated weights for policy 0, policy_version 26640 (0.0035) [2024-06-21 16:56:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41231.4, 300 sec: 41542.8). Total num frames: 436600832. Throughput: 0: 41638.7. Samples: 436697540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-21 16:56:53,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 16:56:53,632][15401] Updated weights for policy 0, policy_version 26650 (0.0046) [2024-06-21 16:56:57,135][15401] Updated weights for policy 0, policy_version 26660 (0.0034) [2024-06-21 16:56:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 436813824. Throughput: 0: 41491.9. Samples: 436942040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 16:56:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 16:57:01,427][15401] Updated weights for policy 0, policy_version 26670 (0.0032) [2024-06-21 16:57:03,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 437059584. Throughput: 0: 41723.9. Samples: 437195480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 16:57:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 16:57:04,891][15401] Updated weights for policy 0, policy_version 26680 (0.0034) [2024-06-21 16:57:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41504.5, 300 sec: 41654.2). Total num frames: 437239808. Throughput: 0: 41706.3. Samples: 437327380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-21 16:57:08,393][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 16:57:09,161][15401] Updated weights for policy 0, policy_version 26690 (0.0042) [2024-06-21 16:57:12,774][15401] Updated weights for policy 0, policy_version 26700 (0.0025) [2024-06-21 16:57:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41654.3). Total num frames: 437469184. Throughput: 0: 41531.7. Samples: 437567060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 16:57:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 16:57:17,467][15401] Updated weights for policy 0, policy_version 26710 (0.0031) [2024-06-21 16:57:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 437649408. Throughput: 0: 41791.6. Samples: 437823220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 16:57:18,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-21 16:57:20,680][15401] Updated weights for policy 0, policy_version 26720 (0.0046) [2024-06-21 16:57:23,389][15132] Fps is (10 sec: 36044.8, 60 sec: 41235.8, 300 sec: 41543.2). Total num frames: 437829632. Throughput: 0: 41358.7. Samples: 437939600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 16:57:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 16:57:25,210][15401] Updated weights for policy 0, policy_version 26730 (0.0027) [2024-06-21 16:57:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.3, 300 sec: 41654.6). Total num frames: 438091776. Throughput: 0: 41683.9. Samples: 438195420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 16:57:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 16:57:28,597][15401] Updated weights for policy 0, policy_version 26740 (0.0035) [2024-06-21 16:57:33,041][15401] Updated weights for policy 0, policy_version 26750 (0.0032) [2024-06-21 16:57:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 438288384. Throughput: 0: 41799.0. Samples: 438449940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-21 16:57:33,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-21 16:57:36,244][15401] Updated weights for policy 0, policy_version 26760 (0.0037) [2024-06-21 16:57:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 438484992. Throughput: 0: 41647.6. Samples: 438571580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-21 16:57:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 16:57:40,943][15401] Updated weights for policy 0, policy_version 26770 (0.0034) [2024-06-21 16:57:43,187][15349] Signal inference workers to stop experience collection... (6350 times) [2024-06-21 16:57:43,216][15401] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-21 16:57:43,247][15349] Signal inference workers to resume experience collection... (6350 times) [2024-06-21 16:57:43,247][15401] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-21 16:57:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 438714368. Throughput: 0: 41829.9. Samples: 438824380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-21 16:57:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 16:57:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026778_438730752.pth... [2024-06-21 16:57:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026167_428720128.pth [2024-06-21 16:57:44,201][15401] Updated weights for policy 0, policy_version 26780 (0.0033) [2024-06-21 16:57:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 438878208. Throughput: 0: 41886.2. Samples: 439080360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-21 16:57:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 16:57:49,026][15401] Updated weights for policy 0, policy_version 26790 (0.0033) [2024-06-21 16:57:51,909][15401] Updated weights for policy 0, policy_version 26800 (0.0043) [2024-06-21 16:57:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 41709.8). Total num frames: 439140352. Throughput: 0: 41484.5. Samples: 439194080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:57:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 16:57:56,963][15401] Updated weights for policy 0, policy_version 26810 (0.0040) [2024-06-21 16:57:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 439320576. Throughput: 0: 41845.4. Samples: 439450100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:57:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 16:57:59,627][15401] Updated weights for policy 0, policy_version 26820 (0.0049) [2024-06-21 16:58:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 439517184. Throughput: 0: 41662.5. Samples: 439698040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 16:58:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 16:58:04,908][15401] Updated weights for policy 0, policy_version 26830 (0.0037) [2024-06-21 16:58:07,643][15401] Updated weights for policy 0, policy_version 26840 (0.0032) [2024-06-21 16:58:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41780.9, 300 sec: 41598.7). Total num frames: 439746560. Throughput: 0: 41832.0. Samples: 439822040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:58:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 16:58:12,757][15401] Updated weights for policy 0, policy_version 26850 (0.0031) [2024-06-21 16:58:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 439926784. Throughput: 0: 41722.3. Samples: 440072920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:58:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 16:58:15,667][15401] Updated weights for policy 0, policy_version 26860 (0.0051) [2024-06-21 16:58:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 41598.4). Total num frames: 440156160. Throughput: 0: 41542.7. Samples: 440319460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:58:18,392][15132] Avg episode reward: [(0, '0.831')] [2024-06-21 16:58:20,887][15401] Updated weights for policy 0, policy_version 26870 (0.0038) [2024-06-21 16:58:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 41598.7). Total num frames: 440369152. Throughput: 0: 41708.8. Samples: 440448480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 16:58:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 16:58:23,913][15401] Updated weights for policy 0, policy_version 26880 (0.0036) [2024-06-21 16:58:28,390][15132] Fps is (10 sec: 39330.6, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 440549376. Throughput: 0: 41464.3. Samples: 440690280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:58:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 16:58:28,487][15401] Updated weights for policy 0, policy_version 26890 (0.0048) [2024-06-21 16:58:32,064][15401] Updated weights for policy 0, policy_version 26900 (0.0046) [2024-06-21 16:58:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 440778752. Throughput: 0: 41362.4. Samples: 440941660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:58:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 16:58:36,557][15401] Updated weights for policy 0, policy_version 26910 (0.0033) [2024-06-21 16:58:38,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.5, 300 sec: 41598.4). Total num frames: 440991744. Throughput: 0: 41603.5. Samples: 441066340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 16:58:38,393][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 16:58:39,928][15401] Updated weights for policy 0, policy_version 26920 (0.0031) [2024-06-21 16:58:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41231.4, 300 sec: 41598.4). Total num frames: 441188352. Throughput: 0: 41453.7. Samples: 441315620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 16:58:43,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 16:58:44,114][15401] Updated weights for policy 0, policy_version 26930 (0.0037) [2024-06-21 16:58:47,779][15401] Updated weights for policy 0, policy_version 26940 (0.0032) [2024-06-21 16:58:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 41432.4). Total num frames: 441401344. Throughput: 0: 41427.6. Samples: 441562280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 16:58:48,394][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 16:58:51,782][15401] Updated weights for policy 0, policy_version 26950 (0.0043) [2024-06-21 16:58:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 441597952. Throughput: 0: 41581.7. Samples: 441693220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 16:58:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 16:58:55,436][15401] Updated weights for policy 0, policy_version 26960 (0.0039) [2024-06-21 16:58:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.4, 300 sec: 41542.8). Total num frames: 441810944. Throughput: 0: 41573.8. Samples: 441943840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 16:58:58,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 16:58:59,426][15401] Updated weights for policy 0, policy_version 26970 (0.0025) [2024-06-21 16:59:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 442040320. Throughput: 0: 41587.0. Samples: 442190780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 16:59:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 16:59:03,395][15401] Updated weights for policy 0, policy_version 26980 (0.0041) [2024-06-21 16:59:07,179][15349] Signal inference workers to stop experience collection... (6400 times) [2024-06-21 16:59:07,183][15349] Signal inference workers to resume experience collection... (6400 times) [2024-06-21 16:59:07,216][15401] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-21 16:59:07,216][15401] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-21 16:59:07,488][15401] Updated weights for policy 0, policy_version 26990 (0.0028) [2024-06-21 16:59:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 442220544. Throughput: 0: 41465.0. Samples: 442314400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 16:59:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 16:59:11,222][15401] Updated weights for policy 0, policy_version 27000 (0.0030) [2024-06-21 16:59:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 442449920. Throughput: 0: 41686.7. Samples: 442566180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 16:59:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 16:59:15,351][15401] Updated weights for policy 0, policy_version 27010 (0.0041) [2024-06-21 16:59:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 442662912. Throughput: 0: 41654.6. Samples: 442816120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 16:59:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 16:59:19,082][15401] Updated weights for policy 0, policy_version 27020 (0.0033) [2024-06-21 16:59:22,918][15401] Updated weights for policy 0, policy_version 27030 (0.0040) [2024-06-21 16:59:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 442859520. Throughput: 0: 41621.8. Samples: 442939220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:59:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 16:59:27,558][15401] Updated weights for policy 0, policy_version 27040 (0.0038) [2024-06-21 16:59:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 443072512. Throughput: 0: 41591.1. Samples: 443187120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:59:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 16:59:31,292][15401] Updated weights for policy 0, policy_version 27050 (0.0040) [2024-06-21 16:59:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 443269120. Throughput: 0: 41659.9. Samples: 443436980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 16:59:33,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 16:59:35,092][15401] Updated weights for policy 0, policy_version 27060 (0.0031) [2024-06-21 16:59:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41507.8, 300 sec: 41543.2). Total num frames: 443482112. Throughput: 0: 41550.8. Samples: 443563000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:59:38,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-21 16:59:39,010][15401] Updated weights for policy 0, policy_version 27070 (0.0033) [2024-06-21 16:59:42,643][15401] Updated weights for policy 0, policy_version 27080 (0.0044) [2024-06-21 16:59:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41507.8, 300 sec: 41487.6). Total num frames: 443678720. Throughput: 0: 41592.0. Samples: 443815380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:59:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 16:59:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027081_443695104.pth... [2024-06-21 16:59:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026471_433700864.pth [2024-06-21 16:59:46,614][15401] Updated weights for policy 0, policy_version 27090 (0.0035) [2024-06-21 16:59:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 443891712. Throughput: 0: 41549.3. Samples: 444060500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:59:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 16:59:50,407][15401] Updated weights for policy 0, policy_version 27100 (0.0030) [2024-06-21 16:59:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 444104704. Throughput: 0: 41675.9. Samples: 444189820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 16:59:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 16:59:54,289][15401] Updated weights for policy 0, policy_version 27110 (0.0039) [2024-06-21 16:59:58,277][15401] Updated weights for policy 0, policy_version 27120 (0.0034) [2024-06-21 16:59:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42052.3, 300 sec: 41709.4). Total num frames: 444334080. Throughput: 0: 41723.2. Samples: 444443820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 16:59:58,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 17:00:02,194][15401] Updated weights for policy 0, policy_version 27130 (0.0041) [2024-06-21 17:00:03,391][15132] Fps is (10 sec: 44231.5, 60 sec: 41778.4, 300 sec: 41709.6). Total num frames: 444547072. Throughput: 0: 41625.5. Samples: 444689320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:00:03,391][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 17:00:06,099][15401] Updated weights for policy 0, policy_version 27140 (0.0046) [2024-06-21 17:00:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 41599.0). Total num frames: 444743680. Throughput: 0: 41745.0. Samples: 444817740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:00:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 17:00:10,011][15401] Updated weights for policy 0, policy_version 27150 (0.0043) [2024-06-21 17:00:13,389][15132] Fps is (10 sec: 39326.8, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 444940288. Throughput: 0: 41786.7. Samples: 445067520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 17:00:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 17:00:14,419][15401] Updated weights for policy 0, policy_version 27160 (0.0025) [2024-06-21 17:00:18,017][15401] Updated weights for policy 0, policy_version 27170 (0.0046) [2024-06-21 17:00:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 445153280. Throughput: 0: 41659.3. Samples: 445311640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 17:00:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 17:00:22,347][15401] Updated weights for policy 0, policy_version 27180 (0.0030) [2024-06-21 17:00:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 445366272. Throughput: 0: 41727.1. Samples: 445440720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 17:00:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 17:00:25,730][15401] Updated weights for policy 0, policy_version 27190 (0.0034) [2024-06-21 17:00:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 445562880. Throughput: 0: 41551.0. Samples: 445685180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 17:00:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 17:00:30,477][15401] Updated weights for policy 0, policy_version 27200 (0.0037) [2024-06-21 17:00:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41710.1). Total num frames: 445792256. Throughput: 0: 41704.1. Samples: 445937180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 17:00:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 17:00:33,448][15401] Updated weights for policy 0, policy_version 27210 (0.0035) [2024-06-21 17:00:38,217][15401] Updated weights for policy 0, policy_version 27220 (0.0042) [2024-06-21 17:00:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 445972480. Throughput: 0: 41681.9. Samples: 446065500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 17:00:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 17:00:41,257][15401] Updated weights for policy 0, policy_version 27230 (0.0037) [2024-06-21 17:00:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 446185472. Throughput: 0: 41461.3. Samples: 446309480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 17:00:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 17:00:46,198][15349] Signal inference workers to stop experience collection... (6450 times) [2024-06-21 17:00:46,251][15401] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-21 17:00:46,254][15349] Signal inference workers to resume experience collection... (6450 times) [2024-06-21 17:00:46,267][15401] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-21 17:00:46,270][15401] Updated weights for policy 0, policy_version 27240 (0.0042) [2024-06-21 17:00:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 446398464. Throughput: 0: 41680.0. Samples: 446564860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-21 17:00:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 17:00:49,110][15401] Updated weights for policy 0, policy_version 27250 (0.0032) [2024-06-21 17:00:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 446595072. Throughput: 0: 41605.3. Samples: 446689980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:00:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 17:00:53,965][15401] Updated weights for policy 0, policy_version 27260 (0.0024) [2024-06-21 17:00:56,960][15401] Updated weights for policy 0, policy_version 27270 (0.0037) [2024-06-21 17:00:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41507.8, 300 sec: 41709.8). Total num frames: 446824448. Throughput: 0: 41368.9. Samples: 446929120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:00:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 17:01:01,663][15401] Updated weights for policy 0, policy_version 27280 (0.0037) [2024-06-21 17:01:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.9, 300 sec: 41543.2). Total num frames: 447004672. Throughput: 0: 41670.2. Samples: 447186800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:01:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 17:01:05,205][15401] Updated weights for policy 0, policy_version 27290 (0.0028) [2024-06-21 17:01:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 447217664. Throughput: 0: 41434.3. Samples: 447305260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:01:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 17:01:09,941][15401] Updated weights for policy 0, policy_version 27300 (0.0039) [2024-06-21 17:01:13,171][15401] Updated weights for policy 0, policy_version 27310 (0.0038) [2024-06-21 17:01:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 447463424. Throughput: 0: 41569.4. Samples: 447555800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:01:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 17:01:17,691][15401] Updated weights for policy 0, policy_version 27320 (0.0040) [2024-06-21 17:01:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41599.3). Total num frames: 447627264. Throughput: 0: 41713.3. Samples: 447814280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:01:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-21 17:01:20,927][15401] Updated weights for policy 0, policy_version 27330 (0.0043) [2024-06-21 17:01:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 447856640. Throughput: 0: 41414.5. Samples: 447929160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:01:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 17:01:25,575][15401] Updated weights for policy 0, policy_version 27340 (0.0035) [2024-06-21 17:01:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 448069632. Throughput: 0: 41756.4. Samples: 448188520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:01:28,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 17:01:28,724][15401] Updated weights for policy 0, policy_version 27350 (0.0045) [2024-06-21 17:01:33,273][15401] Updated weights for policy 0, policy_version 27360 (0.0039) [2024-06-21 17:01:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 448266240. Throughput: 0: 41609.6. Samples: 448437300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:01:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 17:01:36,480][15401] Updated weights for policy 0, policy_version 27370 (0.0037) [2024-06-21 17:01:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 448446464. Throughput: 0: 41538.6. Samples: 448559220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:01:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 17:01:41,056][15401] Updated weights for policy 0, policy_version 27380 (0.0040) [2024-06-21 17:01:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 448675840. Throughput: 0: 41681.7. Samples: 448804800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:01:43,392][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 17:01:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027385_448675840.pth... [2024-06-21 17:01:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000026778_438730752.pth [2024-06-21 17:01:44,624][15401] Updated weights for policy 0, policy_version 27390 (0.0043) [2024-06-21 17:01:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 448888832. Throughput: 0: 41426.2. Samples: 449050980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:01:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 17:01:48,895][15401] Updated weights for policy 0, policy_version 27400 (0.0044) [2024-06-21 17:01:52,748][15401] Updated weights for policy 0, policy_version 27410 (0.0023) [2024-06-21 17:01:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 449085440. Throughput: 0: 41654.6. Samples: 449179720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:01:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 17:01:56,785][15401] Updated weights for policy 0, policy_version 27420 (0.0040) [2024-06-21 17:01:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 449298432. Throughput: 0: 41512.8. Samples: 449423880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:01:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 17:02:00,655][15401] Updated weights for policy 0, policy_version 27430 (0.0034) [2024-06-21 17:02:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41599.0). Total num frames: 449511424. Throughput: 0: 41313.3. Samples: 449673380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:02:03,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 17:02:04,604][15401] Updated weights for policy 0, policy_version 27440 (0.0053) [2024-06-21 17:02:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 449708032. Throughput: 0: 41480.6. Samples: 449795780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:02:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 17:02:09,123][15401] Updated weights for policy 0, policy_version 27450 (0.0032) [2024-06-21 17:02:12,535][15401] Updated weights for policy 0, policy_version 27460 (0.0035) [2024-06-21 17:02:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 449921024. Throughput: 0: 41264.5. Samples: 450045420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:02:13,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 17:02:17,107][15401] Updated weights for policy 0, policy_version 27470 (0.0035) [2024-06-21 17:02:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 450134016. Throughput: 0: 41223.2. Samples: 450292340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:02:18,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 17:02:20,462][15401] Updated weights for policy 0, policy_version 27480 (0.0032) [2024-06-21 17:02:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 450330624. Throughput: 0: 41296.9. Samples: 450417580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 27.0) [2024-06-21 17:02:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 17:02:25,162][15401] Updated weights for policy 0, policy_version 27490 (0.0037) [2024-06-21 17:02:28,335][15401] Updated weights for policy 0, policy_version 27500 (0.0039) [2024-06-21 17:02:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 450560000. Throughput: 0: 41317.4. Samples: 450664080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 27.0) [2024-06-21 17:02:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 17:02:32,677][15349] Signal inference workers to stop experience collection... (6500 times) [2024-06-21 17:02:32,699][15401] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-21 17:02:32,787][15349] Signal inference workers to resume experience collection... (6500 times) [2024-06-21 17:02:32,788][15401] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-21 17:02:32,923][15401] Updated weights for policy 0, policy_version 27510 (0.0034) [2024-06-21 17:02:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 450740224. Throughput: 0: 41471.4. Samples: 450917200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 27.0) [2024-06-21 17:02:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 17:02:36,143][15401] Updated weights for policy 0, policy_version 27520 (0.0035) [2024-06-21 17:02:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41543.1). Total num frames: 450969600. Throughput: 0: 41290.6. Samples: 451037800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 27.0) [2024-06-21 17:02:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 17:02:40,638][15401] Updated weights for policy 0, policy_version 27530 (0.0047) [2024-06-21 17:02:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 451182592. Throughput: 0: 41411.8. Samples: 451287400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 17:02:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 17:02:44,047][15401] Updated weights for policy 0, policy_version 27540 (0.0029) [2024-06-21 17:02:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 451362816. Throughput: 0: 41538.7. Samples: 451542620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 17:02:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 17:02:48,430][15401] Updated weights for policy 0, policy_version 27550 (0.0049) [2024-06-21 17:02:51,984][15401] Updated weights for policy 0, policy_version 27560 (0.0035) [2024-06-21 17:02:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 451592192. Throughput: 0: 41497.3. Samples: 451663160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 17:02:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 17:02:56,253][15401] Updated weights for policy 0, policy_version 27570 (0.0041) [2024-06-21 17:02:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 451805184. Throughput: 0: 41539.4. Samples: 451914700. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-21 17:02:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 17:02:59,846][15401] Updated weights for policy 0, policy_version 27580 (0.0042) [2024-06-21 17:03:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 452001792. Throughput: 0: 41805.2. Samples: 452173580. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-21 17:03:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 17:03:03,948][15401] Updated weights for policy 0, policy_version 27590 (0.0032) [2024-06-21 17:03:07,646][15401] Updated weights for policy 0, policy_version 27600 (0.0038) [2024-06-21 17:03:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 41777.6, 300 sec: 41653.9). Total num frames: 452214784. Throughput: 0: 41642.8. Samples: 452291600. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-21 17:03:08,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 17:03:11,815][15401] Updated weights for policy 0, policy_version 27610 (0.0036) [2024-06-21 17:03:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41654.6). Total num frames: 452444160. Throughput: 0: 41688.4. Samples: 452540060. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-21 17:03:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 17:03:15,757][15401] Updated weights for policy 0, policy_version 27620 (0.0036) [2024-06-21 17:03:18,392][15132] Fps is (10 sec: 40959.7, 60 sec: 41504.4, 300 sec: 41542.8). Total num frames: 452624384. Throughput: 0: 41778.7. Samples: 452797340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 17:03:18,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 17:03:19,541][15401] Updated weights for policy 0, policy_version 27630 (0.0033) [2024-06-21 17:03:23,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 452820992. Throughput: 0: 41807.2. Samples: 452919120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 17:03:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 17:03:23,884][15401] Updated weights for policy 0, policy_version 27640 (0.0044) [2024-06-21 17:03:27,117][15401] Updated weights for policy 0, policy_version 27650 (0.0040) [2024-06-21 17:03:28,389][15132] Fps is (10 sec: 44247.3, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 453066752. Throughput: 0: 41874.6. Samples: 453171760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 17:03:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 17:03:31,583][15401] Updated weights for policy 0, policy_version 27660 (0.0031) [2024-06-21 17:03:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41599.0). Total num frames: 453263360. Throughput: 0: 41734.1. Samples: 453420660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 17:03:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 17:03:34,826][15401] Updated weights for policy 0, policy_version 27670 (0.0034) [2024-06-21 17:03:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41543.5). Total num frames: 453443584. Throughput: 0: 41830.3. Samples: 453545520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:03:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 17:03:39,410][15401] Updated weights for policy 0, policy_version 27680 (0.0042) [2024-06-21 17:03:42,810][15401] Updated weights for policy 0, policy_version 27690 (0.0037) [2024-06-21 17:03:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 453689344. Throughput: 0: 41981.0. Samples: 453803840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:03:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 17:03:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027692_453705728.pth... [2024-06-21 17:03:43,451][15349] Signal inference workers to stop experience collection... (6550 times) [2024-06-21 17:03:43,451][15349] Signal inference workers to resume experience collection... (6550 times) [2024-06-21 17:03:43,472][15401] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-21 17:03:43,472][15401] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-21 17:03:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027081_443695104.pth [2024-06-21 17:03:47,297][15401] Updated weights for policy 0, policy_version 27700 (0.0029) [2024-06-21 17:03:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 453885952. Throughput: 0: 41809.0. Samples: 454054980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:03:48,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 17:03:50,460][15401] Updated weights for policy 0, policy_version 27710 (0.0037) [2024-06-21 17:03:53,390][15132] Fps is (10 sec: 40957.7, 60 sec: 41778.9, 300 sec: 41654.5). Total num frames: 454098944. Throughput: 0: 41856.8. Samples: 454175080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:03:53,391][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 17:03:54,831][15401] Updated weights for policy 0, policy_version 27720 (0.0039) [2024-06-21 17:03:58,193][15401] Updated weights for policy 0, policy_version 27730 (0.0043) [2024-06-21 17:03:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42050.5, 300 sec: 41653.9). Total num frames: 454328320. Throughput: 0: 42178.5. Samples: 454438200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:03:58,393][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 17:04:02,889][15401] Updated weights for policy 0, policy_version 27740 (0.0051) [2024-06-21 17:04:03,392][15132] Fps is (10 sec: 42589.9, 60 sec: 42050.6, 300 sec: 41709.4). Total num frames: 454524928. Throughput: 0: 41925.3. Samples: 454683980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:04:03,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 17:04:06,128][15401] Updated weights for policy 0, policy_version 27750 (0.0035) [2024-06-21 17:04:08,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42053.9, 300 sec: 41654.3). Total num frames: 454737920. Throughput: 0: 41917.8. Samples: 454805420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:04:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 17:04:10,519][15401] Updated weights for policy 0, policy_version 27760 (0.0033) [2024-06-21 17:04:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 454934528. Throughput: 0: 42005.4. Samples: 455062000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:04:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:04:14,001][15401] Updated weights for policy 0, policy_version 27770 (0.0037) [2024-06-21 17:04:18,289][15401] Updated weights for policy 0, policy_version 27780 (0.0032) [2024-06-21 17:04:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42052.2, 300 sec: 41653.9). Total num frames: 455147520. Throughput: 0: 41970.7. Samples: 455309440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:04:18,393][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 17:04:22,070][15401] Updated weights for policy 0, policy_version 27790 (0.0039) [2024-06-21 17:04:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 455360512. Throughput: 0: 41978.7. Samples: 455434560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:04:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 17:04:26,079][15401] Updated weights for policy 0, policy_version 27800 (0.0042) [2024-06-21 17:04:28,390][15132] Fps is (10 sec: 40970.0, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 455557120. Throughput: 0: 41645.3. Samples: 455677880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:04:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 17:04:30,459][15401] Updated weights for policy 0, policy_version 27810 (0.0029) [2024-06-21 17:04:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 455753728. Throughput: 0: 41639.5. Samples: 455928760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 17:04:33,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-21 17:04:34,269][15401] Updated weights for policy 0, policy_version 27820 (0.0043) [2024-06-21 17:04:38,338][15401] Updated weights for policy 0, policy_version 27830 (0.0037) [2024-06-21 17:04:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 455966720. Throughput: 0: 41616.9. Samples: 456047820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 17:04:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 17:04:42,155][15401] Updated weights for policy 0, policy_version 27840 (0.0031) [2024-06-21 17:04:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 456196096. Throughput: 0: 41443.1. Samples: 456303040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 17:04:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 17:04:46,259][15401] Updated weights for policy 0, policy_version 27850 (0.0031) [2024-06-21 17:04:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41504.5, 300 sec: 41598.4). Total num frames: 456376320. Throughput: 0: 41514.3. Samples: 456552120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 17:04:48,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 17:04:50,120][15401] Updated weights for policy 0, policy_version 27860 (0.0025) [2024-06-21 17:04:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.4, 300 sec: 41543.5). Total num frames: 456589312. Throughput: 0: 41439.9. Samples: 456670220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:04:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 17:04:54,121][15401] Updated weights for policy 0, policy_version 27870 (0.0038) [2024-06-21 17:04:58,123][15401] Updated weights for policy 0, policy_version 27880 (0.0028) [2024-06-21 17:04:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 40961.8, 300 sec: 41487.8). Total num frames: 456785920. Throughput: 0: 41221.0. Samples: 456916940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:04:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 17:05:02,112][15401] Updated weights for policy 0, policy_version 27890 (0.0038) [2024-06-21 17:05:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41234.8, 300 sec: 41543.2). Total num frames: 456998912. Throughput: 0: 41167.6. Samples: 457161880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:05:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 17:05:06,096][15349] Signal inference workers to stop experience collection... (6600 times) [2024-06-21 17:05:06,096][15349] Signal inference workers to resume experience collection... (6600 times) [2024-06-21 17:05:06,137][15401] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-21 17:05:06,137][15401] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-21 17:05:06,254][15401] Updated weights for policy 0, policy_version 27900 (0.0035) [2024-06-21 17:05:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 457211904. Throughput: 0: 41191.2. Samples: 457288160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:05:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 17:05:10,044][15401] Updated weights for policy 0, policy_version 27910 (0.0047) [2024-06-21 17:05:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 457408512. Throughput: 0: 41122.1. Samples: 457528380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:05:13,394][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 17:05:14,112][15401] Updated weights for policy 0, policy_version 27920 (0.0038) [2024-06-21 17:05:17,907][15401] Updated weights for policy 0, policy_version 27930 (0.0037) [2024-06-21 17:05:18,390][15132] Fps is (10 sec: 39320.7, 60 sec: 40961.6, 300 sec: 41487.6). Total num frames: 457605120. Throughput: 0: 41309.3. Samples: 457787680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:05:18,393][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 17:05:21,952][15401] Updated weights for policy 0, policy_version 27940 (0.0028) [2024-06-21 17:05:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 457834496. Throughput: 0: 41403.9. Samples: 457911000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:05:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 17:05:25,777][15401] Updated weights for policy 0, policy_version 27950 (0.0042) [2024-06-21 17:05:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 458063872. Throughput: 0: 41208.1. Samples: 458157400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:05:28,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 17:05:29,841][15401] Updated weights for policy 0, policy_version 27960 (0.0042) [2024-06-21 17:05:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41543.1). Total num frames: 458227712. Throughput: 0: 41336.4. Samples: 458412160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:05:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 17:05:33,851][15401] Updated weights for policy 0, policy_version 27970 (0.0039) [2024-06-21 17:05:37,734][15401] Updated weights for policy 0, policy_version 27980 (0.0029) [2024-06-21 17:05:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 458457088. Throughput: 0: 41256.9. Samples: 458526780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:05:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 17:05:41,745][15401] Updated weights for policy 0, policy_version 27990 (0.0037) [2024-06-21 17:05:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 458686464. Throughput: 0: 41475.4. Samples: 458783340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:05:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 17:05:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027996_458686464.pth... [2024-06-21 17:05:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027385_448675840.pth [2024-06-21 17:05:45,528][15401] Updated weights for policy 0, policy_version 28000 (0.0038) [2024-06-21 17:05:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41234.8, 300 sec: 41543.2). Total num frames: 458850304. Throughput: 0: 41594.7. Samples: 459033640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:05:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 17:05:49,810][15401] Updated weights for policy 0, policy_version 28010 (0.0034) [2024-06-21 17:05:53,172][15401] Updated weights for policy 0, policy_version 28020 (0.0045) [2024-06-21 17:05:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 459079680. Throughput: 0: 41367.3. Samples: 459149700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:05:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-21 17:05:57,655][15401] Updated weights for policy 0, policy_version 28030 (0.0026) [2024-06-21 17:05:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 459292672. Throughput: 0: 41729.4. Samples: 459406200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:05:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:06:01,121][15401] Updated weights for policy 0, policy_version 28040 (0.0024) [2024-06-21 17:06:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 459472896. Throughput: 0: 41492.9. Samples: 459654860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:06:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 17:06:05,273][15401] Updated weights for policy 0, policy_version 28050 (0.0027) [2024-06-21 17:06:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 459702272. Throughput: 0: 41369.8. Samples: 459772640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:06:08,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-21 17:06:08,819][15401] Updated weights for policy 0, policy_version 28060 (0.0043) [2024-06-21 17:06:13,246][15401] Updated weights for policy 0, policy_version 28070 (0.0052) [2024-06-21 17:06:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 459898880. Throughput: 0: 41571.1. Samples: 460028100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:06:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 17:06:16,570][15401] Updated weights for policy 0, policy_version 28080 (0.0039) [2024-06-21 17:06:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 460079104. Throughput: 0: 41404.9. Samples: 460275380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:06:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 17:06:21,298][15401] Updated weights for policy 0, policy_version 28090 (0.0045) [2024-06-21 17:06:21,508][15349] Signal inference workers to stop experience collection... (6650 times) [2024-06-21 17:06:21,553][15401] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-21 17:06:21,621][15349] Signal inference workers to resume experience collection... (6650 times) [2024-06-21 17:06:21,622][15401] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-21 17:06:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 460324864. Throughput: 0: 41660.9. Samples: 460401520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-21 17:06:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 17:06:24,690][15401] Updated weights for policy 0, policy_version 28100 (0.0030) [2024-06-21 17:06:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 40686.8, 300 sec: 41487.6). Total num frames: 460505088. Throughput: 0: 41538.6. Samples: 460652580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-21 17:06:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 17:06:28,865][15401] Updated weights for policy 0, policy_version 28110 (0.0061) [2024-06-21 17:06:32,474][15401] Updated weights for policy 0, policy_version 28120 (0.0038) [2024-06-21 17:06:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 460734464. Throughput: 0: 41452.0. Samples: 460898980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-21 17:06:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 17:06:36,734][15401] Updated weights for policy 0, policy_version 28130 (0.0042) [2024-06-21 17:06:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 460963840. Throughput: 0: 41825.1. Samples: 461031820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-21 17:06:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 17:06:40,517][15401] Updated weights for policy 0, policy_version 28140 (0.0037) [2024-06-21 17:06:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 461127680. Throughput: 0: 41775.7. Samples: 461286100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:06:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 17:06:44,567][15401] Updated weights for policy 0, policy_version 28150 (0.0030) [2024-06-21 17:06:48,245][15401] Updated weights for policy 0, policy_version 28160 (0.0034) [2024-06-21 17:06:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 461373440. Throughput: 0: 41516.5. Samples: 461523100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:06:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 17:06:52,605][15401] Updated weights for policy 0, policy_version 28170 (0.0032) [2024-06-21 17:06:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 461586432. Throughput: 0: 41880.0. Samples: 461657240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:06:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 17:06:56,254][15401] Updated weights for policy 0, policy_version 28180 (0.0043) [2024-06-21 17:06:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 461750272. Throughput: 0: 41761.8. Samples: 461907380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:06:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 17:07:00,414][15401] Updated weights for policy 0, policy_version 28190 (0.0029) [2024-06-21 17:07:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 461996032. Throughput: 0: 41502.2. Samples: 462142980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 17:07:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 17:07:03,892][15401] Updated weights for policy 0, policy_version 28200 (0.0039) [2024-06-21 17:07:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 462176256. Throughput: 0: 41720.5. Samples: 462278940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 17:07:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 17:07:08,446][15401] Updated weights for policy 0, policy_version 28210 (0.0036) [2024-06-21 17:07:11,543][15401] Updated weights for policy 0, policy_version 28220 (0.0034) [2024-06-21 17:07:13,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 462372864. Throughput: 0: 41464.5. Samples: 462518480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 17:07:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 17:07:16,714][15401] Updated weights for policy 0, policy_version 28230 (0.0039) [2024-06-21 17:07:18,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42323.6, 300 sec: 41653.9). Total num frames: 462618624. Throughput: 0: 41492.8. Samples: 462766260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 17:07:18,393][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 17:07:19,218][15401] Updated weights for policy 0, policy_version 28240 (0.0032) [2024-06-21 17:07:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 462782464. Throughput: 0: 41553.7. Samples: 462901740. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-21 17:07:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 17:07:24,403][15401] Updated weights for policy 0, policy_version 28250 (0.0036) [2024-06-21 17:07:27,165][15401] Updated weights for policy 0, policy_version 28260 (0.0040) [2024-06-21 17:07:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 463011840. Throughput: 0: 41134.2. Samples: 463137140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-21 17:07:28,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 17:07:31,880][15349] Signal inference workers to stop experience collection... (6700 times) [2024-06-21 17:07:31,933][15401] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-21 17:07:31,995][15349] Signal inference workers to resume experience collection... (6700 times) [2024-06-21 17:07:31,995][15401] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-21 17:07:32,131][15401] Updated weights for policy 0, policy_version 28270 (0.0046) [2024-06-21 17:07:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 463224832. Throughput: 0: 41443.1. Samples: 463388040. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-21 17:07:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 17:07:34,935][15401] Updated weights for policy 0, policy_version 28280 (0.0027) [2024-06-21 17:07:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 463405056. Throughput: 0: 41195.1. Samples: 463511020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 17:07:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 17:07:39,974][15401] Updated weights for policy 0, policy_version 28290 (0.0040) [2024-06-21 17:07:42,776][15401] Updated weights for policy 0, policy_version 28300 (0.0044) [2024-06-21 17:07:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 463667200. Throughput: 0: 41105.7. Samples: 463757140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 17:07:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 17:07:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028300_463667200.pth... [2024-06-21 17:07:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027692_453705728.pth [2024-06-21 17:07:47,831][15401] Updated weights for policy 0, policy_version 28310 (0.0038) [2024-06-21 17:07:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 463847424. Throughput: 0: 41499.9. Samples: 464010480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 17:07:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 17:07:51,357][15401] Updated weights for policy 0, policy_version 28320 (0.0029) [2024-06-21 17:07:53,390][15132] Fps is (10 sec: 36044.8, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 464027648. Throughput: 0: 41180.4. Samples: 464132060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 17:07:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 17:07:55,693][15401] Updated weights for policy 0, policy_version 28330 (0.0036) [2024-06-21 17:07:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 41654.2). Total num frames: 464289792. Throughput: 0: 41492.0. Samples: 464385620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-21 17:07:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 17:07:59,029][15401] Updated weights for policy 0, policy_version 28340 (0.0038) [2024-06-21 17:08:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41233.1, 300 sec: 41543.5). Total num frames: 464470016. Throughput: 0: 41608.9. Samples: 464638560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-21 17:08:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 17:08:03,644][15401] Updated weights for policy 0, policy_version 28350 (0.0039) [2024-06-21 17:08:07,545][15401] Updated weights for policy 0, policy_version 28360 (0.0041) [2024-06-21 17:08:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 464666624. Throughput: 0: 41216.4. Samples: 464756480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-21 17:08:08,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 17:08:11,610][15401] Updated weights for policy 0, policy_version 28370 (0.0042) [2024-06-21 17:08:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 41599.0). Total num frames: 464896000. Throughput: 0: 41567.4. Samples: 465007680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-21 17:08:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 17:08:15,451][15401] Updated weights for policy 0, policy_version 28380 (0.0034) [2024-06-21 17:08:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41234.7, 300 sec: 41598.7). Total num frames: 465092608. Throughput: 0: 41671.0. Samples: 465263240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:08:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 17:08:19,440][15401] Updated weights for policy 0, policy_version 28390 (0.0033) [2024-06-21 17:08:23,380][15401] Updated weights for policy 0, policy_version 28400 (0.0034) [2024-06-21 17:08:23,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42050.6, 300 sec: 41487.3). Total num frames: 465305600. Throughput: 0: 41668.4. Samples: 465386200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:08:23,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 17:08:27,240][15401] Updated weights for policy 0, policy_version 28410 (0.0036) [2024-06-21 17:08:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 465518592. Throughput: 0: 41909.9. Samples: 465643080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:08:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 17:08:31,219][15401] Updated weights for policy 0, policy_version 28420 (0.0035) [2024-06-21 17:08:33,392][15132] Fps is (10 sec: 40960.2, 60 sec: 41504.5, 300 sec: 41598.4). Total num frames: 465715200. Throughput: 0: 41859.7. Samples: 465894260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:08:33,392][15132] Avg episode reward: [(0, '0.804')] [2024-06-21 17:08:35,068][15401] Updated weights for policy 0, policy_version 28430 (0.0047) [2024-06-21 17:08:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 465928192. Throughput: 0: 41824.4. Samples: 466014160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 17:08:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:08:38,772][15401] Updated weights for policy 0, policy_version 28440 (0.0035) [2024-06-21 17:08:42,911][15401] Updated weights for policy 0, policy_version 28450 (0.0037) [2024-06-21 17:08:43,392][15132] Fps is (10 sec: 42597.9, 60 sec: 41231.4, 300 sec: 41542.8). Total num frames: 466141184. Throughput: 0: 41802.6. Samples: 466266840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 17:08:43,393][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 17:08:46,371][15401] Updated weights for policy 0, policy_version 28460 (0.0044) [2024-06-21 17:08:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41487.7). Total num frames: 466337792. Throughput: 0: 41699.4. Samples: 466515040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 17:08:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 17:08:50,717][15401] Updated weights for policy 0, policy_version 28470 (0.0044) [2024-06-21 17:08:53,390][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 41432.4). Total num frames: 466550784. Throughput: 0: 41931.6. Samples: 466643400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 17:08:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 17:08:54,105][15401] Updated weights for policy 0, policy_version 28480 (0.0042) [2024-06-21 17:08:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41432.4). Total num frames: 466747392. Throughput: 0: 41932.1. Samples: 466894620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:08:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 17:08:59,016][15401] Updated weights for policy 0, policy_version 28490 (0.0032) [2024-06-21 17:09:02,011][15401] Updated weights for policy 0, policy_version 28500 (0.0025) [2024-06-21 17:09:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 466993152. Throughput: 0: 41734.7. Samples: 467141300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:09:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 17:09:06,715][15401] Updated weights for policy 0, policy_version 28510 (0.0030) [2024-06-21 17:09:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 467206144. Throughput: 0: 41947.9. Samples: 467273760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:09:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 17:09:09,766][15401] Updated weights for policy 0, policy_version 28520 (0.0036) [2024-06-21 17:09:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 467402752. Throughput: 0: 41735.8. Samples: 467521200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:09:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 17:09:14,487][15401] Updated weights for policy 0, policy_version 28530 (0.0039) [2024-06-21 17:09:15,647][15349] Signal inference workers to stop experience collection... (6750 times) [2024-06-21 17:09:15,648][15349] Signal inference workers to resume experience collection... (6750 times) [2024-06-21 17:09:15,681][15401] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-21 17:09:15,682][15401] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-21 17:09:17,534][15401] Updated weights for policy 0, policy_version 28540 (0.0041) [2024-06-21 17:09:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41543.1). Total num frames: 467615744. Throughput: 0: 41714.6. Samples: 467771320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:09:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-21 17:09:22,396][15401] Updated weights for policy 0, policy_version 28550 (0.0028) [2024-06-21 17:09:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41780.8, 300 sec: 41543.2). Total num frames: 467812352. Throughput: 0: 41895.5. Samples: 467899460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:09:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 17:09:25,051][15401] Updated weights for policy 0, policy_version 28560 (0.0042) [2024-06-21 17:09:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 468025344. Throughput: 0: 41960.1. Samples: 468154940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:09:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 17:09:30,265][15401] Updated weights for policy 0, policy_version 28570 (0.0036) [2024-06-21 17:09:32,750][15401] Updated weights for policy 0, policy_version 28580 (0.0034) [2024-06-21 17:09:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.0, 300 sec: 41654.2). Total num frames: 468254720. Throughput: 0: 41767.6. Samples: 468394580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:09:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 17:09:38,027][15401] Updated weights for policy 0, policy_version 28590 (0.0031) [2024-06-21 17:09:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 468418560. Throughput: 0: 41884.8. Samples: 468528220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:09:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 17:09:40,674][15401] Updated weights for policy 0, policy_version 28600 (0.0035) [2024-06-21 17:09:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41780.9, 300 sec: 41599.0). Total num frames: 468647936. Throughput: 0: 41746.3. Samples: 468773200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:09:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 17:09:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028605_468664320.pth... [2024-06-21 17:09:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000027996_458686464.pth [2024-06-21 17:09:46,219][15401] Updated weights for policy 0, policy_version 28610 (0.0032) [2024-06-21 17:09:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 468877312. Throughput: 0: 41703.6. Samples: 469017960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 17:09:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 17:09:48,832][15401] Updated weights for policy 0, policy_version 28620 (0.0033) [2024-06-21 17:09:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 469041152. Throughput: 0: 41527.2. Samples: 469142480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 17:09:53,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 17:09:53,979][15401] Updated weights for policy 0, policy_version 28630 (0.0042) [2024-06-21 17:09:56,655][15401] Updated weights for policy 0, policy_version 28640 (0.0038) [2024-06-21 17:09:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 469286912. Throughput: 0: 41512.5. Samples: 469389260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 17:09:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 17:10:01,628][15401] Updated weights for policy 0, policy_version 28650 (0.0042) [2024-06-21 17:10:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 469483520. Throughput: 0: 41719.1. Samples: 469648680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 17:10:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 17:10:04,397][15401] Updated weights for policy 0, policy_version 28660 (0.0046) [2024-06-21 17:10:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 469680128. Throughput: 0: 41482.2. Samples: 469766160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 17:10:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 17:10:09,847][15401] Updated weights for policy 0, policy_version 28670 (0.0037) [2024-06-21 17:10:12,815][15401] Updated weights for policy 0, policy_version 28680 (0.0045) [2024-06-21 17:10:13,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42047.9, 300 sec: 41764.4). Total num frames: 469925888. Throughput: 0: 41415.0. Samples: 470018880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:10:13,396][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 17:10:17,590][15401] Updated weights for policy 0, policy_version 28690 (0.0034) [2024-06-21 17:10:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 470106112. Throughput: 0: 41798.2. Samples: 470275500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:10:18,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 17:10:20,448][15401] Updated weights for policy 0, policy_version 28700 (0.0036) [2024-06-21 17:10:23,390][15132] Fps is (10 sec: 37707.0, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 470302720. Throughput: 0: 41455.1. Samples: 470393700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:10:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 17:10:25,211][15401] Updated weights for policy 0, policy_version 28710 (0.0036) [2024-06-21 17:10:27,832][15349] Signal inference workers to stop experience collection... (6800 times) [2024-06-21 17:10:27,832][15349] Signal inference workers to resume experience collection... (6800 times) [2024-06-21 17:10:27,866][15401] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-21 17:10:27,867][15401] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-21 17:10:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 470532096. Throughput: 0: 41615.6. Samples: 470645900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:10:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 17:10:28,514][15401] Updated weights for policy 0, policy_version 28720 (0.0037) [2024-06-21 17:10:32,878][15401] Updated weights for policy 0, policy_version 28730 (0.0023) [2024-06-21 17:10:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 470728704. Throughput: 0: 41669.8. Samples: 470893100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 17:10:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 17:10:36,499][15401] Updated weights for policy 0, policy_version 28740 (0.0038) [2024-06-21 17:10:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 470958080. Throughput: 0: 41715.5. Samples: 471019680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 17:10:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 17:10:40,640][15401] Updated weights for policy 0, policy_version 28750 (0.0036) [2024-06-21 17:10:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 471138304. Throughput: 0: 41751.7. Samples: 471268080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 17:10:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 17:10:44,470][15401] Updated weights for policy 0, policy_version 28760 (0.0038) [2024-06-21 17:10:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 471351296. Throughput: 0: 41553.0. Samples: 471518560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:10:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 17:10:48,517][15401] Updated weights for policy 0, policy_version 28770 (0.0049) [2024-06-21 17:10:52,000][15401] Updated weights for policy 0, policy_version 28780 (0.0044) [2024-06-21 17:10:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 471580672. Throughput: 0: 41691.6. Samples: 471642280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:10:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 17:10:56,239][15401] Updated weights for policy 0, policy_version 28790 (0.0027) [2024-06-21 17:10:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 471744512. Throughput: 0: 41648.5. Samples: 471892800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:10:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 17:10:59,565][15401] Updated weights for policy 0, policy_version 28800 (0.0048) [2024-06-21 17:11:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 471973888. Throughput: 0: 41586.7. Samples: 472146900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:11:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 17:11:04,339][15401] Updated weights for policy 0, policy_version 28810 (0.0030) [2024-06-21 17:11:07,668][15401] Updated weights for policy 0, policy_version 28820 (0.0032) [2024-06-21 17:11:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 472186880. Throughput: 0: 41737.4. Samples: 472271880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 17:11:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 17:11:12,297][15401] Updated weights for policy 0, policy_version 28830 (0.0040) [2024-06-21 17:11:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 40691.2, 300 sec: 41654.2). Total num frames: 472367104. Throughput: 0: 41581.2. Samples: 472517060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 17:11:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 17:11:15,748][15401] Updated weights for policy 0, policy_version 28840 (0.0039) [2024-06-21 17:11:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 472596480. Throughput: 0: 41584.0. Samples: 472764380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 17:11:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 17:11:20,088][15401] Updated weights for policy 0, policy_version 28850 (0.0030) [2024-06-21 17:11:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 472809472. Throughput: 0: 41581.4. Samples: 472890840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 17:11:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 17:11:24,239][15401] Updated weights for policy 0, policy_version 28860 (0.0029) [2024-06-21 17:11:27,849][15401] Updated weights for policy 0, policy_version 28870 (0.0037) [2024-06-21 17:11:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 473006080. Throughput: 0: 41582.6. Samples: 473139300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:11:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 17:11:31,968][15401] Updated weights for policy 0, policy_version 28880 (0.0030) [2024-06-21 17:11:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 473219072. Throughput: 0: 41617.8. Samples: 473391360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:11:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 17:11:35,498][15401] Updated weights for policy 0, policy_version 28890 (0.0033) [2024-06-21 17:11:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 473448448. Throughput: 0: 41622.7. Samples: 473515300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:11:38,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 17:11:39,769][15401] Updated weights for policy 0, policy_version 28900 (0.0027) [2024-06-21 17:11:43,121][15401] Updated weights for policy 0, policy_version 28910 (0.0039) [2024-06-21 17:11:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42052.1, 300 sec: 41654.2). Total num frames: 473661440. Throughput: 0: 41616.3. Samples: 473765540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:11:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 17:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028910_473661440.pth... [2024-06-21 17:11:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028300_463667200.pth [2024-06-21 17:11:45,625][15349] Signal inference workers to stop experience collection... (6850 times) [2024-06-21 17:11:45,626][15349] Signal inference workers to resume experience collection... (6850 times) [2024-06-21 17:11:45,667][15401] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-21 17:11:45,667][15401] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-21 17:11:47,897][15401] Updated weights for policy 0, policy_version 28920 (0.0040) [2024-06-21 17:11:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 473841664. Throughput: 0: 41512.9. Samples: 474014980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:11:48,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 17:11:51,107][15401] Updated weights for policy 0, policy_version 28930 (0.0032) [2024-06-21 17:11:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 474071040. Throughput: 0: 41372.1. Samples: 474133620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:11:53,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-21 17:11:55,658][15401] Updated weights for policy 0, policy_version 28940 (0.0032) [2024-06-21 17:11:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 474251264. Throughput: 0: 41504.0. Samples: 474384740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:11:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 17:11:59,315][15401] Updated weights for policy 0, policy_version 28950 (0.0027) [2024-06-21 17:12:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 474464256. Throughput: 0: 41681.9. Samples: 474640060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:12:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 17:12:03,514][15401] Updated weights for policy 0, policy_version 28960 (0.0029) [2024-06-21 17:12:07,189][15401] Updated weights for policy 0, policy_version 28970 (0.0052) [2024-06-21 17:12:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 474693632. Throughput: 0: 41567.6. Samples: 474761380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:12:08,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-21 17:12:11,663][15401] Updated weights for policy 0, policy_version 28980 (0.0026) [2024-06-21 17:12:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.3, 300 sec: 41543.5). Total num frames: 474873856. Throughput: 0: 41630.2. Samples: 475012660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:12:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 17:12:15,010][15401] Updated weights for policy 0, policy_version 28990 (0.0042) [2024-06-21 17:12:18,390][15132] Fps is (10 sec: 37682.5, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 475070464. Throughput: 0: 41586.5. Samples: 475262760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:12:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 17:12:19,423][15401] Updated weights for policy 0, policy_version 29000 (0.0034) [2024-06-21 17:12:22,891][15401] Updated weights for policy 0, policy_version 29010 (0.0030) [2024-06-21 17:12:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 475316224. Throughput: 0: 41624.4. Samples: 475388400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:12:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:12:27,339][15401] Updated weights for policy 0, policy_version 29020 (0.0039) [2024-06-21 17:12:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 41504.5, 300 sec: 41598.4). Total num frames: 475496448. Throughput: 0: 41612.6. Samples: 475638200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:12:28,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 17:12:30,591][15401] Updated weights for policy 0, policy_version 29030 (0.0032) [2024-06-21 17:12:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 475725824. Throughput: 0: 41334.6. Samples: 475875040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:12:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 17:12:35,203][15401] Updated weights for policy 0, policy_version 29040 (0.0046) [2024-06-21 17:12:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 475938816. Throughput: 0: 41642.1. Samples: 476007520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 17:12:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 17:12:38,515][15401] Updated weights for policy 0, policy_version 29050 (0.0033) [2024-06-21 17:12:42,923][15401] Updated weights for policy 0, policy_version 29060 (0.0028) [2024-06-21 17:12:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.2, 300 sec: 41654.3). Total num frames: 476135424. Throughput: 0: 41611.6. Samples: 476257260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:12:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 17:12:46,617][15401] Updated weights for policy 0, policy_version 29070 (0.0045) [2024-06-21 17:12:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 476364800. Throughput: 0: 41345.2. Samples: 476500600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:12:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 17:12:50,712][15401] Updated weights for policy 0, policy_version 29080 (0.0037) [2024-06-21 17:12:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 40959.8, 300 sec: 41487.6). Total num frames: 476528640. Throughput: 0: 41548.2. Samples: 476631060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:12:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-21 17:12:54,595][15401] Updated weights for policy 0, policy_version 29090 (0.0031) [2024-06-21 17:12:55,382][15349] Signal inference workers to stop experience collection... (6900 times) [2024-06-21 17:12:55,383][15349] Signal inference workers to resume experience collection... (6900 times) [2024-06-21 17:12:55,398][15401] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-21 17:12:55,398][15401] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-21 17:12:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 476758016. Throughput: 0: 41416.5. Samples: 476876400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:12:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 17:12:58,487][15401] Updated weights for policy 0, policy_version 29100 (0.0040) [2024-06-21 17:13:02,475][15401] Updated weights for policy 0, policy_version 29110 (0.0046) [2024-06-21 17:13:03,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42050.5, 300 sec: 41765.0). Total num frames: 476987392. Throughput: 0: 41424.6. Samples: 477126960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:13:03,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 17:13:06,326][15401] Updated weights for policy 0, policy_version 29120 (0.0030) [2024-06-21 17:13:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40686.8, 300 sec: 41487.6). Total num frames: 477134848. Throughput: 0: 41506.6. Samples: 477256200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:13:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 17:13:10,347][15401] Updated weights for policy 0, policy_version 29130 (0.0040) [2024-06-21 17:13:13,392][15132] Fps is (10 sec: 39321.5, 60 sec: 41777.5, 300 sec: 41653.9). Total num frames: 477380608. Throughput: 0: 41389.8. Samples: 477500740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:13:13,393][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 17:13:14,031][15401] Updated weights for policy 0, policy_version 29140 (0.0032) [2024-06-21 17:13:18,264][15401] Updated weights for policy 0, policy_version 29150 (0.0037) [2024-06-21 17:13:18,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42050.7, 300 sec: 41654.2). Total num frames: 477593600. Throughput: 0: 41712.9. Samples: 477752220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:13:18,393][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 17:13:22,190][15401] Updated weights for policy 0, policy_version 29160 (0.0041) [2024-06-21 17:13:23,390][15132] Fps is (10 sec: 39330.7, 60 sec: 40959.9, 300 sec: 41543.1). Total num frames: 477773824. Throughput: 0: 41577.3. Samples: 477878500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 30.0) [2024-06-21 17:13:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:13:25,884][15401] Updated weights for policy 0, policy_version 29170 (0.0033) [2024-06-21 17:13:28,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42053.9, 300 sec: 41710.1). Total num frames: 478019584. Throughput: 0: 41394.6. Samples: 478120020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 30.0) [2024-06-21 17:13:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 17:13:30,244][15401] Updated weights for policy 0, policy_version 29180 (0.0035) [2024-06-21 17:13:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 478199808. Throughput: 0: 41674.2. Samples: 478375940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 30.0) [2024-06-21 17:13:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 17:13:33,872][15401] Updated weights for policy 0, policy_version 29190 (0.0048) [2024-06-21 17:13:38,356][15401] Updated weights for policy 0, policy_version 29200 (0.0037) [2024-06-21 17:13:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41599.1). Total num frames: 478412800. Throughput: 0: 41372.7. Samples: 478492820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 30.0) [2024-06-21 17:13:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:13:41,637][15401] Updated weights for policy 0, policy_version 29210 (0.0038) [2024-06-21 17:13:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 478658560. Throughput: 0: 41553.7. Samples: 478746320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-06-21 17:13:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 17:13:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029215_478658560.pth... [2024-06-21 17:13:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028605_468664320.pth [2024-06-21 17:13:46,203][15401] Updated weights for policy 0, policy_version 29220 (0.0023) [2024-06-21 17:13:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 478822400. Throughput: 0: 41856.5. Samples: 479010400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-06-21 17:13:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 17:13:49,570][15401] Updated weights for policy 0, policy_version 29230 (0.0046) [2024-06-21 17:13:53,392][15132] Fps is (10 sec: 37674.2, 60 sec: 41777.6, 300 sec: 41653.9). Total num frames: 479035392. Throughput: 0: 41471.6. Samples: 479122520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-06-21 17:13:53,392][15132] Avg episode reward: [(0, '0.190')] [2024-06-21 17:13:54,130][15401] Updated weights for policy 0, policy_version 29240 (0.0039) [2024-06-21 17:13:57,314][15401] Updated weights for policy 0, policy_version 29250 (0.0058) [2024-06-21 17:13:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 479281152. Throughput: 0: 41782.7. Samples: 479380860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-06-21 17:13:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 17:14:01,959][15401] Updated weights for policy 0, policy_version 29260 (0.0041) [2024-06-21 17:14:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 41234.7, 300 sec: 41543.2). Total num frames: 479461376. Throughput: 0: 41889.3. Samples: 479637140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:14:03,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 17:14:05,201][15401] Updated weights for policy 0, policy_version 29270 (0.0032) [2024-06-21 17:14:06,283][15349] Signal inference workers to stop experience collection... (6950 times) [2024-06-21 17:14:06,285][15349] Signal inference workers to resume experience collection... (6950 times) [2024-06-21 17:14:06,325][15401] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-21 17:14:06,325][15401] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-21 17:14:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 41654.2). Total num frames: 479690752. Throughput: 0: 41821.3. Samples: 479760460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:14:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 17:14:09,728][15401] Updated weights for policy 0, policy_version 29280 (0.0031) [2024-06-21 17:14:12,959][15401] Updated weights for policy 0, policy_version 29290 (0.0036) [2024-06-21 17:14:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42053.9, 300 sec: 41654.2). Total num frames: 479903744. Throughput: 0: 42185.0. Samples: 480018340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:14:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 17:14:17,566][15401] Updated weights for policy 0, policy_version 29300 (0.0045) [2024-06-21 17:14:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41234.7, 300 sec: 41543.2). Total num frames: 480067584. Throughput: 0: 41998.8. Samples: 480265880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:14:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 17:14:20,833][15401] Updated weights for policy 0, policy_version 29310 (0.0029) [2024-06-21 17:14:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 41653.9). Total num frames: 480313344. Throughput: 0: 42000.3. Samples: 480382940. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 17:14:23,392][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 17:14:25,365][15401] Updated weights for policy 0, policy_version 29320 (0.0041) [2024-06-21 17:14:28,395][15132] Fps is (10 sec: 44211.1, 60 sec: 41502.2, 300 sec: 41542.3). Total num frames: 480509952. Throughput: 0: 41944.9. Samples: 480634080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 17:14:28,396][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 17:14:28,787][15401] Updated weights for policy 0, policy_version 29330 (0.0028) [2024-06-21 17:14:33,184][15401] Updated weights for policy 0, policy_version 29340 (0.0032) [2024-06-21 17:14:33,389][15132] Fps is (10 sec: 39331.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 480706560. Throughput: 0: 41759.5. Samples: 480889580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 17:14:33,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-21 17:14:36,523][15401] Updated weights for policy 0, policy_version 29350 (0.0032) [2024-06-21 17:14:38,392][15132] Fps is (10 sec: 44251.7, 60 sec: 42323.6, 300 sec: 41709.4). Total num frames: 480952320. Throughput: 0: 42041.3. Samples: 481014380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 17:14:38,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 17:14:41,031][15401] Updated weights for policy 0, policy_version 29360 (0.0031) [2024-06-21 17:14:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 481132544. Throughput: 0: 41846.6. Samples: 481263960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:14:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 17:14:44,358][15401] Updated weights for policy 0, policy_version 29370 (0.0045) [2024-06-21 17:14:48,390][15132] Fps is (10 sec: 37692.1, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 481329152. Throughput: 0: 41767.1. Samples: 481516660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:14:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 17:14:48,766][15401] Updated weights for policy 0, policy_version 29380 (0.0046) [2024-06-21 17:14:52,156][15401] Updated weights for policy 0, policy_version 29390 (0.0042) [2024-06-21 17:14:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.1, 300 sec: 41654.3). Total num frames: 481574912. Throughput: 0: 41787.2. Samples: 481640880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:14:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 17:14:56,485][15401] Updated weights for policy 0, policy_version 29400 (0.0044) [2024-06-21 17:14:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 481738752. Throughput: 0: 41584.1. Samples: 481889620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:14:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 17:15:00,093][15401] Updated weights for policy 0, policy_version 29410 (0.0037) [2024-06-21 17:15:03,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 481951744. Throughput: 0: 41662.9. Samples: 482140720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:15:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 17:15:04,049][15401] Updated weights for policy 0, policy_version 29420 (0.0029) [2024-06-21 17:15:08,227][15401] Updated weights for policy 0, policy_version 29430 (0.0038) [2024-06-21 17:15:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 41506.1, 300 sec: 41544.0). Total num frames: 482181120. Throughput: 0: 41934.1. Samples: 482269880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:15:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 17:15:12,046][15401] Updated weights for policy 0, policy_version 29440 (0.0046) [2024-06-21 17:15:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 482377728. Throughput: 0: 41669.8. Samples: 482508980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:15:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 17:15:16,167][15401] Updated weights for policy 0, policy_version 29450 (0.0041) [2024-06-21 17:15:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 482590720. Throughput: 0: 41596.0. Samples: 482761400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:15:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 17:15:19,922][15401] Updated weights for policy 0, policy_version 29460 (0.0043) [2024-06-21 17:15:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41507.8, 300 sec: 41598.7). Total num frames: 482803712. Throughput: 0: 41624.4. Samples: 482887380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:15:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 17:15:23,944][15401] Updated weights for policy 0, policy_version 29470 (0.0034) [2024-06-21 17:15:27,897][15401] Updated weights for policy 0, policy_version 29480 (0.0036) [2024-06-21 17:15:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42056.3, 300 sec: 41709.8). Total num frames: 483033088. Throughput: 0: 41610.6. Samples: 483136440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:15:28,392][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 17:15:31,650][15401] Updated weights for policy 0, policy_version 29490 (0.0034) [2024-06-21 17:15:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 483229696. Throughput: 0: 41418.1. Samples: 483380480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:15:33,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-21 17:15:35,838][15401] Updated weights for policy 0, policy_version 29500 (0.0044) [2024-06-21 17:15:38,392][15132] Fps is (10 sec: 37674.4, 60 sec: 40960.0, 300 sec: 41598.4). Total num frames: 483409920. Throughput: 0: 41558.2. Samples: 483511100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-21 17:15:38,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 17:15:38,756][15349] Signal inference workers to stop experience collection... (7000 times) [2024-06-21 17:15:38,802][15401] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-21 17:15:38,873][15349] Signal inference workers to resume experience collection... (7000 times) [2024-06-21 17:15:38,873][15401] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-21 17:15:39,465][15401] Updated weights for policy 0, policy_version 29510 (0.0038) [2024-06-21 17:15:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 483639296. Throughput: 0: 41482.6. Samples: 483756340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-21 17:15:43,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-21 17:15:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029520_483655680.pth... [2024-06-21 17:15:43,525][15401] Updated weights for policy 0, policy_version 29520 (0.0053) [2024-06-21 17:15:43,579][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000028910_473661440.pth [2024-06-21 17:15:47,347][15401] Updated weights for policy 0, policy_version 29530 (0.0039) [2024-06-21 17:15:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 483852288. Throughput: 0: 41424.1. Samples: 484004800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-21 17:15:48,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-21 17:15:51,643][15401] Updated weights for policy 0, policy_version 29540 (0.0038) [2024-06-21 17:15:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 41654.2). Total num frames: 484032512. Throughput: 0: 41268.9. Samples: 484126980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-21 17:15:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 17:15:55,479][15401] Updated weights for policy 0, policy_version 29550 (0.0030) [2024-06-21 17:15:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 484261888. Throughput: 0: 41394.6. Samples: 484371740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 17:15:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 17:15:59,765][15401] Updated weights for policy 0, policy_version 29560 (0.0036) [2024-06-21 17:16:03,292][15401] Updated weights for policy 0, policy_version 29570 (0.0024) [2024-06-21 17:16:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 484474880. Throughput: 0: 41327.9. Samples: 484621160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 17:16:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 17:16:07,540][15401] Updated weights for policy 0, policy_version 29580 (0.0044) [2024-06-21 17:16:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 484655104. Throughput: 0: 41141.7. Samples: 484738760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 17:16:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 17:16:11,679][15401] Updated weights for policy 0, policy_version 29590 (0.0029) [2024-06-21 17:16:13,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41504.4, 300 sec: 41598.4). Total num frames: 484868096. Throughput: 0: 41286.7. Samples: 484994440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 17:16:13,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 17:16:15,237][15401] Updated weights for policy 0, policy_version 29600 (0.0045) [2024-06-21 17:16:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 485081088. Throughput: 0: 41473.1. Samples: 485246760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:16:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 17:16:19,476][15401] Updated weights for policy 0, policy_version 29610 (0.0034) [2024-06-21 17:16:23,231][15401] Updated weights for policy 0, policy_version 29620 (0.0036) [2024-06-21 17:16:23,392][15132] Fps is (10 sec: 42598.6, 60 sec: 41504.5, 300 sec: 41653.9). Total num frames: 485294080. Throughput: 0: 41329.8. Samples: 485370940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:16:23,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 17:16:27,248][15401] Updated weights for policy 0, policy_version 29630 (0.0036) [2024-06-21 17:16:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40687.0, 300 sec: 41543.2). Total num frames: 485474304. Throughput: 0: 41556.9. Samples: 485626400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:16:28,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 17:16:30,854][15401] Updated weights for policy 0, policy_version 29640 (0.0032) [2024-06-21 17:16:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 485703680. Throughput: 0: 41656.0. Samples: 485879320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:16:33,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-21 17:16:34,880][15401] Updated weights for policy 0, policy_version 29650 (0.0037) [2024-06-21 17:16:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42053.9, 300 sec: 41598.7). Total num frames: 485933056. Throughput: 0: 41739.7. Samples: 486005260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-21 17:16:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 17:16:38,454][15401] Updated weights for policy 0, policy_version 29660 (0.0030) [2024-06-21 17:16:39,835][15349] Signal inference workers to stop experience collection... (7050 times) [2024-06-21 17:16:39,835][15349] Signal inference workers to resume experience collection... (7050 times) [2024-06-21 17:16:39,876][15401] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-21 17:16:39,877][15401] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-21 17:16:42,588][15401] Updated weights for policy 0, policy_version 29670 (0.0038) [2024-06-21 17:16:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 486129664. Throughput: 0: 41877.3. Samples: 486256220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-21 17:16:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 17:16:46,225][15401] Updated weights for policy 0, policy_version 29680 (0.0032) [2024-06-21 17:16:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 486326272. Throughput: 0: 41990.8. Samples: 486510740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-21 17:16:48,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 17:16:50,326][15401] Updated weights for policy 0, policy_version 29690 (0.0025) [2024-06-21 17:16:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 486539264. Throughput: 0: 42147.7. Samples: 486635400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-21 17:16:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 17:16:53,992][15401] Updated weights for policy 0, policy_version 29700 (0.0054) [2024-06-21 17:16:58,078][15401] Updated weights for policy 0, policy_version 29710 (0.0036) [2024-06-21 17:16:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41709.7). Total num frames: 486768640. Throughput: 0: 41998.6. Samples: 486884280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 17:16:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 17:17:01,839][15401] Updated weights for policy 0, policy_version 29720 (0.0037) [2024-06-21 17:17:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.1, 300 sec: 41543.1). Total num frames: 486948864. Throughput: 0: 41927.9. Samples: 487133520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 17:17:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 17:17:06,206][15401] Updated weights for policy 0, policy_version 29730 (0.0030) [2024-06-21 17:17:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 487178240. Throughput: 0: 41931.5. Samples: 487257760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 17:17:08,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 17:17:09,624][15401] Updated weights for policy 0, policy_version 29740 (0.0028) [2024-06-21 17:17:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41781.0, 300 sec: 41709.8). Total num frames: 487374848. Throughput: 0: 42021.9. Samples: 487517380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-21 17:17:13,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 17:17:14,005][15401] Updated weights for policy 0, policy_version 29750 (0.0036) [2024-06-21 17:17:17,443][15401] Updated weights for policy 0, policy_version 29760 (0.0035) [2024-06-21 17:17:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 487604224. Throughput: 0: 41697.9. Samples: 487755720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 17:17:18,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-21 17:17:21,992][15401] Updated weights for policy 0, policy_version 29770 (0.0041) [2024-06-21 17:17:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42054.0, 300 sec: 41765.7). Total num frames: 487817216. Throughput: 0: 41786.7. Samples: 487885660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 17:17:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 17:17:25,563][15401] Updated weights for policy 0, policy_version 29780 (0.0039) [2024-06-21 17:17:28,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42050.6, 300 sec: 41598.4). Total num frames: 487997440. Throughput: 0: 41822.3. Samples: 488138320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 17:17:28,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 17:17:30,083][15401] Updated weights for policy 0, policy_version 29790 (0.0046) [2024-06-21 17:17:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 488210432. Throughput: 0: 41508.5. Samples: 488378620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 17:17:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-21 17:17:33,985][15401] Updated weights for policy 0, policy_version 29800 (0.0031) [2024-06-21 17:17:35,376][15349] Signal inference workers to stop experience collection... (7100 times) [2024-06-21 17:17:35,429][15401] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-21 17:17:35,438][15349] Signal inference workers to resume experience collection... (7100 times) [2024-06-21 17:17:35,444][15401] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-21 17:17:37,925][15401] Updated weights for policy 0, policy_version 29810 (0.0039) [2024-06-21 17:17:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 488423424. Throughput: 0: 41516.4. Samples: 488503640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:17:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 17:17:41,924][15401] Updated weights for policy 0, policy_version 29820 (0.0044) [2024-06-21 17:17:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 41777.6, 300 sec: 41598.4). Total num frames: 488636416. Throughput: 0: 41542.3. Samples: 488753780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:17:43,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 17:17:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029824_488636416.pth... [2024-06-21 17:17:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029215_478658560.pth [2024-06-21 17:17:45,767][15401] Updated weights for policy 0, policy_version 29830 (0.0043) [2024-06-21 17:17:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 488849408. Throughput: 0: 41360.9. Samples: 488994760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:17:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 17:17:50,034][15401] Updated weights for policy 0, policy_version 29840 (0.0043) [2024-06-21 17:17:53,389][15132] Fps is (10 sec: 39331.6, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 489029632. Throughput: 0: 41422.9. Samples: 489121780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 17:17:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 17:17:53,632][15401] Updated weights for policy 0, policy_version 29850 (0.0028) [2024-06-21 17:17:57,888][15401] Updated weights for policy 0, policy_version 29860 (0.0033) [2024-06-21 17:17:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.2, 300 sec: 41543.5). Total num frames: 489242624. Throughput: 0: 41327.9. Samples: 489377140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:17:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:18:01,454][15401] Updated weights for policy 0, policy_version 29870 (0.0030) [2024-06-21 17:18:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 489472000. Throughput: 0: 41501.7. Samples: 489623300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:18:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 17:18:05,647][15401] Updated weights for policy 0, policy_version 29880 (0.0039) [2024-06-21 17:18:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 41599.0). Total num frames: 489652224. Throughput: 0: 41571.1. Samples: 489756360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:18:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 17:18:09,274][15401] Updated weights for policy 0, policy_version 29890 (0.0041) [2024-06-21 17:18:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41599.0). Total num frames: 489865216. Throughput: 0: 41436.5. Samples: 490002860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:18:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 17:18:13,440][15401] Updated weights for policy 0, policy_version 29900 (0.0036) [2024-06-21 17:18:17,271][15401] Updated weights for policy 0, policy_version 29910 (0.0033) [2024-06-21 17:18:18,390][15132] Fps is (10 sec: 44235.6, 60 sec: 41505.9, 300 sec: 41765.3). Total num frames: 490094592. Throughput: 0: 41639.3. Samples: 490252400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:18:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 17:18:21,040][15401] Updated weights for policy 0, policy_version 29920 (0.0034) [2024-06-21 17:18:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 490307584. Throughput: 0: 41695.2. Samples: 490379920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:18:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 17:18:25,275][15401] Updated weights for policy 0, policy_version 29930 (0.0032) [2024-06-21 17:18:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 41765.3). Total num frames: 490520576. Throughput: 0: 41716.3. Samples: 490630920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:18:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 17:18:28,626][15401] Updated weights for policy 0, policy_version 29940 (0.0051) [2024-06-21 17:18:33,080][15401] Updated weights for policy 0, policy_version 29950 (0.0038) [2024-06-21 17:18:33,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41777.5, 300 sec: 41709.4). Total num frames: 490717184. Throughput: 0: 41930.7. Samples: 490881740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:18:33,392][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 17:18:36,558][15401] Updated weights for policy 0, policy_version 29960 (0.0030) [2024-06-21 17:18:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 490913792. Throughput: 0: 41827.0. Samples: 491004000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:18:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 17:18:40,801][15401] Updated weights for policy 0, policy_version 29970 (0.0043) [2024-06-21 17:18:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 41507.8, 300 sec: 41709.8). Total num frames: 491126784. Throughput: 0: 41650.2. Samples: 491251400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:18:43,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-21 17:18:44,748][15401] Updated weights for policy 0, policy_version 29980 (0.0032) [2024-06-21 17:18:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41654.6). Total num frames: 491323392. Throughput: 0: 41704.0. Samples: 491499980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:18:48,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 17:18:48,727][15401] Updated weights for policy 0, policy_version 29990 (0.0041) [2024-06-21 17:18:52,611][15401] Updated weights for policy 0, policy_version 30000 (0.0040) [2024-06-21 17:18:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 491569152. Throughput: 0: 41541.3. Samples: 491625720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 17:18:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 17:18:56,303][15401] Updated weights for policy 0, policy_version 30010 (0.0037) [2024-06-21 17:18:58,376][15349] Signal inference workers to stop experience collection... (7150 times) [2024-06-21 17:18:58,377][15349] Signal inference workers to resume experience collection... (7150 times) [2024-06-21 17:18:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 491749376. Throughput: 0: 41646.7. Samples: 491876960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:18:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 17:18:58,400][15401] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-21 17:18:58,428][15401] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-21 17:19:00,217][15401] Updated weights for policy 0, policy_version 30020 (0.0048) [2024-06-21 17:19:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 491945984. Throughput: 0: 41644.1. Samples: 492126380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:19:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 17:19:04,435][15401] Updated weights for policy 0, policy_version 30030 (0.0034) [2024-06-21 17:19:08,207][15401] Updated weights for policy 0, policy_version 30040 (0.0032) [2024-06-21 17:19:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 492175360. Throughput: 0: 41543.1. Samples: 492249360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:19:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 17:19:12,426][15401] Updated weights for policy 0, policy_version 30050 (0.0030) [2024-06-21 17:19:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 492371968. Throughput: 0: 41559.7. Samples: 492501100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:19:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 17:19:15,802][15401] Updated weights for policy 0, policy_version 30060 (0.0038) [2024-06-21 17:19:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 41543.5). Total num frames: 492568576. Throughput: 0: 41654.7. Samples: 492756100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 17:19:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 17:19:20,478][15401] Updated weights for policy 0, policy_version 30070 (0.0034) [2024-06-21 17:19:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41710.6). Total num frames: 492814336. Throughput: 0: 41633.8. Samples: 492877520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 17:19:23,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-21 17:19:23,465][15401] Updated weights for policy 0, policy_version 30080 (0.0043) [2024-06-21 17:19:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 492978176. Throughput: 0: 41748.4. Samples: 493130080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 17:19:28,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-21 17:19:28,531][15401] Updated weights for policy 0, policy_version 30090 (0.0031) [2024-06-21 17:19:31,337][15401] Updated weights for policy 0, policy_version 30100 (0.0040) [2024-06-21 17:19:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41507.8, 300 sec: 41543.5). Total num frames: 493207552. Throughput: 0: 41669.9. Samples: 493375120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 17:19:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 17:19:36,203][15401] Updated weights for policy 0, policy_version 30110 (0.0034) [2024-06-21 17:19:38,396][15132] Fps is (10 sec: 44208.7, 60 sec: 41774.7, 300 sec: 41653.3). Total num frames: 493420544. Throughput: 0: 41739.4. Samples: 493504260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 17:19:38,396][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:19:39,162][15401] Updated weights for policy 0, policy_version 30120 (0.0035) [2024-06-21 17:19:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 493600768. Throughput: 0: 41739.8. Samples: 493755260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 17:19:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 17:19:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030128_493617152.pth... [2024-06-21 17:19:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029520_483655680.pth [2024-06-21 17:19:44,161][15401] Updated weights for policy 0, policy_version 30130 (0.0034) [2024-06-21 17:19:47,002][15401] Updated weights for policy 0, policy_version 30140 (0.0023) [2024-06-21 17:19:48,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 493862912. Throughput: 0: 41450.7. Samples: 493991660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 17:19:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 17:19:51,821][15401] Updated weights for policy 0, policy_version 30150 (0.0030) [2024-06-21 17:19:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 494043136. Throughput: 0: 41819.2. Samples: 494131220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 17:19:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 17:19:55,112][15401] Updated weights for policy 0, policy_version 30160 (0.0031) [2024-06-21 17:19:58,396][15132] Fps is (10 sec: 37659.0, 60 sec: 41501.6, 300 sec: 41653.3). Total num frames: 494239744. Throughput: 0: 41615.4. Samples: 494374060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:19:58,397][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 17:19:59,620][15401] Updated weights for policy 0, policy_version 30170 (0.0033) [2024-06-21 17:20:02,752][15401] Updated weights for policy 0, policy_version 30180 (0.0040) [2024-06-21 17:20:03,396][15132] Fps is (10 sec: 45847.2, 60 sec: 42594.1, 300 sec: 41764.5). Total num frames: 494501888. Throughput: 0: 41489.9. Samples: 494623400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:20:03,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 17:20:07,457][15401] Updated weights for policy 0, policy_version 30190 (0.0039) [2024-06-21 17:20:08,390][15132] Fps is (10 sec: 40986.2, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 494649344. Throughput: 0: 41818.6. Samples: 494759360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 17:20:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 17:20:10,585][15401] Updated weights for policy 0, policy_version 30200 (0.0045) [2024-06-21 17:20:13,390][15132] Fps is (10 sec: 37705.4, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 494878720. Throughput: 0: 41674.1. Samples: 495005420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:20:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 17:20:15,549][15401] Updated weights for policy 0, policy_version 30210 (0.0044) [2024-06-21 17:20:18,373][15401] Updated weights for policy 0, policy_version 30220 (0.0022) [2024-06-21 17:20:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 41765.3). Total num frames: 495124480. Throughput: 0: 41804.4. Samples: 495256320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:20:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 17:20:22,119][15349] Signal inference workers to stop experience collection... (7200 times) [2024-06-21 17:20:22,122][15349] Signal inference workers to resume experience collection... (7200 times) [2024-06-21 17:20:22,166][15401] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-21 17:20:22,167][15401] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-21 17:20:23,390][15132] Fps is (10 sec: 39322.3, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 495271936. Throughput: 0: 41777.1. Samples: 495383960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:20:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 17:20:23,667][15401] Updated weights for policy 0, policy_version 30230 (0.0040) [2024-06-21 17:20:26,130][15401] Updated weights for policy 0, policy_version 30240 (0.0033) [2024-06-21 17:20:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 495501312. Throughput: 0: 41660.2. Samples: 495629960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:20:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 17:20:31,383][15401] Updated weights for policy 0, policy_version 30250 (0.0030) [2024-06-21 17:20:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41710.1). Total num frames: 495714304. Throughput: 0: 42039.1. Samples: 495883420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 17:20:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 17:20:34,110][15401] Updated weights for policy 0, policy_version 30260 (0.0034) [2024-06-21 17:20:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41510.5, 300 sec: 41598.7). Total num frames: 495910912. Throughput: 0: 41728.4. Samples: 496009000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 17:20:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:20:39,311][15401] Updated weights for policy 0, policy_version 30270 (0.0045) [2024-06-21 17:20:41,921][15401] Updated weights for policy 0, policy_version 30280 (0.0029) [2024-06-21 17:20:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 496156672. Throughput: 0: 41741.5. Samples: 496252160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 17:20:43,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 17:20:46,985][15401] Updated weights for policy 0, policy_version 30290 (0.0042) [2024-06-21 17:20:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 496353280. Throughput: 0: 41869.9. Samples: 496507300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 17:20:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 17:20:50,041][15401] Updated weights for policy 0, policy_version 30300 (0.0032) [2024-06-21 17:20:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 496533504. Throughput: 0: 41529.3. Samples: 496628180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:20:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 17:20:54,765][15401] Updated weights for policy 0, policy_version 30310 (0.0029) [2024-06-21 17:20:57,985][15401] Updated weights for policy 0, policy_version 30320 (0.0036) [2024-06-21 17:20:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42056.7, 300 sec: 41654.2). Total num frames: 496762880. Throughput: 0: 41612.1. Samples: 496877960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:20:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 17:21:02,575][15401] Updated weights for policy 0, policy_version 30330 (0.0034) [2024-06-21 17:21:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40691.1, 300 sec: 41654.3). Total num frames: 496943104. Throughput: 0: 41566.7. Samples: 497126820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:21:03,396][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 17:21:06,146][15401] Updated weights for policy 0, policy_version 30340 (0.0029) [2024-06-21 17:21:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.6, 300 sec: 41709.8). Total num frames: 497172480. Throughput: 0: 41420.9. Samples: 497248000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:21:08,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 17:21:10,402][15401] Updated weights for policy 0, policy_version 30350 (0.0035) [2024-06-21 17:21:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.3, 300 sec: 41654.2). Total num frames: 497369088. Throughput: 0: 41613.3. Samples: 497502560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-21 17:21:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 17:21:14,135][15401] Updated weights for policy 0, policy_version 30360 (0.0035) [2024-06-21 17:21:18,129][15401] Updated weights for policy 0, policy_version 30370 (0.0031) [2024-06-21 17:21:18,390][15132] Fps is (10 sec: 40969.9, 60 sec: 40960.0, 300 sec: 41654.6). Total num frames: 497582080. Throughput: 0: 41442.2. Samples: 497748320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-21 17:21:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 17:21:21,958][15401] Updated weights for policy 0, policy_version 30380 (0.0035) [2024-06-21 17:21:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 41765.0). Total num frames: 497795072. Throughput: 0: 41460.5. Samples: 497874820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-21 17:21:23,392][15132] Avg episode reward: [(0, '0.215')] [2024-06-21 17:21:26,062][15401] Updated weights for policy 0, policy_version 30390 (0.0029) [2024-06-21 17:21:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 498008064. Throughput: 0: 41650.7. Samples: 498126440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-21 17:21:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 17:21:29,877][15401] Updated weights for policy 0, policy_version 30400 (0.0050) [2024-06-21 17:21:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 498204672. Throughput: 0: 41412.5. Samples: 498370860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 17:21:33,395][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:21:34,067][15401] Updated weights for policy 0, policy_version 30410 (0.0041) [2024-06-21 17:21:37,719][15401] Updated weights for policy 0, policy_version 30420 (0.0031) [2024-06-21 17:21:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 498417664. Throughput: 0: 41519.6. Samples: 498496560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 17:21:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 17:21:41,730][15401] Updated weights for policy 0, policy_version 30430 (0.0029) [2024-06-21 17:21:43,396][15132] Fps is (10 sec: 40934.1, 60 sec: 40955.7, 300 sec: 41653.3). Total num frames: 498614272. Throughput: 0: 41568.8. Samples: 498748820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 17:21:43,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 17:21:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030434_498630656.pth... [2024-06-21 17:21:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000029824_488636416.pth [2024-06-21 17:21:45,529][15401] Updated weights for policy 0, policy_version 30440 (0.0034) [2024-06-21 17:21:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 498843648. Throughput: 0: 41578.2. Samples: 498997840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-21 17:21:48,399][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 17:21:49,412][15401] Updated weights for policy 0, policy_version 30450 (0.0037) [2024-06-21 17:21:51,962][15349] Signal inference workers to stop experience collection... (7250 times) [2024-06-21 17:21:52,018][15401] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-21 17:21:52,024][15349] Signal inference workers to resume experience collection... (7250 times) [2024-06-21 17:21:52,041][15401] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-21 17:21:53,227][15401] Updated weights for policy 0, policy_version 30460 (0.0035) [2024-06-21 17:21:53,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42052.4, 300 sec: 41654.3). Total num frames: 499056640. Throughput: 0: 41743.6. Samples: 499126360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:21:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:21:57,266][15401] Updated weights for policy 0, policy_version 30470 (0.0033) [2024-06-21 17:21:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 499236864. Throughput: 0: 41660.9. Samples: 499377300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:21:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:22:01,125][15401] Updated weights for policy 0, policy_version 30480 (0.0034) [2024-06-21 17:22:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 499466240. Throughput: 0: 41793.7. Samples: 499629040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:22:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:22:04,951][15401] Updated weights for policy 0, policy_version 30490 (0.0036) [2024-06-21 17:22:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 499679232. Throughput: 0: 41837.4. Samples: 499757400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:22:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 17:22:08,710][15401] Updated weights for policy 0, policy_version 30500 (0.0036) [2024-06-21 17:22:12,993][15401] Updated weights for policy 0, policy_version 30510 (0.0032) [2024-06-21 17:22:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 499892224. Throughput: 0: 41705.4. Samples: 500003180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 17:22:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 17:22:17,054][15401] Updated weights for policy 0, policy_version 30520 (0.0034) [2024-06-21 17:22:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 500105216. Throughput: 0: 41870.7. Samples: 500255040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 17:22:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 17:22:20,511][15401] Updated weights for policy 0, policy_version 30530 (0.0027) [2024-06-21 17:22:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41507.8, 300 sec: 41654.6). Total num frames: 500285440. Throughput: 0: 41913.4. Samples: 500382660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 17:22:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 17:22:24,846][15401] Updated weights for policy 0, policy_version 30540 (0.0036) [2024-06-21 17:22:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41709.7). Total num frames: 500514816. Throughput: 0: 41789.4. Samples: 500629080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 17:22:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 17:22:28,612][15401] Updated weights for policy 0, policy_version 30550 (0.0035) [2024-06-21 17:22:33,043][15401] Updated weights for policy 0, policy_version 30560 (0.0035) [2024-06-21 17:22:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 500711424. Throughput: 0: 42050.2. Samples: 500890100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 17:22:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 17:22:36,686][15401] Updated weights for policy 0, policy_version 30570 (0.0038) [2024-06-21 17:22:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41506.2, 300 sec: 41599.0). Total num frames: 500908032. Throughput: 0: 41802.7. Samples: 501007480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 17:22:38,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-21 17:22:40,651][15401] Updated weights for policy 0, policy_version 30580 (0.0036) [2024-06-21 17:22:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42603.0, 300 sec: 41765.3). Total num frames: 501170176. Throughput: 0: 41774.7. Samples: 501257160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 17:22:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-21 17:22:44,576][15401] Updated weights for policy 0, policy_version 30590 (0.0042) [2024-06-21 17:22:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 501334016. Throughput: 0: 41918.8. Samples: 501515380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 17:22:48,390][15132] Avg episode reward: [(0, '0.071')] [2024-06-21 17:22:48,516][15401] Updated weights for policy 0, policy_version 30600 (0.0029) [2024-06-21 17:22:52,273][15401] Updated weights for policy 0, policy_version 30610 (0.0047) [2024-06-21 17:22:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 501563392. Throughput: 0: 41600.8. Samples: 501629440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:22:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 17:22:56,117][15401] Updated weights for policy 0, policy_version 30620 (0.0033) [2024-06-21 17:22:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 41820.9). Total num frames: 501809152. Throughput: 0: 41895.5. Samples: 501888480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:22:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 17:23:00,034][15401] Updated weights for policy 0, policy_version 30630 (0.0035) [2024-06-21 17:23:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 501940224. Throughput: 0: 42055.5. Samples: 502147540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:23:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 17:23:03,876][15401] Updated weights for policy 0, policy_version 30640 (0.0049) [2024-06-21 17:23:07,741][15401] Updated weights for policy 0, policy_version 30650 (0.0025) [2024-06-21 17:23:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 502185984. Throughput: 0: 41706.6. Samples: 502259460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:23:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 17:23:11,932][15401] Updated weights for policy 0, policy_version 30660 (0.0039) [2024-06-21 17:23:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 502415360. Throughput: 0: 41963.2. Samples: 502517420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:23:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 17:23:15,413][15401] Updated weights for policy 0, policy_version 30670 (0.0035) [2024-06-21 17:23:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 502579200. Throughput: 0: 41847.1. Samples: 502773220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:23:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 17:23:19,956][15401] Updated weights for policy 0, policy_version 30680 (0.0038) [2024-06-21 17:23:23,157][15401] Updated weights for policy 0, policy_version 30690 (0.0025) [2024-06-21 17:23:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 502824960. Throughput: 0: 41737.2. Samples: 502885660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:23:23,400][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 17:23:26,629][15349] Signal inference workers to stop experience collection... (7300 times) [2024-06-21 17:23:26,684][15401] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-21 17:23:26,687][15349] Signal inference workers to resume experience collection... (7300 times) [2024-06-21 17:23:26,701][15401] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-21 17:23:27,690][15401] Updated weights for policy 0, policy_version 30700 (0.0047) [2024-06-21 17:23:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.4, 300 sec: 41710.1). Total num frames: 503021568. Throughput: 0: 41962.7. Samples: 503145480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 17:23:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 17:23:30,909][15401] Updated weights for policy 0, policy_version 30710 (0.0030) [2024-06-21 17:23:33,391][15132] Fps is (10 sec: 39314.6, 60 sec: 41777.9, 300 sec: 41709.5). Total num frames: 503218176. Throughput: 0: 41913.3. Samples: 503401560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 17:23:33,400][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 17:23:35,439][15401] Updated weights for policy 0, policy_version 30720 (0.0034) [2024-06-21 17:23:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 41820.5). Total num frames: 503463936. Throughput: 0: 42048.5. Samples: 503521720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 17:23:38,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 17:23:38,703][15401] Updated weights for policy 0, policy_version 30730 (0.0030) [2024-06-21 17:23:43,135][15401] Updated weights for policy 0, policy_version 30740 (0.0034) [2024-06-21 17:23:43,389][15132] Fps is (10 sec: 44245.3, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 503660544. Throughput: 0: 41989.4. Samples: 503778000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 17:23:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 17:23:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030742_503676928.pth... [2024-06-21 17:23:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030128_493617152.pth [2024-06-21 17:23:46,649][15401] Updated weights for policy 0, policy_version 30750 (0.0043) [2024-06-21 17:23:48,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 503857152. Throughput: 0: 41654.7. Samples: 504022000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 17:23:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 17:23:50,998][15401] Updated weights for policy 0, policy_version 30760 (0.0038) [2024-06-21 17:23:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 504086528. Throughput: 0: 41982.4. Samples: 504148660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 17:23:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 17:23:54,330][15401] Updated weights for policy 0, policy_version 30770 (0.0037) [2024-06-21 17:23:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41765.3). Total num frames: 504266752. Throughput: 0: 41928.2. Samples: 504404180. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 17:23:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 17:23:58,662][15401] Updated weights for policy 0, policy_version 30780 (0.0041) [2024-06-21 17:24:02,341][15401] Updated weights for policy 0, policy_version 30790 (0.0033) [2024-06-21 17:24:03,396][15132] Fps is (10 sec: 40933.1, 60 sec: 42593.9, 300 sec: 41764.4). Total num frames: 504496128. Throughput: 0: 41674.5. Samples: 504648840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 17:24:03,396][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 17:24:06,599][15401] Updated weights for policy 0, policy_version 30800 (0.0034) [2024-06-21 17:24:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 504692736. Throughput: 0: 41992.9. Samples: 504775340. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 17:24:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 17:24:10,443][15401] Updated weights for policy 0, policy_version 30810 (0.0028) [2024-06-21 17:24:13,389][15132] Fps is (10 sec: 37707.9, 60 sec: 40960.2, 300 sec: 41709.8). Total num frames: 504872960. Throughput: 0: 41830.7. Samples: 505027860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-21 17:24:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 17:24:14,371][15401] Updated weights for policy 0, policy_version 30820 (0.0028) [2024-06-21 17:24:18,313][15401] Updated weights for policy 0, policy_version 30830 (0.0038) [2024-06-21 17:24:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 505118720. Throughput: 0: 41625.3. Samples: 505274620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 17:24:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 17:24:22,222][15401] Updated weights for policy 0, policy_version 30840 (0.0024) [2024-06-21 17:24:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 505315328. Throughput: 0: 41712.9. Samples: 505398700. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 17:24:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 17:24:26,309][15401] Updated weights for policy 0, policy_version 30850 (0.0042) [2024-06-21 17:24:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 505528320. Throughput: 0: 41631.5. Samples: 505651420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 17:24:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-21 17:24:29,856][15401] Updated weights for policy 0, policy_version 30860 (0.0037) [2024-06-21 17:24:31,766][15349] Signal inference workers to stop experience collection... (7350 times) [2024-06-21 17:24:31,766][15349] Signal inference workers to resume experience collection... (7350 times) [2024-06-21 17:24:31,780][15401] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-21 17:24:31,781][15401] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-21 17:24:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42325.0, 300 sec: 41821.4). Total num frames: 505757696. Throughput: 0: 41816.4. Samples: 505903840. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-21 17:24:33,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-21 17:24:34,097][15401] Updated weights for policy 0, policy_version 30870 (0.0038) [2024-06-21 17:24:37,595][15401] Updated weights for policy 0, policy_version 30880 (0.0034) [2024-06-21 17:24:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41507.8, 300 sec: 41876.4). Total num frames: 505954304. Throughput: 0: 41935.4. Samples: 506035760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:24:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 17:24:41,805][15401] Updated weights for policy 0, policy_version 30890 (0.0041) [2024-06-21 17:24:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 506150912. Throughput: 0: 41751.4. Samples: 506283000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:24:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 17:24:45,790][15401] Updated weights for policy 0, policy_version 30900 (0.0035) [2024-06-21 17:24:48,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42052.0, 300 sec: 41820.8). Total num frames: 506380288. Throughput: 0: 41737.2. Samples: 506526760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:24:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 17:24:50,278][15401] Updated weights for policy 0, policy_version 30910 (0.0037) [2024-06-21 17:24:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.0, 300 sec: 41821.8). Total num frames: 506576896. Throughput: 0: 41799.6. Samples: 506656320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:24:53,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 17:24:53,534][15401] Updated weights for policy 0, policy_version 30920 (0.0030) [2024-06-21 17:24:58,331][15401] Updated weights for policy 0, policy_version 30930 (0.0032) [2024-06-21 17:24:58,390][15132] Fps is (10 sec: 37684.3, 60 sec: 41506.0, 300 sec: 41544.0). Total num frames: 506757120. Throughput: 0: 41746.5. Samples: 506906460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:24:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 17:25:01,285][15401] Updated weights for policy 0, policy_version 30940 (0.0050) [2024-06-21 17:25:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 41782.0, 300 sec: 41876.1). Total num frames: 507002880. Throughput: 0: 41756.0. Samples: 507153740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:25:03,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 17:25:06,027][15401] Updated weights for policy 0, policy_version 30950 (0.0035) [2024-06-21 17:25:08,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42050.6, 300 sec: 41820.5). Total num frames: 507215872. Throughput: 0: 41934.1. Samples: 507285840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:25:08,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 17:25:09,188][15401] Updated weights for policy 0, policy_version 30960 (0.0031) [2024-06-21 17:25:13,390][15132] Fps is (10 sec: 39330.2, 60 sec: 42052.0, 300 sec: 41598.7). Total num frames: 507396096. Throughput: 0: 41847.4. Samples: 507534560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:25:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 17:25:13,789][15401] Updated weights for policy 0, policy_version 30970 (0.0034) [2024-06-21 17:25:17,083][15401] Updated weights for policy 0, policy_version 30980 (0.0028) [2024-06-21 17:25:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 507625472. Throughput: 0: 41513.4. Samples: 507771840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 17:25:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 17:25:21,512][15401] Updated weights for policy 0, policy_version 30990 (0.0042) [2024-06-21 17:25:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.0, 300 sec: 41709.7). Total num frames: 507805696. Throughput: 0: 41503.5. Samples: 507903420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 17:25:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 17:25:25,010][15401] Updated weights for policy 0, policy_version 31000 (0.0034) [2024-06-21 17:25:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 508018688. Throughput: 0: 41325.0. Samples: 508142620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 17:25:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 17:25:29,315][15401] Updated weights for policy 0, policy_version 31010 (0.0031) [2024-06-21 17:25:32,819][15401] Updated weights for policy 0, policy_version 31020 (0.0034) [2024-06-21 17:25:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41507.8, 300 sec: 41820.9). Total num frames: 508248064. Throughput: 0: 41519.4. Samples: 508395120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 17:25:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 17:25:37,165][15401] Updated weights for policy 0, policy_version 31030 (0.0042) [2024-06-21 17:25:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 508428288. Throughput: 0: 41551.2. Samples: 508526120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 17:25:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 17:25:40,835][15401] Updated weights for policy 0, policy_version 31040 (0.0046) [2024-06-21 17:25:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 508641280. Throughput: 0: 41421.3. Samples: 508770420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 17:25:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-21 17:25:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031046_508657664.pth... [2024-06-21 17:25:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030434_498630656.pth [2024-06-21 17:25:45,182][15401] Updated weights for policy 0, policy_version 31050 (0.0040) [2024-06-21 17:25:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.3, 300 sec: 41765.3). Total num frames: 508854272. Throughput: 0: 41521.3. Samples: 509022100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 17:25:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 17:25:48,741][15401] Updated weights for policy 0, policy_version 31060 (0.0045) [2024-06-21 17:25:53,143][15401] Updated weights for policy 0, policy_version 31070 (0.0037) [2024-06-21 17:25:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 509050880. Throughput: 0: 41293.2. Samples: 509143940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 17:25:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 17:25:56,552][15349] Signal inference workers to stop experience collection... (7400 times) [2024-06-21 17:25:56,611][15401] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-21 17:25:56,611][15349] Signal inference workers to resume experience collection... (7400 times) [2024-06-21 17:25:56,626][15401] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-21 17:25:56,775][15401] Updated weights for policy 0, policy_version 31080 (0.0035) [2024-06-21 17:25:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 509280256. Throughput: 0: 41202.7. Samples: 509388680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:25:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 17:26:01,148][15401] Updated weights for policy 0, policy_version 31090 (0.0047) [2024-06-21 17:26:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 40961.6, 300 sec: 41654.6). Total num frames: 509460480. Throughput: 0: 41530.5. Samples: 509640720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:26:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 17:26:04,763][15401] Updated weights for policy 0, policy_version 31100 (0.0034) [2024-06-21 17:26:08,389][15132] Fps is (10 sec: 39322.4, 60 sec: 40961.7, 300 sec: 41709.8). Total num frames: 509673472. Throughput: 0: 41295.7. Samples: 509761720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:26:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 17:26:09,166][15401] Updated weights for policy 0, policy_version 31110 (0.0049) [2024-06-21 17:26:12,562][15401] Updated weights for policy 0, policy_version 31120 (0.0042) [2024-06-21 17:26:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.6, 300 sec: 41765.0). Total num frames: 509902848. Throughput: 0: 41480.4. Samples: 510009340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 17:26:13,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 17:26:17,114][15401] Updated weights for policy 0, policy_version 31130 (0.0035) [2024-06-21 17:26:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41710.1). Total num frames: 510099456. Throughput: 0: 41512.5. Samples: 510263180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 17:26:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 17:26:20,593][15401] Updated weights for policy 0, policy_version 31140 (0.0046) [2024-06-21 17:26:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 510296064. Throughput: 0: 41245.8. Samples: 510382180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 17:26:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 17:26:25,015][15401] Updated weights for policy 0, policy_version 31150 (0.0045) [2024-06-21 17:26:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 510525440. Throughput: 0: 41333.0. Samples: 510630400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 17:26:28,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-21 17:26:28,399][15401] Updated weights for policy 0, policy_version 31160 (0.0041) [2024-06-21 17:26:33,002][15401] Updated weights for policy 0, policy_version 31170 (0.0040) [2024-06-21 17:26:33,392][15132] Fps is (10 sec: 40950.6, 60 sec: 40958.4, 300 sec: 41653.9). Total num frames: 510705664. Throughput: 0: 41405.0. Samples: 510885420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 17:26:33,392][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 17:26:36,282][15401] Updated weights for policy 0, policy_version 31180 (0.0045) [2024-06-21 17:26:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41766.2). Total num frames: 510935040. Throughput: 0: 41398.0. Samples: 511006840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 17:26:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 17:26:40,746][15401] Updated weights for policy 0, policy_version 31190 (0.0030) [2024-06-21 17:26:43,390][15132] Fps is (10 sec: 42607.3, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 511131648. Throughput: 0: 41626.6. Samples: 511261880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 17:26:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 17:26:43,994][15401] Updated weights for policy 0, policy_version 31200 (0.0037) [2024-06-21 17:26:48,390][15132] Fps is (10 sec: 37682.5, 60 sec: 40959.9, 300 sec: 41543.1). Total num frames: 511311872. Throughput: 0: 41736.9. Samples: 511518880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 17:26:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 17:26:48,569][15401] Updated weights for policy 0, policy_version 31210 (0.0049) [2024-06-21 17:26:51,740][15401] Updated weights for policy 0, policy_version 31220 (0.0030) [2024-06-21 17:26:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 511557632. Throughput: 0: 41678.9. Samples: 511637280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 17:26:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 17:26:56,270][15401] Updated weights for policy 0, policy_version 31230 (0.0037) [2024-06-21 17:26:58,389][15132] Fps is (10 sec: 47514.0, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 511787008. Throughput: 0: 41790.3. Samples: 511889800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:26:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 17:26:59,526][15401] Updated weights for policy 0, policy_version 31240 (0.0031) [2024-06-21 17:27:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 511967232. Throughput: 0: 41669.8. Samples: 512138320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:27:03,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-21 17:27:04,131][15401] Updated weights for policy 0, policy_version 31250 (0.0037) [2024-06-21 17:27:07,366][15401] Updated weights for policy 0, policy_version 31260 (0.0049) [2024-06-21 17:27:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 512180224. Throughput: 0: 41843.6. Samples: 512265140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:27:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 17:27:11,930][15401] Updated weights for policy 0, policy_version 31270 (0.0041) [2024-06-21 17:27:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 512409600. Throughput: 0: 41858.1. Samples: 512514020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:27:13,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-21 17:27:15,232][15401] Updated weights for policy 0, policy_version 31280 (0.0042) [2024-06-21 17:27:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 512589824. Throughput: 0: 41717.6. Samples: 512762620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 17:27:18,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 17:27:19,725][15401] Updated weights for policy 0, policy_version 31290 (0.0046) [2024-06-21 17:27:23,155][15401] Updated weights for policy 0, policy_version 31300 (0.0041) [2024-06-21 17:27:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.5, 300 sec: 41709.5). Total num frames: 512819200. Throughput: 0: 41673.2. Samples: 512882240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 17:27:23,393][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 17:27:27,461][15401] Updated weights for policy 0, policy_version 31310 (0.0036) [2024-06-21 17:27:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 41777.5, 300 sec: 41765.0). Total num frames: 513032192. Throughput: 0: 41803.3. Samples: 513143120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 17:27:28,392][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 17:27:31,205][15401] Updated weights for policy 0, policy_version 31320 (0.0039) [2024-06-21 17:27:33,155][15349] Signal inference workers to stop experience collection... (7450 times) [2024-06-21 17:27:33,156][15349] Signal inference workers to resume experience collection... (7450 times) [2024-06-21 17:27:33,169][15401] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-21 17:27:33,195][15401] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-21 17:27:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42053.9, 300 sec: 41765.3). Total num frames: 513228800. Throughput: 0: 41525.0. Samples: 513387500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 17:27:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 17:27:35,558][15401] Updated weights for policy 0, policy_version 31330 (0.0025) [2024-06-21 17:27:38,392][15132] Fps is (10 sec: 40960.2, 60 sec: 41777.5, 300 sec: 41598.4). Total num frames: 513441792. Throughput: 0: 41666.9. Samples: 513512380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:27:38,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 17:27:39,403][15401] Updated weights for policy 0, policy_version 31340 (0.0029) [2024-06-21 17:27:43,367][15401] Updated weights for policy 0, policy_version 31350 (0.0035) [2024-06-21 17:27:43,390][15132] Fps is (10 sec: 40958.8, 60 sec: 41779.2, 300 sec: 41709.7). Total num frames: 513638400. Throughput: 0: 41713.5. Samples: 513766920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:27:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 17:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031350_513638400.pth... [2024-06-21 17:27:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000030742_503676928.pth [2024-06-21 17:27:47,197][15401] Updated weights for policy 0, policy_version 31360 (0.0048) [2024-06-21 17:27:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 513867776. Throughput: 0: 41562.6. Samples: 514008640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:27:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 17:27:51,007][15401] Updated weights for policy 0, policy_version 31370 (0.0041) [2024-06-21 17:27:53,389][15132] Fps is (10 sec: 42599.7, 60 sec: 41779.4, 300 sec: 41543.2). Total num frames: 514064384. Throughput: 0: 41666.3. Samples: 514140120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:27:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 17:27:55,128][15401] Updated weights for policy 0, policy_version 31380 (0.0048) [2024-06-21 17:27:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 514260992. Throughput: 0: 41714.7. Samples: 514391180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:27:58,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 17:27:59,183][15401] Updated weights for policy 0, policy_version 31390 (0.0037) [2024-06-21 17:28:03,009][15401] Updated weights for policy 0, policy_version 31400 (0.0040) [2024-06-21 17:28:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 514473984. Throughput: 0: 41724.0. Samples: 514640200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:28:03,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-21 17:28:06,851][15401] Updated weights for policy 0, policy_version 31410 (0.0030) [2024-06-21 17:28:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 514686976. Throughput: 0: 41872.5. Samples: 514766400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:28:08,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-21 17:28:10,671][15401] Updated weights for policy 0, policy_version 31420 (0.0031) [2024-06-21 17:28:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 514883584. Throughput: 0: 41566.3. Samples: 515013500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:28:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 17:28:14,547][15401] Updated weights for policy 0, policy_version 31430 (0.0033) [2024-06-21 17:28:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 515096576. Throughput: 0: 41801.8. Samples: 515268580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:28:18,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 17:28:18,534][15401] Updated weights for policy 0, policy_version 31440 (0.0032) [2024-06-21 17:28:22,230][15401] Updated weights for policy 0, policy_version 31450 (0.0043) [2024-06-21 17:28:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41234.7, 300 sec: 41598.7). Total num frames: 515293184. Throughput: 0: 41805.2. Samples: 515393520. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-21 17:28:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 17:28:26,298][15401] Updated weights for policy 0, policy_version 31460 (0.0038) [2024-06-21 17:28:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41507.8, 300 sec: 41710.0). Total num frames: 515522560. Throughput: 0: 41680.7. Samples: 515642540. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-21 17:28:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 17:28:29,999][15401] Updated weights for policy 0, policy_version 31470 (0.0040) [2024-06-21 17:28:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.1, 300 sec: 41599.0). Total num frames: 515735552. Throughput: 0: 41890.7. Samples: 515893720. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-21 17:28:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 17:28:34,073][15401] Updated weights for policy 0, policy_version 31480 (0.0042) [2024-06-21 17:28:37,826][15401] Updated weights for policy 0, policy_version 31490 (0.0037) [2024-06-21 17:28:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41507.7, 300 sec: 41598.7). Total num frames: 515932160. Throughput: 0: 41844.7. Samples: 516023140. Policy #0 lag: (min: 2.0, avg: 11.9, max: 23.0) [2024-06-21 17:28:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 17:28:41,642][15401] Updated weights for policy 0, policy_version 31500 (0.0046) [2024-06-21 17:28:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 516145152. Throughput: 0: 41804.8. Samples: 516272400. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-21 17:28:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 17:28:45,479][15401] Updated weights for policy 0, policy_version 31510 (0.0039) [2024-06-21 17:28:48,396][15132] Fps is (10 sec: 44209.1, 60 sec: 41774.8, 300 sec: 41653.3). Total num frames: 516374528. Throughput: 0: 41953.2. Samples: 516528360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-21 17:28:48,396][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 17:28:49,319][15401] Updated weights for policy 0, policy_version 31520 (0.0035) [2024-06-21 17:28:53,250][15401] Updated weights for policy 0, policy_version 31530 (0.0041) [2024-06-21 17:28:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 516587520. Throughput: 0: 41993.2. Samples: 516656100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-21 17:28:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 17:28:56,992][15401] Updated weights for policy 0, policy_version 31540 (0.0031) [2024-06-21 17:28:58,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42052.3, 300 sec: 41655.1). Total num frames: 516784128. Throughput: 0: 41940.8. Samples: 516900840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-21 17:28:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 17:29:01,355][15401] Updated weights for policy 0, policy_version 31550 (0.0027) [2024-06-21 17:29:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 516997120. Throughput: 0: 42089.6. Samples: 517162620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 17:29:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 17:29:04,864][15401] Updated weights for policy 0, policy_version 31560 (0.0032) [2024-06-21 17:29:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 517193728. Throughput: 0: 42010.7. Samples: 517284000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 17:29:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 17:29:09,115][15401] Updated weights for policy 0, policy_version 31570 (0.0047) [2024-06-21 17:29:12,756][15401] Updated weights for policy 0, policy_version 31580 (0.0043) [2024-06-21 17:29:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 517423104. Throughput: 0: 42137.4. Samples: 517538720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 17:29:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:29:16,828][15401] Updated weights for policy 0, policy_version 31590 (0.0035) [2024-06-21 17:29:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 517619712. Throughput: 0: 42069.4. Samples: 517786840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 17:29:18,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-21 17:29:20,828][15401] Updated weights for policy 0, policy_version 31600 (0.0032) [2024-06-21 17:29:23,394][15132] Fps is (10 sec: 40943.1, 60 sec: 42322.5, 300 sec: 41709.2). Total num frames: 517832704. Throughput: 0: 42035.0. Samples: 517914880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:29:23,394][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:29:24,979][15401] Updated weights for policy 0, policy_version 31610 (0.0034) [2024-06-21 17:29:27,746][15349] Signal inference workers to stop experience collection... (7500 times) [2024-06-21 17:29:27,750][15349] Signal inference workers to resume experience collection... (7500 times) [2024-06-21 17:29:27,763][15401] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-21 17:29:27,764][15401] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-21 17:29:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41599.0). Total num frames: 518029312. Throughput: 0: 42142.3. Samples: 518168800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:29:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 17:29:28,695][15401] Updated weights for policy 0, policy_version 31620 (0.0032) [2024-06-21 17:29:33,018][15401] Updated weights for policy 0, policy_version 31630 (0.0035) [2024-06-21 17:29:33,390][15132] Fps is (10 sec: 40973.6, 60 sec: 41778.7, 300 sec: 41654.1). Total num frames: 518242304. Throughput: 0: 42028.9. Samples: 518419420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:29:33,391][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 17:29:36,567][15401] Updated weights for policy 0, policy_version 31640 (0.0035) [2024-06-21 17:29:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 518471680. Throughput: 0: 41937.4. Samples: 518543280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 17:29:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 17:29:40,703][15401] Updated weights for policy 0, policy_version 31650 (0.0037) [2024-06-21 17:29:43,390][15132] Fps is (10 sec: 40962.0, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 518651904. Throughput: 0: 42024.7. Samples: 518791960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-21 17:29:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 17:29:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031656_518651904.pth... [2024-06-21 17:29:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031046_508657664.pth [2024-06-21 17:29:44,579][15401] Updated weights for policy 0, policy_version 31660 (0.0042) [2024-06-21 17:29:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41510.5, 300 sec: 41654.2). Total num frames: 518864896. Throughput: 0: 41728.0. Samples: 519040380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-21 17:29:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 17:29:48,510][15401] Updated weights for policy 0, policy_version 31670 (0.0034) [2024-06-21 17:29:52,410][15401] Updated weights for policy 0, policy_version 31680 (0.0037) [2024-06-21 17:29:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 519094272. Throughput: 0: 41853.8. Samples: 519167420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-21 17:29:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 17:29:56,166][15401] Updated weights for policy 0, policy_version 31690 (0.0028) [2024-06-21 17:29:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.4, 300 sec: 41654.2). Total num frames: 519290880. Throughput: 0: 41731.4. Samples: 519416740. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-21 17:29:58,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 17:30:00,219][15401] Updated weights for policy 0, policy_version 31700 (0.0027) [2024-06-21 17:30:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.3, 300 sec: 41599.0). Total num frames: 519487488. Throughput: 0: 41751.6. Samples: 519665660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:30:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 17:30:03,867][15401] Updated weights for policy 0, policy_version 31710 (0.0033) [2024-06-21 17:30:08,043][15401] Updated weights for policy 0, policy_version 31720 (0.0031) [2024-06-21 17:30:08,390][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 519700480. Throughput: 0: 41633.5. Samples: 519788220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:30:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 17:30:11,714][15401] Updated weights for policy 0, policy_version 31730 (0.0030) [2024-06-21 17:30:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 519946240. Throughput: 0: 41586.2. Samples: 520040180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:30:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 17:30:15,905][15401] Updated weights for policy 0, policy_version 31740 (0.0043) [2024-06-21 17:30:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 520110080. Throughput: 0: 41658.5. Samples: 520294020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:30:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 17:30:19,490][15401] Updated weights for policy 0, policy_version 31750 (0.0034) [2024-06-21 17:30:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41508.9, 300 sec: 41709.8). Total num frames: 520323072. Throughput: 0: 41572.1. Samples: 520414020. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 17:30:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 17:30:24,277][15401] Updated weights for policy 0, policy_version 31760 (0.0035) [2024-06-21 17:30:27,495][15401] Updated weights for policy 0, policy_version 31770 (0.0033) [2024-06-21 17:30:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 520568832. Throughput: 0: 41573.6. Samples: 520662760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 17:30:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 17:30:32,007][15401] Updated weights for policy 0, policy_version 31780 (0.0034) [2024-06-21 17:30:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.6, 300 sec: 41709.8). Total num frames: 520732672. Throughput: 0: 41744.6. Samples: 520918880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 17:30:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 17:30:35,525][15401] Updated weights for policy 0, policy_version 31790 (0.0035) [2024-06-21 17:30:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 520962048. Throughput: 0: 41610.7. Samples: 521039900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 17:30:38,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 17:30:39,842][15401] Updated weights for policy 0, policy_version 31800 (0.0022) [2024-06-21 17:30:43,046][15401] Updated weights for policy 0, policy_version 31810 (0.0036) [2024-06-21 17:30:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 521191424. Throughput: 0: 41618.4. Samples: 521289460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 17:30:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-21 17:30:47,807][15401] Updated weights for policy 0, policy_version 31820 (0.0029) [2024-06-21 17:30:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 521355264. Throughput: 0: 41754.5. Samples: 521544620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 17:30:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 17:30:50,907][15401] Updated weights for policy 0, policy_version 31830 (0.0025) [2024-06-21 17:30:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 521584640. Throughput: 0: 41671.2. Samples: 521663420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 17:30:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 17:30:55,632][15401] Updated weights for policy 0, policy_version 31840 (0.0030) [2024-06-21 17:30:58,377][15349] Signal inference workers to stop experience collection... (7550 times) [2024-06-21 17:30:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41507.9, 300 sec: 41765.3). Total num frames: 521781248. Throughput: 0: 41730.2. Samples: 521918040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 17:30:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 17:30:58,432][15401] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-21 17:30:58,493][15349] Signal inference workers to resume experience collection... (7550 times) [2024-06-21 17:30:58,494][15401] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-21 17:30:58,818][15401] Updated weights for policy 0, policy_version 31850 (0.0033) [2024-06-21 17:31:03,229][15401] Updated weights for policy 0, policy_version 31860 (0.0039) [2024-06-21 17:31:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 521994240. Throughput: 0: 41710.6. Samples: 522171000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 17:31:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 17:31:06,511][15401] Updated weights for policy 0, policy_version 31870 (0.0031) [2024-06-21 17:31:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41765.6). Total num frames: 522223616. Throughput: 0: 41850.2. Samples: 522297280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:31:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 17:31:11,604][15401] Updated weights for policy 0, policy_version 31880 (0.0033) [2024-06-21 17:31:13,390][15132] Fps is (10 sec: 39320.8, 60 sec: 40686.8, 300 sec: 41654.2). Total num frames: 522387456. Throughput: 0: 41867.8. Samples: 522546820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:31:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 17:31:14,406][15401] Updated weights for policy 0, policy_version 31890 (0.0032) [2024-06-21 17:31:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 522616832. Throughput: 0: 41659.1. Samples: 522793540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:31:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 17:31:19,243][15401] Updated weights for policy 0, policy_version 31900 (0.0044) [2024-06-21 17:31:22,143][15401] Updated weights for policy 0, policy_version 31910 (0.0047) [2024-06-21 17:31:23,390][15132] Fps is (10 sec: 47514.3, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 522862592. Throughput: 0: 41774.7. Samples: 522919760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:31:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 17:31:26,876][15401] Updated weights for policy 0, policy_version 31920 (0.0045) [2024-06-21 17:31:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41765.6). Total num frames: 523026432. Throughput: 0: 41732.9. Samples: 523167440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:31:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 17:31:30,071][15401] Updated weights for policy 0, policy_version 31930 (0.0043) [2024-06-21 17:31:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 523239424. Throughput: 0: 41509.9. Samples: 523412560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:31:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 17:31:34,474][15401] Updated weights for policy 0, policy_version 31940 (0.0031) [2024-06-21 17:31:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 523452416. Throughput: 0: 41632.3. Samples: 523536880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:31:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 17:31:38,394][15401] Updated weights for policy 0, policy_version 31950 (0.0039) [2024-06-21 17:31:42,197][15401] Updated weights for policy 0, policy_version 31960 (0.0046) [2024-06-21 17:31:43,394][15132] Fps is (10 sec: 40939.7, 60 sec: 40956.6, 300 sec: 41820.2). Total num frames: 523649024. Throughput: 0: 41353.2. Samples: 523779140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:31:43,395][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 17:31:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031961_523649024.pth... [2024-06-21 17:31:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031350_513638400.pth [2024-06-21 17:31:46,466][15401] Updated weights for policy 0, policy_version 31970 (0.0038) [2024-06-21 17:31:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 523862016. Throughput: 0: 41296.0. Samples: 524029320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 17:31:48,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-21 17:31:50,484][15401] Updated weights for policy 0, policy_version 31980 (0.0037) [2024-06-21 17:31:53,390][15132] Fps is (10 sec: 44258.6, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 524091392. Throughput: 0: 41317.8. Samples: 524156580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 17:31:53,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 17:31:54,830][15401] Updated weights for policy 0, policy_version 31990 (0.0035) [2024-06-21 17:31:58,129][15401] Updated weights for policy 0, policy_version 32000 (0.0038) [2024-06-21 17:31:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41777.5, 300 sec: 41765.0). Total num frames: 524288000. Throughput: 0: 41259.7. Samples: 524403600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 17:31:58,393][15132] Avg episode reward: [(0, '0.328')] [2024-06-21 17:32:02,654][15401] Updated weights for policy 0, policy_version 32010 (0.0031) [2024-06-21 17:32:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 524500992. Throughput: 0: 41331.6. Samples: 524653460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 17:32:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 17:32:05,779][15401] Updated weights for policy 0, policy_version 32020 (0.0040) [2024-06-21 17:32:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 524697600. Throughput: 0: 41420.1. Samples: 524783660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 17:32:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:32:10,257][15401] Updated weights for policy 0, policy_version 32030 (0.0034) [2024-06-21 17:32:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 524910592. Throughput: 0: 41398.2. Samples: 525030360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 17:32:13,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-21 17:32:13,742][15401] Updated weights for policy 0, policy_version 32040 (0.0032) [2024-06-21 17:32:18,041][15401] Updated weights for policy 0, policy_version 32050 (0.0040) [2024-06-21 17:32:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 525107200. Throughput: 0: 41567.5. Samples: 525283100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 17:32:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 17:32:21,718][15401] Updated weights for policy 0, policy_version 32060 (0.0029) [2024-06-21 17:32:23,377][15349] Signal inference workers to stop experience collection... (7600 times) [2024-06-21 17:32:23,377][15349] Signal inference workers to resume experience collection... (7600 times) [2024-06-21 17:32:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41654.6). Total num frames: 525320192. Throughput: 0: 41588.5. Samples: 525408360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 17:32:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 17:32:23,420][15401] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-21 17:32:23,424][15401] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-21 17:32:26,382][15401] Updated weights for policy 0, policy_version 32070 (0.0034) [2024-06-21 17:32:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 525549568. Throughput: 0: 41684.2. Samples: 525654720. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 17:32:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 17:32:29,688][15401] Updated weights for policy 0, policy_version 32080 (0.0036) [2024-06-21 17:32:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 525729792. Throughput: 0: 41869.7. Samples: 525913460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 17:32:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 17:32:34,273][15401] Updated weights for policy 0, policy_version 32090 (0.0035) [2024-06-21 17:32:37,414][15401] Updated weights for policy 0, policy_version 32100 (0.0032) [2024-06-21 17:32:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 525942784. Throughput: 0: 41559.3. Samples: 526026740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 17:32:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 17:32:41,944][15401] Updated weights for policy 0, policy_version 32110 (0.0037) [2024-06-21 17:32:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42055.7, 300 sec: 41709.8). Total num frames: 526172160. Throughput: 0: 41795.1. Samples: 526284280. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 17:32:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 17:32:45,093][15401] Updated weights for policy 0, policy_version 32120 (0.0042) [2024-06-21 17:32:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 526352384. Throughput: 0: 41705.2. Samples: 526530200. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-21 17:32:48,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 17:32:50,243][15401] Updated weights for policy 0, policy_version 32130 (0.0036) [2024-06-21 17:32:52,810][15401] Updated weights for policy 0, policy_version 32140 (0.0035) [2024-06-21 17:32:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 526581760. Throughput: 0: 41464.3. Samples: 526649560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 17:32:53,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 17:32:58,077][15401] Updated weights for policy 0, policy_version 32150 (0.0047) [2024-06-21 17:32:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41507.9, 300 sec: 41709.8). Total num frames: 526778368. Throughput: 0: 41843.1. Samples: 526913300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 17:32:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 17:33:00,697][15401] Updated weights for policy 0, policy_version 32160 (0.0025) [2024-06-21 17:33:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 526958592. Throughput: 0: 41649.5. Samples: 527157320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 17:33:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 17:33:05,898][15401] Updated weights for policy 0, policy_version 32170 (0.0034) [2024-06-21 17:33:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 527220736. Throughput: 0: 41619.7. Samples: 527281240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-21 17:33:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 17:33:08,407][15401] Updated weights for policy 0, policy_version 32180 (0.0045) [2024-06-21 17:33:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 527368192. Throughput: 0: 41746.7. Samples: 527533320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 17:33:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 17:33:13,569][15401] Updated weights for policy 0, policy_version 32190 (0.0045) [2024-06-21 17:33:16,398][15401] Updated weights for policy 0, policy_version 32200 (0.0029) [2024-06-21 17:33:18,390][15132] Fps is (10 sec: 39320.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 527613952. Throughput: 0: 41415.1. Samples: 527777140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 17:33:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 17:33:21,289][15401] Updated weights for policy 0, policy_version 32210 (0.0033) [2024-06-21 17:33:22,272][15349] Signal inference workers to stop experience collection... (7650 times) [2024-06-21 17:33:22,310][15401] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-21 17:33:22,329][15349] Signal inference workers to resume experience collection... (7650 times) [2024-06-21 17:33:22,329][15401] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-21 17:33:23,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 527843328. Throughput: 0: 41844.8. Samples: 527909760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 17:33:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 17:33:24,334][15401] Updated weights for policy 0, policy_version 32220 (0.0036) [2024-06-21 17:33:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40686.9, 300 sec: 41543.2). Total num frames: 527990784. Throughput: 0: 41492.5. Samples: 528151440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 17:33:28,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-21 17:33:29,179][15401] Updated weights for policy 0, policy_version 32230 (0.0043) [2024-06-21 17:33:32,301][15401] Updated weights for policy 0, policy_version 32240 (0.0034) [2024-06-21 17:33:33,391][15132] Fps is (10 sec: 39316.5, 60 sec: 41778.4, 300 sec: 41709.6). Total num frames: 528236544. Throughput: 0: 41416.3. Samples: 528393980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-21 17:33:33,391][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 17:33:36,923][15401] Updated weights for policy 0, policy_version 32250 (0.0033) [2024-06-21 17:33:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 528433152. Throughput: 0: 41735.7. Samples: 528527660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-21 17:33:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 17:33:40,069][15401] Updated weights for policy 0, policy_version 32260 (0.0044) [2024-06-21 17:33:43,390][15132] Fps is (10 sec: 40964.6, 60 sec: 41233.0, 300 sec: 41599.6). Total num frames: 528646144. Throughput: 0: 41379.4. Samples: 528775380. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-21 17:33:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 17:33:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032266_528646144.pth... [2024-06-21 17:33:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031656_518651904.pth [2024-06-21 17:33:44,657][15401] Updated weights for policy 0, policy_version 32270 (0.0040) [2024-06-21 17:33:47,885][15401] Updated weights for policy 0, policy_version 32280 (0.0034) [2024-06-21 17:33:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 528875520. Throughput: 0: 41356.8. Samples: 529018380. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-21 17:33:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 17:33:52,412][15401] Updated weights for policy 0, policy_version 32290 (0.0028) [2024-06-21 17:33:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 529055744. Throughput: 0: 41529.7. Samples: 529150080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:33:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 17:33:55,890][15401] Updated weights for policy 0, policy_version 32300 (0.0030) [2024-06-21 17:33:58,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 529252352. Throughput: 0: 41339.9. Samples: 529393620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:33:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 17:34:00,091][15401] Updated weights for policy 0, policy_version 32310 (0.0040) [2024-06-21 17:34:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 529498112. Throughput: 0: 41437.9. Samples: 529641840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:34:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 17:34:03,760][15401] Updated weights for policy 0, policy_version 32320 (0.0036) [2024-06-21 17:34:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 41543.1). Total num frames: 529678336. Throughput: 0: 41381.3. Samples: 529771920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:34:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 17:34:08,478][15401] Updated weights for policy 0, policy_version 32330 (0.0038) [2024-06-21 17:34:12,266][15401] Updated weights for policy 0, policy_version 32340 (0.0047) [2024-06-21 17:34:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 41654.2). Total num frames: 529907712. Throughput: 0: 41558.1. Samples: 530021560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:34:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 17:34:16,699][15401] Updated weights for policy 0, policy_version 32350 (0.0040) [2024-06-21 17:34:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41599.3). Total num frames: 530104320. Throughput: 0: 41633.2. Samples: 530267420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:34:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 17:34:19,937][15401] Updated weights for policy 0, policy_version 32360 (0.0027) [2024-06-21 17:34:23,390][15132] Fps is (10 sec: 37683.6, 60 sec: 40686.9, 300 sec: 41543.2). Total num frames: 530284544. Throughput: 0: 41403.9. Samples: 530390840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:34:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 17:34:24,352][15401] Updated weights for policy 0, policy_version 32370 (0.0030) [2024-06-21 17:34:27,781][15401] Updated weights for policy 0, policy_version 32380 (0.0038) [2024-06-21 17:34:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41654.3). Total num frames: 530530304. Throughput: 0: 41583.3. Samples: 530646620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:34:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 17:34:32,123][15401] Updated weights for policy 0, policy_version 32390 (0.0034) [2024-06-21 17:34:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41507.0, 300 sec: 41543.2). Total num frames: 530726912. Throughput: 0: 41759.2. Samples: 530897540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:34:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 17:34:35,598][15401] Updated weights for policy 0, policy_version 32400 (0.0038) [2024-06-21 17:34:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 530923520. Throughput: 0: 41604.0. Samples: 531022260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:34:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-21 17:34:39,980][15401] Updated weights for policy 0, policy_version 32410 (0.0038) [2024-06-21 17:34:43,269][15401] Updated weights for policy 0, policy_version 32420 (0.0037) [2024-06-21 17:34:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 531169280. Throughput: 0: 41781.2. Samples: 531273780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:34:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 17:34:44,900][15349] Signal inference workers to stop experience collection... (7700 times) [2024-06-21 17:34:44,942][15401] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-21 17:34:44,949][15349] Signal inference workers to resume experience collection... (7700 times) [2024-06-21 17:34:44,960][15401] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-21 17:34:47,793][15401] Updated weights for policy 0, policy_version 32430 (0.0043) [2024-06-21 17:34:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 41231.5, 300 sec: 41542.8). Total num frames: 531349504. Throughput: 0: 41862.2. Samples: 531525740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:34:48,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 17:34:51,114][15401] Updated weights for policy 0, policy_version 32440 (0.0034) [2024-06-21 17:34:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41654.6). Total num frames: 531578880. Throughput: 0: 41631.0. Samples: 531645320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:34:53,404][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 17:34:55,530][15401] Updated weights for policy 0, policy_version 32450 (0.0034) [2024-06-21 17:34:58,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 531775488. Throughput: 0: 41782.7. Samples: 531901780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:34:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 17:34:59,272][15401] Updated weights for policy 0, policy_version 32460 (0.0024) [2024-06-21 17:35:03,331][15401] Updated weights for policy 0, policy_version 32470 (0.0027) [2024-06-21 17:35:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 531988480. Throughput: 0: 41948.3. Samples: 532155100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:35:03,391][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 17:35:06,852][15401] Updated weights for policy 0, policy_version 32480 (0.0041) [2024-06-21 17:35:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 41598.7). Total num frames: 532217856. Throughput: 0: 42010.1. Samples: 532281300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:35:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 17:35:11,268][15401] Updated weights for policy 0, policy_version 32490 (0.0048) [2024-06-21 17:35:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41709.7). Total num frames: 532414464. Throughput: 0: 41894.5. Samples: 532531880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:35:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 17:35:15,128][15401] Updated weights for policy 0, policy_version 32500 (0.0037) [2024-06-21 17:35:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 532611072. Throughput: 0: 41936.8. Samples: 532784700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:35:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 17:35:19,021][15401] Updated weights for policy 0, policy_version 32510 (0.0049) [2024-06-21 17:35:22,947][15401] Updated weights for policy 0, policy_version 32520 (0.0033) [2024-06-21 17:35:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 41598.7). Total num frames: 532840448. Throughput: 0: 41944.0. Samples: 532909740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:35:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 17:35:26,862][15401] Updated weights for policy 0, policy_version 32530 (0.0027) [2024-06-21 17:35:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 533053440. Throughput: 0: 42013.5. Samples: 533164380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:35:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 17:35:30,359][15401] Updated weights for policy 0, policy_version 32540 (0.0043) [2024-06-21 17:35:33,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.1, 300 sec: 41654.2). Total num frames: 533250048. Throughput: 0: 42006.9. Samples: 533415960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:35:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 17:35:34,592][15401] Updated weights for policy 0, policy_version 32550 (0.0034) [2024-06-21 17:35:38,307][15401] Updated weights for policy 0, policy_version 32560 (0.0051) [2024-06-21 17:35:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 41598.7). Total num frames: 533463040. Throughput: 0: 42137.7. Samples: 533541520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:35:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 17:35:42,390][15401] Updated weights for policy 0, policy_version 32570 (0.0038) [2024-06-21 17:35:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 533676032. Throughput: 0: 41961.7. Samples: 533790060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:35:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 17:35:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032573_533676032.pth... [2024-06-21 17:35:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000031961_523649024.pth [2024-06-21 17:35:46,071][15401] Updated weights for policy 0, policy_version 32580 (0.0033) [2024-06-21 17:35:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42053.9, 300 sec: 41654.2). Total num frames: 533872640. Throughput: 0: 41790.7. Samples: 534035680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:35:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 17:35:50,486][15401] Updated weights for policy 0, policy_version 32590 (0.0047) [2024-06-21 17:35:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 534085632. Throughput: 0: 41935.6. Samples: 534168400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:35:53,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 17:35:53,939][15401] Updated weights for policy 0, policy_version 32600 (0.0036) [2024-06-21 17:35:58,154][15401] Updated weights for policy 0, policy_version 32610 (0.0036) [2024-06-21 17:35:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 534282240. Throughput: 0: 41898.4. Samples: 534417300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 17:35:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 17:36:01,434][15401] Updated weights for policy 0, policy_version 32620 (0.0041) [2024-06-21 17:36:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41654.3). Total num frames: 534511616. Throughput: 0: 41875.3. Samples: 534669080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 17:36:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 17:36:05,850][15401] Updated weights for policy 0, policy_version 32630 (0.0052) [2024-06-21 17:36:08,390][15132] Fps is (10 sec: 42597.3, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 534708224. Throughput: 0: 41885.6. Samples: 534794600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 17:36:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 17:36:09,587][15401] Updated weights for policy 0, policy_version 32640 (0.0044) [2024-06-21 17:36:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 534904832. Throughput: 0: 41730.2. Samples: 535042240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 17:36:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:36:13,632][15401] Updated weights for policy 0, policy_version 32650 (0.0035) [2024-06-21 17:36:17,400][15401] Updated weights for policy 0, policy_version 32660 (0.0027) [2024-06-21 17:36:18,389][15132] Fps is (10 sec: 40961.1, 60 sec: 41779.4, 300 sec: 41543.2). Total num frames: 535117824. Throughput: 0: 41757.1. Samples: 535295020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-21 17:36:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 17:36:21,604][15401] Updated weights for policy 0, policy_version 32670 (0.0033) [2024-06-21 17:36:23,024][15349] Signal inference workers to stop experience collection... (7750 times) [2024-06-21 17:36:23,055][15401] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-21 17:36:23,080][15349] Signal inference workers to resume experience collection... (7750 times) [2024-06-21 17:36:23,080][15401] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-21 17:36:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 535347200. Throughput: 0: 41845.0. Samples: 535424540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:36:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 17:36:25,472][15401] Updated weights for policy 0, policy_version 32680 (0.0044) [2024-06-21 17:36:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 535527424. Throughput: 0: 41661.5. Samples: 535664820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:36:28,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 17:36:29,373][15401] Updated weights for policy 0, policy_version 32690 (0.0052) [2024-06-21 17:36:33,291][15401] Updated weights for policy 0, policy_version 32700 (0.0034) [2024-06-21 17:36:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 535756800. Throughput: 0: 41866.1. Samples: 535919660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:36:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 17:36:37,425][15401] Updated weights for policy 0, policy_version 32710 (0.0039) [2024-06-21 17:36:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.3, 300 sec: 41766.0). Total num frames: 535969792. Throughput: 0: 41825.8. Samples: 536050560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:36:38,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 17:36:41,082][15401] Updated weights for policy 0, policy_version 32720 (0.0049) [2024-06-21 17:36:43,390][15132] Fps is (10 sec: 42596.9, 60 sec: 41779.0, 300 sec: 41765.2). Total num frames: 536182784. Throughput: 0: 41708.0. Samples: 536294180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:36:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 17:36:45,529][15401] Updated weights for policy 0, policy_version 32730 (0.0039) [2024-06-21 17:36:48,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 41653.9). Total num frames: 536379392. Throughput: 0: 41582.1. Samples: 536540380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:36:48,393][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 17:36:49,056][15401] Updated weights for policy 0, policy_version 32740 (0.0033) [2024-06-21 17:36:53,389][15132] Fps is (10 sec: 37685.2, 60 sec: 41233.1, 300 sec: 41599.1). Total num frames: 536559616. Throughput: 0: 41631.4. Samples: 536668000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:36:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 17:36:53,426][15401] Updated weights for policy 0, policy_version 32750 (0.0040) [2024-06-21 17:36:57,164][15401] Updated weights for policy 0, policy_version 32760 (0.0047) [2024-06-21 17:36:58,390][15132] Fps is (10 sec: 40969.6, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 536788992. Throughput: 0: 41666.2. Samples: 536917220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:36:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 17:37:01,260][15401] Updated weights for policy 0, policy_version 32770 (0.0035) [2024-06-21 17:37:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 537018368. Throughput: 0: 41412.3. Samples: 537158580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:37:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 17:37:04,836][15401] Updated weights for policy 0, policy_version 32780 (0.0033) [2024-06-21 17:37:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 537182208. Throughput: 0: 41381.4. Samples: 537286700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 17:37:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 17:37:09,121][15401] Updated weights for policy 0, policy_version 32790 (0.0039) [2024-06-21 17:37:12,511][15401] Updated weights for policy 0, policy_version 32800 (0.0036) [2024-06-21 17:37:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 537411584. Throughput: 0: 41548.9. Samples: 537534520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 17:37:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 17:37:16,952][15401] Updated weights for policy 0, policy_version 32810 (0.0044) [2024-06-21 17:37:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 537624576. Throughput: 0: 41453.0. Samples: 537785040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 17:37:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 17:37:20,552][15401] Updated weights for policy 0, policy_version 32820 (0.0031) [2024-06-21 17:37:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 537821184. Throughput: 0: 41281.8. Samples: 537908240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 17:37:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 17:37:25,389][15401] Updated weights for policy 0, policy_version 32830 (0.0039) [2024-06-21 17:37:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 538034176. Throughput: 0: 41282.2. Samples: 538151860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:37:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 17:37:28,520][15401] Updated weights for policy 0, policy_version 32840 (0.0044) [2024-06-21 17:37:33,132][15401] Updated weights for policy 0, policy_version 32850 (0.0044) [2024-06-21 17:37:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 538214400. Throughput: 0: 41664.5. Samples: 538415180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:37:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 17:37:36,139][15401] Updated weights for policy 0, policy_version 32860 (0.0048) [2024-06-21 17:37:38,396][15132] Fps is (10 sec: 42570.9, 60 sec: 41501.7, 300 sec: 41653.3). Total num frames: 538460160. Throughput: 0: 41440.2. Samples: 538533080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:37:38,397][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 17:37:40,790][15401] Updated weights for policy 0, policy_version 32870 (0.0044) [2024-06-21 17:37:41,887][15349] Signal inference workers to stop experience collection... (7800 times) [2024-06-21 17:37:41,887][15349] Signal inference workers to resume experience collection... (7800 times) [2024-06-21 17:37:41,908][15401] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-21 17:37:41,908][15401] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-21 17:37:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 41506.4, 300 sec: 41765.3). Total num frames: 538673152. Throughput: 0: 41486.7. Samples: 538784120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 17:37:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 17:37:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032878_538673152.pth... [2024-06-21 17:37:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032266_528646144.pth [2024-06-21 17:37:44,052][15401] Updated weights for policy 0, policy_version 32880 (0.0034) [2024-06-21 17:37:48,390][15132] Fps is (10 sec: 39346.6, 60 sec: 41234.7, 300 sec: 41598.7). Total num frames: 538853376. Throughput: 0: 41730.2. Samples: 539036440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:37:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 17:37:48,623][15401] Updated weights for policy 0, policy_version 32890 (0.0033) [2024-06-21 17:37:51,828][15401] Updated weights for policy 0, policy_version 32900 (0.0038) [2024-06-21 17:37:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 539082752. Throughput: 0: 41473.2. Samples: 539153000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:37:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 17:37:56,498][15401] Updated weights for policy 0, policy_version 32910 (0.0028) [2024-06-21 17:37:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 539279360. Throughput: 0: 41684.9. Samples: 539410340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:37:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 17:37:59,701][15401] Updated weights for policy 0, policy_version 32920 (0.0039) [2024-06-21 17:38:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41231.4, 300 sec: 41598.3). Total num frames: 539492352. Throughput: 0: 41702.1. Samples: 539661740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 17:38:03,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 17:38:04,201][15401] Updated weights for policy 0, policy_version 32930 (0.0025) [2024-06-21 17:38:07,354][15401] Updated weights for policy 0, policy_version 32940 (0.0045) [2024-06-21 17:38:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 539721728. Throughput: 0: 41780.0. Samples: 539788340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 17:38:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 17:38:11,936][15401] Updated weights for policy 0, policy_version 32950 (0.0056) [2024-06-21 17:38:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 539901952. Throughput: 0: 41906.2. Samples: 540037640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 17:38:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 17:38:15,325][15401] Updated weights for policy 0, policy_version 32960 (0.0039) [2024-06-21 17:38:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 540114944. Throughput: 0: 41579.5. Samples: 540286260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 17:38:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 17:38:19,742][15401] Updated weights for policy 0, policy_version 32970 (0.0044) [2024-06-21 17:38:23,279][15401] Updated weights for policy 0, policy_version 32980 (0.0037) [2024-06-21 17:38:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 540344320. Throughput: 0: 41761.1. Samples: 540412060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 17:38:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 17:38:27,595][15401] Updated weights for policy 0, policy_version 32990 (0.0043) [2024-06-21 17:38:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41709.9). Total num frames: 540540928. Throughput: 0: 41732.0. Samples: 540662060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 17:38:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 17:38:31,035][15401] Updated weights for policy 0, policy_version 33000 (0.0033) [2024-06-21 17:38:33,393][15132] Fps is (10 sec: 39308.0, 60 sec: 42049.9, 300 sec: 41709.3). Total num frames: 540737536. Throughput: 0: 41746.7. Samples: 540915180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:38:33,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:38:35,428][15401] Updated weights for policy 0, policy_version 33010 (0.0033) [2024-06-21 17:38:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41783.6, 300 sec: 41765.3). Total num frames: 540966912. Throughput: 0: 41936.5. Samples: 541040140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:38:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 17:38:38,753][15401] Updated weights for policy 0, policy_version 33020 (0.0039) [2024-06-21 17:38:43,117][15401] Updated weights for policy 0, policy_version 33030 (0.0043) [2024-06-21 17:38:43,389][15132] Fps is (10 sec: 42613.0, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 541163520. Throughput: 0: 41626.3. Samples: 541283520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:38:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 17:38:46,911][15401] Updated weights for policy 0, policy_version 33040 (0.0030) [2024-06-21 17:38:48,389][15132] Fps is (10 sec: 37683.9, 60 sec: 41506.3, 300 sec: 41654.2). Total num frames: 541343744. Throughput: 0: 41714.4. Samples: 541538780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 17:38:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 17:38:50,895][15401] Updated weights for policy 0, policy_version 33050 (0.0037) [2024-06-21 17:38:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 541556736. Throughput: 0: 41519.2. Samples: 541656700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:38:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 17:38:54,973][15401] Updated weights for policy 0, policy_version 33060 (0.0045) [2024-06-21 17:38:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 541786112. Throughput: 0: 41519.9. Samples: 541906040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:38:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 17:38:58,895][15401] Updated weights for policy 0, policy_version 33070 (0.0032) [2024-06-21 17:39:01,193][15349] Signal inference workers to stop experience collection... (7850 times) [2024-06-21 17:39:01,194][15349] Signal inference workers to resume experience collection... (7850 times) [2024-06-21 17:39:01,204][15401] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-21 17:39:01,204][15401] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-21 17:39:03,390][15132] Fps is (10 sec: 40958.9, 60 sec: 41234.6, 300 sec: 41654.2). Total num frames: 541966336. Throughput: 0: 41603.5. Samples: 542158420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:39:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 17:39:03,411][15401] Updated weights for policy 0, policy_version 33080 (0.0040) [2024-06-21 17:39:06,662][15401] Updated weights for policy 0, policy_version 33090 (0.0032) [2024-06-21 17:39:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 542195712. Throughput: 0: 41456.3. Samples: 542277600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-21 17:39:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:39:11,254][15401] Updated weights for policy 0, policy_version 33100 (0.0034) [2024-06-21 17:39:13,392][15132] Fps is (10 sec: 44226.9, 60 sec: 41777.5, 300 sec: 41709.4). Total num frames: 542408704. Throughput: 0: 41440.0. Samples: 542526960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:39:13,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 17:39:14,476][15401] Updated weights for policy 0, policy_version 33110 (0.0038) [2024-06-21 17:39:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 542605312. Throughput: 0: 41442.6. Samples: 542779960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:39:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 17:39:19,023][15401] Updated weights for policy 0, policy_version 33120 (0.0034) [2024-06-21 17:39:22,243][15401] Updated weights for policy 0, policy_version 33130 (0.0030) [2024-06-21 17:39:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 542818304. Throughput: 0: 41388.5. Samples: 542902620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:39:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 17:39:26,770][15401] Updated weights for policy 0, policy_version 33140 (0.0035) [2024-06-21 17:39:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 543031296. Throughput: 0: 41658.2. Samples: 543158140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:39:28,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-21 17:39:30,066][15401] Updated weights for policy 0, policy_version 33150 (0.0036) [2024-06-21 17:39:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41781.5, 300 sec: 41765.3). Total num frames: 543244288. Throughput: 0: 41497.1. Samples: 543406160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:39:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 17:39:34,511][15401] Updated weights for policy 0, policy_version 33160 (0.0039) [2024-06-21 17:39:38,099][15401] Updated weights for policy 0, policy_version 33170 (0.0037) [2024-06-21 17:39:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 543457280. Throughput: 0: 41625.3. Samples: 543529840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 17:39:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 17:39:42,504][15401] Updated weights for policy 0, policy_version 33180 (0.0033) [2024-06-21 17:39:43,392][15132] Fps is (10 sec: 40950.7, 60 sec: 41504.5, 300 sec: 41709.8). Total num frames: 543653888. Throughput: 0: 41681.5. Samples: 543781800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 17:39:43,392][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 17:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033182_543653888.pth... [2024-06-21 17:39:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032573_533676032.pth [2024-06-21 17:39:46,117][15401] Updated weights for policy 0, policy_version 33190 (0.0038) [2024-06-21 17:39:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 543866880. Throughput: 0: 41556.6. Samples: 544028460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 17:39:48,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 17:39:50,201][15401] Updated weights for policy 0, policy_version 33200 (0.0034) [2024-06-21 17:39:53,390][15132] Fps is (10 sec: 40969.3, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 544063488. Throughput: 0: 41672.9. Samples: 544152880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 17:39:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 17:39:54,026][15401] Updated weights for policy 0, policy_version 33210 (0.0043) [2024-06-21 17:39:58,122][15401] Updated weights for policy 0, policy_version 33220 (0.0028) [2024-06-21 17:39:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 544276480. Throughput: 0: 41679.6. Samples: 544402440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:39:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 17:40:01,882][15401] Updated weights for policy 0, policy_version 33230 (0.0034) [2024-06-21 17:40:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.4, 300 sec: 41543.2). Total num frames: 544473088. Throughput: 0: 41579.6. Samples: 544651040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:40:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 17:40:06,127][15401] Updated weights for policy 0, policy_version 33240 (0.0038) [2024-06-21 17:40:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 544702464. Throughput: 0: 41631.2. Samples: 544776020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:40:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:40:09,699][15401] Updated weights for policy 0, policy_version 33250 (0.0033) [2024-06-21 17:40:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41234.7, 300 sec: 41598.7). Total num frames: 544882688. Throughput: 0: 41478.6. Samples: 545024680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:40:13,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-21 17:40:14,083][15401] Updated weights for policy 0, policy_version 33260 (0.0030) [2024-06-21 17:40:17,451][15401] Updated weights for policy 0, policy_version 33270 (0.0031) [2024-06-21 17:40:18,392][15132] Fps is (10 sec: 40949.3, 60 sec: 41777.5, 300 sec: 41598.3). Total num frames: 545112064. Throughput: 0: 41403.5. Samples: 545269420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:40:18,393][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 17:40:21,811][15401] Updated weights for policy 0, policy_version 33280 (0.0030) [2024-06-21 17:40:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 545308672. Throughput: 0: 41527.9. Samples: 545398600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 17:40:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 17:40:25,357][15401] Updated weights for policy 0, policy_version 33290 (0.0031) [2024-06-21 17:40:28,389][15132] Fps is (10 sec: 39331.7, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 545505280. Throughput: 0: 41465.3. Samples: 545647640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 17:40:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 17:40:29,577][15401] Updated weights for policy 0, policy_version 33300 (0.0037) [2024-06-21 17:40:30,639][15349] Signal inference workers to stop experience collection... (7900 times) [2024-06-21 17:40:30,640][15349] Signal inference workers to resume experience collection... (7900 times) [2024-06-21 17:40:30,664][15401] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-21 17:40:30,696][15401] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-21 17:40:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 545734656. Throughput: 0: 41507.2. Samples: 545896280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 17:40:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 17:40:33,849][15401] Updated weights for policy 0, policy_version 33310 (0.0032) [2024-06-21 17:40:37,254][15401] Updated weights for policy 0, policy_version 33320 (0.0035) [2024-06-21 17:40:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 545947648. Throughput: 0: 41611.2. Samples: 546025380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 17:40:38,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 17:40:41,626][15401] Updated weights for policy 0, policy_version 33330 (0.0035) [2024-06-21 17:40:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41234.7, 300 sec: 41543.2). Total num frames: 546127872. Throughput: 0: 41506.6. Samples: 546270240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 17:40:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:40:45,077][15401] Updated weights for policy 0, policy_version 33340 (0.0039) [2024-06-21 17:40:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 546357248. Throughput: 0: 41628.8. Samples: 546524340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 17:40:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 17:40:49,299][15401] Updated weights for policy 0, policy_version 33350 (0.0032) [2024-06-21 17:40:52,905][15401] Updated weights for policy 0, policy_version 33360 (0.0038) [2024-06-21 17:40:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 546570240. Throughput: 0: 41626.6. Samples: 546649220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 17:40:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 17:40:56,902][15401] Updated weights for policy 0, policy_version 33370 (0.0028) [2024-06-21 17:40:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 546783232. Throughput: 0: 41408.9. Samples: 546888080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 17:40:58,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-21 17:41:01,503][15401] Updated weights for policy 0, policy_version 33380 (0.0035) [2024-06-21 17:41:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41777.4, 300 sec: 41598.4). Total num frames: 546979840. Throughput: 0: 41475.6. Samples: 547135820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 17:41:03,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 17:41:05,063][15401] Updated weights for policy 0, policy_version 33390 (0.0032) [2024-06-21 17:41:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 547176448. Throughput: 0: 41470.1. Samples: 547264760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 17:41:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 17:41:09,299][15401] Updated weights for policy 0, policy_version 33400 (0.0035) [2024-06-21 17:41:13,110][15401] Updated weights for policy 0, policy_version 33410 (0.0029) [2024-06-21 17:41:13,390][15132] Fps is (10 sec: 40969.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 547389440. Throughput: 0: 41375.9. Samples: 547509560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 17:41:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 17:41:17,068][15401] Updated weights for policy 0, policy_version 33420 (0.0038) [2024-06-21 17:41:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.8, 300 sec: 41487.6). Total num frames: 547586048. Throughput: 0: 41462.6. Samples: 547762100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 17:41:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 17:41:21,414][15401] Updated weights for policy 0, policy_version 33430 (0.0039) [2024-06-21 17:41:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 547799040. Throughput: 0: 41355.1. Samples: 547886360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 17:41:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 17:41:25,332][15401] Updated weights for policy 0, policy_version 33440 (0.0029) [2024-06-21 17:41:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 548028416. Throughput: 0: 41317.7. Samples: 548129540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:41:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 17:41:29,398][15401] Updated weights for policy 0, policy_version 33450 (0.0029) [2024-06-21 17:41:33,164][15401] Updated weights for policy 0, policy_version 33460 (0.0036) [2024-06-21 17:41:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 548225024. Throughput: 0: 41221.4. Samples: 548379300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:41:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 17:41:37,303][15401] Updated weights for policy 0, policy_version 33470 (0.0042) [2024-06-21 17:41:38,392][15132] Fps is (10 sec: 39312.4, 60 sec: 41231.4, 300 sec: 41487.3). Total num frames: 548421632. Throughput: 0: 41136.5. Samples: 548500460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:41:38,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-21 17:41:40,978][15401] Updated weights for policy 0, policy_version 33480 (0.0049) [2024-06-21 17:41:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41543.5). Total num frames: 548634624. Throughput: 0: 41357.7. Samples: 548749180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 17:41:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 17:41:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033486_548634624.pth... [2024-06-21 17:41:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000032878_538673152.pth [2024-06-21 17:41:45,054][15401] Updated weights for policy 0, policy_version 33490 (0.0028) [2024-06-21 17:41:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 548831232. Throughput: 0: 41241.3. Samples: 548991580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:41:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 17:41:49,211][15401] Updated weights for policy 0, policy_version 33500 (0.0045) [2024-06-21 17:41:52,782][15401] Updated weights for policy 0, policy_version 33510 (0.0039) [2024-06-21 17:41:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 549044224. Throughput: 0: 41212.0. Samples: 549119300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:41:53,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 17:41:57,027][15401] Updated weights for policy 0, policy_version 33520 (0.0046) [2024-06-21 17:41:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 549224448. Throughput: 0: 41261.7. Samples: 549366340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:41:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 17:42:00,919][15401] Updated weights for policy 0, policy_version 33530 (0.0039) [2024-06-21 17:42:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 41234.8, 300 sec: 41598.7). Total num frames: 549453824. Throughput: 0: 41174.3. Samples: 549614940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 17:42:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 17:42:04,904][15401] Updated weights for policy 0, policy_version 33540 (0.0039) [2024-06-21 17:42:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 549666816. Throughput: 0: 41203.1. Samples: 549740500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:42:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 17:42:08,541][15401] Updated weights for policy 0, policy_version 33550 (0.0036) [2024-06-21 17:42:11,536][15349] Signal inference workers to stop experience collection... (7950 times) [2024-06-21 17:42:11,536][15349] Signal inference workers to resume experience collection... (7950 times) [2024-06-21 17:42:11,572][15401] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-21 17:42:11,572][15401] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-21 17:42:12,610][15401] Updated weights for policy 0, policy_version 33560 (0.0049) [2024-06-21 17:42:13,395][15132] Fps is (10 sec: 40938.6, 60 sec: 41229.6, 300 sec: 41486.9). Total num frames: 549863424. Throughput: 0: 41286.5. Samples: 549987640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:42:13,395][15132] Avg episode reward: [(0, '0.268')] [2024-06-21 17:42:16,553][15401] Updated weights for policy 0, policy_version 33570 (0.0053) [2024-06-21 17:42:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 550076416. Throughput: 0: 41216.0. Samples: 550234020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:42:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 17:42:20,557][15401] Updated weights for policy 0, policy_version 33580 (0.0033) [2024-06-21 17:42:23,389][15132] Fps is (10 sec: 40981.0, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 550273024. Throughput: 0: 41384.9. Samples: 550362680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:42:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 17:42:24,330][15401] Updated weights for policy 0, policy_version 33590 (0.0033) [2024-06-21 17:42:28,368][15401] Updated weights for policy 0, policy_version 33600 (0.0031) [2024-06-21 17:42:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41231.5, 300 sec: 41653.9). Total num frames: 550502400. Throughput: 0: 41526.8. Samples: 550617980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 17:42:28,393][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 17:42:32,209][15401] Updated weights for policy 0, policy_version 33610 (0.0039) [2024-06-21 17:42:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41544.1). Total num frames: 550715392. Throughput: 0: 41534.7. Samples: 550860640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:42:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 17:42:36,033][15401] Updated weights for policy 0, policy_version 33620 (0.0049) [2024-06-21 17:42:38,390][15132] Fps is (10 sec: 40969.6, 60 sec: 41507.7, 300 sec: 41487.6). Total num frames: 550912000. Throughput: 0: 41471.6. Samples: 550985520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:42:38,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-21 17:42:39,982][15401] Updated weights for policy 0, policy_version 33630 (0.0035) [2024-06-21 17:42:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 551092224. Throughput: 0: 41569.0. Samples: 551236940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:42:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 17:42:44,055][15401] Updated weights for policy 0, policy_version 33640 (0.0033) [2024-06-21 17:42:47,893][15401] Updated weights for policy 0, policy_version 33650 (0.0042) [2024-06-21 17:42:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 551337984. Throughput: 0: 41637.7. Samples: 551488640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 17:42:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 17:42:52,325][15401] Updated weights for policy 0, policy_version 33660 (0.0037) [2024-06-21 17:42:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 551534592. Throughput: 0: 41747.7. Samples: 551619140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:42:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 17:42:55,709][15401] Updated weights for policy 0, policy_version 33670 (0.0030) [2024-06-21 17:42:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 41487.9). Total num frames: 551731200. Throughput: 0: 41660.6. Samples: 551862160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:42:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 17:43:00,044][15401] Updated weights for policy 0, policy_version 33680 (0.0041) [2024-06-21 17:43:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 41779.0, 300 sec: 41487.6). Total num frames: 551960576. Throughput: 0: 41650.1. Samples: 552108280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:43:03,391][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 17:43:03,637][15401] Updated weights for policy 0, policy_version 33690 (0.0037) [2024-06-21 17:43:07,845][15401] Updated weights for policy 0, policy_version 33700 (0.0036) [2024-06-21 17:43:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 552157184. Throughput: 0: 41762.1. Samples: 552241980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:43:08,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 17:43:11,303][15401] Updated weights for policy 0, policy_version 33710 (0.0045) [2024-06-21 17:43:13,390][15132] Fps is (10 sec: 39322.1, 60 sec: 41509.6, 300 sec: 41487.6). Total num frames: 552353792. Throughput: 0: 41496.9. Samples: 552485240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 17:43:13,392][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 17:43:15,532][15401] Updated weights for policy 0, policy_version 33720 (0.0032) [2024-06-21 17:43:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 552583168. Throughput: 0: 41807.6. Samples: 552741980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:43:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 17:43:19,019][15401] Updated weights for policy 0, policy_version 33730 (0.0033) [2024-06-21 17:43:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 552779776. Throughput: 0: 41965.4. Samples: 552873960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:43:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 17:43:23,612][15401] Updated weights for policy 0, policy_version 33740 (0.0034) [2024-06-21 17:43:26,734][15401] Updated weights for policy 0, policy_version 33750 (0.0044) [2024-06-21 17:43:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41780.8, 300 sec: 41599.2). Total num frames: 553009152. Throughput: 0: 41764.9. Samples: 553116360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:43:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 17:43:29,809][15349] Signal inference workers to stop experience collection... (8000 times) [2024-06-21 17:43:29,810][15349] Signal inference workers to resume experience collection... (8000 times) [2024-06-21 17:43:29,857][15401] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-21 17:43:29,857][15401] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-21 17:43:31,352][15401] Updated weights for policy 0, policy_version 33760 (0.0037) [2024-06-21 17:43:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 553222144. Throughput: 0: 41763.0. Samples: 553367980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-21 17:43:33,392][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 17:43:34,764][15401] Updated weights for policy 0, policy_version 33770 (0.0027) [2024-06-21 17:43:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 553385984. Throughput: 0: 41736.0. Samples: 553497260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 17:43:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 17:43:39,003][15401] Updated weights for policy 0, policy_version 33780 (0.0039) [2024-06-21 17:43:42,546][15401] Updated weights for policy 0, policy_version 33790 (0.0045) [2024-06-21 17:43:43,394][15132] Fps is (10 sec: 40941.2, 60 sec: 42322.1, 300 sec: 41653.6). Total num frames: 553631744. Throughput: 0: 42033.6. Samples: 553753860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 17:43:43,395][15132] Avg episode reward: [(0, '0.777')] [2024-06-21 17:43:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033792_553648128.pth... [2024-06-21 17:43:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033182_543653888.pth [2024-06-21 17:43:46,853][15401] Updated weights for policy 0, policy_version 33800 (0.0038) [2024-06-21 17:43:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 553828352. Throughput: 0: 42158.4. Samples: 554005400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 17:43:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 17:43:50,095][15401] Updated weights for policy 0, policy_version 33810 (0.0031) [2024-06-21 17:43:53,390][15132] Fps is (10 sec: 40978.7, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 554041344. Throughput: 0: 41928.5. Samples: 554128760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 17:43:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 17:43:54,578][15401] Updated weights for policy 0, policy_version 33820 (0.0035) [2024-06-21 17:43:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 554254336. Throughput: 0: 42197.8. Samples: 554384140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-21 17:43:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 17:43:58,499][15401] Updated weights for policy 0, policy_version 33830 (0.0020) [2024-06-21 17:44:02,316][15401] Updated weights for policy 0, policy_version 33840 (0.0044) [2024-06-21 17:44:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 554467328. Throughput: 0: 41962.2. Samples: 554630280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:44:03,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-21 17:44:06,361][15401] Updated weights for policy 0, policy_version 33850 (0.0043) [2024-06-21 17:44:08,394][15132] Fps is (10 sec: 40942.8, 60 sec: 41776.4, 300 sec: 41542.9). Total num frames: 554663936. Throughput: 0: 41914.8. Samples: 554760300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:44:08,394][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 17:44:10,075][15401] Updated weights for policy 0, policy_version 33860 (0.0039) [2024-06-21 17:44:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 554893312. Throughput: 0: 42136.9. Samples: 555012520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:44:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 17:44:13,951][15401] Updated weights for policy 0, policy_version 33870 (0.0031) [2024-06-21 17:44:18,049][15401] Updated weights for policy 0, policy_version 33880 (0.0034) [2024-06-21 17:44:18,390][15132] Fps is (10 sec: 44255.1, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 555106304. Throughput: 0: 42093.8. Samples: 555262200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:44:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 17:44:21,673][15401] Updated weights for policy 0, policy_version 33890 (0.0039) [2024-06-21 17:44:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 555302912. Throughput: 0: 42015.1. Samples: 555387940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:44:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 17:44:25,634][15401] Updated weights for policy 0, policy_version 33900 (0.0039) [2024-06-21 17:44:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 555499520. Throughput: 0: 41807.5. Samples: 555635000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:44:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 17:44:29,903][15401] Updated weights for policy 0, policy_version 33910 (0.0030) [2024-06-21 17:44:33,366][15401] Updated weights for policy 0, policy_version 33920 (0.0036) [2024-06-21 17:44:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 555745280. Throughput: 0: 41810.1. Samples: 555886860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:44:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 17:44:37,566][15401] Updated weights for policy 0, policy_version 33930 (0.0036) [2024-06-21 17:44:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41599.0). Total num frames: 555925504. Throughput: 0: 41946.7. Samples: 556016360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 17:44:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 17:44:41,078][15401] Updated weights for policy 0, policy_version 33940 (0.0040) [2024-06-21 17:44:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41509.4, 300 sec: 41543.2). Total num frames: 556122112. Throughput: 0: 41792.5. Samples: 556264800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:44:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-21 17:44:44,000][15349] Signal inference workers to stop experience collection... (8050 times) [2024-06-21 17:44:44,001][15349] Signal inference workers to resume experience collection... (8050 times) [2024-06-21 17:44:44,044][15401] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-21 17:44:44,044][15401] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-21 17:44:45,425][15401] Updated weights for policy 0, policy_version 33950 (0.0027) [2024-06-21 17:44:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 556351488. Throughput: 0: 41940.5. Samples: 556517600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:44:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 17:44:48,823][15401] Updated weights for policy 0, policy_version 33960 (0.0041) [2024-06-21 17:44:53,116][15401] Updated weights for policy 0, policy_version 33970 (0.0041) [2024-06-21 17:44:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 556564480. Throughput: 0: 41997.2. Samples: 556650000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:44:53,400][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 17:44:56,964][15401] Updated weights for policy 0, policy_version 33980 (0.0041) [2024-06-21 17:44:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 556777472. Throughput: 0: 41783.5. Samples: 556892780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:44:58,399][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:45:00,866][15401] Updated weights for policy 0, policy_version 33990 (0.0043) [2024-06-21 17:45:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 556990464. Throughput: 0: 41842.8. Samples: 557145120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 17:45:03,398][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 17:45:04,935][15401] Updated weights for policy 0, policy_version 34000 (0.0028) [2024-06-21 17:45:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41782.1, 300 sec: 41654.2). Total num frames: 557170688. Throughput: 0: 41772.3. Samples: 557267700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-21 17:45:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 17:45:09,200][15401] Updated weights for policy 0, policy_version 34010 (0.0040) [2024-06-21 17:45:12,573][15401] Updated weights for policy 0, policy_version 34020 (0.0032) [2024-06-21 17:45:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42050.6, 300 sec: 41709.8). Total num frames: 557416448. Throughput: 0: 41823.0. Samples: 557517140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-21 17:45:13,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 17:45:17,098][15401] Updated weights for policy 0, policy_version 34030 (0.0033) [2024-06-21 17:45:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 557596672. Throughput: 0: 41885.9. Samples: 557771720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-21 17:45:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 17:45:20,440][15401] Updated weights for policy 0, policy_version 34040 (0.0034) [2024-06-21 17:45:23,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 557809664. Throughput: 0: 41641.7. Samples: 557890240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-21 17:45:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:45:24,717][15401] Updated weights for policy 0, policy_version 34050 (0.0030) [2024-06-21 17:45:28,198][15401] Updated weights for policy 0, policy_version 34060 (0.0027) [2024-06-21 17:45:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 558039040. Throughput: 0: 41934.6. Samples: 558151860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 17:45:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 17:45:32,332][15401] Updated weights for policy 0, policy_version 34070 (0.0033) [2024-06-21 17:45:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 558235648. Throughput: 0: 41949.2. Samples: 558405320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 17:45:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 17:45:36,010][15401] Updated weights for policy 0, policy_version 34080 (0.0023) [2024-06-21 17:45:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 558448640. Throughput: 0: 41648.6. Samples: 558524180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 17:45:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 17:45:40,338][15401] Updated weights for policy 0, policy_version 34090 (0.0039) [2024-06-21 17:45:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 558661632. Throughput: 0: 42004.1. Samples: 558782960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 17:45:43,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-21 17:45:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034098_558661632.pth... [2024-06-21 17:45:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033486_548634624.pth [2024-06-21 17:45:43,984][15401] Updated weights for policy 0, policy_version 34100 (0.0036) [2024-06-21 17:45:48,383][15401] Updated weights for policy 0, policy_version 34110 (0.0034) [2024-06-21 17:45:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 558858240. Throughput: 0: 41842.7. Samples: 559028040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:45:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 17:45:51,619][15401] Updated weights for policy 0, policy_version 34120 (0.0029) [2024-06-21 17:45:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 559071232. Throughput: 0: 41840.6. Samples: 559150520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:45:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 17:45:56,072][15401] Updated weights for policy 0, policy_version 34130 (0.0046) [2024-06-21 17:45:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.2, 300 sec: 41599.1). Total num frames: 559251456. Throughput: 0: 41793.8. Samples: 559397760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:45:58,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 17:45:59,836][15401] Updated weights for policy 0, policy_version 34140 (0.0034) [2024-06-21 17:46:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 559480832. Throughput: 0: 41598.1. Samples: 559643640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:46:03,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-21 17:46:04,168][15401] Updated weights for policy 0, policy_version 34150 (0.0029) [2024-06-21 17:46:07,762][15401] Updated weights for policy 0, policy_version 34160 (0.0026) [2024-06-21 17:46:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 559693824. Throughput: 0: 41848.6. Samples: 559773420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 17:46:08,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-21 17:46:11,755][15401] Updated weights for policy 0, policy_version 34170 (0.0038) [2024-06-21 17:46:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41234.6, 300 sec: 41709.8). Total num frames: 559890432. Throughput: 0: 41571.0. Samples: 560022560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:46:13,399][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 17:46:15,546][15401] Updated weights for policy 0, policy_version 34180 (0.0053) [2024-06-21 17:46:18,132][15349] Signal inference workers to stop experience collection... (8100 times) [2024-06-21 17:46:18,179][15401] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-21 17:46:18,187][15349] Signal inference workers to resume experience collection... (8100 times) [2024-06-21 17:46:18,190][15401] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-21 17:46:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 560136192. Throughput: 0: 41414.7. Samples: 560268980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:46:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:46:19,596][15401] Updated weights for policy 0, policy_version 34190 (0.0041) [2024-06-21 17:46:23,292][15401] Updated weights for policy 0, policy_version 34200 (0.0033) [2024-06-21 17:46:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 560332800. Throughput: 0: 41672.3. Samples: 560399440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:46:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 17:46:27,461][15401] Updated weights for policy 0, policy_version 34210 (0.0027) [2024-06-21 17:46:28,392][15132] Fps is (10 sec: 37674.1, 60 sec: 41231.4, 300 sec: 41653.9). Total num frames: 560513024. Throughput: 0: 41475.1. Samples: 560649440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:46:28,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 17:46:31,251][15401] Updated weights for policy 0, policy_version 34220 (0.0034) [2024-06-21 17:46:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41710.1). Total num frames: 560726016. Throughput: 0: 41416.4. Samples: 560891780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:46:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 17:46:35,316][15401] Updated weights for policy 0, policy_version 34230 (0.0044) [2024-06-21 17:46:38,392][15132] Fps is (10 sec: 44236.9, 60 sec: 41777.4, 300 sec: 41765.0). Total num frames: 560955392. Throughput: 0: 41559.9. Samples: 561020820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:46:38,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 17:46:39,115][15401] Updated weights for policy 0, policy_version 34240 (0.0037) [2024-06-21 17:46:43,024][15401] Updated weights for policy 0, policy_version 34250 (0.0033) [2024-06-21 17:46:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 561152000. Throughput: 0: 41617.2. Samples: 561270540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:46:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 17:46:46,961][15401] Updated weights for policy 0, policy_version 34260 (0.0039) [2024-06-21 17:46:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 561364992. Throughput: 0: 41610.7. Samples: 561516120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:46:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 17:46:51,143][15401] Updated weights for policy 0, policy_version 34270 (0.0025) [2024-06-21 17:46:53,389][15132] Fps is (10 sec: 39322.5, 60 sec: 41233.1, 300 sec: 41765.4). Total num frames: 561545216. Throughput: 0: 41461.0. Samples: 561639160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 17:46:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 17:46:54,771][15401] Updated weights for policy 0, policy_version 34280 (0.0031) [2024-06-21 17:46:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 561774592. Throughput: 0: 41483.9. Samples: 561889340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 17:46:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 17:46:58,853][15401] Updated weights for policy 0, policy_version 34290 (0.0052) [2024-06-21 17:47:03,158][15401] Updated weights for policy 0, policy_version 34300 (0.0035) [2024-06-21 17:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 561987584. Throughput: 0: 41651.3. Samples: 562143280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 17:47:03,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-21 17:47:06,793][15401] Updated weights for policy 0, policy_version 34310 (0.0027) [2024-06-21 17:47:08,389][15132] Fps is (10 sec: 40961.1, 60 sec: 41506.1, 300 sec: 41766.0). Total num frames: 562184192. Throughput: 0: 41550.3. Samples: 562269200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 17:47:08,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 17:47:10,671][15401] Updated weights for policy 0, policy_version 34320 (0.0039) [2024-06-21 17:47:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 562397184. Throughput: 0: 41590.6. Samples: 562520920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 17:47:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 17:47:14,559][15401] Updated weights for policy 0, policy_version 34330 (0.0043) [2024-06-21 17:47:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 562610176. Throughput: 0: 41744.0. Samples: 562770260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:47:18,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-21 17:47:18,477][15401] Updated weights for policy 0, policy_version 34340 (0.0025) [2024-06-21 17:47:22,465][15401] Updated weights for policy 0, policy_version 34350 (0.0030) [2024-06-21 17:47:23,392][15132] Fps is (10 sec: 40950.8, 60 sec: 41231.5, 300 sec: 41709.8). Total num frames: 562806784. Throughput: 0: 41765.8. Samples: 562900280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:47:23,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 17:47:24,347][15349] Signal inference workers to stop experience collection... (8150 times) [2024-06-21 17:47:24,351][15349] Signal inference workers to resume experience collection... (8150 times) [2024-06-21 17:47:24,379][15401] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-21 17:47:24,379][15401] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-21 17:47:26,211][15401] Updated weights for policy 0, policy_version 34360 (0.0029) [2024-06-21 17:47:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.1, 300 sec: 41765.3). Total num frames: 563036160. Throughput: 0: 41598.0. Samples: 563142440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:47:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:47:30,378][15401] Updated weights for policy 0, policy_version 34370 (0.0042) [2024-06-21 17:47:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 563249152. Throughput: 0: 41846.2. Samples: 563399200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:47:33,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-21 17:47:33,778][15401] Updated weights for policy 0, policy_version 34380 (0.0029) [2024-06-21 17:47:38,348][15401] Updated weights for policy 0, policy_version 34390 (0.0046) [2024-06-21 17:47:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41507.8, 300 sec: 41876.4). Total num frames: 563445760. Throughput: 0: 41930.1. Samples: 563526020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:47:38,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 17:47:41,884][15401] Updated weights for policy 0, policy_version 34400 (0.0035) [2024-06-21 17:47:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 563658752. Throughput: 0: 41705.5. Samples: 563766080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:47:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 17:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034403_563658752.pth... [2024-06-21 17:47:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000033792_553648128.pth [2024-06-21 17:47:46,125][15401] Updated weights for policy 0, policy_version 34410 (0.0046) [2024-06-21 17:47:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40959.9, 300 sec: 41654.2). Total num frames: 563822592. Throughput: 0: 41836.7. Samples: 564025940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:47:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 17:47:49,590][15401] Updated weights for policy 0, policy_version 34420 (0.0029) [2024-06-21 17:47:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 564068352. Throughput: 0: 41544.0. Samples: 564138680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:47:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 17:47:53,847][15401] Updated weights for policy 0, policy_version 34430 (0.0043) [2024-06-21 17:47:57,877][15401] Updated weights for policy 0, policy_version 34440 (0.0031) [2024-06-21 17:47:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 564281344. Throughput: 0: 41550.0. Samples: 564390660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 17:47:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 17:48:01,665][15401] Updated weights for policy 0, policy_version 34450 (0.0044) [2024-06-21 17:48:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40959.8, 300 sec: 41654.2). Total num frames: 564445184. Throughput: 0: 41590.5. Samples: 564641840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:48:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 17:48:05,822][15401] Updated weights for policy 0, policy_version 34460 (0.0026) [2024-06-21 17:48:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 564690944. Throughput: 0: 41323.1. Samples: 564759720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:48:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 17:48:09,819][15401] Updated weights for policy 0, policy_version 34470 (0.0034) [2024-06-21 17:48:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41233.2, 300 sec: 41654.2). Total num frames: 564871168. Throughput: 0: 41664.8. Samples: 565017360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:48:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 17:48:13,826][15401] Updated weights for policy 0, policy_version 34480 (0.0039) [2024-06-21 17:48:17,619][15401] Updated weights for policy 0, policy_version 34490 (0.0027) [2024-06-21 17:48:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 565100544. Throughput: 0: 41538.6. Samples: 565268440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:48:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 17:48:21,579][15401] Updated weights for policy 0, policy_version 34500 (0.0046) [2024-06-21 17:48:23,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42327.0, 300 sec: 41820.9). Total num frames: 565346304. Throughput: 0: 41486.6. Samples: 565392920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:48:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 17:48:25,465][15401] Updated weights for policy 0, policy_version 34510 (0.0034) [2024-06-21 17:48:28,392][15132] Fps is (10 sec: 40950.6, 60 sec: 41231.3, 300 sec: 41653.9). Total num frames: 565510144. Throughput: 0: 41781.4. Samples: 565646340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:48:28,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 17:48:29,596][15401] Updated weights for policy 0, policy_version 34520 (0.0042) [2024-06-21 17:48:33,143][15401] Updated weights for policy 0, policy_version 34530 (0.0031) [2024-06-21 17:48:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 565739520. Throughput: 0: 41496.0. Samples: 565893260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:48:33,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-21 17:48:36,459][15349] Signal inference workers to stop experience collection... (8200 times) [2024-06-21 17:48:36,508][15401] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-21 17:48:36,516][15349] Signal inference workers to resume experience collection... (8200 times) [2024-06-21 17:48:36,523][15401] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-21 17:48:37,126][15401] Updated weights for policy 0, policy_version 34540 (0.0029) [2024-06-21 17:48:38,390][15132] Fps is (10 sec: 44246.8, 60 sec: 41779.1, 300 sec: 41766.0). Total num frames: 565952512. Throughput: 0: 41879.0. Samples: 566023240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:48:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 17:48:41,472][15401] Updated weights for policy 0, policy_version 34550 (0.0048) [2024-06-21 17:48:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41709.7). Total num frames: 566132736. Throughput: 0: 41911.4. Samples: 566276680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 17:48:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 17:48:44,852][15401] Updated weights for policy 0, policy_version 34560 (0.0038) [2024-06-21 17:48:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 566362112. Throughput: 0: 41772.2. Samples: 566521580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:48:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 17:48:49,368][15401] Updated weights for policy 0, policy_version 34570 (0.0032) [2024-06-21 17:48:52,943][15401] Updated weights for policy 0, policy_version 34580 (0.0027) [2024-06-21 17:48:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 566575104. Throughput: 0: 42031.8. Samples: 566651160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:48:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:48:57,424][15401] Updated weights for policy 0, policy_version 34590 (0.0047) [2024-06-21 17:48:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 566771712. Throughput: 0: 41896.7. Samples: 566902720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:48:58,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 17:49:00,558][15401] Updated weights for policy 0, policy_version 34600 (0.0031) [2024-06-21 17:49:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41765.9). Total num frames: 566984704. Throughput: 0: 41773.8. Samples: 567148260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:49:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 17:49:04,974][15401] Updated weights for policy 0, policy_version 34610 (0.0037) [2024-06-21 17:49:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 567181312. Throughput: 0: 41817.0. Samples: 567274680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 17:49:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 17:49:08,896][15401] Updated weights for policy 0, policy_version 34620 (0.0028) [2024-06-21 17:49:12,766][15401] Updated weights for policy 0, policy_version 34630 (0.0037) [2024-06-21 17:49:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 41654.3). Total num frames: 567394304. Throughput: 0: 41817.4. Samples: 567528020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 17:49:13,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 17:49:16,458][15401] Updated weights for policy 0, policy_version 34640 (0.0032) [2024-06-21 17:49:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42050.7, 300 sec: 41765.0). Total num frames: 567623680. Throughput: 0: 41796.6. Samples: 567774200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 17:49:18,392][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 17:49:20,359][15401] Updated weights for policy 0, policy_version 34650 (0.0033) [2024-06-21 17:49:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 567820288. Throughput: 0: 41828.6. Samples: 567905520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 17:49:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 17:49:24,075][15401] Updated weights for policy 0, policy_version 34660 (0.0026) [2024-06-21 17:49:28,063][15401] Updated weights for policy 0, policy_version 34670 (0.0031) [2024-06-21 17:49:28,393][15132] Fps is (10 sec: 40954.4, 60 sec: 42051.3, 300 sec: 41653.7). Total num frames: 568033280. Throughput: 0: 41801.5. Samples: 568157900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-21 17:49:28,394][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 17:49:31,596][15401] Updated weights for policy 0, policy_version 34680 (0.0043) [2024-06-21 17:49:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 568246272. Throughput: 0: 42137.7. Samples: 568417780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 17:49:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 17:49:35,639][15401] Updated weights for policy 0, policy_version 34690 (0.0031) [2024-06-21 17:49:38,390][15132] Fps is (10 sec: 42614.3, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 568459264. Throughput: 0: 41993.0. Samples: 568540840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 17:49:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 17:49:39,244][15401] Updated weights for policy 0, policy_version 34700 (0.0042) [2024-06-21 17:49:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 568672256. Throughput: 0: 41926.2. Samples: 568789400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 17:49:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 17:49:43,450][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034710_568688640.pth... [2024-06-21 17:49:43,458][15401] Updated weights for policy 0, policy_version 34710 (0.0030) [2024-06-21 17:49:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034098_558661632.pth [2024-06-21 17:49:47,384][15401] Updated weights for policy 0, policy_version 34720 (0.0036) [2024-06-21 17:49:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 568868864. Throughput: 0: 41935.5. Samples: 569035360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 17:49:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 17:49:51,483][15401] Updated weights for policy 0, policy_version 34730 (0.0049) [2024-06-21 17:49:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 569081856. Throughput: 0: 41831.6. Samples: 569157100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 17:49:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 17:49:55,313][15401] Updated weights for policy 0, policy_version 34740 (0.0038) [2024-06-21 17:49:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 569311232. Throughput: 0: 41903.0. Samples: 569413660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:49:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 17:49:59,194][15401] Updated weights for policy 0, policy_version 34750 (0.0033) [2024-06-21 17:50:03,362][15401] Updated weights for policy 0, policy_version 34760 (0.0042) [2024-06-21 17:50:03,391][15132] Fps is (10 sec: 42589.9, 60 sec: 42050.9, 300 sec: 41820.6). Total num frames: 569507840. Throughput: 0: 41913.3. Samples: 569660280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:50:03,392][15132] Avg episode reward: [(0, '0.806')] [2024-06-21 17:50:05,623][15349] Signal inference workers to stop experience collection... (8250 times) [2024-06-21 17:50:05,654][15401] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-21 17:50:05,684][15349] Signal inference workers to resume experience collection... (8250 times) [2024-06-21 17:50:05,688][15401] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-21 17:50:06,824][15401] Updated weights for policy 0, policy_version 34770 (0.0036) [2024-06-21 17:50:08,393][15132] Fps is (10 sec: 39308.3, 60 sec: 42049.9, 300 sec: 41654.1). Total num frames: 569704448. Throughput: 0: 41887.0. Samples: 569790580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:50:08,393][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 17:50:11,287][15401] Updated weights for policy 0, policy_version 34780 (0.0050) [2024-06-21 17:50:13,389][15132] Fps is (10 sec: 39329.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 569901056. Throughput: 0: 41850.6. Samples: 570041020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 17:50:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 17:50:14,673][15401] Updated weights for policy 0, policy_version 34790 (0.0028) [2024-06-21 17:50:18,390][15132] Fps is (10 sec: 42612.7, 60 sec: 41780.8, 300 sec: 41765.3). Total num frames: 570130432. Throughput: 0: 41577.8. Samples: 570288780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 17:50:18,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 17:50:19,016][15401] Updated weights for policy 0, policy_version 34800 (0.0032) [2024-06-21 17:50:22,604][15401] Updated weights for policy 0, policy_version 34810 (0.0034) [2024-06-21 17:50:23,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42047.7, 300 sec: 41708.9). Total num frames: 570343424. Throughput: 0: 41759.4. Samples: 570420280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 17:50:23,397][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 17:50:26,866][15401] Updated weights for policy 0, policy_version 34820 (0.0047) [2024-06-21 17:50:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41508.7, 300 sec: 41654.2). Total num frames: 570523648. Throughput: 0: 41701.4. Samples: 570665960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 17:50:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 17:50:30,871][15401] Updated weights for policy 0, policy_version 34830 (0.0036) [2024-06-21 17:50:33,390][15132] Fps is (10 sec: 40986.1, 60 sec: 41779.2, 300 sec: 41709.7). Total num frames: 570753024. Throughput: 0: 41757.3. Samples: 570914440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 17:50:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 17:50:34,736][15401] Updated weights for policy 0, policy_version 34840 (0.0035) [2024-06-21 17:50:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 570949632. Throughput: 0: 41972.5. Samples: 571045860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 17:50:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 17:50:38,645][15401] Updated weights for policy 0, policy_version 34850 (0.0036) [2024-06-21 17:50:42,517][15401] Updated weights for policy 0, policy_version 34860 (0.0030) [2024-06-21 17:50:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 571162624. Throughput: 0: 41764.0. Samples: 571293040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:50:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 17:50:46,627][15401] Updated weights for policy 0, policy_version 34870 (0.0036) [2024-06-21 17:50:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 571408384. Throughput: 0: 41712.0. Samples: 571537240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:50:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 17:50:50,216][15401] Updated weights for policy 0, policy_version 34880 (0.0042) [2024-06-21 17:50:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 571555840. Throughput: 0: 41655.5. Samples: 571664940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:50:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 17:50:54,631][15401] Updated weights for policy 0, policy_version 34890 (0.0042) [2024-06-21 17:50:57,946][15401] Updated weights for policy 0, policy_version 34900 (0.0034) [2024-06-21 17:50:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 571801600. Throughput: 0: 41643.6. Samples: 571914980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 17:50:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 17:51:02,370][15401] Updated weights for policy 0, policy_version 34910 (0.0029) [2024-06-21 17:51:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 41780.6, 300 sec: 41765.3). Total num frames: 572014592. Throughput: 0: 41581.0. Samples: 572159920. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 17:51:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-21 17:51:05,614][15401] Updated weights for policy 0, policy_version 34920 (0.0033) [2024-06-21 17:51:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41508.5, 300 sec: 41709.8). Total num frames: 572194816. Throughput: 0: 41487.3. Samples: 572286940. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 17:51:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 17:51:10,429][15401] Updated weights for policy 0, policy_version 34930 (0.0034) [2024-06-21 17:51:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 572440576. Throughput: 0: 41535.9. Samples: 572535080. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 17:51:13,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 17:51:13,826][15401] Updated weights for policy 0, policy_version 34940 (0.0030) [2024-06-21 17:51:18,206][15401] Updated weights for policy 0, policy_version 34950 (0.0031) [2024-06-21 17:51:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 572620800. Throughput: 0: 41635.9. Samples: 572788060. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 17:51:18,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 17:51:21,622][15401] Updated weights for policy 0, policy_version 34960 (0.0040) [2024-06-21 17:51:23,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41237.5, 300 sec: 41710.1). Total num frames: 572817408. Throughput: 0: 41427.4. Samples: 572910100. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-21 17:51:23,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 17:51:26,019][15401] Updated weights for policy 0, policy_version 34970 (0.0029) [2024-06-21 17:51:27,502][15349] Signal inference workers to stop experience collection... (8300 times) [2024-06-21 17:51:27,543][15401] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-21 17:51:27,551][15349] Signal inference workers to resume experience collection... (8300 times) [2024-06-21 17:51:27,561][15401] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-21 17:51:28,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 573046784. Throughput: 0: 41424.9. Samples: 573157160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-21 17:51:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 17:51:29,603][15401] Updated weights for policy 0, policy_version 34980 (0.0036) [2024-06-21 17:51:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41654.6). Total num frames: 573243392. Throughput: 0: 41519.6. Samples: 573405620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-21 17:51:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 17:51:33,993][15401] Updated weights for policy 0, policy_version 34990 (0.0037) [2024-06-21 17:51:38,251][15401] Updated weights for policy 0, policy_version 35000 (0.0036) [2024-06-21 17:51:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41504.4, 300 sec: 41653.9). Total num frames: 573440000. Throughput: 0: 41506.7. Samples: 573532840. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-21 17:51:38,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 17:51:41,701][15401] Updated weights for policy 0, policy_version 35010 (0.0030) [2024-06-21 17:51:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 573652992. Throughput: 0: 41359.0. Samples: 573776140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-21 17:51:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 17:51:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035013_573652992.pth... [2024-06-21 17:51:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034403_563658752.pth [2024-06-21 17:51:45,975][15401] Updated weights for policy 0, policy_version 35020 (0.0042) [2024-06-21 17:51:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 40687.0, 300 sec: 41709.8). Total num frames: 573849600. Throughput: 0: 41590.7. Samples: 574031500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 17:51:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 17:51:49,496][15401] Updated weights for policy 0, policy_version 35030 (0.0042) [2024-06-21 17:51:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 574078976. Throughput: 0: 41504.4. Samples: 574154640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 17:51:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 17:51:54,102][15401] Updated weights for policy 0, policy_version 35040 (0.0026) [2024-06-21 17:51:57,357][15401] Updated weights for policy 0, policy_version 35050 (0.0038) [2024-06-21 17:51:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 574275584. Throughput: 0: 41438.7. Samples: 574399820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 17:51:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 17:52:01,991][15401] Updated weights for policy 0, policy_version 35060 (0.0041) [2024-06-21 17:52:03,392][15132] Fps is (10 sec: 39312.4, 60 sec: 40958.3, 300 sec: 41653.9). Total num frames: 574472192. Throughput: 0: 41416.1. Samples: 574651880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 17:52:03,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 17:52:05,256][15401] Updated weights for policy 0, policy_version 35070 (0.0037) [2024-06-21 17:52:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 574685184. Throughput: 0: 41387.2. Samples: 574772520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 17:52:08,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 17:52:09,632][15401] Updated weights for policy 0, policy_version 35080 (0.0038) [2024-06-21 17:52:12,959][15401] Updated weights for policy 0, policy_version 35090 (0.0037) [2024-06-21 17:52:13,390][15132] Fps is (10 sec: 44247.4, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 574914560. Throughput: 0: 41485.3. Samples: 575024000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:52:13,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 17:52:17,236][15401] Updated weights for policy 0, policy_version 35100 (0.0039) [2024-06-21 17:52:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.2, 300 sec: 41654.6). Total num frames: 575094784. Throughput: 0: 41535.2. Samples: 575274700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:52:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 17:52:21,054][15401] Updated weights for policy 0, policy_version 35110 (0.0034) [2024-06-21 17:52:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 575324160. Throughput: 0: 41421.7. Samples: 575396720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:52:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 17:52:25,326][15401] Updated weights for policy 0, policy_version 35120 (0.0048) [2024-06-21 17:52:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 575537152. Throughput: 0: 41659.2. Samples: 575650800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:52:28,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 17:52:28,978][15401] Updated weights for policy 0, policy_version 35130 (0.0028) [2024-06-21 17:52:32,934][15401] Updated weights for policy 0, policy_version 35140 (0.0029) [2024-06-21 17:52:33,392][15132] Fps is (10 sec: 42586.0, 60 sec: 41777.2, 300 sec: 41709.4). Total num frames: 575750144. Throughput: 0: 41473.2. Samples: 575897920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:52:33,393][15132] Avg episode reward: [(0, '0.824')] [2024-06-21 17:52:36,983][15401] Updated weights for policy 0, policy_version 35150 (0.0038) [2024-06-21 17:52:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41507.8, 300 sec: 41598.7). Total num frames: 575930368. Throughput: 0: 41536.4. Samples: 576023780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:52:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-21 17:52:40,848][15401] Updated weights for policy 0, policy_version 35160 (0.0044) [2024-06-21 17:52:43,390][15132] Fps is (10 sec: 40971.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 576159744. Throughput: 0: 41674.6. Samples: 576275180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:52:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 17:52:44,866][15401] Updated weights for policy 0, policy_version 35170 (0.0025) [2024-06-21 17:52:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 576372736. Throughput: 0: 41698.6. Samples: 576528220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:52:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 17:52:48,703][15401] Updated weights for policy 0, policy_version 35180 (0.0031) [2024-06-21 17:52:52,623][15401] Updated weights for policy 0, policy_version 35190 (0.0032) [2024-06-21 17:52:53,391][15132] Fps is (10 sec: 42594.2, 60 sec: 41778.5, 300 sec: 41709.6). Total num frames: 576585728. Throughput: 0: 41761.6. Samples: 576651840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 17:52:53,391][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 17:52:56,497][15401] Updated weights for policy 0, policy_version 35200 (0.0040) [2024-06-21 17:52:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 576798720. Throughput: 0: 41834.7. Samples: 576906560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 17:52:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 17:53:00,339][15401] Updated weights for policy 0, policy_version 35210 (0.0039) [2024-06-21 17:53:03,298][15349] Signal inference workers to stop experience collection... (8350 times) [2024-06-21 17:53:03,299][15349] Signal inference workers to resume experience collection... (8350 times) [2024-06-21 17:53:03,339][15401] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-21 17:53:03,339][15401] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-21 17:53:03,389][15132] Fps is (10 sec: 40964.9, 60 sec: 42054.0, 300 sec: 41709.8). Total num frames: 576995328. Throughput: 0: 41862.7. Samples: 577158520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 17:53:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 17:53:04,675][15401] Updated weights for policy 0, policy_version 35220 (0.0038) [2024-06-21 17:53:08,341][15401] Updated weights for policy 0, policy_version 35230 (0.0053) [2024-06-21 17:53:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 577208320. Throughput: 0: 41776.9. Samples: 577276680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 17:53:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 17:53:12,390][15401] Updated weights for policy 0, policy_version 35240 (0.0034) [2024-06-21 17:53:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 577421312. Throughput: 0: 41774.6. Samples: 577530660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 17:53:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 17:53:16,426][15401] Updated weights for policy 0, policy_version 35250 (0.0039) [2024-06-21 17:53:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 577617920. Throughput: 0: 41895.2. Samples: 577783080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 17:53:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 17:53:20,210][15401] Updated weights for policy 0, policy_version 35260 (0.0030) [2024-06-21 17:53:23,396][15132] Fps is (10 sec: 40934.1, 60 sec: 41774.7, 300 sec: 41764.7). Total num frames: 577830912. Throughput: 0: 41761.6. Samples: 577903320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 17:53:23,396][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 17:53:24,247][15401] Updated weights for policy 0, policy_version 35270 (0.0032) [2024-06-21 17:53:27,980][15401] Updated weights for policy 0, policy_version 35280 (0.0045) [2024-06-21 17:53:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 578043904. Throughput: 0: 41861.5. Samples: 578158940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 17:53:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 17:53:32,084][15401] Updated weights for policy 0, policy_version 35290 (0.0039) [2024-06-21 17:53:33,389][15132] Fps is (10 sec: 39347.3, 60 sec: 41235.2, 300 sec: 41598.7). Total num frames: 578224128. Throughput: 0: 41866.8. Samples: 578412220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 17:53:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 17:53:35,918][15401] Updated weights for policy 0, policy_version 35300 (0.0036) [2024-06-21 17:53:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 578453504. Throughput: 0: 41827.1. Samples: 578534020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 17:53:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 17:53:39,916][15401] Updated weights for policy 0, policy_version 35310 (0.0047) [2024-06-21 17:53:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 578650112. Throughput: 0: 41847.5. Samples: 578789700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:53:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 17:53:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035320_578682880.pth... [2024-06-21 17:53:43,540][15401] Updated weights for policy 0, policy_version 35320 (0.0028) [2024-06-21 17:53:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000034710_568688640.pth [2024-06-21 17:53:47,647][15401] Updated weights for policy 0, policy_version 35330 (0.0036) [2024-06-21 17:53:48,392][15132] Fps is (10 sec: 42588.9, 60 sec: 41777.6, 300 sec: 41709.5). Total num frames: 578879488. Throughput: 0: 41769.7. Samples: 579038260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:53:48,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 17:53:51,435][15401] Updated weights for policy 0, policy_version 35340 (0.0042) [2024-06-21 17:53:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41780.0, 300 sec: 41765.3). Total num frames: 579092480. Throughput: 0: 41965.4. Samples: 579165120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:53:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 17:53:55,182][15401] Updated weights for policy 0, policy_version 35350 (0.0032) [2024-06-21 17:53:58,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 579272704. Throughput: 0: 41959.7. Samples: 579418840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:53:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 17:53:59,060][15401] Updated weights for policy 0, policy_version 35360 (0.0038) [2024-06-21 17:54:02,702][15401] Updated weights for policy 0, policy_version 35370 (0.0030) [2024-06-21 17:54:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 579502080. Throughput: 0: 41825.3. Samples: 579665220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 17:54:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 17:54:07,009][15401] Updated weights for policy 0, policy_version 35380 (0.0031) [2024-06-21 17:54:08,396][15132] Fps is (10 sec: 44208.5, 60 sec: 41774.8, 300 sec: 41764.4). Total num frames: 579715072. Throughput: 0: 42124.1. Samples: 579798900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 17:54:08,396][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 17:54:10,536][15401] Updated weights for policy 0, policy_version 35390 (0.0034) [2024-06-21 17:54:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41654.6). Total num frames: 579911680. Throughput: 0: 41937.7. Samples: 580046140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 17:54:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 17:54:15,071][15401] Updated weights for policy 0, policy_version 35400 (0.0033) [2024-06-21 17:54:18,311][15401] Updated weights for policy 0, policy_version 35410 (0.0039) [2024-06-21 17:54:18,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 580157440. Throughput: 0: 41631.9. Samples: 580285660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 17:54:18,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-21 17:54:23,354][15401] Updated weights for policy 0, policy_version 35420 (0.0034) [2024-06-21 17:54:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41510.7, 300 sec: 41654.8). Total num frames: 580321280. Throughput: 0: 41806.9. Samples: 580415320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 17:54:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 17:54:26,388][15401] Updated weights for policy 0, policy_version 35430 (0.0029) [2024-06-21 17:54:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 580534272. Throughput: 0: 41642.2. Samples: 580663600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 17:54:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 17:54:31,126][15401] Updated weights for policy 0, policy_version 35440 (0.0051) [2024-06-21 17:54:31,437][15349] Signal inference workers to stop experience collection... (8400 times) [2024-06-21 17:54:31,437][15349] Signal inference workers to resume experience collection... (8400 times) [2024-06-21 17:54:31,454][15401] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-21 17:54:31,454][15401] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-21 17:54:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 580763648. Throughput: 0: 41664.4. Samples: 580913060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 17:54:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 17:54:34,149][15401] Updated weights for policy 0, policy_version 35450 (0.0038) [2024-06-21 17:54:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 580943872. Throughput: 0: 41609.8. Samples: 581037560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 17:54:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-21 17:54:38,805][15401] Updated weights for policy 0, policy_version 35460 (0.0033) [2024-06-21 17:54:42,120][15401] Updated weights for policy 0, policy_version 35470 (0.0032) [2024-06-21 17:54:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 581156864. Throughput: 0: 41453.2. Samples: 581284240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 17:54:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-21 17:54:46,867][15401] Updated weights for policy 0, policy_version 35480 (0.0043) [2024-06-21 17:54:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 581386240. Throughput: 0: 41556.5. Samples: 581535260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 17:54:48,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 17:54:49,796][15401] Updated weights for policy 0, policy_version 35490 (0.0043) [2024-06-21 17:54:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 581566464. Throughput: 0: 41457.9. Samples: 581664240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-21 17:54:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 17:54:54,371][15401] Updated weights for policy 0, policy_version 35500 (0.0036) [2024-06-21 17:54:57,601][15401] Updated weights for policy 0, policy_version 35510 (0.0035) [2024-06-21 17:54:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41654.5). Total num frames: 581795840. Throughput: 0: 41393.4. Samples: 581908840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-21 17:54:58,394][15132] Avg episode reward: [(0, '0.851')] [2024-06-21 17:55:02,711][15401] Updated weights for policy 0, policy_version 35520 (0.0033) [2024-06-21 17:55:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41654.7). Total num frames: 581992448. Throughput: 0: 41833.2. Samples: 582168160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-21 17:55:03,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-21 17:55:05,625][15401] Updated weights for policy 0, policy_version 35530 (0.0044) [2024-06-21 17:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41510.6, 300 sec: 41709.8). Total num frames: 582205440. Throughput: 0: 41568.4. Samples: 582285900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-21 17:55:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 17:55:10,801][15401] Updated weights for policy 0, policy_version 35540 (0.0042) [2024-06-21 17:55:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 582434816. Throughput: 0: 41647.9. Samples: 582537760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-21 17:55:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 17:55:13,485][15401] Updated weights for policy 0, policy_version 35550 (0.0024) [2024-06-21 17:55:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 41488.5). Total num frames: 582582272. Throughput: 0: 41749.3. Samples: 582791780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 17:55:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 17:55:18,610][15401] Updated weights for policy 0, policy_version 35560 (0.0029) [2024-06-21 17:55:21,580][15401] Updated weights for policy 0, policy_version 35570 (0.0029) [2024-06-21 17:55:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 582828032. Throughput: 0: 41588.9. Samples: 582909060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 17:55:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 17:55:26,279][15401] Updated weights for policy 0, policy_version 35580 (0.0029) [2024-06-21 17:55:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 583041024. Throughput: 0: 41709.0. Samples: 583161140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 17:55:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 17:55:29,607][15401] Updated weights for policy 0, policy_version 35590 (0.0036) [2024-06-21 17:55:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 583237632. Throughput: 0: 41660.7. Samples: 583410000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 17:55:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 17:55:34,068][15401] Updated weights for policy 0, policy_version 35600 (0.0036) [2024-06-21 17:55:37,549][15401] Updated weights for policy 0, policy_version 35610 (0.0039) [2024-06-21 17:55:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 583467008. Throughput: 0: 41541.3. Samples: 583533600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:55:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 17:55:41,807][15401] Updated weights for policy 0, policy_version 35620 (0.0039) [2024-06-21 17:55:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 583663616. Throughput: 0: 41702.7. Samples: 583785460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:55:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 17:55:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035625_583680000.pth... [2024-06-21 17:55:43,592][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035013_573652992.pth [2024-06-21 17:55:45,254][15401] Updated weights for policy 0, policy_version 35630 (0.0031) [2024-06-21 17:55:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 583860224. Throughput: 0: 41428.1. Samples: 584032420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:55:48,391][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 17:55:50,080][15401] Updated weights for policy 0, policy_version 35640 (0.0035) [2024-06-21 17:55:53,085][15401] Updated weights for policy 0, policy_version 35650 (0.0046) [2024-06-21 17:55:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 584089600. Throughput: 0: 41564.9. Samples: 584156320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:55:53,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 17:55:57,803][15401] Updated weights for policy 0, policy_version 35660 (0.0034) [2024-06-21 17:55:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 584269824. Throughput: 0: 41541.4. Samples: 584407120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 17:55:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-21 17:56:01,224][15401] Updated weights for policy 0, policy_version 35670 (0.0038) [2024-06-21 17:56:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 584499200. Throughput: 0: 41345.7. Samples: 584652340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:56:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 17:56:05,483][15401] Updated weights for policy 0, policy_version 35680 (0.0046) [2024-06-21 17:56:07,169][15349] Signal inference workers to stop experience collection... (8450 times) [2024-06-21 17:56:07,208][15401] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-21 17:56:07,218][15349] Signal inference workers to resume experience collection... (8450 times) [2024-06-21 17:56:07,222][15401] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-21 17:56:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 584695808. Throughput: 0: 41547.9. Samples: 584778720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:56:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 17:56:09,206][15401] Updated weights for policy 0, policy_version 35690 (0.0038) [2024-06-21 17:56:13,378][15401] Updated weights for policy 0, policy_version 35700 (0.0042) [2024-06-21 17:56:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 584908800. Throughput: 0: 41441.3. Samples: 585026000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:56:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 17:56:17,229][15401] Updated weights for policy 0, policy_version 35710 (0.0031) [2024-06-21 17:56:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 585121792. Throughput: 0: 41356.5. Samples: 585271040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 17:56:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 17:56:21,630][15401] Updated weights for policy 0, policy_version 35720 (0.0034) [2024-06-21 17:56:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 585302016. Throughput: 0: 41412.0. Samples: 585397140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:56:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 17:56:25,615][15401] Updated weights for policy 0, policy_version 35730 (0.0042) [2024-06-21 17:56:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 585531392. Throughput: 0: 41101.7. Samples: 585635040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:56:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 17:56:29,386][15401] Updated weights for policy 0, policy_version 35740 (0.0030) [2024-06-21 17:56:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.2, 300 sec: 41599.0). Total num frames: 585711616. Throughput: 0: 41331.7. Samples: 585892340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:56:33,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 17:56:33,529][15401] Updated weights for policy 0, policy_version 35750 (0.0048) [2024-06-21 17:56:37,327][15401] Updated weights for policy 0, policy_version 35760 (0.0041) [2024-06-21 17:56:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 585940992. Throughput: 0: 41226.7. Samples: 586011520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:56:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 17:56:41,183][15401] Updated weights for policy 0, policy_version 35770 (0.0033) [2024-06-21 17:56:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 586153984. Throughput: 0: 41309.2. Samples: 586266040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 17:56:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 17:56:45,146][15401] Updated weights for policy 0, policy_version 35780 (0.0029) [2024-06-21 17:56:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 586334208. Throughput: 0: 41377.5. Samples: 586514320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 17:56:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 17:56:48,952][15401] Updated weights for policy 0, policy_version 35790 (0.0049) [2024-06-21 17:56:52,921][15401] Updated weights for policy 0, policy_version 35800 (0.0035) [2024-06-21 17:56:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 586547200. Throughput: 0: 41216.5. Samples: 586633460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 17:56:53,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 17:56:56,922][15401] Updated weights for policy 0, policy_version 35810 (0.0047) [2024-06-21 17:56:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 586760192. Throughput: 0: 41203.1. Samples: 586880140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 17:56:58,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-21 17:57:01,133][15401] Updated weights for policy 0, policy_version 35820 (0.0039) [2024-06-21 17:57:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 41598.7). Total num frames: 586956800. Throughput: 0: 41351.6. Samples: 587131860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 17:57:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 17:57:04,808][15401] Updated weights for policy 0, policy_version 35830 (0.0051) [2024-06-21 17:57:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 587186176. Throughput: 0: 41232.4. Samples: 587252600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-21 17:57:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 17:57:08,780][15401] Updated weights for policy 0, policy_version 35840 (0.0047) [2024-06-21 17:57:13,010][15401] Updated weights for policy 0, policy_version 35850 (0.0032) [2024-06-21 17:57:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 587366400. Throughput: 0: 41499.2. Samples: 587502500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 17:57:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 17:57:16,628][15401] Updated weights for policy 0, policy_version 35860 (0.0035) [2024-06-21 17:57:16,724][15349] Signal inference workers to stop experience collection... (8500 times) [2024-06-21 17:57:16,725][15349] Signal inference workers to resume experience collection... (8500 times) [2024-06-21 17:57:16,770][15401] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-21 17:57:16,771][15401] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-21 17:57:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 587579392. Throughput: 0: 41252.8. Samples: 587748720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 17:57:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-21 17:57:21,112][15401] Updated weights for policy 0, policy_version 35870 (0.0041) [2024-06-21 17:57:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 587776000. Throughput: 0: 41403.4. Samples: 587874680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 17:57:23,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 17:57:24,479][15401] Updated weights for policy 0, policy_version 35880 (0.0027) [2024-06-21 17:57:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 41488.0). Total num frames: 587988992. Throughput: 0: 41390.8. Samples: 588128620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 17:57:28,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 17:57:28,791][15401] Updated weights for policy 0, policy_version 35890 (0.0036) [2024-06-21 17:57:32,219][15401] Updated weights for policy 0, policy_version 35900 (0.0034) [2024-06-21 17:57:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 588234752. Throughput: 0: 41190.7. Samples: 588367900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:57:33,398][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 17:57:36,588][15401] Updated weights for policy 0, policy_version 35910 (0.0051) [2024-06-21 17:57:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41231.4, 300 sec: 41542.8). Total num frames: 588414976. Throughput: 0: 41476.5. Samples: 588500000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:57:38,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 17:57:39,951][15401] Updated weights for policy 0, policy_version 35920 (0.0045) [2024-06-21 17:57:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 588611584. Throughput: 0: 41468.9. Samples: 588746240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:57:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 17:57:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035926_588611584.pth... [2024-06-21 17:57:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035320_578682880.pth [2024-06-21 17:57:44,573][15401] Updated weights for policy 0, policy_version 35930 (0.0037) [2024-06-21 17:57:48,110][15401] Updated weights for policy 0, policy_version 35940 (0.0037) [2024-06-21 17:57:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41779.1, 300 sec: 41543.3). Total num frames: 588840960. Throughput: 0: 41241.7. Samples: 588987740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:57:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 17:57:53,000][15401] Updated weights for policy 0, policy_version 35950 (0.0034) [2024-06-21 17:57:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 589021184. Throughput: 0: 41442.3. Samples: 589117500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 17:57:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 17:57:55,891][15401] Updated weights for policy 0, policy_version 35960 (0.0036) [2024-06-21 17:57:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 589250560. Throughput: 0: 41375.9. Samples: 589364420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:57:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 17:58:00,761][15401] Updated weights for policy 0, policy_version 35970 (0.0040) [2024-06-21 17:58:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 589463552. Throughput: 0: 41461.3. Samples: 589614480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:58:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 17:58:03,763][15401] Updated weights for policy 0, policy_version 35980 (0.0038) [2024-06-21 17:58:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 589643776. Throughput: 0: 41466.7. Samples: 589740680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:58:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 17:58:08,500][15401] Updated weights for policy 0, policy_version 35990 (0.0031) [2024-06-21 17:58:11,945][15401] Updated weights for policy 0, policy_version 36000 (0.0045) [2024-06-21 17:58:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 589873152. Throughput: 0: 41229.3. Samples: 589983940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 17:58:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 17:58:16,455][15401] Updated weights for policy 0, policy_version 36010 (0.0033) [2024-06-21 17:58:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41488.5). Total num frames: 590069760. Throughput: 0: 41626.6. Samples: 590241100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:58:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 17:58:19,675][15401] Updated weights for policy 0, policy_version 36020 (0.0027) [2024-06-21 17:58:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 590282752. Throughput: 0: 41343.0. Samples: 590360340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:58:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 17:58:24,489][15401] Updated weights for policy 0, policy_version 36030 (0.0037) [2024-06-21 17:58:27,529][15401] Updated weights for policy 0, policy_version 36040 (0.0048) [2024-06-21 17:58:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 590512128. Throughput: 0: 41401.5. Samples: 590609300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:58:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 17:58:32,033][15401] Updated weights for policy 0, policy_version 36050 (0.0047) [2024-06-21 17:58:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 590692352. Throughput: 0: 41780.5. Samples: 590867860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:58:33,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 17:58:35,191][15401] Updated weights for policy 0, policy_version 36060 (0.0034) [2024-06-21 17:58:38,389][15132] Fps is (10 sec: 39321.1, 60 sec: 41507.8, 300 sec: 41543.2). Total num frames: 590905344. Throughput: 0: 41620.8. Samples: 590990440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 17:58:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-21 17:58:39,647][15401] Updated weights for policy 0, policy_version 36070 (0.0041) [2024-06-21 17:58:40,918][15349] Signal inference workers to stop experience collection... (8550 times) [2024-06-21 17:58:40,959][15401] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-21 17:58:40,969][15349] Signal inference workers to resume experience collection... (8550 times) [2024-06-21 17:58:40,972][15401] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-21 17:58:43,289][15401] Updated weights for policy 0, policy_version 36080 (0.0035) [2024-06-21 17:58:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41543.5). Total num frames: 591134720. Throughput: 0: 41706.7. Samples: 591241220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:58:43,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 17:58:47,155][15401] Updated weights for policy 0, policy_version 36090 (0.0023) [2024-06-21 17:58:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 591298560. Throughput: 0: 41825.2. Samples: 591496620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:58:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 17:58:51,401][15401] Updated weights for policy 0, policy_version 36100 (0.0053) [2024-06-21 17:58:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41598.7). Total num frames: 591544320. Throughput: 0: 41655.4. Samples: 591615180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:58:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 17:58:55,332][15401] Updated weights for policy 0, policy_version 36110 (0.0024) [2024-06-21 17:58:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 591757312. Throughput: 0: 41900.9. Samples: 591869480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:58:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 17:58:59,257][15401] Updated weights for policy 0, policy_version 36120 (0.0036) [2024-06-21 17:59:03,347][15401] Updated weights for policy 0, policy_version 36130 (0.0033) [2024-06-21 17:59:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41488.5). Total num frames: 591953920. Throughput: 0: 41709.3. Samples: 592118020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 17:59:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 17:59:07,218][15401] Updated weights for policy 0, policy_version 36140 (0.0033) [2024-06-21 17:59:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.6, 300 sec: 41598.4). Total num frames: 592183296. Throughput: 0: 41783.6. Samples: 592240700. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-21 17:59:08,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 17:59:11,325][15401] Updated weights for policy 0, policy_version 36150 (0.0036) [2024-06-21 17:59:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 592363520. Throughput: 0: 41868.2. Samples: 592493380. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-21 17:59:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 17:59:14,944][15401] Updated weights for policy 0, policy_version 36160 (0.0033) [2024-06-21 17:59:18,390][15132] Fps is (10 sec: 39331.0, 60 sec: 41779.2, 300 sec: 41543.1). Total num frames: 592576512. Throughput: 0: 41601.8. Samples: 592739940. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-21 17:59:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 17:59:18,918][15401] Updated weights for policy 0, policy_version 36170 (0.0042) [2024-06-21 17:59:22,843][15401] Updated weights for policy 0, policy_version 36180 (0.0054) [2024-06-21 17:59:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 41777.5, 300 sec: 41542.8). Total num frames: 592789504. Throughput: 0: 41681.7. Samples: 592866220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-21 17:59:23,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 17:59:26,584][15401] Updated weights for policy 0, policy_version 36190 (0.0023) [2024-06-21 17:59:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 592969728. Throughput: 0: 41638.3. Samples: 593114940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:59:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 17:59:30,508][15401] Updated weights for policy 0, policy_version 36200 (0.0024) [2024-06-21 17:59:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 593215488. Throughput: 0: 41480.5. Samples: 593363240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:59:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 17:59:34,458][15401] Updated weights for policy 0, policy_version 36210 (0.0046) [2024-06-21 17:59:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 593395712. Throughput: 0: 41885.8. Samples: 593500040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:59:38,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 17:59:38,590][15401] Updated weights for policy 0, policy_version 36220 (0.0040) [2024-06-21 17:59:42,109][15401] Updated weights for policy 0, policy_version 36230 (0.0040) [2024-06-21 17:59:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 593608704. Throughput: 0: 41476.4. Samples: 593735920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:59:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 17:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036231_593608704.pth... [2024-06-21 17:59:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035625_583680000.pth [2024-06-21 17:59:46,289][15401] Updated weights for policy 0, policy_version 36240 (0.0049) [2024-06-21 17:59:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41598.7). Total num frames: 593838080. Throughput: 0: 41543.6. Samples: 593987480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 17:59:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 17:59:50,255][15401] Updated weights for policy 0, policy_version 36250 (0.0039) [2024-06-21 17:59:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 594018304. Throughput: 0: 41688.9. Samples: 594116600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:59:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 17:59:53,414][15349] Signal inference workers to stop experience collection... (8600 times) [2024-06-21 17:59:53,454][15401] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-21 17:59:53,476][15349] Signal inference workers to resume experience collection... (8600 times) [2024-06-21 17:59:53,477][15401] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-21 17:59:54,070][15401] Updated weights for policy 0, policy_version 36260 (0.0030) [2024-06-21 17:59:57,932][15401] Updated weights for policy 0, policy_version 36270 (0.0039) [2024-06-21 17:59:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 594247680. Throughput: 0: 41617.4. Samples: 594366160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 17:59:58,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 18:00:01,850][15401] Updated weights for policy 0, policy_version 36280 (0.0041) [2024-06-21 18:00:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41504.5, 300 sec: 41487.3). Total num frames: 594444288. Throughput: 0: 41727.2. Samples: 594617760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 18:00:03,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 18:00:06,101][15401] Updated weights for policy 0, policy_version 36290 (0.0032) [2024-06-21 18:00:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41234.7, 300 sec: 41432.1). Total num frames: 594657280. Throughput: 0: 41789.7. Samples: 594746660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 18:00:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 18:00:09,575][15401] Updated weights for policy 0, policy_version 36300 (0.0036) [2024-06-21 18:00:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 594870272. Throughput: 0: 41916.9. Samples: 595001200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 18:00:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 18:00:13,724][15401] Updated weights for policy 0, policy_version 36310 (0.0042) [2024-06-21 18:00:17,399][15401] Updated weights for policy 0, policy_version 36320 (0.0028) [2024-06-21 18:00:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 595066880. Throughput: 0: 41962.1. Samples: 595251540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:00:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 18:00:21,514][15401] Updated weights for policy 0, policy_version 36330 (0.0036) [2024-06-21 18:00:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41506.1, 300 sec: 41487.3). Total num frames: 595279872. Throughput: 0: 41645.4. Samples: 595374180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:00:23,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 18:00:25,167][15401] Updated weights for policy 0, policy_version 36340 (0.0030) [2024-06-21 18:00:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 41598.7). Total num frames: 595509248. Throughput: 0: 42086.1. Samples: 595629800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:00:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 18:00:29,375][15401] Updated weights for policy 0, policy_version 36350 (0.0038) [2024-06-21 18:00:33,074][15401] Updated weights for policy 0, policy_version 36360 (0.0030) [2024-06-21 18:00:33,392][15132] Fps is (10 sec: 44237.0, 60 sec: 41777.6, 300 sec: 41542.8). Total num frames: 595722240. Throughput: 0: 41715.1. Samples: 595864760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:00:33,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 18:00:37,412][15401] Updated weights for policy 0, policy_version 36370 (0.0036) [2024-06-21 18:00:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 595918848. Throughput: 0: 41786.7. Samples: 595997000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:00:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 18:00:41,774][15401] Updated weights for policy 0, policy_version 36380 (0.0033) [2024-06-21 18:00:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 596115456. Throughput: 0: 41799.2. Samples: 596247120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:00:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 18:00:45,182][15401] Updated weights for policy 0, policy_version 36390 (0.0047) [2024-06-21 18:00:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 596328448. Throughput: 0: 41817.3. Samples: 596499440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:00:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 18:00:49,616][15401] Updated weights for policy 0, policy_version 36400 (0.0038) [2024-06-21 18:00:53,032][15401] Updated weights for policy 0, policy_version 36410 (0.0042) [2024-06-21 18:00:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 596557824. Throughput: 0: 41815.3. Samples: 596628340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:00:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 18:00:57,366][15401] Updated weights for policy 0, policy_version 36420 (0.0037) [2024-06-21 18:00:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 596754432. Throughput: 0: 41693.2. Samples: 596877400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:00:58,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 18:01:00,748][15401] Updated weights for policy 0, policy_version 36430 (0.0034) [2024-06-21 18:01:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42053.8, 300 sec: 41598.7). Total num frames: 596967424. Throughput: 0: 41715.5. Samples: 597128740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:01:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 18:01:04,910][15401] Updated weights for policy 0, policy_version 36440 (0.0025) [2024-06-21 18:01:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 597180416. Throughput: 0: 41787.6. Samples: 597254520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:01:08,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 18:01:08,535][15401] Updated weights for policy 0, policy_version 36450 (0.0031) [2024-06-21 18:01:12,564][15401] Updated weights for policy 0, policy_version 36460 (0.0027) [2024-06-21 18:01:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 597377024. Throughput: 0: 41799.6. Samples: 597510780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:01:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-21 18:01:16,653][15401] Updated weights for policy 0, policy_version 36470 (0.0034) [2024-06-21 18:01:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 597606400. Throughput: 0: 41763.5. Samples: 597744020. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:01:18,394][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 18:01:20,161][15349] Signal inference workers to stop experience collection... (8650 times) [2024-06-21 18:01:20,161][15349] Signal inference workers to resume experience collection... (8650 times) [2024-06-21 18:01:20,205][15401] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-21 18:01:20,206][15401] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-21 18:01:20,554][15401] Updated weights for policy 0, policy_version 36480 (0.0041) [2024-06-21 18:01:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41507.9, 300 sec: 41487.6). Total num frames: 597770240. Throughput: 0: 41736.5. Samples: 597875140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:01:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 18:01:24,333][15401] Updated weights for policy 0, policy_version 36490 (0.0034) [2024-06-21 18:01:28,384][15401] Updated weights for policy 0, policy_version 36500 (0.0039) [2024-06-21 18:01:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.4, 300 sec: 41709.8). Total num frames: 598016000. Throughput: 0: 41688.5. Samples: 598123100. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-21 18:01:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 18:01:32,450][15401] Updated weights for policy 0, policy_version 36510 (0.0044) [2024-06-21 18:01:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 41780.9, 300 sec: 41654.2). Total num frames: 598228992. Throughput: 0: 41404.9. Samples: 598362660. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-21 18:01:33,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-21 18:01:36,638][15401] Updated weights for policy 0, policy_version 36520 (0.0034) [2024-06-21 18:01:38,389][15132] Fps is (10 sec: 37682.8, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 598392832. Throughput: 0: 41452.4. Samples: 598493700. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-21 18:01:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 18:01:40,451][15401] Updated weights for policy 0, policy_version 36530 (0.0035) [2024-06-21 18:01:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 598638592. Throughput: 0: 41361.0. Samples: 598738640. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-21 18:01:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 18:01:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036538_598638592.pth... [2024-06-21 18:01:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000035926_588611584.pth [2024-06-21 18:01:44,600][15401] Updated weights for policy 0, policy_version 36540 (0.0039) [2024-06-21 18:01:48,156][15401] Updated weights for policy 0, policy_version 36550 (0.0034) [2024-06-21 18:01:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 598851584. Throughput: 0: 41468.9. Samples: 598994840. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-21 18:01:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 18:01:52,460][15401] Updated weights for policy 0, policy_version 36560 (0.0034) [2024-06-21 18:01:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 599015424. Throughput: 0: 41444.9. Samples: 599119540. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-06-21 18:01:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 18:01:55,934][15401] Updated weights for policy 0, policy_version 36570 (0.0035) [2024-06-21 18:01:58,394][15132] Fps is (10 sec: 40940.3, 60 sec: 41775.9, 300 sec: 41709.1). Total num frames: 599261184. Throughput: 0: 41140.1. Samples: 599362280. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-06-21 18:01:58,395][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 18:02:00,122][15401] Updated weights for policy 0, policy_version 36580 (0.0043) [2024-06-21 18:02:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 599457792. Throughput: 0: 41845.8. Samples: 599627080. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-06-21 18:02:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 18:02:03,703][15401] Updated weights for policy 0, policy_version 36590 (0.0033) [2024-06-21 18:02:07,821][15401] Updated weights for policy 0, policy_version 36600 (0.0029) [2024-06-21 18:02:08,389][15132] Fps is (10 sec: 40980.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 599670784. Throughput: 0: 41586.2. Samples: 599746520. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-06-21 18:02:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 18:02:11,562][15401] Updated weights for policy 0, policy_version 36610 (0.0048) [2024-06-21 18:02:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 599883776. Throughput: 0: 41507.0. Samples: 599990920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:02:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 18:02:15,874][15401] Updated weights for policy 0, policy_version 36620 (0.0034) [2024-06-21 18:02:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 600064000. Throughput: 0: 42003.0. Samples: 600252800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:02:18,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 18:02:19,274][15401] Updated weights for policy 0, policy_version 36630 (0.0033) [2024-06-21 18:02:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41709.7). Total num frames: 600293376. Throughput: 0: 41766.9. Samples: 600373220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:02:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 18:02:23,942][15401] Updated weights for policy 0, policy_version 36640 (0.0031) [2024-06-21 18:02:27,361][15401] Updated weights for policy 0, policy_version 36650 (0.0029) [2024-06-21 18:02:27,731][15349] Signal inference workers to stop experience collection... (8700 times) [2024-06-21 18:02:27,784][15401] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-21 18:02:27,841][15349] Signal inference workers to resume experience collection... (8700 times) [2024-06-21 18:02:27,842][15401] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-21 18:02:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 41779.0, 300 sec: 41654.2). Total num frames: 600522752. Throughput: 0: 41841.7. Samples: 600621520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:02:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 18:02:31,624][15401] Updated weights for policy 0, policy_version 36660 (0.0036) [2024-06-21 18:02:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41233.0, 300 sec: 41654.6). Total num frames: 600702976. Throughput: 0: 41890.3. Samples: 600879900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:02:33,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 18:02:35,323][15401] Updated weights for policy 0, policy_version 36670 (0.0032) [2024-06-21 18:02:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 600915968. Throughput: 0: 41741.8. Samples: 600997920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:02:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 18:02:39,214][15401] Updated weights for policy 0, policy_version 36680 (0.0040) [2024-06-21 18:02:43,025][15401] Updated weights for policy 0, policy_version 36690 (0.0036) [2024-06-21 18:02:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 601128960. Throughput: 0: 42025.9. Samples: 601253240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:02:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 18:02:46,810][15401] Updated weights for policy 0, policy_version 36700 (0.0024) [2024-06-21 18:02:48,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41231.5, 300 sec: 41709.4). Total num frames: 601325568. Throughput: 0: 41716.4. Samples: 601504420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:02:48,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 18:02:50,807][15401] Updated weights for policy 0, policy_version 36710 (0.0030) [2024-06-21 18:02:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 601554944. Throughput: 0: 41724.9. Samples: 601624140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:02:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 18:02:54,823][15401] Updated weights for policy 0, policy_version 36720 (0.0041) [2024-06-21 18:02:58,390][15132] Fps is (10 sec: 42608.8, 60 sec: 41509.5, 300 sec: 41654.2). Total num frames: 601751552. Throughput: 0: 41917.4. Samples: 601877200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:02:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 18:02:58,965][15401] Updated weights for policy 0, policy_version 36730 (0.0033) [2024-06-21 18:03:02,498][15401] Updated weights for policy 0, policy_version 36740 (0.0031) [2024-06-21 18:03:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 601964544. Throughput: 0: 41586.3. Samples: 602124180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 18:03:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 18:03:06,708][15401] Updated weights for policy 0, policy_version 36750 (0.0049) [2024-06-21 18:03:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 602161152. Throughput: 0: 41745.9. Samples: 602251780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 18:03:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 18:03:10,748][15401] Updated weights for policy 0, policy_version 36760 (0.0039) [2024-06-21 18:03:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 602374144. Throughput: 0: 41889.4. Samples: 602506540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 18:03:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 18:03:14,386][15401] Updated weights for policy 0, policy_version 36770 (0.0038) [2024-06-21 18:03:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 602587136. Throughput: 0: 41720.5. Samples: 602757320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 18:03:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 18:03:18,567][15401] Updated weights for policy 0, policy_version 36780 (0.0033) [2024-06-21 18:03:22,317][15401] Updated weights for policy 0, policy_version 36790 (0.0027) [2024-06-21 18:03:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 602783744. Throughput: 0: 41774.6. Samples: 602877780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 18:03:23,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 18:03:26,625][15401] Updated weights for policy 0, policy_version 36800 (0.0031) [2024-06-21 18:03:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 603029504. Throughput: 0: 41659.5. Samples: 603127920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 18:03:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 18:03:30,285][15401] Updated weights for policy 0, policy_version 36810 (0.0027) [2024-06-21 18:03:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 603209728. Throughput: 0: 41608.5. Samples: 603376700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 18:03:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 18:03:34,304][15401] Updated weights for policy 0, policy_version 36820 (0.0043) [2024-06-21 18:03:38,120][15401] Updated weights for policy 0, policy_version 36830 (0.0054) [2024-06-21 18:03:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 603422720. Throughput: 0: 41607.5. Samples: 603496480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 18:03:38,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 18:03:42,138][15401] Updated weights for policy 0, policy_version 36840 (0.0035) [2024-06-21 18:03:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 603635712. Throughput: 0: 41680.1. Samples: 603752800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-21 18:03:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 18:03:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036844_603652096.pth... [2024-06-21 18:03:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036231_593608704.pth [2024-06-21 18:03:46,315][15401] Updated weights for policy 0, policy_version 36850 (0.0026) [2024-06-21 18:03:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.9, 300 sec: 41654.3). Total num frames: 603832320. Throughput: 0: 41745.4. Samples: 604002720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:03:48,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-21 18:03:49,434][15349] Signal inference workers to stop experience collection... (8750 times) [2024-06-21 18:03:49,434][15349] Signal inference workers to resume experience collection... (8750 times) [2024-06-21 18:03:49,467][15401] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-21 18:03:49,468][15401] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-21 18:03:49,778][15401] Updated weights for policy 0, policy_version 36860 (0.0038) [2024-06-21 18:03:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 604045312. Throughput: 0: 41592.3. Samples: 604123440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:03:53,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 18:03:54,071][15401] Updated weights for policy 0, policy_version 36870 (0.0033) [2024-06-21 18:03:57,672][15401] Updated weights for policy 0, policy_version 36880 (0.0026) [2024-06-21 18:03:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 604258304. Throughput: 0: 41631.5. Samples: 604379960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:03:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 18:04:01,824][15401] Updated weights for policy 0, policy_version 36890 (0.0025) [2024-06-21 18:04:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41599.0). Total num frames: 604454912. Throughput: 0: 41620.4. Samples: 604630240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:04:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 18:04:05,794][15401] Updated weights for policy 0, policy_version 36900 (0.0033) [2024-06-21 18:04:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 604667904. Throughput: 0: 41725.3. Samples: 604755420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:04:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 18:04:09,639][15401] Updated weights for policy 0, policy_version 36910 (0.0037) [2024-06-21 18:04:13,340][15401] Updated weights for policy 0, policy_version 36920 (0.0035) [2024-06-21 18:04:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 604897280. Throughput: 0: 41808.5. Samples: 605009300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 18:04:13,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-21 18:04:17,431][15401] Updated weights for policy 0, policy_version 36930 (0.0030) [2024-06-21 18:04:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41654.6). Total num frames: 605077504. Throughput: 0: 41713.7. Samples: 605253820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 18:04:18,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-21 18:04:21,390][15401] Updated weights for policy 0, policy_version 36940 (0.0038) [2024-06-21 18:04:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 605306880. Throughput: 0: 41785.3. Samples: 605376820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 18:04:23,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 18:04:25,309][15401] Updated weights for policy 0, policy_version 36950 (0.0036) [2024-06-21 18:04:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 605503488. Throughput: 0: 41766.2. Samples: 605632280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 18:04:28,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 18:04:29,425][15401] Updated weights for policy 0, policy_version 36960 (0.0040) [2024-06-21 18:04:33,061][15401] Updated weights for policy 0, policy_version 36970 (0.0046) [2024-06-21 18:04:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 605716480. Throughput: 0: 41776.0. Samples: 605882640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 18:04:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 18:04:36,951][15401] Updated weights for policy 0, policy_version 36980 (0.0030) [2024-06-21 18:04:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 605945856. Throughput: 0: 41866.8. Samples: 606007440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-21 18:04:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 18:04:41,228][15401] Updated weights for policy 0, policy_version 36990 (0.0039) [2024-06-21 18:04:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 606142464. Throughput: 0: 41941.0. Samples: 606267300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-21 18:04:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 18:04:44,613][15401] Updated weights for policy 0, policy_version 37000 (0.0034) [2024-06-21 18:04:48,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 606322688. Throughput: 0: 41880.4. Samples: 606514860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-21 18:04:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-21 18:04:49,466][15401] Updated weights for policy 0, policy_version 37010 (0.0031) [2024-06-21 18:04:52,372][15401] Updated weights for policy 0, policy_version 37020 (0.0041) [2024-06-21 18:04:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 606568448. Throughput: 0: 41704.5. Samples: 606632120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-21 18:04:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 18:04:57,261][15401] Updated weights for policy 0, policy_version 37030 (0.0038) [2024-06-21 18:04:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41710.1). Total num frames: 606748672. Throughput: 0: 41847.6. Samples: 606892440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-21 18:04:58,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 18:05:00,232][15401] Updated weights for policy 0, policy_version 37040 (0.0034) [2024-06-21 18:05:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 606961664. Throughput: 0: 41816.0. Samples: 607135540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 18:05:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 18:05:05,066][15401] Updated weights for policy 0, policy_version 37050 (0.0048) [2024-06-21 18:05:08,268][15401] Updated weights for policy 0, policy_version 37060 (0.0038) [2024-06-21 18:05:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 607191040. Throughput: 0: 41947.6. Samples: 607264460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 18:05:08,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 18:05:12,895][15401] Updated weights for policy 0, policy_version 37070 (0.0034) [2024-06-21 18:05:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 607387648. Throughput: 0: 41927.4. Samples: 607519020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 18:05:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 18:05:15,807][15401] Updated weights for policy 0, policy_version 37080 (0.0041) [2024-06-21 18:05:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41765.7). Total num frames: 607600640. Throughput: 0: 41748.0. Samples: 607761300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-21 18:05:18,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 18:05:20,681][15401] Updated weights for policy 0, policy_version 37090 (0.0041) [2024-06-21 18:05:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 607797248. Throughput: 0: 41801.4. Samples: 607888500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 18:05:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 18:05:24,042][15401] Updated weights for policy 0, policy_version 37100 (0.0033) [2024-06-21 18:05:28,358][15349] Signal inference workers to stop experience collection... (8800 times) [2024-06-21 18:05:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.1, 300 sec: 41543.5). Total num frames: 607977472. Throughput: 0: 41507.1. Samples: 608135120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 18:05:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 18:05:28,401][15401] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-21 18:05:28,477][15349] Signal inference workers to resume experience collection... (8800 times) [2024-06-21 18:05:28,477][15401] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-21 18:05:28,627][15401] Updated weights for policy 0, policy_version 37110 (0.0037) [2024-06-21 18:05:32,177][15401] Updated weights for policy 0, policy_version 37120 (0.0042) [2024-06-21 18:05:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 608190464. Throughput: 0: 41509.9. Samples: 608382800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 18:05:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 18:05:36,384][15401] Updated weights for policy 0, policy_version 37130 (0.0035) [2024-06-21 18:05:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 41231.5, 300 sec: 41709.4). Total num frames: 608419840. Throughput: 0: 41751.2. Samples: 608511020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 18:05:38,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 18:05:39,968][15401] Updated weights for policy 0, policy_version 37140 (0.0030) [2024-06-21 18:05:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 608632832. Throughput: 0: 41479.9. Samples: 608759040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 18:05:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 18:05:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037148_608632832.pth... [2024-06-21 18:05:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036538_598638592.pth [2024-06-21 18:05:43,954][15401] Updated weights for policy 0, policy_version 37150 (0.0040) [2024-06-21 18:05:47,815][15401] Updated weights for policy 0, policy_version 37160 (0.0043) [2024-06-21 18:05:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 608845824. Throughput: 0: 41506.6. Samples: 609003340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 18:05:48,394][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 18:05:51,561][15401] Updated weights for policy 0, policy_version 37170 (0.0045) [2024-06-21 18:05:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 609058816. Throughput: 0: 41475.0. Samples: 609130840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 18:05:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 18:05:55,726][15401] Updated weights for policy 0, policy_version 37180 (0.0024) [2024-06-21 18:05:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 609239040. Throughput: 0: 41409.4. Samples: 609382440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 18:05:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 18:05:59,631][15401] Updated weights for policy 0, policy_version 37190 (0.0029) [2024-06-21 18:06:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 609468416. Throughput: 0: 41589.7. Samples: 609632840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 18:06:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 18:06:03,537][15401] Updated weights for policy 0, policy_version 37200 (0.0034) [2024-06-21 18:06:07,445][15401] Updated weights for policy 0, policy_version 37210 (0.0034) [2024-06-21 18:06:08,391][15132] Fps is (10 sec: 44229.2, 60 sec: 41504.9, 300 sec: 41709.6). Total num frames: 609681408. Throughput: 0: 41662.0. Samples: 609763360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 18:06:08,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 18:06:11,153][15401] Updated weights for policy 0, policy_version 37220 (0.0040) [2024-06-21 18:06:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 41231.4, 300 sec: 41542.8). Total num frames: 609861632. Throughput: 0: 41763.4. Samples: 610014580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:06:13,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 18:06:15,300][15401] Updated weights for policy 0, policy_version 37230 (0.0023) [2024-06-21 18:06:18,389][15132] Fps is (10 sec: 39328.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 610074624. Throughput: 0: 41764.0. Samples: 610262180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:06:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 18:06:19,181][15401] Updated weights for policy 0, policy_version 37240 (0.0037) [2024-06-21 18:06:23,196][15401] Updated weights for policy 0, policy_version 37250 (0.0039) [2024-06-21 18:06:23,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 610320384. Throughput: 0: 41617.3. Samples: 610383700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:06:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 18:06:26,914][15401] Updated weights for policy 0, policy_version 37260 (0.0033) [2024-06-21 18:06:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 610484224. Throughput: 0: 41594.8. Samples: 610630800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:06:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 18:06:30,964][15401] Updated weights for policy 0, policy_version 37270 (0.0033) [2024-06-21 18:06:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 610713600. Throughput: 0: 41742.6. Samples: 610881760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:06:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 18:06:34,159][15349] Signal inference workers to stop experience collection... (8850 times) [2024-06-21 18:06:34,165][15349] Signal inference workers to resume experience collection... (8850 times) [2024-06-21 18:06:34,173][15401] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-21 18:06:34,200][15401] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-21 18:06:34,707][15401] Updated weights for policy 0, policy_version 37280 (0.0036) [2024-06-21 18:06:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 41780.9, 300 sec: 41654.2). Total num frames: 610926592. Throughput: 0: 41795.6. Samples: 611011640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:06:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 18:06:38,697][15401] Updated weights for policy 0, policy_version 37290 (0.0031) [2024-06-21 18:06:42,396][15401] Updated weights for policy 0, policy_version 37300 (0.0023) [2024-06-21 18:06:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 611139584. Throughput: 0: 41803.9. Samples: 611263620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:06:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 18:06:46,420][15401] Updated weights for policy 0, policy_version 37310 (0.0045) [2024-06-21 18:06:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 611368960. Throughput: 0: 41720.9. Samples: 611510280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:06:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 18:06:50,877][15401] Updated weights for policy 0, policy_version 37320 (0.0033) [2024-06-21 18:06:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41654.9). Total num frames: 611549184. Throughput: 0: 41710.4. Samples: 611640260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:06:53,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-21 18:06:54,257][15401] Updated weights for policy 0, policy_version 37330 (0.0052) [2024-06-21 18:06:58,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 611745792. Throughput: 0: 41646.7. Samples: 611888580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:06:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 18:06:59,016][15401] Updated weights for policy 0, policy_version 37340 (0.0027) [2024-06-21 18:07:02,200][15401] Updated weights for policy 0, policy_version 37350 (0.0040) [2024-06-21 18:07:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 611975168. Throughput: 0: 41557.7. Samples: 612132280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 18:07:03,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-21 18:07:06,874][15401] Updated weights for policy 0, policy_version 37360 (0.0033) [2024-06-21 18:07:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41507.3, 300 sec: 41654.2). Total num frames: 612171776. Throughput: 0: 41820.4. Samples: 612265620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 18:07:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 18:07:09,956][15401] Updated weights for policy 0, policy_version 37370 (0.0037) [2024-06-21 18:07:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 612384768. Throughput: 0: 41809.3. Samples: 612512220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 18:07:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 18:07:14,458][15401] Updated weights for policy 0, policy_version 37380 (0.0035) [2024-06-21 18:07:17,619][15401] Updated weights for policy 0, policy_version 37390 (0.0042) [2024-06-21 18:07:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 612614144. Throughput: 0: 41776.5. Samples: 612761700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 18:07:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 18:07:22,336][15401] Updated weights for policy 0, policy_version 37400 (0.0036) [2024-06-21 18:07:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 612810752. Throughput: 0: 41828.8. Samples: 612893940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-21 18:07:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 18:07:25,438][15401] Updated weights for policy 0, policy_version 37410 (0.0023) [2024-06-21 18:07:28,394][15132] Fps is (10 sec: 40942.6, 60 sec: 42322.3, 300 sec: 41764.7). Total num frames: 613023744. Throughput: 0: 41783.7. Samples: 613144060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:07:28,394][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 18:07:30,096][15401] Updated weights for policy 0, policy_version 37420 (0.0039) [2024-06-21 18:07:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 41765.0). Total num frames: 613236736. Throughput: 0: 41840.0. Samples: 613393180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:07:33,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 18:07:33,606][15401] Updated weights for policy 0, policy_version 37430 (0.0037) [2024-06-21 18:07:37,715][15401] Updated weights for policy 0, policy_version 37440 (0.0031) [2024-06-21 18:07:38,389][15132] Fps is (10 sec: 39338.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 613416960. Throughput: 0: 41830.3. Samples: 613522620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:07:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 18:07:41,589][15401] Updated weights for policy 0, policy_version 37450 (0.0029) [2024-06-21 18:07:43,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42050.6, 300 sec: 41820.9). Total num frames: 613662720. Throughput: 0: 41818.3. Samples: 613770500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:07:43,392][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 18:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037455_613662720.pth... [2024-06-21 18:07:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000036844_603652096.pth [2024-06-21 18:07:45,787][15401] Updated weights for policy 0, policy_version 37460 (0.0043) [2024-06-21 18:07:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 41504.5, 300 sec: 41709.4). Total num frames: 613859328. Throughput: 0: 41781.4. Samples: 614012540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:07:48,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 18:07:49,712][15401] Updated weights for policy 0, policy_version 37470 (0.0034) [2024-06-21 18:07:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 614055936. Throughput: 0: 41670.3. Samples: 614140780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:07:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 18:07:53,714][15401] Updated weights for policy 0, policy_version 37480 (0.0037) [2024-06-21 18:07:57,604][15401] Updated weights for policy 0, policy_version 37490 (0.0027) [2024-06-21 18:07:58,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 614285312. Throughput: 0: 41835.4. Samples: 614394820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:07:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 18:08:01,227][15401] Updated weights for policy 0, policy_version 37500 (0.0032) [2024-06-21 18:08:02,585][15349] Signal inference workers to stop experience collection... (8900 times) [2024-06-21 18:08:02,585][15349] Signal inference workers to resume experience collection... (8900 times) [2024-06-21 18:08:02,612][15401] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-21 18:08:02,612][15401] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-21 18:08:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 614481920. Throughput: 0: 41813.8. Samples: 614643320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:08:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 18:08:05,513][15401] Updated weights for policy 0, policy_version 37510 (0.0041) [2024-06-21 18:08:08,392][15132] Fps is (10 sec: 42589.5, 60 sec: 42323.8, 300 sec: 41820.5). Total num frames: 614711296. Throughput: 0: 41774.9. Samples: 614773900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:08:08,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 18:08:08,833][15401] Updated weights for policy 0, policy_version 37520 (0.0044) [2024-06-21 18:08:13,334][15401] Updated weights for policy 0, policy_version 37530 (0.0032) [2024-06-21 18:08:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 614891520. Throughput: 0: 41680.7. Samples: 615019520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:08:13,396][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 18:08:16,637][15401] Updated weights for policy 0, policy_version 37540 (0.0030) [2024-06-21 18:08:18,390][15132] Fps is (10 sec: 40968.8, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 615120896. Throughput: 0: 41755.9. Samples: 615272100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:08:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 18:08:20,935][15401] Updated weights for policy 0, policy_version 37550 (0.0029) [2024-06-21 18:08:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 615317504. Throughput: 0: 41711.1. Samples: 615399620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:08:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 18:08:24,319][15401] Updated weights for policy 0, policy_version 37560 (0.0044) [2024-06-21 18:08:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41509.0, 300 sec: 41709.8). Total num frames: 615514112. Throughput: 0: 41663.9. Samples: 615645280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:08:28,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-21 18:08:29,069][15401] Updated weights for policy 0, policy_version 37570 (0.0032) [2024-06-21 18:08:32,348][15401] Updated weights for policy 0, policy_version 37580 (0.0039) [2024-06-21 18:08:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41507.8, 300 sec: 41709.8). Total num frames: 615727104. Throughput: 0: 41903.1. Samples: 615898080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:08:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 18:08:36,870][15401] Updated weights for policy 0, policy_version 37590 (0.0029) [2024-06-21 18:08:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 615923712. Throughput: 0: 41811.1. Samples: 616022280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 18:08:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 18:08:40,125][15401] Updated weights for policy 0, policy_version 37600 (0.0033) [2024-06-21 18:08:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41780.9, 300 sec: 41820.8). Total num frames: 616169472. Throughput: 0: 41698.8. Samples: 616271260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 18:08:43,396][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 18:08:44,484][15401] Updated weights for policy 0, policy_version 37610 (0.0039) [2024-06-21 18:08:47,907][15401] Updated weights for policy 0, policy_version 37620 (0.0041) [2024-06-21 18:08:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41780.8, 300 sec: 41765.3). Total num frames: 616366080. Throughput: 0: 41688.4. Samples: 616519300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 18:08:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 18:08:52,168][15401] Updated weights for policy 0, policy_version 37630 (0.0043) [2024-06-21 18:08:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 616562688. Throughput: 0: 41596.3. Samples: 616645640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 18:08:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 18:08:55,818][15401] Updated weights for policy 0, policy_version 37640 (0.0033) [2024-06-21 18:08:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 616792064. Throughput: 0: 41852.7. Samples: 616902880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 18:08:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 18:09:00,298][15401] Updated weights for policy 0, policy_version 37650 (0.0033) [2024-06-21 18:09:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42050.5, 300 sec: 41820.5). Total num frames: 617005056. Throughput: 0: 41807.6. Samples: 617153540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 18:09:03,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 18:09:03,793][15401] Updated weights for policy 0, policy_version 37660 (0.0040) [2024-06-21 18:09:07,970][15401] Updated weights for policy 0, policy_version 37670 (0.0033) [2024-06-21 18:09:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41234.5, 300 sec: 41654.2). Total num frames: 617185280. Throughput: 0: 41673.7. Samples: 617274940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 18:09:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 18:09:11,563][15401] Updated weights for policy 0, policy_version 37680 (0.0038) [2024-06-21 18:09:13,390][15132] Fps is (10 sec: 39330.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 617398272. Throughput: 0: 41816.8. Samples: 617527040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 18:09:13,396][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 18:09:15,664][15401] Updated weights for policy 0, policy_version 37690 (0.0038) [2024-06-21 18:09:17,630][15349] Signal inference workers to stop experience collection... (8950 times) [2024-06-21 18:09:17,631][15349] Signal inference workers to resume experience collection... (8950 times) [2024-06-21 18:09:17,664][15401] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-21 18:09:17,664][15401] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-21 18:09:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 617594880. Throughput: 0: 41720.0. Samples: 617775480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 18:09:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 18:09:19,768][15401] Updated weights for policy 0, policy_version 37700 (0.0037) [2024-06-21 18:09:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 617824256. Throughput: 0: 41795.1. Samples: 617903060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 18:09:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 18:09:23,514][15401] Updated weights for policy 0, policy_version 37710 (0.0046) [2024-06-21 18:09:27,373][15401] Updated weights for policy 0, policy_version 37720 (0.0050) [2024-06-21 18:09:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.6, 300 sec: 41709.4). Total num frames: 618020864. Throughput: 0: 41661.8. Samples: 618146140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 18:09:28,393][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 18:09:31,376][15401] Updated weights for policy 0, policy_version 37730 (0.0052) [2024-06-21 18:09:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 618233856. Throughput: 0: 41865.3. Samples: 618403240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 18:09:33,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 18:09:35,396][15401] Updated weights for policy 0, policy_version 37740 (0.0038) [2024-06-21 18:09:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 618446848. Throughput: 0: 41735.0. Samples: 618523720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 18:09:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 18:09:39,154][15401] Updated weights for policy 0, policy_version 37750 (0.0044) [2024-06-21 18:09:43,221][15401] Updated weights for policy 0, policy_version 37760 (0.0036) [2024-06-21 18:09:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 618659840. Throughput: 0: 41486.1. Samples: 618769760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 18:09:43,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 18:09:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037760_618659840.pth... [2024-06-21 18:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037148_608632832.pth [2024-06-21 18:09:47,076][15401] Updated weights for policy 0, policy_version 37770 (0.0036) [2024-06-21 18:09:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41231.4, 300 sec: 41598.4). Total num frames: 618840064. Throughput: 0: 41640.9. Samples: 619027380. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 18:09:48,392][15132] Avg episode reward: [(0, '0.288')] [2024-06-21 18:09:51,308][15401] Updated weights for policy 0, policy_version 37780 (0.0023) [2024-06-21 18:09:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 619069440. Throughput: 0: 41626.3. Samples: 619148120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:09:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 18:09:54,735][15401] Updated weights for policy 0, policy_version 37790 (0.0032) [2024-06-21 18:09:58,390][15132] Fps is (10 sec: 44247.2, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 619282432. Throughput: 0: 41615.2. Samples: 619399720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:09:58,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 18:09:59,532][15401] Updated weights for policy 0, policy_version 37800 (0.0036) [2024-06-21 18:10:02,640][15401] Updated weights for policy 0, policy_version 37810 (0.0038) [2024-06-21 18:10:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41234.8, 300 sec: 41654.2). Total num frames: 619479040. Throughput: 0: 41695.6. Samples: 619651780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:10:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 18:10:07,206][15401] Updated weights for policy 0, policy_version 37820 (0.0042) [2024-06-21 18:10:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 619692032. Throughput: 0: 41631.1. Samples: 619776460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:10:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 18:10:10,599][15401] Updated weights for policy 0, policy_version 37830 (0.0024) [2024-06-21 18:10:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 619921408. Throughput: 0: 41897.3. Samples: 620031420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:10:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 18:10:14,816][15401] Updated weights for policy 0, policy_version 37840 (0.0038) [2024-06-21 18:10:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 620118016. Throughput: 0: 41735.7. Samples: 620281340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 18:10:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 18:10:18,481][15401] Updated weights for policy 0, policy_version 37850 (0.0043) [2024-06-21 18:10:22,704][15401] Updated weights for policy 0, policy_version 37860 (0.0035) [2024-06-21 18:10:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 620314624. Throughput: 0: 41789.3. Samples: 620404240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 18:10:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 18:10:26,378][15401] Updated weights for policy 0, policy_version 37870 (0.0034) [2024-06-21 18:10:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 620544000. Throughput: 0: 42046.3. Samples: 620661840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 18:10:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 18:10:30,735][15401] Updated weights for policy 0, policy_version 37880 (0.0034) [2024-06-21 18:10:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 41821.2). Total num frames: 620756992. Throughput: 0: 41824.0. Samples: 620909360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 18:10:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 18:10:34,130][15401] Updated weights for policy 0, policy_version 37890 (0.0041) [2024-06-21 18:10:38,333][15401] Updated weights for policy 0, policy_version 37900 (0.0045) [2024-06-21 18:10:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 620953600. Throughput: 0: 41924.9. Samples: 621034740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-21 18:10:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 18:10:41,975][15401] Updated weights for policy 0, policy_version 37910 (0.0031) [2024-06-21 18:10:42,928][15349] Signal inference workers to stop experience collection... (9000 times) [2024-06-21 18:10:42,958][15401] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-21 18:10:42,981][15349] Signal inference workers to resume experience collection... (9000 times) [2024-06-21 18:10:42,986][15401] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-21 18:10:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 621166592. Throughput: 0: 41870.7. Samples: 621283900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 18:10:43,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-21 18:10:46,120][15401] Updated weights for policy 0, policy_version 37920 (0.0047) [2024-06-21 18:10:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42053.9, 300 sec: 41709.8). Total num frames: 621363200. Throughput: 0: 41879.9. Samples: 621536380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 18:10:48,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-21 18:10:49,900][15401] Updated weights for policy 0, policy_version 37930 (0.0039) [2024-06-21 18:10:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 621559808. Throughput: 0: 41799.2. Samples: 621657420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 18:10:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 18:10:54,011][15401] Updated weights for policy 0, policy_version 37940 (0.0035) [2024-06-21 18:10:57,643][15401] Updated weights for policy 0, policy_version 37950 (0.0034) [2024-06-21 18:10:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 621789184. Throughput: 0: 41808.4. Samples: 621912800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 18:10:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 18:11:01,818][15401] Updated weights for policy 0, policy_version 37960 (0.0034) [2024-06-21 18:11:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41710.0). Total num frames: 621985792. Throughput: 0: 41789.8. Samples: 622161880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 18:11:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 18:11:05,480][15401] Updated weights for policy 0, policy_version 37970 (0.0031) [2024-06-21 18:11:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41821.2). Total num frames: 622198784. Throughput: 0: 41731.6. Samples: 622282160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 18:11:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 18:11:09,384][15401] Updated weights for policy 0, policy_version 37980 (0.0046) [2024-06-21 18:11:13,265][15401] Updated weights for policy 0, policy_version 37990 (0.0031) [2024-06-21 18:11:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 622428160. Throughput: 0: 41620.9. Samples: 622534780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 18:11:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 18:11:17,431][15401] Updated weights for policy 0, policy_version 38000 (0.0035) [2024-06-21 18:11:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 622608384. Throughput: 0: 41580.8. Samples: 622780500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 18:11:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:11:21,232][15401] Updated weights for policy 0, policy_version 38010 (0.0031) [2024-06-21 18:11:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 622837760. Throughput: 0: 41600.0. Samples: 622906740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 18:11:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 18:11:25,367][15401] Updated weights for policy 0, policy_version 38020 (0.0031) [2024-06-21 18:11:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 623017984. Throughput: 0: 41743.9. Samples: 623162380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 18:11:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 18:11:29,223][15401] Updated weights for policy 0, policy_version 38030 (0.0043) [2024-06-21 18:11:33,267][15401] Updated weights for policy 0, policy_version 38040 (0.0049) [2024-06-21 18:11:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 623247360. Throughput: 0: 41645.3. Samples: 623410420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 18:11:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 18:11:37,170][15401] Updated weights for policy 0, policy_version 38050 (0.0038) [2024-06-21 18:11:38,389][15132] Fps is (10 sec: 44237.9, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 623460352. Throughput: 0: 41643.1. Samples: 623531360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 18:11:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 18:11:41,262][15401] Updated weights for policy 0, policy_version 38060 (0.0043) [2024-06-21 18:11:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 623640576. Throughput: 0: 41531.6. Samples: 623781720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 18:11:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 18:11:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038065_623656960.pth... [2024-06-21 18:11:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037455_613662720.pth [2024-06-21 18:11:45,072][15401] Updated weights for policy 0, policy_version 38070 (0.0041) [2024-06-21 18:11:48,395][15132] Fps is (10 sec: 42575.6, 60 sec: 42048.6, 300 sec: 41820.1). Total num frames: 623886336. Throughput: 0: 41510.2. Samples: 624030060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-21 18:11:48,395][15132] Avg episode reward: [(0, '0.266')] [2024-06-21 18:11:49,184][15401] Updated weights for policy 0, policy_version 38080 (0.0039) [2024-06-21 18:11:52,950][15401] Updated weights for policy 0, policy_version 38090 (0.0035) [2024-06-21 18:11:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 624082944. Throughput: 0: 41613.7. Samples: 624154780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:11:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 18:11:57,168][15401] Updated weights for policy 0, policy_version 38100 (0.0033) [2024-06-21 18:11:58,390][15132] Fps is (10 sec: 39342.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 624279552. Throughput: 0: 41674.6. Samples: 624410140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:11:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 18:12:00,580][15401] Updated weights for policy 0, policy_version 38110 (0.0032) [2024-06-21 18:12:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 624508928. Throughput: 0: 41661.9. Samples: 624655280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:12:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 18:12:04,906][15401] Updated weights for policy 0, policy_version 38120 (0.0038) [2024-06-21 18:12:08,307][15401] Updated weights for policy 0, policy_version 38130 (0.0033) [2024-06-21 18:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 624721920. Throughput: 0: 41696.4. Samples: 624783080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:12:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 18:12:12,818][15401] Updated weights for policy 0, policy_version 38140 (0.0033) [2024-06-21 18:12:13,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41231.4, 300 sec: 41653.9). Total num frames: 624902144. Throughput: 0: 41585.5. Samples: 625033820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 18:12:13,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 18:12:15,657][15349] Signal inference workers to stop experience collection... (9050 times) [2024-06-21 18:12:15,710][15401] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-21 18:12:15,710][15349] Signal inference workers to resume experience collection... (9050 times) [2024-06-21 18:12:15,725][15401] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-21 18:12:16,306][15401] Updated weights for policy 0, policy_version 38150 (0.0043) [2024-06-21 18:12:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 625098752. Throughput: 0: 41612.6. Samples: 625282980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:12:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 18:12:20,631][15401] Updated weights for policy 0, policy_version 38160 (0.0037) [2024-06-21 18:12:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41233.1, 300 sec: 41654.8). Total num frames: 625311744. Throughput: 0: 41731.0. Samples: 625409260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:12:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 18:12:24,165][15401] Updated weights for policy 0, policy_version 38170 (0.0039) [2024-06-21 18:12:28,219][15401] Updated weights for policy 0, policy_version 38180 (0.0040) [2024-06-21 18:12:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.4, 300 sec: 41710.1). Total num frames: 625541120. Throughput: 0: 41740.9. Samples: 625660060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:12:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 18:12:32,012][15401] Updated weights for policy 0, policy_version 38190 (0.0043) [2024-06-21 18:12:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 625737728. Throughput: 0: 41850.3. Samples: 625913100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:12:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 18:12:35,905][15401] Updated weights for policy 0, policy_version 38200 (0.0046) [2024-06-21 18:12:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41654.6). Total num frames: 625950720. Throughput: 0: 41732.1. Samples: 626032720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:12:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 18:12:39,998][15401] Updated weights for policy 0, policy_version 38210 (0.0032) [2024-06-21 18:12:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41765.7). Total num frames: 626180096. Throughput: 0: 41671.2. Samples: 626285340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 18:12:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 18:12:43,711][15401] Updated weights for policy 0, policy_version 38220 (0.0037) [2024-06-21 18:12:47,781][15401] Updated weights for policy 0, policy_version 38230 (0.0034) [2024-06-21 18:12:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41509.7, 300 sec: 41765.3). Total num frames: 626376704. Throughput: 0: 41867.4. Samples: 626539320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 18:12:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 18:12:51,326][15401] Updated weights for policy 0, policy_version 38240 (0.0042) [2024-06-21 18:12:53,396][15132] Fps is (10 sec: 39295.8, 60 sec: 41501.7, 300 sec: 41653.3). Total num frames: 626573312. Throughput: 0: 41739.8. Samples: 626661640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 18:12:53,397][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 18:12:55,826][15401] Updated weights for policy 0, policy_version 38250 (0.0035) [2024-06-21 18:12:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 626786304. Throughput: 0: 41692.0. Samples: 626909860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 18:12:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 18:12:59,350][15401] Updated weights for policy 0, policy_version 38260 (0.0033) [2024-06-21 18:13:03,389][15132] Fps is (10 sec: 40986.8, 60 sec: 41233.0, 300 sec: 41599.0). Total num frames: 626982912. Throughput: 0: 41933.7. Samples: 627170000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-21 18:13:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 18:13:03,787][15401] Updated weights for policy 0, policy_version 38270 (0.0041) [2024-06-21 18:13:07,127][15401] Updated weights for policy 0, policy_version 38280 (0.0033) [2024-06-21 18:13:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 627212288. Throughput: 0: 41819.5. Samples: 627291140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:13:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 18:13:11,609][15401] Updated weights for policy 0, policy_version 38290 (0.0042) [2024-06-21 18:13:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42053.9, 300 sec: 41709.8). Total num frames: 627425280. Throughput: 0: 41792.4. Samples: 627540720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:13:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 18:13:15,146][15401] Updated weights for policy 0, policy_version 38300 (0.0034) [2024-06-21 18:13:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 627605504. Throughput: 0: 41936.8. Samples: 627800260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:13:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 18:13:19,362][15401] Updated weights for policy 0, policy_version 38310 (0.0037) [2024-06-21 18:13:22,836][15401] Updated weights for policy 0, policy_version 38320 (0.0032) [2024-06-21 18:13:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 627851264. Throughput: 0: 41844.9. Samples: 627915740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:13:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:13:27,049][15401] Updated weights for policy 0, policy_version 38330 (0.0033) [2024-06-21 18:13:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 628047872. Throughput: 0: 41944.4. Samples: 628172840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:13:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:13:30,523][15401] Updated weights for policy 0, policy_version 38340 (0.0028) [2024-06-21 18:13:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 628244480. Throughput: 0: 42050.2. Samples: 628431580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 18:13:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 18:13:34,706][15401] Updated weights for policy 0, policy_version 38350 (0.0042) [2024-06-21 18:13:38,349][15401] Updated weights for policy 0, policy_version 38360 (0.0035) [2024-06-21 18:13:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 628490240. Throughput: 0: 42042.9. Samples: 628553300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 18:13:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 18:13:42,570][15401] Updated weights for policy 0, policy_version 38370 (0.0047) [2024-06-21 18:13:43,385][15349] Signal inference workers to stop experience collection... (9100 times) [2024-06-21 18:13:43,386][15349] Signal inference workers to resume experience collection... (9100 times) [2024-06-21 18:13:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 628686848. Throughput: 0: 42215.1. Samples: 628809540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 18:13:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 18:13:43,415][15401] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-21 18:13:43,446][15401] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-21 18:13:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038373_628703232.pth... [2024-06-21 18:13:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000037760_618659840.pth [2024-06-21 18:13:46,258][15401] Updated weights for policy 0, policy_version 38380 (0.0046) [2024-06-21 18:13:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 628883456. Throughput: 0: 41944.4. Samples: 629057500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 18:13:48,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 18:13:50,536][15401] Updated weights for policy 0, policy_version 38390 (0.0039) [2024-06-21 18:13:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42330.0, 300 sec: 41765.3). Total num frames: 629112832. Throughput: 0: 42048.6. Samples: 629183320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-21 18:13:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 18:13:53,918][15401] Updated weights for policy 0, policy_version 38400 (0.0031) [2024-06-21 18:13:58,305][15401] Updated weights for policy 0, policy_version 38410 (0.0036) [2024-06-21 18:13:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41710.1). Total num frames: 629309440. Throughput: 0: 42262.3. Samples: 629442520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 18:13:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 18:14:01,796][15401] Updated weights for policy 0, policy_version 38420 (0.0037) [2024-06-21 18:14:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 629522432. Throughput: 0: 41897.7. Samples: 629685660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 18:14:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 18:14:06,277][15401] Updated weights for policy 0, policy_version 38430 (0.0038) [2024-06-21 18:14:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 629719040. Throughput: 0: 42290.1. Samples: 629818800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 18:14:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 18:14:09,579][15401] Updated weights for policy 0, policy_version 38440 (0.0035) [2024-06-21 18:14:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 629932032. Throughput: 0: 42019.6. Samples: 630063720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 18:14:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 18:14:13,963][15401] Updated weights for policy 0, policy_version 38450 (0.0029) [2024-06-21 18:14:17,349][15401] Updated weights for policy 0, policy_version 38460 (0.0033) [2024-06-21 18:14:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 41820.8). Total num frames: 630161408. Throughput: 0: 41832.9. Samples: 630314060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 18:14:18,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-21 18:14:21,730][15401] Updated weights for policy 0, policy_version 38470 (0.0037) [2024-06-21 18:14:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41765.7). Total num frames: 630341632. Throughput: 0: 41962.8. Samples: 630441620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:14:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 18:14:25,098][15401] Updated weights for policy 0, policy_version 38480 (0.0033) [2024-06-21 18:14:28,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 630554624. Throughput: 0: 41776.6. Samples: 630689480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:14:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 18:14:29,586][15401] Updated weights for policy 0, policy_version 38490 (0.0037) [2024-06-21 18:14:32,832][15401] Updated weights for policy 0, policy_version 38500 (0.0034) [2024-06-21 18:14:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 630784000. Throughput: 0: 41894.3. Samples: 630942740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:14:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 18:14:37,538][15401] Updated weights for policy 0, policy_version 38510 (0.0041) [2024-06-21 18:14:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 630980608. Throughput: 0: 41926.1. Samples: 631070000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:14:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 18:14:40,634][15401] Updated weights for policy 0, policy_version 38520 (0.0037) [2024-06-21 18:14:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41876.7). Total num frames: 631193600. Throughput: 0: 41659.4. Samples: 631317200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:14:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 18:14:45,333][15401] Updated weights for policy 0, policy_version 38530 (0.0029) [2024-06-21 18:14:48,392][15132] Fps is (10 sec: 44224.6, 60 sec: 42323.4, 300 sec: 41876.0). Total num frames: 631422976. Throughput: 0: 41793.5. Samples: 631566480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 18:14:48,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 18:14:48,644][15401] Updated weights for policy 0, policy_version 38540 (0.0033) [2024-06-21 18:14:53,158][15401] Updated weights for policy 0, policy_version 38550 (0.0039) [2024-06-21 18:14:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41505.9, 300 sec: 41765.3). Total num frames: 631603200. Throughput: 0: 41623.4. Samples: 631691860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 18:14:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 18:14:56,512][15401] Updated weights for policy 0, policy_version 38560 (0.0038) [2024-06-21 18:14:58,390][15132] Fps is (10 sec: 40971.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 631832576. Throughput: 0: 41668.8. Samples: 631938820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 18:14:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 18:15:00,956][15401] Updated weights for policy 0, policy_version 38570 (0.0040) [2024-06-21 18:15:03,392][15132] Fps is (10 sec: 42589.1, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 632029184. Throughput: 0: 41732.1. Samples: 632192100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 18:15:03,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 18:15:04,171][15401] Updated weights for policy 0, policy_version 38580 (0.0028) [2024-06-21 18:15:08,390][15132] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 632209408. Throughput: 0: 41575.9. Samples: 632312540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-21 18:15:08,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 18:15:09,121][15401] Updated weights for policy 0, policy_version 38590 (0.0051) [2024-06-21 18:15:12,490][15349] Signal inference workers to stop experience collection... (9150 times) [2024-06-21 18:15:12,491][15349] Signal inference workers to resume experience collection... (9150 times) [2024-06-21 18:15:12,498][15401] Updated weights for policy 0, policy_version 38600 (0.0038) [2024-06-21 18:15:12,506][15401] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-21 18:15:12,507][15401] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-21 18:15:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 632455168. Throughput: 0: 41634.6. Samples: 632563040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 18:15:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 18:15:16,941][15401] Updated weights for policy 0, policy_version 38610 (0.0043) [2024-06-21 18:15:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 632635392. Throughput: 0: 41651.1. Samples: 632817040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 18:15:18,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 18:15:20,233][15401] Updated weights for policy 0, policy_version 38620 (0.0031) [2024-06-21 18:15:23,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 632832000. Throughput: 0: 41499.1. Samples: 632937460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 18:15:23,396][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 18:15:24,612][15401] Updated weights for policy 0, policy_version 38630 (0.0042) [2024-06-21 18:15:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41777.4, 300 sec: 41709.4). Total num frames: 633061376. Throughput: 0: 41543.2. Samples: 633186740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 18:15:28,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 18:15:28,433][15401] Updated weights for policy 0, policy_version 38640 (0.0030) [2024-06-21 18:15:32,966][15401] Updated weights for policy 0, policy_version 38650 (0.0043) [2024-06-21 18:15:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 633257984. Throughput: 0: 41680.4. Samples: 633441980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-21 18:15:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 18:15:36,279][15401] Updated weights for policy 0, policy_version 38660 (0.0032) [2024-06-21 18:15:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 633487360. Throughput: 0: 41584.5. Samples: 633563160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 18:15:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 18:15:40,815][15401] Updated weights for policy 0, policy_version 38670 (0.0044) [2024-06-21 18:15:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 633700352. Throughput: 0: 41624.6. Samples: 633811920. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 18:15:43,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 18:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038678_633700352.pth... [2024-06-21 18:15:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038065_623656960.pth [2024-06-21 18:15:44,080][15401] Updated weights for policy 0, policy_version 38680 (0.0040) [2024-06-21 18:15:48,389][15132] Fps is (10 sec: 37683.9, 60 sec: 40688.9, 300 sec: 41709.8). Total num frames: 633864192. Throughput: 0: 41633.8. Samples: 634065520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 18:15:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-21 18:15:48,615][15401] Updated weights for policy 0, policy_version 38690 (0.0029) [2024-06-21 18:15:52,045][15401] Updated weights for policy 0, policy_version 38700 (0.0039) [2024-06-21 18:15:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 634126336. Throughput: 0: 41504.3. Samples: 634180240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 18:15:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 18:15:56,847][15401] Updated weights for policy 0, policy_version 38710 (0.0024) [2024-06-21 18:15:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 634322944. Throughput: 0: 41562.6. Samples: 634433360. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 18:15:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 18:15:59,784][15401] Updated weights for policy 0, policy_version 38720 (0.0038) [2024-06-21 18:16:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41234.7, 300 sec: 41709.8). Total num frames: 634503168. Throughput: 0: 41511.4. Samples: 634685060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 18:16:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 18:16:04,474][15401] Updated weights for policy 0, policy_version 38730 (0.0030) [2024-06-21 18:16:07,535][15401] Updated weights for policy 0, policy_version 38740 (0.0036) [2024-06-21 18:16:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 634732544. Throughput: 0: 41411.9. Samples: 634801000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 18:16:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 18:16:12,267][15401] Updated weights for policy 0, policy_version 38750 (0.0031) [2024-06-21 18:16:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 634912768. Throughput: 0: 41576.9. Samples: 635057600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 18:16:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 18:16:15,429][15401] Updated weights for policy 0, policy_version 38760 (0.0047) [2024-06-21 18:16:18,396][15132] Fps is (10 sec: 40934.3, 60 sec: 41774.7, 300 sec: 41708.9). Total num frames: 635142144. Throughput: 0: 41337.7. Samples: 635302440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 18:16:18,396][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 18:16:19,987][15401] Updated weights for policy 0, policy_version 38770 (0.0034) [2024-06-21 18:16:23,263][15401] Updated weights for policy 0, policy_version 38780 (0.0037) [2024-06-21 18:16:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 635371520. Throughput: 0: 41568.0. Samples: 635433720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 18:16:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 18:16:28,003][15401] Updated weights for policy 0, policy_version 38790 (0.0031) [2024-06-21 18:16:28,392][15132] Fps is (10 sec: 39337.1, 60 sec: 41233.1, 300 sec: 41653.9). Total num frames: 635535360. Throughput: 0: 41583.9. Samples: 635683300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:16:28,393][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 18:16:31,186][15401] Updated weights for policy 0, policy_version 38800 (0.0049) [2024-06-21 18:16:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 635781120. Throughput: 0: 41331.1. Samples: 635925420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:16:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 18:16:36,139][15401] Updated weights for policy 0, policy_version 38810 (0.0037) [2024-06-21 18:16:38,396][15132] Fps is (10 sec: 44219.0, 60 sec: 41501.8, 300 sec: 41819.9). Total num frames: 635977728. Throughput: 0: 41561.7. Samples: 636050780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:16:38,396][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 18:16:39,059][15401] Updated weights for policy 0, policy_version 38820 (0.0049) [2024-06-21 18:16:43,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40959.9, 300 sec: 41599.4). Total num frames: 636157952. Throughput: 0: 41486.6. Samples: 636300260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:16:43,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 18:16:44,005][15401] Updated weights for policy 0, policy_version 38830 (0.0032) [2024-06-21 18:16:46,482][15349] Signal inference workers to stop experience collection... (9200 times) [2024-06-21 18:16:46,488][15349] Signal inference workers to resume experience collection... (9200 times) [2024-06-21 18:16:46,531][15401] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-21 18:16:46,531][15401] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-21 18:16:47,272][15401] Updated weights for policy 0, policy_version 38840 (0.0034) [2024-06-21 18:16:48,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 636387328. Throughput: 0: 41328.9. Samples: 636544860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 18:16:48,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-21 18:16:51,863][15401] Updated weights for policy 0, policy_version 38850 (0.0029) [2024-06-21 18:16:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 636600320. Throughput: 0: 41640.5. Samples: 636674820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 18:16:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 18:16:55,243][15401] Updated weights for policy 0, policy_version 38860 (0.0042) [2024-06-21 18:16:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 636780544. Throughput: 0: 41304.7. Samples: 636916320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 18:16:58,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-21 18:16:59,634][15401] Updated weights for policy 0, policy_version 38870 (0.0039) [2024-06-21 18:17:02,977][15401] Updated weights for policy 0, policy_version 38880 (0.0047) [2024-06-21 18:17:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 637009920. Throughput: 0: 41423.2. Samples: 637166220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 18:17:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 18:17:07,451][15401] Updated weights for policy 0, policy_version 38890 (0.0032) [2024-06-21 18:17:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 41765.6). Total num frames: 637222912. Throughput: 0: 41404.4. Samples: 637296920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 18:17:08,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 18:17:11,154][15401] Updated weights for policy 0, policy_version 38900 (0.0044) [2024-06-21 18:17:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 637419520. Throughput: 0: 41248.5. Samples: 637539380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 18:17:13,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-21 18:17:15,251][15401] Updated weights for policy 0, policy_version 38910 (0.0027) [2024-06-21 18:17:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41510.6, 300 sec: 41765.3). Total num frames: 637632512. Throughput: 0: 41546.3. Samples: 637795000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:17:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 18:17:18,898][15401] Updated weights for policy 0, policy_version 38920 (0.0041) [2024-06-21 18:17:22,944][15401] Updated weights for policy 0, policy_version 38930 (0.0036) [2024-06-21 18:17:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 637845504. Throughput: 0: 41603.4. Samples: 637922660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:17:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 18:17:26,671][15401] Updated weights for policy 0, policy_version 38940 (0.0041) [2024-06-21 18:17:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 638042112. Throughput: 0: 41541.8. Samples: 638169640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:17:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-21 18:17:31,257][15401] Updated weights for policy 0, policy_version 38950 (0.0042) [2024-06-21 18:17:33,391][15132] Fps is (10 sec: 40952.9, 60 sec: 41232.0, 300 sec: 41709.5). Total num frames: 638255104. Throughput: 0: 41716.4. Samples: 638422160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:17:33,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 18:17:34,689][15401] Updated weights for policy 0, policy_version 38960 (0.0028) [2024-06-21 18:17:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41237.6, 300 sec: 41598.7). Total num frames: 638451712. Throughput: 0: 41646.8. Samples: 638548920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:17:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 18:17:38,980][15401] Updated weights for policy 0, policy_version 38970 (0.0030) [2024-06-21 18:17:42,662][15401] Updated weights for policy 0, policy_version 38980 (0.0034) [2024-06-21 18:17:43,390][15132] Fps is (10 sec: 40966.3, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 638664704. Throughput: 0: 41797.4. Samples: 638797200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 18:17:43,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 18:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038981_638664704.pth... [2024-06-21 18:17:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038373_628703232.pth [2024-06-21 18:17:46,646][15401] Updated weights for policy 0, policy_version 38990 (0.0043) [2024-06-21 18:17:48,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 41821.8). Total num frames: 638910464. Throughput: 0: 41631.6. Samples: 639039640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 18:17:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 18:17:50,552][15401] Updated weights for policy 0, policy_version 39000 (0.0027) [2024-06-21 18:17:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 41654.2). Total num frames: 639074304. Throughput: 0: 41732.8. Samples: 639174900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 18:17:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 18:17:54,108][15401] Updated weights for policy 0, policy_version 39010 (0.0030) [2024-06-21 18:17:58,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 639287296. Throughput: 0: 41933.2. Samples: 639426380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 18:17:58,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 18:17:58,647][15401] Updated weights for policy 0, policy_version 39020 (0.0036) [2024-06-21 18:18:01,752][15401] Updated weights for policy 0, policy_version 39030 (0.0037) [2024-06-21 18:18:03,390][15132] Fps is (10 sec: 45876.1, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 639533056. Throughput: 0: 41669.7. Samples: 639670140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 18:18:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 18:18:06,194][15401] Updated weights for policy 0, policy_version 39040 (0.0027) [2024-06-21 18:18:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 639713280. Throughput: 0: 41749.3. Samples: 639801380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 18:18:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 18:18:09,888][15401] Updated weights for policy 0, policy_version 39050 (0.0039) [2024-06-21 18:18:13,391][15132] Fps is (10 sec: 39316.7, 60 sec: 41778.3, 300 sec: 41765.1). Total num frames: 639926272. Throughput: 0: 41714.0. Samples: 640046820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 18:18:13,391][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 18:18:14,166][15401] Updated weights for policy 0, policy_version 39060 (0.0024) [2024-06-21 18:18:17,705][15349] Signal inference workers to stop experience collection... (9250 times) [2024-06-21 18:18:17,759][15401] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-21 18:18:17,764][15349] Signal inference workers to resume experience collection... (9250 times) [2024-06-21 18:18:17,774][15401] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-21 18:18:17,943][15401] Updated weights for policy 0, policy_version 39070 (0.0026) [2024-06-21 18:18:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 640155648. Throughput: 0: 41666.0. Samples: 640297060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 18:18:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 18:18:21,894][15401] Updated weights for policy 0, policy_version 39080 (0.0040) [2024-06-21 18:18:23,389][15132] Fps is (10 sec: 40965.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 640335872. Throughput: 0: 41864.8. Samples: 640432840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 18:18:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 18:18:25,559][15401] Updated weights for policy 0, policy_version 39090 (0.0052) [2024-06-21 18:18:28,396][15132] Fps is (10 sec: 39296.0, 60 sec: 41774.7, 300 sec: 41708.9). Total num frames: 640548864. Throughput: 0: 41651.0. Samples: 640671760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-21 18:18:28,397][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 18:18:29,694][15401] Updated weights for policy 0, policy_version 39100 (0.0035) [2024-06-21 18:18:33,179][15401] Updated weights for policy 0, policy_version 39110 (0.0038) [2024-06-21 18:18:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42326.5, 300 sec: 41709.8). Total num frames: 640794624. Throughput: 0: 41953.3. Samples: 640927540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:18:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 18:18:37,791][15401] Updated weights for policy 0, policy_version 39120 (0.0037) [2024-06-21 18:18:38,390][15132] Fps is (10 sec: 39343.6, 60 sec: 41505.4, 300 sec: 41543.0). Total num frames: 640942080. Throughput: 0: 41884.3. Samples: 641059720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:18:38,391][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 18:18:40,779][15401] Updated weights for policy 0, policy_version 39130 (0.0045) [2024-06-21 18:18:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 641187840. Throughput: 0: 41665.9. Samples: 641301340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:18:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 18:18:45,360][15401] Updated weights for policy 0, policy_version 39140 (0.0038) [2024-06-21 18:18:48,389][15132] Fps is (10 sec: 47518.1, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 641417216. Throughput: 0: 41953.0. Samples: 641558020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:18:48,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-21 18:18:48,800][15401] Updated weights for policy 0, policy_version 39150 (0.0038) [2024-06-21 18:18:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 641581056. Throughput: 0: 41838.6. Samples: 641684120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:18:53,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-21 18:18:53,445][15401] Updated weights for policy 0, policy_version 39160 (0.0038) [2024-06-21 18:18:56,769][15401] Updated weights for policy 0, policy_version 39170 (0.0032) [2024-06-21 18:18:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 41765.3). Total num frames: 641843200. Throughput: 0: 41942.9. Samples: 641934200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 18:18:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 18:19:01,246][15401] Updated weights for policy 0, policy_version 39180 (0.0030) [2024-06-21 18:19:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 642039808. Throughput: 0: 42087.9. Samples: 642191020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 18:19:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 18:19:04,215][15401] Updated weights for policy 0, policy_version 39190 (0.0028) [2024-06-21 18:19:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 642220032. Throughput: 0: 41848.4. Samples: 642316020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 18:19:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 18:19:08,987][15401] Updated weights for policy 0, policy_version 39200 (0.0053) [2024-06-21 18:19:11,751][15401] Updated weights for policy 0, policy_version 39210 (0.0036) [2024-06-21 18:19:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.2, 300 sec: 41709.8). Total num frames: 642465792. Throughput: 0: 41999.3. Samples: 642561460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 18:19:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 18:19:16,903][15401] Updated weights for policy 0, policy_version 39220 (0.0031) [2024-06-21 18:19:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 642646016. Throughput: 0: 42134.3. Samples: 642823580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 23.0) [2024-06-21 18:19:18,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-21 18:19:19,530][15401] Updated weights for policy 0, policy_version 39230 (0.0029) [2024-06-21 18:19:23,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 642842624. Throughput: 0: 41855.0. Samples: 642943160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 18:19:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 18:19:24,914][15401] Updated weights for policy 0, policy_version 39240 (0.0030) [2024-06-21 18:19:27,332][15401] Updated weights for policy 0, policy_version 39250 (0.0032) [2024-06-21 18:19:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42329.9, 300 sec: 41709.8). Total num frames: 643088384. Throughput: 0: 42076.0. Samples: 643194760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 18:19:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 18:19:32,757][15401] Updated weights for policy 0, policy_version 39260 (0.0040) [2024-06-21 18:19:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 643252224. Throughput: 0: 42148.8. Samples: 643454720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 18:19:33,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 18:19:33,784][15349] Signal inference workers to stop experience collection... (9300 times) [2024-06-21 18:19:33,808][15401] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-21 18:19:33,847][15349] Signal inference workers to resume experience collection... (9300 times) [2024-06-21 18:19:33,847][15401] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-21 18:19:35,457][15401] Updated weights for policy 0, policy_version 39270 (0.0048) [2024-06-21 18:19:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.0, 300 sec: 41709.8). Total num frames: 643497984. Throughput: 0: 41857.8. Samples: 643567720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 18:19:38,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 18:19:40,402][15401] Updated weights for policy 0, policy_version 39280 (0.0029) [2024-06-21 18:19:43,385][15401] Updated weights for policy 0, policy_version 39290 (0.0033) [2024-06-21 18:19:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 41710.2). Total num frames: 643727360. Throughput: 0: 42025.4. Samples: 643825340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-21 18:19:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 18:19:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039290_643727360.pth... [2024-06-21 18:19:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038678_633700352.pth [2024-06-21 18:19:47,931][15401] Updated weights for policy 0, policy_version 39300 (0.0048) [2024-06-21 18:19:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 643891200. Throughput: 0: 41947.2. Samples: 644078640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:19:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 18:19:51,097][15401] Updated weights for policy 0, policy_version 39310 (0.0046) [2024-06-21 18:19:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 41654.3). Total num frames: 644120576. Throughput: 0: 41820.5. Samples: 644197940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:19:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 18:19:55,557][15401] Updated weights for policy 0, policy_version 39320 (0.0030) [2024-06-21 18:19:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 41765.6). Total num frames: 644349952. Throughput: 0: 42137.3. Samples: 644457640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:19:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 18:19:58,873][15401] Updated weights for policy 0, policy_version 39330 (0.0042) [2024-06-21 18:20:03,181][15401] Updated weights for policy 0, policy_version 39340 (0.0051) [2024-06-21 18:20:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 644546560. Throughput: 0: 41905.2. Samples: 644709320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:20:03,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 18:20:06,508][15401] Updated weights for policy 0, policy_version 39350 (0.0049) [2024-06-21 18:20:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 644759552. Throughput: 0: 42074.7. Samples: 644836520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:20:08,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-21 18:20:10,869][15401] Updated weights for policy 0, policy_version 39360 (0.0040) [2024-06-21 18:20:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 644956160. Throughput: 0: 42172.7. Samples: 645092540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:20:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 18:20:14,511][15401] Updated weights for policy 0, policy_version 39370 (0.0038) [2024-06-21 18:20:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.5, 300 sec: 41820.5). Total num frames: 645169152. Throughput: 0: 41840.9. Samples: 645337660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:20:18,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 18:20:18,777][15401] Updated weights for policy 0, policy_version 39380 (0.0045) [2024-06-21 18:20:22,312][15401] Updated weights for policy 0, policy_version 39390 (0.0039) [2024-06-21 18:20:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41765.7). Total num frames: 645382144. Throughput: 0: 42121.7. Samples: 645463200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:20:23,394][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 18:20:26,483][15401] Updated weights for policy 0, policy_version 39400 (0.0037) [2024-06-21 18:20:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 645578752. Throughput: 0: 41928.0. Samples: 645712100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:20:28,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 18:20:30,439][15401] Updated weights for policy 0, policy_version 39410 (0.0036) [2024-06-21 18:20:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 645775360. Throughput: 0: 41910.2. Samples: 645964600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:20:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 18:20:34,687][15401] Updated weights for policy 0, policy_version 39420 (0.0040) [2024-06-21 18:20:38,284][15401] Updated weights for policy 0, policy_version 39430 (0.0036) [2024-06-21 18:20:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42050.5, 300 sec: 41765.0). Total num frames: 646021120. Throughput: 0: 42026.6. Samples: 646089240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:20:38,393][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 18:20:42,475][15401] Updated weights for policy 0, policy_version 39440 (0.0031) [2024-06-21 18:20:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 646234112. Throughput: 0: 41948.0. Samples: 646345300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:20:43,391][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 18:20:45,965][15401] Updated weights for policy 0, policy_version 39450 (0.0041) [2024-06-21 18:20:48,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 646430720. Throughput: 0: 42002.6. Samples: 646599440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:20:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 18:20:50,028][15401] Updated weights for policy 0, policy_version 39460 (0.0042) [2024-06-21 18:20:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 646627328. Throughput: 0: 41930.2. Samples: 646723380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:20:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 18:20:53,756][15401] Updated weights for policy 0, policy_version 39470 (0.0038) [2024-06-21 18:20:57,585][15401] Updated weights for policy 0, policy_version 39480 (0.0032) [2024-06-21 18:20:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 646873088. Throughput: 0: 41905.8. Samples: 646978300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 18:20:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 18:21:01,698][15401] Updated weights for policy 0, policy_version 39490 (0.0041) [2024-06-21 18:21:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 647053312. Throughput: 0: 42129.3. Samples: 647233380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:21:03,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-21 18:21:04,524][15349] Signal inference workers to stop experience collection... (9350 times) [2024-06-21 18:21:04,525][15349] Signal inference workers to resume experience collection... (9350 times) [2024-06-21 18:21:04,547][15401] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-21 18:21:04,548][15401] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-21 18:21:05,616][15401] Updated weights for policy 0, policy_version 39500 (0.0029) [2024-06-21 18:21:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 647282688. Throughput: 0: 42075.1. Samples: 647356580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:21:08,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 18:21:09,296][15401] Updated weights for policy 0, policy_version 39510 (0.0031) [2024-06-21 18:21:13,296][15401] Updated weights for policy 0, policy_version 39520 (0.0044) [2024-06-21 18:21:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41877.3). Total num frames: 647495680. Throughput: 0: 42119.0. Samples: 647607460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:21:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 18:21:17,103][15401] Updated weights for policy 0, policy_version 39530 (0.0034) [2024-06-21 18:21:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 647692288. Throughput: 0: 42212.0. Samples: 647864140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:21:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 18:21:20,917][15401] Updated weights for policy 0, policy_version 39540 (0.0030) [2024-06-21 18:21:23,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42050.6, 300 sec: 41931.9). Total num frames: 647905280. Throughput: 0: 42120.5. Samples: 647984660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:21:23,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 18:21:25,195][15401] Updated weights for policy 0, policy_version 39550 (0.0031) [2024-06-21 18:21:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 648118272. Throughput: 0: 41999.2. Samples: 648235260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 18:21:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 18:21:28,817][15401] Updated weights for policy 0, policy_version 39560 (0.0055) [2024-06-21 18:21:33,087][15401] Updated weights for policy 0, policy_version 39570 (0.0030) [2024-06-21 18:21:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 41821.8). Total num frames: 648314880. Throughput: 0: 41898.8. Samples: 648484880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 18:21:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 18:21:36,646][15401] Updated weights for policy 0, policy_version 39580 (0.0038) [2024-06-21 18:21:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41780.9, 300 sec: 41931.9). Total num frames: 648527872. Throughput: 0: 41928.8. Samples: 648610180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 18:21:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 18:21:40,609][15401] Updated weights for policy 0, policy_version 39590 (0.0040) [2024-06-21 18:21:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 648740864. Throughput: 0: 41855.5. Samples: 648861800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 18:21:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 18:21:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039596_648740864.pth... [2024-06-21 18:21:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000038981_638664704.pth [2024-06-21 18:21:44,650][15401] Updated weights for policy 0, policy_version 39600 (0.0038) [2024-06-21 18:21:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 648953856. Throughput: 0: 41537.9. Samples: 649102580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 18:21:48,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-21 18:21:48,408][15401] Updated weights for policy 0, policy_version 39610 (0.0028) [2024-06-21 18:21:52,485][15401] Updated weights for policy 0, policy_version 39620 (0.0032) [2024-06-21 18:21:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 649150464. Throughput: 0: 41659.9. Samples: 649231280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 18:21:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-21 18:21:56,825][15401] Updated weights for policy 0, policy_version 39630 (0.0038) [2024-06-21 18:21:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 649347072. Throughput: 0: 41736.5. Samples: 649485600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 18:21:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 18:22:00,279][15401] Updated weights for policy 0, policy_version 39640 (0.0038) [2024-06-21 18:22:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 649576448. Throughput: 0: 41616.7. Samples: 649736900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 18:22:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 18:22:04,659][15401] Updated weights for policy 0, policy_version 39650 (0.0036) [2024-06-21 18:22:08,093][15401] Updated weights for policy 0, policy_version 39660 (0.0036) [2024-06-21 18:22:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 649789440. Throughput: 0: 41796.0. Samples: 649865380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 18:22:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 18:22:12,169][15401] Updated weights for policy 0, policy_version 39670 (0.0030) [2024-06-21 18:22:13,390][15132] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41820.8). Total num frames: 649969664. Throughput: 0: 41803.6. Samples: 650116420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 18:22:13,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-21 18:22:15,874][15401] Updated weights for policy 0, policy_version 39680 (0.0024) [2024-06-21 18:22:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 650199040. Throughput: 0: 41739.2. Samples: 650363140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:22:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 18:22:19,941][15401] Updated weights for policy 0, policy_version 39690 (0.0035) [2024-06-21 18:22:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41780.8, 300 sec: 41931.9). Total num frames: 650412032. Throughput: 0: 41823.9. Samples: 650492260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:22:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:22:23,983][15401] Updated weights for policy 0, policy_version 39700 (0.0038) [2024-06-21 18:22:27,725][15401] Updated weights for policy 0, policy_version 39710 (0.0031) [2024-06-21 18:22:28,392][15132] Fps is (10 sec: 42587.5, 60 sec: 41777.5, 300 sec: 41931.8). Total num frames: 650625024. Throughput: 0: 41810.7. Samples: 650743380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:22:28,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 18:22:31,705][15401] Updated weights for policy 0, policy_version 39720 (0.0038) [2024-06-21 18:22:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 650821632. Throughput: 0: 42121.3. Samples: 650998040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:22:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 18:22:35,805][15401] Updated weights for policy 0, policy_version 39730 (0.0039) [2024-06-21 18:22:37,948][15349] Signal inference workers to stop experience collection... (9400 times) [2024-06-21 18:22:37,949][15349] Signal inference workers to resume experience collection... (9400 times) [2024-06-21 18:22:37,988][15401] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-21 18:22:37,988][15401] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-21 18:22:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 651034624. Throughput: 0: 41985.0. Samples: 651120600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:22:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:22:39,850][15401] Updated weights for policy 0, policy_version 39740 (0.0039) [2024-06-21 18:22:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 651247616. Throughput: 0: 41796.4. Samples: 651366440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:22:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 18:22:43,505][15401] Updated weights for policy 0, policy_version 39750 (0.0037) [2024-06-21 18:22:47,721][15401] Updated weights for policy 0, policy_version 39760 (0.0052) [2024-06-21 18:22:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41504.5, 300 sec: 41931.6). Total num frames: 651444224. Throughput: 0: 41844.1. Samples: 651619980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:22:48,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 18:22:51,166][15401] Updated weights for policy 0, policy_version 39770 (0.0035) [2024-06-21 18:22:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 651657216. Throughput: 0: 41621.9. Samples: 651738360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:22:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 18:22:55,331][15401] Updated weights for policy 0, policy_version 39780 (0.0041) [2024-06-21 18:22:58,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 651886592. Throughput: 0: 41738.7. Samples: 651994660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:22:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 18:22:59,038][15401] Updated weights for policy 0, policy_version 39790 (0.0030) [2024-06-21 18:23:02,952][15401] Updated weights for policy 0, policy_version 39800 (0.0035) [2024-06-21 18:23:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 652083200. Throughput: 0: 41868.2. Samples: 652247220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:23:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 18:23:06,766][15401] Updated weights for policy 0, policy_version 39810 (0.0033) [2024-06-21 18:23:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41932.1). Total num frames: 652296192. Throughput: 0: 41758.8. Samples: 652371400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-21 18:23:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 18:23:10,973][15401] Updated weights for policy 0, policy_version 39820 (0.0044) [2024-06-21 18:23:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 652509184. Throughput: 0: 41768.4. Samples: 652622860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:23:13,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 18:23:14,980][15401] Updated weights for policy 0, policy_version 39830 (0.0032) [2024-06-21 18:23:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 652705792. Throughput: 0: 41619.2. Samples: 652870900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:23:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 18:23:19,178][15401] Updated weights for policy 0, policy_version 39840 (0.0023) [2024-06-21 18:23:22,718][15401] Updated weights for policy 0, policy_version 39850 (0.0037) [2024-06-21 18:23:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.6, 300 sec: 41932.5). Total num frames: 652918784. Throughput: 0: 41718.1. Samples: 652998020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:23:23,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 18:23:27,112][15401] Updated weights for policy 0, policy_version 39860 (0.0028) [2024-06-21 18:23:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41780.8, 300 sec: 41820.8). Total num frames: 653131776. Throughput: 0: 41965.3. Samples: 653254880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:23:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 18:23:30,148][15401] Updated weights for policy 0, policy_version 39870 (0.0036) [2024-06-21 18:23:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41779.2, 300 sec: 41987.6). Total num frames: 653328384. Throughput: 0: 41881.8. Samples: 653504560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:23:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 18:23:34,883][15401] Updated weights for policy 0, policy_version 39880 (0.0034) [2024-06-21 18:23:38,126][15401] Updated weights for policy 0, policy_version 39890 (0.0030) [2024-06-21 18:23:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 653557760. Throughput: 0: 42019.4. Samples: 653629240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 18:23:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 18:23:42,703][15401] Updated weights for policy 0, policy_version 39900 (0.0036) [2024-06-21 18:23:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 653737984. Throughput: 0: 41789.6. Samples: 653875200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 18:23:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 18:23:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039901_653737984.pth... [2024-06-21 18:23:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039290_643727360.pth [2024-06-21 18:23:46,338][15401] Updated weights for policy 0, policy_version 39910 (0.0030) [2024-06-21 18:23:48,389][15132] Fps is (10 sec: 37684.1, 60 sec: 41507.9, 300 sec: 41876.4). Total num frames: 653934592. Throughput: 0: 41709.6. Samples: 654124140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 18:23:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 18:23:50,683][15401] Updated weights for policy 0, policy_version 39920 (0.0028) [2024-06-21 18:23:53,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 654196736. Throughput: 0: 41733.7. Samples: 654249420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 18:23:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 18:23:53,971][15401] Updated weights for policy 0, policy_version 39930 (0.0031) [2024-06-21 18:23:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 654360576. Throughput: 0: 41701.4. Samples: 654499420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-21 18:23:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 18:23:58,420][15401] Updated weights for policy 0, policy_version 39940 (0.0026) [2024-06-21 18:24:01,536][15401] Updated weights for policy 0, policy_version 39950 (0.0035) [2024-06-21 18:24:03,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 654573568. Throughput: 0: 41743.0. Samples: 654749340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:24:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 18:24:06,243][15401] Updated weights for policy 0, policy_version 39960 (0.0043) [2024-06-21 18:24:07,207][15349] Signal inference workers to stop experience collection... (9450 times) [2024-06-21 18:24:07,245][15401] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-21 18:24:07,255][15349] Signal inference workers to resume experience collection... (9450 times) [2024-06-21 18:24:07,271][15401] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-21 18:24:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 654786560. Throughput: 0: 41631.2. Samples: 654871320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:24:08,404][15132] Avg episode reward: [(0, '0.183')] [2024-06-21 18:24:09,427][15401] Updated weights for policy 0, policy_version 39970 (0.0043) [2024-06-21 18:24:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41231.5, 300 sec: 41820.5). Total num frames: 654983168. Throughput: 0: 41469.9. Samples: 655121120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:24:13,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 18:24:14,391][15401] Updated weights for policy 0, policy_version 39980 (0.0050) [2024-06-21 18:24:17,329][15401] Updated weights for policy 0, policy_version 39990 (0.0040) [2024-06-21 18:24:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 655212544. Throughput: 0: 41248.9. Samples: 655360760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:24:18,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-21 18:24:22,576][15401] Updated weights for policy 0, policy_version 40000 (0.0037) [2024-06-21 18:24:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 41507.8, 300 sec: 41765.3). Total num frames: 655409152. Throughput: 0: 41359.6. Samples: 655490420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 18:24:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 18:24:24,962][15401] Updated weights for policy 0, policy_version 40010 (0.0028) [2024-06-21 18:24:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 655605760. Throughput: 0: 41574.4. Samples: 655746040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 18:24:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 18:24:30,467][15401] Updated weights for policy 0, policy_version 40020 (0.0029) [2024-06-21 18:24:33,253][15401] Updated weights for policy 0, policy_version 40030 (0.0025) [2024-06-21 18:24:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 655851520. Throughput: 0: 41372.6. Samples: 655985920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 18:24:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 18:24:37,970][15401] Updated weights for policy 0, policy_version 40040 (0.0038) [2024-06-21 18:24:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 656031744. Throughput: 0: 41580.4. Samples: 656120540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 18:24:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 18:24:41,089][15401] Updated weights for policy 0, policy_version 40050 (0.0034) [2024-06-21 18:24:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 656228352. Throughput: 0: 41506.1. Samples: 656367200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 18:24:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 18:24:45,622][15401] Updated weights for policy 0, policy_version 40060 (0.0035) [2024-06-21 18:24:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 656457728. Throughput: 0: 41487.1. Samples: 656616260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 18:24:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 18:24:49,007][15401] Updated weights for policy 0, policy_version 40070 (0.0033) [2024-06-21 18:24:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 656654336. Throughput: 0: 41621.2. Samples: 656744280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:24:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 18:24:53,508][15401] Updated weights for policy 0, policy_version 40080 (0.0043) [2024-06-21 18:24:56,854][15401] Updated weights for policy 0, policy_version 40090 (0.0026) [2024-06-21 18:24:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 656850944. Throughput: 0: 41526.6. Samples: 656989720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:24:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 18:25:01,521][15401] Updated weights for policy 0, policy_version 40100 (0.0050) [2024-06-21 18:25:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 657080320. Throughput: 0: 41900.0. Samples: 657246260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:25:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 18:25:04,674][15401] Updated weights for policy 0, policy_version 40110 (0.0031) [2024-06-21 18:25:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 657276928. Throughput: 0: 41847.1. Samples: 657373540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:25:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 18:25:09,384][15401] Updated weights for policy 0, policy_version 40120 (0.0045) [2024-06-21 18:25:12,538][15401] Updated weights for policy 0, policy_version 40130 (0.0027) [2024-06-21 18:25:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 41821.2). Total num frames: 657506304. Throughput: 0: 41543.0. Samples: 657615480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 18:25:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 18:25:17,269][15401] Updated weights for policy 0, policy_version 40140 (0.0034) [2024-06-21 18:25:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 657719296. Throughput: 0: 41967.4. Samples: 657874440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 18:25:20,463][15401] Updated weights for policy 0, policy_version 40150 (0.0027) [2024-06-21 18:25:22,695][15349] Signal inference workers to stop experience collection... (9500 times) [2024-06-21 18:25:22,696][15349] Signal inference workers to resume experience collection... (9500 times) [2024-06-21 18:25:22,746][15401] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-21 18:25:22,747][15401] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-21 18:25:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 657915904. Throughput: 0: 41754.2. Samples: 657999480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:23,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 18:25:24,937][15401] Updated weights for policy 0, policy_version 40160 (0.0039) [2024-06-21 18:25:28,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 658128896. Throughput: 0: 41801.0. Samples: 658248240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 18:25:28,506][15401] Updated weights for policy 0, policy_version 40170 (0.0043) [2024-06-21 18:25:32,482][15401] Updated weights for policy 0, policy_version 40180 (0.0027) [2024-06-21 18:25:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.3, 300 sec: 41765.7). Total num frames: 658341888. Throughput: 0: 41833.9. Samples: 658498780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 18:25:36,315][15401] Updated weights for policy 0, policy_version 40190 (0.0028) [2024-06-21 18:25:38,390][15132] Fps is (10 sec: 40958.3, 60 sec: 41779.0, 300 sec: 41709.7). Total num frames: 658538496. Throughput: 0: 41865.4. Samples: 658628240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 18:25:40,423][15401] Updated weights for policy 0, policy_version 40200 (0.0035) [2024-06-21 18:25:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 658751488. Throughput: 0: 41867.7. Samples: 658873760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-21 18:25:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 18:25:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040208_658767872.pth... [2024-06-21 18:25:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039596_648740864.pth [2024-06-21 18:25:44,315][15401] Updated weights for policy 0, policy_version 40210 (0.0031) [2024-06-21 18:25:48,200][15401] Updated weights for policy 0, policy_version 40220 (0.0035) [2024-06-21 18:25:48,390][15132] Fps is (10 sec: 42599.6, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 658964480. Throughput: 0: 41813.6. Samples: 659127880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:25:48,391][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 18:25:52,011][15401] Updated weights for policy 0, policy_version 40230 (0.0028) [2024-06-21 18:25:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 659161088. Throughput: 0: 41704.4. Samples: 659250240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:25:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 18:25:55,778][15401] Updated weights for policy 0, policy_version 40240 (0.0028) [2024-06-21 18:25:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 659390464. Throughput: 0: 41948.4. Samples: 659503160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:25:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 18:26:00,241][15401] Updated weights for policy 0, policy_version 40250 (0.0032) [2024-06-21 18:26:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 659603456. Throughput: 0: 41848.3. Samples: 659757620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:26:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 18:26:03,671][15401] Updated weights for policy 0, policy_version 40260 (0.0042) [2024-06-21 18:26:08,132][15401] Updated weights for policy 0, policy_version 40270 (0.0021) [2024-06-21 18:26:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 659783680. Throughput: 0: 41789.3. Samples: 659880000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:26:08,400][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 18:26:11,506][15401] Updated weights for policy 0, policy_version 40280 (0.0037) [2024-06-21 18:26:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 660029440. Throughput: 0: 41972.4. Samples: 660137000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:26:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 18:26:15,898][15401] Updated weights for policy 0, policy_version 40290 (0.0026) [2024-06-21 18:26:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41710.1). Total num frames: 660209664. Throughput: 0: 41928.0. Samples: 660385540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:26:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 18:26:19,316][15401] Updated weights for policy 0, policy_version 40300 (0.0030) [2024-06-21 18:26:23,392][15132] Fps is (10 sec: 37674.4, 60 sec: 41504.5, 300 sec: 41653.9). Total num frames: 660406272. Throughput: 0: 41637.2. Samples: 660502000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:26:23,393][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 18:26:23,925][15401] Updated weights for policy 0, policy_version 40310 (0.0036) [2024-06-21 18:26:27,079][15401] Updated weights for policy 0, policy_version 40320 (0.0040) [2024-06-21 18:26:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 660635648. Throughput: 0: 41812.5. Samples: 660755320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:26:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 18:26:31,854][15401] Updated weights for policy 0, policy_version 40330 (0.0029) [2024-06-21 18:26:33,390][15132] Fps is (10 sec: 44246.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 660848640. Throughput: 0: 41900.9. Samples: 661013420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:26:33,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-21 18:26:34,994][15401] Updated weights for policy 0, policy_version 40340 (0.0041) [2024-06-21 18:26:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.4, 300 sec: 41709.8). Total num frames: 661045248. Throughput: 0: 41788.0. Samples: 661130700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:26:38,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-21 18:26:39,758][15401] Updated weights for policy 0, policy_version 40350 (0.0027) [2024-06-21 18:26:41,874][15349] Signal inference workers to stop experience collection... (9550 times) [2024-06-21 18:26:41,874][15349] Signal inference workers to resume experience collection... (9550 times) [2024-06-21 18:26:41,906][15401] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-21 18:26:41,907][15401] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-21 18:26:43,007][15401] Updated weights for policy 0, policy_version 40360 (0.0040) [2024-06-21 18:26:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 661274624. Throughput: 0: 41702.3. Samples: 661379760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:26:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 18:26:47,610][15401] Updated weights for policy 0, policy_version 40370 (0.0040) [2024-06-21 18:26:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 661454848. Throughput: 0: 41861.5. Samples: 661641380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:26:48,390][15132] Avg episode reward: [(0, '0.187')] [2024-06-21 18:26:50,913][15401] Updated weights for policy 0, policy_version 40380 (0.0030) [2024-06-21 18:26:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 661667840. Throughput: 0: 41716.0. Samples: 661757220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:26:53,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-21 18:26:55,669][15401] Updated weights for policy 0, policy_version 40390 (0.0046) [2024-06-21 18:26:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 661897216. Throughput: 0: 41492.5. Samples: 662004160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:26:58,392][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 18:26:58,607][15401] Updated weights for policy 0, policy_version 40400 (0.0026) [2024-06-21 18:27:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 40687.0, 300 sec: 41543.2). Total num frames: 662044672. Throughput: 0: 41673.8. Samples: 662260860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 18:27:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 18:27:03,642][15401] Updated weights for policy 0, policy_version 40410 (0.0036) [2024-06-21 18:27:06,698][15401] Updated weights for policy 0, policy_version 40420 (0.0029) [2024-06-21 18:27:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 662306816. Throughput: 0: 41708.0. Samples: 662378760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 18:27:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 18:27:11,279][15401] Updated weights for policy 0, policy_version 40430 (0.0032) [2024-06-21 18:27:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 662519808. Throughput: 0: 41679.4. Samples: 662630900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 18:27:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 18:27:14,195][15401] Updated weights for policy 0, policy_version 40440 (0.0032) [2024-06-21 18:27:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 662683648. Throughput: 0: 41549.8. Samples: 662883160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 18:27:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 18:27:19,154][15401] Updated weights for policy 0, policy_version 40450 (0.0040) [2024-06-21 18:27:21,966][15401] Updated weights for policy 0, policy_version 40460 (0.0038) [2024-06-21 18:27:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 41710.1). Total num frames: 662929408. Throughput: 0: 41564.5. Samples: 663001100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 18:27:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 18:27:27,029][15401] Updated weights for policy 0, policy_version 40470 (0.0047) [2024-06-21 18:27:28,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 663158784. Throughput: 0: 41822.7. Samples: 663261780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:27:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 18:27:29,899][15401] Updated weights for policy 0, policy_version 40480 (0.0040) [2024-06-21 18:27:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 41654.2). Total num frames: 663322624. Throughput: 0: 41605.8. Samples: 663513640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:27:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 18:27:34,641][15401] Updated weights for policy 0, policy_version 40490 (0.0036) [2024-06-21 18:27:37,508][15401] Updated weights for policy 0, policy_version 40500 (0.0031) [2024-06-21 18:27:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 663568384. Throughput: 0: 41697.8. Samples: 663633620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:27:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 18:27:42,433][15401] Updated weights for policy 0, policy_version 40510 (0.0025) [2024-06-21 18:27:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41765.7). Total num frames: 663764992. Throughput: 0: 42003.6. Samples: 663894320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:27:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 18:27:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040514_663781376.pth... [2024-06-21 18:27:43,552][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000039901_653737984.pth [2024-06-21 18:27:45,269][15401] Updated weights for policy 0, policy_version 40520 (0.0031) [2024-06-21 18:27:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 663961600. Throughput: 0: 41743.9. Samples: 664139340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 18:27:48,395][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:27:50,495][15401] Updated weights for policy 0, policy_version 40530 (0.0044) [2024-06-21 18:27:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 664190976. Throughput: 0: 41843.0. Samples: 664261700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:27:53,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 18:27:53,577][15401] Updated weights for policy 0, policy_version 40540 (0.0034) [2024-06-21 18:27:58,210][15401] Updated weights for policy 0, policy_version 40550 (0.0035) [2024-06-21 18:27:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 664371200. Throughput: 0: 41974.3. Samples: 664519740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:27:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:28:00,771][15349] Signal inference workers to stop experience collection... (9600 times) [2024-06-21 18:28:00,776][15349] Signal inference workers to resume experience collection... (9600 times) [2024-06-21 18:28:00,804][15401] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-21 18:28:00,804][15401] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-21 18:28:01,463][15401] Updated weights for policy 0, policy_version 40560 (0.0030) [2024-06-21 18:28:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 41765.3). Total num frames: 664616960. Throughput: 0: 41798.7. Samples: 664764100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:28:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:28:05,880][15401] Updated weights for policy 0, policy_version 40570 (0.0044) [2024-06-21 18:28:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 664813568. Throughput: 0: 42067.5. Samples: 664894140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:28:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 18:28:09,433][15401] Updated weights for policy 0, policy_version 40580 (0.0041) [2024-06-21 18:28:13,393][15132] Fps is (10 sec: 39306.2, 60 sec: 41503.4, 300 sec: 41709.2). Total num frames: 665010176. Throughput: 0: 41724.7. Samples: 665139560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:28:13,394][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 18:28:13,891][15401] Updated weights for policy 0, policy_version 40590 (0.0037) [2024-06-21 18:28:17,440][15401] Updated weights for policy 0, policy_version 40600 (0.0035) [2024-06-21 18:28:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 41765.7). Total num frames: 665239552. Throughput: 0: 41633.3. Samples: 665387140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:28:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 18:28:21,777][15401] Updated weights for policy 0, policy_version 40610 (0.0036) [2024-06-21 18:28:23,389][15132] Fps is (10 sec: 42615.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 665436160. Throughput: 0: 41767.6. Samples: 665513160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:28:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 18:28:25,217][15401] Updated weights for policy 0, policy_version 40620 (0.0041) [2024-06-21 18:28:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 665649152. Throughput: 0: 41595.9. Samples: 665766140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:28:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 18:28:29,546][15401] Updated weights for policy 0, policy_version 40630 (0.0024) [2024-06-21 18:28:32,999][15401] Updated weights for policy 0, policy_version 40640 (0.0030) [2024-06-21 18:28:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 665862144. Throughput: 0: 41663.1. Samples: 666014180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:28:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 18:28:37,305][15401] Updated weights for policy 0, policy_version 40650 (0.0040) [2024-06-21 18:28:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 666075136. Throughput: 0: 41717.9. Samples: 666139000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:28:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 18:28:41,028][15401] Updated weights for policy 0, policy_version 40660 (0.0034) [2024-06-21 18:28:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 666255360. Throughput: 0: 41626.1. Samples: 666392920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:28:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 18:28:44,964][15401] Updated weights for policy 0, policy_version 40670 (0.0038) [2024-06-21 18:28:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 666468352. Throughput: 0: 41751.1. Samples: 666642900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 18:28:48,402][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 18:28:48,757][15401] Updated weights for policy 0, policy_version 40680 (0.0031) [2024-06-21 18:28:52,711][15401] Updated weights for policy 0, policy_version 40690 (0.0042) [2024-06-21 18:28:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 666681344. Throughput: 0: 41713.8. Samples: 666771260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 18:28:53,399][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 18:28:56,759][15401] Updated weights for policy 0, policy_version 40700 (0.0047) [2024-06-21 18:28:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 666877952. Throughput: 0: 41755.2. Samples: 667018380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 18:28:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 18:29:00,350][15401] Updated weights for policy 0, policy_version 40710 (0.0031) [2024-06-21 18:29:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41504.5, 300 sec: 41765.0). Total num frames: 667107328. Throughput: 0: 41660.0. Samples: 667261940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 18:29:03,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 18:29:04,677][15401] Updated weights for policy 0, policy_version 40720 (0.0031) [2024-06-21 18:29:08,335][15401] Updated weights for policy 0, policy_version 40730 (0.0024) [2024-06-21 18:29:08,396][15132] Fps is (10 sec: 44208.9, 60 sec: 41774.8, 300 sec: 41820.3). Total num frames: 667320320. Throughput: 0: 41788.3. Samples: 667393900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 18:29:08,396][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 18:29:12,458][15401] Updated weights for policy 0, policy_version 40740 (0.0032) [2024-06-21 18:29:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41508.9, 300 sec: 41654.2). Total num frames: 667500544. Throughput: 0: 41650.4. Samples: 667640400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:29:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 18:29:16,350][15401] Updated weights for policy 0, policy_version 40750 (0.0040) [2024-06-21 18:29:18,390][15132] Fps is (10 sec: 42625.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 667746304. Throughput: 0: 41567.0. Samples: 667884700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:29:18,396][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 18:29:20,329][15401] Updated weights for policy 0, policy_version 40760 (0.0039) [2024-06-21 18:29:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 667926528. Throughput: 0: 41709.3. Samples: 668015920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:29:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 18:29:24,386][15401] Updated weights for policy 0, policy_version 40770 (0.0035) [2024-06-21 18:29:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 668123136. Throughput: 0: 41454.1. Samples: 668258360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:29:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 18:29:28,607][15401] Updated weights for policy 0, policy_version 40780 (0.0053) [2024-06-21 18:29:30,024][15349] Signal inference workers to stop experience collection... (9650 times) [2024-06-21 18:29:30,051][15401] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-21 18:29:30,136][15349] Signal inference workers to resume experience collection... (9650 times) [2024-06-21 18:29:30,136][15401] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-21 18:29:32,305][15401] Updated weights for policy 0, policy_version 40790 (0.0031) [2024-06-21 18:29:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 668368896. Throughput: 0: 41475.2. Samples: 668509280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:29:33,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-21 18:29:36,360][15401] Updated weights for policy 0, policy_version 40800 (0.0033) [2024-06-21 18:29:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 668549120. Throughput: 0: 41586.2. Samples: 668642640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 18:29:38,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-21 18:29:40,016][15401] Updated weights for policy 0, policy_version 40810 (0.0051) [2024-06-21 18:29:43,394][15132] Fps is (10 sec: 39304.8, 60 sec: 41776.3, 300 sec: 41709.2). Total num frames: 668762112. Throughput: 0: 41463.8. Samples: 668884420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 18:29:43,394][15132] Avg episode reward: [(0, '0.256')] [2024-06-21 18:29:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040819_668778496.pth... [2024-06-21 18:29:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040208_658767872.pth [2024-06-21 18:29:44,020][15401] Updated weights for policy 0, policy_version 40820 (0.0038) [2024-06-21 18:29:47,955][15401] Updated weights for policy 0, policy_version 40830 (0.0034) [2024-06-21 18:29:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 668975104. Throughput: 0: 41672.5. Samples: 669137100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 18:29:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 18:29:51,575][15401] Updated weights for policy 0, policy_version 40840 (0.0045) [2024-06-21 18:29:53,390][15132] Fps is (10 sec: 39337.9, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 669155328. Throughput: 0: 41546.8. Samples: 669263240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 18:29:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 18:29:55,849][15401] Updated weights for policy 0, policy_version 40850 (0.0036) [2024-06-21 18:29:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.7, 300 sec: 41820.5). Total num frames: 669417472. Throughput: 0: 41507.5. Samples: 669508340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 18:29:58,392][15132] Avg episode reward: [(0, '0.305')] [2024-06-21 18:29:59,757][15401] Updated weights for policy 0, policy_version 40860 (0.0038) [2024-06-21 18:30:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41234.6, 300 sec: 41709.8). Total num frames: 669581312. Throughput: 0: 41824.8. Samples: 669766820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:03,391][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 18:30:03,870][15401] Updated weights for policy 0, policy_version 40870 (0.0037) [2024-06-21 18:30:07,378][15401] Updated weights for policy 0, policy_version 40880 (0.0026) [2024-06-21 18:30:08,390][15132] Fps is (10 sec: 37692.1, 60 sec: 41237.4, 300 sec: 41654.2). Total num frames: 669794304. Throughput: 0: 41428.8. Samples: 669880220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 18:30:11,917][15401] Updated weights for policy 0, policy_version 40890 (0.0041) [2024-06-21 18:30:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 670023680. Throughput: 0: 41697.5. Samples: 670134740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 18:30:14,965][15401] Updated weights for policy 0, policy_version 40900 (0.0040) [2024-06-21 18:30:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41654.2). Total num frames: 670203904. Throughput: 0: 41660.8. Samples: 670384020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 18:30:19,667][15401] Updated weights for policy 0, policy_version 40910 (0.0039) [2024-06-21 18:30:23,215][15401] Updated weights for policy 0, policy_version 40920 (0.0027) [2024-06-21 18:30:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 670433280. Throughput: 0: 41372.4. Samples: 670504400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:23,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-21 18:30:27,407][15401] Updated weights for policy 0, policy_version 40930 (0.0026) [2024-06-21 18:30:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 670613504. Throughput: 0: 41616.3. Samples: 670756980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 18:30:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 18:30:30,981][15401] Updated weights for policy 0, policy_version 40940 (0.0037) [2024-06-21 18:30:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 670842880. Throughput: 0: 41590.6. Samples: 671008680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 18:30:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 18:30:35,197][15401] Updated weights for policy 0, policy_version 40950 (0.0032) [2024-06-21 18:30:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 671072256. Throughput: 0: 41619.6. Samples: 671136120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 18:30:38,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-21 18:30:38,673][15401] Updated weights for policy 0, policy_version 40960 (0.0038) [2024-06-21 18:30:42,952][15401] Updated weights for policy 0, policy_version 40970 (0.0032) [2024-06-21 18:30:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41508.9, 300 sec: 41654.2). Total num frames: 671252480. Throughput: 0: 41650.5. Samples: 671382520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 18:30:43,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-21 18:30:46,426][15401] Updated weights for policy 0, policy_version 40980 (0.0038) [2024-06-21 18:30:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 671465472. Throughput: 0: 41448.6. Samples: 671632000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 18:30:48,394][15132] Avg episode reward: [(0, '0.229')] [2024-06-21 18:30:50,639][15401] Updated weights for policy 0, policy_version 40990 (0.0032) [2024-06-21 18:30:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 671678464. Throughput: 0: 41752.9. Samples: 671759100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-21 18:30:53,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 18:30:54,451][15401] Updated weights for policy 0, policy_version 41000 (0.0037) [2024-06-21 18:30:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41234.7, 300 sec: 41654.3). Total num frames: 671891456. Throughput: 0: 41544.5. Samples: 672004240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:30:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 18:30:59,138][15401] Updated weights for policy 0, policy_version 41010 (0.0031) [2024-06-21 18:31:00,647][15349] Signal inference workers to stop experience collection... (9700 times) [2024-06-21 18:31:00,647][15349] Signal inference workers to resume experience collection... (9700 times) [2024-06-21 18:31:00,690][15401] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-21 18:31:00,691][15401] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-21 18:31:02,439][15401] Updated weights for policy 0, policy_version 41020 (0.0038) [2024-06-21 18:31:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.4, 300 sec: 41709.8). Total num frames: 672088064. Throughput: 0: 41556.9. Samples: 672254080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:31:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 18:31:06,790][15401] Updated weights for policy 0, policy_version 41030 (0.0036) [2024-06-21 18:31:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 672301056. Throughput: 0: 41717.9. Samples: 672381700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:31:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 18:31:10,306][15401] Updated weights for policy 0, policy_version 41040 (0.0033) [2024-06-21 18:31:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 672514048. Throughput: 0: 41767.6. Samples: 672636520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:31:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 18:31:14,548][15401] Updated weights for policy 0, policy_version 41050 (0.0038) [2024-06-21 18:31:18,189][15401] Updated weights for policy 0, policy_version 41060 (0.0030) [2024-06-21 18:31:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41765.6). Total num frames: 672727040. Throughput: 0: 41623.5. Samples: 672881740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:31:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 18:31:22,500][15401] Updated weights for policy 0, policy_version 41070 (0.0039) [2024-06-21 18:31:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 672923648. Throughput: 0: 41592.9. Samples: 673007800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:31:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 18:31:26,525][15401] Updated weights for policy 0, policy_version 41080 (0.0028) [2024-06-21 18:31:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 673136640. Throughput: 0: 41625.9. Samples: 673255680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:31:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 18:31:30,222][15401] Updated weights for policy 0, policy_version 41090 (0.0036) [2024-06-21 18:31:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 673349632. Throughput: 0: 41806.2. Samples: 673513280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:31:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 18:31:34,132][15401] Updated weights for policy 0, policy_version 41100 (0.0034) [2024-06-21 18:31:38,094][15401] Updated weights for policy 0, policy_version 41110 (0.0039) [2024-06-21 18:31:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 673562624. Throughput: 0: 41686.7. Samples: 673635000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:31:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 18:31:42,272][15401] Updated weights for policy 0, policy_version 41120 (0.0039) [2024-06-21 18:31:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.6, 300 sec: 41709.4). Total num frames: 673759232. Throughput: 0: 41781.7. Samples: 673884520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 18:31:43,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 18:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041123_673759232.pth... [2024-06-21 18:31:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040514_663781376.pth [2024-06-21 18:31:45,813][15401] Updated weights for policy 0, policy_version 41130 (0.0037) [2024-06-21 18:31:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 673972224. Throughput: 0: 41734.2. Samples: 674132120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:31:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 18:31:49,973][15401] Updated weights for policy 0, policy_version 41140 (0.0030) [2024-06-21 18:31:53,392][15132] Fps is (10 sec: 42598.4, 60 sec: 41777.6, 300 sec: 41653.9). Total num frames: 674185216. Throughput: 0: 41757.3. Samples: 674260880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:31:53,392][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 18:31:53,569][15401] Updated weights for policy 0, policy_version 41150 (0.0034) [2024-06-21 18:31:57,978][15401] Updated weights for policy 0, policy_version 41160 (0.0042) [2024-06-21 18:31:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 674365440. Throughput: 0: 41714.2. Samples: 674513660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:31:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 18:32:01,301][15401] Updated weights for policy 0, policy_version 41170 (0.0033) [2024-06-21 18:32:03,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 674594816. Throughput: 0: 41718.4. Samples: 674759060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:32:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 18:32:05,930][15401] Updated weights for policy 0, policy_version 41180 (0.0037) [2024-06-21 18:32:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 674824192. Throughput: 0: 41755.9. Samples: 674886820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:32:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 18:32:08,988][15401] Updated weights for policy 0, policy_version 41190 (0.0047) [2024-06-21 18:32:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 674988032. Throughput: 0: 41749.5. Samples: 675134400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 18:32:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 18:32:14,018][15401] Updated weights for policy 0, policy_version 41200 (0.0033) [2024-06-21 18:32:17,318][15401] Updated weights for policy 0, policy_version 41210 (0.0035) [2024-06-21 18:32:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 675233792. Throughput: 0: 41569.8. Samples: 675383920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:32:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 18:32:21,743][15401] Updated weights for policy 0, policy_version 41220 (0.0040) [2024-06-21 18:32:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 675446784. Throughput: 0: 41709.8. Samples: 675511940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:32:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 18:32:24,974][15401] Updated weights for policy 0, policy_version 41230 (0.0031) [2024-06-21 18:32:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 675627008. Throughput: 0: 41654.7. Samples: 675758880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:32:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:32:29,460][15401] Updated weights for policy 0, policy_version 41240 (0.0043) [2024-06-21 18:32:30,388][15349] Signal inference workers to stop experience collection... (9750 times) [2024-06-21 18:32:30,440][15401] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-21 18:32:30,441][15349] Signal inference workers to resume experience collection... (9750 times) [2024-06-21 18:32:30,455][15401] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-21 18:32:32,649][15401] Updated weights for policy 0, policy_version 41250 (0.0032) [2024-06-21 18:32:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 675840000. Throughput: 0: 41749.8. Samples: 676010860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:32:33,398][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 18:32:37,167][15401] Updated weights for policy 0, policy_version 41260 (0.0040) [2024-06-21 18:32:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 676069376. Throughput: 0: 41773.9. Samples: 676140600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:32:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 18:32:40,956][15401] Updated weights for policy 0, policy_version 41270 (0.0039) [2024-06-21 18:32:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41780.9, 300 sec: 41709.8). Total num frames: 676265984. Throughput: 0: 41732.4. Samples: 676391620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 18:32:43,394][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 18:32:44,850][15401] Updated weights for policy 0, policy_version 41280 (0.0035) [2024-06-21 18:32:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 676478976. Throughput: 0: 41813.7. Samples: 676640680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 18:32:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 18:32:48,494][15401] Updated weights for policy 0, policy_version 41290 (0.0038) [2024-06-21 18:32:52,574][15401] Updated weights for policy 0, policy_version 41300 (0.0034) [2024-06-21 18:32:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41507.8, 300 sec: 41709.8). Total num frames: 676675584. Throughput: 0: 41897.8. Samples: 676772220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 18:32:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 18:32:56,105][15401] Updated weights for policy 0, policy_version 41310 (0.0036) [2024-06-21 18:32:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 676904960. Throughput: 0: 41963.0. Samples: 677022740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 18:32:58,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-21 18:33:00,280][15401] Updated weights for policy 0, policy_version 41320 (0.0030) [2024-06-21 18:33:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 677117952. Throughput: 0: 41894.6. Samples: 677269180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-21 18:33:03,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 18:33:04,108][15401] Updated weights for policy 0, policy_version 41330 (0.0035) [2024-06-21 18:33:08,208][15401] Updated weights for policy 0, policy_version 41340 (0.0030) [2024-06-21 18:33:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41710.3). Total num frames: 677314560. Throughput: 0: 41937.4. Samples: 677399120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:33:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 18:33:11,717][15401] Updated weights for policy 0, policy_version 41350 (0.0029) [2024-06-21 18:33:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 677511168. Throughput: 0: 42006.1. Samples: 677649160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:33:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 18:33:15,910][15401] Updated weights for policy 0, policy_version 41360 (0.0043) [2024-06-21 18:33:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 677724160. Throughput: 0: 41906.1. Samples: 677896640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:33:18,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 18:33:19,547][15401] Updated weights for policy 0, policy_version 41370 (0.0033) [2024-06-21 18:33:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 677937152. Throughput: 0: 41832.0. Samples: 678023040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:33:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 18:33:23,688][15401] Updated weights for policy 0, policy_version 41380 (0.0031) [2024-06-21 18:33:27,494][15401] Updated weights for policy 0, policy_version 41390 (0.0038) [2024-06-21 18:33:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 678150144. Throughput: 0: 41783.6. Samples: 678271880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 18:33:28,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-21 18:33:31,463][15401] Updated weights for policy 0, policy_version 41400 (0.0036) [2024-06-21 18:33:33,396][15132] Fps is (10 sec: 44207.9, 60 sec: 42320.7, 300 sec: 41708.9). Total num frames: 678379520. Throughput: 0: 41987.3. Samples: 678530380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:33,397][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 18:33:35,078][15401] Updated weights for policy 0, policy_version 41410 (0.0026) [2024-06-21 18:33:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 678592512. Throughput: 0: 41887.1. Samples: 678657140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 18:33:39,713][15401] Updated weights for policy 0, policy_version 41420 (0.0040) [2024-06-21 18:33:43,335][15401] Updated weights for policy 0, policy_version 41430 (0.0028) [2024-06-21 18:33:43,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 678789120. Throughput: 0: 41772.3. Samples: 678902500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 18:33:43,530][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041431_678805504.pth... [2024-06-21 18:33:43,584][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000040819_668778496.pth [2024-06-21 18:33:47,616][15401] Updated weights for policy 0, policy_version 41440 (0.0040) [2024-06-21 18:33:48,339][15349] Signal inference workers to stop experience collection... (9800 times) [2024-06-21 18:33:48,389][15401] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-21 18:33:48,390][15132] Fps is (10 sec: 36044.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 678952960. Throughput: 0: 42073.3. Samples: 679162480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 18:33:48,395][15349] Signal inference workers to resume experience collection... (9800 times) [2024-06-21 18:33:48,407][15401] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-21 18:33:51,314][15401] Updated weights for policy 0, policy_version 41450 (0.0034) [2024-06-21 18:33:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 679198720. Throughput: 0: 41752.8. Samples: 679278000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:53,391][15132] Avg episode reward: [(0, '0.756')] [2024-06-21 18:33:55,613][15401] Updated weights for policy 0, policy_version 41460 (0.0041) [2024-06-21 18:33:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 41779.2, 300 sec: 41710.1). Total num frames: 679411712. Throughput: 0: 41669.4. Samples: 679524280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 18:33:58,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 18:33:59,251][15401] Updated weights for policy 0, policy_version 41470 (0.0034) [2024-06-21 18:34:03,343][15401] Updated weights for policy 0, policy_version 41480 (0.0043) [2024-06-21 18:34:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41655.1). Total num frames: 679608320. Throughput: 0: 41909.9. Samples: 679782580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 18:34:03,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 18:34:07,030][15401] Updated weights for policy 0, policy_version 41490 (0.0037) [2024-06-21 18:34:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 679821312. Throughput: 0: 41737.8. Samples: 679901240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 18:34:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 18:34:11,272][15401] Updated weights for policy 0, policy_version 41500 (0.0033) [2024-06-21 18:34:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 680050688. Throughput: 0: 41859.1. Samples: 680155540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 18:34:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 18:34:14,715][15401] Updated weights for policy 0, policy_version 41510 (0.0035) [2024-06-21 18:34:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 680230912. Throughput: 0: 41712.2. Samples: 680407160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 18:34:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 18:34:19,206][15401] Updated weights for policy 0, policy_version 41520 (0.0028) [2024-06-21 18:34:22,607][15401] Updated weights for policy 0, policy_version 41530 (0.0035) [2024-06-21 18:34:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 680443904. Throughput: 0: 41512.1. Samples: 680525180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 18:34:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 18:34:27,044][15401] Updated weights for policy 0, policy_version 41540 (0.0042) [2024-06-21 18:34:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 680673280. Throughput: 0: 41774.4. Samples: 680782340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 18:34:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 18:34:30,209][15401] Updated weights for policy 0, policy_version 41550 (0.0032) [2024-06-21 18:34:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41237.6, 300 sec: 41709.8). Total num frames: 680853504. Throughput: 0: 41666.8. Samples: 681037480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 18:34:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 18:34:34,780][15401] Updated weights for policy 0, policy_version 41560 (0.0051) [2024-06-21 18:34:38,135][15401] Updated weights for policy 0, policy_version 41570 (0.0036) [2024-06-21 18:34:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41765.9). Total num frames: 681082880. Throughput: 0: 41762.7. Samples: 681157320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 18:34:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 18:34:42,522][15401] Updated weights for policy 0, policy_version 41580 (0.0030) [2024-06-21 18:34:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 681295872. Throughput: 0: 42053.3. Samples: 681416680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 18:34:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 18:34:46,104][15401] Updated weights for policy 0, policy_version 41590 (0.0032) [2024-06-21 18:34:46,977][15349] Signal inference workers to stop experience collection... (9850 times) [2024-06-21 18:34:46,984][15349] Signal inference workers to resume experience collection... (9850 times) [2024-06-21 18:34:46,987][15401] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-21 18:34:47,010][15401] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-21 18:34:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 681492480. Throughput: 0: 41857.3. Samples: 681666160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 18:34:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 18:34:49,983][15401] Updated weights for policy 0, policy_version 41600 (0.0046) [2024-06-21 18:34:53,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41777.5, 300 sec: 41654.2). Total num frames: 681705472. Throughput: 0: 42040.3. Samples: 681793160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:34:53,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 18:34:54,093][15401] Updated weights for policy 0, policy_version 41610 (0.0039) [2024-06-21 18:34:57,612][15401] Updated weights for policy 0, policy_version 41620 (0.0042) [2024-06-21 18:34:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 41932.0). Total num frames: 681951232. Throughput: 0: 42005.8. Samples: 682045800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:34:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 18:35:01,754][15401] Updated weights for policy 0, policy_version 41630 (0.0038) [2024-06-21 18:35:03,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 682115072. Throughput: 0: 42056.6. Samples: 682299700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:35:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 18:35:05,280][15401] Updated weights for policy 0, policy_version 41640 (0.0044) [2024-06-21 18:35:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 682344448. Throughput: 0: 42165.8. Samples: 682422640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:35:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-21 18:35:09,360][15401] Updated weights for policy 0, policy_version 41650 (0.0028) [2024-06-21 18:35:13,254][15401] Updated weights for policy 0, policy_version 41660 (0.0029) [2024-06-21 18:35:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 682557440. Throughput: 0: 42120.9. Samples: 682677780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:35:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 18:35:17,047][15401] Updated weights for policy 0, policy_version 41670 (0.0033) [2024-06-21 18:35:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 682754048. Throughput: 0: 41975.9. Samples: 682926400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:35:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 18:35:21,094][15401] Updated weights for policy 0, policy_version 41680 (0.0042) [2024-06-21 18:35:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 682967040. Throughput: 0: 42080.9. Samples: 683050960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 18:35:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 18:35:24,836][15401] Updated weights for policy 0, policy_version 41690 (0.0038) [2024-06-21 18:35:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 683180032. Throughput: 0: 42012.9. Samples: 683307260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 18:35:28,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 18:35:29,017][15401] Updated weights for policy 0, policy_version 41700 (0.0045) [2024-06-21 18:35:32,895][15401] Updated weights for policy 0, policy_version 41710 (0.0038) [2024-06-21 18:35:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 41765.0). Total num frames: 683393024. Throughput: 0: 41814.2. Samples: 683547900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 18:35:33,393][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 18:35:36,843][15401] Updated weights for policy 0, policy_version 41720 (0.0033) [2024-06-21 18:35:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 683606016. Throughput: 0: 41970.7. Samples: 683681740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 18:35:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 18:35:40,551][15401] Updated weights for policy 0, policy_version 41730 (0.0034) [2024-06-21 18:35:43,389][15132] Fps is (10 sec: 39331.5, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 683786240. Throughput: 0: 41832.5. Samples: 683928260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 18:35:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 18:35:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041736_683802624.pth... [2024-06-21 18:35:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041123_673759232.pth [2024-06-21 18:35:44,810][15401] Updated weights for policy 0, policy_version 41740 (0.0036) [2024-06-21 18:35:48,371][15401] Updated weights for policy 0, policy_version 41750 (0.0041) [2024-06-21 18:35:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 684032000. Throughput: 0: 41638.2. Samples: 684173420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-21 18:35:48,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 18:35:52,490][15401] Updated weights for policy 0, policy_version 41760 (0.0032) [2024-06-21 18:35:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42054.0, 300 sec: 41820.9). Total num frames: 684228608. Throughput: 0: 41816.4. Samples: 684304380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-21 18:35:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 18:35:56,215][15401] Updated weights for policy 0, policy_version 41770 (0.0033) [2024-06-21 18:35:58,389][15132] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 684408832. Throughput: 0: 41682.6. Samples: 684553500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-21 18:35:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 18:36:00,147][15401] Updated weights for policy 0, policy_version 41780 (0.0041) [2024-06-21 18:36:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 684654592. Throughput: 0: 41694.2. Samples: 684802640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-21 18:36:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 18:36:03,922][15401] Updated weights for policy 0, policy_version 41790 (0.0029) [2024-06-21 18:36:08,113][15401] Updated weights for policy 0, policy_version 41800 (0.0047) [2024-06-21 18:36:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 684851200. Throughput: 0: 41872.1. Samples: 684935200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-21 18:36:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 18:36:11,418][15349] Signal inference workers to stop experience collection... (9900 times) [2024-06-21 18:36:11,419][15349] Signal inference workers to resume experience collection... (9900 times) [2024-06-21 18:36:11,459][15401] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-21 18:36:11,459][15401] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-21 18:36:11,555][15401] Updated weights for policy 0, policy_version 41810 (0.0039) [2024-06-21 18:36:13,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 685031424. Throughput: 0: 41447.0. Samples: 685172380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 18:36:15,958][15401] Updated weights for policy 0, policy_version 41820 (0.0044) [2024-06-21 18:36:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 685293568. Throughput: 0: 41840.9. Samples: 685430640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 18:36:19,243][15401] Updated weights for policy 0, policy_version 41830 (0.0038) [2024-06-21 18:36:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 685473792. Throughput: 0: 41747.1. Samples: 685560460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:23,393][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 18:36:23,701][15401] Updated weights for policy 0, policy_version 41840 (0.0038) [2024-06-21 18:36:27,092][15401] Updated weights for policy 0, policy_version 41850 (0.0042) [2024-06-21 18:36:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 685686784. Throughput: 0: 41605.7. Samples: 685800620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:28,393][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 18:36:31,800][15401] Updated weights for policy 0, policy_version 41860 (0.0044) [2024-06-21 18:36:33,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41780.9, 300 sec: 41820.9). Total num frames: 685899776. Throughput: 0: 41952.4. Samples: 686061280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 18:36:35,034][15401] Updated weights for policy 0, policy_version 41870 (0.0031) [2024-06-21 18:36:38,390][15132] Fps is (10 sec: 39330.9, 60 sec: 41233.1, 300 sec: 41765.7). Total num frames: 686080000. Throughput: 0: 41868.8. Samples: 686188480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 18:36:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 18:36:39,873][15401] Updated weights for policy 0, policy_version 41880 (0.0030) [2024-06-21 18:36:42,781][15401] Updated weights for policy 0, policy_version 41890 (0.0034) [2024-06-21 18:36:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.1, 300 sec: 41876.4). Total num frames: 686325760. Throughput: 0: 41703.8. Samples: 686430180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 18:36:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 18:36:47,774][15401] Updated weights for policy 0, policy_version 41900 (0.0033) [2024-06-21 18:36:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.0, 300 sec: 41821.2). Total num frames: 686522368. Throughput: 0: 42016.0. Samples: 686693360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 18:36:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 18:36:51,014][15401] Updated weights for policy 0, policy_version 41910 (0.0036) [2024-06-21 18:36:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 686718976. Throughput: 0: 41631.4. Samples: 686808620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 18:36:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 18:36:55,499][15401] Updated weights for policy 0, policy_version 41920 (0.0035) [2024-06-21 18:36:58,395][15132] Fps is (10 sec: 44214.6, 60 sec: 42594.8, 300 sec: 41931.2). Total num frames: 686964736. Throughput: 0: 41974.0. Samples: 687061420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 18:36:58,395][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 18:36:58,706][15401] Updated weights for policy 0, policy_version 41930 (0.0040) [2024-06-21 18:37:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 687128576. Throughput: 0: 41989.7. Samples: 687320180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-21 18:37:03,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 18:37:03,570][15401] Updated weights for policy 0, policy_version 41940 (0.0038) [2024-06-21 18:37:06,478][15401] Updated weights for policy 0, policy_version 41950 (0.0036) [2024-06-21 18:37:08,390][15132] Fps is (10 sec: 39341.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 687357952. Throughput: 0: 41661.3. Samples: 687435120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 18:37:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 18:37:11,545][15401] Updated weights for policy 0, policy_version 41960 (0.0030) [2024-06-21 18:37:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 687587328. Throughput: 0: 41908.5. Samples: 687686400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 18:37:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 18:37:14,903][15401] Updated weights for policy 0, policy_version 41970 (0.0040) [2024-06-21 18:37:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 687767552. Throughput: 0: 41622.1. Samples: 687934280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 18:37:18,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 18:37:19,370][15401] Updated weights for policy 0, policy_version 41980 (0.0041) [2024-06-21 18:37:22,662][15401] Updated weights for policy 0, policy_version 41990 (0.0041) [2024-06-21 18:37:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41507.8, 300 sec: 41820.9). Total num frames: 687964160. Throughput: 0: 41424.1. Samples: 688052560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 18:37:23,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 18:37:26,971][15401] Updated weights for policy 0, policy_version 42000 (0.0047) [2024-06-21 18:37:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41780.8, 300 sec: 41876.4). Total num frames: 688193536. Throughput: 0: 41824.1. Samples: 688312260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 18:37:28,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 18:37:31,106][15401] Updated weights for policy 0, policy_version 42010 (0.0043) [2024-06-21 18:37:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 688390144. Throughput: 0: 41528.5. Samples: 688562140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 18:37:34,968][15401] Updated weights for policy 0, policy_version 42020 (0.0030) [2024-06-21 18:37:36,004][15349] Signal inference workers to stop experience collection... (9950 times) [2024-06-21 18:37:36,005][15349] Signal inference workers to resume experience collection... (9950 times) [2024-06-21 18:37:36,044][15401] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-21 18:37:36,044][15401] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-21 18:37:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 688603136. Throughput: 0: 41682.6. Samples: 688684340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 18:37:38,786][15401] Updated weights for policy 0, policy_version 42030 (0.0033) [2024-06-21 18:37:42,679][15401] Updated weights for policy 0, policy_version 42040 (0.0027) [2024-06-21 18:37:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 688799744. Throughput: 0: 41697.6. Samples: 688937600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 18:37:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042042_688816128.pth... [2024-06-21 18:37:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041431_678805504.pth [2024-06-21 18:37:46,716][15401] Updated weights for policy 0, policy_version 42050 (0.0028) [2024-06-21 18:37:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 689029120. Throughput: 0: 41411.6. Samples: 689183700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 18:37:50,444][15401] Updated weights for policy 0, policy_version 42060 (0.0036) [2024-06-21 18:37:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 689242112. Throughput: 0: 41813.8. Samples: 689316740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:53,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-21 18:37:54,419][15401] Updated weights for policy 0, policy_version 42070 (0.0039) [2024-06-21 18:37:58,200][15401] Updated weights for policy 0, policy_version 42080 (0.0042) [2024-06-21 18:37:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41236.6, 300 sec: 41765.3). Total num frames: 689438720. Throughput: 0: 41781.3. Samples: 689566560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-21 18:37:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 18:38:02,000][15401] Updated weights for policy 0, policy_version 42090 (0.0034) [2024-06-21 18:38:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 689635328. Throughput: 0: 41786.2. Samples: 689814660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 18:38:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 18:38:05,860][15401] Updated weights for policy 0, policy_version 42100 (0.0045) [2024-06-21 18:38:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 689848320. Throughput: 0: 41866.7. Samples: 689936560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 18:38:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 18:38:09,673][15401] Updated weights for policy 0, policy_version 42110 (0.0047) [2024-06-21 18:38:13,390][15132] Fps is (10 sec: 42595.0, 60 sec: 41232.5, 300 sec: 41820.8). Total num frames: 690061312. Throughput: 0: 41683.3. Samples: 690188040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 18:38:13,391][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 18:38:13,732][15401] Updated weights for policy 0, policy_version 42120 (0.0032) [2024-06-21 18:38:17,466][15401] Updated weights for policy 0, policy_version 42130 (0.0027) [2024-06-21 18:38:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 690257920. Throughput: 0: 41748.5. Samples: 690440820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 18:38:18,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-21 18:38:21,739][15401] Updated weights for policy 0, policy_version 42140 (0.0035) [2024-06-21 18:38:23,390][15132] Fps is (10 sec: 39324.6, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 690454528. Throughput: 0: 41752.4. Samples: 690563200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 18:38:23,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 18:38:25,671][15401] Updated weights for policy 0, policy_version 42150 (0.0048) [2024-06-21 18:38:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41821.8). Total num frames: 690716672. Throughput: 0: 41581.7. Samples: 690808780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:38:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 18:38:29,705][15401] Updated weights for policy 0, policy_version 42160 (0.0038) [2024-06-21 18:38:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 690896896. Throughput: 0: 41653.7. Samples: 691058120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:38:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 18:38:33,414][15401] Updated weights for policy 0, policy_version 42170 (0.0042) [2024-06-21 18:38:37,983][15401] Updated weights for policy 0, policy_version 42180 (0.0031) [2024-06-21 18:38:38,390][15132] Fps is (10 sec: 36044.6, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 691077120. Throughput: 0: 41380.8. Samples: 691178880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:38:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 18:38:41,310][15401] Updated weights for policy 0, policy_version 42190 (0.0057) [2024-06-21 18:38:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 691322880. Throughput: 0: 41565.3. Samples: 691437000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:38:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 18:38:45,493][15401] Updated weights for policy 0, policy_version 42200 (0.0029) [2024-06-21 18:38:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 691535872. Throughput: 0: 41419.5. Samples: 691678540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:38:48,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 18:38:49,382][15401] Updated weights for policy 0, policy_version 42210 (0.0038) [2024-06-21 18:38:53,307][15401] Updated weights for policy 0, policy_version 42220 (0.0029) [2024-06-21 18:38:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 691732480. Throughput: 0: 41514.2. Samples: 691804700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:38:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 18:38:57,147][15401] Updated weights for policy 0, policy_version 42230 (0.0045) [2024-06-21 18:38:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 691929088. Throughput: 0: 41515.0. Samples: 692056180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:38:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 18:38:59,332][15349] Signal inference workers to stop experience collection... (10000 times) [2024-06-21 18:38:59,333][15349] Signal inference workers to resume experience collection... (10000 times) [2024-06-21 18:38:59,359][15401] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-21 18:38:59,360][15401] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-21 18:39:01,151][15401] Updated weights for policy 0, policy_version 42240 (0.0049) [2024-06-21 18:39:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 692142080. Throughput: 0: 41480.9. Samples: 692307460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:39:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 18:39:04,759][15401] Updated weights for policy 0, policy_version 42250 (0.0045) [2024-06-21 18:39:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 692355072. Throughput: 0: 41556.0. Samples: 692433220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:39:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 18:39:08,971][15401] Updated weights for policy 0, policy_version 42260 (0.0033) [2024-06-21 18:39:12,444][15401] Updated weights for policy 0, policy_version 42270 (0.0046) [2024-06-21 18:39:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.7, 300 sec: 41820.8). Total num frames: 692568064. Throughput: 0: 41627.1. Samples: 692682000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:39:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 18:39:16,944][15401] Updated weights for policy 0, policy_version 42280 (0.0042) [2024-06-21 18:39:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 692748288. Throughput: 0: 41883.6. Samples: 692942880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 18:39:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 18:39:20,243][15401] Updated weights for policy 0, policy_version 42290 (0.0030) [2024-06-21 18:39:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 692994048. Throughput: 0: 41919.6. Samples: 693065260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:39:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 18:39:24,822][15401] Updated weights for policy 0, policy_version 42300 (0.0027) [2024-06-21 18:39:28,077][15401] Updated weights for policy 0, policy_version 42310 (0.0034) [2024-06-21 18:39:28,392][15132] Fps is (10 sec: 45863.9, 60 sec: 41504.5, 300 sec: 41876.0). Total num frames: 693207040. Throughput: 0: 41659.5. Samples: 693311780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:39:28,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 18:39:32,975][15401] Updated weights for policy 0, policy_version 42320 (0.0036) [2024-06-21 18:39:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 693387264. Throughput: 0: 42048.9. Samples: 693570740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:39:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 18:39:35,870][15401] Updated weights for policy 0, policy_version 42330 (0.0037) [2024-06-21 18:39:38,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 693600256. Throughput: 0: 41814.3. Samples: 693686340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:39:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 18:39:41,011][15401] Updated weights for policy 0, policy_version 42340 (0.0039) [2024-06-21 18:39:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 693846016. Throughput: 0: 41994.6. Samples: 693945940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 18:39:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 18:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042349_693846016.pth... [2024-06-21 18:39:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000041736_683802624.pth [2024-06-21 18:39:44,155][15401] Updated weights for policy 0, policy_version 42350 (0.0033) [2024-06-21 18:39:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41710.1). Total num frames: 694009856. Throughput: 0: 42013.2. Samples: 694198060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:39:48,395][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 18:39:48,885][15401] Updated weights for policy 0, policy_version 42360 (0.0028) [2024-06-21 18:39:51,895][15401] Updated weights for policy 0, policy_version 42370 (0.0048) [2024-06-21 18:39:53,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 694222848. Throughput: 0: 41777.3. Samples: 694313200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:39:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 18:39:56,666][15401] Updated weights for policy 0, policy_version 42380 (0.0034) [2024-06-21 18:39:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 694452224. Throughput: 0: 42018.3. Samples: 694572820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:39:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 18:39:59,598][15401] Updated weights for policy 0, policy_version 42390 (0.0039) [2024-06-21 18:40:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 694648832. Throughput: 0: 41687.9. Samples: 694818840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:40:03,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 18:40:04,301][15401] Updated weights for policy 0, policy_version 42400 (0.0032) [2024-06-21 18:40:07,246][15401] Updated weights for policy 0, policy_version 42410 (0.0025) [2024-06-21 18:40:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 694878208. Throughput: 0: 41707.5. Samples: 694942100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:40:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 18:40:12,607][15401] Updated weights for policy 0, policy_version 42420 (0.0038) [2024-06-21 18:40:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 695074816. Throughput: 0: 41914.1. Samples: 695197820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 18:40:13,391][15132] Avg episode reward: [(0, '0.145')] [2024-06-21 18:40:14,807][15349] Signal inference workers to stop experience collection... (10050 times) [2024-06-21 18:40:14,809][15349] Signal inference workers to resume experience collection... (10050 times) [2024-06-21 18:40:14,851][15401] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-21 18:40:14,851][15401] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-21 18:40:15,223][15401] Updated weights for policy 0, policy_version 42430 (0.0048) [2024-06-21 18:40:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 695271424. Throughput: 0: 41615.7. Samples: 695443440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 18:40:18,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 18:40:20,267][15401] Updated weights for policy 0, policy_version 42440 (0.0036) [2024-06-21 18:40:22,977][15401] Updated weights for policy 0, policy_version 42450 (0.0033) [2024-06-21 18:40:23,396][15132] Fps is (10 sec: 44209.5, 60 sec: 42047.8, 300 sec: 41819.9). Total num frames: 695517184. Throughput: 0: 41856.7. Samples: 695570160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 18:40:23,396][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 18:40:28,119][15401] Updated weights for policy 0, policy_version 42460 (0.0038) [2024-06-21 18:40:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40961.7, 300 sec: 41599.0). Total num frames: 695664640. Throughput: 0: 41660.9. Samples: 695820680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 18:40:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-21 18:40:30,989][15401] Updated weights for policy 0, policy_version 42470 (0.0026) [2024-06-21 18:40:33,389][15132] Fps is (10 sec: 39346.7, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 695910400. Throughput: 0: 41512.5. Samples: 696066120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 18:40:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 18:40:36,182][15401] Updated weights for policy 0, policy_version 42480 (0.0039) [2024-06-21 18:40:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 696123392. Throughput: 0: 42018.8. Samples: 696204040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 18:40:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 18:40:38,700][15401] Updated weights for policy 0, policy_version 42490 (0.0035) [2024-06-21 18:40:43,389][15132] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 41543.2). Total num frames: 696287232. Throughput: 0: 41646.7. Samples: 696446920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-21 18:40:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 18:40:43,853][15401] Updated weights for policy 0, policy_version 42500 (0.0029) [2024-06-21 18:40:46,817][15401] Updated weights for policy 0, policy_version 42510 (0.0043) [2024-06-21 18:40:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 696549376. Throughput: 0: 41566.6. Samples: 696689340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-21 18:40:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 18:40:51,914][15401] Updated weights for policy 0, policy_version 42520 (0.0038) [2024-06-21 18:40:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 696745984. Throughput: 0: 41833.9. Samples: 696824620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-21 18:40:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 18:40:54,837][15401] Updated weights for policy 0, policy_version 42530 (0.0035) [2024-06-21 18:40:58,390][15132] Fps is (10 sec: 37683.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 696926208. Throughput: 0: 41569.0. Samples: 697068420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-21 18:40:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 18:40:59,782][15401] Updated weights for policy 0, policy_version 42540 (0.0054) [2024-06-21 18:41:02,607][15401] Updated weights for policy 0, policy_version 42550 (0.0044) [2024-06-21 18:41:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 41765.0). Total num frames: 697171968. Throughput: 0: 41688.3. Samples: 697319520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-21 18:41:03,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 18:41:07,541][15401] Updated weights for policy 0, policy_version 42560 (0.0034) [2024-06-21 18:41:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 697352192. Throughput: 0: 41727.3. Samples: 697447620. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 18:41:10,304][15401] Updated weights for policy 0, policy_version 42570 (0.0033) [2024-06-21 18:41:13,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 697565184. Throughput: 0: 41516.7. Samples: 697688940. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 18:41:15,271][15401] Updated weights for policy 0, policy_version 42580 (0.0031) [2024-06-21 18:41:18,238][15401] Updated weights for policy 0, policy_version 42590 (0.0046) [2024-06-21 18:41:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41765.7). Total num frames: 697794560. Throughput: 0: 41709.4. Samples: 697943040. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 18:41:23,047][15401] Updated weights for policy 0, policy_version 42600 (0.0038) [2024-06-21 18:41:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 40964.3, 300 sec: 41654.6). Total num frames: 697974784. Throughput: 0: 41487.5. Samples: 698070980. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 18:41:26,261][15401] Updated weights for policy 0, policy_version 42610 (0.0038) [2024-06-21 18:41:27,495][15349] Signal inference workers to stop experience collection... (10100 times) [2024-06-21 18:41:27,496][15349] Signal inference workers to resume experience collection... (10100 times) [2024-06-21 18:41:27,548][15401] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-21 18:41:27,548][15401] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-21 18:41:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 698204160. Throughput: 0: 41646.1. Samples: 698321000. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 18:41:30,662][15401] Updated weights for policy 0, policy_version 42620 (0.0050) [2024-06-21 18:41:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 698417152. Throughput: 0: 41823.6. Samples: 698571400. Policy #0 lag: (min: 1.0, avg: 7.7, max: 21.0) [2024-06-21 18:41:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 18:41:33,899][15401] Updated weights for policy 0, policy_version 42630 (0.0029) [2024-06-21 18:41:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 698597376. Throughput: 0: 41584.5. Samples: 698695920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 18:41:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 18:41:38,472][15401] Updated weights for policy 0, policy_version 42640 (0.0045) [2024-06-21 18:41:41,687][15401] Updated weights for policy 0, policy_version 42650 (0.0036) [2024-06-21 18:41:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 698826752. Throughput: 0: 41683.6. Samples: 698944180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 18:41:43,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 18:41:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042654_698843136.pth... [2024-06-21 18:41:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042042_688816128.pth [2024-06-21 18:41:46,041][15401] Updated weights for policy 0, policy_version 42660 (0.0056) [2024-06-21 18:41:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 41231.5, 300 sec: 41709.4). Total num frames: 699023360. Throughput: 0: 41780.0. Samples: 699199620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 18:41:48,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 18:41:49,708][15401] Updated weights for policy 0, policy_version 42670 (0.0044) [2024-06-21 18:41:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41599.4). Total num frames: 699236352. Throughput: 0: 41639.5. Samples: 699321400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 18:41:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 18:41:54,099][15401] Updated weights for policy 0, policy_version 42680 (0.0027) [2024-06-21 18:41:57,744][15401] Updated weights for policy 0, policy_version 42690 (0.0032) [2024-06-21 18:41:58,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42050.6, 300 sec: 41765.0). Total num frames: 699449344. Throughput: 0: 41919.7. Samples: 699575420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 18:41:58,392][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 18:42:01,837][15401] Updated weights for policy 0, policy_version 42700 (0.0045) [2024-06-21 18:42:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41234.6, 300 sec: 41654.2). Total num frames: 699645952. Throughput: 0: 41716.7. Samples: 699820300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 18:42:05,474][15401] Updated weights for policy 0, policy_version 42710 (0.0034) [2024-06-21 18:42:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 699875328. Throughput: 0: 41638.8. Samples: 699944720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 18:42:09,458][15401] Updated weights for policy 0, policy_version 42720 (0.0045) [2024-06-21 18:42:13,334][15401] Updated weights for policy 0, policy_version 42730 (0.0029) [2024-06-21 18:42:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 700088320. Throughput: 0: 41744.8. Samples: 700199520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 18:42:17,014][15401] Updated weights for policy 0, policy_version 42740 (0.0032) [2024-06-21 18:42:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 700284928. Throughput: 0: 41689.4. Samples: 700447420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 18:42:21,273][15401] Updated weights for policy 0, policy_version 42750 (0.0025) [2024-06-21 18:42:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 700497920. Throughput: 0: 41715.1. Samples: 700573100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:23,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 18:42:24,806][15401] Updated weights for policy 0, policy_version 42760 (0.0040) [2024-06-21 18:42:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 700678144. Throughput: 0: 41884.9. Samples: 700829000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-21 18:42:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 18:42:29,336][15401] Updated weights for policy 0, policy_version 42770 (0.0039) [2024-06-21 18:42:32,407][15401] Updated weights for policy 0, policy_version 42780 (0.0043) [2024-06-21 18:42:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 700940288. Throughput: 0: 41587.0. Samples: 701070940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 18:42:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 18:42:37,502][15401] Updated weights for policy 0, policy_version 42790 (0.0040) [2024-06-21 18:42:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 701104128. Throughput: 0: 41763.6. Samples: 701200760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 18:42:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 18:42:40,162][15401] Updated weights for policy 0, policy_version 42800 (0.0035) [2024-06-21 18:42:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 701317120. Throughput: 0: 41688.1. Samples: 701451280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 18:42:43,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-21 18:42:45,113][15401] Updated weights for policy 0, policy_version 42810 (0.0038) [2024-06-21 18:42:47,280][15349] Signal inference workers to stop experience collection... (10150 times) [2024-06-21 18:42:47,330][15401] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-21 18:42:47,338][15349] Signal inference workers to resume experience collection... (10150 times) [2024-06-21 18:42:47,341][15401] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-21 18:42:48,127][15401] Updated weights for policy 0, policy_version 42820 (0.0034) [2024-06-21 18:42:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42327.0, 300 sec: 41765.3). Total num frames: 701562880. Throughput: 0: 41568.9. Samples: 701690900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 18:42:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 18:42:52,913][15401] Updated weights for policy 0, policy_version 42830 (0.0036) [2024-06-21 18:42:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 701743104. Throughput: 0: 41797.7. Samples: 701825620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 18:42:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 18:42:55,921][15401] Updated weights for policy 0, policy_version 42840 (0.0029) [2024-06-21 18:42:58,394][15132] Fps is (10 sec: 37667.1, 60 sec: 41504.8, 300 sec: 41709.2). Total num frames: 701939712. Throughput: 0: 41582.3. Samples: 702070900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:42:58,395][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 18:43:00,782][15401] Updated weights for policy 0, policy_version 42850 (0.0055) [2024-06-21 18:43:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 702169088. Throughput: 0: 41696.8. Samples: 702323780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:43:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 18:43:03,764][15401] Updated weights for policy 0, policy_version 42860 (0.0044) [2024-06-21 18:43:08,389][15132] Fps is (10 sec: 40978.2, 60 sec: 41233.1, 300 sec: 41654.4). Total num frames: 702349312. Throughput: 0: 41840.1. Samples: 702455900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:43:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 18:43:08,666][15401] Updated weights for policy 0, policy_version 42870 (0.0028) [2024-06-21 18:43:11,609][15401] Updated weights for policy 0, policy_version 42880 (0.0027) [2024-06-21 18:43:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 702578688. Throughput: 0: 41533.0. Samples: 702697980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:43:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 18:43:16,702][15401] Updated weights for policy 0, policy_version 42890 (0.0035) [2024-06-21 18:43:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 702808064. Throughput: 0: 41783.1. Samples: 702951180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:43:18,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 18:43:19,680][15401] Updated weights for policy 0, policy_version 42900 (0.0027) [2024-06-21 18:43:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 702971904. Throughput: 0: 41660.9. Samples: 703075500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 18:43:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 18:43:24,370][15401] Updated weights for policy 0, policy_version 42910 (0.0031) [2024-06-21 18:43:27,784][15401] Updated weights for policy 0, policy_version 42920 (0.0040) [2024-06-21 18:43:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 703217664. Throughput: 0: 41550.1. Samples: 703321040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 18:43:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 18:43:32,307][15401] Updated weights for policy 0, policy_version 42930 (0.0042) [2024-06-21 18:43:33,390][15132] Fps is (10 sec: 44232.4, 60 sec: 41232.4, 300 sec: 41820.7). Total num frames: 703414272. Throughput: 0: 41855.6. Samples: 703574440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 18:43:33,391][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 18:43:35,551][15401] Updated weights for policy 0, policy_version 42940 (0.0032) [2024-06-21 18:43:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 703594496. Throughput: 0: 41553.4. Samples: 703695520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 18:43:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-21 18:43:40,193][15401] Updated weights for policy 0, policy_version 42950 (0.0053) [2024-06-21 18:43:43,390][15132] Fps is (10 sec: 40963.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 703823872. Throughput: 0: 41576.9. Samples: 703941680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 18:43:43,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 18:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042958_703823872.pth... [2024-06-21 18:43:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042349_693846016.pth [2024-06-21 18:43:43,915][15401] Updated weights for policy 0, policy_version 42960 (0.0044) [2024-06-21 18:43:48,069][15401] Updated weights for policy 0, policy_version 42970 (0.0034) [2024-06-21 18:43:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 704036864. Throughput: 0: 41517.3. Samples: 704192060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 18:43:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 18:43:51,717][15401] Updated weights for policy 0, policy_version 42980 (0.0033) [2024-06-21 18:43:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 704217088. Throughput: 0: 41347.5. Samples: 704316540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:43:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 18:43:55,589][15401] Updated weights for policy 0, policy_version 42990 (0.0049) [2024-06-21 18:43:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41782.3, 300 sec: 41709.8). Total num frames: 704446464. Throughput: 0: 41539.0. Samples: 704567240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:43:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 18:43:59,564][15401] Updated weights for policy 0, policy_version 43000 (0.0049) [2024-06-21 18:44:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 704659456. Throughput: 0: 41521.4. Samples: 704819640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:44:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 18:44:03,709][15401] Updated weights for policy 0, policy_version 43010 (0.0045) [2024-06-21 18:44:07,657][15401] Updated weights for policy 0, policy_version 43020 (0.0047) [2024-06-21 18:44:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 704856064. Throughput: 0: 41460.5. Samples: 704941220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:44:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 18:44:11,386][15401] Updated weights for policy 0, policy_version 43030 (0.0038) [2024-06-21 18:44:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 705101824. Throughput: 0: 41580.8. Samples: 705192180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 18:44:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 18:44:15,385][15349] Signal inference workers to stop experience collection... (10200 times) [2024-06-21 18:44:15,416][15401] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-21 18:44:15,443][15349] Signal inference workers to resume experience collection... (10200 times) [2024-06-21 18:44:15,444][15401] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-21 18:44:15,589][15401] Updated weights for policy 0, policy_version 43040 (0.0034) [2024-06-21 18:44:18,390][15132] Fps is (10 sec: 40958.9, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 705265664. Throughput: 0: 41468.7. Samples: 705440500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 18:44:19,575][15401] Updated weights for policy 0, policy_version 43050 (0.0038) [2024-06-21 18:44:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41599.0). Total num frames: 705478656. Throughput: 0: 41369.3. Samples: 705557140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:23,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 18:44:23,643][15401] Updated weights for policy 0, policy_version 43060 (0.0036) [2024-06-21 18:44:27,554][15401] Updated weights for policy 0, policy_version 43070 (0.0045) [2024-06-21 18:44:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 705708032. Throughput: 0: 41459.9. Samples: 705807380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 18:44:31,679][15401] Updated weights for policy 0, policy_version 43080 (0.0030) [2024-06-21 18:44:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.7, 300 sec: 41598.7). Total num frames: 705871872. Throughput: 0: 41401.0. Samples: 706055100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:33,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 18:44:35,455][15401] Updated weights for policy 0, policy_version 43090 (0.0034) [2024-06-21 18:44:38,389][15132] Fps is (10 sec: 37684.1, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 706084864. Throughput: 0: 41307.2. Samples: 706175360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:38,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-21 18:44:39,398][15401] Updated weights for policy 0, policy_version 43100 (0.0035) [2024-06-21 18:44:43,177][15401] Updated weights for policy 0, policy_version 43110 (0.0042) [2024-06-21 18:44:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 706314240. Throughput: 0: 41518.7. Samples: 706435580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 18:44:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 18:44:47,371][15401] Updated weights for policy 0, policy_version 43120 (0.0032) [2024-06-21 18:44:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 706527232. Throughput: 0: 41280.5. Samples: 706677260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:44:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 18:44:51,015][15401] Updated weights for policy 0, policy_version 43130 (0.0039) [2024-06-21 18:44:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 706723840. Throughput: 0: 41363.0. Samples: 706802560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:44:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 18:44:55,178][15401] Updated weights for policy 0, policy_version 43140 (0.0036) [2024-06-21 18:44:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 706920448. Throughput: 0: 41509.0. Samples: 707060080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:44:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 18:44:58,924][15401] Updated weights for policy 0, policy_version 43150 (0.0039) [2024-06-21 18:45:02,952][15401] Updated weights for policy 0, policy_version 43160 (0.0033) [2024-06-21 18:45:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 707133440. Throughput: 0: 41366.7. Samples: 707302000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:45:03,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 18:45:06,675][15401] Updated weights for policy 0, policy_version 43170 (0.0043) [2024-06-21 18:45:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 707346432. Throughput: 0: 41709.7. Samples: 707434080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:45:08,394][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 18:45:10,624][15401] Updated weights for policy 0, policy_version 43180 (0.0037) [2024-06-21 18:45:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41598.7). Total num frames: 707543040. Throughput: 0: 41679.6. Samples: 707682960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 18:45:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 18:45:14,451][15401] Updated weights for policy 0, policy_version 43190 (0.0040) [2024-06-21 18:45:18,396][15132] Fps is (10 sec: 42571.4, 60 sec: 41774.9, 300 sec: 41543.2). Total num frames: 707772416. Throughput: 0: 41666.1. Samples: 707930340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:18,397][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 18:45:18,467][15401] Updated weights for policy 0, policy_version 43200 (0.0044) [2024-06-21 18:45:22,225][15401] Updated weights for policy 0, policy_version 43210 (0.0036) [2024-06-21 18:45:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 707985408. Throughput: 0: 41870.1. Samples: 708059520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 18:45:25,463][15349] Signal inference workers to stop experience collection... (10250 times) [2024-06-21 18:45:25,518][15401] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-21 18:45:25,522][15349] Signal inference workers to resume experience collection... (10250 times) [2024-06-21 18:45:25,530][15401] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-21 18:45:26,161][15401] Updated weights for policy 0, policy_version 43220 (0.0033) [2024-06-21 18:45:28,390][15132] Fps is (10 sec: 40986.1, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 708182016. Throughput: 0: 41727.9. Samples: 708313340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 18:45:30,170][15401] Updated weights for policy 0, policy_version 43230 (0.0031) [2024-06-21 18:45:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 708395008. Throughput: 0: 41927.6. Samples: 708564000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 18:45:33,849][15401] Updated weights for policy 0, policy_version 43240 (0.0037) [2024-06-21 18:45:37,931][15401] Updated weights for policy 0, policy_version 43250 (0.0030) [2024-06-21 18:45:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 708624384. Throughput: 0: 41956.5. Samples: 708690600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 18:45:41,958][15401] Updated weights for policy 0, policy_version 43260 (0.0041) [2024-06-21 18:45:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 708804608. Throughput: 0: 41706.0. Samples: 708936860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 18:45:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043262_708804608.pth... [2024-06-21 18:45:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042654_698843136.pth [2024-06-21 18:45:45,710][15401] Updated weights for policy 0, policy_version 43270 (0.0046) [2024-06-21 18:45:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 709017600. Throughput: 0: 41973.0. Samples: 709190780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 18:45:49,710][15401] Updated weights for policy 0, policy_version 43280 (0.0035) [2024-06-21 18:45:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 709230592. Throughput: 0: 41811.6. Samples: 709315600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:53,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 18:45:54,038][15401] Updated weights for policy 0, policy_version 43290 (0.0038) [2024-06-21 18:45:58,018][15401] Updated weights for policy 0, policy_version 43300 (0.0027) [2024-06-21 18:45:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41543.5). Total num frames: 709427200. Throughput: 0: 41741.1. Samples: 709561300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:45:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 18:46:01,825][15401] Updated weights for policy 0, policy_version 43310 (0.0042) [2024-06-21 18:46:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 709623808. Throughput: 0: 41886.9. Samples: 709814980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:46:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 18:46:05,862][15401] Updated weights for policy 0, policy_version 43320 (0.0051) [2024-06-21 18:46:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 709853184. Throughput: 0: 41813.9. Samples: 709941140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 18:46:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 18:46:09,805][15401] Updated weights for policy 0, policy_version 43330 (0.0025) [2024-06-21 18:46:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 710066176. Throughput: 0: 41692.0. Samples: 710189480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:46:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 18:46:13,598][15401] Updated weights for policy 0, policy_version 43340 (0.0044) [2024-06-21 18:46:17,686][15401] Updated weights for policy 0, policy_version 43350 (0.0033) [2024-06-21 18:46:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41510.5, 300 sec: 41654.2). Total num frames: 710262784. Throughput: 0: 41754.5. Samples: 710442960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:46:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 18:46:21,185][15401] Updated weights for policy 0, policy_version 43360 (0.0036) [2024-06-21 18:46:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 710492160. Throughput: 0: 41638.6. Samples: 710564340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:46:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 18:46:25,390][15401] Updated weights for policy 0, policy_version 43370 (0.0054) [2024-06-21 18:46:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 710688768. Throughput: 0: 41746.8. Samples: 710815460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:46:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 18:46:29,162][15401] Updated weights for policy 0, policy_version 43380 (0.0031) [2024-06-21 18:46:33,130][15401] Updated weights for policy 0, policy_version 43390 (0.0038) [2024-06-21 18:46:33,394][15132] Fps is (10 sec: 40941.9, 60 sec: 41776.0, 300 sec: 41709.1). Total num frames: 710901760. Throughput: 0: 41596.8. Samples: 711062820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:46:33,394][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 18:46:37,235][15401] Updated weights for policy 0, policy_version 43400 (0.0033) [2024-06-21 18:46:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 711114752. Throughput: 0: 41677.2. Samples: 711191080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:46:38,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-21 18:46:38,391][15349] Saving new best policy, reward=0.891! [2024-06-21 18:46:40,887][15401] Updated weights for policy 0, policy_version 43410 (0.0030) [2024-06-21 18:46:43,392][15132] Fps is (10 sec: 42607.2, 60 sec: 42050.7, 300 sec: 41709.8). Total num frames: 711327744. Throughput: 0: 41877.3. Samples: 711445880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:46:43,392][15132] Avg episode reward: [(0, '0.891')] [2024-06-21 18:46:44,949][15401] Updated weights for policy 0, policy_version 43420 (0.0046) [2024-06-21 18:46:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 711524352. Throughput: 0: 41767.5. Samples: 711694520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:46:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 18:46:49,065][15401] Updated weights for policy 0, policy_version 43430 (0.0033) [2024-06-21 18:46:52,930][15401] Updated weights for policy 0, policy_version 43440 (0.0036) [2024-06-21 18:46:53,389][15132] Fps is (10 sec: 40969.8, 60 sec: 41779.2, 300 sec: 41654.6). Total num frames: 711737344. Throughput: 0: 41863.1. Samples: 711824980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:46:53,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 18:46:56,785][15401] Updated weights for policy 0, policy_version 43450 (0.0045) [2024-06-21 18:46:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 711933952. Throughput: 0: 41934.8. Samples: 712076540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:46:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 18:47:00,622][15401] Updated weights for policy 0, policy_version 43460 (0.0043) [2024-06-21 18:47:01,140][15349] Signal inference workers to stop experience collection... (10300 times) [2024-06-21 18:47:01,140][15349] Signal inference workers to resume experience collection... (10300 times) [2024-06-21 18:47:01,180][15401] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-21 18:47:01,180][15401] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-21 18:47:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 712146944. Throughput: 0: 41710.3. Samples: 712319920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 18:47:03,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-21 18:47:04,535][15401] Updated weights for policy 0, policy_version 43470 (0.0042) [2024-06-21 18:47:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 712343552. Throughput: 0: 41802.3. Samples: 712445440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:47:08,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-21 18:47:08,673][15401] Updated weights for policy 0, policy_version 43480 (0.0036) [2024-06-21 18:47:12,266][15401] Updated weights for policy 0, policy_version 43490 (0.0040) [2024-06-21 18:47:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 712589312. Throughput: 0: 41775.6. Samples: 712695360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:47:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 18:47:16,285][15401] Updated weights for policy 0, policy_version 43500 (0.0044) [2024-06-21 18:47:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 712785920. Throughput: 0: 41897.0. Samples: 712948000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:47:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 18:47:20,047][15401] Updated weights for policy 0, policy_version 43510 (0.0040) [2024-06-21 18:47:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 712966144. Throughput: 0: 41859.8. Samples: 713074760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:47:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 18:47:24,103][15401] Updated weights for policy 0, policy_version 43520 (0.0033) [2024-06-21 18:47:28,327][15401] Updated weights for policy 0, policy_version 43530 (0.0030) [2024-06-21 18:47:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 713195520. Throughput: 0: 41738.3. Samples: 713324000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 18:47:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 18:47:32,032][15401] Updated weights for policy 0, policy_version 43540 (0.0035) [2024-06-21 18:47:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41782.3, 300 sec: 41709.8). Total num frames: 713408512. Throughput: 0: 41716.9. Samples: 713571780. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:33,398][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 18:47:36,267][15401] Updated weights for policy 0, policy_version 43550 (0.0031) [2024-06-21 18:47:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.3, 300 sec: 41654.2). Total num frames: 713605120. Throughput: 0: 41616.1. Samples: 713697700. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 18:47:40,090][15401] Updated weights for policy 0, policy_version 43560 (0.0038) [2024-06-21 18:47:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40961.6, 300 sec: 41432.1). Total num frames: 713785344. Throughput: 0: 41563.9. Samples: 713946920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 18:47:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043567_713801728.pth... [2024-06-21 18:47:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000042958_703823872.pth [2024-06-21 18:47:43,998][15401] Updated weights for policy 0, policy_version 43570 (0.0036) [2024-06-21 18:47:47,882][15401] Updated weights for policy 0, policy_version 43580 (0.0030) [2024-06-21 18:47:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 714031104. Throughput: 0: 41729.8. Samples: 714197760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:48,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 18:47:51,745][15401] Updated weights for policy 0, policy_version 43590 (0.0036) [2024-06-21 18:47:53,396][15132] Fps is (10 sec: 47483.1, 60 sec: 42047.8, 300 sec: 41765.0). Total num frames: 714260480. Throughput: 0: 41866.0. Samples: 714329680. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:53,397][15132] Avg episode reward: [(0, '0.326')] [2024-06-21 18:47:55,563][15401] Updated weights for policy 0, policy_version 43600 (0.0038) [2024-06-21 18:47:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 714440704. Throughput: 0: 41843.9. Samples: 714578340. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 18:47:58,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 18:47:59,351][15401] Updated weights for policy 0, policy_version 43610 (0.0034) [2024-06-21 18:48:03,389][15132] Fps is (10 sec: 39346.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 714653696. Throughput: 0: 41853.8. Samples: 714831420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 18:48:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 18:48:03,487][15401] Updated weights for policy 0, policy_version 43620 (0.0036) [2024-06-21 18:48:07,021][15401] Updated weights for policy 0, policy_version 43630 (0.0029) [2024-06-21 18:48:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 714883072. Throughput: 0: 41775.5. Samples: 714954660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 18:48:08,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-21 18:48:11,157][15401] Updated weights for policy 0, policy_version 43640 (0.0035) [2024-06-21 18:48:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 715079680. Throughput: 0: 41731.0. Samples: 715201900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 18:48:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 18:48:14,253][15349] Signal inference workers to stop experience collection... (10350 times) [2024-06-21 18:48:14,309][15349] Signal inference workers to resume experience collection... (10350 times) [2024-06-21 18:48:14,310][15401] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-21 18:48:14,324][15401] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-21 18:48:14,609][15401] Updated weights for policy 0, policy_version 43650 (0.0026) [2024-06-21 18:48:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 715292672. Throughput: 0: 41947.5. Samples: 715459420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 18:48:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 18:48:18,906][15401] Updated weights for policy 0, policy_version 43660 (0.0039) [2024-06-21 18:48:22,231][15401] Updated weights for policy 0, policy_version 43670 (0.0045) [2024-06-21 18:48:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 41653.9). Total num frames: 715505664. Throughput: 0: 42023.4. Samples: 715588860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 18:48:23,393][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 18:48:26,624][15401] Updated weights for policy 0, policy_version 43680 (0.0035) [2024-06-21 18:48:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41654.4). Total num frames: 715702272. Throughput: 0: 42031.1. Samples: 715838320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:28,394][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 18:48:30,187][15401] Updated weights for policy 0, policy_version 43690 (0.0046) [2024-06-21 18:48:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 715915264. Throughput: 0: 42103.5. Samples: 716092420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 18:48:34,314][15401] Updated weights for policy 0, policy_version 43700 (0.0034) [2024-06-21 18:48:37,966][15401] Updated weights for policy 0, policy_version 43710 (0.0035) [2024-06-21 18:48:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 716144640. Throughput: 0: 41992.7. Samples: 716219080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:38,394][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 18:48:42,176][15401] Updated weights for policy 0, policy_version 43720 (0.0032) [2024-06-21 18:48:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 716324864. Throughput: 0: 41915.1. Samples: 716464520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-21 18:48:46,218][15401] Updated weights for policy 0, policy_version 43730 (0.0032) [2024-06-21 18:48:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 716570624. Throughput: 0: 41920.4. Samples: 716717840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 18:48:50,424][15401] Updated weights for policy 0, policy_version 43740 (0.0040) [2024-06-21 18:48:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 41783.7, 300 sec: 41765.3). Total num frames: 716767232. Throughput: 0: 42016.0. Samples: 716845380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-21 18:48:53,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 18:48:54,192][15401] Updated weights for policy 0, policy_version 43750 (0.0027) [2024-06-21 18:48:58,220][15401] Updated weights for policy 0, policy_version 43760 (0.0033) [2024-06-21 18:48:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 716980224. Throughput: 0: 42224.5. Samples: 717102000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:48:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 18:49:01,775][15401] Updated weights for policy 0, policy_version 43770 (0.0033) [2024-06-21 18:49:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 717209600. Throughput: 0: 41807.9. Samples: 717340780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:49:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 18:49:05,825][15401] Updated weights for policy 0, policy_version 43780 (0.0037) [2024-06-21 18:49:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 717389824. Throughput: 0: 41807.1. Samples: 717470080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:49:08,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 18:49:09,766][15401] Updated weights for policy 0, policy_version 43790 (0.0047) [2024-06-21 18:49:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 717602816. Throughput: 0: 41944.9. Samples: 717725840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:49:13,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-21 18:49:13,672][15401] Updated weights for policy 0, policy_version 43800 (0.0039) [2024-06-21 18:49:17,489][15401] Updated weights for policy 0, policy_version 43810 (0.0034) [2024-06-21 18:49:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 717832192. Throughput: 0: 41800.0. Samples: 717973420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 18:49:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 18:49:21,422][15401] Updated weights for policy 0, policy_version 43820 (0.0045) [2024-06-21 18:49:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 718028800. Throughput: 0: 41833.4. Samples: 718101580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 18:49:25,507][15401] Updated weights for policy 0, policy_version 43830 (0.0031) [2024-06-21 18:49:28,392][15132] Fps is (10 sec: 37674.3, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 718209024. Throughput: 0: 41812.0. Samples: 718346160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:28,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 18:49:29,267][15401] Updated weights for policy 0, policy_version 43840 (0.0037) [2024-06-21 18:49:33,009][15349] Signal inference workers to stop experience collection... (10400 times) [2024-06-21 18:49:33,017][15349] Signal inference workers to resume experience collection... (10400 times) [2024-06-21 18:49:33,065][15401] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-21 18:49:33,065][15401] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-21 18:49:33,162][15401] Updated weights for policy 0, policy_version 43850 (0.0056) [2024-06-21 18:49:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 718438400. Throughput: 0: 41833.8. Samples: 718600360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 18:49:37,022][15401] Updated weights for policy 0, policy_version 43860 (0.0034) [2024-06-21 18:49:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 718651392. Throughput: 0: 41947.2. Samples: 718733000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 18:49:41,120][15401] Updated weights for policy 0, policy_version 43870 (0.0050) [2024-06-21 18:49:43,390][15132] Fps is (10 sec: 40956.8, 60 sec: 42051.8, 300 sec: 41765.2). Total num frames: 718848000. Throughput: 0: 41757.9. Samples: 718981140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:43,391][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 18:49:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043875_718848000.pth... [2024-06-21 18:49:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043262_708804608.pth [2024-06-21 18:49:44,606][15401] Updated weights for policy 0, policy_version 43880 (0.0039) [2024-06-21 18:49:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 719044608. Throughput: 0: 42299.2. Samples: 719244240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 18:49:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 18:49:48,920][15401] Updated weights for policy 0, policy_version 43890 (0.0031) [2024-06-21 18:49:52,888][15401] Updated weights for policy 0, policy_version 43900 (0.0039) [2024-06-21 18:49:53,389][15132] Fps is (10 sec: 42602.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 719273984. Throughput: 0: 42172.5. Samples: 719367840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 18:49:53,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 18:49:56,636][15401] Updated weights for policy 0, policy_version 43910 (0.0043) [2024-06-21 18:49:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 719486976. Throughput: 0: 42068.0. Samples: 719618900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 18:49:58,396][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 18:50:00,614][15401] Updated weights for policy 0, policy_version 43920 (0.0024) [2024-06-21 18:50:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 719699968. Throughput: 0: 42122.2. Samples: 719868920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 18:50:03,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-21 18:50:04,493][15401] Updated weights for policy 0, policy_version 43930 (0.0028) [2024-06-21 18:50:08,356][15401] Updated weights for policy 0, policy_version 43940 (0.0032) [2024-06-21 18:50:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 719912960. Throughput: 0: 42139.4. Samples: 719997860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 18:50:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 18:50:12,066][15401] Updated weights for policy 0, policy_version 43950 (0.0040) [2024-06-21 18:50:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41877.3). Total num frames: 720125952. Throughput: 0: 42172.0. Samples: 720243800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-21 18:50:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 18:50:16,128][15401] Updated weights for policy 0, policy_version 43960 (0.0043) [2024-06-21 18:50:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 720338944. Throughput: 0: 42277.7. Samples: 720502860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 18:50:19,959][15401] Updated weights for policy 0, policy_version 43970 (0.0034) [2024-06-21 18:50:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 720535552. Throughput: 0: 42086.2. Samples: 720626880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 18:50:23,870][15401] Updated weights for policy 0, policy_version 43980 (0.0024) [2024-06-21 18:50:27,648][15401] Updated weights for policy 0, policy_version 43990 (0.0045) [2024-06-21 18:50:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 720748544. Throughput: 0: 42218.9. Samples: 720880960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:50:31,567][15401] Updated weights for policy 0, policy_version 44000 (0.0031) [2024-06-21 18:50:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 720977920. Throughput: 0: 41852.8. Samples: 721127620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 18:50:35,853][15401] Updated weights for policy 0, policy_version 44010 (0.0048) [2024-06-21 18:50:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 721158144. Throughput: 0: 41992.3. Samples: 721257500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 18:50:39,532][15349] Signal inference workers to stop experience collection... (10450 times) [2024-06-21 18:50:39,532][15349] Signal inference workers to resume experience collection... (10450 times) [2024-06-21 18:50:39,541][15401] Updated weights for policy 0, policy_version 44020 (0.0033) [2024-06-21 18:50:39,570][15401] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-21 18:50:39,570][15401] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-21 18:50:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.7, 300 sec: 41876.4). Total num frames: 721371136. Throughput: 0: 41859.4. Samples: 721502580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 18:50:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 18:50:43,588][15401] Updated weights for policy 0, policy_version 44030 (0.0038) [2024-06-21 18:50:47,300][15401] Updated weights for policy 0, policy_version 44040 (0.0032) [2024-06-21 18:50:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 721600512. Throughput: 0: 41808.1. Samples: 721750280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:50:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 18:50:51,358][15401] Updated weights for policy 0, policy_version 44050 (0.0023) [2024-06-21 18:50:53,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 721813504. Throughput: 0: 41984.2. Samples: 721887140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:50:53,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-21 18:50:54,859][15401] Updated weights for policy 0, policy_version 44060 (0.0038) [2024-06-21 18:50:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 721993728. Throughput: 0: 42079.6. Samples: 722137380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:50:58,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 18:50:58,992][15401] Updated weights for policy 0, policy_version 44070 (0.0032) [2024-06-21 18:51:02,666][15401] Updated weights for policy 0, policy_version 44080 (0.0027) [2024-06-21 18:51:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 722223104. Throughput: 0: 41888.0. Samples: 722387820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:51:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 18:51:07,097][15401] Updated weights for policy 0, policy_version 44090 (0.0055) [2024-06-21 18:51:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 722419712. Throughput: 0: 41999.2. Samples: 722516840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:51:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 18:51:10,605][15401] Updated weights for policy 0, policy_version 44100 (0.0038) [2024-06-21 18:51:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 722649088. Throughput: 0: 41787.6. Samples: 722761400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 18:51:14,851][15401] Updated weights for policy 0, policy_version 44110 (0.0029) [2024-06-21 18:51:18,327][15401] Updated weights for policy 0, policy_version 44120 (0.0039) [2024-06-21 18:51:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 722862080. Throughput: 0: 42034.7. Samples: 723019180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 18:51:22,839][15401] Updated weights for policy 0, policy_version 44130 (0.0038) [2024-06-21 18:51:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 723042304. Throughput: 0: 41854.2. Samples: 723140940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 18:51:26,151][15401] Updated weights for policy 0, policy_version 44140 (0.0024) [2024-06-21 18:51:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41988.1). Total num frames: 723288064. Throughput: 0: 41909.1. Samples: 723388480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 18:51:30,439][15401] Updated weights for policy 0, policy_version 44150 (0.0029) [2024-06-21 18:51:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 723484672. Throughput: 0: 42171.9. Samples: 723648020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:33,399][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 18:51:33,762][15401] Updated weights for policy 0, policy_version 44160 (0.0038) [2024-06-21 18:51:38,181][15401] Updated weights for policy 0, policy_version 44170 (0.0026) [2024-06-21 18:51:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41876.7). Total num frames: 723681280. Throughput: 0: 41971.0. Samples: 723775840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-21 18:51:38,404][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 18:51:41,503][15401] Updated weights for policy 0, policy_version 44180 (0.0035) [2024-06-21 18:51:43,390][15132] Fps is (10 sec: 44233.5, 60 sec: 42597.9, 300 sec: 42042.9). Total num frames: 723927040. Throughput: 0: 41924.9. Samples: 724024040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:51:43,391][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 18:51:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044185_723927040.pth... [2024-06-21 18:51:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043567_713801728.pth [2024-06-21 18:51:45,883][15401] Updated weights for policy 0, policy_version 44190 (0.0038) [2024-06-21 18:51:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 724107264. Throughput: 0: 42037.9. Samples: 724279520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:51:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 18:51:49,243][15401] Updated weights for policy 0, policy_version 44200 (0.0033) [2024-06-21 18:51:53,390][15132] Fps is (10 sec: 37686.2, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 724303872. Throughput: 0: 41821.6. Samples: 724398820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:51:53,396][15132] Avg episode reward: [(0, '0.745')] [2024-06-21 18:51:53,868][15401] Updated weights for policy 0, policy_version 44210 (0.0045) [2024-06-21 18:51:57,200][15401] Updated weights for policy 0, policy_version 44220 (0.0040) [2024-06-21 18:51:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42042.7). Total num frames: 724549632. Throughput: 0: 42043.5. Samples: 724653460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:51:58,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 18:52:01,410][15401] Updated weights for policy 0, policy_version 44230 (0.0024) [2024-06-21 18:52:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 724746240. Throughput: 0: 42077.4. Samples: 724912660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:52:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 18:52:04,808][15401] Updated weights for policy 0, policy_version 44240 (0.0041) [2024-06-21 18:52:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 724959232. Throughput: 0: 42024.1. Samples: 725032020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 18:52:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 18:52:09,499][15401] Updated weights for policy 0, policy_version 44250 (0.0027) [2024-06-21 18:52:12,594][15349] Signal inference workers to stop experience collection... (10500 times) [2024-06-21 18:52:12,597][15349] Signal inference workers to resume experience collection... (10500 times) [2024-06-21 18:52:12,611][15401] Updated weights for policy 0, policy_version 44260 (0.0040) [2024-06-21 18:52:12,639][15401] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-21 18:52:12,639][15401] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-21 18:52:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 725188608. Throughput: 0: 42080.8. Samples: 725282120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 18:52:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 18:52:17,595][15401] Updated weights for policy 0, policy_version 44270 (0.0049) [2024-06-21 18:52:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 725352448. Throughput: 0: 42081.9. Samples: 725541700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 18:52:18,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 18:52:20,304][15401] Updated weights for policy 0, policy_version 44280 (0.0033) [2024-06-21 18:52:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 725565440. Throughput: 0: 41747.2. Samples: 725654460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 18:52:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 18:52:25,459][15401] Updated weights for policy 0, policy_version 44290 (0.0027) [2024-06-21 18:52:28,176][15401] Updated weights for policy 0, policy_version 44300 (0.0027) [2024-06-21 18:52:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 725827584. Throughput: 0: 41874.6. Samples: 725908360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 18:52:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 18:52:33,031][15401] Updated weights for policy 0, policy_version 44310 (0.0044) [2024-06-21 18:52:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41987.4). Total num frames: 725991424. Throughput: 0: 41802.5. Samples: 726160640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 18:52:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 18:52:35,990][15401] Updated weights for policy 0, policy_version 44320 (0.0029) [2024-06-21 18:52:38,390][15132] Fps is (10 sec: 36044.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 726188032. Throughput: 0: 41724.0. Samples: 726276400. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:52:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 18:52:40,524][15401] Updated weights for policy 0, policy_version 44330 (0.0042) [2024-06-21 18:52:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 41778.1, 300 sec: 42042.7). Total num frames: 726433792. Throughput: 0: 41935.0. Samples: 726540540. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:52:43,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 18:52:43,853][15401] Updated weights for policy 0, policy_version 44340 (0.0035) [2024-06-21 18:52:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.1, 300 sec: 41877.3). Total num frames: 726614016. Throughput: 0: 41697.7. Samples: 726789060. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:52:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 18:52:48,762][15401] Updated weights for policy 0, policy_version 44350 (0.0036) [2024-06-21 18:52:51,509][15401] Updated weights for policy 0, policy_version 44360 (0.0032) [2024-06-21 18:52:53,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 726827008. Throughput: 0: 41674.2. Samples: 726907360. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:52:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 18:52:56,424][15401] Updated weights for policy 0, policy_version 44370 (0.0042) [2024-06-21 18:52:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41507.8, 300 sec: 41987.5). Total num frames: 727040000. Throughput: 0: 41736.5. Samples: 727160260. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:52:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 18:52:59,638][15401] Updated weights for policy 0, policy_version 44380 (0.0037) [2024-06-21 18:53:03,393][15132] Fps is (10 sec: 39308.2, 60 sec: 41230.7, 300 sec: 41820.4). Total num frames: 727220224. Throughput: 0: 41548.4. Samples: 727411520. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 18:53:03,394][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 18:53:04,134][15401] Updated weights for policy 0, policy_version 44390 (0.0031) [2024-06-21 18:53:07,416][15401] Updated weights for policy 0, policy_version 44400 (0.0034) [2024-06-21 18:53:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 727449600. Throughput: 0: 41705.8. Samples: 727531220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:53:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 18:53:11,944][15401] Updated weights for policy 0, policy_version 44410 (0.0033) [2024-06-21 18:53:13,389][15132] Fps is (10 sec: 44251.6, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 727662592. Throughput: 0: 41670.7. Samples: 727783540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:53:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 18:53:15,410][15401] Updated weights for policy 0, policy_version 44420 (0.0033) [2024-06-21 18:53:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 727859200. Throughput: 0: 41811.6. Samples: 728042160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:53:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-21 18:53:18,992][15349] Signal inference workers to stop experience collection... (10550 times) [2024-06-21 18:53:18,992][15349] Signal inference workers to resume experience collection... (10550 times) [2024-06-21 18:53:19,030][15401] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-21 18:53:19,036][15401] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-21 18:53:19,730][15401] Updated weights for policy 0, policy_version 44430 (0.0038) [2024-06-21 18:53:23,143][15401] Updated weights for policy 0, policy_version 44440 (0.0030) [2024-06-21 18:53:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 728104960. Throughput: 0: 41762.6. Samples: 728155720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:53:23,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 18:53:27,532][15401] Updated weights for policy 0, policy_version 44450 (0.0044) [2024-06-21 18:53:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 41932.0). Total num frames: 728285184. Throughput: 0: 41598.4. Samples: 728412360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 18:53:28,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-21 18:53:30,859][15401] Updated weights for policy 0, policy_version 44460 (0.0036) [2024-06-21 18:53:33,390][15132] Fps is (10 sec: 36045.2, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 728465408. Throughput: 0: 41756.9. Samples: 728668120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 18:53:35,717][15401] Updated weights for policy 0, policy_version 44470 (0.0043) [2024-06-21 18:53:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 728743936. Throughput: 0: 41802.6. Samples: 728788480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:38,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 18:53:38,639][15401] Updated weights for policy 0, policy_version 44480 (0.0026) [2024-06-21 18:53:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 40961.7, 300 sec: 41765.3). Total num frames: 728891392. Throughput: 0: 41748.0. Samples: 729038920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 18:53:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044489_728907776.pth... [2024-06-21 18:53:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000043875_718848000.pth [2024-06-21 18:53:43,769][15401] Updated weights for policy 0, policy_version 44490 (0.0039) [2024-06-21 18:53:46,729][15401] Updated weights for policy 0, policy_version 44500 (0.0037) [2024-06-21 18:53:48,390][15132] Fps is (10 sec: 36044.5, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 729104384. Throughput: 0: 41722.1. Samples: 729288880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:48,395][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 18:53:51,390][15401] Updated weights for policy 0, policy_version 44510 (0.0032) [2024-06-21 18:53:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 729350144. Throughput: 0: 41853.3. Samples: 729414620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 18:53:54,551][15401] Updated weights for policy 0, policy_version 44520 (0.0027) [2024-06-21 18:53:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 729513984. Throughput: 0: 41727.9. Samples: 729661300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 18:53:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 18:53:59,344][15401] Updated weights for policy 0, policy_version 44530 (0.0044) [2024-06-21 18:54:02,473][15401] Updated weights for policy 0, policy_version 44540 (0.0028) [2024-06-21 18:54:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42054.7, 300 sec: 41876.4). Total num frames: 729743360. Throughput: 0: 41475.6. Samples: 729908560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 18:54:07,174][15401] Updated weights for policy 0, policy_version 44550 (0.0042) [2024-06-21 18:54:08,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42050.5, 300 sec: 41931.6). Total num frames: 729972736. Throughput: 0: 41884.1. Samples: 730040600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:08,393][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 18:54:10,178][15401] Updated weights for policy 0, policy_version 44560 (0.0037) [2024-06-21 18:54:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 730152960. Throughput: 0: 41802.6. Samples: 730293480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 18:54:14,973][15401] Updated weights for policy 0, policy_version 44570 (0.0045) [2024-06-21 18:54:18,032][15401] Updated weights for policy 0, policy_version 44580 (0.0031) [2024-06-21 18:54:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 730398720. Throughput: 0: 41483.1. Samples: 730534860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 18:54:20,902][15349] Signal inference workers to stop experience collection... (10600 times) [2024-06-21 18:54:20,936][15401] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-21 18:54:20,969][15349] Signal inference workers to resume experience collection... (10600 times) [2024-06-21 18:54:20,970][15401] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-21 18:54:22,652][15401] Updated weights for policy 0, policy_version 44590 (0.0038) [2024-06-21 18:54:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41932.3). Total num frames: 730578944. Throughput: 0: 41877.4. Samples: 730672960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 18:54:25,876][15401] Updated weights for policy 0, policy_version 44600 (0.0040) [2024-06-21 18:54:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 730775552. Throughput: 0: 41741.4. Samples: 730917280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 18:54:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 18:54:30,353][15401] Updated weights for policy 0, policy_version 44610 (0.0046) [2024-06-21 18:54:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 731021312. Throughput: 0: 41738.3. Samples: 731167100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 18:54:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 18:54:33,877][15401] Updated weights for policy 0, policy_version 44620 (0.0041) [2024-06-21 18:54:38,034][15401] Updated weights for policy 0, policy_version 44630 (0.0029) [2024-06-21 18:54:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 41506.2, 300 sec: 41987.6). Total num frames: 731234304. Throughput: 0: 41841.3. Samples: 731297480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 18:54:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 18:54:41,935][15401] Updated weights for policy 0, policy_version 44640 (0.0041) [2024-06-21 18:54:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 731430912. Throughput: 0: 41846.2. Samples: 731544380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 18:54:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 18:54:45,742][15401] Updated weights for policy 0, policy_version 44650 (0.0043) [2024-06-21 18:54:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 731627520. Throughput: 0: 41985.7. Samples: 731797920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 18:54:48,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 18:54:49,911][15401] Updated weights for policy 0, policy_version 44660 (0.0031) [2024-06-21 18:54:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 731840512. Throughput: 0: 41752.0. Samples: 731919340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 18:54:53,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 18:54:53,554][15401] Updated weights for policy 0, policy_version 44670 (0.0033) [2024-06-21 18:54:57,874][15401] Updated weights for policy 0, policy_version 44680 (0.0035) [2024-06-21 18:54:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 41932.0). Total num frames: 732069888. Throughput: 0: 41761.9. Samples: 732172760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:54:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 18:55:01,617][15401] Updated weights for policy 0, policy_version 44690 (0.0037) [2024-06-21 18:55:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 732266496. Throughput: 0: 41801.4. Samples: 732415920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:55:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 18:55:05,859][15401] Updated weights for policy 0, policy_version 44700 (0.0035) [2024-06-21 18:55:08,389][15132] Fps is (10 sec: 39321.2, 60 sec: 41507.8, 300 sec: 41820.9). Total num frames: 732463104. Throughput: 0: 41592.9. Samples: 732544640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:55:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 18:55:09,292][15401] Updated weights for policy 0, policy_version 44710 (0.0039) [2024-06-21 18:55:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 732676096. Throughput: 0: 41755.1. Samples: 732796260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:55:13,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-21 18:55:13,831][15401] Updated weights for policy 0, policy_version 44720 (0.0038) [2024-06-21 18:55:17,420][15401] Updated weights for policy 0, policy_version 44730 (0.0038) [2024-06-21 18:55:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 732905472. Throughput: 0: 41744.1. Samples: 733045580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:55:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 18:55:21,716][15401] Updated weights for policy 0, policy_version 44740 (0.0034) [2024-06-21 18:55:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 733085696. Throughput: 0: 41775.8. Samples: 733177400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 18:55:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 18:55:25,237][15401] Updated weights for policy 0, policy_version 44750 (0.0036) [2024-06-21 18:55:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 733298688. Throughput: 0: 41714.7. Samples: 733421540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 18:55:29,466][15401] Updated weights for policy 0, policy_version 44760 (0.0032) [2024-06-21 18:55:33,108][15401] Updated weights for policy 0, policy_version 44770 (0.0036) [2024-06-21 18:55:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 733528064. Throughput: 0: 41786.2. Samples: 733678300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 18:55:37,209][15401] Updated weights for policy 0, policy_version 44780 (0.0041) [2024-06-21 18:55:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 733724672. Throughput: 0: 41933.0. Samples: 733806320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 18:55:40,682][15401] Updated weights for policy 0, policy_version 44790 (0.0034) [2024-06-21 18:55:41,570][15349] Signal inference workers to stop experience collection... (10650 times) [2024-06-21 18:55:41,571][15349] Signal inference workers to resume experience collection... (10650 times) [2024-06-21 18:55:41,596][15401] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-21 18:55:41,596][15401] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-21 18:55:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 733937664. Throughput: 0: 41868.8. Samples: 734056860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 18:55:43,453][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044797_733954048.pth... [2024-06-21 18:55:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044185_723927040.pth [2024-06-21 18:55:44,987][15401] Updated weights for policy 0, policy_version 44800 (0.0032) [2024-06-21 18:55:48,394][15132] Fps is (10 sec: 42578.6, 60 sec: 42049.2, 300 sec: 41820.2). Total num frames: 734150656. Throughput: 0: 42030.9. Samples: 734307500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:48,395][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 18:55:48,624][15401] Updated weights for policy 0, policy_version 44810 (0.0038) [2024-06-21 18:55:52,746][15401] Updated weights for policy 0, policy_version 44820 (0.0029) [2024-06-21 18:55:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 734330880. Throughput: 0: 41912.4. Samples: 734430700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-21 18:55:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 18:55:56,483][15401] Updated weights for policy 0, policy_version 44830 (0.0043) [2024-06-21 18:55:58,390][15132] Fps is (10 sec: 40978.2, 60 sec: 41506.0, 300 sec: 41820.9). Total num frames: 734560256. Throughput: 0: 41906.1. Samples: 734682040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:55:58,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-21 18:56:00,669][15401] Updated weights for policy 0, policy_version 44840 (0.0047) [2024-06-21 18:56:03,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 734789632. Throughput: 0: 41997.3. Samples: 734935560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:56:03,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 18:56:04,152][15401] Updated weights for policy 0, policy_version 44850 (0.0042) [2024-06-21 18:56:08,287][15401] Updated weights for policy 0, policy_version 44860 (0.0033) [2024-06-21 18:56:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 734986240. Throughput: 0: 41817.0. Samples: 735059160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:56:08,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 18:56:11,965][15401] Updated weights for policy 0, policy_version 44870 (0.0032) [2024-06-21 18:56:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 735199232. Throughput: 0: 41886.2. Samples: 735306420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:56:13,395][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 18:56:16,027][15401] Updated weights for policy 0, policy_version 44880 (0.0032) [2024-06-21 18:56:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 735412224. Throughput: 0: 41815.1. Samples: 735559980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 18:56:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 18:56:19,634][15401] Updated weights for policy 0, policy_version 44890 (0.0029) [2024-06-21 18:56:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 735608832. Throughput: 0: 41826.1. Samples: 735688500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:23,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 18:56:24,148][15401] Updated weights for policy 0, policy_version 44900 (0.0025) [2024-06-21 18:56:27,637][15401] Updated weights for policy 0, policy_version 44910 (0.0029) [2024-06-21 18:56:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 735821824. Throughput: 0: 41725.7. Samples: 735934520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 18:56:31,745][15401] Updated weights for policy 0, policy_version 44920 (0.0047) [2024-06-21 18:56:33,391][15132] Fps is (10 sec: 40955.4, 60 sec: 41505.4, 300 sec: 41820.7). Total num frames: 736018432. Throughput: 0: 41844.1. Samples: 736190340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:33,391][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 18:56:35,385][15401] Updated weights for policy 0, policy_version 44930 (0.0042) [2024-06-21 18:56:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41709.9). Total num frames: 736231424. Throughput: 0: 41803.2. Samples: 736311840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 18:56:39,567][15401] Updated weights for policy 0, policy_version 44940 (0.0035) [2024-06-21 18:56:43,389][15132] Fps is (10 sec: 40964.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 736428032. Throughput: 0: 41832.6. Samples: 736564500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 18:56:43,702][15401] Updated weights for policy 0, policy_version 44950 (0.0036) [2024-06-21 18:56:47,194][15401] Updated weights for policy 0, policy_version 44960 (0.0044) [2024-06-21 18:56:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41509.3, 300 sec: 41820.9). Total num frames: 736641024. Throughput: 0: 41874.2. Samples: 736819800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 18:56:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 18:56:51,420][15401] Updated weights for policy 0, policy_version 44970 (0.0039) [2024-06-21 18:56:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41765.7). Total num frames: 736870400. Throughput: 0: 41832.6. Samples: 736941620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:56:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 18:56:54,854][15401] Updated weights for policy 0, policy_version 44980 (0.0034) [2024-06-21 18:56:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 737083392. Throughput: 0: 41980.6. Samples: 737195540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:56:58,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 18:56:59,609][15401] Updated weights for policy 0, policy_version 44990 (0.0038) [2024-06-21 18:57:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41234.7, 300 sec: 41709.8). Total num frames: 737263616. Throughput: 0: 41853.8. Samples: 737443400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:57:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 18:57:03,516][15401] Updated weights for policy 0, policy_version 45000 (0.0036) [2024-06-21 18:57:07,387][15401] Updated weights for policy 0, policy_version 45010 (0.0033) [2024-06-21 18:57:08,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 737509376. Throughput: 0: 41704.3. Samples: 737565200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:57:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 18:57:11,074][15401] Updated weights for policy 0, policy_version 45020 (0.0034) [2024-06-21 18:57:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 737673216. Throughput: 0: 41780.2. Samples: 737814620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:57:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 18:57:15,102][15401] Updated weights for policy 0, policy_version 45030 (0.0043) [2024-06-21 18:57:15,991][15349] Signal inference workers to stop experience collection... (10700 times) [2024-06-21 18:57:15,991][15349] Signal inference workers to resume experience collection... (10700 times) [2024-06-21 18:57:16,027][15401] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-21 18:57:16,028][15401] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-21 18:57:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 737902592. Throughput: 0: 41641.4. Samples: 738064160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 18:57:18,394][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 18:57:18,753][15401] Updated weights for policy 0, policy_version 45040 (0.0030) [2024-06-21 18:57:22,893][15401] Updated weights for policy 0, policy_version 45050 (0.0036) [2024-06-21 18:57:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 738099200. Throughput: 0: 41784.8. Samples: 738192160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 18:57:26,884][15401] Updated weights for policy 0, policy_version 45060 (0.0042) [2024-06-21 18:57:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 738295808. Throughput: 0: 41650.9. Samples: 738438800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 18:57:30,682][15401] Updated weights for policy 0, policy_version 45070 (0.0043) [2024-06-21 18:57:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42053.0, 300 sec: 41876.4). Total num frames: 738541568. Throughput: 0: 41504.8. Samples: 738687520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 18:57:34,675][15401] Updated weights for policy 0, policy_version 45080 (0.0037) [2024-06-21 18:57:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41779.2, 300 sec: 41710.1). Total num frames: 738738176. Throughput: 0: 41693.3. Samples: 738817820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 18:57:38,499][15401] Updated weights for policy 0, policy_version 45090 (0.0033) [2024-06-21 18:57:43,011][15401] Updated weights for policy 0, policy_version 45100 (0.0044) [2024-06-21 18:57:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 738918400. Throughput: 0: 41607.4. Samples: 739067880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 18:57:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045101_738934784.pth... [2024-06-21 18:57:43,520][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044489_728907776.pth [2024-06-21 18:57:46,200][15401] Updated weights for policy 0, policy_version 45110 (0.0047) [2024-06-21 18:57:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 739164160. Throughput: 0: 41529.4. Samples: 739312220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 18:57:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 18:57:50,589][15401] Updated weights for policy 0, policy_version 45120 (0.0033) [2024-06-21 18:57:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 739344384. Throughput: 0: 41837.1. Samples: 739447860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:57:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 18:57:53,977][15401] Updated weights for policy 0, policy_version 45130 (0.0047) [2024-06-21 18:57:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 41765.8). Total num frames: 739540992. Throughput: 0: 41793.3. Samples: 739695320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:57:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 18:57:58,621][15401] Updated weights for policy 0, policy_version 45140 (0.0036) [2024-06-21 18:58:01,966][15401] Updated weights for policy 0, policy_version 45150 (0.0044) [2024-06-21 18:58:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 739786752. Throughput: 0: 41568.5. Samples: 739934740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:58:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 18:58:06,339][15401] Updated weights for policy 0, policy_version 45160 (0.0037) [2024-06-21 18:58:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40687.1, 300 sec: 41654.2). Total num frames: 739950592. Throughput: 0: 41641.4. Samples: 740066020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:58:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 18:58:10,017][15401] Updated weights for policy 0, policy_version 45170 (0.0037) [2024-06-21 18:58:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 740196352. Throughput: 0: 41634.4. Samples: 740312340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 18:58:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 18:58:14,163][15401] Updated weights for policy 0, policy_version 45180 (0.0039) [2024-06-21 18:58:17,917][15401] Updated weights for policy 0, policy_version 45190 (0.0043) [2024-06-21 18:58:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 740409344. Throughput: 0: 41688.0. Samples: 740563480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 18:58:21,997][15401] Updated weights for policy 0, policy_version 45200 (0.0038) [2024-06-21 18:58:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 740605952. Throughput: 0: 41663.5. Samples: 740692680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 18:58:25,624][15401] Updated weights for policy 0, policy_version 45210 (0.0044) [2024-06-21 18:58:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 41932.0). Total num frames: 740835328. Throughput: 0: 41611.2. Samples: 740940380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 18:58:29,913][15401] Updated weights for policy 0, policy_version 45220 (0.0047) [2024-06-21 18:58:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 741031936. Throughput: 0: 41914.7. Samples: 741198380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 18:58:33,502][15401] Updated weights for policy 0, policy_version 45230 (0.0039) [2024-06-21 18:58:37,687][15401] Updated weights for policy 0, policy_version 45240 (0.0034) [2024-06-21 18:58:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 741228544. Throughput: 0: 41623.6. Samples: 741320920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:38,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-21 18:58:41,599][15401] Updated weights for policy 0, policy_version 45250 (0.0042) [2024-06-21 18:58:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 741474304. Throughput: 0: 41620.3. Samples: 741568240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 18:58:43,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 18:58:45,542][15401] Updated weights for policy 0, policy_version 45260 (0.0044) [2024-06-21 18:58:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 741638144. Throughput: 0: 42205.4. Samples: 741833980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:58:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 18:58:49,152][15401] Updated weights for policy 0, policy_version 45270 (0.0038) [2024-06-21 18:58:53,378][15401] Updated weights for policy 0, policy_version 45280 (0.0051) [2024-06-21 18:58:53,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42050.5, 300 sec: 41876.1). Total num frames: 741867520. Throughput: 0: 41816.4. Samples: 741947860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:58:53,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 18:58:56,354][15349] Signal inference workers to stop experience collection... (10750 times) [2024-06-21 18:58:56,402][15349] Signal inference workers to resume experience collection... (10750 times) [2024-06-21 18:58:56,404][15401] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-21 18:58:56,419][15401] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-21 18:58:56,991][15401] Updated weights for policy 0, policy_version 45290 (0.0032) [2024-06-21 18:58:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 742096896. Throughput: 0: 41997.3. Samples: 742202220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:58:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 18:59:01,401][15401] Updated weights for policy 0, policy_version 45300 (0.0045) [2024-06-21 18:59:03,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41233.1, 300 sec: 41654.6). Total num frames: 742260736. Throughput: 0: 41985.0. Samples: 742452800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:59:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 18:59:04,867][15401] Updated weights for policy 0, policy_version 45310 (0.0042) [2024-06-21 18:59:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 742490112. Throughput: 0: 41728.9. Samples: 742570480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:59:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-21 18:59:09,151][15401] Updated weights for policy 0, policy_version 45320 (0.0034) [2024-06-21 18:59:12,546][15401] Updated weights for policy 0, policy_version 45330 (0.0031) [2024-06-21 18:59:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 742719488. Throughput: 0: 42025.7. Samples: 742831540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 18:59:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 18:59:16,951][15401] Updated weights for policy 0, policy_version 45340 (0.0027) [2024-06-21 18:59:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 742916096. Throughput: 0: 41687.8. Samples: 743074340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-21 18:59:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 18:59:20,165][15401] Updated weights for policy 0, policy_version 45350 (0.0042) [2024-06-21 18:59:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 743129088. Throughput: 0: 41898.2. Samples: 743206340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-21 18:59:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 18:59:24,723][15401] Updated weights for policy 0, policy_version 45360 (0.0036) [2024-06-21 18:59:27,901][15401] Updated weights for policy 0, policy_version 45370 (0.0034) [2024-06-21 18:59:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 743342080. Throughput: 0: 41982.7. Samples: 743457460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-21 18:59:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 18:59:32,639][15401] Updated weights for policy 0, policy_version 45380 (0.0040) [2024-06-21 18:59:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 743538688. Throughput: 0: 41531.6. Samples: 743702900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-21 18:59:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 18:59:36,043][15401] Updated weights for policy 0, policy_version 45390 (0.0042) [2024-06-21 18:59:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 743751680. Throughput: 0: 41767.5. Samples: 743827300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-21 18:59:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 18:59:40,291][15401] Updated weights for policy 0, policy_version 45400 (0.0033) [2024-06-21 18:59:43,392][15132] Fps is (10 sec: 42587.6, 60 sec: 41504.5, 300 sec: 41820.5). Total num frames: 743964672. Throughput: 0: 41749.3. Samples: 744081040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 18:59:43,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 18:59:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045409_743981056.pth... [2024-06-21 18:59:43,607][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000044797_733954048.pth [2024-06-21 18:59:43,750][15401] Updated weights for policy 0, policy_version 45410 (0.0046) [2024-06-21 18:59:47,937][15401] Updated weights for policy 0, policy_version 45420 (0.0032) [2024-06-21 18:59:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 744161280. Throughput: 0: 41677.7. Samples: 744328300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 18:59:48,391][15132] Avg episode reward: [(0, '0.814')] [2024-06-21 18:59:52,190][15401] Updated weights for policy 0, policy_version 45430 (0.0037) [2024-06-21 18:59:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41781.0, 300 sec: 41709.8). Total num frames: 744374272. Throughput: 0: 41887.6. Samples: 744455420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 18:59:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 18:59:55,851][15401] Updated weights for policy 0, policy_version 45440 (0.0028) [2024-06-21 18:59:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 744603648. Throughput: 0: 41720.8. Samples: 744708980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 18:59:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 18:59:59,877][15401] Updated weights for policy 0, policy_version 45450 (0.0041) [2024-06-21 19:00:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 744800256. Throughput: 0: 41925.0. Samples: 744960960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 19:00:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 19:00:03,878][15401] Updated weights for policy 0, policy_version 45460 (0.0030) [2024-06-21 19:00:06,152][15349] Signal inference workers to stop experience collection... (10800 times) [2024-06-21 19:00:06,152][15349] Signal inference workers to resume experience collection... (10800 times) [2024-06-21 19:00:06,200][15401] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-21 19:00:06,200][15401] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-21 19:00:07,567][15401] Updated weights for policy 0, policy_version 45470 (0.0030) [2024-06-21 19:00:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 744980480. Throughput: 0: 41685.3. Samples: 745082180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-21 19:00:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 19:00:11,653][15401] Updated weights for policy 0, policy_version 45480 (0.0024) [2024-06-21 19:00:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 745226240. Throughput: 0: 41757.4. Samples: 745336540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 19:00:15,474][15401] Updated weights for policy 0, policy_version 45490 (0.0026) [2024-06-21 19:00:18,392][15132] Fps is (10 sec: 42589.8, 60 sec: 41504.8, 300 sec: 41765.1). Total num frames: 745406464. Throughput: 0: 42012.3. Samples: 745593540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:18,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 19:00:19,342][15401] Updated weights for policy 0, policy_version 45500 (0.0034) [2024-06-21 19:00:23,134][15401] Updated weights for policy 0, policy_version 45510 (0.0042) [2024-06-21 19:00:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 745635840. Throughput: 0: 42004.1. Samples: 745717580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:23,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 19:00:27,402][15401] Updated weights for policy 0, policy_version 45520 (0.0035) [2024-06-21 19:00:28,389][15132] Fps is (10 sec: 44245.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 745848832. Throughput: 0: 42145.9. Samples: 745977500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 19:00:30,787][15401] Updated weights for policy 0, policy_version 45530 (0.0026) [2024-06-21 19:00:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 746045440. Throughput: 0: 42099.7. Samples: 746222780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 19:00:35,070][15401] Updated weights for policy 0, policy_version 45540 (0.0023) [2024-06-21 19:00:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 746274816. Throughput: 0: 42012.4. Samples: 746345980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:00:38,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 19:00:38,511][15401] Updated weights for policy 0, policy_version 45550 (0.0030) [2024-06-21 19:00:42,820][15401] Updated weights for policy 0, policy_version 45560 (0.0040) [2024-06-21 19:00:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41780.9, 300 sec: 41765.9). Total num frames: 746471424. Throughput: 0: 41958.7. Samples: 746597120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:00:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 19:00:46,111][15401] Updated weights for policy 0, policy_version 45570 (0.0037) [2024-06-21 19:00:48,392][15132] Fps is (10 sec: 39312.0, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 746668032. Throughput: 0: 42065.7. Samples: 746854020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:00:48,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 19:00:50,705][15401] Updated weights for policy 0, policy_version 45580 (0.0038) [2024-06-21 19:00:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 746913792. Throughput: 0: 42130.1. Samples: 746978040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:00:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 19:00:54,015][15401] Updated weights for policy 0, policy_version 45590 (0.0036) [2024-06-21 19:00:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 41506.2, 300 sec: 41710.1). Total num frames: 747094016. Throughput: 0: 42154.3. Samples: 747233480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:00:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 19:00:58,459][15401] Updated weights for policy 0, policy_version 45600 (0.0033) [2024-06-21 19:01:01,540][15401] Updated weights for policy 0, policy_version 45610 (0.0035) [2024-06-21 19:01:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 747307008. Throughput: 0: 42205.4. Samples: 747492700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:01:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 19:01:06,051][15401] Updated weights for policy 0, policy_version 45620 (0.0037) [2024-06-21 19:01:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 41820.9). Total num frames: 747536384. Throughput: 0: 42102.7. Samples: 747612100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 19:01:08,399][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 19:01:09,106][15401] Updated weights for policy 0, policy_version 45630 (0.0028) [2024-06-21 19:01:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 747732992. Throughput: 0: 42105.8. Samples: 747872260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 19:01:13,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-21 19:01:13,593][15401] Updated weights for policy 0, policy_version 45640 (0.0031) [2024-06-21 19:01:17,032][15401] Updated weights for policy 0, policy_version 45650 (0.0029) [2024-06-21 19:01:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42599.8, 300 sec: 41876.4). Total num frames: 747962368. Throughput: 0: 42281.7. Samples: 748125460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 19:01:18,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 19:01:19,160][15349] Signal inference workers to stop experience collection... (10850 times) [2024-06-21 19:01:19,215][15349] Signal inference workers to resume experience collection... (10850 times) [2024-06-21 19:01:19,215][15401] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-21 19:01:19,227][15401] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-21 19:01:21,060][15401] Updated weights for policy 0, policy_version 45660 (0.0038) [2024-06-21 19:01:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 748175360. Throughput: 0: 42315.5. Samples: 748250180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 19:01:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 19:01:25,205][15401] Updated weights for policy 0, policy_version 45670 (0.0041) [2024-06-21 19:01:28,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42325.1, 300 sec: 41932.0). Total num frames: 748388352. Throughput: 0: 42456.2. Samples: 748507660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 19:01:28,396][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 19:01:28,746][15401] Updated weights for policy 0, policy_version 45680 (0.0031) [2024-06-21 19:01:32,764][15401] Updated weights for policy 0, policy_version 45690 (0.0037) [2024-06-21 19:01:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 748584960. Throughput: 0: 42225.2. Samples: 748754060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-21 19:01:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 19:01:36,560][15401] Updated weights for policy 0, policy_version 45700 (0.0028) [2024-06-21 19:01:38,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 748797952. Throughput: 0: 42262.8. Samples: 748879860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:01:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 19:01:40,918][15401] Updated weights for policy 0, policy_version 45710 (0.0031) [2024-06-21 19:01:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 748994560. Throughput: 0: 42150.9. Samples: 749130280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:01:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 19:01:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045716_749010944.pth... [2024-06-21 19:01:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045101_738934784.pth [2024-06-21 19:01:44,551][15401] Updated weights for policy 0, policy_version 45720 (0.0030) [2024-06-21 19:01:48,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42322.6, 300 sec: 41819.9). Total num frames: 749207552. Throughput: 0: 42048.4. Samples: 749385140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:01:48,396][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 19:01:48,967][15401] Updated weights for policy 0, policy_version 45730 (0.0047) [2024-06-21 19:01:52,323][15401] Updated weights for policy 0, policy_version 45740 (0.0040) [2024-06-21 19:01:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 749420544. Throughput: 0: 42130.1. Samples: 749507960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:01:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 19:01:56,849][15401] Updated weights for policy 0, policy_version 45750 (0.0035) [2024-06-21 19:01:58,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 749649920. Throughput: 0: 42048.3. Samples: 749764440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:01:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 19:02:00,126][15401] Updated weights for policy 0, policy_version 45760 (0.0027) [2024-06-21 19:02:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 749846528. Throughput: 0: 41966.2. Samples: 750013940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 19:02:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 19:02:04,645][15401] Updated weights for policy 0, policy_version 45770 (0.0038) [2024-06-21 19:02:07,900][15401] Updated weights for policy 0, policy_version 45780 (0.0031) [2024-06-21 19:02:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 750075904. Throughput: 0: 41901.3. Samples: 750135740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 19:02:12,353][15401] Updated weights for policy 0, policy_version 45790 (0.0038) [2024-06-21 19:02:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 750272512. Throughput: 0: 41953.6. Samples: 750395560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 19:02:15,683][15401] Updated weights for policy 0, policy_version 45800 (0.0032) [2024-06-21 19:02:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 750469120. Throughput: 0: 41986.3. Samples: 750643440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 19:02:20,095][15401] Updated weights for policy 0, policy_version 45810 (0.0028) [2024-06-21 19:02:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 750698496. Throughput: 0: 41994.2. Samples: 750769600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 19:02:23,649][15401] Updated weights for policy 0, policy_version 45820 (0.0037) [2024-06-21 19:02:27,861][15401] Updated weights for policy 0, policy_version 45830 (0.0034) [2024-06-21 19:02:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.3, 300 sec: 41820.8). Total num frames: 750878720. Throughput: 0: 42096.1. Samples: 751024600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 19:02:31,313][15401] Updated weights for policy 0, policy_version 45840 (0.0037) [2024-06-21 19:02:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.7, 300 sec: 41931.6). Total num frames: 751108096. Throughput: 0: 41999.2. Samples: 751274940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-21 19:02:33,393][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 19:02:35,721][15401] Updated weights for policy 0, policy_version 45850 (0.0031) [2024-06-21 19:02:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 751337472. Throughput: 0: 42158.7. Samples: 751405100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:02:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:02:39,044][15401] Updated weights for policy 0, policy_version 45860 (0.0033) [2024-06-21 19:02:43,233][15401] Updated weights for policy 0, policy_version 45870 (0.0032) [2024-06-21 19:02:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 751534080. Throughput: 0: 41848.4. Samples: 751647620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:02:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 19:02:46,833][15401] Updated weights for policy 0, policy_version 45880 (0.0028) [2024-06-21 19:02:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42329.8, 300 sec: 42043.0). Total num frames: 751747072. Throughput: 0: 42005.8. Samples: 751904200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:02:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 19:02:51,370][15401] Updated weights for policy 0, policy_version 45890 (0.0043) [2024-06-21 19:02:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 751960064. Throughput: 0: 42254.6. Samples: 752037200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:02:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 19:02:53,647][15349] Signal inference workers to stop experience collection... (10900 times) [2024-06-21 19:02:53,691][15401] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-21 19:02:53,700][15349] Signal inference workers to resume experience collection... (10900 times) [2024-06-21 19:02:53,705][15401] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-21 19:02:54,567][15401] Updated weights for policy 0, policy_version 45900 (0.0039) [2024-06-21 19:02:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 752156672. Throughput: 0: 42061.8. Samples: 752288340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:02:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 19:02:59,166][15401] Updated weights for policy 0, policy_version 45910 (0.0050) [2024-06-21 19:03:02,607][15401] Updated weights for policy 0, policy_version 45920 (0.0041) [2024-06-21 19:03:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 752369664. Throughput: 0: 41876.0. Samples: 752527860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:03:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 19:03:06,779][15401] Updated weights for policy 0, policy_version 45930 (0.0046) [2024-06-21 19:03:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 752566272. Throughput: 0: 41891.1. Samples: 752654700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 19:03:10,238][15401] Updated weights for policy 0, policy_version 45940 (0.0039) [2024-06-21 19:03:13,392][15132] Fps is (10 sec: 40950.7, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 752779264. Throughput: 0: 41922.7. Samples: 752911220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:13,392][15132] Avg episode reward: [(0, '0.139')] [2024-06-21 19:03:14,571][15401] Updated weights for policy 0, policy_version 45950 (0.0033) [2024-06-21 19:03:18,012][15401] Updated weights for policy 0, policy_version 45960 (0.0049) [2024-06-21 19:03:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 753025024. Throughput: 0: 41844.0. Samples: 753157820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:18,390][15132] Avg episode reward: [(0, '0.142')] [2024-06-21 19:03:22,431][15401] Updated weights for policy 0, policy_version 45970 (0.0023) [2024-06-21 19:03:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 753205248. Throughput: 0: 41881.9. Samples: 753289780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 19:03:25,995][15401] Updated weights for policy 0, policy_version 45980 (0.0036) [2024-06-21 19:03:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 753418240. Throughput: 0: 41934.8. Samples: 753534680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 19:03:30,164][15401] Updated weights for policy 0, policy_version 45990 (0.0041) [2024-06-21 19:03:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42054.0, 300 sec: 42043.0). Total num frames: 753631232. Throughput: 0: 41963.5. Samples: 753792560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:03:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 19:03:34,038][15401] Updated weights for policy 0, policy_version 46000 (0.0041) [2024-06-21 19:03:38,089][15401] Updated weights for policy 0, policy_version 46010 (0.0042) [2024-06-21 19:03:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 753844224. Throughput: 0: 41747.1. Samples: 753915820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:03:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 19:03:41,835][15401] Updated weights for policy 0, policy_version 46020 (0.0038) [2024-06-21 19:03:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 754040832. Throughput: 0: 41723.1. Samples: 754165880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:03:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 19:03:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046023_754040832.pth... [2024-06-21 19:03:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045409_743981056.pth [2024-06-21 19:03:45,752][15401] Updated weights for policy 0, policy_version 46030 (0.0032) [2024-06-21 19:03:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 754253824. Throughput: 0: 42153.4. Samples: 754424760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:03:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 19:03:49,551][15401] Updated weights for policy 0, policy_version 46040 (0.0031) [2024-06-21 19:03:53,396][15132] Fps is (10 sec: 40933.5, 60 sec: 41501.7, 300 sec: 41875.5). Total num frames: 754450432. Throughput: 0: 42089.5. Samples: 754549000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:03:53,397][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 19:03:53,805][15401] Updated weights for policy 0, policy_version 46050 (0.0030) [2024-06-21 19:03:57,659][15401] Updated weights for policy 0, policy_version 46060 (0.0026) [2024-06-21 19:03:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 754679808. Throughput: 0: 42034.8. Samples: 754802680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:03:58,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 19:04:01,543][15401] Updated weights for policy 0, policy_version 46070 (0.0043) [2024-06-21 19:04:03,390][15132] Fps is (10 sec: 44265.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 754892800. Throughput: 0: 42170.2. Samples: 755055480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:03,400][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 19:04:05,315][15401] Updated weights for policy 0, policy_version 46080 (0.0036) [2024-06-21 19:04:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 755073024. Throughput: 0: 42011.6. Samples: 755180300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 19:04:09,137][15401] Updated weights for policy 0, policy_version 46090 (0.0041) [2024-06-21 19:04:12,921][15401] Updated weights for policy 0, policy_version 46100 (0.0038) [2024-06-21 19:04:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42326.9, 300 sec: 42043.0). Total num frames: 755318784. Throughput: 0: 42101.2. Samples: 755429240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 19:04:16,909][15401] Updated weights for policy 0, policy_version 46110 (0.0032) [2024-06-21 19:04:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 755515392. Throughput: 0: 41869.3. Samples: 755676680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 19:04:20,726][15401] Updated weights for policy 0, policy_version 46120 (0.0041) [2024-06-21 19:04:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 755712000. Throughput: 0: 41916.3. Samples: 755802060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:23,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:04:24,571][15401] Updated weights for policy 0, policy_version 46130 (0.0030) [2024-06-21 19:04:26,712][15349] Signal inference workers to stop experience collection... (10950 times) [2024-06-21 19:04:26,714][15349] Signal inference workers to resume experience collection... (10950 times) [2024-06-21 19:04:26,727][15401] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-21 19:04:26,740][15401] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-21 19:04:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 755941376. Throughput: 0: 41955.5. Samples: 756053880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:04:28,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 19:04:28,406][15401] Updated weights for policy 0, policy_version 46140 (0.0026) [2024-06-21 19:04:32,387][15401] Updated weights for policy 0, policy_version 46150 (0.0028) [2024-06-21 19:04:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 756137984. Throughput: 0: 41805.7. Samples: 756306020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:33,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 19:04:36,271][15401] Updated weights for policy 0, policy_version 46160 (0.0029) [2024-06-21 19:04:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41932.3). Total num frames: 756334592. Throughput: 0: 41838.4. Samples: 756431460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 19:04:40,026][15401] Updated weights for policy 0, policy_version 46170 (0.0034) [2024-06-21 19:04:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 756563968. Throughput: 0: 41889.2. Samples: 756687700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 19:04:44,345][15401] Updated weights for policy 0, policy_version 46180 (0.0036) [2024-06-21 19:04:48,113][15401] Updated weights for policy 0, policy_version 46190 (0.0046) [2024-06-21 19:04:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 756776960. Throughput: 0: 41758.7. Samples: 756934620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 19:04:52,144][15401] Updated weights for policy 0, policy_version 46200 (0.0047) [2024-06-21 19:04:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41783.7, 300 sec: 41876.4). Total num frames: 756957184. Throughput: 0: 41686.1. Samples: 757056180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 19:04:56,035][15401] Updated weights for policy 0, policy_version 46210 (0.0039) [2024-06-21 19:04:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 757202944. Throughput: 0: 41846.4. Samples: 757312320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-21 19:04:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 19:05:00,336][15401] Updated weights for policy 0, policy_version 46220 (0.0041) [2024-06-21 19:05:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 757399552. Throughput: 0: 41855.6. Samples: 757560180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 19:05:03,614][15401] Updated weights for policy 0, policy_version 46230 (0.0035) [2024-06-21 19:05:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 757596160. Throughput: 0: 41825.1. Samples: 757684180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:08,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-21 19:05:08,395][15401] Updated weights for policy 0, policy_version 46240 (0.0032) [2024-06-21 19:05:11,355][15401] Updated weights for policy 0, policy_version 46250 (0.0032) [2024-06-21 19:05:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42098.8). Total num frames: 757825536. Throughput: 0: 41936.5. Samples: 757941020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 19:05:16,033][15401] Updated weights for policy 0, policy_version 46260 (0.0033) [2024-06-21 19:05:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42043.3). Total num frames: 758038528. Throughput: 0: 41959.5. Samples: 758194200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 19:05:19,200][15401] Updated weights for policy 0, policy_version 46270 (0.0045) [2024-06-21 19:05:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 758235136. Throughput: 0: 41850.8. Samples: 758314740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:23,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 19:05:23,673][15401] Updated weights for policy 0, policy_version 46280 (0.0033) [2024-06-21 19:05:27,350][15401] Updated weights for policy 0, policy_version 46290 (0.0034) [2024-06-21 19:05:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 758448128. Throughput: 0: 41784.4. Samples: 758568000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 19:05:28,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 19:05:31,874][15401] Updated weights for policy 0, policy_version 46300 (0.0043) [2024-06-21 19:05:32,696][15349] Signal inference workers to stop experience collection... (11000 times) [2024-06-21 19:05:32,696][15349] Signal inference workers to resume experience collection... (11000 times) [2024-06-21 19:05:32,744][15401] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-21 19:05:32,744][15401] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-21 19:05:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 758661120. Throughput: 0: 41869.8. Samples: 758818760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 19:05:35,180][15401] Updated weights for policy 0, policy_version 46310 (0.0031) [2024-06-21 19:05:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.5, 300 sec: 41987.5). Total num frames: 758857728. Throughput: 0: 41914.4. Samples: 758942320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 19:05:39,891][15401] Updated weights for policy 0, policy_version 46320 (0.0045) [2024-06-21 19:05:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41987.8). Total num frames: 759054336. Throughput: 0: 41802.4. Samples: 759193440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 19:05:43,404][15401] Updated weights for policy 0, policy_version 46330 (0.0038) [2024-06-21 19:05:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046331_759087104.pth... [2024-06-21 19:05:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000045716_749010944.pth [2024-06-21 19:05:47,391][15401] Updated weights for policy 0, policy_version 46340 (0.0032) [2024-06-21 19:05:48,391][15132] Fps is (10 sec: 42591.7, 60 sec: 41778.2, 300 sec: 41931.7). Total num frames: 759283712. Throughput: 0: 41909.7. Samples: 759446180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:48,391][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 19:05:50,987][15401] Updated weights for policy 0, policy_version 46350 (0.0033) [2024-06-21 19:05:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 759480320. Throughput: 0: 41887.4. Samples: 759569120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:53,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-21 19:05:55,369][15401] Updated weights for policy 0, policy_version 46360 (0.0045) [2024-06-21 19:05:58,389][15132] Fps is (10 sec: 40966.1, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 759693312. Throughput: 0: 41655.1. Samples: 759815500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:05:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 19:05:58,746][15401] Updated weights for policy 0, policy_version 46370 (0.0042) [2024-06-21 19:06:03,066][15401] Updated weights for policy 0, policy_version 46380 (0.0032) [2024-06-21 19:06:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 759889920. Throughput: 0: 41702.3. Samples: 760070800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 19:06:06,549][15401] Updated weights for policy 0, policy_version 46390 (0.0034) [2024-06-21 19:06:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 760102912. Throughput: 0: 41750.2. Samples: 760193500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:08,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 19:06:10,846][15401] Updated weights for policy 0, policy_version 46400 (0.0045) [2024-06-21 19:06:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 760315904. Throughput: 0: 41722.6. Samples: 760445520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 19:06:14,529][15401] Updated weights for policy 0, policy_version 46410 (0.0040) [2024-06-21 19:06:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 760528896. Throughput: 0: 41610.3. Samples: 760691220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 19:06:18,579][15401] Updated weights for policy 0, policy_version 46420 (0.0034) [2024-06-21 19:06:23,090][15401] Updated weights for policy 0, policy_version 46430 (0.0042) [2024-06-21 19:06:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 760725504. Throughput: 0: 41591.4. Samples: 760813940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 19:06:26,389][15401] Updated weights for policy 0, policy_version 46440 (0.0046) [2024-06-21 19:06:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 760938496. Throughput: 0: 41555.7. Samples: 761063440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:06:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 19:06:30,922][15401] Updated weights for policy 0, policy_version 46450 (0.0036) [2024-06-21 19:06:33,396][15132] Fps is (10 sec: 40933.7, 60 sec: 41228.7, 300 sec: 41819.9). Total num frames: 761135104. Throughput: 0: 41586.9. Samples: 761317800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:33,397][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 19:06:34,193][15401] Updated weights for policy 0, policy_version 46460 (0.0042) [2024-06-21 19:06:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 761348096. Throughput: 0: 41571.3. Samples: 761439820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:38,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-21 19:06:38,673][15401] Updated weights for policy 0, policy_version 46470 (0.0044) [2024-06-21 19:06:42,142][15401] Updated weights for policy 0, policy_version 46480 (0.0041) [2024-06-21 19:06:43,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42052.3, 300 sec: 41932.8). Total num frames: 761577472. Throughput: 0: 41594.5. Samples: 761687260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:43,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-21 19:06:46,299][15401] Updated weights for policy 0, policy_version 46490 (0.0033) [2024-06-21 19:06:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41507.1, 300 sec: 41876.4). Total num frames: 761774080. Throughput: 0: 41455.5. Samples: 761936300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 19:06:50,458][15401] Updated weights for policy 0, policy_version 46500 (0.0035) [2024-06-21 19:06:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 761987072. Throughput: 0: 41485.0. Samples: 762060320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:53,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 19:06:53,867][15401] Updated weights for policy 0, policy_version 46510 (0.0035) [2024-06-21 19:06:58,388][15401] Updated weights for policy 0, policy_version 46520 (0.0042) [2024-06-21 19:06:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 762183680. Throughput: 0: 41443.7. Samples: 762310480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:06:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:07:01,642][15401] Updated weights for policy 0, policy_version 46530 (0.0023) [2024-06-21 19:07:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 762380288. Throughput: 0: 41639.6. Samples: 762565000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 19:07:06,182][15401] Updated weights for policy 0, policy_version 46540 (0.0038) [2024-06-21 19:07:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 762609664. Throughput: 0: 41645.7. Samples: 762688000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 19:07:09,535][15401] Updated weights for policy 0, policy_version 46550 (0.0032) [2024-06-21 19:07:13,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41231.4, 300 sec: 41765.0). Total num frames: 762789888. Throughput: 0: 41632.8. Samples: 762937020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:13,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 19:07:13,984][15401] Updated weights for policy 0, policy_version 46560 (0.0024) [2024-06-21 19:07:17,605][15401] Updated weights for policy 0, policy_version 46570 (0.0044) [2024-06-21 19:07:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 763019264. Throughput: 0: 41306.8. Samples: 763176340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 19:07:21,812][15401] Updated weights for policy 0, policy_version 46580 (0.0042) [2024-06-21 19:07:23,390][15132] Fps is (10 sec: 44247.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 763232256. Throughput: 0: 41474.6. Samples: 763306180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 19:07:24,882][15349] Signal inference workers to stop experience collection... (11050 times) [2024-06-21 19:07:24,935][15349] Signal inference workers to resume experience collection... (11050 times) [2024-06-21 19:07:24,936][15401] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-21 19:07:24,949][15401] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-21 19:07:25,393][15401] Updated weights for policy 0, policy_version 46590 (0.0034) [2024-06-21 19:07:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41765.7). Total num frames: 763428864. Throughput: 0: 41426.9. Samples: 763551460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 19:07:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 19:07:30,019][15401] Updated weights for policy 0, policy_version 46600 (0.0041) [2024-06-21 19:07:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41783.6, 300 sec: 41709.8). Total num frames: 763641856. Throughput: 0: 41394.5. Samples: 763799060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:07:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 19:07:33,589][15401] Updated weights for policy 0, policy_version 46610 (0.0027) [2024-06-21 19:07:37,932][15401] Updated weights for policy 0, policy_version 46620 (0.0049) [2024-06-21 19:07:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 763838464. Throughput: 0: 41460.4. Samples: 763926040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:07:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-21 19:07:41,498][15401] Updated weights for policy 0, policy_version 46630 (0.0039) [2024-06-21 19:07:43,390][15132] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 764035072. Throughput: 0: 41331.0. Samples: 764170380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:07:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 19:07:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046634_764051456.pth... [2024-06-21 19:07:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046023_754040832.pth [2024-06-21 19:07:45,745][15401] Updated weights for policy 0, policy_version 46640 (0.0038) [2024-06-21 19:07:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 764248064. Throughput: 0: 41305.3. Samples: 764423740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:07:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 19:07:49,399][15401] Updated weights for policy 0, policy_version 46650 (0.0033) [2024-06-21 19:07:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 764461056. Throughput: 0: 41258.3. Samples: 764544620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:07:53,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-21 19:07:53,441][15401] Updated weights for policy 0, policy_version 46660 (0.0030) [2024-06-21 19:07:57,162][15401] Updated weights for policy 0, policy_version 46670 (0.0039) [2024-06-21 19:07:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 764674048. Throughput: 0: 41299.7. Samples: 764795400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:07:58,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 19:08:01,153][15401] Updated weights for policy 0, policy_version 46680 (0.0049) [2024-06-21 19:08:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 764870656. Throughput: 0: 41664.4. Samples: 765051240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:08:03,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 19:08:05,037][15401] Updated weights for policy 0, policy_version 46690 (0.0043) [2024-06-21 19:08:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41765.6). Total num frames: 765100032. Throughput: 0: 41541.3. Samples: 765175540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:08:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 19:08:08,985][15401] Updated weights for policy 0, policy_version 46700 (0.0036) [2024-06-21 19:08:12,794][15401] Updated weights for policy 0, policy_version 46710 (0.0047) [2024-06-21 19:08:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 41654.2). Total num frames: 765313024. Throughput: 0: 41677.6. Samples: 765426960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:08:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 19:08:16,812][15401] Updated weights for policy 0, policy_version 46720 (0.0038) [2024-06-21 19:08:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 765493248. Throughput: 0: 41710.4. Samples: 765676020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:08:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 19:08:20,528][15401] Updated weights for policy 0, policy_version 46730 (0.0032) [2024-06-21 19:08:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 765706240. Throughput: 0: 41613.7. Samples: 765798660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:08:23,395][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 19:08:24,432][15401] Updated weights for policy 0, policy_version 46740 (0.0038) [2024-06-21 19:08:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 765935616. Throughput: 0: 41771.2. Samples: 766050080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 19:08:28,596][15401] Updated weights for policy 0, policy_version 46750 (0.0024) [2024-06-21 19:08:32,472][15401] Updated weights for policy 0, policy_version 46760 (0.0025) [2024-06-21 19:08:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41504.6, 300 sec: 41653.9). Total num frames: 766132224. Throughput: 0: 41646.2. Samples: 766297920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:33,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 19:08:36,542][15401] Updated weights for policy 0, policy_version 46770 (0.0030) [2024-06-21 19:08:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 766328832. Throughput: 0: 41659.9. Samples: 766419320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-21 19:08:40,283][15401] Updated weights for policy 0, policy_version 46780 (0.0032) [2024-06-21 19:08:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 766525440. Throughput: 0: 41676.3. Samples: 766670840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:08:44,337][15401] Updated weights for policy 0, policy_version 46790 (0.0042) [2024-06-21 19:08:48,051][15401] Updated weights for policy 0, policy_version 46800 (0.0030) [2024-06-21 19:08:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41766.2). Total num frames: 766771200. Throughput: 0: 41499.5. Samples: 766918720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:48,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-21 19:08:49,816][15349] Signal inference workers to stop experience collection... (11100 times) [2024-06-21 19:08:49,816][15349] Signal inference workers to resume experience collection... (11100 times) [2024-06-21 19:08:49,869][15401] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-21 19:08:49,869][15401] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-21 19:08:52,651][15401] Updated weights for policy 0, policy_version 46810 (0.0037) [2024-06-21 19:08:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 766967808. Throughput: 0: 41601.4. Samples: 767047600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:08:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 19:08:56,177][15401] Updated weights for policy 0, policy_version 46820 (0.0036) [2024-06-21 19:08:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 767180800. Throughput: 0: 41569.8. Samples: 767297600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:08:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 19:09:00,365][15401] Updated weights for policy 0, policy_version 46830 (0.0031) [2024-06-21 19:09:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 767393792. Throughput: 0: 41529.3. Samples: 767544840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:09:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 19:09:03,752][15401] Updated weights for policy 0, policy_version 46840 (0.0034) [2024-06-21 19:09:08,195][15401] Updated weights for policy 0, policy_version 46850 (0.0033) [2024-06-21 19:09:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 767590400. Throughput: 0: 41668.0. Samples: 767673720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:09:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 19:09:11,702][15401] Updated weights for policy 0, policy_version 46860 (0.0036) [2024-06-21 19:09:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 767819776. Throughput: 0: 41702.7. Samples: 767926700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:09:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 19:09:15,950][15401] Updated weights for policy 0, policy_version 46870 (0.0045) [2024-06-21 19:09:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 768016384. Throughput: 0: 41755.6. Samples: 768176820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:09:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 19:09:19,601][15401] Updated weights for policy 0, policy_version 46880 (0.0045) [2024-06-21 19:09:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 768212992. Throughput: 0: 41805.4. Samples: 768300560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 19:09:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 19:09:23,944][15401] Updated weights for policy 0, policy_version 46890 (0.0040) [2024-06-21 19:09:27,719][15401] Updated weights for policy 0, policy_version 46900 (0.0043) [2024-06-21 19:09:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 768425984. Throughput: 0: 41884.6. Samples: 768555640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 19:09:31,713][15401] Updated weights for policy 0, policy_version 46910 (0.0039) [2024-06-21 19:09:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41507.8, 300 sec: 41654.2). Total num frames: 768622592. Throughput: 0: 41931.2. Samples: 768805620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 19:09:35,682][15401] Updated weights for policy 0, policy_version 46920 (0.0047) [2024-06-21 19:09:38,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41777.5, 300 sec: 41598.4). Total num frames: 768835584. Throughput: 0: 41598.6. Samples: 768919640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:38,393][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 19:09:39,993][15401] Updated weights for policy 0, policy_version 46930 (0.0036) [2024-06-21 19:09:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 769032192. Throughput: 0: 41698.2. Samples: 769174020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 19:09:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046939_769048576.pth... [2024-06-21 19:09:43,585][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046331_759087104.pth [2024-06-21 19:09:43,735][15401] Updated weights for policy 0, policy_version 46940 (0.0033) [2024-06-21 19:09:47,655][15401] Updated weights for policy 0, policy_version 46950 (0.0029) [2024-06-21 19:09:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 769261568. Throughput: 0: 41669.7. Samples: 769419980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 19:09:51,665][15401] Updated weights for policy 0, policy_version 46960 (0.0042) [2024-06-21 19:09:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 769474560. Throughput: 0: 41576.9. Samples: 769544680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:09:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:09:55,282][15401] Updated weights for policy 0, policy_version 46970 (0.0034) [2024-06-21 19:09:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 41598.4). Total num frames: 769671168. Throughput: 0: 41480.5. Samples: 769793420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:09:58,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 19:09:59,662][15401] Updated weights for policy 0, policy_version 46980 (0.0034) [2024-06-21 19:10:03,333][15401] Updated weights for policy 0, policy_version 46990 (0.0038) [2024-06-21 19:10:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 769884160. Throughput: 0: 41527.4. Samples: 770045560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:10:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 19:10:07,478][15401] Updated weights for policy 0, policy_version 47000 (0.0039) [2024-06-21 19:10:08,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 770097152. Throughput: 0: 41670.6. Samples: 770175740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:10:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 19:10:10,912][15401] Updated weights for policy 0, policy_version 47010 (0.0041) [2024-06-21 19:10:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 770310144. Throughput: 0: 41551.5. Samples: 770425460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:10:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 19:10:15,221][15401] Updated weights for policy 0, policy_version 47020 (0.0037) [2024-06-21 19:10:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 770523136. Throughput: 0: 41632.9. Samples: 770679100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:10:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 19:10:18,456][15401] Updated weights for policy 0, policy_version 47030 (0.0035) [2024-06-21 19:10:22,918][15401] Updated weights for policy 0, policy_version 47040 (0.0037) [2024-06-21 19:10:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 770719744. Throughput: 0: 42013.0. Samples: 770810120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 19:10:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-21 19:10:25,912][15349] Signal inference workers to stop experience collection... (11150 times) [2024-06-21 19:10:25,912][15349] Signal inference workers to resume experience collection... (11150 times) [2024-06-21 19:10:25,933][15401] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-21 19:10:25,933][15401] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-21 19:10:26,063][15401] Updated weights for policy 0, policy_version 47050 (0.0031) [2024-06-21 19:10:28,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41504.4, 300 sec: 41542.8). Total num frames: 770916352. Throughput: 0: 41836.5. Samples: 771056760. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:28,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 19:10:30,615][15401] Updated weights for policy 0, policy_version 47060 (0.0024) [2024-06-21 19:10:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 771162112. Throughput: 0: 42166.3. Samples: 771317460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 19:10:33,610][15401] Updated weights for policy 0, policy_version 47070 (0.0028) [2024-06-21 19:10:38,159][15401] Updated weights for policy 0, policy_version 47080 (0.0029) [2024-06-21 19:10:38,390][15132] Fps is (10 sec: 44246.6, 60 sec: 42053.8, 300 sec: 41709.8). Total num frames: 771358720. Throughput: 0: 42351.9. Samples: 771450520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:38,391][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 19:10:41,291][15401] Updated weights for policy 0, policy_version 47090 (0.0034) [2024-06-21 19:10:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 41598.9). Total num frames: 771555328. Throughput: 0: 42196.8. Samples: 771692180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 19:10:46,007][15401] Updated weights for policy 0, policy_version 47100 (0.0041) [2024-06-21 19:10:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42050.5, 300 sec: 41709.4). Total num frames: 771784704. Throughput: 0: 42291.1. Samples: 771948760. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:48,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 19:10:49,494][15401] Updated weights for policy 0, policy_version 47110 (0.0037) [2024-06-21 19:10:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 771997696. Throughput: 0: 42220.4. Samples: 772075660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-21 19:10:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 19:10:53,985][15401] Updated weights for policy 0, policy_version 47120 (0.0044) [2024-06-21 19:10:57,146][15401] Updated weights for policy 0, policy_version 47130 (0.0040) [2024-06-21 19:10:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42327.0, 300 sec: 41765.3). Total num frames: 772210688. Throughput: 0: 42175.5. Samples: 772323360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:10:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 19:11:01,606][15401] Updated weights for policy 0, policy_version 47140 (0.0030) [2024-06-21 19:11:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 772407296. Throughput: 0: 42207.1. Samples: 772578420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:11:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:11:05,083][15401] Updated weights for policy 0, policy_version 47150 (0.0032) [2024-06-21 19:11:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 41765.0). Total num frames: 772636672. Throughput: 0: 42167.0. Samples: 772707740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:11:08,393][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 19:11:09,237][15401] Updated weights for policy 0, policy_version 47160 (0.0030) [2024-06-21 19:11:12,999][15401] Updated weights for policy 0, policy_version 47170 (0.0030) [2024-06-21 19:11:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 772849664. Throughput: 0: 42234.2. Samples: 772957200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:11:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 19:11:17,503][15401] Updated weights for policy 0, policy_version 47180 (0.0048) [2024-06-21 19:11:18,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 773029888. Throughput: 0: 41858.1. Samples: 773201080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:11:18,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-21 19:11:21,358][15401] Updated weights for policy 0, policy_version 47190 (0.0032) [2024-06-21 19:11:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 773259264. Throughput: 0: 41775.2. Samples: 773330400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 19:11:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 19:11:25,014][15401] Updated weights for policy 0, policy_version 47200 (0.0026) [2024-06-21 19:11:28,396][15132] Fps is (10 sec: 40933.0, 60 sec: 42049.3, 300 sec: 41709.8). Total num frames: 773439488. Throughput: 0: 42017.5. Samples: 773583240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:28,397][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 19:11:29,083][15401] Updated weights for policy 0, policy_version 47210 (0.0033) [2024-06-21 19:11:32,823][15401] Updated weights for policy 0, policy_version 47220 (0.0028) [2024-06-21 19:11:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 773668864. Throughput: 0: 41982.8. Samples: 773837880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:33,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 19:11:36,928][15401] Updated weights for policy 0, policy_version 47230 (0.0037) [2024-06-21 19:11:38,389][15132] Fps is (10 sec: 45905.7, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 773898240. Throughput: 0: 42071.7. Samples: 773968880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:38,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 19:11:40,451][15401] Updated weights for policy 0, policy_version 47240 (0.0039) [2024-06-21 19:11:43,390][15132] Fps is (10 sec: 40958.2, 60 sec: 42052.0, 300 sec: 41709.7). Total num frames: 774078464. Throughput: 0: 42154.7. Samples: 774220340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 19:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047246_774078464.pth... [2024-06-21 19:11:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046634_764051456.pth [2024-06-21 19:11:44,687][15401] Updated weights for policy 0, policy_version 47250 (0.0025) [2024-06-21 19:11:48,197][15401] Updated weights for policy 0, policy_version 47260 (0.0033) [2024-06-21 19:11:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 774307840. Throughput: 0: 41926.7. Samples: 774465120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 19:11:52,563][15401] Updated weights for policy 0, policy_version 47270 (0.0032) [2024-06-21 19:11:53,316][15349] Signal inference workers to stop experience collection... (11200 times) [2024-06-21 19:11:53,318][15349] Signal inference workers to resume experience collection... (11200 times) [2024-06-21 19:11:53,364][15401] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-21 19:11:53,364][15401] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-21 19:11:53,392][15132] Fps is (10 sec: 42589.8, 60 sec: 41777.6, 300 sec: 41765.0). Total num frames: 774504448. Throughput: 0: 41952.0. Samples: 774595580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:11:53,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 19:11:55,842][15401] Updated weights for policy 0, policy_version 47280 (0.0031) [2024-06-21 19:11:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 774717440. Throughput: 0: 41933.0. Samples: 774844280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:11:58,392][15132] Avg episode reward: [(0, '0.202')] [2024-06-21 19:12:00,287][15401] Updated weights for policy 0, policy_version 47290 (0.0039) [2024-06-21 19:12:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 774946816. Throughput: 0: 41958.6. Samples: 775089220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:12:03,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 19:12:03,612][15401] Updated weights for policy 0, policy_version 47300 (0.0032) [2024-06-21 19:12:08,073][15401] Updated weights for policy 0, policy_version 47310 (0.0037) [2024-06-21 19:12:08,389][15132] Fps is (10 sec: 42608.5, 60 sec: 41780.9, 300 sec: 41876.7). Total num frames: 775143424. Throughput: 0: 41956.1. Samples: 775218420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:12:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 19:12:11,412][15401] Updated weights for policy 0, policy_version 47320 (0.0037) [2024-06-21 19:12:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 775356416. Throughput: 0: 41976.4. Samples: 775471900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:12:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 19:12:15,744][15401] Updated weights for policy 0, policy_version 47330 (0.0027) [2024-06-21 19:12:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 775585792. Throughput: 0: 41811.5. Samples: 775719400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:12:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:12:19,969][15401] Updated weights for policy 0, policy_version 47340 (0.0039) [2024-06-21 19:12:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 775766016. Throughput: 0: 41858.3. Samples: 775852500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 19:12:23,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-21 19:12:23,407][15401] Updated weights for policy 0, policy_version 47350 (0.0033) [2024-06-21 19:12:27,625][15401] Updated weights for policy 0, policy_version 47360 (0.0037) [2024-06-21 19:12:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42330.0, 300 sec: 41820.9). Total num frames: 775979008. Throughput: 0: 41746.5. Samples: 776098920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 19:12:31,693][15401] Updated weights for policy 0, policy_version 47370 (0.0035) [2024-06-21 19:12:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 776192000. Throughput: 0: 41817.8. Samples: 776346920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 19:12:35,243][15401] Updated weights for policy 0, policy_version 47380 (0.0028) [2024-06-21 19:12:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 776404992. Throughput: 0: 41827.6. Samples: 776477720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 19:12:39,252][15401] Updated weights for policy 0, policy_version 47390 (0.0037) [2024-06-21 19:12:43,007][15401] Updated weights for policy 0, policy_version 47400 (0.0037) [2024-06-21 19:12:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.6, 300 sec: 41931.9). Total num frames: 776617984. Throughput: 0: 41880.8. Samples: 776728820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 19:12:46,821][15401] Updated weights for policy 0, policy_version 47410 (0.0047) [2024-06-21 19:12:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 776814592. Throughput: 0: 42085.8. Samples: 776983080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 19:12:50,802][15401] Updated weights for policy 0, policy_version 47420 (0.0030) [2024-06-21 19:12:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42053.9, 300 sec: 41876.4). Total num frames: 777027584. Throughput: 0: 41895.9. Samples: 777103740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 19:12:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 19:12:54,287][15401] Updated weights for policy 0, policy_version 47430 (0.0029) [2024-06-21 19:12:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42053.9, 300 sec: 41931.9). Total num frames: 777240576. Throughput: 0: 41986.3. Samples: 777361280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:12:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 19:12:58,515][15401] Updated weights for policy 0, policy_version 47440 (0.0035) [2024-06-21 19:13:02,523][15401] Updated weights for policy 0, policy_version 47450 (0.0032) [2024-06-21 19:13:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 777437184. Throughput: 0: 42076.8. Samples: 777612860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:13:03,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 19:13:06,420][15401] Updated weights for policy 0, policy_version 47460 (0.0034) [2024-06-21 19:13:08,031][15349] Signal inference workers to stop experience collection... (11250 times) [2024-06-21 19:13:08,036][15349] Signal inference workers to resume experience collection... (11250 times) [2024-06-21 19:13:08,047][15401] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-21 19:13:08,062][15401] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-21 19:13:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 777666560. Throughput: 0: 41870.7. Samples: 777736680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:13:08,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 19:13:10,223][15401] Updated weights for policy 0, policy_version 47470 (0.0039) [2024-06-21 19:13:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 777846784. Throughput: 0: 42012.9. Samples: 777989500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:13:13,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 19:13:14,156][15401] Updated weights for policy 0, policy_version 47480 (0.0049) [2024-06-21 19:13:18,240][15401] Updated weights for policy 0, policy_version 47490 (0.0034) [2024-06-21 19:13:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 778076160. Throughput: 0: 42055.1. Samples: 778239400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:13:18,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-21 19:13:22,478][15401] Updated weights for policy 0, policy_version 47500 (0.0031) [2024-06-21 19:13:23,391][15132] Fps is (10 sec: 45868.9, 60 sec: 42324.2, 300 sec: 41931.7). Total num frames: 778305536. Throughput: 0: 41940.0. Samples: 778365080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:13:23,391][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 19:13:25,807][15401] Updated weights for policy 0, policy_version 47510 (0.0037) [2024-06-21 19:13:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 778485760. Throughput: 0: 41908.0. Samples: 778614680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:28,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 19:13:30,223][15401] Updated weights for policy 0, policy_version 47520 (0.0044) [2024-06-21 19:13:33,390][15132] Fps is (10 sec: 40965.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 778715136. Throughput: 0: 41684.0. Samples: 778858860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 19:13:33,656][15401] Updated weights for policy 0, policy_version 47530 (0.0039) [2024-06-21 19:13:37,911][15401] Updated weights for policy 0, policy_version 47540 (0.0026) [2024-06-21 19:13:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 778895360. Throughput: 0: 41849.0. Samples: 778986940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:38,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-21 19:13:41,400][15401] Updated weights for policy 0, policy_version 47550 (0.0041) [2024-06-21 19:13:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 779091968. Throughput: 0: 41712.4. Samples: 779238340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:43,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 19:13:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047553_779108352.pth... [2024-06-21 19:13:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000046939_769048576.pth [2024-06-21 19:13:46,098][15401] Updated weights for policy 0, policy_version 47560 (0.0027) [2024-06-21 19:13:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 779337728. Throughput: 0: 41602.9. Samples: 779484980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:48,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-21 19:13:49,068][15401] Updated weights for policy 0, policy_version 47570 (0.0036) [2024-06-21 19:13:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 779517952. Throughput: 0: 41859.7. Samples: 779620380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:13:53,399][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 19:13:54,025][15401] Updated weights for policy 0, policy_version 47580 (0.0045) [2024-06-21 19:13:56,903][15401] Updated weights for policy 0, policy_version 47590 (0.0035) [2024-06-21 19:13:58,394][15132] Fps is (10 sec: 39304.6, 60 sec: 41503.2, 300 sec: 41820.3). Total num frames: 779730944. Throughput: 0: 41597.5. Samples: 779861560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:13:58,394][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 19:14:01,654][15401] Updated weights for policy 0, policy_version 47600 (0.0040) [2024-06-21 19:14:03,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42323.7, 300 sec: 41987.1). Total num frames: 779976704. Throughput: 0: 41715.1. Samples: 780116680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:14:03,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 19:14:04,675][15401] Updated weights for policy 0, policy_version 47610 (0.0028) [2024-06-21 19:14:08,389][15132] Fps is (10 sec: 40977.3, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 780140544. Throughput: 0: 41828.5. Samples: 780247300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:14:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 19:14:09,589][15401] Updated weights for policy 0, policy_version 47620 (0.0032) [2024-06-21 19:14:12,750][15401] Updated weights for policy 0, policy_version 47630 (0.0031) [2024-06-21 19:14:13,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 780386304. Throughput: 0: 41738.3. Samples: 780492900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:14:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 19:14:17,257][15401] Updated weights for policy 0, policy_version 47640 (0.0036) [2024-06-21 19:14:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 780582912. Throughput: 0: 41875.5. Samples: 780743260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:14:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 19:14:20,707][15401] Updated weights for policy 0, policy_version 47650 (0.0040) [2024-06-21 19:14:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41234.0, 300 sec: 41876.4). Total num frames: 780779520. Throughput: 0: 41842.6. Samples: 780869860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 19:14:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 19:14:25,088][15401] Updated weights for policy 0, policy_version 47660 (0.0035) [2024-06-21 19:14:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 781008896. Throughput: 0: 41752.9. Samples: 781117220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:28,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 19:14:28,391][15401] Updated weights for policy 0, policy_version 47670 (0.0031) [2024-06-21 19:14:30,701][15349] Signal inference workers to stop experience collection... (11300 times) [2024-06-21 19:14:30,701][15349] Signal inference workers to resume experience collection... (11300 times) [2024-06-21 19:14:30,741][15401] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-21 19:14:30,741][15401] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-21 19:14:32,931][15401] Updated weights for policy 0, policy_version 47680 (0.0028) [2024-06-21 19:14:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41932.3). Total num frames: 781205504. Throughput: 0: 41953.3. Samples: 781372880. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:33,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-21 19:14:36,180][15401] Updated weights for policy 0, policy_version 47690 (0.0042) [2024-06-21 19:14:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 781402112. Throughput: 0: 41690.4. Samples: 781496440. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:38,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 19:14:40,579][15401] Updated weights for policy 0, policy_version 47700 (0.0037) [2024-06-21 19:14:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 781631488. Throughput: 0: 41959.5. Samples: 781749560. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:43,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 19:14:44,042][15401] Updated weights for policy 0, policy_version 47710 (0.0044) [2024-06-21 19:14:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 781828096. Throughput: 0: 41819.2. Samples: 781998440. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 19:14:48,511][15401] Updated weights for policy 0, policy_version 47720 (0.0033) [2024-06-21 19:14:51,905][15401] Updated weights for policy 0, policy_version 47730 (0.0037) [2024-06-21 19:14:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 782024704. Throughput: 0: 41739.1. Samples: 782125560. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-21 19:14:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 19:14:56,077][15401] Updated weights for policy 0, policy_version 47740 (0.0035) [2024-06-21 19:14:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42055.2, 300 sec: 41931.9). Total num frames: 782254080. Throughput: 0: 41826.6. Samples: 782375100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:14:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 19:15:00,119][15401] Updated weights for policy 0, policy_version 47750 (0.0034) [2024-06-21 19:15:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 782467072. Throughput: 0: 41833.0. Samples: 782625740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:15:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 19:15:03,915][15401] Updated weights for policy 0, policy_version 47760 (0.0038) [2024-06-21 19:15:07,938][15401] Updated weights for policy 0, policy_version 47770 (0.0037) [2024-06-21 19:15:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 782680064. Throughput: 0: 41716.5. Samples: 782747100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:15:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 19:15:11,555][15401] Updated weights for policy 0, policy_version 47780 (0.0042) [2024-06-21 19:15:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 782876672. Throughput: 0: 41885.8. Samples: 783002080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:15:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 19:15:15,782][15401] Updated weights for policy 0, policy_version 47790 (0.0035) [2024-06-21 19:15:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 783073280. Throughput: 0: 41676.3. Samples: 783248320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:15:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 19:15:19,648][15401] Updated weights for policy 0, policy_version 47800 (0.0025) [2024-06-21 19:15:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41932.3). Total num frames: 783286272. Throughput: 0: 41641.4. Samples: 783370300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 19:15:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 19:15:23,615][15401] Updated weights for policy 0, policy_version 47810 (0.0043) [2024-06-21 19:15:27,343][15401] Updated weights for policy 0, policy_version 47820 (0.0035) [2024-06-21 19:15:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 783515648. Throughput: 0: 41579.6. Samples: 783620640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 19:15:31,978][15401] Updated weights for policy 0, policy_version 47830 (0.0029) [2024-06-21 19:15:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 41820.9). Total num frames: 783695872. Throughput: 0: 41726.5. Samples: 783876140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 19:15:35,222][15401] Updated weights for policy 0, policy_version 47840 (0.0031) [2024-06-21 19:15:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 783925248. Throughput: 0: 41668.5. Samples: 784000640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:15:39,812][15401] Updated weights for policy 0, policy_version 47850 (0.0029) [2024-06-21 19:15:43,106][15401] Updated weights for policy 0, policy_version 47860 (0.0029) [2024-06-21 19:15:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41876.7). Total num frames: 784138240. Throughput: 0: 41697.7. Samples: 784251500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 19:15:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047860_784138240.pth... [2024-06-21 19:15:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047246_774078464.pth [2024-06-21 19:15:47,834][15401] Updated weights for policy 0, policy_version 47870 (0.0039) [2024-06-21 19:15:48,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41504.4, 300 sec: 41765.0). Total num frames: 784318464. Throughput: 0: 41798.6. Samples: 784506780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:48,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 19:15:51,050][15401] Updated weights for policy 0, policy_version 47880 (0.0048) [2024-06-21 19:15:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 784547840. Throughput: 0: 41695.6. Samples: 784623400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:15:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 19:15:55,614][15401] Updated weights for policy 0, policy_version 47890 (0.0035) [2024-06-21 19:15:58,389][15132] Fps is (10 sec: 44247.6, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 784760832. Throughput: 0: 41677.3. Samples: 784877560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:15:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 19:15:59,111][15401] Updated weights for policy 0, policy_version 47900 (0.0031) [2024-06-21 19:16:01,629][15349] Signal inference workers to stop experience collection... (11350 times) [2024-06-21 19:16:01,631][15349] Signal inference workers to resume experience collection... (11350 times) [2024-06-21 19:16:01,661][15401] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-21 19:16:01,661][15401] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-21 19:16:03,309][15401] Updated weights for policy 0, policy_version 47910 (0.0021) [2024-06-21 19:16:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 41765.6). Total num frames: 784957440. Throughput: 0: 41906.1. Samples: 785134100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 19:16:06,763][15401] Updated weights for policy 0, policy_version 47920 (0.0030) [2024-06-21 19:16:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41504.5, 300 sec: 41765.0). Total num frames: 785170432. Throughput: 0: 41883.5. Samples: 785255160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:08,392][15132] Avg episode reward: [(0, '0.788')] [2024-06-21 19:16:10,995][15401] Updated weights for policy 0, policy_version 47930 (0.0033) [2024-06-21 19:16:13,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 785416192. Throughput: 0: 42068.5. Samples: 785513720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:13,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 19:16:14,577][15401] Updated weights for policy 0, policy_version 47940 (0.0042) [2024-06-21 19:16:18,391][15132] Fps is (10 sec: 42600.9, 60 sec: 42051.0, 300 sec: 41820.6). Total num frames: 785596416. Throughput: 0: 41907.8. Samples: 785762060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:18,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 19:16:18,581][15401] Updated weights for policy 0, policy_version 47950 (0.0036) [2024-06-21 19:16:22,342][15401] Updated weights for policy 0, policy_version 47960 (0.0046) [2024-06-21 19:16:23,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41779.1, 300 sec: 41877.3). Total num frames: 785793024. Throughput: 0: 41799.9. Samples: 785881640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 19:16:26,319][15401] Updated weights for policy 0, policy_version 47970 (0.0037) [2024-06-21 19:16:28,390][15132] Fps is (10 sec: 42605.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 786022400. Throughput: 0: 41992.1. Samples: 786141140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:28,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 19:16:30,093][15401] Updated weights for policy 0, policy_version 47980 (0.0045) [2024-06-21 19:16:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 41820.5). Total num frames: 786235392. Throughput: 0: 41694.2. Samples: 786383020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:33,393][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 19:16:34,277][15401] Updated weights for policy 0, policy_version 47990 (0.0051) [2024-06-21 19:16:38,352][15401] Updated weights for policy 0, policy_version 48000 (0.0032) [2024-06-21 19:16:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 786432000. Throughput: 0: 41962.6. Samples: 786511720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 19:16:42,160][15401] Updated weights for policy 0, policy_version 48010 (0.0041) [2024-06-21 19:16:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 786661376. Throughput: 0: 41997.8. Samples: 786767460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:43,396][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 19:16:46,246][15401] Updated weights for policy 0, policy_version 48020 (0.0047) [2024-06-21 19:16:48,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42322.5, 300 sec: 41875.8). Total num frames: 786857984. Throughput: 0: 41855.9. Samples: 787017880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:48,397][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 19:16:49,816][15401] Updated weights for policy 0, policy_version 48030 (0.0043) [2024-06-21 19:16:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41821.2). Total num frames: 787054592. Throughput: 0: 41904.0. Samples: 787140740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:16:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 19:16:54,018][15401] Updated weights for policy 0, policy_version 48040 (0.0049) [2024-06-21 19:16:57,848][15401] Updated weights for policy 0, policy_version 48050 (0.0029) [2024-06-21 19:16:58,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 787300352. Throughput: 0: 41799.1. Samples: 787394680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:16:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 19:17:01,752][15401] Updated weights for policy 0, policy_version 48060 (0.0032) [2024-06-21 19:17:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 787464192. Throughput: 0: 41888.4. Samples: 787646960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:17:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 19:17:05,571][15401] Updated weights for policy 0, policy_version 48070 (0.0029) [2024-06-21 19:17:08,392][15132] Fps is (10 sec: 39309.9, 60 sec: 42051.9, 300 sec: 41820.4). Total num frames: 787693568. Throughput: 0: 41811.1. Samples: 787763260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:17:08,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 19:17:09,702][15401] Updated weights for policy 0, policy_version 48080 (0.0031) [2024-06-21 19:17:13,179][15401] Updated weights for policy 0, policy_version 48090 (0.0028) [2024-06-21 19:17:13,389][15132] Fps is (10 sec: 45874.7, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 787922944. Throughput: 0: 41896.9. Samples: 788026500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:17:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 19:17:17,429][15401] Updated weights for policy 0, policy_version 48100 (0.0023) [2024-06-21 19:17:18,389][15132] Fps is (10 sec: 40972.1, 60 sec: 41780.5, 300 sec: 41820.8). Total num frames: 788103168. Throughput: 0: 42161.0. Samples: 788280160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:17:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 19:17:20,677][15401] Updated weights for policy 0, policy_version 48110 (0.0039) [2024-06-21 19:17:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 788332544. Throughput: 0: 41996.1. Samples: 788401540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:17:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 19:17:25,086][15401] Updated weights for policy 0, policy_version 48120 (0.0036) [2024-06-21 19:17:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 788545536. Throughput: 0: 42068.0. Samples: 788660520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 19:17:28,848][15401] Updated weights for policy 0, policy_version 48130 (0.0037) [2024-06-21 19:17:29,264][15349] Signal inference workers to stop experience collection... (11400 times) [2024-06-21 19:17:29,271][15349] Signal inference workers to resume experience collection... (11400 times) [2024-06-21 19:17:29,293][15401] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-21 19:17:29,294][15401] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-21 19:17:33,166][15401] Updated weights for policy 0, policy_version 48140 (0.0041) [2024-06-21 19:17:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41780.8, 300 sec: 41820.8). Total num frames: 788742144. Throughput: 0: 42134.4. Samples: 788913660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 19:17:36,591][15401] Updated weights for policy 0, policy_version 48150 (0.0043) [2024-06-21 19:17:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 788971520. Throughput: 0: 42155.9. Samples: 789037760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:38,391][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 19:17:40,842][15401] Updated weights for policy 0, policy_version 48160 (0.0030) [2024-06-21 19:17:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 789168128. Throughput: 0: 42180.9. Samples: 789292820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 19:17:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048168_789184512.pth... [2024-06-21 19:17:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047553_779108352.pth [2024-06-21 19:17:44,209][15401] Updated weights for policy 0, policy_version 48170 (0.0034) [2024-06-21 19:17:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41783.7, 300 sec: 41820.9). Total num frames: 789364736. Throughput: 0: 42122.6. Samples: 789542480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 19:17:48,486][15401] Updated weights for policy 0, policy_version 48180 (0.0037) [2024-06-21 19:17:51,914][15401] Updated weights for policy 0, policy_version 48190 (0.0030) [2024-06-21 19:17:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 789610496. Throughput: 0: 42327.7. Samples: 789667880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 19:17:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 19:17:56,421][15401] Updated weights for policy 0, policy_version 48200 (0.0037) [2024-06-21 19:17:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 789774336. Throughput: 0: 42135.1. Samples: 789922580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:17:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 19:17:59,642][15401] Updated weights for policy 0, policy_version 48210 (0.0030) [2024-06-21 19:18:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 790020096. Throughput: 0: 41935.6. Samples: 790167260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:18:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 19:18:04,017][15401] Updated weights for policy 0, policy_version 48220 (0.0037) [2024-06-21 19:18:07,444][15401] Updated weights for policy 0, policy_version 48230 (0.0046) [2024-06-21 19:18:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42600.5, 300 sec: 42043.0). Total num frames: 790249472. Throughput: 0: 42204.9. Samples: 790300760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:18:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 19:18:12,216][15401] Updated weights for policy 0, policy_version 48240 (0.0032) [2024-06-21 19:18:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 790413312. Throughput: 0: 42071.5. Samples: 790553740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:18:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 19:18:15,402][15401] Updated weights for policy 0, policy_version 48250 (0.0043) [2024-06-21 19:18:18,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 41820.7). Total num frames: 790642688. Throughput: 0: 41828.1. Samples: 790796020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:18:18,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 19:18:19,956][15401] Updated weights for policy 0, policy_version 48260 (0.0036) [2024-06-21 19:18:23,090][15401] Updated weights for policy 0, policy_version 48270 (0.0037) [2024-06-21 19:18:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 790855680. Throughput: 0: 42078.4. Samples: 790931280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:18:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 19:18:27,530][15401] Updated weights for policy 0, policy_version 48280 (0.0038) [2024-06-21 19:18:28,394][15132] Fps is (10 sec: 39315.4, 60 sec: 41503.4, 300 sec: 41764.7). Total num frames: 791035904. Throughput: 0: 41876.7. Samples: 791177440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:28,394][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 19:18:30,934][15401] Updated weights for policy 0, policy_version 48290 (0.0042) [2024-06-21 19:18:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 791281664. Throughput: 0: 41906.1. Samples: 791428260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 19:18:35,187][15401] Updated weights for policy 0, policy_version 48300 (0.0034) [2024-06-21 19:18:38,389][15132] Fps is (10 sec: 44254.8, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 791478272. Throughput: 0: 42127.1. Samples: 791563600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 19:18:38,903][15401] Updated weights for policy 0, policy_version 48310 (0.0042) [2024-06-21 19:18:42,865][15401] Updated weights for policy 0, policy_version 48320 (0.0029) [2024-06-21 19:18:43,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 791674880. Throughput: 0: 41997.7. Samples: 791812580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:43,393][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 19:18:46,824][15401] Updated weights for policy 0, policy_version 48330 (0.0043) [2024-06-21 19:18:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 791904256. Throughput: 0: 42015.5. Samples: 792057960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:48,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 19:18:51,117][15401] Updated weights for policy 0, policy_version 48340 (0.0035) [2024-06-21 19:18:51,793][15349] Signal inference workers to stop experience collection... (11450 times) [2024-06-21 19:18:51,794][15349] Signal inference workers to resume experience collection... (11450 times) [2024-06-21 19:18:51,805][15401] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-21 19:18:51,805][15401] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-21 19:18:53,389][15132] Fps is (10 sec: 42608.8, 60 sec: 41506.1, 300 sec: 41932.5). Total num frames: 792100864. Throughput: 0: 41898.7. Samples: 792186200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 19:18:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 19:18:54,482][15401] Updated weights for policy 0, policy_version 48350 (0.0037) [2024-06-21 19:18:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41821.2). Total num frames: 792313856. Throughput: 0: 41735.3. Samples: 792431820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:18:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 19:18:58,822][15401] Updated weights for policy 0, policy_version 48360 (0.0036) [2024-06-21 19:19:02,483][15401] Updated weights for policy 0, policy_version 48370 (0.0044) [2024-06-21 19:19:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 792510464. Throughput: 0: 41913.4. Samples: 792682020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 19:19:06,839][15401] Updated weights for policy 0, policy_version 48380 (0.0033) [2024-06-21 19:19:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 792707072. Throughput: 0: 41706.6. Samples: 792808080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 19:19:10,430][15401] Updated weights for policy 0, policy_version 48390 (0.0047) [2024-06-21 19:19:13,393][15132] Fps is (10 sec: 42582.9, 60 sec: 42049.8, 300 sec: 41875.9). Total num frames: 792936448. Throughput: 0: 41713.7. Samples: 793054540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:13,394][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 19:19:14,626][15401] Updated weights for policy 0, policy_version 48400 (0.0036) [2024-06-21 19:19:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41507.8, 300 sec: 41876.4). Total num frames: 793133056. Throughput: 0: 41728.5. Samples: 793306040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:18,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 19:19:18,462][15401] Updated weights for policy 0, policy_version 48410 (0.0035) [2024-06-21 19:19:22,674][15401] Updated weights for policy 0, policy_version 48420 (0.0037) [2024-06-21 19:19:23,390][15132] Fps is (10 sec: 39335.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 793329664. Throughput: 0: 41445.7. Samples: 793428660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 19:19:26,342][15401] Updated weights for policy 0, policy_version 48430 (0.0048) [2024-06-21 19:19:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41782.0, 300 sec: 41820.8). Total num frames: 793542656. Throughput: 0: 41500.0. Samples: 793679980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:19:28,398][15132] Avg episode reward: [(0, '0.873')] [2024-06-21 19:19:30,523][15401] Updated weights for policy 0, policy_version 48440 (0.0041) [2024-06-21 19:19:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 793772032. Throughput: 0: 41452.8. Samples: 793923340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 19:19:34,471][15401] Updated weights for policy 0, policy_version 48450 (0.0032) [2024-06-21 19:19:38,260][15401] Updated weights for policy 0, policy_version 48460 (0.0044) [2024-06-21 19:19:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 793968640. Throughput: 0: 41502.6. Samples: 794053820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 19:19:42,159][15401] Updated weights for policy 0, policy_version 48470 (0.0038) [2024-06-21 19:19:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41507.8, 300 sec: 41820.8). Total num frames: 794165248. Throughput: 0: 41655.0. Samples: 794306300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:19:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048473_794181632.pth... [2024-06-21 19:19:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000047860_784138240.pth [2024-06-21 19:19:45,841][15401] Updated weights for policy 0, policy_version 48480 (0.0038) [2024-06-21 19:19:48,390][15132] Fps is (10 sec: 42595.7, 60 sec: 41505.6, 300 sec: 41931.8). Total num frames: 794394624. Throughput: 0: 41506.9. Samples: 794549860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:48,391][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 19:19:50,162][15401] Updated weights for policy 0, policy_version 48490 (0.0034) [2024-06-21 19:19:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 794591232. Throughput: 0: 41671.6. Samples: 794683300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:19:53,653][15401] Updated weights for policy 0, policy_version 48500 (0.0036) [2024-06-21 19:19:58,198][15401] Updated weights for policy 0, policy_version 48510 (0.0035) [2024-06-21 19:19:58,390][15132] Fps is (10 sec: 39324.3, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 794787840. Throughput: 0: 41530.9. Samples: 794923280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:19:58,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 19:20:01,526][15401] Updated weights for policy 0, policy_version 48520 (0.0026) [2024-06-21 19:20:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 795033600. Throughput: 0: 41344.9. Samples: 795166560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 19:20:05,806][15401] Updated weights for policy 0, policy_version 48530 (0.0036) [2024-06-21 19:20:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 795213824. Throughput: 0: 41656.8. Samples: 795303220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 19:20:09,395][15401] Updated weights for policy 0, policy_version 48540 (0.0033) [2024-06-21 19:20:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41508.7, 300 sec: 41876.4). Total num frames: 795426816. Throughput: 0: 41642.7. Samples: 795553900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 19:20:13,514][15401] Updated weights for policy 0, policy_version 48550 (0.0040) [2024-06-21 19:20:15,835][15349] Signal inference workers to stop experience collection... (11500 times) [2024-06-21 19:20:15,835][15349] Signal inference workers to resume experience collection... (11500 times) [2024-06-21 19:20:15,881][15401] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-21 19:20:15,881][15401] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-21 19:20:17,607][15401] Updated weights for policy 0, policy_version 48560 (0.0038) [2024-06-21 19:20:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 795656192. Throughput: 0: 41616.5. Samples: 795796080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 19:20:21,274][15401] Updated weights for policy 0, policy_version 48570 (0.0038) [2024-06-21 19:20:23,391][15132] Fps is (10 sec: 40954.2, 60 sec: 41778.3, 300 sec: 41765.1). Total num frames: 795836416. Throughput: 0: 41558.0. Samples: 795923980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:23,391][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 19:20:25,326][15401] Updated weights for policy 0, policy_version 48580 (0.0038) [2024-06-21 19:20:28,393][15132] Fps is (10 sec: 40943.7, 60 sec: 42049.5, 300 sec: 41931.4). Total num frames: 796065792. Throughput: 0: 41447.5. Samples: 796171600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:20:28,394][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 19:20:29,291][15401] Updated weights for policy 0, policy_version 48590 (0.0041) [2024-06-21 19:20:33,019][15401] Updated weights for policy 0, policy_version 48600 (0.0030) [2024-06-21 19:20:33,389][15132] Fps is (10 sec: 42604.3, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 796262400. Throughput: 0: 41650.9. Samples: 796424120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:33,390][15132] Avg episode reward: [(0, '0.086')] [2024-06-21 19:20:37,069][15401] Updated weights for policy 0, policy_version 48610 (0.0033) [2024-06-21 19:20:38,389][15132] Fps is (10 sec: 37698.2, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 796442624. Throughput: 0: 41456.8. Samples: 796548860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:20:41,098][15401] Updated weights for policy 0, policy_version 48620 (0.0024) [2024-06-21 19:20:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 796704768. Throughput: 0: 41770.2. Samples: 796802940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:20:44,890][15401] Updated weights for policy 0, policy_version 48630 (0.0041) [2024-06-21 19:20:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.6, 300 sec: 41820.9). Total num frames: 796884992. Throughput: 0: 42123.6. Samples: 797062120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-21 19:20:48,697][15401] Updated weights for policy 0, policy_version 48640 (0.0033) [2024-06-21 19:20:52,790][15401] Updated weights for policy 0, policy_version 48650 (0.0033) [2024-06-21 19:20:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 797081600. Throughput: 0: 41625.8. Samples: 797176380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-21 19:20:56,751][15401] Updated weights for policy 0, policy_version 48660 (0.0036) [2024-06-21 19:20:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 797327360. Throughput: 0: 41761.8. Samples: 797433180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:20:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 19:21:00,729][15401] Updated weights for policy 0, policy_version 48670 (0.0040) [2024-06-21 19:21:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41765.7). Total num frames: 797491200. Throughput: 0: 41955.1. Samples: 797684060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 19:21:04,892][15401] Updated weights for policy 0, policy_version 48680 (0.0044) [2024-06-21 19:21:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 797720576. Throughput: 0: 41719.5. Samples: 797801300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 19:21:08,518][15401] Updated weights for policy 0, policy_version 48690 (0.0046) [2024-06-21 19:21:12,668][15401] Updated weights for policy 0, policy_version 48700 (0.0039) [2024-06-21 19:21:13,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42050.5, 300 sec: 41876.3). Total num frames: 797949952. Throughput: 0: 42021.4. Samples: 798062500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:13,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 19:21:16,347][15401] Updated weights for policy 0, policy_version 48710 (0.0052) [2024-06-21 19:21:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 798130176. Throughput: 0: 42005.4. Samples: 798314360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:18,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 19:21:20,463][15401] Updated weights for policy 0, policy_version 48720 (0.0050) [2024-06-21 19:21:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42326.2, 300 sec: 41876.4). Total num frames: 798375936. Throughput: 0: 41809.3. Samples: 798430280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 19:21:24,214][15401] Updated weights for policy 0, policy_version 48730 (0.0027) [2024-06-21 19:21:28,166][15401] Updated weights for policy 0, policy_version 48740 (0.0037) [2024-06-21 19:21:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41508.9, 300 sec: 41765.7). Total num frames: 798556160. Throughput: 0: 41787.7. Samples: 798683380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:21:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 19:21:32,075][15401] Updated weights for policy 0, policy_version 48750 (0.0030) [2024-06-21 19:21:33,392][15132] Fps is (10 sec: 37674.2, 60 sec: 41504.4, 300 sec: 41765.0). Total num frames: 798752768. Throughput: 0: 41691.5. Samples: 798938340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:33,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 19:21:35,961][15401] Updated weights for policy 0, policy_version 48760 (0.0038) [2024-06-21 19:21:37,434][15349] Signal inference workers to stop experience collection... (11550 times) [2024-06-21 19:21:37,435][15349] Signal inference workers to resume experience collection... (11550 times) [2024-06-21 19:21:37,448][15401] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-21 19:21:37,482][15401] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-21 19:21:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 41876.4). Total num frames: 799014912. Throughput: 0: 41827.1. Samples: 799058600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 19:21:39,852][15401] Updated weights for policy 0, policy_version 48770 (0.0028) [2024-06-21 19:21:43,390][15132] Fps is (10 sec: 44246.9, 60 sec: 41506.0, 300 sec: 41821.7). Total num frames: 799195136. Throughput: 0: 41751.8. Samples: 799312020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 19:21:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048779_799195136.pth... [2024-06-21 19:21:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048168_789184512.pth [2024-06-21 19:21:43,762][15401] Updated weights for policy 0, policy_version 48780 (0.0024) [2024-06-21 19:21:47,719][15401] Updated weights for policy 0, policy_version 48790 (0.0029) [2024-06-21 19:21:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 799391744. Throughput: 0: 41775.9. Samples: 799563980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 19:21:51,496][15401] Updated weights for policy 0, policy_version 48800 (0.0032) [2024-06-21 19:21:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 799621120. Throughput: 0: 41921.1. Samples: 799687760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 19:21:55,701][15401] Updated weights for policy 0, policy_version 48810 (0.0030) [2024-06-21 19:21:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 799817728. Throughput: 0: 41778.7. Samples: 799942440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-21 19:21:58,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 19:21:59,055][15401] Updated weights for policy 0, policy_version 48820 (0.0036) [2024-06-21 19:22:03,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.1, 300 sec: 41710.2). Total num frames: 799997952. Throughput: 0: 41811.8. Samples: 800195900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 19:22:03,743][15401] Updated weights for policy 0, policy_version 48830 (0.0031) [2024-06-21 19:22:07,183][15401] Updated weights for policy 0, policy_version 48840 (0.0038) [2024-06-21 19:22:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 800227328. Throughput: 0: 41950.3. Samples: 800318040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 19:22:11,706][15401] Updated weights for policy 0, policy_version 48850 (0.0042) [2024-06-21 19:22:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 41780.9, 300 sec: 41876.4). Total num frames: 800456704. Throughput: 0: 41805.2. Samples: 800564620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:13,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-21 19:22:15,170][15401] Updated weights for policy 0, policy_version 48860 (0.0033) [2024-06-21 19:22:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 800636928. Throughput: 0: 41772.5. Samples: 800818000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 19:22:19,433][15401] Updated weights for policy 0, policy_version 48870 (0.0031) [2024-06-21 19:22:23,150][15401] Updated weights for policy 0, policy_version 48880 (0.0033) [2024-06-21 19:22:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 800849920. Throughput: 0: 41749.9. Samples: 800937340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 19:22:27,055][15401] Updated weights for policy 0, policy_version 48890 (0.0038) [2024-06-21 19:22:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 801079296. Throughput: 0: 41743.7. Samples: 801190480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-21 19:22:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 19:22:30,796][15401] Updated weights for policy 0, policy_version 48900 (0.0042) [2024-06-21 19:22:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42053.9, 300 sec: 41709.8). Total num frames: 801275904. Throughput: 0: 41683.2. Samples: 801439720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 19:22:35,067][15401] Updated weights for policy 0, policy_version 48910 (0.0029) [2024-06-21 19:22:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 801488896. Throughput: 0: 41745.0. Samples: 801566280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 19:22:38,523][15401] Updated weights for policy 0, policy_version 48920 (0.0034) [2024-06-21 19:22:42,979][15401] Updated weights for policy 0, policy_version 48930 (0.0044) [2024-06-21 19:22:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 801685504. Throughput: 0: 41698.7. Samples: 801818880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 19:22:46,230][15401] Updated weights for policy 0, policy_version 48940 (0.0030) [2024-06-21 19:22:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 801914880. Throughput: 0: 41569.1. Samples: 802066500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:48,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-21 19:22:50,628][15401] Updated weights for policy 0, policy_version 48950 (0.0038) [2024-06-21 19:22:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 802111488. Throughput: 0: 41719.8. Samples: 802195440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:53,390][15132] Avg episode reward: [(0, '0.112')] [2024-06-21 19:22:54,369][15401] Updated weights for policy 0, policy_version 48960 (0.0027) [2024-06-21 19:22:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 802308096. Throughput: 0: 41793.0. Samples: 802445300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 19:22:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 19:22:58,468][15401] Updated weights for policy 0, policy_version 48970 (0.0033) [2024-06-21 19:23:01,933][15401] Updated weights for policy 0, policy_version 48980 (0.0044) [2024-06-21 19:23:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 41709.8). Total num frames: 802553856. Throughput: 0: 41621.4. Samples: 802690960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 19:23:06,399][15401] Updated weights for policy 0, policy_version 48990 (0.0029) [2024-06-21 19:23:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 802750464. Throughput: 0: 41972.8. Samples: 802826120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 19:23:09,998][15401] Updated weights for policy 0, policy_version 49000 (0.0029) [2024-06-21 19:23:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41710.1). Total num frames: 802947072. Throughput: 0: 41934.2. Samples: 803077520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 19:23:14,002][15401] Updated weights for policy 0, policy_version 49010 (0.0034) [2024-06-21 19:23:14,856][15349] Signal inference workers to stop experience collection... (11600 times) [2024-06-21 19:23:14,909][15401] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-21 19:23:14,915][15349] Signal inference workers to resume experience collection... (11600 times) [2024-06-21 19:23:14,924][15401] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-21 19:23:17,658][15401] Updated weights for policy 0, policy_version 49020 (0.0034) [2024-06-21 19:23:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 803160064. Throughput: 0: 41938.7. Samples: 803326960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:18,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 19:23:21,886][15401] Updated weights for policy 0, policy_version 49030 (0.0042) [2024-06-21 19:23:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41877.0). Total num frames: 803389440. Throughput: 0: 42067.1. Samples: 803459300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 19:23:25,315][15401] Updated weights for policy 0, policy_version 49040 (0.0036) [2024-06-21 19:23:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 803569664. Throughput: 0: 41904.5. Samples: 803704580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:28,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-21 19:23:30,018][15401] Updated weights for policy 0, policy_version 49050 (0.0032) [2024-06-21 19:23:33,268][15401] Updated weights for policy 0, policy_version 49060 (0.0046) [2024-06-21 19:23:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 803799040. Throughput: 0: 41983.9. Samples: 803955780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:23:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 19:23:37,497][15401] Updated weights for policy 0, policy_version 49070 (0.0033) [2024-06-21 19:23:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41765.7). Total num frames: 803995648. Throughput: 0: 41980.2. Samples: 804084540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:23:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 19:23:40,870][15401] Updated weights for policy 0, policy_version 49080 (0.0027) [2024-06-21 19:23:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 804208640. Throughput: 0: 42041.8. Samples: 804337180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:23:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:23:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049086_804225024.pth... [2024-06-21 19:23:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048473_794181632.pth [2024-06-21 19:23:45,207][15401] Updated weights for policy 0, policy_version 49090 (0.0037) [2024-06-21 19:23:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 804438016. Throughput: 0: 42256.9. Samples: 804592520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:23:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:23:48,423][15401] Updated weights for policy 0, policy_version 49100 (0.0027) [2024-06-21 19:23:52,957][15401] Updated weights for policy 0, policy_version 49110 (0.0037) [2024-06-21 19:23:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 804634624. Throughput: 0: 42116.4. Samples: 804721360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:23:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 19:23:56,169][15401] Updated weights for policy 0, policy_version 49120 (0.0036) [2024-06-21 19:23:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 804847616. Throughput: 0: 42235.1. Samples: 804978100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:23:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 19:24:00,474][15401] Updated weights for policy 0, policy_version 49130 (0.0036) [2024-06-21 19:24:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 805076992. Throughput: 0: 42294.6. Samples: 805230220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:24:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 19:24:03,814][15401] Updated weights for policy 0, policy_version 49140 (0.0047) [2024-06-21 19:24:08,265][15401] Updated weights for policy 0, policy_version 49150 (0.0043) [2024-06-21 19:24:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41821.4). Total num frames: 805273600. Throughput: 0: 42213.3. Samples: 805358900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 19:24:11,438][15401] Updated weights for policy 0, policy_version 49160 (0.0028) [2024-06-21 19:24:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 805486592. Throughput: 0: 42357.7. Samples: 805610680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 19:24:15,898][15401] Updated weights for policy 0, policy_version 49170 (0.0034) [2024-06-21 19:24:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 805699584. Throughput: 0: 42287.5. Samples: 805858720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 19:24:19,278][15401] Updated weights for policy 0, policy_version 49180 (0.0041) [2024-06-21 19:24:23,391][15132] Fps is (10 sec: 40954.1, 60 sec: 41778.1, 300 sec: 41876.2). Total num frames: 805896192. Throughput: 0: 42274.5. Samples: 805986960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:23,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 19:24:23,995][15401] Updated weights for policy 0, policy_version 49190 (0.0033) [2024-06-21 19:24:27,411][15401] Updated weights for policy 0, policy_version 49200 (0.0036) [2024-06-21 19:24:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 806109184. Throughput: 0: 42013.2. Samples: 806227780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 19:24:31,983][15401] Updated weights for policy 0, policy_version 49210 (0.0048) [2024-06-21 19:24:33,392][15132] Fps is (10 sec: 40956.3, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 806305792. Throughput: 0: 41959.0. Samples: 806480780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:24:33,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 19:24:35,613][15401] Updated weights for policy 0, policy_version 49220 (0.0030) [2024-06-21 19:24:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 806518784. Throughput: 0: 41852.6. Samples: 806604720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:24:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 19:24:39,767][15401] Updated weights for policy 0, policy_version 49230 (0.0047) [2024-06-21 19:24:40,996][15349] Signal inference workers to stop experience collection... (11650 times) [2024-06-21 19:24:40,997][15349] Signal inference workers to resume experience collection... (11650 times) [2024-06-21 19:24:41,012][15401] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-21 19:24:41,030][15401] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-21 19:24:43,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42052.1, 300 sec: 41820.9). Total num frames: 806731776. Throughput: 0: 41620.7. Samples: 806851040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:24:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 19:24:43,550][15401] Updated weights for policy 0, policy_version 49240 (0.0044) [2024-06-21 19:24:47,751][15401] Updated weights for policy 0, policy_version 49250 (0.0042) [2024-06-21 19:24:48,393][15132] Fps is (10 sec: 40944.0, 60 sec: 41503.4, 300 sec: 41820.3). Total num frames: 806928384. Throughput: 0: 41638.4. Samples: 807104100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:24:48,394][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 19:24:51,430][15401] Updated weights for policy 0, policy_version 49260 (0.0034) [2024-06-21 19:24:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 807157760. Throughput: 0: 41571.6. Samples: 807229620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:24:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 19:24:55,623][15401] Updated weights for policy 0, policy_version 49270 (0.0043) [2024-06-21 19:24:58,389][15132] Fps is (10 sec: 44253.7, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 807370752. Throughput: 0: 41437.0. Samples: 807475340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:24:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-21 19:24:59,248][15401] Updated weights for policy 0, policy_version 49280 (0.0026) [2024-06-21 19:25:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 807550976. Throughput: 0: 41624.0. Samples: 807731800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 19:25:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 19:25:03,441][15401] Updated weights for policy 0, policy_version 49290 (0.0046) [2024-06-21 19:25:07,065][15401] Updated weights for policy 0, policy_version 49300 (0.0034) [2024-06-21 19:25:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41777.5, 300 sec: 41876.0). Total num frames: 807780352. Throughput: 0: 41446.3. Samples: 807852080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:08,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 19:25:11,319][15401] Updated weights for policy 0, policy_version 49310 (0.0027) [2024-06-21 19:25:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 807976960. Throughput: 0: 41741.9. Samples: 808106160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 19:25:15,088][15401] Updated weights for policy 0, policy_version 49320 (0.0045) [2024-06-21 19:25:18,390][15132] Fps is (10 sec: 39330.3, 60 sec: 41232.9, 300 sec: 41821.0). Total num frames: 808173568. Throughput: 0: 41557.6. Samples: 808350780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:18,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-21 19:25:19,408][15401] Updated weights for policy 0, policy_version 49330 (0.0029) [2024-06-21 19:25:22,976][15401] Updated weights for policy 0, policy_version 49340 (0.0049) [2024-06-21 19:25:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41780.2, 300 sec: 41821.4). Total num frames: 808402944. Throughput: 0: 41534.9. Samples: 808473800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 19:25:27,271][15401] Updated weights for policy 0, policy_version 49350 (0.0040) [2024-06-21 19:25:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 808599552. Throughput: 0: 41716.1. Samples: 808728260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 19:25:31,033][15401] Updated weights for policy 0, policy_version 49360 (0.0041) [2024-06-21 19:25:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.9, 300 sec: 41931.9). Total num frames: 808812544. Throughput: 0: 41439.5. Samples: 808968720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 19:25:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 19:25:35,465][15401] Updated weights for policy 0, policy_version 49370 (0.0041) [2024-06-21 19:25:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 809009152. Throughput: 0: 41426.7. Samples: 809093820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:25:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 19:25:38,713][15401] Updated weights for policy 0, policy_version 49380 (0.0046) [2024-06-21 19:25:43,255][15401] Updated weights for policy 0, policy_version 49390 (0.0038) [2024-06-21 19:25:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 809205760. Throughput: 0: 41655.9. Samples: 809349860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:25:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 19:25:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049391_809222144.pth... [2024-06-21 19:25:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000048779_799195136.pth [2024-06-21 19:25:46,401][15401] Updated weights for policy 0, policy_version 49400 (0.0047) [2024-06-21 19:25:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 41780.1, 300 sec: 41876.0). Total num frames: 809435136. Throughput: 0: 41335.0. Samples: 809591980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:25:48,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 19:25:50,966][15401] Updated weights for policy 0, policy_version 49410 (0.0033) [2024-06-21 19:25:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 809648128. Throughput: 0: 41590.6. Samples: 809723560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:25:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 19:25:53,988][15401] Updated weights for policy 0, policy_version 49420 (0.0037) [2024-06-21 19:25:58,389][15132] Fps is (10 sec: 39331.7, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 809828352. Throughput: 0: 41545.8. Samples: 809975720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:25:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 19:25:58,736][15401] Updated weights for policy 0, policy_version 49430 (0.0050) [2024-06-21 19:26:01,843][15401] Updated weights for policy 0, policy_version 49440 (0.0037) [2024-06-21 19:26:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 810057728. Throughput: 0: 41649.0. Samples: 810224980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:26:03,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 19:26:06,514][15401] Updated weights for policy 0, policy_version 49450 (0.0044) [2024-06-21 19:26:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41507.9, 300 sec: 41765.7). Total num frames: 810270720. Throughput: 0: 41779.2. Samples: 810353860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-21 19:26:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:26:09,463][15401] Updated weights for policy 0, policy_version 49460 (0.0029) [2024-06-21 19:26:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 810467328. Throughput: 0: 41594.6. Samples: 810600020. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 19:26:14,246][15349] Signal inference workers to stop experience collection... (11700 times) [2024-06-21 19:26:14,246][15349] Signal inference workers to resume experience collection... (11700 times) [2024-06-21 19:26:14,293][15401] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-21 19:26:14,293][15401] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-21 19:26:14,403][15401] Updated weights for policy 0, policy_version 49470 (0.0044) [2024-06-21 19:26:17,703][15401] Updated weights for policy 0, policy_version 49480 (0.0037) [2024-06-21 19:26:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.5, 300 sec: 41765.3). Total num frames: 810696704. Throughput: 0: 41851.6. Samples: 810852040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:18,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 19:26:22,494][15401] Updated weights for policy 0, policy_version 49490 (0.0030) [2024-06-21 19:26:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 810893312. Throughput: 0: 41964.4. Samples: 810982220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 19:26:25,404][15401] Updated weights for policy 0, policy_version 49500 (0.0043) [2024-06-21 19:26:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 811089920. Throughput: 0: 41574.8. Samples: 811220720. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 19:26:30,217][15401] Updated weights for policy 0, policy_version 49510 (0.0040) [2024-06-21 19:26:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 811319296. Throughput: 0: 41793.9. Samples: 811472600. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 19:26:33,430][15401] Updated weights for policy 0, policy_version 49520 (0.0037) [2024-06-21 19:26:37,871][15401] Updated weights for policy 0, policy_version 49530 (0.0027) [2024-06-21 19:26:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 811515904. Throughput: 0: 41765.3. Samples: 811603000. Policy #0 lag: (min: 1.0, avg: 12.2, max: 25.0) [2024-06-21 19:26:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 19:26:41,147][15401] Updated weights for policy 0, policy_version 49540 (0.0043) [2024-06-21 19:26:43,390][15132] Fps is (10 sec: 39320.6, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 811712512. Throughput: 0: 41693.1. Samples: 811851920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:26:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 19:26:45,675][15401] Updated weights for policy 0, policy_version 49550 (0.0041) [2024-06-21 19:26:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42054.0, 300 sec: 41820.9). Total num frames: 811958272. Throughput: 0: 41733.8. Samples: 812103000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:26:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-21 19:26:48,935][15401] Updated weights for policy 0, policy_version 49560 (0.0031) [2024-06-21 19:26:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 812138496. Throughput: 0: 41754.6. Samples: 812232820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:26:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 19:26:53,484][15401] Updated weights for policy 0, policy_version 49570 (0.0030) [2024-06-21 19:26:56,803][15401] Updated weights for policy 0, policy_version 49580 (0.0041) [2024-06-21 19:26:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 41931.6). Total num frames: 812367872. Throughput: 0: 41646.7. Samples: 812474220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:26:58,392][15132] Avg episode reward: [(0, '0.296')] [2024-06-21 19:27:01,510][15401] Updated weights for policy 0, policy_version 49590 (0.0047) [2024-06-21 19:27:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 812564480. Throughput: 0: 41591.0. Samples: 812723640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:27:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 19:27:05,127][15401] Updated weights for policy 0, policy_version 49600 (0.0030) [2024-06-21 19:27:08,389][15132] Fps is (10 sec: 37692.3, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 812744704. Throughput: 0: 41506.3. Samples: 812850000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:27:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 19:27:09,270][15401] Updated weights for policy 0, policy_version 49610 (0.0040) [2024-06-21 19:27:12,946][15401] Updated weights for policy 0, policy_version 49620 (0.0045) [2024-06-21 19:27:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 813006848. Throughput: 0: 41838.6. Samples: 813103460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 19:27:16,991][15401] Updated weights for policy 0, policy_version 49630 (0.0039) [2024-06-21 19:27:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 813203456. Throughput: 0: 41897.3. Samples: 813357980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 19:27:20,453][15401] Updated weights for policy 0, policy_version 49640 (0.0036) [2024-06-21 19:27:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 813400064. Throughput: 0: 41768.6. Samples: 813482580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 19:27:24,887][15401] Updated weights for policy 0, policy_version 49650 (0.0032) [2024-06-21 19:27:28,037][15401] Updated weights for policy 0, policy_version 49660 (0.0037) [2024-06-21 19:27:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 813629440. Throughput: 0: 42032.2. Samples: 813743360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 19:27:28,597][15349] Signal inference workers to stop experience collection... (11750 times) [2024-06-21 19:27:28,652][15401] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-21 19:27:28,653][15349] Signal inference workers to resume experience collection... (11750 times) [2024-06-21 19:27:28,662][15401] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-21 19:27:32,705][15401] Updated weights for policy 0, policy_version 49670 (0.0033) [2024-06-21 19:27:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 813842432. Throughput: 0: 42165.3. Samples: 814000440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 19:27:35,799][15401] Updated weights for policy 0, policy_version 49680 (0.0030) [2024-06-21 19:27:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 814055424. Throughput: 0: 41928.0. Samples: 814119580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 19:27:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 19:27:40,663][15401] Updated weights for policy 0, policy_version 49690 (0.0041) [2024-06-21 19:27:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 41820.5). Total num frames: 814252032. Throughput: 0: 42128.0. Samples: 814369980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:27:43,393][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 19:27:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049698_814252032.pth... [2024-06-21 19:27:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049086_804225024.pth [2024-06-21 19:27:44,054][15401] Updated weights for policy 0, policy_version 49700 (0.0040) [2024-06-21 19:27:48,301][15401] Updated weights for policy 0, policy_version 49710 (0.0037) [2024-06-21 19:27:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 814448640. Throughput: 0: 42318.8. Samples: 814627980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:27:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 19:27:51,733][15401] Updated weights for policy 0, policy_version 49720 (0.0051) [2024-06-21 19:27:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 814678016. Throughput: 0: 42218.7. Samples: 814749840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:27:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 19:27:55,995][15401] Updated weights for policy 0, policy_version 49730 (0.0039) [2024-06-21 19:27:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42054.0, 300 sec: 41820.8). Total num frames: 814891008. Throughput: 0: 42204.0. Samples: 815002640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:27:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 19:27:59,500][15401] Updated weights for policy 0, policy_version 49740 (0.0032) [2024-06-21 19:28:03,396][15132] Fps is (10 sec: 39296.2, 60 sec: 41774.7, 300 sec: 41764.4). Total num frames: 815071232. Throughput: 0: 42160.6. Samples: 815255480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:28:03,397][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 19:28:03,870][15401] Updated weights for policy 0, policy_version 49750 (0.0038) [2024-06-21 19:28:07,142][15401] Updated weights for policy 0, policy_version 49760 (0.0029) [2024-06-21 19:28:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 41931.9). Total num frames: 815316992. Throughput: 0: 42020.9. Samples: 815373520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-21 19:28:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 19:28:11,559][15401] Updated weights for policy 0, policy_version 49770 (0.0032) [2024-06-21 19:28:13,392][15132] Fps is (10 sec: 45893.9, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 815529984. Throughput: 0: 42117.8. Samples: 815638760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:13,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-21 19:28:14,989][15401] Updated weights for policy 0, policy_version 49780 (0.0034) [2024-06-21 19:28:18,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42047.8, 300 sec: 41819.9). Total num frames: 815726592. Throughput: 0: 41785.6. Samples: 815881060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:18,396][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 19:28:19,812][15401] Updated weights for policy 0, policy_version 49790 (0.0042) [2024-06-21 19:28:23,258][15401] Updated weights for policy 0, policy_version 49800 (0.0034) [2024-06-21 19:28:23,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 815939584. Throughput: 0: 41827.5. Samples: 816001820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 19:28:27,411][15401] Updated weights for policy 0, policy_version 49810 (0.0032) [2024-06-21 19:28:28,389][15132] Fps is (10 sec: 40986.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 816136192. Throughput: 0: 42073.9. Samples: 816263200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 19:28:31,062][15401] Updated weights for policy 0, policy_version 49820 (0.0039) [2024-06-21 19:28:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 816349184. Throughput: 0: 41820.8. Samples: 816509920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 19:28:35,148][15401] Updated weights for policy 0, policy_version 49830 (0.0038) [2024-06-21 19:28:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 816545792. Throughput: 0: 41862.6. Samples: 816633660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 19:28:38,823][15401] Updated weights for policy 0, policy_version 49840 (0.0033) [2024-06-21 19:28:42,929][15401] Updated weights for policy 0, policy_version 49850 (0.0035) [2024-06-21 19:28:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41780.9, 300 sec: 41765.3). Total num frames: 816758784. Throughput: 0: 41924.4. Samples: 816889240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:28:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 19:28:46,965][15401] Updated weights for policy 0, policy_version 49860 (0.0032) [2024-06-21 19:28:47,720][15349] Signal inference workers to stop experience collection... (11800 times) [2024-06-21 19:28:47,720][15349] Signal inference workers to resume experience collection... (11800 times) [2024-06-21 19:28:47,756][15401] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-21 19:28:47,756][15401] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-21 19:28:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 816988160. Throughput: 0: 41727.7. Samples: 817132960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:28:48,396][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 19:28:50,695][15401] Updated weights for policy 0, policy_version 49870 (0.0036) [2024-06-21 19:28:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 817184768. Throughput: 0: 42013.2. Samples: 817264120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:28:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 19:28:54,761][15401] Updated weights for policy 0, policy_version 49880 (0.0037) [2024-06-21 19:28:58,338][15401] Updated weights for policy 0, policy_version 49890 (0.0034) [2024-06-21 19:28:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 41765.0). Total num frames: 817397760. Throughput: 0: 41704.8. Samples: 817515480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:28:58,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 19:29:02,402][15401] Updated weights for policy 0, policy_version 49900 (0.0025) [2024-06-21 19:29:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41783.7, 300 sec: 41709.8). Total num frames: 817577984. Throughput: 0: 41915.3. Samples: 817766980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:29:03,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 19:29:06,317][15401] Updated weights for policy 0, policy_version 49910 (0.0036) [2024-06-21 19:29:08,390][15132] Fps is (10 sec: 42608.8, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 817823744. Throughput: 0: 42076.0. Samples: 817895240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:29:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 19:29:10,075][15401] Updated weights for policy 0, policy_version 49920 (0.0035) [2024-06-21 19:29:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41507.8, 300 sec: 41765.3). Total num frames: 818020352. Throughput: 0: 41811.1. Samples: 818144700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-21 19:29:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 19:29:14,108][15401] Updated weights for policy 0, policy_version 49930 (0.0028) [2024-06-21 19:29:18,197][15401] Updated weights for policy 0, policy_version 49940 (0.0034) [2024-06-21 19:29:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41510.6, 300 sec: 41765.5). Total num frames: 818216960. Throughput: 0: 41983.7. Samples: 818399180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 19:29:22,018][15401] Updated weights for policy 0, policy_version 49950 (0.0032) [2024-06-21 19:29:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 818462720. Throughput: 0: 41854.6. Samples: 818517120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 19:29:25,904][15401] Updated weights for policy 0, policy_version 49960 (0.0043) [2024-06-21 19:29:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.6, 300 sec: 41820.9). Total num frames: 818642944. Throughput: 0: 41889.9. Samples: 818774380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:28,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 19:29:29,781][15401] Updated weights for policy 0, policy_version 49970 (0.0029) [2024-06-21 19:29:33,391][15132] Fps is (10 sec: 37678.0, 60 sec: 41505.1, 300 sec: 41765.1). Total num frames: 818839552. Throughput: 0: 42006.3. Samples: 819023300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:33,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-21 19:29:33,653][15401] Updated weights for policy 0, policy_version 49980 (0.0035) [2024-06-21 19:29:37,527][15401] Updated weights for policy 0, policy_version 49990 (0.0037) [2024-06-21 19:29:38,390][15132] Fps is (10 sec: 44246.5, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 819085312. Throughput: 0: 41863.5. Samples: 819147980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 19:29:41,407][15401] Updated weights for policy 0, policy_version 50000 (0.0042) [2024-06-21 19:29:43,389][15132] Fps is (10 sec: 40966.3, 60 sec: 41506.2, 300 sec: 41765.9). Total num frames: 819249152. Throughput: 0: 41890.8. Samples: 819400460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 19:29:43,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 19:29:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050004_819265536.pth... [2024-06-21 19:29:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049391_809222144.pth [2024-06-21 19:29:45,258][15401] Updated weights for policy 0, policy_version 50010 (0.0041) [2024-06-21 19:29:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 819478528. Throughput: 0: 41803.7. Samples: 819648140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:29:48,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:29:49,080][15401] Updated weights for policy 0, policy_version 50020 (0.0036) [2024-06-21 19:29:52,957][15401] Updated weights for policy 0, policy_version 50030 (0.0036) [2024-06-21 19:29:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 819707904. Throughput: 0: 41947.6. Samples: 819782880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:29:53,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:29:56,808][15401] Updated weights for policy 0, policy_version 50040 (0.0038) [2024-06-21 19:29:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41507.8, 300 sec: 41820.9). Total num frames: 819888128. Throughput: 0: 41915.1. Samples: 820030880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:29:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 19:30:00,633][15401] Updated weights for policy 0, policy_version 50050 (0.0040) [2024-06-21 19:30:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 41876.7). Total num frames: 820133888. Throughput: 0: 41654.1. Samples: 820273620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:30:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 19:30:04,601][15401] Updated weights for policy 0, policy_version 50060 (0.0038) [2024-06-21 19:30:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 820330496. Throughput: 0: 42066.4. Samples: 820410100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:30:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 19:30:08,414][15401] Updated weights for policy 0, policy_version 50070 (0.0041) [2024-06-21 19:30:11,416][15349] Signal inference workers to stop experience collection... (11850 times) [2024-06-21 19:30:11,452][15401] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-21 19:30:11,463][15349] Signal inference workers to resume experience collection... (11850 times) [2024-06-21 19:30:11,478][15401] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-21 19:30:12,204][15401] Updated weights for policy 0, policy_version 50080 (0.0037) [2024-06-21 19:30:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 820527104. Throughput: 0: 41843.4. Samples: 820657240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 19:30:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 19:30:16,203][15401] Updated weights for policy 0, policy_version 50090 (0.0031) [2024-06-21 19:30:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 820756480. Throughput: 0: 41791.6. Samples: 820903860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 19:30:20,231][15401] Updated weights for policy 0, policy_version 50100 (0.0024) [2024-06-21 19:30:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41820.8). Total num frames: 820936704. Throughput: 0: 41970.7. Samples: 821036660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 19:30:24,159][15401] Updated weights for policy 0, policy_version 50110 (0.0048) [2024-06-21 19:30:28,091][15401] Updated weights for policy 0, policy_version 50120 (0.0042) [2024-06-21 19:30:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 821166080. Throughput: 0: 41887.0. Samples: 821285380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 19:30:31,897][15401] Updated weights for policy 0, policy_version 50130 (0.0048) [2024-06-21 19:30:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42326.4, 300 sec: 41931.9). Total num frames: 821379072. Throughput: 0: 41867.1. Samples: 821532160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:33,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 19:30:36,042][15401] Updated weights for policy 0, policy_version 50140 (0.0032) [2024-06-21 19:30:38,390][15132] Fps is (10 sec: 37682.7, 60 sec: 40959.9, 300 sec: 41820.8). Total num frames: 821542912. Throughput: 0: 41688.3. Samples: 821658860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:38,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 19:30:39,660][15401] Updated weights for policy 0, policy_version 50150 (0.0032) [2024-06-21 19:30:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 41876.7). Total num frames: 821788672. Throughput: 0: 41836.4. Samples: 821913520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 19:30:43,731][15401] Updated weights for policy 0, policy_version 50160 (0.0039) [2024-06-21 19:30:47,380][15401] Updated weights for policy 0, policy_version 50170 (0.0029) [2024-06-21 19:30:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 822001664. Throughput: 0: 42040.0. Samples: 822165420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 19:30:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 19:30:51,525][15401] Updated weights for policy 0, policy_version 50180 (0.0030) [2024-06-21 19:30:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 822198272. Throughput: 0: 41871.5. Samples: 822294320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:30:53,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-21 19:30:55,151][15401] Updated weights for policy 0, policy_version 50190 (0.0046) [2024-06-21 19:30:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 822394880. Throughput: 0: 41947.1. Samples: 822544860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:30:58,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 19:30:59,243][15401] Updated weights for policy 0, policy_version 50200 (0.0044) [2024-06-21 19:31:02,846][15401] Updated weights for policy 0, policy_version 50210 (0.0045) [2024-06-21 19:31:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 822657024. Throughput: 0: 42045.1. Samples: 822795900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:31:03,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 19:31:06,957][15401] Updated weights for policy 0, policy_version 50220 (0.0025) [2024-06-21 19:31:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 822837248. Throughput: 0: 42105.0. Samples: 822931380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:31:08,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 19:31:10,691][15401] Updated weights for policy 0, policy_version 50230 (0.0036) [2024-06-21 19:31:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 823050240. Throughput: 0: 42005.3. Samples: 823175620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:31:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 19:31:15,131][15401] Updated weights for policy 0, policy_version 50240 (0.0030) [2024-06-21 19:31:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 823279616. Throughput: 0: 42132.9. Samples: 823428140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:31:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 19:31:18,419][15401] Updated weights for policy 0, policy_version 50250 (0.0030) [2024-06-21 19:31:21,659][15349] Signal inference workers to stop experience collection... (11900 times) [2024-06-21 19:31:21,693][15401] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-21 19:31:21,724][15349] Signal inference workers to resume experience collection... (11900 times) [2024-06-21 19:31:21,728][15401] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-21 19:31:22,654][15401] Updated weights for policy 0, policy_version 50260 (0.0025) [2024-06-21 19:31:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 823459840. Throughput: 0: 42272.9. Samples: 823561140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:23,390][15132] Avg episode reward: [(0, '0.887')] [2024-06-21 19:31:26,095][15401] Updated weights for policy 0, policy_version 50270 (0.0031) [2024-06-21 19:31:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 823689216. Throughput: 0: 42117.4. Samples: 823808800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:28,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:31:30,854][15401] Updated weights for policy 0, policy_version 50280 (0.0032) [2024-06-21 19:31:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 823918592. Throughput: 0: 42052.5. Samples: 824057780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:33,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:31:33,949][15401] Updated weights for policy 0, policy_version 50290 (0.0036) [2024-06-21 19:31:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 824098816. Throughput: 0: 42059.9. Samples: 824187020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:38,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 19:31:38,730][15401] Updated weights for policy 0, policy_version 50300 (0.0035) [2024-06-21 19:31:41,887][15401] Updated weights for policy 0, policy_version 50310 (0.0033) [2024-06-21 19:31:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 824311808. Throughput: 0: 41968.9. Samples: 824433460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 19:31:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050313_824328192.pth... [2024-06-21 19:31:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000049698_814252032.pth [2024-06-21 19:31:46,731][15401] Updated weights for policy 0, policy_version 50320 (0.0036) [2024-06-21 19:31:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 824541184. Throughput: 0: 42105.5. Samples: 824690740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:31:48,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 19:31:49,583][15401] Updated weights for policy 0, policy_version 50330 (0.0029) [2024-06-21 19:31:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 41932.3). Total num frames: 824737792. Throughput: 0: 41753.7. Samples: 824810300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:31:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 19:31:54,759][15401] Updated weights for policy 0, policy_version 50340 (0.0036) [2024-06-21 19:31:57,242][15401] Updated weights for policy 0, policy_version 50350 (0.0041) [2024-06-21 19:31:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 824950784. Throughput: 0: 41842.3. Samples: 825058520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:31:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 19:32:02,677][15401] Updated weights for policy 0, policy_version 50360 (0.0033) [2024-06-21 19:32:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 825147392. Throughput: 0: 42036.9. Samples: 825319800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:32:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 19:32:05,123][15401] Updated weights for policy 0, policy_version 50370 (0.0038) [2024-06-21 19:32:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 825376768. Throughput: 0: 41770.8. Samples: 825440820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:32:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 19:32:10,314][15401] Updated weights for policy 0, policy_version 50380 (0.0039) [2024-06-21 19:32:13,055][15401] Updated weights for policy 0, policy_version 50390 (0.0035) [2024-06-21 19:32:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 825589760. Throughput: 0: 41872.4. Samples: 825693060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:32:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 19:32:17,879][15401] Updated weights for policy 0, policy_version 50400 (0.0033) [2024-06-21 19:32:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 825769984. Throughput: 0: 41954.7. Samples: 825945740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:32:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 19:32:21,004][15401] Updated weights for policy 0, policy_version 50410 (0.0041) [2024-06-21 19:32:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 825982976. Throughput: 0: 41667.2. Samples: 826062040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 19:32:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 19:32:25,996][15401] Updated weights for policy 0, policy_version 50420 (0.0031) [2024-06-21 19:32:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 826212352. Throughput: 0: 41886.6. Samples: 826318360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 19:32:29,072][15401] Updated weights for policy 0, policy_version 50430 (0.0045) [2024-06-21 19:32:31,302][15349] Signal inference workers to stop experience collection... (11950 times) [2024-06-21 19:32:31,303][15349] Signal inference workers to resume experience collection... (11950 times) [2024-06-21 19:32:31,342][15401] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-21 19:32:31,342][15401] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-21 19:32:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41765.3). Total num frames: 826376192. Throughput: 0: 41720.6. Samples: 826568060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 19:32:33,672][15401] Updated weights for policy 0, policy_version 50440 (0.0035) [2024-06-21 19:32:36,811][15401] Updated weights for policy 0, policy_version 50450 (0.0031) [2024-06-21 19:32:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41932.3). Total num frames: 826621952. Throughput: 0: 41792.7. Samples: 826690980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 19:32:41,424][15401] Updated weights for policy 0, policy_version 50460 (0.0030) [2024-06-21 19:32:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 826834944. Throughput: 0: 42003.2. Samples: 826948660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 19:32:44,481][15401] Updated weights for policy 0, policy_version 50470 (0.0043) [2024-06-21 19:32:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41234.7, 300 sec: 41820.9). Total num frames: 827015168. Throughput: 0: 41733.7. Samples: 827197820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 19:32:49,513][15401] Updated weights for policy 0, policy_version 50480 (0.0038) [2024-06-21 19:32:52,162][15401] Updated weights for policy 0, policy_version 50490 (0.0035) [2024-06-21 19:32:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 827244544. Throughput: 0: 41728.4. Samples: 827318600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-21 19:32:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-21 19:32:57,159][15401] Updated weights for policy 0, policy_version 50500 (0.0033) [2024-06-21 19:32:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41877.3). Total num frames: 827424768. Throughput: 0: 41763.2. Samples: 827572400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:32:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 19:33:00,087][15401] Updated weights for policy 0, policy_version 50510 (0.0031) [2024-06-21 19:33:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 827621376. Throughput: 0: 41637.9. Samples: 827819440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:33:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 19:33:05,555][15401] Updated weights for policy 0, policy_version 50520 (0.0031) [2024-06-21 19:33:07,982][15401] Updated weights for policy 0, policy_version 50530 (0.0025) [2024-06-21 19:33:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 827899904. Throughput: 0: 41814.6. Samples: 827943700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:33:08,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-21 19:33:13,211][15401] Updated weights for policy 0, policy_version 50540 (0.0047) [2024-06-21 19:33:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 40959.9, 300 sec: 41766.2). Total num frames: 828047360. Throughput: 0: 41704.4. Samples: 828195060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:33:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 19:33:15,862][15401] Updated weights for policy 0, policy_version 50550 (0.0035) [2024-06-21 19:33:18,390][15132] Fps is (10 sec: 36044.5, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 828260352. Throughput: 0: 41719.7. Samples: 828445460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:33:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 19:33:20,967][15401] Updated weights for policy 0, policy_version 50560 (0.0043) [2024-06-21 19:33:23,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 828506112. Throughput: 0: 41857.1. Samples: 828574540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-21 19:33:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 19:33:23,604][15401] Updated weights for policy 0, policy_version 50570 (0.0027) [2024-06-21 19:33:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 828669952. Throughput: 0: 41611.8. Samples: 828821200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 19:33:28,702][15401] Updated weights for policy 0, policy_version 50580 (0.0049) [2024-06-21 19:33:31,546][15401] Updated weights for policy 0, policy_version 50590 (0.0030) [2024-06-21 19:33:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 828915712. Throughput: 0: 41513.8. Samples: 829065940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 19:33:36,457][15401] Updated weights for policy 0, policy_version 50600 (0.0038) [2024-06-21 19:33:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 829112320. Throughput: 0: 41820.9. Samples: 829200540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 19:33:39,239][15401] Updated weights for policy 0, policy_version 50610 (0.0044) [2024-06-21 19:33:43,389][15132] Fps is (10 sec: 37683.0, 60 sec: 40959.9, 300 sec: 41709.8). Total num frames: 829292544. Throughput: 0: 41765.8. Samples: 829451860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 19:33:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050617_829308928.pth... [2024-06-21 19:33:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050004_819265536.pth [2024-06-21 19:33:44,216][15349] Signal inference workers to stop experience collection... (12000 times) [2024-06-21 19:33:44,269][15401] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-21 19:33:44,269][15349] Signal inference workers to resume experience collection... (12000 times) [2024-06-21 19:33:44,272][15401] Updated weights for policy 0, policy_version 50620 (0.0033) [2024-06-21 19:33:44,282][15401] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-21 19:33:47,144][15401] Updated weights for policy 0, policy_version 50630 (0.0045) [2024-06-21 19:33:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 829538304. Throughput: 0: 41600.8. Samples: 829691480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 19:33:52,057][15401] Updated weights for policy 0, policy_version 50640 (0.0034) [2024-06-21 19:33:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.2, 300 sec: 41821.2). Total num frames: 829734912. Throughput: 0: 41878.8. Samples: 829828240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 19:33:55,152][15401] Updated weights for policy 0, policy_version 50650 (0.0034) [2024-06-21 19:33:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 829915136. Throughput: 0: 41669.9. Samples: 830070200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 19:33:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 19:34:00,211][15401] Updated weights for policy 0, policy_version 50660 (0.0028) [2024-06-21 19:34:02,962][15401] Updated weights for policy 0, policy_version 50670 (0.0039) [2024-06-21 19:34:03,390][15132] Fps is (10 sec: 45872.9, 60 sec: 42871.1, 300 sec: 41931.9). Total num frames: 830193664. Throughput: 0: 41535.8. Samples: 830314580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:03,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 19:34:07,867][15401] Updated weights for policy 0, policy_version 50680 (0.0033) [2024-06-21 19:34:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 830357504. Throughput: 0: 41796.3. Samples: 830455380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:08,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-21 19:34:10,998][15401] Updated weights for policy 0, policy_version 50690 (0.0033) [2024-06-21 19:34:13,390][15132] Fps is (10 sec: 37684.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 830570496. Throughput: 0: 41686.2. Samples: 830697080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 19:34:15,510][15401] Updated weights for policy 0, policy_version 50700 (0.0037) [2024-06-21 19:34:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 830783488. Throughput: 0: 41868.0. Samples: 830950000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:18,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 19:34:19,291][15401] Updated weights for policy 0, policy_version 50710 (0.0038) [2024-06-21 19:34:23,192][15401] Updated weights for policy 0, policy_version 50720 (0.0030) [2024-06-21 19:34:23,396][15132] Fps is (10 sec: 42571.4, 60 sec: 41501.6, 300 sec: 41875.8). Total num frames: 830996480. Throughput: 0: 41655.3. Samples: 831075300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:23,396][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 19:34:27,142][15401] Updated weights for policy 0, policy_version 50730 (0.0030) [2024-06-21 19:34:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41932.1). Total num frames: 831209472. Throughput: 0: 41647.1. Samples: 831325980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 19:34:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 19:34:31,190][15401] Updated weights for policy 0, policy_version 50740 (0.0026) [2024-06-21 19:34:33,389][15132] Fps is (10 sec: 40986.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 831406080. Throughput: 0: 41813.3. Samples: 831573080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:33,396][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 19:34:34,935][15401] Updated weights for policy 0, policy_version 50750 (0.0034) [2024-06-21 19:34:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 831619072. Throughput: 0: 41421.7. Samples: 831692220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 19:34:38,877][15401] Updated weights for policy 0, policy_version 50760 (0.0033) [2024-06-21 19:34:42,618][15401] Updated weights for policy 0, policy_version 50770 (0.0046) [2024-06-21 19:34:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.6, 300 sec: 41876.0). Total num frames: 831832064. Throughput: 0: 41676.4. Samples: 831945740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:43,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 19:34:46,827][15401] Updated weights for policy 0, policy_version 50780 (0.0034) [2024-06-21 19:34:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 832045056. Throughput: 0: 41903.5. Samples: 832200220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 19:34:50,328][15401] Updated weights for policy 0, policy_version 50790 (0.0029) [2024-06-21 19:34:53,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 832225280. Throughput: 0: 41608.0. Samples: 832327740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 19:34:54,900][15401] Updated weights for policy 0, policy_version 50800 (0.0025) [2024-06-21 19:34:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 832454656. Throughput: 0: 41733.3. Samples: 832575080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:34:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:34:58,692][15401] Updated weights for policy 0, policy_version 50810 (0.0037) [2024-06-21 19:35:02,622][15401] Updated weights for policy 0, policy_version 50820 (0.0033) [2024-06-21 19:35:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41233.3, 300 sec: 41820.8). Total num frames: 832667648. Throughput: 0: 41623.9. Samples: 832823080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 19:35:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 19:35:06,483][15401] Updated weights for policy 0, policy_version 50830 (0.0032) [2024-06-21 19:35:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 832880640. Throughput: 0: 41790.3. Samples: 832955600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:08,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-21 19:35:10,279][15401] Updated weights for policy 0, policy_version 50840 (0.0049) [2024-06-21 19:35:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 833077248. Throughput: 0: 41666.0. Samples: 833200960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:13,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-21 19:35:14,332][15401] Updated weights for policy 0, policy_version 50850 (0.0043) [2024-06-21 19:35:17,606][15349] Signal inference workers to stop experience collection... (12050 times) [2024-06-21 19:35:17,606][15349] Signal inference workers to resume experience collection... (12050 times) [2024-06-21 19:35:17,629][15401] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-21 19:35:17,629][15401] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-21 19:35:18,143][15401] Updated weights for policy 0, policy_version 50860 (0.0037) [2024-06-21 19:35:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 833306624. Throughput: 0: 41755.0. Samples: 833452060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:18,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 19:35:22,436][15401] Updated weights for policy 0, policy_version 50870 (0.0039) [2024-06-21 19:35:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41783.6, 300 sec: 41820.8). Total num frames: 833503232. Throughput: 0: 41923.0. Samples: 833578760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 19:35:26,189][15401] Updated weights for policy 0, policy_version 50880 (0.0040) [2024-06-21 19:35:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 833716224. Throughput: 0: 41777.8. Samples: 833825640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 19:35:30,514][15401] Updated weights for policy 0, policy_version 50890 (0.0032) [2024-06-21 19:35:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 833912832. Throughput: 0: 41604.4. Samples: 834072420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 19:35:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 19:35:33,876][15401] Updated weights for policy 0, policy_version 50900 (0.0039) [2024-06-21 19:35:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 834093056. Throughput: 0: 41483.6. Samples: 834194500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:35:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 19:35:38,442][15401] Updated weights for policy 0, policy_version 50910 (0.0029) [2024-06-21 19:35:41,959][15401] Updated weights for policy 0, policy_version 50920 (0.0042) [2024-06-21 19:35:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 834355200. Throughput: 0: 41677.9. Samples: 834450580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:35:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 19:35:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050925_834355200.pth... [2024-06-21 19:35:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050313_824328192.pth [2024-06-21 19:35:46,182][15401] Updated weights for policy 0, policy_version 50930 (0.0029) [2024-06-21 19:35:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 834519040. Throughput: 0: 41826.7. Samples: 834705280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:35:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 19:35:49,894][15401] Updated weights for policy 0, policy_version 50940 (0.0032) [2024-06-21 19:35:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 834732032. Throughput: 0: 41493.5. Samples: 834822800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:35:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:35:53,983][15401] Updated weights for policy 0, policy_version 50950 (0.0044) [2024-06-21 19:35:57,598][15401] Updated weights for policy 0, policy_version 50960 (0.0041) [2024-06-21 19:35:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 834961408. Throughput: 0: 41690.4. Samples: 835077020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:35:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 19:36:01,838][15401] Updated weights for policy 0, policy_version 50970 (0.0052) [2024-06-21 19:36:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 835158016. Throughput: 0: 41678.3. Samples: 835327580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 19:36:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 19:36:05,326][15401] Updated weights for policy 0, policy_version 50980 (0.0042) [2024-06-21 19:36:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 835354624. Throughput: 0: 41515.2. Samples: 835446940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 19:36:09,663][15401] Updated weights for policy 0, policy_version 50990 (0.0029) [2024-06-21 19:36:13,058][15401] Updated weights for policy 0, policy_version 51000 (0.0037) [2024-06-21 19:36:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 835600384. Throughput: 0: 41664.0. Samples: 835700520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 19:36:17,408][15401] Updated weights for policy 0, policy_version 51010 (0.0033) [2024-06-21 19:36:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 835780608. Throughput: 0: 41867.8. Samples: 835956480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-21 19:36:20,843][15401] Updated weights for policy 0, policy_version 51020 (0.0028) [2024-06-21 19:36:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 835993600. Throughput: 0: 41855.1. Samples: 836077980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:23,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-21 19:36:25,056][15401] Updated weights for policy 0, policy_version 51030 (0.0037) [2024-06-21 19:36:28,367][15401] Updated weights for policy 0, policy_version 51040 (0.0039) [2024-06-21 19:36:28,392][15132] Fps is (10 sec: 45865.0, 60 sec: 42050.6, 300 sec: 41765.0). Total num frames: 836239360. Throughput: 0: 41776.8. Samples: 836330640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:28,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 19:36:32,986][15401] Updated weights for policy 0, policy_version 51050 (0.0039) [2024-06-21 19:36:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 836419584. Throughput: 0: 41817.4. Samples: 836587060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 19:36:36,479][15401] Updated weights for policy 0, policy_version 51060 (0.0028) [2024-06-21 19:36:38,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 836632576. Throughput: 0: 42004.9. Samples: 836713020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:36:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 19:36:40,840][15401] Updated weights for policy 0, policy_version 51070 (0.0030) [2024-06-21 19:36:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41765.7). Total num frames: 836861952. Throughput: 0: 41938.7. Samples: 836964260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:36:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 19:36:43,925][15349] Signal inference workers to stop experience collection... (12100 times) [2024-06-21 19:36:43,925][15349] Signal inference workers to resume experience collection... (12100 times) [2024-06-21 19:36:43,956][15401] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-21 19:36:43,956][15401] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-21 19:36:44,068][15401] Updated weights for policy 0, policy_version 51080 (0.0031) [2024-06-21 19:36:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 837025792. Throughput: 0: 42053.0. Samples: 837219960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:36:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:36:48,621][15401] Updated weights for policy 0, policy_version 51090 (0.0026) [2024-06-21 19:36:51,892][15401] Updated weights for policy 0, policy_version 51100 (0.0044) [2024-06-21 19:36:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 837271552. Throughput: 0: 42068.0. Samples: 837340000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:36:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 19:36:56,273][15401] Updated weights for policy 0, policy_version 51110 (0.0039) [2024-06-21 19:36:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 837468160. Throughput: 0: 42069.8. Samples: 837593660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:36:58,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-21 19:36:59,841][15401] Updated weights for policy 0, policy_version 51120 (0.0041) [2024-06-21 19:37:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 837664768. Throughput: 0: 42036.7. Samples: 837848120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:37:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 19:37:04,229][15401] Updated weights for policy 0, policy_version 51130 (0.0048) [2024-06-21 19:37:07,979][15401] Updated weights for policy 0, policy_version 51140 (0.0037) [2024-06-21 19:37:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 837894144. Throughput: 0: 41996.9. Samples: 837967840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 19:37:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 19:37:11,803][15401] Updated weights for policy 0, policy_version 51150 (0.0040) [2024-06-21 19:37:13,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42047.7, 300 sec: 41875.5). Total num frames: 838123520. Throughput: 0: 41980.2. Samples: 838219920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:13,397][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 19:37:15,915][15401] Updated weights for policy 0, policy_version 51160 (0.0040) [2024-06-21 19:37:18,391][15132] Fps is (10 sec: 39314.2, 60 sec: 41778.0, 300 sec: 41709.5). Total num frames: 838287360. Throughput: 0: 41881.8. Samples: 838471820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:18,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 19:37:19,672][15401] Updated weights for policy 0, policy_version 51170 (0.0040) [2024-06-21 19:37:23,390][15132] Fps is (10 sec: 37707.1, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 838500352. Throughput: 0: 41785.1. Samples: 838593360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:23,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 19:37:23,733][15401] Updated weights for policy 0, policy_version 51180 (0.0034) [2024-06-21 19:37:27,391][15401] Updated weights for policy 0, policy_version 51190 (0.0043) [2024-06-21 19:37:28,389][15132] Fps is (10 sec: 42606.5, 60 sec: 41234.8, 300 sec: 41820.8). Total num frames: 838713344. Throughput: 0: 41805.8. Samples: 838845520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 19:37:31,595][15401] Updated weights for policy 0, policy_version 51200 (0.0035) [2024-06-21 19:37:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 838926336. Throughput: 0: 41758.7. Samples: 839099100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:37:35,061][15401] Updated weights for policy 0, policy_version 51210 (0.0035) [2024-06-21 19:37:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 839139328. Throughput: 0: 41875.0. Samples: 839224380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:37:39,325][15401] Updated weights for policy 0, policy_version 51220 (0.0036) [2024-06-21 19:37:43,039][15401] Updated weights for policy 0, policy_version 51230 (0.0033) [2024-06-21 19:37:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 839352320. Throughput: 0: 41858.1. Samples: 839477280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-21 19:37:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 19:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051230_839352320.pth... [2024-06-21 19:37:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050617_829308928.pth [2024-06-21 19:37:47,090][15401] Updated weights for policy 0, policy_version 51240 (0.0037) [2024-06-21 19:37:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 839565312. Throughput: 0: 41712.7. Samples: 839725200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:37:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 19:37:50,972][15401] Updated weights for policy 0, policy_version 51250 (0.0030) [2024-06-21 19:37:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 839778304. Throughput: 0: 41811.4. Samples: 839849360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:37:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 19:37:55,034][15401] Updated weights for policy 0, policy_version 51260 (0.0032) [2024-06-21 19:37:55,728][15349] Signal inference workers to stop experience collection... (12150 times) [2024-06-21 19:37:55,728][15349] Signal inference workers to resume experience collection... (12150 times) [2024-06-21 19:37:55,762][15401] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-21 19:37:55,762][15401] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-21 19:37:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 41876.0). Total num frames: 839974912. Throughput: 0: 41901.5. Samples: 840105320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:37:58,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 19:37:58,879][15401] Updated weights for policy 0, policy_version 51270 (0.0034) [2024-06-21 19:38:02,709][15401] Updated weights for policy 0, policy_version 51280 (0.0026) [2024-06-21 19:38:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 840204288. Throughput: 0: 41843.6. Samples: 840354700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:38:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 19:38:06,663][15401] Updated weights for policy 0, policy_version 51290 (0.0040) [2024-06-21 19:38:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 840400896. Throughput: 0: 42037.9. Samples: 840485060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:38:08,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 19:38:10,427][15401] Updated weights for policy 0, policy_version 51300 (0.0030) [2024-06-21 19:38:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41237.6, 300 sec: 41820.9). Total num frames: 840597504. Throughput: 0: 41812.5. Samples: 840727080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-21 19:38:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 19:38:14,619][15401] Updated weights for policy 0, policy_version 51310 (0.0036) [2024-06-21 19:38:18,178][15401] Updated weights for policy 0, policy_version 51320 (0.0031) [2024-06-21 19:38:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42326.7, 300 sec: 41765.3). Total num frames: 840826880. Throughput: 0: 41728.0. Samples: 840976860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 19:38:22,598][15401] Updated weights for policy 0, policy_version 51330 (0.0030) [2024-06-21 19:38:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 841039872. Throughput: 0: 41757.3. Samples: 841103460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 19:38:26,167][15401] Updated weights for policy 0, policy_version 51340 (0.0030) [2024-06-21 19:38:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 841236480. Throughput: 0: 41626.3. Samples: 841350460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:28,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 19:38:30,407][15401] Updated weights for policy 0, policy_version 51350 (0.0048) [2024-06-21 19:38:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 841449472. Throughput: 0: 41800.0. Samples: 841606200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:33,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 19:38:33,927][15401] Updated weights for policy 0, policy_version 51360 (0.0029) [2024-06-21 19:38:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 841629696. Throughput: 0: 41685.9. Samples: 841725220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:38,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-21 19:38:38,455][15401] Updated weights for policy 0, policy_version 51370 (0.0047) [2024-06-21 19:38:41,827][15401] Updated weights for policy 0, policy_version 51380 (0.0043) [2024-06-21 19:38:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 841842688. Throughput: 0: 41417.3. Samples: 841969000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 19:38:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-21 19:38:46,270][15401] Updated weights for policy 0, policy_version 51390 (0.0041) [2024-06-21 19:38:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 842055680. Throughput: 0: 41654.6. Samples: 842229160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:38:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 19:38:49,688][15401] Updated weights for policy 0, policy_version 51400 (0.0027) [2024-06-21 19:38:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 842268672. Throughput: 0: 41484.4. Samples: 842351860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:38:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 19:38:53,870][15401] Updated weights for policy 0, policy_version 51410 (0.0045) [2024-06-21 19:38:57,479][15401] Updated weights for policy 0, policy_version 51420 (0.0038) [2024-06-21 19:38:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41780.9, 300 sec: 41654.3). Total num frames: 842481664. Throughput: 0: 41713.3. Samples: 842604180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:38:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 19:39:01,454][15401] Updated weights for policy 0, policy_version 51430 (0.0030) [2024-06-21 19:39:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 842678272. Throughput: 0: 41800.4. Samples: 842857880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:39:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 19:39:05,233][15401] Updated weights for policy 0, policy_version 51440 (0.0049) [2024-06-21 19:39:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 842891264. Throughput: 0: 41699.6. Samples: 842979940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:39:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 19:39:09,601][15401] Updated weights for policy 0, policy_version 51450 (0.0036) [2024-06-21 19:39:12,470][15349] Signal inference workers to stop experience collection... (12200 times) [2024-06-21 19:39:12,500][15401] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-21 19:39:12,523][15349] Signal inference workers to resume experience collection... (12200 times) [2024-06-21 19:39:12,528][15401] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-21 19:39:13,050][15401] Updated weights for policy 0, policy_version 51460 (0.0036) [2024-06-21 19:39:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 843120640. Throughput: 0: 41899.9. Samples: 843235960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:39:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 19:39:17,518][15401] Updated weights for policy 0, policy_version 51470 (0.0034) [2024-06-21 19:39:18,396][15132] Fps is (10 sec: 40934.3, 60 sec: 41228.6, 300 sec: 41709.8). Total num frames: 843300864. Throughput: 0: 41786.1. Samples: 843486840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-21 19:39:18,396][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 19:39:20,661][15401] Updated weights for policy 0, policy_version 51480 (0.0031) [2024-06-21 19:39:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 843530240. Throughput: 0: 41890.6. Samples: 843610300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 19:39:25,335][15401] Updated weights for policy 0, policy_version 51490 (0.0039) [2024-06-21 19:39:28,390][15132] Fps is (10 sec: 42625.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 843726848. Throughput: 0: 41988.0. Samples: 843858460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 19:39:29,016][15401] Updated weights for policy 0, policy_version 51500 (0.0046) [2024-06-21 19:39:32,970][15401] Updated weights for policy 0, policy_version 51510 (0.0035) [2024-06-21 19:39:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 843939840. Throughput: 0: 41770.1. Samples: 844108820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:39:36,684][15401] Updated weights for policy 0, policy_version 51520 (0.0036) [2024-06-21 19:39:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 41876.7). Total num frames: 844185600. Throughput: 0: 41960.0. Samples: 844240060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 19:39:40,702][15401] Updated weights for policy 0, policy_version 51530 (0.0024) [2024-06-21 19:39:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 844349440. Throughput: 0: 41944.5. Samples: 844491680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 19:39:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051536_844365824.pth... [2024-06-21 19:39:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000050925_834355200.pth [2024-06-21 19:39:44,751][15401] Updated weights for policy 0, policy_version 51540 (0.0036) [2024-06-21 19:39:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 844578816. Throughput: 0: 41727.0. Samples: 844735600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-21 19:39:48,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 19:39:48,991][15401] Updated weights for policy 0, policy_version 51550 (0.0027) [2024-06-21 19:39:52,779][15401] Updated weights for policy 0, policy_version 51560 (0.0030) [2024-06-21 19:39:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 844808192. Throughput: 0: 41923.1. Samples: 844866480. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:39:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 19:39:56,591][15401] Updated weights for policy 0, policy_version 51570 (0.0036) [2024-06-21 19:39:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 844972032. Throughput: 0: 41858.3. Samples: 845119580. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:39:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 19:40:00,470][15401] Updated weights for policy 0, policy_version 51580 (0.0039) [2024-06-21 19:40:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 845217792. Throughput: 0: 41915.8. Samples: 845372780. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:40:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 19:40:04,245][15401] Updated weights for policy 0, policy_version 51590 (0.0042) [2024-06-21 19:40:08,145][15401] Updated weights for policy 0, policy_version 51600 (0.0047) [2024-06-21 19:40:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 845414400. Throughput: 0: 41954.4. Samples: 845498240. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:40:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 19:40:08,968][15349] Signal inference workers to stop experience collection... (12250 times) [2024-06-21 19:40:08,968][15349] Signal inference workers to resume experience collection... (12250 times) [2024-06-21 19:40:08,985][15401] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-21 19:40:08,986][15401] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-21 19:40:11,803][15401] Updated weights for policy 0, policy_version 51610 (0.0022) [2024-06-21 19:40:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.2, 300 sec: 41654.3). Total num frames: 845594624. Throughput: 0: 42022.3. Samples: 845749460. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:40:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 19:40:16,139][15401] Updated weights for policy 0, policy_version 51620 (0.0039) [2024-06-21 19:40:18,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42329.9, 300 sec: 41820.9). Total num frames: 845840384. Throughput: 0: 41890.3. Samples: 845993880. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:40:18,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-21 19:40:19,984][15401] Updated weights for policy 0, policy_version 51630 (0.0038) [2024-06-21 19:40:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 846036992. Throughput: 0: 41928.0. Samples: 846126820. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-06-21 19:40:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 19:40:24,013][15401] Updated weights for policy 0, policy_version 51640 (0.0050) [2024-06-21 19:40:27,597][15401] Updated weights for policy 0, policy_version 51650 (0.0034) [2024-06-21 19:40:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 846233600. Throughput: 0: 41723.4. Samples: 846369240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:28,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-21 19:40:31,842][15401] Updated weights for policy 0, policy_version 51660 (0.0027) [2024-06-21 19:40:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 41987.5). Total num frames: 846479360. Throughput: 0: 41882.4. Samples: 846620300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 19:40:35,052][15401] Updated weights for policy 0, policy_version 51670 (0.0032) [2024-06-21 19:40:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 846659584. Throughput: 0: 41979.2. Samples: 846755540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 19:40:39,542][15401] Updated weights for policy 0, policy_version 51680 (0.0032) [2024-06-21 19:40:42,700][15401] Updated weights for policy 0, policy_version 51690 (0.0037) [2024-06-21 19:40:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 846888960. Throughput: 0: 41887.2. Samples: 847004500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:40:47,507][15401] Updated weights for policy 0, policy_version 51700 (0.0033) [2024-06-21 19:40:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 847085568. Throughput: 0: 41890.7. Samples: 847257860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 19:40:50,385][15401] Updated weights for policy 0, policy_version 51710 (0.0039) [2024-06-21 19:40:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 847282176. Throughput: 0: 41801.2. Samples: 847379300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-21 19:40:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 19:40:55,169][15401] Updated weights for policy 0, policy_version 51720 (0.0030) [2024-06-21 19:40:58,167][15401] Updated weights for policy 0, policy_version 51730 (0.0038) [2024-06-21 19:40:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 41987.5). Total num frames: 847544320. Throughput: 0: 41813.7. Samples: 847631080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:40:58,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-21 19:41:02,950][15401] Updated weights for policy 0, policy_version 51740 (0.0047) [2024-06-21 19:41:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 847708160. Throughput: 0: 41953.3. Samples: 847881780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:03,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-21 19:41:06,079][15401] Updated weights for policy 0, policy_version 51750 (0.0044) [2024-06-21 19:41:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 847921152. Throughput: 0: 41621.8. Samples: 847999800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 19:41:10,775][15401] Updated weights for policy 0, policy_version 51760 (0.0047) [2024-06-21 19:41:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 848134144. Throughput: 0: 41987.2. Samples: 848258660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-21 19:41:14,449][15401] Updated weights for policy 0, policy_version 51770 (0.0027) [2024-06-21 19:41:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 848347136. Throughput: 0: 41963.0. Samples: 848508640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 19:41:18,458][15401] Updated weights for policy 0, policy_version 51780 (0.0044) [2024-06-21 19:41:19,761][15349] Signal inference workers to stop experience collection... (12300 times) [2024-06-21 19:41:19,761][15349] Signal inference workers to resume experience collection... (12300 times) [2024-06-21 19:41:19,804][15401] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-21 19:41:19,804][15401] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-21 19:41:22,206][15401] Updated weights for policy 0, policy_version 51790 (0.0033) [2024-06-21 19:41:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 41765.7). Total num frames: 848560128. Throughput: 0: 41710.4. Samples: 848632500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:23,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 19:41:26,614][15401] Updated weights for policy 0, policy_version 51800 (0.0039) [2024-06-21 19:41:28,391][15132] Fps is (10 sec: 42592.3, 60 sec: 42324.3, 300 sec: 41876.2). Total num frames: 848773120. Throughput: 0: 41903.0. Samples: 848890200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-21 19:41:28,391][15132] Avg episode reward: [(0, '0.155')] [2024-06-21 19:41:30,325][15401] Updated weights for policy 0, policy_version 51810 (0.0040) [2024-06-21 19:41:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 848986112. Throughput: 0: 41739.5. Samples: 849136140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:33,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 19:41:34,608][15401] Updated weights for policy 0, policy_version 51820 (0.0030) [2024-06-21 19:41:38,194][15401] Updated weights for policy 0, policy_version 51830 (0.0035) [2024-06-21 19:41:38,389][15132] Fps is (10 sec: 40966.2, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 849182720. Throughput: 0: 41907.6. Samples: 849265140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:38,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 19:41:42,657][15401] Updated weights for policy 0, policy_version 51840 (0.0036) [2024-06-21 19:41:43,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 849362944. Throughput: 0: 41860.4. Samples: 849514800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:43,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-21 19:41:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051842_849379328.pth... [2024-06-21 19:41:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051230_839352320.pth [2024-06-21 19:41:45,952][15401] Updated weights for policy 0, policy_version 51850 (0.0038) [2024-06-21 19:41:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 849608704. Throughput: 0: 41754.2. Samples: 849760720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 19:41:50,854][15401] Updated weights for policy 0, policy_version 51860 (0.0035) [2024-06-21 19:41:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 849821696. Throughput: 0: 42030.7. Samples: 849891180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 19:41:53,646][15401] Updated weights for policy 0, policy_version 51870 (0.0035) [2024-06-21 19:41:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 40686.9, 300 sec: 41765.3). Total num frames: 849985536. Throughput: 0: 41604.0. Samples: 850130840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-21 19:41:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 19:41:58,879][15401] Updated weights for policy 0, policy_version 51880 (0.0043) [2024-06-21 19:42:01,969][15401] Updated weights for policy 0, policy_version 51890 (0.0046) [2024-06-21 19:42:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 850231296. Throughput: 0: 41704.1. Samples: 850385320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 19:42:06,338][15401] Updated weights for policy 0, policy_version 51900 (0.0029) [2024-06-21 19:42:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41766.2). Total num frames: 850444288. Throughput: 0: 41989.6. Samples: 850522040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 19:42:09,598][15401] Updated weights for policy 0, policy_version 51910 (0.0037) [2024-06-21 19:42:13,389][15132] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41821.1). Total num frames: 850624512. Throughput: 0: 41697.8. Samples: 850766540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 19:42:13,990][15401] Updated weights for policy 0, policy_version 51920 (0.0042) [2024-06-21 19:42:17,337][15401] Updated weights for policy 0, policy_version 51930 (0.0042) [2024-06-21 19:42:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 850870272. Throughput: 0: 41670.2. Samples: 851011300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 19:42:21,817][15401] Updated weights for policy 0, policy_version 51940 (0.0043) [2024-06-21 19:42:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 851050496. Throughput: 0: 41903.9. Samples: 851150820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 19:42:24,988][15401] Updated weights for policy 0, policy_version 51950 (0.0031) [2024-06-21 19:42:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41507.2, 300 sec: 41820.9). Total num frames: 851263488. Throughput: 0: 41686.2. Samples: 851390680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 19:42:29,970][15401] Updated weights for policy 0, policy_version 51960 (0.0036) [2024-06-21 19:42:32,692][15401] Updated weights for policy 0, policy_version 51970 (0.0024) [2024-06-21 19:42:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 851509248. Throughput: 0: 41855.5. Samples: 851644220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 19:42:33,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-21 19:42:37,473][15401] Updated weights for policy 0, policy_version 51980 (0.0044) [2024-06-21 19:42:38,390][15132] Fps is (10 sec: 37682.8, 60 sec: 40959.9, 300 sec: 41654.2). Total num frames: 851640320. Throughput: 0: 41792.8. Samples: 851771860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:42:38,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 19:42:39,081][15349] Signal inference workers to stop experience collection... (12350 times) [2024-06-21 19:42:39,132][15401] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-21 19:42:39,136][15349] Signal inference workers to resume experience collection... (12350 times) [2024-06-21 19:42:39,142][15401] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-21 19:42:40,345][15401] Updated weights for policy 0, policy_version 51990 (0.0045) [2024-06-21 19:42:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 851918848. Throughput: 0: 41988.4. Samples: 852020320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:42:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 19:42:45,961][15401] Updated weights for policy 0, policy_version 52000 (0.0036) [2024-06-21 19:42:48,361][15401] Updated weights for policy 0, policy_version 52010 (0.0044) [2024-06-21 19:42:48,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 852131840. Throughput: 0: 41988.2. Samples: 852274800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:42:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 19:42:53,389][15132] Fps is (10 sec: 36045.0, 60 sec: 40960.0, 300 sec: 41710.1). Total num frames: 852279296. Throughput: 0: 41782.7. Samples: 852402260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:42:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 19:42:53,801][15401] Updated weights for policy 0, policy_version 52020 (0.0041) [2024-06-21 19:42:56,000][15401] Updated weights for policy 0, policy_version 52030 (0.0056) [2024-06-21 19:42:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 41876.4). Total num frames: 852557824. Throughput: 0: 41878.2. Samples: 852651060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:42:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 19:43:01,555][15401] Updated weights for policy 0, policy_version 52040 (0.0037) [2024-06-21 19:43:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 852721664. Throughput: 0: 42250.3. Samples: 852912560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-21 19:43:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 19:43:04,006][15401] Updated weights for policy 0, policy_version 52050 (0.0044) [2024-06-21 19:43:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 852934656. Throughput: 0: 41655.2. Samples: 853025300. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 19:43:09,234][15401] Updated weights for policy 0, policy_version 52060 (0.0064) [2024-06-21 19:43:12,024][15401] Updated weights for policy 0, policy_version 52070 (0.0030) [2024-06-21 19:43:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 41931.9). Total num frames: 853196800. Throughput: 0: 41930.1. Samples: 853277540. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 19:43:16,904][15401] Updated weights for policy 0, policy_version 52080 (0.0033) [2024-06-21 19:43:18,394][15132] Fps is (10 sec: 39301.9, 60 sec: 40956.6, 300 sec: 41653.6). Total num frames: 853327872. Throughput: 0: 42015.4. Samples: 853535120. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:18,395][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 19:43:19,985][15401] Updated weights for policy 0, policy_version 52090 (0.0041) [2024-06-21 19:43:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 853590016. Throughput: 0: 41637.3. Samples: 853645540. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 19:43:24,601][15401] Updated weights for policy 0, policy_version 52100 (0.0043) [2024-06-21 19:43:27,951][15401] Updated weights for policy 0, policy_version 52110 (0.0029) [2024-06-21 19:43:28,390][15132] Fps is (10 sec: 47536.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 853803008. Throughput: 0: 41904.8. Samples: 853906040. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 19:43:30,148][15349] Signal inference workers to stop experience collection... (12400 times) [2024-06-21 19:43:30,148][15349] Signal inference workers to resume experience collection... (12400 times) [2024-06-21 19:43:30,164][15401] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-21 19:43:30,165][15401] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-21 19:43:32,194][15401] Updated weights for policy 0, policy_version 52120 (0.0036) [2024-06-21 19:43:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 853966848. Throughput: 0: 41914.4. Samples: 854160940. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 19:43:35,908][15401] Updated weights for policy 0, policy_version 52130 (0.0038) [2024-06-21 19:43:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 41987.5). Total num frames: 854228992. Throughput: 0: 41660.4. Samples: 854276980. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-06-21 19:43:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-21 19:43:39,887][15401] Updated weights for policy 0, policy_version 52140 (0.0040) [2024-06-21 19:43:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41765.3). Total num frames: 854376448. Throughput: 0: 41969.4. Samples: 854539680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:43:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 19:43:43,447][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052148_854392832.pth... [2024-06-21 19:43:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051536_844365824.pth [2024-06-21 19:43:43,807][15401] Updated weights for policy 0, policy_version 52150 (0.0032) [2024-06-21 19:43:47,557][15401] Updated weights for policy 0, policy_version 52160 (0.0039) [2024-06-21 19:43:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41233.1, 300 sec: 41820.8). Total num frames: 854605824. Throughput: 0: 41622.5. Samples: 854785580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:43:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 19:43:51,627][15401] Updated weights for policy 0, policy_version 52170 (0.0030) [2024-06-21 19:43:53,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 41931.9). Total num frames: 854851584. Throughput: 0: 41924.3. Samples: 854911900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:43:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 19:43:55,547][15401] Updated weights for policy 0, policy_version 52180 (0.0035) [2024-06-21 19:43:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41820.8). Total num frames: 855015424. Throughput: 0: 41852.0. Samples: 855160880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:43:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 19:43:59,629][15401] Updated weights for policy 0, policy_version 52190 (0.0032) [2024-06-21 19:44:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 855228416. Throughput: 0: 41557.9. Samples: 855405020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:44:03,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 19:44:03,523][15401] Updated weights for policy 0, policy_version 52200 (0.0044) [2024-06-21 19:44:07,490][15401] Updated weights for policy 0, policy_version 52210 (0.0048) [2024-06-21 19:44:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 855457792. Throughput: 0: 41911.6. Samples: 855531560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 19:44:08,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 19:44:11,659][15401] Updated weights for policy 0, policy_version 52220 (0.0038) [2024-06-21 19:44:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 41821.7). Total num frames: 855638016. Throughput: 0: 41575.6. Samples: 855776940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:13,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 19:44:15,134][15401] Updated weights for policy 0, policy_version 52230 (0.0058) [2024-06-21 19:44:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42328.8, 300 sec: 41820.9). Total num frames: 855867392. Throughput: 0: 41401.7. Samples: 856024020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 19:44:19,389][15401] Updated weights for policy 0, policy_version 52240 (0.0045) [2024-06-21 19:44:22,811][15401] Updated weights for policy 0, policy_version 52250 (0.0041) [2024-06-21 19:44:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 856064000. Throughput: 0: 41835.1. Samples: 856159560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:23,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-21 19:44:27,356][15401] Updated weights for policy 0, policy_version 52260 (0.0033) [2024-06-21 19:44:28,392][15132] Fps is (10 sec: 39311.0, 60 sec: 40958.3, 300 sec: 41764.9). Total num frames: 856260608. Throughput: 0: 41557.5. Samples: 856409880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:28,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 19:44:30,626][15401] Updated weights for policy 0, policy_version 52270 (0.0045) [2024-06-21 19:44:32,032][15349] Signal inference workers to stop experience collection... (12450 times) [2024-06-21 19:44:32,033][15349] Signal inference workers to resume experience collection... (12450 times) [2024-06-21 19:44:32,048][15401] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-21 19:44:32,048][15401] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-21 19:44:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 41820.8). Total num frames: 856522752. Throughput: 0: 41418.2. Samples: 856649400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:33,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 19:44:34,914][15401] Updated weights for policy 0, policy_version 52280 (0.0042) [2024-06-21 19:44:38,281][15401] Updated weights for policy 0, policy_version 52290 (0.0041) [2024-06-21 19:44:38,389][15132] Fps is (10 sec: 45888.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 856719360. Throughput: 0: 41668.2. Samples: 856786960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-21 19:44:43,081][15401] Updated weights for policy 0, policy_version 52300 (0.0029) [2024-06-21 19:44:43,390][15132] Fps is (10 sec: 36045.0, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 856883200. Throughput: 0: 41576.0. Samples: 857031800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 19:44:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 19:44:45,988][15401] Updated weights for policy 0, policy_version 52310 (0.0023) [2024-06-21 19:44:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 857128960. Throughput: 0: 41612.0. Samples: 857277560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:44:48,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 19:44:51,036][15401] Updated weights for policy 0, policy_version 52320 (0.0049) [2024-06-21 19:44:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 857325568. Throughput: 0: 41683.1. Samples: 857407300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:44:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 19:44:53,991][15401] Updated weights for policy 0, policy_version 52330 (0.0034) [2024-06-21 19:44:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 857505792. Throughput: 0: 41824.4. Samples: 857659040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:44:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 19:44:58,773][15401] Updated weights for policy 0, policy_version 52340 (0.0048) [2024-06-21 19:45:01,960][15401] Updated weights for policy 0, policy_version 52350 (0.0040) [2024-06-21 19:45:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 857751552. Throughput: 0: 41813.8. Samples: 857905640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:45:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 19:45:06,948][15401] Updated weights for policy 0, policy_version 52360 (0.0042) [2024-06-21 19:45:08,389][15132] Fps is (10 sec: 45876.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 857964544. Throughput: 0: 41756.5. Samples: 858038600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:45:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 19:45:09,970][15401] Updated weights for policy 0, policy_version 52370 (0.0028) [2024-06-21 19:45:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 858144768. Throughput: 0: 41736.7. Samples: 858287920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 19:45:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 19:45:14,686][15401] Updated weights for policy 0, policy_version 52380 (0.0032) [2024-06-21 19:45:17,694][15401] Updated weights for policy 0, policy_version 52390 (0.0035) [2024-06-21 19:45:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 858390528. Throughput: 0: 41816.6. Samples: 858531140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 19:45:22,389][15401] Updated weights for policy 0, policy_version 52400 (0.0040) [2024-06-21 19:45:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 858603520. Throughput: 0: 41727.5. Samples: 858664700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 19:45:25,324][15401] Updated weights for policy 0, policy_version 52410 (0.0037) [2024-06-21 19:45:28,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41781.1, 300 sec: 41654.2). Total num frames: 858767360. Throughput: 0: 41792.5. Samples: 858912460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 19:45:30,121][15401] Updated weights for policy 0, policy_version 52420 (0.0028) [2024-06-21 19:45:30,303][15349] Signal inference workers to stop experience collection... (12500 times) [2024-06-21 19:45:30,308][15349] Signal inference workers to resume experience collection... (12500 times) [2024-06-21 19:45:30,343][15401] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-21 19:45:30,343][15401] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-21 19:45:32,940][15401] Updated weights for policy 0, policy_version 52430 (0.0039) [2024-06-21 19:45:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 859029504. Throughput: 0: 41821.2. Samples: 859159520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 19:45:37,735][15401] Updated weights for policy 0, policy_version 52440 (0.0029) [2024-06-21 19:45:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 859226112. Throughput: 0: 42042.2. Samples: 859299200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 19:45:40,550][15401] Updated weights for policy 0, policy_version 52450 (0.0040) [2024-06-21 19:45:43,389][15132] Fps is (10 sec: 36045.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 859389952. Throughput: 0: 41901.9. Samples: 859544620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:45:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052454_859406336.pth... [2024-06-21 19:45:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000051842_849379328.pth [2024-06-21 19:45:45,587][15401] Updated weights for policy 0, policy_version 52460 (0.0041) [2024-06-21 19:45:48,227][15401] Updated weights for policy 0, policy_version 52470 (0.0045) [2024-06-21 19:45:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 859668480. Throughput: 0: 41811.6. Samples: 859787160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-21 19:45:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 19:45:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 859832320. Throughput: 0: 41864.3. Samples: 859922500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:45:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-21 19:45:53,392][15401] Updated weights for policy 0, policy_version 52480 (0.0049) [2024-06-21 19:45:56,591][15401] Updated weights for policy 0, policy_version 52490 (0.0041) [2024-06-21 19:45:58,390][15132] Fps is (10 sec: 36044.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 860028928. Throughput: 0: 41652.9. Samples: 860162300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:45:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 19:46:01,294][15401] Updated weights for policy 0, policy_version 52500 (0.0036) [2024-06-21 19:46:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 860291072. Throughput: 0: 41731.0. Samples: 860409040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:46:03,395][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 19:46:04,523][15401] Updated weights for policy 0, policy_version 52510 (0.0034) [2024-06-21 19:46:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 860438528. Throughput: 0: 41783.5. Samples: 860544960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:46:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 19:46:08,978][15401] Updated weights for policy 0, policy_version 52520 (0.0026) [2024-06-21 19:46:12,337][15401] Updated weights for policy 0, policy_version 52530 (0.0042) [2024-06-21 19:46:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 860684288. Throughput: 0: 41771.6. Samples: 860792180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:46:13,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-21 19:46:16,841][15401] Updated weights for policy 0, policy_version 52540 (0.0034) [2024-06-21 19:46:18,390][15132] Fps is (10 sec: 47511.4, 60 sec: 42051.9, 300 sec: 41876.3). Total num frames: 860913664. Throughput: 0: 41896.2. Samples: 861044860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:46:18,391][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 19:46:20,069][15401] Updated weights for policy 0, policy_version 52550 (0.0040) [2024-06-21 19:46:23,390][15132] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 41654.4). Total num frames: 861061120. Throughput: 0: 41558.6. Samples: 861169340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-21 19:46:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 19:46:24,669][15401] Updated weights for policy 0, policy_version 52560 (0.0030) [2024-06-21 19:46:27,817][15401] Updated weights for policy 0, policy_version 52570 (0.0032) [2024-06-21 19:46:28,389][15132] Fps is (10 sec: 40962.3, 60 sec: 42598.5, 300 sec: 41820.9). Total num frames: 861323264. Throughput: 0: 41631.2. Samples: 861418020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 19:46:32,355][15401] Updated weights for policy 0, policy_version 52580 (0.0034) [2024-06-21 19:46:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 861519872. Throughput: 0: 42059.1. Samples: 861679820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 19:46:35,497][15401] Updated weights for policy 0, policy_version 52590 (0.0041) [2024-06-21 19:46:38,390][15132] Fps is (10 sec: 36044.2, 60 sec: 40959.9, 300 sec: 41765.3). Total num frames: 861683712. Throughput: 0: 41748.9. Samples: 861801200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 19:46:40,114][15401] Updated weights for policy 0, policy_version 52600 (0.0035) [2024-06-21 19:46:40,649][15349] Signal inference workers to stop experience collection... (12550 times) [2024-06-21 19:46:40,649][15349] Signal inference workers to resume experience collection... (12550 times) [2024-06-21 19:46:40,699][15401] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-21 19:46:40,699][15401] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-21 19:46:43,183][15401] Updated weights for policy 0, policy_version 52610 (0.0027) [2024-06-21 19:46:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 41876.4). Total num frames: 861962240. Throughput: 0: 42060.5. Samples: 862055020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:43,393][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 19:46:47,892][15401] Updated weights for policy 0, policy_version 52620 (0.0047) [2024-06-21 19:46:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 862142464. Throughput: 0: 42309.5. Samples: 862312960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:48,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 19:46:51,089][15401] Updated weights for policy 0, policy_version 52630 (0.0035) [2024-06-21 19:46:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 862339072. Throughput: 0: 41839.2. Samples: 862427720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 19:46:53,396][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 19:46:55,976][15401] Updated weights for policy 0, policy_version 52640 (0.0035) [2024-06-21 19:46:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 862568448. Throughput: 0: 41875.1. Samples: 862676560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:46:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 19:46:58,990][15401] Updated weights for policy 0, policy_version 52650 (0.0039) [2024-06-21 19:47:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 862748672. Throughput: 0: 42181.8. Samples: 862943020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:47:03,621][15401] Updated weights for policy 0, policy_version 52660 (0.0034) [2024-06-21 19:47:06,723][15401] Updated weights for policy 0, policy_version 52670 (0.0027) [2024-06-21 19:47:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 862961664. Throughput: 0: 42008.9. Samples: 863059740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 19:47:11,255][15401] Updated weights for policy 0, policy_version 52680 (0.0025) [2024-06-21 19:47:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 863207424. Throughput: 0: 42124.2. Samples: 863313620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 19:47:14,664][15401] Updated weights for policy 0, policy_version 52690 (0.0040) [2024-06-21 19:47:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 40960.4, 300 sec: 41765.3). Total num frames: 863371264. Throughput: 0: 41974.3. Samples: 863568660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 19:47:19,234][15401] Updated weights for policy 0, policy_version 52700 (0.0040) [2024-06-21 19:47:22,629][15401] Updated weights for policy 0, policy_version 52710 (0.0025) [2024-06-21 19:47:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 863600640. Throughput: 0: 41886.7. Samples: 863686100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 19:47:26,879][15401] Updated weights for policy 0, policy_version 52720 (0.0039) [2024-06-21 19:47:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 863830016. Throughput: 0: 41947.1. Samples: 863942640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-21 19:47:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 19:47:30,391][15401] Updated weights for policy 0, policy_version 52730 (0.0025) [2024-06-21 19:47:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 864010240. Throughput: 0: 41746.0. Samples: 864191540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 19:47:34,924][15401] Updated weights for policy 0, policy_version 52740 (0.0036) [2024-06-21 19:47:38,214][15401] Updated weights for policy 0, policy_version 52750 (0.0028) [2024-06-21 19:47:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 41820.9). Total num frames: 864256000. Throughput: 0: 41815.9. Samples: 864309440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-21 19:47:42,676][15401] Updated weights for policy 0, policy_version 52760 (0.0048) [2024-06-21 19:47:43,260][15349] Signal inference workers to stop experience collection... (12600 times) [2024-06-21 19:47:43,264][15349] Signal inference workers to resume experience collection... (12600 times) [2024-06-21 19:47:43,279][15401] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-21 19:47:43,314][15401] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-21 19:47:43,389][15132] Fps is (10 sec: 44237.8, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 864452608. Throughput: 0: 42078.3. Samples: 864570080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 19:47:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052763_864468992.pth... [2024-06-21 19:47:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052148_854392832.pth [2024-06-21 19:47:46,791][15401] Updated weights for policy 0, policy_version 52770 (0.0026) [2024-06-21 19:47:48,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 864632832. Throughput: 0: 41665.6. Samples: 864817980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 19:47:50,457][15401] Updated weights for policy 0, policy_version 52780 (0.0030) [2024-06-21 19:47:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 864878592. Throughput: 0: 41675.5. Samples: 864935140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 19:47:54,606][15401] Updated weights for policy 0, policy_version 52790 (0.0040) [2024-06-21 19:47:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 865058816. Throughput: 0: 41698.3. Samples: 865190040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:47:58,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 19:47:58,478][15401] Updated weights for policy 0, policy_version 52800 (0.0038) [2024-06-21 19:48:02,240][15401] Updated weights for policy 0, policy_version 52810 (0.0043) [2024-06-21 19:48:03,389][15132] Fps is (10 sec: 36045.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 865239040. Throughput: 0: 41663.1. Samples: 865443500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:48:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 19:48:06,326][15401] Updated weights for policy 0, policy_version 52820 (0.0037) [2024-06-21 19:48:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 865484800. Throughput: 0: 41647.5. Samples: 865560240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 19:48:10,491][15401] Updated weights for policy 0, policy_version 52830 (0.0040) [2024-06-21 19:48:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41233.2, 300 sec: 41877.1). Total num frames: 865681408. Throughput: 0: 41726.3. Samples: 865820320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 19:48:14,055][15401] Updated weights for policy 0, policy_version 52840 (0.0049) [2024-06-21 19:48:18,099][15401] Updated weights for policy 0, policy_version 52850 (0.0036) [2024-06-21 19:48:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.5, 300 sec: 41709.4). Total num frames: 865894400. Throughput: 0: 41604.2. Samples: 866063820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:18,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 19:48:21,722][15401] Updated weights for policy 0, policy_version 52860 (0.0038) [2024-06-21 19:48:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 866140160. Throughput: 0: 41795.5. Samples: 866190240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:23,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 19:48:25,673][15401] Updated weights for policy 0, policy_version 52870 (0.0036) [2024-06-21 19:48:28,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 866304000. Throughput: 0: 41769.8. Samples: 866449720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 19:48:29,719][15401] Updated weights for policy 0, policy_version 52880 (0.0034) [2024-06-21 19:48:33,202][15401] Updated weights for policy 0, policy_version 52890 (0.0034) [2024-06-21 19:48:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 866549760. Throughput: 0: 41689.0. Samples: 866693980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 19:48:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 19:48:37,404][15401] Updated weights for policy 0, policy_version 52900 (0.0045) [2024-06-21 19:48:38,390][15132] Fps is (10 sec: 45874.1, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 866762752. Throughput: 0: 41996.4. Samples: 866824980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:48:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 19:48:40,783][15401] Updated weights for policy 0, policy_version 52910 (0.0031) [2024-06-21 19:48:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 866926592. Throughput: 0: 41878.4. Samples: 867074560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:48:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:48:45,070][15401] Updated weights for policy 0, policy_version 52920 (0.0039) [2024-06-21 19:48:48,045][15349] Signal inference workers to stop experience collection... (12650 times) [2024-06-21 19:48:48,088][15401] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-21 19:48:48,095][15349] Signal inference workers to resume experience collection... (12650 times) [2024-06-21 19:48:48,105][15401] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-21 19:48:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 41820.9). Total num frames: 867188736. Throughput: 0: 41708.4. Samples: 867320380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:48:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 19:48:48,433][15401] Updated weights for policy 0, policy_version 52930 (0.0040) [2024-06-21 19:48:52,822][15401] Updated weights for policy 0, policy_version 52940 (0.0024) [2024-06-21 19:48:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 867385344. Throughput: 0: 42138.6. Samples: 867456480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:48:53,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 19:48:56,690][15401] Updated weights for policy 0, policy_version 52950 (0.0034) [2024-06-21 19:48:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 867581952. Throughput: 0: 41906.2. Samples: 867706100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:48:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 19:49:00,602][15401] Updated weights for policy 0, policy_version 52960 (0.0033) [2024-06-21 19:49:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 41820.8). Total num frames: 867794944. Throughput: 0: 42083.0. Samples: 867957460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:49:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 19:49:04,658][15401] Updated weights for policy 0, policy_version 52970 (0.0025) [2024-06-21 19:49:08,355][15401] Updated weights for policy 0, policy_version 52980 (0.0032) [2024-06-21 19:49:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 868024320. Throughput: 0: 42166.7. Samples: 868087740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 19:49:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:49:12,561][15401] Updated weights for policy 0, policy_version 52990 (0.0042) [2024-06-21 19:49:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 868220928. Throughput: 0: 42096.8. Samples: 868344080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 19:49:16,233][15401] Updated weights for policy 0, policy_version 53000 (0.0031) [2024-06-21 19:49:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42326.9, 300 sec: 41931.9). Total num frames: 868433920. Throughput: 0: 42178.9. Samples: 868592040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 19:49:20,431][15401] Updated weights for policy 0, policy_version 53010 (0.0036) [2024-06-21 19:49:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41932.3). Total num frames: 868630528. Throughput: 0: 42018.3. Samples: 868715800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 19:49:23,838][15401] Updated weights for policy 0, policy_version 53020 (0.0039) [2024-06-21 19:49:28,238][15401] Updated weights for policy 0, policy_version 53030 (0.0030) [2024-06-21 19:49:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 41820.9). Total num frames: 868859904. Throughput: 0: 42183.9. Samples: 868972840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 19:49:31,759][15401] Updated weights for policy 0, policy_version 53040 (0.0041) [2024-06-21 19:49:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 869056512. Throughput: 0: 42340.1. Samples: 869225680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 19:49:35,932][15401] Updated weights for policy 0, policy_version 53050 (0.0044) [2024-06-21 19:49:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 869269504. Throughput: 0: 42034.6. Samples: 869348040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 19:49:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 19:49:39,958][15401] Updated weights for policy 0, policy_version 53060 (0.0043) [2024-06-21 19:49:43,391][15132] Fps is (10 sec: 40954.8, 60 sec: 42324.4, 300 sec: 41820.7). Total num frames: 869466112. Throughput: 0: 41969.1. Samples: 869594760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:49:43,392][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 19:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053068_869466112.pth... [2024-06-21 19:49:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052454_859406336.pth [2024-06-21 19:49:43,864][15401] Updated weights for policy 0, policy_version 53070 (0.0052) [2024-06-21 19:49:47,735][15401] Updated weights for policy 0, policy_version 53080 (0.0032) [2024-06-21 19:49:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 869679104. Throughput: 0: 42086.0. Samples: 869851320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:49:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 19:49:51,785][15401] Updated weights for policy 0, policy_version 53090 (0.0033) [2024-06-21 19:49:53,389][15132] Fps is (10 sec: 42603.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 869892096. Throughput: 0: 41954.3. Samples: 869975680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:49:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 19:49:55,480][15401] Updated weights for policy 0, policy_version 53100 (0.0038) [2024-06-21 19:49:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 870105088. Throughput: 0: 41743.6. Samples: 870222540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:49:58,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 19:49:59,629][15401] Updated weights for policy 0, policy_version 53110 (0.0041) [2024-06-21 19:50:03,383][15401] Updated weights for policy 0, policy_version 53120 (0.0042) [2024-06-21 19:50:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 870318080. Throughput: 0: 42006.4. Samples: 870482320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:50:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 19:50:04,666][15349] Signal inference workers to stop experience collection... (12700 times) [2024-06-21 19:50:04,666][15349] Signal inference workers to resume experience collection... (12700 times) [2024-06-21 19:50:04,680][15401] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-21 19:50:04,681][15401] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-21 19:50:07,454][15401] Updated weights for policy 0, policy_version 53130 (0.0047) [2024-06-21 19:50:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 870547456. Throughput: 0: 41891.8. Samples: 870600940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:50:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 19:50:11,241][15401] Updated weights for policy 0, policy_version 53140 (0.0043) [2024-06-21 19:50:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 870744064. Throughput: 0: 41852.9. Samples: 870856220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-21 19:50:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 19:50:15,083][15401] Updated weights for policy 0, policy_version 53150 (0.0044) [2024-06-21 19:50:18,389][15132] Fps is (10 sec: 37684.1, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 870924288. Throughput: 0: 41782.2. Samples: 871105880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 19:50:19,180][15401] Updated weights for policy 0, policy_version 53160 (0.0028) [2024-06-21 19:50:22,776][15401] Updated weights for policy 0, policy_version 53170 (0.0035) [2024-06-21 19:50:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 871153664. Throughput: 0: 41750.8. Samples: 871226820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:23,398][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 19:50:27,233][15401] Updated weights for policy 0, policy_version 53180 (0.0037) [2024-06-21 19:50:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41504.5, 300 sec: 41765.0). Total num frames: 871350272. Throughput: 0: 42105.6. Samples: 871489560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:28,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 19:50:30,245][15401] Updated weights for policy 0, policy_version 53190 (0.0048) [2024-06-21 19:50:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 871563264. Throughput: 0: 42055.0. Samples: 871743800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:33,395][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 19:50:34,774][15401] Updated weights for policy 0, policy_version 53200 (0.0036) [2024-06-21 19:50:38,133][15401] Updated weights for policy 0, policy_version 53210 (0.0029) [2024-06-21 19:50:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 871792640. Throughput: 0: 42042.7. Samples: 871867600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 19:50:42,380][15401] Updated weights for policy 0, policy_version 53220 (0.0033) [2024-06-21 19:50:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42053.1, 300 sec: 41765.3). Total num frames: 871989248. Throughput: 0: 42201.3. Samples: 872121600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:43,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-21 19:50:45,753][15401] Updated weights for policy 0, policy_version 53230 (0.0042) [2024-06-21 19:50:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 872202240. Throughput: 0: 42028.9. Samples: 872373620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 19:50:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 19:50:50,234][15401] Updated weights for policy 0, policy_version 53240 (0.0038) [2024-06-21 19:50:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 872431616. Throughput: 0: 42279.2. Samples: 872503500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:50:53,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 19:50:53,565][15401] Updated weights for policy 0, policy_version 53250 (0.0039) [2024-06-21 19:50:57,916][15401] Updated weights for policy 0, policy_version 53260 (0.0039) [2024-06-21 19:50:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 872611840. Throughput: 0: 42121.4. Samples: 872751680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:50:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 19:51:01,373][15401] Updated weights for policy 0, policy_version 53270 (0.0051) [2024-06-21 19:51:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 872841216. Throughput: 0: 42001.2. Samples: 872995940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:51:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 19:51:06,035][15401] Updated weights for policy 0, policy_version 53280 (0.0029) [2024-06-21 19:51:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 873054208. Throughput: 0: 42149.7. Samples: 873123560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:51:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 19:51:09,058][15401] Updated weights for policy 0, policy_version 53290 (0.0042) [2024-06-21 19:51:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 873250816. Throughput: 0: 41842.1. Samples: 873372360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:51:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 19:51:13,871][15401] Updated weights for policy 0, policy_version 53300 (0.0038) [2024-06-21 19:51:17,240][15401] Updated weights for policy 0, policy_version 53310 (0.0024) [2024-06-21 19:51:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 873447424. Throughput: 0: 41709.0. Samples: 873620700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-21 19:51:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 19:51:22,019][15401] Updated weights for policy 0, policy_version 53320 (0.0048) [2024-06-21 19:51:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 873660416. Throughput: 0: 41757.1. Samples: 873746680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 19:51:25,166][15401] Updated weights for policy 0, policy_version 53330 (0.0035) [2024-06-21 19:51:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 873873408. Throughput: 0: 41625.9. Samples: 873994760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 19:51:29,695][15401] Updated weights for policy 0, policy_version 53340 (0.0037) [2024-06-21 19:51:33,055][15401] Updated weights for policy 0, policy_version 53350 (0.0043) [2024-06-21 19:51:33,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42323.6, 300 sec: 42098.2). Total num frames: 874102784. Throughput: 0: 41622.2. Samples: 874246720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:33,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 19:51:36,075][15349] Signal inference workers to stop experience collection... (12750 times) [2024-06-21 19:51:36,121][15401] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-21 19:51:36,130][15349] Signal inference workers to resume experience collection... (12750 times) [2024-06-21 19:51:36,135][15401] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-21 19:51:37,692][15401] Updated weights for policy 0, policy_version 53360 (0.0034) [2024-06-21 19:51:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 874283008. Throughput: 0: 41626.2. Samples: 874376680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 19:51:40,926][15401] Updated weights for policy 0, policy_version 53370 (0.0030) [2024-06-21 19:51:43,390][15132] Fps is (10 sec: 39330.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 874496000. Throughput: 0: 41606.2. Samples: 874623960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-21 19:51:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053376_874512384.pth... [2024-06-21 19:51:43,615][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000052763_864468992.pth [2024-06-21 19:51:45,601][15401] Updated weights for policy 0, policy_version 53380 (0.0032) [2024-06-21 19:51:48,393][15132] Fps is (10 sec: 44220.0, 60 sec: 42049.5, 300 sec: 41986.9). Total num frames: 874725376. Throughput: 0: 41649.8. Samples: 874870340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:48,394][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 19:51:48,912][15401] Updated weights for policy 0, policy_version 53390 (0.0035) [2024-06-21 19:51:53,348][15401] Updated weights for policy 0, policy_version 53400 (0.0040) [2024-06-21 19:51:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 874905600. Throughput: 0: 41732.1. Samples: 875001500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 19:51:53,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-21 19:51:56,598][15401] Updated weights for policy 0, policy_version 53410 (0.0043) [2024-06-21 19:51:58,390][15132] Fps is (10 sec: 39336.3, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 875118592. Throughput: 0: 41754.6. Samples: 875251320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:51:58,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-21 19:52:01,105][15401] Updated weights for policy 0, policy_version 53420 (0.0031) [2024-06-21 19:52:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 875331584. Throughput: 0: 41778.6. Samples: 875500740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 19:52:04,597][15401] Updated weights for policy 0, policy_version 53430 (0.0043) [2024-06-21 19:52:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41231.5, 300 sec: 41765.0). Total num frames: 875528192. Throughput: 0: 41809.0. Samples: 875628180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:08,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 19:52:08,697][15401] Updated weights for policy 0, policy_version 53440 (0.0051) [2024-06-21 19:52:12,477][15401] Updated weights for policy 0, policy_version 53450 (0.0037) [2024-06-21 19:52:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 875757568. Throughput: 0: 41866.6. Samples: 875878760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 19:52:16,502][15401] Updated weights for policy 0, policy_version 53460 (0.0036) [2024-06-21 19:52:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 875954176. Throughput: 0: 41709.9. Samples: 876123560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 19:52:20,318][15401] Updated weights for policy 0, policy_version 53470 (0.0026) [2024-06-21 19:52:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 876150784. Throughput: 0: 41527.5. Samples: 876245420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 19:52:24,193][15401] Updated weights for policy 0, policy_version 53480 (0.0033) [2024-06-21 19:52:27,984][15401] Updated weights for policy 0, policy_version 53490 (0.0028) [2024-06-21 19:52:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 876396544. Throughput: 0: 41727.7. Samples: 876501700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 19:52:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 19:52:32,345][15401] Updated weights for policy 0, policy_version 53500 (0.0044) [2024-06-21 19:52:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41234.7, 300 sec: 41765.3). Total num frames: 876576768. Throughput: 0: 41913.4. Samples: 876756280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 19:52:35,651][15401] Updated weights for policy 0, policy_version 53510 (0.0025) [2024-06-21 19:52:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 876789760. Throughput: 0: 41623.5. Samples: 876874560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 19:52:40,078][15401] Updated weights for policy 0, policy_version 53520 (0.0022) [2024-06-21 19:52:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 877002752. Throughput: 0: 41705.1. Samples: 877128040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 19:52:43,575][15401] Updated weights for policy 0, policy_version 53530 (0.0042) [2024-06-21 19:52:48,171][15401] Updated weights for policy 0, policy_version 53540 (0.0032) [2024-06-21 19:52:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41235.7, 300 sec: 41765.3). Total num frames: 877199360. Throughput: 0: 41748.4. Samples: 877379420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 19:52:51,752][15401] Updated weights for policy 0, policy_version 53550 (0.0029) [2024-06-21 19:52:53,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 877428736. Throughput: 0: 41543.6. Samples: 877497540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 19:52:55,892][15401] Updated weights for policy 0, policy_version 53560 (0.0036) [2024-06-21 19:52:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 877641728. Throughput: 0: 41678.6. Samples: 877754300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-21 19:52:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 19:52:59,551][15401] Updated weights for policy 0, policy_version 53570 (0.0033) [2024-06-21 19:53:03,392][15132] Fps is (10 sec: 39312.6, 60 sec: 41504.5, 300 sec: 41820.5). Total num frames: 877821952. Throughput: 0: 41800.9. Samples: 878004700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:03,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 19:53:03,934][15401] Updated weights for policy 0, policy_version 53580 (0.0030) [2024-06-21 19:53:07,235][15401] Updated weights for policy 0, policy_version 53590 (0.0034) [2024-06-21 19:53:08,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42049.4, 300 sec: 41931.0). Total num frames: 878051328. Throughput: 0: 41739.9. Samples: 878123980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:08,397][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 19:53:12,011][15401] Updated weights for policy 0, policy_version 53600 (0.0037) [2024-06-21 19:53:13,389][15132] Fps is (10 sec: 42608.3, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 878247936. Throughput: 0: 41657.7. Samples: 878376300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 19:53:13,713][15349] Signal inference workers to stop experience collection... (12800 times) [2024-06-21 19:53:13,713][15349] Signal inference workers to resume experience collection... (12800 times) [2024-06-21 19:53:13,763][15401] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-21 19:53:13,763][15401] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-21 19:53:15,002][15401] Updated weights for policy 0, policy_version 53610 (0.0048) [2024-06-21 19:53:18,395][15132] Fps is (10 sec: 40964.5, 60 sec: 41775.4, 300 sec: 41764.6). Total num frames: 878460928. Throughput: 0: 41434.6. Samples: 878621060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:18,395][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 19:53:19,880][15401] Updated weights for policy 0, policy_version 53620 (0.0032) [2024-06-21 19:53:22,987][15401] Updated weights for policy 0, policy_version 53630 (0.0035) [2024-06-21 19:53:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 878690304. Throughput: 0: 41689.4. Samples: 878750580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 19:53:27,622][15401] Updated weights for policy 0, policy_version 53640 (0.0028) [2024-06-21 19:53:28,389][15132] Fps is (10 sec: 39342.7, 60 sec: 40959.9, 300 sec: 41709.8). Total num frames: 878854144. Throughput: 0: 41574.1. Samples: 878998880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 19:53:30,822][15401] Updated weights for policy 0, policy_version 53650 (0.0025) [2024-06-21 19:53:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 879083520. Throughput: 0: 41453.0. Samples: 879244800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 19:53:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 19:53:35,335][15401] Updated weights for policy 0, policy_version 53660 (0.0033) [2024-06-21 19:53:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 879296512. Throughput: 0: 41699.1. Samples: 879374000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:53:38,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 19:53:38,600][15401] Updated weights for policy 0, policy_version 53670 (0.0041) [2024-06-21 19:53:43,317][15401] Updated weights for policy 0, policy_version 53680 (0.0033) [2024-06-21 19:53:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 879493120. Throughput: 0: 41328.9. Samples: 879614100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:53:43,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 19:53:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053680_879493120.pth... [2024-06-21 19:53:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053068_869466112.pth [2024-06-21 19:53:46,776][15401] Updated weights for policy 0, policy_version 53690 (0.0031) [2024-06-21 19:53:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 879706112. Throughput: 0: 41359.1. Samples: 879865760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:53:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 19:53:51,658][15401] Updated weights for policy 0, policy_version 53700 (0.0027) [2024-06-21 19:53:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 879902720. Throughput: 0: 41663.2. Samples: 879998560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:53:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 19:53:54,561][15401] Updated weights for policy 0, policy_version 53710 (0.0039) [2024-06-21 19:53:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41709.8). Total num frames: 880099328. Throughput: 0: 41432.8. Samples: 880240780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:53:58,390][15132] Avg episode reward: [(0, '0.879')] [2024-06-21 19:53:59,455][15401] Updated weights for policy 0, policy_version 53720 (0.0031) [2024-06-21 19:54:02,448][15401] Updated weights for policy 0, policy_version 53730 (0.0042) [2024-06-21 19:54:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41780.8, 300 sec: 41709.8). Total num frames: 880328704. Throughput: 0: 41521.4. Samples: 880489300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:54:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 19:54:07,129][15401] Updated weights for policy 0, policy_version 53740 (0.0039) [2024-06-21 19:54:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 40964.4, 300 sec: 41654.2). Total num frames: 880508928. Throughput: 0: 41519.9. Samples: 880618980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 19:54:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 19:54:10,173][15401] Updated weights for policy 0, policy_version 53750 (0.0038) [2024-06-21 19:54:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 880754688. Throughput: 0: 41520.9. Samples: 880867320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 19:54:14,846][15401] Updated weights for policy 0, policy_version 53760 (0.0034) [2024-06-21 19:54:18,233][15401] Updated weights for policy 0, policy_version 53770 (0.0038) [2024-06-21 19:54:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 41782.9, 300 sec: 41820.9). Total num frames: 880967680. Throughput: 0: 41624.4. Samples: 881117900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 19:54:22,635][15401] Updated weights for policy 0, policy_version 53780 (0.0051) [2024-06-21 19:54:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 40686.9, 300 sec: 41598.7). Total num frames: 881131520. Throughput: 0: 41554.2. Samples: 881243940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 19:54:25,891][15401] Updated weights for policy 0, policy_version 53790 (0.0029) [2024-06-21 19:54:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 881377280. Throughput: 0: 41753.6. Samples: 881493020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:28,391][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 19:54:30,434][15401] Updated weights for policy 0, policy_version 53800 (0.0034) [2024-06-21 19:54:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 881590272. Throughput: 0: 41772.0. Samples: 881745500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 19:54:33,766][15401] Updated weights for policy 0, policy_version 53810 (0.0049) [2024-06-21 19:54:38,283][15401] Updated weights for policy 0, policy_version 53820 (0.0040) [2024-06-21 19:54:38,390][15132] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41765.5). Total num frames: 881786880. Throughput: 0: 41613.4. Samples: 881871160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 19:54:41,960][15401] Updated weights for policy 0, policy_version 53830 (0.0043) [2024-06-21 19:54:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 881999872. Throughput: 0: 41830.3. Samples: 882123140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-21 19:54:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 19:54:46,059][15401] Updated weights for policy 0, policy_version 53840 (0.0033) [2024-06-21 19:54:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 882196480. Throughput: 0: 41947.6. Samples: 882376940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:54:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 19:54:49,629][15401] Updated weights for policy 0, policy_version 53850 (0.0047) [2024-06-21 19:54:52,830][15349] Signal inference workers to stop experience collection... (12850 times) [2024-06-21 19:54:52,831][15349] Signal inference workers to resume experience collection... (12850 times) [2024-06-21 19:54:52,864][15401] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-21 19:54:52,864][15401] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-21 19:54:53,394][15132] Fps is (10 sec: 40942.9, 60 sec: 41776.4, 300 sec: 41709.2). Total num frames: 882409472. Throughput: 0: 41727.3. Samples: 882496880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:54:53,394][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 19:54:53,679][15401] Updated weights for policy 0, policy_version 53860 (0.0033) [2024-06-21 19:54:57,469][15401] Updated weights for policy 0, policy_version 53870 (0.0046) [2024-06-21 19:54:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 882622464. Throughput: 0: 41880.5. Samples: 882751940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:54:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 19:55:01,486][15401] Updated weights for policy 0, policy_version 53880 (0.0032) [2024-06-21 19:55:03,390][15132] Fps is (10 sec: 44255.2, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 882851840. Throughput: 0: 42003.6. Samples: 883008060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:55:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 19:55:05,120][15401] Updated weights for policy 0, policy_version 53890 (0.0040) [2024-06-21 19:55:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 883032064. Throughput: 0: 42096.4. Samples: 883138280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:55:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 19:55:09,353][15401] Updated weights for policy 0, policy_version 53900 (0.0047) [2024-06-21 19:55:12,955][15401] Updated weights for policy 0, policy_version 53910 (0.0034) [2024-06-21 19:55:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 883277824. Throughput: 0: 42105.9. Samples: 883387780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 19:55:13,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 19:55:17,097][15401] Updated weights for policy 0, policy_version 53920 (0.0036) [2024-06-21 19:55:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 883474432. Throughput: 0: 42020.4. Samples: 883636420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:18,396][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 19:55:20,766][15401] Updated weights for policy 0, policy_version 53930 (0.0035) [2024-06-21 19:55:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 41710.1). Total num frames: 883654656. Throughput: 0: 41956.0. Samples: 883759180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 19:55:25,011][15401] Updated weights for policy 0, policy_version 53940 (0.0030) [2024-06-21 19:55:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 883884032. Throughput: 0: 42043.1. Samples: 884015080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 19:55:28,627][15401] Updated weights for policy 0, policy_version 53950 (0.0037) [2024-06-21 19:55:32,949][15401] Updated weights for policy 0, policy_version 53960 (0.0040) [2024-06-21 19:55:33,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 884113408. Throughput: 0: 42016.1. Samples: 884267660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 19:55:36,517][15401] Updated weights for policy 0, policy_version 53970 (0.0049) [2024-06-21 19:55:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 884310016. Throughput: 0: 42222.6. Samples: 884396720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 19:55:40,461][15401] Updated weights for policy 0, policy_version 53980 (0.0041) [2024-06-21 19:55:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 884539392. Throughput: 0: 42042.5. Samples: 884643860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:43,391][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 19:55:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053988_884539392.pth... [2024-06-21 19:55:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053376_874512384.pth [2024-06-21 19:55:44,507][15401] Updated weights for policy 0, policy_version 53990 (0.0037) [2024-06-21 19:55:48,323][15401] Updated weights for policy 0, policy_version 54000 (0.0046) [2024-06-21 19:55:48,390][15132] Fps is (10 sec: 42594.6, 60 sec: 42324.7, 300 sec: 41709.7). Total num frames: 884736000. Throughput: 0: 42308.5. Samples: 884911980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 19:55:48,391][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 19:55:52,099][15401] Updated weights for policy 0, policy_version 54010 (0.0032) [2024-06-21 19:55:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42328.3, 300 sec: 41820.9). Total num frames: 884948992. Throughput: 0: 42006.7. Samples: 885028580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:55:53,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-21 19:55:56,015][15401] Updated weights for policy 0, policy_version 54020 (0.0038) [2024-06-21 19:55:58,396][15132] Fps is (10 sec: 45850.1, 60 sec: 42866.8, 300 sec: 41875.5). Total num frames: 885194752. Throughput: 0: 42096.7. Samples: 885282400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:55:58,396][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 19:55:59,652][15401] Updated weights for policy 0, policy_version 54030 (0.0037) [2024-06-21 19:56:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 885358592. Throughput: 0: 42398.3. Samples: 885544340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:56:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 19:56:03,703][15401] Updated weights for policy 0, policy_version 54040 (0.0030) [2024-06-21 19:56:07,818][15401] Updated weights for policy 0, policy_version 54050 (0.0036) [2024-06-21 19:56:08,392][15132] Fps is (10 sec: 39337.2, 60 sec: 42596.7, 300 sec: 41820.5). Total num frames: 885587968. Throughput: 0: 42315.6. Samples: 885663480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:56:08,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 19:56:11,607][15401] Updated weights for policy 0, policy_version 54060 (0.0036) [2024-06-21 19:56:12,521][15349] Signal inference workers to stop experience collection... (12900 times) [2024-06-21 19:56:12,521][15349] Signal inference workers to resume experience collection... (12900 times) [2024-06-21 19:56:12,563][15401] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-21 19:56:12,563][15401] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-21 19:56:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 885817344. Throughput: 0: 42205.8. Samples: 885914340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:56:13,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 19:56:15,445][15401] Updated weights for policy 0, policy_version 54070 (0.0033) [2024-06-21 19:56:18,390][15132] Fps is (10 sec: 39331.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 885981184. Throughput: 0: 42339.8. Samples: 886172960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:56:18,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 19:56:19,439][15401] Updated weights for policy 0, policy_version 54080 (0.0028) [2024-06-21 19:56:23,240][15401] Updated weights for policy 0, policy_version 54090 (0.0050) [2024-06-21 19:56:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 41820.9). Total num frames: 886210560. Throughput: 0: 42009.9. Samples: 886287160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 19:56:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 19:56:27,187][15401] Updated weights for policy 0, policy_version 54100 (0.0060) [2024-06-21 19:56:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 41821.2). Total num frames: 886439936. Throughput: 0: 42245.8. Samples: 886544920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 19:56:30,994][15401] Updated weights for policy 0, policy_version 54110 (0.0031) [2024-06-21 19:56:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41779.0, 300 sec: 41820.8). Total num frames: 886620160. Throughput: 0: 41901.2. Samples: 886797500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:33,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-21 19:56:34,846][15401] Updated weights for policy 0, policy_version 54120 (0.0034) [2024-06-21 19:56:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 886833152. Throughput: 0: 41980.3. Samples: 886917700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 19:56:38,815][15401] Updated weights for policy 0, policy_version 54130 (0.0036) [2024-06-21 19:56:42,653][15401] Updated weights for policy 0, policy_version 54140 (0.0044) [2024-06-21 19:56:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 41876.9). Total num frames: 887078912. Throughput: 0: 42179.7. Samples: 887180220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 19:56:46,618][15401] Updated weights for policy 0, policy_version 54150 (0.0044) [2024-06-21 19:56:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.9, 300 sec: 41876.4). Total num frames: 887259136. Throughput: 0: 41971.5. Samples: 887433060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:56:50,379][15401] Updated weights for policy 0, policy_version 54160 (0.0030) [2024-06-21 19:56:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 887472128. Throughput: 0: 41960.8. Samples: 887551620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 19:56:53,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-21 19:56:54,114][15401] Updated weights for policy 0, policy_version 54170 (0.0037) [2024-06-21 19:56:58,365][15401] Updated weights for policy 0, policy_version 54180 (0.0033) [2024-06-21 19:56:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41510.6, 300 sec: 41876.4). Total num frames: 887685120. Throughput: 0: 42335.2. Samples: 887819420. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:56:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 19:57:02,348][15401] Updated weights for policy 0, policy_version 54190 (0.0036) [2024-06-21 19:57:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41876.7). Total num frames: 887881728. Throughput: 0: 41972.9. Samples: 888061740. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 19:57:06,147][15401] Updated weights for policy 0, policy_version 54200 (0.0030) [2024-06-21 19:57:08,391][15132] Fps is (10 sec: 40955.3, 60 sec: 41780.1, 300 sec: 41820.7). Total num frames: 888094720. Throughput: 0: 42186.9. Samples: 888185620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:08,391][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 19:57:09,881][15401] Updated weights for policy 0, policy_version 54210 (0.0037) [2024-06-21 19:57:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 888307712. Throughput: 0: 42096.4. Samples: 888439260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:57:13,658][15401] Updated weights for policy 0, policy_version 54220 (0.0027) [2024-06-21 19:57:17,618][15401] Updated weights for policy 0, policy_version 54230 (0.0046) [2024-06-21 19:57:18,389][15132] Fps is (10 sec: 42603.1, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 888520704. Throughput: 0: 42045.9. Samples: 888689560. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 19:57:21,336][15401] Updated weights for policy 0, policy_version 54240 (0.0040) [2024-06-21 19:57:23,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 888733696. Throughput: 0: 42298.9. Samples: 888821140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 19:57:25,354][15401] Updated weights for policy 0, policy_version 54250 (0.0030) [2024-06-21 19:57:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41504.5, 300 sec: 41876.1). Total num frames: 888930304. Throughput: 0: 42148.5. Samples: 889077000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-21 19:57:28,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 19:57:29,332][15401] Updated weights for policy 0, policy_version 54260 (0.0031) [2024-06-21 19:57:33,059][15401] Updated weights for policy 0, policy_version 54270 (0.0031) [2024-06-21 19:57:33,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42323.7, 300 sec: 41931.6). Total num frames: 889159680. Throughput: 0: 42120.9. Samples: 889328600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:33,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 19:57:33,951][15349] Signal inference workers to stop experience collection... (12950 times) [2024-06-21 19:57:33,952][15349] Signal inference workers to resume experience collection... (12950 times) [2024-06-21 19:57:33,975][15401] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-21 19:57:33,975][15401] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-21 19:57:36,962][15401] Updated weights for policy 0, policy_version 54280 (0.0039) [2024-06-21 19:57:38,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 889372672. Throughput: 0: 42319.7. Samples: 889456000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:38,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-21 19:57:40,955][15401] Updated weights for policy 0, policy_version 54290 (0.0030) [2024-06-21 19:57:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 889569280. Throughput: 0: 41836.7. Samples: 889702080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 19:57:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054295_889569280.pth... [2024-06-21 19:57:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053680_879493120.pth [2024-06-21 19:57:44,572][15401] Updated weights for policy 0, policy_version 54300 (0.0038) [2024-06-21 19:57:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 889782272. Throughput: 0: 42093.8. Samples: 889955960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 19:57:48,917][15401] Updated weights for policy 0, policy_version 54310 (0.0026) [2024-06-21 19:57:52,234][15401] Updated weights for policy 0, policy_version 54320 (0.0031) [2024-06-21 19:57:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42323.7, 300 sec: 41931.6). Total num frames: 890011648. Throughput: 0: 42245.0. Samples: 890086700. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:53,393][15132] Avg episode reward: [(0, '0.885')] [2024-06-21 19:57:56,816][15401] Updated weights for policy 0, policy_version 54330 (0.0035) [2024-06-21 19:57:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42043.3). Total num frames: 890224640. Throughput: 0: 42130.7. Samples: 890335140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:57:58,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-21 19:58:00,548][15401] Updated weights for policy 0, policy_version 54340 (0.0040) [2024-06-21 19:58:03,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 41988.4). Total num frames: 890437632. Throughput: 0: 42167.5. Samples: 890587100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 19:58:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 19:58:04,357][15401] Updated weights for policy 0, policy_version 54350 (0.0032) [2024-06-21 19:58:08,297][15401] Updated weights for policy 0, policy_version 54360 (0.0032) [2024-06-21 19:58:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42326.2, 300 sec: 41987.5). Total num frames: 890634240. Throughput: 0: 42140.4. Samples: 890717460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 19:58:12,139][15401] Updated weights for policy 0, policy_version 54370 (0.0048) [2024-06-21 19:58:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41988.2). Total num frames: 890847232. Throughput: 0: 41978.1. Samples: 890965920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:13,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 19:58:16,255][15401] Updated weights for policy 0, policy_version 54380 (0.0033) [2024-06-21 19:58:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 891043840. Throughput: 0: 41965.8. Samples: 891216960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:18,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 19:58:19,782][15401] Updated weights for policy 0, policy_version 54390 (0.0036) [2024-06-21 19:58:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 891256832. Throughput: 0: 41936.9. Samples: 891343160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 19:58:23,962][15401] Updated weights for policy 0, policy_version 54400 (0.0050) [2024-06-21 19:58:27,405][15401] Updated weights for policy 0, policy_version 54410 (0.0036) [2024-06-21 19:58:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 41987.5). Total num frames: 891469824. Throughput: 0: 42064.2. Samples: 891594960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-21 19:58:31,743][15401] Updated weights for policy 0, policy_version 54420 (0.0028) [2024-06-21 19:58:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42054.0, 300 sec: 41987.5). Total num frames: 891682816. Throughput: 0: 42107.1. Samples: 891850780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 19:58:35,057][15401] Updated weights for policy 0, policy_version 54430 (0.0033) [2024-06-21 19:58:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 891895808. Throughput: 0: 41996.6. Samples: 891976440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 19:58:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 19:58:39,703][15401] Updated weights for policy 0, policy_version 54440 (0.0032) [2024-06-21 19:58:42,656][15401] Updated weights for policy 0, policy_version 54450 (0.0036) [2024-06-21 19:58:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 892125184. Throughput: 0: 42027.6. Samples: 892226380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:58:43,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 19:58:47,289][15401] Updated weights for policy 0, policy_version 54460 (0.0037) [2024-06-21 19:58:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 892289024. Throughput: 0: 42166.6. Samples: 892484600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:58:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 19:58:50,666][15401] Updated weights for policy 0, policy_version 54470 (0.0047) [2024-06-21 19:58:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41780.8, 300 sec: 42098.5). Total num frames: 892518400. Throughput: 0: 41876.6. Samples: 892601920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:58:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 19:58:55,189][15401] Updated weights for policy 0, policy_version 54480 (0.0030) [2024-06-21 19:58:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 892747776. Throughput: 0: 41947.8. Samples: 892853560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:58:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 19:58:58,505][15401] Updated weights for policy 0, policy_version 54490 (0.0043) [2024-06-21 19:59:02,755][15401] Updated weights for policy 0, policy_version 54500 (0.0036) [2024-06-21 19:59:03,389][15132] Fps is (10 sec: 42599.6, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 892944384. Throughput: 0: 42100.9. Samples: 893111500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:59:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 19:59:04,236][15349] Signal inference workers to stop experience collection... (13000 times) [2024-06-21 19:59:04,237][15349] Signal inference workers to resume experience collection... (13000 times) [2024-06-21 19:59:04,260][15401] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-21 19:59:04,260][15401] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-21 19:59:06,463][15401] Updated weights for policy 0, policy_version 54510 (0.0038) [2024-06-21 19:59:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 893157376. Throughput: 0: 41971.6. Samples: 893231880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:59:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 19:59:10,392][15401] Updated weights for policy 0, policy_version 54520 (0.0040) [2024-06-21 19:59:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 893386752. Throughput: 0: 42146.6. Samples: 893491560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 19:59:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 19:59:13,962][15401] Updated weights for policy 0, policy_version 54530 (0.0036) [2024-06-21 19:59:18,139][15401] Updated weights for policy 0, policy_version 54540 (0.0038) [2024-06-21 19:59:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 893583360. Throughput: 0: 41964.9. Samples: 893739200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 19:59:21,540][15401] Updated weights for policy 0, policy_version 54550 (0.0027) [2024-06-21 19:59:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 893779968. Throughput: 0: 41929.2. Samples: 893863260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:23,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 19:59:25,756][15401] Updated weights for policy 0, policy_version 54560 (0.0032) [2024-06-21 19:59:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 893992960. Throughput: 0: 42211.6. Samples: 894125900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 19:59:30,099][15401] Updated weights for policy 0, policy_version 54570 (0.0035) [2024-06-21 19:59:33,347][15401] Updated weights for policy 0, policy_version 54580 (0.0045) [2024-06-21 19:59:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 894238720. Throughput: 0: 42007.5. Samples: 894374940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 19:59:37,743][15401] Updated weights for policy 0, policy_version 54590 (0.0035) [2024-06-21 19:59:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 894435328. Throughput: 0: 42359.7. Samples: 894508100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 19:59:41,354][15401] Updated weights for policy 0, policy_version 54600 (0.0033) [2024-06-21 19:59:43,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 894615552. Throughput: 0: 42327.9. Samples: 894758320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 19:59:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054603_894615552.pth... [2024-06-21 19:59:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000053988_884539392.pth [2024-06-21 19:59:45,487][15401] Updated weights for policy 0, policy_version 54610 (0.0023) [2024-06-21 19:59:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42154.7). Total num frames: 894844928. Throughput: 0: 42171.1. Samples: 895009200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-21 19:59:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 19:59:48,940][15401] Updated weights for policy 0, policy_version 54620 (0.0036) [2024-06-21 19:59:53,155][15401] Updated weights for policy 0, policy_version 54630 (0.0028) [2024-06-21 19:59:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 895057920. Throughput: 0: 42440.4. Samples: 895141700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 19:59:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 19:59:57,298][15401] Updated weights for policy 0, policy_version 54640 (0.0032) [2024-06-21 19:59:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 895254528. Throughput: 0: 42182.3. Samples: 895389760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 19:59:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 20:00:00,912][15401] Updated weights for policy 0, policy_version 54650 (0.0042) [2024-06-21 20:00:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 895483904. Throughput: 0: 42286.5. Samples: 895642100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 20:00:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 20:00:04,835][15401] Updated weights for policy 0, policy_version 54660 (0.0027) [2024-06-21 20:00:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 895696896. Throughput: 0: 42487.2. Samples: 895775180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 20:00:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 20:00:08,486][15401] Updated weights for policy 0, policy_version 54670 (0.0034) [2024-06-21 20:00:12,515][15401] Updated weights for policy 0, policy_version 54680 (0.0032) [2024-06-21 20:00:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 895877120. Throughput: 0: 42193.3. Samples: 896024600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 20:00:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 20:00:16,307][15401] Updated weights for policy 0, policy_version 54690 (0.0039) [2024-06-21 20:00:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 896122880. Throughput: 0: 42265.9. Samples: 896276900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 20:00:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 20:00:20,403][15401] Updated weights for policy 0, policy_version 54700 (0.0028) [2024-06-21 20:00:22,740][15349] Signal inference workers to stop experience collection... (13050 times) [2024-06-21 20:00:22,747][15349] Signal inference workers to resume experience collection... (13050 times) [2024-06-21 20:00:22,796][15401] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-21 20:00:22,796][15401] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-21 20:00:23,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 896319488. Throughput: 0: 42183.4. Samples: 896406360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 20:00:24,204][15401] Updated weights for policy 0, policy_version 54710 (0.0034) [2024-06-21 20:00:28,282][15401] Updated weights for policy 0, policy_version 54720 (0.0037) [2024-06-21 20:00:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 896532480. Throughput: 0: 42176.9. Samples: 896656280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:28,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 20:00:31,740][15401] Updated weights for policy 0, policy_version 54730 (0.0031) [2024-06-21 20:00:33,390][15132] Fps is (10 sec: 44237.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 896761856. Throughput: 0: 42149.6. Samples: 896905940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 20:00:36,285][15401] Updated weights for policy 0, policy_version 54740 (0.0035) [2024-06-21 20:00:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 896942080. Throughput: 0: 42154.1. Samples: 897038640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:00:39,570][15401] Updated weights for policy 0, policy_version 54750 (0.0029) [2024-06-21 20:00:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42098.7). Total num frames: 897155072. Throughput: 0: 42172.0. Samples: 897287500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 20:00:43,956][15401] Updated weights for policy 0, policy_version 54760 (0.0048) [2024-06-21 20:00:47,307][15401] Updated weights for policy 0, policy_version 54770 (0.0047) [2024-06-21 20:00:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 897384448. Throughput: 0: 42221.0. Samples: 897542040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 20:00:51,795][15401] Updated weights for policy 0, policy_version 54780 (0.0023) [2024-06-21 20:00:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41988.4). Total num frames: 897581056. Throughput: 0: 42188.9. Samples: 897673680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 20:00:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 20:00:55,170][15401] Updated weights for policy 0, policy_version 54790 (0.0034) [2024-06-21 20:00:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 897810432. Throughput: 0: 42069.8. Samples: 897917740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:00:58,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 20:00:59,494][15401] Updated weights for policy 0, policy_version 54800 (0.0032) [2024-06-21 20:01:02,855][15401] Updated weights for policy 0, policy_version 54810 (0.0030) [2024-06-21 20:01:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 42098.9). Total num frames: 898007040. Throughput: 0: 42146.2. Samples: 898173480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 20:01:07,734][15401] Updated weights for policy 0, policy_version 54820 (0.0042) [2024-06-21 20:01:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 898187264. Throughput: 0: 42013.2. Samples: 898296940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-21 20:01:10,648][15401] Updated weights for policy 0, policy_version 54830 (0.0047) [2024-06-21 20:01:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 898433024. Throughput: 0: 42071.2. Samples: 898549480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 20:01:15,367][15401] Updated weights for policy 0, policy_version 54840 (0.0038) [2024-06-21 20:01:18,266][15401] Updated weights for policy 0, policy_version 54850 (0.0029) [2024-06-21 20:01:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 898662400. Throughput: 0: 42066.8. Samples: 898798940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 20:01:23,050][15401] Updated weights for policy 0, policy_version 54860 (0.0035) [2024-06-21 20:01:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 898826240. Throughput: 0: 41988.2. Samples: 898928100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 20:01:26,378][15401] Updated weights for policy 0, policy_version 54870 (0.0037) [2024-06-21 20:01:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 899055616. Throughput: 0: 41835.7. Samples: 899170100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-21 20:01:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 20:01:30,865][15401] Updated weights for policy 0, policy_version 54880 (0.0047) [2024-06-21 20:01:33,391][15132] Fps is (10 sec: 45866.3, 60 sec: 42051.0, 300 sec: 42209.4). Total num frames: 899284992. Throughput: 0: 41982.3. Samples: 899431320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:33,392][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 20:01:33,966][15401] Updated weights for policy 0, policy_version 54890 (0.0029) [2024-06-21 20:01:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 899465216. Throughput: 0: 41884.9. Samples: 899558500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-21 20:01:38,425][15401] Updated weights for policy 0, policy_version 54900 (0.0035) [2024-06-21 20:01:41,595][15401] Updated weights for policy 0, policy_version 54910 (0.0029) [2024-06-21 20:01:42,634][15349] Signal inference workers to stop experience collection... (13100 times) [2024-06-21 20:01:42,634][15349] Signal inference workers to resume experience collection... (13100 times) [2024-06-21 20:01:42,646][15401] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-21 20:01:42,646][15401] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-21 20:01:43,389][15132] Fps is (10 sec: 40968.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 899694592. Throughput: 0: 41914.8. Samples: 899803900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 20:01:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054913_899694592.pth... [2024-06-21 20:01:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054295_889569280.pth [2024-06-21 20:01:46,012][15401] Updated weights for policy 0, policy_version 54920 (0.0030) [2024-06-21 20:01:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 899891200. Throughput: 0: 41976.0. Samples: 900062400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 20:01:49,426][15401] Updated weights for policy 0, policy_version 54930 (0.0041) [2024-06-21 20:01:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 900104192. Throughput: 0: 41904.0. Samples: 900182620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:53,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 20:01:54,112][15401] Updated weights for policy 0, policy_version 54940 (0.0046) [2024-06-21 20:01:57,440][15401] Updated weights for policy 0, policy_version 54950 (0.0032) [2024-06-21 20:01:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 900317184. Throughput: 0: 41789.0. Samples: 900429980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:01:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-21 20:02:02,162][15401] Updated weights for policy 0, policy_version 54960 (0.0038) [2024-06-21 20:02:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42098.7). Total num frames: 900513792. Throughput: 0: 41937.2. Samples: 900686120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 20:02:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 20:02:05,163][15401] Updated weights for policy 0, policy_version 54970 (0.0041) [2024-06-21 20:02:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 900726784. Throughput: 0: 41681.3. Samples: 900803760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 20:02:10,425][15401] Updated weights for policy 0, policy_version 54980 (0.0039) [2024-06-21 20:02:12,827][15401] Updated weights for policy 0, policy_version 54990 (0.0034) [2024-06-21 20:02:13,390][15132] Fps is (10 sec: 45872.4, 60 sec: 42324.9, 300 sec: 42209.5). Total num frames: 900972544. Throughput: 0: 42019.3. Samples: 901061000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:13,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 20:02:17,944][15401] Updated weights for policy 0, policy_version 55000 (0.0036) [2024-06-21 20:02:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 901136384. Throughput: 0: 41898.6. Samples: 901316680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 20:02:20,628][15401] Updated weights for policy 0, policy_version 55010 (0.0043) [2024-06-21 20:02:23,389][15132] Fps is (10 sec: 36047.4, 60 sec: 41779.2, 300 sec: 42043.4). Total num frames: 901332992. Throughput: 0: 41624.0. Samples: 901431580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 20:02:25,623][15401] Updated weights for policy 0, policy_version 55020 (0.0031) [2024-06-21 20:02:28,361][15401] Updated weights for policy 0, policy_version 55030 (0.0030) [2024-06-21 20:02:28,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42598.4, 300 sec: 42210.0). Total num frames: 901611520. Throughput: 0: 41942.7. Samples: 901691320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 20:02:33,271][15401] Updated weights for policy 0, policy_version 55040 (0.0035) [2024-06-21 20:02:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 41507.5, 300 sec: 42043.0). Total num frames: 901775360. Throughput: 0: 41909.8. Samples: 901948340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:33,390][15132] Avg episode reward: [(0, '0.134')] [2024-06-21 20:02:36,435][15401] Updated weights for policy 0, policy_version 55050 (0.0038) [2024-06-21 20:02:38,392][15132] Fps is (10 sec: 37673.6, 60 sec: 42050.5, 300 sec: 42098.2). Total num frames: 901988352. Throughput: 0: 41922.1. Samples: 902069220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-21 20:02:38,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 20:02:41,256][15401] Updated weights for policy 0, policy_version 55060 (0.0028) [2024-06-21 20:02:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 902201344. Throughput: 0: 42047.9. Samples: 902322140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:02:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 20:02:44,124][15401] Updated weights for policy 0, policy_version 55070 (0.0029) [2024-06-21 20:02:48,390][15132] Fps is (10 sec: 39330.9, 60 sec: 41506.1, 300 sec: 41932.3). Total num frames: 902381568. Throughput: 0: 41813.3. Samples: 902567720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:02:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 20:02:49,032][15401] Updated weights for policy 0, policy_version 55080 (0.0045) [2024-06-21 20:02:52,225][15401] Updated weights for policy 0, policy_version 55090 (0.0045) [2024-06-21 20:02:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 902627328. Throughput: 0: 42056.0. Samples: 902696280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:02:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 20:02:57,040][15401] Updated weights for policy 0, policy_version 55100 (0.0032) [2024-06-21 20:02:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 902807552. Throughput: 0: 41900.6. Samples: 902946500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:02:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 20:02:59,942][15401] Updated weights for policy 0, policy_version 55110 (0.0026) [2024-06-21 20:03:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 903036928. Throughput: 0: 41734.4. Samples: 903194720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:03:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 20:03:04,889][15401] Updated weights for policy 0, policy_version 55120 (0.0033) [2024-06-21 20:03:07,921][15401] Updated weights for policy 0, policy_version 55130 (0.0043) [2024-06-21 20:03:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 903249920. Throughput: 0: 42008.1. Samples: 903321940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:03:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 20:03:12,536][15401] Updated weights for policy 0, policy_version 55140 (0.0037) [2024-06-21 20:03:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 40960.5, 300 sec: 41987.5). Total num frames: 903430144. Throughput: 0: 41864.4. Samples: 903575220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-21 20:03:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 20:03:15,756][15401] Updated weights for policy 0, policy_version 55150 (0.0050) [2024-06-21 20:03:16,917][15349] Signal inference workers to stop experience collection... (13150 times) [2024-06-21 20:03:16,918][15349] Signal inference workers to resume experience collection... (13150 times) [2024-06-21 20:03:16,965][15401] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-21 20:03:16,965][15401] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-21 20:03:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 903643136. Throughput: 0: 41789.0. Samples: 903828840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 20:03:20,179][15401] Updated weights for policy 0, policy_version 55160 (0.0039) [2024-06-21 20:03:23,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 903888896. Throughput: 0: 41872.9. Samples: 903953400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 20:03:23,474][15401] Updated weights for policy 0, policy_version 55170 (0.0040) [2024-06-21 20:03:27,920][15401] Updated weights for policy 0, policy_version 55180 (0.0046) [2024-06-21 20:03:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 41987.5). Total num frames: 904069120. Throughput: 0: 41809.9. Samples: 904203580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:28,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 20:03:31,617][15401] Updated weights for policy 0, policy_version 55190 (0.0054) [2024-06-21 20:03:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 904282112. Throughput: 0: 41881.5. Samples: 904452380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 20:03:35,545][15401] Updated weights for policy 0, policy_version 55200 (0.0042) [2024-06-21 20:03:38,392][15132] Fps is (10 sec: 44225.3, 60 sec: 42052.2, 300 sec: 41987.1). Total num frames: 904511488. Throughput: 0: 41841.7. Samples: 904579260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:38,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 20:03:39,168][15401] Updated weights for policy 0, policy_version 55210 (0.0038) [2024-06-21 20:03:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 904691712. Throughput: 0: 41976.1. Samples: 904835420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:03:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 20:03:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055219_904708096.pth... [2024-06-21 20:03:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054603_894615552.pth [2024-06-21 20:03:43,661][15401] Updated weights for policy 0, policy_version 55220 (0.0038) [2024-06-21 20:03:46,947][15401] Updated weights for policy 0, policy_version 55230 (0.0037) [2024-06-21 20:03:48,389][15132] Fps is (10 sec: 40970.7, 60 sec: 42325.5, 300 sec: 42043.1). Total num frames: 904921088. Throughput: 0: 41907.1. Samples: 905080540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:03:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 20:03:51,481][15401] Updated weights for policy 0, policy_version 55240 (0.0044) [2024-06-21 20:03:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 905134080. Throughput: 0: 42045.7. Samples: 905214000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:03:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 20:03:54,636][15401] Updated weights for policy 0, policy_version 55250 (0.0034) [2024-06-21 20:03:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 905330688. Throughput: 0: 41882.7. Samples: 905459940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:03:58,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-21 20:03:59,504][15401] Updated weights for policy 0, policy_version 55260 (0.0036) [2024-06-21 20:04:02,438][15401] Updated weights for policy 0, policy_version 55270 (0.0034) [2024-06-21 20:04:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 905560064. Throughput: 0: 41863.1. Samples: 905712680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:04:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 20:04:07,324][15401] Updated weights for policy 0, policy_version 55280 (0.0043) [2024-06-21 20:04:08,392][15132] Fps is (10 sec: 44225.3, 60 sec: 42050.5, 300 sec: 41987.1). Total num frames: 905773056. Throughput: 0: 41990.2. Samples: 905843060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:04:08,392][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 20:04:10,443][15401] Updated weights for policy 0, policy_version 55290 (0.0041) [2024-06-21 20:04:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 905969664. Throughput: 0: 41870.2. Samples: 906087740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:04:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 20:04:14,996][15401] Updated weights for policy 0, policy_version 55300 (0.0044) [2024-06-21 20:04:18,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 906182656. Throughput: 0: 41796.9. Samples: 906333240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:04:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 20:04:18,642][15401] Updated weights for policy 0, policy_version 55310 (0.0031) [2024-06-21 20:04:22,815][15401] Updated weights for policy 0, policy_version 55320 (0.0027) [2024-06-21 20:04:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 906379264. Throughput: 0: 41915.7. Samples: 906465360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 20:04:26,338][15401] Updated weights for policy 0, policy_version 55330 (0.0035) [2024-06-21 20:04:28,392][15132] Fps is (10 sec: 39311.7, 60 sec: 41777.4, 300 sec: 41820.5). Total num frames: 906575872. Throughput: 0: 41764.3. Samples: 906714920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:28,392][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 20:04:30,779][15401] Updated weights for policy 0, policy_version 55340 (0.0041) [2024-06-21 20:04:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 906805248. Throughput: 0: 41996.9. Samples: 906970400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 20:04:33,998][15401] Updated weights for policy 0, policy_version 55350 (0.0036) [2024-06-21 20:04:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41234.8, 300 sec: 41931.9). Total num frames: 906985472. Throughput: 0: 41804.0. Samples: 907095180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:38,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 20:04:38,563][15401] Updated weights for policy 0, policy_version 55360 (0.0046) [2024-06-21 20:04:41,681][15401] Updated weights for policy 0, policy_version 55370 (0.0039) [2024-06-21 20:04:42,843][15349] Signal inference workers to stop experience collection... (13200 times) [2024-06-21 20:04:42,844][15349] Signal inference workers to resume experience collection... (13200 times) [2024-06-21 20:04:42,888][15401] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-21 20:04:42,888][15401] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-21 20:04:43,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42323.6, 300 sec: 41987.1). Total num frames: 907231232. Throughput: 0: 41819.8. Samples: 907341940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:43,392][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 20:04:46,263][15401] Updated weights for policy 0, policy_version 55380 (0.0036) [2024-06-21 20:04:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 907444224. Throughput: 0: 41856.3. Samples: 907596220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:48,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-21 20:04:49,609][15401] Updated weights for policy 0, policy_version 55390 (0.0035) [2024-06-21 20:04:53,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 907640832. Throughput: 0: 41891.7. Samples: 907728080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:04:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 20:04:53,973][15401] Updated weights for policy 0, policy_version 55400 (0.0030) [2024-06-21 20:04:57,508][15401] Updated weights for policy 0, policy_version 55410 (0.0041) [2024-06-21 20:04:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 907853824. Throughput: 0: 41850.2. Samples: 907971000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:04:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 20:05:02,322][15401] Updated weights for policy 0, policy_version 55420 (0.0038) [2024-06-21 20:05:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 908050432. Throughput: 0: 41928.0. Samples: 908220000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 20:05:05,531][15401] Updated weights for policy 0, policy_version 55430 (0.0030) [2024-06-21 20:05:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41507.9, 300 sec: 41987.5). Total num frames: 908263424. Throughput: 0: 41686.7. Samples: 908341260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 20:05:10,249][15401] Updated weights for policy 0, policy_version 55440 (0.0032) [2024-06-21 20:05:13,391][15132] Fps is (10 sec: 42592.2, 60 sec: 41778.2, 300 sec: 41876.2). Total num frames: 908476416. Throughput: 0: 41795.2. Samples: 908595660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:13,391][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 20:05:13,471][15401] Updated weights for policy 0, policy_version 55450 (0.0036) [2024-06-21 20:05:17,896][15401] Updated weights for policy 0, policy_version 55460 (0.0038) [2024-06-21 20:05:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 908673024. Throughput: 0: 41754.1. Samples: 908849340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 20:05:21,146][15401] Updated weights for policy 0, policy_version 55470 (0.0033) [2024-06-21 20:05:23,389][15132] Fps is (10 sec: 42604.4, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 908902400. Throughput: 0: 41662.7. Samples: 908970000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 20:05:25,569][15401] Updated weights for policy 0, policy_version 55480 (0.0064) [2024-06-21 20:05:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.1, 300 sec: 41876.4). Total num frames: 909115392. Throughput: 0: 41829.9. Samples: 909224180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-21 20:05:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 20:05:28,974][15401] Updated weights for policy 0, policy_version 55490 (0.0043) [2024-06-21 20:05:33,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41504.3, 300 sec: 41876.1). Total num frames: 909295616. Throughput: 0: 41830.7. Samples: 909478700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:33,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 20:05:33,675][15401] Updated weights for policy 0, policy_version 55500 (0.0033) [2024-06-21 20:05:37,062][15401] Updated weights for policy 0, policy_version 55510 (0.0037) [2024-06-21 20:05:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 909508608. Throughput: 0: 41520.9. Samples: 909596520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 20:05:41,584][15401] Updated weights for policy 0, policy_version 55520 (0.0034) [2024-06-21 20:05:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 41507.8, 300 sec: 41820.9). Total num frames: 909721600. Throughput: 0: 41651.0. Samples: 909845300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 20:05:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055525_909721600.pth... [2024-06-21 20:05:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000054913_899694592.pth [2024-06-21 20:05:45,225][15401] Updated weights for policy 0, policy_version 55530 (0.0034) [2024-06-21 20:05:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 909934592. Throughput: 0: 41697.3. Samples: 910096380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 20:05:49,104][15401] Updated weights for policy 0, policy_version 55540 (0.0042) [2024-06-21 20:05:52,999][15401] Updated weights for policy 0, policy_version 55550 (0.0031) [2024-06-21 20:05:53,396][15132] Fps is (10 sec: 42570.8, 60 sec: 41774.7, 300 sec: 41819.9). Total num frames: 910147584. Throughput: 0: 41766.3. Samples: 910221020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:53,396][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 20:05:56,799][15401] Updated weights for policy 0, policy_version 55560 (0.0030) [2024-06-21 20:05:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 910344192. Throughput: 0: 41590.1. Samples: 910467160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:05:58,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 20:06:00,675][15401] Updated weights for policy 0, policy_version 55570 (0.0033) [2024-06-21 20:06:03,389][15132] Fps is (10 sec: 40987.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 910557184. Throughput: 0: 41601.8. Samples: 910721420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:06:03,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-21 20:06:04,475][15401] Updated weights for policy 0, policy_version 55580 (0.0034) [2024-06-21 20:06:08,259][15401] Updated weights for policy 0, policy_version 55590 (0.0030) [2024-06-21 20:06:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 910786560. Throughput: 0: 41685.3. Samples: 910845840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 20:06:12,830][15401] Updated weights for policy 0, policy_version 55600 (0.0036) [2024-06-21 20:06:13,394][15132] Fps is (10 sec: 40942.0, 60 sec: 41504.1, 300 sec: 41709.2). Total num frames: 910966784. Throughput: 0: 41666.2. Samples: 911099340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:13,394][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 20:06:14,547][15349] Signal inference workers to stop experience collection... (13250 times) [2024-06-21 20:06:14,547][15349] Signal inference workers to resume experience collection... (13250 times) [2024-06-21 20:06:14,576][15401] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-21 20:06:14,576][15401] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-21 20:06:16,159][15401] Updated weights for policy 0, policy_version 55610 (0.0031) [2024-06-21 20:06:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 911179776. Throughput: 0: 41642.3. Samples: 911352500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 20:06:20,562][15401] Updated weights for policy 0, policy_version 55620 (0.0042) [2024-06-21 20:06:23,389][15132] Fps is (10 sec: 44255.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 911409152. Throughput: 0: 41768.9. Samples: 911476120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:23,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 20:06:23,882][15401] Updated weights for policy 0, policy_version 55630 (0.0049) [2024-06-21 20:06:28,275][15401] Updated weights for policy 0, policy_version 55640 (0.0037) [2024-06-21 20:06:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41765.6). Total num frames: 911605760. Throughput: 0: 41874.3. Samples: 911729640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 20:06:31,619][15401] Updated weights for policy 0, policy_version 55650 (0.0026) [2024-06-21 20:06:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 911818752. Throughput: 0: 41725.4. Samples: 911974020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:33,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 20:06:36,704][15401] Updated weights for policy 0, policy_version 55660 (0.0038) [2024-06-21 20:06:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 912015360. Throughput: 0: 41979.9. Samples: 912109840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 20:06:38,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 20:06:39,813][15401] Updated weights for policy 0, policy_version 55670 (0.0042) [2024-06-21 20:06:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 912228352. Throughput: 0: 41950.3. Samples: 912354920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:06:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 20:06:44,394][15401] Updated weights for policy 0, policy_version 55680 (0.0039) [2024-06-21 20:06:47,522][15401] Updated weights for policy 0, policy_version 55690 (0.0033) [2024-06-21 20:06:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 912457728. Throughput: 0: 41792.9. Samples: 912602100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:06:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 20:06:52,256][15401] Updated weights for policy 0, policy_version 55700 (0.0044) [2024-06-21 20:06:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41237.6, 300 sec: 41709.8). Total num frames: 912621568. Throughput: 0: 41930.3. Samples: 912732700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:06:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 20:06:55,275][15401] Updated weights for policy 0, policy_version 55710 (0.0052) [2024-06-21 20:06:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 912867328. Throughput: 0: 41804.5. Samples: 912980360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:06:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 20:06:59,934][15401] Updated weights for policy 0, policy_version 55720 (0.0044) [2024-06-21 20:07:03,074][15401] Updated weights for policy 0, policy_version 55730 (0.0030) [2024-06-21 20:07:03,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42050.5, 300 sec: 41876.1). Total num frames: 913080320. Throughput: 0: 41660.3. Samples: 913227320. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:07:03,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 20:07:07,530][15401] Updated weights for policy 0, policy_version 55740 (0.0048) [2024-06-21 20:07:08,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41231.5, 300 sec: 41654.0). Total num frames: 913260544. Throughput: 0: 41745.4. Samples: 913354760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:07:08,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 20:07:11,020][15401] Updated weights for policy 0, policy_version 55750 (0.0042) [2024-06-21 20:07:13,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42326.6, 300 sec: 41931.6). Total num frames: 913506304. Throughput: 0: 41700.8. Samples: 913606280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-21 20:07:13,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 20:07:15,552][15401] Updated weights for policy 0, policy_version 55760 (0.0039) [2024-06-21 20:07:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 913702912. Throughput: 0: 41888.9. Samples: 913859020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 20:07:18,874][15401] Updated weights for policy 0, policy_version 55770 (0.0039) [2024-06-21 20:07:23,359][15401] Updated weights for policy 0, policy_version 55780 (0.0031) [2024-06-21 20:07:23,396][15132] Fps is (10 sec: 39305.5, 60 sec: 41501.6, 300 sec: 41653.3). Total num frames: 913899520. Throughput: 0: 41557.0. Samples: 913980180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:23,397][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 20:07:26,831][15401] Updated weights for policy 0, policy_version 55790 (0.0032) [2024-06-21 20:07:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 914128896. Throughput: 0: 41821.3. Samples: 914236880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 20:07:31,091][15401] Updated weights for policy 0, policy_version 55800 (0.0044) [2024-06-21 20:07:33,389][15132] Fps is (10 sec: 42626.3, 60 sec: 41779.2, 300 sec: 41821.2). Total num frames: 914325504. Throughput: 0: 41962.1. Samples: 914490400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 20:07:34,557][15401] Updated weights for policy 0, policy_version 55810 (0.0034) [2024-06-21 20:07:38,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41777.5, 300 sec: 41765.0). Total num frames: 914522112. Throughput: 0: 41747.0. Samples: 914611420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:38,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 20:07:39,254][15401] Updated weights for policy 0, policy_version 55820 (0.0029) [2024-06-21 20:07:42,406][15401] Updated weights for policy 0, policy_version 55830 (0.0037) [2024-06-21 20:07:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 914751488. Throughput: 0: 41936.4. Samples: 914867500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:43,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-21 20:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055832_914751488.pth... [2024-06-21 20:07:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055219_904708096.pth [2024-06-21 20:07:43,888][15349] Signal inference workers to stop experience collection... (13300 times) [2024-06-21 20:07:43,888][15349] Signal inference workers to resume experience collection... (13300 times) [2024-06-21 20:07:43,923][15401] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-21 20:07:43,923][15401] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-21 20:07:47,081][15401] Updated weights for policy 0, policy_version 55840 (0.0042) [2024-06-21 20:07:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 914948096. Throughput: 0: 41941.9. Samples: 915114600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-21 20:07:48,396][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 20:07:50,232][15401] Updated weights for policy 0, policy_version 55850 (0.0042) [2024-06-21 20:07:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 915144704. Throughput: 0: 41768.8. Samples: 915234260. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:07:53,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 20:07:54,943][15401] Updated weights for policy 0, policy_version 55860 (0.0035) [2024-06-21 20:07:57,995][15401] Updated weights for policy 0, policy_version 55870 (0.0039) [2024-06-21 20:07:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 915374080. Throughput: 0: 41893.9. Samples: 915491400. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:07:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 20:08:02,943][15401] Updated weights for policy 0, policy_version 55880 (0.0035) [2024-06-21 20:08:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41234.8, 300 sec: 41709.8). Total num frames: 915554304. Throughput: 0: 41862.2. Samples: 915742820. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:08:03,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 20:08:05,852][15401] Updated weights for policy 0, policy_version 55890 (0.0042) [2024-06-21 20:08:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42326.9, 300 sec: 41931.9). Total num frames: 915800064. Throughput: 0: 41786.9. Samples: 915860320. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:08:08,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 20:08:10,675][15401] Updated weights for policy 0, policy_version 55900 (0.0024) [2024-06-21 20:08:13,391][15132] Fps is (10 sec: 44227.9, 60 sec: 41506.5, 300 sec: 41876.1). Total num frames: 915996672. Throughput: 0: 41737.8. Samples: 916115160. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:08:13,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 20:08:14,085][15401] Updated weights for policy 0, policy_version 55910 (0.0028) [2024-06-21 20:08:18,392][15132] Fps is (10 sec: 37674.1, 60 sec: 41231.3, 300 sec: 41653.9). Total num frames: 916176896. Throughput: 0: 41730.2. Samples: 916368360. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:08:18,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 20:08:18,439][15401] Updated weights for policy 0, policy_version 55920 (0.0033) [2024-06-21 20:08:22,047][15401] Updated weights for policy 0, policy_version 55930 (0.0034) [2024-06-21 20:08:23,392][15132] Fps is (10 sec: 44234.5, 60 sec: 42328.2, 300 sec: 41931.6). Total num frames: 916439040. Throughput: 0: 41638.7. Samples: 916485160. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-21 20:08:23,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 20:08:26,125][15401] Updated weights for policy 0, policy_version 55940 (0.0035) [2024-06-21 20:08:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 916619264. Throughput: 0: 41647.1. Samples: 916741620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:28,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 20:08:29,826][15401] Updated weights for policy 0, policy_version 55950 (0.0038) [2024-06-21 20:08:33,390][15132] Fps is (10 sec: 37691.8, 60 sec: 41506.1, 300 sec: 41710.1). Total num frames: 916815872. Throughput: 0: 41775.8. Samples: 916994520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 20:08:33,779][15401] Updated weights for policy 0, policy_version 55960 (0.0038) [2024-06-21 20:08:37,518][15401] Updated weights for policy 0, policy_version 55970 (0.0028) [2024-06-21 20:08:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 917045248. Throughput: 0: 41922.3. Samples: 917120760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:38,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 20:08:41,502][15401] Updated weights for policy 0, policy_version 55980 (0.0036) [2024-06-21 20:08:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 917258240. Throughput: 0: 41744.0. Samples: 917369880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 20:08:45,436][15401] Updated weights for policy 0, policy_version 55990 (0.0035) [2024-06-21 20:08:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 917438464. Throughput: 0: 41633.7. Samples: 917616340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:48,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 20:08:49,441][15401] Updated weights for policy 0, policy_version 56000 (0.0028) [2024-06-21 20:08:53,353][15401] Updated weights for policy 0, policy_version 56010 (0.0042) [2024-06-21 20:08:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 917667840. Throughput: 0: 41681.3. Samples: 917735980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 20:08:56,667][15349] Signal inference workers to stop experience collection... (13350 times) [2024-06-21 20:08:56,669][15349] Signal inference workers to resume experience collection... (13350 times) [2024-06-21 20:08:56,688][15401] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-21 20:08:56,688][15401] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-21 20:08:57,315][15401] Updated weights for policy 0, policy_version 56020 (0.0035) [2024-06-21 20:08:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 917848064. Throughput: 0: 41653.3. Samples: 917989480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:08:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 20:09:01,070][15401] Updated weights for policy 0, policy_version 56030 (0.0050) [2024-06-21 20:09:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41710.1). Total num frames: 918077440. Throughput: 0: 41665.4. Samples: 918243200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 20:09:05,181][15401] Updated weights for policy 0, policy_version 56040 (0.0047) [2024-06-21 20:09:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 918274048. Throughput: 0: 41817.8. Samples: 918366860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 20:09:08,939][15401] Updated weights for policy 0, policy_version 56050 (0.0028) [2024-06-21 20:09:12,921][15401] Updated weights for policy 0, policy_version 56060 (0.0038) [2024-06-21 20:09:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41507.5, 300 sec: 41709.8). Total num frames: 918487040. Throughput: 0: 41666.7. Samples: 918616620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:13,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 20:09:16,843][15401] Updated weights for policy 0, policy_version 56070 (0.0035) [2024-06-21 20:09:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 918700032. Throughput: 0: 41593.9. Samples: 918866240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 20:09:20,737][15401] Updated weights for policy 0, policy_version 56080 (0.0035) [2024-06-21 20:09:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 40961.6, 300 sec: 41765.7). Total num frames: 918896640. Throughput: 0: 41584.8. Samples: 918992080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:23,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 20:09:24,456][15401] Updated weights for policy 0, policy_version 56090 (0.0036) [2024-06-21 20:09:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 919109632. Throughput: 0: 41570.3. Samples: 919240540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 20:09:28,842][15401] Updated weights for policy 0, policy_version 56100 (0.0034) [2024-06-21 20:09:32,384][15401] Updated weights for policy 0, policy_version 56110 (0.0036) [2024-06-21 20:09:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 919339008. Throughput: 0: 41591.0. Samples: 919487940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:09:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-21 20:09:37,048][15401] Updated weights for policy 0, policy_version 56120 (0.0042) [2024-06-21 20:09:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41654.6). Total num frames: 919519232. Throughput: 0: 41939.2. Samples: 919623240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:09:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 20:09:40,011][15401] Updated weights for policy 0, policy_version 56130 (0.0039) [2024-06-21 20:09:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 919732224. Throughput: 0: 41711.6. Samples: 919866500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:09:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 20:09:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056136_919732224.pth... [2024-06-21 20:09:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055525_909721600.pth [2024-06-21 20:09:44,762][15401] Updated weights for policy 0, policy_version 56140 (0.0033) [2024-06-21 20:09:48,029][15401] Updated weights for policy 0, policy_version 56150 (0.0027) [2024-06-21 20:09:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 919977984. Throughput: 0: 41658.7. Samples: 920117840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:09:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 20:09:52,591][15401] Updated weights for policy 0, policy_version 56160 (0.0034) [2024-06-21 20:09:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 920141824. Throughput: 0: 41870.6. Samples: 920251040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:09:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 20:09:55,914][15401] Updated weights for policy 0, policy_version 56170 (0.0034) [2024-06-21 20:09:58,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42323.6, 300 sec: 41820.5). Total num frames: 920387584. Throughput: 0: 41741.3. Samples: 920495080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:09:58,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 20:10:00,313][15401] Updated weights for policy 0, policy_version 56180 (0.0044) [2024-06-21 20:10:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 920584192. Throughput: 0: 41822.2. Samples: 920748240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:10:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 20:10:03,635][15401] Updated weights for policy 0, policy_version 56190 (0.0041) [2024-06-21 20:10:07,953][15401] Updated weights for policy 0, policy_version 56200 (0.0022) [2024-06-21 20:10:08,392][15132] Fps is (10 sec: 39321.6, 60 sec: 41777.5, 300 sec: 41709.6). Total num frames: 920780800. Throughput: 0: 41824.8. Samples: 920874300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 20:10:08,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 20:10:11,374][15401] Updated weights for policy 0, policy_version 56210 (0.0035) [2024-06-21 20:10:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 921026560. Throughput: 0: 41891.4. Samples: 921125660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 20:10:15,863][15401] Updated weights for policy 0, policy_version 56220 (0.0042) [2024-06-21 20:10:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 921206784. Throughput: 0: 42107.2. Samples: 921382760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 20:10:19,641][15401] Updated weights for policy 0, policy_version 56230 (0.0027) [2024-06-21 20:10:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 921419776. Throughput: 0: 41725.7. Samples: 921500900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 20:10:23,571][15401] Updated weights for policy 0, policy_version 56240 (0.0032) [2024-06-21 20:10:27,274][15401] Updated weights for policy 0, policy_version 56250 (0.0033) [2024-06-21 20:10:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 41876.8). Total num frames: 921649152. Throughput: 0: 42131.2. Samples: 921762400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 20:10:31,164][15401] Updated weights for policy 0, policy_version 56260 (0.0037) [2024-06-21 20:10:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 921845760. Throughput: 0: 42232.9. Samples: 922018320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 20:10:34,196][15349] Signal inference workers to stop experience collection... (13400 times) [2024-06-21 20:10:34,197][15349] Signal inference workers to resume experience collection... (13400 times) [2024-06-21 20:10:34,242][15401] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-21 20:10:34,243][15401] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-21 20:10:34,898][15401] Updated weights for policy 0, policy_version 56270 (0.0025) [2024-06-21 20:10:38,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 922058752. Throughput: 0: 41960.4. Samples: 922139260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:38,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-21 20:10:38,768][15401] Updated weights for policy 0, policy_version 56280 (0.0037) [2024-06-21 20:10:42,691][15401] Updated weights for policy 0, policy_version 56290 (0.0036) [2024-06-21 20:10:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.6, 300 sec: 41820.5). Total num frames: 922271744. Throughput: 0: 42256.0. Samples: 922396600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 20:10:43,392][15132] Avg episode reward: [(0, '0.235')] [2024-06-21 20:10:46,396][15401] Updated weights for policy 0, policy_version 56300 (0.0042) [2024-06-21 20:10:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41766.2). Total num frames: 922468352. Throughput: 0: 42310.6. Samples: 922652220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:10:48,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-21 20:10:50,394][15401] Updated weights for policy 0, policy_version 56310 (0.0043) [2024-06-21 20:10:53,393][15132] Fps is (10 sec: 44230.5, 60 sec: 42868.7, 300 sec: 41931.4). Total num frames: 922714112. Throughput: 0: 42325.8. Samples: 922779020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:10:53,394][15132] Avg episode reward: [(0, '0.257')] [2024-06-21 20:10:54,004][15401] Updated weights for policy 0, policy_version 56320 (0.0035) [2024-06-21 20:10:58,204][15401] Updated weights for policy 0, policy_version 56330 (0.0043) [2024-06-21 20:10:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42054.0, 300 sec: 41876.4). Total num frames: 922910720. Throughput: 0: 42434.3. Samples: 923035200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:10:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 20:11:01,655][15401] Updated weights for policy 0, policy_version 56340 (0.0036) [2024-06-21 20:11:03,389][15132] Fps is (10 sec: 39336.7, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 923107328. Throughput: 0: 42539.1. Samples: 923297020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:11:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 20:11:06,051][15401] Updated weights for policy 0, policy_version 56350 (0.0049) [2024-06-21 20:11:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42873.1, 300 sec: 41988.1). Total num frames: 923353088. Throughput: 0: 42560.3. Samples: 923416120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:11:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 20:11:09,271][15401] Updated weights for policy 0, policy_version 56360 (0.0047) [2024-06-21 20:11:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 923533312. Throughput: 0: 42475.5. Samples: 923673800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:11:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 20:11:13,764][15401] Updated weights for policy 0, policy_version 56370 (0.0039) [2024-06-21 20:11:16,865][15401] Updated weights for policy 0, policy_version 56380 (0.0032) [2024-06-21 20:11:18,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 923746304. Throughput: 0: 42347.1. Samples: 923923940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-21 20:11:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 20:11:21,596][15401] Updated weights for policy 0, policy_version 56390 (0.0025) [2024-06-21 20:11:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 923975680. Throughput: 0: 42417.4. Samples: 924048040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 20:11:24,996][15401] Updated weights for policy 0, policy_version 56400 (0.0045) [2024-06-21 20:11:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 924172288. Throughput: 0: 42461.4. Samples: 924307260. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 20:11:29,354][15401] Updated weights for policy 0, policy_version 56410 (0.0030) [2024-06-21 20:11:32,983][15401] Updated weights for policy 0, policy_version 56420 (0.0031) [2024-06-21 20:11:33,394][15132] Fps is (10 sec: 40939.5, 60 sec: 42321.8, 300 sec: 41931.2). Total num frames: 924385280. Throughput: 0: 42396.7. Samples: 924560280. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:33,395][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 20:11:36,929][15401] Updated weights for policy 0, policy_version 56430 (0.0037) [2024-06-21 20:11:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 924598272. Throughput: 0: 42373.4. Samples: 924685660. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 20:11:40,538][15401] Updated weights for policy 0, policy_version 56440 (0.0027) [2024-06-21 20:11:43,391][15132] Fps is (10 sec: 44251.8, 60 sec: 42599.0, 300 sec: 41931.7). Total num frames: 924827648. Throughput: 0: 42450.5. Samples: 924945540. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:43,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 20:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056447_924827648.pth... [2024-06-21 20:11:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000055832_914751488.pth [2024-06-21 20:11:44,635][15401] Updated weights for policy 0, policy_version 56450 (0.0047) [2024-06-21 20:11:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 925024256. Throughput: 0: 42104.0. Samples: 925191700. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 20:11:48,544][15401] Updated weights for policy 0, policy_version 56460 (0.0047) [2024-06-21 20:11:52,329][15401] Updated weights for policy 0, policy_version 56470 (0.0029) [2024-06-21 20:11:53,392][15132] Fps is (10 sec: 42594.7, 60 sec: 42326.3, 300 sec: 41987.1). Total num frames: 925253632. Throughput: 0: 42186.8. Samples: 925314620. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 20:11:53,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 20:11:56,293][15401] Updated weights for policy 0, policy_version 56480 (0.0025) [2024-06-21 20:11:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 41931.9). Total num frames: 925450240. Throughput: 0: 42214.6. Samples: 925573560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:11:58,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 20:11:58,563][15349] Signal inference workers to stop experience collection... (13450 times) [2024-06-21 20:11:58,588][15401] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-21 20:11:58,628][15349] Signal inference workers to resume experience collection... (13450 times) [2024-06-21 20:11:58,629][15401] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-21 20:12:00,221][15401] Updated weights for policy 0, policy_version 56490 (0.0042) [2024-06-21 20:12:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42043.3). Total num frames: 925663232. Throughput: 0: 42100.4. Samples: 925818460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 20:12:03,994][15401] Updated weights for policy 0, policy_version 56500 (0.0042) [2024-06-21 20:12:08,047][15401] Updated weights for policy 0, policy_version 56510 (0.0042) [2024-06-21 20:12:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 925859840. Throughput: 0: 42232.0. Samples: 925948480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 20:12:11,747][15401] Updated weights for policy 0, policy_version 56520 (0.0027) [2024-06-21 20:12:13,391][15132] Fps is (10 sec: 39314.7, 60 sec: 42051.0, 300 sec: 41876.1). Total num frames: 926056448. Throughput: 0: 42078.3. Samples: 926200860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:13,392][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 20:12:16,671][15401] Updated weights for policy 0, policy_version 56530 (0.0046) [2024-06-21 20:12:18,393][15132] Fps is (10 sec: 45858.3, 60 sec: 42868.8, 300 sec: 42099.0). Total num frames: 926318592. Throughput: 0: 42027.4. Samples: 926451460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:18,394][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 20:12:19,537][15401] Updated weights for policy 0, policy_version 56540 (0.0028) [2024-06-21 20:12:23,389][15132] Fps is (10 sec: 42606.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 926482432. Throughput: 0: 42098.2. Samples: 926580080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-21 20:12:24,305][15401] Updated weights for policy 0, policy_version 56550 (0.0049) [2024-06-21 20:12:27,351][15401] Updated weights for policy 0, policy_version 56560 (0.0034) [2024-06-21 20:12:28,390][15132] Fps is (10 sec: 37695.7, 60 sec: 42052.0, 300 sec: 41931.9). Total num frames: 926695424. Throughput: 0: 41934.9. Samples: 926832560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 20:12:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 20:12:31,795][15401] Updated weights for policy 0, policy_version 56570 (0.0048) [2024-06-21 20:12:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42328.8, 300 sec: 42043.4). Total num frames: 926924800. Throughput: 0: 42212.5. Samples: 927091260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 20:12:34,919][15401] Updated weights for policy 0, policy_version 56580 (0.0031) [2024-06-21 20:12:38,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 927121408. Throughput: 0: 42236.9. Samples: 927215180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-21 20:12:39,463][15401] Updated weights for policy 0, policy_version 56590 (0.0040) [2024-06-21 20:12:42,697][15401] Updated weights for policy 0, policy_version 56600 (0.0028) [2024-06-21 20:12:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42053.3, 300 sec: 42043.0). Total num frames: 927350784. Throughput: 0: 42062.6. Samples: 927466280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 20:12:47,036][15401] Updated weights for policy 0, policy_version 56610 (0.0032) [2024-06-21 20:12:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 927563776. Throughput: 0: 42221.9. Samples: 927718440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 20:12:50,383][15401] Updated weights for policy 0, policy_version 56620 (0.0032) [2024-06-21 20:12:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41780.9, 300 sec: 41987.5). Total num frames: 927760384. Throughput: 0: 42066.6. Samples: 927841480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 20:12:54,707][15401] Updated weights for policy 0, policy_version 56630 (0.0036) [2024-06-21 20:12:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42098.5). Total num frames: 927973376. Throughput: 0: 42022.2. Samples: 928091780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:12:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 20:12:58,464][15401] Updated weights for policy 0, policy_version 56640 (0.0030) [2024-06-21 20:13:02,584][15401] Updated weights for policy 0, policy_version 56650 (0.0031) [2024-06-21 20:13:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 928169984. Throughput: 0: 41920.8. Samples: 928337740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 20:13:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 20:13:06,683][15401] Updated weights for policy 0, policy_version 56660 (0.0054) [2024-06-21 20:13:08,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42050.5, 300 sec: 41987.4). Total num frames: 928382976. Throughput: 0: 41830.6. Samples: 928462560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:08,401][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 20:13:10,355][15401] Updated weights for policy 0, policy_version 56670 (0.0046) [2024-06-21 20:13:13,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42051.8, 300 sec: 42043.0). Total num frames: 928579584. Throughput: 0: 41714.9. Samples: 928709820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:13,401][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 20:13:14,676][15401] Updated weights for policy 0, policy_version 56680 (0.0040) [2024-06-21 20:13:16,171][15349] Signal inference workers to stop experience collection... (13500 times) [2024-06-21 20:13:16,174][15349] Signal inference workers to resume experience collection... (13500 times) [2024-06-21 20:13:16,189][15401] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-21 20:13:16,190][15401] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-21 20:13:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41235.6, 300 sec: 41876.7). Total num frames: 928792576. Throughput: 0: 41541.8. Samples: 928960640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 20:13:18,484][15401] Updated weights for policy 0, policy_version 56690 (0.0029) [2024-06-21 20:13:22,334][15401] Updated weights for policy 0, policy_version 56700 (0.0029) [2024-06-21 20:13:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 929005568. Throughput: 0: 41604.0. Samples: 929087360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 20:13:26,297][15401] Updated weights for policy 0, policy_version 56710 (0.0032) [2024-06-21 20:13:28,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42048.0, 300 sec: 42042.1). Total num frames: 929218560. Throughput: 0: 41543.1. Samples: 929335980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:28,396][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 20:13:30,080][15401] Updated weights for policy 0, policy_version 56720 (0.0042) [2024-06-21 20:13:33,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41504.4, 300 sec: 41931.6). Total num frames: 929415168. Throughput: 0: 41673.2. Samples: 929593840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:33,392][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 20:13:34,113][15401] Updated weights for policy 0, policy_version 56730 (0.0049) [2024-06-21 20:13:38,118][15401] Updated weights for policy 0, policy_version 56740 (0.0032) [2024-06-21 20:13:38,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 929644544. Throughput: 0: 41647.0. Samples: 929715600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 20:13:38,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-21 20:13:41,971][15401] Updated weights for policy 0, policy_version 56750 (0.0036) [2024-06-21 20:13:43,394][15132] Fps is (10 sec: 44227.7, 60 sec: 41776.1, 300 sec: 42097.9). Total num frames: 929857536. Throughput: 0: 41755.7. Samples: 929970980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:13:43,394][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 20:13:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056754_929857536.pth... [2024-06-21 20:13:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056136_919732224.pth [2024-06-21 20:13:45,765][15401] Updated weights for policy 0, policy_version 56760 (0.0032) [2024-06-21 20:13:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 930054144. Throughput: 0: 41778.5. Samples: 930217780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:13:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 20:13:49,735][15401] Updated weights for policy 0, policy_version 56770 (0.0031) [2024-06-21 20:13:53,321][15401] Updated weights for policy 0, policy_version 56780 (0.0048) [2024-06-21 20:13:53,389][15132] Fps is (10 sec: 42617.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 930283520. Throughput: 0: 41829.8. Samples: 930344800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:13:53,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-21 20:13:57,486][15401] Updated weights for policy 0, policy_version 56790 (0.0028) [2024-06-21 20:13:58,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42047.7, 300 sec: 42097.6). Total num frames: 930496512. Throughput: 0: 42061.2. Samples: 930602740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:13:58,396][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 20:14:01,293][15401] Updated weights for policy 0, policy_version 56800 (0.0022) [2024-06-21 20:14:03,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42047.7, 300 sec: 42097.6). Total num frames: 930693120. Throughput: 0: 42013.0. Samples: 930851500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:14:03,397][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 20:14:05,286][15401] Updated weights for policy 0, policy_version 56810 (0.0031) [2024-06-21 20:14:08,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42054.0, 300 sec: 42098.5). Total num frames: 930906112. Throughput: 0: 41972.0. Samples: 930976100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:14:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 20:14:08,929][15401] Updated weights for policy 0, policy_version 56820 (0.0040) [2024-06-21 20:14:13,195][15401] Updated weights for policy 0, policy_version 56830 (0.0031) [2024-06-21 20:14:13,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42327.0, 300 sec: 42098.5). Total num frames: 931119104. Throughput: 0: 42286.9. Samples: 931238620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 20:14:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 20:14:16,814][15401] Updated weights for policy 0, policy_version 56840 (0.0034) [2024-06-21 20:14:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42153.7). Total num frames: 931332096. Throughput: 0: 41981.8. Samples: 931483020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:18,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 20:14:20,936][15401] Updated weights for policy 0, policy_version 56850 (0.0046) [2024-06-21 20:14:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 931512320. Throughput: 0: 42058.7. Samples: 931608240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 20:14:24,541][15401] Updated weights for policy 0, policy_version 56860 (0.0030) [2024-06-21 20:14:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42056.8, 300 sec: 42043.0). Total num frames: 931741696. Throughput: 0: 42052.7. Samples: 931863160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 20:14:28,524][15401] Updated weights for policy 0, policy_version 56870 (0.0041) [2024-06-21 20:14:32,571][15401] Updated weights for policy 0, policy_version 56880 (0.0026) [2024-06-21 20:14:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.1, 300 sec: 42209.6). Total num frames: 931971072. Throughput: 0: 42246.3. Samples: 932118860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 20:14:36,413][15401] Updated weights for policy 0, policy_version 56890 (0.0040) [2024-06-21 20:14:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 42153.7). Total num frames: 932167680. Throughput: 0: 42339.1. Samples: 932250160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:38,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 20:14:40,275][15401] Updated weights for policy 0, policy_version 56900 (0.0039) [2024-06-21 20:14:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42055.4, 300 sec: 42043.0). Total num frames: 932380672. Throughput: 0: 42058.8. Samples: 932495120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 20:14:44,119][15401] Updated weights for policy 0, policy_version 56910 (0.0048) [2024-06-21 20:14:46,677][15349] Signal inference workers to stop experience collection... (13550 times) [2024-06-21 20:14:46,677][15349] Signal inference workers to resume experience collection... (13550 times) [2024-06-21 20:14:46,691][15401] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-21 20:14:46,691][15401] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-21 20:14:47,918][15401] Updated weights for policy 0, policy_version 56920 (0.0038) [2024-06-21 20:14:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 932577280. Throughput: 0: 42072.8. Samples: 932744500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:14:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 20:14:52,324][15401] Updated weights for policy 0, policy_version 56930 (0.0028) [2024-06-21 20:14:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 42043.0). Total num frames: 932790272. Throughput: 0: 42056.3. Samples: 932868740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:14:53,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 20:14:56,055][15401] Updated weights for policy 0, policy_version 56940 (0.0044) [2024-06-21 20:14:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41510.6, 300 sec: 42043.0). Total num frames: 932986880. Throughput: 0: 41744.0. Samples: 933117100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:14:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 20:15:00,288][15401] Updated weights for policy 0, policy_version 56950 (0.0034) [2024-06-21 20:15:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42056.8, 300 sec: 42154.4). Total num frames: 933216256. Throughput: 0: 41873.0. Samples: 933367200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:15:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 20:15:04,039][15401] Updated weights for policy 0, policy_version 56960 (0.0033) [2024-06-21 20:15:08,020][15401] Updated weights for policy 0, policy_version 56970 (0.0034) [2024-06-21 20:15:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41504.4, 300 sec: 41931.6). Total num frames: 933396480. Throughput: 0: 41863.9. Samples: 933492220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:15:08,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 20:15:11,670][15401] Updated weights for policy 0, policy_version 56980 (0.0028) [2024-06-21 20:15:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 933625856. Throughput: 0: 41813.8. Samples: 933744780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:15:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 20:15:15,735][15401] Updated weights for policy 0, policy_version 56990 (0.0037) [2024-06-21 20:15:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 41507.8, 300 sec: 42043.0). Total num frames: 933822464. Throughput: 0: 41628.4. Samples: 933992140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:15:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 20:15:19,514][15401] Updated weights for policy 0, policy_version 57000 (0.0043) [2024-06-21 20:15:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 934019072. Throughput: 0: 41385.8. Samples: 934112420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 20:15:23,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 20:15:23,931][15401] Updated weights for policy 0, policy_version 57010 (0.0037) [2024-06-21 20:15:27,673][15401] Updated weights for policy 0, policy_version 57020 (0.0028) [2024-06-21 20:15:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 934248448. Throughput: 0: 41482.2. Samples: 934361820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 20:15:31,754][15401] Updated weights for policy 0, policy_version 57030 (0.0036) [2024-06-21 20:15:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 934461440. Throughput: 0: 41547.0. Samples: 934614120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 20:15:35,224][15401] Updated weights for policy 0, policy_version 57040 (0.0035) [2024-06-21 20:15:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41234.8, 300 sec: 41932.3). Total num frames: 934641664. Throughput: 0: 41442.3. Samples: 934733540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 20:15:39,773][15401] Updated weights for policy 0, policy_version 57050 (0.0037) [2024-06-21 20:15:42,804][15401] Updated weights for policy 0, policy_version 57060 (0.0055) [2024-06-21 20:15:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 934871040. Throughput: 0: 41467.6. Samples: 934983140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 20:15:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057060_934871040.pth... [2024-06-21 20:15:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056447_924827648.pth [2024-06-21 20:15:47,583][15401] Updated weights for policy 0, policy_version 57070 (0.0038) [2024-06-21 20:15:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 935067648. Throughput: 0: 41573.3. Samples: 935238000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 20:15:50,635][15401] Updated weights for policy 0, policy_version 57080 (0.0034) [2024-06-21 20:15:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41507.9, 300 sec: 41931.9). Total num frames: 935280640. Throughput: 0: 41453.0. Samples: 935357500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 20:15:55,664][15401] Updated weights for policy 0, policy_version 57090 (0.0042) [2024-06-21 20:15:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 935493632. Throughput: 0: 41428.4. Samples: 935609060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 20:15:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 20:15:59,247][15401] Updated weights for policy 0, policy_version 57100 (0.0034) [2024-06-21 20:16:03,306][15401] Updated weights for policy 0, policy_version 57110 (0.0039) [2024-06-21 20:16:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 935690240. Throughput: 0: 41725.8. Samples: 935869800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 20:16:06,806][15401] Updated weights for policy 0, policy_version 57120 (0.0044) [2024-06-21 20:16:06,816][15349] Signal inference workers to stop experience collection... (13600 times) [2024-06-21 20:16:06,816][15349] Signal inference workers to resume experience collection... (13600 times) [2024-06-21 20:16:06,864][15401] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-21 20:16:06,864][15401] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-21 20:16:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41781.0, 300 sec: 41931.9). Total num frames: 935903232. Throughput: 0: 41751.6. Samples: 935991240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 20:16:10,871][15401] Updated weights for policy 0, policy_version 57130 (0.0030) [2024-06-21 20:16:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 936132608. Throughput: 0: 41775.7. Samples: 936241720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 20:16:14,574][15401] Updated weights for policy 0, policy_version 57140 (0.0031) [2024-06-21 20:16:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 936312832. Throughput: 0: 41865.9. Samples: 936498080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 20:16:18,584][15401] Updated weights for policy 0, policy_version 57150 (0.0032) [2024-06-21 20:16:22,287][15401] Updated weights for policy 0, policy_version 57160 (0.0033) [2024-06-21 20:16:23,389][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 936525824. Throughput: 0: 41907.0. Samples: 936619360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 20:16:26,355][15401] Updated weights for policy 0, policy_version 57170 (0.0037) [2024-06-21 20:16:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41932.6). Total num frames: 936755200. Throughput: 0: 42092.3. Samples: 936877300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 20:16:29,863][15401] Updated weights for policy 0, policy_version 57180 (0.0038) [2024-06-21 20:16:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 936951808. Throughput: 0: 42011.1. Samples: 937128500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 20:16:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 20:16:34,098][15401] Updated weights for policy 0, policy_version 57190 (0.0026) [2024-06-21 20:16:38,045][15401] Updated weights for policy 0, policy_version 57200 (0.0040) [2024-06-21 20:16:38,391][15132] Fps is (10 sec: 42591.5, 60 sec: 42324.1, 300 sec: 41876.4). Total num frames: 937181184. Throughput: 0: 42195.6. Samples: 937256380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:16:38,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 20:16:41,610][15401] Updated weights for policy 0, policy_version 57210 (0.0034) [2024-06-21 20:16:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41777.5, 300 sec: 41876.0). Total num frames: 937377792. Throughput: 0: 42207.9. Samples: 937508520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:16:43,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 20:16:45,702][15401] Updated weights for policy 0, policy_version 57220 (0.0033) [2024-06-21 20:16:48,389][15132] Fps is (10 sec: 40967.6, 60 sec: 42052.4, 300 sec: 41821.2). Total num frames: 937590784. Throughput: 0: 41990.0. Samples: 937759340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:16:48,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-21 20:16:49,272][15401] Updated weights for policy 0, policy_version 57230 (0.0022) [2024-06-21 20:16:53,380][15401] Updated weights for policy 0, policy_version 57240 (0.0043) [2024-06-21 20:16:53,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.3, 300 sec: 41932.3). Total num frames: 937820160. Throughput: 0: 42051.5. Samples: 937883560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:16:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 20:16:56,856][15401] Updated weights for policy 0, policy_version 57250 (0.0034) [2024-06-21 20:16:58,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42050.5, 300 sec: 41876.0). Total num frames: 938016768. Throughput: 0: 42010.1. Samples: 938132280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:16:58,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 20:17:01,231][15401] Updated weights for policy 0, policy_version 57260 (0.0044) [2024-06-21 20:17:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 938196992. Throughput: 0: 42111.6. Samples: 938393100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:17:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 20:17:04,829][15401] Updated weights for policy 0, policy_version 57270 (0.0047) [2024-06-21 20:17:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 41987.7). Total num frames: 938442752. Throughput: 0: 42103.9. Samples: 938514040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:17:08,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 20:17:09,018][15401] Updated weights for policy 0, policy_version 57280 (0.0046) [2024-06-21 20:17:12,685][15401] Updated weights for policy 0, policy_version 57290 (0.0044) [2024-06-21 20:17:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42050.5, 300 sec: 41821.0). Total num frames: 938655744. Throughput: 0: 41934.7. Samples: 938764460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:13,393][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 20:17:16,758][15401] Updated weights for policy 0, policy_version 57300 (0.0039) [2024-06-21 20:17:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 938852352. Throughput: 0: 42183.5. Samples: 939026760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 20:17:20,290][15401] Updated weights for policy 0, policy_version 57310 (0.0028) [2024-06-21 20:17:23,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 939065344. Throughput: 0: 41982.1. Samples: 939145500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 20:17:24,546][15401] Updated weights for policy 0, policy_version 57320 (0.0034) [2024-06-21 20:17:27,994][15401] Updated weights for policy 0, policy_version 57330 (0.0039) [2024-06-21 20:17:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 939294720. Throughput: 0: 42129.0. Samples: 939404220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 20:17:32,474][15401] Updated weights for policy 0, policy_version 57340 (0.0040) [2024-06-21 20:17:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 939474944. Throughput: 0: 42013.7. Samples: 939649960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 20:17:36,247][15401] Updated weights for policy 0, policy_version 57350 (0.0028) [2024-06-21 20:17:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42053.5, 300 sec: 41876.4). Total num frames: 939704320. Throughput: 0: 42026.7. Samples: 939774760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 20:17:40,246][15401] Updated weights for policy 0, policy_version 57360 (0.0040) [2024-06-21 20:17:42,093][15349] Signal inference workers to stop experience collection... (13650 times) [2024-06-21 20:17:42,127][15401] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-21 20:17:42,152][15349] Signal inference workers to resume experience collection... (13650 times) [2024-06-21 20:17:42,156][15401] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-21 20:17:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42054.0, 300 sec: 41820.8). Total num frames: 939900928. Throughput: 0: 42356.1. Samples: 940038200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 20:17:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057368_939917312.pth... [2024-06-21 20:17:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000056754_929857536.pth [2024-06-21 20:17:43,876][15401] Updated weights for policy 0, policy_version 57370 (0.0027) [2024-06-21 20:17:48,014][15401] Updated weights for policy 0, policy_version 57380 (0.0038) [2024-06-21 20:17:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 940113920. Throughput: 0: 41945.7. Samples: 940280660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-21 20:17:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 20:17:51,554][15401] Updated weights for policy 0, policy_version 57390 (0.0045) [2024-06-21 20:17:53,392][15132] Fps is (10 sec: 42588.8, 60 sec: 41777.7, 300 sec: 41876.1). Total num frames: 940326912. Throughput: 0: 42030.5. Samples: 940405500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:17:53,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 20:17:56,141][15401] Updated weights for policy 0, policy_version 57400 (0.0041) [2024-06-21 20:17:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.0, 300 sec: 41931.9). Total num frames: 940539904. Throughput: 0: 42078.3. Samples: 940657880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:17:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 20:17:59,376][15401] Updated weights for policy 0, policy_version 57410 (0.0034) [2024-06-21 20:18:03,389][15132] Fps is (10 sec: 40968.9, 60 sec: 42325.3, 300 sec: 41876.7). Total num frames: 940736512. Throughput: 0: 41932.9. Samples: 940913740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:18:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 20:18:03,885][15401] Updated weights for policy 0, policy_version 57420 (0.0044) [2024-06-21 20:18:07,020][15401] Updated weights for policy 0, policy_version 57430 (0.0027) [2024-06-21 20:18:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41932.3). Total num frames: 940949504. Throughput: 0: 41949.7. Samples: 941033240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:18:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 20:18:12,054][15401] Updated weights for policy 0, policy_version 57440 (0.0033) [2024-06-21 20:18:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42052.3, 300 sec: 41987.1). Total num frames: 941178880. Throughput: 0: 41914.6. Samples: 941290480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:18:13,392][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 20:18:14,995][15401] Updated weights for policy 0, policy_version 57450 (0.0044) [2024-06-21 20:18:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 941359104. Throughput: 0: 42051.5. Samples: 941542280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:18:18,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 20:18:19,906][15401] Updated weights for policy 0, policy_version 57460 (0.0030) [2024-06-21 20:18:22,906][15401] Updated weights for policy 0, policy_version 57470 (0.0032) [2024-06-21 20:18:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.3, 300 sec: 41988.4). Total num frames: 941604864. Throughput: 0: 41892.9. Samples: 941659940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 20:18:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 20:18:27,819][15401] Updated weights for policy 0, policy_version 57480 (0.0039) [2024-06-21 20:18:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41876.7). Total num frames: 941768704. Throughput: 0: 41749.7. Samples: 941916940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 20:18:30,645][15401] Updated weights for policy 0, policy_version 57490 (0.0039) [2024-06-21 20:18:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 941998080. Throughput: 0: 41900.9. Samples: 942166200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 20:18:35,477][15401] Updated weights for policy 0, policy_version 57500 (0.0034) [2024-06-21 20:18:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.2, 300 sec: 41932.6). Total num frames: 942227456. Throughput: 0: 41927.0. Samples: 942292120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 20:18:38,747][15401] Updated weights for policy 0, policy_version 57510 (0.0036) [2024-06-21 20:18:43,065][15401] Updated weights for policy 0, policy_version 57520 (0.0031) [2024-06-21 20:18:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 942407680. Throughput: 0: 41876.9. Samples: 942542340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 20:18:46,573][15401] Updated weights for policy 0, policy_version 57530 (0.0045) [2024-06-21 20:18:48,392][15132] Fps is (10 sec: 37674.0, 60 sec: 41504.5, 300 sec: 41765.0). Total num frames: 942604288. Throughput: 0: 41693.3. Samples: 942790040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:48,393][15132] Avg episode reward: [(0, '0.201')] [2024-06-21 20:18:51,080][15401] Updated weights for policy 0, policy_version 57540 (0.0038) [2024-06-21 20:18:53,060][15349] Signal inference workers to stop experience collection... (13700 times) [2024-06-21 20:18:53,061][15349] Signal inference workers to resume experience collection... (13700 times) [2024-06-21 20:18:53,081][15401] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-21 20:18:53,109][15401] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-21 20:18:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42326.9, 300 sec: 41932.8). Total num frames: 942866432. Throughput: 0: 41865.8. Samples: 942917200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-21 20:18:54,262][15401] Updated weights for policy 0, policy_version 57550 (0.0031) [2024-06-21 20:18:58,390][15132] Fps is (10 sec: 44247.2, 60 sec: 41779.2, 300 sec: 41877.3). Total num frames: 943046656. Throughput: 0: 41869.3. Samples: 943174500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:18:58,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 20:18:58,681][15401] Updated weights for policy 0, policy_version 57560 (0.0039) [2024-06-21 20:19:02,049][15401] Updated weights for policy 0, policy_version 57570 (0.0030) [2024-06-21 20:19:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 943243264. Throughput: 0: 41833.7. Samples: 943424800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:03,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-21 20:19:06,335][15401] Updated weights for policy 0, policy_version 57580 (0.0036) [2024-06-21 20:19:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 41931.6). Total num frames: 943489024. Throughput: 0: 42079.0. Samples: 943553600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:08,392][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 20:19:09,931][15401] Updated weights for policy 0, policy_version 57590 (0.0046) [2024-06-21 20:19:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41234.7, 300 sec: 41765.7). Total num frames: 943652864. Throughput: 0: 41975.6. Samples: 943805840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 20:19:14,147][15401] Updated weights for policy 0, policy_version 57600 (0.0029) [2024-06-21 20:19:17,846][15401] Updated weights for policy 0, policy_version 57610 (0.0031) [2024-06-21 20:19:18,392][15132] Fps is (10 sec: 39321.7, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 943882240. Throughput: 0: 41904.0. Samples: 944051980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:18,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 20:19:21,898][15401] Updated weights for policy 0, policy_version 57620 (0.0034) [2024-06-21 20:19:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 944128000. Throughput: 0: 41985.8. Samples: 944181480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 20:19:25,735][15401] Updated weights for policy 0, policy_version 57630 (0.0036) [2024-06-21 20:19:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 944308224. Throughput: 0: 42118.2. Samples: 944437660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 20:19:29,935][15401] Updated weights for policy 0, policy_version 57640 (0.0044) [2024-06-21 20:19:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 41821.2). Total num frames: 944504832. Throughput: 0: 42206.8. Samples: 944689240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:19:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 20:19:33,681][15401] Updated weights for policy 0, policy_version 57650 (0.0030) [2024-06-21 20:19:37,607][15401] Updated weights for policy 0, policy_version 57660 (0.0043) [2024-06-21 20:19:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 944750592. Throughput: 0: 42186.3. Samples: 944815580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:19:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 20:19:41,470][15401] Updated weights for policy 0, policy_version 57670 (0.0031) [2024-06-21 20:19:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 944947200. Throughput: 0: 42164.0. Samples: 945071880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:19:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 20:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057675_944947200.pth... [2024-06-21 20:19:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057060_934871040.pth [2024-06-21 20:19:45,105][15401] Updated weights for policy 0, policy_version 57680 (0.0052) [2024-06-21 20:19:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.1, 300 sec: 41932.3). Total num frames: 945160192. Throughput: 0: 42148.5. Samples: 945321480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:19:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 20:19:49,369][15401] Updated weights for policy 0, policy_version 57690 (0.0045) [2024-06-21 20:19:52,754][15401] Updated weights for policy 0, policy_version 57700 (0.0041) [2024-06-21 20:19:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 945373184. Throughput: 0: 42233.4. Samples: 945454000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:19:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 20:19:57,046][15401] Updated weights for policy 0, policy_version 57710 (0.0036) [2024-06-21 20:19:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 945569792. Throughput: 0: 42285.4. Samples: 945708680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:19:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 20:20:00,520][15401] Updated weights for policy 0, policy_version 57720 (0.0035) [2024-06-21 20:20:01,029][15349] Signal inference workers to stop experience collection... (13750 times) [2024-06-21 20:20:01,029][15349] Signal inference workers to resume experience collection... (13750 times) [2024-06-21 20:20:01,046][15401] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-21 20:20:01,046][15401] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-21 20:20:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42043.4). Total num frames: 945799168. Throughput: 0: 42272.1. Samples: 945954120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:20:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 20:20:04,648][15401] Updated weights for policy 0, policy_version 57730 (0.0032) [2024-06-21 20:20:08,225][15401] Updated weights for policy 0, policy_version 57740 (0.0050) [2024-06-21 20:20:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42054.0, 300 sec: 41987.5). Total num frames: 946012160. Throughput: 0: 42367.2. Samples: 946088000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 20:20:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 20:20:12,152][15401] Updated weights for policy 0, policy_version 57750 (0.0038) [2024-06-21 20:20:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 946192384. Throughput: 0: 42133.8. Samples: 946333680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 20:20:15,878][15401] Updated weights for policy 0, policy_version 57760 (0.0025) [2024-06-21 20:20:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.2, 300 sec: 42098.6). Total num frames: 946438144. Throughput: 0: 42188.0. Samples: 946587700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 20:20:19,911][15401] Updated weights for policy 0, policy_version 57770 (0.0037) [2024-06-21 20:20:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 946634752. Throughput: 0: 42404.9. Samples: 946723800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 20:20:23,654][15401] Updated weights for policy 0, policy_version 57780 (0.0043) [2024-06-21 20:20:27,554][15401] Updated weights for policy 0, policy_version 57790 (0.0038) [2024-06-21 20:20:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 946847744. Throughput: 0: 42209.9. Samples: 946971320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 20:20:31,527][15401] Updated weights for policy 0, policy_version 57800 (0.0025) [2024-06-21 20:20:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 947077120. Throughput: 0: 42062.3. Samples: 947214280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 20:20:35,157][15401] Updated weights for policy 0, policy_version 57810 (0.0026) [2024-06-21 20:20:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 947240960. Throughput: 0: 42040.9. Samples: 947345840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 20:20:39,456][15401] Updated weights for policy 0, policy_version 57820 (0.0036) [2024-06-21 20:20:42,692][15401] Updated weights for policy 0, policy_version 57830 (0.0042) [2024-06-21 20:20:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 947503104. Throughput: 0: 41958.6. Samples: 947596820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-21 20:20:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 20:20:47,366][15401] Updated weights for policy 0, policy_version 57840 (0.0033) [2024-06-21 20:20:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 947699712. Throughput: 0: 42178.5. Samples: 947852160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:20:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-21 20:20:50,800][15401] Updated weights for policy 0, policy_version 57850 (0.0034) [2024-06-21 20:20:53,389][15132] Fps is (10 sec: 36044.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 947863552. Throughput: 0: 41874.7. Samples: 947972360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:20:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 20:20:55,088][15401] Updated weights for policy 0, policy_version 57860 (0.0041) [2024-06-21 20:20:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 948109312. Throughput: 0: 42019.6. Samples: 948224560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:20:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 20:20:58,812][15401] Updated weights for policy 0, policy_version 57870 (0.0043) [2024-06-21 20:21:02,737][15401] Updated weights for policy 0, policy_version 57880 (0.0035) [2024-06-21 20:21:03,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 948338688. Throughput: 0: 42080.7. Samples: 948481340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:21:03,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 20:21:06,824][15401] Updated weights for policy 0, policy_version 57890 (0.0024) [2024-06-21 20:21:08,391][15132] Fps is (10 sec: 39316.6, 60 sec: 41505.2, 300 sec: 41931.7). Total num frames: 948502528. Throughput: 0: 41943.7. Samples: 948611320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:21:08,391][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 20:21:10,431][15401] Updated weights for policy 0, policy_version 57900 (0.0042) [2024-06-21 20:21:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 948731904. Throughput: 0: 41822.2. Samples: 948853320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:21:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 20:21:14,715][15401] Updated weights for policy 0, policy_version 57910 (0.0035) [2024-06-21 20:21:17,195][15349] Signal inference workers to stop experience collection... (13800 times) [2024-06-21 20:21:17,217][15401] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-21 20:21:17,253][15349] Signal inference workers to resume experience collection... (13800 times) [2024-06-21 20:21:17,254][15401] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-21 20:21:18,089][15401] Updated weights for policy 0, policy_version 57920 (0.0027) [2024-06-21 20:21:18,390][15132] Fps is (10 sec: 45878.3, 60 sec: 42051.8, 300 sec: 42154.0). Total num frames: 948961280. Throughput: 0: 42074.6. Samples: 949107660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:21:18,391][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 20:21:22,687][15401] Updated weights for policy 0, policy_version 57930 (0.0032) [2024-06-21 20:21:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 949141504. Throughput: 0: 42135.1. Samples: 949241920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-21 20:21:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 20:21:25,684][15401] Updated weights for policy 0, policy_version 57940 (0.0040) [2024-06-21 20:21:28,396][15132] Fps is (10 sec: 40936.0, 60 sec: 42047.8, 300 sec: 42097.6). Total num frames: 949370880. Throughput: 0: 42018.0. Samples: 949487900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:28,396][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 20:21:30,726][15401] Updated weights for policy 0, policy_version 57950 (0.0037) [2024-06-21 20:21:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 42043.3). Total num frames: 949583872. Throughput: 0: 41937.0. Samples: 949739320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 20:21:33,881][15401] Updated weights for policy 0, policy_version 57960 (0.0045) [2024-06-21 20:21:38,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42052.3, 300 sec: 41987.8). Total num frames: 949764096. Throughput: 0: 42131.5. Samples: 949868280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 20:21:38,488][15401] Updated weights for policy 0, policy_version 57970 (0.0037) [2024-06-21 20:21:41,606][15401] Updated weights for policy 0, policy_version 57980 (0.0041) [2024-06-21 20:21:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 950009856. Throughput: 0: 42115.1. Samples: 950119740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 20:21:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057985_950026240.pth... [2024-06-21 20:21:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057368_939917312.pth [2024-06-21 20:21:46,293][15401] Updated weights for policy 0, policy_version 57990 (0.0033) [2024-06-21 20:21:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 41777.6, 300 sec: 41987.1). Total num frames: 950206464. Throughput: 0: 42097.4. Samples: 950375820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:48,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 20:21:49,270][15401] Updated weights for policy 0, policy_version 58000 (0.0038) [2024-06-21 20:21:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 41987.8). Total num frames: 950403072. Throughput: 0: 41937.5. Samples: 950498460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 20:21:53,890][15401] Updated weights for policy 0, policy_version 58010 (0.0044) [2024-06-21 20:21:57,307][15401] Updated weights for policy 0, policy_version 58020 (0.0043) [2024-06-21 20:21:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 950648832. Throughput: 0: 42320.5. Samples: 950757740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:21:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 20:22:01,459][15401] Updated weights for policy 0, policy_version 58030 (0.0040) [2024-06-21 20:22:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 950845440. Throughput: 0: 42286.2. Samples: 951010520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 20:22:04,968][15401] Updated weights for policy 0, policy_version 58040 (0.0036) [2024-06-21 20:22:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42326.1, 300 sec: 41987.8). Total num frames: 951042048. Throughput: 0: 42071.4. Samples: 951135140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 20:22:09,330][15401] Updated weights for policy 0, policy_version 58050 (0.0031) [2024-06-21 20:22:12,536][15401] Updated weights for policy 0, policy_version 58060 (0.0036) [2024-06-21 20:22:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 951271424. Throughput: 0: 42164.3. Samples: 951385020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 20:22:17,083][15401] Updated weights for policy 0, policy_version 58070 (0.0029) [2024-06-21 20:22:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.6, 300 sec: 42043.0). Total num frames: 951468032. Throughput: 0: 42328.8. Samples: 951644120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:18,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 20:22:20,267][15401] Updated weights for policy 0, policy_version 58080 (0.0032) [2024-06-21 20:22:23,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 41987.1). Total num frames: 951681024. Throughput: 0: 42137.2. Samples: 951764560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:23,393][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 20:22:24,941][15401] Updated weights for policy 0, policy_version 58090 (0.0036) [2024-06-21 20:22:28,115][15401] Updated weights for policy 0, policy_version 58100 (0.0039) [2024-06-21 20:22:28,128][15349] Signal inference workers to stop experience collection... (13850 times) [2024-06-21 20:22:28,129][15349] Signal inference workers to resume experience collection... (13850 times) [2024-06-21 20:22:28,148][15401] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-21 20:22:28,148][15401] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-21 20:22:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42602.9, 300 sec: 42209.6). Total num frames: 951926784. Throughput: 0: 42193.7. Samples: 952018460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:28,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-21 20:22:32,808][15401] Updated weights for policy 0, policy_version 58110 (0.0032) [2024-06-21 20:22:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 952090624. Throughput: 0: 42302.8. Samples: 952279340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:22:33,396][15132] Avg episode reward: [(0, '0.150')] [2024-06-21 20:22:35,700][15401] Updated weights for policy 0, policy_version 58120 (0.0031) [2024-06-21 20:22:38,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 952287232. Throughput: 0: 42110.7. Samples: 952393440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:22:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 20:22:40,573][15401] Updated weights for policy 0, policy_version 58130 (0.0042) [2024-06-21 20:22:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42050.5, 300 sec: 42098.2). Total num frames: 952532992. Throughput: 0: 42114.2. Samples: 952652980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:22:43,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 20:22:43,840][15401] Updated weights for policy 0, policy_version 58140 (0.0040) [2024-06-21 20:22:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41780.9, 300 sec: 41987.8). Total num frames: 952713216. Throughput: 0: 42061.5. Samples: 952903280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:22:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 20:22:48,516][15401] Updated weights for policy 0, policy_version 58150 (0.0038) [2024-06-21 20:22:51,550][15401] Updated weights for policy 0, policy_version 58160 (0.0043) [2024-06-21 20:22:53,390][15132] Fps is (10 sec: 40967.2, 60 sec: 42324.9, 300 sec: 42042.9). Total num frames: 952942592. Throughput: 0: 42025.2. Samples: 953026300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:22:53,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 20:22:56,081][15401] Updated weights for policy 0, policy_version 58170 (0.0044) [2024-06-21 20:22:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 953155584. Throughput: 0: 42228.4. Samples: 953285300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:22:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 20:22:59,375][15401] Updated weights for policy 0, policy_version 58180 (0.0030) [2024-06-21 20:23:03,389][15132] Fps is (10 sec: 42601.6, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 953368576. Throughput: 0: 42006.3. Samples: 953534400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:23:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 20:23:03,599][15401] Updated weights for policy 0, policy_version 58190 (0.0033) [2024-06-21 20:23:07,285][15401] Updated weights for policy 0, policy_version 58200 (0.0035) [2024-06-21 20:23:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42098.9). Total num frames: 953597952. Throughput: 0: 42156.0. Samples: 953661480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:23:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 20:23:11,729][15401] Updated weights for policy 0, policy_version 58210 (0.0041) [2024-06-21 20:23:13,389][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 953761792. Throughput: 0: 42132.4. Samples: 953914420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:23:13,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-21 20:23:14,964][15401] Updated weights for policy 0, policy_version 58220 (0.0038) [2024-06-21 20:23:18,391][15132] Fps is (10 sec: 40954.7, 60 sec: 42324.4, 300 sec: 42042.8). Total num frames: 954007552. Throughput: 0: 41923.6. Samples: 954165960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:18,391][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 20:23:19,366][15401] Updated weights for policy 0, policy_version 58230 (0.0027) [2024-06-21 20:23:22,544][15401] Updated weights for policy 0, policy_version 58240 (0.0033) [2024-06-21 20:23:23,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42325.3, 300 sec: 42209.3). Total num frames: 954220544. Throughput: 0: 42291.5. Samples: 954296660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:23,393][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 20:23:26,833][15401] Updated weights for policy 0, policy_version 58250 (0.0033) [2024-06-21 20:23:28,390][15132] Fps is (10 sec: 40965.0, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 954417152. Throughput: 0: 42028.4. Samples: 954544160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 20:23:30,531][15401] Updated weights for policy 0, policy_version 58260 (0.0035) [2024-06-21 20:23:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 954630144. Throughput: 0: 42213.6. Samples: 954802900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 20:23:34,534][15401] Updated weights for policy 0, policy_version 58270 (0.0038) [2024-06-21 20:23:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 954843136. Throughput: 0: 42295.7. Samples: 954929580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 20:23:38,437][15401] Updated weights for policy 0, policy_version 58280 (0.0041) [2024-06-21 20:23:39,731][15349] Signal inference workers to stop experience collection... (13900 times) [2024-06-21 20:23:39,731][15349] Signal inference workers to resume experience collection... (13900 times) [2024-06-21 20:23:39,758][15401] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-21 20:23:39,758][15401] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-21 20:23:42,343][15401] Updated weights for policy 0, policy_version 58290 (0.0041) [2024-06-21 20:23:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42210.0). Total num frames: 955056128. Throughput: 0: 42129.3. Samples: 955181120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 20:23:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058292_955056128.pth... [2024-06-21 20:23:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057675_944947200.pth [2024-06-21 20:23:46,247][15401] Updated weights for policy 0, policy_version 58300 (0.0033) [2024-06-21 20:23:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 955252736. Throughput: 0: 42165.6. Samples: 955431860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:23:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 20:23:50,095][15401] Updated weights for policy 0, policy_version 58310 (0.0047) [2024-06-21 20:23:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.8, 300 sec: 42154.1). Total num frames: 955482112. Throughput: 0: 42040.5. Samples: 955553300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:23:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 20:23:54,093][15401] Updated weights for policy 0, policy_version 58320 (0.0030) [2024-06-21 20:23:57,771][15401] Updated weights for policy 0, policy_version 58330 (0.0045) [2024-06-21 20:23:58,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42324.9, 300 sec: 42209.6). Total num frames: 955695104. Throughput: 0: 42010.6. Samples: 955804920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:23:58,391][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 20:24:02,073][15401] Updated weights for policy 0, policy_version 58340 (0.0040) [2024-06-21 20:24:03,392][15132] Fps is (10 sec: 39312.0, 60 sec: 41777.4, 300 sec: 41987.5). Total num frames: 955875328. Throughput: 0: 42078.9. Samples: 956059560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:24:03,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 20:24:05,413][15401] Updated weights for policy 0, policy_version 58350 (0.0037) [2024-06-21 20:24:08,390][15132] Fps is (10 sec: 40961.0, 60 sec: 41779.0, 300 sec: 42209.6). Total num frames: 956104704. Throughput: 0: 41866.4. Samples: 956180560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:24:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 20:24:09,775][15401] Updated weights for policy 0, policy_version 58360 (0.0028) [2024-06-21 20:24:13,303][15401] Updated weights for policy 0, policy_version 58370 (0.0040) [2024-06-21 20:24:13,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42210.0). Total num frames: 956334080. Throughput: 0: 42019.6. Samples: 956435040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:24:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 20:24:17,720][15401] Updated weights for policy 0, policy_version 58380 (0.0044) [2024-06-21 20:24:18,389][15132] Fps is (10 sec: 40961.6, 60 sec: 41780.2, 300 sec: 41987.5). Total num frames: 956514304. Throughput: 0: 41960.2. Samples: 956691100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:24:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 20:24:21,209][15401] Updated weights for policy 0, policy_version 58390 (0.0037) [2024-06-21 20:24:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42054.0, 300 sec: 42154.1). Total num frames: 956743680. Throughput: 0: 41820.1. Samples: 956811480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-21 20:24:23,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-21 20:24:25,442][15401] Updated weights for policy 0, policy_version 58400 (0.0025) [2024-06-21 20:24:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 956940288. Throughput: 0: 42035.3. Samples: 957072700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 20:24:28,807][15401] Updated weights for policy 0, policy_version 58410 (0.0026) [2024-06-21 20:24:33,336][15401] Updated weights for policy 0, policy_version 58420 (0.0046) [2024-06-21 20:24:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 957153280. Throughput: 0: 42050.3. Samples: 957324120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 20:24:36,528][15401] Updated weights for policy 0, policy_version 58430 (0.0036) [2024-06-21 20:24:38,389][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 957382656. Throughput: 0: 42057.3. Samples: 957445880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 20:24:41,062][15401] Updated weights for policy 0, policy_version 58440 (0.0043) [2024-06-21 20:24:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 957562880. Throughput: 0: 42209.9. Samples: 957704340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 20:24:44,550][15401] Updated weights for policy 0, policy_version 58450 (0.0038) [2024-06-21 20:24:48,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.6, 300 sec: 42042.7). Total num frames: 957775872. Throughput: 0: 42068.0. Samples: 957952620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:48,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 20:24:48,809][15401] Updated weights for policy 0, policy_version 58460 (0.0044) [2024-06-21 20:24:52,567][15401] Updated weights for policy 0, policy_version 58470 (0.0039) [2024-06-21 20:24:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 958005248. Throughput: 0: 42226.6. Samples: 958080740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:53,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-21 20:24:57,172][15401] Updated weights for policy 0, policy_version 58480 (0.0031) [2024-06-21 20:24:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.6, 300 sec: 42043.0). Total num frames: 958201856. Throughput: 0: 42324.0. Samples: 958339620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 20:24:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 20:25:00,430][15401] Updated weights for policy 0, policy_version 58490 (0.0041) [2024-06-21 20:25:00,856][15349] Signal inference workers to stop experience collection... (13950 times) [2024-06-21 20:25:00,856][15349] Signal inference workers to resume experience collection... (13950 times) [2024-06-21 20:25:00,888][15401] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-21 20:25:00,888][15401] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-21 20:25:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42098.5). Total num frames: 958431232. Throughput: 0: 41938.2. Samples: 958578320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 20:25:04,935][15401] Updated weights for policy 0, policy_version 58500 (0.0034) [2024-06-21 20:25:08,270][15401] Updated weights for policy 0, policy_version 58510 (0.0048) [2024-06-21 20:25:08,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.8, 300 sec: 42153.7). Total num frames: 958627840. Throughput: 0: 42135.0. Samples: 958707660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:08,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 20:25:12,827][15401] Updated weights for policy 0, policy_version 58520 (0.0034) [2024-06-21 20:25:13,396][15132] Fps is (10 sec: 37658.6, 60 sec: 41228.6, 300 sec: 41931.0). Total num frames: 958808064. Throughput: 0: 42029.4. Samples: 958964300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:13,396][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 20:25:15,975][15401] Updated weights for policy 0, policy_version 58530 (0.0038) [2024-06-21 20:25:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 959086592. Throughput: 0: 41765.7. Samples: 959203580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 20:25:20,396][15401] Updated weights for policy 0, policy_version 58540 (0.0042) [2024-06-21 20:25:23,390][15132] Fps is (10 sec: 44265.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 959250432. Throughput: 0: 42182.6. Samples: 959344100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 20:25:23,729][15401] Updated weights for policy 0, policy_version 58550 (0.0032) [2024-06-21 20:25:28,264][15401] Updated weights for policy 0, policy_version 58560 (0.0042) [2024-06-21 20:25:28,390][15132] Fps is (10 sec: 36044.6, 60 sec: 41779.0, 300 sec: 41931.9). Total num frames: 959447040. Throughput: 0: 41894.5. Samples: 959589600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 20:25:31,550][15401] Updated weights for policy 0, policy_version 58570 (0.0038) [2024-06-21 20:25:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 959709184. Throughput: 0: 41845.4. Samples: 959835560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 20:25:35,900][15401] Updated weights for policy 0, policy_version 58580 (0.0047) [2024-06-21 20:25:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 959873024. Throughput: 0: 41887.1. Samples: 959965660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 20:25:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-21 20:25:39,443][15401] Updated weights for policy 0, policy_version 58590 (0.0042) [2024-06-21 20:25:43,389][15132] Fps is (10 sec: 36044.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 960069632. Throughput: 0: 41649.3. Samples: 960213840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:25:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 20:25:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058598_960069632.pth... [2024-06-21 20:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000057985_950026240.pth [2024-06-21 20:25:43,982][15401] Updated weights for policy 0, policy_version 58600 (0.0026) [2024-06-21 20:25:47,214][15401] Updated weights for policy 0, policy_version 58610 (0.0031) [2024-06-21 20:25:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.2, 300 sec: 42265.2). Total num frames: 960331776. Throughput: 0: 41721.8. Samples: 960455800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:25:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 20:25:51,737][15401] Updated weights for policy 0, policy_version 58620 (0.0044) [2024-06-21 20:25:53,392][15132] Fps is (10 sec: 44225.4, 60 sec: 41777.4, 300 sec: 42042.6). Total num frames: 960512000. Throughput: 0: 41824.3. Samples: 960589760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:25:53,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 20:25:55,037][15401] Updated weights for policy 0, policy_version 58630 (0.0044) [2024-06-21 20:25:58,392][15132] Fps is (10 sec: 37675.2, 60 sec: 41777.8, 300 sec: 41931.7). Total num frames: 960708608. Throughput: 0: 41637.0. Samples: 960837780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:25:58,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 20:25:59,439][15401] Updated weights for policy 0, policy_version 58640 (0.0028) [2024-06-21 20:26:02,819][15401] Updated weights for policy 0, policy_version 58650 (0.0031) [2024-06-21 20:26:03,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42052.2, 300 sec: 42209.8). Total num frames: 960954368. Throughput: 0: 41924.5. Samples: 961090180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:26:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 20:26:07,005][15401] Updated weights for policy 0, policy_version 58660 (0.0047) [2024-06-21 20:26:08,390][15132] Fps is (10 sec: 42606.7, 60 sec: 41780.8, 300 sec: 42043.0). Total num frames: 961134592. Throughput: 0: 41822.2. Samples: 961226100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:26:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 20:26:10,522][15349] Signal inference workers to stop experience collection... (14000 times) [2024-06-21 20:26:10,523][15349] Signal inference workers to resume experience collection... (14000 times) [2024-06-21 20:26:10,541][15401] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-21 20:26:10,541][15401] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-21 20:26:10,729][15401] Updated weights for policy 0, policy_version 58670 (0.0040) [2024-06-21 20:26:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42056.8, 300 sec: 41932.0). Total num frames: 961331200. Throughput: 0: 41645.0. Samples: 961463620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-21 20:26:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 20:26:14,731][15401] Updated weights for policy 0, policy_version 58680 (0.0047) [2024-06-21 20:26:18,390][15132] Fps is (10 sec: 42595.1, 60 sec: 41232.5, 300 sec: 42098.4). Total num frames: 961560576. Throughput: 0: 41993.9. Samples: 961725320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:18,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 20:26:18,426][15401] Updated weights for policy 0, policy_version 58690 (0.0042) [2024-06-21 20:26:22,698][15401] Updated weights for policy 0, policy_version 58700 (0.0031) [2024-06-21 20:26:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41988.4). Total num frames: 961757184. Throughput: 0: 41931.9. Samples: 961852600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 20:26:26,214][15401] Updated weights for policy 0, policy_version 58710 (0.0036) [2024-06-21 20:26:28,390][15132] Fps is (10 sec: 42601.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 961986560. Throughput: 0: 41875.5. Samples: 962098240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 20:26:30,693][15401] Updated weights for policy 0, policy_version 58720 (0.0034) [2024-06-21 20:26:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 962199552. Throughput: 0: 42310.6. Samples: 962359780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 20:26:33,970][15401] Updated weights for policy 0, policy_version 58730 (0.0047) [2024-06-21 20:26:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 962379776. Throughput: 0: 42074.0. Samples: 962482980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 20:26:38,422][15401] Updated weights for policy 0, policy_version 58740 (0.0039) [2024-06-21 20:26:41,941][15401] Updated weights for policy 0, policy_version 58750 (0.0031) [2024-06-21 20:26:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42154.4). Total num frames: 962641920. Throughput: 0: 42170.4. Samples: 962735360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 20:26:46,354][15401] Updated weights for policy 0, policy_version 58760 (0.0051) [2024-06-21 20:26:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 962805760. Throughput: 0: 42089.4. Samples: 962984200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:26:48,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-21 20:26:50,002][15401] Updated weights for policy 0, policy_version 58770 (0.0042) [2024-06-21 20:26:53,390][15132] Fps is (10 sec: 37681.3, 60 sec: 41780.7, 300 sec: 41931.9). Total num frames: 963018752. Throughput: 0: 41706.8. Samples: 963102920. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:26:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-21 20:26:54,090][15401] Updated weights for policy 0, policy_version 58780 (0.0040) [2024-06-21 20:26:57,872][15401] Updated weights for policy 0, policy_version 58790 (0.0045) [2024-06-21 20:26:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42053.7, 300 sec: 41987.5). Total num frames: 963231744. Throughput: 0: 42122.6. Samples: 963359140. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:26:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-21 20:27:01,926][15401] Updated weights for policy 0, policy_version 58800 (0.0030) [2024-06-21 20:27:03,389][15132] Fps is (10 sec: 44239.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 963461120. Throughput: 0: 41803.5. Samples: 963606440. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:27:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:27:05,605][15401] Updated weights for policy 0, policy_version 58810 (0.0040) [2024-06-21 20:27:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 963657728. Throughput: 0: 41886.6. Samples: 963737500. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:27:08,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 20:27:09,599][15401] Updated weights for policy 0, policy_version 58820 (0.0025) [2024-06-21 20:27:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 963854336. Throughput: 0: 42205.5. Samples: 963997480. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:27:13,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-21 20:27:13,416][15401] Updated weights for policy 0, policy_version 58830 (0.0026) [2024-06-21 20:27:17,214][15401] Updated weights for policy 0, policy_version 58840 (0.0029) [2024-06-21 20:27:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.9, 300 sec: 42043.4). Total num frames: 964083712. Throughput: 0: 41864.0. Samples: 964243660. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:27:18,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 20:27:21,048][15401] Updated weights for policy 0, policy_version 58850 (0.0045) [2024-06-21 20:27:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 964280320. Throughput: 0: 41948.0. Samples: 964370640. Policy #0 lag: (min: 0.0, avg: 13.3, max: 23.0) [2024-06-21 20:27:23,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-21 20:27:24,737][15349] Signal inference workers to stop experience collection... (14050 times) [2024-06-21 20:27:24,744][15349] Signal inference workers to resume experience collection... (14050 times) [2024-06-21 20:27:24,756][15401] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-21 20:27:24,770][15401] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-21 20:27:24,901][15401] Updated weights for policy 0, policy_version 58860 (0.0047) [2024-06-21 20:27:28,391][15132] Fps is (10 sec: 40953.7, 60 sec: 41778.2, 300 sec: 42042.8). Total num frames: 964493312. Throughput: 0: 41895.0. Samples: 964620700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:28,391][15132] Avg episode reward: [(0, '0.194')] [2024-06-21 20:27:28,825][15401] Updated weights for policy 0, policy_version 58870 (0.0030) [2024-06-21 20:27:32,674][15401] Updated weights for policy 0, policy_version 58880 (0.0036) [2024-06-21 20:27:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 964722688. Throughput: 0: 42068.5. Samples: 964877280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 20:27:36,633][15401] Updated weights for policy 0, policy_version 58890 (0.0037) [2024-06-21 20:27:38,389][15132] Fps is (10 sec: 42605.2, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 964919296. Throughput: 0: 42289.4. Samples: 965005920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-21 20:27:40,232][15401] Updated weights for policy 0, policy_version 58900 (0.0041) [2024-06-21 20:27:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41504.4, 300 sec: 42098.2). Total num frames: 965132288. Throughput: 0: 42105.7. Samples: 965254000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:43,392][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 20:27:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058907_965132288.pth... [2024-06-21 20:27:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058292_955056128.pth [2024-06-21 20:27:44,443][15401] Updated weights for policy 0, policy_version 58910 (0.0036) [2024-06-21 20:27:47,847][15401] Updated weights for policy 0, policy_version 58920 (0.0034) [2024-06-21 20:27:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 965361664. Throughput: 0: 42296.9. Samples: 965509800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 20:27:52,286][15401] Updated weights for policy 0, policy_version 58930 (0.0036) [2024-06-21 20:27:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.7, 300 sec: 42043.0). Total num frames: 965558272. Throughput: 0: 42301.0. Samples: 965641040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 20:27:55,672][15401] Updated weights for policy 0, policy_version 58940 (0.0034) [2024-06-21 20:27:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 965771264. Throughput: 0: 41953.2. Samples: 965885380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:27:58,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 20:27:59,931][15401] Updated weights for policy 0, policy_version 58950 (0.0033) [2024-06-21 20:28:03,377][15401] Updated weights for policy 0, policy_version 58960 (0.0041) [2024-06-21 20:28:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 966000640. Throughput: 0: 42238.6. Samples: 966144400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:28:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 20:28:08,003][15401] Updated weights for policy 0, policy_version 58970 (0.0035) [2024-06-21 20:28:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 966164480. Throughput: 0: 42240.0. Samples: 966271440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 20:28:11,377][15401] Updated weights for policy 0, policy_version 58980 (0.0028) [2024-06-21 20:28:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42043.2). Total num frames: 966410240. Throughput: 0: 42077.9. Samples: 966514140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 20:28:15,589][15401] Updated weights for policy 0, policy_version 58990 (0.0034) [2024-06-21 20:28:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 966590464. Throughput: 0: 42165.7. Samples: 966774740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 20:28:19,395][15401] Updated weights for policy 0, policy_version 59000 (0.0032) [2024-06-21 20:28:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 966803456. Throughput: 0: 41882.5. Samples: 966890640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 20:28:23,577][15401] Updated weights for policy 0, policy_version 59010 (0.0059) [2024-06-21 20:28:27,125][15401] Updated weights for policy 0, policy_version 59020 (0.0031) [2024-06-21 20:28:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42324.7, 300 sec: 42042.7). Total num frames: 967032832. Throughput: 0: 41951.2. Samples: 967141800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:28,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 20:28:31,435][15401] Updated weights for policy 0, policy_version 59030 (0.0042) [2024-06-21 20:28:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 967245824. Throughput: 0: 41723.1. Samples: 967387340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:33,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 20:28:35,322][15401] Updated weights for policy 0, policy_version 59040 (0.0044) [2024-06-21 20:28:38,392][15132] Fps is (10 sec: 39321.3, 60 sec: 41777.4, 300 sec: 41931.6). Total num frames: 967426048. Throughput: 0: 41639.0. Samples: 967514900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 20:28:38,393][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 20:28:39,102][15401] Updated weights for policy 0, policy_version 59050 (0.0042) [2024-06-21 20:28:42,831][15401] Updated weights for policy 0, policy_version 59060 (0.0046) [2024-06-21 20:28:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42054.0, 300 sec: 42043.0). Total num frames: 967655424. Throughput: 0: 41931.6. Samples: 967772300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:28:43,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-21 20:28:46,911][15401] Updated weights for policy 0, policy_version 59070 (0.0050) [2024-06-21 20:28:48,389][15132] Fps is (10 sec: 42609.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 967852032. Throughput: 0: 41709.9. Samples: 968021340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:28:48,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 20:28:50,354][15401] Updated weights for policy 0, policy_version 59080 (0.0034) [2024-06-21 20:28:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41932.0). Total num frames: 968065024. Throughput: 0: 41663.5. Samples: 968146300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:28:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 20:28:54,865][15401] Updated weights for policy 0, policy_version 59090 (0.0042) [2024-06-21 20:28:57,182][15349] Signal inference workers to stop experience collection... (14100 times) [2024-06-21 20:28:57,182][15349] Signal inference workers to resume experience collection... (14100 times) [2024-06-21 20:28:57,210][15401] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-06-21 20:28:57,210][15401] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-06-21 20:28:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42043.4). Total num frames: 968278016. Throughput: 0: 41864.1. Samples: 968398020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:28:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 20:28:58,452][15401] Updated weights for policy 0, policy_version 59100 (0.0038) [2024-06-21 20:29:02,454][15401] Updated weights for policy 0, policy_version 59110 (0.0033) [2024-06-21 20:29:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 968491008. Throughput: 0: 41658.2. Samples: 968649360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:29:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 20:29:06,350][15401] Updated weights for policy 0, policy_version 59120 (0.0039) [2024-06-21 20:29:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 968704000. Throughput: 0: 41866.2. Samples: 968774620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:29:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 20:29:10,329][15401] Updated weights for policy 0, policy_version 59130 (0.0048) [2024-06-21 20:29:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 968916992. Throughput: 0: 41871.1. Samples: 969025900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-21 20:29:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 20:29:14,248][15401] Updated weights for policy 0, policy_version 59140 (0.0042) [2024-06-21 20:29:18,025][15401] Updated weights for policy 0, policy_version 59150 (0.0051) [2024-06-21 20:29:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 969129984. Throughput: 0: 41912.6. Samples: 969273400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 20:29:21,995][15401] Updated weights for policy 0, policy_version 59160 (0.0044) [2024-06-21 20:29:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 969326592. Throughput: 0: 41939.5. Samples: 969402080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 20:29:25,679][15401] Updated weights for policy 0, policy_version 59170 (0.0033) [2024-06-21 20:29:28,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41506.1, 300 sec: 41931.6). Total num frames: 969523200. Throughput: 0: 41812.4. Samples: 969653960. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:28,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 20:29:29,913][15401] Updated weights for policy 0, policy_version 59180 (0.0038) [2024-06-21 20:29:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 969736192. Throughput: 0: 41890.6. Samples: 969906420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 20:29:33,642][15401] Updated weights for policy 0, policy_version 59190 (0.0043) [2024-06-21 20:29:37,579][15401] Updated weights for policy 0, policy_version 59200 (0.0034) [2024-06-21 20:29:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42054.0, 300 sec: 41987.5). Total num frames: 969949184. Throughput: 0: 41905.8. Samples: 970032060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 20:29:41,722][15401] Updated weights for policy 0, policy_version 59210 (0.0035) [2024-06-21 20:29:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42043.4). Total num frames: 970178560. Throughput: 0: 41856.8. Samples: 970281580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 20:29:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059215_970178560.pth... [2024-06-21 20:29:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058598_960069632.pth [2024-06-21 20:29:45,524][15401] Updated weights for policy 0, policy_version 59220 (0.0034) [2024-06-21 20:29:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41777.4, 300 sec: 41876.0). Total num frames: 970358784. Throughput: 0: 41902.2. Samples: 970535060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:48,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 20:29:49,416][15401] Updated weights for policy 0, policy_version 59230 (0.0028) [2024-06-21 20:29:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 970571776. Throughput: 0: 41782.8. Samples: 970654840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-21 20:29:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 20:29:53,424][15401] Updated weights for policy 0, policy_version 59240 (0.0045) [2024-06-21 20:29:57,177][15401] Updated weights for policy 0, policy_version 59250 (0.0041) [2024-06-21 20:29:58,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 970817536. Throughput: 0: 41947.6. Samples: 970913540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:29:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 20:30:01,206][15401] Updated weights for policy 0, policy_version 59260 (0.0043) [2024-06-21 20:30:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 970997760. Throughput: 0: 42123.5. Samples: 971168960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 20:30:05,223][15401] Updated weights for policy 0, policy_version 59270 (0.0040) [2024-06-21 20:30:08,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41777.6, 300 sec: 42043.6). Total num frames: 971210752. Throughput: 0: 41891.7. Samples: 971287300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:08,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 20:30:08,854][15401] Updated weights for policy 0, policy_version 59280 (0.0040) [2024-06-21 20:30:12,939][15401] Updated weights for policy 0, policy_version 59290 (0.0031) [2024-06-21 20:30:13,153][15349] Signal inference workers to stop experience collection... (14150 times) [2024-06-21 20:30:13,154][15349] Signal inference workers to resume experience collection... (14150 times) [2024-06-21 20:30:13,168][15401] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-06-21 20:30:13,182][15401] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-06-21 20:30:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 971440128. Throughput: 0: 42055.2. Samples: 971546340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 20:30:16,657][15401] Updated weights for policy 0, policy_version 59300 (0.0039) [2024-06-21 20:30:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 971620352. Throughput: 0: 41971.6. Samples: 971795140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 20:30:20,775][15401] Updated weights for policy 0, policy_version 59310 (0.0043) [2024-06-21 20:30:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 971833344. Throughput: 0: 41921.2. Samples: 971918520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 20:30:24,580][15401] Updated weights for policy 0, policy_version 59320 (0.0049) [2024-06-21 20:30:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42053.9, 300 sec: 41820.9). Total num frames: 972046336. Throughput: 0: 41984.9. Samples: 972170900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:30:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 20:30:28,523][15401] Updated weights for policy 0, policy_version 59330 (0.0055) [2024-06-21 20:30:32,356][15401] Updated weights for policy 0, policy_version 59340 (0.0039) [2024-06-21 20:30:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 972259328. Throughput: 0: 41954.7. Samples: 972422920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 20:30:36,201][15401] Updated weights for policy 0, policy_version 59350 (0.0041) [2024-06-21 20:30:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41777.5, 300 sec: 41987.1). Total num frames: 972455936. Throughput: 0: 41983.5. Samples: 972544200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:38,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 20:30:40,349][15401] Updated weights for policy 0, policy_version 59360 (0.0040) [2024-06-21 20:30:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 972668928. Throughput: 0: 41862.7. Samples: 972797360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 20:30:44,059][15401] Updated weights for policy 0, policy_version 59370 (0.0027) [2024-06-21 20:30:48,146][15401] Updated weights for policy 0, policy_version 59380 (0.0030) [2024-06-21 20:30:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42054.0, 300 sec: 41932.3). Total num frames: 972881920. Throughput: 0: 41713.0. Samples: 973046040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 20:30:52,403][15401] Updated weights for policy 0, policy_version 59390 (0.0038) [2024-06-21 20:30:53,396][15132] Fps is (10 sec: 40933.7, 60 sec: 41774.7, 300 sec: 41931.3). Total num frames: 973078528. Throughput: 0: 41825.6. Samples: 973169620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:53,397][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 20:30:55,965][15401] Updated weights for policy 0, policy_version 59400 (0.0029) [2024-06-21 20:30:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41765.3). Total num frames: 973275136. Throughput: 0: 41656.3. Samples: 973420880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:30:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 20:31:00,020][15401] Updated weights for policy 0, policy_version 59410 (0.0033) [2024-06-21 20:31:03,392][15132] Fps is (10 sec: 40976.4, 60 sec: 41504.5, 300 sec: 41876.1). Total num frames: 973488128. Throughput: 0: 41658.1. Samples: 973669860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-21 20:31:03,392][15132] Avg episode reward: [(0, '0.233')] [2024-06-21 20:31:03,945][15401] Updated weights for policy 0, policy_version 59420 (0.0031) [2024-06-21 20:31:07,874][15401] Updated weights for policy 0, policy_version 59430 (0.0031) [2024-06-21 20:31:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 973701120. Throughput: 0: 41744.1. Samples: 973797000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 20:31:11,545][15401] Updated weights for policy 0, policy_version 59440 (0.0046) [2024-06-21 20:31:13,392][15132] Fps is (10 sec: 42598.3, 60 sec: 41231.4, 300 sec: 41876.2). Total num frames: 973914112. Throughput: 0: 41621.3. Samples: 974043960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:13,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 20:31:15,670][15401] Updated weights for policy 0, policy_version 59450 (0.0043) [2024-06-21 20:31:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 974127104. Throughput: 0: 41525.3. Samples: 974291560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 20:31:19,522][15401] Updated weights for policy 0, policy_version 59460 (0.0041) [2024-06-21 20:31:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 974323712. Throughput: 0: 41655.5. Samples: 974418600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 20:31:23,722][15401] Updated weights for policy 0, policy_version 59470 (0.0052) [2024-06-21 20:31:27,360][15401] Updated weights for policy 0, policy_version 59480 (0.0048) [2024-06-21 20:31:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 974536704. Throughput: 0: 41525.4. Samples: 974666000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:28,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-21 20:31:31,520][15401] Updated weights for policy 0, policy_version 59490 (0.0048) [2024-06-21 20:31:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.5, 300 sec: 41987.1). Total num frames: 974766080. Throughput: 0: 41567.5. Samples: 974916680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:33,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 20:31:35,129][15401] Updated weights for policy 0, policy_version 59500 (0.0040) [2024-06-21 20:31:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41780.8, 300 sec: 41765.3). Total num frames: 974962688. Throughput: 0: 41633.4. Samples: 975042860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 20:31:39,548][15401] Updated weights for policy 0, policy_version 59510 (0.0044) [2024-06-21 20:31:43,097][15401] Updated weights for policy 0, policy_version 59520 (0.0039) [2024-06-21 20:31:43,390][15132] Fps is (10 sec: 40969.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 975175680. Throughput: 0: 41515.5. Samples: 975289080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-21 20:31:43,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-21 20:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059520_975175680.pth... [2024-06-21 20:31:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000058907_965132288.pth [2024-06-21 20:31:47,408][15401] Updated weights for policy 0, policy_version 59530 (0.0034) [2024-06-21 20:31:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.4, 300 sec: 41931.7). Total num frames: 975388672. Throughput: 0: 41437.7. Samples: 975534560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:31:48,393][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 20:31:51,370][15401] Updated weights for policy 0, policy_version 59540 (0.0030) [2024-06-21 20:31:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 975585280. Throughput: 0: 41454.1. Samples: 975662440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:31:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 20:31:55,267][15401] Updated weights for policy 0, policy_version 59550 (0.0030) [2024-06-21 20:31:58,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 975781888. Throughput: 0: 41402.2. Samples: 975906960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:31:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 20:31:59,626][15401] Updated weights for policy 0, policy_version 59560 (0.0049) [2024-06-21 20:32:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41507.8, 300 sec: 41765.3). Total num frames: 975978496. Throughput: 0: 41504.5. Samples: 976159260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:32:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 20:32:03,510][15401] Updated weights for policy 0, policy_version 59570 (0.0031) [2024-06-21 20:32:07,209][15401] Updated weights for policy 0, policy_version 59580 (0.0043) [2024-06-21 20:32:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 976207872. Throughput: 0: 41535.2. Samples: 976287680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:32:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 20:32:08,483][15349] Signal inference workers to stop experience collection... (14200 times) [2024-06-21 20:32:08,488][15349] Signal inference workers to resume experience collection... (14200 times) [2024-06-21 20:32:08,505][15401] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-06-21 20:32:08,537][15401] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-06-21 20:32:11,323][15401] Updated weights for policy 0, policy_version 59590 (0.0030) [2024-06-21 20:32:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41780.8, 300 sec: 41820.8). Total num frames: 976420864. Throughput: 0: 41471.8. Samples: 976532240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:32:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 20:32:15,074][15401] Updated weights for policy 0, policy_version 59600 (0.0035) [2024-06-21 20:32:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 976617472. Throughput: 0: 41553.4. Samples: 976786480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:32:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 20:32:19,035][15401] Updated weights for policy 0, policy_version 59610 (0.0032) [2024-06-21 20:32:23,301][15401] Updated weights for policy 0, policy_version 59620 (0.0033) [2024-06-21 20:32:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41765.5). Total num frames: 976814080. Throughput: 0: 41426.6. Samples: 976907060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 20:32:26,745][15401] Updated weights for policy 0, policy_version 59630 (0.0033) [2024-06-21 20:32:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 977059840. Throughput: 0: 41527.2. Samples: 977157800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 20:32:31,084][15401] Updated weights for policy 0, policy_version 59640 (0.0038) [2024-06-21 20:32:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41234.8, 300 sec: 41765.3). Total num frames: 977240064. Throughput: 0: 41704.6. Samples: 977411160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-21 20:32:34,438][15401] Updated weights for policy 0, policy_version 59650 (0.0031) [2024-06-21 20:32:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41710.1). Total num frames: 977436672. Throughput: 0: 41611.6. Samples: 977534960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:38,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-21 20:32:38,738][15401] Updated weights for policy 0, policy_version 59660 (0.0038) [2024-06-21 20:32:42,075][15401] Updated weights for policy 0, policy_version 59670 (0.0032) [2024-06-21 20:32:43,392][15132] Fps is (10 sec: 42587.6, 60 sec: 41504.5, 300 sec: 41709.4). Total num frames: 977666048. Throughput: 0: 41870.7. Samples: 977791240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:43,393][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 20:32:46,404][15401] Updated weights for policy 0, policy_version 59680 (0.0032) [2024-06-21 20:32:48,391][15132] Fps is (10 sec: 44232.4, 60 sec: 41507.1, 300 sec: 41765.2). Total num frames: 977879040. Throughput: 0: 41921.7. Samples: 978045780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:48,391][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 20:32:49,767][15401] Updated weights for policy 0, policy_version 59690 (0.0043) [2024-06-21 20:32:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 978059264. Throughput: 0: 41848.4. Samples: 978170860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:53,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-21 20:32:54,241][15401] Updated weights for policy 0, policy_version 59700 (0.0040) [2024-06-21 20:32:57,408][15401] Updated weights for policy 0, policy_version 59710 (0.0034) [2024-06-21 20:32:58,389][15132] Fps is (10 sec: 42602.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 978305024. Throughput: 0: 42097.0. Samples: 978426600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:32:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 20:33:01,949][15401] Updated weights for policy 0, policy_version 59720 (0.0056) [2024-06-21 20:33:03,392][15132] Fps is (10 sec: 47503.8, 60 sec: 42596.9, 300 sec: 41931.6). Total num frames: 978534400. Throughput: 0: 42040.3. Samples: 978678380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:03,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 20:33:05,400][15401] Updated weights for policy 0, policy_version 59730 (0.0039) [2024-06-21 20:33:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 978714624. Throughput: 0: 42284.0. Samples: 978809840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 20:33:09,625][15401] Updated weights for policy 0, policy_version 59740 (0.0039) [2024-06-21 20:33:13,012][15401] Updated weights for policy 0, policy_version 59750 (0.0037) [2024-06-21 20:33:13,390][15132] Fps is (10 sec: 40968.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 978944000. Throughput: 0: 42371.5. Samples: 979064520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 20:33:17,297][15401] Updated weights for policy 0, policy_version 59760 (0.0035) [2024-06-21 20:33:18,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42320.8, 300 sec: 41875.5). Total num frames: 979156992. Throughput: 0: 42286.7. Samples: 979314340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:18,396][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 20:33:20,884][15401] Updated weights for policy 0, policy_version 59770 (0.0040) [2024-06-21 20:33:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 41765.7). Total num frames: 979353600. Throughput: 0: 42358.8. Samples: 979441100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 20:33:24,947][15401] Updated weights for policy 0, policy_version 59780 (0.0038) [2024-06-21 20:33:28,390][15132] Fps is (10 sec: 40984.5, 60 sec: 41778.9, 300 sec: 41765.3). Total num frames: 979566592. Throughput: 0: 42268.5. Samples: 979693240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 20:33:28,839][15401] Updated weights for policy 0, policy_version 59790 (0.0044) [2024-06-21 20:33:31,183][15349] Signal inference workers to stop experience collection... (14250 times) [2024-06-21 20:33:31,184][15349] Signal inference workers to resume experience collection... (14250 times) [2024-06-21 20:33:31,231][15401] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-06-21 20:33:31,232][15401] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-06-21 20:33:32,690][15401] Updated weights for policy 0, policy_version 59800 (0.0037) [2024-06-21 20:33:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 41876.4). Total num frames: 979779584. Throughput: 0: 42191.6. Samples: 979944460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 20:33:33,401][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 20:33:36,552][15401] Updated weights for policy 0, policy_version 59810 (0.0026) [2024-06-21 20:33:38,390][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 979976192. Throughput: 0: 42164.7. Samples: 980068280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:33:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 20:33:40,987][15401] Updated weights for policy 0, policy_version 59820 (0.0041) [2024-06-21 20:33:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42327.1, 300 sec: 41876.4). Total num frames: 980205568. Throughput: 0: 42084.9. Samples: 980320420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:33:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 20:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059827_980205568.pth... [2024-06-21 20:33:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059215_970178560.pth [2024-06-21 20:33:44,382][15401] Updated weights for policy 0, policy_version 59830 (0.0027) [2024-06-21 20:33:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.9, 300 sec: 41765.3). Total num frames: 980385792. Throughput: 0: 42000.6. Samples: 980568320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:33:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 20:33:48,815][15401] Updated weights for policy 0, policy_version 59840 (0.0035) [2024-06-21 20:33:52,281][15401] Updated weights for policy 0, policy_version 59850 (0.0048) [2024-06-21 20:33:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 41876.4). Total num frames: 980631552. Throughput: 0: 41832.9. Samples: 980692320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:33:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 20:33:56,534][15401] Updated weights for policy 0, policy_version 59860 (0.0033) [2024-06-21 20:33:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 980828160. Throughput: 0: 41826.7. Samples: 980946720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:33:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 20:34:00,050][15401] Updated weights for policy 0, policy_version 59870 (0.0031) [2024-06-21 20:34:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 41776.2, 300 sec: 41820.0). Total num frames: 981041152. Throughput: 0: 41783.6. Samples: 981194600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:34:03,396][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 20:34:04,791][15401] Updated weights for policy 0, policy_version 59880 (0.0033) [2024-06-21 20:34:07,691][15401] Updated weights for policy 0, policy_version 59890 (0.0034) [2024-06-21 20:34:08,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42323.8, 300 sec: 41820.5). Total num frames: 981254144. Throughput: 0: 41746.3. Samples: 981319780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:34:08,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 20:34:12,491][15401] Updated weights for policy 0, policy_version 59900 (0.0025) [2024-06-21 20:34:13,390][15132] Fps is (10 sec: 37707.2, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 981417984. Throughput: 0: 41824.4. Samples: 981575320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-21 20:34:15,592][15401] Updated weights for policy 0, policy_version 59910 (0.0041) [2024-06-21 20:34:18,389][15132] Fps is (10 sec: 40969.1, 60 sec: 41783.7, 300 sec: 41820.9). Total num frames: 981663744. Throughput: 0: 41864.9. Samples: 981828280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-21 20:34:20,058][15401] Updated weights for policy 0, policy_version 59920 (0.0040) [2024-06-21 20:34:23,179][15401] Updated weights for policy 0, policy_version 59930 (0.0038) [2024-06-21 20:34:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 41932.3). Total num frames: 981893120. Throughput: 0: 41955.2. Samples: 981956260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-21 20:34:27,763][15401] Updated weights for policy 0, policy_version 59940 (0.0033) [2024-06-21 20:34:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.5, 300 sec: 41820.9). Total num frames: 982073344. Throughput: 0: 41971.1. Samples: 982209120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:28,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-21 20:34:31,229][15401] Updated weights for policy 0, policy_version 59950 (0.0046) [2024-06-21 20:34:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 982302720. Throughput: 0: 41961.1. Samples: 982456580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 20:34:35,526][15401] Updated weights for policy 0, policy_version 59960 (0.0047) [2024-06-21 20:34:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 982515712. Throughput: 0: 42083.5. Samples: 982586080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 20:34:39,065][15401] Updated weights for policy 0, policy_version 59970 (0.0049) [2024-06-21 20:34:43,389][15132] Fps is (10 sec: 37684.3, 60 sec: 41233.1, 300 sec: 41765.7). Total num frames: 982679552. Throughput: 0: 41879.6. Samples: 982831300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 20:34:43,777][15401] Updated weights for policy 0, policy_version 59980 (0.0042) [2024-06-21 20:34:46,784][15401] Updated weights for policy 0, policy_version 59990 (0.0048) [2024-06-21 20:34:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 982941696. Throughput: 0: 42010.4. Samples: 983084800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-21 20:34:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 20:34:51,423][15401] Updated weights for policy 0, policy_version 60000 (0.0033) [2024-06-21 20:34:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 983138304. Throughput: 0: 42237.6. Samples: 983220380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:34:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 20:34:55,045][15401] Updated weights for policy 0, policy_version 60010 (0.0035) [2024-06-21 20:34:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 983334912. Throughput: 0: 41946.3. Samples: 983462900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:34:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 20:34:59,190][15401] Updated weights for policy 0, policy_version 60020 (0.0027) [2024-06-21 20:35:01,369][15349] Signal inference workers to stop experience collection... (14300 times) [2024-06-21 20:35:01,396][15401] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-06-21 20:35:01,434][15349] Signal inference workers to resume experience collection... (14300 times) [2024-06-21 20:35:01,434][15401] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-06-21 20:35:02,690][15401] Updated weights for policy 0, policy_version 60030 (0.0027) [2024-06-21 20:35:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42055.0, 300 sec: 41876.4). Total num frames: 983564288. Throughput: 0: 41982.1. Samples: 983717580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:35:03,393][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 20:35:06,805][15401] Updated weights for policy 0, policy_version 60040 (0.0029) [2024-06-21 20:35:08,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42052.2, 300 sec: 41820.5). Total num frames: 983777280. Throughput: 0: 42034.7. Samples: 983847920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:35:08,392][15132] Avg episode reward: [(0, '0.864')] [2024-06-21 20:35:10,295][15401] Updated weights for policy 0, policy_version 60050 (0.0025) [2024-06-21 20:35:13,389][15132] Fps is (10 sec: 42609.5, 60 sec: 42871.6, 300 sec: 41931.9). Total num frames: 983990272. Throughput: 0: 41960.5. Samples: 984097340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:35:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 20:35:14,373][15401] Updated weights for policy 0, policy_version 60060 (0.0025) [2024-06-21 20:35:17,927][15401] Updated weights for policy 0, policy_version 60070 (0.0034) [2024-06-21 20:35:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 984186880. Throughput: 0: 42096.2. Samples: 984350900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:35:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 20:35:22,454][15401] Updated weights for policy 0, policy_version 60080 (0.0028) [2024-06-21 20:35:23,389][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 984383488. Throughput: 0: 41996.0. Samples: 984475900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 20:35:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 20:35:25,637][15401] Updated weights for policy 0, policy_version 60090 (0.0033) [2024-06-21 20:35:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 984629248. Throughput: 0: 42263.0. Samples: 984733140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 20:35:29,939][15401] Updated weights for policy 0, policy_version 60100 (0.0030) [2024-06-21 20:35:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 984809472. Throughput: 0: 42239.1. Samples: 984985560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 20:35:33,534][15401] Updated weights for policy 0, policy_version 60110 (0.0034) [2024-06-21 20:35:37,564][15401] Updated weights for policy 0, policy_version 60120 (0.0037) [2024-06-21 20:35:38,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 985006080. Throughput: 0: 42074.2. Samples: 985113720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 20:35:41,598][15401] Updated weights for policy 0, policy_version 60130 (0.0030) [2024-06-21 20:35:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 41931.9). Total num frames: 985251840. Throughput: 0: 42273.9. Samples: 985365220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 20:35:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060135_985251840.pth... [2024-06-21 20:35:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059520_975175680.pth [2024-06-21 20:35:45,259][15401] Updated weights for policy 0, policy_version 60140 (0.0029) [2024-06-21 20:35:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 41988.4). Total num frames: 985464832. Throughput: 0: 42168.4. Samples: 985615060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 20:35:49,088][15401] Updated weights for policy 0, policy_version 60150 (0.0042) [2024-06-21 20:35:52,818][15401] Updated weights for policy 0, policy_version 60160 (0.0040) [2024-06-21 20:35:53,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42050.6, 300 sec: 41987.1). Total num frames: 985661440. Throughput: 0: 42237.7. Samples: 985748620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:53,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 20:35:56,691][15401] Updated weights for policy 0, policy_version 60170 (0.0030) [2024-06-21 20:35:58,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 41987.5). Total num frames: 985874432. Throughput: 0: 42259.5. Samples: 985999120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:35:58,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 20:36:00,607][15401] Updated weights for policy 0, policy_version 60180 (0.0034) [2024-06-21 20:36:03,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42054.0, 300 sec: 41987.5). Total num frames: 986087424. Throughput: 0: 42229.7. Samples: 986251240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 20:36:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 20:36:04,446][15401] Updated weights for policy 0, policy_version 60190 (0.0032) [2024-06-21 20:36:08,396][15132] Fps is (10 sec: 40943.3, 60 sec: 41776.3, 300 sec: 41931.4). Total num frames: 986284032. Throughput: 0: 42216.2. Samples: 986375900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:08,397][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:36:09,041][15401] Updated weights for policy 0, policy_version 60200 (0.0045) [2024-06-21 20:36:11,971][15401] Updated weights for policy 0, policy_version 60210 (0.0037) [2024-06-21 20:36:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42042.7). Total num frames: 986529792. Throughput: 0: 42017.3. Samples: 986624020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:13,393][15132] Avg episode reward: [(0, '0.227')] [2024-06-21 20:36:16,760][15401] Updated weights for policy 0, policy_version 60220 (0.0039) [2024-06-21 20:36:17,924][15349] Signal inference workers to stop experience collection... (14350 times) [2024-06-21 20:36:17,924][15349] Signal inference workers to resume experience collection... (14350 times) [2024-06-21 20:36:17,973][15401] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-06-21 20:36:17,973][15401] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-06-21 20:36:18,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 986726400. Throughput: 0: 42085.0. Samples: 986879380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 20:36:19,966][15401] Updated weights for policy 0, policy_version 60230 (0.0028) [2024-06-21 20:36:23,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 986906624. Throughput: 0: 42076.1. Samples: 987007140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 20:36:24,438][15401] Updated weights for policy 0, policy_version 60240 (0.0036) [2024-06-21 20:36:27,724][15401] Updated weights for policy 0, policy_version 60250 (0.0037) [2024-06-21 20:36:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41987.8). Total num frames: 987152384. Throughput: 0: 42167.4. Samples: 987262760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 20:36:31,966][15401] Updated weights for policy 0, policy_version 60260 (0.0033) [2024-06-21 20:36:33,396][15132] Fps is (10 sec: 45845.3, 60 sec: 42593.8, 300 sec: 42042.1). Total num frames: 987365376. Throughput: 0: 42487.8. Samples: 987527280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:33,396][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 20:36:35,145][15401] Updated weights for policy 0, policy_version 60270 (0.0049) [2024-06-21 20:36:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 987545600. Throughput: 0: 42249.4. Samples: 987649740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:36:38,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 20:36:39,721][15401] Updated weights for policy 0, policy_version 60280 (0.0035) [2024-06-21 20:36:42,733][15401] Updated weights for policy 0, policy_version 60290 (0.0032) [2024-06-21 20:36:43,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.3, 300 sec: 42098.9). Total num frames: 987807744. Throughput: 0: 42337.8. Samples: 987904220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:36:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 20:36:47,513][15401] Updated weights for policy 0, policy_version 60300 (0.0038) [2024-06-21 20:36:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42323.8, 300 sec: 42098.2). Total num frames: 988004352. Throughput: 0: 42660.0. Samples: 988171040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:36:48,392][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 20:36:50,315][15401] Updated weights for policy 0, policy_version 60310 (0.0045) [2024-06-21 20:36:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42098.6). Total num frames: 988200960. Throughput: 0: 42554.6. Samples: 988290580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:36:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 20:36:55,309][15401] Updated weights for policy 0, policy_version 60320 (0.0034) [2024-06-21 20:36:58,168][15401] Updated weights for policy 0, policy_version 60330 (0.0027) [2024-06-21 20:36:58,391][15132] Fps is (10 sec: 44239.6, 60 sec: 42871.9, 300 sec: 42264.9). Total num frames: 988446720. Throughput: 0: 42617.5. Samples: 988541780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:36:58,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 20:37:03,330][15401] Updated weights for policy 0, policy_version 60340 (0.0040) [2024-06-21 20:37:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 988610560. Throughput: 0: 42696.5. Samples: 988800720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:37:03,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-21 20:37:06,147][15401] Updated weights for policy 0, policy_version 60350 (0.0030) [2024-06-21 20:37:08,390][15132] Fps is (10 sec: 37689.2, 60 sec: 42329.7, 300 sec: 42043.0). Total num frames: 988823552. Throughput: 0: 42473.1. Samples: 988918440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:37:08,396][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 20:37:10,979][15401] Updated weights for policy 0, policy_version 60360 (0.0023) [2024-06-21 20:37:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42327.1, 300 sec: 42209.6). Total num frames: 989069312. Throughput: 0: 42191.7. Samples: 989161380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:37:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 20:37:13,928][15401] Updated weights for policy 0, policy_version 60370 (0.0034) [2024-06-21 20:37:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 989233152. Throughput: 0: 42156.7. Samples: 989424060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:37:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 20:37:19,054][15401] Updated weights for policy 0, policy_version 60380 (0.0033) [2024-06-21 20:37:21,634][15401] Updated weights for policy 0, policy_version 60390 (0.0034) [2024-06-21 20:37:23,392][15132] Fps is (10 sec: 37673.4, 60 sec: 42323.5, 300 sec: 41987.1). Total num frames: 989446144. Throughput: 0: 42052.7. Samples: 989542220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:23,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 20:37:26,661][15401] Updated weights for policy 0, policy_version 60400 (0.0037) [2024-06-21 20:37:28,392][15132] Fps is (10 sec: 47502.1, 60 sec: 42596.7, 300 sec: 42264.8). Total num frames: 989708288. Throughput: 0: 42124.4. Samples: 989799920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:28,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 20:37:29,455][15401] Updated weights for policy 0, policy_version 60410 (0.0036) [2024-06-21 20:37:33,389][15132] Fps is (10 sec: 40970.5, 60 sec: 41510.6, 300 sec: 42098.6). Total num frames: 989855744. Throughput: 0: 41983.6. Samples: 990060200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 20:37:34,341][15401] Updated weights for policy 0, policy_version 60420 (0.0044) [2024-06-21 20:37:37,065][15401] Updated weights for policy 0, policy_version 60430 (0.0039) [2024-06-21 20:37:38,396][15132] Fps is (10 sec: 37667.7, 60 sec: 42320.7, 300 sec: 42098.0). Total num frames: 990085120. Throughput: 0: 41917.0. Samples: 990177120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:38,397][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:37:42,026][15401] Updated weights for policy 0, policy_version 60440 (0.0036) [2024-06-21 20:37:42,675][15349] Signal inference workers to stop experience collection... (14400 times) [2024-06-21 20:37:42,675][15349] Signal inference workers to resume experience collection... (14400 times) [2024-06-21 20:37:42,712][15401] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-06-21 20:37:42,713][15401] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-06-21 20:37:43,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42050.6, 300 sec: 42209.4). Total num frames: 990330880. Throughput: 0: 42066.0. Samples: 990434780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:43,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 20:37:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060445_990330880.pth... [2024-06-21 20:37:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000059827_980205568.pth [2024-06-21 20:37:44,769][15401] Updated weights for policy 0, policy_version 60450 (0.0044) [2024-06-21 20:37:48,389][15132] Fps is (10 sec: 42626.2, 60 sec: 41780.9, 300 sec: 42209.6). Total num frames: 990511104. Throughput: 0: 41865.2. Samples: 990684660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 20:37:49,924][15401] Updated weights for policy 0, policy_version 60460 (0.0044) [2024-06-21 20:37:53,126][15401] Updated weights for policy 0, policy_version 60470 (0.0034) [2024-06-21 20:37:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 990740480. Throughput: 0: 41982.7. Samples: 990807660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-21 20:37:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 20:37:57,566][15401] Updated weights for policy 0, policy_version 60480 (0.0040) [2024-06-21 20:37:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41234.3, 300 sec: 41987.8). Total num frames: 990920704. Throughput: 0: 42137.7. Samples: 991057580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:37:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 20:38:01,085][15401] Updated weights for policy 0, policy_version 60490 (0.0040) [2024-06-21 20:38:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 991117312. Throughput: 0: 41909.7. Samples: 991310000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 20:38:05,385][15401] Updated weights for policy 0, policy_version 60500 (0.0034) [2024-06-21 20:38:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 991379456. Throughput: 0: 42090.7. Samples: 991436200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 20:38:08,946][15401] Updated weights for policy 0, policy_version 60510 (0.0035) [2024-06-21 20:38:13,104][15401] Updated weights for policy 0, policy_version 60520 (0.0039) [2024-06-21 20:38:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41506.1, 300 sec: 42043.9). Total num frames: 991559680. Throughput: 0: 41938.3. Samples: 991687040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 20:38:17,588][15401] Updated weights for policy 0, policy_version 60530 (0.0044) [2024-06-21 20:38:18,390][15132] Fps is (10 sec: 36045.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 991739904. Throughput: 0: 41733.3. Samples: 991938200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 20:38:20,855][15401] Updated weights for policy 0, policy_version 60540 (0.0039) [2024-06-21 20:38:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.2, 300 sec: 42154.1). Total num frames: 992002048. Throughput: 0: 41879.4. Samples: 992061420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 20:38:25,154][15401] Updated weights for policy 0, policy_version 60550 (0.0035) [2024-06-21 20:38:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 41507.8, 300 sec: 42098.9). Total num frames: 992198656. Throughput: 0: 41829.8. Samples: 992317020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 20:38:28,522][15401] Updated weights for policy 0, policy_version 60560 (0.0034) [2024-06-21 20:38:32,894][15401] Updated weights for policy 0, policy_version 60570 (0.0051) [2024-06-21 20:38:33,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 992378880. Throughput: 0: 41896.4. Samples: 992570000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 20:38:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 20:38:36,199][15401] Updated weights for policy 0, policy_version 60580 (0.0035) [2024-06-21 20:38:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42329.9, 300 sec: 42098.6). Total num frames: 992624640. Throughput: 0: 41847.7. Samples: 992690800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:38:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 20:38:40,979][15401] Updated weights for policy 0, policy_version 60590 (0.0033) [2024-06-21 20:38:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41507.8, 300 sec: 42154.1). Total num frames: 992821248. Throughput: 0: 41970.7. Samples: 992946260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:38:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-21 20:38:43,942][15401] Updated weights for policy 0, policy_version 60600 (0.0032) [2024-06-21 20:38:48,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 993001472. Throughput: 0: 41899.2. Samples: 993195460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:38:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 20:38:49,243][15401] Updated weights for policy 0, policy_version 60610 (0.0040) [2024-06-21 20:38:51,942][15401] Updated weights for policy 0, policy_version 60620 (0.0031) [2024-06-21 20:38:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41777.6, 300 sec: 42098.2). Total num frames: 993247232. Throughput: 0: 41770.3. Samples: 993315960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:38:53,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 20:38:56,974][15401] Updated weights for policy 0, policy_version 60630 (0.0028) [2024-06-21 20:38:57,813][15349] Signal inference workers to stop experience collection... (14450 times) [2024-06-21 20:38:57,859][15401] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-06-21 20:38:57,927][15349] Signal inference workers to resume experience collection... (14450 times) [2024-06-21 20:38:57,927][15401] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-06-21 20:38:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42043.9). Total num frames: 993443840. Throughput: 0: 42043.9. Samples: 993579020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:38:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 20:38:59,866][15401] Updated weights for policy 0, policy_version 60640 (0.0027) [2024-06-21 20:39:03,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42052.2, 300 sec: 41987.8). Total num frames: 993640448. Throughput: 0: 41850.6. Samples: 993821480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:39:03,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 20:39:04,680][15401] Updated weights for policy 0, policy_version 60650 (0.0035) [2024-06-21 20:39:07,493][15401] Updated weights for policy 0, policy_version 60660 (0.0035) [2024-06-21 20:39:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 993886208. Throughput: 0: 41966.2. Samples: 993949900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 20:39:08,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 20:39:12,415][15401] Updated weights for policy 0, policy_version 60670 (0.0038) [2024-06-21 20:39:13,389][15132] Fps is (10 sec: 40961.0, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 994050048. Throughput: 0: 41998.8. Samples: 994206960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 20:39:15,069][15401] Updated weights for policy 0, policy_version 60680 (0.0034) [2024-06-21 20:39:18,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42323.7, 300 sec: 41987.1). Total num frames: 994279424. Throughput: 0: 41815.6. Samples: 994451800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:18,392][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 20:39:20,239][15401] Updated weights for policy 0, policy_version 60690 (0.0044) [2024-06-21 20:39:22,563][15401] Updated weights for policy 0, policy_version 60700 (0.0031) [2024-06-21 20:39:23,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 994525184. Throughput: 0: 42049.8. Samples: 994583040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 20:39:28,002][15401] Updated weights for policy 0, policy_version 60710 (0.0027) [2024-06-21 20:39:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 994689024. Throughput: 0: 42107.6. Samples: 994841100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 20:39:30,377][15401] Updated weights for policy 0, policy_version 60720 (0.0038) [2024-06-21 20:39:33,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 994934784. Throughput: 0: 41941.3. Samples: 995082820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 20:39:35,798][15401] Updated weights for policy 0, policy_version 60730 (0.0034) [2024-06-21 20:39:38,031][15401] Updated weights for policy 0, policy_version 60740 (0.0026) [2024-06-21 20:39:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 995164160. Throughput: 0: 42221.9. Samples: 995215840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 20:39:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 995311616. Throughput: 0: 42008.5. Samples: 995469400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:43,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 20:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060749_995311616.pth... [2024-06-21 20:39:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060135_985251840.pth [2024-06-21 20:39:43,631][15401] Updated weights for policy 0, policy_version 60750 (0.0044) [2024-06-21 20:39:46,219][15401] Updated weights for policy 0, policy_version 60760 (0.0038) [2024-06-21 20:39:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 995557376. Throughput: 0: 42112.6. Samples: 995716540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 20:39:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 20:39:51,239][15401] Updated weights for policy 0, policy_version 60770 (0.0033) [2024-06-21 20:39:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 41781.0, 300 sec: 42098.6). Total num frames: 995753984. Throughput: 0: 42179.3. Samples: 995847960. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:39:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 20:39:53,931][15401] Updated weights for policy 0, policy_version 60780 (0.0039) [2024-06-21 20:39:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 995950592. Throughput: 0: 41958.6. Samples: 996095100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:39:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 20:39:58,868][15401] Updated weights for policy 0, policy_version 60790 (0.0038) [2024-06-21 20:40:01,704][15401] Updated weights for policy 0, policy_version 60800 (0.0027) [2024-06-21 20:40:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42098.9). Total num frames: 996196352. Throughput: 0: 42062.2. Samples: 996344500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:40:03,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-21 20:40:06,720][15401] Updated weights for policy 0, policy_version 60810 (0.0037) [2024-06-21 20:40:07,579][15349] Signal inference workers to stop experience collection... (14500 times) [2024-06-21 20:40:07,587][15349] Signal inference workers to resume experience collection... (14500 times) [2024-06-21 20:40:07,590][15401] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-06-21 20:40:07,605][15401] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-06-21 20:40:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 996376576. Throughput: 0: 42167.9. Samples: 996480600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:40:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 20:40:09,400][15401] Updated weights for policy 0, policy_version 60820 (0.0038) [2024-06-21 20:40:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 996589568. Throughput: 0: 41908.1. Samples: 996726960. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:40:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 20:40:14,358][15401] Updated weights for policy 0, policy_version 60830 (0.0039) [2024-06-21 20:40:17,088][15401] Updated weights for policy 0, policy_version 60840 (0.0040) [2024-06-21 20:40:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42325.3, 300 sec: 42153.7). Total num frames: 996818944. Throughput: 0: 42085.8. Samples: 996976780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:40:18,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 20:40:22,187][15401] Updated weights for policy 0, policy_version 60850 (0.0034) [2024-06-21 20:40:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 997015552. Throughput: 0: 42052.1. Samples: 997108180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 20:40:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-21 20:40:24,812][15401] Updated weights for policy 0, policy_version 60860 (0.0037) [2024-06-21 20:40:28,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42153.7). Total num frames: 997244928. Throughput: 0: 42158.2. Samples: 997366620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:28,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 20:40:30,120][15401] Updated weights for policy 0, policy_version 60870 (0.0035) [2024-06-21 20:40:32,644][15401] Updated weights for policy 0, policy_version 60880 (0.0031) [2024-06-21 20:40:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 997457920. Throughput: 0: 42098.1. Samples: 997610960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 20:40:37,649][15401] Updated weights for policy 0, policy_version 60890 (0.0029) [2024-06-21 20:40:38,389][15132] Fps is (10 sec: 39331.0, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 997638144. Throughput: 0: 42026.1. Samples: 997739140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 20:40:40,646][15401] Updated weights for policy 0, policy_version 60900 (0.0036) [2024-06-21 20:40:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42098.6). Total num frames: 997883904. Throughput: 0: 42316.4. Samples: 997999340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:43,396][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 20:40:45,414][15401] Updated weights for policy 0, policy_version 60910 (0.0026) [2024-06-21 20:40:48,390][15132] Fps is (10 sec: 45870.7, 60 sec: 42324.6, 300 sec: 42154.3). Total num frames: 998096896. Throughput: 0: 42267.1. Samples: 998246560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:48,391][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 20:40:48,585][15401] Updated weights for policy 0, policy_version 60920 (0.0022) [2024-06-21 20:40:53,013][15401] Updated weights for policy 0, policy_version 60930 (0.0039) [2024-06-21 20:40:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42098.9). Total num frames: 998293504. Throughput: 0: 42105.8. Samples: 998375360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 20:40:56,387][15401] Updated weights for policy 0, policy_version 60940 (0.0039) [2024-06-21 20:40:58,389][15132] Fps is (10 sec: 40964.5, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 998506496. Throughput: 0: 42282.2. Samples: 998629660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:40:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 20:41:00,489][15401] Updated weights for policy 0, policy_version 60950 (0.0035) [2024-06-21 20:41:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42155.0). Total num frames: 998719488. Throughput: 0: 42390.4. Samples: 998884240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:41:03,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 20:41:04,294][15401] Updated weights for policy 0, policy_version 60960 (0.0041) [2024-06-21 20:41:07,965][15401] Updated weights for policy 0, policy_version 60970 (0.0040) [2024-06-21 20:41:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42043.3). Total num frames: 998932480. Throughput: 0: 42173.2. Samples: 999005980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 20:41:12,077][15401] Updated weights for policy 0, policy_version 60980 (0.0034) [2024-06-21 20:41:13,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.6, 300 sec: 42098.2). Total num frames: 999145472. Throughput: 0: 42148.0. Samples: 999263280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:13,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 20:41:14,843][15349] Signal inference workers to stop experience collection... (14550 times) [2024-06-21 20:41:14,843][15349] Signal inference workers to resume experience collection... (14550 times) [2024-06-21 20:41:14,870][15401] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-06-21 20:41:14,904][15401] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-06-21 20:41:15,451][15401] Updated weights for policy 0, policy_version 60990 (0.0025) [2024-06-21 20:41:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41780.9, 300 sec: 42098.5). Total num frames: 999325696. Throughput: 0: 42549.4. Samples: 999525680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 20:41:19,775][15401] Updated weights for policy 0, policy_version 61000 (0.0035) [2024-06-21 20:41:23,392][15132] Fps is (10 sec: 40960.6, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 999555072. Throughput: 0: 42318.8. Samples: 999643580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:23,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 20:41:23,740][15401] Updated weights for policy 0, policy_version 61010 (0.0039) [2024-06-21 20:41:27,398][15401] Updated weights for policy 0, policy_version 61020 (0.0029) [2024-06-21 20:41:28,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42325.3, 300 sec: 42099.1). Total num frames: 999784448. Throughput: 0: 42266.3. Samples: 999901420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:28,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 20:41:31,391][15401] Updated weights for policy 0, policy_version 61030 (0.0034) [2024-06-21 20:41:33,389][15132] Fps is (10 sec: 40969.7, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 999964672. Throughput: 0: 42547.7. Samples: 1000161160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 20:41:35,139][15401] Updated weights for policy 0, policy_version 61040 (0.0040) [2024-06-21 20:41:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 1000210432. Throughput: 0: 42447.1. Samples: 1000285480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-21 20:41:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 20:41:38,966][15401] Updated weights for policy 0, policy_version 61050 (0.0034) [2024-06-21 20:41:42,942][15401] Updated weights for policy 0, policy_version 61060 (0.0037) [2024-06-21 20:41:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 1000423424. Throughput: 0: 42523.4. Samples: 1000543220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:41:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 20:41:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061061_1000423424.pth... [2024-06-21 20:41:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060445_990330880.pth [2024-06-21 20:41:46,972][15401] Updated weights for policy 0, policy_version 61070 (0.0037) [2024-06-21 20:41:48,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42051.3, 300 sec: 42098.2). Total num frames: 1000620032. Throughput: 0: 42449.6. Samples: 1000794580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:41:48,393][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 20:41:50,706][15401] Updated weights for policy 0, policy_version 61080 (0.0029) [2024-06-21 20:41:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42098.8). Total num frames: 1000865792. Throughput: 0: 42436.9. Samples: 1000915640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:41:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 20:41:54,504][15401] Updated weights for policy 0, policy_version 61090 (0.0040) [2024-06-21 20:41:58,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1001046016. Throughput: 0: 42439.3. Samples: 1001172940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:41:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 20:41:58,420][15401] Updated weights for policy 0, policy_version 61100 (0.0043) [2024-06-21 20:42:02,107][15401] Updated weights for policy 0, policy_version 61110 (0.0028) [2024-06-21 20:42:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.1, 300 sec: 42154.1). Total num frames: 1001259008. Throughput: 0: 42237.1. Samples: 1001426360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:42:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 20:42:06,290][15401] Updated weights for policy 0, policy_version 61120 (0.0043) [2024-06-21 20:42:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1001472000. Throughput: 0: 42393.7. Samples: 1001551200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:42:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 20:42:10,017][15401] Updated weights for policy 0, policy_version 61130 (0.0032) [2024-06-21 20:42:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42327.1, 300 sec: 42209.6). Total num frames: 1001684992. Throughput: 0: 42425.9. Samples: 1001810480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:42:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 20:42:14,077][15401] Updated weights for policy 0, policy_version 61140 (0.0031) [2024-06-21 20:42:17,758][15401] Updated weights for policy 0, policy_version 61150 (0.0027) [2024-06-21 20:42:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42210.0). Total num frames: 1001897984. Throughput: 0: 42174.6. Samples: 1002059020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:42:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 20:42:21,871][15401] Updated weights for policy 0, policy_version 61160 (0.0041) [2024-06-21 20:42:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.0, 300 sec: 42098.9). Total num frames: 1002127360. Throughput: 0: 42193.7. Samples: 1002184200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 20:42:25,736][15401] Updated weights for policy 0, policy_version 61170 (0.0042) [2024-06-21 20:42:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1002307584. Throughput: 0: 42249.3. Samples: 1002444440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 20:42:29,375][15401] Updated weights for policy 0, policy_version 61180 (0.0045) [2024-06-21 20:42:30,164][15349] Signal inference workers to stop experience collection... (14600 times) [2024-06-21 20:42:30,223][15401] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-06-21 20:42:30,280][15349] Signal inference workers to resume experience collection... (14600 times) [2024-06-21 20:42:30,280][15401] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-06-21 20:42:33,327][15401] Updated weights for policy 0, policy_version 61190 (0.0025) [2024-06-21 20:42:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42210.6). Total num frames: 1002536960. Throughput: 0: 42152.5. Samples: 1002691340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 20:42:37,779][15401] Updated weights for policy 0, policy_version 61200 (0.0040) [2024-06-21 20:42:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42098.9). Total num frames: 1002749952. Throughput: 0: 42277.8. Samples: 1002818140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 20:42:41,478][15401] Updated weights for policy 0, policy_version 61210 (0.0033) [2024-06-21 20:42:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42050.6, 300 sec: 42153.7). Total num frames: 1002946560. Throughput: 0: 42099.4. Samples: 1003067520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:43,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:42:45,374][15401] Updated weights for policy 0, policy_version 61220 (0.0033) [2024-06-21 20:42:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42098.6). Total num frames: 1003159552. Throughput: 0: 42161.0. Samples: 1003323600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:48,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-21 20:42:49,093][15401] Updated weights for policy 0, policy_version 61230 (0.0036) [2024-06-21 20:42:52,884][15401] Updated weights for policy 0, policy_version 61240 (0.0029) [2024-06-21 20:42:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 1003356160. Throughput: 0: 42189.8. Samples: 1003449740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 20:42:53,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 20:42:56,784][15401] Updated weights for policy 0, policy_version 61250 (0.0030) [2024-06-21 20:42:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1003585536. Throughput: 0: 42244.4. Samples: 1003711480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:42:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 20:43:00,757][15401] Updated weights for policy 0, policy_version 61260 (0.0035) [2024-06-21 20:43:03,391][15132] Fps is (10 sec: 44229.7, 60 sec: 42324.3, 300 sec: 42098.3). Total num frames: 1003798528. Throughput: 0: 42248.8. Samples: 1003960280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:03,392][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 20:43:04,703][15401] Updated weights for policy 0, policy_version 61270 (0.0039) [2024-06-21 20:43:08,391][15132] Fps is (10 sec: 40955.0, 60 sec: 42051.3, 300 sec: 42153.9). Total num frames: 1003995136. Throughput: 0: 42222.0. Samples: 1004084240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:08,391][15132] Avg episode reward: [(0, '0.315')] [2024-06-21 20:43:08,585][15401] Updated weights for policy 0, policy_version 61280 (0.0027) [2024-06-21 20:43:12,295][15401] Updated weights for policy 0, policy_version 61290 (0.0041) [2024-06-21 20:43:13,389][15132] Fps is (10 sec: 40966.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1004208128. Throughput: 0: 42176.9. Samples: 1004342400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-21 20:43:16,119][15401] Updated weights for policy 0, policy_version 61300 (0.0042) [2024-06-21 20:43:18,390][15132] Fps is (10 sec: 44241.6, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 1004437504. Throughput: 0: 42231.8. Samples: 1004591780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 20:43:20,161][15401] Updated weights for policy 0, policy_version 61310 (0.0035) [2024-06-21 20:43:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1004650496. Throughput: 0: 42258.2. Samples: 1004719760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 20:43:23,732][15401] Updated weights for policy 0, policy_version 61320 (0.0042) [2024-06-21 20:43:27,872][15401] Updated weights for policy 0, policy_version 61330 (0.0035) [2024-06-21 20:43:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1004847104. Throughput: 0: 42387.5. Samples: 1004974860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 20:43:31,582][15401] Updated weights for policy 0, policy_version 61340 (0.0041) [2024-06-21 20:43:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1005076480. Throughput: 0: 42204.1. Samples: 1005222780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 20:43:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 20:43:35,623][15401] Updated weights for policy 0, policy_version 61350 (0.0029) [2024-06-21 20:43:38,392][15132] Fps is (10 sec: 40951.4, 60 sec: 41777.7, 300 sec: 42153.8). Total num frames: 1005256704. Throughput: 0: 42230.3. Samples: 1005350200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:43:38,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 20:43:39,284][15401] Updated weights for policy 0, policy_version 61360 (0.0032) [2024-06-21 20:43:43,383][15401] Updated weights for policy 0, policy_version 61370 (0.0022) [2024-06-21 20:43:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42320.7). Total num frames: 1005486080. Throughput: 0: 42056.6. Samples: 1005604020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:43:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 20:43:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061370_1005486080.pth... [2024-06-21 20:43:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000060749_995311616.pth [2024-06-21 20:43:46,281][15349] Signal inference workers to stop experience collection... (14650 times) [2024-06-21 20:43:46,340][15349] Signal inference workers to resume experience collection... (14650 times) [2024-06-21 20:43:46,340][15401] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-06-21 20:43:46,365][15401] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-06-21 20:43:47,346][15401] Updated weights for policy 0, policy_version 61380 (0.0045) [2024-06-21 20:43:48,389][15132] Fps is (10 sec: 44246.6, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1005699072. Throughput: 0: 42004.1. Samples: 1005850400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:43:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 20:43:51,088][15401] Updated weights for policy 0, policy_version 61390 (0.0035) [2024-06-21 20:43:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1005912064. Throughput: 0: 42145.1. Samples: 1005980720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:43:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 20:43:55,008][15401] Updated weights for policy 0, policy_version 61400 (0.0041) [2024-06-21 20:43:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1006092288. Throughput: 0: 41968.9. Samples: 1006231000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:43:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 20:43:59,272][15401] Updated weights for policy 0, policy_version 61410 (0.0044) [2024-06-21 20:44:02,697][15401] Updated weights for policy 0, policy_version 61420 (0.0030) [2024-06-21 20:44:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42053.4, 300 sec: 42154.1). Total num frames: 1006321664. Throughput: 0: 42015.4. Samples: 1006482460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:44:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 20:44:06,911][15401] Updated weights for policy 0, policy_version 61430 (0.0025) [2024-06-21 20:44:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42326.2, 300 sec: 42320.7). Total num frames: 1006534656. Throughput: 0: 42049.8. Samples: 1006612000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:44:08,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 20:44:10,499][15401] Updated weights for policy 0, policy_version 61440 (0.0030) [2024-06-21 20:44:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42210.0). Total num frames: 1006731264. Throughput: 0: 41766.3. Samples: 1006854340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 20:44:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 20:44:14,611][15401] Updated weights for policy 0, policy_version 61450 (0.0035) [2024-06-21 20:44:18,174][15401] Updated weights for policy 0, policy_version 61460 (0.0038) [2024-06-21 20:44:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42153.7). Total num frames: 1006960640. Throughput: 0: 42046.6. Samples: 1007114980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:18,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 20:44:22,156][15401] Updated weights for policy 0, policy_version 61470 (0.0042) [2024-06-21 20:44:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1007157248. Throughput: 0: 42136.7. Samples: 1007246260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 20:44:25,867][15401] Updated weights for policy 0, policy_version 61480 (0.0025) [2024-06-21 20:44:28,392][15132] Fps is (10 sec: 42599.9, 60 sec: 42324.0, 300 sec: 42209.3). Total num frames: 1007386624. Throughput: 0: 42035.8. Samples: 1007495720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:28,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 20:44:30,188][15401] Updated weights for policy 0, policy_version 61490 (0.0032) [2024-06-21 20:44:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 1007566848. Throughput: 0: 42231.2. Samples: 1007750800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:33,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 20:44:33,801][15401] Updated weights for policy 0, policy_version 61500 (0.0035) [2024-06-21 20:44:37,934][15401] Updated weights for policy 0, policy_version 61510 (0.0037) [2024-06-21 20:44:38,389][15132] Fps is (10 sec: 40968.6, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1007796224. Throughput: 0: 42103.7. Samples: 1007875380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:38,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 20:44:41,839][15401] Updated weights for policy 0, policy_version 61520 (0.0031) [2024-06-21 20:44:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1008025600. Throughput: 0: 42178.7. Samples: 1008129040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:43,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 20:44:45,593][15401] Updated weights for policy 0, policy_version 61530 (0.0038) [2024-06-21 20:44:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 1008222208. Throughput: 0: 42225.1. Samples: 1008382600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:44:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 20:44:49,547][15401] Updated weights for policy 0, policy_version 61540 (0.0025) [2024-06-21 20:44:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1008418816. Throughput: 0: 42117.4. Samples: 1008507280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:44:53,398][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 20:44:53,515][15401] Updated weights for policy 0, policy_version 61550 (0.0029) [2024-06-21 20:44:56,854][15349] Signal inference workers to stop experience collection... (14700 times) [2024-06-21 20:44:56,904][15401] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-06-21 20:44:56,908][15349] Signal inference workers to resume experience collection... (14700 times) [2024-06-21 20:44:56,916][15401] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-06-21 20:44:57,210][15401] Updated weights for policy 0, policy_version 61560 (0.0033) [2024-06-21 20:44:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1008631808. Throughput: 0: 42418.1. Samples: 1008763160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:44:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 20:45:01,264][15401] Updated weights for policy 0, policy_version 61570 (0.0039) [2024-06-21 20:45:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1008844800. Throughput: 0: 42245.8. Samples: 1009015940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 20:45:04,879][15401] Updated weights for policy 0, policy_version 61580 (0.0037) [2024-06-21 20:45:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1009074176. Throughput: 0: 42188.0. Samples: 1009144720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 20:45:08,844][15401] Updated weights for policy 0, policy_version 61590 (0.0045) [2024-06-21 20:45:12,743][15401] Updated weights for policy 0, policy_version 61600 (0.0039) [2024-06-21 20:45:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1009270784. Throughput: 0: 42265.1. Samples: 1009397560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 20:45:16,776][15401] Updated weights for policy 0, policy_version 61610 (0.0041) [2024-06-21 20:45:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42265.1). Total num frames: 1009483776. Throughput: 0: 42066.5. Samples: 1009643800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:18,396][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 20:45:20,820][15401] Updated weights for policy 0, policy_version 61620 (0.0034) [2024-06-21 20:45:23,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42265.5). Total num frames: 1009713152. Throughput: 0: 42038.9. Samples: 1009767140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 20:45:24,387][15401] Updated weights for policy 0, policy_version 61630 (0.0044) [2024-06-21 20:45:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41507.6, 300 sec: 42098.6). Total num frames: 1009876992. Throughput: 0: 42097.9. Samples: 1010023440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-21 20:45:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 20:45:28,524][15401] Updated weights for policy 0, policy_version 61640 (0.0032) [2024-06-21 20:45:32,411][15401] Updated weights for policy 0, policy_version 61650 (0.0029) [2024-06-21 20:45:33,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1010089984. Throughput: 0: 42150.0. Samples: 1010279340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:33,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-21 20:45:36,326][15401] Updated weights for policy 0, policy_version 61660 (0.0039) [2024-06-21 20:45:38,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42323.6, 300 sec: 42209.3). Total num frames: 1010335744. Throughput: 0: 42206.7. Samples: 1010406680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:38,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 20:45:40,040][15401] Updated weights for policy 0, policy_version 61670 (0.0027) [2024-06-21 20:45:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 42154.2). Total num frames: 1010532352. Throughput: 0: 42021.1. Samples: 1010654100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-21 20:45:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061678_1010532352.pth... [2024-06-21 20:45:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061061_1000423424.pth [2024-06-21 20:45:44,015][15401] Updated weights for policy 0, policy_version 61680 (0.0037) [2024-06-21 20:45:48,194][15401] Updated weights for policy 0, policy_version 61690 (0.0037) [2024-06-21 20:45:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 1010745344. Throughput: 0: 42177.5. Samples: 1010913920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 20:45:51,656][15401] Updated weights for policy 0, policy_version 61700 (0.0033) [2024-06-21 20:45:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1010958336. Throughput: 0: 42115.2. Samples: 1011039900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 20:45:55,637][15401] Updated weights for policy 0, policy_version 61710 (0.0032) [2024-06-21 20:45:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.8, 300 sec: 42264.8). Total num frames: 1011187712. Throughput: 0: 42094.6. Samples: 1011291920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:45:58,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 20:45:59,370][15401] Updated weights for policy 0, policy_version 61720 (0.0036) [2024-06-21 20:46:03,154][15401] Updated weights for policy 0, policy_version 61730 (0.0035) [2024-06-21 20:46:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1011400704. Throughput: 0: 42469.0. Samples: 1011554900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-06-21 20:46:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 20:46:07,549][15401] Updated weights for policy 0, policy_version 61740 (0.0045) [2024-06-21 20:46:08,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42050.6, 300 sec: 42209.6). Total num frames: 1011597312. Throughput: 0: 42428.1. Samples: 1011676500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:08,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 20:46:11,078][15401] Updated weights for policy 0, policy_version 61750 (0.0039) [2024-06-21 20:46:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1011810304. Throughput: 0: 42422.9. Samples: 1011932480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 20:46:15,066][15401] Updated weights for policy 0, policy_version 61760 (0.0028) [2024-06-21 20:46:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42265.5). Total num frames: 1012023296. Throughput: 0: 42489.7. Samples: 1012191380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 20:46:18,529][15401] Updated weights for policy 0, policy_version 61770 (0.0039) [2024-06-21 20:46:22,454][15349] Signal inference workers to stop experience collection... (14750 times) [2024-06-21 20:46:22,497][15401] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-06-21 20:46:22,504][15349] Signal inference workers to resume experience collection... (14750 times) [2024-06-21 20:46:22,513][15401] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-06-21 20:46:22,638][15401] Updated weights for policy 0, policy_version 61780 (0.0043) [2024-06-21 20:46:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.4, 300 sec: 42154.4). Total num frames: 1012219904. Throughput: 0: 42465.0. Samples: 1012317500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 20:46:26,300][15401] Updated weights for policy 0, policy_version 61790 (0.0046) [2024-06-21 20:46:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42376.2). Total num frames: 1012465664. Throughput: 0: 42679.9. Samples: 1012574700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 20:46:30,282][15401] Updated weights for policy 0, policy_version 61800 (0.0028) [2024-06-21 20:46:33,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 1012662272. Throughput: 0: 42637.5. Samples: 1012832620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 20:46:33,943][15401] Updated weights for policy 0, policy_version 61810 (0.0031) [2024-06-21 20:46:38,100][15401] Updated weights for policy 0, policy_version 61820 (0.0040) [2024-06-21 20:46:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42209.6). Total num frames: 1012875264. Throughput: 0: 42632.9. Samples: 1012958380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 20:46:41,457][15401] Updated weights for policy 0, policy_version 61830 (0.0037) [2024-06-21 20:46:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.4, 300 sec: 42321.1). Total num frames: 1013104640. Throughput: 0: 42637.9. Samples: 1013210520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-21 20:46:43,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-21 20:46:45,783][15401] Updated weights for policy 0, policy_version 61840 (0.0038) [2024-06-21 20:46:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42098.6). Total num frames: 1013284864. Throughput: 0: 42658.6. Samples: 1013474540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:46:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 20:46:49,136][15401] Updated weights for policy 0, policy_version 61850 (0.0027) [2024-06-21 20:46:53,263][15401] Updated weights for policy 0, policy_version 61860 (0.0027) [2024-06-21 20:46:53,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42265.1). Total num frames: 1013514240. Throughput: 0: 42596.3. Samples: 1013593240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:46:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 20:46:56,695][15401] Updated weights for policy 0, policy_version 61870 (0.0032) [2024-06-21 20:46:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42873.1, 300 sec: 42376.3). Total num frames: 1013760000. Throughput: 0: 42618.2. Samples: 1013850300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:46:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 20:47:00,674][15401] Updated weights for policy 0, policy_version 61880 (0.0024) [2024-06-21 20:47:03,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42323.6, 300 sec: 42264.8). Total num frames: 1013940224. Throughput: 0: 42815.0. Samples: 1014118160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:47:03,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 20:47:04,147][15401] Updated weights for policy 0, policy_version 61890 (0.0020) [2024-06-21 20:47:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 1014136832. Throughput: 0: 42680.7. Samples: 1014238140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:47:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 20:47:08,964][15401] Updated weights for policy 0, policy_version 61900 (0.0043) [2024-06-21 20:47:11,890][15401] Updated weights for policy 0, policy_version 61910 (0.0037) [2024-06-21 20:47:13,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1014382592. Throughput: 0: 42598.7. Samples: 1014491640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:47:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 20:47:16,635][15401] Updated weights for policy 0, policy_version 61920 (0.0036) [2024-06-21 20:47:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1014579200. Throughput: 0: 42729.6. Samples: 1014755440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:47:18,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 20:47:19,643][15401] Updated weights for policy 0, policy_version 61930 (0.0038) [2024-06-21 20:47:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42320.4). Total num frames: 1014792192. Throughput: 0: 42659.0. Samples: 1014878140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 20:47:23,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 20:47:24,275][15401] Updated weights for policy 0, policy_version 61940 (0.0047) [2024-06-21 20:47:27,423][15401] Updated weights for policy 0, policy_version 61950 (0.0027) [2024-06-21 20:47:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1015037952. Throughput: 0: 42707.1. Samples: 1015132340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 20:47:32,101][15401] Updated weights for policy 0, policy_version 61960 (0.0034) [2024-06-21 20:47:32,340][15349] Signal inference workers to stop experience collection... (14800 times) [2024-06-21 20:47:32,341][15349] Signal inference workers to resume experience collection... (14800 times) [2024-06-21 20:47:32,387][15401] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-06-21 20:47:32,387][15401] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-06-21 20:47:33,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 1015201792. Throughput: 0: 42533.8. Samples: 1015388660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:33,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 20:47:35,099][15401] Updated weights for policy 0, policy_version 61970 (0.0029) [2024-06-21 20:47:38,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.2, 300 sec: 42265.5). Total num frames: 1015414784. Throughput: 0: 42619.7. Samples: 1015511120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-21 20:47:39,635][15401] Updated weights for policy 0, policy_version 61980 (0.0036) [2024-06-21 20:47:42,940][15401] Updated weights for policy 0, policy_version 61990 (0.0028) [2024-06-21 20:47:43,389][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1015660544. Throughput: 0: 42620.9. Samples: 1015768240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 20:47:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061991_1015660544.pth... [2024-06-21 20:47:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061370_1005486080.pth [2024-06-21 20:47:47,365][15401] Updated weights for policy 0, policy_version 62000 (0.0022) [2024-06-21 20:47:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1015840768. Throughput: 0: 42421.5. Samples: 1016027020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 20:47:50,502][15401] Updated weights for policy 0, policy_version 62010 (0.0028) [2024-06-21 20:47:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1016070144. Throughput: 0: 42520.5. Samples: 1016151560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 20:47:54,931][15401] Updated weights for policy 0, policy_version 62020 (0.0024) [2024-06-21 20:47:58,306][15401] Updated weights for policy 0, policy_version 62030 (0.0037) [2024-06-21 20:47:58,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42325.3, 300 sec: 42376.4). Total num frames: 1016299520. Throughput: 0: 42633.6. Samples: 1016410160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-21 20:47:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 20:48:02,525][15401] Updated weights for policy 0, policy_version 62040 (0.0041) [2024-06-21 20:48:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42320.9). Total num frames: 1016479744. Throughput: 0: 42590.6. Samples: 1016672020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 20:48:06,004][15401] Updated weights for policy 0, policy_version 62050 (0.0031) [2024-06-21 20:48:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 1016725504. Throughput: 0: 42532.6. Samples: 1016792000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 20:48:10,502][15401] Updated weights for policy 0, policy_version 62060 (0.0032) [2024-06-21 20:48:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1016922112. Throughput: 0: 42448.9. Samples: 1017042540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:13,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-21 20:48:13,667][15401] Updated weights for policy 0, policy_version 62070 (0.0045) [2024-06-21 20:48:18,028][15401] Updated weights for policy 0, policy_version 62080 (0.0046) [2024-06-21 20:48:18,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.6, 300 sec: 42320.4). Total num frames: 1017135104. Throughput: 0: 42585.7. Samples: 1017305020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:18,393][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 20:48:21,395][15401] Updated weights for policy 0, policy_version 62090 (0.0046) [2024-06-21 20:48:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42376.3). Total num frames: 1017348096. Throughput: 0: 42697.5. Samples: 1017432500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 20:48:25,445][15401] Updated weights for policy 0, policy_version 62100 (0.0024) [2024-06-21 20:48:28,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1017577472. Throughput: 0: 42573.8. Samples: 1017684060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 20:48:29,081][15401] Updated weights for policy 0, policy_version 62110 (0.0033) [2024-06-21 20:48:32,821][15401] Updated weights for policy 0, policy_version 62120 (0.0034) [2024-06-21 20:48:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42873.1, 300 sec: 42432.1). Total num frames: 1017774080. Throughput: 0: 42676.7. Samples: 1017947480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 20:48:36,797][15401] Updated weights for policy 0, policy_version 62130 (0.0032) [2024-06-21 20:48:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 1017987072. Throughput: 0: 42727.7. Samples: 1018074300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:48:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 20:48:40,445][15401] Updated weights for policy 0, policy_version 62140 (0.0035) [2024-06-21 20:48:43,392][15132] Fps is (10 sec: 45864.9, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 1018232832. Throughput: 0: 42790.3. Samples: 1018335820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:48:43,392][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 20:48:44,647][15401] Updated weights for policy 0, policy_version 62150 (0.0023) [2024-06-21 20:48:48,227][15401] Updated weights for policy 0, policy_version 62160 (0.0042) [2024-06-21 20:48:48,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43142.7, 300 sec: 42431.4). Total num frames: 1018429440. Throughput: 0: 42624.8. Samples: 1018590240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:48:48,393][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 20:48:49,523][15349] Signal inference workers to stop experience collection... (14850 times) [2024-06-21 20:48:49,524][15349] Signal inference workers to resume experience collection... (14850 times) [2024-06-21 20:48:49,570][15401] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-06-21 20:48:49,570][15401] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-06-21 20:48:52,444][15401] Updated weights for policy 0, policy_version 62170 (0.0046) [2024-06-21 20:48:53,389][15132] Fps is (10 sec: 37692.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1018609664. Throughput: 0: 42629.7. Samples: 1018710340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:48:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 20:48:56,158][15401] Updated weights for policy 0, policy_version 62180 (0.0040) [2024-06-21 20:48:58,392][15132] Fps is (10 sec: 44237.0, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1018871808. Throughput: 0: 42774.6. Samples: 1018967500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:48:58,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 20:49:00,081][15401] Updated weights for policy 0, policy_version 62190 (0.0027) [2024-06-21 20:49:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1019052032. Throughput: 0: 42728.7. Samples: 1019227700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:49:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 20:49:03,647][15401] Updated weights for policy 0, policy_version 62200 (0.0039) [2024-06-21 20:49:07,795][15401] Updated weights for policy 0, policy_version 62210 (0.0033) [2024-06-21 20:49:08,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1019265024. Throughput: 0: 42615.9. Samples: 1019350220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:49:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 20:49:11,377][15401] Updated weights for policy 0, policy_version 62220 (0.0040) [2024-06-21 20:49:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 1019510784. Throughput: 0: 42724.8. Samples: 1019606680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:49:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 20:49:15,351][15401] Updated weights for policy 0, policy_version 62230 (0.0035) [2024-06-21 20:49:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 1019691008. Throughput: 0: 42553.8. Samples: 1019862400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 20:49:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 20:49:18,827][15401] Updated weights for policy 0, policy_version 62240 (0.0037) [2024-06-21 20:49:22,902][15401] Updated weights for policy 0, policy_version 62250 (0.0032) [2024-06-21 20:49:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1019904000. Throughput: 0: 42442.5. Samples: 1019984220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 20:49:26,311][15401] Updated weights for policy 0, policy_version 62260 (0.0036) [2024-06-21 20:49:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1020149760. Throughput: 0: 42414.1. Samples: 1020244360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 20:49:30,973][15401] Updated weights for policy 0, policy_version 62270 (0.0043) [2024-06-21 20:49:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1020313600. Throughput: 0: 42604.1. Samples: 1020507320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:33,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-21 20:49:34,201][15401] Updated weights for policy 0, policy_version 62280 (0.0042) [2024-06-21 20:49:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1020542976. Throughput: 0: 42492.4. Samples: 1020622500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 20:49:38,545][15401] Updated weights for policy 0, policy_version 62290 (0.0031) [2024-06-21 20:49:41,945][15401] Updated weights for policy 0, policy_version 62300 (0.0035) [2024-06-21 20:49:43,396][15132] Fps is (10 sec: 47483.0, 60 sec: 42595.5, 300 sec: 42597.5). Total num frames: 1020788736. Throughput: 0: 42466.0. Samples: 1020878640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:43,397][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 20:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062304_1020788736.pth... [2024-06-21 20:49:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061678_1010532352.pth [2024-06-21 20:49:46,428][15401] Updated weights for policy 0, policy_version 62310 (0.0033) [2024-06-21 20:49:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41779.2, 300 sec: 42431.4). Total num frames: 1020936192. Throughput: 0: 42648.3. Samples: 1021146980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:48,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 20:49:49,890][15401] Updated weights for policy 0, policy_version 62320 (0.0039) [2024-06-21 20:49:53,389][15132] Fps is (10 sec: 39346.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1021181952. Throughput: 0: 42453.8. Samples: 1021260640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 20:49:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 20:49:53,987][15401] Updated weights for policy 0, policy_version 62330 (0.0035) [2024-06-21 20:49:57,487][15401] Updated weights for policy 0, policy_version 62340 (0.0035) [2024-06-21 20:49:58,392][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 1021427712. Throughput: 0: 42535.9. Samples: 1021520900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:49:58,393][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 20:50:01,551][15401] Updated weights for policy 0, policy_version 62350 (0.0028) [2024-06-21 20:50:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 1021575168. Throughput: 0: 42585.9. Samples: 1021778760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 20:50:04,995][15349] Signal inference workers to stop experience collection... (14900 times) [2024-06-21 20:50:05,054][15401] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-06-21 20:50:05,113][15349] Signal inference workers to resume experience collection... (14900 times) [2024-06-21 20:50:05,114][15401] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-06-21 20:50:05,243][15401] Updated weights for policy 0, policy_version 62360 (0.0031) [2024-06-21 20:50:08,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 1021820928. Throughput: 0: 42506.2. Samples: 1021897100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:08,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 20:50:09,475][15401] Updated weights for policy 0, policy_version 62370 (0.0027) [2024-06-21 20:50:13,269][15401] Updated weights for policy 0, policy_version 62380 (0.0033) [2024-06-21 20:50:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1022033920. Throughput: 0: 42356.6. Samples: 1022150400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 20:50:17,295][15401] Updated weights for policy 0, policy_version 62390 (0.0044) [2024-06-21 20:50:18,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1022214144. Throughput: 0: 42210.2. Samples: 1022406780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 20:50:20,930][15401] Updated weights for policy 0, policy_version 62400 (0.0041) [2024-06-21 20:50:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1022459904. Throughput: 0: 42359.2. Samples: 1022528660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 20:50:24,989][15401] Updated weights for policy 0, policy_version 62410 (0.0033) [2024-06-21 20:50:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 41777.6, 300 sec: 42598.0). Total num frames: 1022656512. Throughput: 0: 42310.5. Samples: 1022782440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:28,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 20:50:29,009][15401] Updated weights for policy 0, policy_version 62420 (0.0045) [2024-06-21 20:50:32,804][15401] Updated weights for policy 0, policy_version 62430 (0.0031) [2024-06-21 20:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 1022869504. Throughput: 0: 42101.8. Samples: 1023041460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 20:50:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 20:50:36,562][15401] Updated weights for policy 0, policy_version 62440 (0.0033) [2024-06-21 20:50:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1023082496. Throughput: 0: 42364.5. Samples: 1023167040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:50:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 20:50:40,533][15401] Updated weights for policy 0, policy_version 62450 (0.0028) [2024-06-21 20:50:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42056.7, 300 sec: 42598.4). Total num frames: 1023311872. Throughput: 0: 42223.1. Samples: 1023420840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:50:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 20:50:44,145][15401] Updated weights for policy 0, policy_version 62460 (0.0030) [2024-06-21 20:50:48,394][15132] Fps is (10 sec: 42580.0, 60 sec: 42870.2, 300 sec: 42542.2). Total num frames: 1023508480. Throughput: 0: 42208.5. Samples: 1023678320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:50:48,394][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 20:50:48,394][15401] Updated weights for policy 0, policy_version 62470 (0.0029) [2024-06-21 20:50:51,729][15401] Updated weights for policy 0, policy_version 62480 (0.0031) [2024-06-21 20:50:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1023721472. Throughput: 0: 42183.3. Samples: 1023795240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:50:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 20:50:56,279][15401] Updated weights for policy 0, policy_version 62490 (0.0038) [2024-06-21 20:50:58,389][15132] Fps is (10 sec: 44255.9, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 1023950848. Throughput: 0: 42221.3. Samples: 1024050360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:50:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 20:50:59,902][15401] Updated weights for policy 0, policy_version 62500 (0.0027) [2024-06-21 20:51:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 1024131072. Throughput: 0: 42266.6. Samples: 1024308780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:51:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 20:51:03,975][15401] Updated weights for policy 0, policy_version 62510 (0.0045) [2024-06-21 20:51:07,574][15401] Updated weights for policy 0, policy_version 62520 (0.0036) [2024-06-21 20:51:08,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42052.3, 300 sec: 42487.0). Total num frames: 1024344064. Throughput: 0: 42150.1. Samples: 1024425520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:51:08,393][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 20:51:11,760][15401] Updated weights for policy 0, policy_version 62530 (0.0033) [2024-06-21 20:51:13,391][15132] Fps is (10 sec: 44231.7, 60 sec: 42324.4, 300 sec: 42542.7). Total num frames: 1024573440. Throughput: 0: 42210.4. Samples: 1024681860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 20:51:13,391][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 20:51:15,242][15401] Updated weights for policy 0, policy_version 62540 (0.0039) [2024-06-21 20:51:15,810][15349] Signal inference workers to stop experience collection... (14950 times) [2024-06-21 20:51:15,811][15349] Signal inference workers to resume experience collection... (14950 times) [2024-06-21 20:51:15,823][15401] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-06-21 20:51:15,823][15401] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-06-21 20:51:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1024753664. Throughput: 0: 42162.4. Samples: 1024938760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 20:51:19,335][15401] Updated weights for policy 0, policy_version 62550 (0.0034) [2024-06-21 20:51:23,019][15401] Updated weights for policy 0, policy_version 62560 (0.0026) [2024-06-21 20:51:23,390][15132] Fps is (10 sec: 42603.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1024999424. Throughput: 0: 42143.9. Samples: 1025063520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:23,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-21 20:51:26,895][15401] Updated weights for policy 0, policy_version 62570 (0.0032) [2024-06-21 20:51:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 42487.4). Total num frames: 1025196032. Throughput: 0: 42327.7. Samples: 1025325580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-21 20:51:30,530][15401] Updated weights for policy 0, policy_version 62580 (0.0027) [2024-06-21 20:51:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1025409024. Throughput: 0: 42325.7. Samples: 1025582800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 20:51:34,553][15401] Updated weights for policy 0, policy_version 62590 (0.0034) [2024-06-21 20:51:37,881][15401] Updated weights for policy 0, policy_version 62600 (0.0033) [2024-06-21 20:51:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1025654784. Throughput: 0: 42634.5. Samples: 1025713800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 20:51:42,292][15401] Updated weights for policy 0, policy_version 62610 (0.0029) [2024-06-21 20:51:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 1025835008. Throughput: 0: 42782.9. Samples: 1025975700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:43,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 20:51:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062612_1025835008.pth... [2024-06-21 20:51:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000061991_1015660544.pth [2024-06-21 20:51:45,371][15401] Updated weights for policy 0, policy_version 62620 (0.0035) [2024-06-21 20:51:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42328.3, 300 sec: 42487.4). Total num frames: 1026048000. Throughput: 0: 42553.9. Samples: 1026223700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 20:51:49,890][15401] Updated weights for policy 0, policy_version 62630 (0.0024) [2024-06-21 20:51:53,174][15401] Updated weights for policy 0, policy_version 62640 (0.0031) [2024-06-21 20:51:53,390][15132] Fps is (10 sec: 47525.1, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 1026310144. Throughput: 0: 42846.7. Samples: 1026353520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 20:51:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 20:51:57,412][15401] Updated weights for policy 0, policy_version 62650 (0.0038) [2024-06-21 20:51:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42432.1). Total num frames: 1026457600. Throughput: 0: 42736.7. Samples: 1026604960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:51:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 20:52:00,780][15401] Updated weights for policy 0, policy_version 62660 (0.0036) [2024-06-21 20:52:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1026686976. Throughput: 0: 42737.4. Samples: 1026861940. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 20:52:05,375][15401] Updated weights for policy 0, policy_version 62670 (0.0031) [2024-06-21 20:52:08,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43144.6, 300 sec: 42542.5). Total num frames: 1026932736. Throughput: 0: 42723.6. Samples: 1026986180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:08,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 20:52:08,643][15401] Updated weights for policy 0, policy_version 62680 (0.0038) [2024-06-21 20:52:13,076][15401] Updated weights for policy 0, policy_version 62690 (0.0034) [2024-06-21 20:52:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42599.1, 300 sec: 42542.8). Total num frames: 1027129344. Throughput: 0: 42586.9. Samples: 1027242000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:13,396][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 20:52:15,812][15349] Signal inference workers to stop experience collection... (15000 times) [2024-06-21 20:52:15,864][15401] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-06-21 20:52:15,865][15349] Signal inference workers to resume experience collection... (15000 times) [2024-06-21 20:52:15,872][15401] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-06-21 20:52:16,261][15401] Updated weights for policy 0, policy_version 62700 (0.0027) [2024-06-21 20:52:18,392][15132] Fps is (10 sec: 40959.6, 60 sec: 43142.7, 300 sec: 42542.9). Total num frames: 1027342336. Throughput: 0: 42476.0. Samples: 1027494320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:18,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 20:52:20,755][15401] Updated weights for policy 0, policy_version 62710 (0.0026) [2024-06-21 20:52:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1027538944. Throughput: 0: 42368.6. Samples: 1027620380. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 20:52:23,970][15401] Updated weights for policy 0, policy_version 62720 (0.0030) [2024-06-21 20:52:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1027751936. Throughput: 0: 42335.7. Samples: 1027880700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-21 20:52:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 20:52:28,423][15401] Updated weights for policy 0, policy_version 62730 (0.0035) [2024-06-21 20:52:31,777][15401] Updated weights for policy 0, policy_version 62740 (0.0038) [2024-06-21 20:52:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1027981312. Throughput: 0: 42439.1. Samples: 1028133460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 20:52:36,152][15401] Updated weights for policy 0, policy_version 62750 (0.0052) [2024-06-21 20:52:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1028161536. Throughput: 0: 42445.7. Samples: 1028263580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 20:52:39,454][15401] Updated weights for policy 0, policy_version 62760 (0.0034) [2024-06-21 20:52:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 1028390912. Throughput: 0: 42466.7. Samples: 1028515960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 20:52:44,195][15401] Updated weights for policy 0, policy_version 62770 (0.0038) [2024-06-21 20:52:47,254][15401] Updated weights for policy 0, policy_version 62780 (0.0039) [2024-06-21 20:52:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1028603904. Throughput: 0: 42426.1. Samples: 1028771120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 20:52:51,867][15401] Updated weights for policy 0, policy_version 62790 (0.0032) [2024-06-21 20:52:53,389][15132] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 42320.7). Total num frames: 1028784128. Throughput: 0: 42603.1. Samples: 1028903220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 20:52:54,707][15401] Updated weights for policy 0, policy_version 62800 (0.0028) [2024-06-21 20:52:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1029029888. Throughput: 0: 42490.2. Samples: 1029154060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:52:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-21 20:52:59,578][15401] Updated weights for policy 0, policy_version 62810 (0.0038) [2024-06-21 20:53:02,452][15401] Updated weights for policy 0, policy_version 62820 (0.0048) [2024-06-21 20:53:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1029259264. Throughput: 0: 42576.1. Samples: 1029410140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:53:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 20:53:07,206][15401] Updated weights for policy 0, policy_version 62830 (0.0031) [2024-06-21 20:53:08,389][15132] Fps is (10 sec: 40961.2, 60 sec: 41780.9, 300 sec: 42431.8). Total num frames: 1029439488. Throughput: 0: 42651.1. Samples: 1029539680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 20:53:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 20:53:10,619][15401] Updated weights for policy 0, policy_version 62840 (0.0043) [2024-06-21 20:53:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1029668864. Throughput: 0: 42387.4. Samples: 1029788140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 20:53:14,860][15401] Updated weights for policy 0, policy_version 62850 (0.0029) [2024-06-21 20:53:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 1029881856. Throughput: 0: 42418.1. Samples: 1030042280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 20:53:18,591][15401] Updated weights for policy 0, policy_version 62860 (0.0047) [2024-06-21 20:53:22,608][15401] Updated weights for policy 0, policy_version 62870 (0.0048) [2024-06-21 20:53:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1030078464. Throughput: 0: 42377.1. Samples: 1030170540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 20:53:26,199][15401] Updated weights for policy 0, policy_version 62880 (0.0027) [2024-06-21 20:53:28,394][15132] Fps is (10 sec: 42581.4, 60 sec: 42595.5, 300 sec: 42486.8). Total num frames: 1030307840. Throughput: 0: 42296.1. Samples: 1030419460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:28,394][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 20:53:30,124][15401] Updated weights for policy 0, policy_version 62890 (0.0027) [2024-06-21 20:53:31,767][15349] Signal inference workers to stop experience collection... (15050 times) [2024-06-21 20:53:31,769][15349] Signal inference workers to resume experience collection... (15050 times) [2024-06-21 20:53:31,786][15401] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-06-21 20:53:31,819][15401] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-06-21 20:53:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 1030520832. Throughput: 0: 42441.3. Samples: 1030681080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:33,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 20:53:33,921][15401] Updated weights for policy 0, policy_version 62900 (0.0032) [2024-06-21 20:53:38,273][15401] Updated weights for policy 0, policy_version 62910 (0.0031) [2024-06-21 20:53:38,391][15132] Fps is (10 sec: 40969.5, 60 sec: 42597.3, 300 sec: 42320.8). Total num frames: 1030717440. Throughput: 0: 42236.6. Samples: 1030803940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:38,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 20:53:41,945][15401] Updated weights for policy 0, policy_version 62920 (0.0058) [2024-06-21 20:53:43,392][15132] Fps is (10 sec: 40959.7, 60 sec: 42323.5, 300 sec: 42376.2). Total num frames: 1030930432. Throughput: 0: 42172.1. Samples: 1031051900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:43,393][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 20:53:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062924_1030946816.pth... [2024-06-21 20:53:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062304_1020788736.pth [2024-06-21 20:53:45,846][15401] Updated weights for policy 0, policy_version 62930 (0.0042) [2024-06-21 20:53:48,392][15132] Fps is (10 sec: 42595.3, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 1031143424. Throughput: 0: 42263.0. Samples: 1031312080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 20:53:48,393][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 20:53:49,550][15401] Updated weights for policy 0, policy_version 62940 (0.0033) [2024-06-21 20:53:53,389][15132] Fps is (10 sec: 42609.6, 60 sec: 42871.6, 300 sec: 42321.1). Total num frames: 1031356416. Throughput: 0: 42214.3. Samples: 1031439320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:53:53,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-21 20:53:53,521][15401] Updated weights for policy 0, policy_version 62950 (0.0031) [2024-06-21 20:53:57,276][15401] Updated weights for policy 0, policy_version 62960 (0.0045) [2024-06-21 20:53:58,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1031553024. Throughput: 0: 42211.1. Samples: 1031687640. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:53:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 20:54:01,522][15401] Updated weights for policy 0, policy_version 62970 (0.0043) [2024-06-21 20:54:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1031782400. Throughput: 0: 42245.0. Samples: 1031943300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 20:54:05,175][15401] Updated weights for policy 0, policy_version 62980 (0.0028) [2024-06-21 20:54:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1031962624. Throughput: 0: 42095.9. Samples: 1032064860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:08,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 20:54:09,521][15401] Updated weights for policy 0, policy_version 62990 (0.0034) [2024-06-21 20:54:13,239][15401] Updated weights for policy 0, policy_version 63000 (0.0042) [2024-06-21 20:54:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1032192000. Throughput: 0: 42143.3. Samples: 1032315740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 20:54:17,250][15401] Updated weights for policy 0, policy_version 63010 (0.0026) [2024-06-21 20:54:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1032404992. Throughput: 0: 42107.5. Samples: 1032575820. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 20:54:20,834][15401] Updated weights for policy 0, policy_version 63020 (0.0034) [2024-06-21 20:54:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42264.8). Total num frames: 1032617984. Throughput: 0: 42139.3. Samples: 1032700240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:23,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 20:54:24,881][15401] Updated weights for policy 0, policy_version 63030 (0.0040) [2024-06-21 20:54:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42055.1, 300 sec: 42431.8). Total num frames: 1032830976. Throughput: 0: 42199.6. Samples: 1032950780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-21 20:54:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 20:54:28,404][15401] Updated weights for policy 0, policy_version 63040 (0.0037) [2024-06-21 20:54:32,602][15401] Updated weights for policy 0, policy_version 63050 (0.0036) [2024-06-21 20:54:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 41780.8, 300 sec: 42320.7). Total num frames: 1033027584. Throughput: 0: 42149.7. Samples: 1033208720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 20:54:35,998][15401] Updated weights for policy 0, policy_version 63060 (0.0033) [2024-06-21 20:54:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42324.8, 300 sec: 42265.7). Total num frames: 1033256960. Throughput: 0: 42035.4. Samples: 1033331020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:38,392][15132] Avg episode reward: [(0, '0.771')] [2024-06-21 20:54:40,591][15401] Updated weights for policy 0, policy_version 63070 (0.0033) [2024-06-21 20:54:43,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42327.2, 300 sec: 42487.7). Total num frames: 1033469952. Throughput: 0: 42172.6. Samples: 1033585400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 20:54:43,680][15401] Updated weights for policy 0, policy_version 63080 (0.0034) [2024-06-21 20:54:48,318][15401] Updated weights for policy 0, policy_version 63090 (0.0037) [2024-06-21 20:54:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 1033666560. Throughput: 0: 42381.2. Samples: 1033850460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:48,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-21 20:54:51,719][15401] Updated weights for policy 0, policy_version 63100 (0.0032) [2024-06-21 20:54:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42210.0). Total num frames: 1033879552. Throughput: 0: 42387.7. Samples: 1033972300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 20:54:55,850][15401] Updated weights for policy 0, policy_version 63110 (0.0034) [2024-06-21 20:54:56,615][15349] Signal inference workers to stop experience collection... (15100 times) [2024-06-21 20:54:56,616][15349] Signal inference workers to resume experience collection... (15100 times) [2024-06-21 20:54:56,647][15401] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-06-21 20:54:56,648][15401] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-06-21 20:54:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1034092544. Throughput: 0: 42445.9. Samples: 1034225800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:54:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 20:54:59,421][15401] Updated weights for policy 0, policy_version 63120 (0.0035) [2024-06-21 20:55:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42321.1). Total num frames: 1034305536. Throughput: 0: 42438.7. Samples: 1034485560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 20:55:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 20:55:03,447][15401] Updated weights for policy 0, policy_version 63130 (0.0038) [2024-06-21 20:55:06,960][15401] Updated weights for policy 0, policy_version 63140 (0.0031) [2024-06-21 20:55:08,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42867.0, 300 sec: 42375.3). Total num frames: 1034534912. Throughput: 0: 42290.1. Samples: 1034603460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:08,396][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 20:55:10,956][15401] Updated weights for policy 0, policy_version 63150 (0.0035) [2024-06-21 20:55:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1034764288. Throughput: 0: 42665.8. Samples: 1034870740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 20:55:14,575][15401] Updated weights for policy 0, policy_version 63160 (0.0039) [2024-06-21 20:55:18,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1034960896. Throughput: 0: 42442.9. Samples: 1035118640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 20:55:18,503][15401] Updated weights for policy 0, policy_version 63170 (0.0026) [2024-06-21 20:55:22,135][15401] Updated weights for policy 0, policy_version 63180 (0.0036) [2024-06-21 20:55:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.1, 300 sec: 42376.6). Total num frames: 1035157504. Throughput: 0: 42546.3. Samples: 1035245500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-21 20:55:26,618][15401] Updated weights for policy 0, policy_version 63190 (0.0035) [2024-06-21 20:55:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.9, 300 sec: 42431.5). Total num frames: 1035386880. Throughput: 0: 42690.3. Samples: 1035506560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:28,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 20:55:29,707][15401] Updated weights for policy 0, policy_version 63200 (0.0031) [2024-06-21 20:55:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 1035599872. Throughput: 0: 42464.5. Samples: 1035761360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:33,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 20:55:34,571][15401] Updated weights for policy 0, policy_version 63210 (0.0031) [2024-06-21 20:55:37,569][15401] Updated weights for policy 0, policy_version 63220 (0.0039) [2024-06-21 20:55:38,389][15132] Fps is (10 sec: 42608.1, 60 sec: 42600.2, 300 sec: 42376.3). Total num frames: 1035812864. Throughput: 0: 42565.4. Samples: 1035887740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 20:55:42,042][15401] Updated weights for policy 0, policy_version 63230 (0.0033) [2024-06-21 20:55:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42487.9). Total num frames: 1036042240. Throughput: 0: 42696.3. Samples: 1036147140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 20:55:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 20:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063235_1036042240.pth... [2024-06-21 20:55:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062612_1025835008.pth [2024-06-21 20:55:45,494][15401] Updated weights for policy 0, policy_version 63240 (0.0036) [2024-06-21 20:55:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1036238848. Throughput: 0: 42532.0. Samples: 1036399500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:55:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 20:55:49,454][15401] Updated weights for policy 0, policy_version 63250 (0.0041) [2024-06-21 20:55:53,204][15401] Updated weights for policy 0, policy_version 63260 (0.0040) [2024-06-21 20:55:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1036451840. Throughput: 0: 42720.8. Samples: 1036525620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:55:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 20:55:56,903][15401] Updated weights for policy 0, policy_version 63270 (0.0036) [2024-06-21 20:55:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1036664832. Throughput: 0: 42367.0. Samples: 1036777260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:55:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 20:56:00,981][15401] Updated weights for policy 0, policy_version 63280 (0.0036) [2024-06-21 20:56:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 1036877824. Throughput: 0: 42738.6. Samples: 1037041880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:56:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 20:56:04,392][15401] Updated weights for policy 0, policy_version 63290 (0.0030) [2024-06-21 20:56:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42603.0, 300 sec: 42432.0). Total num frames: 1037090816. Throughput: 0: 42763.2. Samples: 1037169840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:56:08,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 20:56:08,509][15401] Updated weights for policy 0, policy_version 63300 (0.0050) [2024-06-21 20:56:12,201][15401] Updated weights for policy 0, policy_version 63310 (0.0038) [2024-06-21 20:56:13,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42052.0, 300 sec: 42487.3). Total num frames: 1037287424. Throughput: 0: 42561.4. Samples: 1037421740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:56:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-21 20:56:16,078][15401] Updated weights for policy 0, policy_version 63320 (0.0040) [2024-06-21 20:56:17,103][15349] Signal inference workers to stop experience collection... (15150 times) [2024-06-21 20:56:17,104][15349] Signal inference workers to resume experience collection... (15150 times) [2024-06-21 20:56:17,121][15401] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-06-21 20:56:17,160][15401] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-06-21 20:56:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42375.9). Total num frames: 1037500416. Throughput: 0: 42660.0. Samples: 1037681160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:56:18,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 20:56:19,850][15401] Updated weights for policy 0, policy_version 63330 (0.0036) [2024-06-21 20:56:23,390][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1037729792. Throughput: 0: 42594.1. Samples: 1037804480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 20:56:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 20:56:23,911][15401] Updated weights for policy 0, policy_version 63340 (0.0026) [2024-06-21 20:56:27,882][15401] Updated weights for policy 0, policy_version 63350 (0.0038) [2024-06-21 20:56:28,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 1037926400. Throughput: 0: 42457.4. Samples: 1038057720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 20:56:31,615][15401] Updated weights for policy 0, policy_version 63360 (0.0035) [2024-06-21 20:56:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1038139392. Throughput: 0: 42601.3. Samples: 1038316560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 20:56:35,958][15401] Updated weights for policy 0, policy_version 63370 (0.0042) [2024-06-21 20:56:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 1038368768. Throughput: 0: 42603.1. Samples: 1038442760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 20:56:39,269][15401] Updated weights for policy 0, policy_version 63380 (0.0048) [2024-06-21 20:56:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1038565376. Throughput: 0: 42677.9. Samples: 1038697760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 20:56:43,494][15401] Updated weights for policy 0, policy_version 63390 (0.0032) [2024-06-21 20:56:47,362][15401] Updated weights for policy 0, policy_version 63400 (0.0040) [2024-06-21 20:56:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1038778368. Throughput: 0: 42510.6. Samples: 1038954860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:48,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-21 20:56:51,523][15401] Updated weights for policy 0, policy_version 63410 (0.0028) [2024-06-21 20:56:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1039007744. Throughput: 0: 42626.5. Samples: 1039088040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:53,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 20:56:54,836][15401] Updated weights for policy 0, policy_version 63420 (0.0035) [2024-06-21 20:56:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1039187968. Throughput: 0: 42390.5. Samples: 1039329300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:56:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 20:56:59,158][15401] Updated weights for policy 0, policy_version 63430 (0.0030) [2024-06-21 20:57:03,150][15401] Updated weights for policy 0, policy_version 63440 (0.0032) [2024-06-21 20:57:03,391][15132] Fps is (10 sec: 39317.0, 60 sec: 42051.4, 300 sec: 42265.3). Total num frames: 1039400960. Throughput: 0: 42486.9. Samples: 1039593020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 20:57:03,391][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 20:57:06,774][15401] Updated weights for policy 0, policy_version 63450 (0.0029) [2024-06-21 20:57:08,390][15132] Fps is (10 sec: 47511.5, 60 sec: 42871.1, 300 sec: 42487.3). Total num frames: 1039663104. Throughput: 0: 42581.4. Samples: 1039720660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 20:57:10,642][15401] Updated weights for policy 0, policy_version 63460 (0.0033) [2024-06-21 20:57:13,389][15132] Fps is (10 sec: 44242.5, 60 sec: 42598.6, 300 sec: 42376.6). Total num frames: 1039843328. Throughput: 0: 42527.2. Samples: 1039971440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 20:57:14,580][15401] Updated weights for policy 0, policy_version 63470 (0.0036) [2024-06-21 20:57:18,160][15401] Updated weights for policy 0, policy_version 63480 (0.0025) [2024-06-21 20:57:18,389][15132] Fps is (10 sec: 40962.0, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 1040072704. Throughput: 0: 42452.5. Samples: 1040226920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 20:57:22,557][15401] Updated weights for policy 0, policy_version 63490 (0.0025) [2024-06-21 20:57:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1040269312. Throughput: 0: 42465.3. Samples: 1040353700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 20:57:25,747][15401] Updated weights for policy 0, policy_version 63500 (0.0035) [2024-06-21 20:57:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1040482304. Throughput: 0: 42435.1. Samples: 1040607340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:28,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-21 20:57:29,963][15401] Updated weights for policy 0, policy_version 63510 (0.0049) [2024-06-21 20:57:33,314][15401] Updated weights for policy 0, policy_version 63520 (0.0038) [2024-06-21 20:57:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1040711680. Throughput: 0: 42512.9. Samples: 1040867940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 20:57:34,075][15349] Signal inference workers to stop experience collection... (15200 times) [2024-06-21 20:57:34,123][15401] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-06-21 20:57:34,193][15349] Signal inference workers to resume experience collection... (15200 times) [2024-06-21 20:57:34,194][15401] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-06-21 20:57:37,635][15401] Updated weights for policy 0, policy_version 63530 (0.0031) [2024-06-21 20:57:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1040875520. Throughput: 0: 42388.5. Samples: 1040995520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 20:57:40,889][15401] Updated weights for policy 0, policy_version 63540 (0.0040) [2024-06-21 20:57:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1041121280. Throughput: 0: 42536.9. Samples: 1041243460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 20:57:43,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-21 20:57:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063546_1041137664.pth... [2024-06-21 20:57:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000062924_1030946816.pth [2024-06-21 20:57:45,246][15401] Updated weights for policy 0, policy_version 63550 (0.0037) [2024-06-21 20:57:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1041350656. Throughput: 0: 42395.3. Samples: 1041500760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:57:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 20:57:48,787][15401] Updated weights for policy 0, policy_version 63560 (0.0036) [2024-06-21 20:57:53,056][15401] Updated weights for policy 0, policy_version 63570 (0.0039) [2024-06-21 20:57:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1041530880. Throughput: 0: 42409.7. Samples: 1041629080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:57:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 20:57:56,498][15401] Updated weights for policy 0, policy_version 63580 (0.0053) [2024-06-21 20:57:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1041760256. Throughput: 0: 42312.0. Samples: 1041875480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:57:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 20:58:00,760][15401] Updated weights for policy 0, policy_version 63590 (0.0036) [2024-06-21 20:58:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42597.6, 300 sec: 42431.4). Total num frames: 1041956864. Throughput: 0: 42348.0. Samples: 1042132680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:58:03,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 20:58:04,178][15401] Updated weights for policy 0, policy_version 63600 (0.0047) [2024-06-21 20:58:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.5, 300 sec: 42376.3). Total num frames: 1042169856. Throughput: 0: 42261.8. Samples: 1042255480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:58:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 20:58:08,635][15401] Updated weights for policy 0, policy_version 63610 (0.0034) [2024-06-21 20:58:12,001][15401] Updated weights for policy 0, policy_version 63620 (0.0036) [2024-06-21 20:58:13,392][15132] Fps is (10 sec: 44236.5, 60 sec: 42596.6, 300 sec: 42431.4). Total num frames: 1042399232. Throughput: 0: 42266.5. Samples: 1042509440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:58:13,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 20:58:16,175][15401] Updated weights for policy 0, policy_version 63630 (0.0034) [2024-06-21 20:58:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1042595840. Throughput: 0: 42143.9. Samples: 1042764420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:58:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 20:58:20,034][15401] Updated weights for policy 0, policy_version 63640 (0.0036) [2024-06-21 20:58:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42376.8). Total num frames: 1042808832. Throughput: 0: 42074.6. Samples: 1042888880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 20:58:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 20:58:23,838][15401] Updated weights for policy 0, policy_version 63650 (0.0045) [2024-06-21 20:58:27,693][15401] Updated weights for policy 0, policy_version 63660 (0.0036) [2024-06-21 20:58:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 1043038208. Throughput: 0: 42351.2. Samples: 1043149260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 20:58:31,653][15401] Updated weights for policy 0, policy_version 63670 (0.0033) [2024-06-21 20:58:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 1043251200. Throughput: 0: 42180.4. Samples: 1043398880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:33,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 20:58:35,620][15401] Updated weights for policy 0, policy_version 63680 (0.0028) [2024-06-21 20:58:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42432.1). Total num frames: 1043447808. Throughput: 0: 42130.3. Samples: 1043524940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 20:58:39,607][15401] Updated weights for policy 0, policy_version 63690 (0.0034) [2024-06-21 20:58:43,161][15349] Signal inference workers to stop experience collection... (15250 times) [2024-06-21 20:58:43,190][15401] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-06-21 20:58:43,272][15349] Signal inference workers to resume experience collection... (15250 times) [2024-06-21 20:58:43,272][15401] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-06-21 20:58:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 1043644416. Throughput: 0: 42319.1. Samples: 1043779840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 20:58:43,403][15401] Updated weights for policy 0, policy_version 63700 (0.0030) [2024-06-21 20:58:47,438][15401] Updated weights for policy 0, policy_version 63710 (0.0029) [2024-06-21 20:58:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 1043841024. Throughput: 0: 42164.4. Samples: 1044029980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 20:58:51,375][15401] Updated weights for policy 0, policy_version 63720 (0.0042) [2024-06-21 20:58:53,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.9, 300 sec: 42486.4). Total num frames: 1044086784. Throughput: 0: 42214.4. Samples: 1044155400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:53,396][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 20:58:55,303][15401] Updated weights for policy 0, policy_version 63730 (0.0051) [2024-06-21 20:58:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 1044267008. Throughput: 0: 42064.5. Samples: 1044402240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:58:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 20:58:59,138][15401] Updated weights for policy 0, policy_version 63740 (0.0029) [2024-06-21 20:59:02,968][15401] Updated weights for policy 0, policy_version 63750 (0.0042) [2024-06-21 20:59:03,390][15132] Fps is (10 sec: 39346.2, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 1044480000. Throughput: 0: 42027.1. Samples: 1044655640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 20:59:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 20:59:07,141][15401] Updated weights for policy 0, policy_version 63760 (0.0037) [2024-06-21 20:59:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1044725760. Throughput: 0: 42144.6. Samples: 1044785380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 20:59:10,562][15401] Updated weights for policy 0, policy_version 63770 (0.0037) [2024-06-21 20:59:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41780.8, 300 sec: 42376.2). Total num frames: 1044905984. Throughput: 0: 41990.5. Samples: 1045038840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:13,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 20:59:14,641][15401] Updated weights for policy 0, policy_version 63780 (0.0032) [2024-06-21 20:59:18,154][15401] Updated weights for policy 0, policy_version 63790 (0.0044) [2024-06-21 20:59:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1045135360. Throughput: 0: 42064.5. Samples: 1045291780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:18,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 20:59:22,259][15401] Updated weights for policy 0, policy_version 63800 (0.0029) [2024-06-21 20:59:23,391][15132] Fps is (10 sec: 44232.2, 60 sec: 42324.6, 300 sec: 42431.6). Total num frames: 1045348352. Throughput: 0: 42244.3. Samples: 1045425980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:23,391][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 20:59:25,759][15401] Updated weights for policy 0, policy_version 63810 (0.0041) [2024-06-21 20:59:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 1045544960. Throughput: 0: 42225.3. Samples: 1045679980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 20:59:30,010][15401] Updated weights for policy 0, policy_version 63820 (0.0033) [2024-06-21 20:59:33,389][15132] Fps is (10 sec: 42603.4, 60 sec: 42052.4, 300 sec: 42432.1). Total num frames: 1045774336. Throughput: 0: 42165.4. Samples: 1045927420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 20:59:33,914][15401] Updated weights for policy 0, policy_version 63830 (0.0028) [2024-06-21 20:59:37,581][15401] Updated weights for policy 0, policy_version 63840 (0.0040) [2024-06-21 20:59:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 1045987328. Throughput: 0: 42325.4. Samples: 1046059780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 20:59:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 20:59:41,564][15401] Updated weights for policy 0, policy_version 63850 (0.0038) [2024-06-21 20:59:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 1046167552. Throughput: 0: 42498.3. Samples: 1046314660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:59:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 20:59:43,547][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063854_1046183936.pth... [2024-06-21 20:59:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063235_1036042240.pth [2024-06-21 20:59:45,195][15401] Updated weights for policy 0, policy_version 63860 (0.0029) [2024-06-21 20:59:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1046396928. Throughput: 0: 42453.9. Samples: 1046566060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:59:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 20:59:49,326][15401] Updated weights for policy 0, policy_version 63870 (0.0037) [2024-06-21 20:59:52,841][15401] Updated weights for policy 0, policy_version 63880 (0.0022) [2024-06-21 20:59:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 1046626304. Throughput: 0: 42479.9. Samples: 1046696980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:59:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 20:59:57,123][15401] Updated weights for policy 0, policy_version 63890 (0.0042) [2024-06-21 20:59:58,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42050.6, 300 sec: 42320.3). Total num frames: 1046790144. Throughput: 0: 42423.6. Samples: 1046948000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 20:59:58,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 21:00:00,514][15401] Updated weights for policy 0, policy_version 63900 (0.0029) [2024-06-21 21:00:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42376.8). Total num frames: 1047035904. Throughput: 0: 42320.4. Samples: 1047196300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:00:03,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 21:00:05,121][15401] Updated weights for policy 0, policy_version 63910 (0.0032) [2024-06-21 21:00:08,270][15401] Updated weights for policy 0, policy_version 63920 (0.0035) [2024-06-21 21:00:08,389][15132] Fps is (10 sec: 47525.6, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1047265280. Throughput: 0: 42338.0. Samples: 1047331140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:00:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 21:00:12,749][15401] Updated weights for policy 0, policy_version 63930 (0.0038) [2024-06-21 21:00:13,392][15132] Fps is (10 sec: 40959.5, 60 sec: 42323.6, 300 sec: 42320.3). Total num frames: 1047445504. Throughput: 0: 42342.5. Samples: 1047585500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:00:13,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 21:00:15,958][15401] Updated weights for policy 0, policy_version 63940 (0.0033) [2024-06-21 21:00:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 1047691264. Throughput: 0: 42337.9. Samples: 1047832720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:00:18,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 21:00:20,358][15401] Updated weights for policy 0, policy_version 63950 (0.0034) [2024-06-21 21:00:23,390][15132] Fps is (10 sec: 44247.8, 60 sec: 42326.1, 300 sec: 42376.6). Total num frames: 1047887872. Throughput: 0: 42387.2. Samples: 1047967200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 21:00:23,608][15401] Updated weights for policy 0, policy_version 63960 (0.0042) [2024-06-21 21:00:24,377][15349] Signal inference workers to stop experience collection... (15300 times) [2024-06-21 21:00:24,427][15401] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-06-21 21:00:24,491][15349] Signal inference workers to resume experience collection... (15300 times) [2024-06-21 21:00:24,491][15401] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-06-21 21:00:28,389][15132] Fps is (10 sec: 37691.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1048068096. Throughput: 0: 42415.1. Samples: 1048223340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 21:00:28,630][15401] Updated weights for policy 0, policy_version 63970 (0.0030) [2024-06-21 21:00:31,294][15401] Updated weights for policy 0, policy_version 63980 (0.0032) [2024-06-21 21:00:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1048330240. Throughput: 0: 42302.2. Samples: 1048469660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 21:00:36,429][15401] Updated weights for policy 0, policy_version 63990 (0.0036) [2024-06-21 21:00:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1048543232. Throughput: 0: 42512.0. Samples: 1048610020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 21:00:39,074][15401] Updated weights for policy 0, policy_version 64000 (0.0026) [2024-06-21 21:00:43,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42323.6, 300 sec: 42264.8). Total num frames: 1048707072. Throughput: 0: 42576.9. Samples: 1048863960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:43,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 21:00:43,917][15401] Updated weights for policy 0, policy_version 64010 (0.0033) [2024-06-21 21:00:46,589][15401] Updated weights for policy 0, policy_version 64020 (0.0035) [2024-06-21 21:00:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1048985600. Throughput: 0: 42378.7. Samples: 1049103240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 21:00:51,660][15401] Updated weights for policy 0, policy_version 64030 (0.0037) [2024-06-21 21:00:53,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1049165824. Throughput: 0: 42560.8. Samples: 1049246380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 21:00:54,243][15401] Updated weights for policy 0, policy_version 64040 (0.0026) [2024-06-21 21:00:58,389][15132] Fps is (10 sec: 36044.9, 60 sec: 42600.1, 300 sec: 42265.2). Total num frames: 1049346048. Throughput: 0: 42470.8. Samples: 1049496580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:00:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 21:00:59,297][15401] Updated weights for policy 0, policy_version 64050 (0.0045) [2024-06-21 21:01:02,115][15401] Updated weights for policy 0, policy_version 64060 (0.0032) [2024-06-21 21:01:03,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42868.6, 300 sec: 42430.8). Total num frames: 1049608192. Throughput: 0: 42423.1. Samples: 1049741940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:03,396][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 21:01:06,948][15401] Updated weights for policy 0, policy_version 64070 (0.0031) [2024-06-21 21:01:08,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42323.6, 300 sec: 42431.5). Total num frames: 1049804800. Throughput: 0: 42673.7. Samples: 1049887620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:08,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 21:01:09,945][15401] Updated weights for policy 0, policy_version 64080 (0.0048) [2024-06-21 21:01:13,389][15132] Fps is (10 sec: 37707.3, 60 sec: 42327.1, 300 sec: 42321.0). Total num frames: 1049985024. Throughput: 0: 42407.1. Samples: 1050131660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:13,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 21:01:14,807][15401] Updated weights for policy 0, policy_version 64090 (0.0029) [2024-06-21 21:01:17,747][15401] Updated weights for policy 0, policy_version 64100 (0.0029) [2024-06-21 21:01:18,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 1050263552. Throughput: 0: 42564.4. Samples: 1050385060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:18,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-21 21:01:21,441][15349] Signal inference workers to stop experience collection... (15350 times) [2024-06-21 21:01:21,469][15401] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-06-21 21:01:21,501][15349] Signal inference workers to resume experience collection... (15350 times) [2024-06-21 21:01:21,509][15401] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-06-21 21:01:22,433][15401] Updated weights for policy 0, policy_version 64110 (0.0037) [2024-06-21 21:01:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1050427392. Throughput: 0: 42532.9. Samples: 1050524000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 21:01:25,451][15401] Updated weights for policy 0, policy_version 64120 (0.0029) [2024-06-21 21:01:28,392][15132] Fps is (10 sec: 36036.2, 60 sec: 42596.7, 300 sec: 42320.4). Total num frames: 1050624000. Throughput: 0: 42203.6. Samples: 1050763120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:28,392][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 21:01:29,941][15401] Updated weights for policy 0, policy_version 64130 (0.0032) [2024-06-21 21:01:33,211][15401] Updated weights for policy 0, policy_version 64140 (0.0033) [2024-06-21 21:01:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1050869760. Throughput: 0: 42667.7. Samples: 1051023280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 21:01:37,659][15401] Updated weights for policy 0, policy_version 64150 (0.0037) [2024-06-21 21:01:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1051049984. Throughput: 0: 42452.5. Samples: 1051156740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:01:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-21 21:01:40,748][15401] Updated weights for policy 0, policy_version 64160 (0.0054) [2024-06-21 21:01:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42376.2). Total num frames: 1051279360. Throughput: 0: 42309.2. Samples: 1051400500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:01:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 21:01:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064165_1051279360.pth... [2024-06-21 21:01:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063546_1041137664.pth [2024-06-21 21:01:45,376][15401] Updated weights for policy 0, policy_version 64170 (0.0032) [2024-06-21 21:01:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1051508736. Throughput: 0: 42572.8. Samples: 1051657440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:01:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 21:01:48,408][15401] Updated weights for policy 0, policy_version 64180 (0.0036) [2024-06-21 21:01:52,979][15401] Updated weights for policy 0, policy_version 64190 (0.0037) [2024-06-21 21:01:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1051688960. Throughput: 0: 42273.0. Samples: 1051789800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:01:53,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-21 21:01:55,871][15401] Updated weights for policy 0, policy_version 64200 (0.0023) [2024-06-21 21:01:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42432.0). Total num frames: 1051918336. Throughput: 0: 42337.0. Samples: 1052036820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:01:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 21:02:00,457][15401] Updated weights for policy 0, policy_version 64210 (0.0037) [2024-06-21 21:02:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42603.0, 300 sec: 42376.3). Total num frames: 1052164096. Throughput: 0: 42517.8. Samples: 1052298360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:02:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 21:02:03,523][15401] Updated weights for policy 0, policy_version 64220 (0.0033) [2024-06-21 21:02:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42320.7). Total num frames: 1052327936. Throughput: 0: 42362.4. Samples: 1052430300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:02:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 21:02:08,470][15401] Updated weights for policy 0, policy_version 64230 (0.0026) [2024-06-21 21:02:11,532][15401] Updated weights for policy 0, policy_version 64240 (0.0037) [2024-06-21 21:02:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1052557312. Throughput: 0: 42575.1. Samples: 1052678900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:02:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 21:02:16,039][15401] Updated weights for policy 0, policy_version 64250 (0.0024) [2024-06-21 21:02:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1052786688. Throughput: 0: 42351.1. Samples: 1052929080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:02:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 21:02:19,158][15401] Updated weights for policy 0, policy_version 64260 (0.0032) [2024-06-21 21:02:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1052966912. Throughput: 0: 42344.7. Samples: 1053062260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-21 21:02:23,984][15401] Updated weights for policy 0, policy_version 64270 (0.0038) [2024-06-21 21:02:26,871][15401] Updated weights for policy 0, policy_version 64280 (0.0033) [2024-06-21 21:02:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42320.7). Total num frames: 1053196288. Throughput: 0: 42348.1. Samples: 1053306160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 21:02:31,478][15401] Updated weights for policy 0, policy_version 64290 (0.0037) [2024-06-21 21:02:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1053425664. Throughput: 0: 42351.0. Samples: 1053563240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 21:02:34,753][15401] Updated weights for policy 0, policy_version 64300 (0.0043) [2024-06-21 21:02:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42320.4). Total num frames: 1053605888. Throughput: 0: 42461.2. Samples: 1053700660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:38,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 21:02:39,210][15401] Updated weights for policy 0, policy_version 64310 (0.0041) [2024-06-21 21:02:40,340][15349] Signal inference workers to stop experience collection... (15400 times) [2024-06-21 21:02:40,342][15349] Signal inference workers to resume experience collection... (15400 times) [2024-06-21 21:02:40,364][15401] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-06-21 21:02:40,364][15401] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-06-21 21:02:42,735][15401] Updated weights for policy 0, policy_version 64320 (0.0034) [2024-06-21 21:02:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1053818880. Throughput: 0: 42531.0. Samples: 1053950720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:43,396][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 21:02:46,820][15401] Updated weights for policy 0, policy_version 64330 (0.0022) [2024-06-21 21:02:48,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1054064640. Throughput: 0: 42295.8. Samples: 1054201680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 21:02:50,473][15401] Updated weights for policy 0, policy_version 64340 (0.0035) [2024-06-21 21:02:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1054244864. Throughput: 0: 42346.9. Samples: 1054335920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 21:02:54,310][15401] Updated weights for policy 0, policy_version 64350 (0.0029) [2024-06-21 21:02:58,314][15401] Updated weights for policy 0, policy_version 64360 (0.0029) [2024-06-21 21:02:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1054474240. Throughput: 0: 42571.2. Samples: 1054594600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 21:02:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 21:03:02,118][15401] Updated weights for policy 0, policy_version 64370 (0.0030) [2024-06-21 21:03:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42050.5, 300 sec: 42431.4). Total num frames: 1054687232. Throughput: 0: 42577.6. Samples: 1054845180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:03,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 21:03:06,125][15401] Updated weights for policy 0, policy_version 64380 (0.0032) [2024-06-21 21:03:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42321.1). Total num frames: 1054883840. Throughput: 0: 42544.5. Samples: 1054976760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 21:03:09,690][15401] Updated weights for policy 0, policy_version 64390 (0.0029) [2024-06-21 21:03:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1055096832. Throughput: 0: 42771.6. Samples: 1055230880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 21:03:13,985][15401] Updated weights for policy 0, policy_version 64400 (0.0034) [2024-06-21 21:03:17,360][15401] Updated weights for policy 0, policy_version 64410 (0.0030) [2024-06-21 21:03:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1055326208. Throughput: 0: 42626.3. Samples: 1055481420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 21:03:21,591][15401] Updated weights for policy 0, policy_version 64420 (0.0033) [2024-06-21 21:03:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1055522816. Throughput: 0: 42470.7. Samples: 1055611740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 21:03:24,960][15401] Updated weights for policy 0, policy_version 64430 (0.0041) [2024-06-21 21:03:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1055752192. Throughput: 0: 42351.6. Samples: 1055856540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:28,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-21 21:03:29,962][15401] Updated weights for policy 0, policy_version 64440 (0.0037) [2024-06-21 21:03:32,868][15401] Updated weights for policy 0, policy_version 64450 (0.0040) [2024-06-21 21:03:33,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 1055981568. Throughput: 0: 42398.7. Samples: 1056109720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:33,393][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 21:03:37,481][15401] Updated weights for policy 0, policy_version 64460 (0.0032) [2024-06-21 21:03:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42054.0, 300 sec: 42320.7). Total num frames: 1056129024. Throughput: 0: 42295.3. Samples: 1056239200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 21:03:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 21:03:40,478][15401] Updated weights for policy 0, policy_version 64470 (0.0037) [2024-06-21 21:03:43,390][15132] Fps is (10 sec: 39330.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1056374784. Throughput: 0: 42212.7. Samples: 1056494180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:03:43,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 21:03:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064476_1056374784.pth... [2024-06-21 21:03:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000063854_1046183936.pth [2024-06-21 21:03:44,866][15401] Updated weights for policy 0, policy_version 64480 (0.0038) [2024-06-21 21:03:48,042][15401] Updated weights for policy 0, policy_version 64490 (0.0041) [2024-06-21 21:03:48,390][15132] Fps is (10 sec: 49151.4, 60 sec: 42598.5, 300 sec: 42488.2). Total num frames: 1056620544. Throughput: 0: 42393.8. Samples: 1056752800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:03:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 21:03:52,271][15401] Updated weights for policy 0, policy_version 64500 (0.0035) [2024-06-21 21:03:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1056784384. Throughput: 0: 42567.0. Samples: 1056892280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:03:53,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 21:03:55,547][15401] Updated weights for policy 0, policy_version 64510 (0.0040) [2024-06-21 21:03:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1057013760. Throughput: 0: 42336.8. Samples: 1057136040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:03:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 21:03:58,945][15349] Signal inference workers to stop experience collection... (15450 times) [2024-06-21 21:03:58,946][15349] Signal inference workers to resume experience collection... (15450 times) [2024-06-21 21:03:58,966][15401] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-06-21 21:03:58,966][15401] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-06-21 21:04:00,998][15401] Updated weights for policy 0, policy_version 64520 (0.0031) [2024-06-21 21:04:03,142][15401] Updated weights for policy 0, policy_version 64530 (0.0027) [2024-06-21 21:04:03,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 1057259520. Throughput: 0: 42490.5. Samples: 1057393500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:04:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 21:04:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1057406976. Throughput: 0: 42546.3. Samples: 1057526320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:04:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 21:04:08,447][15401] Updated weights for policy 0, policy_version 64540 (0.0044) [2024-06-21 21:04:11,076][15401] Updated weights for policy 0, policy_version 64550 (0.0023) [2024-06-21 21:04:13,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1057669120. Throughput: 0: 42666.2. Samples: 1057776520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:04:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 21:04:16,018][15401] Updated weights for policy 0, policy_version 64560 (0.0036) [2024-06-21 21:04:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42432.0). Total num frames: 1057865728. Throughput: 0: 42797.1. Samples: 1058035480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-21 21:04:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 21:04:18,786][15401] Updated weights for policy 0, policy_version 64570 (0.0033) [2024-06-21 21:04:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1058045952. Throughput: 0: 42664.0. Samples: 1058159080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:23,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 21:04:23,885][15401] Updated weights for policy 0, policy_version 64580 (0.0032) [2024-06-21 21:04:26,307][15401] Updated weights for policy 0, policy_version 64590 (0.0026) [2024-06-21 21:04:28,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1058308096. Throughput: 0: 42693.1. Samples: 1058415360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:28,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 21:04:31,379][15401] Updated weights for policy 0, policy_version 64600 (0.0035) [2024-06-21 21:04:33,391][15132] Fps is (10 sec: 47507.0, 60 sec: 42326.1, 300 sec: 42487.1). Total num frames: 1058521088. Throughput: 0: 42780.1. Samples: 1058677960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:33,391][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 21:04:33,931][15401] Updated weights for policy 0, policy_version 64610 (0.0035) [2024-06-21 21:04:38,396][15132] Fps is (10 sec: 39296.3, 60 sec: 42866.8, 300 sec: 42486.4). Total num frames: 1058701312. Throughput: 0: 42482.9. Samples: 1058804280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:38,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 21:04:38,736][15401] Updated weights for policy 0, policy_version 64620 (0.0036) [2024-06-21 21:04:41,668][15401] Updated weights for policy 0, policy_version 64630 (0.0021) [2024-06-21 21:04:43,390][15132] Fps is (10 sec: 42604.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1058947072. Throughput: 0: 42753.8. Samples: 1059059960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:43,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 21:04:46,133][15401] Updated weights for policy 0, policy_version 64640 (0.0037) [2024-06-21 21:04:48,389][15132] Fps is (10 sec: 45905.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1059160064. Throughput: 0: 42889.6. Samples: 1059323520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 21:04:49,361][15401] Updated weights for policy 0, policy_version 64650 (0.0031) [2024-06-21 21:04:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1059356672. Throughput: 0: 42673.7. Samples: 1059446640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 21:04:53,649][15401] Updated weights for policy 0, policy_version 64660 (0.0039) [2024-06-21 21:04:57,278][15401] Updated weights for policy 0, policy_version 64670 (0.0037) [2024-06-21 21:04:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 1059569664. Throughput: 0: 42774.6. Samples: 1059701380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:04:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 21:05:01,229][15401] Updated weights for policy 0, policy_version 64680 (0.0032) [2024-06-21 21:05:03,377][15349] Signal inference workers to stop experience collection... (15500 times) [2024-06-21 21:05:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.4, 300 sec: 42376.2). Total num frames: 1059766272. Throughput: 0: 42923.5. Samples: 1059967040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:03,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-21 21:05:03,432][15401] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-06-21 21:05:03,439][15349] Signal inference workers to resume experience collection... (15500 times) [2024-06-21 21:05:03,451][15401] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-06-21 21:05:04,779][15401] Updated weights for policy 0, policy_version 64690 (0.0041) [2024-06-21 21:05:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.5, 300 sec: 42598.8). Total num frames: 1060012032. Throughput: 0: 42755.1. Samples: 1060083060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:08,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-21 21:05:08,670][15401] Updated weights for policy 0, policy_version 64700 (0.0036) [2024-06-21 21:05:12,538][15401] Updated weights for policy 0, policy_version 64710 (0.0029) [2024-06-21 21:05:13,392][15132] Fps is (10 sec: 45863.3, 60 sec: 42596.6, 300 sec: 42487.3). Total num frames: 1060225024. Throughput: 0: 42852.3. Samples: 1060343820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:13,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 21:05:16,433][15401] Updated weights for policy 0, policy_version 64720 (0.0035) [2024-06-21 21:05:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1060421632. Throughput: 0: 42848.0. Samples: 1060606060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 21:05:20,241][15401] Updated weights for policy 0, policy_version 64730 (0.0029) [2024-06-21 21:05:23,390][15132] Fps is (10 sec: 42609.1, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1060651008. Throughput: 0: 42736.8. Samples: 1060727160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 21:05:24,225][15401] Updated weights for policy 0, policy_version 64740 (0.0042) [2024-06-21 21:05:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1060847616. Throughput: 0: 42612.6. Samples: 1060977520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 21:05:28,494][15401] Updated weights for policy 0, policy_version 64750 (0.0029) [2024-06-21 21:05:32,035][15401] Updated weights for policy 0, policy_version 64760 (0.0035) [2024-06-21 21:05:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42326.2, 300 sec: 42431.8). Total num frames: 1061060608. Throughput: 0: 42479.8. Samples: 1061235120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:33,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 21:05:36,238][15401] Updated weights for policy 0, policy_version 64770 (0.0049) [2024-06-21 21:05:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42876.0, 300 sec: 42598.7). Total num frames: 1061273600. Throughput: 0: 42508.8. Samples: 1061359540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-21 21:05:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 21:05:39,820][15401] Updated weights for policy 0, policy_version 64780 (0.0042) [2024-06-21 21:05:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1061486592. Throughput: 0: 42470.7. Samples: 1061612560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:05:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 21:05:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064788_1061486592.pth... [2024-06-21 21:05:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064165_1051279360.pth [2024-06-21 21:05:43,908][15401] Updated weights for policy 0, policy_version 64790 (0.0023) [2024-06-21 21:05:47,671][15401] Updated weights for policy 0, policy_version 64800 (0.0028) [2024-06-21 21:05:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1061699584. Throughput: 0: 42275.5. Samples: 1061869440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:05:48,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 21:05:51,472][15401] Updated weights for policy 0, policy_version 64810 (0.0032) [2024-06-21 21:05:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1061912576. Throughput: 0: 42566.1. Samples: 1061998540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:05:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 21:05:55,439][15401] Updated weights for policy 0, policy_version 64820 (0.0037) [2024-06-21 21:05:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42488.2). Total num frames: 1062141952. Throughput: 0: 42339.7. Samples: 1062249000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:05:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 21:05:59,343][15401] Updated weights for policy 0, policy_version 64830 (0.0054) [2024-06-21 21:06:03,094][15401] Updated weights for policy 0, policy_version 64840 (0.0027) [2024-06-21 21:06:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42487.7). Total num frames: 1062338560. Throughput: 0: 42236.8. Samples: 1062506720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:06:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 21:06:07,025][15401] Updated weights for policy 0, policy_version 64850 (0.0034) [2024-06-21 21:06:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1062567936. Throughput: 0: 42397.0. Samples: 1062635020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:06:08,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-21 21:06:10,865][15401] Updated weights for policy 0, policy_version 64860 (0.0030) [2024-06-21 21:06:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42376.2). Total num frames: 1062764544. Throughput: 0: 42414.1. Samples: 1062886160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:06:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 21:06:14,851][15401] Updated weights for policy 0, policy_version 64870 (0.0039) [2024-06-21 21:06:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1062961152. Throughput: 0: 42568.9. Samples: 1063150720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:06:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 21:06:18,619][15401] Updated weights for policy 0, policy_version 64880 (0.0035) [2024-06-21 21:06:22,589][15401] Updated weights for policy 0, policy_version 64890 (0.0033) [2024-06-21 21:06:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 1063190528. Throughput: 0: 42332.5. Samples: 1063264500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 21:06:24,146][15349] Signal inference workers to stop experience collection... (15550 times) [2024-06-21 21:06:24,155][15349] Signal inference workers to resume experience collection... (15550 times) [2024-06-21 21:06:24,203][15401] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-06-21 21:06:24,204][15401] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-06-21 21:06:26,384][15401] Updated weights for policy 0, policy_version 64900 (0.0043) [2024-06-21 21:06:28,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.6, 300 sec: 42542.5). Total num frames: 1063419904. Throughput: 0: 42422.7. Samples: 1063521680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:28,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 21:06:30,182][15401] Updated weights for policy 0, policy_version 64910 (0.0030) [2024-06-21 21:06:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1063583744. Throughput: 0: 42491.9. Samples: 1063781580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 21:06:34,061][15401] Updated weights for policy 0, policy_version 64920 (0.0028) [2024-06-21 21:06:37,753][15401] Updated weights for policy 0, policy_version 64930 (0.0040) [2024-06-21 21:06:38,390][15132] Fps is (10 sec: 39330.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1063813120. Throughput: 0: 42290.1. Samples: 1063901600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:38,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 21:06:41,852][15401] Updated weights for policy 0, policy_version 64940 (0.0032) [2024-06-21 21:06:43,390][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1064075264. Throughput: 0: 42501.7. Samples: 1064161580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 21:06:45,250][15401] Updated weights for policy 0, policy_version 64950 (0.0030) [2024-06-21 21:06:48,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1064222720. Throughput: 0: 42445.5. Samples: 1064416760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 21:06:49,727][15401] Updated weights for policy 0, policy_version 64960 (0.0033) [2024-06-21 21:06:53,358][15401] Updated weights for policy 0, policy_version 64970 (0.0032) [2024-06-21 21:06:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1064468480. Throughput: 0: 42225.8. Samples: 1064535180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:53,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 21:06:57,576][15401] Updated weights for policy 0, policy_version 64980 (0.0034) [2024-06-21 21:06:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1064681472. Throughput: 0: 42389.3. Samples: 1064793680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:06:58,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-21 21:07:01,403][15401] Updated weights for policy 0, policy_version 64990 (0.0047) [2024-06-21 21:07:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 1064878080. Throughput: 0: 42121.0. Samples: 1065046160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:03,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-21 21:07:05,383][15401] Updated weights for policy 0, policy_version 65000 (0.0039) [2024-06-21 21:07:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 1065107456. Throughput: 0: 42332.8. Samples: 1065169580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:08,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 21:07:09,026][15401] Updated weights for policy 0, policy_version 65010 (0.0027) [2024-06-21 21:07:13,056][15401] Updated weights for policy 0, policy_version 65020 (0.0038) [2024-06-21 21:07:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1065304064. Throughput: 0: 42424.4. Samples: 1065430680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:13,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 21:07:16,712][15401] Updated weights for policy 0, policy_version 65030 (0.0037) [2024-06-21 21:07:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1065500672. Throughput: 0: 42125.4. Samples: 1065677220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-21 21:07:20,735][15401] Updated weights for policy 0, policy_version 65040 (0.0028) [2024-06-21 21:07:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1065730048. Throughput: 0: 42249.6. Samples: 1065802820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 21:07:24,433][15401] Updated weights for policy 0, policy_version 65050 (0.0038) [2024-06-21 21:07:28,354][15401] Updated weights for policy 0, policy_version 65060 (0.0039) [2024-06-21 21:07:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 1065943040. Throughput: 0: 42264.1. Samples: 1066063460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 21:07:32,027][15401] Updated weights for policy 0, policy_version 65070 (0.0037) [2024-06-21 21:07:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 1066156032. Throughput: 0: 42058.5. Samples: 1066309500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:33,393][15132] Avg episode reward: [(0, '0.413')] [2024-06-21 21:07:36,210][15401] Updated weights for policy 0, policy_version 65080 (0.0032) [2024-06-21 21:07:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 1066369024. Throughput: 0: 42448.3. Samples: 1066445460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:07:38,393][15132] Avg episode reward: [(0, '0.336')] [2024-06-21 21:07:39,653][15401] Updated weights for policy 0, policy_version 65090 (0.0028) [2024-06-21 21:07:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 41504.6, 300 sec: 42375.9). Total num frames: 1066565632. Throughput: 0: 42359.5. Samples: 1066699960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:07:43,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 21:07:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065098_1066565632.pth... [2024-06-21 21:07:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064476_1056374784.pth [2024-06-21 21:07:43,879][15401] Updated weights for policy 0, policy_version 65100 (0.0039) [2024-06-21 21:07:47,674][15401] Updated weights for policy 0, policy_version 65110 (0.0033) [2024-06-21 21:07:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1066778624. Throughput: 0: 42320.9. Samples: 1066950600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:07:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 21:07:51,586][15401] Updated weights for policy 0, policy_version 65120 (0.0042) [2024-06-21 21:07:53,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1067008000. Throughput: 0: 42451.5. Samples: 1067079800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:07:53,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 21:07:55,185][15401] Updated weights for policy 0, policy_version 65130 (0.0037) [2024-06-21 21:07:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42432.1). Total num frames: 1067204608. Throughput: 0: 42226.6. Samples: 1067330880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:07:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 21:07:59,166][15401] Updated weights for policy 0, policy_version 65140 (0.0031) [2024-06-21 21:08:02,861][15401] Updated weights for policy 0, policy_version 65150 (0.0035) [2024-06-21 21:08:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1067433984. Throughput: 0: 42405.7. Samples: 1067585480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:08:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 21:08:06,961][15401] Updated weights for policy 0, policy_version 65160 (0.0047) [2024-06-21 21:08:08,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 1067630592. Throughput: 0: 42584.5. Samples: 1067719120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:08:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 21:08:10,807][15401] Updated weights for policy 0, policy_version 65170 (0.0037) [2024-06-21 21:08:13,390][15132] Fps is (10 sec: 39318.4, 60 sec: 42051.8, 300 sec: 42376.1). Total num frames: 1067827200. Throughput: 0: 42130.4. Samples: 1067959360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:08:13,391][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 21:08:14,762][15401] Updated weights for policy 0, policy_version 65180 (0.0037) [2024-06-21 21:08:18,160][15349] Signal inference workers to stop experience collection... (15600 times) [2024-06-21 21:08:18,160][15349] Signal inference workers to resume experience collection... (15600 times) [2024-06-21 21:08:18,180][15401] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-06-21 21:08:18,180][15401] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-06-21 21:08:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1068056576. Throughput: 0: 42319.7. Samples: 1068213780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 21:08:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 21:08:18,487][15401] Updated weights for policy 0, policy_version 65190 (0.0029) [2024-06-21 21:08:22,834][15401] Updated weights for policy 0, policy_version 65200 (0.0026) [2024-06-21 21:08:23,389][15132] Fps is (10 sec: 42602.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1068253184. Throughput: 0: 42163.2. Samples: 1068342700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 21:08:26,060][15401] Updated weights for policy 0, policy_version 65210 (0.0036) [2024-06-21 21:08:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42321.1). Total num frames: 1068466176. Throughput: 0: 42132.1. Samples: 1068595800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 21:08:30,493][15401] Updated weights for policy 0, policy_version 65220 (0.0046) [2024-06-21 21:08:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 1068695552. Throughput: 0: 42268.4. Samples: 1068852680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:33,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 21:08:33,666][15401] Updated weights for policy 0, policy_version 65230 (0.0026) [2024-06-21 21:08:38,074][15401] Updated weights for policy 0, policy_version 65240 (0.0039) [2024-06-21 21:08:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.0, 300 sec: 42487.4). Total num frames: 1068908544. Throughput: 0: 42288.5. Samples: 1068982780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 21:08:41,391][15401] Updated weights for policy 0, policy_version 65250 (0.0043) [2024-06-21 21:08:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1069105152. Throughput: 0: 42244.0. Samples: 1069231860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 21:08:45,693][15401] Updated weights for policy 0, policy_version 65260 (0.0041) [2024-06-21 21:08:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1069334528. Throughput: 0: 42351.6. Samples: 1069491300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 21:08:48,872][15401] Updated weights for policy 0, policy_version 65270 (0.0037) [2024-06-21 21:08:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1069531136. Throughput: 0: 42298.9. Samples: 1069622580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 21:08:53,419][15401] Updated weights for policy 0, policy_version 65280 (0.0024) [2024-06-21 21:08:56,563][15401] Updated weights for policy 0, policy_version 65290 (0.0026) [2024-06-21 21:08:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1069744128. Throughput: 0: 42487.4. Samples: 1069871260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 21:08:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 21:09:00,958][15401] Updated weights for policy 0, policy_version 65300 (0.0028) [2024-06-21 21:09:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1069957120. Throughput: 0: 42698.6. Samples: 1070135220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 21:09:04,567][15401] Updated weights for policy 0, policy_version 65310 (0.0043) [2024-06-21 21:09:08,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42593.8, 300 sec: 42430.9). Total num frames: 1070186496. Throughput: 0: 42596.5. Samples: 1070259820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:08,396][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 21:09:08,753][15401] Updated weights for policy 0, policy_version 65320 (0.0046) [2024-06-21 21:09:12,335][15401] Updated weights for policy 0, policy_version 65330 (0.0029) [2024-06-21 21:09:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42599.0, 300 sec: 42431.8). Total num frames: 1070383104. Throughput: 0: 42522.6. Samples: 1070509320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-21 21:09:16,639][15401] Updated weights for policy 0, policy_version 65340 (0.0037) [2024-06-21 21:09:18,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1070579712. Throughput: 0: 42583.1. Samples: 1070768920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 21:09:20,071][15401] Updated weights for policy 0, policy_version 65350 (0.0036) [2024-06-21 21:09:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1070809088. Throughput: 0: 42523.6. Samples: 1070896340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 21:09:24,323][15401] Updated weights for policy 0, policy_version 65360 (0.0027) [2024-06-21 21:09:27,597][15401] Updated weights for policy 0, policy_version 65370 (0.0033) [2024-06-21 21:09:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42376.4). Total num frames: 1071022080. Throughput: 0: 42626.4. Samples: 1071150040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 21:09:31,996][15401] Updated weights for policy 0, policy_version 65380 (0.0038) [2024-06-21 21:09:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42432.7). Total num frames: 1071218688. Throughput: 0: 42568.1. Samples: 1071406860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 21:09:34,240][15349] Signal inference workers to stop experience collection... (15650 times) [2024-06-21 21:09:34,241][15349] Signal inference workers to resume experience collection... (15650 times) [2024-06-21 21:09:34,288][15401] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-06-21 21:09:34,288][15401] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-06-21 21:09:35,221][15401] Updated weights for policy 0, policy_version 65390 (0.0044) [2024-06-21 21:09:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1071464448. Throughput: 0: 42509.8. Samples: 1071535520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-21 21:09:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 21:09:39,531][15401] Updated weights for policy 0, policy_version 65400 (0.0030) [2024-06-21 21:09:42,888][15401] Updated weights for policy 0, policy_version 65410 (0.0032) [2024-06-21 21:09:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1071677440. Throughput: 0: 42569.7. Samples: 1071786900. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:09:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 21:09:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065410_1071677440.pth... [2024-06-21 21:09:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000064788_1061486592.pth [2024-06-21 21:09:47,318][15401] Updated weights for policy 0, policy_version 65420 (0.0036) [2024-06-21 21:09:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1071857664. Throughput: 0: 42348.8. Samples: 1072040920. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:09:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 21:09:50,799][15401] Updated weights for policy 0, policy_version 65430 (0.0039) [2024-06-21 21:09:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1072087040. Throughput: 0: 42309.2. Samples: 1072163460. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:09:53,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 21:09:54,934][15401] Updated weights for policy 0, policy_version 65440 (0.0035) [2024-06-21 21:09:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1072300032. Throughput: 0: 42525.3. Samples: 1072422960. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:09:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 21:09:58,694][15401] Updated weights for policy 0, policy_version 65450 (0.0028) [2024-06-21 21:10:02,752][15401] Updated weights for policy 0, policy_version 65460 (0.0031) [2024-06-21 21:10:03,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42320.4). Total num frames: 1072496640. Throughput: 0: 42262.1. Samples: 1072670820. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:10:03,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 21:10:06,419][15401] Updated weights for policy 0, policy_version 65470 (0.0024) [2024-06-21 21:10:08,390][15132] Fps is (10 sec: 42595.6, 60 sec: 42329.4, 300 sec: 42376.5). Total num frames: 1072726016. Throughput: 0: 42207.8. Samples: 1072795720. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:10:08,391][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 21:10:10,670][15401] Updated weights for policy 0, policy_version 65480 (0.0036) [2024-06-21 21:10:13,391][15132] Fps is (10 sec: 42604.4, 60 sec: 42324.6, 300 sec: 42376.1). Total num frames: 1072922624. Throughput: 0: 42220.3. Samples: 1073050000. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:10:13,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 21:10:14,219][15401] Updated weights for policy 0, policy_version 65490 (0.0042) [2024-06-21 21:10:18,350][15401] Updated weights for policy 0, policy_version 65500 (0.0034) [2024-06-21 21:10:18,390][15132] Fps is (10 sec: 42601.4, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1073152000. Throughput: 0: 42119.9. Samples: 1073302260. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-21 21:10:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 21:10:21,906][15401] Updated weights for policy 0, policy_version 65510 (0.0031) [2024-06-21 21:10:23,390][15132] Fps is (10 sec: 44241.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1073364992. Throughput: 0: 42142.3. Samples: 1073431920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:23,396][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 21:10:26,592][15401] Updated weights for policy 0, policy_version 65520 (0.0036) [2024-06-21 21:10:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1073545216. Throughput: 0: 42229.0. Samples: 1073687200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 21:10:29,633][15401] Updated weights for policy 0, policy_version 65530 (0.0040) [2024-06-21 21:10:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42375.9). Total num frames: 1073774592. Throughput: 0: 42069.3. Samples: 1073934140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:33,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 21:10:34,291][15401] Updated weights for policy 0, policy_version 65540 (0.0048) [2024-06-21 21:10:37,351][15401] Updated weights for policy 0, policy_version 65550 (0.0034) [2024-06-21 21:10:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1074003968. Throughput: 0: 42250.6. Samples: 1074064740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 21:10:42,079][15401] Updated weights for policy 0, policy_version 65560 (0.0038) [2024-06-21 21:10:43,390][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 1074167808. Throughput: 0: 42049.4. Samples: 1074315180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 21:10:45,130][15401] Updated weights for policy 0, policy_version 65570 (0.0027) [2024-06-21 21:10:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1074413568. Throughput: 0: 42237.0. Samples: 1074571380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 21:10:49,773][15401] Updated weights for policy 0, policy_version 65580 (0.0041) [2024-06-21 21:10:52,796][15349] Signal inference workers to stop experience collection... (15700 times) [2024-06-21 21:10:52,796][15349] Signal inference workers to resume experience collection... (15700 times) [2024-06-21 21:10:52,818][15401] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-06-21 21:10:52,818][15401] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-06-21 21:10:52,952][15401] Updated weights for policy 0, policy_version 65590 (0.0037) [2024-06-21 21:10:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1074642944. Throughput: 0: 42347.7. Samples: 1074701340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 21:10:57,521][15401] Updated weights for policy 0, policy_version 65600 (0.0037) [2024-06-21 21:10:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1074823168. Throughput: 0: 42404.5. Samples: 1074958160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:10:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 21:11:00,635][15401] Updated weights for policy 0, policy_version 65610 (0.0039) [2024-06-21 21:11:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42327.1, 300 sec: 42265.2). Total num frames: 1075036160. Throughput: 0: 42373.4. Samples: 1075209060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-21 21:11:03,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-21 21:11:05,156][15401] Updated weights for policy 0, policy_version 65620 (0.0031) [2024-06-21 21:11:08,175][15401] Updated weights for policy 0, policy_version 65630 (0.0045) [2024-06-21 21:11:08,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42597.2, 300 sec: 42431.4). Total num frames: 1075281920. Throughput: 0: 42478.7. Samples: 1075343560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:08,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 21:11:12,803][15401] Updated weights for policy 0, policy_version 65640 (0.0035) [2024-06-21 21:11:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42326.0, 300 sec: 42376.2). Total num frames: 1075462144. Throughput: 0: 42434.0. Samples: 1075596740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 21:11:16,074][15401] Updated weights for policy 0, policy_version 65650 (0.0036) [2024-06-21 21:11:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1075691520. Throughput: 0: 42405.8. Samples: 1075842300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 21:11:21,310][15401] Updated weights for policy 0, policy_version 65660 (0.0029) [2024-06-21 21:11:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42323.7, 300 sec: 42320.7). Total num frames: 1075904512. Throughput: 0: 42536.9. Samples: 1075979000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:23,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 21:11:23,879][15401] Updated weights for policy 0, policy_version 65670 (0.0030) [2024-06-21 21:11:28,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 1076068352. Throughput: 0: 42511.5. Samples: 1076228200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 21:11:28,944][15401] Updated weights for policy 0, policy_version 65680 (0.0038) [2024-06-21 21:11:31,577][15401] Updated weights for policy 0, policy_version 65690 (0.0022) [2024-06-21 21:11:33,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42054.0, 300 sec: 42320.7). Total num frames: 1076297728. Throughput: 0: 42430.2. Samples: 1076480740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 21:11:36,408][15401] Updated weights for policy 0, policy_version 65700 (0.0042) [2024-06-21 21:11:38,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1076527104. Throughput: 0: 42484.6. Samples: 1076613140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 21:11:39,271][15401] Updated weights for policy 0, policy_version 65710 (0.0026) [2024-06-21 21:11:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1076723712. Throughput: 0: 42253.4. Samples: 1076859560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-21 21:11:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 21:11:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065718_1076723712.pth... [2024-06-21 21:11:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065098_1066565632.pth [2024-06-21 21:11:44,155][15401] Updated weights for policy 0, policy_version 65720 (0.0044) [2024-06-21 21:11:46,996][15401] Updated weights for policy 0, policy_version 65730 (0.0031) [2024-06-21 21:11:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1076936704. Throughput: 0: 42253.6. Samples: 1077110480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:11:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 21:11:51,762][15401] Updated weights for policy 0, policy_version 65740 (0.0040) [2024-06-21 21:11:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1077182464. Throughput: 0: 42338.7. Samples: 1077248700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:11:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 21:11:54,958][15401] Updated weights for policy 0, policy_version 65750 (0.0033) [2024-06-21 21:11:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1077346304. Throughput: 0: 42125.0. Samples: 1077492360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:11:58,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 21:11:59,581][15401] Updated weights for policy 0, policy_version 65760 (0.0036) [2024-06-21 21:12:00,301][15349] Signal inference workers to stop experience collection... (15750 times) [2024-06-21 21:12:00,302][15349] Signal inference workers to resume experience collection... (15750 times) [2024-06-21 21:12:00,326][15401] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-06-21 21:12:00,327][15401] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-06-21 21:12:02,639][15401] Updated weights for policy 0, policy_version 65770 (0.0028) [2024-06-21 21:12:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42321.0). Total num frames: 1077592064. Throughput: 0: 42163.6. Samples: 1077739660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:12:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 21:12:07,469][15401] Updated weights for policy 0, policy_version 65780 (0.0041) [2024-06-21 21:12:08,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 1077821440. Throughput: 0: 42209.1. Samples: 1077878300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:12:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 21:12:10,250][15401] Updated weights for policy 0, policy_version 65790 (0.0039) [2024-06-21 21:12:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1077985280. Throughput: 0: 42304.0. Samples: 1078131880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:12:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 21:12:14,966][15401] Updated weights for policy 0, policy_version 65800 (0.0038) [2024-06-21 21:12:18,194][15401] Updated weights for policy 0, policy_version 65810 (0.0031) [2024-06-21 21:12:18,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 1078231040. Throughput: 0: 42249.3. Samples: 1078382060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:12:18,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-21 21:12:22,509][15401] Updated weights for policy 0, policy_version 65820 (0.0028) [2024-06-21 21:12:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42327.1, 300 sec: 42376.3). Total num frames: 1078444032. Throughput: 0: 42273.4. Samples: 1078515440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:12:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 21:12:25,964][15401] Updated weights for policy 0, policy_version 65830 (0.0037) [2024-06-21 21:12:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42265.5). Total num frames: 1078624256. Throughput: 0: 42340.0. Samples: 1078764860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-21 21:12:30,363][15401] Updated weights for policy 0, policy_version 65840 (0.0024) [2024-06-21 21:12:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42321.1). Total num frames: 1078853632. Throughput: 0: 42409.4. Samples: 1079018900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 21:12:33,606][15401] Updated weights for policy 0, policy_version 65850 (0.0040) [2024-06-21 21:12:38,007][15401] Updated weights for policy 0, policy_version 65860 (0.0033) [2024-06-21 21:12:38,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42320.8, 300 sec: 42375.7). Total num frames: 1079066624. Throughput: 0: 42218.0. Samples: 1079148780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:38,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 21:12:41,611][15401] Updated weights for policy 0, policy_version 65870 (0.0033) [2024-06-21 21:12:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42375.9). Total num frames: 1079279616. Throughput: 0: 42390.6. Samples: 1079400040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:43,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 21:12:45,612][15401] Updated weights for policy 0, policy_version 65880 (0.0038) [2024-06-21 21:12:48,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1079492608. Throughput: 0: 42478.2. Samples: 1079651180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 21:12:49,712][15401] Updated weights for policy 0, policy_version 65890 (0.0027) [2024-06-21 21:12:53,355][15401] Updated weights for policy 0, policy_version 65900 (0.0035) [2024-06-21 21:12:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1079705600. Throughput: 0: 42298.9. Samples: 1079781760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 21:12:57,252][15401] Updated weights for policy 0, policy_version 65910 (0.0030) [2024-06-21 21:12:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1079918592. Throughput: 0: 42260.0. Samples: 1080033580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:12:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-21 21:13:01,291][15401] Updated weights for policy 0, policy_version 65920 (0.0036) [2024-06-21 21:13:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1080147968. Throughput: 0: 42233.3. Samples: 1080282460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-21 21:13:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 21:13:04,900][15401] Updated weights for policy 0, policy_version 65930 (0.0032) [2024-06-21 21:13:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.0, 300 sec: 42320.8). Total num frames: 1080311808. Throughput: 0: 42166.2. Samples: 1080412920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 21:13:08,959][15401] Updated weights for policy 0, policy_version 65940 (0.0033) [2024-06-21 21:13:12,612][15401] Updated weights for policy 0, policy_version 65950 (0.0047) [2024-06-21 21:13:13,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1080541184. Throughput: 0: 42202.1. Samples: 1080663960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 21:13:15,152][15349] Signal inference workers to stop experience collection... (15800 times) [2024-06-21 21:13:15,153][15349] Signal inference workers to resume experience collection... (15800 times) [2024-06-21 21:13:15,170][15401] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-06-21 21:13:15,171][15401] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-06-21 21:13:16,717][15401] Updated weights for policy 0, policy_version 65960 (0.0027) [2024-06-21 21:13:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 1080770560. Throughput: 0: 42122.2. Samples: 1080914400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 21:13:20,313][15401] Updated weights for policy 0, policy_version 65970 (0.0040) [2024-06-21 21:13:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1080950784. Throughput: 0: 41977.1. Samples: 1081037480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 21:13:24,777][15401] Updated weights for policy 0, policy_version 65980 (0.0048) [2024-06-21 21:13:28,024][15401] Updated weights for policy 0, policy_version 65990 (0.0033) [2024-06-21 21:13:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 1081180160. Throughput: 0: 42088.7. Samples: 1081293940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 21:13:32,581][15401] Updated weights for policy 0, policy_version 66000 (0.0024) [2024-06-21 21:13:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1081376768. Throughput: 0: 42243.5. Samples: 1081552140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 21:13:35,953][15401] Updated weights for policy 0, policy_version 66010 (0.0029) [2024-06-21 21:13:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42056.7, 300 sec: 42320.7). Total num frames: 1081589760. Throughput: 0: 42081.8. Samples: 1081675440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 21:13:40,098][15401] Updated weights for policy 0, policy_version 66020 (0.0030) [2024-06-21 21:13:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42054.0, 300 sec: 42265.2). Total num frames: 1081802752. Throughput: 0: 42225.0. Samples: 1081933700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:13:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 21:13:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066029_1081819136.pth... [2024-06-21 21:13:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065410_1071677440.pth [2024-06-21 21:13:43,663][15401] Updated weights for policy 0, policy_version 66030 (0.0028) [2024-06-21 21:13:47,963][15401] Updated weights for policy 0, policy_version 66040 (0.0035) [2024-06-21 21:13:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1081999360. Throughput: 0: 42314.3. Samples: 1082186600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:13:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 21:13:51,245][15401] Updated weights for policy 0, policy_version 66050 (0.0041) [2024-06-21 21:13:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1082212352. Throughput: 0: 42152.8. Samples: 1082309800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:13:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 21:13:55,483][15401] Updated weights for policy 0, policy_version 66060 (0.0034) [2024-06-21 21:13:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1082441728. Throughput: 0: 42232.9. Samples: 1082564440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:13:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 21:13:58,824][15401] Updated weights for policy 0, policy_version 66070 (0.0035) [2024-06-21 21:14:03,236][15401] Updated weights for policy 0, policy_version 66080 (0.0025) [2024-06-21 21:14:03,390][15132] Fps is (10 sec: 44235.2, 60 sec: 41779.0, 300 sec: 42266.0). Total num frames: 1082654720. Throughput: 0: 42472.5. Samples: 1082825680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 21:14:06,651][15401] Updated weights for policy 0, policy_version 66090 (0.0030) [2024-06-21 21:14:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1082867712. Throughput: 0: 42472.9. Samples: 1082948760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 21:14:10,831][15401] Updated weights for policy 0, policy_version 66100 (0.0025) [2024-06-21 21:14:13,389][15132] Fps is (10 sec: 42600.6, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1083080704. Throughput: 0: 42337.6. Samples: 1083199120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 21:14:14,451][15401] Updated weights for policy 0, policy_version 66110 (0.0029) [2024-06-21 21:14:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1083293696. Throughput: 0: 42221.5. Samples: 1083452100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 21:14:19,108][15401] Updated weights for policy 0, policy_version 66120 (0.0041) [2024-06-21 21:14:22,503][15401] Updated weights for policy 0, policy_version 66130 (0.0026) [2024-06-21 21:14:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42320.4). Total num frames: 1083506688. Throughput: 0: 42231.1. Samples: 1083575940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:23,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 21:14:26,682][15401] Updated weights for policy 0, policy_version 66140 (0.0037) [2024-06-21 21:14:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1083719680. Throughput: 0: 42276.7. Samples: 1083836160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 21:14:30,312][15401] Updated weights for policy 0, policy_version 66150 (0.0040) [2024-06-21 21:14:33,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1083932672. Throughput: 0: 42287.9. Samples: 1084089560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 21:14:34,397][15401] Updated weights for policy 0, policy_version 66160 (0.0034) [2024-06-21 21:14:37,859][15401] Updated weights for policy 0, policy_version 66170 (0.0027) [2024-06-21 21:14:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1084129280. Throughput: 0: 42309.4. Samples: 1084213720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 21:14:42,234][15401] Updated weights for policy 0, policy_version 66180 (0.0033) [2024-06-21 21:14:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1084358656. Throughput: 0: 42476.4. Samples: 1084475880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 21:14:45,658][15401] Updated weights for policy 0, policy_version 66190 (0.0038) [2024-06-21 21:14:47,630][15349] Signal inference workers to stop experience collection... (15850 times) [2024-06-21 21:14:47,630][15349] Signal inference workers to resume experience collection... (15850 times) [2024-06-21 21:14:47,650][15401] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-06-21 21:14:47,650][15401] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-06-21 21:14:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1084571648. Throughput: 0: 42327.6. Samples: 1084730400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 21:14:49,779][15401] Updated weights for policy 0, policy_version 66200 (0.0037) [2024-06-21 21:14:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1084768256. Throughput: 0: 42313.7. Samples: 1084852880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:14:53,459][15401] Updated weights for policy 0, policy_version 66210 (0.0035) [2024-06-21 21:14:57,496][15401] Updated weights for policy 0, policy_version 66220 (0.0041) [2024-06-21 21:14:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42376.6). Total num frames: 1084997632. Throughput: 0: 42510.9. Samples: 1085112120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:14:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 21:15:01,068][15401] Updated weights for policy 0, policy_version 66230 (0.0038) [2024-06-21 21:15:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.7, 300 sec: 42265.3). Total num frames: 1085194240. Throughput: 0: 42574.3. Samples: 1085367940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-21 21:15:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 21:15:05,097][15401] Updated weights for policy 0, policy_version 66240 (0.0038) [2024-06-21 21:15:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42320.8). Total num frames: 1085407232. Throughput: 0: 42506.1. Samples: 1085488620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 21:15:09,058][15401] Updated weights for policy 0, policy_version 66250 (0.0041) [2024-06-21 21:15:12,898][15401] Updated weights for policy 0, policy_version 66260 (0.0035) [2024-06-21 21:15:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1085620224. Throughput: 0: 42400.9. Samples: 1085744200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 21:15:16,805][15401] Updated weights for policy 0, policy_version 66270 (0.0031) [2024-06-21 21:15:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1085816832. Throughput: 0: 42397.6. Samples: 1085997440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-21 21:15:20,655][15401] Updated weights for policy 0, policy_version 66280 (0.0028) [2024-06-21 21:15:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.0, 300 sec: 42431.7). Total num frames: 1086062592. Throughput: 0: 42401.2. Samples: 1086121780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:15:24,853][15401] Updated weights for policy 0, policy_version 66290 (0.0031) [2024-06-21 21:15:28,305][15401] Updated weights for policy 0, policy_version 66300 (0.0034) [2024-06-21 21:15:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42321.1). Total num frames: 1086259200. Throughput: 0: 42285.4. Samples: 1086378720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 21:15:32,590][15401] Updated weights for policy 0, policy_version 66310 (0.0036) [2024-06-21 21:15:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1086455808. Throughput: 0: 42227.4. Samples: 1086630640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 21:15:36,130][15401] Updated weights for policy 0, policy_version 66320 (0.0039) [2024-06-21 21:15:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1086685184. Throughput: 0: 42337.0. Samples: 1086758040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 21:15:40,260][15401] Updated weights for policy 0, policy_version 66330 (0.0037) [2024-06-21 21:15:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1086881792. Throughput: 0: 42248.0. Samples: 1087013280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-21 21:15:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 21:15:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066338_1086881792.pth... [2024-06-21 21:15:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000065718_1076723712.pth [2024-06-21 21:15:43,819][15401] Updated weights for policy 0, policy_version 66340 (0.0027) [2024-06-21 21:15:47,973][15401] Updated weights for policy 0, policy_version 66350 (0.0040) [2024-06-21 21:15:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1087094784. Throughput: 0: 42235.4. Samples: 1087268540. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:15:48,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 21:15:51,323][15401] Updated weights for policy 0, policy_version 66360 (0.0033) [2024-06-21 21:15:52,035][15349] Signal inference workers to stop experience collection... (15900 times) [2024-06-21 21:15:52,089][15401] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-06-21 21:15:52,089][15349] Signal inference workers to resume experience collection... (15900 times) [2024-06-21 21:15:52,110][15401] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-06-21 21:15:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1087340544. Throughput: 0: 42296.1. Samples: 1087391940. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:15:53,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 21:15:55,843][15401] Updated weights for policy 0, policy_version 66370 (0.0037) [2024-06-21 21:15:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1087504384. Throughput: 0: 42271.3. Samples: 1087646400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:15:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 21:15:59,226][15401] Updated weights for policy 0, policy_version 66380 (0.0027) [2024-06-21 21:16:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 1087717376. Throughput: 0: 42264.0. Samples: 1087899320. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 21:16:03,498][15401] Updated weights for policy 0, policy_version 66390 (0.0037) [2024-06-21 21:16:06,986][15401] Updated weights for policy 0, policy_version 66400 (0.0037) [2024-06-21 21:16:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1087979520. Throughput: 0: 42358.4. Samples: 1088027900. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:16:11,214][15401] Updated weights for policy 0, policy_version 66410 (0.0036) [2024-06-21 21:16:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 1088126976. Throughput: 0: 42228.9. Samples: 1088279020. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 21:16:14,583][15401] Updated weights for policy 0, policy_version 66420 (0.0036) [2024-06-21 21:16:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.2, 300 sec: 42210.0). Total num frames: 1088356352. Throughput: 0: 42287.6. Samples: 1088533580. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 21:16:18,974][15401] Updated weights for policy 0, policy_version 66430 (0.0037) [2024-06-21 21:16:22,163][15401] Updated weights for policy 0, policy_version 66440 (0.0024) [2024-06-21 21:16:23,389][15132] Fps is (10 sec: 47512.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1088602112. Throughput: 0: 42442.5. Samples: 1088667960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 21:16:26,586][15401] Updated weights for policy 0, policy_version 66450 (0.0037) [2024-06-21 21:16:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42265.1). Total num frames: 1088765952. Throughput: 0: 42320.4. Samples: 1088917700. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-21 21:16:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 21:16:29,878][15401] Updated weights for policy 0, policy_version 66460 (0.0035) [2024-06-21 21:16:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1089011712. Throughput: 0: 42225.7. Samples: 1089168700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 21:16:34,124][15401] Updated weights for policy 0, policy_version 66470 (0.0026) [2024-06-21 21:16:37,700][15401] Updated weights for policy 0, policy_version 66480 (0.0037) [2024-06-21 21:16:38,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1089241088. Throughput: 0: 42549.9. Samples: 1089306680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 21:16:41,652][15401] Updated weights for policy 0, policy_version 66490 (0.0032) [2024-06-21 21:16:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1089421312. Throughput: 0: 42405.2. Samples: 1089554640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 21:16:45,327][15401] Updated weights for policy 0, policy_version 66500 (0.0028) [2024-06-21 21:16:48,390][15132] Fps is (10 sec: 39318.3, 60 sec: 42324.8, 300 sec: 42209.5). Total num frames: 1089634304. Throughput: 0: 42523.6. Samples: 1089812920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:48,391][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 21:16:49,284][15401] Updated weights for policy 0, policy_version 66510 (0.0030) [2024-06-21 21:16:52,963][15401] Updated weights for policy 0, policy_version 66520 (0.0037) [2024-06-21 21:16:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1089863680. Throughput: 0: 42551.9. Samples: 1089942740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 21:16:56,910][15401] Updated weights for policy 0, policy_version 66530 (0.0033) [2024-06-21 21:16:58,390][15132] Fps is (10 sec: 42601.6, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1090060288. Throughput: 0: 42516.8. Samples: 1090192280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:16:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 21:17:00,679][15401] Updated weights for policy 0, policy_version 66540 (0.0028) [2024-06-21 21:17:03,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1090273280. Throughput: 0: 42634.8. Samples: 1090452140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:17:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 21:17:04,823][15401] Updated weights for policy 0, policy_version 66550 (0.0051) [2024-06-21 21:17:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1090502656. Throughput: 0: 42502.7. Samples: 1090580580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-21 21:17:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 21:17:08,456][15401] Updated weights for policy 0, policy_version 66560 (0.0027) [2024-06-21 21:17:12,648][15401] Updated weights for policy 0, policy_version 66570 (0.0035) [2024-06-21 21:17:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42265.5). Total num frames: 1090699264. Throughput: 0: 42626.4. Samples: 1090835880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 21:17:15,309][15349] Signal inference workers to stop experience collection... (15950 times) [2024-06-21 21:17:15,309][15349] Signal inference workers to resume experience collection... (15950 times) [2024-06-21 21:17:15,354][15401] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-06-21 21:17:15,354][15401] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-06-21 21:17:16,076][15401] Updated weights for policy 0, policy_version 66580 (0.0031) [2024-06-21 21:17:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1090928640. Throughput: 0: 42738.3. Samples: 1091091920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 21:17:20,133][15401] Updated weights for policy 0, policy_version 66590 (0.0034) [2024-06-21 21:17:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1091141632. Throughput: 0: 42523.0. Samples: 1091220220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 21:17:23,775][15401] Updated weights for policy 0, policy_version 66600 (0.0025) [2024-06-21 21:17:27,782][15401] Updated weights for policy 0, policy_version 66610 (0.0034) [2024-06-21 21:17:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 1091338240. Throughput: 0: 42620.6. Samples: 1091472560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 21:17:31,445][15401] Updated weights for policy 0, policy_version 66620 (0.0035) [2024-06-21 21:17:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42377.1). Total num frames: 1091567616. Throughput: 0: 42590.8. Samples: 1091729480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 21:17:35,595][15401] Updated weights for policy 0, policy_version 66630 (0.0031) [2024-06-21 21:17:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 1091780608. Throughput: 0: 42574.3. Samples: 1091858580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:38,398][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 21:17:39,115][15401] Updated weights for policy 0, policy_version 66640 (0.0028) [2024-06-21 21:17:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1091977216. Throughput: 0: 42708.9. Samples: 1092114180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 21:17:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066650_1091993600.pth... [2024-06-21 21:17:43,423][15401] Updated weights for policy 0, policy_version 66650 (0.0036) [2024-06-21 21:17:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066029_1081819136.pth [2024-06-21 21:17:46,841][15401] Updated weights for policy 0, policy_version 66660 (0.0034) [2024-06-21 21:17:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42872.0, 300 sec: 42376.2). Total num frames: 1092206592. Throughput: 0: 42439.4. Samples: 1092361920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 21:17:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 21:17:51,328][15401] Updated weights for policy 0, policy_version 66670 (0.0029) [2024-06-21 21:17:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1092419584. Throughput: 0: 42548.5. Samples: 1092495260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:17:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 21:17:54,391][15401] Updated weights for policy 0, policy_version 66680 (0.0027) [2024-06-21 21:17:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1092616192. Throughput: 0: 42554.1. Samples: 1092750820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:17:58,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-21 21:17:58,928][15401] Updated weights for policy 0, policy_version 66690 (0.0033) [2024-06-21 21:18:02,268][15401] Updated weights for policy 0, policy_version 66700 (0.0036) [2024-06-21 21:18:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1092845568. Throughput: 0: 42607.7. Samples: 1093009260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 21:18:06,387][15401] Updated weights for policy 0, policy_version 66710 (0.0026) [2024-06-21 21:18:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1093058560. Throughput: 0: 42593.4. Samples: 1093136920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:08,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 21:18:09,943][15401] Updated weights for policy 0, policy_version 66720 (0.0030) [2024-06-21 21:18:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1093255168. Throughput: 0: 42647.6. Samples: 1093391700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-21 21:18:14,199][15401] Updated weights for policy 0, policy_version 66730 (0.0033) [2024-06-21 21:18:17,484][15401] Updated weights for policy 0, policy_version 66740 (0.0038) [2024-06-21 21:18:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1093484544. Throughput: 0: 42604.2. Samples: 1093646660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 21:18:22,022][15401] Updated weights for policy 0, policy_version 66750 (0.0026) [2024-06-21 21:18:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 1093713920. Throughput: 0: 42751.6. Samples: 1093782400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-21 21:18:25,102][15401] Updated weights for policy 0, policy_version 66760 (0.0030) [2024-06-21 21:18:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1093894144. Throughput: 0: 42787.9. Samples: 1094039640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:18:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 21:18:29,806][15401] Updated weights for policy 0, policy_version 66770 (0.0027) [2024-06-21 21:18:32,727][15401] Updated weights for policy 0, policy_version 66780 (0.0030) [2024-06-21 21:18:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1094139904. Throughput: 0: 42753.9. Samples: 1094285840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 21:18:37,406][15401] Updated weights for policy 0, policy_version 66790 (0.0035) [2024-06-21 21:18:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 1094336512. Throughput: 0: 42807.9. Samples: 1094421720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:38,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 21:18:40,582][15401] Updated weights for policy 0, policy_version 66800 (0.0034) [2024-06-21 21:18:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1094533120. Throughput: 0: 42613.0. Samples: 1094668400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 21:18:44,946][15401] Updated weights for policy 0, policy_version 66810 (0.0027) [2024-06-21 21:18:48,372][15401] Updated weights for policy 0, policy_version 66820 (0.0033) [2024-06-21 21:18:48,390][15349] Signal inference workers to stop experience collection... (16000 times) [2024-06-21 21:18:48,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1094778880. Throughput: 0: 42560.7. Samples: 1094924500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:48,390][15349] Signal inference workers to resume experience collection... (16000 times) [2024-06-21 21:18:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 21:18:48,427][15401] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-06-21 21:18:48,427][15401] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-06-21 21:18:52,843][15401] Updated weights for policy 0, policy_version 66830 (0.0036) [2024-06-21 21:18:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1094959104. Throughput: 0: 42690.7. Samples: 1095058000. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 21:18:55,932][15401] Updated weights for policy 0, policy_version 66840 (0.0038) [2024-06-21 21:18:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42431.9). Total num frames: 1095172096. Throughput: 0: 42465.7. Samples: 1095302660. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:18:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 21:19:00,581][15401] Updated weights for policy 0, policy_version 66850 (0.0041) [2024-06-21 21:19:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1095417856. Throughput: 0: 42439.0. Samples: 1095556420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:19:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 21:19:03,769][15401] Updated weights for policy 0, policy_version 66860 (0.0027) [2024-06-21 21:19:08,261][15401] Updated weights for policy 0, policy_version 66870 (0.0037) [2024-06-21 21:19:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1095598080. Throughput: 0: 42420.8. Samples: 1095691340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:19:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 21:19:11,451][15401] Updated weights for policy 0, policy_version 66880 (0.0033) [2024-06-21 21:19:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1095811072. Throughput: 0: 42170.7. Samples: 1095937320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-21 21:19:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 21:19:15,881][15401] Updated weights for policy 0, policy_version 66890 (0.0031) [2024-06-21 21:19:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 1096056832. Throughput: 0: 42456.4. Samples: 1096196380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 21:19:18,995][15401] Updated weights for policy 0, policy_version 66900 (0.0029) [2024-06-21 21:19:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1096220672. Throughput: 0: 42326.4. Samples: 1096326300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 21:19:23,566][15401] Updated weights for policy 0, policy_version 66910 (0.0032) [2024-06-21 21:19:26,910][15401] Updated weights for policy 0, policy_version 66920 (0.0034) [2024-06-21 21:19:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1096466432. Throughput: 0: 42345.2. Samples: 1096573940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 21:19:31,013][15401] Updated weights for policy 0, policy_version 66930 (0.0032) [2024-06-21 21:19:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1096663040. Throughput: 0: 42518.8. Samples: 1096837840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 21:19:34,683][15401] Updated weights for policy 0, policy_version 66940 (0.0035) [2024-06-21 21:19:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 1096876032. Throughput: 0: 42271.9. Samples: 1096960240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:38,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 21:19:38,724][15401] Updated weights for policy 0, policy_version 66950 (0.0039) [2024-06-21 21:19:42,526][15401] Updated weights for policy 0, policy_version 66960 (0.0037) [2024-06-21 21:19:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1097121792. Throughput: 0: 42528.7. Samples: 1097216460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 21:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066963_1097121792.pth... [2024-06-21 21:19:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066338_1086881792.pth [2024-06-21 21:19:46,574][15401] Updated weights for policy 0, policy_version 66970 (0.0027) [2024-06-21 21:19:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1097285632. Throughput: 0: 42520.1. Samples: 1097469820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 21:19:50,047][15401] Updated weights for policy 0, policy_version 66980 (0.0028) [2024-06-21 21:19:53,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1097498624. Throughput: 0: 42198.6. Samples: 1097590280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-21 21:19:53,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-21 21:19:54,253][15401] Updated weights for policy 0, policy_version 66990 (0.0038) [2024-06-21 21:19:57,189][15349] Signal inference workers to stop experience collection... (16050 times) [2024-06-21 21:19:57,250][15401] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-06-21 21:19:57,305][15349] Signal inference workers to resume experience collection... (16050 times) [2024-06-21 21:19:57,305][15401] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-06-21 21:19:57,680][15401] Updated weights for policy 0, policy_version 67000 (0.0034) [2024-06-21 21:19:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1097744384. Throughput: 0: 42495.2. Samples: 1097849600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:19:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 21:20:01,913][15401] Updated weights for policy 0, policy_version 67010 (0.0033) [2024-06-21 21:20:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42376.3). Total num frames: 1097908224. Throughput: 0: 42551.1. Samples: 1098111180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 21:20:05,384][15401] Updated weights for policy 0, policy_version 67020 (0.0032) [2024-06-21 21:20:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1098137600. Throughput: 0: 42252.8. Samples: 1098227680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:08,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 21:20:09,644][15401] Updated weights for policy 0, policy_version 67030 (0.0040) [2024-06-21 21:20:13,022][15401] Updated weights for policy 0, policy_version 67040 (0.0045) [2024-06-21 21:20:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1098383360. Throughput: 0: 42580.5. Samples: 1098490060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 21:20:17,233][15401] Updated weights for policy 0, policy_version 67050 (0.0032) [2024-06-21 21:20:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1098563584. Throughput: 0: 42292.0. Samples: 1098740980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 21:20:20,896][15401] Updated weights for policy 0, policy_version 67060 (0.0027) [2024-06-21 21:20:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1098776576. Throughput: 0: 42271.2. Samples: 1098862440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:23,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 21:20:25,308][15401] Updated weights for policy 0, policy_version 67070 (0.0040) [2024-06-21 21:20:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1099005952. Throughput: 0: 42288.5. Samples: 1099119440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:28,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 21:20:28,919][15401] Updated weights for policy 0, policy_version 67080 (0.0038) [2024-06-21 21:20:32,975][15401] Updated weights for policy 0, policy_version 67090 (0.0024) [2024-06-21 21:20:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1099202560. Throughput: 0: 42295.9. Samples: 1099373140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-21 21:20:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 21:20:36,535][15401] Updated weights for policy 0, policy_version 67100 (0.0026) [2024-06-21 21:20:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1099415552. Throughput: 0: 42498.7. Samples: 1099502720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:20:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 21:20:40,868][15401] Updated weights for policy 0, policy_version 67110 (0.0040) [2024-06-21 21:20:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1099644928. Throughput: 0: 42326.1. Samples: 1099754280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:20:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 21:20:44,220][15401] Updated weights for policy 0, policy_version 67120 (0.0047) [2024-06-21 21:20:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1099841536. Throughput: 0: 42100.0. Samples: 1100005680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:20:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 21:20:48,953][15401] Updated weights for policy 0, policy_version 67130 (0.0030) [2024-06-21 21:20:52,008][15401] Updated weights for policy 0, policy_version 67140 (0.0037) [2024-06-21 21:20:53,394][15132] Fps is (10 sec: 40940.9, 60 sec: 42595.1, 300 sec: 42542.2). Total num frames: 1100054528. Throughput: 0: 42375.2. Samples: 1100134760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:20:53,395][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 21:20:56,743][15401] Updated weights for policy 0, policy_version 67150 (0.0039) [2024-06-21 21:20:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.5, 300 sec: 42598.0). Total num frames: 1100283904. Throughput: 0: 42346.6. Samples: 1100395760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:20:58,393][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 21:20:59,513][15401] Updated weights for policy 0, policy_version 67160 (0.0040) [2024-06-21 21:21:03,390][15132] Fps is (10 sec: 42617.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1100480512. Throughput: 0: 42397.2. Samples: 1100648860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:21:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 21:21:04,497][15401] Updated weights for policy 0, policy_version 67170 (0.0034) [2024-06-21 21:21:07,140][15401] Updated weights for policy 0, policy_version 67180 (0.0029) [2024-06-21 21:21:08,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1100693504. Throughput: 0: 42481.6. Samples: 1100774120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:21:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 21:21:12,288][15401] Updated weights for policy 0, policy_version 67190 (0.0029) [2024-06-21 21:21:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1100922880. Throughput: 0: 42585.9. Samples: 1101035800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:21:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 21:21:15,375][15401] Updated weights for policy 0, policy_version 67200 (0.0034) [2024-06-21 21:21:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1101119488. Throughput: 0: 42455.2. Samples: 1101283620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 21:21:19,830][15401] Updated weights for policy 0, policy_version 67210 (0.0027) [2024-06-21 21:21:21,055][15349] Signal inference workers to stop experience collection... (16100 times) [2024-06-21 21:21:21,055][15349] Signal inference workers to resume experience collection... (16100 times) [2024-06-21 21:21:21,089][15401] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-06-21 21:21:21,089][15401] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-06-21 21:21:23,243][15401] Updated weights for policy 0, policy_version 67220 (0.0039) [2024-06-21 21:21:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1101332480. Throughput: 0: 42184.8. Samples: 1101401040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 21:21:27,617][15401] Updated weights for policy 0, policy_version 67230 (0.0037) [2024-06-21 21:21:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1101545472. Throughput: 0: 42467.6. Samples: 1101665320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:28,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 21:21:30,857][15401] Updated weights for policy 0, policy_version 67240 (0.0037) [2024-06-21 21:21:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1101758464. Throughput: 0: 42487.1. Samples: 1101917600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:21:35,198][15401] Updated weights for policy 0, policy_version 67250 (0.0033) [2024-06-21 21:21:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1101971456. Throughput: 0: 42451.1. Samples: 1102044860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 21:21:38,452][15401] Updated weights for policy 0, policy_version 67260 (0.0035) [2024-06-21 21:21:42,840][15401] Updated weights for policy 0, policy_version 67270 (0.0034) [2024-06-21 21:21:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 1102168064. Throughput: 0: 42334.8. Samples: 1102300720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 21:21:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067272_1102184448.pth... [2024-06-21 21:21:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066650_1091993600.pth [2024-06-21 21:21:46,267][15401] Updated weights for policy 0, policy_version 67280 (0.0044) [2024-06-21 21:21:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1102397440. Throughput: 0: 42361.1. Samples: 1102555100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 21:21:50,502][15401] Updated weights for policy 0, policy_version 67290 (0.0040) [2024-06-21 21:21:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42328.7, 300 sec: 42487.3). Total num frames: 1102594048. Throughput: 0: 42323.3. Samples: 1102678660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 21:21:54,348][15401] Updated weights for policy 0, policy_version 67300 (0.0035) [2024-06-21 21:21:58,209][15401] Updated weights for policy 0, policy_version 67310 (0.0030) [2024-06-21 21:21:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 1102807040. Throughput: 0: 42165.6. Samples: 1102933260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-21 21:21:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 21:22:02,115][15401] Updated weights for policy 0, policy_version 67320 (0.0038) [2024-06-21 21:22:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1103020032. Throughput: 0: 42144.4. Samples: 1103180120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 21:22:06,152][15401] Updated weights for policy 0, policy_version 67330 (0.0039) [2024-06-21 21:22:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1103216640. Throughput: 0: 42498.9. Samples: 1103313480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-21 21:22:09,763][15401] Updated weights for policy 0, policy_version 67340 (0.0036) [2024-06-21 21:22:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 1103429632. Throughput: 0: 42152.4. Samples: 1103562180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:13,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-21 21:22:13,822][15401] Updated weights for policy 0, policy_version 67350 (0.0035) [2024-06-21 21:22:17,520][15401] Updated weights for policy 0, policy_version 67360 (0.0029) [2024-06-21 21:22:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1103659008. Throughput: 0: 42212.5. Samples: 1103817160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 21:22:21,910][15401] Updated weights for policy 0, policy_version 67370 (0.0027) [2024-06-21 21:22:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42376.2). Total num frames: 1103839232. Throughput: 0: 42204.4. Samples: 1103944060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 21:22:25,464][15401] Updated weights for policy 0, policy_version 67380 (0.0034) [2024-06-21 21:22:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1104084992. Throughput: 0: 42087.1. Samples: 1104194640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 21:22:29,465][15401] Updated weights for policy 0, policy_version 67390 (0.0026) [2024-06-21 21:22:32,903][15401] Updated weights for policy 0, policy_version 67400 (0.0036) [2024-06-21 21:22:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1104297984. Throughput: 0: 42222.1. Samples: 1104455100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 21:22:36,942][15401] Updated weights for policy 0, policy_version 67410 (0.0048) [2024-06-21 21:22:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1104494592. Throughput: 0: 42350.0. Samples: 1104584420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-21 21:22:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 21:22:40,532][15401] Updated weights for policy 0, policy_version 67420 (0.0031) [2024-06-21 21:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1104740352. Throughput: 0: 42407.5. Samples: 1104841600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:22:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:22:44,514][15401] Updated weights for policy 0, policy_version 67430 (0.0035) [2024-06-21 21:22:48,309][15401] Updated weights for policy 0, policy_version 67440 (0.0040) [2024-06-21 21:22:48,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1104936960. Throughput: 0: 42505.9. Samples: 1105092880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:22:48,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 21:22:49,080][15349] Signal inference workers to stop experience collection... (16150 times) [2024-06-21 21:22:49,081][15349] Signal inference workers to resume experience collection... (16150 times) [2024-06-21 21:22:49,099][15401] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-06-21 21:22:49,099][15401] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-06-21 21:22:52,171][15401] Updated weights for policy 0, policy_version 67450 (0.0032) [2024-06-21 21:22:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1105149952. Throughput: 0: 42350.1. Samples: 1105219240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:22:53,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 21:22:55,870][15401] Updated weights for policy 0, policy_version 67460 (0.0024) [2024-06-21 21:22:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1105346560. Throughput: 0: 42524.5. Samples: 1105475780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:22:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 21:22:59,773][15401] Updated weights for policy 0, policy_version 67470 (0.0043) [2024-06-21 21:23:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1105575936. Throughput: 0: 42522.0. Samples: 1105730660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:23:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 21:23:04,049][15401] Updated weights for policy 0, policy_version 67480 (0.0039) [2024-06-21 21:23:07,553][15401] Updated weights for policy 0, policy_version 67490 (0.0038) [2024-06-21 21:23:08,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42431.4). Total num frames: 1105772544. Throughput: 0: 42544.9. Samples: 1105858680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:23:08,393][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 21:23:11,537][15401] Updated weights for policy 0, policy_version 67500 (0.0039) [2024-06-21 21:23:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1106001920. Throughput: 0: 42642.7. Samples: 1106113560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:23:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 21:23:15,461][15401] Updated weights for policy 0, policy_version 67510 (0.0042) [2024-06-21 21:23:18,396][15132] Fps is (10 sec: 44219.3, 60 sec: 42593.8, 300 sec: 42375.3). Total num frames: 1106214912. Throughput: 0: 42361.2. Samples: 1106361620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:23:18,396][15132] Avg episode reward: [(0, '0.763')] [2024-06-21 21:23:19,324][15401] Updated weights for policy 0, policy_version 67520 (0.0032) [2024-06-21 21:23:23,124][15401] Updated weights for policy 0, policy_version 67530 (0.0042) [2024-06-21 21:23:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1106411520. Throughput: 0: 42481.4. Samples: 1106496080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 21:23:26,948][15401] Updated weights for policy 0, policy_version 67540 (0.0043) [2024-06-21 21:23:28,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1106608128. Throughput: 0: 42339.7. Samples: 1106746880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 21:23:30,812][15401] Updated weights for policy 0, policy_version 67550 (0.0032) [2024-06-21 21:23:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 1106853888. Throughput: 0: 42455.5. Samples: 1107003380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 21:23:34,472][15401] Updated weights for policy 0, policy_version 67560 (0.0027) [2024-06-21 21:23:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1107050496. Throughput: 0: 42583.7. Samples: 1107135500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 21:23:38,495][15401] Updated weights for policy 0, policy_version 67570 (0.0041) [2024-06-21 21:23:42,069][15401] Updated weights for policy 0, policy_version 67580 (0.0030) [2024-06-21 21:23:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1107247104. Throughput: 0: 42401.7. Samples: 1107383860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 21:23:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067582_1107263488.pth... [2024-06-21 21:23:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000066963_1097121792.pth [2024-06-21 21:23:46,738][15401] Updated weights for policy 0, policy_version 67590 (0.0043) [2024-06-21 21:23:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1107492864. Throughput: 0: 42288.5. Samples: 1107633640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 21:23:49,963][15401] Updated weights for policy 0, policy_version 67600 (0.0030) [2024-06-21 21:23:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1107673088. Throughput: 0: 42392.1. Samples: 1107766220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 21:23:54,416][15401] Updated weights for policy 0, policy_version 67610 (0.0033) [2024-06-21 21:23:57,680][15401] Updated weights for policy 0, policy_version 67620 (0.0037) [2024-06-21 21:23:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1107886080. Throughput: 0: 42171.5. Samples: 1108011280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:23:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 21:24:02,197][15401] Updated weights for policy 0, policy_version 67630 (0.0043) [2024-06-21 21:24:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1108099072. Throughput: 0: 42306.1. Samples: 1108265120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-21 21:24:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 21:24:05,940][15401] Updated weights for policy 0, policy_version 67640 (0.0037) [2024-06-21 21:24:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42376.2). Total num frames: 1108312064. Throughput: 0: 42067.1. Samples: 1108389100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:08,394][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 21:24:09,866][15401] Updated weights for policy 0, policy_version 67650 (0.0035) [2024-06-21 21:24:11,497][15349] Signal inference workers to stop experience collection... (16200 times) [2024-06-21 21:24:11,497][15349] Signal inference workers to resume experience collection... (16200 times) [2024-06-21 21:24:11,545][15401] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-06-21 21:24:11,545][15401] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-06-21 21:24:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 1108525056. Throughput: 0: 42031.4. Samples: 1108638300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 21:24:13,804][15401] Updated weights for policy 0, policy_version 67660 (0.0031) [2024-06-21 21:24:17,845][15401] Updated weights for policy 0, policy_version 67670 (0.0038) [2024-06-21 21:24:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41783.6, 300 sec: 42376.2). Total num frames: 1108721664. Throughput: 0: 42145.3. Samples: 1108899920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 21:24:21,465][15401] Updated weights for policy 0, policy_version 67680 (0.0049) [2024-06-21 21:24:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1108951040. Throughput: 0: 41866.2. Samples: 1109019480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 21:24:25,984][15401] Updated weights for policy 0, policy_version 67690 (0.0031) [2024-06-21 21:24:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1109147648. Throughput: 0: 41807.2. Samples: 1109265180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 21:24:29,333][15401] Updated weights for policy 0, policy_version 67700 (0.0034) [2024-06-21 21:24:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41233.0, 300 sec: 42209.6). Total num frames: 1109327872. Throughput: 0: 42167.7. Samples: 1109531180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:33,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 21:24:33,555][15401] Updated weights for policy 0, policy_version 67710 (0.0037) [2024-06-21 21:24:36,981][15401] Updated weights for policy 0, policy_version 67720 (0.0037) [2024-06-21 21:24:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 1109573632. Throughput: 0: 41844.8. Samples: 1109649240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 21:24:41,203][15401] Updated weights for policy 0, policy_version 67730 (0.0035) [2024-06-21 21:24:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1109786624. Throughput: 0: 42073.2. Samples: 1109904580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 21:24:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 21:24:44,764][15401] Updated weights for policy 0, policy_version 67740 (0.0035) [2024-06-21 21:24:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41233.1, 300 sec: 42265.2). Total num frames: 1109966848. Throughput: 0: 42177.2. Samples: 1110163100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:24:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 21:24:48,937][15401] Updated weights for policy 0, policy_version 67750 (0.0036) [2024-06-21 21:24:52,577][15401] Updated weights for policy 0, policy_version 67760 (0.0038) [2024-06-21 21:24:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1110196224. Throughput: 0: 42121.0. Samples: 1110284540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:24:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 21:24:56,550][15401] Updated weights for policy 0, policy_version 67770 (0.0027) [2024-06-21 21:24:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1110425600. Throughput: 0: 42295.7. Samples: 1110541600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:24:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 21:25:00,452][15401] Updated weights for policy 0, policy_version 67780 (0.0043) [2024-06-21 21:25:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1110605824. Throughput: 0: 42124.9. Samples: 1110795540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:25:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 21:25:04,330][15401] Updated weights for policy 0, policy_version 67790 (0.0034) [2024-06-21 21:25:08,239][15401] Updated weights for policy 0, policy_version 67800 (0.0037) [2024-06-21 21:25:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1110851584. Throughput: 0: 42160.9. Samples: 1110916720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:25:08,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 21:25:11,921][15401] Updated weights for policy 0, policy_version 67810 (0.0030) [2024-06-21 21:25:13,392][15132] Fps is (10 sec: 44227.4, 60 sec: 42050.9, 300 sec: 42320.4). Total num frames: 1111048192. Throughput: 0: 42360.6. Samples: 1111171500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:25:13,392][15132] Avg episode reward: [(0, '0.287')] [2024-06-21 21:25:15,782][15401] Updated weights for policy 0, policy_version 67820 (0.0034) [2024-06-21 21:25:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1111244800. Throughput: 0: 42128.9. Samples: 1111426980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:25:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 21:25:19,726][15401] Updated weights for policy 0, policy_version 67830 (0.0022) [2024-06-21 21:25:23,389][15132] Fps is (10 sec: 42607.4, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1111474176. Throughput: 0: 42315.5. Samples: 1111553440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 21:25:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 21:25:23,473][15401] Updated weights for policy 0, policy_version 67840 (0.0037) [2024-06-21 21:25:26,956][15349] Signal inference workers to stop experience collection... (16250 times) [2024-06-21 21:25:26,985][15401] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-06-21 21:25:27,015][15349] Signal inference workers to resume experience collection... (16250 times) [2024-06-21 21:25:27,016][15401] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-06-21 21:25:27,527][15401] Updated weights for policy 0, policy_version 67850 (0.0031) [2024-06-21 21:25:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1111687168. Throughput: 0: 42321.5. Samples: 1111809040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 21:25:31,128][15401] Updated weights for policy 0, policy_version 67860 (0.0028) [2024-06-21 21:25:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1111883776. Throughput: 0: 42129.9. Samples: 1112058940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 21:25:35,185][15401] Updated weights for policy 0, policy_version 67870 (0.0039) [2024-06-21 21:25:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1112096768. Throughput: 0: 42232.1. Samples: 1112184980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 21:25:38,771][15401] Updated weights for policy 0, policy_version 67880 (0.0023) [2024-06-21 21:25:42,921][15401] Updated weights for policy 0, policy_version 67890 (0.0039) [2024-06-21 21:25:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1112342528. Throughput: 0: 42433.6. Samples: 1112451120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-21 21:25:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067892_1112342528.pth... [2024-06-21 21:25:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067272_1102184448.pth [2024-06-21 21:25:46,187][15401] Updated weights for policy 0, policy_version 67900 (0.0031) [2024-06-21 21:25:48,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.3, 300 sec: 42265.8). Total num frames: 1112522752. Throughput: 0: 42497.1. Samples: 1112707920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 21:25:50,421][15401] Updated weights for policy 0, policy_version 67910 (0.0033) [2024-06-21 21:25:53,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42265.5). Total num frames: 1112752128. Throughput: 0: 42498.8. Samples: 1112829160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 21:25:54,229][15401] Updated weights for policy 0, policy_version 67920 (0.0024) [2024-06-21 21:25:57,927][15401] Updated weights for policy 0, policy_version 67930 (0.0041) [2024-06-21 21:25:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1112965120. Throughput: 0: 42606.0. Samples: 1113088680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:25:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 21:26:01,732][15401] Updated weights for policy 0, policy_version 67940 (0.0046) [2024-06-21 21:26:03,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1113161728. Throughput: 0: 42532.0. Samples: 1113340920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:26:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 21:26:05,910][15401] Updated weights for policy 0, policy_version 67950 (0.0039) [2024-06-21 21:26:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 1113391104. Throughput: 0: 42542.6. Samples: 1113467860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 21:26:08,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-21 21:26:09,513][15401] Updated weights for policy 0, policy_version 67960 (0.0027) [2024-06-21 21:26:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.9, 300 sec: 42320.7). Total num frames: 1113604096. Throughput: 0: 42689.3. Samples: 1113730060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 21:26:13,521][15401] Updated weights for policy 0, policy_version 67970 (0.0048) [2024-06-21 21:26:17,473][15401] Updated weights for policy 0, policy_version 67980 (0.0035) [2024-06-21 21:26:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1113800704. Throughput: 0: 42663.0. Samples: 1113978780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 21:26:21,297][15401] Updated weights for policy 0, policy_version 67990 (0.0033) [2024-06-21 21:26:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1114046464. Throughput: 0: 42703.5. Samples: 1114106640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:23,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 21:26:25,077][15401] Updated weights for policy 0, policy_version 68000 (0.0041) [2024-06-21 21:26:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1114226688. Throughput: 0: 42514.8. Samples: 1114364280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 21:26:28,886][15401] Updated weights for policy 0, policy_version 68010 (0.0035) [2024-06-21 21:26:32,532][15401] Updated weights for policy 0, policy_version 68020 (0.0028) [2024-06-21 21:26:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1114439680. Throughput: 0: 42402.9. Samples: 1114616040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 21:26:36,575][15401] Updated weights for policy 0, policy_version 68030 (0.0037) [2024-06-21 21:26:38,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.3, 300 sec: 42431.7). Total num frames: 1114685440. Throughput: 0: 42599.7. Samples: 1114746160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 21:26:39,999][15401] Updated weights for policy 0, policy_version 68040 (0.0039) [2024-06-21 21:26:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1114849280. Throughput: 0: 42543.0. Samples: 1115003120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 21:26:44,331][15401] Updated weights for policy 0, policy_version 68050 (0.0043) [2024-06-21 21:26:48,136][15401] Updated weights for policy 0, policy_version 68060 (0.0049) [2024-06-21 21:26:48,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.7, 300 sec: 42376.2). Total num frames: 1115095040. Throughput: 0: 42450.2. Samples: 1115251180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:26:48,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 21:26:51,937][15349] Signal inference workers to stop experience collection... (16300 times) [2024-06-21 21:26:51,944][15349] Signal inference workers to resume experience collection... (16300 times) [2024-06-21 21:26:51,965][15401] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-06-21 21:26:51,965][15401] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-06-21 21:26:52,091][15401] Updated weights for policy 0, policy_version 68070 (0.0032) [2024-06-21 21:26:53,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 1115324416. Throughput: 0: 42667.6. Samples: 1115387900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:26:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 21:26:55,666][15401] Updated weights for policy 0, policy_version 68080 (0.0047) [2024-06-21 21:26:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1115488256. Throughput: 0: 42480.9. Samples: 1115641700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:26:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 21:26:59,824][15401] Updated weights for policy 0, policy_version 68090 (0.0030) [2024-06-21 21:27:03,219][15401] Updated weights for policy 0, policy_version 68100 (0.0025) [2024-06-21 21:27:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1115750400. Throughput: 0: 42370.2. Samples: 1115885440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 21:27:07,453][15401] Updated weights for policy 0, policy_version 68110 (0.0025) [2024-06-21 21:27:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1115947008. Throughput: 0: 42589.8. Samples: 1116023180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-21 21:27:11,278][15401] Updated weights for policy 0, policy_version 68120 (0.0031) [2024-06-21 21:27:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1116143616. Throughput: 0: 42357.8. Samples: 1116270380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 21:27:15,619][15401] Updated weights for policy 0, policy_version 68130 (0.0031) [2024-06-21 21:27:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 1116372992. Throughput: 0: 42413.2. Samples: 1116524740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:18,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 21:27:18,855][15401] Updated weights for policy 0, policy_version 68140 (0.0042) [2024-06-21 21:27:23,367][15401] Updated weights for policy 0, policy_version 68150 (0.0029) [2024-06-21 21:27:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1116569600. Throughput: 0: 42420.7. Samples: 1116655080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 21:27:26,487][15401] Updated weights for policy 0, policy_version 68160 (0.0029) [2024-06-21 21:27:28,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1116782592. Throughput: 0: 42230.8. Samples: 1116903500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 21:27:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 21:27:30,908][15401] Updated weights for policy 0, policy_version 68170 (0.0032) [2024-06-21 21:27:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 1116995584. Throughput: 0: 42433.2. Samples: 1117160680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 21:27:34,182][15401] Updated weights for policy 0, policy_version 68180 (0.0028) [2024-06-21 21:27:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.7, 300 sec: 42264.8). Total num frames: 1117208576. Throughput: 0: 42319.1. Samples: 1117292360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:38,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 21:27:38,501][15401] Updated weights for policy 0, policy_version 68190 (0.0028) [2024-06-21 21:27:42,174][15401] Updated weights for policy 0, policy_version 68200 (0.0038) [2024-06-21 21:27:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1117405184. Throughput: 0: 42372.4. Samples: 1117548460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:43,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-21 21:27:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068201_1117405184.pth... [2024-06-21 21:27:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067582_1107263488.pth [2024-06-21 21:27:46,048][15401] Updated weights for policy 0, policy_version 68210 (0.0030) [2024-06-21 21:27:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1117634560. Throughput: 0: 42488.5. Samples: 1117797420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-21 21:27:50,234][15401] Updated weights for policy 0, policy_version 68220 (0.0032) [2024-06-21 21:27:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1117831168. Throughput: 0: 42386.2. Samples: 1117930560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 21:27:54,049][15401] Updated weights for policy 0, policy_version 68230 (0.0035) [2024-06-21 21:27:57,878][15401] Updated weights for policy 0, policy_version 68240 (0.0028) [2024-06-21 21:27:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1118044160. Throughput: 0: 42424.9. Samples: 1118179500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:27:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 21:28:01,653][15401] Updated weights for policy 0, policy_version 68250 (0.0039) [2024-06-21 21:28:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 1118289920. Throughput: 0: 42344.4. Samples: 1118430140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:28:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 21:28:05,292][15401] Updated weights for policy 0, policy_version 68260 (0.0037) [2024-06-21 21:28:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1118486528. Throughput: 0: 42486.6. Samples: 1118566980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:28:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 21:28:09,561][15401] Updated weights for policy 0, policy_version 68270 (0.0043) [2024-06-21 21:28:12,773][15401] Updated weights for policy 0, policy_version 68280 (0.0055) [2024-06-21 21:28:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42321.6). Total num frames: 1118699520. Throughput: 0: 42551.6. Samples: 1118818320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:28:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 21:28:17,123][15401] Updated weights for policy 0, policy_version 68290 (0.0042) [2024-06-21 21:28:18,272][15349] Signal inference workers to stop experience collection... (16350 times) [2024-06-21 21:28:18,273][15349] Signal inference workers to resume experience collection... (16350 times) [2024-06-21 21:28:18,317][15401] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-06-21 21:28:18,317][15401] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-06-21 21:28:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42376.3). Total num frames: 1118912512. Throughput: 0: 42709.9. Samples: 1119082620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 21:28:20,774][15401] Updated weights for policy 0, policy_version 68300 (0.0030) [2024-06-21 21:28:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1119125504. Throughput: 0: 42587.1. Samples: 1119208680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 21:28:24,820][15401] Updated weights for policy 0, policy_version 68310 (0.0032) [2024-06-21 21:28:28,282][15401] Updated weights for policy 0, policy_version 68320 (0.0039) [2024-06-21 21:28:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1119354880. Throughput: 0: 42461.0. Samples: 1119459200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 21:28:32,782][15401] Updated weights for policy 0, policy_version 68330 (0.0030) [2024-06-21 21:28:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1119535104. Throughput: 0: 42715.0. Samples: 1119719600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 21:28:36,127][15401] Updated weights for policy 0, policy_version 68340 (0.0052) [2024-06-21 21:28:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42327.0, 300 sec: 42376.2). Total num frames: 1119748096. Throughput: 0: 42476.8. Samples: 1119842020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 21:28:40,240][15401] Updated weights for policy 0, policy_version 68350 (0.0037) [2024-06-21 21:28:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1119977472. Throughput: 0: 42612.8. Samples: 1120097080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 21:28:43,845][15401] Updated weights for policy 0, policy_version 68360 (0.0039) [2024-06-21 21:28:47,823][15401] Updated weights for policy 0, policy_version 68370 (0.0034) [2024-06-21 21:28:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1120190464. Throughput: 0: 42877.9. Samples: 1120359640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 21:28:51,283][15401] Updated weights for policy 0, policy_version 68380 (0.0032) [2024-06-21 21:28:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1120403456. Throughput: 0: 42689.9. Samples: 1120488020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-21 21:28:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 21:28:55,395][15401] Updated weights for policy 0, policy_version 68390 (0.0026) [2024-06-21 21:28:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1120632832. Throughput: 0: 42842.9. Samples: 1120746260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:28:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 21:28:58,733][15401] Updated weights for policy 0, policy_version 68400 (0.0029) [2024-06-21 21:29:03,007][15401] Updated weights for policy 0, policy_version 68410 (0.0032) [2024-06-21 21:29:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1120845824. Throughput: 0: 42650.6. Samples: 1121001900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 21:29:06,307][15401] Updated weights for policy 0, policy_version 68420 (0.0033) [2024-06-21 21:29:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1121042432. Throughput: 0: 42667.7. Samples: 1121128720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 21:29:10,679][15401] Updated weights for policy 0, policy_version 68430 (0.0033) [2024-06-21 21:29:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1121271808. Throughput: 0: 42923.0. Samples: 1121390740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 21:29:13,816][15401] Updated weights for policy 0, policy_version 68440 (0.0052) [2024-06-21 21:29:18,256][15401] Updated weights for policy 0, policy_version 68450 (0.0032) [2024-06-21 21:29:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1121484800. Throughput: 0: 42926.6. Samples: 1121651300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 21:29:21,882][15401] Updated weights for policy 0, policy_version 68460 (0.0032) [2024-06-21 21:29:22,885][15349] Signal inference workers to stop experience collection... (16400 times) [2024-06-21 21:29:22,892][15349] Signal inference workers to resume experience collection... (16400 times) [2024-06-21 21:29:22,900][15401] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-06-21 21:29:22,923][15401] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-06-21 21:29:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1121714176. Throughput: 0: 42970.8. Samples: 1121775700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 21:29:25,697][15401] Updated weights for policy 0, policy_version 68470 (0.0038) [2024-06-21 21:29:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1121910784. Throughput: 0: 43252.6. Samples: 1122043440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 21:29:29,704][15401] Updated weights for policy 0, policy_version 68480 (0.0027) [2024-06-21 21:29:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1122123776. Throughput: 0: 42892.4. Samples: 1122289800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 21:29:33,567][15401] Updated weights for policy 0, policy_version 68490 (0.0032) [2024-06-21 21:29:37,242][15401] Updated weights for policy 0, policy_version 68500 (0.0026) [2024-06-21 21:29:38,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 1122353152. Throughput: 0: 42948.4. Samples: 1122420700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 21:29:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 21:29:41,155][15401] Updated weights for policy 0, policy_version 68510 (0.0027) [2024-06-21 21:29:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1122516992. Throughput: 0: 42820.6. Samples: 1122673180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:29:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 21:29:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068514_1122533376.pth... [2024-06-21 21:29:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000067892_1112342528.pth [2024-06-21 21:29:44,849][15401] Updated weights for policy 0, policy_version 68520 (0.0032) [2024-06-21 21:29:48,390][15132] Fps is (10 sec: 42594.4, 60 sec: 43143.9, 300 sec: 42653.8). Total num frames: 1122779136. Throughput: 0: 42798.6. Samples: 1122927880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:29:48,391][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 21:29:48,682][15401] Updated weights for policy 0, policy_version 68530 (0.0027) [2024-06-21 21:29:52,686][15401] Updated weights for policy 0, policy_version 68540 (0.0028) [2024-06-21 21:29:53,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42871.2, 300 sec: 42542.8). Total num frames: 1122975744. Throughput: 0: 43078.8. Samples: 1123067280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:29:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 21:29:56,182][15401] Updated weights for policy 0, policy_version 68550 (0.0031) [2024-06-21 21:29:58,390][15132] Fps is (10 sec: 39325.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1123172352. Throughput: 0: 42828.9. Samples: 1123318040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:29:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 21:30:00,364][15401] Updated weights for policy 0, policy_version 68560 (0.0028) [2024-06-21 21:30:03,392][15132] Fps is (10 sec: 44227.2, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 1123418112. Throughput: 0: 42689.8. Samples: 1123572440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:30:03,393][15132] Avg episode reward: [(0, '0.862')] [2024-06-21 21:30:03,866][15401] Updated weights for policy 0, policy_version 68570 (0.0030) [2024-06-21 21:30:07,875][15401] Updated weights for policy 0, policy_version 68580 (0.0025) [2024-06-21 21:30:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 1123614720. Throughput: 0: 43032.4. Samples: 1123712160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:30:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 21:30:11,292][15401] Updated weights for policy 0, policy_version 68590 (0.0034) [2024-06-21 21:30:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1123827712. Throughput: 0: 42623.0. Samples: 1123961480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:30:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 21:30:15,734][15401] Updated weights for policy 0, policy_version 68600 (0.0030) [2024-06-21 21:30:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1124073472. Throughput: 0: 42844.6. Samples: 1124217800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:30:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 21:30:18,898][15401] Updated weights for policy 0, policy_version 68610 (0.0038) [2024-06-21 21:30:23,386][15401] Updated weights for policy 0, policy_version 68620 (0.0030) [2024-06-21 21:30:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1124270080. Throughput: 0: 42963.5. Samples: 1124354060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 21:30:26,477][15401] Updated weights for policy 0, policy_version 68630 (0.0027) [2024-06-21 21:30:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1124483072. Throughput: 0: 42904.0. Samples: 1124603860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 21:30:31,379][15401] Updated weights for policy 0, policy_version 68640 (0.0035) [2024-06-21 21:30:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1124696064. Throughput: 0: 42945.9. Samples: 1124860400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 21:30:33,980][15401] Updated weights for policy 0, policy_version 68650 (0.0029) [2024-06-21 21:30:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1124892672. Throughput: 0: 42747.3. Samples: 1124990900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 21:30:38,724][15349] Signal inference workers to stop experience collection... (16450 times) [2024-06-21 21:30:38,725][15349] Signal inference workers to resume experience collection... (16450 times) [2024-06-21 21:30:38,744][15401] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-06-21 21:30:38,745][15401] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-06-21 21:30:38,869][15401] Updated weights for policy 0, policy_version 68660 (0.0043) [2024-06-21 21:30:41,766][15401] Updated weights for policy 0, policy_version 68670 (0.0027) [2024-06-21 21:30:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1125122048. Throughput: 0: 42735.1. Samples: 1125241120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 21:30:46,439][15401] Updated weights for policy 0, policy_version 68680 (0.0031) [2024-06-21 21:30:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.1, 300 sec: 42653.9). Total num frames: 1125335040. Throughput: 0: 42831.6. Samples: 1125499760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:48,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-21 21:30:49,471][15401] Updated weights for policy 0, policy_version 68690 (0.0026) [2024-06-21 21:30:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1125531648. Throughput: 0: 42656.7. Samples: 1125631720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 21:30:54,040][15401] Updated weights for policy 0, policy_version 68700 (0.0032) [2024-06-21 21:30:57,267][15401] Updated weights for policy 0, policy_version 68710 (0.0036) [2024-06-21 21:30:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1125777408. Throughput: 0: 42700.1. Samples: 1125882980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:30:58,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 21:31:01,593][15401] Updated weights for policy 0, policy_version 68720 (0.0032) [2024-06-21 21:31:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1125990400. Throughput: 0: 42723.1. Samples: 1126140340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 21:31:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:31:05,046][15401] Updated weights for policy 0, policy_version 68730 (0.0034) [2024-06-21 21:31:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1126170624. Throughput: 0: 42592.5. Samples: 1126270720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 21:31:09,342][15401] Updated weights for policy 0, policy_version 68740 (0.0031) [2024-06-21 21:31:12,630][15401] Updated weights for policy 0, policy_version 68750 (0.0035) [2024-06-21 21:31:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1126416384. Throughput: 0: 42806.6. Samples: 1126530160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 21:31:16,786][15401] Updated weights for policy 0, policy_version 68760 (0.0038) [2024-06-21 21:31:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1126629376. Throughput: 0: 42664.4. Samples: 1126780300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 21:31:20,660][15401] Updated weights for policy 0, policy_version 68770 (0.0030) [2024-06-21 21:31:23,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 1126825984. Throughput: 0: 42701.0. Samples: 1126912720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:23,397][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 21:31:24,210][15401] Updated weights for policy 0, policy_version 68780 (0.0038) [2024-06-21 21:31:28,192][15401] Updated weights for policy 0, policy_version 68790 (0.0027) [2024-06-21 21:31:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1127055360. Throughput: 0: 42996.1. Samples: 1127175940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 21:31:31,888][15401] Updated weights for policy 0, policy_version 68800 (0.0032) [2024-06-21 21:31:33,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1127268352. Throughput: 0: 42729.6. Samples: 1127422600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 21:31:35,756][15401] Updated weights for policy 0, policy_version 68810 (0.0033) [2024-06-21 21:31:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1127481344. Throughput: 0: 42749.1. Samples: 1127555420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 21:31:39,935][15401] Updated weights for policy 0, policy_version 68820 (0.0034) [2024-06-21 21:31:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1127694336. Throughput: 0: 42803.0. Samples: 1127809120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:31:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 21:31:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068830_1127710720.pth... [2024-06-21 21:31:43,484][15401] Updated weights for policy 0, policy_version 68830 (0.0041) [2024-06-21 21:31:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068201_1117405184.pth [2024-06-21 21:31:47,450][15401] Updated weights for policy 0, policy_version 68840 (0.0028) [2024-06-21 21:31:48,395][15132] Fps is (10 sec: 40938.7, 60 sec: 42594.7, 300 sec: 42597.7). Total num frames: 1127890944. Throughput: 0: 42740.8. Samples: 1128063900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:31:48,395][15132] Avg episode reward: [(0, '0.292')] [2024-06-21 21:31:48,415][15349] Signal inference workers to stop experience collection... (16500 times) [2024-06-21 21:31:48,415][15349] Signal inference workers to resume experience collection... (16500 times) [2024-06-21 21:31:48,463][15401] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-06-21 21:31:48,463][15401] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-06-21 21:31:51,670][15401] Updated weights for policy 0, policy_version 68850 (0.0040) [2024-06-21 21:31:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1128120320. Throughput: 0: 42717.3. Samples: 1128193000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:31:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 21:31:55,093][15401] Updated weights for policy 0, policy_version 68860 (0.0037) [2024-06-21 21:31:58,390][15132] Fps is (10 sec: 44259.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1128333312. Throughput: 0: 42576.4. Samples: 1128446100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:31:58,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 21:31:59,550][15401] Updated weights for policy 0, policy_version 68870 (0.0033) [2024-06-21 21:32:02,996][15401] Updated weights for policy 0, policy_version 68880 (0.0039) [2024-06-21 21:32:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1128529920. Throughput: 0: 42589.3. Samples: 1128696820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 21:32:07,333][15401] Updated weights for policy 0, policy_version 68890 (0.0037) [2024-06-21 21:32:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1128726528. Throughput: 0: 42431.5. Samples: 1128821860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 21:32:10,622][15401] Updated weights for policy 0, policy_version 68900 (0.0026) [2024-06-21 21:32:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 1128955904. Throughput: 0: 42162.6. Samples: 1129073260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 21:32:15,263][15401] Updated weights for policy 0, policy_version 68910 (0.0027) [2024-06-21 21:32:18,239][15401] Updated weights for policy 0, policy_version 68920 (0.0034) [2024-06-21 21:32:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1129185280. Throughput: 0: 42342.7. Samples: 1129328020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 21:32:23,011][15401] Updated weights for policy 0, policy_version 68930 (0.0030) [2024-06-21 21:32:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42328.2, 300 sec: 42653.6). Total num frames: 1129365504. Throughput: 0: 42312.8. Samples: 1129459600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:23,393][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 21:32:25,711][15401] Updated weights for policy 0, policy_version 68940 (0.0034) [2024-06-21 21:32:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1129578496. Throughput: 0: 42214.2. Samples: 1129708760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 21:32:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 21:32:30,826][15401] Updated weights for policy 0, policy_version 68950 (0.0043) [2024-06-21 21:32:33,355][15401] Updated weights for policy 0, policy_version 68960 (0.0030) [2024-06-21 21:32:33,391][15132] Fps is (10 sec: 47520.0, 60 sec: 42870.8, 300 sec: 42820.7). Total num frames: 1129840640. Throughput: 0: 42234.5. Samples: 1129964280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:33,391][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 21:32:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 1129988096. Throughput: 0: 42265.7. Samples: 1130094960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 21:32:38,487][15401] Updated weights for policy 0, policy_version 68970 (0.0043) [2024-06-21 21:32:41,215][15401] Updated weights for policy 0, policy_version 68980 (0.0033) [2024-06-21 21:32:43,389][15132] Fps is (10 sec: 37687.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1130217472. Throughput: 0: 42248.1. Samples: 1130347260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 21:32:46,230][15401] Updated weights for policy 0, policy_version 68990 (0.0039) [2024-06-21 21:32:48,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42875.1, 300 sec: 42820.6). Total num frames: 1130463232. Throughput: 0: 42465.8. Samples: 1130607780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 21:32:48,973][15401] Updated weights for policy 0, policy_version 69000 (0.0046) [2024-06-21 21:32:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 1130627072. Throughput: 0: 42462.2. Samples: 1130732660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-21 21:32:53,914][15401] Updated weights for policy 0, policy_version 69010 (0.0038) [2024-06-21 21:32:56,808][15401] Updated weights for policy 0, policy_version 69020 (0.0029) [2024-06-21 21:32:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1130872832. Throughput: 0: 42386.3. Samples: 1130980640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:32:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-21 21:33:01,609][15401] Updated weights for policy 0, policy_version 69030 (0.0027) [2024-06-21 21:33:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1131085824. Throughput: 0: 42506.7. Samples: 1131240820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:33:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 21:33:04,771][15401] Updated weights for policy 0, policy_version 69040 (0.0041) [2024-06-21 21:33:08,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 1131266048. Throughput: 0: 42340.5. Samples: 1131364920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-21 21:33:08,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 21:33:09,400][15401] Updated weights for policy 0, policy_version 69050 (0.0036) [2024-06-21 21:33:12,401][15401] Updated weights for policy 0, policy_version 69060 (0.0044) [2024-06-21 21:33:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1131528192. Throughput: 0: 42469.3. Samples: 1131619880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 21:33:16,963][15401] Updated weights for policy 0, policy_version 69070 (0.0028) [2024-06-21 21:33:18,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1131708416. Throughput: 0: 42557.5. Samples: 1131879320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 21:33:20,014][15401] Updated weights for policy 0, policy_version 69080 (0.0029) [2024-06-21 21:33:21,097][15349] Signal inference workers to stop experience collection... (16550 times) [2024-06-21 21:33:21,097][15349] Signal inference workers to resume experience collection... (16550 times) [2024-06-21 21:33:21,124][15401] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-06-21 21:33:21,124][15401] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-06-21 21:33:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 1131905024. Throughput: 0: 42448.5. Samples: 1132005140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:23,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-21 21:33:24,446][15401] Updated weights for policy 0, policy_version 69090 (0.0035) [2024-06-21 21:33:27,600][15401] Updated weights for policy 0, policy_version 69100 (0.0034) [2024-06-21 21:33:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1132150784. Throughput: 0: 42473.7. Samples: 1132258580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 21:33:32,029][15401] Updated weights for policy 0, policy_version 69110 (0.0042) [2024-06-21 21:33:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 41779.9, 300 sec: 42709.5). Total num frames: 1132347392. Throughput: 0: 42374.1. Samples: 1132514620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 21:33:35,267][15401] Updated weights for policy 0, policy_version 69120 (0.0037) [2024-06-21 21:33:38,396][15132] Fps is (10 sec: 39296.0, 60 sec: 42593.7, 300 sec: 42597.5). Total num frames: 1132544000. Throughput: 0: 42329.0. Samples: 1132637740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:38,397][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 21:33:39,833][15401] Updated weights for policy 0, policy_version 69130 (0.0041) [2024-06-21 21:33:43,306][15401] Updated weights for policy 0, policy_version 69140 (0.0045) [2024-06-21 21:33:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1132789760. Throughput: 0: 42609.8. Samples: 1132898080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 21:33:43,568][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069141_1132806144.pth... [2024-06-21 21:33:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068514_1122533376.pth [2024-06-21 21:33:47,502][15401] Updated weights for policy 0, policy_version 69150 (0.0030) [2024-06-21 21:33:48,390][15132] Fps is (10 sec: 44265.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1132986368. Throughput: 0: 42395.6. Samples: 1133148620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-21 21:33:48,391][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 21:33:50,947][15401] Updated weights for policy 0, policy_version 69160 (0.0036) [2024-06-21 21:33:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1133199360. Throughput: 0: 42380.5. Samples: 1133271940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:33:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 21:33:55,131][15401] Updated weights for policy 0, policy_version 69170 (0.0025) [2024-06-21 21:33:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1133412352. Throughput: 0: 42582.3. Samples: 1133536080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:33:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 21:33:58,624][15401] Updated weights for policy 0, policy_version 69180 (0.0032) [2024-06-21 21:34:02,822][15401] Updated weights for policy 0, policy_version 69190 (0.0027) [2024-06-21 21:34:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1133625344. Throughput: 0: 42281.7. Samples: 1133782000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 21:34:06,479][15401] Updated weights for policy 0, policy_version 69200 (0.0036) [2024-06-21 21:34:08,390][15132] Fps is (10 sec: 39319.6, 60 sec: 42326.7, 300 sec: 42487.3). Total num frames: 1133805568. Throughput: 0: 42330.6. Samples: 1133910040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 21:34:10,666][15401] Updated weights for policy 0, policy_version 69210 (0.0025) [2024-06-21 21:34:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1134034944. Throughput: 0: 42316.9. Samples: 1134162840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 21:34:14,483][15401] Updated weights for policy 0, policy_version 69220 (0.0035) [2024-06-21 21:34:18,245][15401] Updated weights for policy 0, policy_version 69230 (0.0044) [2024-06-21 21:34:18,389][15132] Fps is (10 sec: 45877.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1134264320. Throughput: 0: 42231.3. Samples: 1134415020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 21:34:22,109][15401] Updated weights for policy 0, policy_version 69240 (0.0033) [2024-06-21 21:34:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1134460928. Throughput: 0: 42351.1. Samples: 1134543260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 21:34:25,941][15401] Updated weights for policy 0, policy_version 69250 (0.0039) [2024-06-21 21:34:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1134673920. Throughput: 0: 42299.1. Samples: 1134801540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 21:34:30,187][15401] Updated weights for policy 0, policy_version 69260 (0.0027) [2024-06-21 21:34:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 1134886912. Throughput: 0: 42465.0. Samples: 1135059540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:34:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 21:34:33,775][15401] Updated weights for policy 0, policy_version 69270 (0.0042) [2024-06-21 21:34:33,787][15349] Signal inference workers to stop experience collection... (16600 times) [2024-06-21 21:34:33,787][15349] Signal inference workers to resume experience collection... (16600 times) [2024-06-21 21:34:33,800][15401] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-06-21 21:34:33,800][15401] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-06-21 21:34:37,688][15401] Updated weights for policy 0, policy_version 69280 (0.0032) [2024-06-21 21:34:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42876.2, 300 sec: 42709.5). Total num frames: 1135116288. Throughput: 0: 42553.4. Samples: 1135186840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:34:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 21:34:41,200][15401] Updated weights for policy 0, policy_version 69290 (0.0050) [2024-06-21 21:34:43,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42325.2, 300 sec: 42543.0). Total num frames: 1135329280. Throughput: 0: 42397.1. Samples: 1135443960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:34:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 21:34:45,225][15401] Updated weights for policy 0, policy_version 69300 (0.0040) [2024-06-21 21:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1135542272. Throughput: 0: 42540.9. Samples: 1135696340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:34:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 21:34:48,810][15401] Updated weights for policy 0, policy_version 69310 (0.0039) [2024-06-21 21:34:53,268][15401] Updated weights for policy 0, policy_version 69320 (0.0037) [2024-06-21 21:34:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1135738880. Throughput: 0: 42515.9. Samples: 1135823240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:34:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-21 21:34:56,349][15401] Updated weights for policy 0, policy_version 69330 (0.0029) [2024-06-21 21:34:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1135968256. Throughput: 0: 42635.6. Samples: 1136081440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:34:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 21:35:00,914][15401] Updated weights for policy 0, policy_version 69340 (0.0036) [2024-06-21 21:35:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1136164864. Throughput: 0: 42825.7. Samples: 1136342180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:35:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 21:35:04,015][15401] Updated weights for policy 0, policy_version 69350 (0.0027) [2024-06-21 21:35:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.7, 300 sec: 42542.8). Total num frames: 1136377856. Throughput: 0: 42647.9. Samples: 1136462420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:35:08,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 21:35:08,735][15401] Updated weights for policy 0, policy_version 69360 (0.0039) [2024-06-21 21:35:12,013][15401] Updated weights for policy 0, policy_version 69370 (0.0042) [2024-06-21 21:35:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1136607232. Throughput: 0: 42624.4. Samples: 1136719640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 21:35:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 21:35:16,239][15401] Updated weights for policy 0, policy_version 69380 (0.0030) [2024-06-21 21:35:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1136787456. Throughput: 0: 42618.2. Samples: 1136977360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:35:19,751][15401] Updated weights for policy 0, policy_version 69390 (0.0029) [2024-06-21 21:35:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1137016832. Throughput: 0: 42435.9. Samples: 1137096460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:35:24,237][15401] Updated weights for policy 0, policy_version 69400 (0.0046) [2024-06-21 21:35:27,432][15401] Updated weights for policy 0, policy_version 69410 (0.0032) [2024-06-21 21:35:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1137229824. Throughput: 0: 42390.8. Samples: 1137351540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 21:35:31,891][15401] Updated weights for policy 0, policy_version 69420 (0.0043) [2024-06-21 21:35:33,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 1137426432. Throughput: 0: 42575.1. Samples: 1137612320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:33,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 21:35:35,296][15401] Updated weights for policy 0, policy_version 69430 (0.0030) [2024-06-21 21:35:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1137672192. Throughput: 0: 42367.7. Samples: 1137729780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 21:35:39,493][15401] Updated weights for policy 0, policy_version 69440 (0.0036) [2024-06-21 21:35:41,200][15349] Signal inference workers to stop experience collection... (16650 times) [2024-06-21 21:35:41,200][15349] Signal inference workers to resume experience collection... (16650 times) [2024-06-21 21:35:41,242][15401] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-06-21 21:35:41,243][15401] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-06-21 21:35:43,041][15401] Updated weights for policy 0, policy_version 69450 (0.0048) [2024-06-21 21:35:43,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1137885184. Throughput: 0: 42427.1. Samples: 1137990660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:43,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 21:35:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069451_1137885184.pth... [2024-06-21 21:35:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000068830_1127710720.pth [2024-06-21 21:35:47,108][15401] Updated weights for policy 0, policy_version 69460 (0.0032) [2024-06-21 21:35:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1138065408. Throughput: 0: 42400.5. Samples: 1138250200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 21:35:50,651][15401] Updated weights for policy 0, policy_version 69470 (0.0030) [2024-06-21 21:35:53,390][15132] Fps is (10 sec: 40956.7, 60 sec: 42597.9, 300 sec: 42431.7). Total num frames: 1138294784. Throughput: 0: 42405.1. Samples: 1138370680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:53,391][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 21:35:55,014][15401] Updated weights for policy 0, policy_version 69480 (0.0032) [2024-06-21 21:35:58,281][15401] Updated weights for policy 0, policy_version 69490 (0.0033) [2024-06-21 21:35:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1138524160. Throughput: 0: 42380.9. Samples: 1138626780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-21 21:35:58,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 21:36:02,500][15401] Updated weights for policy 0, policy_version 69500 (0.0050) [2024-06-21 21:36:03,389][15132] Fps is (10 sec: 40963.8, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 1138704384. Throughput: 0: 42420.4. Samples: 1138886280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-21 21:36:06,204][15401] Updated weights for policy 0, policy_version 69510 (0.0041) [2024-06-21 21:36:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1138933760. Throughput: 0: 42517.5. Samples: 1139009740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 21:36:10,637][15401] Updated weights for policy 0, policy_version 69520 (0.0028) [2024-06-21 21:36:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1139146752. Throughput: 0: 42617.7. Samples: 1139269340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:13,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 21:36:13,699][15401] Updated weights for policy 0, policy_version 69530 (0.0033) [2024-06-21 21:36:18,262][15401] Updated weights for policy 0, policy_version 69540 (0.0040) [2024-06-21 21:36:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42432.7). Total num frames: 1139343360. Throughput: 0: 42506.2. Samples: 1139525000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 21:36:21,301][15401] Updated weights for policy 0, policy_version 69550 (0.0037) [2024-06-21 21:36:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1139572736. Throughput: 0: 42551.6. Samples: 1139644600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 21:36:25,790][15401] Updated weights for policy 0, policy_version 69560 (0.0036) [2024-06-21 21:36:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1139785728. Throughput: 0: 42604.0. Samples: 1139907840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 21:36:28,993][15401] Updated weights for policy 0, policy_version 69570 (0.0031) [2024-06-21 21:36:33,239][15401] Updated weights for policy 0, policy_version 69580 (0.0040) [2024-06-21 21:36:33,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42871.5, 300 sec: 42431.4). Total num frames: 1139998720. Throughput: 0: 42443.9. Samples: 1140160280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:33,393][15132] Avg episode reward: [(0, '0.726')] [2024-06-21 21:36:37,079][15401] Updated weights for policy 0, policy_version 69590 (0.0037) [2024-06-21 21:36:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1140228096. Throughput: 0: 42605.7. Samples: 1140287900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 21:36:40,928][15401] Updated weights for policy 0, policy_version 69600 (0.0040) [2024-06-21 21:36:43,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42052.3, 300 sec: 42432.5). Total num frames: 1140408320. Throughput: 0: 42715.3. Samples: 1140548960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:36:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 21:36:44,516][15401] Updated weights for policy 0, policy_version 69610 (0.0035) [2024-06-21 21:36:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1140637696. Throughput: 0: 42482.6. Samples: 1140798000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:36:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-21 21:36:48,569][15401] Updated weights for policy 0, policy_version 69620 (0.0035) [2024-06-21 21:36:52,244][15401] Updated weights for policy 0, policy_version 69630 (0.0031) [2024-06-21 21:36:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42599.0, 300 sec: 42431.8). Total num frames: 1140850688. Throughput: 0: 42639.1. Samples: 1140928500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:36:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 21:36:56,414][15401] Updated weights for policy 0, policy_version 69640 (0.0032) [2024-06-21 21:36:58,391][15132] Fps is (10 sec: 39316.9, 60 sec: 41778.4, 300 sec: 42376.1). Total num frames: 1141030912. Throughput: 0: 42375.0. Samples: 1141176260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:36:58,391][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 21:37:00,244][15401] Updated weights for policy 0, policy_version 69650 (0.0045) [2024-06-21 21:37:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1141276672. Throughput: 0: 42241.0. Samples: 1141425840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:37:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 21:37:03,961][15401] Updated weights for policy 0, policy_version 69660 (0.0041) [2024-06-21 21:37:07,728][15401] Updated weights for policy 0, policy_version 69670 (0.0039) [2024-06-21 21:37:08,390][15132] Fps is (10 sec: 45880.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1141489664. Throughput: 0: 42674.5. Samples: 1141564960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:37:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 21:37:09,343][15349] Signal inference workers to stop experience collection... (16700 times) [2024-06-21 21:37:09,368][15401] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-06-21 21:37:09,458][15349] Signal inference workers to resume experience collection... (16700 times) [2024-06-21 21:37:09,458][15401] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-06-21 21:37:11,556][15401] Updated weights for policy 0, policy_version 69680 (0.0032) [2024-06-21 21:37:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.5, 300 sec: 42320.7). Total num frames: 1141669888. Throughput: 0: 42279.8. Samples: 1141810420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:37:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 21:37:15,774][15401] Updated weights for policy 0, policy_version 69690 (0.0032) [2024-06-21 21:37:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42487.3). Total num frames: 1141899264. Throughput: 0: 42332.9. Samples: 1142065260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:37:18,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 21:37:19,371][15401] Updated weights for policy 0, policy_version 69700 (0.0034) [2024-06-21 21:37:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1142112256. Throughput: 0: 42481.4. Samples: 1142199560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-21 21:37:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-21 21:37:23,438][15401] Updated weights for policy 0, policy_version 69710 (0.0045) [2024-06-21 21:37:27,007][15401] Updated weights for policy 0, policy_version 69720 (0.0027) [2024-06-21 21:37:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42052.4, 300 sec: 42265.3). Total num frames: 1142308864. Throughput: 0: 42162.2. Samples: 1142446260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 21:37:31,202][15401] Updated weights for policy 0, policy_version 69730 (0.0036) [2024-06-21 21:37:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 1142538240. Throughput: 0: 42249.9. Samples: 1142699240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 21:37:34,919][15401] Updated weights for policy 0, policy_version 69740 (0.0033) [2024-06-21 21:37:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1142751232. Throughput: 0: 42432.8. Samples: 1142837980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 21:37:38,725][15401] Updated weights for policy 0, policy_version 69750 (0.0033) [2024-06-21 21:37:42,675][15401] Updated weights for policy 0, policy_version 69760 (0.0040) [2024-06-21 21:37:43,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42325.1, 300 sec: 42320.7). Total num frames: 1142947840. Throughput: 0: 42334.2. Samples: 1143081260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:43,391][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 21:37:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069760_1142947840.pth... [2024-06-21 21:37:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069141_1132806144.pth [2024-06-21 21:37:46,387][15401] Updated weights for policy 0, policy_version 69770 (0.0044) [2024-06-21 21:37:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 1143160832. Throughput: 0: 42508.1. Samples: 1143338700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 21:37:50,450][15401] Updated weights for policy 0, policy_version 69780 (0.0035) [2024-06-21 21:37:53,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1143373824. Throughput: 0: 42224.4. Samples: 1143465060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 21:37:53,974][15401] Updated weights for policy 0, policy_version 69790 (0.0038) [2024-06-21 21:37:58,329][15401] Updated weights for policy 0, policy_version 69800 (0.0036) [2024-06-21 21:37:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42872.3, 300 sec: 42431.8). Total num frames: 1143603200. Throughput: 0: 42410.1. Samples: 1143718880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:37:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 21:38:01,756][15401] Updated weights for policy 0, policy_version 69810 (0.0036) [2024-06-21 21:38:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42432.1). Total num frames: 1143783424. Throughput: 0: 42460.5. Samples: 1143975880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:38:03,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-21 21:38:06,165][15401] Updated weights for policy 0, policy_version 69820 (0.0043) [2024-06-21 21:38:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1144012800. Throughput: 0: 42293.7. Samples: 1144102780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-21 21:38:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-21 21:38:09,366][15401] Updated weights for policy 0, policy_version 69830 (0.0032) [2024-06-21 21:38:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1144225792. Throughput: 0: 42405.3. Samples: 1144354500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:13,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-21 21:38:13,667][15401] Updated weights for policy 0, policy_version 69840 (0.0041) [2024-06-21 21:38:17,104][15401] Updated weights for policy 0, policy_version 69850 (0.0027) [2024-06-21 21:38:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 1144438784. Throughput: 0: 42507.8. Samples: 1144612100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 21:38:21,464][15401] Updated weights for policy 0, policy_version 69860 (0.0037) [2024-06-21 21:38:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1144651776. Throughput: 0: 42221.8. Samples: 1144737960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:23,392][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 21:38:24,906][15401] Updated weights for policy 0, policy_version 69870 (0.0036) [2024-06-21 21:38:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1144881152. Throughput: 0: 42511.7. Samples: 1144994280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 21:38:28,984][15401] Updated weights for policy 0, policy_version 69880 (0.0027) [2024-06-21 21:38:31,878][15349] Signal inference workers to stop experience collection... (16750 times) [2024-06-21 21:38:31,905][15401] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-06-21 21:38:31,945][15349] Signal inference workers to resume experience collection... (16750 times) [2024-06-21 21:38:31,946][15401] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-06-21 21:38:32,587][15401] Updated weights for policy 0, policy_version 69890 (0.0037) [2024-06-21 21:38:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42543.5). Total num frames: 1145094144. Throughput: 0: 42320.7. Samples: 1145243240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:33,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 21:38:36,695][15401] Updated weights for policy 0, policy_version 69900 (0.0044) [2024-06-21 21:38:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 1145290752. Throughput: 0: 42449.4. Samples: 1145375380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:38,392][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 21:38:40,290][15401] Updated weights for policy 0, policy_version 69910 (0.0035) [2024-06-21 21:38:43,396][15132] Fps is (10 sec: 40943.5, 60 sec: 42594.0, 300 sec: 42430.9). Total num frames: 1145503744. Throughput: 0: 42519.3. Samples: 1145632520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:43,397][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 21:38:44,357][15401] Updated weights for policy 0, policy_version 69920 (0.0037) [2024-06-21 21:38:48,087][15401] Updated weights for policy 0, policy_version 69930 (0.0027) [2024-06-21 21:38:48,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 1145749504. Throughput: 0: 42288.4. Samples: 1145878860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-21 21:38:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 21:38:52,302][15401] Updated weights for policy 0, policy_version 69940 (0.0041) [2024-06-21 21:38:53,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1145929728. Throughput: 0: 42486.2. Samples: 1146014660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:38:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 21:38:55,876][15401] Updated weights for policy 0, policy_version 69950 (0.0035) [2024-06-21 21:38:58,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1146126336. Throughput: 0: 42457.2. Samples: 1146265080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:38:58,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-21 21:39:00,033][15401] Updated weights for policy 0, policy_version 69960 (0.0028) [2024-06-21 21:39:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1146355712. Throughput: 0: 42221.9. Samples: 1146512080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 21:39:03,562][15401] Updated weights for policy 0, policy_version 69970 (0.0029) [2024-06-21 21:39:07,932][15401] Updated weights for policy 0, policy_version 69980 (0.0056) [2024-06-21 21:39:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1146568704. Throughput: 0: 42315.5. Samples: 1146642160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:39:11,175][15401] Updated weights for policy 0, policy_version 69990 (0.0035) [2024-06-21 21:39:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1146781696. Throughput: 0: 42365.4. Samples: 1146900720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 21:39:15,807][15401] Updated weights for policy 0, policy_version 70000 (0.0032) [2024-06-21 21:39:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1146994688. Throughput: 0: 42414.4. Samples: 1147151780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 21:39:18,983][15401] Updated weights for policy 0, policy_version 70010 (0.0037) [2024-06-21 21:39:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1147191296. Throughput: 0: 42282.7. Samples: 1147278000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 21:39:23,695][15401] Updated weights for policy 0, policy_version 70020 (0.0034) [2024-06-21 21:39:26,751][15401] Updated weights for policy 0, policy_version 70030 (0.0039) [2024-06-21 21:39:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1147404288. Throughput: 0: 42231.9. Samples: 1147532680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 21:39:31,476][15401] Updated weights for policy 0, policy_version 70040 (0.0033) [2024-06-21 21:39:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 1147633664. Throughput: 0: 42382.6. Samples: 1147786080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-21 21:39:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 21:39:34,507][15401] Updated weights for policy 0, policy_version 70050 (0.0034) [2024-06-21 21:39:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 1147813888. Throughput: 0: 42283.9. Samples: 1147917440. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:39:38,391][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 21:39:39,250][15401] Updated weights for policy 0, policy_version 70060 (0.0034) [2024-06-21 21:39:42,200][15401] Updated weights for policy 0, policy_version 70070 (0.0031) [2024-06-21 21:39:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42329.9, 300 sec: 42376.3). Total num frames: 1148043264. Throughput: 0: 42184.5. Samples: 1148163380. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:39:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-21 21:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070071_1148043264.pth... [2024-06-21 21:39:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069451_1137885184.pth [2024-06-21 21:39:46,866][15401] Updated weights for policy 0, policy_version 70080 (0.0026) [2024-06-21 21:39:48,389][15132] Fps is (10 sec: 45876.5, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 1148272640. Throughput: 0: 42422.8. Samples: 1148421100. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:39:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 21:39:49,526][15349] Signal inference workers to stop experience collection... (16800 times) [2024-06-21 21:39:49,527][15349] Signal inference workers to resume experience collection... (16800 times) [2024-06-21 21:39:49,536][15401] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-06-21 21:39:49,554][15401] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-06-21 21:39:49,851][15401] Updated weights for policy 0, policy_version 70090 (0.0042) [2024-06-21 21:39:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1148452864. Throughput: 0: 42466.7. Samples: 1148553160. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:39:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 21:39:54,548][15401] Updated weights for policy 0, policy_version 70100 (0.0037) [2024-06-21 21:39:57,698][15401] Updated weights for policy 0, policy_version 70110 (0.0034) [2024-06-21 21:39:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1148698624. Throughput: 0: 42308.0. Samples: 1148804580. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:39:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 21:40:02,159][15401] Updated weights for policy 0, policy_version 70120 (0.0031) [2024-06-21 21:40:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1148895232. Throughput: 0: 42365.1. Samples: 1149058220. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:40:03,396][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 21:40:05,271][15401] Updated weights for policy 0, policy_version 70130 (0.0031) [2024-06-21 21:40:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1149075456. Throughput: 0: 42329.9. Samples: 1149182840. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:40:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 21:40:10,202][15401] Updated weights for policy 0, policy_version 70140 (0.0036) [2024-06-21 21:40:13,140][15401] Updated weights for policy 0, policy_version 70150 (0.0034) [2024-06-21 21:40:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1149337600. Throughput: 0: 42390.6. Samples: 1149440260. Policy #0 lag: (min: 2.0, avg: 11.9, max: 24.0) [2024-06-21 21:40:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 21:40:17,642][15401] Updated weights for policy 0, policy_version 70160 (0.0037) [2024-06-21 21:40:18,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42323.5, 300 sec: 42431.4). Total num frames: 1149534208. Throughput: 0: 42466.7. Samples: 1149697180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:18,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 21:40:20,733][15401] Updated weights for policy 0, policy_version 70170 (0.0035) [2024-06-21 21:40:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42431.5). Total num frames: 1149747200. Throughput: 0: 42350.3. Samples: 1149823300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:23,392][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 21:40:25,240][15401] Updated weights for policy 0, policy_version 70180 (0.0043) [2024-06-21 21:40:28,363][15401] Updated weights for policy 0, policy_version 70190 (0.0028) [2024-06-21 21:40:28,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42598.8). Total num frames: 1149992960. Throughput: 0: 42636.0. Samples: 1150082000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 21:40:32,689][15401] Updated weights for policy 0, policy_version 70200 (0.0038) [2024-06-21 21:40:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1150156800. Throughput: 0: 42621.2. Samples: 1150339060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:40:35,981][15401] Updated weights for policy 0, policy_version 70210 (0.0044) [2024-06-21 21:40:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1150386176. Throughput: 0: 42273.8. Samples: 1150455480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 21:40:40,277][15401] Updated weights for policy 0, policy_version 70220 (0.0034) [2024-06-21 21:40:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1150599168. Throughput: 0: 42572.0. Samples: 1150720320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 21:40:44,267][15401] Updated weights for policy 0, policy_version 70230 (0.0037) [2024-06-21 21:40:48,262][15401] Updated weights for policy 0, policy_version 70240 (0.0031) [2024-06-21 21:40:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42431.9). Total num frames: 1150812160. Throughput: 0: 42549.1. Samples: 1150972920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 21:40:52,038][15401] Updated weights for policy 0, policy_version 70250 (0.0026) [2024-06-21 21:40:53,373][15349] Signal inference workers to stop experience collection... (16850 times) [2024-06-21 21:40:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42376.3). Total num frames: 1151025152. Throughput: 0: 42680.5. Samples: 1151103460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 21:40:53,409][15401] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-06-21 21:40:53,429][15349] Signal inference workers to resume experience collection... (16850 times) [2024-06-21 21:40:53,430][15401] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-06-21 21:40:56,230][15401] Updated weights for policy 0, policy_version 70260 (0.0031) [2024-06-21 21:40:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1151221760. Throughput: 0: 42643.1. Samples: 1151359200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:40:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 21:40:59,568][15401] Updated weights for policy 0, policy_version 70270 (0.0037) [2024-06-21 21:41:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1151451136. Throughput: 0: 42583.6. Samples: 1151613340. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 21:41:03,827][15401] Updated weights for policy 0, policy_version 70280 (0.0037) [2024-06-21 21:41:07,391][15401] Updated weights for policy 0, policy_version 70290 (0.0029) [2024-06-21 21:41:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 1151680512. Throughput: 0: 42681.0. Samples: 1151743840. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 21:41:11,487][15401] Updated weights for policy 0, policy_version 70300 (0.0035) [2024-06-21 21:41:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1151844352. Throughput: 0: 42453.7. Samples: 1151992420. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-21 21:41:15,561][15401] Updated weights for policy 0, policy_version 70310 (0.0038) [2024-06-21 21:41:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.2, 300 sec: 42431.8). Total num frames: 1152090112. Throughput: 0: 42277.4. Samples: 1152241540. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 21:41:19,167][15401] Updated weights for policy 0, policy_version 70320 (0.0028) [2024-06-21 21:41:23,218][15401] Updated weights for policy 0, policy_version 70330 (0.0032) [2024-06-21 21:41:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.1, 300 sec: 42376.3). Total num frames: 1152286720. Throughput: 0: 42648.1. Samples: 1152374640. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 21:41:26,823][15401] Updated weights for policy 0, policy_version 70340 (0.0025) [2024-06-21 21:41:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 42321.0). Total num frames: 1152483328. Throughput: 0: 42317.4. Samples: 1152624600. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 21:41:30,925][15401] Updated weights for policy 0, policy_version 70350 (0.0031) [2024-06-21 21:41:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42375.9). Total num frames: 1152729088. Throughput: 0: 42176.3. Samples: 1152870960. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:33,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 21:41:34,648][15401] Updated weights for policy 0, policy_version 70360 (0.0036) [2024-06-21 21:41:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1152909312. Throughput: 0: 42319.4. Samples: 1153007840. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 21:41:38,823][15401] Updated weights for policy 0, policy_version 70370 (0.0036) [2024-06-21 21:41:42,548][15401] Updated weights for policy 0, policy_version 70380 (0.0022) [2024-06-21 21:41:43,390][15132] Fps is (10 sec: 39330.3, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1153122304. Throughput: 0: 42350.6. Samples: 1153264980. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-21 21:41:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 21:41:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070381_1153122304.pth... [2024-06-21 21:41:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000069760_1142947840.pth [2024-06-21 21:41:46,709][15401] Updated weights for policy 0, policy_version 70390 (0.0032) [2024-06-21 21:41:48,396][15132] Fps is (10 sec: 47483.5, 60 sec: 42866.8, 300 sec: 42486.4). Total num frames: 1153384448. Throughput: 0: 42113.2. Samples: 1153508700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:41:48,396][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 21:41:50,107][15401] Updated weights for policy 0, policy_version 70400 (0.0023) [2024-06-21 21:41:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.2, 300 sec: 42432.0). Total num frames: 1153548288. Throughput: 0: 42268.0. Samples: 1153645900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:41:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 21:41:54,243][15401] Updated weights for policy 0, policy_version 70410 (0.0026) [2024-06-21 21:41:55,198][15349] Signal inference workers to stop experience collection... (16900 times) [2024-06-21 21:41:55,199][15349] Signal inference workers to resume experience collection... (16900 times) [2024-06-21 21:41:55,231][15401] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-06-21 21:41:55,260][15401] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-06-21 21:41:57,798][15401] Updated weights for policy 0, policy_version 70420 (0.0036) [2024-06-21 21:41:58,390][15132] Fps is (10 sec: 37707.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1153761280. Throughput: 0: 42261.4. Samples: 1153894180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:41:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 21:42:01,831][15401] Updated weights for policy 0, policy_version 70430 (0.0038) [2024-06-21 21:42:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1154007040. Throughput: 0: 42371.5. Samples: 1154148260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:42:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 21:42:05,363][15401] Updated weights for policy 0, policy_version 70440 (0.0036) [2024-06-21 21:42:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 42376.2). Total num frames: 1154170880. Throughput: 0: 42268.4. Samples: 1154276720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:42:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 21:42:09,515][15401] Updated weights for policy 0, policy_version 70450 (0.0028) [2024-06-21 21:42:12,977][15401] Updated weights for policy 0, policy_version 70460 (0.0045) [2024-06-21 21:42:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42432.1). Total num frames: 1154416640. Throughput: 0: 42225.4. Samples: 1154524740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:42:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 21:42:17,226][15401] Updated weights for policy 0, policy_version 70470 (0.0045) [2024-06-21 21:42:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1154613248. Throughput: 0: 42669.5. Samples: 1154790980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:42:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 21:42:20,998][15401] Updated weights for policy 0, policy_version 70480 (0.0046) [2024-06-21 21:42:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1154826240. Throughput: 0: 42353.9. Samples: 1154913760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 21:42:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 21:42:24,774][15401] Updated weights for policy 0, policy_version 70490 (0.0042) [2024-06-21 21:42:28,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1155039232. Throughput: 0: 42183.1. Samples: 1155163220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:28,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-21 21:42:28,731][15401] Updated weights for policy 0, policy_version 70500 (0.0029) [2024-06-21 21:42:32,795][15401] Updated weights for policy 0, policy_version 70510 (0.0030) [2024-06-21 21:42:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42054.0, 300 sec: 42376.3). Total num frames: 1155252224. Throughput: 0: 42471.9. Samples: 1155419660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:33,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 21:42:36,772][15401] Updated weights for policy 0, policy_version 70520 (0.0041) [2024-06-21 21:42:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1155465216. Throughput: 0: 42343.6. Samples: 1155551360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 21:42:40,502][15401] Updated weights for policy 0, policy_version 70530 (0.0034) [2024-06-21 21:42:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42431.8). Total num frames: 1155678208. Throughput: 0: 42295.6. Samples: 1155797480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:43,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-21 21:42:44,451][15401] Updated weights for policy 0, policy_version 70540 (0.0039) [2024-06-21 21:42:48,104][15401] Updated weights for policy 0, policy_version 70550 (0.0035) [2024-06-21 21:42:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42056.7, 300 sec: 42487.3). Total num frames: 1155907584. Throughput: 0: 42330.6. Samples: 1156053140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 21:42:52,118][15401] Updated weights for policy 0, policy_version 70560 (0.0037) [2024-06-21 21:42:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1156104192. Throughput: 0: 42424.3. Samples: 1156185820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 21:42:55,780][15401] Updated weights for policy 0, policy_version 70570 (0.0044) [2024-06-21 21:42:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1156317184. Throughput: 0: 42350.6. Samples: 1156430520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:42:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 21:43:00,017][15401] Updated weights for policy 0, policy_version 70580 (0.0033) [2024-06-21 21:43:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1156513792. Throughput: 0: 42240.3. Samples: 1156691800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:43:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 21:43:03,594][15401] Updated weights for policy 0, policy_version 70590 (0.0044) [2024-06-21 21:43:07,891][15401] Updated weights for policy 0, policy_version 70600 (0.0024) [2024-06-21 21:43:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1156726784. Throughput: 0: 42224.5. Samples: 1156813860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-21 21:43:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 21:43:10,673][15349] Signal inference workers to stop experience collection... (16950 times) [2024-06-21 21:43:10,674][15349] Signal inference workers to resume experience collection... (16950 times) [2024-06-21 21:43:10,716][15401] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-06-21 21:43:10,716][15401] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-06-21 21:43:11,211][15401] Updated weights for policy 0, policy_version 70610 (0.0028) [2024-06-21 21:43:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1156956160. Throughput: 0: 42164.9. Samples: 1157060640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 21:43:15,695][15401] Updated weights for policy 0, policy_version 70620 (0.0037) [2024-06-21 21:43:18,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.5, 300 sec: 42320.4). Total num frames: 1157136384. Throughput: 0: 42371.0. Samples: 1157326460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:18,401][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 21:43:18,977][15401] Updated weights for policy 0, policy_version 70630 (0.0034) [2024-06-21 21:43:23,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1157349376. Throughput: 0: 42122.3. Samples: 1157446860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 21:43:23,412][15401] Updated weights for policy 0, policy_version 70640 (0.0035) [2024-06-21 21:43:26,792][15401] Updated weights for policy 0, policy_version 70650 (0.0033) [2024-06-21 21:43:28,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42871.5, 300 sec: 42432.1). Total num frames: 1157611520. Throughput: 0: 42295.9. Samples: 1157700800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 21:43:31,288][15401] Updated weights for policy 0, policy_version 70660 (0.0027) [2024-06-21 21:43:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42376.6). Total num frames: 1157791744. Throughput: 0: 42338.3. Samples: 1157958360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 21:43:34,483][15401] Updated weights for policy 0, policy_version 70670 (0.0037) [2024-06-21 21:43:38,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.3, 300 sec: 42321.6). Total num frames: 1157988352. Throughput: 0: 42040.6. Samples: 1158077640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 21:43:38,837][15401] Updated weights for policy 0, policy_version 70680 (0.0035) [2024-06-21 21:43:42,379][15401] Updated weights for policy 0, policy_version 70690 (0.0035) [2024-06-21 21:43:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1158234112. Throughput: 0: 42273.9. Samples: 1158332840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 21:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070693_1158234112.pth... [2024-06-21 21:43:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070071_1148043264.pth [2024-06-21 21:43:46,722][15401] Updated weights for policy 0, policy_version 70700 (0.0040) [2024-06-21 21:43:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1158414336. Throughput: 0: 42194.7. Samples: 1158590560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 21:43:50,108][15401] Updated weights for policy 0, policy_version 70710 (0.0053) [2024-06-21 21:43:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1158627328. Throughput: 0: 42105.3. Samples: 1158708600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-21 21:43:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 21:43:54,394][15401] Updated weights for policy 0, policy_version 70720 (0.0025) [2024-06-21 21:43:57,804][15401] Updated weights for policy 0, policy_version 70730 (0.0044) [2024-06-21 21:43:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1158873088. Throughput: 0: 42364.6. Samples: 1158967040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:43:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 21:44:02,127][15401] Updated weights for policy 0, policy_version 70740 (0.0032) [2024-06-21 21:44:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1159036928. Throughput: 0: 42081.8. Samples: 1159220040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 21:44:05,736][15401] Updated weights for policy 0, policy_version 70750 (0.0043) [2024-06-21 21:44:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1159266304. Throughput: 0: 41974.1. Samples: 1159335700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:08,394][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 21:44:09,691][15401] Updated weights for policy 0, policy_version 70760 (0.0031) [2024-06-21 21:44:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 1159479296. Throughput: 0: 42081.9. Samples: 1159594480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 21:44:13,584][15401] Updated weights for policy 0, policy_version 70770 (0.0039) [2024-06-21 21:44:17,296][15401] Updated weights for policy 0, policy_version 70780 (0.0030) [2024-06-21 21:44:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42054.0, 300 sec: 42265.2). Total num frames: 1159659520. Throughput: 0: 41928.4. Samples: 1159845140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 21:44:19,730][15349] Signal inference workers to stop experience collection... (17000 times) [2024-06-21 21:44:19,783][15401] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-06-21 21:44:19,790][15349] Signal inference workers to resume experience collection... (17000 times) [2024-06-21 21:44:19,797][15401] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-06-21 21:44:21,072][15401] Updated weights for policy 0, policy_version 70790 (0.0025) [2024-06-21 21:44:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1159888896. Throughput: 0: 41924.8. Samples: 1159964260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 21:44:24,762][15401] Updated weights for policy 0, policy_version 70800 (0.0037) [2024-06-21 21:44:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 42265.2). Total num frames: 1160101888. Throughput: 0: 42243.0. Samples: 1160233780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:28,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-21 21:44:28,747][15401] Updated weights for policy 0, policy_version 70810 (0.0032) [2024-06-21 21:44:32,401][15401] Updated weights for policy 0, policy_version 70820 (0.0036) [2024-06-21 21:44:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.5, 300 sec: 42375.9). Total num frames: 1160314880. Throughput: 0: 41940.4. Samples: 1160477980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-21 21:44:33,393][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 21:44:37,194][15401] Updated weights for policy 0, policy_version 70830 (0.0048) [2024-06-21 21:44:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1160527872. Throughput: 0: 42214.3. Samples: 1160608240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:44:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 21:44:40,326][15401] Updated weights for policy 0, policy_version 70840 (0.0036) [2024-06-21 21:44:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 41779.2, 300 sec: 42265.1). Total num frames: 1160740864. Throughput: 0: 42118.7. Samples: 1160862380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:44:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 21:44:44,922][15401] Updated weights for policy 0, policy_version 70850 (0.0047) [2024-06-21 21:44:48,294][15401] Updated weights for policy 0, policy_version 70860 (0.0036) [2024-06-21 21:44:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1160970240. Throughput: 0: 42030.6. Samples: 1161111420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:44:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 21:44:52,534][15401] Updated weights for policy 0, policy_version 70870 (0.0026) [2024-06-21 21:44:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42320.4). Total num frames: 1161183232. Throughput: 0: 42402.7. Samples: 1161243920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:44:53,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 21:44:56,654][15401] Updated weights for policy 0, policy_version 70880 (0.0037) [2024-06-21 21:44:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 1161363456. Throughput: 0: 42308.9. Samples: 1161498380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:44:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 21:45:00,273][15401] Updated weights for policy 0, policy_version 70890 (0.0038) [2024-06-21 21:45:03,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1161576448. Throughput: 0: 42286.1. Samples: 1161748020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:45:03,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-21 21:45:04,389][15401] Updated weights for policy 0, policy_version 70900 (0.0038) [2024-06-21 21:45:07,986][15401] Updated weights for policy 0, policy_version 70910 (0.0039) [2024-06-21 21:45:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1161822208. Throughput: 0: 42498.6. Samples: 1161876700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:45:08,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 21:45:11,998][15401] Updated weights for policy 0, policy_version 70920 (0.0044) [2024-06-21 21:45:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42265.5). Total num frames: 1162002432. Throughput: 0: 42252.8. Samples: 1162135160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:45:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 21:45:15,390][15401] Updated weights for policy 0, policy_version 70930 (0.0030) [2024-06-21 21:45:18,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42320.7). Total num frames: 1162231808. Throughput: 0: 42509.8. Samples: 1162390920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-21 21:45:18,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 21:45:19,604][15401] Updated weights for policy 0, policy_version 70940 (0.0042) [2024-06-21 21:45:23,084][15401] Updated weights for policy 0, policy_version 70950 (0.0042) [2024-06-21 21:45:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1162461184. Throughput: 0: 42608.8. Samples: 1162525640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 21:45:27,269][15401] Updated weights for policy 0, policy_version 70960 (0.0032) [2024-06-21 21:45:28,392][15132] Fps is (10 sec: 40960.3, 60 sec: 42323.7, 300 sec: 42320.4). Total num frames: 1162641408. Throughput: 0: 42589.3. Samples: 1162779000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:28,392][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 21:45:30,685][15401] Updated weights for policy 0, policy_version 70970 (0.0043) [2024-06-21 21:45:33,391][15132] Fps is (10 sec: 40955.4, 60 sec: 42599.4, 300 sec: 42320.5). Total num frames: 1162870784. Throughput: 0: 42588.4. Samples: 1163027940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:33,391][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 21:45:34,930][15401] Updated weights for policy 0, policy_version 70980 (0.0031) [2024-06-21 21:45:36,966][15349] Signal inference workers to stop experience collection... (17050 times) [2024-06-21 21:45:36,967][15349] Signal inference workers to resume experience collection... (17050 times) [2024-06-21 21:45:36,978][15401] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-06-21 21:45:36,978][15401] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-06-21 21:45:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1163083776. Throughput: 0: 42762.3. Samples: 1163168120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 21:45:38,456][15401] Updated weights for policy 0, policy_version 70990 (0.0034) [2024-06-21 21:45:42,336][15401] Updated weights for policy 0, policy_version 71000 (0.0035) [2024-06-21 21:45:43,389][15132] Fps is (10 sec: 39326.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1163264000. Throughput: 0: 42675.6. Samples: 1163418780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 21:45:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071000_1163264000.pth... [2024-06-21 21:45:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070381_1153122304.pth [2024-06-21 21:45:46,220][15401] Updated weights for policy 0, policy_version 71010 (0.0037) [2024-06-21 21:45:48,391][15132] Fps is (10 sec: 44228.6, 60 sec: 42597.2, 300 sec: 42376.0). Total num frames: 1163526144. Throughput: 0: 42667.7. Samples: 1163668140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:48,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 21:45:49,785][15401] Updated weights for policy 0, policy_version 71020 (0.0029) [2024-06-21 21:45:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41780.8, 300 sec: 42265.2). Total num frames: 1163689984. Throughput: 0: 42875.1. Samples: 1163806080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 21:45:54,206][15401] Updated weights for policy 0, policy_version 71030 (0.0025) [2024-06-21 21:45:57,658][15401] Updated weights for policy 0, policy_version 71040 (0.0033) [2024-06-21 21:45:58,390][15132] Fps is (10 sec: 39328.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1163919360. Throughput: 0: 42614.8. Samples: 1164052820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:45:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-21 21:46:01,907][15401] Updated weights for policy 0, policy_version 71050 (0.0033) [2024-06-21 21:46:03,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.7, 300 sec: 42320.7). Total num frames: 1164165120. Throughput: 0: 42522.8. Samples: 1164304340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-21 21:46:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 21:46:05,322][15401] Updated weights for policy 0, policy_version 71060 (0.0037) [2024-06-21 21:46:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41777.6, 300 sec: 42320.4). Total num frames: 1164328960. Throughput: 0: 42503.9. Samples: 1164438420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:08,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 21:46:09,587][15401] Updated weights for policy 0, policy_version 71070 (0.0031) [2024-06-21 21:46:13,115][15401] Updated weights for policy 0, policy_version 71080 (0.0031) [2024-06-21 21:46:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1164574720. Throughput: 0: 42544.0. Samples: 1164693380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 21:46:17,543][15401] Updated weights for policy 0, policy_version 71090 (0.0026) [2024-06-21 21:46:18,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42600.2, 300 sec: 42376.2). Total num frames: 1164787712. Throughput: 0: 42661.5. Samples: 1164947660. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 21:46:20,848][15401] Updated weights for policy 0, policy_version 71100 (0.0034) [2024-06-21 21:46:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1164984320. Throughput: 0: 42414.6. Samples: 1165076780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 21:46:25,212][15401] Updated weights for policy 0, policy_version 71110 (0.0029) [2024-06-21 21:46:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42265.5). Total num frames: 1165197312. Throughput: 0: 42404.5. Samples: 1165326980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:28,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-21 21:46:28,635][15401] Updated weights for policy 0, policy_version 71120 (0.0037) [2024-06-21 21:46:32,814][15401] Updated weights for policy 0, policy_version 71130 (0.0041) [2024-06-21 21:46:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42599.1, 300 sec: 42431.8). Total num frames: 1165426688. Throughput: 0: 42664.3. Samples: 1165587960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 21:46:36,217][15401] Updated weights for policy 0, policy_version 71140 (0.0030) [2024-06-21 21:46:38,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42598.1, 300 sec: 42431.8). Total num frames: 1165639680. Throughput: 0: 42318.9. Samples: 1165710440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 21:46:40,596][15401] Updated weights for policy 0, policy_version 71150 (0.0037) [2024-06-21 21:46:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42266.1). Total num frames: 1165852672. Throughput: 0: 42428.8. Samples: 1165962120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-21 21:46:43,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 21:46:43,815][15401] Updated weights for policy 0, policy_version 71160 (0.0038) [2024-06-21 21:46:48,390][15132] Fps is (10 sec: 39322.7, 60 sec: 41780.4, 300 sec: 42320.7). Total num frames: 1166032896. Throughput: 0: 42659.5. Samples: 1166224020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:46:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 21:46:48,486][15401] Updated weights for policy 0, policy_version 71170 (0.0030) [2024-06-21 21:46:48,796][15349] Signal inference workers to stop experience collection... (17100 times) [2024-06-21 21:46:48,797][15349] Signal inference workers to resume experience collection... (17100 times) [2024-06-21 21:46:48,816][15401] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-06-21 21:46:48,816][15401] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-06-21 21:46:51,773][15401] Updated weights for policy 0, policy_version 71180 (0.0040) [2024-06-21 21:46:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 1166278656. Throughput: 0: 42394.1. Samples: 1166346060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:46:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 21:46:55,922][15401] Updated weights for policy 0, policy_version 71190 (0.0032) [2024-06-21 21:46:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1166491648. Throughput: 0: 42458.3. Samples: 1166604000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:46:58,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-21 21:46:59,333][15401] Updated weights for policy 0, policy_version 71200 (0.0028) [2024-06-21 21:47:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1166671872. Throughput: 0: 42672.0. Samples: 1166867900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:03,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-21 21:47:03,726][15401] Updated weights for policy 0, policy_version 71210 (0.0036) [2024-06-21 21:47:07,013][15401] Updated weights for policy 0, policy_version 71220 (0.0036) [2024-06-21 21:47:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.2, 300 sec: 42376.2). Total num frames: 1166917632. Throughput: 0: 42487.5. Samples: 1166988720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 21:47:11,430][15401] Updated weights for policy 0, policy_version 71230 (0.0033) [2024-06-21 21:47:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1167130624. Throughput: 0: 42650.2. Samples: 1167246240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:13,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 21:47:14,639][15401] Updated weights for policy 0, policy_version 71240 (0.0026) [2024-06-21 21:47:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1167310848. Throughput: 0: 42641.9. Samples: 1167506840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:18,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-21 21:47:19,054][15401] Updated weights for policy 0, policy_version 71250 (0.0033) [2024-06-21 21:47:22,371][15401] Updated weights for policy 0, policy_version 71260 (0.0034) [2024-06-21 21:47:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42487.4). Total num frames: 1167572992. Throughput: 0: 42548.8. Samples: 1167625120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 21:47:26,534][15401] Updated weights for policy 0, policy_version 71270 (0.0055) [2024-06-21 21:47:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1167769600. Throughput: 0: 42788.1. Samples: 1167887580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 24.0) [2024-06-21 21:47:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 21:47:29,925][15401] Updated weights for policy 0, policy_version 71280 (0.0033) [2024-06-21 21:47:33,392][15132] Fps is (10 sec: 37673.9, 60 sec: 42050.6, 300 sec: 42320.4). Total num frames: 1167949824. Throughput: 0: 42767.9. Samples: 1168148680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:33,393][15132] Avg episode reward: [(0, '0.173')] [2024-06-21 21:47:34,068][15401] Updated weights for policy 0, policy_version 71290 (0.0035) [2024-06-21 21:47:37,671][15401] Updated weights for policy 0, policy_version 71300 (0.0038) [2024-06-21 21:47:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.7, 300 sec: 42431.8). Total num frames: 1168195584. Throughput: 0: 42766.0. Samples: 1168270520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:38,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-21 21:47:41,741][15401] Updated weights for policy 0, policy_version 71310 (0.0041) [2024-06-21 21:47:43,390][15132] Fps is (10 sec: 47525.0, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1168424960. Throughput: 0: 42888.8. Samples: 1168534000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 21:47:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071315_1168424960.pth... [2024-06-21 21:47:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000070693_1158234112.pth [2024-06-21 21:47:45,299][15401] Updated weights for policy 0, policy_version 71320 (0.0037) [2024-06-21 21:47:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1168605184. Throughput: 0: 42657.7. Samples: 1168787500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 21:47:49,389][15401] Updated weights for policy 0, policy_version 71330 (0.0028) [2024-06-21 21:47:52,907][15401] Updated weights for policy 0, policy_version 71340 (0.0031) [2024-06-21 21:47:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1168834560. Throughput: 0: 42697.4. Samples: 1168910100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 21:47:57,135][15401] Updated weights for policy 0, policy_version 71350 (0.0029) [2024-06-21 21:47:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1169031168. Throughput: 0: 42676.8. Samples: 1169166700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:47:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 21:48:00,489][15401] Updated weights for policy 0, policy_version 71360 (0.0044) [2024-06-21 21:48:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1169244160. Throughput: 0: 42565.3. Samples: 1169422280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:48:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 21:48:04,752][15401] Updated weights for policy 0, policy_version 71370 (0.0031) [2024-06-21 21:48:08,078][15401] Updated weights for policy 0, policy_version 71380 (0.0041) [2024-06-21 21:48:08,395][15132] Fps is (10 sec: 45848.2, 60 sec: 42867.3, 300 sec: 42486.5). Total num frames: 1169489920. Throughput: 0: 42845.9. Samples: 1169553440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:48:08,396][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 21:48:12,574][15401] Updated weights for policy 0, policy_version 71390 (0.0035) [2024-06-21 21:48:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1169686528. Throughput: 0: 42679.5. Samples: 1169808160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-21 21:48:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 21:48:16,248][15401] Updated weights for policy 0, policy_version 71400 (0.0035) [2024-06-21 21:48:18,390][15132] Fps is (10 sec: 39344.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1169883136. Throughput: 0: 42440.5. Samples: 1170058400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 21:48:20,401][15401] Updated weights for policy 0, policy_version 71410 (0.0034) [2024-06-21 21:48:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1170112512. Throughput: 0: 42478.6. Samples: 1170182060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 21:48:23,913][15401] Updated weights for policy 0, policy_version 71420 (0.0040) [2024-06-21 21:48:27,231][15349] Signal inference workers to stop experience collection... (17150 times) [2024-06-21 21:48:27,232][15349] Signal inference workers to resume experience collection... (17150 times) [2024-06-21 21:48:27,257][15401] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-06-21 21:48:27,257][15401] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-06-21 21:48:27,983][15401] Updated weights for policy 0, policy_version 71430 (0.0043) [2024-06-21 21:48:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1170309120. Throughput: 0: 42426.2. Samples: 1170443180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 21:48:31,566][15401] Updated weights for policy 0, policy_version 71440 (0.0030) [2024-06-21 21:48:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42327.1, 300 sec: 42376.3). Total num frames: 1170489344. Throughput: 0: 42468.5. Samples: 1170698580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 21:48:35,792][15401] Updated weights for policy 0, policy_version 71450 (0.0037) [2024-06-21 21:48:38,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 1170767872. Throughput: 0: 42408.4. Samples: 1170818580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:38,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 21:48:39,176][15401] Updated weights for policy 0, policy_version 71460 (0.0039) [2024-06-21 21:48:43,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1170948096. Throughput: 0: 42399.6. Samples: 1171074680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 21:48:43,523][15401] Updated weights for policy 0, policy_version 71470 (0.0036) [2024-06-21 21:48:46,872][15401] Updated weights for policy 0, policy_version 71480 (0.0032) [2024-06-21 21:48:48,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1171144704. Throughput: 0: 42313.3. Samples: 1171326380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 21:48:51,343][15401] Updated weights for policy 0, policy_version 71490 (0.0032) [2024-06-21 21:48:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1171406848. Throughput: 0: 42257.1. Samples: 1171454760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 21:48:54,729][15401] Updated weights for policy 0, policy_version 71500 (0.0038) [2024-06-21 21:48:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1171554304. Throughput: 0: 42321.8. Samples: 1171712640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 21:48:58,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 21:48:58,994][15401] Updated weights for policy 0, policy_version 71510 (0.0036) [2024-06-21 21:49:02,141][15401] Updated weights for policy 0, policy_version 71520 (0.0033) [2024-06-21 21:49:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1171800064. Throughput: 0: 42399.2. Samples: 1171966360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 21:49:06,766][15401] Updated weights for policy 0, policy_version 71530 (0.0036) [2024-06-21 21:49:08,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42329.6, 300 sec: 42542.9). Total num frames: 1172029440. Throughput: 0: 42529.9. Samples: 1172095900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 21:49:10,136][15401] Updated weights for policy 0, policy_version 71540 (0.0035) [2024-06-21 21:49:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1172209664. Throughput: 0: 42378.2. Samples: 1172350200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 21:49:14,675][15401] Updated weights for policy 0, policy_version 71550 (0.0047) [2024-06-21 21:49:18,170][15401] Updated weights for policy 0, policy_version 71560 (0.0045) [2024-06-21 21:49:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1172439040. Throughput: 0: 42288.9. Samples: 1172601580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 21:49:22,298][15401] Updated weights for policy 0, policy_version 71570 (0.0033) [2024-06-21 21:49:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1172652032. Throughput: 0: 42565.4. Samples: 1172733920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 21:49:25,913][15401] Updated weights for policy 0, policy_version 71580 (0.0028) [2024-06-21 21:49:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 1172848640. Throughput: 0: 42599.5. Samples: 1172991660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:28,395][15132] Avg episode reward: [(0, '0.769')] [2024-06-21 21:49:29,825][15401] Updated weights for policy 0, policy_version 71590 (0.0035) [2024-06-21 21:49:31,573][15349] Signal inference workers to stop experience collection... (17200 times) [2024-06-21 21:49:31,573][15349] Signal inference workers to resume experience collection... (17200 times) [2024-06-21 21:49:31,609][15401] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-06-21 21:49:31,610][15401] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-06-21 21:49:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42542.5). Total num frames: 1173078016. Throughput: 0: 42579.9. Samples: 1173242580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:33,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 21:49:33,571][15401] Updated weights for policy 0, policy_version 71600 (0.0029) [2024-06-21 21:49:37,479][15401] Updated weights for policy 0, policy_version 71610 (0.0037) [2024-06-21 21:49:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 1173291008. Throughput: 0: 42613.7. Samples: 1173372380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-21 21:49:38,393][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 21:49:41,028][15401] Updated weights for policy 0, policy_version 71620 (0.0031) [2024-06-21 21:49:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1173504000. Throughput: 0: 42589.3. Samples: 1173629160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:49:43,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-21 21:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071625_1173504000.pth... [2024-06-21 21:49:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071000_1163264000.pth [2024-06-21 21:49:45,082][15401] Updated weights for policy 0, policy_version 71630 (0.0028) [2024-06-21 21:49:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 1173733376. Throughput: 0: 42415.0. Samples: 1173875040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:49:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 21:49:49,343][15401] Updated weights for policy 0, policy_version 71640 (0.0030) [2024-06-21 21:49:53,225][15401] Updated weights for policy 0, policy_version 71650 (0.0041) [2024-06-21 21:49:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 1173913600. Throughput: 0: 42418.0. Samples: 1174004720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:49:53,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-21 21:49:56,844][15401] Updated weights for policy 0, policy_version 71660 (0.0049) [2024-06-21 21:49:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1174159360. Throughput: 0: 42528.4. Samples: 1174263980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:49:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 21:50:00,687][15401] Updated weights for policy 0, policy_version 71670 (0.0030) [2024-06-21 21:50:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1174372352. Throughput: 0: 42462.9. Samples: 1174512420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:50:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 21:50:04,394][15401] Updated weights for policy 0, policy_version 71680 (0.0040) [2024-06-21 21:50:08,335][15401] Updated weights for policy 0, policy_version 71690 (0.0039) [2024-06-21 21:50:08,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 1174568960. Throughput: 0: 42538.7. Samples: 1174648260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:50:08,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 21:50:11,903][15401] Updated weights for policy 0, policy_version 71700 (0.0028) [2024-06-21 21:50:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 1174781952. Throughput: 0: 42428.0. Samples: 1174901020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:50:13,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 21:50:15,988][15401] Updated weights for policy 0, policy_version 71710 (0.0035) [2024-06-21 21:50:18,396][15132] Fps is (10 sec: 42581.0, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 1174994944. Throughput: 0: 42501.1. Samples: 1175155300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:50:18,397][15132] Avg episode reward: [(0, '0.734')] [2024-06-21 21:50:19,463][15401] Updated weights for policy 0, policy_version 71720 (0.0023) [2024-06-21 21:50:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1175191552. Throughput: 0: 42462.6. Samples: 1175283200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 21:50:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 21:50:23,638][15401] Updated weights for policy 0, policy_version 71730 (0.0027) [2024-06-21 21:50:27,141][15401] Updated weights for policy 0, policy_version 71740 (0.0037) [2024-06-21 21:50:28,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42543.0). Total num frames: 1175420928. Throughput: 0: 42416.9. Samples: 1175537920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 21:50:31,125][15401] Updated weights for policy 0, policy_version 71750 (0.0038) [2024-06-21 21:50:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 1175650304. Throughput: 0: 42641.9. Samples: 1175793920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 21:50:35,273][15401] Updated weights for policy 0, policy_version 71760 (0.0035) [2024-06-21 21:50:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1175846912. Throughput: 0: 42653.5. Samples: 1175924120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 21:50:38,604][15401] Updated weights for policy 0, policy_version 71770 (0.0032) [2024-06-21 21:50:43,091][15401] Updated weights for policy 0, policy_version 71780 (0.0036) [2024-06-21 21:50:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1176043520. Throughput: 0: 42527.3. Samples: 1176177700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:43,396][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 21:50:45,473][15349] Signal inference workers to stop experience collection... (17250 times) [2024-06-21 21:50:45,473][15349] Signal inference workers to resume experience collection... (17250 times) [2024-06-21 21:50:45,489][15401] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-06-21 21:50:45,489][15401] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-06-21 21:50:46,276][15401] Updated weights for policy 0, policy_version 71790 (0.0035) [2024-06-21 21:50:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1176272896. Throughput: 0: 42747.2. Samples: 1176436040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 21:50:50,546][15401] Updated weights for policy 0, policy_version 71800 (0.0039) [2024-06-21 21:50:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1176502272. Throughput: 0: 42551.5. Samples: 1176562980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 21:50:54,115][15401] Updated weights for policy 0, policy_version 71810 (0.0029) [2024-06-21 21:50:58,360][15401] Updated weights for policy 0, policy_version 71820 (0.0049) [2024-06-21 21:50:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1176698880. Throughput: 0: 42529.9. Samples: 1176814760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:50:58,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-21 21:51:01,865][15401] Updated weights for policy 0, policy_version 71830 (0.0035) [2024-06-21 21:51:03,394][15132] Fps is (10 sec: 39304.8, 60 sec: 42049.3, 300 sec: 42598.1). Total num frames: 1176895488. Throughput: 0: 42645.5. Samples: 1177074260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:51:03,394][15132] Avg episode reward: [(0, '0.776')] [2024-06-21 21:51:06,104][15401] Updated weights for policy 0, policy_version 71840 (0.0043) [2024-06-21 21:51:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1177124864. Throughput: 0: 42596.6. Samples: 1177200040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 21:51:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 21:51:09,430][15401] Updated weights for policy 0, policy_version 71850 (0.0047) [2024-06-21 21:51:13,390][15132] Fps is (10 sec: 42616.4, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 1177321472. Throughput: 0: 42572.8. Samples: 1177453700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 21:51:13,819][15401] Updated weights for policy 0, policy_version 71860 (0.0042) [2024-06-21 21:51:17,502][15401] Updated weights for policy 0, policy_version 71870 (0.0037) [2024-06-21 21:51:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42329.8, 300 sec: 42542.9). Total num frames: 1177534464. Throughput: 0: 42625.2. Samples: 1177712060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 21:51:21,573][15401] Updated weights for policy 0, policy_version 71880 (0.0030) [2024-06-21 21:51:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1177763840. Throughput: 0: 42559.2. Samples: 1177839280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 21:51:25,150][15401] Updated weights for policy 0, policy_version 71890 (0.0037) [2024-06-21 21:51:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1177960448. Throughput: 0: 42640.8. Samples: 1178096540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 21:51:29,257][15401] Updated weights for policy 0, policy_version 71900 (0.0034) [2024-06-21 21:51:32,734][15401] Updated weights for policy 0, policy_version 71910 (0.0031) [2024-06-21 21:51:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42487.4). Total num frames: 1178173440. Throughput: 0: 42341.0. Samples: 1178341380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 21:51:37,123][15401] Updated weights for policy 0, policy_version 71920 (0.0035) [2024-06-21 21:51:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1178386432. Throughput: 0: 42371.1. Samples: 1178469680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 21:51:40,480][15401] Updated weights for policy 0, policy_version 71930 (0.0040) [2024-06-21 21:51:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1178599424. Throughput: 0: 42487.9. Samples: 1178726720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:43,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 21:51:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071937_1178615808.pth... [2024-06-21 21:51:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071315_1168424960.pth [2024-06-21 21:51:44,778][15401] Updated weights for policy 0, policy_version 71940 (0.0025) [2024-06-21 21:51:48,037][15401] Updated weights for policy 0, policy_version 71950 (0.0035) [2024-06-21 21:51:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 1178828800. Throughput: 0: 42017.8. Samples: 1178964980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:48,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 21:51:52,670][15401] Updated weights for policy 0, policy_version 71960 (0.0026) [2024-06-21 21:51:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 1179025408. Throughput: 0: 42226.9. Samples: 1179100360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-21 21:51:53,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 21:51:56,157][15401] Updated weights for policy 0, policy_version 71970 (0.0037) [2024-06-21 21:51:58,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 1179222016. Throughput: 0: 42242.2. Samples: 1179354600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:51:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 21:52:00,503][15401] Updated weights for policy 0, policy_version 71980 (0.0034) [2024-06-21 21:52:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42601.4, 300 sec: 42487.3). Total num frames: 1179451392. Throughput: 0: 41986.2. Samples: 1179601440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-21 21:52:03,918][15401] Updated weights for policy 0, policy_version 71990 (0.0032) [2024-06-21 21:52:08,117][15401] Updated weights for policy 0, policy_version 72000 (0.0035) [2024-06-21 21:52:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1179664384. Throughput: 0: 42069.2. Samples: 1179732400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 21:52:11,658][15401] Updated weights for policy 0, policy_version 72010 (0.0033) [2024-06-21 21:52:13,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1179828224. Throughput: 0: 41863.6. Samples: 1179980400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 21:52:15,724][15401] Updated weights for policy 0, policy_version 72020 (0.0034) [2024-06-21 21:52:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1180090368. Throughput: 0: 42080.9. Samples: 1180235020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-21 21:52:19,321][15401] Updated weights for policy 0, policy_version 72030 (0.0037) [2024-06-21 21:52:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1180286976. Throughput: 0: 42232.4. Samples: 1180370140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:23,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 21:52:23,431][15401] Updated weights for policy 0, policy_version 72040 (0.0033) [2024-06-21 21:52:27,110][15401] Updated weights for policy 0, policy_version 72050 (0.0037) [2024-06-21 21:52:28,389][15132] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 42432.1). Total num frames: 1180467200. Throughput: 0: 41969.4. Samples: 1180615340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-21 21:52:31,169][15401] Updated weights for policy 0, policy_version 72060 (0.0036) [2024-06-21 21:52:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1180712960. Throughput: 0: 42445.8. Samples: 1180874940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:33,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 21:52:34,701][15349] Signal inference workers to stop experience collection... (17300 times) [2024-06-21 21:52:34,703][15349] Signal inference workers to resume experience collection... (17300 times) [2024-06-21 21:52:34,744][15401] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-06-21 21:52:34,744][15401] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-06-21 21:52:34,835][15401] Updated weights for policy 0, policy_version 72070 (0.0028) [2024-06-21 21:52:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1180925952. Throughput: 0: 42403.1. Samples: 1181008400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 21:52:38,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-21 21:52:38,955][15401] Updated weights for policy 0, policy_version 72080 (0.0045) [2024-06-21 21:52:42,928][15401] Updated weights for policy 0, policy_version 72090 (0.0038) [2024-06-21 21:52:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1181122560. Throughput: 0: 42152.5. Samples: 1181251460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:52:43,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-21 21:52:46,762][15401] Updated weights for policy 0, policy_version 72100 (0.0033) [2024-06-21 21:52:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41780.9, 300 sec: 42376.2). Total num frames: 1181335552. Throughput: 0: 42330.3. Samples: 1181506300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:52:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 21:52:51,130][15401] Updated weights for policy 0, policy_version 72110 (0.0036) [2024-06-21 21:52:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 1181548544. Throughput: 0: 42171.6. Samples: 1181630120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:52:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 21:52:54,374][15401] Updated weights for policy 0, policy_version 72120 (0.0040) [2024-06-21 21:52:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1181745152. Throughput: 0: 42195.1. Samples: 1181879180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:52:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 21:52:58,920][15401] Updated weights for policy 0, policy_version 72130 (0.0034) [2024-06-21 21:53:02,338][15401] Updated weights for policy 0, policy_version 72140 (0.0031) [2024-06-21 21:53:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42377.1). Total num frames: 1181990912. Throughput: 0: 42225.2. Samples: 1182135160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:53:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 21:53:06,400][15401] Updated weights for policy 0, policy_version 72150 (0.0036) [2024-06-21 21:53:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1182187520. Throughput: 0: 42119.2. Samples: 1182265500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:53:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 21:53:09,867][15401] Updated weights for policy 0, policy_version 72160 (0.0036) [2024-06-21 21:53:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1182384128. Throughput: 0: 42390.2. Samples: 1182522900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:53:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 21:53:14,284][15401] Updated weights for policy 0, policy_version 72170 (0.0033) [2024-06-21 21:53:17,416][15401] Updated weights for policy 0, policy_version 72180 (0.0028) [2024-06-21 21:53:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 1182613504. Throughput: 0: 42161.8. Samples: 1182772220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:53:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 21:53:22,130][15401] Updated weights for policy 0, policy_version 72190 (0.0041) [2024-06-21 21:53:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1182826496. Throughput: 0: 42146.9. Samples: 1182905000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:53:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 21:53:25,176][15401] Updated weights for policy 0, policy_version 72200 (0.0048) [2024-06-21 21:53:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1183023104. Throughput: 0: 42357.8. Samples: 1183157560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 21:53:29,987][15401] Updated weights for policy 0, policy_version 72210 (0.0037) [2024-06-21 21:53:32,864][15401] Updated weights for policy 0, policy_version 72220 (0.0022) [2024-06-21 21:53:33,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.7, 300 sec: 42320.7). Total num frames: 1183252480. Throughput: 0: 42194.6. Samples: 1183405160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:33,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 21:53:37,733][15401] Updated weights for policy 0, policy_version 72230 (0.0039) [2024-06-21 21:53:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1183465472. Throughput: 0: 42426.2. Samples: 1183539300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:38,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 21:53:40,554][15401] Updated weights for policy 0, policy_version 72240 (0.0039) [2024-06-21 21:53:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.6, 300 sec: 42431.4). Total num frames: 1183662080. Throughput: 0: 42416.7. Samples: 1183788040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:43,393][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 21:53:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072245_1183662080.pth... [2024-06-21 21:53:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071625_1173504000.pth [2024-06-21 21:53:45,317][15401] Updated weights for policy 0, policy_version 72250 (0.0036) [2024-06-21 21:53:48,375][15401] Updated weights for policy 0, policy_version 72260 (0.0039) [2024-06-21 21:53:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1183907840. Throughput: 0: 42340.0. Samples: 1184040460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 21:53:53,153][15401] Updated weights for policy 0, policy_version 72270 (0.0031) [2024-06-21 21:53:53,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1184088064. Throughput: 0: 42344.6. Samples: 1184171020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 21:53:56,029][15401] Updated weights for policy 0, policy_version 72280 (0.0025) [2024-06-21 21:53:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1184284672. Throughput: 0: 42154.6. Samples: 1184419860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:53:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 21:54:00,660][15349] Signal inference workers to stop experience collection... (17350 times) [2024-06-21 21:54:00,661][15349] Signal inference workers to resume experience collection... (17350 times) [2024-06-21 21:54:00,695][15401] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-06-21 21:54:00,695][15401] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-06-21 21:54:00,810][15401] Updated weights for policy 0, policy_version 72290 (0.0035) [2024-06-21 21:54:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1184530432. Throughput: 0: 42192.6. Samples: 1184670880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 21:54:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 21:54:03,723][15401] Updated weights for policy 0, policy_version 72300 (0.0044) [2024-06-21 21:54:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1184694272. Throughput: 0: 42293.9. Samples: 1184808220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:08,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-21 21:54:08,735][15401] Updated weights for policy 0, policy_version 72310 (0.0026) [2024-06-21 21:54:11,414][15401] Updated weights for policy 0, policy_version 72320 (0.0033) [2024-06-21 21:54:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1184940032. Throughput: 0: 42084.5. Samples: 1185051360. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:13,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-21 21:54:16,314][15401] Updated weights for policy 0, policy_version 72330 (0.0044) [2024-06-21 21:54:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1185153024. Throughput: 0: 42236.0. Samples: 1185305680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:18,395][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 21:54:19,325][15401] Updated weights for policy 0, policy_version 72340 (0.0035) [2024-06-21 21:54:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1185349632. Throughput: 0: 42208.5. Samples: 1185438680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:23,400][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 21:54:23,819][15401] Updated weights for policy 0, policy_version 72350 (0.0027) [2024-06-21 21:54:27,091][15401] Updated weights for policy 0, policy_version 72360 (0.0033) [2024-06-21 21:54:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.6). Total num frames: 1185579008. Throughput: 0: 42243.2. Samples: 1185688880. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:28,399][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 21:54:31,465][15401] Updated weights for policy 0, policy_version 72370 (0.0038) [2024-06-21 21:54:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 42376.2). Total num frames: 1185792000. Throughput: 0: 42391.9. Samples: 1185948100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:33,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-21 21:54:34,573][15401] Updated weights for policy 0, policy_version 72380 (0.0026) [2024-06-21 21:54:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1185972224. Throughput: 0: 42329.9. Samples: 1186075860. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 21:54:39,245][15401] Updated weights for policy 0, policy_version 72390 (0.0036) [2024-06-21 21:54:42,315][15401] Updated weights for policy 0, policy_version 72400 (0.0031) [2024-06-21 21:54:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42320.7). Total num frames: 1186217984. Throughput: 0: 42384.8. Samples: 1186327180. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 21:54:46,937][15401] Updated weights for policy 0, policy_version 72410 (0.0040) [2024-06-21 21:54:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1186414592. Throughput: 0: 42671.4. Samples: 1186591100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-21 21:54:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 21:54:50,437][15401] Updated weights for policy 0, policy_version 72420 (0.0037) [2024-06-21 21:54:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 1186627584. Throughput: 0: 42347.3. Samples: 1186713960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:54:53,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 21:54:54,785][15401] Updated weights for policy 0, policy_version 72430 (0.0030) [2024-06-21 21:54:58,105][15401] Updated weights for policy 0, policy_version 72440 (0.0038) [2024-06-21 21:54:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1186856960. Throughput: 0: 42690.1. Samples: 1186972420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:54:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 21:55:02,334][15401] Updated weights for policy 0, policy_version 72450 (0.0048) [2024-06-21 21:55:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.2, 300 sec: 42321.0). Total num frames: 1187053568. Throughput: 0: 42704.0. Samples: 1187227360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 21:55:06,044][15401] Updated weights for policy 0, policy_version 72460 (0.0042) [2024-06-21 21:55:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42321.0). Total num frames: 1187266560. Throughput: 0: 42572.9. Samples: 1187354460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 21:55:09,974][15401] Updated weights for policy 0, policy_version 72470 (0.0041) [2024-06-21 21:55:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42321.6). Total num frames: 1187479552. Throughput: 0: 42709.8. Samples: 1187610820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 21:55:13,699][15401] Updated weights for policy 0, policy_version 72480 (0.0034) [2024-06-21 21:55:14,975][15349] Signal inference workers to stop experience collection... (17400 times) [2024-06-21 21:55:14,975][15349] Signal inference workers to resume experience collection... (17400 times) [2024-06-21 21:55:14,994][15401] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-06-21 21:55:14,995][15401] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-06-21 21:55:17,757][15401] Updated weights for policy 0, policy_version 72490 (0.0031) [2024-06-21 21:55:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1187676160. Throughput: 0: 42532.1. Samples: 1187862040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:18,394][15132] Avg episode reward: [(0, '0.765')] [2024-06-21 21:55:21,591][15401] Updated weights for policy 0, policy_version 72500 (0.0037) [2024-06-21 21:55:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1187905536. Throughput: 0: 42505.0. Samples: 1187988580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 21:55:25,360][15401] Updated weights for policy 0, policy_version 72510 (0.0047) [2024-06-21 21:55:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1188102144. Throughput: 0: 42385.4. Samples: 1188234520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 21:55:29,554][15401] Updated weights for policy 0, policy_version 72520 (0.0045) [2024-06-21 21:55:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1188315136. Throughput: 0: 42017.5. Samples: 1188481880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-21 21:55:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 21:55:33,612][15401] Updated weights for policy 0, policy_version 72530 (0.0041) [2024-06-21 21:55:37,610][15401] Updated weights for policy 0, policy_version 72540 (0.0034) [2024-06-21 21:55:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1188528128. Throughput: 0: 42242.3. Samples: 1188614760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:55:38,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-21 21:55:41,058][15401] Updated weights for policy 0, policy_version 72550 (0.0041) [2024-06-21 21:55:43,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42323.9, 300 sec: 42320.4). Total num frames: 1188757504. Throughput: 0: 42053.6. Samples: 1188864920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:55:43,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-21 21:55:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072556_1188757504.pth... [2024-06-21 21:55:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000071937_1178615808.pth [2024-06-21 21:55:45,414][15401] Updated weights for policy 0, policy_version 72560 (0.0027) [2024-06-21 21:55:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1188954112. Throughput: 0: 42149.3. Samples: 1189124080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:55:48,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-21 21:55:48,981][15401] Updated weights for policy 0, policy_version 72570 (0.0024) [2024-06-21 21:55:53,213][15401] Updated weights for policy 0, policy_version 72580 (0.0030) [2024-06-21 21:55:53,392][15132] Fps is (10 sec: 40958.6, 60 sec: 42325.3, 300 sec: 42264.8). Total num frames: 1189167104. Throughput: 0: 42031.9. Samples: 1189246000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:55:53,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 21:55:56,716][15401] Updated weights for policy 0, policy_version 72590 (0.0036) [2024-06-21 21:55:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42376.9). Total num frames: 1189396480. Throughput: 0: 41978.5. Samples: 1189499860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:55:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 21:56:00,830][15401] Updated weights for policy 0, policy_version 72600 (0.0038) [2024-06-21 21:56:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1189576704. Throughput: 0: 42128.3. Samples: 1189757820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:56:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 21:56:04,351][15401] Updated weights for policy 0, policy_version 72610 (0.0032) [2024-06-21 21:56:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1189806080. Throughput: 0: 42026.2. Samples: 1189879760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:56:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-21 21:56:08,396][15401] Updated weights for policy 0, policy_version 72620 (0.0034) [2024-06-21 21:56:12,142][15401] Updated weights for policy 0, policy_version 72630 (0.0028) [2024-06-21 21:56:13,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1190051840. Throughput: 0: 42393.4. Samples: 1190142220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:56:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-21 21:56:15,906][15401] Updated weights for policy 0, policy_version 72640 (0.0040) [2024-06-21 21:56:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1190232064. Throughput: 0: 42570.6. Samples: 1190397560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-21 21:56:18,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 21:56:19,867][15401] Updated weights for policy 0, policy_version 72650 (0.0033) [2024-06-21 21:56:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1190428672. Throughput: 0: 42310.7. Samples: 1190518740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 21:56:23,599][15401] Updated weights for policy 0, policy_version 72660 (0.0028) [2024-06-21 21:56:25,221][15349] Signal inference workers to stop experience collection... (17450 times) [2024-06-21 21:56:25,260][15401] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-06-21 21:56:25,280][15349] Signal inference workers to resume experience collection... (17450 times) [2024-06-21 21:56:25,281][15401] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-06-21 21:56:27,609][15401] Updated weights for policy 0, policy_version 72670 (0.0027) [2024-06-21 21:56:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 1190690816. Throughput: 0: 42588.7. Samples: 1190781320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 21:56:31,022][15401] Updated weights for policy 0, policy_version 72680 (0.0044) [2024-06-21 21:56:33,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.6, 300 sec: 42320.4). Total num frames: 1190871040. Throughput: 0: 42393.3. Samples: 1191031880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:33,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 21:56:35,050][15401] Updated weights for policy 0, policy_version 72690 (0.0037) [2024-06-21 21:56:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1191084032. Throughput: 0: 42561.0. Samples: 1191161140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 21:56:39,188][15401] Updated weights for policy 0, policy_version 72700 (0.0039) [2024-06-21 21:56:42,771][15401] Updated weights for policy 0, policy_version 72710 (0.0034) [2024-06-21 21:56:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42326.8, 300 sec: 42265.5). Total num frames: 1191297024. Throughput: 0: 42786.2. Samples: 1191425240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 21:56:46,724][15401] Updated weights for policy 0, policy_version 72720 (0.0029) [2024-06-21 21:56:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42376.6). Total num frames: 1191526400. Throughput: 0: 42562.8. Samples: 1191673140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 21:56:50,623][15401] Updated weights for policy 0, policy_version 72730 (0.0033) [2024-06-21 21:56:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42376.3). Total num frames: 1191723008. Throughput: 0: 42749.2. Samples: 1191803480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 21:56:54,036][15401] Updated weights for policy 0, policy_version 72740 (0.0043) [2024-06-21 21:56:58,326][15401] Updated weights for policy 0, policy_version 72750 (0.0030) [2024-06-21 21:56:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1191936000. Throughput: 0: 42639.1. Samples: 1192060980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:56:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 21:57:01,537][15401] Updated weights for policy 0, policy_version 72760 (0.0038) [2024-06-21 21:57:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1192148992. Throughput: 0: 42569.7. Samples: 1192313200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-21 21:57:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 21:57:06,066][15401] Updated weights for policy 0, policy_version 72770 (0.0027) [2024-06-21 21:57:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1192361984. Throughput: 0: 42779.7. Samples: 1192443840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 21:57:09,548][15401] Updated weights for policy 0, policy_version 72780 (0.0036) [2024-06-21 21:57:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1192574976. Throughput: 0: 42627.9. Samples: 1192699580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 21:57:13,712][15401] Updated weights for policy 0, policy_version 72790 (0.0026) [2024-06-21 21:57:17,107][15401] Updated weights for policy 0, policy_version 72800 (0.0027) [2024-06-21 21:57:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1192787968. Throughput: 0: 42806.7. Samples: 1192958080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 21:57:21,209][15401] Updated weights for policy 0, policy_version 72810 (0.0022) [2024-06-21 21:57:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1193000960. Throughput: 0: 42809.9. Samples: 1193087580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 21:57:24,659][15401] Updated weights for policy 0, policy_version 72820 (0.0042) [2024-06-21 21:57:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1193213952. Throughput: 0: 42660.6. Samples: 1193344960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 21:57:28,731][15401] Updated weights for policy 0, policy_version 72830 (0.0035) [2024-06-21 21:57:32,141][15401] Updated weights for policy 0, policy_version 72840 (0.0040) [2024-06-21 21:57:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42871.5, 300 sec: 42431.5). Total num frames: 1193443328. Throughput: 0: 42817.3. Samples: 1193600020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:33,392][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 21:57:36,249][15401] Updated weights for policy 0, policy_version 72850 (0.0046) [2024-06-21 21:57:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1193623552. Throughput: 0: 42742.1. Samples: 1193726880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 21:57:39,994][15401] Updated weights for policy 0, policy_version 72860 (0.0024) [2024-06-21 21:57:42,802][15349] Signal inference workers to stop experience collection... (17500 times) [2024-06-21 21:57:42,807][15349] Signal inference workers to resume experience collection... (17500 times) [2024-06-21 21:57:42,823][15401] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-06-21 21:57:42,824][15401] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-06-21 21:57:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1193885696. Throughput: 0: 42766.3. Samples: 1193985460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 21:57:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072869_1193885696.pth... [2024-06-21 21:57:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072245_1183662080.pth [2024-06-21 21:57:43,704][15401] Updated weights for policy 0, policy_version 72870 (0.0036) [2024-06-21 21:57:47,888][15401] Updated weights for policy 0, policy_version 72880 (0.0029) [2024-06-21 21:57:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1194082304. Throughput: 0: 42723.1. Samples: 1194235740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 21:57:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 21:57:51,857][15401] Updated weights for policy 0, policy_version 72890 (0.0033) [2024-06-21 21:57:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1194278912. Throughput: 0: 42492.1. Samples: 1194355980. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:57:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 21:57:55,901][15401] Updated weights for policy 0, policy_version 72900 (0.0033) [2024-06-21 21:57:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1194524672. Throughput: 0: 42500.9. Samples: 1194612120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:57:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 21:57:59,338][15401] Updated weights for policy 0, policy_version 72910 (0.0033) [2024-06-21 21:58:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1194704896. Throughput: 0: 42618.3. Samples: 1194875900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 21:58:03,602][15401] Updated weights for policy 0, policy_version 72920 (0.0033) [2024-06-21 21:58:06,893][15401] Updated weights for policy 0, policy_version 72930 (0.0039) [2024-06-21 21:58:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1194901504. Throughput: 0: 42439.1. Samples: 1194997340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 21:58:11,290][15401] Updated weights for policy 0, policy_version 72940 (0.0032) [2024-06-21 21:58:13,396][15132] Fps is (10 sec: 47482.6, 60 sec: 43413.0, 300 sec: 42597.5). Total num frames: 1195180032. Throughput: 0: 42502.2. Samples: 1195257840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:13,397][15132] Avg episode reward: [(0, '0.577')] [2024-06-21 21:58:14,372][15401] Updated weights for policy 0, policy_version 72950 (0.0037) [2024-06-21 21:58:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1195343872. Throughput: 0: 42648.9. Samples: 1195519120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 21:58:18,986][15401] Updated weights for policy 0, policy_version 72960 (0.0024) [2024-06-21 21:58:21,866][15401] Updated weights for policy 0, policy_version 72970 (0.0030) [2024-06-21 21:58:23,389][15132] Fps is (10 sec: 36068.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1195540480. Throughput: 0: 42539.2. Samples: 1195641140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-21 21:58:26,586][15401] Updated weights for policy 0, policy_version 72980 (0.0030) [2024-06-21 21:58:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 1195802624. Throughput: 0: 42502.6. Samples: 1195898080. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 21:58:29,793][15401] Updated weights for policy 0, policy_version 72990 (0.0040) [2024-06-21 21:58:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 42376.2). Total num frames: 1195966464. Throughput: 0: 42753.3. Samples: 1196159640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-21 21:58:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 21:58:34,447][15401] Updated weights for policy 0, policy_version 73000 (0.0039) [2024-06-21 21:58:37,435][15401] Updated weights for policy 0, policy_version 73010 (0.0034) [2024-06-21 21:58:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 1196195840. Throughput: 0: 42665.3. Samples: 1196275920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:58:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 21:58:42,444][15401] Updated weights for policy 0, policy_version 73020 (0.0035) [2024-06-21 21:58:43,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1196425216. Throughput: 0: 42712.6. Samples: 1196534180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:58:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 21:58:45,321][15401] Updated weights for policy 0, policy_version 73030 (0.0028) [2024-06-21 21:58:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1196621824. Throughput: 0: 42514.6. Samples: 1196789060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:58:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 21:58:50,041][15401] Updated weights for policy 0, policy_version 73040 (0.0034) [2024-06-21 21:58:53,135][15401] Updated weights for policy 0, policy_version 73050 (0.0036) [2024-06-21 21:58:53,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1196851200. Throughput: 0: 42556.8. Samples: 1196912400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:58:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 21:58:57,499][15401] Updated weights for policy 0, policy_version 73060 (0.0034) [2024-06-21 21:58:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1197047808. Throughput: 0: 42523.4. Samples: 1197171120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:58:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 21:58:59,008][15349] Signal inference workers to stop experience collection... (17550 times) [2024-06-21 21:58:59,008][15349] Signal inference workers to resume experience collection... (17550 times) [2024-06-21 21:58:59,051][15401] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-06-21 21:58:59,051][15401] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-06-21 21:59:00,887][15401] Updated weights for policy 0, policy_version 73070 (0.0041) [2024-06-21 21:59:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1197228032. Throughput: 0: 42485.0. Samples: 1197430940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:59:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-21 21:59:05,469][15401] Updated weights for policy 0, policy_version 73080 (0.0046) [2024-06-21 21:59:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1197506560. Throughput: 0: 42405.3. Samples: 1197549380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:59:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 21:59:08,398][15401] Updated weights for policy 0, policy_version 73090 (0.0028) [2024-06-21 21:59:13,033][15401] Updated weights for policy 0, policy_version 73100 (0.0035) [2024-06-21 21:59:13,390][15132] Fps is (10 sec: 45874.2, 60 sec: 41783.6, 300 sec: 42487.3). Total num frames: 1197686784. Throughput: 0: 42436.8. Samples: 1197807740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:59:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 21:59:16,405][15401] Updated weights for policy 0, policy_version 73110 (0.0033) [2024-06-21 21:59:18,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1197867008. Throughput: 0: 42429.5. Samples: 1198068960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-21 21:59:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 21:59:20,527][15401] Updated weights for policy 0, policy_version 73120 (0.0024) [2024-06-21 21:59:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1198129152. Throughput: 0: 42556.5. Samples: 1198190960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 21:59:24,089][15401] Updated weights for policy 0, policy_version 73130 (0.0024) [2024-06-21 21:59:28,079][15401] Updated weights for policy 0, policy_version 73140 (0.0040) [2024-06-21 21:59:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 1198325760. Throughput: 0: 42635.5. Samples: 1198452780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:28,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 21:59:31,655][15401] Updated weights for policy 0, policy_version 73150 (0.0032) [2024-06-21 21:59:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1198522368. Throughput: 0: 42717.4. Samples: 1198711340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:33,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 21:59:35,886][15401] Updated weights for policy 0, policy_version 73160 (0.0028) [2024-06-21 21:59:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1198784512. Throughput: 0: 42672.6. Samples: 1198832660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 21:59:39,248][15401] Updated weights for policy 0, policy_version 73170 (0.0038) [2024-06-21 21:59:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 1198948352. Throughput: 0: 42781.3. Samples: 1199096280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:43,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-21 21:59:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073179_1198964736.pth... [2024-06-21 21:59:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072556_1188757504.pth [2024-06-21 21:59:43,626][15401] Updated weights for policy 0, policy_version 73180 (0.0026) [2024-06-21 21:59:47,106][15401] Updated weights for policy 0, policy_version 73190 (0.0035) [2024-06-21 21:59:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1199161344. Throughput: 0: 42681.3. Samples: 1199351600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 21:59:51,183][15401] Updated weights for policy 0, policy_version 73200 (0.0031) [2024-06-21 21:59:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1199407104. Throughput: 0: 42767.8. Samples: 1199473940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:53,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 21:59:55,023][15401] Updated weights for policy 0, policy_version 73210 (0.0032) [2024-06-21 21:59:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1199603712. Throughput: 0: 42761.9. Samples: 1199732020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 21:59:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 21:59:58,752][15401] Updated weights for policy 0, policy_version 73220 (0.0038) [2024-06-21 22:00:02,732][15401] Updated weights for policy 0, policy_version 73230 (0.0031) [2024-06-21 22:00:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1199800320. Throughput: 0: 42595.1. Samples: 1199985740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:00:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-21 22:00:06,332][15401] Updated weights for policy 0, policy_version 73240 (0.0032) [2024-06-21 22:00:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1200046080. Throughput: 0: 42652.0. Samples: 1200110300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 22:00:10,503][15401] Updated weights for policy 0, policy_version 73250 (0.0028) [2024-06-21 22:00:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1200259072. Throughput: 0: 42688.8. Samples: 1200373780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-21 22:00:13,877][15401] Updated weights for policy 0, policy_version 73260 (0.0033) [2024-06-21 22:00:18,333][15401] Updated weights for policy 0, policy_version 73270 (0.0052) [2024-06-21 22:00:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1200455680. Throughput: 0: 42547.8. Samples: 1200626000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:18,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 22:00:21,212][15349] Signal inference workers to stop experience collection... (17600 times) [2024-06-21 22:00:21,257][15401] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-06-21 22:00:21,267][15349] Signal inference workers to resume experience collection... (17600 times) [2024-06-21 22:00:21,274][15401] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-06-21 22:00:21,424][15401] Updated weights for policy 0, policy_version 73280 (0.0047) [2024-06-21 22:00:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1200701440. Throughput: 0: 42551.9. Samples: 1200747600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:23,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 22:00:26,423][15401] Updated weights for policy 0, policy_version 73290 (0.0029) [2024-06-21 22:00:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1200865280. Throughput: 0: 42414.8. Samples: 1201004940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 22:00:29,485][15401] Updated weights for policy 0, policy_version 73300 (0.0045) [2024-06-21 22:00:33,389][15132] Fps is (10 sec: 36053.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1201061888. Throughput: 0: 42424.1. Samples: 1201260680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 22:00:34,150][15401] Updated weights for policy 0, policy_version 73310 (0.0034) [2024-06-21 22:00:37,062][15401] Updated weights for policy 0, policy_version 73320 (0.0040) [2024-06-21 22:00:38,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1201340416. Throughput: 0: 42544.2. Samples: 1201388420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 22:00:41,809][15401] Updated weights for policy 0, policy_version 73330 (0.0036) [2024-06-21 22:00:43,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1201487872. Throughput: 0: 42584.4. Samples: 1201648320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:43,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-21 22:00:44,598][15401] Updated weights for policy 0, policy_version 73340 (0.0024) [2024-06-21 22:00:48,396][15132] Fps is (10 sec: 37658.6, 60 sec: 42593.8, 300 sec: 42542.3). Total num frames: 1201717248. Throughput: 0: 42511.2. Samples: 1201899020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:00:48,397][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 22:00:49,721][15401] Updated weights for policy 0, policy_version 73350 (0.0033) [2024-06-21 22:00:52,401][15401] Updated weights for policy 0, policy_version 73360 (0.0044) [2024-06-21 22:00:53,389][15132] Fps is (10 sec: 49152.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1201979392. Throughput: 0: 42663.6. Samples: 1202030160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:00:53,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 22:00:57,066][15401] Updated weights for policy 0, policy_version 73370 (0.0030) [2024-06-21 22:00:58,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1202126848. Throughput: 0: 42395.7. Samples: 1202281580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:00:58,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 22:01:00,037][15401] Updated weights for policy 0, policy_version 73380 (0.0039) [2024-06-21 22:01:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1202356224. Throughput: 0: 42431.2. Samples: 1202535400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-21 22:01:04,881][15401] Updated weights for policy 0, policy_version 73390 (0.0027) [2024-06-21 22:01:07,949][15401] Updated weights for policy 0, policy_version 73400 (0.0032) [2024-06-21 22:01:08,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1202601984. Throughput: 0: 42667.6. Samples: 1202667540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 22:01:12,304][15401] Updated weights for policy 0, policy_version 73410 (0.0034) [2024-06-21 22:01:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1202765824. Throughput: 0: 42540.6. Samples: 1202919260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:13,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 22:01:15,630][15401] Updated weights for policy 0, policy_version 73420 (0.0031) [2024-06-21 22:01:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1202995200. Throughput: 0: 42411.3. Samples: 1203169200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 22:01:20,343][15401] Updated weights for policy 0, policy_version 73430 (0.0031) [2024-06-21 22:01:23,172][15349] Signal inference workers to stop experience collection... (17650 times) [2024-06-21 22:01:23,219][15401] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-06-21 22:01:23,230][15349] Signal inference workers to resume experience collection... (17650 times) [2024-06-21 22:01:23,240][15401] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-06-21 22:01:23,364][15401] Updated weights for policy 0, policy_version 73440 (0.0038) [2024-06-21 22:01:23,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 1203240960. Throughput: 0: 42466.6. Samples: 1203299420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 22:01:28,044][15401] Updated weights for policy 0, policy_version 73450 (0.0036) [2024-06-21 22:01:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1203421184. Throughput: 0: 42419.5. Samples: 1203557200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 22:01:31,062][15401] Updated weights for policy 0, policy_version 73460 (0.0040) [2024-06-21 22:01:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1203650560. Throughput: 0: 42394.1. Samples: 1203806480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-21 22:01:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 22:01:35,524][15401] Updated weights for policy 0, policy_version 73470 (0.0041) [2024-06-21 22:01:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1203847168. Throughput: 0: 42392.9. Samples: 1203937840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:01:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 22:01:38,730][15401] Updated weights for policy 0, policy_version 73480 (0.0031) [2024-06-21 22:01:43,314][15401] Updated weights for policy 0, policy_version 73490 (0.0038) [2024-06-21 22:01:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1204060160. Throughput: 0: 42492.6. Samples: 1204193760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:01:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 22:01:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073491_1204076544.pth... [2024-06-21 22:01:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000072869_1193885696.pth [2024-06-21 22:01:46,523][15401] Updated weights for policy 0, policy_version 73500 (0.0030) [2024-06-21 22:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 1204289536. Throughput: 0: 42369.3. Samples: 1204442020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:01:48,396][15132] Avg episode reward: [(0, '0.474')] [2024-06-21 22:01:51,149][15401] Updated weights for policy 0, policy_version 73510 (0.0039) [2024-06-21 22:01:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1204486144. Throughput: 0: 42438.7. Samples: 1204577280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:01:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 22:01:54,193][15401] Updated weights for policy 0, policy_version 73520 (0.0038) [2024-06-21 22:01:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1204682752. Throughput: 0: 42442.1. Samples: 1204829160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:01:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 22:01:58,721][15401] Updated weights for policy 0, policy_version 73530 (0.0041) [2024-06-21 22:02:01,913][15401] Updated weights for policy 0, policy_version 73540 (0.0043) [2024-06-21 22:02:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1204912128. Throughput: 0: 42565.0. Samples: 1205084620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:02:03,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-21 22:02:06,639][15401] Updated weights for policy 0, policy_version 73550 (0.0033) [2024-06-21 22:02:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1205125120. Throughput: 0: 42518.6. Samples: 1205212760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:02:08,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 22:02:09,484][15401] Updated weights for policy 0, policy_version 73560 (0.0036) [2024-06-21 22:02:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1205338112. Throughput: 0: 42327.7. Samples: 1205461940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:02:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 22:02:14,097][15401] Updated weights for policy 0, policy_version 73570 (0.0024) [2024-06-21 22:02:17,150][15401] Updated weights for policy 0, policy_version 73580 (0.0037) [2024-06-21 22:02:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1205567488. Throughput: 0: 42399.8. Samples: 1205714480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-21 22:02:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-21 22:02:21,878][15401] Updated weights for policy 0, policy_version 73590 (0.0040) [2024-06-21 22:02:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 1205780480. Throughput: 0: 42426.6. Samples: 1205847140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:23,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 22:02:25,359][15401] Updated weights for policy 0, policy_version 73600 (0.0046) [2024-06-21 22:02:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1205960704. Throughput: 0: 42379.2. Samples: 1206100820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 22:02:29,749][15401] Updated weights for policy 0, policy_version 73610 (0.0036) [2024-06-21 22:02:33,073][15401] Updated weights for policy 0, policy_version 73620 (0.0029) [2024-06-21 22:02:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1206190080. Throughput: 0: 42279.5. Samples: 1206344600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 22:02:37,429][15401] Updated weights for policy 0, policy_version 73630 (0.0035) [2024-06-21 22:02:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1206403072. Throughput: 0: 42212.4. Samples: 1206476840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:38,390][15132] Avg episode reward: [(0, '0.896')] [2024-06-21 22:02:38,391][15349] Saving new best policy, reward=0.896! [2024-06-21 22:02:40,534][15401] Updated weights for policy 0, policy_version 73640 (0.0027) [2024-06-21 22:02:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1206599680. Throughput: 0: 42432.7. Samples: 1206738640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-21 22:02:45,021][15401] Updated weights for policy 0, policy_version 73650 (0.0032) [2024-06-21 22:02:46,461][15349] Signal inference workers to stop experience collection... (17700 times) [2024-06-21 22:02:46,462][15349] Signal inference workers to resume experience collection... (17700 times) [2024-06-21 22:02:46,500][15401] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-06-21 22:02:46,500][15401] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-06-21 22:02:48,365][15401] Updated weights for policy 0, policy_version 73660 (0.0036) [2024-06-21 22:02:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1206845440. Throughput: 0: 42251.4. Samples: 1206985940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 22:02:52,593][15401] Updated weights for policy 0, policy_version 73670 (0.0036) [2024-06-21 22:02:53,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 1207042048. Throughput: 0: 42329.3. Samples: 1207117680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:53,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 22:02:56,131][15401] Updated weights for policy 0, policy_version 73680 (0.0046) [2024-06-21 22:02:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1207238656. Throughput: 0: 42528.4. Samples: 1207375720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:02:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 22:03:00,622][15401] Updated weights for policy 0, policy_version 73690 (0.0028) [2024-06-21 22:03:03,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1207468032. Throughput: 0: 42156.4. Samples: 1207611520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:03:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 22:03:03,762][15401] Updated weights for policy 0, policy_version 73700 (0.0042) [2024-06-21 22:03:08,225][15401] Updated weights for policy 0, policy_version 73710 (0.0037) [2024-06-21 22:03:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42377.2). Total num frames: 1207681024. Throughput: 0: 42273.3. Samples: 1207749340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:08,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-21 22:03:11,734][15401] Updated weights for policy 0, policy_version 73720 (0.0037) [2024-06-21 22:03:13,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1207844864. Throughput: 0: 42140.5. Samples: 1207997140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 22:03:15,806][15401] Updated weights for policy 0, policy_version 73730 (0.0038) [2024-06-21 22:03:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1208107008. Throughput: 0: 42357.8. Samples: 1208250700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 22:03:19,381][15401] Updated weights for policy 0, policy_version 73740 (0.0041) [2024-06-21 22:03:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42054.0, 300 sec: 42376.2). Total num frames: 1208303616. Throughput: 0: 42486.3. Samples: 1208388720. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:23,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-21 22:03:23,400][15401] Updated weights for policy 0, policy_version 73750 (0.0039) [2024-06-21 22:03:27,214][15401] Updated weights for policy 0, policy_version 73760 (0.0029) [2024-06-21 22:03:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1208500224. Throughput: 0: 42252.1. Samples: 1208639980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:28,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-21 22:03:30,966][15401] Updated weights for policy 0, policy_version 73770 (0.0038) [2024-06-21 22:03:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 1208745984. Throughput: 0: 42355.1. Samples: 1208892020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:33,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 22:03:34,931][15401] Updated weights for policy 0, policy_version 73780 (0.0029) [2024-06-21 22:03:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1208942592. Throughput: 0: 42384.4. Samples: 1209024880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-21 22:03:38,686][15401] Updated weights for policy 0, policy_version 73790 (0.0038) [2024-06-21 22:03:42,681][15401] Updated weights for policy 0, policy_version 73800 (0.0046) [2024-06-21 22:03:43,396][15132] Fps is (10 sec: 40943.6, 60 sec: 42593.9, 300 sec: 42486.4). Total num frames: 1209155584. Throughput: 0: 42358.4. Samples: 1209282120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:43,396][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 22:03:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073801_1209155584.pth... [2024-06-21 22:03:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073179_1198964736.pth [2024-06-21 22:03:46,306][15401] Updated weights for policy 0, policy_version 73810 (0.0029) [2024-06-21 22:03:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42431.5). Total num frames: 1209368576. Throughput: 0: 42575.7. Samples: 1209527520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-21 22:03:48,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-21 22:03:50,621][15401] Updated weights for policy 0, policy_version 73820 (0.0037) [2024-06-21 22:03:53,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1209597952. Throughput: 0: 42445.0. Samples: 1209659360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:03:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 22:03:54,045][15401] Updated weights for policy 0, policy_version 73830 (0.0036) [2024-06-21 22:03:58,154][15401] Updated weights for policy 0, policy_version 73840 (0.0032) [2024-06-21 22:03:58,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1209794560. Throughput: 0: 42670.1. Samples: 1209917300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:03:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 22:04:01,745][15401] Updated weights for policy 0, policy_version 73850 (0.0050) [2024-06-21 22:04:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42376.2). Total num frames: 1210007552. Throughput: 0: 42601.5. Samples: 1210167760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:03,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 22:04:03,520][15349] Signal inference workers to stop experience collection... (17750 times) [2024-06-21 22:04:03,520][15349] Signal inference workers to resume experience collection... (17750 times) [2024-06-21 22:04:03,550][15401] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-06-21 22:04:03,550][15401] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-06-21 22:04:06,078][15401] Updated weights for policy 0, policy_version 73860 (0.0027) [2024-06-21 22:04:08,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 1210236928. Throughput: 0: 42556.5. Samples: 1210303760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 22:04:09,165][15401] Updated weights for policy 0, policy_version 73870 (0.0039) [2024-06-21 22:04:13,390][15132] Fps is (10 sec: 40956.8, 60 sec: 42870.9, 300 sec: 42542.7). Total num frames: 1210417152. Throughput: 0: 42746.0. Samples: 1210563580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:13,391][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 22:04:14,105][15401] Updated weights for policy 0, policy_version 73880 (0.0026) [2024-06-21 22:04:16,652][15401] Updated weights for policy 0, policy_version 73890 (0.0040) [2024-06-21 22:04:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1210662912. Throughput: 0: 42626.3. Samples: 1210810100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:18,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 22:04:22,013][15401] Updated weights for policy 0, policy_version 73900 (0.0038) [2024-06-21 22:04:23,390][15132] Fps is (10 sec: 45877.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1210875904. Throughput: 0: 42780.8. Samples: 1210950020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:23,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 22:04:24,220][15401] Updated weights for policy 0, policy_version 73910 (0.0035) [2024-06-21 22:04:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1211056128. Throughput: 0: 42629.2. Samples: 1211200160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:28,399][15132] Avg episode reward: [(0, '0.166')] [2024-06-21 22:04:29,569][15401] Updated weights for policy 0, policy_version 73920 (0.0043) [2024-06-21 22:04:31,987][15401] Updated weights for policy 0, policy_version 73930 (0.0040) [2024-06-21 22:04:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42600.2, 300 sec: 42431.8). Total num frames: 1211301888. Throughput: 0: 42677.4. Samples: 1211447900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:04:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 22:04:37,397][15401] Updated weights for policy 0, policy_version 73940 (0.0037) [2024-06-21 22:04:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1211482112. Throughput: 0: 42812.1. Samples: 1211585900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:04:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 22:04:39,721][15401] Updated weights for policy 0, policy_version 73950 (0.0040) [2024-06-21 22:04:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 1211695104. Throughput: 0: 42615.6. Samples: 1211835000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:04:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 22:04:45,139][15401] Updated weights for policy 0, policy_version 73960 (0.0030) [2024-06-21 22:04:47,528][15401] Updated weights for policy 0, policy_version 73970 (0.0039) [2024-06-21 22:04:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 1211940864. Throughput: 0: 42586.6. Samples: 1212084160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:04:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 22:04:52,642][15401] Updated weights for policy 0, policy_version 73980 (0.0026) [2024-06-21 22:04:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1212104704. Throughput: 0: 42625.7. Samples: 1212221920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:04:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 22:04:55,139][15401] Updated weights for policy 0, policy_version 73990 (0.0045) [2024-06-21 22:04:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1212350464. Throughput: 0: 42474.5. Samples: 1212474900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:04:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:05:00,401][15401] Updated weights for policy 0, policy_version 74000 (0.0029) [2024-06-21 22:05:00,889][15349] Signal inference workers to stop experience collection... (17800 times) [2024-06-21 22:05:00,924][15401] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-06-21 22:05:01,006][15349] Signal inference workers to resume experience collection... (17800 times) [2024-06-21 22:05:01,006][15401] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-06-21 22:05:02,972][15401] Updated weights for policy 0, policy_version 74010 (0.0033) [2024-06-21 22:05:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1212579840. Throughput: 0: 42479.4. Samples: 1212721680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:05:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 22:05:08,042][15401] Updated weights for policy 0, policy_version 74020 (0.0042) [2024-06-21 22:05:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1212760064. Throughput: 0: 42386.0. Samples: 1212857380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:05:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 22:05:10,494][15401] Updated weights for policy 0, policy_version 74030 (0.0029) [2024-06-21 22:05:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43145.1, 300 sec: 42542.9). Total num frames: 1213005824. Throughput: 0: 42660.5. Samples: 1213119880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:05:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 22:05:15,558][15401] Updated weights for policy 0, policy_version 74040 (0.0038) [2024-06-21 22:05:18,291][15401] Updated weights for policy 0, policy_version 74050 (0.0027) [2024-06-21 22:05:18,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.4, 300 sec: 42487.7). Total num frames: 1213235200. Throughput: 0: 42786.1. Samples: 1213373280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 22:05:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 22:05:22,944][15401] Updated weights for policy 0, policy_version 74060 (0.0028) [2024-06-21 22:05:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1213415424. Throughput: 0: 42785.2. Samples: 1213511240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:23,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 22:05:25,898][15401] Updated weights for policy 0, policy_version 74070 (0.0034) [2024-06-21 22:05:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1213644800. Throughput: 0: 42735.6. Samples: 1213758100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:05:30,566][15401] Updated weights for policy 0, policy_version 74080 (0.0050) [2024-06-21 22:05:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1213874176. Throughput: 0: 42861.6. Samples: 1214012940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 22:05:33,535][15401] Updated weights for policy 0, policy_version 74090 (0.0024) [2024-06-21 22:05:37,994][15401] Updated weights for policy 0, policy_version 74100 (0.0030) [2024-06-21 22:05:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1214054400. Throughput: 0: 42763.5. Samples: 1214146280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:05:41,286][15401] Updated weights for policy 0, policy_version 74110 (0.0033) [2024-06-21 22:05:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 42654.9). Total num frames: 1214300160. Throughput: 0: 42893.6. Samples: 1214405120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:05:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074115_1214300160.pth... [2024-06-21 22:05:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073491_1204076544.pth [2024-06-21 22:05:45,486][15401] Updated weights for policy 0, policy_version 74120 (0.0032) [2024-06-21 22:05:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1214496768. Throughput: 0: 43127.6. Samples: 1214662420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 22:05:49,028][15401] Updated weights for policy 0, policy_version 74130 (0.0029) [2024-06-21 22:05:53,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43144.3, 300 sec: 42598.3). Total num frames: 1214693376. Throughput: 0: 42906.2. Samples: 1214788180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 22:05:53,489][15401] Updated weights for policy 0, policy_version 74140 (0.0034) [2024-06-21 22:05:56,726][15401] Updated weights for policy 0, policy_version 74150 (0.0041) [2024-06-21 22:05:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1214922752. Throughput: 0: 42753.3. Samples: 1215043780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:05:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-21 22:06:01,584][15401] Updated weights for policy 0, policy_version 74160 (0.0038) [2024-06-21 22:06:03,389][15132] Fps is (10 sec: 44238.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1215135744. Throughput: 0: 42837.5. Samples: 1215300960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-21 22:06:03,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-21 22:06:04,734][15401] Updated weights for policy 0, policy_version 74170 (0.0038) [2024-06-21 22:06:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1215332352. Throughput: 0: 42570.2. Samples: 1215426900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 22:06:09,095][15401] Updated weights for policy 0, policy_version 74180 (0.0032) [2024-06-21 22:06:12,390][15401] Updated weights for policy 0, policy_version 74190 (0.0027) [2024-06-21 22:06:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1215561728. Throughput: 0: 42831.9. Samples: 1215685540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 22:06:16,870][15401] Updated weights for policy 0, policy_version 74200 (0.0037) [2024-06-21 22:06:16,871][15349] Signal inference workers to stop experience collection... (17850 times) [2024-06-21 22:06:16,872][15349] Signal inference workers to resume experience collection... (17850 times) [2024-06-21 22:06:16,910][15401] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-06-21 22:06:16,911][15401] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-06-21 22:06:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1215774720. Throughput: 0: 42654.9. Samples: 1215932400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:06:20,223][15401] Updated weights for policy 0, policy_version 74210 (0.0030) [2024-06-21 22:06:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1215987712. Throughput: 0: 42450.2. Samples: 1216056540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 22:06:24,510][15401] Updated weights for policy 0, policy_version 74220 (0.0051) [2024-06-21 22:06:27,997][15401] Updated weights for policy 0, policy_version 74230 (0.0030) [2024-06-21 22:06:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1216184320. Throughput: 0: 42478.9. Samples: 1216316660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:06:32,066][15401] Updated weights for policy 0, policy_version 74240 (0.0032) [2024-06-21 22:06:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1216430080. Throughput: 0: 42343.6. Samples: 1216567880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 22:06:35,618][15401] Updated weights for policy 0, policy_version 74250 (0.0038) [2024-06-21 22:06:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1216610304. Throughput: 0: 42412.8. Samples: 1216696740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 22:06:39,674][15401] Updated weights for policy 0, policy_version 74260 (0.0032) [2024-06-21 22:06:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1216823296. Throughput: 0: 42415.5. Samples: 1216952480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 22:06:43,597][15401] Updated weights for policy 0, policy_version 74270 (0.0030) [2024-06-21 22:06:47,236][15401] Updated weights for policy 0, policy_version 74280 (0.0041) [2024-06-21 22:06:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1217069056. Throughput: 0: 42241.2. Samples: 1217201820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 22:06:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-21 22:06:51,391][15401] Updated weights for policy 0, policy_version 74290 (0.0031) [2024-06-21 22:06:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42870.0, 300 sec: 42653.6). Total num frames: 1217265664. Throughput: 0: 42559.5. Samples: 1217342180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:06:53,393][15132] Avg episode reward: [(0, '0.745')] [2024-06-21 22:06:54,930][15401] Updated weights for policy 0, policy_version 74300 (0.0032) [2024-06-21 22:06:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1217462272. Throughput: 0: 42287.1. Samples: 1217588460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:06:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 22:06:59,129][15401] Updated weights for policy 0, policy_version 74310 (0.0036) [2024-06-21 22:07:02,670][15401] Updated weights for policy 0, policy_version 74320 (0.0031) [2024-06-21 22:07:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1217708032. Throughput: 0: 42352.4. Samples: 1217838260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 22:07:07,218][15401] Updated weights for policy 0, policy_version 74330 (0.0048) [2024-06-21 22:07:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1217888256. Throughput: 0: 42583.2. Samples: 1217972780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 22:07:10,353][15401] Updated weights for policy 0, policy_version 74340 (0.0034) [2024-06-21 22:07:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1218101248. Throughput: 0: 42290.6. Samples: 1218219740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 22:07:14,703][15401] Updated weights for policy 0, policy_version 74350 (0.0031) [2024-06-21 22:07:17,799][15401] Updated weights for policy 0, policy_version 74360 (0.0037) [2024-06-21 22:07:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1218330624. Throughput: 0: 42246.3. Samples: 1218468960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 22:07:22,252][15401] Updated weights for policy 0, policy_version 74370 (0.0036) [2024-06-21 22:07:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1218510848. Throughput: 0: 42454.1. Samples: 1218607180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 22:07:24,742][15349] Signal inference workers to stop experience collection... (17900 times) [2024-06-21 22:07:24,792][15401] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-06-21 22:07:24,799][15349] Signal inference workers to resume experience collection... (17900 times) [2024-06-21 22:07:24,815][15401] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-06-21 22:07:25,283][15401] Updated weights for policy 0, policy_version 74380 (0.0032) [2024-06-21 22:07:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1218740224. Throughput: 0: 42339.0. Samples: 1218857740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:07:29,902][15401] Updated weights for policy 0, policy_version 74390 (0.0035) [2024-06-21 22:07:33,141][15401] Updated weights for policy 0, policy_version 74400 (0.0026) [2024-06-21 22:07:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1218985984. Throughput: 0: 42446.8. Samples: 1219111920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-21 22:07:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 22:07:37,915][15401] Updated weights for policy 0, policy_version 74410 (0.0035) [2024-06-21 22:07:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1219149824. Throughput: 0: 42236.0. Samples: 1219242700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:07:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 22:07:41,239][15401] Updated weights for policy 0, policy_version 74420 (0.0034) [2024-06-21 22:07:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1219395584. Throughput: 0: 42294.4. Samples: 1219491700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:07:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 22:07:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074426_1219395584.pth... [2024-06-21 22:07:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000073801_1209155584.pth [2024-06-21 22:07:45,411][15401] Updated weights for policy 0, policy_version 74430 (0.0033) [2024-06-21 22:07:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 1219592192. Throughput: 0: 42527.1. Samples: 1219751980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:07:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 22:07:48,809][15401] Updated weights for policy 0, policy_version 74440 (0.0032) [2024-06-21 22:07:53,065][15401] Updated weights for policy 0, policy_version 74450 (0.0029) [2024-06-21 22:07:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 1219788800. Throughput: 0: 42223.1. Samples: 1219872820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:07:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 22:07:56,422][15401] Updated weights for policy 0, policy_version 74460 (0.0029) [2024-06-21 22:07:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1220034560. Throughput: 0: 42403.0. Samples: 1220127880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:07:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-21 22:08:01,110][15401] Updated weights for policy 0, policy_version 74470 (0.0042) [2024-06-21 22:08:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42487.4). Total num frames: 1220214784. Throughput: 0: 42743.2. Samples: 1220392400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:08:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 22:08:04,253][15401] Updated weights for policy 0, policy_version 74480 (0.0027) [2024-06-21 22:08:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1220427776. Throughput: 0: 42360.1. Samples: 1220513380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:08:08,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 22:08:08,629][15401] Updated weights for policy 0, policy_version 74490 (0.0027) [2024-06-21 22:08:11,810][15401] Updated weights for policy 0, policy_version 74500 (0.0027) [2024-06-21 22:08:13,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1220689920. Throughput: 0: 42446.4. Samples: 1220767820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:08:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 22:08:16,173][15401] Updated weights for policy 0, policy_version 74510 (0.0031) [2024-06-21 22:08:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1220853760. Throughput: 0: 42740.5. Samples: 1221035240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-21 22:08:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 22:08:19,448][15401] Updated weights for policy 0, policy_version 74520 (0.0038) [2024-06-21 22:08:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1221066752. Throughput: 0: 42526.3. Samples: 1221156380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 22:08:23,783][15401] Updated weights for policy 0, policy_version 74530 (0.0029) [2024-06-21 22:08:27,045][15401] Updated weights for policy 0, policy_version 74540 (0.0030) [2024-06-21 22:08:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.7, 300 sec: 42654.3). Total num frames: 1221328896. Throughput: 0: 42591.1. Samples: 1221408300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 22:08:31,829][15401] Updated weights for policy 0, policy_version 74550 (0.0047) [2024-06-21 22:08:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 1221492736. Throughput: 0: 42744.8. Samples: 1221675500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 22:08:34,748][15401] Updated weights for policy 0, policy_version 74560 (0.0030) [2024-06-21 22:08:38,269][15349] Signal inference workers to stop experience collection... (17950 times) [2024-06-21 22:08:38,272][15349] Signal inference workers to resume experience collection... (17950 times) [2024-06-21 22:08:38,303][15401] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-06-21 22:08:38,303][15401] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-06-21 22:08:38,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 1221705728. Throughput: 0: 42607.9. Samples: 1221790180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 22:08:39,588][15401] Updated weights for policy 0, policy_version 74570 (0.0028) [2024-06-21 22:08:42,606][15401] Updated weights for policy 0, policy_version 74580 (0.0056) [2024-06-21 22:08:43,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1221967872. Throughput: 0: 42807.6. Samples: 1222054220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-21 22:08:47,127][15401] Updated weights for policy 0, policy_version 74590 (0.0026) [2024-06-21 22:08:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1222115328. Throughput: 0: 42811.0. Samples: 1222318900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 22:08:50,121][15401] Updated weights for policy 0, policy_version 74600 (0.0055) [2024-06-21 22:08:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1222344704. Throughput: 0: 42587.5. Samples: 1222429820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 22:08:54,534][15401] Updated weights for policy 0, policy_version 74610 (0.0033) [2024-06-21 22:08:57,656][15401] Updated weights for policy 0, policy_version 74620 (0.0039) [2024-06-21 22:08:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1222590464. Throughput: 0: 42957.7. Samples: 1222700920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:08:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 22:09:02,072][15401] Updated weights for policy 0, policy_version 74630 (0.0027) [2024-06-21 22:09:03,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.8, 300 sec: 42541.9). Total num frames: 1222787072. Throughput: 0: 42793.4. Samples: 1222961220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:09:03,396][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 22:09:05,387][15401] Updated weights for policy 0, policy_version 74640 (0.0039) [2024-06-21 22:09:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 1223000064. Throughput: 0: 42739.1. Samples: 1223079640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 22:09:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 22:09:09,631][15401] Updated weights for policy 0, policy_version 74650 (0.0026) [2024-06-21 22:09:13,210][15401] Updated weights for policy 0, policy_version 74660 (0.0038) [2024-06-21 22:09:13,390][15132] Fps is (10 sec: 44264.3, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 1223229440. Throughput: 0: 42803.3. Samples: 1223334460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 22:09:17,189][15401] Updated weights for policy 0, policy_version 74670 (0.0027) [2024-06-21 22:09:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1223409664. Throughput: 0: 42628.1. Samples: 1223593760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 22:09:20,817][15401] Updated weights for policy 0, policy_version 74680 (0.0043) [2024-06-21 22:09:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1223655424. Throughput: 0: 42860.9. Samples: 1223718920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 22:09:24,840][15401] Updated weights for policy 0, policy_version 74690 (0.0035) [2024-06-21 22:09:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1223852032. Throughput: 0: 42610.2. Samples: 1223971680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:28,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 22:09:28,707][15401] Updated weights for policy 0, policy_version 74700 (0.0037) [2024-06-21 22:09:32,769][15401] Updated weights for policy 0, policy_version 74710 (0.0033) [2024-06-21 22:09:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 1224048640. Throughput: 0: 42517.5. Samples: 1224232180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:33,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 22:09:36,322][15401] Updated weights for policy 0, policy_version 74720 (0.0027) [2024-06-21 22:09:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1224278016. Throughput: 0: 42849.7. Samples: 1224358060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:09:40,228][15401] Updated weights for policy 0, policy_version 74730 (0.0033) [2024-06-21 22:09:43,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 1224491008. Throughput: 0: 42539.4. Samples: 1224615200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:43,391][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 22:09:43,582][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074738_1224507392.pth... [2024-06-21 22:09:43,631][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074115_1214300160.pth [2024-06-21 22:09:43,943][15401] Updated weights for policy 0, policy_version 74740 (0.0026) [2024-06-21 22:09:48,022][15401] Updated weights for policy 0, policy_version 74750 (0.0038) [2024-06-21 22:09:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1224704000. Throughput: 0: 42419.8. Samples: 1224869840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 22:09:51,416][15349] Signal inference workers to stop experience collection... (18000 times) [2024-06-21 22:09:51,455][15401] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-06-21 22:09:51,465][15349] Signal inference workers to resume experience collection... (18000 times) [2024-06-21 22:09:51,475][15401] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-06-21 22:09:51,604][15401] Updated weights for policy 0, policy_version 74760 (0.0031) [2024-06-21 22:09:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1224916992. Throughput: 0: 42661.1. Samples: 1224999400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-21 22:09:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 22:09:55,571][15401] Updated weights for policy 0, policy_version 74770 (0.0032) [2024-06-21 22:09:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1225129984. Throughput: 0: 42663.7. Samples: 1225254320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:09:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 22:09:59,266][15401] Updated weights for policy 0, policy_version 74780 (0.0026) [2024-06-21 22:10:03,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 1225342976. Throughput: 0: 42675.7. Samples: 1225514160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 22:10:03,508][15401] Updated weights for policy 0, policy_version 74790 (0.0034) [2024-06-21 22:10:06,730][15401] Updated weights for policy 0, policy_version 74800 (0.0036) [2024-06-21 22:10:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1225539584. Throughput: 0: 42714.2. Samples: 1225641060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 22:10:11,133][15401] Updated weights for policy 0, policy_version 74810 (0.0038) [2024-06-21 22:10:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1225785344. Throughput: 0: 42843.6. Samples: 1225899640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 22:10:14,278][15401] Updated weights for policy 0, policy_version 74820 (0.0024) [2024-06-21 22:10:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1225965568. Throughput: 0: 42708.4. Samples: 1226154060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:18,390][15132] Avg episode reward: [(0, '0.157')] [2024-06-21 22:10:18,755][15401] Updated weights for policy 0, policy_version 74830 (0.0027) [2024-06-21 22:10:21,948][15401] Updated weights for policy 0, policy_version 74840 (0.0030) [2024-06-21 22:10:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1226194944. Throughput: 0: 42600.8. Samples: 1226275100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 22:10:26,399][15401] Updated weights for policy 0, policy_version 74850 (0.0040) [2024-06-21 22:10:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1226424320. Throughput: 0: 42681.5. Samples: 1226535860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 22:10:29,482][15401] Updated weights for policy 0, policy_version 74860 (0.0035) [2024-06-21 22:10:33,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 1226604544. Throughput: 0: 42845.7. Samples: 1226798000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:33,393][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 22:10:34,280][15401] Updated weights for policy 0, policy_version 74870 (0.0034) [2024-06-21 22:10:37,329][15401] Updated weights for policy 0, policy_version 74880 (0.0033) [2024-06-21 22:10:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1226850304. Throughput: 0: 42613.8. Samples: 1226917020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-21 22:10:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 22:10:42,013][15401] Updated weights for policy 0, policy_version 74890 (0.0040) [2024-06-21 22:10:43,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1227063296. Throughput: 0: 42674.2. Samples: 1227174660. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:10:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 22:10:45,063][15401] Updated weights for policy 0, policy_version 74900 (0.0029) [2024-06-21 22:10:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 1227227136. Throughput: 0: 42733.3. Samples: 1227437160. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:10:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 22:10:49,742][15401] Updated weights for policy 0, policy_version 74910 (0.0032) [2024-06-21 22:10:53,113][15401] Updated weights for policy 0, policy_version 74920 (0.0035) [2024-06-21 22:10:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1227489280. Throughput: 0: 42539.2. Samples: 1227555320. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:10:53,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-21 22:10:57,456][15401] Updated weights for policy 0, policy_version 74930 (0.0029) [2024-06-21 22:10:58,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1227718656. Throughput: 0: 42655.9. Samples: 1227819160. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:10:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 22:11:00,550][15401] Updated weights for policy 0, policy_version 74940 (0.0037) [2024-06-21 22:11:03,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1227866112. Throughput: 0: 42796.9. Samples: 1228079920. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:11:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 22:11:04,995][15401] Updated weights for policy 0, policy_version 74950 (0.0036) [2024-06-21 22:11:05,692][15349] Signal inference workers to stop experience collection... (18050 times) [2024-06-21 22:11:05,693][15349] Signal inference workers to resume experience collection... (18050 times) [2024-06-21 22:11:05,707][15401] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-06-21 22:11:05,707][15401] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-06-21 22:11:08,212][15401] Updated weights for policy 0, policy_version 74960 (0.0048) [2024-06-21 22:11:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1228144640. Throughput: 0: 42671.6. Samples: 1228195320. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:11:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 22:11:12,468][15401] Updated weights for policy 0, policy_version 74970 (0.0048) [2024-06-21 22:11:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1228341248. Throughput: 0: 42876.5. Samples: 1228465300. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:11:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 22:11:15,764][15401] Updated weights for policy 0, policy_version 74980 (0.0037) [2024-06-21 22:11:18,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1228521472. Throughput: 0: 42724.0. Samples: 1228720480. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:11:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 22:11:20,488][15401] Updated weights for policy 0, policy_version 74990 (0.0034) [2024-06-21 22:11:23,307][15401] Updated weights for policy 0, policy_version 75000 (0.0036) [2024-06-21 22:11:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1228800000. Throughput: 0: 42760.6. Samples: 1228841240. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-21 22:11:23,390][15132] Avg episode reward: [(0, '0.882')] [2024-06-21 22:11:27,880][15401] Updated weights for policy 0, policy_version 75010 (0.0032) [2024-06-21 22:11:28,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1228980224. Throughput: 0: 42848.0. Samples: 1229102820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:28,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 22:11:30,958][15401] Updated weights for policy 0, policy_version 75020 (0.0027) [2024-06-21 22:11:33,389][15132] Fps is (10 sec: 36044.7, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 1229160448. Throughput: 0: 42792.4. Samples: 1229362820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 22:11:35,536][15401] Updated weights for policy 0, policy_version 75030 (0.0041) [2024-06-21 22:11:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1229438976. Throughput: 0: 42931.4. Samples: 1229487240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:38,393][15132] Avg episode reward: [(0, '0.311')] [2024-06-21 22:11:38,669][15401] Updated weights for policy 0, policy_version 75040 (0.0033) [2024-06-21 22:11:43,204][15401] Updated weights for policy 0, policy_version 75050 (0.0033) [2024-06-21 22:11:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1229619200. Throughput: 0: 42730.7. Samples: 1229742040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 22:11:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075050_1229619200.pth... [2024-06-21 22:11:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074426_1219395584.pth [2024-06-21 22:11:46,345][15401] Updated weights for policy 0, policy_version 75060 (0.0042) [2024-06-21 22:11:48,389][15132] Fps is (10 sec: 36045.3, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 1229799424. Throughput: 0: 42620.1. Samples: 1229997820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-21 22:11:51,235][15401] Updated weights for policy 0, policy_version 75070 (0.0042) [2024-06-21 22:11:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1230077952. Throughput: 0: 42821.0. Samples: 1230122260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 22:11:53,978][15401] Updated weights for policy 0, policy_version 75080 (0.0038) [2024-06-21 22:11:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 1230241792. Throughput: 0: 42456.5. Samples: 1230375840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:11:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:11:58,811][15401] Updated weights for policy 0, policy_version 75090 (0.0040) [2024-06-21 22:12:00,581][15349] Signal inference workers to stop experience collection... (18100 times) [2024-06-21 22:12:00,623][15401] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-06-21 22:12:00,631][15349] Signal inference workers to resume experience collection... (18100 times) [2024-06-21 22:12:00,647][15401] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-06-21 22:12:01,565][15401] Updated weights for policy 0, policy_version 75100 (0.0034) [2024-06-21 22:12:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1230454784. Throughput: 0: 42399.1. Samples: 1230628440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:12:03,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 22:12:06,337][15401] Updated weights for policy 0, policy_version 75110 (0.0042) [2024-06-21 22:12:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 1230684160. Throughput: 0: 42618.7. Samples: 1230759080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:12:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-21 22:12:09,809][15401] Updated weights for policy 0, policy_version 75120 (0.0027) [2024-06-21 22:12:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1230864384. Throughput: 0: 42556.8. Samples: 1231017880. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:13,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-21 22:12:13,985][15401] Updated weights for policy 0, policy_version 75130 (0.0038) [2024-06-21 22:12:17,474][15401] Updated weights for policy 0, policy_version 75140 (0.0039) [2024-06-21 22:12:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1231110144. Throughput: 0: 42251.0. Samples: 1231264120. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 22:12:22,043][15401] Updated weights for policy 0, policy_version 75150 (0.0032) [2024-06-21 22:12:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1231339520. Throughput: 0: 42460.9. Samples: 1231397980. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-21 22:12:25,645][15401] Updated weights for policy 0, policy_version 75160 (0.0027) [2024-06-21 22:12:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1231519744. Throughput: 0: 42472.0. Samples: 1231653280. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 22:12:29,574][15401] Updated weights for policy 0, policy_version 75170 (0.0044) [2024-06-21 22:12:33,370][15401] Updated weights for policy 0, policy_version 75180 (0.0032) [2024-06-21 22:12:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1231749120. Throughput: 0: 42369.7. Samples: 1231904460. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:33,396][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 22:12:37,326][15401] Updated weights for policy 0, policy_version 75190 (0.0041) [2024-06-21 22:12:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1231962112. Throughput: 0: 42408.9. Samples: 1232030660. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 22:12:41,236][15401] Updated weights for policy 0, policy_version 75200 (0.0039) [2024-06-21 22:12:43,394][15132] Fps is (10 sec: 40942.7, 60 sec: 42322.4, 300 sec: 42597.8). Total num frames: 1232158720. Throughput: 0: 42470.7. Samples: 1232287200. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:43,394][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 22:12:45,171][15401] Updated weights for policy 0, policy_version 75210 (0.0039) [2024-06-21 22:12:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1232371712. Throughput: 0: 42605.4. Samples: 1232545680. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 22:12:49,125][15401] Updated weights for policy 0, policy_version 75220 (0.0035) [2024-06-21 22:12:52,797][15401] Updated weights for policy 0, policy_version 75230 (0.0046) [2024-06-21 22:12:53,389][15132] Fps is (10 sec: 44255.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1232601088. Throughput: 0: 42447.5. Samples: 1232669220. Policy #0 lag: (min: 2.0, avg: 10.3, max: 23.0) [2024-06-21 22:12:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 22:12:56,827][15401] Updated weights for policy 0, policy_version 75240 (0.0033) [2024-06-21 22:12:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 1232797696. Throughput: 0: 42321.6. Samples: 1232922620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:12:58,397][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 22:13:00,535][15401] Updated weights for policy 0, policy_version 75250 (0.0036) [2024-06-21 22:13:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1232994304. Throughput: 0: 42406.3. Samples: 1233172400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 22:13:04,607][15401] Updated weights for policy 0, policy_version 75260 (0.0032) [2024-06-21 22:13:07,990][15349] Signal inference workers to stop experience collection... (18150 times) [2024-06-21 22:13:08,024][15401] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-06-21 22:13:08,052][15349] Signal inference workers to resume experience collection... (18150 times) [2024-06-21 22:13:08,053][15401] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-06-21 22:13:08,205][15401] Updated weights for policy 0, policy_version 75270 (0.0026) [2024-06-21 22:13:08,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1233223680. Throughput: 0: 42218.6. Samples: 1233297820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 22:13:12,331][15401] Updated weights for policy 0, policy_version 75280 (0.0026) [2024-06-21 22:13:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1233436672. Throughput: 0: 42303.0. Samples: 1233556920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:13,391][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 22:13:16,126][15401] Updated weights for policy 0, policy_version 75290 (0.0039) [2024-06-21 22:13:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1233649664. Throughput: 0: 42192.3. Samples: 1233803120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-21 22:13:20,229][15401] Updated weights for policy 0, policy_version 75300 (0.0036) [2024-06-21 22:13:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 1233846272. Throughput: 0: 42263.9. Samples: 1233932540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-21 22:13:23,853][15401] Updated weights for policy 0, policy_version 75310 (0.0036) [2024-06-21 22:13:28,001][15401] Updated weights for policy 0, policy_version 75320 (0.0024) [2024-06-21 22:13:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1234042880. Throughput: 0: 42326.3. Samples: 1234191700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:28,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 22:13:31,371][15401] Updated weights for policy 0, policy_version 75330 (0.0034) [2024-06-21 22:13:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 1234288640. Throughput: 0: 42043.0. Samples: 1234437720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:33,393][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 22:13:35,524][15401] Updated weights for policy 0, policy_version 75340 (0.0028) [2024-06-21 22:13:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1234485248. Throughput: 0: 42243.0. Samples: 1234570160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 22:13:38,937][15401] Updated weights for policy 0, policy_version 75350 (0.0042) [2024-06-21 22:13:43,386][15401] Updated weights for policy 0, policy_version 75360 (0.0041) [2024-06-21 22:13:43,389][15132] Fps is (10 sec: 40970.7, 60 sec: 42328.4, 300 sec: 42654.0). Total num frames: 1234698240. Throughput: 0: 42376.4. Samples: 1234829280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-21 22:13:43,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-21 22:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075360_1234698240.pth... [2024-06-21 22:13:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000074738_1224507392.pth [2024-06-21 22:13:46,718][15401] Updated weights for policy 0, policy_version 75370 (0.0020) [2024-06-21 22:13:48,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1234944000. Throughput: 0: 42295.9. Samples: 1235075820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:13:48,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-21 22:13:50,994][15401] Updated weights for policy 0, policy_version 75380 (0.0046) [2024-06-21 22:13:53,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1235124224. Throughput: 0: 42420.5. Samples: 1235206740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:13:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 22:13:54,294][15401] Updated weights for policy 0, policy_version 75390 (0.0033) [2024-06-21 22:13:58,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42056.8, 300 sec: 42488.2). Total num frames: 1235320832. Throughput: 0: 42334.4. Samples: 1235461960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:13:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 22:13:58,657][15401] Updated weights for policy 0, policy_version 75400 (0.0037) [2024-06-21 22:14:01,822][15401] Updated weights for policy 0, policy_version 75410 (0.0031) [2024-06-21 22:14:03,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 1235566592. Throughput: 0: 42320.2. Samples: 1235707800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:03,396][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 22:14:06,250][15401] Updated weights for policy 0, policy_version 75420 (0.0034) [2024-06-21 22:14:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 1235730432. Throughput: 0: 42442.3. Samples: 1235842440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 22:14:09,907][15401] Updated weights for policy 0, policy_version 75430 (0.0035) [2024-06-21 22:14:13,390][15132] Fps is (10 sec: 39346.3, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1235959808. Throughput: 0: 42287.8. Samples: 1236094660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 22:14:14,454][15401] Updated weights for policy 0, policy_version 75440 (0.0035) [2024-06-21 22:14:17,718][15401] Updated weights for policy 0, policy_version 75450 (0.0044) [2024-06-21 22:14:18,392][15132] Fps is (10 sec: 49140.2, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 1236221952. Throughput: 0: 42252.5. Samples: 1236339080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:18,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:14:22,083][15401] Updated weights for policy 0, policy_version 75460 (0.0042) [2024-06-21 22:14:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42376.2). Total num frames: 1236353024. Throughput: 0: 42395.7. Samples: 1236477960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 22:14:25,107][15349] Signal inference workers to stop experience collection... (18200 times) [2024-06-21 22:14:25,114][15349] Signal inference workers to resume experience collection... (18200 times) [2024-06-21 22:14:25,146][15401] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-06-21 22:14:25,146][15401] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-06-21 22:14:25,270][15401] Updated weights for policy 0, policy_version 75470 (0.0030) [2024-06-21 22:14:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1236615168. Throughput: 0: 42282.1. Samples: 1236731980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 22:14:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 22:14:29,620][15401] Updated weights for policy 0, policy_version 75480 (0.0037) [2024-06-21 22:14:32,803][15401] Updated weights for policy 0, policy_version 75490 (0.0035) [2024-06-21 22:14:33,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 1236844544. Throughput: 0: 42422.7. Samples: 1236984740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 22:14:37,209][15401] Updated weights for policy 0, policy_version 75500 (0.0032) [2024-06-21 22:14:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1237008384. Throughput: 0: 42374.2. Samples: 1237113580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 22:14:40,551][15401] Updated weights for policy 0, policy_version 75510 (0.0034) [2024-06-21 22:14:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1237254144. Throughput: 0: 42286.6. Samples: 1237364860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 22:14:44,818][15401] Updated weights for policy 0, policy_version 75520 (0.0038) [2024-06-21 22:14:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 1237467136. Throughput: 0: 42647.4. Samples: 1237626660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 22:14:48,669][15401] Updated weights for policy 0, policy_version 75530 (0.0021) [2024-06-21 22:14:52,242][15401] Updated weights for policy 0, policy_version 75540 (0.0024) [2024-06-21 22:14:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1237663744. Throughput: 0: 42609.7. Samples: 1237759880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 22:14:56,135][15401] Updated weights for policy 0, policy_version 75550 (0.0037) [2024-06-21 22:14:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1237893120. Throughput: 0: 42634.0. Samples: 1238013180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:14:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 22:14:59,629][15401] Updated weights for policy 0, policy_version 75560 (0.0031) [2024-06-21 22:15:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 1238106112. Throughput: 0: 43139.7. Samples: 1238280260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:15:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-21 22:15:03,605][15401] Updated weights for policy 0, policy_version 75570 (0.0036) [2024-06-21 22:15:07,057][15401] Updated weights for policy 0, policy_version 75580 (0.0054) [2024-06-21 22:15:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42487.0). Total num frames: 1238319104. Throughput: 0: 42798.6. Samples: 1238404000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:15:08,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 22:15:11,193][15401] Updated weights for policy 0, policy_version 75590 (0.0031) [2024-06-21 22:15:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1238548480. Throughput: 0: 42841.0. Samples: 1238659820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-21 22:15:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 22:15:15,047][15401] Updated weights for policy 0, policy_version 75600 (0.0021) [2024-06-21 22:15:18,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42053.8, 300 sec: 42542.9). Total num frames: 1238745088. Throughput: 0: 43157.1. Samples: 1238926820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:18,396][15132] Avg episode reward: [(0, '0.484')] [2024-06-21 22:15:18,975][15401] Updated weights for policy 0, policy_version 75610 (0.0032) [2024-06-21 22:15:22,481][15401] Updated weights for policy 0, policy_version 75620 (0.0043) [2024-06-21 22:15:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 42542.9). Total num frames: 1238974464. Throughput: 0: 43080.1. Samples: 1239052180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-21 22:15:26,643][15349] Signal inference workers to stop experience collection... (18250 times) [2024-06-21 22:15:26,644][15349] Signal inference workers to resume experience collection... (18250 times) [2024-06-21 22:15:26,659][15401] Updated weights for policy 0, policy_version 75630 (0.0028) [2024-06-21 22:15:26,689][15401] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-06-21 22:15:26,689][15401] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-06-21 22:15:28,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 1239203840. Throughput: 0: 43181.8. Samples: 1239308040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:28,400][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 22:15:30,206][15401] Updated weights for policy 0, policy_version 75640 (0.0038) [2024-06-21 22:15:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1239400448. Throughput: 0: 43154.3. Samples: 1239568600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-21 22:15:34,177][15401] Updated weights for policy 0, policy_version 75650 (0.0027) [2024-06-21 22:15:37,797][15401] Updated weights for policy 0, policy_version 75660 (0.0037) [2024-06-21 22:15:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 1239613440. Throughput: 0: 42944.9. Samples: 1239692400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 22:15:42,058][15401] Updated weights for policy 0, policy_version 75670 (0.0034) [2024-06-21 22:15:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1239826432. Throughput: 0: 43010.6. Samples: 1239948660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 22:15:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075674_1239842816.pth... [2024-06-21 22:15:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075050_1229619200.pth [2024-06-21 22:15:45,336][15401] Updated weights for policy 0, policy_version 75680 (0.0036) [2024-06-21 22:15:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1240023040. Throughput: 0: 42754.3. Samples: 1240204200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-21 22:15:49,894][15401] Updated weights for policy 0, policy_version 75690 (0.0044) [2024-06-21 22:15:53,131][15401] Updated weights for policy 0, policy_version 75700 (0.0041) [2024-06-21 22:15:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 1240268800. Throughput: 0: 42691.9. Samples: 1240325040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 22:15:57,528][15401] Updated weights for policy 0, policy_version 75710 (0.0039) [2024-06-21 22:15:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1240481792. Throughput: 0: 42944.1. Samples: 1240592300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-21 22:15:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 22:16:01,085][15401] Updated weights for policy 0, policy_version 75720 (0.0033) [2024-06-21 22:16:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1240662016. Throughput: 0: 42688.2. Samples: 1240847780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 22:16:05,115][15401] Updated weights for policy 0, policy_version 75730 (0.0034) [2024-06-21 22:16:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 1240891392. Throughput: 0: 42685.8. Samples: 1240973040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 22:16:08,977][15401] Updated weights for policy 0, policy_version 75740 (0.0043) [2024-06-21 22:16:12,761][15401] Updated weights for policy 0, policy_version 75750 (0.0033) [2024-06-21 22:16:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1241120768. Throughput: 0: 42713.8. Samples: 1241230160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 22:16:16,679][15401] Updated weights for policy 0, policy_version 75760 (0.0042) [2024-06-21 22:16:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 1241317376. Throughput: 0: 42622.7. Samples: 1241486620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 22:16:20,423][15401] Updated weights for policy 0, policy_version 75770 (0.0041) [2024-06-21 22:16:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1241530368. Throughput: 0: 42666.3. Samples: 1241612380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 22:16:24,573][15401] Updated weights for policy 0, policy_version 75780 (0.0043) [2024-06-21 22:16:28,106][15401] Updated weights for policy 0, policy_version 75790 (0.0038) [2024-06-21 22:16:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1241743360. Throughput: 0: 42647.2. Samples: 1241867780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:28,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-21 22:16:32,304][15401] Updated weights for policy 0, policy_version 75800 (0.0047) [2024-06-21 22:16:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1241939968. Throughput: 0: 42744.2. Samples: 1242127700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:33,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-21 22:16:35,809][15401] Updated weights for policy 0, policy_version 75810 (0.0032) [2024-06-21 22:16:38,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1242185728. Throughput: 0: 42737.7. Samples: 1242248240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 22:16:39,822][15401] Updated weights for policy 0, policy_version 75820 (0.0026) [2024-06-21 22:16:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1242382336. Throughput: 0: 42666.9. Samples: 1242512320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 22:16:43,402][15401] Updated weights for policy 0, policy_version 75830 (0.0040) [2024-06-21 22:16:47,278][15401] Updated weights for policy 0, policy_version 75840 (0.0031) [2024-06-21 22:16:48,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1242578944. Throughput: 0: 42747.5. Samples: 1242771420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-21 22:16:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 22:16:50,986][15401] Updated weights for policy 0, policy_version 75850 (0.0037) [2024-06-21 22:16:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1242808320. Throughput: 0: 42590.7. Samples: 1242889620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:16:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 22:16:54,829][15401] Updated weights for policy 0, policy_version 75860 (0.0041) [2024-06-21 22:16:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1243021312. Throughput: 0: 42669.8. Samples: 1243150300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:16:58,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 22:16:58,659][15401] Updated weights for policy 0, policy_version 75870 (0.0033) [2024-06-21 22:16:59,590][15349] Signal inference workers to stop experience collection... (18300 times) [2024-06-21 22:16:59,590][15349] Signal inference workers to resume experience collection... (18300 times) [2024-06-21 22:16:59,608][15401] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-06-21 22:16:59,609][15401] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-06-21 22:17:02,555][15401] Updated weights for policy 0, policy_version 75880 (0.0031) [2024-06-21 22:17:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1243217920. Throughput: 0: 42462.6. Samples: 1243397440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 22:17:06,556][15401] Updated weights for policy 0, policy_version 75890 (0.0037) [2024-06-21 22:17:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1243463680. Throughput: 0: 42433.1. Samples: 1243521880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 22:17:10,706][15401] Updated weights for policy 0, policy_version 75900 (0.0030) [2024-06-21 22:17:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1243643904. Throughput: 0: 42554.6. Samples: 1243782740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 22:17:14,253][15401] Updated weights for policy 0, policy_version 75910 (0.0027) [2024-06-21 22:17:18,257][15401] Updated weights for policy 0, policy_version 75920 (0.0046) [2024-06-21 22:17:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1243873280. Throughput: 0: 42367.6. Samples: 1244034240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 22:17:22,063][15401] Updated weights for policy 0, policy_version 75930 (0.0027) [2024-06-21 22:17:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1244102656. Throughput: 0: 42563.8. Samples: 1244163600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:23,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-21 22:17:25,804][15401] Updated weights for policy 0, policy_version 75940 (0.0035) [2024-06-21 22:17:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 1244250112. Throughput: 0: 42484.9. Samples: 1244424140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:28,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-21 22:17:29,661][15401] Updated weights for policy 0, policy_version 75950 (0.0023) [2024-06-21 22:17:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1244512256. Throughput: 0: 42256.4. Samples: 1244673060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-21 22:17:33,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 22:17:33,520][15401] Updated weights for policy 0, policy_version 75960 (0.0043) [2024-06-21 22:17:37,241][15401] Updated weights for policy 0, policy_version 75970 (0.0039) [2024-06-21 22:17:38,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 1244741632. Throughput: 0: 42687.5. Samples: 1244810560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:17:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 22:17:41,126][15401] Updated weights for policy 0, policy_version 75980 (0.0033) [2024-06-21 22:17:43,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1244905472. Throughput: 0: 42330.6. Samples: 1245055180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:17:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 22:17:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075983_1244905472.pth... [2024-06-21 22:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075360_1234698240.pth [2024-06-21 22:17:44,962][15401] Updated weights for policy 0, policy_version 75990 (0.0034) [2024-06-21 22:17:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1245151232. Throughput: 0: 42457.8. Samples: 1245308040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:17:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 22:17:48,779][15401] Updated weights for policy 0, policy_version 76000 (0.0027) [2024-06-21 22:17:52,752][15401] Updated weights for policy 0, policy_version 76010 (0.0038) [2024-06-21 22:17:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 1245364224. Throughput: 0: 42629.8. Samples: 1245440220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:17:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 22:17:56,825][15401] Updated weights for policy 0, policy_version 76020 (0.0034) [2024-06-21 22:17:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1245560832. Throughput: 0: 42385.8. Samples: 1245690100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:17:58,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 22:18:00,451][15401] Updated weights for policy 0, policy_version 76030 (0.0043) [2024-06-21 22:18:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1245790208. Throughput: 0: 42495.0. Samples: 1245946520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:18:03,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-21 22:18:04,225][15349] Signal inference workers to stop experience collection... (18350 times) [2024-06-21 22:18:04,228][15349] Signal inference workers to resume experience collection... (18350 times) [2024-06-21 22:18:04,242][15401] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-06-21 22:18:04,242][15401] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-06-21 22:18:04,393][15401] Updated weights for policy 0, policy_version 76040 (0.0035) [2024-06-21 22:18:08,194][15401] Updated weights for policy 0, policy_version 76050 (0.0032) [2024-06-21 22:18:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1246003200. Throughput: 0: 42487.4. Samples: 1246075540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:18:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 22:18:12,296][15401] Updated weights for policy 0, policy_version 76060 (0.0026) [2024-06-21 22:18:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1246199808. Throughput: 0: 42314.3. Samples: 1246328280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:18:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 22:18:15,991][15401] Updated weights for policy 0, policy_version 76070 (0.0032) [2024-06-21 22:18:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1246412800. Throughput: 0: 42361.0. Samples: 1246579200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 22:18:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 22:18:20,026][15401] Updated weights for policy 0, policy_version 76080 (0.0030) [2024-06-21 22:18:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1246625792. Throughput: 0: 42162.7. Samples: 1246707880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 22:18:23,961][15401] Updated weights for policy 0, policy_version 76090 (0.0043) [2024-06-21 22:18:28,085][15401] Updated weights for policy 0, policy_version 76100 (0.0027) [2024-06-21 22:18:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 1246838784. Throughput: 0: 42305.0. Samples: 1246958900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 22:18:31,615][15401] Updated weights for policy 0, policy_version 76110 (0.0033) [2024-06-21 22:18:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 1247068160. Throughput: 0: 42327.6. Samples: 1247212780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 22:18:35,542][15401] Updated weights for policy 0, policy_version 76120 (0.0036) [2024-06-21 22:18:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1247281152. Throughput: 0: 42271.2. Samples: 1247342420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 22:18:39,281][15401] Updated weights for policy 0, policy_version 76130 (0.0039) [2024-06-21 22:18:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 1247461376. Throughput: 0: 42320.8. Samples: 1247594540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:18:43,574][15401] Updated weights for policy 0, policy_version 76140 (0.0033) [2024-06-21 22:18:47,013][15401] Updated weights for policy 0, policy_version 76150 (0.0036) [2024-06-21 22:18:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1247690752. Throughput: 0: 42153.9. Samples: 1247843440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 22:18:51,239][15401] Updated weights for policy 0, policy_version 76160 (0.0030) [2024-06-21 22:18:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1247903744. Throughput: 0: 42313.0. Samples: 1247979620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 22:18:54,643][15401] Updated weights for policy 0, policy_version 76170 (0.0034) [2024-06-21 22:18:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42432.7). Total num frames: 1248083968. Throughput: 0: 42236.1. Samples: 1248228900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:18:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 22:18:59,120][15401] Updated weights for policy 0, policy_version 76180 (0.0038) [2024-06-21 22:19:02,532][15401] Updated weights for policy 0, policy_version 76190 (0.0042) [2024-06-21 22:19:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1248329728. Throughput: 0: 42243.9. Samples: 1248480180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:19:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 22:19:06,665][15401] Updated weights for policy 0, policy_version 76200 (0.0035) [2024-06-21 22:19:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1248526336. Throughput: 0: 42407.5. Samples: 1248616220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:19:08,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 22:19:10,193][15401] Updated weights for policy 0, policy_version 76210 (0.0036) [2024-06-21 22:19:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42431.8). Total num frames: 1248739328. Throughput: 0: 42277.3. Samples: 1248861480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:13,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 22:19:14,334][15401] Updated weights for policy 0, policy_version 76220 (0.0031) [2024-06-21 22:19:17,955][15401] Updated weights for policy 0, policy_version 76230 (0.0042) [2024-06-21 22:19:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1248968704. Throughput: 0: 42457.4. Samples: 1249123360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 22:19:22,318][15401] Updated weights for policy 0, policy_version 76240 (0.0036) [2024-06-21 22:19:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1249148928. Throughput: 0: 42465.7. Samples: 1249253380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 22:19:25,541][15401] Updated weights for policy 0, policy_version 76250 (0.0037) [2024-06-21 22:19:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1249394688. Throughput: 0: 42357.8. Samples: 1249500640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:28,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 22:19:29,893][15401] Updated weights for policy 0, policy_version 76260 (0.0027) [2024-06-21 22:19:33,215][15401] Updated weights for policy 0, policy_version 76270 (0.0038) [2024-06-21 22:19:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1249607680. Throughput: 0: 42635.4. Samples: 1249762040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-21 22:19:37,420][15401] Updated weights for policy 0, policy_version 76280 (0.0028) [2024-06-21 22:19:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1249804288. Throughput: 0: 42443.7. Samples: 1249889580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 22:19:40,782][15401] Updated weights for policy 0, policy_version 76290 (0.0028) [2024-06-21 22:19:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1250050048. Throughput: 0: 42540.8. Samples: 1250143240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 22:19:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076297_1250050048.pth... [2024-06-21 22:19:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075674_1239842816.pth [2024-06-21 22:19:44,905][15401] Updated weights for policy 0, policy_version 76300 (0.0040) [2024-06-21 22:19:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1250230272. Throughput: 0: 42713.3. Samples: 1250402280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:48,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-21 22:19:49,057][15401] Updated weights for policy 0, policy_version 76310 (0.0029) [2024-06-21 22:19:49,456][15349] Signal inference workers to stop experience collection... (18400 times) [2024-06-21 22:19:49,457][15349] Signal inference workers to resume experience collection... (18400 times) [2024-06-21 22:19:49,497][15401] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-06-21 22:19:49,497][15401] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-06-21 22:19:52,610][15401] Updated weights for policy 0, policy_version 76320 (0.0030) [2024-06-21 22:19:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1250443264. Throughput: 0: 42460.4. Samples: 1250526940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-21 22:19:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 22:19:56,660][15401] Updated weights for policy 0, policy_version 76330 (0.0032) [2024-06-21 22:19:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1250689024. Throughput: 0: 42747.3. Samples: 1250785000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:19:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 22:20:00,259][15401] Updated weights for policy 0, policy_version 76340 (0.0044) [2024-06-21 22:20:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 1250852864. Throughput: 0: 42635.0. Samples: 1251041940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:03,399][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 22:20:04,263][15401] Updated weights for policy 0, policy_version 76350 (0.0038) [2024-06-21 22:20:07,942][15401] Updated weights for policy 0, policy_version 76360 (0.0037) [2024-06-21 22:20:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1251098624. Throughput: 0: 42469.4. Samples: 1251164500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:08,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 22:20:12,262][15401] Updated weights for policy 0, policy_version 76370 (0.0036) [2024-06-21 22:20:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43146.2, 300 sec: 42654.0). Total num frames: 1251328000. Throughput: 0: 42802.3. Samples: 1251426740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:13,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 22:20:15,425][15401] Updated weights for policy 0, policy_version 76380 (0.0030) [2024-06-21 22:20:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1251508224. Throughput: 0: 42633.8. Samples: 1251680560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 22:20:20,064][15401] Updated weights for policy 0, policy_version 76390 (0.0053) [2024-06-21 22:20:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42431.7). Total num frames: 1251721216. Throughput: 0: 42453.4. Samples: 1251800000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 22:20:23,498][15401] Updated weights for policy 0, policy_version 76400 (0.0033) [2024-06-21 22:20:27,686][15401] Updated weights for policy 0, policy_version 76410 (0.0027) [2024-06-21 22:20:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1251966976. Throughput: 0: 42757.8. Samples: 1252067340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 22:20:31,062][15401] Updated weights for policy 0, policy_version 76420 (0.0043) [2024-06-21 22:20:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1252147200. Throughput: 0: 42472.3. Samples: 1252313540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 22:20:35,340][15401] Updated weights for policy 0, policy_version 76430 (0.0035) [2024-06-21 22:20:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1252376576. Throughput: 0: 42502.7. Samples: 1252439560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 22:20:38,390][15132] Avg episode reward: [(0, '0.078')] [2024-06-21 22:20:38,599][15401] Updated weights for policy 0, policy_version 76440 (0.0035) [2024-06-21 22:20:43,109][15401] Updated weights for policy 0, policy_version 76450 (0.0036) [2024-06-21 22:20:43,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 1252573184. Throughput: 0: 42448.3. Samples: 1252695280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:20:43,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 22:20:46,185][15401] Updated weights for policy 0, policy_version 76460 (0.0035) [2024-06-21 22:20:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1252769792. Throughput: 0: 42457.4. Samples: 1252952520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:20:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 22:20:50,709][15401] Updated weights for policy 0, policy_version 76470 (0.0035) [2024-06-21 22:20:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1253015552. Throughput: 0: 42550.7. Samples: 1253079280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:20:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 22:20:53,845][15401] Updated weights for policy 0, policy_version 76480 (0.0033) [2024-06-21 22:20:58,355][15401] Updated weights for policy 0, policy_version 76490 (0.0033) [2024-06-21 22:20:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1253212160. Throughput: 0: 42573.9. Samples: 1253342560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:20:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 22:21:01,577][15401] Updated weights for policy 0, policy_version 76500 (0.0044) [2024-06-21 22:21:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 1253425152. Throughput: 0: 42541.7. Samples: 1253595040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:03,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 22:21:05,978][15401] Updated weights for policy 0, policy_version 76510 (0.0038) [2024-06-21 22:21:07,093][15349] Signal inference workers to stop experience collection... (18450 times) [2024-06-21 22:21:07,147][15349] Signal inference workers to resume experience collection... (18450 times) [2024-06-21 22:21:07,147][15401] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-06-21 22:21:07,172][15401] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-06-21 22:21:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1253670912. Throughput: 0: 42676.7. Samples: 1253720440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 22:21:09,650][15401] Updated weights for policy 0, policy_version 76520 (0.0039) [2024-06-21 22:21:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1253851136. Throughput: 0: 42440.5. Samples: 1253977160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 22:21:13,896][15401] Updated weights for policy 0, policy_version 76530 (0.0042) [2024-06-21 22:21:17,364][15401] Updated weights for policy 0, policy_version 76540 (0.0033) [2024-06-21 22:21:18,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.7, 300 sec: 42431.4). Total num frames: 1254047744. Throughput: 0: 42688.1. Samples: 1254234600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:18,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 22:21:21,698][15401] Updated weights for policy 0, policy_version 76550 (0.0027) [2024-06-21 22:21:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 1254293504. Throughput: 0: 42719.6. Samples: 1254361940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:21:25,804][15401] Updated weights for policy 0, policy_version 76560 (0.0032) [2024-06-21 22:21:28,392][15132] Fps is (10 sec: 42598.4, 60 sec: 41777.6, 300 sec: 42487.0). Total num frames: 1254473728. Throughput: 0: 42560.9. Samples: 1254610520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 22:21:28,393][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 22:21:29,408][15401] Updated weights for policy 0, policy_version 76570 (0.0036) [2024-06-21 22:21:33,277][15401] Updated weights for policy 0, policy_version 76580 (0.0033) [2024-06-21 22:21:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 1254686720. Throughput: 0: 42580.8. Samples: 1254868760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:33,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-21 22:21:36,795][15401] Updated weights for policy 0, policy_version 76590 (0.0031) [2024-06-21 22:21:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1254932480. Throughput: 0: 42580.9. Samples: 1254995420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:38,402][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 22:21:40,729][15401] Updated weights for policy 0, policy_version 76600 (0.0046) [2024-06-21 22:21:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 1255112704. Throughput: 0: 42450.6. Samples: 1255252840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-21 22:21:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076606_1255112704.pth... [2024-06-21 22:21:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000075983_1244905472.pth [2024-06-21 22:21:44,276][15401] Updated weights for policy 0, policy_version 76610 (0.0038) [2024-06-21 22:21:48,072][15401] Updated weights for policy 0, policy_version 76620 (0.0028) [2024-06-21 22:21:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1255342080. Throughput: 0: 42426.8. Samples: 1255504140. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:21:51,763][15401] Updated weights for policy 0, policy_version 76630 (0.0030) [2024-06-21 22:21:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1255555072. Throughput: 0: 42698.2. Samples: 1255641860. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 22:21:55,509][15401] Updated weights for policy 0, policy_version 76640 (0.0041) [2024-06-21 22:21:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1255751680. Throughput: 0: 42577.8. Samples: 1255893160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:21:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 22:21:59,475][15401] Updated weights for policy 0, policy_version 76650 (0.0042) [2024-06-21 22:22:03,395][15132] Fps is (10 sec: 42574.2, 60 sec: 42596.1, 300 sec: 42431.0). Total num frames: 1255981056. Throughput: 0: 42582.2. Samples: 1256150940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:22:03,396][15132] Avg episode reward: [(0, '0.411')] [2024-06-21 22:22:03,542][15401] Updated weights for policy 0, policy_version 76660 (0.0044) [2024-06-21 22:22:07,133][15401] Updated weights for policy 0, policy_version 76670 (0.0037) [2024-06-21 22:22:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1256210432. Throughput: 0: 42746.3. Samples: 1256285520. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:22:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 22:22:10,913][15401] Updated weights for policy 0, policy_version 76680 (0.0045) [2024-06-21 22:22:13,390][15132] Fps is (10 sec: 40983.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1256390656. Throughput: 0: 42864.9. Samples: 1256539340. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-21 22:22:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:22:13,896][15349] Signal inference workers to stop experience collection... (18500 times) [2024-06-21 22:22:13,896][15349] Signal inference workers to resume experience collection... (18500 times) [2024-06-21 22:22:13,920][15401] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-06-21 22:22:13,920][15401] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-06-21 22:22:14,979][15401] Updated weights for policy 0, policy_version 76690 (0.0031) [2024-06-21 22:22:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42487.3). Total num frames: 1256636416. Throughput: 0: 42853.0. Samples: 1256797040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 22:22:18,503][15401] Updated weights for policy 0, policy_version 76700 (0.0021) [2024-06-21 22:22:22,638][15401] Updated weights for policy 0, policy_version 76710 (0.0040) [2024-06-21 22:22:23,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1256849408. Throughput: 0: 42936.4. Samples: 1256927660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:23,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 22:22:26,187][15401] Updated weights for policy 0, policy_version 76720 (0.0025) [2024-06-21 22:22:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42487.7). Total num frames: 1257046016. Throughput: 0: 42753.7. Samples: 1257176760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:28,393][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 22:22:30,180][15401] Updated weights for policy 0, policy_version 76730 (0.0025) [2024-06-21 22:22:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43146.3, 300 sec: 42487.3). Total num frames: 1257275392. Throughput: 0: 42891.1. Samples: 1257434240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 22:22:34,066][15401] Updated weights for policy 0, policy_version 76740 (0.0042) [2024-06-21 22:22:37,682][15401] Updated weights for policy 0, policy_version 76750 (0.0031) [2024-06-21 22:22:38,396][15132] Fps is (10 sec: 44209.1, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 1257488384. Throughput: 0: 42821.1. Samples: 1257569080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:38,396][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 22:22:41,697][15401] Updated weights for policy 0, policy_version 76760 (0.0035) [2024-06-21 22:22:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.7, 300 sec: 42542.5). Total num frames: 1257701376. Throughput: 0: 42858.5. Samples: 1257821900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:43,393][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 22:22:45,832][15401] Updated weights for policy 0, policy_version 76770 (0.0044) [2024-06-21 22:22:48,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1257914368. Throughput: 0: 42778.2. Samples: 1258075720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 22:22:49,382][15401] Updated weights for policy 0, policy_version 76780 (0.0028) [2024-06-21 22:22:53,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1258110976. Throughput: 0: 42610.7. Samples: 1258203000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:53,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 22:22:53,498][15401] Updated weights for policy 0, policy_version 76790 (0.0037) [2024-06-21 22:22:56,950][15401] Updated weights for policy 0, policy_version 76800 (0.0028) [2024-06-21 22:22:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1258340352. Throughput: 0: 42617.0. Samples: 1258457100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 22:22:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 22:23:01,042][15401] Updated weights for policy 0, policy_version 76810 (0.0038) [2024-06-21 22:23:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42875.5, 300 sec: 42542.9). Total num frames: 1258553344. Throughput: 0: 42479.9. Samples: 1258708640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 22:23:04,547][15401] Updated weights for policy 0, policy_version 76820 (0.0043) [2024-06-21 22:23:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1258749952. Throughput: 0: 42417.8. Samples: 1258836360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 22:23:08,906][15401] Updated weights for policy 0, policy_version 76830 (0.0041) [2024-06-21 22:23:12,283][15401] Updated weights for policy 0, policy_version 76840 (0.0035) [2024-06-21 22:23:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1258962944. Throughput: 0: 42499.6. Samples: 1259089240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 22:23:16,452][15401] Updated weights for policy 0, policy_version 76850 (0.0033) [2024-06-21 22:23:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1259192320. Throughput: 0: 42498.2. Samples: 1259346660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 22:23:19,881][15401] Updated weights for policy 0, policy_version 76860 (0.0033) [2024-06-21 22:23:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1259405312. Throughput: 0: 42544.2. Samples: 1259483300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 22:23:24,005][15401] Updated weights for policy 0, policy_version 76870 (0.0028) [2024-06-21 22:23:27,435][15401] Updated weights for policy 0, policy_version 76880 (0.0024) [2024-06-21 22:23:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1259618304. Throughput: 0: 42611.7. Samples: 1259739320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:28,391][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 22:23:31,615][15401] Updated weights for policy 0, policy_version 76890 (0.0036) [2024-06-21 22:23:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1259831296. Throughput: 0: 42665.5. Samples: 1259995660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 22:23:35,122][15349] Signal inference workers to stop experience collection... (18550 times) [2024-06-21 22:23:35,148][15401] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-06-21 22:23:35,176][15349] Signal inference workers to resume experience collection... (18550 times) [2024-06-21 22:23:35,177][15401] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-06-21 22:23:35,319][15401] Updated weights for policy 0, policy_version 76900 (0.0031) [2024-06-21 22:23:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 1260027904. Throughput: 0: 42728.9. Samples: 1260125800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 22:23:39,183][15401] Updated weights for policy 0, policy_version 76910 (0.0030) [2024-06-21 22:23:43,186][15401] Updated weights for policy 0, policy_version 76920 (0.0035) [2024-06-21 22:23:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 1260273664. Throughput: 0: 42840.9. Samples: 1260384940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 22:23:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076921_1260273664.pth... [2024-06-21 22:23:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076297_1250050048.pth [2024-06-21 22:23:47,280][15401] Updated weights for policy 0, policy_version 76930 (0.0039) [2024-06-21 22:23:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1260470272. Throughput: 0: 42689.3. Samples: 1260629660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-21 22:23:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 22:23:50,970][15401] Updated weights for policy 0, policy_version 76940 (0.0030) [2024-06-21 22:23:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1260683264. Throughput: 0: 42660.1. Samples: 1260756060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:23:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 22:23:54,950][15401] Updated weights for policy 0, policy_version 76950 (0.0037) [2024-06-21 22:23:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1260896256. Throughput: 0: 42781.8. Samples: 1261014420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:23:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:23:58,607][15401] Updated weights for policy 0, policy_version 76960 (0.0034) [2024-06-21 22:24:02,511][15401] Updated weights for policy 0, policy_version 76970 (0.0029) [2024-06-21 22:24:03,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1261092864. Throughput: 0: 42677.4. Samples: 1261267160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:03,391][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 22:24:06,199][15401] Updated weights for policy 0, policy_version 76980 (0.0027) [2024-06-21 22:24:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 1261338624. Throughput: 0: 42415.6. Samples: 1261392000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 22:24:10,582][15401] Updated weights for policy 0, policy_version 76990 (0.0028) [2024-06-21 22:24:13,389][15132] Fps is (10 sec: 44238.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1261535232. Throughput: 0: 42497.0. Samples: 1261651680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-21 22:24:13,886][15401] Updated weights for policy 0, policy_version 77000 (0.0038) [2024-06-21 22:24:18,169][15401] Updated weights for policy 0, policy_version 77010 (0.0024) [2024-06-21 22:24:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1261731840. Throughput: 0: 42363.1. Samples: 1261902000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:18,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 22:24:21,752][15401] Updated weights for policy 0, policy_version 77020 (0.0027) [2024-06-21 22:24:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1261961216. Throughput: 0: 42392.0. Samples: 1262033440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 22:24:25,805][15401] Updated weights for policy 0, policy_version 77030 (0.0033) [2024-06-21 22:24:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1262157824. Throughput: 0: 42207.5. Samples: 1262284280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 22:24:29,440][15401] Updated weights for policy 0, policy_version 77040 (0.0036) [2024-06-21 22:24:33,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 1262354432. Throughput: 0: 42575.1. Samples: 1262545540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:24:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 22:24:33,803][15401] Updated weights for policy 0, policy_version 77050 (0.0031) [2024-06-21 22:24:37,395][15401] Updated weights for policy 0, policy_version 77060 (0.0034) [2024-06-21 22:24:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1262600192. Throughput: 0: 42448.8. Samples: 1262666260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:24:38,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-21 22:24:41,405][15401] Updated weights for policy 0, policy_version 77070 (0.0037) [2024-06-21 22:24:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 1262796800. Throughput: 0: 42432.4. Samples: 1262923880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:24:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 22:24:44,921][15401] Updated weights for policy 0, policy_version 77080 (0.0042) [2024-06-21 22:24:48,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1262977024. Throughput: 0: 42521.6. Samples: 1263180620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:24:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 22:24:49,415][15401] Updated weights for policy 0, policy_version 77090 (0.0036) [2024-06-21 22:24:52,546][15401] Updated weights for policy 0, policy_version 77100 (0.0038) [2024-06-21 22:24:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1263206400. Throughput: 0: 42442.2. Samples: 1263301900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:24:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:24:56,982][15401] Updated weights for policy 0, policy_version 77110 (0.0044) [2024-06-21 22:24:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1263435776. Throughput: 0: 42355.9. Samples: 1263557700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:24:58,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 22:25:00,602][15401] Updated weights for policy 0, policy_version 77120 (0.0025) [2024-06-21 22:25:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 1263632384. Throughput: 0: 42424.4. Samples: 1263811100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:25:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 22:25:04,493][15401] Updated weights for policy 0, policy_version 77130 (0.0031) [2024-06-21 22:25:08,110][15401] Updated weights for policy 0, policy_version 77140 (0.0034) [2024-06-21 22:25:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1263861760. Throughput: 0: 42450.5. Samples: 1263943720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:25:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 22:25:11,655][15349] Signal inference workers to stop experience collection... (18600 times) [2024-06-21 22:25:11,684][15401] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-06-21 22:25:11,710][15349] Signal inference workers to resume experience collection... (18600 times) [2024-06-21 22:25:11,716][15401] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-06-21 22:25:12,027][15401] Updated weights for policy 0, policy_version 77150 (0.0025) [2024-06-21 22:25:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1264058368. Throughput: 0: 42442.6. Samples: 1264194200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:25:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 22:25:15,837][15401] Updated weights for policy 0, policy_version 77160 (0.0043) [2024-06-21 22:25:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1264287744. Throughput: 0: 42351.2. Samples: 1264451340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:25:18,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-21 22:25:19,589][15401] Updated weights for policy 0, policy_version 77170 (0.0023) [2024-06-21 22:25:23,370][15401] Updated weights for policy 0, policy_version 77180 (0.0026) [2024-06-21 22:25:23,395][15132] Fps is (10 sec: 45849.0, 60 sec: 42594.2, 300 sec: 42542.0). Total num frames: 1264517120. Throughput: 0: 42689.7. Samples: 1264587540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:25:23,396][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 22:25:27,506][15401] Updated weights for policy 0, policy_version 77190 (0.0032) [2024-06-21 22:25:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1264697344. Throughput: 0: 42619.2. Samples: 1264841740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 22:25:30,962][15401] Updated weights for policy 0, policy_version 77200 (0.0039) [2024-06-21 22:25:33,389][15132] Fps is (10 sec: 42623.5, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1264943104. Throughput: 0: 42457.5. Samples: 1265091200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 22:25:35,060][15401] Updated weights for policy 0, policy_version 77210 (0.0041) [2024-06-21 22:25:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 1265139712. Throughput: 0: 42769.9. Samples: 1265226540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-21 22:25:38,793][15401] Updated weights for policy 0, policy_version 77220 (0.0044) [2024-06-21 22:25:43,090][15401] Updated weights for policy 0, policy_version 77230 (0.0035) [2024-06-21 22:25:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1265336320. Throughput: 0: 42602.4. Samples: 1265474800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-21 22:25:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077230_1265336320.pth... [2024-06-21 22:25:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076606_1255112704.pth [2024-06-21 22:25:46,391][15401] Updated weights for policy 0, policy_version 77240 (0.0034) [2024-06-21 22:25:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43415.9, 300 sec: 42598.1). Total num frames: 1265582080. Throughput: 0: 42569.8. Samples: 1265726840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:48,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-21 22:25:50,604][15401] Updated weights for policy 0, policy_version 77250 (0.0038) [2024-06-21 22:25:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1265762304. Throughput: 0: 42801.7. Samples: 1265869800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:53,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-21 22:25:54,013][15401] Updated weights for policy 0, policy_version 77260 (0.0031) [2024-06-21 22:25:58,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42323.7, 300 sec: 42542.9). Total num frames: 1265975296. Throughput: 0: 42638.2. Samples: 1266113020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:25:58,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 22:25:58,686][15401] Updated weights for policy 0, policy_version 77270 (0.0044) [2024-06-21 22:26:01,651][15401] Updated weights for policy 0, policy_version 77280 (0.0029) [2024-06-21 22:26:03,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1266237440. Throughput: 0: 42517.8. Samples: 1266364640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:26:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 22:26:06,323][15401] Updated weights for policy 0, policy_version 77290 (0.0027) [2024-06-21 22:26:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1266401280. Throughput: 0: 42642.9. Samples: 1266506220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-21 22:26:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 22:26:09,319][15401] Updated weights for policy 0, policy_version 77300 (0.0034) [2024-06-21 22:26:10,057][15349] Signal inference workers to stop experience collection... (18650 times) [2024-06-21 22:26:10,103][15401] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-06-21 22:26:10,114][15349] Signal inference workers to resume experience collection... (18650 times) [2024-06-21 22:26:10,120][15401] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-06-21 22:26:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1266630656. Throughput: 0: 42591.1. Samples: 1266758340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 22:26:13,949][15401] Updated weights for policy 0, policy_version 77310 (0.0040) [2024-06-21 22:26:17,094][15401] Updated weights for policy 0, policy_version 77320 (0.0043) [2024-06-21 22:26:18,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1266892800. Throughput: 0: 42534.5. Samples: 1267005260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 22:26:21,601][15401] Updated weights for policy 0, policy_version 77330 (0.0033) [2024-06-21 22:26:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42056.3, 300 sec: 42598.8). Total num frames: 1267040256. Throughput: 0: 42552.4. Samples: 1267141400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 22:26:24,963][15401] Updated weights for policy 0, policy_version 77340 (0.0034) [2024-06-21 22:26:28,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1267269632. Throughput: 0: 42604.7. Samples: 1267392020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 22:26:29,105][15401] Updated weights for policy 0, policy_version 77350 (0.0040) [2024-06-21 22:26:32,773][15401] Updated weights for policy 0, policy_version 77360 (0.0029) [2024-06-21 22:26:33,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1267515392. Throughput: 0: 42661.2. Samples: 1267646500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:33,391][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 22:26:36,705][15401] Updated weights for policy 0, policy_version 77370 (0.0052) [2024-06-21 22:26:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1267679232. Throughput: 0: 42422.7. Samples: 1267778820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 22:26:40,402][15401] Updated weights for policy 0, policy_version 77380 (0.0031) [2024-06-21 22:26:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1267924992. Throughput: 0: 42537.0. Samples: 1268027080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:43,396][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 22:26:44,243][15401] Updated weights for policy 0, policy_version 77390 (0.0032) [2024-06-21 22:26:48,067][15401] Updated weights for policy 0, policy_version 77400 (0.0029) [2024-06-21 22:26:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1268137984. Throughput: 0: 42685.9. Samples: 1268285500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 22:26:52,091][15401] Updated weights for policy 0, policy_version 77410 (0.0034) [2024-06-21 22:26:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 1268318208. Throughput: 0: 42345.4. Samples: 1268411760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 22:26:55,560][15401] Updated weights for policy 0, policy_version 77420 (0.0035) [2024-06-21 22:26:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43419.4, 300 sec: 42710.3). Total num frames: 1268580352. Throughput: 0: 42445.9. Samples: 1268668400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-21 22:26:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 22:26:59,789][15401] Updated weights for policy 0, policy_version 77430 (0.0026) [2024-06-21 22:27:03,336][15401] Updated weights for policy 0, policy_version 77440 (0.0025) [2024-06-21 22:27:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1268776960. Throughput: 0: 42717.5. Samples: 1268927540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 22:27:07,683][15401] Updated weights for policy 0, policy_version 77450 (0.0029) [2024-06-21 22:27:08,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1268957184. Throughput: 0: 42502.6. Samples: 1269054020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:08,391][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 22:27:09,196][15349] Signal inference workers to stop experience collection... (18700 times) [2024-06-21 22:27:09,197][15349] Signal inference workers to resume experience collection... (18700 times) [2024-06-21 22:27:09,217][15401] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-06-21 22:27:09,217][15401] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-06-21 22:27:10,813][15401] Updated weights for policy 0, policy_version 77460 (0.0045) [2024-06-21 22:27:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1269235712. Throughput: 0: 42614.4. Samples: 1269309660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 22:27:15,506][15401] Updated weights for policy 0, policy_version 77470 (0.0038) [2024-06-21 22:27:18,303][15401] Updated weights for policy 0, policy_version 77480 (0.0035) [2024-06-21 22:27:18,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1269432320. Throughput: 0: 42678.4. Samples: 1269567020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 22:27:23,054][15401] Updated weights for policy 0, policy_version 77490 (0.0042) [2024-06-21 22:27:23,393][15132] Fps is (10 sec: 36030.2, 60 sec: 42595.5, 300 sec: 42542.3). Total num frames: 1269596160. Throughput: 0: 42630.6. Samples: 1269697360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:23,394][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 22:27:26,088][15401] Updated weights for policy 0, policy_version 77500 (0.0028) [2024-06-21 22:27:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1269841920. Throughput: 0: 42811.5. Samples: 1269953600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 22:27:30,586][15401] Updated weights for policy 0, policy_version 77510 (0.0040) [2024-06-21 22:27:33,389][15132] Fps is (10 sec: 45893.6, 60 sec: 42325.5, 300 sec: 42599.3). Total num frames: 1270054912. Throughput: 0: 42868.4. Samples: 1270214580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:33,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 22:27:33,625][15401] Updated weights for policy 0, policy_version 77520 (0.0029) [2024-06-21 22:27:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 1270235136. Throughput: 0: 42862.5. Samples: 1270340580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 22:27:38,457][15401] Updated weights for policy 0, policy_version 77530 (0.0038) [2024-06-21 22:27:41,850][15401] Updated weights for policy 0, policy_version 77540 (0.0031) [2024-06-21 22:27:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1270497280. Throughput: 0: 42707.9. Samples: 1270590260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 22:27:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 22:27:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077545_1270497280.pth... [2024-06-21 22:27:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000076921_1260273664.pth [2024-06-21 22:27:46,104][15401] Updated weights for policy 0, policy_version 77550 (0.0041) [2024-06-21 22:27:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1270677504. Throughput: 0: 42670.2. Samples: 1270847700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:27:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-21 22:27:49,634][15401] Updated weights for policy 0, policy_version 77560 (0.0030) [2024-06-21 22:27:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1270874112. Throughput: 0: 42517.8. Samples: 1270967320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:27:53,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-21 22:27:53,933][15401] Updated weights for policy 0, policy_version 77570 (0.0037) [2024-06-21 22:27:57,053][15401] Updated weights for policy 0, policy_version 77580 (0.0035) [2024-06-21 22:27:58,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42047.7, 300 sec: 42542.0). Total num frames: 1271103488. Throughput: 0: 42438.4. Samples: 1271219660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:27:58,396][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 22:28:01,596][15401] Updated weights for policy 0, policy_version 77590 (0.0031) [2024-06-21 22:28:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1271300096. Throughput: 0: 42621.4. Samples: 1271484980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 22:28:04,662][15401] Updated weights for policy 0, policy_version 77600 (0.0028) [2024-06-21 22:28:08,389][15132] Fps is (10 sec: 39346.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1271496704. Throughput: 0: 42359.8. Samples: 1271603380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 22:28:09,507][15401] Updated weights for policy 0, policy_version 77610 (0.0038) [2024-06-21 22:28:12,275][15401] Updated weights for policy 0, policy_version 77620 (0.0034) [2024-06-21 22:28:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1271742464. Throughput: 0: 42230.2. Samples: 1271853960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 22:28:17,124][15401] Updated weights for policy 0, policy_version 77630 (0.0041) [2024-06-21 22:28:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1271939072. Throughput: 0: 42247.2. Samples: 1272115700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 22:28:20,268][15401] Updated weights for policy 0, policy_version 77640 (0.0034) [2024-06-21 22:28:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42601.2, 300 sec: 42487.3). Total num frames: 1272152064. Throughput: 0: 42104.1. Samples: 1272235260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 22:28:24,846][15401] Updated weights for policy 0, policy_version 77650 (0.0036) [2024-06-21 22:28:25,761][15349] Signal inference workers to stop experience collection... (18750 times) [2024-06-21 22:28:25,762][15349] Signal inference workers to resume experience collection... (18750 times) [2024-06-21 22:28:25,777][15401] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-06-21 22:28:25,803][15401] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-06-21 22:28:28,002][15401] Updated weights for policy 0, policy_version 77660 (0.0023) [2024-06-21 22:28:28,394][15132] Fps is (10 sec: 45851.8, 60 sec: 42594.8, 300 sec: 42597.7). Total num frames: 1272397824. Throughput: 0: 42223.3. Samples: 1272490520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:28,395][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:28:32,669][15401] Updated weights for policy 0, policy_version 77670 (0.0037) [2024-06-21 22:28:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1272561664. Throughput: 0: 42145.4. Samples: 1272744240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-21 22:28:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 22:28:36,027][15401] Updated weights for policy 0, policy_version 77680 (0.0044) [2024-06-21 22:28:38,390][15132] Fps is (10 sec: 37701.9, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1272774656. Throughput: 0: 42244.4. Samples: 1272868320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:28:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 22:28:40,302][15401] Updated weights for policy 0, policy_version 77690 (0.0030) [2024-06-21 22:28:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1273020416. Throughput: 0: 42266.8. Samples: 1273121400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:28:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 22:28:43,773][15401] Updated weights for policy 0, policy_version 77700 (0.0025) [2024-06-21 22:28:48,346][15401] Updated weights for policy 0, policy_version 77710 (0.0033) [2024-06-21 22:28:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1273200640. Throughput: 0: 42194.8. Samples: 1273383740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:28:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 22:28:51,350][15401] Updated weights for policy 0, policy_version 77720 (0.0029) [2024-06-21 22:28:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1273413632. Throughput: 0: 42253.2. Samples: 1273504780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:28:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 22:28:55,916][15401] Updated weights for policy 0, policy_version 77730 (0.0034) [2024-06-21 22:28:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 1273659392. Throughput: 0: 42372.4. Samples: 1273760720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:28:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 22:28:58,940][15401] Updated weights for policy 0, policy_version 77740 (0.0035) [2024-06-21 22:29:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1273839616. Throughput: 0: 42330.6. Samples: 1274020580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:29:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-21 22:29:03,524][15401] Updated weights for policy 0, policy_version 77750 (0.0038) [2024-06-21 22:29:06,520][15401] Updated weights for policy 0, policy_version 77760 (0.0022) [2024-06-21 22:29:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1274068992. Throughput: 0: 42297.4. Samples: 1274138640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:29:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:29:11,473][15401] Updated weights for policy 0, policy_version 77770 (0.0047) [2024-06-21 22:29:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1274298368. Throughput: 0: 42395.7. Samples: 1274398120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:29:13,396][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 22:29:14,162][15401] Updated weights for policy 0, policy_version 77780 (0.0039) [2024-06-21 22:29:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1274462208. Throughput: 0: 42521.7. Samples: 1274657720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:29:18,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-21 22:29:19,174][15401] Updated weights for policy 0, policy_version 77790 (0.0039) [2024-06-21 22:29:21,959][15401] Updated weights for policy 0, policy_version 77800 (0.0026) [2024-06-21 22:29:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1274707968. Throughput: 0: 42298.4. Samples: 1274771740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 22:29:26,858][15401] Updated weights for policy 0, policy_version 77810 (0.0034) [2024-06-21 22:29:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42055.7, 300 sec: 42598.4). Total num frames: 1274920960. Throughput: 0: 42581.3. Samples: 1275037560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:28,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 22:29:29,750][15401] Updated weights for policy 0, policy_version 77820 (0.0033) [2024-06-21 22:29:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1275101184. Throughput: 0: 42484.8. Samples: 1275295560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 22:29:34,323][15401] Updated weights for policy 0, policy_version 77830 (0.0032) [2024-06-21 22:29:36,917][15349] Signal inference workers to stop experience collection... (18800 times) [2024-06-21 22:29:36,918][15349] Signal inference workers to resume experience collection... (18800 times) [2024-06-21 22:29:36,967][15401] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-06-21 22:29:36,967][15401] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-06-21 22:29:37,626][15401] Updated weights for policy 0, policy_version 77840 (0.0031) [2024-06-21 22:29:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1275346944. Throughput: 0: 42518.8. Samples: 1275418120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-21 22:29:41,935][15401] Updated weights for policy 0, policy_version 77850 (0.0032) [2024-06-21 22:29:43,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 1275543552. Throughput: 0: 42474.1. Samples: 1275672160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:43,393][15132] Avg episode reward: [(0, '0.455')] [2024-06-21 22:29:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077853_1275543552.pth... [2024-06-21 22:29:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077230_1265336320.pth [2024-06-21 22:29:45,451][15401] Updated weights for policy 0, policy_version 77860 (0.0029) [2024-06-21 22:29:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1275740160. Throughput: 0: 42504.4. Samples: 1275933280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 22:29:49,561][15401] Updated weights for policy 0, policy_version 77870 (0.0030) [2024-06-21 22:29:52,978][15401] Updated weights for policy 0, policy_version 77880 (0.0032) [2024-06-21 22:29:53,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1275985920. Throughput: 0: 42754.5. Samples: 1276062600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 22:29:57,198][15401] Updated weights for policy 0, policy_version 77890 (0.0038) [2024-06-21 22:29:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1276198912. Throughput: 0: 42696.0. Samples: 1276319440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:29:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 22:30:00,660][15401] Updated weights for policy 0, policy_version 77900 (0.0038) [2024-06-21 22:30:03,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1276395520. Throughput: 0: 42643.2. Samples: 1276576660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:30:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 22:30:04,802][15401] Updated weights for policy 0, policy_version 77910 (0.0030) [2024-06-21 22:30:08,349][15401] Updated weights for policy 0, policy_version 77920 (0.0029) [2024-06-21 22:30:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1276641280. Throughput: 0: 42872.3. Samples: 1276701000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-21 22:30:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-21 22:30:12,441][15401] Updated weights for policy 0, policy_version 77930 (0.0042) [2024-06-21 22:30:13,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42320.9, 300 sec: 42542.0). Total num frames: 1276837888. Throughput: 0: 42570.9. Samples: 1276953520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:13,397][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 22:30:16,090][15401] Updated weights for policy 0, policy_version 77940 (0.0035) [2024-06-21 22:30:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42377.1). Total num frames: 1277018112. Throughput: 0: 42679.2. Samples: 1277216120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 22:30:20,083][15401] Updated weights for policy 0, policy_version 77950 (0.0030) [2024-06-21 22:30:23,395][15132] Fps is (10 sec: 42601.5, 60 sec: 42594.3, 300 sec: 42597.6). Total num frames: 1277263872. Throughput: 0: 42619.5. Samples: 1277336240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:23,396][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 22:30:23,931][15401] Updated weights for policy 0, policy_version 77960 (0.0032) [2024-06-21 22:30:27,587][15401] Updated weights for policy 0, policy_version 77970 (0.0032) [2024-06-21 22:30:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1277493248. Throughput: 0: 42806.3. Samples: 1277598340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-21 22:30:31,771][15401] Updated weights for policy 0, policy_version 77980 (0.0032) [2024-06-21 22:30:33,389][15132] Fps is (10 sec: 40983.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1277673472. Throughput: 0: 42702.2. Samples: 1277854880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 22:30:35,074][15401] Updated weights for policy 0, policy_version 77990 (0.0042) [2024-06-21 22:30:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1277886464. Throughput: 0: 42561.4. Samples: 1277977860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-21 22:30:39,479][15401] Updated weights for policy 0, policy_version 78000 (0.0039) [2024-06-21 22:30:42,754][15401] Updated weights for policy 0, policy_version 78010 (0.0033) [2024-06-21 22:30:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42543.2). Total num frames: 1278132224. Throughput: 0: 42645.5. Samples: 1278238480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 22:30:47,234][15401] Updated weights for policy 0, policy_version 78020 (0.0047) [2024-06-21 22:30:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1278328832. Throughput: 0: 42619.9. Samples: 1278494560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:48,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-21 22:30:50,482][15401] Updated weights for policy 0, policy_version 78030 (0.0030) [2024-06-21 22:30:51,097][15349] Signal inference workers to stop experience collection... (18850 times) [2024-06-21 22:30:51,132][15401] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-06-21 22:30:51,156][15349] Signal inference workers to resume experience collection... (18850 times) [2024-06-21 22:30:51,160][15401] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-06-21 22:30:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42543.2). Total num frames: 1278525440. Throughput: 0: 42613.4. Samples: 1278618600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-21 22:30:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 22:30:54,742][15401] Updated weights for policy 0, policy_version 78040 (0.0043) [2024-06-21 22:30:57,991][15401] Updated weights for policy 0, policy_version 78050 (0.0047) [2024-06-21 22:30:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 1278787584. Throughput: 0: 42870.2. Samples: 1278882400. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:30:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 22:31:02,632][15401] Updated weights for policy 0, policy_version 78060 (0.0040) [2024-06-21 22:31:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1278951424. Throughput: 0: 42711.4. Samples: 1279138140. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 22:31:05,773][15401] Updated weights for policy 0, policy_version 78070 (0.0032) [2024-06-21 22:31:08,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1279180800. Throughput: 0: 42693.6. Samples: 1279257220. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-21 22:31:10,239][15401] Updated weights for policy 0, policy_version 78080 (0.0034) [2024-06-21 22:31:13,352][15401] Updated weights for policy 0, policy_version 78090 (0.0025) [2024-06-21 22:31:13,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43149.1, 300 sec: 42487.3). Total num frames: 1279426560. Throughput: 0: 42703.6. Samples: 1279520000. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 22:31:18,227][15401] Updated weights for policy 0, policy_version 78100 (0.0047) [2024-06-21 22:31:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1279590400. Throughput: 0: 42720.7. Samples: 1279777320. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 22:31:21,122][15401] Updated weights for policy 0, policy_version 78110 (0.0034) [2024-06-21 22:31:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42875.5, 300 sec: 42598.4). Total num frames: 1279836160. Throughput: 0: 42613.4. Samples: 1279895460. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 22:31:25,916][15401] Updated weights for policy 0, policy_version 78120 (0.0047) [2024-06-21 22:31:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 1280049152. Throughput: 0: 42644.0. Samples: 1280157460. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 22:31:28,797][15401] Updated weights for policy 0, policy_version 78130 (0.0047) [2024-06-21 22:31:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1280212992. Throughput: 0: 42694.6. Samples: 1280415820. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:31:33,554][15401] Updated weights for policy 0, policy_version 78140 (0.0032) [2024-06-21 22:31:36,511][15401] Updated weights for policy 0, policy_version 78150 (0.0035) [2024-06-21 22:31:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1280475136. Throughput: 0: 42497.7. Samples: 1280531000. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 22:31:41,344][15401] Updated weights for policy 0, policy_version 78160 (0.0038) [2024-06-21 22:31:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1280671744. Throughput: 0: 42608.8. Samples: 1280799800. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-06-21 22:31:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 22:31:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078166_1280671744.pth... [2024-06-21 22:31:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077545_1270497280.pth [2024-06-21 22:31:44,174][15401] Updated weights for policy 0, policy_version 78170 (0.0025) [2024-06-21 22:31:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1280868352. Throughput: 0: 42350.2. Samples: 1281043900. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:31:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 22:31:49,102][15401] Updated weights for policy 0, policy_version 78180 (0.0032) [2024-06-21 22:31:52,034][15401] Updated weights for policy 0, policy_version 78190 (0.0039) [2024-06-21 22:31:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1281097728. Throughput: 0: 42492.2. Samples: 1281169360. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:31:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 22:31:56,734][15401] Updated weights for policy 0, policy_version 78200 (0.0035) [2024-06-21 22:31:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.1, 300 sec: 42376.2). Total num frames: 1281277952. Throughput: 0: 42442.3. Samples: 1281429900. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:31:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 22:31:59,844][15401] Updated weights for policy 0, policy_version 78210 (0.0026) [2024-06-21 22:32:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1281507328. Throughput: 0: 42331.2. Samples: 1281682220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 22:32:04,469][15401] Updated weights for policy 0, policy_version 78220 (0.0029) [2024-06-21 22:32:05,592][15349] Signal inference workers to stop experience collection... (18900 times) [2024-06-21 22:32:05,643][15401] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-06-21 22:32:05,645][15349] Signal inference workers to resume experience collection... (18900 times) [2024-06-21 22:32:05,653][15401] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-06-21 22:32:07,527][15401] Updated weights for policy 0, policy_version 78230 (0.0026) [2024-06-21 22:32:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1281736704. Throughput: 0: 42518.6. Samples: 1281808800. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 22:32:12,380][15401] Updated weights for policy 0, policy_version 78240 (0.0023) [2024-06-21 22:32:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1281933312. Throughput: 0: 42497.7. Samples: 1282069860. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 22:32:15,466][15401] Updated weights for policy 0, policy_version 78250 (0.0039) [2024-06-21 22:32:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 1282146304. Throughput: 0: 42269.7. Samples: 1282317960. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 22:32:20,011][15401] Updated weights for policy 0, policy_version 78260 (0.0034) [2024-06-21 22:32:23,336][15401] Updated weights for policy 0, policy_version 78270 (0.0043) [2024-06-21 22:32:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1282375680. Throughput: 0: 42535.6. Samples: 1282445100. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 22:32:27,680][15401] Updated weights for policy 0, policy_version 78280 (0.0032) [2024-06-21 22:32:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 42375.9). Total num frames: 1282555904. Throughput: 0: 42274.7. Samples: 1282702260. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-21 22:32:28,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 22:32:31,104][15401] Updated weights for policy 0, policy_version 78290 (0.0046) [2024-06-21 22:32:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1282801664. Throughput: 0: 42319.2. Samples: 1282948260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 22:32:35,394][15401] Updated weights for policy 0, policy_version 78300 (0.0034) [2024-06-21 22:32:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1282998272. Throughput: 0: 42539.1. Samples: 1283083620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 22:32:38,772][15401] Updated weights for policy 0, policy_version 78310 (0.0032) [2024-06-21 22:32:43,315][15401] Updated weights for policy 0, policy_version 78320 (0.0034) [2024-06-21 22:32:43,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42052.0, 300 sec: 42431.7). Total num frames: 1283194880. Throughput: 0: 42385.4. Samples: 1283337260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 22:32:46,238][15401] Updated weights for policy 0, policy_version 78330 (0.0034) [2024-06-21 22:32:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1283440640. Throughput: 0: 42371.6. Samples: 1283588940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:48,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-21 22:32:50,941][15401] Updated weights for policy 0, policy_version 78340 (0.0026) [2024-06-21 22:32:53,389][15132] Fps is (10 sec: 45876.9, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 1283653632. Throughput: 0: 42626.8. Samples: 1283727000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 22:32:54,191][15401] Updated weights for policy 0, policy_version 78350 (0.0029) [2024-06-21 22:32:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1283833856. Throughput: 0: 42316.6. Samples: 1283974100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:32:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 22:32:58,478][15401] Updated weights for policy 0, policy_version 78360 (0.0034) [2024-06-21 22:33:01,649][15401] Updated weights for policy 0, policy_version 78370 (0.0037) [2024-06-21 22:33:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1284079616. Throughput: 0: 42575.2. Samples: 1284233840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:33:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 22:33:05,926][15401] Updated weights for policy 0, policy_version 78380 (0.0031) [2024-06-21 22:33:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1284292608. Throughput: 0: 42626.3. Samples: 1284363280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:33:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 22:33:09,816][15401] Updated weights for policy 0, policy_version 78390 (0.0027) [2024-06-21 22:33:13,313][15401] Updated weights for policy 0, policy_version 78400 (0.0030) [2024-06-21 22:33:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1284505600. Throughput: 0: 42498.7. Samples: 1284614600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:33:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 22:33:17,350][15401] Updated weights for policy 0, policy_version 78410 (0.0024) [2024-06-21 22:33:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1284718592. Throughput: 0: 42728.9. Samples: 1284871160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-21 22:33:18,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 22:33:21,122][15401] Updated weights for policy 0, policy_version 78420 (0.0034) [2024-06-21 22:33:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42432.5). Total num frames: 1284915200. Throughput: 0: 42598.2. Samples: 1285000540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:23,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-21 22:33:24,876][15401] Updated weights for policy 0, policy_version 78430 (0.0028) [2024-06-21 22:33:26,053][15349] Signal inference workers to stop experience collection... (18950 times) [2024-06-21 22:33:26,053][15349] Signal inference workers to resume experience collection... (18950 times) [2024-06-21 22:33:26,069][15401] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-06-21 22:33:26,069][15401] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-06-21 22:33:28,392][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.6). Total num frames: 1285144576. Throughput: 0: 42632.2. Samples: 1285255800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:28,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 22:33:28,587][15401] Updated weights for policy 0, policy_version 78440 (0.0027) [2024-06-21 22:33:32,358][15401] Updated weights for policy 0, policy_version 78450 (0.0031) [2024-06-21 22:33:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1285341184. Throughput: 0: 42790.3. Samples: 1285514500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 22:33:36,374][15401] Updated weights for policy 0, policy_version 78460 (0.0036) [2024-06-21 22:33:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1285570560. Throughput: 0: 42544.4. Samples: 1285641500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:33:40,136][15401] Updated weights for policy 0, policy_version 78470 (0.0033) [2024-06-21 22:33:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.9, 300 sec: 42653.9). Total num frames: 1285783552. Throughput: 0: 42675.1. Samples: 1285894480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:43,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-21 22:33:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078478_1285783552.pth... [2024-06-21 22:33:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000077853_1275543552.pth [2024-06-21 22:33:44,229][15401] Updated weights for policy 0, policy_version 78480 (0.0025) [2024-06-21 22:33:48,198][15401] Updated weights for policy 0, policy_version 78490 (0.0037) [2024-06-21 22:33:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1285996544. Throughput: 0: 42595.7. Samples: 1286150640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:48,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-21 22:33:51,868][15401] Updated weights for policy 0, policy_version 78500 (0.0038) [2024-06-21 22:33:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1286193152. Throughput: 0: 42415.9. Samples: 1286272000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:53,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 22:33:55,921][15401] Updated weights for policy 0, policy_version 78510 (0.0032) [2024-06-21 22:33:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1286422528. Throughput: 0: 42563.9. Samples: 1286529980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:33:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-21 22:33:59,630][15401] Updated weights for policy 0, policy_version 78520 (0.0029) [2024-06-21 22:34:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1286602752. Throughput: 0: 42563.1. Samples: 1286786400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-21 22:34:03,404][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:34:03,776][15401] Updated weights for policy 0, policy_version 78530 (0.0034) [2024-06-21 22:34:07,081][15401] Updated weights for policy 0, policy_version 78540 (0.0026) [2024-06-21 22:34:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1286832128. Throughput: 0: 42387.2. Samples: 1286907960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:08,399][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 22:34:11,542][15401] Updated weights for policy 0, policy_version 78550 (0.0026) [2024-06-21 22:34:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1287061504. Throughput: 0: 42534.7. Samples: 1287169760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 22:34:14,577][15401] Updated weights for policy 0, policy_version 78560 (0.0037) [2024-06-21 22:34:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 1287241728. Throughput: 0: 42484.5. Samples: 1287426300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 22:34:19,201][15401] Updated weights for policy 0, policy_version 78570 (0.0036) [2024-06-21 22:34:22,059][15401] Updated weights for policy 0, policy_version 78580 (0.0036) [2024-06-21 22:34:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1287471104. Throughput: 0: 42249.7. Samples: 1287542740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 22:34:27,294][15401] Updated weights for policy 0, policy_version 78590 (0.0024) [2024-06-21 22:34:28,396][15132] Fps is (10 sec: 45845.3, 60 sec: 42595.5, 300 sec: 42708.5). Total num frames: 1287700480. Throughput: 0: 42524.0. Samples: 1287808340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:28,397][15132] Avg episode reward: [(0, '0.397')] [2024-06-21 22:34:30,411][15401] Updated weights for policy 0, policy_version 78600 (0.0026) [2024-06-21 22:34:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1287880704. Throughput: 0: 42445.6. Samples: 1288060700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:33,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 22:34:34,913][15401] Updated weights for policy 0, policy_version 78610 (0.0028) [2024-06-21 22:34:38,035][15401] Updated weights for policy 0, policy_version 78620 (0.0028) [2024-06-21 22:34:38,390][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 1288110080. Throughput: 0: 42402.3. Samples: 1288180100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 22:34:42,552][15401] Updated weights for policy 0, policy_version 78630 (0.0032) [2024-06-21 22:34:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1288323072. Throughput: 0: 42575.6. Samples: 1288445880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:43,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 22:34:43,584][15349] Signal inference workers to stop experience collection... (19000 times) [2024-06-21 22:34:43,584][15349] Signal inference workers to resume experience collection... (19000 times) [2024-06-21 22:34:43,599][15401] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-06-21 22:34:43,624][15401] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-06-21 22:34:45,649][15401] Updated weights for policy 0, policy_version 78640 (0.0031) [2024-06-21 22:34:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42487.4). Total num frames: 1288519680. Throughput: 0: 42600.5. Samples: 1288703420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 22:34:50,034][15401] Updated weights for policy 0, policy_version 78650 (0.0046) [2024-06-21 22:34:53,224][15401] Updated weights for policy 0, policy_version 78660 (0.0040) [2024-06-21 22:34:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1288765440. Throughput: 0: 42549.8. Samples: 1288822700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 22:34:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 22:34:57,900][15401] Updated weights for policy 0, policy_version 78670 (0.0045) [2024-06-21 22:34:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 1288962048. Throughput: 0: 42515.0. Samples: 1289083040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:34:58,393][15132] Avg episode reward: [(0, '0.283')] [2024-06-21 22:35:00,947][15401] Updated weights for policy 0, policy_version 78680 (0.0028) [2024-06-21 22:35:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1289142272. Throughput: 0: 42560.9. Samples: 1289341540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:03,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 22:35:05,437][15401] Updated weights for policy 0, policy_version 78690 (0.0037) [2024-06-21 22:35:08,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 1289404416. Throughput: 0: 42685.0. Samples: 1289463560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 22:35:08,462][15401] Updated weights for policy 0, policy_version 78700 (0.0026) [2024-06-21 22:35:13,162][15401] Updated weights for policy 0, policy_version 78710 (0.0032) [2024-06-21 22:35:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1289584640. Throughput: 0: 42549.2. Samples: 1289722780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-21 22:35:16,092][15401] Updated weights for policy 0, policy_version 78720 (0.0028) [2024-06-21 22:35:18,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42432.6). Total num frames: 1289781248. Throughput: 0: 42555.2. Samples: 1289975680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-21 22:35:20,707][15401] Updated weights for policy 0, policy_version 78730 (0.0028) [2024-06-21 22:35:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1290043392. Throughput: 0: 42739.1. Samples: 1290103360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 22:35:24,012][15401] Updated weights for policy 0, policy_version 78740 (0.0034) [2024-06-21 22:35:28,261][15401] Updated weights for policy 0, policy_version 78750 (0.0021) [2024-06-21 22:35:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 1290240000. Throughput: 0: 42620.6. Samples: 1290363800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-21 22:35:31,430][15401] Updated weights for policy 0, policy_version 78760 (0.0025) [2024-06-21 22:35:33,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42593.9, 300 sec: 42542.0). Total num frames: 1290436608. Throughput: 0: 42620.1. Samples: 1290621600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:33,396][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 22:35:35,820][15401] Updated weights for policy 0, policy_version 78770 (0.0031) [2024-06-21 22:35:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1290665984. Throughput: 0: 42870.1. Samples: 1290751860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-21 22:35:38,913][15401] Updated weights for policy 0, policy_version 78780 (0.0030) [2024-06-21 22:35:43,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1290862592. Throughput: 0: 42778.7. Samples: 1291007980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-21 22:35:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-21 22:35:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078788_1290862592.pth... [2024-06-21 22:35:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078166_1280671744.pth [2024-06-21 22:35:43,850][15401] Updated weights for policy 0, policy_version 78790 (0.0034) [2024-06-21 22:35:46,971][15401] Updated weights for policy 0, policy_version 78800 (0.0028) [2024-06-21 22:35:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1291091968. Throughput: 0: 42600.4. Samples: 1291258560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:35:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 22:35:51,509][15401] Updated weights for policy 0, policy_version 78810 (0.0028) [2024-06-21 22:35:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1291304960. Throughput: 0: 42892.8. Samples: 1291393740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:35:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 22:35:54,327][15349] Signal inference workers to stop experience collection... (19050 times) [2024-06-21 22:35:54,380][15401] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-06-21 22:35:54,380][15349] Signal inference workers to resume experience collection... (19050 times) [2024-06-21 22:35:54,394][15401] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-06-21 22:35:54,550][15401] Updated weights for policy 0, policy_version 78820 (0.0034) [2024-06-21 22:35:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1291517952. Throughput: 0: 42763.6. Samples: 1291647140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:35:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-21 22:35:58,996][15401] Updated weights for policy 0, policy_version 78830 (0.0029) [2024-06-21 22:36:01,983][15401] Updated weights for policy 0, policy_version 78840 (0.0029) [2024-06-21 22:36:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1291730944. Throughput: 0: 43053.4. Samples: 1291913080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 22:36:06,349][15401] Updated weights for policy 0, policy_version 78850 (0.0032) [2024-06-21 22:36:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1291960320. Throughput: 0: 43097.3. Samples: 1292042740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 22:36:09,563][15401] Updated weights for policy 0, policy_version 78860 (0.0041) [2024-06-21 22:36:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1292173312. Throughput: 0: 42871.5. Samples: 1292293020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 22:36:13,939][15401] Updated weights for policy 0, policy_version 78870 (0.0045) [2024-06-21 22:36:17,291][15401] Updated weights for policy 0, policy_version 78880 (0.0024) [2024-06-21 22:36:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 1292386304. Throughput: 0: 42810.9. Samples: 1292547820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:18,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 22:36:22,231][15401] Updated weights for policy 0, policy_version 78890 (0.0036) [2024-06-21 22:36:23,393][15132] Fps is (10 sec: 42584.3, 60 sec: 42596.0, 300 sec: 42542.4). Total num frames: 1292599296. Throughput: 0: 42798.6. Samples: 1292677940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:23,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 22:36:24,744][15401] Updated weights for policy 0, policy_version 78900 (0.0027) [2024-06-21 22:36:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1292828672. Throughput: 0: 42999.2. Samples: 1292942940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 22:36:28,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-21 22:36:29,830][15401] Updated weights for policy 0, policy_version 78910 (0.0031) [2024-06-21 22:36:32,622][15401] Updated weights for policy 0, policy_version 78920 (0.0042) [2024-06-21 22:36:33,389][15132] Fps is (10 sec: 44251.9, 60 sec: 43422.3, 300 sec: 42598.4). Total num frames: 1293041664. Throughput: 0: 42928.9. Samples: 1293190360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 22:36:37,249][15401] Updated weights for policy 0, policy_version 78930 (0.0042) [2024-06-21 22:36:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1293221888. Throughput: 0: 42779.0. Samples: 1293318800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 22:36:40,187][15401] Updated weights for policy 0, policy_version 78940 (0.0024) [2024-06-21 22:36:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43689.0, 300 sec: 42764.7). Total num frames: 1293484032. Throughput: 0: 42940.8. Samples: 1293579580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:43,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 22:36:44,689][15401] Updated weights for policy 0, policy_version 78950 (0.0027) [2024-06-21 22:36:47,885][15401] Updated weights for policy 0, policy_version 78960 (0.0041) [2024-06-21 22:36:48,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1293697024. Throughput: 0: 42695.9. Samples: 1293834400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 22:36:52,183][15401] Updated weights for policy 0, policy_version 78970 (0.0033) [2024-06-21 22:36:53,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1293877248. Throughput: 0: 42656.9. Samples: 1293962300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:53,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-21 22:36:54,281][15349] Signal inference workers to stop experience collection... (19100 times) [2024-06-21 22:36:54,281][15349] Signal inference workers to resume experience collection... (19100 times) [2024-06-21 22:36:54,295][15401] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-06-21 22:36:54,295][15401] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-06-21 22:36:55,764][15401] Updated weights for policy 0, policy_version 78980 (0.0032) [2024-06-21 22:36:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1294090240. Throughput: 0: 42856.0. Samples: 1294221540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:36:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-21 22:36:59,584][15401] Updated weights for policy 0, policy_version 78990 (0.0033) [2024-06-21 22:37:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1294303232. Throughput: 0: 43120.5. Samples: 1294488240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:37:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 22:37:03,550][15401] Updated weights for policy 0, policy_version 79000 (0.0033) [2024-06-21 22:37:07,285][15401] Updated weights for policy 0, policy_version 79010 (0.0044) [2024-06-21 22:37:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1294532608. Throughput: 0: 43019.1. Samples: 1294613760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:37:08,393][15132] Avg episode reward: [(0, '0.153')] [2024-06-21 22:37:11,138][15401] Updated weights for policy 0, policy_version 79020 (0.0026) [2024-06-21 22:37:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1294761984. Throughput: 0: 42653.3. Samples: 1294862340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:37:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 22:37:14,899][15401] Updated weights for policy 0, policy_version 79030 (0.0037) [2024-06-21 22:37:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1294958592. Throughput: 0: 43081.2. Samples: 1295129020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 22:37:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 22:37:18,838][15401] Updated weights for policy 0, policy_version 79040 (0.0030) [2024-06-21 22:37:22,362][15401] Updated weights for policy 0, policy_version 79050 (0.0032) [2024-06-21 22:37:23,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42599.1, 300 sec: 42709.5). Total num frames: 1295155200. Throughput: 0: 42949.7. Samples: 1295251640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:23,393][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 22:37:26,634][15401] Updated weights for policy 0, policy_version 79060 (0.0031) [2024-06-21 22:37:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1295400960. Throughput: 0: 42872.2. Samples: 1295508720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 22:37:30,279][15401] Updated weights for policy 0, policy_version 79070 (0.0040) [2024-06-21 22:37:33,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 1295581184. Throughput: 0: 43051.1. Samples: 1295771800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:33,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 22:37:34,334][15401] Updated weights for policy 0, policy_version 79080 (0.0035) [2024-06-21 22:37:37,806][15401] Updated weights for policy 0, policy_version 79090 (0.0040) [2024-06-21 22:37:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 1295810560. Throughput: 0: 42909.7. Samples: 1295893240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 22:37:42,048][15401] Updated weights for policy 0, policy_version 79100 (0.0035) [2024-06-21 22:37:43,390][15132] Fps is (10 sec: 47524.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1296056320. Throughput: 0: 43000.9. Samples: 1296156580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 22:37:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079105_1296056320.pth... [2024-06-21 22:37:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078478_1285783552.pth [2024-06-21 22:37:45,395][15401] Updated weights for policy 0, policy_version 79110 (0.0038) [2024-06-21 22:37:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1296220160. Throughput: 0: 42673.8. Samples: 1296408560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 22:37:49,688][15401] Updated weights for policy 0, policy_version 79120 (0.0026) [2024-06-21 22:37:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1296449536. Throughput: 0: 42659.6. Samples: 1296533340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 22:37:53,665][15401] Updated weights for policy 0, policy_version 79130 (0.0034) [2024-06-21 22:37:57,219][15401] Updated weights for policy 0, policy_version 79140 (0.0035) [2024-06-21 22:37:58,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 42820.6). Total num frames: 1296711680. Throughput: 0: 43015.7. Samples: 1296798040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:37:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 22:38:01,155][15401] Updated weights for policy 0, policy_version 79150 (0.0033) [2024-06-21 22:38:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1296875520. Throughput: 0: 42933.4. Samples: 1297061020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 22:38:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 22:38:04,807][15401] Updated weights for policy 0, policy_version 79160 (0.0044) [2024-06-21 22:38:08,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 1297104896. Throughput: 0: 42670.3. Samples: 1297171800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:08,392][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 22:38:08,834][15401] Updated weights for policy 0, policy_version 79170 (0.0033) [2024-06-21 22:38:12,418][15401] Updated weights for policy 0, policy_version 79180 (0.0032) [2024-06-21 22:38:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 1297350656. Throughput: 0: 42850.6. Samples: 1297437000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 22:38:16,655][15401] Updated weights for policy 0, policy_version 79190 (0.0032) [2024-06-21 22:38:18,080][15349] Signal inference workers to stop experience collection... (19150 times) [2024-06-21 22:38:18,112][15401] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-06-21 22:38:18,200][15349] Signal inference workers to resume experience collection... (19150 times) [2024-06-21 22:38:18,200][15401] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-06-21 22:38:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1297514496. Throughput: 0: 42833.3. Samples: 1297699200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 22:38:20,000][15401] Updated weights for policy 0, policy_version 79200 (0.0034) [2024-06-21 22:38:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 1297743872. Throughput: 0: 42715.1. Samples: 1297815420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 22:38:24,313][15401] Updated weights for policy 0, policy_version 79210 (0.0033) [2024-06-21 22:38:27,477][15401] Updated weights for policy 0, policy_version 79220 (0.0032) [2024-06-21 22:38:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1297973248. Throughput: 0: 42779.7. Samples: 1298081660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 22:38:31,790][15401] Updated weights for policy 0, policy_version 79230 (0.0035) [2024-06-21 22:38:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 1298153472. Throughput: 0: 42879.4. Samples: 1298338240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:33,393][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 22:38:35,233][15401] Updated weights for policy 0, policy_version 79240 (0.0037) [2024-06-21 22:38:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1298382848. Throughput: 0: 42980.2. Samples: 1298467440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 22:38:39,372][15401] Updated weights for policy 0, policy_version 79250 (0.0027) [2024-06-21 22:38:42,747][15401] Updated weights for policy 0, policy_version 79260 (0.0033) [2024-06-21 22:38:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 1298595840. Throughput: 0: 42878.5. Samples: 1298727580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:43,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-21 22:38:47,081][15401] Updated weights for policy 0, policy_version 79270 (0.0048) [2024-06-21 22:38:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1298808832. Throughput: 0: 42711.6. Samples: 1298983040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 22:38:50,628][15401] Updated weights for policy 0, policy_version 79280 (0.0034) [2024-06-21 22:38:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1299038208. Throughput: 0: 43093.4. Samples: 1299110900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-21 22:38:53,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 22:38:54,556][15401] Updated weights for policy 0, policy_version 79290 (0.0045) [2024-06-21 22:38:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 1299218432. Throughput: 0: 42929.2. Samples: 1299368820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:38:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 22:38:58,682][15401] Updated weights for policy 0, policy_version 79300 (0.0028) [2024-06-21 22:39:01,997][15401] Updated weights for policy 0, policy_version 79310 (0.0030) [2024-06-21 22:39:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1299464192. Throughput: 0: 42879.6. Samples: 1299628780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 22:39:06,138][15401] Updated weights for policy 0, policy_version 79320 (0.0051) [2024-06-21 22:39:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 1299693568. Throughput: 0: 43104.9. Samples: 1299755140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 22:39:09,461][15401] Updated weights for policy 0, policy_version 79330 (0.0039) [2024-06-21 22:39:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 1299873792. Throughput: 0: 43054.2. Samples: 1300019100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 22:39:13,599][15401] Updated weights for policy 0, policy_version 79340 (0.0027) [2024-06-21 22:39:17,147][15401] Updated weights for policy 0, policy_version 79350 (0.0039) [2024-06-21 22:39:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1300103168. Throughput: 0: 42946.2. Samples: 1300270720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 22:39:21,165][15401] Updated weights for policy 0, policy_version 79360 (0.0032) [2024-06-21 22:39:23,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.9, 300 sec: 42821.1). Total num frames: 1300332544. Throughput: 0: 42859.0. Samples: 1300396200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:23,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 22:39:24,816][15401] Updated weights for policy 0, policy_version 79370 (0.0031) [2024-06-21 22:39:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1300529152. Throughput: 0: 42840.2. Samples: 1300655380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 22:39:28,738][15401] Updated weights for policy 0, policy_version 79380 (0.0030) [2024-06-21 22:39:32,669][15401] Updated weights for policy 0, policy_version 79390 (0.0044) [2024-06-21 22:39:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 1300742144. Throughput: 0: 42752.9. Samples: 1300906920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 22:39:36,461][15401] Updated weights for policy 0, policy_version 79400 (0.0039) [2024-06-21 22:39:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1300955136. Throughput: 0: 42824.4. Samples: 1301038000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:38,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-21 22:39:40,273][15401] Updated weights for policy 0, policy_version 79410 (0.0029) [2024-06-21 22:39:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1301168128. Throughput: 0: 42902.1. Samples: 1301299420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-21 22:39:43,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-21 22:39:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079418_1301184512.pth... [2024-06-21 22:39:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000078788_1290862592.pth [2024-06-21 22:39:44,271][15401] Updated weights for policy 0, policy_version 79420 (0.0034) [2024-06-21 22:39:47,799][15349] Signal inference workers to stop experience collection... (19200 times) [2024-06-21 22:39:47,805][15349] Signal inference workers to resume experience collection... (19200 times) [2024-06-21 22:39:47,819][15401] Updated weights for policy 0, policy_version 79430 (0.0033) [2024-06-21 22:39:47,851][15401] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-06-21 22:39:47,851][15401] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-06-21 22:39:48,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1301413888. Throughput: 0: 42692.9. Samples: 1301549960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:39:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 22:39:51,771][15401] Updated weights for policy 0, policy_version 79440 (0.0032) [2024-06-21 22:39:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1301594112. Throughput: 0: 42831.2. Samples: 1301682540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:39:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 22:39:55,184][15401] Updated weights for policy 0, policy_version 79450 (0.0039) [2024-06-21 22:39:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1301807104. Throughput: 0: 42716.5. Samples: 1301941340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:39:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 22:39:59,354][15401] Updated weights for policy 0, policy_version 79460 (0.0043) [2024-06-21 22:40:02,713][15401] Updated weights for policy 0, policy_version 79470 (0.0034) [2024-06-21 22:40:03,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1302052864. Throughput: 0: 42749.3. Samples: 1302194440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 22:40:07,185][15401] Updated weights for policy 0, policy_version 79480 (0.0034) [2024-06-21 22:40:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1302233088. Throughput: 0: 42926.2. Samples: 1302327780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:40:10,400][15401] Updated weights for policy 0, policy_version 79490 (0.0042) [2024-06-21 22:40:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1302446080. Throughput: 0: 42718.6. Samples: 1302577720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 22:40:14,783][15401] Updated weights for policy 0, policy_version 79500 (0.0031) [2024-06-21 22:40:18,351][15401] Updated weights for policy 0, policy_version 79510 (0.0034) [2024-06-21 22:40:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1302691840. Throughput: 0: 42665.7. Samples: 1302826880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 22:40:22,379][15401] Updated weights for policy 0, policy_version 79520 (0.0032) [2024-06-21 22:40:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 1302855680. Throughput: 0: 42707.3. Samples: 1302959820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-21 22:40:26,159][15401] Updated weights for policy 0, policy_version 79530 (0.0029) [2024-06-21 22:40:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42932.2). Total num frames: 1303101440. Throughput: 0: 42461.0. Samples: 1303210260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-21 22:40:28,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:40:30,089][15401] Updated weights for policy 0, policy_version 79540 (0.0032) [2024-06-21 22:40:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1303314432. Throughput: 0: 42587.5. Samples: 1303466400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 22:40:34,002][15401] Updated weights for policy 0, policy_version 79550 (0.0033) [2024-06-21 22:40:37,644][15401] Updated weights for policy 0, policy_version 79560 (0.0030) [2024-06-21 22:40:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1303511040. Throughput: 0: 42556.9. Samples: 1303597600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 22:40:41,594][15401] Updated weights for policy 0, policy_version 79570 (0.0035) [2024-06-21 22:40:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1303740416. Throughput: 0: 42416.8. Samples: 1303850100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-21 22:40:45,684][15401] Updated weights for policy 0, policy_version 79580 (0.0035) [2024-06-21 22:40:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1303953408. Throughput: 0: 42495.7. Samples: 1304106740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 22:40:49,594][15401] Updated weights for policy 0, policy_version 79590 (0.0027) [2024-06-21 22:40:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1304150016. Throughput: 0: 42371.1. Samples: 1304234480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 22:40:53,656][15401] Updated weights for policy 0, policy_version 79600 (0.0030) [2024-06-21 22:40:57,125][15401] Updated weights for policy 0, policy_version 79610 (0.0044) [2024-06-21 22:40:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1304379392. Throughput: 0: 42463.6. Samples: 1304488580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:40:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 22:41:01,549][15401] Updated weights for policy 0, policy_version 79620 (0.0045) [2024-06-21 22:41:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1304576000. Throughput: 0: 42744.5. Samples: 1304750380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:41:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 22:41:05,115][15401] Updated weights for policy 0, policy_version 79630 (0.0025) [2024-06-21 22:41:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1304788992. Throughput: 0: 42366.6. Samples: 1304866420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:41:08,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 22:41:09,439][15401] Updated weights for policy 0, policy_version 79640 (0.0043) [2024-06-21 22:41:12,695][15401] Updated weights for policy 0, policy_version 79650 (0.0038) [2024-06-21 22:41:13,014][15349] Signal inference workers to stop experience collection... (19250 times) [2024-06-21 22:41:13,014][15349] Signal inference workers to resume experience collection... (19250 times) [2024-06-21 22:41:13,040][15401] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-06-21 22:41:13,072][15401] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-06-21 22:41:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1305018368. Throughput: 0: 42620.5. Samples: 1305128080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:41:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 22:41:17,245][15401] Updated weights for policy 0, policy_version 79660 (0.0036) [2024-06-21 22:41:18,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42052.2, 300 sec: 42765.5). Total num frames: 1305214976. Throughput: 0: 42692.0. Samples: 1305387540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-21 22:41:18,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 22:41:20,235][15401] Updated weights for policy 0, policy_version 79670 (0.0025) [2024-06-21 22:41:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1305444352. Throughput: 0: 42595.5. Samples: 1305514400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 22:41:24,614][15401] Updated weights for policy 0, policy_version 79680 (0.0034) [2024-06-21 22:41:27,799][15401] Updated weights for policy 0, policy_version 79690 (0.0040) [2024-06-21 22:41:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1305657344. Throughput: 0: 42718.3. Samples: 1305772420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-21 22:41:32,045][15401] Updated weights for policy 0, policy_version 79700 (0.0039) [2024-06-21 22:41:33,395][15132] Fps is (10 sec: 40938.7, 60 sec: 42321.7, 300 sec: 42819.8). Total num frames: 1305853952. Throughput: 0: 42748.0. Samples: 1306030620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:33,395][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 22:41:35,595][15401] Updated weights for policy 0, policy_version 79710 (0.0047) [2024-06-21 22:41:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 1306083328. Throughput: 0: 42648.9. Samples: 1306153680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:38,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 22:41:39,538][15401] Updated weights for policy 0, policy_version 79720 (0.0028) [2024-06-21 22:41:43,106][15401] Updated weights for policy 0, policy_version 79730 (0.0033) [2024-06-21 22:41:43,389][15132] Fps is (10 sec: 45899.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1306312704. Throughput: 0: 42687.1. Samples: 1306409500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 22:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079731_1306312704.pth... [2024-06-21 22:41:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079105_1296056320.pth [2024-06-21 22:41:47,537][15401] Updated weights for policy 0, policy_version 79740 (0.0034) [2024-06-21 22:41:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1306492928. Throughput: 0: 42655.0. Samples: 1306669860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:48,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 22:41:50,756][15401] Updated weights for policy 0, policy_version 79750 (0.0038) [2024-06-21 22:41:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1306722304. Throughput: 0: 42671.9. Samples: 1306786560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:53,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 22:41:55,165][15401] Updated weights for policy 0, policy_version 79760 (0.0032) [2024-06-21 22:41:58,290][15401] Updated weights for policy 0, policy_version 79770 (0.0027) [2024-06-21 22:41:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1306951680. Throughput: 0: 42644.4. Samples: 1307047080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:41:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 22:42:03,060][15401] Updated weights for policy 0, policy_version 79780 (0.0036) [2024-06-21 22:42:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 1307131904. Throughput: 0: 42585.7. Samples: 1307303900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:42:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 22:42:05,981][15401] Updated weights for policy 0, policy_version 79790 (0.0036) [2024-06-21 22:42:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1307344896. Throughput: 0: 42440.5. Samples: 1307424220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 22:42:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 22:42:10,493][15401] Updated weights for policy 0, policy_version 79800 (0.0033) [2024-06-21 22:42:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1307590656. Throughput: 0: 42564.4. Samples: 1307687820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:13,394][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 22:42:13,719][15401] Updated weights for policy 0, policy_version 79810 (0.0046) [2024-06-21 22:42:17,967][15401] Updated weights for policy 0, policy_version 79820 (0.0047) [2024-06-21 22:42:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1307770880. Throughput: 0: 42433.8. Samples: 1307939920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 22:42:21,827][15401] Updated weights for policy 0, policy_version 79830 (0.0035) [2024-06-21 22:42:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1308000256. Throughput: 0: 42485.3. Samples: 1308065520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:23,395][15132] Avg episode reward: [(0, '0.171')] [2024-06-21 22:42:25,757][15401] Updated weights for policy 0, policy_version 79840 (0.0041) [2024-06-21 22:42:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 1308196864. Throughput: 0: 42444.0. Samples: 1308319480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 22:42:29,531][15401] Updated weights for policy 0, policy_version 79850 (0.0034) [2024-06-21 22:42:33,381][15401] Updated weights for policy 0, policy_version 79860 (0.0028) [2024-06-21 22:42:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42875.3, 300 sec: 42765.0). Total num frames: 1308426240. Throughput: 0: 42433.9. Samples: 1308579380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:33,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 22:42:34,224][15349] Signal inference workers to stop experience collection... (19300 times) [2024-06-21 22:42:34,224][15349] Signal inference workers to resume experience collection... (19300 times) [2024-06-21 22:42:34,268][15401] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-06-21 22:42:34,268][15401] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-06-21 22:42:37,350][15401] Updated weights for policy 0, policy_version 79870 (0.0038) [2024-06-21 22:42:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1308639232. Throughput: 0: 42636.6. Samples: 1308705200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:38,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-21 22:42:41,227][15401] Updated weights for policy 0, policy_version 79880 (0.0037) [2024-06-21 22:42:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 1308852224. Throughput: 0: 42650.1. Samples: 1308966340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 22:42:44,993][15401] Updated weights for policy 0, policy_version 79890 (0.0026) [2024-06-21 22:42:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1309032448. Throughput: 0: 42565.0. Samples: 1309219320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 22:42:49,054][15401] Updated weights for policy 0, policy_version 79900 (0.0043) [2024-06-21 22:42:52,670][15401] Updated weights for policy 0, policy_version 79910 (0.0033) [2024-06-21 22:42:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 1309261824. Throughput: 0: 42612.0. Samples: 1309341760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:53,398][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 22:42:56,881][15401] Updated weights for policy 0, policy_version 79920 (0.0034) [2024-06-21 22:42:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1309474816. Throughput: 0: 42421.5. Samples: 1309596780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-21 22:42:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 22:43:00,612][15401] Updated weights for policy 0, policy_version 79930 (0.0038) [2024-06-21 22:43:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 1309671424. Throughput: 0: 42479.5. Samples: 1309851500. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:03,394][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 22:43:04,797][15401] Updated weights for policy 0, policy_version 79940 (0.0043) [2024-06-21 22:43:08,242][15401] Updated weights for policy 0, policy_version 79950 (0.0032) [2024-06-21 22:43:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1309900800. Throughput: 0: 42464.4. Samples: 1309976420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:08,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 22:43:12,320][15401] Updated weights for policy 0, policy_version 79960 (0.0044) [2024-06-21 22:43:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1310113792. Throughput: 0: 42425.8. Samples: 1310228640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 22:43:16,025][15401] Updated weights for policy 0, policy_version 79970 (0.0048) [2024-06-21 22:43:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1310310400. Throughput: 0: 42294.0. Samples: 1310482620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 22:43:19,846][15401] Updated weights for policy 0, policy_version 79980 (0.0030) [2024-06-21 22:43:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1310523392. Throughput: 0: 42315.0. Samples: 1310609380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 22:43:23,743][15401] Updated weights for policy 0, policy_version 79990 (0.0037) [2024-06-21 22:43:27,630][15401] Updated weights for policy 0, policy_version 80000 (0.0039) [2024-06-21 22:43:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 1310736384. Throughput: 0: 42281.3. Samples: 1310869000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:28,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 22:43:31,798][15401] Updated weights for policy 0, policy_version 80010 (0.0035) [2024-06-21 22:43:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1310965760. Throughput: 0: 42200.5. Samples: 1311118340. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-21 22:43:35,294][15401] Updated weights for policy 0, policy_version 80020 (0.0029) [2024-06-21 22:43:38,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1311162368. Throughput: 0: 42453.1. Samples: 1311252140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-21 22:43:39,529][15401] Updated weights for policy 0, policy_version 80030 (0.0037) [2024-06-21 22:43:43,292][15401] Updated weights for policy 0, policy_version 80040 (0.0055) [2024-06-21 22:43:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1311375360. Throughput: 0: 42388.4. Samples: 1311504260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-21 22:43:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-21 22:43:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080041_1311391744.pth... [2024-06-21 22:43:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079418_1301184512.pth [2024-06-21 22:43:47,278][15401] Updated weights for policy 0, policy_version 80050 (0.0033) [2024-06-21 22:43:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1311588352. Throughput: 0: 42276.5. Samples: 1311753940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:43:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 22:43:50,956][15401] Updated weights for policy 0, policy_version 80060 (0.0036) [2024-06-21 22:43:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1311801344. Throughput: 0: 42334.3. Samples: 1311881460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:43:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 22:43:54,943][15401] Updated weights for policy 0, policy_version 80070 (0.0027) [2024-06-21 22:43:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1311997952. Throughput: 0: 42322.2. Samples: 1312133140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:43:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 22:43:58,655][15401] Updated weights for policy 0, policy_version 80080 (0.0028) [2024-06-21 22:44:02,739][15401] Updated weights for policy 0, policy_version 80090 (0.0028) [2024-06-21 22:44:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1312210944. Throughput: 0: 42382.3. Samples: 1312389820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-21 22:44:04,047][15349] Signal inference workers to stop experience collection... (19350 times) [2024-06-21 22:44:04,055][15349] Signal inference workers to resume experience collection... (19350 times) [2024-06-21 22:44:04,079][15401] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-06-21 22:44:04,109][15401] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-06-21 22:44:06,201][15401] Updated weights for policy 0, policy_version 80100 (0.0035) [2024-06-21 22:44:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1312440320. Throughput: 0: 42370.6. Samples: 1312516060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:44:10,473][15401] Updated weights for policy 0, policy_version 80110 (0.0033) [2024-06-21 22:44:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1312653312. Throughput: 0: 42310.2. Samples: 1312772960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 22:44:13,842][15401] Updated weights for policy 0, policy_version 80120 (0.0036) [2024-06-21 22:44:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.5, 300 sec: 42376.6). Total num frames: 1312833536. Throughput: 0: 42537.8. Samples: 1313032540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 22:44:18,406][15401] Updated weights for policy 0, policy_version 80130 (0.0045) [2024-06-21 22:44:21,571][15401] Updated weights for policy 0, policy_version 80140 (0.0046) [2024-06-21 22:44:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1313079296. Throughput: 0: 42254.2. Samples: 1313153580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 22:44:26,186][15401] Updated weights for policy 0, policy_version 80150 (0.0036) [2024-06-21 22:44:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1313292288. Throughput: 0: 42280.7. Samples: 1313406900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 22:44:29,316][15401] Updated weights for policy 0, policy_version 80160 (0.0036) [2024-06-21 22:44:33,395][15132] Fps is (10 sec: 39299.8, 60 sec: 41775.3, 300 sec: 42431.0). Total num frames: 1313472512. Throughput: 0: 42520.6. Samples: 1313667600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-21 22:44:33,395][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 22:44:33,724][15401] Updated weights for policy 0, policy_version 80170 (0.0035) [2024-06-21 22:44:37,088][15401] Updated weights for policy 0, policy_version 80180 (0.0033) [2024-06-21 22:44:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1313718272. Throughput: 0: 42321.8. Samples: 1313785940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:44:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:44:41,376][15401] Updated weights for policy 0, policy_version 80190 (0.0038) [2024-06-21 22:44:43,389][15132] Fps is (10 sec: 44261.1, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1313914880. Throughput: 0: 42473.4. Samples: 1314044440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:44:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 22:44:44,902][15401] Updated weights for policy 0, policy_version 80200 (0.0034) [2024-06-21 22:44:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1314111488. Throughput: 0: 42393.4. Samples: 1314297520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:44:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 22:44:49,417][15401] Updated weights for policy 0, policy_version 80210 (0.0037) [2024-06-21 22:44:52,679][15401] Updated weights for policy 0, policy_version 80220 (0.0038) [2024-06-21 22:44:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1314357248. Throughput: 0: 42309.4. Samples: 1314419980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:44:53,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-21 22:44:56,863][15401] Updated weights for policy 0, policy_version 80230 (0.0041) [2024-06-21 22:44:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1314537472. Throughput: 0: 42295.2. Samples: 1314676240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:44:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-21 22:45:00,322][15401] Updated weights for policy 0, policy_version 80240 (0.0033) [2024-06-21 22:45:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1314750464. Throughput: 0: 42180.3. Samples: 1314930660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:45:03,400][15132] Avg episode reward: [(0, '0.790')] [2024-06-21 22:45:04,434][15401] Updated weights for policy 0, policy_version 80250 (0.0039) [2024-06-21 22:45:07,983][15401] Updated weights for policy 0, policy_version 80260 (0.0035) [2024-06-21 22:45:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1314996224. Throughput: 0: 42330.1. Samples: 1315058440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:45:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 22:45:11,903][15401] Updated weights for policy 0, policy_version 80270 (0.0039) [2024-06-21 22:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1315176448. Throughput: 0: 42325.8. Samples: 1315311560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:45:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 22:45:15,418][15401] Updated weights for policy 0, policy_version 80280 (0.0031) [2024-06-21 22:45:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1315389440. Throughput: 0: 42285.6. Samples: 1315570220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:45:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 22:45:19,673][15401] Updated weights for policy 0, policy_version 80290 (0.0043) [2024-06-21 22:45:22,988][15401] Updated weights for policy 0, policy_version 80300 (0.0035) [2024-06-21 22:45:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 1315635200. Throughput: 0: 42515.1. Samples: 1315699120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:45:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 22:45:27,243][15401] Updated weights for policy 0, policy_version 80310 (0.0038) [2024-06-21 22:45:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1315831808. Throughput: 0: 42488.8. Samples: 1315956440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-21 22:45:30,552][15401] Updated weights for policy 0, policy_version 80320 (0.0036) [2024-06-21 22:45:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42329.1, 300 sec: 42376.2). Total num frames: 1316012032. Throughput: 0: 42722.5. Samples: 1316220040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-21 22:45:34,331][15349] Signal inference workers to stop experience collection... (19400 times) [2024-06-21 22:45:34,331][15349] Signal inference workers to resume experience collection... (19400 times) [2024-06-21 22:45:34,380][15401] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-06-21 22:45:34,380][15401] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-06-21 22:45:35,033][15401] Updated weights for policy 0, policy_version 80330 (0.0039) [2024-06-21 22:45:38,147][15401] Updated weights for policy 0, policy_version 80340 (0.0038) [2024-06-21 22:45:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1316290560. Throughput: 0: 42708.3. Samples: 1316341860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:38,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 22:45:42,570][15401] Updated weights for policy 0, policy_version 80350 (0.0032) [2024-06-21 22:45:43,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1316487168. Throughput: 0: 42681.2. Samples: 1316596900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 22:45:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080352_1316487168.pth... [2024-06-21 22:45:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000079731_1306312704.pth [2024-06-21 22:45:45,928][15401] Updated weights for policy 0, policy_version 80360 (0.0052) [2024-06-21 22:45:48,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1316667392. Throughput: 0: 42840.1. Samples: 1316858460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:48,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 22:45:50,085][15401] Updated weights for policy 0, policy_version 80370 (0.0045) [2024-06-21 22:45:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1316913152. Throughput: 0: 42770.2. Samples: 1316983100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:53,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 22:45:53,705][15401] Updated weights for policy 0, policy_version 80380 (0.0034) [2024-06-21 22:45:57,908][15401] Updated weights for policy 0, policy_version 80390 (0.0038) [2024-06-21 22:45:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1317126144. Throughput: 0: 42862.7. Samples: 1317240380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:45:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-21 22:46:01,351][15401] Updated weights for policy 0, policy_version 80400 (0.0043) [2024-06-21 22:46:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 1317322752. Throughput: 0: 42778.1. Samples: 1317495240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:46:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 22:46:05,545][15401] Updated weights for policy 0, policy_version 80410 (0.0025) [2024-06-21 22:46:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1317552128. Throughput: 0: 42791.7. Samples: 1317624740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:46:08,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 22:46:08,862][15401] Updated weights for policy 0, policy_version 80420 (0.0046) [2024-06-21 22:46:13,307][15401] Updated weights for policy 0, policy_version 80430 (0.0038) [2024-06-21 22:46:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1317765120. Throughput: 0: 42915.2. Samples: 1317887620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:46:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:46:16,456][15401] Updated weights for policy 0, policy_version 80440 (0.0038) [2024-06-21 22:46:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1317978112. Throughput: 0: 42573.8. Samples: 1318135860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-21 22:46:21,117][15401] Updated weights for policy 0, policy_version 80450 (0.0031) [2024-06-21 22:46:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1318207488. Throughput: 0: 42874.7. Samples: 1318271220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 22:46:24,087][15401] Updated weights for policy 0, policy_version 80460 (0.0032) [2024-06-21 22:46:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42488.1). Total num frames: 1318387712. Throughput: 0: 42954.3. Samples: 1318529840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 22:46:28,774][15401] Updated weights for policy 0, policy_version 80470 (0.0046) [2024-06-21 22:46:31,874][15401] Updated weights for policy 0, policy_version 80480 (0.0026) [2024-06-21 22:46:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 42542.9). Total num frames: 1318633472. Throughput: 0: 42774.5. Samples: 1318783320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 22:46:36,550][15401] Updated weights for policy 0, policy_version 80490 (0.0037) [2024-06-21 22:46:37,815][15349] Signal inference workers to stop experience collection... (19450 times) [2024-06-21 22:46:37,862][15401] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-06-21 22:46:37,871][15349] Signal inference workers to resume experience collection... (19450 times) [2024-06-21 22:46:37,880][15401] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-06-21 22:46:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1318862848. Throughput: 0: 43041.5. Samples: 1318919960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 22:46:39,495][15401] Updated weights for policy 0, policy_version 80500 (0.0030) [2024-06-21 22:46:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1319026688. Throughput: 0: 42899.1. Samples: 1319170840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:43,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-21 22:46:44,180][15401] Updated weights for policy 0, policy_version 80510 (0.0033) [2024-06-21 22:46:46,989][15401] Updated weights for policy 0, policy_version 80520 (0.0039) [2024-06-21 22:46:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 1319272448. Throughput: 0: 42912.2. Samples: 1319426280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 22:46:51,826][15401] Updated weights for policy 0, policy_version 80530 (0.0035) [2024-06-21 22:46:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1319485440. Throughput: 0: 43011.5. Samples: 1319560260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 22:46:54,403][15401] Updated weights for policy 0, policy_version 80540 (0.0027) [2024-06-21 22:46:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1319665664. Throughput: 0: 42843.4. Samples: 1319815580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:46:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 22:46:59,423][15401] Updated weights for policy 0, policy_version 80550 (0.0037) [2024-06-21 22:47:02,098][15401] Updated weights for policy 0, policy_version 80560 (0.0034) [2024-06-21 22:47:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1319911424. Throughput: 0: 42940.5. Samples: 1320068180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-21 22:47:03,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 22:47:07,004][15401] Updated weights for policy 0, policy_version 80570 (0.0038) [2024-06-21 22:47:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1320124416. Throughput: 0: 43025.9. Samples: 1320207380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 22:47:10,015][15401] Updated weights for policy 0, policy_version 80580 (0.0027) [2024-06-21 22:47:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1320321024. Throughput: 0: 42854.6. Samples: 1320458300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 22:47:14,613][15401] Updated weights for policy 0, policy_version 80590 (0.0045) [2024-06-21 22:47:17,637][15401] Updated weights for policy 0, policy_version 80600 (0.0028) [2024-06-21 22:47:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1320566784. Throughput: 0: 42965.0. Samples: 1320716740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 22:47:22,098][15401] Updated weights for policy 0, policy_version 80610 (0.0037) [2024-06-21 22:47:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1320763392. Throughput: 0: 42973.2. Samples: 1320853860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:23,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 22:47:25,270][15401] Updated weights for policy 0, policy_version 80620 (0.0043) [2024-06-21 22:47:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1320960000. Throughput: 0: 42866.2. Samples: 1321099820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 22:47:29,708][15401] Updated weights for policy 0, policy_version 80630 (0.0046) [2024-06-21 22:47:33,380][15401] Updated weights for policy 0, policy_version 80640 (0.0025) [2024-06-21 22:47:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1321205760. Throughput: 0: 42970.5. Samples: 1321359960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:33,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 22:47:37,378][15401] Updated weights for policy 0, policy_version 80650 (0.0032) [2024-06-21 22:47:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1321418752. Throughput: 0: 42930.6. Samples: 1321492140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 22:47:41,093][15401] Updated weights for policy 0, policy_version 80660 (0.0036) [2024-06-21 22:47:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1321615360. Throughput: 0: 42734.2. Samples: 1321738620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 22:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080665_1321615360.pth... [2024-06-21 22:47:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080041_1311391744.pth [2024-06-21 22:47:45,042][15401] Updated weights for policy 0, policy_version 80670 (0.0028) [2024-06-21 22:47:48,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 1321811968. Throughput: 0: 42929.8. Samples: 1322000020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-21 22:47:48,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-21 22:47:48,829][15401] Updated weights for policy 0, policy_version 80680 (0.0033) [2024-06-21 22:47:52,755][15401] Updated weights for policy 0, policy_version 80690 (0.0038) [2024-06-21 22:47:53,252][15349] Signal inference workers to stop experience collection... (19500 times) [2024-06-21 22:47:53,253][15349] Signal inference workers to resume experience collection... (19500 times) [2024-06-21 22:47:53,304][15401] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-06-21 22:47:53,304][15401] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-06-21 22:47:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1322057728. Throughput: 0: 42679.6. Samples: 1322127960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:47:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 22:47:56,544][15401] Updated weights for policy 0, policy_version 80700 (0.0038) [2024-06-21 22:47:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1322270720. Throughput: 0: 42641.8. Samples: 1322377180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:47:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-21 22:48:00,661][15401] Updated weights for policy 0, policy_version 80710 (0.0039) [2024-06-21 22:48:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1322450944. Throughput: 0: 42626.7. Samples: 1322634940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:48:04,155][15401] Updated weights for policy 0, policy_version 80720 (0.0036) [2024-06-21 22:48:08,281][15401] Updated weights for policy 0, policy_version 80730 (0.0028) [2024-06-21 22:48:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1322680320. Throughput: 0: 42356.5. Samples: 1322759800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 22:48:11,874][15401] Updated weights for policy 0, policy_version 80740 (0.0038) [2024-06-21 22:48:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1322909696. Throughput: 0: 42553.9. Samples: 1323014740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 22:48:15,861][15401] Updated weights for policy 0, policy_version 80750 (0.0047) [2024-06-21 22:48:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1323106304. Throughput: 0: 42480.5. Samples: 1323271580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 22:48:19,886][15401] Updated weights for policy 0, policy_version 80760 (0.0040) [2024-06-21 22:48:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 1323319296. Throughput: 0: 42347.6. Samples: 1323397780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:23,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 22:48:23,724][15401] Updated weights for policy 0, policy_version 80770 (0.0042) [2024-06-21 22:48:27,665][15401] Updated weights for policy 0, policy_version 80780 (0.0031) [2024-06-21 22:48:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1323532288. Throughput: 0: 42655.2. Samples: 1323658100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 22:48:31,541][15401] Updated weights for policy 0, policy_version 80790 (0.0024) [2024-06-21 22:48:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1323761664. Throughput: 0: 42379.1. Samples: 1323907080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 22:48:35,269][15401] Updated weights for policy 0, policy_version 80800 (0.0021) [2024-06-21 22:48:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1323925504. Throughput: 0: 42464.0. Samples: 1324038840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 22:48:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 22:48:39,065][15401] Updated weights for policy 0, policy_version 80810 (0.0035) [2024-06-21 22:48:43,099][15401] Updated weights for policy 0, policy_version 80820 (0.0035) [2024-06-21 22:48:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1324171264. Throughput: 0: 42621.8. Samples: 1324295160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:48:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 22:48:46,869][15401] Updated weights for policy 0, policy_version 80830 (0.0045) [2024-06-21 22:48:48,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1324417024. Throughput: 0: 42412.9. Samples: 1324543520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:48:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 22:48:50,688][15401] Updated weights for policy 0, policy_version 80840 (0.0036) [2024-06-21 22:48:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1324580864. Throughput: 0: 42600.4. Samples: 1324676820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:48:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-21 22:48:54,551][15401] Updated weights for policy 0, policy_version 80850 (0.0032) [2024-06-21 22:48:58,224][15401] Updated weights for policy 0, policy_version 80860 (0.0031) [2024-06-21 22:48:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1324810240. Throughput: 0: 42558.6. Samples: 1324929880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:48:58,396][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 22:49:02,074][15401] Updated weights for policy 0, policy_version 80870 (0.0038) [2024-06-21 22:49:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 1325056000. Throughput: 0: 42502.5. Samples: 1325184200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 22:49:06,096][15401] Updated weights for policy 0, policy_version 80880 (0.0024) [2024-06-21 22:49:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1325219840. Throughput: 0: 42712.0. Samples: 1325319820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 22:49:09,527][15349] Signal inference workers to stop experience collection... (19550 times) [2024-06-21 22:49:09,578][15401] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-06-21 22:49:09,642][15349] Signal inference workers to resume experience collection... (19550 times) [2024-06-21 22:49:09,642][15401] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-06-21 22:49:09,784][15401] Updated weights for policy 0, policy_version 80890 (0.0031) [2024-06-21 22:49:13,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1325449216. Throughput: 0: 42412.5. Samples: 1325566660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-21 22:49:13,692][15401] Updated weights for policy 0, policy_version 80900 (0.0029) [2024-06-21 22:49:17,417][15401] Updated weights for policy 0, policy_version 80910 (0.0035) [2024-06-21 22:49:18,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1325694976. Throughput: 0: 42587.2. Samples: 1325823500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 22:49:21,523][15401] Updated weights for policy 0, policy_version 80920 (0.0024) [2024-06-21 22:49:23,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1325842432. Throughput: 0: 42626.9. Samples: 1325957060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 22:49:25,172][15401] Updated weights for policy 0, policy_version 80930 (0.0042) [2024-06-21 22:49:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 1326088192. Throughput: 0: 42522.2. Samples: 1326208660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-21 22:49:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 22:49:29,095][15401] Updated weights for policy 0, policy_version 80940 (0.0033) [2024-06-21 22:49:32,600][15401] Updated weights for policy 0, policy_version 80950 (0.0037) [2024-06-21 22:49:33,396][15132] Fps is (10 sec: 49121.4, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 1326333952. Throughput: 0: 42786.8. Samples: 1326469200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:33,396][15132] Avg episode reward: [(0, '0.369')] [2024-06-21 22:49:36,543][15401] Updated weights for policy 0, policy_version 80960 (0.0037) [2024-06-21 22:49:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1326497792. Throughput: 0: 42821.7. Samples: 1326603800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:38,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-21 22:49:40,106][15401] Updated weights for policy 0, policy_version 80970 (0.0028) [2024-06-21 22:49:43,390][15132] Fps is (10 sec: 40985.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1326743552. Throughput: 0: 42844.3. Samples: 1326857880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 22:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080978_1326743552.pth... [2024-06-21 22:49:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080352_1316487168.pth [2024-06-21 22:49:44,198][15401] Updated weights for policy 0, policy_version 80980 (0.0023) [2024-06-21 22:49:47,535][15401] Updated weights for policy 0, policy_version 80990 (0.0031) [2024-06-21 22:49:48,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1326972928. Throughput: 0: 42921.2. Samples: 1327115640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 22:49:51,748][15401] Updated weights for policy 0, policy_version 81000 (0.0030) [2024-06-21 22:49:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1327120384. Throughput: 0: 42837.8. Samples: 1327247520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:49:55,353][15401] Updated weights for policy 0, policy_version 81010 (0.0038) [2024-06-21 22:49:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1327398912. Throughput: 0: 42985.6. Samples: 1327501020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:49:58,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-21 22:49:59,574][15401] Updated weights for policy 0, policy_version 81020 (0.0028) [2024-06-21 22:50:02,797][15401] Updated weights for policy 0, policy_version 81030 (0.0020) [2024-06-21 22:50:03,390][15132] Fps is (10 sec: 49151.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1327611904. Throughput: 0: 43076.3. Samples: 1327761940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:50:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-21 22:50:07,152][15401] Updated weights for policy 0, policy_version 81040 (0.0028) [2024-06-21 22:50:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1327775744. Throughput: 0: 42989.0. Samples: 1327891560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:50:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 22:50:09,063][15349] Signal inference workers to stop experience collection... (19600 times) [2024-06-21 22:50:09,096][15401] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-06-21 22:50:09,120][15349] Signal inference workers to resume experience collection... (19600 times) [2024-06-21 22:50:09,120][15401] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-06-21 22:50:10,913][15401] Updated weights for policy 0, policy_version 81050 (0.0026) [2024-06-21 22:50:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 1328054272. Throughput: 0: 42893.7. Samples: 1328138980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:50:13,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 22:50:14,778][15401] Updated weights for policy 0, policy_version 81060 (0.0035) [2024-06-21 22:50:18,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1328234496. Throughput: 0: 43068.3. Samples: 1328407000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:50:18,392][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 22:50:18,566][15401] Updated weights for policy 0, policy_version 81070 (0.0033) [2024-06-21 22:50:23,030][15401] Updated weights for policy 0, policy_version 81080 (0.0039) [2024-06-21 22:50:23,389][15132] Fps is (10 sec: 36053.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1328414720. Throughput: 0: 42760.6. Samples: 1328528020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 22:50:26,010][15401] Updated weights for policy 0, policy_version 81090 (0.0027) [2024-06-21 22:50:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 1328676864. Throughput: 0: 42713.6. Samples: 1328779980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 22:50:30,598][15401] Updated weights for policy 0, policy_version 81100 (0.0043) [2024-06-21 22:50:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 1328873472. Throughput: 0: 42854.6. Samples: 1329044100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:50:33,628][15401] Updated weights for policy 0, policy_version 81110 (0.0035) [2024-06-21 22:50:38,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1329053696. Throughput: 0: 42655.5. Samples: 1329167120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:38,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 22:50:38,523][15401] Updated weights for policy 0, policy_version 81120 (0.0034) [2024-06-21 22:50:41,215][15401] Updated weights for policy 0, policy_version 81130 (0.0032) [2024-06-21 22:50:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1329315840. Throughput: 0: 42653.8. Samples: 1329420440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 22:50:46,085][15401] Updated weights for policy 0, policy_version 81140 (0.0032) [2024-06-21 22:50:48,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1329512448. Throughput: 0: 42699.8. Samples: 1329683420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 22:50:48,834][15401] Updated weights for policy 0, policy_version 81150 (0.0038) [2024-06-21 22:50:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1329709056. Throughput: 0: 42522.7. Samples: 1329805080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-21 22:50:53,530][15401] Updated weights for policy 0, policy_version 81160 (0.0028) [2024-06-21 22:50:56,465][15401] Updated weights for policy 0, policy_version 81170 (0.0046) [2024-06-21 22:50:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1329971200. Throughput: 0: 42729.5. Samples: 1330061700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:50:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 22:51:01,436][15401] Updated weights for policy 0, policy_version 81180 (0.0026) [2024-06-21 22:51:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1330167808. Throughput: 0: 42658.2. Samples: 1330326620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:51:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 22:51:04,207][15401] Updated weights for policy 0, policy_version 81190 (0.0037) [2024-06-21 22:51:04,213][15349] Signal inference workers to stop experience collection... (19650 times) [2024-06-21 22:51:04,214][15349] Signal inference workers to resume experience collection... (19650 times) [2024-06-21 22:51:04,256][15401] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-06-21 22:51:04,256][15401] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-06-21 22:51:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1330348032. Throughput: 0: 42716.1. Samples: 1330450240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-21 22:51:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 22:51:09,039][15401] Updated weights for policy 0, policy_version 81200 (0.0032) [2024-06-21 22:51:11,972][15401] Updated weights for policy 0, policy_version 81210 (0.0023) [2024-06-21 22:51:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 1330610176. Throughput: 0: 42764.5. Samples: 1330704380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 22:51:16,665][15401] Updated weights for policy 0, policy_version 81220 (0.0039) [2024-06-21 22:51:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1330790400. Throughput: 0: 42687.5. Samples: 1330965040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 22:51:20,043][15401] Updated weights for policy 0, policy_version 81230 (0.0036) [2024-06-21 22:51:23,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 1330987008. Throughput: 0: 42537.8. Samples: 1331081320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:23,392][15132] Avg episode reward: [(0, '0.234')] [2024-06-21 22:51:24,217][15401] Updated weights for policy 0, policy_version 81240 (0.0030) [2024-06-21 22:51:27,710][15401] Updated weights for policy 0, policy_version 81250 (0.0035) [2024-06-21 22:51:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1331249152. Throughput: 0: 42845.9. Samples: 1331348500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:28,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-21 22:51:31,727][15401] Updated weights for policy 0, policy_version 81260 (0.0027) [2024-06-21 22:51:33,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1331429376. Throughput: 0: 42882.4. Samples: 1331613140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:51:35,153][15401] Updated weights for policy 0, policy_version 81270 (0.0030) [2024-06-21 22:51:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 1331642368. Throughput: 0: 42868.1. Samples: 1331734140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 22:51:39,264][15401] Updated weights for policy 0, policy_version 81280 (0.0035) [2024-06-21 22:51:42,731][15401] Updated weights for policy 0, policy_version 81290 (0.0033) [2024-06-21 22:51:43,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1331904512. Throughput: 0: 43048.7. Samples: 1331998900. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 22:51:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081293_1331904512.pth... [2024-06-21 22:51:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080665_1321615360.pth [2024-06-21 22:51:46,787][15401] Updated weights for policy 0, policy_version 81300 (0.0032) [2024-06-21 22:51:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1332084736. Throughput: 0: 42958.2. Samples: 1332259740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 22:51:50,207][15401] Updated weights for policy 0, policy_version 81310 (0.0032) [2024-06-21 22:51:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1332297728. Throughput: 0: 42859.9. Samples: 1332378940. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 22:51:54,528][15401] Updated weights for policy 0, policy_version 81320 (0.0042) [2024-06-21 22:51:57,930][15401] Updated weights for policy 0, policy_version 81330 (0.0037) [2024-06-21 22:51:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1332527104. Throughput: 0: 43084.4. Samples: 1332643180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-21 22:51:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 22:52:02,156][15401] Updated weights for policy 0, policy_version 81340 (0.0037) [2024-06-21 22:52:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1332707328. Throughput: 0: 42942.3. Samples: 1332897440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-21 22:52:05,701][15401] Updated weights for policy 0, policy_version 81350 (0.0043) [2024-06-21 22:52:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1332936704. Throughput: 0: 42992.5. Samples: 1333015880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 22:52:09,669][15401] Updated weights for policy 0, policy_version 81360 (0.0032) [2024-06-21 22:52:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1333149696. Throughput: 0: 42991.9. Samples: 1333283140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 22:52:13,445][15401] Updated weights for policy 0, policy_version 81370 (0.0035) [2024-06-21 22:52:17,133][15401] Updated weights for policy 0, policy_version 81380 (0.0034) [2024-06-21 22:52:18,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42866.9, 300 sec: 42708.9). Total num frames: 1333362688. Throughput: 0: 42664.7. Samples: 1333533320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:18,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 22:52:21,165][15401] Updated weights for policy 0, policy_version 81390 (0.0029) [2024-06-21 22:52:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1333575680. Throughput: 0: 42905.6. Samples: 1333664900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-21 22:52:24,676][15401] Updated weights for policy 0, policy_version 81400 (0.0051) [2024-06-21 22:52:28,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1333788672. Throughput: 0: 42717.9. Samples: 1333921200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 22:52:28,640][15401] Updated weights for policy 0, policy_version 81410 (0.0042) [2024-06-21 22:52:32,423][15401] Updated weights for policy 0, policy_version 81420 (0.0036) [2024-06-21 22:52:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1334018048. Throughput: 0: 42621.7. Samples: 1334177720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 22:52:34,993][15349] Signal inference workers to stop experience collection... (19700 times) [2024-06-21 22:52:34,993][15349] Signal inference workers to resume experience collection... (19700 times) [2024-06-21 22:52:35,003][15401] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-06-21 22:52:35,003][15401] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-06-21 22:52:36,386][15401] Updated weights for policy 0, policy_version 81430 (0.0037) [2024-06-21 22:52:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1334231040. Throughput: 0: 42846.4. Samples: 1334307020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 22:52:40,009][15401] Updated weights for policy 0, policy_version 81440 (0.0036) [2024-06-21 22:52:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1334427648. Throughput: 0: 42844.4. Samples: 1334571180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-21 22:52:43,943][15401] Updated weights for policy 0, policy_version 81450 (0.0032) [2024-06-21 22:52:47,868][15401] Updated weights for policy 0, policy_version 81460 (0.0025) [2024-06-21 22:52:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1334673408. Throughput: 0: 42709.7. Samples: 1334819380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-21 22:52:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 22:52:51,755][15401] Updated weights for policy 0, policy_version 81470 (0.0028) [2024-06-21 22:52:53,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1334870016. Throughput: 0: 42955.5. Samples: 1334948980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:52:53,393][15132] Avg episode reward: [(0, '0.386')] [2024-06-21 22:52:55,329][15401] Updated weights for policy 0, policy_version 81480 (0.0034) [2024-06-21 22:52:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1335066624. Throughput: 0: 42764.5. Samples: 1335207540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:52:58,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-21 22:52:59,313][15401] Updated weights for policy 0, policy_version 81490 (0.0029) [2024-06-21 22:53:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1335279616. Throughput: 0: 42870.6. Samples: 1335462220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 22:53:03,404][15401] Updated weights for policy 0, policy_version 81500 (0.0037) [2024-06-21 22:53:07,176][15401] Updated weights for policy 0, policy_version 81510 (0.0041) [2024-06-21 22:53:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1335508992. Throughput: 0: 42828.6. Samples: 1335592180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:08,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 22:53:10,933][15401] Updated weights for policy 0, policy_version 81520 (0.0041) [2024-06-21 22:53:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1335721984. Throughput: 0: 42790.2. Samples: 1335846760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 22:53:14,739][15401] Updated weights for policy 0, policy_version 81530 (0.0040) [2024-06-21 22:53:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 1335934976. Throughput: 0: 42861.4. Samples: 1336106480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 22:53:18,490][15401] Updated weights for policy 0, policy_version 81540 (0.0023) [2024-06-21 22:53:22,220][15401] Updated weights for policy 0, policy_version 81550 (0.0038) [2024-06-21 22:53:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1336147968. Throughput: 0: 42852.7. Samples: 1336235400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 22:53:26,058][15401] Updated weights for policy 0, policy_version 81560 (0.0037) [2024-06-21 22:53:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1336377344. Throughput: 0: 42710.6. Samples: 1336493160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 22:53:29,756][15401] Updated weights for policy 0, policy_version 81570 (0.0037) [2024-06-21 22:53:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1336573952. Throughput: 0: 42897.4. Samples: 1336749760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:33,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-21 22:53:33,598][15401] Updated weights for policy 0, policy_version 81580 (0.0027) [2024-06-21 22:53:37,277][15401] Updated weights for policy 0, policy_version 81590 (0.0027) [2024-06-21 22:53:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 1336786944. Throughput: 0: 42913.8. Samples: 1336880100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 22:53:38,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 22:53:41,309][15401] Updated weights for policy 0, policy_version 81600 (0.0050) [2024-06-21 22:53:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1337016320. Throughput: 0: 42863.1. Samples: 1337136380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:53:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 22:53:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081605_1337016320.pth... [2024-06-21 22:53:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000080978_1326743552.pth [2024-06-21 22:53:44,909][15401] Updated weights for policy 0, policy_version 81610 (0.0033) [2024-06-21 22:53:48,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1337229312. Throughput: 0: 42996.0. Samples: 1337397040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:53:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 22:53:48,815][15401] Updated weights for policy 0, policy_version 81620 (0.0033) [2024-06-21 22:53:52,313][15401] Updated weights for policy 0, policy_version 81630 (0.0042) [2024-06-21 22:53:53,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 1337425920. Throughput: 0: 42967.9. Samples: 1337525840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:53:53,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-21 22:53:56,656][15401] Updated weights for policy 0, policy_version 81640 (0.0042) [2024-06-21 22:53:58,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 1337671680. Throughput: 0: 42952.3. Samples: 1337779720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:53:58,393][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 22:53:59,141][15349] Signal inference workers to stop experience collection... (19750 times) [2024-06-21 22:53:59,179][15401] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-06-21 22:53:59,205][15349] Signal inference workers to resume experience collection... (19750 times) [2024-06-21 22:53:59,213][15401] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-06-21 22:54:00,207][15401] Updated weights for policy 0, policy_version 81650 (0.0038) [2024-06-21 22:54:03,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1337868288. Throughput: 0: 42932.5. Samples: 1338038440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 22:54:04,267][15401] Updated weights for policy 0, policy_version 81660 (0.0036) [2024-06-21 22:54:08,303][15401] Updated weights for policy 0, policy_version 81670 (0.0037) [2024-06-21 22:54:08,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1338081280. Throughput: 0: 42964.5. Samples: 1338168800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 22:54:11,951][15401] Updated weights for policy 0, policy_version 81680 (0.0030) [2024-06-21 22:54:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1338310656. Throughput: 0: 43088.8. Samples: 1338432160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 22:54:15,837][15401] Updated weights for policy 0, policy_version 81690 (0.0032) [2024-06-21 22:54:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1338507264. Throughput: 0: 42982.2. Samples: 1338683960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-21 22:54:19,604][15401] Updated weights for policy 0, policy_version 81700 (0.0028) [2024-06-21 22:54:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1338720256. Throughput: 0: 42757.9. Samples: 1338804100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 22:54:23,491][15401] Updated weights for policy 0, policy_version 81710 (0.0022) [2024-06-21 22:54:27,228][15401] Updated weights for policy 0, policy_version 81720 (0.0033) [2024-06-21 22:54:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 1338949632. Throughput: 0: 42909.7. Samples: 1339067320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-21 22:54:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 22:54:31,147][15401] Updated weights for policy 0, policy_version 81730 (0.0036) [2024-06-21 22:54:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 1339162624. Throughput: 0: 42742.8. Samples: 1339320460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-21 22:54:34,640][15401] Updated weights for policy 0, policy_version 81740 (0.0029) [2024-06-21 22:54:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 1339359232. Throughput: 0: 42772.1. Samples: 1339450480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 22:54:38,587][15401] Updated weights for policy 0, policy_version 81750 (0.0040) [2024-06-21 22:54:42,191][15401] Updated weights for policy 0, policy_version 81760 (0.0041) [2024-06-21 22:54:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1339572224. Throughput: 0: 42681.5. Samples: 1339700280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 22:54:46,720][15401] Updated weights for policy 0, policy_version 81770 (0.0035) [2024-06-21 22:54:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1339801600. Throughput: 0: 42650.6. Samples: 1339957720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-21 22:54:49,706][15401] Updated weights for policy 0, policy_version 81780 (0.0024) [2024-06-21 22:54:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1339998208. Throughput: 0: 42671.6. Samples: 1340089020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 22:54:54,281][15401] Updated weights for policy 0, policy_version 81790 (0.0042) [2024-06-21 22:54:57,524][15401] Updated weights for policy 0, policy_version 81800 (0.0028) [2024-06-21 22:54:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1340227584. Throughput: 0: 42379.6. Samples: 1340339240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:54:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 22:55:01,892][15401] Updated weights for policy 0, policy_version 81810 (0.0032) [2024-06-21 22:55:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1340424192. Throughput: 0: 42653.4. Samples: 1340603360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:55:03,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-21 22:55:04,942][15401] Updated weights for policy 0, policy_version 81820 (0.0045) [2024-06-21 22:55:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1340637184. Throughput: 0: 42880.4. Samples: 1340733720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:55:08,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-21 22:55:09,454][15401] Updated weights for policy 0, policy_version 81830 (0.0028) [2024-06-21 22:55:12,613][15401] Updated weights for policy 0, policy_version 81840 (0.0030) [2024-06-21 22:55:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1340866560. Throughput: 0: 42702.2. Samples: 1340988920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 22:55:13,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 22:55:17,195][15401] Updated weights for policy 0, policy_version 81850 (0.0022) [2024-06-21 22:55:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1341079552. Throughput: 0: 42745.3. Samples: 1341244000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-21 22:55:20,624][15401] Updated weights for policy 0, policy_version 81860 (0.0042) [2024-06-21 22:55:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1341276160. Throughput: 0: 42682.2. Samples: 1341371180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 22:55:24,800][15401] Updated weights for policy 0, policy_version 81870 (0.0033) [2024-06-21 22:55:28,323][15401] Updated weights for policy 0, policy_version 81880 (0.0041) [2024-06-21 22:55:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1341521920. Throughput: 0: 42838.2. Samples: 1341628000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 22:55:32,222][15349] Signal inference workers to stop experience collection... (19800 times) [2024-06-21 22:55:32,223][15349] Signal inference workers to resume experience collection... (19800 times) [2024-06-21 22:55:32,265][15401] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-06-21 22:55:32,265][15401] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-06-21 22:55:32,364][15401] Updated weights for policy 0, policy_version 81890 (0.0041) [2024-06-21 22:55:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 1341718528. Throughput: 0: 42884.1. Samples: 1341887500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 22:55:35,852][15401] Updated weights for policy 0, policy_version 81900 (0.0039) [2024-06-21 22:55:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1341931520. Throughput: 0: 42673.3. Samples: 1342009320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:38,391][15132] Avg episode reward: [(0, '0.342')] [2024-06-21 22:55:40,202][15401] Updated weights for policy 0, policy_version 81910 (0.0026) [2024-06-21 22:55:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1342160896. Throughput: 0: 43014.7. Samples: 1342274900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-21 22:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081919_1342160896.pth... [2024-06-21 22:55:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081293_1331904512.pth [2024-06-21 22:55:43,669][15401] Updated weights for policy 0, policy_version 81920 (0.0032) [2024-06-21 22:55:47,623][15401] Updated weights for policy 0, policy_version 81930 (0.0049) [2024-06-21 22:55:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1342373888. Throughput: 0: 42665.6. Samples: 1342523320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 22:55:51,282][15401] Updated weights for policy 0, policy_version 81940 (0.0028) [2024-06-21 22:55:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1342586880. Throughput: 0: 42688.8. Samples: 1342654720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 22:55:55,087][15401] Updated weights for policy 0, policy_version 81950 (0.0022) [2024-06-21 22:55:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1342799872. Throughput: 0: 43009.9. Samples: 1342924360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:55:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 22:55:58,741][15401] Updated weights for policy 0, policy_version 81960 (0.0040) [2024-06-21 22:56:02,731][15401] Updated weights for policy 0, policy_version 81970 (0.0031) [2024-06-21 22:56:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1343012864. Throughput: 0: 42754.1. Samples: 1343167940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-21 22:56:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-21 22:56:06,629][15401] Updated weights for policy 0, policy_version 81980 (0.0030) [2024-06-21 22:56:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1343225856. Throughput: 0: 42784.9. Samples: 1343296500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-21 22:56:10,351][15401] Updated weights for policy 0, policy_version 81990 (0.0035) [2024-06-21 22:56:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1343422464. Throughput: 0: 42941.8. Samples: 1343560380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 22:56:14,129][15401] Updated weights for policy 0, policy_version 82000 (0.0035) [2024-06-21 22:56:18,152][15401] Updated weights for policy 0, policy_version 82010 (0.0033) [2024-06-21 22:56:18,390][15132] Fps is (10 sec: 42595.6, 60 sec: 42871.0, 300 sec: 42931.9). Total num frames: 1343651840. Throughput: 0: 42681.1. Samples: 1343808180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:18,391][15132] Avg episode reward: [(0, '0.593')] [2024-06-21 22:56:21,785][15401] Updated weights for policy 0, policy_version 82020 (0.0041) [2024-06-21 22:56:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1343864832. Throughput: 0: 42891.1. Samples: 1343939420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-21 22:56:25,815][15401] Updated weights for policy 0, policy_version 82030 (0.0034) [2024-06-21 22:56:28,389][15132] Fps is (10 sec: 40962.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1344061440. Throughput: 0: 42757.8. Samples: 1344199000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-21 22:56:29,315][15401] Updated weights for policy 0, policy_version 82040 (0.0027) [2024-06-21 22:56:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1344290816. Throughput: 0: 42705.9. Samples: 1344445080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-21 22:56:33,710][15401] Updated weights for policy 0, policy_version 82050 (0.0027) [2024-06-21 22:56:36,807][15401] Updated weights for policy 0, policy_version 82060 (0.0034) [2024-06-21 22:56:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1344487424. Throughput: 0: 42689.9. Samples: 1344575760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 22:56:41,651][15401] Updated weights for policy 0, policy_version 82070 (0.0039) [2024-06-21 22:56:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1344733184. Throughput: 0: 42508.9. Samples: 1344837260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 22:56:44,726][15401] Updated weights for policy 0, policy_version 82080 (0.0025) [2024-06-21 22:56:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1344913408. Throughput: 0: 42817.0. Samples: 1345094700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 22:56:49,403][15401] Updated weights for policy 0, policy_version 82090 (0.0023) [2024-06-21 22:56:52,269][15401] Updated weights for policy 0, policy_version 82100 (0.0050) [2024-06-21 22:56:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1345126400. Throughput: 0: 42685.4. Samples: 1345217340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 22:56:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 22:56:54,712][15349] Signal inference workers to stop experience collection... (19850 times) [2024-06-21 22:56:54,712][15349] Signal inference workers to resume experience collection... (19850 times) [2024-06-21 22:56:54,751][15401] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-06-21 22:56:54,756][15401] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-06-21 22:56:57,056][15401] Updated weights for policy 0, policy_version 82110 (0.0032) [2024-06-21 22:56:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1345339392. Throughput: 0: 42499.0. Samples: 1345472840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:56:58,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-21 22:57:00,111][15401] Updated weights for policy 0, policy_version 82120 (0.0039) [2024-06-21 22:57:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1345552384. Throughput: 0: 42591.2. Samples: 1345724760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 22:57:04,845][15401] Updated weights for policy 0, policy_version 82130 (0.0030) [2024-06-21 22:57:07,930][15401] Updated weights for policy 0, policy_version 82140 (0.0028) [2024-06-21 22:57:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1345781760. Throughput: 0: 42520.5. Samples: 1345852840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 22:57:12,291][15401] Updated weights for policy 0, policy_version 82150 (0.0031) [2024-06-21 22:57:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 1345978368. Throughput: 0: 42539.1. Samples: 1346113260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:57:15,637][15401] Updated weights for policy 0, policy_version 82160 (0.0033) [2024-06-21 22:57:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.8, 300 sec: 42765.0). Total num frames: 1346191360. Throughput: 0: 42745.7. Samples: 1346368640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 22:57:20,131][15401] Updated weights for policy 0, policy_version 82170 (0.0033) [2024-06-21 22:57:23,165][15401] Updated weights for policy 0, policy_version 82180 (0.0040) [2024-06-21 22:57:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1346437120. Throughput: 0: 42578.2. Samples: 1346491780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 22:57:27,921][15401] Updated weights for policy 0, policy_version 82190 (0.0043) [2024-06-21 22:57:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1346617344. Throughput: 0: 42629.7. Samples: 1346755600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 22:57:30,816][15401] Updated weights for policy 0, policy_version 82200 (0.0037) [2024-06-21 22:57:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1346846720. Throughput: 0: 42525.6. Samples: 1347008360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:33,396][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 22:57:35,597][15401] Updated weights for policy 0, policy_version 82210 (0.0043) [2024-06-21 22:57:38,286][15401] Updated weights for policy 0, policy_version 82220 (0.0028) [2024-06-21 22:57:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1347092480. Throughput: 0: 42615.0. Samples: 1347135020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-21 22:57:43,057][15401] Updated weights for policy 0, policy_version 82230 (0.0045) [2024-06-21 22:57:43,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 1347256320. Throughput: 0: 42849.4. Samples: 1347401160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-21 22:57:43,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 22:57:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082231_1347272704.pth... [2024-06-21 22:57:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081605_1337016320.pth [2024-06-21 22:57:46,660][15401] Updated weights for policy 0, policy_version 82240 (0.0022) [2024-06-21 22:57:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 1347485696. Throughput: 0: 42824.1. Samples: 1347651840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:57:48,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 22:57:50,741][15401] Updated weights for policy 0, policy_version 82250 (0.0041) [2024-06-21 22:57:53,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1347715072. Throughput: 0: 42871.4. Samples: 1347782060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:57:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-21 22:57:54,268][15401] Updated weights for policy 0, policy_version 82260 (0.0034) [2024-06-21 22:57:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1347895296. Throughput: 0: 42818.2. Samples: 1348040080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:57:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 22:57:58,474][15401] Updated weights for policy 0, policy_version 82270 (0.0040) [2024-06-21 22:58:01,972][15401] Updated weights for policy 0, policy_version 82280 (0.0035) [2024-06-21 22:58:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1348124672. Throughput: 0: 42792.4. Samples: 1348294300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-21 22:58:06,189][15401] Updated weights for policy 0, policy_version 82290 (0.0041) [2024-06-21 22:58:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1348370432. Throughput: 0: 42958.6. Samples: 1348424920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-21 22:58:09,544][15401] Updated weights for policy 0, policy_version 82300 (0.0041) [2024-06-21 22:58:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1348550656. Throughput: 0: 42774.7. Samples: 1348680460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:13,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-21 22:58:13,592][15401] Updated weights for policy 0, policy_version 82310 (0.0047) [2024-06-21 22:58:17,101][15401] Updated weights for policy 0, policy_version 82320 (0.0031) [2024-06-21 22:58:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1348763648. Throughput: 0: 42882.8. Samples: 1348938080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 22:58:21,135][15401] Updated weights for policy 0, policy_version 82330 (0.0024) [2024-06-21 22:58:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1348993024. Throughput: 0: 42854.3. Samples: 1349063460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-21 22:58:24,837][15401] Updated weights for policy 0, policy_version 82340 (0.0039) [2024-06-21 22:58:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1349189632. Throughput: 0: 42661.7. Samples: 1349320840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:28,391][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 22:58:28,673][15401] Updated weights for policy 0, policy_version 82350 (0.0038) [2024-06-21 22:58:32,734][15401] Updated weights for policy 0, policy_version 82360 (0.0032) [2024-06-21 22:58:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1349402624. Throughput: 0: 42738.5. Samples: 1349575080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-21 22:58:33,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-21 22:58:36,306][15401] Updated weights for policy 0, policy_version 82370 (0.0027) [2024-06-21 22:58:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1349615616. Throughput: 0: 42571.7. Samples: 1349697780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:58:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 22:58:40,144][15349] Signal inference workers to stop experience collection... (19900 times) [2024-06-21 22:58:40,144][15349] Signal inference workers to resume experience collection... (19900 times) [2024-06-21 22:58:40,177][15401] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-06-21 22:58:40,177][15401] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-06-21 22:58:40,286][15401] Updated weights for policy 0, policy_version 82380 (0.0027) [2024-06-21 22:58:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 1349812224. Throughput: 0: 42674.7. Samples: 1349960440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:58:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-21 22:58:44,111][15401] Updated weights for policy 0, policy_version 82390 (0.0035) [2024-06-21 22:58:48,300][15401] Updated weights for policy 0, policy_version 82400 (0.0028) [2024-06-21 22:58:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 1350041600. Throughput: 0: 42641.8. Samples: 1350213280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:58:48,393][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 22:58:51,926][15401] Updated weights for policy 0, policy_version 82410 (0.0033) [2024-06-21 22:58:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1350254592. Throughput: 0: 42570.4. Samples: 1350340580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:58:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 22:58:55,946][15401] Updated weights for policy 0, policy_version 82420 (0.0036) [2024-06-21 22:58:58,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1350483968. Throughput: 0: 42683.5. Samples: 1350601320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:58:58,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-21 22:58:59,455][15401] Updated weights for policy 0, policy_version 82430 (0.0036) [2024-06-21 22:59:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1350680576. Throughput: 0: 42603.1. Samples: 1350855220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:03,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-21 22:59:03,571][15401] Updated weights for policy 0, policy_version 82440 (0.0037) [2024-06-21 22:59:07,023][15401] Updated weights for policy 0, policy_version 82450 (0.0030) [2024-06-21 22:59:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1350909952. Throughput: 0: 42628.5. Samples: 1350981740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 22:59:11,082][15401] Updated weights for policy 0, policy_version 82460 (0.0028) [2024-06-21 22:59:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1351106560. Throughput: 0: 42748.1. Samples: 1351244500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 22:59:14,597][15401] Updated weights for policy 0, policy_version 82470 (0.0039) [2024-06-21 22:59:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1351319552. Throughput: 0: 42761.0. Samples: 1351499320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 22:59:18,645][15401] Updated weights for policy 0, policy_version 82480 (0.0035) [2024-06-21 22:59:22,122][15401] Updated weights for policy 0, policy_version 82490 (0.0031) [2024-06-21 22:59:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1351548928. Throughput: 0: 42891.6. Samples: 1351627900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-21 22:59:26,087][15401] Updated weights for policy 0, policy_version 82500 (0.0039) [2024-06-21 22:59:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 1351761920. Throughput: 0: 42886.2. Samples: 1351890420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-21 22:59:28,392][15132] Avg episode reward: [(0, '0.805')] [2024-06-21 22:59:29,569][15401] Updated weights for policy 0, policy_version 82510 (0.0028) [2024-06-21 22:59:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1351991296. Throughput: 0: 42962.2. Samples: 1352146480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 22:59:33,542][15401] Updated weights for policy 0, policy_version 82520 (0.0034) [2024-06-21 22:59:37,160][15401] Updated weights for policy 0, policy_version 82530 (0.0027) [2024-06-21 22:59:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1352204288. Throughput: 0: 43036.8. Samples: 1352277240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 22:59:41,053][15401] Updated weights for policy 0, policy_version 82540 (0.0035) [2024-06-21 22:59:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 1352400896. Throughput: 0: 43060.8. Samples: 1352538960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 22:59:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082544_1352400896.pth... [2024-06-21 22:59:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000081919_1342160896.pth [2024-06-21 22:59:44,876][15401] Updated weights for policy 0, policy_version 82550 (0.0030) [2024-06-21 22:59:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 1352630272. Throughput: 0: 43029.8. Samples: 1352791560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-21 22:59:49,010][15401] Updated weights for policy 0, policy_version 82560 (0.0039) [2024-06-21 22:59:52,560][15401] Updated weights for policy 0, policy_version 82570 (0.0045) [2024-06-21 22:59:53,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1352843264. Throughput: 0: 43100.0. Samples: 1352921240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:53,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-21 22:59:57,106][15401] Updated weights for policy 0, policy_version 82580 (0.0034) [2024-06-21 22:59:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1353039872. Throughput: 0: 42954.6. Samples: 1353177460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 22:59:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-21 23:00:00,244][15401] Updated weights for policy 0, policy_version 82590 (0.0032) [2024-06-21 23:00:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1353269248. Throughput: 0: 42973.0. Samples: 1353433100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:00:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 23:00:04,639][15401] Updated weights for policy 0, policy_version 82600 (0.0034) [2024-06-21 23:00:08,143][15401] Updated weights for policy 0, policy_version 82610 (0.0033) [2024-06-21 23:00:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1353482240. Throughput: 0: 42892.7. Samples: 1353558080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:00:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:00:08,675][15349] Signal inference workers to stop experience collection... (19950 times) [2024-06-21 23:00:08,676][15349] Signal inference workers to resume experience collection... (19950 times) [2024-06-21 23:00:08,721][15401] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-06-21 23:00:08,721][15401] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-06-21 23:00:12,198][15401] Updated weights for policy 0, policy_version 82620 (0.0035) [2024-06-21 23:00:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1353662464. Throughput: 0: 42759.1. Samples: 1353814480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:00:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 23:00:16,009][15401] Updated weights for policy 0, policy_version 82630 (0.0042) [2024-06-21 23:00:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1353908224. Throughput: 0: 42668.1. Samples: 1354066540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:00:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-21 23:00:20,117][15401] Updated weights for policy 0, policy_version 82640 (0.0039) [2024-06-21 23:00:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1354104832. Throughput: 0: 42574.7. Samples: 1354193100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:00:23,530][15401] Updated weights for policy 0, policy_version 82650 (0.0033) [2024-06-21 23:00:27,595][15401] Updated weights for policy 0, policy_version 82660 (0.0040) [2024-06-21 23:00:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1354334208. Throughput: 0: 42615.2. Samples: 1354456640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 23:00:31,118][15401] Updated weights for policy 0, policy_version 82670 (0.0030) [2024-06-21 23:00:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1354563584. Throughput: 0: 42649.7. Samples: 1354710800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-21 23:00:35,180][15401] Updated weights for policy 0, policy_version 82680 (0.0040) [2024-06-21 23:00:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1354760192. Throughput: 0: 42739.5. Samples: 1354844520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 23:00:38,697][15401] Updated weights for policy 0, policy_version 82690 (0.0040) [2024-06-21 23:00:42,744][15401] Updated weights for policy 0, policy_version 82700 (0.0036) [2024-06-21 23:00:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1354973184. Throughput: 0: 42686.2. Samples: 1355098440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:43,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:00:46,334][15401] Updated weights for policy 0, policy_version 82710 (0.0035) [2024-06-21 23:00:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1355218944. Throughput: 0: 42624.8. Samples: 1355351220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:48,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-21 23:00:50,315][15401] Updated weights for policy 0, policy_version 82720 (0.0045) [2024-06-21 23:00:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1355415552. Throughput: 0: 42878.8. Samples: 1355487620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 23:00:53,898][15401] Updated weights for policy 0, policy_version 82730 (0.0039) [2024-06-21 23:00:57,939][15401] Updated weights for policy 0, policy_version 82740 (0.0032) [2024-06-21 23:00:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1355628544. Throughput: 0: 42961.9. Samples: 1355747760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:00:58,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-21 23:01:01,713][15401] Updated weights for policy 0, policy_version 82750 (0.0030) [2024-06-21 23:01:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1355841536. Throughput: 0: 42886.6. Samples: 1355996440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:01:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 23:01:05,438][15401] Updated weights for policy 0, policy_version 82760 (0.0038) [2024-06-21 23:01:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1356038144. Throughput: 0: 43002.1. Samples: 1356128200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-21 23:01:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 23:01:09,308][15401] Updated weights for policy 0, policy_version 82770 (0.0030) [2024-06-21 23:01:13,041][15401] Updated weights for policy 0, policy_version 82780 (0.0027) [2024-06-21 23:01:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 1356283904. Throughput: 0: 42954.7. Samples: 1356389600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 23:01:16,983][15401] Updated weights for policy 0, policy_version 82790 (0.0034) [2024-06-21 23:01:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1356480512. Throughput: 0: 42898.3. Samples: 1356641220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 23:01:20,744][15401] Updated weights for policy 0, policy_version 82800 (0.0044) [2024-06-21 23:01:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1356677120. Throughput: 0: 42735.6. Samples: 1356767620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:23,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 23:01:24,641][15401] Updated weights for policy 0, policy_version 82810 (0.0036) [2024-06-21 23:01:28,146][15401] Updated weights for policy 0, policy_version 82820 (0.0038) [2024-06-21 23:01:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1356922880. Throughput: 0: 42924.9. Samples: 1357029960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 23:01:28,862][15349] Signal inference workers to stop experience collection... (20000 times) [2024-06-21 23:01:28,863][15349] Signal inference workers to resume experience collection... (20000 times) [2024-06-21 23:01:28,902][15401] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-06-21 23:01:28,902][15401] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-06-21 23:01:32,343][15401] Updated weights for policy 0, policy_version 82830 (0.0041) [2024-06-21 23:01:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1357135872. Throughput: 0: 43157.2. Samples: 1357293300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 23:01:35,553][15401] Updated weights for policy 0, policy_version 82840 (0.0041) [2024-06-21 23:01:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1357332480. Throughput: 0: 42882.6. Samples: 1357417440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:38,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 23:01:40,329][15401] Updated weights for policy 0, policy_version 82850 (0.0037) [2024-06-21 23:01:43,103][15401] Updated weights for policy 0, policy_version 82860 (0.0042) [2024-06-21 23:01:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43419.2, 300 sec: 42931.6). Total num frames: 1357578240. Throughput: 0: 42896.2. Samples: 1357678100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 23:01:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082861_1357594624.pth... [2024-06-21 23:01:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082231_1347272704.pth [2024-06-21 23:01:48,036][15401] Updated weights for policy 0, policy_version 82870 (0.0041) [2024-06-21 23:01:48,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 1357758464. Throughput: 0: 43165.4. Samples: 1357938980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:48,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:01:50,611][15401] Updated weights for policy 0, policy_version 82880 (0.0054) [2024-06-21 23:01:53,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1357971456. Throughput: 0: 42871.1. Samples: 1358057400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-21 23:01:55,491][15401] Updated weights for policy 0, policy_version 82890 (0.0028) [2024-06-21 23:01:58,355][15401] Updated weights for policy 0, policy_version 82900 (0.0029) [2024-06-21 23:01:58,390][15132] Fps is (10 sec: 47524.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 1358233600. Throughput: 0: 42952.9. Samples: 1358322480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:01:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 23:02:03,038][15401] Updated weights for policy 0, policy_version 82910 (0.0029) [2024-06-21 23:02:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1358397440. Throughput: 0: 43152.4. Samples: 1358583080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 23:02:05,824][15401] Updated weights for policy 0, policy_version 82920 (0.0027) [2024-06-21 23:02:08,396][15132] Fps is (10 sec: 39296.8, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 1358626816. Throughput: 0: 43046.7. Samples: 1358705000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:08,396][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 23:02:10,726][15401] Updated weights for policy 0, policy_version 82930 (0.0028) [2024-06-21 23:02:13,373][15401] Updated weights for policy 0, policy_version 82940 (0.0020) [2024-06-21 23:02:13,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 1358888960. Throughput: 0: 43177.9. Samples: 1358972960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 23:02:18,310][15401] Updated weights for policy 0, policy_version 82950 (0.0037) [2024-06-21 23:02:18,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1359052800. Throughput: 0: 42996.9. Samples: 1359228160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 23:02:21,568][15401] Updated weights for policy 0, policy_version 82960 (0.0031) [2024-06-21 23:02:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1359282176. Throughput: 0: 42837.4. Samples: 1359345020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 23:02:25,829][15401] Updated weights for policy 0, policy_version 82970 (0.0028) [2024-06-21 23:02:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1359495168. Throughput: 0: 42920.0. Samples: 1359609500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 23:02:29,144][15401] Updated weights for policy 0, policy_version 82980 (0.0028) [2024-06-21 23:02:33,376][15401] Updated weights for policy 0, policy_version 82990 (0.0038) [2024-06-21 23:02:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1359708160. Throughput: 0: 42868.4. Samples: 1359867960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 23:02:36,618][15401] Updated weights for policy 0, policy_version 83000 (0.0030) [2024-06-21 23:02:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.3, 300 sec: 42932.0). Total num frames: 1359921152. Throughput: 0: 42994.7. Samples: 1359992160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:38,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-21 23:02:41,062][15401] Updated weights for policy 0, policy_version 83010 (0.0038) [2024-06-21 23:02:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 1360150528. Throughput: 0: 42832.0. Samples: 1360249920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 23:02:44,624][15401] Updated weights for policy 0, policy_version 83020 (0.0043) [2024-06-21 23:02:46,487][15349] Signal inference workers to stop experience collection... (20050 times) [2024-06-21 23:02:46,492][15349] Signal inference workers to resume experience collection... (20050 times) [2024-06-21 23:02:46,512][15401] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-06-21 23:02:46,542][15401] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-06-21 23:02:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1360330752. Throughput: 0: 42806.2. Samples: 1360509360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-21 23:02:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 23:02:48,930][15401] Updated weights for policy 0, policy_version 83030 (0.0037) [2024-06-21 23:02:52,313][15401] Updated weights for policy 0, policy_version 83040 (0.0044) [2024-06-21 23:02:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1360560128. Throughput: 0: 42788.1. Samples: 1360630200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:02:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 23:02:56,353][15401] Updated weights for policy 0, policy_version 83050 (0.0049) [2024-06-21 23:02:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1360773120. Throughput: 0: 42714.5. Samples: 1360895120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:02:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 23:03:00,088][15401] Updated weights for policy 0, policy_version 83060 (0.0035) [2024-06-21 23:03:03,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1360953344. Throughput: 0: 42772.6. Samples: 1361152920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 23:03:04,331][15401] Updated weights for policy 0, policy_version 83070 (0.0033) [2024-06-21 23:03:07,549][15401] Updated weights for policy 0, policy_version 83080 (0.0031) [2024-06-21 23:03:08,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42874.3, 300 sec: 42875.8). Total num frames: 1361199104. Throughput: 0: 42829.7. Samples: 1361272460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:08,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 23:03:12,175][15401] Updated weights for policy 0, policy_version 83090 (0.0022) [2024-06-21 23:03:13,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 1361428480. Throughput: 0: 42827.6. Samples: 1361536740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:13,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-21 23:03:15,485][15401] Updated weights for policy 0, policy_version 83100 (0.0035) [2024-06-21 23:03:18,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1361608704. Throughput: 0: 42743.0. Samples: 1361791400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 23:03:19,781][15401] Updated weights for policy 0, policy_version 83110 (0.0035) [2024-06-21 23:03:23,045][15401] Updated weights for policy 0, policy_version 83120 (0.0032) [2024-06-21 23:03:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1361854464. Throughput: 0: 42627.4. Samples: 1361910400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 23:03:27,230][15401] Updated weights for policy 0, policy_version 83130 (0.0021) [2024-06-21 23:03:28,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1362067456. Throughput: 0: 42848.5. Samples: 1362178100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:28,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-21 23:03:30,513][15401] Updated weights for policy 0, policy_version 83140 (0.0026) [2024-06-21 23:03:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 1362247680. Throughput: 0: 42831.1. Samples: 1362436760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-21 23:03:34,727][15401] Updated weights for policy 0, policy_version 83150 (0.0026) [2024-06-21 23:03:37,886][15401] Updated weights for policy 0, policy_version 83160 (0.0031) [2024-06-21 23:03:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1362509824. Throughput: 0: 42921.9. Samples: 1362561680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-21 23:03:38,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 23:03:42,595][15401] Updated weights for policy 0, policy_version 83170 (0.0026) [2024-06-21 23:03:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 1362690048. Throughput: 0: 42975.6. Samples: 1362829020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:03:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 23:03:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083172_1362690048.pth... [2024-06-21 23:03:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082544_1352400896.pth [2024-06-21 23:03:45,581][15401] Updated weights for policy 0, policy_version 83180 (0.0032) [2024-06-21 23:03:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1362903040. Throughput: 0: 42860.7. Samples: 1363081660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:03:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 23:03:50,307][15401] Updated weights for policy 0, policy_version 83190 (0.0052) [2024-06-21 23:03:53,201][15401] Updated weights for policy 0, policy_version 83200 (0.0036) [2024-06-21 23:03:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 1363148800. Throughput: 0: 43139.5. Samples: 1363213640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:03:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 23:03:57,857][15401] Updated weights for policy 0, policy_version 83210 (0.0031) [2024-06-21 23:03:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1363329024. Throughput: 0: 43011.7. Samples: 1363472260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:03:58,396][15132] Avg episode reward: [(0, '0.312')] [2024-06-21 23:04:00,893][15401] Updated weights for policy 0, policy_version 83220 (0.0029) [2024-06-21 23:04:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1363558400. Throughput: 0: 42877.9. Samples: 1363720900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-21 23:04:05,423][15401] Updated weights for policy 0, policy_version 83230 (0.0036) [2024-06-21 23:04:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 1363787776. Throughput: 0: 43176.2. Samples: 1363853320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 23:04:08,482][15401] Updated weights for policy 0, policy_version 83240 (0.0026) [2024-06-21 23:04:10,183][15349] Signal inference workers to stop experience collection... (20100 times) [2024-06-21 23:04:10,209][15401] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-06-21 23:04:10,245][15349] Signal inference workers to resume experience collection... (20100 times) [2024-06-21 23:04:10,254][15401] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-06-21 23:04:13,331][15401] Updated weights for policy 0, policy_version 83250 (0.0032) [2024-06-21 23:04:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1363968000. Throughput: 0: 42895.6. Samples: 1364108400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-21 23:04:16,070][15401] Updated weights for policy 0, policy_version 83260 (0.0036) [2024-06-21 23:04:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 1364213760. Throughput: 0: 42807.5. Samples: 1364363200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:18,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-21 23:04:20,980][15401] Updated weights for policy 0, policy_version 83270 (0.0034) [2024-06-21 23:04:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.7, 300 sec: 42932.0). Total num frames: 1364426752. Throughput: 0: 43050.8. Samples: 1364498960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 23:04:23,975][15401] Updated weights for policy 0, policy_version 83280 (0.0032) [2024-06-21 23:04:28,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1364606976. Throughput: 0: 42669.3. Samples: 1364749140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-21 23:04:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-21 23:04:28,557][15401] Updated weights for policy 0, policy_version 83290 (0.0042) [2024-06-21 23:04:31,614][15401] Updated weights for policy 0, policy_version 83300 (0.0044) [2024-06-21 23:04:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1364852736. Throughput: 0: 42713.5. Samples: 1365003760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-21 23:04:36,332][15401] Updated weights for policy 0, policy_version 83310 (0.0033) [2024-06-21 23:04:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 1365049344. Throughput: 0: 42837.3. Samples: 1365141420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:38,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 23:04:39,350][15401] Updated weights for policy 0, policy_version 83320 (0.0044) [2024-06-21 23:04:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1365245952. Throughput: 0: 42486.2. Samples: 1365384140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 23:04:43,879][15401] Updated weights for policy 0, policy_version 83330 (0.0037) [2024-06-21 23:04:47,142][15401] Updated weights for policy 0, policy_version 83340 (0.0043) [2024-06-21 23:04:48,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1365491712. Throughput: 0: 42770.2. Samples: 1365645560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:48,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 23:04:51,337][15401] Updated weights for policy 0, policy_version 83350 (0.0038) [2024-06-21 23:04:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1365688320. Throughput: 0: 42858.5. Samples: 1365781960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-21 23:04:54,768][15401] Updated weights for policy 0, policy_version 83360 (0.0037) [2024-06-21 23:04:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1365901312. Throughput: 0: 42635.0. Samples: 1366026980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:04:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 23:04:58,844][15401] Updated weights for policy 0, policy_version 83370 (0.0032) [2024-06-21 23:05:02,462][15401] Updated weights for policy 0, policy_version 83380 (0.0034) [2024-06-21 23:05:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1366130688. Throughput: 0: 42812.1. Samples: 1366289640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:05:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-21 23:05:06,663][15401] Updated weights for policy 0, policy_version 83390 (0.0045) [2024-06-21 23:05:08,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 1366310912. Throughput: 0: 42760.3. Samples: 1366423180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:05:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 23:05:10,099][15401] Updated weights for policy 0, policy_version 83400 (0.0022) [2024-06-21 23:05:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1366556672. Throughput: 0: 42578.7. Samples: 1366665180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:05:13,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-21 23:05:14,155][15401] Updated weights for policy 0, policy_version 83410 (0.0027) [2024-06-21 23:05:18,101][15401] Updated weights for policy 0, policy_version 83420 (0.0045) [2024-06-21 23:05:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 1366769664. Throughput: 0: 42799.9. Samples: 1366929760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-21 23:05:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-21 23:05:22,219][15401] Updated weights for policy 0, policy_version 83430 (0.0036) [2024-06-21 23:05:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1366966272. Throughput: 0: 42565.3. Samples: 1367056760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:23,399][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 23:05:25,619][15401] Updated weights for policy 0, policy_version 83440 (0.0024) [2024-06-21 23:05:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1367212032. Throughput: 0: 42759.1. Samples: 1367308300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:28,400][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:05:29,740][15401] Updated weights for policy 0, policy_version 83450 (0.0045) [2024-06-21 23:05:31,058][15349] Signal inference workers to stop experience collection... (20150 times) [2024-06-21 23:05:31,058][15349] Signal inference workers to resume experience collection... (20150 times) [2024-06-21 23:05:31,087][15401] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-06-21 23:05:31,087][15401] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-06-21 23:05:33,185][15401] Updated weights for policy 0, policy_version 83460 (0.0035) [2024-06-21 23:05:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1367408640. Throughput: 0: 42908.6. Samples: 1367576440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:05:37,318][15401] Updated weights for policy 0, policy_version 83470 (0.0047) [2024-06-21 23:05:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.2, 300 sec: 42820.9). Total num frames: 1367605248. Throughput: 0: 42611.3. Samples: 1367699460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 23:05:40,803][15401] Updated weights for policy 0, policy_version 83480 (0.0047) [2024-06-21 23:05:43,394][15132] Fps is (10 sec: 44216.1, 60 sec: 43414.3, 300 sec: 42819.9). Total num frames: 1367851008. Throughput: 0: 42896.2. Samples: 1367957500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:43,395][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 23:05:43,445][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083488_1367867392.pth... [2024-06-21 23:05:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000082861_1357594624.pth [2024-06-21 23:05:44,786][15401] Updated weights for policy 0, policy_version 83490 (0.0023) [2024-06-21 23:05:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1368047616. Throughput: 0: 42933.8. Samples: 1368221660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:48,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-21 23:05:48,409][15401] Updated weights for policy 0, policy_version 83500 (0.0034) [2024-06-21 23:05:52,419][15401] Updated weights for policy 0, policy_version 83510 (0.0034) [2024-06-21 23:05:53,389][15132] Fps is (10 sec: 40979.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1368260608. Throughput: 0: 42674.8. Samples: 1368343540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 23:05:55,944][15401] Updated weights for policy 0, policy_version 83520 (0.0035) [2024-06-21 23:05:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1368473600. Throughput: 0: 43030.2. Samples: 1368601540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:05:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-21 23:05:59,824][15401] Updated weights for policy 0, policy_version 83530 (0.0025) [2024-06-21 23:06:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1368686592. Throughput: 0: 42948.9. Samples: 1368862460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:06:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 23:06:03,851][15401] Updated weights for policy 0, policy_version 83540 (0.0034) [2024-06-21 23:06:07,241][15401] Updated weights for policy 0, policy_version 83550 (0.0033) [2024-06-21 23:06:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 1368932352. Throughput: 0: 42984.9. Samples: 1368991080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-21 23:06:08,394][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 23:06:11,530][15401] Updated weights for policy 0, policy_version 83560 (0.0033) [2024-06-21 23:06:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1369145344. Throughput: 0: 43127.5. Samples: 1369249040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:13,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-21 23:06:15,072][15401] Updated weights for policy 0, policy_version 83570 (0.0032) [2024-06-21 23:06:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 1369341952. Throughput: 0: 42860.0. Samples: 1369505140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 23:06:19,057][15401] Updated weights for policy 0, policy_version 83580 (0.0030) [2024-06-21 23:06:22,774][15401] Updated weights for policy 0, policy_version 83590 (0.0039) [2024-06-21 23:06:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1369554944. Throughput: 0: 42893.2. Samples: 1369629660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:06:26,677][15401] Updated weights for policy 0, policy_version 83600 (0.0036) [2024-06-21 23:06:28,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1369784320. Throughput: 0: 42876.7. Samples: 1369886760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 23:06:30,741][15401] Updated weights for policy 0, policy_version 83610 (0.0036) [2024-06-21 23:06:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 1369997312. Throughput: 0: 42718.5. Samples: 1370144000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 23:06:34,281][15401] Updated weights for policy 0, policy_version 83620 (0.0038) [2024-06-21 23:06:38,323][15401] Updated weights for policy 0, policy_version 83630 (0.0026) [2024-06-21 23:06:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1370193920. Throughput: 0: 42733.2. Samples: 1370266540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:38,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 23:06:40,180][15349] Signal inference workers to stop experience collection... (20200 times) [2024-06-21 23:06:40,187][15349] Signal inference workers to resume experience collection... (20200 times) [2024-06-21 23:06:40,216][15401] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-06-21 23:06:40,216][15401] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-06-21 23:06:41,770][15401] Updated weights for policy 0, policy_version 83640 (0.0028) [2024-06-21 23:06:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42601.7, 300 sec: 42876.4). Total num frames: 1370406912. Throughput: 0: 42676.5. Samples: 1370521980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 23:06:45,983][15401] Updated weights for policy 0, policy_version 83650 (0.0030) [2024-06-21 23:06:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1370619904. Throughput: 0: 42700.5. Samples: 1370783980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 23:06:49,319][15401] Updated weights for policy 0, policy_version 83660 (0.0028) [2024-06-21 23:06:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1370832896. Throughput: 0: 42762.8. Samples: 1370915400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-21 23:06:53,583][15401] Updated weights for policy 0, policy_version 83670 (0.0041) [2024-06-21 23:06:56,908][15401] Updated weights for policy 0, policy_version 83680 (0.0036) [2024-06-21 23:06:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 1371029504. Throughput: 0: 42573.3. Samples: 1371164940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:06:58,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:07:01,124][15401] Updated weights for policy 0, policy_version 83690 (0.0044) [2024-06-21 23:07:03,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 1371226112. Throughput: 0: 42755.3. Samples: 1371429140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:07:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 23:07:04,738][15401] Updated weights for policy 0, policy_version 83700 (0.0028) [2024-06-21 23:07:08,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1371488256. Throughput: 0: 42765.9. Samples: 1371554120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:07:08,595][15401] Updated weights for policy 0, policy_version 83710 (0.0029) [2024-06-21 23:07:12,291][15401] Updated weights for policy 0, policy_version 83720 (0.0037) [2024-06-21 23:07:13,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1371684864. Throughput: 0: 42663.7. Samples: 1371806620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 23:07:16,283][15401] Updated weights for policy 0, policy_version 83730 (0.0027) [2024-06-21 23:07:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1371865088. Throughput: 0: 43082.9. Samples: 1372082720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 23:07:19,754][15401] Updated weights for policy 0, policy_version 83740 (0.0037) [2024-06-21 23:07:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1372127232. Throughput: 0: 42980.1. Samples: 1372200640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 23:07:23,819][15401] Updated weights for policy 0, policy_version 83750 (0.0028) [2024-06-21 23:07:28,009][15401] Updated weights for policy 0, policy_version 83760 (0.0040) [2024-06-21 23:07:28,390][15132] Fps is (10 sec: 47511.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1372340224. Throughput: 0: 42898.7. Samples: 1372452440. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 23:07:31,451][15401] Updated weights for policy 0, policy_version 83770 (0.0033) [2024-06-21 23:07:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 1372504064. Throughput: 0: 42924.4. Samples: 1372715580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:33,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 23:07:35,797][15401] Updated weights for policy 0, policy_version 83780 (0.0034) [2024-06-21 23:07:38,344][15349] Signal inference workers to stop experience collection... (20250 times) [2024-06-21 23:07:38,389][15132] Fps is (10 sec: 40961.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1372749824. Throughput: 0: 42766.6. Samples: 1372839900. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 23:07:38,392][15401] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-06-21 23:07:38,402][15349] Signal inference workers to resume experience collection... (20250 times) [2024-06-21 23:07:38,415][15401] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-06-21 23:07:39,053][15401] Updated weights for policy 0, policy_version 83790 (0.0039) [2024-06-21 23:07:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1372962816. Throughput: 0: 42852.6. Samples: 1373093200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 23:07:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083800_1372979200.pth... [2024-06-21 23:07:43,518][15401] Updated weights for policy 0, policy_version 83800 (0.0030) [2024-06-21 23:07:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083172_1362690048.pth [2024-06-21 23:07:46,883][15401] Updated weights for policy 0, policy_version 83810 (0.0033) [2024-06-21 23:07:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1373159424. Throughput: 0: 42846.7. Samples: 1373357240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-21 23:07:51,159][15401] Updated weights for policy 0, policy_version 83820 (0.0029) [2024-06-21 23:07:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1373405184. Throughput: 0: 42897.2. Samples: 1373484500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-21 23:07:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-21 23:07:54,287][15401] Updated weights for policy 0, policy_version 83830 (0.0038) [2024-06-21 23:07:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 1373601792. Throughput: 0: 42984.5. Samples: 1373740920. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:07:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 23:07:58,846][15401] Updated weights for policy 0, policy_version 83840 (0.0032) [2024-06-21 23:08:02,150][15401] Updated weights for policy 0, policy_version 83850 (0.0040) [2024-06-21 23:08:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 1373814784. Throughput: 0: 42427.4. Samples: 1373991960. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-21 23:08:06,526][15401] Updated weights for policy 0, policy_version 83860 (0.0023) [2024-06-21 23:08:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.1, 300 sec: 42709.5). Total num frames: 1374027776. Throughput: 0: 42714.8. Samples: 1374122820. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 23:08:09,842][15401] Updated weights for policy 0, policy_version 83870 (0.0033) [2024-06-21 23:08:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1374240768. Throughput: 0: 42739.4. Samples: 1374375700. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 23:08:14,137][15401] Updated weights for policy 0, policy_version 83880 (0.0036) [2024-06-21 23:08:17,867][15401] Updated weights for policy 0, policy_version 83890 (0.0029) [2024-06-21 23:08:18,390][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1374470144. Throughput: 0: 42495.1. Samples: 1374627860. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 23:08:21,743][15401] Updated weights for policy 0, policy_version 83900 (0.0032) [2024-06-21 23:08:23,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1374683136. Throughput: 0: 42679.9. Samples: 1374760600. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:23,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 23:08:25,508][15401] Updated weights for policy 0, policy_version 83910 (0.0032) [2024-06-21 23:08:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.6, 300 sec: 42765.0). Total num frames: 1374863360. Throughput: 0: 42724.5. Samples: 1375015800. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 23:08:29,375][15401] Updated weights for policy 0, policy_version 83920 (0.0041) [2024-06-21 23:08:33,336][15401] Updated weights for policy 0, policy_version 83930 (0.0032) [2024-06-21 23:08:33,389][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1375109120. Throughput: 0: 42541.0. Samples: 1375271580. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 23:08:37,431][15401] Updated weights for policy 0, policy_version 83940 (0.0047) [2024-06-21 23:08:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1375322112. Throughput: 0: 42587.2. Samples: 1375400920. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-21 23:08:40,836][15401] Updated weights for policy 0, policy_version 83950 (0.0037) [2024-06-21 23:08:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1375518720. Throughput: 0: 42603.9. Samples: 1375658100. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-21 23:08:43,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-21 23:08:44,836][15401] Updated weights for policy 0, policy_version 83960 (0.0046) [2024-06-21 23:08:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1375748096. Throughput: 0: 42779.2. Samples: 1375917020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:08:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 23:08:48,754][15401] Updated weights for policy 0, policy_version 83970 (0.0035) [2024-06-21 23:08:52,286][15401] Updated weights for policy 0, policy_version 83980 (0.0031) [2024-06-21 23:08:53,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 1375977472. Throughput: 0: 42800.5. Samples: 1376048940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:08:53,393][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 23:08:53,735][15349] Signal inference workers to stop experience collection... (20300 times) [2024-06-21 23:08:53,736][15349] Signal inference workers to resume experience collection... (20300 times) [2024-06-21 23:08:53,777][15401] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-06-21 23:08:53,777][15401] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-06-21 23:08:56,368][15401] Updated weights for policy 0, policy_version 83990 (0.0040) [2024-06-21 23:08:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1376157696. Throughput: 0: 42754.7. Samples: 1376299660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:08:58,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 23:09:00,001][15401] Updated weights for policy 0, policy_version 84000 (0.0041) [2024-06-21 23:09:03,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1376387072. Throughput: 0: 42946.2. Samples: 1376560440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 23:09:03,970][15401] Updated weights for policy 0, policy_version 84010 (0.0023) [2024-06-21 23:09:07,541][15401] Updated weights for policy 0, policy_version 84020 (0.0033) [2024-06-21 23:09:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1376616448. Throughput: 0: 42957.8. Samples: 1376693600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 23:09:11,509][15401] Updated weights for policy 0, policy_version 84030 (0.0041) [2024-06-21 23:09:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 1376813056. Throughput: 0: 42892.5. Samples: 1376945960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 23:09:15,156][15401] Updated weights for policy 0, policy_version 84040 (0.0039) [2024-06-21 23:09:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1377042432. Throughput: 0: 42706.2. Samples: 1377193360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 23:09:19,264][15401] Updated weights for policy 0, policy_version 84050 (0.0028) [2024-06-21 23:09:22,877][15401] Updated weights for policy 0, policy_version 84060 (0.0035) [2024-06-21 23:09:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 1377255424. Throughput: 0: 42748.8. Samples: 1377324620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:23,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 23:09:26,746][15401] Updated weights for policy 0, policy_version 84070 (0.0037) [2024-06-21 23:09:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1377452032. Throughput: 0: 42771.4. Samples: 1377582800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 23:09:30,693][15401] Updated weights for policy 0, policy_version 84080 (0.0029) [2024-06-21 23:09:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1377665024. Throughput: 0: 42776.1. Samples: 1377841940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-21 23:09:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-21 23:09:34,379][15401] Updated weights for policy 0, policy_version 84090 (0.0045) [2024-06-21 23:09:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1377878016. Throughput: 0: 42713.6. Samples: 1377970940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:09:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-21 23:09:38,477][15401] Updated weights for policy 0, policy_version 84100 (0.0033) [2024-06-21 23:09:42,085][15401] Updated weights for policy 0, policy_version 84110 (0.0028) [2024-06-21 23:09:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1378107392. Throughput: 0: 42697.7. Samples: 1378221060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:09:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-21 23:09:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084113_1378107392.pth... [2024-06-21 23:09:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083488_1367867392.pth [2024-06-21 23:09:46,113][15401] Updated weights for policy 0, policy_version 84120 (0.0027) [2024-06-21 23:09:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1378320384. Throughput: 0: 42595.6. Samples: 1378477240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:09:48,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-21 23:09:49,752][15401] Updated weights for policy 0, policy_version 84130 (0.0052) [2024-06-21 23:09:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 1378516992. Throughput: 0: 42439.5. Samples: 1378603380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:09:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 23:09:53,735][15401] Updated weights for policy 0, policy_version 84140 (0.0030) [2024-06-21 23:09:58,003][15401] Updated weights for policy 0, policy_version 84150 (0.0022) [2024-06-21 23:09:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1378729984. Throughput: 0: 42578.9. Samples: 1378862020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:09:58,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-21 23:10:01,179][15401] Updated weights for policy 0, policy_version 84160 (0.0025) [2024-06-21 23:10:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1378975744. Throughput: 0: 42764.0. Samples: 1379117740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:10:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 23:10:05,701][15401] Updated weights for policy 0, policy_version 84170 (0.0031) [2024-06-21 23:10:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1379172352. Throughput: 0: 42778.3. Samples: 1379249640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:10:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:10:08,739][15401] Updated weights for policy 0, policy_version 84180 (0.0027) [2024-06-21 23:10:13,320][15401] Updated weights for policy 0, policy_version 84190 (0.0033) [2024-06-21 23:10:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1379368960. Throughput: 0: 42643.8. Samples: 1379501780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:10:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-21 23:10:16,433][15401] Updated weights for policy 0, policy_version 84200 (0.0037) [2024-06-21 23:10:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1379614720. Throughput: 0: 42493.3. Samples: 1379754140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:10:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 23:10:20,886][15401] Updated weights for policy 0, policy_version 84210 (0.0028) [2024-06-21 23:10:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1379811328. Throughput: 0: 42647.0. Samples: 1379890060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:10:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 23:10:24,002][15401] Updated weights for policy 0, policy_version 84220 (0.0034) [2024-06-21 23:10:27,080][15349] Signal inference workers to stop experience collection... (20350 times) [2024-06-21 23:10:27,083][15349] Signal inference workers to resume experience collection... (20350 times) [2024-06-21 23:10:27,103][15401] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-06-21 23:10:27,103][15401] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-06-21 23:10:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1380007936. Throughput: 0: 42734.7. Samples: 1380144120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 23:10:28,523][15401] Updated weights for policy 0, policy_version 84230 (0.0039) [2024-06-21 23:10:31,706][15401] Updated weights for policy 0, policy_version 84240 (0.0030) [2024-06-21 23:10:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1380237312. Throughput: 0: 42629.8. Samples: 1380395580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 23:10:36,316][15401] Updated weights for policy 0, policy_version 84250 (0.0034) [2024-06-21 23:10:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42710.1). Total num frames: 1380450304. Throughput: 0: 42780.0. Samples: 1380528480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 23:10:39,502][15401] Updated weights for policy 0, policy_version 84260 (0.0035) [2024-06-21 23:10:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1380646912. Throughput: 0: 42585.8. Samples: 1380778380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:10:43,867][15401] Updated weights for policy 0, policy_version 84270 (0.0030) [2024-06-21 23:10:47,438][15401] Updated weights for policy 0, policy_version 84280 (0.0029) [2024-06-21 23:10:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1380892672. Throughput: 0: 42572.9. Samples: 1381033520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 23:10:51,513][15401] Updated weights for policy 0, policy_version 84290 (0.0032) [2024-06-21 23:10:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1381072896. Throughput: 0: 42604.4. Samples: 1381166840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 23:10:55,225][15401] Updated weights for policy 0, policy_version 84300 (0.0034) [2024-06-21 23:10:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1381285888. Throughput: 0: 42564.2. Samples: 1381417160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:10:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 23:10:59,151][15401] Updated weights for policy 0, policy_version 84310 (0.0026) [2024-06-21 23:11:02,807][15401] Updated weights for policy 0, policy_version 84320 (0.0024) [2024-06-21 23:11:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1381515264. Throughput: 0: 42809.4. Samples: 1381680560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:11:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 23:11:06,905][15401] Updated weights for policy 0, policy_version 84330 (0.0045) [2024-06-21 23:11:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1381711872. Throughput: 0: 42639.6. Samples: 1381808840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:11:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 23:11:10,779][15401] Updated weights for policy 0, policy_version 84340 (0.0030) [2024-06-21 23:11:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1381924864. Throughput: 0: 42418.6. Samples: 1382052960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:11:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 23:11:14,336][15401] Updated weights for policy 0, policy_version 84350 (0.0042) [2024-06-21 23:11:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 1382137856. Throughput: 0: 42773.2. Samples: 1382320380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-21 23:11:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-21 23:11:18,457][15401] Updated weights for policy 0, policy_version 84360 (0.0032) [2024-06-21 23:11:21,898][15401] Updated weights for policy 0, policy_version 84370 (0.0028) [2024-06-21 23:11:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1382350848. Throughput: 0: 42613.5. Samples: 1382446080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 23:11:26,363][15401] Updated weights for policy 0, policy_version 84380 (0.0054) [2024-06-21 23:11:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1382563840. Throughput: 0: 42511.7. Samples: 1382691400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 23:11:30,175][15401] Updated weights for policy 0, policy_version 84390 (0.0038) [2024-06-21 23:11:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1382776832. Throughput: 0: 42676.1. Samples: 1382953940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-21 23:11:33,933][15401] Updated weights for policy 0, policy_version 84400 (0.0040) [2024-06-21 23:11:38,018][15401] Updated weights for policy 0, policy_version 84410 (0.0032) [2024-06-21 23:11:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1382989824. Throughput: 0: 42474.3. Samples: 1383078180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 23:11:41,426][15401] Updated weights for policy 0, policy_version 84420 (0.0029) [2024-06-21 23:11:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1383219200. Throughput: 0: 42589.2. Samples: 1383333680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:43,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-21 23:11:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084425_1383219200.pth... [2024-06-21 23:11:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000083800_1372979200.pth [2024-06-21 23:11:45,625][15401] Updated weights for policy 0, policy_version 84430 (0.0036) [2024-06-21 23:11:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1383399424. Throughput: 0: 42450.7. Samples: 1383590840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 23:11:49,210][15401] Updated weights for policy 0, policy_version 84440 (0.0036) [2024-06-21 23:11:53,213][15401] Updated weights for policy 0, policy_version 84450 (0.0037) [2024-06-21 23:11:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1383628800. Throughput: 0: 42293.3. Samples: 1383712040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-21 23:11:57,075][15401] Updated weights for policy 0, policy_version 84460 (0.0041) [2024-06-21 23:11:57,542][15349] Signal inference workers to stop experience collection... (20400 times) [2024-06-21 23:11:57,594][15401] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-06-21 23:11:57,601][15349] Signal inference workers to resume experience collection... (20400 times) [2024-06-21 23:11:57,637][15401] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-06-21 23:11:58,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1383858176. Throughput: 0: 42588.9. Samples: 1383969460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:11:58,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-21 23:12:00,909][15401] Updated weights for policy 0, policy_version 84470 (0.0029) [2024-06-21 23:12:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1384038400. Throughput: 0: 42359.7. Samples: 1384226560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:12:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 23:12:04,864][15401] Updated weights for policy 0, policy_version 84480 (0.0040) [2024-06-21 23:12:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1384251392. Throughput: 0: 42255.9. Samples: 1384347600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-21 23:12:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 23:12:08,867][15401] Updated weights for policy 0, policy_version 84490 (0.0024) [2024-06-21 23:12:12,298][15401] Updated weights for policy 0, policy_version 84500 (0.0028) [2024-06-21 23:12:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1384497152. Throughput: 0: 42639.0. Samples: 1384610160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-21 23:12:16,391][15401] Updated weights for policy 0, policy_version 84510 (0.0035) [2024-06-21 23:12:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1384693760. Throughput: 0: 42439.0. Samples: 1384863700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 23:12:19,932][15401] Updated weights for policy 0, policy_version 84520 (0.0032) [2024-06-21 23:12:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1384906752. Throughput: 0: 42514.2. Samples: 1384991320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 23:12:23,879][15401] Updated weights for policy 0, policy_version 84530 (0.0022) [2024-06-21 23:12:27,495][15401] Updated weights for policy 0, policy_version 84540 (0.0043) [2024-06-21 23:12:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1385119744. Throughput: 0: 42607.2. Samples: 1385251000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 23:12:31,382][15401] Updated weights for policy 0, policy_version 84550 (0.0032) [2024-06-21 23:12:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 1385349120. Throughput: 0: 42677.6. Samples: 1385511440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:33,392][15132] Avg episode reward: [(0, '0.314')] [2024-06-21 23:12:35,280][15401] Updated weights for policy 0, policy_version 84560 (0.0052) [2024-06-21 23:12:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1385562112. Throughput: 0: 42805.0. Samples: 1385638260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:38,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-21 23:12:38,871][15401] Updated weights for policy 0, policy_version 84570 (0.0033) [2024-06-21 23:12:43,103][15401] Updated weights for policy 0, policy_version 84580 (0.0036) [2024-06-21 23:12:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1385775104. Throughput: 0: 42768.1. Samples: 1385894020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 23:12:46,726][15401] Updated weights for policy 0, policy_version 84590 (0.0031) [2024-06-21 23:12:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1385971712. Throughput: 0: 42707.5. Samples: 1386148400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 23:12:50,833][15401] Updated weights for policy 0, policy_version 84600 (0.0035) [2024-06-21 23:12:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1386217472. Throughput: 0: 42847.1. Samples: 1386275820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:53,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 23:12:54,339][15401] Updated weights for policy 0, policy_version 84610 (0.0038) [2024-06-21 23:12:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 1386397696. Throughput: 0: 42713.4. Samples: 1386532260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-21 23:12:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 23:12:58,402][15401] Updated weights for policy 0, policy_version 84620 (0.0042) [2024-06-21 23:13:01,901][15401] Updated weights for policy 0, policy_version 84630 (0.0027) [2024-06-21 23:13:03,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1386610688. Throughput: 0: 42802.7. Samples: 1386789820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 23:13:06,120][15401] Updated weights for policy 0, policy_version 84640 (0.0029) [2024-06-21 23:13:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1386856448. Throughput: 0: 42776.4. Samples: 1386916260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 23:13:09,522][15401] Updated weights for policy 0, policy_version 84650 (0.0036) [2024-06-21 23:13:13,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 1387053056. Throughput: 0: 42858.3. Samples: 1387179900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:13,397][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 23:13:13,544][15401] Updated weights for policy 0, policy_version 84660 (0.0037) [2024-06-21 23:13:17,211][15401] Updated weights for policy 0, policy_version 84670 (0.0039) [2024-06-21 23:13:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1387266048. Throughput: 0: 42668.8. Samples: 1387431440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 23:13:21,039][15401] Updated weights for policy 0, policy_version 84680 (0.0028) [2024-06-21 23:13:22,211][15349] Signal inference workers to stop experience collection... (20450 times) [2024-06-21 23:13:22,211][15349] Signal inference workers to resume experience collection... (20450 times) [2024-06-21 23:13:22,256][15401] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-06-21 23:13:22,256][15401] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-06-21 23:13:23,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1387479040. Throughput: 0: 42554.3. Samples: 1387553200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 23:13:24,883][15401] Updated weights for policy 0, policy_version 84690 (0.0025) [2024-06-21 23:13:28,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1387692032. Throughput: 0: 42788.9. Samples: 1387819620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:28,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 23:13:28,874][15401] Updated weights for policy 0, policy_version 84700 (0.0032) [2024-06-21 23:13:32,442][15401] Updated weights for policy 0, policy_version 84710 (0.0033) [2024-06-21 23:13:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1387905024. Throughput: 0: 42740.9. Samples: 1388071740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:13:36,399][15401] Updated weights for policy 0, policy_version 84720 (0.0038) [2024-06-21 23:13:38,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1388118016. Throughput: 0: 42858.8. Samples: 1388204360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-21 23:13:40,189][15401] Updated weights for policy 0, policy_version 84730 (0.0031) [2024-06-21 23:13:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1388331008. Throughput: 0: 42840.9. Samples: 1388460100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 23:13:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084738_1388347392.pth... [2024-06-21 23:13:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084113_1378107392.pth [2024-06-21 23:13:43,869][15401] Updated weights for policy 0, policy_version 84740 (0.0027) [2024-06-21 23:13:48,109][15401] Updated weights for policy 0, policy_version 84750 (0.0043) [2024-06-21 23:13:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1388544000. Throughput: 0: 42813.4. Samples: 1388716420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-21 23:13:51,498][15401] Updated weights for policy 0, policy_version 84760 (0.0029) [2024-06-21 23:13:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 1388756992. Throughput: 0: 42886.2. Samples: 1388846240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:13:53,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 23:13:55,639][15401] Updated weights for policy 0, policy_version 84770 (0.0031) [2024-06-21 23:13:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1388986368. Throughput: 0: 42825.3. Samples: 1389106760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:13:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 23:13:59,315][15401] Updated weights for policy 0, policy_version 84780 (0.0036) [2024-06-21 23:14:03,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1389182976. Throughput: 0: 42847.2. Samples: 1389359560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:03,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-21 23:14:03,417][15401] Updated weights for policy 0, policy_version 84790 (0.0030) [2024-06-21 23:14:06,959][15401] Updated weights for policy 0, policy_version 84800 (0.0028) [2024-06-21 23:14:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1389412352. Throughput: 0: 42962.7. Samples: 1389486520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-21 23:14:11,445][15401] Updated weights for policy 0, policy_version 84810 (0.0033) [2024-06-21 23:14:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42603.1, 300 sec: 42598.4). Total num frames: 1389608960. Throughput: 0: 42818.3. Samples: 1389746340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:14:14,620][15401] Updated weights for policy 0, policy_version 84820 (0.0031) [2024-06-21 23:14:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1389821952. Throughput: 0: 42818.3. Samples: 1389998560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 23:14:19,173][15401] Updated weights for policy 0, policy_version 84830 (0.0040) [2024-06-21 23:14:22,246][15401] Updated weights for policy 0, policy_version 84840 (0.0037) [2024-06-21 23:14:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1390067712. Throughput: 0: 42689.2. Samples: 1390125380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 23:14:26,763][15401] Updated weights for policy 0, policy_version 84850 (0.0036) [2024-06-21 23:14:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 1390231552. Throughput: 0: 42678.6. Samples: 1390380640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 23:14:29,866][15401] Updated weights for policy 0, policy_version 84860 (0.0046) [2024-06-21 23:14:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1390460928. Throughput: 0: 42702.2. Samples: 1390638020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-21 23:14:34,420][15401] Updated weights for policy 0, policy_version 84870 (0.0041) [2024-06-21 23:14:37,506][15401] Updated weights for policy 0, policy_version 84880 (0.0033) [2024-06-21 23:14:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1390690304. Throughput: 0: 42688.1. Samples: 1390767100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:38,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 23:14:42,038][15401] Updated weights for policy 0, policy_version 84890 (0.0023) [2024-06-21 23:14:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1390886912. Throughput: 0: 42525.8. Samples: 1391020420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-21 23:14:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 23:14:45,034][15401] Updated weights for policy 0, policy_version 84900 (0.0033) [2024-06-21 23:14:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1391099904. Throughput: 0: 42710.1. Samples: 1391281520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:14:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 23:14:49,655][15401] Updated weights for policy 0, policy_version 84910 (0.0040) [2024-06-21 23:14:52,220][15349] Signal inference workers to stop experience collection... (20500 times) [2024-06-21 23:14:52,220][15349] Signal inference workers to resume experience collection... (20500 times) [2024-06-21 23:14:52,236][15401] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-06-21 23:14:52,236][15401] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-06-21 23:14:52,515][15401] Updated weights for policy 0, policy_version 84920 (0.0034) [2024-06-21 23:14:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 1391345664. Throughput: 0: 42781.3. Samples: 1391411680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:14:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 23:14:57,325][15401] Updated weights for policy 0, policy_version 84930 (0.0039) [2024-06-21 23:14:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1391525888. Throughput: 0: 42743.8. Samples: 1391669820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:14:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 23:15:00,388][15401] Updated weights for policy 0, policy_version 84940 (0.0034) [2024-06-21 23:15:03,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 1391755264. Throughput: 0: 42734.1. Samples: 1391921700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:03,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 23:15:04,870][15401] Updated weights for policy 0, policy_version 84950 (0.0037) [2024-06-21 23:15:07,995][15401] Updated weights for policy 0, policy_version 84960 (0.0030) [2024-06-21 23:15:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1391984640. Throughput: 0: 42925.8. Samples: 1392057040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-21 23:15:12,680][15401] Updated weights for policy 0, policy_version 84970 (0.0031) [2024-06-21 23:15:13,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1392164864. Throughput: 0: 42923.2. Samples: 1392312180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-21 23:15:16,020][15401] Updated weights for policy 0, policy_version 84980 (0.0030) [2024-06-21 23:15:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1392410624. Throughput: 0: 42644.8. Samples: 1392557040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:18,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 23:15:20,536][15401] Updated weights for policy 0, policy_version 84990 (0.0050) [2024-06-21 23:15:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1392623616. Throughput: 0: 42855.5. Samples: 1392695600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 23:15:23,572][15401] Updated weights for policy 0, policy_version 85000 (0.0030) [2024-06-21 23:15:28,327][15401] Updated weights for policy 0, policy_version 85010 (0.0033) [2024-06-21 23:15:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1392803840. Throughput: 0: 42927.2. Samples: 1392952140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 23:15:31,077][15401] Updated weights for policy 0, policy_version 85020 (0.0035) [2024-06-21 23:15:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1393049600. Throughput: 0: 42662.4. Samples: 1393201320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:15:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-21 23:15:35,907][15401] Updated weights for policy 0, policy_version 85030 (0.0029) [2024-06-21 23:15:38,390][15132] Fps is (10 sec: 45872.3, 60 sec: 42871.1, 300 sec: 42765.0). Total num frames: 1393262592. Throughput: 0: 42863.5. Samples: 1393340560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:15:38,391][15132] Avg episode reward: [(0, '0.489')] [2024-06-21 23:15:38,710][15401] Updated weights for policy 0, policy_version 85040 (0.0036) [2024-06-21 23:15:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1393442816. Throughput: 0: 42872.2. Samples: 1393599060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:15:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 23:15:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085050_1393459200.pth... [2024-06-21 23:15:43,467][15401] Updated weights for policy 0, policy_version 85050 (0.0023) [2024-06-21 23:15:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084425_1383219200.pth [2024-06-21 23:15:46,270][15401] Updated weights for policy 0, policy_version 85060 (0.0024) [2024-06-21 23:15:48,390][15132] Fps is (10 sec: 44238.8, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 1393704960. Throughput: 0: 42777.4. Samples: 1393846580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:15:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-21 23:15:50,994][15401] Updated weights for policy 0, policy_version 85070 (0.0035) [2024-06-21 23:15:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1393885184. Throughput: 0: 42867.6. Samples: 1393986080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:15:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 23:15:53,971][15401] Updated weights for policy 0, policy_version 85080 (0.0041) [2024-06-21 23:15:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1394098176. Throughput: 0: 42797.6. Samples: 1394238080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:15:58,393][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 23:15:58,891][15401] Updated weights for policy 0, policy_version 85090 (0.0029) [2024-06-21 23:16:01,614][15401] Updated weights for policy 0, policy_version 85100 (0.0033) [2024-06-21 23:16:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 1394360320. Throughput: 0: 42903.2. Samples: 1394487680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 23:16:06,390][15401] Updated weights for policy 0, policy_version 85110 (0.0048) [2024-06-21 23:16:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1394556928. Throughput: 0: 42903.5. Samples: 1394626260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-21 23:16:09,542][15401] Updated weights for policy 0, policy_version 85120 (0.0029) [2024-06-21 23:16:10,578][15349] Signal inference workers to stop experience collection... (20550 times) [2024-06-21 23:16:10,579][15349] Signal inference workers to resume experience collection... (20550 times) [2024-06-21 23:16:10,620][15401] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-06-21 23:16:10,620][15401] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-06-21 23:16:13,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1394737152. Throughput: 0: 42953.2. Samples: 1394885040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:13,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-21 23:16:13,741][15401] Updated weights for policy 0, policy_version 85130 (0.0031) [2024-06-21 23:16:17,054][15401] Updated weights for policy 0, policy_version 85140 (0.0036) [2024-06-21 23:16:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1395015680. Throughput: 0: 43029.7. Samples: 1395137660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:18,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-21 23:16:21,324][15401] Updated weights for policy 0, policy_version 85150 (0.0034) [2024-06-21 23:16:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1395163136. Throughput: 0: 42915.6. Samples: 1395271740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:23,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-21 23:16:24,859][15401] Updated weights for policy 0, policy_version 85160 (0.0028) [2024-06-21 23:16:28,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1395392512. Throughput: 0: 42632.7. Samples: 1395517540. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-21 23:16:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 23:16:28,975][15401] Updated weights for policy 0, policy_version 85170 (0.0050) [2024-06-21 23:16:32,442][15401] Updated weights for policy 0, policy_version 85180 (0.0027) [2024-06-21 23:16:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1395638272. Throughput: 0: 42756.0. Samples: 1395770600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 23:16:37,061][15401] Updated weights for policy 0, policy_version 85190 (0.0033) [2024-06-21 23:16:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.7, 300 sec: 42654.0). Total num frames: 1395802112. Throughput: 0: 42633.4. Samples: 1395904580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-21 23:16:40,203][15401] Updated weights for policy 0, policy_version 85200 (0.0027) [2024-06-21 23:16:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1396047872. Throughput: 0: 42577.4. Samples: 1396154060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:16:44,634][15401] Updated weights for policy 0, policy_version 85210 (0.0036) [2024-06-21 23:16:47,840][15401] Updated weights for policy 0, policy_version 85220 (0.0033) [2024-06-21 23:16:48,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1396277248. Throughput: 0: 42718.5. Samples: 1396410020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 23:16:52,287][15401] Updated weights for policy 0, policy_version 85230 (0.0039) [2024-06-21 23:16:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1396441088. Throughput: 0: 42529.8. Samples: 1396540100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 23:16:55,293][15401] Updated weights for policy 0, policy_version 85240 (0.0028) [2024-06-21 23:16:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1396686848. Throughput: 0: 42438.8. Samples: 1396794780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:16:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 23:17:00,065][15401] Updated weights for policy 0, policy_version 85250 (0.0034) [2024-06-21 23:17:03,009][15401] Updated weights for policy 0, policy_version 85260 (0.0030) [2024-06-21 23:17:03,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1396916224. Throughput: 0: 42442.7. Samples: 1397047580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:17:03,399][15132] Avg episode reward: [(0, '0.404')] [2024-06-21 23:17:07,824][15401] Updated weights for policy 0, policy_version 85270 (0.0045) [2024-06-21 23:17:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1397080064. Throughput: 0: 42415.1. Samples: 1397180420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:17:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 23:17:10,552][15401] Updated weights for policy 0, policy_version 85280 (0.0034) [2024-06-21 23:17:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1397325824. Throughput: 0: 42600.5. Samples: 1397434560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:17:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 23:17:15,427][15401] Updated weights for policy 0, policy_version 85290 (0.0031) [2024-06-21 23:17:18,171][15401] Updated weights for policy 0, policy_version 85300 (0.0026) [2024-06-21 23:17:18,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1397571584. Throughput: 0: 42684.1. Samples: 1397691380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-21 23:17:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 23:17:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1397702656. Throughput: 0: 42630.2. Samples: 1397822940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 23:17:23,495][15401] Updated weights for policy 0, policy_version 85310 (0.0041) [2024-06-21 23:17:24,224][15349] Signal inference workers to stop experience collection... (20600 times) [2024-06-21 23:17:24,224][15349] Signal inference workers to resume experience collection... (20600 times) [2024-06-21 23:17:24,264][15401] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-06-21 23:17:24,264][15401] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-06-21 23:17:25,596][15401] Updated weights for policy 0, policy_version 85320 (0.0042) [2024-06-21 23:17:28,396][15132] Fps is (10 sec: 40933.8, 60 sec: 43140.0, 300 sec: 42820.0). Total num frames: 1397981184. Throughput: 0: 42767.7. Samples: 1398078880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:28,397][15132] Avg episode reward: [(0, '0.432')] [2024-06-21 23:17:31,465][15401] Updated weights for policy 0, policy_version 85330 (0.0033) [2024-06-21 23:17:33,392][15132] Fps is (10 sec: 49139.8, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 1398194176. Throughput: 0: 42673.3. Samples: 1398330420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:33,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 23:17:33,535][15401] Updated weights for policy 0, policy_version 85340 (0.0046) [2024-06-21 23:17:38,390][15132] Fps is (10 sec: 36067.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1398341632. Throughput: 0: 42731.5. Samples: 1398463020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 23:17:38,877][15401] Updated weights for policy 0, policy_version 85350 (0.0029) [2024-06-21 23:17:41,408][15401] Updated weights for policy 0, policy_version 85360 (0.0034) [2024-06-21 23:17:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1398620160. Throughput: 0: 42829.2. Samples: 1398722100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-21 23:17:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085365_1398620160.pth... [2024-06-21 23:17:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000084738_1388347392.pth [2024-06-21 23:17:46,307][15401] Updated weights for policy 0, policy_version 85370 (0.0033) [2024-06-21 23:17:48,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1398833152. Throughput: 0: 42891.1. Samples: 1398977680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 23:17:49,164][15401] Updated weights for policy 0, policy_version 85380 (0.0029) [2024-06-21 23:17:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1399013376. Throughput: 0: 42860.8. Samples: 1399109160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:53,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-21 23:17:53,721][15401] Updated weights for policy 0, policy_version 85390 (0.0041) [2024-06-21 23:17:56,907][15401] Updated weights for policy 0, policy_version 85400 (0.0029) [2024-06-21 23:17:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 1399259136. Throughput: 0: 42911.5. Samples: 1399365580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:17:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-21 23:18:01,334][15401] Updated weights for policy 0, policy_version 85410 (0.0029) [2024-06-21 23:18:03,396][15132] Fps is (10 sec: 45846.5, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 1399472128. Throughput: 0: 42842.9. Samples: 1399619580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:18:03,396][15132] Avg episode reward: [(0, '0.502')] [2024-06-21 23:18:04,724][15401] Updated weights for policy 0, policy_version 85420 (0.0034) [2024-06-21 23:18:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 1399652352. Throughput: 0: 42755.4. Samples: 1399746940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:18:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-21 23:18:08,845][15401] Updated weights for policy 0, policy_version 85430 (0.0039) [2024-06-21 23:18:12,606][15401] Updated weights for policy 0, policy_version 85440 (0.0027) [2024-06-21 23:18:13,389][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1399881728. Throughput: 0: 42825.7. Samples: 1400005760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-21 23:18:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-21 23:18:16,736][15401] Updated weights for policy 0, policy_version 85450 (0.0038) [2024-06-21 23:18:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1400094720. Throughput: 0: 42793.4. Samples: 1400256020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:18,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 23:18:20,269][15401] Updated weights for policy 0, policy_version 85460 (0.0038) [2024-06-21 23:18:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.4, 300 sec: 42765.3). Total num frames: 1400307712. Throughput: 0: 42732.8. Samples: 1400386000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-21 23:18:24,212][15401] Updated weights for policy 0, policy_version 85470 (0.0036) [2024-06-21 23:18:27,849][15401] Updated weights for policy 0, policy_version 85480 (0.0028) [2024-06-21 23:18:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42602.9, 300 sec: 42820.5). Total num frames: 1400537088. Throughput: 0: 42659.5. Samples: 1400641780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-21 23:18:31,711][15401] Updated weights for policy 0, policy_version 85490 (0.0038) [2024-06-21 23:18:32,318][15349] Signal inference workers to stop experience collection... (20650 times) [2024-06-21 23:18:32,318][15349] Signal inference workers to resume experience collection... (20650 times) [2024-06-21 23:18:32,339][15401] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-06-21 23:18:32,371][15401] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-06-21 23:18:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 1400750080. Throughput: 0: 42632.9. Samples: 1400896160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:33,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-21 23:18:35,394][15401] Updated weights for policy 0, policy_version 85500 (0.0036) [2024-06-21 23:18:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43690.8, 300 sec: 42820.5). Total num frames: 1400963072. Throughput: 0: 42653.5. Samples: 1401028560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:38,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-21 23:18:39,205][15401] Updated weights for policy 0, policy_version 85510 (0.0031) [2024-06-21 23:18:43,050][15401] Updated weights for policy 0, policy_version 85520 (0.0035) [2024-06-21 23:18:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 1401159680. Throughput: 0: 42688.0. Samples: 1401286640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:43,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 23:18:46,860][15401] Updated weights for policy 0, policy_version 85530 (0.0027) [2024-06-21 23:18:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1401389056. Throughput: 0: 42703.9. Samples: 1401540980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 23:18:50,717][15401] Updated weights for policy 0, policy_version 85540 (0.0032) [2024-06-21 23:18:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1401585664. Throughput: 0: 42768.9. Samples: 1401671540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 23:18:54,358][15401] Updated weights for policy 0, policy_version 85550 (0.0030) [2024-06-21 23:18:58,186][15401] Updated weights for policy 0, policy_version 85560 (0.0037) [2024-06-21 23:18:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1401815040. Throughput: 0: 42911.6. Samples: 1401936780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:18:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 23:19:01,858][15401] Updated weights for policy 0, policy_version 85570 (0.0024) [2024-06-21 23:19:03,390][15132] Fps is (10 sec: 44234.0, 60 sec: 42602.4, 300 sec: 42764.9). Total num frames: 1402028032. Throughput: 0: 42973.6. Samples: 1402189860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-21 23:19:03,391][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 23:19:05,796][15401] Updated weights for policy 0, policy_version 85580 (0.0037) [2024-06-21 23:19:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1402241024. Throughput: 0: 42999.4. Samples: 1402320960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-21 23:19:09,345][15401] Updated weights for policy 0, policy_version 85590 (0.0029) [2024-06-21 23:19:13,380][15401] Updated weights for policy 0, policy_version 85600 (0.0030) [2024-06-21 23:19:13,390][15132] Fps is (10 sec: 44239.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1402470400. Throughput: 0: 43075.6. Samples: 1402580180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 23:19:16,885][15401] Updated weights for policy 0, policy_version 85610 (0.0038) [2024-06-21 23:19:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1402683392. Throughput: 0: 43076.4. Samples: 1402834600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 23:19:20,865][15401] Updated weights for policy 0, policy_version 85620 (0.0024) [2024-06-21 23:19:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1402880000. Throughput: 0: 43050.5. Samples: 1402965840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 23:19:24,630][15401] Updated weights for policy 0, policy_version 85630 (0.0035) [2024-06-21 23:19:28,384][15401] Updated weights for policy 0, policy_version 85640 (0.0023) [2024-06-21 23:19:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1403125760. Throughput: 0: 43103.2. Samples: 1403226180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 23:19:32,208][15401] Updated weights for policy 0, policy_version 85650 (0.0037) [2024-06-21 23:19:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1403322368. Throughput: 0: 43020.7. Samples: 1403476920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 23:19:35,311][15349] Signal inference workers to stop experience collection... (20700 times) [2024-06-21 23:19:35,370][15401] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-06-21 23:19:35,429][15349] Signal inference workers to resume experience collection... (20700 times) [2024-06-21 23:19:35,429][15401] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-06-21 23:19:36,373][15401] Updated weights for policy 0, policy_version 85660 (0.0036) [2024-06-21 23:19:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1403518976. Throughput: 0: 43052.1. Samples: 1403608880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-21 23:19:39,863][15401] Updated weights for policy 0, policy_version 85670 (0.0030) [2024-06-21 23:19:43,395][15132] Fps is (10 sec: 42576.8, 60 sec: 43142.6, 300 sec: 42875.4). Total num frames: 1403748352. Throughput: 0: 42894.6. Samples: 1403867260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:43,395][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 23:19:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085678_1403748352.pth... [2024-06-21 23:19:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085050_1393459200.pth [2024-06-21 23:19:43,827][15401] Updated weights for policy 0, policy_version 85680 (0.0032) [2024-06-21 23:19:47,316][15401] Updated weights for policy 0, policy_version 85690 (0.0036) [2024-06-21 23:19:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1403961344. Throughput: 0: 43016.7. Samples: 1404125580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-21 23:19:51,665][15401] Updated weights for policy 0, policy_version 85700 (0.0041) [2024-06-21 23:19:53,389][15132] Fps is (10 sec: 40981.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1404157952. Throughput: 0: 42970.2. Samples: 1404254620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-21 23:19:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-21 23:19:55,037][15401] Updated weights for policy 0, policy_version 85710 (0.0039) [2024-06-21 23:19:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 1404403712. Throughput: 0: 42852.0. Samples: 1404508520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:19:58,395][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 23:19:59,323][15401] Updated weights for policy 0, policy_version 85720 (0.0028) [2024-06-21 23:20:02,898][15401] Updated weights for policy 0, policy_version 85730 (0.0022) [2024-06-21 23:20:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43145.0, 300 sec: 42820.6). Total num frames: 1404616704. Throughput: 0: 42748.0. Samples: 1404758260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 23:20:06,753][15401] Updated weights for policy 0, policy_version 85740 (0.0043) [2024-06-21 23:20:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1404796928. Throughput: 0: 42798.7. Samples: 1404891780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 23:20:10,270][15401] Updated weights for policy 0, policy_version 85750 (0.0041) [2024-06-21 23:20:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1405042688. Throughput: 0: 42736.5. Samples: 1405149320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-21 23:20:14,592][15401] Updated weights for policy 0, policy_version 85760 (0.0033) [2024-06-21 23:20:18,136][15401] Updated weights for policy 0, policy_version 85770 (0.0036) [2024-06-21 23:20:18,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1405272064. Throughput: 0: 42871.5. Samples: 1405406140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 23:20:22,087][15401] Updated weights for policy 0, policy_version 85780 (0.0034) [2024-06-21 23:20:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1405452288. Throughput: 0: 42932.0. Samples: 1405540820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 23:20:25,702][15401] Updated weights for policy 0, policy_version 85790 (0.0027) [2024-06-21 23:20:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1405681664. Throughput: 0: 42874.3. Samples: 1405796380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 23:20:29,544][15401] Updated weights for policy 0, policy_version 85800 (0.0034) [2024-06-21 23:20:33,200][15401] Updated weights for policy 0, policy_version 85810 (0.0031) [2024-06-21 23:20:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.2). Total num frames: 1405911040. Throughput: 0: 42737.2. Samples: 1406048760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-21 23:20:37,312][15401] Updated weights for policy 0, policy_version 85820 (0.0038) [2024-06-21 23:20:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1406091264. Throughput: 0: 42879.9. Samples: 1406184220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:20:40,968][15401] Updated weights for policy 0, policy_version 85830 (0.0033) [2024-06-21 23:20:43,155][15349] Signal inference workers to stop experience collection... (20750 times) [2024-06-21 23:20:43,160][15349] Signal inference workers to resume experience collection... (20750 times) [2024-06-21 23:20:43,196][15401] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-06-21 23:20:43,196][15401] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-06-21 23:20:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42875.2, 300 sec: 42765.0). Total num frames: 1406320640. Throughput: 0: 42954.3. Samples: 1406441460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-21 23:20:44,723][15401] Updated weights for policy 0, policy_version 85840 (0.0031) [2024-06-21 23:20:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1406533632. Throughput: 0: 43115.7. Samples: 1406698460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-21 23:20:48,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 23:20:48,613][15401] Updated weights for policy 0, policy_version 85850 (0.0033) [2024-06-21 23:20:52,236][15401] Updated weights for policy 0, policy_version 85860 (0.0028) [2024-06-21 23:20:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1406746624. Throughput: 0: 43073.7. Samples: 1406830100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:20:53,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-21 23:20:56,230][15401] Updated weights for policy 0, policy_version 85870 (0.0038) [2024-06-21 23:20:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1406959616. Throughput: 0: 43032.9. Samples: 1407085800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:20:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 23:20:59,752][15401] Updated weights for policy 0, policy_version 85880 (0.0037) [2024-06-21 23:21:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1407172608. Throughput: 0: 43062.0. Samples: 1407343920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-21 23:21:03,994][15401] Updated weights for policy 0, policy_version 85890 (0.0034) [2024-06-21 23:21:08,049][15401] Updated weights for policy 0, policy_version 85900 (0.0029) [2024-06-21 23:21:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1407385600. Throughput: 0: 42832.0. Samples: 1407468260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-21 23:21:11,709][15401] Updated weights for policy 0, policy_version 85910 (0.0034) [2024-06-21 23:21:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1407598592. Throughput: 0: 42826.5. Samples: 1407723680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:13,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 23:21:16,001][15401] Updated weights for policy 0, policy_version 85920 (0.0033) [2024-06-21 23:21:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1407811584. Throughput: 0: 42854.3. Samples: 1407977200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 23:21:19,583][15401] Updated weights for policy 0, policy_version 85930 (0.0028) [2024-06-21 23:21:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1408008192. Throughput: 0: 42752.1. Samples: 1408108060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 23:21:23,615][15401] Updated weights for policy 0, policy_version 85940 (0.0023) [2024-06-21 23:21:27,065][15401] Updated weights for policy 0, policy_version 85950 (0.0028) [2024-06-21 23:21:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1408253952. Throughput: 0: 42805.4. Samples: 1408367700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 23:21:31,244][15401] Updated weights for policy 0, policy_version 85960 (0.0041) [2024-06-21 23:21:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1408466944. Throughput: 0: 42910.6. Samples: 1408629440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 23:21:34,798][15401] Updated weights for policy 0, policy_version 85970 (0.0037) [2024-06-21 23:21:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1408663552. Throughput: 0: 42779.6. Samples: 1408755180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-21 23:21:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 23:21:38,889][15401] Updated weights for policy 0, policy_version 85980 (0.0037) [2024-06-21 23:21:42,546][15401] Updated weights for policy 0, policy_version 85990 (0.0026) [2024-06-21 23:21:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1408909312. Throughput: 0: 42949.6. Samples: 1409018540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:21:43,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 23:21:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085993_1408909312.pth... [2024-06-21 23:21:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085365_1398620160.pth [2024-06-21 23:21:46,711][15401] Updated weights for policy 0, policy_version 86000 (0.0031) [2024-06-21 23:21:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1409105920. Throughput: 0: 42763.0. Samples: 1409268260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:21:48,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 23:21:50,341][15401] Updated weights for policy 0, policy_version 86010 (0.0032) [2024-06-21 23:21:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1409318912. Throughput: 0: 42735.1. Samples: 1409391340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:21:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 23:21:54,153][15401] Updated weights for policy 0, policy_version 86020 (0.0038) [2024-06-21 23:21:58,101][15401] Updated weights for policy 0, policy_version 86030 (0.0026) [2024-06-21 23:21:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1409531904. Throughput: 0: 42940.6. Samples: 1409655900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:21:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-21 23:22:01,739][15401] Updated weights for policy 0, policy_version 86040 (0.0041) [2024-06-21 23:22:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1409761280. Throughput: 0: 42718.6. Samples: 1409899540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 23:22:05,689][15401] Updated weights for policy 0, policy_version 86050 (0.0050) [2024-06-21 23:22:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1409957888. Throughput: 0: 42734.0. Samples: 1410031100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-21 23:22:09,514][15401] Updated weights for policy 0, policy_version 86060 (0.0037) [2024-06-21 23:22:13,237][15401] Updated weights for policy 0, policy_version 86070 (0.0038) [2024-06-21 23:22:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1410170880. Throughput: 0: 42746.1. Samples: 1410291280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-21 23:22:17,233][15401] Updated weights for policy 0, policy_version 86080 (0.0027) [2024-06-21 23:22:17,765][15349] Signal inference workers to stop experience collection... (20800 times) [2024-06-21 23:22:17,768][15349] Signal inference workers to resume experience collection... (20800 times) [2024-06-21 23:22:17,792][15401] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-06-21 23:22:17,792][15401] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-06-21 23:22:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1410400256. Throughput: 0: 42533.7. Samples: 1410543460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 23:22:20,894][15401] Updated weights for policy 0, policy_version 86090 (0.0032) [2024-06-21 23:22:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.9). Total num frames: 1410596864. Throughput: 0: 42670.6. Samples: 1410675360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-21 23:22:24,667][15401] Updated weights for policy 0, policy_version 86100 (0.0028) [2024-06-21 23:22:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1410809856. Throughput: 0: 42522.3. Samples: 1410932040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-21 23:22:28,677][15401] Updated weights for policy 0, policy_version 86110 (0.0026) [2024-06-21 23:22:32,169][15401] Updated weights for policy 0, policy_version 86120 (0.0028) [2024-06-21 23:22:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 1411055616. Throughput: 0: 42621.2. Samples: 1411186220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-21 23:22:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 23:22:36,571][15401] Updated weights for policy 0, policy_version 86130 (0.0031) [2024-06-21 23:22:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1411235840. Throughput: 0: 42859.1. Samples: 1411320000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:22:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 23:22:39,716][15401] Updated weights for policy 0, policy_version 86140 (0.0035) [2024-06-21 23:22:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1411432448. Throughput: 0: 42616.9. Samples: 1411573660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:22:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 23:22:44,115][15401] Updated weights for policy 0, policy_version 86150 (0.0032) [2024-06-21 23:22:47,289][15401] Updated weights for policy 0, policy_version 86160 (0.0032) [2024-06-21 23:22:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1411678208. Throughput: 0: 42862.6. Samples: 1411828360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:22:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-21 23:22:51,930][15401] Updated weights for policy 0, policy_version 86170 (0.0035) [2024-06-21 23:22:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1411891200. Throughput: 0: 42851.7. Samples: 1411959420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:22:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-21 23:22:54,863][15401] Updated weights for policy 0, policy_version 86180 (0.0035) [2024-06-21 23:22:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 1412087808. Throughput: 0: 42835.1. Samples: 1412218860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:22:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 23:22:59,442][15401] Updated weights for policy 0, policy_version 86190 (0.0027) [2024-06-21 23:23:02,813][15401] Updated weights for policy 0, policy_version 86200 (0.0034) [2024-06-21 23:23:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1412317184. Throughput: 0: 42948.0. Samples: 1412476120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:23:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-21 23:23:07,105][15401] Updated weights for policy 0, policy_version 86210 (0.0031) [2024-06-21 23:23:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1412546560. Throughput: 0: 43080.5. Samples: 1412613980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:23:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 23:23:10,252][15401] Updated weights for policy 0, policy_version 86220 (0.0040) [2024-06-21 23:23:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1412726784. Throughput: 0: 43020.8. Samples: 1412867980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:23:13,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-21 23:23:14,776][15401] Updated weights for policy 0, policy_version 86230 (0.0029) [2024-06-21 23:23:17,682][15401] Updated weights for policy 0, policy_version 86240 (0.0029) [2024-06-21 23:23:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1412988928. Throughput: 0: 42982.0. Samples: 1413120400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:23:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-21 23:23:22,443][15401] Updated weights for policy 0, policy_version 86250 (0.0038) [2024-06-21 23:23:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1413169152. Throughput: 0: 43116.5. Samples: 1413260240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-21 23:23:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 23:23:25,243][15401] Updated weights for policy 0, policy_version 86260 (0.0030) [2024-06-21 23:23:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1413382144. Throughput: 0: 43036.0. Samples: 1413510280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:28,396][15132] Avg episode reward: [(0, '0.359')] [2024-06-21 23:23:30,041][15401] Updated weights for policy 0, policy_version 86270 (0.0037) [2024-06-21 23:23:33,053][15401] Updated weights for policy 0, policy_version 86280 (0.0035) [2024-06-21 23:23:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 1413627904. Throughput: 0: 42855.2. Samples: 1413756840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 23:23:37,589][15401] Updated weights for policy 0, policy_version 86290 (0.0024) [2024-06-21 23:23:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1413791744. Throughput: 0: 42964.0. Samples: 1413892800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:38,391][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 23:23:40,638][15401] Updated weights for policy 0, policy_version 86300 (0.0031) [2024-06-21 23:23:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1414021120. Throughput: 0: 42858.2. Samples: 1414147480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 23:23:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086306_1414037504.pth... [2024-06-21 23:23:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085678_1403748352.pth [2024-06-21 23:23:45,111][15401] Updated weights for policy 0, policy_version 86310 (0.0032) [2024-06-21 23:23:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1414250496. Throughput: 0: 42870.8. Samples: 1414405300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-21 23:23:48,423][15401] Updated weights for policy 0, policy_version 86320 (0.0036) [2024-06-21 23:23:48,430][15349] Signal inference workers to stop experience collection... (20850 times) [2024-06-21 23:23:48,430][15349] Signal inference workers to resume experience collection... (20850 times) [2024-06-21 23:23:48,458][15401] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-06-21 23:23:48,458][15401] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-06-21 23:23:53,311][15401] Updated weights for policy 0, policy_version 86330 (0.0027) [2024-06-21 23:23:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1414430720. Throughput: 0: 42607.6. Samples: 1414531320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 23:23:56,213][15401] Updated weights for policy 0, policy_version 86340 (0.0039) [2024-06-21 23:23:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42876.2). Total num frames: 1414676480. Throughput: 0: 42392.9. Samples: 1414775660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:23:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 23:24:01,039][15401] Updated weights for policy 0, policy_version 86350 (0.0036) [2024-06-21 23:24:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1414856704. Throughput: 0: 42627.1. Samples: 1415038620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:24:03,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-21 23:24:03,967][15401] Updated weights for policy 0, policy_version 86360 (0.0021) [2024-06-21 23:24:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1415069696. Throughput: 0: 42280.8. Samples: 1415162880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:24:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 23:24:08,718][15401] Updated weights for policy 0, policy_version 86370 (0.0037) [2024-06-21 23:24:11,499][15401] Updated weights for policy 0, policy_version 86380 (0.0039) [2024-06-21 23:24:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1415299072. Throughput: 0: 42150.6. Samples: 1415407060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:24:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-21 23:24:16,371][15401] Updated weights for policy 0, policy_version 86390 (0.0042) [2024-06-21 23:24:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.0, 300 sec: 42765.0). Total num frames: 1415495680. Throughput: 0: 42531.3. Samples: 1415670760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-21 23:24:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 23:24:19,265][15401] Updated weights for policy 0, policy_version 86400 (0.0028) [2024-06-21 23:24:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1415708672. Throughput: 0: 42309.8. Samples: 1415796740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-21 23:24:23,854][15401] Updated weights for policy 0, policy_version 86410 (0.0029) [2024-06-21 23:24:26,918][15401] Updated weights for policy 0, policy_version 86420 (0.0032) [2024-06-21 23:24:28,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1415938048. Throughput: 0: 42223.2. Samples: 1416047520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:24:31,723][15401] Updated weights for policy 0, policy_version 86430 (0.0036) [2024-06-21 23:24:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1416134656. Throughput: 0: 42383.1. Samples: 1416312540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 23:24:34,704][15401] Updated weights for policy 0, policy_version 86440 (0.0040) [2024-06-21 23:24:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42710.2). Total num frames: 1416347648. Throughput: 0: 42337.3. Samples: 1416436500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 23:24:39,317][15401] Updated weights for policy 0, policy_version 86450 (0.0033) [2024-06-21 23:24:42,236][15401] Updated weights for policy 0, policy_version 86460 (0.0046) [2024-06-21 23:24:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1416593408. Throughput: 0: 42573.3. Samples: 1416691460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 23:24:46,818][15401] Updated weights for policy 0, policy_version 86470 (0.0032) [2024-06-21 23:24:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1416790016. Throughput: 0: 42698.8. Samples: 1416960060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 23:24:49,868][15401] Updated weights for policy 0, policy_version 86480 (0.0033) [2024-06-21 23:24:53,390][15132] Fps is (10 sec: 39319.5, 60 sec: 42598.0, 300 sec: 42653.9). Total num frames: 1416986624. Throughput: 0: 42740.4. Samples: 1417086220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:53,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 23:24:54,583][15401] Updated weights for policy 0, policy_version 86490 (0.0039) [2024-06-21 23:24:57,488][15401] Updated weights for policy 0, policy_version 86500 (0.0029) [2024-06-21 23:24:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1417232384. Throughput: 0: 42821.8. Samples: 1417334040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:24:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:25:02,091][15401] Updated weights for policy 0, policy_version 86510 (0.0028) [2024-06-21 23:25:03,389][15132] Fps is (10 sec: 42600.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1417412608. Throughput: 0: 42779.3. Samples: 1417595820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:25:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-21 23:25:05,093][15401] Updated weights for policy 0, policy_version 86520 (0.0035) [2024-06-21 23:25:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1417625600. Throughput: 0: 42658.3. Samples: 1417716360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:25:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 23:25:10,354][15401] Updated weights for policy 0, policy_version 86530 (0.0031) [2024-06-21 23:25:10,998][15349] Signal inference workers to stop experience collection... (20900 times) [2024-06-21 23:25:11,001][15349] Signal inference workers to resume experience collection... (20900 times) [2024-06-21 23:25:11,042][15401] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-06-21 23:25:11,043][15401] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-06-21 23:25:12,823][15401] Updated weights for policy 0, policy_version 86540 (0.0032) [2024-06-21 23:25:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1417871360. Throughput: 0: 42732.8. Samples: 1417970500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-21 23:25:17,939][15401] Updated weights for policy 0, policy_version 86550 (0.0038) [2024-06-21 23:25:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 1418051584. Throughput: 0: 42801.8. Samples: 1418238620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-21 23:25:20,538][15401] Updated weights for policy 0, policy_version 86560 (0.0032) [2024-06-21 23:25:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1418264576. Throughput: 0: 42702.1. Samples: 1418358100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-21 23:25:25,470][15401] Updated weights for policy 0, policy_version 86570 (0.0028) [2024-06-21 23:25:28,308][15401] Updated weights for policy 0, policy_version 86580 (0.0040) [2024-06-21 23:25:28,389][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1418526720. Throughput: 0: 42854.7. Samples: 1418619920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 23:25:33,003][15401] Updated weights for policy 0, policy_version 86590 (0.0034) [2024-06-21 23:25:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1418706944. Throughput: 0: 42641.6. Samples: 1418878940. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 23:25:35,982][15401] Updated weights for policy 0, policy_version 86600 (0.0045) [2024-06-21 23:25:38,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1418903552. Throughput: 0: 42393.4. Samples: 1418993900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 23:25:40,651][15401] Updated weights for policy 0, policy_version 86610 (0.0028) [2024-06-21 23:25:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1419149312. Throughput: 0: 42723.7. Samples: 1419256600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 23:25:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086619_1419165696.pth... [2024-06-21 23:25:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000085993_1408909312.pth [2024-06-21 23:25:43,625][15401] Updated weights for policy 0, policy_version 86620 (0.0021) [2024-06-21 23:25:48,287][15401] Updated weights for policy 0, policy_version 86630 (0.0042) [2024-06-21 23:25:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1419345920. Throughput: 0: 42520.4. Samples: 1419509240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-21 23:25:51,174][15401] Updated weights for policy 0, policy_version 86640 (0.0048) [2024-06-21 23:25:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.7, 300 sec: 42653.9). Total num frames: 1419542528. Throughput: 0: 42541.2. Samples: 1419630720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 23:25:55,883][15401] Updated weights for policy 0, policy_version 86650 (0.0027) [2024-06-21 23:25:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1419788288. Throughput: 0: 42848.9. Samples: 1419898700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:25:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 23:25:59,158][15401] Updated weights for policy 0, policy_version 86660 (0.0036) [2024-06-21 23:26:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1419968512. Throughput: 0: 42568.2. Samples: 1420154200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-21 23:26:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 23:26:03,800][15401] Updated weights for policy 0, policy_version 86670 (0.0038) [2024-06-21 23:26:06,952][15401] Updated weights for policy 0, policy_version 86680 (0.0032) [2024-06-21 23:26:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1420181504. Throughput: 0: 42495.7. Samples: 1420270400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 23:26:11,472][15401] Updated weights for policy 0, policy_version 86690 (0.0038) [2024-06-21 23:26:12,828][15349] Signal inference workers to stop experience collection... (20950 times) [2024-06-21 23:26:12,828][15349] Signal inference workers to resume experience collection... (20950 times) [2024-06-21 23:26:12,850][15401] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-06-21 23:26:12,850][15401] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-06-21 23:26:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1420427264. Throughput: 0: 42459.0. Samples: 1420530580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-21 23:26:14,703][15401] Updated weights for policy 0, policy_version 86700 (0.0039) [2024-06-21 23:26:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1420591104. Throughput: 0: 42492.6. Samples: 1420791100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:18,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-21 23:26:19,017][15401] Updated weights for policy 0, policy_version 86710 (0.0039) [2024-06-21 23:26:22,355][15401] Updated weights for policy 0, policy_version 86720 (0.0038) [2024-06-21 23:26:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1420836864. Throughput: 0: 42629.7. Samples: 1420912240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-21 23:26:26,850][15401] Updated weights for policy 0, policy_version 86730 (0.0034) [2024-06-21 23:26:28,390][15132] Fps is (10 sec: 47512.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1421066240. Throughput: 0: 42469.1. Samples: 1421167720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:28,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-21 23:26:30,044][15401] Updated weights for policy 0, policy_version 86740 (0.0029) [2024-06-21 23:26:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1421230080. Throughput: 0: 42709.9. Samples: 1421431180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 23:26:34,511][15401] Updated weights for policy 0, policy_version 86750 (0.0023) [2024-06-21 23:26:37,875][15401] Updated weights for policy 0, policy_version 86760 (0.0029) [2024-06-21 23:26:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1421492224. Throughput: 0: 42642.6. Samples: 1421549640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 23:26:42,084][15401] Updated weights for policy 0, policy_version 86770 (0.0029) [2024-06-21 23:26:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1421688832. Throughput: 0: 42411.2. Samples: 1421807200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-21 23:26:45,838][15401] Updated weights for policy 0, policy_version 86780 (0.0028) [2024-06-21 23:26:48,389][15132] Fps is (10 sec: 37684.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1421869056. Throughput: 0: 42570.8. Samples: 1422069880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 23:26:49,826][15401] Updated weights for policy 0, policy_version 86790 (0.0034) [2024-06-21 23:26:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1422131200. Throughput: 0: 42597.4. Samples: 1422187280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-21 23:26:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 23:26:53,394][15401] Updated weights for policy 0, policy_version 86800 (0.0042) [2024-06-21 23:26:57,711][15401] Updated weights for policy 0, policy_version 86810 (0.0041) [2024-06-21 23:26:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1422344192. Throughput: 0: 42623.6. Samples: 1422448640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:26:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 23:27:01,113][15401] Updated weights for policy 0, policy_version 86820 (0.0033) [2024-06-21 23:27:03,392][15132] Fps is (10 sec: 39311.5, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 1422524416. Throughput: 0: 42613.9. Samples: 1422708840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:03,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 23:27:05,276][15401] Updated weights for policy 0, policy_version 86830 (0.0031) [2024-06-21 23:27:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1422753792. Throughput: 0: 42641.8. Samples: 1422831120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 23:27:08,833][15401] Updated weights for policy 0, policy_version 86840 (0.0038) [2024-06-21 23:27:12,841][15401] Updated weights for policy 0, policy_version 86850 (0.0028) [2024-06-21 23:27:13,392][15132] Fps is (10 sec: 44237.5, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 1422966784. Throughput: 0: 42741.0. Samples: 1423091160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:13,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 23:27:16,452][15401] Updated weights for policy 0, policy_version 86860 (0.0031) [2024-06-21 23:27:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1423163392. Throughput: 0: 42649.3. Samples: 1423350400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 23:27:20,375][15401] Updated weights for policy 0, policy_version 86870 (0.0033) [2024-06-21 23:27:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1423392768. Throughput: 0: 42779.3. Samples: 1423474700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 23:27:24,075][15401] Updated weights for policy 0, policy_version 86880 (0.0032) [2024-06-21 23:27:28,224][15401] Updated weights for policy 0, policy_version 86890 (0.0037) [2024-06-21 23:27:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1423605760. Throughput: 0: 42924.4. Samples: 1423738800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 23:27:29,135][15349] Signal inference workers to stop experience collection... (21000 times) [2024-06-21 23:27:29,136][15349] Signal inference workers to resume experience collection... (21000 times) [2024-06-21 23:27:29,188][15401] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-06-21 23:27:29,188][15401] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-06-21 23:27:31,762][15401] Updated weights for policy 0, policy_version 86900 (0.0037) [2024-06-21 23:27:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1423818752. Throughput: 0: 42583.4. Samples: 1423986140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 23:27:35,806][15401] Updated weights for policy 0, policy_version 86910 (0.0033) [2024-06-21 23:27:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1424031744. Throughput: 0: 42831.0. Samples: 1424114680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:38,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 23:27:39,458][15401] Updated weights for policy 0, policy_version 86920 (0.0029) [2024-06-21 23:27:43,252][15401] Updated weights for policy 0, policy_version 86930 (0.0041) [2024-06-21 23:27:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1424261120. Throughput: 0: 42991.9. Samples: 1424383280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:43,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 23:27:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086930_1424261120.pth... [2024-06-21 23:27:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086306_1414037504.pth [2024-06-21 23:27:46,942][15401] Updated weights for policy 0, policy_version 86940 (0.0039) [2024-06-21 23:27:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.4, 300 sec: 42653.9). Total num frames: 1424474112. Throughput: 0: 42812.5. Samples: 1424635300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-21 23:27:48,396][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 23:27:50,966][15401] Updated weights for policy 0, policy_version 86950 (0.0022) [2024-06-21 23:27:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1424687104. Throughput: 0: 42953.4. Samples: 1424764020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:27:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-21 23:27:54,501][15401] Updated weights for policy 0, policy_version 86960 (0.0046) [2024-06-21 23:27:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1424883712. Throughput: 0: 42963.6. Samples: 1425024420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:27:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-21 23:27:58,689][15401] Updated weights for policy 0, policy_version 86970 (0.0031) [2024-06-21 23:28:02,294][15401] Updated weights for policy 0, policy_version 86980 (0.0037) [2024-06-21 23:28:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 1425113088. Throughput: 0: 42823.4. Samples: 1425277460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-21 23:28:06,342][15401] Updated weights for policy 0, policy_version 86990 (0.0034) [2024-06-21 23:28:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1425326080. Throughput: 0: 43055.9. Samples: 1425412220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:08,396][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 23:28:09,808][15401] Updated weights for policy 0, policy_version 87000 (0.0021) [2024-06-21 23:28:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42542.8). Total num frames: 1425539072. Throughput: 0: 43041.3. Samples: 1425675660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-21 23:28:13,757][15401] Updated weights for policy 0, policy_version 87010 (0.0041) [2024-06-21 23:28:17,229][15401] Updated weights for policy 0, policy_version 87020 (0.0035) [2024-06-21 23:28:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1425752064. Throughput: 0: 43231.6. Samples: 1425931560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 23:28:21,386][15401] Updated weights for policy 0, policy_version 87030 (0.0042) [2024-06-21 23:28:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1425997824. Throughput: 0: 43209.9. Samples: 1426059120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-21 23:28:24,653][15401] Updated weights for policy 0, policy_version 87040 (0.0035) [2024-06-21 23:28:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1426178048. Throughput: 0: 43074.4. Samples: 1426321620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-21 23:28:28,855][15401] Updated weights for policy 0, policy_version 87050 (0.0030) [2024-06-21 23:28:32,295][15401] Updated weights for policy 0, policy_version 87060 (0.0031) [2024-06-21 23:28:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1426391040. Throughput: 0: 43025.1. Samples: 1426571420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-21 23:28:36,670][15401] Updated weights for policy 0, policy_version 87070 (0.0046) [2024-06-21 23:28:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1426636800. Throughput: 0: 43046.1. Samples: 1426701100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-21 23:28:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 23:28:39,851][15401] Updated weights for policy 0, policy_version 87080 (0.0041) [2024-06-21 23:28:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1426817024. Throughput: 0: 43150.1. Samples: 1426966180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:28:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 23:28:44,319][15401] Updated weights for policy 0, policy_version 87090 (0.0044) [2024-06-21 23:28:47,344][15401] Updated weights for policy 0, policy_version 87100 (0.0035) [2024-06-21 23:28:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1427046400. Throughput: 0: 43052.1. Samples: 1427214800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:28:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-21 23:28:51,463][15349] Signal inference workers to stop experience collection... (21050 times) [2024-06-21 23:28:51,469][15349] Signal inference workers to resume experience collection... (21050 times) [2024-06-21 23:28:51,511][15401] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-06-21 23:28:51,511][15401] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-06-21 23:28:51,790][15401] Updated weights for policy 0, policy_version 87110 (0.0046) [2024-06-21 23:28:53,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1427292160. Throughput: 0: 43108.0. Samples: 1427352080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:28:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 23:28:55,176][15401] Updated weights for policy 0, policy_version 87120 (0.0042) [2024-06-21 23:28:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1427456000. Throughput: 0: 43037.4. Samples: 1427612340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:28:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 23:28:59,404][15401] Updated weights for policy 0, policy_version 87130 (0.0035) [2024-06-21 23:29:02,660][15401] Updated weights for policy 0, policy_version 87140 (0.0034) [2024-06-21 23:29:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1427701760. Throughput: 0: 42898.8. Samples: 1427862000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-21 23:29:07,119][15401] Updated weights for policy 0, policy_version 87150 (0.0027) [2024-06-21 23:29:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1427931136. Throughput: 0: 43065.3. Samples: 1427997060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 23:29:10,467][15401] Updated weights for policy 0, policy_version 87160 (0.0033) [2024-06-21 23:29:13,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1428094976. Throughput: 0: 42968.7. Samples: 1428255220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 23:29:14,703][15401] Updated weights for policy 0, policy_version 87170 (0.0041) [2024-06-21 23:29:18,155][15401] Updated weights for policy 0, policy_version 87180 (0.0032) [2024-06-21 23:29:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1428357120. Throughput: 0: 42934.7. Samples: 1428503480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 23:29:22,158][15401] Updated weights for policy 0, policy_version 87190 (0.0028) [2024-06-21 23:29:23,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1428570112. Throughput: 0: 43079.2. Samples: 1428639660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-21 23:29:25,993][15401] Updated weights for policy 0, policy_version 87200 (0.0036) [2024-06-21 23:29:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1428766720. Throughput: 0: 42896.4. Samples: 1428896520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 23:29:30,127][15401] Updated weights for policy 0, policy_version 87210 (0.0033) [2024-06-21 23:29:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1428996096. Throughput: 0: 42881.9. Samples: 1429144480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-21 23:29:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 23:29:33,539][15401] Updated weights for policy 0, policy_version 87220 (0.0040) [2024-06-21 23:29:37,787][15401] Updated weights for policy 0, policy_version 87230 (0.0037) [2024-06-21 23:29:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1429192704. Throughput: 0: 42863.0. Samples: 1429280920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:29:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 23:29:41,402][15401] Updated weights for policy 0, policy_version 87240 (0.0034) [2024-06-21 23:29:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1429405696. Throughput: 0: 42763.0. Samples: 1429536680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:29:43,391][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 23:29:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087244_1429405696.pth... [2024-06-21 23:29:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086619_1419165696.pth [2024-06-21 23:29:45,544][15401] Updated weights for policy 0, policy_version 87250 (0.0029) [2024-06-21 23:29:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 1429651456. Throughput: 0: 42782.1. Samples: 1429787200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:29:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 23:29:48,952][15401] Updated weights for policy 0, policy_version 87260 (0.0047) [2024-06-21 23:29:53,225][15401] Updated weights for policy 0, policy_version 87270 (0.0023) [2024-06-21 23:29:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1429831680. Throughput: 0: 42776.0. Samples: 1429921980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:29:53,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:29:56,413][15401] Updated weights for policy 0, policy_version 87280 (0.0033) [2024-06-21 23:29:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1430044672. Throughput: 0: 42723.3. Samples: 1430177760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:29:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 23:30:00,713][15401] Updated weights for policy 0, policy_version 87290 (0.0041) [2024-06-21 23:30:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1430290432. Throughput: 0: 42808.8. Samples: 1430429880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 23:30:03,979][15401] Updated weights for policy 0, policy_version 87300 (0.0029) [2024-06-21 23:30:06,447][15349] Signal inference workers to stop experience collection... (21100 times) [2024-06-21 23:30:06,447][15349] Signal inference workers to resume experience collection... (21100 times) [2024-06-21 23:30:06,490][15401] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-06-21 23:30:06,491][15401] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-06-21 23:30:08,351][15401] Updated weights for policy 0, policy_version 87310 (0.0031) [2024-06-21 23:30:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1430487040. Throughput: 0: 42832.4. Samples: 1430567120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:08,391][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 23:30:11,806][15401] Updated weights for policy 0, policy_version 87320 (0.0039) [2024-06-21 23:30:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43416.0, 300 sec: 42875.7). Total num frames: 1430700032. Throughput: 0: 42651.2. Samples: 1430815920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:13,392][15132] Avg episode reward: [(0, '0.438')] [2024-06-21 23:30:16,024][15401] Updated weights for policy 0, policy_version 87330 (0.0028) [2024-06-21 23:30:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1430896640. Throughput: 0: 42834.7. Samples: 1431072040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 23:30:19,515][15401] Updated weights for policy 0, policy_version 87340 (0.0035) [2024-06-21 23:30:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1431109632. Throughput: 0: 42600.4. Samples: 1431197940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:23,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-21 23:30:23,695][15401] Updated weights for policy 0, policy_version 87350 (0.0027) [2024-06-21 23:30:27,095][15401] Updated weights for policy 0, policy_version 87360 (0.0032) [2024-06-21 23:30:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1431355392. Throughput: 0: 42485.9. Samples: 1431448540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-21 23:30:28,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-21 23:30:31,233][15401] Updated weights for policy 0, policy_version 87370 (0.0030) [2024-06-21 23:30:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1431535616. Throughput: 0: 42822.7. Samples: 1431714220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-21 23:30:34,670][15401] Updated weights for policy 0, policy_version 87380 (0.0026) [2024-06-21 23:30:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1431764992. Throughput: 0: 42582.2. Samples: 1431838180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-21 23:30:39,479][15401] Updated weights for policy 0, policy_version 87390 (0.0033) [2024-06-21 23:30:42,426][15401] Updated weights for policy 0, policy_version 87400 (0.0028) [2024-06-21 23:30:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1431977984. Throughput: 0: 42510.7. Samples: 1432090740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-21 23:30:47,068][15401] Updated weights for policy 0, policy_version 87410 (0.0035) [2024-06-21 23:30:48,393][15132] Fps is (10 sec: 40944.3, 60 sec: 42049.6, 300 sec: 42820.0). Total num frames: 1432174592. Throughput: 0: 42760.8. Samples: 1432354280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:48,394][15132] Avg episode reward: [(0, '0.688')] [2024-06-21 23:30:50,304][15401] Updated weights for policy 0, policy_version 87420 (0.0026) [2024-06-21 23:30:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1432403968. Throughput: 0: 42473.4. Samples: 1432478520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:53,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 23:30:54,633][15401] Updated weights for policy 0, policy_version 87430 (0.0038) [2024-06-21 23:30:58,260][15401] Updated weights for policy 0, policy_version 87440 (0.0030) [2024-06-21 23:30:58,389][15132] Fps is (10 sec: 44254.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1432616960. Throughput: 0: 42692.6. Samples: 1432736980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:30:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-21 23:31:02,233][15401] Updated weights for policy 0, policy_version 87450 (0.0033) [2024-06-21 23:31:03,394][15132] Fps is (10 sec: 42589.4, 60 sec: 42322.2, 300 sec: 42875.4). Total num frames: 1432829952. Throughput: 0: 42683.7. Samples: 1432993000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:31:03,402][15132] Avg episode reward: [(0, '0.610')] [2024-06-21 23:31:05,809][15401] Updated weights for policy 0, policy_version 87460 (0.0045) [2024-06-21 23:31:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1433042944. Throughput: 0: 42719.2. Samples: 1433120300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:31:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 23:31:09,607][15401] Updated weights for policy 0, policy_version 87470 (0.0035) [2024-06-21 23:31:13,389][15132] Fps is (10 sec: 42617.8, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 1433255936. Throughput: 0: 42860.1. Samples: 1433377240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:31:13,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 23:31:13,421][15401] Updated weights for policy 0, policy_version 87480 (0.0033) [2024-06-21 23:31:17,225][15401] Updated weights for policy 0, policy_version 87490 (0.0024) [2024-06-21 23:31:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1433485312. Throughput: 0: 42649.3. Samples: 1433633440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-21 23:31:18,403][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 23:31:21,230][15401] Updated weights for policy 0, policy_version 87500 (0.0038) [2024-06-21 23:31:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1433681920. Throughput: 0: 42810.3. Samples: 1433764640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-21 23:31:24,928][15401] Updated weights for policy 0, policy_version 87510 (0.0038) [2024-06-21 23:31:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 1433878528. Throughput: 0: 42834.6. Samples: 1434018300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 23:31:28,830][15401] Updated weights for policy 0, policy_version 87520 (0.0029) [2024-06-21 23:31:32,743][15401] Updated weights for policy 0, policy_version 87530 (0.0037) [2024-06-21 23:31:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1434124288. Throughput: 0: 42684.5. Samples: 1434275020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:33,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-21 23:31:36,400][15401] Updated weights for policy 0, policy_version 87540 (0.0034) [2024-06-21 23:31:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1434337280. Throughput: 0: 42761.4. Samples: 1434402680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 23:31:40,242][15349] Signal inference workers to stop experience collection... (21150 times) [2024-06-21 23:31:40,260][15401] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-06-21 23:31:40,299][15349] Signal inference workers to resume experience collection... (21150 times) [2024-06-21 23:31:40,299][15401] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-06-21 23:31:40,453][15401] Updated weights for policy 0, policy_version 87550 (0.0037) [2024-06-21 23:31:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1434517504. Throughput: 0: 42748.8. Samples: 1434660680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:43,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-21 23:31:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087557_1434533888.pth... [2024-06-21 23:31:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000086930_1424261120.pth [2024-06-21 23:31:44,317][15401] Updated weights for policy 0, policy_version 87560 (0.0029) [2024-06-21 23:31:48,034][15401] Updated weights for policy 0, policy_version 87570 (0.0022) [2024-06-21 23:31:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43147.3, 300 sec: 42820.6). Total num frames: 1434763264. Throughput: 0: 42546.0. Samples: 1434907380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:48,391][15132] Avg episode reward: [(0, '0.689')] [2024-06-21 23:31:51,915][15401] Updated weights for policy 0, policy_version 87580 (0.0033) [2024-06-21 23:31:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1434959872. Throughput: 0: 42699.2. Samples: 1435041760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:31:55,587][15401] Updated weights for policy 0, policy_version 87590 (0.0038) [2024-06-21 23:31:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42876.4). Total num frames: 1435172864. Throughput: 0: 42754.5. Samples: 1435301200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:31:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-21 23:31:59,534][15401] Updated weights for policy 0, policy_version 87600 (0.0037) [2024-06-21 23:32:03,206][15401] Updated weights for policy 0, policy_version 87610 (0.0029) [2024-06-21 23:32:03,393][15132] Fps is (10 sec: 44222.6, 60 sec: 42872.4, 300 sec: 42875.6). Total num frames: 1435402240. Throughput: 0: 42706.4. Samples: 1435555360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:32:03,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 23:32:07,119][15401] Updated weights for policy 0, policy_version 87620 (0.0033) [2024-06-21 23:32:08,394][15132] Fps is (10 sec: 42581.4, 60 sec: 42595.5, 300 sec: 42820.3). Total num frames: 1435598848. Throughput: 0: 42765.4. Samples: 1435689260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:32:08,394][15132] Avg episode reward: [(0, '0.180')] [2024-06-21 23:32:10,853][15401] Updated weights for policy 0, policy_version 87630 (0.0029) [2024-06-21 23:32:13,389][15132] Fps is (10 sec: 40973.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1435811840. Throughput: 0: 42690.7. Samples: 1435939380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-21 23:32:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-21 23:32:14,913][15401] Updated weights for policy 0, policy_version 87640 (0.0027) [2024-06-21 23:32:18,390][15132] Fps is (10 sec: 44254.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1436041216. Throughput: 0: 42758.7. Samples: 1436199060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 23:32:18,556][15401] Updated weights for policy 0, policy_version 87650 (0.0043) [2024-06-21 23:32:22,642][15401] Updated weights for policy 0, policy_version 87660 (0.0022) [2024-06-21 23:32:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1436254208. Throughput: 0: 42747.0. Samples: 1436326300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 23:32:26,458][15401] Updated weights for policy 0, policy_version 87670 (0.0041) [2024-06-21 23:32:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1436467200. Throughput: 0: 42671.9. Samples: 1436580920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-21 23:32:30,360][15401] Updated weights for policy 0, policy_version 87680 (0.0032) [2024-06-21 23:32:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 1436680192. Throughput: 0: 42867.9. Samples: 1436836440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 23:32:34,221][15401] Updated weights for policy 0, policy_version 87690 (0.0039) [2024-06-21 23:32:37,806][15401] Updated weights for policy 0, policy_version 87700 (0.0032) [2024-06-21 23:32:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1436893184. Throughput: 0: 42778.6. Samples: 1436966800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-21 23:32:41,724][15401] Updated weights for policy 0, policy_version 87710 (0.0050) [2024-06-21 23:32:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1437122560. Throughput: 0: 42723.6. Samples: 1437223760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-21 23:32:45,589][15401] Updated weights for policy 0, policy_version 87720 (0.0032) [2024-06-21 23:32:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1437319168. Throughput: 0: 42862.6. Samples: 1437484040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:48,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 23:32:49,285][15401] Updated weights for policy 0, policy_version 87730 (0.0033) [2024-06-21 23:32:53,262][15401] Updated weights for policy 0, policy_version 87740 (0.0037) [2024-06-21 23:32:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1437532160. Throughput: 0: 42662.6. Samples: 1437608900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-21 23:32:56,993][15401] Updated weights for policy 0, policy_version 87750 (0.0036) [2024-06-21 23:32:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 1437777920. Throughput: 0: 42797.3. Samples: 1437865260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:32:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-21 23:33:00,970][15401] Updated weights for policy 0, policy_version 87760 (0.0038) [2024-06-21 23:33:03,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42596.1, 300 sec: 42819.6). Total num frames: 1437958144. Throughput: 0: 42701.9. Samples: 1438120920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:33:03,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 23:33:04,857][15401] Updated weights for policy 0, policy_version 87770 (0.0034) [2024-06-21 23:33:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42601.3, 300 sec: 42765.0). Total num frames: 1438154752. Throughput: 0: 42577.8. Samples: 1438242300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-21 23:33:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 23:33:08,856][15401] Updated weights for policy 0, policy_version 87780 (0.0027) [2024-06-21 23:33:12,494][15401] Updated weights for policy 0, policy_version 87790 (0.0027) [2024-06-21 23:33:13,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1438384128. Throughput: 0: 42829.3. Samples: 1438508240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-21 23:33:16,669][15401] Updated weights for policy 0, policy_version 87800 (0.0042) [2024-06-21 23:33:17,196][15349] Signal inference workers to stop experience collection... (21200 times) [2024-06-21 23:33:17,196][15349] Signal inference workers to resume experience collection... (21200 times) [2024-06-21 23:33:17,218][15401] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-06-21 23:33:17,218][15401] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-06-21 23:33:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1438597120. Throughput: 0: 42694.4. Samples: 1438757680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-21 23:33:20,443][15401] Updated weights for policy 0, policy_version 87810 (0.0040) [2024-06-21 23:33:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1438810112. Throughput: 0: 42713.3. Samples: 1438888900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-21 23:33:24,099][15401] Updated weights for policy 0, policy_version 87820 (0.0029) [2024-06-21 23:33:28,229][15401] Updated weights for policy 0, policy_version 87830 (0.0036) [2024-06-21 23:33:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1439023104. Throughput: 0: 42870.8. Samples: 1439152940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-21 23:33:31,550][15401] Updated weights for policy 0, policy_version 87840 (0.0031) [2024-06-21 23:33:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1439252480. Throughput: 0: 42515.1. Samples: 1439397220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-21 23:33:35,814][15401] Updated weights for policy 0, policy_version 87850 (0.0050) [2024-06-21 23:33:38,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42871.2, 300 sec: 42876.0). Total num frames: 1439465472. Throughput: 0: 42777.8. Samples: 1439533920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:38,391][15132] Avg episode reward: [(0, '0.503')] [2024-06-21 23:33:38,990][15401] Updated weights for policy 0, policy_version 87860 (0.0028) [2024-06-21 23:33:43,366][15401] Updated weights for policy 0, policy_version 87870 (0.0032) [2024-06-21 23:33:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1439662080. Throughput: 0: 42878.2. Samples: 1439794780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-21 23:33:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087871_1439678464.pth... [2024-06-21 23:33:43,552][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087244_1429405696.pth [2024-06-21 23:33:46,548][15401] Updated weights for policy 0, policy_version 87880 (0.0038) [2024-06-21 23:33:48,389][15132] Fps is (10 sec: 42600.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1439891456. Throughput: 0: 42814.6. Samples: 1440047300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-21 23:33:50,999][15401] Updated weights for policy 0, policy_version 87890 (0.0032) [2024-06-21 23:33:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1440120832. Throughput: 0: 42987.6. Samples: 1440176740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-21 23:33:54,085][15401] Updated weights for policy 0, policy_version 87900 (0.0047) [2024-06-21 23:33:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1440301056. Throughput: 0: 42824.5. Samples: 1440435340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:33:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 23:33:58,544][15401] Updated weights for policy 0, policy_version 87910 (0.0039) [2024-06-21 23:34:02,070][15401] Updated weights for policy 0, policy_version 87920 (0.0029) [2024-06-21 23:34:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42874.3, 300 sec: 42709.1). Total num frames: 1440530432. Throughput: 0: 42877.7. Samples: 1440687280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:03,393][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 23:34:06,122][15401] Updated weights for policy 0, policy_version 87930 (0.0048) [2024-06-21 23:34:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1440743424. Throughput: 0: 42828.1. Samples: 1440816160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-21 23:34:09,438][15401] Updated weights for policy 0, policy_version 87940 (0.0029) [2024-06-21 23:34:13,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1440923648. Throughput: 0: 42629.7. Samples: 1441071280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 23:34:14,301][15401] Updated weights for policy 0, policy_version 87950 (0.0038) [2024-06-21 23:34:16,967][15401] Updated weights for policy 0, policy_version 87960 (0.0037) [2024-06-21 23:34:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1441153024. Throughput: 0: 42836.9. Samples: 1441324880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-21 23:34:21,920][15401] Updated weights for policy 0, policy_version 87970 (0.0030) [2024-06-21 23:34:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1441382400. Throughput: 0: 42654.6. Samples: 1441453360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:34:24,759][15401] Updated weights for policy 0, policy_version 87980 (0.0037) [2024-06-21 23:34:28,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 1441579008. Throughput: 0: 42562.8. Samples: 1441710380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:28,397][15132] Avg episode reward: [(0, '0.695')] [2024-06-21 23:34:29,326][15349] Signal inference workers to stop experience collection... (21250 times) [2024-06-21 23:34:29,372][15401] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-06-21 23:34:29,375][15349] Signal inference workers to resume experience collection... (21250 times) [2024-06-21 23:34:29,392][15401] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-06-21 23:34:29,507][15401] Updated weights for policy 0, policy_version 87990 (0.0034) [2024-06-21 23:34:32,721][15401] Updated weights for policy 0, policy_version 88000 (0.0036) [2024-06-21 23:34:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1441808384. Throughput: 0: 42673.6. Samples: 1441967620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:33,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-21 23:34:37,131][15401] Updated weights for policy 0, policy_version 88010 (0.0038) [2024-06-21 23:34:38,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.7, 300 sec: 42765.0). Total num frames: 1442021376. Throughput: 0: 42664.0. Samples: 1442096620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-21 23:34:40,359][15401] Updated weights for policy 0, policy_version 88020 (0.0037) [2024-06-21 23:34:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1442217984. Throughput: 0: 42564.0. Samples: 1442350820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:43,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 23:34:44,690][15401] Updated weights for policy 0, policy_version 88030 (0.0033) [2024-06-21 23:34:48,196][15401] Updated weights for policy 0, policy_version 88040 (0.0025) [2024-06-21 23:34:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1442447360. Throughput: 0: 42666.2. Samples: 1442607160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-21 23:34:52,394][15401] Updated weights for policy 0, policy_version 88050 (0.0038) [2024-06-21 23:34:53,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1442660352. Throughput: 0: 42738.5. Samples: 1442739400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:34:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:34:55,867][15401] Updated weights for policy 0, policy_version 88060 (0.0036) [2024-06-21 23:34:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1442856960. Throughput: 0: 42633.9. Samples: 1442989800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:34:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 23:35:00,021][15401] Updated weights for policy 0, policy_version 88070 (0.0041) [2024-06-21 23:35:03,390][15401] Updated weights for policy 0, policy_version 88080 (0.0043) [2024-06-21 23:35:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 1443102720. Throughput: 0: 42688.1. Samples: 1443245840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-21 23:35:07,651][15401] Updated weights for policy 0, policy_version 88090 (0.0036) [2024-06-21 23:35:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1443299328. Throughput: 0: 42758.7. Samples: 1443377500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 23:35:11,290][15401] Updated weights for policy 0, policy_version 88100 (0.0029) [2024-06-21 23:35:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1443512320. Throughput: 0: 42665.6. Samples: 1443630060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:13,396][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 23:35:15,429][15401] Updated weights for policy 0, policy_version 88110 (0.0046) [2024-06-21 23:35:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1443725312. Throughput: 0: 42540.5. Samples: 1443881940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-21 23:35:18,895][15401] Updated weights for policy 0, policy_version 88120 (0.0040) [2024-06-21 23:35:23,231][15401] Updated weights for policy 0, policy_version 88130 (0.0041) [2024-06-21 23:35:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1443921920. Throughput: 0: 42615.0. Samples: 1444014300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-21 23:35:26,418][15401] Updated weights for policy 0, policy_version 88140 (0.0042) [2024-06-21 23:35:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 1444151296. Throughput: 0: 42650.2. Samples: 1444269980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-21 23:35:30,898][15401] Updated weights for policy 0, policy_version 88150 (0.0031) [2024-06-21 23:35:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1444364288. Throughput: 0: 42604.1. Samples: 1444524340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-21 23:35:33,976][15401] Updated weights for policy 0, policy_version 88160 (0.0029) [2024-06-21 23:35:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1444560896. Throughput: 0: 42547.6. Samples: 1444654040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-21 23:35:38,634][15401] Updated weights for policy 0, policy_version 88170 (0.0032) [2024-06-21 23:35:41,383][15401] Updated weights for policy 0, policy_version 88180 (0.0030) [2024-06-21 23:35:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42765.6). Total num frames: 1444790272. Throughput: 0: 42623.3. Samples: 1444907860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:43,391][15132] Avg episode reward: [(0, '0.216')] [2024-06-21 23:35:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088183_1444790272.pth... [2024-06-21 23:35:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087557_1434533888.pth [2024-06-21 23:35:46,367][15401] Updated weights for policy 0, policy_version 88190 (0.0038) [2024-06-21 23:35:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1445003264. Throughput: 0: 42553.3. Samples: 1445160740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-21 23:35:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 23:35:49,054][15401] Updated weights for policy 0, policy_version 88200 (0.0046) [2024-06-21 23:35:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1445199872. Throughput: 0: 42585.2. Samples: 1445293840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:35:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 23:35:54,028][15401] Updated weights for policy 0, policy_version 88210 (0.0030) [2024-06-21 23:35:55,616][15349] Signal inference workers to stop experience collection... (21300 times) [2024-06-21 23:35:55,616][15349] Signal inference workers to resume experience collection... (21300 times) [2024-06-21 23:35:55,641][15401] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-06-21 23:35:55,641][15401] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-06-21 23:35:56,787][15401] Updated weights for policy 0, policy_version 88220 (0.0033) [2024-06-21 23:35:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.7). Total num frames: 1445445632. Throughput: 0: 42514.1. Samples: 1445543200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:35:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-21 23:36:01,521][15401] Updated weights for policy 0, policy_version 88230 (0.0037) [2024-06-21 23:36:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1445642240. Throughput: 0: 42849.6. Samples: 1445810180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-21 23:36:04,666][15401] Updated weights for policy 0, policy_version 88240 (0.0038) [2024-06-21 23:36:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1445838848. Throughput: 0: 42678.4. Samples: 1445934820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 23:36:09,163][15401] Updated weights for policy 0, policy_version 88250 (0.0031) [2024-06-21 23:36:12,415][15401] Updated weights for policy 0, policy_version 88260 (0.0029) [2024-06-21 23:36:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1446084608. Throughput: 0: 42541.3. Samples: 1446184340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 23:36:16,873][15401] Updated weights for policy 0, policy_version 88270 (0.0030) [2024-06-21 23:36:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1446264832. Throughput: 0: 42705.8. Samples: 1446446100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 23:36:20,228][15401] Updated weights for policy 0, policy_version 88280 (0.0034) [2024-06-21 23:36:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1446494208. Throughput: 0: 42525.0. Samples: 1446567660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 23:36:24,198][15401] Updated weights for policy 0, policy_version 88290 (0.0029) [2024-06-21 23:36:27,768][15401] Updated weights for policy 0, policy_version 88300 (0.0025) [2024-06-21 23:36:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 1446723584. Throughput: 0: 42756.1. Samples: 1446831980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:28,393][15132] Avg episode reward: [(0, '0.709')] [2024-06-21 23:36:32,086][15401] Updated weights for policy 0, policy_version 88310 (0.0040) [2024-06-21 23:36:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1446903808. Throughput: 0: 42770.2. Samples: 1447085400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-21 23:36:35,502][15401] Updated weights for policy 0, policy_version 88320 (0.0023) [2024-06-21 23:36:38,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1447133184. Throughput: 0: 42515.7. Samples: 1447207140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-21 23:36:38,392][15132] Avg episode reward: [(0, '0.131')] [2024-06-21 23:36:39,723][15401] Updated weights for policy 0, policy_version 88330 (0.0033) [2024-06-21 23:36:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1447346176. Throughput: 0: 42771.6. Samples: 1447467920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:36:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-21 23:36:43,495][15401] Updated weights for policy 0, policy_version 88340 (0.0032) [2024-06-21 23:36:47,329][15401] Updated weights for policy 0, policy_version 88350 (0.0026) [2024-06-21 23:36:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1447559168. Throughput: 0: 42469.4. Samples: 1447721300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:36:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-21 23:36:50,935][15401] Updated weights for policy 0, policy_version 88360 (0.0049) [2024-06-21 23:36:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1447772160. Throughput: 0: 42490.1. Samples: 1447846880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:36:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-21 23:36:55,318][15401] Updated weights for policy 0, policy_version 88370 (0.0033) [2024-06-21 23:36:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.6, 300 sec: 42710.0). Total num frames: 1448001536. Throughput: 0: 42907.8. Samples: 1448115180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:36:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-21 23:36:58,394][15401] Updated weights for policy 0, policy_version 88380 (0.0044) [2024-06-21 23:37:02,851][15401] Updated weights for policy 0, policy_version 88390 (0.0041) [2024-06-21 23:37:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 1448198144. Throughput: 0: 42669.2. Samples: 1448366220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:03,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-21 23:37:06,543][15401] Updated weights for policy 0, policy_version 88400 (0.0034) [2024-06-21 23:37:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1448427520. Throughput: 0: 42855.8. Samples: 1448496180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 23:37:10,317][15401] Updated weights for policy 0, policy_version 88410 (0.0044) [2024-06-21 23:37:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1448640512. Throughput: 0: 42822.3. Samples: 1448758880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-21 23:37:13,960][15401] Updated weights for policy 0, policy_version 88420 (0.0032) [2024-06-21 23:37:17,835][15401] Updated weights for policy 0, policy_version 88430 (0.0035) [2024-06-21 23:37:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 1448837120. Throughput: 0: 42740.8. Samples: 1449008840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:18,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-21 23:37:21,211][15349] Signal inference workers to stop experience collection... (21350 times) [2024-06-21 23:37:21,212][15349] Signal inference workers to resume experience collection... (21350 times) [2024-06-21 23:37:21,228][15401] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-06-21 23:37:21,228][15401] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-06-21 23:37:21,351][15401] Updated weights for policy 0, policy_version 88440 (0.0031) [2024-06-21 23:37:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1449050112. Throughput: 0: 42835.6. Samples: 1449134640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:23,390][15132] Avg episode reward: [(0, '0.886')] [2024-06-21 23:37:25,461][15401] Updated weights for policy 0, policy_version 88450 (0.0036) [2024-06-21 23:37:28,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 1449263104. Throughput: 0: 42902.0. Samples: 1449398500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 23:37:29,085][15401] Updated weights for policy 0, policy_version 88460 (0.0033) [2024-06-21 23:37:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1449476096. Throughput: 0: 42824.9. Samples: 1449648420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-21 23:37:33,396][15132] Avg episode reward: [(0, '0.456')] [2024-06-21 23:37:33,563][15401] Updated weights for policy 0, policy_version 88470 (0.0044) [2024-06-21 23:37:36,872][15401] Updated weights for policy 0, policy_version 88480 (0.0046) [2024-06-21 23:37:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 1449705472. Throughput: 0: 42862.2. Samples: 1449775680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:37:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 23:37:41,185][15401] Updated weights for policy 0, policy_version 88490 (0.0045) [2024-06-21 23:37:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1449902080. Throughput: 0: 42606.1. Samples: 1450032460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:37:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-21 23:37:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088496_1449918464.pth... [2024-06-21 23:37:43,603][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000087871_1439678464.pth [2024-06-21 23:37:44,616][15401] Updated weights for policy 0, policy_version 88500 (0.0036) [2024-06-21 23:37:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1450115072. Throughput: 0: 42642.2. Samples: 1450285120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:37:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 23:37:48,779][15401] Updated weights for policy 0, policy_version 88510 (0.0046) [2024-06-21 23:37:52,450][15401] Updated weights for policy 0, policy_version 88520 (0.0026) [2024-06-21 23:37:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1450328064. Throughput: 0: 42572.1. Samples: 1450411920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:37:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:37:56,557][15401] Updated weights for policy 0, policy_version 88530 (0.0025) [2024-06-21 23:37:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42599.3). Total num frames: 1450524672. Throughput: 0: 42418.7. Samples: 1450667720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:37:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-21 23:38:00,164][15401] Updated weights for policy 0, policy_version 88540 (0.0033) [2024-06-21 23:38:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1450754048. Throughput: 0: 42589.0. Samples: 1450925240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 23:38:04,325][15401] Updated weights for policy 0, policy_version 88550 (0.0034) [2024-06-21 23:38:07,710][15401] Updated weights for policy 0, policy_version 88560 (0.0040) [2024-06-21 23:38:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1450983424. Throughput: 0: 42720.8. Samples: 1451057080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:08,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-21 23:38:11,749][15401] Updated weights for policy 0, policy_version 88570 (0.0029) [2024-06-21 23:38:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1451163648. Throughput: 0: 42506.5. Samples: 1451311300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:13,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 23:38:15,327][15401] Updated weights for policy 0, policy_version 88580 (0.0033) [2024-06-21 23:38:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1451393024. Throughput: 0: 42657.8. Samples: 1451568020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-21 23:38:19,418][15401] Updated weights for policy 0, policy_version 88590 (0.0048) [2024-06-21 23:38:22,902][15401] Updated weights for policy 0, policy_version 88600 (0.0035) [2024-06-21 23:38:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1451638784. Throughput: 0: 42855.6. Samples: 1451704180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 23:38:26,967][15401] Updated weights for policy 0, policy_version 88610 (0.0032) [2024-06-21 23:38:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1451802624. Throughput: 0: 42622.3. Samples: 1451950460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-21 23:38:30,267][15349] Signal inference workers to stop experience collection... (21400 times) [2024-06-21 23:38:30,267][15349] Signal inference workers to resume experience collection... (21400 times) [2024-06-21 23:38:30,311][15401] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-06-21 23:38:30,311][15401] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-06-21 23:38:30,417][15401] Updated weights for policy 0, policy_version 88620 (0.0043) [2024-06-21 23:38:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1452032000. Throughput: 0: 42780.9. Samples: 1452210360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:33,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-21 23:38:34,835][15401] Updated weights for policy 0, policy_version 88630 (0.0027) [2024-06-21 23:38:38,305][15401] Updated weights for policy 0, policy_version 88640 (0.0036) [2024-06-21 23:38:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1452277760. Throughput: 0: 42954.1. Samples: 1452344860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-21 23:38:42,419][15401] Updated weights for policy 0, policy_version 88650 (0.0036) [2024-06-21 23:38:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1452457984. Throughput: 0: 42790.7. Samples: 1452593300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-21 23:38:45,936][15401] Updated weights for policy 0, policy_version 88660 (0.0032) [2024-06-21 23:38:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1452670976. Throughput: 0: 42787.5. Samples: 1452850680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-21 23:38:50,043][15401] Updated weights for policy 0, policy_version 88670 (0.0024) [2024-06-21 23:38:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1452916736. Throughput: 0: 42817.3. Samples: 1452983860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-21 23:38:53,483][15401] Updated weights for policy 0, policy_version 88680 (0.0032) [2024-06-21 23:38:58,207][15401] Updated weights for policy 0, policy_version 88690 (0.0036) [2024-06-21 23:38:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1453096960. Throughput: 0: 42757.0. Samples: 1453235360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:38:58,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-21 23:39:01,110][15401] Updated weights for policy 0, policy_version 88700 (0.0039) [2024-06-21 23:39:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1453326336. Throughput: 0: 42646.3. Samples: 1453487100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:39:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-21 23:39:05,775][15401] Updated weights for policy 0, policy_version 88710 (0.0039) [2024-06-21 23:39:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1453555712. Throughput: 0: 42629.8. Samples: 1453622520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:39:08,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 23:39:08,660][15401] Updated weights for policy 0, policy_version 88720 (0.0039) [2024-06-21 23:39:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1453735936. Throughput: 0: 42887.9. Samples: 1453880420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:39:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-21 23:39:13,557][15401] Updated weights for policy 0, policy_version 88730 (0.0033) [2024-06-21 23:39:16,768][15401] Updated weights for policy 0, policy_version 88740 (0.0026) [2024-06-21 23:39:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1453948928. Throughput: 0: 42538.3. Samples: 1454124480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:39:18,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-21 23:39:21,107][15401] Updated weights for policy 0, policy_version 88750 (0.0022) [2024-06-21 23:39:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 1454178304. Throughput: 0: 42484.6. Samples: 1454256660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-21 23:39:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 23:39:24,234][15401] Updated weights for policy 0, policy_version 88760 (0.0028) [2024-06-21 23:39:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1454391296. Throughput: 0: 42721.8. Samples: 1454515780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-21 23:39:28,657][15401] Updated weights for policy 0, policy_version 88770 (0.0030) [2024-06-21 23:39:31,799][15401] Updated weights for policy 0, policy_version 88780 (0.0034) [2024-06-21 23:39:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 1454604288. Throughput: 0: 42681.4. Samples: 1454771340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-21 23:39:36,195][15401] Updated weights for policy 0, policy_version 88790 (0.0035) [2024-06-21 23:39:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1454833664. Throughput: 0: 42685.2. Samples: 1454904700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-21 23:39:39,362][15401] Updated weights for policy 0, policy_version 88800 (0.0034) [2024-06-21 23:39:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1455046656. Throughput: 0: 42857.2. Samples: 1455163940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-21 23:39:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088809_1455046656.pth... [2024-06-21 23:39:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088183_1444790272.pth [2024-06-21 23:39:43,625][15401] Updated weights for policy 0, policy_version 88810 (0.0040) [2024-06-21 23:39:47,142][15401] Updated weights for policy 0, policy_version 88820 (0.0034) [2024-06-21 23:39:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1455259648. Throughput: 0: 42966.2. Samples: 1455420580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:48,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-21 23:39:51,091][15401] Updated weights for policy 0, policy_version 88830 (0.0038) [2024-06-21 23:39:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1455472640. Throughput: 0: 42820.5. Samples: 1455549440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-21 23:39:54,599][15401] Updated weights for policy 0, policy_version 88840 (0.0035) [2024-06-21 23:39:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 1455685632. Throughput: 0: 42879.5. Samples: 1455810100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:39:58,393][15132] Avg episode reward: [(0, '0.423')] [2024-06-21 23:39:58,973][15401] Updated weights for policy 0, policy_version 88850 (0.0024) [2024-06-21 23:40:02,325][15401] Updated weights for policy 0, policy_version 88860 (0.0026) [2024-06-21 23:40:02,323][15349] Signal inference workers to stop experience collection... (21450 times) [2024-06-21 23:40:02,328][15349] Signal inference workers to resume experience collection... (21450 times) [2024-06-21 23:40:02,367][15401] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-06-21 23:40:02,367][15401] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-06-21 23:40:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1455898624. Throughput: 0: 43222.6. Samples: 1456069500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:40:03,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-21 23:40:06,327][15401] Updated weights for policy 0, policy_version 88870 (0.0040) [2024-06-21 23:40:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1456111616. Throughput: 0: 43198.2. Samples: 1456200580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:40:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-21 23:40:09,845][15401] Updated weights for policy 0, policy_version 88880 (0.0034) [2024-06-21 23:40:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 1456340992. Throughput: 0: 43189.2. Samples: 1456459400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-21 23:40:13,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 23:40:13,823][15401] Updated weights for policy 0, policy_version 88890 (0.0022) [2024-06-21 23:40:17,494][15401] Updated weights for policy 0, policy_version 88900 (0.0034) [2024-06-21 23:40:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1456553984. Throughput: 0: 43168.4. Samples: 1456713920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-21 23:40:21,490][15401] Updated weights for policy 0, policy_version 88910 (0.0034) [2024-06-21 23:40:23,396][15132] Fps is (10 sec: 40943.6, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 1456750592. Throughput: 0: 43088.2. Samples: 1456843940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:23,396][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 23:40:25,440][15401] Updated weights for policy 0, policy_version 88920 (0.0034) [2024-06-21 23:40:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1456979968. Throughput: 0: 43031.6. Samples: 1457100360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-21 23:40:29,155][15401] Updated weights for policy 0, policy_version 88930 (0.0035) [2024-06-21 23:40:32,903][15401] Updated weights for policy 0, policy_version 88940 (0.0046) [2024-06-21 23:40:33,390][15132] Fps is (10 sec: 44264.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1457192960. Throughput: 0: 43046.6. Samples: 1457357680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:33,390][15132] Avg episode reward: [(0, '0.049')] [2024-06-21 23:40:36,819][15401] Updated weights for policy 0, policy_version 88950 (0.0040) [2024-06-21 23:40:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1457389568. Throughput: 0: 42982.7. Samples: 1457483660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 23:40:40,432][15401] Updated weights for policy 0, policy_version 88960 (0.0043) [2024-06-21 23:40:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1457618944. Throughput: 0: 43035.2. Samples: 1457746580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-21 23:40:44,728][15401] Updated weights for policy 0, policy_version 88970 (0.0031) [2024-06-21 23:40:48,027][15401] Updated weights for policy 0, policy_version 88980 (0.0043) [2024-06-21 23:40:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1457848320. Throughput: 0: 42841.7. Samples: 1457997380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:48,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-21 23:40:52,248][15401] Updated weights for policy 0, policy_version 88990 (0.0028) [2024-06-21 23:40:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1458028544. Throughput: 0: 42956.9. Samples: 1458133640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-21 23:40:55,609][15401] Updated weights for policy 0, policy_version 89000 (0.0030) [2024-06-21 23:40:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 1458274304. Throughput: 0: 42837.4. Samples: 1458386980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:40:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 23:40:59,931][15401] Updated weights for policy 0, policy_version 89010 (0.0037) [2024-06-21 23:41:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1458487296. Throughput: 0: 42888.4. Samples: 1458643900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:41:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-21 23:41:03,574][15401] Updated weights for policy 0, policy_version 89020 (0.0043) [2024-06-21 23:41:07,591][15401] Updated weights for policy 0, policy_version 89030 (0.0034) [2024-06-21 23:41:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 1458683904. Throughput: 0: 42936.3. Samples: 1458775900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-21 23:41:08,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 23:41:11,246][15401] Updated weights for policy 0, policy_version 89040 (0.0034) [2024-06-21 23:41:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 1458913280. Throughput: 0: 42832.9. Samples: 1459027840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 23:41:15,162][15401] Updated weights for policy 0, policy_version 89050 (0.0039) [2024-06-21 23:41:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1459126272. Throughput: 0: 42855.7. Samples: 1459286180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-21 23:41:18,973][15401] Updated weights for policy 0, policy_version 89060 (0.0031) [2024-06-21 23:41:22,808][15401] Updated weights for policy 0, policy_version 89070 (0.0031) [2024-06-21 23:41:23,393][15132] Fps is (10 sec: 40944.8, 60 sec: 42873.3, 300 sec: 42709.3). Total num frames: 1459322880. Throughput: 0: 42794.1. Samples: 1459409560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:23,394][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:41:26,557][15401] Updated weights for policy 0, policy_version 89080 (0.0029) [2024-06-21 23:41:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1459568640. Throughput: 0: 42659.0. Samples: 1459666240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-21 23:41:30,183][15401] Updated weights for policy 0, policy_version 89090 (0.0026) [2024-06-21 23:41:33,390][15132] Fps is (10 sec: 45892.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 1459781632. Throughput: 0: 42895.5. Samples: 1459927680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:33,395][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:41:34,356][15401] Updated weights for policy 0, policy_version 89100 (0.0037) [2024-06-21 23:41:37,607][15401] Updated weights for policy 0, policy_version 89110 (0.0025) [2024-06-21 23:41:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1459978240. Throughput: 0: 42725.4. Samples: 1460056280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-21 23:41:42,090][15401] Updated weights for policy 0, policy_version 89120 (0.0040) [2024-06-21 23:41:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1460191232. Throughput: 0: 42923.1. Samples: 1460318520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-21 23:41:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089124_1460207616.pth... [2024-06-21 23:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088496_1449918464.pth [2024-06-21 23:41:45,612][15401] Updated weights for policy 0, policy_version 89130 (0.0033) [2024-06-21 23:41:47,235][15349] Signal inference workers to stop experience collection... (21500 times) [2024-06-21 23:41:47,240][15349] Signal inference workers to resume experience collection... (21500 times) [2024-06-21 23:41:47,253][15401] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-06-21 23:41:47,253][15401] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-06-21 23:41:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1460404224. Throughput: 0: 42856.2. Samples: 1460572420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-21 23:41:49,759][15401] Updated weights for policy 0, policy_version 89140 (0.0021) [2024-06-21 23:41:53,365][15401] Updated weights for policy 0, policy_version 89150 (0.0029) [2024-06-21 23:41:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 1460633600. Throughput: 0: 42791.6. Samples: 1460701420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:53,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-21 23:41:57,341][15401] Updated weights for policy 0, policy_version 89160 (0.0041) [2024-06-21 23:41:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1460846592. Throughput: 0: 43080.6. Samples: 1460966460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:41:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-21 23:42:01,089][15401] Updated weights for policy 0, policy_version 89170 (0.0036) [2024-06-21 23:42:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1461043200. Throughput: 0: 43023.0. Samples: 1461222320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-21 23:42:03,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-21 23:42:04,771][15401] Updated weights for policy 0, policy_version 89180 (0.0033) [2024-06-21 23:42:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1461256192. Throughput: 0: 43084.4. Samples: 1461348200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-21 23:42:08,554][15401] Updated weights for policy 0, policy_version 89190 (0.0039) [2024-06-21 23:42:12,099][15401] Updated weights for policy 0, policy_version 89200 (0.0036) [2024-06-21 23:42:13,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 1461501952. Throughput: 0: 43199.6. Samples: 1461610220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-21 23:42:15,911][15401] Updated weights for policy 0, policy_version 89210 (0.0029) [2024-06-21 23:42:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1461714944. Throughput: 0: 43279.6. Samples: 1461875260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-21 23:42:19,469][15401] Updated weights for policy 0, policy_version 89220 (0.0042) [2024-06-21 23:42:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43420.4, 300 sec: 42931.6). Total num frames: 1461927936. Throughput: 0: 43243.6. Samples: 1462002240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 23:42:23,424][15401] Updated weights for policy 0, policy_version 89230 (0.0043) [2024-06-21 23:42:27,241][15401] Updated weights for policy 0, policy_version 89240 (0.0035) [2024-06-21 23:42:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 1462140928. Throughput: 0: 43111.6. Samples: 1462258640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:28,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-21 23:42:31,282][15401] Updated weights for policy 0, policy_version 89250 (0.0029) [2024-06-21 23:42:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 1462353920. Throughput: 0: 43294.5. Samples: 1462520780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:33,393][15132] Avg episode reward: [(0, '0.715')] [2024-06-21 23:42:34,974][15401] Updated weights for policy 0, policy_version 89260 (0.0034) [2024-06-21 23:42:38,389][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1462566912. Throughput: 0: 43066.3. Samples: 1462639400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-21 23:42:38,827][15401] Updated weights for policy 0, policy_version 89270 (0.0043) [2024-06-21 23:42:42,611][15401] Updated weights for policy 0, policy_version 89280 (0.0035) [2024-06-21 23:42:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 1462779904. Throughput: 0: 42949.3. Samples: 1462899180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-21 23:42:46,308][15401] Updated weights for policy 0, policy_version 89290 (0.0032) [2024-06-21 23:42:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 1462960128. Throughput: 0: 43118.6. Samples: 1463162560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-21 23:42:50,349][15401] Updated weights for policy 0, policy_version 89300 (0.0037) [2024-06-21 23:42:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1463205888. Throughput: 0: 42946.8. Samples: 1463280800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-21 23:42:53,862][15401] Updated weights for policy 0, policy_version 89310 (0.0040) [2024-06-21 23:42:57,984][15401] Updated weights for policy 0, policy_version 89320 (0.0038) [2024-06-21 23:42:58,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1463435264. Throughput: 0: 42930.8. Samples: 1463542100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-21 23:42:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-21 23:43:01,715][15401] Updated weights for policy 0, policy_version 89330 (0.0032) [2024-06-21 23:43:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1463599104. Throughput: 0: 42754.6. Samples: 1463799220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 23:43:05,674][15401] Updated weights for policy 0, policy_version 89340 (0.0032) [2024-06-21 23:43:05,716][15349] Signal inference workers to stop experience collection... (21550 times) [2024-06-21 23:43:05,716][15349] Signal inference workers to resume experience collection... (21550 times) [2024-06-21 23:43:05,738][15401] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-06-21 23:43:05,738][15401] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-06-21 23:43:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1463844864. Throughput: 0: 42600.7. Samples: 1463919280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:08,399][15132] Avg episode reward: [(0, '0.409')] [2024-06-21 23:43:09,239][15401] Updated weights for policy 0, policy_version 89350 (0.0041) [2024-06-21 23:43:13,375][15401] Updated weights for policy 0, policy_version 89360 (0.0028) [2024-06-21 23:43:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1464074240. Throughput: 0: 42733.7. Samples: 1464181560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-21 23:43:16,943][15401] Updated weights for policy 0, policy_version 89370 (0.0033) [2024-06-21 23:43:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1464238080. Throughput: 0: 42633.0. Samples: 1464439160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-21 23:43:20,974][15401] Updated weights for policy 0, policy_version 89380 (0.0045) [2024-06-21 23:43:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1464483840. Throughput: 0: 42631.2. Samples: 1464557800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 23:43:24,891][15401] Updated weights for policy 0, policy_version 89390 (0.0035) [2024-06-21 23:43:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42932.0). Total num frames: 1464696832. Throughput: 0: 42734.1. Samples: 1464822220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-21 23:43:28,702][15401] Updated weights for policy 0, policy_version 89400 (0.0031) [2024-06-21 23:43:32,829][15401] Updated weights for policy 0, policy_version 89410 (0.0041) [2024-06-21 23:43:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 1464893440. Throughput: 0: 42434.8. Samples: 1465072120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:43:36,706][15401] Updated weights for policy 0, policy_version 89420 (0.0030) [2024-06-21 23:43:38,391][15132] Fps is (10 sec: 44232.6, 60 sec: 42870.7, 300 sec: 42987.0). Total num frames: 1465139200. Throughput: 0: 42632.3. Samples: 1465199300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:38,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-21 23:43:40,541][15401] Updated weights for policy 0, policy_version 89430 (0.0037) [2024-06-21 23:43:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 1465335808. Throughput: 0: 42686.7. Samples: 1465463000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:43,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 23:43:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089438_1465352192.pth... [2024-06-21 23:43:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000088809_1455046656.pth [2024-06-21 23:43:44,316][15401] Updated weights for policy 0, policy_version 89440 (0.0034) [2024-06-21 23:43:48,162][15401] Updated weights for policy 0, policy_version 89450 (0.0034) [2024-06-21 23:43:48,390][15132] Fps is (10 sec: 40963.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1465548800. Throughput: 0: 42412.4. Samples: 1465707780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-21 23:43:51,859][15401] Updated weights for policy 0, policy_version 89460 (0.0030) [2024-06-21 23:43:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1465761792. Throughput: 0: 42682.3. Samples: 1465839980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-21 23:43:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 23:43:55,775][15401] Updated weights for policy 0, policy_version 89470 (0.0028) [2024-06-21 23:43:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1465942016. Throughput: 0: 42582.0. Samples: 1466097740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:43:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 23:43:59,908][15401] Updated weights for policy 0, policy_version 89480 (0.0031) [2024-06-21 23:44:03,393][15132] Fps is (10 sec: 42581.7, 60 sec: 43141.7, 300 sec: 42820.0). Total num frames: 1466187776. Throughput: 0: 42272.7. Samples: 1466341600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:03,394][15132] Avg episode reward: [(0, '0.301')] [2024-06-21 23:44:03,521][15401] Updated weights for policy 0, policy_version 89490 (0.0036) [2024-06-21 23:44:07,458][15401] Updated weights for policy 0, policy_version 89500 (0.0035) [2024-06-21 23:44:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1466400768. Throughput: 0: 42575.5. Samples: 1466473700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-21 23:44:11,761][15401] Updated weights for policy 0, policy_version 89510 (0.0037) [2024-06-21 23:44:13,392][15132] Fps is (10 sec: 39327.8, 60 sec: 41777.6, 300 sec: 42820.2). Total num frames: 1466580992. Throughput: 0: 42409.3. Samples: 1466730740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:13,392][15132] Avg episode reward: [(0, '0.446')] [2024-06-21 23:44:15,344][15401] Updated weights for policy 0, policy_version 89520 (0.0029) [2024-06-21 23:44:16,362][15349] Signal inference workers to stop experience collection... (21600 times) [2024-06-21 23:44:16,396][15401] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-06-21 23:44:16,523][15349] Signal inference workers to resume experience collection... (21600 times) [2024-06-21 23:44:16,523][15401] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-06-21 23:44:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1466826752. Throughput: 0: 42248.9. Samples: 1466973320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 23:44:19,618][15401] Updated weights for policy 0, policy_version 89530 (0.0036) [2024-06-21 23:44:23,069][15401] Updated weights for policy 0, policy_version 89540 (0.0043) [2024-06-21 23:44:23,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1467039744. Throughput: 0: 42563.1. Samples: 1467114600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-21 23:44:27,259][15401] Updated weights for policy 0, policy_version 89550 (0.0039) [2024-06-21 23:44:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1467236352. Throughput: 0: 42388.8. Samples: 1467370500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:28,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-21 23:44:30,769][15401] Updated weights for policy 0, policy_version 89560 (0.0019) [2024-06-21 23:44:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1467482112. Throughput: 0: 42311.1. Samples: 1467611780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 23:44:35,154][15401] Updated weights for policy 0, policy_version 89570 (0.0036) [2024-06-21 23:44:38,322][15401] Updated weights for policy 0, policy_version 89580 (0.0037) [2024-06-21 23:44:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42326.0, 300 sec: 42820.6). Total num frames: 1467678720. Throughput: 0: 42464.0. Samples: 1467750860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 23:44:42,780][15401] Updated weights for policy 0, policy_version 89590 (0.0037) [2024-06-21 23:44:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1467875328. Throughput: 0: 42485.6. Samples: 1468009600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-21 23:44:45,961][15401] Updated weights for policy 0, policy_version 89600 (0.0036) [2024-06-21 23:44:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1468104704. Throughput: 0: 42584.2. Samples: 1468257720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-21 23:44:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-21 23:44:50,387][15401] Updated weights for policy 0, policy_version 89610 (0.0021) [2024-06-21 23:44:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 1468301312. Throughput: 0: 42709.8. Samples: 1468395640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:44:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-21 23:44:53,557][15401] Updated weights for policy 0, policy_version 89620 (0.0035) [2024-06-21 23:44:58,022][15401] Updated weights for policy 0, policy_version 89630 (0.0023) [2024-06-21 23:44:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1468514304. Throughput: 0: 42512.6. Samples: 1468643700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:44:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 23:45:01,621][15401] Updated weights for policy 0, policy_version 89640 (0.0037) [2024-06-21 23:45:03,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43147.4, 300 sec: 42931.6). Total num frames: 1468776448. Throughput: 0: 42658.7. Samples: 1468892960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 23:45:05,747][15401] Updated weights for policy 0, policy_version 89650 (0.0043) [2024-06-21 23:45:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1468956672. Throughput: 0: 42682.3. Samples: 1469035300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-21 23:45:09,108][15401] Updated weights for policy 0, policy_version 89660 (0.0040) [2024-06-21 23:45:13,337][15401] Updated weights for policy 0, policy_version 89670 (0.0030) [2024-06-21 23:45:13,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 1469153280. Throughput: 0: 42636.0. Samples: 1469289220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:13,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 23:45:16,719][15401] Updated weights for policy 0, policy_version 89680 (0.0027) [2024-06-21 23:45:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 1469415424. Throughput: 0: 42817.3. Samples: 1469538560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-21 23:45:21,220][15401] Updated weights for policy 0, policy_version 89690 (0.0033) [2024-06-21 23:45:23,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1469562880. Throughput: 0: 42891.1. Samples: 1469680960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:23,399][15132] Avg episode reward: [(0, '0.595')] [2024-06-21 23:45:24,339][15401] Updated weights for policy 0, policy_version 89700 (0.0038) [2024-06-21 23:45:24,904][15349] Signal inference workers to stop experience collection... (21650 times) [2024-06-21 23:45:24,960][15401] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-06-21 23:45:25,014][15349] Signal inference workers to resume experience collection... (21650 times) [2024-06-21 23:45:25,015][15401] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-06-21 23:45:28,390][15132] Fps is (10 sec: 36045.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1469775872. Throughput: 0: 42528.0. Samples: 1469923360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 23:45:29,068][15401] Updated weights for policy 0, policy_version 89710 (0.0032) [2024-06-21 23:45:32,215][15401] Updated weights for policy 0, policy_version 89720 (0.0024) [2024-06-21 23:45:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1470038016. Throughput: 0: 42655.9. Samples: 1470177240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-21 23:45:36,605][15401] Updated weights for policy 0, policy_version 89730 (0.0045) [2024-06-21 23:45:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1470218240. Throughput: 0: 42674.2. Samples: 1470315980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-21 23:45:40,026][15401] Updated weights for policy 0, policy_version 89740 (0.0030) [2024-06-21 23:45:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1470431232. Throughput: 0: 42614.1. Samples: 1470561340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-21 23:45:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-21 23:45:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089748_1470431232.pth... [2024-06-21 23:45:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089124_1460207616.pth [2024-06-21 23:45:44,158][15401] Updated weights for policy 0, policy_version 89750 (0.0043) [2024-06-21 23:45:47,672][15401] Updated weights for policy 0, policy_version 89760 (0.0031) [2024-06-21 23:45:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1470676992. Throughput: 0: 42822.8. Samples: 1470819980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:45:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-21 23:45:51,557][15401] Updated weights for policy 0, policy_version 89770 (0.0033) [2024-06-21 23:45:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1470857216. Throughput: 0: 42663.0. Samples: 1470955140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:45:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 23:45:55,100][15401] Updated weights for policy 0, policy_version 89780 (0.0032) [2024-06-21 23:45:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1471086592. Throughput: 0: 42452.5. Samples: 1471199480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:45:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:45:59,129][15401] Updated weights for policy 0, policy_version 89790 (0.0038) [2024-06-21 23:46:02,610][15401] Updated weights for policy 0, policy_version 89800 (0.0044) [2024-06-21 23:46:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42052.4, 300 sec: 42765.4). Total num frames: 1471299584. Throughput: 0: 42673.1. Samples: 1471458840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-21 23:46:06,657][15401] Updated weights for policy 0, policy_version 89810 (0.0036) [2024-06-21 23:46:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1471496192. Throughput: 0: 42368.1. Samples: 1471587520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:08,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-21 23:46:10,189][15401] Updated weights for policy 0, policy_version 89820 (0.0039) [2024-06-21 23:46:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 1471741952. Throughput: 0: 42615.7. Samples: 1471841060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 23:46:14,570][15401] Updated weights for policy 0, policy_version 89830 (0.0039) [2024-06-21 23:46:17,727][15401] Updated weights for policy 0, policy_version 89840 (0.0036) [2024-06-21 23:46:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42765.6). Total num frames: 1471938560. Throughput: 0: 42810.7. Samples: 1472103720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:18,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-21 23:46:22,126][15401] Updated weights for policy 0, policy_version 89850 (0.0037) [2024-06-21 23:46:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1472135168. Throughput: 0: 42556.4. Samples: 1472231020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-21 23:46:25,437][15401] Updated weights for policy 0, policy_version 89860 (0.0034) [2024-06-21 23:46:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1472364544. Throughput: 0: 42750.3. Samples: 1472485100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:28,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-21 23:46:29,842][15401] Updated weights for policy 0, policy_version 89870 (0.0029) [2024-06-21 23:46:33,025][15401] Updated weights for policy 0, policy_version 89880 (0.0027) [2024-06-21 23:46:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1472593920. Throughput: 0: 42711.5. Samples: 1472742000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:33,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-21 23:46:37,365][15401] Updated weights for policy 0, policy_version 89890 (0.0030) [2024-06-21 23:46:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1472774144. Throughput: 0: 42767.2. Samples: 1472879660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-21 23:46:38,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 23:46:39,968][15349] Signal inference workers to stop experience collection... (21700 times) [2024-06-21 23:46:40,023][15401] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-06-21 23:46:40,088][15349] Signal inference workers to resume experience collection... (21700 times) [2024-06-21 23:46:40,088][15401] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-06-21 23:46:40,560][15401] Updated weights for policy 0, policy_version 89900 (0.0034) [2024-06-21 23:46:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1473019904. Throughput: 0: 43034.6. Samples: 1473136040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:46:43,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-21 23:46:44,929][15401] Updated weights for policy 0, policy_version 89910 (0.0039) [2024-06-21 23:46:48,167][15401] Updated weights for policy 0, policy_version 89920 (0.0021) [2024-06-21 23:46:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1473249280. Throughput: 0: 42926.9. Samples: 1473390560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:46:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 23:46:52,337][15401] Updated weights for policy 0, policy_version 89930 (0.0040) [2024-06-21 23:46:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1473429504. Throughput: 0: 43009.8. Samples: 1473522960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:46:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-21 23:46:56,128][15401] Updated weights for policy 0, policy_version 89940 (0.0036) [2024-06-21 23:46:58,390][15132] Fps is (10 sec: 40956.8, 60 sec: 42870.8, 300 sec: 42765.2). Total num frames: 1473658880. Throughput: 0: 43034.7. Samples: 1473777660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:46:58,391][15132] Avg episode reward: [(0, '0.579')] [2024-06-21 23:47:00,126][15401] Updated weights for policy 0, policy_version 89950 (0.0033) [2024-06-21 23:47:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1473888256. Throughput: 0: 42903.7. Samples: 1474034480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:03,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-21 23:47:03,457][15401] Updated weights for policy 0, policy_version 89960 (0.0038) [2024-06-21 23:47:07,865][15401] Updated weights for policy 0, policy_version 89970 (0.0041) [2024-06-21 23:47:08,389][15132] Fps is (10 sec: 42602.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1474084864. Throughput: 0: 43000.6. Samples: 1474166040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-21 23:47:11,050][15401] Updated weights for policy 0, policy_version 89980 (0.0055) [2024-06-21 23:47:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1474314240. Throughput: 0: 43047.9. Samples: 1474422260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-21 23:47:15,471][15401] Updated weights for policy 0, policy_version 89990 (0.0029) [2024-06-21 23:47:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1474527232. Throughput: 0: 42882.7. Samples: 1474671720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 23:47:18,660][15401] Updated weights for policy 0, policy_version 90000 (0.0039) [2024-06-21 23:47:23,110][15401] Updated weights for policy 0, policy_version 90010 (0.0033) [2024-06-21 23:47:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42653.9). Total num frames: 1474723840. Throughput: 0: 42759.9. Samples: 1474803960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:23,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-21 23:47:26,294][15401] Updated weights for policy 0, policy_version 90020 (0.0028) [2024-06-21 23:47:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1474936832. Throughput: 0: 42713.4. Samples: 1475058140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-21 23:47:30,880][15401] Updated weights for policy 0, policy_version 90030 (0.0032) [2024-06-21 23:47:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1475166208. Throughput: 0: 42834.3. Samples: 1475318100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-21 23:47:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 23:47:33,816][15401] Updated weights for policy 0, policy_version 90040 (0.0026) [2024-06-21 23:47:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1475362816. Throughput: 0: 42845.8. Samples: 1475451020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:47:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 23:47:38,644][15401] Updated weights for policy 0, policy_version 90050 (0.0038) [2024-06-21 23:47:41,734][15401] Updated weights for policy 0, policy_version 90060 (0.0035) [2024-06-21 23:47:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1475592192. Throughput: 0: 42656.7. Samples: 1475697180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:47:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-21 23:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090063_1475592192.pth... [2024-06-21 23:47:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089438_1465352192.pth [2024-06-21 23:47:46,630][15401] Updated weights for policy 0, policy_version 90070 (0.0030) [2024-06-21 23:47:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1475805184. Throughput: 0: 42776.4. Samples: 1475959320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:47:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-21 23:47:49,301][15401] Updated weights for policy 0, policy_version 90080 (0.0031) [2024-06-21 23:47:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1476001792. Throughput: 0: 42647.5. Samples: 1476085180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:47:53,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-21 23:47:54,334][15401] Updated weights for policy 0, policy_version 90090 (0.0034) [2024-06-21 23:47:55,885][15349] Signal inference workers to stop experience collection... (21750 times) [2024-06-21 23:47:55,885][15349] Signal inference workers to resume experience collection... (21750 times) [2024-06-21 23:47:55,900][15401] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-06-21 23:47:55,900][15401] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-06-21 23:47:57,031][15401] Updated weights for policy 0, policy_version 90100 (0.0045) [2024-06-21 23:47:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 1476214784. Throughput: 0: 42483.7. Samples: 1476334020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:47:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-21 23:48:01,958][15401] Updated weights for policy 0, policy_version 90110 (0.0042) [2024-06-21 23:48:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 1476427776. Throughput: 0: 42685.3. Samples: 1476592560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:03,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 23:48:05,077][15401] Updated weights for policy 0, policy_version 90120 (0.0029) [2024-06-21 23:48:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1476624384. Throughput: 0: 42596.5. Samples: 1476720700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-21 23:48:09,754][15401] Updated weights for policy 0, policy_version 90130 (0.0031) [2024-06-21 23:48:12,790][15401] Updated weights for policy 0, policy_version 90140 (0.0038) [2024-06-21 23:48:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1476870144. Throughput: 0: 42568.0. Samples: 1476973700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-21 23:48:17,368][15401] Updated weights for policy 0, policy_version 90150 (0.0040) [2024-06-21 23:48:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1477066752. Throughput: 0: 42616.7. Samples: 1477235860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-21 23:48:20,605][15401] Updated weights for policy 0, policy_version 90160 (0.0046) [2024-06-21 23:48:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 1477263360. Throughput: 0: 42322.2. Samples: 1477355520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 23:48:24,794][15401] Updated weights for policy 0, policy_version 90170 (0.0040) [2024-06-21 23:48:28,017][15401] Updated weights for policy 0, policy_version 90180 (0.0034) [2024-06-21 23:48:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1477509120. Throughput: 0: 42608.7. Samples: 1477614560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-21 23:48:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-21 23:48:32,300][15401] Updated weights for policy 0, policy_version 90190 (0.0031) [2024-06-21 23:48:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42543.0). Total num frames: 1477689344. Throughput: 0: 42664.0. Samples: 1477879200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-21 23:48:35,719][15401] Updated weights for policy 0, policy_version 90200 (0.0027) [2024-06-21 23:48:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1477918720. Throughput: 0: 42609.3. Samples: 1478002600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-21 23:48:39,910][15401] Updated weights for policy 0, policy_version 90210 (0.0037) [2024-06-21 23:48:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1478148096. Throughput: 0: 42814.1. Samples: 1478260660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:43,390][15132] Avg episode reward: [(0, '0.148')] [2024-06-21 23:48:43,456][15401] Updated weights for policy 0, policy_version 90220 (0.0034) [2024-06-21 23:48:47,626][15401] Updated weights for policy 0, policy_version 90230 (0.0026) [2024-06-21 23:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1478344704. Throughput: 0: 42841.8. Samples: 1478520440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-21 23:48:51,123][15401] Updated weights for policy 0, policy_version 90240 (0.0037) [2024-06-21 23:48:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1478574080. Throughput: 0: 42775.5. Samples: 1478645700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:53,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 23:48:55,731][15401] Updated weights for policy 0, policy_version 90250 (0.0024) [2024-06-21 23:48:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42710.1). Total num frames: 1478787072. Throughput: 0: 42834.3. Samples: 1478901240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:48:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-21 23:48:58,711][15401] Updated weights for policy 0, policy_version 90260 (0.0031) [2024-06-21 23:49:03,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1478967296. Throughput: 0: 42875.2. Samples: 1479165240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:49:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-21 23:49:03,669][15401] Updated weights for policy 0, policy_version 90270 (0.0037) [2024-06-21 23:49:06,311][15401] Updated weights for policy 0, policy_version 90280 (0.0035) [2024-06-21 23:49:08,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43413.0, 300 sec: 42875.5). Total num frames: 1479229440. Throughput: 0: 42805.0. Samples: 1479282020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:49:08,396][15132] Avg episode reward: [(0, '0.433')] [2024-06-21 23:49:11,162][15401] Updated weights for policy 0, policy_version 90290 (0.0036) [2024-06-21 23:49:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1479426048. Throughput: 0: 42887.5. Samples: 1479544500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:49:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-21 23:49:14,066][15401] Updated weights for policy 0, policy_version 90300 (0.0033) [2024-06-21 23:49:18,389][15132] Fps is (10 sec: 37707.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1479606272. Throughput: 0: 42811.7. Samples: 1479805720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:49:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-21 23:49:18,797][15401] Updated weights for policy 0, policy_version 90310 (0.0030) [2024-06-21 23:49:21,664][15401] Updated weights for policy 0, policy_version 90320 (0.0029) [2024-06-21 23:49:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1479868416. Throughput: 0: 42762.6. Samples: 1479926920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-21 23:49:23,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-21 23:49:26,312][15401] Updated weights for policy 0, policy_version 90330 (0.0046) [2024-06-21 23:49:28,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 1480065024. Throughput: 0: 42773.5. Samples: 1480185740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:28,397][15132] Avg episode reward: [(0, '0.516')] [2024-06-21 23:49:29,829][15401] Updated weights for policy 0, policy_version 90340 (0.0037) [2024-06-21 23:49:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1480245248. Throughput: 0: 42670.6. Samples: 1480440620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-21 23:49:34,060][15401] Updated weights for policy 0, policy_version 90350 (0.0035) [2024-06-21 23:49:37,110][15349] Signal inference workers to stop experience collection... (21800 times) [2024-06-21 23:49:37,164][15401] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-06-21 23:49:37,227][15349] Signal inference workers to resume experience collection... (21800 times) [2024-06-21 23:49:37,227][15401] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-06-21 23:49:37,367][15401] Updated weights for policy 0, policy_version 90360 (0.0030) [2024-06-21 23:49:38,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1480491008. Throughput: 0: 42640.1. Samples: 1480564400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-21 23:49:41,721][15401] Updated weights for policy 0, policy_version 90370 (0.0025) [2024-06-21 23:49:43,392][15132] Fps is (10 sec: 47501.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 1480720384. Throughput: 0: 42825.6. Samples: 1480828500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:43,393][15132] Avg episode reward: [(0, '0.496')] [2024-06-21 23:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090376_1480720384.pth... [2024-06-21 23:49:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000089748_1470431232.pth [2024-06-21 23:49:45,037][15401] Updated weights for policy 0, policy_version 90380 (0.0032) [2024-06-21 23:49:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1480900608. Throughput: 0: 42558.3. Samples: 1481080360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-21 23:49:49,419][15401] Updated weights for policy 0, policy_version 90390 (0.0029) [2024-06-21 23:49:52,821][15401] Updated weights for policy 0, policy_version 90400 (0.0044) [2024-06-21 23:49:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1481129984. Throughput: 0: 42829.1. Samples: 1481209060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-21 23:49:56,921][15401] Updated weights for policy 0, policy_version 90410 (0.0031) [2024-06-21 23:49:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1481342976. Throughput: 0: 42765.5. Samples: 1481468940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:49:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 23:50:00,409][15401] Updated weights for policy 0, policy_version 90420 (0.0033) [2024-06-21 23:50:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1481539584. Throughput: 0: 42584.0. Samples: 1481722000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:50:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-21 23:50:04,573][15401] Updated weights for policy 0, policy_version 90430 (0.0047) [2024-06-21 23:50:07,935][15401] Updated weights for policy 0, policy_version 90440 (0.0036) [2024-06-21 23:50:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42329.9, 300 sec: 42765.4). Total num frames: 1481768960. Throughput: 0: 42729.6. Samples: 1481849740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:50:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 23:50:12,118][15401] Updated weights for policy 0, policy_version 90450 (0.0032) [2024-06-21 23:50:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1481981952. Throughput: 0: 42769.7. Samples: 1482110100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:50:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 23:50:15,543][15401] Updated weights for policy 0, policy_version 90460 (0.0032) [2024-06-21 23:50:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1482194944. Throughput: 0: 42565.3. Samples: 1482356060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-21 23:50:18,394][15132] Avg episode reward: [(0, '0.483')] [2024-06-21 23:50:19,796][15401] Updated weights for policy 0, policy_version 90470 (0.0036) [2024-06-21 23:50:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1482407936. Throughput: 0: 42820.4. Samples: 1482491320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-21 23:50:23,584][15401] Updated weights for policy 0, policy_version 90480 (0.0039) [2024-06-21 23:50:27,730][15401] Updated weights for policy 0, policy_version 90490 (0.0044) [2024-06-21 23:50:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42056.8, 300 sec: 42542.9). Total num frames: 1482588160. Throughput: 0: 42640.7. Samples: 1482747220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-21 23:50:31,134][15401] Updated weights for policy 0, policy_version 90500 (0.0032) [2024-06-21 23:50:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1482850304. Throughput: 0: 42442.6. Samples: 1482990280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-21 23:50:35,788][15401] Updated weights for policy 0, policy_version 90510 (0.0042) [2024-06-21 23:50:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1483046912. Throughput: 0: 42659.6. Samples: 1483128740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:38,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-21 23:50:38,907][15401] Updated weights for policy 0, policy_version 90520 (0.0042) [2024-06-21 23:50:43,373][15401] Updated weights for policy 0, policy_version 90530 (0.0049) [2024-06-21 23:50:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 1483243520. Throughput: 0: 42494.8. Samples: 1483381220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-21 23:50:46,348][15401] Updated weights for policy 0, policy_version 90540 (0.0032) [2024-06-21 23:50:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1483505664. Throughput: 0: 42373.3. Samples: 1483628800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 23:50:50,848][15401] Updated weights for policy 0, policy_version 90550 (0.0040) [2024-06-21 23:50:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1483685888. Throughput: 0: 42662.4. Samples: 1483769560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-21 23:50:54,259][15401] Updated weights for policy 0, policy_version 90560 (0.0037) [2024-06-21 23:50:55,074][15349] Signal inference workers to stop experience collection... (21850 times) [2024-06-21 23:50:55,124][15401] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-06-21 23:50:55,132][15349] Signal inference workers to resume experience collection... (21850 times) [2024-06-21 23:50:55,137][15401] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-06-21 23:50:58,255][15401] Updated weights for policy 0, policy_version 90570 (0.0033) [2024-06-21 23:50:58,394][15132] Fps is (10 sec: 39306.0, 60 sec: 42595.4, 300 sec: 42708.9). Total num frames: 1483898880. Throughput: 0: 42580.6. Samples: 1484026400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:50:58,394][15132] Avg episode reward: [(0, '0.511')] [2024-06-21 23:51:01,900][15401] Updated weights for policy 0, policy_version 90580 (0.0026) [2024-06-21 23:51:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1484144640. Throughput: 0: 42648.4. Samples: 1484275240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:51:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-21 23:51:05,711][15401] Updated weights for policy 0, policy_version 90590 (0.0025) [2024-06-21 23:51:08,390][15132] Fps is (10 sec: 42615.2, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1484324864. Throughput: 0: 42685.8. Samples: 1484412180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:51:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-21 23:51:09,457][15401] Updated weights for policy 0, policy_version 90600 (0.0033) [2024-06-21 23:51:13,201][15401] Updated weights for policy 0, policy_version 90610 (0.0038) [2024-06-21 23:51:13,391][15132] Fps is (10 sec: 40953.7, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 1484554240. Throughput: 0: 42732.1. Samples: 1484670240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-21 23:51:13,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-21 23:51:16,906][15401] Updated weights for policy 0, policy_version 90620 (0.0035) [2024-06-21 23:51:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1484783616. Throughput: 0: 43047.3. Samples: 1484927400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-21 23:51:20,703][15401] Updated weights for policy 0, policy_version 90630 (0.0027) [2024-06-21 23:51:23,389][15132] Fps is (10 sec: 40967.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1484963840. Throughput: 0: 42937.9. Samples: 1485060940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-21 23:51:24,336][15401] Updated weights for policy 0, policy_version 90640 (0.0029) [2024-06-21 23:51:28,110][15401] Updated weights for policy 0, policy_version 90650 (0.0035) [2024-06-21 23:51:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1485209600. Throughput: 0: 43088.7. Samples: 1485320200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-21 23:51:32,407][15401] Updated weights for policy 0, policy_version 90660 (0.0028) [2024-06-21 23:51:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1485406208. Throughput: 0: 43254.2. Samples: 1485575240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 23:51:35,659][15401] Updated weights for policy 0, policy_version 90670 (0.0044) [2024-06-21 23:51:38,393][15132] Fps is (10 sec: 39308.6, 60 sec: 42596.2, 300 sec: 42653.5). Total num frames: 1485602816. Throughput: 0: 42914.0. Samples: 1485700820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:38,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-21 23:51:39,874][15401] Updated weights for policy 0, policy_version 90680 (0.0038) [2024-06-21 23:51:43,339][15401] Updated weights for policy 0, policy_version 90690 (0.0021) [2024-06-21 23:51:43,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43689.0, 300 sec: 42764.7). Total num frames: 1485864960. Throughput: 0: 43007.7. Samples: 1485961680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:43,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-21 23:51:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090690_1485864960.pth... [2024-06-21 23:51:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090063_1475592192.pth [2024-06-21 23:51:47,391][15401] Updated weights for policy 0, policy_version 90700 (0.0042) [2024-06-21 23:51:48,390][15132] Fps is (10 sec: 44250.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1486045184. Throughput: 0: 43177.8. Samples: 1486218240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-21 23:51:51,210][15401] Updated weights for policy 0, policy_version 90710 (0.0033) [2024-06-21 23:51:53,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 1486241792. Throughput: 0: 43005.0. Samples: 1486347400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-21 23:51:54,901][15401] Updated weights for policy 0, policy_version 90720 (0.0031) [2024-06-21 23:51:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43147.5, 300 sec: 42709.8). Total num frames: 1486487552. Throughput: 0: 42987.4. Samples: 1486604600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:51:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-21 23:51:59,080][15401] Updated weights for policy 0, policy_version 90730 (0.0041) [2024-06-21 23:52:02,597][15401] Updated weights for policy 0, policy_version 90740 (0.0032) [2024-06-21 23:52:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1486700544. Throughput: 0: 42902.1. Samples: 1486858000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:52:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 23:52:06,835][15401] Updated weights for policy 0, policy_version 90750 (0.0030) [2024-06-21 23:52:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1486897152. Throughput: 0: 42796.9. Samples: 1486986800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-21 23:52:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-21 23:52:10,624][15401] Updated weights for policy 0, policy_version 90760 (0.0036) [2024-06-21 23:52:13,391][15132] Fps is (10 sec: 44229.7, 60 sec: 43144.6, 300 sec: 42764.8). Total num frames: 1487142912. Throughput: 0: 42816.2. Samples: 1487247000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:13,392][15132] Avg episode reward: [(0, '0.830')] [2024-06-21 23:52:14,380][15401] Updated weights for policy 0, policy_version 90770 (0.0037) [2024-06-21 23:52:17,289][15349] Signal inference workers to stop experience collection... (21900 times) [2024-06-21 23:52:17,290][15349] Signal inference workers to resume experience collection... (21900 times) [2024-06-21 23:52:17,332][15401] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-06-21 23:52:17,332][15401] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-06-21 23:52:18,161][15401] Updated weights for policy 0, policy_version 90780 (0.0045) [2024-06-21 23:52:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 1487339520. Throughput: 0: 42904.0. Samples: 1487505920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-21 23:52:21,789][15401] Updated weights for policy 0, policy_version 90790 (0.0041) [2024-06-21 23:52:23,389][15132] Fps is (10 sec: 39328.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1487536128. Throughput: 0: 42946.2. Samples: 1487633260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:23,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-21 23:52:25,535][15401] Updated weights for policy 0, policy_version 90800 (0.0026) [2024-06-21 23:52:28,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 1487781888. Throughput: 0: 42874.4. Samples: 1487891200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:28,397][15132] Avg episode reward: [(0, '0.555')] [2024-06-21 23:52:29,263][15401] Updated weights for policy 0, policy_version 90810 (0.0036) [2024-06-21 23:52:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1487978496. Throughput: 0: 42925.0. Samples: 1488149860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:33,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-21 23:52:33,578][15401] Updated weights for policy 0, policy_version 90820 (0.0023) [2024-06-21 23:52:36,727][15401] Updated weights for policy 0, policy_version 90830 (0.0046) [2024-06-21 23:52:38,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42873.8, 300 sec: 42654.0). Total num frames: 1488175104. Throughput: 0: 42850.6. Samples: 1488275680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-21 23:52:41,179][15401] Updated weights for policy 0, policy_version 90840 (0.0028) [2024-06-21 23:52:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1488420864. Throughput: 0: 43014.6. Samples: 1488540260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-21 23:52:44,612][15401] Updated weights for policy 0, policy_version 90850 (0.0033) [2024-06-21 23:52:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1488633856. Throughput: 0: 43015.4. Samples: 1488793700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-21 23:52:48,751][15401] Updated weights for policy 0, policy_version 90860 (0.0031) [2024-06-21 23:52:52,138][15401] Updated weights for policy 0, policy_version 90870 (0.0035) [2024-06-21 23:52:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1488846848. Throughput: 0: 43061.2. Samples: 1488924560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-21 23:52:56,382][15401] Updated weights for policy 0, policy_version 90880 (0.0037) [2024-06-21 23:52:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1489059840. Throughput: 0: 43148.2. Samples: 1489188600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:52:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-21 23:52:59,704][15401] Updated weights for policy 0, policy_version 90890 (0.0037) [2024-06-21 23:53:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1489272832. Throughput: 0: 42964.4. Samples: 1489439320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-21 23:53:03,399][15132] Avg episode reward: [(0, '0.240')] [2024-06-21 23:53:03,979][15401] Updated weights for policy 0, policy_version 90900 (0.0038) [2024-06-21 23:53:07,262][15401] Updated weights for policy 0, policy_version 90910 (0.0046) [2024-06-21 23:53:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 1489502208. Throughput: 0: 43154.9. Samples: 1489575340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:08,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-21 23:53:11,609][15401] Updated weights for policy 0, policy_version 90920 (0.0047) [2024-06-21 23:53:13,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42599.2, 300 sec: 42820.5). Total num frames: 1489698816. Throughput: 0: 43086.7. Samples: 1489829840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:13,391][15132] Avg episode reward: [(0, '0.682')] [2024-06-21 23:53:15,077][15401] Updated weights for policy 0, policy_version 90930 (0.0032) [2024-06-21 23:53:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1489895424. Throughput: 0: 43145.8. Samples: 1490091420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-21 23:53:19,394][15401] Updated weights for policy 0, policy_version 90940 (0.0031) [2024-06-21 23:53:22,726][15401] Updated weights for policy 0, policy_version 90950 (0.0033) [2024-06-21 23:53:23,389][15132] Fps is (10 sec: 44238.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1490141184. Throughput: 0: 43132.0. Samples: 1490216620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 23:53:27,086][15401] Updated weights for policy 0, policy_version 90960 (0.0036) [2024-06-21 23:53:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 1490337792. Throughput: 0: 43097.3. Samples: 1490479640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:28,391][15132] Avg episode reward: [(0, '0.507')] [2024-06-21 23:53:30,350][15401] Updated weights for policy 0, policy_version 90970 (0.0034) [2024-06-21 23:53:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1490550784. Throughput: 0: 43147.2. Samples: 1490735320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-21 23:53:34,611][15401] Updated weights for policy 0, policy_version 90980 (0.0028) [2024-06-21 23:53:36,199][15349] Signal inference workers to stop experience collection... (21950 times) [2024-06-21 23:53:36,200][15349] Signal inference workers to resume experience collection... (21950 times) [2024-06-21 23:53:36,223][15401] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-06-21 23:53:36,223][15401] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-06-21 23:53:37,887][15401] Updated weights for policy 0, policy_version 90990 (0.0025) [2024-06-21 23:53:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43688.9, 300 sec: 42875.7). Total num frames: 1490796544. Throughput: 0: 43194.2. Samples: 1490868400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:38,392][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 23:53:42,159][15401] Updated weights for policy 0, policy_version 91000 (0.0028) [2024-06-21 23:53:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1490976768. Throughput: 0: 43057.9. Samples: 1491126200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-21 23:53:43,435][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091003_1490993152.pth... [2024-06-21 23:53:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090376_1480720384.pth [2024-06-21 23:53:45,370][15401] Updated weights for policy 0, policy_version 91010 (0.0038) [2024-06-21 23:53:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 1491222528. Throughput: 0: 43149.0. Samples: 1491381020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-21 23:53:49,877][15401] Updated weights for policy 0, policy_version 91020 (0.0038) [2024-06-21 23:53:53,148][15401] Updated weights for policy 0, policy_version 91030 (0.0037) [2024-06-21 23:53:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1491435520. Throughput: 0: 43065.5. Samples: 1491513180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:53,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-21 23:53:57,362][15401] Updated weights for policy 0, policy_version 91040 (0.0036) [2024-06-21 23:53:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1491632128. Throughput: 0: 43071.5. Samples: 1491768040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-21 23:53:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-21 23:54:00,989][15401] Updated weights for policy 0, policy_version 91050 (0.0040) [2024-06-21 23:54:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.9, 300 sec: 42821.1). Total num frames: 1491861504. Throughput: 0: 42796.8. Samples: 1492017380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:03,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-21 23:54:04,937][15401] Updated weights for policy 0, policy_version 91060 (0.0041) [2024-06-21 23:54:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 1492058112. Throughput: 0: 42997.2. Samples: 1492151500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-21 23:54:08,595][15401] Updated weights for policy 0, policy_version 91070 (0.0030) [2024-06-21 23:54:12,765][15401] Updated weights for policy 0, policy_version 91080 (0.0021) [2024-06-21 23:54:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.8, 300 sec: 42987.2). Total num frames: 1492287488. Throughput: 0: 42822.3. Samples: 1492406640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 23:54:16,147][15401] Updated weights for policy 0, policy_version 91090 (0.0031) [2024-06-21 23:54:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1492500480. Throughput: 0: 42743.5. Samples: 1492658780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-21 23:54:20,321][15401] Updated weights for policy 0, policy_version 91100 (0.0031) [2024-06-21 23:54:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 1492713472. Throughput: 0: 42726.3. Samples: 1492790980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-21 23:54:23,817][15401] Updated weights for policy 0, policy_version 91110 (0.0036) [2024-06-21 23:54:27,922][15401] Updated weights for policy 0, policy_version 91120 (0.0040) [2024-06-21 23:54:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 1492926464. Throughput: 0: 42689.0. Samples: 1493047200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-21 23:54:31,536][15401] Updated weights for policy 0, policy_version 91130 (0.0028) [2024-06-21 23:54:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1493139456. Throughput: 0: 42673.7. Samples: 1493301340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:33,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-21 23:54:35,467][15401] Updated weights for policy 0, policy_version 91140 (0.0041) [2024-06-21 23:54:38,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.1, 300 sec: 42765.4). Total num frames: 1493336064. Throughput: 0: 42537.7. Samples: 1493427380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:38,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-21 23:54:39,357][15401] Updated weights for policy 0, policy_version 91150 (0.0038) [2024-06-21 23:54:43,150][15401] Updated weights for policy 0, policy_version 91160 (0.0031) [2024-06-21 23:54:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1493565440. Throughput: 0: 42638.3. Samples: 1493686760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-21 23:54:46,893][15401] Updated weights for policy 0, policy_version 91170 (0.0032) [2024-06-21 23:54:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1493778432. Throughput: 0: 42742.2. Samples: 1493940680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:48,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-21 23:54:50,752][15401] Updated weights for policy 0, policy_version 91180 (0.0038) [2024-06-21 23:54:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 1493975040. Throughput: 0: 42676.1. Samples: 1494071920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-21 23:54:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-21 23:54:54,356][15401] Updated weights for policy 0, policy_version 91190 (0.0043) [2024-06-21 23:54:58,139][15349] Signal inference workers to stop experience collection... (22000 times) [2024-06-21 23:54:58,139][15349] Signal inference workers to resume experience collection... (22000 times) [2024-06-21 23:54:58,196][15401] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-06-21 23:54:58,196][15401] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-06-21 23:54:58,274][15401] Updated weights for policy 0, policy_version 91200 (0.0032) [2024-06-21 23:54:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1494220800. Throughput: 0: 42829.4. Samples: 1494333960. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:54:58,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-21 23:55:01,819][15401] Updated weights for policy 0, policy_version 91210 (0.0028) [2024-06-21 23:55:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 1494417408. Throughput: 0: 42982.8. Samples: 1494593000. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-21 23:55:05,794][15401] Updated weights for policy 0, policy_version 91220 (0.0033) [2024-06-21 23:55:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1494630400. Throughput: 0: 42756.9. Samples: 1494715040. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-21 23:55:09,554][15401] Updated weights for policy 0, policy_version 91230 (0.0029) [2024-06-21 23:55:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1494859776. Throughput: 0: 42853.5. Samples: 1494975620. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-21 23:55:13,589][15401] Updated weights for policy 0, policy_version 91240 (0.0029) [2024-06-21 23:55:17,229][15401] Updated weights for policy 0, policy_version 91250 (0.0039) [2024-06-21 23:55:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1495072768. Throughput: 0: 42866.6. Samples: 1495230340. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-21 23:55:21,136][15401] Updated weights for policy 0, policy_version 91260 (0.0023) [2024-06-21 23:55:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42987.1). Total num frames: 1495269376. Throughput: 0: 42896.3. Samples: 1495357720. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-21 23:55:24,729][15401] Updated weights for policy 0, policy_version 91270 (0.0034) [2024-06-21 23:55:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 1495482368. Throughput: 0: 42940.7. Samples: 1495619100. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-21 23:55:29,132][15401] Updated weights for policy 0, policy_version 91280 (0.0038) [2024-06-21 23:55:32,745][15401] Updated weights for policy 0, policy_version 91290 (0.0044) [2024-06-21 23:55:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1495711744. Throughput: 0: 42874.7. Samples: 1495870040. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-21 23:55:36,715][15401] Updated weights for policy 0, policy_version 91300 (0.0034) [2024-06-21 23:55:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1495924736. Throughput: 0: 42922.5. Samples: 1496003440. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-21 23:55:40,187][15401] Updated weights for policy 0, policy_version 91310 (0.0025) [2024-06-21 23:55:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1496104960. Throughput: 0: 42733.8. Samples: 1496256980. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-21 23:55:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091316_1496121344.pth... [2024-06-21 23:55:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000090690_1485864960.pth [2024-06-21 23:55:44,250][15401] Updated weights for policy 0, policy_version 91320 (0.0025) [2024-06-21 23:55:48,309][15401] Updated weights for policy 0, policy_version 91330 (0.0040) [2024-06-21 23:55:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1496350720. Throughput: 0: 42684.8. Samples: 1496513820. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-21 23:55:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-21 23:55:51,997][15401] Updated weights for policy 0, policy_version 91340 (0.0027) [2024-06-21 23:55:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42932.2). Total num frames: 1496563712. Throughput: 0: 42881.4. Samples: 1496644700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:55:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-21 23:55:55,937][15401] Updated weights for policy 0, policy_version 91350 (0.0022) [2024-06-21 23:55:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1496743936. Throughput: 0: 42716.2. Samples: 1496897840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:55:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-21 23:55:59,443][15401] Updated weights for policy 0, policy_version 91360 (0.0041) [2024-06-21 23:56:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1496989696. Throughput: 0: 42842.4. Samples: 1497158240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-21 23:56:03,478][15401] Updated weights for policy 0, policy_version 91370 (0.0028) [2024-06-21 23:56:06,991][15401] Updated weights for policy 0, policy_version 91380 (0.0037) [2024-06-21 23:56:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1497202688. Throughput: 0: 43045.5. Samples: 1497294760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:08,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-21 23:56:10,992][15401] Updated weights for policy 0, policy_version 91390 (0.0044) [2024-06-21 23:56:11,968][15349] Signal inference workers to stop experience collection... (22050 times) [2024-06-21 23:56:11,969][15349] Signal inference workers to resume experience collection... (22050 times) [2024-06-21 23:56:11,994][15401] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-06-21 23:56:11,994][15401] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-06-21 23:56:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 1497382912. Throughput: 0: 42723.2. Samples: 1497541640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:13,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-21 23:56:14,972][15401] Updated weights for policy 0, policy_version 91400 (0.0023) [2024-06-21 23:56:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 1497645056. Throughput: 0: 42809.2. Samples: 1497796460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:18,396][15132] Avg episode reward: [(0, '0.660')] [2024-06-21 23:56:18,550][15401] Updated weights for policy 0, policy_version 91410 (0.0026) [2024-06-21 23:56:22,583][15401] Updated weights for policy 0, policy_version 91420 (0.0042) [2024-06-21 23:56:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1497841664. Throughput: 0: 42865.0. Samples: 1497932360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-21 23:56:26,433][15401] Updated weights for policy 0, policy_version 91430 (0.0026) [2024-06-21 23:56:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1498038272. Throughput: 0: 42806.6. Samples: 1498183280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:28,390][15132] Avg episode reward: [(0, '0.108')] [2024-06-21 23:56:30,463][15401] Updated weights for policy 0, policy_version 91440 (0.0033) [2024-06-21 23:56:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42987.3). Total num frames: 1498284032. Throughput: 0: 42719.1. Samples: 1498436280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:33,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-21 23:56:33,965][15401] Updated weights for policy 0, policy_version 91450 (0.0037) [2024-06-21 23:56:38,169][15401] Updated weights for policy 0, policy_version 91460 (0.0037) [2024-06-21 23:56:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 1498497024. Throughput: 0: 42940.9. Samples: 1498577040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-21 23:56:41,603][15401] Updated weights for policy 0, policy_version 91470 (0.0044) [2024-06-21 23:56:43,390][15132] Fps is (10 sec: 40968.9, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 1498693632. Throughput: 0: 42891.7. Samples: 1498827980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-21 23:56:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-21 23:56:45,790][15401] Updated weights for policy 0, policy_version 91480 (0.0044) [2024-06-21 23:56:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1498939392. Throughput: 0: 42597.3. Samples: 1499075120. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:56:48,400][15132] Avg episode reward: [(0, '0.602')] [2024-06-21 23:56:49,297][15401] Updated weights for policy 0, policy_version 91490 (0.0030) [2024-06-21 23:56:53,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1499119616. Throughput: 0: 42608.5. Samples: 1499212140. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:56:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 23:56:53,486][15401] Updated weights for policy 0, policy_version 91500 (0.0030) [2024-06-21 23:56:57,123][15401] Updated weights for policy 0, policy_version 91510 (0.0035) [2024-06-21 23:56:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1499332608. Throughput: 0: 42859.5. Samples: 1499470320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:56:58,404][15132] Avg episode reward: [(0, '0.576')] [2024-06-21 23:57:00,919][15401] Updated weights for policy 0, policy_version 91520 (0.0040) [2024-06-21 23:57:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 1499594752. Throughput: 0: 42773.1. Samples: 1499721240. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 23:57:04,550][15401] Updated weights for policy 0, policy_version 91530 (0.0038) [2024-06-21 23:57:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 1499774976. Throughput: 0: 42968.6. Samples: 1499865940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:08,398][15132] Avg episode reward: [(0, '0.363')] [2024-06-21 23:57:08,415][15401] Updated weights for policy 0, policy_version 91540 (0.0041) [2024-06-21 23:57:11,929][15401] Updated weights for policy 0, policy_version 91550 (0.0024) [2024-06-21 23:57:13,390][15132] Fps is (10 sec: 37682.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1499971584. Throughput: 0: 42956.4. Samples: 1500116320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-21 23:57:16,262][15401] Updated weights for policy 0, policy_version 91560 (0.0027) [2024-06-21 23:57:16,818][15349] Signal inference workers to stop experience collection... (22100 times) [2024-06-21 23:57:16,819][15349] Signal inference workers to resume experience collection... (22100 times) [2024-06-21 23:57:16,837][15401] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-06-21 23:57:16,837][15401] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-06-21 23:57:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42987.1). Total num frames: 1500217344. Throughput: 0: 42884.8. Samples: 1500366000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-21 23:57:19,645][15401] Updated weights for policy 0, policy_version 91570 (0.0038) [2024-06-21 23:57:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42766.0). Total num frames: 1500397568. Throughput: 0: 42879.1. Samples: 1500506600. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 23:57:23,809][15401] Updated weights for policy 0, policy_version 91580 (0.0028) [2024-06-21 23:57:27,495][15401] Updated weights for policy 0, policy_version 91590 (0.0025) [2024-06-21 23:57:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1500626944. Throughput: 0: 42922.3. Samples: 1500759480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-21 23:57:31,539][15401] Updated weights for policy 0, policy_version 91600 (0.0026) [2024-06-21 23:57:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 1500872704. Throughput: 0: 42973.8. Samples: 1501008940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-21 23:57:35,058][15401] Updated weights for policy 0, policy_version 91610 (0.0045) [2024-06-21 23:57:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1501036544. Throughput: 0: 43032.8. Samples: 1501148620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-21 23:57:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-21 23:57:39,139][15401] Updated weights for policy 0, policy_version 91620 (0.0027) [2024-06-21 23:57:42,529][15401] Updated weights for policy 0, policy_version 91630 (0.0023) [2024-06-21 23:57:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1501282304. Throughput: 0: 42966.2. Samples: 1501403800. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:57:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-21 23:57:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091631_1501282304.pth... [2024-06-21 23:57:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091003_1490993152.pth [2024-06-21 23:57:46,829][15401] Updated weights for policy 0, policy_version 91640 (0.0037) [2024-06-21 23:57:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1501511680. Throughput: 0: 43049.8. Samples: 1501658480. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:57:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-21 23:57:50,147][15401] Updated weights for policy 0, policy_version 91650 (0.0047) [2024-06-21 23:57:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1501675520. Throughput: 0: 42744.5. Samples: 1501789440. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:57:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-21 23:57:54,252][15401] Updated weights for policy 0, policy_version 91660 (0.0034) [2024-06-21 23:57:58,161][15401] Updated weights for policy 0, policy_version 91670 (0.0034) [2024-06-21 23:57:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1501921280. Throughput: 0: 42734.3. Samples: 1502039360. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:57:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 23:58:01,997][15401] Updated weights for policy 0, policy_version 91680 (0.0027) [2024-06-21 23:58:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 1502150656. Throughput: 0: 42961.1. Samples: 1502299240. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:03,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-21 23:58:05,623][15401] Updated weights for policy 0, policy_version 91690 (0.0036) [2024-06-21 23:58:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 1502314496. Throughput: 0: 42701.7. Samples: 1502428180. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:08,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-21 23:58:09,676][15401] Updated weights for policy 0, policy_version 91700 (0.0042) [2024-06-21 23:58:13,128][15401] Updated weights for policy 0, policy_version 91710 (0.0033) [2024-06-21 23:58:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 1502576640. Throughput: 0: 42619.6. Samples: 1502677460. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:13,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-21 23:58:17,281][15401] Updated weights for policy 0, policy_version 91720 (0.0039) [2024-06-21 23:58:18,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1502789632. Throughput: 0: 42855.1. Samples: 1502937420. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-21 23:58:21,072][15401] Updated weights for policy 0, policy_version 91730 (0.0029) [2024-06-21 23:58:23,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1502953472. Throughput: 0: 42660.4. Samples: 1503068340. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-21 23:58:25,267][15401] Updated weights for policy 0, policy_version 91740 (0.0028) [2024-06-21 23:58:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1503199232. Throughput: 0: 42620.5. Samples: 1503321720. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-21 23:58:28,688][15401] Updated weights for policy 0, policy_version 91750 (0.0045) [2024-06-21 23:58:32,684][15349] Signal inference workers to stop experience collection... (22150 times) [2024-06-21 23:58:32,686][15349] Signal inference workers to resume experience collection... (22150 times) [2024-06-21 23:58:32,705][15401] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-06-21 23:58:32,705][15401] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-06-21 23:58:32,860][15401] Updated weights for policy 0, policy_version 91760 (0.0037) [2024-06-21 23:58:33,392][15132] Fps is (10 sec: 49140.0, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 1503444992. Throughput: 0: 42776.3. Samples: 1503583520. Policy #0 lag: (min: 0.0, avg: 13.3, max: 28.0) [2024-06-21 23:58:33,393][15132] Avg episode reward: [(0, '0.399')] [2024-06-21 23:58:36,293][15401] Updated weights for policy 0, policy_version 91770 (0.0040) [2024-06-21 23:58:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1503608832. Throughput: 0: 42826.2. Samples: 1503716620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:58:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-21 23:58:40,354][15401] Updated weights for policy 0, policy_version 91780 (0.0033) [2024-06-21 23:58:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1503854592. Throughput: 0: 42813.2. Samples: 1503965960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:58:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-21 23:58:43,956][15401] Updated weights for policy 0, policy_version 91790 (0.0032) [2024-06-21 23:58:47,939][15401] Updated weights for policy 0, policy_version 91800 (0.0026) [2024-06-21 23:58:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1504067584. Throughput: 0: 42875.9. Samples: 1504228660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:58:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-21 23:58:51,630][15401] Updated weights for policy 0, policy_version 91810 (0.0035) [2024-06-21 23:58:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1504264192. Throughput: 0: 42890.2. Samples: 1504358240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:58:53,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-21 23:58:55,825][15401] Updated weights for policy 0, policy_version 91820 (0.0048) [2024-06-21 23:58:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1504493568. Throughput: 0: 42991.2. Samples: 1504611960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:58:58,396][15132] Avg episode reward: [(0, '0.353')] [2024-06-21 23:58:59,484][15401] Updated weights for policy 0, policy_version 91830 (0.0035) [2024-06-21 23:59:03,317][15401] Updated weights for policy 0, policy_version 91840 (0.0035) [2024-06-21 23:59:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1504706560. Throughput: 0: 43032.0. Samples: 1504873860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:03,390][15132] Avg episode reward: [(0, '0.086')] [2024-06-21 23:59:07,025][15401] Updated weights for policy 0, policy_version 91850 (0.0031) [2024-06-21 23:59:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1504919552. Throughput: 0: 43021.7. Samples: 1505004320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:08,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-21 23:59:10,819][15401] Updated weights for policy 0, policy_version 91860 (0.0029) [2024-06-21 23:59:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 1505132544. Throughput: 0: 42953.3. Samples: 1505254620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-21 23:59:14,871][15401] Updated weights for policy 0, policy_version 91870 (0.0034) [2024-06-21 23:59:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1505345536. Throughput: 0: 43061.5. Samples: 1505521180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-21 23:59:18,451][15401] Updated weights for policy 0, policy_version 91880 (0.0026) [2024-06-21 23:59:22,447][15401] Updated weights for policy 0, policy_version 91890 (0.0029) [2024-06-21 23:59:23,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43686.0, 300 sec: 42875.1). Total num frames: 1505574912. Throughput: 0: 42939.6. Samples: 1505649180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:23,397][15132] Avg episode reward: [(0, '0.587')] [2024-06-21 23:59:26,209][15401] Updated weights for policy 0, policy_version 91900 (0.0037) [2024-06-21 23:59:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1505787904. Throughput: 0: 43010.7. Samples: 1505901440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-21 23:59:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-21 23:59:29,951][15401] Updated weights for policy 0, policy_version 91910 (0.0025) [2024-06-21 23:59:33,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 1505984512. Throughput: 0: 42978.3. Samples: 1506162680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-21 23:59:33,720][15401] Updated weights for policy 0, policy_version 91920 (0.0043) [2024-06-21 23:59:37,428][15401] Updated weights for policy 0, policy_version 91930 (0.0041) [2024-06-21 23:59:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1506213888. Throughput: 0: 43022.2. Samples: 1506294240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-21 23:59:41,196][15401] Updated weights for policy 0, policy_version 91940 (0.0031) [2024-06-21 23:59:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1506410496. Throughput: 0: 43011.0. Samples: 1506547460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-21 23:59:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091945_1506426880.pth... [2024-06-21 23:59:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091316_1496121344.pth [2024-06-21 23:59:45,359][15401] Updated weights for policy 0, policy_version 91950 (0.0021) [2024-06-21 23:59:47,819][15349] Signal inference workers to stop experience collection... (22200 times) [2024-06-21 23:59:47,820][15349] Signal inference workers to resume experience collection... (22200 times) [2024-06-21 23:59:47,872][15401] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-06-21 23:59:47,872][15401] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-06-21 23:59:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1506639872. Throughput: 0: 42823.6. Samples: 1506800920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-21 23:59:48,658][15401] Updated weights for policy 0, policy_version 91960 (0.0042) [2024-06-21 23:59:52,883][15401] Updated weights for policy 0, policy_version 91970 (0.0031) [2024-06-21 23:59:53,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1506852864. Throughput: 0: 42898.2. Samples: 1506934840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:53,392][15132] Avg episode reward: [(0, '0.372')] [2024-06-21 23:59:56,439][15401] Updated weights for policy 0, policy_version 91980 (0.0038) [2024-06-21 23:59:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1507049472. Throughput: 0: 43047.6. Samples: 1507191760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-21 23:59:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 00:00:00,451][15401] Updated weights for policy 0, policy_version 91990 (0.0028) [2024-06-22 00:00:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1507295232. Throughput: 0: 42742.7. Samples: 1507444600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 00:00:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 00:00:04,140][15401] Updated weights for policy 0, policy_version 92000 (0.0023) [2024-06-22 00:00:07,963][15401] Updated weights for policy 0, policy_version 92010 (0.0034) [2024-06-22 00:00:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1507508224. Throughput: 0: 42818.1. Samples: 1507575720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 00:00:08,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-22 00:00:11,823][15401] Updated weights for policy 0, policy_version 92020 (0.0028) [2024-06-22 00:00:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1507704832. Throughput: 0: 42886.6. Samples: 1507831340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 00:00:13,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-22 00:00:15,500][15401] Updated weights for policy 0, policy_version 92030 (0.0032) [2024-06-22 00:00:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 1507917824. Throughput: 0: 42845.6. Samples: 1508090740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 00:00:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 00:00:19,348][15401] Updated weights for policy 0, policy_version 92040 (0.0034) [2024-06-22 00:00:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 1508130816. Throughput: 0: 42709.5. Samples: 1508216160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 00:00:23,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 00:00:23,518][15401] Updated weights for policy 0, policy_version 92050 (0.0028) [2024-06-22 00:00:26,954][15401] Updated weights for policy 0, policy_version 92060 (0.0021) [2024-06-22 00:00:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1508360192. Throughput: 0: 42719.3. Samples: 1508469820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 00:00:31,074][15401] Updated weights for policy 0, policy_version 92070 (0.0043) [2024-06-22 00:00:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1508573184. Throughput: 0: 42955.4. Samples: 1508733920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:00:34,372][15401] Updated weights for policy 0, policy_version 92080 (0.0033) [2024-06-22 00:00:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42987.1). Total num frames: 1508786176. Throughput: 0: 42905.7. Samples: 1508865500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 00:00:38,714][15401] Updated weights for policy 0, policy_version 92090 (0.0031) [2024-06-22 00:00:41,913][15401] Updated weights for policy 0, policy_version 92100 (0.0033) [2024-06-22 00:00:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1508982784. Throughput: 0: 42820.3. Samples: 1509118680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 00:00:46,209][15401] Updated weights for policy 0, policy_version 92110 (0.0038) [2024-06-22 00:00:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1509228544. Throughput: 0: 43046.1. Samples: 1509381680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:48,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 00:00:49,961][15401] Updated weights for policy 0, policy_version 92120 (0.0040) [2024-06-22 00:00:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42871.5, 300 sec: 42986.8). Total num frames: 1509425152. Throughput: 0: 43111.9. Samples: 1509515860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:53,393][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 00:00:53,882][15401] Updated weights for policy 0, policy_version 92130 (0.0038) [2024-06-22 00:00:54,225][15349] Signal inference workers to stop experience collection... (22250 times) [2024-06-22 00:00:54,276][15401] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-06-22 00:00:54,280][15349] Signal inference workers to resume experience collection... (22250 times) [2024-06-22 00:00:54,291][15401] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-06-22 00:00:57,586][15401] Updated weights for policy 0, policy_version 92140 (0.0020) [2024-06-22 00:00:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1509638144. Throughput: 0: 43001.8. Samples: 1509766420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:00:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 00:01:01,464][15401] Updated weights for policy 0, policy_version 92150 (0.0030) [2024-06-22 00:01:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1509867520. Throughput: 0: 42863.2. Samples: 1510019580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:01:03,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-22 00:01:05,438][15401] Updated weights for policy 0, policy_version 92160 (0.0027) [2024-06-22 00:01:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 1510047744. Throughput: 0: 42897.3. Samples: 1510146540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:01:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 00:01:09,279][15401] Updated weights for policy 0, policy_version 92170 (0.0032) [2024-06-22 00:01:13,203][15401] Updated weights for policy 0, policy_version 92180 (0.0040) [2024-06-22 00:01:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1510277120. Throughput: 0: 42858.5. Samples: 1510398460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:01:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 00:01:16,977][15401] Updated weights for policy 0, policy_version 92190 (0.0029) [2024-06-22 00:01:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1510490112. Throughput: 0: 42714.4. Samples: 1510656060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:01:18,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-22 00:01:20,848][15401] Updated weights for policy 0, policy_version 92200 (0.0033) [2024-06-22 00:01:23,391][15132] Fps is (10 sec: 40954.1, 60 sec: 42597.3, 300 sec: 42875.9). Total num frames: 1510686720. Throughput: 0: 42575.5. Samples: 1510781460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 00:01:23,391][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 00:01:24,392][15401] Updated weights for policy 0, policy_version 92210 (0.0044) [2024-06-22 00:01:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 1510899712. Throughput: 0: 42676.6. Samples: 1511039120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 00:01:28,794][15401] Updated weights for policy 0, policy_version 92220 (0.0035) [2024-06-22 00:01:31,879][15401] Updated weights for policy 0, policy_version 92230 (0.0031) [2024-06-22 00:01:33,389][15132] Fps is (10 sec: 44243.7, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1511129088. Throughput: 0: 42509.8. Samples: 1511294620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 00:01:36,452][15401] Updated weights for policy 0, policy_version 92240 (0.0041) [2024-06-22 00:01:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 1511325696. Throughput: 0: 42529.9. Samples: 1511429600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 00:01:39,508][15401] Updated weights for policy 0, policy_version 92250 (0.0034) [2024-06-22 00:01:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1511538688. Throughput: 0: 42553.4. Samples: 1511681320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 00:01:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092258_1511555072.pth... [2024-06-22 00:01:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091631_1501282304.pth [2024-06-22 00:01:44,169][15401] Updated weights for policy 0, policy_version 92260 (0.0038) [2024-06-22 00:01:47,433][15401] Updated weights for policy 0, policy_version 92270 (0.0044) [2024-06-22 00:01:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1511768064. Throughput: 0: 42543.7. Samples: 1511934040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 00:01:51,805][15401] Updated weights for policy 0, policy_version 92280 (0.0033) [2024-06-22 00:01:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 1511964672. Throughput: 0: 42624.5. Samples: 1512064640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 00:01:55,006][15401] Updated weights for policy 0, policy_version 92290 (0.0044) [2024-06-22 00:01:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1512194048. Throughput: 0: 42609.5. Samples: 1512315880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:01:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 00:01:59,446][15401] Updated weights for policy 0, policy_version 92300 (0.0051) [2024-06-22 00:02:02,815][15401] Updated weights for policy 0, policy_version 92310 (0.0038) [2024-06-22 00:02:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1512423424. Throughput: 0: 42704.8. Samples: 1512577780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:02:03,396][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 00:02:06,923][15401] Updated weights for policy 0, policy_version 92320 (0.0025) [2024-06-22 00:02:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1512620032. Throughput: 0: 42835.3. Samples: 1512708980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:02:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 00:02:10,446][15401] Updated weights for policy 0, policy_version 92330 (0.0024) [2024-06-22 00:02:11,161][15349] Signal inference workers to stop experience collection... (22300 times) [2024-06-22 00:02:11,167][15349] Signal inference workers to resume experience collection... (22300 times) [2024-06-22 00:02:11,188][15401] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-06-22 00:02:11,188][15401] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-06-22 00:02:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1512833024. Throughput: 0: 42785.3. Samples: 1512964460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:02:13,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 00:02:15,113][15401] Updated weights for policy 0, policy_version 92340 (0.0047) [2024-06-22 00:02:17,913][15401] Updated weights for policy 0, policy_version 92350 (0.0028) [2024-06-22 00:02:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 1513062400. Throughput: 0: 42634.1. Samples: 1513213260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 00:02:18,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 00:02:22,969][15401] Updated weights for policy 0, policy_version 92360 (0.0033) [2024-06-22 00:02:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42324.7, 300 sec: 42709.1). Total num frames: 1513226240. Throughput: 0: 42604.7. Samples: 1513346920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:23,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 00:02:25,377][15401] Updated weights for policy 0, policy_version 92370 (0.0044) [2024-06-22 00:02:28,390][15132] Fps is (10 sec: 42606.6, 60 sec: 43144.1, 300 sec: 42764.9). Total num frames: 1513488384. Throughput: 0: 42586.2. Samples: 1513597720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:28,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-22 00:02:30,473][15401] Updated weights for policy 0, policy_version 92380 (0.0040) [2024-06-22 00:02:33,135][15401] Updated weights for policy 0, policy_version 92390 (0.0023) [2024-06-22 00:02:33,390][15132] Fps is (10 sec: 49163.3, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1513717760. Throughput: 0: 42611.4. Samples: 1513851560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 00:02:38,389][15132] Fps is (10 sec: 37685.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1513865216. Throughput: 0: 42644.1. Samples: 1513983620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 00:02:38,497][15401] Updated weights for policy 0, policy_version 92400 (0.0037) [2024-06-22 00:02:40,789][15401] Updated weights for policy 0, policy_version 92410 (0.0034) [2024-06-22 00:02:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1514127360. Throughput: 0: 42662.6. Samples: 1514235700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:43,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-22 00:02:46,124][15401] Updated weights for policy 0, policy_version 92420 (0.0041) [2024-06-22 00:02:48,389][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1514356736. Throughput: 0: 42376.5. Samples: 1514484720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 00:02:48,423][15401] Updated weights for policy 0, policy_version 92430 (0.0034) [2024-06-22 00:02:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1514504192. Throughput: 0: 42381.3. Samples: 1514616140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 00:02:53,929][15401] Updated weights for policy 0, policy_version 92440 (0.0038) [2024-06-22 00:02:56,141][15401] Updated weights for policy 0, policy_version 92450 (0.0041) [2024-06-22 00:02:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1514782720. Throughput: 0: 42282.1. Samples: 1514867160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:02:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 00:03:01,579][15401] Updated weights for policy 0, policy_version 92460 (0.0035) [2024-06-22 00:03:03,396][15132] Fps is (10 sec: 45846.2, 60 sec: 42320.9, 300 sec: 42875.2). Total num frames: 1514962944. Throughput: 0: 42624.7. Samples: 1515131540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:03:03,396][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 00:03:03,972][15401] Updated weights for policy 0, policy_version 92470 (0.0027) [2024-06-22 00:03:08,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1515159552. Throughput: 0: 42229.0. Samples: 1515247120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:03:08,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 00:03:09,212][15401] Updated weights for policy 0, policy_version 92480 (0.0034) [2024-06-22 00:03:10,820][15349] Signal inference workers to stop experience collection... (22350 times) [2024-06-22 00:03:10,872][15401] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-06-22 00:03:10,928][15349] Signal inference workers to resume experience collection... (22350 times) [2024-06-22 00:03:10,928][15401] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-06-22 00:03:11,765][15401] Updated weights for policy 0, policy_version 92490 (0.0030) [2024-06-22 00:03:13,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1515405312. Throughput: 0: 42420.5. Samples: 1515506620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 00:03:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 00:03:16,679][15401] Updated weights for policy 0, policy_version 92500 (0.0039) [2024-06-22 00:03:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 1515601920. Throughput: 0: 42742.3. Samples: 1515774960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 00:03:19,328][15401] Updated weights for policy 0, policy_version 92510 (0.0031) [2024-06-22 00:03:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42873.1, 300 sec: 42709.4). Total num frames: 1515798528. Throughput: 0: 42411.4. Samples: 1515892140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:23,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 00:03:23,964][15401] Updated weights for policy 0, policy_version 92520 (0.0035) [2024-06-22 00:03:26,930][15401] Updated weights for policy 0, policy_version 92530 (0.0021) [2024-06-22 00:03:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.7, 300 sec: 42709.8). Total num frames: 1516044288. Throughput: 0: 42357.4. Samples: 1516141780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:28,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 00:03:31,437][15401] Updated weights for policy 0, policy_version 92540 (0.0033) [2024-06-22 00:03:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1516224512. Throughput: 0: 42890.2. Samples: 1516414780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:33,390][15132] Avg episode reward: [(0, '0.127')] [2024-06-22 00:03:34,887][15401] Updated weights for policy 0, policy_version 92550 (0.0029) [2024-06-22 00:03:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1516437504. Throughput: 0: 42632.1. Samples: 1516534580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 00:03:39,242][15401] Updated weights for policy 0, policy_version 92560 (0.0029) [2024-06-22 00:03:42,499][15401] Updated weights for policy 0, policy_version 92570 (0.0041) [2024-06-22 00:03:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1516699648. Throughput: 0: 42763.2. Samples: 1516791500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 00:03:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092572_1516699648.pth... [2024-06-22 00:03:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000091945_1506426880.pth [2024-06-22 00:03:47,233][15401] Updated weights for policy 0, policy_version 92580 (0.0027) [2024-06-22 00:03:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 1516863488. Throughput: 0: 42799.9. Samples: 1517057260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 00:03:50,235][15401] Updated weights for policy 0, policy_version 92590 (0.0042) [2024-06-22 00:03:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1517092864. Throughput: 0: 42797.7. Samples: 1517173020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 00:03:54,830][15401] Updated weights for policy 0, policy_version 92600 (0.0032) [2024-06-22 00:03:57,784][15401] Updated weights for policy 0, policy_version 92610 (0.0032) [2024-06-22 00:03:58,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1517338624. Throughput: 0: 42916.8. Samples: 1517437880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:03:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 00:04:02,578][15401] Updated weights for policy 0, policy_version 92620 (0.0037) [2024-06-22 00:04:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 1517502464. Throughput: 0: 42769.4. Samples: 1517699580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:04:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 00:04:05,781][15401] Updated weights for policy 0, policy_version 92630 (0.0026) [2024-06-22 00:04:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1517748224. Throughput: 0: 42820.5. Samples: 1517819060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 00:04:08,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 00:04:10,501][15401] Updated weights for policy 0, policy_version 92640 (0.0048) [2024-06-22 00:04:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1517961216. Throughput: 0: 43081.8. Samples: 1518080460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 00:04:13,521][15401] Updated weights for policy 0, policy_version 92650 (0.0032) [2024-06-22 00:04:17,886][15349] Signal inference workers to stop experience collection... (22400 times) [2024-06-22 00:04:17,916][15401] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-06-22 00:04:17,942][15349] Signal inference workers to resume experience collection... (22400 times) [2024-06-22 00:04:17,944][15401] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-06-22 00:04:18,111][15401] Updated weights for policy 0, policy_version 92660 (0.0040) [2024-06-22 00:04:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 1518157824. Throughput: 0: 42819.1. Samples: 1518341640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:18,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 00:04:21,229][15401] Updated weights for policy 0, policy_version 92670 (0.0043) [2024-06-22 00:04:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1518387200. Throughput: 0: 42874.7. Samples: 1518463940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 00:04:25,572][15401] Updated weights for policy 0, policy_version 92680 (0.0035) [2024-06-22 00:04:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1518583808. Throughput: 0: 42864.1. Samples: 1518720380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 00:04:28,848][15401] Updated weights for policy 0, policy_version 92690 (0.0046) [2024-06-22 00:04:33,076][15401] Updated weights for policy 0, policy_version 92700 (0.0029) [2024-06-22 00:04:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1518796800. Throughput: 0: 42672.7. Samples: 1518977640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:33,404][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 00:04:36,494][15401] Updated weights for policy 0, policy_version 92710 (0.0026) [2024-06-22 00:04:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1519042560. Throughput: 0: 42968.0. Samples: 1519106580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 00:04:40,509][15401] Updated weights for policy 0, policy_version 92720 (0.0038) [2024-06-22 00:04:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1519239168. Throughput: 0: 42839.6. Samples: 1519365660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 00:04:44,054][15401] Updated weights for policy 0, policy_version 92730 (0.0033) [2024-06-22 00:04:48,379][15401] Updated weights for policy 0, policy_version 92740 (0.0034) [2024-06-22 00:04:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 1519452160. Throughput: 0: 42765.3. Samples: 1519624020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 00:04:51,868][15401] Updated weights for policy 0, policy_version 92750 (0.0032) [2024-06-22 00:04:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1519665152. Throughput: 0: 42951.6. Samples: 1519751880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 00:04:55,727][15401] Updated weights for policy 0, policy_version 92760 (0.0036) [2024-06-22 00:04:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1519894528. Throughput: 0: 42792.0. Samples: 1520006100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:04:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 00:04:59,718][15401] Updated weights for policy 0, policy_version 92770 (0.0033) [2024-06-22 00:05:03,233][15401] Updated weights for policy 0, policy_version 92780 (0.0039) [2024-06-22 00:05:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1520107520. Throughput: 0: 42678.2. Samples: 1520262160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 00:05:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 00:05:07,485][15401] Updated weights for policy 0, policy_version 92790 (0.0033) [2024-06-22 00:05:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1520304128. Throughput: 0: 42773.6. Samples: 1520388760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 00:05:11,319][15401] Updated weights for policy 0, policy_version 92800 (0.0041) [2024-06-22 00:05:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1520533504. Throughput: 0: 42862.1. Samples: 1520649180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 00:05:15,128][15401] Updated weights for policy 0, policy_version 92810 (0.0036) [2024-06-22 00:05:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1520746496. Throughput: 0: 42709.3. Samples: 1520899460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 00:05:18,779][15401] Updated weights for policy 0, policy_version 92820 (0.0037) [2024-06-22 00:05:22,657][15401] Updated weights for policy 0, policy_version 92830 (0.0033) [2024-06-22 00:05:23,391][15132] Fps is (10 sec: 40955.8, 60 sec: 42597.6, 300 sec: 42653.8). Total num frames: 1520943104. Throughput: 0: 42705.2. Samples: 1521028360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:23,391][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 00:05:26,242][15401] Updated weights for policy 0, policy_version 92840 (0.0032) [2024-06-22 00:05:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1521172480. Throughput: 0: 42742.6. Samples: 1521289080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 00:05:30,329][15401] Updated weights for policy 0, policy_version 92850 (0.0034) [2024-06-22 00:05:33,390][15132] Fps is (10 sec: 44241.3, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1521385472. Throughput: 0: 42595.9. Samples: 1521540840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 00:05:34,148][15401] Updated weights for policy 0, policy_version 92860 (0.0035) [2024-06-22 00:05:37,938][15401] Updated weights for policy 0, policy_version 92870 (0.0028) [2024-06-22 00:05:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1521582080. Throughput: 0: 42644.4. Samples: 1521670880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 00:05:38,708][15349] Signal inference workers to stop experience collection... (22450 times) [2024-06-22 00:05:38,709][15349] Signal inference workers to resume experience collection... (22450 times) [2024-06-22 00:05:38,756][15401] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-06-22 00:05:38,756][15401] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-06-22 00:05:41,669][15401] Updated weights for policy 0, policy_version 92880 (0.0038) [2024-06-22 00:05:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1521811456. Throughput: 0: 42774.1. Samples: 1521930940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 00:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092884_1521811456.pth... [2024-06-22 00:05:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092258_1511555072.pth [2024-06-22 00:05:45,375][15401] Updated weights for policy 0, policy_version 92890 (0.0029) [2024-06-22 00:05:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 1522040832. Throughput: 0: 42750.8. Samples: 1522185940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 00:05:49,101][15401] Updated weights for policy 0, policy_version 92900 (0.0033) [2024-06-22 00:05:53,082][15401] Updated weights for policy 0, policy_version 92910 (0.0026) [2024-06-22 00:05:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1522237440. Throughput: 0: 42829.9. Samples: 1522316200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:53,392][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 00:05:57,241][15401] Updated weights for policy 0, policy_version 92920 (0.0042) [2024-06-22 00:05:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1522450432. Throughput: 0: 42820.9. Samples: 1522576120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:05:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 00:06:00,708][15401] Updated weights for policy 0, policy_version 92930 (0.0031) [2024-06-22 00:06:03,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1522679808. Throughput: 0: 42883.1. Samples: 1522829200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 00:06:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 00:06:04,779][15401] Updated weights for policy 0, policy_version 92940 (0.0035) [2024-06-22 00:06:08,238][15401] Updated weights for policy 0, policy_version 92950 (0.0036) [2024-06-22 00:06:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1522892800. Throughput: 0: 42946.4. Samples: 1522960900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 00:06:12,251][15401] Updated weights for policy 0, policy_version 92960 (0.0045) [2024-06-22 00:06:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1523089408. Throughput: 0: 42869.8. Samples: 1523218220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 00:06:16,316][15401] Updated weights for policy 0, policy_version 92970 (0.0044) [2024-06-22 00:06:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.3). Total num frames: 1523335168. Throughput: 0: 42904.5. Samples: 1523471540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 00:06:19,615][15401] Updated weights for policy 0, policy_version 92980 (0.0037) [2024-06-22 00:06:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42870.5, 300 sec: 42764.7). Total num frames: 1523515392. Throughput: 0: 43072.8. Samples: 1523609260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:23,393][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 00:06:24,144][15401] Updated weights for policy 0, policy_version 92990 (0.0035) [2024-06-22 00:06:27,084][15401] Updated weights for policy 0, policy_version 93000 (0.0036) [2024-06-22 00:06:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1523728384. Throughput: 0: 42903.1. Samples: 1523861580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 00:06:31,595][15401] Updated weights for policy 0, policy_version 93010 (0.0027) [2024-06-22 00:06:33,389][15132] Fps is (10 sec: 45886.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1523974144. Throughput: 0: 43112.9. Samples: 1524126020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 00:06:34,503][15401] Updated weights for policy 0, policy_version 93020 (0.0029) [2024-06-22 00:06:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1524170752. Throughput: 0: 43217.0. Samples: 1524260860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 00:06:38,995][15401] Updated weights for policy 0, policy_version 93030 (0.0032) [2024-06-22 00:06:42,337][15401] Updated weights for policy 0, policy_version 93040 (0.0036) [2024-06-22 00:06:43,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1524367360. Throughput: 0: 43006.1. Samples: 1524511500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:43,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 00:06:46,535][15401] Updated weights for policy 0, policy_version 93050 (0.0042) [2024-06-22 00:06:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1524613120. Throughput: 0: 43162.8. Samples: 1524771520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:48,390][15132] Avg episode reward: [(0, '0.164')] [2024-06-22 00:06:49,846][15401] Updated weights for policy 0, policy_version 93060 (0.0033) [2024-06-22 00:06:53,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 1524826112. Throughput: 0: 43332.8. Samples: 1524910880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:06:53,947][15401] Updated weights for policy 0, policy_version 93070 (0.0024) [2024-06-22 00:06:56,811][15349] Signal inference workers to stop experience collection... (22500 times) [2024-06-22 00:06:56,816][15349] Signal inference workers to resume experience collection... (22500 times) [2024-06-22 00:06:56,825][15401] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-06-22 00:06:56,859][15401] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-06-22 00:06:57,449][15401] Updated weights for policy 0, policy_version 93080 (0.0028) [2024-06-22 00:06:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1525022720. Throughput: 0: 43147.2. Samples: 1525159840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 00:06:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 00:07:01,383][15401] Updated weights for policy 0, policy_version 93090 (0.0036) [2024-06-22 00:07:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1525252096. Throughput: 0: 43281.3. Samples: 1525419200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 00:07:04,932][15401] Updated weights for policy 0, policy_version 93100 (0.0041) [2024-06-22 00:07:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1525481472. Throughput: 0: 43157.0. Samples: 1525551220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 00:07:09,136][15401] Updated weights for policy 0, policy_version 93110 (0.0031) [2024-06-22 00:07:12,663][15401] Updated weights for policy 0, policy_version 93120 (0.0035) [2024-06-22 00:07:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 1525694464. Throughput: 0: 43214.7. Samples: 1525806240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:13,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 00:07:16,626][15401] Updated weights for policy 0, policy_version 93130 (0.0031) [2024-06-22 00:07:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 1525891072. Throughput: 0: 43224.7. Samples: 1526071140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:18,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 00:07:20,160][15401] Updated weights for policy 0, policy_version 93140 (0.0041) [2024-06-22 00:07:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43692.3, 300 sec: 42876.1). Total num frames: 1526136832. Throughput: 0: 42975.8. Samples: 1526194780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:23,399][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 00:07:23,947][15401] Updated weights for policy 0, policy_version 93150 (0.0033) [2024-06-22 00:07:28,362][15401] Updated weights for policy 0, policy_version 93160 (0.0039) [2024-06-22 00:07:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1526333440. Throughput: 0: 43087.7. Samples: 1526450340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 00:07:31,937][15401] Updated weights for policy 0, policy_version 93170 (0.0032) [2024-06-22 00:07:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1526530048. Throughput: 0: 43045.3. Samples: 1526708560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 00:07:36,116][15401] Updated weights for policy 0, policy_version 93180 (0.0033) [2024-06-22 00:07:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 1526759424. Throughput: 0: 42708.0. Samples: 1526832740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:38,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 00:07:39,830][15401] Updated weights for policy 0, policy_version 93190 (0.0033) [2024-06-22 00:07:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1526956032. Throughput: 0: 42967.0. Samples: 1527093360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 00:07:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093199_1526972416.pth... [2024-06-22 00:07:43,527][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092572_1516699648.pth [2024-06-22 00:07:43,741][15401] Updated weights for policy 0, policy_version 93200 (0.0038) [2024-06-22 00:07:47,325][15401] Updated weights for policy 0, policy_version 93210 (0.0037) [2024-06-22 00:07:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1527169024. Throughput: 0: 42926.7. Samples: 1527350900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 00:07:51,214][15401] Updated weights for policy 0, policy_version 93220 (0.0034) [2024-06-22 00:07:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1527398400. Throughput: 0: 42884.0. Samples: 1527481000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:07:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 00:07:54,760][15401] Updated weights for policy 0, policy_version 93230 (0.0031) [2024-06-22 00:07:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42876.7). Total num frames: 1527611392. Throughput: 0: 42961.3. Samples: 1527739600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:07:58,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 00:07:58,785][15401] Updated weights for policy 0, policy_version 93240 (0.0032) [2024-06-22 00:08:02,323][15401] Updated weights for policy 0, policy_version 93250 (0.0042) [2024-06-22 00:08:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1527840768. Throughput: 0: 42863.3. Samples: 1527999980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 00:08:06,968][15401] Updated weights for policy 0, policy_version 93260 (0.0042) [2024-06-22 00:08:08,390][15132] Fps is (10 sec: 45885.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1528070144. Throughput: 0: 43047.1. Samples: 1528131900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 00:08:09,831][15401] Updated weights for policy 0, policy_version 93270 (0.0036) [2024-06-22 00:08:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1528233984. Throughput: 0: 43015.5. Samples: 1528386040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:13,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 00:08:14,643][15401] Updated weights for policy 0, policy_version 93280 (0.0030) [2024-06-22 00:08:17,382][15401] Updated weights for policy 0, policy_version 93290 (0.0023) [2024-06-22 00:08:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1528479744. Throughput: 0: 42956.8. Samples: 1528641620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 00:08:22,263][15401] Updated weights for policy 0, policy_version 93300 (0.0032) [2024-06-22 00:08:22,816][15349] Signal inference workers to stop experience collection... (22550 times) [2024-06-22 00:08:22,816][15349] Signal inference workers to resume experience collection... (22550 times) [2024-06-22 00:08:22,840][15401] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-06-22 00:08:22,840][15401] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-06-22 00:08:23,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1528725504. Throughput: 0: 43339.2. Samples: 1528783000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 00:08:24,959][15401] Updated weights for policy 0, policy_version 93310 (0.0036) [2024-06-22 00:08:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.2, 300 sec: 42931.6). Total num frames: 1528889344. Throughput: 0: 43119.1. Samples: 1529033720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 00:08:29,835][15401] Updated weights for policy 0, policy_version 93320 (0.0039) [2024-06-22 00:08:32,478][15401] Updated weights for policy 0, policy_version 93330 (0.0033) [2024-06-22 00:08:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 1529135104. Throughput: 0: 43010.1. Samples: 1529286460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:33,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 00:08:37,279][15401] Updated weights for policy 0, policy_version 93340 (0.0031) [2024-06-22 00:08:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1529331712. Throughput: 0: 43128.3. Samples: 1529421780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 00:08:40,226][15401] Updated weights for policy 0, policy_version 93350 (0.0042) [2024-06-22 00:08:43,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.6, 300 sec: 42987.1). Total num frames: 1529544704. Throughput: 0: 43036.0. Samples: 1529676120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 00:08:44,845][15401] Updated weights for policy 0, policy_version 93360 (0.0042) [2024-06-22 00:08:47,861][15401] Updated weights for policy 0, policy_version 93370 (0.0031) [2024-06-22 00:08:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 1529774080. Throughput: 0: 42899.0. Samples: 1529930440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:48,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 00:08:52,504][15401] Updated weights for policy 0, policy_version 93380 (0.0053) [2024-06-22 00:08:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1530003456. Throughput: 0: 42962.7. Samples: 1530065220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 00:08:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 00:08:55,492][15401] Updated weights for policy 0, policy_version 93390 (0.0032) [2024-06-22 00:08:58,391][15132] Fps is (10 sec: 40955.8, 60 sec: 42872.4, 300 sec: 42987.0). Total num frames: 1530183680. Throughput: 0: 43110.0. Samples: 1530326040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:08:58,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 00:08:59,990][15401] Updated weights for policy 0, policy_version 93400 (0.0040) [2024-06-22 00:09:03,336][15401] Updated weights for policy 0, policy_version 93410 (0.0041) [2024-06-22 00:09:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1530429440. Throughput: 0: 42915.2. Samples: 1530572800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 00:09:07,698][15401] Updated weights for policy 0, policy_version 93420 (0.0045) [2024-06-22 00:09:08,390][15132] Fps is (10 sec: 45880.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1530642432. Throughput: 0: 42728.0. Samples: 1530705760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 00:09:11,461][15401] Updated weights for policy 0, policy_version 93430 (0.0039) [2024-06-22 00:09:13,392][15132] Fps is (10 sec: 39312.1, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 1530822656. Throughput: 0: 42767.2. Samples: 1530958340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:13,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 00:09:15,267][15401] Updated weights for policy 0, policy_version 93440 (0.0051) [2024-06-22 00:09:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1531052032. Throughput: 0: 42755.7. Samples: 1531210360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:18,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-22 00:09:19,029][15401] Updated weights for policy 0, policy_version 93450 (0.0042) [2024-06-22 00:09:22,850][15401] Updated weights for policy 0, policy_version 93460 (0.0046) [2024-06-22 00:09:23,391][15132] Fps is (10 sec: 45878.7, 60 sec: 42597.3, 300 sec: 43042.5). Total num frames: 1531281408. Throughput: 0: 42697.6. Samples: 1531343240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:23,392][15132] Avg episode reward: [(0, '0.197')] [2024-06-22 00:09:26,574][15401] Updated weights for policy 0, policy_version 93470 (0.0031) [2024-06-22 00:09:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 1531461632. Throughput: 0: 42515.5. Samples: 1531589320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 00:09:30,829][15401] Updated weights for policy 0, policy_version 93480 (0.0027) [2024-06-22 00:09:33,389][15132] Fps is (10 sec: 39328.2, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 1531674624. Throughput: 0: 42535.2. Samples: 1531844520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 00:09:34,272][15401] Updated weights for policy 0, policy_version 93490 (0.0041) [2024-06-22 00:09:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1531887616. Throughput: 0: 42417.3. Samples: 1531974000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 00:09:38,535][15401] Updated weights for policy 0, policy_version 93500 (0.0046) [2024-06-22 00:09:42,186][15401] Updated weights for policy 0, policy_version 93510 (0.0028) [2024-06-22 00:09:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1532100608. Throughput: 0: 42218.8. Samples: 1532225840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:43,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 00:09:43,476][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093513_1532116992.pth... [2024-06-22 00:09:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000092884_1521811456.pth [2024-06-22 00:09:46,039][15401] Updated weights for policy 0, policy_version 93520 (0.0029) [2024-06-22 00:09:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1532329984. Throughput: 0: 42391.1. Samples: 1532480400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 00:09:48,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-22 00:09:49,784][15401] Updated weights for policy 0, policy_version 93530 (0.0030) [2024-06-22 00:09:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1532542976. Throughput: 0: 42434.8. Samples: 1532615320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:09:53,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 00:09:53,642][15401] Updated weights for policy 0, policy_version 93540 (0.0049) [2024-06-22 00:09:57,557][15401] Updated weights for policy 0, policy_version 93550 (0.0038) [2024-06-22 00:09:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.1, 300 sec: 42820.6). Total num frames: 1532739584. Throughput: 0: 42422.6. Samples: 1532867260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:09:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 00:09:58,771][15349] Signal inference workers to stop experience collection... (22600 times) [2024-06-22 00:09:58,771][15349] Signal inference workers to resume experience collection... (22600 times) [2024-06-22 00:09:58,791][15401] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-06-22 00:09:58,791][15401] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-06-22 00:10:01,327][15401] Updated weights for policy 0, policy_version 93560 (0.0027) [2024-06-22 00:10:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42820.6). Total num frames: 1532936192. Throughput: 0: 42439.5. Samples: 1533120140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 00:10:05,505][15401] Updated weights for policy 0, policy_version 93570 (0.0035) [2024-06-22 00:10:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1533181952. Throughput: 0: 42313.2. Samples: 1533247260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:08,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 00:10:08,818][15401] Updated weights for policy 0, policy_version 93580 (0.0031) [2024-06-22 00:10:13,057][15401] Updated weights for policy 0, policy_version 93590 (0.0046) [2024-06-22 00:10:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 1533378560. Throughput: 0: 42456.4. Samples: 1533499860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:13,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-22 00:10:16,378][15401] Updated weights for policy 0, policy_version 93600 (0.0034) [2024-06-22 00:10:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42820.7). Total num frames: 1533575168. Throughput: 0: 42616.5. Samples: 1533762260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 00:10:20,755][15401] Updated weights for policy 0, policy_version 93610 (0.0040) [2024-06-22 00:10:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42053.4, 300 sec: 42820.6). Total num frames: 1533804544. Throughput: 0: 42568.6. Samples: 1533889580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 00:10:24,524][15401] Updated weights for policy 0, policy_version 93620 (0.0025) [2024-06-22 00:10:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1534017536. Throughput: 0: 42618.6. Samples: 1534143680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 00:10:28,511][15401] Updated weights for policy 0, policy_version 93630 (0.0028) [2024-06-22 00:10:32,208][15401] Updated weights for policy 0, policy_version 93640 (0.0034) [2024-06-22 00:10:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1534230528. Throughput: 0: 42634.3. Samples: 1534398940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 00:10:36,344][15401] Updated weights for policy 0, policy_version 93650 (0.0042) [2024-06-22 00:10:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1534443520. Throughput: 0: 42455.1. Samples: 1534525800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 00:10:40,005][15401] Updated weights for policy 0, policy_version 93660 (0.0032) [2024-06-22 00:10:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1534656512. Throughput: 0: 42584.5. Samples: 1534783560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:10:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 00:10:43,936][15401] Updated weights for policy 0, policy_version 93670 (0.0039) [2024-06-22 00:10:47,591][15401] Updated weights for policy 0, policy_version 93680 (0.0037) [2024-06-22 00:10:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 1534869504. Throughput: 0: 42737.3. Samples: 1535043320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:10:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 00:10:51,448][15401] Updated weights for policy 0, policy_version 93690 (0.0034) [2024-06-22 00:10:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1535098880. Throughput: 0: 42761.8. Samples: 1535171540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:10:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 00:10:55,246][15401] Updated weights for policy 0, policy_version 93700 (0.0038) [2024-06-22 00:10:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1535311872. Throughput: 0: 42695.2. Samples: 1535421240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:10:58,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 00:10:59,307][15401] Updated weights for policy 0, policy_version 93710 (0.0038) [2024-06-22 00:11:03,055][15401] Updated weights for policy 0, policy_version 93720 (0.0033) [2024-06-22 00:11:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1535524864. Throughput: 0: 42702.5. Samples: 1535683880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:03,401][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 00:11:06,855][15401] Updated weights for policy 0, policy_version 93730 (0.0035) [2024-06-22 00:11:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 1535737856. Throughput: 0: 42686.0. Samples: 1535810460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:08,396][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 00:11:10,729][15401] Updated weights for policy 0, policy_version 93740 (0.0031) [2024-06-22 00:11:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1535950848. Throughput: 0: 42717.8. Samples: 1536065980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 00:11:14,669][15401] Updated weights for policy 0, policy_version 93750 (0.0029) [2024-06-22 00:11:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 1536163840. Throughput: 0: 42801.8. Samples: 1536325020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 00:11:18,407][15401] Updated weights for policy 0, policy_version 93760 (0.0034) [2024-06-22 00:11:22,235][15401] Updated weights for policy 0, policy_version 93770 (0.0042) [2024-06-22 00:11:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1536360448. Throughput: 0: 42737.3. Samples: 1536448980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 00:11:26,183][15401] Updated weights for policy 0, policy_version 93780 (0.0034) [2024-06-22 00:11:26,897][15349] Signal inference workers to stop experience collection... (22650 times) [2024-06-22 00:11:26,933][15401] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-06-22 00:11:26,957][15349] Signal inference workers to resume experience collection... (22650 times) [2024-06-22 00:11:26,958][15401] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-06-22 00:11:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1536589824. Throughput: 0: 42709.8. Samples: 1536705500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:28,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 00:11:29,783][15401] Updated weights for policy 0, policy_version 93790 (0.0037) [2024-06-22 00:11:33,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 1536786432. Throughput: 0: 42708.5. Samples: 1536965480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:33,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 00:11:33,924][15401] Updated weights for policy 0, policy_version 93800 (0.0038) [2024-06-22 00:11:37,494][15401] Updated weights for policy 0, policy_version 93810 (0.0036) [2024-06-22 00:11:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.6, 300 sec: 42820.5). Total num frames: 1536999424. Throughput: 0: 42636.7. Samples: 1537090300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:38,393][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 00:11:41,612][15401] Updated weights for policy 0, policy_version 93820 (0.0028) [2024-06-22 00:11:43,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1537228800. Throughput: 0: 42829.8. Samples: 1537348480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 00:11:43,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 00:11:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093826_1537245184.pth... [2024-06-22 00:11:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093199_1526972416.pth [2024-06-22 00:11:45,139][15401] Updated weights for policy 0, policy_version 93830 (0.0044) [2024-06-22 00:11:48,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1537425408. Throughput: 0: 42633.0. Samples: 1537602360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:11:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 00:11:49,403][15401] Updated weights for policy 0, policy_version 93840 (0.0046) [2024-06-22 00:11:53,065][15401] Updated weights for policy 0, policy_version 93850 (0.0032) [2024-06-22 00:11:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1537654784. Throughput: 0: 42510.3. Samples: 1537723420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:11:53,394][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 00:11:57,278][15401] Updated weights for policy 0, policy_version 93860 (0.0028) [2024-06-22 00:11:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1537867776. Throughput: 0: 42583.7. Samples: 1537982240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:11:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 00:12:00,833][15401] Updated weights for policy 0, policy_version 93870 (0.0026) [2024-06-22 00:12:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1538048000. Throughput: 0: 42457.8. Samples: 1538235620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 00:12:04,800][15401] Updated weights for policy 0, policy_version 93880 (0.0027) [2024-06-22 00:12:08,314][15401] Updated weights for policy 0, policy_version 93890 (0.0029) [2024-06-22 00:12:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1538293760. Throughput: 0: 42498.7. Samples: 1538361420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 00:12:12,639][15401] Updated weights for policy 0, policy_version 93900 (0.0043) [2024-06-22 00:12:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 1538473984. Throughput: 0: 42486.3. Samples: 1538617380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 00:12:16,059][15401] Updated weights for policy 0, policy_version 93910 (0.0026) [2024-06-22 00:12:18,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 1538719744. Throughput: 0: 42301.3. Samples: 1538869040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:18,396][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 00:12:20,339][15401] Updated weights for policy 0, policy_version 93920 (0.0046) [2024-06-22 00:12:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1538916352. Throughput: 0: 42504.5. Samples: 1539002900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:12:23,927][15401] Updated weights for policy 0, policy_version 93930 (0.0027) [2024-06-22 00:12:27,898][15401] Updated weights for policy 0, policy_version 93940 (0.0025) [2024-06-22 00:12:28,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1539129344. Throughput: 0: 42387.1. Samples: 1539255900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 00:12:31,638][15401] Updated weights for policy 0, policy_version 93950 (0.0040) [2024-06-22 00:12:32,738][15349] Signal inference workers to stop experience collection... (22700 times) [2024-06-22 00:12:32,738][15349] Signal inference workers to resume experience collection... (22700 times) [2024-06-22 00:12:32,753][15401] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-06-22 00:12:32,753][15401] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-06-22 00:12:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 1539358720. Throughput: 0: 42399.4. Samples: 1539510340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 00:12:35,639][15401] Updated weights for policy 0, policy_version 93960 (0.0036) [2024-06-22 00:12:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 1539571712. Throughput: 0: 42716.0. Samples: 1539645640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 00:12:38,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 00:12:39,357][15401] Updated weights for policy 0, policy_version 93970 (0.0042) [2024-06-22 00:12:43,289][15401] Updated weights for policy 0, policy_version 93980 (0.0035) [2024-06-22 00:12:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1539768320. Throughput: 0: 42583.9. Samples: 1539898520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:12:43,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-22 00:12:46,998][15401] Updated weights for policy 0, policy_version 93990 (0.0032) [2024-06-22 00:12:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1539997696. Throughput: 0: 42668.8. Samples: 1540155720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:12:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:12:50,823][15401] Updated weights for policy 0, policy_version 94000 (0.0031) [2024-06-22 00:12:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 1540227072. Throughput: 0: 42822.2. Samples: 1540288420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:12:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 00:12:54,661][15401] Updated weights for policy 0, policy_version 94010 (0.0051) [2024-06-22 00:12:58,285][15401] Updated weights for policy 0, policy_version 94020 (0.0036) [2024-06-22 00:12:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1540423680. Throughput: 0: 42832.8. Samples: 1540544860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:12:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 00:13:02,259][15401] Updated weights for policy 0, policy_version 94030 (0.0028) [2024-06-22 00:13:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1540636672. Throughput: 0: 43020.8. Samples: 1540804700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 00:13:06,257][15401] Updated weights for policy 0, policy_version 94040 (0.0029) [2024-06-22 00:13:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1540849664. Throughput: 0: 42976.0. Samples: 1540936820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 00:13:09,757][15401] Updated weights for policy 0, policy_version 94050 (0.0032) [2024-06-22 00:13:13,392][15132] Fps is (10 sec: 40949.4, 60 sec: 42869.6, 300 sec: 42598.0). Total num frames: 1541046272. Throughput: 0: 42945.1. Samples: 1541188540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:13,393][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 00:13:13,811][15401] Updated weights for policy 0, policy_version 94060 (0.0043) [2024-06-22 00:13:17,301][15401] Updated weights for policy 0, policy_version 94070 (0.0034) [2024-06-22 00:13:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42603.0, 300 sec: 42542.9). Total num frames: 1541275648. Throughput: 0: 43035.3. Samples: 1541446920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 00:13:21,441][15401] Updated weights for policy 0, policy_version 94080 (0.0030) [2024-06-22 00:13:23,390][15132] Fps is (10 sec: 44248.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1541488640. Throughput: 0: 42831.1. Samples: 1541573040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:23,399][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 00:13:25,353][15401] Updated weights for policy 0, policy_version 94090 (0.0041) [2024-06-22 00:13:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 1541701632. Throughput: 0: 42751.7. Samples: 1541822340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 00:13:28,916][15401] Updated weights for policy 0, policy_version 94100 (0.0026) [2024-06-22 00:13:32,904][15401] Updated weights for policy 0, policy_version 94110 (0.0029) [2024-06-22 00:13:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1541898240. Throughput: 0: 42846.7. Samples: 1542083820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 00:13:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 00:13:36,502][15401] Updated weights for policy 0, policy_version 94120 (0.0029) [2024-06-22 00:13:38,355][15349] Signal inference workers to stop experience collection... (22750 times) [2024-06-22 00:13:38,355][15349] Signal inference workers to resume experience collection... (22750 times) [2024-06-22 00:13:38,365][15401] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-06-22 00:13:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1542144000. Throughput: 0: 42696.5. Samples: 1542209760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:13:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 00:13:38,395][15401] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-06-22 00:13:40,567][15401] Updated weights for policy 0, policy_version 94130 (0.0027) [2024-06-22 00:13:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1542356992. Throughput: 0: 42733.3. Samples: 1542467860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:13:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 00:13:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094138_1542356992.pth... [2024-06-22 00:13:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093513_1532116992.pth [2024-06-22 00:13:44,026][15401] Updated weights for policy 0, policy_version 94140 (0.0034) [2024-06-22 00:13:48,341][15401] Updated weights for policy 0, policy_version 94150 (0.0041) [2024-06-22 00:13:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1542553600. Throughput: 0: 42822.3. Samples: 1542731700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:13:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 00:13:51,780][15401] Updated weights for policy 0, policy_version 94160 (0.0027) [2024-06-22 00:13:53,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.8, 300 sec: 42708.7). Total num frames: 1542782976. Throughput: 0: 42615.8. Samples: 1542854800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:13:53,396][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 00:13:56,020][15401] Updated weights for policy 0, policy_version 94170 (0.0046) [2024-06-22 00:13:58,390][15132] Fps is (10 sec: 44232.5, 60 sec: 42870.8, 300 sec: 42598.3). Total num frames: 1542995968. Throughput: 0: 42758.9. Samples: 1543112620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:13:58,391][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 00:13:59,275][15401] Updated weights for policy 0, policy_version 94180 (0.0027) [2024-06-22 00:14:03,390][15132] Fps is (10 sec: 40985.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1543192576. Throughput: 0: 42896.7. Samples: 1543377280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 00:14:03,574][15401] Updated weights for policy 0, policy_version 94190 (0.0040) [2024-06-22 00:14:06,822][15401] Updated weights for policy 0, policy_version 94200 (0.0035) [2024-06-22 00:14:08,389][15132] Fps is (10 sec: 42602.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1543421952. Throughput: 0: 42963.6. Samples: 1543506400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 00:14:11,215][15401] Updated weights for policy 0, policy_version 94210 (0.0045) [2024-06-22 00:14:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.4, 300 sec: 42653.9). Total num frames: 1543634944. Throughput: 0: 43125.2. Samples: 1543762980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 00:14:14,574][15401] Updated weights for policy 0, policy_version 94220 (0.0032) [2024-06-22 00:14:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42543.1). Total num frames: 1543831552. Throughput: 0: 43104.0. Samples: 1544023500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:18,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 00:14:18,723][15401] Updated weights for policy 0, policy_version 94230 (0.0031) [2024-06-22 00:14:21,946][15401] Updated weights for policy 0, policy_version 94240 (0.0026) [2024-06-22 00:14:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1544077312. Throughput: 0: 43137.8. Samples: 1544150960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 00:14:26,394][15401] Updated weights for policy 0, policy_version 94250 (0.0034) [2024-06-22 00:14:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1544290304. Throughput: 0: 43228.0. Samples: 1544413120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 00:14:29,914][15401] Updated weights for policy 0, policy_version 94260 (0.0034) [2024-06-22 00:14:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1544486912. Throughput: 0: 43107.4. Samples: 1544671540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 00:14:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 00:14:33,847][15401] Updated weights for policy 0, policy_version 94270 (0.0039) [2024-06-22 00:14:37,375][15401] Updated weights for policy 0, policy_version 94280 (0.0026) [2024-06-22 00:14:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1544732672. Throughput: 0: 43232.8. Samples: 1544800000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:14:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 00:14:41,545][15401] Updated weights for policy 0, policy_version 94290 (0.0033) [2024-06-22 00:14:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1544929280. Throughput: 0: 43303.5. Samples: 1545061240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:14:43,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 00:14:44,895][15401] Updated weights for policy 0, policy_version 94300 (0.0028) [2024-06-22 00:14:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1545142272. Throughput: 0: 43085.5. Samples: 1545316120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:14:48,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 00:14:49,227][15401] Updated weights for policy 0, policy_version 94310 (0.0033) [2024-06-22 00:14:52,835][15401] Updated weights for policy 0, policy_version 94320 (0.0030) [2024-06-22 00:14:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 1545388032. Throughput: 0: 43025.8. Samples: 1545442560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:14:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 00:14:56,875][15401] Updated weights for policy 0, policy_version 94330 (0.0032) [2024-06-22 00:14:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42872.1, 300 sec: 42820.5). Total num frames: 1545568256. Throughput: 0: 43134.2. Samples: 1545704020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:14:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 00:15:00,292][15349] Signal inference workers to stop experience collection... (22800 times) [2024-06-22 00:15:00,293][15349] Signal inference workers to resume experience collection... (22800 times) [2024-06-22 00:15:00,340][15401] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-06-22 00:15:00,340][15401] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-06-22 00:15:00,444][15401] Updated weights for policy 0, policy_version 94340 (0.0040) [2024-06-22 00:15:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 1545797632. Throughput: 0: 42961.8. Samples: 1545956780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 00:15:04,438][15401] Updated weights for policy 0, policy_version 94350 (0.0031) [2024-06-22 00:15:08,107][15401] Updated weights for policy 0, policy_version 94360 (0.0040) [2024-06-22 00:15:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1546010624. Throughput: 0: 42975.5. Samples: 1546084860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 00:15:12,170][15401] Updated weights for policy 0, policy_version 94370 (0.0038) [2024-06-22 00:15:13,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 1546207232. Throughput: 0: 42903.0. Samples: 1546343860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:13,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 00:15:15,802][15401] Updated weights for policy 0, policy_version 94380 (0.0036) [2024-06-22 00:15:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1546436608. Throughput: 0: 42602.9. Samples: 1546588660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 00:15:19,847][15401] Updated weights for policy 0, policy_version 94390 (0.0037) [2024-06-22 00:15:23,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1546633216. Throughput: 0: 42795.6. Samples: 1546725800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 00:15:23,433][15401] Updated weights for policy 0, policy_version 94400 (0.0042) [2024-06-22 00:15:27,397][15401] Updated weights for policy 0, policy_version 94410 (0.0033) [2024-06-22 00:15:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1546846208. Throughput: 0: 42757.8. Samples: 1546985440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 00:15:28,392][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 00:15:31,059][15401] Updated weights for policy 0, policy_version 94420 (0.0047) [2024-06-22 00:15:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1547091968. Throughput: 0: 42574.6. Samples: 1547231980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 00:15:35,018][15401] Updated weights for policy 0, policy_version 94430 (0.0028) [2024-06-22 00:15:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1547272192. Throughput: 0: 42735.2. Samples: 1547365640. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 00:15:38,651][15401] Updated weights for policy 0, policy_version 94440 (0.0038) [2024-06-22 00:15:42,542][15401] Updated weights for policy 0, policy_version 94450 (0.0036) [2024-06-22 00:15:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1547485184. Throughput: 0: 42740.5. Samples: 1547627340. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 00:15:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094452_1547501568.pth... [2024-06-22 00:15:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000093826_1537245184.pth [2024-06-22 00:15:46,143][15401] Updated weights for policy 0, policy_version 94460 (0.0033) [2024-06-22 00:15:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1547730944. Throughput: 0: 42768.8. Samples: 1547881380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 00:15:50,011][15401] Updated weights for policy 0, policy_version 94470 (0.0027) [2024-06-22 00:15:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 1547927552. Throughput: 0: 42873.2. Samples: 1548014160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 00:15:54,030][15401] Updated weights for policy 0, policy_version 94480 (0.0033) [2024-06-22 00:15:57,861][15401] Updated weights for policy 0, policy_version 94490 (0.0031) [2024-06-22 00:15:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1548124160. Throughput: 0: 42605.9. Samples: 1548261020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:15:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 00:16:01,887][15401] Updated weights for policy 0, policy_version 94500 (0.0034) [2024-06-22 00:16:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1548369920. Throughput: 0: 42735.8. Samples: 1548511780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 00:16:05,506][15401] Updated weights for policy 0, policy_version 94510 (0.0031) [2024-06-22 00:16:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1548550144. Throughput: 0: 42741.3. Samples: 1548649160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 00:16:09,442][15401] Updated weights for policy 0, policy_version 94520 (0.0030) [2024-06-22 00:16:13,099][15401] Updated weights for policy 0, policy_version 94530 (0.0029) [2024-06-22 00:16:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 1548779520. Throughput: 0: 42543.1. Samples: 1548899780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 00:16:17,044][15401] Updated weights for policy 0, policy_version 94540 (0.0033) [2024-06-22 00:16:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1549008896. Throughput: 0: 42716.9. Samples: 1549154240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 00:16:20,693][15401] Updated weights for policy 0, policy_version 94550 (0.0033) [2024-06-22 00:16:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1549205504. Throughput: 0: 42766.7. Samples: 1549290140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 00:16:24,769][15401] Updated weights for policy 0, policy_version 94560 (0.0044) [2024-06-22 00:16:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42821.5). Total num frames: 1549418496. Throughput: 0: 42642.7. Samples: 1549546260. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-22 00:16:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 00:16:28,595][15401] Updated weights for policy 0, policy_version 94570 (0.0031) [2024-06-22 00:16:32,270][15401] Updated weights for policy 0, policy_version 94580 (0.0028) [2024-06-22 00:16:32,764][15349] Signal inference workers to stop experience collection... (22850 times) [2024-06-22 00:16:32,817][15401] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-06-22 00:16:32,879][15349] Signal inference workers to resume experience collection... (22850 times) [2024-06-22 00:16:32,879][15401] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-06-22 00:16:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 1549664256. Throughput: 0: 42559.1. Samples: 1549796540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:33,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-22 00:16:36,129][15401] Updated weights for policy 0, policy_version 94590 (0.0034) [2024-06-22 00:16:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 1549860864. Throughput: 0: 42656.4. Samples: 1549933700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 00:16:39,863][15401] Updated weights for policy 0, policy_version 94600 (0.0033) [2024-06-22 00:16:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1550057472. Throughput: 0: 42843.5. Samples: 1550188980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 00:16:44,066][15401] Updated weights for policy 0, policy_version 94610 (0.0033) [2024-06-22 00:16:47,403][15401] Updated weights for policy 0, policy_version 94620 (0.0037) [2024-06-22 00:16:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1550286848. Throughput: 0: 42915.1. Samples: 1550442960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 00:16:51,842][15401] Updated weights for policy 0, policy_version 94630 (0.0030) [2024-06-22 00:16:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1550499840. Throughput: 0: 42783.1. Samples: 1550574400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 00:16:55,095][15401] Updated weights for policy 0, policy_version 94640 (0.0037) [2024-06-22 00:16:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1550712832. Throughput: 0: 42792.5. Samples: 1550825440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:16:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 00:16:59,395][15401] Updated weights for policy 0, policy_version 94650 (0.0028) [2024-06-22 00:17:02,644][15401] Updated weights for policy 0, policy_version 94660 (0.0032) [2024-06-22 00:17:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1550925824. Throughput: 0: 42907.2. Samples: 1551085060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:17:03,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 00:17:06,939][15401] Updated weights for policy 0, policy_version 94670 (0.0030) [2024-06-22 00:17:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1551138816. Throughput: 0: 42745.2. Samples: 1551213680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:17:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 00:17:10,387][15401] Updated weights for policy 0, policy_version 94680 (0.0023) [2024-06-22 00:17:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 1551368192. Throughput: 0: 42665.2. Samples: 1551466200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:17:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 00:17:14,590][15401] Updated weights for policy 0, policy_version 94690 (0.0029) [2024-06-22 00:17:17,984][15401] Updated weights for policy 0, policy_version 94700 (0.0027) [2024-06-22 00:17:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1551581184. Throughput: 0: 42655.6. Samples: 1551716040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:17:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 00:17:22,271][15401] Updated weights for policy 0, policy_version 94710 (0.0026) [2024-06-22 00:17:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1551761408. Throughput: 0: 42479.1. Samples: 1551845260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 00:17:23,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 00:17:25,747][15401] Updated weights for policy 0, policy_version 94720 (0.0033) [2024-06-22 00:17:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1551990784. Throughput: 0: 42507.9. Samples: 1552101840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 00:17:29,911][15401] Updated weights for policy 0, policy_version 94730 (0.0023) [2024-06-22 00:17:33,342][15401] Updated weights for policy 0, policy_version 94740 (0.0030) [2024-06-22 00:17:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1552220160. Throughput: 0: 42664.1. Samples: 1552362840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:33,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 00:17:37,451][15401] Updated weights for policy 0, policy_version 94750 (0.0034) [2024-06-22 00:17:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1552400384. Throughput: 0: 42572.4. Samples: 1552490160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 00:17:41,013][15401] Updated weights for policy 0, policy_version 94760 (0.0028) [2024-06-22 00:17:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1552629760. Throughput: 0: 42690.2. Samples: 1552746500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 00:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094765_1552629760.pth... [2024-06-22 00:17:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094138_1542356992.pth [2024-06-22 00:17:44,931][15401] Updated weights for policy 0, policy_version 94770 (0.0041) [2024-06-22 00:17:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 1552826368. Throughput: 0: 42681.7. Samples: 1553005840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:48,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 00:17:49,057][15401] Updated weights for policy 0, policy_version 94780 (0.0045) [2024-06-22 00:17:52,675][15401] Updated weights for policy 0, policy_version 94790 (0.0033) [2024-06-22 00:17:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1553039360. Throughput: 0: 42606.6. Samples: 1553130980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 00:17:56,637][15401] Updated weights for policy 0, policy_version 94800 (0.0033) [2024-06-22 00:17:58,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1553268736. Throughput: 0: 42671.2. Samples: 1553386400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:17:58,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 00:18:00,271][15401] Updated weights for policy 0, policy_version 94810 (0.0042) [2024-06-22 00:18:01,605][15349] Signal inference workers to stop experience collection... (22900 times) [2024-06-22 00:18:01,605][15349] Signal inference workers to resume experience collection... (22900 times) [2024-06-22 00:18:01,642][15401] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-06-22 00:18:01,642][15401] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-06-22 00:18:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1553481728. Throughput: 0: 42927.1. Samples: 1553647760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:18:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 00:18:04,190][15401] Updated weights for policy 0, policy_version 94820 (0.0045) [2024-06-22 00:18:07,933][15401] Updated weights for policy 0, policy_version 94830 (0.0043) [2024-06-22 00:18:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 1553694720. Throughput: 0: 42855.6. Samples: 1553773760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:18:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 00:18:12,054][15401] Updated weights for policy 0, policy_version 94840 (0.0030) [2024-06-22 00:18:13,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42320.9, 300 sec: 42819.6). Total num frames: 1553907712. Throughput: 0: 42784.2. Samples: 1554027400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:18:13,396][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 00:18:15,477][15401] Updated weights for policy 0, policy_version 94850 (0.0043) [2024-06-22 00:18:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1554104320. Throughput: 0: 42761.4. Samples: 1554287100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:18:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 00:18:19,703][15401] Updated weights for policy 0, policy_version 94860 (0.0025) [2024-06-22 00:18:23,259][15401] Updated weights for policy 0, policy_version 94870 (0.0038) [2024-06-22 00:18:23,390][15132] Fps is (10 sec: 44265.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1554350080. Throughput: 0: 42622.7. Samples: 1554408180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 00:18:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 00:18:27,363][15401] Updated weights for policy 0, policy_version 94880 (0.0037) [2024-06-22 00:18:28,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 1554563072. Throughput: 0: 42701.8. Samples: 1554668180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:28,393][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 00:18:31,442][15401] Updated weights for policy 0, policy_version 94890 (0.0032) [2024-06-22 00:18:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1554759680. Throughput: 0: 42626.6. Samples: 1554923940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 00:18:34,962][15401] Updated weights for policy 0, policy_version 94900 (0.0034) [2024-06-22 00:18:38,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1554972672. Throughput: 0: 42604.8. Samples: 1555048200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:18:38,830][15401] Updated weights for policy 0, policy_version 94910 (0.0046) [2024-06-22 00:18:42,604][15401] Updated weights for policy 0, policy_version 94920 (0.0035) [2024-06-22 00:18:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1555185664. Throughput: 0: 42838.7. Samples: 1555314140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 00:18:46,781][15401] Updated weights for policy 0, policy_version 94930 (0.0026) [2024-06-22 00:18:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.2, 300 sec: 42765.9). Total num frames: 1555398656. Throughput: 0: 42592.5. Samples: 1555564420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 00:18:50,261][15401] Updated weights for policy 0, policy_version 94940 (0.0041) [2024-06-22 00:18:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 1555611648. Throughput: 0: 42699.9. Samples: 1555695260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 00:18:54,206][15401] Updated weights for policy 0, policy_version 94950 (0.0025) [2024-06-22 00:18:57,845][15401] Updated weights for policy 0, policy_version 94960 (0.0028) [2024-06-22 00:18:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1555824640. Throughput: 0: 42746.2. Samples: 1555950700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:18:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 00:19:02,126][15401] Updated weights for policy 0, policy_version 94970 (0.0039) [2024-06-22 00:19:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1556021248. Throughput: 0: 42609.8. Samples: 1556204540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:19:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 00:19:05,778][15401] Updated weights for policy 0, policy_version 94980 (0.0037) [2024-06-22 00:19:08,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 1556250624. Throughput: 0: 42724.4. Samples: 1556330880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:19:08,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 00:19:09,954][15401] Updated weights for policy 0, policy_version 94990 (0.0037) [2024-06-22 00:19:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42601.3, 300 sec: 42820.2). Total num frames: 1556463616. Throughput: 0: 42605.4. Samples: 1556585420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:19:13,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 00:19:13,649][15401] Updated weights for policy 0, policy_version 95000 (0.0035) [2024-06-22 00:19:17,785][15401] Updated weights for policy 0, policy_version 95010 (0.0032) [2024-06-22 00:19:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1556660224. Throughput: 0: 42625.5. Samples: 1556842080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 00:19:18,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 00:19:21,259][15401] Updated weights for policy 0, policy_version 95020 (0.0031) [2024-06-22 00:19:23,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1556889600. Throughput: 0: 42740.6. Samples: 1556971520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:23,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 00:19:25,337][15401] Updated weights for policy 0, policy_version 95030 (0.0044) [2024-06-22 00:19:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 1557102592. Throughput: 0: 42550.1. Samples: 1557228900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:28,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 00:19:28,765][15401] Updated weights for policy 0, policy_version 95040 (0.0044) [2024-06-22 00:19:32,771][15349] Signal inference workers to stop experience collection... (22950 times) [2024-06-22 00:19:32,805][15401] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-06-22 00:19:32,893][15349] Signal inference workers to resume experience collection... (22950 times) [2024-06-22 00:19:32,893][15401] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-06-22 00:19:33,028][15401] Updated weights for policy 0, policy_version 95050 (0.0041) [2024-06-22 00:19:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1557315584. Throughput: 0: 42725.8. Samples: 1557487080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:33,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 00:19:36,235][15401] Updated weights for policy 0, policy_version 95060 (0.0037) [2024-06-22 00:19:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1557544960. Throughput: 0: 42663.2. Samples: 1557615100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:38,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 00:19:40,628][15401] Updated weights for policy 0, policy_version 95070 (0.0028) [2024-06-22 00:19:43,393][15132] Fps is (10 sec: 42581.7, 60 sec: 42595.6, 300 sec: 42708.9). Total num frames: 1557741568. Throughput: 0: 42710.9. Samples: 1557872860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:43,394][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 00:19:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095078_1557757952.pth... [2024-06-22 00:19:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094452_1547501568.pth [2024-06-22 00:19:43,770][15401] Updated weights for policy 0, policy_version 95080 (0.0029) [2024-06-22 00:19:48,189][15401] Updated weights for policy 0, policy_version 95090 (0.0026) [2024-06-22 00:19:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1557954560. Throughput: 0: 43000.0. Samples: 1558139540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 00:19:51,332][15401] Updated weights for policy 0, policy_version 95100 (0.0037) [2024-06-22 00:19:53,392][15132] Fps is (10 sec: 45881.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1558200320. Throughput: 0: 42944.0. Samples: 1558263360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:53,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 00:19:55,878][15401] Updated weights for policy 0, policy_version 95110 (0.0034) [2024-06-22 00:19:58,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1558413312. Throughput: 0: 42907.9. Samples: 1558516180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:19:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 00:19:58,809][15401] Updated weights for policy 0, policy_version 95120 (0.0035) [2024-06-22 00:20:03,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1558593536. Throughput: 0: 43252.5. Samples: 1558788440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:20:03,390][15132] Avg episode reward: [(0, '0.094')] [2024-06-22 00:20:03,432][15401] Updated weights for policy 0, policy_version 95130 (0.0027) [2024-06-22 00:20:06,240][15401] Updated weights for policy 0, policy_version 95140 (0.0033) [2024-06-22 00:20:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1558839296. Throughput: 0: 43094.5. Samples: 1558910880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:20:08,393][15132] Avg episode reward: [(0, '0.005')] [2024-06-22 00:20:10,966][15401] Updated weights for policy 0, policy_version 95150 (0.0029) [2024-06-22 00:20:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 1559068672. Throughput: 0: 43055.2. Samples: 1559166380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:20:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 00:20:13,842][15401] Updated weights for policy 0, policy_version 95160 (0.0043) [2024-06-22 00:20:18,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1559248896. Throughput: 0: 43182.1. Samples: 1559430280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 00:20:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 00:20:18,602][15401] Updated weights for policy 0, policy_version 95170 (0.0034) [2024-06-22 00:20:22,049][15401] Updated weights for policy 0, policy_version 95180 (0.0035) [2024-06-22 00:20:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 1559461888. Throughput: 0: 42976.8. Samples: 1559549060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 00:20:26,014][15401] Updated weights for policy 0, policy_version 95190 (0.0042) [2024-06-22 00:20:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1559691264. Throughput: 0: 43017.5. Samples: 1559808480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 00:20:29,475][15401] Updated weights for policy 0, policy_version 95200 (0.0036) [2024-06-22 00:20:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1559904256. Throughput: 0: 42928.8. Samples: 1560071340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 00:20:33,606][15401] Updated weights for policy 0, policy_version 95210 (0.0029) [2024-06-22 00:20:37,197][15401] Updated weights for policy 0, policy_version 95220 (0.0045) [2024-06-22 00:20:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1560100864. Throughput: 0: 42913.5. Samples: 1560194360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 00:20:41,177][15401] Updated weights for policy 0, policy_version 95230 (0.0040) [2024-06-22 00:20:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43420.4, 300 sec: 42765.0). Total num frames: 1560346624. Throughput: 0: 43131.2. Samples: 1560457080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 00:20:44,785][15401] Updated weights for policy 0, policy_version 95240 (0.0038) [2024-06-22 00:20:46,481][15349] Signal inference workers to stop experience collection... (23000 times) [2024-06-22 00:20:46,519][15401] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-06-22 00:20:46,545][15349] Signal inference workers to resume experience collection... (23000 times) [2024-06-22 00:20:46,552][15401] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-06-22 00:20:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1560543232. Throughput: 0: 42685.6. Samples: 1560709300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 00:20:48,779][15401] Updated weights for policy 0, policy_version 95250 (0.0037) [2024-06-22 00:20:52,357][15401] Updated weights for policy 0, policy_version 95260 (0.0036) [2024-06-22 00:20:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 1560756224. Throughput: 0: 42853.0. Samples: 1560839160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 00:20:56,374][15401] Updated weights for policy 0, policy_version 95270 (0.0038) [2024-06-22 00:20:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 1560985600. Throughput: 0: 42920.6. Samples: 1561097800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:20:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 00:21:00,206][15401] Updated weights for policy 0, policy_version 95280 (0.0040) [2024-06-22 00:21:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1561198592. Throughput: 0: 42785.4. Samples: 1561355620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:21:03,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 00:21:04,028][15401] Updated weights for policy 0, policy_version 95290 (0.0030) [2024-06-22 00:21:07,747][15401] Updated weights for policy 0, policy_version 95300 (0.0029) [2024-06-22 00:21:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 1561411584. Throughput: 0: 42986.3. Samples: 1561483440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:21:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 00:21:11,621][15401] Updated weights for policy 0, policy_version 95310 (0.0044) [2024-06-22 00:21:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1561624576. Throughput: 0: 42960.9. Samples: 1561741720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 00:21:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 00:21:15,381][15401] Updated weights for policy 0, policy_version 95320 (0.0037) [2024-06-22 00:21:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1561837568. Throughput: 0: 42902.2. Samples: 1562001940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:18,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 00:21:19,181][15401] Updated weights for policy 0, policy_version 95330 (0.0036) [2024-06-22 00:21:23,090][15401] Updated weights for policy 0, policy_version 95340 (0.0033) [2024-06-22 00:21:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1562050560. Throughput: 0: 42930.1. Samples: 1562126220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 00:21:26,734][15401] Updated weights for policy 0, policy_version 95350 (0.0025) [2024-06-22 00:21:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1562279936. Throughput: 0: 42946.7. Samples: 1562389680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 00:21:30,631][15401] Updated weights for policy 0, policy_version 95360 (0.0028) [2024-06-22 00:21:33,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43415.9, 300 sec: 42875.8). Total num frames: 1562509312. Throughput: 0: 43135.1. Samples: 1562650480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:33,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 00:21:34,233][15401] Updated weights for policy 0, policy_version 95370 (0.0031) [2024-06-22 00:21:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1562689536. Throughput: 0: 43088.7. Samples: 1562778160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 00:21:38,534][15401] Updated weights for policy 0, policy_version 95380 (0.0048) [2024-06-22 00:21:42,115][15401] Updated weights for policy 0, policy_version 95390 (0.0035) [2024-06-22 00:21:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1562918912. Throughput: 0: 43114.2. Samples: 1563037940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 00:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095394_1562935296.pth... [2024-06-22 00:21:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000094765_1552629760.pth [2024-06-22 00:21:46,255][15401] Updated weights for policy 0, policy_version 95400 (0.0037) [2024-06-22 00:21:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1563115520. Throughput: 0: 43103.6. Samples: 1563295280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 00:21:49,669][15401] Updated weights for policy 0, policy_version 95410 (0.0029) [2024-06-22 00:21:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1563344896. Throughput: 0: 43101.6. Samples: 1563423020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 00:21:53,676][15401] Updated weights for policy 0, policy_version 95420 (0.0027) [2024-06-22 00:21:57,075][15401] Updated weights for policy 0, policy_version 95430 (0.0048) [2024-06-22 00:21:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1563574272. Throughput: 0: 43130.6. Samples: 1563682600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:21:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 00:22:01,211][15401] Updated weights for policy 0, policy_version 95440 (0.0040) [2024-06-22 00:22:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1563770880. Throughput: 0: 43228.9. Samples: 1563947240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:22:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 00:22:04,747][15401] Updated weights for policy 0, policy_version 95450 (0.0032) [2024-06-22 00:22:08,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 1563983872. Throughput: 0: 43161.2. Samples: 1564068580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:22:08,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 00:22:08,798][15401] Updated weights for policy 0, policy_version 95460 (0.0044) [2024-06-22 00:22:12,219][15401] Updated weights for policy 0, policy_version 95470 (0.0031) [2024-06-22 00:22:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1564213248. Throughput: 0: 43091.5. Samples: 1564328800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 00:22:13,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 00:22:16,247][15349] Signal inference workers to stop experience collection... (23050 times) [2024-06-22 00:22:16,248][15349] Signal inference workers to resume experience collection... (23050 times) [2024-06-22 00:22:16,287][15401] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-06-22 00:22:16,292][15401] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-06-22 00:22:16,389][15401] Updated weights for policy 0, policy_version 95480 (0.0040) [2024-06-22 00:22:18,389][15132] Fps is (10 sec: 44248.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1564426240. Throughput: 0: 43157.9. Samples: 1564592480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 00:22:19,603][15401] Updated weights for policy 0, policy_version 95490 (0.0020) [2024-06-22 00:22:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1564622848. Throughput: 0: 43139.5. Samples: 1564719540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:23,392][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 00:22:23,862][15401] Updated weights for policy 0, policy_version 95500 (0.0021) [2024-06-22 00:22:27,033][15401] Updated weights for policy 0, policy_version 95510 (0.0024) [2024-06-22 00:22:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1564852224. Throughput: 0: 43048.1. Samples: 1564975100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 00:22:31,401][15401] Updated weights for policy 0, policy_version 95520 (0.0032) [2024-06-22 00:22:33,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 1565081600. Throughput: 0: 43216.9. Samples: 1565240040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 00:22:34,635][15401] Updated weights for policy 0, policy_version 95530 (0.0034) [2024-06-22 00:22:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1565261824. Throughput: 0: 43235.2. Samples: 1565368600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 00:22:39,319][15401] Updated weights for policy 0, policy_version 95540 (0.0025) [2024-06-22 00:22:42,117][15401] Updated weights for policy 0, policy_version 95550 (0.0042) [2024-06-22 00:22:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 43043.1). Total num frames: 1565523968. Throughput: 0: 43105.4. Samples: 1565622340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 00:22:46,954][15401] Updated weights for policy 0, policy_version 95560 (0.0040) [2024-06-22 00:22:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 1565736960. Throughput: 0: 43179.1. Samples: 1565890300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:48,396][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 00:22:49,588][15401] Updated weights for policy 0, policy_version 95570 (0.0028) [2024-06-22 00:22:53,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 1565917184. Throughput: 0: 43229.0. Samples: 1566013880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:53,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 00:22:54,540][15401] Updated weights for policy 0, policy_version 95580 (0.0039) [2024-06-22 00:22:57,558][15401] Updated weights for policy 0, policy_version 95590 (0.0035) [2024-06-22 00:22:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 1566179328. Throughput: 0: 43159.2. Samples: 1566270960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:22:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 00:23:02,413][15401] Updated weights for policy 0, policy_version 95600 (0.0031) [2024-06-22 00:23:03,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1566359552. Throughput: 0: 43118.6. Samples: 1566532820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:23:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 00:23:05,224][15401] Updated weights for policy 0, policy_version 95610 (0.0054) [2024-06-22 00:23:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43146.3, 300 sec: 42932.6). Total num frames: 1566572544. Throughput: 0: 42987.2. Samples: 1566653860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 00:23:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 00:23:10,228][15401] Updated weights for policy 0, policy_version 95620 (0.0038) [2024-06-22 00:23:12,819][15401] Updated weights for policy 0, policy_version 95630 (0.0033) [2024-06-22 00:23:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 1566818304. Throughput: 0: 42907.9. Samples: 1566905960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 00:23:17,642][15401] Updated weights for policy 0, policy_version 95640 (0.0032) [2024-06-22 00:23:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1566982144. Throughput: 0: 43163.1. Samples: 1567182380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 00:23:20,329][15401] Updated weights for policy 0, policy_version 95650 (0.0029) [2024-06-22 00:23:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43419.3, 300 sec: 42932.0). Total num frames: 1567227904. Throughput: 0: 42892.0. Samples: 1567298740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 00:23:25,224][15401] Updated weights for policy 0, policy_version 95660 (0.0042) [2024-06-22 00:23:27,769][15401] Updated weights for policy 0, policy_version 95670 (0.0035) [2024-06-22 00:23:28,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 1567473664. Throughput: 0: 42906.7. Samples: 1567553140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 00:23:32,814][15401] Updated weights for policy 0, policy_version 95680 (0.0043) [2024-06-22 00:23:33,391][15132] Fps is (10 sec: 40952.4, 60 sec: 42597.0, 300 sec: 42931.4). Total num frames: 1567637504. Throughput: 0: 43046.6. Samples: 1567827480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:33,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 00:23:34,206][15349] Signal inference workers to stop experience collection... (23100 times) [2024-06-22 00:23:34,207][15349] Signal inference workers to resume experience collection... (23100 times) [2024-06-22 00:23:34,245][15401] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-06-22 00:23:34,245][15401] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-06-22 00:23:35,340][15401] Updated weights for policy 0, policy_version 95690 (0.0032) [2024-06-22 00:23:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 1567866880. Throughput: 0: 42976.2. Samples: 1567947700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 00:23:40,430][15401] Updated weights for policy 0, policy_version 95700 (0.0044) [2024-06-22 00:23:42,921][15401] Updated weights for policy 0, policy_version 95710 (0.0034) [2024-06-22 00:23:43,390][15132] Fps is (10 sec: 49161.0, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 1568129024. Throughput: 0: 43006.6. Samples: 1568206260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 00:23:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095711_1568129024.pth... [2024-06-22 00:23:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095078_1557757952.pth [2024-06-22 00:23:47,834][15401] Updated weights for policy 0, policy_version 95720 (0.0044) [2024-06-22 00:23:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 1568276480. Throughput: 0: 43125.8. Samples: 1568473480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 00:23:50,435][15401] Updated weights for policy 0, policy_version 95730 (0.0040) [2024-06-22 00:23:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 1568505856. Throughput: 0: 43059.2. Samples: 1568591520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:53,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 00:23:55,349][15401] Updated weights for policy 0, policy_version 95740 (0.0036) [2024-06-22 00:23:58,150][15401] Updated weights for policy 0, policy_version 95750 (0.0039) [2024-06-22 00:23:58,390][15132] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 43209.3). Total num frames: 1568768000. Throughput: 0: 43219.8. Samples: 1568850860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:23:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 00:24:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 1568915456. Throughput: 0: 42969.2. Samples: 1569116000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:24:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 00:24:03,668][15401] Updated weights for policy 0, policy_version 95760 (0.0032) [2024-06-22 00:24:06,191][15401] Updated weights for policy 0, policy_version 95770 (0.0030) [2024-06-22 00:24:08,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 1569144832. Throughput: 0: 43074.7. Samples: 1569237100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:24:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 00:24:11,191][15401] Updated weights for policy 0, policy_version 95780 (0.0038) [2024-06-22 00:24:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.3, 300 sec: 43153.8). Total num frames: 1569390592. Throughput: 0: 43164.7. Samples: 1569495560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 00:24:13,806][15401] Updated weights for policy 0, policy_version 95790 (0.0038) [2024-06-22 00:24:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1569570816. Throughput: 0: 42886.3. Samples: 1569757280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 00:24:18,737][15401] Updated weights for policy 0, policy_version 95800 (0.0045) [2024-06-22 00:24:21,296][15401] Updated weights for policy 0, policy_version 95810 (0.0031) [2024-06-22 00:24:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1569800192. Throughput: 0: 42827.4. Samples: 1569874940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:23,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 00:24:26,211][15401] Updated weights for policy 0, policy_version 95820 (0.0032) [2024-06-22 00:24:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 1570045952. Throughput: 0: 43086.4. Samples: 1570145140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 00:24:28,742][15401] Updated weights for policy 0, policy_version 95830 (0.0038) [2024-06-22 00:24:30,188][15349] Signal inference workers to stop experience collection... (23150 times) [2024-06-22 00:24:30,231][15401] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-06-22 00:24:30,237][15349] Signal inference workers to resume experience collection... (23150 times) [2024-06-22 00:24:30,246][15401] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-06-22 00:24:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42872.8, 300 sec: 42931.6). Total num frames: 1570209792. Throughput: 0: 42970.6. Samples: 1570407160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 00:24:33,770][15401] Updated weights for policy 0, policy_version 95840 (0.0036) [2024-06-22 00:24:36,327][15401] Updated weights for policy 0, policy_version 95850 (0.0031) [2024-06-22 00:24:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 43098.8). Total num frames: 1570455552. Throughput: 0: 42985.7. Samples: 1570525880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 00:24:41,247][15401] Updated weights for policy 0, policy_version 95860 (0.0028) [2024-06-22 00:24:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 43098.2). Total num frames: 1570668544. Throughput: 0: 43178.8. Samples: 1570793900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 00:24:43,997][15401] Updated weights for policy 0, policy_version 95870 (0.0030) [2024-06-22 00:24:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 1570865152. Throughput: 0: 43126.3. Samples: 1571056680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:48,395][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 00:24:48,695][15401] Updated weights for policy 0, policy_version 95880 (0.0041) [2024-06-22 00:24:51,818][15401] Updated weights for policy 0, policy_version 95890 (0.0039) [2024-06-22 00:24:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1571094528. Throughput: 0: 43016.4. Samples: 1571172840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 00:24:56,194][15401] Updated weights for policy 0, policy_version 95900 (0.0030) [2024-06-22 00:24:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 43098.2). Total num frames: 1571307520. Throughput: 0: 43253.9. Samples: 1571441980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:24:58,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 00:24:59,540][15401] Updated weights for policy 0, policy_version 95910 (0.0038) [2024-06-22 00:25:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 1571520512. Throughput: 0: 43161.2. Samples: 1571699540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:25:03,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 00:25:04,078][15401] Updated weights for policy 0, policy_version 95920 (0.0040) [2024-06-22 00:25:07,112][15401] Updated weights for policy 0, policy_version 95930 (0.0033) [2024-06-22 00:25:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 1571749888. Throughput: 0: 43148.1. Samples: 1571816600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 24.0) [2024-06-22 00:25:08,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 00:25:11,905][15401] Updated weights for policy 0, policy_version 95940 (0.0034) [2024-06-22 00:25:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 1571946496. Throughput: 0: 42996.3. Samples: 1572079980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 00:25:14,769][15401] Updated weights for policy 0, policy_version 95950 (0.0047) [2024-06-22 00:25:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1572159488. Throughput: 0: 42780.5. Samples: 1572332280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:18,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 00:25:19,404][15401] Updated weights for policy 0, policy_version 95960 (0.0042) [2024-06-22 00:25:22,422][15401] Updated weights for policy 0, policy_version 95970 (0.0045) [2024-06-22 00:25:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1572388864. Throughput: 0: 42972.8. Samples: 1572459660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 00:25:26,892][15401] Updated weights for policy 0, policy_version 95980 (0.0027) [2024-06-22 00:25:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 1572585472. Throughput: 0: 42702.7. Samples: 1572715520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 00:25:30,189][15401] Updated weights for policy 0, policy_version 95990 (0.0051) [2024-06-22 00:25:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 1572798464. Throughput: 0: 42535.2. Samples: 1572970760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:33,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 00:25:34,529][15401] Updated weights for policy 0, policy_version 96000 (0.0050) [2024-06-22 00:25:38,007][15401] Updated weights for policy 0, policy_version 96010 (0.0038) [2024-06-22 00:25:38,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 1573044224. Throughput: 0: 42852.8. Samples: 1573101220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 00:25:42,410][15401] Updated weights for policy 0, policy_version 96020 (0.0028) [2024-06-22 00:25:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1573224448. Throughput: 0: 42604.8. Samples: 1573359200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 00:25:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096022_1573224448.pth... [2024-06-22 00:25:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095394_1562935296.pth [2024-06-22 00:25:45,517][15349] Signal inference workers to stop experience collection... (23200 times) [2024-06-22 00:25:45,566][15401] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-06-22 00:25:45,567][15349] Signal inference workers to resume experience collection... (23200 times) [2024-06-22 00:25:45,584][15401] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-06-22 00:25:45,713][15401] Updated weights for policy 0, policy_version 96030 (0.0037) [2024-06-22 00:25:48,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1573453824. Throughput: 0: 42473.0. Samples: 1573610820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 00:25:50,143][15401] Updated weights for policy 0, policy_version 96040 (0.0034) [2024-06-22 00:25:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 1573666816. Throughput: 0: 42804.5. Samples: 1573742800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 00:25:53,451][15401] Updated weights for policy 0, policy_version 96050 (0.0045) [2024-06-22 00:25:57,584][15401] Updated weights for policy 0, policy_version 96060 (0.0041) [2024-06-22 00:25:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1573863424. Throughput: 0: 42647.6. Samples: 1573999120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:25:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 00:26:01,159][15401] Updated weights for policy 0, policy_version 96070 (0.0042) [2024-06-22 00:26:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 1574109184. Throughput: 0: 42664.8. Samples: 1574252300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:26:03,392][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 00:26:05,304][15401] Updated weights for policy 0, policy_version 96080 (0.0043) [2024-06-22 00:26:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1574305792. Throughput: 0: 42689.0. Samples: 1574380660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 00:26:08,636][15401] Updated weights for policy 0, policy_version 96090 (0.0032) [2024-06-22 00:26:13,241][15401] Updated weights for policy 0, policy_version 96100 (0.0043) [2024-06-22 00:26:13,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1574502400. Throughput: 0: 42671.5. Samples: 1574635740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 00:26:16,731][15401] Updated weights for policy 0, policy_version 96110 (0.0037) [2024-06-22 00:26:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1574715392. Throughput: 0: 42398.2. Samples: 1574878680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 00:26:21,019][15401] Updated weights for policy 0, policy_version 96120 (0.0036) [2024-06-22 00:26:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1574944768. Throughput: 0: 42499.3. Samples: 1575013680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 00:26:24,319][15401] Updated weights for policy 0, policy_version 96130 (0.0041) [2024-06-22 00:26:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1575141376. Throughput: 0: 42499.6. Samples: 1575271680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 00:26:28,749][15401] Updated weights for policy 0, policy_version 96140 (0.0030) [2024-06-22 00:26:31,875][15401] Updated weights for policy 0, policy_version 96150 (0.0042) [2024-06-22 00:26:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1575370752. Throughput: 0: 42455.5. Samples: 1575521320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 00:26:36,252][15401] Updated weights for policy 0, policy_version 96160 (0.0043) [2024-06-22 00:26:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 1575583744. Throughput: 0: 42627.9. Samples: 1575661060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 00:26:39,503][15401] Updated weights for policy 0, policy_version 96170 (0.0030) [2024-06-22 00:26:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1575780352. Throughput: 0: 42591.0. Samples: 1575915720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 00:26:43,872][15401] Updated weights for policy 0, policy_version 96180 (0.0039) [2024-06-22 00:26:47,068][15401] Updated weights for policy 0, policy_version 96190 (0.0037) [2024-06-22 00:26:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 1576042496. Throughput: 0: 42634.8. Samples: 1576170760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 00:26:51,512][15401] Updated weights for policy 0, policy_version 96200 (0.0028) [2024-06-22 00:26:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1576222720. Throughput: 0: 42940.9. Samples: 1576313000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:53,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 00:26:54,518][15401] Updated weights for policy 0, policy_version 96210 (0.0032) [2024-06-22 00:26:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1576435712. Throughput: 0: 42895.4. Samples: 1576566040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:26:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 00:26:59,179][15401] Updated weights for policy 0, policy_version 96220 (0.0030) [2024-06-22 00:27:02,210][15401] Updated weights for policy 0, policy_version 96230 (0.0028) [2024-06-22 00:27:03,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43146.3, 300 sec: 43098.6). Total num frames: 1576697856. Throughput: 0: 43006.7. Samples: 1576813980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 00:27:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 00:27:06,921][15401] Updated weights for policy 0, policy_version 96240 (0.0028) [2024-06-22 00:27:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1576845312. Throughput: 0: 43136.5. Samples: 1576954820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 00:27:08,918][15349] Signal inference workers to stop experience collection... (23250 times) [2024-06-22 00:27:08,939][15401] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-06-22 00:27:08,979][15349] Signal inference workers to resume experience collection... (23250 times) [2024-06-22 00:27:08,980][15401] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-06-22 00:27:09,976][15401] Updated weights for policy 0, policy_version 96250 (0.0031) [2024-06-22 00:27:13,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 1577074688. Throughput: 0: 42964.3. Samples: 1577205180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:13,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 00:27:14,365][15401] Updated weights for policy 0, policy_version 96260 (0.0041) [2024-06-22 00:27:17,825][15401] Updated weights for policy 0, policy_version 96270 (0.0036) [2024-06-22 00:27:18,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43415.9, 300 sec: 43042.7). Total num frames: 1577320448. Throughput: 0: 43153.8. Samples: 1577463340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:18,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 00:27:21,963][15401] Updated weights for policy 0, policy_version 96280 (0.0038) [2024-06-22 00:27:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1577500672. Throughput: 0: 43070.2. Samples: 1577599220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 00:27:25,439][15401] Updated weights for policy 0, policy_version 96290 (0.0033) [2024-06-22 00:27:28,392][15132] Fps is (10 sec: 40959.9, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 1577730048. Throughput: 0: 43003.2. Samples: 1577850960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:28,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 00:27:29,532][15401] Updated weights for policy 0, policy_version 96300 (0.0025) [2024-06-22 00:27:32,820][15401] Updated weights for policy 0, policy_version 96310 (0.0036) [2024-06-22 00:27:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 1577975808. Throughput: 0: 43155.5. Samples: 1578112760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 00:27:37,152][15401] Updated weights for policy 0, policy_version 96320 (0.0031) [2024-06-22 00:27:38,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1578156032. Throughput: 0: 42884.4. Samples: 1578242800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:38,399][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 00:27:40,387][15401] Updated weights for policy 0, policy_version 96330 (0.0044) [2024-06-22 00:27:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1578385408. Throughput: 0: 43000.5. Samples: 1578501060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 00:27:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096337_1578385408.pth... [2024-06-22 00:27:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000095711_1568129024.pth [2024-06-22 00:27:45,097][15401] Updated weights for policy 0, policy_version 96340 (0.0025) [2024-06-22 00:27:48,073][15401] Updated weights for policy 0, policy_version 96350 (0.0024) [2024-06-22 00:27:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 1578598400. Throughput: 0: 43080.9. Samples: 1578752620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 00:27:52,654][15401] Updated weights for policy 0, policy_version 96360 (0.0042) [2024-06-22 00:27:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1578811392. Throughput: 0: 43023.9. Samples: 1578890900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:53,394][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 00:27:55,746][15401] Updated weights for policy 0, policy_version 96370 (0.0031) [2024-06-22 00:27:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 1579024384. Throughput: 0: 43023.7. Samples: 1579141140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:27:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 00:28:00,158][15401] Updated weights for policy 0, policy_version 96380 (0.0040) [2024-06-22 00:28:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 1579237376. Throughput: 0: 43085.4. Samples: 1579402080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 00:28:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 00:28:03,407][15401] Updated weights for policy 0, policy_version 96390 (0.0034) [2024-06-22 00:28:07,619][15401] Updated weights for policy 0, policy_version 96400 (0.0028) [2024-06-22 00:28:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1579450368. Throughput: 0: 43033.4. Samples: 1579535720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 00:28:10,935][15401] Updated weights for policy 0, policy_version 96410 (0.0040) [2024-06-22 00:28:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 1579663360. Throughput: 0: 42965.3. Samples: 1579784300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 00:28:15,182][15401] Updated weights for policy 0, policy_version 96420 (0.0031) [2024-06-22 00:28:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42598.4, 300 sec: 42875.8). Total num frames: 1579876352. Throughput: 0: 43100.3. Samples: 1580052380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:18,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 00:28:18,762][15401] Updated weights for policy 0, policy_version 96430 (0.0033) [2024-06-22 00:28:22,762][15401] Updated weights for policy 0, policy_version 96440 (0.0023) [2024-06-22 00:28:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 1580105728. Throughput: 0: 42983.5. Samples: 1580177160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:23,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 00:28:26,338][15401] Updated weights for policy 0, policy_version 96450 (0.0034) [2024-06-22 00:28:28,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43146.2, 300 sec: 42987.4). Total num frames: 1580318720. Throughput: 0: 42788.9. Samples: 1580426560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 00:28:30,256][15401] Updated weights for policy 0, policy_version 96460 (0.0045) [2024-06-22 00:28:32,912][15349] Signal inference workers to stop experience collection... (23300 times) [2024-06-22 00:28:32,919][15349] Signal inference workers to resume experience collection... (23300 times) [2024-06-22 00:28:32,928][15401] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-06-22 00:28:32,928][15401] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-06-22 00:28:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1580531712. Throughput: 0: 43095.0. Samples: 1580691900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 00:28:34,078][15401] Updated weights for policy 0, policy_version 96470 (0.0023) [2024-06-22 00:28:37,746][15401] Updated weights for policy 0, policy_version 96480 (0.0041) [2024-06-22 00:28:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1580761088. Throughput: 0: 42932.4. Samples: 1580822860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:38,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 00:28:41,732][15401] Updated weights for policy 0, policy_version 96490 (0.0040) [2024-06-22 00:28:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1580974080. Throughput: 0: 43015.0. Samples: 1581076820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 00:28:45,747][15401] Updated weights for policy 0, policy_version 96500 (0.0024) [2024-06-22 00:28:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1581154304. Throughput: 0: 42855.8. Samples: 1581330600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 00:28:49,375][15401] Updated weights for policy 0, policy_version 96510 (0.0047) [2024-06-22 00:28:53,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1581367296. Throughput: 0: 42675.7. Samples: 1581456120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 00:28:53,423][15401] Updated weights for policy 0, policy_version 96520 (0.0037) [2024-06-22 00:28:57,064][15401] Updated weights for policy 0, policy_version 96530 (0.0027) [2024-06-22 00:28:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1581596672. Throughput: 0: 42866.7. Samples: 1581713300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 00:28:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 00:29:00,959][15401] Updated weights for policy 0, policy_version 96540 (0.0032) [2024-06-22 00:29:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1581809664. Throughput: 0: 42614.3. Samples: 1581969920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:03,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 00:29:04,746][15401] Updated weights for policy 0, policy_version 96550 (0.0033) [2024-06-22 00:29:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1582006272. Throughput: 0: 42752.9. Samples: 1582100940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:08,399][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 00:29:08,751][15401] Updated weights for policy 0, policy_version 96560 (0.0028) [2024-06-22 00:29:12,506][15401] Updated weights for policy 0, policy_version 96570 (0.0036) [2024-06-22 00:29:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1582252032. Throughput: 0: 42964.5. Samples: 1582359960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 00:29:16,326][15401] Updated weights for policy 0, policy_version 96580 (0.0040) [2024-06-22 00:29:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 1582465024. Throughput: 0: 42747.6. Samples: 1582615540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 00:29:19,994][15401] Updated weights for policy 0, policy_version 96590 (0.0035) [2024-06-22 00:29:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 1582661632. Throughput: 0: 42760.7. Samples: 1582747080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 00:29:23,871][15401] Updated weights for policy 0, policy_version 96600 (0.0030) [2024-06-22 00:29:27,340][15401] Updated weights for policy 0, policy_version 96610 (0.0030) [2024-06-22 00:29:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 1582874624. Throughput: 0: 42962.0. Samples: 1583010100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 00:29:31,530][15401] Updated weights for policy 0, policy_version 96620 (0.0041) [2024-06-22 00:29:33,389][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1583120384. Throughput: 0: 43036.6. Samples: 1583267240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:33,398][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 00:29:34,947][15401] Updated weights for policy 0, policy_version 96630 (0.0030) [2024-06-22 00:29:38,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1583300608. Throughput: 0: 43107.8. Samples: 1583395980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 00:29:39,113][15401] Updated weights for policy 0, policy_version 96640 (0.0034) [2024-06-22 00:29:42,663][15401] Updated weights for policy 0, policy_version 96650 (0.0033) [2024-06-22 00:29:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1583529984. Throughput: 0: 43117.0. Samples: 1583653560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 00:29:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096652_1583546368.pth... [2024-06-22 00:29:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096022_1573224448.pth [2024-06-22 00:29:46,815][15401] Updated weights for policy 0, policy_version 96660 (0.0030) [2024-06-22 00:29:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1583759360. Throughput: 0: 43125.2. Samples: 1583910560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:48,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 00:29:50,234][15401] Updated weights for policy 0, policy_version 96670 (0.0047) [2024-06-22 00:29:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1583955968. Throughput: 0: 42983.6. Samples: 1584035200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:53,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-22 00:29:54,235][15349] Signal inference workers to stop experience collection... (23350 times) [2024-06-22 00:29:54,283][15401] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-06-22 00:29:54,286][15349] Signal inference workers to resume experience collection... (23350 times) [2024-06-22 00:29:54,293][15401] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-06-22 00:29:54,437][15401] Updated weights for policy 0, policy_version 96680 (0.0042) [2024-06-22 00:29:58,262][15401] Updated weights for policy 0, policy_version 96690 (0.0027) [2024-06-22 00:29:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1584168960. Throughput: 0: 42941.7. Samples: 1584292340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 00:29:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 00:30:01,898][15401] Updated weights for policy 0, policy_version 96700 (0.0039) [2024-06-22 00:30:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1584398336. Throughput: 0: 42959.2. Samples: 1584548700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:03,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 00:30:05,589][15401] Updated weights for policy 0, policy_version 96710 (0.0030) [2024-06-22 00:30:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 1584627712. Throughput: 0: 43074.6. Samples: 1584685440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 00:30:09,395][15401] Updated weights for policy 0, policy_version 96720 (0.0032) [2024-06-22 00:30:13,182][15401] Updated weights for policy 0, policy_version 96730 (0.0029) [2024-06-22 00:30:13,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1584824320. Throughput: 0: 42919.0. Samples: 1584941460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 00:30:17,062][15401] Updated weights for policy 0, policy_version 96740 (0.0030) [2024-06-22 00:30:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1585037312. Throughput: 0: 42876.5. Samples: 1585196680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 00:30:21,188][15401] Updated weights for policy 0, policy_version 96750 (0.0035) [2024-06-22 00:30:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1585250304. Throughput: 0: 42989.8. Samples: 1585330520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 00:30:24,783][15401] Updated weights for policy 0, policy_version 96760 (0.0032) [2024-06-22 00:30:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1585463296. Throughput: 0: 42981.7. Samples: 1585587740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 00:30:28,625][15401] Updated weights for policy 0, policy_version 96770 (0.0041) [2024-06-22 00:30:32,390][15401] Updated weights for policy 0, policy_version 96780 (0.0045) [2024-06-22 00:30:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1585676288. Throughput: 0: 42992.9. Samples: 1585845240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 00:30:36,144][15401] Updated weights for policy 0, policy_version 96790 (0.0034) [2024-06-22 00:30:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1585905664. Throughput: 0: 43000.0. Samples: 1585970200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 00:30:39,963][15401] Updated weights for policy 0, policy_version 96800 (0.0039) [2024-06-22 00:30:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1586085888. Throughput: 0: 42981.9. Samples: 1586226520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 00:30:44,144][15401] Updated weights for policy 0, policy_version 96810 (0.0031) [2024-06-22 00:30:47,547][15401] Updated weights for policy 0, policy_version 96820 (0.0039) [2024-06-22 00:30:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1586298880. Throughput: 0: 42944.0. Samples: 1586481180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 00:30:51,688][15401] Updated weights for policy 0, policy_version 96830 (0.0029) [2024-06-22 00:30:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1586544640. Throughput: 0: 42806.5. Samples: 1586611740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 00:30:55,238][15401] Updated weights for policy 0, policy_version 96840 (0.0032) [2024-06-22 00:30:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1586741248. Throughput: 0: 42678.7. Samples: 1586862000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 00:30:58,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 00:30:59,556][15401] Updated weights for policy 0, policy_version 96850 (0.0038) [2024-06-22 00:31:03,200][15401] Updated weights for policy 0, policy_version 96860 (0.0034) [2024-06-22 00:31:03,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.6, 300 sec: 42875.8). Total num frames: 1586954240. Throughput: 0: 42854.6. Samples: 1587125240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:03,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 00:31:07,080][15401] Updated weights for policy 0, policy_version 96870 (0.0029) [2024-06-22 00:31:07,742][15349] Signal inference workers to stop experience collection... (23400 times) [2024-06-22 00:31:07,750][15349] Signal inference workers to resume experience collection... (23400 times) [2024-06-22 00:31:07,794][15401] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-06-22 00:31:07,794][15401] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-06-22 00:31:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1587200000. Throughput: 0: 42790.3. Samples: 1587256080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 00:31:10,616][15401] Updated weights for policy 0, policy_version 96880 (0.0038) [2024-06-22 00:31:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1587380224. Throughput: 0: 42841.0. Samples: 1587515580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 00:31:14,699][15401] Updated weights for policy 0, policy_version 96890 (0.0042) [2024-06-22 00:31:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1587593216. Throughput: 0: 42802.7. Samples: 1587771360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:18,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 00:31:18,609][15401] Updated weights for policy 0, policy_version 96900 (0.0043) [2024-06-22 00:31:22,278][15401] Updated weights for policy 0, policy_version 96910 (0.0025) [2024-06-22 00:31:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 1587855360. Throughput: 0: 42955.1. Samples: 1587903180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 00:31:26,422][15401] Updated weights for policy 0, policy_version 96920 (0.0038) [2024-06-22 00:31:28,393][15132] Fps is (10 sec: 44220.2, 60 sec: 42868.8, 300 sec: 42931.1). Total num frames: 1588035584. Throughput: 0: 43010.6. Samples: 1588162160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:28,394][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 00:31:29,882][15401] Updated weights for policy 0, policy_version 96930 (0.0038) [2024-06-22 00:31:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1588248576. Throughput: 0: 42903.0. Samples: 1588411820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 00:31:33,995][15401] Updated weights for policy 0, policy_version 96940 (0.0037) [2024-06-22 00:31:37,534][15401] Updated weights for policy 0, policy_version 96950 (0.0030) [2024-06-22 00:31:38,389][15132] Fps is (10 sec: 44253.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1588477952. Throughput: 0: 42912.1. Samples: 1588542780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 00:31:41,542][15401] Updated weights for policy 0, policy_version 96960 (0.0039) [2024-06-22 00:31:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1588674560. Throughput: 0: 43338.2. Samples: 1588812220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 00:31:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096965_1588674560.pth... [2024-06-22 00:31:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096337_1578385408.pth [2024-06-22 00:31:44,969][15401] Updated weights for policy 0, policy_version 96970 (0.0030) [2024-06-22 00:31:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1588903936. Throughput: 0: 42849.0. Samples: 1589053340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 00:31:49,666][15401] Updated weights for policy 0, policy_version 96980 (0.0025) [2024-06-22 00:31:52,483][15401] Updated weights for policy 0, policy_version 96990 (0.0032) [2024-06-22 00:31:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 1589133312. Throughput: 0: 42977.8. Samples: 1589190080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 00:31:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 00:31:57,269][15401] Updated weights for policy 0, policy_version 97000 (0.0039) [2024-06-22 00:31:58,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1589297152. Throughput: 0: 43061.6. Samples: 1589453360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:31:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 00:32:00,280][15401] Updated weights for policy 0, policy_version 97010 (0.0033) [2024-06-22 00:32:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.2, 300 sec: 43042.7). Total num frames: 1589542912. Throughput: 0: 42747.6. Samples: 1589695000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:03,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 00:32:04,643][15401] Updated weights for policy 0, policy_version 97020 (0.0026) [2024-06-22 00:32:07,777][15401] Updated weights for policy 0, policy_version 97030 (0.0031) [2024-06-22 00:32:07,787][15349] Signal inference workers to stop experience collection... (23450 times) [2024-06-22 00:32:07,787][15349] Signal inference workers to resume experience collection... (23450 times) [2024-06-22 00:32:07,803][15401] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-06-22 00:32:07,803][15401] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-06-22 00:32:08,389][15132] Fps is (10 sec: 49152.9, 60 sec: 43144.5, 300 sec: 43098.6). Total num frames: 1589788672. Throughput: 0: 42894.3. Samples: 1589833420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 00:32:12,140][15401] Updated weights for policy 0, policy_version 97040 (0.0045) [2024-06-22 00:32:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1589919744. Throughput: 0: 42934.7. Samples: 1590094060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 00:32:15,282][15401] Updated weights for policy 0, policy_version 97050 (0.0025) [2024-06-22 00:32:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 1590198272. Throughput: 0: 42982.3. Samples: 1590346020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 00:32:19,530][15401] Updated weights for policy 0, policy_version 97060 (0.0042) [2024-06-22 00:32:22,887][15401] Updated weights for policy 0, policy_version 97070 (0.0029) [2024-06-22 00:32:23,390][15132] Fps is (10 sec: 49151.3, 60 sec: 42598.3, 300 sec: 42987.5). Total num frames: 1590411264. Throughput: 0: 43190.1. Samples: 1590486340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 00:32:27,085][15401] Updated weights for policy 0, policy_version 97080 (0.0022) [2024-06-22 00:32:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42328.0, 300 sec: 42709.5). Total num frames: 1590575104. Throughput: 0: 42615.5. Samples: 1590729920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:32:30,587][15401] Updated weights for policy 0, policy_version 97090 (0.0033) [2024-06-22 00:32:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1590820864. Throughput: 0: 42843.1. Samples: 1590981280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:33,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 00:32:34,612][15401] Updated weights for policy 0, policy_version 97100 (0.0035) [2024-06-22 00:32:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1591033856. Throughput: 0: 42962.6. Samples: 1591123400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:38,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 00:32:38,462][15401] Updated weights for policy 0, policy_version 97110 (0.0026) [2024-06-22 00:32:42,346][15401] Updated weights for policy 0, policy_version 97120 (0.0034) [2024-06-22 00:32:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1591230464. Throughput: 0: 42698.7. Samples: 1591374800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 00:32:45,876][15401] Updated weights for policy 0, policy_version 97130 (0.0037) [2024-06-22 00:32:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1591459840. Throughput: 0: 43020.5. Samples: 1591630920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 00:32:49,892][15401] Updated weights for policy 0, policy_version 97140 (0.0024) [2024-06-22 00:32:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1591689216. Throughput: 0: 42922.6. Samples: 1591764940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 00:32:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 00:32:53,430][15401] Updated weights for policy 0, policy_version 97150 (0.0038) [2024-06-22 00:32:57,489][15401] Updated weights for policy 0, policy_version 97160 (0.0043) [2024-06-22 00:32:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1591885824. Throughput: 0: 42705.3. Samples: 1592015800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:32:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 00:33:00,953][15401] Updated weights for policy 0, policy_version 97170 (0.0039) [2024-06-22 00:33:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 1592098816. Throughput: 0: 42842.1. Samples: 1592274020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:03,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 00:33:05,130][15401] Updated weights for policy 0, policy_version 97180 (0.0032) [2024-06-22 00:33:08,202][15349] Signal inference workers to stop experience collection... (23500 times) [2024-06-22 00:33:08,202][15349] Signal inference workers to resume experience collection... (23500 times) [2024-06-22 00:33:08,225][15401] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-06-22 00:33:08,225][15401] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-06-22 00:33:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1592344576. Throughput: 0: 42584.5. Samples: 1592402640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 00:33:08,695][15401] Updated weights for policy 0, policy_version 97190 (0.0037) [2024-06-22 00:33:12,690][15401] Updated weights for policy 0, policy_version 97200 (0.0027) [2024-06-22 00:33:13,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43417.7, 300 sec: 42876.5). Total num frames: 1592524800. Throughput: 0: 42732.6. Samples: 1592652880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 00:33:16,307][15401] Updated weights for policy 0, policy_version 97210 (0.0042) [2024-06-22 00:33:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 1592737792. Throughput: 0: 42809.7. Samples: 1592907720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 00:33:20,324][15401] Updated weights for policy 0, policy_version 97220 (0.0055) [2024-06-22 00:33:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1592950784. Throughput: 0: 42494.2. Samples: 1593035640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:23,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 00:33:24,110][15401] Updated weights for policy 0, policy_version 97230 (0.0030) [2024-06-22 00:33:27,981][15401] Updated weights for policy 0, policy_version 97240 (0.0038) [2024-06-22 00:33:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1593180160. Throughput: 0: 42526.4. Samples: 1593288480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 00:33:31,846][15401] Updated weights for policy 0, policy_version 97250 (0.0038) [2024-06-22 00:33:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 1593376768. Throughput: 0: 42445.2. Samples: 1593541060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:33,393][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 00:33:36,306][15401] Updated weights for policy 0, policy_version 97260 (0.0039) [2024-06-22 00:33:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1593573376. Throughput: 0: 42421.5. Samples: 1593673900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 00:33:39,540][15401] Updated weights for policy 0, policy_version 97270 (0.0037) [2024-06-22 00:33:43,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1593802752. Throughput: 0: 42524.0. Samples: 1593929380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 00:33:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097278_1593802752.pth... [2024-06-22 00:33:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096652_1583546368.pth [2024-06-22 00:33:43,888][15401] Updated weights for policy 0, policy_version 97280 (0.0034) [2024-06-22 00:33:47,106][15401] Updated weights for policy 0, policy_version 97290 (0.0033) [2024-06-22 00:33:48,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1594032128. Throughput: 0: 42457.2. Samples: 1594184500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 00:33:51,377][15401] Updated weights for policy 0, policy_version 97300 (0.0033) [2024-06-22 00:33:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1594228736. Throughput: 0: 42537.4. Samples: 1594316820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:33:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 00:33:55,132][15401] Updated weights for policy 0, policy_version 97310 (0.0051) [2024-06-22 00:33:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1594441728. Throughput: 0: 42567.9. Samples: 1594568440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:33:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 00:33:59,096][15401] Updated weights for policy 0, policy_version 97320 (0.0038) [2024-06-22 00:34:03,041][15401] Updated weights for policy 0, policy_version 97330 (0.0041) [2024-06-22 00:34:03,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42868.6, 300 sec: 42930.7). Total num frames: 1594671104. Throughput: 0: 42547.2. Samples: 1594822620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:03,397][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 00:34:06,773][15401] Updated weights for policy 0, policy_version 97340 (0.0031) [2024-06-22 00:34:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1594867712. Throughput: 0: 42567.6. Samples: 1594951180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:08,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 00:34:10,702][15401] Updated weights for policy 0, policy_version 97350 (0.0034) [2024-06-22 00:34:13,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1595080704. Throughput: 0: 42765.3. Samples: 1595212920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 00:34:14,335][15401] Updated weights for policy 0, policy_version 97360 (0.0029) [2024-06-22 00:34:18,306][15401] Updated weights for policy 0, policy_version 97370 (0.0036) [2024-06-22 00:34:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1595310080. Throughput: 0: 42879.7. Samples: 1595470540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 00:34:22,003][15401] Updated weights for policy 0, policy_version 97380 (0.0024) [2024-06-22 00:34:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1595506688. Throughput: 0: 42772.9. Samples: 1595598680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 00:34:23,649][15349] Signal inference workers to stop experience collection... (23550 times) [2024-06-22 00:34:23,677][15401] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-06-22 00:34:23,703][15349] Signal inference workers to resume experience collection... (23550 times) [2024-06-22 00:34:23,708][15401] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-06-22 00:34:25,701][15401] Updated weights for policy 0, policy_version 97390 (0.0036) [2024-06-22 00:34:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1595736064. Throughput: 0: 42922.2. Samples: 1595860880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 00:34:29,889][15401] Updated weights for policy 0, policy_version 97400 (0.0042) [2024-06-22 00:34:33,238][15401] Updated weights for policy 0, policy_version 97410 (0.0041) [2024-06-22 00:34:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 1595965440. Throughput: 0: 42838.4. Samples: 1596112220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:33,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 00:34:37,649][15401] Updated weights for policy 0, policy_version 97420 (0.0038) [2024-06-22 00:34:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1596145664. Throughput: 0: 42819.6. Samples: 1596243700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:38,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 00:34:40,847][15401] Updated weights for policy 0, policy_version 97430 (0.0023) [2024-06-22 00:34:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1596375040. Throughput: 0: 42977.7. Samples: 1596502440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 00:34:45,292][15401] Updated weights for policy 0, policy_version 97440 (0.0032) [2024-06-22 00:34:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1596604416. Throughput: 0: 42948.3. Samples: 1596755020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 00:34:48,581][15401] Updated weights for policy 0, policy_version 97450 (0.0039) [2024-06-22 00:34:53,047][15401] Updated weights for policy 0, policy_version 97460 (0.0045) [2024-06-22 00:34:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1596801024. Throughput: 0: 42972.3. Samples: 1596884940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 00:34:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 00:34:56,268][15401] Updated weights for policy 0, policy_version 97470 (0.0036) [2024-06-22 00:34:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 1597014016. Throughput: 0: 42864.4. Samples: 1597141920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:34:58,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 00:35:00,794][15401] Updated weights for policy 0, policy_version 97480 (0.0038) [2024-06-22 00:35:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 1597227008. Throughput: 0: 42632.4. Samples: 1597389000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 00:35:04,114][15401] Updated weights for policy 0, policy_version 97490 (0.0041) [2024-06-22 00:35:08,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1597440000. Throughput: 0: 42657.7. Samples: 1597518280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 00:35:08,395][15401] Updated weights for policy 0, policy_version 97500 (0.0040) [2024-06-22 00:35:11,482][15401] Updated weights for policy 0, policy_version 97510 (0.0041) [2024-06-22 00:35:13,390][15132] Fps is (10 sec: 42595.3, 60 sec: 42870.9, 300 sec: 42764.9). Total num frames: 1597652992. Throughput: 0: 42634.4. Samples: 1597779460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:13,391][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 00:35:15,823][15401] Updated weights for policy 0, policy_version 97520 (0.0043) [2024-06-22 00:35:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1597882368. Throughput: 0: 42691.2. Samples: 1598033320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:18,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 00:35:19,520][15401] Updated weights for policy 0, policy_version 97530 (0.0036) [2024-06-22 00:35:23,389][15132] Fps is (10 sec: 42601.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1598078976. Throughput: 0: 42711.1. Samples: 1598165700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 00:35:23,513][15401] Updated weights for policy 0, policy_version 97540 (0.0050) [2024-06-22 00:35:26,990][15401] Updated weights for policy 0, policy_version 97550 (0.0034) [2024-06-22 00:35:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1598275584. Throughput: 0: 42588.9. Samples: 1598418940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 00:35:31,196][15401] Updated weights for policy 0, policy_version 97560 (0.0030) [2024-06-22 00:35:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1598521344. Throughput: 0: 42800.1. Samples: 1598681020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 00:35:34,507][15401] Updated weights for policy 0, policy_version 97570 (0.0033) [2024-06-22 00:35:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1598717952. Throughput: 0: 42805.4. Samples: 1598811180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 00:35:38,888][15401] Updated weights for policy 0, policy_version 97580 (0.0023) [2024-06-22 00:35:42,448][15401] Updated weights for policy 0, policy_version 97590 (0.0037) [2024-06-22 00:35:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1598930944. Throughput: 0: 42783.4. Samples: 1599067080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 00:35:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097592_1598947328.pth... [2024-06-22 00:35:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000096965_1588674560.pth [2024-06-22 00:35:46,451][15401] Updated weights for policy 0, policy_version 97600 (0.0035) [2024-06-22 00:35:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1599176704. Throughput: 0: 42833.0. Samples: 1599316480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 00:35:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:35:50,594][15401] Updated weights for policy 0, policy_version 97610 (0.0037) [2024-06-22 00:35:50,821][15349] Signal inference workers to stop experience collection... (23600 times) [2024-06-22 00:35:50,861][15401] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-06-22 00:35:50,883][15349] Signal inference workers to resume experience collection... (23600 times) [2024-06-22 00:35:50,888][15401] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-06-22 00:35:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1599356928. Throughput: 0: 42836.9. Samples: 1599445940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:35:53,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 00:35:54,049][15401] Updated weights for policy 0, policy_version 97620 (0.0036) [2024-06-22 00:35:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 1599553536. Throughput: 0: 42720.2. Samples: 1599701840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:35:58,393][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 00:35:58,401][15401] Updated weights for policy 0, policy_version 97630 (0.0037) [2024-06-22 00:36:01,960][15401] Updated weights for policy 0, policy_version 97640 (0.0035) [2024-06-22 00:36:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1599815680. Throughput: 0: 42598.1. Samples: 1599950240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:03,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 00:36:05,979][15401] Updated weights for policy 0, policy_version 97650 (0.0022) [2024-06-22 00:36:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1599995904. Throughput: 0: 42709.3. Samples: 1600087620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:08,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 00:36:09,464][15401] Updated weights for policy 0, policy_version 97660 (0.0047) [2024-06-22 00:36:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42599.0, 300 sec: 42765.0). Total num frames: 1600208896. Throughput: 0: 42747.2. Samples: 1600342560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:13,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 00:36:13,478][15401] Updated weights for policy 0, policy_version 97670 (0.0041) [2024-06-22 00:36:17,348][15401] Updated weights for policy 0, policy_version 97680 (0.0031) [2024-06-22 00:36:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 1600454656. Throughput: 0: 42626.2. Samples: 1600599300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:18,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 00:36:21,061][15401] Updated weights for policy 0, policy_version 97690 (0.0028) [2024-06-22 00:36:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 1600651264. Throughput: 0: 42613.3. Samples: 1600728780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 00:36:24,850][15401] Updated weights for policy 0, policy_version 97700 (0.0035) [2024-06-22 00:36:28,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1600847872. Throughput: 0: 42562.3. Samples: 1600982380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 00:36:29,046][15401] Updated weights for policy 0, policy_version 97710 (0.0031) [2024-06-22 00:36:32,522][15401] Updated weights for policy 0, policy_version 97720 (0.0029) [2024-06-22 00:36:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1601093632. Throughput: 0: 42785.7. Samples: 1601241840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 00:36:36,656][15401] Updated weights for policy 0, policy_version 97730 (0.0033) [2024-06-22 00:36:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1601290240. Throughput: 0: 42852.0. Samples: 1601374280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 00:36:40,058][15401] Updated weights for policy 0, policy_version 97740 (0.0035) [2024-06-22 00:36:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1601503232. Throughput: 0: 42712.4. Samples: 1601623900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 00:36:44,137][15401] Updated weights for policy 0, policy_version 97750 (0.0027) [2024-06-22 00:36:47,651][15401] Updated weights for policy 0, policy_version 97760 (0.0036) [2024-06-22 00:36:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1601716224. Throughput: 0: 42872.0. Samples: 1601879480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 00:36:48,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 00:36:51,648][15401] Updated weights for policy 0, policy_version 97770 (0.0051) [2024-06-22 00:36:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1601912832. Throughput: 0: 42774.7. Samples: 1602012580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:36:53,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 00:36:55,253][15401] Updated weights for policy 0, policy_version 97780 (0.0044) [2024-06-22 00:36:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1602158592. Throughput: 0: 42667.9. Samples: 1602262620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:36:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 00:36:59,645][15401] Updated weights for policy 0, policy_version 97790 (0.0039) [2024-06-22 00:37:03,123][15401] Updated weights for policy 0, policy_version 97800 (0.0035) [2024-06-22 00:37:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1602355200. Throughput: 0: 42643.7. Samples: 1602518160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 00:37:07,128][15401] Updated weights for policy 0, policy_version 97810 (0.0026) [2024-06-22 00:37:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1602551808. Throughput: 0: 42517.3. Samples: 1602642060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 00:37:10,934][15401] Updated weights for policy 0, policy_version 97820 (0.0044) [2024-06-22 00:37:13,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1602781184. Throughput: 0: 42524.5. Samples: 1602895980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 00:37:15,182][15401] Updated weights for policy 0, policy_version 97830 (0.0038) [2024-06-22 00:37:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 1602977792. Throughput: 0: 42458.2. Samples: 1603152460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 00:37:18,593][15401] Updated weights for policy 0, policy_version 97840 (0.0036) [2024-06-22 00:37:22,867][15401] Updated weights for policy 0, policy_version 97850 (0.0033) [2024-06-22 00:37:23,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 1603190784. Throughput: 0: 42226.5. Samples: 1603274740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:23,396][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 00:37:24,477][15349] Signal inference workers to stop experience collection... (23650 times) [2024-06-22 00:37:24,478][15349] Signal inference workers to resume experience collection... (23650 times) [2024-06-22 00:37:24,494][15401] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-06-22 00:37:24,494][15401] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-06-22 00:37:26,208][15401] Updated weights for policy 0, policy_version 97860 (0.0024) [2024-06-22 00:37:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1603420160. Throughput: 0: 42368.0. Samples: 1603530460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 00:37:30,529][15401] Updated weights for policy 0, policy_version 97870 (0.0031) [2024-06-22 00:37:33,392][15132] Fps is (10 sec: 44254.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 1603633152. Throughput: 0: 42473.3. Samples: 1603790880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:33,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 00:37:33,993][15401] Updated weights for policy 0, policy_version 97880 (0.0037) [2024-06-22 00:37:37,942][15401] Updated weights for policy 0, policy_version 97890 (0.0040) [2024-06-22 00:37:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1603829760. Throughput: 0: 42279.9. Samples: 1603915080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 00:37:41,765][15401] Updated weights for policy 0, policy_version 97900 (0.0042) [2024-06-22 00:37:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 1604059136. Throughput: 0: 42406.7. Samples: 1604170920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 00:37:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097904_1604059136.pth... [2024-06-22 00:37:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097278_1593802752.pth [2024-06-22 00:37:45,841][15401] Updated weights for policy 0, policy_version 97910 (0.0037) [2024-06-22 00:37:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1604255744. Throughput: 0: 42516.7. Samples: 1604431420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 00:37:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 00:37:49,318][15401] Updated weights for policy 0, policy_version 97920 (0.0029) [2024-06-22 00:37:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 1604452352. Throughput: 0: 42542.8. Samples: 1604556480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:37:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 00:37:53,633][15401] Updated weights for policy 0, policy_version 97930 (0.0039) [2024-06-22 00:37:57,214][15401] Updated weights for policy 0, policy_version 97940 (0.0030) [2024-06-22 00:37:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 1604698112. Throughput: 0: 42564.0. Samples: 1604811360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:37:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 00:38:01,161][15401] Updated weights for policy 0, policy_version 97950 (0.0036) [2024-06-22 00:38:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1604911104. Throughput: 0: 42566.6. Samples: 1605067960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 00:38:04,801][15401] Updated weights for policy 0, policy_version 97960 (0.0041) [2024-06-22 00:38:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1605107712. Throughput: 0: 42593.1. Samples: 1605191160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 00:38:08,773][15401] Updated weights for policy 0, policy_version 97970 (0.0039) [2024-06-22 00:38:12,424][15401] Updated weights for policy 0, policy_version 97980 (0.0033) [2024-06-22 00:38:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1605337088. Throughput: 0: 42756.0. Samples: 1605454480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 00:38:17,048][15401] Updated weights for policy 0, policy_version 97990 (0.0031) [2024-06-22 00:38:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1605533696. Throughput: 0: 42713.4. Samples: 1605712880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 00:38:20,139][15401] Updated weights for policy 0, policy_version 98000 (0.0037) [2024-06-22 00:38:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 1605746688. Throughput: 0: 42642.0. Samples: 1605833960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 00:38:24,588][15401] Updated weights for policy 0, policy_version 98010 (0.0030) [2024-06-22 00:38:27,719][15401] Updated weights for policy 0, policy_version 98020 (0.0037) [2024-06-22 00:38:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 1605976064. Throughput: 0: 42656.9. Samples: 1606090580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:28,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 00:38:32,099][15401] Updated weights for policy 0, policy_version 98030 (0.0033) [2024-06-22 00:38:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 1606172672. Throughput: 0: 42604.5. Samples: 1606348620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 00:38:35,325][15401] Updated weights for policy 0, policy_version 98040 (0.0039) [2024-06-22 00:38:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1606385664. Throughput: 0: 42602.7. Samples: 1606473600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 00:38:39,585][15401] Updated weights for policy 0, policy_version 98050 (0.0032) [2024-06-22 00:38:43,139][15401] Updated weights for policy 0, policy_version 98060 (0.0037) [2024-06-22 00:38:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1606631424. Throughput: 0: 42751.6. Samples: 1606735180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 00:38:47,267][15401] Updated weights for policy 0, policy_version 98070 (0.0029) [2024-06-22 00:38:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1606811648. Throughput: 0: 42812.8. Samples: 1606994540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 00:38:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 00:38:48,732][15349] Signal inference workers to stop experience collection... (23700 times) [2024-06-22 00:38:48,775][15401] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-06-22 00:38:48,784][15349] Signal inference workers to resume experience collection... (23700 times) [2024-06-22 00:38:48,790][15401] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-06-22 00:38:50,675][15401] Updated weights for policy 0, policy_version 98080 (0.0031) [2024-06-22 00:38:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1607041024. Throughput: 0: 42832.5. Samples: 1607118620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:38:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 00:38:54,759][15401] Updated weights for policy 0, policy_version 98090 (0.0030) [2024-06-22 00:38:58,106][15401] Updated weights for policy 0, policy_version 98100 (0.0039) [2024-06-22 00:38:58,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 1607270400. Throughput: 0: 42796.5. Samples: 1607380320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:38:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 00:39:02,446][15401] Updated weights for policy 0, policy_version 98110 (0.0031) [2024-06-22 00:39:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1607450624. Throughput: 0: 42919.9. Samples: 1607644280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 00:39:05,696][15401] Updated weights for policy 0, policy_version 98120 (0.0046) [2024-06-22 00:39:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1607696384. Throughput: 0: 42948.8. Samples: 1607766660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 00:39:10,353][15401] Updated weights for policy 0, policy_version 98130 (0.0039) [2024-06-22 00:39:13,335][15401] Updated weights for policy 0, policy_version 98140 (0.0035) [2024-06-22 00:39:13,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1607925760. Throughput: 0: 43041.9. Samples: 1608027360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 00:39:18,229][15401] Updated weights for policy 0, policy_version 98150 (0.0033) [2024-06-22 00:39:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1608105984. Throughput: 0: 43039.9. Samples: 1608285420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 00:39:21,025][15401] Updated weights for policy 0, policy_version 98160 (0.0041) [2024-06-22 00:39:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1608351744. Throughput: 0: 42875.0. Samples: 1608402980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 00:39:25,944][15401] Updated weights for policy 0, policy_version 98170 (0.0036) [2024-06-22 00:39:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 1608548352. Throughput: 0: 43042.2. Samples: 1608672080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 00:39:28,770][15401] Updated weights for policy 0, policy_version 98180 (0.0028) [2024-06-22 00:39:33,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1608712192. Throughput: 0: 43005.5. Samples: 1608929780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 00:39:33,610][15401] Updated weights for policy 0, policy_version 98190 (0.0035) [2024-06-22 00:39:36,221][15401] Updated weights for policy 0, policy_version 98200 (0.0032) [2024-06-22 00:39:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 1609007104. Throughput: 0: 42853.4. Samples: 1609047020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:38,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 00:39:41,202][15401] Updated weights for policy 0, policy_version 98210 (0.0032) [2024-06-22 00:39:43,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1609203712. Throughput: 0: 43049.4. Samples: 1609317540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 00:39:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098219_1609220096.pth... [2024-06-22 00:39:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097592_1598947328.pth [2024-06-22 00:39:43,891][15401] Updated weights for policy 0, policy_version 98220 (0.0024) [2024-06-22 00:39:48,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1609367552. Throughput: 0: 42994.4. Samples: 1609579020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:39:48,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 00:39:48,857][15401] Updated weights for policy 0, policy_version 98230 (0.0030) [2024-06-22 00:39:51,489][15401] Updated weights for policy 0, policy_version 98240 (0.0038) [2024-06-22 00:39:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 1609646080. Throughput: 0: 42846.1. Samples: 1609694740. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:39:53,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 00:39:56,804][15401] Updated weights for policy 0, policy_version 98250 (0.0026) [2024-06-22 00:39:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1609842688. Throughput: 0: 43003.1. Samples: 1609962500. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:39:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 00:39:58,918][15401] Updated weights for policy 0, policy_version 98260 (0.0042) [2024-06-22 00:40:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1610022912. Throughput: 0: 43003.7. Samples: 1610220580. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:03,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 00:40:04,337][15401] Updated weights for policy 0, policy_version 98270 (0.0028) [2024-06-22 00:40:06,450][15401] Updated weights for policy 0, policy_version 98280 (0.0031) [2024-06-22 00:40:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 1610285056. Throughput: 0: 43017.8. Samples: 1610338780. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 00:40:11,818][15401] Updated weights for policy 0, policy_version 98290 (0.0031) [2024-06-22 00:40:12,658][15349] Signal inference workers to stop experience collection... (23750 times) [2024-06-22 00:40:12,658][15349] Signal inference workers to resume experience collection... (23750 times) [2024-06-22 00:40:12,676][15401] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-06-22 00:40:12,676][15401] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-06-22 00:40:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1610465280. Throughput: 0: 42895.1. Samples: 1610602360. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 00:40:14,536][15401] Updated weights for policy 0, policy_version 98300 (0.0031) [2024-06-22 00:40:18,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1610661888. Throughput: 0: 42786.2. Samples: 1610855160. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:18,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-22 00:40:19,234][15401] Updated weights for policy 0, policy_version 98310 (0.0040) [2024-06-22 00:40:22,046][15401] Updated weights for policy 0, policy_version 98320 (0.0043) [2024-06-22 00:40:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 1610907648. Throughput: 0: 42938.6. Samples: 1610979360. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:23,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:40:27,037][15401] Updated weights for policy 0, policy_version 98330 (0.0038) [2024-06-22 00:40:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1611120640. Throughput: 0: 42900.1. Samples: 1611248040. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:28,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 00:40:29,653][15401] Updated weights for policy 0, policy_version 98340 (0.0034) [2024-06-22 00:40:33,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1611317248. Throughput: 0: 42744.9. Samples: 1611502540. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 00:40:35,016][15401] Updated weights for policy 0, policy_version 98350 (0.0038) [2024-06-22 00:40:37,376][15401] Updated weights for policy 0, policy_version 98360 (0.0028) [2024-06-22 00:40:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 1611546624. Throughput: 0: 42975.3. Samples: 1611628620. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 00:40:42,487][15401] Updated weights for policy 0, policy_version 98370 (0.0027) [2024-06-22 00:40:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1611743232. Throughput: 0: 42989.4. Samples: 1611897020. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 00:40:44,929][15401] Updated weights for policy 0, policy_version 98380 (0.0050) [2024-06-22 00:40:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 1611972608. Throughput: 0: 42770.1. Samples: 1612145340. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-22 00:40:48,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 00:40:49,977][15401] Updated weights for policy 0, policy_version 98390 (0.0036) [2024-06-22 00:40:52,503][15401] Updated weights for policy 0, policy_version 98400 (0.0035) [2024-06-22 00:40:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1612201984. Throughput: 0: 43052.4. Samples: 1612276140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:40:53,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 00:40:57,517][15401] Updated weights for policy 0, policy_version 98410 (0.0038) [2024-06-22 00:40:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1612382208. Throughput: 0: 43067.1. Samples: 1612540380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:40:58,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 00:41:00,128][15401] Updated weights for policy 0, policy_version 98420 (0.0042) [2024-06-22 00:41:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1612627968. Throughput: 0: 43013.7. Samples: 1612790780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:03,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 00:41:04,998][15401] Updated weights for policy 0, policy_version 98430 (0.0030) [2024-06-22 00:41:07,683][15401] Updated weights for policy 0, policy_version 98440 (0.0044) [2024-06-22 00:41:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1612857344. Throughput: 0: 43296.1. Samples: 1612927580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 00:41:12,426][15401] Updated weights for policy 0, policy_version 98450 (0.0037) [2024-06-22 00:41:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 1613053952. Throughput: 0: 43219.0. Samples: 1613192900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 00:41:15,099][15349] Signal inference workers to stop experience collection... (23800 times) [2024-06-22 00:41:15,099][15349] Signal inference workers to resume experience collection... (23800 times) [2024-06-22 00:41:15,109][15401] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-06-22 00:41:15,110][15401] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-06-22 00:41:15,259][15401] Updated weights for policy 0, policy_version 98460 (0.0030) [2024-06-22 00:41:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1613266944. Throughput: 0: 43088.1. Samples: 1613441500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 00:41:19,894][15401] Updated weights for policy 0, policy_version 98470 (0.0029) [2024-06-22 00:41:22,830][15401] Updated weights for policy 0, policy_version 98480 (0.0038) [2024-06-22 00:41:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 1613496320. Throughput: 0: 43201.3. Samples: 1613572680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 00:41:27,566][15401] Updated weights for policy 0, policy_version 98490 (0.0031) [2024-06-22 00:41:28,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1613676544. Throughput: 0: 42908.7. Samples: 1613827920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 00:41:30,691][15401] Updated weights for policy 0, policy_version 98500 (0.0038) [2024-06-22 00:41:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1613905920. Throughput: 0: 43103.2. Samples: 1614084880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 00:41:35,103][15401] Updated weights for policy 0, policy_version 98510 (0.0035) [2024-06-22 00:41:38,301][15401] Updated weights for policy 0, policy_version 98520 (0.0035) [2024-06-22 00:41:38,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1614151680. Throughput: 0: 43070.3. Samples: 1614214300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:38,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-22 00:41:42,806][15401] Updated weights for policy 0, policy_version 98530 (0.0034) [2024-06-22 00:41:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1614348288. Throughput: 0: 42918.2. Samples: 1614471700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 00:41:43,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-22 00:41:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098532_1614348288.pth... [2024-06-22 00:41:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000097904_1604059136.pth [2024-06-22 00:41:45,767][15401] Updated weights for policy 0, policy_version 98540 (0.0038) [2024-06-22 00:41:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43146.2, 300 sec: 42876.4). Total num frames: 1614561280. Throughput: 0: 42959.1. Samples: 1614723940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:41:48,395][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 00:41:50,426][15401] Updated weights for policy 0, policy_version 98550 (0.0030) [2024-06-22 00:41:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1614790656. Throughput: 0: 42973.7. Samples: 1614861400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:41:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 00:41:53,450][15401] Updated weights for policy 0, policy_version 98560 (0.0024) [2024-06-22 00:41:57,996][15401] Updated weights for policy 0, policy_version 98570 (0.0043) [2024-06-22 00:41:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 1614987264. Throughput: 0: 42676.9. Samples: 1615113360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:41:58,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-22 00:42:01,238][15401] Updated weights for policy 0, policy_version 98580 (0.0028) [2024-06-22 00:42:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1615216640. Throughput: 0: 42829.2. Samples: 1615368820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 00:42:05,506][15401] Updated weights for policy 0, policy_version 98590 (0.0039) [2024-06-22 00:42:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1615429632. Throughput: 0: 42918.2. Samples: 1615504000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 00:42:08,668][15401] Updated weights for policy 0, policy_version 98600 (0.0034) [2024-06-22 00:42:13,155][15401] Updated weights for policy 0, policy_version 98610 (0.0035) [2024-06-22 00:42:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1615626240. Throughput: 0: 42915.3. Samples: 1615759100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 00:42:16,804][15401] Updated weights for policy 0, policy_version 98620 (0.0027) [2024-06-22 00:42:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 1615855616. Throughput: 0: 42785.3. Samples: 1616010220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 00:42:20,761][15401] Updated weights for policy 0, policy_version 98630 (0.0030) [2024-06-22 00:42:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1616068608. Throughput: 0: 42833.3. Samples: 1616141800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 00:42:24,274][15401] Updated weights for policy 0, policy_version 98640 (0.0038) [2024-06-22 00:42:28,303][15401] Updated weights for policy 0, policy_version 98650 (0.0031) [2024-06-22 00:42:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.7, 300 sec: 42876.4). Total num frames: 1616281600. Throughput: 0: 42751.1. Samples: 1616395500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 00:42:32,078][15401] Updated weights for policy 0, policy_version 98660 (0.0027) [2024-06-22 00:42:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1616494592. Throughput: 0: 42740.9. Samples: 1616647280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:33,399][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 00:42:36,130][15401] Updated weights for policy 0, policy_version 98670 (0.0032) [2024-06-22 00:42:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1616707584. Throughput: 0: 42529.5. Samples: 1616775220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 00:42:39,779][15401] Updated weights for policy 0, policy_version 98680 (0.0032) [2024-06-22 00:42:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1616920576. Throughput: 0: 42538.1. Samples: 1617027580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 00:42:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 00:42:44,179][15401] Updated weights for policy 0, policy_version 98690 (0.0039) [2024-06-22 00:42:47,372][15401] Updated weights for policy 0, policy_version 98700 (0.0032) [2024-06-22 00:42:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1617117184. Throughput: 0: 42613.4. Samples: 1617286420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:42:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 00:42:51,870][15401] Updated weights for policy 0, policy_version 98710 (0.0047) [2024-06-22 00:42:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1617330176. Throughput: 0: 42366.2. Samples: 1617410480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:42:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 00:42:55,587][15349] Signal inference workers to stop experience collection... (23850 times) [2024-06-22 00:42:55,646][15349] Signal inference workers to resume experience collection... (23850 times) [2024-06-22 00:42:55,646][15401] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-06-22 00:42:55,667][15401] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-06-22 00:42:55,791][15401] Updated weights for policy 0, policy_version 98720 (0.0036) [2024-06-22 00:42:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1617559552. Throughput: 0: 42228.3. Samples: 1617659380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:42:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 00:42:59,363][15401] Updated weights for policy 0, policy_version 98730 (0.0025) [2024-06-22 00:43:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 1617739776. Throughput: 0: 42624.3. Samples: 1617928320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 00:43:03,482][15401] Updated weights for policy 0, policy_version 98740 (0.0031) [2024-06-22 00:43:06,856][15401] Updated weights for policy 0, policy_version 98750 (0.0026) [2024-06-22 00:43:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1617969152. Throughput: 0: 42367.2. Samples: 1618048320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 00:43:11,013][15401] Updated weights for policy 0, policy_version 98760 (0.0025) [2024-06-22 00:43:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1618182144. Throughput: 0: 42474.2. Samples: 1618306840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 00:43:14,442][15401] Updated weights for policy 0, policy_version 98770 (0.0029) [2024-06-22 00:43:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1618378752. Throughput: 0: 42582.8. Samples: 1618563500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 00:43:18,778][15401] Updated weights for policy 0, policy_version 98780 (0.0034) [2024-06-22 00:43:22,473][15401] Updated weights for policy 0, policy_version 98790 (0.0039) [2024-06-22 00:43:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 1618608128. Throughput: 0: 42481.7. Samples: 1618686900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 00:43:26,430][15401] Updated weights for policy 0, policy_version 98800 (0.0024) [2024-06-22 00:43:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1618837504. Throughput: 0: 42677.9. Samples: 1618948080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 00:43:30,034][15401] Updated weights for policy 0, policy_version 98810 (0.0029) [2024-06-22 00:43:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1619034112. Throughput: 0: 42564.0. Samples: 1619201800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 00:43:34,039][15401] Updated weights for policy 0, policy_version 98820 (0.0040) [2024-06-22 00:43:37,721][15401] Updated weights for policy 0, policy_version 98830 (0.0033) [2024-06-22 00:43:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1619247104. Throughput: 0: 42542.1. Samples: 1619324880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:38,392][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 00:43:41,622][15401] Updated weights for policy 0, policy_version 98840 (0.0024) [2024-06-22 00:43:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 1619476480. Throughput: 0: 42742.3. Samples: 1619582780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 00:43:43,393][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 00:43:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098845_1619476480.pth... [2024-06-22 00:43:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098219_1609220096.pth [2024-06-22 00:43:45,365][15401] Updated weights for policy 0, policy_version 98850 (0.0026) [2024-06-22 00:43:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1619656704. Throughput: 0: 42548.7. Samples: 1619843000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:43:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 00:43:49,289][15401] Updated weights for policy 0, policy_version 98860 (0.0040) [2024-06-22 00:43:52,878][15401] Updated weights for policy 0, policy_version 98870 (0.0029) [2024-06-22 00:43:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1619886080. Throughput: 0: 42586.6. Samples: 1619964720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:43:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 00:43:56,884][15401] Updated weights for policy 0, policy_version 98880 (0.0036) [2024-06-22 00:43:57,936][15349] Signal inference workers to stop experience collection... (23900 times) [2024-06-22 00:43:57,937][15349] Signal inference workers to resume experience collection... (23900 times) [2024-06-22 00:43:57,984][15401] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-06-22 00:43:57,984][15401] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-06-22 00:43:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1620099072. Throughput: 0: 42545.8. Samples: 1620221400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:43:58,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 00:44:00,620][15401] Updated weights for policy 0, policy_version 98890 (0.0038) [2024-06-22 00:44:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1620295680. Throughput: 0: 42496.8. Samples: 1620475860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 00:44:04,818][15401] Updated weights for policy 0, policy_version 98900 (0.0032) [2024-06-22 00:44:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1620525056. Throughput: 0: 42447.6. Samples: 1620597040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 00:44:08,484][15401] Updated weights for policy 0, policy_version 98910 (0.0031) [2024-06-22 00:44:12,891][15401] Updated weights for policy 0, policy_version 98920 (0.0034) [2024-06-22 00:44:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 1620738048. Throughput: 0: 42510.2. Samples: 1620861140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:13,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 00:44:16,363][15401] Updated weights for policy 0, policy_version 98930 (0.0032) [2024-06-22 00:44:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1620951040. Throughput: 0: 42406.7. Samples: 1621110100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 00:44:20,455][15401] Updated weights for policy 0, policy_version 98940 (0.0030) [2024-06-22 00:44:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1621164032. Throughput: 0: 42430.3. Samples: 1621234240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 00:44:24,030][15401] Updated weights for policy 0, policy_version 98950 (0.0048) [2024-06-22 00:44:27,949][15401] Updated weights for policy 0, policy_version 98960 (0.0033) [2024-06-22 00:44:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 1621377024. Throughput: 0: 42573.8. Samples: 1621498600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 00:44:31,673][15401] Updated weights for policy 0, policy_version 98970 (0.0041) [2024-06-22 00:44:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1621573632. Throughput: 0: 42361.3. Samples: 1621749260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 00:44:35,775][15401] Updated weights for policy 0, policy_version 98980 (0.0034) [2024-06-22 00:44:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1621803008. Throughput: 0: 42420.0. Samples: 1621873620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 00:44:39,333][15401] Updated weights for policy 0, policy_version 98990 (0.0030) [2024-06-22 00:44:43,385][15401] Updated weights for policy 0, policy_version 99000 (0.0036) [2024-06-22 00:44:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1622016000. Throughput: 0: 42395.6. Samples: 1622129200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 00:44:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 00:44:46,915][15401] Updated weights for policy 0, policy_version 99010 (0.0035) [2024-06-22 00:44:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 1622196224. Throughput: 0: 42492.4. Samples: 1622388020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:44:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 00:44:51,071][15401] Updated weights for policy 0, policy_version 99020 (0.0033) [2024-06-22 00:44:53,391][15132] Fps is (10 sec: 42593.8, 60 sec: 42597.6, 300 sec: 42709.3). Total num frames: 1622441984. Throughput: 0: 42503.0. Samples: 1622509720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:44:53,391][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 00:44:54,658][15401] Updated weights for policy 0, policy_version 99030 (0.0024) [2024-06-22 00:44:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1622638592. Throughput: 0: 42401.4. Samples: 1622769100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:44:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 00:44:58,658][15401] Updated weights for policy 0, policy_version 99040 (0.0038) [2024-06-22 00:45:02,343][15401] Updated weights for policy 0, policy_version 99050 (0.0028) [2024-06-22 00:45:03,389][15132] Fps is (10 sec: 40964.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1622851584. Throughput: 0: 42580.4. Samples: 1623026220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 00:45:06,491][15401] Updated weights for policy 0, policy_version 99060 (0.0030) [2024-06-22 00:45:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1623080960. Throughput: 0: 42599.9. Samples: 1623151240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 00:45:10,071][15401] Updated weights for policy 0, policy_version 99070 (0.0024) [2024-06-22 00:45:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 1623277568. Throughput: 0: 42489.3. Samples: 1623410620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 00:45:14,064][15401] Updated weights for policy 0, policy_version 99080 (0.0033) [2024-06-22 00:45:14,800][15349] Signal inference workers to stop experience collection... (23950 times) [2024-06-22 00:45:14,833][15401] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-06-22 00:45:14,856][15349] Signal inference workers to resume experience collection... (23950 times) [2024-06-22 00:45:14,856][15401] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-06-22 00:45:17,629][15401] Updated weights for policy 0, policy_version 99090 (0.0037) [2024-06-22 00:45:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 1623490560. Throughput: 0: 42564.8. Samples: 1623664680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 00:45:21,625][15401] Updated weights for policy 0, policy_version 99100 (0.0038) [2024-06-22 00:45:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1623719936. Throughput: 0: 42698.1. Samples: 1623795040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 00:45:25,188][15401] Updated weights for policy 0, policy_version 99110 (0.0036) [2024-06-22 00:45:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1623916544. Throughput: 0: 42705.3. Samples: 1624050940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 00:45:29,477][15401] Updated weights for policy 0, policy_version 99120 (0.0033) [2024-06-22 00:45:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1624129536. Throughput: 0: 42659.7. Samples: 1624307700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:33,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 00:45:33,496][15401] Updated weights for policy 0, policy_version 99130 (0.0041) [2024-06-22 00:45:37,234][15401] Updated weights for policy 0, policy_version 99140 (0.0033) [2024-06-22 00:45:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1624358912. Throughput: 0: 42894.8. Samples: 1624439940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 00:45:41,054][15401] Updated weights for policy 0, policy_version 99150 (0.0030) [2024-06-22 00:45:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 1624555520. Throughput: 0: 42831.4. Samples: 1624696520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 00:45:43,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 00:45:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099156_1624571904.pth... [2024-06-22 00:45:43,609][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098532_1614348288.pth [2024-06-22 00:45:44,681][15401] Updated weights for policy 0, policy_version 99160 (0.0040) [2024-06-22 00:45:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1624784896. Throughput: 0: 42693.3. Samples: 1624947420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:45:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 00:45:48,597][15401] Updated weights for policy 0, policy_version 99170 (0.0037) [2024-06-22 00:45:52,251][15401] Updated weights for policy 0, policy_version 99180 (0.0039) [2024-06-22 00:45:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42872.2, 300 sec: 42820.5). Total num frames: 1625014272. Throughput: 0: 42818.2. Samples: 1625078060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:45:53,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-22 00:45:56,094][15401] Updated weights for policy 0, policy_version 99190 (0.0027) [2024-06-22 00:45:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1625210880. Throughput: 0: 42711.3. Samples: 1625332620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:45:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 00:46:00,065][15401] Updated weights for policy 0, policy_version 99200 (0.0033) [2024-06-22 00:46:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1625423872. Throughput: 0: 42758.3. Samples: 1625588800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 00:46:03,697][15401] Updated weights for policy 0, policy_version 99210 (0.0036) [2024-06-22 00:46:07,900][15401] Updated weights for policy 0, policy_version 99220 (0.0035) [2024-06-22 00:46:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1625620480. Throughput: 0: 42732.1. Samples: 1625717980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 00:46:11,313][15401] Updated weights for policy 0, policy_version 99230 (0.0023) [2024-06-22 00:46:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1625849856. Throughput: 0: 42760.0. Samples: 1625975240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:13,393][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 00:46:15,807][15401] Updated weights for policy 0, policy_version 99240 (0.0036) [2024-06-22 00:46:18,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 1626079232. Throughput: 0: 42715.1. Samples: 1626229980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:18,392][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 00:46:18,886][15401] Updated weights for policy 0, policy_version 99250 (0.0027) [2024-06-22 00:46:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1626259456. Throughput: 0: 42781.0. Samples: 1626365080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 00:46:23,542][15401] Updated weights for policy 0, policy_version 99260 (0.0023) [2024-06-22 00:46:26,405][15401] Updated weights for policy 0, policy_version 99270 (0.0041) [2024-06-22 00:46:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1626488832. Throughput: 0: 42603.3. Samples: 1626613660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 00:46:31,022][15401] Updated weights for policy 0, policy_version 99280 (0.0029) [2024-06-22 00:46:32,994][15349] Signal inference workers to stop experience collection... (24000 times) [2024-06-22 00:46:33,035][15401] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-06-22 00:46:33,104][15349] Signal inference workers to resume experience collection... (24000 times) [2024-06-22 00:46:33,104][15401] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-06-22 00:46:33,391][15132] Fps is (10 sec: 47505.7, 60 sec: 43416.4, 300 sec: 42653.7). Total num frames: 1626734592. Throughput: 0: 42762.9. Samples: 1626871820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:33,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 00:46:34,437][15401] Updated weights for policy 0, policy_version 99290 (0.0033) [2024-06-22 00:46:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1626898432. Throughput: 0: 42809.4. Samples: 1627004480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 00:46:38,622][15401] Updated weights for policy 0, policy_version 99300 (0.0033) [2024-06-22 00:46:41,987][15401] Updated weights for policy 0, policy_version 99310 (0.0031) [2024-06-22 00:46:43,390][15132] Fps is (10 sec: 40966.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1627144192. Throughput: 0: 42732.7. Samples: 1627255600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 00:46:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 00:46:46,191][15401] Updated weights for policy 0, policy_version 99320 (0.0030) [2024-06-22 00:46:48,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1627373568. Throughput: 0: 42709.8. Samples: 1627510740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:46:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 00:46:49,930][15401] Updated weights for policy 0, policy_version 99330 (0.0047) [2024-06-22 00:46:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1627537408. Throughput: 0: 42715.9. Samples: 1627640200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:46:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 00:46:53,990][15401] Updated weights for policy 0, policy_version 99340 (0.0029) [2024-06-22 00:46:57,630][15401] Updated weights for policy 0, policy_version 99350 (0.0034) [2024-06-22 00:46:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1627783168. Throughput: 0: 42758.2. Samples: 1627899260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:46:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 00:47:01,694][15401] Updated weights for policy 0, policy_version 99360 (0.0047) [2024-06-22 00:47:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1627996160. Throughput: 0: 42720.8. Samples: 1628152320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 00:47:05,035][15401] Updated weights for policy 0, policy_version 99370 (0.0037) [2024-06-22 00:47:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1628192768. Throughput: 0: 42533.8. Samples: 1628279100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:08,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 00:47:09,472][15401] Updated weights for policy 0, policy_version 99380 (0.0037) [2024-06-22 00:47:12,509][15401] Updated weights for policy 0, policy_version 99390 (0.0030) [2024-06-22 00:47:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 1628422144. Throughput: 0: 42644.8. Samples: 1628532680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 00:47:17,170][15401] Updated weights for policy 0, policy_version 99400 (0.0036) [2024-06-22 00:47:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 1628618752. Throughput: 0: 42650.9. Samples: 1628791040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 00:47:20,625][15401] Updated weights for policy 0, policy_version 99410 (0.0037) [2024-06-22 00:47:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1628815360. Throughput: 0: 42576.1. Samples: 1628920400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 00:47:24,672][15401] Updated weights for policy 0, policy_version 99420 (0.0039) [2024-06-22 00:47:28,056][15401] Updated weights for policy 0, policy_version 99430 (0.0046) [2024-06-22 00:47:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1629061120. Throughput: 0: 42662.3. Samples: 1629175400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 00:47:32,180][15401] Updated weights for policy 0, policy_version 99440 (0.0036) [2024-06-22 00:47:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 1629274112. Throughput: 0: 42702.6. Samples: 1629432360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:33,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 00:47:35,630][15401] Updated weights for policy 0, policy_version 99450 (0.0042) [2024-06-22 00:47:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1629454336. Throughput: 0: 42670.3. Samples: 1629560360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 00:47:39,905][15401] Updated weights for policy 0, policy_version 99460 (0.0032) [2024-06-22 00:47:43,177][15401] Updated weights for policy 0, policy_version 99470 (0.0031) [2024-06-22 00:47:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 1629716480. Throughput: 0: 42563.9. Samples: 1629814640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 00:47:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 00:47:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099470_1629716480.pth... [2024-06-22 00:47:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000098845_1619476480.pth [2024-06-22 00:47:47,544][15401] Updated weights for policy 0, policy_version 99480 (0.0029) [2024-06-22 00:47:48,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 1629896704. Throughput: 0: 42784.4. Samples: 1630077620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:47:48,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-22 00:47:50,702][15401] Updated weights for policy 0, policy_version 99490 (0.0038) [2024-06-22 00:47:53,389][15132] Fps is (10 sec: 37684.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1630093312. Throughput: 0: 42644.9. Samples: 1630198120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:47:53,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-22 00:47:55,090][15401] Updated weights for policy 0, policy_version 99500 (0.0041) [2024-06-22 00:47:55,997][15349] Signal inference workers to stop experience collection... (24050 times) [2024-06-22 00:47:56,022][15401] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-06-22 00:47:56,062][15349] Signal inference workers to resume experience collection... (24050 times) [2024-06-22 00:47:56,062][15401] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-06-22 00:47:58,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1630355456. Throughput: 0: 42715.6. Samples: 1630454880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:47:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 00:47:58,679][15401] Updated weights for policy 0, policy_version 99510 (0.0032) [2024-06-22 00:48:02,904][15401] Updated weights for policy 0, policy_version 99520 (0.0043) [2024-06-22 00:48:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1630552064. Throughput: 0: 42648.3. Samples: 1630710220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:03,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 00:48:06,465][15401] Updated weights for policy 0, policy_version 99530 (0.0040) [2024-06-22 00:48:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1630748672. Throughput: 0: 42572.0. Samples: 1630836140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 00:48:10,503][15401] Updated weights for policy 0, policy_version 99540 (0.0036) [2024-06-22 00:48:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1630961664. Throughput: 0: 42491.6. Samples: 1631087520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 00:48:14,308][15401] Updated weights for policy 0, policy_version 99550 (0.0043) [2024-06-22 00:48:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1631174656. Throughput: 0: 42411.2. Samples: 1631340860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 00:48:18,449][15401] Updated weights for policy 0, policy_version 99560 (0.0044) [2024-06-22 00:48:21,963][15401] Updated weights for policy 0, policy_version 99570 (0.0036) [2024-06-22 00:48:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1631404032. Throughput: 0: 42392.0. Samples: 1631468000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 00:48:26,111][15401] Updated weights for policy 0, policy_version 99580 (0.0032) [2024-06-22 00:48:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1631600640. Throughput: 0: 42491.7. Samples: 1631726760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:28,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 00:48:29,671][15401] Updated weights for policy 0, policy_version 99590 (0.0031) [2024-06-22 00:48:33,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 1631797248. Throughput: 0: 42357.5. Samples: 1631983800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:33,393][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 00:48:34,097][15401] Updated weights for policy 0, policy_version 99600 (0.0027) [2024-06-22 00:48:37,650][15401] Updated weights for policy 0, policy_version 99610 (0.0035) [2024-06-22 00:48:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1632059392. Throughput: 0: 42464.4. Samples: 1632109020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:38,391][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 00:48:41,699][15401] Updated weights for policy 0, policy_version 99620 (0.0027) [2024-06-22 00:48:43,392][15132] Fps is (10 sec: 45875.4, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 1632256000. Throughput: 0: 42626.2. Samples: 1632373160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 00:48:43,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 00:48:45,251][15401] Updated weights for policy 0, policy_version 99630 (0.0035) [2024-06-22 00:48:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1632436224. Throughput: 0: 42582.2. Samples: 1632626420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:48:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 00:48:49,345][15401] Updated weights for policy 0, policy_version 99640 (0.0034) [2024-06-22 00:48:52,946][15401] Updated weights for policy 0, policy_version 99650 (0.0026) [2024-06-22 00:48:53,392][15132] Fps is (10 sec: 42598.3, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 1632681984. Throughput: 0: 42603.9. Samples: 1632753420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:48:53,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 00:48:56,898][15401] Updated weights for policy 0, policy_version 99660 (0.0032) [2024-06-22 00:48:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1632878592. Throughput: 0: 42597.8. Samples: 1633004420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:48:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 00:49:00,503][15401] Updated weights for policy 0, policy_version 99670 (0.0035) [2024-06-22 00:49:03,390][15132] Fps is (10 sec: 39330.4, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 1633075200. Throughput: 0: 42798.4. Samples: 1633266800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 00:49:04,606][15401] Updated weights for policy 0, policy_version 99680 (0.0029) [2024-06-22 00:49:08,117][15401] Updated weights for policy 0, policy_version 99690 (0.0030) [2024-06-22 00:49:08,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42709.5). Total num frames: 1633337344. Throughput: 0: 42923.0. Samples: 1633399640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:08,393][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 00:49:12,136][15401] Updated weights for policy 0, policy_version 99700 (0.0030) [2024-06-22 00:49:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1633517568. Throughput: 0: 42681.8. Samples: 1633647440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 00:49:15,046][15349] Signal inference workers to stop experience collection... (24100 times) [2024-06-22 00:49:15,047][15349] Signal inference workers to resume experience collection... (24100 times) [2024-06-22 00:49:15,059][15401] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-06-22 00:49:15,059][15401] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-06-22 00:49:15,762][15401] Updated weights for policy 0, policy_version 99710 (0.0041) [2024-06-22 00:49:18,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1633730560. Throughput: 0: 42674.3. Samples: 1633904040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 00:49:19,788][15401] Updated weights for policy 0, policy_version 99720 (0.0032) [2024-06-22 00:49:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1633959936. Throughput: 0: 42782.1. Samples: 1634034220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 00:49:23,539][15401] Updated weights for policy 0, policy_version 99730 (0.0028) [2024-06-22 00:49:27,342][15401] Updated weights for policy 0, policy_version 99740 (0.0034) [2024-06-22 00:49:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1634156544. Throughput: 0: 42491.2. Samples: 1634285160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 00:49:31,180][15401] Updated weights for policy 0, policy_version 99750 (0.0034) [2024-06-22 00:49:33,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42871.5, 300 sec: 42598.0). Total num frames: 1634369536. Throughput: 0: 42615.9. Samples: 1634544240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:33,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 00:49:35,277][15401] Updated weights for policy 0, policy_version 99760 (0.0028) [2024-06-22 00:49:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1634566144. Throughput: 0: 42618.7. Samples: 1634671160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 00:49:39,279][15401] Updated weights for policy 0, policy_version 99770 (0.0034) [2024-06-22 00:49:42,778][15401] Updated weights for policy 0, policy_version 99780 (0.0022) [2024-06-22 00:49:43,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1634811904. Throughput: 0: 42754.2. Samples: 1634928360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 00:49:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 00:49:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099781_1634811904.pth... [2024-06-22 00:49:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099156_1624571904.pth [2024-06-22 00:49:47,035][15401] Updated weights for policy 0, policy_version 99790 (0.0032) [2024-06-22 00:49:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42654.1). Total num frames: 1635024896. Throughput: 0: 42545.0. Samples: 1635181320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:49:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 00:49:50,311][15401] Updated weights for policy 0, policy_version 99800 (0.0024) [2024-06-22 00:49:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 1635221504. Throughput: 0: 42433.3. Samples: 1635309040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:49:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 00:49:54,764][15401] Updated weights for policy 0, policy_version 99810 (0.0051) [2024-06-22 00:49:57,911][15401] Updated weights for policy 0, policy_version 99820 (0.0032) [2024-06-22 00:49:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1635450880. Throughput: 0: 42642.7. Samples: 1635566360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:49:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 00:50:02,439][15401] Updated weights for policy 0, policy_version 99830 (0.0044) [2024-06-22 00:50:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1635663872. Throughput: 0: 42583.1. Samples: 1635820280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:03,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 00:50:05,497][15401] Updated weights for policy 0, policy_version 99840 (0.0025) [2024-06-22 00:50:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 1635860480. Throughput: 0: 42540.5. Samples: 1635948540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:08,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 00:50:10,098][15401] Updated weights for policy 0, policy_version 99850 (0.0034) [2024-06-22 00:50:13,089][15401] Updated weights for policy 0, policy_version 99860 (0.0037) [2024-06-22 00:50:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1636106240. Throughput: 0: 42736.3. Samples: 1636208300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 00:50:17,583][15401] Updated weights for policy 0, policy_version 99870 (0.0037) [2024-06-22 00:50:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1636319232. Throughput: 0: 42714.7. Samples: 1636466300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 00:50:20,769][15401] Updated weights for policy 0, policy_version 99880 (0.0027) [2024-06-22 00:50:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1636499456. Throughput: 0: 42728.4. Samples: 1636593940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:23,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 00:50:25,074][15401] Updated weights for policy 0, policy_version 99890 (0.0027) [2024-06-22 00:50:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1636745216. Throughput: 0: 42635.1. Samples: 1636846940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 00:50:28,642][15401] Updated weights for policy 0, policy_version 99900 (0.0040) [2024-06-22 00:50:32,195][15349] Signal inference workers to stop experience collection... (24150 times) [2024-06-22 00:50:32,195][15349] Signal inference workers to resume experience collection... (24150 times) [2024-06-22 00:50:32,223][15401] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-06-22 00:50:32,224][15401] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-06-22 00:50:32,811][15401] Updated weights for policy 0, policy_version 99910 (0.0039) [2024-06-22 00:50:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 1636941824. Throughput: 0: 42853.8. Samples: 1637109740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 00:50:36,529][15401] Updated weights for policy 0, policy_version 99920 (0.0037) [2024-06-22 00:50:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1637154816. Throughput: 0: 42843.2. Samples: 1637236980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 00:50:40,418][15401] Updated weights for policy 0, policy_version 99930 (0.0026) [2024-06-22 00:50:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1637400576. Throughput: 0: 42845.7. Samples: 1637494420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 00:50:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 00:50:44,312][15401] Updated weights for policy 0, policy_version 99940 (0.0040) [2024-06-22 00:50:48,189][15401] Updated weights for policy 0, policy_version 99950 (0.0032) [2024-06-22 00:50:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1637580800. Throughput: 0: 43027.0. Samples: 1637756500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:50:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 00:50:51,804][15401] Updated weights for policy 0, policy_version 99960 (0.0031) [2024-06-22 00:50:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1637777408. Throughput: 0: 42884.1. Samples: 1637878320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:50:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 00:50:55,740][15401] Updated weights for policy 0, policy_version 99970 (0.0034) [2024-06-22 00:50:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1638039552. Throughput: 0: 42864.9. Samples: 1638137220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:50:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 00:50:59,235][15401] Updated weights for policy 0, policy_version 99980 (0.0029) [2024-06-22 00:51:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1638219776. Throughput: 0: 42968.6. Samples: 1638399880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:03,390][15132] Avg episode reward: [(0, '0.101')] [2024-06-22 00:51:03,454][15401] Updated weights for policy 0, policy_version 99990 (0.0046) [2024-06-22 00:51:06,747][15401] Updated weights for policy 0, policy_version 100000 (0.0039) [2024-06-22 00:51:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 1638432768. Throughput: 0: 42856.1. Samples: 1638522460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 00:51:10,966][15401] Updated weights for policy 0, policy_version 100010 (0.0033) [2024-06-22 00:51:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1638678528. Throughput: 0: 42984.9. Samples: 1638781260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 00:51:14,696][15401] Updated weights for policy 0, policy_version 100020 (0.0037) [2024-06-22 00:51:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1638858752. Throughput: 0: 42826.7. Samples: 1639036940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 00:51:18,660][15401] Updated weights for policy 0, policy_version 100030 (0.0034) [2024-06-22 00:51:22,387][15401] Updated weights for policy 0, policy_version 100040 (0.0031) [2024-06-22 00:51:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1639071744. Throughput: 0: 42712.2. Samples: 1639159040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 00:51:26,751][15401] Updated weights for policy 0, policy_version 100050 (0.0033) [2024-06-22 00:51:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 1639317504. Throughput: 0: 42776.0. Samples: 1639419340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 00:51:29,918][15401] Updated weights for policy 0, policy_version 100060 (0.0037) [2024-06-22 00:51:33,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1639497728. Throughput: 0: 42775.3. Samples: 1639681380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 00:51:34,207][15401] Updated weights for policy 0, policy_version 100070 (0.0033) [2024-06-22 00:51:37,966][15401] Updated weights for policy 0, policy_version 100080 (0.0027) [2024-06-22 00:51:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1639710720. Throughput: 0: 42639.4. Samples: 1639797100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 00:51:41,753][15401] Updated weights for policy 0, policy_version 100090 (0.0039) [2024-06-22 00:51:43,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1639956480. Throughput: 0: 42775.9. Samples: 1640062140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 00:51:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 00:51:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100096_1639972864.pth... [2024-06-22 00:51:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099470_1629716480.pth [2024-06-22 00:51:45,539][15401] Updated weights for policy 0, policy_version 100100 (0.0037) [2024-06-22 00:51:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1640136704. Throughput: 0: 42645.8. Samples: 1640318940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:51:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 00:51:49,503][15401] Updated weights for policy 0, policy_version 100110 (0.0030) [2024-06-22 00:51:53,113][15401] Updated weights for policy 0, policy_version 100120 (0.0029) [2024-06-22 00:51:53,389][15132] Fps is (10 sec: 40961.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1640366080. Throughput: 0: 42565.4. Samples: 1640437900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:51:53,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 00:51:56,498][15349] Signal inference workers to stop experience collection... (24200 times) [2024-06-22 00:51:56,501][15349] Signal inference workers to resume experience collection... (24200 times) [2024-06-22 00:51:56,516][15401] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-06-22 00:51:56,516][15401] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-06-22 00:51:57,273][15401] Updated weights for policy 0, policy_version 100130 (0.0024) [2024-06-22 00:51:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1640611840. Throughput: 0: 42695.5. Samples: 1640702560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:51:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 00:52:01,020][15401] Updated weights for policy 0, policy_version 100140 (0.0032) [2024-06-22 00:52:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1640775680. Throughput: 0: 42616.9. Samples: 1640954700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 00:52:05,012][15401] Updated weights for policy 0, policy_version 100150 (0.0032) [2024-06-22 00:52:08,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 1641005056. Throughput: 0: 42522.4. Samples: 1641072640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:08,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 00:52:09,343][15401] Updated weights for policy 0, policy_version 100160 (0.0032) [2024-06-22 00:52:12,616][15401] Updated weights for policy 0, policy_version 100170 (0.0038) [2024-06-22 00:52:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1641218048. Throughput: 0: 42571.7. Samples: 1641335060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 00:52:16,991][15401] Updated weights for policy 0, policy_version 100180 (0.0033) [2024-06-22 00:52:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1641431040. Throughput: 0: 42365.2. Samples: 1641587820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 00:52:20,690][15401] Updated weights for policy 0, policy_version 100190 (0.0033) [2024-06-22 00:52:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1641644032. Throughput: 0: 42645.8. Samples: 1641716260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:23,393][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 00:52:24,504][15401] Updated weights for policy 0, policy_version 100200 (0.0032) [2024-06-22 00:52:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1641824256. Throughput: 0: 42356.6. Samples: 1641968180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 00:52:28,479][15401] Updated weights for policy 0, policy_version 100210 (0.0030) [2024-06-22 00:52:31,978][15401] Updated weights for policy 0, policy_version 100220 (0.0038) [2024-06-22 00:52:33,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 1642070016. Throughput: 0: 42294.6. Samples: 1642222300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:33,392][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 00:52:36,185][15401] Updated weights for policy 0, policy_version 100230 (0.0046) [2024-06-22 00:52:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1642283008. Throughput: 0: 42625.7. Samples: 1642356060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:38,390][15132] Avg episode reward: [(0, '0.122')] [2024-06-22 00:52:39,550][15401] Updated weights for policy 0, policy_version 100240 (0.0034) [2024-06-22 00:52:43,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41779.4, 300 sec: 42598.4). Total num frames: 1642463232. Throughput: 0: 42449.1. Samples: 1642612760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 00:52:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 00:52:43,747][15401] Updated weights for policy 0, policy_version 100250 (0.0032) [2024-06-22 00:52:47,238][15401] Updated weights for policy 0, policy_version 100260 (0.0032) [2024-06-22 00:52:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1642708992. Throughput: 0: 42448.7. Samples: 1642864900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:52:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 00:52:51,377][15401] Updated weights for policy 0, policy_version 100270 (0.0034) [2024-06-22 00:52:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1642921984. Throughput: 0: 42741.3. Samples: 1642995900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:52:53,396][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 00:52:54,823][15401] Updated weights for policy 0, policy_version 100280 (0.0031) [2024-06-22 00:52:58,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1643118592. Throughput: 0: 42708.0. Samples: 1643256920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:52:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 00:52:58,958][15401] Updated weights for policy 0, policy_version 100290 (0.0027) [2024-06-22 00:53:02,331][15401] Updated weights for policy 0, policy_version 100300 (0.0040) [2024-06-22 00:53:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 1643364352. Throughput: 0: 42698.2. Samples: 1643509340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:03,393][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 00:53:06,435][15401] Updated weights for policy 0, policy_version 100310 (0.0033) [2024-06-22 00:53:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1643560960. Throughput: 0: 42888.9. Samples: 1643646160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 00:53:09,787][15401] Updated weights for policy 0, policy_version 100320 (0.0028) [2024-06-22 00:53:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1643757568. Throughput: 0: 42915.2. Samples: 1643899360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 00:53:13,897][15401] Updated weights for policy 0, policy_version 100330 (0.0034) [2024-06-22 00:53:15,705][15349] Signal inference workers to stop experience collection... (24250 times) [2024-06-22 00:53:15,709][15349] Signal inference workers to resume experience collection... (24250 times) [2024-06-22 00:53:15,719][15401] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-06-22 00:53:15,733][15401] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-06-22 00:53:17,505][15401] Updated weights for policy 0, policy_version 100340 (0.0036) [2024-06-22 00:53:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1644003328. Throughput: 0: 42880.5. Samples: 1644151820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 00:53:21,541][15401] Updated weights for policy 0, policy_version 100350 (0.0030) [2024-06-22 00:53:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 1644199936. Throughput: 0: 42874.7. Samples: 1644285420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 00:53:25,119][15401] Updated weights for policy 0, policy_version 100360 (0.0039) [2024-06-22 00:53:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1644396544. Throughput: 0: 42644.4. Samples: 1644531760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 00:53:29,483][15401] Updated weights for policy 0, policy_version 100370 (0.0035) [2024-06-22 00:53:32,807][15401] Updated weights for policy 0, policy_version 100380 (0.0032) [2024-06-22 00:53:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1644625920. Throughput: 0: 42707.7. Samples: 1644786740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 00:53:37,108][15401] Updated weights for policy 0, policy_version 100390 (0.0031) [2024-06-22 00:53:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1644855296. Throughput: 0: 42833.4. Samples: 1644923400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 00:53:41,010][15401] Updated weights for policy 0, policy_version 100400 (0.0032) [2024-06-22 00:53:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1645051904. Throughput: 0: 42515.0. Samples: 1645170100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 00:53:43,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 00:53:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100406_1645051904.pth... [2024-06-22 00:53:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000099781_1634811904.pth [2024-06-22 00:53:44,899][15401] Updated weights for policy 0, policy_version 100410 (0.0037) [2024-06-22 00:53:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1645264896. Throughput: 0: 42605.9. Samples: 1645426500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:53:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 00:53:48,634][15401] Updated weights for policy 0, policy_version 100420 (0.0043) [2024-06-22 00:53:52,548][15401] Updated weights for policy 0, policy_version 100430 (0.0027) [2024-06-22 00:53:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1645461504. Throughput: 0: 42315.6. Samples: 1645550360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:53:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 00:53:56,481][15401] Updated weights for policy 0, policy_version 100440 (0.0031) [2024-06-22 00:53:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1645690880. Throughput: 0: 42321.2. Samples: 1645803820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:53:58,396][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 00:54:00,108][15401] Updated weights for policy 0, policy_version 100450 (0.0038) [2024-06-22 00:54:03,391][15132] Fps is (10 sec: 44229.0, 60 sec: 42325.8, 300 sec: 42598.5). Total num frames: 1645903872. Throughput: 0: 42243.2. Samples: 1646052840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:03,392][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 00:54:04,091][15401] Updated weights for policy 0, policy_version 100460 (0.0036) [2024-06-22 00:54:07,847][15401] Updated weights for policy 0, policy_version 100470 (0.0022) [2024-06-22 00:54:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1646100480. Throughput: 0: 42239.0. Samples: 1646186180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 00:54:11,704][15401] Updated weights for policy 0, policy_version 100480 (0.0032) [2024-06-22 00:54:13,392][15132] Fps is (10 sec: 40957.3, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 1646313472. Throughput: 0: 42424.4. Samples: 1646440960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:13,393][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 00:54:15,886][15401] Updated weights for policy 0, policy_version 100490 (0.0034) [2024-06-22 00:54:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1646526464. Throughput: 0: 42374.2. Samples: 1646693580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 00:54:19,352][15401] Updated weights for policy 0, policy_version 100500 (0.0027) [2024-06-22 00:54:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1646739456. Throughput: 0: 42323.1. Samples: 1646827940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 00:54:23,438][15401] Updated weights for policy 0, policy_version 100510 (0.0028) [2024-06-22 00:54:27,102][15401] Updated weights for policy 0, policy_version 100520 (0.0032) [2024-06-22 00:54:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1646952448. Throughput: 0: 42402.8. Samples: 1647078220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 00:54:30,992][15401] Updated weights for policy 0, policy_version 100530 (0.0035) [2024-06-22 00:54:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1647181824. Throughput: 0: 42417.7. Samples: 1647335300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 00:54:35,126][15401] Updated weights for policy 0, policy_version 100540 (0.0038) [2024-06-22 00:54:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1647394816. Throughput: 0: 42576.5. Samples: 1647466300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 00:54:38,526][15401] Updated weights for policy 0, policy_version 100550 (0.0027) [2024-06-22 00:54:42,740][15349] Signal inference workers to stop experience collection... (24300 times) [2024-06-22 00:54:42,740][15349] Signal inference workers to resume experience collection... (24300 times) [2024-06-22 00:54:42,774][15401] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-06-22 00:54:42,775][15401] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-06-22 00:54:42,876][15401] Updated weights for policy 0, policy_version 100560 (0.0037) [2024-06-22 00:54:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1647591424. Throughput: 0: 42491.6. Samples: 1647715940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 00:54:46,209][15401] Updated weights for policy 0, policy_version 100570 (0.0033) [2024-06-22 00:54:48,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1647804416. Throughput: 0: 42589.5. Samples: 1647969300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 00:54:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 00:54:50,644][15401] Updated weights for policy 0, policy_version 100580 (0.0038) [2024-06-22 00:54:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1648033792. Throughput: 0: 42462.0. Samples: 1648096960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:54:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 00:54:53,673][15401] Updated weights for policy 0, policy_version 100590 (0.0032) [2024-06-22 00:54:58,178][15401] Updated weights for policy 0, policy_version 100600 (0.0032) [2024-06-22 00:54:58,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1648230400. Throughput: 0: 42568.6. Samples: 1648356440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:54:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 00:55:01,635][15401] Updated weights for policy 0, policy_version 100610 (0.0034) [2024-06-22 00:55:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.5, 300 sec: 42653.9). Total num frames: 1648443392. Throughput: 0: 42464.4. Samples: 1648604480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 00:55:06,173][15401] Updated weights for policy 0, policy_version 100620 (0.0033) [2024-06-22 00:55:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1648656384. Throughput: 0: 42391.9. Samples: 1648735580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 00:55:09,602][15401] Updated weights for policy 0, policy_version 100630 (0.0040) [2024-06-22 00:55:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 1648852992. Throughput: 0: 42466.7. Samples: 1648989220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:13,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 00:55:13,914][15401] Updated weights for policy 0, policy_version 100640 (0.0043) [2024-06-22 00:55:17,435][15401] Updated weights for policy 0, policy_version 100650 (0.0041) [2024-06-22 00:55:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1649082368. Throughput: 0: 42285.8. Samples: 1649238160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 00:55:21,467][15401] Updated weights for policy 0, policy_version 100660 (0.0028) [2024-06-22 00:55:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1649295360. Throughput: 0: 42307.0. Samples: 1649370120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 00:55:24,800][15401] Updated weights for policy 0, policy_version 100670 (0.0033) [2024-06-22 00:55:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1649508352. Throughput: 0: 42446.7. Samples: 1649626040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 00:55:29,399][15401] Updated weights for policy 0, policy_version 100680 (0.0040) [2024-06-22 00:55:32,382][15401] Updated weights for policy 0, policy_version 100690 (0.0031) [2024-06-22 00:55:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1649721344. Throughput: 0: 42449.5. Samples: 1649879520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:33,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 00:55:36,925][15401] Updated weights for policy 0, policy_version 100700 (0.0034) [2024-06-22 00:55:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 1649934336. Throughput: 0: 42627.6. Samples: 1650015200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:38,396][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 00:55:39,951][15401] Updated weights for policy 0, policy_version 100710 (0.0028) [2024-06-22 00:55:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1650130944. Throughput: 0: 42465.8. Samples: 1650267400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 00:55:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100717_1650147328.pth... [2024-06-22 00:55:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100096_1639972864.pth [2024-06-22 00:55:44,639][15401] Updated weights for policy 0, policy_version 100720 (0.0041) [2024-06-22 00:55:48,013][15401] Updated weights for policy 0, policy_version 100730 (0.0038) [2024-06-22 00:55:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1650360320. Throughput: 0: 42431.2. Samples: 1650513880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 00:55:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 00:55:52,222][15401] Updated weights for policy 0, policy_version 100740 (0.0025) [2024-06-22 00:55:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1650573312. Throughput: 0: 42424.6. Samples: 1650644680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:55:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 00:55:55,629][15401] Updated weights for policy 0, policy_version 100750 (0.0034) [2024-06-22 00:55:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1650769920. Throughput: 0: 42440.0. Samples: 1650899020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:55:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 00:55:59,796][15401] Updated weights for policy 0, policy_version 100760 (0.0033) [2024-06-22 00:56:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1650999296. Throughput: 0: 42590.2. Samples: 1651154720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 00:56:03,923][15401] Updated weights for policy 0, policy_version 100770 (0.0033) [2024-06-22 00:56:07,383][15401] Updated weights for policy 0, policy_version 100780 (0.0031) [2024-06-22 00:56:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1651212288. Throughput: 0: 42624.0. Samples: 1651288200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 00:56:10,335][15349] Signal inference workers to stop experience collection... (24350 times) [2024-06-22 00:56:10,340][15349] Signal inference workers to resume experience collection... (24350 times) [2024-06-22 00:56:10,370][15401] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-06-22 00:56:10,370][15401] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-06-22 00:56:11,370][15401] Updated weights for policy 0, policy_version 100790 (0.0030) [2024-06-22 00:56:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1651408896. Throughput: 0: 42581.0. Samples: 1651542180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 00:56:15,486][15401] Updated weights for policy 0, policy_version 100800 (0.0038) [2024-06-22 00:56:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1651638272. Throughput: 0: 42506.3. Samples: 1651792300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 00:56:19,114][15401] Updated weights for policy 0, policy_version 100810 (0.0027) [2024-06-22 00:56:23,105][15401] Updated weights for policy 0, policy_version 100820 (0.0034) [2024-06-22 00:56:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1651834880. Throughput: 0: 42420.9. Samples: 1651924140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 00:56:26,758][15401] Updated weights for policy 0, policy_version 100830 (0.0044) [2024-06-22 00:56:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1652047872. Throughput: 0: 42430.2. Samples: 1652176760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 00:56:30,646][15401] Updated weights for policy 0, policy_version 100840 (0.0031) [2024-06-22 00:56:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1652260864. Throughput: 0: 42609.7. Samples: 1652431320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 00:56:34,495][15401] Updated weights for policy 0, policy_version 100850 (0.0042) [2024-06-22 00:56:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 1652490240. Throughput: 0: 42480.5. Samples: 1652556300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 00:56:38,393][15401] Updated weights for policy 0, policy_version 100860 (0.0028) [2024-06-22 00:56:42,534][15401] Updated weights for policy 0, policy_version 100870 (0.0031) [2024-06-22 00:56:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1652686848. Throughput: 0: 42446.7. Samples: 1652809120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:56:46,060][15401] Updated weights for policy 0, policy_version 100880 (0.0022) [2024-06-22 00:56:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1652899840. Throughput: 0: 42456.4. Samples: 1653065260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-06-22 00:56:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 00:56:50,351][15401] Updated weights for policy 0, policy_version 100890 (0.0037) [2024-06-22 00:56:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1653112832. Throughput: 0: 42273.5. Samples: 1653190500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:56:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 00:56:53,593][15401] Updated weights for policy 0, policy_version 100900 (0.0037) [2024-06-22 00:56:58,006][15401] Updated weights for policy 0, policy_version 100910 (0.0039) [2024-06-22 00:56:58,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42593.8, 300 sec: 42541.9). Total num frames: 1653325824. Throughput: 0: 42413.9. Samples: 1653451080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:56:58,397][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 00:57:01,152][15401] Updated weights for policy 0, policy_version 100920 (0.0048) [2024-06-22 00:57:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.7, 300 sec: 42487.3). Total num frames: 1653538816. Throughput: 0: 42525.8. Samples: 1653706060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:03,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 00:57:05,889][15401] Updated weights for policy 0, policy_version 100930 (0.0041) [2024-06-22 00:57:08,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1653751808. Throughput: 0: 42453.7. Samples: 1653834560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 00:57:08,891][15401] Updated weights for policy 0, policy_version 100940 (0.0032) [2024-06-22 00:57:13,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1653948416. Throughput: 0: 42593.2. Samples: 1654093460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 00:57:13,529][15401] Updated weights for policy 0, policy_version 100950 (0.0038) [2024-06-22 00:57:16,701][15401] Updated weights for policy 0, policy_version 100960 (0.0035) [2024-06-22 00:57:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 1654177792. Throughput: 0: 42544.0. Samples: 1654345800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 00:57:21,376][15401] Updated weights for policy 0, policy_version 100970 (0.0038) [2024-06-22 00:57:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1654390784. Throughput: 0: 42774.6. Samples: 1654481160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 00:57:23,506][15349] Signal inference workers to stop experience collection... (24400 times) [2024-06-22 00:57:23,561][15401] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-06-22 00:57:23,561][15349] Signal inference workers to resume experience collection... (24400 times) [2024-06-22 00:57:23,590][15401] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-06-22 00:57:24,606][15401] Updated weights for policy 0, policy_version 100980 (0.0025) [2024-06-22 00:57:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 1654571008. Throughput: 0: 42671.6. Samples: 1654729340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 00:57:29,088][15401] Updated weights for policy 0, policy_version 100990 (0.0027) [2024-06-22 00:57:32,264][15401] Updated weights for policy 0, policy_version 101000 (0.0041) [2024-06-22 00:57:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1654833152. Throughput: 0: 42487.6. Samples: 1654977200. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 00:57:36,799][15401] Updated weights for policy 0, policy_version 101010 (0.0027) [2024-06-22 00:57:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1655013376. Throughput: 0: 42759.9. Samples: 1655114700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:57:40,017][15401] Updated weights for policy 0, policy_version 101020 (0.0038) [2024-06-22 00:57:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1655209984. Throughput: 0: 42434.0. Samples: 1655360340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 00:57:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101026_1655209984.pth... [2024-06-22 00:57:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100406_1645051904.pth [2024-06-22 00:57:44,523][15401] Updated weights for policy 0, policy_version 101030 (0.0040) [2024-06-22 00:57:47,571][15401] Updated weights for policy 0, policy_version 101040 (0.0028) [2024-06-22 00:57:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 1655455744. Throughput: 0: 42367.5. Samples: 1655612600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 00:57:48,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 00:57:52,192][15401] Updated weights for policy 0, policy_version 101050 (0.0033) [2024-06-22 00:57:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 1655652352. Throughput: 0: 42638.2. Samples: 1655753380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:57:53,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 00:57:55,354][15401] Updated weights for policy 0, policy_version 101060 (0.0033) [2024-06-22 00:57:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42329.9, 300 sec: 42376.6). Total num frames: 1655865344. Throughput: 0: 42373.9. Samples: 1656000280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:57:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 00:57:59,644][15401] Updated weights for policy 0, policy_version 101070 (0.0033) [2024-06-22 00:58:02,847][15401] Updated weights for policy 0, policy_version 101080 (0.0026) [2024-06-22 00:58:03,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 1656111104. Throughput: 0: 42509.3. Samples: 1656258720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 00:58:07,225][15401] Updated weights for policy 0, policy_version 101090 (0.0038) [2024-06-22 00:58:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1656291328. Throughput: 0: 42393.2. Samples: 1656388860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 00:58:10,896][15401] Updated weights for policy 0, policy_version 101100 (0.0045) [2024-06-22 00:58:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1656520704. Throughput: 0: 42390.6. Samples: 1656636920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 00:58:14,890][15401] Updated weights for policy 0, policy_version 101110 (0.0029) [2024-06-22 00:58:18,375][15401] Updated weights for policy 0, policy_version 101120 (0.0030) [2024-06-22 00:58:18,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1656750080. Throughput: 0: 42624.9. Samples: 1656895320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 00:58:22,505][15401] Updated weights for policy 0, policy_version 101130 (0.0043) [2024-06-22 00:58:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1656930304. Throughput: 0: 42493.8. Samples: 1657026920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 00:58:26,085][15401] Updated weights for policy 0, policy_version 101140 (0.0043) [2024-06-22 00:58:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 1657159680. Throughput: 0: 42572.4. Samples: 1657276100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 00:58:30,646][15401] Updated weights for policy 0, policy_version 101150 (0.0030) [2024-06-22 00:58:32,138][15349] Signal inference workers to stop experience collection... (24450 times) [2024-06-22 00:58:32,145][15349] Signal inference workers to resume experience collection... (24450 times) [2024-06-22 00:58:32,152][15401] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-06-22 00:58:32,182][15401] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-06-22 00:58:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1657372672. Throughput: 0: 42591.1. Samples: 1657529100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 00:58:33,670][15401] Updated weights for policy 0, policy_version 101160 (0.0029) [2024-06-22 00:58:38,254][15401] Updated weights for policy 0, policy_version 101170 (0.0033) [2024-06-22 00:58:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1657569280. Throughput: 0: 42351.6. Samples: 1657659100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 00:58:41,188][15401] Updated weights for policy 0, policy_version 101180 (0.0032) [2024-06-22 00:58:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 1657798656. Throughput: 0: 42466.0. Samples: 1657911260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 00:58:46,189][15401] Updated weights for policy 0, policy_version 101190 (0.0039) [2024-06-22 00:58:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1658011648. Throughput: 0: 42516.9. Samples: 1658171980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 00:58:48,392][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 00:58:49,000][15401] Updated weights for policy 0, policy_version 101200 (0.0038) [2024-06-22 00:58:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42327.0, 300 sec: 42376.3). Total num frames: 1658191872. Throughput: 0: 42393.5. Samples: 1658296560. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:58:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 00:58:53,722][15401] Updated weights for policy 0, policy_version 101210 (0.0024) [2024-06-22 00:58:56,653][15401] Updated weights for policy 0, policy_version 101220 (0.0027) [2024-06-22 00:58:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42487.6). Total num frames: 1658437632. Throughput: 0: 42573.8. Samples: 1658552740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:58:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 00:59:01,222][15401] Updated weights for policy 0, policy_version 101230 (0.0033) [2024-06-22 00:59:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1658650624. Throughput: 0: 42599.5. Samples: 1658812300. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 00:59:04,391][15401] Updated weights for policy 0, policy_version 101240 (0.0037) [2024-06-22 00:59:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42487.7). Total num frames: 1658847232. Throughput: 0: 42558.3. Samples: 1658942040. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 00:59:08,979][15401] Updated weights for policy 0, policy_version 101250 (0.0031) [2024-06-22 00:59:12,141][15401] Updated weights for policy 0, policy_version 101260 (0.0036) [2024-06-22 00:59:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1659092992. Throughput: 0: 42651.6. Samples: 1659195420. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 00:59:16,753][15401] Updated weights for policy 0, policy_version 101270 (0.0041) [2024-06-22 00:59:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1659289600. Throughput: 0: 42833.4. Samples: 1659456600. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:18,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 00:59:19,912][15401] Updated weights for policy 0, policy_version 101280 (0.0036) [2024-06-22 00:59:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1659486208. Throughput: 0: 42875.0. Samples: 1659588480. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 00:59:24,241][15401] Updated weights for policy 0, policy_version 101290 (0.0031) [2024-06-22 00:59:27,334][15401] Updated weights for policy 0, policy_version 101300 (0.0030) [2024-06-22 00:59:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1659731968. Throughput: 0: 42960.6. Samples: 1659844480. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 00:59:31,763][15401] Updated weights for policy 0, policy_version 101310 (0.0033) [2024-06-22 00:59:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1659944960. Throughput: 0: 43047.5. Samples: 1660109120. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 00:59:34,777][15401] Updated weights for policy 0, policy_version 101320 (0.0023) [2024-06-22 00:59:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1660141568. Throughput: 0: 43030.7. Samples: 1660232940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 00:59:39,397][15401] Updated weights for policy 0, policy_version 101330 (0.0036) [2024-06-22 00:59:42,442][15401] Updated weights for policy 0, policy_version 101340 (0.0033) [2024-06-22 00:59:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1660403712. Throughput: 0: 43058.2. Samples: 1660490360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:43,390][15132] Avg episode reward: [(0, '0.152')] [2024-06-22 00:59:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101343_1660403712.pth... [2024-06-22 00:59:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000100717_1650147328.pth [2024-06-22 00:59:46,975][15401] Updated weights for policy 0, policy_version 101350 (0.0044) [2024-06-22 00:59:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1660567552. Throughput: 0: 43078.8. Samples: 1660750840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) [2024-06-22 00:59:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 00:59:49,377][15349] Signal inference workers to stop experience collection... (24500 times) [2024-06-22 00:59:49,417][15401] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-06-22 00:59:49,426][15349] Signal inference workers to resume experience collection... (24500 times) [2024-06-22 00:59:49,435][15401] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-06-22 00:59:50,091][15401] Updated weights for policy 0, policy_version 101360 (0.0026) [2024-06-22 00:59:53,389][15132] Fps is (10 sec: 37683.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1660780544. Throughput: 0: 42798.6. Samples: 1660867980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:59:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 00:59:54,671][15401] Updated weights for policy 0, policy_version 101370 (0.0032) [2024-06-22 00:59:57,742][15401] Updated weights for policy 0, policy_version 101380 (0.0023) [2024-06-22 00:59:58,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1661042688. Throughput: 0: 43072.9. Samples: 1661133700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 00:59:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 01:00:02,125][15401] Updated weights for policy 0, policy_version 101390 (0.0042) [2024-06-22 01:00:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1661222912. Throughput: 0: 43099.9. Samples: 1661396100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 01:00:05,266][15401] Updated weights for policy 0, policy_version 101400 (0.0046) [2024-06-22 01:00:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1661435904. Throughput: 0: 43050.7. Samples: 1661525760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:08,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 01:00:09,942][15401] Updated weights for policy 0, policy_version 101410 (0.0034) [2024-06-22 01:00:12,749][15401] Updated weights for policy 0, policy_version 101420 (0.0040) [2024-06-22 01:00:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1661681664. Throughput: 0: 43146.2. Samples: 1661786060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 01:00:17,397][15401] Updated weights for policy 0, policy_version 101430 (0.0031) [2024-06-22 01:00:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1661845504. Throughput: 0: 43103.1. Samples: 1662048760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:18,394][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 01:00:20,487][15401] Updated weights for policy 0, policy_version 101440 (0.0031) [2024-06-22 01:00:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 1662091264. Throughput: 0: 43104.0. Samples: 1662172620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 01:00:24,871][15401] Updated weights for policy 0, policy_version 101450 (0.0026) [2024-06-22 01:00:28,370][15401] Updated weights for policy 0, policy_version 101460 (0.0026) [2024-06-22 01:00:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1662320640. Throughput: 0: 43034.7. Samples: 1662426920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 01:00:32,693][15401] Updated weights for policy 0, policy_version 101470 (0.0045) [2024-06-22 01:00:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1662517248. Throughput: 0: 43186.7. Samples: 1662694240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 01:00:35,922][15401] Updated weights for policy 0, policy_version 101480 (0.0046) [2024-06-22 01:00:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1662746624. Throughput: 0: 43171.0. Samples: 1662810680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 01:00:40,235][15401] Updated weights for policy 0, policy_version 101490 (0.0037) [2024-06-22 01:00:43,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 1662959616. Throughput: 0: 43041.2. Samples: 1663070660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:43,393][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 01:00:43,703][15401] Updated weights for policy 0, policy_version 101500 (0.0029) [2024-06-22 01:00:47,811][15401] Updated weights for policy 0, policy_version 101510 (0.0028) [2024-06-22 01:00:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1663156224. Throughput: 0: 43068.4. Samples: 1663334180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 01:00:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 01:00:51,270][15401] Updated weights for policy 0, policy_version 101520 (0.0029) [2024-06-22 01:00:53,396][15132] Fps is (10 sec: 42581.4, 60 sec: 43412.9, 300 sec: 42764.1). Total num frames: 1663385600. Throughput: 0: 42906.8. Samples: 1663456840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:00:53,397][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 01:00:55,536][15401] Updated weights for policy 0, policy_version 101530 (0.0028) [2024-06-22 01:00:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1663598592. Throughput: 0: 42930.7. Samples: 1663717940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:00:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 01:00:58,709][15401] Updated weights for policy 0, policy_version 101540 (0.0042) [2024-06-22 01:01:02,991][15401] Updated weights for policy 0, policy_version 101550 (0.0025) [2024-06-22 01:01:03,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1663795200. Throughput: 0: 42770.3. Samples: 1663973420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 01:01:06,451][15401] Updated weights for policy 0, policy_version 101560 (0.0033) [2024-06-22 01:01:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1664024576. Throughput: 0: 42855.9. Samples: 1664101140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 01:01:10,633][15401] Updated weights for policy 0, policy_version 101570 (0.0032) [2024-06-22 01:01:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1664237568. Throughput: 0: 42947.9. Samples: 1664359580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 01:01:14,294][15401] Updated weights for policy 0, policy_version 101580 (0.0035) [2024-06-22 01:01:18,358][15401] Updated weights for policy 0, policy_version 101590 (0.0038) [2024-06-22 01:01:18,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1664450560. Throughput: 0: 42678.2. Samples: 1664614760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 01:01:19,997][15349] Signal inference workers to stop experience collection... (24550 times) [2024-06-22 01:01:20,036][15401] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-06-22 01:01:20,045][15349] Signal inference workers to resume experience collection... (24550 times) [2024-06-22 01:01:20,050][15401] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-06-22 01:01:21,807][15401] Updated weights for policy 0, policy_version 101600 (0.0045) [2024-06-22 01:01:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1664647168. Throughput: 0: 42889.9. Samples: 1664740720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:23,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 01:01:25,987][15401] Updated weights for policy 0, policy_version 101610 (0.0038) [2024-06-22 01:01:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1664876544. Throughput: 0: 42772.5. Samples: 1664995320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 01:01:29,390][15401] Updated weights for policy 0, policy_version 101620 (0.0048) [2024-06-22 01:01:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1665073152. Throughput: 0: 42535.5. Samples: 1665248280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 01:01:33,558][15401] Updated weights for policy 0, policy_version 101630 (0.0027) [2024-06-22 01:01:36,901][15401] Updated weights for policy 0, policy_version 101640 (0.0034) [2024-06-22 01:01:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1665286144. Throughput: 0: 42636.9. Samples: 1665375220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 01:01:41,096][15401] Updated weights for policy 0, policy_version 101650 (0.0040) [2024-06-22 01:01:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1665515520. Throughput: 0: 42675.9. Samples: 1665638360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 01:01:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101656_1665531904.pth... [2024-06-22 01:01:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101026_1655209984.pth [2024-06-22 01:01:45,339][15401] Updated weights for policy 0, policy_version 101660 (0.0035) [2024-06-22 01:01:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1665728512. Throughput: 0: 42458.6. Samples: 1665884060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 01:01:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 01:01:48,893][15401] Updated weights for policy 0, policy_version 101670 (0.0037) [2024-06-22 01:01:52,885][15401] Updated weights for policy 0, policy_version 101680 (0.0035) [2024-06-22 01:01:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.9, 300 sec: 42765.9). Total num frames: 1665941504. Throughput: 0: 42592.5. Samples: 1666017800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:01:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 01:01:56,502][15401] Updated weights for policy 0, policy_version 101690 (0.0033) [2024-06-22 01:01:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1666138112. Throughput: 0: 42557.3. Samples: 1666274660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:01:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 01:02:00,835][15401] Updated weights for policy 0, policy_version 101700 (0.0026) [2024-06-22 01:02:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1666383872. Throughput: 0: 42359.9. Samples: 1666520960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 01:02:04,061][15401] Updated weights for policy 0, policy_version 101710 (0.0032) [2024-06-22 01:02:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1666564096. Throughput: 0: 42586.3. Samples: 1666657100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 01:02:08,421][15401] Updated weights for policy 0, policy_version 101720 (0.0034) [2024-06-22 01:02:11,006][15349] Signal inference workers to stop experience collection... (24600 times) [2024-06-22 01:02:11,007][15349] Signal inference workers to resume experience collection... (24600 times) [2024-06-22 01:02:11,032][15401] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-06-22 01:02:11,033][15401] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-06-22 01:02:11,624][15401] Updated weights for policy 0, policy_version 101730 (0.0037) [2024-06-22 01:02:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1666777088. Throughput: 0: 42350.4. Samples: 1666901080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 01:02:16,094][15401] Updated weights for policy 0, policy_version 101740 (0.0035) [2024-06-22 01:02:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1666990080. Throughput: 0: 42582.6. Samples: 1667164500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 01:02:19,135][15401] Updated weights for policy 0, policy_version 101750 (0.0040) [2024-06-22 01:02:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1667219456. Throughput: 0: 42622.2. Samples: 1667293220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 01:02:23,655][15401] Updated weights for policy 0, policy_version 101760 (0.0040) [2024-06-22 01:02:27,045][15401] Updated weights for policy 0, policy_version 101770 (0.0032) [2024-06-22 01:02:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1667416064. Throughput: 0: 42376.5. Samples: 1667545300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 01:02:31,351][15401] Updated weights for policy 0, policy_version 101780 (0.0026) [2024-06-22 01:02:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1667629056. Throughput: 0: 42758.4. Samples: 1667808180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 01:02:34,631][15401] Updated weights for policy 0, policy_version 101790 (0.0026) [2024-06-22 01:02:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 1667858432. Throughput: 0: 42691.2. Samples: 1667939000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:38,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 01:02:38,752][15401] Updated weights for policy 0, policy_version 101800 (0.0035) [2024-06-22 01:02:42,403][15401] Updated weights for policy 0, policy_version 101810 (0.0035) [2024-06-22 01:02:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1668071424. Throughput: 0: 42504.6. Samples: 1668187360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 01:02:46,488][15401] Updated weights for policy 0, policy_version 101820 (0.0038) [2024-06-22 01:02:48,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42325.2, 300 sec: 42765.3). Total num frames: 1668268032. Throughput: 0: 42853.2. Samples: 1668449360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 01:02:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 01:02:50,267][15401] Updated weights for policy 0, policy_version 101830 (0.0037) [2024-06-22 01:02:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1668497408. Throughput: 0: 42684.8. Samples: 1668577920. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:02:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 01:02:54,183][15401] Updated weights for policy 0, policy_version 101840 (0.0033) [2024-06-22 01:02:58,121][15401] Updated weights for policy 0, policy_version 101850 (0.0034) [2024-06-22 01:02:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1668710400. Throughput: 0: 42803.6. Samples: 1668827240. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:02:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 01:03:01,988][15401] Updated weights for policy 0, policy_version 101860 (0.0031) [2024-06-22 01:03:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1668923392. Throughput: 0: 42779.1. Samples: 1669089560. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:03,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 01:03:05,634][15401] Updated weights for policy 0, policy_version 101870 (0.0027) [2024-06-22 01:03:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1669136384. Throughput: 0: 42710.2. Samples: 1669215180. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 01:03:09,674][15401] Updated weights for policy 0, policy_version 101880 (0.0022) [2024-06-22 01:03:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1669349376. Throughput: 0: 42690.6. Samples: 1669466380. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 01:03:13,488][15401] Updated weights for policy 0, policy_version 101890 (0.0051) [2024-06-22 01:03:17,862][15401] Updated weights for policy 0, policy_version 101900 (0.0038) [2024-06-22 01:03:18,355][15349] Signal inference workers to stop experience collection... (24650 times) [2024-06-22 01:03:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1669545984. Throughput: 0: 42448.4. Samples: 1669718360. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 01:03:18,404][15401] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-06-22 01:03:18,470][15349] Signal inference workers to resume experience collection... (24650 times) [2024-06-22 01:03:18,471][15401] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-06-22 01:03:21,566][15401] Updated weights for policy 0, policy_version 101910 (0.0029) [2024-06-22 01:03:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1669775360. Throughput: 0: 42460.5. Samples: 1669849620. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:23,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-22 01:03:25,407][15401] Updated weights for policy 0, policy_version 101920 (0.0026) [2024-06-22 01:03:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1669988352. Throughput: 0: 42565.7. Samples: 1670102820. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:28,391][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 01:03:28,912][15401] Updated weights for policy 0, policy_version 101930 (0.0026) [2024-06-22 01:03:32,912][15401] Updated weights for policy 0, policy_version 101940 (0.0032) [2024-06-22 01:03:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1670217728. Throughput: 0: 42557.5. Samples: 1670364440. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 01:03:36,483][15401] Updated weights for policy 0, policy_version 101950 (0.0038) [2024-06-22 01:03:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 1670397952. Throughput: 0: 42638.3. Samples: 1670496640. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 01:03:40,546][15401] Updated weights for policy 0, policy_version 101960 (0.0040) [2024-06-22 01:03:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1670643712. Throughput: 0: 42714.5. Samples: 1670749400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 01:03:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101968_1670643712.pth... [2024-06-22 01:03:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101343_1660403712.pth [2024-06-22 01:03:43,962][15401] Updated weights for policy 0, policy_version 101970 (0.0028) [2024-06-22 01:03:48,283][15401] Updated weights for policy 0, policy_version 101980 (0.0040) [2024-06-22 01:03:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 1670840320. Throughput: 0: 42753.1. Samples: 1671013440. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-06-22 01:03:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 01:03:51,404][15401] Updated weights for policy 0, policy_version 101990 (0.0037) [2024-06-22 01:03:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1671053312. Throughput: 0: 42686.2. Samples: 1671136060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:03:53,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 01:03:55,841][15401] Updated weights for policy 0, policy_version 102000 (0.0037) [2024-06-22 01:03:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1671299072. Throughput: 0: 42790.7. Samples: 1671391960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:03:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 01:03:58,836][15401] Updated weights for policy 0, policy_version 102010 (0.0032) [2024-06-22 01:04:03,395][15132] Fps is (10 sec: 42576.3, 60 sec: 42594.7, 300 sec: 42819.8). Total num frames: 1671479296. Throughput: 0: 43088.2. Samples: 1671657560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:03,395][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 01:04:03,458][15401] Updated weights for policy 0, policy_version 102020 (0.0028) [2024-06-22 01:04:06,895][15401] Updated weights for policy 0, policy_version 102030 (0.0030) [2024-06-22 01:04:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1671708672. Throughput: 0: 42892.9. Samples: 1671779800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 01:04:11,109][15401] Updated weights for policy 0, policy_version 102040 (0.0022) [2024-06-22 01:04:13,390][15132] Fps is (10 sec: 44259.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1671921664. Throughput: 0: 42875.0. Samples: 1672032200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 01:04:14,408][15401] Updated weights for policy 0, policy_version 102050 (0.0036) [2024-06-22 01:04:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1672134656. Throughput: 0: 42874.7. Samples: 1672293800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 01:04:18,679][15401] Updated weights for policy 0, policy_version 102060 (0.0031) [2024-06-22 01:04:22,190][15401] Updated weights for policy 0, policy_version 102070 (0.0034) [2024-06-22 01:04:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1672331264. Throughput: 0: 42651.0. Samples: 1672415940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 01:04:26,717][15401] Updated weights for policy 0, policy_version 102080 (0.0044) [2024-06-22 01:04:27,034][15349] Signal inference workers to stop experience collection... (24700 times) [2024-06-22 01:04:27,034][15349] Signal inference workers to resume experience collection... (24700 times) [2024-06-22 01:04:27,081][15401] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-06-22 01:04:27,081][15401] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-06-22 01:04:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1672560640. Throughput: 0: 42582.8. Samples: 1672665620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:28,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 01:04:30,013][15401] Updated weights for policy 0, policy_version 102090 (0.0022) [2024-06-22 01:04:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1672740864. Throughput: 0: 42599.0. Samples: 1672930400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 01:04:34,322][15401] Updated weights for policy 0, policy_version 102100 (0.0029) [2024-06-22 01:04:37,617][15401] Updated weights for policy 0, policy_version 102110 (0.0045) [2024-06-22 01:04:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1672970240. Throughput: 0: 42613.7. Samples: 1673053680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 01:04:41,928][15401] Updated weights for policy 0, policy_version 102120 (0.0030) [2024-06-22 01:04:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1673183232. Throughput: 0: 42587.2. Samples: 1673308380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 01:04:45,244][15401] Updated weights for policy 0, policy_version 102130 (0.0037) [2024-06-22 01:04:48,391][15132] Fps is (10 sec: 42594.0, 60 sec: 42597.5, 300 sec: 42764.8). Total num frames: 1673396224. Throughput: 0: 42525.2. Samples: 1673571020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:48,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 01:04:49,528][15401] Updated weights for policy 0, policy_version 102140 (0.0031) [2024-06-22 01:04:52,905][15401] Updated weights for policy 0, policy_version 102150 (0.0033) [2024-06-22 01:04:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1673625600. Throughput: 0: 42579.9. Samples: 1673695900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 01:04:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 01:04:57,033][15401] Updated weights for policy 0, policy_version 102160 (0.0034) [2024-06-22 01:04:58,390][15132] Fps is (10 sec: 44242.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1673838592. Throughput: 0: 42717.9. Samples: 1673954500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:04:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 01:05:00,724][15401] Updated weights for policy 0, policy_version 102170 (0.0046) [2024-06-22 01:05:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42875.3, 300 sec: 42765.0). Total num frames: 1674051584. Throughput: 0: 42674.2. Samples: 1674214140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 01:05:04,783][15401] Updated weights for policy 0, policy_version 102180 (0.0029) [2024-06-22 01:05:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1674264576. Throughput: 0: 42783.6. Samples: 1674341200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 01:05:08,450][15401] Updated weights for policy 0, policy_version 102190 (0.0034) [2024-06-22 01:05:12,255][15401] Updated weights for policy 0, policy_version 102200 (0.0029) [2024-06-22 01:05:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1674477568. Throughput: 0: 43005.2. Samples: 1674600860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:13,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-22 01:05:16,113][15401] Updated weights for policy 0, policy_version 102210 (0.0035) [2024-06-22 01:05:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1674690560. Throughput: 0: 42920.4. Samples: 1674861820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 01:05:19,893][15401] Updated weights for policy 0, policy_version 102220 (0.0030) [2024-06-22 01:05:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1674919936. Throughput: 0: 42937.1. Samples: 1674985840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 01:05:23,659][15401] Updated weights for policy 0, policy_version 102230 (0.0044) [2024-06-22 01:05:27,602][15401] Updated weights for policy 0, policy_version 102240 (0.0023) [2024-06-22 01:05:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1675116544. Throughput: 0: 43006.5. Samples: 1675243780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:28,393][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 01:05:31,354][15401] Updated weights for policy 0, policy_version 102250 (0.0035) [2024-06-22 01:05:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 1675329536. Throughput: 0: 42944.1. Samples: 1675503560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:33,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 01:05:35,294][15401] Updated weights for policy 0, policy_version 102260 (0.0042) [2024-06-22 01:05:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 1675558912. Throughput: 0: 42967.2. Samples: 1675629420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 01:05:39,177][15401] Updated weights for policy 0, policy_version 102270 (0.0048) [2024-06-22 01:05:40,668][15349] Signal inference workers to stop experience collection... (24750 times) [2024-06-22 01:05:40,669][15349] Signal inference workers to resume experience collection... (24750 times) [2024-06-22 01:05:40,704][15401] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-06-22 01:05:40,704][15401] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-06-22 01:05:42,745][15401] Updated weights for policy 0, policy_version 102280 (0.0035) [2024-06-22 01:05:43,390][15132] Fps is (10 sec: 44246.8, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 1675771904. Throughput: 0: 42862.0. Samples: 1675883300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 01:05:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102281_1675771904.pth... [2024-06-22 01:05:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101656_1665531904.pth [2024-06-22 01:05:47,203][15401] Updated weights for policy 0, policy_version 102290 (0.0028) [2024-06-22 01:05:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42599.3, 300 sec: 42599.3). Total num frames: 1675952128. Throughput: 0: 42918.7. Samples: 1676145480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 01:05:50,542][15401] Updated weights for policy 0, policy_version 102300 (0.0031) [2024-06-22 01:05:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1676214272. Throughput: 0: 42712.9. Samples: 1676263280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 01:05:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 01:05:55,362][15401] Updated weights for policy 0, policy_version 102310 (0.0033) [2024-06-22 01:05:58,186][15401] Updated weights for policy 0, policy_version 102320 (0.0042) [2024-06-22 01:05:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1676410880. Throughput: 0: 42677.5. Samples: 1676521340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:05:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 01:06:02,861][15401] Updated weights for policy 0, policy_version 102330 (0.0023) [2024-06-22 01:06:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1676591104. Throughput: 0: 42602.2. Samples: 1676778920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 01:06:06,119][15401] Updated weights for policy 0, policy_version 102340 (0.0052) [2024-06-22 01:06:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1676820480. Throughput: 0: 42565.8. Samples: 1676901300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 01:06:10,378][15401] Updated weights for policy 0, policy_version 102350 (0.0042) [2024-06-22 01:06:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1677033472. Throughput: 0: 42554.6. Samples: 1677158640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 01:06:13,754][15401] Updated weights for policy 0, policy_version 102360 (0.0042) [2024-06-22 01:06:17,781][15401] Updated weights for policy 0, policy_version 102370 (0.0032) [2024-06-22 01:06:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1677246464. Throughput: 0: 42601.9. Samples: 1677420540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 01:06:21,362][15401] Updated weights for policy 0, policy_version 102380 (0.0048) [2024-06-22 01:06:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1677459456. Throughput: 0: 42568.5. Samples: 1677545000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 01:06:25,499][15401] Updated weights for policy 0, policy_version 102390 (0.0033) [2024-06-22 01:06:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1677672448. Throughput: 0: 42640.2. Samples: 1677802100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 01:06:29,123][15401] Updated weights for policy 0, policy_version 102400 (0.0029) [2024-06-22 01:06:32,963][15401] Updated weights for policy 0, policy_version 102410 (0.0031) [2024-06-22 01:06:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1677901824. Throughput: 0: 42547.8. Samples: 1678060140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 01:06:36,644][15401] Updated weights for policy 0, policy_version 102420 (0.0036) [2024-06-22 01:06:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1678114816. Throughput: 0: 42827.1. Samples: 1678190500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 01:06:40,606][15401] Updated weights for policy 0, policy_version 102430 (0.0041) [2024-06-22 01:06:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 1678311424. Throughput: 0: 42791.5. Samples: 1678446960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 01:06:44,345][15401] Updated weights for policy 0, policy_version 102440 (0.0028) [2024-06-22 01:06:48,197][15401] Updated weights for policy 0, policy_version 102450 (0.0040) [2024-06-22 01:06:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1678540800. Throughput: 0: 42683.2. Samples: 1678699660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 01:06:51,979][15401] Updated weights for policy 0, policy_version 102460 (0.0039) [2024-06-22 01:06:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1678737408. Throughput: 0: 42916.4. Samples: 1678832540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 01:06:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 01:06:55,908][15349] Signal inference workers to stop experience collection... (24800 times) [2024-06-22 01:06:55,912][15349] Signal inference workers to resume experience collection... (24800 times) [2024-06-22 01:06:55,954][15401] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-06-22 01:06:55,954][15401] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-06-22 01:06:56,089][15401] Updated weights for policy 0, policy_version 102470 (0.0036) [2024-06-22 01:06:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1678966784. Throughput: 0: 42868.1. Samples: 1679087700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:06:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 01:06:59,683][15401] Updated weights for policy 0, policy_version 102480 (0.0031) [2024-06-22 01:07:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1679163392. Throughput: 0: 42700.8. Samples: 1679342080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 01:07:03,601][15401] Updated weights for policy 0, policy_version 102490 (0.0038) [2024-06-22 01:07:07,188][15401] Updated weights for policy 0, policy_version 102500 (0.0036) [2024-06-22 01:07:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1679392768. Throughput: 0: 42673.7. Samples: 1679465320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:08,394][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 01:07:11,491][15401] Updated weights for policy 0, policy_version 102510 (0.0044) [2024-06-22 01:07:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1679605760. Throughput: 0: 42703.6. Samples: 1679723760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 01:07:15,014][15401] Updated weights for policy 0, policy_version 102520 (0.0032) [2024-06-22 01:07:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1679802368. Throughput: 0: 42792.2. Samples: 1679985780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 01:07:19,097][15401] Updated weights for policy 0, policy_version 102530 (0.0035) [2024-06-22 01:07:22,674][15401] Updated weights for policy 0, policy_version 102540 (0.0033) [2024-06-22 01:07:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1680031744. Throughput: 0: 42709.8. Samples: 1680112440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 01:07:26,706][15401] Updated weights for policy 0, policy_version 102550 (0.0034) [2024-06-22 01:07:28,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1680261120. Throughput: 0: 42786.6. Samples: 1680372460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:28,393][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 01:07:30,059][15401] Updated weights for policy 0, policy_version 102560 (0.0038) [2024-06-22 01:07:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1680457728. Throughput: 0: 42971.0. Samples: 1680633360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 01:07:34,343][15401] Updated weights for policy 0, policy_version 102570 (0.0030) [2024-06-22 01:07:37,780][15401] Updated weights for policy 0, policy_version 102580 (0.0037) [2024-06-22 01:07:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1680687104. Throughput: 0: 42704.0. Samples: 1680754220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 01:07:41,975][15401] Updated weights for policy 0, policy_version 102590 (0.0029) [2024-06-22 01:07:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1680900096. Throughput: 0: 42806.8. Samples: 1681014000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 01:07:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102595_1680916480.pth... [2024-06-22 01:07:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000101968_1670643712.pth [2024-06-22 01:07:45,694][15401] Updated weights for policy 0, policy_version 102600 (0.0035) [2024-06-22 01:07:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1681096704. Throughput: 0: 42840.8. Samples: 1681269920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 01:07:49,594][15401] Updated weights for policy 0, policy_version 102610 (0.0032) [2024-06-22 01:07:53,143][15401] Updated weights for policy 0, policy_version 102620 (0.0048) [2024-06-22 01:07:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1681326080. Throughput: 0: 42824.1. Samples: 1681392400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:07:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 01:07:57,319][15401] Updated weights for policy 0, policy_version 102630 (0.0035) [2024-06-22 01:07:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1681539072. Throughput: 0: 42840.5. Samples: 1681651580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:07:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 01:08:00,620][15401] Updated weights for policy 0, policy_version 102640 (0.0027) [2024-06-22 01:08:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1681735680. Throughput: 0: 42682.1. Samples: 1681906480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 01:08:05,379][15401] Updated weights for policy 0, policy_version 102650 (0.0032) [2024-06-22 01:08:08,215][15401] Updated weights for policy 0, policy_version 102660 (0.0029) [2024-06-22 01:08:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1681981440. Throughput: 0: 42612.0. Samples: 1682029980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 01:08:12,875][15401] Updated weights for policy 0, policy_version 102670 (0.0026) [2024-06-22 01:08:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1682161664. Throughput: 0: 42648.5. Samples: 1682291540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 01:08:15,838][15401] Updated weights for policy 0, policy_version 102680 (0.0034) [2024-06-22 01:08:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1682374656. Throughput: 0: 42538.6. Samples: 1682547600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 01:08:20,664][15401] Updated weights for policy 0, policy_version 102690 (0.0028) [2024-06-22 01:08:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1682620416. Throughput: 0: 42633.3. Samples: 1682672720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 01:08:23,632][15401] Updated weights for policy 0, policy_version 102700 (0.0046) [2024-06-22 01:08:27,402][15349] Signal inference workers to stop experience collection... (24850 times) [2024-06-22 01:08:27,403][15349] Signal inference workers to resume experience collection... (24850 times) [2024-06-22 01:08:27,436][15401] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-06-22 01:08:27,437][15401] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-06-22 01:08:28,193][15401] Updated weights for policy 0, policy_version 102710 (0.0032) [2024-06-22 01:08:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 1682800640. Throughput: 0: 42572.8. Samples: 1682929780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 01:08:32,067][15401] Updated weights for policy 0, policy_version 102720 (0.0033) [2024-06-22 01:08:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1683013632. Throughput: 0: 42365.8. Samples: 1683176380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 01:08:35,890][15401] Updated weights for policy 0, policy_version 102730 (0.0023) [2024-06-22 01:08:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1683259392. Throughput: 0: 42473.7. Samples: 1683303720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 01:08:39,570][15401] Updated weights for policy 0, policy_version 102740 (0.0042) [2024-06-22 01:08:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1683439616. Throughput: 0: 42665.3. Samples: 1683571520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 01:08:43,572][15401] Updated weights for policy 0, policy_version 102750 (0.0046) [2024-06-22 01:08:47,068][15401] Updated weights for policy 0, policy_version 102760 (0.0045) [2024-06-22 01:08:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1683652608. Throughput: 0: 42618.8. Samples: 1683824320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 01:08:51,379][15401] Updated weights for policy 0, policy_version 102770 (0.0029) [2024-06-22 01:08:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1683898368. Throughput: 0: 42715.5. Samples: 1683952180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 01:08:53,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 01:08:54,505][15401] Updated weights for policy 0, policy_version 102780 (0.0026) [2024-06-22 01:08:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42654.7). Total num frames: 1684062208. Throughput: 0: 42594.1. Samples: 1684208280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:08:58,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-22 01:08:59,119][15401] Updated weights for policy 0, policy_version 102790 (0.0043) [2024-06-22 01:09:02,209][15401] Updated weights for policy 0, policy_version 102800 (0.0044) [2024-06-22 01:09:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1684275200. Throughput: 0: 42509.5. Samples: 1684460520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 01:09:06,706][15401] Updated weights for policy 0, policy_version 102810 (0.0035) [2024-06-22 01:09:08,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1684537344. Throughput: 0: 42578.3. Samples: 1684588740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 01:09:10,070][15401] Updated weights for policy 0, policy_version 102820 (0.0029) [2024-06-22 01:09:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1684717568. Throughput: 0: 42606.8. Samples: 1684847080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:09:14,254][15401] Updated weights for policy 0, policy_version 102830 (0.0039) [2024-06-22 01:09:17,682][15401] Updated weights for policy 0, policy_version 102840 (0.0037) [2024-06-22 01:09:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1684930560. Throughput: 0: 42698.8. Samples: 1685097820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 01:09:21,841][15401] Updated weights for policy 0, policy_version 102850 (0.0044) [2024-06-22 01:09:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1685143552. Throughput: 0: 42760.0. Samples: 1685227920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 01:09:25,329][15401] Updated weights for policy 0, policy_version 102860 (0.0031) [2024-06-22 01:09:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1685356544. Throughput: 0: 42551.1. Samples: 1685486320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 01:09:29,710][15401] Updated weights for policy 0, policy_version 102870 (0.0033) [2024-06-22 01:09:33,035][15401] Updated weights for policy 0, policy_version 102880 (0.0027) [2024-06-22 01:09:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1685585920. Throughput: 0: 42419.0. Samples: 1685733280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:33,393][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 01:09:37,251][15401] Updated weights for policy 0, policy_version 102890 (0.0049) [2024-06-22 01:09:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 1685782528. Throughput: 0: 42564.4. Samples: 1685867580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 01:09:41,042][15401] Updated weights for policy 0, policy_version 102900 (0.0025) [2024-06-22 01:09:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 1685995520. Throughput: 0: 42674.4. Samples: 1686128620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 01:09:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102906_1686011904.pth... [2024-06-22 01:09:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102281_1675771904.pth [2024-06-22 01:09:44,768][15401] Updated weights for policy 0, policy_version 102910 (0.0022) [2024-06-22 01:09:46,205][15349] Signal inference workers to stop experience collection... (24900 times) [2024-06-22 01:09:46,205][15349] Signal inference workers to resume experience collection... (24900 times) [2024-06-22 01:09:46,246][15401] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-06-22 01:09:46,247][15401] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-06-22 01:09:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1686208512. Throughput: 0: 42537.2. Samples: 1686374700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 01:09:48,977][15401] Updated weights for policy 0, policy_version 102920 (0.0047) [2024-06-22 01:09:52,388][15401] Updated weights for policy 0, policy_version 102930 (0.0033) [2024-06-22 01:09:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1686421504. Throughput: 0: 42623.9. Samples: 1686506820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 01:09:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 01:09:56,627][15401] Updated weights for policy 0, policy_version 102940 (0.0022) [2024-06-22 01:09:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1686634496. Throughput: 0: 42686.1. Samples: 1686767960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:09:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 01:09:59,912][15401] Updated weights for policy 0, policy_version 102950 (0.0036) [2024-06-22 01:10:03,394][15132] Fps is (10 sec: 42580.3, 60 sec: 42868.3, 300 sec: 42653.3). Total num frames: 1686847488. Throughput: 0: 42730.1. Samples: 1687020860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:03,394][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 01:10:04,308][15401] Updated weights for policy 0, policy_version 102960 (0.0040) [2024-06-22 01:10:07,979][15401] Updated weights for policy 0, policy_version 102970 (0.0035) [2024-06-22 01:10:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 1687060480. Throughput: 0: 42666.6. Samples: 1687147920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:08,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 01:10:12,078][15401] Updated weights for policy 0, policy_version 102980 (0.0048) [2024-06-22 01:10:13,390][15132] Fps is (10 sec: 42616.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1687273472. Throughput: 0: 42510.2. Samples: 1687399280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:13,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 01:10:15,722][15401] Updated weights for policy 0, policy_version 102990 (0.0036) [2024-06-22 01:10:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1687486464. Throughput: 0: 42662.7. Samples: 1687653000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 01:10:19,644][15401] Updated weights for policy 0, policy_version 103000 (0.0036) [2024-06-22 01:10:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1687699456. Throughput: 0: 42558.4. Samples: 1687782700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 01:10:23,458][15401] Updated weights for policy 0, policy_version 103010 (0.0033) [2024-06-22 01:10:27,293][15401] Updated weights for policy 0, policy_version 103020 (0.0040) [2024-06-22 01:10:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1687896064. Throughput: 0: 42495.0. Samples: 1688040900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 01:10:31,257][15401] Updated weights for policy 0, policy_version 103030 (0.0034) [2024-06-22 01:10:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 1688125440. Throughput: 0: 42579.1. Samples: 1688290760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 01:10:35,125][15401] Updated weights for policy 0, policy_version 103040 (0.0038) [2024-06-22 01:10:38,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1688338432. Throughput: 0: 42469.8. Samples: 1688418060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:38,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 01:10:38,957][15401] Updated weights for policy 0, policy_version 103050 (0.0048) [2024-06-22 01:10:42,707][15401] Updated weights for policy 0, policy_version 103060 (0.0038) [2024-06-22 01:10:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1688567808. Throughput: 0: 42441.3. Samples: 1688677820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 01:10:46,509][15401] Updated weights for policy 0, policy_version 103070 (0.0023) [2024-06-22 01:10:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1688780800. Throughput: 0: 42488.1. Samples: 1688932640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 01:10:50,307][15401] Updated weights for policy 0, policy_version 103080 (0.0032) [2024-06-22 01:10:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1688993792. Throughput: 0: 42571.5. Samples: 1689063640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 01:10:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 01:10:54,177][15401] Updated weights for policy 0, policy_version 103090 (0.0046) [2024-06-22 01:10:57,966][15401] Updated weights for policy 0, policy_version 103100 (0.0028) [2024-06-22 01:10:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1689206784. Throughput: 0: 42697.3. Samples: 1689320660. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:10:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 01:11:02,162][15401] Updated weights for policy 0, policy_version 103110 (0.0031) [2024-06-22 01:11:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42601.4, 300 sec: 42653.9). Total num frames: 1689403392. Throughput: 0: 42690.2. Samples: 1689574060. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 01:11:05,511][15401] Updated weights for policy 0, policy_version 103120 (0.0026) [2024-06-22 01:11:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1689632768. Throughput: 0: 42652.0. Samples: 1689702040. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 01:11:09,788][15401] Updated weights for policy 0, policy_version 103130 (0.0025) [2024-06-22 01:11:12,346][15349] Signal inference workers to stop experience collection... (24950 times) [2024-06-22 01:11:12,346][15349] Signal inference workers to resume experience collection... (24950 times) [2024-06-22 01:11:12,362][15401] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-06-22 01:11:12,362][15401] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-06-22 01:11:13,232][15401] Updated weights for policy 0, policy_version 103140 (0.0037) [2024-06-22 01:11:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1689845760. Throughput: 0: 42694.8. Samples: 1689962160. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:13,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-22 01:11:17,404][15401] Updated weights for policy 0, policy_version 103150 (0.0031) [2024-06-22 01:11:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1690058752. Throughput: 0: 42901.3. Samples: 1690221320. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:18,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 01:11:21,057][15401] Updated weights for policy 0, policy_version 103160 (0.0045) [2024-06-22 01:11:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1690271744. Throughput: 0: 42872.6. Samples: 1690347220. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 01:11:24,987][15401] Updated weights for policy 0, policy_version 103170 (0.0032) [2024-06-22 01:11:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1690468352. Throughput: 0: 42762.7. Samples: 1690602140. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:11:28,590][15401] Updated weights for policy 0, policy_version 103180 (0.0040) [2024-06-22 01:11:32,399][15401] Updated weights for policy 0, policy_version 103190 (0.0034) [2024-06-22 01:11:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1690681344. Throughput: 0: 42919.5. Samples: 1690864020. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 01:11:36,261][15401] Updated weights for policy 0, policy_version 103200 (0.0029) [2024-06-22 01:11:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1690927104. Throughput: 0: 42821.3. Samples: 1690990600. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 01:11:40,315][15401] Updated weights for policy 0, policy_version 103210 (0.0037) [2024-06-22 01:11:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1691123712. Throughput: 0: 42809.9. Samples: 1691247100. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:43,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 01:11:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103218_1691123712.pth... [2024-06-22 01:11:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102595_1680916480.pth [2024-06-22 01:11:43,909][15401] Updated weights for policy 0, policy_version 103220 (0.0036) [2024-06-22 01:11:47,924][15401] Updated weights for policy 0, policy_version 103230 (0.0035) [2024-06-22 01:11:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1691336704. Throughput: 0: 42965.9. Samples: 1691507520. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:48,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-22 01:11:51,602][15401] Updated weights for policy 0, policy_version 103240 (0.0047) [2024-06-22 01:11:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1691566080. Throughput: 0: 42940.0. Samples: 1691634340. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 01:11:55,402][15401] Updated weights for policy 0, policy_version 103250 (0.0042) [2024-06-22 01:11:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1691762688. Throughput: 0: 42881.7. Samples: 1691891840. Policy #0 lag: (min: 2.0, avg: 10.5, max: 23.0) [2024-06-22 01:11:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:11:59,067][15401] Updated weights for policy 0, policy_version 103260 (0.0040) [2024-06-22 01:12:02,869][15401] Updated weights for policy 0, policy_version 103270 (0.0031) [2024-06-22 01:12:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1691975680. Throughput: 0: 42766.2. Samples: 1692145900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:03,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 01:12:06,631][15401] Updated weights for policy 0, policy_version 103280 (0.0042) [2024-06-22 01:12:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1692205056. Throughput: 0: 42907.8. Samples: 1692278080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 01:12:10,436][15401] Updated weights for policy 0, policy_version 103290 (0.0032) [2024-06-22 01:12:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1692401664. Throughput: 0: 43014.1. Samples: 1692537780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 01:12:14,224][15401] Updated weights for policy 0, policy_version 103300 (0.0036) [2024-06-22 01:12:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1692614656. Throughput: 0: 42761.4. Samples: 1692788280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:18,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 01:12:18,500][15401] Updated weights for policy 0, policy_version 103310 (0.0039) [2024-06-22 01:12:22,038][15401] Updated weights for policy 0, policy_version 103320 (0.0028) [2024-06-22 01:12:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1692844032. Throughput: 0: 42922.7. Samples: 1692922120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 01:12:26,025][15401] Updated weights for policy 0, policy_version 103330 (0.0042) [2024-06-22 01:12:28,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1693040640. Throughput: 0: 42914.0. Samples: 1693178240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:28,391][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 01:12:29,490][15401] Updated weights for policy 0, policy_version 103340 (0.0034) [2024-06-22 01:12:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1693270016. Throughput: 0: 42578.2. Samples: 1693423540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:33,395][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 01:12:33,532][15401] Updated weights for policy 0, policy_version 103350 (0.0040) [2024-06-22 01:12:37,296][15401] Updated weights for policy 0, policy_version 103360 (0.0033) [2024-06-22 01:12:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1693483008. Throughput: 0: 42765.7. Samples: 1693558800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 01:12:41,177][15401] Updated weights for policy 0, policy_version 103370 (0.0034) [2024-06-22 01:12:43,255][15349] Signal inference workers to stop experience collection... (25000 times) [2024-06-22 01:12:43,257][15349] Signal inference workers to resume experience collection... (25000 times) [2024-06-22 01:12:43,271][15401] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-06-22 01:12:43,298][15401] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-06-22 01:12:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1693679616. Throughput: 0: 42871.6. Samples: 1693821060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 01:12:45,001][15401] Updated weights for policy 0, policy_version 103380 (0.0025) [2024-06-22 01:12:48,391][15132] Fps is (10 sec: 44229.6, 60 sec: 43143.3, 300 sec: 42709.2). Total num frames: 1693925376. Throughput: 0: 42911.8. Samples: 1694076900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:48,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 01:12:48,598][15401] Updated weights for policy 0, policy_version 103390 (0.0030) [2024-06-22 01:12:52,824][15401] Updated weights for policy 0, policy_version 103400 (0.0038) [2024-06-22 01:12:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1694138368. Throughput: 0: 42804.2. Samples: 1694204260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 01:12:56,034][15401] Updated weights for policy 0, policy_version 103410 (0.0034) [2024-06-22 01:12:58,389][15132] Fps is (10 sec: 40967.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1694334976. Throughput: 0: 42877.0. Samples: 1694467240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:12:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 01:13:00,427][15401] Updated weights for policy 0, policy_version 103420 (0.0047) [2024-06-22 01:13:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 1694580736. Throughput: 0: 42917.8. Samples: 1694719580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 01:13:03,524][15401] Updated weights for policy 0, policy_version 103430 (0.0031) [2024-06-22 01:13:08,088][15401] Updated weights for policy 0, policy_version 103440 (0.0036) [2024-06-22 01:13:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1694760960. Throughput: 0: 42990.2. Samples: 1694856680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 01:13:11,511][15401] Updated weights for policy 0, policy_version 103450 (0.0034) [2024-06-22 01:13:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1694973952. Throughput: 0: 42765.2. Samples: 1695102660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 01:13:15,790][15401] Updated weights for policy 0, policy_version 103460 (0.0026) [2024-06-22 01:13:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1695203328. Throughput: 0: 43000.6. Samples: 1695358560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 01:13:19,089][15401] Updated weights for policy 0, policy_version 103470 (0.0043) [2024-06-22 01:13:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1695399936. Throughput: 0: 42971.5. Samples: 1695492520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:23,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 01:13:23,732][15401] Updated weights for policy 0, policy_version 103480 (0.0035) [2024-06-22 01:13:26,629][15401] Updated weights for policy 0, policy_version 103490 (0.0033) [2024-06-22 01:13:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1695596544. Throughput: 0: 42607.4. Samples: 1695738400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 01:13:31,390][15401] Updated weights for policy 0, policy_version 103500 (0.0025) [2024-06-22 01:13:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1695842304. Throughput: 0: 42642.5. Samples: 1695995740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 01:13:34,829][15401] Updated weights for policy 0, policy_version 103510 (0.0035) [2024-06-22 01:13:38,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1696055296. Throughput: 0: 42899.5. Samples: 1696134740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 01:13:39,150][15401] Updated weights for policy 0, policy_version 103520 (0.0033) [2024-06-22 01:13:42,551][15401] Updated weights for policy 0, policy_version 103530 (0.0029) [2024-06-22 01:13:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1696251904. Throughput: 0: 42508.8. Samples: 1696380140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 01:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103531_1696251904.pth... [2024-06-22 01:13:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000102906_1686011904.pth [2024-06-22 01:13:47,174][15401] Updated weights for policy 0, policy_version 103540 (0.0031) [2024-06-22 01:13:47,618][15349] Signal inference workers to stop experience collection... (25050 times) [2024-06-22 01:13:47,621][15349] Signal inference workers to resume experience collection... (25050 times) [2024-06-22 01:13:47,660][15401] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-06-22 01:13:47,660][15401] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-06-22 01:13:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42599.5, 300 sec: 42653.9). Total num frames: 1696481280. Throughput: 0: 42520.3. Samples: 1696633000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 01:13:50,685][15401] Updated weights for policy 0, policy_version 103550 (0.0031) [2024-06-22 01:13:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1696694272. Throughput: 0: 42319.2. Samples: 1696761040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 01:13:54,862][15401] Updated weights for policy 0, policy_version 103560 (0.0023) [2024-06-22 01:13:58,213][15401] Updated weights for policy 0, policy_version 103570 (0.0043) [2024-06-22 01:13:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1696890880. Throughput: 0: 42441.8. Samples: 1697012540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 01:13:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 01:14:02,356][15401] Updated weights for policy 0, policy_version 103580 (0.0042) [2024-06-22 01:14:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1697103872. Throughput: 0: 42623.1. Samples: 1697276600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 01:14:05,685][15401] Updated weights for policy 0, policy_version 103590 (0.0036) [2024-06-22 01:14:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1697316864. Throughput: 0: 42390.4. Samples: 1697400080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:08,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 01:14:10,310][15401] Updated weights for policy 0, policy_version 103600 (0.0041) [2024-06-22 01:14:13,299][15401] Updated weights for policy 0, policy_version 103610 (0.0030) [2024-06-22 01:14:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1697546240. Throughput: 0: 42563.3. Samples: 1697653740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 01:14:17,763][15401] Updated weights for policy 0, policy_version 103620 (0.0031) [2024-06-22 01:14:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1697742848. Throughput: 0: 42658.2. Samples: 1697915360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 01:14:20,885][15401] Updated weights for policy 0, policy_version 103630 (0.0034) [2024-06-22 01:14:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1697972224. Throughput: 0: 42382.6. Samples: 1698042060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:23,393][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 01:14:25,230][15401] Updated weights for policy 0, policy_version 103640 (0.0035) [2024-06-22 01:14:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 1698185216. Throughput: 0: 42570.7. Samples: 1698295820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:28,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 01:14:28,488][15401] Updated weights for policy 0, policy_version 103650 (0.0037) [2024-06-22 01:14:32,791][15401] Updated weights for policy 0, policy_version 103660 (0.0041) [2024-06-22 01:14:33,392][15132] Fps is (10 sec: 40961.2, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1698381824. Throughput: 0: 42693.2. Samples: 1698554280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:33,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 01:14:36,402][15401] Updated weights for policy 0, policy_version 103670 (0.0027) [2024-06-22 01:14:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1698611200. Throughput: 0: 42731.4. Samples: 1698684060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:38,393][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 01:14:40,254][15401] Updated weights for policy 0, policy_version 103680 (0.0032) [2024-06-22 01:14:43,390][15132] Fps is (10 sec: 44246.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1698824192. Throughput: 0: 42758.5. Samples: 1698936680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 01:14:44,036][15401] Updated weights for policy 0, policy_version 103690 (0.0031) [2024-06-22 01:14:47,760][15401] Updated weights for policy 0, policy_version 103700 (0.0031) [2024-06-22 01:14:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1699037184. Throughput: 0: 42617.2. Samples: 1699194380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 01:14:51,554][15401] Updated weights for policy 0, policy_version 103710 (0.0049) [2024-06-22 01:14:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1699233792. Throughput: 0: 42776.4. Samples: 1699325020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 01:14:55,509][15401] Updated weights for policy 0, policy_version 103720 (0.0031) [2024-06-22 01:14:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42765.6). Total num frames: 1699463168. Throughput: 0: 42811.0. Samples: 1699580240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 01:14:58,395][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 01:14:59,451][15401] Updated weights for policy 0, policy_version 103730 (0.0042) [2024-06-22 01:15:03,282][15401] Updated weights for policy 0, policy_version 103740 (0.0030) [2024-06-22 01:15:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1699676160. Throughput: 0: 42778.2. Samples: 1699840380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:03,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 01:15:07,102][15401] Updated weights for policy 0, policy_version 103750 (0.0031) [2024-06-22 01:15:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1699889152. Throughput: 0: 42810.4. Samples: 1699968420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 01:15:08,403][15349] Signal inference workers to stop experience collection... (25100 times) [2024-06-22 01:15:08,403][15349] Signal inference workers to resume experience collection... (25100 times) [2024-06-22 01:15:08,421][15401] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-06-22 01:15:08,422][15401] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-06-22 01:15:10,954][15401] Updated weights for policy 0, policy_version 103760 (0.0038) [2024-06-22 01:15:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1700102144. Throughput: 0: 42768.5. Samples: 1700220400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:13,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 01:15:14,722][15401] Updated weights for policy 0, policy_version 103770 (0.0032) [2024-06-22 01:15:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1700315136. Throughput: 0: 42757.7. Samples: 1700478280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 01:15:18,483][15401] Updated weights for policy 0, policy_version 103780 (0.0040) [2024-06-22 01:15:22,201][15401] Updated weights for policy 0, policy_version 103790 (0.0032) [2024-06-22 01:15:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 1700511744. Throughput: 0: 42791.5. Samples: 1700609580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 01:15:26,118][15401] Updated weights for policy 0, policy_version 103800 (0.0028) [2024-06-22 01:15:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1700741120. Throughput: 0: 42732.0. Samples: 1700859620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 01:15:30,237][15401] Updated weights for policy 0, policy_version 103810 (0.0037) [2024-06-22 01:15:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.0, 300 sec: 42820.9). Total num frames: 1700970496. Throughput: 0: 42878.7. Samples: 1701123920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 01:15:33,820][15401] Updated weights for policy 0, policy_version 103820 (0.0051) [2024-06-22 01:15:37,639][15401] Updated weights for policy 0, policy_version 103830 (0.0033) [2024-06-22 01:15:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 1701150720. Throughput: 0: 42852.2. Samples: 1701253480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:38,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 01:15:41,350][15401] Updated weights for policy 0, policy_version 103840 (0.0041) [2024-06-22 01:15:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1701396480. Throughput: 0: 42805.0. Samples: 1701506460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 01:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103845_1701396480.pth... [2024-06-22 01:15:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103218_1691123712.pth [2024-06-22 01:15:45,502][15401] Updated weights for policy 0, policy_version 103850 (0.0038) [2024-06-22 01:15:48,392][15132] Fps is (10 sec: 45875.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1701609472. Throughput: 0: 42779.4. Samples: 1701765560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:48,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 01:15:48,938][15401] Updated weights for policy 0, policy_version 103860 (0.0038) [2024-06-22 01:15:52,994][15401] Updated weights for policy 0, policy_version 103870 (0.0043) [2024-06-22 01:15:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1701822464. Throughput: 0: 42807.9. Samples: 1701894780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 01:15:56,552][15401] Updated weights for policy 0, policy_version 103880 (0.0031) [2024-06-22 01:15:58,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1702035456. Throughput: 0: 42839.6. Samples: 1702148180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 01:15:58,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 01:16:00,554][15401] Updated weights for policy 0, policy_version 103890 (0.0048) [2024-06-22 01:16:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1702232064. Throughput: 0: 42931.5. Samples: 1702410200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 01:16:04,331][15401] Updated weights for policy 0, policy_version 103900 (0.0030) [2024-06-22 01:16:08,101][15401] Updated weights for policy 0, policy_version 103910 (0.0029) [2024-06-22 01:16:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1702461440. Throughput: 0: 42705.5. Samples: 1702531320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 01:16:11,871][15401] Updated weights for policy 0, policy_version 103920 (0.0036) [2024-06-22 01:16:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1702674432. Throughput: 0: 42974.7. Samples: 1702793580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:13,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 01:16:15,717][15401] Updated weights for policy 0, policy_version 103930 (0.0032) [2024-06-22 01:16:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1702871040. Throughput: 0: 42988.1. Samples: 1703058380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 01:16:19,515][15401] Updated weights for policy 0, policy_version 103940 (0.0027) [2024-06-22 01:16:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1703100416. Throughput: 0: 42842.3. Samples: 1703181280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 01:16:23,609][15401] Updated weights for policy 0, policy_version 103950 (0.0038) [2024-06-22 01:16:27,054][15401] Updated weights for policy 0, policy_version 103960 (0.0033) [2024-06-22 01:16:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1703297024. Throughput: 0: 42707.2. Samples: 1703428280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 01:16:31,274][15401] Updated weights for policy 0, policy_version 103970 (0.0036) [2024-06-22 01:16:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1703510016. Throughput: 0: 42882.6. Samples: 1703695180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 01:16:34,854][15401] Updated weights for policy 0, policy_version 103980 (0.0041) [2024-06-22 01:16:37,600][15349] Signal inference workers to stop experience collection... (25150 times) [2024-06-22 01:16:37,602][15349] Signal inference workers to resume experience collection... (25150 times) [2024-06-22 01:16:37,627][15401] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-06-22 01:16:37,627][15401] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-06-22 01:16:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 1703739392. Throughput: 0: 42674.3. Samples: 1703815120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 01:16:38,717][15401] Updated weights for policy 0, policy_version 103990 (0.0041) [2024-06-22 01:16:42,258][15401] Updated weights for policy 0, policy_version 104000 (0.0029) [2024-06-22 01:16:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1703968768. Throughput: 0: 42852.4. Samples: 1704076540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:43,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 01:16:46,269][15401] Updated weights for policy 0, policy_version 104010 (0.0033) [2024-06-22 01:16:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 1704148992. Throughput: 0: 42836.9. Samples: 1704337860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 01:16:49,907][15401] Updated weights for policy 0, policy_version 104020 (0.0030) [2024-06-22 01:16:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1704394752. Throughput: 0: 42982.5. Samples: 1704465540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 01:16:53,865][15401] Updated weights for policy 0, policy_version 104030 (0.0039) [2024-06-22 01:16:57,645][15401] Updated weights for policy 0, policy_version 104040 (0.0034) [2024-06-22 01:16:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1704607744. Throughput: 0: 42767.1. Samples: 1704718000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 01:16:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 01:17:02,270][15401] Updated weights for policy 0, policy_version 104050 (0.0039) [2024-06-22 01:17:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1704804352. Throughput: 0: 42601.8. Samples: 1704975460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:03,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 01:17:05,462][15401] Updated weights for policy 0, policy_version 104060 (0.0033) [2024-06-22 01:17:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1705017344. Throughput: 0: 42680.9. Samples: 1705101920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 01:17:10,037][15401] Updated weights for policy 0, policy_version 104070 (0.0027) [2024-06-22 01:17:13,049][15401] Updated weights for policy 0, policy_version 104080 (0.0031) [2024-06-22 01:17:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 1705263104. Throughput: 0: 42845.7. Samples: 1705356340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 01:17:17,653][15401] Updated weights for policy 0, policy_version 104090 (0.0030) [2024-06-22 01:17:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1705426944. Throughput: 0: 42752.5. Samples: 1705619140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:18,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 01:17:20,694][15401] Updated weights for policy 0, policy_version 104100 (0.0037) [2024-06-22 01:17:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1705672704. Throughput: 0: 42747.5. Samples: 1705738760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 01:17:25,273][15401] Updated weights for policy 0, policy_version 104110 (0.0039) [2024-06-22 01:17:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1705869312. Throughput: 0: 42660.5. Samples: 1705996260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 01:17:28,670][15401] Updated weights for policy 0, policy_version 104120 (0.0035) [2024-06-22 01:17:32,699][15401] Updated weights for policy 0, policy_version 104130 (0.0031) [2024-06-22 01:17:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1706065920. Throughput: 0: 42614.6. Samples: 1706255520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 01:17:36,213][15401] Updated weights for policy 0, policy_version 104140 (0.0028) [2024-06-22 01:17:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1706295296. Throughput: 0: 42470.3. Samples: 1706376700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 01:17:40,720][15401] Updated weights for policy 0, policy_version 104150 (0.0026) [2024-06-22 01:17:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 1706524672. Throughput: 0: 42592.5. Samples: 1706634660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 01:17:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104158_1706524672.pth... [2024-06-22 01:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103531_1696251904.pth [2024-06-22 01:17:43,961][15401] Updated weights for policy 0, policy_version 104160 (0.0034) [2024-06-22 01:17:48,189][15401] Updated weights for policy 0, policy_version 104170 (0.0030) [2024-06-22 01:17:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1706721280. Throughput: 0: 42538.1. Samples: 1706889680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 01:17:51,862][15401] Updated weights for policy 0, policy_version 104180 (0.0040) [2024-06-22 01:17:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 1706934272. Throughput: 0: 42620.5. Samples: 1707019940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:53,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 01:17:55,961][15401] Updated weights for policy 0, policy_version 104190 (0.0030) [2024-06-22 01:17:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1707147264. Throughput: 0: 42661.3. Samples: 1707276100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 01:17:58,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-22 01:17:59,342][15401] Updated weights for policy 0, policy_version 104200 (0.0042) [2024-06-22 01:18:03,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1707360256. Throughput: 0: 42583.2. Samples: 1707535280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:18:03,583][15401] Updated weights for policy 0, policy_version 104210 (0.0033) [2024-06-22 01:18:04,930][15349] Signal inference workers to stop experience collection... (25200 times) [2024-06-22 01:18:04,981][15401] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-06-22 01:18:05,045][15349] Signal inference workers to resume experience collection... (25200 times) [2024-06-22 01:18:05,045][15401] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-06-22 01:18:06,794][15401] Updated weights for policy 0, policy_version 104220 (0.0028) [2024-06-22 01:18:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 1707556864. Throughput: 0: 42661.1. Samples: 1707658500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 01:18:11,183][15401] Updated weights for policy 0, policy_version 104230 (0.0032) [2024-06-22 01:18:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1707786240. Throughput: 0: 42681.7. Samples: 1707916940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 01:18:14,534][15401] Updated weights for policy 0, policy_version 104240 (0.0035) [2024-06-22 01:18:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1707999232. Throughput: 0: 42727.0. Samples: 1708178240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 01:18:18,836][15401] Updated weights for policy 0, policy_version 104250 (0.0040) [2024-06-22 01:18:22,378][15401] Updated weights for policy 0, policy_version 104260 (0.0028) [2024-06-22 01:18:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1708228608. Throughput: 0: 42815.5. Samples: 1708303400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:23,399][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 01:18:26,643][15401] Updated weights for policy 0, policy_version 104270 (0.0023) [2024-06-22 01:18:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1708425216. Throughput: 0: 42712.1. Samples: 1708556700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 01:18:30,015][15401] Updated weights for policy 0, policy_version 104280 (0.0029) [2024-06-22 01:18:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 1708654592. Throughput: 0: 42680.9. Samples: 1708810420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:33,393][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 01:18:34,238][15401] Updated weights for policy 0, policy_version 104290 (0.0041) [2024-06-22 01:18:37,612][15401] Updated weights for policy 0, policy_version 104300 (0.0029) [2024-06-22 01:18:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1708867584. Throughput: 0: 42739.2. Samples: 1708943100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 01:18:41,977][15401] Updated weights for policy 0, policy_version 104310 (0.0029) [2024-06-22 01:18:43,391][15132] Fps is (10 sec: 42600.7, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 1709080576. Throughput: 0: 42904.8. Samples: 1709206900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:43,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 01:18:45,709][15401] Updated weights for policy 0, policy_version 104320 (0.0039) [2024-06-22 01:18:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1709277184. Throughput: 0: 42816.4. Samples: 1709462020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 01:18:49,539][15401] Updated weights for policy 0, policy_version 104330 (0.0029) [2024-06-22 01:18:53,121][15401] Updated weights for policy 0, policy_version 104340 (0.0036) [2024-06-22 01:18:53,390][15132] Fps is (10 sec: 42606.0, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1709506560. Throughput: 0: 42862.0. Samples: 1709587300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:53,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 01:18:57,055][15401] Updated weights for policy 0, policy_version 104350 (0.0031) [2024-06-22 01:18:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1709735936. Throughput: 0: 42872.5. Samples: 1709846200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 01:18:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 01:19:00,642][15401] Updated weights for policy 0, policy_version 104360 (0.0030) [2024-06-22 01:19:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1709948928. Throughput: 0: 42766.3. Samples: 1710102720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 01:19:04,619][15401] Updated weights for policy 0, policy_version 104370 (0.0038) [2024-06-22 01:19:08,289][15401] Updated weights for policy 0, policy_version 104380 (0.0036) [2024-06-22 01:19:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1710161920. Throughput: 0: 42810.3. Samples: 1710229860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 01:19:12,071][15401] Updated weights for policy 0, policy_version 104390 (0.0027) [2024-06-22 01:19:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1710342144. Throughput: 0: 42808.9. Samples: 1710483100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 01:19:16,279][15401] Updated weights for policy 0, policy_version 104400 (0.0037) [2024-06-22 01:19:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 1710587904. Throughput: 0: 42971.6. Samples: 1710744040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 01:19:19,666][15401] Updated weights for policy 0, policy_version 104410 (0.0028) [2024-06-22 01:19:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1710784512. Throughput: 0: 42899.9. Samples: 1710873600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 01:19:23,846][15349] Signal inference workers to stop experience collection... (25250 times) [2024-06-22 01:19:23,847][15349] Signal inference workers to resume experience collection... (25250 times) [2024-06-22 01:19:23,863][15401] Updated weights for policy 0, policy_version 104420 (0.0037) [2024-06-22 01:19:23,890][15401] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-06-22 01:19:23,890][15401] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-06-22 01:19:27,466][15401] Updated weights for policy 0, policy_version 104430 (0.0035) [2024-06-22 01:19:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 1710997504. Throughput: 0: 42650.2. Samples: 1711126080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 01:19:31,397][15401] Updated weights for policy 0, policy_version 104440 (0.0022) [2024-06-22 01:19:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.3, 300 sec: 42765.4). Total num frames: 1711226880. Throughput: 0: 42680.5. Samples: 1711382640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 01:19:35,470][15401] Updated weights for policy 0, policy_version 104450 (0.0038) [2024-06-22 01:19:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1711439872. Throughput: 0: 42811.3. Samples: 1711513800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 01:19:38,893][15401] Updated weights for policy 0, policy_version 104460 (0.0035) [2024-06-22 01:19:43,001][15401] Updated weights for policy 0, policy_version 104470 (0.0037) [2024-06-22 01:19:43,390][15132] Fps is (10 sec: 40956.0, 60 sec: 42599.1, 300 sec: 42709.4). Total num frames: 1711636480. Throughput: 0: 42673.8. Samples: 1711766560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:43,391][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 01:19:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104470_1711636480.pth... [2024-06-22 01:19:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000103845_1701396480.pth [2024-06-22 01:19:46,427][15401] Updated weights for policy 0, policy_version 104480 (0.0025) [2024-06-22 01:19:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1711865856. Throughput: 0: 42677.7. Samples: 1712023220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 01:19:50,610][15401] Updated weights for policy 0, policy_version 104490 (0.0042) [2024-06-22 01:19:53,389][15132] Fps is (10 sec: 44240.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1712078848. Throughput: 0: 42788.5. Samples: 1712155340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 01:19:53,943][15401] Updated weights for policy 0, policy_version 104500 (0.0040) [2024-06-22 01:19:58,190][15401] Updated weights for policy 0, policy_version 104510 (0.0034) [2024-06-22 01:19:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1712291840. Throughput: 0: 42770.1. Samples: 1712407760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:19:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 01:20:01,632][15401] Updated weights for policy 0, policy_version 104520 (0.0040) [2024-06-22 01:20:03,391][15132] Fps is (10 sec: 39316.8, 60 sec: 42051.4, 300 sec: 42653.8). Total num frames: 1712472064. Throughput: 0: 42709.1. Samples: 1712666000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 01:20:03,391][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 01:20:05,975][15401] Updated weights for policy 0, policy_version 104530 (0.0035) [2024-06-22 01:20:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1712717824. Throughput: 0: 42588.1. Samples: 1712790060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:08,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 01:20:09,569][15401] Updated weights for policy 0, policy_version 104540 (0.0027) [2024-06-22 01:20:13,389][15132] Fps is (10 sec: 44242.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1712914432. Throughput: 0: 42581.0. Samples: 1713042220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 01:20:13,836][15401] Updated weights for policy 0, policy_version 104550 (0.0038) [2024-06-22 01:20:17,193][15401] Updated weights for policy 0, policy_version 104560 (0.0036) [2024-06-22 01:20:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1713127424. Throughput: 0: 42618.2. Samples: 1713300460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:20:21,247][15401] Updated weights for policy 0, policy_version 104570 (0.0037) [2024-06-22 01:20:23,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1713356800. Throughput: 0: 42564.8. Samples: 1713429320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:23,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 01:20:24,550][15401] Updated weights for policy 0, policy_version 104580 (0.0030) [2024-06-22 01:20:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1713569792. Throughput: 0: 42894.7. Samples: 1713696780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 01:20:28,718][15401] Updated weights for policy 0, policy_version 104590 (0.0036) [2024-06-22 01:20:32,028][15401] Updated weights for policy 0, policy_version 104600 (0.0046) [2024-06-22 01:20:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 1713766400. Throughput: 0: 42864.5. Samples: 1713952120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:33,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 01:20:36,326][15401] Updated weights for policy 0, policy_version 104610 (0.0039) [2024-06-22 01:20:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1713995776. Throughput: 0: 42703.6. Samples: 1714077000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 01:20:40,092][15401] Updated weights for policy 0, policy_version 104620 (0.0023) [2024-06-22 01:20:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42872.0, 300 sec: 42709.8). Total num frames: 1714208768. Throughput: 0: 42922.1. Samples: 1714339260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:43,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 01:20:44,079][15401] Updated weights for policy 0, policy_version 104630 (0.0036) [2024-06-22 01:20:46,599][15349] Signal inference workers to stop experience collection... (25300 times) [2024-06-22 01:20:46,604][15349] Signal inference workers to resume experience collection... (25300 times) [2024-06-22 01:20:46,640][15401] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-06-22 01:20:46,640][15401] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-06-22 01:20:47,631][15401] Updated weights for policy 0, policy_version 104640 (0.0036) [2024-06-22 01:20:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1714438144. Throughput: 0: 42763.8. Samples: 1714590320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 01:20:51,874][15401] Updated weights for policy 0, policy_version 104650 (0.0034) [2024-06-22 01:20:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1714634752. Throughput: 0: 42942.9. Samples: 1714722500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 01:20:55,244][15401] Updated weights for policy 0, policy_version 104660 (0.0033) [2024-06-22 01:20:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1714864128. Throughput: 0: 43031.5. Samples: 1714978640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:20:58,399][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 01:20:59,489][15401] Updated weights for policy 0, policy_version 104670 (0.0038) [2024-06-22 01:21:03,053][15401] Updated weights for policy 0, policy_version 104680 (0.0028) [2024-06-22 01:21:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43418.4, 300 sec: 42765.0). Total num frames: 1715077120. Throughput: 0: 42871.0. Samples: 1715229660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 01:21:03,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 01:21:07,000][15401] Updated weights for policy 0, policy_version 104690 (0.0028) [2024-06-22 01:21:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1715273728. Throughput: 0: 42919.7. Samples: 1715360600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 01:21:10,831][15401] Updated weights for policy 0, policy_version 104700 (0.0031) [2024-06-22 01:21:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1715503104. Throughput: 0: 42703.5. Samples: 1715618440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:13,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 01:21:14,825][15401] Updated weights for policy 0, policy_version 104710 (0.0032) [2024-06-22 01:21:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1715716096. Throughput: 0: 42738.1. Samples: 1715875340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:18,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 01:21:18,609][15401] Updated weights for policy 0, policy_version 104720 (0.0040) [2024-06-22 01:21:22,261][15401] Updated weights for policy 0, policy_version 104730 (0.0035) [2024-06-22 01:21:23,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42599.8, 300 sec: 42764.9). Total num frames: 1715912704. Throughput: 0: 42886.7. Samples: 1716006920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:23,391][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 01:21:26,056][15401] Updated weights for policy 0, policy_version 104740 (0.0039) [2024-06-22 01:21:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1716142080. Throughput: 0: 42705.0. Samples: 1716260980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 01:21:29,773][15401] Updated weights for policy 0, policy_version 104750 (0.0036) [2024-06-22 01:21:33,389][15132] Fps is (10 sec: 44238.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1716355072. Throughput: 0: 42931.6. Samples: 1716522240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 01:21:33,614][15401] Updated weights for policy 0, policy_version 104760 (0.0041) [2024-06-22 01:21:37,569][15401] Updated weights for policy 0, policy_version 104770 (0.0036) [2024-06-22 01:21:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1716568064. Throughput: 0: 42929.5. Samples: 1716654320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 01:21:41,146][15401] Updated weights for policy 0, policy_version 104780 (0.0046) [2024-06-22 01:21:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1716781056. Throughput: 0: 42946.1. Samples: 1716911220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:43,391][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 01:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104784_1716781056.pth... [2024-06-22 01:21:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104158_1706524672.pth [2024-06-22 01:21:45,214][15401] Updated weights for policy 0, policy_version 104790 (0.0042) [2024-06-22 01:21:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1717010432. Throughput: 0: 43173.9. Samples: 1717172480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 01:21:48,833][15401] Updated weights for policy 0, policy_version 104800 (0.0032) [2024-06-22 01:21:52,975][15401] Updated weights for policy 0, policy_version 104810 (0.0038) [2024-06-22 01:21:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1717207040. Throughput: 0: 43016.4. Samples: 1717296340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:53,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 01:21:56,547][15401] Updated weights for policy 0, policy_version 104820 (0.0025) [2024-06-22 01:21:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1717420032. Throughput: 0: 42943.1. Samples: 1717550880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:21:58,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 01:22:00,611][15401] Updated weights for policy 0, policy_version 104830 (0.0041) [2024-06-22 01:22:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1717633024. Throughput: 0: 42801.0. Samples: 1717801380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 01:22:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 01:22:04,271][15401] Updated weights for policy 0, policy_version 104840 (0.0027) [2024-06-22 01:22:06,651][15349] Signal inference workers to stop experience collection... (25350 times) [2024-06-22 01:22:06,695][15401] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-06-22 01:22:06,704][15349] Signal inference workers to resume experience collection... (25350 times) [2024-06-22 01:22:06,710][15401] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-06-22 01:22:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1717846016. Throughput: 0: 42768.2. Samples: 1717931480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 01:22:08,622][15401] Updated weights for policy 0, policy_version 104850 (0.0028) [2024-06-22 01:22:12,082][15401] Updated weights for policy 0, policy_version 104860 (0.0043) [2024-06-22 01:22:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1718075392. Throughput: 0: 42675.7. Samples: 1718181380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 01:22:16,411][15401] Updated weights for policy 0, policy_version 104870 (0.0031) [2024-06-22 01:22:18,395][15132] Fps is (10 sec: 40937.3, 60 sec: 42321.3, 300 sec: 42653.1). Total num frames: 1718255616. Throughput: 0: 42593.7. Samples: 1718439200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:18,395][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 01:22:19,700][15401] Updated weights for policy 0, policy_version 104880 (0.0042) [2024-06-22 01:22:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 1718484992. Throughput: 0: 42451.4. Samples: 1718564640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 01:22:24,039][15401] Updated weights for policy 0, policy_version 104890 (0.0040) [2024-06-22 01:22:27,557][15401] Updated weights for policy 0, policy_version 104900 (0.0030) [2024-06-22 01:22:28,390][15132] Fps is (10 sec: 47540.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1718730752. Throughput: 0: 42370.2. Samples: 1718817880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:28,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 01:22:31,754][15401] Updated weights for policy 0, policy_version 104910 (0.0036) [2024-06-22 01:22:33,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 1718894592. Throughput: 0: 42233.7. Samples: 1719073100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:33,393][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 01:22:35,237][15401] Updated weights for policy 0, policy_version 104920 (0.0033) [2024-06-22 01:22:38,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 1719107584. Throughput: 0: 42210.2. Samples: 1719195900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:38,393][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 01:22:39,526][15401] Updated weights for policy 0, policy_version 104930 (0.0040) [2024-06-22 01:22:43,017][15401] Updated weights for policy 0, policy_version 104940 (0.0032) [2024-06-22 01:22:43,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1719353344. Throughput: 0: 42278.3. Samples: 1719453400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:43,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 01:22:47,415][15401] Updated weights for policy 0, policy_version 104950 (0.0024) [2024-06-22 01:22:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 1719549952. Throughput: 0: 42351.9. Samples: 1719707220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 01:22:50,594][15401] Updated weights for policy 0, policy_version 104960 (0.0036) [2024-06-22 01:22:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1719746560. Throughput: 0: 42373.8. Samples: 1719838300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 01:22:54,915][15401] Updated weights for policy 0, policy_version 104970 (0.0037) [2024-06-22 01:22:58,203][15401] Updated weights for policy 0, policy_version 104980 (0.0032) [2024-06-22 01:22:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1719992320. Throughput: 0: 42661.3. Samples: 1720101140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:22:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 01:23:02,493][15401] Updated weights for policy 0, policy_version 104990 (0.0036) [2024-06-22 01:23:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1720188928. Throughput: 0: 42561.9. Samples: 1720354240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 01:23:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 01:23:05,796][15401] Updated weights for policy 0, policy_version 105000 (0.0027) [2024-06-22 01:23:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1720401920. Throughput: 0: 42651.5. Samples: 1720483960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 01:23:10,107][15401] Updated weights for policy 0, policy_version 105010 (0.0037) [2024-06-22 01:23:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1720614912. Throughput: 0: 42787.3. Samples: 1720743300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 01:23:13,993][15401] Updated weights for policy 0, policy_version 105020 (0.0032) [2024-06-22 01:23:17,653][15401] Updated weights for policy 0, policy_version 105030 (0.0045) [2024-06-22 01:23:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42875.6, 300 sec: 42709.5). Total num frames: 1720827904. Throughput: 0: 42702.3. Samples: 1720994600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:23:21,728][15401] Updated weights for policy 0, policy_version 105040 (0.0034) [2024-06-22 01:23:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1721057280. Throughput: 0: 42930.2. Samples: 1721127660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 01:23:25,364][15401] Updated weights for policy 0, policy_version 105050 (0.0036) [2024-06-22 01:23:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 1721253888. Throughput: 0: 42774.2. Samples: 1721378240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 01:23:29,509][15401] Updated weights for policy 0, policy_version 105060 (0.0028) [2024-06-22 01:23:32,911][15401] Updated weights for policy 0, policy_version 105070 (0.0030) [2024-06-22 01:23:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 1721466880. Throughput: 0: 42859.6. Samples: 1721635900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 01:23:35,014][15349] Signal inference workers to stop experience collection... (25400 times) [2024-06-22 01:23:35,014][15349] Signal inference workers to resume experience collection... (25400 times) [2024-06-22 01:23:35,030][15401] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-06-22 01:23:35,030][15401] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-06-22 01:23:37,050][15401] Updated weights for policy 0, policy_version 105080 (0.0028) [2024-06-22 01:23:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42765.3). Total num frames: 1721696256. Throughput: 0: 42885.3. Samples: 1721768140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 01:23:40,711][15401] Updated weights for policy 0, policy_version 105090 (0.0027) [2024-06-22 01:23:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1721892864. Throughput: 0: 42725.9. Samples: 1722023800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 01:23:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105096_1721892864.pth... [2024-06-22 01:23:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104470_1711636480.pth [2024-06-22 01:23:44,735][15401] Updated weights for policy 0, policy_version 105100 (0.0030) [2024-06-22 01:23:48,382][15401] Updated weights for policy 0, policy_version 105110 (0.0027) [2024-06-22 01:23:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1722122240. Throughput: 0: 42856.4. Samples: 1722282780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 01:23:52,322][15401] Updated weights for policy 0, policy_version 105120 (0.0034) [2024-06-22 01:23:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1722335232. Throughput: 0: 42792.5. Samples: 1722409620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 01:23:55,944][15401] Updated weights for policy 0, policy_version 105130 (0.0038) [2024-06-22 01:23:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1722531840. Throughput: 0: 42625.7. Samples: 1722661460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:23:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 01:24:00,121][15401] Updated weights for policy 0, policy_version 105140 (0.0042) [2024-06-22 01:24:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1722761216. Throughput: 0: 42736.4. Samples: 1722917740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 01:24:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 01:24:03,530][15401] Updated weights for policy 0, policy_version 105150 (0.0029) [2024-06-22 01:24:07,726][15401] Updated weights for policy 0, policy_version 105160 (0.0027) [2024-06-22 01:24:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1722957824. Throughput: 0: 42710.3. Samples: 1723049620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 01:24:11,018][15401] Updated weights for policy 0, policy_version 105170 (0.0032) [2024-06-22 01:24:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1723187200. Throughput: 0: 42820.8. Samples: 1723305180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 01:24:15,252][15401] Updated weights for policy 0, policy_version 105180 (0.0034) [2024-06-22 01:24:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1723400192. Throughput: 0: 42763.1. Samples: 1723560240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:24:18,568][15401] Updated weights for policy 0, policy_version 105190 (0.0029) [2024-06-22 01:24:22,835][15401] Updated weights for policy 0, policy_version 105200 (0.0039) [2024-06-22 01:24:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1723596800. Throughput: 0: 42748.6. Samples: 1723691820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 01:24:26,301][15401] Updated weights for policy 0, policy_version 105210 (0.0040) [2024-06-22 01:24:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1723842560. Throughput: 0: 42672.3. Samples: 1723944060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 01:24:30,774][15401] Updated weights for policy 0, policy_version 105220 (0.0034) [2024-06-22 01:24:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1724039168. Throughput: 0: 42571.9. Samples: 1724198520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 01:24:34,490][15401] Updated weights for policy 0, policy_version 105230 (0.0026) [2024-06-22 01:24:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 1724235776. Throughput: 0: 42466.6. Samples: 1724320620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 01:24:38,939][15401] Updated weights for policy 0, policy_version 105240 (0.0040) [2024-06-22 01:24:42,095][15401] Updated weights for policy 0, policy_version 105250 (0.0030) [2024-06-22 01:24:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1724481536. Throughput: 0: 42632.9. Samples: 1724579940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 01:24:46,423][15401] Updated weights for policy 0, policy_version 105260 (0.0038) [2024-06-22 01:24:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1724678144. Throughput: 0: 42618.2. Samples: 1724835560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 01:24:49,966][15401] Updated weights for policy 0, policy_version 105270 (0.0029) [2024-06-22 01:24:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1724858368. Throughput: 0: 42469.0. Samples: 1724960720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 01:24:53,947][15401] Updated weights for policy 0, policy_version 105280 (0.0033) [2024-06-22 01:24:57,519][15401] Updated weights for policy 0, policy_version 105290 (0.0031) [2024-06-22 01:24:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 1725120512. Throughput: 0: 42734.6. Samples: 1725228240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:24:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 01:25:01,423][15401] Updated weights for policy 0, policy_version 105300 (0.0033) [2024-06-22 01:25:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1725317120. Throughput: 0: 42627.9. Samples: 1725478500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 01:25:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 01:25:05,113][15401] Updated weights for policy 0, policy_version 105310 (0.0047) [2024-06-22 01:25:08,389][15132] Fps is (10 sec: 37684.1, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 1725497344. Throughput: 0: 42484.6. Samples: 1725603620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:08,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 01:25:09,214][15401] Updated weights for policy 0, policy_version 105320 (0.0042) [2024-06-22 01:25:12,631][15401] Updated weights for policy 0, policy_version 105330 (0.0032) [2024-06-22 01:25:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1725743104. Throughput: 0: 42712.9. Samples: 1725866140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 01:25:16,864][15349] Signal inference workers to stop experience collection... (25450 times) [2024-06-22 01:25:16,914][15349] Signal inference workers to resume experience collection... (25450 times) [2024-06-22 01:25:16,914][15401] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-06-22 01:25:16,932][15401] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-06-22 01:25:17,093][15401] Updated weights for policy 0, policy_version 105340 (0.0034) [2024-06-22 01:25:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1725956096. Throughput: 0: 42772.0. Samples: 1726123260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:18,391][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 01:25:20,177][15401] Updated weights for policy 0, policy_version 105350 (0.0037) [2024-06-22 01:25:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1726169088. Throughput: 0: 42960.6. Samples: 1726253840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:23,391][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 01:25:24,533][15401] Updated weights for policy 0, policy_version 105360 (0.0039) [2024-06-22 01:25:27,570][15401] Updated weights for policy 0, policy_version 105370 (0.0033) [2024-06-22 01:25:28,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1726382080. Throughput: 0: 42808.7. Samples: 1726506340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 01:25:32,001][15401] Updated weights for policy 0, policy_version 105380 (0.0040) [2024-06-22 01:25:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1726611456. Throughput: 0: 43099.2. Samples: 1726775020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:33,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 01:25:34,869][15401] Updated weights for policy 0, policy_version 105390 (0.0032) [2024-06-22 01:25:38,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1726808064. Throughput: 0: 43161.8. Samples: 1726903000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 01:25:39,455][15401] Updated weights for policy 0, policy_version 105400 (0.0043) [2024-06-22 01:25:42,464][15401] Updated weights for policy 0, policy_version 105410 (0.0038) [2024-06-22 01:25:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1727037440. Throughput: 0: 42836.5. Samples: 1727155880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:43,394][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 01:25:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105411_1727053824.pth... [2024-06-22 01:25:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000104784_1716781056.pth [2024-06-22 01:25:47,180][15401] Updated weights for policy 0, policy_version 105420 (0.0054) [2024-06-22 01:25:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1727250432. Throughput: 0: 43122.7. Samples: 1727419020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 01:25:50,294][15401] Updated weights for policy 0, policy_version 105430 (0.0035) [2024-06-22 01:25:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 1727479808. Throughput: 0: 43236.8. Samples: 1727549280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 01:25:54,532][15401] Updated weights for policy 0, policy_version 105440 (0.0027) [2024-06-22 01:25:57,745][15401] Updated weights for policy 0, policy_version 105450 (0.0026) [2024-06-22 01:25:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1727692800. Throughput: 0: 43148.1. Samples: 1727807800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:25:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 01:26:02,290][15401] Updated weights for policy 0, policy_version 105460 (0.0039) [2024-06-22 01:26:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1727905792. Throughput: 0: 43117.8. Samples: 1728063560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-22 01:26:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 01:26:05,419][15401] Updated weights for policy 0, policy_version 105470 (0.0048) [2024-06-22 01:26:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1728102400. Throughput: 0: 43104.0. Samples: 1728193520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 01:26:09,670][15401] Updated weights for policy 0, policy_version 105480 (0.0030) [2024-06-22 01:26:13,215][15401] Updated weights for policy 0, policy_version 105490 (0.0035) [2024-06-22 01:26:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1728348160. Throughput: 0: 43190.1. Samples: 1728449880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:13,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 01:26:17,453][15401] Updated weights for policy 0, policy_version 105500 (0.0040) [2024-06-22 01:26:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1728561152. Throughput: 0: 42872.3. Samples: 1728704280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 01:26:20,862][15401] Updated weights for policy 0, policy_version 105510 (0.0032) [2024-06-22 01:26:23,396][15132] Fps is (10 sec: 37659.0, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 1728724992. Throughput: 0: 42869.4. Samples: 1728832400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:23,396][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 01:26:25,195][15401] Updated weights for policy 0, policy_version 105520 (0.0031) [2024-06-22 01:26:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 1728970752. Throughput: 0: 42882.8. Samples: 1729085600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 01:26:28,641][15401] Updated weights for policy 0, policy_version 105530 (0.0036) [2024-06-22 01:26:32,715][15401] Updated weights for policy 0, policy_version 105540 (0.0044) [2024-06-22 01:26:33,389][15132] Fps is (10 sec: 45905.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1729183744. Throughput: 0: 42842.8. Samples: 1729346940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:33,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 01:26:34,258][15349] Signal inference workers to stop experience collection... (25500 times) [2024-06-22 01:26:34,261][15349] Signal inference workers to resume experience collection... (25500 times) [2024-06-22 01:26:34,290][15401] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-06-22 01:26:34,290][15401] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-06-22 01:26:36,305][15401] Updated weights for policy 0, policy_version 105550 (0.0038) [2024-06-22 01:26:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1729363968. Throughput: 0: 42769.4. Samples: 1729473900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 01:26:40,227][15401] Updated weights for policy 0, policy_version 105560 (0.0032) [2024-06-22 01:26:43,389][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1729626112. Throughput: 0: 42729.7. Samples: 1729730640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 01:26:43,763][15401] Updated weights for policy 0, policy_version 105570 (0.0032) [2024-06-22 01:26:47,962][15401] Updated weights for policy 0, policy_version 105580 (0.0038) [2024-06-22 01:26:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1729822720. Throughput: 0: 42787.1. Samples: 1729988980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:26:51,313][15401] Updated weights for policy 0, policy_version 105590 (0.0046) [2024-06-22 01:26:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1730019328. Throughput: 0: 42687.1. Samples: 1730114440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:53,394][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 01:26:55,619][15401] Updated weights for policy 0, policy_version 105600 (0.0031) [2024-06-22 01:26:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1730281472. Throughput: 0: 42862.0. Samples: 1730378680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:26:58,391][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 01:26:59,058][15401] Updated weights for policy 0, policy_version 105610 (0.0043) [2024-06-22 01:27:03,324][15401] Updated weights for policy 0, policy_version 105620 (0.0027) [2024-06-22 01:27:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1730478080. Throughput: 0: 42976.0. Samples: 1730638200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:27:03,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 01:27:06,495][15401] Updated weights for policy 0, policy_version 105630 (0.0032) [2024-06-22 01:27:08,390][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1730658304. Throughput: 0: 42823.4. Samples: 1730759180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 01:27:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 01:27:10,945][15401] Updated weights for policy 0, policy_version 105640 (0.0040) [2024-06-22 01:27:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42932.5). Total num frames: 1730920448. Throughput: 0: 43048.8. Samples: 1731022800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:13,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 01:27:14,042][15401] Updated weights for policy 0, policy_version 105650 (0.0042) [2024-06-22 01:27:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1731117056. Throughput: 0: 43053.2. Samples: 1731284340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:18,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 01:27:18,666][15401] Updated weights for policy 0, policy_version 105660 (0.0042) [2024-06-22 01:27:22,141][15401] Updated weights for policy 0, policy_version 105670 (0.0037) [2024-06-22 01:27:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43149.1, 300 sec: 42653.9). Total num frames: 1731313664. Throughput: 0: 42889.3. Samples: 1731403920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:23,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 01:27:26,565][15401] Updated weights for policy 0, policy_version 105680 (0.0036) [2024-06-22 01:27:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 1731559424. Throughput: 0: 43054.6. Samples: 1731668100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 01:27:29,882][15401] Updated weights for policy 0, policy_version 105690 (0.0036) [2024-06-22 01:27:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 1731756032. Throughput: 0: 42961.8. Samples: 1731922260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 01:27:34,412][15401] Updated weights for policy 0, policy_version 105700 (0.0037) [2024-06-22 01:27:37,345][15401] Updated weights for policy 0, policy_version 105710 (0.0041) [2024-06-22 01:27:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1731969024. Throughput: 0: 42880.8. Samples: 1732044080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 01:27:41,946][15401] Updated weights for policy 0, policy_version 105720 (0.0042) [2024-06-22 01:27:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1732182016. Throughput: 0: 42870.8. Samples: 1732307860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:43,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 01:27:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105725_1732198400.pth... [2024-06-22 01:27:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105096_1721892864.pth [2024-06-22 01:27:44,952][15401] Updated weights for policy 0, policy_version 105730 (0.0043) [2024-06-22 01:27:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1732378624. Throughput: 0: 42692.5. Samples: 1732559360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 01:27:49,557][15401] Updated weights for policy 0, policy_version 105740 (0.0028) [2024-06-22 01:27:52,596][15401] Updated weights for policy 0, policy_version 105750 (0.0035) [2024-06-22 01:27:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1732608000. Throughput: 0: 42802.8. Samples: 1732685300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 01:27:57,226][15401] Updated weights for policy 0, policy_version 105760 (0.0025) [2024-06-22 01:27:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 1732820992. Throughput: 0: 42648.8. Samples: 1732942000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:27:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 01:27:59,427][15349] Signal inference workers to stop experience collection... (25550 times) [2024-06-22 01:27:59,477][15401] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-06-22 01:27:59,486][15349] Signal inference workers to resume experience collection... (25550 times) [2024-06-22 01:27:59,496][15401] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-06-22 01:28:00,848][15401] Updated weights for policy 0, policy_version 105770 (0.0048) [2024-06-22 01:28:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1733017600. Throughput: 0: 42410.7. Samples: 1733192820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:28:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 01:28:05,105][15401] Updated weights for policy 0, policy_version 105780 (0.0034) [2024-06-22 01:28:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1733246976. Throughput: 0: 42476.6. Samples: 1733315360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 01:28:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 01:28:08,563][15401] Updated weights for policy 0, policy_version 105790 (0.0028) [2024-06-22 01:28:12,855][15401] Updated weights for policy 0, policy_version 105800 (0.0037) [2024-06-22 01:28:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1733459968. Throughput: 0: 42372.5. Samples: 1733574860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 01:28:16,382][15401] Updated weights for policy 0, policy_version 105810 (0.0050) [2024-06-22 01:28:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1733656576. Throughput: 0: 42343.6. Samples: 1733827720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 01:28:20,575][15401] Updated weights for policy 0, policy_version 105820 (0.0033) [2024-06-22 01:28:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1733869568. Throughput: 0: 42400.9. Samples: 1733952120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 01:28:23,865][15401] Updated weights for policy 0, policy_version 105830 (0.0036) [2024-06-22 01:28:28,132][15401] Updated weights for policy 0, policy_version 105840 (0.0029) [2024-06-22 01:28:28,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 1734098944. Throughput: 0: 42417.3. Samples: 1734216740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:28,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 01:28:31,470][15401] Updated weights for policy 0, policy_version 105850 (0.0035) [2024-06-22 01:28:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1734295552. Throughput: 0: 42411.9. Samples: 1734467900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 01:28:35,717][15401] Updated weights for policy 0, policy_version 105860 (0.0037) [2024-06-22 01:28:38,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1734508544. Throughput: 0: 42348.4. Samples: 1734590980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 01:28:39,674][15401] Updated weights for policy 0, policy_version 105870 (0.0026) [2024-06-22 01:28:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1734737920. Throughput: 0: 42485.5. Samples: 1734853840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 01:28:43,395][15401] Updated weights for policy 0, policy_version 105880 (0.0035) [2024-06-22 01:28:47,349][15401] Updated weights for policy 0, policy_version 105890 (0.0028) [2024-06-22 01:28:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1734934528. Throughput: 0: 42524.4. Samples: 1735106420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 01:28:50,926][15401] Updated weights for policy 0, policy_version 105900 (0.0030) [2024-06-22 01:28:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1735147520. Throughput: 0: 42599.6. Samples: 1735232340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 01:28:54,936][15401] Updated weights for policy 0, policy_version 105910 (0.0034) [2024-06-22 01:28:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1735376896. Throughput: 0: 42515.9. Samples: 1735488080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:28:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 01:28:58,590][15401] Updated weights for policy 0, policy_version 105920 (0.0033) [2024-06-22 01:29:02,588][15401] Updated weights for policy 0, policy_version 105930 (0.0026) [2024-06-22 01:29:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1735573504. Throughput: 0: 42563.1. Samples: 1735743060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:29:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 01:29:06,107][15401] Updated weights for policy 0, policy_version 105940 (0.0042) [2024-06-22 01:29:08,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1735786496. Throughput: 0: 42589.2. Samples: 1735868620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 01:29:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 01:29:10,677][15401] Updated weights for policy 0, policy_version 105950 (0.0031) [2024-06-22 01:29:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1735999488. Throughput: 0: 42306.7. Samples: 1736120440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 01:29:14,456][15401] Updated weights for policy 0, policy_version 105960 (0.0034) [2024-06-22 01:29:17,731][15349] Signal inference workers to stop experience collection... (25600 times) [2024-06-22 01:29:17,732][15349] Signal inference workers to resume experience collection... (25600 times) [2024-06-22 01:29:17,750][15401] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-06-22 01:29:17,750][15401] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-06-22 01:29:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1736196096. Throughput: 0: 42620.9. Samples: 1736385840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 01:29:18,430][15401] Updated weights for policy 0, policy_version 105970 (0.0041) [2024-06-22 01:29:21,937][15401] Updated weights for policy 0, policy_version 105980 (0.0029) [2024-06-22 01:29:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1736425472. Throughput: 0: 42599.9. Samples: 1736508080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:23,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 01:29:26,032][15401] Updated weights for policy 0, policy_version 105990 (0.0043) [2024-06-22 01:29:28,390][15132] Fps is (10 sec: 45871.5, 60 sec: 42599.5, 300 sec: 42764.9). Total num frames: 1736654848. Throughput: 0: 42359.6. Samples: 1736760060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:28,391][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 01:29:29,388][15401] Updated weights for policy 0, policy_version 106000 (0.0033) [2024-06-22 01:29:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1736835072. Throughput: 0: 42764.5. Samples: 1737030820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 01:29:33,606][15401] Updated weights for policy 0, policy_version 106010 (0.0030) [2024-06-22 01:29:36,994][15401] Updated weights for policy 0, policy_version 106020 (0.0030) [2024-06-22 01:29:38,390][15132] Fps is (10 sec: 40963.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1737064448. Throughput: 0: 42540.8. Samples: 1737146680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 01:29:41,210][15401] Updated weights for policy 0, policy_version 106030 (0.0041) [2024-06-22 01:29:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1737293824. Throughput: 0: 42532.4. Samples: 1737402040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 01:29:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106036_1737293824.pth... [2024-06-22 01:29:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105411_1727053824.pth [2024-06-22 01:29:44,594][15401] Updated weights for policy 0, policy_version 106040 (0.0034) [2024-06-22 01:29:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1737474048. Throughput: 0: 42763.0. Samples: 1737667400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 01:29:49,009][15401] Updated weights for policy 0, policy_version 106050 (0.0039) [2024-06-22 01:29:52,454][15401] Updated weights for policy 0, policy_version 106060 (0.0030) [2024-06-22 01:29:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1737703424. Throughput: 0: 42632.3. Samples: 1737787080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 01:29:56,747][15401] Updated weights for policy 0, policy_version 106070 (0.0023) [2024-06-22 01:29:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1737932800. Throughput: 0: 42768.9. Samples: 1738045040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:29:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 01:29:59,931][15401] Updated weights for policy 0, policy_version 106080 (0.0023) [2024-06-22 01:30:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1738096640. Throughput: 0: 42713.0. Samples: 1738307920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:30:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 01:30:04,385][15401] Updated weights for policy 0, policy_version 106090 (0.0037) [2024-06-22 01:30:08,037][15401] Updated weights for policy 0, policy_version 106100 (0.0047) [2024-06-22 01:30:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1738358784. Throughput: 0: 42670.7. Samples: 1738428160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 01:30:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 01:30:12,035][15401] Updated weights for policy 0, policy_version 106110 (0.0036) [2024-06-22 01:30:13,395][15132] Fps is (10 sec: 49123.8, 60 sec: 43140.4, 300 sec: 42819.7). Total num frames: 1738588160. Throughput: 0: 42799.4. Samples: 1738686240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:13,396][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:30:15,664][15401] Updated weights for policy 0, policy_version 106120 (0.0035) [2024-06-22 01:30:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1738752000. Throughput: 0: 42516.3. Samples: 1738944060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:18,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 01:30:19,823][15401] Updated weights for policy 0, policy_version 106130 (0.0023) [2024-06-22 01:30:23,282][15401] Updated weights for policy 0, policy_version 106140 (0.0041) [2024-06-22 01:30:23,389][15132] Fps is (10 sec: 40983.5, 60 sec: 42873.3, 300 sec: 42765.1). Total num frames: 1738997760. Throughput: 0: 42597.9. Samples: 1739063580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 01:30:27,239][15401] Updated weights for policy 0, policy_version 106150 (0.0037) [2024-06-22 01:30:28,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42599.1, 300 sec: 42709.5). Total num frames: 1739210752. Throughput: 0: 42850.0. Samples: 1739330280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 01:30:30,877][15401] Updated weights for policy 0, policy_version 106160 (0.0039) [2024-06-22 01:30:33,040][15349] Signal inference workers to stop experience collection... (25650 times) [2024-06-22 01:30:33,040][15349] Signal inference workers to resume experience collection... (25650 times) [2024-06-22 01:30:33,079][15401] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-06-22 01:30:33,079][15401] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-06-22 01:30:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1739407360. Throughput: 0: 42739.1. Samples: 1739590660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 01:30:34,697][15401] Updated weights for policy 0, policy_version 106170 (0.0047) [2024-06-22 01:30:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1739636736. Throughput: 0: 42758.6. Samples: 1739711220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 01:30:38,471][15401] Updated weights for policy 0, policy_version 106180 (0.0023) [2024-06-22 01:30:42,199][15401] Updated weights for policy 0, policy_version 106190 (0.0044) [2024-06-22 01:30:43,392][15132] Fps is (10 sec: 45865.3, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1739866112. Throughput: 0: 42888.1. Samples: 1739975100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:43,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 01:30:46,226][15401] Updated weights for policy 0, policy_version 106200 (0.0025) [2024-06-22 01:30:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1740062720. Throughput: 0: 42854.0. Samples: 1740236360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:48,396][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:30:50,045][15401] Updated weights for policy 0, policy_version 106210 (0.0025) [2024-06-22 01:30:53,390][15132] Fps is (10 sec: 42607.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1740292096. Throughput: 0: 42976.9. Samples: 1740362120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:53,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 01:30:53,847][15401] Updated weights for policy 0, policy_version 106220 (0.0032) [2024-06-22 01:30:57,626][15401] Updated weights for policy 0, policy_version 106230 (0.0022) [2024-06-22 01:30:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1740505088. Throughput: 0: 43077.1. Samples: 1740624460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:30:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 01:31:01,469][15401] Updated weights for policy 0, policy_version 106240 (0.0037) [2024-06-22 01:31:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1740701696. Throughput: 0: 42994.9. Samples: 1740878820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:31:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:31:05,154][15401] Updated weights for policy 0, policy_version 106250 (0.0032) [2024-06-22 01:31:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1740931072. Throughput: 0: 43138.6. Samples: 1741004820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 01:31:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 01:31:08,946][15401] Updated weights for policy 0, policy_version 106260 (0.0041) [2024-06-22 01:31:12,882][15401] Updated weights for policy 0, policy_version 106270 (0.0030) [2024-06-22 01:31:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42602.3, 300 sec: 42653.9). Total num frames: 1741144064. Throughput: 0: 43159.3. Samples: 1741272460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 01:31:16,365][15401] Updated weights for policy 0, policy_version 106280 (0.0041) [2024-06-22 01:31:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43689.0, 300 sec: 42876.7). Total num frames: 1741373440. Throughput: 0: 43025.8. Samples: 1741526920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:18,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:31:20,396][15401] Updated weights for policy 0, policy_version 106290 (0.0031) [2024-06-22 01:31:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1741586432. Throughput: 0: 43266.2. Samples: 1741658200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 01:31:23,770][15401] Updated weights for policy 0, policy_version 106300 (0.0029) [2024-06-22 01:31:28,017][15401] Updated weights for policy 0, policy_version 106310 (0.0051) [2024-06-22 01:31:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1741799424. Throughput: 0: 43220.0. Samples: 1741919900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:31:31,274][15401] Updated weights for policy 0, policy_version 106320 (0.0034) [2024-06-22 01:31:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1741996032. Throughput: 0: 43055.2. Samples: 1742173840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 01:31:35,515][15401] Updated weights for policy 0, policy_version 106330 (0.0050) [2024-06-22 01:31:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1742241792. Throughput: 0: 43042.2. Samples: 1742299020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 01:31:38,918][15401] Updated weights for policy 0, policy_version 106340 (0.0025) [2024-06-22 01:31:43,191][15401] Updated weights for policy 0, policy_version 106350 (0.0029) [2024-06-22 01:31:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1742438400. Throughput: 0: 42991.4. Samples: 1742559080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 01:31:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106350_1742438400.pth... [2024-06-22 01:31:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000105725_1732198400.pth [2024-06-22 01:31:46,663][15401] Updated weights for policy 0, policy_version 106360 (0.0023) [2024-06-22 01:31:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1742635008. Throughput: 0: 43083.5. Samples: 1742817580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 01:31:50,197][15349] Signal inference workers to stop experience collection... (25700 times) [2024-06-22 01:31:50,198][15349] Signal inference workers to resume experience collection... (25700 times) [2024-06-22 01:31:50,234][15401] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-06-22 01:31:50,234][15401] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-06-22 01:31:50,778][15401] Updated weights for policy 0, policy_version 106370 (0.0029) [2024-06-22 01:31:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1742897152. Throughput: 0: 43120.0. Samples: 1742945220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 01:31:54,101][15401] Updated weights for policy 0, policy_version 106380 (0.0038) [2024-06-22 01:31:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1743077376. Throughput: 0: 43041.4. Samples: 1743209320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:31:58,394][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 01:31:58,491][15401] Updated weights for policy 0, policy_version 106390 (0.0038) [2024-06-22 01:32:02,138][15401] Updated weights for policy 0, policy_version 106400 (0.0050) [2024-06-22 01:32:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1743290368. Throughput: 0: 42979.1. Samples: 1743460880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 01:32:06,293][15401] Updated weights for policy 0, policy_version 106410 (0.0037) [2024-06-22 01:32:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1743519744. Throughput: 0: 42920.5. Samples: 1743589620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:08,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 01:32:09,632][15401] Updated weights for policy 0, policy_version 106420 (0.0053) [2024-06-22 01:32:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1743716352. Throughput: 0: 42903.6. Samples: 1743850560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 01:32:13,921][15401] Updated weights for policy 0, policy_version 106430 (0.0046) [2024-06-22 01:32:17,232][15401] Updated weights for policy 0, policy_version 106440 (0.0051) [2024-06-22 01:32:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1743929344. Throughput: 0: 42914.3. Samples: 1744104980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 01:32:21,370][15401] Updated weights for policy 0, policy_version 106450 (0.0046) [2024-06-22 01:32:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1744175104. Throughput: 0: 43046.2. Samples: 1744236100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 01:32:24,791][15401] Updated weights for policy 0, policy_version 106460 (0.0022) [2024-06-22 01:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1744355328. Throughput: 0: 43009.9. Samples: 1744494520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 01:32:28,819][15401] Updated weights for policy 0, policy_version 106470 (0.0026) [2024-06-22 01:32:32,843][15401] Updated weights for policy 0, policy_version 106480 (0.0032) [2024-06-22 01:32:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1744584704. Throughput: 0: 42885.7. Samples: 1744747440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 01:32:36,517][15401] Updated weights for policy 0, policy_version 106490 (0.0025) [2024-06-22 01:32:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1744830464. Throughput: 0: 42970.2. Samples: 1744878880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 01:32:40,354][15401] Updated weights for policy 0, policy_version 106500 (0.0028) [2024-06-22 01:32:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1745010688. Throughput: 0: 43013.4. Samples: 1745144920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:43,390][15132] Avg episode reward: [(0, '0.047')] [2024-06-22 01:32:44,244][15401] Updated weights for policy 0, policy_version 106510 (0.0029) [2024-06-22 01:32:47,876][15401] Updated weights for policy 0, policy_version 106520 (0.0021) [2024-06-22 01:32:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1745223680. Throughput: 0: 43038.9. Samples: 1745397620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:48,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 01:32:51,665][15401] Updated weights for policy 0, policy_version 106530 (0.0029) [2024-06-22 01:32:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1745469440. Throughput: 0: 43130.7. Samples: 1745530500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 01:32:55,360][15401] Updated weights for policy 0, policy_version 106540 (0.0040) [2024-06-22 01:32:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1745633280. Throughput: 0: 43077.2. Samples: 1745789040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:32:58,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 01:32:59,287][15401] Updated weights for policy 0, policy_version 106550 (0.0036) [2024-06-22 01:33:03,297][15401] Updated weights for policy 0, policy_version 106560 (0.0031) [2024-06-22 01:33:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1745879040. Throughput: 0: 43018.2. Samples: 1746040800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:33:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 01:33:06,087][15349] Signal inference workers to stop experience collection... (25750 times) [2024-06-22 01:33:06,088][15349] Signal inference workers to resume experience collection... (25750 times) [2024-06-22 01:33:06,137][15401] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-06-22 01:33:06,137][15401] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-06-22 01:33:06,887][15401] Updated weights for policy 0, policy_version 106570 (0.0037) [2024-06-22 01:33:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1746108416. Throughput: 0: 43074.2. Samples: 1746174440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:33:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 01:33:10,715][15401] Updated weights for policy 0, policy_version 106580 (0.0031) [2024-06-22 01:33:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1746288640. Throughput: 0: 43064.5. Samples: 1746432420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 01:33:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 01:33:14,693][15401] Updated weights for policy 0, policy_version 106590 (0.0025) [2024-06-22 01:33:18,238][15401] Updated weights for policy 0, policy_version 106600 (0.0033) [2024-06-22 01:33:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 1746534400. Throughput: 0: 42981.0. Samples: 1746681580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 01:33:22,334][15401] Updated weights for policy 0, policy_version 106610 (0.0028) [2024-06-22 01:33:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 1746747392. Throughput: 0: 43090.8. Samples: 1746817960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 01:33:25,959][15401] Updated weights for policy 0, policy_version 106620 (0.0039) [2024-06-22 01:33:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1746944000. Throughput: 0: 42950.6. Samples: 1747077700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 01:33:29,908][15401] Updated weights for policy 0, policy_version 106630 (0.0024) [2024-06-22 01:33:33,390][15132] Fps is (10 sec: 42596.0, 60 sec: 43144.3, 300 sec: 42931.6). Total num frames: 1747173376. Throughput: 0: 42987.5. Samples: 1747332080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:33,399][15132] Avg episode reward: [(0, '0.218')] [2024-06-22 01:33:33,589][15401] Updated weights for policy 0, policy_version 106640 (0.0031) [2024-06-22 01:33:37,595][15401] Updated weights for policy 0, policy_version 106650 (0.0025) [2024-06-22 01:33:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1747402752. Throughput: 0: 42989.7. Samples: 1747465040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 01:33:41,073][15401] Updated weights for policy 0, policy_version 106660 (0.0027) [2024-06-22 01:33:43,390][15132] Fps is (10 sec: 40961.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1747582976. Throughput: 0: 42947.5. Samples: 1747721680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 01:33:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106665_1747599360.pth... [2024-06-22 01:33:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106036_1737293824.pth [2024-06-22 01:33:45,208][15401] Updated weights for policy 0, policy_version 106670 (0.0041) [2024-06-22 01:33:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 1747828736. Throughput: 0: 42839.5. Samples: 1747968580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 01:33:48,606][15401] Updated weights for policy 0, policy_version 106680 (0.0025) [2024-06-22 01:33:53,059][15401] Updated weights for policy 0, policy_version 106690 (0.0026) [2024-06-22 01:33:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 1748025344. Throughput: 0: 42814.1. Samples: 1748101080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:53,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 01:33:55,899][15401] Updated weights for policy 0, policy_version 106700 (0.0037) [2024-06-22 01:33:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1748221952. Throughput: 0: 42802.1. Samples: 1748358520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:33:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 01:34:00,500][15401] Updated weights for policy 0, policy_version 106710 (0.0036) [2024-06-22 01:34:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1748467712. Throughput: 0: 42928.9. Samples: 1748613380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:34:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 01:34:03,999][15401] Updated weights for policy 0, policy_version 106720 (0.0032) [2024-06-22 01:34:07,787][15401] Updated weights for policy 0, policy_version 106730 (0.0037) [2024-06-22 01:34:08,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 1748680704. Throughput: 0: 43047.9. Samples: 1748755220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:34:08,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 01:34:11,388][15401] Updated weights for policy 0, policy_version 106740 (0.0033) [2024-06-22 01:34:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1748877312. Throughput: 0: 42835.9. Samples: 1749005320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 01:34:13,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 01:34:15,443][15401] Updated weights for policy 0, policy_version 106750 (0.0023) [2024-06-22 01:34:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 1749106688. Throughput: 0: 42940.4. Samples: 1749264380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 01:34:18,724][15349] Signal inference workers to stop experience collection... (25800 times) [2024-06-22 01:34:18,760][15401] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-06-22 01:34:18,782][15349] Signal inference workers to resume experience collection... (25800 times) [2024-06-22 01:34:18,783][15401] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-06-22 01:34:18,939][15401] Updated weights for policy 0, policy_version 106760 (0.0028) [2024-06-22 01:34:23,180][15401] Updated weights for policy 0, policy_version 106770 (0.0028) [2024-06-22 01:34:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42987.3). Total num frames: 1749336064. Throughput: 0: 43007.0. Samples: 1749400360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:23,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 01:34:26,995][15401] Updated weights for policy 0, policy_version 106780 (0.0028) [2024-06-22 01:34:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1749516288. Throughput: 0: 42806.7. Samples: 1749647980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:28,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-22 01:34:30,698][15401] Updated weights for policy 0, policy_version 106790 (0.0042) [2024-06-22 01:34:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.9, 300 sec: 42987.2). Total num frames: 1749745664. Throughput: 0: 43086.7. Samples: 1749907480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 01:34:34,535][15401] Updated weights for policy 0, policy_version 106800 (0.0034) [2024-06-22 01:34:38,199][15401] Updated weights for policy 0, policy_version 106810 (0.0024) [2024-06-22 01:34:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1749975040. Throughput: 0: 43138.4. Samples: 1750042300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 01:34:42,539][15401] Updated weights for policy 0, policy_version 106820 (0.0030) [2024-06-22 01:34:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1750171648. Throughput: 0: 43002.2. Samples: 1750293620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 01:34:46,321][15401] Updated weights for policy 0, policy_version 106830 (0.0035) [2024-06-22 01:34:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42987.1). Total num frames: 1750384640. Throughput: 0: 42956.3. Samples: 1750546420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 01:34:50,296][15401] Updated weights for policy 0, policy_version 106840 (0.0040) [2024-06-22 01:34:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1750597632. Throughput: 0: 42724.4. Samples: 1750677720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 01:34:53,833][15401] Updated weights for policy 0, policy_version 106850 (0.0040) [2024-06-22 01:34:58,039][15401] Updated weights for policy 0, policy_version 106860 (0.0038) [2024-06-22 01:34:58,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 1750810624. Throughput: 0: 43007.8. Samples: 1750940660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:34:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 01:35:01,386][15401] Updated weights for policy 0, policy_version 106870 (0.0031) [2024-06-22 01:35:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1751040000. Throughput: 0: 42801.0. Samples: 1751190420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:35:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 01:35:05,748][15401] Updated weights for policy 0, policy_version 106880 (0.0043) [2024-06-22 01:35:08,390][15132] Fps is (10 sec: 44233.6, 60 sec: 42872.7, 300 sec: 42932.4). Total num frames: 1751252992. Throughput: 0: 42775.5. Samples: 1751325280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:35:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 01:35:09,043][15401] Updated weights for policy 0, policy_version 106890 (0.0039) [2024-06-22 01:35:13,361][15401] Updated weights for policy 0, policy_version 106900 (0.0044) [2024-06-22 01:35:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1751449600. Throughput: 0: 42897.7. Samples: 1751578380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 01:35:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 01:35:16,789][15401] Updated weights for policy 0, policy_version 106910 (0.0035) [2024-06-22 01:35:18,390][15132] Fps is (10 sec: 40962.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1751662592. Throughput: 0: 42807.9. Samples: 1751833840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:18,391][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 01:35:21,205][15401] Updated weights for policy 0, policy_version 106920 (0.0022) [2024-06-22 01:35:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 1751875584. Throughput: 0: 42720.1. Samples: 1751964700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 01:35:24,258][15401] Updated weights for policy 0, policy_version 106930 (0.0027) [2024-06-22 01:35:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1752072192. Throughput: 0: 42699.5. Samples: 1752215100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:28,395][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 01:35:28,969][15401] Updated weights for policy 0, policy_version 106940 (0.0037) [2024-06-22 01:35:32,055][15401] Updated weights for policy 0, policy_version 106950 (0.0039) [2024-06-22 01:35:33,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1752285184. Throughput: 0: 42680.6. Samples: 1752467040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 01:35:36,772][15401] Updated weights for policy 0, policy_version 106960 (0.0028) [2024-06-22 01:35:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42931.9). Total num frames: 1752530944. Throughput: 0: 42583.9. Samples: 1752594000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 01:35:39,382][15349] Signal inference workers to stop experience collection... (25850 times) [2024-06-22 01:35:39,435][15401] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-06-22 01:35:39,441][15349] Signal inference workers to resume experience collection... (25850 times) [2024-06-22 01:35:39,449][15401] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-06-22 01:35:39,583][15401] Updated weights for policy 0, policy_version 106970 (0.0030) [2024-06-22 01:35:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1752694784. Throughput: 0: 42240.8. Samples: 1752841500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 01:35:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106977_1752711168.pth... [2024-06-22 01:35:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106350_1742438400.pth [2024-06-22 01:35:44,573][15401] Updated weights for policy 0, policy_version 106980 (0.0027) [2024-06-22 01:35:47,149][15401] Updated weights for policy 0, policy_version 106990 (0.0030) [2024-06-22 01:35:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1752940544. Throughput: 0: 42391.4. Samples: 1753098040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 01:35:52,558][15401] Updated weights for policy 0, policy_version 107000 (0.0032) [2024-06-22 01:35:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1753153536. Throughput: 0: 42290.9. Samples: 1753228340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:35:55,459][15401] Updated weights for policy 0, policy_version 107010 (0.0023) [2024-06-22 01:35:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 1753350144. Throughput: 0: 42145.8. Samples: 1753474940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:35:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 01:36:00,189][15401] Updated weights for policy 0, policy_version 107020 (0.0030) [2024-06-22 01:36:02,930][15401] Updated weights for policy 0, policy_version 107030 (0.0032) [2024-06-22 01:36:03,394][15132] Fps is (10 sec: 44216.7, 60 sec: 42595.2, 300 sec: 42931.0). Total num frames: 1753595904. Throughput: 0: 42198.1. Samples: 1753732940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:36:03,395][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 01:36:07,507][15401] Updated weights for policy 0, policy_version 107040 (0.0032) [2024-06-22 01:36:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.7, 300 sec: 42820.6). Total num frames: 1753776128. Throughput: 0: 42399.1. Samples: 1753872660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:36:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 01:36:10,319][15401] Updated weights for policy 0, policy_version 107050 (0.0027) [2024-06-22 01:36:13,390][15132] Fps is (10 sec: 40977.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1754005504. Throughput: 0: 42411.9. Samples: 1754123640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 01:36:13,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 01:36:15,339][15401] Updated weights for policy 0, policy_version 107060 (0.0034) [2024-06-22 01:36:17,970][15401] Updated weights for policy 0, policy_version 107070 (0.0036) [2024-06-22 01:36:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1754234880. Throughput: 0: 42415.9. Samples: 1754375760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 01:36:22,822][15401] Updated weights for policy 0, policy_version 107080 (0.0023) [2024-06-22 01:36:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1754415104. Throughput: 0: 42576.6. Samples: 1754509940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:23,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 01:36:25,795][15401] Updated weights for policy 0, policy_version 107090 (0.0032) [2024-06-22 01:36:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1754644480. Throughput: 0: 42712.0. Samples: 1754763540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 01:36:30,351][15401] Updated weights for policy 0, policy_version 107100 (0.0031) [2024-06-22 01:36:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1754857472. Throughput: 0: 42826.4. Samples: 1755025220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:33,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 01:36:33,534][15401] Updated weights for policy 0, policy_version 107110 (0.0038) [2024-06-22 01:36:37,850][15401] Updated weights for policy 0, policy_version 107120 (0.0037) [2024-06-22 01:36:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1755054080. Throughput: 0: 42813.8. Samples: 1755154960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 01:36:41,112][15401] Updated weights for policy 0, policy_version 107130 (0.0028) [2024-06-22 01:36:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1755283456. Throughput: 0: 43053.3. Samples: 1755412340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 01:36:45,245][15401] Updated weights for policy 0, policy_version 107140 (0.0027) [2024-06-22 01:36:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1755512832. Throughput: 0: 43015.3. Samples: 1755668440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 01:36:48,872][15401] Updated weights for policy 0, policy_version 107150 (0.0037) [2024-06-22 01:36:52,818][15401] Updated weights for policy 0, policy_version 107160 (0.0041) [2024-06-22 01:36:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1755709440. Throughput: 0: 42843.0. Samples: 1755800600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 01:36:56,439][15401] Updated weights for policy 0, policy_version 107170 (0.0033) [2024-06-22 01:36:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1755922432. Throughput: 0: 42802.3. Samples: 1756049740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:36:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 01:37:00,603][15401] Updated weights for policy 0, policy_version 107180 (0.0039) [2024-06-22 01:37:01,666][15349] Signal inference workers to stop experience collection... (25900 times) [2024-06-22 01:37:01,708][15401] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-06-22 01:37:01,727][15349] Signal inference workers to resume experience collection... (25900 times) [2024-06-22 01:37:01,727][15401] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-06-22 01:37:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42601.5, 300 sec: 42820.5). Total num frames: 1756151808. Throughput: 0: 42968.9. Samples: 1756309360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:37:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 01:37:04,127][15401] Updated weights for policy 0, policy_version 107190 (0.0034) [2024-06-22 01:37:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1756348416. Throughput: 0: 42896.4. Samples: 1756440280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:37:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 01:37:08,414][15401] Updated weights for policy 0, policy_version 107200 (0.0034) [2024-06-22 01:37:11,721][15401] Updated weights for policy 0, policy_version 107210 (0.0042) [2024-06-22 01:37:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1756561408. Throughput: 0: 42768.2. Samples: 1756688120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 01:37:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 01:37:15,854][15401] Updated weights for policy 0, policy_version 107220 (0.0032) [2024-06-22 01:37:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1756774400. Throughput: 0: 42639.9. Samples: 1756944020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 01:37:19,557][15401] Updated weights for policy 0, policy_version 107230 (0.0041) [2024-06-22 01:37:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1757003776. Throughput: 0: 42641.2. Samples: 1757073820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 01:37:23,944][15401] Updated weights for policy 0, policy_version 107240 (0.0040) [2024-06-22 01:37:27,148][15401] Updated weights for policy 0, policy_version 107250 (0.0043) [2024-06-22 01:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1757200384. Throughput: 0: 42440.1. Samples: 1757322140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 01:37:31,514][15401] Updated weights for policy 0, policy_version 107260 (0.0050) [2024-06-22 01:37:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1757413376. Throughput: 0: 42496.6. Samples: 1757580780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 01:37:34,807][15401] Updated weights for policy 0, policy_version 107270 (0.0038) [2024-06-22 01:37:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1757642752. Throughput: 0: 42419.5. Samples: 1757709480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 01:37:38,974][15401] Updated weights for policy 0, policy_version 107280 (0.0037) [2024-06-22 01:37:42,751][15401] Updated weights for policy 0, policy_version 107290 (0.0042) [2024-06-22 01:37:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1757839360. Throughput: 0: 42586.6. Samples: 1757966140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 01:37:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107291_1757855744.pth... [2024-06-22 01:37:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106665_1747599360.pth [2024-06-22 01:37:47,105][15401] Updated weights for policy 0, policy_version 107300 (0.0046) [2024-06-22 01:37:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1758052352. Throughput: 0: 42486.2. Samples: 1758221240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 01:37:50,337][15401] Updated weights for policy 0, policy_version 107310 (0.0039) [2024-06-22 01:37:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1758265344. Throughput: 0: 42482.2. Samples: 1758351980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 01:37:54,639][15401] Updated weights for policy 0, policy_version 107320 (0.0028) [2024-06-22 01:37:58,041][15401] Updated weights for policy 0, policy_version 107330 (0.0032) [2024-06-22 01:37:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1758494720. Throughput: 0: 42711.2. Samples: 1758610120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:37:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 01:38:02,116][15401] Updated weights for policy 0, policy_version 107340 (0.0037) [2024-06-22 01:38:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1758707712. Throughput: 0: 42642.7. Samples: 1758862940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:38:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 01:38:05,647][15401] Updated weights for policy 0, policy_version 107350 (0.0037) [2024-06-22 01:38:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1758920704. Throughput: 0: 42696.4. Samples: 1758995160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:38:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:38:09,544][15401] Updated weights for policy 0, policy_version 107360 (0.0037) [2024-06-22 01:38:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1759133696. Throughput: 0: 42800.9. Samples: 1759248180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:38:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 01:38:13,609][15401] Updated weights for policy 0, policy_version 107370 (0.0029) [2024-06-22 01:38:17,258][15401] Updated weights for policy 0, policy_version 107380 (0.0032) [2024-06-22 01:38:18,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1759363072. Throughput: 0: 42602.6. Samples: 1759498000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:18,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 01:38:21,486][15401] Updated weights for policy 0, policy_version 107390 (0.0029) [2024-06-22 01:38:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1759543296. Throughput: 0: 42686.3. Samples: 1759630360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 01:38:25,322][15401] Updated weights for policy 0, policy_version 107400 (0.0034) [2024-06-22 01:38:28,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.7, 300 sec: 42709.2). Total num frames: 1759772672. Throughput: 0: 42611.7. Samples: 1759883760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:28,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 01:38:29,223][15401] Updated weights for policy 0, policy_version 107410 (0.0027) [2024-06-22 01:38:32,813][15401] Updated weights for policy 0, policy_version 107420 (0.0042) [2024-06-22 01:38:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1760002048. Throughput: 0: 42687.6. Samples: 1760142180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 01:38:36,698][15401] Updated weights for policy 0, policy_version 107430 (0.0040) [2024-06-22 01:38:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1760198656. Throughput: 0: 42763.6. Samples: 1760276340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 01:38:40,467][15401] Updated weights for policy 0, policy_version 107440 (0.0029) [2024-06-22 01:38:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1760411648. Throughput: 0: 42780.0. Samples: 1760535220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 01:38:44,254][15401] Updated weights for policy 0, policy_version 107450 (0.0032) [2024-06-22 01:38:45,862][15349] Signal inference workers to stop experience collection... (25950 times) [2024-06-22 01:38:45,862][15349] Signal inference workers to resume experience collection... (25950 times) [2024-06-22 01:38:45,904][15401] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-06-22 01:38:45,904][15401] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-06-22 01:38:47,890][15401] Updated weights for policy 0, policy_version 107460 (0.0031) [2024-06-22 01:38:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1760641024. Throughput: 0: 43053.8. Samples: 1760800360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 01:38:51,949][15401] Updated weights for policy 0, policy_version 107470 (0.0039) [2024-06-22 01:38:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 1760837632. Throughput: 0: 43072.4. Samples: 1760933520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:53,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 01:38:55,518][15401] Updated weights for policy 0, policy_version 107480 (0.0030) [2024-06-22 01:38:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1761050624. Throughput: 0: 43006.1. Samples: 1761183460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:38:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 01:38:59,399][15401] Updated weights for policy 0, policy_version 107490 (0.0025) [2024-06-22 01:39:03,062][15401] Updated weights for policy 0, policy_version 107500 (0.0025) [2024-06-22 01:39:03,392][15132] Fps is (10 sec: 45875.4, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 1761296384. Throughput: 0: 43154.6. Samples: 1761439960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:39:03,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 01:39:07,138][15401] Updated weights for policy 0, policy_version 107510 (0.0030) [2024-06-22 01:39:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1761492992. Throughput: 0: 43287.0. Samples: 1761578280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:39:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 01:39:10,587][15401] Updated weights for policy 0, policy_version 107520 (0.0033) [2024-06-22 01:39:13,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1761689600. Throughput: 0: 43165.3. Samples: 1761826100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:39:13,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 01:39:14,782][15401] Updated weights for policy 0, policy_version 107530 (0.0032) [2024-06-22 01:39:18,234][15401] Updated weights for policy 0, policy_version 107540 (0.0039) [2024-06-22 01:39:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 1761935360. Throughput: 0: 43197.2. Samples: 1762086060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-22 01:39:18,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 01:39:22,610][15401] Updated weights for policy 0, policy_version 107550 (0.0032) [2024-06-22 01:39:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1762131968. Throughput: 0: 43222.3. Samples: 1762221340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:23,390][15132] Avg episode reward: [(0, '0.920')] [2024-06-22 01:39:23,472][15349] Saving new best policy, reward=0.920! [2024-06-22 01:39:25,849][15401] Updated weights for policy 0, policy_version 107560 (0.0037) [2024-06-22 01:39:28,393][15132] Fps is (10 sec: 40944.5, 60 sec: 42870.4, 300 sec: 42708.9). Total num frames: 1762344960. Throughput: 0: 42851.5. Samples: 1762463700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:28,394][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 01:39:30,339][15401] Updated weights for policy 0, policy_version 107570 (0.0037) [2024-06-22 01:39:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1762574336. Throughput: 0: 42792.4. Samples: 1762726020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 01:39:33,470][15401] Updated weights for policy 0, policy_version 107580 (0.0030) [2024-06-22 01:39:37,988][15401] Updated weights for policy 0, policy_version 107590 (0.0028) [2024-06-22 01:39:38,389][15132] Fps is (10 sec: 42615.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1762770944. Throughput: 0: 42827.7. Samples: 1762860660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 01:39:41,330][15401] Updated weights for policy 0, policy_version 107600 (0.0038) [2024-06-22 01:39:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1763000320. Throughput: 0: 42777.8. Samples: 1763108460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 01:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107605_1763000320.pth... [2024-06-22 01:39:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000106977_1752711168.pth [2024-06-22 01:39:45,662][15401] Updated weights for policy 0, policy_version 107610 (0.0040) [2024-06-22 01:39:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1763213312. Throughput: 0: 42825.1. Samples: 1763366980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 01:39:48,818][15401] Updated weights for policy 0, policy_version 107620 (0.0032) [2024-06-22 01:39:53,203][15401] Updated weights for policy 0, policy_version 107630 (0.0032) [2024-06-22 01:39:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1763426304. Throughput: 0: 42709.4. Samples: 1763500200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:53,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 01:39:56,391][15401] Updated weights for policy 0, policy_version 107640 (0.0031) [2024-06-22 01:39:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1763622912. Throughput: 0: 42758.8. Samples: 1763750240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:39:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 01:40:00,694][15349] Signal inference workers to stop experience collection... (26000 times) [2024-06-22 01:40:00,694][15349] Signal inference workers to resume experience collection... (26000 times) [2024-06-22 01:40:00,726][15401] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-06-22 01:40:00,726][15401] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-06-22 01:40:00,826][15401] Updated weights for policy 0, policy_version 107650 (0.0037) [2024-06-22 01:40:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1763852288. Throughput: 0: 42799.9. Samples: 1764012060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:40:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 01:40:04,162][15401] Updated weights for policy 0, policy_version 107660 (0.0046) [2024-06-22 01:40:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1764048896. Throughput: 0: 42695.9. Samples: 1764142660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:40:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 01:40:08,563][15401] Updated weights for policy 0, policy_version 107670 (0.0038) [2024-06-22 01:40:11,856][15401] Updated weights for policy 0, policy_version 107680 (0.0047) [2024-06-22 01:40:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1764278272. Throughput: 0: 42817.0. Samples: 1764390300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:40:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 01:40:16,104][15401] Updated weights for policy 0, policy_version 107690 (0.0034) [2024-06-22 01:40:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1764491264. Throughput: 0: 42792.4. Samples: 1764651680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 01:40:18,392][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 01:40:19,507][15401] Updated weights for policy 0, policy_version 107700 (0.0040) [2024-06-22 01:40:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1764687872. Throughput: 0: 42579.3. Samples: 1764776740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:23,399][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 01:40:23,728][15401] Updated weights for policy 0, policy_version 107710 (0.0033) [2024-06-22 01:40:27,311][15401] Updated weights for policy 0, policy_version 107720 (0.0034) [2024-06-22 01:40:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42874.3, 300 sec: 42820.6). Total num frames: 1764917248. Throughput: 0: 42752.1. Samples: 1765032300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 01:40:31,255][15401] Updated weights for policy 0, policy_version 107730 (0.0044) [2024-06-22 01:40:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1765130240. Throughput: 0: 42664.7. Samples: 1765286900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 01:40:34,871][15401] Updated weights for policy 0, policy_version 107740 (0.0028) [2024-06-22 01:40:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1765310464. Throughput: 0: 42525.4. Samples: 1765413840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:38,390][15132] Avg episode reward: [(0, '0.932')] [2024-06-22 01:40:38,454][15349] Saving new best policy, reward=0.932! [2024-06-22 01:40:38,959][15401] Updated weights for policy 0, policy_version 107750 (0.0040) [2024-06-22 01:40:42,919][15401] Updated weights for policy 0, policy_version 107760 (0.0028) [2024-06-22 01:40:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1765556224. Throughput: 0: 42661.6. Samples: 1765670020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:43,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-22 01:40:46,715][15401] Updated weights for policy 0, policy_version 107770 (0.0032) [2024-06-22 01:40:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1765769216. Throughput: 0: 42429.6. Samples: 1765921380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 01:40:50,655][15401] Updated weights for policy 0, policy_version 107780 (0.0033) [2024-06-22 01:40:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1765965824. Throughput: 0: 42347.1. Samples: 1766048280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 01:40:54,838][15401] Updated weights for policy 0, policy_version 107790 (0.0030) [2024-06-22 01:40:58,190][15401] Updated weights for policy 0, policy_version 107800 (0.0022) [2024-06-22 01:40:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42710.1). Total num frames: 1766195200. Throughput: 0: 42610.3. Samples: 1766307760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:40:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 01:41:02,330][15401] Updated weights for policy 0, policy_version 107810 (0.0036) [2024-06-22 01:41:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1766408192. Throughput: 0: 42464.9. Samples: 1766562600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:41:03,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 01:41:06,002][15401] Updated weights for policy 0, policy_version 107820 (0.0037) [2024-06-22 01:41:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1766604800. Throughput: 0: 42598.0. Samples: 1766693640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:41:08,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 01:41:09,989][15401] Updated weights for policy 0, policy_version 107830 (0.0038) [2024-06-22 01:41:11,125][15349] Signal inference workers to stop experience collection... (26050 times) [2024-06-22 01:41:11,126][15349] Signal inference workers to resume experience collection... (26050 times) [2024-06-22 01:41:11,147][15401] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-06-22 01:41:11,147][15401] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-06-22 01:41:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1766834176. Throughput: 0: 42654.1. Samples: 1766951740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:41:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 01:41:13,637][15401] Updated weights for policy 0, policy_version 107840 (0.0036) [2024-06-22 01:41:18,005][15401] Updated weights for policy 0, policy_version 107850 (0.0028) [2024-06-22 01:41:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1767047168. Throughput: 0: 42586.8. Samples: 1767203300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 01:41:18,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 01:41:21,305][15401] Updated weights for policy 0, policy_version 107860 (0.0032) [2024-06-22 01:41:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1767243776. Throughput: 0: 42521.8. Samples: 1767327320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:23,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 01:41:25,521][15401] Updated weights for policy 0, policy_version 107870 (0.0026) [2024-06-22 01:41:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1767456768. Throughput: 0: 42458.3. Samples: 1767580640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 01:41:29,037][15401] Updated weights for policy 0, policy_version 107880 (0.0029) [2024-06-22 01:41:33,020][15401] Updated weights for policy 0, policy_version 107890 (0.0041) [2024-06-22 01:41:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1767686144. Throughput: 0: 42604.9. Samples: 1767838600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:33,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 01:41:36,516][15401] Updated weights for policy 0, policy_version 107900 (0.0038) [2024-06-22 01:41:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1767866368. Throughput: 0: 42599.1. Samples: 1767965240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:38,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 01:41:40,562][15401] Updated weights for policy 0, policy_version 107910 (0.0025) [2024-06-22 01:41:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1768112128. Throughput: 0: 42547.9. Samples: 1768222420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 01:41:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107917_1768112128.pth... [2024-06-22 01:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107291_1757855744.pth [2024-06-22 01:41:43,989][15401] Updated weights for policy 0, policy_version 107920 (0.0029) [2024-06-22 01:41:48,285][15401] Updated weights for policy 0, policy_version 107930 (0.0036) [2024-06-22 01:41:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1768325120. Throughput: 0: 42624.8. Samples: 1768480720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 01:41:52,048][15401] Updated weights for policy 0, policy_version 107940 (0.0036) [2024-06-22 01:41:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1768521728. Throughput: 0: 42501.6. Samples: 1768606220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 01:41:55,919][15401] Updated weights for policy 0, policy_version 107950 (0.0036) [2024-06-22 01:41:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1768751104. Throughput: 0: 42644.0. Samples: 1768870720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:41:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 01:41:59,618][15401] Updated weights for policy 0, policy_version 107960 (0.0030) [2024-06-22 01:42:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1768964096. Throughput: 0: 42715.1. Samples: 1769125480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:42:03,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-22 01:42:03,467][15401] Updated weights for policy 0, policy_version 107970 (0.0034) [2024-06-22 01:42:07,414][15401] Updated weights for policy 0, policy_version 107980 (0.0031) [2024-06-22 01:42:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1769160704. Throughput: 0: 42778.3. Samples: 1769252340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:42:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 01:42:11,160][15401] Updated weights for policy 0, policy_version 107990 (0.0038) [2024-06-22 01:42:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1769373696. Throughput: 0: 42675.1. Samples: 1769501020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:42:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 01:42:15,442][15401] Updated weights for policy 0, policy_version 108000 (0.0043) [2024-06-22 01:42:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1769586688. Throughput: 0: 42644.3. Samples: 1769757600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 01:42:18,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 01:42:19,020][15401] Updated weights for policy 0, policy_version 108010 (0.0038) [2024-06-22 01:42:22,994][15401] Updated weights for policy 0, policy_version 108020 (0.0023) [2024-06-22 01:42:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1769799680. Throughput: 0: 42721.9. Samples: 1769887720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 01:42:26,628][15401] Updated weights for policy 0, policy_version 108030 (0.0053) [2024-06-22 01:42:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1770012672. Throughput: 0: 42660.1. Samples: 1770142120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 01:42:28,965][15349] Signal inference workers to stop experience collection... (26100 times) [2024-06-22 01:42:29,006][15401] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-06-22 01:42:29,025][15349] Signal inference workers to resume experience collection... (26100 times) [2024-06-22 01:42:29,026][15401] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-06-22 01:42:30,468][15401] Updated weights for policy 0, policy_version 108040 (0.0027) [2024-06-22 01:42:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 1770242048. Throughput: 0: 42736.8. Samples: 1770403880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:33,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 01:42:34,330][15401] Updated weights for policy 0, policy_version 108050 (0.0044) [2024-06-22 01:42:38,081][15401] Updated weights for policy 0, policy_version 108060 (0.0037) [2024-06-22 01:42:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1770455040. Throughput: 0: 42758.6. Samples: 1770530360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 01:42:42,097][15401] Updated weights for policy 0, policy_version 108070 (0.0050) [2024-06-22 01:42:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1770668032. Throughput: 0: 42433.3. Samples: 1770780220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 01:42:45,892][15401] Updated weights for policy 0, policy_version 108080 (0.0047) [2024-06-22 01:42:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1770864640. Throughput: 0: 42479.6. Samples: 1771037060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 01:42:49,713][15401] Updated weights for policy 0, policy_version 108090 (0.0029) [2024-06-22 01:42:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1771094016. Throughput: 0: 42542.7. Samples: 1771166760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 01:42:53,406][15401] Updated weights for policy 0, policy_version 108100 (0.0037) [2024-06-22 01:42:57,635][15401] Updated weights for policy 0, policy_version 108110 (0.0022) [2024-06-22 01:42:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1771290624. Throughput: 0: 42791.1. Samples: 1771426620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:42:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 01:43:00,892][15401] Updated weights for policy 0, policy_version 108120 (0.0029) [2024-06-22 01:43:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 1771520000. Throughput: 0: 42681.8. Samples: 1771678380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:43:03,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 01:43:05,377][15401] Updated weights for policy 0, policy_version 108130 (0.0040) [2024-06-22 01:43:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1771749376. Throughput: 0: 42693.8. Samples: 1771808940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:43:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 01:43:08,755][15401] Updated weights for policy 0, policy_version 108140 (0.0031) [2024-06-22 01:43:12,850][15401] Updated weights for policy 0, policy_version 108150 (0.0034) [2024-06-22 01:43:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 1771929600. Throughput: 0: 42817.0. Samples: 1772068880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:43:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 01:43:16,387][15401] Updated weights for policy 0, policy_version 108160 (0.0031) [2024-06-22 01:43:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1772158976. Throughput: 0: 42665.9. Samples: 1772323840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 01:43:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 01:43:20,555][15401] Updated weights for policy 0, policy_version 108170 (0.0038) [2024-06-22 01:43:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1772371968. Throughput: 0: 42591.2. Samples: 1772446960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 01:43:24,079][15401] Updated weights for policy 0, policy_version 108180 (0.0042) [2024-06-22 01:43:28,075][15401] Updated weights for policy 0, policy_version 108190 (0.0038) [2024-06-22 01:43:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1772584960. Throughput: 0: 42821.7. Samples: 1772707200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 01:43:31,753][15401] Updated weights for policy 0, policy_version 108200 (0.0024) [2024-06-22 01:43:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1772797952. Throughput: 0: 42664.4. Samples: 1772956960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 01:43:35,960][15401] Updated weights for policy 0, policy_version 108210 (0.0027) [2024-06-22 01:43:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1772994560. Throughput: 0: 42690.5. Samples: 1773087840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:38,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 01:43:39,718][15401] Updated weights for policy 0, policy_version 108220 (0.0032) [2024-06-22 01:43:41,193][15349] Signal inference workers to stop experience collection... (26150 times) [2024-06-22 01:43:41,247][15401] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-06-22 01:43:41,254][15349] Signal inference workers to resume experience collection... (26150 times) [2024-06-22 01:43:41,267][15401] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-06-22 01:43:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1773223936. Throughput: 0: 42646.1. Samples: 1773345700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 01:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108229_1773223936.pth... [2024-06-22 01:43:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107605_1763000320.pth [2024-06-22 01:43:43,606][15401] Updated weights for policy 0, policy_version 108230 (0.0040) [2024-06-22 01:43:47,390][15401] Updated weights for policy 0, policy_version 108240 (0.0045) [2024-06-22 01:43:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 1773453312. Throughput: 0: 42632.5. Samples: 1773596740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 01:43:51,519][15401] Updated weights for policy 0, policy_version 108250 (0.0039) [2024-06-22 01:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1773633536. Throughput: 0: 42668.4. Samples: 1773729020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 01:43:55,137][15401] Updated weights for policy 0, policy_version 108260 (0.0029) [2024-06-22 01:43:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 1773879296. Throughput: 0: 42584.7. Samples: 1773985200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:43:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 01:43:58,943][15401] Updated weights for policy 0, policy_version 108270 (0.0030) [2024-06-22 01:44:02,809][15401] Updated weights for policy 0, policy_version 108280 (0.0027) [2024-06-22 01:44:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1774108672. Throughput: 0: 42510.6. Samples: 1774236820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:44:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 01:44:06,462][15401] Updated weights for policy 0, policy_version 108290 (0.0042) [2024-06-22 01:44:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1774272512. Throughput: 0: 42771.4. Samples: 1774371680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:44:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 01:44:10,446][15401] Updated weights for policy 0, policy_version 108300 (0.0030) [2024-06-22 01:44:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1774501888. Throughput: 0: 42585.4. Samples: 1774623540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:44:13,394][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 01:44:14,102][15401] Updated weights for policy 0, policy_version 108310 (0.0027) [2024-06-22 01:44:17,986][15401] Updated weights for policy 0, policy_version 108320 (0.0039) [2024-06-22 01:44:18,390][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1774747648. Throughput: 0: 42737.0. Samples: 1774880120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 01:44:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 01:44:21,631][15401] Updated weights for policy 0, policy_version 108330 (0.0039) [2024-06-22 01:44:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42599.0). Total num frames: 1774911488. Throughput: 0: 42764.1. Samples: 1775012220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:23,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 01:44:25,673][15401] Updated weights for policy 0, policy_version 108340 (0.0042) [2024-06-22 01:44:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1775140864. Throughput: 0: 42672.1. Samples: 1775265940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 01:44:29,310][15401] Updated weights for policy 0, policy_version 108350 (0.0042) [2024-06-22 01:44:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1775353856. Throughput: 0: 42736.5. Samples: 1775519880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 01:44:33,496][15401] Updated weights for policy 0, policy_version 108360 (0.0048) [2024-06-22 01:44:37,000][15401] Updated weights for policy 0, policy_version 108370 (0.0028) [2024-06-22 01:44:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1775566848. Throughput: 0: 42764.4. Samples: 1775653420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 01:44:41,299][15401] Updated weights for policy 0, policy_version 108380 (0.0045) [2024-06-22 01:44:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1775779840. Throughput: 0: 42413.4. Samples: 1775893800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 01:44:44,685][15401] Updated weights for policy 0, policy_version 108390 (0.0047) [2024-06-22 01:44:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 1775976448. Throughput: 0: 42667.0. Samples: 1776156840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 01:44:48,899][15401] Updated weights for policy 0, policy_version 108400 (0.0043) [2024-06-22 01:44:52,836][15401] Updated weights for policy 0, policy_version 108410 (0.0031) [2024-06-22 01:44:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1776205824. Throughput: 0: 42498.3. Samples: 1776284100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:53,390][15132] Avg episode reward: [(0, '0.164')] [2024-06-22 01:44:56,721][15401] Updated weights for policy 0, policy_version 108420 (0.0041) [2024-06-22 01:44:58,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1776402432. Throughput: 0: 42483.3. Samples: 1776535280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:44:58,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 01:44:59,504][15349] Signal inference workers to stop experience collection... (26200 times) [2024-06-22 01:44:59,554][15401] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-06-22 01:44:59,554][15349] Signal inference workers to resume experience collection... (26200 times) [2024-06-22 01:44:59,568][15401] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-06-22 01:45:00,459][15401] Updated weights for policy 0, policy_version 108430 (0.0038) [2024-06-22 01:45:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1776615424. Throughput: 0: 42481.3. Samples: 1776791780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:45:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 01:45:04,397][15401] Updated weights for policy 0, policy_version 108440 (0.0054) [2024-06-22 01:45:08,072][15401] Updated weights for policy 0, policy_version 108450 (0.0032) [2024-06-22 01:45:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1776844800. Throughput: 0: 42454.1. Samples: 1776922660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:45:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 01:45:12,097][15401] Updated weights for policy 0, policy_version 108460 (0.0031) [2024-06-22 01:45:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1777057792. Throughput: 0: 42383.0. Samples: 1777173180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:45:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 01:45:15,832][15401] Updated weights for policy 0, policy_version 108470 (0.0047) [2024-06-22 01:45:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1777254400. Throughput: 0: 42409.7. Samples: 1777428320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:45:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 01:45:19,975][15401] Updated weights for policy 0, policy_version 108480 (0.0026) [2024-06-22 01:45:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1777483776. Throughput: 0: 42312.0. Samples: 1777557460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 01:45:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 01:45:23,409][15401] Updated weights for policy 0, policy_version 108490 (0.0034) [2024-06-22 01:45:27,557][15401] Updated weights for policy 0, policy_version 108500 (0.0037) [2024-06-22 01:45:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1777680384. Throughput: 0: 42646.7. Samples: 1777812900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 01:45:31,282][15401] Updated weights for policy 0, policy_version 108510 (0.0035) [2024-06-22 01:45:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1777909760. Throughput: 0: 42507.3. Samples: 1778069660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 01:45:35,092][15401] Updated weights for policy 0, policy_version 108520 (0.0032) [2024-06-22 01:45:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1778122752. Throughput: 0: 42570.7. Samples: 1778199780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 01:45:38,795][15401] Updated weights for policy 0, policy_version 108530 (0.0028) [2024-06-22 01:45:42,589][15401] Updated weights for policy 0, policy_version 108540 (0.0033) [2024-06-22 01:45:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 1778319360. Throughput: 0: 42730.5. Samples: 1778458160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 01:45:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108541_1778335744.pth... [2024-06-22 01:45:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000107917_1768112128.pth [2024-06-22 01:45:46,773][15401] Updated weights for policy 0, policy_version 108550 (0.0035) [2024-06-22 01:45:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1778548736. Throughput: 0: 42668.2. Samples: 1778711840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 01:45:50,905][15401] Updated weights for policy 0, policy_version 108560 (0.0029) [2024-06-22 01:45:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1778761728. Throughput: 0: 42568.1. Samples: 1778838220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:53,396][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 01:45:54,196][15401] Updated weights for policy 0, policy_version 108570 (0.0031) [2024-06-22 01:45:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1778958336. Throughput: 0: 42733.9. Samples: 1779096200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:45:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 01:45:58,423][15401] Updated weights for policy 0, policy_version 108580 (0.0039) [2024-06-22 01:46:01,788][15401] Updated weights for policy 0, policy_version 108590 (0.0043) [2024-06-22 01:46:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1779187712. Throughput: 0: 42690.7. Samples: 1779349400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:46:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:46:05,909][15401] Updated weights for policy 0, policy_version 108600 (0.0032) [2024-06-22 01:46:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1779400704. Throughput: 0: 42695.9. Samples: 1779478880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:46:08,392][15132] Avg episode reward: [(0, '0.825')] [2024-06-22 01:46:09,474][15401] Updated weights for policy 0, policy_version 108610 (0.0032) [2024-06-22 01:46:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1779613696. Throughput: 0: 42736.1. Samples: 1779736020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:46:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 01:46:13,424][15401] Updated weights for policy 0, policy_version 108620 (0.0034) [2024-06-22 01:46:16,929][15401] Updated weights for policy 0, policy_version 108630 (0.0031) [2024-06-22 01:46:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1779826688. Throughput: 0: 42823.5. Samples: 1779996720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:46:18,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 01:46:20,979][15401] Updated weights for policy 0, policy_version 108640 (0.0037) [2024-06-22 01:46:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1780056064. Throughput: 0: 42827.1. Samples: 1780127000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 01:46:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 01:46:24,365][15401] Updated weights for policy 0, policy_version 108650 (0.0036) [2024-06-22 01:46:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1780269056. Throughput: 0: 42785.8. Samples: 1780383520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 01:46:29,047][15401] Updated weights for policy 0, policy_version 108660 (0.0027) [2024-06-22 01:46:31,386][15349] Signal inference workers to stop experience collection... (26250 times) [2024-06-22 01:46:31,387][15349] Signal inference workers to resume experience collection... (26250 times) [2024-06-22 01:46:31,434][15401] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-06-22 01:46:31,434][15401] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-06-22 01:46:32,375][15401] Updated weights for policy 0, policy_version 108670 (0.0034) [2024-06-22 01:46:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1780498432. Throughput: 0: 42860.4. Samples: 1780640560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 01:46:36,759][15401] Updated weights for policy 0, policy_version 108680 (0.0041) [2024-06-22 01:46:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1780678656. Throughput: 0: 43026.2. Samples: 1780774400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 01:46:40,036][15401] Updated weights for policy 0, policy_version 108690 (0.0033) [2024-06-22 01:46:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1780891648. Throughput: 0: 42880.7. Samples: 1781025840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:43,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-22 01:46:44,314][15401] Updated weights for policy 0, policy_version 108700 (0.0034) [2024-06-22 01:46:47,646][15401] Updated weights for policy 0, policy_version 108710 (0.0033) [2024-06-22 01:46:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1781121024. Throughput: 0: 42984.2. Samples: 1781283700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 01:46:51,919][15401] Updated weights for policy 0, policy_version 108720 (0.0038) [2024-06-22 01:46:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1781334016. Throughput: 0: 43025.9. Samples: 1781414940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 01:46:55,158][15401] Updated weights for policy 0, policy_version 108730 (0.0035) [2024-06-22 01:46:58,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1781530624. Throughput: 0: 43002.7. Samples: 1781671140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:46:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 01:46:59,418][15401] Updated weights for policy 0, policy_version 108740 (0.0029) [2024-06-22 01:47:02,829][15401] Updated weights for policy 0, policy_version 108750 (0.0029) [2024-06-22 01:47:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 1781760000. Throughput: 0: 43015.5. Samples: 1781932520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:47:03,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 01:47:07,032][15401] Updated weights for policy 0, policy_version 108760 (0.0029) [2024-06-22 01:47:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1781989376. Throughput: 0: 43080.8. Samples: 1782065640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:47:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 01:47:10,529][15401] Updated weights for policy 0, policy_version 108770 (0.0026) [2024-06-22 01:47:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1782185984. Throughput: 0: 42840.9. Samples: 1782311360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:47:13,395][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:47:14,690][15401] Updated weights for policy 0, policy_version 108780 (0.0037) [2024-06-22 01:47:18,211][15401] Updated weights for policy 0, policy_version 108790 (0.0038) [2024-06-22 01:47:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1782415360. Throughput: 0: 42679.5. Samples: 1782561140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:47:18,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 01:47:22,343][15401] Updated weights for policy 0, policy_version 108800 (0.0037) [2024-06-22 01:47:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1782628352. Throughput: 0: 42678.5. Samples: 1782694940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 01:47:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 01:47:25,900][15401] Updated weights for policy 0, policy_version 108810 (0.0031) [2024-06-22 01:47:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1782824960. Throughput: 0: 42731.3. Samples: 1782948740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 01:47:30,005][15401] Updated weights for policy 0, policy_version 108820 (0.0040) [2024-06-22 01:47:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1783037952. Throughput: 0: 42812.5. Samples: 1783210260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:33,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 01:47:34,015][15401] Updated weights for policy 0, policy_version 108830 (0.0036) [2024-06-22 01:47:37,460][15401] Updated weights for policy 0, policy_version 108840 (0.0027) [2024-06-22 01:47:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1783283712. Throughput: 0: 42725.6. Samples: 1783337600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:38,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 01:47:41,473][15401] Updated weights for policy 0, policy_version 108850 (0.0038) [2024-06-22 01:47:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1783447552. Throughput: 0: 42722.7. Samples: 1783593660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:43,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 01:47:43,566][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108855_1783480320.pth... [2024-06-22 01:47:43,638][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108229_1773223936.pth [2024-06-22 01:47:45,370][15401] Updated weights for policy 0, policy_version 108860 (0.0040) [2024-06-22 01:47:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1783676928. Throughput: 0: 42593.9. Samples: 1783849140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 01:47:49,075][15401] Updated weights for policy 0, policy_version 108870 (0.0029) [2024-06-22 01:47:53,164][15401] Updated weights for policy 0, policy_version 108880 (0.0049) [2024-06-22 01:47:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1783889920. Throughput: 0: 42418.8. Samples: 1783974480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 01:47:56,743][15401] Updated weights for policy 0, policy_version 108890 (0.0022) [2024-06-22 01:47:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1784102912. Throughput: 0: 42581.4. Samples: 1784227520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:47:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 01:48:00,720][15401] Updated weights for policy 0, policy_version 108900 (0.0038) [2024-06-22 01:48:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1784315904. Throughput: 0: 42600.9. Samples: 1784478180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:48:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 01:48:04,730][15401] Updated weights for policy 0, policy_version 108910 (0.0037) [2024-06-22 01:48:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1784528896. Throughput: 0: 42501.1. Samples: 1784607480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:48:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 01:48:08,435][15401] Updated weights for policy 0, policy_version 108920 (0.0031) [2024-06-22 01:48:12,630][15401] Updated weights for policy 0, policy_version 108930 (0.0033) [2024-06-22 01:48:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1784741888. Throughput: 0: 42631.5. Samples: 1784867160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:48:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 01:48:16,323][15401] Updated weights for policy 0, policy_version 108940 (0.0030) [2024-06-22 01:48:16,984][15349] Signal inference workers to stop experience collection... (26300 times) [2024-06-22 01:48:17,021][15401] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-06-22 01:48:17,037][15349] Signal inference workers to resume experience collection... (26300 times) [2024-06-22 01:48:17,039][15401] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-06-22 01:48:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1784971264. Throughput: 0: 42336.0. Samples: 1785115380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:48:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 01:48:20,245][15401] Updated weights for policy 0, policy_version 108950 (0.0035) [2024-06-22 01:48:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 1785167872. Throughput: 0: 42468.2. Samples: 1785248660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-22 01:48:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 01:48:24,313][15401] Updated weights for policy 0, policy_version 108960 (0.0039) [2024-06-22 01:48:27,837][15401] Updated weights for policy 0, policy_version 108970 (0.0032) [2024-06-22 01:48:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1785380864. Throughput: 0: 42455.5. Samples: 1785504160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 01:48:31,977][15401] Updated weights for policy 0, policy_version 108980 (0.0032) [2024-06-22 01:48:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1785593856. Throughput: 0: 42321.8. Samples: 1785753620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 01:48:35,487][15401] Updated weights for policy 0, policy_version 108990 (0.0036) [2024-06-22 01:48:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1785790464. Throughput: 0: 42403.1. Samples: 1785882620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 01:48:39,609][15401] Updated weights for policy 0, policy_version 109000 (0.0047) [2024-06-22 01:48:43,011][15401] Updated weights for policy 0, policy_version 109010 (0.0034) [2024-06-22 01:48:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1786019840. Throughput: 0: 42407.5. Samples: 1786135860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 01:48:47,158][15401] Updated weights for policy 0, policy_version 109020 (0.0036) [2024-06-22 01:48:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1786232832. Throughput: 0: 42525.0. Samples: 1786391800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 01:48:50,599][15401] Updated weights for policy 0, policy_version 109030 (0.0027) [2024-06-22 01:48:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1786429440. Throughput: 0: 42479.1. Samples: 1786519040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 01:48:54,648][15401] Updated weights for policy 0, policy_version 109040 (0.0035) [2024-06-22 01:48:58,135][15401] Updated weights for policy 0, policy_version 109050 (0.0031) [2024-06-22 01:48:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1786675200. Throughput: 0: 42357.4. Samples: 1786773240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:48:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 01:49:02,370][15401] Updated weights for policy 0, policy_version 109060 (0.0037) [2024-06-22 01:49:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1786871808. Throughput: 0: 42655.2. Samples: 1787034860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:49:03,391][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 01:49:05,646][15401] Updated weights for policy 0, policy_version 109070 (0.0037) [2024-06-22 01:49:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1787068416. Throughput: 0: 42430.1. Samples: 1787158020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:49:08,396][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 01:49:09,990][15401] Updated weights for policy 0, policy_version 109080 (0.0045) [2024-06-22 01:49:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1787314176. Throughput: 0: 42612.4. Samples: 1787421720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:49:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 01:49:13,782][15401] Updated weights for policy 0, policy_version 109090 (0.0033) [2024-06-22 01:49:17,478][15401] Updated weights for policy 0, policy_version 109100 (0.0023) [2024-06-22 01:49:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1787510784. Throughput: 0: 42884.5. Samples: 1787683420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:49:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 01:49:21,320][15401] Updated weights for policy 0, policy_version 109110 (0.0033) [2024-06-22 01:49:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1787723776. Throughput: 0: 42792.9. Samples: 1787808300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:49:23,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-22 01:49:25,223][15401] Updated weights for policy 0, policy_version 109120 (0.0035) [2024-06-22 01:49:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1787953152. Throughput: 0: 42856.9. Samples: 1788064420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 01:49:29,084][15401] Updated weights for policy 0, policy_version 109130 (0.0044) [2024-06-22 01:49:32,726][15401] Updated weights for policy 0, policy_version 109140 (0.0032) [2024-06-22 01:49:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1788166144. Throughput: 0: 42789.3. Samples: 1788317320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 01:49:36,734][15401] Updated weights for policy 0, policy_version 109150 (0.0028) [2024-06-22 01:49:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1788379136. Throughput: 0: 42933.6. Samples: 1788451060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 01:49:40,502][15401] Updated weights for policy 0, policy_version 109160 (0.0037) [2024-06-22 01:49:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1788575744. Throughput: 0: 43061.8. Samples: 1788711020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 01:49:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109167_1788592128.pth... [2024-06-22 01:49:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108541_1778335744.pth [2024-06-22 01:49:44,418][15401] Updated weights for policy 0, policy_version 109170 (0.0033) [2024-06-22 01:49:45,626][15349] Signal inference workers to stop experience collection... (26350 times) [2024-06-22 01:49:45,681][15349] Signal inference workers to resume experience collection... (26350 times) [2024-06-22 01:49:45,681][15401] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-06-22 01:49:45,701][15401] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-06-22 01:49:48,050][15401] Updated weights for policy 0, policy_version 109180 (0.0039) [2024-06-22 01:49:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1788805120. Throughput: 0: 42816.9. Samples: 1788961620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 01:49:52,153][15401] Updated weights for policy 0, policy_version 109190 (0.0032) [2024-06-22 01:49:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1789018112. Throughput: 0: 43025.8. Samples: 1789094180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 01:49:55,695][15401] Updated weights for policy 0, policy_version 109200 (0.0042) [2024-06-22 01:49:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1789214720. Throughput: 0: 42889.5. Samples: 1789351740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:49:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 01:49:59,876][15401] Updated weights for policy 0, policy_version 109210 (0.0049) [2024-06-22 01:50:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1789444096. Throughput: 0: 42833.8. Samples: 1789610940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 01:50:03,412][15401] Updated weights for policy 0, policy_version 109220 (0.0037) [2024-06-22 01:50:07,264][15401] Updated weights for policy 0, policy_version 109230 (0.0039) [2024-06-22 01:50:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1789673472. Throughput: 0: 42928.3. Samples: 1789740080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:08,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 01:50:10,942][15401] Updated weights for policy 0, policy_version 109240 (0.0033) [2024-06-22 01:50:13,394][15132] Fps is (10 sec: 42579.2, 60 sec: 42595.3, 300 sec: 42764.4). Total num frames: 1789870080. Throughput: 0: 43027.8. Samples: 1790000860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:13,394][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 01:50:15,047][15401] Updated weights for policy 0, policy_version 109250 (0.0038) [2024-06-22 01:50:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1790099456. Throughput: 0: 43001.8. Samples: 1790252500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:18,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 01:50:18,503][15401] Updated weights for policy 0, policy_version 109260 (0.0028) [2024-06-22 01:50:22,678][15401] Updated weights for policy 0, policy_version 109270 (0.0036) [2024-06-22 01:50:23,390][15132] Fps is (10 sec: 42616.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1790296064. Throughput: 0: 42928.9. Samples: 1790382860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 01:50:25,972][15401] Updated weights for policy 0, policy_version 109280 (0.0031) [2024-06-22 01:50:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1790525440. Throughput: 0: 42983.4. Samples: 1790645280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 01:50:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 01:50:30,287][15401] Updated weights for policy 0, policy_version 109290 (0.0027) [2024-06-22 01:50:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1790738432. Throughput: 0: 43026.6. Samples: 1790897820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:33,394][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 01:50:33,597][15401] Updated weights for policy 0, policy_version 109300 (0.0024) [2024-06-22 01:50:37,870][15401] Updated weights for policy 0, policy_version 109310 (0.0030) [2024-06-22 01:50:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1790951424. Throughput: 0: 42949.4. Samples: 1791026900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 01:50:41,967][15401] Updated weights for policy 0, policy_version 109320 (0.0030) [2024-06-22 01:50:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 1791180800. Throughput: 0: 42986.4. Samples: 1791286140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 01:50:45,825][15401] Updated weights for policy 0, policy_version 109330 (0.0031) [2024-06-22 01:50:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1791377408. Throughput: 0: 42882.5. Samples: 1791540660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 01:50:49,544][15401] Updated weights for policy 0, policy_version 109340 (0.0042) [2024-06-22 01:50:53,342][15401] Updated weights for policy 0, policy_version 109350 (0.0032) [2024-06-22 01:50:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1791590400. Throughput: 0: 42802.3. Samples: 1791666180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 01:50:57,101][15401] Updated weights for policy 0, policy_version 109360 (0.0030) [2024-06-22 01:50:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1791836160. Throughput: 0: 42855.8. Samples: 1791929180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:50:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 01:51:00,845][15401] Updated weights for policy 0, policy_version 109370 (0.0035) [2024-06-22 01:51:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 1792016384. Throughput: 0: 42884.0. Samples: 1792182180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 01:51:04,673][15401] Updated weights for policy 0, policy_version 109380 (0.0039) [2024-06-22 01:51:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1792229376. Throughput: 0: 42706.7. Samples: 1792304660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:08,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 01:51:08,825][15401] Updated weights for policy 0, policy_version 109390 (0.0025) [2024-06-22 01:51:12,559][15401] Updated weights for policy 0, policy_version 109400 (0.0035) [2024-06-22 01:51:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43147.8, 300 sec: 42820.6). Total num frames: 1792458752. Throughput: 0: 42796.1. Samples: 1792571100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 01:51:13,467][15349] Signal inference workers to stop experience collection... (26400 times) [2024-06-22 01:51:13,468][15349] Signal inference workers to resume experience collection... (26400 times) [2024-06-22 01:51:13,507][15401] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-06-22 01:51:13,507][15401] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-06-22 01:51:16,360][15401] Updated weights for policy 0, policy_version 109410 (0.0025) [2024-06-22 01:51:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1792655360. Throughput: 0: 42824.5. Samples: 1792824920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 01:51:20,055][15401] Updated weights for policy 0, policy_version 109420 (0.0023) [2024-06-22 01:51:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1792868352. Throughput: 0: 42701.4. Samples: 1792948460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:23,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-22 01:51:24,253][15401] Updated weights for policy 0, policy_version 109430 (0.0039) [2024-06-22 01:51:27,642][15401] Updated weights for policy 0, policy_version 109440 (0.0040) [2024-06-22 01:51:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1793097728. Throughput: 0: 42687.4. Samples: 1793207060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 01:51:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 01:51:31,815][15401] Updated weights for policy 0, policy_version 109450 (0.0029) [2024-06-22 01:51:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1793277952. Throughput: 0: 42666.3. Samples: 1793460640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 01:51:35,283][15401] Updated weights for policy 0, policy_version 109460 (0.0037) [2024-06-22 01:51:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1793490944. Throughput: 0: 42609.3. Samples: 1793583600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 01:51:39,334][15401] Updated weights for policy 0, policy_version 109470 (0.0039) [2024-06-22 01:51:42,906][15401] Updated weights for policy 0, policy_version 109480 (0.0046) [2024-06-22 01:51:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1793720320. Throughput: 0: 42602.6. Samples: 1793846300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 01:51:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109481_1793736704.pth... [2024-06-22 01:51:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000108855_1783480320.pth [2024-06-22 01:51:47,384][15401] Updated weights for policy 0, policy_version 109490 (0.0036) [2024-06-22 01:51:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1793916928. Throughput: 0: 42662.6. Samples: 1794102000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:48,392][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 01:51:50,965][15401] Updated weights for policy 0, policy_version 109500 (0.0029) [2024-06-22 01:51:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1794146304. Throughput: 0: 42669.4. Samples: 1794224780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 01:51:54,917][15401] Updated weights for policy 0, policy_version 109510 (0.0041) [2024-06-22 01:51:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42654.3). Total num frames: 1794342912. Throughput: 0: 42433.7. Samples: 1794480620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:51:58,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 01:51:58,611][15401] Updated weights for policy 0, policy_version 109520 (0.0034) [2024-06-22 01:52:02,531][15401] Updated weights for policy 0, policy_version 109530 (0.0024) [2024-06-22 01:52:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1794572288. Throughput: 0: 42515.1. Samples: 1794738100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 01:52:06,076][15401] Updated weights for policy 0, policy_version 109540 (0.0034) [2024-06-22 01:52:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1794785280. Throughput: 0: 42444.3. Samples: 1794858460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 01:52:10,038][15401] Updated weights for policy 0, policy_version 109550 (0.0034) [2024-06-22 01:52:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1794998272. Throughput: 0: 42568.4. Samples: 1795122640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 01:52:13,666][15401] Updated weights for policy 0, policy_version 109560 (0.0035) [2024-06-22 01:52:17,597][15401] Updated weights for policy 0, policy_version 109570 (0.0038) [2024-06-22 01:52:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1795211264. Throughput: 0: 42608.9. Samples: 1795378040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 01:52:21,107][15349] Signal inference workers to stop experience collection... (26450 times) [2024-06-22 01:52:21,110][15349] Signal inference workers to resume experience collection... (26450 times) [2024-06-22 01:52:21,129][15401] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-06-22 01:52:21,130][15401] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-06-22 01:52:21,251][15401] Updated weights for policy 0, policy_version 109580 (0.0027) [2024-06-22 01:52:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1795440640. Throughput: 0: 42760.8. Samples: 1795507840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 01:52:25,143][15401] Updated weights for policy 0, policy_version 109590 (0.0043) [2024-06-22 01:52:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1795637248. Throughput: 0: 42823.2. Samples: 1795773340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 01:52:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 01:52:29,285][15401] Updated weights for policy 0, policy_version 109600 (0.0049) [2024-06-22 01:52:32,559][15401] Updated weights for policy 0, policy_version 109610 (0.0029) [2024-06-22 01:52:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1795883008. Throughput: 0: 42657.9. Samples: 1796021600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 01:52:36,790][15401] Updated weights for policy 0, policy_version 109620 (0.0031) [2024-06-22 01:52:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1796096000. Throughput: 0: 42984.4. Samples: 1796159080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 01:52:40,000][15401] Updated weights for policy 0, policy_version 109630 (0.0041) [2024-06-22 01:52:43,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1796259840. Throughput: 0: 43076.0. Samples: 1796419040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:43,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 01:52:44,395][15401] Updated weights for policy 0, policy_version 109640 (0.0024) [2024-06-22 01:52:48,169][15401] Updated weights for policy 0, policy_version 109650 (0.0029) [2024-06-22 01:52:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 1796521984. Throughput: 0: 42994.6. Samples: 1796672860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:48,399][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 01:52:52,114][15401] Updated weights for policy 0, policy_version 109660 (0.0035) [2024-06-22 01:52:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1796718592. Throughput: 0: 43246.6. Samples: 1796804560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:53,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 01:52:55,877][15401] Updated weights for policy 0, policy_version 109670 (0.0034) [2024-06-22 01:52:58,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1796898816. Throughput: 0: 42854.3. Samples: 1797051080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:52:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 01:52:59,747][15401] Updated weights for policy 0, policy_version 109680 (0.0046) [2024-06-22 01:53:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1797144576. Throughput: 0: 42885.6. Samples: 1797307900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:03,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 01:53:03,428][15401] Updated weights for policy 0, policy_version 109690 (0.0027) [2024-06-22 01:53:07,635][15401] Updated weights for policy 0, policy_version 109700 (0.0034) [2024-06-22 01:53:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1797357568. Throughput: 0: 42889.6. Samples: 1797437860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 01:53:11,033][15401] Updated weights for policy 0, policy_version 109710 (0.0036) [2024-06-22 01:53:13,396][15132] Fps is (10 sec: 39296.9, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 1797537792. Throughput: 0: 42519.3. Samples: 1797686980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:13,397][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 01:53:15,462][15401] Updated weights for policy 0, policy_version 109720 (0.0043) [2024-06-22 01:53:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 1797783552. Throughput: 0: 42651.9. Samples: 1797941040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:18,393][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 01:53:18,651][15401] Updated weights for policy 0, policy_version 109730 (0.0039) [2024-06-22 01:53:23,176][15401] Updated weights for policy 0, policy_version 109740 (0.0035) [2024-06-22 01:53:23,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1797980160. Throughput: 0: 42480.8. Samples: 1798070720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 01:53:24,559][15349] Signal inference workers to stop experience collection... (26500 times) [2024-06-22 01:53:24,581][15401] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-06-22 01:53:24,616][15349] Signal inference workers to resume experience collection... (26500 times) [2024-06-22 01:53:24,618][15401] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-06-22 01:53:26,294][15401] Updated weights for policy 0, policy_version 109750 (0.0037) [2024-06-22 01:53:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1798193152. Throughput: 0: 42239.1. Samples: 1798319800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 01:53:28,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 01:53:30,746][15401] Updated weights for policy 0, policy_version 109760 (0.0034) [2024-06-22 01:53:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1798422528. Throughput: 0: 42511.9. Samples: 1798585900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 01:53:34,175][15401] Updated weights for policy 0, policy_version 109770 (0.0039) [2024-06-22 01:53:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 1798602752. Throughput: 0: 42422.9. Samples: 1798713580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 01:53:38,560][15401] Updated weights for policy 0, policy_version 109780 (0.0026) [2024-06-22 01:53:41,720][15401] Updated weights for policy 0, policy_version 109790 (0.0038) [2024-06-22 01:53:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1798864896. Throughput: 0: 42583.9. Samples: 1798967360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 01:53:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109794_1798864896.pth... [2024-06-22 01:53:43,445][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109167_1788592128.pth [2024-06-22 01:53:45,982][15401] Updated weights for policy 0, policy_version 109800 (0.0039) [2024-06-22 01:53:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1799061504. Throughput: 0: 42691.7. Samples: 1799229020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 01:53:49,558][15401] Updated weights for policy 0, policy_version 109810 (0.0033) [2024-06-22 01:53:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1799258112. Throughput: 0: 42581.2. Samples: 1799354020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 01:53:53,643][15401] Updated weights for policy 0, policy_version 109820 (0.0035) [2024-06-22 01:53:57,270][15401] Updated weights for policy 0, policy_version 109830 (0.0036) [2024-06-22 01:53:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1799503872. Throughput: 0: 42739.5. Samples: 1799609980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:53:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 01:54:01,251][15401] Updated weights for policy 0, policy_version 109840 (0.0025) [2024-06-22 01:54:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1799700480. Throughput: 0: 42811.7. Samples: 1799867460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 01:54:04,842][15401] Updated weights for policy 0, policy_version 109850 (0.0033) [2024-06-22 01:54:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1799913472. Throughput: 0: 42597.0. Samples: 1799987580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 01:54:09,073][15401] Updated weights for policy 0, policy_version 109860 (0.0040) [2024-06-22 01:54:12,368][15401] Updated weights for policy 0, policy_version 109870 (0.0042) [2024-06-22 01:54:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43695.4, 300 sec: 42876.1). Total num frames: 1800159232. Throughput: 0: 42974.3. Samples: 1800253640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 01:54:16,946][15401] Updated weights for policy 0, policy_version 109880 (0.0029) [2024-06-22 01:54:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1800339456. Throughput: 0: 42814.0. Samples: 1800512520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:18,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 01:54:19,980][15401] Updated weights for policy 0, policy_version 109890 (0.0030) [2024-06-22 01:54:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1800552448. Throughput: 0: 42662.9. Samples: 1800633420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 01:54:24,318][15401] Updated weights for policy 0, policy_version 109900 (0.0033) [2024-06-22 01:54:27,808][15401] Updated weights for policy 0, policy_version 109910 (0.0025) [2024-06-22 01:54:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 1800814592. Throughput: 0: 42867.6. Samples: 1800896400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 01:54:32,174][15401] Updated weights for policy 0, policy_version 109920 (0.0033) [2024-06-22 01:54:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 1800978432. Throughput: 0: 42815.1. Samples: 1801155700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 01:54:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 01:54:35,336][15401] Updated weights for policy 0, policy_version 109930 (0.0025) [2024-06-22 01:54:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1801191424. Throughput: 0: 42607.1. Samples: 1801271340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:54:38,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 01:54:39,886][15401] Updated weights for policy 0, policy_version 109940 (0.0028) [2024-06-22 01:54:42,853][15401] Updated weights for policy 0, policy_version 109950 (0.0045) [2024-06-22 01:54:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1801453568. Throughput: 0: 42860.0. Samples: 1801538680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:54:43,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 01:54:47,537][15401] Updated weights for policy 0, policy_version 109960 (0.0026) [2024-06-22 01:54:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1801601024. Throughput: 0: 43033.3. Samples: 1801803960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:54:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 01:54:50,175][15349] Signal inference workers to stop experience collection... (26550 times) [2024-06-22 01:54:50,228][15401] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-06-22 01:54:50,230][15349] Signal inference workers to resume experience collection... (26550 times) [2024-06-22 01:54:50,246][15401] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-06-22 01:54:50,378][15401] Updated weights for policy 0, policy_version 109970 (0.0041) [2024-06-22 01:54:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1801846784. Throughput: 0: 43021.3. Samples: 1801923540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:54:53,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 01:54:54,942][15401] Updated weights for policy 0, policy_version 109980 (0.0022) [2024-06-22 01:54:57,816][15401] Updated weights for policy 0, policy_version 109990 (0.0037) [2024-06-22 01:54:58,389][15132] Fps is (10 sec: 50790.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1802108928. Throughput: 0: 43044.9. Samples: 1802190660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:54:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 01:55:02,953][15401] Updated weights for policy 0, policy_version 110000 (0.0045) [2024-06-22 01:55:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1802256384. Throughput: 0: 43104.8. Samples: 1802452240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 01:55:05,561][15401] Updated weights for policy 0, policy_version 110010 (0.0026) [2024-06-22 01:55:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43415.9, 300 sec: 42876.4). Total num frames: 1802518528. Throughput: 0: 43011.6. Samples: 1802569040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:08,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 01:55:10,338][15401] Updated weights for policy 0, policy_version 110020 (0.0032) [2024-06-22 01:55:13,151][15401] Updated weights for policy 0, policy_version 110030 (0.0023) [2024-06-22 01:55:13,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 1802747904. Throughput: 0: 43166.3. Samples: 1802838880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 01:55:17,862][15401] Updated weights for policy 0, policy_version 110040 (0.0032) [2024-06-22 01:55:18,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1802895360. Throughput: 0: 43186.6. Samples: 1803099100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 01:55:20,979][15401] Updated weights for policy 0, policy_version 110050 (0.0028) [2024-06-22 01:55:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1803157504. Throughput: 0: 43241.0. Samples: 1803217180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:23,396][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 01:55:25,557][15401] Updated weights for policy 0, policy_version 110060 (0.0037) [2024-06-22 01:55:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1803354112. Throughput: 0: 43141.4. Samples: 1803480040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:28,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 01:55:28,651][15401] Updated weights for policy 0, policy_version 110070 (0.0030) [2024-06-22 01:55:33,118][15401] Updated weights for policy 0, policy_version 110080 (0.0031) [2024-06-22 01:55:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1803550720. Throughput: 0: 42956.5. Samples: 1803737000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 01:55:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 01:55:36,410][15401] Updated weights for policy 0, policy_version 110090 (0.0039) [2024-06-22 01:55:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42765.1). Total num frames: 1803796480. Throughput: 0: 43048.1. Samples: 1803860700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:55:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 01:55:40,786][15401] Updated weights for policy 0, policy_version 110100 (0.0040) [2024-06-22 01:55:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1804009472. Throughput: 0: 42923.9. Samples: 1804122240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:55:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 01:55:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110108_1804009472.pth... [2024-06-22 01:55:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109481_1793736704.pth [2024-06-22 01:55:44,196][15401] Updated weights for policy 0, policy_version 110110 (0.0025) [2024-06-22 01:55:48,263][15401] Updated weights for policy 0, policy_version 110120 (0.0027) [2024-06-22 01:55:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1804206080. Throughput: 0: 42956.5. Samples: 1804385280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:55:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 01:55:51,684][15401] Updated weights for policy 0, policy_version 110130 (0.0029) [2024-06-22 01:55:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1804451840. Throughput: 0: 43212.5. Samples: 1804513500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:55:53,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 01:55:55,739][15401] Updated weights for policy 0, policy_version 110140 (0.0024) [2024-06-22 01:55:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1804632064. Throughput: 0: 42982.7. Samples: 1804773100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:55:58,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 01:55:59,158][15401] Updated weights for policy 0, policy_version 110150 (0.0034) [2024-06-22 01:56:03,303][15401] Updated weights for policy 0, policy_version 110160 (0.0032) [2024-06-22 01:56:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1804861440. Throughput: 0: 42940.8. Samples: 1805031440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 01:56:06,030][15349] Signal inference workers to stop experience collection... (26600 times) [2024-06-22 01:56:06,030][15349] Signal inference workers to resume experience collection... (26600 times) [2024-06-22 01:56:06,052][15401] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-06-22 01:56:06,052][15401] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-06-22 01:56:06,817][15401] Updated weights for policy 0, policy_version 110170 (0.0038) [2024-06-22 01:56:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 1805090816. Throughput: 0: 43215.6. Samples: 1805161880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 01:56:10,943][15401] Updated weights for policy 0, policy_version 110180 (0.0025) [2024-06-22 01:56:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1805271040. Throughput: 0: 42864.7. Samples: 1805408960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:13,394][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 01:56:14,630][15401] Updated weights for policy 0, policy_version 110190 (0.0037) [2024-06-22 01:56:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1805500416. Throughput: 0: 42832.0. Samples: 1805664440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:18,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 01:56:18,506][15401] Updated weights for policy 0, policy_version 110200 (0.0024) [2024-06-22 01:56:22,364][15401] Updated weights for policy 0, policy_version 110210 (0.0035) [2024-06-22 01:56:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1805729792. Throughput: 0: 43142.2. Samples: 1805802100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 01:56:25,980][15401] Updated weights for policy 0, policy_version 110220 (0.0049) [2024-06-22 01:56:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1805893632. Throughput: 0: 43024.0. Samples: 1806058320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:28,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-22 01:56:30,075][15401] Updated weights for policy 0, policy_version 110230 (0.0033) [2024-06-22 01:56:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1806155776. Throughput: 0: 42757.2. Samples: 1806309360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-22 01:56:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 01:56:33,912][15401] Updated weights for policy 0, policy_version 110240 (0.0035) [2024-06-22 01:56:37,497][15401] Updated weights for policy 0, policy_version 110250 (0.0027) [2024-06-22 01:56:38,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1806385152. Throughput: 0: 43024.5. Samples: 1806449600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:56:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 01:56:41,326][15401] Updated weights for policy 0, policy_version 110260 (0.0041) [2024-06-22 01:56:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1806548992. Throughput: 0: 42952.1. Samples: 1806705940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:56:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 01:56:45,136][15401] Updated weights for policy 0, policy_version 110270 (0.0038) [2024-06-22 01:56:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1806794752. Throughput: 0: 42697.8. Samples: 1806952840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:56:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 01:56:48,971][15401] Updated weights for policy 0, policy_version 110280 (0.0030) [2024-06-22 01:56:52,999][15401] Updated weights for policy 0, policy_version 110290 (0.0028) [2024-06-22 01:56:53,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1807007744. Throughput: 0: 42856.7. Samples: 1807090440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:56:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 01:56:56,591][15401] Updated weights for policy 0, policy_version 110300 (0.0035) [2024-06-22 01:56:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1807204352. Throughput: 0: 42880.9. Samples: 1807338600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:56:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 01:57:00,850][15401] Updated weights for policy 0, policy_version 110310 (0.0042) [2024-06-22 01:57:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1807450112. Throughput: 0: 42700.7. Samples: 1807585980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 01:57:04,297][15401] Updated weights for policy 0, policy_version 110320 (0.0041) [2024-06-22 01:57:08,374][15401] Updated weights for policy 0, policy_version 110330 (0.0031) [2024-06-22 01:57:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1807646720. Throughput: 0: 42731.5. Samples: 1807725020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 01:57:11,855][15401] Updated weights for policy 0, policy_version 110340 (0.0027) [2024-06-22 01:57:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1807843328. Throughput: 0: 42636.0. Samples: 1807976940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:13,394][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 01:57:16,209][15401] Updated weights for policy 0, policy_version 110350 (0.0051) [2024-06-22 01:57:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1808072704. Throughput: 0: 42579.2. Samples: 1808225420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 01:57:19,455][15401] Updated weights for policy 0, policy_version 110360 (0.0037) [2024-06-22 01:57:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1808269312. Throughput: 0: 42507.2. Samples: 1808362420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:23,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 01:57:23,769][15401] Updated weights for policy 0, policy_version 110370 (0.0035) [2024-06-22 01:57:27,125][15349] Signal inference workers to stop experience collection... (26650 times) [2024-06-22 01:57:27,126][15349] Signal inference workers to resume experience collection... (26650 times) [2024-06-22 01:57:27,175][15401] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-06-22 01:57:27,175][15401] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-06-22 01:57:27,264][15401] Updated weights for policy 0, policy_version 110380 (0.0029) [2024-06-22 01:57:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1808482304. Throughput: 0: 42486.5. Samples: 1808617840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 01:57:31,280][15401] Updated weights for policy 0, policy_version 110390 (0.0026) [2024-06-22 01:57:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1808711680. Throughput: 0: 42662.2. Samples: 1808872640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 01:57:33,399][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 01:57:34,947][15401] Updated weights for policy 0, policy_version 110400 (0.0038) [2024-06-22 01:57:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 1808908288. Throughput: 0: 42508.6. Samples: 1809003320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:57:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 01:57:39,029][15401] Updated weights for policy 0, policy_version 110410 (0.0045) [2024-06-22 01:57:42,526][15401] Updated weights for policy 0, policy_version 110420 (0.0029) [2024-06-22 01:57:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 1809137664. Throughput: 0: 42732.8. Samples: 1809261680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:57:43,393][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 01:57:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110421_1809137664.pth... [2024-06-22 01:57:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000109794_1798864896.pth [2024-06-22 01:57:46,877][15401] Updated weights for policy 0, policy_version 110430 (0.0039) [2024-06-22 01:57:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1809367040. Throughput: 0: 42757.0. Samples: 1809510040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:57:48,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-22 01:57:50,544][15401] Updated weights for policy 0, policy_version 110440 (0.0030) [2024-06-22 01:57:53,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 1809530880. Throughput: 0: 42515.5. Samples: 1809638220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:57:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 01:57:54,467][15401] Updated weights for policy 0, policy_version 110450 (0.0032) [2024-06-22 01:57:58,226][15401] Updated weights for policy 0, policy_version 110460 (0.0037) [2024-06-22 01:57:58,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1809776640. Throughput: 0: 42536.3. Samples: 1809891080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:57:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 01:58:02,507][15401] Updated weights for policy 0, policy_version 110470 (0.0039) [2024-06-22 01:58:03,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1810006016. Throughput: 0: 42604.8. Samples: 1810142640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 01:58:05,908][15401] Updated weights for policy 0, policy_version 110480 (0.0027) [2024-06-22 01:58:08,389][15132] Fps is (10 sec: 39322.8, 60 sec: 42052.3, 300 sec: 42821.5). Total num frames: 1810169856. Throughput: 0: 42487.6. Samples: 1810274360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 01:58:10,267][15401] Updated weights for policy 0, policy_version 110490 (0.0028) [2024-06-22 01:58:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1810415616. Throughput: 0: 42458.3. Samples: 1810528460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 01:58:13,423][15401] Updated weights for policy 0, policy_version 110500 (0.0033) [2024-06-22 01:58:17,733][15401] Updated weights for policy 0, policy_version 110510 (0.0024) [2024-06-22 01:58:18,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1810644992. Throughput: 0: 42537.0. Samples: 1810786800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 01:58:20,910][15401] Updated weights for policy 0, policy_version 110520 (0.0032) [2024-06-22 01:58:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1810808832. Throughput: 0: 42345.8. Samples: 1810908880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 01:58:25,461][15401] Updated weights for policy 0, policy_version 110530 (0.0040) [2024-06-22 01:58:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1811054592. Throughput: 0: 42232.5. Samples: 1811162040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:28,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 01:58:28,556][15401] Updated weights for policy 0, policy_version 110540 (0.0027) [2024-06-22 01:58:33,096][15401] Updated weights for policy 0, policy_version 110550 (0.0039) [2024-06-22 01:58:33,256][15349] Signal inference workers to stop experience collection... (26700 times) [2024-06-22 01:58:33,256][15349] Signal inference workers to resume experience collection... (26700 times) [2024-06-22 01:58:33,293][15401] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-06-22 01:58:33,293][15401] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-06-22 01:58:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 1811283968. Throughput: 0: 42429.9. Samples: 1811419380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 01:58:36,435][15401] Updated weights for policy 0, policy_version 110560 (0.0039) [2024-06-22 01:58:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1811464192. Throughput: 0: 42399.1. Samples: 1811546180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 01:58:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 01:58:40,781][15401] Updated weights for policy 0, policy_version 110570 (0.0043) [2024-06-22 01:58:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 1811693568. Throughput: 0: 42572.3. Samples: 1811806820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:58:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 01:58:44,270][15401] Updated weights for policy 0, policy_version 110580 (0.0042) [2024-06-22 01:58:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1811890176. Throughput: 0: 42712.6. Samples: 1812064700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:58:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 01:58:48,463][15401] Updated weights for policy 0, policy_version 110590 (0.0030) [2024-06-22 01:58:52,001][15401] Updated weights for policy 0, policy_version 110600 (0.0039) [2024-06-22 01:58:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1812119552. Throughput: 0: 42484.4. Samples: 1812186160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:58:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 01:58:56,156][15401] Updated weights for policy 0, policy_version 110610 (0.0032) [2024-06-22 01:58:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1812316160. Throughput: 0: 42619.6. Samples: 1812446340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:58:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 01:58:59,564][15401] Updated weights for policy 0, policy_version 110620 (0.0029) [2024-06-22 01:59:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 1812512768. Throughput: 0: 42637.8. Samples: 1812705500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 01:59:03,876][15401] Updated weights for policy 0, policy_version 110630 (0.0035) [2024-06-22 01:59:07,124][15401] Updated weights for policy 0, policy_version 110640 (0.0035) [2024-06-22 01:59:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1812742144. Throughput: 0: 42613.3. Samples: 1812826480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 01:59:11,600][15401] Updated weights for policy 0, policy_version 110650 (0.0042) [2024-06-22 01:59:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1812971520. Throughput: 0: 42610.3. Samples: 1813079500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 01:59:14,758][15401] Updated weights for policy 0, policy_version 110660 (0.0029) [2024-06-22 01:59:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1813151744. Throughput: 0: 42725.3. Samples: 1813342020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:18,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 01:59:19,400][15401] Updated weights for policy 0, policy_version 110670 (0.0037) [2024-06-22 01:59:22,964][15401] Updated weights for policy 0, policy_version 110680 (0.0031) [2024-06-22 01:59:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1813381120. Throughput: 0: 42588.0. Samples: 1813462740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:23,393][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 01:59:27,016][15401] Updated weights for policy 0, policy_version 110690 (0.0047) [2024-06-22 01:59:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1813610496. Throughput: 0: 42523.1. Samples: 1813720360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 01:59:30,383][15401] Updated weights for policy 0, policy_version 110700 (0.0029) [2024-06-22 01:59:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1813807104. Throughput: 0: 42571.5. Samples: 1813980420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 01:59:34,729][15401] Updated weights for policy 0, policy_version 110710 (0.0023) [2024-06-22 01:59:37,910][15401] Updated weights for policy 0, policy_version 110720 (0.0036) [2024-06-22 01:59:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1814036480. Throughput: 0: 42599.9. Samples: 1814103160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 01:59:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 01:59:42,455][15401] Updated weights for policy 0, policy_version 110730 (0.0022) [2024-06-22 01:59:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1814265856. Throughput: 0: 42719.4. Samples: 1814368720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 01:59:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 01:59:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110734_1814265856.pth... [2024-06-22 01:59:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110108_1804009472.pth [2024-06-22 01:59:45,370][15401] Updated weights for policy 0, policy_version 110740 (0.0030) [2024-06-22 01:59:48,396][15132] Fps is (10 sec: 39296.7, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 1814429696. Throughput: 0: 42515.7. Samples: 1814618980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 01:59:48,396][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 01:59:50,277][15401] Updated weights for policy 0, policy_version 110750 (0.0045) [2024-06-22 01:59:52,925][15401] Updated weights for policy 0, policy_version 110760 (0.0043) [2024-06-22 01:59:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1814691840. Throughput: 0: 42583.1. Samples: 1814742720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 01:59:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 01:59:57,835][15401] Updated weights for policy 0, policy_version 110770 (0.0037) [2024-06-22 01:59:58,389][15132] Fps is (10 sec: 45904.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1814888448. Throughput: 0: 42873.8. Samples: 1815008820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 01:59:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:00:01,141][15401] Updated weights for policy 0, policy_version 110780 (0.0026) [2024-06-22 02:00:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1815068672. Throughput: 0: 42768.9. Samples: 1815266620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 02:00:05,468][15401] Updated weights for policy 0, policy_version 110790 (0.0043) [2024-06-22 02:00:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1815330816. Throughput: 0: 42760.6. Samples: 1815386860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 02:00:08,837][15401] Updated weights for policy 0, policy_version 110800 (0.0027) [2024-06-22 02:00:10,943][15349] Signal inference workers to stop experience collection... (26750 times) [2024-06-22 02:00:10,991][15401] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-06-22 02:00:11,000][15349] Signal inference workers to resume experience collection... (26750 times) [2024-06-22 02:00:11,010][15401] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-06-22 02:00:13,011][15401] Updated weights for policy 0, policy_version 110810 (0.0038) [2024-06-22 02:00:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1815543808. Throughput: 0: 43001.7. Samples: 1815655440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 02:00:16,439][15401] Updated weights for policy 0, policy_version 110820 (0.0032) [2024-06-22 02:00:18,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 1815724032. Throughput: 0: 42703.5. Samples: 1815902180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:18,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:00:20,595][15401] Updated weights for policy 0, policy_version 110830 (0.0040) [2024-06-22 02:00:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1815969792. Throughput: 0: 42816.4. Samples: 1816029900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 02:00:24,488][15401] Updated weights for policy 0, policy_version 110840 (0.0043) [2024-06-22 02:00:28,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1816150016. Throughput: 0: 42669.6. Samples: 1816288840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 02:00:28,521][15401] Updated weights for policy 0, policy_version 110850 (0.0026) [2024-06-22 02:00:31,980][15401] Updated weights for policy 0, policy_version 110860 (0.0022) [2024-06-22 02:00:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1816379392. Throughput: 0: 42726.9. Samples: 1816541420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 02:00:36,049][15401] Updated weights for policy 0, policy_version 110870 (0.0034) [2024-06-22 02:00:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1816608768. Throughput: 0: 42882.3. Samples: 1816672420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:00:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 02:00:39,460][15401] Updated weights for policy 0, policy_version 110880 (0.0039) [2024-06-22 02:00:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 1816788992. Throughput: 0: 42870.3. Samples: 1816937980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:00:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 02:00:43,610][15401] Updated weights for policy 0, policy_version 110890 (0.0023) [2024-06-22 02:00:46,951][15401] Updated weights for policy 0, policy_version 110900 (0.0034) [2024-06-22 02:00:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43149.1, 300 sec: 42598.4). Total num frames: 1817018368. Throughput: 0: 42720.8. Samples: 1817189060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:00:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 02:00:51,231][15401] Updated weights for policy 0, policy_version 110910 (0.0042) [2024-06-22 02:00:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1817247744. Throughput: 0: 42891.5. Samples: 1817316980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:00:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 02:00:54,459][15401] Updated weights for policy 0, policy_version 110920 (0.0039) [2024-06-22 02:00:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1817427968. Throughput: 0: 42620.1. Samples: 1817573340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:00:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 02:00:58,834][15401] Updated weights for policy 0, policy_version 110930 (0.0035) [2024-06-22 02:01:02,433][15401] Updated weights for policy 0, policy_version 110940 (0.0046) [2024-06-22 02:01:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1817657344. Throughput: 0: 42629.4. Samples: 1817820400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:01:06,485][15401] Updated weights for policy 0, policy_version 110950 (0.0030) [2024-06-22 02:01:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1817870336. Throughput: 0: 42674.4. Samples: 1817950240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 02:01:10,018][15401] Updated weights for policy 0, policy_version 110960 (0.0029) [2024-06-22 02:01:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1818066944. Throughput: 0: 42766.1. Samples: 1818213320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:13,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 02:01:14,087][15401] Updated weights for policy 0, policy_version 110970 (0.0029) [2024-06-22 02:01:17,640][15401] Updated weights for policy 0, policy_version 110980 (0.0036) [2024-06-22 02:01:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 1818312704. Throughput: 0: 42607.1. Samples: 1818458740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:18,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 02:01:20,213][15349] Signal inference workers to stop experience collection... (26800 times) [2024-06-22 02:01:20,262][15349] Signal inference workers to resume experience collection... (26800 times) [2024-06-22 02:01:20,262][15401] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-06-22 02:01:20,278][15401] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-06-22 02:01:22,001][15401] Updated weights for policy 0, policy_version 110990 (0.0026) [2024-06-22 02:01:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1818525696. Throughput: 0: 42665.2. Samples: 1818592360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:23,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-22 02:01:25,333][15401] Updated weights for policy 0, policy_version 111000 (0.0024) [2024-06-22 02:01:28,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42323.5, 300 sec: 42487.0). Total num frames: 1818689536. Throughput: 0: 42434.5. Samples: 1818847640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:28,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 02:01:29,779][15401] Updated weights for policy 0, policy_version 111010 (0.0042) [2024-06-22 02:01:33,388][15401] Updated weights for policy 0, policy_version 111020 (0.0040) [2024-06-22 02:01:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1818951680. Throughput: 0: 42339.1. Samples: 1819094320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 02:01:37,539][15401] Updated weights for policy 0, policy_version 111030 (0.0035) [2024-06-22 02:01:38,390][15132] Fps is (10 sec: 45886.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1819148288. Throughput: 0: 42423.6. Samples: 1819226040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:01:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 02:01:41,296][15401] Updated weights for policy 0, policy_version 111040 (0.0034) [2024-06-22 02:01:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1819344896. Throughput: 0: 42379.6. Samples: 1819480420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:01:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 02:01:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111045_1819361280.pth... [2024-06-22 02:01:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110421_1809137664.pth [2024-06-22 02:01:45,284][15401] Updated weights for policy 0, policy_version 111050 (0.0034) [2024-06-22 02:01:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1819574272. Throughput: 0: 42337.4. Samples: 1819725680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:01:48,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 02:01:49,029][15401] Updated weights for policy 0, policy_version 111060 (0.0039) [2024-06-22 02:01:53,163][15401] Updated weights for policy 0, policy_version 111070 (0.0024) [2024-06-22 02:01:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1819787264. Throughput: 0: 42496.5. Samples: 1819862580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:01:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 02:01:56,483][15401] Updated weights for policy 0, policy_version 111080 (0.0039) [2024-06-22 02:01:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1819983872. Throughput: 0: 42285.8. Samples: 1820116180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:01:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 02:02:00,642][15401] Updated weights for policy 0, policy_version 111090 (0.0040) [2024-06-22 02:02:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1820229632. Throughput: 0: 42361.4. Samples: 1820365000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 02:02:04,739][15401] Updated weights for policy 0, policy_version 111100 (0.0030) [2024-06-22 02:02:08,268][15401] Updated weights for policy 0, policy_version 111110 (0.0037) [2024-06-22 02:02:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1820426240. Throughput: 0: 42391.5. Samples: 1820499980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 02:02:12,169][15401] Updated weights for policy 0, policy_version 111120 (0.0034) [2024-06-22 02:02:13,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1820639232. Throughput: 0: 42494.7. Samples: 1820759900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:13,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 02:02:15,887][15401] Updated weights for policy 0, policy_version 111130 (0.0033) [2024-06-22 02:02:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1820868608. Throughput: 0: 42628.9. Samples: 1821012620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 02:02:19,702][15401] Updated weights for policy 0, policy_version 111140 (0.0027) [2024-06-22 02:02:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1821065216. Throughput: 0: 42637.0. Samples: 1821144700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:02:23,454][15401] Updated weights for policy 0, policy_version 111150 (0.0034) [2024-06-22 02:02:27,242][15401] Updated weights for policy 0, policy_version 111160 (0.0041) [2024-06-22 02:02:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 1821261824. Throughput: 0: 42614.7. Samples: 1821398080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 02:02:31,115][15401] Updated weights for policy 0, policy_version 111170 (0.0030) [2024-06-22 02:02:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1821507584. Throughput: 0: 42946.2. Samples: 1821658160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 02:02:34,943][15401] Updated weights for policy 0, policy_version 111180 (0.0032) [2024-06-22 02:02:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 1821704192. Throughput: 0: 42824.9. Samples: 1821789700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 02:02:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 02:02:38,644][15401] Updated weights for policy 0, policy_version 111190 (0.0035) [2024-06-22 02:02:42,536][15401] Updated weights for policy 0, policy_version 111200 (0.0024) [2024-06-22 02:02:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 1821917184. Throughput: 0: 42762.1. Samples: 1822040580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:02:43,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 02:02:46,452][15401] Updated weights for policy 0, policy_version 111210 (0.0032) [2024-06-22 02:02:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1822130176. Throughput: 0: 42864.9. Samples: 1822293920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:02:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 02:02:50,586][15401] Updated weights for policy 0, policy_version 111220 (0.0026) [2024-06-22 02:02:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1822343168. Throughput: 0: 42696.0. Samples: 1822421300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:02:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 02:02:54,524][15401] Updated weights for policy 0, policy_version 111230 (0.0038) [2024-06-22 02:02:58,319][15401] Updated weights for policy 0, policy_version 111240 (0.0031) [2024-06-22 02:02:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1822556160. Throughput: 0: 42646.8. Samples: 1822678900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:02:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 02:03:02,189][15401] Updated weights for policy 0, policy_version 111250 (0.0038) [2024-06-22 02:03:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1822769152. Throughput: 0: 42616.9. Samples: 1822930380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 02:03:06,007][15401] Updated weights for policy 0, policy_version 111260 (0.0023) [2024-06-22 02:03:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1822982144. Throughput: 0: 42608.8. Samples: 1823062100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 02:03:09,751][15401] Updated weights for policy 0, policy_version 111270 (0.0039) [2024-06-22 02:03:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1823195136. Throughput: 0: 42685.3. Samples: 1823318920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 02:03:13,462][15401] Updated weights for policy 0, policy_version 111280 (0.0033) [2024-06-22 02:03:17,195][15401] Updated weights for policy 0, policy_version 111290 (0.0028) [2024-06-22 02:03:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1823408128. Throughput: 0: 42553.5. Samples: 1823573060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 02:03:21,169][15401] Updated weights for policy 0, policy_version 111300 (0.0025) [2024-06-22 02:03:22,164][15349] Signal inference workers to stop experience collection... (26850 times) [2024-06-22 02:03:22,192][15401] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-06-22 02:03:22,227][15349] Signal inference workers to resume experience collection... (26850 times) [2024-06-22 02:03:22,228][15401] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-06-22 02:03:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1823637504. Throughput: 0: 42471.1. Samples: 1823700900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 02:03:25,046][15401] Updated weights for policy 0, policy_version 111310 (0.0025) [2024-06-22 02:03:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1823834112. Throughput: 0: 42580.5. Samples: 1823956600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 02:03:28,902][15401] Updated weights for policy 0, policy_version 111320 (0.0037) [2024-06-22 02:03:32,732][15401] Updated weights for policy 0, policy_version 111330 (0.0022) [2024-06-22 02:03:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1824047104. Throughput: 0: 42695.1. Samples: 1824215200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 02:03:36,875][15401] Updated weights for policy 0, policy_version 111340 (0.0034) [2024-06-22 02:03:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1824276480. Throughput: 0: 42839.5. Samples: 1824349080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 02:03:40,286][15401] Updated weights for policy 0, policy_version 111350 (0.0030) [2024-06-22 02:03:43,390][15132] Fps is (10 sec: 44234.0, 60 sec: 42872.8, 300 sec: 42709.4). Total num frames: 1824489472. Throughput: 0: 42742.9. Samples: 1824602360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 02:03:43,391][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 02:03:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111358_1824489472.pth... [2024-06-22 02:03:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000110734_1814265856.pth [2024-06-22 02:03:44,246][15401] Updated weights for policy 0, policy_version 111360 (0.0033) [2024-06-22 02:03:48,046][15401] Updated weights for policy 0, policy_version 111370 (0.0033) [2024-06-22 02:03:48,391][15132] Fps is (10 sec: 42590.4, 60 sec: 42870.0, 300 sec: 42653.7). Total num frames: 1824702464. Throughput: 0: 42865.7. Samples: 1824859420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:03:48,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 02:03:51,829][15401] Updated weights for policy 0, policy_version 111380 (0.0040) [2024-06-22 02:03:53,389][15132] Fps is (10 sec: 42601.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1824915456. Throughput: 0: 42822.3. Samples: 1824989100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:03:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 02:03:55,662][15401] Updated weights for policy 0, policy_version 111390 (0.0027) [2024-06-22 02:03:58,389][15132] Fps is (10 sec: 42607.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1825128448. Throughput: 0: 42981.4. Samples: 1825253080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:03:58,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 02:03:59,355][15401] Updated weights for policy 0, policy_version 111400 (0.0036) [2024-06-22 02:04:03,330][15401] Updated weights for policy 0, policy_version 111410 (0.0024) [2024-06-22 02:04:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1825341440. Throughput: 0: 42954.5. Samples: 1825506020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 02:04:06,803][15401] Updated weights for policy 0, policy_version 111420 (0.0033) [2024-06-22 02:04:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1825587200. Throughput: 0: 42938.6. Samples: 1825633140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:08,396][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 02:04:10,799][15401] Updated weights for policy 0, policy_version 111430 (0.0033) [2024-06-22 02:04:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1825767424. Throughput: 0: 43112.8. Samples: 1825896680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 02:04:14,686][15401] Updated weights for policy 0, policy_version 111440 (0.0036) [2024-06-22 02:04:18,306][15401] Updated weights for policy 0, policy_version 111450 (0.0034) [2024-06-22 02:04:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 1825996800. Throughput: 0: 42938.6. Samples: 1826147440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 02:04:22,140][15401] Updated weights for policy 0, policy_version 111460 (0.0032) [2024-06-22 02:04:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1826226176. Throughput: 0: 42835.7. Samples: 1826276680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 02:04:26,353][15401] Updated weights for policy 0, policy_version 111470 (0.0040) [2024-06-22 02:04:28,390][15132] Fps is (10 sec: 40956.4, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 1826406400. Throughput: 0: 42913.1. Samples: 1826533460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:28,391][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 02:04:29,665][15401] Updated weights for policy 0, policy_version 111480 (0.0027) [2024-06-22 02:04:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1826619392. Throughput: 0: 42975.7. Samples: 1826793240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:33,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 02:04:34,114][15401] Updated weights for policy 0, policy_version 111490 (0.0036) [2024-06-22 02:04:37,642][15401] Updated weights for policy 0, policy_version 111500 (0.0038) [2024-06-22 02:04:37,644][15349] Signal inference workers to stop experience collection... (26900 times) [2024-06-22 02:04:37,644][15349] Signal inference workers to resume experience collection... (26900 times) [2024-06-22 02:04:37,683][15401] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-06-22 02:04:37,684][15401] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-06-22 02:04:38,390][15132] Fps is (10 sec: 47515.9, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 1826881536. Throughput: 0: 42982.7. Samples: 1826923340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 02:04:41,895][15401] Updated weights for policy 0, policy_version 111510 (0.0038) [2024-06-22 02:04:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.8, 300 sec: 42765.9). Total num frames: 1827045376. Throughput: 0: 42807.0. Samples: 1827179400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:04:43,396][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 02:04:45,269][15401] Updated weights for policy 0, policy_version 111520 (0.0050) [2024-06-22 02:04:48,389][15132] Fps is (10 sec: 39323.3, 60 sec: 42872.9, 300 sec: 42653.9). Total num frames: 1827274752. Throughput: 0: 42778.4. Samples: 1827431040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:04:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 02:04:49,552][15401] Updated weights for policy 0, policy_version 111530 (0.0034) [2024-06-22 02:04:52,903][15401] Updated weights for policy 0, policy_version 111540 (0.0027) [2024-06-22 02:04:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1827520512. Throughput: 0: 42876.1. Samples: 1827562560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:04:53,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 02:04:57,041][15401] Updated weights for policy 0, policy_version 111550 (0.0028) [2024-06-22 02:04:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1827667968. Throughput: 0: 42640.1. Samples: 1827815480. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:04:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 02:05:00,436][15401] Updated weights for policy 0, policy_version 111560 (0.0046) [2024-06-22 02:05:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1827897344. Throughput: 0: 42886.6. Samples: 1828077340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 02:05:04,580][15401] Updated weights for policy 0, policy_version 111570 (0.0028) [2024-06-22 02:05:07,925][15401] Updated weights for policy 0, policy_version 111580 (0.0033) [2024-06-22 02:05:08,389][15132] Fps is (10 sec: 49151.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1828159488. Throughput: 0: 43004.0. Samples: 1828211860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 02:05:12,002][15401] Updated weights for policy 0, policy_version 111590 (0.0037) [2024-06-22 02:05:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1828323328. Throughput: 0: 42964.3. Samples: 1828466820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 02:05:15,388][15401] Updated weights for policy 0, policy_version 111600 (0.0031) [2024-06-22 02:05:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1828569088. Throughput: 0: 42889.8. Samples: 1828723280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 02:05:19,492][15401] Updated weights for policy 0, policy_version 111610 (0.0049) [2024-06-22 02:05:22,981][15401] Updated weights for policy 0, policy_version 111620 (0.0030) [2024-06-22 02:05:23,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 1828798464. Throughput: 0: 43035.3. Samples: 1828859920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 02:05:27,010][15401] Updated weights for policy 0, policy_version 111630 (0.0028) [2024-06-22 02:05:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.9, 300 sec: 42653.9). Total num frames: 1828962304. Throughput: 0: 42940.4. Samples: 1829111720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:28,399][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 02:05:30,741][15401] Updated weights for policy 0, policy_version 111640 (0.0033) [2024-06-22 02:05:33,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1829224448. Throughput: 0: 43050.2. Samples: 1829368300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:33,399][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:05:34,662][15401] Updated weights for policy 0, policy_version 111650 (0.0038) [2024-06-22 02:05:38,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.7, 300 sec: 42876.1). Total num frames: 1829437440. Throughput: 0: 43189.8. Samples: 1829506100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 02:05:38,399][15401] Updated weights for policy 0, policy_version 111660 (0.0030) [2024-06-22 02:05:38,548][15349] Signal inference workers to stop experience collection... (26950 times) [2024-06-22 02:05:38,552][15349] Signal inference workers to resume experience collection... (26950 times) [2024-06-22 02:05:38,600][15401] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-06-22 02:05:38,600][15401] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-06-22 02:05:42,772][15401] Updated weights for policy 0, policy_version 111670 (0.0039) [2024-06-22 02:05:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1829617664. Throughput: 0: 43339.9. Samples: 1829765780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 24.0) [2024-06-22 02:05:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 02:05:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111672_1829634048.pth... [2024-06-22 02:05:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111045_1819361280.pth [2024-06-22 02:05:46,017][15401] Updated weights for policy 0, policy_version 111680 (0.0034) [2024-06-22 02:05:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1829879808. Throughput: 0: 43019.2. Samples: 1830013200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:05:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 02:05:50,239][15401] Updated weights for policy 0, policy_version 111690 (0.0034) [2024-06-22 02:05:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1830060032. Throughput: 0: 43172.8. Samples: 1830154640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:05:53,399][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 02:05:53,747][15401] Updated weights for policy 0, policy_version 111700 (0.0030) [2024-06-22 02:05:57,894][15401] Updated weights for policy 0, policy_version 111710 (0.0041) [2024-06-22 02:05:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1830273024. Throughput: 0: 43179.6. Samples: 1830409900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:05:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 02:06:01,350][15401] Updated weights for policy 0, policy_version 111720 (0.0039) [2024-06-22 02:06:03,392][15132] Fps is (10 sec: 47504.2, 60 sec: 43962.3, 300 sec: 42931.3). Total num frames: 1830535168. Throughput: 0: 42969.9. Samples: 1830657020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:03,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 02:06:05,450][15401] Updated weights for policy 0, policy_version 111730 (0.0027) [2024-06-22 02:06:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1830682624. Throughput: 0: 43131.2. Samples: 1830800820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 02:06:09,020][15401] Updated weights for policy 0, policy_version 111740 (0.0044) [2024-06-22 02:06:13,129][15401] Updated weights for policy 0, policy_version 111750 (0.0031) [2024-06-22 02:06:13,389][15132] Fps is (10 sec: 37691.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1830912000. Throughput: 0: 43014.4. Samples: 1831047360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 02:06:16,566][15401] Updated weights for policy 0, policy_version 111760 (0.0042) [2024-06-22 02:06:18,390][15132] Fps is (10 sec: 49152.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1831174144. Throughput: 0: 42781.4. Samples: 1831293460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:18,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 02:06:20,914][15401] Updated weights for policy 0, policy_version 111770 (0.0026) [2024-06-22 02:06:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 1831337984. Throughput: 0: 42895.8. Samples: 1831436420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 02:06:24,106][15401] Updated weights for policy 0, policy_version 111780 (0.0022) [2024-06-22 02:06:28,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1831534592. Throughput: 0: 42748.1. Samples: 1831689440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 02:06:28,733][15401] Updated weights for policy 0, policy_version 111790 (0.0036) [2024-06-22 02:06:32,060][15401] Updated weights for policy 0, policy_version 111800 (0.0030) [2024-06-22 02:06:33,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1831813120. Throughput: 0: 42814.3. Samples: 1831939840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 02:06:36,442][15401] Updated weights for policy 0, policy_version 111810 (0.0023) [2024-06-22 02:06:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1831993344. Throughput: 0: 42858.4. Samples: 1832083260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 02:06:39,622][15401] Updated weights for policy 0, policy_version 111820 (0.0037) [2024-06-22 02:06:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 1832189952. Throughput: 0: 42720.4. Samples: 1832332320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 02:06:43,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 02:06:43,912][15401] Updated weights for policy 0, policy_version 111830 (0.0038) [2024-06-22 02:06:47,330][15401] Updated weights for policy 0, policy_version 111840 (0.0036) [2024-06-22 02:06:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1832435712. Throughput: 0: 42773.6. Samples: 1832581740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:06:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:06:51,611][15401] Updated weights for policy 0, policy_version 111850 (0.0038) [2024-06-22 02:06:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1832632320. Throughput: 0: 42661.2. Samples: 1832720580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:06:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 02:06:54,907][15401] Updated weights for policy 0, policy_version 111860 (0.0041) [2024-06-22 02:06:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1832845312. Throughput: 0: 42803.1. Samples: 1832973500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:06:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 02:06:59,105][15401] Updated weights for policy 0, policy_version 111870 (0.0022) [2024-06-22 02:07:00,924][15349] Signal inference workers to stop experience collection... (27000 times) [2024-06-22 02:07:00,924][15349] Signal inference workers to resume experience collection... (27000 times) [2024-06-22 02:07:00,941][15401] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-06-22 02:07:00,941][15401] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-06-22 02:07:02,539][15401] Updated weights for policy 0, policy_version 111880 (0.0035) [2024-06-22 02:07:03,390][15132] Fps is (10 sec: 47514.6, 60 sec: 42872.9, 300 sec: 42987.2). Total num frames: 1833107456. Throughput: 0: 42932.4. Samples: 1833225420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 02:07:06,564][15401] Updated weights for policy 0, policy_version 111890 (0.0026) [2024-06-22 02:07:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42876.5). Total num frames: 1833287680. Throughput: 0: 42797.6. Samples: 1833362300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 02:07:10,403][15401] Updated weights for policy 0, policy_version 111900 (0.0025) [2024-06-22 02:07:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1833484288. Throughput: 0: 42750.6. Samples: 1833613220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 02:07:14,162][15401] Updated weights for policy 0, policy_version 111910 (0.0037) [2024-06-22 02:07:18,045][15401] Updated weights for policy 0, policy_version 111920 (0.0031) [2024-06-22 02:07:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1833713664. Throughput: 0: 42946.2. Samples: 1833872420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 02:07:21,671][15401] Updated weights for policy 0, policy_version 111930 (0.0042) [2024-06-22 02:07:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1833910272. Throughput: 0: 42694.2. Samples: 1834004500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 02:07:25,421][15401] Updated weights for policy 0, policy_version 111940 (0.0021) [2024-06-22 02:07:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1834123264. Throughput: 0: 42773.4. Samples: 1834257120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 02:07:29,359][15401] Updated weights for policy 0, policy_version 111950 (0.0023) [2024-06-22 02:07:33,254][15401] Updated weights for policy 0, policy_version 111960 (0.0043) [2024-06-22 02:07:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1834352640. Throughput: 0: 43041.7. Samples: 1834518620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 02:07:36,956][15401] Updated weights for policy 0, policy_version 111970 (0.0037) [2024-06-22 02:07:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1834549248. Throughput: 0: 42866.1. Samples: 1834649540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 02:07:40,820][15401] Updated weights for policy 0, policy_version 111980 (0.0036) [2024-06-22 02:07:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1834778624. Throughput: 0: 42848.1. Samples: 1834901660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 02:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111986_1834778624.pth... [2024-06-22 02:07:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111358_1824489472.pth [2024-06-22 02:07:44,561][15401] Updated weights for policy 0, policy_version 111990 (0.0042) [2024-06-22 02:07:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1834991616. Throughput: 0: 42877.0. Samples: 1835154880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:07:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 02:07:48,464][15401] Updated weights for policy 0, policy_version 112000 (0.0033) [2024-06-22 02:07:52,153][15401] Updated weights for policy 0, policy_version 112010 (0.0034) [2024-06-22 02:07:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1835188224. Throughput: 0: 42699.0. Samples: 1835283760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:07:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 02:07:56,035][15401] Updated weights for policy 0, policy_version 112020 (0.0036) [2024-06-22 02:07:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1835417600. Throughput: 0: 42675.5. Samples: 1835533620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:07:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 02:08:00,132][15401] Updated weights for policy 0, policy_version 112030 (0.0030) [2024-06-22 02:08:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 1835630592. Throughput: 0: 42536.4. Samples: 1835786560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 02:08:03,653][15401] Updated weights for policy 0, policy_version 112040 (0.0028) [2024-06-22 02:08:07,729][15401] Updated weights for policy 0, policy_version 112050 (0.0023) [2024-06-22 02:08:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1835843584. Throughput: 0: 42649.2. Samples: 1835923720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 02:08:11,125][15401] Updated weights for policy 0, policy_version 112060 (0.0042) [2024-06-22 02:08:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 1836040192. Throughput: 0: 42667.3. Samples: 1836177160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 02:08:15,227][15401] Updated weights for policy 0, policy_version 112070 (0.0035) [2024-06-22 02:08:16,802][15349] Signal inference workers to stop experience collection... (27050 times) [2024-06-22 02:08:16,805][15349] Signal inference workers to resume experience collection... (27050 times) [2024-06-22 02:08:16,852][15401] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-06-22 02:08:16,852][15401] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-06-22 02:08:18,394][15132] Fps is (10 sec: 44217.2, 60 sec: 42868.2, 300 sec: 42875.4). Total num frames: 1836285952. Throughput: 0: 42565.6. Samples: 1836434260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:18,394][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 02:08:18,945][15401] Updated weights for policy 0, policy_version 112080 (0.0038) [2024-06-22 02:08:22,869][15401] Updated weights for policy 0, policy_version 112090 (0.0045) [2024-06-22 02:08:23,389][15132] Fps is (10 sec: 44238.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1836482560. Throughput: 0: 42651.6. Samples: 1836568860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:08:26,790][15401] Updated weights for policy 0, policy_version 112100 (0.0045) [2024-06-22 02:08:28,390][15132] Fps is (10 sec: 37700.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1836662784. Throughput: 0: 42460.4. Samples: 1836812380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:28,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 02:08:30,517][15401] Updated weights for policy 0, policy_version 112110 (0.0031) [2024-06-22 02:08:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1836892160. Throughput: 0: 42545.7. Samples: 1837069440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 02:08:34,797][15401] Updated weights for policy 0, policy_version 112120 (0.0028) [2024-06-22 02:08:38,264][15401] Updated weights for policy 0, policy_version 112130 (0.0038) [2024-06-22 02:08:38,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 1837137920. Throughput: 0: 42647.5. Samples: 1837203000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:38,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 02:08:42,534][15401] Updated weights for policy 0, policy_version 112140 (0.0045) [2024-06-22 02:08:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 1837318144. Throughput: 0: 42629.3. Samples: 1837451940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 02:08:46,269][15401] Updated weights for policy 0, policy_version 112150 (0.0030) [2024-06-22 02:08:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1837563904. Throughput: 0: 42773.7. Samples: 1837711380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 02:08:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 02:08:50,439][15401] Updated weights for policy 0, policy_version 112160 (0.0033) [2024-06-22 02:08:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1837760512. Throughput: 0: 42649.8. Samples: 1837842960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:08:53,399][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 02:08:53,831][15401] Updated weights for policy 0, policy_version 112170 (0.0030) [2024-06-22 02:08:57,970][15401] Updated weights for policy 0, policy_version 112180 (0.0041) [2024-06-22 02:08:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1837957120. Throughput: 0: 42670.9. Samples: 1838097340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:08:58,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 02:09:01,409][15401] Updated weights for policy 0, policy_version 112190 (0.0043) [2024-06-22 02:09:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1838202880. Throughput: 0: 42541.0. Samples: 1838348420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:03,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 02:09:05,847][15401] Updated weights for policy 0, policy_version 112200 (0.0042) [2024-06-22 02:09:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1838399488. Throughput: 0: 42383.9. Samples: 1838476140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 02:09:09,138][15401] Updated weights for policy 0, policy_version 112210 (0.0047) [2024-06-22 02:09:13,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1838596096. Throughput: 0: 42599.5. Samples: 1838729460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:13,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:09:13,802][15401] Updated weights for policy 0, policy_version 112220 (0.0032) [2024-06-22 02:09:16,808][15401] Updated weights for policy 0, policy_version 112230 (0.0035) [2024-06-22 02:09:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42328.5, 300 sec: 42709.5). Total num frames: 1838825472. Throughput: 0: 42533.0. Samples: 1838983420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 02:09:21,458][15401] Updated weights for policy 0, policy_version 112240 (0.0038) [2024-06-22 02:09:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 1839022080. Throughput: 0: 42631.2. Samples: 1839121300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 02:09:24,558][15401] Updated weights for policy 0, policy_version 112250 (0.0038) [2024-06-22 02:09:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1839235072. Throughput: 0: 42552.9. Samples: 1839366820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 02:09:29,142][15401] Updated weights for policy 0, policy_version 112260 (0.0028) [2024-06-22 02:09:32,591][15401] Updated weights for policy 0, policy_version 112270 (0.0033) [2024-06-22 02:09:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 1839480832. Throughput: 0: 42493.7. Samples: 1839623700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:33,393][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 02:09:36,643][15401] Updated weights for policy 0, policy_version 112280 (0.0043) [2024-06-22 02:09:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 1839677440. Throughput: 0: 42589.4. Samples: 1839759480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 02:09:39,168][15349] Signal inference workers to stop experience collection... (27100 times) [2024-06-22 02:09:39,174][15349] Signal inference workers to resume experience collection... (27100 times) [2024-06-22 02:09:39,190][15401] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-06-22 02:09:39,190][15401] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-06-22 02:09:40,105][15401] Updated weights for policy 0, policy_version 112290 (0.0027) [2024-06-22 02:09:43,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1839890432. Throughput: 0: 42452.8. Samples: 1840007720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 02:09:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112298_1839890432.pth... [2024-06-22 02:09:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111672_1829634048.pth [2024-06-22 02:09:44,280][15401] Updated weights for policy 0, policy_version 112300 (0.0042) [2024-06-22 02:09:47,643][15401] Updated weights for policy 0, policy_version 112310 (0.0033) [2024-06-22 02:09:48,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42320.9, 300 sec: 42653.0). Total num frames: 1840103424. Throughput: 0: 42589.1. Samples: 1840265200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:09:48,396][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:09:51,961][15401] Updated weights for policy 0, policy_version 112320 (0.0036) [2024-06-22 02:09:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1840316416. Throughput: 0: 42703.2. Samples: 1840397780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:09:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 02:09:55,221][15401] Updated weights for policy 0, policy_version 112330 (0.0022) [2024-06-22 02:09:58,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1840529408. Throughput: 0: 42610.8. Samples: 1840646840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:09:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 02:09:59,486][15401] Updated weights for policy 0, policy_version 112340 (0.0033) [2024-06-22 02:10:03,303][15401] Updated weights for policy 0, policy_version 112350 (0.0048) [2024-06-22 02:10:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1840742400. Throughput: 0: 42900.7. Samples: 1840913960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:03,394][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 02:10:06,940][15401] Updated weights for policy 0, policy_version 112360 (0.0040) [2024-06-22 02:10:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1840971776. Throughput: 0: 42699.9. Samples: 1841042800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 02:10:10,865][15401] Updated weights for policy 0, policy_version 112370 (0.0039) [2024-06-22 02:10:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1841168384. Throughput: 0: 42762.7. Samples: 1841291140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:13,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 02:10:14,403][15401] Updated weights for policy 0, policy_version 112380 (0.0024) [2024-06-22 02:10:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1841381376. Throughput: 0: 42948.6. Samples: 1841556280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:18,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-22 02:10:18,402][15401] Updated weights for policy 0, policy_version 112390 (0.0040) [2024-06-22 02:10:22,267][15401] Updated weights for policy 0, policy_version 112400 (0.0035) [2024-06-22 02:10:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1841594368. Throughput: 0: 42704.9. Samples: 1841681200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 02:10:26,085][15401] Updated weights for policy 0, policy_version 112410 (0.0044) [2024-06-22 02:10:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1841823744. Throughput: 0: 42660.8. Samples: 1841927460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:28,391][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 02:10:30,189][15401] Updated weights for policy 0, policy_version 112420 (0.0024) [2024-06-22 02:10:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 1842020352. Throughput: 0: 42761.1. Samples: 1842189180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 02:10:33,804][15401] Updated weights for policy 0, policy_version 112430 (0.0024) [2024-06-22 02:10:37,709][15401] Updated weights for policy 0, policy_version 112440 (0.0033) [2024-06-22 02:10:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1842233344. Throughput: 0: 42512.8. Samples: 1842310860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 02:10:41,538][15401] Updated weights for policy 0, policy_version 112450 (0.0043) [2024-06-22 02:10:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1842462720. Throughput: 0: 42732.8. Samples: 1842569820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 02:10:45,312][15401] Updated weights for policy 0, policy_version 112460 (0.0026) [2024-06-22 02:10:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 1842659328. Throughput: 0: 42669.0. Samples: 1842834060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 02:10:49,371][15401] Updated weights for policy 0, policy_version 112470 (0.0041) [2024-06-22 02:10:52,980][15401] Updated weights for policy 0, policy_version 112480 (0.0032) [2024-06-22 02:10:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1842888704. Throughput: 0: 42476.5. Samples: 1842954240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 02:10:57,067][15401] Updated weights for policy 0, policy_version 112490 (0.0044) [2024-06-22 02:10:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1843101696. Throughput: 0: 42680.4. Samples: 1843211760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:10:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 02:11:00,858][15401] Updated weights for policy 0, policy_version 112500 (0.0032) [2024-06-22 02:11:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1843298304. Throughput: 0: 42655.1. Samples: 1843475760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 02:11:04,611][15401] Updated weights for policy 0, policy_version 112510 (0.0036) [2024-06-22 02:11:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 1843511296. Throughput: 0: 42483.9. Samples: 1843593080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:08,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 02:11:08,455][15401] Updated weights for policy 0, policy_version 112520 (0.0026) [2024-06-22 02:11:11,370][15349] Signal inference workers to stop experience collection... (27150 times) [2024-06-22 02:11:11,371][15349] Signal inference workers to resume experience collection... (27150 times) [2024-06-22 02:11:11,415][15401] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-06-22 02:11:11,415][15401] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-06-22 02:11:12,243][15401] Updated weights for policy 0, policy_version 112530 (0.0041) [2024-06-22 02:11:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 1843740672. Throughput: 0: 42800.0. Samples: 1843853560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:13,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 02:11:16,089][15401] Updated weights for policy 0, policy_version 112540 (0.0038) [2024-06-22 02:11:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1843937280. Throughput: 0: 42691.6. Samples: 1844110300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:18,390][15132] Avg episode reward: [(0, '0.887')] [2024-06-22 02:11:20,120][15401] Updated weights for policy 0, policy_version 112550 (0.0031) [2024-06-22 02:11:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1844150272. Throughput: 0: 42692.9. Samples: 1844232040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 02:11:23,790][15401] Updated weights for policy 0, policy_version 112560 (0.0040) [2024-06-22 02:11:27,769][15401] Updated weights for policy 0, policy_version 112570 (0.0028) [2024-06-22 02:11:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1844363264. Throughput: 0: 42810.6. Samples: 1844496300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 02:11:31,252][15401] Updated weights for policy 0, policy_version 112580 (0.0037) [2024-06-22 02:11:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1844576256. Throughput: 0: 42564.8. Samples: 1844749480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 02:11:35,289][15401] Updated weights for policy 0, policy_version 112590 (0.0036) [2024-06-22 02:11:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1844805632. Throughput: 0: 42727.9. Samples: 1844877000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 02:11:38,779][15401] Updated weights for policy 0, policy_version 112600 (0.0038) [2024-06-22 02:11:42,725][15401] Updated weights for policy 0, policy_version 112610 (0.0035) [2024-06-22 02:11:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1845018624. Throughput: 0: 42819.0. Samples: 1845138620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 02:11:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112611_1845018624.pth... [2024-06-22 02:11:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000111986_1834778624.pth [2024-06-22 02:11:46,714][15401] Updated weights for policy 0, policy_version 112620 (0.0040) [2024-06-22 02:11:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1845215232. Throughput: 0: 42704.4. Samples: 1845397460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:48,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 02:11:50,492][15401] Updated weights for policy 0, policy_version 112630 (0.0034) [2024-06-22 02:11:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1845460992. Throughput: 0: 42786.2. Samples: 1845518360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:11:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 02:11:54,199][15401] Updated weights for policy 0, policy_version 112640 (0.0031) [2024-06-22 02:11:58,055][15401] Updated weights for policy 0, policy_version 112650 (0.0027) [2024-06-22 02:11:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1845657600. Throughput: 0: 42936.1. Samples: 1845785580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:11:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 02:12:01,690][15401] Updated weights for policy 0, policy_version 112660 (0.0035) [2024-06-22 02:12:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1845854208. Throughput: 0: 42962.8. Samples: 1846043620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 02:12:05,590][15401] Updated weights for policy 0, policy_version 112670 (0.0025) [2024-06-22 02:12:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 1846116352. Throughput: 0: 42986.2. Samples: 1846166420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 02:12:09,349][15401] Updated weights for policy 0, policy_version 112680 (0.0033) [2024-06-22 02:12:13,248][15401] Updated weights for policy 0, policy_version 112690 (0.0041) [2024-06-22 02:12:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1846312960. Throughput: 0: 42990.7. Samples: 1846430880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 02:12:17,133][15401] Updated weights for policy 0, policy_version 112700 (0.0036) [2024-06-22 02:12:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1846509568. Throughput: 0: 42966.7. Samples: 1846682980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 02:12:20,919][15401] Updated weights for policy 0, policy_version 112710 (0.0028) [2024-06-22 02:12:23,395][15132] Fps is (10 sec: 40937.0, 60 sec: 42867.4, 300 sec: 42708.6). Total num frames: 1846722560. Throughput: 0: 42784.9. Samples: 1846802560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:23,396][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 02:12:25,101][15401] Updated weights for policy 0, policy_version 112720 (0.0040) [2024-06-22 02:12:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1846951936. Throughput: 0: 42698.6. Samples: 1847060060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 02:12:28,522][15401] Updated weights for policy 0, policy_version 112730 (0.0032) [2024-06-22 02:12:33,227][15401] Updated weights for policy 0, policy_version 112740 (0.0030) [2024-06-22 02:12:33,389][15132] Fps is (10 sec: 42622.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1847148544. Throughput: 0: 42676.0. Samples: 1847317880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 02:12:36,499][15401] Updated weights for policy 0, policy_version 112750 (0.0045) [2024-06-22 02:12:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1847345152. Throughput: 0: 42680.1. Samples: 1847438960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 02:12:40,818][15401] Updated weights for policy 0, policy_version 112760 (0.0037) [2024-06-22 02:12:40,912][15349] Signal inference workers to stop experience collection... (27200 times) [2024-06-22 02:12:40,967][15401] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-06-22 02:12:40,976][15349] Signal inference workers to resume experience collection... (27200 times) [2024-06-22 02:12:40,983][15401] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-06-22 02:12:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1847574528. Throughput: 0: 42477.4. Samples: 1847697060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 02:12:44,223][15401] Updated weights for policy 0, policy_version 112770 (0.0035) [2024-06-22 02:12:48,367][15401] Updated weights for policy 0, policy_version 112780 (0.0033) [2024-06-22 02:12:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1847787520. Throughput: 0: 42517.7. Samples: 1847956920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 02:12:51,864][15401] Updated weights for policy 0, policy_version 112790 (0.0022) [2024-06-22 02:12:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1847984128. Throughput: 0: 42483.6. Samples: 1848078180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 02:12:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 02:12:55,876][15401] Updated weights for policy 0, policy_version 112800 (0.0048) [2024-06-22 02:12:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1848213504. Throughput: 0: 42282.6. Samples: 1848333600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:12:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 02:12:59,593][15401] Updated weights for policy 0, policy_version 112810 (0.0038) [2024-06-22 02:13:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1848426496. Throughput: 0: 42338.6. Samples: 1848588220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:03,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 02:13:03,545][15401] Updated weights for policy 0, policy_version 112820 (0.0032) [2024-06-22 02:13:07,421][15401] Updated weights for policy 0, policy_version 112830 (0.0033) [2024-06-22 02:13:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1848639488. Throughput: 0: 42564.5. Samples: 1848717720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:13:11,338][15401] Updated weights for policy 0, policy_version 112840 (0.0033) [2024-06-22 02:13:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42543.5). Total num frames: 1848836096. Throughput: 0: 42419.7. Samples: 1848968940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 02:13:15,285][15401] Updated weights for policy 0, policy_version 112850 (0.0036) [2024-06-22 02:13:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1849065472. Throughput: 0: 42387.9. Samples: 1849225440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:18,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 02:13:19,108][15401] Updated weights for policy 0, policy_version 112860 (0.0042) [2024-06-22 02:13:23,189][15401] Updated weights for policy 0, policy_version 112870 (0.0032) [2024-06-22 02:13:23,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42600.7, 300 sec: 42764.7). Total num frames: 1849278464. Throughput: 0: 42615.5. Samples: 1849356760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:23,393][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 02:13:26,637][15401] Updated weights for policy 0, policy_version 112880 (0.0033) [2024-06-22 02:13:28,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1849475072. Throughput: 0: 42558.5. Samples: 1849612200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:13:30,745][15401] Updated weights for policy 0, policy_version 112890 (0.0033) [2024-06-22 02:13:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1849720832. Throughput: 0: 42467.5. Samples: 1849867960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:33,399][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 02:13:34,060][15401] Updated weights for policy 0, policy_version 112900 (0.0033) [2024-06-22 02:13:38,250][15401] Updated weights for policy 0, policy_version 112910 (0.0036) [2024-06-22 02:13:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1849917440. Throughput: 0: 42752.8. Samples: 1850002060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 02:13:42,437][15401] Updated weights for policy 0, policy_version 112920 (0.0032) [2024-06-22 02:13:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1850130432. Throughput: 0: 42766.2. Samples: 1850258080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:43,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 02:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112923_1850130432.pth... [2024-06-22 02:13:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112298_1839890432.pth [2024-06-22 02:13:45,702][15401] Updated weights for policy 0, policy_version 112930 (0.0032) [2024-06-22 02:13:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1850359808. Throughput: 0: 42529.8. Samples: 1850502060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:48,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 02:13:50,131][15401] Updated weights for policy 0, policy_version 112940 (0.0026) [2024-06-22 02:13:52,442][15349] Signal inference workers to stop experience collection... (27250 times) [2024-06-22 02:13:52,494][15401] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-06-22 02:13:52,554][15349] Signal inference workers to resume experience collection... (27250 times) [2024-06-22 02:13:52,554][15401] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-06-22 02:13:53,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42866.8, 300 sec: 42708.5). Total num frames: 1850556416. Throughput: 0: 42585.4. Samples: 1850634340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 02:13:53,396][15132] Avg episode reward: [(0, '0.206')] [2024-06-22 02:13:53,706][15401] Updated weights for policy 0, policy_version 112950 (0.0026) [2024-06-22 02:13:57,784][15401] Updated weights for policy 0, policy_version 112960 (0.0045) [2024-06-22 02:13:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1850753024. Throughput: 0: 42733.3. Samples: 1850891940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:13:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 02:14:01,363][15401] Updated weights for policy 0, policy_version 112970 (0.0038) [2024-06-22 02:14:03,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1850982400. Throughput: 0: 42641.4. Samples: 1851144200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 02:14:05,721][15401] Updated weights for policy 0, policy_version 112980 (0.0027) [2024-06-22 02:14:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 1851195392. Throughput: 0: 42638.7. Samples: 1851275400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 02:14:08,913][15401] Updated weights for policy 0, policy_version 112990 (0.0030) [2024-06-22 02:14:13,125][15401] Updated weights for policy 0, policy_version 113000 (0.0039) [2024-06-22 02:14:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1851392000. Throughput: 0: 42530.8. Samples: 1851526080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 02:14:16,494][15401] Updated weights for policy 0, policy_version 113010 (0.0043) [2024-06-22 02:14:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 1851637760. Throughput: 0: 42541.8. Samples: 1851782340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:18,393][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 02:14:20,638][15401] Updated weights for policy 0, policy_version 113020 (0.0030) [2024-06-22 02:14:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 1851817984. Throughput: 0: 42479.1. Samples: 1851913620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 02:14:24,072][15401] Updated weights for policy 0, policy_version 113030 (0.0031) [2024-06-22 02:14:28,102][15401] Updated weights for policy 0, policy_version 113040 (0.0036) [2024-06-22 02:14:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 1852047360. Throughput: 0: 42447.2. Samples: 1852168200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 02:14:31,613][15401] Updated weights for policy 0, policy_version 113050 (0.0025) [2024-06-22 02:14:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1852260352. Throughput: 0: 42642.2. Samples: 1852420960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 02:14:35,619][15401] Updated weights for policy 0, policy_version 113060 (0.0038) [2024-06-22 02:14:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1852473344. Throughput: 0: 42646.1. Samples: 1852553140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 02:14:39,606][15401] Updated weights for policy 0, policy_version 113070 (0.0028) [2024-06-22 02:14:43,235][15401] Updated weights for policy 0, policy_version 113080 (0.0037) [2024-06-22 02:14:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 1852702720. Throughput: 0: 42567.8. Samples: 1852807500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 02:14:47,253][15401] Updated weights for policy 0, policy_version 113090 (0.0030) [2024-06-22 02:14:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1852899328. Throughput: 0: 42433.2. Samples: 1853053700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 02:14:51,124][15401] Updated weights for policy 0, policy_version 113100 (0.0035) [2024-06-22 02:14:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 1853095936. Throughput: 0: 42330.8. Samples: 1853180280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-22 02:14:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 02:14:55,301][15401] Updated weights for policy 0, policy_version 113110 (0.0034) [2024-06-22 02:14:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1853325312. Throughput: 0: 42603.9. Samples: 1853443260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:14:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 02:14:58,668][15401] Updated weights for policy 0, policy_version 113120 (0.0027) [2024-06-22 02:15:03,177][15401] Updated weights for policy 0, policy_version 113130 (0.0028) [2024-06-22 02:15:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1853521920. Throughput: 0: 42544.0. Samples: 1853696820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 02:15:06,148][15349] Signal inference workers to stop experience collection... (27300 times) [2024-06-22 02:15:06,149][15349] Signal inference workers to resume experience collection... (27300 times) [2024-06-22 02:15:06,196][15401] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-06-22 02:15:06,196][15401] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-06-22 02:15:06,642][15401] Updated weights for policy 0, policy_version 113140 (0.0040) [2024-06-22 02:15:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1853751296. Throughput: 0: 42454.6. Samples: 1853824080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 02:15:10,778][15401] Updated weights for policy 0, policy_version 113150 (0.0026) [2024-06-22 02:15:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1853947904. Throughput: 0: 42398.2. Samples: 1854076120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:15:14,395][15401] Updated weights for policy 0, policy_version 113160 (0.0036) [2024-06-22 02:15:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1854144512. Throughput: 0: 42465.9. Samples: 1854331920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 02:15:18,610][15401] Updated weights for policy 0, policy_version 113170 (0.0045) [2024-06-22 02:15:22,345][15401] Updated weights for policy 0, policy_version 113180 (0.0046) [2024-06-22 02:15:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1854390272. Throughput: 0: 42442.7. Samples: 1854463060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 02:15:26,358][15401] Updated weights for policy 0, policy_version 113190 (0.0041) [2024-06-22 02:15:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1854586880. Throughput: 0: 42296.1. Samples: 1854710820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 02:15:30,126][15401] Updated weights for policy 0, policy_version 113200 (0.0023) [2024-06-22 02:15:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1854783488. Throughput: 0: 42728.9. Samples: 1854976500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 02:15:34,015][15401] Updated weights for policy 0, policy_version 113210 (0.0035) [2024-06-22 02:15:37,561][15401] Updated weights for policy 0, policy_version 113220 (0.0040) [2024-06-22 02:15:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1855012864. Throughput: 0: 42818.6. Samples: 1855107120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 02:15:41,663][15401] Updated weights for policy 0, policy_version 113230 (0.0042) [2024-06-22 02:15:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1855242240. Throughput: 0: 42605.9. Samples: 1855360520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 02:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113235_1855242240.pth... [2024-06-22 02:15:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112611_1845018624.pth [2024-06-22 02:15:44,971][15401] Updated weights for policy 0, policy_version 113240 (0.0034) [2024-06-22 02:15:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1855438848. Throughput: 0: 42582.2. Samples: 1855613020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 02:15:49,444][15401] Updated weights for policy 0, policy_version 113250 (0.0032) [2024-06-22 02:15:52,511][15401] Updated weights for policy 0, policy_version 113260 (0.0035) [2024-06-22 02:15:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1855668224. Throughput: 0: 42692.5. Samples: 1855745240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 02:15:56,903][15401] Updated weights for policy 0, policy_version 113270 (0.0033) [2024-06-22 02:15:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1855864832. Throughput: 0: 42893.2. Samples: 1856006320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:15:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 02:16:00,343][15401] Updated weights for policy 0, policy_version 113280 (0.0025) [2024-06-22 02:16:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1856094208. Throughput: 0: 42827.5. Samples: 1856259160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 02:16:04,727][15401] Updated weights for policy 0, policy_version 113290 (0.0036) [2024-06-22 02:16:07,886][15401] Updated weights for policy 0, policy_version 113300 (0.0040) [2024-06-22 02:16:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 1856307200. Throughput: 0: 42802.6. Samples: 1856389180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:08,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 02:16:12,314][15401] Updated weights for policy 0, policy_version 113310 (0.0030) [2024-06-22 02:16:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1856520192. Throughput: 0: 43090.7. Samples: 1856649900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 02:16:15,559][15401] Updated weights for policy 0, policy_version 113320 (0.0028) [2024-06-22 02:16:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1856733184. Throughput: 0: 42784.5. Samples: 1856901800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 02:16:19,919][15401] Updated weights for policy 0, policy_version 113330 (0.0031) [2024-06-22 02:16:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1856946176. Throughput: 0: 42760.5. Samples: 1857031340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 02:16:23,571][15401] Updated weights for policy 0, policy_version 113340 (0.0036) [2024-06-22 02:16:27,653][15401] Updated weights for policy 0, policy_version 113350 (0.0033) [2024-06-22 02:16:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1857159168. Throughput: 0: 42938.6. Samples: 1857292760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 02:16:31,139][15401] Updated weights for policy 0, policy_version 113360 (0.0037) [2024-06-22 02:16:32,348][15349] Signal inference workers to stop experience collection... (27350 times) [2024-06-22 02:16:32,349][15349] Signal inference workers to resume experience collection... (27350 times) [2024-06-22 02:16:32,383][15401] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-06-22 02:16:32,383][15401] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-06-22 02:16:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1857388544. Throughput: 0: 42845.3. Samples: 1857541060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 02:16:35,111][15401] Updated weights for policy 0, policy_version 113370 (0.0035) [2024-06-22 02:16:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1857585152. Throughput: 0: 42874.5. Samples: 1857674600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 02:16:39,080][15401] Updated weights for policy 0, policy_version 113380 (0.0028) [2024-06-22 02:16:42,667][15401] Updated weights for policy 0, policy_version 113390 (0.0024) [2024-06-22 02:16:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1857798144. Throughput: 0: 42674.7. Samples: 1857926680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 02:16:46,597][15401] Updated weights for policy 0, policy_version 113400 (0.0026) [2024-06-22 02:16:48,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1858043904. Throughput: 0: 42684.8. Samples: 1858179980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 02:16:50,092][15401] Updated weights for policy 0, policy_version 113410 (0.0041) [2024-06-22 02:16:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1858224128. Throughput: 0: 42838.7. Samples: 1858316920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 02:16:54,263][15401] Updated weights for policy 0, policy_version 113420 (0.0047) [2024-06-22 02:16:57,571][15401] Updated weights for policy 0, policy_version 113430 (0.0036) [2024-06-22 02:16:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1858437120. Throughput: 0: 42741.7. Samples: 1858573280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 02:16:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 02:17:01,854][15401] Updated weights for policy 0, policy_version 113440 (0.0026) [2024-06-22 02:17:03,392][15132] Fps is (10 sec: 47501.6, 60 sec: 43415.8, 300 sec: 42653.6). Total num frames: 1858699264. Throughput: 0: 42648.3. Samples: 1858821080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:03,393][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 02:17:05,019][15401] Updated weights for policy 0, policy_version 113450 (0.0023) [2024-06-22 02:17:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1858863104. Throughput: 0: 42843.5. Samples: 1858959300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 02:17:09,580][15401] Updated weights for policy 0, policy_version 113460 (0.0042) [2024-06-22 02:17:13,085][15401] Updated weights for policy 0, policy_version 113470 (0.0023) [2024-06-22 02:17:13,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1859092480. Throughput: 0: 42690.2. Samples: 1859213820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 02:17:17,362][15401] Updated weights for policy 0, policy_version 113480 (0.0032) [2024-06-22 02:17:18,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42765.8). Total num frames: 1859338240. Throughput: 0: 42644.9. Samples: 1859460080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 02:17:20,719][15401] Updated weights for policy 0, policy_version 113490 (0.0034) [2024-06-22 02:17:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 1859502080. Throughput: 0: 42625.4. Samples: 1859592840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:23,393][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 02:17:25,140][15401] Updated weights for policy 0, policy_version 113500 (0.0036) [2024-06-22 02:17:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1859731456. Throughput: 0: 42601.8. Samples: 1859843760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 02:17:28,846][15401] Updated weights for policy 0, policy_version 113510 (0.0050) [2024-06-22 02:17:32,575][15349] Signal inference workers to stop experience collection... (27400 times) [2024-06-22 02:17:32,580][15349] Signal inference workers to resume experience collection... (27400 times) [2024-06-22 02:17:32,598][15401] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-06-22 02:17:32,599][15401] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-06-22 02:17:32,754][15401] Updated weights for policy 0, policy_version 113520 (0.0029) [2024-06-22 02:17:33,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1859960832. Throughput: 0: 42782.7. Samples: 1860105200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 02:17:36,391][15401] Updated weights for policy 0, policy_version 113530 (0.0032) [2024-06-22 02:17:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1860141056. Throughput: 0: 42580.3. Samples: 1860233040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:38,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 02:17:40,231][15401] Updated weights for policy 0, policy_version 113540 (0.0034) [2024-06-22 02:17:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1860370432. Throughput: 0: 42565.4. Samples: 1860488720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 02:17:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113548_1860370432.pth... [2024-06-22 02:17:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000112923_1850130432.pth [2024-06-22 02:17:43,893][15401] Updated weights for policy 0, policy_version 113550 (0.0035) [2024-06-22 02:17:47,893][15401] Updated weights for policy 0, policy_version 113560 (0.0033) [2024-06-22 02:17:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1860583424. Throughput: 0: 42798.0. Samples: 1860746880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 02:17:51,513][15401] Updated weights for policy 0, policy_version 113570 (0.0045) [2024-06-22 02:17:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1860763648. Throughput: 0: 42465.5. Samples: 1860870240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 02:17:55,513][15401] Updated weights for policy 0, policy_version 113580 (0.0049) [2024-06-22 02:17:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1861025792. Throughput: 0: 42442.8. Samples: 1861123740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:17:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 02:17:59,590][15401] Updated weights for policy 0, policy_version 113590 (0.0044) [2024-06-22 02:18:03,226][15401] Updated weights for policy 0, policy_version 113600 (0.0026) [2024-06-22 02:18:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 1861222400. Throughput: 0: 42767.6. Samples: 1861384620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 02:18:07,487][15401] Updated weights for policy 0, policy_version 113610 (0.0039) [2024-06-22 02:18:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1861419008. Throughput: 0: 42504.0. Samples: 1861505420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 02:18:10,938][15401] Updated weights for policy 0, policy_version 113620 (0.0031) [2024-06-22 02:18:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 1861648384. Throughput: 0: 42626.1. Samples: 1861761940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 02:18:15,026][15401] Updated weights for policy 0, policy_version 113630 (0.0027) [2024-06-22 02:18:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41777.6, 300 sec: 42598.4). Total num frames: 1861844992. Throughput: 0: 42780.4. Samples: 1862030420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:18,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 02:18:18,666][15401] Updated weights for policy 0, policy_version 113640 (0.0032) [2024-06-22 02:18:22,511][15401] Updated weights for policy 0, policy_version 113650 (0.0034) [2024-06-22 02:18:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1862057984. Throughput: 0: 42658.2. Samples: 1862152660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 02:18:26,262][15401] Updated weights for policy 0, policy_version 113660 (0.0043) [2024-06-22 02:18:28,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1862303744. Throughput: 0: 42672.9. Samples: 1862409000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 02:18:30,056][15401] Updated weights for policy 0, policy_version 113670 (0.0027) [2024-06-22 02:18:33,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1862500352. Throughput: 0: 42800.0. Samples: 1862672880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 02:18:34,083][15401] Updated weights for policy 0, policy_version 113680 (0.0026) [2024-06-22 02:18:37,705][15401] Updated weights for policy 0, policy_version 113690 (0.0022) [2024-06-22 02:18:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1862713344. Throughput: 0: 42796.0. Samples: 1862796060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 02:18:41,778][15401] Updated weights for policy 0, policy_version 113700 (0.0039) [2024-06-22 02:18:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1862942720. Throughput: 0: 42859.5. Samples: 1863052420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 02:18:45,510][15401] Updated weights for policy 0, policy_version 113710 (0.0021) [2024-06-22 02:18:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42654.9). Total num frames: 1863139328. Throughput: 0: 42791.9. Samples: 1863310260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 02:18:49,323][15401] Updated weights for policy 0, policy_version 113720 (0.0037) [2024-06-22 02:18:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1863335936. Throughput: 0: 42885.0. Samples: 1863435240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 02:18:53,441][15401] Updated weights for policy 0, policy_version 113730 (0.0033) [2024-06-22 02:18:56,839][15401] Updated weights for policy 0, policy_version 113740 (0.0028) [2024-06-22 02:18:57,594][15349] Signal inference workers to stop experience collection... (27450 times) [2024-06-22 02:18:57,645][15401] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-06-22 02:18:57,654][15349] Signal inference workers to resume experience collection... (27450 times) [2024-06-22 02:18:57,661][15401] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-06-22 02:18:58,392][15132] Fps is (10 sec: 45865.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1863598080. Throughput: 0: 42867.4. Samples: 1863691060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 02:18:58,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 02:19:00,930][15401] Updated weights for policy 0, policy_version 113750 (0.0041) [2024-06-22 02:19:03,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1863794688. Throughput: 0: 42634.6. Samples: 1863948880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 02:19:04,490][15401] Updated weights for policy 0, policy_version 113760 (0.0034) [2024-06-22 02:19:08,390][15132] Fps is (10 sec: 37691.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1863974912. Throughput: 0: 42581.5. Samples: 1864068820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 02:19:08,763][15401] Updated weights for policy 0, policy_version 113770 (0.0043) [2024-06-22 02:19:12,306][15401] Updated weights for policy 0, policy_version 113780 (0.0034) [2024-06-22 02:19:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1864220672. Throughput: 0: 42783.0. Samples: 1864334240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 02:19:16,395][15401] Updated weights for policy 0, policy_version 113790 (0.0035) [2024-06-22 02:19:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 1864433664. Throughput: 0: 42632.8. Samples: 1864591360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 02:19:19,895][15401] Updated weights for policy 0, policy_version 113800 (0.0038) [2024-06-22 02:19:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1864630272. Throughput: 0: 42560.7. Samples: 1864711300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:23,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 02:19:23,933][15401] Updated weights for policy 0, policy_version 113810 (0.0040) [2024-06-22 02:19:27,451][15401] Updated weights for policy 0, policy_version 113820 (0.0039) [2024-06-22 02:19:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1864876032. Throughput: 0: 42772.5. Samples: 1864977180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 02:19:31,595][15401] Updated weights for policy 0, policy_version 113830 (0.0037) [2024-06-22 02:19:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1865056256. Throughput: 0: 42637.9. Samples: 1865228960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:33,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 02:19:35,097][15401] Updated weights for policy 0, policy_version 113840 (0.0035) [2024-06-22 02:19:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1865252864. Throughput: 0: 42445.7. Samples: 1865345300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 02:19:39,349][15401] Updated weights for policy 0, policy_version 113850 (0.0038) [2024-06-22 02:19:43,132][15401] Updated weights for policy 0, policy_version 113860 (0.0042) [2024-06-22 02:19:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1865498624. Throughput: 0: 42583.4. Samples: 1865607220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 02:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113861_1865498624.pth... [2024-06-22 02:19:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113235_1855242240.pth [2024-06-22 02:19:47,088][15401] Updated weights for policy 0, policy_version 113870 (0.0032) [2024-06-22 02:19:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1865695232. Throughput: 0: 42538.0. Samples: 1865863080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 02:19:50,866][15401] Updated weights for policy 0, policy_version 113880 (0.0025) [2024-06-22 02:19:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1865908224. Throughput: 0: 42544.9. Samples: 1865983340. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 02:19:55,113][15401] Updated weights for policy 0, policy_version 113890 (0.0037) [2024-06-22 02:19:58,340][15401] Updated weights for policy 0, policy_version 113900 (0.0032) [2024-06-22 02:19:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42326.8, 300 sec: 42765.0). Total num frames: 1866137600. Throughput: 0: 42409.3. Samples: 1866242660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 02:19:58,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 02:20:01,824][15349] Signal inference workers to stop experience collection... (27500 times) [2024-06-22 02:20:01,856][15401] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-06-22 02:20:01,886][15349] Signal inference workers to resume experience collection... (27500 times) [2024-06-22 02:20:01,886][15401] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-06-22 02:20:02,919][15401] Updated weights for policy 0, policy_version 113910 (0.0033) [2024-06-22 02:20:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1866317824. Throughput: 0: 42420.0. Samples: 1866500260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 02:20:06,106][15401] Updated weights for policy 0, policy_version 113920 (0.0033) [2024-06-22 02:20:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1866547200. Throughput: 0: 42494.9. Samples: 1866623560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 02:20:10,520][15401] Updated weights for policy 0, policy_version 113930 (0.0053) [2024-06-22 02:20:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1866760192. Throughput: 0: 42304.9. Samples: 1866880900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 02:20:13,678][15401] Updated weights for policy 0, policy_version 113940 (0.0047) [2024-06-22 02:20:18,040][15401] Updated weights for policy 0, policy_version 113950 (0.0040) [2024-06-22 02:20:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1866973184. Throughput: 0: 42531.1. Samples: 1867142860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 02:20:21,202][15401] Updated weights for policy 0, policy_version 113960 (0.0046) [2024-06-22 02:20:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1867202560. Throughput: 0: 42775.1. Samples: 1867270180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 02:20:25,467][15401] Updated weights for policy 0, policy_version 113970 (0.0030) [2024-06-22 02:20:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1867399168. Throughput: 0: 42742.8. Samples: 1867530640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 02:20:29,053][15401] Updated weights for policy 0, policy_version 113980 (0.0042) [2024-06-22 02:20:33,086][15401] Updated weights for policy 0, policy_version 113990 (0.0035) [2024-06-22 02:20:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1867612160. Throughput: 0: 42644.4. Samples: 1867782080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 02:20:36,562][15401] Updated weights for policy 0, policy_version 114000 (0.0032) [2024-06-22 02:20:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1867841536. Throughput: 0: 42873.3. Samples: 1867912640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 02:20:41,090][15401] Updated weights for policy 0, policy_version 114010 (0.0031) [2024-06-22 02:20:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1868038144. Throughput: 0: 42942.8. Samples: 1868175080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 02:20:44,147][15401] Updated weights for policy 0, policy_version 114020 (0.0027) [2024-06-22 02:20:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1868251136. Throughput: 0: 43010.3. Samples: 1868435720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 02:20:48,504][15401] Updated weights for policy 0, policy_version 114030 (0.0038) [2024-06-22 02:20:51,711][15401] Updated weights for policy 0, policy_version 114040 (0.0029) [2024-06-22 02:20:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1868496896. Throughput: 0: 43121.8. Samples: 1868564040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 02:20:55,986][15401] Updated weights for policy 0, policy_version 114050 (0.0040) [2024-06-22 02:20:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1868677120. Throughput: 0: 43030.6. Samples: 1868817280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:20:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 02:20:59,559][15401] Updated weights for policy 0, policy_version 114060 (0.0033) [2024-06-22 02:21:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1868890112. Throughput: 0: 42994.3. Samples: 1869077600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 02:21:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 02:21:03,687][15401] Updated weights for policy 0, policy_version 114070 (0.0044) [2024-06-22 02:21:07,116][15401] Updated weights for policy 0, policy_version 114080 (0.0029) [2024-06-22 02:21:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1869135872. Throughput: 0: 43014.3. Samples: 1869205820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 02:21:11,171][15401] Updated weights for policy 0, policy_version 114090 (0.0031) [2024-06-22 02:21:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1869332480. Throughput: 0: 43045.8. Samples: 1869467700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 02:21:14,599][15401] Updated weights for policy 0, policy_version 114100 (0.0025) [2024-06-22 02:21:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1869545472. Throughput: 0: 43152.5. Samples: 1869723940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 02:21:18,856][15401] Updated weights for policy 0, policy_version 114110 (0.0045) [2024-06-22 02:21:22,453][15401] Updated weights for policy 0, policy_version 114120 (0.0038) [2024-06-22 02:21:22,605][15349] Signal inference workers to stop experience collection... (27550 times) [2024-06-22 02:21:22,605][15349] Signal inference workers to resume experience collection... (27550 times) [2024-06-22 02:21:22,632][15401] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-06-22 02:21:22,632][15401] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-06-22 02:21:23,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1869791232. Throughput: 0: 43041.7. Samples: 1869849620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:23,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 02:21:26,418][15401] Updated weights for policy 0, policy_version 114130 (0.0037) [2024-06-22 02:21:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1869971456. Throughput: 0: 42974.1. Samples: 1870108920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 02:21:30,085][15401] Updated weights for policy 0, policy_version 114140 (0.0032) [2024-06-22 02:21:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1870200832. Throughput: 0: 42972.8. Samples: 1870369500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 02:21:34,029][15401] Updated weights for policy 0, policy_version 114150 (0.0024) [2024-06-22 02:21:37,674][15401] Updated weights for policy 0, policy_version 114160 (0.0035) [2024-06-22 02:21:38,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1870430208. Throughput: 0: 42912.3. Samples: 1870495100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 02:21:41,558][15401] Updated weights for policy 0, policy_version 114170 (0.0039) [2024-06-22 02:21:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1870626816. Throughput: 0: 42960.0. Samples: 1870750480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 02:21:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114174_1870626816.pth... [2024-06-22 02:21:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113548_1860370432.pth [2024-06-22 02:21:45,701][15401] Updated weights for policy 0, policy_version 114180 (0.0051) [2024-06-22 02:21:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1870839808. Throughput: 0: 42907.6. Samples: 1871008440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 02:21:49,336][15401] Updated weights for policy 0, policy_version 114190 (0.0049) [2024-06-22 02:21:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1871036416. Throughput: 0: 42880.0. Samples: 1871135420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 02:21:53,498][15401] Updated weights for policy 0, policy_version 114200 (0.0031) [2024-06-22 02:21:56,967][15401] Updated weights for policy 0, policy_version 114210 (0.0032) [2024-06-22 02:21:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.8). Total num frames: 1871265792. Throughput: 0: 42609.3. Samples: 1871385120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:21:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 02:22:01,194][15401] Updated weights for policy 0, policy_version 114220 (0.0024) [2024-06-22 02:22:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1871478784. Throughput: 0: 42666.3. Samples: 1871643920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 02:22:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 02:22:04,697][15401] Updated weights for policy 0, policy_version 114230 (0.0040) [2024-06-22 02:22:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1871675392. Throughput: 0: 42747.6. Samples: 1871773160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:08,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 02:22:08,802][15401] Updated weights for policy 0, policy_version 114240 (0.0023) [2024-06-22 02:22:12,370][15401] Updated weights for policy 0, policy_version 114250 (0.0032) [2024-06-22 02:22:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1871888384. Throughput: 0: 42763.3. Samples: 1872033260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:13,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 02:22:16,369][15401] Updated weights for policy 0, policy_version 114260 (0.0038) [2024-06-22 02:22:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 1872134144. Throughput: 0: 42458.7. Samples: 1872280140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:18,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 02:22:20,165][15401] Updated weights for policy 0, policy_version 114270 (0.0049) [2024-06-22 02:22:23,390][15132] Fps is (10 sec: 44233.2, 60 sec: 42326.5, 300 sec: 42709.4). Total num frames: 1872330752. Throughput: 0: 42695.7. Samples: 1872416440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:23,391][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 02:22:24,187][15401] Updated weights for policy 0, policy_version 114280 (0.0031) [2024-06-22 02:22:27,875][15401] Updated weights for policy 0, policy_version 114290 (0.0044) [2024-06-22 02:22:28,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 1872543744. Throughput: 0: 42579.3. Samples: 1872666820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:28,396][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:22:31,735][15401] Updated weights for policy 0, policy_version 114300 (0.0029) [2024-06-22 02:22:33,390][15132] Fps is (10 sec: 44240.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1872773120. Throughput: 0: 42390.6. Samples: 1872916020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:33,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 02:22:35,566][15401] Updated weights for policy 0, policy_version 114310 (0.0035) [2024-06-22 02:22:38,390][15132] Fps is (10 sec: 42624.5, 60 sec: 42325.1, 300 sec: 42709.4). Total num frames: 1872969728. Throughput: 0: 42566.4. Samples: 1873050920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 02:22:39,649][15401] Updated weights for policy 0, policy_version 114320 (0.0045) [2024-06-22 02:22:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1873166336. Throughput: 0: 42489.3. Samples: 1873297140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:22:43,605][15401] Updated weights for policy 0, policy_version 114330 (0.0035) [2024-06-22 02:22:47,434][15401] Updated weights for policy 0, policy_version 114340 (0.0039) [2024-06-22 02:22:48,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1873395712. Throughput: 0: 42558.5. Samples: 1873559060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:48,394][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 02:22:51,235][15401] Updated weights for policy 0, policy_version 114350 (0.0034) [2024-06-22 02:22:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1873608704. Throughput: 0: 42491.4. Samples: 1873685280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 02:22:55,121][15401] Updated weights for policy 0, policy_version 114360 (0.0039) [2024-06-22 02:22:55,787][15349] Signal inference workers to stop experience collection... (27600 times) [2024-06-22 02:22:55,787][15349] Signal inference workers to resume experience collection... (27600 times) [2024-06-22 02:22:55,828][15401] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-06-22 02:22:55,828][15401] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-06-22 02:22:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1873805312. Throughput: 0: 42302.2. Samples: 1873936860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:22:58,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 02:22:59,092][15401] Updated weights for policy 0, policy_version 114370 (0.0033) [2024-06-22 02:23:02,774][15401] Updated weights for policy 0, policy_version 114380 (0.0033) [2024-06-22 02:23:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1874034688. Throughput: 0: 42536.9. Samples: 1874194300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 02:23:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 02:23:06,773][15401] Updated weights for policy 0, policy_version 114390 (0.0035) [2024-06-22 02:23:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1874247680. Throughput: 0: 42386.0. Samples: 1874323780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 02:23:10,499][15401] Updated weights for policy 0, policy_version 114400 (0.0035) [2024-06-22 02:23:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1874444288. Throughput: 0: 42303.4. Samples: 1874570200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 02:23:14,344][15401] Updated weights for policy 0, policy_version 114410 (0.0037) [2024-06-22 02:23:18,173][15401] Updated weights for policy 0, policy_version 114420 (0.0023) [2024-06-22 02:23:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1874657280. Throughput: 0: 42544.0. Samples: 1874830500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 02:23:21,896][15401] Updated weights for policy 0, policy_version 114430 (0.0031) [2024-06-22 02:23:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.9, 300 sec: 42598.4). Total num frames: 1874870272. Throughput: 0: 42307.4. Samples: 1874954740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 02:23:25,810][15401] Updated weights for policy 0, policy_version 114440 (0.0030) [2024-06-22 02:23:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 1875099648. Throughput: 0: 42525.7. Samples: 1875210800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:28,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 02:23:29,342][15401] Updated weights for policy 0, policy_version 114450 (0.0040) [2024-06-22 02:23:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1875296256. Throughput: 0: 42482.2. Samples: 1875470760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 02:23:33,428][15401] Updated weights for policy 0, policy_version 114460 (0.0036) [2024-06-22 02:23:37,039][15401] Updated weights for policy 0, policy_version 114470 (0.0030) [2024-06-22 02:23:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.6, 300 sec: 42598.4). Total num frames: 1875509248. Throughput: 0: 42459.3. Samples: 1875595940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 02:23:41,047][15401] Updated weights for policy 0, policy_version 114480 (0.0029) [2024-06-22 02:23:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1875738624. Throughput: 0: 42548.4. Samples: 1875851540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 02:23:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114486_1875738624.pth... [2024-06-22 02:23:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000113861_1865498624.pth [2024-06-22 02:23:44,919][15401] Updated weights for policy 0, policy_version 114490 (0.0029) [2024-06-22 02:23:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1875951616. Throughput: 0: 42449.8. Samples: 1876104540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 02:23:48,814][15401] Updated weights for policy 0, policy_version 114500 (0.0027) [2024-06-22 02:23:52,501][15401] Updated weights for policy 0, policy_version 114510 (0.0035) [2024-06-22 02:23:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 1876148224. Throughput: 0: 42364.1. Samples: 1876230160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 02:23:56,510][15401] Updated weights for policy 0, policy_version 114520 (0.0029) [2024-06-22 02:23:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1876361216. Throughput: 0: 42662.1. Samples: 1876490000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:23:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 02:23:59,891][15401] Updated weights for policy 0, policy_version 114530 (0.0034) [2024-06-22 02:24:02,171][15349] Signal inference workers to stop experience collection... (27650 times) [2024-06-22 02:24:02,171][15349] Signal inference workers to resume experience collection... (27650 times) [2024-06-22 02:24:02,226][15401] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-06-22 02:24:02,226][15401] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-06-22 02:24:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 1876574208. Throughput: 0: 42546.6. Samples: 1876745200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 02:24:03,393][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 02:24:04,461][15401] Updated weights for policy 0, policy_version 114540 (0.0041) [2024-06-22 02:24:07,824][15401] Updated weights for policy 0, policy_version 114550 (0.0039) [2024-06-22 02:24:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1876803584. Throughput: 0: 42715.2. Samples: 1876876920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 02:24:11,823][15401] Updated weights for policy 0, policy_version 114560 (0.0031) [2024-06-22 02:24:13,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1877000192. Throughput: 0: 42780.4. Samples: 1877136020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:13,392][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 02:24:15,414][15401] Updated weights for policy 0, policy_version 114570 (0.0031) [2024-06-22 02:24:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1877229568. Throughput: 0: 42688.9. Samples: 1877391860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:18,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 02:24:19,223][15401] Updated weights for policy 0, policy_version 114580 (0.0027) [2024-06-22 02:24:23,027][15401] Updated weights for policy 0, policy_version 114590 (0.0029) [2024-06-22 02:24:23,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1877458944. Throughput: 0: 42867.1. Samples: 1877524960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:23,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 02:24:26,754][15401] Updated weights for policy 0, policy_version 114600 (0.0021) [2024-06-22 02:24:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1877622784. Throughput: 0: 42833.5. Samples: 1877779040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 02:24:30,512][15401] Updated weights for policy 0, policy_version 114610 (0.0023) [2024-06-22 02:24:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1877868544. Throughput: 0: 42921.6. Samples: 1878036020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:33,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 02:24:34,621][15401] Updated weights for policy 0, policy_version 114620 (0.0037) [2024-06-22 02:24:38,139][15401] Updated weights for policy 0, policy_version 114630 (0.0030) [2024-06-22 02:24:38,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1878114304. Throughput: 0: 43152.8. Samples: 1878172040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 02:24:42,139][15401] Updated weights for policy 0, policy_version 114640 (0.0022) [2024-06-22 02:24:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1878278144. Throughput: 0: 43009.4. Samples: 1878425420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 02:24:45,735][15401] Updated weights for policy 0, policy_version 114650 (0.0026) [2024-06-22 02:24:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1878523904. Throughput: 0: 42964.2. Samples: 1878678480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 02:24:49,731][15401] Updated weights for policy 0, policy_version 114660 (0.0044) [2024-06-22 02:24:53,321][15401] Updated weights for policy 0, policy_version 114670 (0.0027) [2024-06-22 02:24:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1878753280. Throughput: 0: 43095.0. Samples: 1878816200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:53,391][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 02:24:57,657][15401] Updated weights for policy 0, policy_version 114680 (0.0047) [2024-06-22 02:24:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1878917120. Throughput: 0: 42903.2. Samples: 1879066560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:24:58,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-22 02:25:01,343][15401] Updated weights for policy 0, policy_version 114690 (0.0034) [2024-06-22 02:25:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 1879179264. Throughput: 0: 42763.5. Samples: 1879316120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:25:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 02:25:05,037][15401] Updated weights for policy 0, policy_version 114700 (0.0034) [2024-06-22 02:25:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1879343104. Throughput: 0: 42917.8. Samples: 1879456260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 02:25:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 02:25:09,113][15401] Updated weights for policy 0, policy_version 114710 (0.0046) [2024-06-22 02:25:12,591][15349] Signal inference workers to stop experience collection... (27700 times) [2024-06-22 02:25:12,592][15349] Signal inference workers to resume experience collection... (27700 times) [2024-06-22 02:25:12,637][15401] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-06-22 02:25:12,637][15401] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-06-22 02:25:12,726][15401] Updated weights for policy 0, policy_version 114720 (0.0035) [2024-06-22 02:25:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1879572480. Throughput: 0: 42758.6. Samples: 1879703180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 02:25:16,886][15401] Updated weights for policy 0, policy_version 114730 (0.0031) [2024-06-22 02:25:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1879818240. Throughput: 0: 42534.3. Samples: 1879950060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 02:25:20,389][15401] Updated weights for policy 0, policy_version 114740 (0.0028) [2024-06-22 02:25:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1879998464. Throughput: 0: 42641.1. Samples: 1880090880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 02:25:24,445][15401] Updated weights for policy 0, policy_version 114750 (0.0041) [2024-06-22 02:25:28,030][15401] Updated weights for policy 0, policy_version 114760 (0.0029) [2024-06-22 02:25:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1880227840. Throughput: 0: 42771.2. Samples: 1880350120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 02:25:32,076][15401] Updated weights for policy 0, policy_version 114770 (0.0042) [2024-06-22 02:25:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1880457216. Throughput: 0: 42653.2. Samples: 1880597880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:33,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 02:25:35,811][15401] Updated weights for policy 0, policy_version 114780 (0.0033) [2024-06-22 02:25:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1880653824. Throughput: 0: 42562.2. Samples: 1880731500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 02:25:39,690][15401] Updated weights for policy 0, policy_version 114790 (0.0046) [2024-06-22 02:25:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1880866816. Throughput: 0: 42633.3. Samples: 1880985060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 02:25:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114799_1880866816.pth... [2024-06-22 02:25:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114174_1870626816.pth [2024-06-22 02:25:43,855][15401] Updated weights for policy 0, policy_version 114800 (0.0038) [2024-06-22 02:25:47,322][15401] Updated weights for policy 0, policy_version 114810 (0.0041) [2024-06-22 02:25:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1881079808. Throughput: 0: 42736.0. Samples: 1881239240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 02:25:51,448][15401] Updated weights for policy 0, policy_version 114820 (0.0027) [2024-06-22 02:25:53,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 1881276416. Throughput: 0: 42549.7. Samples: 1881371100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:53,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 02:25:55,285][15401] Updated weights for policy 0, policy_version 114830 (0.0027) [2024-06-22 02:25:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1881505792. Throughput: 0: 42739.6. Samples: 1881626460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:25:58,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-22 02:25:59,098][15401] Updated weights for policy 0, policy_version 114840 (0.0030) [2024-06-22 02:26:02,813][15401] Updated weights for policy 0, policy_version 114850 (0.0034) [2024-06-22 02:26:03,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1881718784. Throughput: 0: 42848.1. Samples: 1881878220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:26:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 02:26:06,776][15401] Updated weights for policy 0, policy_version 114860 (0.0041) [2024-06-22 02:26:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1881915392. Throughput: 0: 42576.5. Samples: 1882006820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 02:26:08,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 02:26:10,405][15401] Updated weights for policy 0, policy_version 114870 (0.0026) [2024-06-22 02:26:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1882128384. Throughput: 0: 42430.6. Samples: 1882259500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 02:26:14,592][15401] Updated weights for policy 0, policy_version 114880 (0.0046) [2024-06-22 02:26:17,934][15401] Updated weights for policy 0, policy_version 114890 (0.0031) [2024-06-22 02:26:18,394][15132] Fps is (10 sec: 45852.9, 60 sec: 42595.1, 300 sec: 42653.6). Total num frames: 1882374144. Throughput: 0: 42594.2. Samples: 1882514820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:18,395][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 02:26:22,591][15401] Updated weights for policy 0, policy_version 114900 (0.0042) [2024-06-22 02:26:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1882570752. Throughput: 0: 42444.9. Samples: 1882641520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 02:26:25,799][15401] Updated weights for policy 0, policy_version 114910 (0.0038) [2024-06-22 02:26:28,390][15132] Fps is (10 sec: 39340.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1882767360. Throughput: 0: 42418.2. Samples: 1882893880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:28,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 02:26:30,249][15401] Updated weights for policy 0, policy_version 114920 (0.0043) [2024-06-22 02:26:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1882996736. Throughput: 0: 42502.7. Samples: 1883151860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 02:26:33,479][15401] Updated weights for policy 0, policy_version 114930 (0.0028) [2024-06-22 02:26:37,833][15401] Updated weights for policy 0, policy_version 114940 (0.0039) [2024-06-22 02:26:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1883209728. Throughput: 0: 42570.8. Samples: 1883286680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 02:26:40,915][15401] Updated weights for policy 0, policy_version 114950 (0.0035) [2024-06-22 02:26:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1883406336. Throughput: 0: 42579.5. Samples: 1883542540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 02:26:45,417][15401] Updated weights for policy 0, policy_version 114960 (0.0042) [2024-06-22 02:26:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1883652096. Throughput: 0: 42649.3. Samples: 1883797440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 02:26:48,543][15401] Updated weights for policy 0, policy_version 114970 (0.0028) [2024-06-22 02:26:52,935][15401] Updated weights for policy 0, policy_version 114980 (0.0029) [2024-06-22 02:26:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 1883865088. Throughput: 0: 42737.7. Samples: 1883930020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 02:26:53,543][15349] Signal inference workers to stop experience collection... (27750 times) [2024-06-22 02:26:53,582][15401] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-06-22 02:26:53,603][15349] Signal inference workers to resume experience collection... (27750 times) [2024-06-22 02:26:53,604][15401] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-06-22 02:26:56,081][15401] Updated weights for policy 0, policy_version 114990 (0.0030) [2024-06-22 02:26:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1884045312. Throughput: 0: 42940.5. Samples: 1884191820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:26:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 02:27:00,401][15401] Updated weights for policy 0, policy_version 115000 (0.0031) [2024-06-22 02:27:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1884291072. Throughput: 0: 42796.6. Samples: 1884440460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:27:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 02:27:03,676][15401] Updated weights for policy 0, policy_version 115010 (0.0026) [2024-06-22 02:27:07,969][15401] Updated weights for policy 0, policy_version 115020 (0.0028) [2024-06-22 02:27:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1884504064. Throughput: 0: 43027.1. Samples: 1884577740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 02:27:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 02:27:11,329][15401] Updated weights for policy 0, policy_version 115030 (0.0027) [2024-06-22 02:27:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1884700672. Throughput: 0: 43182.3. Samples: 1884837080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 02:27:15,411][15401] Updated weights for policy 0, policy_version 115040 (0.0028) [2024-06-22 02:27:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42601.8, 300 sec: 42709.6). Total num frames: 1884930048. Throughput: 0: 43067.6. Samples: 1885089900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 02:27:18,974][15401] Updated weights for policy 0, policy_version 115050 (0.0037) [2024-06-22 02:27:23,120][15401] Updated weights for policy 0, policy_version 115060 (0.0028) [2024-06-22 02:27:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 1885143040. Throughput: 0: 43067.6. Samples: 1885224720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 02:27:26,743][15401] Updated weights for policy 0, policy_version 115070 (0.0035) [2024-06-22 02:27:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1885339648. Throughput: 0: 43003.7. Samples: 1885477700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:28,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 02:27:31,076][15401] Updated weights for policy 0, policy_version 115080 (0.0034) [2024-06-22 02:27:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1885569024. Throughput: 0: 43069.2. Samples: 1885735560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 02:27:34,589][15401] Updated weights for policy 0, policy_version 115090 (0.0039) [2024-06-22 02:27:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1885765632. Throughput: 0: 43140.0. Samples: 1885871320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 02:27:38,708][15401] Updated weights for policy 0, policy_version 115100 (0.0036) [2024-06-22 02:27:42,361][15401] Updated weights for policy 0, policy_version 115110 (0.0032) [2024-06-22 02:27:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1885995008. Throughput: 0: 42968.9. Samples: 1886125420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 02:27:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115112_1885995008.pth... [2024-06-22 02:27:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114486_1875738624.pth [2024-06-22 02:27:46,120][15401] Updated weights for policy 0, policy_version 115120 (0.0035) [2024-06-22 02:27:48,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 1886224384. Throughput: 0: 43084.5. Samples: 1886379540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:48,396][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:27:49,980][15401] Updated weights for policy 0, policy_version 115130 (0.0028) [2024-06-22 02:27:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1886420992. Throughput: 0: 43032.1. Samples: 1886514280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:53,392][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 02:27:53,630][15401] Updated weights for policy 0, policy_version 115140 (0.0037) [2024-06-22 02:27:57,627][15401] Updated weights for policy 0, policy_version 115150 (0.0038) [2024-06-22 02:27:58,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1886650368. Throughput: 0: 43004.0. Samples: 1886772260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:27:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 02:28:01,185][15401] Updated weights for policy 0, policy_version 115160 (0.0024) [2024-06-22 02:28:03,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1886879744. Throughput: 0: 42968.4. Samples: 1887023480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:28:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 02:28:04,926][15401] Updated weights for policy 0, policy_version 115170 (0.0033) [2024-06-22 02:28:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1887092736. Throughput: 0: 42967.4. Samples: 1887158260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:28:08,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 02:28:08,662][15401] Updated weights for policy 0, policy_version 115180 (0.0024) [2024-06-22 02:28:12,393][15401] Updated weights for policy 0, policy_version 115190 (0.0023) [2024-06-22 02:28:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1887305728. Throughput: 0: 43218.2. Samples: 1887422520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-06-22 02:28:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:28:14,617][15349] Signal inference workers to stop experience collection... (27800 times) [2024-06-22 02:28:14,624][15349] Signal inference workers to resume experience collection... (27800 times) [2024-06-22 02:28:14,630][15401] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-06-22 02:28:14,661][15401] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-06-22 02:28:16,842][15401] Updated weights for policy 0, policy_version 115200 (0.0038) [2024-06-22 02:28:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1887535104. Throughput: 0: 43136.1. Samples: 1887676680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:18,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 02:28:20,025][15401] Updated weights for policy 0, policy_version 115210 (0.0038) [2024-06-22 02:28:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1887731712. Throughput: 0: 43035.6. Samples: 1887807920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 02:28:24,285][15401] Updated weights for policy 0, policy_version 115220 (0.0046) [2024-06-22 02:28:27,804][15401] Updated weights for policy 0, policy_version 115230 (0.0030) [2024-06-22 02:28:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1887944704. Throughput: 0: 43108.9. Samples: 1888065320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 02:28:31,748][15401] Updated weights for policy 0, policy_version 115240 (0.0028) [2024-06-22 02:28:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1888174080. Throughput: 0: 43243.1. Samples: 1888325200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 02:28:35,603][15401] Updated weights for policy 0, policy_version 115250 (0.0036) [2024-06-22 02:28:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1888354304. Throughput: 0: 43110.7. Samples: 1888454160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 02:28:39,307][15401] Updated weights for policy 0, policy_version 115260 (0.0027) [2024-06-22 02:28:43,101][15401] Updated weights for policy 0, policy_version 115270 (0.0036) [2024-06-22 02:28:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1888583680. Throughput: 0: 42934.1. Samples: 1888704300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 02:28:47,238][15401] Updated weights for policy 0, policy_version 115280 (0.0028) [2024-06-22 02:28:48,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43422.3, 300 sec: 42987.2). Total num frames: 1888829440. Throughput: 0: 43093.4. Samples: 1888962680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 02:28:50,606][15401] Updated weights for policy 0, policy_version 115290 (0.0024) [2024-06-22 02:28:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 1889009664. Throughput: 0: 43040.9. Samples: 1889095100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 02:28:54,662][15401] Updated weights for policy 0, policy_version 115300 (0.0029) [2024-06-22 02:28:57,982][15401] Updated weights for policy 0, policy_version 115310 (0.0038) [2024-06-22 02:28:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 1889239040. Throughput: 0: 42985.8. Samples: 1889356880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:28:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 02:29:02,305][15401] Updated weights for policy 0, policy_version 115320 (0.0037) [2024-06-22 02:29:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1889452032. Throughput: 0: 42995.5. Samples: 1889611480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:29:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 02:29:05,898][15401] Updated weights for policy 0, policy_version 115330 (0.0029) [2024-06-22 02:29:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 1889665024. Throughput: 0: 42968.3. Samples: 1889741500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:29:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 02:29:09,969][15401] Updated weights for policy 0, policy_version 115340 (0.0042) [2024-06-22 02:29:13,389][15401] Updated weights for policy 0, policy_version 115350 (0.0045) [2024-06-22 02:29:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.7, 300 sec: 42931.6). Total num frames: 1889894400. Throughput: 0: 42965.7. Samples: 1889998880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 02:29:13,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 02:29:17,453][15401] Updated weights for policy 0, policy_version 115360 (0.0032) [2024-06-22 02:29:17,463][15349] Signal inference workers to stop experience collection... (27850 times) [2024-06-22 02:29:17,464][15349] Signal inference workers to resume experience collection... (27850 times) [2024-06-22 02:29:17,518][15401] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-06-22 02:29:17,519][15401] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-06-22 02:29:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1890107392. Throughput: 0: 42760.8. Samples: 1890249440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:18,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 02:29:20,906][15401] Updated weights for policy 0, policy_version 115370 (0.0023) [2024-06-22 02:29:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1890304000. Throughput: 0: 42815.9. Samples: 1890380880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 02:29:24,954][15401] Updated weights for policy 0, policy_version 115380 (0.0031) [2024-06-22 02:29:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 1890533376. Throughput: 0: 43106.3. Samples: 1890644080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 02:29:28,457][15401] Updated weights for policy 0, policy_version 115390 (0.0026) [2024-06-22 02:29:32,721][15401] Updated weights for policy 0, policy_version 115400 (0.0030) [2024-06-22 02:29:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1890762752. Throughput: 0: 43087.9. Samples: 1890901640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 02:29:36,055][15401] Updated weights for policy 0, policy_version 115410 (0.0028) [2024-06-22 02:29:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1890959360. Throughput: 0: 42976.6. Samples: 1891029040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 02:29:40,445][15401] Updated weights for policy 0, policy_version 115420 (0.0035) [2024-06-22 02:29:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1891172352. Throughput: 0: 42859.9. Samples: 1891285580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 02:29:43,551][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115429_1891188736.pth... [2024-06-22 02:29:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000114799_1880866816.pth [2024-06-22 02:29:43,752][15401] Updated weights for policy 0, policy_version 115430 (0.0031) [2024-06-22 02:29:48,043][15401] Updated weights for policy 0, policy_version 115440 (0.0032) [2024-06-22 02:29:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1891385344. Throughput: 0: 42925.0. Samples: 1891543100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 02:29:51,297][15401] Updated weights for policy 0, policy_version 115450 (0.0038) [2024-06-22 02:29:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1891581952. Throughput: 0: 42808.0. Samples: 1891667860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 02:29:55,559][15401] Updated weights for policy 0, policy_version 115460 (0.0036) [2024-06-22 02:29:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1891827712. Throughput: 0: 42951.1. Samples: 1891931580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:29:58,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 02:29:59,489][15401] Updated weights for policy 0, policy_version 115470 (0.0030) [2024-06-22 02:30:03,255][15401] Updated weights for policy 0, policy_version 115480 (0.0034) [2024-06-22 02:30:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1892040704. Throughput: 0: 43094.2. Samples: 1892188680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:30:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 02:30:06,863][15401] Updated weights for policy 0, policy_version 115490 (0.0037) [2024-06-22 02:30:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1892220928. Throughput: 0: 42973.3. Samples: 1892314680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:30:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 02:30:10,735][15401] Updated weights for policy 0, policy_version 115500 (0.0037) [2024-06-22 02:30:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 1892466688. Throughput: 0: 42801.7. Samples: 1892570160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 02:30:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 02:30:14,945][15401] Updated weights for policy 0, policy_version 115510 (0.0040) [2024-06-22 02:30:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1892646912. Throughput: 0: 42844.5. Samples: 1892829640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 02:30:18,608][15401] Updated weights for policy 0, policy_version 115520 (0.0029) [2024-06-22 02:30:22,483][15401] Updated weights for policy 0, policy_version 115530 (0.0031) [2024-06-22 02:30:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1892859904. Throughput: 0: 42726.1. Samples: 1892951720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 02:30:26,350][15401] Updated weights for policy 0, policy_version 115540 (0.0029) [2024-06-22 02:30:28,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 1893105664. Throughput: 0: 42731.5. Samples: 1893208600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:28,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 02:30:30,116][15401] Updated weights for policy 0, policy_version 115550 (0.0034) [2024-06-22 02:30:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1893269504. Throughput: 0: 42783.0. Samples: 1893468340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:33,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 02:30:34,077][15401] Updated weights for policy 0, policy_version 115560 (0.0033) [2024-06-22 02:30:35,073][15349] Signal inference workers to stop experience collection... (27900 times) [2024-06-22 02:30:35,129][15349] Signal inference workers to resume experience collection... (27900 times) [2024-06-22 02:30:35,129][15401] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-06-22 02:30:35,152][15401] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-06-22 02:30:37,691][15401] Updated weights for policy 0, policy_version 115570 (0.0036) [2024-06-22 02:30:38,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1893515264. Throughput: 0: 42696.5. Samples: 1893589200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 02:30:41,787][15401] Updated weights for policy 0, policy_version 115580 (0.0044) [2024-06-22 02:30:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1893728256. Throughput: 0: 42506.8. Samples: 1893844380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 02:30:45,338][15401] Updated weights for policy 0, policy_version 115590 (0.0034) [2024-06-22 02:30:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42876.4). Total num frames: 1893924864. Throughput: 0: 42655.7. Samples: 1894108180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 02:30:49,468][15401] Updated weights for policy 0, policy_version 115600 (0.0033) [2024-06-22 02:30:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1894137856. Throughput: 0: 42432.8. Samples: 1894224160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 02:30:53,405][15401] Updated weights for policy 0, policy_version 115610 (0.0044) [2024-06-22 02:30:57,579][15401] Updated weights for policy 0, policy_version 115620 (0.0038) [2024-06-22 02:30:58,394][15132] Fps is (10 sec: 44217.3, 60 sec: 42322.3, 300 sec: 42875.4). Total num frames: 1894367232. Throughput: 0: 42570.9. Samples: 1894486040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:30:58,395][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 02:31:00,881][15401] Updated weights for policy 0, policy_version 115630 (0.0022) [2024-06-22 02:31:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42820.5). Total num frames: 1894547456. Throughput: 0: 42507.1. Samples: 1894742460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:31:03,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 02:31:04,982][15401] Updated weights for policy 0, policy_version 115640 (0.0042) [2024-06-22 02:31:08,389][15132] Fps is (10 sec: 42617.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1894793216. Throughput: 0: 42561.9. Samples: 1894867000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:31:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 02:31:08,630][15401] Updated weights for policy 0, policy_version 115650 (0.0038) [2024-06-22 02:31:12,727][15401] Updated weights for policy 0, policy_version 115660 (0.0032) [2024-06-22 02:31:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42821.3). Total num frames: 1895006208. Throughput: 0: 42609.1. Samples: 1895125900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:31:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 02:31:16,058][15401] Updated weights for policy 0, policy_version 115670 (0.0046) [2024-06-22 02:31:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1895202816. Throughput: 0: 42620.8. Samples: 1895386280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 02:31:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 02:31:20,147][15401] Updated weights for policy 0, policy_version 115680 (0.0034) [2024-06-22 02:31:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1895432192. Throughput: 0: 42790.3. Samples: 1895514760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 02:31:23,726][15401] Updated weights for policy 0, policy_version 115690 (0.0033) [2024-06-22 02:31:27,700][15401] Updated weights for policy 0, policy_version 115700 (0.0027) [2024-06-22 02:31:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 1895661568. Throughput: 0: 42964.1. Samples: 1895777760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 02:31:31,338][15401] Updated weights for policy 0, policy_version 115710 (0.0044) [2024-06-22 02:31:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1895858176. Throughput: 0: 42716.0. Samples: 1896030400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:33,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 02:31:35,375][15401] Updated weights for policy 0, policy_version 115720 (0.0040) [2024-06-22 02:31:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1896071168. Throughput: 0: 42874.3. Samples: 1896153500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:38,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 02:31:39,202][15401] Updated weights for policy 0, policy_version 115730 (0.0033) [2024-06-22 02:31:43,059][15401] Updated weights for policy 0, policy_version 115740 (0.0042) [2024-06-22 02:31:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1896300544. Throughput: 0: 42965.2. Samples: 1896419280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:43,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 02:31:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115742_1896316928.pth... [2024-06-22 02:31:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115112_1885995008.pth [2024-06-22 02:31:46,780][15401] Updated weights for policy 0, policy_version 115750 (0.0042) [2024-06-22 02:31:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1896497152. Throughput: 0: 42859.9. Samples: 1896671160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:48,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 02:31:50,021][15349] Signal inference workers to stop experience collection... (27950 times) [2024-06-22 02:31:50,064][15401] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-06-22 02:31:50,072][15349] Signal inference workers to resume experience collection... (27950 times) [2024-06-22 02:31:50,085][15401] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-06-22 02:31:50,607][15401] Updated weights for policy 0, policy_version 115760 (0.0046) [2024-06-22 02:31:53,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 1896710144. Throughput: 0: 42858.5. Samples: 1896795740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:53,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 02:31:54,324][15401] Updated weights for policy 0, policy_version 115770 (0.0041) [2024-06-22 02:31:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42601.6, 300 sec: 42820.5). Total num frames: 1896923136. Throughput: 0: 42984.8. Samples: 1897060220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:31:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 02:31:58,417][15401] Updated weights for policy 0, policy_version 115780 (0.0029) [2024-06-22 02:32:01,820][15401] Updated weights for policy 0, policy_version 115790 (0.0042) [2024-06-22 02:32:03,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1897152512. Throughput: 0: 42789.3. Samples: 1897311800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:32:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 02:32:05,981][15401] Updated weights for policy 0, policy_version 115800 (0.0029) [2024-06-22 02:32:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1897349120. Throughput: 0: 42769.8. Samples: 1897439400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:32:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 02:32:09,356][15401] Updated weights for policy 0, policy_version 115810 (0.0028) [2024-06-22 02:32:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1897562112. Throughput: 0: 42648.0. Samples: 1897696920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:32:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 02:32:13,584][15401] Updated weights for policy 0, policy_version 115820 (0.0038) [2024-06-22 02:32:17,054][15401] Updated weights for policy 0, policy_version 115830 (0.0032) [2024-06-22 02:32:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1897791488. Throughput: 0: 42614.3. Samples: 1897948040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 02:32:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 02:32:21,467][15401] Updated weights for policy 0, policy_version 115840 (0.0046) [2024-06-22 02:32:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1898004480. Throughput: 0: 42786.2. Samples: 1898078880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 02:32:24,764][15401] Updated weights for policy 0, policy_version 115850 (0.0044) [2024-06-22 02:32:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1898201088. Throughput: 0: 42679.4. Samples: 1898339860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 02:32:29,105][15401] Updated weights for policy 0, policy_version 115860 (0.0029) [2024-06-22 02:32:32,426][15401] Updated weights for policy 0, policy_version 115870 (0.0044) [2024-06-22 02:32:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1898430464. Throughput: 0: 42621.0. Samples: 1898589100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 02:32:36,916][15401] Updated weights for policy 0, policy_version 115880 (0.0052) [2024-06-22 02:32:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1898643456. Throughput: 0: 42806.7. Samples: 1898721940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 02:32:40,066][15401] Updated weights for policy 0, policy_version 115890 (0.0033) [2024-06-22 02:32:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.1, 300 sec: 42710.4). Total num frames: 1898823680. Throughput: 0: 42651.5. Samples: 1898979540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 02:32:44,585][15401] Updated weights for policy 0, policy_version 115900 (0.0032) [2024-06-22 02:32:47,701][15401] Updated weights for policy 0, policy_version 115910 (0.0035) [2024-06-22 02:32:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 1899085824. Throughput: 0: 42511.1. Samples: 1899224800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:48,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 02:32:52,512][15401] Updated weights for policy 0, policy_version 115920 (0.0037) [2024-06-22 02:32:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1899266048. Throughput: 0: 42853.7. Samples: 1899367820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 02:32:55,291][15401] Updated weights for policy 0, policy_version 115930 (0.0050) [2024-06-22 02:32:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1899479040. Throughput: 0: 42717.7. Samples: 1899619220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:32:58,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-22 02:33:00,206][15401] Updated weights for policy 0, policy_version 115940 (0.0032) [2024-06-22 02:33:03,032][15401] Updated weights for policy 0, policy_version 115950 (0.0028) [2024-06-22 02:33:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1899741184. Throughput: 0: 42547.0. Samples: 1899862660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:33:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 02:33:07,719][15401] Updated weights for policy 0, policy_version 115960 (0.0031) [2024-06-22 02:33:08,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 1899921408. Throughput: 0: 42750.9. Samples: 1900002940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:33:08,396][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 02:33:10,729][15401] Updated weights for policy 0, policy_version 115970 (0.0040) [2024-06-22 02:33:13,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1900118016. Throughput: 0: 42562.3. Samples: 1900255160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:33:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 02:33:15,238][15401] Updated weights for policy 0, policy_version 115980 (0.0031) [2024-06-22 02:33:16,412][15349] Signal inference workers to stop experience collection... (28000 times) [2024-06-22 02:33:16,459][15401] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-06-22 02:33:16,469][15349] Signal inference workers to resume experience collection... (28000 times) [2024-06-22 02:33:16,479][15401] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-06-22 02:33:18,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1900363776. Throughput: 0: 42621.9. Samples: 1900507080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:33:18,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-22 02:33:18,496][15401] Updated weights for policy 0, policy_version 115990 (0.0042) [2024-06-22 02:33:22,991][15401] Updated weights for policy 0, policy_version 116000 (0.0029) [2024-06-22 02:33:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1900560384. Throughput: 0: 42676.6. Samples: 1900642380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:23,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 02:33:26,165][15401] Updated weights for policy 0, policy_version 116010 (0.0033) [2024-06-22 02:33:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1900773376. Throughput: 0: 42583.6. Samples: 1900895800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 02:33:30,635][15401] Updated weights for policy 0, policy_version 116020 (0.0046) [2024-06-22 02:33:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1901002752. Throughput: 0: 42780.9. Samples: 1901149940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 02:33:33,995][15401] Updated weights for policy 0, policy_version 116030 (0.0044) [2024-06-22 02:33:38,263][15401] Updated weights for policy 0, policy_version 116040 (0.0041) [2024-06-22 02:33:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1901199360. Throughput: 0: 42432.8. Samples: 1901277300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 02:33:41,714][15401] Updated weights for policy 0, policy_version 116050 (0.0030) [2024-06-22 02:33:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1901412352. Throughput: 0: 42508.1. Samples: 1901532080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 02:33:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116054_1901428736.pth... [2024-06-22 02:33:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115429_1891188736.pth [2024-06-22 02:33:46,153][15401] Updated weights for policy 0, policy_version 116060 (0.0036) [2024-06-22 02:33:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1901658112. Throughput: 0: 42785.9. Samples: 1901788020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 02:33:49,385][15401] Updated weights for policy 0, policy_version 116070 (0.0048) [2024-06-22 02:33:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1901821952. Throughput: 0: 42488.1. Samples: 1901914640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:53,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 02:33:53,807][15401] Updated weights for policy 0, policy_version 116080 (0.0029) [2024-06-22 02:33:57,030][15401] Updated weights for policy 0, policy_version 116090 (0.0038) [2024-06-22 02:33:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1902034944. Throughput: 0: 42501.4. Samples: 1902167720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:33:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 02:34:01,705][15401] Updated weights for policy 0, policy_version 116100 (0.0031) [2024-06-22 02:34:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1902264320. Throughput: 0: 42601.2. Samples: 1902424140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:34:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 02:34:04,798][15401] Updated weights for policy 0, policy_version 116110 (0.0026) [2024-06-22 02:34:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42329.8, 300 sec: 42598.8). Total num frames: 1902460928. Throughput: 0: 42502.2. Samples: 1902554980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:34:08,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-22 02:34:09,307][15401] Updated weights for policy 0, policy_version 116120 (0.0033) [2024-06-22 02:34:12,647][15401] Updated weights for policy 0, policy_version 116130 (0.0031) [2024-06-22 02:34:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1902690304. Throughput: 0: 42454.2. Samples: 1902806240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:34:13,391][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 02:34:17,055][15401] Updated weights for policy 0, policy_version 116140 (0.0038) [2024-06-22 02:34:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1902903296. Throughput: 0: 42441.8. Samples: 1903059820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 02:34:18,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 02:34:20,273][15401] Updated weights for policy 0, policy_version 116150 (0.0035) [2024-06-22 02:34:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1903083520. Throughput: 0: 42445.0. Samples: 1903187320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 02:34:24,680][15401] Updated weights for policy 0, policy_version 116160 (0.0029) [2024-06-22 02:34:27,858][15401] Updated weights for policy 0, policy_version 116170 (0.0028) [2024-06-22 02:34:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1903329280. Throughput: 0: 42450.3. Samples: 1903442340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 02:34:32,522][15401] Updated weights for policy 0, policy_version 116180 (0.0035) [2024-06-22 02:34:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1903525888. Throughput: 0: 42578.3. Samples: 1903704040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 02:34:35,560][15401] Updated weights for policy 0, policy_version 116190 (0.0033) [2024-06-22 02:34:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1903738880. Throughput: 0: 42443.2. Samples: 1903824580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 02:34:40,349][15401] Updated weights for policy 0, policy_version 116200 (0.0031) [2024-06-22 02:34:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1903968256. Throughput: 0: 42486.3. Samples: 1904079600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 02:34:43,558][15401] Updated weights for policy 0, policy_version 116210 (0.0042) [2024-06-22 02:34:48,009][15401] Updated weights for policy 0, policy_version 116220 (0.0042) [2024-06-22 02:34:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 1904164864. Throughput: 0: 42469.8. Samples: 1904335280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 02:34:51,553][15401] Updated weights for policy 0, policy_version 116230 (0.0029) [2024-06-22 02:34:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1904361472. Throughput: 0: 42394.2. Samples: 1904462720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:53,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 02:34:54,189][15349] Signal inference workers to stop experience collection... (28050 times) [2024-06-22 02:34:54,191][15349] Signal inference workers to resume experience collection... (28050 times) [2024-06-22 02:34:54,236][15401] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-06-22 02:34:54,236][15401] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-06-22 02:34:55,672][15401] Updated weights for policy 0, policy_version 116240 (0.0029) [2024-06-22 02:34:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1904607232. Throughput: 0: 42348.5. Samples: 1904711920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:34:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 02:34:59,538][15401] Updated weights for policy 0, policy_version 116250 (0.0038) [2024-06-22 02:35:03,215][15401] Updated weights for policy 0, policy_version 116260 (0.0032) [2024-06-22 02:35:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1904803840. Throughput: 0: 42514.7. Samples: 1904972980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:35:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 02:35:07,214][15401] Updated weights for policy 0, policy_version 116270 (0.0040) [2024-06-22 02:35:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1905000448. Throughput: 0: 42415.0. Samples: 1905096000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:35:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 02:35:10,849][15401] Updated weights for policy 0, policy_version 116280 (0.0040) [2024-06-22 02:35:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1905262592. Throughput: 0: 42493.6. Samples: 1905354560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:35:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 02:35:14,851][15401] Updated weights for policy 0, policy_version 116290 (0.0032) [2024-06-22 02:35:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 1905442816. Throughput: 0: 42388.5. Samples: 1905611520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:35:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 02:35:18,417][15401] Updated weights for policy 0, policy_version 116300 (0.0037) [2024-06-22 02:35:22,525][15401] Updated weights for policy 0, policy_version 116310 (0.0041) [2024-06-22 02:35:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 1905639424. Throughput: 0: 42488.1. Samples: 1905736540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 02:35:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 02:35:26,078][15401] Updated weights for policy 0, policy_version 116320 (0.0031) [2024-06-22 02:35:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1905868800. Throughput: 0: 42564.8. Samples: 1905995020. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 02:35:30,239][15401] Updated weights for policy 0, policy_version 116330 (0.0039) [2024-06-22 02:35:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1906065408. Throughput: 0: 42548.5. Samples: 1906249960. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 02:35:33,858][15401] Updated weights for policy 0, policy_version 116340 (0.0034) [2024-06-22 02:35:37,947][15401] Updated weights for policy 0, policy_version 116350 (0.0026) [2024-06-22 02:35:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 1906294784. Throughput: 0: 42510.6. Samples: 1906375800. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:38,393][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 02:35:41,399][15401] Updated weights for policy 0, policy_version 116360 (0.0038) [2024-06-22 02:35:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1906491392. Throughput: 0: 42720.5. Samples: 1906634340. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 02:35:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116364_1906507776.pth... [2024-06-22 02:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000115742_1896316928.pth [2024-06-22 02:35:45,515][15401] Updated weights for policy 0, policy_version 116370 (0.0033) [2024-06-22 02:35:48,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1906688000. Throughput: 0: 42472.5. Samples: 1906884240. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 02:35:49,773][15401] Updated weights for policy 0, policy_version 116380 (0.0025) [2024-06-22 02:35:53,183][15401] Updated weights for policy 0, policy_version 116390 (0.0024) [2024-06-22 02:35:53,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42599.0). Total num frames: 1906933760. Throughput: 0: 42444.8. Samples: 1907006020. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 02:35:57,368][15401] Updated weights for policy 0, policy_version 116400 (0.0030) [2024-06-22 02:35:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1907146752. Throughput: 0: 42521.9. Samples: 1907268040. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:35:58,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 02:36:00,961][15401] Updated weights for policy 0, policy_version 116410 (0.0034) [2024-06-22 02:36:03,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1907343360. Throughput: 0: 42492.0. Samples: 1907523660. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:36:03,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-22 02:36:03,946][15349] Signal inference workers to stop experience collection... (28100 times) [2024-06-22 02:36:04,001][15401] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-06-22 02:36:04,008][15349] Signal inference workers to resume experience collection... (28100 times) [2024-06-22 02:36:04,016][15401] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-06-22 02:36:04,873][15401] Updated weights for policy 0, policy_version 116420 (0.0035) [2024-06-22 02:36:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1907572736. Throughput: 0: 42564.5. Samples: 1907651940. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:36:08,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 02:36:08,577][15401] Updated weights for policy 0, policy_version 116430 (0.0034) [2024-06-22 02:36:12,333][15401] Updated weights for policy 0, policy_version 116440 (0.0026) [2024-06-22 02:36:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 1907785728. Throughput: 0: 42649.4. Samples: 1907914240. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:36:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 02:36:16,209][15401] Updated weights for policy 0, policy_version 116450 (0.0038) [2024-06-22 02:36:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1907998720. Throughput: 0: 42509.7. Samples: 1908162900. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:36:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 02:36:20,019][15401] Updated weights for policy 0, policy_version 116460 (0.0033) [2024-06-22 02:36:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 1908211712. Throughput: 0: 42516.0. Samples: 1908289020. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-22 02:36:23,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 02:36:23,877][15401] Updated weights for policy 0, policy_version 116470 (0.0038) [2024-06-22 02:36:28,035][15401] Updated weights for policy 0, policy_version 116480 (0.0029) [2024-06-22 02:36:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1908424704. Throughput: 0: 42581.3. Samples: 1908550500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 02:36:31,450][15401] Updated weights for policy 0, policy_version 116490 (0.0040) [2024-06-22 02:36:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1908654080. Throughput: 0: 42587.1. Samples: 1908800660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 02:36:35,823][15401] Updated weights for policy 0, policy_version 116500 (0.0043) [2024-06-22 02:36:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1908850688. Throughput: 0: 42916.6. Samples: 1908937260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 02:36:39,157][15401] Updated weights for policy 0, policy_version 116510 (0.0032) [2024-06-22 02:36:43,296][15401] Updated weights for policy 0, policy_version 116520 (0.0031) [2024-06-22 02:36:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1909063680. Throughput: 0: 42703.9. Samples: 1909189720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 02:36:46,921][15401] Updated weights for policy 0, policy_version 116530 (0.0028) [2024-06-22 02:36:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42654.3). Total num frames: 1909293056. Throughput: 0: 42667.1. Samples: 1909443680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 02:36:51,024][15401] Updated weights for policy 0, policy_version 116540 (0.0041) [2024-06-22 02:36:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1909489664. Throughput: 0: 42719.9. Samples: 1909574340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 02:36:54,649][15401] Updated weights for policy 0, policy_version 116550 (0.0045) [2024-06-22 02:36:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1909702656. Throughput: 0: 42539.6. Samples: 1909828520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:36:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 02:36:58,458][15401] Updated weights for policy 0, policy_version 116560 (0.0028) [2024-06-22 02:37:02,173][15401] Updated weights for policy 0, policy_version 116570 (0.0044) [2024-06-22 02:37:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1909932032. Throughput: 0: 42707.7. Samples: 1910084740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:37:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 02:37:06,174][15401] Updated weights for policy 0, policy_version 116580 (0.0033) [2024-06-22 02:37:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1910128640. Throughput: 0: 42814.8. Samples: 1910215580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:37:08,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 02:37:09,909][15401] Updated weights for policy 0, policy_version 116590 (0.0032) [2024-06-22 02:37:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1910341632. Throughput: 0: 42664.0. Samples: 1910470380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:37:13,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 02:37:13,726][15401] Updated weights for policy 0, policy_version 116600 (0.0026) [2024-06-22 02:37:17,630][15401] Updated weights for policy 0, policy_version 116610 (0.0046) [2024-06-22 02:37:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 1910587392. Throughput: 0: 42771.7. Samples: 1910725380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:37:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 02:37:21,745][15401] Updated weights for policy 0, policy_version 116620 (0.0032) [2024-06-22 02:37:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 1910751232. Throughput: 0: 42618.3. Samples: 1910855080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 02:37:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 02:37:25,351][15401] Updated weights for policy 0, policy_version 116630 (0.0027) [2024-06-22 02:37:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1910996992. Throughput: 0: 42655.1. Samples: 1911109200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 02:37:29,301][15401] Updated weights for policy 0, policy_version 116640 (0.0051) [2024-06-22 02:37:32,943][15401] Updated weights for policy 0, policy_version 116650 (0.0030) [2024-06-22 02:37:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1911209984. Throughput: 0: 42682.6. Samples: 1911364400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 02:37:36,709][15349] Signal inference workers to stop experience collection... (28150 times) [2024-06-22 02:37:36,740][15401] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-06-22 02:37:36,767][15349] Signal inference workers to resume experience collection... (28150 times) [2024-06-22 02:37:36,768][15401] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-06-22 02:37:36,918][15401] Updated weights for policy 0, policy_version 116660 (0.0032) [2024-06-22 02:37:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1911406592. Throughput: 0: 42706.5. Samples: 1911496140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 02:37:40,520][15401] Updated weights for policy 0, policy_version 116670 (0.0029) [2024-06-22 02:37:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1911635968. Throughput: 0: 42727.8. Samples: 1911751280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 02:37:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116677_1911635968.pth... [2024-06-22 02:37:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116054_1901428736.pth [2024-06-22 02:37:44,443][15401] Updated weights for policy 0, policy_version 116680 (0.0034) [2024-06-22 02:37:48,145][15401] Updated weights for policy 0, policy_version 116690 (0.0033) [2024-06-22 02:37:48,390][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1911865344. Throughput: 0: 42647.4. Samples: 1912003880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 02:37:51,904][15401] Updated weights for policy 0, policy_version 116700 (0.0049) [2024-06-22 02:37:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1912045568. Throughput: 0: 42629.2. Samples: 1912133900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 02:37:55,851][15401] Updated weights for policy 0, policy_version 116710 (0.0046) [2024-06-22 02:37:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1912274944. Throughput: 0: 42564.0. Samples: 1912385760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:37:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 02:37:59,900][15401] Updated weights for policy 0, policy_version 116720 (0.0037) [2024-06-22 02:38:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 1912487936. Throughput: 0: 42713.8. Samples: 1912647500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:38:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 02:38:03,462][15401] Updated weights for policy 0, policy_version 116730 (0.0024) [2024-06-22 02:38:07,524][15401] Updated weights for policy 0, policy_version 116740 (0.0038) [2024-06-22 02:38:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1912684544. Throughput: 0: 42721.2. Samples: 1912777540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:38:08,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 02:38:11,121][15401] Updated weights for policy 0, policy_version 116750 (0.0041) [2024-06-22 02:38:13,396][15132] Fps is (10 sec: 42570.1, 60 sec: 42866.8, 300 sec: 42541.9). Total num frames: 1912913920. Throughput: 0: 42613.1. Samples: 1913027060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:38:13,397][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 02:38:15,354][15401] Updated weights for policy 0, policy_version 116760 (0.0028) [2024-06-22 02:38:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1913126912. Throughput: 0: 42786.7. Samples: 1913289800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:38:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 02:38:18,822][15401] Updated weights for policy 0, policy_version 116770 (0.0033) [2024-06-22 02:38:23,382][15401] Updated weights for policy 0, policy_version 116780 (0.0023) [2024-06-22 02:38:23,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1913323520. Throughput: 0: 42630.0. Samples: 1913414480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 02:38:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 02:38:26,386][15401] Updated weights for policy 0, policy_version 116790 (0.0039) [2024-06-22 02:38:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1913552896. Throughput: 0: 42594.4. Samples: 1913668020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 02:38:30,995][15401] Updated weights for policy 0, policy_version 116800 (0.0035) [2024-06-22 02:38:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1913749504. Throughput: 0: 42848.3. Samples: 1913932060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 02:38:34,103][15401] Updated weights for policy 0, policy_version 116810 (0.0031) [2024-06-22 02:38:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 1913962496. Throughput: 0: 42718.3. Samples: 1914056220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:38,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 02:38:38,528][15401] Updated weights for policy 0, policy_version 116820 (0.0035) [2024-06-22 02:38:41,772][15401] Updated weights for policy 0, policy_version 116830 (0.0023) [2024-06-22 02:38:43,396][15132] Fps is (10 sec: 45846.3, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 1914208256. Throughput: 0: 42748.0. Samples: 1914309700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:43,397][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:38:46,083][15401] Updated weights for policy 0, policy_version 116840 (0.0031) [2024-06-22 02:38:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1914388480. Throughput: 0: 42722.9. Samples: 1914570040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:48,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 02:38:49,463][15401] Updated weights for policy 0, policy_version 116850 (0.0038) [2024-06-22 02:38:53,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1914617856. Throughput: 0: 42582.6. Samples: 1914693760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 02:38:53,587][15401] Updated weights for policy 0, policy_version 116860 (0.0025) [2024-06-22 02:38:57,199][15401] Updated weights for policy 0, policy_version 116870 (0.0031) [2024-06-22 02:38:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1914847232. Throughput: 0: 42723.1. Samples: 1914949320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:38:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 02:39:01,562][15401] Updated weights for policy 0, policy_version 116880 (0.0039) [2024-06-22 02:39:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1915011072. Throughput: 0: 42705.3. Samples: 1915211540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 02:39:04,079][15349] Signal inference workers to stop experience collection... (28200 times) [2024-06-22 02:39:04,111][15401] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-06-22 02:39:04,144][15349] Signal inference workers to resume experience collection... (28200 times) [2024-06-22 02:39:04,144][15401] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-06-22 02:39:05,011][15401] Updated weights for policy 0, policy_version 116890 (0.0038) [2024-06-22 02:39:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1915256832. Throughput: 0: 42599.6. Samples: 1915331460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 02:39:09,169][15401] Updated weights for policy 0, policy_version 116900 (0.0043) [2024-06-22 02:39:12,527][15401] Updated weights for policy 0, policy_version 116910 (0.0026) [2024-06-22 02:39:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 1915486208. Throughput: 0: 42768.2. Samples: 1915592600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 02:39:16,911][15401] Updated weights for policy 0, policy_version 116920 (0.0032) [2024-06-22 02:39:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1915666432. Throughput: 0: 42554.4. Samples: 1915847000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 02:39:20,281][15401] Updated weights for policy 0, policy_version 116930 (0.0041) [2024-06-22 02:39:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1915895808. Throughput: 0: 42442.2. Samples: 1915966120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 02:39:24,583][15401] Updated weights for policy 0, policy_version 116940 (0.0036) [2024-06-22 02:39:27,881][15401] Updated weights for policy 0, policy_version 116950 (0.0035) [2024-06-22 02:39:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1916125184. Throughput: 0: 42686.1. Samples: 1916230300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 02:39:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 02:39:32,150][15401] Updated weights for policy 0, policy_version 116960 (0.0037) [2024-06-22 02:39:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1916321792. Throughput: 0: 42642.2. Samples: 1916488940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 02:39:35,447][15401] Updated weights for policy 0, policy_version 116970 (0.0042) [2024-06-22 02:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1916534784. Throughput: 0: 42549.5. Samples: 1916608480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:39:39,782][15401] Updated weights for policy 0, policy_version 116980 (0.0034) [2024-06-22 02:39:43,373][15401] Updated weights for policy 0, policy_version 116990 (0.0032) [2024-06-22 02:39:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 1916764160. Throughput: 0: 42736.8. Samples: 1916872480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 02:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116990_1916764160.pth... [2024-06-22 02:39:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116364_1906507776.pth [2024-06-22 02:39:47,333][15401] Updated weights for policy 0, policy_version 117000 (0.0039) [2024-06-22 02:39:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1916944384. Throughput: 0: 42637.7. Samples: 1917130240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:48,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 02:39:51,001][15401] Updated weights for policy 0, policy_version 117010 (0.0030) [2024-06-22 02:39:53,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42871.1, 300 sec: 42653.8). Total num frames: 1917190144. Throughput: 0: 42599.3. Samples: 1917248460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:53,391][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 02:39:55,676][15401] Updated weights for policy 0, policy_version 117020 (0.0034) [2024-06-22 02:39:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1917386752. Throughput: 0: 42462.9. Samples: 1917503420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:39:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 02:39:58,992][15401] Updated weights for policy 0, policy_version 117030 (0.0038) [2024-06-22 02:40:03,389][15132] Fps is (10 sec: 37685.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1917566976. Throughput: 0: 42574.7. Samples: 1917762860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 02:40:03,415][15401] Updated weights for policy 0, policy_version 117040 (0.0030) [2024-06-22 02:40:06,638][15401] Updated weights for policy 0, policy_version 117050 (0.0038) [2024-06-22 02:40:08,393][15132] Fps is (10 sec: 44219.1, 60 sec: 42868.6, 300 sec: 42597.8). Total num frames: 1917829120. Throughput: 0: 42611.4. Samples: 1917883800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:08,394][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 02:40:10,984][15401] Updated weights for policy 0, policy_version 117060 (0.0041) [2024-06-22 02:40:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 1918025728. Throughput: 0: 42508.5. Samples: 1918143180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 02:40:14,282][15401] Updated weights for policy 0, policy_version 117070 (0.0042) [2024-06-22 02:40:18,390][15132] Fps is (10 sec: 37697.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1918205952. Throughput: 0: 42463.1. Samples: 1918399780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 02:40:19,000][15401] Updated weights for policy 0, policy_version 117080 (0.0037) [2024-06-22 02:40:22,077][15401] Updated weights for policy 0, policy_version 117090 (0.0043) [2024-06-22 02:40:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1918451712. Throughput: 0: 42565.3. Samples: 1918523920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 02:40:26,527][15401] Updated weights for policy 0, policy_version 117100 (0.0041) [2024-06-22 02:40:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1918648320. Throughput: 0: 42382.2. Samples: 1918779680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 02:40:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 02:40:29,870][15401] Updated weights for policy 0, policy_version 117110 (0.0038) [2024-06-22 02:40:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 1918861312. Throughput: 0: 42488.0. Samples: 1919042200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 02:40:34,044][15401] Updated weights for policy 0, policy_version 117120 (0.0033) [2024-06-22 02:40:35,746][15349] Signal inference workers to stop experience collection... (28250 times) [2024-06-22 02:40:35,770][15401] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-06-22 02:40:35,809][15349] Signal inference workers to resume experience collection... (28250 times) [2024-06-22 02:40:35,817][15401] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-06-22 02:40:37,310][15401] Updated weights for policy 0, policy_version 117130 (0.0028) [2024-06-22 02:40:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1919074304. Throughput: 0: 42745.6. Samples: 1919171980. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 02:40:41,443][15401] Updated weights for policy 0, policy_version 117140 (0.0031) [2024-06-22 02:40:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1919287296. Throughput: 0: 42716.0. Samples: 1919425640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 02:40:44,820][15401] Updated weights for policy 0, policy_version 117150 (0.0039) [2024-06-22 02:40:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1919500288. Throughput: 0: 42600.8. Samples: 1919679900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 02:40:48,910][15401] Updated weights for policy 0, policy_version 117160 (0.0043) [2024-06-22 02:40:53,105][15401] Updated weights for policy 0, policy_version 117170 (0.0037) [2024-06-22 02:40:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.8, 300 sec: 42653.9). Total num frames: 1919729664. Throughput: 0: 42780.1. Samples: 1919808740. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:40:56,522][15401] Updated weights for policy 0, policy_version 117180 (0.0044) [2024-06-22 02:40:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1919926272. Throughput: 0: 42575.1. Samples: 1920059060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:40:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 02:41:00,649][15401] Updated weights for policy 0, policy_version 117190 (0.0030) [2024-06-22 02:41:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1920155648. Throughput: 0: 42565.0. Samples: 1920315200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 02:41:04,112][15401] Updated weights for policy 0, policy_version 117200 (0.0033) [2024-06-22 02:41:08,193][15401] Updated weights for policy 0, policy_version 117210 (0.0043) [2024-06-22 02:41:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42328.1, 300 sec: 42653.9). Total num frames: 1920368640. Throughput: 0: 42848.0. Samples: 1920452080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:08,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 02:41:11,706][15401] Updated weights for policy 0, policy_version 117220 (0.0043) [2024-06-22 02:41:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 1920581632. Throughput: 0: 42756.8. Samples: 1920703840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:13,393][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 02:41:15,946][15401] Updated weights for policy 0, policy_version 117230 (0.0037) [2024-06-22 02:41:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 1920794624. Throughput: 0: 42504.5. Samples: 1920954900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 02:41:19,689][15401] Updated weights for policy 0, policy_version 117240 (0.0033) [2024-06-22 02:41:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1921007616. Throughput: 0: 42467.4. Samples: 1921083020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 02:41:23,729][15401] Updated weights for policy 0, policy_version 117250 (0.0030) [2024-06-22 02:41:27,490][15401] Updated weights for policy 0, policy_version 117260 (0.0028) [2024-06-22 02:41:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1921236992. Throughput: 0: 42544.4. Samples: 1921340140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-22 02:41:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 02:41:31,121][15401] Updated weights for policy 0, policy_version 117270 (0.0037) [2024-06-22 02:41:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1921433600. Throughput: 0: 42693.9. Samples: 1921601120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 02:41:35,146][15401] Updated weights for policy 0, policy_version 117280 (0.0036) [2024-06-22 02:41:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 1921662976. Throughput: 0: 42570.7. Samples: 1921724520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:38,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 02:41:38,507][15401] Updated weights for policy 0, policy_version 117290 (0.0045) [2024-06-22 02:41:42,816][15401] Updated weights for policy 0, policy_version 117300 (0.0031) [2024-06-22 02:41:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 1921875968. Throughput: 0: 42768.3. Samples: 1921983740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:43,393][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 02:41:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117303_1921892352.pth... [2024-06-22 02:41:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116677_1911635968.pth [2024-06-22 02:41:45,995][15401] Updated weights for policy 0, policy_version 117310 (0.0033) [2024-06-22 02:41:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1922072576. Throughput: 0: 42833.8. Samples: 1922242720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 02:41:50,423][15401] Updated weights for policy 0, policy_version 117320 (0.0029) [2024-06-22 02:41:53,393][15132] Fps is (10 sec: 42593.0, 60 sec: 42868.9, 300 sec: 42708.9). Total num frames: 1922301952. Throughput: 0: 42592.9. Samples: 1922368920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:53,394][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 02:41:54,182][15401] Updated weights for policy 0, policy_version 117330 (0.0031) [2024-06-22 02:41:57,967][15401] Updated weights for policy 0, policy_version 117340 (0.0035) [2024-06-22 02:41:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1922514944. Throughput: 0: 42737.8. Samples: 1922626940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:41:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 02:42:00,282][15349] Signal inference workers to stop experience collection... (28300 times) [2024-06-22 02:42:00,314][15401] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-06-22 02:42:00,348][15349] Signal inference workers to resume experience collection... (28300 times) [2024-06-22 02:42:00,349][15401] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-06-22 02:42:01,666][15401] Updated weights for policy 0, policy_version 117350 (0.0035) [2024-06-22 02:42:03,390][15132] Fps is (10 sec: 40975.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1922711552. Throughput: 0: 42922.6. Samples: 1922886420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 02:42:05,505][15401] Updated weights for policy 0, policy_version 117360 (0.0026) [2024-06-22 02:42:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1922957312. Throughput: 0: 42878.7. Samples: 1923012560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 02:42:09,136][15401] Updated weights for policy 0, policy_version 117370 (0.0033) [2024-06-22 02:42:13,095][15401] Updated weights for policy 0, policy_version 117380 (0.0040) [2024-06-22 02:42:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42873.0, 300 sec: 42598.3). Total num frames: 1923153920. Throughput: 0: 42932.1. Samples: 1923272100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 02:42:16,793][15401] Updated weights for policy 0, policy_version 117390 (0.0043) [2024-06-22 02:42:18,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1923334144. Throughput: 0: 42711.5. Samples: 1923523140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:18,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 02:42:21,065][15401] Updated weights for policy 0, policy_version 117400 (0.0034) [2024-06-22 02:42:23,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1923563520. Throughput: 0: 42682.8. Samples: 1923645140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:23,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 02:42:24,287][15401] Updated weights for policy 0, policy_version 117410 (0.0027) [2024-06-22 02:42:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1923776512. Throughput: 0: 42709.0. Samples: 1923905540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 02:42:28,929][15401] Updated weights for policy 0, policy_version 117420 (0.0030) [2024-06-22 02:42:31,768][15401] Updated weights for policy 0, policy_version 117430 (0.0035) [2024-06-22 02:42:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1923973120. Throughput: 0: 42643.0. Samples: 1924161660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 02:42:33,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-22 02:42:36,334][15401] Updated weights for policy 0, policy_version 117440 (0.0029) [2024-06-22 02:42:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 1924218880. Throughput: 0: 42711.2. Samples: 1924290760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:42:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 02:42:39,477][15401] Updated weights for policy 0, policy_version 117450 (0.0036) [2024-06-22 02:42:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1924431872. Throughput: 0: 42863.2. Samples: 1924555780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:42:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 02:42:44,001][15401] Updated weights for policy 0, policy_version 117460 (0.0042) [2024-06-22 02:42:47,059][15401] Updated weights for policy 0, policy_version 117470 (0.0032) [2024-06-22 02:42:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1924628480. Throughput: 0: 42573.8. Samples: 1924802240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:42:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 02:42:51,493][15401] Updated weights for policy 0, policy_version 117480 (0.0041) [2024-06-22 02:42:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 1924857856. Throughput: 0: 42584.5. Samples: 1924928860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:42:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 02:42:54,862][15401] Updated weights for policy 0, policy_version 117490 (0.0034) [2024-06-22 02:42:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1925070848. Throughput: 0: 42753.6. Samples: 1925196000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:42:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 02:42:59,309][15401] Updated weights for policy 0, policy_version 117500 (0.0039) [2024-06-22 02:43:00,382][15349] Signal inference workers to stop experience collection... (28350 times) [2024-06-22 02:43:00,423][15401] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-06-22 02:43:00,435][15349] Signal inference workers to resume experience collection... (28350 times) [2024-06-22 02:43:00,443][15401] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-06-22 02:43:02,443][15401] Updated weights for policy 0, policy_version 117510 (0.0027) [2024-06-22 02:43:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1925283840. Throughput: 0: 42594.6. Samples: 1925439900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 02:43:06,803][15401] Updated weights for policy 0, policy_version 117520 (0.0037) [2024-06-22 02:43:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 1925496832. Throughput: 0: 42888.5. Samples: 1925575120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 02:43:10,536][15401] Updated weights for policy 0, policy_version 117530 (0.0032) [2024-06-22 02:43:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1925693440. Throughput: 0: 42964.3. Samples: 1925838940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 02:43:14,750][15401] Updated weights for policy 0, policy_version 117540 (0.0038) [2024-06-22 02:43:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1925922816. Throughput: 0: 42570.3. Samples: 1926077320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 02:43:18,667][15401] Updated weights for policy 0, policy_version 117550 (0.0034) [2024-06-22 02:43:22,605][15401] Updated weights for policy 0, policy_version 117560 (0.0036) [2024-06-22 02:43:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1926152192. Throughput: 0: 42755.0. Samples: 1926214740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 02:43:26,255][15401] Updated weights for policy 0, policy_version 117570 (0.0031) [2024-06-22 02:43:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1926316032. Throughput: 0: 42589.9. Samples: 1926472320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:43:30,423][15401] Updated weights for policy 0, policy_version 117580 (0.0031) [2024-06-22 02:43:33,395][15132] Fps is (10 sec: 42575.8, 60 sec: 43413.9, 300 sec: 42764.2). Total num frames: 1926578176. Throughput: 0: 42372.8. Samples: 1926709240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:33,395][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 02:43:33,948][15401] Updated weights for policy 0, policy_version 117590 (0.0028) [2024-06-22 02:43:38,242][15401] Updated weights for policy 0, policy_version 117600 (0.0028) [2024-06-22 02:43:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 1926758400. Throughput: 0: 42691.2. Samples: 1926849960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 02:43:41,588][15401] Updated weights for policy 0, policy_version 117610 (0.0031) [2024-06-22 02:43:43,390][15132] Fps is (10 sec: 37703.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1926955008. Throughput: 0: 42379.1. Samples: 1927103060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 02:43:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117612_1926955008.pth... [2024-06-22 02:43:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000116990_1916764160.pth [2024-06-22 02:43:45,826][15401] Updated weights for policy 0, policy_version 117620 (0.0032) [2024-06-22 02:43:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1927217152. Throughput: 0: 42511.1. Samples: 1927352900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 02:43:49,445][15401] Updated weights for policy 0, policy_version 117630 (0.0028) [2024-06-22 02:43:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1927397376. Throughput: 0: 42549.6. Samples: 1927489860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 02:43:53,834][15401] Updated weights for policy 0, policy_version 117640 (0.0031) [2024-06-22 02:43:57,243][15401] Updated weights for policy 0, policy_version 117650 (0.0027) [2024-06-22 02:43:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1927593984. Throughput: 0: 42249.8. Samples: 1927740180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:43:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 02:44:01,336][15401] Updated weights for policy 0, policy_version 117660 (0.0034) [2024-06-22 02:44:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1927839744. Throughput: 0: 42593.9. Samples: 1927994040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 02:44:04,889][15401] Updated weights for policy 0, policy_version 117670 (0.0039) [2024-06-22 02:44:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 1928036352. Throughput: 0: 42423.9. Samples: 1928123820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 02:44:08,855][15401] Updated weights for policy 0, policy_version 117680 (0.0038) [2024-06-22 02:44:12,362][15401] Updated weights for policy 0, policy_version 117690 (0.0025) [2024-06-22 02:44:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1928249344. Throughput: 0: 42239.1. Samples: 1928373080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 02:44:16,673][15401] Updated weights for policy 0, policy_version 117700 (0.0052) [2024-06-22 02:44:18,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 1928478720. Throughput: 0: 42605.2. Samples: 1928626520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:18,396][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 02:44:19,823][15349] Signal inference workers to stop experience collection... (28400 times) [2024-06-22 02:44:19,832][15349] Signal inference workers to resume experience collection... (28400 times) [2024-06-22 02:44:19,865][15401] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-06-22 02:44:19,865][15401] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-06-22 02:44:19,976][15401] Updated weights for policy 0, policy_version 117710 (0.0042) [2024-06-22 02:44:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1928658944. Throughput: 0: 42343.5. Samples: 1928755420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 02:44:24,547][15401] Updated weights for policy 0, policy_version 117720 (0.0047) [2024-06-22 02:44:27,737][15401] Updated weights for policy 0, policy_version 117730 (0.0032) [2024-06-22 02:44:28,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1928888320. Throughput: 0: 42241.7. Samples: 1929003940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 02:44:32,153][15401] Updated weights for policy 0, policy_version 117740 (0.0046) [2024-06-22 02:44:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42329.0, 300 sec: 42653.9). Total num frames: 1929117696. Throughput: 0: 42371.1. Samples: 1929259600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 02:44:33,392][15132] Avg episode reward: [(0, '0.164')] [2024-06-22 02:44:35,590][15401] Updated weights for policy 0, policy_version 117750 (0.0034) [2024-06-22 02:44:38,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42050.6, 300 sec: 42431.4). Total num frames: 1929281536. Throughput: 0: 42196.5. Samples: 1929388800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:44:38,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 02:44:39,958][15401] Updated weights for policy 0, policy_version 117760 (0.0040) [2024-06-22 02:44:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1929527296. Throughput: 0: 42261.9. Samples: 1929641960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:44:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 02:44:43,506][15401] Updated weights for policy 0, policy_version 117770 (0.0041) [2024-06-22 02:44:47,504][15401] Updated weights for policy 0, policy_version 117780 (0.0034) [2024-06-22 02:44:48,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 1929756672. Throughput: 0: 42241.4. Samples: 1929894900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:44:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 02:44:51,257][15401] Updated weights for policy 0, policy_version 117790 (0.0026) [2024-06-22 02:44:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1929920512. Throughput: 0: 42231.2. Samples: 1930024220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:44:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 02:44:55,001][15401] Updated weights for policy 0, policy_version 117800 (0.0038) [2024-06-22 02:44:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1930149888. Throughput: 0: 42257.7. Samples: 1930274680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:44:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 02:44:59,113][15401] Updated weights for policy 0, policy_version 117810 (0.0031) [2024-06-22 02:45:02,851][15401] Updated weights for policy 0, policy_version 117820 (0.0026) [2024-06-22 02:45:03,390][15132] Fps is (10 sec: 45873.3, 60 sec: 42325.0, 300 sec: 42543.4). Total num frames: 1930379264. Throughput: 0: 42186.9. Samples: 1930524680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 02:45:06,800][15401] Updated weights for policy 0, policy_version 117830 (0.0031) [2024-06-22 02:45:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1930575872. Throughput: 0: 42377.4. Samples: 1930662400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 02:45:10,327][15401] Updated weights for policy 0, policy_version 117840 (0.0037) [2024-06-22 02:45:13,389][15132] Fps is (10 sec: 40962.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1930788864. Throughput: 0: 42413.5. Samples: 1930912540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 02:45:14,789][15401] Updated weights for policy 0, policy_version 117850 (0.0032) [2024-06-22 02:45:18,283][15401] Updated weights for policy 0, policy_version 117860 (0.0037) [2024-06-22 02:45:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42329.8, 300 sec: 42598.4). Total num frames: 1931018240. Throughput: 0: 42484.5. Samples: 1931171400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 02:45:22,387][15401] Updated weights for policy 0, policy_version 117870 (0.0037) [2024-06-22 02:45:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1931214848. Throughput: 0: 42561.0. Samples: 1931303940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 02:45:25,777][15401] Updated weights for policy 0, policy_version 117880 (0.0026) [2024-06-22 02:45:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1931444224. Throughput: 0: 42470.6. Samples: 1931553140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:28,392][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 02:45:30,148][15401] Updated weights for policy 0, policy_version 117890 (0.0035) [2024-06-22 02:45:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1931673600. Throughput: 0: 42616.3. Samples: 1931812640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:33,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 02:45:33,390][15401] Updated weights for policy 0, policy_version 117900 (0.0029) [2024-06-22 02:45:37,913][15401] Updated weights for policy 0, policy_version 117910 (0.0044) [2024-06-22 02:45:38,391][15132] Fps is (10 sec: 40952.4, 60 sec: 42871.8, 300 sec: 42598.1). Total num frames: 1931853824. Throughput: 0: 42595.5. Samples: 1931941100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 02:45:38,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 02:45:41,071][15401] Updated weights for policy 0, policy_version 117920 (0.0035) [2024-06-22 02:45:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1932083200. Throughput: 0: 42641.7. Samples: 1932193560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:45:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 02:45:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117925_1932083200.pth... [2024-06-22 02:45:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117303_1921892352.pth [2024-06-22 02:45:45,480][15401] Updated weights for policy 0, policy_version 117930 (0.0031) [2024-06-22 02:45:48,389][15132] Fps is (10 sec: 45884.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1932312576. Throughput: 0: 42836.0. Samples: 1932452280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:45:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 02:45:48,606][15401] Updated weights for policy 0, policy_version 117940 (0.0029) [2024-06-22 02:45:53,007][15401] Updated weights for policy 0, policy_version 117950 (0.0032) [2024-06-22 02:45:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1932509184. Throughput: 0: 42660.8. Samples: 1932582140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:45:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 02:45:56,289][15349] Signal inference workers to stop experience collection... (28450 times) [2024-06-22 02:45:56,289][15349] Signal inference workers to resume experience collection... (28450 times) [2024-06-22 02:45:56,325][15401] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-06-22 02:45:56,325][15401] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-06-22 02:45:56,435][15401] Updated weights for policy 0, policy_version 117960 (0.0032) [2024-06-22 02:45:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1932705792. Throughput: 0: 42697.6. Samples: 1932833940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:45:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 02:46:00,636][15401] Updated weights for policy 0, policy_version 117970 (0.0043) [2024-06-22 02:46:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.7, 300 sec: 42542.9). Total num frames: 1932918784. Throughput: 0: 42699.2. Samples: 1933092860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 02:46:04,377][15401] Updated weights for policy 0, policy_version 117980 (0.0034) [2024-06-22 02:46:08,182][15401] Updated weights for policy 0, policy_version 117990 (0.0031) [2024-06-22 02:46:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1933148160. Throughput: 0: 42560.4. Samples: 1933219160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 02:46:11,847][15401] Updated weights for policy 0, policy_version 118000 (0.0029) [2024-06-22 02:46:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1933361152. Throughput: 0: 42764.4. Samples: 1933477540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:13,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 02:46:16,037][15401] Updated weights for policy 0, policy_version 118010 (0.0035) [2024-06-22 02:46:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1933557760. Throughput: 0: 42791.6. Samples: 1933738260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 02:46:19,387][15401] Updated weights for policy 0, policy_version 118020 (0.0032) [2024-06-22 02:46:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1933787136. Throughput: 0: 42759.5. Samples: 1933865200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 02:46:23,431][15401] Updated weights for policy 0, policy_version 118030 (0.0027) [2024-06-22 02:46:27,165][15401] Updated weights for policy 0, policy_version 118040 (0.0035) [2024-06-22 02:46:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1934016512. Throughput: 0: 42934.6. Samples: 1934125620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 02:46:31,036][15401] Updated weights for policy 0, policy_version 118050 (0.0031) [2024-06-22 02:46:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 1934213120. Throughput: 0: 42949.8. Samples: 1934385020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 02:46:34,642][15401] Updated weights for policy 0, policy_version 118060 (0.0042) [2024-06-22 02:46:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42872.8, 300 sec: 42543.2). Total num frames: 1934426112. Throughput: 0: 42861.0. Samples: 1934510880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 02:46:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 02:46:38,701][15401] Updated weights for policy 0, policy_version 118070 (0.0038) [2024-06-22 02:46:42,296][15401] Updated weights for policy 0, policy_version 118080 (0.0028) [2024-06-22 02:46:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1934655488. Throughput: 0: 42970.4. Samples: 1934767600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:46:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 02:46:46,204][15401] Updated weights for policy 0, policy_version 118090 (0.0029) [2024-06-22 02:46:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 1934868480. Throughput: 0: 43135.0. Samples: 1935033940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:46:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 02:46:49,772][15401] Updated weights for policy 0, policy_version 118100 (0.0034) [2024-06-22 02:46:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1935065088. Throughput: 0: 43053.0. Samples: 1935156540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:46:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 02:46:54,000][15401] Updated weights for policy 0, policy_version 118110 (0.0027) [2024-06-22 02:46:57,276][15401] Updated weights for policy 0, policy_version 118120 (0.0026) [2024-06-22 02:46:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1935310848. Throughput: 0: 43011.7. Samples: 1935413060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:46:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 02:47:01,506][15401] Updated weights for policy 0, policy_version 118130 (0.0027) [2024-06-22 02:47:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1935491072. Throughput: 0: 43141.7. Samples: 1935679640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 02:47:04,790][15401] Updated weights for policy 0, policy_version 118140 (0.0030) [2024-06-22 02:47:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1935704064. Throughput: 0: 43091.3. Samples: 1935804300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:08,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 02:47:09,082][15401] Updated weights for policy 0, policy_version 118150 (0.0031) [2024-06-22 02:47:12,015][15349] Signal inference workers to stop experience collection... (28500 times) [2024-06-22 02:47:12,016][15349] Signal inference workers to resume experience collection... (28500 times) [2024-06-22 02:47:12,054][15401] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-06-22 02:47:12,054][15401] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-06-22 02:47:12,809][15401] Updated weights for policy 0, policy_version 118160 (0.0026) [2024-06-22 02:47:13,392][15132] Fps is (10 sec: 47502.3, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 1935966208. Throughput: 0: 43044.9. Samples: 1936062740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:13,393][15132] Avg episode reward: [(0, '0.168')] [2024-06-22 02:47:16,832][15401] Updated weights for policy 0, policy_version 118170 (0.0036) [2024-06-22 02:47:18,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 1936146432. Throughput: 0: 43074.3. Samples: 1936323640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:18,396][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 02:47:20,384][15401] Updated weights for policy 0, policy_version 118180 (0.0045) [2024-06-22 02:47:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1936375808. Throughput: 0: 43094.6. Samples: 1936450140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 02:47:24,319][15401] Updated weights for policy 0, policy_version 118190 (0.0030) [2024-06-22 02:47:27,922][15401] Updated weights for policy 0, policy_version 118200 (0.0036) [2024-06-22 02:47:28,389][15132] Fps is (10 sec: 45904.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1936605184. Throughput: 0: 43092.4. Samples: 1936706760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 02:47:32,303][15401] Updated weights for policy 0, policy_version 118210 (0.0038) [2024-06-22 02:47:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1936785408. Throughput: 0: 43051.6. Samples: 1936971260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 02:47:35,514][15401] Updated weights for policy 0, policy_version 118220 (0.0040) [2024-06-22 02:47:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1937031168. Throughput: 0: 43010.2. Samples: 1937092000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 02:47:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 02:47:39,830][15401] Updated weights for policy 0, policy_version 118230 (0.0037) [2024-06-22 02:47:43,018][15401] Updated weights for policy 0, policy_version 118240 (0.0042) [2024-06-22 02:47:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1937244160. Throughput: 0: 43152.3. Samples: 1937354920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:47:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 02:47:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118240_1937244160.pth... [2024-06-22 02:47:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117612_1926955008.pth [2024-06-22 02:47:47,661][15401] Updated weights for policy 0, policy_version 118250 (0.0037) [2024-06-22 02:47:48,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1937408000. Throughput: 0: 43112.1. Samples: 1937619680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:47:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 02:47:50,634][15401] Updated weights for policy 0, policy_version 118260 (0.0035) [2024-06-22 02:47:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 1937686528. Throughput: 0: 43064.4. Samples: 1937742200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:47:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 02:47:55,819][15401] Updated weights for policy 0, policy_version 118270 (0.0034) [2024-06-22 02:47:58,333][15401] Updated weights for policy 0, policy_version 118280 (0.0023) [2024-06-22 02:47:58,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1937899520. Throughput: 0: 43140.5. Samples: 1938003960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:47:58,391][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 02:48:03,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1938046976. Throughput: 0: 43078.0. Samples: 1938261880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 02:48:03,458][15401] Updated weights for policy 0, policy_version 118290 (0.0031) [2024-06-22 02:48:06,065][15401] Updated weights for policy 0, policy_version 118300 (0.0036) [2024-06-22 02:48:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 1938325504. Throughput: 0: 42798.8. Samples: 1938376080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 02:48:11,130][15401] Updated weights for policy 0, policy_version 118310 (0.0025) [2024-06-22 02:48:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 1938505728. Throughput: 0: 42939.6. Samples: 1938639040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 02:48:14,204][15401] Updated weights for policy 0, policy_version 118320 (0.0043) [2024-06-22 02:48:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 1938702336. Throughput: 0: 42683.1. Samples: 1938892000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 02:48:18,581][15401] Updated weights for policy 0, policy_version 118330 (0.0031) [2024-06-22 02:48:21,919][15401] Updated weights for policy 0, policy_version 118340 (0.0031) [2024-06-22 02:48:23,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1938980864. Throughput: 0: 42878.9. Samples: 1939021560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 02:48:26,206][15401] Updated weights for policy 0, policy_version 118350 (0.0028) [2024-06-22 02:48:27,640][15349] Signal inference workers to stop experience collection... (28550 times) [2024-06-22 02:48:27,640][15349] Signal inference workers to resume experience collection... (28550 times) [2024-06-22 02:48:27,683][15401] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-06-22 02:48:27,683][15401] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-06-22 02:48:28,395][15132] Fps is (10 sec: 44211.2, 60 sec: 42321.2, 300 sec: 42598.3). Total num frames: 1939144704. Throughput: 0: 42774.5. Samples: 1939280020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:28,396][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 02:48:29,506][15401] Updated weights for policy 0, policy_version 118360 (0.0032) [2024-06-22 02:48:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1939357696. Throughput: 0: 42557.3. Samples: 1939534760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 02:48:33,753][15401] Updated weights for policy 0, policy_version 118370 (0.0035) [2024-06-22 02:48:37,051][15401] Updated weights for policy 0, policy_version 118380 (0.0038) [2024-06-22 02:48:38,396][15132] Fps is (10 sec: 47510.9, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 1939619840. Throughput: 0: 42721.5. Samples: 1939664940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:38,396][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 02:48:41,382][15401] Updated weights for policy 0, policy_version 118390 (0.0038) [2024-06-22 02:48:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1939783680. Throughput: 0: 42670.3. Samples: 1939924120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 02:48:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 02:48:44,888][15401] Updated weights for policy 0, policy_version 118400 (0.0037) [2024-06-22 02:48:48,390][15132] Fps is (10 sec: 37706.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1939996672. Throughput: 0: 42434.2. Samples: 1940171420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:48:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 02:48:48,972][15401] Updated weights for policy 0, policy_version 118410 (0.0029) [2024-06-22 02:48:52,522][15401] Updated weights for policy 0, policy_version 118420 (0.0033) [2024-06-22 02:48:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1940242432. Throughput: 0: 42864.3. Samples: 1940304980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:48:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 02:48:56,469][15401] Updated weights for policy 0, policy_version 118430 (0.0030) [2024-06-22 02:48:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1940406272. Throughput: 0: 42747.1. Samples: 1940562660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:48:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 02:49:00,175][15401] Updated weights for policy 0, policy_version 118440 (0.0035) [2024-06-22 02:49:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 1940652032. Throughput: 0: 42656.1. Samples: 1940811520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 02:49:03,877][15401] Updated weights for policy 0, policy_version 118450 (0.0030) [2024-06-22 02:49:07,773][15401] Updated weights for policy 0, policy_version 118460 (0.0036) [2024-06-22 02:49:08,390][15132] Fps is (10 sec: 49151.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1940897792. Throughput: 0: 42873.9. Samples: 1940950880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 02:49:11,274][15401] Updated weights for policy 0, policy_version 118470 (0.0039) [2024-06-22 02:49:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 1941061632. Throughput: 0: 42807.7. Samples: 1941206120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 02:49:15,349][15401] Updated weights for policy 0, policy_version 118480 (0.0044) [2024-06-22 02:49:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1941291008. Throughput: 0: 42810.6. Samples: 1941461240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 02:49:19,006][15401] Updated weights for policy 0, policy_version 118490 (0.0038) [2024-06-22 02:49:22,906][15401] Updated weights for policy 0, policy_version 118500 (0.0045) [2024-06-22 02:49:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1941520384. Throughput: 0: 42840.3. Samples: 1941592480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 02:49:26,084][15349] Signal inference workers to stop experience collection... (28600 times) [2024-06-22 02:49:26,084][15349] Signal inference workers to resume experience collection... (28600 times) [2024-06-22 02:49:26,139][15401] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-06-22 02:49:26,139][15401] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-06-22 02:49:26,541][15401] Updated weights for policy 0, policy_version 118510 (0.0028) [2024-06-22 02:49:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42329.5, 300 sec: 42598.4). Total num frames: 1941684224. Throughput: 0: 42653.4. Samples: 1941843520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:28,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 02:49:30,524][15401] Updated weights for policy 0, policy_version 118520 (0.0028) [2024-06-22 02:49:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 1941946368. Throughput: 0: 42932.1. Samples: 1942103360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 02:49:34,092][15401] Updated weights for policy 0, policy_version 118530 (0.0039) [2024-06-22 02:49:38,150][15401] Updated weights for policy 0, policy_version 118540 (0.0033) [2024-06-22 02:49:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42329.8, 300 sec: 42820.5). Total num frames: 1942159360. Throughput: 0: 42927.6. Samples: 1942236720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 02:49:41,959][15401] Updated weights for policy 0, policy_version 118550 (0.0032) [2024-06-22 02:49:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1942339584. Throughput: 0: 42739.1. Samples: 1942485920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 02:49:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 02:49:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118551_1942339584.pth... [2024-06-22 02:49:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000117925_1932083200.pth [2024-06-22 02:49:45,783][15401] Updated weights for policy 0, policy_version 118560 (0.0027) [2024-06-22 02:49:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1942601728. Throughput: 0: 43002.1. Samples: 1942746620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:49:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 02:49:49,660][15401] Updated weights for policy 0, policy_version 118570 (0.0032) [2024-06-22 02:49:53,365][15401] Updated weights for policy 0, policy_version 118580 (0.0024) [2024-06-22 02:49:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1942814720. Throughput: 0: 43019.2. Samples: 1942886740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:49:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 02:49:57,062][15401] Updated weights for policy 0, policy_version 118590 (0.0037) [2024-06-22 02:49:58,393][15132] Fps is (10 sec: 39306.5, 60 sec: 43141.7, 300 sec: 42764.5). Total num frames: 1942994944. Throughput: 0: 42826.2. Samples: 1943133460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:49:58,394][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 02:50:01,055][15401] Updated weights for policy 0, policy_version 118600 (0.0031) [2024-06-22 02:50:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1943224320. Throughput: 0: 42857.8. Samples: 1943389840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 02:50:05,227][15401] Updated weights for policy 0, policy_version 118610 (0.0035) [2024-06-22 02:50:08,390][15132] Fps is (10 sec: 44253.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1943437312. Throughput: 0: 42968.0. Samples: 1943526040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 02:50:08,698][15401] Updated weights for policy 0, policy_version 118620 (0.0043) [2024-06-22 02:50:12,871][15401] Updated weights for policy 0, policy_version 118630 (0.0038) [2024-06-22 02:50:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1943650304. Throughput: 0: 42978.1. Samples: 1943777540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 02:50:16,509][15401] Updated weights for policy 0, policy_version 118640 (0.0042) [2024-06-22 02:50:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1943879680. Throughput: 0: 42717.4. Samples: 1944025640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 02:50:20,688][15401] Updated weights for policy 0, policy_version 118650 (0.0038) [2024-06-22 02:50:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1944059904. Throughput: 0: 42824.5. Samples: 1944163820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 02:50:24,226][15401] Updated weights for policy 0, policy_version 118660 (0.0031) [2024-06-22 02:50:28,240][15401] Updated weights for policy 0, policy_version 118670 (0.0044) [2024-06-22 02:50:28,391][15132] Fps is (10 sec: 40954.6, 60 sec: 43416.7, 300 sec: 42764.8). Total num frames: 1944289280. Throughput: 0: 42816.1. Samples: 1944412700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:28,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 02:50:31,775][15401] Updated weights for policy 0, policy_version 118680 (0.0031) [2024-06-22 02:50:33,392][15132] Fps is (10 sec: 47502.3, 60 sec: 43142.8, 300 sec: 42987.1). Total num frames: 1944535040. Throughput: 0: 42540.0. Samples: 1944661020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:33,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 02:50:36,409][15401] Updated weights for policy 0, policy_version 118690 (0.0036) [2024-06-22 02:50:37,318][15349] Signal inference workers to stop experience collection... (28650 times) [2024-06-22 02:50:37,318][15349] Signal inference workers to resume experience collection... (28650 times) [2024-06-22 02:50:37,345][15401] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-06-22 02:50:37,352][15401] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-06-22 02:50:38,389][15132] Fps is (10 sec: 42603.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1944715264. Throughput: 0: 42497.4. Samples: 1944799120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 02:50:39,473][15401] Updated weights for policy 0, policy_version 118700 (0.0029) [2024-06-22 02:50:43,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1944911872. Throughput: 0: 42521.9. Samples: 1945046780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 02:50:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 02:50:44,111][15401] Updated weights for policy 0, policy_version 118710 (0.0036) [2024-06-22 02:50:47,163][15401] Updated weights for policy 0, policy_version 118720 (0.0038) [2024-06-22 02:50:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1945157632. Throughput: 0: 42379.6. Samples: 1945296920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:50:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 02:50:51,758][15401] Updated weights for policy 0, policy_version 118730 (0.0027) [2024-06-22 02:50:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1945337856. Throughput: 0: 42419.6. Samples: 1945434920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:50:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 02:50:54,714][15401] Updated weights for policy 0, policy_version 118740 (0.0036) [2024-06-22 02:50:58,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42599.5, 300 sec: 42820.2). Total num frames: 1945550848. Throughput: 0: 42392.5. Samples: 1945685300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:50:58,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 02:50:59,636][15401] Updated weights for policy 0, policy_version 118750 (0.0048) [2024-06-22 02:51:02,153][15401] Updated weights for policy 0, policy_version 118760 (0.0036) [2024-06-22 02:51:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1945796608. Throughput: 0: 42653.7. Samples: 1945945060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:03,391][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 02:51:07,292][15401] Updated weights for policy 0, policy_version 118770 (0.0027) [2024-06-22 02:51:08,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1945993216. Throughput: 0: 42599.7. Samples: 1946080800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 02:51:09,966][15401] Updated weights for policy 0, policy_version 118780 (0.0034) [2024-06-22 02:51:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1946189824. Throughput: 0: 42608.8. Samples: 1946330040. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 02:51:14,805][15401] Updated weights for policy 0, policy_version 118790 (0.0031) [2024-06-22 02:51:17,507][15401] Updated weights for policy 0, policy_version 118800 (0.0041) [2024-06-22 02:51:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1946451968. Throughput: 0: 42782.7. Samples: 1946586140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 02:51:22,799][15401] Updated weights for policy 0, policy_version 118810 (0.0025) [2024-06-22 02:51:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1946615808. Throughput: 0: 42748.9. Samples: 1946722820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:23,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 02:51:25,274][15401] Updated weights for policy 0, policy_version 118820 (0.0036) [2024-06-22 02:51:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42599.2, 300 sec: 42820.5). Total num frames: 1946845184. Throughput: 0: 42632.4. Samples: 1946965240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 02:51:30,264][15401] Updated weights for policy 0, policy_version 118830 (0.0035) [2024-06-22 02:51:32,836][15401] Updated weights for policy 0, policy_version 118840 (0.0039) [2024-06-22 02:51:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 1947090944. Throughput: 0: 42737.3. Samples: 1947220100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 02:51:38,150][15401] Updated weights for policy 0, policy_version 118850 (0.0021) [2024-06-22 02:51:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1947238400. Throughput: 0: 42779.9. Samples: 1947360020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 02:51:39,075][15349] Signal inference workers to stop experience collection... (28700 times) [2024-06-22 02:51:39,081][15349] Signal inference workers to resume experience collection... (28700 times) [2024-06-22 02:51:39,100][15401] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-06-22 02:51:39,100][15401] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-06-22 02:51:40,363][15401] Updated weights for policy 0, policy_version 118860 (0.0028) [2024-06-22 02:51:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1947500544. Throughput: 0: 42738.7. Samples: 1947608440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 02:51:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118866_1947500544.pth... [2024-06-22 02:51:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118240_1937244160.pth [2024-06-22 02:51:45,801][15401] Updated weights for policy 0, policy_version 118870 (0.0034) [2024-06-22 02:51:48,058][15401] Updated weights for policy 0, policy_version 118880 (0.0038) [2024-06-22 02:51:48,389][15132] Fps is (10 sec: 49152.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1947729920. Throughput: 0: 42616.1. Samples: 1947862780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-22 02:51:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 02:51:53,312][15401] Updated weights for policy 0, policy_version 118890 (0.0039) [2024-06-22 02:51:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1947893760. Throughput: 0: 42613.6. Samples: 1947998420. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:51:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 02:51:55,720][15401] Updated weights for policy 0, policy_version 118900 (0.0035) [2024-06-22 02:51:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 1948155904. Throughput: 0: 42809.3. Samples: 1948256460. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:51:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 02:52:00,829][15401] Updated weights for policy 0, policy_version 118910 (0.0041) [2024-06-22 02:52:03,235][15401] Updated weights for policy 0, policy_version 118920 (0.0047) [2024-06-22 02:52:03,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1948385280. Throughput: 0: 42837.3. Samples: 1948513820. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 02:52:08,396][15132] Fps is (10 sec: 37659.2, 60 sec: 42320.8, 300 sec: 42597.8). Total num frames: 1948532736. Throughput: 0: 42802.3. Samples: 1948649200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:08,396][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 02:52:08,514][15401] Updated weights for policy 0, policy_version 118930 (0.0031) [2024-06-22 02:52:10,966][15401] Updated weights for policy 0, policy_version 118940 (0.0028) [2024-06-22 02:52:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43415.8, 300 sec: 42876.7). Total num frames: 1948794880. Throughput: 0: 43079.1. Samples: 1948903900. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:13,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 02:52:15,895][15401] Updated weights for policy 0, policy_version 118950 (0.0035) [2024-06-22 02:52:18,390][15132] Fps is (10 sec: 49183.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1949024256. Throughput: 0: 43198.2. Samples: 1949164020. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 02:52:18,605][15401] Updated weights for policy 0, policy_version 118960 (0.0022) [2024-06-22 02:52:23,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1949188096. Throughput: 0: 43076.5. Samples: 1949298460. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 02:52:23,555][15401] Updated weights for policy 0, policy_version 118970 (0.0038) [2024-06-22 02:52:26,226][15401] Updated weights for policy 0, policy_version 118980 (0.0029) [2024-06-22 02:52:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1949450240. Throughput: 0: 43140.0. Samples: 1949549740. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 02:52:31,098][15401] Updated weights for policy 0, policy_version 118990 (0.0035) [2024-06-22 02:52:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1949646848. Throughput: 0: 43348.7. Samples: 1949813480. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 02:52:34,046][15401] Updated weights for policy 0, policy_version 119000 (0.0034) [2024-06-22 02:52:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1949843456. Throughput: 0: 43025.4. Samples: 1949934560. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:38,393][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 02:52:38,509][15401] Updated weights for policy 0, policy_version 119010 (0.0033) [2024-06-22 02:52:39,090][15349] Signal inference workers to stop experience collection... (28750 times) [2024-06-22 02:52:39,147][15401] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-06-22 02:52:39,202][15349] Signal inference workers to resume experience collection... (28750 times) [2024-06-22 02:52:39,202][15401] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-06-22 02:52:41,858][15401] Updated weights for policy 0, policy_version 119020 (0.0023) [2024-06-22 02:52:43,396][15132] Fps is (10 sec: 45846.1, 60 sec: 43412.9, 300 sec: 43041.8). Total num frames: 1950105600. Throughput: 0: 43051.6. Samples: 1950194060. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:43,397][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 02:52:46,004][15401] Updated weights for policy 0, policy_version 119030 (0.0039) [2024-06-22 02:52:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1950285824. Throughput: 0: 43228.4. Samples: 1950459100. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-22 02:52:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 02:52:49,365][15401] Updated weights for policy 0, policy_version 119040 (0.0040) [2024-06-22 02:52:53,389][15132] Fps is (10 sec: 39347.3, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1950498816. Throughput: 0: 42984.4. Samples: 1950583220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:52:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 02:52:53,406][15401] Updated weights for policy 0, policy_version 119050 (0.0038) [2024-06-22 02:52:56,826][15401] Updated weights for policy 0, policy_version 119060 (0.0037) [2024-06-22 02:52:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1950728192. Throughput: 0: 43076.6. Samples: 1950842240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:52:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 02:53:00,934][15401] Updated weights for policy 0, policy_version 119070 (0.0036) [2024-06-22 02:53:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1950924800. Throughput: 0: 43072.5. Samples: 1951102280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 02:53:04,399][15401] Updated weights for policy 0, policy_version 119080 (0.0045) [2024-06-22 02:53:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43422.2, 300 sec: 42820.5). Total num frames: 1951137792. Throughput: 0: 42807.1. Samples: 1951224780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 02:53:08,784][15401] Updated weights for policy 0, policy_version 119090 (0.0034) [2024-06-22 02:53:12,034][15401] Updated weights for policy 0, policy_version 119100 (0.0035) [2024-06-22 02:53:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 1951383552. Throughput: 0: 42909.8. Samples: 1951480680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 02:53:16,226][15401] Updated weights for policy 0, policy_version 119110 (0.0034) [2024-06-22 02:53:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1951563776. Throughput: 0: 43030.4. Samples: 1951749840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:18,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 02:53:19,629][15401] Updated weights for policy 0, policy_version 119120 (0.0036) [2024-06-22 02:53:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42876.9). Total num frames: 1951793152. Throughput: 0: 42998.6. Samples: 1951869500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:23,392][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 02:53:24,059][15401] Updated weights for policy 0, policy_version 119130 (0.0034) [2024-06-22 02:53:27,325][15401] Updated weights for policy 0, policy_version 119140 (0.0030) [2024-06-22 02:53:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1952022528. Throughput: 0: 42809.2. Samples: 1952120200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 02:53:31,661][15401] Updated weights for policy 0, policy_version 119150 (0.0036) [2024-06-22 02:53:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42599.3). Total num frames: 1952186368. Throughput: 0: 42800.6. Samples: 1952385120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 02:53:35,237][15401] Updated weights for policy 0, policy_version 119160 (0.0048) [2024-06-22 02:53:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1952415744. Throughput: 0: 42599.9. Samples: 1952500220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 02:53:39,397][15401] Updated weights for policy 0, policy_version 119170 (0.0039) [2024-06-22 02:53:43,097][15401] Updated weights for policy 0, policy_version 119180 (0.0043) [2024-06-22 02:53:43,393][15132] Fps is (10 sec: 45859.3, 60 sec: 42327.5, 300 sec: 42875.6). Total num frames: 1952645120. Throughput: 0: 42487.4. Samples: 1952754320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:43,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 02:53:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119180_1952645120.pth... [2024-06-22 02:53:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118551_1942339584.pth [2024-06-22 02:53:46,359][15349] Signal inference workers to stop experience collection... (28800 times) [2024-06-22 02:53:46,384][15401] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-06-22 02:53:46,425][15349] Signal inference workers to resume experience collection... (28800 times) [2024-06-22 02:53:46,425][15401] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-06-22 02:53:47,231][15401] Updated weights for policy 0, policy_version 119190 (0.0029) [2024-06-22 02:53:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1952808960. Throughput: 0: 42411.6. Samples: 1953010800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 02:53:48,392][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 02:53:50,820][15401] Updated weights for policy 0, policy_version 119200 (0.0018) [2024-06-22 02:53:53,389][15132] Fps is (10 sec: 42613.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1953071104. Throughput: 0: 42277.4. Samples: 1953127260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:53:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 02:53:55,242][15401] Updated weights for policy 0, policy_version 119210 (0.0034) [2024-06-22 02:53:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1953267712. Throughput: 0: 42577.8. Samples: 1953396680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:53:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 02:53:58,839][15401] Updated weights for policy 0, policy_version 119220 (0.0032) [2024-06-22 02:54:02,777][15401] Updated weights for policy 0, policy_version 119230 (0.0040) [2024-06-22 02:54:03,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1953464320. Throughput: 0: 42070.0. Samples: 1953643000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 02:54:06,621][15401] Updated weights for policy 0, policy_version 119240 (0.0027) [2024-06-22 02:54:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 1953710080. Throughput: 0: 42212.5. Samples: 1953769160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:08,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 02:54:10,288][15401] Updated weights for policy 0, policy_version 119250 (0.0030) [2024-06-22 02:54:13,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1953906688. Throughput: 0: 42621.9. Samples: 1954038180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 02:54:14,234][15401] Updated weights for policy 0, policy_version 119260 (0.0032) [2024-06-22 02:54:17,944][15401] Updated weights for policy 0, policy_version 119270 (0.0024) [2024-06-22 02:54:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1954119680. Throughput: 0: 42165.6. Samples: 1954282580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 02:54:21,889][15401] Updated weights for policy 0, policy_version 119280 (0.0030) [2024-06-22 02:54:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1954332672. Throughput: 0: 42374.2. Samples: 1954407060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 02:54:25,599][15401] Updated weights for policy 0, policy_version 119290 (0.0040) [2024-06-22 02:54:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 1954529280. Throughput: 0: 42629.5. Samples: 1954672500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 02:54:29,563][15401] Updated weights for policy 0, policy_version 119300 (0.0039) [2024-06-22 02:54:33,236][15401] Updated weights for policy 0, policy_version 119310 (0.0042) [2024-06-22 02:54:33,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 1954775040. Throughput: 0: 42361.7. Samples: 1954917180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:33,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 02:54:37,447][15401] Updated weights for policy 0, policy_version 119320 (0.0045) [2024-06-22 02:54:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1954988032. Throughput: 0: 42719.5. Samples: 1955049640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:38,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 02:54:41,009][15401] Updated weights for policy 0, policy_version 119330 (0.0037) [2024-06-22 02:54:43,390][15132] Fps is (10 sec: 37691.9, 60 sec: 41781.5, 300 sec: 42542.9). Total num frames: 1955151872. Throughput: 0: 42446.1. Samples: 1955306760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 02:54:45,072][15401] Updated weights for policy 0, policy_version 119340 (0.0038) [2024-06-22 02:54:46,640][15349] Signal inference workers to stop experience collection... (28850 times) [2024-06-22 02:54:46,640][15349] Signal inference workers to resume experience collection... (28850 times) [2024-06-22 02:54:46,650][15401] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-06-22 02:54:46,650][15401] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-06-22 02:54:48,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 1955414016. Throughput: 0: 42566.4. Samples: 1955558580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:48,392][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 02:54:48,746][15401] Updated weights for policy 0, policy_version 119350 (0.0027) [2024-06-22 02:54:52,668][15401] Updated weights for policy 0, policy_version 119360 (0.0040) [2024-06-22 02:54:53,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.3, 300 sec: 42821.1). Total num frames: 1955627008. Throughput: 0: 42711.5. Samples: 1955691080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 02:54:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 02:54:56,607][15401] Updated weights for policy 0, policy_version 119370 (0.0033) [2024-06-22 02:54:58,390][15132] Fps is (10 sec: 36053.3, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 1955774464. Throughput: 0: 42223.5. Samples: 1955938240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:54:58,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 02:55:00,256][15401] Updated weights for policy 0, policy_version 119380 (0.0033) [2024-06-22 02:55:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1956020224. Throughput: 0: 42393.6. Samples: 1956190300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 02:55:04,454][15401] Updated weights for policy 0, policy_version 119390 (0.0040) [2024-06-22 02:55:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 1956233216. Throughput: 0: 42511.7. Samples: 1956320080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 02:55:08,476][15401] Updated weights for policy 0, policy_version 119400 (0.0035) [2024-06-22 02:55:12,184][15401] Updated weights for policy 0, policy_version 119410 (0.0035) [2024-06-22 02:55:13,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 1956429824. Throughput: 0: 42056.3. Samples: 1956565040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 02:55:16,219][15401] Updated weights for policy 0, policy_version 119420 (0.0046) [2024-06-22 02:55:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1956659200. Throughput: 0: 42215.2. Samples: 1956816760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 02:55:19,994][15401] Updated weights for policy 0, policy_version 119430 (0.0033) [2024-06-22 02:55:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42654.1). Total num frames: 1956872192. Throughput: 0: 42238.3. Samples: 1956950360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 02:55:24,155][15401] Updated weights for policy 0, policy_version 119440 (0.0046) [2024-06-22 02:55:27,724][15401] Updated weights for policy 0, policy_version 119450 (0.0032) [2024-06-22 02:55:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 1957068800. Throughput: 0: 42106.3. Samples: 1957201540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 02:55:31,831][15401] Updated weights for policy 0, policy_version 119460 (0.0032) [2024-06-22 02:55:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 1957298176. Throughput: 0: 42151.9. Samples: 1957455320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 02:55:35,262][15401] Updated weights for policy 0, policy_version 119470 (0.0043) [2024-06-22 02:55:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 1957478400. Throughput: 0: 42210.4. Samples: 1957590540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 02:55:39,380][15401] Updated weights for policy 0, policy_version 119480 (0.0036) [2024-06-22 02:55:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1957707776. Throughput: 0: 42325.0. Samples: 1957842860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 02:55:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119489_1957707776.pth... [2024-06-22 02:55:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000118866_1947500544.pth [2024-06-22 02:55:43,647][15401] Updated weights for policy 0, policy_version 119490 (0.0030) [2024-06-22 02:55:47,278][15401] Updated weights for policy 0, policy_version 119500 (0.0034) [2024-06-22 02:55:48,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 1957953536. Throughput: 0: 42194.4. Samples: 1958089040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 02:55:51,348][15401] Updated weights for policy 0, policy_version 119510 (0.0027) [2024-06-22 02:55:53,393][15132] Fps is (10 sec: 42583.7, 60 sec: 41776.9, 300 sec: 42653.8). Total num frames: 1958133760. Throughput: 0: 42288.4. Samples: 1958223200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 02:55:53,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 02:55:54,857][15401] Updated weights for policy 0, policy_version 119520 (0.0034) [2024-06-22 02:55:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1958363136. Throughput: 0: 42484.9. Samples: 1958476860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:55:58,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-22 02:55:58,838][15401] Updated weights for policy 0, policy_version 119530 (0.0036) [2024-06-22 02:56:02,758][15401] Updated weights for policy 0, policy_version 119540 (0.0032) [2024-06-22 02:56:03,390][15132] Fps is (10 sec: 44251.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1958576128. Throughput: 0: 42559.9. Samples: 1958731960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 02:56:06,392][15401] Updated weights for policy 0, policy_version 119550 (0.0025) [2024-06-22 02:56:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1958772736. Throughput: 0: 42434.2. Samples: 1958859900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 02:56:10,390][15401] Updated weights for policy 0, policy_version 119560 (0.0030) [2024-06-22 02:56:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1959002112. Throughput: 0: 42516.4. Samples: 1959114780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 02:56:13,895][15401] Updated weights for policy 0, policy_version 119570 (0.0039) [2024-06-22 02:56:17,929][15401] Updated weights for policy 0, policy_version 119580 (0.0043) [2024-06-22 02:56:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1959198720. Throughput: 0: 42655.2. Samples: 1959374800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:18,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 02:56:21,363][15401] Updated weights for policy 0, policy_version 119590 (0.0036) [2024-06-22 02:56:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1959411712. Throughput: 0: 42361.2. Samples: 1959496800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 02:56:26,032][15401] Updated weights for policy 0, policy_version 119600 (0.0042) [2024-06-22 02:56:28,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 1959657472. Throughput: 0: 42528.8. Samples: 1959756760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:28,392][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 02:56:29,213][15401] Updated weights for policy 0, policy_version 119610 (0.0026) [2024-06-22 02:56:31,200][15349] Signal inference workers to stop experience collection... (28900 times) [2024-06-22 02:56:31,255][15401] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-06-22 02:56:31,315][15349] Signal inference workers to resume experience collection... (28900 times) [2024-06-22 02:56:31,315][15401] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-06-22 02:56:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1959837696. Throughput: 0: 42801.3. Samples: 1960015100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 02:56:33,582][15401] Updated weights for policy 0, policy_version 119620 (0.0026) [2024-06-22 02:56:36,976][15401] Updated weights for policy 0, policy_version 119630 (0.0037) [2024-06-22 02:56:38,392][15132] Fps is (10 sec: 40960.0, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 1960067072. Throughput: 0: 42497.4. Samples: 1960135540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:38,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 02:56:41,083][15401] Updated weights for policy 0, policy_version 119640 (0.0033) [2024-06-22 02:56:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1960280064. Throughput: 0: 42673.8. Samples: 1960397180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 02:56:44,475][15401] Updated weights for policy 0, policy_version 119650 (0.0036) [2024-06-22 02:56:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1960476672. Throughput: 0: 42604.6. Samples: 1960649160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 02:56:48,686][15401] Updated weights for policy 0, policy_version 119660 (0.0047) [2024-06-22 02:56:52,208][15401] Updated weights for policy 0, policy_version 119670 (0.0037) [2024-06-22 02:56:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.8, 300 sec: 42487.3). Total num frames: 1960689664. Throughput: 0: 42525.8. Samples: 1960773560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-22 02:56:53,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 02:56:56,506][15401] Updated weights for policy 0, policy_version 119680 (0.0034) [2024-06-22 02:56:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1960902656. Throughput: 0: 42614.8. Samples: 1961032440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:56:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 02:56:59,767][15401] Updated weights for policy 0, policy_version 119690 (0.0031) [2024-06-22 02:57:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42654.8). Total num frames: 1961115648. Throughput: 0: 42615.5. Samples: 1961292500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:03,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 02:57:04,350][15401] Updated weights for policy 0, policy_version 119700 (0.0026) [2024-06-22 02:57:07,395][15401] Updated weights for policy 0, policy_version 119710 (0.0031) [2024-06-22 02:57:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 1961345024. Throughput: 0: 42628.9. Samples: 1961415100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 02:57:11,941][15401] Updated weights for policy 0, policy_version 119720 (0.0034) [2024-06-22 02:57:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1961541632. Throughput: 0: 42659.2. Samples: 1961676320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:13,390][15132] Avg episode reward: [(0, '0.108')] [2024-06-22 02:57:15,312][15401] Updated weights for policy 0, policy_version 119730 (0.0037) [2024-06-22 02:57:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1961754624. Throughput: 0: 42581.0. Samples: 1961931240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 02:57:19,850][15401] Updated weights for policy 0, policy_version 119740 (0.0034) [2024-06-22 02:57:22,890][15401] Updated weights for policy 0, policy_version 119750 (0.0031) [2024-06-22 02:57:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1962000384. Throughput: 0: 42770.3. Samples: 1962060100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 02:57:27,235][15401] Updated weights for policy 0, policy_version 119760 (0.0040) [2024-06-22 02:57:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 1962196992. Throughput: 0: 42731.1. Samples: 1962320080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 02:57:30,761][15401] Updated weights for policy 0, policy_version 119770 (0.0036) [2024-06-22 02:57:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1962409984. Throughput: 0: 42765.3. Samples: 1962573600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:33,390][15132] Avg episode reward: [(0, '0.048')] [2024-06-22 02:57:35,005][15401] Updated weights for policy 0, policy_version 119780 (0.0047) [2024-06-22 02:57:38,329][15401] Updated weights for policy 0, policy_version 119790 (0.0046) [2024-06-22 02:57:38,393][15132] Fps is (10 sec: 44222.1, 60 sec: 42870.8, 300 sec: 42487.8). Total num frames: 1962639360. Throughput: 0: 42949.7. Samples: 1962706440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:38,393][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 02:57:42,529][15401] Updated weights for policy 0, policy_version 119800 (0.0030) [2024-06-22 02:57:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1962819584. Throughput: 0: 43179.1. Samples: 1962975500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 02:57:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119803_1962852352.pth... [2024-06-22 02:57:43,620][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119180_1952645120.pth [2024-06-22 02:57:45,948][15401] Updated weights for policy 0, policy_version 119810 (0.0031) [2024-06-22 02:57:48,390][15132] Fps is (10 sec: 40973.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1963048960. Throughput: 0: 42836.1. Samples: 1963220120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 02:57:49,791][15349] Signal inference workers to stop experience collection... (28950 times) [2024-06-22 02:57:49,791][15349] Signal inference workers to resume experience collection... (28950 times) [2024-06-22 02:57:49,836][15401] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-06-22 02:57:49,836][15401] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-06-22 02:57:50,154][15401] Updated weights for policy 0, policy_version 119820 (0.0045) [2024-06-22 02:57:53,374][15401] Updated weights for policy 0, policy_version 119830 (0.0029) [2024-06-22 02:57:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1963294720. Throughput: 0: 43030.3. Samples: 1963351460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 02:57:57,966][15401] Updated weights for policy 0, policy_version 119840 (0.0032) [2024-06-22 02:57:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1963474944. Throughput: 0: 43181.3. Samples: 1963619480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 02:57:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 02:58:00,840][15401] Updated weights for policy 0, policy_version 119850 (0.0037) [2024-06-22 02:58:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1963704320. Throughput: 0: 42997.4. Samples: 1963866120. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 02:58:05,693][15401] Updated weights for policy 0, policy_version 119860 (0.0025) [2024-06-22 02:58:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1963933696. Throughput: 0: 43119.5. Samples: 1964000480. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 02:58:08,819][15401] Updated weights for policy 0, policy_version 119870 (0.0024) [2024-06-22 02:58:13,184][15401] Updated weights for policy 0, policy_version 119880 (0.0033) [2024-06-22 02:58:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1964113920. Throughput: 0: 43093.0. Samples: 1964259260. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 02:58:16,369][15401] Updated weights for policy 0, policy_version 119890 (0.0049) [2024-06-22 02:58:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1964343296. Throughput: 0: 43012.8. Samples: 1964509180. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 02:58:20,714][15401] Updated weights for policy 0, policy_version 119900 (0.0033) [2024-06-22 02:58:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1964572672. Throughput: 0: 43093.9. Samples: 1964645520. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 02:58:23,818][15401] Updated weights for policy 0, policy_version 119910 (0.0036) [2024-06-22 02:58:28,293][15401] Updated weights for policy 0, policy_version 119920 (0.0041) [2024-06-22 02:58:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1964769280. Throughput: 0: 42848.0. Samples: 1964903660. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 02:58:31,322][15401] Updated weights for policy 0, policy_version 119930 (0.0031) [2024-06-22 02:58:33,393][15132] Fps is (10 sec: 40945.3, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 1964982272. Throughput: 0: 43057.0. Samples: 1965157840. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:33,394][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 02:58:35,721][15401] Updated weights for policy 0, policy_version 119940 (0.0028) [2024-06-22 02:58:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.8, 300 sec: 42654.4). Total num frames: 1965228032. Throughput: 0: 43159.4. Samples: 1965293640. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 02:58:39,151][15401] Updated weights for policy 0, policy_version 119950 (0.0040) [2024-06-22 02:58:43,389][15132] Fps is (10 sec: 42613.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1965408256. Throughput: 0: 42904.0. Samples: 1965550160. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 02:58:43,676][15401] Updated weights for policy 0, policy_version 119960 (0.0034) [2024-06-22 02:58:46,706][15401] Updated weights for policy 0, policy_version 119970 (0.0033) [2024-06-22 02:58:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1965637632. Throughput: 0: 43004.3. Samples: 1965801320. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 02:58:51,213][15401] Updated weights for policy 0, policy_version 119980 (0.0029) [2024-06-22 02:58:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1965850624. Throughput: 0: 43018.6. Samples: 1965936320. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 02:58:54,470][15401] Updated weights for policy 0, policy_version 119990 (0.0042) [2024-06-22 02:58:55,205][15349] Signal inference workers to stop experience collection... (29000 times) [2024-06-22 02:58:55,205][15349] Signal inference workers to resume experience collection... (29000 times) [2024-06-22 02:58:55,246][15401] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-06-22 02:58:55,246][15401] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-06-22 02:58:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1966063616. Throughput: 0: 42898.6. Samples: 1966189700. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-06-22 02:58:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 02:58:58,855][15401] Updated weights for policy 0, policy_version 120000 (0.0028) [2024-06-22 02:59:01,863][15401] Updated weights for policy 0, policy_version 120010 (0.0025) [2024-06-22 02:59:03,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42653.9). Total num frames: 1966292992. Throughput: 0: 42992.0. Samples: 1966443920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:03,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 02:59:06,259][15401] Updated weights for policy 0, policy_version 120020 (0.0030) [2024-06-22 02:59:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1966489600. Throughput: 0: 42945.7. Samples: 1966578080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 02:59:09,354][15401] Updated weights for policy 0, policy_version 120030 (0.0037) [2024-06-22 02:59:13,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1966702592. Throughput: 0: 42945.8. Samples: 1966836220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:13,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 02:59:13,642][15401] Updated weights for policy 0, policy_version 120040 (0.0026) [2024-06-22 02:59:16,907][15401] Updated weights for policy 0, policy_version 120050 (0.0028) [2024-06-22 02:59:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1966948352. Throughput: 0: 43094.0. Samples: 1967096920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 02:59:21,203][15401] Updated weights for policy 0, policy_version 120060 (0.0043) [2024-06-22 02:59:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1967144960. Throughput: 0: 43083.5. Samples: 1967232400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 02:59:24,493][15401] Updated weights for policy 0, policy_version 120070 (0.0034) [2024-06-22 02:59:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 1967374336. Throughput: 0: 43116.8. Samples: 1967490420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 02:59:28,938][15401] Updated weights for policy 0, policy_version 120080 (0.0023) [2024-06-22 02:59:32,002][15401] Updated weights for policy 0, policy_version 120090 (0.0032) [2024-06-22 02:59:33,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43693.3, 300 sec: 42765.0). Total num frames: 1967603712. Throughput: 0: 43200.6. Samples: 1967745340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 02:59:36,591][15401] Updated weights for policy 0, policy_version 120100 (0.0033) [2024-06-22 02:59:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 1967783936. Throughput: 0: 43219.7. Samples: 1967881200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 02:59:39,658][15401] Updated weights for policy 0, policy_version 120110 (0.0033) [2024-06-22 02:59:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 1967996928. Throughput: 0: 43105.8. Samples: 1968129460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 02:59:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120117_1967996928.pth... [2024-06-22 02:59:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119489_1957707776.pth [2024-06-22 02:59:44,510][15401] Updated weights for policy 0, policy_version 120120 (0.0037) [2024-06-22 02:59:47,278][15401] Updated weights for policy 0, policy_version 120130 (0.0024) [2024-06-22 02:59:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1968242688. Throughput: 0: 43232.9. Samples: 1968389300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 02:59:52,150][15401] Updated weights for policy 0, policy_version 120140 (0.0029) [2024-06-22 02:59:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1968422912. Throughput: 0: 43278.4. Samples: 1968525600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 02:59:55,075][15401] Updated weights for policy 0, policy_version 120150 (0.0034) [2024-06-22 02:59:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1968652288. Throughput: 0: 43069.7. Samples: 1968774360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 02:59:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 02:59:59,753][15401] Updated weights for policy 0, policy_version 120160 (0.0038) [2024-06-22 03:00:02,902][15401] Updated weights for policy 0, policy_version 120170 (0.0037) [2024-06-22 03:00:03,392][15132] Fps is (10 sec: 45863.5, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 1968881664. Throughput: 0: 43042.6. Samples: 1969033940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:03,393][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 03:00:07,419][15401] Updated weights for policy 0, policy_version 120180 (0.0026) [2024-06-22 03:00:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1969078272. Throughput: 0: 42932.6. Samples: 1969164360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:08,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-22 03:00:10,519][15401] Updated weights for policy 0, policy_version 120190 (0.0040) [2024-06-22 03:00:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1969307648. Throughput: 0: 42674.3. Samples: 1969410760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 03:00:14,715][15349] Signal inference workers to stop experience collection... (29050 times) [2024-06-22 03:00:14,715][15349] Signal inference workers to resume experience collection... (29050 times) [2024-06-22 03:00:14,730][15401] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-06-22 03:00:14,731][15401] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-06-22 03:00:15,443][15401] Updated weights for policy 0, policy_version 120200 (0.0034) [2024-06-22 03:00:18,068][15401] Updated weights for policy 0, policy_version 120210 (0.0045) [2024-06-22 03:00:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1969520640. Throughput: 0: 42707.0. Samples: 1969667160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 03:00:23,139][15401] Updated weights for policy 0, policy_version 120220 (0.0035) [2024-06-22 03:00:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 1969700864. Throughput: 0: 42519.2. Samples: 1969794560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 03:00:26,189][15401] Updated weights for policy 0, policy_version 120230 (0.0031) [2024-06-22 03:00:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1969946624. Throughput: 0: 42689.4. Samples: 1970050480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:28,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 03:00:30,759][15401] Updated weights for policy 0, policy_version 120240 (0.0038) [2024-06-22 03:00:33,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 1970143232. Throughput: 0: 42633.7. Samples: 1970307820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 03:00:33,713][15401] Updated weights for policy 0, policy_version 120250 (0.0037) [2024-06-22 03:00:38,368][15401] Updated weights for policy 0, policy_version 120260 (0.0029) [2024-06-22 03:00:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 1970339840. Throughput: 0: 42433.7. Samples: 1970435220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:38,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 03:00:41,252][15401] Updated weights for policy 0, policy_version 120270 (0.0033) [2024-06-22 03:00:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1970585600. Throughput: 0: 42615.7. Samples: 1970692060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 03:00:45,835][15401] Updated weights for policy 0, policy_version 120280 (0.0038) [2024-06-22 03:00:48,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.4, 300 sec: 42932.1). Total num frames: 1970798592. Throughput: 0: 42622.7. Samples: 1970951860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 03:00:48,799][15401] Updated weights for policy 0, policy_version 120290 (0.0032) [2024-06-22 03:00:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1970978816. Throughput: 0: 42551.9. Samples: 1971079200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 03:00:53,494][15401] Updated weights for policy 0, policy_version 120300 (0.0058) [2024-06-22 03:00:56,761][15401] Updated weights for policy 0, policy_version 120310 (0.0029) [2024-06-22 03:00:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 1971224576. Throughput: 0: 42732.4. Samples: 1971333820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:00:58,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 03:01:01,383][15401] Updated weights for policy 0, policy_version 120320 (0.0033) [2024-06-22 03:01:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 1971437568. Throughput: 0: 42668.0. Samples: 1971587220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:01:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 03:01:04,270][15401] Updated weights for policy 0, policy_version 120330 (0.0040) [2024-06-22 03:01:08,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1971617792. Throughput: 0: 42763.4. Samples: 1971718920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 03:01:09,176][15401] Updated weights for policy 0, policy_version 120340 (0.0027) [2024-06-22 03:01:12,077][15401] Updated weights for policy 0, policy_version 120350 (0.0035) [2024-06-22 03:01:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 1971863552. Throughput: 0: 42527.4. Samples: 1971964320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:13,393][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 03:01:16,789][15349] Signal inference workers to stop experience collection... (29100 times) [2024-06-22 03:01:16,835][15401] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-06-22 03:01:16,842][15349] Signal inference workers to resume experience collection... (29100 times) [2024-06-22 03:01:16,849][15401] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-06-22 03:01:16,852][15401] Updated weights for policy 0, policy_version 120360 (0.0030) [2024-06-22 03:01:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1972060160. Throughput: 0: 42571.6. Samples: 1972223540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 03:01:19,967][15401] Updated weights for policy 0, policy_version 120370 (0.0049) [2024-06-22 03:01:23,392][15132] Fps is (10 sec: 39321.7, 60 sec: 42596.6, 300 sec: 42709.5). Total num frames: 1972256768. Throughput: 0: 42618.6. Samples: 1972353060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:23,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 03:01:24,140][15401] Updated weights for policy 0, policy_version 120380 (0.0026) [2024-06-22 03:01:27,458][15401] Updated weights for policy 0, policy_version 120390 (0.0040) [2024-06-22 03:01:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1972486144. Throughput: 0: 42707.6. Samples: 1972613900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:28,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-22 03:01:31,652][15401] Updated weights for policy 0, policy_version 120400 (0.0037) [2024-06-22 03:01:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1972699136. Throughput: 0: 42679.7. Samples: 1972872440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 03:01:35,011][15401] Updated weights for policy 0, policy_version 120410 (0.0027) [2024-06-22 03:01:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1972895744. Throughput: 0: 42590.3. Samples: 1972995760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 03:01:39,194][15401] Updated weights for policy 0, policy_version 120420 (0.0031) [2024-06-22 03:01:42,552][15401] Updated weights for policy 0, policy_version 120430 (0.0030) [2024-06-22 03:01:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1973141504. Throughput: 0: 42738.6. Samples: 1973256960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 03:01:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120431_1973141504.pth... [2024-06-22 03:01:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000119803_1962852352.pth [2024-06-22 03:01:46,736][15401] Updated weights for policy 0, policy_version 120440 (0.0032) [2024-06-22 03:01:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1973338112. Throughput: 0: 42795.4. Samples: 1973513020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:01:50,205][15401] Updated weights for policy 0, policy_version 120450 (0.0037) [2024-06-22 03:01:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1973551104. Throughput: 0: 42716.0. Samples: 1973641140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 03:01:54,547][15401] Updated weights for policy 0, policy_version 120460 (0.0037) [2024-06-22 03:01:57,921][15401] Updated weights for policy 0, policy_version 120470 (0.0033) [2024-06-22 03:01:58,394][15132] Fps is (10 sec: 44217.9, 60 sec: 42597.0, 300 sec: 42931.0). Total num frames: 1973780480. Throughput: 0: 42890.5. Samples: 1973894480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:01:58,394][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 03:02:02,408][15401] Updated weights for policy 0, policy_version 120480 (0.0033) [2024-06-22 03:02:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1973977088. Throughput: 0: 42830.8. Samples: 1974150920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 03:02:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:02:05,881][15401] Updated weights for policy 0, policy_version 120490 (0.0032) [2024-06-22 03:02:08,390][15132] Fps is (10 sec: 42616.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1974206464. Throughput: 0: 42762.6. Samples: 1974277280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:02:09,902][15401] Updated weights for policy 0, policy_version 120500 (0.0036) [2024-06-22 03:02:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 1974419456. Throughput: 0: 42780.8. Samples: 1974539040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 03:02:13,700][15401] Updated weights for policy 0, policy_version 120510 (0.0037) [2024-06-22 03:02:17,630][15401] Updated weights for policy 0, policy_version 120520 (0.0031) [2024-06-22 03:02:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1974616064. Throughput: 0: 42821.7. Samples: 1974799420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:18,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 03:02:21,487][15401] Updated weights for policy 0, policy_version 120530 (0.0022) [2024-06-22 03:02:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 1974845440. Throughput: 0: 42948.1. Samples: 1974928420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:23,390][15132] Avg episode reward: [(0, '0.916')] [2024-06-22 03:02:25,139][15401] Updated weights for policy 0, policy_version 120540 (0.0027) [2024-06-22 03:02:28,392][15132] Fps is (10 sec: 44225.0, 60 sec: 42869.5, 300 sec: 42875.7). Total num frames: 1975058432. Throughput: 0: 42749.1. Samples: 1975180780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:28,393][15132] Avg episode reward: [(0, '0.916')] [2024-06-22 03:02:29,123][15401] Updated weights for policy 0, policy_version 120550 (0.0031) [2024-06-22 03:02:31,792][15349] Signal inference workers to stop experience collection... (29150 times) [2024-06-22 03:02:31,792][15349] Signal inference workers to resume experience collection... (29150 times) [2024-06-22 03:02:31,809][15401] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-06-22 03:02:31,809][15401] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-06-22 03:02:32,555][15401] Updated weights for policy 0, policy_version 120560 (0.0030) [2024-06-22 03:02:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42821.0). Total num frames: 1975271424. Throughput: 0: 42794.8. Samples: 1975438780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 03:02:37,009][15401] Updated weights for policy 0, policy_version 120570 (0.0047) [2024-06-22 03:02:38,389][15132] Fps is (10 sec: 42609.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1975484416. Throughput: 0: 42779.6. Samples: 1975566220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 03:02:40,174][15401] Updated weights for policy 0, policy_version 120580 (0.0035) [2024-06-22 03:02:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1975697408. Throughput: 0: 42760.1. Samples: 1975818500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 03:02:44,507][15401] Updated weights for policy 0, policy_version 120590 (0.0035) [2024-06-22 03:02:47,803][15401] Updated weights for policy 0, policy_version 120600 (0.0027) [2024-06-22 03:02:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1975910400. Throughput: 0: 42842.2. Samples: 1976078820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 03:02:52,290][15401] Updated weights for policy 0, policy_version 120610 (0.0030) [2024-06-22 03:02:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1976139776. Throughput: 0: 42857.8. Samples: 1976205880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:53,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 03:02:55,360][15401] Updated weights for policy 0, policy_version 120620 (0.0030) [2024-06-22 03:02:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42328.5, 300 sec: 42765.0). Total num frames: 1976320000. Throughput: 0: 42655.2. Samples: 1976458520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:02:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 03:02:59,938][15401] Updated weights for policy 0, policy_version 120630 (0.0034) [2024-06-22 03:03:03,275][15401] Updated weights for policy 0, policy_version 120640 (0.0030) [2024-06-22 03:03:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1976565760. Throughput: 0: 42629.0. Samples: 1976717720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 03:03:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 03:03:07,415][15401] Updated weights for policy 0, policy_version 120650 (0.0037) [2024-06-22 03:03:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1976778752. Throughput: 0: 42725.7. Samples: 1976851080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 03:03:11,005][15401] Updated weights for policy 0, policy_version 120660 (0.0044) [2024-06-22 03:03:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1976975360. Throughput: 0: 42701.6. Samples: 1977102240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 03:03:14,903][15401] Updated weights for policy 0, policy_version 120670 (0.0028) [2024-06-22 03:03:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1977204736. Throughput: 0: 42814.7. Samples: 1977365440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 03:03:18,665][15401] Updated weights for policy 0, policy_version 120680 (0.0032) [2024-06-22 03:03:22,523][15401] Updated weights for policy 0, policy_version 120690 (0.0034) [2024-06-22 03:03:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1977417728. Throughput: 0: 42860.5. Samples: 1977494940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:23,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 03:03:26,373][15401] Updated weights for policy 0, policy_version 120700 (0.0039) [2024-06-22 03:03:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.2, 300 sec: 42821.0). Total num frames: 1977614336. Throughput: 0: 42932.4. Samples: 1977750460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 03:03:30,318][15401] Updated weights for policy 0, policy_version 120710 (0.0023) [2024-06-22 03:03:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1977843712. Throughput: 0: 42724.0. Samples: 1978001500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:33,392][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 03:03:34,252][15401] Updated weights for policy 0, policy_version 120720 (0.0035) [2024-06-22 03:03:37,772][15401] Updated weights for policy 0, policy_version 120730 (0.0039) [2024-06-22 03:03:38,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1978056704. Throughput: 0: 42881.4. Samples: 1978135540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 03:03:41,723][15401] Updated weights for policy 0, policy_version 120740 (0.0032) [2024-06-22 03:03:43,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1978269696. Throughput: 0: 43001.2. Samples: 1978393580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 03:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120744_1978269696.pth... [2024-06-22 03:03:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120117_1967996928.pth [2024-06-22 03:03:45,646][15401] Updated weights for policy 0, policy_version 120750 (0.0037) [2024-06-22 03:03:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1978499072. Throughput: 0: 42872.4. Samples: 1978646980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 03:03:48,989][15349] Signal inference workers to stop experience collection... (29200 times) [2024-06-22 03:03:48,991][15349] Signal inference workers to resume experience collection... (29200 times) [2024-06-22 03:03:49,014][15401] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-06-22 03:03:49,015][15401] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-06-22 03:03:49,141][15401] Updated weights for policy 0, policy_version 120760 (0.0034) [2024-06-22 03:03:53,199][15401] Updated weights for policy 0, policy_version 120770 (0.0045) [2024-06-22 03:03:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1978695680. Throughput: 0: 42916.9. Samples: 1978782340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:53,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 03:03:56,911][15401] Updated weights for policy 0, policy_version 120780 (0.0030) [2024-06-22 03:03:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 1978908672. Throughput: 0: 43078.2. Samples: 1979040860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:03:58,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 03:04:00,756][15401] Updated weights for policy 0, policy_version 120790 (0.0046) [2024-06-22 03:04:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1979138048. Throughput: 0: 42811.2. Samples: 1979291940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:04:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 03:04:04,580][15401] Updated weights for policy 0, policy_version 120800 (0.0030) [2024-06-22 03:04:08,287][15401] Updated weights for policy 0, policy_version 120810 (0.0036) [2024-06-22 03:04:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1979351040. Throughput: 0: 42803.0. Samples: 1979421080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 03:04:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 03:04:12,133][15401] Updated weights for policy 0, policy_version 120820 (0.0033) [2024-06-22 03:04:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1979547648. Throughput: 0: 42829.1. Samples: 1979677760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 03:04:15,951][15401] Updated weights for policy 0, policy_version 120830 (0.0036) [2024-06-22 03:04:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1979777024. Throughput: 0: 42899.7. Samples: 1979931880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 03:04:19,786][15401] Updated weights for policy 0, policy_version 120840 (0.0029) [2024-06-22 03:04:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1979973632. Throughput: 0: 42855.4. Samples: 1980064040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 03:04:23,731][15401] Updated weights for policy 0, policy_version 120850 (0.0036) [2024-06-22 03:04:27,421][15401] Updated weights for policy 0, policy_version 120860 (0.0044) [2024-06-22 03:04:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1980186624. Throughput: 0: 42801.8. Samples: 1980319660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 03:04:31,306][15401] Updated weights for policy 0, policy_version 120870 (0.0050) [2024-06-22 03:04:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 1980416000. Throughput: 0: 42732.5. Samples: 1980569940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 03:04:35,109][15401] Updated weights for policy 0, policy_version 120880 (0.0038) [2024-06-22 03:04:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1980628992. Throughput: 0: 42732.5. Samples: 1980705300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:04:39,020][15401] Updated weights for policy 0, policy_version 120890 (0.0030) [2024-06-22 03:04:42,850][15401] Updated weights for policy 0, policy_version 120900 (0.0042) [2024-06-22 03:04:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1980825600. Throughput: 0: 42623.2. Samples: 1980958800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 03:04:46,688][15401] Updated weights for policy 0, policy_version 120910 (0.0037) [2024-06-22 03:04:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1981071360. Throughput: 0: 42627.9. Samples: 1981210200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 03:04:50,518][15401] Updated weights for policy 0, policy_version 120920 (0.0036) [2024-06-22 03:04:53,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 1981251584. Throughput: 0: 42790.4. Samples: 1981346660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 03:04:54,153][15401] Updated weights for policy 0, policy_version 120930 (0.0035) [2024-06-22 03:04:58,094][15401] Updated weights for policy 0, policy_version 120940 (0.0036) [2024-06-22 03:04:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 1981480960. Throughput: 0: 42853.7. Samples: 1981606180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:04:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 03:05:02,071][15401] Updated weights for policy 0, policy_version 120950 (0.0032) [2024-06-22 03:05:03,390][15132] Fps is (10 sec: 47514.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1981726720. Throughput: 0: 42727.0. Samples: 1981854600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:05:03,391][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 03:05:05,735][15401] Updated weights for policy 0, policy_version 120960 (0.0034) [2024-06-22 03:05:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1981890560. Throughput: 0: 42735.6. Samples: 1981987140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 03:05:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 03:05:09,921][15401] Updated weights for policy 0, policy_version 120970 (0.0048) [2024-06-22 03:05:13,344][15401] Updated weights for policy 0, policy_version 120980 (0.0033) [2024-06-22 03:05:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1982136320. Throughput: 0: 42680.9. Samples: 1982240300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 03:05:17,588][15401] Updated weights for policy 0, policy_version 120990 (0.0031) [2024-06-22 03:05:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1982349312. Throughput: 0: 42666.2. Samples: 1982489920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 03:05:21,098][15401] Updated weights for policy 0, policy_version 121000 (0.0028) [2024-06-22 03:05:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1982529536. Throughput: 0: 42548.5. Samples: 1982619980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 03:05:25,041][15349] Signal inference workers to stop experience collection... (29250 times) [2024-06-22 03:05:25,041][15349] Signal inference workers to resume experience collection... (29250 times) [2024-06-22 03:05:25,084][15401] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-06-22 03:05:25,084][15401] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-06-22 03:05:25,186][15401] Updated weights for policy 0, policy_version 121010 (0.0031) [2024-06-22 03:05:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1982758912. Throughput: 0: 42572.0. Samples: 1982874540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 03:05:28,806][15401] Updated weights for policy 0, policy_version 121020 (0.0033) [2024-06-22 03:05:32,805][15401] Updated weights for policy 0, policy_version 121030 (0.0036) [2024-06-22 03:05:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.6, 300 sec: 42820.5). Total num frames: 1982971904. Throughput: 0: 42685.6. Samples: 1983131160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:33,393][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 03:05:36,462][15401] Updated weights for policy 0, policy_version 121040 (0.0028) [2024-06-22 03:05:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1983168512. Throughput: 0: 42479.0. Samples: 1983258200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:38,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 03:05:40,926][15401] Updated weights for policy 0, policy_version 121050 (0.0034) [2024-06-22 03:05:43,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1983414272. Throughput: 0: 42291.6. Samples: 1983509300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:43,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 03:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121058_1983414272.pth... [2024-06-22 03:05:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120431_1973141504.pth [2024-06-22 03:05:44,415][15401] Updated weights for policy 0, policy_version 121060 (0.0037) [2024-06-22 03:05:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1983594496. Throughput: 0: 42696.5. Samples: 1983775940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 03:05:48,414][15401] Updated weights for policy 0, policy_version 121070 (0.0024) [2024-06-22 03:05:51,835][15401] Updated weights for policy 0, policy_version 121080 (0.0031) [2024-06-22 03:05:53,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.9, 300 sec: 42709.5). Total num frames: 1983823872. Throughput: 0: 42506.6. Samples: 1983900040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:53,393][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 03:05:55,893][15401] Updated weights for policy 0, policy_version 121090 (0.0037) [2024-06-22 03:05:58,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 1984069632. Throughput: 0: 42665.3. Samples: 1984160340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:05:58,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:05:59,272][15401] Updated weights for policy 0, policy_version 121100 (0.0040) [2024-06-22 03:06:03,348][15401] Updated weights for policy 0, policy_version 121110 (0.0030) [2024-06-22 03:06:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1984266240. Throughput: 0: 42934.1. Samples: 1984421960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:06:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 03:06:06,768][15401] Updated weights for policy 0, policy_version 121120 (0.0024) [2024-06-22 03:06:08,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1984446464. Throughput: 0: 42817.4. Samples: 1984546760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-22 03:06:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 03:06:10,843][15401] Updated weights for policy 0, policy_version 121130 (0.0028) [2024-06-22 03:06:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1984708608. Throughput: 0: 43011.9. Samples: 1984810080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 03:06:14,278][15401] Updated weights for policy 0, policy_version 121140 (0.0041) [2024-06-22 03:06:18,356][15401] Updated weights for policy 0, policy_version 121150 (0.0031) [2024-06-22 03:06:18,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 1984921600. Throughput: 0: 43065.9. Samples: 1985069020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 03:06:22,037][15401] Updated weights for policy 0, policy_version 121160 (0.0030) [2024-06-22 03:06:23,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1985101824. Throughput: 0: 42978.8. Samples: 1985192240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 03:06:26,138][15401] Updated weights for policy 0, policy_version 121170 (0.0027) [2024-06-22 03:06:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1985331200. Throughput: 0: 43187.5. Samples: 1985452740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 03:06:30,133][15401] Updated weights for policy 0, policy_version 121180 (0.0046) [2024-06-22 03:06:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 1985544192. Throughput: 0: 42898.6. Samples: 1985706380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 03:06:34,081][15401] Updated weights for policy 0, policy_version 121190 (0.0038) [2024-06-22 03:06:37,725][15401] Updated weights for policy 0, policy_version 121200 (0.0038) [2024-06-22 03:06:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1985757184. Throughput: 0: 43022.0. Samples: 1985835920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 03:06:41,456][15349] Signal inference workers to stop experience collection... (29300 times) [2024-06-22 03:06:41,456][15349] Signal inference workers to resume experience collection... (29300 times) [2024-06-22 03:06:41,504][15401] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-06-22 03:06:41,504][15401] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-06-22 03:06:41,595][15401] Updated weights for policy 0, policy_version 121210 (0.0041) [2024-06-22 03:06:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1985986560. Throughput: 0: 43061.4. Samples: 1986098000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:43,391][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 03:06:45,086][15401] Updated weights for policy 0, policy_version 121220 (0.0038) [2024-06-22 03:06:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1986199552. Throughput: 0: 42938.8. Samples: 1986354200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 03:06:49,207][15401] Updated weights for policy 0, policy_version 121230 (0.0057) [2024-06-22 03:06:52,529][15401] Updated weights for policy 0, policy_version 121240 (0.0030) [2024-06-22 03:06:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 1986396160. Throughput: 0: 42951.5. Samples: 1986479680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:53,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 03:06:56,843][15401] Updated weights for policy 0, policy_version 121250 (0.0024) [2024-06-22 03:06:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 1986625536. Throughput: 0: 42881.6. Samples: 1986739740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:06:58,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 03:07:00,540][15401] Updated weights for policy 0, policy_version 121260 (0.0034) [2024-06-22 03:07:03,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1986822144. Throughput: 0: 42939.6. Samples: 1987001300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:07:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 03:07:04,370][15401] Updated weights for policy 0, policy_version 121270 (0.0048) [2024-06-22 03:07:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1987035136. Throughput: 0: 42949.7. Samples: 1987124980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:07:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 03:07:08,510][15401] Updated weights for policy 0, policy_version 121280 (0.0020) [2024-06-22 03:07:12,083][15401] Updated weights for policy 0, policy_version 121290 (0.0029) [2024-06-22 03:07:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 1987264512. Throughput: 0: 42783.2. Samples: 1987377980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 03:07:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 03:07:16,255][15401] Updated weights for policy 0, policy_version 121300 (0.0041) [2024-06-22 03:07:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1987461120. Throughput: 0: 42816.4. Samples: 1987633120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 03:07:19,753][15401] Updated weights for policy 0, policy_version 121310 (0.0055) [2024-06-22 03:07:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 1987674112. Throughput: 0: 42717.2. Samples: 1987758200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 03:07:23,985][15401] Updated weights for policy 0, policy_version 121320 (0.0039) [2024-06-22 03:07:27,477][15401] Updated weights for policy 0, policy_version 121330 (0.0032) [2024-06-22 03:07:28,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1987903488. Throughput: 0: 42669.5. Samples: 1988018120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 03:07:31,783][15401] Updated weights for policy 0, policy_version 121340 (0.0029) [2024-06-22 03:07:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1988116480. Throughput: 0: 42563.1. Samples: 1988269540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:33,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 03:07:35,260][15401] Updated weights for policy 0, policy_version 121350 (0.0031) [2024-06-22 03:07:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1988313088. Throughput: 0: 42650.2. Samples: 1988398840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 03:07:39,403][15401] Updated weights for policy 0, policy_version 121360 (0.0032) [2024-06-22 03:07:42,974][15401] Updated weights for policy 0, policy_version 121370 (0.0021) [2024-06-22 03:07:43,393][15132] Fps is (10 sec: 40943.6, 60 sec: 42322.5, 300 sec: 42764.4). Total num frames: 1988526080. Throughput: 0: 42613.1. Samples: 1988657500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:43,394][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 03:07:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121371_1988542464.pth... [2024-06-22 03:07:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000120744_1978269696.pth [2024-06-22 03:07:47,143][15401] Updated weights for policy 0, policy_version 121380 (0.0030) [2024-06-22 03:07:48,391][15132] Fps is (10 sec: 45869.3, 60 sec: 42870.5, 300 sec: 42820.4). Total num frames: 1988771840. Throughput: 0: 42363.6. Samples: 1988907720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:48,391][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 03:07:50,708][15401] Updated weights for policy 0, policy_version 121390 (0.0035) [2024-06-22 03:07:53,389][15132] Fps is (10 sec: 42615.3, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 1988952064. Throughput: 0: 42415.5. Samples: 1989033680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 03:07:54,894][15401] Updated weights for policy 0, policy_version 121400 (0.0034) [2024-06-22 03:07:58,389][15132] Fps is (10 sec: 39327.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1989165056. Throughput: 0: 42469.3. Samples: 1989289100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:07:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 03:07:58,462][15401] Updated weights for policy 0, policy_version 121410 (0.0029) [2024-06-22 03:08:02,695][15401] Updated weights for policy 0, policy_version 121420 (0.0034) [2024-06-22 03:08:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1989378048. Throughput: 0: 42615.5. Samples: 1989550820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:08:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 03:08:05,954][15401] Updated weights for policy 0, policy_version 121430 (0.0029) [2024-06-22 03:08:07,889][15349] Signal inference workers to stop experience collection... (29350 times) [2024-06-22 03:08:07,890][15349] Signal inference workers to resume experience collection... (29350 times) [2024-06-22 03:08:07,917][15401] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-06-22 03:08:07,917][15401] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-06-22 03:08:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1989607424. Throughput: 0: 42619.1. Samples: 1989676060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:08:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 03:08:10,318][15401] Updated weights for policy 0, policy_version 121440 (0.0031) [2024-06-22 03:08:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1989804032. Throughput: 0: 42606.1. Samples: 1989935400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:08:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 03:08:13,554][15401] Updated weights for policy 0, policy_version 121450 (0.0032) [2024-06-22 03:08:17,937][15401] Updated weights for policy 0, policy_version 121460 (0.0031) [2024-06-22 03:08:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1990017024. Throughput: 0: 42804.0. Samples: 1990195720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 03:08:21,135][15401] Updated weights for policy 0, policy_version 121470 (0.0032) [2024-06-22 03:08:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1990246400. Throughput: 0: 42533.4. Samples: 1990312840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 03:08:25,888][15401] Updated weights for policy 0, policy_version 121480 (0.0045) [2024-06-22 03:08:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 1990475776. Throughput: 0: 42641.9. Samples: 1990576220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:28,396][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 03:08:28,742][15401] Updated weights for policy 0, policy_version 121490 (0.0027) [2024-06-22 03:08:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1990656000. Throughput: 0: 42795.5. Samples: 1990833460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 03:08:33,400][15401] Updated weights for policy 0, policy_version 121500 (0.0034) [2024-06-22 03:08:36,367][15401] Updated weights for policy 0, policy_version 121510 (0.0032) [2024-06-22 03:08:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1990868992. Throughput: 0: 42643.5. Samples: 1990952640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 03:08:40,956][15401] Updated weights for policy 0, policy_version 121520 (0.0039) [2024-06-22 03:08:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42874.3, 300 sec: 42709.5). Total num frames: 1991098368. Throughput: 0: 42778.6. Samples: 1991214140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:08:44,158][15401] Updated weights for policy 0, policy_version 121530 (0.0047) [2024-06-22 03:08:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.1, 300 sec: 42653.9). Total num frames: 1991278592. Throughput: 0: 42670.8. Samples: 1991471000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:48,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 03:08:48,609][15401] Updated weights for policy 0, policy_version 121540 (0.0041) [2024-06-22 03:08:52,066][15401] Updated weights for policy 0, policy_version 121550 (0.0035) [2024-06-22 03:08:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 1991524352. Throughput: 0: 42500.8. Samples: 1991588600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:53,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 03:08:56,501][15401] Updated weights for policy 0, policy_version 121560 (0.0026) [2024-06-22 03:08:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1991737344. Throughput: 0: 42577.8. Samples: 1991851400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:08:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 03:08:59,712][15401] Updated weights for policy 0, policy_version 121570 (0.0050) [2024-06-22 03:09:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1991917568. Throughput: 0: 42638.0. Samples: 1992114440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:09:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 03:09:04,079][15401] Updated weights for policy 0, policy_version 121580 (0.0032) [2024-06-22 03:09:07,356][15401] Updated weights for policy 0, policy_version 121590 (0.0034) [2024-06-22 03:09:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1992179712. Throughput: 0: 42670.5. Samples: 1992233020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:09:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 03:09:11,774][15401] Updated weights for policy 0, policy_version 121600 (0.0031) [2024-06-22 03:09:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1992359936. Throughput: 0: 42643.2. Samples: 1992495160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:09:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 03:09:15,035][15401] Updated weights for policy 0, policy_version 121610 (0.0032) [2024-06-22 03:09:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1992556544. Throughput: 0: 42611.8. Samples: 1992751000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 03:09:18,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 03:09:19,491][15401] Updated weights for policy 0, policy_version 121620 (0.0028) [2024-06-22 03:09:20,849][15349] Signal inference workers to stop experience collection... (29400 times) [2024-06-22 03:09:20,850][15349] Signal inference workers to resume experience collection... (29400 times) [2024-06-22 03:09:20,890][15401] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-06-22 03:09:20,890][15401] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-06-22 03:09:22,607][15401] Updated weights for policy 0, policy_version 121630 (0.0025) [2024-06-22 03:09:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1992818688. Throughput: 0: 42751.2. Samples: 1992876440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 03:09:27,208][15401] Updated weights for policy 0, policy_version 121640 (0.0036) [2024-06-22 03:09:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 1992998912. Throughput: 0: 42819.1. Samples: 1993141000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 03:09:30,097][15401] Updated weights for policy 0, policy_version 121650 (0.0040) [2024-06-22 03:09:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1993211904. Throughput: 0: 42769.8. Samples: 1993395640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 03:09:34,940][15401] Updated weights for policy 0, policy_version 121660 (0.0032) [2024-06-22 03:09:37,635][15401] Updated weights for policy 0, policy_version 121670 (0.0033) [2024-06-22 03:09:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1993474048. Throughput: 0: 42965.8. Samples: 1993522060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:38,394][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 03:09:42,450][15401] Updated weights for policy 0, policy_version 121680 (0.0033) [2024-06-22 03:09:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1993637888. Throughput: 0: 42987.0. Samples: 1993785820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 03:09:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121682_1993637888.pth... [2024-06-22 03:09:43,445][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121058_1983414272.pth [2024-06-22 03:09:45,081][15401] Updated weights for policy 0, policy_version 121690 (0.0033) [2024-06-22 03:09:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 1993867264. Throughput: 0: 42735.7. Samples: 1994037540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 03:09:50,334][15401] Updated weights for policy 0, policy_version 121700 (0.0031) [2024-06-22 03:09:52,736][15401] Updated weights for policy 0, policy_version 121710 (0.0043) [2024-06-22 03:09:53,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1994113024. Throughput: 0: 43081.9. Samples: 1994171700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 03:09:57,858][15401] Updated weights for policy 0, policy_version 121720 (0.0030) [2024-06-22 03:09:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1994276864. Throughput: 0: 42893.8. Samples: 1994425380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:09:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 03:10:00,494][15401] Updated weights for policy 0, policy_version 121730 (0.0032) [2024-06-22 03:10:03,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1994506240. Throughput: 0: 42820.4. Samples: 1994677920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:10:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 03:10:05,587][15401] Updated weights for policy 0, policy_version 121740 (0.0040) [2024-06-22 03:10:08,201][15401] Updated weights for policy 0, policy_version 121750 (0.0039) [2024-06-22 03:10:08,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1994768384. Throughput: 0: 43065.2. Samples: 1994814380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:10:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 03:10:13,244][15401] Updated weights for policy 0, policy_version 121760 (0.0042) [2024-06-22 03:10:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1994915840. Throughput: 0: 42855.8. Samples: 1995069520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:10:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 03:10:15,856][15401] Updated weights for policy 0, policy_version 121770 (0.0036) [2024-06-22 03:10:18,390][15132] Fps is (10 sec: 39322.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1995161600. Throughput: 0: 42615.5. Samples: 1995313340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 03:10:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 03:10:21,285][15401] Updated weights for policy 0, policy_version 121780 (0.0031) [2024-06-22 03:10:23,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1995390976. Throughput: 0: 42964.1. Samples: 1995455440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 03:10:23,542][15401] Updated weights for policy 0, policy_version 121790 (0.0036) [2024-06-22 03:10:28,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 1995538432. Throughput: 0: 42692.1. Samples: 1995706960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:28,398][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 03:10:28,866][15401] Updated weights for policy 0, policy_version 121800 (0.0027) [2024-06-22 03:10:31,270][15401] Updated weights for policy 0, policy_version 121810 (0.0026) [2024-06-22 03:10:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1995816960. Throughput: 0: 42516.3. Samples: 1995950780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:33,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 03:10:36,482][15401] Updated weights for policy 0, policy_version 121820 (0.0034) [2024-06-22 03:10:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1996013568. Throughput: 0: 42652.9. Samples: 1996091080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 03:10:39,047][15401] Updated weights for policy 0, policy_version 121830 (0.0031) [2024-06-22 03:10:43,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1996177408. Throughput: 0: 42627.5. Samples: 1996343620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 03:10:43,896][15401] Updated weights for policy 0, policy_version 121840 (0.0032) [2024-06-22 03:10:46,758][15401] Updated weights for policy 0, policy_version 121850 (0.0036) [2024-06-22 03:10:47,422][15349] Signal inference workers to stop experience collection... (29450 times) [2024-06-22 03:10:47,426][15349] Signal inference workers to resume experience collection... (29450 times) [2024-06-22 03:10:47,433][15401] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-06-22 03:10:47,466][15401] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-06-22 03:10:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 1996455936. Throughput: 0: 42512.7. Samples: 1996590980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 03:10:51,906][15401] Updated weights for policy 0, policy_version 121860 (0.0030) [2024-06-22 03:10:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 1996636160. Throughput: 0: 42668.1. Samples: 1996734440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 03:10:54,354][15401] Updated weights for policy 0, policy_version 121870 (0.0030) [2024-06-22 03:10:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1996849152. Throughput: 0: 42624.0. Samples: 1996987600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:10:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 03:10:59,375][15401] Updated weights for policy 0, policy_version 121880 (0.0032) [2024-06-22 03:11:02,010][15401] Updated weights for policy 0, policy_version 121890 (0.0029) [2024-06-22 03:11:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 1997094912. Throughput: 0: 42745.8. Samples: 1997237000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:11:03,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 03:11:06,868][15401] Updated weights for policy 0, policy_version 121900 (0.0041) [2024-06-22 03:11:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1997291520. Throughput: 0: 42638.6. Samples: 1997374180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:11:08,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 03:11:09,885][15401] Updated weights for policy 0, policy_version 121910 (0.0030) [2024-06-22 03:11:13,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1997504512. Throughput: 0: 42666.6. Samples: 1997626960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:11:13,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 03:11:14,256][15401] Updated weights for policy 0, policy_version 121920 (0.0053) [2024-06-22 03:11:17,303][15401] Updated weights for policy 0, policy_version 121930 (0.0036) [2024-06-22 03:11:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1997750272. Throughput: 0: 42856.0. Samples: 1997879300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 03:11:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 03:11:22,166][15401] Updated weights for policy 0, policy_version 121940 (0.0043) [2024-06-22 03:11:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 1997914112. Throughput: 0: 42654.9. Samples: 1998010560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:23,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 03:11:24,982][15401] Updated weights for policy 0, policy_version 121950 (0.0036) [2024-06-22 03:11:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1998143488. Throughput: 0: 42720.0. Samples: 1998266020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 03:11:29,661][15401] Updated weights for policy 0, policy_version 121960 (0.0028) [2024-06-22 03:11:32,517][15401] Updated weights for policy 0, policy_version 121970 (0.0035) [2024-06-22 03:11:33,389][15132] Fps is (10 sec: 49152.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1998405632. Throughput: 0: 42883.9. Samples: 1998520760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 03:11:37,366][15401] Updated weights for policy 0, policy_version 121980 (0.0034) [2024-06-22 03:11:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1998569472. Throughput: 0: 42736.9. Samples: 1998657600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 03:11:40,162][15401] Updated weights for policy 0, policy_version 121990 (0.0036) [2024-06-22 03:11:43,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43688.8, 300 sec: 42709.1). Total num frames: 1998798848. Throughput: 0: 42767.1. Samples: 1998912220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:43,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 03:11:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121997_1998798848.pth... [2024-06-22 03:11:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121371_1988542464.pth [2024-06-22 03:11:44,760][15401] Updated weights for policy 0, policy_version 122000 (0.0041) [2024-06-22 03:11:47,888][15401] Updated weights for policy 0, policy_version 122010 (0.0029) [2024-06-22 03:11:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 1999028224. Throughput: 0: 42885.2. Samples: 1999166740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 03:11:52,641][15401] Updated weights for policy 0, policy_version 122020 (0.0038) [2024-06-22 03:11:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1999208448. Throughput: 0: 42748.4. Samples: 1999297860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 03:11:55,753][15401] Updated weights for policy 0, policy_version 122030 (0.0030) [2024-06-22 03:11:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1999437824. Throughput: 0: 42744.4. Samples: 1999550460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:11:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 03:12:00,168][15401] Updated weights for policy 0, policy_version 122040 (0.0024) [2024-06-22 03:12:03,274][15349] Signal inference workers to stop experience collection... (29500 times) [2024-06-22 03:12:03,275][15349] Signal inference workers to resume experience collection... (29500 times) [2024-06-22 03:12:03,309][15401] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-06-22 03:12:03,310][15401] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-06-22 03:12:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1999650816. Throughput: 0: 43049.4. Samples: 1999816520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:12:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 03:12:03,437][15401] Updated weights for policy 0, policy_version 122050 (0.0038) [2024-06-22 03:12:07,692][15401] Updated weights for policy 0, policy_version 122060 (0.0027) [2024-06-22 03:12:08,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 1999863808. Throughput: 0: 42964.7. Samples: 1999943980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:12:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 03:12:11,147][15401] Updated weights for policy 0, policy_version 122070 (0.0030) [2024-06-22 03:12:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2000093184. Throughput: 0: 42888.0. Samples: 2000195980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:12:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 03:12:15,207][15401] Updated weights for policy 0, policy_version 122080 (0.0038) [2024-06-22 03:12:18,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2000289792. Throughput: 0: 43103.0. Samples: 2000460400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:12:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 03:12:18,808][15401] Updated weights for policy 0, policy_version 122090 (0.0035) [2024-06-22 03:12:23,017][15401] Updated weights for policy 0, policy_version 122100 (0.0037) [2024-06-22 03:12:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42709.4). Total num frames: 2000502784. Throughput: 0: 42833.7. Samples: 2000585120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 03:12:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 03:12:26,439][15401] Updated weights for policy 0, policy_version 122110 (0.0031) [2024-06-22 03:12:28,392][15132] Fps is (10 sec: 45864.7, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 2000748544. Throughput: 0: 42889.8. Samples: 2000842260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:28,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 03:12:30,522][15401] Updated weights for policy 0, policy_version 122120 (0.0044) [2024-06-22 03:12:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2000928768. Throughput: 0: 43082.2. Samples: 2001105440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 03:12:34,078][15401] Updated weights for policy 0, policy_version 122130 (0.0035) [2024-06-22 03:12:38,151][15401] Updated weights for policy 0, policy_version 122140 (0.0033) [2024-06-22 03:12:38,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 2001141760. Throughput: 0: 42904.6. Samples: 2001228560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 03:12:41,771][15401] Updated weights for policy 0, policy_version 122150 (0.0040) [2024-06-22 03:12:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.7). Total num frames: 2001371136. Throughput: 0: 43039.5. Samples: 2001487240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 03:12:45,746][15401] Updated weights for policy 0, policy_version 122160 (0.0030) [2024-06-22 03:12:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2001567744. Throughput: 0: 42863.1. Samples: 2001745360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 03:12:49,320][15401] Updated weights for policy 0, policy_version 122170 (0.0039) [2024-06-22 03:12:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2001780736. Throughput: 0: 42851.4. Samples: 2001872280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:53,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 03:12:53,400][15401] Updated weights for policy 0, policy_version 122180 (0.0028) [2024-06-22 03:12:56,967][15401] Updated weights for policy 0, policy_version 122190 (0.0028) [2024-06-22 03:12:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2002010112. Throughput: 0: 42887.9. Samples: 2002125940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:12:58,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 03:13:01,089][15401] Updated weights for policy 0, policy_version 122200 (0.0036) [2024-06-22 03:13:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2002239488. Throughput: 0: 42786.8. Samples: 2002385800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:13:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 03:13:04,519][15401] Updated weights for policy 0, policy_version 122210 (0.0020) [2024-06-22 03:13:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 2002436096. Throughput: 0: 42860.1. Samples: 2002513820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:13:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 03:13:08,510][15401] Updated weights for policy 0, policy_version 122220 (0.0031) [2024-06-22 03:13:12,328][15401] Updated weights for policy 0, policy_version 122230 (0.0043) [2024-06-22 03:13:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2002665472. Throughput: 0: 42902.4. Samples: 2002772760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:13:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 03:13:16,253][15401] Updated weights for policy 0, policy_version 122240 (0.0031) [2024-06-22 03:13:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2002862080. Throughput: 0: 42718.9. Samples: 2003027780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:13:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 03:13:19,839][15401] Updated weights for policy 0, policy_version 122250 (0.0037) [2024-06-22 03:13:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2003075072. Throughput: 0: 42787.4. Samples: 2003154000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:13:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 03:13:23,764][15401] Updated weights for policy 0, policy_version 122260 (0.0031) [2024-06-22 03:13:27,230][15401] Updated weights for policy 0, policy_version 122270 (0.0031) [2024-06-22 03:13:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 2003288064. Throughput: 0: 42769.9. Samples: 2003411880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 03:13:31,293][15401] Updated weights for policy 0, policy_version 122280 (0.0032) [2024-06-22 03:13:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2003484672. Throughput: 0: 42893.3. Samples: 2003675560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 03:13:34,979][15401] Updated weights for policy 0, policy_version 122290 (0.0031) [2024-06-22 03:13:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2003714048. Throughput: 0: 42854.7. Samples: 2003800740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 03:13:39,097][15401] Updated weights for policy 0, policy_version 122300 (0.0027) [2024-06-22 03:13:41,503][15349] Signal inference workers to stop experience collection... (29550 times) [2024-06-22 03:13:41,503][15349] Signal inference workers to resume experience collection... (29550 times) [2024-06-22 03:13:41,530][15401] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-06-22 03:13:41,530][15401] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-06-22 03:13:42,422][15401] Updated weights for policy 0, policy_version 122310 (0.0031) [2024-06-22 03:13:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2003943424. Throughput: 0: 42997.0. Samples: 2004060800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 03:13:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122312_2003959808.pth... [2024-06-22 03:13:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121682_1993637888.pth [2024-06-22 03:13:46,777][15401] Updated weights for policy 0, policy_version 122320 (0.0042) [2024-06-22 03:13:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2004140032. Throughput: 0: 43116.9. Samples: 2004326060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 03:13:49,911][15401] Updated weights for policy 0, policy_version 122330 (0.0030) [2024-06-22 03:13:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2004353024. Throughput: 0: 43021.8. Samples: 2004449800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 03:13:54,456][15401] Updated weights for policy 0, policy_version 122340 (0.0042) [2024-06-22 03:13:57,426][15401] Updated weights for policy 0, policy_version 122350 (0.0026) [2024-06-22 03:13:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2004598784. Throughput: 0: 43095.0. Samples: 2004712040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:13:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 03:14:01,994][15401] Updated weights for policy 0, policy_version 122360 (0.0033) [2024-06-22 03:14:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2004795392. Throughput: 0: 43180.8. Samples: 2004970920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:14:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 03:14:05,262][15401] Updated weights for policy 0, policy_version 122370 (0.0034) [2024-06-22 03:14:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2005008384. Throughput: 0: 43008.9. Samples: 2005089400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:14:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 03:14:09,658][15401] Updated weights for policy 0, policy_version 122380 (0.0031) [2024-06-22 03:14:12,731][15401] Updated weights for policy 0, policy_version 122390 (0.0044) [2024-06-22 03:14:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2005237760. Throughput: 0: 43112.8. Samples: 2005351960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:14:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 03:14:17,379][15401] Updated weights for policy 0, policy_version 122400 (0.0037) [2024-06-22 03:14:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2005450752. Throughput: 0: 43088.0. Samples: 2005614520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:14:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 03:14:20,267][15401] Updated weights for policy 0, policy_version 122410 (0.0036) [2024-06-22 03:14:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2005663744. Throughput: 0: 43119.4. Samples: 2005741120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:14:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 03:14:24,924][15401] Updated weights for policy 0, policy_version 122420 (0.0046) [2024-06-22 03:14:28,029][15401] Updated weights for policy 0, policy_version 122430 (0.0040) [2024-06-22 03:14:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2005893120. Throughput: 0: 43184.4. Samples: 2006004100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 03:14:32,548][15401] Updated weights for policy 0, policy_version 122440 (0.0035) [2024-06-22 03:14:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2006106112. Throughput: 0: 43139.2. Samples: 2006267320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 03:14:35,804][15401] Updated weights for policy 0, policy_version 122450 (0.0042) [2024-06-22 03:14:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 2006302720. Throughput: 0: 43080.9. Samples: 2006388440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:38,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 03:14:40,506][15401] Updated weights for policy 0, policy_version 122460 (0.0040) [2024-06-22 03:14:43,325][15401] Updated weights for policy 0, policy_version 122470 (0.0028) [2024-06-22 03:14:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 2006548480. Throughput: 0: 43057.7. Samples: 2006649740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:43,393][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 03:14:48,229][15401] Updated weights for policy 0, policy_version 122480 (0.0041) [2024-06-22 03:14:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2006712320. Throughput: 0: 43165.4. Samples: 2006913360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 03:14:51,288][15401] Updated weights for policy 0, policy_version 122490 (0.0043) [2024-06-22 03:14:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2006958080. Throughput: 0: 43191.6. Samples: 2007033020. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 03:14:55,767][15401] Updated weights for policy 0, policy_version 122500 (0.0035) [2024-06-22 03:14:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2007154688. Throughput: 0: 43121.4. Samples: 2007292420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:14:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 03:14:58,842][15401] Updated weights for policy 0, policy_version 122510 (0.0026) [2024-06-22 03:15:03,208][15401] Updated weights for policy 0, policy_version 122520 (0.0026) [2024-06-22 03:15:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2007367680. Throughput: 0: 43117.8. Samples: 2007554820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 03:15:06,625][15401] Updated weights for policy 0, policy_version 122530 (0.0032) [2024-06-22 03:15:08,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 2007613440. Throughput: 0: 43145.3. Samples: 2007682760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:08,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 03:15:11,165][15401] Updated weights for policy 0, policy_version 122540 (0.0037) [2024-06-22 03:15:11,499][15349] Signal inference workers to stop experience collection... (29600 times) [2024-06-22 03:15:11,532][15401] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-06-22 03:15:11,564][15349] Signal inference workers to resume experience collection... (29600 times) [2024-06-22 03:15:11,566][15401] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-06-22 03:15:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2007793664. Throughput: 0: 42826.2. Samples: 2007931380. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:13,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 03:15:14,398][15401] Updated weights for policy 0, policy_version 122550 (0.0023) [2024-06-22 03:15:18,392][15132] Fps is (10 sec: 39321.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 2008006656. Throughput: 0: 42713.6. Samples: 2008189540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:18,392][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 03:15:18,691][15401] Updated weights for policy 0, policy_version 122560 (0.0026) [2024-06-22 03:15:22,601][15401] Updated weights for policy 0, policy_version 122570 (0.0042) [2024-06-22 03:15:23,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 2008252416. Throughput: 0: 42754.7. Samples: 2008312400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 03:15:26,148][15401] Updated weights for policy 0, policy_version 122580 (0.0035) [2024-06-22 03:15:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2008449024. Throughput: 0: 42662.3. Samples: 2008569440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 03:15:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 03:15:30,194][15401] Updated weights for policy 0, policy_version 122590 (0.0026) [2024-06-22 03:15:33,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2008645632. Throughput: 0: 42491.4. Samples: 2008825480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 03:15:33,824][15401] Updated weights for policy 0, policy_version 122600 (0.0035) [2024-06-22 03:15:37,795][15401] Updated weights for policy 0, policy_version 122610 (0.0027) [2024-06-22 03:15:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2008875008. Throughput: 0: 42713.4. Samples: 2008955120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 03:15:41,475][15401] Updated weights for policy 0, policy_version 122620 (0.0039) [2024-06-22 03:15:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 2009104384. Throughput: 0: 42688.8. Samples: 2009213420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 03:15:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122626_2009104384.pth... [2024-06-22 03:15:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000121997_1998798848.pth [2024-06-22 03:15:45,407][15401] Updated weights for policy 0, policy_version 122630 (0.0043) [2024-06-22 03:15:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2009300992. Throughput: 0: 42470.6. Samples: 2009466000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 03:15:49,046][15401] Updated weights for policy 0, policy_version 122640 (0.0039) [2024-06-22 03:15:53,077][15401] Updated weights for policy 0, policy_version 122650 (0.0047) [2024-06-22 03:15:53,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42593.9, 300 sec: 42930.7). Total num frames: 2009513984. Throughput: 0: 42521.1. Samples: 2009596380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:53,397][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 03:15:56,606][15401] Updated weights for policy 0, policy_version 122660 (0.0048) [2024-06-22 03:15:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 2009743360. Throughput: 0: 42764.4. Samples: 2009855680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:15:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 03:16:00,659][15401] Updated weights for policy 0, policy_version 122670 (0.0036) [2024-06-22 03:16:03,389][15132] Fps is (10 sec: 44265.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2009956352. Throughput: 0: 42582.8. Samples: 2010105660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 03:16:04,405][15401] Updated weights for policy 0, policy_version 122680 (0.0039) [2024-06-22 03:16:08,382][15401] Updated weights for policy 0, policy_version 122690 (0.0028) [2024-06-22 03:16:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 2010152960. Throughput: 0: 42618.0. Samples: 2010230220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 03:16:12,052][15401] Updated weights for policy 0, policy_version 122700 (0.0029) [2024-06-22 03:16:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2010365952. Throughput: 0: 42677.8. Samples: 2010489940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 03:16:16,107][15401] Updated weights for policy 0, policy_version 122710 (0.0030) [2024-06-22 03:16:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 2010595328. Throughput: 0: 42583.6. Samples: 2010741740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 03:16:19,748][15401] Updated weights for policy 0, policy_version 122720 (0.0023) [2024-06-22 03:16:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2010775552. Throughput: 0: 42740.5. Samples: 2010878440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 03:16:23,605][15401] Updated weights for policy 0, policy_version 122730 (0.0030) [2024-06-22 03:16:27,382][15401] Updated weights for policy 0, policy_version 122740 (0.0035) [2024-06-22 03:16:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2010988544. Throughput: 0: 42705.9. Samples: 2011135180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 03:16:28,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 03:16:31,174][15401] Updated weights for policy 0, policy_version 122750 (0.0033) [2024-06-22 03:16:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 2011250688. Throughput: 0: 42720.8. Samples: 2011388440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 03:16:35,169][15401] Updated weights for policy 0, policy_version 122760 (0.0039) [2024-06-22 03:16:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2011430912. Throughput: 0: 42911.4. Samples: 2011527120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:38,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 03:16:38,708][15401] Updated weights for policy 0, policy_version 122770 (0.0027) [2024-06-22 03:16:40,012][15349] Signal inference workers to stop experience collection... (29650 times) [2024-06-22 03:16:40,013][15349] Signal inference workers to resume experience collection... (29650 times) [2024-06-22 03:16:40,034][15401] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-06-22 03:16:40,065][15401] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-06-22 03:16:42,652][15401] Updated weights for policy 0, policy_version 122780 (0.0040) [2024-06-22 03:16:43,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2011643904. Throughput: 0: 42757.3. Samples: 2011779860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:43,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 03:16:46,139][15401] Updated weights for policy 0, policy_version 122790 (0.0032) [2024-06-22 03:16:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2011889664. Throughput: 0: 42994.2. Samples: 2012040400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 03:16:50,113][15401] Updated weights for policy 0, policy_version 122800 (0.0039) [2024-06-22 03:16:53,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43149.1, 300 sec: 42931.6). Total num frames: 2012102656. Throughput: 0: 43181.0. Samples: 2012173360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 03:16:53,551][15401] Updated weights for policy 0, policy_version 122810 (0.0035) [2024-06-22 03:16:57,728][15401] Updated weights for policy 0, policy_version 122820 (0.0045) [2024-06-22 03:16:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 2012282880. Throughput: 0: 42943.5. Samples: 2012422400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:16:58,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 03:17:01,653][15401] Updated weights for policy 0, policy_version 122830 (0.0037) [2024-06-22 03:17:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 2012528640. Throughput: 0: 42971.2. Samples: 2012675440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 03:17:05,356][15401] Updated weights for policy 0, policy_version 122840 (0.0034) [2024-06-22 03:17:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2012741632. Throughput: 0: 42893.7. Samples: 2012808660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 03:17:09,349][15401] Updated weights for policy 0, policy_version 122850 (0.0040) [2024-06-22 03:17:13,082][15401] Updated weights for policy 0, policy_version 122860 (0.0031) [2024-06-22 03:17:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2012938240. Throughput: 0: 42668.4. Samples: 2013055260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 03:17:17,021][15401] Updated weights for policy 0, policy_version 122870 (0.0048) [2024-06-22 03:17:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2013151232. Throughput: 0: 42573.9. Samples: 2013304260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 03:17:21,265][15401] Updated weights for policy 0, policy_version 122880 (0.0026) [2024-06-22 03:17:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 2013364224. Throughput: 0: 42377.8. Samples: 2013434120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 03:17:24,831][15401] Updated weights for policy 0, policy_version 122890 (0.0032) [2024-06-22 03:17:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2013560832. Throughput: 0: 42465.8. Samples: 2013690820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:28,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 03:17:28,885][15401] Updated weights for policy 0, policy_version 122900 (0.0037) [2024-06-22 03:17:32,601][15401] Updated weights for policy 0, policy_version 122910 (0.0038) [2024-06-22 03:17:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2013790208. Throughput: 0: 42397.3. Samples: 2013948280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 03:17:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 03:17:36,488][15401] Updated weights for policy 0, policy_version 122920 (0.0043) [2024-06-22 03:17:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2014003200. Throughput: 0: 42390.7. Samples: 2014080940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:17:38,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 03:17:40,186][15401] Updated weights for policy 0, policy_version 122930 (0.0031) [2024-06-22 03:17:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2014199808. Throughput: 0: 42500.5. Samples: 2014334920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:17:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 03:17:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122938_2014216192.pth... [2024-06-22 03:17:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122312_2003959808.pth [2024-06-22 03:17:44,064][15401] Updated weights for policy 0, policy_version 122940 (0.0026) [2024-06-22 03:17:47,807][15401] Updated weights for policy 0, policy_version 122950 (0.0049) [2024-06-22 03:17:48,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 2014445568. Throughput: 0: 42598.5. Samples: 2014592480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:17:48,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 03:17:51,812][15401] Updated weights for policy 0, policy_version 122960 (0.0030) [2024-06-22 03:17:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2014625792. Throughput: 0: 42479.6. Samples: 2014720240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:17:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 03:17:55,214][15401] Updated weights for policy 0, policy_version 122970 (0.0032) [2024-06-22 03:17:58,392][15132] Fps is (10 sec: 40960.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2014855168. Throughput: 0: 42667.2. Samples: 2014975380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:17:58,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 03:17:59,353][15401] Updated weights for policy 0, policy_version 122980 (0.0044) [2024-06-22 03:18:02,913][15401] Updated weights for policy 0, policy_version 122990 (0.0038) [2024-06-22 03:18:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2015068160. Throughput: 0: 42664.9. Samples: 2015224180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 03:18:07,057][15401] Updated weights for policy 0, policy_version 123000 (0.0039) [2024-06-22 03:18:08,389][15132] Fps is (10 sec: 39331.3, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 2015248384. Throughput: 0: 42722.4. Samples: 2015356620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 03:18:08,827][15349] Signal inference workers to stop experience collection... (29700 times) [2024-06-22 03:18:08,857][15401] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-06-22 03:18:08,891][15349] Signal inference workers to resume experience collection... (29700 times) [2024-06-22 03:18:08,892][15401] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-06-22 03:18:10,487][15401] Updated weights for policy 0, policy_version 123010 (0.0028) [2024-06-22 03:18:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2015494144. Throughput: 0: 42751.2. Samples: 2015614520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 03:18:14,990][15401] Updated weights for policy 0, policy_version 123020 (0.0038) [2024-06-22 03:18:18,362][15401] Updated weights for policy 0, policy_version 123030 (0.0030) [2024-06-22 03:18:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2015723520. Throughput: 0: 42496.9. Samples: 2015860640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 03:18:22,707][15401] Updated weights for policy 0, policy_version 123040 (0.0037) [2024-06-22 03:18:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2015887360. Throughput: 0: 42400.4. Samples: 2015988960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 03:18:26,154][15401] Updated weights for policy 0, policy_version 123050 (0.0027) [2024-06-22 03:18:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2016133120. Throughput: 0: 42324.4. Samples: 2016239520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 03:18:30,883][15401] Updated weights for policy 0, policy_version 123060 (0.0027) [2024-06-22 03:18:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2016329728. Throughput: 0: 42380.0. Samples: 2016499480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 03:18:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 03:18:33,981][15401] Updated weights for policy 0, policy_version 123070 (0.0031) [2024-06-22 03:18:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2016526336. Throughput: 0: 42258.7. Samples: 2016621880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:18:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 03:18:38,458][15401] Updated weights for policy 0, policy_version 123080 (0.0031) [2024-06-22 03:18:41,638][15401] Updated weights for policy 0, policy_version 123090 (0.0041) [2024-06-22 03:18:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2016772096. Throughput: 0: 42347.1. Samples: 2016880900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:18:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 03:18:46,086][15401] Updated weights for policy 0, policy_version 123100 (0.0038) [2024-06-22 03:18:48,391][15132] Fps is (10 sec: 45869.0, 60 sec: 42326.2, 300 sec: 42820.4). Total num frames: 2016985088. Throughput: 0: 42473.0. Samples: 2017135520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:18:48,391][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 03:18:49,459][15401] Updated weights for policy 0, policy_version 123110 (0.0033) [2024-06-22 03:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2017181696. Throughput: 0: 42413.8. Samples: 2017265240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:18:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 03:18:53,537][15401] Updated weights for policy 0, policy_version 123120 (0.0038) [2024-06-22 03:18:57,053][15401] Updated weights for policy 0, policy_version 123130 (0.0043) [2024-06-22 03:18:58,390][15132] Fps is (10 sec: 42603.6, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2017411072. Throughput: 0: 42257.2. Samples: 2017516100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:18:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 03:19:01,110][15401] Updated weights for policy 0, policy_version 123140 (0.0029) [2024-06-22 03:19:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2017607680. Throughput: 0: 42767.6. Samples: 2017785180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 03:19:04,645][15401] Updated weights for policy 0, policy_version 123150 (0.0026) [2024-06-22 03:19:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2017837056. Throughput: 0: 42703.5. Samples: 2017910620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:08,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 03:19:08,766][15401] Updated weights for policy 0, policy_version 123160 (0.0026) [2024-06-22 03:19:12,113][15401] Updated weights for policy 0, policy_version 123170 (0.0036) [2024-06-22 03:19:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2018050048. Throughput: 0: 42847.4. Samples: 2018167660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 03:19:16,394][15401] Updated weights for policy 0, policy_version 123180 (0.0034) [2024-06-22 03:19:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2018263040. Throughput: 0: 42847.4. Samples: 2018427600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 03:19:19,756][15401] Updated weights for policy 0, policy_version 123190 (0.0032) [2024-06-22 03:19:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2018476032. Throughput: 0: 42856.4. Samples: 2018550420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 03:19:24,253][15401] Updated weights for policy 0, policy_version 123200 (0.0031) [2024-06-22 03:19:26,776][15349] Signal inference workers to stop experience collection... (29750 times) [2024-06-22 03:19:26,824][15401] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-06-22 03:19:26,890][15349] Signal inference workers to resume experience collection... (29750 times) [2024-06-22 03:19:26,891][15401] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-06-22 03:19:27,412][15401] Updated weights for policy 0, policy_version 123210 (0.0032) [2024-06-22 03:19:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2018689024. Throughput: 0: 42729.7. Samples: 2018803740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 03:19:31,773][15401] Updated weights for policy 0, policy_version 123220 (0.0037) [2024-06-22 03:19:33,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2018885632. Throughput: 0: 42837.9. Samples: 2019063180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 03:19:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 03:19:35,089][15401] Updated weights for policy 0, policy_version 123230 (0.0039) [2024-06-22 03:19:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 2019098624. Throughput: 0: 42746.1. Samples: 2019188820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:19:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 03:19:39,483][15401] Updated weights for policy 0, policy_version 123240 (0.0019) [2024-06-22 03:19:42,934][15401] Updated weights for policy 0, policy_version 123250 (0.0029) [2024-06-22 03:19:43,392][15132] Fps is (10 sec: 45865.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2019344384. Throughput: 0: 42960.4. Samples: 2019449420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:19:43,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 03:19:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123251_2019344384.pth... [2024-06-22 03:19:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122626_2009104384.pth [2024-06-22 03:19:46,999][15401] Updated weights for policy 0, policy_version 123260 (0.0028) [2024-06-22 03:19:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42599.3, 300 sec: 42653.9). Total num frames: 2019540992. Throughput: 0: 42718.6. Samples: 2019707520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:19:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 03:19:50,428][15401] Updated weights for policy 0, policy_version 123270 (0.0031) [2024-06-22 03:19:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2019737600. Throughput: 0: 42633.4. Samples: 2019829120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:19:53,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 03:19:54,578][15401] Updated weights for policy 0, policy_version 123280 (0.0027) [2024-06-22 03:19:57,963][15401] Updated weights for policy 0, policy_version 123290 (0.0033) [2024-06-22 03:19:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2019983360. Throughput: 0: 42826.4. Samples: 2020094840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:19:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 03:20:02,312][15401] Updated weights for policy 0, policy_version 123300 (0.0032) [2024-06-22 03:20:03,392][15132] Fps is (10 sec: 45865.3, 60 sec: 43143.0, 300 sec: 42654.0). Total num frames: 2020196352. Throughput: 0: 42681.8. Samples: 2020348380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:03,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 03:20:05,639][15401] Updated weights for policy 0, policy_version 123310 (0.0033) [2024-06-22 03:20:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2020392960. Throughput: 0: 42790.2. Samples: 2020475980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:08,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 03:20:09,948][15401] Updated weights for policy 0, policy_version 123320 (0.0050) [2024-06-22 03:20:13,318][15401] Updated weights for policy 0, policy_version 123330 (0.0027) [2024-06-22 03:20:13,396][15132] Fps is (10 sec: 44217.8, 60 sec: 43140.0, 300 sec: 42820.0). Total num frames: 2020638720. Throughput: 0: 42886.9. Samples: 2020733920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:13,396][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 03:20:17,749][15401] Updated weights for policy 0, policy_version 123340 (0.0027) [2024-06-22 03:20:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2020818944. Throughput: 0: 42814.0. Samples: 2020989800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 03:20:21,097][15401] Updated weights for policy 0, policy_version 123350 (0.0034) [2024-06-22 03:20:23,389][15132] Fps is (10 sec: 37707.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2021015552. Throughput: 0: 42665.4. Samples: 2021108760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:23,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 03:20:25,641][15401] Updated weights for policy 0, policy_version 123360 (0.0034) [2024-06-22 03:20:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2021261312. Throughput: 0: 42675.2. Samples: 2021369700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 03:20:28,717][15401] Updated weights for policy 0, policy_version 123370 (0.0026) [2024-06-22 03:20:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2021441536. Throughput: 0: 42717.3. Samples: 2021629800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 03:20:33,560][15401] Updated weights for policy 0, policy_version 123380 (0.0029) [2024-06-22 03:20:36,514][15401] Updated weights for policy 0, policy_version 123390 (0.0043) [2024-06-22 03:20:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2021670912. Throughput: 0: 42700.4. Samples: 2021750640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 03:20:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 03:20:41,235][15401] Updated weights for policy 0, policy_version 123400 (0.0030) [2024-06-22 03:20:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 2021883904. Throughput: 0: 42588.0. Samples: 2022011300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:20:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 03:20:44,234][15401] Updated weights for policy 0, policy_version 123410 (0.0038) [2024-06-22 03:20:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 2022080512. Throughput: 0: 42592.2. Samples: 2022264940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:20:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 03:20:49,112][15401] Updated weights for policy 0, policy_version 123420 (0.0039) [2024-06-22 03:20:51,881][15401] Updated weights for policy 0, policy_version 123430 (0.0030) [2024-06-22 03:20:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2022326272. Throughput: 0: 42429.7. Samples: 2022385320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:20:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 03:20:56,735][15401] Updated weights for policy 0, policy_version 123440 (0.0039) [2024-06-22 03:20:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2022522880. Throughput: 0: 42509.6. Samples: 2022646580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:20:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 03:20:59,500][15401] Updated weights for policy 0, policy_version 123450 (0.0035) [2024-06-22 03:21:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2022735872. Throughput: 0: 42607.6. Samples: 2022907140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 03:21:04,585][15401] Updated weights for policy 0, policy_version 123460 (0.0041) [2024-06-22 03:21:04,842][15349] Signal inference workers to stop experience collection... (29800 times) [2024-06-22 03:21:04,842][15349] Signal inference workers to resume experience collection... (29800 times) [2024-06-22 03:21:04,885][15401] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-06-22 03:21:04,885][15401] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-06-22 03:21:07,169][15401] Updated weights for policy 0, policy_version 123470 (0.0030) [2024-06-22 03:21:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2022981632. Throughput: 0: 42673.2. Samples: 2023029060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:08,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 03:21:12,283][15401] Updated weights for policy 0, policy_version 123480 (0.0037) [2024-06-22 03:21:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 2023178240. Throughput: 0: 42808.8. Samples: 2023296100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:13,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 03:21:14,771][15401] Updated weights for policy 0, policy_version 123490 (0.0034) [2024-06-22 03:21:18,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2023358464. Throughput: 0: 42568.0. Samples: 2023545360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:18,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 03:21:19,938][15401] Updated weights for policy 0, policy_version 123500 (0.0029) [2024-06-22 03:21:22,399][15401] Updated weights for policy 0, policy_version 123510 (0.0034) [2024-06-22 03:21:23,394][15132] Fps is (10 sec: 42578.1, 60 sec: 43141.0, 300 sec: 42764.3). Total num frames: 2023604224. Throughput: 0: 42636.3. Samples: 2023669480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:23,395][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 03:21:27,525][15401] Updated weights for policy 0, policy_version 123520 (0.0034) [2024-06-22 03:21:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2023800832. Throughput: 0: 42733.4. Samples: 2023934300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 03:21:29,992][15401] Updated weights for policy 0, policy_version 123530 (0.0024) [2024-06-22 03:21:33,396][15132] Fps is (10 sec: 39315.3, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2023997440. Throughput: 0: 42641.9. Samples: 2024184100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:33,396][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 03:21:35,064][15401] Updated weights for policy 0, policy_version 123540 (0.0033) [2024-06-22 03:21:38,273][15401] Updated weights for policy 0, policy_version 123550 (0.0034) [2024-06-22 03:21:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 2024243200. Throughput: 0: 42731.0. Samples: 2024308220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:21:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 03:21:42,818][15401] Updated weights for policy 0, policy_version 123560 (0.0034) [2024-06-22 03:21:43,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2024423424. Throughput: 0: 42604.0. Samples: 2024563760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:21:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 03:21:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123562_2024439808.pth... [2024-06-22 03:21:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000122938_2014216192.pth [2024-06-22 03:21:45,948][15401] Updated weights for policy 0, policy_version 123570 (0.0031) [2024-06-22 03:21:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2024636416. Throughput: 0: 42373.6. Samples: 2024813960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:21:48,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 03:21:50,551][15401] Updated weights for policy 0, policy_version 123580 (0.0040) [2024-06-22 03:21:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2024865792. Throughput: 0: 42610.3. Samples: 2024946520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:21:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 03:21:53,685][15401] Updated weights for policy 0, policy_version 123590 (0.0050) [2024-06-22 03:21:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2025062400. Throughput: 0: 42398.6. Samples: 2025204040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:21:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 03:21:58,398][15401] Updated weights for policy 0, policy_version 123600 (0.0034) [2024-06-22 03:22:01,266][15401] Updated weights for policy 0, policy_version 123610 (0.0041) [2024-06-22 03:22:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 2025291776. Throughput: 0: 42400.8. Samples: 2025453400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:03,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 03:22:06,097][15401] Updated weights for policy 0, policy_version 123620 (0.0032) [2024-06-22 03:22:08,395][15132] Fps is (10 sec: 44214.1, 60 sec: 42048.6, 300 sec: 42597.7). Total num frames: 2025504768. Throughput: 0: 42554.2. Samples: 2025584440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:08,395][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 03:22:09,258][15401] Updated weights for policy 0, policy_version 123630 (0.0046) [2024-06-22 03:22:13,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 2025701376. Throughput: 0: 42303.4. Samples: 2025838060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:13,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 03:22:13,902][15401] Updated weights for policy 0, policy_version 123640 (0.0031) [2024-06-22 03:22:14,280][15349] Signal inference workers to stop experience collection... (29850 times) [2024-06-22 03:22:14,281][15349] Signal inference workers to resume experience collection... (29850 times) [2024-06-22 03:22:14,314][15401] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-06-22 03:22:14,314][15401] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-06-22 03:22:16,805][15401] Updated weights for policy 0, policy_version 123650 (0.0049) [2024-06-22 03:22:18,390][15132] Fps is (10 sec: 42620.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2025930752. Throughput: 0: 42387.3. Samples: 2026091260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 03:22:21,470][15401] Updated weights for policy 0, policy_version 123660 (0.0053) [2024-06-22 03:22:23,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42328.7, 300 sec: 42654.3). Total num frames: 2026143744. Throughput: 0: 42637.0. Samples: 2026226880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:23,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 03:22:24,314][15401] Updated weights for policy 0, policy_version 123670 (0.0031) [2024-06-22 03:22:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2026340352. Throughput: 0: 42617.2. Samples: 2026481540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 03:22:29,055][15401] Updated weights for policy 0, policy_version 123680 (0.0030) [2024-06-22 03:22:32,227][15401] Updated weights for policy 0, policy_version 123690 (0.0024) [2024-06-22 03:22:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43149.2, 300 sec: 42653.9). Total num frames: 2026586112. Throughput: 0: 42528.1. Samples: 2026727720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:33,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 03:22:36,511][15401] Updated weights for policy 0, policy_version 123700 (0.0038) [2024-06-22 03:22:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2026782720. Throughput: 0: 42626.7. Samples: 2026864720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 03:22:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 03:22:40,032][15401] Updated weights for policy 0, policy_version 123710 (0.0029) [2024-06-22 03:22:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42542.9). Total num frames: 2026995712. Throughput: 0: 42584.1. Samples: 2027120420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:22:43,393][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 03:22:44,039][15401] Updated weights for policy 0, policy_version 123720 (0.0023) [2024-06-22 03:22:47,740][15401] Updated weights for policy 0, policy_version 123730 (0.0031) [2024-06-22 03:22:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2027208704. Throughput: 0: 42580.4. Samples: 2027369520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:22:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 03:22:52,012][15401] Updated weights for policy 0, policy_version 123740 (0.0035) [2024-06-22 03:22:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2027405312. Throughput: 0: 42515.2. Samples: 2027497400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:22:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 03:22:55,473][15401] Updated weights for policy 0, policy_version 123750 (0.0031) [2024-06-22 03:22:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2027618304. Throughput: 0: 42572.6. Samples: 2027753720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:22:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 03:22:59,511][15401] Updated weights for policy 0, policy_version 123760 (0.0032) [2024-06-22 03:23:03,055][15401] Updated weights for policy 0, policy_version 123770 (0.0045) [2024-06-22 03:23:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 2027847680. Throughput: 0: 42558.2. Samples: 2028006380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 03:23:07,053][15401] Updated weights for policy 0, policy_version 123780 (0.0042) [2024-06-22 03:23:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42329.0, 300 sec: 42542.9). Total num frames: 2028044288. Throughput: 0: 42461.8. Samples: 2028137660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:08,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-22 03:23:11,059][15401] Updated weights for policy 0, policy_version 123790 (0.0028) [2024-06-22 03:23:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 2028257280. Throughput: 0: 42340.4. Samples: 2028386860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 03:23:14,519][15401] Updated weights for policy 0, policy_version 123800 (0.0035) [2024-06-22 03:23:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2028486656. Throughput: 0: 42608.8. Samples: 2028645120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 03:23:18,913][15401] Updated weights for policy 0, policy_version 123810 (0.0025) [2024-06-22 03:23:22,334][15401] Updated weights for policy 0, policy_version 123820 (0.0043) [2024-06-22 03:23:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2028683264. Throughput: 0: 42526.6. Samples: 2028778420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 03:23:23,766][15349] Signal inference workers to stop experience collection... (29900 times) [2024-06-22 03:23:23,767][15349] Signal inference workers to resume experience collection... (29900 times) [2024-06-22 03:23:23,816][15401] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-06-22 03:23:23,817][15401] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-06-22 03:23:26,296][15401] Updated weights for policy 0, policy_version 123830 (0.0029) [2024-06-22 03:23:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2028896256. Throughput: 0: 42455.1. Samples: 2029030800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 03:23:29,815][15401] Updated weights for policy 0, policy_version 123840 (0.0036) [2024-06-22 03:23:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2029125632. Throughput: 0: 42643.6. Samples: 2029288480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:33,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 03:23:33,870][15401] Updated weights for policy 0, policy_version 123850 (0.0032) [2024-06-22 03:23:37,376][15401] Updated weights for policy 0, policy_version 123860 (0.0032) [2024-06-22 03:23:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2029338624. Throughput: 0: 42816.8. Samples: 2029424160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 03:23:41,948][15401] Updated weights for policy 0, policy_version 123870 (0.0035) [2024-06-22 03:23:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42325.3, 300 sec: 42542.7). Total num frames: 2029535232. Throughput: 0: 42722.1. Samples: 2029676320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 03:23:43,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 03:23:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123873_2029535232.pth... [2024-06-22 03:23:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123251_2019344384.pth [2024-06-22 03:23:44,972][15401] Updated weights for policy 0, policy_version 123880 (0.0037) [2024-06-22 03:23:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2029780992. Throughput: 0: 42627.6. Samples: 2029924720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:23:48,393][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 03:23:49,483][15401] Updated weights for policy 0, policy_version 123890 (0.0033) [2024-06-22 03:23:52,884][15401] Updated weights for policy 0, policy_version 123900 (0.0033) [2024-06-22 03:23:53,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2029993984. Throughput: 0: 42613.0. Samples: 2030055240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:23:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 03:23:57,334][15401] Updated weights for policy 0, policy_version 123910 (0.0037) [2024-06-22 03:23:58,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2030157824. Throughput: 0: 42678.8. Samples: 2030307400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:23:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 03:24:00,640][15401] Updated weights for policy 0, policy_version 123920 (0.0031) [2024-06-22 03:24:03,395][15132] Fps is (10 sec: 39301.2, 60 sec: 42321.8, 300 sec: 42542.1). Total num frames: 2030387200. Throughput: 0: 42539.6. Samples: 2030559620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:03,395][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 03:24:05,055][15401] Updated weights for policy 0, policy_version 123930 (0.0032) [2024-06-22 03:24:08,323][15401] Updated weights for policy 0, policy_version 123940 (0.0030) [2024-06-22 03:24:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2030632960. Throughput: 0: 42407.2. Samples: 2030686740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 03:24:12,863][15401] Updated weights for policy 0, policy_version 123950 (0.0027) [2024-06-22 03:24:13,389][15132] Fps is (10 sec: 40981.5, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 2030796800. Throughput: 0: 42395.7. Samples: 2030938600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:13,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 03:24:16,015][15401] Updated weights for policy 0, policy_version 123960 (0.0027) [2024-06-22 03:24:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2031026176. Throughput: 0: 42368.9. Samples: 2031195080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 03:24:20,479][15401] Updated weights for policy 0, policy_version 123970 (0.0024) [2024-06-22 03:24:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2031255552. Throughput: 0: 42176.9. Samples: 2031322120. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 03:24:23,645][15401] Updated weights for policy 0, policy_version 123980 (0.0027) [2024-06-22 03:24:28,292][15401] Updated weights for policy 0, policy_version 123990 (0.0033) [2024-06-22 03:24:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2031452160. Throughput: 0: 42383.6. Samples: 2031583480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 03:24:31,450][15401] Updated weights for policy 0, policy_version 124000 (0.0030) [2024-06-22 03:24:33,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 2031665152. Throughput: 0: 42289.1. Samples: 2031827900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:33,397][15132] Avg episode reward: [(0, '0.089')] [2024-06-22 03:24:35,573][15349] Signal inference workers to stop experience collection... (29950 times) [2024-06-22 03:24:35,626][15401] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-06-22 03:24:35,634][15349] Signal inference workers to resume experience collection... (29950 times) [2024-06-22 03:24:35,652][15401] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-06-22 03:24:35,932][15401] Updated weights for policy 0, policy_version 124010 (0.0038) [2024-06-22 03:24:38,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 2031878144. Throughput: 0: 42269.4. Samples: 2031957380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 03:24:39,526][15401] Updated weights for policy 0, policy_version 124020 (0.0032) [2024-06-22 03:24:43,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 2032058368. Throughput: 0: 42379.5. Samples: 2032214480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-06-22 03:24:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 03:24:43,839][15401] Updated weights for policy 0, policy_version 124030 (0.0036) [2024-06-22 03:24:47,237][15401] Updated weights for policy 0, policy_version 124040 (0.0029) [2024-06-22 03:24:48,390][15132] Fps is (10 sec: 44238.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 2032320512. Throughput: 0: 42257.7. Samples: 2032461000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:24:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 03:24:51,654][15401] Updated weights for policy 0, policy_version 124050 (0.0044) [2024-06-22 03:24:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2032517120. Throughput: 0: 42498.7. Samples: 2032599180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:24:53,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 03:24:54,603][15401] Updated weights for policy 0, policy_version 124060 (0.0027) [2024-06-22 03:24:58,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 2032697344. Throughput: 0: 42624.9. Samples: 2032856720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:24:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 03:24:59,262][15401] Updated weights for policy 0, policy_version 124070 (0.0035) [2024-06-22 03:25:02,013][15401] Updated weights for policy 0, policy_version 124080 (0.0025) [2024-06-22 03:25:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43148.2, 300 sec: 42653.9). Total num frames: 2032975872. Throughput: 0: 42530.7. Samples: 2033108960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 03:25:06,795][15401] Updated weights for policy 0, policy_version 124090 (0.0037) [2024-06-22 03:25:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42325.3, 300 sec: 42488.2). Total num frames: 2033172480. Throughput: 0: 42978.3. Samples: 2033256140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 03:25:09,428][15401] Updated weights for policy 0, policy_version 124100 (0.0042) [2024-06-22 03:25:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2033352704. Throughput: 0: 42679.1. Samples: 2033504040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 03:25:14,538][15401] Updated weights for policy 0, policy_version 124110 (0.0031) [2024-06-22 03:25:16,963][15401] Updated weights for policy 0, policy_version 124120 (0.0029) [2024-06-22 03:25:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 2033631232. Throughput: 0: 42836.8. Samples: 2033755380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:18,392][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 03:25:22,260][15401] Updated weights for policy 0, policy_version 124130 (0.0024) [2024-06-22 03:25:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2033811456. Throughput: 0: 43176.4. Samples: 2033900300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 03:25:24,884][15401] Updated weights for policy 0, policy_version 124140 (0.0042) [2024-06-22 03:25:28,389][15132] Fps is (10 sec: 36053.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2033991680. Throughput: 0: 42793.8. Samples: 2034140200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 03:25:29,947][15401] Updated weights for policy 0, policy_version 124150 (0.0035) [2024-06-22 03:25:32,602][15401] Updated weights for policy 0, policy_version 124160 (0.0038) [2024-06-22 03:25:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43149.2, 300 sec: 42654.0). Total num frames: 2034253824. Throughput: 0: 43044.1. Samples: 2034397980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 03:25:37,591][15401] Updated weights for policy 0, policy_version 124170 (0.0032) [2024-06-22 03:25:38,049][15349] Signal inference workers to stop experience collection... (30000 times) [2024-06-22 03:25:38,057][15349] Signal inference workers to resume experience collection... (30000 times) [2024-06-22 03:25:38,059][15401] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-06-22 03:25:38,094][15401] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-06-22 03:25:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 2034450432. Throughput: 0: 43050.1. Samples: 2034536440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 03:25:40,048][15401] Updated weights for policy 0, policy_version 124180 (0.0024) [2024-06-22 03:25:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 2034663424. Throughput: 0: 42886.5. Samples: 2034786620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 03:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124186_2034663424.pth... [2024-06-22 03:25:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123562_2024439808.pth [2024-06-22 03:25:45,030][15401] Updated weights for policy 0, policy_version 124190 (0.0036) [2024-06-22 03:25:48,087][15401] Updated weights for policy 0, policy_version 124200 (0.0026) [2024-06-22 03:25:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2034909184. Throughput: 0: 42920.5. Samples: 2035040380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 03:25:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 03:25:52,645][15401] Updated weights for policy 0, policy_version 124210 (0.0030) [2024-06-22 03:25:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2035089408. Throughput: 0: 42673.4. Samples: 2035176440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:25:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 03:25:55,741][15401] Updated weights for policy 0, policy_version 124220 (0.0053) [2024-06-22 03:25:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 2035302400. Throughput: 0: 42718.6. Samples: 2035426380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:25:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 03:26:00,314][15401] Updated weights for policy 0, policy_version 124230 (0.0029) [2024-06-22 03:26:03,382][15401] Updated weights for policy 0, policy_version 124240 (0.0028) [2024-06-22 03:26:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2035548160. Throughput: 0: 42813.9. Samples: 2035681900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 03:26:07,971][15401] Updated weights for policy 0, policy_version 124250 (0.0035) [2024-06-22 03:26:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2035728384. Throughput: 0: 42596.4. Samples: 2035817140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 03:26:11,331][15401] Updated weights for policy 0, policy_version 124260 (0.0042) [2024-06-22 03:26:13,396][15132] Fps is (10 sec: 39296.3, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 2035941376. Throughput: 0: 42739.2. Samples: 2036063740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:13,396][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 03:26:15,661][15401] Updated weights for policy 0, policy_version 124270 (0.0031) [2024-06-22 03:26:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42054.0, 300 sec: 42543.6). Total num frames: 2036154368. Throughput: 0: 42808.5. Samples: 2036324360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 03:26:18,957][15401] Updated weights for policy 0, policy_version 124280 (0.0041) [2024-06-22 03:26:23,181][15401] Updated weights for policy 0, policy_version 124290 (0.0030) [2024-06-22 03:26:23,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2036367360. Throughput: 0: 42630.3. Samples: 2036454800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 03:26:26,624][15401] Updated weights for policy 0, policy_version 124300 (0.0040) [2024-06-22 03:26:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42654.9). Total num frames: 2036580352. Throughput: 0: 42559.1. Samples: 2036701780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 03:26:30,702][15401] Updated weights for policy 0, policy_version 124310 (0.0028) [2024-06-22 03:26:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2036793344. Throughput: 0: 42650.2. Samples: 2036959640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 03:26:34,316][15401] Updated weights for policy 0, policy_version 124320 (0.0034) [2024-06-22 03:26:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2037006336. Throughput: 0: 42570.6. Samples: 2037092120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 03:26:38,450][15401] Updated weights for policy 0, policy_version 124330 (0.0038) [2024-06-22 03:26:41,873][15401] Updated weights for policy 0, policy_version 124340 (0.0033) [2024-06-22 03:26:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2037202944. Throughput: 0: 42470.6. Samples: 2037337560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 03:26:46,010][15401] Updated weights for policy 0, policy_version 124350 (0.0025) [2024-06-22 03:26:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2037432320. Throughput: 0: 42512.8. Samples: 2037594980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 03:26:48,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 03:26:49,479][15401] Updated weights for policy 0, policy_version 124360 (0.0048) [2024-06-22 03:26:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 2037645312. Throughput: 0: 42468.4. Samples: 2037728220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:26:53,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 03:26:53,900][15401] Updated weights for policy 0, policy_version 124370 (0.0035) [2024-06-22 03:26:56,973][15401] Updated weights for policy 0, policy_version 124380 (0.0037) [2024-06-22 03:26:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2037858304. Throughput: 0: 42339.7. Samples: 2037968760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:26:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 03:27:01,592][15401] Updated weights for policy 0, policy_version 124390 (0.0032) [2024-06-22 03:27:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42599.2). Total num frames: 2038071296. Throughput: 0: 42558.1. Samples: 2038239480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 03:27:05,029][15401] Updated weights for policy 0, policy_version 124400 (0.0034) [2024-06-22 03:27:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2038284288. Throughput: 0: 42472.4. Samples: 2038366060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 03:27:09,056][15401] Updated weights for policy 0, policy_version 124410 (0.0038) [2024-06-22 03:27:11,136][15349] Signal inference workers to stop experience collection... (30050 times) [2024-06-22 03:27:11,137][15349] Signal inference workers to resume experience collection... (30050 times) [2024-06-22 03:27:11,185][15401] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-06-22 03:27:11,192][15401] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-06-22 03:27:12,631][15401] Updated weights for policy 0, policy_version 124420 (0.0026) [2024-06-22 03:27:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42875.9, 300 sec: 42653.9). Total num frames: 2038513664. Throughput: 0: 42595.5. Samples: 2038618580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 03:27:16,447][15401] Updated weights for policy 0, policy_version 124430 (0.0028) [2024-06-22 03:27:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2038710272. Throughput: 0: 42767.4. Samples: 2038884180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 03:27:20,038][15401] Updated weights for policy 0, policy_version 124440 (0.0027) [2024-06-22 03:27:23,396][15132] Fps is (10 sec: 42571.9, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 2038939648. Throughput: 0: 42729.0. Samples: 2039015200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:23,396][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 03:27:24,076][15401] Updated weights for policy 0, policy_version 124450 (0.0028) [2024-06-22 03:27:27,786][15401] Updated weights for policy 0, policy_version 124460 (0.0030) [2024-06-22 03:27:28,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2039152640. Throughput: 0: 42841.6. Samples: 2039265420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 03:27:31,772][15401] Updated weights for policy 0, policy_version 124470 (0.0045) [2024-06-22 03:27:33,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2039365632. Throughput: 0: 42979.6. Samples: 2039529060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 03:27:35,302][15401] Updated weights for policy 0, policy_version 124480 (0.0042) [2024-06-22 03:27:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 2039562240. Throughput: 0: 42883.6. Samples: 2039657980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 03:27:39,390][15401] Updated weights for policy 0, policy_version 124490 (0.0029) [2024-06-22 03:27:42,996][15401] Updated weights for policy 0, policy_version 124500 (0.0025) [2024-06-22 03:27:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2039808000. Throughput: 0: 43202.1. Samples: 2039912860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:43,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-22 03:27:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124500_2039808000.pth... [2024-06-22 03:27:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000123873_2029535232.pth [2024-06-22 03:27:47,222][15401] Updated weights for policy 0, policy_version 124510 (0.0033) [2024-06-22 03:27:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2040020992. Throughput: 0: 42907.6. Samples: 2040170320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 03:27:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 03:27:50,640][15401] Updated weights for policy 0, policy_version 124520 (0.0049) [2024-06-22 03:27:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2040201216. Throughput: 0: 42953.3. Samples: 2040298960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:27:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 03:27:54,911][15401] Updated weights for policy 0, policy_version 124530 (0.0024) [2024-06-22 03:27:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2040446976. Throughput: 0: 43081.1. Samples: 2040557220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:27:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 03:27:58,623][15401] Updated weights for policy 0, policy_version 124540 (0.0032) [2024-06-22 03:28:02,411][15401] Updated weights for policy 0, policy_version 124550 (0.0044) [2024-06-22 03:28:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2040659968. Throughput: 0: 42726.4. Samples: 2040806860. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 03:28:06,179][15401] Updated weights for policy 0, policy_version 124560 (0.0044) [2024-06-22 03:28:08,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2040840192. Throughput: 0: 42663.3. Samples: 2040934780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 03:28:10,319][15401] Updated weights for policy 0, policy_version 124570 (0.0030) [2024-06-22 03:28:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2041069568. Throughput: 0: 42838.6. Samples: 2041193160. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 03:28:14,027][15401] Updated weights for policy 0, policy_version 124580 (0.0026) [2024-06-22 03:28:17,903][15401] Updated weights for policy 0, policy_version 124590 (0.0029) [2024-06-22 03:28:18,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2041298944. Throughput: 0: 42723.1. Samples: 2041451700. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:18,392][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 03:28:21,521][15401] Updated weights for policy 0, policy_version 124600 (0.0027) [2024-06-22 03:28:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 2041495552. Throughput: 0: 42779.9. Samples: 2041583080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 03:28:25,407][15401] Updated weights for policy 0, policy_version 124610 (0.0039) [2024-06-22 03:28:28,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2041708544. Throughput: 0: 42846.9. Samples: 2041840960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:28,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 03:28:29,301][15401] Updated weights for policy 0, policy_version 124620 (0.0039) [2024-06-22 03:28:32,976][15401] Updated weights for policy 0, policy_version 124630 (0.0044) [2024-06-22 03:28:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2041954304. Throughput: 0: 42808.9. Samples: 2042096720. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 03:28:37,116][15401] Updated weights for policy 0, policy_version 124640 (0.0028) [2024-06-22 03:28:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 2042150912. Throughput: 0: 43001.4. Samples: 2042234020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 03:28:38,840][15349] Signal inference workers to stop experience collection... (30100 times) [2024-06-22 03:28:38,880][15401] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-06-22 03:28:38,954][15349] Signal inference workers to resume experience collection... (30100 times) [2024-06-22 03:28:38,954][15401] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-06-22 03:28:40,597][15401] Updated weights for policy 0, policy_version 124650 (0.0035) [2024-06-22 03:28:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2042347520. Throughput: 0: 42787.9. Samples: 2042482680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 03:28:45,048][15401] Updated weights for policy 0, policy_version 124660 (0.0038) [2024-06-22 03:28:48,083][15401] Updated weights for policy 0, policy_version 124670 (0.0039) [2024-06-22 03:28:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2042593280. Throughput: 0: 42911.6. Samples: 2042737880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 03:28:52,965][15401] Updated weights for policy 0, policy_version 124680 (0.0035) [2024-06-22 03:28:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2042773504. Throughput: 0: 43240.2. Samples: 2042880580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 03:28:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 03:28:55,550][15401] Updated weights for policy 0, policy_version 124690 (0.0036) [2024-06-22 03:28:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.8). Total num frames: 2043002880. Throughput: 0: 43062.7. Samples: 2043130980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:28:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 03:29:00,458][15401] Updated weights for policy 0, policy_version 124700 (0.0035) [2024-06-22 03:29:03,033][15401] Updated weights for policy 0, policy_version 124710 (0.0029) [2024-06-22 03:29:03,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2043248640. Throughput: 0: 42964.1. Samples: 2043384980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 03:29:07,979][15401] Updated weights for policy 0, policy_version 124720 (0.0035) [2024-06-22 03:29:08,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2043428864. Throughput: 0: 43166.6. Samples: 2043525680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:08,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 03:29:10,499][15401] Updated weights for policy 0, policy_version 124730 (0.0036) [2024-06-22 03:29:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2043641856. Throughput: 0: 42922.6. Samples: 2043772480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 03:29:15,734][15401] Updated weights for policy 0, policy_version 124740 (0.0035) [2024-06-22 03:29:18,273][15401] Updated weights for policy 0, policy_version 124750 (0.0030) [2024-06-22 03:29:18,392][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42875.8). Total num frames: 2043904000. Throughput: 0: 42926.6. Samples: 2044028520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:18,393][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 03:29:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2044051456. Throughput: 0: 42941.9. Samples: 2044166400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 03:29:23,467][15401] Updated weights for policy 0, policy_version 124760 (0.0029) [2024-06-22 03:29:25,910][15401] Updated weights for policy 0, policy_version 124770 (0.0034) [2024-06-22 03:29:28,390][15132] Fps is (10 sec: 37692.3, 60 sec: 42871.4, 300 sec: 42766.0). Total num frames: 2044280832. Throughput: 0: 42919.5. Samples: 2044414060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 03:29:31,060][15401] Updated weights for policy 0, policy_version 124780 (0.0028) [2024-06-22 03:29:33,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 2044542976. Throughput: 0: 42932.2. Samples: 2044669840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 03:29:33,593][15401] Updated weights for policy 0, policy_version 124790 (0.0039) [2024-06-22 03:29:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2044706816. Throughput: 0: 42838.5. Samples: 2044808320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 03:29:38,558][15401] Updated weights for policy 0, policy_version 124800 (0.0031) [2024-06-22 03:29:41,337][15401] Updated weights for policy 0, policy_version 124810 (0.0035) [2024-06-22 03:29:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2044936192. Throughput: 0: 42761.7. Samples: 2045055260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 03:29:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124813_2044936192.pth... [2024-06-22 03:29:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124186_2034663424.pth [2024-06-22 03:29:46,294][15401] Updated weights for policy 0, policy_version 124820 (0.0046) [2024-06-22 03:29:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2045165568. Throughput: 0: 42812.4. Samples: 2045311540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 03:29:49,118][15401] Updated weights for policy 0, policy_version 124830 (0.0041) [2024-06-22 03:29:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2045313024. Throughput: 0: 42576.1. Samples: 2045441500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 03:29:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 03:29:53,866][15349] Signal inference workers to stop experience collection... (30150 times) [2024-06-22 03:29:53,896][15401] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-06-22 03:29:53,922][15349] Signal inference workers to resume experience collection... (30150 times) [2024-06-22 03:29:53,928][15401] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-06-22 03:29:53,931][15401] Updated weights for policy 0, policy_version 124840 (0.0030) [2024-06-22 03:29:56,666][15401] Updated weights for policy 0, policy_version 124850 (0.0041) [2024-06-22 03:29:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2045591552. Throughput: 0: 42715.1. Samples: 2045694660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:29:58,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 03:30:01,520][15401] Updated weights for policy 0, policy_version 124860 (0.0029) [2024-06-22 03:30:03,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2045804544. Throughput: 0: 42719.3. Samples: 2045950780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 03:30:04,588][15401] Updated weights for policy 0, policy_version 124870 (0.0036) [2024-06-22 03:30:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2045968384. Throughput: 0: 42462.6. Samples: 2046077220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 03:30:09,276][15401] Updated weights for policy 0, policy_version 124880 (0.0045) [2024-06-22 03:30:12,433][15401] Updated weights for policy 0, policy_version 124890 (0.0032) [2024-06-22 03:30:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2046214144. Throughput: 0: 42614.3. Samples: 2046331700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 03:30:16,989][15401] Updated weights for policy 0, policy_version 124900 (0.0042) [2024-06-22 03:30:18,389][15132] Fps is (10 sec: 49152.5, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 2046459904. Throughput: 0: 42545.1. Samples: 2046584360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 03:30:20,109][15401] Updated weights for policy 0, policy_version 124910 (0.0033) [2024-06-22 03:30:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2046623744. Throughput: 0: 42386.3. Samples: 2046715700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:23,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 03:30:24,881][15401] Updated weights for policy 0, policy_version 124920 (0.0025) [2024-06-22 03:30:27,770][15401] Updated weights for policy 0, policy_version 124930 (0.0031) [2024-06-22 03:30:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2046869504. Throughput: 0: 42488.0. Samples: 2046967220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 03:30:32,342][15401] Updated weights for policy 0, policy_version 124940 (0.0036) [2024-06-22 03:30:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2047066112. Throughput: 0: 42633.0. Samples: 2047230020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 03:30:35,700][15401] Updated weights for policy 0, policy_version 124950 (0.0036) [2024-06-22 03:30:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2047262720. Throughput: 0: 42455.8. Samples: 2047352020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 03:30:40,189][15401] Updated weights for policy 0, policy_version 124960 (0.0042) [2024-06-22 03:30:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2047492096. Throughput: 0: 42274.2. Samples: 2047597000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 03:30:43,512][15401] Updated weights for policy 0, policy_version 124970 (0.0038) [2024-06-22 03:30:47,659][15401] Updated weights for policy 0, policy_version 124980 (0.0029) [2024-06-22 03:30:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2047688704. Throughput: 0: 42343.5. Samples: 2047856240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 03:30:51,125][15401] Updated weights for policy 0, policy_version 124990 (0.0032) [2024-06-22 03:30:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2047901696. Throughput: 0: 42338.2. Samples: 2047982440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 03:30:55,196][15401] Updated weights for policy 0, policy_version 125000 (0.0036) [2024-06-22 03:30:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2048114688. Throughput: 0: 42447.0. Samples: 2048241820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-22 03:30:58,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 03:30:58,843][15401] Updated weights for policy 0, policy_version 125010 (0.0028) [2024-06-22 03:31:03,097][15401] Updated weights for policy 0, policy_version 125020 (0.0041) [2024-06-22 03:31:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2048327680. Throughput: 0: 42470.6. Samples: 2048495540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:03,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 03:31:06,487][15401] Updated weights for policy 0, policy_version 125030 (0.0035) [2024-06-22 03:31:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2048540672. Throughput: 0: 42319.5. Samples: 2048620080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 03:31:11,098][15401] Updated weights for policy 0, policy_version 125040 (0.0039) [2024-06-22 03:31:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2048770048. Throughput: 0: 42455.7. Samples: 2048877720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 03:31:14,399][15401] Updated weights for policy 0, policy_version 125050 (0.0028) [2024-06-22 03:31:15,569][15349] Signal inference workers to stop experience collection... (30200 times) [2024-06-22 03:31:15,601][15401] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-06-22 03:31:15,631][15349] Signal inference workers to resume experience collection... (30200 times) [2024-06-22 03:31:15,636][15401] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-06-22 03:31:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.0, 300 sec: 42709.5). Total num frames: 2048966656. Throughput: 0: 42361.6. Samples: 2049136300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 03:31:18,509][15401] Updated weights for policy 0, policy_version 125060 (0.0020) [2024-06-22 03:31:21,937][15401] Updated weights for policy 0, policy_version 125070 (0.0041) [2024-06-22 03:31:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2049196032. Throughput: 0: 42447.3. Samples: 2049262140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 03:31:25,996][15401] Updated weights for policy 0, policy_version 125080 (0.0048) [2024-06-22 03:31:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2049409024. Throughput: 0: 42913.8. Samples: 2049528120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:28,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 03:31:29,446][15401] Updated weights for policy 0, policy_version 125090 (0.0039) [2024-06-22 03:31:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2049622016. Throughput: 0: 42749.7. Samples: 2049779980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:33,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 03:31:33,540][15401] Updated weights for policy 0, policy_version 125100 (0.0029) [2024-06-22 03:31:37,350][15401] Updated weights for policy 0, policy_version 125110 (0.0032) [2024-06-22 03:31:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2049835008. Throughput: 0: 42733.2. Samples: 2049905440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:38,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 03:31:41,237][15401] Updated weights for policy 0, policy_version 125120 (0.0039) [2024-06-22 03:31:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2050048000. Throughput: 0: 42780.0. Samples: 2050166920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 03:31:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125125_2050048000.pth... [2024-06-22 03:31:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124500_2039808000.pth [2024-06-22 03:31:44,899][15401] Updated weights for policy 0, policy_version 125130 (0.0032) [2024-06-22 03:31:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2050244608. Throughput: 0: 42818.1. Samples: 2050422360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 03:31:49,009][15401] Updated weights for policy 0, policy_version 125140 (0.0045) [2024-06-22 03:31:52,505][15401] Updated weights for policy 0, policy_version 125150 (0.0029) [2024-06-22 03:31:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2050473984. Throughput: 0: 42806.8. Samples: 2050546380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 03:31:56,459][15401] Updated weights for policy 0, policy_version 125160 (0.0034) [2024-06-22 03:31:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2050686976. Throughput: 0: 42949.3. Samples: 2050810440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 03:31:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 03:31:59,907][15401] Updated weights for policy 0, policy_version 125170 (0.0028) [2024-06-22 03:32:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2050916352. Throughput: 0: 42966.7. Samples: 2051069800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 03:32:04,266][15401] Updated weights for policy 0, policy_version 125180 (0.0032) [2024-06-22 03:32:07,958][15401] Updated weights for policy 0, policy_version 125190 (0.0032) [2024-06-22 03:32:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2051129344. Throughput: 0: 42999.1. Samples: 2051197100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 03:32:11,734][15401] Updated weights for policy 0, policy_version 125200 (0.0027) [2024-06-22 03:32:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2051325952. Throughput: 0: 42852.1. Samples: 2051456460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 03:32:15,526][15401] Updated weights for policy 0, policy_version 125210 (0.0034) [2024-06-22 03:32:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42766.0). Total num frames: 2051555328. Throughput: 0: 43093.5. Samples: 2051719180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:18,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 03:32:19,144][15401] Updated weights for policy 0, policy_version 125220 (0.0030) [2024-06-22 03:32:23,215][15401] Updated weights for policy 0, policy_version 125230 (0.0035) [2024-06-22 03:32:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2051768320. Throughput: 0: 43162.2. Samples: 2051847740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 03:32:26,589][15401] Updated weights for policy 0, policy_version 125240 (0.0031) [2024-06-22 03:32:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2051997696. Throughput: 0: 43186.7. Samples: 2052110320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 03:32:30,737][15401] Updated weights for policy 0, policy_version 125250 (0.0022) [2024-06-22 03:32:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2052227072. Throughput: 0: 43341.0. Samples: 2052372700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 03:32:34,126][15401] Updated weights for policy 0, policy_version 125260 (0.0035) [2024-06-22 03:32:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2052407296. Throughput: 0: 43513.2. Samples: 2052504580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:38,392][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 03:32:38,582][15401] Updated weights for policy 0, policy_version 125270 (0.0024) [2024-06-22 03:32:40,784][15349] Signal inference workers to stop experience collection... (30250 times) [2024-06-22 03:32:40,785][15349] Signal inference workers to resume experience collection... (30250 times) [2024-06-22 03:32:40,814][15401] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-06-22 03:32:40,814][15401] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-06-22 03:32:41,727][15401] Updated weights for policy 0, policy_version 125280 (0.0032) [2024-06-22 03:32:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2052636672. Throughput: 0: 43213.2. Samples: 2052755040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:43,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-22 03:32:45,986][15401] Updated weights for policy 0, policy_version 125290 (0.0032) [2024-06-22 03:32:48,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 2052866048. Throughput: 0: 43225.3. Samples: 2053014940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:48,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 03:32:49,108][15401] Updated weights for policy 0, policy_version 125300 (0.0037) [2024-06-22 03:32:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2053062656. Throughput: 0: 43296.0. Samples: 2053145420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 03:32:53,749][15401] Updated weights for policy 0, policy_version 125310 (0.0041) [2024-06-22 03:32:56,723][15401] Updated weights for policy 0, policy_version 125320 (0.0025) [2024-06-22 03:32:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2053292032. Throughput: 0: 43123.9. Samples: 2053397040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:32:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 03:33:01,518][15401] Updated weights for policy 0, policy_version 125330 (0.0026) [2024-06-22 03:33:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2053505024. Throughput: 0: 43087.9. Samples: 2053658140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 03:33:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 03:33:04,288][15401] Updated weights for policy 0, policy_version 125340 (0.0042) [2024-06-22 03:33:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2053701632. Throughput: 0: 43189.4. Samples: 2053791260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 03:33:09,355][15401] Updated weights for policy 0, policy_version 125350 (0.0037) [2024-06-22 03:33:11,797][15401] Updated weights for policy 0, policy_version 125360 (0.0040) [2024-06-22 03:33:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 42876.4). Total num frames: 2053947392. Throughput: 0: 43013.7. Samples: 2054045940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 03:33:16,903][15401] Updated weights for policy 0, policy_version 125370 (0.0043) [2024-06-22 03:33:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2054160384. Throughput: 0: 42952.4. Samples: 2054305560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 03:33:19,678][15401] Updated weights for policy 0, policy_version 125380 (0.0046) [2024-06-22 03:33:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2054340608. Throughput: 0: 42968.6. Samples: 2054438060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 03:33:24,453][15401] Updated weights for policy 0, policy_version 125390 (0.0026) [2024-06-22 03:33:27,151][15401] Updated weights for policy 0, policy_version 125400 (0.0035) [2024-06-22 03:33:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2054586368. Throughput: 0: 42982.6. Samples: 2054689260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:28,396][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 03:33:31,881][15401] Updated weights for policy 0, policy_version 125410 (0.0044) [2024-06-22 03:33:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2054799360. Throughput: 0: 43113.5. Samples: 2054955040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 03:33:34,811][15401] Updated weights for policy 0, policy_version 125420 (0.0042) [2024-06-22 03:33:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2054979584. Throughput: 0: 42922.7. Samples: 2055076940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 03:33:39,333][15401] Updated weights for policy 0, policy_version 125430 (0.0040) [2024-06-22 03:33:42,251][15401] Updated weights for policy 0, policy_version 125440 (0.0028) [2024-06-22 03:33:42,265][15349] Signal inference workers to stop experience collection... (30300 times) [2024-06-22 03:33:42,271][15349] Signal inference workers to resume experience collection... (30300 times) [2024-06-22 03:33:42,311][15401] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-06-22 03:33:42,311][15401] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-06-22 03:33:43,394][15132] Fps is (10 sec: 44218.6, 60 sec: 43414.7, 300 sec: 42875.5). Total num frames: 2055241728. Throughput: 0: 43091.2. Samples: 2055336320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:43,394][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 03:33:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125443_2055258112.pth... [2024-06-22 03:33:43,524][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000124813_2044936192.pth [2024-06-22 03:33:46,907][15401] Updated weights for policy 0, policy_version 125450 (0.0032) [2024-06-22 03:33:48,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 2055454720. Throughput: 0: 43318.5. Samples: 2055607480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 03:33:49,665][15401] Updated weights for policy 0, policy_version 125460 (0.0030) [2024-06-22 03:33:53,389][15132] Fps is (10 sec: 37698.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2055618560. Throughput: 0: 43140.9. Samples: 2055732600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 03:33:54,602][15401] Updated weights for policy 0, policy_version 125470 (0.0032) [2024-06-22 03:33:57,192][15401] Updated weights for policy 0, policy_version 125480 (0.0035) [2024-06-22 03:33:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2055880704. Throughput: 0: 43206.1. Samples: 2055990220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:33:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 03:34:02,078][15401] Updated weights for policy 0, policy_version 125490 (0.0032) [2024-06-22 03:34:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 2056093696. Throughput: 0: 43317.4. Samples: 2056254840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 03:34:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 03:34:05,095][15401] Updated weights for policy 0, policy_version 125500 (0.0028) [2024-06-22 03:34:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2056290304. Throughput: 0: 43236.8. Samples: 2056383720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:08,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 03:34:09,497][15401] Updated weights for policy 0, policy_version 125510 (0.0050) [2024-06-22 03:34:12,669][15401] Updated weights for policy 0, policy_version 125520 (0.0041) [2024-06-22 03:34:13,393][15132] Fps is (10 sec: 44221.1, 60 sec: 43142.0, 300 sec: 42820.4). Total num frames: 2056536064. Throughput: 0: 43238.5. Samples: 2056635140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:13,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 03:34:17,169][15401] Updated weights for policy 0, policy_version 125530 (0.0027) [2024-06-22 03:34:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2056716288. Throughput: 0: 43212.9. Samples: 2056899620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 03:34:20,715][15401] Updated weights for policy 0, policy_version 125540 (0.0027) [2024-06-22 03:34:23,390][15132] Fps is (10 sec: 40973.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2056945664. Throughput: 0: 43254.2. Samples: 2057023380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 03:34:24,995][15401] Updated weights for policy 0, policy_version 125550 (0.0029) [2024-06-22 03:34:28,185][15401] Updated weights for policy 0, policy_version 125560 (0.0038) [2024-06-22 03:34:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2057175040. Throughput: 0: 43284.2. Samples: 2057283940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 03:34:32,415][15401] Updated weights for policy 0, policy_version 125570 (0.0025) [2024-06-22 03:34:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2057371648. Throughput: 0: 43067.6. Samples: 2057545520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 03:34:35,893][15401] Updated weights for policy 0, policy_version 125580 (0.0037) [2024-06-22 03:34:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2057584640. Throughput: 0: 43084.0. Samples: 2057671380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 03:34:39,897][15401] Updated weights for policy 0, policy_version 125590 (0.0037) [2024-06-22 03:34:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.2, 300 sec: 42820.6). Total num frames: 2057797632. Throughput: 0: 43040.5. Samples: 2057927040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 03:34:43,824][15401] Updated weights for policy 0, policy_version 125600 (0.0033) [2024-06-22 03:34:47,442][15401] Updated weights for policy 0, policy_version 125610 (0.0032) [2024-06-22 03:34:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 2058010624. Throughput: 0: 42760.3. Samples: 2058179060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:48,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 03:34:51,346][15401] Updated weights for policy 0, policy_version 125620 (0.0044) [2024-06-22 03:34:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2058207232. Throughput: 0: 42720.9. Samples: 2058306160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:53,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 03:34:55,487][15401] Updated weights for policy 0, policy_version 125630 (0.0023) [2024-06-22 03:34:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2058420224. Throughput: 0: 42927.2. Samples: 2058566720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:34:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 03:34:58,546][15349] Signal inference workers to stop experience collection... (30350 times) [2024-06-22 03:34:58,599][15401] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-06-22 03:34:58,607][15349] Signal inference workers to resume experience collection... (30350 times) [2024-06-22 03:34:58,617][15401] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-06-22 03:34:58,938][15401] Updated weights for policy 0, policy_version 125640 (0.0033) [2024-06-22 03:35:03,077][15401] Updated weights for policy 0, policy_version 125650 (0.0052) [2024-06-22 03:35:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2058649600. Throughput: 0: 42689.4. Samples: 2058820640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 03:35:03,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 03:35:06,675][15401] Updated weights for policy 0, policy_version 125660 (0.0038) [2024-06-22 03:35:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2058878976. Throughput: 0: 42771.7. Samples: 2058948100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 03:35:10,544][15401] Updated weights for policy 0, policy_version 125670 (0.0039) [2024-06-22 03:35:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.9, 300 sec: 42820.5). Total num frames: 2059091968. Throughput: 0: 42782.0. Samples: 2059209120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 03:35:14,111][15401] Updated weights for policy 0, policy_version 125680 (0.0036) [2024-06-22 03:35:18,095][15401] Updated weights for policy 0, policy_version 125690 (0.0038) [2024-06-22 03:35:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2059304960. Throughput: 0: 42445.3. Samples: 2059455560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 03:35:21,917][15401] Updated weights for policy 0, policy_version 125700 (0.0039) [2024-06-22 03:35:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2059501568. Throughput: 0: 42603.1. Samples: 2059588520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 03:35:25,578][15401] Updated weights for policy 0, policy_version 125710 (0.0039) [2024-06-22 03:35:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 2059714560. Throughput: 0: 42796.9. Samples: 2059853000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:28,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 03:35:29,433][15401] Updated weights for policy 0, policy_version 125720 (0.0036) [2024-06-22 03:35:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2059927552. Throughput: 0: 42769.0. Samples: 2060103660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 03:35:33,636][15401] Updated weights for policy 0, policy_version 125730 (0.0029) [2024-06-22 03:35:37,004][15401] Updated weights for policy 0, policy_version 125740 (0.0040) [2024-06-22 03:35:38,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2060156928. Throughput: 0: 42836.0. Samples: 2060233780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 03:35:41,126][15401] Updated weights for policy 0, policy_version 125750 (0.0035) [2024-06-22 03:35:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2060353536. Throughput: 0: 42753.9. Samples: 2060490640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 03:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125754_2060353536.pth... [2024-06-22 03:35:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125125_2050048000.pth [2024-06-22 03:35:44,997][15401] Updated weights for policy 0, policy_version 125760 (0.0040) [2024-06-22 03:35:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2060582912. Throughput: 0: 42746.6. Samples: 2060744240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 03:35:48,766][15401] Updated weights for policy 0, policy_version 125770 (0.0036) [2024-06-22 03:35:52,566][15401] Updated weights for policy 0, policy_version 125780 (0.0027) [2024-06-22 03:35:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2060795904. Throughput: 0: 42959.0. Samples: 2060881260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 03:35:56,243][15401] Updated weights for policy 0, policy_version 125790 (0.0028) [2024-06-22 03:35:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2061008896. Throughput: 0: 42805.8. Samples: 2061135380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:35:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 03:36:00,204][15401] Updated weights for policy 0, policy_version 125800 (0.0038) [2024-06-22 03:36:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 2061205504. Throughput: 0: 43090.3. Samples: 2061394620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:36:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 03:36:03,975][15401] Updated weights for policy 0, policy_version 125810 (0.0027) [2024-06-22 03:36:07,789][15401] Updated weights for policy 0, policy_version 125820 (0.0037) [2024-06-22 03:36:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 2061451264. Throughput: 0: 43018.6. Samples: 2061524460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 03:36:08,392][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 03:36:11,616][15401] Updated weights for policy 0, policy_version 125830 (0.0031) [2024-06-22 03:36:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 2061647872. Throughput: 0: 42792.0. Samples: 2061778540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 03:36:15,438][15401] Updated weights for policy 0, policy_version 125840 (0.0026) [2024-06-22 03:36:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2061877248. Throughput: 0: 43028.4. Samples: 2062039940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 03:36:19,160][15401] Updated weights for policy 0, policy_version 125850 (0.0033) [2024-06-22 03:36:21,369][15349] Signal inference workers to stop experience collection... (30400 times) [2024-06-22 03:36:21,369][15349] Signal inference workers to resume experience collection... (30400 times) [2024-06-22 03:36:21,412][15401] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-06-22 03:36:21,412][15401] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-06-22 03:36:22,977][15401] Updated weights for policy 0, policy_version 125860 (0.0031) [2024-06-22 03:36:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 2062106624. Throughput: 0: 43072.5. Samples: 2062172040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 03:36:26,620][15401] Updated weights for policy 0, policy_version 125870 (0.0033) [2024-06-22 03:36:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 2062286848. Throughput: 0: 42926.2. Samples: 2062422320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 03:36:30,503][15401] Updated weights for policy 0, policy_version 125880 (0.0032) [2024-06-22 03:36:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2062532608. Throughput: 0: 43194.2. Samples: 2062687980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 03:36:34,057][15401] Updated weights for policy 0, policy_version 125890 (0.0044) [2024-06-22 03:36:38,306][15401] Updated weights for policy 0, policy_version 125900 (0.0034) [2024-06-22 03:36:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 2062745600. Throughput: 0: 43187.4. Samples: 2062824700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 03:36:41,530][15401] Updated weights for policy 0, policy_version 125910 (0.0035) [2024-06-22 03:36:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2062925824. Throughput: 0: 43020.1. Samples: 2063071280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 03:36:46,051][15401] Updated weights for policy 0, policy_version 125920 (0.0029) [2024-06-22 03:36:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2063171584. Throughput: 0: 42906.2. Samples: 2063325400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 03:36:49,223][15401] Updated weights for policy 0, policy_version 125930 (0.0041) [2024-06-22 03:36:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2063368192. Throughput: 0: 43096.6. Samples: 2063463700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 03:36:53,723][15401] Updated weights for policy 0, policy_version 125940 (0.0030) [2024-06-22 03:36:56,763][15401] Updated weights for policy 0, policy_version 125950 (0.0051) [2024-06-22 03:36:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2063581184. Throughput: 0: 42904.2. Samples: 2063709220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:36:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 03:37:01,277][15401] Updated weights for policy 0, policy_version 125960 (0.0035) [2024-06-22 03:37:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2063810560. Throughput: 0: 42897.3. Samples: 2063970320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:37:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 03:37:04,714][15401] Updated weights for policy 0, policy_version 125970 (0.0054) [2024-06-22 03:37:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 2064007168. Throughput: 0: 42837.2. Samples: 2064099720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 03:37:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 03:37:09,212][15401] Updated weights for policy 0, policy_version 125980 (0.0031) [2024-06-22 03:37:12,441][15401] Updated weights for policy 0, policy_version 125990 (0.0037) [2024-06-22 03:37:13,391][15132] Fps is (10 sec: 42594.0, 60 sec: 43143.8, 300 sec: 42987.0). Total num frames: 2064236544. Throughput: 0: 42982.5. Samples: 2064356580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:13,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 03:37:16,853][15401] Updated weights for policy 0, policy_version 126000 (0.0029) [2024-06-22 03:37:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2064465920. Throughput: 0: 42881.3. Samples: 2064617640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:18,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 03:37:19,837][15401] Updated weights for policy 0, policy_version 126010 (0.0025) [2024-06-22 03:37:23,390][15132] Fps is (10 sec: 42602.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2064662528. Throughput: 0: 42742.4. Samples: 2064748100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 03:37:24,254][15401] Updated weights for policy 0, policy_version 126020 (0.0026) [2024-06-22 03:37:27,481][15401] Updated weights for policy 0, policy_version 126030 (0.0030) [2024-06-22 03:37:28,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43413.0, 300 sec: 42930.7). Total num frames: 2064891904. Throughput: 0: 42849.4. Samples: 2064999780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:28,396][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 03:37:31,249][15349] Signal inference workers to stop experience collection... (30450 times) [2024-06-22 03:37:31,272][15401] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-06-22 03:37:31,363][15349] Signal inference workers to resume experience collection... (30450 times) [2024-06-22 03:37:31,363][15401] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-06-22 03:37:32,115][15401] Updated weights for policy 0, policy_version 126040 (0.0033) [2024-06-22 03:37:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 43098.6). Total num frames: 2065121280. Throughput: 0: 42958.9. Samples: 2065258560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 03:37:35,094][15401] Updated weights for policy 0, policy_version 126050 (0.0032) [2024-06-22 03:37:38,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2065301504. Throughput: 0: 42743.9. Samples: 2065387180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 03:37:39,612][15401] Updated weights for policy 0, policy_version 126060 (0.0027) [2024-06-22 03:37:42,781][15401] Updated weights for policy 0, policy_version 126070 (0.0028) [2024-06-22 03:37:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 2065547264. Throughput: 0: 43053.2. Samples: 2065646620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 03:37:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126071_2065547264.pth... [2024-06-22 03:37:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125443_2055258112.pth [2024-06-22 03:37:47,614][15401] Updated weights for policy 0, policy_version 126080 (0.0034) [2024-06-22 03:37:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 2065743872. Throughput: 0: 42925.6. Samples: 2065901980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:48,391][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 03:37:51,115][15401] Updated weights for policy 0, policy_version 126090 (0.0040) [2024-06-22 03:37:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2065956864. Throughput: 0: 42820.3. Samples: 2066026640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 03:37:55,214][15401] Updated weights for policy 0, policy_version 126100 (0.0038) [2024-06-22 03:37:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2066169856. Throughput: 0: 42750.8. Samples: 2066280320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:37:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 03:37:58,774][15401] Updated weights for policy 0, policy_version 126110 (0.0034) [2024-06-22 03:38:02,953][15401] Updated weights for policy 0, policy_version 126120 (0.0047) [2024-06-22 03:38:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2066366464. Throughput: 0: 42838.2. Samples: 2066545360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:38:03,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-22 03:38:06,470][15401] Updated weights for policy 0, policy_version 126130 (0.0037) [2024-06-22 03:38:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2066595840. Throughput: 0: 42658.1. Samples: 2066667720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:38:08,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-22 03:38:10,508][15401] Updated weights for policy 0, policy_version 126140 (0.0035) [2024-06-22 03:38:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43145.3, 300 sec: 42931.6). Total num frames: 2066825216. Throughput: 0: 42803.9. Samples: 2066925680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-22 03:38:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 03:38:14,061][15401] Updated weights for policy 0, policy_version 126150 (0.0029) [2024-06-22 03:38:18,006][15401] Updated weights for policy 0, policy_version 126160 (0.0025) [2024-06-22 03:38:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 2067005440. Throughput: 0: 42752.1. Samples: 2067182400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:18,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 03:38:21,857][15401] Updated weights for policy 0, policy_version 126170 (0.0047) [2024-06-22 03:38:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2067234816. Throughput: 0: 42644.0. Samples: 2067306160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:38:25,866][15401] Updated weights for policy 0, policy_version 126180 (0.0037) [2024-06-22 03:38:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 2067447808. Throughput: 0: 42610.3. Samples: 2067564080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:28,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 03:38:29,431][15401] Updated weights for policy 0, policy_version 126190 (0.0031) [2024-06-22 03:38:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 42876.1). Total num frames: 2067628032. Throughput: 0: 42687.1. Samples: 2067822900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 03:38:33,673][15401] Updated weights for policy 0, policy_version 126200 (0.0040) [2024-06-22 03:38:37,005][15401] Updated weights for policy 0, policy_version 126210 (0.0033) [2024-06-22 03:38:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 2067873792. Throughput: 0: 42579.2. Samples: 2067942700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:38,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 03:38:41,214][15401] Updated weights for policy 0, policy_version 126220 (0.0044) [2024-06-22 03:38:43,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2068103168. Throughput: 0: 42829.3. Samples: 2068207640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:43,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 03:38:44,676][15401] Updated weights for policy 0, policy_version 126230 (0.0040) [2024-06-22 03:38:47,892][15349] Signal inference workers to stop experience collection... (30500 times) [2024-06-22 03:38:47,892][15349] Signal inference workers to resume experience collection... (30500 times) [2024-06-22 03:38:47,923][15401] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-06-22 03:38:47,923][15401] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-06-22 03:38:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 2068283392. Throughput: 0: 42387.6. Samples: 2068452800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:48,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 03:38:48,913][15401] Updated weights for policy 0, policy_version 126240 (0.0038) [2024-06-22 03:38:52,331][15401] Updated weights for policy 0, policy_version 126250 (0.0036) [2024-06-22 03:38:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2068529152. Throughput: 0: 42541.0. Samples: 2068582060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 03:38:56,656][15401] Updated weights for policy 0, policy_version 126260 (0.0033) [2024-06-22 03:38:58,393][15132] Fps is (10 sec: 42584.2, 60 sec: 42322.9, 300 sec: 42764.5). Total num frames: 2068709376. Throughput: 0: 42495.0. Samples: 2068838100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:38:58,393][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 03:38:59,953][15401] Updated weights for policy 0, policy_version 126270 (0.0027) [2024-06-22 03:39:03,391][15132] Fps is (10 sec: 40952.0, 60 sec: 42870.1, 300 sec: 42875.8). Total num frames: 2068938752. Throughput: 0: 42480.8. Samples: 2069094120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:39:03,402][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 03:39:04,293][15401] Updated weights for policy 0, policy_version 126280 (0.0024) [2024-06-22 03:39:07,573][15401] Updated weights for policy 0, policy_version 126290 (0.0038) [2024-06-22 03:39:08,392][15132] Fps is (10 sec: 44242.8, 60 sec: 42597.0, 300 sec: 42765.2). Total num frames: 2069151744. Throughput: 0: 42675.0. Samples: 2069226620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:39:08,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 03:39:12,337][15401] Updated weights for policy 0, policy_version 126300 (0.0043) [2024-06-22 03:39:13,390][15132] Fps is (10 sec: 39328.8, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 2069331968. Throughput: 0: 42536.3. Samples: 2069478220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:39:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 03:39:15,808][15401] Updated weights for policy 0, policy_version 126310 (0.0040) [2024-06-22 03:39:18,390][15132] Fps is (10 sec: 40968.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2069561344. Throughput: 0: 42317.8. Samples: 2069727200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:18,391][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 03:39:20,066][15401] Updated weights for policy 0, policy_version 126320 (0.0038) [2024-06-22 03:39:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2069774336. Throughput: 0: 42553.8. Samples: 2069857620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:39:23,447][15401] Updated weights for policy 0, policy_version 126330 (0.0042) [2024-06-22 03:39:27,833][15401] Updated weights for policy 0, policy_version 126340 (0.0030) [2024-06-22 03:39:28,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 2069987328. Throughput: 0: 42297.1. Samples: 2070111280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:28,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 03:39:31,139][15401] Updated weights for policy 0, policy_version 126350 (0.0034) [2024-06-22 03:39:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2070200320. Throughput: 0: 42422.6. Samples: 2070361920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:33,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 03:39:35,455][15401] Updated weights for policy 0, policy_version 126360 (0.0034) [2024-06-22 03:39:38,390][15132] Fps is (10 sec: 42624.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2070413312. Throughput: 0: 42595.0. Samples: 2070498840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 03:39:38,669][15401] Updated weights for policy 0, policy_version 126370 (0.0027) [2024-06-22 03:39:43,077][15401] Updated weights for policy 0, policy_version 126380 (0.0038) [2024-06-22 03:39:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2070626304. Throughput: 0: 42497.9. Samples: 2070750360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 03:39:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126381_2070626304.pth... [2024-06-22 03:39:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000125754_2060353536.pth [2024-06-22 03:39:46,229][15401] Updated weights for policy 0, policy_version 126390 (0.0039) [2024-06-22 03:39:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2070855680. Throughput: 0: 42532.0. Samples: 2071007980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 03:39:50,587][15401] Updated weights for policy 0, policy_version 126400 (0.0037) [2024-06-22 03:39:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2071068672. Throughput: 0: 42529.9. Samples: 2071140380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 03:39:54,169][15401] Updated weights for policy 0, policy_version 126410 (0.0030) [2024-06-22 03:39:58,010][15401] Updated weights for policy 0, policy_version 126420 (0.0045) [2024-06-22 03:39:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.8, 300 sec: 42765.0). Total num frames: 2071265280. Throughput: 0: 42540.6. Samples: 2071392540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:39:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 03:40:01,586][15401] Updated weights for policy 0, policy_version 126430 (0.0041) [2024-06-22 03:40:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42326.7, 300 sec: 42709.5). Total num frames: 2071478272. Throughput: 0: 42783.6. Samples: 2071652460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:40:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 03:40:05,930][15401] Updated weights for policy 0, policy_version 126440 (0.0039) [2024-06-22 03:40:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.8, 300 sec: 42765.0). Total num frames: 2071707648. Throughput: 0: 42804.0. Samples: 2071783800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:40:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 03:40:09,103][15401] Updated weights for policy 0, policy_version 126450 (0.0033) [2024-06-22 03:40:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2071904256. Throughput: 0: 42768.3. Samples: 2072035580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:40:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 03:40:13,462][15401] Updated weights for policy 0, policy_version 126460 (0.0034) [2024-06-22 03:40:16,020][15349] Signal inference workers to stop experience collection... (30550 times) [2024-06-22 03:40:16,020][15349] Signal inference workers to resume experience collection... (30550 times) [2024-06-22 03:40:16,036][15401] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-06-22 03:40:16,068][15401] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-06-22 03:40:16,842][15401] Updated weights for policy 0, policy_version 126470 (0.0030) [2024-06-22 03:40:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2072133632. Throughput: 0: 43006.3. Samples: 2072297100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 03:40:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 03:40:21,099][15401] Updated weights for policy 0, policy_version 126480 (0.0033) [2024-06-22 03:40:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 2072330240. Throughput: 0: 42796.7. Samples: 2072424680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 03:40:24,855][15401] Updated weights for policy 0, policy_version 126490 (0.0031) [2024-06-22 03:40:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 2072543232. Throughput: 0: 42836.0. Samples: 2072677980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 03:40:28,929][15401] Updated weights for policy 0, policy_version 126500 (0.0043) [2024-06-22 03:40:32,502][15401] Updated weights for policy 0, policy_version 126510 (0.0030) [2024-06-22 03:40:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 2072756224. Throughput: 0: 42832.5. Samples: 2072935440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 03:40:36,719][15401] Updated weights for policy 0, policy_version 126520 (0.0027) [2024-06-22 03:40:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 2072985600. Throughput: 0: 42759.1. Samples: 2073064540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 03:40:40,185][15401] Updated weights for policy 0, policy_version 126530 (0.0035) [2024-06-22 03:40:43,391][15132] Fps is (10 sec: 44230.4, 60 sec: 42870.4, 300 sec: 42764.8). Total num frames: 2073198592. Throughput: 0: 42737.2. Samples: 2073315780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:43,391][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 03:40:44,214][15401] Updated weights for policy 0, policy_version 126540 (0.0035) [2024-06-22 03:40:47,843][15401] Updated weights for policy 0, policy_version 126550 (0.0024) [2024-06-22 03:40:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2073411584. Throughput: 0: 42708.4. Samples: 2073574340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 03:40:51,664][15401] Updated weights for policy 0, policy_version 126560 (0.0033) [2024-06-22 03:40:53,390][15132] Fps is (10 sec: 42604.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2073624576. Throughput: 0: 42606.7. Samples: 2073701100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 03:40:55,353][15401] Updated weights for policy 0, policy_version 126570 (0.0033) [2024-06-22 03:40:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2073853952. Throughput: 0: 42818.5. Samples: 2073962420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:40:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 03:40:59,079][15401] Updated weights for policy 0, policy_version 126580 (0.0022) [2024-06-22 03:41:02,968][15401] Updated weights for policy 0, policy_version 126590 (0.0035) [2024-06-22 03:41:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 2074050560. Throughput: 0: 42782.6. Samples: 2074222320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:41:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 03:41:06,550][15401] Updated weights for policy 0, policy_version 126600 (0.0026) [2024-06-22 03:41:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2074263552. Throughput: 0: 42791.9. Samples: 2074350320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:41:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 03:41:10,450][15401] Updated weights for policy 0, policy_version 126610 (0.0038) [2024-06-22 03:41:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2074509312. Throughput: 0: 42856.0. Samples: 2074606500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:41:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 03:41:14,050][15401] Updated weights for policy 0, policy_version 126620 (0.0043) [2024-06-22 03:41:18,059][15401] Updated weights for policy 0, policy_version 126630 (0.0036) [2024-06-22 03:41:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2074705920. Throughput: 0: 42810.6. Samples: 2074861920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 03:41:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 03:41:22,224][15401] Updated weights for policy 0, policy_version 126640 (0.0037) [2024-06-22 03:41:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2074918912. Throughput: 0: 42758.7. Samples: 2074988680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 03:41:25,535][15401] Updated weights for policy 0, policy_version 126650 (0.0046) [2024-06-22 03:41:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2075131904. Throughput: 0: 43056.6. Samples: 2075253260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 03:41:29,703][15401] Updated weights for policy 0, policy_version 126660 (0.0045) [2024-06-22 03:41:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2075344896. Throughput: 0: 43079.9. Samples: 2075512940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 03:41:33,604][15401] Updated weights for policy 0, policy_version 126670 (0.0034) [2024-06-22 03:41:37,617][15349] Signal inference workers to stop experience collection... (30600 times) [2024-06-22 03:41:37,668][15349] Signal inference workers to resume experience collection... (30600 times) [2024-06-22 03:41:37,670][15401] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-06-22 03:41:37,673][15401] Updated weights for policy 0, policy_version 126680 (0.0033) [2024-06-22 03:41:37,697][15401] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-06-22 03:41:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2075574272. Throughput: 0: 43035.9. Samples: 2075637720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:38,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 03:41:41,066][15401] Updated weights for policy 0, policy_version 126690 (0.0036) [2024-06-22 03:41:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43145.6, 300 sec: 42765.0). Total num frames: 2075787264. Throughput: 0: 42981.9. Samples: 2075896600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 03:41:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126697_2075803648.pth... [2024-06-22 03:41:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126071_2065547264.pth [2024-06-22 03:41:45,117][15401] Updated weights for policy 0, policy_version 126700 (0.0036) [2024-06-22 03:41:48,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2075967488. Throughput: 0: 42989.5. Samples: 2076156840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:48,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-22 03:41:48,877][15401] Updated weights for policy 0, policy_version 126710 (0.0035) [2024-06-22 03:41:52,686][15401] Updated weights for policy 0, policy_version 126720 (0.0037) [2024-06-22 03:41:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2076196864. Throughput: 0: 42845.4. Samples: 2076278360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:53,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-22 03:41:56,551][15401] Updated weights for policy 0, policy_version 126730 (0.0037) [2024-06-22 03:41:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2076426240. Throughput: 0: 42905.4. Samples: 2076537240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:41:58,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 03:42:00,315][15401] Updated weights for policy 0, policy_version 126740 (0.0032) [2024-06-22 03:42:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2076622848. Throughput: 0: 42988.0. Samples: 2076796380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:42:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 03:42:04,068][15401] Updated weights for policy 0, policy_version 126750 (0.0047) [2024-06-22 03:42:08,103][15401] Updated weights for policy 0, policy_version 126760 (0.0028) [2024-06-22 03:42:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 2076852224. Throughput: 0: 42891.0. Samples: 2076918780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:42:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 03:42:11,735][15401] Updated weights for policy 0, policy_version 126770 (0.0035) [2024-06-22 03:42:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2077081600. Throughput: 0: 42888.8. Samples: 2077183260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:42:13,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 03:42:15,787][15401] Updated weights for policy 0, policy_version 126780 (0.0041) [2024-06-22 03:42:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2077261824. Throughput: 0: 42826.8. Samples: 2077440140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:42:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 03:42:19,280][15401] Updated weights for policy 0, policy_version 126790 (0.0030) [2024-06-22 03:42:23,237][15401] Updated weights for policy 0, policy_version 126800 (0.0022) [2024-06-22 03:42:23,395][15132] Fps is (10 sec: 40939.3, 60 sec: 42867.8, 300 sec: 42709.7). Total num frames: 2077491200. Throughput: 0: 42717.5. Samples: 2077560220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 03:42:23,395][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 03:42:26,810][15401] Updated weights for policy 0, policy_version 126810 (0.0039) [2024-06-22 03:42:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2077736960. Throughput: 0: 42789.3. Samples: 2077822120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 03:42:30,792][15401] Updated weights for policy 0, policy_version 126820 (0.0037) [2024-06-22 03:42:33,390][15132] Fps is (10 sec: 42619.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2077917184. Throughput: 0: 42907.8. Samples: 2078087700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:33,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 03:42:34,470][15401] Updated weights for policy 0, policy_version 126830 (0.0032) [2024-06-22 03:42:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2078130176. Throughput: 0: 42751.1. Samples: 2078202160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 03:42:38,658][15401] Updated weights for policy 0, policy_version 126840 (0.0034) [2024-06-22 03:42:42,427][15401] Updated weights for policy 0, policy_version 126850 (0.0032) [2024-06-22 03:42:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2078375936. Throughput: 0: 42851.5. Samples: 2078465560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 03:42:46,416][15401] Updated weights for policy 0, policy_version 126860 (0.0031) [2024-06-22 03:42:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2078539776. Throughput: 0: 42944.1. Samples: 2078728860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 03:42:49,846][15401] Updated weights for policy 0, policy_version 126870 (0.0032) [2024-06-22 03:42:53,396][15132] Fps is (10 sec: 40933.9, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 2078785536. Throughput: 0: 42829.9. Samples: 2078846400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:53,396][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 03:42:54,111][15401] Updated weights for policy 0, policy_version 126880 (0.0032) [2024-06-22 03:42:57,273][15401] Updated weights for policy 0, policy_version 126890 (0.0030) [2024-06-22 03:42:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2078998528. Throughput: 0: 42732.1. Samples: 2079106200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:42:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 03:43:01,830][15401] Updated weights for policy 0, policy_version 126900 (0.0035) [2024-06-22 03:43:03,390][15132] Fps is (10 sec: 39346.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2079178752. Throughput: 0: 42844.3. Samples: 2079368140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:43:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 03:43:04,477][15349] Signal inference workers to stop experience collection... (30650 times) [2024-06-22 03:43:04,531][15401] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-06-22 03:43:04,540][15349] Signal inference workers to resume experience collection... (30650 times) [2024-06-22 03:43:04,544][15401] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-06-22 03:43:05,069][15401] Updated weights for policy 0, policy_version 126910 (0.0034) [2024-06-22 03:43:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2079408128. Throughput: 0: 42837.2. Samples: 2079487680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:43:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 03:43:09,302][15401] Updated weights for policy 0, policy_version 126920 (0.0030) [2024-06-22 03:43:12,992][15401] Updated weights for policy 0, policy_version 126930 (0.0035) [2024-06-22 03:43:13,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2079653888. Throughput: 0: 42767.1. Samples: 2079746640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:43:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 03:43:17,204][15401] Updated weights for policy 0, policy_version 126940 (0.0036) [2024-06-22 03:43:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2079834112. Throughput: 0: 42584.6. Samples: 2080004000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:43:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 03:43:20,485][15401] Updated weights for policy 0, policy_version 126950 (0.0030) [2024-06-22 03:43:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42602.0, 300 sec: 42709.5). Total num frames: 2080047104. Throughput: 0: 42820.4. Samples: 2080129080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 03:43:23,399][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 03:43:24,741][15401] Updated weights for policy 0, policy_version 126960 (0.0034) [2024-06-22 03:43:27,871][15401] Updated weights for policy 0, policy_version 126970 (0.0031) [2024-06-22 03:43:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2080276480. Throughput: 0: 42826.3. Samples: 2080392740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 03:43:32,217][15401] Updated weights for policy 0, policy_version 126980 (0.0032) [2024-06-22 03:43:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2080473088. Throughput: 0: 42680.3. Samples: 2080649480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 03:43:35,847][15401] Updated weights for policy 0, policy_version 126990 (0.0034) [2024-06-22 03:43:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2080686080. Throughput: 0: 42862.6. Samples: 2080774940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 03:43:39,665][15401] Updated weights for policy 0, policy_version 127000 (0.0032) [2024-06-22 03:43:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2080915456. Throughput: 0: 42823.0. Samples: 2081033240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 03:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127009_2080915456.pth... [2024-06-22 03:43:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126381_2070626304.pth [2024-06-22 03:43:43,735][15401] Updated weights for policy 0, policy_version 127010 (0.0037) [2024-06-22 03:43:47,515][15401] Updated weights for policy 0, policy_version 127020 (0.0033) [2024-06-22 03:43:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2081112064. Throughput: 0: 42643.7. Samples: 2081287100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 03:43:51,421][15401] Updated weights for policy 0, policy_version 127030 (0.0033) [2024-06-22 03:43:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.9, 300 sec: 42821.0). Total num frames: 2081341440. Throughput: 0: 42961.7. Samples: 2081420960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 03:43:55,237][15401] Updated weights for policy 0, policy_version 127040 (0.0027) [2024-06-22 03:43:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2081538048. Throughput: 0: 42769.4. Samples: 2081671260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:43:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 03:43:58,994][15401] Updated weights for policy 0, policy_version 127050 (0.0038) [2024-06-22 03:44:02,895][15401] Updated weights for policy 0, policy_version 127060 (0.0036) [2024-06-22 03:44:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42765.3). Total num frames: 2081767424. Throughput: 0: 42711.6. Samples: 2081926020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 03:44:06,846][15401] Updated weights for policy 0, policy_version 127070 (0.0032) [2024-06-22 03:44:08,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 2081980416. Throughput: 0: 42851.9. Samples: 2082057520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:08,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 03:44:10,512][15401] Updated weights for policy 0, policy_version 127080 (0.0031) [2024-06-22 03:44:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 2082193408. Throughput: 0: 42678.2. Samples: 2082313360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:13,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 03:44:14,675][15401] Updated weights for policy 0, policy_version 127090 (0.0029) [2024-06-22 03:44:18,320][15401] Updated weights for policy 0, policy_version 127100 (0.0034) [2024-06-22 03:44:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2082406400. Throughput: 0: 42681.7. Samples: 2082570160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 03:44:22,153][15401] Updated weights for policy 0, policy_version 127110 (0.0027) [2024-06-22 03:44:23,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 2082619392. Throughput: 0: 42671.6. Samples: 2082695160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 03:44:25,858][15401] Updated weights for policy 0, policy_version 127120 (0.0038) [2024-06-22 03:44:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2082848768. Throughput: 0: 42699.1. Samples: 2082954700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 03:44:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 03:44:29,692][15401] Updated weights for policy 0, policy_version 127130 (0.0026) [2024-06-22 03:44:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2083045376. Throughput: 0: 42937.8. Samples: 2083219300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 03:44:33,695][15401] Updated weights for policy 0, policy_version 127140 (0.0043) [2024-06-22 03:44:37,122][15401] Updated weights for policy 0, policy_version 127150 (0.0029) [2024-06-22 03:44:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2083274752. Throughput: 0: 42654.7. Samples: 2083340420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 03:44:40,895][15349] Signal inference workers to stop experience collection... (30700 times) [2024-06-22 03:44:40,941][15401] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-06-22 03:44:41,014][15349] Signal inference workers to resume experience collection... (30700 times) [2024-06-22 03:44:41,015][15401] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-06-22 03:44:41,151][15401] Updated weights for policy 0, policy_version 127160 (0.0030) [2024-06-22 03:44:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2083487744. Throughput: 0: 42968.8. Samples: 2083604860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:43,391][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 03:44:44,573][15401] Updated weights for policy 0, policy_version 127170 (0.0047) [2024-06-22 03:44:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2083684352. Throughput: 0: 43198.7. Samples: 2083869960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 03:44:48,530][15401] Updated weights for policy 0, policy_version 127180 (0.0024) [2024-06-22 03:44:52,290][15401] Updated weights for policy 0, policy_version 127190 (0.0042) [2024-06-22 03:44:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 2083913728. Throughput: 0: 43111.6. Samples: 2083997540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:53,393][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 03:44:56,047][15401] Updated weights for policy 0, policy_version 127200 (0.0036) [2024-06-22 03:44:58,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 2084159488. Throughput: 0: 43193.3. Samples: 2084256960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:44:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 03:45:00,234][15401] Updated weights for policy 0, policy_version 127210 (0.0032) [2024-06-22 03:45:03,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2084356096. Throughput: 0: 43205.4. Samples: 2084514400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:03,391][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 03:45:03,479][15401] Updated weights for policy 0, policy_version 127220 (0.0024) [2024-06-22 03:45:07,739][15401] Updated weights for policy 0, policy_version 127230 (0.0041) [2024-06-22 03:45:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2084569088. Throughput: 0: 43233.0. Samples: 2084640640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 03:45:10,943][15401] Updated weights for policy 0, policy_version 127240 (0.0031) [2024-06-22 03:45:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43692.4, 300 sec: 42987.2). Total num frames: 2084814848. Throughput: 0: 43391.6. Samples: 2084907320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 03:45:15,452][15401] Updated weights for policy 0, policy_version 127250 (0.0027) [2024-06-22 03:45:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 2085011456. Throughput: 0: 43071.1. Samples: 2085157500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 03:45:19,184][15401] Updated weights for policy 0, policy_version 127260 (0.0049) [2024-06-22 03:45:22,865][15401] Updated weights for policy 0, policy_version 127270 (0.0026) [2024-06-22 03:45:23,396][15132] Fps is (10 sec: 39296.6, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 2085208064. Throughput: 0: 43158.4. Samples: 2085282820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:23,396][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 03:45:26,560][15401] Updated weights for policy 0, policy_version 127280 (0.0025) [2024-06-22 03:45:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2085437440. Throughput: 0: 43284.1. Samples: 2085552640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 03:45:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 03:45:30,119][15401] Updated weights for policy 0, policy_version 127290 (0.0037) [2024-06-22 03:45:33,390][15132] Fps is (10 sec: 44264.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2085650432. Throughput: 0: 43162.5. Samples: 2085812280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 03:45:33,986][15401] Updated weights for policy 0, policy_version 127300 (0.0029) [2024-06-22 03:45:37,794][15401] Updated weights for policy 0, policy_version 127310 (0.0032) [2024-06-22 03:45:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.9, 300 sec: 42931.5). Total num frames: 2085863424. Throughput: 0: 43129.8. Samples: 2085938380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:38,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 03:45:41,574][15401] Updated weights for policy 0, policy_version 127320 (0.0027) [2024-06-22 03:45:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2086092800. Throughput: 0: 43269.7. Samples: 2086204100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 03:45:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127326_2086109184.pth... [2024-06-22 03:45:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000126697_2075803648.pth [2024-06-22 03:45:45,151][15401] Updated weights for policy 0, policy_version 127330 (0.0031) [2024-06-22 03:45:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2086289408. Throughput: 0: 43408.0. Samples: 2086467760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 03:45:49,073][15401] Updated weights for policy 0, policy_version 127340 (0.0027) [2024-06-22 03:45:52,628][15401] Updated weights for policy 0, policy_version 127350 (0.0036) [2024-06-22 03:45:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 2086502400. Throughput: 0: 43364.9. Samples: 2086592060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 03:45:54,879][15349] Signal inference workers to stop experience collection... (30750 times) [2024-06-22 03:45:54,929][15401] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-06-22 03:45:55,004][15349] Signal inference workers to resume experience collection... (30750 times) [2024-06-22 03:45:55,004][15401] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-06-22 03:45:56,627][15401] Updated weights for policy 0, policy_version 127360 (0.0035) [2024-06-22 03:45:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2086748160. Throughput: 0: 43162.7. Samples: 2086849640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:45:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 03:46:00,046][15401] Updated weights for policy 0, policy_version 127370 (0.0024) [2024-06-22 03:46:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2086928384. Throughput: 0: 43521.0. Samples: 2087115940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:03,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 03:46:04,277][15401] Updated weights for policy 0, policy_version 127380 (0.0045) [2024-06-22 03:46:07,505][15401] Updated weights for policy 0, policy_version 127390 (0.0024) [2024-06-22 03:46:08,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 2087157760. Throughput: 0: 43467.1. Samples: 2087238840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:08,396][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 03:46:11,946][15401] Updated weights for policy 0, policy_version 127400 (0.0040) [2024-06-22 03:46:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2087370752. Throughput: 0: 43174.1. Samples: 2087495480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 03:46:15,561][15401] Updated weights for policy 0, policy_version 127410 (0.0039) [2024-06-22 03:46:18,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2087583744. Throughput: 0: 43164.6. Samples: 2087754680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 03:46:19,813][15401] Updated weights for policy 0, policy_version 127420 (0.0043) [2024-06-22 03:46:23,062][15401] Updated weights for policy 0, policy_version 127430 (0.0035) [2024-06-22 03:46:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43422.1, 300 sec: 42987.1). Total num frames: 2087813120. Throughput: 0: 43135.1. Samples: 2087879360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:23,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 03:46:27,399][15401] Updated weights for policy 0, policy_version 127440 (0.0032) [2024-06-22 03:46:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2088009728. Throughput: 0: 43070.8. Samples: 2088142280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 03:46:28,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 03:46:31,197][15401] Updated weights for policy 0, policy_version 127450 (0.0034) [2024-06-22 03:46:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2088222720. Throughput: 0: 42893.3. Samples: 2088397960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 03:46:35,074][15401] Updated weights for policy 0, policy_version 127460 (0.0029) [2024-06-22 03:46:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2088452096. Throughput: 0: 42933.3. Samples: 2088524060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 03:46:38,710][15401] Updated weights for policy 0, policy_version 127470 (0.0034) [2024-06-22 03:46:42,775][15401] Updated weights for policy 0, policy_version 127480 (0.0036) [2024-06-22 03:46:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 2088632320. Throughput: 0: 42779.5. Samples: 2088774720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 03:46:46,220][15401] Updated weights for policy 0, policy_version 127490 (0.0032) [2024-06-22 03:46:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2088861696. Throughput: 0: 42623.1. Samples: 2089033980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 03:46:50,795][15401] Updated weights for policy 0, policy_version 127500 (0.0023) [2024-06-22 03:46:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2089107456. Throughput: 0: 42792.8. Samples: 2089164240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 03:46:53,677][15401] Updated weights for policy 0, policy_version 127510 (0.0032) [2024-06-22 03:46:58,389][15401] Updated weights for policy 0, policy_version 127520 (0.0031) [2024-06-22 03:46:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2089287680. Throughput: 0: 42740.6. Samples: 2089418800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:46:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 03:47:01,232][15401] Updated weights for policy 0, policy_version 127530 (0.0023) [2024-06-22 03:47:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2089517056. Throughput: 0: 42655.0. Samples: 2089674160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 03:47:05,899][15401] Updated weights for policy 0, policy_version 127540 (0.0026) [2024-06-22 03:47:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.2, 300 sec: 42876.1). Total num frames: 2089730048. Throughput: 0: 42832.2. Samples: 2089806800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 03:47:08,746][15401] Updated weights for policy 0, policy_version 127550 (0.0036) [2024-06-22 03:47:13,247][15401] Updated weights for policy 0, policy_version 127560 (0.0039) [2024-06-22 03:47:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 2089943040. Throughput: 0: 42739.9. Samples: 2090065580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 03:47:16,614][15401] Updated weights for policy 0, policy_version 127570 (0.0038) [2024-06-22 03:47:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42932.4). Total num frames: 2090156032. Throughput: 0: 42617.9. Samples: 2090315760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:18,396][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 03:47:20,891][15401] Updated weights for policy 0, policy_version 127580 (0.0042) [2024-06-22 03:47:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2090352640. Throughput: 0: 42690.6. Samples: 2090445140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 03:47:24,226][15401] Updated weights for policy 0, policy_version 127590 (0.0028) [2024-06-22 03:47:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2090582016. Throughput: 0: 42904.5. Samples: 2090705420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 03:47:28,525][15401] Updated weights for policy 0, policy_version 127600 (0.0034) [2024-06-22 03:47:31,333][15349] Signal inference workers to stop experience collection... (30800 times) [2024-06-22 03:47:31,333][15349] Signal inference workers to resume experience collection... (30800 times) [2024-06-22 03:47:31,380][15401] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-06-22 03:47:31,380][15401] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-06-22 03:47:31,871][15401] Updated weights for policy 0, policy_version 127610 (0.0030) [2024-06-22 03:47:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2090811392. Throughput: 0: 42756.9. Samples: 2090958040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 03:47:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 03:47:36,155][15401] Updated weights for policy 0, policy_version 127620 (0.0028) [2024-06-22 03:47:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2091008000. Throughput: 0: 42800.0. Samples: 2091090240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:47:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 03:47:39,484][15401] Updated weights for policy 0, policy_version 127630 (0.0022) [2024-06-22 03:47:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2091237376. Throughput: 0: 42946.1. Samples: 2091351380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:47:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 03:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127639_2091237376.pth... [2024-06-22 03:47:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127009_2080915456.pth [2024-06-22 03:47:43,637][15401] Updated weights for policy 0, policy_version 127640 (0.0030) [2024-06-22 03:47:47,204][15401] Updated weights for policy 0, policy_version 127650 (0.0030) [2024-06-22 03:47:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42988.1). Total num frames: 2091466752. Throughput: 0: 42961.0. Samples: 2091607400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:47:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 03:47:51,122][15401] Updated weights for policy 0, policy_version 127660 (0.0035) [2024-06-22 03:47:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 2091630592. Throughput: 0: 42897.2. Samples: 2091737180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:47:53,392][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 03:47:54,886][15401] Updated weights for policy 0, policy_version 127670 (0.0030) [2024-06-22 03:47:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 2091876352. Throughput: 0: 43014.6. Samples: 2092001240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:47:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 03:47:58,595][15401] Updated weights for policy 0, policy_version 127680 (0.0032) [2024-06-22 03:48:02,716][15401] Updated weights for policy 0, policy_version 127690 (0.0022) [2024-06-22 03:48:03,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2092105728. Throughput: 0: 42999.1. Samples: 2092250720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 03:48:06,238][15401] Updated weights for policy 0, policy_version 127700 (0.0038) [2024-06-22 03:48:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2092302336. Throughput: 0: 43050.6. Samples: 2092382420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 03:48:10,340][15401] Updated weights for policy 0, policy_version 127710 (0.0028) [2024-06-22 03:48:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2092515328. Throughput: 0: 43081.8. Samples: 2092644100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 03:48:13,782][15401] Updated weights for policy 0, policy_version 127720 (0.0034) [2024-06-22 03:48:17,922][15401] Updated weights for policy 0, policy_version 127730 (0.0036) [2024-06-22 03:48:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2092744704. Throughput: 0: 42985.2. Samples: 2092892380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 03:48:21,993][15401] Updated weights for policy 0, policy_version 127740 (0.0045) [2024-06-22 03:48:23,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2092924928. Throughput: 0: 42879.5. Samples: 2093019820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 03:48:25,694][15401] Updated weights for policy 0, policy_version 127750 (0.0044) [2024-06-22 03:48:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2093154304. Throughput: 0: 42750.7. Samples: 2093275160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 03:48:29,563][15401] Updated weights for policy 0, policy_version 127760 (0.0042) [2024-06-22 03:48:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2093350912. Throughput: 0: 42724.1. Samples: 2093529980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 03:48:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 03:48:33,678][15401] Updated weights for policy 0, policy_version 127770 (0.0032) [2024-06-22 03:48:37,245][15401] Updated weights for policy 0, policy_version 127780 (0.0023) [2024-06-22 03:48:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2093563904. Throughput: 0: 42620.5. Samples: 2093655100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:48:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 03:48:41,139][15401] Updated weights for policy 0, policy_version 127790 (0.0030) [2024-06-22 03:48:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2093793280. Throughput: 0: 42464.1. Samples: 2093912120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:48:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 03:48:45,185][15401] Updated weights for policy 0, policy_version 127800 (0.0033) [2024-06-22 03:48:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2094022656. Throughput: 0: 42729.4. Samples: 2094173540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:48:48,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 03:48:48,779][15401] Updated weights for policy 0, policy_version 127810 (0.0037) [2024-06-22 03:48:53,097][15401] Updated weights for policy 0, policy_version 127820 (0.0047) [2024-06-22 03:48:53,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43139.9, 300 sec: 42986.2). Total num frames: 2094219264. Throughput: 0: 42720.2. Samples: 2094305100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:48:53,397][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 03:48:56,192][15401] Updated weights for policy 0, policy_version 127830 (0.0040) [2024-06-22 03:48:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2094432256. Throughput: 0: 42540.7. Samples: 2094558440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:48:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 03:49:00,747][15401] Updated weights for policy 0, policy_version 127840 (0.0036) [2024-06-22 03:49:02,122][15349] Signal inference workers to stop experience collection... (30850 times) [2024-06-22 03:49:02,122][15349] Signal inference workers to resume experience collection... (30850 times) [2024-06-22 03:49:02,171][15401] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-06-22 03:49:02,171][15401] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-06-22 03:49:03,390][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 2094661632. Throughput: 0: 42650.7. Samples: 2094811660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 03:49:03,944][15401] Updated weights for policy 0, policy_version 127850 (0.0043) [2024-06-22 03:49:08,322][15401] Updated weights for policy 0, policy_version 127860 (0.0025) [2024-06-22 03:49:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 2094858240. Throughput: 0: 42712.4. Samples: 2094941880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 03:49:11,770][15401] Updated weights for policy 0, policy_version 127870 (0.0021) [2024-06-22 03:49:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2095071232. Throughput: 0: 42672.9. Samples: 2095195440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:13,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 03:49:16,014][15401] Updated weights for policy 0, policy_version 127880 (0.0036) [2024-06-22 03:49:18,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42598.2, 300 sec: 42987.1). Total num frames: 2095300608. Throughput: 0: 42777.4. Samples: 2095454980. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:18,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 03:49:19,372][15401] Updated weights for policy 0, policy_version 127890 (0.0048) [2024-06-22 03:49:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2095480832. Throughput: 0: 42946.7. Samples: 2095587700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 03:49:23,649][15401] Updated weights for policy 0, policy_version 127900 (0.0041) [2024-06-22 03:49:26,983][15401] Updated weights for policy 0, policy_version 127910 (0.0029) [2024-06-22 03:49:28,389][15132] Fps is (10 sec: 42600.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2095726592. Throughput: 0: 42869.8. Samples: 2095841260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 03:49:31,224][15401] Updated weights for policy 0, policy_version 127920 (0.0045) [2024-06-22 03:49:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 2095939584. Throughput: 0: 42750.2. Samples: 2096097300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:33,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 03:49:34,592][15401] Updated weights for policy 0, policy_version 127930 (0.0044) [2024-06-22 03:49:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2096119808. Throughput: 0: 42713.3. Samples: 2096226920. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-22 03:49:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 03:49:39,005][15401] Updated weights for policy 0, policy_version 127940 (0.0029) [2024-06-22 03:49:42,071][15401] Updated weights for policy 0, policy_version 127950 (0.0024) [2024-06-22 03:49:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2096381952. Throughput: 0: 42664.9. Samples: 2096478360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:49:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 03:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127953_2096381952.pth... [2024-06-22 03:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127326_2086109184.pth [2024-06-22 03:49:46,784][15401] Updated weights for policy 0, policy_version 127960 (0.0024) [2024-06-22 03:49:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 2096578560. Throughput: 0: 42734.3. Samples: 2096734700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:49:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 03:49:49,595][15401] Updated weights for policy 0, policy_version 127970 (0.0032) [2024-06-22 03:49:53,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42330.0, 300 sec: 42709.5). Total num frames: 2096758784. Throughput: 0: 42759.3. Samples: 2096866040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:49:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 03:49:54,363][15401] Updated weights for policy 0, policy_version 127980 (0.0038) [2024-06-22 03:49:57,722][15401] Updated weights for policy 0, policy_version 127990 (0.0034) [2024-06-22 03:49:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2097004544. Throughput: 0: 42797.7. Samples: 2097121340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:49:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 03:50:02,057][15401] Updated weights for policy 0, policy_version 128000 (0.0035) [2024-06-22 03:50:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2097217536. Throughput: 0: 42642.1. Samples: 2097373860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 03:50:05,474][15401] Updated weights for policy 0, policy_version 128010 (0.0026) [2024-06-22 03:50:08,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 2097397760. Throughput: 0: 42583.4. Samples: 2097504060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:08,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 03:50:09,634][15401] Updated weights for policy 0, policy_version 128020 (0.0027) [2024-06-22 03:50:13,070][15401] Updated weights for policy 0, policy_version 128030 (0.0031) [2024-06-22 03:50:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2097643520. Throughput: 0: 42628.9. Samples: 2097759560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 03:50:17,359][15401] Updated weights for policy 0, policy_version 128040 (0.0031) [2024-06-22 03:50:18,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.7, 300 sec: 42877.0). Total num frames: 2097856512. Throughput: 0: 42648.9. Samples: 2098016500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 03:50:19,844][15349] Signal inference workers to stop experience collection... (30900 times) [2024-06-22 03:50:19,844][15349] Signal inference workers to resume experience collection... (30900 times) [2024-06-22 03:50:19,884][15401] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-06-22 03:50:19,884][15401] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-06-22 03:50:20,734][15401] Updated weights for policy 0, policy_version 128050 (0.0028) [2024-06-22 03:50:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2098053120. Throughput: 0: 42555.4. Samples: 2098141920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 03:50:24,795][15401] Updated weights for policy 0, policy_version 128060 (0.0044) [2024-06-22 03:50:28,363][15401] Updated weights for policy 0, policy_version 128070 (0.0027) [2024-06-22 03:50:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2098298880. Throughput: 0: 42734.8. Samples: 2098401420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 03:50:32,713][15401] Updated weights for policy 0, policy_version 128080 (0.0034) [2024-06-22 03:50:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 2098495488. Throughput: 0: 42764.5. Samples: 2098659100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 03:50:36,055][15401] Updated weights for policy 0, policy_version 128090 (0.0023) [2024-06-22 03:50:38,396][15132] Fps is (10 sec: 40933.5, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 2098708480. Throughput: 0: 42536.0. Samples: 2098780440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 03:50:38,396][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 03:50:40,672][15401] Updated weights for policy 0, policy_version 128100 (0.0041) [2024-06-22 03:50:43,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2098937856. Throughput: 0: 42630.2. Samples: 2099039700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:50:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 03:50:43,654][15401] Updated weights for policy 0, policy_version 128110 (0.0038) [2024-06-22 03:50:48,305][15401] Updated weights for policy 0, policy_version 128120 (0.0031) [2024-06-22 03:50:48,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2099118080. Throughput: 0: 42817.8. Samples: 2099300660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:50:48,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 03:50:51,362][15401] Updated weights for policy 0, policy_version 128130 (0.0026) [2024-06-22 03:50:53,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2099331072. Throughput: 0: 42504.8. Samples: 2099416680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:50:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 03:50:56,035][15401] Updated weights for policy 0, policy_version 128140 (0.0031) [2024-06-22 03:50:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 2099576832. Throughput: 0: 42647.9. Samples: 2099678820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:50:58,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 03:50:59,032][15401] Updated weights for policy 0, policy_version 128150 (0.0033) [2024-06-22 03:51:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42654.9). Total num frames: 2099740672. Throughput: 0: 42605.4. Samples: 2099933740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 03:51:03,637][15401] Updated weights for policy 0, policy_version 128160 (0.0040) [2024-06-22 03:51:06,726][15401] Updated weights for policy 0, policy_version 128170 (0.0027) [2024-06-22 03:51:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 2099986432. Throughput: 0: 42461.0. Samples: 2100052660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 03:51:11,623][15401] Updated weights for policy 0, policy_version 128180 (0.0037) [2024-06-22 03:51:12,279][15349] Signal inference workers to stop experience collection... (30950 times) [2024-06-22 03:51:12,280][15349] Signal inference workers to resume experience collection... (30950 times) [2024-06-22 03:51:12,322][15401] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-06-22 03:51:12,322][15401] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-06-22 03:51:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2100215808. Throughput: 0: 42473.4. Samples: 2100312720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 03:51:14,567][15401] Updated weights for policy 0, policy_version 128190 (0.0037) [2024-06-22 03:51:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2100379648. Throughput: 0: 42418.1. Samples: 2100567920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 03:51:19,282][15401] Updated weights for policy 0, policy_version 128200 (0.0040) [2024-06-22 03:51:22,531][15401] Updated weights for policy 0, policy_version 128210 (0.0030) [2024-06-22 03:51:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2100641792. Throughput: 0: 42392.7. Samples: 2100687940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:23,393][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 03:51:26,832][15401] Updated weights for policy 0, policy_version 128220 (0.0037) [2024-06-22 03:51:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2100838400. Throughput: 0: 42468.2. Samples: 2100950760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 03:51:30,260][15401] Updated weights for policy 0, policy_version 128230 (0.0044) [2024-06-22 03:51:33,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2101018624. Throughput: 0: 42316.9. Samples: 2101204920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 03:51:34,325][15401] Updated weights for policy 0, policy_version 128240 (0.0041) [2024-06-22 03:51:37,810][15401] Updated weights for policy 0, policy_version 128250 (0.0033) [2024-06-22 03:51:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 2101264384. Throughput: 0: 42415.2. Samples: 2101325360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 03:51:42,358][15401] Updated weights for policy 0, policy_version 128260 (0.0038) [2024-06-22 03:51:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2101477376. Throughput: 0: 42505.3. Samples: 2101591460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 03:51:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 03:51:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128264_2101477376.pth... [2024-06-22 03:51:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127639_2091237376.pth [2024-06-22 03:51:45,567][15401] Updated weights for policy 0, policy_version 128270 (0.0022) [2024-06-22 03:51:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2101673984. Throughput: 0: 42529.3. Samples: 2101847560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:51:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 03:51:49,894][15401] Updated weights for policy 0, policy_version 128280 (0.0050) [2024-06-22 03:51:53,267][15401] Updated weights for policy 0, policy_version 128290 (0.0031) [2024-06-22 03:51:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2101903360. Throughput: 0: 42730.6. Samples: 2101975540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:51:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 03:51:57,451][15401] Updated weights for policy 0, policy_version 128300 (0.0032) [2024-06-22 03:51:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 2102116352. Throughput: 0: 42770.2. Samples: 2102237380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:51:58,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 03:52:00,806][15401] Updated weights for policy 0, policy_version 128310 (0.0037) [2024-06-22 03:52:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2102312960. Throughput: 0: 42700.0. Samples: 2102489420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 03:52:05,219][15401] Updated weights for policy 0, policy_version 128320 (0.0041) [2024-06-22 03:52:08,262][15401] Updated weights for policy 0, policy_version 128330 (0.0028) [2024-06-22 03:52:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2102558720. Throughput: 0: 42851.3. Samples: 2102616140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 03:52:12,986][15401] Updated weights for policy 0, policy_version 128340 (0.0032) [2024-06-22 03:52:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2102755328. Throughput: 0: 42770.5. Samples: 2102875440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 03:52:15,721][15401] Updated weights for policy 0, policy_version 128350 (0.0035) [2024-06-22 03:52:18,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2102951936. Throughput: 0: 42793.2. Samples: 2103130620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 03:52:20,394][15401] Updated weights for policy 0, policy_version 128360 (0.0033) [2024-06-22 03:52:23,244][15401] Updated weights for policy 0, policy_version 128370 (0.0033) [2024-06-22 03:52:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 2103214080. Throughput: 0: 43057.8. Samples: 2103262960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 03:52:27,874][15401] Updated weights for policy 0, policy_version 128380 (0.0028) [2024-06-22 03:52:28,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2103394304. Throughput: 0: 42917.8. Samples: 2103522760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 03:52:30,784][15349] Signal inference workers to stop experience collection... (31000 times) [2024-06-22 03:52:30,785][15349] Signal inference workers to resume experience collection... (31000 times) [2024-06-22 03:52:30,832][15401] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-06-22 03:52:30,832][15401] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-06-22 03:52:30,923][15401] Updated weights for policy 0, policy_version 128390 (0.0028) [2024-06-22 03:52:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2103607296. Throughput: 0: 42998.7. Samples: 2103782500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 03:52:35,221][15401] Updated weights for policy 0, policy_version 128400 (0.0024) [2024-06-22 03:52:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2103853056. Throughput: 0: 43000.6. Samples: 2103910560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 03:52:38,622][15401] Updated weights for policy 0, policy_version 128410 (0.0028) [2024-06-22 03:52:43,208][15401] Updated weights for policy 0, policy_version 128420 (0.0030) [2024-06-22 03:52:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2104033280. Throughput: 0: 42939.0. Samples: 2104169640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 03:52:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 03:52:46,452][15401] Updated weights for policy 0, policy_version 128430 (0.0037) [2024-06-22 03:52:48,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2104246272. Throughput: 0: 42877.4. Samples: 2104418900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:52:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 03:52:51,085][15401] Updated weights for policy 0, policy_version 128440 (0.0037) [2024-06-22 03:52:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2104475648. Throughput: 0: 43057.6. Samples: 2104553740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:52:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 03:52:53,971][15401] Updated weights for policy 0, policy_version 128450 (0.0032) [2024-06-22 03:52:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2104655872. Throughput: 0: 43068.9. Samples: 2104813540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:52:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 03:52:58,551][15401] Updated weights for policy 0, policy_version 128460 (0.0027) [2024-06-22 03:53:01,474][15401] Updated weights for policy 0, policy_version 128470 (0.0037) [2024-06-22 03:53:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2104901632. Throughput: 0: 42988.6. Samples: 2105065100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:03,397][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 03:53:06,324][15401] Updated weights for policy 0, policy_version 128480 (0.0042) [2024-06-22 03:53:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2105114624. Throughput: 0: 43125.8. Samples: 2105203620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 03:53:08,942][15401] Updated weights for policy 0, policy_version 128490 (0.0032) [2024-06-22 03:53:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2105311232. Throughput: 0: 43096.9. Samples: 2105462120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 03:53:13,847][15401] Updated weights for policy 0, policy_version 128500 (0.0027) [2024-06-22 03:53:16,497][15401] Updated weights for policy 0, policy_version 128510 (0.0041) [2024-06-22 03:53:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2105556992. Throughput: 0: 42933.3. Samples: 2105714500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 03:53:21,519][15401] Updated weights for policy 0, policy_version 128520 (0.0038) [2024-06-22 03:53:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2105769984. Throughput: 0: 43164.4. Samples: 2105852960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:23,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 03:53:24,687][15401] Updated weights for policy 0, policy_version 128530 (0.0029) [2024-06-22 03:53:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2105966592. Throughput: 0: 42923.0. Samples: 2106101280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:28,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 03:53:29,205][15401] Updated weights for policy 0, policy_version 128540 (0.0046) [2024-06-22 03:53:32,457][15401] Updated weights for policy 0, policy_version 128550 (0.0035) [2024-06-22 03:53:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2106212352. Throughput: 0: 42955.6. Samples: 2106351900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 03:53:36,922][15401] Updated weights for policy 0, policy_version 128560 (0.0038) [2024-06-22 03:53:37,418][15349] Signal inference workers to stop experience collection... (31050 times) [2024-06-22 03:53:37,418][15349] Signal inference workers to resume experience collection... (31050 times) [2024-06-22 03:53:37,453][15401] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-06-22 03:53:37,453][15401] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-06-22 03:53:38,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2106425344. Throughput: 0: 42892.1. Samples: 2106483880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 03:53:40,130][15401] Updated weights for policy 0, policy_version 128570 (0.0040) [2024-06-22 03:53:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2106605568. Throughput: 0: 42727.7. Samples: 2106736280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 03:53:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 03:53:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128578_2106621952.pth... [2024-06-22 03:53:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000127953_2096381952.pth [2024-06-22 03:53:44,426][15401] Updated weights for policy 0, policy_version 128580 (0.0033) [2024-06-22 03:53:47,994][15401] Updated weights for policy 0, policy_version 128590 (0.0036) [2024-06-22 03:53:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42821.5). Total num frames: 2106851328. Throughput: 0: 42746.3. Samples: 2106988680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:53:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 03:53:51,890][15401] Updated weights for policy 0, policy_version 128600 (0.0044) [2024-06-22 03:53:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2107047936. Throughput: 0: 42735.6. Samples: 2107126720. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:53:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 03:53:55,650][15401] Updated weights for policy 0, policy_version 128610 (0.0032) [2024-06-22 03:53:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2107244544. Throughput: 0: 42482.6. Samples: 2107373840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:53:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 03:53:59,873][15401] Updated weights for policy 0, policy_version 128620 (0.0027) [2024-06-22 03:54:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2107457536. Throughput: 0: 42550.8. Samples: 2107629280. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 03:54:03,424][15401] Updated weights for policy 0, policy_version 128630 (0.0027) [2024-06-22 03:54:07,518][15401] Updated weights for policy 0, policy_version 128640 (0.0034) [2024-06-22 03:54:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2107686912. Throughput: 0: 42292.5. Samples: 2107756120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 03:54:11,092][15401] Updated weights for policy 0, policy_version 128650 (0.0037) [2024-06-22 03:54:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2107883520. Throughput: 0: 42567.1. Samples: 2108016700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 03:54:15,049][15401] Updated weights for policy 0, policy_version 128660 (0.0044) [2024-06-22 03:54:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 2108112896. Throughput: 0: 42656.9. Samples: 2108271460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 03:54:18,818][15401] Updated weights for policy 0, policy_version 128670 (0.0049) [2024-06-22 03:54:22,725][15401] Updated weights for policy 0, policy_version 128680 (0.0027) [2024-06-22 03:54:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2108309504. Throughput: 0: 42501.0. Samples: 2108396420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 03:54:26,646][15401] Updated weights for policy 0, policy_version 128690 (0.0026) [2024-06-22 03:54:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2108538880. Throughput: 0: 42579.6. Samples: 2108652360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 03:54:30,155][15401] Updated weights for policy 0, policy_version 128700 (0.0039) [2024-06-22 03:54:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2108751872. Throughput: 0: 42592.8. Samples: 2108905360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 03:54:34,203][15401] Updated weights for policy 0, policy_version 128710 (0.0041) [2024-06-22 03:54:37,740][15401] Updated weights for policy 0, policy_version 128720 (0.0022) [2024-06-22 03:54:38,391][15132] Fps is (10 sec: 42592.7, 60 sec: 42324.4, 300 sec: 42653.8). Total num frames: 2108964864. Throughput: 0: 42472.5. Samples: 2109038040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:38,391][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 03:54:41,970][15401] Updated weights for policy 0, policy_version 128730 (0.0033) [2024-06-22 03:54:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2109177856. Throughput: 0: 42791.2. Samples: 2109299440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:43,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 03:54:45,612][15401] Updated weights for policy 0, policy_version 128740 (0.0034) [2024-06-22 03:54:48,390][15132] Fps is (10 sec: 42603.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2109390848. Throughput: 0: 42822.5. Samples: 2109556300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-22 03:54:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 03:54:49,510][15401] Updated weights for policy 0, policy_version 128750 (0.0043) [2024-06-22 03:54:53,012][15401] Updated weights for policy 0, policy_version 128760 (0.0039) [2024-06-22 03:54:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2109603840. Throughput: 0: 42818.0. Samples: 2109682940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:54:53,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 03:54:57,240][15401] Updated weights for policy 0, policy_version 128770 (0.0036) [2024-06-22 03:54:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2109816832. Throughput: 0: 43021.8. Samples: 2109952680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:54:58,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 03:55:00,470][15401] Updated weights for policy 0, policy_version 128780 (0.0028) [2024-06-22 03:55:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2110029824. Throughput: 0: 42894.2. Samples: 2110201700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 03:55:04,568][15349] Signal inference workers to stop experience collection... (31100 times) [2024-06-22 03:55:04,569][15349] Signal inference workers to resume experience collection... (31100 times) [2024-06-22 03:55:04,613][15401] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-06-22 03:55:04,613][15401] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-06-22 03:55:04,709][15401] Updated weights for policy 0, policy_version 128790 (0.0034) [2024-06-22 03:55:07,922][15401] Updated weights for policy 0, policy_version 128800 (0.0030) [2024-06-22 03:55:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2110259200. Throughput: 0: 42996.0. Samples: 2110331240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 03:55:12,414][15401] Updated weights for policy 0, policy_version 128810 (0.0036) [2024-06-22 03:55:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2110472192. Throughput: 0: 43170.6. Samples: 2110595040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 03:55:15,860][15401] Updated weights for policy 0, policy_version 128820 (0.0031) [2024-06-22 03:55:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2110668800. Throughput: 0: 43125.3. Samples: 2110846000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 03:55:20,245][15401] Updated weights for policy 0, policy_version 128830 (0.0034) [2024-06-22 03:55:23,299][15401] Updated weights for policy 0, policy_version 128840 (0.0034) [2024-06-22 03:55:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2110914560. Throughput: 0: 43015.4. Samples: 2110973680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:23,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 03:55:27,967][15401] Updated weights for policy 0, policy_version 128850 (0.0042) [2024-06-22 03:55:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2111094784. Throughput: 0: 43071.0. Samples: 2111237640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 03:55:30,813][15401] Updated weights for policy 0, policy_version 128860 (0.0030) [2024-06-22 03:55:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 2111324160. Throughput: 0: 42925.7. Samples: 2111487960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 03:55:35,596][15401] Updated weights for policy 0, policy_version 128870 (0.0032) [2024-06-22 03:55:38,371][15401] Updated weights for policy 0, policy_version 128880 (0.0034) [2024-06-22 03:55:38,392][15132] Fps is (10 sec: 47502.6, 60 sec: 43416.8, 300 sec: 42820.2). Total num frames: 2111569920. Throughput: 0: 42991.6. Samples: 2111617660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:38,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 03:55:43,193][15401] Updated weights for policy 0, policy_version 128890 (0.0046) [2024-06-22 03:55:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2111750144. Throughput: 0: 42792.0. Samples: 2111878320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 03:55:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128891_2111750144.pth... [2024-06-22 03:55:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128264_2101477376.pth [2024-06-22 03:55:46,743][15401] Updated weights for policy 0, policy_version 128900 (0.0031) [2024-06-22 03:55:48,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2111946752. Throughput: 0: 42817.0. Samples: 2112128460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 03:55:48,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 03:55:50,898][15401] Updated weights for policy 0, policy_version 128910 (0.0035) [2024-06-22 03:55:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 2112192512. Throughput: 0: 42831.8. Samples: 2112258680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:55:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 03:55:54,202][15401] Updated weights for policy 0, policy_version 128920 (0.0040) [2024-06-22 03:55:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2112372736. Throughput: 0: 42758.2. Samples: 2112519160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:55:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 03:55:58,524][15401] Updated weights for policy 0, policy_version 128930 (0.0052) [2024-06-22 03:56:01,675][15401] Updated weights for policy 0, policy_version 128940 (0.0028) [2024-06-22 03:56:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 2112602112. Throughput: 0: 42706.4. Samples: 2112767800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:03,391][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 03:56:06,056][15401] Updated weights for policy 0, policy_version 128950 (0.0030) [2024-06-22 03:56:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2112831488. Throughput: 0: 42817.4. Samples: 2112900460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 03:56:09,109][15401] Updated weights for policy 0, policy_version 128960 (0.0044) [2024-06-22 03:56:13,389][15132] Fps is (10 sec: 40961.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2113011712. Throughput: 0: 42808.5. Samples: 2113164020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:13,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 03:56:13,745][15401] Updated weights for policy 0, policy_version 128970 (0.0031) [2024-06-22 03:56:16,678][15401] Updated weights for policy 0, policy_version 128980 (0.0046) [2024-06-22 03:56:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 2113257472. Throughput: 0: 42899.7. Samples: 2113418440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:18,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 03:56:20,976][15349] Signal inference workers to stop experience collection... (31150 times) [2024-06-22 03:56:20,976][15349] Signal inference workers to resume experience collection... (31150 times) [2024-06-22 03:56:20,987][15401] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-06-22 03:56:20,987][15401] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-06-22 03:56:21,136][15401] Updated weights for policy 0, policy_version 128990 (0.0037) [2024-06-22 03:56:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2113470464. Throughput: 0: 42926.7. Samples: 2113549260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:23,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 03:56:24,125][15401] Updated weights for policy 0, policy_version 129000 (0.0022) [2024-06-22 03:56:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2113667072. Throughput: 0: 43006.4. Samples: 2113813600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 03:56:28,629][15401] Updated weights for policy 0, policy_version 129010 (0.0032) [2024-06-22 03:56:31,589][15401] Updated weights for policy 0, policy_version 129020 (0.0041) [2024-06-22 03:56:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2113912832. Throughput: 0: 43027.0. Samples: 2114064680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 03:56:36,315][15401] Updated weights for policy 0, policy_version 129030 (0.0035) [2024-06-22 03:56:38,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 2114125824. Throughput: 0: 43112.4. Samples: 2114198740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:38,391][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 03:56:39,151][15401] Updated weights for policy 0, policy_version 129040 (0.0023) [2024-06-22 03:56:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2114322432. Throughput: 0: 43128.9. Samples: 2114459960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 03:56:44,124][15401] Updated weights for policy 0, policy_version 129050 (0.0028) [2024-06-22 03:56:47,019][15401] Updated weights for policy 0, policy_version 129060 (0.0029) [2024-06-22 03:56:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.5, 300 sec: 42931.6). Total num frames: 2114568192. Throughput: 0: 42976.7. Samples: 2114701740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 03:56:51,786][15401] Updated weights for policy 0, policy_version 129070 (0.0035) [2024-06-22 03:56:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2114764800. Throughput: 0: 43051.0. Samples: 2114837760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 03:56:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 03:56:54,843][15401] Updated weights for policy 0, policy_version 129080 (0.0038) [2024-06-22 03:56:58,389][15132] Fps is (10 sec: 36045.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2114928640. Throughput: 0: 42870.3. Samples: 2115093180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:56:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:56:59,359][15401] Updated weights for policy 0, policy_version 129090 (0.0028) [2024-06-22 03:57:02,395][15401] Updated weights for policy 0, policy_version 129100 (0.0031) [2024-06-22 03:57:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42820.5). Total num frames: 2115190784. Throughput: 0: 42651.9. Samples: 2115337780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 03:57:07,152][15401] Updated weights for policy 0, policy_version 129110 (0.0046) [2024-06-22 03:57:08,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2115403776. Throughput: 0: 42900.8. Samples: 2115479800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 03:57:09,898][15401] Updated weights for policy 0, policy_version 129120 (0.0035) [2024-06-22 03:57:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2115584000. Throughput: 0: 42734.9. Samples: 2115736680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 03:57:14,678][15401] Updated weights for policy 0, policy_version 129130 (0.0027) [2024-06-22 03:57:17,456][15401] Updated weights for policy 0, policy_version 129140 (0.0037) [2024-06-22 03:57:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2115846144. Throughput: 0: 42770.6. Samples: 2115989360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 03:57:22,229][15401] Updated weights for policy 0, policy_version 129150 (0.0033) [2024-06-22 03:57:23,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2116059136. Throughput: 0: 42910.3. Samples: 2116129700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:23,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 03:57:25,301][15401] Updated weights for policy 0, policy_version 129160 (0.0033) [2024-06-22 03:57:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2116239360. Throughput: 0: 42595.6. Samples: 2116376760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:28,394][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 03:57:30,160][15401] Updated weights for policy 0, policy_version 129170 (0.0029) [2024-06-22 03:57:32,977][15401] Updated weights for policy 0, policy_version 129180 (0.0020) [2024-06-22 03:57:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2116485120. Throughput: 0: 42810.2. Samples: 2116628200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 03:57:37,758][15401] Updated weights for policy 0, policy_version 129190 (0.0029) [2024-06-22 03:57:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2116681728. Throughput: 0: 42844.5. Samples: 2116765760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 03:57:41,278][15401] Updated weights for policy 0, policy_version 129200 (0.0049) [2024-06-22 03:57:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2116894720. Throughput: 0: 42839.0. Samples: 2117020940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:43,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 03:57:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129205_2116894720.pth... [2024-06-22 03:57:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128578_2106621952.pth [2024-06-22 03:57:45,268][15401] Updated weights for policy 0, policy_version 129210 (0.0038) [2024-06-22 03:57:45,967][15349] Signal inference workers to stop experience collection... (31200 times) [2024-06-22 03:57:45,972][15349] Signal inference workers to resume experience collection... (31200 times) [2024-06-22 03:57:46,020][15401] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-06-22 03:57:46,020][15401] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-06-22 03:57:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2117107712. Throughput: 0: 43112.9. Samples: 2117277860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 03:57:48,742][15401] Updated weights for policy 0, policy_version 129220 (0.0031) [2024-06-22 03:57:52,986][15401] Updated weights for policy 0, policy_version 129230 (0.0033) [2024-06-22 03:57:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 2117337088. Throughput: 0: 42872.2. Samples: 2117409040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-22 03:57:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 03:57:56,257][15401] Updated weights for policy 0, policy_version 129240 (0.0033) [2024-06-22 03:57:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2117517312. Throughput: 0: 42856.0. Samples: 2117665200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:57:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 03:58:00,539][15401] Updated weights for policy 0, policy_version 129250 (0.0039) [2024-06-22 03:58:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2117763072. Throughput: 0: 42838.8. Samples: 2117917100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 03:58:04,050][15401] Updated weights for policy 0, policy_version 129260 (0.0038) [2024-06-22 03:58:07,991][15401] Updated weights for policy 0, policy_version 129270 (0.0038) [2024-06-22 03:58:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2117976064. Throughput: 0: 42748.4. Samples: 2118053380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:08,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 03:58:11,424][15401] Updated weights for policy 0, policy_version 129280 (0.0036) [2024-06-22 03:58:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 2118172672. Throughput: 0: 42980.6. Samples: 2118310880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 03:58:15,649][15401] Updated weights for policy 0, policy_version 129290 (0.0036) [2024-06-22 03:58:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2118418432. Throughput: 0: 42910.8. Samples: 2118559180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 03:58:19,243][15401] Updated weights for policy 0, policy_version 129300 (0.0027) [2024-06-22 03:58:23,162][15401] Updated weights for policy 0, policy_version 129310 (0.0038) [2024-06-22 03:58:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2118615040. Throughput: 0: 42915.9. Samples: 2118696980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 03:58:26,831][15401] Updated weights for policy 0, policy_version 129320 (0.0027) [2024-06-22 03:58:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2118828032. Throughput: 0: 42908.4. Samples: 2118951820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 03:58:30,835][15401] Updated weights for policy 0, policy_version 129330 (0.0035) [2024-06-22 03:58:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2119057408. Throughput: 0: 42948.9. Samples: 2119210560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 03:58:34,349][15401] Updated weights for policy 0, policy_version 129340 (0.0039) [2024-06-22 03:58:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2119254016. Throughput: 0: 42928.4. Samples: 2119340820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:38,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 03:58:38,514][15401] Updated weights for policy 0, policy_version 129350 (0.0033) [2024-06-22 03:58:41,849][15401] Updated weights for policy 0, policy_version 129360 (0.0033) [2024-06-22 03:58:43,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 2119467008. Throughput: 0: 42887.7. Samples: 2119595420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:43,397][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 03:58:46,066][15401] Updated weights for policy 0, policy_version 129370 (0.0029) [2024-06-22 03:58:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2119696384. Throughput: 0: 42932.0. Samples: 2119849040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 03:58:49,910][15401] Updated weights for policy 0, policy_version 129380 (0.0032) [2024-06-22 03:58:53,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2119909376. Throughput: 0: 42776.8. Samples: 2119978340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 03:58:53,655][15401] Updated weights for policy 0, policy_version 129390 (0.0025) [2024-06-22 03:58:57,732][15401] Updated weights for policy 0, policy_version 129400 (0.0037) [2024-06-22 03:58:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2120122368. Throughput: 0: 42867.9. Samples: 2120239940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 03:58:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 03:59:01,249][15401] Updated weights for policy 0, policy_version 129410 (0.0036) [2024-06-22 03:59:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2120351744. Throughput: 0: 43061.7. Samples: 2120496960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 03:59:05,225][15401] Updated weights for policy 0, policy_version 129420 (0.0032) [2024-06-22 03:59:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2120564736. Throughput: 0: 43052.6. Samples: 2120634340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 03:59:08,734][15401] Updated weights for policy 0, policy_version 129430 (0.0037) [2024-06-22 03:59:12,903][15401] Updated weights for policy 0, policy_version 129440 (0.0036) [2024-06-22 03:59:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2120761344. Throughput: 0: 43071.1. Samples: 2120890020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 03:59:16,235][15401] Updated weights for policy 0, policy_version 129450 (0.0039) [2024-06-22 03:59:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2120990720. Throughput: 0: 42866.8. Samples: 2121139560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 03:59:20,526][15401] Updated weights for policy 0, policy_version 129460 (0.0040) [2024-06-22 03:59:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2121203712. Throughput: 0: 42984.9. Samples: 2121275140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 03:59:23,864][15401] Updated weights for policy 0, policy_version 129470 (0.0037) [2024-06-22 03:59:28,085][15401] Updated weights for policy 0, policy_version 129480 (0.0033) [2024-06-22 03:59:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2121400320. Throughput: 0: 42945.7. Samples: 2121527700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:59:30,862][15349] Signal inference workers to stop experience collection... (31250 times) [2024-06-22 03:59:30,862][15349] Signal inference workers to resume experience collection... (31250 times) [2024-06-22 03:59:30,877][15401] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-06-22 03:59:30,877][15401] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-06-22 03:59:31,361][15401] Updated weights for policy 0, policy_version 129490 (0.0038) [2024-06-22 03:59:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.8). Total num frames: 2121629696. Throughput: 0: 43044.9. Samples: 2121786060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 03:59:35,958][15401] Updated weights for policy 0, policy_version 129500 (0.0039) [2024-06-22 03:59:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2121826304. Throughput: 0: 43116.2. Samples: 2121918560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 03:59:39,024][15401] Updated weights for policy 0, policy_version 129510 (0.0041) [2024-06-22 03:59:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 2122039296. Throughput: 0: 42824.5. Samples: 2122167040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 03:59:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129519_2122039296.pth... [2024-06-22 03:59:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000128891_2111750144.pth [2024-06-22 03:59:43,636][15401] Updated weights for policy 0, policy_version 129520 (0.0041) [2024-06-22 03:59:46,815][15401] Updated weights for policy 0, policy_version 129530 (0.0022) [2024-06-22 03:59:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 2122268672. Throughput: 0: 42973.9. Samples: 2122430780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 03:59:51,594][15401] Updated weights for policy 0, policy_version 129540 (0.0038) [2024-06-22 03:59:53,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42867.0, 300 sec: 42930.7). Total num frames: 2122481664. Throughput: 0: 42799.6. Samples: 2122560600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:53,396][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 03:59:54,569][15401] Updated weights for policy 0, policy_version 129550 (0.0029) [2024-06-22 03:59:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2122694656. Throughput: 0: 42711.2. Samples: 2122812020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 03:59:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 03:59:59,010][15401] Updated weights for policy 0, policy_version 129560 (0.0037) [2024-06-22 04:00:02,448][15401] Updated weights for policy 0, policy_version 129570 (0.0028) [2024-06-22 04:00:03,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2122907648. Throughput: 0: 42898.6. Samples: 2123070000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:03,400][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 04:00:06,578][15401] Updated weights for policy 0, policy_version 129580 (0.0036) [2024-06-22 04:00:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2123120640. Throughput: 0: 42856.9. Samples: 2123203700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 04:00:09,978][15401] Updated weights for policy 0, policy_version 129590 (0.0030) [2024-06-22 04:00:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2123333632. Throughput: 0: 42875.9. Samples: 2123457120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 04:00:13,971][15401] Updated weights for policy 0, policy_version 129600 (0.0037) [2024-06-22 04:00:17,784][15401] Updated weights for policy 0, policy_version 129610 (0.0037) [2024-06-22 04:00:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2123546624. Throughput: 0: 42936.5. Samples: 2123718200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 04:00:21,505][15401] Updated weights for policy 0, policy_version 129620 (0.0037) [2024-06-22 04:00:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2123759616. Throughput: 0: 42869.3. Samples: 2123847680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 04:00:25,469][15401] Updated weights for policy 0, policy_version 129630 (0.0035) [2024-06-22 04:00:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 2123988992. Throughput: 0: 43002.2. Samples: 2124102140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:00:28,981][15401] Updated weights for policy 0, policy_version 129640 (0.0033) [2024-06-22 04:00:32,861][15401] Updated weights for policy 0, policy_version 129650 (0.0043) [2024-06-22 04:00:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2124201984. Throughput: 0: 43070.2. Samples: 2124368940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:33,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 04:00:36,359][15401] Updated weights for policy 0, policy_version 129660 (0.0033) [2024-06-22 04:00:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2124414976. Throughput: 0: 42993.1. Samples: 2124495120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:38,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 04:00:40,637][15401] Updated weights for policy 0, policy_version 129670 (0.0039) [2024-06-22 04:00:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2124644352. Throughput: 0: 43086.7. Samples: 2124750920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 04:00:44,051][15401] Updated weights for policy 0, policy_version 129680 (0.0039) [2024-06-22 04:00:45,362][15349] Signal inference workers to stop experience collection... (31300 times) [2024-06-22 04:00:45,362][15349] Signal inference workers to resume experience collection... (31300 times) [2024-06-22 04:00:45,376][15401] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-06-22 04:00:45,376][15401] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-06-22 04:00:48,160][15401] Updated weights for policy 0, policy_version 129690 (0.0027) [2024-06-22 04:00:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2124840960. Throughput: 0: 43130.3. Samples: 2125010860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:48,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 04:00:52,050][15401] Updated weights for policy 0, policy_version 129700 (0.0037) [2024-06-22 04:00:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43149.0, 300 sec: 43042.7). Total num frames: 2125070336. Throughput: 0: 43011.9. Samples: 2125139240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 04:00:56,157][15401] Updated weights for policy 0, policy_version 129710 (0.0041) [2024-06-22 04:00:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 43042.8). Total num frames: 2125299712. Throughput: 0: 43119.6. Samples: 2125397500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:00:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 04:00:59,628][15401] Updated weights for policy 0, policy_version 129720 (0.0032) [2024-06-22 04:01:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2125463552. Throughput: 0: 43143.6. Samples: 2125659660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:01:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 04:01:03,690][15401] Updated weights for policy 0, policy_version 129730 (0.0036) [2024-06-22 04:01:07,067][15401] Updated weights for policy 0, policy_version 129740 (0.0030) [2024-06-22 04:01:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2125692928. Throughput: 0: 42912.4. Samples: 2125778740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:08,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 04:01:11,392][15401] Updated weights for policy 0, policy_version 129750 (0.0033) [2024-06-22 04:01:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2125938688. Throughput: 0: 43178.2. Samples: 2126045160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:13,394][15132] Avg episode reward: [(0, '0.282')] [2024-06-22 04:01:14,582][15401] Updated weights for policy 0, policy_version 129760 (0.0030) [2024-06-22 04:01:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2126118912. Throughput: 0: 42983.1. Samples: 2126303180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 04:01:19,203][15401] Updated weights for policy 0, policy_version 129770 (0.0029) [2024-06-22 04:01:22,135][15401] Updated weights for policy 0, policy_version 129780 (0.0033) [2024-06-22 04:01:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 2126331904. Throughput: 0: 42849.1. Samples: 2126423220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:23,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 04:01:26,822][15401] Updated weights for policy 0, policy_version 129790 (0.0030) [2024-06-22 04:01:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2126577664. Throughput: 0: 43068.8. Samples: 2126689020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:28,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 04:01:29,688][15401] Updated weights for policy 0, policy_version 129800 (0.0042) [2024-06-22 04:01:33,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2126774272. Throughput: 0: 42947.1. Samples: 2126943480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:33,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-22 04:01:34,396][15401] Updated weights for policy 0, policy_version 129810 (0.0032) [2024-06-22 04:01:37,420][15401] Updated weights for policy 0, policy_version 129820 (0.0038) [2024-06-22 04:01:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 2126987264. Throughput: 0: 42844.6. Samples: 2127067240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:38,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-22 04:01:41,980][15401] Updated weights for policy 0, policy_version 129830 (0.0036) [2024-06-22 04:01:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2127216640. Throughput: 0: 42951.7. Samples: 2127330320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 04:01:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129835_2127216640.pth... [2024-06-22 04:01:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129205_2116894720.pth [2024-06-22 04:01:45,267][15401] Updated weights for policy 0, policy_version 129840 (0.0030) [2024-06-22 04:01:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2127413248. Throughput: 0: 42633.7. Samples: 2127578180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 04:01:49,581][15401] Updated weights for policy 0, policy_version 129850 (0.0034) [2024-06-22 04:01:52,156][15349] Signal inference workers to stop experience collection... (31350 times) [2024-06-22 04:01:52,194][15401] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-06-22 04:01:52,203][15349] Signal inference workers to resume experience collection... (31350 times) [2024-06-22 04:01:52,207][15401] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-06-22 04:01:53,061][15401] Updated weights for policy 0, policy_version 129860 (0.0037) [2024-06-22 04:01:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 2127626240. Throughput: 0: 42773.0. Samples: 2127703520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 04:01:57,411][15401] Updated weights for policy 0, policy_version 129870 (0.0025) [2024-06-22 04:01:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2127839232. Throughput: 0: 42740.1. Samples: 2127968460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:01:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 04:02:00,612][15401] Updated weights for policy 0, policy_version 129880 (0.0039) [2024-06-22 04:02:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2128052224. Throughput: 0: 42581.4. Samples: 2128219340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 04:02:03,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 04:02:05,114][15401] Updated weights for policy 0, policy_version 129890 (0.0025) [2024-06-22 04:02:08,239][15401] Updated weights for policy 0, policy_version 129900 (0.0033) [2024-06-22 04:02:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2128281600. Throughput: 0: 42688.2. Samples: 2128344200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 04:02:12,591][15401] Updated weights for policy 0, policy_version 129910 (0.0033) [2024-06-22 04:02:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2128461824. Throughput: 0: 42679.7. Samples: 2128609600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 04:02:16,025][15401] Updated weights for policy 0, policy_version 129920 (0.0040) [2024-06-22 04:02:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2128691200. Throughput: 0: 42537.8. Samples: 2128857680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 04:02:20,510][15401] Updated weights for policy 0, policy_version 129930 (0.0023) [2024-06-22 04:02:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2128904192. Throughput: 0: 42676.4. Samples: 2128987680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 04:02:23,722][15401] Updated weights for policy 0, policy_version 129940 (0.0044) [2024-06-22 04:02:28,002][15401] Updated weights for policy 0, policy_version 129950 (0.0037) [2024-06-22 04:02:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2129100800. Throughput: 0: 42487.1. Samples: 2129242240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 04:02:31,568][15401] Updated weights for policy 0, policy_version 129960 (0.0026) [2024-06-22 04:02:33,391][15132] Fps is (10 sec: 44228.5, 60 sec: 42870.2, 300 sec: 42931.4). Total num frames: 2129346560. Throughput: 0: 42673.4. Samples: 2129498560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:33,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 04:02:35,677][15401] Updated weights for policy 0, policy_version 129970 (0.0032) [2024-06-22 04:02:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2129559552. Throughput: 0: 42715.5. Samples: 2129625720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 04:02:39,088][15401] Updated weights for policy 0, policy_version 129980 (0.0033) [2024-06-22 04:02:43,297][15401] Updated weights for policy 0, policy_version 129990 (0.0039) [2024-06-22 04:02:43,389][15132] Fps is (10 sec: 40967.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2129756160. Throughput: 0: 42536.0. Samples: 2129882580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 04:02:46,919][15401] Updated weights for policy 0, policy_version 130000 (0.0033) [2024-06-22 04:02:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2129969152. Throughput: 0: 42568.0. Samples: 2130134900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 04:02:50,902][15401] Updated weights for policy 0, policy_version 130010 (0.0034) [2024-06-22 04:02:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 2130182144. Throughput: 0: 42536.5. Samples: 2130258440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:53,393][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 04:02:54,550][15401] Updated weights for policy 0, policy_version 130020 (0.0034) [2024-06-22 04:02:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2130395136. Throughput: 0: 42476.0. Samples: 2130521020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:02:58,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 04:02:58,556][15401] Updated weights for policy 0, policy_version 130030 (0.0040) [2024-06-22 04:03:02,225][15401] Updated weights for policy 0, policy_version 130040 (0.0036) [2024-06-22 04:03:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2130608128. Throughput: 0: 42601.3. Samples: 2130774740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:03:03,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-22 04:03:06,109][15401] Updated weights for policy 0, policy_version 130050 (0.0037) [2024-06-22 04:03:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 2130821120. Throughput: 0: 42456.9. Samples: 2130898240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-22 04:03:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 04:03:10,236][15401] Updated weights for policy 0, policy_version 130060 (0.0042) [2024-06-22 04:03:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2131017728. Throughput: 0: 42536.9. Samples: 2131156400. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 04:03:14,100][15401] Updated weights for policy 0, policy_version 130070 (0.0039) [2024-06-22 04:03:17,826][15401] Updated weights for policy 0, policy_version 130080 (0.0027) [2024-06-22 04:03:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2131247104. Throughput: 0: 42434.9. Samples: 2131408060. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 04:03:21,769][15401] Updated weights for policy 0, policy_version 130090 (0.0035) [2024-06-22 04:03:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2131460096. Throughput: 0: 42593.7. Samples: 2131542440. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 04:03:25,255][15401] Updated weights for policy 0, policy_version 130100 (0.0038) [2024-06-22 04:03:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2131656704. Throughput: 0: 42403.5. Samples: 2131790740. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 04:03:29,324][15349] Signal inference workers to stop experience collection... (31400 times) [2024-06-22 04:03:29,324][15349] Signal inference workers to resume experience collection... (31400 times) [2024-06-22 04:03:29,388][15401] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-06-22 04:03:29,388][15401] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-06-22 04:03:29,457][15401] Updated weights for policy 0, policy_version 130110 (0.0029) [2024-06-22 04:03:32,894][15401] Updated weights for policy 0, policy_version 130120 (0.0037) [2024-06-22 04:03:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42599.7, 300 sec: 42876.1). Total num frames: 2131902464. Throughput: 0: 42442.2. Samples: 2132044800. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 04:03:37,128][15401] Updated weights for policy 0, policy_version 130130 (0.0029) [2024-06-22 04:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42765.9). Total num frames: 2132082688. Throughput: 0: 42698.3. Samples: 2132179760. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:38,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 04:03:40,411][15401] Updated weights for policy 0, policy_version 130140 (0.0035) [2024-06-22 04:03:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2132295680. Throughput: 0: 42467.9. Samples: 2132432080. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 04:03:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130145_2132295680.pth... [2024-06-22 04:03:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129519_2122039296.pth [2024-06-22 04:03:44,827][15401] Updated weights for policy 0, policy_version 130150 (0.0035) [2024-06-22 04:03:47,906][15401] Updated weights for policy 0, policy_version 130160 (0.0034) [2024-06-22 04:03:48,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42866.9, 300 sec: 42819.7). Total num frames: 2132541440. Throughput: 0: 42423.2. Samples: 2132684060. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:48,396][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 04:03:52,870][15401] Updated weights for policy 0, policy_version 130170 (0.0029) [2024-06-22 04:03:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 2132738048. Throughput: 0: 42731.1. Samples: 2132821140. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 04:03:55,864][15401] Updated weights for policy 0, policy_version 130180 (0.0032) [2024-06-22 04:03:58,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2132951040. Throughput: 0: 42532.3. Samples: 2133070360. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:03:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 04:04:00,463][15401] Updated weights for policy 0, policy_version 130190 (0.0026) [2024-06-22 04:04:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2133180416. Throughput: 0: 42556.0. Samples: 2133323180. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:04:03,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 04:04:03,713][15401] Updated weights for policy 0, policy_version 130200 (0.0033) [2024-06-22 04:04:08,062][15401] Updated weights for policy 0, policy_version 130210 (0.0034) [2024-06-22 04:04:08,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 2133377024. Throughput: 0: 42505.6. Samples: 2133455460. Policy #0 lag: (min: 1.0, avg: 11.9, max: 24.0) [2024-06-22 04:04:08,396][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 04:04:11,230][15401] Updated weights for policy 0, policy_version 130220 (0.0024) [2024-06-22 04:04:13,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2133590016. Throughput: 0: 42679.9. Samples: 2133711340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:13,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-22 04:04:15,649][15401] Updated weights for policy 0, policy_version 130230 (0.0034) [2024-06-22 04:04:18,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2133819392. Throughput: 0: 42753.0. Samples: 2133968680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 04:04:18,878][15401] Updated weights for policy 0, policy_version 130240 (0.0027) [2024-06-22 04:04:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2133983232. Throughput: 0: 42476.0. Samples: 2134091180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 04:04:23,545][15401] Updated weights for policy 0, policy_version 130250 (0.0042) [2024-06-22 04:04:26,835][15401] Updated weights for policy 0, policy_version 130260 (0.0037) [2024-06-22 04:04:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2134228992. Throughput: 0: 42606.7. Samples: 2134349380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 04:04:31,019][15401] Updated weights for policy 0, policy_version 130270 (0.0030) [2024-06-22 04:04:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2134441984. Throughput: 0: 42722.4. Samples: 2134606300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:33,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 04:04:34,542][15401] Updated weights for policy 0, policy_version 130280 (0.0030) [2024-06-22 04:04:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2134638592. Throughput: 0: 42402.2. Samples: 2134729240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:38,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:04:38,674][15401] Updated weights for policy 0, policy_version 130290 (0.0025) [2024-06-22 04:04:42,446][15401] Updated weights for policy 0, policy_version 130300 (0.0041) [2024-06-22 04:04:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2134900736. Throughput: 0: 42742.3. Samples: 2134993760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 04:04:46,434][15401] Updated weights for policy 0, policy_version 130310 (0.0027) [2024-06-22 04:04:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42329.8, 300 sec: 42710.4). Total num frames: 2135080960. Throughput: 0: 42854.2. Samples: 2135251520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:48,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 04:04:49,907][15349] Signal inference workers to stop experience collection... (31450 times) [2024-06-22 04:04:49,908][15349] Signal inference workers to resume experience collection... (31450 times) [2024-06-22 04:04:49,933][15401] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-06-22 04:04:49,934][15401] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-06-22 04:04:50,072][15401] Updated weights for policy 0, policy_version 130320 (0.0050) [2024-06-22 04:04:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2135293952. Throughput: 0: 42584.2. Samples: 2135371480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 04:04:54,443][15401] Updated weights for policy 0, policy_version 130330 (0.0036) [2024-06-22 04:04:57,693][15401] Updated weights for policy 0, policy_version 130340 (0.0042) [2024-06-22 04:04:58,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2135539712. Throughput: 0: 42882.7. Samples: 2135641060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:04:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 04:05:02,086][15401] Updated weights for policy 0, policy_version 130350 (0.0028) [2024-06-22 04:05:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 2135719936. Throughput: 0: 42760.8. Samples: 2135892920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:05:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 04:05:05,265][15401] Updated weights for policy 0, policy_version 130360 (0.0047) [2024-06-22 04:05:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 2135949312. Throughput: 0: 42861.3. Samples: 2136019940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:05:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 04:05:09,495][15401] Updated weights for policy 0, policy_version 130370 (0.0036) [2024-06-22 04:05:12,908][15401] Updated weights for policy 0, policy_version 130380 (0.0044) [2024-06-22 04:05:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2136162304. Throughput: 0: 42906.6. Samples: 2136280180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 04:05:13,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 04:05:17,158][15401] Updated weights for policy 0, policy_version 130390 (0.0026) [2024-06-22 04:05:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2136358912. Throughput: 0: 42992.9. Samples: 2136540980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 04:05:20,650][15401] Updated weights for policy 0, policy_version 130400 (0.0043) [2024-06-22 04:05:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 2136604672. Throughput: 0: 43054.2. Samples: 2136666680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:05:24,935][15401] Updated weights for policy 0, policy_version 130410 (0.0039) [2024-06-22 04:05:28,291][15401] Updated weights for policy 0, policy_version 130420 (0.0024) [2024-06-22 04:05:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2136801280. Throughput: 0: 42886.1. Samples: 2136923740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:28,392][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 04:05:32,682][15401] Updated weights for policy 0, policy_version 130430 (0.0036) [2024-06-22 04:05:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 2136997888. Throughput: 0: 42859.7. Samples: 2137180200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 04:05:35,947][15401] Updated weights for policy 0, policy_version 130440 (0.0026) [2024-06-22 04:05:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2137243648. Throughput: 0: 42916.4. Samples: 2137302720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 04:05:40,443][15401] Updated weights for policy 0, policy_version 130450 (0.0033) [2024-06-22 04:05:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2137423872. Throughput: 0: 42578.3. Samples: 2137557080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 04:05:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130459_2137440256.pth... [2024-06-22 04:05:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000129835_2127216640.pth [2024-06-22 04:05:43,885][15401] Updated weights for policy 0, policy_version 130460 (0.0050) [2024-06-22 04:05:48,111][15401] Updated weights for policy 0, policy_version 130470 (0.0040) [2024-06-22 04:05:48,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2137620480. Throughput: 0: 42696.9. Samples: 2137814280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 04:05:51,631][15401] Updated weights for policy 0, policy_version 130480 (0.0043) [2024-06-22 04:05:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2137882624. Throughput: 0: 42662.1. Samples: 2137939740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 04:05:55,581][15401] Updated weights for policy 0, policy_version 130490 (0.0035) [2024-06-22 04:05:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2138079232. Throughput: 0: 42449.4. Samples: 2138190400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:05:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 04:05:59,308][15401] Updated weights for policy 0, policy_version 130500 (0.0038) [2024-06-22 04:06:03,164][15401] Updated weights for policy 0, policy_version 130510 (0.0031) [2024-06-22 04:06:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2138275840. Throughput: 0: 42298.8. Samples: 2138444420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:06:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 04:06:07,369][15401] Updated weights for policy 0, policy_version 130520 (0.0037) [2024-06-22 04:06:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2138505216. Throughput: 0: 42309.3. Samples: 2138570600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:06:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 04:06:10,676][15401] Updated weights for policy 0, policy_version 130530 (0.0028) [2024-06-22 04:06:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2138701824. Throughput: 0: 42338.4. Samples: 2138828860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 04:06:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 04:06:14,884][15401] Updated weights for policy 0, policy_version 130540 (0.0032) [2024-06-22 04:06:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2138898432. Throughput: 0: 42134.2. Samples: 2139076240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:18,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 04:06:19,018][15401] Updated weights for policy 0, policy_version 130550 (0.0036) [2024-06-22 04:06:20,024][15349] Signal inference workers to stop experience collection... (31500 times) [2024-06-22 04:06:20,032][15349] Signal inference workers to resume experience collection... (31500 times) [2024-06-22 04:06:20,034][15401] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-06-22 04:06:20,054][15401] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-06-22 04:06:22,651][15401] Updated weights for policy 0, policy_version 130560 (0.0038) [2024-06-22 04:06:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2139144192. Throughput: 0: 42249.1. Samples: 2139203920. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 04:06:26,516][15401] Updated weights for policy 0, policy_version 130570 (0.0047) [2024-06-22 04:06:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 2139357184. Throughput: 0: 42420.1. Samples: 2139465980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 04:06:30,287][15401] Updated weights for policy 0, policy_version 130580 (0.0041) [2024-06-22 04:06:33,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2139553792. Throughput: 0: 42240.3. Samples: 2139715100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 04:06:33,974][15401] Updated weights for policy 0, policy_version 130590 (0.0032) [2024-06-22 04:06:37,979][15401] Updated weights for policy 0, policy_version 130600 (0.0028) [2024-06-22 04:06:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 2139750400. Throughput: 0: 42277.1. Samples: 2139842200. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:38,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-22 04:06:41,417][15401] Updated weights for policy 0, policy_version 130610 (0.0040) [2024-06-22 04:06:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2139979776. Throughput: 0: 42456.0. Samples: 2140100920. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 04:06:45,919][15401] Updated weights for policy 0, policy_version 130620 (0.0029) [2024-06-22 04:06:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2140192768. Throughput: 0: 42327.6. Samples: 2140349160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 04:06:48,987][15401] Updated weights for policy 0, policy_version 130630 (0.0055) [2024-06-22 04:06:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41506.2, 300 sec: 42487.3). Total num frames: 2140372992. Throughput: 0: 42398.1. Samples: 2140478520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:53,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 04:06:53,551][15401] Updated weights for policy 0, policy_version 130640 (0.0040) [2024-06-22 04:06:57,067][15401] Updated weights for policy 0, policy_version 130650 (0.0032) [2024-06-22 04:06:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2140618752. Throughput: 0: 42391.0. Samples: 2140736460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:06:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 04:07:01,086][15401] Updated weights for policy 0, policy_version 130660 (0.0036) [2024-06-22 04:07:03,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2140848128. Throughput: 0: 42518.2. Samples: 2140989560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:07:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 04:07:04,505][15401] Updated weights for policy 0, policy_version 130670 (0.0036) [2024-06-22 04:07:08,390][15132] Fps is (10 sec: 39318.9, 60 sec: 41778.7, 300 sec: 42542.7). Total num frames: 2141011968. Throughput: 0: 42636.1. Samples: 2141122580. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:07:08,391][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 04:07:08,863][15401] Updated weights for policy 0, policy_version 130680 (0.0036) [2024-06-22 04:07:12,240][15401] Updated weights for policy 0, policy_version 130690 (0.0032) [2024-06-22 04:07:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2141241344. Throughput: 0: 42340.7. Samples: 2141371320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:07:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 04:07:16,649][15401] Updated weights for policy 0, policy_version 130700 (0.0038) [2024-06-22 04:07:18,390][15132] Fps is (10 sec: 47516.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2141487104. Throughput: 0: 42440.5. Samples: 2141624920. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-22 04:07:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 04:07:19,949][15401] Updated weights for policy 0, policy_version 130710 (0.0039) [2024-06-22 04:07:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42050.5, 300 sec: 42598.0). Total num frames: 2141667328. Throughput: 0: 42659.4. Samples: 2141761980. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:23,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 04:07:24,073][15401] Updated weights for policy 0, policy_version 130720 (0.0033) [2024-06-22 04:07:27,302][15401] Updated weights for policy 0, policy_version 130730 (0.0041) [2024-06-22 04:07:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 2141896704. Throughput: 0: 42570.2. Samples: 2142016580. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 04:07:31,661][15401] Updated weights for policy 0, policy_version 130740 (0.0034) [2024-06-22 04:07:32,840][15349] Signal inference workers to stop experience collection... (31550 times) [2024-06-22 04:07:32,840][15349] Signal inference workers to resume experience collection... (31550 times) [2024-06-22 04:07:32,887][15401] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-06-22 04:07:32,892][15401] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-06-22 04:07:33,390][15132] Fps is (10 sec: 47524.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2142142464. Throughput: 0: 42776.2. Samples: 2142274100. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 04:07:35,065][15401] Updated weights for policy 0, policy_version 130750 (0.0046) [2024-06-22 04:07:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2142306304. Throughput: 0: 42977.0. Samples: 2142412480. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 04:07:39,199][15401] Updated weights for policy 0, policy_version 130760 (0.0044) [2024-06-22 04:07:42,631][15401] Updated weights for policy 0, policy_version 130770 (0.0035) [2024-06-22 04:07:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2142552064. Throughput: 0: 42786.7. Samples: 2142661860. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 04:07:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130771_2142552064.pth... [2024-06-22 04:07:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130145_2132295680.pth [2024-06-22 04:07:46,963][15401] Updated weights for policy 0, policy_version 130780 (0.0028) [2024-06-22 04:07:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 2142781440. Throughput: 0: 42841.4. Samples: 2142917420. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 04:07:50,517][15401] Updated weights for policy 0, policy_version 130790 (0.0035) [2024-06-22 04:07:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2142961664. Throughput: 0: 42975.8. Samples: 2143056460. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 04:07:54,378][15401] Updated weights for policy 0, policy_version 130800 (0.0036) [2024-06-22 04:07:58,039][15401] Updated weights for policy 0, policy_version 130810 (0.0030) [2024-06-22 04:07:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2143191040. Throughput: 0: 43057.8. Samples: 2143308920. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:07:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 04:08:02,407][15401] Updated weights for policy 0, policy_version 130820 (0.0036) [2024-06-22 04:08:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2143420416. Throughput: 0: 43204.5. Samples: 2143569120. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:08:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 04:08:05,522][15401] Updated weights for policy 0, policy_version 130830 (0.0043) [2024-06-22 04:08:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43145.0, 300 sec: 42653.9). Total num frames: 2143600640. Throughput: 0: 43088.9. Samples: 2143700880. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:08:08,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 04:08:09,879][15401] Updated weights for policy 0, policy_version 130840 (0.0047) [2024-06-22 04:08:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2143830016. Throughput: 0: 43012.4. Samples: 2143952140. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:08:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 04:08:13,759][15401] Updated weights for policy 0, policy_version 130850 (0.0051) [2024-06-22 04:08:17,513][15401] Updated weights for policy 0, policy_version 130860 (0.0033) [2024-06-22 04:08:18,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 2144059392. Throughput: 0: 43012.3. Samples: 2144209760. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-22 04:08:18,393][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 04:08:21,281][15401] Updated weights for policy 0, policy_version 130870 (0.0029) [2024-06-22 04:08:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 2144239616. Throughput: 0: 42943.1. Samples: 2144344920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 04:08:25,047][15401] Updated weights for policy 0, policy_version 130880 (0.0037) [2024-06-22 04:08:28,389][15132] Fps is (10 sec: 42610.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2144485376. Throughput: 0: 42875.7. Samples: 2144591260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 04:08:28,786][15401] Updated weights for policy 0, policy_version 130890 (0.0044) [2024-06-22 04:08:32,690][15401] Updated weights for policy 0, policy_version 130900 (0.0030) [2024-06-22 04:08:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2144698368. Throughput: 0: 42888.0. Samples: 2144847380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 04:08:36,265][15401] Updated weights for policy 0, policy_version 130910 (0.0032) [2024-06-22 04:08:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2144878592. Throughput: 0: 42701.7. Samples: 2144978040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 04:08:40,475][15401] Updated weights for policy 0, policy_version 130920 (0.0035) [2024-06-22 04:08:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 2145124352. Throughput: 0: 42643.3. Samples: 2145227860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:43,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 04:08:43,934][15401] Updated weights for policy 0, policy_version 130930 (0.0029) [2024-06-22 04:08:48,060][15401] Updated weights for policy 0, policy_version 130940 (0.0034) [2024-06-22 04:08:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2145337344. Throughput: 0: 42609.2. Samples: 2145486540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:48,396][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 04:08:51,483][15401] Updated weights for policy 0, policy_version 130950 (0.0034) [2024-06-22 04:08:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2145517568. Throughput: 0: 42562.8. Samples: 2145616200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:53,390][15132] Avg episode reward: [(0, '0.100')] [2024-06-22 04:08:55,702][15401] Updated weights for policy 0, policy_version 130960 (0.0030) [2024-06-22 04:08:56,819][15349] Signal inference workers to stop experience collection... (31600 times) [2024-06-22 04:08:56,820][15349] Signal inference workers to resume experience collection... (31600 times) [2024-06-22 04:08:56,839][15401] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-06-22 04:08:56,840][15401] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-06-22 04:08:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 2145746944. Throughput: 0: 42525.9. Samples: 2145865800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:08:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 04:08:59,413][15401] Updated weights for policy 0, policy_version 130970 (0.0042) [2024-06-22 04:09:03,270][15401] Updated weights for policy 0, policy_version 130980 (0.0046) [2024-06-22 04:09:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 2145976320. Throughput: 0: 42608.7. Samples: 2146127040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:03,393][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 04:09:06,949][15401] Updated weights for policy 0, policy_version 130990 (0.0029) [2024-06-22 04:09:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2146156544. Throughput: 0: 42445.0. Samples: 2146254940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:08,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-22 04:09:10,892][15401] Updated weights for policy 0, policy_version 131000 (0.0037) [2024-06-22 04:09:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2146402304. Throughput: 0: 42512.8. Samples: 2146504340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 04:09:14,707][15401] Updated weights for policy 0, policy_version 131010 (0.0033) [2024-06-22 04:09:18,391][15132] Fps is (10 sec: 42591.7, 60 sec: 42053.0, 300 sec: 42709.3). Total num frames: 2146582528. Throughput: 0: 42591.0. Samples: 2146764040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:18,391][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 04:09:19,152][15401] Updated weights for policy 0, policy_version 131020 (0.0033) [2024-06-22 04:09:22,342][15401] Updated weights for policy 0, policy_version 131030 (0.0030) [2024-06-22 04:09:23,391][15132] Fps is (10 sec: 39315.0, 60 sec: 42597.3, 300 sec: 42598.2). Total num frames: 2146795520. Throughput: 0: 42474.5. Samples: 2146889460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:23,396][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 04:09:26,826][15401] Updated weights for policy 0, policy_version 131040 (0.0041) [2024-06-22 04:09:28,390][15132] Fps is (10 sec: 45882.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2147041280. Throughput: 0: 42583.9. Samples: 2147144140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 04:09:30,024][15401] Updated weights for policy 0, policy_version 131050 (0.0042) [2024-06-22 04:09:33,390][15132] Fps is (10 sec: 42604.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2147221504. Throughput: 0: 42799.1. Samples: 2147412500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:33,391][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:09:34,340][15401] Updated weights for policy 0, policy_version 131060 (0.0026) [2024-06-22 04:09:37,684][15401] Updated weights for policy 0, policy_version 131070 (0.0027) [2024-06-22 04:09:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 2147450880. Throughput: 0: 42450.4. Samples: 2147526480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:09:41,824][15401] Updated weights for policy 0, policy_version 131080 (0.0030) [2024-06-22 04:09:43,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2147680256. Throughput: 0: 42731.1. Samples: 2147788700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 04:09:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131085_2147696640.pth... [2024-06-22 04:09:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130459_2137440256.pth [2024-06-22 04:09:45,382][15401] Updated weights for policy 0, policy_version 131090 (0.0039) [2024-06-22 04:09:48,390][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2147844096. Throughput: 0: 42899.5. Samples: 2148057520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 04:09:49,304][15401] Updated weights for policy 0, policy_version 131100 (0.0026) [2024-06-22 04:09:53,095][15401] Updated weights for policy 0, policy_version 131110 (0.0035) [2024-06-22 04:09:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2148106240. Throughput: 0: 42676.0. Samples: 2148175360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 04:09:57,138][15401] Updated weights for policy 0, policy_version 131120 (0.0032) [2024-06-22 04:09:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 2148319232. Throughput: 0: 42916.3. Samples: 2148435580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:09:58,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 04:10:00,679][15401] Updated weights for policy 0, policy_version 131130 (0.0032) [2024-06-22 04:10:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2148483072. Throughput: 0: 43126.8. Samples: 2148704680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:10:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:10:04,167][15349] Signal inference workers to stop experience collection... (31650 times) [2024-06-22 04:10:04,168][15349] Signal inference workers to resume experience collection... (31650 times) [2024-06-22 04:10:04,199][15401] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-06-22 04:10:04,200][15401] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-06-22 04:10:04,642][15401] Updated weights for policy 0, policy_version 131140 (0.0040) [2024-06-22 04:10:08,228][15401] Updated weights for policy 0, policy_version 131150 (0.0028) [2024-06-22 04:10:08,391][15132] Fps is (10 sec: 44232.3, 60 sec: 43416.7, 300 sec: 42709.3). Total num frames: 2148761600. Throughput: 0: 42945.3. Samples: 2148821980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:10:08,391][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 04:10:12,281][15401] Updated weights for policy 0, policy_version 131160 (0.0033) [2024-06-22 04:10:13,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2148974592. Throughput: 0: 43105.4. Samples: 2149083880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:10:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 04:10:15,837][15401] Updated weights for policy 0, policy_version 131170 (0.0030) [2024-06-22 04:10:18,390][15132] Fps is (10 sec: 39326.0, 60 sec: 42872.5, 300 sec: 42542.9). Total num frames: 2149154816. Throughput: 0: 42928.5. Samples: 2149344280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:10:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 04:10:19,832][15401] Updated weights for policy 0, policy_version 131180 (0.0029) [2024-06-22 04:10:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43418.8, 300 sec: 42709.8). Total num frames: 2149400576. Throughput: 0: 43063.3. Samples: 2149464320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 04:10:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 04:10:23,530][15401] Updated weights for policy 0, policy_version 131190 (0.0035) [2024-06-22 04:10:27,414][15401] Updated weights for policy 0, policy_version 131200 (0.0036) [2024-06-22 04:10:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2149613568. Throughput: 0: 42988.9. Samples: 2149723200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 04:10:31,198][15401] Updated weights for policy 0, policy_version 131210 (0.0042) [2024-06-22 04:10:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2149810176. Throughput: 0: 42745.9. Samples: 2149981080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 04:10:34,938][15401] Updated weights for policy 0, policy_version 131220 (0.0046) [2024-06-22 04:10:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 2150039552. Throughput: 0: 42940.9. Samples: 2150107700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 04:10:39,516][15401] Updated weights for policy 0, policy_version 131230 (0.0039) [2024-06-22 04:10:43,029][15401] Updated weights for policy 0, policy_version 131240 (0.0027) [2024-06-22 04:10:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2150236160. Throughput: 0: 42854.9. Samples: 2150364040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 04:10:47,073][15401] Updated weights for policy 0, policy_version 131250 (0.0025) [2024-06-22 04:10:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 2150449152. Throughput: 0: 42669.3. Samples: 2150624800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 04:10:50,547][15401] Updated weights for policy 0, policy_version 131260 (0.0029) [2024-06-22 04:10:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2150694912. Throughput: 0: 42843.7. Samples: 2150749900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 04:10:54,448][15401] Updated weights for policy 0, policy_version 131270 (0.0037) [2024-06-22 04:10:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2150875136. Throughput: 0: 42684.4. Samples: 2151004680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:10:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 04:10:58,578][15401] Updated weights for policy 0, policy_version 131280 (0.0032) [2024-06-22 04:11:01,977][15401] Updated weights for policy 0, policy_version 131290 (0.0051) [2024-06-22 04:11:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2151088128. Throughput: 0: 42562.3. Samples: 2151259580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:11:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 04:11:06,232][15401] Updated weights for policy 0, policy_version 131300 (0.0041) [2024-06-22 04:11:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 2151317504. Throughput: 0: 42790.8. Samples: 2151389900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:11:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 04:11:09,802][15401] Updated weights for policy 0, policy_version 131310 (0.0043) [2024-06-22 04:11:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2151514112. Throughput: 0: 42660.4. Samples: 2151642920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:11:13,394][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 04:11:13,818][15401] Updated weights for policy 0, policy_version 131320 (0.0029) [2024-06-22 04:11:17,458][15401] Updated weights for policy 0, policy_version 131330 (0.0028) [2024-06-22 04:11:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2151727104. Throughput: 0: 42623.2. Samples: 2151899120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:11:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 04:11:21,450][15401] Updated weights for policy 0, policy_version 131340 (0.0032) [2024-06-22 04:11:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2151956480. Throughput: 0: 42741.0. Samples: 2152031040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 04:11:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 04:11:25,018][15401] Updated weights for policy 0, policy_version 131350 (0.0031) [2024-06-22 04:11:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 2152136704. Throughput: 0: 42851.9. Samples: 2152292380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 04:11:29,342][15401] Updated weights for policy 0, policy_version 131360 (0.0042) [2024-06-22 04:11:32,317][15349] Signal inference workers to stop experience collection... (31700 times) [2024-06-22 04:11:32,373][15401] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-06-22 04:11:32,436][15349] Signal inference workers to resume experience collection... (31700 times) [2024-06-22 04:11:32,436][15401] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-06-22 04:11:32,566][15401] Updated weights for policy 0, policy_version 131370 (0.0037) [2024-06-22 04:11:33,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2152398848. Throughput: 0: 42635.2. Samples: 2152543380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 04:11:36,874][15401] Updated weights for policy 0, policy_version 131380 (0.0031) [2024-06-22 04:11:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2152611840. Throughput: 0: 42959.0. Samples: 2152683060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 04:11:40,035][15401] Updated weights for policy 0, policy_version 131390 (0.0034) [2024-06-22 04:11:43,392][15132] Fps is (10 sec: 37673.5, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 2152775680. Throughput: 0: 42891.8. Samples: 2152934920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:43,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 04:11:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131396_2152792064.pth... [2024-06-22 04:11:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000130771_2142552064.pth [2024-06-22 04:11:44,234][15401] Updated weights for policy 0, policy_version 131400 (0.0031) [2024-06-22 04:11:47,819][15401] Updated weights for policy 0, policy_version 131410 (0.0030) [2024-06-22 04:11:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2153021440. Throughput: 0: 42890.6. Samples: 2153189660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:48,395][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 04:11:52,262][15401] Updated weights for policy 0, policy_version 131420 (0.0038) [2024-06-22 04:11:53,390][15132] Fps is (10 sec: 45887.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2153234432. Throughput: 0: 42793.2. Samples: 2153315600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 04:11:55,626][15401] Updated weights for policy 0, policy_version 131430 (0.0027) [2024-06-22 04:11:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2153431040. Throughput: 0: 42802.6. Samples: 2153569040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:11:58,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 04:11:59,771][15401] Updated weights for policy 0, policy_version 131440 (0.0032) [2024-06-22 04:12:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.2). Total num frames: 2153660416. Throughput: 0: 42936.0. Samples: 2153831240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 04:12:03,552][15401] Updated weights for policy 0, policy_version 131450 (0.0044) [2024-06-22 04:12:07,347][15401] Updated weights for policy 0, policy_version 131460 (0.0032) [2024-06-22 04:12:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2153889792. Throughput: 0: 42823.8. Samples: 2153958120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:08,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 04:12:11,068][15401] Updated weights for policy 0, policy_version 131470 (0.0037) [2024-06-22 04:12:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2154070016. Throughput: 0: 42599.6. Samples: 2154209360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 04:12:14,965][15401] Updated weights for policy 0, policy_version 131480 (0.0039) [2024-06-22 04:12:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2154299392. Throughput: 0: 42657.4. Samples: 2154462960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:12:18,592][15401] Updated weights for policy 0, policy_version 131490 (0.0038) [2024-06-22 04:12:22,632][15401] Updated weights for policy 0, policy_version 131500 (0.0038) [2024-06-22 04:12:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2154512384. Throughput: 0: 42629.4. Samples: 2154601380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:12:26,420][15401] Updated weights for policy 0, policy_version 131510 (0.0043) [2024-06-22 04:12:28,392][15132] Fps is (10 sec: 42589.1, 60 sec: 43143.0, 300 sec: 42653.7). Total num frames: 2154725376. Throughput: 0: 42538.2. Samples: 2154849120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 04:12:28,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 04:12:30,135][15401] Updated weights for policy 0, policy_version 131520 (0.0023) [2024-06-22 04:12:33,390][15132] Fps is (10 sec: 44233.6, 60 sec: 42597.9, 300 sec: 42876.0). Total num frames: 2154954752. Throughput: 0: 42563.4. Samples: 2155105040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:33,391][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 04:12:34,009][15401] Updated weights for policy 0, policy_version 131530 (0.0027) [2024-06-22 04:12:37,251][15349] Signal inference workers to stop experience collection... (31750 times) [2024-06-22 04:12:37,284][15401] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-06-22 04:12:37,310][15349] Signal inference workers to resume experience collection... (31750 times) [2024-06-22 04:12:37,310][15401] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-06-22 04:12:37,909][15401] Updated weights for policy 0, policy_version 131540 (0.0030) [2024-06-22 04:12:38,390][15132] Fps is (10 sec: 44245.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2155167744. Throughput: 0: 42902.6. Samples: 2155246220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 04:12:41,377][15401] Updated weights for policy 0, policy_version 131550 (0.0029) [2024-06-22 04:12:43,389][15132] Fps is (10 sec: 40963.4, 60 sec: 43146.4, 300 sec: 42654.0). Total num frames: 2155364352. Throughput: 0: 42927.2. Samples: 2155500760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 04:12:45,627][15401] Updated weights for policy 0, policy_version 131560 (0.0032) [2024-06-22 04:12:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2155593728. Throughput: 0: 42697.6. Samples: 2155752640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 04:12:49,545][15401] Updated weights for policy 0, policy_version 131570 (0.0029) [2024-06-22 04:12:53,158][15401] Updated weights for policy 0, policy_version 131580 (0.0037) [2024-06-22 04:12:53,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2155806720. Throughput: 0: 42769.6. Samples: 2155882760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 04:12:57,189][15401] Updated weights for policy 0, policy_version 131590 (0.0042) [2024-06-22 04:12:58,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2156003328. Throughput: 0: 42826.8. Samples: 2156136560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:12:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 04:13:00,699][15401] Updated weights for policy 0, policy_version 131600 (0.0036) [2024-06-22 04:13:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 2156232704. Throughput: 0: 42999.8. Samples: 2156397960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 04:13:04,692][15401] Updated weights for policy 0, policy_version 131610 (0.0036) [2024-06-22 04:13:08,165][15401] Updated weights for policy 0, policy_version 131620 (0.0030) [2024-06-22 04:13:08,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2156462080. Throughput: 0: 42878.1. Samples: 2156530900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:08,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 04:13:12,340][15401] Updated weights for policy 0, policy_version 131630 (0.0034) [2024-06-22 04:13:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.9). Total num frames: 2156658688. Throughput: 0: 43046.4. Samples: 2156786120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 04:13:15,669][15401] Updated weights for policy 0, policy_version 131640 (0.0034) [2024-06-22 04:13:18,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 2156888064. Throughput: 0: 43056.2. Samples: 2157042640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:18,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 04:13:19,779][15401] Updated weights for policy 0, policy_version 131650 (0.0029) [2024-06-22 04:13:23,342][15401] Updated weights for policy 0, policy_version 131660 (0.0039) [2024-06-22 04:13:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2157117440. Throughput: 0: 42816.4. Samples: 2157172960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 04:13:27,370][15401] Updated weights for policy 0, policy_version 131670 (0.0036) [2024-06-22 04:13:28,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 2157297664. Throughput: 0: 42751.0. Samples: 2157424560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:13:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 04:13:30,825][15401] Updated weights for policy 0, policy_version 131680 (0.0027) [2024-06-22 04:13:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.9, 300 sec: 42876.1). Total num frames: 2157527040. Throughput: 0: 43083.1. Samples: 2157691380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 04:13:34,837][15401] Updated weights for policy 0, policy_version 131690 (0.0034) [2024-06-22 04:13:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2157756416. Throughput: 0: 42948.6. Samples: 2157815440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 04:13:38,527][15401] Updated weights for policy 0, policy_version 131700 (0.0034) [2024-06-22 04:13:42,912][15401] Updated weights for policy 0, policy_version 131710 (0.0038) [2024-06-22 04:13:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2157953024. Throughput: 0: 43063.0. Samples: 2158074400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 04:13:43,455][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131712_2157969408.pth... [2024-06-22 04:13:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131085_2147696640.pth [2024-06-22 04:13:46,128][15401] Updated weights for policy 0, policy_version 131720 (0.0034) [2024-06-22 04:13:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2158149632. Throughput: 0: 42885.1. Samples: 2158327780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 04:13:50,532][15401] Updated weights for policy 0, policy_version 131730 (0.0036) [2024-06-22 04:13:50,800][15349] Signal inference workers to stop experience collection... (31800 times) [2024-06-22 04:13:50,848][15401] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-06-22 04:13:50,910][15349] Signal inference workers to resume experience collection... (31800 times) [2024-06-22 04:13:50,911][15401] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-06-22 04:13:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2158379008. Throughput: 0: 42620.6. Samples: 2158448820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 04:13:54,134][15401] Updated weights for policy 0, policy_version 131740 (0.0031) [2024-06-22 04:13:58,145][15401] Updated weights for policy 0, policy_version 131750 (0.0032) [2024-06-22 04:13:58,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 2158608384. Throughput: 0: 42863.1. Samples: 2158715060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:13:58,392][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 04:14:01,611][15401] Updated weights for policy 0, policy_version 131760 (0.0030) [2024-06-22 04:14:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2158788608. Throughput: 0: 42989.5. Samples: 2158977060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 04:14:05,595][15401] Updated weights for policy 0, policy_version 131770 (0.0028) [2024-06-22 04:14:08,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2159034368. Throughput: 0: 42806.8. Samples: 2159099260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 04:14:09,208][15401] Updated weights for policy 0, policy_version 131780 (0.0031) [2024-06-22 04:14:13,130][15401] Updated weights for policy 0, policy_version 131790 (0.0048) [2024-06-22 04:14:13,391][15132] Fps is (10 sec: 45866.6, 60 sec: 43143.2, 300 sec: 42931.6). Total num frames: 2159247360. Throughput: 0: 43038.3. Samples: 2159361360. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:13,392][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 04:14:17,179][15401] Updated weights for policy 0, policy_version 131800 (0.0026) [2024-06-22 04:14:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 2159427584. Throughput: 0: 42900.9. Samples: 2159622020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:18,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 04:14:20,662][15401] Updated weights for policy 0, policy_version 131810 (0.0031) [2024-06-22 04:14:23,389][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2159673344. Throughput: 0: 42802.7. Samples: 2159741560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 04:14:24,842][15401] Updated weights for policy 0, policy_version 131820 (0.0032) [2024-06-22 04:14:28,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2159886336. Throughput: 0: 42958.7. Samples: 2160007540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 04:14:28,533][15401] Updated weights for policy 0, policy_version 131830 (0.0030) [2024-06-22 04:14:32,372][15401] Updated weights for policy 0, policy_version 131840 (0.0027) [2024-06-22 04:14:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2160082944. Throughput: 0: 43135.4. Samples: 2160268880. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-22 04:14:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 04:14:36,145][15401] Updated weights for policy 0, policy_version 131850 (0.0035) [2024-06-22 04:14:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2160345088. Throughput: 0: 43211.6. Samples: 2160393340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:14:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 04:14:39,951][15401] Updated weights for policy 0, policy_version 131860 (0.0032) [2024-06-22 04:14:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2160525312. Throughput: 0: 43105.3. Samples: 2160654700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:14:43,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 04:14:43,637][15401] Updated weights for policy 0, policy_version 131870 (0.0040) [2024-06-22 04:14:48,045][15401] Updated weights for policy 0, policy_version 131880 (0.0028) [2024-06-22 04:14:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2160738304. Throughput: 0: 42917.7. Samples: 2160908360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:14:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 04:14:51,235][15401] Updated weights for policy 0, policy_version 131890 (0.0035) [2024-06-22 04:14:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2160967680. Throughput: 0: 42992.4. Samples: 2161033920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:14:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 04:14:55,496][15401] Updated weights for policy 0, policy_version 131900 (0.0043) [2024-06-22 04:14:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 43042.7). Total num frames: 2161180672. Throughput: 0: 42902.6. Samples: 2161291900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:14:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 04:14:59,371][15401] Updated weights for policy 0, policy_version 131910 (0.0033) [2024-06-22 04:15:03,149][15401] Updated weights for policy 0, policy_version 131920 (0.0029) [2024-06-22 04:15:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 2161377280. Throughput: 0: 42869.8. Samples: 2161551060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 04:15:06,899][15401] Updated weights for policy 0, policy_version 131930 (0.0034) [2024-06-22 04:15:07,910][15349] Signal inference workers to stop experience collection... (31850 times) [2024-06-22 04:15:07,959][15401] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-06-22 04:15:07,966][15349] Signal inference workers to resume experience collection... (31850 times) [2024-06-22 04:15:07,968][15401] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-06-22 04:15:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2161623040. Throughput: 0: 42911.0. Samples: 2161672560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 04:15:10,777][15401] Updated weights for policy 0, policy_version 131940 (0.0034) [2024-06-22 04:15:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42599.7, 300 sec: 42876.1). Total num frames: 2161803264. Throughput: 0: 42775.9. Samples: 2161932460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 04:15:14,731][15401] Updated weights for policy 0, policy_version 131950 (0.0028) [2024-06-22 04:15:18,310][15401] Updated weights for policy 0, policy_version 131960 (0.0041) [2024-06-22 04:15:18,395][15132] Fps is (10 sec: 40937.9, 60 sec: 43415.3, 300 sec: 42819.8). Total num frames: 2162032640. Throughput: 0: 42810.8. Samples: 2162195600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:18,396][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 04:15:22,136][15401] Updated weights for policy 0, policy_version 131970 (0.0024) [2024-06-22 04:15:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2162262016. Throughput: 0: 42900.5. Samples: 2162323860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 04:15:25,840][15401] Updated weights for policy 0, policy_version 131980 (0.0040) [2024-06-22 04:15:28,390][15132] Fps is (10 sec: 42621.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2162458624. Throughput: 0: 42882.7. Samples: 2162584420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 04:15:29,613][15401] Updated weights for policy 0, policy_version 131990 (0.0034) [2024-06-22 04:15:33,279][15401] Updated weights for policy 0, policy_version 132000 (0.0046) [2024-06-22 04:15:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2162688000. Throughput: 0: 42909.7. Samples: 2162839300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:33,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 04:15:37,036][15401] Updated weights for policy 0, policy_version 132010 (0.0026) [2024-06-22 04:15:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2162900992. Throughput: 0: 43069.3. Samples: 2162972040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-22 04:15:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 04:15:41,054][15401] Updated weights for policy 0, policy_version 132020 (0.0031) [2024-06-22 04:15:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2163097600. Throughput: 0: 43076.1. Samples: 2163230320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:15:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 04:15:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132026_2163113984.pth... [2024-06-22 04:15:43,623][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131396_2152792064.pth [2024-06-22 04:15:44,616][15401] Updated weights for policy 0, policy_version 132030 (0.0031) [2024-06-22 04:15:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2163310592. Throughput: 0: 42828.9. Samples: 2163478360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:15:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 04:15:48,695][15401] Updated weights for policy 0, policy_version 132040 (0.0027) [2024-06-22 04:15:52,309][15401] Updated weights for policy 0, policy_version 132050 (0.0027) [2024-06-22 04:15:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2163523584. Throughput: 0: 43144.9. Samples: 2163614080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:15:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 04:15:56,177][15401] Updated weights for policy 0, policy_version 132060 (0.0042) [2024-06-22 04:15:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 2163736576. Throughput: 0: 43111.5. Samples: 2163872580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:15:58,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 04:16:00,021][15401] Updated weights for policy 0, policy_version 132070 (0.0033) [2024-06-22 04:16:03,389][15132] Fps is (10 sec: 44238.0, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 2163965952. Throughput: 0: 43021.9. Samples: 2164131340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 04:16:03,701][15401] Updated weights for policy 0, policy_version 132080 (0.0036) [2024-06-22 04:16:07,678][15401] Updated weights for policy 0, policy_version 132090 (0.0038) [2024-06-22 04:16:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2164178944. Throughput: 0: 43044.0. Samples: 2164260840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 04:16:11,630][15401] Updated weights for policy 0, policy_version 132100 (0.0039) [2024-06-22 04:16:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2164375552. Throughput: 0: 42785.0. Samples: 2164509740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 04:16:15,274][15401] Updated weights for policy 0, policy_version 132110 (0.0037) [2024-06-22 04:16:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42875.4, 300 sec: 42876.1). Total num frames: 2164604928. Throughput: 0: 42838.3. Samples: 2164767020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 04:16:19,203][15401] Updated weights for policy 0, policy_version 132120 (0.0022) [2024-06-22 04:16:23,017][15401] Updated weights for policy 0, policy_version 132130 (0.0030) [2024-06-22 04:16:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2164817920. Throughput: 0: 42818.3. Samples: 2164898860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 04:16:26,955][15401] Updated weights for policy 0, policy_version 132140 (0.0030) [2024-06-22 04:16:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2165014528. Throughput: 0: 42655.8. Samples: 2165149840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:28,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 04:16:30,714][15401] Updated weights for policy 0, policy_version 132150 (0.0033) [2024-06-22 04:16:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2165243904. Throughput: 0: 42900.1. Samples: 2165408860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:33,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-22 04:16:34,599][15401] Updated weights for policy 0, policy_version 132160 (0.0033) [2024-06-22 04:16:37,052][15349] Signal inference workers to stop experience collection... (31900 times) [2024-06-22 04:16:37,052][15349] Signal inference workers to resume experience collection... (31900 times) [2024-06-22 04:16:37,069][15401] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-06-22 04:16:37,069][15401] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-06-22 04:16:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42987.6). Total num frames: 2165456896. Throughput: 0: 42761.5. Samples: 2165538340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 04:16:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 04:16:38,478][15401] Updated weights for policy 0, policy_version 132170 (0.0030) [2024-06-22 04:16:42,223][15401] Updated weights for policy 0, policy_version 132180 (0.0031) [2024-06-22 04:16:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2165669888. Throughput: 0: 42602.7. Samples: 2165789600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:16:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 04:16:46,005][15401] Updated weights for policy 0, policy_version 132190 (0.0035) [2024-06-22 04:16:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2165882880. Throughput: 0: 42503.4. Samples: 2166044000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:16:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 04:16:49,823][15401] Updated weights for policy 0, policy_version 132200 (0.0036) [2024-06-22 04:16:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 2166095872. Throughput: 0: 42503.5. Samples: 2166173500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:16:53,391][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 04:16:53,674][15401] Updated weights for policy 0, policy_version 132210 (0.0034) [2024-06-22 04:16:57,506][15401] Updated weights for policy 0, policy_version 132220 (0.0035) [2024-06-22 04:16:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2166308864. Throughput: 0: 42631.1. Samples: 2166428140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:16:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 04:17:01,703][15401] Updated weights for policy 0, policy_version 132230 (0.0046) [2024-06-22 04:17:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2166538240. Throughput: 0: 42396.0. Samples: 2166674840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 04:17:05,348][15401] Updated weights for policy 0, policy_version 132240 (0.0042) [2024-06-22 04:17:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 2166685696. Throughput: 0: 42378.2. Samples: 2166805880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:08,390][15132] Avg episode reward: [(0, '0.165')] [2024-06-22 04:17:09,409][15401] Updated weights for policy 0, policy_version 132250 (0.0042) [2024-06-22 04:17:13,367][15401] Updated weights for policy 0, policy_version 132260 (0.0030) [2024-06-22 04:17:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2166947840. Throughput: 0: 42455.3. Samples: 2167060320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 04:17:17,171][15401] Updated weights for policy 0, policy_version 132270 (0.0025) [2024-06-22 04:17:18,392][15132] Fps is (10 sec: 47501.8, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 2167160832. Throughput: 0: 42315.0. Samples: 2167313140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:18,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 04:17:20,967][15401] Updated weights for policy 0, policy_version 132280 (0.0026) [2024-06-22 04:17:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.1, 300 sec: 42709.8). Total num frames: 2167324672. Throughput: 0: 42336.0. Samples: 2167443460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 04:17:24,870][15401] Updated weights for policy 0, policy_version 132290 (0.0050) [2024-06-22 04:17:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 2167586816. Throughput: 0: 42331.6. Samples: 2167694520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 04:17:28,586][15401] Updated weights for policy 0, policy_version 132300 (0.0030) [2024-06-22 04:17:32,586][15401] Updated weights for policy 0, policy_version 132310 (0.0041) [2024-06-22 04:17:33,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2167816192. Throughput: 0: 42414.1. Samples: 2167952640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 04:17:36,571][15401] Updated weights for policy 0, policy_version 132320 (0.0035) [2024-06-22 04:17:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2167980032. Throughput: 0: 42386.2. Samples: 2168080880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:38,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 04:17:40,402][15401] Updated weights for policy 0, policy_version 132330 (0.0030) [2024-06-22 04:17:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2168225792. Throughput: 0: 42437.3. Samples: 2168337820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:17:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 04:17:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132338_2168225792.pth... [2024-06-22 04:17:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000131712_2157969408.pth [2024-06-22 04:17:44,173][15401] Updated weights for policy 0, policy_version 132340 (0.0041) [2024-06-22 04:17:45,136][15349] Signal inference workers to stop experience collection... (31950 times) [2024-06-22 04:17:45,190][15349] Signal inference workers to resume experience collection... (31950 times) [2024-06-22 04:17:45,190][15401] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-06-22 04:17:45,210][15401] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-06-22 04:17:48,172][15401] Updated weights for policy 0, policy_version 132350 (0.0034) [2024-06-22 04:17:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2168438784. Throughput: 0: 42573.5. Samples: 2168590640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:17:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 04:17:52,175][15401] Updated weights for policy 0, policy_version 132360 (0.0029) [2024-06-22 04:17:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2168635392. Throughput: 0: 42564.5. Samples: 2168721280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:17:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 04:17:55,740][15401] Updated weights for policy 0, policy_version 132370 (0.0035) [2024-06-22 04:17:58,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 2168848384. Throughput: 0: 42603.5. Samples: 2168977580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:17:58,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 04:17:59,917][15401] Updated weights for policy 0, policy_version 132380 (0.0029) [2024-06-22 04:18:03,375][15401] Updated weights for policy 0, policy_version 132390 (0.0040) [2024-06-22 04:18:03,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2169077760. Throughput: 0: 42719.2. Samples: 2169235400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 04:18:07,430][15401] Updated weights for policy 0, policy_version 132400 (0.0033) [2024-06-22 04:18:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2169290752. Throughput: 0: 42687.1. Samples: 2169364380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:08,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 04:18:11,258][15401] Updated weights for policy 0, policy_version 132410 (0.0036) [2024-06-22 04:18:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2169503744. Throughput: 0: 42598.2. Samples: 2169611440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 04:18:14,996][15401] Updated weights for policy 0, policy_version 132420 (0.0047) [2024-06-22 04:18:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 2169700352. Throughput: 0: 42752.6. Samples: 2169876500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 04:18:18,834][15401] Updated weights for policy 0, policy_version 132430 (0.0030) [2024-06-22 04:18:22,531][15401] Updated weights for policy 0, policy_version 132440 (0.0039) [2024-06-22 04:18:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2169929728. Throughput: 0: 42790.6. Samples: 2170006460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 04:18:26,367][15401] Updated weights for policy 0, policy_version 132450 (0.0027) [2024-06-22 04:18:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2170142720. Throughput: 0: 42737.4. Samples: 2170261000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 04:18:30,226][15401] Updated weights for policy 0, policy_version 132460 (0.0038) [2024-06-22 04:18:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2170355712. Throughput: 0: 42779.0. Samples: 2170515700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 04:18:33,883][15401] Updated weights for policy 0, policy_version 132470 (0.0027) [2024-06-22 04:18:37,769][15401] Updated weights for policy 0, policy_version 132480 (0.0046) [2024-06-22 04:18:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2170552320. Throughput: 0: 42803.9. Samples: 2170647460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:38,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 04:18:41,715][15401] Updated weights for policy 0, policy_version 132490 (0.0025) [2024-06-22 04:18:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2170798080. Throughput: 0: 42735.1. Samples: 2170900560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 04:18:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 04:18:45,348][15401] Updated weights for policy 0, policy_version 132500 (0.0039) [2024-06-22 04:18:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2170994688. Throughput: 0: 42825.4. Samples: 2171162540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:18:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 04:18:49,347][15401] Updated weights for policy 0, policy_version 132510 (0.0030) [2024-06-22 04:18:52,860][15401] Updated weights for policy 0, policy_version 132520 (0.0049) [2024-06-22 04:18:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 2171207680. Throughput: 0: 42778.2. Samples: 2171289400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:18:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 04:18:56,873][15401] Updated weights for policy 0, policy_version 132530 (0.0044) [2024-06-22 04:18:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 2171437056. Throughput: 0: 43047.7. Samples: 2171548580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:18:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 04:19:00,447][15401] Updated weights for policy 0, policy_version 132540 (0.0033) [2024-06-22 04:19:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2171617280. Throughput: 0: 42845.3. Samples: 2171804540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 04:19:04,391][15401] Updated weights for policy 0, policy_version 132550 (0.0028) [2024-06-22 04:19:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 2171846656. Throughput: 0: 42780.6. Samples: 2171931580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 04:19:08,435][15401] Updated weights for policy 0, policy_version 132560 (0.0030) [2024-06-22 04:19:12,033][15401] Updated weights for policy 0, policy_version 132570 (0.0054) [2024-06-22 04:19:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2172059648. Throughput: 0: 42629.7. Samples: 2172179340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 04:19:15,899][15401] Updated weights for policy 0, policy_version 132580 (0.0026) [2024-06-22 04:19:17,257][15349] Signal inference workers to stop experience collection... (32000 times) [2024-06-22 04:19:17,311][15401] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-06-22 04:19:17,317][15349] Signal inference workers to resume experience collection... (32000 times) [2024-06-22 04:19:17,324][15401] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-06-22 04:19:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2172272640. Throughput: 0: 42880.4. Samples: 2172445320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 04:19:19,753][15401] Updated weights for policy 0, policy_version 132590 (0.0032) [2024-06-22 04:19:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2172502016. Throughput: 0: 42779.5. Samples: 2172572540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 04:19:23,441][15401] Updated weights for policy 0, policy_version 132600 (0.0026) [2024-06-22 04:19:27,477][15401] Updated weights for policy 0, policy_version 132610 (0.0030) [2024-06-22 04:19:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2172715008. Throughput: 0: 42786.3. Samples: 2172825940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 04:19:31,003][15401] Updated weights for policy 0, policy_version 132620 (0.0034) [2024-06-22 04:19:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2172895232. Throughput: 0: 42736.7. Samples: 2173085700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 04:19:35,033][15401] Updated weights for policy 0, policy_version 132630 (0.0031) [2024-06-22 04:19:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2173140992. Throughput: 0: 42690.3. Samples: 2173210460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 04:19:39,043][15401] Updated weights for policy 0, policy_version 132640 (0.0034) [2024-06-22 04:19:43,078][15401] Updated weights for policy 0, policy_version 132650 (0.0027) [2024-06-22 04:19:43,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2173353984. Throughput: 0: 42564.1. Samples: 2173463960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 04:19:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132651_2173353984.pth... [2024-06-22 04:19:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132026_2163113984.pth [2024-06-22 04:19:47,108][15401] Updated weights for policy 0, policy_version 132660 (0.0032) [2024-06-22 04:19:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2173550592. Throughput: 0: 42488.9. Samples: 2173716540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 04:19:48,395][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 04:19:50,777][15401] Updated weights for policy 0, policy_version 132670 (0.0039) [2024-06-22 04:19:53,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2173779968. Throughput: 0: 42603.5. Samples: 2173848740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:19:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 04:19:54,502][15401] Updated weights for policy 0, policy_version 132680 (0.0028) [2024-06-22 04:19:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2173976576. Throughput: 0: 42728.2. Samples: 2174102100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:19:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 04:19:58,444][15401] Updated weights for policy 0, policy_version 132690 (0.0033) [2024-06-22 04:20:02,795][15401] Updated weights for policy 0, policy_version 132700 (0.0032) [2024-06-22 04:20:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2174205952. Throughput: 0: 42536.8. Samples: 2174359480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 04:20:06,527][15401] Updated weights for policy 0, policy_version 132710 (0.0035) [2024-06-22 04:20:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2174402560. Throughput: 0: 42537.9. Samples: 2174486740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 04:20:10,438][15401] Updated weights for policy 0, policy_version 132720 (0.0032) [2024-06-22 04:20:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2174615552. Throughput: 0: 42425.0. Samples: 2174735060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 04:20:14,097][15401] Updated weights for policy 0, policy_version 132730 (0.0045) [2024-06-22 04:20:18,046][15401] Updated weights for policy 0, policy_version 132740 (0.0030) [2024-06-22 04:20:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2174812160. Throughput: 0: 42340.8. Samples: 2174991040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:18,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 04:20:21,686][15401] Updated weights for policy 0, policy_version 132750 (0.0046) [2024-06-22 04:20:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2175041536. Throughput: 0: 42402.7. Samples: 2175118580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:20:25,687][15401] Updated weights for policy 0, policy_version 132760 (0.0035) [2024-06-22 04:20:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2175270912. Throughput: 0: 42468.8. Samples: 2175375060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 04:20:29,572][15401] Updated weights for policy 0, policy_version 132770 (0.0027) [2024-06-22 04:20:33,337][15401] Updated weights for policy 0, policy_version 132780 (0.0027) [2024-06-22 04:20:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2175467520. Throughput: 0: 42484.4. Samples: 2175628340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 04:20:37,145][15401] Updated weights for policy 0, policy_version 132790 (0.0032) [2024-06-22 04:20:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2175696896. Throughput: 0: 42391.1. Samples: 2175756340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 04:20:41,076][15401] Updated weights for policy 0, policy_version 132800 (0.0038) [2024-06-22 04:20:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2175893504. Throughput: 0: 42485.6. Samples: 2176013960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 04:20:44,907][15401] Updated weights for policy 0, policy_version 132810 (0.0034) [2024-06-22 04:20:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2176090112. Throughput: 0: 42517.8. Samples: 2176272780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:20:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 04:20:48,812][15401] Updated weights for policy 0, policy_version 132820 (0.0039) [2024-06-22 04:20:52,495][15401] Updated weights for policy 0, policy_version 132830 (0.0032) [2024-06-22 04:20:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 2176319488. Throughput: 0: 42379.8. Samples: 2176393840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:20:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 04:20:56,527][15401] Updated weights for policy 0, policy_version 132840 (0.0037) [2024-06-22 04:20:58,393][15132] Fps is (10 sec: 44220.8, 60 sec: 42595.7, 300 sec: 42597.8). Total num frames: 2176532480. Throughput: 0: 42615.1. Samples: 2176652900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:20:58,394][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 04:21:00,460][15401] Updated weights for policy 0, policy_version 132850 (0.0043) [2024-06-22 04:21:01,712][15349] Signal inference workers to stop experience collection... (32050 times) [2024-06-22 04:21:01,713][15349] Signal inference workers to resume experience collection... (32050 times) [2024-06-22 04:21:01,755][15401] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-06-22 04:21:01,755][15401] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-06-22 04:21:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2176745472. Throughput: 0: 42553.4. Samples: 2176905940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 04:21:04,047][15401] Updated weights for policy 0, policy_version 132860 (0.0028) [2024-06-22 04:21:08,065][15401] Updated weights for policy 0, policy_version 132870 (0.0030) [2024-06-22 04:21:08,389][15132] Fps is (10 sec: 40975.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2176942080. Throughput: 0: 42508.4. Samples: 2177031460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 04:21:11,744][15401] Updated weights for policy 0, policy_version 132880 (0.0038) [2024-06-22 04:21:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2177171456. Throughput: 0: 42600.8. Samples: 2177292100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 04:21:15,645][15401] Updated weights for policy 0, policy_version 132890 (0.0042) [2024-06-22 04:21:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2177384448. Throughput: 0: 42627.9. Samples: 2177546600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 04:21:19,819][15401] Updated weights for policy 0, policy_version 132900 (0.0037) [2024-06-22 04:21:23,249][15401] Updated weights for policy 0, policy_version 132910 (0.0026) [2024-06-22 04:21:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 2177597440. Throughput: 0: 42562.2. Samples: 2177671640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 04:21:27,603][15401] Updated weights for policy 0, policy_version 132920 (0.0034) [2024-06-22 04:21:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2177810432. Throughput: 0: 42724.9. Samples: 2177936580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 04:21:31,051][15401] Updated weights for policy 0, policy_version 132930 (0.0034) [2024-06-22 04:21:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2178023424. Throughput: 0: 42353.3. Samples: 2178178680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:33,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 04:21:35,462][15401] Updated weights for policy 0, policy_version 132940 (0.0040) [2024-06-22 04:21:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42050.5, 300 sec: 42542.5). Total num frames: 2178220032. Throughput: 0: 42558.2. Samples: 2178309060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:38,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 04:21:38,798][15401] Updated weights for policy 0, policy_version 132950 (0.0026) [2024-06-22 04:21:43,045][15401] Updated weights for policy 0, policy_version 132960 (0.0036) [2024-06-22 04:21:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2178433024. Throughput: 0: 42532.9. Samples: 2178566720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 04:21:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132962_2178449408.pth... [2024-06-22 04:21:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132338_2168225792.pth [2024-06-22 04:21:46,755][15401] Updated weights for policy 0, policy_version 132970 (0.0030) [2024-06-22 04:21:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2178662400. Throughput: 0: 42390.3. Samples: 2178813500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 04:21:50,739][15401] Updated weights for policy 0, policy_version 132980 (0.0038) [2024-06-22 04:21:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2178875392. Throughput: 0: 42563.9. Samples: 2178946840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 04:21:53,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 04:21:54,324][15401] Updated weights for policy 0, policy_version 132990 (0.0036) [2024-06-22 04:21:55,670][15349] Signal inference workers to stop experience collection... (32100 times) [2024-06-22 04:21:55,670][15349] Signal inference workers to resume experience collection... (32100 times) [2024-06-22 04:21:55,700][15401] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-06-22 04:21:55,700][15401] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-06-22 04:21:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42054.8, 300 sec: 42431.8). Total num frames: 2179055616. Throughput: 0: 42649.8. Samples: 2179211340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:21:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 04:21:58,434][15401] Updated weights for policy 0, policy_version 133000 (0.0032) [2024-06-22 04:22:01,802][15401] Updated weights for policy 0, policy_version 133010 (0.0029) [2024-06-22 04:22:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2179301376. Throughput: 0: 42601.8. Samples: 2179463680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 04:22:05,913][15401] Updated weights for policy 0, policy_version 133020 (0.0038) [2024-06-22 04:22:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2179530752. Throughput: 0: 42691.1. Samples: 2179592740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 04:22:09,171][15401] Updated weights for policy 0, policy_version 133030 (0.0040) [2024-06-22 04:22:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 2179694592. Throughput: 0: 42592.3. Samples: 2179853240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 04:22:13,761][15401] Updated weights for policy 0, policy_version 133040 (0.0039) [2024-06-22 04:22:16,683][15401] Updated weights for policy 0, policy_version 133050 (0.0025) [2024-06-22 04:22:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2179940352. Throughput: 0: 42885.5. Samples: 2180108520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 04:22:21,354][15401] Updated weights for policy 0, policy_version 133060 (0.0038) [2024-06-22 04:22:23,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2180169728. Throughput: 0: 43080.9. Samples: 2180247600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 04:22:24,108][15401] Updated weights for policy 0, policy_version 133070 (0.0029) [2024-06-22 04:22:28,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42050.7, 300 sec: 42431.5). Total num frames: 2180333568. Throughput: 0: 42803.6. Samples: 2180492980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:28,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 04:22:29,043][15401] Updated weights for policy 0, policy_version 133080 (0.0043) [2024-06-22 04:22:31,661][15401] Updated weights for policy 0, policy_version 133090 (0.0036) [2024-06-22 04:22:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2180562944. Throughput: 0: 43012.5. Samples: 2180749060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:33,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 04:22:36,621][15401] Updated weights for policy 0, policy_version 133100 (0.0039) [2024-06-22 04:22:38,389][15132] Fps is (10 sec: 49162.9, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 2180825088. Throughput: 0: 43083.2. Samples: 2180885580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:22:39,531][15401] Updated weights for policy 0, policy_version 133110 (0.0037) [2024-06-22 04:22:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 2180988928. Throughput: 0: 42700.8. Samples: 2181132980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:43,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 04:22:44,158][15401] Updated weights for policy 0, policy_version 133120 (0.0028) [2024-06-22 04:22:46,840][15401] Updated weights for policy 0, policy_version 133130 (0.0039) [2024-06-22 04:22:48,394][15132] Fps is (10 sec: 39303.1, 60 sec: 42595.0, 300 sec: 42653.2). Total num frames: 2181218304. Throughput: 0: 42940.0. Samples: 2181396180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:48,395][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 04:22:51,615][15401] Updated weights for policy 0, policy_version 133140 (0.0026) [2024-06-22 04:22:53,390][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 2181464064. Throughput: 0: 42974.7. Samples: 2181526600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 04:22:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 04:22:54,637][15401] Updated weights for policy 0, policy_version 133150 (0.0044) [2024-06-22 04:22:58,047][15349] Signal inference workers to stop experience collection... (32150 times) [2024-06-22 04:22:58,048][15349] Signal inference workers to resume experience collection... (32150 times) [2024-06-22 04:22:58,093][15401] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-06-22 04:22:58,094][15401] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-06-22 04:22:58,390][15132] Fps is (10 sec: 42618.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2181644288. Throughput: 0: 42930.8. Samples: 2181785120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:22:58,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 04:22:59,309][15401] Updated weights for policy 0, policy_version 133160 (0.0039) [2024-06-22 04:23:02,101][15401] Updated weights for policy 0, policy_version 133170 (0.0033) [2024-06-22 04:23:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2181873664. Throughput: 0: 42884.3. Samples: 2182038320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 04:23:07,117][15401] Updated weights for policy 0, policy_version 133180 (0.0035) [2024-06-22 04:23:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2182103040. Throughput: 0: 42827.2. Samples: 2182174820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:08,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-22 04:23:09,630][15401] Updated weights for policy 0, policy_version 133190 (0.0032) [2024-06-22 04:23:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2182283264. Throughput: 0: 43032.3. Samples: 2182429340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:13,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 04:23:14,553][15401] Updated weights for policy 0, policy_version 133200 (0.0028) [2024-06-22 04:23:17,655][15401] Updated weights for policy 0, policy_version 133210 (0.0034) [2024-06-22 04:23:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2182529024. Throughput: 0: 42949.3. Samples: 2182681780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 04:23:22,270][15401] Updated weights for policy 0, policy_version 133220 (0.0053) [2024-06-22 04:23:23,395][15132] Fps is (10 sec: 44212.6, 60 sec: 42594.6, 300 sec: 42653.1). Total num frames: 2182725632. Throughput: 0: 42957.4. Samples: 2182818900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:23,395][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 04:23:25,270][15401] Updated weights for policy 0, policy_version 133230 (0.0035) [2024-06-22 04:23:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43419.3, 300 sec: 42653.9). Total num frames: 2182938624. Throughput: 0: 43056.1. Samples: 2183070400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:28,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 04:23:29,898][15401] Updated weights for policy 0, policy_version 133240 (0.0030) [2024-06-22 04:23:32,821][15401] Updated weights for policy 0, policy_version 133250 (0.0033) [2024-06-22 04:23:33,389][15132] Fps is (10 sec: 45900.8, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2183184384. Throughput: 0: 42702.8. Samples: 2183317600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 04:23:37,396][15401] Updated weights for policy 0, policy_version 133260 (0.0047) [2024-06-22 04:23:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2183348224. Throughput: 0: 42889.7. Samples: 2183456640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:38,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 04:23:40,513][15401] Updated weights for policy 0, policy_version 133270 (0.0033) [2024-06-22 04:23:43,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 2183561216. Throughput: 0: 42605.4. Samples: 2183702360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 04:23:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133274_2183561216.pth... [2024-06-22 04:23:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132651_2173353984.pth [2024-06-22 04:23:45,035][15401] Updated weights for policy 0, policy_version 133280 (0.0034) [2024-06-22 04:23:48,054][15401] Updated weights for policy 0, policy_version 133290 (0.0024) [2024-06-22 04:23:48,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43421.1, 300 sec: 42765.0). Total num frames: 2183823360. Throughput: 0: 42592.5. Samples: 2183954980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 04:23:52,647][15401] Updated weights for policy 0, policy_version 133300 (0.0025) [2024-06-22 04:23:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2184003584. Throughput: 0: 42754.6. Samples: 2184098780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 04:23:55,805][15401] Updated weights for policy 0, policy_version 133310 (0.0040) [2024-06-22 04:23:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2184216576. Throughput: 0: 42530.2. Samples: 2184343200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 04:23:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 04:24:00,469][15401] Updated weights for policy 0, policy_version 133320 (0.0028) [2024-06-22 04:24:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2184462336. Throughput: 0: 42632.3. Samples: 2184600240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 04:24:03,789][15401] Updated weights for policy 0, policy_version 133330 (0.0035) [2024-06-22 04:24:08,109][15401] Updated weights for policy 0, policy_version 133340 (0.0033) [2024-06-22 04:24:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2184658944. Throughput: 0: 42594.6. Samples: 2184735420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 04:24:09,446][15349] Signal inference workers to stop experience collection... (32200 times) [2024-06-22 04:24:09,497][15349] Signal inference workers to resume experience collection... (32200 times) [2024-06-22 04:24:09,498][15401] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-06-22 04:24:09,512][15401] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-06-22 04:24:11,521][15401] Updated weights for policy 0, policy_version 133350 (0.0037) [2024-06-22 04:24:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2184855552. Throughput: 0: 42542.0. Samples: 2184984800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 04:24:15,566][15401] Updated weights for policy 0, policy_version 133360 (0.0033) [2024-06-22 04:24:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2185084928. Throughput: 0: 42821.7. Samples: 2185244580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 04:24:19,238][15401] Updated weights for policy 0, policy_version 133370 (0.0033) [2024-06-22 04:24:23,294][15401] Updated weights for policy 0, policy_version 133380 (0.0030) [2024-06-22 04:24:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42875.4, 300 sec: 42653.9). Total num frames: 2185297920. Throughput: 0: 42723.6. Samples: 2185379200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 04:24:26,821][15401] Updated weights for policy 0, policy_version 133390 (0.0036) [2024-06-22 04:24:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2185494528. Throughput: 0: 42763.5. Samples: 2185626720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 04:24:30,902][15401] Updated weights for policy 0, policy_version 133400 (0.0035) [2024-06-22 04:24:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2185723904. Throughput: 0: 42867.8. Samples: 2185884040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 04:24:34,467][15401] Updated weights for policy 0, policy_version 133410 (0.0025) [2024-06-22 04:24:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2185920512. Throughput: 0: 42551.3. Samples: 2186013580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 04:24:38,609][15401] Updated weights for policy 0, policy_version 133420 (0.0037) [2024-06-22 04:24:42,173][15401] Updated weights for policy 0, policy_version 133430 (0.0033) [2024-06-22 04:24:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2186149888. Throughput: 0: 42671.1. Samples: 2186263400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 04:24:46,201][15401] Updated weights for policy 0, policy_version 133440 (0.0039) [2024-06-22 04:24:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2186346496. Throughput: 0: 42714.3. Samples: 2186522380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 04:24:50,437][15401] Updated weights for policy 0, policy_version 133450 (0.0044) [2024-06-22 04:24:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2186559488. Throughput: 0: 42449.3. Samples: 2186645640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 04:24:53,931][15401] Updated weights for policy 0, policy_version 133460 (0.0041) [2024-06-22 04:24:58,107][15401] Updated weights for policy 0, policy_version 133470 (0.0039) [2024-06-22 04:24:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2186772480. Throughput: 0: 42534.4. Samples: 2186898840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 04:24:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 04:25:01,611][15401] Updated weights for policy 0, policy_version 133480 (0.0042) [2024-06-22 04:25:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2186969088. Throughput: 0: 42521.3. Samples: 2187158040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 04:25:05,611][15401] Updated weights for policy 0, policy_version 133490 (0.0028) [2024-06-22 04:25:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2187198464. Throughput: 0: 42324.6. Samples: 2187283800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 04:25:09,206][15401] Updated weights for policy 0, policy_version 133500 (0.0040) [2024-06-22 04:25:13,146][15401] Updated weights for policy 0, policy_version 133510 (0.0028) [2024-06-22 04:25:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2187427840. Throughput: 0: 42613.3. Samples: 2187544320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 04:25:16,924][15401] Updated weights for policy 0, policy_version 133520 (0.0041) [2024-06-22 04:25:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 2187608064. Throughput: 0: 42610.8. Samples: 2187801620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:18,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 04:25:20,851][15401] Updated weights for policy 0, policy_version 133530 (0.0029) [2024-06-22 04:25:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2187837440. Throughput: 0: 42433.8. Samples: 2187923100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 04:25:24,811][15401] Updated weights for policy 0, policy_version 133540 (0.0030) [2024-06-22 04:25:28,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2188066816. Throughput: 0: 42604.8. Samples: 2188180620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 04:25:28,613][15401] Updated weights for policy 0, policy_version 133550 (0.0037) [2024-06-22 04:25:32,312][15401] Updated weights for policy 0, policy_version 133560 (0.0028) [2024-06-22 04:25:33,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 2188263424. Throughput: 0: 42490.4. Samples: 2188434720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:33,396][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 04:25:36,465][15401] Updated weights for policy 0, policy_version 133570 (0.0036) [2024-06-22 04:25:38,009][15349] Signal inference workers to stop experience collection... (32250 times) [2024-06-22 04:25:38,010][15349] Signal inference workers to resume experience collection... (32250 times) [2024-06-22 04:25:38,031][15401] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-06-22 04:25:38,061][15401] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-06-22 04:25:38,394][15132] Fps is (10 sec: 42578.7, 60 sec: 42868.1, 300 sec: 42708.8). Total num frames: 2188492800. Throughput: 0: 42589.8. Samples: 2188562380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:38,395][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 04:25:39,779][15401] Updated weights for policy 0, policy_version 133580 (0.0037) [2024-06-22 04:25:43,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2188689408. Throughput: 0: 42686.5. Samples: 2188819740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 04:25:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133588_2188705792.pth... [2024-06-22 04:25:43,623][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000132962_2178449408.pth [2024-06-22 04:25:44,032][15401] Updated weights for policy 0, policy_version 133590 (0.0038) [2024-06-22 04:25:47,393][15401] Updated weights for policy 0, policy_version 133600 (0.0034) [2024-06-22 04:25:48,389][15132] Fps is (10 sec: 40979.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2188902400. Throughput: 0: 42552.1. Samples: 2189072880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:48,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 04:25:51,780][15401] Updated weights for policy 0, policy_version 133610 (0.0032) [2024-06-22 04:25:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 2189115392. Throughput: 0: 42670.5. Samples: 2189203980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 04:25:55,575][15401] Updated weights for policy 0, policy_version 133620 (0.0047) [2024-06-22 04:25:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 2189328384. Throughput: 0: 42540.1. Samples: 2189458720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:25:58,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 04:25:59,306][15401] Updated weights for policy 0, policy_version 133630 (0.0032) [2024-06-22 04:26:03,059][15401] Updated weights for policy 0, policy_version 133640 (0.0031) [2024-06-22 04:26:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2189557760. Throughput: 0: 42245.9. Samples: 2189702580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 04:26:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 04:26:07,491][15401] Updated weights for policy 0, policy_version 133650 (0.0033) [2024-06-22 04:26:08,392][15132] Fps is (10 sec: 44236.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 2189770752. Throughput: 0: 42518.2. Samples: 2189836520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:08,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 04:26:10,498][15401] Updated weights for policy 0, policy_version 133660 (0.0037) [2024-06-22 04:26:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2189967360. Throughput: 0: 42598.7. Samples: 2190097560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:13,393][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 04:26:15,095][15401] Updated weights for policy 0, policy_version 133670 (0.0034) [2024-06-22 04:26:18,239][15401] Updated weights for policy 0, policy_version 133680 (0.0040) [2024-06-22 04:26:18,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 2190213120. Throughput: 0: 42398.9. Samples: 2190342400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 04:26:22,685][15401] Updated weights for policy 0, policy_version 133690 (0.0041) [2024-06-22 04:26:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2190393344. Throughput: 0: 42614.7. Samples: 2190479840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 04:26:25,996][15401] Updated weights for policy 0, policy_version 133700 (0.0028) [2024-06-22 04:26:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2190606336. Throughput: 0: 42590.9. Samples: 2190736320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:28,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 04:26:30,263][15401] Updated weights for policy 0, policy_version 133710 (0.0047) [2024-06-22 04:26:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42876.0, 300 sec: 42765.4). Total num frames: 2190835712. Throughput: 0: 42662.1. Samples: 2190992680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:33,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 04:26:33,651][15401] Updated weights for policy 0, policy_version 133720 (0.0031) [2024-06-22 04:26:37,683][15401] Updated weights for policy 0, policy_version 133730 (0.0030) [2024-06-22 04:26:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42601.8, 300 sec: 42765.0). Total num frames: 2191048704. Throughput: 0: 42739.7. Samples: 2191127260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 04:26:41,455][15401] Updated weights for policy 0, policy_version 133740 (0.0045) [2024-06-22 04:26:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2191261696. Throughput: 0: 42725.7. Samples: 2191381380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:43,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 04:26:44,461][15349] Signal inference workers to stop experience collection... (32300 times) [2024-06-22 04:26:44,507][15401] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-06-22 04:26:44,516][15349] Signal inference workers to resume experience collection... (32300 times) [2024-06-22 04:26:44,518][15401] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-06-22 04:26:45,375][15401] Updated weights for policy 0, policy_version 133750 (0.0039) [2024-06-22 04:26:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2191474688. Throughput: 0: 43137.7. Samples: 2191643780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:48,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 04:26:48,873][15401] Updated weights for policy 0, policy_version 133760 (0.0044) [2024-06-22 04:26:52,877][15401] Updated weights for policy 0, policy_version 133770 (0.0023) [2024-06-22 04:26:53,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2191687680. Throughput: 0: 43013.3. Samples: 2191772120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:53,393][15132] Avg episode reward: [(0, '0.245')] [2024-06-22 04:26:56,377][15401] Updated weights for policy 0, policy_version 133780 (0.0036) [2024-06-22 04:26:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2191900672. Throughput: 0: 43015.6. Samples: 2192033260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:26:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 04:27:00,549][15401] Updated weights for policy 0, policy_version 133790 (0.0035) [2024-06-22 04:27:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2192113664. Throughput: 0: 43259.6. Samples: 2192289080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 04:27:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 04:27:04,157][15401] Updated weights for policy 0, policy_version 133800 (0.0035) [2024-06-22 04:27:08,071][15401] Updated weights for policy 0, policy_version 133810 (0.0024) [2024-06-22 04:27:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2192343040. Throughput: 0: 43058.6. Samples: 2192417480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:08,394][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 04:27:11,803][15401] Updated weights for policy 0, policy_version 133820 (0.0035) [2024-06-22 04:27:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2192556032. Throughput: 0: 43149.6. Samples: 2192678060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:13,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 04:27:15,648][15401] Updated weights for policy 0, policy_version 133830 (0.0028) [2024-06-22 04:27:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2192769024. Throughput: 0: 43236.4. Samples: 2192938320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 04:27:19,381][15401] Updated weights for policy 0, policy_version 133840 (0.0031) [2024-06-22 04:27:23,265][15401] Updated weights for policy 0, policy_version 133850 (0.0039) [2024-06-22 04:27:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 2192998400. Throughput: 0: 43093.8. Samples: 2193066480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:23,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 04:27:26,882][15401] Updated weights for policy 0, policy_version 133860 (0.0034) [2024-06-22 04:27:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.8, 300 sec: 42875.7). Total num frames: 2193211392. Throughput: 0: 43148.9. Samples: 2193323080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:28,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 04:27:30,810][15401] Updated weights for policy 0, policy_version 133870 (0.0026) [2024-06-22 04:27:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 2193424384. Throughput: 0: 43151.5. Samples: 2193585700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:33,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 04:27:34,477][15401] Updated weights for policy 0, policy_version 133880 (0.0032) [2024-06-22 04:27:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 2193637376. Throughput: 0: 43103.7. Samples: 2193711680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 04:27:38,446][15401] Updated weights for policy 0, policy_version 133890 (0.0029) [2024-06-22 04:27:42,089][15401] Updated weights for policy 0, policy_version 133900 (0.0025) [2024-06-22 04:27:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43146.2, 300 sec: 42821.2). Total num frames: 2193850368. Throughput: 0: 42982.6. Samples: 2193967480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 04:27:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133903_2193866752.pth... [2024-06-22 04:27:43,513][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133274_2183561216.pth [2024-06-22 04:27:46,316][15401] Updated weights for policy 0, policy_version 133910 (0.0041) [2024-06-22 04:27:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2194063360. Throughput: 0: 43083.5. Samples: 2194227840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 04:27:50,006][15401] Updated weights for policy 0, policy_version 133920 (0.0032) [2024-06-22 04:27:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2194276352. Throughput: 0: 43061.4. Samples: 2194355240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 04:27:53,645][15401] Updated weights for policy 0, policy_version 133930 (0.0052) [2024-06-22 04:27:57,692][15401] Updated weights for policy 0, policy_version 133940 (0.0033) [2024-06-22 04:27:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2194489344. Throughput: 0: 43071.2. Samples: 2194616260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:27:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 04:28:01,489][15401] Updated weights for policy 0, policy_version 133950 (0.0032) [2024-06-22 04:28:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2194718720. Throughput: 0: 43045.2. Samples: 2194875360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:28:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 04:28:05,330][15401] Updated weights for policy 0, policy_version 133960 (0.0038) [2024-06-22 04:28:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2194915328. Throughput: 0: 42967.9. Samples: 2195000040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 04:28:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 04:28:08,947][15401] Updated weights for policy 0, policy_version 133970 (0.0028) [2024-06-22 04:28:10,597][15349] Signal inference workers to stop experience collection... (32350 times) [2024-06-22 04:28:10,598][15349] Signal inference workers to resume experience collection... (32350 times) [2024-06-22 04:28:10,622][15401] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-06-22 04:28:10,623][15401] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-06-22 04:28:12,805][15401] Updated weights for policy 0, policy_version 133980 (0.0042) [2024-06-22 04:28:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2195128320. Throughput: 0: 43133.0. Samples: 2195263960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:13,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-22 04:28:16,684][15401] Updated weights for policy 0, policy_version 133990 (0.0039) [2024-06-22 04:28:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42821.4). Total num frames: 2195357696. Throughput: 0: 42846.4. Samples: 2195513680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 04:28:20,950][15401] Updated weights for policy 0, policy_version 134000 (0.0030) [2024-06-22 04:28:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2195570688. Throughput: 0: 42893.7. Samples: 2195641900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 04:28:24,839][15401] Updated weights for policy 0, policy_version 134010 (0.0045) [2024-06-22 04:28:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 2195767296. Throughput: 0: 42881.4. Samples: 2195897140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 04:28:28,639][15401] Updated weights for policy 0, policy_version 134020 (0.0029) [2024-06-22 04:28:32,421][15401] Updated weights for policy 0, policy_version 134030 (0.0031) [2024-06-22 04:28:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 2196013056. Throughput: 0: 42872.1. Samples: 2196157080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 04:28:36,121][15401] Updated weights for policy 0, policy_version 134040 (0.0031) [2024-06-22 04:28:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2196209664. Throughput: 0: 42954.3. Samples: 2196288180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 04:28:39,874][15401] Updated weights for policy 0, policy_version 134050 (0.0044) [2024-06-22 04:28:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2196422656. Throughput: 0: 42873.3. Samples: 2196545560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:28:43,726][15401] Updated weights for policy 0, policy_version 134060 (0.0033) [2024-06-22 04:28:47,442][15401] Updated weights for policy 0, policy_version 134070 (0.0035) [2024-06-22 04:28:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2196635648. Throughput: 0: 42879.2. Samples: 2196804920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 04:28:51,182][15401] Updated weights for policy 0, policy_version 134080 (0.0038) [2024-06-22 04:28:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2196848640. Throughput: 0: 42947.1. Samples: 2196932660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 04:28:54,934][15401] Updated weights for policy 0, policy_version 134090 (0.0028) [2024-06-22 04:28:58,391][15132] Fps is (10 sec: 42592.9, 60 sec: 42870.4, 300 sec: 42709.3). Total num frames: 2197061632. Throughput: 0: 42692.9. Samples: 2197185200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:28:58,391][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 04:28:58,726][15401] Updated weights for policy 0, policy_version 134100 (0.0043) [2024-06-22 04:29:02,492][15401] Updated weights for policy 0, policy_version 134110 (0.0034) [2024-06-22 04:29:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2197274624. Throughput: 0: 42801.3. Samples: 2197439740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:29:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 04:29:06,765][15401] Updated weights for policy 0, policy_version 134120 (0.0029) [2024-06-22 04:29:08,389][15132] Fps is (10 sec: 44243.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2197504000. Throughput: 0: 42914.3. Samples: 2197573040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:29:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 04:29:10,343][15401] Updated weights for policy 0, policy_version 134130 (0.0039) [2024-06-22 04:29:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2197716992. Throughput: 0: 42883.1. Samples: 2197826880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 04:29:14,655][15401] Updated weights for policy 0, policy_version 134140 (0.0027) [2024-06-22 04:29:17,869][15401] Updated weights for policy 0, policy_version 134150 (0.0045) [2024-06-22 04:29:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2197929984. Throughput: 0: 42851.9. Samples: 2198085520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:18,392][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 04:29:22,506][15401] Updated weights for policy 0, policy_version 134160 (0.0038) [2024-06-22 04:29:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2198142976. Throughput: 0: 42828.3. Samples: 2198215460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 04:29:25,518][15401] Updated weights for policy 0, policy_version 134170 (0.0030) [2024-06-22 04:29:28,374][15349] Signal inference workers to stop experience collection... (32400 times) [2024-06-22 04:29:28,374][15349] Signal inference workers to resume experience collection... (32400 times) [2024-06-22 04:29:28,386][15401] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-06-22 04:29:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2198355968. Throughput: 0: 42812.8. Samples: 2198472140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 04:29:28,413][15401] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-06-22 04:29:30,021][15401] Updated weights for policy 0, policy_version 134180 (0.0032) [2024-06-22 04:29:33,002][15401] Updated weights for policy 0, policy_version 134190 (0.0025) [2024-06-22 04:29:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 2198585344. Throughput: 0: 42625.4. Samples: 2198723160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:33,393][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 04:29:37,538][15401] Updated weights for policy 0, policy_version 134200 (0.0041) [2024-06-22 04:29:38,391][15132] Fps is (10 sec: 40954.3, 60 sec: 42597.3, 300 sec: 42764.8). Total num frames: 2198765568. Throughput: 0: 42718.2. Samples: 2198855040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:38,391][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 04:29:40,718][15401] Updated weights for policy 0, policy_version 134210 (0.0043) [2024-06-22 04:29:43,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 2198994944. Throughput: 0: 42834.6. Samples: 2199112800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:43,393][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 04:29:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134216_2198994944.pth... [2024-06-22 04:29:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133588_2188705792.pth [2024-06-22 04:29:45,389][15401] Updated weights for policy 0, policy_version 134220 (0.0030) [2024-06-22 04:29:48,370][15401] Updated weights for policy 0, policy_version 134230 (0.0025) [2024-06-22 04:29:48,389][15132] Fps is (10 sec: 45882.3, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 2199224320. Throughput: 0: 42784.4. Samples: 2199365040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 04:29:52,892][15401] Updated weights for policy 0, policy_version 134240 (0.0041) [2024-06-22 04:29:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2199404544. Throughput: 0: 42844.4. Samples: 2199501040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 04:29:55,793][15401] Updated weights for policy 0, policy_version 134250 (0.0034) [2024-06-22 04:29:58,396][15132] Fps is (10 sec: 40933.2, 60 sec: 42867.9, 300 sec: 42930.7). Total num frames: 2199633920. Throughput: 0: 42858.3. Samples: 2199755780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:29:58,397][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 04:30:00,343][15401] Updated weights for policy 0, policy_version 134260 (0.0046) [2024-06-22 04:30:03,309][15401] Updated weights for policy 0, policy_version 134270 (0.0055) [2024-06-22 04:30:03,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2199879680. Throughput: 0: 42799.1. Samples: 2200011380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:30:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 04:30:07,917][15401] Updated weights for policy 0, policy_version 134280 (0.0032) [2024-06-22 04:30:08,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2200043520. Throughput: 0: 42780.6. Samples: 2200140580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:30:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 04:30:11,549][15401] Updated weights for policy 0, policy_version 134290 (0.0023) [2024-06-22 04:30:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 2200272896. Throughput: 0: 42722.7. Samples: 2200394660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:30:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 04:30:15,421][15401] Updated weights for policy 0, policy_version 134300 (0.0043) [2024-06-22 04:30:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2200485888. Throughput: 0: 42868.5. Samples: 2200652140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 04:30:19,113][15401] Updated weights for policy 0, policy_version 134310 (0.0038) [2024-06-22 04:30:23,053][15401] Updated weights for policy 0, policy_version 134320 (0.0039) [2024-06-22 04:30:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2200698880. Throughput: 0: 42682.8. Samples: 2200775700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 04:30:26,755][15401] Updated weights for policy 0, policy_version 134330 (0.0043) [2024-06-22 04:30:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42932.6). Total num frames: 2200928256. Throughput: 0: 42605.5. Samples: 2201029940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 04:30:30,773][15401] Updated weights for policy 0, policy_version 134340 (0.0033) [2024-06-22 04:30:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42327.0, 300 sec: 42821.2). Total num frames: 2201124864. Throughput: 0: 42762.5. Samples: 2201289360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 04:30:34,446][15401] Updated weights for policy 0, policy_version 134350 (0.0037) [2024-06-22 04:30:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42872.5, 300 sec: 42876.1). Total num frames: 2201337856. Throughput: 0: 42516.0. Samples: 2201414260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:38,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-22 04:30:38,795][15401] Updated weights for policy 0, policy_version 134360 (0.0030) [2024-06-22 04:30:42,306][15401] Updated weights for policy 0, policy_version 134370 (0.0025) [2024-06-22 04:30:43,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 2201550848. Throughput: 0: 42451.9. Samples: 2201665940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:43,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 04:30:46,667][15401] Updated weights for policy 0, policy_version 134380 (0.0031) [2024-06-22 04:30:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 2201747456. Throughput: 0: 42407.6. Samples: 2201919720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 04:30:50,248][15401] Updated weights for policy 0, policy_version 134390 (0.0031) [2024-06-22 04:30:53,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2201976832. Throughput: 0: 42315.9. Samples: 2202044800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 04:30:54,052][15401] Updated weights for policy 0, policy_version 134400 (0.0034) [2024-06-22 04:30:55,051][15349] Signal inference workers to stop experience collection... (32450 times) [2024-06-22 04:30:55,078][15401] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-06-22 04:30:55,112][15349] Signal inference workers to resume experience collection... (32450 times) [2024-06-22 04:30:55,115][15401] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-06-22 04:30:57,962][15401] Updated weights for policy 0, policy_version 134410 (0.0037) [2024-06-22 04:30:58,391][15132] Fps is (10 sec: 42589.9, 60 sec: 42328.5, 300 sec: 42764.7). Total num frames: 2202173440. Throughput: 0: 42484.4. Samples: 2202306540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:30:58,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 04:31:02,145][15401] Updated weights for policy 0, policy_version 134420 (0.0034) [2024-06-22 04:31:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42765.4). Total num frames: 2202386432. Throughput: 0: 42378.3. Samples: 2202559160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:31:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 04:31:05,475][15401] Updated weights for policy 0, policy_version 134430 (0.0031) [2024-06-22 04:31:08,389][15132] Fps is (10 sec: 44245.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2202615808. Throughput: 0: 42497.8. Samples: 2202688100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:31:08,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-22 04:31:09,756][15401] Updated weights for policy 0, policy_version 134440 (0.0037) [2024-06-22 04:31:13,119][15401] Updated weights for policy 0, policy_version 134450 (0.0032) [2024-06-22 04:31:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2202828800. Throughput: 0: 42564.8. Samples: 2202945360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 04:31:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 04:31:17,346][15401] Updated weights for policy 0, policy_version 134460 (0.0041) [2024-06-22 04:31:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2203025408. Throughput: 0: 42554.3. Samples: 2203204300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 04:31:20,879][15401] Updated weights for policy 0, policy_version 134470 (0.0036) [2024-06-22 04:31:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2203254784. Throughput: 0: 42476.0. Samples: 2203325680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 04:31:25,133][15401] Updated weights for policy 0, policy_version 134480 (0.0038) [2024-06-22 04:31:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2203467776. Throughput: 0: 42709.9. Samples: 2203587780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 04:31:28,426][15401] Updated weights for policy 0, policy_version 134490 (0.0027) [2024-06-22 04:31:32,603][15401] Updated weights for policy 0, policy_version 134500 (0.0046) [2024-06-22 04:31:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2203664384. Throughput: 0: 42737.7. Samples: 2203842920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 04:31:36,231][15401] Updated weights for policy 0, policy_version 134510 (0.0036) [2024-06-22 04:31:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 2203893760. Throughput: 0: 42782.6. Samples: 2203970020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 04:31:40,314][15401] Updated weights for policy 0, policy_version 134520 (0.0033) [2024-06-22 04:31:43,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 2204106752. Throughput: 0: 42692.9. Samples: 2204227740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:43,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 04:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134528_2204106752.pth... [2024-06-22 04:31:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000133903_2193866752.pth [2024-06-22 04:31:44,005][15401] Updated weights for policy 0, policy_version 134530 (0.0032) [2024-06-22 04:31:48,020][15401] Updated weights for policy 0, policy_version 134540 (0.0034) [2024-06-22 04:31:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2204303360. Throughput: 0: 42659.5. Samples: 2204478840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 04:31:51,637][15401] Updated weights for policy 0, policy_version 134550 (0.0035) [2024-06-22 04:31:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2204549120. Throughput: 0: 42701.2. Samples: 2204609660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:53,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 04:31:55,898][15401] Updated weights for policy 0, policy_version 134560 (0.0028) [2024-06-22 04:31:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 2204729344. Throughput: 0: 42496.4. Samples: 2204857700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:31:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 04:31:59,412][15401] Updated weights for policy 0, policy_version 134570 (0.0050) [2024-06-22 04:32:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2204925952. Throughput: 0: 42569.3. Samples: 2205119920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:32:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 04:32:03,732][15401] Updated weights for policy 0, policy_version 134580 (0.0028) [2024-06-22 04:32:06,994][15401] Updated weights for policy 0, policy_version 134590 (0.0038) [2024-06-22 04:32:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2205188096. Throughput: 0: 42688.0. Samples: 2205246640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:32:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 04:32:11,260][15401] Updated weights for policy 0, policy_version 134600 (0.0045) [2024-06-22 04:32:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2205368320. Throughput: 0: 42508.9. Samples: 2205500680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:32:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 04:32:14,777][15401] Updated weights for policy 0, policy_version 134610 (0.0039) [2024-06-22 04:32:15,001][15349] Signal inference workers to stop experience collection... (32500 times) [2024-06-22 04:32:15,001][15349] Signal inference workers to resume experience collection... (32500 times) [2024-06-22 04:32:15,041][15401] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-06-22 04:32:15,041][15401] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-06-22 04:32:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2205581312. Throughput: 0: 42794.8. Samples: 2205768680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 04:32:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 04:32:18,876][15401] Updated weights for policy 0, policy_version 134620 (0.0025) [2024-06-22 04:32:22,533][15401] Updated weights for policy 0, policy_version 134630 (0.0042) [2024-06-22 04:32:23,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2205843456. Throughput: 0: 42608.1. Samples: 2205887380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 04:32:26,432][15401] Updated weights for policy 0, policy_version 134640 (0.0026) [2024-06-22 04:32:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.6, 300 sec: 42709.5). Total num frames: 2206023680. Throughput: 0: 42590.2. Samples: 2206144300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:28,393][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 04:32:29,967][15401] Updated weights for policy 0, policy_version 134650 (0.0039) [2024-06-22 04:32:33,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2206203904. Throughput: 0: 42996.9. Samples: 2206413700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:33,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 04:32:34,093][15401] Updated weights for policy 0, policy_version 134660 (0.0027) [2024-06-22 04:32:37,507][15401] Updated weights for policy 0, policy_version 134670 (0.0039) [2024-06-22 04:32:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2206466048. Throughput: 0: 42751.2. Samples: 2206533460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 04:32:41,788][15401] Updated weights for policy 0, policy_version 134680 (0.0040) [2024-06-22 04:32:43,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2206679040. Throughput: 0: 42895.0. Samples: 2206787980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:43,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 04:32:45,349][15401] Updated weights for policy 0, policy_version 134690 (0.0035) [2024-06-22 04:32:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2206859264. Throughput: 0: 42895.1. Samples: 2207050200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:48,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 04:32:49,491][15401] Updated weights for policy 0, policy_version 134700 (0.0036) [2024-06-22 04:32:53,040][15401] Updated weights for policy 0, policy_version 134710 (0.0033) [2024-06-22 04:32:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2207121408. Throughput: 0: 42813.1. Samples: 2207173240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 04:32:57,059][15401] Updated weights for policy 0, policy_version 134720 (0.0033) [2024-06-22 04:32:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2207318016. Throughput: 0: 42828.9. Samples: 2207427980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:32:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 04:33:00,717][15401] Updated weights for policy 0, policy_version 134730 (0.0033) [2024-06-22 04:33:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2207498240. Throughput: 0: 42624.0. Samples: 2207686760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:33:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 04:33:04,871][15401] Updated weights for policy 0, policy_version 134740 (0.0038) [2024-06-22 04:33:08,392][15132] Fps is (10 sec: 42587.4, 60 sec: 42596.6, 300 sec: 42764.6). Total num frames: 2207744000. Throughput: 0: 42648.8. Samples: 2207806680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:33:08,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 04:33:08,393][15401] Updated weights for policy 0, policy_version 134750 (0.0045) [2024-06-22 04:33:12,553][15401] Updated weights for policy 0, policy_version 134760 (0.0030) [2024-06-22 04:33:13,058][15349] Signal inference workers to stop experience collection... (32550 times) [2024-06-22 04:33:13,059][15349] Signal inference workers to resume experience collection... (32550 times) [2024-06-22 04:33:13,090][15401] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-06-22 04:33:13,090][15401] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-06-22 04:33:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2207956992. Throughput: 0: 42813.4. Samples: 2208070800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:33:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 04:33:15,970][15401] Updated weights for policy 0, policy_version 134770 (0.0038) [2024-06-22 04:33:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2208153600. Throughput: 0: 42302.1. Samples: 2208317300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:33:18,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 04:33:20,159][15401] Updated weights for policy 0, policy_version 134780 (0.0031) [2024-06-22 04:33:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 2208350208. Throughput: 0: 42537.4. Samples: 2208447640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 04:33:23,993][15401] Updated weights for policy 0, policy_version 134790 (0.0040) [2024-06-22 04:33:27,859][15401] Updated weights for policy 0, policy_version 134800 (0.0036) [2024-06-22 04:33:28,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42598.4, 300 sec: 42598.0). Total num frames: 2208579584. Throughput: 0: 42587.7. Samples: 2208704520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:28,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 04:33:31,628][15401] Updated weights for policy 0, policy_version 134810 (0.0032) [2024-06-22 04:33:33,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2208808960. Throughput: 0: 42338.7. Samples: 2208955440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 04:33:35,536][15401] Updated weights for policy 0, policy_version 134820 (0.0033) [2024-06-22 04:33:38,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2209005568. Throughput: 0: 42477.2. Samples: 2209084700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 04:33:39,267][15401] Updated weights for policy 0, policy_version 134830 (0.0032) [2024-06-22 04:33:43,111][15401] Updated weights for policy 0, policy_version 134840 (0.0043) [2024-06-22 04:33:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2209218560. Throughput: 0: 42521.1. Samples: 2209341440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 04:33:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134840_2209218560.pth... [2024-06-22 04:33:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134216_2198994944.pth [2024-06-22 04:33:46,923][15401] Updated weights for policy 0, policy_version 134850 (0.0024) [2024-06-22 04:33:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2209431552. Throughput: 0: 42382.5. Samples: 2209593980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:48,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 04:33:50,842][15401] Updated weights for policy 0, policy_version 134860 (0.0035) [2024-06-22 04:33:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42654.1). Total num frames: 2209644544. Throughput: 0: 42544.5. Samples: 2209721080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 04:33:54,461][15401] Updated weights for policy 0, policy_version 134870 (0.0042) [2024-06-22 04:33:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.5, 300 sec: 42598.0). Total num frames: 2209841152. Throughput: 0: 42323.9. Samples: 2209975480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:33:58,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 04:33:58,864][15401] Updated weights for policy 0, policy_version 134880 (0.0043) [2024-06-22 04:34:02,030][15401] Updated weights for policy 0, policy_version 134890 (0.0034) [2024-06-22 04:34:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2210054144. Throughput: 0: 42564.9. Samples: 2210232720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:34:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 04:34:06,457][15401] Updated weights for policy 0, policy_version 134900 (0.0032) [2024-06-22 04:34:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 2210283520. Throughput: 0: 42549.6. Samples: 2210362380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:34:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 04:34:09,866][15401] Updated weights for policy 0, policy_version 134910 (0.0020) [2024-06-22 04:34:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2210496512. Throughput: 0: 42414.7. Samples: 2210613080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:34:13,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 04:34:13,846][15401] Updated weights for policy 0, policy_version 134920 (0.0036) [2024-06-22 04:34:17,502][15401] Updated weights for policy 0, policy_version 134930 (0.0035) [2024-06-22 04:34:18,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 2210709504. Throughput: 0: 42474.8. Samples: 2210867080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:34:18,396][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 04:34:21,875][15401] Updated weights for policy 0, policy_version 134940 (0.0035) [2024-06-22 04:34:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 2210906112. Throughput: 0: 42486.0. Samples: 2210996680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 04:34:23,393][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 04:34:25,189][15401] Updated weights for policy 0, policy_version 134950 (0.0036) [2024-06-22 04:34:28,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42873.2, 300 sec: 42598.8). Total num frames: 2211151872. Throughput: 0: 42487.7. Samples: 2211253380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 04:34:29,668][15401] Updated weights for policy 0, policy_version 134960 (0.0037) [2024-06-22 04:34:31,303][15349] Signal inference workers to stop experience collection... (32600 times) [2024-06-22 04:34:31,303][15349] Signal inference workers to resume experience collection... (32600 times) [2024-06-22 04:34:31,324][15401] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-06-22 04:34:31,325][15401] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-06-22 04:34:32,877][15401] Updated weights for policy 0, policy_version 134970 (0.0039) [2024-06-22 04:34:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42654.1). Total num frames: 2211348480. Throughput: 0: 42536.0. Samples: 2211508100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 04:34:37,199][15401] Updated weights for policy 0, policy_version 134980 (0.0028) [2024-06-22 04:34:38,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42052.1, 300 sec: 42487.7). Total num frames: 2211528704. Throughput: 0: 42556.0. Samples: 2211636100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 04:34:40,395][15401] Updated weights for policy 0, policy_version 134990 (0.0030) [2024-06-22 04:34:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 2211774464. Throughput: 0: 42648.1. Samples: 2211894540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 04:34:44,812][15401] Updated weights for policy 0, policy_version 135000 (0.0041) [2024-06-22 04:34:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2211987456. Throughput: 0: 42542.0. Samples: 2212147100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 04:34:48,431][15401] Updated weights for policy 0, policy_version 135010 (0.0035) [2024-06-22 04:34:52,563][15401] Updated weights for policy 0, policy_version 135020 (0.0028) [2024-06-22 04:34:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 2212184064. Throughput: 0: 42395.5. Samples: 2212270180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:53,396][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 04:34:55,959][15401] Updated weights for policy 0, policy_version 135030 (0.0035) [2024-06-22 04:34:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42542.9). Total num frames: 2212429824. Throughput: 0: 42832.5. Samples: 2212540540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:34:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 04:35:00,131][15401] Updated weights for policy 0, policy_version 135040 (0.0034) [2024-06-22 04:35:03,345][15401] Updated weights for policy 0, policy_version 135050 (0.0021) [2024-06-22 04:35:03,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2212659200. Throughput: 0: 42864.8. Samples: 2212795720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 04:35:07,537][15401] Updated weights for policy 0, policy_version 135060 (0.0049) [2024-06-22 04:35:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2212839424. Throughput: 0: 42903.6. Samples: 2212927240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 04:35:10,881][15401] Updated weights for policy 0, policy_version 135070 (0.0034) [2024-06-22 04:35:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2213068800. Throughput: 0: 42887.8. Samples: 2213183340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 04:35:15,047][15401] Updated weights for policy 0, policy_version 135080 (0.0028) [2024-06-22 04:35:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 2213281792. Throughput: 0: 42842.7. Samples: 2213436020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 04:35:18,774][15401] Updated weights for policy 0, policy_version 135090 (0.0030) [2024-06-22 04:35:22,693][15401] Updated weights for policy 0, policy_version 135100 (0.0032) [2024-06-22 04:35:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 2213494784. Throughput: 0: 42817.3. Samples: 2213562880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 04:35:26,743][15401] Updated weights for policy 0, policy_version 135110 (0.0031) [2024-06-22 04:35:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2213691392. Throughput: 0: 42770.2. Samples: 2213819200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:35:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 04:35:30,495][15401] Updated weights for policy 0, policy_version 135120 (0.0029) [2024-06-22 04:35:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2213888000. Throughput: 0: 42932.4. Samples: 2214079060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 04:35:34,327][15401] Updated weights for policy 0, policy_version 135130 (0.0040) [2024-06-22 04:35:38,219][15401] Updated weights for policy 0, policy_version 135140 (0.0029) [2024-06-22 04:35:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42654.3). Total num frames: 2214133760. Throughput: 0: 42883.6. Samples: 2214199940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 04:35:39,508][15349] Signal inference workers to stop experience collection... (32650 times) [2024-06-22 04:35:39,543][15401] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-06-22 04:35:39,568][15349] Signal inference workers to resume experience collection... (32650 times) [2024-06-22 04:35:39,572][15401] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-06-22 04:35:41,823][15401] Updated weights for policy 0, policy_version 135150 (0.0033) [2024-06-22 04:35:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2214346752. Throughput: 0: 42719.1. Samples: 2214462900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 04:35:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135154_2214363136.pth... [2024-06-22 04:35:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134528_2204106752.pth [2024-06-22 04:35:45,722][15401] Updated weights for policy 0, policy_version 135160 (0.0029) [2024-06-22 04:35:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2214543360. Throughput: 0: 42788.9. Samples: 2214721220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:35:49,592][15401] Updated weights for policy 0, policy_version 135170 (0.0043) [2024-06-22 04:35:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 2214756352. Throughput: 0: 42604.8. Samples: 2214844460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 04:35:53,555][15401] Updated weights for policy 0, policy_version 135180 (0.0028) [2024-06-22 04:35:57,179][15401] Updated weights for policy 0, policy_version 135190 (0.0023) [2024-06-22 04:35:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2215002112. Throughput: 0: 42726.2. Samples: 2215106020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:35:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 04:36:01,391][15401] Updated weights for policy 0, policy_version 135200 (0.0030) [2024-06-22 04:36:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 2215182336. Throughput: 0: 43043.0. Samples: 2215372960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 04:36:04,763][15401] Updated weights for policy 0, policy_version 135210 (0.0046) [2024-06-22 04:36:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2215428096. Throughput: 0: 42914.3. Samples: 2215494020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 04:36:08,992][15401] Updated weights for policy 0, policy_version 135220 (0.0032) [2024-06-22 04:36:12,411][15401] Updated weights for policy 0, policy_version 135230 (0.0035) [2024-06-22 04:36:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2215641088. Throughput: 0: 42948.3. Samples: 2215751880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 04:36:16,734][15401] Updated weights for policy 0, policy_version 135240 (0.0039) [2024-06-22 04:36:18,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 2215821312. Throughput: 0: 42823.4. Samples: 2216006220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:18,392][15132] Avg episode reward: [(0, '0.223')] [2024-06-22 04:36:20,261][15401] Updated weights for policy 0, policy_version 135250 (0.0026) [2024-06-22 04:36:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2216067072. Throughput: 0: 42926.7. Samples: 2216131640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:23,392][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 04:36:24,303][15401] Updated weights for policy 0, policy_version 135260 (0.0042) [2024-06-22 04:36:28,112][15401] Updated weights for policy 0, policy_version 135270 (0.0038) [2024-06-22 04:36:28,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2216263680. Throughput: 0: 42905.4. Samples: 2216393640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 04:36:31,835][15401] Updated weights for policy 0, policy_version 135280 (0.0040) [2024-06-22 04:36:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2216460288. Throughput: 0: 42764.4. Samples: 2216645620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 04:36:35,723][15401] Updated weights for policy 0, policy_version 135290 (0.0028) [2024-06-22 04:36:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2216706048. Throughput: 0: 42919.3. Samples: 2216775820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 04:36:39,512][15401] Updated weights for policy 0, policy_version 135300 (0.0043) [2024-06-22 04:36:43,321][15401] Updated weights for policy 0, policy_version 135310 (0.0037) [2024-06-22 04:36:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2216919040. Throughput: 0: 42880.1. Samples: 2217035620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 04:36:47,158][15401] Updated weights for policy 0, policy_version 135320 (0.0045) [2024-06-22 04:36:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2217115648. Throughput: 0: 42660.1. Samples: 2217292660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:48,394][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 04:36:50,906][15401] Updated weights for policy 0, policy_version 135330 (0.0038) [2024-06-22 04:36:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2217361408. Throughput: 0: 42663.4. Samples: 2217413880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 04:36:55,030][15401] Updated weights for policy 0, policy_version 135340 (0.0030) [2024-06-22 04:36:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2217541632. Throughput: 0: 42738.4. Samples: 2217675100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:36:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 04:36:58,593][15401] Updated weights for policy 0, policy_version 135350 (0.0049) [2024-06-22 04:37:02,975][15401] Updated weights for policy 0, policy_version 135360 (0.0028) [2024-06-22 04:37:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2217754624. Throughput: 0: 42744.4. Samples: 2217929620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 04:37:06,224][15401] Updated weights for policy 0, policy_version 135370 (0.0031) [2024-06-22 04:37:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2218000384. Throughput: 0: 42748.0. Samples: 2218055300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 04:37:10,499][15401] Updated weights for policy 0, policy_version 135380 (0.0047) [2024-06-22 04:37:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2218164224. Throughput: 0: 42756.0. Samples: 2218317660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 04:37:14,217][15401] Updated weights for policy 0, policy_version 135390 (0.0037) [2024-06-22 04:37:15,374][15349] Signal inference workers to stop experience collection... (32700 times) [2024-06-22 04:37:15,429][15401] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-06-22 04:37:15,433][15349] Signal inference workers to resume experience collection... (32700 times) [2024-06-22 04:37:15,441][15401] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-06-22 04:37:18,023][15401] Updated weights for policy 0, policy_version 135400 (0.0038) [2024-06-22 04:37:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 2218393600. Throughput: 0: 42704.1. Samples: 2218567300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 04:37:21,821][15401] Updated weights for policy 0, policy_version 135410 (0.0031) [2024-06-22 04:37:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 2218622976. Throughput: 0: 42748.5. Samples: 2218699500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:23,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 04:37:25,442][15401] Updated weights for policy 0, policy_version 135420 (0.0026) [2024-06-22 04:37:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2218819584. Throughput: 0: 42735.6. Samples: 2218958720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 04:37:29,293][15401] Updated weights for policy 0, policy_version 135430 (0.0035) [2024-06-22 04:37:32,823][15401] Updated weights for policy 0, policy_version 135440 (0.0033) [2024-06-22 04:37:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 2219065344. Throughput: 0: 42638.4. Samples: 2219211380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 04:37:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 04:37:37,062][15401] Updated weights for policy 0, policy_version 135450 (0.0032) [2024-06-22 04:37:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 2219261952. Throughput: 0: 42937.0. Samples: 2219346040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:37:38,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 04:37:40,259][15401] Updated weights for policy 0, policy_version 135460 (0.0030) [2024-06-22 04:37:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2219474944. Throughput: 0: 42901.7. Samples: 2219605680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:37:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 04:37:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135466_2219474944.pth... [2024-06-22 04:37:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000134840_2209218560.pth [2024-06-22 04:37:44,653][15401] Updated weights for policy 0, policy_version 135470 (0.0023) [2024-06-22 04:37:48,100][15401] Updated weights for policy 0, policy_version 135480 (0.0021) [2024-06-22 04:37:48,390][15132] Fps is (10 sec: 45870.9, 60 sec: 43416.9, 300 sec: 42709.4). Total num frames: 2219720704. Throughput: 0: 42964.0. Samples: 2219863040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:37:48,391][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 04:37:52,216][15401] Updated weights for policy 0, policy_version 135490 (0.0038) [2024-06-22 04:37:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2219917312. Throughput: 0: 43172.0. Samples: 2219998040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:37:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 04:37:55,620][15401] Updated weights for policy 0, policy_version 135500 (0.0041) [2024-06-22 04:37:58,390][15132] Fps is (10 sec: 39325.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2220113920. Throughput: 0: 42936.7. Samples: 2220249820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:37:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 04:37:59,829][15401] Updated weights for policy 0, policy_version 135510 (0.0038) [2024-06-22 04:38:03,167][15401] Updated weights for policy 0, policy_version 135520 (0.0046) [2024-06-22 04:38:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 2220359680. Throughput: 0: 43089.6. Samples: 2220506340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 04:38:07,419][15401] Updated weights for policy 0, policy_version 135530 (0.0030) [2024-06-22 04:38:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2220539904. Throughput: 0: 43180.9. Samples: 2220642640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 04:38:10,831][15401] Updated weights for policy 0, policy_version 135540 (0.0033) [2024-06-22 04:38:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.3, 300 sec: 42709.5). Total num frames: 2220752896. Throughput: 0: 42841.6. Samples: 2220886600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 04:38:15,007][15401] Updated weights for policy 0, policy_version 135550 (0.0032) [2024-06-22 04:38:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2220998656. Throughput: 0: 42978.9. Samples: 2221145440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 04:38:18,399][15401] Updated weights for policy 0, policy_version 135560 (0.0038) [2024-06-22 04:38:22,835][15401] Updated weights for policy 0, policy_version 135570 (0.0022) [2024-06-22 04:38:23,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2221178880. Throughput: 0: 42883.7. Samples: 2221275800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 04:38:26,367][15401] Updated weights for policy 0, policy_version 135580 (0.0034) [2024-06-22 04:38:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2221408256. Throughput: 0: 42805.4. Samples: 2221531920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 04:38:30,863][15401] Updated weights for policy 0, policy_version 135590 (0.0042) [2024-06-22 04:38:33,100][15349] Signal inference workers to stop experience collection... (32750 times) [2024-06-22 04:38:33,120][15401] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-06-22 04:38:33,210][15349] Signal inference workers to resume experience collection... (32750 times) [2024-06-22 04:38:33,210][15401] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-06-22 04:38:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2221637632. Throughput: 0: 42883.2. Samples: 2221792740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 04:38:33,951][15401] Updated weights for policy 0, policy_version 135600 (0.0031) [2024-06-22 04:38:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2221817856. Throughput: 0: 42734.4. Samples: 2221921080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 04:38:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 04:38:38,405][15401] Updated weights for policy 0, policy_version 135610 (0.0044) [2024-06-22 04:38:41,902][15401] Updated weights for policy 0, policy_version 135620 (0.0032) [2024-06-22 04:38:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2222047232. Throughput: 0: 42848.1. Samples: 2222177980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:38:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 04:38:45,880][15401] Updated weights for policy 0, policy_version 135630 (0.0053) [2024-06-22 04:38:48,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42599.0, 300 sec: 42820.5). Total num frames: 2222276608. Throughput: 0: 42811.5. Samples: 2222432860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:38:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 04:38:49,382][15401] Updated weights for policy 0, policy_version 135640 (0.0032) [2024-06-22 04:38:53,330][15401] Updated weights for policy 0, policy_version 135650 (0.0037) [2024-06-22 04:38:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2222489600. Throughput: 0: 42754.2. Samples: 2222566580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:38:53,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 04:38:57,133][15401] Updated weights for policy 0, policy_version 135660 (0.0042) [2024-06-22 04:38:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2222686208. Throughput: 0: 42951.3. Samples: 2222819400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:38:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 04:39:00,882][15401] Updated weights for policy 0, policy_version 135670 (0.0029) [2024-06-22 04:39:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2222915584. Throughput: 0: 42850.3. Samples: 2223073700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 04:39:04,727][15401] Updated weights for policy 0, policy_version 135680 (0.0049) [2024-06-22 04:39:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2223112192. Throughput: 0: 42874.3. Samples: 2223205140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 04:39:08,971][15401] Updated weights for policy 0, policy_version 135690 (0.0026) [2024-06-22 04:39:12,119][15401] Updated weights for policy 0, policy_version 135700 (0.0035) [2024-06-22 04:39:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.7, 300 sec: 42766.0). Total num frames: 2223325184. Throughput: 0: 42817.0. Samples: 2223458680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 04:39:16,416][15401] Updated weights for policy 0, policy_version 135710 (0.0031) [2024-06-22 04:39:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 2223554560. Throughput: 0: 42804.0. Samples: 2223718920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 04:39:20,045][15401] Updated weights for policy 0, policy_version 135720 (0.0033) [2024-06-22 04:39:23,396][15132] Fps is (10 sec: 44208.1, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 2223767552. Throughput: 0: 42817.4. Samples: 2223848140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:23,396][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 04:39:23,933][15401] Updated weights for policy 0, policy_version 135730 (0.0043) [2024-06-22 04:39:27,706][15401] Updated weights for policy 0, policy_version 135740 (0.0038) [2024-06-22 04:39:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2223980544. Throughput: 0: 42715.5. Samples: 2224100180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 04:39:31,568][15401] Updated weights for policy 0, policy_version 135750 (0.0032) [2024-06-22 04:39:33,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2224177152. Throughput: 0: 42823.7. Samples: 2224359920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 04:39:35,309][15401] Updated weights for policy 0, policy_version 135760 (0.0030) [2024-06-22 04:39:37,347][15349] Signal inference workers to stop experience collection... (32800 times) [2024-06-22 04:39:37,353][15349] Signal inference workers to resume experience collection... (32800 times) [2024-06-22 04:39:37,368][15401] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-06-22 04:39:37,368][15401] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-06-22 04:39:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2224422912. Throughput: 0: 42563.6. Samples: 2224481940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 04:39:38,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 04:39:39,088][15401] Updated weights for policy 0, policy_version 135770 (0.0031) [2024-06-22 04:39:42,995][15401] Updated weights for policy 0, policy_version 135780 (0.0026) [2024-06-22 04:39:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2224619520. Throughput: 0: 42773.8. Samples: 2224744220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:39:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 04:39:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135781_2224635904.pth... [2024-06-22 04:39:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135154_2214363136.pth [2024-06-22 04:39:46,750][15401] Updated weights for policy 0, policy_version 135790 (0.0037) [2024-06-22 04:39:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2224816128. Throughput: 0: 42817.3. Samples: 2225000480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:39:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:39:50,987][15401] Updated weights for policy 0, policy_version 135800 (0.0036) [2024-06-22 04:39:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2225045504. Throughput: 0: 42660.8. Samples: 2225124880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:39:53,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 04:39:54,361][15401] Updated weights for policy 0, policy_version 135810 (0.0026) [2024-06-22 04:39:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2225258496. Throughput: 0: 42694.7. Samples: 2225379940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:39:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 04:39:58,431][15401] Updated weights for policy 0, policy_version 135820 (0.0027) [2024-06-22 04:40:01,990][15401] Updated weights for policy 0, policy_version 135830 (0.0034) [2024-06-22 04:40:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2225455104. Throughput: 0: 42608.9. Samples: 2225636320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 04:40:06,476][15401] Updated weights for policy 0, policy_version 135840 (0.0037) [2024-06-22 04:40:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2225684480. Throughput: 0: 42640.3. Samples: 2225766680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:08,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 04:40:09,544][15401] Updated weights for policy 0, policy_version 135850 (0.0034) [2024-06-22 04:40:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2225897472. Throughput: 0: 42731.1. Samples: 2226023080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 04:40:14,187][15401] Updated weights for policy 0, policy_version 135860 (0.0035) [2024-06-22 04:40:16,988][15401] Updated weights for policy 0, policy_version 135870 (0.0038) [2024-06-22 04:40:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 2226110464. Throughput: 0: 42810.1. Samples: 2226286480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:18,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 04:40:21,691][15401] Updated weights for policy 0, policy_version 135880 (0.0031) [2024-06-22 04:40:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 2226339840. Throughput: 0: 42932.9. Samples: 2226413920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 04:40:25,537][15401] Updated weights for policy 0, policy_version 135890 (0.0032) [2024-06-22 04:40:28,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 2226552832. Throughput: 0: 42919.4. Samples: 2226675700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:28,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 04:40:29,159][15401] Updated weights for policy 0, policy_version 135900 (0.0022) [2024-06-22 04:40:32,899][15401] Updated weights for policy 0, policy_version 135910 (0.0034) [2024-06-22 04:40:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2226765824. Throughput: 0: 42927.1. Samples: 2226932200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:33,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 04:40:36,675][15401] Updated weights for policy 0, policy_version 135920 (0.0040) [2024-06-22 04:40:38,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2226995200. Throughput: 0: 43057.3. Samples: 2227062460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 04:40:40,418][15401] Updated weights for policy 0, policy_version 135930 (0.0051) [2024-06-22 04:40:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2227191808. Throughput: 0: 43111.1. Samples: 2227319940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 04:40:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 04:40:44,173][15401] Updated weights for policy 0, policy_version 135940 (0.0043) [2024-06-22 04:40:45,846][15349] Signal inference workers to stop experience collection... (32850 times) [2024-06-22 04:40:45,847][15349] Signal inference workers to resume experience collection... (32850 times) [2024-06-22 04:40:45,862][15401] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-06-22 04:40:45,894][15401] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-06-22 04:40:48,047][15401] Updated weights for policy 0, policy_version 135950 (0.0026) [2024-06-22 04:40:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2227404800. Throughput: 0: 42949.8. Samples: 2227569060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:40:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 04:40:51,788][15401] Updated weights for policy 0, policy_version 135960 (0.0036) [2024-06-22 04:40:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2227601408. Throughput: 0: 42911.2. Samples: 2227697680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:40:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 04:40:55,566][15401] Updated weights for policy 0, policy_version 135970 (0.0034) [2024-06-22 04:40:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2227830784. Throughput: 0: 42973.8. Samples: 2227956900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:40:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 04:40:59,395][15401] Updated weights for policy 0, policy_version 135980 (0.0026) [2024-06-22 04:41:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 2228043776. Throughput: 0: 42752.4. Samples: 2228210340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:03,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 04:41:03,818][15401] Updated weights for policy 0, policy_version 135990 (0.0033) [2024-06-22 04:41:06,911][15401] Updated weights for policy 0, policy_version 136000 (0.0029) [2024-06-22 04:41:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2228240384. Throughput: 0: 42777.4. Samples: 2228338900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 04:41:11,393][15401] Updated weights for policy 0, policy_version 136010 (0.0035) [2024-06-22 04:41:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2228469760. Throughput: 0: 42730.8. Samples: 2228598480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 04:41:15,056][15401] Updated weights for policy 0, policy_version 136020 (0.0035) [2024-06-22 04:41:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2228682752. Throughput: 0: 42626.7. Samples: 2228850400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 04:41:18,978][15401] Updated weights for policy 0, policy_version 136030 (0.0037) [2024-06-22 04:41:22,488][15401] Updated weights for policy 0, policy_version 136040 (0.0025) [2024-06-22 04:41:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2228895744. Throughput: 0: 42712.0. Samples: 2228984500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 04:41:26,679][15401] Updated weights for policy 0, policy_version 136050 (0.0030) [2024-06-22 04:41:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 2229092352. Throughput: 0: 42647.9. Samples: 2229239100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:28,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 04:41:30,520][15401] Updated weights for policy 0, policy_version 136060 (0.0036) [2024-06-22 04:41:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2229338112. Throughput: 0: 42561.2. Samples: 2229484320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 04:41:34,367][15401] Updated weights for policy 0, policy_version 136070 (0.0042) [2024-06-22 04:41:38,140][15401] Updated weights for policy 0, policy_version 136080 (0.0042) [2024-06-22 04:41:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2229534720. Throughput: 0: 42836.8. Samples: 2229625340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:38,400][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 04:41:42,032][15401] Updated weights for policy 0, policy_version 136090 (0.0051) [2024-06-22 04:41:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2229747712. Throughput: 0: 42607.4. Samples: 2229874240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 04:41:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 04:41:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136093_2229747712.pth... [2024-06-22 04:41:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135466_2219474944.pth [2024-06-22 04:41:45,679][15401] Updated weights for policy 0, policy_version 136100 (0.0026) [2024-06-22 04:41:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2229977088. Throughput: 0: 42701.8. Samples: 2230131820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:41:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 04:41:49,474][15401] Updated weights for policy 0, policy_version 136110 (0.0038) [2024-06-22 04:41:50,510][15349] Signal inference workers to stop experience collection... (32900 times) [2024-06-22 04:41:50,510][15349] Signal inference workers to resume experience collection... (32900 times) [2024-06-22 04:41:50,536][15401] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-06-22 04:41:50,537][15401] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-06-22 04:41:53,264][15401] Updated weights for policy 0, policy_version 136120 (0.0047) [2024-06-22 04:41:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 2230190080. Throughput: 0: 42811.9. Samples: 2230265540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:41:53,401][15132] Avg episode reward: [(0, '0.221')] [2024-06-22 04:41:57,386][15401] Updated weights for policy 0, policy_version 136130 (0.0028) [2024-06-22 04:41:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2230370304. Throughput: 0: 42701.3. Samples: 2230520040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:41:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 04:42:00,876][15401] Updated weights for policy 0, policy_version 136140 (0.0033) [2024-06-22 04:42:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 2230632448. Throughput: 0: 42676.0. Samples: 2230770820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 04:42:04,793][15401] Updated weights for policy 0, policy_version 136150 (0.0030) [2024-06-22 04:42:08,291][15401] Updated weights for policy 0, policy_version 136160 (0.0027) [2024-06-22 04:42:08,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43415.8, 300 sec: 42986.8). Total num frames: 2230845440. Throughput: 0: 42892.8. Samples: 2230914780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:08,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 04:42:12,365][15401] Updated weights for policy 0, policy_version 136170 (0.0038) [2024-06-22 04:42:13,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2231009280. Throughput: 0: 42792.6. Samples: 2231164760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 04:42:15,893][15401] Updated weights for policy 0, policy_version 136180 (0.0035) [2024-06-22 04:42:18,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2231271424. Throughput: 0: 43026.0. Samples: 2231420480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 04:42:20,160][15401] Updated weights for policy 0, policy_version 136190 (0.0026) [2024-06-22 04:42:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2231468032. Throughput: 0: 42914.2. Samples: 2231556480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 04:42:23,595][15401] Updated weights for policy 0, policy_version 136200 (0.0032) [2024-06-22 04:42:27,722][15401] Updated weights for policy 0, policy_version 136210 (0.0034) [2024-06-22 04:42:28,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2231664640. Throughput: 0: 43016.4. Samples: 2231809980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:28,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 04:42:31,488][15401] Updated weights for policy 0, policy_version 136220 (0.0042) [2024-06-22 04:42:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2231926784. Throughput: 0: 42839.1. Samples: 2232059580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 04:42:35,320][15401] Updated weights for policy 0, policy_version 136230 (0.0033) [2024-06-22 04:42:38,394][15132] Fps is (10 sec: 42580.6, 60 sec: 42595.4, 300 sec: 42764.4). Total num frames: 2232090624. Throughput: 0: 42924.0. Samples: 2232197200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:38,394][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 04:42:39,311][15401] Updated weights for policy 0, policy_version 136240 (0.0025) [2024-06-22 04:42:42,955][15401] Updated weights for policy 0, policy_version 136250 (0.0040) [2024-06-22 04:42:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.6). Total num frames: 2232320000. Throughput: 0: 42902.2. Samples: 2232450640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 04:42:47,085][15401] Updated weights for policy 0, policy_version 136260 (0.0033) [2024-06-22 04:42:48,389][15132] Fps is (10 sec: 47534.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2232565760. Throughput: 0: 43032.0. Samples: 2232707260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 04:42:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 04:42:50,543][15401] Updated weights for policy 0, policy_version 136270 (0.0032) [2024-06-22 04:42:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 2232745984. Throughput: 0: 42758.2. Samples: 2232838800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:42:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 04:42:54,410][15401] Updated weights for policy 0, policy_version 136280 (0.0029) [2024-06-22 04:42:58,148][15401] Updated weights for policy 0, policy_version 136290 (0.0051) [2024-06-22 04:42:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2232975360. Throughput: 0: 42823.9. Samples: 2233091840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:42:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 04:43:02,087][15401] Updated weights for policy 0, policy_version 136300 (0.0040) [2024-06-22 04:43:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2233204736. Throughput: 0: 42937.6. Samples: 2233352680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 04:43:06,135][15401] Updated weights for policy 0, policy_version 136310 (0.0030) [2024-06-22 04:43:07,027][15349] Signal inference workers to stop experience collection... (32950 times) [2024-06-22 04:43:07,072][15401] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-06-22 04:43:07,076][15349] Signal inference workers to resume experience collection... (32950 times) [2024-06-22 04:43:07,090][15401] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-06-22 04:43:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 2233384960. Throughput: 0: 42859.2. Samples: 2233485140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:08,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 04:43:09,850][15401] Updated weights for policy 0, policy_version 136320 (0.0036) [2024-06-22 04:43:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2233614336. Throughput: 0: 42720.9. Samples: 2233732420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 04:43:13,671][15401] Updated weights for policy 0, policy_version 136330 (0.0036) [2024-06-22 04:43:17,469][15401] Updated weights for policy 0, policy_version 136340 (0.0031) [2024-06-22 04:43:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2233843712. Throughput: 0: 42879.6. Samples: 2233989160. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:18,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 04:43:21,197][15401] Updated weights for policy 0, policy_version 136350 (0.0031) [2024-06-22 04:43:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2234023936. Throughput: 0: 42741.4. Samples: 2234120380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 04:43:25,096][15401] Updated weights for policy 0, policy_version 136360 (0.0028) [2024-06-22 04:43:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2234253312. Throughput: 0: 42776.5. Samples: 2234375580. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 04:43:28,749][15401] Updated weights for policy 0, policy_version 136370 (0.0028) [2024-06-22 04:43:32,842][15401] Updated weights for policy 0, policy_version 136380 (0.0027) [2024-06-22 04:43:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2234466304. Throughput: 0: 42802.7. Samples: 2234633380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 04:43:36,198][15401] Updated weights for policy 0, policy_version 136390 (0.0037) [2024-06-22 04:43:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43147.6, 300 sec: 42820.5). Total num frames: 2234679296. Throughput: 0: 42663.7. Samples: 2234758660. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 04:43:40,445][15401] Updated weights for policy 0, policy_version 136400 (0.0029) [2024-06-22 04:43:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2234908672. Throughput: 0: 42776.9. Samples: 2235016800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 04:43:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136409_2234925056.pth... [2024-06-22 04:43:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000135781_2224635904.pth [2024-06-22 04:43:43,734][15401] Updated weights for policy 0, policy_version 136410 (0.0026) [2024-06-22 04:43:47,981][15401] Updated weights for policy 0, policy_version 136420 (0.0034) [2024-06-22 04:43:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2235105280. Throughput: 0: 42743.7. Samples: 2235276140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-22 04:43:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 04:43:51,147][15401] Updated weights for policy 0, policy_version 136430 (0.0034) [2024-06-22 04:43:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 2235285504. Throughput: 0: 42595.6. Samples: 2235401940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:43:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 04:43:55,782][15401] Updated weights for policy 0, policy_version 136440 (0.0033) [2024-06-22 04:43:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2235564032. Throughput: 0: 42765.8. Samples: 2235656880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:43:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 04:43:59,235][15401] Updated weights for policy 0, policy_version 136450 (0.0033) [2024-06-22 04:44:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 2235744256. Throughput: 0: 42965.1. Samples: 2235922580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 04:44:03,456][15401] Updated weights for policy 0, policy_version 136460 (0.0044) [2024-06-22 04:44:06,854][15401] Updated weights for policy 0, policy_version 136470 (0.0028) [2024-06-22 04:44:08,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2235940864. Throughput: 0: 42655.2. Samples: 2236039860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:08,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 04:44:11,064][15401] Updated weights for policy 0, policy_version 136480 (0.0035) [2024-06-22 04:44:13,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2236203008. Throughput: 0: 42752.8. Samples: 2236299460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 04:44:14,482][15401] Updated weights for policy 0, policy_version 136490 (0.0032) [2024-06-22 04:44:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.9). Total num frames: 2236383232. Throughput: 0: 42899.9. Samples: 2236563880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 04:44:18,781][15401] Updated weights for policy 0, policy_version 136500 (0.0041) [2024-06-22 04:44:22,002][15401] Updated weights for policy 0, policy_version 136510 (0.0038) [2024-06-22 04:44:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2236596224. Throughput: 0: 42753.2. Samples: 2236682560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 04:44:26,488][15401] Updated weights for policy 0, policy_version 136520 (0.0049) [2024-06-22 04:44:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2236841984. Throughput: 0: 42747.5. Samples: 2236940440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:28,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 04:44:30,337][15401] Updated weights for policy 0, policy_version 136530 (0.0042) [2024-06-22 04:44:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2237005824. Throughput: 0: 42691.9. Samples: 2237197280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 04:44:34,300][15401] Updated weights for policy 0, policy_version 136540 (0.0030) [2024-06-22 04:44:35,319][15349] Signal inference workers to stop experience collection... (33000 times) [2024-06-22 04:44:35,319][15349] Signal inference workers to resume experience collection... (33000 times) [2024-06-22 04:44:35,368][15401] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-06-22 04:44:35,368][15401] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-06-22 04:44:37,881][15401] Updated weights for policy 0, policy_version 136550 (0.0037) [2024-06-22 04:44:38,393][15132] Fps is (10 sec: 39308.6, 60 sec: 42596.0, 300 sec: 42764.5). Total num frames: 2237235200. Throughput: 0: 42616.3. Samples: 2237319820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:38,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 04:44:42,109][15401] Updated weights for policy 0, policy_version 136560 (0.0031) [2024-06-22 04:44:43,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2237480960. Throughput: 0: 42575.2. Samples: 2237572760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:43,396][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 04:44:45,537][15401] Updated weights for policy 0, policy_version 136570 (0.0038) [2024-06-22 04:44:48,389][15132] Fps is (10 sec: 39334.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2237628416. Throughput: 0: 42551.0. Samples: 2237837380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 04:44:49,668][15401] Updated weights for policy 0, policy_version 136580 (0.0037) [2024-06-22 04:44:53,156][15401] Updated weights for policy 0, policy_version 136590 (0.0035) [2024-06-22 04:44:53,396][15132] Fps is (10 sec: 40933.9, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 2237890560. Throughput: 0: 42439.3. Samples: 2237949900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 04:44:53,396][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 04:44:57,195][15401] Updated weights for policy 0, policy_version 136600 (0.0042) [2024-06-22 04:44:58,390][15132] Fps is (10 sec: 50789.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2238136320. Throughput: 0: 42674.6. Samples: 2238219820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:44:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 04:45:00,634][15401] Updated weights for policy 0, policy_version 136610 (0.0019) [2024-06-22 04:45:03,392][15132] Fps is (10 sec: 37698.0, 60 sec: 42050.5, 300 sec: 42653.6). Total num frames: 2238267392. Throughput: 0: 42790.6. Samples: 2238489560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:03,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 04:45:04,640][15401] Updated weights for policy 0, policy_version 136620 (0.0039) [2024-06-22 04:45:08,155][15401] Updated weights for policy 0, policy_version 136630 (0.0039) [2024-06-22 04:45:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2238545920. Throughput: 0: 42758.7. Samples: 2238606700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:45:12,161][15401] Updated weights for policy 0, policy_version 136640 (0.0029) [2024-06-22 04:45:13,389][15132] Fps is (10 sec: 49164.0, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 2238758912. Throughput: 0: 42888.9. Samples: 2238870440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 04:45:15,786][15401] Updated weights for policy 0, policy_version 136650 (0.0040) [2024-06-22 04:45:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2238939136. Throughput: 0: 43121.8. Samples: 2239137760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 04:45:19,875][15401] Updated weights for policy 0, policy_version 136660 (0.0037) [2024-06-22 04:45:23,340][15401] Updated weights for policy 0, policy_version 136670 (0.0032) [2024-06-22 04:45:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 2239201280. Throughput: 0: 42958.2. Samples: 2239252800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 04:45:27,531][15401] Updated weights for policy 0, policy_version 136680 (0.0031) [2024-06-22 04:45:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2239414272. Throughput: 0: 43302.6. Samples: 2239521380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 04:45:30,925][15401] Updated weights for policy 0, policy_version 136690 (0.0036) [2024-06-22 04:45:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2239578112. Throughput: 0: 43128.0. Samples: 2239778140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 04:45:35,059][15401] Updated weights for policy 0, policy_version 136700 (0.0031) [2024-06-22 04:45:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43147.0, 300 sec: 42820.6). Total num frames: 2239823872. Throughput: 0: 43245.8. Samples: 2239895680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 04:45:38,831][15401] Updated weights for policy 0, policy_version 136710 (0.0038) [2024-06-22 04:45:42,586][15401] Updated weights for policy 0, policy_version 136720 (0.0029) [2024-06-22 04:45:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2240053248. Throughput: 0: 43136.0. Samples: 2240160940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 04:45:43,396][15349] Signal inference workers to stop experience collection... (33050 times) [2024-06-22 04:45:43,443][15401] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-06-22 04:45:43,514][15349] Signal inference workers to resume experience collection... (33050 times) [2024-06-22 04:45:43,515][15401] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-06-22 04:45:43,647][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136724_2240086016.pth... [2024-06-22 04:45:43,696][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136093_2229747712.pth [2024-06-22 04:45:46,472][15401] Updated weights for policy 0, policy_version 136730 (0.0029) [2024-06-22 04:45:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2240233472. Throughput: 0: 42947.2. Samples: 2240422080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 04:45:50,284][15401] Updated weights for policy 0, policy_version 136740 (0.0028) [2024-06-22 04:45:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 2240462848. Throughput: 0: 43089.3. Samples: 2240545720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 04:45:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 04:45:53,954][15401] Updated weights for policy 0, policy_version 136750 (0.0036) [2024-06-22 04:45:58,117][15401] Updated weights for policy 0, policy_version 136760 (0.0028) [2024-06-22 04:45:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2240692224. Throughput: 0: 43135.9. Samples: 2240811560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:45:58,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 04:46:01,771][15401] Updated weights for policy 0, policy_version 136770 (0.0038) [2024-06-22 04:46:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43692.5, 300 sec: 42876.1). Total num frames: 2240888832. Throughput: 0: 42944.5. Samples: 2241070260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 04:46:05,680][15401] Updated weights for policy 0, policy_version 136780 (0.0047) [2024-06-22 04:46:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2241118208. Throughput: 0: 43106.6. Samples: 2241192600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 04:46:09,626][15401] Updated weights for policy 0, policy_version 136790 (0.0025) [2024-06-22 04:46:13,146][15401] Updated weights for policy 0, policy_version 136800 (0.0034) [2024-06-22 04:46:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2241331200. Throughput: 0: 43033.5. Samples: 2241457880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 04:46:17,136][15401] Updated weights for policy 0, policy_version 136810 (0.0037) [2024-06-22 04:46:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2241544192. Throughput: 0: 42931.1. Samples: 2241710040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 04:46:21,046][15401] Updated weights for policy 0, policy_version 136820 (0.0030) [2024-06-22 04:46:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2241773568. Throughput: 0: 43187.4. Samples: 2241839120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 04:46:24,658][15401] Updated weights for policy 0, policy_version 136830 (0.0043) [2024-06-22 04:46:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2241953792. Throughput: 0: 43029.4. Samples: 2242097260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:28,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 04:46:28,607][15401] Updated weights for policy 0, policy_version 136840 (0.0042) [2024-06-22 04:46:32,210][15401] Updated weights for policy 0, policy_version 136850 (0.0039) [2024-06-22 04:46:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2242166784. Throughput: 0: 42892.8. Samples: 2242352260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 04:46:36,578][15401] Updated weights for policy 0, policy_version 136860 (0.0038) [2024-06-22 04:46:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 2242412544. Throughput: 0: 42952.4. Samples: 2242478680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:38,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 04:46:39,838][15401] Updated weights for policy 0, policy_version 136870 (0.0033) [2024-06-22 04:46:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2242592768. Throughput: 0: 42615.2. Samples: 2242729240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:43,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 04:46:44,314][15401] Updated weights for policy 0, policy_version 136880 (0.0033) [2024-06-22 04:46:47,411][15401] Updated weights for policy 0, policy_version 136890 (0.0037) [2024-06-22 04:46:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2242822144. Throughput: 0: 42600.8. Samples: 2242987300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:48,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 04:46:51,895][15401] Updated weights for policy 0, policy_version 136900 (0.0040) [2024-06-22 04:46:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2243051520. Throughput: 0: 42756.1. Samples: 2243116620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 04:46:55,322][15401] Updated weights for policy 0, policy_version 136910 (0.0037) [2024-06-22 04:46:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2243231744. Throughput: 0: 42364.4. Samples: 2243364280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 04:46:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 04:46:59,781][15401] Updated weights for policy 0, policy_version 136920 (0.0037) [2024-06-22 04:47:00,271][15349] Signal inference workers to stop experience collection... (33100 times) [2024-06-22 04:47:00,297][15401] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-06-22 04:47:00,337][15349] Signal inference workers to resume experience collection... (33100 times) [2024-06-22 04:47:00,337][15401] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-06-22 04:47:03,127][15401] Updated weights for policy 0, policy_version 136930 (0.0040) [2024-06-22 04:47:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 2243461120. Throughput: 0: 42380.4. Samples: 2243617160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 04:47:07,342][15401] Updated weights for policy 0, policy_version 136940 (0.0035) [2024-06-22 04:47:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2243674112. Throughput: 0: 42451.5. Samples: 2243749440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 04:47:11,094][15401] Updated weights for policy 0, policy_version 136950 (0.0043) [2024-06-22 04:47:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2243870720. Throughput: 0: 42191.1. Samples: 2243995860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 04:47:15,308][15401] Updated weights for policy 0, policy_version 136960 (0.0034) [2024-06-22 04:47:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2244083712. Throughput: 0: 42134.6. Samples: 2244248320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:18,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 04:47:18,921][15401] Updated weights for policy 0, policy_version 136970 (0.0041) [2024-06-22 04:47:22,775][15401] Updated weights for policy 0, policy_version 136980 (0.0042) [2024-06-22 04:47:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2244296704. Throughput: 0: 42290.8. Samples: 2244381660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 04:47:26,393][15401] Updated weights for policy 0, policy_version 136990 (0.0031) [2024-06-22 04:47:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2244526080. Throughput: 0: 42448.0. Samples: 2244639400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 04:47:30,447][15401] Updated weights for policy 0, policy_version 137000 (0.0038) [2024-06-22 04:47:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42821.2). Total num frames: 2244722688. Throughput: 0: 42535.7. Samples: 2244901400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 04:47:34,106][15401] Updated weights for policy 0, policy_version 137010 (0.0038) [2024-06-22 04:47:37,948][15401] Updated weights for policy 0, policy_version 137020 (0.0033) [2024-06-22 04:47:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 2244935680. Throughput: 0: 42336.8. Samples: 2245021780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 04:47:41,931][15401] Updated weights for policy 0, policy_version 137030 (0.0023) [2024-06-22 04:47:43,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 2245148672. Throughput: 0: 42519.9. Samples: 2245277780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:43,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 04:47:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137033_2245148672.pth... [2024-06-22 04:47:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136409_2234925056.pth [2024-06-22 04:47:45,741][15401] Updated weights for policy 0, policy_version 137040 (0.0042) [2024-06-22 04:47:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2245361664. Throughput: 0: 42583.5. Samples: 2245533420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:48,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 04:47:49,647][15401] Updated weights for policy 0, policy_version 137050 (0.0022) [2024-06-22 04:47:53,359][15401] Updated weights for policy 0, policy_version 137060 (0.0029) [2024-06-22 04:47:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2245591040. Throughput: 0: 42454.7. Samples: 2245659900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 04:47:57,101][15401] Updated weights for policy 0, policy_version 137070 (0.0034) [2024-06-22 04:47:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2245804032. Throughput: 0: 42578.6. Samples: 2245911900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:47:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 04:48:00,965][15401] Updated weights for policy 0, policy_version 137080 (0.0040) [2024-06-22 04:48:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2246000640. Throughput: 0: 42660.9. Samples: 2246168060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 04:48:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 04:48:04,723][15401] Updated weights for policy 0, policy_version 137090 (0.0050) [2024-06-22 04:48:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2246213632. Throughput: 0: 42642.6. Samples: 2246300580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:08,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 04:48:08,742][15401] Updated weights for policy 0, policy_version 137100 (0.0031) [2024-06-22 04:48:12,373][15401] Updated weights for policy 0, policy_version 137110 (0.0040) [2024-06-22 04:48:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2246426624. Throughput: 0: 42669.9. Samples: 2246559540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 04:48:16,468][15401] Updated weights for policy 0, policy_version 137120 (0.0039) [2024-06-22 04:48:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2246656000. Throughput: 0: 42548.8. Samples: 2246816100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 04:48:19,859][15349] Signal inference workers to stop experience collection... (33150 times) [2024-06-22 04:48:19,907][15401] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-06-22 04:48:19,916][15349] Signal inference workers to resume experience collection... (33150 times) [2024-06-22 04:48:19,924][15401] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-06-22 04:48:19,926][15401] Updated weights for policy 0, policy_version 137130 (0.0041) [2024-06-22 04:48:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2246836224. Throughput: 0: 42659.2. Samples: 2246941440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 04:48:24,148][15401] Updated weights for policy 0, policy_version 137140 (0.0036) [2024-06-22 04:48:27,536][15401] Updated weights for policy 0, policy_version 137150 (0.0025) [2024-06-22 04:48:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2247081984. Throughput: 0: 42717.5. Samples: 2247199960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 04:48:31,568][15401] Updated weights for policy 0, policy_version 137160 (0.0031) [2024-06-22 04:48:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2247278592. Throughput: 0: 42829.1. Samples: 2247460720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 04:48:35,094][15401] Updated weights for policy 0, policy_version 137170 (0.0030) [2024-06-22 04:48:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2247491584. Throughput: 0: 42772.0. Samples: 2247584640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 04:48:39,214][15401] Updated weights for policy 0, policy_version 137180 (0.0032) [2024-06-22 04:48:42,698][15401] Updated weights for policy 0, policy_version 137190 (0.0037) [2024-06-22 04:48:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 2247737344. Throughput: 0: 42996.6. Samples: 2247846740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 04:48:46,829][15401] Updated weights for policy 0, policy_version 137200 (0.0037) [2024-06-22 04:48:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2247933952. Throughput: 0: 43047.2. Samples: 2248105180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 04:48:50,394][15401] Updated weights for policy 0, policy_version 137210 (0.0037) [2024-06-22 04:48:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2248130560. Throughput: 0: 42842.6. Samples: 2248228500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 04:48:54,393][15401] Updated weights for policy 0, policy_version 137220 (0.0038) [2024-06-22 04:48:58,376][15401] Updated weights for policy 0, policy_version 137230 (0.0033) [2024-06-22 04:48:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2248376320. Throughput: 0: 42951.5. Samples: 2248492360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:48:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 04:49:01,984][15401] Updated weights for policy 0, policy_version 137240 (0.0035) [2024-06-22 04:49:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2248589312. Throughput: 0: 42896.8. Samples: 2248746460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 04:49:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 04:49:05,939][15401] Updated weights for policy 0, policy_version 137250 (0.0032) [2024-06-22 04:49:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2248785920. Throughput: 0: 42936.4. Samples: 2248873580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 04:49:09,649][15401] Updated weights for policy 0, policy_version 137260 (0.0040) [2024-06-22 04:49:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2248998912. Throughput: 0: 42997.7. Samples: 2249134860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 04:49:13,592][15401] Updated weights for policy 0, policy_version 137270 (0.0054) [2024-06-22 04:49:17,571][15401] Updated weights for policy 0, policy_version 137280 (0.0041) [2024-06-22 04:49:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2249228288. Throughput: 0: 42873.1. Samples: 2249390020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 04:49:21,117][15401] Updated weights for policy 0, policy_version 137290 (0.0044) [2024-06-22 04:49:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2249424896. Throughput: 0: 43015.1. Samples: 2249520320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 04:49:25,098][15401] Updated weights for policy 0, policy_version 137300 (0.0037) [2024-06-22 04:49:28,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2249654272. Throughput: 0: 42919.5. Samples: 2249778120. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 04:49:28,686][15401] Updated weights for policy 0, policy_version 137310 (0.0021) [2024-06-22 04:49:32,670][15401] Updated weights for policy 0, policy_version 137320 (0.0032) [2024-06-22 04:49:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.7, 300 sec: 42820.7). Total num frames: 2249867264. Throughput: 0: 42867.9. Samples: 2250034340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:33,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 04:49:36,334][15401] Updated weights for policy 0, policy_version 137330 (0.0033) [2024-06-22 04:49:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2250080256. Throughput: 0: 42997.4. Samples: 2250163380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 04:49:40,279][15401] Updated weights for policy 0, policy_version 137340 (0.0033) [2024-06-22 04:49:43,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2250309632. Throughput: 0: 42864.4. Samples: 2250421260. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:43,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 04:49:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137348_2250309632.pth... [2024-06-22 04:49:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000136724_2240086016.pth [2024-06-22 04:49:44,164][15401] Updated weights for policy 0, policy_version 137350 (0.0029) [2024-06-22 04:49:47,963][15401] Updated weights for policy 0, policy_version 137360 (0.0029) [2024-06-22 04:49:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 2250506240. Throughput: 0: 42859.5. Samples: 2250675140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 04:49:52,018][15401] Updated weights for policy 0, policy_version 137370 (0.0029) [2024-06-22 04:49:53,394][15132] Fps is (10 sec: 40940.4, 60 sec: 43141.1, 300 sec: 42653.3). Total num frames: 2250719232. Throughput: 0: 42951.4. Samples: 2250806600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:53,395][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 04:49:54,795][15349] Signal inference workers to stop experience collection... (33200 times) [2024-06-22 04:49:54,844][15401] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-06-22 04:49:54,911][15349] Signal inference workers to resume experience collection... (33200 times) [2024-06-22 04:49:54,911][15401] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-06-22 04:49:55,617][15401] Updated weights for policy 0, policy_version 137380 (0.0030) [2024-06-22 04:49:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 2250915840. Throughput: 0: 42654.2. Samples: 2251054300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:49:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 04:49:59,626][15401] Updated weights for policy 0, policy_version 137390 (0.0032) [2024-06-22 04:50:03,130][15401] Updated weights for policy 0, policy_version 137400 (0.0038) [2024-06-22 04:50:03,390][15132] Fps is (10 sec: 44256.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2251161600. Throughput: 0: 42598.6. Samples: 2251306960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:50:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 04:50:07,219][15401] Updated weights for policy 0, policy_version 137410 (0.0040) [2024-06-22 04:50:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2251374592. Throughput: 0: 42779.5. Samples: 2251445400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 04:50:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 04:50:10,550][15401] Updated weights for policy 0, policy_version 137420 (0.0031) [2024-06-22 04:50:13,390][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2251554816. Throughput: 0: 42696.5. Samples: 2251699460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 04:50:14,825][15401] Updated weights for policy 0, policy_version 137430 (0.0042) [2024-06-22 04:50:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2251800576. Throughput: 0: 42476.9. Samples: 2251945700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 04:50:18,860][15401] Updated weights for policy 0, policy_version 137440 (0.0033) [2024-06-22 04:50:22,617][15401] Updated weights for policy 0, policy_version 137450 (0.0041) [2024-06-22 04:50:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2251997184. Throughput: 0: 42722.2. Samples: 2252085880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 04:50:26,499][15401] Updated weights for policy 0, policy_version 137460 (0.0029) [2024-06-22 04:50:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2252193792. Throughput: 0: 42672.0. Samples: 2252341500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:28,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 04:50:30,176][15401] Updated weights for policy 0, policy_version 137470 (0.0035) [2024-06-22 04:50:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2252439552. Throughput: 0: 42664.1. Samples: 2252595020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 04:50:33,996][15401] Updated weights for policy 0, policy_version 137480 (0.0030) [2024-06-22 04:50:37,683][15401] Updated weights for policy 0, policy_version 137490 (0.0038) [2024-06-22 04:50:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2252668928. Throughput: 0: 42751.1. Samples: 2252730200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:38,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-22 04:50:41,638][15401] Updated weights for policy 0, policy_version 137500 (0.0031) [2024-06-22 04:50:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2252849152. Throughput: 0: 42940.5. Samples: 2252986620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 04:50:45,415][15401] Updated weights for policy 0, policy_version 137510 (0.0023) [2024-06-22 04:50:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2253078528. Throughput: 0: 42990.4. Samples: 2253241520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:48,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 04:50:49,260][15401] Updated weights for policy 0, policy_version 137520 (0.0028) [2024-06-22 04:50:53,300][15401] Updated weights for policy 0, policy_version 137530 (0.0036) [2024-06-22 04:50:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42874.9, 300 sec: 42709.5). Total num frames: 2253291520. Throughput: 0: 42910.3. Samples: 2253376360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 04:50:56,814][15401] Updated weights for policy 0, policy_version 137540 (0.0040) [2024-06-22 04:50:58,394][15132] Fps is (10 sec: 40943.7, 60 sec: 42868.6, 300 sec: 42708.9). Total num frames: 2253488128. Throughput: 0: 42749.1. Samples: 2253623340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:50:58,394][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 04:51:00,905][15401] Updated weights for policy 0, policy_version 137550 (0.0036) [2024-06-22 04:51:03,391][15132] Fps is (10 sec: 42593.5, 60 sec: 42597.7, 300 sec: 42709.3). Total num frames: 2253717504. Throughput: 0: 42961.6. Samples: 2253879020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:51:03,391][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:51:04,350][15401] Updated weights for policy 0, policy_version 137560 (0.0039) [2024-06-22 04:51:08,390][15132] Fps is (10 sec: 44253.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2253930496. Throughput: 0: 42877.1. Samples: 2254015360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 04:51:08,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-22 04:51:08,576][15401] Updated weights for policy 0, policy_version 137570 (0.0039) [2024-06-22 04:51:11,322][15349] Signal inference workers to stop experience collection... (33250 times) [2024-06-22 04:51:11,363][15401] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-06-22 04:51:11,437][15349] Signal inference workers to resume experience collection... (33250 times) [2024-06-22 04:51:11,437][15401] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-06-22 04:51:12,061][15401] Updated weights for policy 0, policy_version 137580 (0.0027) [2024-06-22 04:51:13,389][15132] Fps is (10 sec: 40965.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2254127104. Throughput: 0: 42677.4. Samples: 2254261980. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 04:51:16,183][15401] Updated weights for policy 0, policy_version 137590 (0.0039) [2024-06-22 04:51:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2254372864. Throughput: 0: 42749.8. Samples: 2254518760. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:18,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 04:51:19,913][15401] Updated weights for policy 0, policy_version 137600 (0.0038) [2024-06-22 04:51:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2254569472. Throughput: 0: 42782.9. Samples: 2254655420. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 04:51:23,663][15401] Updated weights for policy 0, policy_version 137610 (0.0032) [2024-06-22 04:51:27,681][15401] Updated weights for policy 0, policy_version 137620 (0.0039) [2024-06-22 04:51:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 2254798848. Throughput: 0: 42851.1. Samples: 2254915020. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:28,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 04:51:31,059][15401] Updated weights for policy 0, policy_version 137630 (0.0047) [2024-06-22 04:51:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 2255028224. Throughput: 0: 42769.8. Samples: 2255166160. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:33,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 04:51:35,400][15401] Updated weights for policy 0, policy_version 137640 (0.0041) [2024-06-22 04:51:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2255224832. Throughput: 0: 42816.0. Samples: 2255303080. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 04:51:38,583][15401] Updated weights for policy 0, policy_version 137650 (0.0026) [2024-06-22 04:51:42,943][15401] Updated weights for policy 0, policy_version 137660 (0.0030) [2024-06-22 04:51:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2255437824. Throughput: 0: 43066.6. Samples: 2255561160. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 04:51:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137661_2255437824.pth... [2024-06-22 04:51:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137033_2245148672.pth [2024-06-22 04:51:46,268][15401] Updated weights for policy 0, policy_version 137670 (0.0026) [2024-06-22 04:51:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2255683584. Throughput: 0: 42992.3. Samples: 2255813620. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 04:51:50,403][15401] Updated weights for policy 0, policy_version 137680 (0.0038) [2024-06-22 04:51:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2255847424. Throughput: 0: 42956.7. Samples: 2255948400. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 04:51:54,084][15401] Updated weights for policy 0, policy_version 137690 (0.0033) [2024-06-22 04:51:58,061][15401] Updated weights for policy 0, policy_version 137700 (0.0029) [2024-06-22 04:51:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43420.6, 300 sec: 42820.6). Total num frames: 2256093184. Throughput: 0: 43241.3. Samples: 2256207840. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:51:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 04:52:01,555][15401] Updated weights for policy 0, policy_version 137710 (0.0038) [2024-06-22 04:52:03,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43416.7, 300 sec: 42875.8). Total num frames: 2256322560. Throughput: 0: 43235.9. Samples: 2256464480. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:52:03,392][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 04:52:05,482][15401] Updated weights for policy 0, policy_version 137720 (0.0041) [2024-06-22 04:52:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2256519168. Throughput: 0: 43259.0. Samples: 2256602080. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:52:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 04:52:08,999][15401] Updated weights for policy 0, policy_version 137730 (0.0030) [2024-06-22 04:52:12,979][15401] Updated weights for policy 0, policy_version 137740 (0.0028) [2024-06-22 04:52:13,392][15132] Fps is (10 sec: 42598.4, 60 sec: 43688.9, 300 sec: 42931.3). Total num frames: 2256748544. Throughput: 0: 43212.4. Samples: 2256859580. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-06-22 04:52:13,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 04:52:16,957][15401] Updated weights for policy 0, policy_version 137750 (0.0045) [2024-06-22 04:52:17,655][15349] Signal inference workers to stop experience collection... (33300 times) [2024-06-22 04:52:17,660][15349] Signal inference workers to resume experience collection... (33300 times) [2024-06-22 04:52:17,695][15401] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-06-22 04:52:17,696][15401] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-06-22 04:52:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2256977920. Throughput: 0: 43263.1. Samples: 2257113000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 04:52:20,477][15401] Updated weights for policy 0, policy_version 137760 (0.0025) [2024-06-22 04:52:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2257158144. Throughput: 0: 43322.7. Samples: 2257252600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 04:52:24,610][15401] Updated weights for policy 0, policy_version 137770 (0.0032) [2024-06-22 04:52:28,385][15401] Updated weights for policy 0, policy_version 137780 (0.0039) [2024-06-22 04:52:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2257387520. Throughput: 0: 43205.8. Samples: 2257505420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:28,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 04:52:32,103][15401] Updated weights for policy 0, policy_version 137790 (0.0028) [2024-06-22 04:52:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2257633280. Throughput: 0: 43272.9. Samples: 2257760900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:33,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 04:52:35,808][15401] Updated weights for policy 0, policy_version 137800 (0.0039) [2024-06-22 04:52:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 2257797120. Throughput: 0: 43243.9. Samples: 2257894380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 04:52:39,810][15401] Updated weights for policy 0, policy_version 137810 (0.0033) [2024-06-22 04:52:43,302][15401] Updated weights for policy 0, policy_version 137820 (0.0023) [2024-06-22 04:52:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2258042880. Throughput: 0: 43153.1. Samples: 2258149740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 04:52:47,359][15401] Updated weights for policy 0, policy_version 137830 (0.0030) [2024-06-22 04:52:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2258255872. Throughput: 0: 43256.2. Samples: 2258410900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:52:50,785][15401] Updated weights for policy 0, policy_version 137840 (0.0049) [2024-06-22 04:52:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2258452480. Throughput: 0: 43168.9. Samples: 2258544680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:52:54,912][15401] Updated weights for policy 0, policy_version 137850 (0.0035) [2024-06-22 04:52:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 2258681856. Throughput: 0: 43080.4. Samples: 2258798100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:52:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 04:52:58,535][15401] Updated weights for policy 0, policy_version 137860 (0.0039) [2024-06-22 04:53:02,706][15401] Updated weights for policy 0, policy_version 137870 (0.0033) [2024-06-22 04:53:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 2258894848. Throughput: 0: 43314.7. Samples: 2259062160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:53:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 04:53:05,935][15401] Updated weights for policy 0, policy_version 137880 (0.0040) [2024-06-22 04:53:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 2259107840. Throughput: 0: 42989.2. Samples: 2259187120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:53:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 04:53:10,165][15401] Updated weights for policy 0, policy_version 137890 (0.0033) [2024-06-22 04:53:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 2259320832. Throughput: 0: 43033.8. Samples: 2259441940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 04:53:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 04:53:13,923][15401] Updated weights for policy 0, policy_version 137900 (0.0033) [2024-06-22 04:53:17,761][15401] Updated weights for policy 0, policy_version 137910 (0.0036) [2024-06-22 04:53:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 2259550208. Throughput: 0: 43042.2. Samples: 2259697800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 04:53:21,433][15401] Updated weights for policy 0, policy_version 137920 (0.0024) [2024-06-22 04:53:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2259746816. Throughput: 0: 42988.3. Samples: 2259828960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:23,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 04:53:25,224][15401] Updated weights for policy 0, policy_version 137930 (0.0032) [2024-06-22 04:53:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2259943424. Throughput: 0: 43042.0. Samples: 2260086620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 04:53:28,992][15401] Updated weights for policy 0, policy_version 137940 (0.0035) [2024-06-22 04:53:32,605][15349] Signal inference workers to stop experience collection... (33350 times) [2024-06-22 04:53:32,606][15349] Signal inference workers to resume experience collection... (33350 times) [2024-06-22 04:53:32,655][15401] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-06-22 04:53:32,655][15401] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-06-22 04:53:32,992][15401] Updated weights for policy 0, policy_version 137950 (0.0034) [2024-06-22 04:53:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 2260189184. Throughput: 0: 42873.8. Samples: 2260340220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 04:53:36,513][15401] Updated weights for policy 0, policy_version 137960 (0.0034) [2024-06-22 04:53:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2260385792. Throughput: 0: 42837.3. Samples: 2260472360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 04:53:40,550][15401] Updated weights for policy 0, policy_version 137970 (0.0038) [2024-06-22 04:53:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 2260582400. Throughput: 0: 42927.3. Samples: 2260729820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 04:53:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137976_2260598784.pth... [2024-06-22 04:53:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137348_2250309632.pth [2024-06-22 04:53:44,256][15401] Updated weights for policy 0, policy_version 137980 (0.0031) [2024-06-22 04:53:48,188][15401] Updated weights for policy 0, policy_version 137990 (0.0042) [2024-06-22 04:53:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 2260844544. Throughput: 0: 42779.5. Samples: 2260987240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:48,396][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 04:53:51,781][15401] Updated weights for policy 0, policy_version 138000 (0.0035) [2024-06-22 04:53:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2261024768. Throughput: 0: 42935.6. Samples: 2261119220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 04:53:55,764][15401] Updated weights for policy 0, policy_version 138010 (0.0031) [2024-06-22 04:53:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2261237760. Throughput: 0: 42871.8. Samples: 2261371180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:53:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 04:53:59,375][15401] Updated weights for policy 0, policy_version 138020 (0.0022) [2024-06-22 04:54:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2261450752. Throughput: 0: 42943.6. Samples: 2261630260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:54:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 04:54:03,770][15401] Updated weights for policy 0, policy_version 138030 (0.0032) [2024-06-22 04:54:07,077][15401] Updated weights for policy 0, policy_version 138040 (0.0030) [2024-06-22 04:54:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2261680128. Throughput: 0: 42925.3. Samples: 2261760500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:54:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 04:54:11,325][15401] Updated weights for policy 0, policy_version 138050 (0.0040) [2024-06-22 04:54:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 2261893120. Throughput: 0: 42848.0. Samples: 2262014780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:54:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 04:54:14,584][15401] Updated weights for policy 0, policy_version 138060 (0.0042) [2024-06-22 04:54:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2262106112. Throughput: 0: 43026.2. Samples: 2262276400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 04:54:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 04:54:18,954][15401] Updated weights for policy 0, policy_version 138070 (0.0035) [2024-06-22 04:54:22,474][15401] Updated weights for policy 0, policy_version 138080 (0.0029) [2024-06-22 04:54:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 2262319104. Throughput: 0: 42872.4. Samples: 2262401620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 04:54:26,611][15401] Updated weights for policy 0, policy_version 138090 (0.0037) [2024-06-22 04:54:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.4, 300 sec: 42987.5). Total num frames: 2262548480. Throughput: 0: 42913.6. Samples: 2262660940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 04:54:30,229][15401] Updated weights for policy 0, policy_version 138100 (0.0040) [2024-06-22 04:54:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2262745088. Throughput: 0: 43014.6. Samples: 2262922900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 04:54:34,230][15401] Updated weights for policy 0, policy_version 138110 (0.0032) [2024-06-22 04:54:37,658][15401] Updated weights for policy 0, policy_version 138120 (0.0038) [2024-06-22 04:54:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2262974464. Throughput: 0: 42833.8. Samples: 2263046740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 04:54:41,946][15401] Updated weights for policy 0, policy_version 138130 (0.0031) [2024-06-22 04:54:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 2263171072. Throughput: 0: 43069.5. Samples: 2263309300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 04:54:45,494][15401] Updated weights for policy 0, policy_version 138140 (0.0043) [2024-06-22 04:54:46,279][15349] Signal inference workers to stop experience collection... (33400 times) [2024-06-22 04:54:46,279][15349] Signal inference workers to resume experience collection... (33400 times) [2024-06-22 04:54:46,310][15401] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-06-22 04:54:46,310][15401] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-06-22 04:54:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42932.3). Total num frames: 2263384064. Throughput: 0: 42948.9. Samples: 2263562960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 04:54:49,594][15401] Updated weights for policy 0, policy_version 138150 (0.0030) [2024-06-22 04:54:53,013][15401] Updated weights for policy 0, policy_version 138160 (0.0040) [2024-06-22 04:54:53,393][15132] Fps is (10 sec: 44220.3, 60 sec: 43142.0, 300 sec: 43042.2). Total num frames: 2263613440. Throughput: 0: 42872.1. Samples: 2263689900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:53,394][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 04:54:57,045][15401] Updated weights for policy 0, policy_version 138170 (0.0039) [2024-06-22 04:54:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2263810048. Throughput: 0: 42949.3. Samples: 2263947500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:54:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 04:55:00,737][15401] Updated weights for policy 0, policy_version 138180 (0.0030) [2024-06-22 04:55:03,390][15132] Fps is (10 sec: 42613.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2264039424. Throughput: 0: 42824.7. Samples: 2264203520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:55:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 04:55:04,687][15401] Updated weights for policy 0, policy_version 138190 (0.0036) [2024-06-22 04:55:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 2264236032. Throughput: 0: 42827.3. Samples: 2264328840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:55:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 04:55:08,578][15401] Updated weights for policy 0, policy_version 138200 (0.0041) [2024-06-22 04:55:12,262][15401] Updated weights for policy 0, policy_version 138210 (0.0024) [2024-06-22 04:55:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2264449024. Throughput: 0: 42783.8. Samples: 2264586200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:55:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 04:55:16,110][15401] Updated weights for policy 0, policy_version 138220 (0.0029) [2024-06-22 04:55:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2264694784. Throughput: 0: 42631.6. Samples: 2264841320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 04:55:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 04:55:19,963][15401] Updated weights for policy 0, policy_version 138230 (0.0028) [2024-06-22 04:55:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2264891392. Throughput: 0: 42806.3. Samples: 2264973020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 04:55:23,774][15401] Updated weights for policy 0, policy_version 138240 (0.0032) [2024-06-22 04:55:28,253][15401] Updated weights for policy 0, policy_version 138250 (0.0034) [2024-06-22 04:55:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2265088000. Throughput: 0: 42661.7. Samples: 2265229080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 04:55:31,391][15401] Updated weights for policy 0, policy_version 138260 (0.0034) [2024-06-22 04:55:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2265333760. Throughput: 0: 42542.6. Samples: 2265477380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 04:55:35,686][15401] Updated weights for policy 0, policy_version 138270 (0.0030) [2024-06-22 04:55:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2265546752. Throughput: 0: 42737.7. Samples: 2265612940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 04:55:38,977][15401] Updated weights for policy 0, policy_version 138280 (0.0036) [2024-06-22 04:55:43,228][15401] Updated weights for policy 0, policy_version 138290 (0.0040) [2024-06-22 04:55:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2265743360. Throughput: 0: 42650.1. Samples: 2265866760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 04:55:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138290_2265743360.pth... [2024-06-22 04:55:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137661_2255437824.pth [2024-06-22 04:55:46,900][15401] Updated weights for policy 0, policy_version 138300 (0.0029) [2024-06-22 04:55:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2265956352. Throughput: 0: 42681.0. Samples: 2266124160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 04:55:51,118][15401] Updated weights for policy 0, policy_version 138310 (0.0032) [2024-06-22 04:55:52,917][15349] Signal inference workers to stop experience collection... (33450 times) [2024-06-22 04:55:52,943][15401] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-06-22 04:55:52,981][15349] Signal inference workers to resume experience collection... (33450 times) [2024-06-22 04:55:52,983][15401] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-06-22 04:55:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42601.1, 300 sec: 42987.8). Total num frames: 2266169344. Throughput: 0: 42712.9. Samples: 2266250920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:53,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 04:55:54,796][15401] Updated weights for policy 0, policy_version 138320 (0.0034) [2024-06-22 04:55:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42876.3). Total num frames: 2266365952. Throughput: 0: 42630.1. Samples: 2266504560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:55:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 04:55:58,987][15401] Updated weights for policy 0, policy_version 138330 (0.0041) [2024-06-22 04:56:02,365][15401] Updated weights for policy 0, policy_version 138340 (0.0033) [2024-06-22 04:56:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2266578944. Throughput: 0: 42711.7. Samples: 2266763340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:56:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 04:56:06,454][15401] Updated weights for policy 0, policy_version 138350 (0.0027) [2024-06-22 04:56:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 2266824704. Throughput: 0: 42734.6. Samples: 2266896080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:56:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 04:56:09,829][15401] Updated weights for policy 0, policy_version 138360 (0.0038) [2024-06-22 04:56:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 2267021312. Throughput: 0: 42664.0. Samples: 2267149060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:56:13,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 04:56:13,976][15401] Updated weights for policy 0, policy_version 138370 (0.0029) [2024-06-22 04:56:17,703][15401] Updated weights for policy 0, policy_version 138380 (0.0030) [2024-06-22 04:56:18,393][15132] Fps is (10 sec: 40945.2, 60 sec: 42322.7, 300 sec: 42931.1). Total num frames: 2267234304. Throughput: 0: 42927.2. Samples: 2267409260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:56:18,394][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 04:56:21,741][15401] Updated weights for policy 0, policy_version 138390 (0.0038) [2024-06-22 04:56:23,396][15132] Fps is (10 sec: 44218.8, 60 sec: 42866.9, 300 sec: 42931.0). Total num frames: 2267463680. Throughput: 0: 42763.3. Samples: 2267537560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 04:56:23,397][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 04:56:25,147][15401] Updated weights for policy 0, policy_version 138400 (0.0039) [2024-06-22 04:56:28,389][15132] Fps is (10 sec: 44253.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2267676672. Throughput: 0: 42785.0. Samples: 2267792080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 04:56:29,129][15401] Updated weights for policy 0, policy_version 138410 (0.0027) [2024-06-22 04:56:32,796][15401] Updated weights for policy 0, policy_version 138420 (0.0034) [2024-06-22 04:56:33,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2267873280. Throughput: 0: 42804.9. Samples: 2268050380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 04:56:36,609][15401] Updated weights for policy 0, policy_version 138430 (0.0034) [2024-06-22 04:56:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2268102656. Throughput: 0: 42872.9. Samples: 2268180200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 04:56:40,386][15401] Updated weights for policy 0, policy_version 138440 (0.0034) [2024-06-22 04:56:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2268315648. Throughput: 0: 42916.9. Samples: 2268435820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 04:56:44,129][15401] Updated weights for policy 0, policy_version 138450 (0.0030) [2024-06-22 04:56:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2268512256. Throughput: 0: 42762.2. Samples: 2268687640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 04:56:48,492][15401] Updated weights for policy 0, policy_version 138460 (0.0035) [2024-06-22 04:56:51,674][15401] Updated weights for policy 0, policy_version 138470 (0.0032) [2024-06-22 04:56:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2268741632. Throughput: 0: 42614.9. Samples: 2268813740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 04:56:56,178][15401] Updated weights for policy 0, policy_version 138480 (0.0023) [2024-06-22 04:56:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 2268971008. Throughput: 0: 42888.5. Samples: 2269078940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:56:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 04:56:59,247][15401] Updated weights for policy 0, policy_version 138490 (0.0037) [2024-06-22 04:57:03,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 2269167616. Throughput: 0: 42699.7. Samples: 2269330860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:03,396][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 04:57:03,684][15401] Updated weights for policy 0, policy_version 138500 (0.0050) [2024-06-22 04:57:06,742][15401] Updated weights for policy 0, policy_version 138510 (0.0035) [2024-06-22 04:57:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 2269364224. Throughput: 0: 42651.9. Samples: 2269456620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 04:57:09,262][15349] Signal inference workers to stop experience collection... (33500 times) [2024-06-22 04:57:09,269][15349] Signal inference workers to resume experience collection... (33500 times) [2024-06-22 04:57:09,298][15401] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-06-22 04:57:09,298][15401] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-06-22 04:57:11,217][15401] Updated weights for policy 0, policy_version 138520 (0.0030) [2024-06-22 04:57:13,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2269593600. Throughput: 0: 43001.8. Samples: 2269727160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:13,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 04:57:14,238][15401] Updated weights for policy 0, policy_version 138530 (0.0026) [2024-06-22 04:57:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42874.1, 300 sec: 42876.1). Total num frames: 2269806592. Throughput: 0: 42827.4. Samples: 2269977620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:18,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-22 04:57:19,049][15401] Updated weights for policy 0, policy_version 138540 (0.0030) [2024-06-22 04:57:22,704][15401] Updated weights for policy 0, policy_version 138550 (0.0040) [2024-06-22 04:57:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42874.3, 300 sec: 42875.7). Total num frames: 2270035968. Throughput: 0: 42751.8. Samples: 2270104140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:23,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 04:57:26,554][15401] Updated weights for policy 0, policy_version 138560 (0.0027) [2024-06-22 04:57:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2270248960. Throughput: 0: 42960.1. Samples: 2270369020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 04:57:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 04:57:30,070][15401] Updated weights for policy 0, policy_version 138570 (0.0034) [2024-06-22 04:57:33,390][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2270461952. Throughput: 0: 42952.0. Samples: 2270620480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 04:57:34,432][15401] Updated weights for policy 0, policy_version 138580 (0.0042) [2024-06-22 04:57:37,918][15401] Updated weights for policy 0, policy_version 138590 (0.0032) [2024-06-22 04:57:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2270674944. Throughput: 0: 42974.7. Samples: 2270747600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:38,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 04:57:41,866][15401] Updated weights for policy 0, policy_version 138600 (0.0048) [2024-06-22 04:57:43,395][15132] Fps is (10 sec: 40937.7, 60 sec: 42594.6, 300 sec: 42764.2). Total num frames: 2270871552. Throughput: 0: 42861.5. Samples: 2271007940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:43,396][15132] Avg episode reward: [(0, '0.164')] [2024-06-22 04:57:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138603_2270871552.pth... [2024-06-22 04:57:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000137976_2260598784.pth [2024-06-22 04:57:45,434][15401] Updated weights for policy 0, policy_version 138610 (0.0039) [2024-06-22 04:57:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2271100928. Throughput: 0: 42881.2. Samples: 2271260240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:48,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 04:57:49,287][15401] Updated weights for policy 0, policy_version 138620 (0.0045) [2024-06-22 04:57:52,921][15401] Updated weights for policy 0, policy_version 138630 (0.0030) [2024-06-22 04:57:53,390][15132] Fps is (10 sec: 44260.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2271313920. Throughput: 0: 43040.8. Samples: 2271393460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 04:57:57,300][15401] Updated weights for policy 0, policy_version 138640 (0.0039) [2024-06-22 04:57:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2271510528. Throughput: 0: 42795.1. Samples: 2271652940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:57:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 04:58:00,363][15401] Updated weights for policy 0, policy_version 138650 (0.0037) [2024-06-22 04:58:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 2271739904. Throughput: 0: 42764.5. Samples: 2271902020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 04:58:04,724][15401] Updated weights for policy 0, policy_version 138660 (0.0034) [2024-06-22 04:58:08,223][15401] Updated weights for policy 0, policy_version 138670 (0.0032) [2024-06-22 04:58:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2271969280. Throughput: 0: 43032.1. Samples: 2272040480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 04:58:12,203][15401] Updated weights for policy 0, policy_version 138680 (0.0027) [2024-06-22 04:58:13,394][15132] Fps is (10 sec: 40941.8, 60 sec: 42595.2, 300 sec: 42708.8). Total num frames: 2272149504. Throughput: 0: 42534.9. Samples: 2272283280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:13,394][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 04:58:16,144][15401] Updated weights for policy 0, policy_version 138690 (0.0027) [2024-06-22 04:58:18,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 2272362496. Throughput: 0: 42689.7. Samples: 2272541620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:18,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 04:58:20,067][15401] Updated weights for policy 0, policy_version 138700 (0.0037) [2024-06-22 04:58:23,389][15132] Fps is (10 sec: 44256.9, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 2272591872. Throughput: 0: 42606.2. Samples: 2272664880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 04:58:23,813][15401] Updated weights for policy 0, policy_version 138710 (0.0032) [2024-06-22 04:58:27,860][15401] Updated weights for policy 0, policy_version 138720 (0.0035) [2024-06-22 04:58:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2272788480. Throughput: 0: 42407.0. Samples: 2272916020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 04:58:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 04:58:29,779][15349] Signal inference workers to stop experience collection... (33550 times) [2024-06-22 04:58:29,779][15349] Signal inference workers to resume experience collection... (33550 times) [2024-06-22 04:58:29,815][15401] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-06-22 04:58:29,815][15401] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-06-22 04:58:31,351][15401] Updated weights for policy 0, policy_version 138730 (0.0028) [2024-06-22 04:58:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2273001472. Throughput: 0: 42649.7. Samples: 2273179480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 04:58:35,394][15401] Updated weights for policy 0, policy_version 138740 (0.0027) [2024-06-22 04:58:38,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 2273230848. Throughput: 0: 42621.7. Samples: 2273311540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:38,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 04:58:39,055][15401] Updated weights for policy 0, policy_version 138750 (0.0028) [2024-06-22 04:58:43,004][15401] Updated weights for policy 0, policy_version 138760 (0.0038) [2024-06-22 04:58:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42875.3, 300 sec: 42709.5). Total num frames: 2273443840. Throughput: 0: 42491.5. Samples: 2273565060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 04:58:46,730][15401] Updated weights for policy 0, policy_version 138770 (0.0029) [2024-06-22 04:58:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2273656832. Throughput: 0: 42573.8. Samples: 2273817840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:48,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 04:58:50,782][15401] Updated weights for policy 0, policy_version 138780 (0.0026) [2024-06-22 04:58:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2273869824. Throughput: 0: 42357.7. Samples: 2273946580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:53,390][15132] Avg episode reward: [(0, '0.139')] [2024-06-22 04:58:54,182][15401] Updated weights for policy 0, policy_version 138790 (0.0036) [2024-06-22 04:58:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2274082816. Throughput: 0: 42829.5. Samples: 2274210420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:58:58,390][15132] Avg episode reward: [(0, '0.139')] [2024-06-22 04:58:58,548][15401] Updated weights for policy 0, policy_version 138800 (0.0036) [2024-06-22 04:59:01,662][15401] Updated weights for policy 0, policy_version 138810 (0.0029) [2024-06-22 04:59:03,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2274279424. Throughput: 0: 42791.3. Samples: 2274467120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 04:59:06,122][15401] Updated weights for policy 0, policy_version 138820 (0.0043) [2024-06-22 04:59:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2274525184. Throughput: 0: 42820.3. Samples: 2274591800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 04:59:09,435][15401] Updated weights for policy 0, policy_version 138830 (0.0032) [2024-06-22 04:59:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42874.6, 300 sec: 42765.0). Total num frames: 2274721792. Throughput: 0: 42976.0. Samples: 2274849940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 04:59:13,805][15401] Updated weights for policy 0, policy_version 138840 (0.0039) [2024-06-22 04:59:17,046][15401] Updated weights for policy 0, policy_version 138850 (0.0048) [2024-06-22 04:59:18,394][15132] Fps is (10 sec: 40939.6, 60 sec: 42869.6, 300 sec: 42764.3). Total num frames: 2274934784. Throughput: 0: 42689.2. Samples: 2275100700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:18,395][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 04:59:21,693][15401] Updated weights for policy 0, policy_version 138860 (0.0027) [2024-06-22 04:59:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 2275131392. Throughput: 0: 42639.0. Samples: 2275230200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:23,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-22 04:59:24,826][15401] Updated weights for policy 0, policy_version 138870 (0.0027) [2024-06-22 04:59:28,389][15132] Fps is (10 sec: 42619.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2275360768. Throughput: 0: 42641.0. Samples: 2275483900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:28,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 04:59:29,554][15401] Updated weights for policy 0, policy_version 138880 (0.0028) [2024-06-22 04:59:32,510][15401] Updated weights for policy 0, policy_version 138890 (0.0031) [2024-06-22 04:59:33,390][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2275590144. Throughput: 0: 42602.1. Samples: 2275734940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 04:59:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 04:59:37,459][15401] Updated weights for policy 0, policy_version 138900 (0.0028) [2024-06-22 04:59:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 2275786752. Throughput: 0: 42793.0. Samples: 2275872360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:59:38,393][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 04:59:40,311][15401] Updated weights for policy 0, policy_version 138910 (0.0050) [2024-06-22 04:59:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2275999744. Throughput: 0: 42582.2. Samples: 2276126620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:59:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 04:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138916_2275999744.pth... [2024-06-22 04:59:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138290_2265743360.pth [2024-06-22 04:59:45,176][15401] Updated weights for policy 0, policy_version 138920 (0.0043) [2024-06-22 04:59:47,965][15401] Updated weights for policy 0, policy_version 138930 (0.0033) [2024-06-22 04:59:48,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 2276245504. Throughput: 0: 42408.4. Samples: 2276375500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:59:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 04:59:52,923][15401] Updated weights for policy 0, policy_version 138940 (0.0026) [2024-06-22 04:59:53,215][15349] Signal inference workers to stop experience collection... (33600 times) [2024-06-22 04:59:53,216][15349] Signal inference workers to resume experience collection... (33600 times) [2024-06-22 04:59:53,235][15401] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-06-22 04:59:53,236][15401] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-06-22 04:59:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2276425728. Throughput: 0: 42584.5. Samples: 2276508100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:59:53,392][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 04:59:55,948][15401] Updated weights for policy 0, policy_version 138950 (0.0032) [2024-06-22 04:59:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2276638720. Throughput: 0: 42492.9. Samples: 2276762120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 04:59:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 05:00:00,595][15401] Updated weights for policy 0, policy_version 138960 (0.0029) [2024-06-22 05:00:03,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 2276868096. Throughput: 0: 42535.2. Samples: 2277014680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:03,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 05:00:03,541][15401] Updated weights for policy 0, policy_version 138970 (0.0037) [2024-06-22 05:00:08,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41777.5, 300 sec: 42653.6). Total num frames: 2277031936. Throughput: 0: 42595.7. Samples: 2277147100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:08,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 05:00:08,580][15401] Updated weights for policy 0, policy_version 138980 (0.0041) [2024-06-22 05:00:11,228][15401] Updated weights for policy 0, policy_version 138990 (0.0031) [2024-06-22 05:00:13,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2277277696. Throughput: 0: 42501.7. Samples: 2277396480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 05:00:16,118][15401] Updated weights for policy 0, policy_version 139000 (0.0042) [2024-06-22 05:00:18,389][15132] Fps is (10 sec: 47525.0, 60 sec: 42875.0, 300 sec: 42765.0). Total num frames: 2277507072. Throughput: 0: 42599.6. Samples: 2277651920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 05:00:18,993][15401] Updated weights for policy 0, policy_version 139010 (0.0034) [2024-06-22 05:00:23,394][15132] Fps is (10 sec: 40942.2, 60 sec: 42595.4, 300 sec: 42708.8). Total num frames: 2277687296. Throughput: 0: 42396.8. Samples: 2277780300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:23,394][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 05:00:23,686][15401] Updated weights for policy 0, policy_version 139020 (0.0037) [2024-06-22 05:00:26,485][15401] Updated weights for policy 0, policy_version 139030 (0.0034) [2024-06-22 05:00:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2277933056. Throughput: 0: 42406.7. Samples: 2278034920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 05:00:31,381][15401] Updated weights for policy 0, policy_version 139040 (0.0034) [2024-06-22 05:00:33,390][15132] Fps is (10 sec: 44256.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2278129664. Throughput: 0: 42731.0. Samples: 2278298400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:33,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 05:00:34,062][15401] Updated weights for policy 0, policy_version 139050 (0.0036) [2024-06-22 05:00:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 2278326272. Throughput: 0: 42524.0. Samples: 2278421680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 05:00:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 05:00:38,961][15401] Updated weights for policy 0, policy_version 139060 (0.0029) [2024-06-22 05:00:41,623][15401] Updated weights for policy 0, policy_version 139070 (0.0033) [2024-06-22 05:00:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2278572032. Throughput: 0: 42493.8. Samples: 2278674340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:00:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 05:00:46,443][15401] Updated weights for policy 0, policy_version 139080 (0.0025) [2024-06-22 05:00:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2278768640. Throughput: 0: 42743.3. Samples: 2278938020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:00:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 05:00:49,434][15401] Updated weights for policy 0, policy_version 139090 (0.0026) [2024-06-22 05:00:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2278965248. Throughput: 0: 42572.6. Samples: 2279062760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:00:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 05:00:54,075][15401] Updated weights for policy 0, policy_version 139100 (0.0039) [2024-06-22 05:00:57,020][15401] Updated weights for policy 0, policy_version 139110 (0.0029) [2024-06-22 05:00:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2279227392. Throughput: 0: 42692.6. Samples: 2279317640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:00:58,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-22 05:01:01,534][15401] Updated weights for policy 0, policy_version 139120 (0.0020) [2024-06-22 05:01:03,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 2279407616. Throughput: 0: 43001.2. Samples: 2279586980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:03,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-22 05:01:04,792][15401] Updated weights for policy 0, policy_version 139130 (0.0033) [2024-06-22 05:01:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 2279620608. Throughput: 0: 42782.0. Samples: 2279705300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:08,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-22 05:01:08,999][15401] Updated weights for policy 0, policy_version 139140 (0.0046) [2024-06-22 05:01:09,693][15349] Signal inference workers to stop experience collection... (33650 times) [2024-06-22 05:01:09,693][15349] Signal inference workers to resume experience collection... (33650 times) [2024-06-22 05:01:09,734][15401] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-06-22 05:01:09,734][15401] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-06-22 05:01:12,431][15401] Updated weights for policy 0, policy_version 139150 (0.0048) [2024-06-22 05:01:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42821.1). Total num frames: 2279866368. Throughput: 0: 42980.1. Samples: 2279969020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 05:01:16,660][15401] Updated weights for policy 0, policy_version 139160 (0.0036) [2024-06-22 05:01:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 2280046592. Throughput: 0: 42939.5. Samples: 2280230680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 05:01:20,118][15401] Updated weights for policy 0, policy_version 139170 (0.0026) [2024-06-22 05:01:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43147.7, 300 sec: 42709.5). Total num frames: 2280275968. Throughput: 0: 42900.0. Samples: 2280352180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 05:01:24,050][15401] Updated weights for policy 0, policy_version 139180 (0.0039) [2024-06-22 05:01:27,983][15401] Updated weights for policy 0, policy_version 139190 (0.0032) [2024-06-22 05:01:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2280505344. Throughput: 0: 43080.4. Samples: 2280612960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 05:01:31,905][15401] Updated weights for policy 0, policy_version 139200 (0.0031) [2024-06-22 05:01:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2280685568. Throughput: 0: 42985.8. Samples: 2280872380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 05:01:35,515][15401] Updated weights for policy 0, policy_version 139210 (0.0029) [2024-06-22 05:01:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2280914944. Throughput: 0: 43084.0. Samples: 2281001540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 05:01:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 05:01:39,654][15401] Updated weights for policy 0, policy_version 139220 (0.0031) [2024-06-22 05:01:43,202][15401] Updated weights for policy 0, policy_version 139230 (0.0039) [2024-06-22 05:01:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2281144320. Throughput: 0: 43040.8. Samples: 2281254480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:01:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 05:01:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139230_2281144320.pth... [2024-06-22 05:01:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138603_2270871552.pth [2024-06-22 05:01:47,326][15401] Updated weights for policy 0, policy_version 139240 (0.0028) [2024-06-22 05:01:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2281340928. Throughput: 0: 42869.4. Samples: 2281516100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:01:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 05:01:51,021][15401] Updated weights for policy 0, policy_version 139250 (0.0036) [2024-06-22 05:01:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2281553920. Throughput: 0: 43092.5. Samples: 2281644460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:01:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 05:01:54,922][15401] Updated weights for policy 0, policy_version 139260 (0.0025) [2024-06-22 05:01:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 2281766912. Throughput: 0: 42906.2. Samples: 2281899800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:01:58,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 05:01:58,663][15401] Updated weights for policy 0, policy_version 139270 (0.0034) [2024-06-22 05:02:02,438][15401] Updated weights for policy 0, policy_version 139280 (0.0037) [2024-06-22 05:02:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2281979904. Throughput: 0: 42822.8. Samples: 2282157700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 05:02:06,357][15401] Updated weights for policy 0, policy_version 139290 (0.0045) [2024-06-22 05:02:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2282209280. Throughput: 0: 42959.1. Samples: 2282285340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 05:02:09,937][15401] Updated weights for policy 0, policy_version 139300 (0.0054) [2024-06-22 05:02:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2282422272. Throughput: 0: 42909.7. Samples: 2282543900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 05:02:13,786][15401] Updated weights for policy 0, policy_version 139310 (0.0024) [2024-06-22 05:02:17,849][15401] Updated weights for policy 0, policy_version 139320 (0.0032) [2024-06-22 05:02:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.8, 300 sec: 42765.4). Total num frames: 2282651648. Throughput: 0: 42921.9. Samples: 2282803860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 05:02:21,827][15401] Updated weights for policy 0, policy_version 139330 (0.0028) [2024-06-22 05:02:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2282864640. Throughput: 0: 42941.2. Samples: 2282933900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 05:02:25,479][15401] Updated weights for policy 0, policy_version 139340 (0.0030) [2024-06-22 05:02:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2283061248. Throughput: 0: 43091.1. Samples: 2283193580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:28,390][15132] Avg episode reward: [(0, '0.124')] [2024-06-22 05:02:29,226][15401] Updated weights for policy 0, policy_version 139350 (0.0032) [2024-06-22 05:02:30,431][15349] Signal inference workers to stop experience collection... (33700 times) [2024-06-22 05:02:30,436][15349] Signal inference workers to resume experience collection... (33700 times) [2024-06-22 05:02:30,473][15401] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-06-22 05:02:30,473][15401] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-06-22 05:02:32,855][15401] Updated weights for policy 0, policy_version 139360 (0.0034) [2024-06-22 05:02:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2283307008. Throughput: 0: 42937.4. Samples: 2283448280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 05:02:36,965][15401] Updated weights for policy 0, policy_version 139370 (0.0033) [2024-06-22 05:02:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42821.3). Total num frames: 2283503616. Throughput: 0: 43096.8. Samples: 2283583820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 05:02:40,346][15401] Updated weights for policy 0, policy_version 139380 (0.0042) [2024-06-22 05:02:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2283700224. Throughput: 0: 42911.9. Samples: 2283830840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:02:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 05:02:44,926][15401] Updated weights for policy 0, policy_version 139390 (0.0037) [2024-06-22 05:02:47,955][15401] Updated weights for policy 0, policy_version 139400 (0.0042) [2024-06-22 05:02:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2283929600. Throughput: 0: 42833.2. Samples: 2284085200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:02:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 05:02:52,482][15401] Updated weights for policy 0, policy_version 139410 (0.0033) [2024-06-22 05:02:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2284142592. Throughput: 0: 43059.4. Samples: 2284223020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:02:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 05:02:55,413][15401] Updated weights for policy 0, policy_version 139420 (0.0035) [2024-06-22 05:02:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2284355584. Throughput: 0: 42933.0. Samples: 2284475880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:02:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 05:02:59,994][15401] Updated weights for policy 0, policy_version 139430 (0.0033) [2024-06-22 05:03:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2284568576. Throughput: 0: 42844.8. Samples: 2284731880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 05:03:03,407][15401] Updated weights for policy 0, policy_version 139440 (0.0032) [2024-06-22 05:03:07,605][15401] Updated weights for policy 0, policy_version 139450 (0.0039) [2024-06-22 05:03:08,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42876.4). Total num frames: 2284797952. Throughput: 0: 42856.5. Samples: 2284862540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:08,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 05:03:11,051][15401] Updated weights for policy 0, policy_version 139460 (0.0042) [2024-06-22 05:03:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 2284994560. Throughput: 0: 42704.6. Samples: 2285115280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 05:03:15,407][15401] Updated weights for policy 0, policy_version 139470 (0.0037) [2024-06-22 05:03:18,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2285223936. Throughput: 0: 42792.7. Samples: 2285374060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:18,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 05:03:18,552][15401] Updated weights for policy 0, policy_version 139480 (0.0028) [2024-06-22 05:03:22,928][15401] Updated weights for policy 0, policy_version 139490 (0.0045) [2024-06-22 05:03:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2285420544. Throughput: 0: 42621.4. Samples: 2285501780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 05:03:26,277][15401] Updated weights for policy 0, policy_version 139500 (0.0027) [2024-06-22 05:03:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2285649920. Throughput: 0: 42850.8. Samples: 2285759120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 05:03:30,495][15401] Updated weights for policy 0, policy_version 139510 (0.0044) [2024-06-22 05:03:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 2285846528. Throughput: 0: 42984.1. Samples: 2286019480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 05:03:33,839][15401] Updated weights for policy 0, policy_version 139520 (0.0036) [2024-06-22 05:03:38,329][15401] Updated weights for policy 0, policy_version 139530 (0.0036) [2024-06-22 05:03:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2286059520. Throughput: 0: 42571.2. Samples: 2286138720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 05:03:41,728][15401] Updated weights for policy 0, policy_version 139540 (0.0032) [2024-06-22 05:03:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2286272512. Throughput: 0: 42594.3. Samples: 2286392620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-22 05:03:43,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 05:03:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139544_2286288896.pth... [2024-06-22 05:03:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000138916_2275999744.pth [2024-06-22 05:03:45,857][15401] Updated weights for policy 0, policy_version 139550 (0.0043) [2024-06-22 05:03:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2286485504. Throughput: 0: 42750.7. Samples: 2286655660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:03:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 05:03:49,263][15401] Updated weights for policy 0, policy_version 139560 (0.0029) [2024-06-22 05:03:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2286698496. Throughput: 0: 42592.8. Samples: 2286779120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:03:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 05:03:53,423][15401] Updated weights for policy 0, policy_version 139570 (0.0029) [2024-06-22 05:03:56,231][15349] Signal inference workers to stop experience collection... (33750 times) [2024-06-22 05:03:56,283][15401] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-06-22 05:03:56,290][15349] Signal inference workers to resume experience collection... (33750 times) [2024-06-22 05:03:56,301][15401] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-06-22 05:03:56,787][15401] Updated weights for policy 0, policy_version 139580 (0.0032) [2024-06-22 05:03:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2286911488. Throughput: 0: 42535.9. Samples: 2287029400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:03:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 05:04:01,110][15401] Updated weights for policy 0, policy_version 139590 (0.0037) [2024-06-22 05:04:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2287124480. Throughput: 0: 42597.7. Samples: 2287290860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 05:04:04,489][15401] Updated weights for policy 0, policy_version 139600 (0.0040) [2024-06-22 05:04:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2287337472. Throughput: 0: 42560.9. Samples: 2287417020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 05:04:09,007][15401] Updated weights for policy 0, policy_version 139610 (0.0033) [2024-06-22 05:04:12,177][15401] Updated weights for policy 0, policy_version 139620 (0.0032) [2024-06-22 05:04:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42710.2). Total num frames: 2287534080. Throughput: 0: 42392.8. Samples: 2287666800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 05:04:16,608][15401] Updated weights for policy 0, policy_version 139630 (0.0027) [2024-06-22 05:04:18,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42052.3, 300 sec: 42764.7). Total num frames: 2287747072. Throughput: 0: 42499.9. Samples: 2287932080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:18,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 05:04:19,827][15401] Updated weights for policy 0, policy_version 139640 (0.0028) [2024-06-22 05:04:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2287960064. Throughput: 0: 42679.6. Samples: 2288059300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 05:04:24,074][15401] Updated weights for policy 0, policy_version 139650 (0.0033) [2024-06-22 05:04:27,462][15401] Updated weights for policy 0, policy_version 139660 (0.0036) [2024-06-22 05:04:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2288189440. Throughput: 0: 42669.7. Samples: 2288312760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 05:04:31,978][15401] Updated weights for policy 0, policy_version 139670 (0.0023) [2024-06-22 05:04:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 2288402432. Throughput: 0: 42744.2. Samples: 2288579160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 05:04:35,846][15401] Updated weights for policy 0, policy_version 139680 (0.0031) [2024-06-22 05:04:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2288615424. Throughput: 0: 42781.9. Samples: 2288704300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 05:04:39,501][15401] Updated weights for policy 0, policy_version 139690 (0.0038) [2024-06-22 05:04:43,144][15401] Updated weights for policy 0, policy_version 139700 (0.0030) [2024-06-22 05:04:43,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 2288844800. Throughput: 0: 42858.6. Samples: 2288958140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:43,393][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 05:04:47,082][15401] Updated weights for policy 0, policy_version 139710 (0.0026) [2024-06-22 05:04:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2289041408. Throughput: 0: 42932.2. Samples: 2289222800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:04:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 05:04:50,604][15401] Updated weights for policy 0, policy_version 139720 (0.0026) [2024-06-22 05:04:53,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2289270784. Throughput: 0: 43046.8. Samples: 2289354120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:04:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 05:04:54,516][15401] Updated weights for policy 0, policy_version 139730 (0.0030) [2024-06-22 05:04:58,032][15401] Updated weights for policy 0, policy_version 139740 (0.0043) [2024-06-22 05:04:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 2289500160. Throughput: 0: 43008.9. Samples: 2289602200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:04:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 05:05:02,376][15401] Updated weights for policy 0, policy_version 139750 (0.0051) [2024-06-22 05:05:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 2289713152. Throughput: 0: 43065.8. Samples: 2289869940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 05:05:05,434][15401] Updated weights for policy 0, policy_version 139760 (0.0031) [2024-06-22 05:05:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2289909760. Throughput: 0: 43044.9. Samples: 2289996320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 05:05:09,778][15401] Updated weights for policy 0, policy_version 139770 (0.0029) [2024-06-22 05:05:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2290139136. Throughput: 0: 43108.4. Samples: 2290252640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:13,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 05:05:13,583][15401] Updated weights for policy 0, policy_version 139780 (0.0035) [2024-06-22 05:05:17,294][15401] Updated weights for policy 0, policy_version 139790 (0.0044) [2024-06-22 05:05:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43419.4, 300 sec: 42932.3). Total num frames: 2290352128. Throughput: 0: 42950.8. Samples: 2290511940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:18,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 05:05:21,431][15401] Updated weights for policy 0, policy_version 139800 (0.0043) [2024-06-22 05:05:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2290548736. Throughput: 0: 43028.0. Samples: 2290640560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:23,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 05:05:24,626][15349] Signal inference workers to stop experience collection... (33800 times) [2024-06-22 05:05:24,627][15349] Signal inference workers to resume experience collection... (33800 times) [2024-06-22 05:05:24,655][15401] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-06-22 05:05:24,655][15401] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-06-22 05:05:25,096][15401] Updated weights for policy 0, policy_version 139810 (0.0038) [2024-06-22 05:05:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2290778112. Throughput: 0: 42983.2. Samples: 2290892280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 05:05:29,005][15401] Updated weights for policy 0, policy_version 139820 (0.0035) [2024-06-22 05:05:32,675][15401] Updated weights for policy 0, policy_version 139830 (0.0044) [2024-06-22 05:05:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2290991104. Throughput: 0: 42735.8. Samples: 2291145920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:33,400][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 05:05:36,678][15401] Updated weights for policy 0, policy_version 139840 (0.0057) [2024-06-22 05:05:38,391][15132] Fps is (10 sec: 40955.7, 60 sec: 42870.7, 300 sec: 42764.9). Total num frames: 2291187712. Throughput: 0: 42751.4. Samples: 2291277980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:38,391][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 05:05:40,308][15401] Updated weights for policy 0, policy_version 139850 (0.0038) [2024-06-22 05:05:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 2291417088. Throughput: 0: 42905.0. Samples: 2291532920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 05:05:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139858_2291433472.pth... [2024-06-22 05:05:43,560][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139230_2281144320.pth [2024-06-22 05:05:44,575][15401] Updated weights for policy 0, policy_version 139860 (0.0038) [2024-06-22 05:05:47,803][15401] Updated weights for policy 0, policy_version 139870 (0.0034) [2024-06-22 05:05:48,389][15132] Fps is (10 sec: 44241.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2291630080. Throughput: 0: 42771.7. Samples: 2291794660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 05:05:52,073][15401] Updated weights for policy 0, policy_version 139880 (0.0031) [2024-06-22 05:05:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2291826688. Throughput: 0: 42805.2. Samples: 2291922560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 05:05:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 05:05:55,672][15401] Updated weights for policy 0, policy_version 139890 (0.0041) [2024-06-22 05:05:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2292072448. Throughput: 0: 42833.7. Samples: 2292180160. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:05:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 05:05:59,659][15401] Updated weights for policy 0, policy_version 139900 (0.0022) [2024-06-22 05:06:03,378][15401] Updated weights for policy 0, policy_version 139910 (0.0045) [2024-06-22 05:06:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2292285440. Throughput: 0: 42832.8. Samples: 2292439420. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:03,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 05:06:07,205][15401] Updated weights for policy 0, policy_version 139920 (0.0028) [2024-06-22 05:06:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2292465664. Throughput: 0: 42828.4. Samples: 2292567840. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 05:06:10,995][15401] Updated weights for policy 0, policy_version 139930 (0.0039) [2024-06-22 05:06:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2292695040. Throughput: 0: 42904.4. Samples: 2292822980. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 05:06:15,458][15401] Updated weights for policy 0, policy_version 139940 (0.0040) [2024-06-22 05:06:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2292924416. Throughput: 0: 42981.1. Samples: 2293080060. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 05:06:18,483][15401] Updated weights for policy 0, policy_version 139950 (0.0039) [2024-06-22 05:06:23,242][15401] Updated weights for policy 0, policy_version 139960 (0.0043) [2024-06-22 05:06:23,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2293121024. Throughput: 0: 42877.4. Samples: 2293207520. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:23,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 05:06:26,054][15401] Updated weights for policy 0, policy_version 139970 (0.0035) [2024-06-22 05:06:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2293350400. Throughput: 0: 42878.5. Samples: 2293462460. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:28,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 05:06:30,674][15401] Updated weights for policy 0, policy_version 139980 (0.0038) [2024-06-22 05:06:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2293563392. Throughput: 0: 42748.3. Samples: 2293718340. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:33,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 05:06:34,256][15401] Updated weights for policy 0, policy_version 139990 (0.0048) [2024-06-22 05:06:38,087][15401] Updated weights for policy 0, policy_version 140000 (0.0023) [2024-06-22 05:06:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42872.2, 300 sec: 42765.0). Total num frames: 2293760000. Throughput: 0: 42708.9. Samples: 2293844460. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 05:06:42,073][15401] Updated weights for policy 0, policy_version 140010 (0.0042) [2024-06-22 05:06:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2293972992. Throughput: 0: 42759.9. Samples: 2294104360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:43,396][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 05:06:45,967][15401] Updated weights for policy 0, policy_version 140020 (0.0046) [2024-06-22 05:06:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2294185984. Throughput: 0: 42642.9. Samples: 2294358340. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 05:06:49,590][15401] Updated weights for policy 0, policy_version 140030 (0.0030) [2024-06-22 05:06:53,363][15401] Updated weights for policy 0, policy_version 140040 (0.0032) [2024-06-22 05:06:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2294415360. Throughput: 0: 42659.5. Samples: 2294487520. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 05:06:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 05:06:57,196][15401] Updated weights for policy 0, policy_version 140050 (0.0038) [2024-06-22 05:06:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2294628352. Throughput: 0: 42820.0. Samples: 2294749880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:06:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 05:07:00,833][15401] Updated weights for policy 0, policy_version 140060 (0.0027) [2024-06-22 05:07:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 2294841344. Throughput: 0: 42787.5. Samples: 2295005500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 05:07:04,734][15401] Updated weights for policy 0, policy_version 140070 (0.0032) [2024-06-22 05:07:06,422][15349] Signal inference workers to stop experience collection... (33850 times) [2024-06-22 05:07:06,422][15349] Signal inference workers to resume experience collection... (33850 times) [2024-06-22 05:07:06,467][15401] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-06-22 05:07:06,467][15401] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-06-22 05:07:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2295054336. Throughput: 0: 42768.9. Samples: 2295132120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:08,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 05:07:08,613][15401] Updated weights for policy 0, policy_version 140080 (0.0046) [2024-06-22 05:07:12,441][15401] Updated weights for policy 0, policy_version 140090 (0.0034) [2024-06-22 05:07:13,394][15132] Fps is (10 sec: 44217.2, 60 sec: 43141.4, 300 sec: 42819.9). Total num frames: 2295283712. Throughput: 0: 42748.4. Samples: 2295386320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:13,394][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 05:07:16,271][15401] Updated weights for policy 0, policy_version 140100 (0.0041) [2024-06-22 05:07:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2295480320. Throughput: 0: 42855.2. Samples: 2295646820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 05:07:20,024][15401] Updated weights for policy 0, policy_version 140110 (0.0029) [2024-06-22 05:07:23,390][15132] Fps is (10 sec: 40978.1, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 2295693312. Throughput: 0: 42772.5. Samples: 2295769220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 05:07:24,103][15401] Updated weights for policy 0, policy_version 140120 (0.0032) [2024-06-22 05:07:27,602][15401] Updated weights for policy 0, policy_version 140130 (0.0037) [2024-06-22 05:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2295906304. Throughput: 0: 42814.9. Samples: 2296031020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 05:07:31,642][15401] Updated weights for policy 0, policy_version 140140 (0.0041) [2024-06-22 05:07:33,391][15132] Fps is (10 sec: 42593.7, 60 sec: 42597.7, 300 sec: 42764.9). Total num frames: 2296119296. Throughput: 0: 42952.6. Samples: 2296291260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:33,391][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 05:07:35,283][15401] Updated weights for policy 0, policy_version 140150 (0.0028) [2024-06-22 05:07:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2296348672. Throughput: 0: 42931.5. Samples: 2296419440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 05:07:39,177][15401] Updated weights for policy 0, policy_version 140160 (0.0032) [2024-06-22 05:07:43,063][15401] Updated weights for policy 0, policy_version 140170 (0.0024) [2024-06-22 05:07:43,389][15132] Fps is (10 sec: 42603.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2296545280. Throughput: 0: 42734.3. Samples: 2296672920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 05:07:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140170_2296545280.pth... [2024-06-22 05:07:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139544_2286288896.pth [2024-06-22 05:07:47,104][15401] Updated weights for policy 0, policy_version 140180 (0.0039) [2024-06-22 05:07:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2296758272. Throughput: 0: 42747.2. Samples: 2296929120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 05:07:50,470][15401] Updated weights for policy 0, policy_version 140190 (0.0030) [2024-06-22 05:07:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2296971264. Throughput: 0: 42869.0. Samples: 2297061120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 05:07:54,583][15401] Updated weights for policy 0, policy_version 140200 (0.0030) [2024-06-22 05:07:58,085][15401] Updated weights for policy 0, policy_version 140210 (0.0035) [2024-06-22 05:07:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2297200640. Throughput: 0: 42949.9. Samples: 2297318880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 05:07:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 05:08:02,083][15401] Updated weights for policy 0, policy_version 140220 (0.0041) [2024-06-22 05:08:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2297397248. Throughput: 0: 42864.9. Samples: 2297575740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 05:08:06,121][15401] Updated weights for policy 0, policy_version 140230 (0.0027) [2024-06-22 05:08:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2297626624. Throughput: 0: 43013.0. Samples: 2297704800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 05:08:09,788][15401] Updated weights for policy 0, policy_version 140240 (0.0033) [2024-06-22 05:08:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42326.7, 300 sec: 42709.5). Total num frames: 2297823232. Throughput: 0: 42794.5. Samples: 2297956880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:13,393][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 05:08:13,707][15401] Updated weights for policy 0, policy_version 140250 (0.0029) [2024-06-22 05:08:17,366][15401] Updated weights for policy 0, policy_version 140260 (0.0039) [2024-06-22 05:08:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2298052608. Throughput: 0: 42842.3. Samples: 2298219120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 05:08:21,158][15401] Updated weights for policy 0, policy_version 140270 (0.0034) [2024-06-22 05:08:23,390][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2298281984. Throughput: 0: 42821.8. Samples: 2298346420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 05:08:24,900][15401] Updated weights for policy 0, policy_version 140280 (0.0030) [2024-06-22 05:08:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2298478592. Throughput: 0: 43115.0. Samples: 2298613100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 05:08:28,737][15401] Updated weights for policy 0, policy_version 140290 (0.0033) [2024-06-22 05:08:32,486][15401] Updated weights for policy 0, policy_version 140300 (0.0029) [2024-06-22 05:08:33,396][15132] Fps is (10 sec: 42571.1, 60 sec: 43140.7, 300 sec: 42875.2). Total num frames: 2298707968. Throughput: 0: 42967.5. Samples: 2298862940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:33,397][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 05:08:36,204][15401] Updated weights for policy 0, policy_version 140310 (0.0034) [2024-06-22 05:08:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2298937344. Throughput: 0: 42913.2. Samples: 2298992220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 05:08:40,296][15401] Updated weights for policy 0, policy_version 140320 (0.0038) [2024-06-22 05:08:43,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2299117568. Throughput: 0: 42936.7. Samples: 2299251020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 05:08:44,010][15401] Updated weights for policy 0, policy_version 140330 (0.0029) [2024-06-22 05:08:45,572][15349] Signal inference workers to stop experience collection... (33900 times) [2024-06-22 05:08:45,623][15401] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-06-22 05:08:45,691][15349] Signal inference workers to resume experience collection... (33900 times) [2024-06-22 05:08:45,691][15401] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-06-22 05:08:47,747][15401] Updated weights for policy 0, policy_version 140340 (0.0035) [2024-06-22 05:08:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2299346944. Throughput: 0: 42816.8. Samples: 2299502500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 05:08:51,844][15401] Updated weights for policy 0, policy_version 140350 (0.0036) [2024-06-22 05:08:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2299559936. Throughput: 0: 42898.0. Samples: 2299635220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 05:08:55,842][15401] Updated weights for policy 0, policy_version 140360 (0.0038) [2024-06-22 05:08:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2299772928. Throughput: 0: 43016.1. Samples: 2299892500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 05:08:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 05:08:59,337][15401] Updated weights for policy 0, policy_version 140370 (0.0031) [2024-06-22 05:09:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2299969536. Throughput: 0: 42893.8. Samples: 2300149340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 05:09:03,483][15401] Updated weights for policy 0, policy_version 140380 (0.0035) [2024-06-22 05:09:06,804][15401] Updated weights for policy 0, policy_version 140390 (0.0026) [2024-06-22 05:09:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 2300198912. Throughput: 0: 43032.4. Samples: 2300282980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:08,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 05:09:10,869][15401] Updated weights for policy 0, policy_version 140400 (0.0048) [2024-06-22 05:09:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43146.4, 300 sec: 42932.0). Total num frames: 2300411904. Throughput: 0: 42773.9. Samples: 2300537920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 05:09:14,428][15401] Updated weights for policy 0, policy_version 140410 (0.0029) [2024-06-22 05:09:18,389][15401] Updated weights for policy 0, policy_version 140420 (0.0038) [2024-06-22 05:09:18,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2300641280. Throughput: 0: 42889.2. Samples: 2300792680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 05:09:22,212][15401] Updated weights for policy 0, policy_version 140430 (0.0055) [2024-06-22 05:09:23,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2300837888. Throughput: 0: 42885.7. Samples: 2300922080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 05:09:26,129][15401] Updated weights for policy 0, policy_version 140440 (0.0041) [2024-06-22 05:09:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2301067264. Throughput: 0: 42783.1. Samples: 2301176260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 05:09:29,701][15401] Updated weights for policy 0, policy_version 140450 (0.0031) [2024-06-22 05:09:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 2301263872. Throughput: 0: 42932.4. Samples: 2301434460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 05:09:33,863][15401] Updated weights for policy 0, policy_version 140460 (0.0041) [2024-06-22 05:09:37,534][15401] Updated weights for policy 0, policy_version 140470 (0.0039) [2024-06-22 05:09:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2301493248. Throughput: 0: 42825.4. Samples: 2301562360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 05:09:41,492][15401] Updated weights for policy 0, policy_version 140480 (0.0037) [2024-06-22 05:09:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2301689856. Throughput: 0: 42791.6. Samples: 2301818120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 05:09:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140485_2301706240.pth... [2024-06-22 05:09:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000139858_2291433472.pth [2024-06-22 05:09:45,173][15401] Updated weights for policy 0, policy_version 140490 (0.0030) [2024-06-22 05:09:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 2301919232. Throughput: 0: 42781.4. Samples: 2302074600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:48,392][15132] Avg episode reward: [(0, '0.144')] [2024-06-22 05:09:49,391][15401] Updated weights for policy 0, policy_version 140500 (0.0042) [2024-06-22 05:09:52,794][15401] Updated weights for policy 0, policy_version 140510 (0.0036) [2024-06-22 05:09:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2302132224. Throughput: 0: 42623.1. Samples: 2302200920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 05:09:57,178][15401] Updated weights for policy 0, policy_version 140520 (0.0035) [2024-06-22 05:09:58,389][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2302328832. Throughput: 0: 42721.3. Samples: 2302460380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:09:58,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-22 05:10:00,602][15401] Updated weights for policy 0, policy_version 140530 (0.0032) [2024-06-22 05:10:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2302558208. Throughput: 0: 42786.3. Samples: 2302718060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 05:10:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 05:10:04,997][15401] Updated weights for policy 0, policy_version 140540 (0.0040) [2024-06-22 05:10:08,264][15401] Updated weights for policy 0, policy_version 140550 (0.0031) [2024-06-22 05:10:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 2302771200. Throughput: 0: 42778.7. Samples: 2302847120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 05:10:12,472][15401] Updated weights for policy 0, policy_version 140560 (0.0027) [2024-06-22 05:10:13,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 2302951424. Throughput: 0: 42701.7. Samples: 2303097940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:13,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 05:10:15,852][15401] Updated weights for policy 0, policy_version 140570 (0.0034) [2024-06-22 05:10:15,875][15349] Signal inference workers to stop experience collection... (33950 times) [2024-06-22 05:10:15,876][15349] Signal inference workers to resume experience collection... (33950 times) [2024-06-22 05:10:15,925][15401] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-06-22 05:10:15,925][15401] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-06-22 05:10:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2303197184. Throughput: 0: 42686.7. Samples: 2303355360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 05:10:20,301][15401] Updated weights for policy 0, policy_version 140580 (0.0040) [2024-06-22 05:10:23,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 2303410176. Throughput: 0: 42787.1. Samples: 2303487780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 05:10:23,427][15401] Updated weights for policy 0, policy_version 140590 (0.0025) [2024-06-22 05:10:27,863][15401] Updated weights for policy 0, policy_version 140600 (0.0037) [2024-06-22 05:10:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2303590400. Throughput: 0: 42767.5. Samples: 2303742660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 05:10:30,878][15401] Updated weights for policy 0, policy_version 140610 (0.0031) [2024-06-22 05:10:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42876.3). Total num frames: 2303836160. Throughput: 0: 42742.3. Samples: 2303997900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 05:10:35,435][15401] Updated weights for policy 0, policy_version 140620 (0.0032) [2024-06-22 05:10:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2304065536. Throughput: 0: 42863.7. Samples: 2304129780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:38,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 05:10:38,791][15401] Updated weights for policy 0, policy_version 140630 (0.0027) [2024-06-22 05:10:43,043][15401] Updated weights for policy 0, policy_version 140640 (0.0032) [2024-06-22 05:10:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2304262144. Throughput: 0: 42732.8. Samples: 2304383360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 05:10:46,351][15401] Updated weights for policy 0, policy_version 140650 (0.0033) [2024-06-22 05:10:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 2304491520. Throughput: 0: 42622.6. Samples: 2304636080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 05:10:50,735][15401] Updated weights for policy 0, policy_version 140660 (0.0031) [2024-06-22 05:10:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2304704512. Throughput: 0: 42692.1. Samples: 2304768260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 05:10:53,828][15401] Updated weights for policy 0, policy_version 140670 (0.0028) [2024-06-22 05:10:58,255][15401] Updated weights for policy 0, policy_version 140680 (0.0038) [2024-06-22 05:10:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2304901120. Throughput: 0: 42915.5. Samples: 2305029040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:10:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 05:11:01,562][15401] Updated weights for policy 0, policy_version 140690 (0.0032) [2024-06-22 05:11:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2305130496. Throughput: 0: 43005.9. Samples: 2305290620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:11:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 05:11:05,622][15401] Updated weights for policy 0, policy_version 140700 (0.0031) [2024-06-22 05:11:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2305343488. Throughput: 0: 43026.2. Samples: 2305423960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 05:11:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 05:11:08,974][15401] Updated weights for policy 0, policy_version 140710 (0.0031) [2024-06-22 05:11:13,058][15401] Updated weights for policy 0, policy_version 140720 (0.0041) [2024-06-22 05:11:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 2305556480. Throughput: 0: 43105.0. Samples: 2305682380. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 05:11:16,604][15401] Updated weights for policy 0, policy_version 140730 (0.0036) [2024-06-22 05:11:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2305769472. Throughput: 0: 43076.3. Samples: 2305936340. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 05:11:20,859][15401] Updated weights for policy 0, policy_version 140740 (0.0033) [2024-06-22 05:11:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2305982464. Throughput: 0: 43133.7. Samples: 2306070900. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:23,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 05:11:24,155][15401] Updated weights for policy 0, policy_version 140750 (0.0044) [2024-06-22 05:11:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2306195456. Throughput: 0: 43049.9. Samples: 2306320600. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:28,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 05:11:28,658][15401] Updated weights for policy 0, policy_version 140760 (0.0044) [2024-06-22 05:11:32,106][15401] Updated weights for policy 0, policy_version 140770 (0.0036) [2024-06-22 05:11:33,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2306424832. Throughput: 0: 43167.9. Samples: 2306578640. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 05:11:36,098][15401] Updated weights for policy 0, policy_version 140780 (0.0036) [2024-06-22 05:11:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2306637824. Throughput: 0: 43161.8. Samples: 2306710540. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 05:11:39,534][15401] Updated weights for policy 0, policy_version 140790 (0.0031) [2024-06-22 05:11:40,495][15349] Signal inference workers to stop experience collection... (34000 times) [2024-06-22 05:11:40,495][15349] Signal inference workers to resume experience collection... (34000 times) [2024-06-22 05:11:40,513][15401] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-06-22 05:11:40,513][15401] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-06-22 05:11:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2306850816. Throughput: 0: 43125.9. Samples: 2306969700. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 05:11:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140800_2306867200.pth... [2024-06-22 05:11:43,493][15401] Updated weights for policy 0, policy_version 140800 (0.0046) [2024-06-22 05:11:43,539][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140170_2296545280.pth [2024-06-22 05:11:47,104][15401] Updated weights for policy 0, policy_version 140810 (0.0030) [2024-06-22 05:11:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2307080192. Throughput: 0: 43042.2. Samples: 2307227520. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 05:11:51,148][15401] Updated weights for policy 0, policy_version 140820 (0.0045) [2024-06-22 05:11:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2307276800. Throughput: 0: 43008.9. Samples: 2307359360. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 05:11:54,589][15401] Updated weights for policy 0, policy_version 140830 (0.0032) [2024-06-22 05:11:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2307489792. Throughput: 0: 43073.7. Samples: 2307620700. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:11:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 05:11:58,918][15401] Updated weights for policy 0, policy_version 140840 (0.0035) [2024-06-22 05:12:02,141][15401] Updated weights for policy 0, policy_version 140850 (0.0032) [2024-06-22 05:12:03,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.8, 300 sec: 42987.2). Total num frames: 2307735552. Throughput: 0: 43065.2. Samples: 2307874380. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:12:03,393][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 05:12:06,489][15401] Updated weights for policy 0, policy_version 140860 (0.0035) [2024-06-22 05:12:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.7). Total num frames: 2307932160. Throughput: 0: 43105.0. Samples: 2308010520. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-22 05:12:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 05:12:09,852][15401] Updated weights for policy 0, policy_version 140870 (0.0031) [2024-06-22 05:12:13,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2308128768. Throughput: 0: 43224.5. Samples: 2308265700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 05:12:14,428][15401] Updated weights for policy 0, policy_version 140880 (0.0040) [2024-06-22 05:12:17,685][15401] Updated weights for policy 0, policy_version 140890 (0.0032) [2024-06-22 05:12:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2308358144. Throughput: 0: 43111.7. Samples: 2308518660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:18,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 05:12:21,960][15401] Updated weights for policy 0, policy_version 140900 (0.0049) [2024-06-22 05:12:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43146.1, 300 sec: 42931.6). Total num frames: 2308571136. Throughput: 0: 43135.9. Samples: 2308651660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 05:12:25,276][15401] Updated weights for policy 0, policy_version 140910 (0.0028) [2024-06-22 05:12:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.3). Total num frames: 2308767744. Throughput: 0: 43094.6. Samples: 2308908960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 05:12:29,710][15401] Updated weights for policy 0, policy_version 140920 (0.0031) [2024-06-22 05:12:32,871][15401] Updated weights for policy 0, policy_version 140930 (0.0021) [2024-06-22 05:12:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2309013504. Throughput: 0: 43047.2. Samples: 2309164640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 05:12:37,268][15401] Updated weights for policy 0, policy_version 140940 (0.0036) [2024-06-22 05:12:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2309193728. Throughput: 0: 43078.4. Samples: 2309297880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 05:12:40,583][15401] Updated weights for policy 0, policy_version 140950 (0.0037) [2024-06-22 05:12:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2309423104. Throughput: 0: 42989.6. Samples: 2309555240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 05:12:44,937][15401] Updated weights for policy 0, policy_version 140960 (0.0033) [2024-06-22 05:12:48,201][15401] Updated weights for policy 0, policy_version 140970 (0.0033) [2024-06-22 05:12:48,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 2309652480. Throughput: 0: 42969.0. Samples: 2309807980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:48,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 05:12:52,313][15349] Signal inference workers to stop experience collection... (34050 times) [2024-06-22 05:12:52,319][15349] Signal inference workers to resume experience collection... (34050 times) [2024-06-22 05:12:52,343][15401] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-06-22 05:12:52,343][15401] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-06-22 05:12:52,472][15401] Updated weights for policy 0, policy_version 140980 (0.0036) [2024-06-22 05:12:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2309865472. Throughput: 0: 42907.4. Samples: 2309941360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:53,391][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 05:12:55,677][15401] Updated weights for policy 0, policy_version 140990 (0.0026) [2024-06-22 05:12:58,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2310078464. Throughput: 0: 42891.0. Samples: 2310195800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:12:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 05:12:59,966][15401] Updated weights for policy 0, policy_version 141000 (0.0035) [2024-06-22 05:13:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 2310291456. Throughput: 0: 42956.8. Samples: 2310451720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:13:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 05:13:03,435][15401] Updated weights for policy 0, policy_version 141010 (0.0027) [2024-06-22 05:13:07,663][15401] Updated weights for policy 0, policy_version 141020 (0.0031) [2024-06-22 05:13:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 2310488064. Throughput: 0: 42842.5. Samples: 2310579560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:13:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 05:13:11,155][15401] Updated weights for policy 0, policy_version 141030 (0.0036) [2024-06-22 05:13:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2310717440. Throughput: 0: 42785.8. Samples: 2310834320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:13:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 05:13:15,219][15401] Updated weights for policy 0, policy_version 141040 (0.0034) [2024-06-22 05:13:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2310930432. Throughput: 0: 42913.3. Samples: 2311095740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 05:13:18,735][15401] Updated weights for policy 0, policy_version 141050 (0.0033) [2024-06-22 05:13:22,932][15401] Updated weights for policy 0, policy_version 141060 (0.0038) [2024-06-22 05:13:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2311127040. Throughput: 0: 42731.8. Samples: 2311220820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 05:13:26,358][15401] Updated weights for policy 0, policy_version 141070 (0.0034) [2024-06-22 05:13:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 2311356416. Throughput: 0: 42736.2. Samples: 2311478360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:28,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 05:13:30,488][15401] Updated weights for policy 0, policy_version 141080 (0.0029) [2024-06-22 05:13:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2311585792. Throughput: 0: 42750.2. Samples: 2311731640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 05:13:34,063][15401] Updated weights for policy 0, policy_version 141090 (0.0048) [2024-06-22 05:13:38,024][15401] Updated weights for policy 0, policy_version 141100 (0.0054) [2024-06-22 05:13:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2311782400. Throughput: 0: 42712.0. Samples: 2311863400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 05:13:41,859][15401] Updated weights for policy 0, policy_version 141110 (0.0030) [2024-06-22 05:13:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2311995392. Throughput: 0: 42705.5. Samples: 2312117540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 05:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141113_2311995392.pth... [2024-06-22 05:13:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140485_2301706240.pth [2024-06-22 05:13:45,607][15401] Updated weights for policy 0, policy_version 141120 (0.0026) [2024-06-22 05:13:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 2312208384. Throughput: 0: 42728.5. Samples: 2312374500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:48,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 05:13:49,711][15401] Updated weights for policy 0, policy_version 141130 (0.0033) [2024-06-22 05:13:53,372][15401] Updated weights for policy 0, policy_version 141140 (0.0030) [2024-06-22 05:13:53,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 2312437760. Throughput: 0: 42821.1. Samples: 2312506620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:53,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 05:13:57,208][15401] Updated weights for policy 0, policy_version 141150 (0.0037) [2024-06-22 05:13:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2312634368. Throughput: 0: 42794.6. Samples: 2312760080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:13:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 05:14:01,070][15401] Updated weights for policy 0, policy_version 141160 (0.0038) [2024-06-22 05:14:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2312847360. Throughput: 0: 42675.0. Samples: 2313016120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:14:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 05:14:05,067][15401] Updated weights for policy 0, policy_version 141170 (0.0024) [2024-06-22 05:14:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2313076736. Throughput: 0: 42744.5. Samples: 2313144320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:14:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 05:14:08,647][15401] Updated weights for policy 0, policy_version 141180 (0.0031) [2024-06-22 05:14:12,657][15401] Updated weights for policy 0, policy_version 141190 (0.0036) [2024-06-22 05:14:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2313273344. Throughput: 0: 42835.4. Samples: 2313405960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 05:14:13,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 05:14:16,197][15401] Updated weights for policy 0, policy_version 141200 (0.0026) [2024-06-22 05:14:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2313502720. Throughput: 0: 42905.4. Samples: 2313662380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 05:14:20,334][15401] Updated weights for policy 0, policy_version 141210 (0.0036) [2024-06-22 05:14:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2313715712. Throughput: 0: 42820.1. Samples: 2313790300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 05:14:23,725][15401] Updated weights for policy 0, policy_version 141220 (0.0028) [2024-06-22 05:14:28,137][15401] Updated weights for policy 0, policy_version 141230 (0.0043) [2024-06-22 05:14:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2313912320. Throughput: 0: 42919.5. Samples: 2314048920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 05:14:31,353][15401] Updated weights for policy 0, policy_version 141240 (0.0045) [2024-06-22 05:14:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 2314141696. Throughput: 0: 42749.8. Samples: 2314298240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 05:14:35,710][15401] Updated weights for policy 0, policy_version 141250 (0.0036) [2024-06-22 05:14:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2314338304. Throughput: 0: 42773.9. Samples: 2314431340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 05:14:38,926][15349] Signal inference workers to stop experience collection... (34100 times) [2024-06-22 05:14:38,926][15349] Signal inference workers to resume experience collection... (34100 times) [2024-06-22 05:14:38,951][15401] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-06-22 05:14:38,952][15401] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-06-22 05:14:39,076][15401] Updated weights for policy 0, policy_version 141260 (0.0028) [2024-06-22 05:14:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2314551296. Throughput: 0: 42848.1. Samples: 2314688240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 05:14:43,433][15401] Updated weights for policy 0, policy_version 141270 (0.0031) [2024-06-22 05:14:46,872][15401] Updated weights for policy 0, policy_version 141280 (0.0036) [2024-06-22 05:14:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2314797056. Throughput: 0: 42705.5. Samples: 2314937860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 05:14:51,264][15401] Updated weights for policy 0, policy_version 141290 (0.0027) [2024-06-22 05:14:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42598.4, 300 sec: 42931.3). Total num frames: 2314993664. Throughput: 0: 42821.3. Samples: 2315071380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:53,393][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 05:14:54,649][15401] Updated weights for policy 0, policy_version 141300 (0.0055) [2024-06-22 05:14:58,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2315190272. Throughput: 0: 42736.4. Samples: 2315329100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:14:58,391][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 05:14:58,822][15401] Updated weights for policy 0, policy_version 141310 (0.0034) [2024-06-22 05:15:02,629][15401] Updated weights for policy 0, policy_version 141320 (0.0032) [2024-06-22 05:15:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2315436032. Throughput: 0: 42466.2. Samples: 2315573360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:15:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 05:15:06,333][15401] Updated weights for policy 0, policy_version 141330 (0.0039) [2024-06-22 05:15:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 2315632640. Throughput: 0: 42733.2. Samples: 2315713300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:15:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 05:15:10,016][15401] Updated weights for policy 0, policy_version 141340 (0.0037) [2024-06-22 05:15:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2315829248. Throughput: 0: 42699.1. Samples: 2315970380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:15:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 05:15:14,243][15401] Updated weights for policy 0, policy_version 141350 (0.0040) [2024-06-22 05:15:17,770][15401] Updated weights for policy 0, policy_version 141360 (0.0025) [2024-06-22 05:15:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2316058624. Throughput: 0: 42810.9. Samples: 2316224740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:15:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 05:15:21,669][15401] Updated weights for policy 0, policy_version 141370 (0.0037) [2024-06-22 05:15:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2316288000. Throughput: 0: 42773.4. Samples: 2316356140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 05:15:25,601][15401] Updated weights for policy 0, policy_version 141380 (0.0030) [2024-06-22 05:15:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2316484608. Throughput: 0: 42744.9. Samples: 2316611760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 05:15:29,155][15401] Updated weights for policy 0, policy_version 141390 (0.0028) [2024-06-22 05:15:33,112][15401] Updated weights for policy 0, policy_version 141400 (0.0024) [2024-06-22 05:15:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2316713984. Throughput: 0: 43063.0. Samples: 2316875700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 05:15:36,699][15401] Updated weights for policy 0, policy_version 141410 (0.0028) [2024-06-22 05:15:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2316926976. Throughput: 0: 42855.6. Samples: 2316999780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 05:15:40,740][15401] Updated weights for policy 0, policy_version 141420 (0.0027) [2024-06-22 05:15:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2317139968. Throughput: 0: 42893.4. Samples: 2317259300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 05:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141427_2317139968.pth... [2024-06-22 05:15:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000140800_2306867200.pth [2024-06-22 05:15:44,280][15401] Updated weights for policy 0, policy_version 141430 (0.0033) [2024-06-22 05:15:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 2317336576. Throughput: 0: 43258.6. Samples: 2317520000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:48,404][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 05:15:48,592][15401] Updated weights for policy 0, policy_version 141440 (0.0036) [2024-06-22 05:15:52,129][15401] Updated weights for policy 0, policy_version 141450 (0.0035) [2024-06-22 05:15:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 2317565952. Throughput: 0: 42876.9. Samples: 2317642760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 05:15:56,207][15401] Updated weights for policy 0, policy_version 141460 (0.0026) [2024-06-22 05:15:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2317795328. Throughput: 0: 42953.3. Samples: 2317903280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:15:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 05:15:59,750][15401] Updated weights for policy 0, policy_version 141470 (0.0027) [2024-06-22 05:16:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2317975552. Throughput: 0: 43022.4. Samples: 2318160740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:16:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 05:16:03,837][15401] Updated weights for policy 0, policy_version 141480 (0.0040) [2024-06-22 05:16:07,228][15401] Updated weights for policy 0, policy_version 141490 (0.0037) [2024-06-22 05:16:08,195][15349] Signal inference workers to stop experience collection... (34150 times) [2024-06-22 05:16:08,196][15349] Signal inference workers to resume experience collection... (34150 times) [2024-06-22 05:16:08,227][15401] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-06-22 05:16:08,227][15401] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-06-22 05:16:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 2318221312. Throughput: 0: 42942.7. Samples: 2318288560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:16:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 05:16:11,275][15401] Updated weights for policy 0, policy_version 141500 (0.0036) [2024-06-22 05:16:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 2318434304. Throughput: 0: 43139.4. Samples: 2318553140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:16:13,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 05:16:14,754][15401] Updated weights for policy 0, policy_version 141510 (0.0029) [2024-06-22 05:16:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2318630912. Throughput: 0: 42866.1. Samples: 2318804680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:16:18,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-22 05:16:18,812][15401] Updated weights for policy 0, policy_version 141520 (0.0038) [2024-06-22 05:16:22,317][15401] Updated weights for policy 0, policy_version 141530 (0.0035) [2024-06-22 05:16:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2318843904. Throughput: 0: 42917.3. Samples: 2318931060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:16:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 05:16:26,370][15401] Updated weights for policy 0, policy_version 141540 (0.0027) [2024-06-22 05:16:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2319073280. Throughput: 0: 42868.9. Samples: 2319188400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:28,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 05:16:30,014][15401] Updated weights for policy 0, policy_version 141550 (0.0036) [2024-06-22 05:16:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2319286272. Throughput: 0: 42690.2. Samples: 2319441060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 05:16:34,130][15401] Updated weights for policy 0, policy_version 141560 (0.0035) [2024-06-22 05:16:38,010][15401] Updated weights for policy 0, policy_version 141570 (0.0036) [2024-06-22 05:16:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2319499264. Throughput: 0: 42852.4. Samples: 2319571120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:38,398][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 05:16:42,142][15401] Updated weights for policy 0, policy_version 141580 (0.0040) [2024-06-22 05:16:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2319728640. Throughput: 0: 42971.1. Samples: 2319836980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:43,398][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 05:16:45,405][15401] Updated weights for policy 0, policy_version 141590 (0.0040) [2024-06-22 05:16:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2319941632. Throughput: 0: 43001.4. Samples: 2320095800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 05:16:49,497][15401] Updated weights for policy 0, policy_version 141600 (0.0021) [2024-06-22 05:16:52,817][15401] Updated weights for policy 0, policy_version 141610 (0.0039) [2024-06-22 05:16:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2320154624. Throughput: 0: 42933.1. Samples: 2320220560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:53,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 05:16:56,887][15401] Updated weights for policy 0, policy_version 141620 (0.0030) [2024-06-22 05:16:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2320351232. Throughput: 0: 42864.2. Samples: 2320481920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:16:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 05:17:00,198][15401] Updated weights for policy 0, policy_version 141630 (0.0030) [2024-06-22 05:17:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2320564224. Throughput: 0: 43041.5. Samples: 2320741540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:17:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 05:17:04,763][15401] Updated weights for policy 0, policy_version 141640 (0.0035) [2024-06-22 05:17:08,082][15401] Updated weights for policy 0, policy_version 141650 (0.0038) [2024-06-22 05:17:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2320793600. Throughput: 0: 43152.5. Samples: 2320872920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:17:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 05:17:12,250][15401] Updated weights for policy 0, policy_version 141660 (0.0042) [2024-06-22 05:17:13,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42322.6, 300 sec: 42764.1). Total num frames: 2320973824. Throughput: 0: 43122.9. Samples: 2321129200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:17:13,396][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 05:17:15,571][15401] Updated weights for policy 0, policy_version 141670 (0.0030) [2024-06-22 05:17:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2321219584. Throughput: 0: 43038.7. Samples: 2321377800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:17:18,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-22 05:17:19,889][15401] Updated weights for policy 0, policy_version 141680 (0.0032) [2024-06-22 05:17:23,386][15401] Updated weights for policy 0, policy_version 141690 (0.0032) [2024-06-22 05:17:23,389][15132] Fps is (10 sec: 47543.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2321448960. Throughput: 0: 43236.4. Samples: 2321516760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 05:17:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 05:17:28,327][15401] Updated weights for policy 0, policy_version 141700 (0.0031) [2024-06-22 05:17:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2321612800. Throughput: 0: 42776.4. Samples: 2321761920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 05:17:31,149][15401] Updated weights for policy 0, policy_version 141710 (0.0037) [2024-06-22 05:17:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2321858560. Throughput: 0: 42614.1. Samples: 2322013440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:33,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 05:17:35,835][15401] Updated weights for policy 0, policy_version 141720 (0.0035) [2024-06-22 05:17:37,135][15349] Signal inference workers to stop experience collection... (34200 times) [2024-06-22 05:17:37,135][15349] Signal inference workers to resume experience collection... (34200 times) [2024-06-22 05:17:37,149][15401] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-06-22 05:17:37,149][15401] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-06-22 05:17:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2322071552. Throughput: 0: 42855.2. Samples: 2322149040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 05:17:38,609][15401] Updated weights for policy 0, policy_version 141730 (0.0040) [2024-06-22 05:17:43,311][15401] Updated weights for policy 0, policy_version 141740 (0.0032) [2024-06-22 05:17:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 2322268160. Throughput: 0: 42816.3. Samples: 2322408660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 05:17:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141741_2322284544.pth... [2024-06-22 05:17:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141113_2311995392.pth [2024-06-22 05:17:46,658][15401] Updated weights for policy 0, policy_version 141750 (0.0036) [2024-06-22 05:17:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2322513920. Throughput: 0: 42622.5. Samples: 2322659560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:48,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 05:17:51,060][15401] Updated weights for policy 0, policy_version 141760 (0.0032) [2024-06-22 05:17:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2322726912. Throughput: 0: 42719.9. Samples: 2322795320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 05:17:54,007][15401] Updated weights for policy 0, policy_version 141770 (0.0031) [2024-06-22 05:17:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2322890752. Throughput: 0: 42605.6. Samples: 2323046180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:17:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 05:17:58,657][15401] Updated weights for policy 0, policy_version 141780 (0.0034) [2024-06-22 05:18:02,073][15401] Updated weights for policy 0, policy_version 141790 (0.0031) [2024-06-22 05:18:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2323152896. Throughput: 0: 42650.1. Samples: 2323297060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 05:18:06,275][15401] Updated weights for policy 0, policy_version 141800 (0.0028) [2024-06-22 05:18:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2323349504. Throughput: 0: 42543.0. Samples: 2323431200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 05:18:09,434][15401] Updated weights for policy 0, policy_version 141810 (0.0033) [2024-06-22 05:18:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 2323546112. Throughput: 0: 42782.7. Samples: 2323687140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 05:18:13,810][15401] Updated weights for policy 0, policy_version 141820 (0.0039) [2024-06-22 05:18:16,932][15401] Updated weights for policy 0, policy_version 141830 (0.0045) [2024-06-22 05:18:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2323791872. Throughput: 0: 42847.1. Samples: 2323941560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 05:18:21,512][15401] Updated weights for policy 0, policy_version 141840 (0.0039) [2024-06-22 05:18:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2323988480. Throughput: 0: 42835.9. Samples: 2324076660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 05:18:24,552][15401] Updated weights for policy 0, policy_version 141850 (0.0037) [2024-06-22 05:18:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2324201472. Throughput: 0: 42672.9. Samples: 2324328940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 05:18:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 05:18:29,143][15401] Updated weights for policy 0, policy_version 141860 (0.0037) [2024-06-22 05:18:32,138][15401] Updated weights for policy 0, policy_version 141870 (0.0035) [2024-06-22 05:18:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2324430848. Throughput: 0: 42814.7. Samples: 2324586220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 05:18:36,797][15401] Updated weights for policy 0, policy_version 141880 (0.0039) [2024-06-22 05:18:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2324643840. Throughput: 0: 42778.1. Samples: 2324720340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 05:18:39,671][15401] Updated weights for policy 0, policy_version 141890 (0.0035) [2024-06-22 05:18:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2324840448. Throughput: 0: 42837.8. Samples: 2324973880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 05:18:44,642][15401] Updated weights for policy 0, policy_version 141900 (0.0030) [2024-06-22 05:18:47,243][15401] Updated weights for policy 0, policy_version 141910 (0.0028) [2024-06-22 05:18:48,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 2325086208. Throughput: 0: 42927.2. Samples: 2325228880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:48,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 05:18:52,287][15401] Updated weights for policy 0, policy_version 141920 (0.0036) [2024-06-22 05:18:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 2325282816. Throughput: 0: 43070.2. Samples: 2325369460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:53,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 05:18:54,867][15401] Updated weights for policy 0, policy_version 141930 (0.0026) [2024-06-22 05:18:58,390][15132] Fps is (10 sec: 39330.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2325479424. Throughput: 0: 42769.3. Samples: 2325611760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:18:58,391][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 05:18:59,878][15401] Updated weights for policy 0, policy_version 141940 (0.0022) [2024-06-22 05:19:00,981][15349] Signal inference workers to stop experience collection... (34250 times) [2024-06-22 05:19:01,035][15401] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-06-22 05:19:01,101][15349] Signal inference workers to resume experience collection... (34250 times) [2024-06-22 05:19:01,101][15401] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-06-22 05:19:02,587][15401] Updated weights for policy 0, policy_version 141950 (0.0024) [2024-06-22 05:19:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2325725184. Throughput: 0: 42780.4. Samples: 2325866680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:03,391][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 05:19:07,972][15401] Updated weights for policy 0, policy_version 141960 (0.0043) [2024-06-22 05:19:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2325889024. Throughput: 0: 42899.7. Samples: 2326007140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 05:19:10,105][15401] Updated weights for policy 0, policy_version 141970 (0.0032) [2024-06-22 05:19:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2326134784. Throughput: 0: 42795.1. Samples: 2326254720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 05:19:15,461][15401] Updated weights for policy 0, policy_version 141980 (0.0029) [2024-06-22 05:19:17,743][15401] Updated weights for policy 0, policy_version 141990 (0.0027) [2024-06-22 05:19:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2326364160. Throughput: 0: 42682.3. Samples: 2326506920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 05:19:23,014][15401] Updated weights for policy 0, policy_version 142000 (0.0039) [2024-06-22 05:19:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2326544384. Throughput: 0: 42764.5. Samples: 2326644840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:23,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 05:19:25,899][15401] Updated weights for policy 0, policy_version 142010 (0.0023) [2024-06-22 05:19:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2326790144. Throughput: 0: 42688.9. Samples: 2326894880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 05:19:30,424][15401] Updated weights for policy 0, policy_version 142020 (0.0036) [2024-06-22 05:19:33,345][15401] Updated weights for policy 0, policy_version 142030 (0.0030) [2024-06-22 05:19:33,390][15132] Fps is (10 sec: 47525.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2327019520. Throughput: 0: 42913.8. Samples: 2327159900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:19:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 05:19:38,087][15401] Updated weights for policy 0, policy_version 142040 (0.0032) [2024-06-22 05:19:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 2327183360. Throughput: 0: 42697.8. Samples: 2327290860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:19:38,392][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 05:19:40,764][15401] Updated weights for policy 0, policy_version 142050 (0.0029) [2024-06-22 05:19:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2327445504. Throughput: 0: 42958.6. Samples: 2327544900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:19:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 05:19:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142056_2327445504.pth... [2024-06-22 05:19:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141427_2317139968.pth [2024-06-22 05:19:45,431][15401] Updated weights for policy 0, policy_version 142060 (0.0030) [2024-06-22 05:19:48,328][15401] Updated weights for policy 0, policy_version 142070 (0.0037) [2024-06-22 05:19:48,390][15132] Fps is (10 sec: 49163.7, 60 sec: 43146.2, 300 sec: 42987.5). Total num frames: 2327674880. Throughput: 0: 42990.7. Samples: 2327801260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:19:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 05:19:53,204][15401] Updated weights for policy 0, policy_version 142080 (0.0034) [2024-06-22 05:19:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 2327838720. Throughput: 0: 42828.3. Samples: 2327934420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:19:53,391][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 05:19:56,318][15401] Updated weights for policy 0, policy_version 142090 (0.0028) [2024-06-22 05:19:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 2328100864. Throughput: 0: 43095.6. Samples: 2328194020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:19:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 05:20:00,694][15401] Updated weights for policy 0, policy_version 142100 (0.0038) [2024-06-22 05:20:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2328297472. Throughput: 0: 43307.0. Samples: 2328455740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 05:20:03,946][15401] Updated weights for policy 0, policy_version 142110 (0.0027) [2024-06-22 05:20:08,245][15401] Updated weights for policy 0, policy_version 142120 (0.0046) [2024-06-22 05:20:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2328494080. Throughput: 0: 42950.3. Samples: 2328577500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 05:20:09,970][15349] Signal inference workers to stop experience collection... (34300 times) [2024-06-22 05:20:09,972][15349] Signal inference workers to resume experience collection... (34300 times) [2024-06-22 05:20:10,016][15401] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-06-22 05:20:10,016][15401] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-06-22 05:20:11,638][15401] Updated weights for policy 0, policy_version 142130 (0.0027) [2024-06-22 05:20:13,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43688.9, 300 sec: 43042.4). Total num frames: 2328756224. Throughput: 0: 43333.7. Samples: 2328845000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:13,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 05:20:15,718][15401] Updated weights for policy 0, policy_version 142140 (0.0036) [2024-06-22 05:20:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2328936448. Throughput: 0: 43150.8. Samples: 2329101680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:18,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 05:20:19,093][15401] Updated weights for policy 0, policy_version 142150 (0.0031) [2024-06-22 05:20:23,323][15401] Updated weights for policy 0, policy_version 142160 (0.0044) [2024-06-22 05:20:23,396][15132] Fps is (10 sec: 39306.2, 60 sec: 43414.7, 300 sec: 42930.7). Total num frames: 2329149440. Throughput: 0: 42982.9. Samples: 2329225260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:23,396][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 05:20:26,576][15401] Updated weights for policy 0, policy_version 142170 (0.0033) [2024-06-22 05:20:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 2329395200. Throughput: 0: 43275.2. Samples: 2329492280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:28,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 05:20:31,391][15401] Updated weights for policy 0, policy_version 142180 (0.0034) [2024-06-22 05:20:33,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2329575424. Throughput: 0: 43153.7. Samples: 2329743180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 05:20:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 05:20:34,312][15401] Updated weights for policy 0, policy_version 142190 (0.0033) [2024-06-22 05:20:38,390][15132] Fps is (10 sec: 37682.5, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 2329772032. Throughput: 0: 42962.7. Samples: 2329867740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:20:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 05:20:38,947][15401] Updated weights for policy 0, policy_version 142200 (0.0028) [2024-06-22 05:20:41,972][15401] Updated weights for policy 0, policy_version 142210 (0.0037) [2024-06-22 05:20:43,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 2330017792. Throughput: 0: 43027.9. Samples: 2330130380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:20:43,392][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 05:20:46,269][15401] Updated weights for policy 0, policy_version 142220 (0.0024) [2024-06-22 05:20:48,390][15132] Fps is (10 sec: 45872.2, 60 sec: 42597.9, 300 sec: 42931.5). Total num frames: 2330230784. Throughput: 0: 42909.1. Samples: 2330386680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:20:48,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 05:20:49,603][15401] Updated weights for policy 0, policy_version 142230 (0.0033) [2024-06-22 05:20:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2330427392. Throughput: 0: 43022.1. Samples: 2330513500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:20:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 05:20:54,014][15401] Updated weights for policy 0, policy_version 142240 (0.0041) [2024-06-22 05:20:57,273][15401] Updated weights for policy 0, policy_version 142250 (0.0027) [2024-06-22 05:20:58,389][15132] Fps is (10 sec: 42601.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2330656768. Throughput: 0: 42782.8. Samples: 2330770120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:20:58,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 05:21:01,572][15401] Updated weights for policy 0, policy_version 142260 (0.0037) [2024-06-22 05:21:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2330869760. Throughput: 0: 42827.8. Samples: 2331028940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 05:21:05,424][15349] Signal inference workers to stop experience collection... (34350 times) [2024-06-22 05:21:05,472][15401] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-06-22 05:21:05,483][15349] Signal inference workers to resume experience collection... (34350 times) [2024-06-22 05:21:05,492][15401] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-06-22 05:21:05,494][15401] Updated weights for policy 0, policy_version 142270 (0.0028) [2024-06-22 05:21:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2331066368. Throughput: 0: 42802.9. Samples: 2331151120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 05:21:09,043][15401] Updated weights for policy 0, policy_version 142280 (0.0028) [2024-06-22 05:21:12,912][15401] Updated weights for policy 0, policy_version 142290 (0.0031) [2024-06-22 05:21:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 2331312128. Throughput: 0: 42687.9. Samples: 2331413240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 05:21:16,583][15401] Updated weights for policy 0, policy_version 142300 (0.0030) [2024-06-22 05:21:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2331475968. Throughput: 0: 42986.8. Samples: 2331677580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 05:21:20,437][15401] Updated weights for policy 0, policy_version 142310 (0.0034) [2024-06-22 05:21:23,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42871.5, 300 sec: 42875.2). Total num frames: 2331721728. Throughput: 0: 42844.3. Samples: 2331796000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:23,396][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 05:21:24,765][15401] Updated weights for policy 0, policy_version 142320 (0.0036) [2024-06-22 05:21:28,216][15401] Updated weights for policy 0, policy_version 142330 (0.0037) [2024-06-22 05:21:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 2331951104. Throughput: 0: 42759.6. Samples: 2332054460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:28,396][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 05:21:32,302][15401] Updated weights for policy 0, policy_version 142340 (0.0036) [2024-06-22 05:21:33,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2332131328. Throughput: 0: 42908.7. Samples: 2332317540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 05:21:35,716][15401] Updated weights for policy 0, policy_version 142350 (0.0026) [2024-06-22 05:21:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2332360704. Throughput: 0: 42800.8. Samples: 2332439540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 05:21:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 05:21:39,854][15401] Updated weights for policy 0, policy_version 142360 (0.0032) [2024-06-22 05:21:43,299][15401] Updated weights for policy 0, policy_version 142370 (0.0040) [2024-06-22 05:21:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 2332590080. Throughput: 0: 42771.4. Samples: 2332694840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:21:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 05:21:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142370_2332590080.pth... [2024-06-22 05:21:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000141741_2322284544.pth [2024-06-22 05:21:47,867][15401] Updated weights for policy 0, policy_version 142380 (0.0029) [2024-06-22 05:21:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.8, 300 sec: 42765.0). Total num frames: 2332770304. Throughput: 0: 42837.8. Samples: 2332956640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:21:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 05:21:50,810][15401] Updated weights for policy 0, policy_version 142390 (0.0031) [2024-06-22 05:21:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2332999680. Throughput: 0: 42813.8. Samples: 2333077740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:21:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 05:21:55,739][15401] Updated weights for policy 0, policy_version 142400 (0.0047) [2024-06-22 05:21:58,347][15401] Updated weights for policy 0, policy_version 142410 (0.0023) [2024-06-22 05:21:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 2333245440. Throughput: 0: 42738.6. Samples: 2333336480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:21:58,391][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 05:22:03,359][15401] Updated weights for policy 0, policy_version 142420 (0.0027) [2024-06-22 05:22:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2333409280. Throughput: 0: 42691.8. Samples: 2333598720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 05:22:06,161][15401] Updated weights for policy 0, policy_version 142430 (0.0042) [2024-06-22 05:22:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42932.6). Total num frames: 2333638656. Throughput: 0: 42692.7. Samples: 2333716900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 05:22:11,087][15401] Updated weights for policy 0, policy_version 142440 (0.0039) [2024-06-22 05:22:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2333868032. Throughput: 0: 42742.4. Samples: 2333977880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 05:22:13,621][15349] Signal inference workers to stop experience collection... (34400 times) [2024-06-22 05:22:13,672][15401] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-06-22 05:22:13,678][15349] Signal inference workers to resume experience collection... (34400 times) [2024-06-22 05:22:13,689][15401] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-06-22 05:22:13,814][15401] Updated weights for policy 0, policy_version 142450 (0.0027) [2024-06-22 05:22:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2334048256. Throughput: 0: 42536.0. Samples: 2334231660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 05:22:18,631][15401] Updated weights for policy 0, policy_version 142460 (0.0028) [2024-06-22 05:22:21,503][15401] Updated weights for policy 0, policy_version 142470 (0.0031) [2024-06-22 05:22:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42876.0, 300 sec: 42987.2). Total num frames: 2334294016. Throughput: 0: 42563.1. Samples: 2334354880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 05:22:26,108][15401] Updated weights for policy 0, policy_version 142480 (0.0029) [2024-06-22 05:22:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2334507008. Throughput: 0: 42879.7. Samples: 2334624420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 05:22:29,181][15401] Updated weights for policy 0, policy_version 142490 (0.0027) [2024-06-22 05:22:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2334703616. Throughput: 0: 42620.6. Samples: 2334874560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 05:22:33,541][15401] Updated weights for policy 0, policy_version 142500 (0.0041) [2024-06-22 05:22:36,813][15401] Updated weights for policy 0, policy_version 142510 (0.0036) [2024-06-22 05:22:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2334932992. Throughput: 0: 42684.4. Samples: 2334998540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 05:22:41,115][15401] Updated weights for policy 0, policy_version 142520 (0.0036) [2024-06-22 05:22:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2335129600. Throughput: 0: 42808.6. Samples: 2335262860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:22:43,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 05:22:44,502][15401] Updated weights for policy 0, policy_version 142530 (0.0031) [2024-06-22 05:22:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2335342592. Throughput: 0: 42704.6. Samples: 2335520420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:22:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 05:22:48,838][15401] Updated weights for policy 0, policy_version 142540 (0.0035) [2024-06-22 05:22:52,374][15401] Updated weights for policy 0, policy_version 142550 (0.0027) [2024-06-22 05:22:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2335588352. Throughput: 0: 42841.7. Samples: 2335644780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:22:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 05:22:56,722][15401] Updated weights for policy 0, policy_version 142560 (0.0030) [2024-06-22 05:22:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2335784960. Throughput: 0: 42821.5. Samples: 2335904840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:22:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 05:22:59,978][15401] Updated weights for policy 0, policy_version 142570 (0.0038) [2024-06-22 05:23:03,396][15132] Fps is (10 sec: 40934.1, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 2335997952. Throughput: 0: 42985.5. Samples: 2336166280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:03,396][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 05:23:04,233][15401] Updated weights for policy 0, policy_version 142580 (0.0035) [2024-06-22 05:23:07,450][15401] Updated weights for policy 0, policy_version 142590 (0.0036) [2024-06-22 05:23:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2336227328. Throughput: 0: 42996.9. Samples: 2336289740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 05:23:11,978][15401] Updated weights for policy 0, policy_version 142600 (0.0042) [2024-06-22 05:23:13,392][15132] Fps is (10 sec: 42615.4, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 2336423936. Throughput: 0: 42835.9. Samples: 2336552140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:13,392][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 05:23:15,148][15401] Updated weights for policy 0, policy_version 142610 (0.0031) [2024-06-22 05:23:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2336620544. Throughput: 0: 42975.4. Samples: 2336808460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:18,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 05:23:19,454][15401] Updated weights for policy 0, policy_version 142620 (0.0027) [2024-06-22 05:23:22,699][15401] Updated weights for policy 0, policy_version 142630 (0.0024) [2024-06-22 05:23:22,725][15349] Signal inference workers to stop experience collection... (34450 times) [2024-06-22 05:23:22,725][15349] Signal inference workers to resume experience collection... (34450 times) [2024-06-22 05:23:22,742][15401] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-06-22 05:23:22,768][15401] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-06-22 05:23:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2336866304. Throughput: 0: 42962.7. Samples: 2336931860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 05:23:26,951][15401] Updated weights for policy 0, policy_version 142640 (0.0035) [2024-06-22 05:23:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2337062912. Throughput: 0: 42934.2. Samples: 2337194900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 05:23:30,460][15401] Updated weights for policy 0, policy_version 142650 (0.0040) [2024-06-22 05:23:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2337275904. Throughput: 0: 43085.3. Samples: 2337459260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 05:23:34,492][15401] Updated weights for policy 0, policy_version 142660 (0.0023) [2024-06-22 05:23:37,903][15401] Updated weights for policy 0, policy_version 142670 (0.0036) [2024-06-22 05:23:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2337521664. Throughput: 0: 43069.4. Samples: 2337582900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 05:23:42,363][15401] Updated weights for policy 0, policy_version 142680 (0.0036) [2024-06-22 05:23:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2337718272. Throughput: 0: 43193.4. Samples: 2337848540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-22 05:23:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 05:23:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142683_2337718272.pth... [2024-06-22 05:23:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142056_2327445504.pth [2024-06-22 05:23:45,377][15401] Updated weights for policy 0, policy_version 142690 (0.0027) [2024-06-22 05:23:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 2337931264. Throughput: 0: 43039.1. Samples: 2338102760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:23:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 05:23:49,901][15401] Updated weights for policy 0, policy_version 142700 (0.0043) [2024-06-22 05:23:52,813][15401] Updated weights for policy 0, policy_version 142710 (0.0049) [2024-06-22 05:23:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2338177024. Throughput: 0: 43202.7. Samples: 2338233860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:23:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 05:23:57,619][15401] Updated weights for policy 0, policy_version 142720 (0.0039) [2024-06-22 05:23:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2338357248. Throughput: 0: 43190.8. Samples: 2338495620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:23:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 05:24:00,265][15401] Updated weights for policy 0, policy_version 142730 (0.0043) [2024-06-22 05:24:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42876.0, 300 sec: 42987.2). Total num frames: 2338570240. Throughput: 0: 43143.6. Samples: 2338749920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 05:24:05,141][15401] Updated weights for policy 0, policy_version 142740 (0.0038) [2024-06-22 05:24:07,785][15401] Updated weights for policy 0, policy_version 142750 (0.0033) [2024-06-22 05:24:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2338816000. Throughput: 0: 43246.8. Samples: 2338877960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 05:24:12,510][15401] Updated weights for policy 0, policy_version 142760 (0.0037) [2024-06-22 05:24:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 2339012608. Throughput: 0: 43316.0. Samples: 2339144120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:13,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 05:24:15,410][15401] Updated weights for policy 0, policy_version 142770 (0.0030) [2024-06-22 05:24:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.7, 300 sec: 42987.5). Total num frames: 2339225600. Throughput: 0: 43027.1. Samples: 2339395480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 05:24:20,158][15401] Updated weights for policy 0, policy_version 142780 (0.0034) [2024-06-22 05:24:22,987][15401] Updated weights for policy 0, policy_version 142790 (0.0032) [2024-06-22 05:24:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2339471360. Throughput: 0: 43228.4. Samples: 2339528180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:23,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 05:24:27,946][15401] Updated weights for policy 0, policy_version 142800 (0.0037) [2024-06-22 05:24:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2339651584. Throughput: 0: 43105.7. Samples: 2339788300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 05:24:31,083][15401] Updated weights for policy 0, policy_version 142810 (0.0042) [2024-06-22 05:24:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 43043.1). Total num frames: 2339880960. Throughput: 0: 43032.4. Samples: 2340039220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 05:24:35,356][15401] Updated weights for policy 0, policy_version 142820 (0.0033) [2024-06-22 05:24:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2340093952. Throughput: 0: 43141.9. Samples: 2340175240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 05:24:38,822][15401] Updated weights for policy 0, policy_version 142830 (0.0040) [2024-06-22 05:24:42,820][15401] Updated weights for policy 0, policy_version 142840 (0.0044) [2024-06-22 05:24:43,390][15132] Fps is (10 sec: 40956.0, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 2340290560. Throughput: 0: 43040.9. Samples: 2340432500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:43,391][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 05:24:44,653][15349] Signal inference workers to stop experience collection... (34500 times) [2024-06-22 05:24:44,702][15401] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-06-22 05:24:44,710][15349] Signal inference workers to resume experience collection... (34500 times) [2024-06-22 05:24:44,716][15401] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-06-22 05:24:46,347][15401] Updated weights for policy 0, policy_version 142850 (0.0038) [2024-06-22 05:24:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2340536320. Throughput: 0: 42874.3. Samples: 2340679260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:24:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 05:24:50,880][15401] Updated weights for policy 0, policy_version 142860 (0.0041) [2024-06-22 05:24:53,389][15132] Fps is (10 sec: 44241.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2340732928. Throughput: 0: 42992.0. Samples: 2340812600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:24:53,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 05:24:54,237][15401] Updated weights for policy 0, policy_version 142870 (0.0041) [2024-06-22 05:24:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2340929536. Throughput: 0: 42757.8. Samples: 2341068220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:24:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 05:24:58,658][15401] Updated weights for policy 0, policy_version 142880 (0.0041) [2024-06-22 05:25:01,860][15401] Updated weights for policy 0, policy_version 142890 (0.0045) [2024-06-22 05:25:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2341175296. Throughput: 0: 42764.9. Samples: 2341319900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:03,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 05:25:06,260][15401] Updated weights for policy 0, policy_version 142900 (0.0028) [2024-06-22 05:25:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 2341371904. Throughput: 0: 42832.5. Samples: 2341455640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:08,395][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 05:25:09,342][15401] Updated weights for policy 0, policy_version 142910 (0.0027) [2024-06-22 05:25:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2341568512. Throughput: 0: 42738.7. Samples: 2341711540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:13,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 05:25:13,739][15401] Updated weights for policy 0, policy_version 142920 (0.0034) [2024-06-22 05:25:16,971][15401] Updated weights for policy 0, policy_version 142930 (0.0025) [2024-06-22 05:25:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 2341814272. Throughput: 0: 42828.4. Samples: 2341966500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:18,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 05:25:21,258][15401] Updated weights for policy 0, policy_version 142940 (0.0037) [2024-06-22 05:25:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2342010880. Throughput: 0: 42828.0. Samples: 2342102500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 05:25:24,594][15401] Updated weights for policy 0, policy_version 142950 (0.0036) [2024-06-22 05:25:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2342240256. Throughput: 0: 42699.0. Samples: 2342353920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:28,396][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 05:25:29,432][15401] Updated weights for policy 0, policy_version 142960 (0.0032) [2024-06-22 05:25:32,161][15401] Updated weights for policy 0, policy_version 142970 (0.0037) [2024-06-22 05:25:33,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 2342469632. Throughput: 0: 42863.3. Samples: 2342608120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 05:25:36,952][15401] Updated weights for policy 0, policy_version 142980 (0.0047) [2024-06-22 05:25:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2342666240. Throughput: 0: 42909.3. Samples: 2342743520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 05:25:39,834][15401] Updated weights for policy 0, policy_version 142990 (0.0036) [2024-06-22 05:25:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43145.2, 300 sec: 42876.2). Total num frames: 2342879232. Throughput: 0: 42867.1. Samples: 2342997240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:43,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 05:25:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142998_2342879232.pth... [2024-06-22 05:25:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142370_2332590080.pth [2024-06-22 05:25:44,424][15401] Updated weights for policy 0, policy_version 143000 (0.0031) [2024-06-22 05:25:47,388][15401] Updated weights for policy 0, policy_version 143010 (0.0052) [2024-06-22 05:25:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2343108608. Throughput: 0: 42865.4. Samples: 2343248840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 05:25:51,894][15401] Updated weights for policy 0, policy_version 143020 (0.0031) [2024-06-22 05:25:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2343288832. Throughput: 0: 42816.9. Samples: 2343382400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 05:25:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 05:25:54,203][15349] Signal inference workers to stop experience collection... (34550 times) [2024-06-22 05:25:54,203][15349] Signal inference workers to resume experience collection... (34550 times) [2024-06-22 05:25:54,215][15401] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-06-22 05:25:54,215][15401] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-06-22 05:25:55,227][15401] Updated weights for policy 0, policy_version 143030 (0.0044) [2024-06-22 05:25:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2343534592. Throughput: 0: 42730.3. Samples: 2343634400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:25:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 05:25:59,471][15401] Updated weights for policy 0, policy_version 143040 (0.0038) [2024-06-22 05:26:02,976][15401] Updated weights for policy 0, policy_version 143050 (0.0030) [2024-06-22 05:26:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2343747584. Throughput: 0: 42717.8. Samples: 2343888800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:03,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 05:26:07,427][15401] Updated weights for policy 0, policy_version 143060 (0.0038) [2024-06-22 05:26:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2343911424. Throughput: 0: 42482.1. Samples: 2344014200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:08,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 05:26:10,831][15401] Updated weights for policy 0, policy_version 143070 (0.0025) [2024-06-22 05:26:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 2344173568. Throughput: 0: 42598.6. Samples: 2344270860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:13,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 05:26:14,889][15401] Updated weights for policy 0, policy_version 143080 (0.0028) [2024-06-22 05:26:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 2344370176. Throughput: 0: 42897.9. Samples: 2344538520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 05:26:18,413][15401] Updated weights for policy 0, policy_version 143090 (0.0027) [2024-06-22 05:26:22,386][15401] Updated weights for policy 0, policy_version 143100 (0.0035) [2024-06-22 05:26:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2344566784. Throughput: 0: 42619.5. Samples: 2344661400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:23,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 05:26:25,946][15401] Updated weights for policy 0, policy_version 143110 (0.0029) [2024-06-22 05:26:28,394][15132] Fps is (10 sec: 44217.1, 60 sec: 42868.4, 300 sec: 42986.5). Total num frames: 2344812544. Throughput: 0: 42732.3. Samples: 2344920380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:28,394][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 05:26:29,957][15401] Updated weights for policy 0, policy_version 143120 (0.0025) [2024-06-22 05:26:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2345025536. Throughput: 0: 42815.6. Samples: 2345175540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 05:26:33,531][15401] Updated weights for policy 0, policy_version 143130 (0.0036) [2024-06-22 05:26:37,858][15401] Updated weights for policy 0, policy_version 143140 (0.0034) [2024-06-22 05:26:38,389][15132] Fps is (10 sec: 40978.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2345222144. Throughput: 0: 42732.5. Samples: 2345305360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 05:26:41,112][15401] Updated weights for policy 0, policy_version 143150 (0.0041) [2024-06-22 05:26:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2345467904. Throughput: 0: 42981.2. Samples: 2345568560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:43,390][15132] Avg episode reward: [(0, '0.887')] [2024-06-22 05:26:45,242][15401] Updated weights for policy 0, policy_version 143160 (0.0033) [2024-06-22 05:26:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 2345664512. Throughput: 0: 43062.3. Samples: 2345826600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 05:26:48,725][15401] Updated weights for policy 0, policy_version 143170 (0.0031) [2024-06-22 05:26:52,817][15401] Updated weights for policy 0, policy_version 143180 (0.0032) [2024-06-22 05:26:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2345861120. Throughput: 0: 42956.9. Samples: 2345947260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 05:26:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 05:26:56,825][15401] Updated weights for policy 0, policy_version 143190 (0.0044) [2024-06-22 05:26:58,211][15349] Signal inference workers to stop experience collection... (34600 times) [2024-06-22 05:26:58,264][15401] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-06-22 05:26:58,331][15349] Signal inference workers to resume experience collection... (34600 times) [2024-06-22 05:26:58,332][15401] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-06-22 05:26:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2346106880. Throughput: 0: 43079.7. Samples: 2346209440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:26:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 05:27:00,850][15401] Updated weights for policy 0, policy_version 143200 (0.0035) [2024-06-22 05:27:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2346287104. Throughput: 0: 42765.8. Samples: 2346462980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 05:27:04,699][15401] Updated weights for policy 0, policy_version 143210 (0.0035) [2024-06-22 05:27:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2346500096. Throughput: 0: 42764.9. Samples: 2346585820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 05:27:08,399][15401] Updated weights for policy 0, policy_version 143220 (0.0038) [2024-06-22 05:27:12,339][15401] Updated weights for policy 0, policy_version 143230 (0.0039) [2024-06-22 05:27:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2346745856. Throughput: 0: 42926.9. Samples: 2346851900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 05:27:15,959][15401] Updated weights for policy 0, policy_version 143240 (0.0037) [2024-06-22 05:27:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2346942464. Throughput: 0: 42929.7. Samples: 2347107380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 05:27:20,067][15401] Updated weights for policy 0, policy_version 143250 (0.0027) [2024-06-22 05:27:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2347155456. Throughput: 0: 42739.5. Samples: 2347228640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 05:27:23,955][15401] Updated weights for policy 0, policy_version 143260 (0.0029) [2024-06-22 05:27:27,690][15401] Updated weights for policy 0, policy_version 143270 (0.0046) [2024-06-22 05:27:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42601.6, 300 sec: 42931.6). Total num frames: 2347368448. Throughput: 0: 42746.4. Samples: 2347492140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 05:27:31,588][15401] Updated weights for policy 0, policy_version 143280 (0.0031) [2024-06-22 05:27:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2347581440. Throughput: 0: 42764.4. Samples: 2347751000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 05:27:35,192][15401] Updated weights for policy 0, policy_version 143290 (0.0050) [2024-06-22 05:27:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2347810816. Throughput: 0: 42831.7. Samples: 2347874680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 05:27:39,154][15401] Updated weights for policy 0, policy_version 143300 (0.0033) [2024-06-22 05:27:42,662][15401] Updated weights for policy 0, policy_version 143310 (0.0040) [2024-06-22 05:27:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2348040192. Throughput: 0: 42873.3. Samples: 2348138740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:43,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-22 05:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143313_2348040192.pth... [2024-06-22 05:27:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142683_2337718272.pth [2024-06-22 05:27:46,736][15401] Updated weights for policy 0, policy_version 143320 (0.0031) [2024-06-22 05:27:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2348204032. Throughput: 0: 43059.0. Samples: 2348400640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:48,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-22 05:27:50,468][15401] Updated weights for policy 0, policy_version 143330 (0.0035) [2024-06-22 05:27:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2348449792. Throughput: 0: 43012.0. Samples: 2348521360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:53,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 05:27:54,343][15401] Updated weights for policy 0, policy_version 143340 (0.0028) [2024-06-22 05:27:58,040][15401] Updated weights for policy 0, policy_version 143350 (0.0028) [2024-06-22 05:27:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42988.1). Total num frames: 2348679168. Throughput: 0: 42895.1. Samples: 2348782180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 05:27:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 05:28:01,960][15401] Updated weights for policy 0, policy_version 143360 (0.0037) [2024-06-22 05:28:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2348859392. Throughput: 0: 43092.1. Samples: 2349046520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 05:28:05,599][15401] Updated weights for policy 0, policy_version 143370 (0.0033) [2024-06-22 05:28:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 2349088768. Throughput: 0: 43112.1. Samples: 2349168680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 05:28:09,406][15401] Updated weights for policy 0, policy_version 143380 (0.0040) [2024-06-22 05:28:12,999][15401] Updated weights for policy 0, policy_version 143390 (0.0030) [2024-06-22 05:28:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2349301760. Throughput: 0: 43044.4. Samples: 2349429140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:13,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 05:28:13,861][15349] Signal inference workers to stop experience collection... (34650 times) [2024-06-22 05:28:13,903][15401] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-06-22 05:28:13,912][15349] Signal inference workers to resume experience collection... (34650 times) [2024-06-22 05:28:13,913][15401] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-06-22 05:28:17,049][15401] Updated weights for policy 0, policy_version 143400 (0.0037) [2024-06-22 05:28:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2349514752. Throughput: 0: 42991.1. Samples: 2349685600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 05:28:21,100][15401] Updated weights for policy 0, policy_version 143410 (0.0028) [2024-06-22 05:28:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2349744128. Throughput: 0: 43092.5. Samples: 2349813840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 05:28:24,525][15401] Updated weights for policy 0, policy_version 143420 (0.0028) [2024-06-22 05:28:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2349940736. Throughput: 0: 42977.9. Samples: 2350072740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:28,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 05:28:28,536][15401] Updated weights for policy 0, policy_version 143430 (0.0031) [2024-06-22 05:28:32,082][15401] Updated weights for policy 0, policy_version 143440 (0.0028) [2024-06-22 05:28:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2350137344. Throughput: 0: 42838.6. Samples: 2350328380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 05:28:36,286][15401] Updated weights for policy 0, policy_version 143450 (0.0032) [2024-06-22 05:28:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2350383104. Throughput: 0: 43004.5. Samples: 2350456560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 05:28:39,727][15401] Updated weights for policy 0, policy_version 143460 (0.0033) [2024-06-22 05:28:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2350579712. Throughput: 0: 42805.9. Samples: 2350708440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 05:28:44,016][15401] Updated weights for policy 0, policy_version 143470 (0.0045) [2024-06-22 05:28:47,413][15401] Updated weights for policy 0, policy_version 143480 (0.0036) [2024-06-22 05:28:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2350776320. Throughput: 0: 42688.8. Samples: 2350967520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:48,395][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 05:28:51,767][15401] Updated weights for policy 0, policy_version 143490 (0.0047) [2024-06-22 05:28:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2351022080. Throughput: 0: 42841.7. Samples: 2351096560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 05:28:55,378][15401] Updated weights for policy 0, policy_version 143500 (0.0030) [2024-06-22 05:28:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2351218688. Throughput: 0: 42516.9. Samples: 2351342400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:28:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 05:28:59,257][15401] Updated weights for policy 0, policy_version 143510 (0.0023) [2024-06-22 05:29:03,166][15401] Updated weights for policy 0, policy_version 143520 (0.0035) [2024-06-22 05:29:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2351431680. Throughput: 0: 42731.0. Samples: 2351608500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:29:03,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 05:29:06,928][15401] Updated weights for policy 0, policy_version 143530 (0.0036) [2024-06-22 05:29:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2351661056. Throughput: 0: 42805.6. Samples: 2351740100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 05:29:10,895][15401] Updated weights for policy 0, policy_version 143540 (0.0039) [2024-06-22 05:29:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2351857664. Throughput: 0: 42569.3. Samples: 2351988360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 05:29:14,331][15349] Signal inference workers to stop experience collection... (34700 times) [2024-06-22 05:29:14,378][15349] Signal inference workers to resume experience collection... (34700 times) [2024-06-22 05:29:14,383][15401] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-06-22 05:29:14,406][15401] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-06-22 05:29:14,517][15401] Updated weights for policy 0, policy_version 143550 (0.0031) [2024-06-22 05:29:18,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2352070656. Throughput: 0: 42702.3. Samples: 2352250080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:18,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 05:29:18,609][15401] Updated weights for policy 0, policy_version 143560 (0.0032) [2024-06-22 05:29:22,316][15401] Updated weights for policy 0, policy_version 143570 (0.0039) [2024-06-22 05:29:23,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2352316416. Throughput: 0: 42703.0. Samples: 2352378200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 05:29:26,278][15401] Updated weights for policy 0, policy_version 143580 (0.0023) [2024-06-22 05:29:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2352513024. Throughput: 0: 42724.5. Samples: 2352631040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 05:29:29,805][15401] Updated weights for policy 0, policy_version 143590 (0.0034) [2024-06-22 05:29:33,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2352709632. Throughput: 0: 42654.2. Samples: 2352886960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 05:29:33,904][15401] Updated weights for policy 0, policy_version 143600 (0.0028) [2024-06-22 05:29:37,665][15401] Updated weights for policy 0, policy_version 143610 (0.0036) [2024-06-22 05:29:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.2). Total num frames: 2352939008. Throughput: 0: 42612.6. Samples: 2353014120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 05:29:41,791][15401] Updated weights for policy 0, policy_version 143620 (0.0033) [2024-06-22 05:29:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2353135616. Throughput: 0: 42829.8. Samples: 2353269740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:43,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-22 05:29:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143625_2353152000.pth... [2024-06-22 05:29:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000142998_2342879232.pth [2024-06-22 05:29:45,135][15401] Updated weights for policy 0, policy_version 143630 (0.0032) [2024-06-22 05:29:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2353364992. Throughput: 0: 42456.1. Samples: 2353519020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 05:29:49,354][15401] Updated weights for policy 0, policy_version 143640 (0.0032) [2024-06-22 05:29:52,731][15401] Updated weights for policy 0, policy_version 143650 (0.0035) [2024-06-22 05:29:53,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42320.9, 300 sec: 42819.6). Total num frames: 2353561600. Throughput: 0: 42440.7. Samples: 2353650200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:53,397][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 05:29:56,843][15401] Updated weights for policy 0, policy_version 143660 (0.0035) [2024-06-22 05:29:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2353774592. Throughput: 0: 42584.9. Samples: 2353904680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:29:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 05:30:00,539][15401] Updated weights for policy 0, policy_version 143670 (0.0031) [2024-06-22 05:30:03,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2354003968. Throughput: 0: 42437.7. Samples: 2354159680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 05:30:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 05:30:04,557][15401] Updated weights for policy 0, policy_version 143680 (0.0042) [2024-06-22 05:30:08,116][15401] Updated weights for policy 0, policy_version 143690 (0.0039) [2024-06-22 05:30:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2354216960. Throughput: 0: 42631.7. Samples: 2354296620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 05:30:11,948][15401] Updated weights for policy 0, policy_version 143700 (0.0044) [2024-06-22 05:30:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2354429952. Throughput: 0: 42513.2. Samples: 2354544140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 05:30:15,978][15401] Updated weights for policy 0, policy_version 143710 (0.0033) [2024-06-22 05:30:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 2354642944. Throughput: 0: 42632.7. Samples: 2354805420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 05:30:19,447][15401] Updated weights for policy 0, policy_version 143720 (0.0042) [2024-06-22 05:30:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2354855936. Throughput: 0: 42690.1. Samples: 2354935180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 05:30:23,854][15401] Updated weights for policy 0, policy_version 143730 (0.0023) [2024-06-22 05:30:27,157][15401] Updated weights for policy 0, policy_version 143740 (0.0030) [2024-06-22 05:30:28,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2355085312. Throughput: 0: 42742.6. Samples: 2355193160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 05:30:31,686][15401] Updated weights for policy 0, policy_version 143750 (0.0024) [2024-06-22 05:30:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2355281920. Throughput: 0: 43005.4. Samples: 2355454260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 05:30:34,801][15401] Updated weights for policy 0, policy_version 143760 (0.0035) [2024-06-22 05:30:37,807][15349] Signal inference workers to stop experience collection... (34750 times) [2024-06-22 05:30:37,851][15401] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-06-22 05:30:37,858][15349] Signal inference workers to resume experience collection... (34750 times) [2024-06-22 05:30:37,865][15401] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-06-22 05:30:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 2355494912. Throughput: 0: 42848.2. Samples: 2355578100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 05:30:39,224][15401] Updated weights for policy 0, policy_version 143770 (0.0038) [2024-06-22 05:30:42,486][15401] Updated weights for policy 0, policy_version 143780 (0.0031) [2024-06-22 05:30:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2355724288. Throughput: 0: 42929.7. Samples: 2355836520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:43,392][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 05:30:46,786][15401] Updated weights for policy 0, policy_version 143790 (0.0030) [2024-06-22 05:30:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2355904512. Throughput: 0: 43008.5. Samples: 2356095060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 05:30:50,043][15401] Updated weights for policy 0, policy_version 143800 (0.0033) [2024-06-22 05:30:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 2356150272. Throughput: 0: 42599.9. Samples: 2356213620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 05:30:54,238][15401] Updated weights for policy 0, policy_version 143810 (0.0029) [2024-06-22 05:30:57,799][15401] Updated weights for policy 0, policy_version 143820 (0.0038) [2024-06-22 05:30:58,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2356379648. Throughput: 0: 42967.9. Samples: 2356477700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:30:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 05:31:01,703][15401] Updated weights for policy 0, policy_version 143830 (0.0045) [2024-06-22 05:31:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2356527104. Throughput: 0: 43088.3. Samples: 2356744400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:31:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 05:31:05,301][15401] Updated weights for policy 0, policy_version 143840 (0.0038) [2024-06-22 05:31:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2356789248. Throughput: 0: 42785.7. Samples: 2356860540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 05:31:08,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 05:31:09,302][15401] Updated weights for policy 0, policy_version 143850 (0.0039) [2024-06-22 05:31:13,145][15401] Updated weights for policy 0, policy_version 143860 (0.0042) [2024-06-22 05:31:13,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2357002240. Throughput: 0: 42816.4. Samples: 2357119900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:13,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 05:31:16,830][15401] Updated weights for policy 0, policy_version 143870 (0.0044) [2024-06-22 05:31:18,389][15132] Fps is (10 sec: 37684.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2357166080. Throughput: 0: 42876.9. Samples: 2357383720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 05:31:20,767][15401] Updated weights for policy 0, policy_version 143880 (0.0028) [2024-06-22 05:31:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 2357428224. Throughput: 0: 42742.3. Samples: 2357501500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:23,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 05:31:24,250][15401] Updated weights for policy 0, policy_version 143890 (0.0031) [2024-06-22 05:31:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2357641216. Throughput: 0: 42794.8. Samples: 2357762280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 05:31:28,482][15401] Updated weights for policy 0, policy_version 143900 (0.0023) [2024-06-22 05:31:32,119][15401] Updated weights for policy 0, policy_version 143910 (0.0029) [2024-06-22 05:31:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2357821440. Throughput: 0: 42872.8. Samples: 2358024340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 05:31:36,081][15401] Updated weights for policy 0, policy_version 143920 (0.0033) [2024-06-22 05:31:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 2358083584. Throughput: 0: 42991.2. Samples: 2358148220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:38,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-22 05:31:39,706][15401] Updated weights for policy 0, policy_version 143930 (0.0034) [2024-06-22 05:31:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2358280192. Throughput: 0: 43129.9. Samples: 2358418540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:43,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 05:31:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143939_2358296576.pth... [2024-06-22 05:31:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143313_2348040192.pth [2024-06-22 05:31:43,675][15401] Updated weights for policy 0, policy_version 143940 (0.0033) [2024-06-22 05:31:47,502][15401] Updated weights for policy 0, policy_version 143950 (0.0027) [2024-06-22 05:31:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2358493184. Throughput: 0: 42893.6. Samples: 2358674620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 05:31:51,227][15401] Updated weights for policy 0, policy_version 143960 (0.0040) [2024-06-22 05:31:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2358738944. Throughput: 0: 43089.9. Samples: 2358799580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 05:31:55,211][15401] Updated weights for policy 0, policy_version 143970 (0.0042) [2024-06-22 05:31:58,296][15349] Signal inference workers to stop experience collection... (34800 times) [2024-06-22 05:31:58,300][15349] Signal inference workers to resume experience collection... (34800 times) [2024-06-22 05:31:58,351][15401] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-06-22 05:31:58,351][15401] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-06-22 05:31:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 2358919168. Throughput: 0: 43152.6. Samples: 2359061760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:31:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 05:31:58,839][15401] Updated weights for policy 0, policy_version 143980 (0.0034) [2024-06-22 05:32:03,071][15401] Updated weights for policy 0, policy_version 143990 (0.0034) [2024-06-22 05:32:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2359132160. Throughput: 0: 42907.0. Samples: 2359314540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:32:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 05:32:06,479][15401] Updated weights for policy 0, policy_version 144000 (0.0034) [2024-06-22 05:32:08,392][15132] Fps is (10 sec: 45862.6, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 2359377920. Throughput: 0: 43108.6. Samples: 2359441500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:32:08,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 05:32:10,978][15401] Updated weights for policy 0, policy_version 144010 (0.0035) [2024-06-22 05:32:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2359574528. Throughput: 0: 43086.6. Samples: 2359701180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-22 05:32:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 05:32:14,211][15401] Updated weights for policy 0, policy_version 144020 (0.0030) [2024-06-22 05:32:18,390][15132] Fps is (10 sec: 39332.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2359771136. Throughput: 0: 43009.9. Samples: 2359959780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:18,394][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 05:32:18,443][15401] Updated weights for policy 0, policy_version 144030 (0.0025) [2024-06-22 05:32:21,787][15401] Updated weights for policy 0, policy_version 144040 (0.0028) [2024-06-22 05:32:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2360016896. Throughput: 0: 42928.9. Samples: 2360080020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 05:32:25,826][15401] Updated weights for policy 0, policy_version 144050 (0.0032) [2024-06-22 05:32:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2360213504. Throughput: 0: 42825.7. Samples: 2360345700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:28,399][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 05:32:29,446][15401] Updated weights for policy 0, policy_version 144060 (0.0024) [2024-06-22 05:32:33,370][15401] Updated weights for policy 0, policy_version 144070 (0.0029) [2024-06-22 05:32:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.8, 300 sec: 42820.5). Total num frames: 2360442880. Throughput: 0: 42889.0. Samples: 2360604620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 05:32:36,967][15401] Updated weights for policy 0, policy_version 144080 (0.0038) [2024-06-22 05:32:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2360655872. Throughput: 0: 42849.8. Samples: 2360727820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 05:32:41,239][15401] Updated weights for policy 0, policy_version 144090 (0.0040) [2024-06-22 05:32:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 2360852480. Throughput: 0: 42822.1. Samples: 2360988860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:43,393][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 05:32:44,619][15401] Updated weights for policy 0, policy_version 144100 (0.0035) [2024-06-22 05:32:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2361065472. Throughput: 0: 42796.0. Samples: 2361240360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 05:32:48,892][15401] Updated weights for policy 0, policy_version 144110 (0.0029) [2024-06-22 05:32:52,246][15401] Updated weights for policy 0, policy_version 144120 (0.0039) [2024-06-22 05:32:53,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2361294848. Throughput: 0: 42765.8. Samples: 2361365840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 05:32:56,613][15401] Updated weights for policy 0, policy_version 144130 (0.0030) [2024-06-22 05:32:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2361475072. Throughput: 0: 42673.8. Samples: 2361621500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:32:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 05:32:59,890][15401] Updated weights for policy 0, policy_version 144140 (0.0034) [2024-06-22 05:33:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2361704448. Throughput: 0: 42666.6. Samples: 2361879780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:33:03,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 05:33:04,250][15401] Updated weights for policy 0, policy_version 144150 (0.0029) [2024-06-22 05:33:07,505][15401] Updated weights for policy 0, policy_version 144160 (0.0032) [2024-06-22 05:33:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42873.4, 300 sec: 42876.1). Total num frames: 2361950208. Throughput: 0: 42869.7. Samples: 2362009160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:33:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 05:33:11,975][15401] Updated weights for policy 0, policy_version 144170 (0.0047) [2024-06-22 05:33:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2362114048. Throughput: 0: 42611.2. Samples: 2362263200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:33:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 05:33:15,384][15401] Updated weights for policy 0, policy_version 144180 (0.0032) [2024-06-22 05:33:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2362343424. Throughput: 0: 42511.5. Samples: 2362517640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 05:33:19,607][15401] Updated weights for policy 0, policy_version 144190 (0.0040) [2024-06-22 05:33:23,077][15401] Updated weights for policy 0, policy_version 144200 (0.0038) [2024-06-22 05:33:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2362589184. Throughput: 0: 42706.3. Samples: 2362649600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 05:33:27,201][15401] Updated weights for policy 0, policy_version 144210 (0.0043) [2024-06-22 05:33:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2362753024. Throughput: 0: 42479.6. Samples: 2362900340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 05:33:30,482][15349] Signal inference workers to stop experience collection... (34850 times) [2024-06-22 05:33:30,509][15401] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-06-22 05:33:30,537][15349] Signal inference workers to resume experience collection... (34850 times) [2024-06-22 05:33:30,540][15401] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-06-22 05:33:30,866][15401] Updated weights for policy 0, policy_version 144220 (0.0033) [2024-06-22 05:33:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2362998784. Throughput: 0: 42480.4. Samples: 2363151980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:33,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 05:33:34,842][15401] Updated weights for policy 0, policy_version 144230 (0.0027) [2024-06-22 05:33:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2363211776. Throughput: 0: 42591.8. Samples: 2363282480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 05:33:38,476][15401] Updated weights for policy 0, policy_version 144240 (0.0035) [2024-06-22 05:33:42,534][15401] Updated weights for policy 0, policy_version 144250 (0.0027) [2024-06-22 05:33:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2363408384. Throughput: 0: 42616.3. Samples: 2363539240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 05:33:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144252_2363424768.pth... [2024-06-22 05:33:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143625_2353152000.pth [2024-06-22 05:33:46,212][15401] Updated weights for policy 0, policy_version 144260 (0.0044) [2024-06-22 05:33:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2363621376. Throughput: 0: 42396.4. Samples: 2363787620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 05:33:50,279][15401] Updated weights for policy 0, policy_version 144270 (0.0034) [2024-06-22 05:33:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2363850752. Throughput: 0: 42405.8. Samples: 2363917420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 05:33:54,465][15401] Updated weights for policy 0, policy_version 144280 (0.0026) [2024-06-22 05:33:57,994][15401] Updated weights for policy 0, policy_version 144290 (0.0035) [2024-06-22 05:33:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2364063744. Throughput: 0: 42423.9. Samples: 2364172380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:33:58,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 05:34:02,037][15401] Updated weights for policy 0, policy_version 144300 (0.0040) [2024-06-22 05:34:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2364260352. Throughput: 0: 42440.4. Samples: 2364427460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:34:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 05:34:05,507][15401] Updated weights for policy 0, policy_version 144310 (0.0023) [2024-06-22 05:34:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2364473344. Throughput: 0: 42299.5. Samples: 2364553080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:34:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 05:34:09,656][15401] Updated weights for policy 0, policy_version 144320 (0.0041) [2024-06-22 05:34:13,291][15401] Updated weights for policy 0, policy_version 144330 (0.0026) [2024-06-22 05:34:13,395][15132] Fps is (10 sec: 44213.4, 60 sec: 43140.6, 300 sec: 42820.1). Total num frames: 2364702720. Throughput: 0: 42492.2. Samples: 2364812720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:34:13,395][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 05:34:17,342][15401] Updated weights for policy 0, policy_version 144340 (0.0037) [2024-06-22 05:34:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2364899328. Throughput: 0: 42558.4. Samples: 2365067100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 05:34:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 05:34:21,084][15401] Updated weights for policy 0, policy_version 144350 (0.0039) [2024-06-22 05:34:23,390][15132] Fps is (10 sec: 40979.7, 60 sec: 42051.8, 300 sec: 42709.4). Total num frames: 2365112320. Throughput: 0: 42412.0. Samples: 2365191040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:23,391][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 05:34:24,942][15401] Updated weights for policy 0, policy_version 144360 (0.0024) [2024-06-22 05:34:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2365325312. Throughput: 0: 42428.6. Samples: 2365448520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:28,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 05:34:28,738][15401] Updated weights for policy 0, policy_version 144370 (0.0034) [2024-06-22 05:34:32,561][15401] Updated weights for policy 0, policy_version 144380 (0.0032) [2024-06-22 05:34:33,390][15132] Fps is (10 sec: 42600.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2365538304. Throughput: 0: 42602.6. Samples: 2365704740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 05:34:36,410][15401] Updated weights for policy 0, policy_version 144390 (0.0030) [2024-06-22 05:34:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2365734912. Throughput: 0: 42584.5. Samples: 2365833720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 05:34:40,542][15401] Updated weights for policy 0, policy_version 144400 (0.0030) [2024-06-22 05:34:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2365947904. Throughput: 0: 42571.2. Samples: 2366087980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 05:34:43,974][15401] Updated weights for policy 0, policy_version 144410 (0.0037) [2024-06-22 05:34:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 2366160896. Throughput: 0: 42641.0. Samples: 2366346300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:48,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 05:34:48,481][15401] Updated weights for policy 0, policy_version 144420 (0.0038) [2024-06-22 05:34:50,841][15349] Signal inference workers to stop experience collection... (34900 times) [2024-06-22 05:34:50,894][15401] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-06-22 05:34:50,953][15349] Signal inference workers to resume experience collection... (34900 times) [2024-06-22 05:34:50,954][15401] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-06-22 05:34:51,743][15401] Updated weights for policy 0, policy_version 144430 (0.0038) [2024-06-22 05:34:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2366390272. Throughput: 0: 42661.7. Samples: 2366472860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 05:34:56,008][15401] Updated weights for policy 0, policy_version 144440 (0.0038) [2024-06-22 05:34:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 2366603264. Throughput: 0: 42511.4. Samples: 2366725500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:34:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 05:34:59,606][15401] Updated weights for policy 0, policy_version 144450 (0.0052) [2024-06-22 05:35:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2366816256. Throughput: 0: 42622.1. Samples: 2366985100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:35:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 05:35:03,579][15401] Updated weights for policy 0, policy_version 144460 (0.0023) [2024-06-22 05:35:07,263][15401] Updated weights for policy 0, policy_version 144470 (0.0047) [2024-06-22 05:35:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2367012864. Throughput: 0: 42719.6. Samples: 2367113400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:35:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 05:35:11,191][15401] Updated weights for policy 0, policy_version 144480 (0.0037) [2024-06-22 05:35:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42602.2, 300 sec: 42765.0). Total num frames: 2367258624. Throughput: 0: 42646.1. Samples: 2367367600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:35:13,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 05:35:15,123][15401] Updated weights for policy 0, policy_version 144490 (0.0033) [2024-06-22 05:35:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2367455232. Throughput: 0: 42609.8. Samples: 2367622180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:35:18,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 05:35:18,845][15401] Updated weights for policy 0, policy_version 144500 (0.0025) [2024-06-22 05:35:22,545][15401] Updated weights for policy 0, policy_version 144510 (0.0037) [2024-06-22 05:35:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.8, 300 sec: 42653.9). Total num frames: 2367668224. Throughput: 0: 42577.8. Samples: 2367749720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:35:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 05:35:26,225][15401] Updated weights for policy 0, policy_version 144520 (0.0030) [2024-06-22 05:35:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2367897600. Throughput: 0: 42806.1. Samples: 2368014260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 05:35:30,286][15401] Updated weights for policy 0, policy_version 144530 (0.0039) [2024-06-22 05:35:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2368110592. Throughput: 0: 42754.6. Samples: 2368270260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 05:35:34,050][15401] Updated weights for policy 0, policy_version 144540 (0.0043) [2024-06-22 05:35:37,911][15401] Updated weights for policy 0, policy_version 144550 (0.0021) [2024-06-22 05:35:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2368307200. Throughput: 0: 42745.3. Samples: 2368396400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 05:35:41,731][15401] Updated weights for policy 0, policy_version 144560 (0.0037) [2024-06-22 05:35:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2368536576. Throughput: 0: 42853.8. Samples: 2368653920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 05:35:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144565_2368552960.pth... [2024-06-22 05:35:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000143939_2358296576.pth [2024-06-22 05:35:45,462][15401] Updated weights for policy 0, policy_version 144570 (0.0029) [2024-06-22 05:35:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 2368749568. Throughput: 0: 42852.4. Samples: 2368913560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:48,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 05:35:49,346][15401] Updated weights for policy 0, policy_version 144580 (0.0040) [2024-06-22 05:35:53,032][15401] Updated weights for policy 0, policy_version 144590 (0.0035) [2024-06-22 05:35:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2368962560. Throughput: 0: 42702.8. Samples: 2369035020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 05:35:56,991][15401] Updated weights for policy 0, policy_version 144600 (0.0037) [2024-06-22 05:35:58,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2369175552. Throughput: 0: 42783.0. Samples: 2369292840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:35:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 05:36:00,630][15401] Updated weights for policy 0, policy_version 144610 (0.0031) [2024-06-22 05:36:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2369388544. Throughput: 0: 42995.2. Samples: 2369556960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:36:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 05:36:04,491][15401] Updated weights for policy 0, policy_version 144620 (0.0023) [2024-06-22 05:36:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2369585152. Throughput: 0: 42758.3. Samples: 2369673840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:36:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 05:36:08,616][15401] Updated weights for policy 0, policy_version 144630 (0.0030) [2024-06-22 05:36:10,540][15349] Signal inference workers to stop experience collection... (34950 times) [2024-06-22 05:36:10,541][15349] Signal inference workers to resume experience collection... (34950 times) [2024-06-22 05:36:10,588][15401] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-06-22 05:36:10,588][15401] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-06-22 05:36:12,069][15401] Updated weights for policy 0, policy_version 144640 (0.0026) [2024-06-22 05:36:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2369814528. Throughput: 0: 42589.0. Samples: 2369930760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:36:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 05:36:16,213][15401] Updated weights for policy 0, policy_version 144650 (0.0042) [2024-06-22 05:36:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2370027520. Throughput: 0: 42708.1. Samples: 2370192120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:36:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 05:36:19,693][15401] Updated weights for policy 0, policy_version 144660 (0.0040) [2024-06-22 05:36:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2370240512. Throughput: 0: 42661.8. Samples: 2370316180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 05:36:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 05:36:23,926][15401] Updated weights for policy 0, policy_version 144670 (0.0035) [2024-06-22 05:36:27,751][15401] Updated weights for policy 0, policy_version 144680 (0.0028) [2024-06-22 05:36:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2370469888. Throughput: 0: 42715.1. Samples: 2370576100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 05:36:31,498][15401] Updated weights for policy 0, policy_version 144690 (0.0026) [2024-06-22 05:36:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2370682880. Throughput: 0: 42777.8. Samples: 2370838460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 05:36:35,309][15401] Updated weights for policy 0, policy_version 144700 (0.0036) [2024-06-22 05:36:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2370863104. Throughput: 0: 42860.5. Samples: 2370963740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 05:36:39,122][15401] Updated weights for policy 0, policy_version 144710 (0.0029) [2024-06-22 05:36:42,885][15401] Updated weights for policy 0, policy_version 144720 (0.0028) [2024-06-22 05:36:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2371108864. Throughput: 0: 42891.2. Samples: 2371222940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:43,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 05:36:46,771][15401] Updated weights for policy 0, policy_version 144730 (0.0037) [2024-06-22 05:36:48,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2371305472. Throughput: 0: 42801.8. Samples: 2371483040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 05:36:50,403][15401] Updated weights for policy 0, policy_version 144740 (0.0037) [2024-06-22 05:36:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2371518464. Throughput: 0: 42972.8. Samples: 2371607620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:53,391][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 05:36:54,604][15401] Updated weights for policy 0, policy_version 144750 (0.0038) [2024-06-22 05:36:57,976][15401] Updated weights for policy 0, policy_version 144760 (0.0042) [2024-06-22 05:36:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2371764224. Throughput: 0: 42896.8. Samples: 2371861120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:36:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 05:37:02,296][15401] Updated weights for policy 0, policy_version 144770 (0.0033) [2024-06-22 05:37:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2371960832. Throughput: 0: 42835.0. Samples: 2372119700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:03,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 05:37:05,689][15401] Updated weights for policy 0, policy_version 144780 (0.0020) [2024-06-22 05:37:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2372157440. Throughput: 0: 42927.1. Samples: 2372247900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:08,391][15132] Avg episode reward: [(0, '0.213')] [2024-06-22 05:37:09,854][15401] Updated weights for policy 0, policy_version 144790 (0.0034) [2024-06-22 05:37:13,243][15401] Updated weights for policy 0, policy_version 144800 (0.0041) [2024-06-22 05:37:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2372403200. Throughput: 0: 42882.2. Samples: 2372505800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:13,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 05:37:17,533][15401] Updated weights for policy 0, policy_version 144810 (0.0030) [2024-06-22 05:37:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2372583424. Throughput: 0: 42830.1. Samples: 2372765820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 05:37:21,485][15401] Updated weights for policy 0, policy_version 144820 (0.0039) [2024-06-22 05:37:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2372812800. Throughput: 0: 42787.4. Samples: 2372889180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 05:37:25,375][15401] Updated weights for policy 0, policy_version 144830 (0.0057) [2024-06-22 05:37:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2373025792. Throughput: 0: 42694.2. Samples: 2373144180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 05:37:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 05:37:29,039][15401] Updated weights for policy 0, policy_version 144840 (0.0033) [2024-06-22 05:37:33,144][15401] Updated weights for policy 0, policy_version 144850 (0.0040) [2024-06-22 05:37:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2373238784. Throughput: 0: 42712.0. Samples: 2373405080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 05:37:36,658][15401] Updated weights for policy 0, policy_version 144860 (0.0040) [2024-06-22 05:37:37,283][15349] Signal inference workers to stop experience collection... (35000 times) [2024-06-22 05:37:37,283][15349] Signal inference workers to resume experience collection... (35000 times) [2024-06-22 05:37:37,336][15401] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-06-22 05:37:37,336][15401] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-06-22 05:37:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.3, 300 sec: 42709.8). Total num frames: 2373451776. Throughput: 0: 42767.5. Samples: 2373532160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 05:37:40,900][15401] Updated weights for policy 0, policy_version 144870 (0.0045) [2024-06-22 05:37:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2373681152. Throughput: 0: 42805.7. Samples: 2373787380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 05:37:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144878_2373681152.pth... [2024-06-22 05:37:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144252_2363424768.pth [2024-06-22 05:37:44,397][15401] Updated weights for policy 0, policy_version 144880 (0.0040) [2024-06-22 05:37:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2373861376. Throughput: 0: 42711.7. Samples: 2374041720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 05:37:48,498][15401] Updated weights for policy 0, policy_version 144890 (0.0028) [2024-06-22 05:37:51,855][15401] Updated weights for policy 0, policy_version 144900 (0.0030) [2024-06-22 05:37:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2374090752. Throughput: 0: 42650.6. Samples: 2374167180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 05:37:55,886][15401] Updated weights for policy 0, policy_version 144910 (0.0042) [2024-06-22 05:37:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2374303744. Throughput: 0: 42684.5. Samples: 2374426600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:37:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 05:37:59,735][15401] Updated weights for policy 0, policy_version 144920 (0.0032) [2024-06-22 05:38:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2374516736. Throughput: 0: 42638.5. Samples: 2374684560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 05:38:03,555][15401] Updated weights for policy 0, policy_version 144930 (0.0033) [2024-06-22 05:38:07,136][15401] Updated weights for policy 0, policy_version 144940 (0.0032) [2024-06-22 05:38:08,395][15132] Fps is (10 sec: 42575.2, 60 sec: 42867.6, 300 sec: 42764.2). Total num frames: 2374729728. Throughput: 0: 42758.5. Samples: 2374813540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:08,395][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 05:38:11,064][15401] Updated weights for policy 0, policy_version 144950 (0.0038) [2024-06-22 05:38:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2374942720. Throughput: 0: 42861.5. Samples: 2375072940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 05:38:14,911][15401] Updated weights for policy 0, policy_version 144960 (0.0046) [2024-06-22 05:38:18,390][15132] Fps is (10 sec: 42620.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2375155712. Throughput: 0: 42679.4. Samples: 2375325660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 05:38:18,750][15401] Updated weights for policy 0, policy_version 144970 (0.0031) [2024-06-22 05:38:22,470][15401] Updated weights for policy 0, policy_version 144980 (0.0033) [2024-06-22 05:38:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2375368704. Throughput: 0: 42678.7. Samples: 2375452700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 05:38:26,442][15401] Updated weights for policy 0, policy_version 144990 (0.0040) [2024-06-22 05:38:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2375598080. Throughput: 0: 42698.8. Samples: 2375708820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 05:38:30,108][15401] Updated weights for policy 0, policy_version 145000 (0.0039) [2024-06-22 05:38:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2375794688. Throughput: 0: 42771.6. Samples: 2375966440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 05:38:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 05:38:34,059][15401] Updated weights for policy 0, policy_version 145010 (0.0039) [2024-06-22 05:38:38,095][15401] Updated weights for policy 0, policy_version 145020 (0.0035) [2024-06-22 05:38:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2376007680. Throughput: 0: 42711.2. Samples: 2376089180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:38:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 05:38:42,001][15401] Updated weights for policy 0, policy_version 145030 (0.0034) [2024-06-22 05:38:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2376237056. Throughput: 0: 42582.6. Samples: 2376342820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:38:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 05:38:45,669][15401] Updated weights for policy 0, policy_version 145040 (0.0027) [2024-06-22 05:38:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2376433664. Throughput: 0: 42614.3. Samples: 2376602200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:38:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 05:38:49,698][15401] Updated weights for policy 0, policy_version 145050 (0.0031) [2024-06-22 05:38:53,321][15401] Updated weights for policy 0, policy_version 145060 (0.0042) [2024-06-22 05:38:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2376663040. Throughput: 0: 42606.4. Samples: 2376730600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:38:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 05:38:57,414][15401] Updated weights for policy 0, policy_version 145070 (0.0033) [2024-06-22 05:38:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2376876032. Throughput: 0: 42558.7. Samples: 2376988080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:38:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 05:39:00,918][15401] Updated weights for policy 0, policy_version 145080 (0.0043) [2024-06-22 05:39:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2377072640. Throughput: 0: 42659.6. Samples: 2377245340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 05:39:05,103][15401] Updated weights for policy 0, policy_version 145090 (0.0032) [2024-06-22 05:39:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42602.3, 300 sec: 42654.7). Total num frames: 2377285632. Throughput: 0: 42513.9. Samples: 2377365820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:08,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-22 05:39:08,794][15401] Updated weights for policy 0, policy_version 145100 (0.0030) [2024-06-22 05:39:12,787][15401] Updated weights for policy 0, policy_version 145110 (0.0036) [2024-06-22 05:39:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 2377498624. Throughput: 0: 42655.7. Samples: 2377628440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:13,393][15132] Avg episode reward: [(0, '0.071')] [2024-06-22 05:39:16,640][15401] Updated weights for policy 0, policy_version 145120 (0.0038) [2024-06-22 05:39:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 2377711616. Throughput: 0: 42446.5. Samples: 2377876540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:18,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-22 05:39:20,252][15349] Signal inference workers to stop experience collection... (35050 times) [2024-06-22 05:39:20,306][15401] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-06-22 05:39:20,312][15349] Signal inference workers to resume experience collection... (35050 times) [2024-06-22 05:39:20,324][15401] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-06-22 05:39:20,609][15401] Updated weights for policy 0, policy_version 145130 (0.0031) [2024-06-22 05:39:23,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2377940992. Throughput: 0: 42571.5. Samples: 2378004900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:23,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-22 05:39:24,271][15401] Updated weights for policy 0, policy_version 145140 (0.0037) [2024-06-22 05:39:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2378121216. Throughput: 0: 42598.3. Samples: 2378259740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 05:39:28,421][15401] Updated weights for policy 0, policy_version 145150 (0.0037) [2024-06-22 05:39:31,851][15401] Updated weights for policy 0, policy_version 145160 (0.0035) [2024-06-22 05:39:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2378350592. Throughput: 0: 42594.4. Samples: 2378518940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 05:39:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 05:39:36,014][15401] Updated weights for policy 0, policy_version 145170 (0.0040) [2024-06-22 05:39:38,392][15132] Fps is (10 sec: 45863.5, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 2378579968. Throughput: 0: 42518.6. Samples: 2378644040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:39:38,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 05:39:40,163][15401] Updated weights for policy 0, policy_version 145180 (0.0026) [2024-06-22 05:39:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2378760192. Throughput: 0: 42420.1. Samples: 2378896980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:39:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 05:39:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145189_2378776576.pth... [2024-06-22 05:39:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144565_2368552960.pth [2024-06-22 05:39:43,714][15401] Updated weights for policy 0, policy_version 145190 (0.0034) [2024-06-22 05:39:47,729][15401] Updated weights for policy 0, policy_version 145200 (0.0038) [2024-06-22 05:39:48,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2378973184. Throughput: 0: 42464.4. Samples: 2379156240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:39:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 05:39:51,248][15401] Updated weights for policy 0, policy_version 145210 (0.0022) [2024-06-22 05:39:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2379218944. Throughput: 0: 42546.2. Samples: 2379280400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:39:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 05:39:55,187][15401] Updated weights for policy 0, policy_version 145220 (0.0037) [2024-06-22 05:39:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2379399168. Throughput: 0: 42497.4. Samples: 2379540720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:39:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 05:39:58,912][15401] Updated weights for policy 0, policy_version 145230 (0.0036) [2024-06-22 05:40:02,968][15401] Updated weights for policy 0, policy_version 145240 (0.0038) [2024-06-22 05:40:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2379628544. Throughput: 0: 42637.9. Samples: 2379795240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 05:40:06,763][15401] Updated weights for policy 0, policy_version 145250 (0.0041) [2024-06-22 05:40:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2379857920. Throughput: 0: 42613.8. Samples: 2379922520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 05:40:10,917][15401] Updated weights for policy 0, policy_version 145260 (0.0023) [2024-06-22 05:40:13,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2380054528. Throughput: 0: 42739.7. Samples: 2380183040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:13,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 05:40:14,337][15401] Updated weights for policy 0, policy_version 145270 (0.0045) [2024-06-22 05:40:18,383][15401] Updated weights for policy 0, policy_version 145280 (0.0034) [2024-06-22 05:40:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2380267520. Throughput: 0: 42658.5. Samples: 2380438580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 05:40:21,941][15401] Updated weights for policy 0, policy_version 145290 (0.0038) [2024-06-22 05:40:23,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2380513280. Throughput: 0: 42627.6. Samples: 2380562180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 05:40:25,809][15401] Updated weights for policy 0, policy_version 145300 (0.0036) [2024-06-22 05:40:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 2380693504. Throughput: 0: 42927.9. Samples: 2380828840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:28,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 05:40:29,363][15401] Updated weights for policy 0, policy_version 145310 (0.0036) [2024-06-22 05:40:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2380906496. Throughput: 0: 42744.9. Samples: 2381079760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 05:40:33,729][15401] Updated weights for policy 0, policy_version 145320 (0.0025) [2024-06-22 05:40:37,530][15401] Updated weights for policy 0, policy_version 145330 (0.0046) [2024-06-22 05:40:38,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 2381168640. Throughput: 0: 42977.3. Samples: 2381214380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 05:40:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 05:40:41,343][15401] Updated weights for policy 0, policy_version 145340 (0.0046) [2024-06-22 05:40:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2381332480. Throughput: 0: 42911.3. Samples: 2381471720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:40:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 05:40:45,093][15401] Updated weights for policy 0, policy_version 145350 (0.0029) [2024-06-22 05:40:45,720][15349] Signal inference workers to stop experience collection... (35100 times) [2024-06-22 05:40:45,778][15401] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-06-22 05:40:45,784][15349] Signal inference workers to resume experience collection... (35100 times) [2024-06-22 05:40:45,794][15401] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-06-22 05:40:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2381561856. Throughput: 0: 42844.8. Samples: 2381723260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:40:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 05:40:48,760][15401] Updated weights for policy 0, policy_version 145360 (0.0038) [2024-06-22 05:40:52,666][15401] Updated weights for policy 0, policy_version 145370 (0.0034) [2024-06-22 05:40:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2381807616. Throughput: 0: 43050.1. Samples: 2381859780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:40:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 05:40:56,394][15401] Updated weights for policy 0, policy_version 145380 (0.0042) [2024-06-22 05:40:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2381987840. Throughput: 0: 42921.6. Samples: 2382114500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:40:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 05:41:00,187][15401] Updated weights for policy 0, policy_version 145390 (0.0046) [2024-06-22 05:41:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2382200832. Throughput: 0: 42822.6. Samples: 2382365600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 05:41:04,082][15401] Updated weights for policy 0, policy_version 145400 (0.0028) [2024-06-22 05:41:07,693][15401] Updated weights for policy 0, policy_version 145410 (0.0035) [2024-06-22 05:41:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2382430208. Throughput: 0: 43148.9. Samples: 2382503880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 05:41:11,609][15401] Updated weights for policy 0, policy_version 145420 (0.0032) [2024-06-22 05:41:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2382626816. Throughput: 0: 42871.1. Samples: 2382757940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 05:41:15,389][15401] Updated weights for policy 0, policy_version 145430 (0.0022) [2024-06-22 05:41:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2382856192. Throughput: 0: 42932.4. Samples: 2383011720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 05:41:19,123][15401] Updated weights for policy 0, policy_version 145440 (0.0030) [2024-06-22 05:41:22,984][15401] Updated weights for policy 0, policy_version 145450 (0.0035) [2024-06-22 05:41:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2383069184. Throughput: 0: 42968.7. Samples: 2383147980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:23,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 05:41:26,700][15401] Updated weights for policy 0, policy_version 145460 (0.0058) [2024-06-22 05:41:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2383265792. Throughput: 0: 42832.8. Samples: 2383399200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 05:41:30,603][15401] Updated weights for policy 0, policy_version 145470 (0.0037) [2024-06-22 05:41:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2383495168. Throughput: 0: 42709.5. Samples: 2383645180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 05:41:34,268][15401] Updated weights for policy 0, policy_version 145480 (0.0037) [2024-06-22 05:41:38,158][15401] Updated weights for policy 0, policy_version 145490 (0.0042) [2024-06-22 05:41:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2383724544. Throughput: 0: 42811.6. Samples: 2383786300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 05:41:42,223][15401] Updated weights for policy 0, policy_version 145500 (0.0041) [2024-06-22 05:41:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2383904768. Throughput: 0: 42723.4. Samples: 2384037060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 05:41:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 05:41:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145502_2383904768.pth... [2024-06-22 05:41:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000144878_2373681152.pth [2024-06-22 05:41:45,843][15401] Updated weights for policy 0, policy_version 145510 (0.0036) [2024-06-22 05:41:48,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 2384134144. Throughput: 0: 42671.4. Samples: 2384286080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:41:48,397][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 05:41:49,722][15401] Updated weights for policy 0, policy_version 145520 (0.0028) [2024-06-22 05:41:53,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 2384347136. Throughput: 0: 42595.3. Samples: 2384420940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:41:53,397][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 05:41:53,643][15401] Updated weights for policy 0, policy_version 145530 (0.0024) [2024-06-22 05:41:57,242][15401] Updated weights for policy 0, policy_version 145540 (0.0026) [2024-06-22 05:41:58,390][15132] Fps is (10 sec: 42624.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2384560128. Throughput: 0: 42552.7. Samples: 2384672820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:41:58,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 05:42:01,221][15401] Updated weights for policy 0, policy_version 145550 (0.0042) [2024-06-22 05:42:03,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2384773120. Throughput: 0: 42685.0. Samples: 2384932540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 05:42:03,640][15349] Signal inference workers to stop experience collection... (35150 times) [2024-06-22 05:42:03,643][15349] Signal inference workers to resume experience collection... (35150 times) [2024-06-22 05:42:03,654][15401] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-06-22 05:42:03,655][15401] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-06-22 05:42:05,043][15401] Updated weights for policy 0, policy_version 145560 (0.0040) [2024-06-22 05:42:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2384986112. Throughput: 0: 42502.3. Samples: 2385060580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 05:42:09,140][15401] Updated weights for policy 0, policy_version 145570 (0.0043) [2024-06-22 05:42:12,874][15401] Updated weights for policy 0, policy_version 145580 (0.0038) [2024-06-22 05:42:13,393][15132] Fps is (10 sec: 42585.1, 60 sec: 42869.3, 300 sec: 42764.6). Total num frames: 2385199104. Throughput: 0: 42588.7. Samples: 2385315820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:13,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 05:42:16,581][15401] Updated weights for policy 0, policy_version 145590 (0.0031) [2024-06-22 05:42:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2385412096. Throughput: 0: 42993.3. Samples: 2385579880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 05:42:20,346][15401] Updated weights for policy 0, policy_version 145600 (0.0024) [2024-06-22 05:42:23,389][15132] Fps is (10 sec: 40972.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2385608704. Throughput: 0: 42519.2. Samples: 2385699660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 05:42:23,978][15401] Updated weights for policy 0, policy_version 145610 (0.0033) [2024-06-22 05:42:28,115][15401] Updated weights for policy 0, policy_version 145620 (0.0036) [2024-06-22 05:42:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2385838080. Throughput: 0: 42763.7. Samples: 2385961420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 05:42:31,951][15401] Updated weights for policy 0, policy_version 145630 (0.0024) [2024-06-22 05:42:33,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 2386051072. Throughput: 0: 42956.9. Samples: 2386218960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:33,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 05:42:35,834][15401] Updated weights for policy 0, policy_version 145640 (0.0031) [2024-06-22 05:42:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2386264064. Throughput: 0: 42656.4. Samples: 2386340200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 05:42:39,659][15401] Updated weights for policy 0, policy_version 145650 (0.0037) [2024-06-22 05:42:43,389][15132] Fps is (10 sec: 40968.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2386460672. Throughput: 0: 42879.3. Samples: 2386602380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 05:42:43,670][15401] Updated weights for policy 0, policy_version 145660 (0.0033) [2024-06-22 05:42:47,366][15401] Updated weights for policy 0, policy_version 145670 (0.0043) [2024-06-22 05:42:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 2386673664. Throughput: 0: 42566.1. Samples: 2386848020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:42:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 05:42:51,255][15401] Updated weights for policy 0, policy_version 145680 (0.0039) [2024-06-22 05:42:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42329.9, 300 sec: 42653.9). Total num frames: 2386886656. Throughput: 0: 42616.6. Samples: 2386978320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:42:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 05:42:55,175][15401] Updated weights for policy 0, policy_version 145690 (0.0038) [2024-06-22 05:42:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 2387099648. Throughput: 0: 42604.2. Samples: 2387232980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:42:58,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 05:42:59,466][15401] Updated weights for policy 0, policy_version 145700 (0.0037) [2024-06-22 05:43:02,822][15401] Updated weights for policy 0, policy_version 145710 (0.0029) [2024-06-22 05:43:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.2, 300 sec: 42710.2). Total num frames: 2387329024. Throughput: 0: 42462.0. Samples: 2387490680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 05:43:07,059][15401] Updated weights for policy 0, policy_version 145720 (0.0038) [2024-06-22 05:43:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2387542016. Throughput: 0: 42832.4. Samples: 2387627120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 05:43:09,103][15349] Signal inference workers to stop experience collection... (35200 times) [2024-06-22 05:43:09,142][15401] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-06-22 05:43:09,163][15349] Signal inference workers to resume experience collection... (35200 times) [2024-06-22 05:43:09,163][15401] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-06-22 05:43:10,236][15401] Updated weights for policy 0, policy_version 145730 (0.0042) [2024-06-22 05:43:13,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42327.5, 300 sec: 42654.0). Total num frames: 2387738624. Throughput: 0: 42542.2. Samples: 2387875820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 05:43:14,453][15401] Updated weights for policy 0, policy_version 145740 (0.0040) [2024-06-22 05:43:17,710][15401] Updated weights for policy 0, policy_version 145750 (0.0029) [2024-06-22 05:43:18,391][15132] Fps is (10 sec: 44229.9, 60 sec: 42870.3, 300 sec: 42764.8). Total num frames: 2387984384. Throughput: 0: 42657.5. Samples: 2388138520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:18,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 05:43:22,083][15401] Updated weights for policy 0, policy_version 145760 (0.0026) [2024-06-22 05:43:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 2388180992. Throughput: 0: 42991.4. Samples: 2388274920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:23,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 05:43:25,089][15401] Updated weights for policy 0, policy_version 145770 (0.0029) [2024-06-22 05:43:28,389][15132] Fps is (10 sec: 40966.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2388393984. Throughput: 0: 42848.0. Samples: 2388530540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 05:43:29,607][15401] Updated weights for policy 0, policy_version 145780 (0.0034) [2024-06-22 05:43:32,958][15401] Updated weights for policy 0, policy_version 145790 (0.0027) [2024-06-22 05:43:33,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43146.0, 300 sec: 42820.5). Total num frames: 2388639744. Throughput: 0: 42926.6. Samples: 2388779720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 05:43:37,236][15401] Updated weights for policy 0, policy_version 145800 (0.0035) [2024-06-22 05:43:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2388819968. Throughput: 0: 43092.9. Samples: 2388917500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 05:43:40,606][15401] Updated weights for policy 0, policy_version 145810 (0.0030) [2024-06-22 05:43:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2389049344. Throughput: 0: 43126.7. Samples: 2389173580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 05:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145816_2389049344.pth... [2024-06-22 05:43:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145189_2378776576.pth [2024-06-22 05:43:44,932][15401] Updated weights for policy 0, policy_version 145820 (0.0035) [2024-06-22 05:43:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2389262336. Throughput: 0: 42969.0. Samples: 2389424280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 05:43:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 05:43:48,901][15401] Updated weights for policy 0, policy_version 145830 (0.0036) [2024-06-22 05:43:52,528][15401] Updated weights for policy 0, policy_version 145840 (0.0033) [2024-06-22 05:43:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2389475328. Throughput: 0: 42868.1. Samples: 2389556180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:43:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 05:43:56,111][15401] Updated weights for policy 0, policy_version 145850 (0.0041) [2024-06-22 05:43:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43144.5, 300 sec: 42764.7). Total num frames: 2389688320. Throughput: 0: 43157.7. Samples: 2389818020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:43:58,392][15132] Avg episode reward: [(0, '0.186')] [2024-06-22 05:44:00,144][15401] Updated weights for policy 0, policy_version 145860 (0.0041) [2024-06-22 05:44:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2389901312. Throughput: 0: 43054.9. Samples: 2390075920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:03,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 05:44:03,693][15401] Updated weights for policy 0, policy_version 145870 (0.0031) [2024-06-22 05:44:07,759][15401] Updated weights for policy 0, policy_version 145880 (0.0030) [2024-06-22 05:44:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 2390114304. Throughput: 0: 42968.4. Samples: 2390208400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:08,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-22 05:44:11,191][15401] Updated weights for policy 0, policy_version 145890 (0.0027) [2024-06-22 05:44:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2390343680. Throughput: 0: 42883.5. Samples: 2390460300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 05:44:15,618][15401] Updated weights for policy 0, policy_version 145900 (0.0032) [2024-06-22 05:44:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42599.6, 300 sec: 42709.5). Total num frames: 2390540288. Throughput: 0: 43032.2. Samples: 2390716160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 05:44:18,602][15349] Signal inference workers to stop experience collection... (35250 times) [2024-06-22 05:44:18,603][15349] Signal inference workers to resume experience collection... (35250 times) [2024-06-22 05:44:18,647][15401] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-06-22 05:44:18,647][15401] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-06-22 05:44:18,755][15401] Updated weights for policy 0, policy_version 145910 (0.0028) [2024-06-22 05:44:23,275][15401] Updated weights for policy 0, policy_version 145920 (0.0032) [2024-06-22 05:44:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 2390753280. Throughput: 0: 42851.9. Samples: 2390845840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:23,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 05:44:26,344][15401] Updated weights for policy 0, policy_version 145930 (0.0037) [2024-06-22 05:44:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2390982656. Throughput: 0: 42737.8. Samples: 2391096780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 05:44:30,832][15401] Updated weights for policy 0, policy_version 145940 (0.0025) [2024-06-22 05:44:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 2391195648. Throughput: 0: 43048.4. Samples: 2391361560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:33,393][15132] Avg episode reward: [(0, '0.243')] [2024-06-22 05:44:33,941][15401] Updated weights for policy 0, policy_version 145950 (0.0028) [2024-06-22 05:44:38,355][15401] Updated weights for policy 0, policy_version 145960 (0.0021) [2024-06-22 05:44:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2391408640. Throughput: 0: 42958.2. Samples: 2391489300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 05:44:41,439][15401] Updated weights for policy 0, policy_version 145970 (0.0040) [2024-06-22 05:44:43,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2391638016. Throughput: 0: 42850.2. Samples: 2391746180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 05:44:46,190][15401] Updated weights for policy 0, policy_version 145980 (0.0038) [2024-06-22 05:44:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2391834624. Throughput: 0: 42911.0. Samples: 2392006920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 05:44:49,532][15401] Updated weights for policy 0, policy_version 145990 (0.0035) [2024-06-22 05:44:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2392031232. Throughput: 0: 42670.8. Samples: 2392128580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 05:44:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 05:44:53,920][15401] Updated weights for policy 0, policy_version 146000 (0.0033) [2024-06-22 05:44:57,078][15401] Updated weights for policy 0, policy_version 146010 (0.0035) [2024-06-22 05:44:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 2392276992. Throughput: 0: 42824.9. Samples: 2392387420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:44:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 05:45:01,547][15401] Updated weights for policy 0, policy_version 146020 (0.0036) [2024-06-22 05:45:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2392473600. Throughput: 0: 42823.9. Samples: 2392643240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 05:45:04,765][15401] Updated weights for policy 0, policy_version 146030 (0.0042) [2024-06-22 05:45:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2392670208. Throughput: 0: 42714.4. Samples: 2392767980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:08,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 05:45:08,978][15401] Updated weights for policy 0, policy_version 146040 (0.0039) [2024-06-22 05:45:12,390][15401] Updated weights for policy 0, policy_version 146050 (0.0031) [2024-06-22 05:45:13,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2392932352. Throughput: 0: 42932.0. Samples: 2393028820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:13,392][15132] Avg episode reward: [(0, '0.061')] [2024-06-22 05:45:16,828][15401] Updated weights for policy 0, policy_version 146060 (0.0042) [2024-06-22 05:45:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 2393128960. Throughput: 0: 42721.9. Samples: 2393284040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:18,392][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 05:45:20,065][15401] Updated weights for policy 0, policy_version 146070 (0.0029) [2024-06-22 05:45:23,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 2393325568. Throughput: 0: 42655.1. Samples: 2393408780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 05:45:24,451][15401] Updated weights for policy 0, policy_version 146080 (0.0029) [2024-06-22 05:45:27,731][15401] Updated weights for policy 0, policy_version 146090 (0.0032) [2024-06-22 05:45:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2393554944. Throughput: 0: 42820.0. Samples: 2393673080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 05:45:32,319][15401] Updated weights for policy 0, policy_version 146100 (0.0026) [2024-06-22 05:45:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 2393784320. Throughput: 0: 42549.4. Samples: 2393921640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 05:45:34,806][15349] Signal inference workers to stop experience collection... (35300 times) [2024-06-22 05:45:34,842][15401] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-06-22 05:45:34,864][15349] Signal inference workers to resume experience collection... (35300 times) [2024-06-22 05:45:34,870][15401] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-06-22 05:45:35,179][15401] Updated weights for policy 0, policy_version 146110 (0.0030) [2024-06-22 05:45:38,390][15132] Fps is (10 sec: 40956.4, 60 sec: 42597.7, 300 sec: 42820.4). Total num frames: 2393964544. Throughput: 0: 42856.9. Samples: 2394057180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:38,391][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 05:45:39,817][15401] Updated weights for policy 0, policy_version 146120 (0.0035) [2024-06-22 05:45:42,822][15401] Updated weights for policy 0, policy_version 146130 (0.0023) [2024-06-22 05:45:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2394193920. Throughput: 0: 42793.9. Samples: 2394313140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 05:45:43,499][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146131_2394210304.pth... [2024-06-22 05:45:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145502_2383904768.pth [2024-06-22 05:45:47,375][15401] Updated weights for policy 0, policy_version 146140 (0.0040) [2024-06-22 05:45:48,390][15132] Fps is (10 sec: 45879.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2394423296. Throughput: 0: 42701.4. Samples: 2394564800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 05:45:50,321][15401] Updated weights for policy 0, policy_version 146150 (0.0034) [2024-06-22 05:45:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2394603520. Throughput: 0: 42844.8. Samples: 2394696000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 05:45:54,938][15401] Updated weights for policy 0, policy_version 146160 (0.0040) [2024-06-22 05:45:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2394832896. Throughput: 0: 42839.6. Samples: 2394956500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 05:45:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 05:45:58,406][15401] Updated weights for policy 0, policy_version 146170 (0.0037) [2024-06-22 05:46:02,493][15401] Updated weights for policy 0, policy_version 146180 (0.0045) [2024-06-22 05:46:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2395062272. Throughput: 0: 42840.8. Samples: 2395211780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 05:46:05,997][15401] Updated weights for policy 0, policy_version 146190 (0.0045) [2024-06-22 05:46:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2395258880. Throughput: 0: 42808.5. Samples: 2395335160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 05:46:10,067][15401] Updated weights for policy 0, policy_version 146200 (0.0038) [2024-06-22 05:46:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2395471872. Throughput: 0: 42747.1. Samples: 2395596700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:13,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 05:46:13,607][15401] Updated weights for policy 0, policy_version 146210 (0.0030) [2024-06-22 05:46:17,631][15401] Updated weights for policy 0, policy_version 146220 (0.0036) [2024-06-22 05:46:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2395684864. Throughput: 0: 42940.5. Samples: 2395853960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 05:46:21,236][15401] Updated weights for policy 0, policy_version 146230 (0.0035) [2024-06-22 05:46:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2395897856. Throughput: 0: 42668.9. Samples: 2395977340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:23,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 05:46:25,411][15401] Updated weights for policy 0, policy_version 146240 (0.0039) [2024-06-22 05:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2396110848. Throughput: 0: 42590.6. Samples: 2396229720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 05:46:29,427][15401] Updated weights for policy 0, policy_version 146250 (0.0046) [2024-06-22 05:46:33,311][15401] Updated weights for policy 0, policy_version 146260 (0.0023) [2024-06-22 05:46:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2396323840. Throughput: 0: 42727.6. Samples: 2396487540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 05:46:36,997][15401] Updated weights for policy 0, policy_version 146270 (0.0043) [2024-06-22 05:46:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42872.1, 300 sec: 42820.6). Total num frames: 2396536832. Throughput: 0: 42580.4. Samples: 2396612120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 05:46:40,914][15401] Updated weights for policy 0, policy_version 146280 (0.0040) [2024-06-22 05:46:43,390][15132] Fps is (10 sec: 42595.0, 60 sec: 42597.8, 300 sec: 42765.8). Total num frames: 2396749824. Throughput: 0: 42325.5. Samples: 2396861180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:43,391][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 05:46:44,629][15401] Updated weights for policy 0, policy_version 146290 (0.0027) [2024-06-22 05:46:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42710.4). Total num frames: 2396946432. Throughput: 0: 42617.4. Samples: 2397129560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 05:46:48,620][15401] Updated weights for policy 0, policy_version 146300 (0.0033) [2024-06-22 05:46:52,203][15401] Updated weights for policy 0, policy_version 146310 (0.0038) [2024-06-22 05:46:53,390][15132] Fps is (10 sec: 42601.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2397175808. Throughput: 0: 42670.1. Samples: 2397255320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:53,391][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 05:46:56,150][15401] Updated weights for policy 0, policy_version 146320 (0.0042) [2024-06-22 05:46:57,652][15349] Signal inference workers to stop experience collection... (35350 times) [2024-06-22 05:46:57,653][15349] Signal inference workers to resume experience collection... (35350 times) [2024-06-22 05:46:57,668][15401] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-06-22 05:46:57,668][15401] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-06-22 05:46:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2397388800. Throughput: 0: 42555.6. Samples: 2397511700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 05:46:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 05:46:59,950][15401] Updated weights for policy 0, policy_version 146330 (0.0036) [2024-06-22 05:47:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2397585408. Throughput: 0: 42513.8. Samples: 2397767080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:03,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 05:47:03,974][15401] Updated weights for policy 0, policy_version 146340 (0.0044) [2024-06-22 05:47:07,586][15401] Updated weights for policy 0, policy_version 146350 (0.0027) [2024-06-22 05:47:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42765.1). Total num frames: 2397814784. Throughput: 0: 42524.0. Samples: 2397890920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:08,392][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 05:47:11,442][15401] Updated weights for policy 0, policy_version 146360 (0.0037) [2024-06-22 05:47:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2398044160. Throughput: 0: 42758.7. Samples: 2398153860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:13,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 05:47:15,450][15401] Updated weights for policy 0, policy_version 146370 (0.0038) [2024-06-22 05:47:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2398240768. Throughput: 0: 42843.6. Samples: 2398415500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 05:47:19,259][15401] Updated weights for policy 0, policy_version 146380 (0.0035) [2024-06-22 05:47:23,089][15401] Updated weights for policy 0, policy_version 146390 (0.0032) [2024-06-22 05:47:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2398453760. Throughput: 0: 42685.8. Samples: 2398532980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 05:47:26,759][15401] Updated weights for policy 0, policy_version 146400 (0.0038) [2024-06-22 05:47:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2398683136. Throughput: 0: 42904.7. Samples: 2398791860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 05:47:31,011][15401] Updated weights for policy 0, policy_version 146410 (0.0024) [2024-06-22 05:47:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2398879744. Throughput: 0: 42712.8. Samples: 2399051640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 05:47:34,587][15401] Updated weights for policy 0, policy_version 146420 (0.0032) [2024-06-22 05:47:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2399092736. Throughput: 0: 42610.6. Samples: 2399172800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:38,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 05:47:38,744][15401] Updated weights for policy 0, policy_version 146430 (0.0037) [2024-06-22 05:47:42,328][15401] Updated weights for policy 0, policy_version 146440 (0.0027) [2024-06-22 05:47:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43145.1, 300 sec: 42931.6). Total num frames: 2399338496. Throughput: 0: 42813.4. Samples: 2399438300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 05:47:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146444_2399338496.pth... [2024-06-22 05:47:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000145816_2389049344.pth [2024-06-22 05:47:46,403][15401] Updated weights for policy 0, policy_version 146450 (0.0028) [2024-06-22 05:47:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2399535104. Throughput: 0: 42770.6. Samples: 2399691760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 05:47:49,871][15401] Updated weights for policy 0, policy_version 146460 (0.0024) [2024-06-22 05:47:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2399748096. Throughput: 0: 42804.1. Samples: 2399817000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 05:47:53,850][15401] Updated weights for policy 0, policy_version 146470 (0.0035) [2024-06-22 05:47:57,445][15401] Updated weights for policy 0, policy_version 146480 (0.0035) [2024-06-22 05:47:58,390][15132] Fps is (10 sec: 44233.5, 60 sec: 43143.9, 300 sec: 42876.0). Total num frames: 2399977472. Throughput: 0: 42864.5. Samples: 2400082800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:47:58,391][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 05:48:01,255][15401] Updated weights for policy 0, policy_version 146490 (0.0032) [2024-06-22 05:48:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2400174080. Throughput: 0: 42630.1. Samples: 2400333860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 05:48:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 05:48:05,191][15401] Updated weights for policy 0, policy_version 146500 (0.0041) [2024-06-22 05:48:08,390][15132] Fps is (10 sec: 40963.2, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 2400387072. Throughput: 0: 42823.6. Samples: 2400460040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 05:48:09,101][15401] Updated weights for policy 0, policy_version 146510 (0.0042) [2024-06-22 05:48:12,978][15401] Updated weights for policy 0, policy_version 146520 (0.0026) [2024-06-22 05:48:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.2). Total num frames: 2400600064. Throughput: 0: 42866.6. Samples: 2400720860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 05:48:16,828][15401] Updated weights for policy 0, policy_version 146530 (0.0041) [2024-06-22 05:48:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2400813056. Throughput: 0: 42795.2. Samples: 2400977420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 05:48:20,521][15401] Updated weights for policy 0, policy_version 146540 (0.0047) [2024-06-22 05:48:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2401026048. Throughput: 0: 42966.8. Samples: 2401106300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 05:48:24,688][15401] Updated weights for policy 0, policy_version 146550 (0.0029) [2024-06-22 05:48:25,628][15349] Signal inference workers to stop experience collection... (35400 times) [2024-06-22 05:48:25,628][15349] Signal inference workers to resume experience collection... (35400 times) [2024-06-22 05:48:25,687][15401] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-06-22 05:48:25,687][15401] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-06-22 05:48:28,080][15401] Updated weights for policy 0, policy_version 146560 (0.0034) [2024-06-22 05:48:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2401239040. Throughput: 0: 42775.4. Samples: 2401363200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 05:48:32,281][15401] Updated weights for policy 0, policy_version 146570 (0.0036) [2024-06-22 05:48:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2401435648. Throughput: 0: 42861.5. Samples: 2401620520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 05:48:35,868][15401] Updated weights for policy 0, policy_version 146580 (0.0028) [2024-06-22 05:48:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2401681408. Throughput: 0: 42724.4. Samples: 2401739600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:38,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 05:48:39,895][15401] Updated weights for policy 0, policy_version 146590 (0.0038) [2024-06-22 05:48:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2401878016. Throughput: 0: 42643.1. Samples: 2402001700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 05:48:43,422][15401] Updated weights for policy 0, policy_version 146600 (0.0031) [2024-06-22 05:48:47,842][15401] Updated weights for policy 0, policy_version 146610 (0.0037) [2024-06-22 05:48:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2402074624. Throughput: 0: 42818.7. Samples: 2402260700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 05:48:51,097][15401] Updated weights for policy 0, policy_version 146620 (0.0035) [2024-06-22 05:48:53,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 2402320384. Throughput: 0: 42667.5. Samples: 2402380080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 05:48:55,574][15401] Updated weights for policy 0, policy_version 146630 (0.0036) [2024-06-22 05:48:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.9, 300 sec: 42765.0). Total num frames: 2402516992. Throughput: 0: 42553.9. Samples: 2402635780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:48:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 05:48:58,677][15401] Updated weights for policy 0, policy_version 146640 (0.0029) [2024-06-22 05:49:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2402697216. Throughput: 0: 42781.3. Samples: 2402902580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:49:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 05:49:03,432][15401] Updated weights for policy 0, policy_version 146650 (0.0043) [2024-06-22 05:49:06,508][15401] Updated weights for policy 0, policy_version 146660 (0.0025) [2024-06-22 05:49:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2402975744. Throughput: 0: 42586.1. Samples: 2403022680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 05:49:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 05:49:11,146][15401] Updated weights for policy 0, policy_version 146670 (0.0032) [2024-06-22 05:49:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2403155968. Throughput: 0: 42667.7. Samples: 2403283240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 05:49:14,057][15401] Updated weights for policy 0, policy_version 146680 (0.0033) [2024-06-22 05:49:18,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2403336192. Throughput: 0: 42668.3. Samples: 2403540600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 05:49:18,782][15401] Updated weights for policy 0, policy_version 146690 (0.0024) [2024-06-22 05:49:22,229][15401] Updated weights for policy 0, policy_version 146700 (0.0039) [2024-06-22 05:49:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2403614720. Throughput: 0: 42642.7. Samples: 2403658520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 05:49:26,408][15401] Updated weights for policy 0, policy_version 146710 (0.0034) [2024-06-22 05:49:28,210][15349] Signal inference workers to stop experience collection... (35450 times) [2024-06-22 05:49:28,211][15349] Signal inference workers to resume experience collection... (35450 times) [2024-06-22 05:49:28,226][15401] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-06-22 05:49:28,227][15401] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-06-22 05:49:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2403811328. Throughput: 0: 42770.9. Samples: 2403926400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 05:49:29,534][15401] Updated weights for policy 0, policy_version 146720 (0.0039) [2024-06-22 05:49:33,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2403991552. Throughput: 0: 42673.8. Samples: 2404181020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 05:49:34,046][15401] Updated weights for policy 0, policy_version 146730 (0.0031) [2024-06-22 05:49:37,242][15401] Updated weights for policy 0, policy_version 146740 (0.0036) [2024-06-22 05:49:38,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2404253696. Throughput: 0: 42824.2. Samples: 2404307160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 05:49:42,090][15401] Updated weights for policy 0, policy_version 146750 (0.0025) [2024-06-22 05:49:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2404433920. Throughput: 0: 43037.0. Samples: 2404572440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 05:49:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146756_2404450304.pth... [2024-06-22 05:49:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146131_2394210304.pth [2024-06-22 05:49:44,777][15401] Updated weights for policy 0, policy_version 146760 (0.0037) [2024-06-22 05:49:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2404646912. Throughput: 0: 42540.5. Samples: 2404816900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 05:49:49,755][15401] Updated weights for policy 0, policy_version 146770 (0.0035) [2024-06-22 05:49:52,573][15401] Updated weights for policy 0, policy_version 146780 (0.0037) [2024-06-22 05:49:53,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2404892672. Throughput: 0: 42771.5. Samples: 2404947400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 05:49:57,291][15401] Updated weights for policy 0, policy_version 146790 (0.0041) [2024-06-22 05:49:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2405056512. Throughput: 0: 42831.4. Samples: 2405210660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:49:58,391][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 05:50:00,126][15401] Updated weights for policy 0, policy_version 146800 (0.0032) [2024-06-22 05:50:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2405302272. Throughput: 0: 42502.6. Samples: 2405453220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:50:03,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 05:50:04,867][15401] Updated weights for policy 0, policy_version 146810 (0.0038) [2024-06-22 05:50:07,654][15401] Updated weights for policy 0, policy_version 146820 (0.0034) [2024-06-22 05:50:08,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2405515264. Throughput: 0: 42928.0. Samples: 2405590280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 05:50:08,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 05:50:12,515][15401] Updated weights for policy 0, policy_version 146830 (0.0021) [2024-06-22 05:50:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2405711872. Throughput: 0: 42786.5. Samples: 2405851780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 05:50:15,367][15401] Updated weights for policy 0, policy_version 146840 (0.0031) [2024-06-22 05:50:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2405957632. Throughput: 0: 42641.4. Samples: 2406099880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 05:50:20,144][15401] Updated weights for policy 0, policy_version 146850 (0.0047) [2024-06-22 05:50:22,951][15401] Updated weights for policy 0, policy_version 146860 (0.0030) [2024-06-22 05:50:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2406170624. Throughput: 0: 42882.2. Samples: 2406236860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 05:50:27,742][15401] Updated weights for policy 0, policy_version 146870 (0.0032) [2024-06-22 05:50:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2406334464. Throughput: 0: 42639.1. Samples: 2406491200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 05:50:30,639][15401] Updated weights for policy 0, policy_version 146880 (0.0037) [2024-06-22 05:50:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42820.7). Total num frames: 2406596608. Throughput: 0: 42723.9. Samples: 2406739480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 05:50:35,244][15401] Updated weights for policy 0, policy_version 146890 (0.0034) [2024-06-22 05:50:38,218][15401] Updated weights for policy 0, policy_version 146900 (0.0022) [2024-06-22 05:50:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2406809600. Throughput: 0: 43002.8. Samples: 2406882520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 05:50:42,797][15401] Updated weights for policy 0, policy_version 146910 (0.0044) [2024-06-22 05:50:43,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 2406973440. Throughput: 0: 42765.0. Samples: 2407135180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:43,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 05:50:45,782][15401] Updated weights for policy 0, policy_version 146920 (0.0040) [2024-06-22 05:50:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2407235584. Throughput: 0: 42949.9. Samples: 2407385960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 05:50:50,232][15401] Updated weights for policy 0, policy_version 146930 (0.0038) [2024-06-22 05:50:53,389][15132] Fps is (10 sec: 47525.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2407448576. Throughput: 0: 42903.6. Samples: 2407520940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:53,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 05:50:53,642][15401] Updated weights for policy 0, policy_version 146940 (0.0026) [2024-06-22 05:50:58,226][15401] Updated weights for policy 0, policy_version 146950 (0.0037) [2024-06-22 05:50:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2407628800. Throughput: 0: 42774.2. Samples: 2407776620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:50:58,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 05:51:01,215][15401] Updated weights for policy 0, policy_version 146960 (0.0028) [2024-06-22 05:51:02,735][15349] Signal inference workers to stop experience collection... (35500 times) [2024-06-22 05:51:02,735][15349] Signal inference workers to resume experience collection... (35500 times) [2024-06-22 05:51:02,761][15401] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-06-22 05:51:02,791][15401] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-06-22 05:51:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2407890944. Throughput: 0: 42888.4. Samples: 2408029860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:51:03,390][15132] Avg episode reward: [(0, '0.923')] [2024-06-22 05:51:05,590][15401] Updated weights for policy 0, policy_version 146970 (0.0037) [2024-06-22 05:51:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2408087552. Throughput: 0: 42915.8. Samples: 2408168080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:51:08,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 05:51:08,916][15401] Updated weights for policy 0, policy_version 146980 (0.0038) [2024-06-22 05:51:13,080][15401] Updated weights for policy 0, policy_version 146990 (0.0057) [2024-06-22 05:51:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2408284160. Throughput: 0: 42888.8. Samples: 2408421200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 05:51:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 05:51:16,428][15401] Updated weights for policy 0, policy_version 147000 (0.0028) [2024-06-22 05:51:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2408529920. Throughput: 0: 43113.9. Samples: 2408679600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 05:51:20,526][15401] Updated weights for policy 0, policy_version 147010 (0.0048) [2024-06-22 05:51:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2408742912. Throughput: 0: 42958.1. Samples: 2408815640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 05:51:24,030][15401] Updated weights for policy 0, policy_version 147020 (0.0035) [2024-06-22 05:51:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2408923136. Throughput: 0: 42957.9. Samples: 2409068180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:28,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-22 05:51:28,461][15401] Updated weights for policy 0, policy_version 147030 (0.0036) [2024-06-22 05:51:31,700][15401] Updated weights for policy 0, policy_version 147040 (0.0042) [2024-06-22 05:51:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2409168896. Throughput: 0: 43062.6. Samples: 2409323780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 05:51:36,097][15401] Updated weights for policy 0, policy_version 147050 (0.0033) [2024-06-22 05:51:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.2). Total num frames: 2409398272. Throughput: 0: 43073.3. Samples: 2409459240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 05:51:39,124][15401] Updated weights for policy 0, policy_version 147060 (0.0042) [2024-06-22 05:51:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43419.2, 300 sec: 42820.5). Total num frames: 2409578496. Throughput: 0: 42977.2. Samples: 2409710600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 05:51:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147069_2409578496.pth... [2024-06-22 05:51:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146444_2399338496.pth [2024-06-22 05:51:43,703][15401] Updated weights for policy 0, policy_version 147070 (0.0031) [2024-06-22 05:51:46,862][15401] Updated weights for policy 0, policy_version 147080 (0.0035) [2024-06-22 05:51:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2409807872. Throughput: 0: 43123.6. Samples: 2409970420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 05:51:51,279][15401] Updated weights for policy 0, policy_version 147090 (0.0027) [2024-06-22 05:51:53,394][15132] Fps is (10 sec: 47493.9, 60 sec: 43414.5, 300 sec: 42931.0). Total num frames: 2410053632. Throughput: 0: 42894.7. Samples: 2410098520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:53,394][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 05:51:54,614][15401] Updated weights for policy 0, policy_version 147100 (0.0038) [2024-06-22 05:51:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2410217472. Throughput: 0: 43036.5. Samples: 2410357840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:51:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 05:51:58,956][15401] Updated weights for policy 0, policy_version 147110 (0.0038) [2024-06-22 05:52:02,053][15401] Updated weights for policy 0, policy_version 147120 (0.0036) [2024-06-22 05:52:03,390][15132] Fps is (10 sec: 40977.4, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2410463232. Throughput: 0: 43093.2. Samples: 2410618800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:52:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 05:52:06,744][15401] Updated weights for policy 0, policy_version 147130 (0.0034) [2024-06-22 05:52:07,773][15349] Signal inference workers to stop experience collection... (35550 times) [2024-06-22 05:52:07,774][15349] Signal inference workers to resume experience collection... (35550 times) [2024-06-22 05:52:07,806][15401] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-06-22 05:52:07,806][15401] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-06-22 05:52:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 2410692608. Throughput: 0: 42973.9. Samples: 2410749460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:52:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 05:52:09,503][15401] Updated weights for policy 0, policy_version 147140 (0.0032) [2024-06-22 05:52:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2410872832. Throughput: 0: 43075.9. Samples: 2411006600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:52:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 05:52:14,217][15401] Updated weights for policy 0, policy_version 147150 (0.0028) [2024-06-22 05:52:17,040][15401] Updated weights for policy 0, policy_version 147160 (0.0045) [2024-06-22 05:52:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2411085824. Throughput: 0: 42945.4. Samples: 2411256320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 05:52:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 05:52:21,646][15401] Updated weights for policy 0, policy_version 147170 (0.0042) [2024-06-22 05:52:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2411315200. Throughput: 0: 42908.5. Samples: 2411390120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 05:52:24,455][15401] Updated weights for policy 0, policy_version 147180 (0.0041) [2024-06-22 05:52:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2411511808. Throughput: 0: 43039.7. Samples: 2411647380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 05:52:29,073][15401] Updated weights for policy 0, policy_version 147190 (0.0040) [2024-06-22 05:52:32,442][15401] Updated weights for policy 0, policy_version 147200 (0.0031) [2024-06-22 05:52:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 2411741184. Throughput: 0: 42822.6. Samples: 2411897540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:33,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 05:52:36,919][15401] Updated weights for policy 0, policy_version 147210 (0.0035) [2024-06-22 05:52:38,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2411970560. Throughput: 0: 43034.6. Samples: 2412035000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:38,393][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 05:52:40,246][15401] Updated weights for policy 0, policy_version 147220 (0.0030) [2024-06-22 05:52:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2412150784. Throughput: 0: 42863.5. Samples: 2412286700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 05:52:44,596][15401] Updated weights for policy 0, policy_version 147230 (0.0026) [2024-06-22 05:52:47,676][15401] Updated weights for policy 0, policy_version 147240 (0.0028) [2024-06-22 05:52:48,391][15132] Fps is (10 sec: 40965.8, 60 sec: 42870.7, 300 sec: 42820.4). Total num frames: 2412380160. Throughput: 0: 42700.8. Samples: 2412540380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:48,391][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 05:52:52,178][15401] Updated weights for policy 0, policy_version 147250 (0.0039) [2024-06-22 05:52:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42601.5, 300 sec: 42820.7). Total num frames: 2412609536. Throughput: 0: 42872.4. Samples: 2412678720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 05:52:55,643][15401] Updated weights for policy 0, policy_version 147260 (0.0037) [2024-06-22 05:52:58,390][15132] Fps is (10 sec: 40963.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2412789760. Throughput: 0: 42721.2. Samples: 2412929060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:52:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 05:53:00,060][15401] Updated weights for policy 0, policy_version 147270 (0.0032) [2024-06-22 05:53:03,106][15401] Updated weights for policy 0, policy_version 147280 (0.0024) [2024-06-22 05:53:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2413035520. Throughput: 0: 42929.6. Samples: 2413188160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:53:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 05:53:07,518][15401] Updated weights for policy 0, policy_version 147290 (0.0041) [2024-06-22 05:53:08,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2413248512. Throughput: 0: 43036.4. Samples: 2413326760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:53:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 05:53:11,245][15401] Updated weights for policy 0, policy_version 147300 (0.0038) [2024-06-22 05:53:13,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2413445120. Throughput: 0: 42908.8. Samples: 2413578280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:53:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 05:53:15,054][15401] Updated weights for policy 0, policy_version 147310 (0.0021) [2024-06-22 05:53:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2413658112. Throughput: 0: 43063.6. Samples: 2413835300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:53:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 05:53:18,788][15401] Updated weights for policy 0, policy_version 147320 (0.0028) [2024-06-22 05:53:22,944][15401] Updated weights for policy 0, policy_version 147330 (0.0040) [2024-06-22 05:53:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2413887488. Throughput: 0: 42902.4. Samples: 2413965500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 05:53:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 05:53:26,276][15401] Updated weights for policy 0, policy_version 147340 (0.0043) [2024-06-22 05:53:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2414084096. Throughput: 0: 42856.4. Samples: 2414215240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 05:53:29,338][15349] Signal inference workers to stop experience collection... (35600 times) [2024-06-22 05:53:29,372][15401] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-06-22 05:53:29,398][15349] Signal inference workers to resume experience collection... (35600 times) [2024-06-22 05:53:29,398][15401] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-06-22 05:53:30,522][15401] Updated weights for policy 0, policy_version 147350 (0.0032) [2024-06-22 05:53:33,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42868.6, 300 sec: 42819.6). Total num frames: 2414313472. Throughput: 0: 42783.8. Samples: 2414465880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:33,396][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 05:53:33,956][15401] Updated weights for policy 0, policy_version 147360 (0.0034) [2024-06-22 05:53:38,155][15401] Updated weights for policy 0, policy_version 147370 (0.0041) [2024-06-22 05:53:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2414526464. Throughput: 0: 42720.4. Samples: 2414601140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:38,396][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 05:53:41,511][15401] Updated weights for policy 0, policy_version 147380 (0.0039) [2024-06-22 05:53:43,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2414723072. Throughput: 0: 42805.6. Samples: 2414855300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 05:53:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147384_2414739456.pth... [2024-06-22 05:53:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000146756_2404450304.pth [2024-06-22 05:53:45,823][15401] Updated weights for policy 0, policy_version 147390 (0.0037) [2024-06-22 05:53:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42872.2, 300 sec: 42820.6). Total num frames: 2414952448. Throughput: 0: 42651.7. Samples: 2415107480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:48,399][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 05:53:49,059][15401] Updated weights for policy 0, policy_version 147400 (0.0031) [2024-06-22 05:53:53,343][15401] Updated weights for policy 0, policy_version 147410 (0.0022) [2024-06-22 05:53:53,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 2415165440. Throughput: 0: 42537.4. Samples: 2415241220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:53,396][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 05:53:57,053][15401] Updated weights for policy 0, policy_version 147420 (0.0044) [2024-06-22 05:53:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 2415378432. Throughput: 0: 42692.9. Samples: 2415499460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:53:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 05:54:00,886][15401] Updated weights for policy 0, policy_version 147430 (0.0037) [2024-06-22 05:54:03,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 2415607808. Throughput: 0: 42645.9. Samples: 2415754360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:54:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 05:54:04,516][15401] Updated weights for policy 0, policy_version 147440 (0.0035) [2024-06-22 05:54:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2415804416. Throughput: 0: 42747.6. Samples: 2415889140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:54:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 05:54:08,832][15401] Updated weights for policy 0, policy_version 147450 (0.0045) [2024-06-22 05:54:12,135][15401] Updated weights for policy 0, policy_version 147460 (0.0039) [2024-06-22 05:54:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2416017408. Throughput: 0: 42831.2. Samples: 2416142640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:54:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 05:54:16,214][15401] Updated weights for policy 0, policy_version 147470 (0.0032) [2024-06-22 05:54:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2416230400. Throughput: 0: 42955.1. Samples: 2416398580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:54:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 05:54:19,981][15401] Updated weights for policy 0, policy_version 147480 (0.0044) [2024-06-22 05:54:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2416443392. Throughput: 0: 42802.2. Samples: 2416527240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 05:54:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 05:54:23,726][15401] Updated weights for policy 0, policy_version 147490 (0.0034) [2024-06-22 05:54:27,649][15401] Updated weights for policy 0, policy_version 147500 (0.0024) [2024-06-22 05:54:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2416656384. Throughput: 0: 42927.9. Samples: 2416787060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 05:54:31,548][15401] Updated weights for policy 0, policy_version 147510 (0.0031) [2024-06-22 05:54:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 2416885760. Throughput: 0: 42971.6. Samples: 2417041200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:33,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 05:54:35,237][15401] Updated weights for policy 0, policy_version 147520 (0.0031) [2024-06-22 05:54:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 2417065984. Throughput: 0: 42879.9. Samples: 2417170540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 05:54:39,188][15401] Updated weights for policy 0, policy_version 147530 (0.0045) [2024-06-22 05:54:43,274][15401] Updated weights for policy 0, policy_version 147540 (0.0035) [2024-06-22 05:54:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2417295360. Throughput: 0: 42752.8. Samples: 2417423340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 05:54:46,735][15401] Updated weights for policy 0, policy_version 147550 (0.0045) [2024-06-22 05:54:46,770][15349] Signal inference workers to stop experience collection... (35650 times) [2024-06-22 05:54:46,770][15349] Signal inference workers to resume experience collection... (35650 times) [2024-06-22 05:54:46,780][15401] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-06-22 05:54:46,780][15401] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-06-22 05:54:48,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 2417541120. Throughput: 0: 42911.8. Samples: 2417685500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:48,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 05:54:50,751][15401] Updated weights for policy 0, policy_version 147560 (0.0038) [2024-06-22 05:54:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42602.9, 300 sec: 42931.6). Total num frames: 2417721344. Throughput: 0: 42669.2. Samples: 2417809260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 05:54:54,475][15401] Updated weights for policy 0, policy_version 147570 (0.0028) [2024-06-22 05:54:58,344][15401] Updated weights for policy 0, policy_version 147580 (0.0034) [2024-06-22 05:54:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2417950720. Throughput: 0: 42760.8. Samples: 2418066880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:54:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 05:55:01,996][15401] Updated weights for policy 0, policy_version 147590 (0.0042) [2024-06-22 05:55:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2418180096. Throughput: 0: 42800.0. Samples: 2418324580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 05:55:05,854][15401] Updated weights for policy 0, policy_version 147600 (0.0026) [2024-06-22 05:55:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2418360320. Throughput: 0: 42774.7. Samples: 2418452100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 05:55:10,042][15401] Updated weights for policy 0, policy_version 147610 (0.0043) [2024-06-22 05:55:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2418589696. Throughput: 0: 42607.6. Samples: 2418704400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 05:55:13,394][15401] Updated weights for policy 0, policy_version 147620 (0.0036) [2024-06-22 05:55:17,564][15401] Updated weights for policy 0, policy_version 147630 (0.0027) [2024-06-22 05:55:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2418802688. Throughput: 0: 42670.7. Samples: 2418961380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:18,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 05:55:20,963][15401] Updated weights for policy 0, policy_version 147640 (0.0027) [2024-06-22 05:55:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2418982912. Throughput: 0: 42613.3. Samples: 2419088140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 05:55:25,021][15401] Updated weights for policy 0, policy_version 147650 (0.0034) [2024-06-22 05:55:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2419245056. Throughput: 0: 42747.6. Samples: 2419346980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 05:55:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 05:55:28,537][15401] Updated weights for policy 0, policy_version 147660 (0.0029) [2024-06-22 05:55:32,826][15401] Updated weights for policy 0, policy_version 147670 (0.0038) [2024-06-22 05:55:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2419441664. Throughput: 0: 42745.7. Samples: 2419608960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:33,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 05:55:36,040][15401] Updated weights for policy 0, policy_version 147680 (0.0031) [2024-06-22 05:55:38,394][15132] Fps is (10 sec: 42579.6, 60 sec: 43414.3, 300 sec: 43042.4). Total num frames: 2419671040. Throughput: 0: 42765.6. Samples: 2419733900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:38,394][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 05:55:40,406][15401] Updated weights for policy 0, policy_version 147690 (0.0040) [2024-06-22 05:55:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2419884032. Throughput: 0: 42796.9. Samples: 2419992740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:43,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-22 05:55:43,553][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147699_2419900416.pth... [2024-06-22 05:55:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147069_2409578496.pth [2024-06-22 05:55:43,734][15401] Updated weights for policy 0, policy_version 147700 (0.0028) [2024-06-22 05:55:48,389][15132] Fps is (10 sec: 39339.5, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 2420064256. Throughput: 0: 42747.1. Samples: 2420248200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 05:55:48,399][15401] Updated weights for policy 0, policy_version 147710 (0.0034) [2024-06-22 05:55:51,502][15401] Updated weights for policy 0, policy_version 147720 (0.0034) [2024-06-22 05:55:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2420293632. Throughput: 0: 42567.2. Samples: 2420367620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:53,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 05:55:55,750][15349] Signal inference workers to stop experience collection... (35700 times) [2024-06-22 05:55:55,751][15349] Signal inference workers to resume experience collection... (35700 times) [2024-06-22 05:55:55,788][15401] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-06-22 05:55:55,789][15401] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-06-22 05:55:56,085][15401] Updated weights for policy 0, policy_version 147730 (0.0042) [2024-06-22 05:55:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2420506624. Throughput: 0: 42733.7. Samples: 2420627420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:55:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 05:55:59,070][15401] Updated weights for policy 0, policy_version 147740 (0.0029) [2024-06-22 05:56:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 2420719616. Throughput: 0: 42791.5. Samples: 2420887000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 05:56:03,524][15401] Updated weights for policy 0, policy_version 147750 (0.0030) [2024-06-22 05:56:06,830][15401] Updated weights for policy 0, policy_version 147760 (0.0038) [2024-06-22 05:56:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2420948992. Throughput: 0: 42832.7. Samples: 2421015620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 05:56:11,261][15401] Updated weights for policy 0, policy_version 147770 (0.0039) [2024-06-22 05:56:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2421145600. Throughput: 0: 42855.6. Samples: 2421275480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 05:56:14,713][15401] Updated weights for policy 0, policy_version 147780 (0.0036) [2024-06-22 05:56:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2421358592. Throughput: 0: 42795.7. Samples: 2421534760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 05:56:19,169][15401] Updated weights for policy 0, policy_version 147790 (0.0039) [2024-06-22 05:56:22,271][15401] Updated weights for policy 0, policy_version 147800 (0.0029) [2024-06-22 05:56:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43690.5, 300 sec: 42987.1). Total num frames: 2421604352. Throughput: 0: 42729.9. Samples: 2421656560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 05:56:26,743][15401] Updated weights for policy 0, policy_version 147810 (0.0043) [2024-06-22 05:56:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2421784576. Throughput: 0: 42739.5. Samples: 2421916020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 05:56:30,165][15401] Updated weights for policy 0, policy_version 147820 (0.0034) [2024-06-22 05:56:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2421981184. Throughput: 0: 42844.3. Samples: 2422176200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 05:56:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 05:56:34,407][15401] Updated weights for policy 0, policy_version 147830 (0.0040) [2024-06-22 05:56:37,626][15401] Updated weights for policy 0, policy_version 147840 (0.0032) [2024-06-22 05:56:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42874.7, 300 sec: 42931.7). Total num frames: 2422243328. Throughput: 0: 42968.4. Samples: 2422301200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:56:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 05:56:42,181][15401] Updated weights for policy 0, policy_version 147850 (0.0047) [2024-06-22 05:56:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2422423552. Throughput: 0: 42788.4. Samples: 2422552900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:56:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 05:56:45,284][15401] Updated weights for policy 0, policy_version 147860 (0.0028) [2024-06-22 05:56:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42654.5). Total num frames: 2422636544. Throughput: 0: 42683.1. Samples: 2422807740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:56:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 05:56:49,825][15401] Updated weights for policy 0, policy_version 147870 (0.0037) [2024-06-22 05:56:52,932][15401] Updated weights for policy 0, policy_version 147880 (0.0040) [2024-06-22 05:56:53,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2422882304. Throughput: 0: 42745.3. Samples: 2422939260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:56:53,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 05:56:57,415][15401] Updated weights for policy 0, policy_version 147890 (0.0032) [2024-06-22 05:56:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2423078912. Throughput: 0: 42730.2. Samples: 2423198340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:56:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 05:57:00,775][15401] Updated weights for policy 0, policy_version 147900 (0.0026) [2024-06-22 05:57:03,390][15132] Fps is (10 sec: 40966.0, 60 sec: 42870.9, 300 sec: 42709.3). Total num frames: 2423291904. Throughput: 0: 42587.1. Samples: 2423451220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:03,391][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 05:57:05,094][15401] Updated weights for policy 0, policy_version 147910 (0.0028) [2024-06-22 05:57:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2423504896. Throughput: 0: 42656.5. Samples: 2423576100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 05:57:08,680][15401] Updated weights for policy 0, policy_version 147920 (0.0052) [2024-06-22 05:57:12,872][15401] Updated weights for policy 0, policy_version 147930 (0.0036) [2024-06-22 05:57:13,396][15132] Fps is (10 sec: 40937.6, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 2423701504. Throughput: 0: 42581.0. Samples: 2423832440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:13,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 05:57:16,034][15349] Signal inference workers to stop experience collection... (35750 times) [2024-06-22 05:57:16,086][15401] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-06-22 05:57:16,092][15349] Signal inference workers to resume experience collection... (35750 times) [2024-06-22 05:57:16,104][15401] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-06-22 05:57:16,226][15401] Updated weights for policy 0, policy_version 147940 (0.0027) [2024-06-22 05:57:18,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 2423930880. Throughput: 0: 42355.3. Samples: 2424082460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:18,397][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 05:57:20,378][15401] Updated weights for policy 0, policy_version 147950 (0.0022) [2024-06-22 05:57:23,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2424127488. Throughput: 0: 42498.2. Samples: 2424213620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 05:57:23,745][15401] Updated weights for policy 0, policy_version 147960 (0.0033) [2024-06-22 05:57:27,938][15401] Updated weights for policy 0, policy_version 147970 (0.0031) [2024-06-22 05:57:28,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2424356864. Throughput: 0: 42756.0. Samples: 2424476920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 05:57:31,470][15401] Updated weights for policy 0, policy_version 147980 (0.0025) [2024-06-22 05:57:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 2424586240. Throughput: 0: 42715.2. Samples: 2424729920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 05:57:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 05:57:35,682][15401] Updated weights for policy 0, policy_version 147990 (0.0027) [2024-06-22 05:57:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2424766464. Throughput: 0: 42709.8. Samples: 2424861100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:57:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 05:57:39,182][15401] Updated weights for policy 0, policy_version 148000 (0.0045) [2024-06-22 05:57:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 2424979456. Throughput: 0: 42536.8. Samples: 2425112500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:57:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 05:57:43,461][15401] Updated weights for policy 0, policy_version 148010 (0.0035) [2024-06-22 05:57:43,590][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148011_2425012224.pth... [2024-06-22 05:57:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147384_2414739456.pth [2024-06-22 05:57:47,393][15401] Updated weights for policy 0, policy_version 148020 (0.0025) [2024-06-22 05:57:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2425208832. Throughput: 0: 42445.6. Samples: 2425361240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:57:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 05:57:51,035][15401] Updated weights for policy 0, policy_version 148030 (0.0033) [2024-06-22 05:57:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 2425405440. Throughput: 0: 42584.4. Samples: 2425492400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:57:53,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 05:57:54,871][15401] Updated weights for policy 0, policy_version 148040 (0.0034) [2024-06-22 05:57:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2425602048. Throughput: 0: 42683.0. Samples: 2425752900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:57:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 05:57:58,999][15401] Updated weights for policy 0, policy_version 148050 (0.0045) [2024-06-22 05:58:02,643][15401] Updated weights for policy 0, policy_version 148060 (0.0028) [2024-06-22 05:58:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 2425847808. Throughput: 0: 42686.9. Samples: 2426003100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 05:58:06,572][15401] Updated weights for policy 0, policy_version 148070 (0.0043) [2024-06-22 05:58:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2426060800. Throughput: 0: 42781.9. Samples: 2426138800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 05:58:10,260][15401] Updated weights for policy 0, policy_version 148080 (0.0035) [2024-06-22 05:58:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 2426257408. Throughput: 0: 42532.0. Samples: 2426390860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:13,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 05:58:13,999][15401] Updated weights for policy 0, policy_version 148090 (0.0032) [2024-06-22 05:58:17,908][15401] Updated weights for policy 0, policy_version 148100 (0.0037) [2024-06-22 05:58:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 2426486784. Throughput: 0: 42697.4. Samples: 2426651300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 05:58:21,377][15401] Updated weights for policy 0, policy_version 148110 (0.0036) [2024-06-22 05:58:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2426699776. Throughput: 0: 42691.2. Samples: 2426782200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 05:58:25,746][15401] Updated weights for policy 0, policy_version 148120 (0.0023) [2024-06-22 05:58:28,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 2426929152. Throughput: 0: 42721.3. Samples: 2427034960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 05:58:28,948][15401] Updated weights for policy 0, policy_version 148130 (0.0040) [2024-06-22 05:58:33,255][15401] Updated weights for policy 0, policy_version 148140 (0.0028) [2024-06-22 05:58:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2427125760. Throughput: 0: 43131.7. Samples: 2427302160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 05:58:36,327][15401] Updated weights for policy 0, policy_version 148150 (0.0031) [2024-06-22 05:58:38,392][15132] Fps is (10 sec: 40951.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2427338752. Throughput: 0: 42936.4. Samples: 2427424640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 05:58:38,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 05:58:40,908][15349] Signal inference workers to stop experience collection... (35800 times) [2024-06-22 05:58:40,956][15401] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-06-22 05:58:40,966][15349] Signal inference workers to resume experience collection... (35800 times) [2024-06-22 05:58:40,978][15401] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-06-22 05:58:41,101][15401] Updated weights for policy 0, policy_version 148160 (0.0042) [2024-06-22 05:58:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2427551744. Throughput: 0: 42761.3. Samples: 2427677160. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:58:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 05:58:43,910][15401] Updated weights for policy 0, policy_version 148170 (0.0038) [2024-06-22 05:58:48,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42654.8). Total num frames: 2427748352. Throughput: 0: 43250.2. Samples: 2427949360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:58:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 05:58:48,826][15401] Updated weights for policy 0, policy_version 148180 (0.0034) [2024-06-22 05:58:51,470][15401] Updated weights for policy 0, policy_version 148190 (0.0034) [2024-06-22 05:58:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2427977728. Throughput: 0: 42817.1. Samples: 2428065580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:58:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 05:58:56,276][15401] Updated weights for policy 0, policy_version 148200 (0.0037) [2024-06-22 05:58:58,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2428207104. Throughput: 0: 42903.7. Samples: 2428321520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:58:58,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 05:58:59,239][15401] Updated weights for policy 0, policy_version 148210 (0.0037) [2024-06-22 05:59:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2428387328. Throughput: 0: 43055.1. Samples: 2428588780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 05:59:03,910][15401] Updated weights for policy 0, policy_version 148220 (0.0040) [2024-06-22 05:59:06,848][15401] Updated weights for policy 0, policy_version 148230 (0.0045) [2024-06-22 05:59:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2428616704. Throughput: 0: 42779.1. Samples: 2428707260. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 05:59:11,587][15401] Updated weights for policy 0, policy_version 148240 (0.0037) [2024-06-22 05:59:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2428862464. Throughput: 0: 42842.0. Samples: 2428962840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 05:59:14,979][15401] Updated weights for policy 0, policy_version 148250 (0.0033) [2024-06-22 05:59:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2429026304. Throughput: 0: 42752.7. Samples: 2429226040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 05:59:19,035][15401] Updated weights for policy 0, policy_version 148260 (0.0025) [2024-06-22 05:59:22,388][15401] Updated weights for policy 0, policy_version 148270 (0.0034) [2024-06-22 05:59:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2429272064. Throughput: 0: 42741.3. Samples: 2429347900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 05:59:26,800][15401] Updated weights for policy 0, policy_version 148280 (0.0029) [2024-06-22 05:59:28,390][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2429501440. Throughput: 0: 43055.6. Samples: 2429614660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 05:59:29,813][15401] Updated weights for policy 0, policy_version 148290 (0.0026) [2024-06-22 05:59:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2429698048. Throughput: 0: 42687.5. Samples: 2429870300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:33,399][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 05:59:34,369][15401] Updated weights for policy 0, policy_version 148300 (0.0045) [2024-06-22 05:59:37,295][15401] Updated weights for policy 0, policy_version 148310 (0.0033) [2024-06-22 05:59:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2429927424. Throughput: 0: 42979.7. Samples: 2429999660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:38,399][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 05:59:42,147][15401] Updated weights for policy 0, policy_version 148320 (0.0035) [2024-06-22 05:59:43,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42653.9). Total num frames: 2430124032. Throughput: 0: 43166.1. Samples: 2430264100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 25.0) [2024-06-22 05:59:43,401][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 05:59:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148324_2430140416.pth... [2024-06-22 05:59:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000147699_2419900416.pth [2024-06-22 05:59:43,646][15349] Signal inference workers to stop experience collection... (35850 times) [2024-06-22 05:59:43,652][15349] Signal inference workers to resume experience collection... (35850 times) [2024-06-22 05:59:43,681][15401] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-06-22 05:59:43,681][15401] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-06-22 05:59:44,847][15401] Updated weights for policy 0, policy_version 148330 (0.0025) [2024-06-22 05:59:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2430353408. Throughput: 0: 42898.2. Samples: 2430519200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 05:59:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 05:59:49,864][15401] Updated weights for policy 0, policy_version 148340 (0.0031) [2024-06-22 05:59:52,694][15401] Updated weights for policy 0, policy_version 148350 (0.0036) [2024-06-22 05:59:53,392][15132] Fps is (10 sec: 45874.9, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 2430582784. Throughput: 0: 43101.6. Samples: 2430646940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 05:59:53,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 05:59:57,635][15401] Updated weights for policy 0, policy_version 148360 (0.0024) [2024-06-22 05:59:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2430779392. Throughput: 0: 43287.6. Samples: 2430910780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 05:59:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 06:00:00,118][15401] Updated weights for policy 0, policy_version 148370 (0.0027) [2024-06-22 06:00:03,392][15132] Fps is (10 sec: 42598.9, 60 sec: 43688.9, 300 sec: 42875.7). Total num frames: 2431008768. Throughput: 0: 43149.0. Samples: 2431167840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:03,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 06:00:05,203][15401] Updated weights for policy 0, policy_version 148380 (0.0031) [2024-06-22 06:00:08,122][15401] Updated weights for policy 0, policy_version 148390 (0.0029) [2024-06-22 06:00:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2431221760. Throughput: 0: 43304.9. Samples: 2431296620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 06:00:12,663][15401] Updated weights for policy 0, policy_version 148400 (0.0035) [2024-06-22 06:00:13,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2431418368. Throughput: 0: 43183.1. Samples: 2431557900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 06:00:15,973][15401] Updated weights for policy 0, policy_version 148410 (0.0033) [2024-06-22 06:00:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43963.8, 300 sec: 42987.2). Total num frames: 2431664128. Throughput: 0: 43137.0. Samples: 2431811460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 06:00:20,104][15401] Updated weights for policy 0, policy_version 148420 (0.0026) [2024-06-22 06:00:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2431860736. Throughput: 0: 43161.7. Samples: 2431941940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 06:00:23,555][15401] Updated weights for policy 0, policy_version 148430 (0.0023) [2024-06-22 06:00:27,465][15401] Updated weights for policy 0, policy_version 148440 (0.0033) [2024-06-22 06:00:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2432073728. Throughput: 0: 43101.1. Samples: 2432203540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 06:00:31,115][15401] Updated weights for policy 0, policy_version 148450 (0.0039) [2024-06-22 06:00:33,390][15132] Fps is (10 sec: 45873.3, 60 sec: 43690.4, 300 sec: 42876.7). Total num frames: 2432319488. Throughput: 0: 43119.9. Samples: 2432459620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:33,391][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 06:00:34,853][15401] Updated weights for policy 0, policy_version 148460 (0.0028) [2024-06-22 06:00:38,394][15132] Fps is (10 sec: 44215.9, 60 sec: 43141.2, 300 sec: 42819.9). Total num frames: 2432516096. Throughput: 0: 43254.0. Samples: 2432593460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:38,395][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 06:00:38,722][15401] Updated weights for policy 0, policy_version 148470 (0.0033) [2024-06-22 06:00:42,345][15401] Updated weights for policy 0, policy_version 148480 (0.0021) [2024-06-22 06:00:43,389][15132] Fps is (10 sec: 39323.5, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 2432712704. Throughput: 0: 43100.9. Samples: 2432850320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 06:00:46,385][15401] Updated weights for policy 0, policy_version 148490 (0.0030) [2024-06-22 06:00:48,389][15132] Fps is (10 sec: 44257.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2432958464. Throughput: 0: 43072.1. Samples: 2433105980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-22 06:00:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 06:00:49,821][15401] Updated weights for policy 0, policy_version 148500 (0.0033) [2024-06-22 06:00:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2433155072. Throughput: 0: 43094.2. Samples: 2433235860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:00:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 06:00:53,972][15401] Updated weights for policy 0, policy_version 148510 (0.0035) [2024-06-22 06:00:57,774][15401] Updated weights for policy 0, policy_version 148520 (0.0035) [2024-06-22 06:00:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2433351680. Throughput: 0: 42904.4. Samples: 2433488600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:00:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 06:00:58,456][15349] Signal inference workers to stop experience collection... (35900 times) [2024-06-22 06:00:58,457][15349] Signal inference workers to resume experience collection... (35900 times) [2024-06-22 06:00:58,500][15401] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-06-22 06:00:58,500][15401] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-06-22 06:01:01,479][15401] Updated weights for policy 0, policy_version 148530 (0.0027) [2024-06-22 06:01:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 2433597440. Throughput: 0: 43023.2. Samples: 2433747500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 06:01:05,447][15401] Updated weights for policy 0, policy_version 148540 (0.0031) [2024-06-22 06:01:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2433794048. Throughput: 0: 43029.5. Samples: 2433878260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:08,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 06:01:08,946][15401] Updated weights for policy 0, policy_version 148550 (0.0045) [2024-06-22 06:01:12,970][15401] Updated weights for policy 0, policy_version 148560 (0.0041) [2024-06-22 06:01:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2434023424. Throughput: 0: 42887.0. Samples: 2434133460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 06:01:16,686][15401] Updated weights for policy 0, policy_version 148570 (0.0035) [2024-06-22 06:01:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2434252800. Throughput: 0: 42890.3. Samples: 2434389660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:01:20,498][15401] Updated weights for policy 0, policy_version 148580 (0.0037) [2024-06-22 06:01:23,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 2434449408. Throughput: 0: 42829.8. Samples: 2434520880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:23,397][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:01:24,301][15401] Updated weights for policy 0, policy_version 148590 (0.0031) [2024-06-22 06:01:27,886][15401] Updated weights for policy 0, policy_version 148600 (0.0027) [2024-06-22 06:01:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.4, 300 sec: 43042.7). Total num frames: 2434678784. Throughput: 0: 42983.9. Samples: 2434784600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 06:01:32,097][15401] Updated weights for policy 0, policy_version 148610 (0.0034) [2024-06-22 06:01:33,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42598.8, 300 sec: 42820.6). Total num frames: 2434875392. Throughput: 0: 43098.2. Samples: 2435045400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 06:01:35,585][15401] Updated weights for policy 0, policy_version 148620 (0.0026) [2024-06-22 06:01:38,396][15132] Fps is (10 sec: 42571.6, 60 sec: 43143.2, 300 sec: 42986.2). Total num frames: 2435104768. Throughput: 0: 43120.1. Samples: 2435176540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:38,397][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 06:01:39,602][15401] Updated weights for policy 0, policy_version 148630 (0.0050) [2024-06-22 06:01:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2435301376. Throughput: 0: 43210.8. Samples: 2435433080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:43,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 06:01:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148640_2435317760.pth... [2024-06-22 06:01:43,508][15401] Updated weights for policy 0, policy_version 148640 (0.0036) [2024-06-22 06:01:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148011_2425012224.pth [2024-06-22 06:01:47,287][15401] Updated weights for policy 0, policy_version 148650 (0.0057) [2024-06-22 06:01:48,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2435530752. Throughput: 0: 43252.4. Samples: 2435693860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:01:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 06:01:51,089][15401] Updated weights for policy 0, policy_version 148660 (0.0036) [2024-06-22 06:01:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2435727360. Throughput: 0: 43221.3. Samples: 2435823220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:01:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 06:01:54,914][15401] Updated weights for policy 0, policy_version 148670 (0.0028) [2024-06-22 06:01:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.9, 300 sec: 42931.4). Total num frames: 2435956736. Throughput: 0: 43127.5. Samples: 2436074300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:01:58,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 06:01:58,609][15401] Updated weights for policy 0, policy_version 148680 (0.0037) [2024-06-22 06:02:02,409][15401] Updated weights for policy 0, policy_version 148690 (0.0033) [2024-06-22 06:02:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2436169728. Throughput: 0: 43211.5. Samples: 2436334180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 06:02:06,135][15401] Updated weights for policy 0, policy_version 148700 (0.0036) [2024-06-22 06:02:08,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.6, 300 sec: 42988.1). Total num frames: 2436382720. Throughput: 0: 43247.7. Samples: 2436466740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 06:02:09,819][15401] Updated weights for policy 0, policy_version 148710 (0.0031) [2024-06-22 06:02:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42932.5). Total num frames: 2436595712. Throughput: 0: 43082.6. Samples: 2436723320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 06:02:14,010][15401] Updated weights for policy 0, policy_version 148720 (0.0038) [2024-06-22 06:02:17,407][15401] Updated weights for policy 0, policy_version 148730 (0.0028) [2024-06-22 06:02:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2436825088. Throughput: 0: 43013.8. Samples: 2436981020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:18,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 06:02:21,503][15401] Updated weights for policy 0, policy_version 148740 (0.0032) [2024-06-22 06:02:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 42987.2). Total num frames: 2437038080. Throughput: 0: 42988.7. Samples: 2437110760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 06:02:25,266][15401] Updated weights for policy 0, policy_version 148750 (0.0028) [2024-06-22 06:02:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2437251072. Throughput: 0: 43030.4. Samples: 2437369460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:28,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 06:02:28,962][15401] Updated weights for policy 0, policy_version 148760 (0.0028) [2024-06-22 06:02:32,700][15401] Updated weights for policy 0, policy_version 148770 (0.0041) [2024-06-22 06:02:33,379][15349] Signal inference workers to stop experience collection... (35950 times) [2024-06-22 06:02:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2437464064. Throughput: 0: 43073.4. Samples: 2437632160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 06:02:33,424][15401] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-06-22 06:02:33,456][15349] Signal inference workers to resume experience collection... (35950 times) [2024-06-22 06:02:33,456][15401] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-06-22 06:02:36,439][15401] Updated weights for policy 0, policy_version 148780 (0.0039) [2024-06-22 06:02:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 43042.7). Total num frames: 2437677056. Throughput: 0: 43091.8. Samples: 2437762360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 06:02:40,644][15401] Updated weights for policy 0, policy_version 148790 (0.0035) [2024-06-22 06:02:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 2437906432. Throughput: 0: 43179.2. Samples: 2438017260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 06:02:44,744][15401] Updated weights for policy 0, policy_version 148800 (0.0039) [2024-06-22 06:02:48,147][15401] Updated weights for policy 0, policy_version 148810 (0.0033) [2024-06-22 06:02:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2438103040. Throughput: 0: 43135.6. Samples: 2438275280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 06:02:52,338][15401] Updated weights for policy 0, policy_version 148820 (0.0036) [2024-06-22 06:02:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2438283264. Throughput: 0: 42901.2. Samples: 2438397300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-22 06:02:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 06:02:55,964][15401] Updated weights for policy 0, policy_version 148830 (0.0037) [2024-06-22 06:02:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.2, 300 sec: 43042.7). Total num frames: 2438545408. Throughput: 0: 42716.1. Samples: 2438645540. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:02:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 06:03:00,195][15401] Updated weights for policy 0, policy_version 148840 (0.0026) [2024-06-22 06:03:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2438742016. Throughput: 0: 42860.0. Samples: 2438909720. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:03,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 06:03:03,513][15401] Updated weights for policy 0, policy_version 148850 (0.0045) [2024-06-22 06:03:08,112][15401] Updated weights for policy 0, policy_version 148860 (0.0032) [2024-06-22 06:03:08,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42323.6, 300 sec: 42931.3). Total num frames: 2438922240. Throughput: 0: 42617.5. Samples: 2439028640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:08,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 06:03:11,180][15401] Updated weights for policy 0, policy_version 148870 (0.0039) [2024-06-22 06:03:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2439184384. Throughput: 0: 42556.1. Samples: 2439284480. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 06:03:15,767][15401] Updated weights for policy 0, policy_version 148880 (0.0036) [2024-06-22 06:03:18,391][15132] Fps is (10 sec: 44241.3, 60 sec: 42324.4, 300 sec: 42931.4). Total num frames: 2439364608. Throughput: 0: 42552.5. Samples: 2439547080. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:18,391][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 06:03:18,918][15401] Updated weights for policy 0, policy_version 148890 (0.0027) [2024-06-22 06:03:23,333][15401] Updated weights for policy 0, policy_version 148900 (0.0034) [2024-06-22 06:03:23,391][15132] Fps is (10 sec: 39315.8, 60 sec: 42324.4, 300 sec: 42875.9). Total num frames: 2439577600. Throughput: 0: 42387.6. Samples: 2439669860. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:23,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 06:03:26,390][15401] Updated weights for policy 0, policy_version 148910 (0.0022) [2024-06-22 06:03:28,389][15132] Fps is (10 sec: 47520.1, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 2439839744. Throughput: 0: 42496.0. Samples: 2439929580. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 06:03:30,748][15401] Updated weights for policy 0, policy_version 148920 (0.0029) [2024-06-22 06:03:33,390][15132] Fps is (10 sec: 42604.4, 60 sec: 42325.2, 300 sec: 42932.0). Total num frames: 2440003584. Throughput: 0: 42670.5. Samples: 2440195460. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 06:03:33,953][15401] Updated weights for policy 0, policy_version 148930 (0.0035) [2024-06-22 06:03:38,248][15401] Updated weights for policy 0, policy_version 148940 (0.0027) [2024-06-22 06:03:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2440232960. Throughput: 0: 42616.3. Samples: 2440315040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 06:03:40,337][15349] Signal inference workers to stop experience collection... (36000 times) [2024-06-22 06:03:40,344][15349] Signal inference workers to resume experience collection... (36000 times) [2024-06-22 06:03:40,361][15401] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-06-22 06:03:40,361][15401] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-06-22 06:03:41,900][15401] Updated weights for policy 0, policy_version 148950 (0.0033) [2024-06-22 06:03:43,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 43153.8). Total num frames: 2440478720. Throughput: 0: 42776.8. Samples: 2440570500. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:43,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 06:03:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148955_2440478720.pth... [2024-06-22 06:03:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148324_2430140416.pth [2024-06-22 06:03:45,820][15401] Updated weights for policy 0, policy_version 148960 (0.0038) [2024-06-22 06:03:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42931.7). Total num frames: 2440642560. Throughput: 0: 42752.0. Samples: 2440833560. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:48,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 06:03:49,463][15401] Updated weights for policy 0, policy_version 148970 (0.0032) [2024-06-22 06:03:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2440871936. Throughput: 0: 42706.7. Samples: 2440950340. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 06:03:53,501][15401] Updated weights for policy 0, policy_version 148980 (0.0028) [2024-06-22 06:03:57,348][15401] Updated weights for policy 0, policy_version 148990 (0.0039) [2024-06-22 06:03:58,390][15132] Fps is (10 sec: 49151.9, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 2441134080. Throughput: 0: 42874.6. Samples: 2441213840. Policy #0 lag: (min: 1.0, avg: 12.2, max: 26.0) [2024-06-22 06:03:58,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 06:04:01,079][15401] Updated weights for policy 0, policy_version 149000 (0.0037) [2024-06-22 06:04:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2441297920. Throughput: 0: 42918.2. Samples: 2441478340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 06:04:04,978][15401] Updated weights for policy 0, policy_version 149010 (0.0035) [2024-06-22 06:04:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 2441527296. Throughput: 0: 42713.5. Samples: 2441591900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 06:04:08,516][15401] Updated weights for policy 0, policy_version 149020 (0.0031) [2024-06-22 06:04:12,400][15401] Updated weights for policy 0, policy_version 149030 (0.0036) [2024-06-22 06:04:13,394][15132] Fps is (10 sec: 47493.3, 60 sec: 43141.5, 300 sec: 43208.7). Total num frames: 2441773056. Throughput: 0: 43077.7. Samples: 2441868260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:13,394][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 06:04:16,083][15401] Updated weights for policy 0, policy_version 149040 (0.0036) [2024-06-22 06:04:18,397][15132] Fps is (10 sec: 42564.5, 60 sec: 43139.8, 300 sec: 42986.0). Total num frames: 2441953280. Throughput: 0: 42899.7. Samples: 2442126280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:18,398][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 06:04:20,158][15401] Updated weights for policy 0, policy_version 149050 (0.0028) [2024-06-22 06:04:23,392][15132] Fps is (10 sec: 40967.6, 60 sec: 43417.0, 300 sec: 42986.8). Total num frames: 2442182656. Throughput: 0: 42766.3. Samples: 2442239620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:23,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 06:04:23,587][15401] Updated weights for policy 0, policy_version 149060 (0.0038) [2024-06-22 06:04:27,707][15401] Updated weights for policy 0, policy_version 149070 (0.0043) [2024-06-22 06:04:28,390][15132] Fps is (10 sec: 45910.9, 60 sec: 42871.4, 300 sec: 43098.3). Total num frames: 2442412032. Throughput: 0: 43185.8. Samples: 2442513860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 06:04:31,010][15401] Updated weights for policy 0, policy_version 149080 (0.0031) [2024-06-22 06:04:33,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2442575872. Throughput: 0: 42937.4. Samples: 2442765740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 06:04:35,596][15401] Updated weights for policy 0, policy_version 149090 (0.0038) [2024-06-22 06:04:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 2442821632. Throughput: 0: 42963.1. Samples: 2442883680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 06:04:38,461][15349] Signal inference workers to stop experience collection... (36050 times) [2024-06-22 06:04:38,486][15401] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-06-22 06:04:38,526][15349] Signal inference workers to resume experience collection... (36050 times) [2024-06-22 06:04:38,526][15401] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-06-22 06:04:38,669][15401] Updated weights for policy 0, policy_version 149100 (0.0030) [2024-06-22 06:04:43,219][15401] Updated weights for policy 0, policy_version 149110 (0.0024) [2024-06-22 06:04:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 2443034624. Throughput: 0: 43123.6. Samples: 2443154400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 06:04:46,148][15401] Updated weights for policy 0, policy_version 149120 (0.0028) [2024-06-22 06:04:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 2443231232. Throughput: 0: 42858.7. Samples: 2443406980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 06:04:50,725][15401] Updated weights for policy 0, policy_version 149130 (0.0037) [2024-06-22 06:04:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2443476992. Throughput: 0: 43090.2. Samples: 2443530960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 06:04:53,671][15401] Updated weights for policy 0, policy_version 149140 (0.0035) [2024-06-22 06:04:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42876.4). Total num frames: 2443657216. Throughput: 0: 42864.9. Samples: 2443797000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:04:58,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-22 06:04:58,487][15401] Updated weights for policy 0, policy_version 149150 (0.0032) [2024-06-22 06:05:01,482][15401] Updated weights for policy 0, policy_version 149160 (0.0041) [2024-06-22 06:05:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2443870208. Throughput: 0: 42691.0. Samples: 2444047040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 06:05:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 06:05:06,213][15401] Updated weights for policy 0, policy_version 149170 (0.0052) [2024-06-22 06:05:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2444099584. Throughput: 0: 43065.9. Samples: 2444177480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 06:05:09,135][15401] Updated weights for policy 0, policy_version 149180 (0.0029) [2024-06-22 06:05:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42328.2, 300 sec: 42876.1). Total num frames: 2444312576. Throughput: 0: 42793.8. Samples: 2444439580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 06:05:13,713][15401] Updated weights for policy 0, policy_version 149190 (0.0039) [2024-06-22 06:05:16,926][15401] Updated weights for policy 0, policy_version 149200 (0.0028) [2024-06-22 06:05:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42877.0, 300 sec: 42931.6). Total num frames: 2444525568. Throughput: 0: 42677.7. Samples: 2444686240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 06:05:21,242][15401] Updated weights for policy 0, policy_version 149210 (0.0031) [2024-06-22 06:05:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 2444738560. Throughput: 0: 42968.4. Samples: 2444817260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 06:05:24,542][15401] Updated weights for policy 0, policy_version 149220 (0.0032) [2024-06-22 06:05:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 2444951552. Throughput: 0: 42723.2. Samples: 2445076940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:28,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 06:05:28,726][15401] Updated weights for policy 0, policy_version 149230 (0.0034) [2024-06-22 06:05:32,398][15401] Updated weights for policy 0, policy_version 149240 (0.0034) [2024-06-22 06:05:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42932.3). Total num frames: 2445180928. Throughput: 0: 42544.3. Samples: 2445321480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 06:05:36,664][15401] Updated weights for policy 0, policy_version 149250 (0.0031) [2024-06-22 06:05:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2445377536. Throughput: 0: 42875.8. Samples: 2445460380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 06:05:39,230][15349] Signal inference workers to stop experience collection... (36100 times) [2024-06-22 06:05:39,255][15401] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-06-22 06:05:39,294][15349] Signal inference workers to resume experience collection... (36100 times) [2024-06-22 06:05:39,294][15401] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-06-22 06:05:39,978][15401] Updated weights for policy 0, policy_version 149260 (0.0030) [2024-06-22 06:05:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2445590528. Throughput: 0: 42775.5. Samples: 2445721900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 06:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149268_2445606912.pth... [2024-06-22 06:05:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148640_2435317760.pth [2024-06-22 06:05:44,041][15401] Updated weights for policy 0, policy_version 149270 (0.0034) [2024-06-22 06:05:47,801][15401] Updated weights for policy 0, policy_version 149280 (0.0045) [2024-06-22 06:05:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2445819904. Throughput: 0: 42760.6. Samples: 2445971260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 06:05:51,530][15401] Updated weights for policy 0, policy_version 149290 (0.0038) [2024-06-22 06:05:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2446016512. Throughput: 0: 42877.7. Samples: 2446106980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:53,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 06:05:55,499][15401] Updated weights for policy 0, policy_version 149300 (0.0039) [2024-06-22 06:05:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2446229504. Throughput: 0: 42801.8. Samples: 2446365660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:05:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 06:05:59,253][15401] Updated weights for policy 0, policy_version 149310 (0.0039) [2024-06-22 06:06:03,264][15401] Updated weights for policy 0, policy_version 149320 (0.0047) [2024-06-22 06:06:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2446475264. Throughput: 0: 42941.9. Samples: 2446618620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 06:06:03,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 06:06:06,721][15401] Updated weights for policy 0, policy_version 149330 (0.0032) [2024-06-22 06:06:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2446671872. Throughput: 0: 42995.6. Samples: 2446752060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:08,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 06:06:11,071][15401] Updated weights for policy 0, policy_version 149340 (0.0042) [2024-06-22 06:06:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2446884864. Throughput: 0: 43063.1. Samples: 2447014780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:13,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 06:06:14,121][15401] Updated weights for policy 0, policy_version 149350 (0.0036) [2024-06-22 06:06:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 2447081472. Throughput: 0: 43148.1. Samples: 2447263140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:18,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-22 06:06:18,822][15401] Updated weights for policy 0, policy_version 149360 (0.0037) [2024-06-22 06:06:22,102][15401] Updated weights for policy 0, policy_version 149370 (0.0031) [2024-06-22 06:06:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2447310848. Throughput: 0: 42936.7. Samples: 2447392520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 06:06:26,433][15401] Updated weights for policy 0, policy_version 149380 (0.0028) [2024-06-22 06:06:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2447540224. Throughput: 0: 42928.0. Samples: 2447653660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 06:06:29,918][15401] Updated weights for policy 0, policy_version 149390 (0.0031) [2024-06-22 06:06:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 2447736832. Throughput: 0: 43115.5. Samples: 2447911460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 06:06:33,985][15401] Updated weights for policy 0, policy_version 149400 (0.0033) [2024-06-22 06:06:37,467][15401] Updated weights for policy 0, policy_version 149410 (0.0038) [2024-06-22 06:06:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2447949824. Throughput: 0: 42805.2. Samples: 2448033220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 06:06:41,653][15401] Updated weights for policy 0, policy_version 149420 (0.0039) [2024-06-22 06:06:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2448179200. Throughput: 0: 42902.2. Samples: 2448296260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:43,396][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 06:06:45,356][15401] Updated weights for policy 0, policy_version 149430 (0.0042) [2024-06-22 06:06:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.5, 300 sec: 42820.2). Total num frames: 2448359424. Throughput: 0: 42986.5. Samples: 2448553120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:48,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 06:06:49,234][15401] Updated weights for policy 0, policy_version 149440 (0.0027) [2024-06-22 06:06:52,714][15401] Updated weights for policy 0, policy_version 149450 (0.0043) [2024-06-22 06:06:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 2448605184. Throughput: 0: 42773.3. Samples: 2448676860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 06:06:57,021][15401] Updated weights for policy 0, policy_version 149460 (0.0041) [2024-06-22 06:06:58,390][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2448818176. Throughput: 0: 42683.0. Samples: 2448935520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:06:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 06:06:59,952][15349] Signal inference workers to stop experience collection... (36150 times) [2024-06-22 06:07:00,001][15349] Signal inference workers to resume experience collection... (36150 times) [2024-06-22 06:07:00,002][15401] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-06-22 06:07:00,027][15401] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-06-22 06:07:00,142][15401] Updated weights for policy 0, policy_version 149470 (0.0034) [2024-06-22 06:07:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2449031168. Throughput: 0: 42966.6. Samples: 2449196640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:07:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 06:07:04,466][15401] Updated weights for policy 0, policy_version 149480 (0.0040) [2024-06-22 06:07:08,006][15401] Updated weights for policy 0, policy_version 149490 (0.0033) [2024-06-22 06:07:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2449260544. Throughput: 0: 42898.5. Samples: 2449322960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 06:07:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 06:07:11,999][15401] Updated weights for policy 0, policy_version 149500 (0.0027) [2024-06-22 06:07:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2449440768. Throughput: 0: 42825.2. Samples: 2449580800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 06:07:15,749][15401] Updated weights for policy 0, policy_version 149510 (0.0040) [2024-06-22 06:07:18,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2449670144. Throughput: 0: 42768.4. Samples: 2449836140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:18,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 06:07:20,220][15401] Updated weights for policy 0, policy_version 149520 (0.0033) [2024-06-22 06:07:23,297][15401] Updated weights for policy 0, policy_version 149530 (0.0027) [2024-06-22 06:07:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2449899520. Throughput: 0: 42843.2. Samples: 2449961160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 06:07:27,733][15401] Updated weights for policy 0, policy_version 149540 (0.0045) [2024-06-22 06:07:28,391][15132] Fps is (10 sec: 40963.0, 60 sec: 42324.2, 300 sec: 42764.8). Total num frames: 2450079744. Throughput: 0: 42642.1. Samples: 2450215220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:28,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 06:07:30,945][15401] Updated weights for policy 0, policy_version 149550 (0.0029) [2024-06-22 06:07:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2450309120. Throughput: 0: 42546.9. Samples: 2450467620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 06:07:35,249][15401] Updated weights for policy 0, policy_version 149560 (0.0031) [2024-06-22 06:07:38,389][15132] Fps is (10 sec: 44244.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2450522112. Throughput: 0: 42782.7. Samples: 2450602080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 06:07:38,524][15401] Updated weights for policy 0, policy_version 149570 (0.0025) [2024-06-22 06:07:42,947][15401] Updated weights for policy 0, policy_version 149580 (0.0031) [2024-06-22 06:07:43,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2450735104. Throughput: 0: 42720.4. Samples: 2450858040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:43,393][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 06:07:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149581_2450735104.pth... [2024-06-22 06:07:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000148955_2440478720.pth [2024-06-22 06:07:46,774][15401] Updated weights for policy 0, policy_version 149590 (0.0040) [2024-06-22 06:07:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2450948096. Throughput: 0: 42461.9. Samples: 2451107420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 06:07:50,461][15401] Updated weights for policy 0, policy_version 149600 (0.0034) [2024-06-22 06:07:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2451144704. Throughput: 0: 42458.2. Samples: 2451233580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 06:07:54,303][15401] Updated weights for policy 0, policy_version 149610 (0.0033) [2024-06-22 06:07:58,102][15401] Updated weights for policy 0, policy_version 149620 (0.0037) [2024-06-22 06:07:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2451374080. Throughput: 0: 42449.0. Samples: 2451491000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:07:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 06:08:02,161][15401] Updated weights for policy 0, policy_version 149630 (0.0037) [2024-06-22 06:08:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42876.4). Total num frames: 2451570688. Throughput: 0: 42541.4. Samples: 2451750400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:08:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 06:08:05,881][15401] Updated weights for policy 0, policy_version 149640 (0.0030) [2024-06-22 06:08:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2451783680. Throughput: 0: 42371.1. Samples: 2451867860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:08:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 06:08:09,839][15401] Updated weights for policy 0, policy_version 149650 (0.0022) [2024-06-22 06:08:11,312][15349] Signal inference workers to stop experience collection... (36200 times) [2024-06-22 06:08:11,363][15401] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-06-22 06:08:11,371][15349] Signal inference workers to resume experience collection... (36200 times) [2024-06-22 06:08:11,378][15401] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-06-22 06:08:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42876.3). Total num frames: 2452013056. Throughput: 0: 42544.7. Samples: 2452129660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 06:08:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 06:08:13,513][15401] Updated weights for policy 0, policy_version 149660 (0.0041) [2024-06-22 06:08:17,455][15401] Updated weights for policy 0, policy_version 149670 (0.0034) [2024-06-22 06:08:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42820.8). Total num frames: 2452209664. Throughput: 0: 42605.2. Samples: 2452384860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 06:08:21,141][15401] Updated weights for policy 0, policy_version 149680 (0.0044) [2024-06-22 06:08:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2452422656. Throughput: 0: 42360.9. Samples: 2452508320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 06:08:24,932][15401] Updated weights for policy 0, policy_version 149690 (0.0036) [2024-06-22 06:08:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42872.7, 300 sec: 42876.1). Total num frames: 2452652032. Throughput: 0: 42483.3. Samples: 2452769680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 06:08:28,847][15401] Updated weights for policy 0, policy_version 149700 (0.0038) [2024-06-22 06:08:32,714][15401] Updated weights for policy 0, policy_version 149710 (0.0040) [2024-06-22 06:08:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2452848640. Throughput: 0: 42508.8. Samples: 2453020320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 06:08:36,479][15401] Updated weights for policy 0, policy_version 149720 (0.0033) [2024-06-22 06:08:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2453061632. Throughput: 0: 42417.4. Samples: 2453142360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 06:08:40,747][15401] Updated weights for policy 0, policy_version 149730 (0.0032) [2024-06-22 06:08:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2453291008. Throughput: 0: 42532.4. Samples: 2453404960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 06:08:44,386][15401] Updated weights for policy 0, policy_version 149740 (0.0025) [2024-06-22 06:08:48,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 2453487616. Throughput: 0: 42413.4. Samples: 2453659280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:48,397][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 06:08:48,678][15401] Updated weights for policy 0, policy_version 149750 (0.0037) [2024-06-22 06:08:52,289][15401] Updated weights for policy 0, policy_version 149760 (0.0034) [2024-06-22 06:08:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2453700608. Throughput: 0: 42621.3. Samples: 2453785820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 06:08:56,259][15401] Updated weights for policy 0, policy_version 149770 (0.0037) [2024-06-22 06:08:58,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2453913600. Throughput: 0: 42482.7. Samples: 2454041380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:08:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 06:08:59,852][15401] Updated weights for policy 0, policy_version 149780 (0.0041) [2024-06-22 06:09:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2454126592. Throughput: 0: 42562.7. Samples: 2454300180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:09:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 06:09:03,822][15401] Updated weights for policy 0, policy_version 149790 (0.0036) [2024-06-22 06:09:07,436][15401] Updated weights for policy 0, policy_version 149800 (0.0035) [2024-06-22 06:09:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 2454355968. Throughput: 0: 42688.0. Samples: 2454429280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:09:08,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 06:09:11,625][15401] Updated weights for policy 0, policy_version 149810 (0.0034) [2024-06-22 06:09:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42766.1). Total num frames: 2454568960. Throughput: 0: 42623.3. Samples: 2454687740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 06:09:13,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 06:09:15,321][15401] Updated weights for policy 0, policy_version 149820 (0.0048) [2024-06-22 06:09:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 2454781952. Throughput: 0: 42626.7. Samples: 2454938620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:18,392][15132] Avg episode reward: [(0, '0.890')] [2024-06-22 06:09:19,288][15401] Updated weights for policy 0, policy_version 149830 (0.0043) [2024-06-22 06:09:23,130][15401] Updated weights for policy 0, policy_version 149840 (0.0036) [2024-06-22 06:09:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2454994944. Throughput: 0: 42768.8. Samples: 2455066960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:23,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 06:09:26,901][15401] Updated weights for policy 0, policy_version 149850 (0.0032) [2024-06-22 06:09:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2455191552. Throughput: 0: 42641.4. Samples: 2455323820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 06:09:30,594][15401] Updated weights for policy 0, policy_version 149860 (0.0040) [2024-06-22 06:09:33,334][15349] Signal inference workers to stop experience collection... (36250 times) [2024-06-22 06:09:33,336][15349] Signal inference workers to resume experience collection... (36250 times) [2024-06-22 06:09:33,351][15401] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-06-22 06:09:33,361][15401] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-06-22 06:09:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2455420928. Throughput: 0: 42812.4. Samples: 2455585560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 06:09:34,495][15401] Updated weights for policy 0, policy_version 149870 (0.0029) [2024-06-22 06:09:37,951][15401] Updated weights for policy 0, policy_version 149880 (0.0043) [2024-06-22 06:09:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2455633920. Throughput: 0: 42913.8. Samples: 2455716940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 06:09:42,151][15401] Updated weights for policy 0, policy_version 149890 (0.0043) [2024-06-22 06:09:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2455846912. Throughput: 0: 43083.4. Samples: 2455980140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 06:09:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149894_2455863296.pth... [2024-06-22 06:09:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149268_2445606912.pth [2024-06-22 06:09:45,674][15401] Updated weights for policy 0, policy_version 149900 (0.0026) [2024-06-22 06:09:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 2456076288. Throughput: 0: 42979.5. Samples: 2456234260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 06:09:49,622][15401] Updated weights for policy 0, policy_version 149910 (0.0024) [2024-06-22 06:09:53,171][15401] Updated weights for policy 0, policy_version 149920 (0.0030) [2024-06-22 06:09:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2456289280. Throughput: 0: 43000.5. Samples: 2456364300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 06:09:57,156][15401] Updated weights for policy 0, policy_version 149930 (0.0038) [2024-06-22 06:09:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2456502272. Throughput: 0: 43036.9. Samples: 2456624400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:09:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 06:10:00,995][15401] Updated weights for policy 0, policy_version 149940 (0.0040) [2024-06-22 06:10:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2456715264. Throughput: 0: 43136.9. Samples: 2456879680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:10:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 06:10:04,970][15401] Updated weights for policy 0, policy_version 149950 (0.0024) [2024-06-22 06:10:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2456928256. Throughput: 0: 43152.0. Samples: 2457008800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:10:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 06:10:08,535][15401] Updated weights for policy 0, policy_version 149960 (0.0024) [2024-06-22 06:10:12,415][15401] Updated weights for policy 0, policy_version 149970 (0.0039) [2024-06-22 06:10:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2457141248. Throughput: 0: 43220.4. Samples: 2457268740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:10:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 06:10:16,145][15401] Updated weights for policy 0, policy_version 149980 (0.0045) [2024-06-22 06:10:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2457354240. Throughput: 0: 43111.4. Samples: 2457525580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:10:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 06:10:19,872][15401] Updated weights for policy 0, policy_version 149990 (0.0039) [2024-06-22 06:10:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2457567232. Throughput: 0: 43027.0. Samples: 2457653160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 06:10:23,821][15401] Updated weights for policy 0, policy_version 150000 (0.0035) [2024-06-22 06:10:27,438][15401] Updated weights for policy 0, policy_version 150010 (0.0034) [2024-06-22 06:10:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2457780224. Throughput: 0: 42814.6. Samples: 2457906800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 06:10:31,318][15401] Updated weights for policy 0, policy_version 150020 (0.0045) [2024-06-22 06:10:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 2457976832. Throughput: 0: 43051.9. Samples: 2458171600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 06:10:35,098][15401] Updated weights for policy 0, policy_version 150030 (0.0025) [2024-06-22 06:10:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2458222592. Throughput: 0: 42952.8. Samples: 2458297180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 06:10:38,821][15401] Updated weights for policy 0, policy_version 150040 (0.0023) [2024-06-22 06:10:42,665][15401] Updated weights for policy 0, policy_version 150050 (0.0038) [2024-06-22 06:10:43,394][15132] Fps is (10 sec: 47494.7, 60 sec: 43414.7, 300 sec: 42819.9). Total num frames: 2458451968. Throughput: 0: 43065.4. Samples: 2458562520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:43,394][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 06:10:46,327][15401] Updated weights for policy 0, policy_version 150060 (0.0033) [2024-06-22 06:10:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2458648576. Throughput: 0: 43130.2. Samples: 2458820540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 06:10:50,337][15401] Updated weights for policy 0, policy_version 150070 (0.0026) [2024-06-22 06:10:53,390][15132] Fps is (10 sec: 42615.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2458877952. Throughput: 0: 43008.8. Samples: 2458944200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 06:10:54,148][15401] Updated weights for policy 0, policy_version 150080 (0.0035) [2024-06-22 06:10:57,897][15401] Updated weights for policy 0, policy_version 150090 (0.0030) [2024-06-22 06:10:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2459090944. Throughput: 0: 43143.1. Samples: 2459210180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:10:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 06:11:01,834][15401] Updated weights for policy 0, policy_version 150100 (0.0025) [2024-06-22 06:11:02,959][15349] Signal inference workers to stop experience collection... (36300 times) [2024-06-22 06:11:02,959][15349] Signal inference workers to resume experience collection... (36300 times) [2024-06-22 06:11:02,969][15401] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-06-22 06:11:02,970][15401] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-06-22 06:11:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2459303936. Throughput: 0: 43155.2. Samples: 2459467560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:11:03,398][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 06:11:05,738][15401] Updated weights for policy 0, policy_version 150110 (0.0028) [2024-06-22 06:11:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2459516928. Throughput: 0: 43141.4. Samples: 2459594520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:11:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 06:11:09,517][15401] Updated weights for policy 0, policy_version 150120 (0.0042) [2024-06-22 06:11:13,347][15401] Updated weights for policy 0, policy_version 150130 (0.0033) [2024-06-22 06:11:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2459729920. Throughput: 0: 43224.1. Samples: 2459851880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:11:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 06:11:16,984][15401] Updated weights for policy 0, policy_version 150140 (0.0032) [2024-06-22 06:11:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2459942912. Throughput: 0: 42989.9. Samples: 2460106140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:11:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 06:11:21,058][15401] Updated weights for policy 0, policy_version 150150 (0.0030) [2024-06-22 06:11:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2460155904. Throughput: 0: 43033.0. Samples: 2460233660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 06:11:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 06:11:24,533][15401] Updated weights for policy 0, policy_version 150160 (0.0035) [2024-06-22 06:11:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2460352512. Throughput: 0: 42869.7. Samples: 2460491480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 06:11:28,608][15401] Updated weights for policy 0, policy_version 150170 (0.0042) [2024-06-22 06:11:32,121][15401] Updated weights for policy 0, policy_version 150180 (0.0035) [2024-06-22 06:11:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 2460565504. Throughput: 0: 42715.6. Samples: 2460742740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 06:11:36,083][15401] Updated weights for policy 0, policy_version 150190 (0.0045) [2024-06-22 06:11:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2460794880. Throughput: 0: 42777.9. Samples: 2460869300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:38,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 06:11:39,928][15401] Updated weights for policy 0, policy_version 150200 (0.0036) [2024-06-22 06:11:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42874.3, 300 sec: 42932.0). Total num frames: 2461024256. Throughput: 0: 42784.8. Samples: 2461135500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 06:11:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150209_2461024256.pth... [2024-06-22 06:11:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149581_2450735104.pth [2024-06-22 06:11:43,649][15401] Updated weights for policy 0, policy_version 150210 (0.0037) [2024-06-22 06:11:47,581][15401] Updated weights for policy 0, policy_version 150220 (0.0037) [2024-06-22 06:11:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2461220864. Throughput: 0: 42544.5. Samples: 2461382060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 06:11:51,431][15401] Updated weights for policy 0, policy_version 150230 (0.0038) [2024-06-22 06:11:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2461417472. Throughput: 0: 42623.6. Samples: 2461512580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 06:11:55,404][15401] Updated weights for policy 0, policy_version 150240 (0.0040) [2024-06-22 06:11:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2461630464. Throughput: 0: 42744.1. Samples: 2461775360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:11:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 06:11:59,079][15401] Updated weights for policy 0, policy_version 150250 (0.0037) [2024-06-22 06:12:02,865][15401] Updated weights for policy 0, policy_version 150260 (0.0027) [2024-06-22 06:12:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2461876224. Throughput: 0: 42685.4. Samples: 2462026980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 06:12:06,550][15401] Updated weights for policy 0, policy_version 150270 (0.0022) [2024-06-22 06:12:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2462056448. Throughput: 0: 42765.8. Samples: 2462158120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 06:12:10,271][15401] Updated weights for policy 0, policy_version 150280 (0.0033) [2024-06-22 06:12:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2462302208. Throughput: 0: 42808.9. Samples: 2462417880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:13,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 06:12:14,130][15401] Updated weights for policy 0, policy_version 150290 (0.0032) [2024-06-22 06:12:17,710][15401] Updated weights for policy 0, policy_version 150300 (0.0037) [2024-06-22 06:12:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2462515200. Throughput: 0: 42928.8. Samples: 2462674540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:18,395][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 06:12:22,023][15401] Updated weights for policy 0, policy_version 150310 (0.0028) [2024-06-22 06:12:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 2462711808. Throughput: 0: 43049.8. Samples: 2462806440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:12:25,338][15401] Updated weights for policy 0, policy_version 150320 (0.0027) [2024-06-22 06:12:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2462941184. Throughput: 0: 42743.2. Samples: 2463058940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:12:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 06:12:29,720][15401] Updated weights for policy 0, policy_version 150330 (0.0041) [2024-06-22 06:12:31,561][15349] Signal inference workers to stop experience collection... (36350 times) [2024-06-22 06:12:31,565][15349] Signal inference workers to resume experience collection... (36350 times) [2024-06-22 06:12:31,607][15401] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-06-22 06:12:31,607][15401] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-06-22 06:12:33,309][15401] Updated weights for policy 0, policy_version 150340 (0.0040) [2024-06-22 06:12:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2463170560. Throughput: 0: 43053.3. Samples: 2463319460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 06:12:37,414][15401] Updated weights for policy 0, policy_version 150350 (0.0042) [2024-06-22 06:12:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 2463350784. Throughput: 0: 43024.0. Samples: 2463448660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 06:12:40,913][15401] Updated weights for policy 0, policy_version 150360 (0.0029) [2024-06-22 06:12:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2463580160. Throughput: 0: 42826.2. Samples: 2463702540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 06:12:45,294][15401] Updated weights for policy 0, policy_version 150370 (0.0043) [2024-06-22 06:12:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2463809536. Throughput: 0: 42925.0. Samples: 2463958600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 06:12:48,496][15401] Updated weights for policy 0, policy_version 150380 (0.0039) [2024-06-22 06:12:52,913][15401] Updated weights for policy 0, policy_version 150390 (0.0029) [2024-06-22 06:12:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2464006144. Throughput: 0: 43029.8. Samples: 2464094460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 06:12:56,129][15401] Updated weights for policy 0, policy_version 150400 (0.0042) [2024-06-22 06:12:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2464235520. Throughput: 0: 42980.9. Samples: 2464352020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:12:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 06:13:00,770][15401] Updated weights for policy 0, policy_version 150410 (0.0040) [2024-06-22 06:13:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2464448512. Throughput: 0: 42868.9. Samples: 2464603640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 06:13:03,735][15401] Updated weights for policy 0, policy_version 150420 (0.0028) [2024-06-22 06:13:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2464628736. Throughput: 0: 42725.3. Samples: 2464729080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 06:13:08,410][15401] Updated weights for policy 0, policy_version 150430 (0.0044) [2024-06-22 06:13:11,500][15401] Updated weights for policy 0, policy_version 150440 (0.0036) [2024-06-22 06:13:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 2464890880. Throughput: 0: 42795.5. Samples: 2464984740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 06:13:16,368][15401] Updated weights for policy 0, policy_version 150450 (0.0032) [2024-06-22 06:13:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2465071104. Throughput: 0: 42726.7. Samples: 2465242160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 06:13:19,252][15401] Updated weights for policy 0, policy_version 150460 (0.0048) [2024-06-22 06:13:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2465284096. Throughput: 0: 42657.0. Samples: 2465368220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 06:13:24,073][15401] Updated weights for policy 0, policy_version 150470 (0.0032) [2024-06-22 06:13:27,251][15401] Updated weights for policy 0, policy_version 150480 (0.0038) [2024-06-22 06:13:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2465513472. Throughput: 0: 42654.5. Samples: 2465622000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-22 06:13:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 06:13:31,438][15401] Updated weights for policy 0, policy_version 150490 (0.0023) [2024-06-22 06:13:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2465710080. Throughput: 0: 42745.3. Samples: 2465882140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 06:13:34,913][15401] Updated weights for policy 0, policy_version 150500 (0.0041) [2024-06-22 06:13:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2465906688. Throughput: 0: 42407.8. Samples: 2466002820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 06:13:39,058][15401] Updated weights for policy 0, policy_version 150510 (0.0028) [2024-06-22 06:13:42,715][15401] Updated weights for policy 0, policy_version 150520 (0.0031) [2024-06-22 06:13:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42932.6). Total num frames: 2466152448. Throughput: 0: 42285.8. Samples: 2466254880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 06:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150522_2466152448.pth... [2024-06-22 06:13:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000149894_2455863296.pth [2024-06-22 06:13:47,099][15401] Updated weights for policy 0, policy_version 150530 (0.0035) [2024-06-22 06:13:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 2466316288. Throughput: 0: 42330.2. Samples: 2466508500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 06:13:50,519][15401] Updated weights for policy 0, policy_version 150540 (0.0037) [2024-06-22 06:13:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2466562048. Throughput: 0: 42154.2. Samples: 2466626020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 06:13:54,663][15401] Updated weights for policy 0, policy_version 150550 (0.0028) [2024-06-22 06:13:57,903][15349] Signal inference workers to stop experience collection... (36400 times) [2024-06-22 06:13:57,908][15349] Signal inference workers to resume experience collection... (36400 times) [2024-06-22 06:13:57,925][15401] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-06-22 06:13:57,925][15401] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-06-22 06:13:58,083][15401] Updated weights for policy 0, policy_version 150560 (0.0031) [2024-06-22 06:13:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2466791424. Throughput: 0: 42508.6. Samples: 2466897620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:13:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 06:14:01,919][15401] Updated weights for policy 0, policy_version 150570 (0.0031) [2024-06-22 06:14:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 2466955264. Throughput: 0: 42713.4. Samples: 2467164260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 06:14:05,728][15401] Updated weights for policy 0, policy_version 150580 (0.0032) [2024-06-22 06:14:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2467217408. Throughput: 0: 42533.7. Samples: 2467282240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 06:14:09,331][15401] Updated weights for policy 0, policy_version 150590 (0.0031) [2024-06-22 06:14:13,217][15401] Updated weights for policy 0, policy_version 150600 (0.0030) [2024-06-22 06:14:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.5, 300 sec: 42876.4). Total num frames: 2467430400. Throughput: 0: 42660.6. Samples: 2467541720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 06:14:17,696][15401] Updated weights for policy 0, policy_version 150610 (0.0038) [2024-06-22 06:14:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2467610624. Throughput: 0: 42618.7. Samples: 2467799980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 06:14:21,256][15401] Updated weights for policy 0, policy_version 150620 (0.0047) [2024-06-22 06:14:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2467856384. Throughput: 0: 42795.6. Samples: 2467928620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:23,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 06:14:25,294][15401] Updated weights for policy 0, policy_version 150630 (0.0038) [2024-06-22 06:14:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2468052992. Throughput: 0: 42940.1. Samples: 2468187180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:28,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 06:14:28,800][15401] Updated weights for policy 0, policy_version 150640 (0.0033) [2024-06-22 06:14:32,663][15401] Updated weights for policy 0, policy_version 150650 (0.0038) [2024-06-22 06:14:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2468282368. Throughput: 0: 43148.0. Samples: 2468450160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 06:14:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 06:14:36,242][15401] Updated weights for policy 0, policy_version 150660 (0.0044) [2024-06-22 06:14:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2468511744. Throughput: 0: 43437.4. Samples: 2468580700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:14:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 06:14:40,070][15401] Updated weights for policy 0, policy_version 150670 (0.0034) [2024-06-22 06:14:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2468708352. Throughput: 0: 43115.9. Samples: 2468837840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:14:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 06:14:43,605][15401] Updated weights for policy 0, policy_version 150680 (0.0024) [2024-06-22 06:14:47,420][15401] Updated weights for policy 0, policy_version 150690 (0.0033) [2024-06-22 06:14:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2468921344. Throughput: 0: 42967.0. Samples: 2469097780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:14:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 06:14:51,185][15401] Updated weights for policy 0, policy_version 150700 (0.0026) [2024-06-22 06:14:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2469150720. Throughput: 0: 43206.8. Samples: 2469226540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:14:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 06:14:54,832][15401] Updated weights for policy 0, policy_version 150710 (0.0038) [2024-06-22 06:14:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2469363712. Throughput: 0: 43269.3. Samples: 2469488840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:14:58,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 06:14:58,574][15401] Updated weights for policy 0, policy_version 150720 (0.0037) [2024-06-22 06:15:02,690][15401] Updated weights for policy 0, policy_version 150730 (0.0036) [2024-06-22 06:15:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 2469576704. Throughput: 0: 43419.5. Samples: 2469753860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 06:15:05,991][15401] Updated weights for policy 0, policy_version 150740 (0.0036) [2024-06-22 06:15:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 2469806080. Throughput: 0: 43321.4. Samples: 2469878180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:08,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 06:15:10,224][15401] Updated weights for policy 0, policy_version 150750 (0.0029) [2024-06-22 06:15:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2470035456. Throughput: 0: 43554.2. Samples: 2470147120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:13,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 06:15:13,432][15401] Updated weights for policy 0, policy_version 150760 (0.0041) [2024-06-22 06:15:17,846][15401] Updated weights for policy 0, policy_version 150770 (0.0038) [2024-06-22 06:15:18,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43690.7, 300 sec: 42931.7). Total num frames: 2470232064. Throughput: 0: 43485.0. Samples: 2470406980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:18,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 06:15:21,310][15401] Updated weights for policy 0, policy_version 150780 (0.0036) [2024-06-22 06:15:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 2470461440. Throughput: 0: 43379.9. Samples: 2470532800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 06:15:23,912][15349] Signal inference workers to stop experience collection... (36450 times) [2024-06-22 06:15:23,913][15349] Signal inference workers to resume experience collection... (36450 times) [2024-06-22 06:15:23,927][15401] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-06-22 06:15:23,927][15401] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-06-22 06:15:25,716][15401] Updated weights for policy 0, policy_version 150790 (0.0032) [2024-06-22 06:15:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43963.7, 300 sec: 43098.3). Total num frames: 2470690816. Throughput: 0: 43432.5. Samples: 2470792300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 06:15:28,646][15401] Updated weights for policy 0, policy_version 150800 (0.0022) [2024-06-22 06:15:33,093][15401] Updated weights for policy 0, policy_version 150810 (0.0034) [2024-06-22 06:15:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2470871040. Throughput: 0: 43534.6. Samples: 2471056840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:33,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-22 06:15:36,155][15401] Updated weights for policy 0, policy_version 150820 (0.0032) [2024-06-22 06:15:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.8, 300 sec: 42931.9). Total num frames: 2471116800. Throughput: 0: 43425.2. Samples: 2471180780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:15:38,393][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 06:15:40,733][15401] Updated weights for policy 0, policy_version 150830 (0.0036) [2024-06-22 06:15:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 2471329792. Throughput: 0: 43338.0. Samples: 2471439060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:15:43,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 06:15:43,573][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150839_2471346176.pth... [2024-06-22 06:15:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150209_2461024256.pth [2024-06-22 06:15:43,803][15401] Updated weights for policy 0, policy_version 150840 (0.0026) [2024-06-22 06:15:48,350][15401] Updated weights for policy 0, policy_version 150850 (0.0038) [2024-06-22 06:15:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 2471526400. Throughput: 0: 43434.3. Samples: 2471708400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:15:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 06:15:51,525][15401] Updated weights for policy 0, policy_version 150860 (0.0033) [2024-06-22 06:15:53,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 2471755776. Throughput: 0: 43319.1. Samples: 2471827540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:15:53,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 06:15:55,889][15401] Updated weights for policy 0, policy_version 150870 (0.0035) [2024-06-22 06:15:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2471968768. Throughput: 0: 43140.9. Samples: 2472088460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:15:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 06:15:59,003][15401] Updated weights for policy 0, policy_version 150880 (0.0033) [2024-06-22 06:16:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2472165376. Throughput: 0: 43244.8. Samples: 2472353000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 06:16:03,597][15401] Updated weights for policy 0, policy_version 150890 (0.0031) [2024-06-22 06:16:06,425][15401] Updated weights for policy 0, policy_version 150900 (0.0025) [2024-06-22 06:16:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2472394752. Throughput: 0: 43185.0. Samples: 2472476120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 06:16:11,020][15401] Updated weights for policy 0, policy_version 150910 (0.0030) [2024-06-22 06:16:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 2472640512. Throughput: 0: 43250.6. Samples: 2472738580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 06:16:14,106][15401] Updated weights for policy 0, policy_version 150920 (0.0040) [2024-06-22 06:16:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2472804352. Throughput: 0: 43238.6. Samples: 2473002580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:18,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 06:16:18,642][15401] Updated weights for policy 0, policy_version 150930 (0.0033) [2024-06-22 06:16:21,635][15401] Updated weights for policy 0, policy_version 150940 (0.0031) [2024-06-22 06:16:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2473050112. Throughput: 0: 43094.8. Samples: 2473119940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:23,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 06:16:26,144][15401] Updated weights for policy 0, policy_version 150950 (0.0025) [2024-06-22 06:16:28,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 2473279488. Throughput: 0: 43395.8. Samples: 2473391860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 06:16:29,378][15401] Updated weights for policy 0, policy_version 150960 (0.0038) [2024-06-22 06:16:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 2473459712. Throughput: 0: 43060.9. Samples: 2473646140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 06:16:33,805][15401] Updated weights for policy 0, policy_version 150970 (0.0032) [2024-06-22 06:16:35,055][15349] Signal inference workers to stop experience collection... (36500 times) [2024-06-22 06:16:35,083][15401] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-06-22 06:16:35,117][15349] Signal inference workers to resume experience collection... (36500 times) [2024-06-22 06:16:35,117][15401] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-06-22 06:16:37,108][15401] Updated weights for policy 0, policy_version 150980 (0.0030) [2024-06-22 06:16:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 2473705472. Throughput: 0: 43138.4. Samples: 2473768660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 06:16:41,165][15401] Updated weights for policy 0, policy_version 150990 (0.0030) [2024-06-22 06:16:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.9, 300 sec: 43042.3). Total num frames: 2473918464. Throughput: 0: 43322.0. Samples: 2474038060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 06:16:43,393][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 06:16:44,594][15401] Updated weights for policy 0, policy_version 151000 (0.0022) [2024-06-22 06:16:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 2474131456. Throughput: 0: 43110.3. Samples: 2474292960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:16:48,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 06:16:48,639][15401] Updated weights for policy 0, policy_version 151010 (0.0034) [2024-06-22 06:16:52,247][15401] Updated weights for policy 0, policy_version 151020 (0.0031) [2024-06-22 06:16:53,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43419.4, 300 sec: 43153.8). Total num frames: 2474360832. Throughput: 0: 43242.2. Samples: 2474422020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:16:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 06:16:56,471][15401] Updated weights for policy 0, policy_version 151030 (0.0028) [2024-06-22 06:16:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2474573824. Throughput: 0: 43313.4. Samples: 2474687680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:16:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 06:16:59,805][15401] Updated weights for policy 0, policy_version 151040 (0.0053) [2024-06-22 06:17:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 2474770432. Throughput: 0: 42971.8. Samples: 2474936300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 06:17:04,087][15401] Updated weights for policy 0, policy_version 151050 (0.0034) [2024-06-22 06:17:07,864][15401] Updated weights for policy 0, policy_version 151060 (0.0043) [2024-06-22 06:17:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2474983424. Throughput: 0: 43154.1. Samples: 2475061880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 06:17:11,603][15401] Updated weights for policy 0, policy_version 151070 (0.0033) [2024-06-22 06:17:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 2475180032. Throughput: 0: 42842.2. Samples: 2475319760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 06:17:15,509][15401] Updated weights for policy 0, policy_version 151080 (0.0034) [2024-06-22 06:17:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43689.0, 300 sec: 43097.9). Total num frames: 2475425792. Throughput: 0: 42719.0. Samples: 2475568600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:18,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 06:17:19,107][15401] Updated weights for policy 0, policy_version 151090 (0.0031) [2024-06-22 06:17:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42931.6). Total num frames: 2475606016. Throughput: 0: 42881.1. Samples: 2475698320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 06:17:23,515][15401] Updated weights for policy 0, policy_version 151100 (0.0038) [2024-06-22 06:17:26,587][15401] Updated weights for policy 0, policy_version 151110 (0.0038) [2024-06-22 06:17:28,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2475819008. Throughput: 0: 42471.2. Samples: 2475949160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 06:17:31,179][15401] Updated weights for policy 0, policy_version 151120 (0.0035) [2024-06-22 06:17:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2476048384. Throughput: 0: 42512.4. Samples: 2476206020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:33,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 06:17:34,213][15401] Updated weights for policy 0, policy_version 151130 (0.0033) [2024-06-22 06:17:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 2476261376. Throughput: 0: 42742.1. Samples: 2476345420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 06:17:38,699][15401] Updated weights for policy 0, policy_version 151140 (0.0028) [2024-06-22 06:17:41,590][15401] Updated weights for policy 0, policy_version 151150 (0.0037) [2024-06-22 06:17:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 2476457984. Throughput: 0: 42417.7. Samples: 2476596480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 06:17:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 06:17:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151151_2476457984.pth... [2024-06-22 06:17:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150522_2466152448.pth [2024-06-22 06:17:43,793][15349] Signal inference workers to stop experience collection... (36550 times) [2024-06-22 06:17:43,794][15349] Signal inference workers to resume experience collection... (36550 times) [2024-06-22 06:17:43,808][15401] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-06-22 06:17:43,809][15401] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-06-22 06:17:46,370][15401] Updated weights for policy 0, policy_version 151160 (0.0040) [2024-06-22 06:17:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2476703744. Throughput: 0: 42589.8. Samples: 2476852840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:17:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 06:17:49,561][15401] Updated weights for policy 0, policy_version 151170 (0.0032) [2024-06-22 06:17:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 2476883968. Throughput: 0: 42867.5. Samples: 2476990920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:17:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 06:17:53,886][15401] Updated weights for policy 0, policy_version 151180 (0.0038) [2024-06-22 06:17:57,139][15401] Updated weights for policy 0, policy_version 151190 (0.0033) [2024-06-22 06:17:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 2477113344. Throughput: 0: 42726.5. Samples: 2477242460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:17:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 06:18:01,661][15401] Updated weights for policy 0, policy_version 151200 (0.0040) [2024-06-22 06:18:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 2477342720. Throughput: 0: 42859.5. Samples: 2477497180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 06:18:04,670][15401] Updated weights for policy 0, policy_version 151210 (0.0041) [2024-06-22 06:18:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2477539328. Throughput: 0: 43034.8. Samples: 2477634880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 06:18:09,168][15401] Updated weights for policy 0, policy_version 151220 (0.0044) [2024-06-22 06:18:12,195][15401] Updated weights for policy 0, policy_version 151230 (0.0033) [2024-06-22 06:18:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2477768704. Throughput: 0: 42991.1. Samples: 2477883760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 06:18:16,980][15401] Updated weights for policy 0, policy_version 151240 (0.0038) [2024-06-22 06:18:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 43042.7). Total num frames: 2477981696. Throughput: 0: 43141.5. Samples: 2478147380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 06:18:20,006][15401] Updated weights for policy 0, policy_version 151250 (0.0028) [2024-06-22 06:18:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2478194688. Throughput: 0: 42980.0. Samples: 2478279520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 06:18:24,486][15401] Updated weights for policy 0, policy_version 151260 (0.0028) [2024-06-22 06:18:27,507][15401] Updated weights for policy 0, policy_version 151270 (0.0029) [2024-06-22 06:18:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 2478407680. Throughput: 0: 42859.2. Samples: 2478525140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 06:18:32,259][15401] Updated weights for policy 0, policy_version 151280 (0.0028) [2024-06-22 06:18:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 43098.3). Total num frames: 2478620672. Throughput: 0: 42832.7. Samples: 2478780320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 06:18:35,673][15401] Updated weights for policy 0, policy_version 151290 (0.0034) [2024-06-22 06:18:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2478817280. Throughput: 0: 42636.9. Samples: 2478909580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 06:18:39,862][15401] Updated weights for policy 0, policy_version 151300 (0.0039) [2024-06-22 06:18:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 2479046656. Throughput: 0: 42702.3. Samples: 2479164060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 06:18:43,400][15401] Updated weights for policy 0, policy_version 151310 (0.0029) [2024-06-22 06:18:47,763][15401] Updated weights for policy 0, policy_version 151320 (0.0033) [2024-06-22 06:18:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 43042.7). Total num frames: 2479259648. Throughput: 0: 42747.9. Samples: 2479420840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:18:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 06:18:51,069][15401] Updated weights for policy 0, policy_version 151330 (0.0035) [2024-06-22 06:18:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2479439872. Throughput: 0: 42432.3. Samples: 2479544340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:18:53,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 06:18:55,350][15401] Updated weights for policy 0, policy_version 151340 (0.0035) [2024-06-22 06:18:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 2479685632. Throughput: 0: 42656.4. Samples: 2479803300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:18:58,391][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 06:18:58,669][15401] Updated weights for policy 0, policy_version 151350 (0.0029) [2024-06-22 06:19:02,920][15401] Updated weights for policy 0, policy_version 151360 (0.0040) [2024-06-22 06:19:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2479882240. Throughput: 0: 42517.6. Samples: 2480060680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 06:19:04,656][15349] Signal inference workers to stop experience collection... (36600 times) [2024-06-22 06:19:04,658][15349] Signal inference workers to resume experience collection... (36600 times) [2024-06-22 06:19:04,671][15401] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-06-22 06:19:04,700][15401] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-06-22 06:19:06,237][15401] Updated weights for policy 0, policy_version 151370 (0.0041) [2024-06-22 06:19:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2480095232. Throughput: 0: 42345.8. Samples: 2480185080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 06:19:10,552][15401] Updated weights for policy 0, policy_version 151380 (0.0027) [2024-06-22 06:19:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 43098.2). Total num frames: 2480324608. Throughput: 0: 42614.3. Samples: 2480442780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 06:19:13,898][15401] Updated weights for policy 0, policy_version 151390 (0.0036) [2024-06-22 06:19:18,094][15401] Updated weights for policy 0, policy_version 151400 (0.0035) [2024-06-22 06:19:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 2480537600. Throughput: 0: 42678.4. Samples: 2480700840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:18,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 06:19:21,595][15401] Updated weights for policy 0, policy_version 151410 (0.0030) [2024-06-22 06:19:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 2480734208. Throughput: 0: 42660.9. Samples: 2480829320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 06:19:25,822][15401] Updated weights for policy 0, policy_version 151420 (0.0036) [2024-06-22 06:19:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2480963584. Throughput: 0: 42656.5. Samples: 2481083600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 06:19:29,530][15401] Updated weights for policy 0, policy_version 151430 (0.0046) [2024-06-22 06:19:33,271][15401] Updated weights for policy 0, policy_version 151440 (0.0032) [2024-06-22 06:19:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2481192960. Throughput: 0: 42716.1. Samples: 2481343060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 06:19:37,183][15401] Updated weights for policy 0, policy_version 151450 (0.0034) [2024-06-22 06:19:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2481389568. Throughput: 0: 42873.4. Samples: 2481473640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 06:19:41,022][15401] Updated weights for policy 0, policy_version 151460 (0.0041) [2024-06-22 06:19:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2481602560. Throughput: 0: 42609.4. Samples: 2481720720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:43,392][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 06:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151465_2481602560.pth... [2024-06-22 06:19:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000150839_2471346176.pth [2024-06-22 06:19:45,031][15401] Updated weights for policy 0, policy_version 151470 (0.0034) [2024-06-22 06:19:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42931.6). Total num frames: 2481815552. Throughput: 0: 42681.0. Samples: 2481981320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:48,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 06:19:48,582][15401] Updated weights for policy 0, policy_version 151480 (0.0047) [2024-06-22 06:19:52,966][15401] Updated weights for policy 0, policy_version 151490 (0.0032) [2024-06-22 06:19:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2482028544. Throughput: 0: 42786.6. Samples: 2482110480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 06:19:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 06:19:56,068][15401] Updated weights for policy 0, policy_version 151500 (0.0040) [2024-06-22 06:19:58,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 2482257920. Throughput: 0: 42739.8. Samples: 2482366080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:19:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 06:20:00,487][15401] Updated weights for policy 0, policy_version 151510 (0.0037) [2024-06-22 06:20:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 2482470912. Throughput: 0: 42876.9. Samples: 2482630300. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 06:20:03,675][15401] Updated weights for policy 0, policy_version 151520 (0.0045) [2024-06-22 06:20:08,111][15401] Updated weights for policy 0, policy_version 151530 (0.0031) [2024-06-22 06:20:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2482683904. Throughput: 0: 42792.4. Samples: 2482754980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:08,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 06:20:11,093][15349] Signal inference workers to stop experience collection... (36650 times) [2024-06-22 06:20:11,130][15401] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-06-22 06:20:11,154][15349] Signal inference workers to resume experience collection... (36650 times) [2024-06-22 06:20:11,163][15401] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-06-22 06:20:11,297][15401] Updated weights for policy 0, policy_version 151540 (0.0032) [2024-06-22 06:20:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2482913280. Throughput: 0: 42969.4. Samples: 2483017220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 06:20:15,809][15401] Updated weights for policy 0, policy_version 151550 (0.0043) [2024-06-22 06:20:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2483109888. Throughput: 0: 42950.3. Samples: 2483275820. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 06:20:18,885][15401] Updated weights for policy 0, policy_version 151560 (0.0037) [2024-06-22 06:20:23,381][15401] Updated weights for policy 0, policy_version 151570 (0.0030) [2024-06-22 06:20:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2483322880. Throughput: 0: 42739.7. Samples: 2483396920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 06:20:26,439][15401] Updated weights for policy 0, policy_version 151580 (0.0025) [2024-06-22 06:20:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2483552256. Throughput: 0: 43065.7. Samples: 2483658680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 06:20:31,635][15401] Updated weights for policy 0, policy_version 151590 (0.0042) [2024-06-22 06:20:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2483748864. Throughput: 0: 43005.7. Samples: 2483916580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 06:20:33,949][15401] Updated weights for policy 0, policy_version 151600 (0.0029) [2024-06-22 06:20:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2483945472. Throughput: 0: 42816.0. Samples: 2484037200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 06:20:39,077][15401] Updated weights for policy 0, policy_version 151610 (0.0037) [2024-06-22 06:20:41,915][15401] Updated weights for policy 0, policy_version 151620 (0.0038) [2024-06-22 06:20:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2484207616. Throughput: 0: 43001.1. Samples: 2484301120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 06:20:46,679][15401] Updated weights for policy 0, policy_version 151630 (0.0034) [2024-06-22 06:20:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 2484404224. Throughput: 0: 42898.1. Samples: 2484560720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:48,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 06:20:49,505][15401] Updated weights for policy 0, policy_version 151640 (0.0035) [2024-06-22 06:20:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2484584448. Throughput: 0: 42704.5. Samples: 2484676680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-22 06:20:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 06:20:54,346][15401] Updated weights for policy 0, policy_version 151650 (0.0026) [2024-06-22 06:20:56,889][15401] Updated weights for policy 0, policy_version 151660 (0.0024) [2024-06-22 06:20:58,395][15132] Fps is (10 sec: 45850.1, 60 sec: 43413.7, 300 sec: 43041.9). Total num frames: 2484862976. Throughput: 0: 42709.8. Samples: 2484939400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:20:58,396][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 06:21:01,813][15401] Updated weights for policy 0, policy_version 151670 (0.0041) [2024-06-22 06:21:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2485010432. Throughput: 0: 42846.5. Samples: 2485203920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:03,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 06:21:04,971][15401] Updated weights for policy 0, policy_version 151680 (0.0034) [2024-06-22 06:21:08,390][15132] Fps is (10 sec: 37703.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2485239808. Throughput: 0: 42768.3. Samples: 2485321500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 06:21:09,357][15401] Updated weights for policy 0, policy_version 151690 (0.0033) [2024-06-22 06:21:12,544][15401] Updated weights for policy 0, policy_version 151700 (0.0036) [2024-06-22 06:21:13,390][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 2485501952. Throughput: 0: 42752.9. Samples: 2485582560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 06:21:17,017][15401] Updated weights for policy 0, policy_version 151710 (0.0043) [2024-06-22 06:21:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2485633024. Throughput: 0: 42927.5. Samples: 2485848320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 06:21:20,165][15401] Updated weights for policy 0, policy_version 151720 (0.0036) [2024-06-22 06:21:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2485878784. Throughput: 0: 42723.5. Samples: 2485959760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 06:21:24,619][15401] Updated weights for policy 0, policy_version 151730 (0.0032) [2024-06-22 06:21:27,135][15349] Signal inference workers to stop experience collection... (36700 times) [2024-06-22 06:21:27,184][15401] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-06-22 06:21:27,193][15349] Signal inference workers to resume experience collection... (36700 times) [2024-06-22 06:21:27,203][15401] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-06-22 06:21:27,986][15401] Updated weights for policy 0, policy_version 151740 (0.0043) [2024-06-22 06:21:28,389][15132] Fps is (10 sec: 50790.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2486140928. Throughput: 0: 42930.2. Samples: 2486232980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 06:21:32,561][15401] Updated weights for policy 0, policy_version 151750 (0.0032) [2024-06-22 06:21:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2486288384. Throughput: 0: 42832.4. Samples: 2486488180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 06:21:35,664][15401] Updated weights for policy 0, policy_version 151760 (0.0037) [2024-06-22 06:21:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2486517760. Throughput: 0: 42695.2. Samples: 2486597960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:38,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 06:21:40,219][15401] Updated weights for policy 0, policy_version 151770 (0.0028) [2024-06-22 06:21:43,243][15401] Updated weights for policy 0, policy_version 151780 (0.0029) [2024-06-22 06:21:43,396][15132] Fps is (10 sec: 49121.4, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 2486779904. Throughput: 0: 43036.5. Samples: 2486876080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:43,396][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 06:21:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151781_2486779904.pth... [2024-06-22 06:21:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151151_2476457984.pth [2024-06-22 06:21:47,808][15401] Updated weights for policy 0, policy_version 151790 (0.0027) [2024-06-22 06:21:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2486943744. Throughput: 0: 42796.9. Samples: 2487129780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:48,394][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 06:21:50,761][15401] Updated weights for policy 0, policy_version 151800 (0.0040) [2024-06-22 06:21:53,391][15132] Fps is (10 sec: 39339.4, 60 sec: 43143.2, 300 sec: 42709.2). Total num frames: 2487173120. Throughput: 0: 42790.8. Samples: 2487247160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:53,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 06:21:55,401][15401] Updated weights for policy 0, policy_version 151810 (0.0039) [2024-06-22 06:21:58,338][15401] Updated weights for policy 0, policy_version 151820 (0.0027) [2024-06-22 06:21:58,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42602.4, 300 sec: 42876.1). Total num frames: 2487418880. Throughput: 0: 42989.4. Samples: 2487517080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 06:21:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 06:22:03,254][15401] Updated weights for policy 0, policy_version 151830 (0.0032) [2024-06-22 06:22:03,390][15132] Fps is (10 sec: 40967.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2487582720. Throughput: 0: 42688.4. Samples: 2487769300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 06:22:06,053][15401] Updated weights for policy 0, policy_version 151840 (0.0035) [2024-06-22 06:22:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2487828480. Throughput: 0: 42830.8. Samples: 2487887140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 06:22:10,835][15401] Updated weights for policy 0, policy_version 151850 (0.0044) [2024-06-22 06:22:12,200][15349] Signal inference workers to stop experience collection... (36750 times) [2024-06-22 06:22:12,201][15349] Signal inference workers to resume experience collection... (36750 times) [2024-06-22 06:22:12,225][15401] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-06-22 06:22:12,225][15401] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-06-22 06:22:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 2488041472. Throughput: 0: 42884.9. Samples: 2488162800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 06:22:13,694][15401] Updated weights for policy 0, policy_version 151860 (0.0039) [2024-06-22 06:22:18,384][15401] Updated weights for policy 0, policy_version 151870 (0.0027) [2024-06-22 06:22:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2488238080. Throughput: 0: 42762.8. Samples: 2488412500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 06:22:21,452][15401] Updated weights for policy 0, policy_version 151880 (0.0040) [2024-06-22 06:22:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2488483840. Throughput: 0: 43085.3. Samples: 2488536800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 06:22:26,528][15401] Updated weights for policy 0, policy_version 151890 (0.0033) [2024-06-22 06:22:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2488664064. Throughput: 0: 42711.4. Samples: 2488797820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 06:22:29,328][15401] Updated weights for policy 0, policy_version 151900 (0.0035) [2024-06-22 06:22:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2488877056. Throughput: 0: 42795.5. Samples: 2489055580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 06:22:33,830][15401] Updated weights for policy 0, policy_version 151910 (0.0036) [2024-06-22 06:22:36,775][15401] Updated weights for policy 0, policy_version 151920 (0.0041) [2024-06-22 06:22:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 2489139200. Throughput: 0: 43125.3. Samples: 2489187720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 06:22:41,207][15401] Updated weights for policy 0, policy_version 151930 (0.0037) [2024-06-22 06:22:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 2489319424. Throughput: 0: 43072.8. Samples: 2489455360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 06:22:44,386][15401] Updated weights for policy 0, policy_version 151940 (0.0033) [2024-06-22 06:22:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2489516032. Throughput: 0: 42963.6. Samples: 2489702660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 06:22:48,763][15401] Updated weights for policy 0, policy_version 151950 (0.0035) [2024-06-22 06:22:52,041][15401] Updated weights for policy 0, policy_version 151960 (0.0036) [2024-06-22 06:22:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43419.0, 300 sec: 42931.7). Total num frames: 2489778176. Throughput: 0: 43209.3. Samples: 2489831560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 06:22:56,343][15401] Updated weights for policy 0, policy_version 151970 (0.0037) [2024-06-22 06:22:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2489958400. Throughput: 0: 42907.0. Samples: 2490093620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:22:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 06:22:59,648][15401] Updated weights for policy 0, policy_version 151980 (0.0031) [2024-06-22 06:23:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2490171392. Throughput: 0: 42902.2. Samples: 2490343100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 06:23:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 06:23:03,858][15401] Updated weights for policy 0, policy_version 151990 (0.0029) [2024-06-22 06:23:07,336][15401] Updated weights for policy 0, policy_version 152000 (0.0034) [2024-06-22 06:23:08,324][15349] Signal inference workers to stop experience collection... (36800 times) [2024-06-22 06:23:08,371][15401] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-06-22 06:23:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 2490384384. Throughput: 0: 42926.6. Samples: 2490468600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:08,392][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 06:23:08,441][15349] Signal inference workers to resume experience collection... (36800 times) [2024-06-22 06:23:08,442][15401] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-06-22 06:23:11,807][15401] Updated weights for policy 0, policy_version 152010 (0.0022) [2024-06-22 06:23:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2490580992. Throughput: 0: 42856.1. Samples: 2490726340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 06:23:15,362][15401] Updated weights for policy 0, policy_version 152020 (0.0031) [2024-06-22 06:23:18,392][15132] Fps is (10 sec: 42599.5, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2490810368. Throughput: 0: 42588.7. Samples: 2490972160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:18,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 06:23:19,640][15401] Updated weights for policy 0, policy_version 152030 (0.0036) [2024-06-22 06:23:22,987][15401] Updated weights for policy 0, policy_version 152040 (0.0034) [2024-06-22 06:23:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2491039744. Throughput: 0: 42556.0. Samples: 2491102740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 06:23:27,073][15401] Updated weights for policy 0, policy_version 152050 (0.0039) [2024-06-22 06:23:28,389][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2491236352. Throughput: 0: 42384.1. Samples: 2491362640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 06:23:30,699][15401] Updated weights for policy 0, policy_version 152060 (0.0040) [2024-06-22 06:23:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2491449344. Throughput: 0: 42505.3. Samples: 2491615400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 06:23:35,022][15401] Updated weights for policy 0, policy_version 152070 (0.0040) [2024-06-22 06:23:38,219][15401] Updated weights for policy 0, policy_version 152080 (0.0037) [2024-06-22 06:23:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2491678720. Throughput: 0: 42513.8. Samples: 2491744680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 06:23:42,514][15401] Updated weights for policy 0, policy_version 152090 (0.0028) [2024-06-22 06:23:43,391][15132] Fps is (10 sec: 44232.4, 60 sec: 42870.7, 300 sec: 42820.4). Total num frames: 2491891712. Throughput: 0: 42465.3. Samples: 2492004600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:43,391][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 06:23:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152093_2491891712.pth... [2024-06-22 06:23:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151465_2481602560.pth [2024-06-22 06:23:45,762][15401] Updated weights for policy 0, policy_version 152100 (0.0025) [2024-06-22 06:23:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2492088320. Throughput: 0: 42631.1. Samples: 2492261500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 06:23:50,041][15401] Updated weights for policy 0, policy_version 152110 (0.0030) [2024-06-22 06:23:53,389][15132] Fps is (10 sec: 42603.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2492317696. Throughput: 0: 42840.1. Samples: 2492396300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 06:23:53,440][15401] Updated weights for policy 0, policy_version 152120 (0.0030) [2024-06-22 06:23:57,583][15401] Updated weights for policy 0, policy_version 152130 (0.0035) [2024-06-22 06:23:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2492514304. Throughput: 0: 42914.2. Samples: 2492657480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:23:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 06:24:00,996][15401] Updated weights for policy 0, policy_version 152140 (0.0028) [2024-06-22 06:24:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2492743680. Throughput: 0: 42930.0. Samples: 2492903920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:24:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 06:24:05,240][15401] Updated weights for policy 0, policy_version 152150 (0.0028) [2024-06-22 06:24:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 2492956672. Throughput: 0: 42970.7. Samples: 2493036420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 06:24:08,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 06:24:08,606][15401] Updated weights for policy 0, policy_version 152160 (0.0036) [2024-06-22 06:24:13,106][15401] Updated weights for policy 0, policy_version 152170 (0.0026) [2024-06-22 06:24:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2493153280. Throughput: 0: 43063.0. Samples: 2493300480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 06:24:16,305][15401] Updated weights for policy 0, policy_version 152180 (0.0035) [2024-06-22 06:24:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.0, 300 sec: 42931.6). Total num frames: 2493399040. Throughput: 0: 42954.1. Samples: 2493548340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 06:24:20,819][15401] Updated weights for policy 0, policy_version 152190 (0.0029) [2024-06-22 06:24:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2493595648. Throughput: 0: 43014.7. Samples: 2493680340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 06:24:23,893][15401] Updated weights for policy 0, policy_version 152200 (0.0039) [2024-06-22 06:24:28,356][15401] Updated weights for policy 0, policy_version 152210 (0.0023) [2024-06-22 06:24:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2493808640. Throughput: 0: 43097.9. Samples: 2493943960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 06:24:29,300][15349] Signal inference workers to stop experience collection... (36850 times) [2024-06-22 06:24:29,304][15349] Signal inference workers to resume experience collection... (36850 times) [2024-06-22 06:24:29,342][15401] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-06-22 06:24:29,342][15401] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-06-22 06:24:31,630][15401] Updated weights for policy 0, policy_version 152220 (0.0022) [2024-06-22 06:24:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2494038016. Throughput: 0: 42952.0. Samples: 2494194340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:33,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 06:24:36,119][15401] Updated weights for policy 0, policy_version 152230 (0.0043) [2024-06-22 06:24:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2494251008. Throughput: 0: 42992.9. Samples: 2494330980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 06:24:39,517][15401] Updated weights for policy 0, policy_version 152240 (0.0036) [2024-06-22 06:24:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42326.1, 300 sec: 42765.0). Total num frames: 2494431232. Throughput: 0: 42726.1. Samples: 2494580160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 06:24:43,790][15401] Updated weights for policy 0, policy_version 152250 (0.0030) [2024-06-22 06:24:47,040][15401] Updated weights for policy 0, policy_version 152260 (0.0031) [2024-06-22 06:24:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2494676992. Throughput: 0: 42949.3. Samples: 2494836640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 06:24:51,335][15401] Updated weights for policy 0, policy_version 152270 (0.0040) [2024-06-22 06:24:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 2494889984. Throughput: 0: 43049.7. Samples: 2494973660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 06:24:54,737][15401] Updated weights for policy 0, policy_version 152280 (0.0038) [2024-06-22 06:24:58,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2495086592. Throughput: 0: 42797.7. Samples: 2495226480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:24:58,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 06:24:58,895][15401] Updated weights for policy 0, policy_version 152290 (0.0040) [2024-06-22 06:25:02,177][15401] Updated weights for policy 0, policy_version 152300 (0.0042) [2024-06-22 06:25:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2495315968. Throughput: 0: 42926.7. Samples: 2495480040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:25:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 06:25:06,539][15401] Updated weights for policy 0, policy_version 152310 (0.0039) [2024-06-22 06:25:08,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2495528960. Throughput: 0: 43003.2. Samples: 2495615480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:25:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 06:25:09,664][15401] Updated weights for policy 0, policy_version 152320 (0.0044) [2024-06-22 06:25:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2495725568. Throughput: 0: 42690.7. Samples: 2495865040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:25:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 06:25:14,068][15401] Updated weights for policy 0, policy_version 152330 (0.0027) [2024-06-22 06:25:17,278][15401] Updated weights for policy 0, policy_version 152340 (0.0034) [2024-06-22 06:25:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2495971328. Throughput: 0: 42794.3. Samples: 2496120080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 06:25:21,728][15401] Updated weights for policy 0, policy_version 152350 (0.0041) [2024-06-22 06:25:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2496167936. Throughput: 0: 42809.8. Samples: 2496257420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 06:25:24,921][15401] Updated weights for policy 0, policy_version 152360 (0.0033) [2024-06-22 06:25:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2496364544. Throughput: 0: 42720.5. Samples: 2496502580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:28,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 06:25:29,497][15401] Updated weights for policy 0, policy_version 152370 (0.0036) [2024-06-22 06:25:32,782][15401] Updated weights for policy 0, policy_version 152380 (0.0046) [2024-06-22 06:25:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2496610304. Throughput: 0: 42603.7. Samples: 2496753800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 06:25:37,141][15401] Updated weights for policy 0, policy_version 152390 (0.0045) [2024-06-22 06:25:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2496806912. Throughput: 0: 42679.7. Samples: 2496894240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 06:25:40,342][15401] Updated weights for policy 0, policy_version 152400 (0.0027) [2024-06-22 06:25:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2497019904. Throughput: 0: 42645.7. Samples: 2497145440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:43,394][15132] Avg episode reward: [(0, '0.158')] [2024-06-22 06:25:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152406_2497019904.pth... [2024-06-22 06:25:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000151781_2486779904.pth [2024-06-22 06:25:44,797][15401] Updated weights for policy 0, policy_version 152410 (0.0032) [2024-06-22 06:25:48,041][15401] Updated weights for policy 0, policy_version 152420 (0.0045) [2024-06-22 06:25:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 2497265664. Throughput: 0: 42602.3. Samples: 2497397240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:48,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 06:25:52,452][15401] Updated weights for policy 0, policy_version 152430 (0.0030) [2024-06-22 06:25:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2497445888. Throughput: 0: 42570.2. Samples: 2497531140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 06:25:55,714][15401] Updated weights for policy 0, policy_version 152440 (0.0034) [2024-06-22 06:25:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2497675264. Throughput: 0: 42693.3. Samples: 2497786240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:25:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 06:26:00,253][15401] Updated weights for policy 0, policy_version 152450 (0.0046) [2024-06-22 06:26:01,059][15349] Signal inference workers to stop experience collection... (36900 times) [2024-06-22 06:26:01,107][15401] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-06-22 06:26:01,177][15349] Signal inference workers to resume experience collection... (36900 times) [2024-06-22 06:26:01,178][15401] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-06-22 06:26:03,374][15401] Updated weights for policy 0, policy_version 152460 (0.0028) [2024-06-22 06:26:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2497904640. Throughput: 0: 42605.3. Samples: 2498037320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:26:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 06:26:07,871][15401] Updated weights for policy 0, policy_version 152470 (0.0040) [2024-06-22 06:26:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 2498084864. Throughput: 0: 42434.6. Samples: 2498166980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:26:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 06:26:11,313][15401] Updated weights for policy 0, policy_version 152480 (0.0029) [2024-06-22 06:26:13,390][15132] Fps is (10 sec: 40956.3, 60 sec: 43143.8, 300 sec: 42987.0). Total num frames: 2498314240. Throughput: 0: 42515.5. Samples: 2498415820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 06:26:13,391][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 06:26:15,833][15401] Updated weights for policy 0, policy_version 152490 (0.0046) [2024-06-22 06:26:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2498510848. Throughput: 0: 42717.8. Samples: 2498676100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:18,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 06:26:19,068][15401] Updated weights for policy 0, policy_version 152500 (0.0037) [2024-06-22 06:26:23,390][15132] Fps is (10 sec: 39325.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2498707456. Throughput: 0: 42420.8. Samples: 2498803180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:23,395][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 06:26:23,561][15401] Updated weights for policy 0, policy_version 152510 (0.0044) [2024-06-22 06:26:26,600][15401] Updated weights for policy 0, policy_version 152520 (0.0038) [2024-06-22 06:26:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2498936832. Throughput: 0: 42427.6. Samples: 2499054680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 06:26:31,082][15401] Updated weights for policy 0, policy_version 152530 (0.0030) [2024-06-22 06:26:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2499149824. Throughput: 0: 42707.6. Samples: 2499318980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:33,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 06:26:34,243][15401] Updated weights for policy 0, policy_version 152540 (0.0034) [2024-06-22 06:26:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 2499346432. Throughput: 0: 42526.6. Samples: 2499444840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 06:26:38,601][15401] Updated weights for policy 0, policy_version 152550 (0.0037) [2024-06-22 06:26:42,118][15401] Updated weights for policy 0, policy_version 152560 (0.0041) [2024-06-22 06:26:43,391][15132] Fps is (10 sec: 44231.4, 60 sec: 42870.6, 300 sec: 42875.9). Total num frames: 2499592192. Throughput: 0: 42539.7. Samples: 2499700580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:43,391][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 06:26:46,187][15401] Updated weights for policy 0, policy_version 152570 (0.0044) [2024-06-22 06:26:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42054.0, 300 sec: 42765.3). Total num frames: 2499788800. Throughput: 0: 42647.2. Samples: 2499956440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 06:26:49,936][15401] Updated weights for policy 0, policy_version 152580 (0.0037) [2024-06-22 06:26:53,389][15132] Fps is (10 sec: 39326.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2499985408. Throughput: 0: 42583.6. Samples: 2500083240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 06:26:53,765][15401] Updated weights for policy 0, policy_version 152590 (0.0033) [2024-06-22 06:26:57,751][15401] Updated weights for policy 0, policy_version 152600 (0.0029) [2024-06-22 06:26:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2500214784. Throughput: 0: 42814.3. Samples: 2500342420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:26:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 06:27:01,300][15401] Updated weights for policy 0, policy_version 152610 (0.0038) [2024-06-22 06:27:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2500427776. Throughput: 0: 42661.3. Samples: 2500595860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:27:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 06:27:05,352][15401] Updated weights for policy 0, policy_version 152620 (0.0042) [2024-06-22 06:27:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2500640768. Throughput: 0: 42767.1. Samples: 2500727700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:27:08,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 06:27:09,113][15401] Updated weights for policy 0, policy_version 152630 (0.0034) [2024-06-22 06:27:12,914][15401] Updated weights for policy 0, policy_version 152640 (0.0038) [2024-06-22 06:27:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42326.0, 300 sec: 42765.0). Total num frames: 2500853760. Throughput: 0: 42891.5. Samples: 2500984800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:27:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 06:27:16,675][15401] Updated weights for policy 0, policy_version 152650 (0.0032) [2024-06-22 06:27:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2501066752. Throughput: 0: 42643.7. Samples: 2501237940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 06:27:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 06:27:20,550][15401] Updated weights for policy 0, policy_version 152660 (0.0024) [2024-06-22 06:27:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2501296128. Throughput: 0: 42724.4. Samples: 2501367440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 06:27:23,981][15401] Updated weights for policy 0, policy_version 152670 (0.0037) [2024-06-22 06:27:28,265][15401] Updated weights for policy 0, policy_version 152680 (0.0046) [2024-06-22 06:27:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2501509120. Throughput: 0: 42807.0. Samples: 2501626840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 06:27:31,809][15401] Updated weights for policy 0, policy_version 152690 (0.0030) [2024-06-22 06:27:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2501705728. Throughput: 0: 42692.0. Samples: 2501877580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 06:27:36,172][15401] Updated weights for policy 0, policy_version 152700 (0.0035) [2024-06-22 06:27:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2501918720. Throughput: 0: 42795.6. Samples: 2502009040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 06:27:39,541][15401] Updated weights for policy 0, policy_version 152710 (0.0036) [2024-06-22 06:27:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42053.0, 300 sec: 42709.4). Total num frames: 2502115328. Throughput: 0: 42700.6. Samples: 2502263960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 06:27:43,569][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152719_2502148096.pth... [2024-06-22 06:27:43,617][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152093_2491891712.pth [2024-06-22 06:27:44,075][15401] Updated weights for policy 0, policy_version 152720 (0.0031) [2024-06-22 06:27:46,963][15349] Signal inference workers to stop experience collection... (36950 times) [2024-06-22 06:27:46,963][15349] Signal inference workers to resume experience collection... (36950 times) [2024-06-22 06:27:47,009][15401] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-06-22 06:27:47,009][15401] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-06-22 06:27:47,106][15401] Updated weights for policy 0, policy_version 152730 (0.0030) [2024-06-22 06:27:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2502361088. Throughput: 0: 42712.5. Samples: 2502517920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 06:27:51,609][15401] Updated weights for policy 0, policy_version 152740 (0.0036) [2024-06-22 06:27:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2502574080. Throughput: 0: 42760.5. Samples: 2502651920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 06:27:54,739][15401] Updated weights for policy 0, policy_version 152750 (0.0035) [2024-06-22 06:27:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2502770688. Throughput: 0: 42730.8. Samples: 2502907680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:27:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 06:27:59,031][15401] Updated weights for policy 0, policy_version 152760 (0.0032) [2024-06-22 06:28:02,547][15401] Updated weights for policy 0, policy_version 152770 (0.0041) [2024-06-22 06:28:03,390][15132] Fps is (10 sec: 44233.2, 60 sec: 43144.0, 300 sec: 42820.8). Total num frames: 2503016448. Throughput: 0: 42830.3. Samples: 2503165340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:28:03,391][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 06:28:06,932][15401] Updated weights for policy 0, policy_version 152780 (0.0039) [2024-06-22 06:28:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2503229440. Throughput: 0: 42873.4. Samples: 2503296740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:28:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 06:28:10,213][15401] Updated weights for policy 0, policy_version 152790 (0.0031) [2024-06-22 06:28:13,389][15132] Fps is (10 sec: 39325.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 2503409664. Throughput: 0: 42885.8. Samples: 2503556700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:28:13,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 06:28:14,543][15401] Updated weights for policy 0, policy_version 152800 (0.0029) [2024-06-22 06:28:17,832][15401] Updated weights for policy 0, policy_version 152810 (0.0036) [2024-06-22 06:28:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2503655424. Throughput: 0: 42939.9. Samples: 2503809880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:28:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 06:28:22,125][15401] Updated weights for policy 0, policy_version 152820 (0.0033) [2024-06-22 06:28:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2503852032. Throughput: 0: 42948.0. Samples: 2503941700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 06:28:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 06:28:25,475][15401] Updated weights for policy 0, policy_version 152830 (0.0022) [2024-06-22 06:28:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2504065024. Throughput: 0: 42986.8. Samples: 2504198360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 06:28:29,631][15401] Updated weights for policy 0, policy_version 152840 (0.0031) [2024-06-22 06:28:33,333][15401] Updated weights for policy 0, policy_version 152850 (0.0040) [2024-06-22 06:28:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2504294400. Throughput: 0: 43082.0. Samples: 2504456620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 06:28:37,219][15401] Updated weights for policy 0, policy_version 152860 (0.0024) [2024-06-22 06:28:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.9, 300 sec: 42820.4). Total num frames: 2504523776. Throughput: 0: 43006.6. Samples: 2504587320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:38,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 06:28:40,896][15401] Updated weights for policy 0, policy_version 152870 (0.0027) [2024-06-22 06:28:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2504720384. Throughput: 0: 43019.8. Samples: 2504843580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 06:28:44,977][15401] Updated weights for policy 0, policy_version 152880 (0.0037) [2024-06-22 06:28:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2504933376. Throughput: 0: 43016.4. Samples: 2505101040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 06:28:48,485][15401] Updated weights for policy 0, policy_version 152890 (0.0041) [2024-06-22 06:28:52,671][15401] Updated weights for policy 0, policy_version 152900 (0.0031) [2024-06-22 06:28:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2505146368. Throughput: 0: 43010.7. Samples: 2505232220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 06:28:56,028][15401] Updated weights for policy 0, policy_version 152910 (0.0031) [2024-06-22 06:28:56,931][15349] Signal inference workers to stop experience collection... (37000 times) [2024-06-22 06:28:56,936][15349] Signal inference workers to resume experience collection... (37000 times) [2024-06-22 06:28:56,962][15401] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-06-22 06:28:56,963][15401] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-06-22 06:28:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2505359360. Throughput: 0: 42842.1. Samples: 2505484600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:28:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 06:29:00,503][15401] Updated weights for policy 0, policy_version 152920 (0.0037) [2024-06-22 06:29:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42599.0, 300 sec: 42765.0). Total num frames: 2505572352. Throughput: 0: 42811.3. Samples: 2505736380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:03,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 06:29:03,608][15401] Updated weights for policy 0, policy_version 152930 (0.0039) [2024-06-22 06:29:08,166][15401] Updated weights for policy 0, policy_version 152940 (0.0035) [2024-06-22 06:29:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 2505785344. Throughput: 0: 42849.2. Samples: 2505870020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:08,393][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 06:29:11,493][15401] Updated weights for policy 0, policy_version 152950 (0.0038) [2024-06-22 06:29:13,390][15132] Fps is (10 sec: 44235.4, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 2506014720. Throughput: 0: 42906.1. Samples: 2506129140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 06:29:15,826][15401] Updated weights for policy 0, policy_version 152960 (0.0037) [2024-06-22 06:29:18,389][15132] Fps is (10 sec: 45886.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2506244096. Throughput: 0: 42798.4. Samples: 2506382540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 06:29:19,206][15401] Updated weights for policy 0, policy_version 152970 (0.0033) [2024-06-22 06:29:23,269][15401] Updated weights for policy 0, policy_version 152980 (0.0036) [2024-06-22 06:29:23,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2506424320. Throughput: 0: 42829.4. Samples: 2506514540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 06:29:26,596][15401] Updated weights for policy 0, policy_version 152990 (0.0037) [2024-06-22 06:29:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2506637312. Throughput: 0: 42881.0. Samples: 2506773220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 06:29:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 06:29:30,802][15401] Updated weights for policy 0, policy_version 153000 (0.0032) [2024-06-22 06:29:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2506883072. Throughput: 0: 42772.3. Samples: 2507025800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 06:29:34,094][15401] Updated weights for policy 0, policy_version 153010 (0.0030) [2024-06-22 06:29:38,230][15401] Updated weights for policy 0, policy_version 153020 (0.0037) [2024-06-22 06:29:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2507079680. Throughput: 0: 42890.1. Samples: 2507162280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:38,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 06:29:42,154][15401] Updated weights for policy 0, policy_version 153030 (0.0035) [2024-06-22 06:29:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2507292672. Throughput: 0: 43148.4. Samples: 2507426280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 06:29:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153034_2507309056.pth... [2024-06-22 06:29:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152406_2497019904.pth [2024-06-22 06:29:45,634][15401] Updated weights for policy 0, policy_version 153040 (0.0033) [2024-06-22 06:29:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2507538432. Throughput: 0: 43132.3. Samples: 2507677340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 06:29:49,583][15401] Updated weights for policy 0, policy_version 153050 (0.0036) [2024-06-22 06:29:53,314][15401] Updated weights for policy 0, policy_version 153060 (0.0027) [2024-06-22 06:29:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 2507735040. Throughput: 0: 43165.0. Samples: 2507812340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 06:29:57,086][15401] Updated weights for policy 0, policy_version 153070 (0.0041) [2024-06-22 06:29:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2507931648. Throughput: 0: 43125.4. Samples: 2508069880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:29:58,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 06:30:00,913][15401] Updated weights for policy 0, policy_version 153080 (0.0036) [2024-06-22 06:30:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 2508193792. Throughput: 0: 43117.7. Samples: 2508322840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 06:30:04,600][15401] Updated weights for policy 0, policy_version 153090 (0.0039) [2024-06-22 06:30:08,388][15401] Updated weights for policy 0, policy_version 153100 (0.0039) [2024-06-22 06:30:08,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 2508390400. Throughput: 0: 43243.6. Samples: 2508460500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 06:30:12,020][15401] Updated weights for policy 0, policy_version 153110 (0.0027) [2024-06-22 06:30:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2508587008. Throughput: 0: 43147.5. Samples: 2508714860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 06:30:15,863][15401] Updated weights for policy 0, policy_version 153120 (0.0031) [2024-06-22 06:30:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2508816384. Throughput: 0: 43204.4. Samples: 2508970000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:18,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-22 06:30:19,593][15401] Updated weights for policy 0, policy_version 153130 (0.0047) [2024-06-22 06:30:23,275][15349] Signal inference workers to stop experience collection... (37050 times) [2024-06-22 06:30:23,275][15349] Signal inference workers to resume experience collection... (37050 times) [2024-06-22 06:30:23,301][15401] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-06-22 06:30:23,301][15401] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-06-22 06:30:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2509029376. Throughput: 0: 43166.4. Samples: 2509104760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 06:30:23,422][15401] Updated weights for policy 0, policy_version 153140 (0.0032) [2024-06-22 06:30:27,426][15401] Updated weights for policy 0, policy_version 153150 (0.0033) [2024-06-22 06:30:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2509242368. Throughput: 0: 43013.0. Samples: 2509361860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:30:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 06:30:31,076][15401] Updated weights for policy 0, policy_version 153160 (0.0028) [2024-06-22 06:30:33,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2509471744. Throughput: 0: 43143.6. Samples: 2509618800. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 06:30:34,884][15401] Updated weights for policy 0, policy_version 153170 (0.0049) [2024-06-22 06:30:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2509684736. Throughput: 0: 43176.3. Samples: 2509755280. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 06:30:38,649][15401] Updated weights for policy 0, policy_version 153180 (0.0026) [2024-06-22 06:30:42,691][15401] Updated weights for policy 0, policy_version 153190 (0.0049) [2024-06-22 06:30:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 2509881344. Throughput: 0: 42992.6. Samples: 2510004440. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 06:30:46,425][15401] Updated weights for policy 0, policy_version 153200 (0.0029) [2024-06-22 06:30:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 2510127104. Throughput: 0: 43109.2. Samples: 2510262760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 06:30:50,493][15401] Updated weights for policy 0, policy_version 153210 (0.0029) [2024-06-22 06:30:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2510323712. Throughput: 0: 43079.1. Samples: 2510399060. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 06:30:54,092][15401] Updated weights for policy 0, policy_version 153220 (0.0051) [2024-06-22 06:30:58,101][15401] Updated weights for policy 0, policy_version 153230 (0.0033) [2024-06-22 06:30:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 2510520320. Throughput: 0: 43208.0. Samples: 2510659220. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:30:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 06:31:01,523][15401] Updated weights for policy 0, policy_version 153240 (0.0034) [2024-06-22 06:31:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2510766080. Throughput: 0: 43087.7. Samples: 2510908940. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:03,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 06:31:05,835][15401] Updated weights for policy 0, policy_version 153250 (0.0036) [2024-06-22 06:31:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.7). Total num frames: 2510946304. Throughput: 0: 43140.3. Samples: 2511046080. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:08,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-22 06:31:09,202][15401] Updated weights for policy 0, policy_version 153260 (0.0035) [2024-06-22 06:31:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2511159296. Throughput: 0: 43144.0. Samples: 2511303340. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 06:31:13,548][15401] Updated weights for policy 0, policy_version 153270 (0.0033) [2024-06-22 06:31:16,920][15401] Updated weights for policy 0, policy_version 153280 (0.0046) [2024-06-22 06:31:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 2511405056. Throughput: 0: 43011.5. Samples: 2511554420. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:18,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 06:31:21,257][15401] Updated weights for policy 0, policy_version 153290 (0.0044) [2024-06-22 06:31:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2511601664. Throughput: 0: 43002.4. Samples: 2511690380. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 06:31:24,559][15401] Updated weights for policy 0, policy_version 153300 (0.0039) [2024-06-22 06:31:28,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2511798272. Throughput: 0: 43022.6. Samples: 2511940460. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 06:31:28,956][15401] Updated weights for policy 0, policy_version 153310 (0.0027) [2024-06-22 06:31:32,140][15401] Updated weights for policy 0, policy_version 153320 (0.0033) [2024-06-22 06:31:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2512044032. Throughput: 0: 42879.2. Samples: 2512192320. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-22 06:31:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 06:31:36,547][15401] Updated weights for policy 0, policy_version 153330 (0.0042) [2024-06-22 06:31:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.7, 300 sec: 42931.8). Total num frames: 2512257024. Throughput: 0: 42984.6. Samples: 2512333360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:31:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 06:31:39,758][15401] Updated weights for policy 0, policy_version 153340 (0.0036) [2024-06-22 06:31:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2512437248. Throughput: 0: 42772.0. Samples: 2512583960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:31:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 06:31:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153348_2512453632.pth... [2024-06-22 06:31:43,527][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000152719_2502148096.pth [2024-06-22 06:31:44,125][15401] Updated weights for policy 0, policy_version 153350 (0.0039) [2024-06-22 06:31:47,339][15401] Updated weights for policy 0, policy_version 153360 (0.0042) [2024-06-22 06:31:48,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 2512699392. Throughput: 0: 42791.4. Samples: 2512834560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:31:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 06:31:52,043][15401] Updated weights for policy 0, policy_version 153370 (0.0045) [2024-06-22 06:31:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2512896000. Throughput: 0: 42808.0. Samples: 2512972440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:31:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 06:31:54,999][15401] Updated weights for policy 0, policy_version 153380 (0.0035) [2024-06-22 06:31:58,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2513076224. Throughput: 0: 42695.1. Samples: 2513224620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:31:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 06:31:59,595][15401] Updated weights for policy 0, policy_version 153390 (0.0032) [2024-06-22 06:31:59,899][15349] Signal inference workers to stop experience collection... (37100 times) [2024-06-22 06:31:59,913][15401] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-06-22 06:31:59,915][15349] Signal inference workers to resume experience collection... (37100 times) [2024-06-22 06:31:59,925][15401] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-06-22 06:32:02,503][15401] Updated weights for policy 0, policy_version 153400 (0.0032) [2024-06-22 06:32:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2513338368. Throughput: 0: 42753.9. Samples: 2513478240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 06:32:07,181][15401] Updated weights for policy 0, policy_version 153410 (0.0042) [2024-06-22 06:32:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2513502208. Throughput: 0: 42671.1. Samples: 2513610580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 06:32:10,063][15401] Updated weights for policy 0, policy_version 153420 (0.0025) [2024-06-22 06:32:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2513731584. Throughput: 0: 42754.7. Samples: 2513864420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:13,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 06:32:14,707][15401] Updated weights for policy 0, policy_version 153430 (0.0031) [2024-06-22 06:32:17,628][15401] Updated weights for policy 0, policy_version 153440 (0.0038) [2024-06-22 06:32:18,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 2513977344. Throughput: 0: 42891.0. Samples: 2514122420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 06:32:22,262][15401] Updated weights for policy 0, policy_version 153450 (0.0043) [2024-06-22 06:32:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2514157568. Throughput: 0: 42758.2. Samples: 2514257480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 06:32:25,897][15401] Updated weights for policy 0, policy_version 153460 (0.0029) [2024-06-22 06:32:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2514386944. Throughput: 0: 42691.6. Samples: 2514505080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 06:32:30,041][15401] Updated weights for policy 0, policy_version 153470 (0.0030) [2024-06-22 06:32:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2514599936. Throughput: 0: 42884.6. Samples: 2514764360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 06:32:33,437][15401] Updated weights for policy 0, policy_version 153480 (0.0041) [2024-06-22 06:32:37,603][15401] Updated weights for policy 0, policy_version 153490 (0.0026) [2024-06-22 06:32:38,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.2, 300 sec: 43042.7). Total num frames: 2514812928. Throughput: 0: 42637.6. Samples: 2514891140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 06:32:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 06:32:41,111][15401] Updated weights for policy 0, policy_version 153500 (0.0039) [2024-06-22 06:32:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2515025920. Throughput: 0: 42714.6. Samples: 2515146780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:32:43,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 06:32:45,209][15401] Updated weights for policy 0, policy_version 153510 (0.0033) [2024-06-22 06:32:48,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 2515238912. Throughput: 0: 42807.0. Samples: 2515404560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:32:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 06:32:48,793][15401] Updated weights for policy 0, policy_version 153520 (0.0040) [2024-06-22 06:32:53,065][15401] Updated weights for policy 0, policy_version 153530 (0.0042) [2024-06-22 06:32:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 43042.4). Total num frames: 2515468288. Throughput: 0: 42724.0. Samples: 2515533260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:32:53,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 06:32:56,447][15401] Updated weights for policy 0, policy_version 153540 (0.0036) [2024-06-22 06:32:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.2). Total num frames: 2515664896. Throughput: 0: 42828.4. Samples: 2515791700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:32:58,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 06:33:00,664][15401] Updated weights for policy 0, policy_version 153550 (0.0028) [2024-06-22 06:33:03,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2515894272. Throughput: 0: 42796.0. Samples: 2516048240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 06:33:04,015][15401] Updated weights for policy 0, policy_version 153560 (0.0031) [2024-06-22 06:33:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 2516074496. Throughput: 0: 42495.0. Samples: 2516169860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:08,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 06:33:08,560][15401] Updated weights for policy 0, policy_version 153570 (0.0027) [2024-06-22 06:33:12,202][15401] Updated weights for policy 0, policy_version 153580 (0.0035) [2024-06-22 06:33:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2516320256. Throughput: 0: 42902.0. Samples: 2516435680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 06:33:15,964][15401] Updated weights for policy 0, policy_version 153590 (0.0031) [2024-06-22 06:33:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 2516516864. Throughput: 0: 42722.8. Samples: 2516686880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 06:33:19,984][15401] Updated weights for policy 0, policy_version 153600 (0.0034) [2024-06-22 06:33:23,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2516729856. Throughput: 0: 42752.6. Samples: 2516815000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 06:33:23,424][15401] Updated weights for policy 0, policy_version 153610 (0.0035) [2024-06-22 06:33:25,980][15349] Signal inference workers to stop experience collection... (37150 times) [2024-06-22 06:33:26,028][15401] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-06-22 06:33:26,093][15349] Signal inference workers to resume experience collection... (37150 times) [2024-06-22 06:33:26,093][15401] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-06-22 06:33:27,628][15401] Updated weights for policy 0, policy_version 153620 (0.0032) [2024-06-22 06:33:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2516942848. Throughput: 0: 42799.5. Samples: 2517072760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 06:33:30,889][15401] Updated weights for policy 0, policy_version 153630 (0.0026) [2024-06-22 06:33:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42820.6). Total num frames: 2517155840. Throughput: 0: 42667.9. Samples: 2517324720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:33,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 06:33:35,187][15401] Updated weights for policy 0, policy_version 153640 (0.0042) [2024-06-22 06:33:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.7, 300 sec: 42931.7). Total num frames: 2517385216. Throughput: 0: 42747.2. Samples: 2517456780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 06:33:38,452][15401] Updated weights for policy 0, policy_version 153650 (0.0037) [2024-06-22 06:33:42,868][15401] Updated weights for policy 0, policy_version 153660 (0.0042) [2024-06-22 06:33:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2517581824. Throughput: 0: 42630.4. Samples: 2517710060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 06:33:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 06:33:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153662_2517598208.pth... [2024-06-22 06:33:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153034_2507309056.pth [2024-06-22 06:33:46,126][15401] Updated weights for policy 0, policy_version 153670 (0.0031) [2024-06-22 06:33:48,392][15132] Fps is (10 sec: 39310.6, 60 sec: 42323.4, 300 sec: 42820.2). Total num frames: 2517778432. Throughput: 0: 42720.2. Samples: 2517970760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:33:48,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 06:33:50,617][15401] Updated weights for policy 0, policy_version 153680 (0.0028) [2024-06-22 06:33:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 2518024192. Throughput: 0: 42797.9. Samples: 2518095660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:33:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 06:33:53,639][15401] Updated weights for policy 0, policy_version 153690 (0.0035) [2024-06-22 06:33:58,077][15401] Updated weights for policy 0, policy_version 153700 (0.0034) [2024-06-22 06:33:58,389][15132] Fps is (10 sec: 44249.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2518220800. Throughput: 0: 42716.3. Samples: 2518357900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:33:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 06:34:01,039][15401] Updated weights for policy 0, policy_version 153710 (0.0040) [2024-06-22 06:34:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42876.5). Total num frames: 2518433792. Throughput: 0: 42865.8. Samples: 2518615840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 06:34:05,597][15401] Updated weights for policy 0, policy_version 153720 (0.0041) [2024-06-22 06:34:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43419.3, 300 sec: 42931.7). Total num frames: 2518679552. Throughput: 0: 42876.4. Samples: 2518744440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:08,391][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 06:34:08,591][15401] Updated weights for policy 0, policy_version 153730 (0.0052) [2024-06-22 06:34:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2518859776. Throughput: 0: 42938.7. Samples: 2519005000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 06:34:13,651][15401] Updated weights for policy 0, policy_version 153740 (0.0033) [2024-06-22 06:34:16,123][15401] Updated weights for policy 0, policy_version 153750 (0.0024) [2024-06-22 06:34:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2519089152. Throughput: 0: 43002.8. Samples: 2519259740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 06:34:21,079][15401] Updated weights for policy 0, policy_version 153760 (0.0033) [2024-06-22 06:34:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2519334912. Throughput: 0: 42872.0. Samples: 2519386020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:23,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 06:34:23,754][15401] Updated weights for policy 0, policy_version 153770 (0.0055) [2024-06-22 06:34:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2519498752. Throughput: 0: 43029.3. Samples: 2519646380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 06:34:28,955][15401] Updated weights for policy 0, policy_version 153780 (0.0043) [2024-06-22 06:34:31,317][15401] Updated weights for policy 0, policy_version 153790 (0.0035) [2024-06-22 06:34:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2519728128. Throughput: 0: 42872.8. Samples: 2519899920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 06:34:36,669][15401] Updated weights for policy 0, policy_version 153800 (0.0040) [2024-06-22 06:34:38,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2519973888. Throughput: 0: 43102.2. Samples: 2520035260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 06:34:39,228][15401] Updated weights for policy 0, policy_version 153810 (0.0039) [2024-06-22 06:34:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2520137728. Throughput: 0: 42937.8. Samples: 2520290100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 06:34:44,254][15401] Updated weights for policy 0, policy_version 153820 (0.0034) [2024-06-22 06:34:45,208][15349] Signal inference workers to stop experience collection... (37200 times) [2024-06-22 06:34:45,208][15349] Signal inference workers to resume experience collection... (37200 times) [2024-06-22 06:34:45,248][15401] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-06-22 06:34:45,248][15401] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-06-22 06:34:46,978][15401] Updated weights for policy 0, policy_version 153830 (0.0030) [2024-06-22 06:34:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43146.5, 300 sec: 42820.6). Total num frames: 2520367104. Throughput: 0: 42739.6. Samples: 2520539120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 06:34:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 06:34:51,780][15401] Updated weights for policy 0, policy_version 153840 (0.0025) [2024-06-22 06:34:53,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 2520612864. Throughput: 0: 42923.7. Samples: 2520676000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:34:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 06:34:54,602][15401] Updated weights for policy 0, policy_version 153850 (0.0044) [2024-06-22 06:34:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2520776704. Throughput: 0: 42698.2. Samples: 2520926420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:34:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 06:34:59,353][15401] Updated weights for policy 0, policy_version 153860 (0.0027) [2024-06-22 06:35:02,121][15401] Updated weights for policy 0, policy_version 153870 (0.0051) [2024-06-22 06:35:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2521022464. Throughput: 0: 42788.8. Samples: 2521185240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 06:35:07,107][15401] Updated weights for policy 0, policy_version 153880 (0.0039) [2024-06-22 06:35:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2521235456. Throughput: 0: 42953.4. Samples: 2521318920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 06:35:09,856][15401] Updated weights for policy 0, policy_version 153890 (0.0044) [2024-06-22 06:35:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2521432064. Throughput: 0: 42711.1. Samples: 2521568380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:13,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 06:35:14,636][15401] Updated weights for policy 0, policy_version 153900 (0.0028) [2024-06-22 06:35:17,600][15401] Updated weights for policy 0, policy_version 153910 (0.0037) [2024-06-22 06:35:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2521677824. Throughput: 0: 42777.3. Samples: 2521824900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 06:35:22,262][15401] Updated weights for policy 0, policy_version 153920 (0.0032) [2024-06-22 06:35:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2521890816. Throughput: 0: 42846.1. Samples: 2521963340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 06:35:25,135][15401] Updated weights for policy 0, policy_version 153930 (0.0027) [2024-06-22 06:35:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2522071040. Throughput: 0: 42785.3. Samples: 2522215440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 06:35:30,049][15401] Updated weights for policy 0, policy_version 153940 (0.0049) [2024-06-22 06:35:32,890][15401] Updated weights for policy 0, policy_version 153950 (0.0033) [2024-06-22 06:35:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2522333184. Throughput: 0: 42747.4. Samples: 2522462760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 06:35:37,527][15401] Updated weights for policy 0, policy_version 153960 (0.0039) [2024-06-22 06:35:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2522513408. Throughput: 0: 42767.9. Samples: 2522600560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 06:35:40,545][15401] Updated weights for policy 0, policy_version 153970 (0.0037) [2024-06-22 06:35:43,396][15132] Fps is (10 sec: 39296.9, 60 sec: 43139.8, 300 sec: 42708.6). Total num frames: 2522726400. Throughput: 0: 42761.1. Samples: 2522850940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:43,397][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 06:35:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153975_2522726400.pth... [2024-06-22 06:35:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153348_2512453632.pth [2024-06-22 06:35:45,090][15401] Updated weights for policy 0, policy_version 153980 (0.0032) [2024-06-22 06:35:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2522955776. Throughput: 0: 42740.5. Samples: 2523108560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:35:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 06:35:48,595][15401] Updated weights for policy 0, policy_version 153990 (0.0031) [2024-06-22 06:35:53,171][15401] Updated weights for policy 0, policy_version 154000 (0.0032) [2024-06-22 06:35:53,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2523152384. Throughput: 0: 42710.0. Samples: 2523240880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:35:53,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 06:35:56,204][15401] Updated weights for policy 0, policy_version 154010 (0.0032) [2024-06-22 06:35:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2523381760. Throughput: 0: 42934.3. Samples: 2523500420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:35:58,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 06:36:00,682][15401] Updated weights for policy 0, policy_version 154020 (0.0038) [2024-06-22 06:36:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2523594752. Throughput: 0: 42775.7. Samples: 2523749800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:03,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 06:36:03,791][15401] Updated weights for policy 0, policy_version 154030 (0.0037) [2024-06-22 06:36:06,982][15349] Signal inference workers to stop experience collection... (37250 times) [2024-06-22 06:36:06,982][15349] Signal inference workers to resume experience collection... (37250 times) [2024-06-22 06:36:07,003][15401] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-06-22 06:36:07,004][15401] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-06-22 06:36:08,125][15401] Updated weights for policy 0, policy_version 154040 (0.0044) [2024-06-22 06:36:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2523791360. Throughput: 0: 42713.4. Samples: 2523885440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 06:36:11,638][15401] Updated weights for policy 0, policy_version 154050 (0.0044) [2024-06-22 06:36:13,394][15132] Fps is (10 sec: 40943.1, 60 sec: 42868.6, 300 sec: 42709.2). Total num frames: 2524004352. Throughput: 0: 42644.5. Samples: 2524134620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:13,394][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 06:36:15,881][15401] Updated weights for policy 0, policy_version 154060 (0.0042) [2024-06-22 06:36:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2524200960. Throughput: 0: 42975.7. Samples: 2524396660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:18,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 06:36:19,315][15401] Updated weights for policy 0, policy_version 154070 (0.0023) [2024-06-22 06:36:23,378][15401] Updated weights for policy 0, policy_version 154080 (0.0023) [2024-06-22 06:36:23,390][15132] Fps is (10 sec: 44254.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2524446720. Throughput: 0: 42631.5. Samples: 2524518980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 06:36:27,066][15401] Updated weights for policy 0, policy_version 154090 (0.0035) [2024-06-22 06:36:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2524659712. Throughput: 0: 42756.7. Samples: 2524774720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 06:36:31,122][15401] Updated weights for policy 0, policy_version 154100 (0.0032) [2024-06-22 06:36:33,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42047.8, 300 sec: 42708.5). Total num frames: 2524856320. Throughput: 0: 42688.1. Samples: 2525029800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:33,397][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 06:36:34,857][15401] Updated weights for policy 0, policy_version 154110 (0.0040) [2024-06-22 06:36:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2525069312. Throughput: 0: 42586.4. Samples: 2525157260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 06:36:38,701][15401] Updated weights for policy 0, policy_version 154120 (0.0041) [2024-06-22 06:36:42,337][15401] Updated weights for policy 0, policy_version 154130 (0.0026) [2024-06-22 06:36:43,390][15132] Fps is (10 sec: 45903.7, 60 sec: 43149.0, 300 sec: 42765.0). Total num frames: 2525315072. Throughput: 0: 42672.2. Samples: 2525420680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 06:36:46,178][15401] Updated weights for policy 0, policy_version 154140 (0.0038) [2024-06-22 06:36:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2525511680. Throughput: 0: 42768.9. Samples: 2525674400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:48,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-22 06:36:50,026][15401] Updated weights for policy 0, policy_version 154150 (0.0043) [2024-06-22 06:36:53,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2525724672. Throughput: 0: 42575.6. Samples: 2525801340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 06:36:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 06:36:53,675][15401] Updated weights for policy 0, policy_version 154160 (0.0031) [2024-06-22 06:36:57,768][15401] Updated weights for policy 0, policy_version 154170 (0.0035) [2024-06-22 06:36:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2525937664. Throughput: 0: 42906.9. Samples: 2526065260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:36:58,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 06:37:01,500][15401] Updated weights for policy 0, policy_version 154180 (0.0033) [2024-06-22 06:37:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2526134272. Throughput: 0: 42623.0. Samples: 2526314700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 06:37:05,425][15401] Updated weights for policy 0, policy_version 154190 (0.0023) [2024-06-22 06:37:08,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2526363648. Throughput: 0: 42766.9. Samples: 2526443480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 06:37:08,950][15401] Updated weights for policy 0, policy_version 154200 (0.0029) [2024-06-22 06:37:12,999][15401] Updated weights for policy 0, policy_version 154210 (0.0031) [2024-06-22 06:37:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43147.5, 300 sec: 42765.0). Total num frames: 2526593024. Throughput: 0: 42893.9. Samples: 2526704940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 06:37:16,614][15401] Updated weights for policy 0, policy_version 154220 (0.0035) [2024-06-22 06:37:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2526773248. Throughput: 0: 42883.5. Samples: 2526959280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 06:37:20,680][15401] Updated weights for policy 0, policy_version 154230 (0.0043) [2024-06-22 06:37:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2527002624. Throughput: 0: 42819.5. Samples: 2527084140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 06:37:23,494][15349] Signal inference workers to stop experience collection... (37300 times) [2024-06-22 06:37:23,494][15349] Signal inference workers to resume experience collection... (37300 times) [2024-06-22 06:37:23,532][15401] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-06-22 06:37:23,532][15401] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-06-22 06:37:24,086][15401] Updated weights for policy 0, policy_version 154240 (0.0033) [2024-06-22 06:37:28,244][15401] Updated weights for policy 0, policy_version 154250 (0.0029) [2024-06-22 06:37:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2527232000. Throughput: 0: 42805.9. Samples: 2527347040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:28,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 06:37:31,846][15401] Updated weights for policy 0, policy_version 154260 (0.0035) [2024-06-22 06:37:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 2527428608. Throughput: 0: 42714.5. Samples: 2527596560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 06:37:36,338][15401] Updated weights for policy 0, policy_version 154270 (0.0026) [2024-06-22 06:37:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2527657984. Throughput: 0: 42923.5. Samples: 2527732900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 06:37:39,627][15401] Updated weights for policy 0, policy_version 154280 (0.0032) [2024-06-22 06:37:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2527854592. Throughput: 0: 42706.7. Samples: 2527987060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 06:37:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154288_2527854592.pth... [2024-06-22 06:37:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153662_2517598208.pth [2024-06-22 06:37:43,988][15401] Updated weights for policy 0, policy_version 154290 (0.0034) [2024-06-22 06:37:47,114][15401] Updated weights for policy 0, policy_version 154300 (0.0031) [2024-06-22 06:37:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2528083968. Throughput: 0: 42810.3. Samples: 2528241160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 06:37:51,566][15401] Updated weights for policy 0, policy_version 154310 (0.0026) [2024-06-22 06:37:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2528296960. Throughput: 0: 42842.4. Samples: 2528371400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:53,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-22 06:37:55,090][15401] Updated weights for policy 0, policy_version 154320 (0.0042) [2024-06-22 06:37:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2528493568. Throughput: 0: 42745.9. Samples: 2528628500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 06:37:58,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 06:37:59,070][15401] Updated weights for policy 0, policy_version 154330 (0.0038) [2024-06-22 06:38:02,743][15401] Updated weights for policy 0, policy_version 154340 (0.0025) [2024-06-22 06:38:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 2528739328. Throughput: 0: 42612.4. Samples: 2528876840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 06:38:06,667][15401] Updated weights for policy 0, policy_version 154350 (0.0047) [2024-06-22 06:38:08,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 2528935936. Throughput: 0: 42727.0. Samples: 2529006960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:08,393][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 06:38:10,310][15401] Updated weights for policy 0, policy_version 154360 (0.0036) [2024-06-22 06:38:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2529116160. Throughput: 0: 42593.8. Samples: 2529263660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 06:38:14,625][15401] Updated weights for policy 0, policy_version 154370 (0.0048) [2024-06-22 06:38:17,876][15401] Updated weights for policy 0, policy_version 154380 (0.0030) [2024-06-22 06:38:18,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2529378304. Throughput: 0: 42695.7. Samples: 2529517860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 06:38:22,161][15401] Updated weights for policy 0, policy_version 154390 (0.0040) [2024-06-22 06:38:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2529574912. Throughput: 0: 42694.3. Samples: 2529654140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 06:38:25,543][15401] Updated weights for policy 0, policy_version 154400 (0.0031) [2024-06-22 06:38:28,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 2529771520. Throughput: 0: 42710.3. Samples: 2529909020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 06:38:29,720][15401] Updated weights for policy 0, policy_version 154410 (0.0036) [2024-06-22 06:38:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2530000896. Throughput: 0: 42651.5. Samples: 2530160480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 06:38:33,499][15401] Updated weights for policy 0, policy_version 154420 (0.0036) [2024-06-22 06:38:37,434][15401] Updated weights for policy 0, policy_version 154430 (0.0034) [2024-06-22 06:38:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2530197504. Throughput: 0: 42650.7. Samples: 2530290680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:38,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 06:38:41,240][15401] Updated weights for policy 0, policy_version 154440 (0.0038) [2024-06-22 06:38:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 2530426880. Throughput: 0: 42536.7. Samples: 2530542760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:43,393][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 06:38:45,239][15401] Updated weights for policy 0, policy_version 154450 (0.0046) [2024-06-22 06:38:45,913][15349] Signal inference workers to stop experience collection... (37350 times) [2024-06-22 06:38:45,956][15401] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-06-22 06:38:46,025][15349] Signal inference workers to resume experience collection... (37350 times) [2024-06-22 06:38:46,025][15401] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-06-22 06:38:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 2530639872. Throughput: 0: 42716.8. Samples: 2530799200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:48,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 06:38:48,816][15401] Updated weights for policy 0, policy_version 154460 (0.0033) [2024-06-22 06:38:52,745][15401] Updated weights for policy 0, policy_version 154470 (0.0036) [2024-06-22 06:38:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2530836480. Throughput: 0: 42731.2. Samples: 2530929760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 06:38:56,294][15401] Updated weights for policy 0, policy_version 154480 (0.0028) [2024-06-22 06:38:58,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 2531049472. Throughput: 0: 42630.2. Samples: 2531182120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:38:58,392][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 06:39:00,243][15401] Updated weights for policy 0, policy_version 154490 (0.0028) [2024-06-22 06:39:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2531278848. Throughput: 0: 42789.8. Samples: 2531443400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 06:39:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 06:39:04,228][15401] Updated weights for policy 0, policy_version 154500 (0.0035) [2024-06-22 06:39:08,077][15401] Updated weights for policy 0, policy_version 154510 (0.0041) [2024-06-22 06:39:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2531491840. Throughput: 0: 42555.9. Samples: 2531569160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:08,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 06:39:11,759][15401] Updated weights for policy 0, policy_version 154520 (0.0030) [2024-06-22 06:39:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2531704832. Throughput: 0: 42589.4. Samples: 2531825540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 06:39:15,688][15401] Updated weights for policy 0, policy_version 154530 (0.0039) [2024-06-22 06:39:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2531917824. Throughput: 0: 42880.5. Samples: 2532090100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 06:39:19,381][15401] Updated weights for policy 0, policy_version 154540 (0.0029) [2024-06-22 06:39:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2532130816. Throughput: 0: 42721.5. Samples: 2532213140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 06:39:23,403][15401] Updated weights for policy 0, policy_version 154550 (0.0039) [2024-06-22 06:39:27,082][15401] Updated weights for policy 0, policy_version 154560 (0.0035) [2024-06-22 06:39:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2532360192. Throughput: 0: 42779.7. Samples: 2532467740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 06:39:31,739][15401] Updated weights for policy 0, policy_version 154570 (0.0045) [2024-06-22 06:39:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2532573184. Throughput: 0: 42813.0. Samples: 2532725680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 06:39:34,990][15401] Updated weights for policy 0, policy_version 154580 (0.0028) [2024-06-22 06:39:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2532786176. Throughput: 0: 42660.0. Samples: 2532849460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 06:39:39,515][15401] Updated weights for policy 0, policy_version 154590 (0.0028) [2024-06-22 06:39:42,666][15401] Updated weights for policy 0, policy_version 154600 (0.0031) [2024-06-22 06:39:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 2532999168. Throughput: 0: 42830.6. Samples: 2533109500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:43,393][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 06:39:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154603_2533015552.pth... [2024-06-22 06:39:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000153975_2522726400.pth [2024-06-22 06:39:47,173][15401] Updated weights for policy 0, policy_version 154610 (0.0041) [2024-06-22 06:39:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2533179392. Throughput: 0: 42711.9. Samples: 2533365440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 06:39:50,409][15401] Updated weights for policy 0, policy_version 154620 (0.0034) [2024-06-22 06:39:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2533425152. Throughput: 0: 42621.3. Samples: 2533487120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 06:39:54,789][15401] Updated weights for policy 0, policy_version 154630 (0.0038) [2024-06-22 06:39:58,010][15401] Updated weights for policy 0, policy_version 154640 (0.0026) [2024-06-22 06:39:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2533621760. Throughput: 0: 42622.1. Samples: 2533743540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:39:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 06:40:02,739][15401] Updated weights for policy 0, policy_version 154650 (0.0034) [2024-06-22 06:40:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2533801984. Throughput: 0: 42501.7. Samples: 2534002680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:40:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 06:40:03,774][15349] Signal inference workers to stop experience collection... (37400 times) [2024-06-22 06:40:03,776][15349] Signal inference workers to resume experience collection... (37400 times) [2024-06-22 06:40:03,828][15401] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-06-22 06:40:03,828][15401] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-06-22 06:40:05,598][15401] Updated weights for policy 0, policy_version 154660 (0.0037) [2024-06-22 06:40:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2534047744. Throughput: 0: 42453.2. Samples: 2534123540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 06:40:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 06:40:10,437][15401] Updated weights for policy 0, policy_version 154670 (0.0035) [2024-06-22 06:40:13,238][15401] Updated weights for policy 0, policy_version 154680 (0.0036) [2024-06-22 06:40:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2534277120. Throughput: 0: 42544.8. Samples: 2534382260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:13,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 06:40:18,261][15401] Updated weights for policy 0, policy_version 154690 (0.0035) [2024-06-22 06:40:18,396][15132] Fps is (10 sec: 39298.2, 60 sec: 42048.0, 300 sec: 42542.0). Total num frames: 2534440960. Throughput: 0: 42556.1. Samples: 2534640960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:18,396][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 06:40:21,433][15401] Updated weights for policy 0, policy_version 154700 (0.0038) [2024-06-22 06:40:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2534703104. Throughput: 0: 42495.6. Samples: 2534761760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 06:40:25,899][15401] Updated weights for policy 0, policy_version 154710 (0.0034) [2024-06-22 06:40:28,389][15132] Fps is (10 sec: 45903.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2534899712. Throughput: 0: 42461.5. Samples: 2535020160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 06:40:28,932][15401] Updated weights for policy 0, policy_version 154720 (0.0035) [2024-06-22 06:40:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2535079936. Throughput: 0: 42562.2. Samples: 2535280740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 06:40:33,451][15401] Updated weights for policy 0, policy_version 154730 (0.0039) [2024-06-22 06:40:36,595][15401] Updated weights for policy 0, policy_version 154740 (0.0028) [2024-06-22 06:40:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 2535342080. Throughput: 0: 42601.7. Samples: 2535404200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 06:40:40,971][15401] Updated weights for policy 0, policy_version 154750 (0.0035) [2024-06-22 06:40:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 2535538688. Throughput: 0: 42662.3. Samples: 2535663340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 06:40:44,162][15401] Updated weights for policy 0, policy_version 154760 (0.0040) [2024-06-22 06:40:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2535735296. Throughput: 0: 42591.9. Samples: 2535919320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 06:40:48,545][15401] Updated weights for policy 0, policy_version 154770 (0.0035) [2024-06-22 06:40:52,360][15401] Updated weights for policy 0, policy_version 154780 (0.0023) [2024-06-22 06:40:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2535981056. Throughput: 0: 42695.7. Samples: 2536044840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 06:40:56,071][15401] Updated weights for policy 0, policy_version 154790 (0.0026) [2024-06-22 06:40:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2536177664. Throughput: 0: 42549.9. Samples: 2536297000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:40:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 06:41:00,095][15401] Updated weights for policy 0, policy_version 154800 (0.0033) [2024-06-22 06:41:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2536374272. Throughput: 0: 42605.6. Samples: 2536557960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:41:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 06:41:03,575][15401] Updated weights for policy 0, policy_version 154810 (0.0022) [2024-06-22 06:41:07,731][15401] Updated weights for policy 0, policy_version 154820 (0.0040) [2024-06-22 06:41:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 2536603648. Throughput: 0: 42743.6. Samples: 2536685220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 06:41:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 06:41:11,058][15401] Updated weights for policy 0, policy_version 154830 (0.0035) [2024-06-22 06:41:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2536816640. Throughput: 0: 42623.0. Samples: 2536938200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:13,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 06:41:15,513][15401] Updated weights for policy 0, policy_version 154840 (0.0037) [2024-06-22 06:41:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43148.8, 300 sec: 42653.9). Total num frames: 2537029632. Throughput: 0: 42418.1. Samples: 2537189560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:41:18,789][15401] Updated weights for policy 0, policy_version 154850 (0.0030) [2024-06-22 06:41:22,932][15349] Signal inference workers to stop experience collection... (37450 times) [2024-06-22 06:41:22,932][15349] Signal inference workers to resume experience collection... (37450 times) [2024-06-22 06:41:22,943][15401] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-06-22 06:41:22,943][15401] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-06-22 06:41:23,100][15401] Updated weights for policy 0, policy_version 154860 (0.0034) [2024-06-22 06:41:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2537242624. Throughput: 0: 42644.6. Samples: 2537323200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:41:26,272][15401] Updated weights for policy 0, policy_version 154870 (0.0041) [2024-06-22 06:41:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 2537455616. Throughput: 0: 42664.1. Samples: 2537583220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 06:41:30,634][15401] Updated weights for policy 0, policy_version 154880 (0.0030) [2024-06-22 06:41:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2537668608. Throughput: 0: 42619.5. Samples: 2537837200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 06:41:34,198][15401] Updated weights for policy 0, policy_version 154890 (0.0020) [2024-06-22 06:41:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.5, 300 sec: 42542.9). Total num frames: 2537865216. Throughput: 0: 42654.7. Samples: 2537964300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 06:41:38,507][15401] Updated weights for policy 0, policy_version 154900 (0.0035) [2024-06-22 06:41:41,791][15401] Updated weights for policy 0, policy_version 154910 (0.0035) [2024-06-22 06:41:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2538094592. Throughput: 0: 42628.0. Samples: 2538215260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 06:41:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154914_2538110976.pth... [2024-06-22 06:41:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154288_2527854592.pth [2024-06-22 06:41:46,005][15401] Updated weights for policy 0, policy_version 154920 (0.0027) [2024-06-22 06:41:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2538307584. Throughput: 0: 42480.1. Samples: 2538469560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 06:41:49,550][15401] Updated weights for policy 0, policy_version 154930 (0.0030) [2024-06-22 06:41:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2538520576. Throughput: 0: 42507.5. Samples: 2538598060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 06:41:53,573][15401] Updated weights for policy 0, policy_version 154940 (0.0037) [2024-06-22 06:41:57,149][15401] Updated weights for policy 0, policy_version 154950 (0.0028) [2024-06-22 06:41:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2538733568. Throughput: 0: 42665.4. Samples: 2538858140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:41:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 06:42:01,019][15401] Updated weights for policy 0, policy_version 154960 (0.0034) [2024-06-22 06:42:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 2538962944. Throughput: 0: 42761.4. Samples: 2539113920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:42:03,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 06:42:04,926][15401] Updated weights for policy 0, policy_version 154970 (0.0038) [2024-06-22 06:42:08,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 2539143168. Throughput: 0: 42597.2. Samples: 2539240180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:42:08,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 06:42:09,041][15401] Updated weights for policy 0, policy_version 154980 (0.0037) [2024-06-22 06:42:12,434][15401] Updated weights for policy 0, policy_version 154990 (0.0035) [2024-06-22 06:42:13,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2539372544. Throughput: 0: 42542.2. Samples: 2539497620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 06:42:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 06:42:16,640][15401] Updated weights for policy 0, policy_version 155000 (0.0031) [2024-06-22 06:42:18,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2539601920. Throughput: 0: 42684.1. Samples: 2539757980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 06:42:19,914][15401] Updated weights for policy 0, policy_version 155010 (0.0034) [2024-06-22 06:42:23,391][15132] Fps is (10 sec: 42592.7, 60 sec: 42597.5, 300 sec: 42598.6). Total num frames: 2539798528. Throughput: 0: 42665.7. Samples: 2539884320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:23,391][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 06:42:23,965][15401] Updated weights for policy 0, policy_version 155020 (0.0045) [2024-06-22 06:42:27,415][15401] Updated weights for policy 0, policy_version 155030 (0.0037) [2024-06-22 06:42:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2540027904. Throughput: 0: 42915.5. Samples: 2540146460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 06:42:31,870][15401] Updated weights for policy 0, policy_version 155040 (0.0027) [2024-06-22 06:42:32,779][15349] Signal inference workers to stop experience collection... (37500 times) [2024-06-22 06:42:32,837][15349] Signal inference workers to resume experience collection... (37500 times) [2024-06-22 06:42:32,839][15401] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-06-22 06:42:32,857][15401] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-06-22 06:42:33,389][15132] Fps is (10 sec: 44243.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2540240896. Throughput: 0: 42994.7. Samples: 2540404320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 06:42:35,487][15401] Updated weights for policy 0, policy_version 155050 (0.0032) [2024-06-22 06:42:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2540453888. Throughput: 0: 42936.9. Samples: 2540530220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 06:42:39,641][15401] Updated weights for policy 0, policy_version 155060 (0.0037) [2024-06-22 06:42:43,097][15401] Updated weights for policy 0, policy_version 155070 (0.0040) [2024-06-22 06:42:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2540666880. Throughput: 0: 42819.7. Samples: 2540785040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:43,391][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 06:42:47,359][15401] Updated weights for policy 0, policy_version 155080 (0.0033) [2024-06-22 06:42:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2540879872. Throughput: 0: 42840.4. Samples: 2541041640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 06:42:50,828][15401] Updated weights for policy 0, policy_version 155090 (0.0034) [2024-06-22 06:42:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2541092864. Throughput: 0: 42900.6. Samples: 2541170600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 06:42:55,010][15401] Updated weights for policy 0, policy_version 155100 (0.0026) [2024-06-22 06:42:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2541305856. Throughput: 0: 42847.6. Samples: 2541425760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:42:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 06:42:58,478][15401] Updated weights for policy 0, policy_version 155110 (0.0029) [2024-06-22 06:43:02,712][15401] Updated weights for policy 0, policy_version 155120 (0.0034) [2024-06-22 06:43:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42654.3). Total num frames: 2541518848. Throughput: 0: 42760.9. Samples: 2541682220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:43:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 06:43:05,985][15401] Updated weights for policy 0, policy_version 155130 (0.0033) [2024-06-22 06:43:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 2541748224. Throughput: 0: 42696.0. Samples: 2541805580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:43:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 06:43:10,282][15401] Updated weights for policy 0, policy_version 155140 (0.0034) [2024-06-22 06:43:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2541944832. Throughput: 0: 42497.3. Samples: 2542058840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:43:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 06:43:13,617][15401] Updated weights for policy 0, policy_version 155150 (0.0041) [2024-06-22 06:43:17,799][15401] Updated weights for policy 0, policy_version 155160 (0.0024) [2024-06-22 06:43:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2542157824. Throughput: 0: 42568.4. Samples: 2542319900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-22 06:43:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 06:43:21,381][15401] Updated weights for policy 0, policy_version 155170 (0.0045) [2024-06-22 06:43:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.3, 300 sec: 42653.9). Total num frames: 2542354432. Throughput: 0: 42576.0. Samples: 2542446140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 06:43:25,423][15401] Updated weights for policy 0, policy_version 155180 (0.0038) [2024-06-22 06:43:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2542583808. Throughput: 0: 42468.6. Samples: 2542696120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:28,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 06:43:29,671][15401] Updated weights for policy 0, policy_version 155190 (0.0026) [2024-06-22 06:43:33,002][15401] Updated weights for policy 0, policy_version 155200 (0.0033) [2024-06-22 06:43:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 2542796800. Throughput: 0: 42539.5. Samples: 2542956020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:33,393][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 06:43:37,458][15401] Updated weights for policy 0, policy_version 155210 (0.0039) [2024-06-22 06:43:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 2542993408. Throughput: 0: 42579.5. Samples: 2543086680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 06:43:40,534][15401] Updated weights for policy 0, policy_version 155220 (0.0035) [2024-06-22 06:43:41,841][15349] Signal inference workers to stop experience collection... (37550 times) [2024-06-22 06:43:41,893][15349] Signal inference workers to resume experience collection... (37550 times) [2024-06-22 06:43:41,894][15401] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-06-22 06:43:41,916][15401] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-06-22 06:43:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2543239168. Throughput: 0: 42656.3. Samples: 2543345300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 06:43:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155228_2543255552.pth... [2024-06-22 06:43:43,595][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154603_2533015552.pth [2024-06-22 06:43:45,247][15401] Updated weights for policy 0, policy_version 155230 (0.0041) [2024-06-22 06:43:48,241][15401] Updated weights for policy 0, policy_version 155240 (0.0036) [2024-06-22 06:43:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2543452160. Throughput: 0: 42752.9. Samples: 2543606100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 06:43:52,667][15401] Updated weights for policy 0, policy_version 155250 (0.0023) [2024-06-22 06:43:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 2543632384. Throughput: 0: 42775.8. Samples: 2543730500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 06:43:56,121][15401] Updated weights for policy 0, policy_version 155260 (0.0028) [2024-06-22 06:43:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2543878144. Throughput: 0: 42964.0. Samples: 2543992220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:43:58,396][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 06:44:00,390][15401] Updated weights for policy 0, policy_version 155270 (0.0036) [2024-06-22 06:44:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2544074752. Throughput: 0: 42875.1. Samples: 2544249280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:44:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 06:44:03,907][15401] Updated weights for policy 0, policy_version 155280 (0.0030) [2024-06-22 06:44:07,747][15401] Updated weights for policy 0, policy_version 155290 (0.0032) [2024-06-22 06:44:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2544271360. Throughput: 0: 42892.9. Samples: 2544376320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:44:08,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-22 06:44:11,361][15401] Updated weights for policy 0, policy_version 155300 (0.0030) [2024-06-22 06:44:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2544517120. Throughput: 0: 43059.1. Samples: 2544633780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:44:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 06:44:15,273][15401] Updated weights for policy 0, policy_version 155310 (0.0028) [2024-06-22 06:44:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2544730112. Throughput: 0: 43009.8. Samples: 2544891360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:44:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 06:44:18,899][15401] Updated weights for policy 0, policy_version 155320 (0.0041) [2024-06-22 06:44:22,907][15401] Updated weights for policy 0, policy_version 155330 (0.0045) [2024-06-22 06:44:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2544926720. Throughput: 0: 42958.6. Samples: 2545019820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 06:44:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 06:44:26,365][15401] Updated weights for policy 0, policy_version 155340 (0.0044) [2024-06-22 06:44:28,392][15132] Fps is (10 sec: 42590.2, 60 sec: 42870.1, 300 sec: 42653.7). Total num frames: 2545156096. Throughput: 0: 42842.6. Samples: 2545273300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:28,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 06:44:30,667][15401] Updated weights for policy 0, policy_version 155350 (0.0040) [2024-06-22 06:44:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 2545369088. Throughput: 0: 42879.5. Samples: 2545535680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 06:44:33,831][15401] Updated weights for policy 0, policy_version 155360 (0.0038) [2024-06-22 06:44:38,059][15401] Updated weights for policy 0, policy_version 155370 (0.0032) [2024-06-22 06:44:38,390][15132] Fps is (10 sec: 42606.4, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 2545582080. Throughput: 0: 42916.1. Samples: 2545661720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 06:44:41,423][15401] Updated weights for policy 0, policy_version 155380 (0.0032) [2024-06-22 06:44:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2545811456. Throughput: 0: 42736.2. Samples: 2545915360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:43,391][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 06:44:45,565][15401] Updated weights for policy 0, policy_version 155390 (0.0030) [2024-06-22 06:44:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2546008064. Throughput: 0: 42752.0. Samples: 2546173120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 06:44:49,478][15401] Updated weights for policy 0, policy_version 155400 (0.0020) [2024-06-22 06:44:53,236][15401] Updated weights for policy 0, policy_version 155410 (0.0034) [2024-06-22 06:44:53,396][15132] Fps is (10 sec: 42571.9, 60 sec: 43413.1, 300 sec: 42764.1). Total num frames: 2546237440. Throughput: 0: 42766.7. Samples: 2546301100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:53,397][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 06:44:57,295][15401] Updated weights for policy 0, policy_version 155420 (0.0023) [2024-06-22 06:44:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2546417664. Throughput: 0: 42730.8. Samples: 2546556660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:44:58,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 06:45:00,987][15401] Updated weights for policy 0, policy_version 155430 (0.0029) [2024-06-22 06:45:03,390][15132] Fps is (10 sec: 40986.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2546647040. Throughput: 0: 42725.0. Samples: 2546813980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 06:45:04,910][15401] Updated weights for policy 0, policy_version 155440 (0.0034) [2024-06-22 06:45:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2546860032. Throughput: 0: 42712.9. Samples: 2546941900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 06:45:08,802][15401] Updated weights for policy 0, policy_version 155450 (0.0032) [2024-06-22 06:45:11,264][15349] Signal inference workers to stop experience collection... (37600 times) [2024-06-22 06:45:11,268][15349] Signal inference workers to resume experience collection... (37600 times) [2024-06-22 06:45:11,323][15401] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-06-22 06:45:11,324][15401] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-06-22 06:45:12,434][15401] Updated weights for policy 0, policy_version 155460 (0.0032) [2024-06-22 06:45:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42821.4). Total num frames: 2547073024. Throughput: 0: 42686.8. Samples: 2547194120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:13,394][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 06:45:16,599][15401] Updated weights for policy 0, policy_version 155470 (0.0038) [2024-06-22 06:45:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2547302400. Throughput: 0: 42639.1. Samples: 2547454440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 06:45:20,025][15401] Updated weights for policy 0, policy_version 155480 (0.0036) [2024-06-22 06:45:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2547482624. Throughput: 0: 42666.7. Samples: 2547581720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 06:45:24,329][15401] Updated weights for policy 0, policy_version 155490 (0.0044) [2024-06-22 06:45:27,833][15401] Updated weights for policy 0, policy_version 155500 (0.0030) [2024-06-22 06:45:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42599.7, 300 sec: 42820.5). Total num frames: 2547712000. Throughput: 0: 42544.1. Samples: 2547829840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-22 06:45:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 06:45:31,902][15401] Updated weights for policy 0, policy_version 155510 (0.0029) [2024-06-22 06:45:33,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 2547924992. Throughput: 0: 42693.2. Samples: 2548094420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:33,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 06:45:35,228][15401] Updated weights for policy 0, policy_version 155520 (0.0029) [2024-06-22 06:45:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2548137984. Throughput: 0: 42799.8. Samples: 2548226820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 06:45:39,514][15401] Updated weights for policy 0, policy_version 155530 (0.0026) [2024-06-22 06:45:42,653][15401] Updated weights for policy 0, policy_version 155540 (0.0024) [2024-06-22 06:45:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 2548367360. Throughput: 0: 42676.4. Samples: 2548477100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 06:45:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155540_2548367360.pth... [2024-06-22 06:45:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000154914_2538110976.pth [2024-06-22 06:45:47,328][15401] Updated weights for policy 0, policy_version 155550 (0.0033) [2024-06-22 06:45:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2548580352. Throughput: 0: 42590.2. Samples: 2548730540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 06:45:50,762][15401] Updated weights for policy 0, policy_version 155560 (0.0038) [2024-06-22 06:45:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 2548776960. Throughput: 0: 42643.6. Samples: 2548860860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 06:45:54,893][15401] Updated weights for policy 0, policy_version 155570 (0.0045) [2024-06-22 06:45:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2549006336. Throughput: 0: 42832.0. Samples: 2549121560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:45:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 06:45:58,545][15401] Updated weights for policy 0, policy_version 155580 (0.0034) [2024-06-22 06:46:02,550][15401] Updated weights for policy 0, policy_version 155590 (0.0030) [2024-06-22 06:46:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2549219328. Throughput: 0: 42712.6. Samples: 2549376500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 06:46:06,143][15401] Updated weights for policy 0, policy_version 155600 (0.0032) [2024-06-22 06:46:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2549399552. Throughput: 0: 42729.1. Samples: 2549504520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 06:46:10,189][15401] Updated weights for policy 0, policy_version 155610 (0.0043) [2024-06-22 06:46:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2549661696. Throughput: 0: 42945.0. Samples: 2549762360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 06:46:13,731][15401] Updated weights for policy 0, policy_version 155620 (0.0041) [2024-06-22 06:46:18,195][15401] Updated weights for policy 0, policy_version 155630 (0.0036) [2024-06-22 06:46:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2549841920. Throughput: 0: 42725.4. Samples: 2550016960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 06:46:21,425][15401] Updated weights for policy 0, policy_version 155640 (0.0034) [2024-06-22 06:46:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2550054912. Throughput: 0: 42498.0. Samples: 2550139220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 06:46:25,969][15401] Updated weights for policy 0, policy_version 155650 (0.0030) [2024-06-22 06:46:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2550284288. Throughput: 0: 42613.3. Samples: 2550394700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 06:46:28,887][15401] Updated weights for policy 0, policy_version 155660 (0.0034) [2024-06-22 06:46:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 2550464512. Throughput: 0: 42742.7. Samples: 2550653960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 06:46:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 06:46:33,625][15401] Updated weights for policy 0, policy_version 155670 (0.0038) [2024-06-22 06:46:34,943][15349] Signal inference workers to stop experience collection... (37650 times) [2024-06-22 06:46:34,943][15349] Signal inference workers to resume experience collection... (37650 times) [2024-06-22 06:46:34,987][15401] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-06-22 06:46:34,987][15401] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-06-22 06:46:36,935][15401] Updated weights for policy 0, policy_version 155680 (0.0037) [2024-06-22 06:46:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2550693888. Throughput: 0: 42598.7. Samples: 2550777800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:46:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 06:46:41,155][15401] Updated weights for policy 0, policy_version 155690 (0.0037) [2024-06-22 06:46:43,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2550939648. Throughput: 0: 42690.1. Samples: 2551042620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:46:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 06:46:44,402][15401] Updated weights for policy 0, policy_version 155700 (0.0032) [2024-06-22 06:46:48,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2551136256. Throughput: 0: 42789.1. Samples: 2551302020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:46:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 06:46:48,657][15401] Updated weights for policy 0, policy_version 155710 (0.0030) [2024-06-22 06:46:52,053][15401] Updated weights for policy 0, policy_version 155720 (0.0039) [2024-06-22 06:46:53,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2551332864. Throughput: 0: 42673.4. Samples: 2551424820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:46:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 06:46:56,362][15401] Updated weights for policy 0, policy_version 155730 (0.0031) [2024-06-22 06:46:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2551578624. Throughput: 0: 42785.3. Samples: 2551687700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:46:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 06:46:59,691][15401] Updated weights for policy 0, policy_version 155740 (0.0025) [2024-06-22 06:47:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2551775232. Throughput: 0: 42868.9. Samples: 2551946060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 06:47:04,127][15401] Updated weights for policy 0, policy_version 155750 (0.0036) [2024-06-22 06:47:07,155][15401] Updated weights for policy 0, policy_version 155760 (0.0039) [2024-06-22 06:47:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2551988224. Throughput: 0: 42867.6. Samples: 2552068260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:08,390][15132] Avg episode reward: [(0, '0.100')] [2024-06-22 06:47:11,669][15401] Updated weights for policy 0, policy_version 155770 (0.0045) [2024-06-22 06:47:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2552217600. Throughput: 0: 43070.1. Samples: 2552332860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:13,390][15132] Avg episode reward: [(0, '0.145')] [2024-06-22 06:47:14,859][15401] Updated weights for policy 0, policy_version 155780 (0.0043) [2024-06-22 06:47:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 2552397824. Throughput: 0: 42942.2. Samples: 2552586360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 06:47:19,101][15401] Updated weights for policy 0, policy_version 155790 (0.0033) [2024-06-22 06:47:22,684][15401] Updated weights for policy 0, policy_version 155800 (0.0028) [2024-06-22 06:47:23,390][15132] Fps is (10 sec: 42595.9, 60 sec: 43144.0, 300 sec: 42764.9). Total num frames: 2552643584. Throughput: 0: 42992.6. Samples: 2552712500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:23,391][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 06:47:27,191][15401] Updated weights for policy 0, policy_version 155810 (0.0037) [2024-06-22 06:47:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2552856576. Throughput: 0: 43001.9. Samples: 2552977700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 06:47:30,319][15401] Updated weights for policy 0, policy_version 155820 (0.0028) [2024-06-22 06:47:33,389][15132] Fps is (10 sec: 40963.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2553053184. Throughput: 0: 42924.7. Samples: 2553233620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 06:47:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 06:47:34,723][15401] Updated weights for policy 0, policy_version 155830 (0.0042) [2024-06-22 06:47:38,266][15401] Updated weights for policy 0, policy_version 155840 (0.0037) [2024-06-22 06:47:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2553282560. Throughput: 0: 42807.8. Samples: 2553351180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:47:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 06:47:42,371][15401] Updated weights for policy 0, policy_version 155850 (0.0031) [2024-06-22 06:47:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2553495552. Throughput: 0: 42943.3. Samples: 2553620140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:47:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 06:47:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155854_2553511936.pth... [2024-06-22 06:47:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155228_2543255552.pth [2024-06-22 06:47:45,645][15401] Updated weights for policy 0, policy_version 155860 (0.0037) [2024-06-22 06:47:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2553692160. Throughput: 0: 42812.1. Samples: 2553872600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:47:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 06:47:50,174][15401] Updated weights for policy 0, policy_version 155870 (0.0034) [2024-06-22 06:47:53,131][15401] Updated weights for policy 0, policy_version 155880 (0.0023) [2024-06-22 06:47:53,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2553937920. Throughput: 0: 42868.3. Samples: 2553997340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:47:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 06:47:57,825][15401] Updated weights for policy 0, policy_version 155890 (0.0036) [2024-06-22 06:47:58,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2554150912. Throughput: 0: 42837.8. Samples: 2554260660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:47:58,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 06:48:00,809][15401] Updated weights for policy 0, policy_version 155900 (0.0038) [2024-06-22 06:48:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2554347520. Throughput: 0: 42857.3. Samples: 2554514940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 06:48:05,327][15401] Updated weights for policy 0, policy_version 155910 (0.0029) [2024-06-22 06:48:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2554560512. Throughput: 0: 42832.2. Samples: 2554639920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 06:48:08,841][15401] Updated weights for policy 0, policy_version 155920 (0.0043) [2024-06-22 06:48:13,158][15401] Updated weights for policy 0, policy_version 155930 (0.0027) [2024-06-22 06:48:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2554773504. Throughput: 0: 42735.6. Samples: 2554900800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 06:48:15,508][15349] Signal inference workers to stop experience collection... (37700 times) [2024-06-22 06:48:15,508][15349] Signal inference workers to resume experience collection... (37700 times) [2024-06-22 06:48:15,556][15401] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-06-22 06:48:15,556][15401] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-06-22 06:48:16,441][15401] Updated weights for policy 0, policy_version 155940 (0.0040) [2024-06-22 06:48:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2554986496. Throughput: 0: 42703.1. Samples: 2555155260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 06:48:20,751][15401] Updated weights for policy 0, policy_version 155950 (0.0030) [2024-06-22 06:48:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42872.0, 300 sec: 42820.6). Total num frames: 2555215872. Throughput: 0: 42875.3. Samples: 2555280560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 06:48:24,106][15401] Updated weights for policy 0, policy_version 155960 (0.0037) [2024-06-22 06:48:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 2555396096. Throughput: 0: 42692.8. Samples: 2555541320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 06:48:28,487][15401] Updated weights for policy 0, policy_version 155970 (0.0034) [2024-06-22 06:48:31,768][15401] Updated weights for policy 0, policy_version 155980 (0.0039) [2024-06-22 06:48:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2555625472. Throughput: 0: 42529.4. Samples: 2555786420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 06:48:35,949][15401] Updated weights for policy 0, policy_version 155990 (0.0035) [2024-06-22 06:48:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2555838464. Throughput: 0: 42664.2. Samples: 2555917220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 06:48:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 06:48:39,427][15401] Updated weights for policy 0, policy_version 156000 (0.0029) [2024-06-22 06:48:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2556035072. Throughput: 0: 42562.2. Samples: 2556175860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:48:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 06:48:43,986][15401] Updated weights for policy 0, policy_version 156010 (0.0033) [2024-06-22 06:48:47,185][15401] Updated weights for policy 0, policy_version 156020 (0.0044) [2024-06-22 06:48:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2556280832. Throughput: 0: 42400.4. Samples: 2556422960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:48:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 06:48:51,698][15401] Updated weights for policy 0, policy_version 156030 (0.0027) [2024-06-22 06:48:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2556477440. Throughput: 0: 42627.9. Samples: 2556558180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:48:53,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-22 06:48:54,704][15401] Updated weights for policy 0, policy_version 156040 (0.0044) [2024-06-22 06:48:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 2556674048. Throughput: 0: 42409.8. Samples: 2556809240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:48:58,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 06:48:59,423][15401] Updated weights for policy 0, policy_version 156050 (0.0033) [2024-06-22 06:49:02,385][15401] Updated weights for policy 0, policy_version 156060 (0.0034) [2024-06-22 06:49:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2556903424. Throughput: 0: 42341.3. Samples: 2557060620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:03,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 06:49:07,050][15401] Updated weights for policy 0, policy_version 156070 (0.0044) [2024-06-22 06:49:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2557116416. Throughput: 0: 42599.0. Samples: 2557197520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 06:49:10,351][15401] Updated weights for policy 0, policy_version 156080 (0.0028) [2024-06-22 06:49:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2557313024. Throughput: 0: 42511.6. Samples: 2557454340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 06:49:14,516][15401] Updated weights for policy 0, policy_version 156090 (0.0041) [2024-06-22 06:49:17,993][15401] Updated weights for policy 0, policy_version 156100 (0.0039) [2024-06-22 06:49:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2557542400. Throughput: 0: 42619.0. Samples: 2557704280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 06:49:22,151][15401] Updated weights for policy 0, policy_version 156110 (0.0041) [2024-06-22 06:49:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2557755392. Throughput: 0: 42588.4. Samples: 2557833700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 06:49:25,639][15401] Updated weights for policy 0, policy_version 156120 (0.0043) [2024-06-22 06:49:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2557935616. Throughput: 0: 42584.6. Samples: 2558092160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:28,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 06:49:29,687][15401] Updated weights for policy 0, policy_version 156130 (0.0026) [2024-06-22 06:49:33,274][15401] Updated weights for policy 0, policy_version 156140 (0.0028) [2024-06-22 06:49:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2558197760. Throughput: 0: 42582.2. Samples: 2558339260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:33,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 06:49:37,619][15401] Updated weights for policy 0, policy_version 156150 (0.0038) [2024-06-22 06:49:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2558394368. Throughput: 0: 42541.0. Samples: 2558472520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 06:49:40,933][15401] Updated weights for policy 0, policy_version 156160 (0.0029) [2024-06-22 06:49:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2558590976. Throughput: 0: 42683.4. Samples: 2558730000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-22 06:49:43,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-22 06:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156164_2558590976.pth... [2024-06-22 06:49:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155540_2548367360.pth [2024-06-22 06:49:45,066][15401] Updated weights for policy 0, policy_version 156170 (0.0029) [2024-06-22 06:49:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 2558836736. Throughput: 0: 42546.8. Samples: 2558975220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:49:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 06:49:48,635][15401] Updated weights for policy 0, policy_version 156180 (0.0033) [2024-06-22 06:49:52,842][15401] Updated weights for policy 0, policy_version 156190 (0.0029) [2024-06-22 06:49:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2559033344. Throughput: 0: 42528.9. Samples: 2559111320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:49:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 06:49:56,229][15401] Updated weights for policy 0, policy_version 156200 (0.0042) [2024-06-22 06:49:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2559229952. Throughput: 0: 42568.8. Samples: 2559369940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:49:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 06:50:00,546][15401] Updated weights for policy 0, policy_version 156210 (0.0033) [2024-06-22 06:50:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2559475712. Throughput: 0: 42523.1. Samples: 2559617920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:03,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 06:50:04,323][15401] Updated weights for policy 0, policy_version 156220 (0.0028) [2024-06-22 06:50:08,182][15401] Updated weights for policy 0, policy_version 156230 (0.0039) [2024-06-22 06:50:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2559672320. Throughput: 0: 42643.4. Samples: 2559752660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 06:50:08,959][15349] Signal inference workers to stop experience collection... (37750 times) [2024-06-22 06:50:08,992][15401] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-06-22 06:50:09,009][15349] Signal inference workers to resume experience collection... (37750 times) [2024-06-22 06:50:09,021][15401] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-06-22 06:50:11,856][15401] Updated weights for policy 0, policy_version 156240 (0.0041) [2024-06-22 06:50:13,391][15132] Fps is (10 sec: 40962.2, 60 sec: 42870.1, 300 sec: 42653.7). Total num frames: 2559885312. Throughput: 0: 42478.1. Samples: 2560003760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:13,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 06:50:15,665][15401] Updated weights for policy 0, policy_version 156250 (0.0038) [2024-06-22 06:50:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2560098304. Throughput: 0: 42645.9. Samples: 2560258220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 06:50:19,550][15401] Updated weights for policy 0, policy_version 156260 (0.0037) [2024-06-22 06:50:23,206][15401] Updated weights for policy 0, policy_version 156270 (0.0028) [2024-06-22 06:50:23,390][15132] Fps is (10 sec: 44244.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2560327680. Throughput: 0: 42649.2. Samples: 2560391740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 06:50:27,057][15401] Updated weights for policy 0, policy_version 156280 (0.0041) [2024-06-22 06:50:28,391][15132] Fps is (10 sec: 42592.0, 60 sec: 43143.4, 300 sec: 42709.6). Total num frames: 2560524288. Throughput: 0: 42382.7. Samples: 2560637280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:28,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 06:50:30,794][15401] Updated weights for policy 0, policy_version 156290 (0.0037) [2024-06-22 06:50:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 2560737280. Throughput: 0: 42745.7. Samples: 2560898780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:33,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 06:50:34,940][15401] Updated weights for policy 0, policy_version 156300 (0.0027) [2024-06-22 06:50:38,389][15132] Fps is (10 sec: 44243.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2560966656. Throughput: 0: 42683.2. Samples: 2561032060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 06:50:38,422][15401] Updated weights for policy 0, policy_version 156310 (0.0028) [2024-06-22 06:50:42,537][15401] Updated weights for policy 0, policy_version 156320 (0.0032) [2024-06-22 06:50:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2561163264. Throughput: 0: 42625.3. Samples: 2561288080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 06:50:46,376][15401] Updated weights for policy 0, policy_version 156330 (0.0035) [2024-06-22 06:50:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2561359872. Throughput: 0: 42868.0. Samples: 2561546880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 06:50:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 06:50:50,555][15401] Updated weights for policy 0, policy_version 156340 (0.0023) [2024-06-22 06:50:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2561589248. Throughput: 0: 42664.5. Samples: 2561672560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:50:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 06:50:53,967][15401] Updated weights for policy 0, policy_version 156350 (0.0035) [2024-06-22 06:50:58,140][15401] Updated weights for policy 0, policy_version 156360 (0.0034) [2024-06-22 06:50:58,395][15132] Fps is (10 sec: 44211.8, 60 sec: 42867.5, 300 sec: 42653.1). Total num frames: 2561802240. Throughput: 0: 42655.5. Samples: 2561923420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:50:58,400][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 06:51:01,588][15401] Updated weights for policy 0, policy_version 156370 (0.0037) [2024-06-22 06:51:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 2561998848. Throughput: 0: 42962.0. Samples: 2562191520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 06:51:05,672][15401] Updated weights for policy 0, policy_version 156380 (0.0028) [2024-06-22 06:51:08,389][15132] Fps is (10 sec: 42623.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 2562228224. Throughput: 0: 42654.0. Samples: 2562311160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 06:51:09,319][15401] Updated weights for policy 0, policy_version 156390 (0.0038) [2024-06-22 06:51:13,149][15401] Updated weights for policy 0, policy_version 156400 (0.0030) [2024-06-22 06:51:13,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42872.9, 300 sec: 42765.0). Total num frames: 2562457600. Throughput: 0: 42975.7. Samples: 2562571120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 06:51:17,079][15401] Updated weights for policy 0, policy_version 156410 (0.0028) [2024-06-22 06:51:18,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2562654208. Throughput: 0: 42829.3. Samples: 2562826100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:18,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 06:51:20,798][15401] Updated weights for policy 0, policy_version 156420 (0.0033) [2024-06-22 06:51:21,876][15349] Signal inference workers to stop experience collection... (37800 times) [2024-06-22 06:51:21,916][15401] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-06-22 06:51:21,936][15349] Signal inference workers to resume experience collection... (37800 times) [2024-06-22 06:51:21,936][15401] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-06-22 06:51:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2562867200. Throughput: 0: 42633.7. Samples: 2562950580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:23,396][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 06:51:24,652][15401] Updated weights for policy 0, policy_version 156430 (0.0029) [2024-06-22 06:51:28,393][15132] Fps is (10 sec: 42581.9, 60 sec: 42596.7, 300 sec: 42764.5). Total num frames: 2563080192. Throughput: 0: 42748.9. Samples: 2563211940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:28,394][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 06:51:28,707][15401] Updated weights for policy 0, policy_version 156440 (0.0038) [2024-06-22 06:51:32,147][15401] Updated weights for policy 0, policy_version 156450 (0.0034) [2024-06-22 06:51:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2563293184. Throughput: 0: 42787.9. Samples: 2563472340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:33,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 06:51:36,256][15401] Updated weights for policy 0, policy_version 156460 (0.0050) [2024-06-22 06:51:38,390][15132] Fps is (10 sec: 44253.7, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 2563522560. Throughput: 0: 42759.6. Samples: 2563596740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:38,390][15132] Avg episode reward: [(0, '0.084')] [2024-06-22 06:51:39,806][15401] Updated weights for policy 0, policy_version 156470 (0.0037) [2024-06-22 06:51:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2563719168. Throughput: 0: 42935.6. Samples: 2563855280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 06:51:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156478_2563735552.pth... [2024-06-22 06:51:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000155854_2553511936.pth [2024-06-22 06:51:43,944][15401] Updated weights for policy 0, policy_version 156480 (0.0035) [2024-06-22 06:51:47,363][15401] Updated weights for policy 0, policy_version 156490 (0.0034) [2024-06-22 06:51:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2563932160. Throughput: 0: 42632.2. Samples: 2564109960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 06:51:48,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 06:51:51,719][15401] Updated weights for policy 0, policy_version 156500 (0.0023) [2024-06-22 06:51:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2564161536. Throughput: 0: 42850.0. Samples: 2564239420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:51:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 06:51:54,885][15401] Updated weights for policy 0, policy_version 156510 (0.0040) [2024-06-22 06:51:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42602.4, 300 sec: 42653.9). Total num frames: 2564358144. Throughput: 0: 42931.4. Samples: 2564503040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:51:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 06:51:59,554][15401] Updated weights for policy 0, policy_version 156520 (0.0033) [2024-06-22 06:52:02,467][15401] Updated weights for policy 0, policy_version 156530 (0.0036) [2024-06-22 06:52:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 2564603904. Throughput: 0: 42927.2. Samples: 2564757820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 06:52:07,254][15401] Updated weights for policy 0, policy_version 156540 (0.0024) [2024-06-22 06:52:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2564816896. Throughput: 0: 43189.7. Samples: 2564894120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 06:52:10,384][15401] Updated weights for policy 0, policy_version 156550 (0.0037) [2024-06-22 06:52:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2565013504. Throughput: 0: 43064.5. Samples: 2565149680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 06:52:14,693][15401] Updated weights for policy 0, policy_version 156560 (0.0044) [2024-06-22 06:52:17,912][15401] Updated weights for policy 0, policy_version 156570 (0.0031) [2024-06-22 06:52:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42765.1). Total num frames: 2565259264. Throughput: 0: 42952.1. Samples: 2565405180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 06:52:22,206][15401] Updated weights for policy 0, policy_version 156580 (0.0031) [2024-06-22 06:52:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2565455872. Throughput: 0: 43057.4. Samples: 2565534320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:23,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 06:52:25,384][15401] Updated weights for policy 0, policy_version 156590 (0.0038) [2024-06-22 06:52:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42874.2, 300 sec: 42709.5). Total num frames: 2565652480. Throughput: 0: 43032.8. Samples: 2565791760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:28,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-22 06:52:29,752][15401] Updated weights for policy 0, policy_version 156600 (0.0033) [2024-06-22 06:52:33,120][15401] Updated weights for policy 0, policy_version 156610 (0.0045) [2024-06-22 06:52:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.8, 300 sec: 42820.6). Total num frames: 2565914624. Throughput: 0: 43095.6. Samples: 2566049260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 06:52:37,388][15401] Updated weights for policy 0, policy_version 156620 (0.0034) [2024-06-22 06:52:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2566094848. Throughput: 0: 43219.6. Samples: 2566184300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 06:52:38,616][15349] Signal inference workers to stop experience collection... (37850 times) [2024-06-22 06:52:38,643][15401] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-06-22 06:52:38,730][15349] Signal inference workers to resume experience collection... (37850 times) [2024-06-22 06:52:38,730][15401] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-06-22 06:52:40,748][15401] Updated weights for policy 0, policy_version 156630 (0.0038) [2024-06-22 06:52:43,392][15132] Fps is (10 sec: 37673.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 2566291456. Throughput: 0: 42891.5. Samples: 2566433260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:43,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 06:52:45,096][15401] Updated weights for policy 0, policy_version 156640 (0.0039) [2024-06-22 06:52:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2566537216. Throughput: 0: 43009.7. Samples: 2566693260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 06:52:48,439][15401] Updated weights for policy 0, policy_version 156650 (0.0043) [2024-06-22 06:52:52,921][15401] Updated weights for policy 0, policy_version 156660 (0.0027) [2024-06-22 06:52:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 2566717440. Throughput: 0: 42934.8. Samples: 2566826180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 06:52:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 06:52:56,022][15401] Updated weights for policy 0, policy_version 156670 (0.0036) [2024-06-22 06:52:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 2566946816. Throughput: 0: 42751.5. Samples: 2567073600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:52:58,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 06:53:00,846][15401] Updated weights for policy 0, policy_version 156680 (0.0038) [2024-06-22 06:53:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2567192576. Throughput: 0: 42835.6. Samples: 2567332780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 06:53:03,675][15401] Updated weights for policy 0, policy_version 156690 (0.0036) [2024-06-22 06:53:08,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2567356416. Throughput: 0: 42895.4. Samples: 2567464620. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:08,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 06:53:08,645][15401] Updated weights for policy 0, policy_version 156700 (0.0042) [2024-06-22 06:53:11,348][15401] Updated weights for policy 0, policy_version 156710 (0.0042) [2024-06-22 06:53:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2567585792. Throughput: 0: 42695.1. Samples: 2567713040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 06:53:16,125][15401] Updated weights for policy 0, policy_version 156720 (0.0030) [2024-06-22 06:53:18,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2567831552. Throughput: 0: 42567.9. Samples: 2567964820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 06:53:19,199][15401] Updated weights for policy 0, policy_version 156730 (0.0035) [2024-06-22 06:53:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2567995392. Throughput: 0: 42557.9. Samples: 2568099400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 06:53:23,756][15401] Updated weights for policy 0, policy_version 156740 (0.0030) [2024-06-22 06:53:26,935][15401] Updated weights for policy 0, policy_version 156750 (0.0036) [2024-06-22 06:53:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2568241152. Throughput: 0: 42549.4. Samples: 2568347880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 06:53:31,293][15401] Updated weights for policy 0, policy_version 156760 (0.0037) [2024-06-22 06:53:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2568454144. Throughput: 0: 42676.4. Samples: 2568613700. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:53:34,339][15401] Updated weights for policy 0, policy_version 156770 (0.0025) [2024-06-22 06:53:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2568650752. Throughput: 0: 42662.5. Samples: 2568746000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 06:53:38,952][15401] Updated weights for policy 0, policy_version 156780 (0.0027) [2024-06-22 06:53:41,875][15401] Updated weights for policy 0, policy_version 156790 (0.0030) [2024-06-22 06:53:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 2568896512. Throughput: 0: 42789.0. Samples: 2568999000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 06:53:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156793_2568896512.pth... [2024-06-22 06:53:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156164_2558590976.pth [2024-06-22 06:53:46,666][15349] Signal inference workers to stop experience collection... (37900 times) [2024-06-22 06:53:46,703][15401] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-06-22 06:53:46,738][15349] Signal inference workers to resume experience collection... (37900 times) [2024-06-22 06:53:46,739][15401] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-06-22 06:53:46,741][15401] Updated weights for policy 0, policy_version 156800 (0.0031) [2024-06-22 06:53:48,396][15132] Fps is (10 sec: 45846.1, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 2569109504. Throughput: 0: 42915.6. Samples: 2569264260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:48,396][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 06:53:49,402][15401] Updated weights for policy 0, policy_version 156810 (0.0028) [2024-06-22 06:53:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2569306112. Throughput: 0: 42838.7. Samples: 2569392360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 06:53:54,256][15401] Updated weights for policy 0, policy_version 156820 (0.0032) [2024-06-22 06:53:57,129][15401] Updated weights for policy 0, policy_version 156830 (0.0048) [2024-06-22 06:53:58,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2569535488. Throughput: 0: 43010.8. Samples: 2569648520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 06:53:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 06:54:01,533][15401] Updated weights for policy 0, policy_version 156840 (0.0044) [2024-06-22 06:54:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 2569732096. Throughput: 0: 43169.7. Samples: 2569907560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:03,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:54:05,281][15401] Updated weights for policy 0, policy_version 156850 (0.0035) [2024-06-22 06:54:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2569961472. Throughput: 0: 42939.0. Samples: 2570031660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 06:54:09,023][15401] Updated weights for policy 0, policy_version 156860 (0.0024) [2024-06-22 06:54:13,036][15401] Updated weights for policy 0, policy_version 156870 (0.0042) [2024-06-22 06:54:13,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 2570190848. Throughput: 0: 43102.7. Samples: 2570287500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 06:54:16,906][15401] Updated weights for policy 0, policy_version 156880 (0.0039) [2024-06-22 06:54:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2570371072. Throughput: 0: 43026.2. Samples: 2570549880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 06:54:20,619][15401] Updated weights for policy 0, policy_version 156890 (0.0030) [2024-06-22 06:54:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2570600448. Throughput: 0: 42742.2. Samples: 2570669400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 06:54:24,663][15401] Updated weights for policy 0, policy_version 156900 (0.0036) [2024-06-22 06:54:28,084][15401] Updated weights for policy 0, policy_version 156910 (0.0037) [2024-06-22 06:54:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 2570829824. Throughput: 0: 43011.1. Samples: 2570934500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 06:54:32,581][15401] Updated weights for policy 0, policy_version 156920 (0.0044) [2024-06-22 06:54:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2571010048. Throughput: 0: 43030.1. Samples: 2571200340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 06:54:35,544][15401] Updated weights for policy 0, policy_version 156930 (0.0031) [2024-06-22 06:54:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 2571255808. Throughput: 0: 42834.3. Samples: 2571319900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 06:54:40,106][15401] Updated weights for policy 0, policy_version 156940 (0.0030) [2024-06-22 06:54:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2571436032. Throughput: 0: 42737.3. Samples: 2571571700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 06:54:43,643][15401] Updated weights for policy 0, policy_version 156950 (0.0031) [2024-06-22 06:54:47,595][15401] Updated weights for policy 0, policy_version 156960 (0.0033) [2024-06-22 06:54:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 2571665408. Throughput: 0: 42793.4. Samples: 2571833160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:48,391][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 06:54:51,329][15401] Updated weights for policy 0, policy_version 156970 (0.0032) [2024-06-22 06:54:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2571862016. Throughput: 0: 42809.9. Samples: 2571958100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 06:54:55,485][15401] Updated weights for policy 0, policy_version 156980 (0.0025) [2024-06-22 06:54:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2572091392. Throughput: 0: 42770.2. Samples: 2572212160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:54:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 06:54:58,963][15401] Updated weights for policy 0, policy_version 156990 (0.0034) [2024-06-22 06:55:02,917][15349] Signal inference workers to stop experience collection... (37950 times) [2024-06-22 06:55:02,964][15401] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-06-22 06:55:02,975][15349] Signal inference workers to resume experience collection... (37950 times) [2024-06-22 06:55:02,983][15401] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-06-22 06:55:03,159][15401] Updated weights for policy 0, policy_version 157000 (0.0026) [2024-06-22 06:55:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2572304384. Throughput: 0: 42755.6. Samples: 2572473880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 06:55:03,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-22 06:55:06,612][15401] Updated weights for policy 0, policy_version 157010 (0.0037) [2024-06-22 06:55:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 2572533760. Throughput: 0: 42923.3. Samples: 2572600940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 06:55:10,719][15401] Updated weights for policy 0, policy_version 157020 (0.0053) [2024-06-22 06:55:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2572713984. Throughput: 0: 42679.5. Samples: 2572855080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 06:55:14,108][15401] Updated weights for policy 0, policy_version 157030 (0.0045) [2024-06-22 06:55:18,255][15401] Updated weights for policy 0, policy_version 157040 (0.0032) [2024-06-22 06:55:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2572943360. Throughput: 0: 42716.5. Samples: 2573122580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 06:55:22,006][15401] Updated weights for policy 0, policy_version 157050 (0.0027) [2024-06-22 06:55:23,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 2573189120. Throughput: 0: 42875.4. Samples: 2573249300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 06:55:25,783][15401] Updated weights for policy 0, policy_version 157060 (0.0038) [2024-06-22 06:55:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2573369344. Throughput: 0: 42676.9. Samples: 2573492160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:28,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 06:55:29,467][15401] Updated weights for policy 0, policy_version 157070 (0.0036) [2024-06-22 06:55:33,260][15401] Updated weights for policy 0, policy_version 157080 (0.0042) [2024-06-22 06:55:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2573598720. Throughput: 0: 42842.3. Samples: 2573761060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 06:55:37,107][15401] Updated weights for policy 0, policy_version 157090 (0.0036) [2024-06-22 06:55:38,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 2573828096. Throughput: 0: 43030.1. Samples: 2573894560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:38,392][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 06:55:41,019][15401] Updated weights for policy 0, policy_version 157100 (0.0023) [2024-06-22 06:55:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2574024704. Throughput: 0: 43062.6. Samples: 2574149980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 06:55:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157106_2574024704.pth... [2024-06-22 06:55:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156478_2563735552.pth [2024-06-22 06:55:44,879][15401] Updated weights for policy 0, policy_version 157110 (0.0039) [2024-06-22 06:55:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2574237696. Throughput: 0: 42936.9. Samples: 2574406040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 06:55:48,645][15401] Updated weights for policy 0, policy_version 157120 (0.0039) [2024-06-22 06:55:52,457][15401] Updated weights for policy 0, policy_version 157130 (0.0028) [2024-06-22 06:55:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42932.4). Total num frames: 2574467072. Throughput: 0: 42985.7. Samples: 2574535300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:53,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 06:55:56,566][15401] Updated weights for policy 0, policy_version 157140 (0.0031) [2024-06-22 06:55:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 2574680064. Throughput: 0: 43021.7. Samples: 2574791160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:55:58,392][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 06:55:59,683][15349] Signal inference workers to stop experience collection... (38000 times) [2024-06-22 06:55:59,688][15349] Signal inference workers to resume experience collection... (38000 times) [2024-06-22 06:55:59,718][15401] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-06-22 06:55:59,718][15401] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-06-22 06:56:00,054][15401] Updated weights for policy 0, policy_version 157150 (0.0023) [2024-06-22 06:56:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2574876672. Throughput: 0: 42900.3. Samples: 2575053100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:56:03,393][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 06:56:04,123][15401] Updated weights for policy 0, policy_version 157160 (0.0035) [2024-06-22 06:56:07,408][15401] Updated weights for policy 0, policy_version 157170 (0.0039) [2024-06-22 06:56:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2575122432. Throughput: 0: 42996.6. Samples: 2575184140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 06:56:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 06:56:11,387][15401] Updated weights for policy 0, policy_version 157180 (0.0034) [2024-06-22 06:56:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42986.8). Total num frames: 2575335424. Throughput: 0: 43391.0. Samples: 2575444860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:13,393][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 06:56:15,343][15401] Updated weights for policy 0, policy_version 157190 (0.0033) [2024-06-22 06:56:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2575532032. Throughput: 0: 43196.4. Samples: 2575704900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 06:56:18,871][15401] Updated weights for policy 0, policy_version 157200 (0.0037) [2024-06-22 06:56:22,961][15401] Updated weights for policy 0, policy_version 157210 (0.0026) [2024-06-22 06:56:23,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.6, 300 sec: 42987.8). Total num frames: 2575761408. Throughput: 0: 43033.5. Samples: 2575830960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 06:56:26,566][15401] Updated weights for policy 0, policy_version 157220 (0.0048) [2024-06-22 06:56:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 2575990784. Throughput: 0: 43066.3. Samples: 2576087960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 06:56:30,442][15401] Updated weights for policy 0, policy_version 157230 (0.0036) [2024-06-22 06:56:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2576171008. Throughput: 0: 43356.5. Samples: 2576357080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:33,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 06:56:34,053][15401] Updated weights for policy 0, policy_version 157240 (0.0039) [2024-06-22 06:56:37,929][15401] Updated weights for policy 0, policy_version 157250 (0.0032) [2024-06-22 06:56:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 2576400384. Throughput: 0: 43213.5. Samples: 2576479900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 06:56:41,643][15401] Updated weights for policy 0, policy_version 157260 (0.0024) [2024-06-22 06:56:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 2576646144. Throughput: 0: 43347.7. Samples: 2576741700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:43,390][15132] Avg episode reward: [(0, '0.120')] [2024-06-22 06:56:45,568][15401] Updated weights for policy 0, policy_version 157270 (0.0028) [2024-06-22 06:56:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2576809984. Throughput: 0: 43190.6. Samples: 2576996680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 06:56:49,314][15401] Updated weights for policy 0, policy_version 157280 (0.0026) [2024-06-22 06:56:53,227][15401] Updated weights for policy 0, policy_version 157290 (0.0023) [2024-06-22 06:56:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 2577039360. Throughput: 0: 42932.5. Samples: 2577116100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 06:56:54,453][15349] Signal inference workers to stop experience collection... (38050 times) [2024-06-22 06:56:54,454][15349] Signal inference workers to resume experience collection... (38050 times) [2024-06-22 06:56:54,488][15401] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-06-22 06:56:54,488][15401] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-06-22 06:56:56,918][15401] Updated weights for policy 0, policy_version 157300 (0.0037) [2024-06-22 06:56:58,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43419.2, 300 sec: 42987.1). Total num frames: 2577285120. Throughput: 0: 43055.1. Samples: 2577382240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:56:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 06:57:00,713][15401] Updated weights for policy 0, policy_version 157310 (0.0025) [2024-06-22 06:57:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2577465344. Throughput: 0: 43125.8. Samples: 2577645560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:57:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 06:57:04,659][15401] Updated weights for policy 0, policy_version 157320 (0.0031) [2024-06-22 06:57:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2577678336. Throughput: 0: 42949.2. Samples: 2577763680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:57:08,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-22 06:57:08,497][15401] Updated weights for policy 0, policy_version 157330 (0.0039) [2024-06-22 06:57:12,291][15401] Updated weights for policy 0, policy_version 157340 (0.0029) [2024-06-22 06:57:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2577907712. Throughput: 0: 43190.6. Samples: 2578031540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 06:57:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 06:57:15,926][15401] Updated weights for policy 0, policy_version 157350 (0.0031) [2024-06-22 06:57:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2578120704. Throughput: 0: 42802.5. Samples: 2578283200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 06:57:19,906][15401] Updated weights for policy 0, policy_version 157360 (0.0029) [2024-06-22 06:57:23,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42866.8, 300 sec: 42986.3). Total num frames: 2578333696. Throughput: 0: 42900.1. Samples: 2578410680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:23,396][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 06:57:23,597][15401] Updated weights for policy 0, policy_version 157370 (0.0033) [2024-06-22 06:57:27,713][15401] Updated weights for policy 0, policy_version 157380 (0.0039) [2024-06-22 06:57:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2578530304. Throughput: 0: 42913.8. Samples: 2578672820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:28,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 06:57:31,072][15401] Updated weights for policy 0, policy_version 157390 (0.0043) [2024-06-22 06:57:33,389][15132] Fps is (10 sec: 44265.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2578776064. Throughput: 0: 42768.2. Samples: 2578921240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 06:57:35,246][15401] Updated weights for policy 0, policy_version 157400 (0.0049) [2024-06-22 06:57:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 2578972672. Throughput: 0: 42971.5. Samples: 2579049820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:38,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 06:57:38,768][15401] Updated weights for policy 0, policy_version 157410 (0.0037) [2024-06-22 06:57:43,142][15401] Updated weights for policy 0, policy_version 157420 (0.0031) [2024-06-22 06:57:43,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 2579169280. Throughput: 0: 42802.6. Samples: 2579308360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 06:57:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157420_2579169280.pth... [2024-06-22 06:57:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000156793_2568896512.pth [2024-06-22 06:57:46,305][15401] Updated weights for policy 0, policy_version 157430 (0.0036) [2024-06-22 06:57:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2579415040. Throughput: 0: 42534.6. Samples: 2579559620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 06:57:50,721][15401] Updated weights for policy 0, policy_version 157440 (0.0032) [2024-06-22 06:57:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 2579628032. Throughput: 0: 42860.1. Samples: 2579692380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 06:57:54,775][15401] Updated weights for policy 0, policy_version 157450 (0.0026) [2024-06-22 06:57:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2579808256. Throughput: 0: 42515.0. Samples: 2579944720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:57:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 06:57:58,536][15401] Updated weights for policy 0, policy_version 157460 (0.0033) [2024-06-22 06:58:02,336][15401] Updated weights for policy 0, policy_version 157470 (0.0028) [2024-06-22 06:58:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2580037632. Throughput: 0: 42640.6. Samples: 2580202020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:58:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 06:58:06,102][15401] Updated weights for policy 0, policy_version 157480 (0.0037) [2024-06-22 06:58:08,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2580267008. Throughput: 0: 42713.7. Samples: 2580332520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:58:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 06:58:09,857][15401] Updated weights for policy 0, policy_version 157490 (0.0036) [2024-06-22 06:58:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2580447232. Throughput: 0: 42637.3. Samples: 2580591500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 06:58:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 06:58:13,593][15401] Updated weights for policy 0, policy_version 157500 (0.0034) [2024-06-22 06:58:17,345][15401] Updated weights for policy 0, policy_version 157510 (0.0033) [2024-06-22 06:58:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 2580676608. Throughput: 0: 42916.0. Samples: 2580852460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:18,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 06:58:21,315][15401] Updated weights for policy 0, policy_version 157520 (0.0036) [2024-06-22 06:58:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42876.1, 300 sec: 42931.7). Total num frames: 2580905984. Throughput: 0: 42917.0. Samples: 2580981080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 06:58:24,635][15349] Signal inference workers to stop experience collection... (38100 times) [2024-06-22 06:58:24,635][15349] Signal inference workers to resume experience collection... (38100 times) [2024-06-22 06:58:24,681][15401] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-06-22 06:58:24,681][15401] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-06-22 06:58:24,775][15401] Updated weights for policy 0, policy_version 157530 (0.0046) [2024-06-22 06:58:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2581118976. Throughput: 0: 42692.1. Samples: 2581229500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 06:58:28,777][15401] Updated weights for policy 0, policy_version 157540 (0.0029) [2024-06-22 06:58:32,455][15401] Updated weights for policy 0, policy_version 157550 (0.0034) [2024-06-22 06:58:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2581331968. Throughput: 0: 42899.2. Samples: 2581490080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:33,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 06:58:36,255][15401] Updated weights for policy 0, policy_version 157560 (0.0031) [2024-06-22 06:58:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2581528576. Throughput: 0: 42830.7. Samples: 2581619760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 06:58:39,992][15401] Updated weights for policy 0, policy_version 157570 (0.0037) [2024-06-22 06:58:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 2581757952. Throughput: 0: 42989.0. Samples: 2581879220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 06:58:43,978][15401] Updated weights for policy 0, policy_version 157580 (0.0031) [2024-06-22 06:58:47,547][15401] Updated weights for policy 0, policy_version 157590 (0.0041) [2024-06-22 06:58:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2581970944. Throughput: 0: 42916.4. Samples: 2582133260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 06:58:51,425][15401] Updated weights for policy 0, policy_version 157600 (0.0043) [2024-06-22 06:58:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2582167552. Throughput: 0: 42978.9. Samples: 2582266580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 06:58:55,202][15401] Updated weights for policy 0, policy_version 157610 (0.0023) [2024-06-22 06:58:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43416.0, 300 sec: 42987.2). Total num frames: 2582413312. Throughput: 0: 42926.6. Samples: 2582523300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:58:58,392][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 06:58:59,193][15401] Updated weights for policy 0, policy_version 157620 (0.0038) [2024-06-22 06:59:02,744][15401] Updated weights for policy 0, policy_version 157630 (0.0039) [2024-06-22 06:59:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2582626304. Throughput: 0: 42771.0. Samples: 2582777160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:59:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 06:59:06,702][15401] Updated weights for policy 0, policy_version 157640 (0.0036) [2024-06-22 06:59:08,391][15132] Fps is (10 sec: 40964.6, 60 sec: 42597.5, 300 sec: 42820.4). Total num frames: 2582822912. Throughput: 0: 42884.0. Samples: 2582910920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:59:08,391][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 06:59:10,190][15401] Updated weights for policy 0, policy_version 157650 (0.0052) [2024-06-22 06:59:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2583035904. Throughput: 0: 43066.3. Samples: 2583167480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:59:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 06:59:14,283][15401] Updated weights for policy 0, policy_version 157660 (0.0026) [2024-06-22 06:59:17,998][15401] Updated weights for policy 0, policy_version 157670 (0.0025) [2024-06-22 06:59:18,396][15132] Fps is (10 sec: 45851.8, 60 sec: 43412.9, 300 sec: 42986.3). Total num frames: 2583281664. Throughput: 0: 42803.2. Samples: 2583416500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 06:59:18,396][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 06:59:21,857][15401] Updated weights for policy 0, policy_version 157680 (0.0033) [2024-06-22 06:59:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2583461888. Throughput: 0: 42961.4. Samples: 2583553020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:23,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 06:59:25,770][15401] Updated weights for policy 0, policy_version 157690 (0.0037) [2024-06-22 06:59:28,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2583691264. Throughput: 0: 42976.0. Samples: 2583813140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 06:59:29,371][15401] Updated weights for policy 0, policy_version 157700 (0.0037) [2024-06-22 06:59:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2583904256. Throughput: 0: 43021.4. Samples: 2584069220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 06:59:33,443][15401] Updated weights for policy 0, policy_version 157710 (0.0029) [2024-06-22 06:59:36,994][15401] Updated weights for policy 0, policy_version 157720 (0.0028) [2024-06-22 06:59:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 2584100864. Throughput: 0: 42850.7. Samples: 2584194860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 06:59:40,859][15349] Signal inference workers to stop experience collection... (38150 times) [2024-06-22 06:59:40,892][15401] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-06-22 06:59:40,909][15349] Signal inference workers to resume experience collection... (38150 times) [2024-06-22 06:59:40,912][15401] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-06-22 06:59:41,053][15401] Updated weights for policy 0, policy_version 157730 (0.0034) [2024-06-22 06:59:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 2584346624. Throughput: 0: 42958.2. Samples: 2584456420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:43,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 06:59:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157737_2584363008.pth... [2024-06-22 06:59:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157106_2574024704.pth [2024-06-22 06:59:44,784][15401] Updated weights for policy 0, policy_version 157740 (0.0032) [2024-06-22 06:59:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2584543232. Throughput: 0: 43132.1. Samples: 2584718100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 06:59:48,667][15401] Updated weights for policy 0, policy_version 157750 (0.0036) [2024-06-22 06:59:52,350][15401] Updated weights for policy 0, policy_version 157760 (0.0035) [2024-06-22 06:59:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2584756224. Throughput: 0: 42791.4. Samples: 2584836480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 06:59:56,831][15401] Updated weights for policy 0, policy_version 157770 (0.0026) [2024-06-22 06:59:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 2584985600. Throughput: 0: 42806.1. Samples: 2585093760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 06:59:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 06:59:59,982][15401] Updated weights for policy 0, policy_version 157780 (0.0033) [2024-06-22 07:00:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2585165824. Throughput: 0: 42978.9. Samples: 2585350280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 07:00:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:00:04,482][15401] Updated weights for policy 0, policy_version 157790 (0.0034) [2024-06-22 07:00:07,677][15401] Updated weights for policy 0, policy_version 157800 (0.0025) [2024-06-22 07:00:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42872.4, 300 sec: 42987.2). Total num frames: 2585395200. Throughput: 0: 42671.9. Samples: 2585473260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 07:00:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 07:00:11,964][15401] Updated weights for policy 0, policy_version 157810 (0.0042) [2024-06-22 07:00:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2585608192. Throughput: 0: 42739.6. Samples: 2585736420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 07:00:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 07:00:15,478][15401] Updated weights for policy 0, policy_version 157820 (0.0034) [2024-06-22 07:00:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 2585821184. Throughput: 0: 42657.8. Samples: 2585988820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 07:00:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 07:00:19,577][15401] Updated weights for policy 0, policy_version 157830 (0.0032) [2024-06-22 07:00:23,248][15401] Updated weights for policy 0, policy_version 157840 (0.0044) [2024-06-22 07:00:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 2586050560. Throughput: 0: 42752.6. Samples: 2586118720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 07:00:23,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 07:00:27,160][15401] Updated weights for policy 0, policy_version 157850 (0.0030) [2024-06-22 07:00:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 2586230784. Throughput: 0: 42473.0. Samples: 2586367600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 07:00:31,425][15401] Updated weights for policy 0, policy_version 157860 (0.0022) [2024-06-22 07:00:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2586460160. Throughput: 0: 42260.0. Samples: 2586619800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 07:00:35,033][15401] Updated weights for policy 0, policy_version 157870 (0.0030) [2024-06-22 07:00:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2586673152. Throughput: 0: 42689.7. Samples: 2586757520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 07:00:38,847][15401] Updated weights for policy 0, policy_version 157880 (0.0036) [2024-06-22 07:00:42,957][15401] Updated weights for policy 0, policy_version 157890 (0.0041) [2024-06-22 07:00:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 2586886144. Throughput: 0: 42647.1. Samples: 2587012880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 07:00:46,741][15401] Updated weights for policy 0, policy_version 157900 (0.0035) [2024-06-22 07:00:48,391][15132] Fps is (10 sec: 44232.5, 60 sec: 42870.7, 300 sec: 42876.0). Total num frames: 2587115520. Throughput: 0: 42481.3. Samples: 2587261980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:48,391][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 07:00:49,536][15349] Signal inference workers to stop experience collection... (38200 times) [2024-06-22 07:00:49,537][15349] Signal inference workers to resume experience collection... (38200 times) [2024-06-22 07:00:49,583][15401] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-06-22 07:00:49,583][15401] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-06-22 07:00:50,573][15401] Updated weights for policy 0, policy_version 157910 (0.0045) [2024-06-22 07:00:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2587312128. Throughput: 0: 42774.2. Samples: 2587398100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 07:00:54,152][15401] Updated weights for policy 0, policy_version 157920 (0.0051) [2024-06-22 07:00:57,990][15401] Updated weights for policy 0, policy_version 157930 (0.0037) [2024-06-22 07:00:58,389][15132] Fps is (10 sec: 40964.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2587525120. Throughput: 0: 42647.2. Samples: 2587655540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:00:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 07:01:02,078][15401] Updated weights for policy 0, policy_version 157940 (0.0033) [2024-06-22 07:01:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2587754496. Throughput: 0: 42716.8. Samples: 2587911080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 07:01:05,436][15401] Updated weights for policy 0, policy_version 157950 (0.0037) [2024-06-22 07:01:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2587967488. Throughput: 0: 42733.8. Samples: 2588041740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 07:01:09,637][15401] Updated weights for policy 0, policy_version 157960 (0.0049) [2024-06-22 07:01:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2588147712. Throughput: 0: 42912.4. Samples: 2588298660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 07:01:13,676][15401] Updated weights for policy 0, policy_version 157970 (0.0029) [2024-06-22 07:01:17,407][15401] Updated weights for policy 0, policy_version 157980 (0.0028) [2024-06-22 07:01:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2588393472. Throughput: 0: 42882.6. Samples: 2588549520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 07:01:21,163][15401] Updated weights for policy 0, policy_version 157990 (0.0034) [2024-06-22 07:01:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 2588590080. Throughput: 0: 42795.9. Samples: 2588683340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 07:01:24,827][15401] Updated weights for policy 0, policy_version 158000 (0.0037) [2024-06-22 07:01:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2588803072. Throughput: 0: 42806.8. Samples: 2588939180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-22 07:01:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 07:01:28,679][15401] Updated weights for policy 0, policy_version 158010 (0.0035) [2024-06-22 07:01:32,693][15401] Updated weights for policy 0, policy_version 158020 (0.0036) [2024-06-22 07:01:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2589032448. Throughput: 0: 42929.9. Samples: 2589193780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 07:01:36,160][15401] Updated weights for policy 0, policy_version 158030 (0.0027) [2024-06-22 07:01:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2589261824. Throughput: 0: 42725.4. Samples: 2589320740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 07:01:40,270][15401] Updated weights for policy 0, policy_version 158040 (0.0036) [2024-06-22 07:01:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2589458432. Throughput: 0: 42783.5. Samples: 2589580800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 07:01:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158049_2589474816.pth... [2024-06-22 07:01:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157420_2579169280.pth [2024-06-22 07:01:43,635][15401] Updated weights for policy 0, policy_version 158050 (0.0035) [2024-06-22 07:01:47,834][15401] Updated weights for policy 0, policy_version 158060 (0.0030) [2024-06-22 07:01:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42599.1, 300 sec: 42820.5). Total num frames: 2589671424. Throughput: 0: 42820.8. Samples: 2589838020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 07:01:51,566][15401] Updated weights for policy 0, policy_version 158070 (0.0040) [2024-06-22 07:01:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2589900800. Throughput: 0: 42714.6. Samples: 2589963900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 07:01:55,738][15401] Updated weights for policy 0, policy_version 158080 (0.0030) [2024-06-22 07:01:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2590130176. Throughput: 0: 42739.7. Samples: 2590221940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:01:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 07:01:59,030][15401] Updated weights for policy 0, policy_version 158090 (0.0042) [2024-06-22 07:02:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2590277632. Throughput: 0: 43004.0. Samples: 2590484700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 07:02:03,541][15401] Updated weights for policy 0, policy_version 158100 (0.0028) [2024-06-22 07:02:06,688][15401] Updated weights for policy 0, policy_version 158110 (0.0033) [2024-06-22 07:02:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2590539776. Throughput: 0: 42636.6. Samples: 2590601980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 07:02:11,258][15401] Updated weights for policy 0, policy_version 158120 (0.0024) [2024-06-22 07:02:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2590752768. Throughput: 0: 42844.8. Samples: 2590867200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 07:02:14,224][15401] Updated weights for policy 0, policy_version 158130 (0.0044) [2024-06-22 07:02:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 2590932992. Throughput: 0: 42767.5. Samples: 2591118320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:02:18,989][15401] Updated weights for policy 0, policy_version 158140 (0.0047) [2024-06-22 07:02:21,657][15349] Signal inference workers to stop experience collection... (38250 times) [2024-06-22 07:02:21,709][15401] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-06-22 07:02:21,712][15349] Signal inference workers to resume experience collection... (38250 times) [2024-06-22 07:02:21,727][15401] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-06-22 07:02:21,849][15401] Updated weights for policy 0, policy_version 158150 (0.0042) [2024-06-22 07:02:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 2591178752. Throughput: 0: 42599.2. Samples: 2591237700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 07:02:26,941][15401] Updated weights for policy 0, policy_version 158160 (0.0032) [2024-06-22 07:02:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2591375360. Throughput: 0: 42774.8. Samples: 2591505660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 07:02:29,630][15401] Updated weights for policy 0, policy_version 158170 (0.0027) [2024-06-22 07:02:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2591588352. Throughput: 0: 42644.5. Samples: 2591757020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 07:02:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 07:02:34,745][15401] Updated weights for policy 0, policy_version 158180 (0.0043) [2024-06-22 07:02:37,359][15401] Updated weights for policy 0, policy_version 158190 (0.0033) [2024-06-22 07:02:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 2591817728. Throughput: 0: 42633.4. Samples: 2591882500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:02:38,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 07:02:42,340][15401] Updated weights for policy 0, policy_version 158200 (0.0039) [2024-06-22 07:02:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2592014336. Throughput: 0: 42766.2. Samples: 2592146420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:02:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 07:02:44,750][15401] Updated weights for policy 0, policy_version 158210 (0.0031) [2024-06-22 07:02:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2592227328. Throughput: 0: 42540.0. Samples: 2592399000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:02:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 07:02:49,768][15401] Updated weights for policy 0, policy_version 158220 (0.0036) [2024-06-22 07:02:52,735][15401] Updated weights for policy 0, policy_version 158230 (0.0024) [2024-06-22 07:02:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2592473088. Throughput: 0: 42811.5. Samples: 2592528500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:02:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 07:02:57,477][15401] Updated weights for policy 0, policy_version 158240 (0.0039) [2024-06-22 07:02:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 2592636928. Throughput: 0: 42850.2. Samples: 2592795460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:02:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 07:03:00,115][15401] Updated weights for policy 0, policy_version 158250 (0.0029) [2024-06-22 07:03:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2592882688. Throughput: 0: 42798.3. Samples: 2593044240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:03,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 07:03:05,066][15401] Updated weights for policy 0, policy_version 158260 (0.0034) [2024-06-22 07:03:07,755][15401] Updated weights for policy 0, policy_version 158270 (0.0042) [2024-06-22 07:03:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2593112064. Throughput: 0: 43137.6. Samples: 2593178900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 07:03:12,441][15401] Updated weights for policy 0, policy_version 158280 (0.0040) [2024-06-22 07:03:13,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 2593292288. Throughput: 0: 43005.4. Samples: 2593441180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:13,396][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 07:03:15,421][15401] Updated weights for policy 0, policy_version 158290 (0.0040) [2024-06-22 07:03:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2593521664. Throughput: 0: 42936.0. Samples: 2593689140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 07:03:19,983][15401] Updated weights for policy 0, policy_version 158300 (0.0042) [2024-06-22 07:03:23,250][15401] Updated weights for policy 0, policy_version 158310 (0.0030) [2024-06-22 07:03:23,390][15132] Fps is (10 sec: 45904.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2593751040. Throughput: 0: 43153.8. Samples: 2593824320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:23,391][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 07:03:27,830][15401] Updated weights for policy 0, policy_version 158320 (0.0045) [2024-06-22 07:03:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2593914880. Throughput: 0: 42836.8. Samples: 2594074080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 07:03:31,299][15401] Updated weights for policy 0, policy_version 158330 (0.0034) [2024-06-22 07:03:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2594160640. Throughput: 0: 42734.2. Samples: 2594322040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:33,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 07:03:35,471][15401] Updated weights for policy 0, policy_version 158340 (0.0041) [2024-06-22 07:03:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2594373632. Throughput: 0: 42936.9. Samples: 2594460660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 07:03:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 07:03:38,772][15401] Updated weights for policy 0, policy_version 158350 (0.0032) [2024-06-22 07:03:43,036][15401] Updated weights for policy 0, policy_version 158360 (0.0030) [2024-06-22 07:03:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2594570240. Throughput: 0: 42595.5. Samples: 2594712260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:03:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 07:03:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158360_2594570240.pth... [2024-06-22 07:03:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000157737_2584363008.pth [2024-06-22 07:03:46,397][15401] Updated weights for policy 0, policy_version 158370 (0.0049) [2024-06-22 07:03:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2594816000. Throughput: 0: 42634.2. Samples: 2594962780. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:03:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 07:03:50,628][15401] Updated weights for policy 0, policy_version 158380 (0.0041) [2024-06-22 07:03:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2595012608. Throughput: 0: 42670.7. Samples: 2595099080. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:03:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 07:03:53,606][15349] Signal inference workers to stop experience collection... (38300 times) [2024-06-22 07:03:53,661][15401] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-06-22 07:03:53,669][15349] Signal inference workers to resume experience collection... (38300 times) [2024-06-22 07:03:53,676][15401] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-06-22 07:03:53,966][15401] Updated weights for policy 0, policy_version 158390 (0.0031) [2024-06-22 07:03:58,342][15401] Updated weights for policy 0, policy_version 158400 (0.0040) [2024-06-22 07:03:58,391][15132] Fps is (10 sec: 40955.4, 60 sec: 43143.8, 300 sec: 42709.3). Total num frames: 2595225600. Throughput: 0: 42606.4. Samples: 2595358240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:03:58,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 07:04:01,837][15401] Updated weights for policy 0, policy_version 158410 (0.0034) [2024-06-22 07:04:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 2595471360. Throughput: 0: 42448.5. Samples: 2595599320. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 07:04:05,963][15401] Updated weights for policy 0, policy_version 158420 (0.0025) [2024-06-22 07:04:08,389][15132] Fps is (10 sec: 42603.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2595651584. Throughput: 0: 42555.2. Samples: 2595739300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 07:04:09,527][15401] Updated weights for policy 0, policy_version 158430 (0.0024) [2024-06-22 07:04:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42876.0, 300 sec: 42654.9). Total num frames: 2595864576. Throughput: 0: 42736.9. Samples: 2595997240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 07:04:13,601][15401] Updated weights for policy 0, policy_version 158440 (0.0028) [2024-06-22 07:04:17,151][15401] Updated weights for policy 0, policy_version 158450 (0.0029) [2024-06-22 07:04:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2596110336. Throughput: 0: 42743.1. Samples: 2596245480. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 07:04:21,242][15401] Updated weights for policy 0, policy_version 158460 (0.0038) [2024-06-22 07:04:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2596290560. Throughput: 0: 42644.4. Samples: 2596379660. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 07:04:24,779][15401] Updated weights for policy 0, policy_version 158470 (0.0039) [2024-06-22 07:04:28,396][15132] Fps is (10 sec: 39296.8, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 2596503552. Throughput: 0: 42581.6. Samples: 2596628700. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:28,396][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 07:04:29,266][15401] Updated weights for policy 0, policy_version 158480 (0.0039) [2024-06-22 07:04:32,307][15401] Updated weights for policy 0, policy_version 158490 (0.0031) [2024-06-22 07:04:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2596749312. Throughput: 0: 42655.0. Samples: 2596882260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 07:04:36,992][15401] Updated weights for policy 0, policy_version 158500 (0.0042) [2024-06-22 07:04:38,394][15132] Fps is (10 sec: 42607.5, 60 sec: 42595.4, 300 sec: 42653.7). Total num frames: 2596929536. Throughput: 0: 42731.1. Samples: 2597022160. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:38,394][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 07:04:39,910][15401] Updated weights for policy 0, policy_version 158510 (0.0042) [2024-06-22 07:04:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2597126144. Throughput: 0: 42442.4. Samples: 2597268100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-22 07:04:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 07:04:44,623][15401] Updated weights for policy 0, policy_version 158520 (0.0039) [2024-06-22 07:04:47,506][15401] Updated weights for policy 0, policy_version 158530 (0.0024) [2024-06-22 07:04:48,390][15132] Fps is (10 sec: 45894.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2597388288. Throughput: 0: 42766.6. Samples: 2597523820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:04:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 07:04:52,348][15401] Updated weights for policy 0, policy_version 158540 (0.0029) [2024-06-22 07:04:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2597568512. Throughput: 0: 42751.4. Samples: 2597663120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:04:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 07:04:55,152][15401] Updated weights for policy 0, policy_version 158550 (0.0033) [2024-06-22 07:04:58,391][15132] Fps is (10 sec: 39317.5, 60 sec: 42598.4, 300 sec: 42764.9). Total num frames: 2597781504. Throughput: 0: 42469.2. Samples: 2597908400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:04:58,391][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 07:05:00,038][15401] Updated weights for policy 0, policy_version 158560 (0.0042) [2024-06-22 07:05:01,949][15349] Signal inference workers to stop experience collection... (38350 times) [2024-06-22 07:05:01,956][15349] Signal inference workers to resume experience collection... (38350 times) [2024-06-22 07:05:01,993][15401] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-06-22 07:05:01,993][15401] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-06-22 07:05:02,914][15401] Updated weights for policy 0, policy_version 158570 (0.0037) [2024-06-22 07:05:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2598027264. Throughput: 0: 42508.1. Samples: 2598158340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 07:05:07,596][15401] Updated weights for policy 0, policy_version 158580 (0.0034) [2024-06-22 07:05:08,390][15132] Fps is (10 sec: 42602.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2598207488. Throughput: 0: 42577.8. Samples: 2598295660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 07:05:10,454][15401] Updated weights for policy 0, policy_version 158590 (0.0029) [2024-06-22 07:05:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2598420480. Throughput: 0: 42700.2. Samples: 2598549940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:13,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 07:05:15,269][15401] Updated weights for policy 0, policy_version 158600 (0.0047) [2024-06-22 07:05:18,335][15401] Updated weights for policy 0, policy_version 158610 (0.0034) [2024-06-22 07:05:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2598666240. Throughput: 0: 42736.1. Samples: 2598805380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 07:05:22,850][15401] Updated weights for policy 0, policy_version 158620 (0.0034) [2024-06-22 07:05:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2598846464. Throughput: 0: 42475.5. Samples: 2598933380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 07:05:26,447][15401] Updated weights for policy 0, policy_version 158630 (0.0042) [2024-06-22 07:05:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42601.2, 300 sec: 42709.1). Total num frames: 2599059456. Throughput: 0: 42551.9. Samples: 2599183040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:28,392][15132] Avg episode reward: [(0, '0.898')] [2024-06-22 07:05:30,446][15401] Updated weights for policy 0, policy_version 158640 (0.0039) [2024-06-22 07:05:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2599288832. Throughput: 0: 42476.5. Samples: 2599435260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:33,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 07:05:34,037][15401] Updated weights for policy 0, policy_version 158650 (0.0032) [2024-06-22 07:05:38,167][15401] Updated weights for policy 0, policy_version 158660 (0.0032) [2024-06-22 07:05:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42601.3, 300 sec: 42709.5). Total num frames: 2599485440. Throughput: 0: 42311.6. Samples: 2599567140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:38,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-22 07:05:41,735][15401] Updated weights for policy 0, policy_version 158670 (0.0030) [2024-06-22 07:05:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42709.6). Total num frames: 2599714816. Throughput: 0: 42435.6. Samples: 2599817960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 07:05:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 07:05:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158674_2599714816.pth... [2024-06-22 07:05:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158049_2589474816.pth [2024-06-22 07:05:46,159][15401] Updated weights for policy 0, policy_version 158680 (0.0043) [2024-06-22 07:05:48,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 2599927808. Throughput: 0: 42580.1. Samples: 2600074720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:05:48,396][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 07:05:49,480][15401] Updated weights for policy 0, policy_version 158690 (0.0035) [2024-06-22 07:05:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2600108032. Throughput: 0: 42413.3. Samples: 2600204260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:05:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 07:05:53,814][15401] Updated weights for policy 0, policy_version 158700 (0.0039) [2024-06-22 07:05:57,051][15401] Updated weights for policy 0, policy_version 158710 (0.0039) [2024-06-22 07:05:58,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42872.2, 300 sec: 42709.5). Total num frames: 2600353792. Throughput: 0: 42354.7. Samples: 2600455900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:05:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 07:06:01,465][15401] Updated weights for policy 0, policy_version 158720 (0.0036) [2024-06-22 07:06:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2600566784. Throughput: 0: 42497.4. Samples: 2600717760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 07:06:04,705][15401] Updated weights for policy 0, policy_version 158730 (0.0034) [2024-06-22 07:06:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2600747008. Throughput: 0: 42365.0. Samples: 2600839800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:06:09,190][15401] Updated weights for policy 0, policy_version 158740 (0.0039) [2024-06-22 07:06:12,427][15401] Updated weights for policy 0, policy_version 158750 (0.0041) [2024-06-22 07:06:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2600992768. Throughput: 0: 42594.8. Samples: 2601099700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 07:06:16,870][15401] Updated weights for policy 0, policy_version 158760 (0.0034) [2024-06-22 07:06:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 2601205760. Throughput: 0: 42706.7. Samples: 2601357060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 07:06:20,122][15401] Updated weights for policy 0, policy_version 158770 (0.0035) [2024-06-22 07:06:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2601385984. Throughput: 0: 42561.0. Samples: 2601482380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:23,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 07:06:24,424][15401] Updated weights for policy 0, policy_version 158780 (0.0028) [2024-06-22 07:06:25,660][15349] Signal inference workers to stop experience collection... (38400 times) [2024-06-22 07:06:25,708][15349] Signal inference workers to resume experience collection... (38400 times) [2024-06-22 07:06:25,708][15401] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-06-22 07:06:25,721][15401] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-06-22 07:06:27,887][15401] Updated weights for policy 0, policy_version 158790 (0.0028) [2024-06-22 07:06:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2601631744. Throughput: 0: 42769.5. Samples: 2601742580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:06:31,961][15401] Updated weights for policy 0, policy_version 158800 (0.0036) [2024-06-22 07:06:33,390][15132] Fps is (10 sec: 44232.6, 60 sec: 42324.7, 300 sec: 42598.3). Total num frames: 2601828352. Throughput: 0: 42879.0. Samples: 2602004040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:33,391][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 07:06:35,343][15401] Updated weights for policy 0, policy_version 158810 (0.0027) [2024-06-22 07:06:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2602041344. Throughput: 0: 42691.8. Samples: 2602125380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 07:06:39,858][15401] Updated weights for policy 0, policy_version 158820 (0.0039) [2024-06-22 07:06:42,911][15401] Updated weights for policy 0, policy_version 158830 (0.0040) [2024-06-22 07:06:43,389][15132] Fps is (10 sec: 45879.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2602287104. Throughput: 0: 42962.3. Samples: 2602389200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 07:06:47,802][15401] Updated weights for policy 0, policy_version 158840 (0.0031) [2024-06-22 07:06:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42329.8, 300 sec: 42598.4). Total num frames: 2602467328. Throughput: 0: 42857.7. Samples: 2602646360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 07:06:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 07:06:50,716][15401] Updated weights for policy 0, policy_version 158850 (0.0025) [2024-06-22 07:06:53,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2602680320. Throughput: 0: 42832.8. Samples: 2602767380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:06:53,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 07:06:55,219][15401] Updated weights for policy 0, policy_version 158860 (0.0038) [2024-06-22 07:06:58,266][15401] Updated weights for policy 0, policy_version 158870 (0.0027) [2024-06-22 07:06:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2602926080. Throughput: 0: 42958.1. Samples: 2603032820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:06:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 07:07:02,790][15401] Updated weights for policy 0, policy_version 158880 (0.0028) [2024-06-22 07:07:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2603106304. Throughput: 0: 42859.8. Samples: 2603285760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 07:07:05,925][15401] Updated weights for policy 0, policy_version 158890 (0.0040) [2024-06-22 07:07:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 2603335680. Throughput: 0: 42780.3. Samples: 2603407600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:08,393][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 07:07:10,284][15401] Updated weights for policy 0, policy_version 158900 (0.0036) [2024-06-22 07:07:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2603548672. Throughput: 0: 42709.8. Samples: 2603664520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 07:07:13,710][15401] Updated weights for policy 0, policy_version 158910 (0.0037) [2024-06-22 07:07:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2603728896. Throughput: 0: 42746.2. Samples: 2603927580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 07:07:18,435][15401] Updated weights for policy 0, policy_version 158920 (0.0022) [2024-06-22 07:07:21,218][15401] Updated weights for policy 0, policy_version 158930 (0.0038) [2024-06-22 07:07:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2603991040. Throughput: 0: 42697.2. Samples: 2604046760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:07:25,885][15401] Updated weights for policy 0, policy_version 158940 (0.0036) [2024-06-22 07:07:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2604204032. Throughput: 0: 42745.8. Samples: 2604312760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 07:07:28,826][15401] Updated weights for policy 0, policy_version 158950 (0.0035) [2024-06-22 07:07:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.9, 300 sec: 42598.7). Total num frames: 2604384256. Throughput: 0: 42696.4. Samples: 2604567700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 07:07:33,746][15401] Updated weights for policy 0, policy_version 158960 (0.0035) [2024-06-22 07:07:36,717][15401] Updated weights for policy 0, policy_version 158970 (0.0043) [2024-06-22 07:07:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 2604630016. Throughput: 0: 42767.6. Samples: 2604691920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:38,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 07:07:41,267][15401] Updated weights for policy 0, policy_version 158980 (0.0036) [2024-06-22 07:07:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2604810240. Throughput: 0: 42553.5. Samples: 2604947720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:43,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 07:07:43,443][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158986_2604826624.pth... [2024-06-22 07:07:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158360_2594570240.pth [2024-06-22 07:07:44,597][15401] Updated weights for policy 0, policy_version 158990 (0.0037) [2024-06-22 07:07:48,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2605023232. Throughput: 0: 42610.3. Samples: 2605203220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:48,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 07:07:48,702][15401] Updated weights for policy 0, policy_version 159000 (0.0034) [2024-06-22 07:07:52,075][15401] Updated weights for policy 0, policy_version 159010 (0.0033) [2024-06-22 07:07:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2605268992. Throughput: 0: 42905.4. Samples: 2605338240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 07:07:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 07:07:56,147][15401] Updated weights for policy 0, policy_version 159020 (0.0037) [2024-06-22 07:07:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2605449216. Throughput: 0: 42945.7. Samples: 2605597080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:07:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 07:07:59,016][15349] Signal inference workers to stop experience collection... (38450 times) [2024-06-22 07:07:59,069][15401] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-06-22 07:07:59,077][15349] Signal inference workers to resume experience collection... (38450 times) [2024-06-22 07:07:59,086][15401] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-06-22 07:07:59,747][15401] Updated weights for policy 0, policy_version 159030 (0.0045) [2024-06-22 07:08:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2605678592. Throughput: 0: 42621.2. Samples: 2605845540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 07:08:03,738][15401] Updated weights for policy 0, policy_version 159040 (0.0027) [2024-06-22 07:08:07,451][15401] Updated weights for policy 0, policy_version 159050 (0.0039) [2024-06-22 07:08:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42873.2, 300 sec: 42765.9). Total num frames: 2605907968. Throughput: 0: 42970.2. Samples: 2605980420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 07:08:11,182][15401] Updated weights for policy 0, policy_version 159060 (0.0034) [2024-06-22 07:08:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2606088192. Throughput: 0: 42747.5. Samples: 2606236400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 07:08:15,173][15401] Updated weights for policy 0, policy_version 159070 (0.0037) [2024-06-22 07:08:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2606317568. Throughput: 0: 42676.1. Samples: 2606488120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 07:08:18,717][15401] Updated weights for policy 0, policy_version 159080 (0.0033) [2024-06-22 07:08:22,734][15401] Updated weights for policy 0, policy_version 159090 (0.0031) [2024-06-22 07:08:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2606546944. Throughput: 0: 42873.3. Samples: 2606621120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 07:08:26,382][15401] Updated weights for policy 0, policy_version 159100 (0.0033) [2024-06-22 07:08:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2606727168. Throughput: 0: 42902.5. Samples: 2606878340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:28,396][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 07:08:30,350][15401] Updated weights for policy 0, policy_version 159110 (0.0033) [2024-06-22 07:08:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2606972928. Throughput: 0: 42972.9. Samples: 2607137000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:08:33,925][15401] Updated weights for policy 0, policy_version 159120 (0.0039) [2024-06-22 07:08:37,805][15401] Updated weights for policy 0, policy_version 159130 (0.0036) [2024-06-22 07:08:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2607185920. Throughput: 0: 42959.6. Samples: 2607271420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:38,399][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 07:08:41,695][15401] Updated weights for policy 0, policy_version 159140 (0.0032) [2024-06-22 07:08:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2607382528. Throughput: 0: 42859.4. Samples: 2607525760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:43,391][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 07:08:45,271][15401] Updated weights for policy 0, policy_version 159150 (0.0026) [2024-06-22 07:08:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2607628288. Throughput: 0: 43021.4. Samples: 2607781500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 07:08:49,333][15401] Updated weights for policy 0, policy_version 159160 (0.0034) [2024-06-22 07:08:52,940][15401] Updated weights for policy 0, policy_version 159170 (0.0031) [2024-06-22 07:08:53,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 42764.8). Total num frames: 2607841280. Throughput: 0: 43056.4. Samples: 2607918060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:53,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 07:08:57,162][15401] Updated weights for policy 0, policy_version 159180 (0.0043) [2024-06-22 07:08:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2608021504. Throughput: 0: 42805.7. Samples: 2608162660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 07:08:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 07:09:01,090][15401] Updated weights for policy 0, policy_version 159190 (0.0035) [2024-06-22 07:09:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2608267264. Throughput: 0: 42800.4. Samples: 2608414140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 07:09:04,892][15401] Updated weights for policy 0, policy_version 159200 (0.0039) [2024-06-22 07:09:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2608463872. Throughput: 0: 42927.2. Samples: 2608552840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 07:09:08,590][15401] Updated weights for policy 0, policy_version 159210 (0.0033) [2024-06-22 07:09:12,638][15401] Updated weights for policy 0, policy_version 159220 (0.0037) [2024-06-22 07:09:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2608660480. Throughput: 0: 42752.1. Samples: 2608802180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 07:09:14,460][15349] Signal inference workers to stop experience collection... (38500 times) [2024-06-22 07:09:14,490][15401] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-06-22 07:09:14,520][15349] Signal inference workers to resume experience collection... (38500 times) [2024-06-22 07:09:14,520][15401] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-06-22 07:09:16,243][15401] Updated weights for policy 0, policy_version 159230 (0.0023) [2024-06-22 07:09:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2608906240. Throughput: 0: 42643.2. Samples: 2609055940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 07:09:20,197][15401] Updated weights for policy 0, policy_version 159240 (0.0037) [2024-06-22 07:09:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 2609119232. Throughput: 0: 42781.8. Samples: 2609196600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 07:09:24,262][15401] Updated weights for policy 0, policy_version 159250 (0.0025) [2024-06-22 07:09:28,067][15401] Updated weights for policy 0, policy_version 159260 (0.0048) [2024-06-22 07:09:28,394][15132] Fps is (10 sec: 40941.8, 60 sec: 43141.4, 300 sec: 42597.8). Total num frames: 2609315840. Throughput: 0: 42632.8. Samples: 2609444420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:28,394][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 07:09:32,086][15401] Updated weights for policy 0, policy_version 159270 (0.0035) [2024-06-22 07:09:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 2609528832. Throughput: 0: 42438.4. Samples: 2609691220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 07:09:35,566][15401] Updated weights for policy 0, policy_version 159280 (0.0040) [2024-06-22 07:09:38,389][15132] Fps is (10 sec: 44256.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2609758208. Throughput: 0: 42337.0. Samples: 2609823120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 07:09:39,858][15401] Updated weights for policy 0, policy_version 159290 (0.0031) [2024-06-22 07:09:43,058][15401] Updated weights for policy 0, policy_version 159300 (0.0037) [2024-06-22 07:09:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2609971200. Throughput: 0: 42596.5. Samples: 2610079500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 07:09:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159300_2609971200.pth... [2024-06-22 07:09:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158674_2599714816.pth [2024-06-22 07:09:47,560][15401] Updated weights for policy 0, policy_version 159310 (0.0037) [2024-06-22 07:09:48,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 2610151424. Throughput: 0: 42709.8. Samples: 2610336180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:48,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 07:09:50,674][15401] Updated weights for policy 0, policy_version 159320 (0.0030) [2024-06-22 07:09:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42765.2). Total num frames: 2610397184. Throughput: 0: 42382.1. Samples: 2610460040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 07:09:55,022][15401] Updated weights for policy 0, policy_version 159330 (0.0026) [2024-06-22 07:09:58,261][15401] Updated weights for policy 0, policy_version 159340 (0.0034) [2024-06-22 07:09:58,390][15132] Fps is (10 sec: 47524.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 2610626560. Throughput: 0: 42646.6. Samples: 2610721280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:09:58,391][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 07:10:02,592][15401] Updated weights for policy 0, policy_version 159350 (0.0042) [2024-06-22 07:10:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2610806784. Throughput: 0: 42728.3. Samples: 2610978720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 07:10:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 07:10:06,179][15401] Updated weights for policy 0, policy_version 159360 (0.0034) [2024-06-22 07:10:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2611036160. Throughput: 0: 42394.1. Samples: 2611104340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:08,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 07:10:10,093][15401] Updated weights for policy 0, policy_version 159370 (0.0035) [2024-06-22 07:10:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2611249152. Throughput: 0: 42836.6. Samples: 2611371880. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 07:10:13,816][15401] Updated weights for policy 0, policy_version 159380 (0.0029) [2024-06-22 07:10:17,679][15401] Updated weights for policy 0, policy_version 159390 (0.0035) [2024-06-22 07:10:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2611445760. Throughput: 0: 42972.4. Samples: 2611624980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 07:10:21,427][15401] Updated weights for policy 0, policy_version 159400 (0.0042) [2024-06-22 07:10:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2611658752. Throughput: 0: 42881.7. Samples: 2611752800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 07:10:23,710][15349] Signal inference workers to stop experience collection... (38550 times) [2024-06-22 07:10:23,749][15401] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-06-22 07:10:23,760][15349] Signal inference workers to resume experience collection... (38550 times) [2024-06-22 07:10:23,767][15401] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-06-22 07:10:25,243][15401] Updated weights for policy 0, policy_version 159410 (0.0044) [2024-06-22 07:10:28,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43145.9, 300 sec: 42764.7). Total num frames: 2611904512. Throughput: 0: 43023.0. Samples: 2612015640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:28,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 07:10:29,470][15401] Updated weights for policy 0, policy_version 159420 (0.0025) [2024-06-22 07:10:32,796][15401] Updated weights for policy 0, policy_version 159430 (0.0029) [2024-06-22 07:10:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2612117504. Throughput: 0: 42933.4. Samples: 2612268080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 07:10:37,266][15401] Updated weights for policy 0, policy_version 159440 (0.0044) [2024-06-22 07:10:38,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 2612314112. Throughput: 0: 43052.9. Samples: 2612397520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:38,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 07:10:40,351][15401] Updated weights for policy 0, policy_version 159450 (0.0031) [2024-06-22 07:10:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 2612527104. Throughput: 0: 43028.9. Samples: 2612657580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 07:10:44,876][15401] Updated weights for policy 0, policy_version 159460 (0.0038) [2024-06-22 07:10:48,033][15401] Updated weights for policy 0, policy_version 159470 (0.0042) [2024-06-22 07:10:48,390][15132] Fps is (10 sec: 44246.1, 60 sec: 43419.1, 300 sec: 42876.1). Total num frames: 2612756480. Throughput: 0: 42921.6. Samples: 2612910200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 07:10:52,341][15401] Updated weights for policy 0, policy_version 159480 (0.0037) [2024-06-22 07:10:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2612953088. Throughput: 0: 43099.7. Samples: 2613043820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 07:10:55,792][15401] Updated weights for policy 0, policy_version 159490 (0.0037) [2024-06-22 07:10:58,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2613182464. Throughput: 0: 42880.5. Samples: 2613301500. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:10:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 07:10:59,920][15401] Updated weights for policy 0, policy_version 159500 (0.0031) [2024-06-22 07:11:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2613395456. Throughput: 0: 42879.5. Samples: 2613554560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:11:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 07:11:03,468][15401] Updated weights for policy 0, policy_version 159510 (0.0038) [2024-06-22 07:11:07,505][15401] Updated weights for policy 0, policy_version 159520 (0.0032) [2024-06-22 07:11:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 2613592064. Throughput: 0: 42940.8. Samples: 2613685140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-22 07:11:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 07:11:11,035][15401] Updated weights for policy 0, policy_version 159530 (0.0027) [2024-06-22 07:11:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2613821440. Throughput: 0: 42820.6. Samples: 2613942460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 07:11:15,396][15401] Updated weights for policy 0, policy_version 159540 (0.0036) [2024-06-22 07:11:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2614034432. Throughput: 0: 42861.3. Samples: 2614196840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:18,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 07:11:18,955][15401] Updated weights for policy 0, policy_version 159550 (0.0034) [2024-06-22 07:11:23,038][15401] Updated weights for policy 0, policy_version 159560 (0.0025) [2024-06-22 07:11:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2614247424. Throughput: 0: 42782.1. Samples: 2614322620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:11:26,776][15401] Updated weights for policy 0, policy_version 159570 (0.0043) [2024-06-22 07:11:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.3, 300 sec: 42876.2). Total num frames: 2614476800. Throughput: 0: 42658.7. Samples: 2614577220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 07:11:30,423][15401] Updated weights for policy 0, policy_version 159580 (0.0038) [2024-06-22 07:11:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2614657024. Throughput: 0: 42904.2. Samples: 2614840880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 07:11:34,363][15401] Updated weights for policy 0, policy_version 159590 (0.0037) [2024-06-22 07:11:37,997][15401] Updated weights for policy 0, policy_version 159600 (0.0042) [2024-06-22 07:11:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2614886400. Throughput: 0: 42637.4. Samples: 2614962500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 07:11:42,068][15401] Updated weights for policy 0, policy_version 159610 (0.0029) [2024-06-22 07:11:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2615099392. Throughput: 0: 42606.6. Samples: 2615218800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 07:11:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159614_2615115776.pth... [2024-06-22 07:11:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000158986_2604826624.pth [2024-06-22 07:11:45,657][15401] Updated weights for policy 0, policy_version 159620 (0.0034) [2024-06-22 07:11:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 2615296000. Throughput: 0: 42814.1. Samples: 2615481200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 07:11:49,867][15401] Updated weights for policy 0, policy_version 159630 (0.0029) [2024-06-22 07:11:53,382][15401] Updated weights for policy 0, policy_version 159640 (0.0036) [2024-06-22 07:11:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2615541760. Throughput: 0: 42552.6. Samples: 2615600000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 07:11:55,695][15349] Signal inference workers to stop experience collection... (38600 times) [2024-06-22 07:11:55,749][15349] Signal inference workers to resume experience collection... (38600 times) [2024-06-22 07:11:55,749][15401] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-06-22 07:11:55,770][15401] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-06-22 07:11:57,719][15401] Updated weights for policy 0, policy_version 159650 (0.0029) [2024-06-22 07:11:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2615738368. Throughput: 0: 42554.2. Samples: 2615857400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:11:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 07:12:00,953][15401] Updated weights for policy 0, policy_version 159660 (0.0040) [2024-06-22 07:12:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 2615934976. Throughput: 0: 42671.5. Samples: 2616117060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:12:03,391][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 07:12:05,550][15401] Updated weights for policy 0, policy_version 159670 (0.0031) [2024-06-22 07:12:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2616180736. Throughput: 0: 42589.5. Samples: 2616239140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:12:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 07:12:08,487][15401] Updated weights for policy 0, policy_version 159680 (0.0034) [2024-06-22 07:12:12,993][15401] Updated weights for policy 0, policy_version 159690 (0.0031) [2024-06-22 07:12:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2616377344. Throughput: 0: 42769.4. Samples: 2616501840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:12:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 07:12:16,330][15401] Updated weights for policy 0, policy_version 159700 (0.0039) [2024-06-22 07:12:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2616573952. Throughput: 0: 42448.9. Samples: 2616751080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 07:12:20,751][15401] Updated weights for policy 0, policy_version 159710 (0.0028) [2024-06-22 07:12:23,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2616819712. Throughput: 0: 42492.3. Samples: 2616874760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:23,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 07:12:24,043][15401] Updated weights for policy 0, policy_version 159720 (0.0032) [2024-06-22 07:12:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42050.5, 300 sec: 42764.7). Total num frames: 2616999936. Throughput: 0: 42615.5. Samples: 2617136600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:28,393][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 07:12:28,532][15401] Updated weights for policy 0, policy_version 159730 (0.0037) [2024-06-22 07:12:31,800][15401] Updated weights for policy 0, policy_version 159740 (0.0031) [2024-06-22 07:12:33,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2617229312. Throughput: 0: 42424.5. Samples: 2617390300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 07:12:36,294][15401] Updated weights for policy 0, policy_version 159750 (0.0049) [2024-06-22 07:12:38,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2617458688. Throughput: 0: 42644.4. Samples: 2617519000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:38,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 07:12:39,402][15401] Updated weights for policy 0, policy_version 159760 (0.0026) [2024-06-22 07:12:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2617638912. Throughput: 0: 42620.7. Samples: 2617775340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 07:12:43,953][15401] Updated weights for policy 0, policy_version 159770 (0.0039) [2024-06-22 07:12:47,138][15401] Updated weights for policy 0, policy_version 159780 (0.0026) [2024-06-22 07:12:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2617868288. Throughput: 0: 42445.0. Samples: 2618027080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 07:12:51,858][15401] Updated weights for policy 0, policy_version 159790 (0.0040) [2024-06-22 07:12:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2618097664. Throughput: 0: 42684.0. Samples: 2618159920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 07:12:55,038][15401] Updated weights for policy 0, policy_version 159800 (0.0028) [2024-06-22 07:12:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2618277888. Throughput: 0: 42498.7. Samples: 2618414280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:12:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:12:59,423][15401] Updated weights for policy 0, policy_version 159810 (0.0037) [2024-06-22 07:13:02,495][15401] Updated weights for policy 0, policy_version 159820 (0.0032) [2024-06-22 07:13:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2618523648. Throughput: 0: 42628.4. Samples: 2618669360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:13:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:13:07,042][15349] Signal inference workers to stop experience collection... (38650 times) [2024-06-22 07:13:07,042][15349] Signal inference workers to resume experience collection... (38650 times) [2024-06-22 07:13:07,092][15401] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-06-22 07:13:07,092][15401] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-06-22 07:13:07,210][15401] Updated weights for policy 0, policy_version 159830 (0.0036) [2024-06-22 07:13:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2618720256. Throughput: 0: 42872.6. Samples: 2618803920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:13:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 07:13:10,109][15401] Updated weights for policy 0, policy_version 159840 (0.0030) [2024-06-22 07:13:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2618916864. Throughput: 0: 42730.3. Samples: 2619059360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:13:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 07:13:14,633][15401] Updated weights for policy 0, policy_version 159850 (0.0027) [2024-06-22 07:13:17,803][15401] Updated weights for policy 0, policy_version 159860 (0.0044) [2024-06-22 07:13:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 2619162624. Throughput: 0: 42606.2. Samples: 2619307680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:13:18,392][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 07:13:22,148][15401] Updated weights for policy 0, policy_version 159870 (0.0036) [2024-06-22 07:13:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 2619342848. Throughput: 0: 42755.0. Samples: 2619442980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 07:13:25,393][15401] Updated weights for policy 0, policy_version 159880 (0.0044) [2024-06-22 07:13:28,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 2619555840. Throughput: 0: 42609.5. Samples: 2619692760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:28,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 07:13:30,045][15401] Updated weights for policy 0, policy_version 159890 (0.0028) [2024-06-22 07:13:33,193][15401] Updated weights for policy 0, policy_version 159900 (0.0033) [2024-06-22 07:13:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2619801600. Throughput: 0: 42652.8. Samples: 2619946460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 07:13:37,616][15401] Updated weights for policy 0, policy_version 159910 (0.0033) [2024-06-22 07:13:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2619981824. Throughput: 0: 42656.0. Samples: 2620079440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 07:13:41,014][15401] Updated weights for policy 0, policy_version 159920 (0.0025) [2024-06-22 07:13:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2620211200. Throughput: 0: 42543.0. Samples: 2620328720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 07:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159925_2620211200.pth... [2024-06-22 07:13:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159300_2609971200.pth [2024-06-22 07:13:45,175][15401] Updated weights for policy 0, policy_version 159930 (0.0041) [2024-06-22 07:13:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2620440576. Throughput: 0: 42561.4. Samples: 2620584620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:48,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 07:13:48,674][15401] Updated weights for policy 0, policy_version 159940 (0.0039) [2024-06-22 07:13:53,125][15401] Updated weights for policy 0, policy_version 159950 (0.0039) [2024-06-22 07:13:53,391][15132] Fps is (10 sec: 40952.4, 60 sec: 42051.0, 300 sec: 42709.2). Total num frames: 2620620800. Throughput: 0: 42424.9. Samples: 2620713120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:53,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 07:13:56,402][15401] Updated weights for policy 0, policy_version 159960 (0.0034) [2024-06-22 07:13:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2620850176. Throughput: 0: 42344.4. Samples: 2620964860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:13:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 07:14:00,758][15401] Updated weights for policy 0, policy_version 159970 (0.0040) [2024-06-22 07:14:03,389][15132] Fps is (10 sec: 44245.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2621063168. Throughput: 0: 42625.9. Samples: 2621225740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:14:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 07:14:04,073][15401] Updated weights for policy 0, policy_version 159980 (0.0032) [2024-06-22 07:14:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2621259776. Throughput: 0: 42521.1. Samples: 2621356420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:14:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 07:14:08,510][15401] Updated weights for policy 0, policy_version 159990 (0.0033) [2024-06-22 07:14:11,827][15401] Updated weights for policy 0, policy_version 160000 (0.0035) [2024-06-22 07:14:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 2621505536. Throughput: 0: 42547.0. Samples: 2621607480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:14:13,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 07:14:16,099][15401] Updated weights for policy 0, policy_version 160010 (0.0032) [2024-06-22 07:14:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 2621685760. Throughput: 0: 42712.1. Samples: 2621868500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 07:14:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 07:14:19,551][15401] Updated weights for policy 0, policy_version 160020 (0.0031) [2024-06-22 07:14:23,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.8, 300 sec: 42709.8). Total num frames: 2621915136. Throughput: 0: 42504.4. Samples: 2621992240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:23,393][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 07:14:23,598][15401] Updated weights for policy 0, policy_version 160030 (0.0038) [2024-06-22 07:14:27,209][15401] Updated weights for policy 0, policy_version 160040 (0.0028) [2024-06-22 07:14:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2622144512. Throughput: 0: 42799.0. Samples: 2622254680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 07:14:30,983][15401] Updated weights for policy 0, policy_version 160050 (0.0032) [2024-06-22 07:14:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2622341120. Throughput: 0: 42921.7. Samples: 2622516100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 07:14:34,758][15401] Updated weights for policy 0, policy_version 160060 (0.0032) [2024-06-22 07:14:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2622570496. Throughput: 0: 42787.5. Samples: 2622638480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 07:14:38,547][15401] Updated weights for policy 0, policy_version 160070 (0.0023) [2024-06-22 07:14:42,460][15401] Updated weights for policy 0, policy_version 160080 (0.0029) [2024-06-22 07:14:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2622783488. Throughput: 0: 42954.1. Samples: 2622897800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 07:14:46,020][15401] Updated weights for policy 0, policy_version 160090 (0.0033) [2024-06-22 07:14:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2622980096. Throughput: 0: 43065.4. Samples: 2623163680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 07:14:50,126][15401] Updated weights for policy 0, policy_version 160100 (0.0038) [2024-06-22 07:14:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43418.9, 300 sec: 42709.5). Total num frames: 2623225856. Throughput: 0: 42807.4. Samples: 2623282760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 07:14:53,974][15401] Updated weights for policy 0, policy_version 160110 (0.0030) [2024-06-22 07:14:57,695][15401] Updated weights for policy 0, policy_version 160120 (0.0025) [2024-06-22 07:14:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2623422464. Throughput: 0: 43028.2. Samples: 2623543640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:14:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 07:15:01,552][15401] Updated weights for policy 0, policy_version 160130 (0.0037) [2024-06-22 07:15:03,311][15349] Signal inference workers to stop experience collection... (38700 times) [2024-06-22 07:15:03,361][15401] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-06-22 07:15:03,373][15349] Signal inference workers to resume experience collection... (38700 times) [2024-06-22 07:15:03,380][15401] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-06-22 07:15:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2623619072. Throughput: 0: 43179.6. Samples: 2623811580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:15:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 07:15:05,409][15401] Updated weights for policy 0, policy_version 160140 (0.0025) [2024-06-22 07:15:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2623864832. Throughput: 0: 42956.5. Samples: 2623925180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:15:08,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 07:15:09,382][15401] Updated weights for policy 0, policy_version 160150 (0.0033) [2024-06-22 07:15:13,117][15401] Updated weights for policy 0, policy_version 160160 (0.0030) [2024-06-22 07:15:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 2624061440. Throughput: 0: 42883.7. Samples: 2624184440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:15:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 07:15:16,871][15401] Updated weights for policy 0, policy_version 160170 (0.0033) [2024-06-22 07:15:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 2624258048. Throughput: 0: 42911.4. Samples: 2624447220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:15:18,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 07:15:20,815][15401] Updated weights for policy 0, policy_version 160180 (0.0041) [2024-06-22 07:15:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.3, 300 sec: 42654.3). Total num frames: 2624487424. Throughput: 0: 42873.9. Samples: 2624567800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 07:15:23,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-22 07:15:24,603][15401] Updated weights for policy 0, policy_version 160190 (0.0027) [2024-06-22 07:15:28,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2624700416. Throughput: 0: 42835.7. Samples: 2624825400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 07:15:28,466][15401] Updated weights for policy 0, policy_version 160200 (0.0032) [2024-06-22 07:15:32,155][15401] Updated weights for policy 0, policy_version 160210 (0.0034) [2024-06-22 07:15:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2624897024. Throughput: 0: 42617.7. Samples: 2625081480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 07:15:35,974][15401] Updated weights for policy 0, policy_version 160220 (0.0027) [2024-06-22 07:15:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2625126400. Throughput: 0: 42733.3. Samples: 2625205760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 07:15:39,752][15401] Updated weights for policy 0, policy_version 160230 (0.0031) [2024-06-22 07:15:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2625339392. Throughput: 0: 42558.2. Samples: 2625458760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 07:15:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160239_2625355776.pth... [2024-06-22 07:15:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159614_2615115776.pth [2024-06-22 07:15:43,709][15401] Updated weights for policy 0, policy_version 160240 (0.0029) [2024-06-22 07:15:47,701][15401] Updated weights for policy 0, policy_version 160250 (0.0034) [2024-06-22 07:15:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2625536000. Throughput: 0: 42463.1. Samples: 2625722420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 07:15:51,223][15401] Updated weights for policy 0, policy_version 160260 (0.0037) [2024-06-22 07:15:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2625781760. Throughput: 0: 42692.0. Samples: 2625846320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 07:15:55,469][15401] Updated weights for policy 0, policy_version 160270 (0.0039) [2024-06-22 07:15:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2625978368. Throughput: 0: 42539.8. Samples: 2626098740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:15:58,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 07:15:58,956][15401] Updated weights for policy 0, policy_version 160280 (0.0038) [2024-06-22 07:16:03,026][15401] Updated weights for policy 0, policy_version 160290 (0.0022) [2024-06-22 07:16:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2626191360. Throughput: 0: 42385.4. Samples: 2626354460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:03,394][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 07:16:06,598][15401] Updated weights for policy 0, policy_version 160300 (0.0043) [2024-06-22 07:16:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2626404352. Throughput: 0: 42550.6. Samples: 2626482580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:08,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 07:16:10,609][15401] Updated weights for policy 0, policy_version 160310 (0.0031) [2024-06-22 07:16:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2626633728. Throughput: 0: 42627.1. Samples: 2626743620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 07:16:14,344][15401] Updated weights for policy 0, policy_version 160320 (0.0026) [2024-06-22 07:16:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.3, 300 sec: 42654.0). Total num frames: 2626830336. Throughput: 0: 42640.5. Samples: 2627000300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 07:16:18,487][15401] Updated weights for policy 0, policy_version 160330 (0.0037) [2024-06-22 07:16:21,460][15349] Signal inference workers to stop experience collection... (38750 times) [2024-06-22 07:16:21,460][15349] Signal inference workers to resume experience collection... (38750 times) [2024-06-22 07:16:21,488][15401] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-06-22 07:16:21,488][15401] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-06-22 07:16:22,111][15401] Updated weights for policy 0, policy_version 160340 (0.0026) [2024-06-22 07:16:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 2627059712. Throughput: 0: 42800.5. Samples: 2627131880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:23,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 07:16:25,958][15401] Updated weights for policy 0, policy_version 160350 (0.0043) [2024-06-22 07:16:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2627256320. Throughput: 0: 42825.8. Samples: 2627385920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 07:16:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 07:16:29,639][15401] Updated weights for policy 0, policy_version 160360 (0.0033) [2024-06-22 07:16:33,389][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2627485696. Throughput: 0: 42722.1. Samples: 2627644920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 07:16:33,702][15401] Updated weights for policy 0, policy_version 160370 (0.0034) [2024-06-22 07:16:37,497][15401] Updated weights for policy 0, policy_version 160380 (0.0036) [2024-06-22 07:16:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2627698688. Throughput: 0: 42836.8. Samples: 2627773980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 07:16:41,338][15401] Updated weights for policy 0, policy_version 160390 (0.0034) [2024-06-22 07:16:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2627928064. Throughput: 0: 42916.5. Samples: 2628029980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 07:16:45,238][15401] Updated weights for policy 0, policy_version 160400 (0.0034) [2024-06-22 07:16:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2628124672. Throughput: 0: 42971.5. Samples: 2628288180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 07:16:48,935][15401] Updated weights for policy 0, policy_version 160410 (0.0040) [2024-06-22 07:16:52,770][15401] Updated weights for policy 0, policy_version 160420 (0.0037) [2024-06-22 07:16:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2628337664. Throughput: 0: 42959.8. Samples: 2628415780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 07:16:56,522][15401] Updated weights for policy 0, policy_version 160430 (0.0040) [2024-06-22 07:16:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2628567040. Throughput: 0: 42872.1. Samples: 2628672860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:16:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 07:17:00,733][15401] Updated weights for policy 0, policy_version 160440 (0.0040) [2024-06-22 07:17:03,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.1, 300 sec: 42542.8). Total num frames: 2628730880. Throughput: 0: 42976.5. Samples: 2628934260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 07:17:04,170][15401] Updated weights for policy 0, policy_version 160450 (0.0024) [2024-06-22 07:17:08,255][15401] Updated weights for policy 0, policy_version 160460 (0.0032) [2024-06-22 07:17:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2628976640. Throughput: 0: 42736.1. Samples: 2629054900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 07:17:11,841][15401] Updated weights for policy 0, policy_version 160470 (0.0034) [2024-06-22 07:17:13,389][15132] Fps is (10 sec: 47515.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2629206016. Throughput: 0: 42753.8. Samples: 2629309840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 07:17:15,804][15401] Updated weights for policy 0, policy_version 160480 (0.0037) [2024-06-22 07:17:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 2629386240. Throughput: 0: 42711.5. Samples: 2629566940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 07:17:19,785][15401] Updated weights for policy 0, policy_version 160490 (0.0037) [2024-06-22 07:17:23,375][15401] Updated weights for policy 0, policy_version 160500 (0.0037) [2024-06-22 07:17:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42820.9). Total num frames: 2629632000. Throughput: 0: 42580.9. Samples: 2629690120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:23,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 07:17:27,735][15401] Updated weights for policy 0, policy_version 160510 (0.0031) [2024-06-22 07:17:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2629844992. Throughput: 0: 42767.7. Samples: 2629954520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 07:17:31,134][15401] Updated weights for policy 0, policy_version 160520 (0.0038) [2024-06-22 07:17:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2630041600. Throughput: 0: 42492.1. Samples: 2630200320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 07:17:33,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 07:17:35,443][15401] Updated weights for policy 0, policy_version 160530 (0.0036) [2024-06-22 07:17:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2630254592. Throughput: 0: 42447.2. Samples: 2630325900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:17:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 07:17:38,719][15401] Updated weights for policy 0, policy_version 160540 (0.0031) [2024-06-22 07:17:43,150][15401] Updated weights for policy 0, policy_version 160550 (0.0038) [2024-06-22 07:17:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2630451200. Throughput: 0: 42626.6. Samples: 2630591060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:17:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:17:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160551_2630467584.pth... [2024-06-22 07:17:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000159925_2620211200.pth [2024-06-22 07:17:46,679][15401] Updated weights for policy 0, policy_version 160560 (0.0030) [2024-06-22 07:17:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2630696960. Throughput: 0: 42306.5. Samples: 2630838040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:17:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 07:17:50,712][15401] Updated weights for policy 0, policy_version 160570 (0.0029) [2024-06-22 07:17:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2630877184. Throughput: 0: 42593.7. Samples: 2630971620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:17:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:17:54,282][15401] Updated weights for policy 0, policy_version 160580 (0.0026) [2024-06-22 07:17:58,321][15401] Updated weights for policy 0, policy_version 160590 (0.0041) [2024-06-22 07:17:58,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 2631106560. Throughput: 0: 42626.6. Samples: 2631228140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:17:58,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 07:18:01,776][15349] Signal inference workers to stop experience collection... (38800 times) [2024-06-22 07:18:01,830][15401] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-06-22 07:18:01,837][15349] Signal inference workers to resume experience collection... (38800 times) [2024-06-22 07:18:01,843][15401] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-06-22 07:18:01,980][15401] Updated weights for policy 0, policy_version 160600 (0.0047) [2024-06-22 07:18:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.9, 300 sec: 42765.0). Total num frames: 2631335936. Throughput: 0: 42364.6. Samples: 2631473340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:18:05,835][15401] Updated weights for policy 0, policy_version 160610 (0.0028) [2024-06-22 07:18:08,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2631516160. Throughput: 0: 42574.7. Samples: 2631605980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 07:18:09,699][15401] Updated weights for policy 0, policy_version 160620 (0.0038) [2024-06-22 07:18:13,391][15132] Fps is (10 sec: 39316.2, 60 sec: 42051.3, 300 sec: 42598.6). Total num frames: 2631729152. Throughput: 0: 42504.5. Samples: 2631867280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:13,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 07:18:13,747][15401] Updated weights for policy 0, policy_version 160630 (0.0029) [2024-06-22 07:18:17,491][15401] Updated weights for policy 0, policy_version 160640 (0.0040) [2024-06-22 07:18:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2631974912. Throughput: 0: 42517.8. Samples: 2632113620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 07:18:21,518][15401] Updated weights for policy 0, policy_version 160650 (0.0038) [2024-06-22 07:18:23,390][15132] Fps is (10 sec: 44242.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2632171520. Throughput: 0: 42792.0. Samples: 2632251540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 07:18:24,914][15401] Updated weights for policy 0, policy_version 160660 (0.0029) [2024-06-22 07:18:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2632384512. Throughput: 0: 42515.6. Samples: 2632504260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 07:18:29,405][15401] Updated weights for policy 0, policy_version 160670 (0.0047) [2024-06-22 07:18:32,847][15401] Updated weights for policy 0, policy_version 160680 (0.0035) [2024-06-22 07:18:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2632597504. Throughput: 0: 42717.5. Samples: 2632760320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 07:18:36,904][15401] Updated weights for policy 0, policy_version 160690 (0.0042) [2024-06-22 07:18:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2632810496. Throughput: 0: 42709.3. Samples: 2632893540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 07:18:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 07:18:40,622][15401] Updated weights for policy 0, policy_version 160700 (0.0026) [2024-06-22 07:18:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2633023488. Throughput: 0: 42582.6. Samples: 2633144260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:18:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 07:18:44,703][15401] Updated weights for policy 0, policy_version 160710 (0.0027) [2024-06-22 07:18:48,187][15401] Updated weights for policy 0, policy_version 160720 (0.0034) [2024-06-22 07:18:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42765.3). Total num frames: 2633236480. Throughput: 0: 42932.0. Samples: 2633405280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:18:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 07:18:52,247][15401] Updated weights for policy 0, policy_version 160730 (0.0023) [2024-06-22 07:18:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2633433088. Throughput: 0: 42794.4. Samples: 2633531720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:18:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 07:18:55,895][15401] Updated weights for policy 0, policy_version 160740 (0.0041) [2024-06-22 07:18:58,393][15132] Fps is (10 sec: 42584.8, 60 sec: 42597.9, 300 sec: 42709.0). Total num frames: 2633662464. Throughput: 0: 42637.0. Samples: 2633786020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:18:58,393][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 07:18:59,851][15401] Updated weights for policy 0, policy_version 160750 (0.0037) [2024-06-22 07:19:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2633875456. Throughput: 0: 42888.3. Samples: 2634043600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 07:19:03,645][15401] Updated weights for policy 0, policy_version 160760 (0.0032) [2024-06-22 07:19:07,518][15401] Updated weights for policy 0, policy_version 160770 (0.0029) [2024-06-22 07:19:08,390][15132] Fps is (10 sec: 40972.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2634072064. Throughput: 0: 42762.1. Samples: 2634175840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 07:19:11,164][15349] Signal inference workers to stop experience collection... (38850 times) [2024-06-22 07:19:11,201][15401] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-06-22 07:19:11,211][15349] Signal inference workers to resume experience collection... (38850 times) [2024-06-22 07:19:11,224][15401] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-06-22 07:19:11,227][15401] Updated weights for policy 0, policy_version 160780 (0.0046) [2024-06-22 07:19:13,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43140.8, 300 sec: 42819.6). Total num frames: 2634317824. Throughput: 0: 42778.8. Samples: 2634429580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:13,397][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 07:19:15,004][15401] Updated weights for policy 0, policy_version 160790 (0.0028) [2024-06-22 07:19:18,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 2634498048. Throughput: 0: 42991.4. Samples: 2634694940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:18,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 07:19:18,795][15401] Updated weights for policy 0, policy_version 160800 (0.0029) [2024-06-22 07:19:22,433][15401] Updated weights for policy 0, policy_version 160810 (0.0038) [2024-06-22 07:19:23,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2634727424. Throughput: 0: 42722.8. Samples: 2634816060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 07:19:26,371][15401] Updated weights for policy 0, policy_version 160820 (0.0041) [2024-06-22 07:19:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2634956800. Throughput: 0: 42900.9. Samples: 2635074800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 07:19:29,953][15401] Updated weights for policy 0, policy_version 160830 (0.0044) [2024-06-22 07:19:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2635137024. Throughput: 0: 42861.3. Samples: 2635334040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 07:19:33,960][15401] Updated weights for policy 0, policy_version 160840 (0.0029) [2024-06-22 07:19:38,174][15401] Updated weights for policy 0, policy_version 160850 (0.0034) [2024-06-22 07:19:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2635366400. Throughput: 0: 42689.7. Samples: 2635452760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 07:19:41,399][15401] Updated weights for policy 0, policy_version 160860 (0.0032) [2024-06-22 07:19:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2635612160. Throughput: 0: 42868.2. Samples: 2635714960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 07:19:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 07:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160865_2635612160.pth... [2024-06-22 07:19:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160239_2625355776.pth [2024-06-22 07:19:45,754][15401] Updated weights for policy 0, policy_version 160870 (0.0034) [2024-06-22 07:19:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2635792384. Throughput: 0: 42818.2. Samples: 2635970420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:19:48,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 07:19:49,155][15401] Updated weights for policy 0, policy_version 160880 (0.0033) [2024-06-22 07:19:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2636005376. Throughput: 0: 42635.8. Samples: 2636094440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:19:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 07:19:53,661][15401] Updated weights for policy 0, policy_version 160890 (0.0036) [2024-06-22 07:19:57,062][15401] Updated weights for policy 0, policy_version 160900 (0.0033) [2024-06-22 07:19:58,391][15132] Fps is (10 sec: 45869.3, 60 sec: 43145.7, 300 sec: 42820.3). Total num frames: 2636251136. Throughput: 0: 42805.7. Samples: 2636355620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:19:58,391][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 07:20:01,378][15401] Updated weights for policy 0, policy_version 160910 (0.0036) [2024-06-22 07:20:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2636447744. Throughput: 0: 42696.4. Samples: 2636616280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:20:04,750][15401] Updated weights for policy 0, policy_version 160920 (0.0027) [2024-06-22 07:20:08,390][15132] Fps is (10 sec: 40965.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2636660736. Throughput: 0: 42699.0. Samples: 2636737520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 07:20:08,951][15401] Updated weights for policy 0, policy_version 160930 (0.0027) [2024-06-22 07:20:12,562][15401] Updated weights for policy 0, policy_version 160940 (0.0030) [2024-06-22 07:20:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.0, 300 sec: 42765.4). Total num frames: 2636873728. Throughput: 0: 42781.4. Samples: 2636999960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 07:20:16,656][15401] Updated weights for policy 0, policy_version 160950 (0.0033) [2024-06-22 07:20:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2637053952. Throughput: 0: 42667.9. Samples: 2637254100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 07:20:20,314][15401] Updated weights for policy 0, policy_version 160960 (0.0037) [2024-06-22 07:20:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2637299712. Throughput: 0: 42727.7. Samples: 2637375500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 07:20:24,119][15401] Updated weights for policy 0, policy_version 160970 (0.0029) [2024-06-22 07:20:27,745][15401] Updated weights for policy 0, policy_version 160980 (0.0043) [2024-06-22 07:20:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2637512704. Throughput: 0: 42707.2. Samples: 2637636780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 07:20:28,529][15349] Signal inference workers to stop experience collection... (38900 times) [2024-06-22 07:20:28,530][15349] Signal inference workers to resume experience collection... (38900 times) [2024-06-22 07:20:28,560][15401] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-06-22 07:20:28,560][15401] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-06-22 07:20:31,665][15401] Updated weights for policy 0, policy_version 160990 (0.0035) [2024-06-22 07:20:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2637709312. Throughput: 0: 42748.5. Samples: 2637894100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:20:35,299][15401] Updated weights for policy 0, policy_version 161000 (0.0039) [2024-06-22 07:20:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2637955072. Throughput: 0: 42809.2. Samples: 2638020860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 07:20:39,256][15401] Updated weights for policy 0, policy_version 161010 (0.0039) [2024-06-22 07:20:43,067][15401] Updated weights for policy 0, policy_version 161020 (0.0028) [2024-06-22 07:20:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2638151680. Throughput: 0: 42713.7. Samples: 2638277680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 07:20:46,773][15401] Updated weights for policy 0, policy_version 161030 (0.0026) [2024-06-22 07:20:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2638364672. Throughput: 0: 42577.3. Samples: 2638532260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 07:20:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 07:20:50,593][15401] Updated weights for policy 0, policy_version 161040 (0.0041) [2024-06-22 07:20:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2638577664. Throughput: 0: 42726.3. Samples: 2638660200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:20:53,390][15132] Avg episode reward: [(0, '0.134')] [2024-06-22 07:20:54,277][15401] Updated weights for policy 0, policy_version 161050 (0.0045) [2024-06-22 07:20:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42599.4, 300 sec: 42765.0). Total num frames: 2638807040. Throughput: 0: 42664.4. Samples: 2638919860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:20:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 07:20:58,393][15401] Updated weights for policy 0, policy_version 161060 (0.0037) [2024-06-22 07:21:02,359][15401] Updated weights for policy 0, policy_version 161070 (0.0029) [2024-06-22 07:21:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2639003648. Throughput: 0: 42579.3. Samples: 2639170160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 07:21:06,167][15401] Updated weights for policy 0, policy_version 161080 (0.0026) [2024-06-22 07:21:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2639233024. Throughput: 0: 42739.1. Samples: 2639298760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 07:21:09,940][15401] Updated weights for policy 0, policy_version 161090 (0.0030) [2024-06-22 07:21:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2639429632. Throughput: 0: 42688.5. Samples: 2639557760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 07:21:13,906][15401] Updated weights for policy 0, policy_version 161100 (0.0038) [2024-06-22 07:21:17,470][15401] Updated weights for policy 0, policy_version 161110 (0.0053) [2024-06-22 07:21:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 2639642624. Throughput: 0: 42469.4. Samples: 2639805220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:21:21,491][15401] Updated weights for policy 0, policy_version 161120 (0.0035) [2024-06-22 07:21:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2639872000. Throughput: 0: 42616.4. Samples: 2639938600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 07:21:24,942][15401] Updated weights for policy 0, policy_version 161130 (0.0036) [2024-06-22 07:21:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2640068608. Throughput: 0: 42675.2. Samples: 2640198060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 07:21:29,432][15401] Updated weights for policy 0, policy_version 161140 (0.0047) [2024-06-22 07:21:32,624][15401] Updated weights for policy 0, policy_version 161150 (0.0037) [2024-06-22 07:21:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2640297984. Throughput: 0: 42571.5. Samples: 2640447980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:33,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 07:21:36,891][15401] Updated weights for policy 0, policy_version 161160 (0.0021) [2024-06-22 07:21:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2640494592. Throughput: 0: 42780.5. Samples: 2640585320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 07:21:40,170][15401] Updated weights for policy 0, policy_version 161170 (0.0037) [2024-06-22 07:21:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2640707584. Throughput: 0: 42687.6. Samples: 2640840800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 07:21:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161177_2640723968.pth... [2024-06-22 07:21:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160551_2630467584.pth [2024-06-22 07:21:44,537][15401] Updated weights for policy 0, policy_version 161180 (0.0038) [2024-06-22 07:21:48,105][15401] Updated weights for policy 0, policy_version 161190 (0.0045) [2024-06-22 07:21:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2640936960. Throughput: 0: 42607.6. Samples: 2641087500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 07:21:52,256][15401] Updated weights for policy 0, policy_version 161200 (0.0034) [2024-06-22 07:21:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 2641133568. Throughput: 0: 42633.2. Samples: 2641217360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:21:53,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 07:21:56,138][15401] Updated weights for policy 0, policy_version 161210 (0.0035) [2024-06-22 07:21:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 2641346560. Throughput: 0: 42483.5. Samples: 2641469520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:21:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 07:21:59,918][15401] Updated weights for policy 0, policy_version 161220 (0.0033) [2024-06-22 07:22:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2641559552. Throughput: 0: 42765.2. Samples: 2641729660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 07:22:03,861][15401] Updated weights for policy 0, policy_version 161230 (0.0030) [2024-06-22 07:22:07,636][15401] Updated weights for policy 0, policy_version 161240 (0.0037) [2024-06-22 07:22:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2641788928. Throughput: 0: 42614.3. Samples: 2641856240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 07:22:09,698][15349] Signal inference workers to stop experience collection... (38950 times) [2024-06-22 07:22:09,748][15401] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-06-22 07:22:09,808][15349] Signal inference workers to resume experience collection... (38950 times) [2024-06-22 07:22:09,808][15401] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-06-22 07:22:11,493][15401] Updated weights for policy 0, policy_version 161250 (0.0031) [2024-06-22 07:22:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2641985536. Throughput: 0: 42346.2. Samples: 2642103640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:13,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 07:22:15,390][15401] Updated weights for policy 0, policy_version 161260 (0.0045) [2024-06-22 07:22:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2642198528. Throughput: 0: 42585.4. Samples: 2642364320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 07:22:19,019][15401] Updated weights for policy 0, policy_version 161270 (0.0036) [2024-06-22 07:22:22,966][15401] Updated weights for policy 0, policy_version 161280 (0.0029) [2024-06-22 07:22:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2642411520. Throughput: 0: 42274.2. Samples: 2642487660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 07:22:26,770][15401] Updated weights for policy 0, policy_version 161290 (0.0040) [2024-06-22 07:22:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2642640896. Throughput: 0: 42091.1. Samples: 2642734900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 07:22:30,805][15401] Updated weights for policy 0, policy_version 161300 (0.0051) [2024-06-22 07:22:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2642804736. Throughput: 0: 42562.5. Samples: 2643002820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 07:22:34,529][15401] Updated weights for policy 0, policy_version 161310 (0.0040) [2024-06-22 07:22:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2643050496. Throughput: 0: 42336.1. Samples: 2643122380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 07:22:38,706][15401] Updated weights for policy 0, policy_version 161320 (0.0033) [2024-06-22 07:22:42,243][15401] Updated weights for policy 0, policy_version 161330 (0.0029) [2024-06-22 07:22:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2643263488. Throughput: 0: 42420.4. Samples: 2643378440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 07:22:46,339][15401] Updated weights for policy 0, policy_version 161340 (0.0038) [2024-06-22 07:22:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2643460096. Throughput: 0: 42409.3. Samples: 2643638080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 07:22:50,378][15401] Updated weights for policy 0, policy_version 161350 (0.0030) [2024-06-22 07:22:53,391][15132] Fps is (10 sec: 42592.6, 60 sec: 42599.2, 300 sec: 42654.1). Total num frames: 2643689472. Throughput: 0: 42220.5. Samples: 2643756220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 07:22:53,391][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 07:22:54,050][15401] Updated weights for policy 0, policy_version 161360 (0.0038) [2024-06-22 07:22:58,084][15401] Updated weights for policy 0, policy_version 161370 (0.0039) [2024-06-22 07:22:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2643902464. Throughput: 0: 42517.8. Samples: 2644016940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:22:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 07:23:01,698][15401] Updated weights for policy 0, policy_version 161380 (0.0033) [2024-06-22 07:23:03,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2644099072. Throughput: 0: 42269.3. Samples: 2644266440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 07:23:05,844][15401] Updated weights for policy 0, policy_version 161390 (0.0038) [2024-06-22 07:23:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42654.1). Total num frames: 2644312064. Throughput: 0: 42304.1. Samples: 2644391340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 07:23:09,426][15401] Updated weights for policy 0, policy_version 161400 (0.0034) [2024-06-22 07:23:13,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 2644508672. Throughput: 0: 42487.5. Samples: 2644646940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:13,393][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 07:23:13,597][15401] Updated weights for policy 0, policy_version 161410 (0.0047) [2024-06-22 07:23:17,138][15401] Updated weights for policy 0, policy_version 161420 (0.0027) [2024-06-22 07:23:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2644721664. Throughput: 0: 42138.8. Samples: 2644899060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 07:23:18,534][15349] Signal inference workers to stop experience collection... (39000 times) [2024-06-22 07:23:18,536][15349] Signal inference workers to resume experience collection... (39000 times) [2024-06-22 07:23:18,560][15401] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-06-22 07:23:18,560][15401] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-06-22 07:23:21,476][15401] Updated weights for policy 0, policy_version 161430 (0.0036) [2024-06-22 07:23:23,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2644951040. Throughput: 0: 42332.9. Samples: 2645027360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:23,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 07:23:24,626][15401] Updated weights for policy 0, policy_version 161440 (0.0031) [2024-06-22 07:23:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 2645131264. Throughput: 0: 42380.4. Samples: 2645285560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:28,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 07:23:29,239][15401] Updated weights for policy 0, policy_version 161450 (0.0033) [2024-06-22 07:23:32,673][15401] Updated weights for policy 0, policy_version 161460 (0.0052) [2024-06-22 07:23:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2645377024. Throughput: 0: 42140.0. Samples: 2645534380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 07:23:36,978][15401] Updated weights for policy 0, policy_version 161470 (0.0037) [2024-06-22 07:23:38,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 2645590016. Throughput: 0: 42453.6. Samples: 2645666680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:38,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 07:23:40,223][15401] Updated weights for policy 0, policy_version 161480 (0.0034) [2024-06-22 07:23:43,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42047.7, 300 sec: 42541.9). Total num frames: 2645786624. Throughput: 0: 42248.2. Samples: 2645918380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:43,397][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 07:23:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161486_2645786624.pth... [2024-06-22 07:23:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000160865_2635612160.pth [2024-06-22 07:23:44,608][15401] Updated weights for policy 0, policy_version 161490 (0.0042) [2024-06-22 07:23:47,792][15401] Updated weights for policy 0, policy_version 161500 (0.0033) [2024-06-22 07:23:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2646016000. Throughput: 0: 42322.2. Samples: 2646170940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 07:23:52,528][15401] Updated weights for policy 0, policy_version 161510 (0.0030) [2024-06-22 07:23:53,389][15132] Fps is (10 sec: 44265.8, 60 sec: 42326.3, 300 sec: 42598.9). Total num frames: 2646228992. Throughput: 0: 42520.4. Samples: 2646304760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 07:23:55,269][15401] Updated weights for policy 0, policy_version 161520 (0.0048) [2024-06-22 07:23:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2646425600. Throughput: 0: 42471.2. Samples: 2646558040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 07:23:58,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-22 07:24:00,139][15401] Updated weights for policy 0, policy_version 161530 (0.0029) [2024-06-22 07:24:02,730][15401] Updated weights for policy 0, policy_version 161540 (0.0037) [2024-06-22 07:24:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2646671360. Throughput: 0: 42409.3. Samples: 2646807480. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 07:24:07,934][15401] Updated weights for policy 0, policy_version 161550 (0.0040) [2024-06-22 07:24:08,390][15132] Fps is (10 sec: 44235.3, 60 sec: 42598.0, 300 sec: 42543.7). Total num frames: 2646867968. Throughput: 0: 42551.2. Samples: 2646942180. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:08,391][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 07:24:10,965][15401] Updated weights for policy 0, policy_version 161560 (0.0031) [2024-06-22 07:24:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 2647080960. Throughput: 0: 42547.2. Samples: 2647200180. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 07:24:15,622][15401] Updated weights for policy 0, policy_version 161570 (0.0037) [2024-06-22 07:24:18,396][15132] Fps is (10 sec: 42572.6, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 2647293952. Throughput: 0: 42423.4. Samples: 2647443700. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:18,397][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 07:24:18,672][15401] Updated weights for policy 0, policy_version 161580 (0.0034) [2024-06-22 07:24:21,480][15349] Signal inference workers to stop experience collection... (39050 times) [2024-06-22 07:24:21,481][15349] Signal inference workers to resume experience collection... (39050 times) [2024-06-22 07:24:21,526][15401] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-06-22 07:24:21,526][15401] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-06-22 07:24:23,122][15401] Updated weights for policy 0, policy_version 161590 (0.0030) [2024-06-22 07:24:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2647490560. Throughput: 0: 42550.4. Samples: 2647581340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 07:24:26,475][15401] Updated weights for policy 0, policy_version 161600 (0.0044) [2024-06-22 07:24:28,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2647703552. Throughput: 0: 42579.5. Samples: 2647834180. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:28,391][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 07:24:30,874][15401] Updated weights for policy 0, policy_version 161610 (0.0039) [2024-06-22 07:24:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2647932928. Throughput: 0: 42768.1. Samples: 2648095500. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 07:24:34,255][15401] Updated weights for policy 0, policy_version 161620 (0.0032) [2024-06-22 07:24:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 2648129536. Throughput: 0: 42657.6. Samples: 2648224360. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:38,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 07:24:38,451][15401] Updated weights for policy 0, policy_version 161630 (0.0023) [2024-06-22 07:24:41,994][15401] Updated weights for policy 0, policy_version 161640 (0.0059) [2024-06-22 07:24:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42603.0, 300 sec: 42542.9). Total num frames: 2648342528. Throughput: 0: 42440.5. Samples: 2648467860. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 07:24:46,348][15401] Updated weights for policy 0, policy_version 161650 (0.0030) [2024-06-22 07:24:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2648571904. Throughput: 0: 42547.5. Samples: 2648722120. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 07:24:49,707][15401] Updated weights for policy 0, policy_version 161660 (0.0040) [2024-06-22 07:24:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42432.0). Total num frames: 2648768512. Throughput: 0: 42466.1. Samples: 2648853140. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 07:24:54,258][15401] Updated weights for policy 0, policy_version 161670 (0.0032) [2024-06-22 07:24:57,420][15401] Updated weights for policy 0, policy_version 161680 (0.0035) [2024-06-22 07:24:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2648981504. Throughput: 0: 42294.6. Samples: 2649103440. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:24:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 07:25:01,764][15401] Updated weights for policy 0, policy_version 161690 (0.0030) [2024-06-22 07:25:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2649210880. Throughput: 0: 42635.8. Samples: 2649362040. Policy #0 lag: (min: 1.0, avg: 12.6, max: 23.0) [2024-06-22 07:25:03,394][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 07:25:05,233][15401] Updated weights for policy 0, policy_version 161700 (0.0033) [2024-06-22 07:25:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 2649407488. Throughput: 0: 42426.9. Samples: 2649490560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:08,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-22 07:25:09,738][15401] Updated weights for policy 0, policy_version 161710 (0.0024) [2024-06-22 07:25:12,846][15401] Updated weights for policy 0, policy_version 161720 (0.0034) [2024-06-22 07:25:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2649620480. Throughput: 0: 42504.4. Samples: 2649746880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 07:25:17,297][15401] Updated weights for policy 0, policy_version 161730 (0.0034) [2024-06-22 07:25:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42602.9, 300 sec: 42542.8). Total num frames: 2649849856. Throughput: 0: 42372.0. Samples: 2650002240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:18,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 07:25:20,623][15401] Updated weights for policy 0, policy_version 161740 (0.0028) [2024-06-22 07:25:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2650046464. Throughput: 0: 42372.9. Samples: 2650131140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:23,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-22 07:25:24,950][15401] Updated weights for policy 0, policy_version 161750 (0.0031) [2024-06-22 07:25:28,378][15401] Updated weights for policy 0, policy_version 161760 (0.0039) [2024-06-22 07:25:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2650275840. Throughput: 0: 42623.5. Samples: 2650385920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 07:25:32,391][15401] Updated weights for policy 0, policy_version 161770 (0.0030) [2024-06-22 07:25:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2650488832. Throughput: 0: 42618.2. Samples: 2650639940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:25:35,961][15401] Updated weights for policy 0, policy_version 161780 (0.0041) [2024-06-22 07:25:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2650685440. Throughput: 0: 42528.8. Samples: 2650766940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 07:25:39,931][15401] Updated weights for policy 0, policy_version 161790 (0.0023) [2024-06-22 07:25:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2650898432. Throughput: 0: 42698.8. Samples: 2651024880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 07:25:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161799_2650914816.pth... [2024-06-22 07:25:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161177_2640723968.pth [2024-06-22 07:25:43,702][15401] Updated weights for policy 0, policy_version 161800 (0.0038) [2024-06-22 07:25:47,715][15401] Updated weights for policy 0, policy_version 161810 (0.0030) [2024-06-22 07:25:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2651111424. Throughput: 0: 42644.9. Samples: 2651281060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 07:25:51,341][15401] Updated weights for policy 0, policy_version 161820 (0.0056) [2024-06-22 07:25:53,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 2651340800. Throughput: 0: 42682.7. Samples: 2651411380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:53,393][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 07:25:55,243][15401] Updated weights for policy 0, policy_version 161830 (0.0040) [2024-06-22 07:25:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2651537408. Throughput: 0: 42680.1. Samples: 2651667480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:25:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 07:25:59,107][15349] Signal inference workers to stop experience collection... (39100 times) [2024-06-22 07:25:59,107][15349] Signal inference workers to resume experience collection... (39100 times) [2024-06-22 07:25:59,148][15401] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-06-22 07:25:59,149][15401] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-06-22 07:25:59,248][15401] Updated weights for policy 0, policy_version 161840 (0.0034) [2024-06-22 07:26:02,781][15401] Updated weights for policy 0, policy_version 161850 (0.0032) [2024-06-22 07:26:03,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2651766784. Throughput: 0: 42652.4. Samples: 2651921600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:26:03,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 07:26:07,043][15401] Updated weights for policy 0, policy_version 161860 (0.0031) [2024-06-22 07:26:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2651979776. Throughput: 0: 42696.9. Samples: 2652052500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:26:08,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 07:26:10,461][15401] Updated weights for policy 0, policy_version 161870 (0.0038) [2024-06-22 07:26:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2652176384. Throughput: 0: 42594.2. Samples: 2652302660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:13,401][15132] Avg episode reward: [(0, '0.865')] [2024-06-22 07:26:14,801][15401] Updated weights for policy 0, policy_version 161880 (0.0040) [2024-06-22 07:26:18,119][15401] Updated weights for policy 0, policy_version 161890 (0.0032) [2024-06-22 07:26:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2652422144. Throughput: 0: 42464.1. Samples: 2652550820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:18,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 07:26:22,799][15401] Updated weights for policy 0, policy_version 161900 (0.0038) [2024-06-22 07:26:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2652585984. Throughput: 0: 42613.4. Samples: 2652684540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 07:26:25,822][15401] Updated weights for policy 0, policy_version 161910 (0.0023) [2024-06-22 07:26:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 2652815360. Throughput: 0: 42503.4. Samples: 2652937540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 07:26:30,172][15401] Updated weights for policy 0, policy_version 161920 (0.0037) [2024-06-22 07:26:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 2653044736. Throughput: 0: 42584.0. Samples: 2653197340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 07:26:33,403][15401] Updated weights for policy 0, policy_version 161930 (0.0050) [2024-06-22 07:26:37,611][15401] Updated weights for policy 0, policy_version 161940 (0.0021) [2024-06-22 07:26:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2653241344. Throughput: 0: 42631.6. Samples: 2653329700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:38,391][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 07:26:40,939][15401] Updated weights for policy 0, policy_version 161950 (0.0034) [2024-06-22 07:26:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2653470720. Throughput: 0: 42471.0. Samples: 2653578680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:26:45,565][15401] Updated weights for policy 0, policy_version 161960 (0.0039) [2024-06-22 07:26:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42543.2). Total num frames: 2653683712. Throughput: 0: 42772.2. Samples: 2653846340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 07:26:48,743][15401] Updated weights for policy 0, policy_version 161970 (0.0027) [2024-06-22 07:26:53,014][15401] Updated weights for policy 0, policy_version 161980 (0.0028) [2024-06-22 07:26:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 2653896704. Throughput: 0: 42651.5. Samples: 2653971820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 07:26:56,631][15401] Updated weights for policy 0, policy_version 161990 (0.0034) [2024-06-22 07:26:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2654109696. Throughput: 0: 42685.8. Samples: 2654223520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:26:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 07:27:00,750][15401] Updated weights for policy 0, policy_version 162000 (0.0034) [2024-06-22 07:27:03,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 2654322688. Throughput: 0: 43093.3. Samples: 2654490300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:27:03,397][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 07:27:04,223][15401] Updated weights for policy 0, policy_version 162010 (0.0032) [2024-06-22 07:27:08,364][15401] Updated weights for policy 0, policy_version 162020 (0.0046) [2024-06-22 07:27:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2654535680. Throughput: 0: 42895.6. Samples: 2654614840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:27:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 07:27:11,681][15401] Updated weights for policy 0, policy_version 162030 (0.0048) [2024-06-22 07:27:13,390][15132] Fps is (10 sec: 44265.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2654765056. Throughput: 0: 42976.8. Samples: 2654871500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:27:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 07:27:16,189][15401] Updated weights for policy 0, policy_version 162040 (0.0035) [2024-06-22 07:27:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2654961664. Throughput: 0: 43001.9. Samples: 2655132420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:18,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 07:27:19,187][15401] Updated weights for policy 0, policy_version 162050 (0.0032) [2024-06-22 07:27:23,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2655158272. Throughput: 0: 42876.6. Samples: 2655259140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:23,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 07:27:23,660][15401] Updated weights for policy 0, policy_version 162060 (0.0039) [2024-06-22 07:27:26,527][15349] Signal inference workers to stop experience collection... (39150 times) [2024-06-22 07:27:26,581][15401] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-06-22 07:27:26,582][15349] Signal inference workers to resume experience collection... (39150 times) [2024-06-22 07:27:26,598][15401] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-06-22 07:27:27,027][15401] Updated weights for policy 0, policy_version 162070 (0.0045) [2024-06-22 07:27:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2655404032. Throughput: 0: 43029.6. Samples: 2655515020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 07:27:31,117][15401] Updated weights for policy 0, policy_version 162080 (0.0037) [2024-06-22 07:27:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2655617024. Throughput: 0: 42848.3. Samples: 2655774520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 07:27:34,863][15401] Updated weights for policy 0, policy_version 162090 (0.0038) [2024-06-22 07:27:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2655813632. Throughput: 0: 42953.8. Samples: 2655904740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 07:27:38,752][15401] Updated weights for policy 0, policy_version 162100 (0.0035) [2024-06-22 07:27:42,457][15401] Updated weights for policy 0, policy_version 162110 (0.0036) [2024-06-22 07:27:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 2656059392. Throughput: 0: 43163.9. Samples: 2656166000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:43,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 07:27:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162113_2656059392.pth... [2024-06-22 07:27:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161486_2645786624.pth [2024-06-22 07:27:46,635][15401] Updated weights for policy 0, policy_version 162120 (0.0030) [2024-06-22 07:27:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42598.6). Total num frames: 2656256000. Throughput: 0: 42881.7. Samples: 2656419700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 07:27:49,918][15401] Updated weights for policy 0, policy_version 162130 (0.0029) [2024-06-22 07:27:53,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2656452608. Throughput: 0: 42928.1. Samples: 2656546600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 07:27:54,439][15401] Updated weights for policy 0, policy_version 162140 (0.0031) [2024-06-22 07:27:57,381][15401] Updated weights for policy 0, policy_version 162150 (0.0036) [2024-06-22 07:27:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2656698368. Throughput: 0: 42988.2. Samples: 2656805960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:27:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 07:28:02,038][15401] Updated weights for policy 0, policy_version 162160 (0.0029) [2024-06-22 07:28:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43149.2, 300 sec: 42709.4). Total num frames: 2656911360. Throughput: 0: 42819.4. Samples: 2657059300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:28:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 07:28:05,143][15401] Updated weights for policy 0, policy_version 162170 (0.0042) [2024-06-22 07:28:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2657091584. Throughput: 0: 42863.4. Samples: 2657188000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:28:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 07:28:09,414][15401] Updated weights for policy 0, policy_version 162180 (0.0025) [2024-06-22 07:28:12,742][15401] Updated weights for policy 0, policy_version 162190 (0.0039) [2024-06-22 07:28:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2657337344. Throughput: 0: 42933.5. Samples: 2657447020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:28:13,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 07:28:17,167][15401] Updated weights for policy 0, policy_version 162200 (0.0036) [2024-06-22 07:28:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2657533952. Throughput: 0: 42916.7. Samples: 2657705780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 07:28:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 07:28:20,216][15401] Updated weights for policy 0, policy_version 162210 (0.0039) [2024-06-22 07:28:23,396][15132] Fps is (10 sec: 40934.5, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 2657746944. Throughput: 0: 42856.7. Samples: 2657833560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:23,396][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 07:28:24,957][15401] Updated weights for policy 0, policy_version 162220 (0.0039) [2024-06-22 07:28:27,994][15401] Updated weights for policy 0, policy_version 162230 (0.0042) [2024-06-22 07:28:28,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2657976320. Throughput: 0: 42727.6. Samples: 2658088640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 07:28:32,488][15401] Updated weights for policy 0, policy_version 162240 (0.0023) [2024-06-22 07:28:33,389][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 2658156544. Throughput: 0: 42778.8. Samples: 2658344740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 07:28:35,972][15401] Updated weights for policy 0, policy_version 162250 (0.0044) [2024-06-22 07:28:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42765.6). Total num frames: 2658402304. Throughput: 0: 42549.7. Samples: 2658461440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:38,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 07:28:40,113][15401] Updated weights for policy 0, policy_version 162260 (0.0037) [2024-06-22 07:28:41,532][15349] Signal inference workers to stop experience collection... (39200 times) [2024-06-22 07:28:41,532][15349] Signal inference workers to resume experience collection... (39200 times) [2024-06-22 07:28:41,543][15401] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-06-22 07:28:41,544][15401] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-06-22 07:28:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 2658615296. Throughput: 0: 42524.3. Samples: 2658719560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 07:28:44,104][15401] Updated weights for policy 0, policy_version 162270 (0.0033) [2024-06-22 07:28:47,698][15401] Updated weights for policy 0, policy_version 162280 (0.0035) [2024-06-22 07:28:48,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2658795520. Throughput: 0: 42619.2. Samples: 2658977160. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:48,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 07:28:52,036][15401] Updated weights for policy 0, policy_version 162290 (0.0038) [2024-06-22 07:28:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 2659041280. Throughput: 0: 42547.1. Samples: 2659102720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:53,393][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 07:28:55,775][15401] Updated weights for policy 0, policy_version 162300 (0.0032) [2024-06-22 07:28:58,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 2659237888. Throughput: 0: 42446.6. Samples: 2659357220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:28:58,393][15132] Avg episode reward: [(0, '0.068')] [2024-06-22 07:28:59,738][15401] Updated weights for policy 0, policy_version 162310 (0.0033) [2024-06-22 07:29:03,201][15401] Updated weights for policy 0, policy_version 162320 (0.0026) [2024-06-22 07:29:03,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2659450880. Throughput: 0: 42388.4. Samples: 2659613260. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:29:03,391][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 07:29:07,382][15401] Updated weights for policy 0, policy_version 162330 (0.0045) [2024-06-22 07:29:08,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2659680256. Throughput: 0: 42373.4. Samples: 2659740100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:29:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 07:29:10,839][15401] Updated weights for policy 0, policy_version 162340 (0.0030) [2024-06-22 07:29:13,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 2659876864. Throughput: 0: 42296.4. Samples: 2659991980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:29:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 07:29:15,187][15401] Updated weights for policy 0, policy_version 162350 (0.0033) [2024-06-22 07:29:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2660089856. Throughput: 0: 42156.0. Samples: 2660241760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:29:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 07:29:18,487][15401] Updated weights for policy 0, policy_version 162360 (0.0050) [2024-06-22 07:29:22,918][15401] Updated weights for policy 0, policy_version 162370 (0.0039) [2024-06-22 07:29:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 2660286464. Throughput: 0: 42384.1. Samples: 2660368620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-22 07:29:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 07:29:26,151][15401] Updated weights for policy 0, policy_version 162380 (0.0043) [2024-06-22 07:29:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 2660483072. Throughput: 0: 42301.9. Samples: 2660623140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 07:29:30,802][15401] Updated weights for policy 0, policy_version 162390 (0.0021) [2024-06-22 07:29:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2660728832. Throughput: 0: 42243.6. Samples: 2660878120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 07:29:34,008][15401] Updated weights for policy 0, policy_version 162400 (0.0039) [2024-06-22 07:29:38,323][15401] Updated weights for policy 0, policy_version 162410 (0.0037) [2024-06-22 07:29:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 2660925440. Throughput: 0: 42278.4. Samples: 2661005140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 07:29:41,398][15401] Updated weights for policy 0, policy_version 162420 (0.0028) [2024-06-22 07:29:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2661138432. Throughput: 0: 42108.9. Samples: 2661252020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:43,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 07:29:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162424_2661154816.pth... [2024-06-22 07:29:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000161799_2650914816.pth [2024-06-22 07:29:46,055][15401] Updated weights for policy 0, policy_version 162430 (0.0022) [2024-06-22 07:29:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2661335040. Throughput: 0: 42224.2. Samples: 2661513340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 07:29:49,483][15401] Updated weights for policy 0, policy_version 162440 (0.0037) [2024-06-22 07:29:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 2661548032. Throughput: 0: 42154.8. Samples: 2661637060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 07:29:53,922][15401] Updated weights for policy 0, policy_version 162450 (0.0027) [2024-06-22 07:29:57,110][15401] Updated weights for policy 0, policy_version 162460 (0.0027) [2024-06-22 07:29:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 2661777408. Throughput: 0: 42274.3. Samples: 2661894320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:29:58,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 07:30:01,763][15401] Updated weights for policy 0, policy_version 162470 (0.0038) [2024-06-22 07:30:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2661990400. Throughput: 0: 42471.0. Samples: 2662152960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:03,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 07:30:03,899][15349] Signal inference workers to stop experience collection... (39250 times) [2024-06-22 07:30:03,943][15401] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-06-22 07:30:03,958][15349] Signal inference workers to resume experience collection... (39250 times) [2024-06-22 07:30:03,959][15401] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-06-22 07:30:04,688][15401] Updated weights for policy 0, policy_version 162480 (0.0037) [2024-06-22 07:30:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42542.9). Total num frames: 2662170624. Throughput: 0: 42399.1. Samples: 2662276580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 07:30:09,593][15401] Updated weights for policy 0, policy_version 162490 (0.0034) [2024-06-22 07:30:12,483][15401] Updated weights for policy 0, policy_version 162500 (0.0033) [2024-06-22 07:30:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2662416384. Throughput: 0: 42385.3. Samples: 2662530480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 07:30:17,384][15401] Updated weights for policy 0, policy_version 162510 (0.0039) [2024-06-22 07:30:18,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2662645760. Throughput: 0: 42420.0. Samples: 2662787020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:18,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 07:30:20,001][15401] Updated weights for policy 0, policy_version 162520 (0.0034) [2024-06-22 07:30:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2662809600. Throughput: 0: 42450.7. Samples: 2662915420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 07:30:24,956][15401] Updated weights for policy 0, policy_version 162530 (0.0041) [2024-06-22 07:30:27,654][15401] Updated weights for policy 0, policy_version 162540 (0.0027) [2024-06-22 07:30:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2663071744. Throughput: 0: 42680.5. Samples: 2663172640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 07:30:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 07:30:32,588][15401] Updated weights for policy 0, policy_version 162550 (0.0032) [2024-06-22 07:30:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2663268352. Throughput: 0: 42698.5. Samples: 2663434780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 07:30:35,503][15401] Updated weights for policy 0, policy_version 162560 (0.0039) [2024-06-22 07:30:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2663464960. Throughput: 0: 42630.6. Samples: 2663555440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 07:30:40,223][15401] Updated weights for policy 0, policy_version 162570 (0.0036) [2024-06-22 07:30:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2663694336. Throughput: 0: 42441.8. Samples: 2663804200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 07:30:43,687][15401] Updated weights for policy 0, policy_version 162580 (0.0027) [2024-06-22 07:30:47,970][15401] Updated weights for policy 0, policy_version 162590 (0.0035) [2024-06-22 07:30:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2663890944. Throughput: 0: 42513.0. Samples: 2664066040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:30:51,352][15401] Updated weights for policy 0, policy_version 162600 (0.0031) [2024-06-22 07:30:53,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 2664103936. Throughput: 0: 42466.5. Samples: 2664187680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:53,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:30:55,915][15401] Updated weights for policy 0, policy_version 162610 (0.0038) [2024-06-22 07:30:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2664333312. Throughput: 0: 42470.3. Samples: 2664441640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:30:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 07:30:59,070][15401] Updated weights for policy 0, policy_version 162620 (0.0051) [2024-06-22 07:31:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2664513536. Throughput: 0: 42674.6. Samples: 2664707380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:03,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 07:31:03,681][15401] Updated weights for policy 0, policy_version 162630 (0.0034) [2024-06-22 07:31:06,655][15401] Updated weights for policy 0, policy_version 162640 (0.0031) [2024-06-22 07:31:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2664742912. Throughput: 0: 42464.0. Samples: 2664826300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 07:31:11,431][15401] Updated weights for policy 0, policy_version 162650 (0.0026) [2024-06-22 07:31:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2664972288. Throughput: 0: 42452.8. Samples: 2665083020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 07:31:14,318][15401] Updated weights for policy 0, policy_version 162660 (0.0039) [2024-06-22 07:31:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2665152512. Throughput: 0: 42406.7. Samples: 2665343080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 07:31:19,119][15401] Updated weights for policy 0, policy_version 162670 (0.0028) [2024-06-22 07:31:20,221][15349] Signal inference workers to stop experience collection... (39300 times) [2024-06-22 07:31:20,227][15349] Signal inference workers to resume experience collection... (39300 times) [2024-06-22 07:31:20,248][15401] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-06-22 07:31:20,248][15401] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-06-22 07:31:22,139][15401] Updated weights for policy 0, policy_version 162680 (0.0034) [2024-06-22 07:31:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2665398272. Throughput: 0: 42291.1. Samples: 2665458540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 07:31:26,556][15401] Updated weights for policy 0, policy_version 162690 (0.0052) [2024-06-22 07:31:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2665611264. Throughput: 0: 42614.1. Samples: 2665721840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 07:31:29,675][15401] Updated weights for policy 0, policy_version 162700 (0.0037) [2024-06-22 07:31:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2665775104. Throughput: 0: 42594.0. Samples: 2665982780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 07:31:33,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 07:31:34,118][15401] Updated weights for policy 0, policy_version 162710 (0.0041) [2024-06-22 07:31:37,162][15401] Updated weights for policy 0, policy_version 162720 (0.0031) [2024-06-22 07:31:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2666037248. Throughput: 0: 42518.8. Samples: 2666100920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:31:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 07:31:41,767][15401] Updated weights for policy 0, policy_version 162730 (0.0037) [2024-06-22 07:31:43,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2666250240. Throughput: 0: 42701.3. Samples: 2666363200. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:31:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 07:31:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162735_2666250240.pth... [2024-06-22 07:31:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162113_2656059392.pth [2024-06-22 07:31:45,190][15401] Updated weights for policy 0, policy_version 162740 (0.0036) [2024-06-22 07:31:48,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2666414080. Throughput: 0: 42598.8. Samples: 2666624320. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:31:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 07:31:49,655][15401] Updated weights for policy 0, policy_version 162750 (0.0034) [2024-06-22 07:31:52,663][15401] Updated weights for policy 0, policy_version 162760 (0.0031) [2024-06-22 07:31:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 2666692608. Throughput: 0: 42684.5. Samples: 2666747100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:31:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:31:57,273][15401] Updated weights for policy 0, policy_version 162770 (0.0027) [2024-06-22 07:31:58,390][15132] Fps is (10 sec: 45871.0, 60 sec: 42324.7, 300 sec: 42543.7). Total num frames: 2666872832. Throughput: 0: 42845.5. Samples: 2667011100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:31:58,391][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 07:32:00,285][15401] Updated weights for policy 0, policy_version 162780 (0.0034) [2024-06-22 07:32:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2667069440. Throughput: 0: 42644.1. Samples: 2667262060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 07:32:04,995][15401] Updated weights for policy 0, policy_version 162790 (0.0030) [2024-06-22 07:32:07,980][15401] Updated weights for policy 0, policy_version 162800 (0.0038) [2024-06-22 07:32:08,389][15132] Fps is (10 sec: 45879.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2667331584. Throughput: 0: 42888.5. Samples: 2667388520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 07:32:12,734][15401] Updated weights for policy 0, policy_version 162810 (0.0028) [2024-06-22 07:32:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 2667495424. Throughput: 0: 42786.4. Samples: 2667647220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 07:32:15,748][15401] Updated weights for policy 0, policy_version 162820 (0.0030) [2024-06-22 07:32:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2667724800. Throughput: 0: 42446.8. Samples: 2667892880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 07:32:20,318][15401] Updated weights for policy 0, policy_version 162830 (0.0033) [2024-06-22 07:32:23,354][15401] Updated weights for policy 0, policy_version 162840 (0.0027) [2024-06-22 07:32:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2667970560. Throughput: 0: 42749.3. Samples: 2668024640. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 07:32:27,646][15349] Signal inference workers to stop experience collection... (39350 times) [2024-06-22 07:32:27,700][15401] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-06-22 07:32:27,704][15349] Signal inference workers to resume experience collection... (39350 times) [2024-06-22 07:32:27,713][15401] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-06-22 07:32:27,839][15401] Updated weights for policy 0, policy_version 162850 (0.0039) [2024-06-22 07:32:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2668150784. Throughput: 0: 42827.0. Samples: 2668290420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 07:32:30,941][15401] Updated weights for policy 0, policy_version 162860 (0.0041) [2024-06-22 07:32:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 2668363776. Throughput: 0: 42472.4. Samples: 2668535580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 07:32:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 07:32:35,770][15401] Updated weights for policy 0, policy_version 162870 (0.0041) [2024-06-22 07:32:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 2668593152. Throughput: 0: 42627.1. Samples: 2668665320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:32:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 07:32:38,662][15401] Updated weights for policy 0, policy_version 162880 (0.0042) [2024-06-22 07:32:43,324][15401] Updated weights for policy 0, policy_version 162890 (0.0029) [2024-06-22 07:32:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2668789760. Throughput: 0: 42406.1. Samples: 2668919340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:32:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 07:32:46,692][15401] Updated weights for policy 0, policy_version 162900 (0.0035) [2024-06-22 07:32:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2669002752. Throughput: 0: 42509.3. Samples: 2669174980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:32:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 07:32:50,863][15401] Updated weights for policy 0, policy_version 162910 (0.0034) [2024-06-22 07:32:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2669215744. Throughput: 0: 42571.9. Samples: 2669304260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:32:53,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 07:32:54,023][15401] Updated weights for policy 0, policy_version 162920 (0.0047) [2024-06-22 07:32:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42599.0, 300 sec: 42431.8). Total num frames: 2669428736. Throughput: 0: 42692.8. Samples: 2669568400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:32:58,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 07:32:58,395][15401] Updated weights for policy 0, policy_version 162930 (0.0035) [2024-06-22 07:33:01,530][15401] Updated weights for policy 0, policy_version 162940 (0.0031) [2024-06-22 07:33:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2669625344. Throughput: 0: 42843.0. Samples: 2669820820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 07:33:05,909][15401] Updated weights for policy 0, policy_version 162950 (0.0043) [2024-06-22 07:33:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2669854720. Throughput: 0: 42727.5. Samples: 2669947380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 07:33:09,634][15401] Updated weights for policy 0, policy_version 162960 (0.0038) [2024-06-22 07:33:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 2670084096. Throughput: 0: 42490.2. Samples: 2670202480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 07:33:13,453][15401] Updated weights for policy 0, policy_version 162970 (0.0029) [2024-06-22 07:33:17,185][15401] Updated weights for policy 0, policy_version 162980 (0.0034) [2024-06-22 07:33:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42488.2). Total num frames: 2670280704. Throughput: 0: 42758.1. Samples: 2670459700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 07:33:21,047][15401] Updated weights for policy 0, policy_version 162990 (0.0022) [2024-06-22 07:33:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2670493696. Throughput: 0: 42610.7. Samples: 2670582800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:23,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-22 07:33:24,776][15401] Updated weights for policy 0, policy_version 163000 (0.0037) [2024-06-22 07:33:28,390][15132] Fps is (10 sec: 45872.0, 60 sec: 43144.0, 300 sec: 42653.8). Total num frames: 2670739456. Throughput: 0: 42733.9. Samples: 2670842400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:28,391][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 07:33:28,647][15401] Updated weights for policy 0, policy_version 163010 (0.0039) [2024-06-22 07:33:32,422][15401] Updated weights for policy 0, policy_version 163020 (0.0035) [2024-06-22 07:33:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 2670936064. Throughput: 0: 42733.6. Samples: 2671098000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:33,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 07:33:36,616][15401] Updated weights for policy 0, policy_version 163030 (0.0033) [2024-06-22 07:33:38,390][15132] Fps is (10 sec: 40963.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2671149056. Throughput: 0: 42684.9. Samples: 2671225080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 07:33:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 07:33:40,121][15401] Updated weights for policy 0, policy_version 163040 (0.0036) [2024-06-22 07:33:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2671362048. Throughput: 0: 42536.4. Samples: 2671482540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:33:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 07:33:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163048_2671378432.pth... [2024-06-22 07:33:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162424_2661154816.pth [2024-06-22 07:33:44,294][15401] Updated weights for policy 0, policy_version 163050 (0.0040) [2024-06-22 07:33:47,839][15401] Updated weights for policy 0, policy_version 163060 (0.0032) [2024-06-22 07:33:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42487.7). Total num frames: 2671575040. Throughput: 0: 42646.8. Samples: 2671739920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:33:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 07:33:51,801][15401] Updated weights for policy 0, policy_version 163070 (0.0047) [2024-06-22 07:33:53,069][15349] Signal inference workers to stop experience collection... (39400 times) [2024-06-22 07:33:53,070][15349] Signal inference workers to resume experience collection... (39400 times) [2024-06-22 07:33:53,080][15401] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-06-22 07:33:53,093][15401] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-06-22 07:33:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 2671804416. Throughput: 0: 42780.4. Samples: 2671872500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:33:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 07:33:55,580][15401] Updated weights for policy 0, policy_version 163080 (0.0036) [2024-06-22 07:33:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 2672001024. Throughput: 0: 42742.2. Samples: 2672125880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:33:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 07:33:59,298][15401] Updated weights for policy 0, policy_version 163090 (0.0027) [2024-06-22 07:34:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 2672214016. Throughput: 0: 42808.1. Samples: 2672386060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 07:34:03,568][15401] Updated weights for policy 0, policy_version 163100 (0.0028) [2024-06-22 07:34:07,087][15401] Updated weights for policy 0, policy_version 163110 (0.0039) [2024-06-22 07:34:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2672427008. Throughput: 0: 42859.4. Samples: 2672511480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 07:34:11,422][15401] Updated weights for policy 0, policy_version 163120 (0.0032) [2024-06-22 07:34:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2672640000. Throughput: 0: 42704.4. Samples: 2672764060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 07:34:14,743][15401] Updated weights for policy 0, policy_version 163130 (0.0033) [2024-06-22 07:34:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2672852992. Throughput: 0: 42691.7. Samples: 2673019120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 07:34:19,035][15401] Updated weights for policy 0, policy_version 163140 (0.0033) [2024-06-22 07:34:22,477][15401] Updated weights for policy 0, policy_version 163150 (0.0034) [2024-06-22 07:34:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2673065984. Throughput: 0: 42694.6. Samples: 2673146340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 07:34:26,662][15401] Updated weights for policy 0, policy_version 163160 (0.0032) [2024-06-22 07:34:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42599.0, 300 sec: 42598.4). Total num frames: 2673295360. Throughput: 0: 42723.5. Samples: 2673405100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:28,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 07:34:30,370][15401] Updated weights for policy 0, policy_version 163170 (0.0026) [2024-06-22 07:34:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2673491968. Throughput: 0: 42825.8. Samples: 2673667080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 07:34:34,290][15401] Updated weights for policy 0, policy_version 163180 (0.0031) [2024-06-22 07:34:37,959][15401] Updated weights for policy 0, policy_version 163190 (0.0035) [2024-06-22 07:34:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2673721344. Throughput: 0: 42669.9. Samples: 2673792640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 07:34:41,917][15401] Updated weights for policy 0, policy_version 163200 (0.0036) [2024-06-22 07:34:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2673934336. Throughput: 0: 42764.1. Samples: 2674050260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:34:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 07:34:45,384][15401] Updated weights for policy 0, policy_version 163210 (0.0039) [2024-06-22 07:34:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2674130944. Throughput: 0: 42641.3. Samples: 2674304920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:34:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 07:34:49,709][15401] Updated weights for policy 0, policy_version 163220 (0.0042) [2024-06-22 07:34:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2674343936. Throughput: 0: 42728.5. Samples: 2674434260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:34:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 07:34:53,468][15401] Updated weights for policy 0, policy_version 163230 (0.0033) [2024-06-22 07:34:57,289][15401] Updated weights for policy 0, policy_version 163240 (0.0031) [2024-06-22 07:34:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2674573312. Throughput: 0: 42770.6. Samples: 2674688740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:34:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 07:35:00,945][15401] Updated weights for policy 0, policy_version 163250 (0.0032) [2024-06-22 07:35:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2674769920. Throughput: 0: 42763.5. Samples: 2674943580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:03,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 07:35:04,778][15401] Updated weights for policy 0, policy_version 163260 (0.0023) [2024-06-22 07:35:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2674999296. Throughput: 0: 42780.1. Samples: 2675071440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 07:35:08,505][15401] Updated weights for policy 0, policy_version 163270 (0.0039) [2024-06-22 07:35:12,393][15401] Updated weights for policy 0, policy_version 163280 (0.0040) [2024-06-22 07:35:13,390][15132] Fps is (10 sec: 45885.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2675228672. Throughput: 0: 42773.7. Samples: 2675329920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 07:35:16,422][15401] Updated weights for policy 0, policy_version 163290 (0.0035) [2024-06-22 07:35:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2675425280. Throughput: 0: 42630.6. Samples: 2675585460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:18,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 07:35:20,004][15401] Updated weights for policy 0, policy_version 163300 (0.0036) [2024-06-22 07:35:23,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2675621888. Throughput: 0: 42584.3. Samples: 2675708940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 07:35:24,217][15401] Updated weights for policy 0, policy_version 163310 (0.0037) [2024-06-22 07:35:27,635][15401] Updated weights for policy 0, policy_version 163320 (0.0041) [2024-06-22 07:35:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2675867648. Throughput: 0: 42734.7. Samples: 2675973320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 07:35:31,897][15401] Updated weights for policy 0, policy_version 163330 (0.0040) [2024-06-22 07:35:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2676064256. Throughput: 0: 42806.7. Samples: 2676231220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:33,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 07:35:35,192][15401] Updated weights for policy 0, policy_version 163340 (0.0028) [2024-06-22 07:35:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2676277248. Throughput: 0: 42791.5. Samples: 2676359880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 07:35:39,390][15349] Signal inference workers to stop experience collection... (39450 times) [2024-06-22 07:35:39,390][15349] Signal inference workers to resume experience collection... (39450 times) [2024-06-22 07:35:39,394][15401] Updated weights for policy 0, policy_version 163350 (0.0026) [2024-06-22 07:35:39,414][15401] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-06-22 07:35:39,414][15401] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-06-22 07:35:42,877][15401] Updated weights for policy 0, policy_version 163360 (0.0037) [2024-06-22 07:35:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2676506624. Throughput: 0: 42964.0. Samples: 2676622120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:43,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-22 07:35:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163362_2676523008.pth... [2024-06-22 07:35:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000162735_2666250240.pth [2024-06-22 07:35:46,993][15401] Updated weights for policy 0, policy_version 163370 (0.0024) [2024-06-22 07:35:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 2676719616. Throughput: 0: 42959.7. Samples: 2676876660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:35:48,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 07:35:50,475][15401] Updated weights for policy 0, policy_version 163380 (0.0036) [2024-06-22 07:35:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2676916224. Throughput: 0: 42902.1. Samples: 2677002040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:35:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 07:35:54,915][15401] Updated weights for policy 0, policy_version 163390 (0.0026) [2024-06-22 07:35:58,305][15401] Updated weights for policy 0, policy_version 163400 (0.0031) [2024-06-22 07:35:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2677145600. Throughput: 0: 42952.2. Samples: 2677262760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:35:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 07:36:02,595][15401] Updated weights for policy 0, policy_version 163410 (0.0035) [2024-06-22 07:36:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2677342208. Throughput: 0: 42811.2. Samples: 2677511960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 07:36:05,887][15401] Updated weights for policy 0, policy_version 163420 (0.0051) [2024-06-22 07:36:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2677571584. Throughput: 0: 42815.1. Samples: 2677635620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:08,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 07:36:10,187][15401] Updated weights for policy 0, policy_version 163430 (0.0030) [2024-06-22 07:36:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2677768192. Throughput: 0: 42775.9. Samples: 2677898240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 07:36:13,733][15401] Updated weights for policy 0, policy_version 163440 (0.0028) [2024-06-22 07:36:17,947][15401] Updated weights for policy 0, policy_version 163450 (0.0031) [2024-06-22 07:36:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2677964800. Throughput: 0: 42665.0. Samples: 2678151140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:18,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 07:36:21,214][15401] Updated weights for policy 0, policy_version 163460 (0.0027) [2024-06-22 07:36:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2678210560. Throughput: 0: 42626.2. Samples: 2678278060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:23,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 07:36:25,426][15401] Updated weights for policy 0, policy_version 163470 (0.0029) [2024-06-22 07:36:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 2678407168. Throughput: 0: 42653.2. Samples: 2678541520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:36:28,938][15401] Updated weights for policy 0, policy_version 163480 (0.0034) [2024-06-22 07:36:33,153][15401] Updated weights for policy 0, policy_version 163490 (0.0032) [2024-06-22 07:36:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2678636544. Throughput: 0: 42675.5. Samples: 2678797060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:36:36,447][15401] Updated weights for policy 0, policy_version 163500 (0.0041) [2024-06-22 07:36:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2678849536. Throughput: 0: 42721.0. Samples: 2678924480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:38,391][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 07:36:40,814][15401] Updated weights for policy 0, policy_version 163510 (0.0032) [2024-06-22 07:36:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2679062528. Throughput: 0: 42628.8. Samples: 2679181060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:43,394][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 07:36:44,461][15401] Updated weights for policy 0, policy_version 163520 (0.0046) [2024-06-22 07:36:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2679259136. Throughput: 0: 42767.6. Samples: 2679436500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 07:36:48,634][15401] Updated weights for policy 0, policy_version 163530 (0.0041) [2024-06-22 07:36:52,154][15401] Updated weights for policy 0, policy_version 163540 (0.0037) [2024-06-22 07:36:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 2679488512. Throughput: 0: 42680.9. Samples: 2679556260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-22 07:36:53,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 07:36:56,430][15401] Updated weights for policy 0, policy_version 163550 (0.0026) [2024-06-22 07:36:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2679701504. Throughput: 0: 42604.0. Samples: 2679815420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:36:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 07:36:59,835][15401] Updated weights for policy 0, policy_version 163560 (0.0043) [2024-06-22 07:37:01,502][15349] Signal inference workers to stop experience collection... (39500 times) [2024-06-22 07:37:01,502][15349] Signal inference workers to resume experience collection... (39500 times) [2024-06-22 07:37:01,518][15401] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-06-22 07:37:01,518][15401] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-06-22 07:37:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2679898112. Throughput: 0: 42533.7. Samples: 2680065160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 07:37:03,990][15401] Updated weights for policy 0, policy_version 163570 (0.0041) [2024-06-22 07:37:07,481][15401] Updated weights for policy 0, policy_version 163580 (0.0027) [2024-06-22 07:37:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2680111104. Throughput: 0: 42510.6. Samples: 2680191040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 07:37:11,887][15401] Updated weights for policy 0, policy_version 163590 (0.0037) [2024-06-22 07:37:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2680324096. Throughput: 0: 42431.8. Samples: 2680450940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 07:37:15,007][15401] Updated weights for policy 0, policy_version 163600 (0.0036) [2024-06-22 07:37:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2680553472. Throughput: 0: 42242.6. Samples: 2680697980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 07:37:19,459][15401] Updated weights for policy 0, policy_version 163610 (0.0026) [2024-06-22 07:37:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2680733696. Throughput: 0: 42256.4. Samples: 2680826020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 07:37:23,447][15401] Updated weights for policy 0, policy_version 163620 (0.0027) [2024-06-22 07:37:27,318][15401] Updated weights for policy 0, policy_version 163630 (0.0038) [2024-06-22 07:37:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2680946688. Throughput: 0: 42294.7. Samples: 2681084320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 07:37:30,923][15401] Updated weights for policy 0, policy_version 163640 (0.0038) [2024-06-22 07:37:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2681176064. Throughput: 0: 42233.6. Samples: 2681337020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 07:37:35,110][15401] Updated weights for policy 0, policy_version 163650 (0.0043) [2024-06-22 07:37:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2681389056. Throughput: 0: 42365.3. Samples: 2681462700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 07:37:38,532][15401] Updated weights for policy 0, policy_version 163660 (0.0032) [2024-06-22 07:37:42,733][15401] Updated weights for policy 0, policy_version 163670 (0.0033) [2024-06-22 07:37:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2681585664. Throughput: 0: 42331.3. Samples: 2681720320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 07:37:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163672_2681602048.pth... [2024-06-22 07:37:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163048_2671378432.pth [2024-06-22 07:37:46,167][15401] Updated weights for policy 0, policy_version 163680 (0.0036) [2024-06-22 07:37:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2681815040. Throughput: 0: 42456.4. Samples: 2681975700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 07:37:50,572][15401] Updated weights for policy 0, policy_version 163690 (0.0025) [2024-06-22 07:37:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2681995264. Throughput: 0: 42459.9. Samples: 2682101740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 07:37:54,215][15401] Updated weights for policy 0, policy_version 163700 (0.0040) [2024-06-22 07:37:58,247][15401] Updated weights for policy 0, policy_version 163710 (0.0036) [2024-06-22 07:37:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 2682224640. Throughput: 0: 42221.8. Samples: 2682350920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 07:37:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 07:38:01,852][15401] Updated weights for policy 0, policy_version 163720 (0.0032) [2024-06-22 07:38:03,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2682454016. Throughput: 0: 42308.6. Samples: 2682601860. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 07:38:05,907][15401] Updated weights for policy 0, policy_version 163730 (0.0032) [2024-06-22 07:38:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 2682617856. Throughput: 0: 42504.9. Samples: 2682738740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 07:38:09,541][15401] Updated weights for policy 0, policy_version 163740 (0.0033) [2024-06-22 07:38:13,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 2682863616. Throughput: 0: 42354.9. Samples: 2682990300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 07:38:13,560][15401] Updated weights for policy 0, policy_version 163750 (0.0037) [2024-06-22 07:38:17,218][15401] Updated weights for policy 0, policy_version 163760 (0.0036) [2024-06-22 07:38:18,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2683109376. Throughput: 0: 42331.7. Samples: 2683241940. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:18,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 07:38:20,947][15401] Updated weights for policy 0, policy_version 163770 (0.0040) [2024-06-22 07:38:23,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42320.8, 300 sec: 42486.5). Total num frames: 2683273216. Throughput: 0: 42459.7. Samples: 2683373660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:23,397][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 07:38:24,863][15401] Updated weights for policy 0, policy_version 163780 (0.0038) [2024-06-22 07:38:25,740][15349] Signal inference workers to stop experience collection... (39550 times) [2024-06-22 07:38:25,779][15401] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-06-22 07:38:25,798][15349] Signal inference workers to resume experience collection... (39550 times) [2024-06-22 07:38:25,799][15401] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-06-22 07:38:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2683518976. Throughput: 0: 42456.0. Samples: 2683630840. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:28,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 07:38:28,679][15401] Updated weights for policy 0, policy_version 163790 (0.0035) [2024-06-22 07:38:32,588][15401] Updated weights for policy 0, policy_version 163800 (0.0041) [2024-06-22 07:38:33,390][15132] Fps is (10 sec: 45904.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2683731968. Throughput: 0: 42492.9. Samples: 2683887880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 07:38:36,180][15401] Updated weights for policy 0, policy_version 163810 (0.0041) [2024-06-22 07:38:38,395][15132] Fps is (10 sec: 39301.1, 60 sec: 42048.7, 300 sec: 42542.1). Total num frames: 2683912192. Throughput: 0: 42514.4. Samples: 2684015100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:38,395][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 07:38:40,425][15401] Updated weights for policy 0, policy_version 163820 (0.0047) [2024-06-22 07:38:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2684174336. Throughput: 0: 42613.7. Samples: 2684268540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:43,396][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 07:38:43,759][15401] Updated weights for policy 0, policy_version 163830 (0.0024) [2024-06-22 07:38:48,147][15401] Updated weights for policy 0, policy_version 163840 (0.0042) [2024-06-22 07:38:48,389][15132] Fps is (10 sec: 44259.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2684354560. Throughput: 0: 42844.8. Samples: 2684529880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 07:38:51,753][15401] Updated weights for policy 0, policy_version 163850 (0.0027) [2024-06-22 07:38:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 2684567552. Throughput: 0: 42494.1. Samples: 2684651080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:53,401][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 07:38:55,660][15401] Updated weights for policy 0, policy_version 163860 (0.0034) [2024-06-22 07:38:58,391][15132] Fps is (10 sec: 47506.5, 60 sec: 43416.5, 300 sec: 42764.8). Total num frames: 2684829696. Throughput: 0: 42794.8. Samples: 2684916120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:38:58,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 07:38:59,353][15401] Updated weights for policy 0, policy_version 163870 (0.0023) [2024-06-22 07:39:03,271][15401] Updated weights for policy 0, policy_version 163880 (0.0033) [2024-06-22 07:39:03,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 2685009920. Throughput: 0: 42998.1. Samples: 2685176960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-22 07:39:03,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 07:39:06,862][15401] Updated weights for policy 0, policy_version 163890 (0.0031) [2024-06-22 07:39:08,389][15132] Fps is (10 sec: 39327.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2685222912. Throughput: 0: 42874.2. Samples: 2685302720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 07:39:10,708][15401] Updated weights for policy 0, policy_version 163900 (0.0035) [2024-06-22 07:39:13,389][15132] Fps is (10 sec: 45886.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2685468672. Throughput: 0: 42918.6. Samples: 2685562180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 07:39:14,348][15401] Updated weights for policy 0, policy_version 163910 (0.0028) [2024-06-22 07:39:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 2685648896. Throughput: 0: 43010.7. Samples: 2685823460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:18,393][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 07:39:18,562][15401] Updated weights for policy 0, policy_version 163920 (0.0041) [2024-06-22 07:39:21,822][15401] Updated weights for policy 0, policy_version 163930 (0.0025) [2024-06-22 07:39:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43149.2, 300 sec: 42598.4). Total num frames: 2685861888. Throughput: 0: 42935.6. Samples: 2685946980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 07:39:26,266][15401] Updated weights for policy 0, policy_version 163940 (0.0042) [2024-06-22 07:39:28,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2686091264. Throughput: 0: 43032.5. Samples: 2686205000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:28,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 07:39:29,196][15401] Updated weights for policy 0, policy_version 163950 (0.0048) [2024-06-22 07:39:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2686304256. Throughput: 0: 43042.6. Samples: 2686466800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 07:39:33,694][15401] Updated weights for policy 0, policy_version 163960 (0.0044) [2024-06-22 07:39:37,226][15401] Updated weights for policy 0, policy_version 163970 (0.0031) [2024-06-22 07:39:38,393][15132] Fps is (10 sec: 42581.8, 60 sec: 43418.5, 300 sec: 42653.4). Total num frames: 2686517248. Throughput: 0: 43049.3. Samples: 2686588360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:38,394][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:39:41,238][15401] Updated weights for policy 0, policy_version 163980 (0.0038) [2024-06-22 07:39:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2686746624. Throughput: 0: 43181.5. Samples: 2686859220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 07:39:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163987_2686763008.pth... [2024-06-22 07:39:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163362_2676523008.pth [2024-06-22 07:39:44,662][15401] Updated weights for policy 0, policy_version 163990 (0.0041) [2024-06-22 07:39:48,389][15132] Fps is (10 sec: 40975.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2686926848. Throughput: 0: 42987.6. Samples: 2687111300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:39:49,060][15401] Updated weights for policy 0, policy_version 164000 (0.0040) [2024-06-22 07:39:52,122][15401] Updated weights for policy 0, policy_version 164010 (0.0038) [2024-06-22 07:39:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43419.2, 300 sec: 42709.5). Total num frames: 2687172608. Throughput: 0: 43023.4. Samples: 2687238780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:53,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 07:39:56,483][15401] Updated weights for policy 0, policy_version 164020 (0.0027) [2024-06-22 07:39:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42326.3, 300 sec: 42709.8). Total num frames: 2687369216. Throughput: 0: 43063.0. Samples: 2687500020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:39:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 07:39:59,801][15401] Updated weights for policy 0, policy_version 164030 (0.0033) [2024-06-22 07:40:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 2687565824. Throughput: 0: 43077.1. Samples: 2687761820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:40:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 07:40:03,594][15349] Signal inference workers to stop experience collection... (39600 times) [2024-06-22 07:40:03,600][15349] Signal inference workers to resume experience collection... (39600 times) [2024-06-22 07:40:03,645][15401] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-06-22 07:40:03,646][15401] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-06-22 07:40:04,000][15401] Updated weights for policy 0, policy_version 164040 (0.0037) [2024-06-22 07:40:07,465][15401] Updated weights for policy 0, policy_version 164050 (0.0023) [2024-06-22 07:40:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2687811584. Throughput: 0: 43099.6. Samples: 2687886460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 07:40:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 07:40:11,500][15401] Updated weights for policy 0, policy_version 164060 (0.0035) [2024-06-22 07:40:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2688024576. Throughput: 0: 42997.3. Samples: 2688139880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 07:40:15,475][15401] Updated weights for policy 0, policy_version 164070 (0.0030) [2024-06-22 07:40:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2688221184. Throughput: 0: 43068.0. Samples: 2688404860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 07:40:19,533][15401] Updated weights for policy 0, policy_version 164080 (0.0034) [2024-06-22 07:40:23,276][15401] Updated weights for policy 0, policy_version 164090 (0.0040) [2024-06-22 07:40:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2688450560. Throughput: 0: 43156.5. Samples: 2688530240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 07:40:27,150][15401] Updated weights for policy 0, policy_version 164100 (0.0030) [2024-06-22 07:40:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2688663552. Throughput: 0: 42724.0. Samples: 2688781800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 07:40:30,925][15401] Updated weights for policy 0, policy_version 164110 (0.0035) [2024-06-22 07:40:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2688876544. Throughput: 0: 42840.3. Samples: 2689039120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 07:40:34,754][15401] Updated weights for policy 0, policy_version 164120 (0.0028) [2024-06-22 07:40:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42874.2, 300 sec: 42653.9). Total num frames: 2689089536. Throughput: 0: 42847.6. Samples: 2689166920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 07:40:38,606][15401] Updated weights for policy 0, policy_version 164130 (0.0041) [2024-06-22 07:40:43,003][15401] Updated weights for policy 0, policy_version 164140 (0.0040) [2024-06-22 07:40:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2689286144. Throughput: 0: 42773.9. Samples: 2689424840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 07:40:46,269][15401] Updated weights for policy 0, policy_version 164150 (0.0032) [2024-06-22 07:40:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2689515520. Throughput: 0: 42505.7. Samples: 2689674580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 07:40:50,631][15401] Updated weights for policy 0, policy_version 164160 (0.0025) [2024-06-22 07:40:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2689744896. Throughput: 0: 42640.8. Samples: 2689805300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 07:40:53,843][15401] Updated weights for policy 0, policy_version 164170 (0.0034) [2024-06-22 07:40:58,301][15401] Updated weights for policy 0, policy_version 164180 (0.0024) [2024-06-22 07:40:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 2689925120. Throughput: 0: 42662.4. Samples: 2690059680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:40:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 07:41:02,095][15401] Updated weights for policy 0, policy_version 164190 (0.0036) [2024-06-22 07:41:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2690138112. Throughput: 0: 42563.7. Samples: 2690320220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:41:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 07:41:05,696][15401] Updated weights for policy 0, policy_version 164200 (0.0035) [2024-06-22 07:41:08,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2690383872. Throughput: 0: 42674.7. Samples: 2690450600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:41:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 07:41:09,425][15401] Updated weights for policy 0, policy_version 164210 (0.0038) [2024-06-22 07:41:13,020][15401] Updated weights for policy 0, policy_version 164220 (0.0034) [2024-06-22 07:41:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2690580480. Throughput: 0: 42788.4. Samples: 2690707280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 07:41:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 07:41:16,816][15401] Updated weights for policy 0, policy_version 164230 (0.0037) [2024-06-22 07:41:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2690793472. Throughput: 0: 42951.2. Samples: 2690971920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:41:19,339][15349] Signal inference workers to stop experience collection... (39650 times) [2024-06-22 07:41:19,343][15349] Signal inference workers to resume experience collection... (39650 times) [2024-06-22 07:41:19,400][15401] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-06-22 07:41:19,400][15401] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-06-22 07:41:20,924][15401] Updated weights for policy 0, policy_version 164240 (0.0040) [2024-06-22 07:41:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2691006464. Throughput: 0: 42942.3. Samples: 2691099320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 07:41:24,339][15401] Updated weights for policy 0, policy_version 164250 (0.0039) [2024-06-22 07:41:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2691219456. Throughput: 0: 42847.6. Samples: 2691352980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 07:41:28,462][15401] Updated weights for policy 0, policy_version 164260 (0.0025) [2024-06-22 07:41:31,714][15401] Updated weights for policy 0, policy_version 164270 (0.0039) [2024-06-22 07:41:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2691432448. Throughput: 0: 43063.1. Samples: 2691612420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:33,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:41:35,827][15401] Updated weights for policy 0, policy_version 164280 (0.0041) [2024-06-22 07:41:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2691661824. Throughput: 0: 43119.3. Samples: 2691745660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 07:41:39,710][15401] Updated weights for policy 0, policy_version 164290 (0.0048) [2024-06-22 07:41:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2691874816. Throughput: 0: 43228.2. Samples: 2692004960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 07:41:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164299_2691874816.pth... [2024-06-22 07:41:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163672_2681602048.pth [2024-06-22 07:41:43,634][15401] Updated weights for policy 0, policy_version 164300 (0.0034) [2024-06-22 07:41:47,371][15401] Updated weights for policy 0, policy_version 164310 (0.0031) [2024-06-22 07:41:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2692071424. Throughput: 0: 43210.1. Samples: 2692264680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:48,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 07:41:51,387][15401] Updated weights for policy 0, policy_version 164320 (0.0030) [2024-06-22 07:41:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2692317184. Throughput: 0: 43066.8. Samples: 2692388600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 07:41:54,907][15401] Updated weights for policy 0, policy_version 164330 (0.0029) [2024-06-22 07:41:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2692513792. Throughput: 0: 42999.1. Samples: 2692642240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:41:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 07:41:59,007][15401] Updated weights for policy 0, policy_version 164340 (0.0030) [2024-06-22 07:42:03,005][15401] Updated weights for policy 0, policy_version 164350 (0.0033) [2024-06-22 07:42:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2692710400. Throughput: 0: 42920.9. Samples: 2692903360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:42:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 07:42:06,444][15401] Updated weights for policy 0, policy_version 164360 (0.0038) [2024-06-22 07:42:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 2692972544. Throughput: 0: 42911.6. Samples: 2693030340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:42:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 07:42:10,554][15401] Updated weights for policy 0, policy_version 164370 (0.0042) [2024-06-22 07:42:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2693169152. Throughput: 0: 43062.5. Samples: 2693290800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:42:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 07:42:13,991][15401] Updated weights for policy 0, policy_version 164380 (0.0038) [2024-06-22 07:42:18,103][15401] Updated weights for policy 0, policy_version 164390 (0.0031) [2024-06-22 07:42:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2693365760. Throughput: 0: 42817.8. Samples: 2693539220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 07:42:18,394][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 07:42:21,855][15401] Updated weights for policy 0, policy_version 164400 (0.0032) [2024-06-22 07:42:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2693595136. Throughput: 0: 42668.5. Samples: 2693665740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 07:42:25,729][15401] Updated weights for policy 0, policy_version 164410 (0.0036) [2024-06-22 07:42:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2693791744. Throughput: 0: 42723.2. Samples: 2693927600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:28,392][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 07:42:29,403][15401] Updated weights for policy 0, policy_version 164420 (0.0040) [2024-06-22 07:42:33,386][15401] Updated weights for policy 0, policy_version 164430 (0.0049) [2024-06-22 07:42:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2694021120. Throughput: 0: 42412.1. Samples: 2694173220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 07:42:37,073][15401] Updated weights for policy 0, policy_version 164440 (0.0040) [2024-06-22 07:42:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2694234112. Throughput: 0: 42547.5. Samples: 2694303240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 07:42:38,604][15349] Signal inference workers to stop experience collection... (39700 times) [2024-06-22 07:42:38,604][15349] Signal inference workers to resume experience collection... (39700 times) [2024-06-22 07:42:38,630][15401] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-06-22 07:42:38,630][15401] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-06-22 07:42:40,994][15401] Updated weights for policy 0, policy_version 164450 (0.0043) [2024-06-22 07:42:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2694414336. Throughput: 0: 42605.7. Samples: 2694559500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:43,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 07:42:44,723][15401] Updated weights for policy 0, policy_version 164460 (0.0031) [2024-06-22 07:42:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2694643712. Throughput: 0: 42421.0. Samples: 2694812300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 07:42:48,862][15401] Updated weights for policy 0, policy_version 164470 (0.0034) [2024-06-22 07:42:52,809][15401] Updated weights for policy 0, policy_version 164480 (0.0038) [2024-06-22 07:42:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2694856704. Throughput: 0: 42516.9. Samples: 2694943600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 07:42:56,334][15401] Updated weights for policy 0, policy_version 164490 (0.0035) [2024-06-22 07:42:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2695053312. Throughput: 0: 42386.7. Samples: 2695198200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:42:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 07:43:00,735][15401] Updated weights for policy 0, policy_version 164500 (0.0022) [2024-06-22 07:43:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2695299072. Throughput: 0: 42466.7. Samples: 2695450220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:43:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 07:43:03,663][15401] Updated weights for policy 0, policy_version 164510 (0.0027) [2024-06-22 07:43:08,275][15401] Updated weights for policy 0, policy_version 164520 (0.0044) [2024-06-22 07:43:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2695495680. Throughput: 0: 42791.1. Samples: 2695591340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:43:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 07:43:11,455][15401] Updated weights for policy 0, policy_version 164530 (0.0041) [2024-06-22 07:43:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2695708672. Throughput: 0: 42501.3. Samples: 2695840060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:43:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 07:43:15,659][15401] Updated weights for policy 0, policy_version 164540 (0.0030) [2024-06-22 07:43:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42988.1). Total num frames: 2695954432. Throughput: 0: 42759.5. Samples: 2696097400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:43:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 07:43:18,920][15401] Updated weights for policy 0, policy_version 164550 (0.0041) [2024-06-22 07:43:23,340][15401] Updated weights for policy 0, policy_version 164560 (0.0035) [2024-06-22 07:43:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2696151040. Throughput: 0: 42925.9. Samples: 2696234900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 07:43:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 07:43:26,377][15401] Updated weights for policy 0, policy_version 164570 (0.0033) [2024-06-22 07:43:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2696364032. Throughput: 0: 42811.7. Samples: 2696486020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 07:43:30,922][15401] Updated weights for policy 0, policy_version 164580 (0.0034) [2024-06-22 07:43:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 43043.5). Total num frames: 2696609792. Throughput: 0: 42930.1. Samples: 2696744160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:33,404][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 07:43:33,905][15401] Updated weights for policy 0, policy_version 164590 (0.0031) [2024-06-22 07:43:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2696790016. Throughput: 0: 43084.0. Samples: 2696882380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 07:43:38,409][15401] Updated weights for policy 0, policy_version 164600 (0.0041) [2024-06-22 07:43:41,542][15401] Updated weights for policy 0, policy_version 164610 (0.0036) [2024-06-22 07:43:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2697019392. Throughput: 0: 42969.0. Samples: 2697131800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 07:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164613_2697019392.pth... [2024-06-22 07:43:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000163987_2686763008.pth [2024-06-22 07:43:45,934][15401] Updated weights for policy 0, policy_version 164620 (0.0026) [2024-06-22 07:43:48,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43690.5, 300 sec: 43043.0). Total num frames: 2697265152. Throughput: 0: 43128.3. Samples: 2697391000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:48,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 07:43:49,166][15401] Updated weights for policy 0, policy_version 164630 (0.0035) [2024-06-22 07:43:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 2697445376. Throughput: 0: 42947.0. Samples: 2697523960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 07:43:53,964][15401] Updated weights for policy 0, policy_version 164640 (0.0033) [2024-06-22 07:43:56,880][15401] Updated weights for policy 0, policy_version 164650 (0.0041) [2024-06-22 07:43:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43690.7, 300 sec: 42932.0). Total num frames: 2697674752. Throughput: 0: 43058.2. Samples: 2697777680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:43:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 07:44:01,523][15401] Updated weights for policy 0, policy_version 164660 (0.0036) [2024-06-22 07:44:02,935][15349] Signal inference workers to stop experience collection... (39750 times) [2024-06-22 07:44:02,980][15401] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-06-22 07:44:02,989][15349] Signal inference workers to resume experience collection... (39750 times) [2024-06-22 07:44:02,998][15401] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-06-22 07:44:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2697887744. Throughput: 0: 43101.4. Samples: 2698036960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:03,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-22 07:44:04,484][15401] Updated weights for policy 0, policy_version 164670 (0.0031) [2024-06-22 07:44:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2698084352. Throughput: 0: 42927.9. Samples: 2698166660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:08,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 07:44:09,019][15401] Updated weights for policy 0, policy_version 164680 (0.0042) [2024-06-22 07:44:12,512][15401] Updated weights for policy 0, policy_version 164690 (0.0037) [2024-06-22 07:44:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 2698313728. Throughput: 0: 42888.4. Samples: 2698416000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 07:44:17,054][15401] Updated weights for policy 0, policy_version 164700 (0.0027) [2024-06-22 07:44:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2698526720. Throughput: 0: 42911.1. Samples: 2698675160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:18,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-22 07:44:20,278][15401] Updated weights for policy 0, policy_version 164710 (0.0027) [2024-06-22 07:44:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2698706944. Throughput: 0: 42600.1. Samples: 2698799380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 07:44:24,715][15401] Updated weights for policy 0, policy_version 164720 (0.0034) [2024-06-22 07:44:27,974][15401] Updated weights for policy 0, policy_version 164730 (0.0042) [2024-06-22 07:44:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2698952704. Throughput: 0: 42801.4. Samples: 2699057860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 07:44:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 07:44:32,213][15401] Updated weights for policy 0, policy_version 164740 (0.0029) [2024-06-22 07:44:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42821.1). Total num frames: 2699149312. Throughput: 0: 42849.4. Samples: 2699319220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 07:44:35,901][15401] Updated weights for policy 0, policy_version 164750 (0.0035) [2024-06-22 07:44:38,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2699345920. Throughput: 0: 42613.3. Samples: 2699441560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 07:44:39,827][15401] Updated weights for policy 0, policy_version 164760 (0.0034) [2024-06-22 07:44:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2699558912. Throughput: 0: 42690.1. Samples: 2699698740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 07:44:43,575][15401] Updated weights for policy 0, policy_version 164770 (0.0041) [2024-06-22 07:44:47,315][15401] Updated weights for policy 0, policy_version 164780 (0.0028) [2024-06-22 07:44:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2699788288. Throughput: 0: 42685.7. Samples: 2699957820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 07:44:51,168][15401] Updated weights for policy 0, policy_version 164790 (0.0036) [2024-06-22 07:44:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2700001280. Throughput: 0: 42681.8. Samples: 2700087340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 07:44:54,819][15401] Updated weights for policy 0, policy_version 164800 (0.0038) [2024-06-22 07:44:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2700214272. Throughput: 0: 42733.7. Samples: 2700339020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:44:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 07:44:58,783][15401] Updated weights for policy 0, policy_version 164810 (0.0026) [2024-06-22 07:45:02,486][15401] Updated weights for policy 0, policy_version 164820 (0.0029) [2024-06-22 07:45:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2700427264. Throughput: 0: 42681.3. Samples: 2700595820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 07:45:06,480][15401] Updated weights for policy 0, policy_version 164830 (0.0042) [2024-06-22 07:45:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2700656640. Throughput: 0: 42752.8. Samples: 2700723260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 07:45:10,807][15401] Updated weights for policy 0, policy_version 164840 (0.0033) [2024-06-22 07:45:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2700836864. Throughput: 0: 42576.9. Samples: 2700973820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 07:45:14,191][15401] Updated weights for policy 0, policy_version 164850 (0.0039) [2024-06-22 07:45:18,287][15401] Updated weights for policy 0, policy_version 164860 (0.0027) [2024-06-22 07:45:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42765.1). Total num frames: 2701066240. Throughput: 0: 42684.2. Samples: 2701240000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 07:45:21,825][15401] Updated weights for policy 0, policy_version 164870 (0.0031) [2024-06-22 07:45:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2701295616. Throughput: 0: 42837.5. Samples: 2701369240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 07:45:25,779][15401] Updated weights for policy 0, policy_version 164880 (0.0025) [2024-06-22 07:45:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2701475840. Throughput: 0: 42751.8. Samples: 2701622560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 07:45:29,397][15401] Updated weights for policy 0, policy_version 164890 (0.0033) [2024-06-22 07:45:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2701705216. Throughput: 0: 42725.9. Samples: 2701880480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 07:45:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 07:45:33,396][15401] Updated weights for policy 0, policy_version 164900 (0.0036) [2024-06-22 07:45:36,384][15349] Signal inference workers to stop experience collection... (39800 times) [2024-06-22 07:45:36,384][15349] Signal inference workers to resume experience collection... (39800 times) [2024-06-22 07:45:36,404][15401] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-06-22 07:45:36,404][15401] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-06-22 07:45:37,145][15401] Updated weights for policy 0, policy_version 164910 (0.0025) [2024-06-22 07:45:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2701934592. Throughput: 0: 42744.9. Samples: 2702010860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:45:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:45:40,995][15401] Updated weights for policy 0, policy_version 164920 (0.0035) [2024-06-22 07:45:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2702131200. Throughput: 0: 42732.1. Samples: 2702261960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:45:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 07:45:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164925_2702131200.pth... [2024-06-22 07:45:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164299_2691874816.pth [2024-06-22 07:45:44,849][15401] Updated weights for policy 0, policy_version 164930 (0.0039) [2024-06-22 07:45:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2702344192. Throughput: 0: 42861.0. Samples: 2702524560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:45:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 07:45:48,655][15401] Updated weights for policy 0, policy_version 164940 (0.0028) [2024-06-22 07:45:52,478][15401] Updated weights for policy 0, policy_version 164950 (0.0032) [2024-06-22 07:45:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2702589952. Throughput: 0: 42850.7. Samples: 2702651540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:45:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 07:45:56,409][15401] Updated weights for policy 0, policy_version 164960 (0.0031) [2024-06-22 07:45:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2702770176. Throughput: 0: 42906.6. Samples: 2702904720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:45:58,392][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 07:46:00,146][15401] Updated weights for policy 0, policy_version 164970 (0.0038) [2024-06-22 07:46:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2702999552. Throughput: 0: 42719.0. Samples: 2703162360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 07:46:04,083][15401] Updated weights for policy 0, policy_version 164980 (0.0041) [2024-06-22 07:46:07,739][15401] Updated weights for policy 0, policy_version 164990 (0.0027) [2024-06-22 07:46:08,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2703228928. Throughput: 0: 42749.4. Samples: 2703292960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 07:46:11,777][15401] Updated weights for policy 0, policy_version 165000 (0.0046) [2024-06-22 07:46:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2703425536. Throughput: 0: 42763.8. Samples: 2703546940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 07:46:15,314][15401] Updated weights for policy 0, policy_version 165010 (0.0031) [2024-06-22 07:46:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2703622144. Throughput: 0: 42867.0. Samples: 2703809500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 07:46:19,450][15401] Updated weights for policy 0, policy_version 165020 (0.0036) [2024-06-22 07:46:22,742][15401] Updated weights for policy 0, policy_version 165030 (0.0037) [2024-06-22 07:46:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2703884288. Throughput: 0: 42784.0. Samples: 2703936140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 07:46:27,010][15401] Updated weights for policy 0, policy_version 165040 (0.0039) [2024-06-22 07:46:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2704064512. Throughput: 0: 42919.5. Samples: 2704193340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 07:46:30,446][15401] Updated weights for policy 0, policy_version 165050 (0.0037) [2024-06-22 07:46:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2704277504. Throughput: 0: 42981.7. Samples: 2704458740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 07:46:34,671][15401] Updated weights for policy 0, policy_version 165060 (0.0030) [2024-06-22 07:46:37,979][15401] Updated weights for policy 0, policy_version 165070 (0.0027) [2024-06-22 07:46:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2704539648. Throughput: 0: 42966.7. Samples: 2704585040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 07:46:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 07:46:42,096][15401] Updated weights for policy 0, policy_version 165080 (0.0058) [2024-06-22 07:46:43,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 2704736256. Throughput: 0: 43071.6. Samples: 2704842940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:46:43,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 07:46:45,249][15349] Signal inference workers to stop experience collection... (39850 times) [2024-06-22 07:46:45,251][15349] Signal inference workers to resume experience collection... (39850 times) [2024-06-22 07:46:45,280][15401] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-06-22 07:46:45,281][15401] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-06-22 07:46:45,541][15401] Updated weights for policy 0, policy_version 165090 (0.0033) [2024-06-22 07:46:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2704916480. Throughput: 0: 43151.1. Samples: 2705104160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:46:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 07:46:49,548][15401] Updated weights for policy 0, policy_version 165100 (0.0037) [2024-06-22 07:46:53,160][15401] Updated weights for policy 0, policy_version 165110 (0.0036) [2024-06-22 07:46:53,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2705162240. Throughput: 0: 43059.0. Samples: 2705230620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:46:53,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-22 07:46:57,105][15401] Updated weights for policy 0, policy_version 165120 (0.0037) [2024-06-22 07:46:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 2705358848. Throughput: 0: 43069.7. Samples: 2705485080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:46:58,391][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 07:47:01,093][15401] Updated weights for policy 0, policy_version 165130 (0.0036) [2024-06-22 07:47:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2705571840. Throughput: 0: 42956.4. Samples: 2705742540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:03,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 07:47:04,959][15401] Updated weights for policy 0, policy_version 165140 (0.0035) [2024-06-22 07:47:08,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 2705784832. Throughput: 0: 43010.1. Samples: 2705871700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:08,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 07:47:08,721][15401] Updated weights for policy 0, policy_version 165150 (0.0036) [2024-06-22 07:47:12,479][15401] Updated weights for policy 0, policy_version 165160 (0.0047) [2024-06-22 07:47:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2705997824. Throughput: 0: 43096.0. Samples: 2706132660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 07:47:16,138][15401] Updated weights for policy 0, policy_version 165170 (0.0027) [2024-06-22 07:47:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2706210816. Throughput: 0: 42998.2. Samples: 2706393660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 07:47:20,105][15401] Updated weights for policy 0, policy_version 165180 (0.0036) [2024-06-22 07:47:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 2706440192. Throughput: 0: 42956.6. Samples: 2706518080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 07:47:23,688][15401] Updated weights for policy 0, policy_version 165190 (0.0030) [2024-06-22 07:47:27,827][15401] Updated weights for policy 0, policy_version 165200 (0.0033) [2024-06-22 07:47:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2706653184. Throughput: 0: 42933.8. Samples: 2706774860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 07:47:31,557][15401] Updated weights for policy 0, policy_version 165210 (0.0050) [2024-06-22 07:47:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2706849792. Throughput: 0: 42684.9. Samples: 2707024980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 07:47:36,046][15401] Updated weights for policy 0, policy_version 165220 (0.0035) [2024-06-22 07:47:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2707079168. Throughput: 0: 42725.8. Samples: 2707153280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 07:47:39,304][15401] Updated weights for policy 0, policy_version 165230 (0.0038) [2024-06-22 07:47:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 2707275776. Throughput: 0: 42730.7. Samples: 2707407960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 07:47:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 07:47:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165239_2707275776.pth... [2024-06-22 07:47:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164613_2697019392.pth [2024-06-22 07:47:43,665][15401] Updated weights for policy 0, policy_version 165240 (0.0036) [2024-06-22 07:47:46,874][15401] Updated weights for policy 0, policy_version 165250 (0.0021) [2024-06-22 07:47:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2707505152. Throughput: 0: 42718.1. Samples: 2707664860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:47:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 07:47:51,214][15401] Updated weights for policy 0, policy_version 165260 (0.0033) [2024-06-22 07:47:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2707701760. Throughput: 0: 42758.3. Samples: 2707795720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:47:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 07:47:54,838][15401] Updated weights for policy 0, policy_version 165270 (0.0033) [2024-06-22 07:47:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2707931136. Throughput: 0: 42612.8. Samples: 2708050240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:47:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 07:47:58,730][15401] Updated weights for policy 0, policy_version 165280 (0.0053) [2024-06-22 07:48:02,447][15401] Updated weights for policy 0, policy_version 165290 (0.0031) [2024-06-22 07:48:03,390][15132] Fps is (10 sec: 44233.9, 60 sec: 42871.0, 300 sec: 42876.0). Total num frames: 2708144128. Throughput: 0: 42526.6. Samples: 2708307380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:03,391][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 07:48:06,433][15401] Updated weights for policy 0, policy_version 165300 (0.0038) [2024-06-22 07:48:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2708357120. Throughput: 0: 42770.2. Samples: 2708442740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 07:48:10,056][15401] Updated weights for policy 0, policy_version 165310 (0.0027) [2024-06-22 07:48:13,392][15132] Fps is (10 sec: 42591.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2708570112. Throughput: 0: 42720.9. Samples: 2708697400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:13,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 07:48:13,818][15401] Updated weights for policy 0, policy_version 165320 (0.0033) [2024-06-22 07:48:17,547][15401] Updated weights for policy 0, policy_version 165330 (0.0026) [2024-06-22 07:48:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2708799488. Throughput: 0: 42871.9. Samples: 2708954220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:18,391][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 07:48:21,311][15401] Updated weights for policy 0, policy_version 165340 (0.0034) [2024-06-22 07:48:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2708996096. Throughput: 0: 42893.9. Samples: 2709083500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 07:48:25,263][15401] Updated weights for policy 0, policy_version 165350 (0.0043) [2024-06-22 07:48:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2709192704. Throughput: 0: 42957.0. Samples: 2709341020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 07:48:28,499][15349] Signal inference workers to stop experience collection... (39900 times) [2024-06-22 07:48:28,544][15401] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-06-22 07:48:28,552][15349] Signal inference workers to resume experience collection... (39900 times) [2024-06-22 07:48:28,561][15401] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-06-22 07:48:28,832][15401] Updated weights for policy 0, policy_version 165360 (0.0033) [2024-06-22 07:48:32,767][15401] Updated weights for policy 0, policy_version 165370 (0.0030) [2024-06-22 07:48:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2709438464. Throughput: 0: 42992.1. Samples: 2709599500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 07:48:36,421][15401] Updated weights for policy 0, policy_version 165380 (0.0041) [2024-06-22 07:48:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2709635072. Throughput: 0: 42882.3. Samples: 2709725420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 07:48:40,818][15401] Updated weights for policy 0, policy_version 165390 (0.0028) [2024-06-22 07:48:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2709848064. Throughput: 0: 42904.9. Samples: 2709980960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 07:48:44,110][15401] Updated weights for policy 0, policy_version 165400 (0.0027) [2024-06-22 07:48:48,324][15401] Updated weights for policy 0, policy_version 165410 (0.0028) [2024-06-22 07:48:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2710077440. Throughput: 0: 42832.2. Samples: 2710234800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 07:48:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 07:48:51,874][15401] Updated weights for policy 0, policy_version 165420 (0.0023) [2024-06-22 07:48:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2710290432. Throughput: 0: 42800.8. Samples: 2710368780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:48:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 07:48:55,814][15401] Updated weights for policy 0, policy_version 165430 (0.0023) [2024-06-22 07:48:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2710503424. Throughput: 0: 42850.2. Samples: 2710625560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:48:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 07:48:59,716][15401] Updated weights for policy 0, policy_version 165440 (0.0038) [2024-06-22 07:49:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42872.0, 300 sec: 42820.6). Total num frames: 2710716416. Throughput: 0: 42883.8. Samples: 2710883980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 07:49:03,432][15401] Updated weights for policy 0, policy_version 165450 (0.0040) [2024-06-22 07:49:07,231][15401] Updated weights for policy 0, policy_version 165460 (0.0042) [2024-06-22 07:49:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2710929408. Throughput: 0: 42903.1. Samples: 2711014140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 07:49:11,185][15401] Updated weights for policy 0, policy_version 165470 (0.0027) [2024-06-22 07:49:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2711142400. Throughput: 0: 42764.8. Samples: 2711265440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 07:49:14,767][15401] Updated weights for policy 0, policy_version 165480 (0.0023) [2024-06-22 07:49:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 2711371776. Throughput: 0: 42787.9. Samples: 2711524960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 07:49:19,212][15401] Updated weights for policy 0, policy_version 165490 (0.0031) [2024-06-22 07:49:22,448][15401] Updated weights for policy 0, policy_version 165500 (0.0033) [2024-06-22 07:49:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2711568384. Throughput: 0: 42875.4. Samples: 2711654820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 07:49:26,785][15401] Updated weights for policy 0, policy_version 165510 (0.0029) [2024-06-22 07:49:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2711797760. Throughput: 0: 42998.7. Samples: 2711915900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:28,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 07:49:30,025][15401] Updated weights for policy 0, policy_version 165520 (0.0034) [2024-06-22 07:49:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2712010752. Throughput: 0: 43008.3. Samples: 2712170180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 07:49:34,424][15401] Updated weights for policy 0, policy_version 165530 (0.0032) [2024-06-22 07:49:37,862][15401] Updated weights for policy 0, policy_version 165540 (0.0036) [2024-06-22 07:49:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2712223744. Throughput: 0: 42936.6. Samples: 2712300920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 07:49:41,900][15401] Updated weights for policy 0, policy_version 165550 (0.0028) [2024-06-22 07:49:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2712420352. Throughput: 0: 43091.7. Samples: 2712564680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 07:49:43,450][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165554_2712436736.pth... [2024-06-22 07:49:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000164925_2702131200.pth [2024-06-22 07:49:45,610][15401] Updated weights for policy 0, policy_version 165560 (0.0023) [2024-06-22 07:49:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2712649728. Throughput: 0: 42945.3. Samples: 2712816520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 07:49:49,472][15401] Updated weights for policy 0, policy_version 165570 (0.0044) [2024-06-22 07:49:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2712846336. Throughput: 0: 42913.8. Samples: 2712945260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 07:49:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 07:49:53,615][15401] Updated weights for policy 0, policy_version 165580 (0.0035) [2024-06-22 07:49:56,838][15401] Updated weights for policy 0, policy_version 165590 (0.0039) [2024-06-22 07:49:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2713075712. Throughput: 0: 43002.5. Samples: 2713200540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:49:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 07:50:01,178][15401] Updated weights for policy 0, policy_version 165600 (0.0037) [2024-06-22 07:50:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2713305088. Throughput: 0: 42995.5. Samples: 2713459760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 07:50:04,717][15401] Updated weights for policy 0, policy_version 165610 (0.0032) [2024-06-22 07:50:08,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2713485312. Throughput: 0: 42944.1. Samples: 2713587300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 07:50:08,702][15401] Updated weights for policy 0, policy_version 165620 (0.0047) [2024-06-22 07:50:12,362][15401] Updated weights for policy 0, policy_version 165630 (0.0030) [2024-06-22 07:50:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2713714688. Throughput: 0: 42866.3. Samples: 2713844880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 07:50:16,152][15401] Updated weights for policy 0, policy_version 165640 (0.0029) [2024-06-22 07:50:17,177][15349] Signal inference workers to stop experience collection... (39950 times) [2024-06-22 07:50:17,197][15401] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-06-22 07:50:17,234][15349] Signal inference workers to resume experience collection... (39950 times) [2024-06-22 07:50:17,235][15401] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-06-22 07:50:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2713944064. Throughput: 0: 42752.1. Samples: 2714094020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 07:50:20,012][15401] Updated weights for policy 0, policy_version 165650 (0.0040) [2024-06-22 07:50:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 2714157056. Throughput: 0: 42727.1. Samples: 2714223640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:23,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 07:50:23,690][15401] Updated weights for policy 0, policy_version 165660 (0.0048) [2024-06-22 07:50:27,627][15401] Updated weights for policy 0, policy_version 165670 (0.0038) [2024-06-22 07:50:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2714353664. Throughput: 0: 42556.3. Samples: 2714479720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:28,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 07:50:31,687][15401] Updated weights for policy 0, policy_version 165680 (0.0034) [2024-06-22 07:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2714566656. Throughput: 0: 42614.6. Samples: 2714734180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 07:50:35,379][15401] Updated weights for policy 0, policy_version 165690 (0.0035) [2024-06-22 07:50:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 2714779648. Throughput: 0: 42600.8. Samples: 2714862400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:38,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 07:50:39,185][15401] Updated weights for policy 0, policy_version 165700 (0.0040) [2024-06-22 07:50:43,231][15401] Updated weights for policy 0, policy_version 165710 (0.0033) [2024-06-22 07:50:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2714992640. Throughput: 0: 42594.1. Samples: 2715117280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:43,394][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 07:50:46,959][15401] Updated weights for policy 0, policy_version 165720 (0.0023) [2024-06-22 07:50:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2715205632. Throughput: 0: 42402.7. Samples: 2715367880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 07:50:50,834][15401] Updated weights for policy 0, policy_version 165730 (0.0030) [2024-06-22 07:50:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 2715402240. Throughput: 0: 42463.6. Samples: 2715498160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 07:50:54,551][15401] Updated weights for policy 0, policy_version 165740 (0.0021) [2024-06-22 07:50:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2715631616. Throughput: 0: 42422.0. Samples: 2715753880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 07:50:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 07:50:58,515][15401] Updated weights for policy 0, policy_version 165750 (0.0022) [2024-06-22 07:51:02,481][15401] Updated weights for policy 0, policy_version 165760 (0.0031) [2024-06-22 07:51:03,391][15132] Fps is (10 sec: 45868.9, 60 sec: 42597.5, 300 sec: 42820.4). Total num frames: 2715860992. Throughput: 0: 42477.0. Samples: 2716005540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:03,391][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 07:51:06,153][15401] Updated weights for policy 0, policy_version 165770 (0.0040) [2024-06-22 07:51:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2716057600. Throughput: 0: 42576.0. Samples: 2716139560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:08,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 07:51:10,016][15401] Updated weights for policy 0, policy_version 165780 (0.0051) [2024-06-22 07:51:13,390][15132] Fps is (10 sec: 39326.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 2716254208. Throughput: 0: 42567.1. Samples: 2716395240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 07:51:14,027][15401] Updated weights for policy 0, policy_version 165790 (0.0037) [2024-06-22 07:51:17,544][15401] Updated weights for policy 0, policy_version 165800 (0.0033) [2024-06-22 07:51:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2716516352. Throughput: 0: 42584.1. Samples: 2716650460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 07:51:21,489][15401] Updated weights for policy 0, policy_version 165810 (0.0042) [2024-06-22 07:51:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2716712960. Throughput: 0: 42783.2. Samples: 2716787540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 07:51:25,074][15401] Updated weights for policy 0, policy_version 165820 (0.0032) [2024-06-22 07:51:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2716909568. Throughput: 0: 42803.6. Samples: 2717043440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 07:51:29,286][15401] Updated weights for policy 0, policy_version 165830 (0.0033) [2024-06-22 07:51:32,943][15401] Updated weights for policy 0, policy_version 165840 (0.0041) [2024-06-22 07:51:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2717138944. Throughput: 0: 42837.5. Samples: 2717295560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:33,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 07:51:36,832][15401] Updated weights for policy 0, policy_version 165850 (0.0025) [2024-06-22 07:51:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 2717351936. Throughput: 0: 42975.6. Samples: 2717432060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 07:51:40,298][15401] Updated weights for policy 0, policy_version 165860 (0.0037) [2024-06-22 07:51:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2717548544. Throughput: 0: 42961.4. Samples: 2717687140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:43,396][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 07:51:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165867_2717564928.pth... [2024-06-22 07:51:43,593][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165239_2707275776.pth [2024-06-22 07:51:44,604][15401] Updated weights for policy 0, policy_version 165870 (0.0043) [2024-06-22 07:51:45,563][15349] Signal inference workers to stop experience collection... (40000 times) [2024-06-22 07:51:45,591][15401] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-06-22 07:51:45,613][15349] Signal inference workers to resume experience collection... (40000 times) [2024-06-22 07:51:45,620][15401] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-06-22 07:51:47,949][15401] Updated weights for policy 0, policy_version 165880 (0.0024) [2024-06-22 07:51:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2717794304. Throughput: 0: 43091.5. Samples: 2717944600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 07:51:52,113][15401] Updated weights for policy 0, policy_version 165890 (0.0036) [2024-06-22 07:51:53,394][15132] Fps is (10 sec: 44218.6, 60 sec: 43141.6, 300 sec: 42820.0). Total num frames: 2717990912. Throughput: 0: 43065.0. Samples: 2718077660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:53,394][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 07:51:55,611][15401] Updated weights for policy 0, policy_version 165900 (0.0048) [2024-06-22 07:51:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2718203904. Throughput: 0: 43032.1. Samples: 2718331680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 07:51:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 07:51:59,824][15401] Updated weights for policy 0, policy_version 165910 (0.0044) [2024-06-22 07:52:03,260][15401] Updated weights for policy 0, policy_version 165920 (0.0044) [2024-06-22 07:52:03,392][15132] Fps is (10 sec: 44244.1, 60 sec: 42870.7, 300 sec: 42876.1). Total num frames: 2718433280. Throughput: 0: 43099.0. Samples: 2718590020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:03,392][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 07:52:07,530][15401] Updated weights for policy 0, policy_version 165930 (0.0036) [2024-06-22 07:52:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2718629888. Throughput: 0: 43011.7. Samples: 2718723060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 07:52:10,944][15401] Updated weights for policy 0, policy_version 165940 (0.0029) [2024-06-22 07:52:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2718859264. Throughput: 0: 42842.2. Samples: 2718971340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 07:52:14,955][15401] Updated weights for policy 0, policy_version 165950 (0.0029) [2024-06-22 07:52:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2719055872. Throughput: 0: 43108.0. Samples: 2719235420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:18,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-22 07:52:18,594][15401] Updated weights for policy 0, policy_version 165960 (0.0030) [2024-06-22 07:52:22,546][15401] Updated weights for policy 0, policy_version 165970 (0.0041) [2024-06-22 07:52:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2719268864. Throughput: 0: 42847.6. Samples: 2719360200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 07:52:26,180][15401] Updated weights for policy 0, policy_version 165980 (0.0044) [2024-06-22 07:52:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2719481856. Throughput: 0: 42815.2. Samples: 2719613820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:28,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 07:52:30,006][15401] Updated weights for policy 0, policy_version 165990 (0.0044) [2024-06-22 07:52:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2719711232. Throughput: 0: 42886.2. Samples: 2719874480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 07:52:34,015][15401] Updated weights for policy 0, policy_version 166000 (0.0025) [2024-06-22 07:52:37,597][15401] Updated weights for policy 0, policy_version 166010 (0.0030) [2024-06-22 07:52:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2719907840. Throughput: 0: 42726.6. Samples: 2720000180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 07:52:42,201][15401] Updated weights for policy 0, policy_version 166020 (0.0052) [2024-06-22 07:52:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2720137216. Throughput: 0: 42827.5. Samples: 2720258920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 07:52:45,807][15401] Updated weights for policy 0, policy_version 166030 (0.0040) [2024-06-22 07:52:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2720333824. Throughput: 0: 42656.6. Samples: 2720509460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 07:52:49,797][15401] Updated weights for policy 0, policy_version 166040 (0.0042) [2024-06-22 07:52:53,327][15401] Updated weights for policy 0, policy_version 166050 (0.0033) [2024-06-22 07:52:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.4, 300 sec: 42820.6). Total num frames: 2720563200. Throughput: 0: 42425.2. Samples: 2720632200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 07:52:57,477][15401] Updated weights for policy 0, policy_version 166060 (0.0025) [2024-06-22 07:52:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2720776192. Throughput: 0: 42791.6. Samples: 2720896960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:52:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 07:53:00,744][15401] Updated weights for policy 0, policy_version 166070 (0.0030) [2024-06-22 07:53:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 2720989184. Throughput: 0: 42607.0. Samples: 2721152740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 07:53:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 07:53:05,004][15401] Updated weights for policy 0, policy_version 166080 (0.0025) [2024-06-22 07:53:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 2721202176. Throughput: 0: 42658.1. Samples: 2721279920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:08,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:53:08,519][15401] Updated weights for policy 0, policy_version 166090 (0.0024) [2024-06-22 07:53:09,389][15349] Signal inference workers to stop experience collection... (40050 times) [2024-06-22 07:53:09,424][15401] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-06-22 07:53:09,504][15349] Signal inference workers to resume experience collection... (40050 times) [2024-06-22 07:53:09,505][15401] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-06-22 07:53:12,638][15401] Updated weights for policy 0, policy_version 166100 (0.0027) [2024-06-22 07:53:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2721415168. Throughput: 0: 42837.4. Samples: 2721541500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 07:53:16,244][15401] Updated weights for policy 0, policy_version 166110 (0.0045) [2024-06-22 07:53:18,394][15132] Fps is (10 sec: 42589.1, 60 sec: 42868.2, 300 sec: 42819.9). Total num frames: 2721628160. Throughput: 0: 42629.4. Samples: 2721793000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:18,394][15132] Avg episode reward: [(0, '0.172')] [2024-06-22 07:53:20,061][15401] Updated weights for policy 0, policy_version 166120 (0.0036) [2024-06-22 07:53:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2721841152. Throughput: 0: 42740.0. Samples: 2721923480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 07:53:23,902][15401] Updated weights for policy 0, policy_version 166130 (0.0040) [2024-06-22 07:53:27,624][15401] Updated weights for policy 0, policy_version 166140 (0.0035) [2024-06-22 07:53:28,389][15132] Fps is (10 sec: 42618.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2722054144. Throughput: 0: 42695.6. Samples: 2722180220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 07:53:31,399][15401] Updated weights for policy 0, policy_version 166150 (0.0031) [2024-06-22 07:53:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2722283520. Throughput: 0: 42764.0. Samples: 2722433840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 07:53:35,297][15401] Updated weights for policy 0, policy_version 166160 (0.0035) [2024-06-22 07:53:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2722480128. Throughput: 0: 42988.9. Samples: 2722566700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 07:53:38,951][15401] Updated weights for policy 0, policy_version 166170 (0.0028) [2024-06-22 07:53:42,959][15401] Updated weights for policy 0, policy_version 166180 (0.0028) [2024-06-22 07:53:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2722709504. Throughput: 0: 42781.9. Samples: 2722822140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 07:53:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166181_2722709504.pth... [2024-06-22 07:53:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165554_2712436736.pth [2024-06-22 07:53:47,202][15401] Updated weights for policy 0, policy_version 166190 (0.0037) [2024-06-22 07:53:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2722922496. Throughput: 0: 42576.1. Samples: 2723068660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 07:53:50,496][15401] Updated weights for policy 0, policy_version 166200 (0.0040) [2024-06-22 07:53:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2723135488. Throughput: 0: 42585.3. Samples: 2723196160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 07:53:54,798][15401] Updated weights for policy 0, policy_version 166210 (0.0033) [2024-06-22 07:53:58,112][15401] Updated weights for policy 0, policy_version 166220 (0.0030) [2024-06-22 07:53:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2723348480. Throughput: 0: 42552.2. Samples: 2723456360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:53:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:54:02,594][15401] Updated weights for policy 0, policy_version 166230 (0.0051) [2024-06-22 07:54:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2723561472. Throughput: 0: 42477.2. Samples: 2723704280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:54:03,407][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:54:06,335][15401] Updated weights for policy 0, policy_version 166240 (0.0032) [2024-06-22 07:54:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2723758080. Throughput: 0: 42431.1. Samples: 2723832880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 07:54:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 07:54:10,277][15401] Updated weights for policy 0, policy_version 166250 (0.0048) [2024-06-22 07:54:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2723971072. Throughput: 0: 42441.8. Samples: 2724090100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 07:54:13,889][15401] Updated weights for policy 0, policy_version 166260 (0.0050) [2024-06-22 07:54:17,918][15401] Updated weights for policy 0, policy_version 166270 (0.0027) [2024-06-22 07:54:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42601.7, 300 sec: 42765.0). Total num frames: 2724184064. Throughput: 0: 42455.1. Samples: 2724344320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:54:21,573][15401] Updated weights for policy 0, policy_version 166280 (0.0034) [2024-06-22 07:54:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2724397056. Throughput: 0: 42338.8. Samples: 2724471940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 07:54:25,570][15401] Updated weights for policy 0, policy_version 166290 (0.0035) [2024-06-22 07:54:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2724610048. Throughput: 0: 42322.2. Samples: 2724726640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 07:54:29,152][15401] Updated weights for policy 0, policy_version 166300 (0.0027) [2024-06-22 07:54:33,317][15401] Updated weights for policy 0, policy_version 166310 (0.0028) [2024-06-22 07:54:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2724823040. Throughput: 0: 42620.5. Samples: 2724986580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 07:54:36,710][15401] Updated weights for policy 0, policy_version 166320 (0.0045) [2024-06-22 07:54:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2725019648. Throughput: 0: 42635.2. Samples: 2725114740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:38,396][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 07:54:41,140][15401] Updated weights for policy 0, policy_version 166330 (0.0036) [2024-06-22 07:54:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2725249024. Throughput: 0: 42496.5. Samples: 2725368700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 07:54:44,657][15401] Updated weights for policy 0, policy_version 166340 (0.0030) [2024-06-22 07:54:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2725445632. Throughput: 0: 42736.6. Samples: 2725627420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 07:54:48,671][15401] Updated weights for policy 0, policy_version 166350 (0.0036) [2024-06-22 07:54:50,052][15349] Signal inference workers to stop experience collection... (40100 times) [2024-06-22 07:54:50,101][15401] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-06-22 07:54:50,109][15349] Signal inference workers to resume experience collection... (40100 times) [2024-06-22 07:54:50,115][15401] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-06-22 07:54:52,387][15401] Updated weights for policy 0, policy_version 166360 (0.0032) [2024-06-22 07:54:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2725658624. Throughput: 0: 42676.5. Samples: 2725753320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 07:54:56,390][15401] Updated weights for policy 0, policy_version 166370 (0.0032) [2024-06-22 07:54:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2725904384. Throughput: 0: 42655.5. Samples: 2726009600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:54:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 07:55:00,610][15401] Updated weights for policy 0, policy_version 166380 (0.0030) [2024-06-22 07:55:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2726100992. Throughput: 0: 42560.5. Samples: 2726259540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:55:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 07:55:03,914][15401] Updated weights for policy 0, policy_version 166390 (0.0032) [2024-06-22 07:55:08,392][15132] Fps is (10 sec: 37674.1, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 2726281216. Throughput: 0: 42514.1. Samples: 2726385180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:55:08,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 07:55:08,408][15401] Updated weights for policy 0, policy_version 166400 (0.0034) [2024-06-22 07:55:11,717][15401] Updated weights for policy 0, policy_version 166410 (0.0055) [2024-06-22 07:55:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2726543360. Throughput: 0: 42474.6. Samples: 2726638000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-22 07:55:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 07:55:15,936][15401] Updated weights for policy 0, policy_version 166420 (0.0033) [2024-06-22 07:55:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2726723584. Throughput: 0: 42459.1. Samples: 2726897240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 07:55:19,312][15401] Updated weights for policy 0, policy_version 166430 (0.0038) [2024-06-22 07:55:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2726936576. Throughput: 0: 42350.6. Samples: 2727020520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 07:55:24,014][15401] Updated weights for policy 0, policy_version 166440 (0.0035) [2024-06-22 07:55:26,791][15401] Updated weights for policy 0, policy_version 166450 (0.0034) [2024-06-22 07:55:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2727165952. Throughput: 0: 42394.8. Samples: 2727276460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 07:55:31,686][15401] Updated weights for policy 0, policy_version 166460 (0.0041) [2024-06-22 07:55:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2727378944. Throughput: 0: 42530.0. Samples: 2727541280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 07:55:34,302][15401] Updated weights for policy 0, policy_version 166470 (0.0045) [2024-06-22 07:55:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2727575552. Throughput: 0: 42435.5. Samples: 2727662920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 07:55:39,220][15401] Updated weights for policy 0, policy_version 166480 (0.0034) [2024-06-22 07:55:41,860][15401] Updated weights for policy 0, policy_version 166490 (0.0033) [2024-06-22 07:55:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2727837696. Throughput: 0: 42427.1. Samples: 2727918820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 07:55:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166494_2727837696.pth... [2024-06-22 07:55:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000165867_2717564928.pth [2024-06-22 07:55:46,898][15401] Updated weights for policy 0, policy_version 166500 (0.0029) [2024-06-22 07:55:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2728001536. Throughput: 0: 42859.9. Samples: 2728188240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 07:55:49,518][15401] Updated weights for policy 0, policy_version 166510 (0.0035) [2024-06-22 07:55:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2728214528. Throughput: 0: 42754.7. Samples: 2728309040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:55:54,566][15401] Updated weights for policy 0, policy_version 166520 (0.0023) [2024-06-22 07:55:57,053][15401] Updated weights for policy 0, policy_version 166530 (0.0028) [2024-06-22 07:55:57,973][15349] Signal inference workers to stop experience collection... (40150 times) [2024-06-22 07:55:57,973][15349] Signal inference workers to resume experience collection... (40150 times) [2024-06-22 07:55:58,005][15401] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-06-22 07:55:58,005][15401] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-06-22 07:55:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 2728476672. Throughput: 0: 42763.2. Samples: 2728562340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:55:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 07:56:02,303][15401] Updated weights for policy 0, policy_version 166540 (0.0030) [2024-06-22 07:56:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2728640512. Throughput: 0: 42914.2. Samples: 2728828380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:56:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 07:56:04,825][15401] Updated weights for policy 0, policy_version 166550 (0.0029) [2024-06-22 07:56:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2728853504. Throughput: 0: 42840.0. Samples: 2728948320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:56:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 07:56:09,919][15401] Updated weights for policy 0, policy_version 166560 (0.0028) [2024-06-22 07:56:12,525][15401] Updated weights for policy 0, policy_version 166570 (0.0035) [2024-06-22 07:56:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2729115648. Throughput: 0: 42749.7. Samples: 2729200200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:56:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 07:56:17,754][15401] Updated weights for policy 0, policy_version 166580 (0.0037) [2024-06-22 07:56:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2729295872. Throughput: 0: 42607.2. Samples: 2729458600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 07:56:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 07:56:20,721][15401] Updated weights for policy 0, policy_version 166590 (0.0033) [2024-06-22 07:56:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2729492480. Throughput: 0: 42582.4. Samples: 2729579120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 07:56:25,520][15401] Updated weights for policy 0, policy_version 166600 (0.0031) [2024-06-22 07:56:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2729721856. Throughput: 0: 42642.2. Samples: 2729837720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 07:56:28,426][15401] Updated weights for policy 0, policy_version 166610 (0.0037) [2024-06-22 07:56:32,979][15401] Updated weights for policy 0, policy_version 166620 (0.0040) [2024-06-22 07:56:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2729918464. Throughput: 0: 42394.6. Samples: 2730096000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 07:56:35,953][15401] Updated weights for policy 0, policy_version 166630 (0.0042) [2024-06-22 07:56:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2730147840. Throughput: 0: 42350.2. Samples: 2730214800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 07:56:40,783][15401] Updated weights for policy 0, policy_version 166640 (0.0029) [2024-06-22 07:56:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2730377216. Throughput: 0: 42517.7. Samples: 2730475640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 07:56:43,476][15401] Updated weights for policy 0, policy_version 166650 (0.0038) [2024-06-22 07:56:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42543.4). Total num frames: 2730541056. Throughput: 0: 42410.0. Samples: 2730736840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 07:56:48,430][15401] Updated weights for policy 0, policy_version 166660 (0.0041) [2024-06-22 07:56:51,092][15401] Updated weights for policy 0, policy_version 166670 (0.0029) [2024-06-22 07:56:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2730803200. Throughput: 0: 42374.6. Samples: 2730855180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 07:56:56,117][15401] Updated weights for policy 0, policy_version 166680 (0.0030) [2024-06-22 07:56:58,390][15132] Fps is (10 sec: 47514.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2731016192. Throughput: 0: 42590.7. Samples: 2731116780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:56:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 07:56:58,718][15401] Updated weights for policy 0, policy_version 166690 (0.0036) [2024-06-22 07:57:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2731180032. Throughput: 0: 42831.2. Samples: 2731386000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:57:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 07:57:03,638][15401] Updated weights for policy 0, policy_version 166700 (0.0041) [2024-06-22 07:57:06,491][15401] Updated weights for policy 0, policy_version 166710 (0.0029) [2024-06-22 07:57:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2731425792. Throughput: 0: 42654.6. Samples: 2731498580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:57:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 07:57:11,254][15401] Updated weights for policy 0, policy_version 166720 (0.0040) [2024-06-22 07:57:13,391][15132] Fps is (10 sec: 47507.5, 60 sec: 42324.5, 300 sec: 42709.3). Total num frames: 2731655168. Throughput: 0: 42633.0. Samples: 2731756260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:57:13,391][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 07:57:14,378][15401] Updated weights for policy 0, policy_version 166730 (0.0035) [2024-06-22 07:57:15,484][15349] Signal inference workers to stop experience collection... (40200 times) [2024-06-22 07:57:15,534][15401] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-06-22 07:57:15,598][15349] Signal inference workers to resume experience collection... (40200 times) [2024-06-22 07:57:15,598][15401] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-06-22 07:57:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2731835392. Throughput: 0: 42773.4. Samples: 2732020800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:57:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 07:57:18,791][15401] Updated weights for policy 0, policy_version 166740 (0.0037) [2024-06-22 07:57:22,250][15401] Updated weights for policy 0, policy_version 166750 (0.0034) [2024-06-22 07:57:23,390][15132] Fps is (10 sec: 40964.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2732064768. Throughput: 0: 42635.0. Samples: 2732133380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-22 07:57:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 07:57:26,504][15401] Updated weights for policy 0, policy_version 166760 (0.0037) [2024-06-22 07:57:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2732294144. Throughput: 0: 42699.0. Samples: 2732397100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 07:57:30,277][15401] Updated weights for policy 0, policy_version 166770 (0.0037) [2024-06-22 07:57:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2732474368. Throughput: 0: 42592.7. Samples: 2732653500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 07:57:34,202][15401] Updated weights for policy 0, policy_version 166780 (0.0032) [2024-06-22 07:57:37,855][15401] Updated weights for policy 0, policy_version 166790 (0.0028) [2024-06-22 07:57:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2732687360. Throughput: 0: 42634.8. Samples: 2732773740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:38,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 07:57:42,055][15401] Updated weights for policy 0, policy_version 166800 (0.0031) [2024-06-22 07:57:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2732933120. Throughput: 0: 42680.0. Samples: 2733037380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 07:57:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166805_2732933120.pth... [2024-06-22 07:57:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166181_2722709504.pth [2024-06-22 07:57:45,895][15401] Updated weights for policy 0, policy_version 166810 (0.0037) [2024-06-22 07:57:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2733113344. Throughput: 0: 42341.8. Samples: 2733291380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 07:57:49,651][15401] Updated weights for policy 0, policy_version 166820 (0.0046) [2024-06-22 07:57:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 2733309952. Throughput: 0: 42498.2. Samples: 2733411000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 07:57:53,712][15401] Updated weights for policy 0, policy_version 166830 (0.0047) [2024-06-22 07:57:57,387][15401] Updated weights for policy 0, policy_version 166840 (0.0044) [2024-06-22 07:57:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2733555712. Throughput: 0: 42551.8. Samples: 2733671140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:57:58,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 07:58:01,285][15401] Updated weights for policy 0, policy_version 166850 (0.0033) [2024-06-22 07:58:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42543.2). Total num frames: 2733752320. Throughput: 0: 42470.6. Samples: 2733931980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 07:58:04,954][15401] Updated weights for policy 0, policy_version 166860 (0.0034) [2024-06-22 07:58:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2733965312. Throughput: 0: 42598.8. Samples: 2734050320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 07:58:09,159][15401] Updated weights for policy 0, policy_version 166870 (0.0038) [2024-06-22 07:58:12,360][15401] Updated weights for policy 0, policy_version 166880 (0.0029) [2024-06-22 07:58:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42326.1, 300 sec: 42599.0). Total num frames: 2734194688. Throughput: 0: 42362.2. Samples: 2734303400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 07:58:16,819][15401] Updated weights for policy 0, policy_version 166890 (0.0034) [2024-06-22 07:58:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2734391296. Throughput: 0: 42442.9. Samples: 2734563440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 07:58:19,910][15401] Updated weights for policy 0, policy_version 166900 (0.0034) [2024-06-22 07:58:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2734620672. Throughput: 0: 42510.2. Samples: 2734686700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 07:58:24,589][15401] Updated weights for policy 0, policy_version 166910 (0.0050) [2024-06-22 07:58:27,470][15401] Updated weights for policy 0, policy_version 166920 (0.0020) [2024-06-22 07:58:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2734833664. Throughput: 0: 42553.9. Samples: 2734952300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 07:58:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 07:58:32,338][15401] Updated weights for policy 0, policy_version 166930 (0.0033) [2024-06-22 07:58:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 2735030272. Throughput: 0: 42713.2. Samples: 2735213580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:33,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 07:58:35,078][15401] Updated weights for policy 0, policy_version 166940 (0.0030) [2024-06-22 07:58:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2735259648. Throughput: 0: 42761.8. Samples: 2735335280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 07:58:39,818][15401] Updated weights for policy 0, policy_version 166950 (0.0040) [2024-06-22 07:58:42,855][15401] Updated weights for policy 0, policy_version 166960 (0.0045) [2024-06-22 07:58:43,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2735489024. Throughput: 0: 42717.0. Samples: 2735593300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 07:58:47,803][15401] Updated weights for policy 0, policy_version 166970 (0.0033) [2024-06-22 07:58:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2735669248. Throughput: 0: 42950.3. Samples: 2735864740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 07:58:50,252][15401] Updated weights for policy 0, policy_version 166980 (0.0022) [2024-06-22 07:58:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2735898624. Throughput: 0: 42809.8. Samples: 2735976760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 07:58:55,039][15349] Signal inference workers to stop experience collection... (40250 times) [2024-06-22 07:58:55,096][15401] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-06-22 07:58:55,096][15349] Signal inference workers to resume experience collection... (40250 times) [2024-06-22 07:58:55,136][15401] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-06-22 07:58:55,235][15401] Updated weights for policy 0, policy_version 166990 (0.0047) [2024-06-22 07:58:57,881][15401] Updated weights for policy 0, policy_version 167000 (0.0038) [2024-06-22 07:58:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2736144384. Throughput: 0: 42980.0. Samples: 2736237500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:58:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 07:59:03,222][15401] Updated weights for policy 0, policy_version 167010 (0.0033) [2024-06-22 07:59:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2736291840. Throughput: 0: 43159.2. Samples: 2736505600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 07:59:05,527][15401] Updated weights for policy 0, policy_version 167020 (0.0031) [2024-06-22 07:59:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2736553984. Throughput: 0: 42917.0. Samples: 2736617960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 07:59:10,731][15401] Updated weights for policy 0, policy_version 167030 (0.0034) [2024-06-22 07:59:13,340][15401] Updated weights for policy 0, policy_version 167040 (0.0037) [2024-06-22 07:59:13,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2736783360. Throughput: 0: 42898.6. Samples: 2736882740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 07:59:18,385][15401] Updated weights for policy 0, policy_version 167050 (0.0037) [2024-06-22 07:59:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2736947200. Throughput: 0: 42896.6. Samples: 2737143820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:18,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 07:59:21,009][15401] Updated weights for policy 0, policy_version 167060 (0.0032) [2024-06-22 07:59:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2737192960. Throughput: 0: 42674.3. Samples: 2737255620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:23,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 07:59:25,787][15401] Updated weights for policy 0, policy_version 167070 (0.0032) [2024-06-22 07:59:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2737405952. Throughput: 0: 42889.7. Samples: 2737523340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 07:59:28,523][15401] Updated weights for policy 0, policy_version 167080 (0.0027) [2024-06-22 07:59:33,212][15401] Updated weights for policy 0, policy_version 167090 (0.0046) [2024-06-22 07:59:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 2737602560. Throughput: 0: 42678.2. Samples: 2737785360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 07:59:33,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 07:59:36,115][15401] Updated weights for policy 0, policy_version 167100 (0.0048) [2024-06-22 07:59:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2737831936. Throughput: 0: 42811.6. Samples: 2737903280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 07:59:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 07:59:40,927][15401] Updated weights for policy 0, policy_version 167110 (0.0023) [2024-06-22 07:59:43,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2738061312. Throughput: 0: 42969.8. Samples: 2738171140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 07:59:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 07:59:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167118_2738061312.pth... [2024-06-22 07:59:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166494_2727837696.pth [2024-06-22 07:59:43,835][15401] Updated weights for policy 0, policy_version 167120 (0.0028) [2024-06-22 07:59:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2738241536. Throughput: 0: 42553.7. Samples: 2738420520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 07:59:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 07:59:48,594][15401] Updated weights for policy 0, policy_version 167130 (0.0028) [2024-06-22 07:59:51,547][15401] Updated weights for policy 0, policy_version 167140 (0.0026) [2024-06-22 07:59:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2738470912. Throughput: 0: 42839.9. Samples: 2738545760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 07:59:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 07:59:56,086][15401] Updated weights for policy 0, policy_version 167150 (0.0031) [2024-06-22 07:59:58,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2738700288. Throughput: 0: 42876.4. Samples: 2738812280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 07:59:58,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 07:59:59,642][15401] Updated weights for policy 0, policy_version 167160 (0.0029) [2024-06-22 08:00:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 2738896896. Throughput: 0: 42709.2. Samples: 2739065740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 08:00:03,556][15401] Updated weights for policy 0, policy_version 167170 (0.0047) [2024-06-22 08:00:07,188][15401] Updated weights for policy 0, policy_version 167180 (0.0045) [2024-06-22 08:00:08,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2739126272. Throughput: 0: 43041.8. Samples: 2739192500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 08:00:11,243][15349] Signal inference workers to stop experience collection... (40300 times) [2024-06-22 08:00:11,292][15401] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-06-22 08:00:11,353][15349] Signal inference workers to resume experience collection... (40300 times) [2024-06-22 08:00:11,353][15401] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-06-22 08:00:11,495][15401] Updated weights for policy 0, policy_version 167190 (0.0040) [2024-06-22 08:00:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2739322880. Throughput: 0: 42993.8. Samples: 2739458060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 08:00:14,611][15401] Updated weights for policy 0, policy_version 167200 (0.0031) [2024-06-22 08:00:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2739519488. Throughput: 0: 42773.8. Samples: 2739710080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 08:00:19,287][15401] Updated weights for policy 0, policy_version 167210 (0.0034) [2024-06-22 08:00:22,169][15401] Updated weights for policy 0, policy_version 167220 (0.0030) [2024-06-22 08:00:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2739765248. Throughput: 0: 42975.5. Samples: 2739837180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:23,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 08:00:26,821][15401] Updated weights for policy 0, policy_version 167230 (0.0023) [2024-06-22 08:00:28,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 2739978240. Throughput: 0: 42816.1. Samples: 2740097880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:28,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 08:00:29,586][15401] Updated weights for policy 0, policy_version 167240 (0.0026) [2024-06-22 08:00:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2740174848. Throughput: 0: 42842.7. Samples: 2740348440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 08:00:34,722][15401] Updated weights for policy 0, policy_version 167250 (0.0034) [2024-06-22 08:00:37,896][15401] Updated weights for policy 0, policy_version 167260 (0.0033) [2024-06-22 08:00:38,390][15132] Fps is (10 sec: 42600.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2740404224. Throughput: 0: 42913.4. Samples: 2740476860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 08:00:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 08:00:42,378][15401] Updated weights for policy 0, policy_version 167270 (0.0034) [2024-06-22 08:00:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2740600832. Throughput: 0: 42771.5. Samples: 2740736900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:00:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 08:00:45,668][15401] Updated weights for policy 0, policy_version 167280 (0.0040) [2024-06-22 08:00:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 2740830208. Throughput: 0: 42636.4. Samples: 2740984480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:00:48,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 08:00:49,900][15401] Updated weights for policy 0, policy_version 167290 (0.0043) [2024-06-22 08:00:53,126][15401] Updated weights for policy 0, policy_version 167300 (0.0038) [2024-06-22 08:00:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2741043200. Throughput: 0: 42726.2. Samples: 2741115180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:00:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 08:00:57,442][15401] Updated weights for policy 0, policy_version 167310 (0.0038) [2024-06-22 08:00:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 2741223424. Throughput: 0: 42728.1. Samples: 2741380820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:00:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 08:01:00,678][15401] Updated weights for policy 0, policy_version 167320 (0.0037) [2024-06-22 08:01:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2741469184. Throughput: 0: 42668.9. Samples: 2741630180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:03,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 08:01:05,290][15401] Updated weights for policy 0, policy_version 167330 (0.0033) [2024-06-22 08:01:08,311][15401] Updated weights for policy 0, policy_version 167340 (0.0040) [2024-06-22 08:01:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2741698560. Throughput: 0: 42725.8. Samples: 2741759840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 08:01:12,835][15401] Updated weights for policy 0, policy_version 167350 (0.0024) [2024-06-22 08:01:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2741878784. Throughput: 0: 42654.5. Samples: 2742017320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 08:01:16,266][15401] Updated weights for policy 0, policy_version 167360 (0.0026) [2024-06-22 08:01:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2742108160. Throughput: 0: 42679.2. Samples: 2742269000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 08:01:20,335][15401] Updated weights for policy 0, policy_version 167370 (0.0037) [2024-06-22 08:01:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2742321152. Throughput: 0: 42787.9. Samples: 2742402320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 08:01:23,857][15401] Updated weights for policy 0, policy_version 167380 (0.0030) [2024-06-22 08:01:27,924][15401] Updated weights for policy 0, policy_version 167390 (0.0041) [2024-06-22 08:01:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.6, 300 sec: 42709.5). Total num frames: 2742517760. Throughput: 0: 42788.6. Samples: 2742662380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 08:01:30,389][15349] Signal inference workers to stop experience collection... (40350 times) [2024-06-22 08:01:30,389][15349] Signal inference workers to resume experience collection... (40350 times) [2024-06-22 08:01:30,421][15401] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-06-22 08:01:30,421][15401] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-06-22 08:01:31,379][15401] Updated weights for policy 0, policy_version 167400 (0.0027) [2024-06-22 08:01:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2742763520. Throughput: 0: 42680.5. Samples: 2742905000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 08:01:35,718][15401] Updated weights for policy 0, policy_version 167410 (0.0049) [2024-06-22 08:01:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2742960128. Throughput: 0: 42754.7. Samples: 2743039140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:38,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 08:01:39,216][15401] Updated weights for policy 0, policy_version 167420 (0.0026) [2024-06-22 08:01:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2743156736. Throughput: 0: 42529.8. Samples: 2743294660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:01:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 08:01:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167430_2743173120.pth... [2024-06-22 08:01:43,461][15401] Updated weights for policy 0, policy_version 167430 (0.0036) [2024-06-22 08:01:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000166805_2732933120.pth [2024-06-22 08:01:46,796][15401] Updated weights for policy 0, policy_version 167440 (0.0046) [2024-06-22 08:01:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2743402496. Throughput: 0: 42613.9. Samples: 2743547800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:01:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 08:01:51,119][15401] Updated weights for policy 0, policy_version 167450 (0.0026) [2024-06-22 08:01:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2743599104. Throughput: 0: 42752.1. Samples: 2743683680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:01:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 08:01:54,507][15401] Updated weights for policy 0, policy_version 167460 (0.0031) [2024-06-22 08:01:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2743795712. Throughput: 0: 42790.4. Samples: 2743942880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:01:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 08:01:58,667][15401] Updated weights for policy 0, policy_version 167470 (0.0042) [2024-06-22 08:02:02,215][15401] Updated weights for policy 0, policy_version 167480 (0.0036) [2024-06-22 08:02:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2744041472. Throughput: 0: 42730.6. Samples: 2744191880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 08:02:06,682][15401] Updated weights for policy 0, policy_version 167490 (0.0041) [2024-06-22 08:02:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 2744238080. Throughput: 0: 42698.8. Samples: 2744323760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 08:02:09,800][15401] Updated weights for policy 0, policy_version 167500 (0.0025) [2024-06-22 08:02:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2744451072. Throughput: 0: 42524.9. Samples: 2744576000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:13,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 08:02:14,176][15401] Updated weights for policy 0, policy_version 167510 (0.0043) [2024-06-22 08:02:17,891][15401] Updated weights for policy 0, policy_version 167520 (0.0036) [2024-06-22 08:02:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2744680448. Throughput: 0: 42768.7. Samples: 2744829600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 08:02:21,651][15401] Updated weights for policy 0, policy_version 167530 (0.0029) [2024-06-22 08:02:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2744877056. Throughput: 0: 42732.8. Samples: 2744962120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 08:02:25,399][15401] Updated weights for policy 0, policy_version 167540 (0.0028) [2024-06-22 08:02:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2745090048. Throughput: 0: 42739.0. Samples: 2745217920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 08:02:29,183][15401] Updated weights for policy 0, policy_version 167550 (0.0038) [2024-06-22 08:02:33,225][15401] Updated weights for policy 0, policy_version 167560 (0.0048) [2024-06-22 08:02:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2745303040. Throughput: 0: 42828.3. Samples: 2745475080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 08:02:36,720][15401] Updated weights for policy 0, policy_version 167570 (0.0040) [2024-06-22 08:02:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2745516032. Throughput: 0: 42673.2. Samples: 2745603980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 08:02:40,722][15401] Updated weights for policy 0, policy_version 167580 (0.0027) [2024-06-22 08:02:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2745729024. Throughput: 0: 42618.6. Samples: 2745860720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:43,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 08:02:44,295][15401] Updated weights for policy 0, policy_version 167590 (0.0035) [2024-06-22 08:02:48,257][15401] Updated weights for policy 0, policy_version 167600 (0.0026) [2024-06-22 08:02:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 2745958400. Throughput: 0: 42861.7. Samples: 2746120760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:02:48,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 08:02:52,104][15401] Updated weights for policy 0, policy_version 167610 (0.0039) [2024-06-22 08:02:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 2746171392. Throughput: 0: 42846.6. Samples: 2746251860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:02:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:02:55,795][15401] Updated weights for policy 0, policy_version 167620 (0.0040) [2024-06-22 08:02:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2746384384. Throughput: 0: 43065.4. Samples: 2746513940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:02:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:02:59,645][15401] Updated weights for policy 0, policy_version 167630 (0.0039) [2024-06-22 08:03:03,291][15401] Updated weights for policy 0, policy_version 167640 (0.0027) [2024-06-22 08:03:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2746613760. Throughput: 0: 43212.2. Samples: 2746774140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 08:03:05,811][15349] Signal inference workers to stop experience collection... (40400 times) [2024-06-22 08:03:05,811][15349] Signal inference workers to resume experience collection... (40400 times) [2024-06-22 08:03:05,830][15401] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-06-22 08:03:05,830][15401] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-06-22 08:03:07,118][15401] Updated weights for policy 0, policy_version 167650 (0.0033) [2024-06-22 08:03:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2746826752. Throughput: 0: 43076.9. Samples: 2746900680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:08,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 08:03:11,252][15401] Updated weights for policy 0, policy_version 167660 (0.0037) [2024-06-22 08:03:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2747039744. Throughput: 0: 43113.0. Samples: 2747158000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 08:03:14,695][15401] Updated weights for policy 0, policy_version 167670 (0.0040) [2024-06-22 08:03:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2747236352. Throughput: 0: 43216.9. Samples: 2747419840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 08:03:18,965][15401] Updated weights for policy 0, policy_version 167680 (0.0035) [2024-06-22 08:03:22,267][15401] Updated weights for policy 0, policy_version 167690 (0.0028) [2024-06-22 08:03:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2747465728. Throughput: 0: 43018.4. Samples: 2747539800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 08:03:26,598][15401] Updated weights for policy 0, policy_version 167700 (0.0027) [2024-06-22 08:03:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 2747678720. Throughput: 0: 43119.2. Samples: 2747801080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 08:03:30,183][15401] Updated weights for policy 0, policy_version 167710 (0.0033) [2024-06-22 08:03:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2747875328. Throughput: 0: 43031.1. Samples: 2748057060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 08:03:34,596][15401] Updated weights for policy 0, policy_version 167720 (0.0026) [2024-06-22 08:03:37,871][15401] Updated weights for policy 0, policy_version 167730 (0.0028) [2024-06-22 08:03:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2748104704. Throughput: 0: 42981.8. Samples: 2748186040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 08:03:42,091][15401] Updated weights for policy 0, policy_version 167740 (0.0038) [2024-06-22 08:03:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2748334080. Throughput: 0: 42882.9. Samples: 2748443680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 08:03:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167745_2748334080.pth... [2024-06-22 08:03:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167118_2738061312.pth [2024-06-22 08:03:45,511][15401] Updated weights for policy 0, policy_version 167750 (0.0036) [2024-06-22 08:03:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2748530688. Throughput: 0: 42774.1. Samples: 2748698980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 08:03:49,657][15401] Updated weights for policy 0, policy_version 167760 (0.0038) [2024-06-22 08:03:53,071][15401] Updated weights for policy 0, policy_version 167770 (0.0028) [2024-06-22 08:03:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2748743680. Throughput: 0: 42771.1. Samples: 2748825280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 08:03:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 08:03:57,227][15401] Updated weights for policy 0, policy_version 167780 (0.0044) [2024-06-22 08:03:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2748956672. Throughput: 0: 42970.1. Samples: 2749091660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:03:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 08:04:00,473][15401] Updated weights for policy 0, policy_version 167790 (0.0034) [2024-06-22 08:04:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2749169664. Throughput: 0: 42760.2. Samples: 2749344040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 08:04:04,848][15401] Updated weights for policy 0, policy_version 167800 (0.0042) [2024-06-22 08:04:07,980][15401] Updated weights for policy 0, policy_version 167810 (0.0034) [2024-06-22 08:04:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2749399040. Throughput: 0: 43043.5. Samples: 2749476760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 08:04:12,290][15401] Updated weights for policy 0, policy_version 167820 (0.0026) [2024-06-22 08:04:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2749612032. Throughput: 0: 43071.1. Samples: 2749739280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 08:04:15,483][15401] Updated weights for policy 0, policy_version 167830 (0.0031) [2024-06-22 08:04:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2749808640. Throughput: 0: 42934.2. Samples: 2749989100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 08:04:20,143][15401] Updated weights for policy 0, policy_version 167840 (0.0027) [2024-06-22 08:04:22,991][15401] Updated weights for policy 0, policy_version 167850 (0.0033) [2024-06-22 08:04:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2750054400. Throughput: 0: 42875.4. Samples: 2750115440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 08:04:27,834][15401] Updated weights for policy 0, policy_version 167860 (0.0047) [2024-06-22 08:04:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 2750251008. Throughput: 0: 42893.6. Samples: 2750373880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 08:04:30,554][15401] Updated weights for policy 0, policy_version 167870 (0.0027) [2024-06-22 08:04:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2750447616. Throughput: 0: 42948.5. Samples: 2750631660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 08:04:35,412][15401] Updated weights for policy 0, policy_version 167880 (0.0032) [2024-06-22 08:04:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2750693376. Throughput: 0: 42975.2. Samples: 2750759160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 08:04:38,705][15401] Updated weights for policy 0, policy_version 167890 (0.0033) [2024-06-22 08:04:43,009][15401] Updated weights for policy 0, policy_version 167900 (0.0040) [2024-06-22 08:04:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2750889984. Throughput: 0: 42830.2. Samples: 2751019020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 08:04:46,418][15401] Updated weights for policy 0, policy_version 167910 (0.0031) [2024-06-22 08:04:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2751086592. Throughput: 0: 42915.5. Samples: 2751275240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 08:04:50,596][15401] Updated weights for policy 0, policy_version 167920 (0.0031) [2024-06-22 08:04:51,698][15349] Signal inference workers to stop experience collection... (40450 times) [2024-06-22 08:04:51,742][15401] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-06-22 08:04:51,751][15349] Signal inference workers to resume experience collection... (40450 times) [2024-06-22 08:04:51,752][15401] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-06-22 08:04:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 2751332352. Throughput: 0: 42800.9. Samples: 2751402800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 08:04:53,958][15401] Updated weights for policy 0, policy_version 167930 (0.0041) [2024-06-22 08:04:58,206][15401] Updated weights for policy 0, policy_version 167940 (0.0033) [2024-06-22 08:04:58,390][15132] Fps is (10 sec: 44232.7, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 2751528960. Throughput: 0: 42613.3. Samples: 2751656920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:04:58,391][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 08:05:01,781][15401] Updated weights for policy 0, policy_version 167950 (0.0029) [2024-06-22 08:05:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2751741952. Throughput: 0: 42602.8. Samples: 2751906220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 08:05:05,797][15401] Updated weights for policy 0, policy_version 167960 (0.0030) [2024-06-22 08:05:08,396][15132] Fps is (10 sec: 44212.7, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 2751971328. Throughput: 0: 42786.5. Samples: 2752041100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:08,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 08:05:09,461][15401] Updated weights for policy 0, policy_version 167970 (0.0035) [2024-06-22 08:05:13,333][15401] Updated weights for policy 0, policy_version 167980 (0.0042) [2024-06-22 08:05:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2752184320. Throughput: 0: 42727.9. Samples: 2752296640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 08:05:17,225][15401] Updated weights for policy 0, policy_version 167990 (0.0038) [2024-06-22 08:05:18,389][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2752397312. Throughput: 0: 42642.7. Samples: 2752550580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 08:05:20,952][15401] Updated weights for policy 0, policy_version 168000 (0.0029) [2024-06-22 08:05:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2752610304. Throughput: 0: 42706.2. Samples: 2752680940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:23,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 08:05:24,703][15401] Updated weights for policy 0, policy_version 168010 (0.0038) [2024-06-22 08:05:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2752823296. Throughput: 0: 42694.8. Samples: 2752940280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:05:28,659][15401] Updated weights for policy 0, policy_version 168020 (0.0030) [2024-06-22 08:05:32,438][15401] Updated weights for policy 0, policy_version 168030 (0.0024) [2024-06-22 08:05:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2753019904. Throughput: 0: 42734.6. Samples: 2753198300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 08:05:36,130][15401] Updated weights for policy 0, policy_version 168040 (0.0044) [2024-06-22 08:05:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2753249280. Throughput: 0: 42690.7. Samples: 2753323880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 08:05:40,061][15401] Updated weights for policy 0, policy_version 168050 (0.0045) [2024-06-22 08:05:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2753462272. Throughput: 0: 42782.2. Samples: 2753582080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 08:05:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168059_2753478656.pth... [2024-06-22 08:05:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167430_2743173120.pth [2024-06-22 08:05:43,903][15401] Updated weights for policy 0, policy_version 168060 (0.0044) [2024-06-22 08:05:47,750][15401] Updated weights for policy 0, policy_version 168070 (0.0026) [2024-06-22 08:05:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2753675264. Throughput: 0: 43013.7. Samples: 2753841840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:48,390][15132] Avg episode reward: [(0, '0.061')] [2024-06-22 08:05:51,405][15401] Updated weights for policy 0, policy_version 168080 (0.0033) [2024-06-22 08:05:53,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42866.9, 300 sec: 42986.2). Total num frames: 2753904640. Throughput: 0: 42937.8. Samples: 2753973300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:53,396][15132] Avg episode reward: [(0, '0.061')] [2024-06-22 08:05:55,450][15401] Updated weights for policy 0, policy_version 168090 (0.0025) [2024-06-22 08:05:58,391][15132] Fps is (10 sec: 44230.3, 60 sec: 43144.1, 300 sec: 42875.9). Total num frames: 2754117632. Throughput: 0: 42968.8. Samples: 2754230300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:05:58,392][15132] Avg episode reward: [(0, '0.086')] [2024-06-22 08:05:59,021][15401] Updated weights for policy 0, policy_version 168100 (0.0041) [2024-06-22 08:06:03,271][15401] Updated weights for policy 0, policy_version 168110 (0.0031) [2024-06-22 08:06:03,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2754314240. Throughput: 0: 43114.9. Samples: 2754490760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 08:06:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 08:06:06,999][15401] Updated weights for policy 0, policy_version 168120 (0.0030) [2024-06-22 08:06:08,389][15132] Fps is (10 sec: 42605.0, 60 sec: 42876.0, 300 sec: 42931.7). Total num frames: 2754543616. Throughput: 0: 42869.4. Samples: 2754610060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:08,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 08:06:11,049][15401] Updated weights for policy 0, policy_version 168130 (0.0038) [2024-06-22 08:06:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2754740224. Throughput: 0: 42782.9. Samples: 2754865520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 08:06:14,669][15401] Updated weights for policy 0, policy_version 168140 (0.0028) [2024-06-22 08:06:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2754936832. Throughput: 0: 42748.6. Samples: 2755121980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 08:06:18,445][15349] Signal inference workers to stop experience collection... (40500 times) [2024-06-22 08:06:18,499][15349] Signal inference workers to resume experience collection... (40500 times) [2024-06-22 08:06:18,500][15401] InferenceWorker_p0-w0: stopping experience collection (40500 times) [2024-06-22 08:06:18,510][15401] InferenceWorker_p0-w0: resuming experience collection (40500 times) [2024-06-22 08:06:18,688][15401] Updated weights for policy 0, policy_version 168150 (0.0034) [2024-06-22 08:06:22,254][15401] Updated weights for policy 0, policy_version 168160 (0.0042) [2024-06-22 08:06:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2755182592. Throughput: 0: 42821.7. Samples: 2755250860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 08:06:26,275][15401] Updated weights for policy 0, policy_version 168170 (0.0036) [2024-06-22 08:06:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2755379200. Throughput: 0: 42765.4. Samples: 2755506520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 08:06:29,728][15401] Updated weights for policy 0, policy_version 168180 (0.0037) [2024-06-22 08:06:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2755575808. Throughput: 0: 42798.3. Samples: 2755767760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 08:06:33,845][15401] Updated weights for policy 0, policy_version 168190 (0.0046) [2024-06-22 08:06:37,659][15401] Updated weights for policy 0, policy_version 168200 (0.0039) [2024-06-22 08:06:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2755805184. Throughput: 0: 42519.8. Samples: 2755886420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 08:06:41,460][15401] Updated weights for policy 0, policy_version 168210 (0.0026) [2024-06-22 08:06:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2756018176. Throughput: 0: 42486.7. Samples: 2756142140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 08:06:45,148][15401] Updated weights for policy 0, policy_version 168220 (0.0031) [2024-06-22 08:06:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2756231168. Throughput: 0: 42560.6. Samples: 2756405980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 08:06:49,067][15401] Updated weights for policy 0, policy_version 168230 (0.0034) [2024-06-22 08:06:52,566][15401] Updated weights for policy 0, policy_version 168240 (0.0035) [2024-06-22 08:06:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42602.8, 300 sec: 42931.6). Total num frames: 2756460544. Throughput: 0: 42818.1. Samples: 2756536880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 08:06:56,645][15401] Updated weights for policy 0, policy_version 168250 (0.0041) [2024-06-22 08:06:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42326.4, 300 sec: 42765.0). Total num frames: 2756657152. Throughput: 0: 42653.9. Samples: 2756784940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:06:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 08:07:00,279][15401] Updated weights for policy 0, policy_version 168260 (0.0037) [2024-06-22 08:07:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2756870144. Throughput: 0: 42770.1. Samples: 2757046640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:07:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 08:07:04,122][15401] Updated weights for policy 0, policy_version 168270 (0.0026) [2024-06-22 08:07:08,154][15401] Updated weights for policy 0, policy_version 168280 (0.0030) [2024-06-22 08:07:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2757099520. Throughput: 0: 42806.7. Samples: 2757177160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 08:07:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 08:07:12,010][15401] Updated weights for policy 0, policy_version 168290 (0.0030) [2024-06-22 08:07:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2757312512. Throughput: 0: 42717.7. Samples: 2757428820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 08:07:15,874][15401] Updated weights for policy 0, policy_version 168300 (0.0037) [2024-06-22 08:07:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2757509120. Throughput: 0: 42569.7. Samples: 2757683400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 08:07:19,839][15401] Updated weights for policy 0, policy_version 168310 (0.0032) [2024-06-22 08:07:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 2757738496. Throughput: 0: 42777.3. Samples: 2757811500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:23,392][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 08:07:23,538][15401] Updated weights for policy 0, policy_version 168320 (0.0027) [2024-06-22 08:07:27,319][15401] Updated weights for policy 0, policy_version 168330 (0.0048) [2024-06-22 08:07:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2757967872. Throughput: 0: 42857.4. Samples: 2758070720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 08:07:31,280][15401] Updated weights for policy 0, policy_version 168340 (0.0031) [2024-06-22 08:07:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2758164480. Throughput: 0: 42714.6. Samples: 2758328140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:33,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 08:07:35,152][15401] Updated weights for policy 0, policy_version 168350 (0.0043) [2024-06-22 08:07:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2758377472. Throughput: 0: 42605.0. Samples: 2758454100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 08:07:38,753][15401] Updated weights for policy 0, policy_version 168360 (0.0038) [2024-06-22 08:07:42,721][15401] Updated weights for policy 0, policy_version 168370 (0.0027) [2024-06-22 08:07:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 2758606848. Throughput: 0: 42971.7. Samples: 2758718660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 08:07:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168372_2758606848.pth... [2024-06-22 08:07:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000167745_2748334080.pth [2024-06-22 08:07:46,571][15401] Updated weights for policy 0, policy_version 168380 (0.0036) [2024-06-22 08:07:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 2758819840. Throughput: 0: 42771.1. Samples: 2758971440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:48,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 08:07:50,212][15401] Updated weights for policy 0, policy_version 168390 (0.0029) [2024-06-22 08:07:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2759032832. Throughput: 0: 42794.2. Samples: 2759102900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 08:07:54,230][15401] Updated weights for policy 0, policy_version 168400 (0.0028) [2024-06-22 08:07:57,712][15401] Updated weights for policy 0, policy_version 168410 (0.0029) [2024-06-22 08:07:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2759245824. Throughput: 0: 43125.4. Samples: 2759369460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:07:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 08:08:01,774][15401] Updated weights for policy 0, policy_version 168420 (0.0024) [2024-06-22 08:08:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2759442432. Throughput: 0: 43169.3. Samples: 2759626020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:08:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 08:08:05,134][15401] Updated weights for policy 0, policy_version 168430 (0.0030) [2024-06-22 08:08:06,285][15349] Signal inference workers to stop experience collection... (40550 times) [2024-06-22 08:08:06,332][15401] InferenceWorker_p0-w0: stopping experience collection (40550 times) [2024-06-22 08:08:06,398][15349] Signal inference workers to resume experience collection... (40550 times) [2024-06-22 08:08:06,399][15401] InferenceWorker_p0-w0: resuming experience collection (40550 times) [2024-06-22 08:08:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2759671808. Throughput: 0: 43271.2. Samples: 2759758600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:08:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 08:08:09,217][15401] Updated weights for policy 0, policy_version 168440 (0.0028) [2024-06-22 08:08:12,694][15401] Updated weights for policy 0, policy_version 168450 (0.0036) [2024-06-22 08:08:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2759884800. Throughput: 0: 43287.1. Samples: 2760018640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 08:08:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 08:08:17,033][15401] Updated weights for policy 0, policy_version 168460 (0.0032) [2024-06-22 08:08:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2760097792. Throughput: 0: 43236.6. Samples: 2760273780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 08:08:20,316][15401] Updated weights for policy 0, policy_version 168470 (0.0041) [2024-06-22 08:08:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 2760310784. Throughput: 0: 43337.8. Samples: 2760404300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 08:08:24,572][15401] Updated weights for policy 0, policy_version 168480 (0.0038) [2024-06-22 08:08:27,961][15401] Updated weights for policy 0, policy_version 168490 (0.0034) [2024-06-22 08:08:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2760556544. Throughput: 0: 43340.8. Samples: 2760669000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 08:08:32,179][15401] Updated weights for policy 0, policy_version 168500 (0.0030) [2024-06-22 08:08:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 2760753152. Throughput: 0: 43381.8. Samples: 2760923620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:33,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 08:08:35,618][15401] Updated weights for policy 0, policy_version 168510 (0.0030) [2024-06-22 08:08:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2760966144. Throughput: 0: 43241.2. Samples: 2761048760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:38,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 08:08:39,695][15401] Updated weights for policy 0, policy_version 168520 (0.0023) [2024-06-22 08:08:43,277][15401] Updated weights for policy 0, policy_version 168530 (0.0027) [2024-06-22 08:08:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2761195520. Throughput: 0: 43203.6. Samples: 2761313620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 08:08:47,157][15401] Updated weights for policy 0, policy_version 168540 (0.0027) [2024-06-22 08:08:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43419.3, 300 sec: 42987.2). Total num frames: 2761424896. Throughput: 0: 43210.2. Samples: 2761570480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 08:08:50,943][15401] Updated weights for policy 0, policy_version 168550 (0.0036) [2024-06-22 08:08:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2761605120. Throughput: 0: 43205.3. Samples: 2761702840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 08:08:54,859][15401] Updated weights for policy 0, policy_version 168560 (0.0040) [2024-06-22 08:08:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2761834496. Throughput: 0: 43002.7. Samples: 2761953760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:08:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 08:08:58,542][15401] Updated weights for policy 0, policy_version 168570 (0.0026) [2024-06-22 08:09:02,314][15401] Updated weights for policy 0, policy_version 168580 (0.0035) [2024-06-22 08:09:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 2762063872. Throughput: 0: 42980.8. Samples: 2762207920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:09:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 08:09:06,404][15401] Updated weights for policy 0, policy_version 168590 (0.0037) [2024-06-22 08:09:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2762244096. Throughput: 0: 43014.1. Samples: 2762339940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:09:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 08:09:10,067][15401] Updated weights for policy 0, policy_version 168600 (0.0034) [2024-06-22 08:09:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2762473472. Throughput: 0: 42757.8. Samples: 2762593100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:09:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 08:09:14,007][15401] Updated weights for policy 0, policy_version 168610 (0.0032) [2024-06-22 08:09:17,542][15401] Updated weights for policy 0, policy_version 168620 (0.0029) [2024-06-22 08:09:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2762686464. Throughput: 0: 42837.0. Samples: 2762851180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:09:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 08:09:19,322][15349] Signal inference workers to stop experience collection... (40600 times) [2024-06-22 08:09:19,322][15349] Signal inference workers to resume experience collection... (40600 times) [2024-06-22 08:09:19,336][15401] InferenceWorker_p0-w0: stopping experience collection (40600 times) [2024-06-22 08:09:19,336][15401] InferenceWorker_p0-w0: resuming experience collection (40600 times) [2024-06-22 08:09:21,851][15401] Updated weights for policy 0, policy_version 168630 (0.0036) [2024-06-22 08:09:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2762899456. Throughput: 0: 43011.5. Samples: 2762984280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 08:09:25,164][15401] Updated weights for policy 0, policy_version 168640 (0.0031) [2024-06-22 08:09:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2763128832. Throughput: 0: 42770.5. Samples: 2763238300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 08:09:29,310][15401] Updated weights for policy 0, policy_version 168650 (0.0031) [2024-06-22 08:09:32,769][15401] Updated weights for policy 0, policy_version 168660 (0.0036) [2024-06-22 08:09:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43146.4, 300 sec: 42876.1). Total num frames: 2763341824. Throughput: 0: 42872.1. Samples: 2763499720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 08:09:36,787][15401] Updated weights for policy 0, policy_version 168670 (0.0033) [2024-06-22 08:09:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2763554816. Throughput: 0: 42746.7. Samples: 2763626440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 08:09:40,742][15401] Updated weights for policy 0, policy_version 168680 (0.0025) [2024-06-22 08:09:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2763767808. Throughput: 0: 42883.6. Samples: 2763883520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 08:09:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168688_2763784192.pth... [2024-06-22 08:09:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168059_2753478656.pth [2024-06-22 08:09:44,264][15401] Updated weights for policy 0, policy_version 168690 (0.0037) [2024-06-22 08:09:48,344][15401] Updated weights for policy 0, policy_version 168700 (0.0038) [2024-06-22 08:09:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2763980800. Throughput: 0: 43057.8. Samples: 2764145520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 08:09:51,768][15401] Updated weights for policy 0, policy_version 168710 (0.0039) [2024-06-22 08:09:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 2764193792. Throughput: 0: 42814.7. Samples: 2764266600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 08:09:56,259][15401] Updated weights for policy 0, policy_version 168720 (0.0034) [2024-06-22 08:09:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2764406784. Throughput: 0: 42936.7. Samples: 2764525260. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:09:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 08:09:59,665][15401] Updated weights for policy 0, policy_version 168730 (0.0035) [2024-06-22 08:10:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42821.5). Total num frames: 2764603392. Throughput: 0: 42835.5. Samples: 2764778780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:10:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 08:10:03,941][15401] Updated weights for policy 0, policy_version 168740 (0.0038) [2024-06-22 08:10:07,179][15401] Updated weights for policy 0, policy_version 168750 (0.0044) [2024-06-22 08:10:08,392][15132] Fps is (10 sec: 42589.0, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 2764832768. Throughput: 0: 42649.0. Samples: 2764903580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:10:08,392][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 08:10:11,546][15401] Updated weights for policy 0, policy_version 168760 (0.0043) [2024-06-22 08:10:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2765045760. Throughput: 0: 42765.4. Samples: 2765162740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:10:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 08:10:14,907][15401] Updated weights for policy 0, policy_version 168770 (0.0033) [2024-06-22 08:10:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2765258752. Throughput: 0: 42599.5. Samples: 2765416700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:10:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 08:10:19,214][15401] Updated weights for policy 0, policy_version 168780 (0.0041) [2024-06-22 08:10:22,637][15401] Updated weights for policy 0, policy_version 168790 (0.0034) [2024-06-22 08:10:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2765471744. Throughput: 0: 42631.0. Samples: 2765544840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 08:10:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 08:10:26,605][15401] Updated weights for policy 0, policy_version 168800 (0.0022) [2024-06-22 08:10:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2765684736. Throughput: 0: 42810.2. Samples: 2765809980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 08:10:30,297][15401] Updated weights for policy 0, policy_version 168810 (0.0044) [2024-06-22 08:10:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2765930496. Throughput: 0: 42642.2. Samples: 2766064420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:33,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 08:10:33,994][15401] Updated weights for policy 0, policy_version 168820 (0.0029) [2024-06-22 08:10:37,956][15401] Updated weights for policy 0, policy_version 168830 (0.0043) [2024-06-22 08:10:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2766110720. Throughput: 0: 42776.1. Samples: 2766191520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 08:10:41,969][15401] Updated weights for policy 0, policy_version 168840 (0.0030) [2024-06-22 08:10:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2766340096. Throughput: 0: 42760.5. Samples: 2766449480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 08:10:45,599][15401] Updated weights for policy 0, policy_version 168850 (0.0028) [2024-06-22 08:10:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 2766569472. Throughput: 0: 42912.5. Samples: 2766709840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 08:10:48,404][15349] Signal inference workers to stop experience collection... (40650 times) [2024-06-22 08:10:48,404][15349] Signal inference workers to resume experience collection... (40650 times) [2024-06-22 08:10:48,435][15401] InferenceWorker_p0-w0: stopping experience collection (40650 times) [2024-06-22 08:10:48,436][15401] InferenceWorker_p0-w0: resuming experience collection (40650 times) [2024-06-22 08:10:49,383][15401] Updated weights for policy 0, policy_version 168860 (0.0031) [2024-06-22 08:10:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.8). Total num frames: 2766749696. Throughput: 0: 42973.0. Samples: 2766837260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 08:10:53,685][15401] Updated weights for policy 0, policy_version 168870 (0.0028) [2024-06-22 08:10:56,815][15401] Updated weights for policy 0, policy_version 168880 (0.0036) [2024-06-22 08:10:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2766962688. Throughput: 0: 42955.1. Samples: 2767095720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:10:58,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 08:11:01,227][15401] Updated weights for policy 0, policy_version 168890 (0.0035) [2024-06-22 08:11:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2767208448. Throughput: 0: 42972.0. Samples: 2767350440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 08:11:04,221][15401] Updated weights for policy 0, policy_version 168900 (0.0022) [2024-06-22 08:11:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2767388672. Throughput: 0: 43013.9. Samples: 2767480460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 08:11:09,003][15401] Updated weights for policy 0, policy_version 168910 (0.0039) [2024-06-22 08:11:11,841][15401] Updated weights for policy 0, policy_version 168920 (0.0040) [2024-06-22 08:11:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2767601664. Throughput: 0: 42828.4. Samples: 2767737260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 08:11:16,907][15401] Updated weights for policy 0, policy_version 168930 (0.0037) [2024-06-22 08:11:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2767863808. Throughput: 0: 42869.3. Samples: 2767993540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 08:11:19,253][15401] Updated weights for policy 0, policy_version 168940 (0.0033) [2024-06-22 08:11:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2768044032. Throughput: 0: 43204.8. Samples: 2768135740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 08:11:24,387][15401] Updated weights for policy 0, policy_version 168950 (0.0033) [2024-06-22 08:11:26,664][15401] Updated weights for policy 0, policy_version 168960 (0.0038) [2024-06-22 08:11:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 2768257024. Throughput: 0: 43016.0. Samples: 2768385300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 08:11:28,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 08:11:31,877][15401] Updated weights for policy 0, policy_version 168970 (0.0028) [2024-06-22 08:11:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 2768519168. Throughput: 0: 43071.4. Samples: 2768648060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 08:11:34,137][15401] Updated weights for policy 0, policy_version 168980 (0.0034) [2024-06-22 08:11:38,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 2768699392. Throughput: 0: 43402.0. Samples: 2768790460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:38,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 08:11:39,209][15401] Updated weights for policy 0, policy_version 168990 (0.0027) [2024-06-22 08:11:41,555][15401] Updated weights for policy 0, policy_version 169000 (0.0027) [2024-06-22 08:11:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2768896000. Throughput: 0: 43083.9. Samples: 2769034500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 08:11:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169000_2768896000.pth... [2024-06-22 08:11:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168372_2758606848.pth [2024-06-22 08:11:46,862][15401] Updated weights for policy 0, policy_version 169010 (0.0028) [2024-06-22 08:11:48,390][15132] Fps is (10 sec: 47525.1, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 2769174528. Throughput: 0: 43109.3. Samples: 2769290360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:11:49,189][15401] Updated weights for policy 0, policy_version 169020 (0.0028) [2024-06-22 08:11:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2769321984. Throughput: 0: 43409.7. Samples: 2769433900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:53,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 08:11:53,827][15349] Signal inference workers to stop experience collection... (40700 times) [2024-06-22 08:11:53,832][15349] Signal inference workers to resume experience collection... (40700 times) [2024-06-22 08:11:53,864][15401] InferenceWorker_p0-w0: stopping experience collection (40700 times) [2024-06-22 08:11:53,865][15401] InferenceWorker_p0-w0: resuming experience collection (40700 times) [2024-06-22 08:11:54,336][15401] Updated weights for policy 0, policy_version 169030 (0.0042) [2024-06-22 08:11:57,022][15401] Updated weights for policy 0, policy_version 169040 (0.0030) [2024-06-22 08:11:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2769551360. Throughput: 0: 43057.4. Samples: 2769674840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:11:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 08:12:01,930][15401] Updated weights for policy 0, policy_version 169050 (0.0040) [2024-06-22 08:12:03,390][15132] Fps is (10 sec: 49152.2, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 2769813504. Throughput: 0: 43189.7. Samples: 2769937080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 08:12:04,646][15401] Updated weights for policy 0, policy_version 169060 (0.0043) [2024-06-22 08:12:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2769960960. Throughput: 0: 43169.3. Samples: 2770078360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:08,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 08:12:09,387][15401] Updated weights for policy 0, policy_version 169070 (0.0023) [2024-06-22 08:12:12,232][15401] Updated weights for policy 0, policy_version 169080 (0.0038) [2024-06-22 08:12:13,392][15132] Fps is (10 sec: 39312.3, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 2770206720. Throughput: 0: 43092.0. Samples: 2770324440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:13,401][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 08:12:17,121][15401] Updated weights for policy 0, policy_version 169090 (0.0025) [2024-06-22 08:12:18,392][15132] Fps is (10 sec: 49139.9, 60 sec: 43142.7, 300 sec: 43098.2). Total num frames: 2770452480. Throughput: 0: 43040.0. Samples: 2770584960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:18,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 08:12:19,821][15401] Updated weights for policy 0, policy_version 169100 (0.0023) [2024-06-22 08:12:23,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2770616320. Throughput: 0: 42936.9. Samples: 2770722520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:23,396][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 08:12:24,658][15401] Updated weights for policy 0, policy_version 169110 (0.0028) [2024-06-22 08:12:27,482][15401] Updated weights for policy 0, policy_version 169120 (0.0027) [2024-06-22 08:12:28,390][15132] Fps is (10 sec: 40969.9, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 2770862080. Throughput: 0: 42989.3. Samples: 2770969020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 08:12:32,392][15401] Updated weights for policy 0, policy_version 169130 (0.0034) [2024-06-22 08:12:33,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.6, 300 sec: 43098.3). Total num frames: 2771091456. Throughput: 0: 43126.3. Samples: 2771231040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 08:12:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 08:12:35,125][15401] Updated weights for policy 0, policy_version 169140 (0.0033) [2024-06-22 08:12:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 2771255296. Throughput: 0: 42776.0. Samples: 2771358820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:12:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 08:12:40,027][15401] Updated weights for policy 0, policy_version 169150 (0.0036) [2024-06-22 08:12:43,186][15401] Updated weights for policy 0, policy_version 169160 (0.0027) [2024-06-22 08:12:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 43043.1). Total num frames: 2771517440. Throughput: 0: 43001.8. Samples: 2771609920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:12:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 08:12:46,880][15349] Signal inference workers to stop experience collection... (40750 times) [2024-06-22 08:12:46,880][15349] Signal inference workers to resume experience collection... (40750 times) [2024-06-22 08:12:46,928][15401] InferenceWorker_p0-w0: stopping experience collection (40750 times) [2024-06-22 08:12:46,928][15401] InferenceWorker_p0-w0: resuming experience collection (40750 times) [2024-06-22 08:12:47,779][15401] Updated weights for policy 0, policy_version 169170 (0.0049) [2024-06-22 08:12:48,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 2771730432. Throughput: 0: 42940.9. Samples: 2771869420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:12:48,394][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 08:12:50,876][15401] Updated weights for policy 0, policy_version 169180 (0.0029) [2024-06-22 08:12:53,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2771894272. Throughput: 0: 42583.5. Samples: 2771994620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:12:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 08:12:55,282][15401] Updated weights for policy 0, policy_version 169190 (0.0039) [2024-06-22 08:12:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 2772156416. Throughput: 0: 42891.1. Samples: 2772254440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:12:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 08:12:58,508][15401] Updated weights for policy 0, policy_version 169200 (0.0034) [2024-06-22 08:13:02,891][15401] Updated weights for policy 0, policy_version 169210 (0.0036) [2024-06-22 08:13:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 2772353024. Throughput: 0: 42870.3. Samples: 2772514020. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 08:13:06,356][15401] Updated weights for policy 0, policy_version 169220 (0.0036) [2024-06-22 08:13:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2772533248. Throughput: 0: 42600.9. Samples: 2772639560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:08,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-22 08:13:10,917][15401] Updated weights for policy 0, policy_version 169230 (0.0041) [2024-06-22 08:13:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 2772795392. Throughput: 0: 42708.1. Samples: 2772890880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 08:13:14,225][15401] Updated weights for policy 0, policy_version 169240 (0.0034) [2024-06-22 08:13:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42054.0, 300 sec: 42931.6). Total num frames: 2772975616. Throughput: 0: 42823.1. Samples: 2773158080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 08:13:18,475][15401] Updated weights for policy 0, policy_version 169250 (0.0045) [2024-06-22 08:13:21,978][15401] Updated weights for policy 0, policy_version 169260 (0.0024) [2024-06-22 08:13:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2773172224. Throughput: 0: 42557.9. Samples: 2773273920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 08:13:26,095][15401] Updated weights for policy 0, policy_version 169270 (0.0026) [2024-06-22 08:13:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 2773450752. Throughput: 0: 42807.6. Samples: 2773536260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 08:13:29,662][15401] Updated weights for policy 0, policy_version 169280 (0.0027) [2024-06-22 08:13:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 2773614592. Throughput: 0: 42851.2. Samples: 2773797720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 08:13:33,880][15401] Updated weights for policy 0, policy_version 169290 (0.0034) [2024-06-22 08:13:37,598][15401] Updated weights for policy 0, policy_version 169300 (0.0032) [2024-06-22 08:13:38,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2773827584. Throughput: 0: 42703.7. Samples: 2773916280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 08:13:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 08:13:41,450][15401] Updated weights for policy 0, policy_version 169310 (0.0032) [2024-06-22 08:13:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2774073344. Throughput: 0: 42773.8. Samples: 2774179260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:13:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 08:13:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169317_2774089728.pth... [2024-06-22 08:13:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000168688_2763784192.pth [2024-06-22 08:13:45,421][15401] Updated weights for policy 0, policy_version 169320 (0.0041) [2024-06-22 08:13:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 2774269952. Throughput: 0: 42826.6. Samples: 2774441220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:13:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 08:13:49,053][15401] Updated weights for policy 0, policy_version 169330 (0.0037) [2024-06-22 08:13:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2774450176. Throughput: 0: 42577.0. Samples: 2774555520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:13:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 08:13:53,463][15401] Updated weights for policy 0, policy_version 169340 (0.0025) [2024-06-22 08:13:56,709][15401] Updated weights for policy 0, policy_version 169350 (0.0033) [2024-06-22 08:13:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 2774728704. Throughput: 0: 42736.9. Samples: 2774814040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:13:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 08:14:01,144][15401] Updated weights for policy 0, policy_version 169360 (0.0023) [2024-06-22 08:14:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2774892544. Throughput: 0: 42660.7. Samples: 2775077820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 08:14:04,257][15401] Updated weights for policy 0, policy_version 169370 (0.0039) [2024-06-22 08:14:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2775105536. Throughput: 0: 42742.1. Samples: 2775197320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 08:14:08,736][15401] Updated weights for policy 0, policy_version 169380 (0.0035) [2024-06-22 08:14:10,163][15349] Signal inference workers to stop experience collection... (40800 times) [2024-06-22 08:14:10,163][15349] Signal inference workers to resume experience collection... (40800 times) [2024-06-22 08:14:10,178][15401] InferenceWorker_p0-w0: stopping experience collection (40800 times) [2024-06-22 08:14:10,178][15401] InferenceWorker_p0-w0: resuming experience collection (40800 times) [2024-06-22 08:14:11,852][15401] Updated weights for policy 0, policy_version 169390 (0.0027) [2024-06-22 08:14:13,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2775334912. Throughput: 0: 42655.0. Samples: 2775455740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 08:14:16,325][15401] Updated weights for policy 0, policy_version 169400 (0.0034) [2024-06-22 08:14:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2775531520. Throughput: 0: 42638.1. Samples: 2775716440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 08:14:19,410][15401] Updated weights for policy 0, policy_version 169410 (0.0038) [2024-06-22 08:14:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 2775760896. Throughput: 0: 42729.6. Samples: 2775839220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:23,401][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 08:14:24,096][15401] Updated weights for policy 0, policy_version 169420 (0.0040) [2024-06-22 08:14:27,110][15401] Updated weights for policy 0, policy_version 169430 (0.0047) [2024-06-22 08:14:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 2775973888. Throughput: 0: 42526.2. Samples: 2776092940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 08:14:31,722][15401] Updated weights for policy 0, policy_version 169440 (0.0033) [2024-06-22 08:14:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2776170496. Throughput: 0: 42502.8. Samples: 2776353840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 08:14:35,423][15401] Updated weights for policy 0, policy_version 169450 (0.0027) [2024-06-22 08:14:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2776399872. Throughput: 0: 42740.9. Samples: 2776478860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 08:14:39,381][15401] Updated weights for policy 0, policy_version 169460 (0.0050) [2024-06-22 08:14:42,973][15401] Updated weights for policy 0, policy_version 169470 (0.0035) [2024-06-22 08:14:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2776612864. Throughput: 0: 42586.6. Samples: 2776730440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 08:14:43,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 08:14:47,409][15401] Updated weights for policy 0, policy_version 169480 (0.0032) [2024-06-22 08:14:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2776809472. Throughput: 0: 42593.0. Samples: 2776994500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:14:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 08:14:50,570][15401] Updated weights for policy 0, policy_version 169490 (0.0026) [2024-06-22 08:14:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2777055232. Throughput: 0: 42683.5. Samples: 2777118080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:14:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 08:14:54,811][15401] Updated weights for policy 0, policy_version 169500 (0.0036) [2024-06-22 08:14:58,280][15401] Updated weights for policy 0, policy_version 169510 (0.0058) [2024-06-22 08:14:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 2777251840. Throughput: 0: 42675.6. Samples: 2777376140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:14:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 08:15:02,442][15401] Updated weights for policy 0, policy_version 169520 (0.0041) [2024-06-22 08:15:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 2777448448. Throughput: 0: 42617.9. Samples: 2777634240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 08:15:05,870][15401] Updated weights for policy 0, policy_version 169530 (0.0034) [2024-06-22 08:15:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2777694208. Throughput: 0: 42598.7. Samples: 2777756060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 08:15:10,049][15401] Updated weights for policy 0, policy_version 169540 (0.0045) [2024-06-22 08:15:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2777890816. Throughput: 0: 42609.9. Samples: 2778010380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 08:15:13,464][15401] Updated weights for policy 0, policy_version 169550 (0.0036) [2024-06-22 08:15:17,654][15401] Updated weights for policy 0, policy_version 169560 (0.0036) [2024-06-22 08:15:18,391][15132] Fps is (10 sec: 39316.6, 60 sec: 42597.4, 300 sec: 42764.8). Total num frames: 2778087424. Throughput: 0: 42537.3. Samples: 2778268080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:18,391][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 08:15:21,551][15401] Updated weights for policy 0, policy_version 169570 (0.0027) [2024-06-22 08:15:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2778316800. Throughput: 0: 42606.2. Samples: 2778396140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:23,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 08:15:24,074][15349] Signal inference workers to stop experience collection... (40850 times) [2024-06-22 08:15:24,074][15349] Signal inference workers to resume experience collection... (40850 times) [2024-06-22 08:15:24,123][15401] InferenceWorker_p0-w0: stopping experience collection (40850 times) [2024-06-22 08:15:24,124][15401] InferenceWorker_p0-w0: resuming experience collection (40850 times) [2024-06-22 08:15:25,189][15401] Updated weights for policy 0, policy_version 169580 (0.0039) [2024-06-22 08:15:28,392][15132] Fps is (10 sec: 44232.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2778529792. Throughput: 0: 42761.8. Samples: 2778654820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:28,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 08:15:29,167][15401] Updated weights for policy 0, policy_version 169590 (0.0022) [2024-06-22 08:15:33,099][15401] Updated weights for policy 0, policy_version 169600 (0.0046) [2024-06-22 08:15:33,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2778742784. Throughput: 0: 42492.0. Samples: 2778906740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:33,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 08:15:36,707][15401] Updated weights for policy 0, policy_version 169610 (0.0040) [2024-06-22 08:15:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2778972160. Throughput: 0: 42580.9. Samples: 2779034220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 08:15:40,635][15401] Updated weights for policy 0, policy_version 169620 (0.0036) [2024-06-22 08:15:43,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2779168768. Throughput: 0: 42667.0. Samples: 2779296160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 08:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169627_2779168768.pth... [2024-06-22 08:15:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169000_2768896000.pth [2024-06-22 08:15:44,304][15401] Updated weights for policy 0, policy_version 169630 (0.0031) [2024-06-22 08:15:48,198][15401] Updated weights for policy 0, policy_version 169640 (0.0034) [2024-06-22 08:15:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2779381760. Throughput: 0: 42492.4. Samples: 2779546400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 08:15:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 08:15:52,194][15401] Updated weights for policy 0, policy_version 169650 (0.0034) [2024-06-22 08:15:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2779611136. Throughput: 0: 42645.4. Samples: 2779675100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:15:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 08:15:55,962][15401] Updated weights for policy 0, policy_version 169660 (0.0042) [2024-06-22 08:15:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2779807744. Throughput: 0: 42595.9. Samples: 2779927200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:15:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 08:15:59,814][15401] Updated weights for policy 0, policy_version 169670 (0.0034) [2024-06-22 08:16:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2780004352. Throughput: 0: 42718.2. Samples: 2780190340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 08:16:03,841][15401] Updated weights for policy 0, policy_version 169680 (0.0039) [2024-06-22 08:16:07,212][15401] Updated weights for policy 0, policy_version 169690 (0.0039) [2024-06-22 08:16:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2780233728. Throughput: 0: 42675.5. Samples: 2780316540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 08:16:11,443][15401] Updated weights for policy 0, policy_version 169700 (0.0030) [2024-06-22 08:16:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2780446720. Throughput: 0: 42638.3. Samples: 2780573440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 08:16:14,756][15401] Updated weights for policy 0, policy_version 169710 (0.0032) [2024-06-22 08:16:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.4, 300 sec: 42765.0). Total num frames: 2780659712. Throughput: 0: 42813.7. Samples: 2780833260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 08:16:19,107][15401] Updated weights for policy 0, policy_version 169720 (0.0041) [2024-06-22 08:16:22,368][15401] Updated weights for policy 0, policy_version 169730 (0.0048) [2024-06-22 08:16:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2780872704. Throughput: 0: 42741.0. Samples: 2780957560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 08:16:26,820][15401] Updated weights for policy 0, policy_version 169740 (0.0036) [2024-06-22 08:16:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2781085696. Throughput: 0: 42644.2. Samples: 2781215140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 08:16:29,966][15401] Updated weights for policy 0, policy_version 169750 (0.0039) [2024-06-22 08:16:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42326.9, 300 sec: 42654.3). Total num frames: 2781282304. Throughput: 0: 42825.5. Samples: 2781473560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 08:16:34,418][15401] Updated weights for policy 0, policy_version 169760 (0.0034) [2024-06-22 08:16:37,671][15401] Updated weights for policy 0, policy_version 169770 (0.0031) [2024-06-22 08:16:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2781528064. Throughput: 0: 42714.2. Samples: 2781597240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:38,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 08:16:41,935][15349] Signal inference workers to stop experience collection... (40900 times) [2024-06-22 08:16:41,935][15349] Signal inference workers to resume experience collection... (40900 times) [2024-06-22 08:16:41,980][15401] InferenceWorker_p0-w0: stopping experience collection (40900 times) [2024-06-22 08:16:41,980][15401] InferenceWorker_p0-w0: resuming experience collection (40900 times) [2024-06-22 08:16:42,070][15401] Updated weights for policy 0, policy_version 169780 (0.0033) [2024-06-22 08:16:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2781724672. Throughput: 0: 42826.2. Samples: 2781854380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 08:16:45,372][15401] Updated weights for policy 0, policy_version 169790 (0.0034) [2024-06-22 08:16:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2781937664. Throughput: 0: 42678.2. Samples: 2782110860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 08:16:49,788][15401] Updated weights for policy 0, policy_version 169800 (0.0025) [2024-06-22 08:16:52,881][15401] Updated weights for policy 0, policy_version 169810 (0.0051) [2024-06-22 08:16:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2782167040. Throughput: 0: 42758.7. Samples: 2782240680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-22 08:16:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 08:16:57,274][15401] Updated weights for policy 0, policy_version 169820 (0.0034) [2024-06-22 08:16:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2782380032. Throughput: 0: 42836.8. Samples: 2782501100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:16:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 08:17:00,260][15401] Updated weights for policy 0, policy_version 169830 (0.0034) [2024-06-22 08:17:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2782576640. Throughput: 0: 42802.3. Samples: 2782759360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 08:17:04,885][15401] Updated weights for policy 0, policy_version 169840 (0.0024) [2024-06-22 08:17:07,741][15401] Updated weights for policy 0, policy_version 169850 (0.0043) [2024-06-22 08:17:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 2782822400. Throughput: 0: 42883.2. Samples: 2782887300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:08,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 08:17:12,402][15401] Updated weights for policy 0, policy_version 169860 (0.0037) [2024-06-22 08:17:13,393][15132] Fps is (10 sec: 45860.0, 60 sec: 43142.1, 300 sec: 42653.8). Total num frames: 2783035392. Throughput: 0: 42977.2. Samples: 2783149260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:13,393][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 08:17:15,569][15401] Updated weights for policy 0, policy_version 169870 (0.0030) [2024-06-22 08:17:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2783199232. Throughput: 0: 42779.4. Samples: 2783398620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 08:17:20,160][15401] Updated weights for policy 0, policy_version 169880 (0.0043) [2024-06-22 08:17:23,246][15401] Updated weights for policy 0, policy_version 169890 (0.0036) [2024-06-22 08:17:23,390][15132] Fps is (10 sec: 44250.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2783477760. Throughput: 0: 42744.3. Samples: 2783520740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:23,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 08:17:27,869][15401] Updated weights for policy 0, policy_version 169900 (0.0046) [2024-06-22 08:17:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2783674368. Throughput: 0: 42910.3. Samples: 2783785340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 08:17:30,854][15401] Updated weights for policy 0, policy_version 169910 (0.0038) [2024-06-22 08:17:33,392][15132] Fps is (10 sec: 37674.6, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 2783854592. Throughput: 0: 43034.6. Samples: 2784047520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:33,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 08:17:35,458][15401] Updated weights for policy 0, policy_version 169920 (0.0029) [2024-06-22 08:17:37,549][15349] Signal inference workers to stop experience collection... (40950 times) [2024-06-22 08:17:37,603][15349] Signal inference workers to resume experience collection... (40950 times) [2024-06-22 08:17:37,604][15401] InferenceWorker_p0-w0: stopping experience collection (40950 times) [2024-06-22 08:17:37,631][15401] InferenceWorker_p0-w0: resuming experience collection (40950 times) [2024-06-22 08:17:38,383][15401] Updated weights for policy 0, policy_version 169930 (0.0038) [2024-06-22 08:17:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2784133120. Throughput: 0: 42777.4. Samples: 2784165660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 08:17:43,082][15401] Updated weights for policy 0, policy_version 169940 (0.0028) [2024-06-22 08:17:43,389][15132] Fps is (10 sec: 45886.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2784313344. Throughput: 0: 42957.9. Samples: 2784434200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:43,390][15132] Avg episode reward: [(0, '0.144')] [2024-06-22 08:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169942_2784329728.pth... [2024-06-22 08:17:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169317_2774089728.pth [2024-06-22 08:17:46,266][15401] Updated weights for policy 0, policy_version 169950 (0.0042) [2024-06-22 08:17:48,390][15132] Fps is (10 sec: 36044.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2784493568. Throughput: 0: 42889.7. Samples: 2784689400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:48,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-22 08:17:50,848][15401] Updated weights for policy 0, policy_version 169960 (0.0027) [2024-06-22 08:17:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2784755712. Throughput: 0: 42753.2. Samples: 2784811200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:53,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 08:17:54,041][15401] Updated weights for policy 0, policy_version 169970 (0.0030) [2024-06-22 08:17:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2784935936. Throughput: 0: 42772.0. Samples: 2785073860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:17:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 08:17:58,775][15401] Updated weights for policy 0, policy_version 169980 (0.0023) [2024-06-22 08:18:02,402][15401] Updated weights for policy 0, policy_version 169990 (0.0024) [2024-06-22 08:18:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2785132544. Throughput: 0: 42701.1. Samples: 2785320180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 08:18:06,610][15401] Updated weights for policy 0, policy_version 170000 (0.0028) [2024-06-22 08:18:08,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2785411072. Throughput: 0: 42881.6. Samples: 2785450400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:08,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 08:18:10,320][15401] Updated weights for policy 0, policy_version 170010 (0.0035) [2024-06-22 08:18:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42054.6, 300 sec: 42653.9). Total num frames: 2785558528. Throughput: 0: 42785.3. Samples: 2785710680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 08:18:14,057][15401] Updated weights for policy 0, policy_version 170020 (0.0042) [2024-06-22 08:18:17,768][15401] Updated weights for policy 0, policy_version 170030 (0.0035) [2024-06-22 08:18:18,389][15132] Fps is (10 sec: 37682.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2785787904. Throughput: 0: 42552.1. Samples: 2785962260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 08:18:21,574][15401] Updated weights for policy 0, policy_version 170040 (0.0036) [2024-06-22 08:18:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2786033664. Throughput: 0: 42797.7. Samples: 2786091560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 08:18:25,255][15401] Updated weights for policy 0, policy_version 170050 (0.0027) [2024-06-22 08:18:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2786213888. Throughput: 0: 42618.1. Samples: 2786352020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:28,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 08:18:29,219][15401] Updated weights for policy 0, policy_version 170060 (0.0038) [2024-06-22 08:18:32,802][15401] Updated weights for policy 0, policy_version 170070 (0.0042) [2024-06-22 08:18:33,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 2786426880. Throughput: 0: 42479.1. Samples: 2786601060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:33,393][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 08:18:37,105][15401] Updated weights for policy 0, policy_version 170080 (0.0038) [2024-06-22 08:18:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2786672640. Throughput: 0: 42622.8. Samples: 2786729220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 08:18:40,931][15401] Updated weights for policy 0, policy_version 170090 (0.0024) [2024-06-22 08:18:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2786852864. Throughput: 0: 42499.9. Samples: 2786986360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 08:18:44,687][15401] Updated weights for policy 0, policy_version 170100 (0.0040) [2024-06-22 08:18:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2787065856. Throughput: 0: 42706.9. Samples: 2787241980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 08:18:48,443][15401] Updated weights for policy 0, policy_version 170110 (0.0046) [2024-06-22 08:18:52,371][15401] Updated weights for policy 0, policy_version 170120 (0.0031) [2024-06-22 08:18:52,860][15349] Signal inference workers to stop experience collection... (41000 times) [2024-06-22 08:18:52,894][15401] InferenceWorker_p0-w0: stopping experience collection (41000 times) [2024-06-22 08:18:52,909][15349] Signal inference workers to resume experience collection... (41000 times) [2024-06-22 08:18:52,916][15401] InferenceWorker_p0-w0: resuming experience collection (41000 times) [2024-06-22 08:18:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2787311616. Throughput: 0: 42703.0. Samples: 2787372040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 08:18:55,851][15401] Updated weights for policy 0, policy_version 170130 (0.0033) [2024-06-22 08:18:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2787491840. Throughput: 0: 42717.7. Samples: 2787632980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:18:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 08:18:59,819][15401] Updated weights for policy 0, policy_version 170140 (0.0033) [2024-06-22 08:19:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2787721216. Throughput: 0: 42804.0. Samples: 2787888440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:19:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 08:19:03,410][15401] Updated weights for policy 0, policy_version 170150 (0.0042) [2024-06-22 08:19:07,805][15401] Updated weights for policy 0, policy_version 170160 (0.0051) [2024-06-22 08:19:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2787950592. Throughput: 0: 42780.0. Samples: 2788016660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 08:19:10,964][15401] Updated weights for policy 0, policy_version 170170 (0.0028) [2024-06-22 08:19:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2788130816. Throughput: 0: 42713.4. Samples: 2788274120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 08:19:15,437][15401] Updated weights for policy 0, policy_version 170180 (0.0037) [2024-06-22 08:19:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2788360192. Throughput: 0: 42725.5. Samples: 2788523600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 08:19:18,995][15401] Updated weights for policy 0, policy_version 170190 (0.0030) [2024-06-22 08:19:23,011][15401] Updated weights for policy 0, policy_version 170200 (0.0038) [2024-06-22 08:19:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2788573184. Throughput: 0: 42766.6. Samples: 2788653720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 08:19:26,437][15401] Updated weights for policy 0, policy_version 170210 (0.0032) [2024-06-22 08:19:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2788769792. Throughput: 0: 42757.4. Samples: 2788910440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:19:30,551][15401] Updated weights for policy 0, policy_version 170220 (0.0042) [2024-06-22 08:19:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2788999168. Throughput: 0: 42612.8. Samples: 2789159560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 08:19:34,269][15401] Updated weights for policy 0, policy_version 170230 (0.0039) [2024-06-22 08:19:38,187][15401] Updated weights for policy 0, policy_version 170240 (0.0043) [2024-06-22 08:19:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2789228544. Throughput: 0: 42620.5. Samples: 2789289960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:38,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 08:19:41,923][15401] Updated weights for policy 0, policy_version 170250 (0.0027) [2024-06-22 08:19:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2789425152. Throughput: 0: 42542.3. Samples: 2789547380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 08:19:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170253_2789425152.pth... [2024-06-22 08:19:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169627_2779168768.pth [2024-06-22 08:19:45,725][15401] Updated weights for policy 0, policy_version 170260 (0.0026) [2024-06-22 08:19:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2789638144. Throughput: 0: 42594.5. Samples: 2789805200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:48,399][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 08:19:49,522][15401] Updated weights for policy 0, policy_version 170270 (0.0049) [2024-06-22 08:19:53,387][15401] Updated weights for policy 0, policy_version 170280 (0.0038) [2024-06-22 08:19:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2789867520. Throughput: 0: 42705.3. Samples: 2789938400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:53,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 08:19:57,441][15401] Updated weights for policy 0, policy_version 170290 (0.0052) [2024-06-22 08:19:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2790080512. Throughput: 0: 42558.1. Samples: 2790189240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:19:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 08:20:01,146][15401] Updated weights for policy 0, policy_version 170300 (0.0042) [2024-06-22 08:20:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2790277120. Throughput: 0: 42656.4. Samples: 2790443140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:20:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 08:20:05,111][15401] Updated weights for policy 0, policy_version 170310 (0.0030) [2024-06-22 08:20:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2790490112. Throughput: 0: 42719.6. Samples: 2790576100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 08:20:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 08:20:08,601][15349] Signal inference workers to stop experience collection... (41050 times) [2024-06-22 08:20:08,654][15401] InferenceWorker_p0-w0: stopping experience collection (41050 times) [2024-06-22 08:20:08,655][15349] Signal inference workers to resume experience collection... (41050 times) [2024-06-22 08:20:08,673][15401] InferenceWorker_p0-w0: resuming experience collection (41050 times) [2024-06-22 08:20:08,796][15401] Updated weights for policy 0, policy_version 170320 (0.0033) [2024-06-22 08:20:12,861][15401] Updated weights for policy 0, policy_version 170330 (0.0029) [2024-06-22 08:20:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 2790686720. Throughput: 0: 42645.3. Samples: 2790829480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 08:20:16,829][15401] Updated weights for policy 0, policy_version 170340 (0.0035) [2024-06-22 08:20:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2790932480. Throughput: 0: 42747.5. Samples: 2791083200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 08:20:20,465][15401] Updated weights for policy 0, policy_version 170350 (0.0043) [2024-06-22 08:20:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2791129088. Throughput: 0: 42868.8. Samples: 2791219060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 08:20:24,347][15401] Updated weights for policy 0, policy_version 170360 (0.0032) [2024-06-22 08:20:28,173][15401] Updated weights for policy 0, policy_version 170370 (0.0037) [2024-06-22 08:20:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2791342080. Throughput: 0: 42595.9. Samples: 2791464200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:28,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-22 08:20:31,798][15401] Updated weights for policy 0, policy_version 170380 (0.0032) [2024-06-22 08:20:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2791555072. Throughput: 0: 42620.1. Samples: 2791723100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 08:20:36,137][15401] Updated weights for policy 0, policy_version 170390 (0.0046) [2024-06-22 08:20:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2791784448. Throughput: 0: 42632.9. Samples: 2791856880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:38,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 08:20:39,495][15401] Updated weights for policy 0, policy_version 170400 (0.0028) [2024-06-22 08:20:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2791964672. Throughput: 0: 42667.2. Samples: 2792109260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 08:20:43,755][15401] Updated weights for policy 0, policy_version 170410 (0.0022) [2024-06-22 08:20:46,935][15401] Updated weights for policy 0, policy_version 170420 (0.0025) [2024-06-22 08:20:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2792194048. Throughput: 0: 42742.7. Samples: 2792366560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:48,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-22 08:20:51,391][15401] Updated weights for policy 0, policy_version 170430 (0.0032) [2024-06-22 08:20:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2792423424. Throughput: 0: 42833.3. Samples: 2792503600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 08:20:54,730][15401] Updated weights for policy 0, policy_version 170440 (0.0036) [2024-06-22 08:20:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2792620032. Throughput: 0: 42790.3. Samples: 2792755040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:20:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 08:20:58,847][15401] Updated weights for policy 0, policy_version 170450 (0.0039) [2024-06-22 08:21:02,382][15401] Updated weights for policy 0, policy_version 170460 (0.0030) [2024-06-22 08:21:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2792865792. Throughput: 0: 42824.9. Samples: 2793010420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:21:03,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 08:21:06,537][15401] Updated weights for policy 0, policy_version 170470 (0.0030) [2024-06-22 08:21:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2793078784. Throughput: 0: 42729.8. Samples: 2793141900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:21:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 08:21:09,947][15401] Updated weights for policy 0, policy_version 170480 (0.0029) [2024-06-22 08:21:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2793275392. Throughput: 0: 43012.8. Samples: 2793399780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 08:21:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 08:21:14,157][15401] Updated weights for policy 0, policy_version 170490 (0.0030) [2024-06-22 08:21:17,571][15401] Updated weights for policy 0, policy_version 170500 (0.0029) [2024-06-22 08:21:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2793472000. Throughput: 0: 42950.1. Samples: 2793655860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:18,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 08:21:21,716][15401] Updated weights for policy 0, policy_version 170510 (0.0034) [2024-06-22 08:21:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2793734144. Throughput: 0: 42957.3. Samples: 2793789960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:23,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 08:21:25,265][15401] Updated weights for policy 0, policy_version 170520 (0.0032) [2024-06-22 08:21:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2793897984. Throughput: 0: 43087.6. Samples: 2794048200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 08:21:29,214][15401] Updated weights for policy 0, policy_version 170530 (0.0033) [2024-06-22 08:21:32,839][15401] Updated weights for policy 0, policy_version 170540 (0.0035) [2024-06-22 08:21:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2794143744. Throughput: 0: 43056.5. Samples: 2794304100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 08:21:36,711][15401] Updated weights for policy 0, policy_version 170550 (0.0039) [2024-06-22 08:21:37,778][15349] Signal inference workers to stop experience collection... (41100 times) [2024-06-22 08:21:37,813][15401] InferenceWorker_p0-w0: stopping experience collection (41100 times) [2024-06-22 08:21:37,834][15349] Signal inference workers to resume experience collection... (41100 times) [2024-06-22 08:21:37,840][15401] InferenceWorker_p0-w0: resuming experience collection (41100 times) [2024-06-22 08:21:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2794373120. Throughput: 0: 43030.8. Samples: 2794439980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 08:21:40,277][15401] Updated weights for policy 0, policy_version 170560 (0.0036) [2024-06-22 08:21:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2794553344. Throughput: 0: 43215.2. Samples: 2794699720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 08:21:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170567_2794569728.pth... [2024-06-22 08:21:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000169942_2784329728.pth [2024-06-22 08:21:44,409][15401] Updated weights for policy 0, policy_version 170570 (0.0024) [2024-06-22 08:21:47,866][15401] Updated weights for policy 0, policy_version 170580 (0.0026) [2024-06-22 08:21:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2794782720. Throughput: 0: 43083.1. Samples: 2794949060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 08:21:51,823][15401] Updated weights for policy 0, policy_version 170590 (0.0046) [2024-06-22 08:21:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2795012096. Throughput: 0: 43228.4. Samples: 2795087180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 08:21:55,334][15401] Updated weights for policy 0, policy_version 170600 (0.0030) [2024-06-22 08:21:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2795192320. Throughput: 0: 43133.4. Samples: 2795340780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:21:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 08:21:59,464][15401] Updated weights for policy 0, policy_version 170610 (0.0039) [2024-06-22 08:22:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 2795421696. Throughput: 0: 43168.9. Samples: 2795598460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:22:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 08:22:03,456][15401] Updated weights for policy 0, policy_version 170620 (0.0032) [2024-06-22 08:22:07,081][15401] Updated weights for policy 0, policy_version 170630 (0.0040) [2024-06-22 08:22:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.5). Total num frames: 2795651072. Throughput: 0: 43116.4. Samples: 2795730200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:22:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 08:22:10,975][15401] Updated weights for policy 0, policy_version 170640 (0.0034) [2024-06-22 08:22:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2795831296. Throughput: 0: 43034.1. Samples: 2795984740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:22:13,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 08:22:14,771][15401] Updated weights for policy 0, policy_version 170650 (0.0027) [2024-06-22 08:22:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2796060672. Throughput: 0: 42961.3. Samples: 2796237360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-22 08:22:18,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 08:22:18,603][15401] Updated weights for policy 0, policy_version 170660 (0.0038) [2024-06-22 08:22:22,184][15401] Updated weights for policy 0, policy_version 170670 (0.0031) [2024-06-22 08:22:23,396][15132] Fps is (10 sec: 47483.1, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 2796306432. Throughput: 0: 43044.0. Samples: 2796377240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:23,397][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 08:22:26,449][15401] Updated weights for policy 0, policy_version 170680 (0.0028) [2024-06-22 08:22:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2796470272. Throughput: 0: 42848.5. Samples: 2796627900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 08:22:29,991][15401] Updated weights for policy 0, policy_version 170690 (0.0040) [2024-06-22 08:22:33,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2796716032. Throughput: 0: 43001.0. Samples: 2796884100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 08:22:34,168][15401] Updated weights for policy 0, policy_version 170700 (0.0037) [2024-06-22 08:22:37,657][15401] Updated weights for policy 0, policy_version 170710 (0.0033) [2024-06-22 08:22:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2796945408. Throughput: 0: 42968.1. Samples: 2797020740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 08:22:41,592][15401] Updated weights for policy 0, policy_version 170720 (0.0043) [2024-06-22 08:22:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2797142016. Throughput: 0: 43040.5. Samples: 2797277600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 08:22:45,309][15401] Updated weights for policy 0, policy_version 170730 (0.0029) [2024-06-22 08:22:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 2797371392. Throughput: 0: 42905.0. Samples: 2797529180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 08:22:49,125][15401] Updated weights for policy 0, policy_version 170740 (0.0027) [2024-06-22 08:22:52,869][15401] Updated weights for policy 0, policy_version 170750 (0.0028) [2024-06-22 08:22:53,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2797600768. Throughput: 0: 42950.7. Samples: 2797663080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:53,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 08:22:56,595][15401] Updated weights for policy 0, policy_version 170760 (0.0039) [2024-06-22 08:22:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2797780992. Throughput: 0: 42993.9. Samples: 2797919460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:22:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 08:23:00,543][15401] Updated weights for policy 0, policy_version 170770 (0.0030) [2024-06-22 08:23:01,534][15349] Signal inference workers to stop experience collection... (41150 times) [2024-06-22 08:23:01,535][15349] Signal inference workers to resume experience collection... (41150 times) [2024-06-22 08:23:01,563][15401] InferenceWorker_p0-w0: stopping experience collection (41150 times) [2024-06-22 08:23:01,563][15401] InferenceWorker_p0-w0: resuming experience collection (41150 times) [2024-06-22 08:23:03,389][15132] Fps is (10 sec: 40970.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2798010368. Throughput: 0: 43079.2. Samples: 2798175920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:23:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 08:23:04,190][15401] Updated weights for policy 0, policy_version 170780 (0.0033) [2024-06-22 08:23:08,119][15401] Updated weights for policy 0, policy_version 170790 (0.0035) [2024-06-22 08:23:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2798223360. Throughput: 0: 42955.0. Samples: 2798309940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:23:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 08:23:11,641][15401] Updated weights for policy 0, policy_version 170800 (0.0031) [2024-06-22 08:23:13,390][15132] Fps is (10 sec: 40958.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2798419968. Throughput: 0: 42919.3. Samples: 2798559280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:23:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 08:23:15,738][15401] Updated weights for policy 0, policy_version 170810 (0.0032) [2024-06-22 08:23:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2798665728. Throughput: 0: 42984.3. Samples: 2798818400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:23:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 08:23:19,243][15401] Updated weights for policy 0, policy_version 170820 (0.0032) [2024-06-22 08:23:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 2798862336. Throughput: 0: 42841.8. Samples: 2798948620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-22 08:23:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 08:23:23,808][15401] Updated weights for policy 0, policy_version 170830 (0.0041) [2024-06-22 08:23:27,290][15401] Updated weights for policy 0, policy_version 170840 (0.0031) [2024-06-22 08:23:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43417.6, 300 sec: 42876.5). Total num frames: 2799075328. Throughput: 0: 42813.8. Samples: 2799204220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 08:23:31,381][15401] Updated weights for policy 0, policy_version 170850 (0.0042) [2024-06-22 08:23:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2799304704. Throughput: 0: 42817.3. Samples: 2799455960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 08:23:34,854][15401] Updated weights for policy 0, policy_version 170860 (0.0040) [2024-06-22 08:23:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2799501312. Throughput: 0: 42721.0. Samples: 2799585420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 08:23:39,116][15401] Updated weights for policy 0, policy_version 170870 (0.0035) [2024-06-22 08:23:42,488][15401] Updated weights for policy 0, policy_version 170880 (0.0033) [2024-06-22 08:23:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2799730688. Throughput: 0: 42745.6. Samples: 2799843020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:43,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 08:23:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170882_2799730688.pth... [2024-06-22 08:23:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170253_2789425152.pth [2024-06-22 08:23:46,866][15401] Updated weights for policy 0, policy_version 170890 (0.0044) [2024-06-22 08:23:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2799943680. Throughput: 0: 42717.5. Samples: 2800098220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:48,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 08:23:50,191][15401] Updated weights for policy 0, policy_version 170900 (0.0027) [2024-06-22 08:23:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 2800123904. Throughput: 0: 42544.5. Samples: 2800224440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 08:23:54,529][15401] Updated weights for policy 0, policy_version 170910 (0.0035) [2024-06-22 08:23:57,955][15401] Updated weights for policy 0, policy_version 170920 (0.0026) [2024-06-22 08:23:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2800353280. Throughput: 0: 42569.5. Samples: 2800474900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:23:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 08:24:02,126][15401] Updated weights for policy 0, policy_version 170930 (0.0035) [2024-06-22 08:24:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2800549888. Throughput: 0: 42698.3. Samples: 2800739820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:03,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 08:24:05,431][15401] Updated weights for policy 0, policy_version 170940 (0.0041) [2024-06-22 08:24:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2800779264. Throughput: 0: 42602.8. Samples: 2800865740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 08:24:09,801][15401] Updated weights for policy 0, policy_version 170950 (0.0036) [2024-06-22 08:24:13,277][15401] Updated weights for policy 0, policy_version 170960 (0.0040) [2024-06-22 08:24:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2801008640. Throughput: 0: 42665.7. Samples: 2801124180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 08:24:17,232][15401] Updated weights for policy 0, policy_version 170970 (0.0037) [2024-06-22 08:24:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2801188864. Throughput: 0: 42865.8. Samples: 2801384920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 08:24:20,886][15401] Updated weights for policy 0, policy_version 170980 (0.0029) [2024-06-22 08:24:21,966][15349] Signal inference workers to stop experience collection... (41200 times) [2024-06-22 08:24:21,972][15349] Signal inference workers to resume experience collection... (41200 times) [2024-06-22 08:24:22,016][15401] InferenceWorker_p0-w0: stopping experience collection (41200 times) [2024-06-22 08:24:22,016][15401] InferenceWorker_p0-w0: resuming experience collection (41200 times) [2024-06-22 08:24:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2801434624. Throughput: 0: 42792.0. Samples: 2801511060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:23,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 08:24:24,697][15401] Updated weights for policy 0, policy_version 170990 (0.0033) [2024-06-22 08:24:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2801647616. Throughput: 0: 42895.6. Samples: 2801773320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 08:24:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 08:24:28,594][15401] Updated weights for policy 0, policy_version 171000 (0.0035) [2024-06-22 08:24:32,182][15401] Updated weights for policy 0, policy_version 171010 (0.0045) [2024-06-22 08:24:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2801844224. Throughput: 0: 42902.9. Samples: 2802028840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:33,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 08:24:36,062][15401] Updated weights for policy 0, policy_version 171020 (0.0028) [2024-06-22 08:24:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2802089984. Throughput: 0: 42905.7. Samples: 2802155200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:38,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 08:24:39,989][15401] Updated weights for policy 0, policy_version 171030 (0.0033) [2024-06-22 08:24:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2802286592. Throughput: 0: 43150.7. Samples: 2802416680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 08:24:43,589][15401] Updated weights for policy 0, policy_version 171040 (0.0035) [2024-06-22 08:24:47,563][15401] Updated weights for policy 0, policy_version 171050 (0.0027) [2024-06-22 08:24:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2802483200. Throughput: 0: 42899.5. Samples: 2802670300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 08:24:51,171][15401] Updated weights for policy 0, policy_version 171060 (0.0034) [2024-06-22 08:24:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2802728960. Throughput: 0: 42850.5. Samples: 2802794020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 08:24:55,080][15401] Updated weights for policy 0, policy_version 171070 (0.0033) [2024-06-22 08:24:58,396][15132] Fps is (10 sec: 45846.2, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 2802941952. Throughput: 0: 42898.4. Samples: 2803054880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:24:58,396][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 08:24:59,306][15401] Updated weights for policy 0, policy_version 171080 (0.0039) [2024-06-22 08:25:02,661][15401] Updated weights for policy 0, policy_version 171090 (0.0032) [2024-06-22 08:25:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2803138560. Throughput: 0: 42716.3. Samples: 2803307160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 08:25:06,903][15401] Updated weights for policy 0, policy_version 171100 (0.0026) [2024-06-22 08:25:08,390][15132] Fps is (10 sec: 44264.9, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 2803384320. Throughput: 0: 42864.4. Samples: 2803439960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 08:25:10,736][15401] Updated weights for policy 0, policy_version 171110 (0.0032) [2024-06-22 08:25:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2803580928. Throughput: 0: 42849.0. Samples: 2803701520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 08:25:14,418][15401] Updated weights for policy 0, policy_version 171120 (0.0028) [2024-06-22 08:25:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2803777536. Throughput: 0: 42838.2. Samples: 2803956560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:18,391][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 08:25:18,832][15401] Updated weights for policy 0, policy_version 171130 (0.0036) [2024-06-22 08:25:22,038][15401] Updated weights for policy 0, policy_version 171140 (0.0027) [2024-06-22 08:25:23,391][15132] Fps is (10 sec: 44228.9, 60 sec: 43143.3, 300 sec: 42986.9). Total num frames: 2804023296. Throughput: 0: 42916.6. Samples: 2804086520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:23,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 08:25:26,267][15401] Updated weights for policy 0, policy_version 171150 (0.0027) [2024-06-22 08:25:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2804219904. Throughput: 0: 42857.7. Samples: 2804345280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:28,394][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 08:25:29,558][15401] Updated weights for policy 0, policy_version 171160 (0.0044) [2024-06-22 08:25:33,390][15132] Fps is (10 sec: 40966.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2804432896. Throughput: 0: 42888.9. Samples: 2804600300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 08:25:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 08:25:33,735][15401] Updated weights for policy 0, policy_version 171170 (0.0031) [2024-06-22 08:25:37,207][15401] Updated weights for policy 0, policy_version 171180 (0.0032) [2024-06-22 08:25:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 2804662272. Throughput: 0: 43064.1. Samples: 2804731900. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:25:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 08:25:41,067][15349] Signal inference workers to stop experience collection... (41250 times) [2024-06-22 08:25:41,067][15349] Signal inference workers to resume experience collection... (41250 times) [2024-06-22 08:25:41,078][15401] InferenceWorker_p0-w0: stopping experience collection (41250 times) [2024-06-22 08:25:41,079][15401] InferenceWorker_p0-w0: resuming experience collection (41250 times) [2024-06-22 08:25:41,205][15401] Updated weights for policy 0, policy_version 171190 (0.0028) [2024-06-22 08:25:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2804875264. Throughput: 0: 43146.6. Samples: 2804996200. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:25:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 08:25:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171196_2804875264.pth... [2024-06-22 08:25:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170567_2794569728.pth [2024-06-22 08:25:44,783][15401] Updated weights for policy 0, policy_version 171200 (0.0053) [2024-06-22 08:25:48,396][15132] Fps is (10 sec: 40933.5, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 2805071872. Throughput: 0: 43213.9. Samples: 2805252060. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:25:48,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 08:25:48,706][15401] Updated weights for policy 0, policy_version 171210 (0.0031) [2024-06-22 08:25:52,273][15401] Updated weights for policy 0, policy_version 171220 (0.0030) [2024-06-22 08:25:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2805317632. Throughput: 0: 43087.2. Samples: 2805378880. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:25:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 08:25:56,299][15401] Updated weights for policy 0, policy_version 171230 (0.0024) [2024-06-22 08:25:58,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42603.0, 300 sec: 42820.9). Total num frames: 2805497856. Throughput: 0: 43054.7. Samples: 2805638980. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:25:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 08:26:00,043][15401] Updated weights for policy 0, policy_version 171240 (0.0021) [2024-06-22 08:26:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2805710848. Throughput: 0: 42992.4. Samples: 2805891220. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 08:26:03,857][15401] Updated weights for policy 0, policy_version 171250 (0.0031) [2024-06-22 08:26:07,619][15401] Updated weights for policy 0, policy_version 171260 (0.0024) [2024-06-22 08:26:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2805972992. Throughput: 0: 42999.0. Samples: 2806021400. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 08:26:11,553][15401] Updated weights for policy 0, policy_version 171270 (0.0043) [2024-06-22 08:26:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2806136832. Throughput: 0: 42844.5. Samples: 2806273280. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 08:26:15,281][15401] Updated weights for policy 0, policy_version 171280 (0.0038) [2024-06-22 08:26:18,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2806349824. Throughput: 0: 42820.4. Samples: 2806527220. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 08:26:19,588][15401] Updated weights for policy 0, policy_version 171290 (0.0042) [2024-06-22 08:26:22,995][15401] Updated weights for policy 0, policy_version 171300 (0.0033) [2024-06-22 08:26:23,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43144.1, 300 sec: 43097.9). Total num frames: 2806611968. Throughput: 0: 42817.7. Samples: 2806658800. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:23,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 08:26:27,078][15401] Updated weights for policy 0, policy_version 171310 (0.0032) [2024-06-22 08:26:28,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2806759424. Throughput: 0: 42632.0. Samples: 2806914640. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 08:26:30,605][15401] Updated weights for policy 0, policy_version 171320 (0.0038) [2024-06-22 08:26:33,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2807005184. Throughput: 0: 42700.2. Samples: 2807173300. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 08:26:34,511][15401] Updated weights for policy 0, policy_version 171330 (0.0029) [2024-06-22 08:26:38,297][15401] Updated weights for policy 0, policy_version 171340 (0.0032) [2024-06-22 08:26:38,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2807234560. Throughput: 0: 42785.0. Samples: 2807304200. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-06-22 08:26:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 08:26:41,942][15349] Signal inference workers to stop experience collection... (41300 times) [2024-06-22 08:26:41,942][15349] Signal inference workers to resume experience collection... (41300 times) [2024-06-22 08:26:41,991][15401] InferenceWorker_p0-w0: stopping experience collection (41300 times) [2024-06-22 08:26:41,991][15401] InferenceWorker_p0-w0: resuming experience collection (41300 times) [2024-06-22 08:26:42,080][15401] Updated weights for policy 0, policy_version 171350 (0.0037) [2024-06-22 08:26:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2807414784. Throughput: 0: 42568.8. Samples: 2807554580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:26:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 08:26:45,769][15401] Updated weights for policy 0, policy_version 171360 (0.0041) [2024-06-22 08:26:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 2807660544. Throughput: 0: 42745.2. Samples: 2807814760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:26:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 08:26:49,543][15401] Updated weights for policy 0, policy_version 171370 (0.0033) [2024-06-22 08:26:53,261][15401] Updated weights for policy 0, policy_version 171380 (0.0034) [2024-06-22 08:26:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2807889920. Throughput: 0: 42807.1. Samples: 2807947720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:26:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 08:26:56,991][15401] Updated weights for policy 0, policy_version 171390 (0.0032) [2024-06-22 08:26:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2808070144. Throughput: 0: 42710.8. Samples: 2808195260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:26:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 08:27:01,176][15401] Updated weights for policy 0, policy_version 171400 (0.0043) [2024-06-22 08:27:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2808299520. Throughput: 0: 42893.5. Samples: 2808457420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 08:27:04,843][15401] Updated weights for policy 0, policy_version 171410 (0.0040) [2024-06-22 08:27:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 2808512512. Throughput: 0: 42823.2. Samples: 2808585740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:08,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 08:27:09,242][15401] Updated weights for policy 0, policy_version 171420 (0.0037) [2024-06-22 08:27:12,481][15401] Updated weights for policy 0, policy_version 171430 (0.0046) [2024-06-22 08:27:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2808709120. Throughput: 0: 42701.0. Samples: 2808836180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 08:27:16,906][15401] Updated weights for policy 0, policy_version 171440 (0.0034) [2024-06-22 08:27:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.9, 300 sec: 42765.6). Total num frames: 2808922112. Throughput: 0: 42688.9. Samples: 2809094400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:18,393][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 08:27:20,053][15401] Updated weights for policy 0, policy_version 171450 (0.0033) [2024-06-22 08:27:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42054.0, 300 sec: 42931.6). Total num frames: 2809135104. Throughput: 0: 42588.0. Samples: 2809220660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 08:27:24,542][15401] Updated weights for policy 0, policy_version 171460 (0.0023) [2024-06-22 08:27:27,624][15401] Updated weights for policy 0, policy_version 171470 (0.0040) [2024-06-22 08:27:28,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 2809364480. Throughput: 0: 42742.2. Samples: 2809478080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:28,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 08:27:32,186][15401] Updated weights for policy 0, policy_version 171480 (0.0038) [2024-06-22 08:27:33,391][15132] Fps is (10 sec: 42590.5, 60 sec: 42597.2, 300 sec: 42764.8). Total num frames: 2809561088. Throughput: 0: 42746.4. Samples: 2809738420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:33,392][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 08:27:35,307][15401] Updated weights for policy 0, policy_version 171490 (0.0039) [2024-06-22 08:27:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2809790464. Throughput: 0: 42624.5. Samples: 2809865820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 08:27:39,767][15401] Updated weights for policy 0, policy_version 171500 (0.0022) [2024-06-22 08:27:42,554][15349] Signal inference workers to stop experience collection... (41350 times) [2024-06-22 08:27:42,588][15401] InferenceWorker_p0-w0: stopping experience collection (41350 times) [2024-06-22 08:27:42,610][15349] Signal inference workers to resume experience collection... (41350 times) [2024-06-22 08:27:42,611][15401] InferenceWorker_p0-w0: resuming experience collection (41350 times) [2024-06-22 08:27:42,912][15401] Updated weights for policy 0, policy_version 171510 (0.0035) [2024-06-22 08:27:43,390][15132] Fps is (10 sec: 45881.8, 60 sec: 43417.4, 300 sec: 42876.0). Total num frames: 2810019840. Throughput: 0: 42850.7. Samples: 2810123560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 08:27:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 08:27:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171510_2810019840.pth... [2024-06-22 08:27:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000170882_2799730688.pth [2024-06-22 08:27:47,409][15401] Updated weights for policy 0, policy_version 171520 (0.0029) [2024-06-22 08:27:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 2810200064. Throughput: 0: 42802.7. Samples: 2810383540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:27:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 08:27:50,547][15401] Updated weights for policy 0, policy_version 171530 (0.0034) [2024-06-22 08:27:53,390][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2810429440. Throughput: 0: 42653.7. Samples: 2810505160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:27:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 08:27:55,623][15401] Updated weights for policy 0, policy_version 171540 (0.0025) [2024-06-22 08:27:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2810658816. Throughput: 0: 42854.1. Samples: 2810764620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:27:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 08:27:58,431][15401] Updated weights for policy 0, policy_version 171550 (0.0031) [2024-06-22 08:28:03,090][15401] Updated weights for policy 0, policy_version 171560 (0.0039) [2024-06-22 08:28:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2810839040. Throughput: 0: 42936.8. Samples: 2811026460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 08:28:06,096][15401] Updated weights for policy 0, policy_version 171570 (0.0025) [2024-06-22 08:28:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 2811084800. Throughput: 0: 42834.5. Samples: 2811148220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 08:28:10,911][15401] Updated weights for policy 0, policy_version 171580 (0.0036) [2024-06-22 08:28:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2811297792. Throughput: 0: 43122.8. Samples: 2811418500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 08:28:13,598][15401] Updated weights for policy 0, policy_version 171590 (0.0036) [2024-06-22 08:28:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2811478016. Throughput: 0: 42916.3. Samples: 2811669580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:18,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 08:28:18,576][15401] Updated weights for policy 0, policy_version 171600 (0.0040) [2024-06-22 08:28:21,394][15401] Updated weights for policy 0, policy_version 171610 (0.0045) [2024-06-22 08:28:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2811723776. Throughput: 0: 42776.9. Samples: 2811790780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 08:28:26,222][15401] Updated weights for policy 0, policy_version 171620 (0.0037) [2024-06-22 08:28:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 2811904000. Throughput: 0: 42926.5. Samples: 2812055340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:28,393][15132] Avg episode reward: [(0, '0.190')] [2024-06-22 08:28:29,135][15401] Updated weights for policy 0, policy_version 171630 (0.0025) [2024-06-22 08:28:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 2812133376. Throughput: 0: 42701.0. Samples: 2812305080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 08:28:33,814][15401] Updated weights for policy 0, policy_version 171640 (0.0033) [2024-06-22 08:28:37,101][15401] Updated weights for policy 0, policy_version 171650 (0.0030) [2024-06-22 08:28:38,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2812362752. Throughput: 0: 42802.0. Samples: 2812431240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 08:28:41,521][15401] Updated weights for policy 0, policy_version 171660 (0.0046) [2024-06-22 08:28:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.6, 300 sec: 42765.0). Total num frames: 2812559360. Throughput: 0: 42755.1. Samples: 2812688600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:43,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 08:28:44,269][15349] Signal inference workers to stop experience collection... (41400 times) [2024-06-22 08:28:44,269][15349] Signal inference workers to resume experience collection... (41400 times) [2024-06-22 08:28:44,320][15401] InferenceWorker_p0-w0: stopping experience collection (41400 times) [2024-06-22 08:28:44,320][15401] InferenceWorker_p0-w0: resuming experience collection (41400 times) [2024-06-22 08:28:44,831][15401] Updated weights for policy 0, policy_version 171670 (0.0038) [2024-06-22 08:28:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2812772352. Throughput: 0: 42505.0. Samples: 2812939180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-22 08:28:48,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-22 08:28:49,296][15401] Updated weights for policy 0, policy_version 171680 (0.0031) [2024-06-22 08:28:52,498][15401] Updated weights for policy 0, policy_version 171690 (0.0032) [2024-06-22 08:28:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2812985344. Throughput: 0: 42689.7. Samples: 2813069260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:28:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 08:28:57,096][15401] Updated weights for policy 0, policy_version 171700 (0.0031) [2024-06-22 08:28:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 2813181952. Throughput: 0: 42537.3. Samples: 2813332680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:28:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 08:29:00,060][15401] Updated weights for policy 0, policy_version 171710 (0.0031) [2024-06-22 08:29:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2813427712. Throughput: 0: 42422.7. Samples: 2813578600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 08:29:04,503][15401] Updated weights for policy 0, policy_version 171720 (0.0037) [2024-06-22 08:29:07,627][15401] Updated weights for policy 0, policy_version 171730 (0.0024) [2024-06-22 08:29:08,390][15132] Fps is (10 sec: 45872.3, 60 sec: 42598.0, 300 sec: 42820.5). Total num frames: 2813640704. Throughput: 0: 42824.2. Samples: 2813717900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:08,391][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 08:29:12,070][15401] Updated weights for policy 0, policy_version 171740 (0.0028) [2024-06-22 08:29:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2813837312. Throughput: 0: 42631.5. Samples: 2813973660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:13,391][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 08:29:15,377][15401] Updated weights for policy 0, policy_version 171750 (0.0033) [2024-06-22 08:29:18,390][15132] Fps is (10 sec: 42600.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2814066688. Throughput: 0: 42516.2. Samples: 2814218320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 08:29:19,590][15401] Updated weights for policy 0, policy_version 171760 (0.0028) [2024-06-22 08:29:23,174][15401] Updated weights for policy 0, policy_version 171770 (0.0030) [2024-06-22 08:29:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2814279680. Throughput: 0: 42757.1. Samples: 2814355320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 08:29:27,503][15401] Updated weights for policy 0, policy_version 171780 (0.0037) [2024-06-22 08:29:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 2814476288. Throughput: 0: 42609.2. Samples: 2814606020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 08:29:31,116][15401] Updated weights for policy 0, policy_version 171790 (0.0038) [2024-06-22 08:29:33,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2814705664. Throughput: 0: 42585.0. Samples: 2814855500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 08:29:35,063][15401] Updated weights for policy 0, policy_version 171800 (0.0031) [2024-06-22 08:29:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2814902272. Throughput: 0: 42769.4. Samples: 2814993880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 08:29:38,757][15401] Updated weights for policy 0, policy_version 171810 (0.0039) [2024-06-22 08:29:42,516][15401] Updated weights for policy 0, policy_version 171820 (0.0037) [2024-06-22 08:29:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2815115264. Throughput: 0: 42484.3. Samples: 2815244480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:43,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 08:29:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171821_2815115264.pth... [2024-06-22 08:29:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171196_2804875264.pth [2024-06-22 08:29:46,332][15401] Updated weights for policy 0, policy_version 171830 (0.0044) [2024-06-22 08:29:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2815344640. Throughput: 0: 42630.3. Samples: 2815496960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 08:29:50,003][15401] Updated weights for policy 0, policy_version 171840 (0.0037) [2024-06-22 08:29:53,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 2815557632. Throughput: 0: 42661.5. Samples: 2815637640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 08:29:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 08:29:53,949][15401] Updated weights for policy 0, policy_version 171850 (0.0028) [2024-06-22 08:29:56,840][15349] Signal inference workers to stop experience collection... (41450 times) [2024-06-22 08:29:56,840][15349] Signal inference workers to resume experience collection... (41450 times) [2024-06-22 08:29:56,870][15401] InferenceWorker_p0-w0: stopping experience collection (41450 times) [2024-06-22 08:29:56,870][15401] InferenceWorker_p0-w0: resuming experience collection (41450 times) [2024-06-22 08:29:58,251][15401] Updated weights for policy 0, policy_version 171860 (0.0043) [2024-06-22 08:29:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2815754240. Throughput: 0: 42492.5. Samples: 2815885820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:29:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 08:30:01,647][15401] Updated weights for policy 0, policy_version 171870 (0.0024) [2024-06-22 08:30:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2816000000. Throughput: 0: 42827.2. Samples: 2816145540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 08:30:05,766][15401] Updated weights for policy 0, policy_version 171880 (0.0030) [2024-06-22 08:30:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 2816196608. Throughput: 0: 42786.5. Samples: 2816280700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 08:30:09,366][15401] Updated weights for policy 0, policy_version 171890 (0.0030) [2024-06-22 08:30:13,368][15401] Updated weights for policy 0, policy_version 171900 (0.0038) [2024-06-22 08:30:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2816409600. Throughput: 0: 42800.9. Samples: 2816532160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:13,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:30:16,918][15401] Updated weights for policy 0, policy_version 171910 (0.0047) [2024-06-22 08:30:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2816638976. Throughput: 0: 43048.8. Samples: 2816792700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 08:30:21,015][15401] Updated weights for policy 0, policy_version 171920 (0.0027) [2024-06-22 08:30:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2816835584. Throughput: 0: 42839.1. Samples: 2816921640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 08:30:24,301][15401] Updated weights for policy 0, policy_version 171930 (0.0036) [2024-06-22 08:30:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2817048576. Throughput: 0: 42894.4. Samples: 2817174720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:28,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 08:30:28,545][15401] Updated weights for policy 0, policy_version 171940 (0.0034) [2024-06-22 08:30:31,926][15401] Updated weights for policy 0, policy_version 171950 (0.0032) [2024-06-22 08:30:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2817261568. Throughput: 0: 42883.6. Samples: 2817426720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 08:30:36,143][15401] Updated weights for policy 0, policy_version 171960 (0.0024) [2024-06-22 08:30:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 2817458176. Throughput: 0: 42628.4. Samples: 2817556020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:38,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 08:30:39,643][15401] Updated weights for policy 0, policy_version 171970 (0.0031) [2024-06-22 08:30:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42766.0). Total num frames: 2817687552. Throughput: 0: 42655.6. Samples: 2817805320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 08:30:43,741][15401] Updated weights for policy 0, policy_version 171980 (0.0032) [2024-06-22 08:30:47,611][15401] Updated weights for policy 0, policy_version 171990 (0.0048) [2024-06-22 08:30:48,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2817900544. Throughput: 0: 42546.6. Samples: 2818060140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 08:30:51,319][15401] Updated weights for policy 0, policy_version 172000 (0.0035) [2024-06-22 08:30:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2818097152. Throughput: 0: 42462.2. Samples: 2818191500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:53,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 08:30:55,266][15401] Updated weights for policy 0, policy_version 172010 (0.0033) [2024-06-22 08:30:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2818342912. Throughput: 0: 42631.2. Samples: 2818450460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-22 08:30:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 08:30:59,266][15401] Updated weights for policy 0, policy_version 172020 (0.0024) [2024-06-22 08:31:03,058][15401] Updated weights for policy 0, policy_version 172030 (0.0029) [2024-06-22 08:31:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2818539520. Throughput: 0: 42463.5. Samples: 2818703560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 08:31:06,730][15401] Updated weights for policy 0, policy_version 172040 (0.0035) [2024-06-22 08:31:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2818752512. Throughput: 0: 42450.3. Samples: 2818831900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 08:31:10,963][15401] Updated weights for policy 0, policy_version 172050 (0.0039) [2024-06-22 08:31:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2818981888. Throughput: 0: 42627.1. Samples: 2819092940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 08:31:14,531][15401] Updated weights for policy 0, policy_version 172060 (0.0023) [2024-06-22 08:31:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 2819162112. Throughput: 0: 42704.9. Samples: 2819348440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 08:31:18,759][15401] Updated weights for policy 0, policy_version 172070 (0.0048) [2024-06-22 08:31:22,147][15401] Updated weights for policy 0, policy_version 172080 (0.0034) [2024-06-22 08:31:23,394][15132] Fps is (10 sec: 40942.4, 60 sec: 42595.4, 300 sec: 42819.9). Total num frames: 2819391488. Throughput: 0: 42568.0. Samples: 2819471660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:23,400][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 08:31:26,504][15401] Updated weights for policy 0, policy_version 172090 (0.0036) [2024-06-22 08:31:28,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2819604480. Throughput: 0: 42952.4. Samples: 2819738180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 08:31:29,691][15349] Signal inference workers to stop experience collection... (41500 times) [2024-06-22 08:31:29,737][15401] InferenceWorker_p0-w0: stopping experience collection (41500 times) [2024-06-22 08:31:29,751][15349] Signal inference workers to resume experience collection... (41500 times) [2024-06-22 08:31:29,753][15401] InferenceWorker_p0-w0: resuming experience collection (41500 times) [2024-06-22 08:31:29,892][15401] Updated weights for policy 0, policy_version 172100 (0.0041) [2024-06-22 08:31:33,389][15132] Fps is (10 sec: 42617.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2819817472. Throughput: 0: 42941.9. Samples: 2819992520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 08:31:34,065][15401] Updated weights for policy 0, policy_version 172110 (0.0029) [2024-06-22 08:31:37,273][15401] Updated weights for policy 0, policy_version 172120 (0.0032) [2024-06-22 08:31:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 2820046848. Throughput: 0: 42905.3. Samples: 2820122240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 08:31:41,638][15401] Updated weights for policy 0, policy_version 172130 (0.0036) [2024-06-22 08:31:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2820243456. Throughput: 0: 42810.1. Samples: 2820376920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:43,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-22 08:31:43,472][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172135_2820259840.pth... [2024-06-22 08:31:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171510_2810019840.pth [2024-06-22 08:31:45,447][15401] Updated weights for policy 0, policy_version 172140 (0.0038) [2024-06-22 08:31:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2820456448. Throughput: 0: 42830.3. Samples: 2820630920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 08:31:49,177][15401] Updated weights for policy 0, policy_version 172150 (0.0046) [2024-06-22 08:31:53,182][15401] Updated weights for policy 0, policy_version 172160 (0.0046) [2024-06-22 08:31:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2820685824. Throughput: 0: 42836.0. Samples: 2820759520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 08:31:56,886][15401] Updated weights for policy 0, policy_version 172170 (0.0035) [2024-06-22 08:31:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2820882432. Throughput: 0: 42736.7. Samples: 2821016100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:31:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 08:32:00,667][15401] Updated weights for policy 0, policy_version 172180 (0.0048) [2024-06-22 08:32:03,391][15132] Fps is (10 sec: 40952.8, 60 sec: 42597.3, 300 sec: 42653.7). Total num frames: 2821095424. Throughput: 0: 42700.5. Samples: 2821270040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 08:32:03,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 08:32:04,521][15401] Updated weights for policy 0, policy_version 172190 (0.0039) [2024-06-22 08:32:08,192][15401] Updated weights for policy 0, policy_version 172200 (0.0035) [2024-06-22 08:32:08,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2821324800. Throughput: 0: 42822.8. Samples: 2821398500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 08:32:12,130][15401] Updated weights for policy 0, policy_version 172210 (0.0037) [2024-06-22 08:32:13,389][15132] Fps is (10 sec: 42606.0, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 2821521408. Throughput: 0: 42671.2. Samples: 2821658380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 08:32:15,575][15401] Updated weights for policy 0, policy_version 172220 (0.0037) [2024-06-22 08:32:18,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 2821750784. Throughput: 0: 42683.6. Samples: 2821913560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:18,397][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 08:32:19,834][15401] Updated weights for policy 0, policy_version 172230 (0.0030) [2024-06-22 08:32:23,151][15401] Updated weights for policy 0, policy_version 172240 (0.0036) [2024-06-22 08:32:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43147.6, 300 sec: 42765.4). Total num frames: 2821980160. Throughput: 0: 42866.6. Samples: 2822051240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 08:32:27,415][15401] Updated weights for policy 0, policy_version 172250 (0.0027) [2024-06-22 08:32:28,389][15132] Fps is (10 sec: 39347.3, 60 sec: 42325.4, 300 sec: 42654.2). Total num frames: 2822144000. Throughput: 0: 42747.7. Samples: 2822300560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 08:32:31,031][15401] Updated weights for policy 0, policy_version 172260 (0.0049) [2024-06-22 08:32:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2822406144. Throughput: 0: 42770.5. Samples: 2822555600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 08:32:34,951][15401] Updated weights for policy 0, policy_version 172270 (0.0033) [2024-06-22 08:32:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2822602752. Throughput: 0: 42937.3. Samples: 2822691700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 08:32:38,550][15401] Updated weights for policy 0, policy_version 172280 (0.0030) [2024-06-22 08:32:42,574][15401] Updated weights for policy 0, policy_version 172290 (0.0032) [2024-06-22 08:32:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2822799360. Throughput: 0: 42835.6. Samples: 2822943700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 08:32:46,103][15401] Updated weights for policy 0, policy_version 172300 (0.0038) [2024-06-22 08:32:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2823061504. Throughput: 0: 42863.0. Samples: 2823198800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:48,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 08:32:50,269][15401] Updated weights for policy 0, policy_version 172310 (0.0041) [2024-06-22 08:32:53,154][15349] Signal inference workers to stop experience collection... (41550 times) [2024-06-22 08:32:53,155][15349] Signal inference workers to resume experience collection... (41550 times) [2024-06-22 08:32:53,185][15401] InferenceWorker_p0-w0: stopping experience collection (41550 times) [2024-06-22 08:32:53,185][15401] InferenceWorker_p0-w0: resuming experience collection (41550 times) [2024-06-22 08:32:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2823258112. Throughput: 0: 43085.3. Samples: 2823337340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 08:32:53,659][15401] Updated weights for policy 0, policy_version 172320 (0.0029) [2024-06-22 08:32:58,294][15401] Updated weights for policy 0, policy_version 172330 (0.0030) [2024-06-22 08:32:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 2823454720. Throughput: 0: 42960.9. Samples: 2823591620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:32:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 08:33:01,206][15401] Updated weights for policy 0, policy_version 172340 (0.0030) [2024-06-22 08:33:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43418.7, 300 sec: 42765.0). Total num frames: 2823700480. Throughput: 0: 42815.8. Samples: 2823840000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:33:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 08:33:05,899][15401] Updated weights for policy 0, policy_version 172350 (0.0028) [2024-06-22 08:33:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2823880704. Throughput: 0: 42782.3. Samples: 2823976440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:33:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 08:33:09,182][15401] Updated weights for policy 0, policy_version 172360 (0.0029) [2024-06-22 08:33:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2824093696. Throughput: 0: 42820.4. Samples: 2824227480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 08:33:13,422][15401] Updated weights for policy 0, policy_version 172370 (0.0037) [2024-06-22 08:33:16,856][15401] Updated weights for policy 0, policy_version 172380 (0.0041) [2024-06-22 08:33:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 2824323072. Throughput: 0: 42796.6. Samples: 2824481440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 08:33:21,123][15401] Updated weights for policy 0, policy_version 172390 (0.0022) [2024-06-22 08:33:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 2824519680. Throughput: 0: 42767.9. Samples: 2824616260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:23,394][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 08:33:24,539][15401] Updated weights for policy 0, policy_version 172400 (0.0027) [2024-06-22 08:33:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2824732672. Throughput: 0: 42741.1. Samples: 2824867040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 08:33:28,755][15401] Updated weights for policy 0, policy_version 172410 (0.0028) [2024-06-22 08:33:32,079][15401] Updated weights for policy 0, policy_version 172420 (0.0036) [2024-06-22 08:33:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2824978432. Throughput: 0: 42754.1. Samples: 2825122740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 08:33:36,341][15401] Updated weights for policy 0, policy_version 172430 (0.0028) [2024-06-22 08:33:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2825142272. Throughput: 0: 42615.4. Samples: 2825255040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 08:33:39,846][15401] Updated weights for policy 0, policy_version 172440 (0.0032) [2024-06-22 08:33:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2825388032. Throughput: 0: 42493.6. Samples: 2825503840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 08:33:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172448_2825388032.pth... [2024-06-22 08:33:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000171821_2815115264.pth [2024-06-22 08:33:44,432][15401] Updated weights for policy 0, policy_version 172450 (0.0028) [2024-06-22 08:33:47,510][15401] Updated weights for policy 0, policy_version 172460 (0.0035) [2024-06-22 08:33:48,389][15132] Fps is (10 sec: 47514.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2825617408. Throughput: 0: 42758.5. Samples: 2825764120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 08:33:51,872][15401] Updated weights for policy 0, policy_version 172470 (0.0032) [2024-06-22 08:33:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2825797632. Throughput: 0: 42620.4. Samples: 2825894360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 08:33:55,354][15401] Updated weights for policy 0, policy_version 172480 (0.0036) [2024-06-22 08:33:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2826043392. Throughput: 0: 42638.6. Samples: 2826146220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:33:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 08:33:59,874][15401] Updated weights for policy 0, policy_version 172490 (0.0027) [2024-06-22 08:34:02,720][15349] Signal inference workers to stop experience collection... (41600 times) [2024-06-22 08:34:02,722][15349] Signal inference workers to resume experience collection... (41600 times) [2024-06-22 08:34:02,738][15401] InferenceWorker_p0-w0: stopping experience collection (41600 times) [2024-06-22 08:34:02,738][15401] InferenceWorker_p0-w0: resuming experience collection (41600 times) [2024-06-22 08:34:03,240][15401] Updated weights for policy 0, policy_version 172500 (0.0042) [2024-06-22 08:34:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 2826240000. Throughput: 0: 42787.0. Samples: 2826406860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:34:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 08:34:07,314][15401] Updated weights for policy 0, policy_version 172510 (0.0047) [2024-06-22 08:34:08,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2826420224. Throughput: 0: 42426.3. Samples: 2826525440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:34:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 08:34:10,783][15401] Updated weights for policy 0, policy_version 172520 (0.0029) [2024-06-22 08:34:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2826698752. Throughput: 0: 42520.0. Samples: 2826780440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:34:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 08:34:14,663][15401] Updated weights for policy 0, policy_version 172530 (0.0032) [2024-06-22 08:34:18,382][15401] Updated weights for policy 0, policy_version 172540 (0.0033) [2024-06-22 08:34:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2826895360. Throughput: 0: 42678.7. Samples: 2827043280. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 08:34:22,063][15401] Updated weights for policy 0, policy_version 172550 (0.0031) [2024-06-22 08:34:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2827075584. Throughput: 0: 42513.8. Samples: 2827168160. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 08:34:26,203][15401] Updated weights for policy 0, policy_version 172560 (0.0040) [2024-06-22 08:34:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2827337728. Throughput: 0: 42710.7. Samples: 2827425820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 08:34:30,429][15401] Updated weights for policy 0, policy_version 172570 (0.0040) [2024-06-22 08:34:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2827534336. Throughput: 0: 42676.2. Samples: 2827684560. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 08:34:33,754][15401] Updated weights for policy 0, policy_version 172580 (0.0043) [2024-06-22 08:34:38,007][15401] Updated weights for policy 0, policy_version 172590 (0.0037) [2024-06-22 08:34:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2827714560. Throughput: 0: 42593.8. Samples: 2827811080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 08:34:41,523][15401] Updated weights for policy 0, policy_version 172600 (0.0035) [2024-06-22 08:34:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2827976704. Throughput: 0: 42656.1. Samples: 2828065740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 08:34:45,714][15401] Updated weights for policy 0, policy_version 172610 (0.0036) [2024-06-22 08:34:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2828156928. Throughput: 0: 42659.3. Samples: 2828326520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 08:34:49,135][15401] Updated weights for policy 0, policy_version 172620 (0.0028) [2024-06-22 08:34:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2828353536. Throughput: 0: 42698.6. Samples: 2828446880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 08:34:53,441][15401] Updated weights for policy 0, policy_version 172630 (0.0048) [2024-06-22 08:34:56,861][15401] Updated weights for policy 0, policy_version 172640 (0.0037) [2024-06-22 08:34:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2828615680. Throughput: 0: 42606.6. Samples: 2828697740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:34:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 08:35:01,083][15401] Updated weights for policy 0, policy_version 172650 (0.0036) [2024-06-22 08:35:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2828746752. Throughput: 0: 42590.2. Samples: 2828959840. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:35:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 08:35:04,762][15401] Updated weights for policy 0, policy_version 172660 (0.0036) [2024-06-22 08:35:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2828992512. Throughput: 0: 42330.7. Samples: 2829073040. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:35:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 08:35:08,694][15401] Updated weights for policy 0, policy_version 172670 (0.0025) [2024-06-22 08:35:12,318][15401] Updated weights for policy 0, policy_version 172680 (0.0035) [2024-06-22 08:35:13,389][15132] Fps is (10 sec: 50790.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2829254656. Throughput: 0: 42469.4. Samples: 2829336940. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:35:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 08:35:16,268][15401] Updated weights for policy 0, policy_version 172690 (0.0046) [2024-06-22 08:35:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2829402112. Throughput: 0: 42721.4. Samples: 2829607020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-22 08:35:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 08:35:18,906][15349] Signal inference workers to stop experience collection... (41650 times) [2024-06-22 08:35:18,907][15349] Signal inference workers to resume experience collection... (41650 times) [2024-06-22 08:35:18,950][15401] InferenceWorker_p0-w0: stopping experience collection (41650 times) [2024-06-22 08:35:18,951][15401] InferenceWorker_p0-w0: resuming experience collection (41650 times) [2024-06-22 08:35:19,850][15401] Updated weights for policy 0, policy_version 172700 (0.0032) [2024-06-22 08:35:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2829664256. Throughput: 0: 42409.8. Samples: 2829719520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:23,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 08:35:23,742][15401] Updated weights for policy 0, policy_version 172710 (0.0026) [2024-06-22 08:35:27,687][15401] Updated weights for policy 0, policy_version 172720 (0.0037) [2024-06-22 08:35:28,390][15132] Fps is (10 sec: 50790.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2829910016. Throughput: 0: 42678.7. Samples: 2829986280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:28,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 08:35:31,146][15401] Updated weights for policy 0, policy_version 172730 (0.0032) [2024-06-22 08:35:33,390][15132] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 42598.7). Total num frames: 2830024704. Throughput: 0: 42821.1. Samples: 2830253480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:33,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:35:35,401][15401] Updated weights for policy 0, policy_version 172740 (0.0031) [2024-06-22 08:35:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2830303232. Throughput: 0: 42645.8. Samples: 2830365940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 08:35:39,033][15401] Updated weights for policy 0, policy_version 172750 (0.0031) [2024-06-22 08:35:42,854][15401] Updated weights for policy 0, policy_version 172760 (0.0038) [2024-06-22 08:35:43,390][15132] Fps is (10 sec: 50791.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2830532608. Throughput: 0: 43087.6. Samples: 2830636680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 08:35:43,520][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172763_2830548992.pth... [2024-06-22 08:35:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172135_2820259840.pth [2024-06-22 08:35:46,445][15401] Updated weights for policy 0, policy_version 172770 (0.0039) [2024-06-22 08:35:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2830680064. Throughput: 0: 43048.0. Samples: 2830897000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 08:35:50,458][15401] Updated weights for policy 0, policy_version 172780 (0.0035) [2024-06-22 08:35:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2830958592. Throughput: 0: 43155.5. Samples: 2831015040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 08:35:54,060][15401] Updated weights for policy 0, policy_version 172790 (0.0038) [2024-06-22 08:35:58,119][15401] Updated weights for policy 0, policy_version 172800 (0.0035) [2024-06-22 08:35:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2831155200. Throughput: 0: 43179.5. Samples: 2831280020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:35:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 08:36:01,684][15401] Updated weights for policy 0, policy_version 172810 (0.0038) [2024-06-22 08:36:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2831335424. Throughput: 0: 42873.9. Samples: 2831536340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:36:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 08:36:05,936][15401] Updated weights for policy 0, policy_version 172820 (0.0035) [2024-06-22 08:36:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2831597568. Throughput: 0: 43121.3. Samples: 2831659980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:36:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 08:36:09,282][15401] Updated weights for policy 0, policy_version 172830 (0.0033) [2024-06-22 08:36:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2831794176. Throughput: 0: 43005.3. Samples: 2831921520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:36:13,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 08:36:13,447][15401] Updated weights for policy 0, policy_version 172840 (0.0026) [2024-06-22 08:36:17,398][15401] Updated weights for policy 0, policy_version 172850 (0.0035) [2024-06-22 08:36:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42710.1). Total num frames: 2831990784. Throughput: 0: 42605.0. Samples: 2832170700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:36:18,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 08:36:21,075][15401] Updated weights for policy 0, policy_version 172860 (0.0051) [2024-06-22 08:36:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2832220160. Throughput: 0: 42872.0. Samples: 2832295180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-22 08:36:23,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-22 08:36:25,127][15401] Updated weights for policy 0, policy_version 172870 (0.0034) [2024-06-22 08:36:27,188][15349] Signal inference workers to stop experience collection... (41700 times) [2024-06-22 08:36:27,245][15349] Signal inference workers to resume experience collection... (41700 times) [2024-06-22 08:36:27,245][15401] InferenceWorker_p0-w0: stopping experience collection (41700 times) [2024-06-22 08:36:27,277][15401] InferenceWorker_p0-w0: resuming experience collection (41700 times) [2024-06-22 08:36:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 2832416768. Throughput: 0: 42594.7. Samples: 2832553440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 08:36:29,169][15401] Updated weights for policy 0, policy_version 172880 (0.0036) [2024-06-22 08:36:32,700][15401] Updated weights for policy 0, policy_version 172890 (0.0041) [2024-06-22 08:36:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 2832646144. Throughput: 0: 42271.6. Samples: 2832799220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 08:36:36,591][15401] Updated weights for policy 0, policy_version 172900 (0.0034) [2024-06-22 08:36:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2832859136. Throughput: 0: 42484.7. Samples: 2832926860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 08:36:40,266][15401] Updated weights for policy 0, policy_version 172910 (0.0039) [2024-06-22 08:36:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2833039360. Throughput: 0: 42343.1. Samples: 2833185460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 08:36:44,127][15401] Updated weights for policy 0, policy_version 172920 (0.0032) [2024-06-22 08:36:48,120][15401] Updated weights for policy 0, policy_version 172930 (0.0044) [2024-06-22 08:36:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2833285120. Throughput: 0: 42158.1. Samples: 2833433460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 08:36:51,637][15401] Updated weights for policy 0, policy_version 172940 (0.0035) [2024-06-22 08:36:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2833498112. Throughput: 0: 42471.1. Samples: 2833571180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 08:36:55,692][15401] Updated weights for policy 0, policy_version 172950 (0.0036) [2024-06-22 08:36:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2833711104. Throughput: 0: 42452.8. Samples: 2833831900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:36:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 08:36:59,251][15401] Updated weights for policy 0, policy_version 172960 (0.0037) [2024-06-22 08:37:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 2833924096. Throughput: 0: 42394.6. Samples: 2834078560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:03,393][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 08:37:03,445][15401] Updated weights for policy 0, policy_version 172970 (0.0040) [2024-06-22 08:37:06,971][15401] Updated weights for policy 0, policy_version 172980 (0.0049) [2024-06-22 08:37:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2834120704. Throughput: 0: 42596.4. Samples: 2834212020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:08,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 08:37:11,332][15401] Updated weights for policy 0, policy_version 172990 (0.0037) [2024-06-22 08:37:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 2834333696. Throughput: 0: 42661.3. Samples: 2834473200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 08:37:14,644][15401] Updated weights for policy 0, policy_version 173000 (0.0029) [2024-06-22 08:37:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2834546688. Throughput: 0: 42707.6. Samples: 2834721060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 08:37:19,123][15401] Updated weights for policy 0, policy_version 173010 (0.0041) [2024-06-22 08:37:22,623][15401] Updated weights for policy 0, policy_version 173020 (0.0024) [2024-06-22 08:37:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2834776064. Throughput: 0: 42793.4. Samples: 2834852660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:23,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 08:37:26,807][15401] Updated weights for policy 0, policy_version 173030 (0.0026) [2024-06-22 08:37:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2834972672. Throughput: 0: 42745.7. Samples: 2835109020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:37:28,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 08:37:30,236][15401] Updated weights for policy 0, policy_version 173040 (0.0043) [2024-06-22 08:37:33,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2835185664. Throughput: 0: 42845.0. Samples: 2835361480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 08:37:35,046][15401] Updated weights for policy 0, policy_version 173050 (0.0041) [2024-06-22 08:37:37,962][15401] Updated weights for policy 0, policy_version 173060 (0.0040) [2024-06-22 08:37:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2835415040. Throughput: 0: 42596.4. Samples: 2835488020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 08:37:38,938][15349] Signal inference workers to stop experience collection... (41750 times) [2024-06-22 08:37:38,939][15349] Signal inference workers to resume experience collection... (41750 times) [2024-06-22 08:37:38,965][15401] InferenceWorker_p0-w0: stopping experience collection (41750 times) [2024-06-22 08:37:38,965][15401] InferenceWorker_p0-w0: resuming experience collection (41750 times) [2024-06-22 08:37:42,722][15401] Updated weights for policy 0, policy_version 173070 (0.0035) [2024-06-22 08:37:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2835628032. Throughput: 0: 42592.8. Samples: 2835748580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 08:37:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173073_2835628032.pth... [2024-06-22 08:37:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172448_2825388032.pth [2024-06-22 08:37:45,609][15401] Updated weights for policy 0, policy_version 173080 (0.0035) [2024-06-22 08:37:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2835841024. Throughput: 0: 42705.4. Samples: 2836000200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 08:37:50,273][15401] Updated weights for policy 0, policy_version 173090 (0.0037) [2024-06-22 08:37:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2836054016. Throughput: 0: 42548.9. Samples: 2836126720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 08:37:53,509][15401] Updated weights for policy 0, policy_version 173100 (0.0045) [2024-06-22 08:37:57,913][15401] Updated weights for policy 0, policy_version 173110 (0.0028) [2024-06-22 08:37:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2836250624. Throughput: 0: 42519.2. Samples: 2836386560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:37:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 08:38:01,060][15401] Updated weights for policy 0, policy_version 173120 (0.0026) [2024-06-22 08:38:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 2836480000. Throughput: 0: 42593.8. Samples: 2836637780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 08:38:05,436][15401] Updated weights for policy 0, policy_version 173130 (0.0046) [2024-06-22 08:38:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2836692992. Throughput: 0: 42623.2. Samples: 2836770600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:08,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 08:38:08,526][15401] Updated weights for policy 0, policy_version 173140 (0.0036) [2024-06-22 08:38:13,205][15401] Updated weights for policy 0, policy_version 173150 (0.0039) [2024-06-22 08:38:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2836889600. Throughput: 0: 42674.7. Samples: 2837029380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 08:38:16,193][15401] Updated weights for policy 0, policy_version 173160 (0.0047) [2024-06-22 08:38:18,390][15132] Fps is (10 sec: 42595.8, 60 sec: 42871.0, 300 sec: 42709.4). Total num frames: 2837118976. Throughput: 0: 42603.3. Samples: 2837278660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:18,391][15132] Avg episode reward: [(0, '0.219')] [2024-06-22 08:38:20,931][15401] Updated weights for policy 0, policy_version 173170 (0.0043) [2024-06-22 08:38:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 2837331968. Throughput: 0: 42687.2. Samples: 2837408940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 08:38:23,730][15401] Updated weights for policy 0, policy_version 173180 (0.0023) [2024-06-22 08:38:28,390][15132] Fps is (10 sec: 40962.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2837528576. Throughput: 0: 42587.7. Samples: 2837665020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 08:38:28,462][15401] Updated weights for policy 0, policy_version 173190 (0.0034) [2024-06-22 08:38:31,774][15401] Updated weights for policy 0, policy_version 173200 (0.0029) [2024-06-22 08:38:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2837757952. Throughput: 0: 42652.4. Samples: 2837919560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 08:38:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 08:38:36,285][15401] Updated weights for policy 0, policy_version 173210 (0.0036) [2024-06-22 08:38:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2837970944. Throughput: 0: 42789.4. Samples: 2838052240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:38:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 08:38:39,369][15401] Updated weights for policy 0, policy_version 173220 (0.0026) [2024-06-22 08:38:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2838167552. Throughput: 0: 42707.9. Samples: 2838308420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:38:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 08:38:43,928][15401] Updated weights for policy 0, policy_version 173230 (0.0041) [2024-06-22 08:38:46,798][15349] Signal inference workers to stop experience collection... (41800 times) [2024-06-22 08:38:46,799][15349] Signal inference workers to resume experience collection... (41800 times) [2024-06-22 08:38:46,811][15401] InferenceWorker_p0-w0: stopping experience collection (41800 times) [2024-06-22 08:38:46,816][15401] InferenceWorker_p0-w0: resuming experience collection (41800 times) [2024-06-22 08:38:46,941][15401] Updated weights for policy 0, policy_version 173240 (0.0031) [2024-06-22 08:38:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2838396928. Throughput: 0: 42813.8. Samples: 2838564400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:38:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 08:38:51,447][15401] Updated weights for policy 0, policy_version 173250 (0.0042) [2024-06-22 08:38:53,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 2838626304. Throughput: 0: 42775.0. Samples: 2838695580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:38:53,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 08:38:54,491][15401] Updated weights for policy 0, policy_version 173260 (0.0026) [2024-06-22 08:38:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2838806528. Throughput: 0: 42586.6. Samples: 2838945780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:38:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 08:38:59,133][15401] Updated weights for policy 0, policy_version 173270 (0.0042) [2024-06-22 08:39:02,134][15401] Updated weights for policy 0, policy_version 173280 (0.0034) [2024-06-22 08:39:03,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2839035904. Throughput: 0: 42788.1. Samples: 2839204100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 08:39:06,653][15401] Updated weights for policy 0, policy_version 173290 (0.0037) [2024-06-22 08:39:08,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 2839248896. Throughput: 0: 42787.5. Samples: 2839334480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:08,392][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 08:39:10,034][15401] Updated weights for policy 0, policy_version 173300 (0.0031) [2024-06-22 08:39:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2839445504. Throughput: 0: 42742.7. Samples: 2839588440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:13,394][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 08:39:14,600][15401] Updated weights for policy 0, policy_version 173310 (0.0030) [2024-06-22 08:39:17,507][15401] Updated weights for policy 0, policy_version 173320 (0.0032) [2024-06-22 08:39:18,391][15132] Fps is (10 sec: 44242.4, 60 sec: 42871.1, 300 sec: 42764.9). Total num frames: 2839691264. Throughput: 0: 42735.5. Samples: 2839842700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:18,391][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 08:39:22,195][15401] Updated weights for policy 0, policy_version 173330 (0.0029) [2024-06-22 08:39:23,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 2839904256. Throughput: 0: 42845.7. Samples: 2839980400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:23,393][15132] Avg episode reward: [(0, '0.187')] [2024-06-22 08:39:25,060][15401] Updated weights for policy 0, policy_version 173340 (0.0034) [2024-06-22 08:39:28,390][15132] Fps is (10 sec: 40964.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2840100864. Throughput: 0: 42767.1. Samples: 2840232940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 08:39:29,722][15401] Updated weights for policy 0, policy_version 173350 (0.0026) [2024-06-22 08:39:32,909][15401] Updated weights for policy 0, policy_version 173360 (0.0033) [2024-06-22 08:39:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2840330240. Throughput: 0: 42627.6. Samples: 2840482640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 08:39:37,496][15401] Updated weights for policy 0, policy_version 173370 (0.0047) [2024-06-22 08:39:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 2840543232. Throughput: 0: 42749.8. Samples: 2840619320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-22 08:39:38,393][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 08:39:40,508][15401] Updated weights for policy 0, policy_version 173380 (0.0034) [2024-06-22 08:39:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2840739840. Throughput: 0: 42733.5. Samples: 2840868780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:39:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 08:39:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173386_2840756224.pth... [2024-06-22 08:39:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000172763_2830548992.pth [2024-06-22 08:39:45,455][15401] Updated weights for policy 0, policy_version 173390 (0.0032) [2024-06-22 08:39:48,317][15401] Updated weights for policy 0, policy_version 173400 (0.0035) [2024-06-22 08:39:48,392][15132] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2840985600. Throughput: 0: 42545.3. Samples: 2841118740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:39:48,393][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 08:39:52,949][15401] Updated weights for policy 0, policy_version 173410 (0.0039) [2024-06-22 08:39:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 2841165824. Throughput: 0: 42673.7. Samples: 2841254700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:39:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 08:39:56,166][15401] Updated weights for policy 0, policy_version 173420 (0.0032) [2024-06-22 08:39:58,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2841378816. Throughput: 0: 42607.1. Samples: 2841505760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:39:58,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 08:40:00,538][15401] Updated weights for policy 0, policy_version 173430 (0.0032) [2024-06-22 08:40:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2841608192. Throughput: 0: 42642.0. Samples: 2841761540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 08:40:03,870][15401] Updated weights for policy 0, policy_version 173440 (0.0030) [2024-06-22 08:40:08,134][15401] Updated weights for policy 0, policy_version 173450 (0.0040) [2024-06-22 08:40:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 2841804800. Throughput: 0: 42544.5. Samples: 2841894800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 08:40:11,858][15401] Updated weights for policy 0, policy_version 173460 (0.0024) [2024-06-22 08:40:13,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 2842001408. Throughput: 0: 42293.5. Samples: 2842136420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:13,396][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 08:40:13,921][15349] Signal inference workers to stop experience collection... (41850 times) [2024-06-22 08:40:13,923][15349] Signal inference workers to resume experience collection... (41850 times) [2024-06-22 08:40:13,964][15401] InferenceWorker_p0-w0: stopping experience collection (41850 times) [2024-06-22 08:40:13,964][15401] InferenceWorker_p0-w0: resuming experience collection (41850 times) [2024-06-22 08:40:15,969][15401] Updated weights for policy 0, policy_version 173470 (0.0047) [2024-06-22 08:40:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42599.2, 300 sec: 42653.9). Total num frames: 2842247168. Throughput: 0: 42411.0. Samples: 2842391140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 08:40:19,538][15401] Updated weights for policy 0, policy_version 173480 (0.0032) [2024-06-22 08:40:23,364][15401] Updated weights for policy 0, policy_version 173490 (0.0024) [2024-06-22 08:40:23,390][15132] Fps is (10 sec: 45904.5, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 2842460160. Throughput: 0: 42514.2. Samples: 2842532360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:23,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 08:40:27,034][15401] Updated weights for policy 0, policy_version 173500 (0.0026) [2024-06-22 08:40:28,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2842640384. Throughput: 0: 42493.7. Samples: 2842781100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:28,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 08:40:31,016][15401] Updated weights for policy 0, policy_version 173510 (0.0029) [2024-06-22 08:40:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2842886144. Throughput: 0: 42697.9. Samples: 2843040040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 08:40:34,667][15401] Updated weights for policy 0, policy_version 173520 (0.0036) [2024-06-22 08:40:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 2843082752. Throughput: 0: 42686.4. Samples: 2843175580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 08:40:38,781][15401] Updated weights for policy 0, policy_version 173530 (0.0048) [2024-06-22 08:40:42,371][15401] Updated weights for policy 0, policy_version 173540 (0.0045) [2024-06-22 08:40:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2843295744. Throughput: 0: 42600.9. Samples: 2843422800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 08:40:43,399][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:40:46,408][15401] Updated weights for policy 0, policy_version 173550 (0.0041) [2024-06-22 08:40:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2843525120. Throughput: 0: 42593.7. Samples: 2843678260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:40:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 08:40:50,107][15401] Updated weights for policy 0, policy_version 173560 (0.0041) [2024-06-22 08:40:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2843721728. Throughput: 0: 42447.9. Samples: 2843804960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:40:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 08:40:54,183][15401] Updated weights for policy 0, policy_version 173570 (0.0032) [2024-06-22 08:40:57,550][15401] Updated weights for policy 0, policy_version 173580 (0.0036) [2024-06-22 08:40:58,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 2843951104. Throughput: 0: 42834.7. Samples: 2844063980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:40:58,397][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 08:41:01,727][15401] Updated weights for policy 0, policy_version 173590 (0.0041) [2024-06-22 08:41:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2844164096. Throughput: 0: 42962.2. Samples: 2844324440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 08:41:05,229][15401] Updated weights for policy 0, policy_version 173600 (0.0030) [2024-06-22 08:41:08,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2844360704. Throughput: 0: 42672.9. Samples: 2844452640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:08,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 08:41:09,466][15401] Updated weights for policy 0, policy_version 173610 (0.0028) [2024-06-22 08:41:12,865][15401] Updated weights for policy 0, policy_version 173620 (0.0048) [2024-06-22 08:41:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43422.2, 300 sec: 42765.0). Total num frames: 2844606464. Throughput: 0: 42863.1. Samples: 2844709840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 08:41:17,237][15401] Updated weights for policy 0, policy_version 173630 (0.0031) [2024-06-22 08:41:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2844819456. Throughput: 0: 42751.4. Samples: 2844963860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 08:41:20,518][15401] Updated weights for policy 0, policy_version 173640 (0.0033) [2024-06-22 08:41:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2844999680. Throughput: 0: 42675.1. Samples: 2845095960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:23,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 08:41:24,765][15401] Updated weights for policy 0, policy_version 173650 (0.0044) [2024-06-22 08:41:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 2845229056. Throughput: 0: 42844.5. Samples: 2845350800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 08:41:28,407][15401] Updated weights for policy 0, policy_version 173660 (0.0044) [2024-06-22 08:41:32,478][15401] Updated weights for policy 0, policy_version 173670 (0.0028) [2024-06-22 08:41:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42709.2). Total num frames: 2845458432. Throughput: 0: 42726.2. Samples: 2845601040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:33,393][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 08:41:35,868][15401] Updated weights for policy 0, policy_version 173680 (0.0033) [2024-06-22 08:41:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2845655040. Throughput: 0: 42810.3. Samples: 2845731420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 08:41:40,023][15401] Updated weights for policy 0, policy_version 173690 (0.0039) [2024-06-22 08:41:43,386][15401] Updated weights for policy 0, policy_version 173700 (0.0041) [2024-06-22 08:41:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2845900800. Throughput: 0: 42854.6. Samples: 2845992160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 08:41:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173700_2845900800.pth... [2024-06-22 08:41:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173073_2835628032.pth [2024-06-22 08:41:47,800][15401] Updated weights for policy 0, policy_version 173710 (0.0038) [2024-06-22 08:41:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2846081024. Throughput: 0: 42762.3. Samples: 2846248740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 08:41:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 08:41:51,174][15401] Updated weights for policy 0, policy_version 173720 (0.0030) [2024-06-22 08:41:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2846294016. Throughput: 0: 42682.3. Samples: 2846373340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:41:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 08:41:54,269][15349] Signal inference workers to stop experience collection... (41900 times) [2024-06-22 08:41:54,276][15349] Signal inference workers to resume experience collection... (41900 times) [2024-06-22 08:41:54,315][15401] InferenceWorker_p0-w0: stopping experience collection (41900 times) [2024-06-22 08:41:54,315][15401] InferenceWorker_p0-w0: resuming experience collection (41900 times) [2024-06-22 08:41:55,521][15401] Updated weights for policy 0, policy_version 173730 (0.0033) [2024-06-22 08:41:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42876.1, 300 sec: 42709.8). Total num frames: 2846523392. Throughput: 0: 42665.5. Samples: 2846629780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:41:58,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 08:41:58,939][15401] Updated weights for policy 0, policy_version 173740 (0.0040) [2024-06-22 08:42:03,151][15401] Updated weights for policy 0, policy_version 173750 (0.0030) [2024-06-22 08:42:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2846720000. Throughput: 0: 42795.2. Samples: 2846889640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:03,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 08:42:07,280][15401] Updated weights for policy 0, policy_version 173760 (0.0027) [2024-06-22 08:42:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2846932992. Throughput: 0: 42559.9. Samples: 2847011160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 08:42:10,799][15401] Updated weights for policy 0, policy_version 173770 (0.0030) [2024-06-22 08:42:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2847162368. Throughput: 0: 42579.5. Samples: 2847266880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 08:42:15,151][15401] Updated weights for policy 0, policy_version 173780 (0.0029) [2024-06-22 08:42:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 2847358976. Throughput: 0: 42883.8. Samples: 2847530700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 08:42:18,465][15401] Updated weights for policy 0, policy_version 173790 (0.0034) [2024-06-22 08:42:23,112][15401] Updated weights for policy 0, policy_version 173800 (0.0024) [2024-06-22 08:42:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2847539200. Throughput: 0: 42617.2. Samples: 2847649200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 08:42:25,981][15401] Updated weights for policy 0, policy_version 173810 (0.0033) [2024-06-22 08:42:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2847817728. Throughput: 0: 42645.4. Samples: 2847911200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 08:42:30,716][15401] Updated weights for policy 0, policy_version 173820 (0.0038) [2024-06-22 08:42:33,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 2847997952. Throughput: 0: 42550.0. Samples: 2848163480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 08:42:33,564][15401] Updated weights for policy 0, policy_version 173830 (0.0043) [2024-06-22 08:42:38,247][15401] Updated weights for policy 0, policy_version 173840 (0.0042) [2024-06-22 08:42:38,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2848194560. Throughput: 0: 42435.9. Samples: 2848283060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:38,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 08:42:41,582][15401] Updated weights for policy 0, policy_version 173850 (0.0038) [2024-06-22 08:42:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2848440320. Throughput: 0: 42490.7. Samples: 2848541860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 08:42:45,679][15401] Updated weights for policy 0, policy_version 173860 (0.0027) [2024-06-22 08:42:48,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2848620544. Throughput: 0: 42381.7. Samples: 2848796820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 08:42:49,766][15401] Updated weights for policy 0, policy_version 173870 (0.0035) [2024-06-22 08:42:53,154][15401] Updated weights for policy 0, policy_version 173880 (0.0031) [2024-06-22 08:42:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2848849920. Throughput: 0: 42329.8. Samples: 2848916000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 08:42:53,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-22 08:42:57,401][15401] Updated weights for policy 0, policy_version 173890 (0.0037) [2024-06-22 08:42:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2849062912. Throughput: 0: 42591.2. Samples: 2849183480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:42:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 08:42:58,406][15349] Signal inference workers to stop experience collection... (41950 times) [2024-06-22 08:42:58,456][15401] InferenceWorker_p0-w0: stopping experience collection (41950 times) [2024-06-22 08:42:58,462][15349] Signal inference workers to resume experience collection... (41950 times) [2024-06-22 08:42:58,469][15401] InferenceWorker_p0-w0: resuming experience collection (41950 times) [2024-06-22 08:43:01,035][15401] Updated weights for policy 0, policy_version 173900 (0.0027) [2024-06-22 08:43:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2849259520. Throughput: 0: 42345.7. Samples: 2849436260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:03,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-22 08:43:05,006][15401] Updated weights for policy 0, policy_version 173910 (0.0038) [2024-06-22 08:43:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2849488896. Throughput: 0: 42412.2. Samples: 2849557740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 08:43:08,508][15401] Updated weights for policy 0, policy_version 173920 (0.0033) [2024-06-22 08:43:12,594][15401] Updated weights for policy 0, policy_version 173930 (0.0035) [2024-06-22 08:43:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2849701888. Throughput: 0: 42436.4. Samples: 2849820840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:13,391][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 08:43:16,073][15401] Updated weights for policy 0, policy_version 173940 (0.0028) [2024-06-22 08:43:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 2849882112. Throughput: 0: 42647.8. Samples: 2850082640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 08:43:20,199][15401] Updated weights for policy 0, policy_version 173950 (0.0035) [2024-06-22 08:43:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2850144256. Throughput: 0: 42703.9. Samples: 2850204640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 08:43:23,595][15401] Updated weights for policy 0, policy_version 173960 (0.0044) [2024-06-22 08:43:27,779][15401] Updated weights for policy 0, policy_version 173970 (0.0021) [2024-06-22 08:43:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2850340864. Throughput: 0: 42721.8. Samples: 2850464340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 08:43:31,130][15401] Updated weights for policy 0, policy_version 173980 (0.0032) [2024-06-22 08:43:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2850537472. Throughput: 0: 42907.2. Samples: 2850727640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 08:43:35,254][15401] Updated weights for policy 0, policy_version 173990 (0.0032) [2024-06-22 08:43:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43419.2, 300 sec: 42820.5). Total num frames: 2850799616. Throughput: 0: 43063.9. Samples: 2850853880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 08:43:38,577][15401] Updated weights for policy 0, policy_version 174000 (0.0028) [2024-06-22 08:43:42,672][15401] Updated weights for policy 0, policy_version 174010 (0.0035) [2024-06-22 08:43:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2850979840. Throughput: 0: 42997.5. Samples: 2851118380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 08:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174011_2850996224.pth... [2024-06-22 08:43:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173386_2840756224.pth [2024-06-22 08:43:46,092][15401] Updated weights for policy 0, policy_version 174020 (0.0027) [2024-06-22 08:43:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2851192832. Throughput: 0: 43096.3. Samples: 2851375600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 08:43:50,391][15401] Updated weights for policy 0, policy_version 174030 (0.0037) [2024-06-22 08:43:53,389][15132] Fps is (10 sec: 45876.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2851438592. Throughput: 0: 43144.9. Samples: 2851499260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 08:43:53,625][15401] Updated weights for policy 0, policy_version 174040 (0.0039) [2024-06-22 08:43:57,872][15401] Updated weights for policy 0, policy_version 174050 (0.0033) [2024-06-22 08:43:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2851651584. Throughput: 0: 43215.1. Samples: 2851765520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:43:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 08:44:01,712][15401] Updated weights for policy 0, policy_version 174060 (0.0046) [2024-06-22 08:44:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 2851848192. Throughput: 0: 43061.3. Samples: 2852020400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 08:44:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 08:44:05,279][15349] Signal inference workers to stop experience collection... (42000 times) [2024-06-22 08:44:05,284][15349] Signal inference workers to resume experience collection... (42000 times) [2024-06-22 08:44:05,322][15401] InferenceWorker_p0-w0: stopping experience collection (42000 times) [2024-06-22 08:44:05,323][15401] InferenceWorker_p0-w0: resuming experience collection (42000 times) [2024-06-22 08:44:05,431][15401] Updated weights for policy 0, policy_version 174070 (0.0029) [2024-06-22 08:44:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2852093952. Throughput: 0: 43131.7. Samples: 2852145560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 08:44:09,131][15401] Updated weights for policy 0, policy_version 174080 (0.0041) [2024-06-22 08:44:13,253][15401] Updated weights for policy 0, policy_version 174090 (0.0043) [2024-06-22 08:44:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42709.6). Total num frames: 2852290560. Throughput: 0: 43225.7. Samples: 2852409500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 08:44:16,892][15401] Updated weights for policy 0, policy_version 174100 (0.0046) [2024-06-22 08:44:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.6, 300 sec: 42654.3). Total num frames: 2852487168. Throughput: 0: 43095.6. Samples: 2852666940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 08:44:20,841][15401] Updated weights for policy 0, policy_version 174110 (0.0027) [2024-06-22 08:44:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2852732928. Throughput: 0: 43044.2. Samples: 2852790860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 08:44:24,812][15401] Updated weights for policy 0, policy_version 174120 (0.0041) [2024-06-22 08:44:28,204][15401] Updated weights for policy 0, policy_version 174130 (0.0033) [2024-06-22 08:44:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2852945920. Throughput: 0: 43171.3. Samples: 2853061080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 08:44:32,222][15401] Updated weights for policy 0, policy_version 174140 (0.0044) [2024-06-22 08:44:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 2853142528. Throughput: 0: 43204.9. Samples: 2853319820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 08:44:35,690][15401] Updated weights for policy 0, policy_version 174150 (0.0036) [2024-06-22 08:44:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2853371904. Throughput: 0: 43266.1. Samples: 2853446240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 08:44:39,639][15401] Updated weights for policy 0, policy_version 174160 (0.0028) [2024-06-22 08:44:43,116][15401] Updated weights for policy 0, policy_version 174170 (0.0031) [2024-06-22 08:44:43,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43963.8, 300 sec: 42820.9). Total num frames: 2853617664. Throughput: 0: 43261.7. Samples: 2853712300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 08:44:47,414][15401] Updated weights for policy 0, policy_version 174180 (0.0039) [2024-06-22 08:44:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2853797888. Throughput: 0: 43476.2. Samples: 2853976820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 08:44:50,726][15401] Updated weights for policy 0, policy_version 174190 (0.0036) [2024-06-22 08:44:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2854027264. Throughput: 0: 43389.7. Samples: 2854098100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 08:44:55,017][15401] Updated weights for policy 0, policy_version 174200 (0.0040) [2024-06-22 08:44:58,270][15401] Updated weights for policy 0, policy_version 174210 (0.0034) [2024-06-22 08:44:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2854256640. Throughput: 0: 43341.7. Samples: 2854359880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:44:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 08:45:02,474][15401] Updated weights for policy 0, policy_version 174220 (0.0046) [2024-06-22 08:45:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2854436864. Throughput: 0: 43354.7. Samples: 2854617900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:45:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 08:45:05,984][15401] Updated weights for policy 0, policy_version 174230 (0.0038) [2024-06-22 08:45:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42988.1). Total num frames: 2854682624. Throughput: 0: 43451.5. Samples: 2854746180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 08:45:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 08:45:10,086][15401] Updated weights for policy 0, policy_version 174240 (0.0034) [2024-06-22 08:45:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2854895616. Throughput: 0: 43291.4. Samples: 2855009200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 08:45:13,802][15401] Updated weights for policy 0, policy_version 174250 (0.0026) [2024-06-22 08:45:17,789][15401] Updated weights for policy 0, policy_version 174260 (0.0041) [2024-06-22 08:45:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2855075840. Throughput: 0: 43213.7. Samples: 2855264440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 08:45:21,246][15401] Updated weights for policy 0, policy_version 174270 (0.0034) [2024-06-22 08:45:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 2855321600. Throughput: 0: 43131.5. Samples: 2855387160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 08:45:25,518][15401] Updated weights for policy 0, policy_version 174280 (0.0022) [2024-06-22 08:45:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2855534592. Throughput: 0: 42993.8. Samples: 2855647020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 08:45:28,577][15349] Signal inference workers to stop experience collection... (42050 times) [2024-06-22 08:45:28,578][15349] Signal inference workers to resume experience collection... (42050 times) [2024-06-22 08:45:28,602][15401] InferenceWorker_p0-w0: stopping experience collection (42050 times) [2024-06-22 08:45:28,602][15401] InferenceWorker_p0-w0: resuming experience collection (42050 times) [2024-06-22 08:45:28,737][15401] Updated weights for policy 0, policy_version 174290 (0.0036) [2024-06-22 08:45:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2855714816. Throughput: 0: 42883.9. Samples: 2855906600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 08:45:33,858][15401] Updated weights for policy 0, policy_version 174300 (0.0033) [2024-06-22 08:45:36,497][15401] Updated weights for policy 0, policy_version 174310 (0.0037) [2024-06-22 08:45:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2855960576. Throughput: 0: 42955.2. Samples: 2856031080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 08:45:41,361][15401] Updated weights for policy 0, policy_version 174320 (0.0031) [2024-06-22 08:45:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2856157184. Throughput: 0: 42933.5. Samples: 2856291880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 08:45:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174327_2856173568.pth... [2024-06-22 08:45:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000173700_2845900800.pth [2024-06-22 08:45:43,971][15401] Updated weights for policy 0, policy_version 174330 (0.0041) [2024-06-22 08:45:48,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 2856370176. Throughput: 0: 42748.4. Samples: 2856541680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:48,393][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 08:45:48,726][15401] Updated weights for policy 0, policy_version 174340 (0.0041) [2024-06-22 08:45:51,518][15401] Updated weights for policy 0, policy_version 174350 (0.0033) [2024-06-22 08:45:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 2856583168. Throughput: 0: 42689.4. Samples: 2856667200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 08:45:56,196][15401] Updated weights for policy 0, policy_version 174360 (0.0038) [2024-06-22 08:45:58,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 2856796160. Throughput: 0: 42749.8. Samples: 2856933040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:45:58,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 08:45:59,071][15401] Updated weights for policy 0, policy_version 174370 (0.0031) [2024-06-22 08:46:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2857009152. Throughput: 0: 42759.2. Samples: 2857188600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:46:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 08:46:03,647][15401] Updated weights for policy 0, policy_version 174380 (0.0023) [2024-06-22 08:46:06,869][15401] Updated weights for policy 0, policy_version 174390 (0.0036) [2024-06-22 08:46:08,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 2857222144. Throughput: 0: 42889.3. Samples: 2857317180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:46:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 08:46:11,155][15401] Updated weights for policy 0, policy_version 174400 (0.0036) [2024-06-22 08:46:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2857451520. Throughput: 0: 42851.5. Samples: 2857575340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-22 08:46:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 08:46:14,615][15401] Updated weights for policy 0, policy_version 174410 (0.0036) [2024-06-22 08:46:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2857664512. Throughput: 0: 42689.4. Samples: 2857827620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 08:46:19,280][15401] Updated weights for policy 0, policy_version 174420 (0.0038) [2024-06-22 08:46:22,243][15401] Updated weights for policy 0, policy_version 174430 (0.0032) [2024-06-22 08:46:23,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 2857877504. Throughput: 0: 42763.6. Samples: 2857955720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:23,396][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 08:46:26,812][15401] Updated weights for policy 0, policy_version 174440 (0.0038) [2024-06-22 08:46:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 2858074112. Throughput: 0: 42721.7. Samples: 2858214360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 08:46:29,953][15401] Updated weights for policy 0, policy_version 174450 (0.0050) [2024-06-22 08:46:33,390][15132] Fps is (10 sec: 42625.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2858303488. Throughput: 0: 42738.2. Samples: 2858464800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 08:46:34,304][15401] Updated weights for policy 0, policy_version 174460 (0.0045) [2024-06-22 08:46:37,720][15401] Updated weights for policy 0, policy_version 174470 (0.0045) [2024-06-22 08:46:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2858516480. Throughput: 0: 42975.1. Samples: 2858601080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 08:46:41,784][15401] Updated weights for policy 0, policy_version 174480 (0.0046) [2024-06-22 08:46:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2858729472. Throughput: 0: 42711.7. Samples: 2858854960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 08:46:45,560][15401] Updated weights for policy 0, policy_version 174490 (0.0040) [2024-06-22 08:46:48,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2858958848. Throughput: 0: 42630.1. Samples: 2859106960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 08:46:49,161][15349] Signal inference workers to stop experience collection... (42100 times) [2024-06-22 08:46:49,171][15349] Signal inference workers to resume experience collection... (42100 times) [2024-06-22 08:46:49,196][15401] InferenceWorker_p0-w0: stopping experience collection (42100 times) [2024-06-22 08:46:49,196][15401] InferenceWorker_p0-w0: resuming experience collection (42100 times) [2024-06-22 08:46:49,312][15401] Updated weights for policy 0, policy_version 174500 (0.0040) [2024-06-22 08:46:53,145][15401] Updated weights for policy 0, policy_version 174510 (0.0040) [2024-06-22 08:46:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2859171840. Throughput: 0: 42846.6. Samples: 2859245280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 08:46:56,812][15401] Updated weights for policy 0, policy_version 174520 (0.0046) [2024-06-22 08:46:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2859368448. Throughput: 0: 42715.6. Samples: 2859497540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:46:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 08:47:00,944][15401] Updated weights for policy 0, policy_version 174530 (0.0033) [2024-06-22 08:47:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2859581440. Throughput: 0: 42797.4. Samples: 2859753500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:47:03,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 08:47:04,599][15401] Updated weights for policy 0, policy_version 174540 (0.0037) [2024-06-22 08:47:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2859810816. Throughput: 0: 42861.7. Samples: 2859884220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:47:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 08:47:08,955][15401] Updated weights for policy 0, policy_version 174550 (0.0030) [2024-06-22 08:47:12,043][15401] Updated weights for policy 0, policy_version 174560 (0.0039) [2024-06-22 08:47:13,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 2860007424. Throughput: 0: 42586.9. Samples: 2860131040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:47:13,396][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 08:47:16,623][15401] Updated weights for policy 0, policy_version 174570 (0.0043) [2024-06-22 08:47:18,394][15132] Fps is (10 sec: 42580.1, 60 sec: 42868.4, 300 sec: 43042.1). Total num frames: 2860236800. Throughput: 0: 42852.5. Samples: 2860393340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-22 08:47:18,394][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 08:47:19,952][15401] Updated weights for policy 0, policy_version 174580 (0.0037) [2024-06-22 08:47:23,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 2860433408. Throughput: 0: 42770.7. Samples: 2860525760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:23,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-22 08:47:24,099][15401] Updated weights for policy 0, policy_version 174590 (0.0035) [2024-06-22 08:47:27,612][15401] Updated weights for policy 0, policy_version 174600 (0.0044) [2024-06-22 08:47:28,390][15132] Fps is (10 sec: 42615.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2860662784. Throughput: 0: 42812.2. Samples: 2860781520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 08:47:31,745][15401] Updated weights for policy 0, policy_version 174610 (0.0036) [2024-06-22 08:47:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 2860892160. Throughput: 0: 42970.3. Samples: 2861040620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 08:47:35,221][15401] Updated weights for policy 0, policy_version 174620 (0.0030) [2024-06-22 08:47:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2861072384. Throughput: 0: 42821.0. Samples: 2861172220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 08:47:39,352][15401] Updated weights for policy 0, policy_version 174630 (0.0029) [2024-06-22 08:47:43,074][15401] Updated weights for policy 0, policy_version 174640 (0.0036) [2024-06-22 08:47:43,391][15132] Fps is (10 sec: 42594.1, 60 sec: 43143.7, 300 sec: 43042.6). Total num frames: 2861318144. Throughput: 0: 42750.1. Samples: 2861421340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:43,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 08:47:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174641_2861318144.pth... [2024-06-22 08:47:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174011_2850996224.pth [2024-06-22 08:47:47,172][15401] Updated weights for policy 0, policy_version 174650 (0.0031) [2024-06-22 08:47:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2861514752. Throughput: 0: 42910.0. Samples: 2861684460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 08:47:50,587][15401] Updated weights for policy 0, policy_version 174660 (0.0035) [2024-06-22 08:47:53,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2861727744. Throughput: 0: 42811.9. Samples: 2861810760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 08:47:54,806][15401] Updated weights for policy 0, policy_version 174670 (0.0042) [2024-06-22 08:47:58,147][15401] Updated weights for policy 0, policy_version 174680 (0.0029) [2024-06-22 08:47:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 2861957120. Throughput: 0: 43064.4. Samples: 2862068660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:47:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 08:48:02,643][15401] Updated weights for policy 0, policy_version 174690 (0.0034) [2024-06-22 08:48:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2862137344. Throughput: 0: 42833.4. Samples: 2862320660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:48:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 08:48:05,818][15401] Updated weights for policy 0, policy_version 174700 (0.0028) [2024-06-22 08:48:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2862366720. Throughput: 0: 42710.1. Samples: 2862447720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:48:08,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 08:48:09,891][15349] Signal inference workers to stop experience collection... (42150 times) [2024-06-22 08:48:09,894][15349] Signal inference workers to resume experience collection... (42150 times) [2024-06-22 08:48:09,912][15401] InferenceWorker_p0-w0: stopping experience collection (42150 times) [2024-06-22 08:48:09,912][15401] InferenceWorker_p0-w0: resuming experience collection (42150 times) [2024-06-22 08:48:10,038][15401] Updated weights for policy 0, policy_version 174710 (0.0028) [2024-06-22 08:48:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43149.1, 300 sec: 43098.2). Total num frames: 2862596096. Throughput: 0: 42860.1. Samples: 2862710220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:48:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 08:48:13,588][15401] Updated weights for policy 0, policy_version 174720 (0.0037) [2024-06-22 08:48:17,657][15401] Updated weights for policy 0, policy_version 174730 (0.0034) [2024-06-22 08:48:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42874.5, 300 sec: 42931.6). Total num frames: 2862809088. Throughput: 0: 42896.0. Samples: 2862970940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:48:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 08:48:21,202][15401] Updated weights for policy 0, policy_version 174740 (0.0034) [2024-06-22 08:48:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 2863022080. Throughput: 0: 42793.8. Samples: 2863097940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 08:48:23,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 08:48:25,277][15401] Updated weights for policy 0, policy_version 174750 (0.0037) [2024-06-22 08:48:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42931.7). Total num frames: 2863202304. Throughput: 0: 42939.7. Samples: 2863353580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:28,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 08:48:29,027][15401] Updated weights for policy 0, policy_version 174760 (0.0025) [2024-06-22 08:48:33,049][15401] Updated weights for policy 0, policy_version 174770 (0.0033) [2024-06-22 08:48:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2863431680. Throughput: 0: 42843.2. Samples: 2863612400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:33,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 08:48:36,416][15401] Updated weights for policy 0, policy_version 174780 (0.0048) [2024-06-22 08:48:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2863677440. Throughput: 0: 42959.2. Samples: 2863743920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 08:48:40,492][15401] Updated weights for policy 0, policy_version 174790 (0.0037) [2024-06-22 08:48:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42326.0, 300 sec: 42931.6). Total num frames: 2863857664. Throughput: 0: 42994.1. Samples: 2864003400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 08:48:44,005][15401] Updated weights for policy 0, policy_version 174800 (0.0038) [2024-06-22 08:48:48,189][15401] Updated weights for policy 0, policy_version 174810 (0.0045) [2024-06-22 08:48:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2864087040. Throughput: 0: 43105.7. Samples: 2864260420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 08:48:51,682][15401] Updated weights for policy 0, policy_version 174820 (0.0030) [2024-06-22 08:48:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2864316416. Throughput: 0: 43214.8. Samples: 2864392380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 08:48:55,671][15401] Updated weights for policy 0, policy_version 174830 (0.0032) [2024-06-22 08:48:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 2864513024. Throughput: 0: 42960.6. Samples: 2864643440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:48:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 08:48:59,325][15401] Updated weights for policy 0, policy_version 174840 (0.0045) [2024-06-22 08:49:03,348][15401] Updated weights for policy 0, policy_version 174850 (0.0034) [2024-06-22 08:49:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2864742400. Throughput: 0: 42877.0. Samples: 2864900400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:03,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 08:49:06,931][15401] Updated weights for policy 0, policy_version 174860 (0.0032) [2024-06-22 08:49:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2864955392. Throughput: 0: 42911.1. Samples: 2865028940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:08,399][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 08:49:11,000][15401] Updated weights for policy 0, policy_version 174870 (0.0030) [2024-06-22 08:49:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2865152000. Throughput: 0: 42864.8. Samples: 2865282500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 08:49:14,855][15401] Updated weights for policy 0, policy_version 174880 (0.0036) [2024-06-22 08:49:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2865364992. Throughput: 0: 42711.6. Samples: 2865534420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 08:49:18,646][15401] Updated weights for policy 0, policy_version 174890 (0.0037) [2024-06-22 08:49:22,898][15401] Updated weights for policy 0, policy_version 174900 (0.0036) [2024-06-22 08:49:23,393][15132] Fps is (10 sec: 42583.6, 60 sec: 42595.9, 300 sec: 42820.0). Total num frames: 2865577984. Throughput: 0: 42561.6. Samples: 2865659340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:23,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 08:49:26,965][15349] Signal inference workers to stop experience collection... (42200 times) [2024-06-22 08:49:27,012][15401] InferenceWorker_p0-w0: stopping experience collection (42200 times) [2024-06-22 08:49:27,019][15349] Signal inference workers to resume experience collection... (42200 times) [2024-06-22 08:49:27,030][15401] InferenceWorker_p0-w0: resuming experience collection (42200 times) [2024-06-22 08:49:27,037][15401] Updated weights for policy 0, policy_version 174910 (0.0035) [2024-06-22 08:49:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2865774592. Throughput: 0: 42504.4. Samples: 2865916100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 08:49:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 08:49:30,523][15401] Updated weights for policy 0, policy_version 174920 (0.0044) [2024-06-22 08:49:33,389][15132] Fps is (10 sec: 44252.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2866020352. Throughput: 0: 42341.8. Samples: 2866165800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:33,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 08:49:34,584][15401] Updated weights for policy 0, policy_version 174930 (0.0034) [2024-06-22 08:49:38,124][15401] Updated weights for policy 0, policy_version 174940 (0.0026) [2024-06-22 08:49:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2866233344. Throughput: 0: 42347.9. Samples: 2866298040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 08:49:42,189][15401] Updated weights for policy 0, policy_version 174950 (0.0033) [2024-06-22 08:49:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2866429952. Throughput: 0: 42524.8. Samples: 2866557060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 08:49:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174953_2866429952.pth... [2024-06-22 08:49:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174327_2856173568.pth [2024-06-22 08:49:45,573][15401] Updated weights for policy 0, policy_version 174960 (0.0040) [2024-06-22 08:49:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2866642944. Throughput: 0: 42516.9. Samples: 2866813660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:48,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 08:49:49,806][15401] Updated weights for policy 0, policy_version 174970 (0.0036) [2024-06-22 08:49:53,037][15401] Updated weights for policy 0, policy_version 174980 (0.0024) [2024-06-22 08:49:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 2866888704. Throughput: 0: 42573.2. Samples: 2866944740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 08:49:57,328][15401] Updated weights for policy 0, policy_version 174990 (0.0036) [2024-06-22 08:49:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2867085312. Throughput: 0: 42637.8. Samples: 2867201200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:49:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 08:50:01,145][15401] Updated weights for policy 0, policy_version 175000 (0.0033) [2024-06-22 08:50:03,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 2867298304. Throughput: 0: 42566.5. Samples: 2867450020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:03,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 08:50:04,946][15401] Updated weights for policy 0, policy_version 175010 (0.0023) [2024-06-22 08:50:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2867494912. Throughput: 0: 42802.9. Samples: 2867585320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 08:50:08,886][15401] Updated weights for policy 0, policy_version 175020 (0.0032) [2024-06-22 08:50:12,607][15401] Updated weights for policy 0, policy_version 175030 (0.0028) [2024-06-22 08:50:13,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2867707904. Throughput: 0: 42686.3. Samples: 2867836980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 08:50:16,434][15401] Updated weights for policy 0, policy_version 175040 (0.0029) [2024-06-22 08:50:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2867937280. Throughput: 0: 42916.8. Samples: 2868097060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 08:50:20,212][15401] Updated weights for policy 0, policy_version 175050 (0.0044) [2024-06-22 08:50:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 2868150272. Throughput: 0: 42841.9. Samples: 2868225920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:23,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 08:50:24,026][15401] Updated weights for policy 0, policy_version 175060 (0.0028) [2024-06-22 08:50:28,293][15401] Updated weights for policy 0, policy_version 175070 (0.0029) [2024-06-22 08:50:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2868346880. Throughput: 0: 42662.3. Samples: 2868476860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 08:50:31,483][15401] Updated weights for policy 0, policy_version 175080 (0.0033) [2024-06-22 08:50:33,390][15132] Fps is (10 sec: 44234.3, 60 sec: 42871.0, 300 sec: 42820.5). Total num frames: 2868592640. Throughput: 0: 42700.7. Samples: 2868735220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 08:50:33,391][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 08:50:35,795][15401] Updated weights for policy 0, policy_version 175090 (0.0036) [2024-06-22 08:50:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 2868789248. Throughput: 0: 42821.9. Samples: 2868871820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:50:38,392][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 08:50:39,083][15401] Updated weights for policy 0, policy_version 175100 (0.0049) [2024-06-22 08:50:43,390][15132] Fps is (10 sec: 39323.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2868985856. Throughput: 0: 42696.9. Samples: 2869122560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:50:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 08:50:43,547][15401] Updated weights for policy 0, policy_version 175110 (0.0038) [2024-06-22 08:50:46,829][15401] Updated weights for policy 0, policy_version 175120 (0.0036) [2024-06-22 08:50:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2869231616. Throughput: 0: 42776.9. Samples: 2869374880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:50:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 08:50:51,149][15401] Updated weights for policy 0, policy_version 175130 (0.0025) [2024-06-22 08:50:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42765.4). Total num frames: 2869411840. Throughput: 0: 42726.2. Samples: 2869508000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:50:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 08:50:54,475][15401] Updated weights for policy 0, policy_version 175140 (0.0034) [2024-06-22 08:50:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2869624832. Throughput: 0: 42671.1. Samples: 2869757180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:50:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 08:50:58,814][15401] Updated weights for policy 0, policy_version 175150 (0.0027) [2024-06-22 08:51:02,025][15401] Updated weights for policy 0, policy_version 175160 (0.0033) [2024-06-22 08:51:03,222][15349] Signal inference workers to stop experience collection... (42250 times) [2024-06-22 08:51:03,222][15349] Signal inference workers to resume experience collection... (42250 times) [2024-06-22 08:51:03,255][15401] InferenceWorker_p0-w0: stopping experience collection (42250 times) [2024-06-22 08:51:03,255][15401] InferenceWorker_p0-w0: resuming experience collection (42250 times) [2024-06-22 08:51:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2869870592. Throughput: 0: 42636.4. Samples: 2870015700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 08:51:06,439][15401] Updated weights for policy 0, policy_version 175170 (0.0028) [2024-06-22 08:51:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2870034432. Throughput: 0: 42669.3. Samples: 2870146040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 08:51:09,947][15401] Updated weights for policy 0, policy_version 175180 (0.0031) [2024-06-22 08:51:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2870280192. Throughput: 0: 42572.4. Samples: 2870392620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 08:51:14,308][15401] Updated weights for policy 0, policy_version 175190 (0.0032) [2024-06-22 08:51:17,513][15401] Updated weights for policy 0, policy_version 175200 (0.0022) [2024-06-22 08:51:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2870509568. Throughput: 0: 42667.5. Samples: 2870655240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 08:51:22,407][15401] Updated weights for policy 0, policy_version 175210 (0.0026) [2024-06-22 08:51:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2870689792. Throughput: 0: 42484.0. Samples: 2870783500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 08:51:25,196][15401] Updated weights for policy 0, policy_version 175220 (0.0026) [2024-06-22 08:51:28,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2870902784. Throughput: 0: 42313.8. Samples: 2871026680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 08:51:29,950][15401] Updated weights for policy 0, policy_version 175230 (0.0027) [2024-06-22 08:51:33,210][15401] Updated weights for policy 0, policy_version 175240 (0.0022) [2024-06-22 08:51:33,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42597.1, 300 sec: 42820.2). Total num frames: 2871148544. Throughput: 0: 42606.2. Samples: 2871292260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:33,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 08:51:37,949][15401] Updated weights for policy 0, policy_version 175250 (0.0046) [2024-06-22 08:51:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 2871296000. Throughput: 0: 42621.4. Samples: 2871425960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:51:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 08:51:40,800][15401] Updated weights for policy 0, policy_version 175260 (0.0032) [2024-06-22 08:51:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2871558144. Throughput: 0: 42389.4. Samples: 2871664700. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:51:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 08:51:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175266_2871558144.pth... [2024-06-22 08:51:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174641_2861318144.pth [2024-06-22 08:51:45,589][15401] Updated weights for policy 0, policy_version 175270 (0.0027) [2024-06-22 08:51:48,294][15401] Updated weights for policy 0, policy_version 175280 (0.0033) [2024-06-22 08:51:48,389][15132] Fps is (10 sec: 49152.4, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2871787520. Throughput: 0: 42508.6. Samples: 2871928580. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:51:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 08:51:52,984][15401] Updated weights for policy 0, policy_version 175290 (0.0034) [2024-06-22 08:51:53,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 2871951360. Throughput: 0: 42469.5. Samples: 2872057440. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:51:53,396][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 08:51:56,098][15401] Updated weights for policy 0, policy_version 175300 (0.0035) [2024-06-22 08:51:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2872213504. Throughput: 0: 42628.0. Samples: 2872310880. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:51:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 08:52:00,366][15401] Updated weights for policy 0, policy_version 175310 (0.0043) [2024-06-22 08:52:03,390][15132] Fps is (10 sec: 47543.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2872426496. Throughput: 0: 42645.0. Samples: 2872574260. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 08:52:03,545][15401] Updated weights for policy 0, policy_version 175320 (0.0041) [2024-06-22 08:52:07,711][15401] Updated weights for policy 0, policy_version 175330 (0.0029) [2024-06-22 08:52:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 2872606720. Throughput: 0: 42707.1. Samples: 2872705320. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 08:52:11,202][15401] Updated weights for policy 0, policy_version 175340 (0.0034) [2024-06-22 08:52:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42821.2). Total num frames: 2872868864. Throughput: 0: 42940.0. Samples: 2872958980. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:13,394][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 08:52:15,753][15401] Updated weights for policy 0, policy_version 175350 (0.0034) [2024-06-22 08:52:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 2873065472. Throughput: 0: 42863.5. Samples: 2873221020. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 08:52:18,960][15401] Updated weights for policy 0, policy_version 175360 (0.0047) [2024-06-22 08:52:23,342][15401] Updated weights for policy 0, policy_version 175370 (0.0043) [2024-06-22 08:52:23,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2873262080. Throughput: 0: 42684.8. Samples: 2873346880. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:23,393][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 08:52:24,474][15349] Signal inference workers to stop experience collection... (42300 times) [2024-06-22 08:52:24,503][15401] InferenceWorker_p0-w0: stopping experience collection (42300 times) [2024-06-22 08:52:24,530][15349] Signal inference workers to resume experience collection... (42300 times) [2024-06-22 08:52:24,530][15401] InferenceWorker_p0-w0: resuming experience collection (42300 times) [2024-06-22 08:52:26,763][15401] Updated weights for policy 0, policy_version 175380 (0.0034) [2024-06-22 08:52:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2873507840. Throughput: 0: 42979.4. Samples: 2873598780. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 08:52:30,794][15401] Updated weights for policy 0, policy_version 175390 (0.0041) [2024-06-22 08:52:33,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 2873688064. Throughput: 0: 42929.3. Samples: 2873860400. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 08:52:34,549][15401] Updated weights for policy 0, policy_version 175400 (0.0028) [2024-06-22 08:52:38,341][15401] Updated weights for policy 0, policy_version 175410 (0.0029) [2024-06-22 08:52:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43690.6, 300 sec: 42709.6). Total num frames: 2873917440. Throughput: 0: 42804.7. Samples: 2873983380. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:38,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 08:52:42,152][15401] Updated weights for policy 0, policy_version 175420 (0.0035) [2024-06-22 08:52:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2874130432. Throughput: 0: 42919.0. Samples: 2874242240. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-22 08:52:43,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 08:52:45,781][15401] Updated weights for policy 0, policy_version 175430 (0.0032) [2024-06-22 08:52:48,390][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2874294272. Throughput: 0: 42925.8. Samples: 2874505920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:52:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 08:52:49,838][15401] Updated weights for policy 0, policy_version 175440 (0.0040) [2024-06-22 08:52:53,289][15401] Updated weights for policy 0, policy_version 175450 (0.0039) [2024-06-22 08:52:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43695.3, 300 sec: 42765.0). Total num frames: 2874572800. Throughput: 0: 42705.7. Samples: 2874627080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:52:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 08:52:57,446][15401] Updated weights for policy 0, policy_version 175460 (0.0048) [2024-06-22 08:52:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2874769408. Throughput: 0: 42836.5. Samples: 2874886620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:52:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 08:53:01,242][15401] Updated weights for policy 0, policy_version 175470 (0.0029) [2024-06-22 08:53:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2874949632. Throughput: 0: 42693.0. Samples: 2875142200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 08:53:05,262][15401] Updated weights for policy 0, policy_version 175480 (0.0026) [2024-06-22 08:53:08,395][15132] Fps is (10 sec: 42574.3, 60 sec: 43140.5, 300 sec: 42708.7). Total num frames: 2875195392. Throughput: 0: 42555.7. Samples: 2875262020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:08,396][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 08:53:09,004][15401] Updated weights for policy 0, policy_version 175490 (0.0039) [2024-06-22 08:53:12,903][15401] Updated weights for policy 0, policy_version 175500 (0.0050) [2024-06-22 08:53:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2875408384. Throughput: 0: 42850.0. Samples: 2875527020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 08:53:16,676][15401] Updated weights for policy 0, policy_version 175510 (0.0044) [2024-06-22 08:53:18,389][15132] Fps is (10 sec: 40983.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2875604992. Throughput: 0: 42811.5. Samples: 2875786920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 08:53:20,495][15401] Updated weights for policy 0, policy_version 175520 (0.0035) [2024-06-22 08:53:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 2875850752. Throughput: 0: 42724.9. Samples: 2875906000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 08:53:24,051][15401] Updated weights for policy 0, policy_version 175530 (0.0033) [2024-06-22 08:53:28,180][15401] Updated weights for policy 0, policy_version 175540 (0.0038) [2024-06-22 08:53:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2876047360. Throughput: 0: 42983.2. Samples: 2876176480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 08:53:31,254][15349] Signal inference workers to stop experience collection... (42350 times) [2024-06-22 08:53:31,295][15401] InferenceWorker_p0-w0: stopping experience collection (42350 times) [2024-06-22 08:53:31,371][15349] Signal inference workers to resume experience collection... (42350 times) [2024-06-22 08:53:31,371][15401] InferenceWorker_p0-w0: resuming experience collection (42350 times) [2024-06-22 08:53:31,528][15401] Updated weights for policy 0, policy_version 175550 (0.0030) [2024-06-22 08:53:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2876260352. Throughput: 0: 42659.2. Samples: 2876425580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 08:53:36,092][15401] Updated weights for policy 0, policy_version 175560 (0.0034) [2024-06-22 08:53:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2876473344. Throughput: 0: 42829.0. Samples: 2876554380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 08:53:39,483][15401] Updated weights for policy 0, policy_version 175570 (0.0022) [2024-06-22 08:53:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2876686336. Throughput: 0: 42847.1. Samples: 2876814740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:43,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 08:53:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175580_2876702720.pth... [2024-06-22 08:53:43,516][15401] Updated weights for policy 0, policy_version 175580 (0.0034) [2024-06-22 08:53:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000174953_2866429952.pth [2024-06-22 08:53:47,242][15401] Updated weights for policy 0, policy_version 175590 (0.0032) [2024-06-22 08:53:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 2876915712. Throughput: 0: 42827.8. Samples: 2877069460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 08:53:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 08:53:51,039][15401] Updated weights for policy 0, policy_version 175600 (0.0037) [2024-06-22 08:53:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2877112320. Throughput: 0: 43000.9. Samples: 2877196820. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:53:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 08:53:54,719][15401] Updated weights for policy 0, policy_version 175610 (0.0035) [2024-06-22 08:53:58,391][15132] Fps is (10 sec: 40954.1, 60 sec: 42597.3, 300 sec: 42653.7). Total num frames: 2877325312. Throughput: 0: 42860.7. Samples: 2877455820. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:53:58,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 08:53:58,941][15401] Updated weights for policy 0, policy_version 175620 (0.0034) [2024-06-22 08:54:02,404][15401] Updated weights for policy 0, policy_version 175630 (0.0045) [2024-06-22 08:54:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2877554688. Throughput: 0: 42871.1. Samples: 2877716120. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:03,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 08:54:06,447][15401] Updated weights for policy 0, policy_version 175640 (0.0041) [2024-06-22 08:54:08,389][15132] Fps is (10 sec: 44243.8, 60 sec: 42875.5, 300 sec: 42765.0). Total num frames: 2877767680. Throughput: 0: 43036.1. Samples: 2877842620. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 08:54:09,857][15401] Updated weights for policy 0, policy_version 175650 (0.0034) [2024-06-22 08:54:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2877980672. Throughput: 0: 42842.7. Samples: 2878104400. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 08:54:14,227][15401] Updated weights for policy 0, policy_version 175660 (0.0041) [2024-06-22 08:54:17,517][15401] Updated weights for policy 0, policy_version 175670 (0.0041) [2024-06-22 08:54:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.5). Total num frames: 2878193664. Throughput: 0: 42936.8. Samples: 2878357740. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 08:54:21,800][15401] Updated weights for policy 0, policy_version 175680 (0.0029) [2024-06-22 08:54:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2878406656. Throughput: 0: 42954.6. Samples: 2878487340. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 08:54:25,237][15401] Updated weights for policy 0, policy_version 175690 (0.0027) [2024-06-22 08:54:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2878603264. Throughput: 0: 42810.0. Samples: 2878741200. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 08:54:29,550][15401] Updated weights for policy 0, policy_version 175700 (0.0039) [2024-06-22 08:54:33,104][15401] Updated weights for policy 0, policy_version 175710 (0.0028) [2024-06-22 08:54:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2878832640. Throughput: 0: 42921.0. Samples: 2879000900. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 08:54:37,151][15401] Updated weights for policy 0, policy_version 175720 (0.0041) [2024-06-22 08:54:38,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2879062016. Throughput: 0: 43020.8. Samples: 2879132860. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:38,393][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 08:54:40,853][15401] Updated weights for policy 0, policy_version 175730 (0.0037) [2024-06-22 08:54:43,392][15132] Fps is (10 sec: 42586.0, 60 sec: 42869.3, 300 sec: 42764.6). Total num frames: 2879258624. Throughput: 0: 42860.5. Samples: 2879384600. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:43,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 08:54:44,704][15401] Updated weights for policy 0, policy_version 175740 (0.0027) [2024-06-22 08:54:48,305][15401] Updated weights for policy 0, policy_version 175750 (0.0045) [2024-06-22 08:54:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2879488000. Throughput: 0: 42871.6. Samples: 2879645340. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 08:54:52,407][15401] Updated weights for policy 0, policy_version 175760 (0.0026) [2024-06-22 08:54:53,390][15132] Fps is (10 sec: 44249.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2879700992. Throughput: 0: 42943.1. Samples: 2879775060. Policy #0 lag: (min: 2.0, avg: 12.1, max: 24.0) [2024-06-22 08:54:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 08:54:56,317][15401] Updated weights for policy 0, policy_version 175770 (0.0035) [2024-06-22 08:54:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42872.6, 300 sec: 42709.8). Total num frames: 2879897600. Throughput: 0: 42630.2. Samples: 2880022760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:54:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 08:54:59,970][15401] Updated weights for policy 0, policy_version 175780 (0.0032) [2024-06-22 08:55:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 2880110592. Throughput: 0: 42743.5. Samples: 2880281300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:03,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 08:55:03,858][15401] Updated weights for policy 0, policy_version 175790 (0.0037) [2024-06-22 08:55:07,369][15349] Signal inference workers to stop experience collection... (42400 times) [2024-06-22 08:55:07,370][15349] Signal inference workers to resume experience collection... (42400 times) [2024-06-22 08:55:07,407][15401] InferenceWorker_p0-w0: stopping experience collection (42400 times) [2024-06-22 08:55:07,407][15401] InferenceWorker_p0-w0: resuming experience collection (42400 times) [2024-06-22 08:55:07,518][15401] Updated weights for policy 0, policy_version 175800 (0.0037) [2024-06-22 08:55:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2880339968. Throughput: 0: 42836.5. Samples: 2880414980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:08,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 08:55:11,411][15401] Updated weights for policy 0, policy_version 175810 (0.0041) [2024-06-22 08:55:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2880536576. Throughput: 0: 42893.4. Samples: 2880671400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:13,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 08:55:15,169][15401] Updated weights for policy 0, policy_version 175820 (0.0035) [2024-06-22 08:55:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2880765952. Throughput: 0: 42819.4. Samples: 2880927780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:18,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 08:55:18,934][15401] Updated weights for policy 0, policy_version 175830 (0.0042) [2024-06-22 08:55:22,743][15401] Updated weights for policy 0, policy_version 175840 (0.0038) [2024-06-22 08:55:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2880962560. Throughput: 0: 42817.6. Samples: 2881059560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 08:55:26,354][15401] Updated weights for policy 0, policy_version 175850 (0.0028) [2024-06-22 08:55:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2881175552. Throughput: 0: 42956.2. Samples: 2881317500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 08:55:30,240][15401] Updated weights for policy 0, policy_version 175860 (0.0032) [2024-06-22 08:55:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2881404928. Throughput: 0: 42852.4. Samples: 2881573700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 08:55:34,117][15401] Updated weights for policy 0, policy_version 175870 (0.0031) [2024-06-22 08:55:38,010][15401] Updated weights for policy 0, policy_version 175880 (0.0032) [2024-06-22 08:55:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2881617920. Throughput: 0: 42841.4. Samples: 2881702920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 08:55:41,766][15401] Updated weights for policy 0, policy_version 175890 (0.0039) [2024-06-22 08:55:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.5, 300 sec: 42709.5). Total num frames: 2881830912. Throughput: 0: 42928.0. Samples: 2881954520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 08:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175893_2881830912.pth... [2024-06-22 08:55:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175266_2871558144.pth [2024-06-22 08:55:45,595][15401] Updated weights for policy 0, policy_version 175900 (0.0042) [2024-06-22 08:55:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2882043904. Throughput: 0: 43096.8. Samples: 2882220560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 08:55:49,390][15401] Updated weights for policy 0, policy_version 175910 (0.0032) [2024-06-22 08:55:53,278][15401] Updated weights for policy 0, policy_version 175920 (0.0037) [2024-06-22 08:55:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 2882273280. Throughput: 0: 42905.7. Samples: 2882345840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:53,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 08:55:56,835][15401] Updated weights for policy 0, policy_version 175930 (0.0032) [2024-06-22 08:55:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2882469888. Throughput: 0: 42704.7. Samples: 2882593120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:55:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 08:56:01,029][15401] Updated weights for policy 0, policy_version 175940 (0.0026) [2024-06-22 08:56:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2882682880. Throughput: 0: 42888.6. Samples: 2882857760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 08:56:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 08:56:04,432][15401] Updated weights for policy 0, policy_version 175950 (0.0028) [2024-06-22 08:56:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2882895872. Throughput: 0: 42772.6. Samples: 2882984320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:08,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 08:56:08,709][15401] Updated weights for policy 0, policy_version 175960 (0.0032) [2024-06-22 08:56:12,322][15401] Updated weights for policy 0, policy_version 175970 (0.0035) [2024-06-22 08:56:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2883108864. Throughput: 0: 42667.0. Samples: 2883237520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:13,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 08:56:16,582][15401] Updated weights for policy 0, policy_version 175980 (0.0033) [2024-06-22 08:56:18,394][15132] Fps is (10 sec: 44216.4, 60 sec: 42868.3, 300 sec: 42875.4). Total num frames: 2883338240. Throughput: 0: 42724.5. Samples: 2883496500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:18,395][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 08:56:19,801][15349] Signal inference workers to stop experience collection... (42450 times) [2024-06-22 08:56:19,844][15401] InferenceWorker_p0-w0: stopping experience collection (42450 times) [2024-06-22 08:56:19,851][15349] Signal inference workers to resume experience collection... (42450 times) [2024-06-22 08:56:19,856][15401] InferenceWorker_p0-w0: resuming experience collection (42450 times) [2024-06-22 08:56:20,002][15401] Updated weights for policy 0, policy_version 175990 (0.0034) [2024-06-22 08:56:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2883518464. Throughput: 0: 42690.7. Samples: 2883624000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:23,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 08:56:24,294][15401] Updated weights for policy 0, policy_version 176000 (0.0036) [2024-06-22 08:56:27,629][15401] Updated weights for policy 0, policy_version 176010 (0.0030) [2024-06-22 08:56:28,389][15132] Fps is (10 sec: 42618.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 2883764224. Throughput: 0: 42873.8. Samples: 2883883840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 08:56:31,746][15401] Updated weights for policy 0, policy_version 176020 (0.0039) [2024-06-22 08:56:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2883977216. Throughput: 0: 42649.1. Samples: 2884139760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:33,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 08:56:35,325][15401] Updated weights for policy 0, policy_version 176030 (0.0030) [2024-06-22 08:56:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 2884173824. Throughput: 0: 42726.3. Samples: 2884268520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:38,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 08:56:39,254][15401] Updated weights for policy 0, policy_version 176040 (0.0030) [2024-06-22 08:56:43,062][15401] Updated weights for policy 0, policy_version 176050 (0.0037) [2024-06-22 08:56:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2884419584. Throughput: 0: 43008.3. Samples: 2884528480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 08:56:46,867][15401] Updated weights for policy 0, policy_version 176060 (0.0038) [2024-06-22 08:56:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.6, 300 sec: 42932.6). Total num frames: 2884616192. Throughput: 0: 42828.1. Samples: 2884785020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 08:56:50,733][15401] Updated weights for policy 0, policy_version 176070 (0.0043) [2024-06-22 08:56:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 2884829184. Throughput: 0: 42869.9. Samples: 2884913460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 08:56:54,693][15401] Updated weights for policy 0, policy_version 176080 (0.0037) [2024-06-22 08:56:58,203][15401] Updated weights for policy 0, policy_version 176090 (0.0039) [2024-06-22 08:56:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43143.0, 300 sec: 42820.2). Total num frames: 2885058560. Throughput: 0: 42856.4. Samples: 2885166160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:56:58,393][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 08:57:02,169][15401] Updated weights for policy 0, policy_version 176100 (0.0031) [2024-06-22 08:57:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2885255168. Throughput: 0: 42871.1. Samples: 2885425500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:57:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 08:57:06,202][15401] Updated weights for policy 0, policy_version 176110 (0.0049) [2024-06-22 08:57:08,393][15132] Fps is (10 sec: 39316.6, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 2885451776. Throughput: 0: 42898.7. Samples: 2885554600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 08:57:08,394][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 08:57:10,360][15401] Updated weights for policy 0, policy_version 176120 (0.0039) [2024-06-22 08:57:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2885681152. Throughput: 0: 42811.5. Samples: 2885810360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 08:57:13,858][15401] Updated weights for policy 0, policy_version 176130 (0.0037) [2024-06-22 08:57:17,858][15401] Updated weights for policy 0, policy_version 176140 (0.0029) [2024-06-22 08:57:18,389][15132] Fps is (10 sec: 45892.4, 60 sec: 42874.8, 300 sec: 42876.5). Total num frames: 2885910528. Throughput: 0: 42901.8. Samples: 2886070340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 08:57:21,636][15401] Updated weights for policy 0, policy_version 176150 (0.0027) [2024-06-22 08:57:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2886107136. Throughput: 0: 42968.4. Samples: 2886202000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:23,392][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 08:57:25,290][15401] Updated weights for policy 0, policy_version 176160 (0.0027) [2024-06-22 08:57:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2886336512. Throughput: 0: 42872.9. Samples: 2886457760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 08:57:28,995][15401] Updated weights for policy 0, policy_version 176170 (0.0029) [2024-06-22 08:57:32,792][15401] Updated weights for policy 0, policy_version 176180 (0.0033) [2024-06-22 08:57:33,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2886565888. Throughput: 0: 42953.8. Samples: 2886717940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 08:57:36,532][15401] Updated weights for policy 0, policy_version 176190 (0.0039) [2024-06-22 08:57:37,368][15349] Signal inference workers to stop experience collection... (42500 times) [2024-06-22 08:57:37,419][15401] InferenceWorker_p0-w0: stopping experience collection (42500 times) [2024-06-22 08:57:37,428][15349] Signal inference workers to resume experience collection... (42500 times) [2024-06-22 08:57:37,437][15401] InferenceWorker_p0-w0: resuming experience collection (42500 times) [2024-06-22 08:57:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 2886762496. Throughput: 0: 43034.5. Samples: 2886850120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:38,392][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 08:57:40,406][15401] Updated weights for policy 0, policy_version 176200 (0.0042) [2024-06-22 08:57:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 2886975488. Throughput: 0: 42997.9. Samples: 2887100960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 08:57:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176208_2886991872.pth... [2024-06-22 08:57:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175580_2876702720.pth [2024-06-22 08:57:44,017][15401] Updated weights for policy 0, policy_version 176210 (0.0033) [2024-06-22 08:57:48,022][15401] Updated weights for policy 0, policy_version 176220 (0.0034) [2024-06-22 08:57:48,393][15132] Fps is (10 sec: 44231.8, 60 sec: 43142.0, 300 sec: 42820.1). Total num frames: 2887204864. Throughput: 0: 42972.6. Samples: 2887359420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:48,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 08:57:51,587][15401] Updated weights for policy 0, policy_version 176230 (0.0022) [2024-06-22 08:57:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2887417856. Throughput: 0: 42955.0. Samples: 2887487420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 08:57:55,698][15401] Updated weights for policy 0, policy_version 176240 (0.0034) [2024-06-22 08:57:58,390][15132] Fps is (10 sec: 40974.2, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 2887614464. Throughput: 0: 43027.1. Samples: 2887746580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:57:58,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 08:57:59,260][15401] Updated weights for policy 0, policy_version 176250 (0.0033) [2024-06-22 08:58:03,248][15401] Updated weights for policy 0, policy_version 176260 (0.0042) [2024-06-22 08:58:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.9). Total num frames: 2887843840. Throughput: 0: 43001.6. Samples: 2888005420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:58:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 08:58:07,535][15401] Updated weights for policy 0, policy_version 176270 (0.0043) [2024-06-22 08:58:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43420.2, 300 sec: 42876.1). Total num frames: 2888056832. Throughput: 0: 42877.4. Samples: 2888131480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:58:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 08:58:11,443][15401] Updated weights for policy 0, policy_version 176280 (0.0023) [2024-06-22 08:58:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2888269824. Throughput: 0: 42819.3. Samples: 2888384640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 08:58:13,400][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 08:58:15,060][15401] Updated weights for policy 0, policy_version 176290 (0.0033) [2024-06-22 08:58:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2888466432. Throughput: 0: 42631.4. Samples: 2888636360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 08:58:18,983][15401] Updated weights for policy 0, policy_version 176300 (0.0034) [2024-06-22 08:58:22,679][15401] Updated weights for policy 0, policy_version 176310 (0.0031) [2024-06-22 08:58:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2888695808. Throughput: 0: 42534.7. Samples: 2888764080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 08:58:26,685][15401] Updated weights for policy 0, policy_version 176320 (0.0032) [2024-06-22 08:58:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2888908800. Throughput: 0: 42717.3. Samples: 2889023240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 08:58:30,498][15401] Updated weights for policy 0, policy_version 176330 (0.0028) [2024-06-22 08:58:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2889105408. Throughput: 0: 42657.6. Samples: 2889278860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 08:58:34,429][15401] Updated weights for policy 0, policy_version 176340 (0.0033) [2024-06-22 08:58:38,037][15401] Updated weights for policy 0, policy_version 176350 (0.0034) [2024-06-22 08:58:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 2889318400. Throughput: 0: 42543.5. Samples: 2889401880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 08:58:42,149][15401] Updated weights for policy 0, policy_version 176360 (0.0028) [2024-06-22 08:58:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2889547776. Throughput: 0: 42644.0. Samples: 2889665560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 08:58:45,495][15401] Updated weights for policy 0, policy_version 176370 (0.0035) [2024-06-22 08:58:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.9, 300 sec: 42876.1). Total num frames: 2889760768. Throughput: 0: 42451.2. Samples: 2889915720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 08:58:49,869][15401] Updated weights for policy 0, policy_version 176380 (0.0033) [2024-06-22 08:58:53,214][15401] Updated weights for policy 0, policy_version 176390 (0.0033) [2024-06-22 08:58:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 2889973760. Throughput: 0: 42528.0. Samples: 2890045240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 08:58:57,449][15401] Updated weights for policy 0, policy_version 176400 (0.0047) [2024-06-22 08:58:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2890170368. Throughput: 0: 42800.5. Samples: 2890310660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:58:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 08:59:00,751][15401] Updated weights for policy 0, policy_version 176410 (0.0031) [2024-06-22 08:59:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 2890399744. Throughput: 0: 42705.8. Samples: 2890558120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:59:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 08:59:05,178][15401] Updated weights for policy 0, policy_version 176420 (0.0039) [2024-06-22 08:59:08,383][15401] Updated weights for policy 0, policy_version 176430 (0.0033) [2024-06-22 08:59:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2890629120. Throughput: 0: 42694.3. Samples: 2890685320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:59:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 08:59:09,356][15349] Signal inference workers to stop experience collection... (42550 times) [2024-06-22 08:59:09,356][15349] Signal inference workers to resume experience collection... (42550 times) [2024-06-22 08:59:09,391][15401] InferenceWorker_p0-w0: stopping experience collection (42550 times) [2024-06-22 08:59:09,392][15401] InferenceWorker_p0-w0: resuming experience collection (42550 times) [2024-06-22 08:59:12,845][15401] Updated weights for policy 0, policy_version 176440 (0.0033) [2024-06-22 08:59:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2890809344. Throughput: 0: 42831.9. Samples: 2890950680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:59:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 08:59:15,911][15401] Updated weights for policy 0, policy_version 176450 (0.0037) [2024-06-22 08:59:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2891038720. Throughput: 0: 42698.5. Samples: 2891200300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 08:59:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 08:59:20,507][15401] Updated weights for policy 0, policy_version 176460 (0.0039) [2024-06-22 08:59:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2891251712. Throughput: 0: 42859.6. Samples: 2891330560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 08:59:24,110][15401] Updated weights for policy 0, policy_version 176470 (0.0028) [2024-06-22 08:59:28,134][15401] Updated weights for policy 0, policy_version 176480 (0.0029) [2024-06-22 08:59:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2891448320. Throughput: 0: 42673.0. Samples: 2891585840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 08:59:31,574][15401] Updated weights for policy 0, policy_version 176490 (0.0048) [2024-06-22 08:59:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2891661312. Throughput: 0: 42781.9. Samples: 2891840900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:33,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 08:59:35,771][15401] Updated weights for policy 0, policy_version 176500 (0.0040) [2024-06-22 08:59:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 2891907072. Throughput: 0: 42711.5. Samples: 2891967260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 08:59:39,446][15401] Updated weights for policy 0, policy_version 176510 (0.0030) [2024-06-22 08:59:43,251][15401] Updated weights for policy 0, policy_version 176520 (0.0024) [2024-06-22 08:59:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2892103680. Throughput: 0: 42624.1. Samples: 2892228740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 08:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176520_2892103680.pth... [2024-06-22 08:59:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000175893_2881830912.pth [2024-06-22 08:59:47,009][15401] Updated weights for policy 0, policy_version 176530 (0.0033) [2024-06-22 08:59:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2892300288. Throughput: 0: 42810.8. Samples: 2892484600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 08:59:51,031][15401] Updated weights for policy 0, policy_version 176540 (0.0044) [2024-06-22 08:59:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2892529664. Throughput: 0: 42767.6. Samples: 2892609860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 08:59:54,598][15401] Updated weights for policy 0, policy_version 176550 (0.0035) [2024-06-22 08:59:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 2892726272. Throughput: 0: 42566.3. Samples: 2892866160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 08:59:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 08:59:58,833][15401] Updated weights for policy 0, policy_version 176560 (0.0047) [2024-06-22 09:00:02,106][15401] Updated weights for policy 0, policy_version 176570 (0.0035) [2024-06-22 09:00:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2892955648. Throughput: 0: 42811.6. Samples: 2893126820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 09:00:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 09:00:06,305][15401] Updated weights for policy 0, policy_version 176580 (0.0039) [2024-06-22 09:00:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2893185024. Throughput: 0: 42804.6. Samples: 2893256760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 09:00:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 09:00:09,686][15401] Updated weights for policy 0, policy_version 176590 (0.0031) [2024-06-22 09:00:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 2893381632. Throughput: 0: 42780.9. Samples: 2893510980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 09:00:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 09:00:13,794][15401] Updated weights for policy 0, policy_version 176600 (0.0025) [2024-06-22 09:00:17,351][15401] Updated weights for policy 0, policy_version 176610 (0.0040) [2024-06-22 09:00:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 2893594624. Throughput: 0: 42842.5. Samples: 2893768920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 09:00:18,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 09:00:21,797][15401] Updated weights for policy 0, policy_version 176620 (0.0034) [2024-06-22 09:00:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2893824000. Throughput: 0: 42846.4. Samples: 2893895340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 09:00:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 09:00:24,930][15401] Updated weights for policy 0, policy_version 176630 (0.0034) [2024-06-22 09:00:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2894004224. Throughput: 0: 42632.4. Samples: 2894147200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 09:00:29,664][15401] Updated weights for policy 0, policy_version 176640 (0.0048) [2024-06-22 09:00:32,563][15401] Updated weights for policy 0, policy_version 176650 (0.0028) [2024-06-22 09:00:33,391][15132] Fps is (10 sec: 40952.0, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 2894233600. Throughput: 0: 42705.7. Samples: 2894406440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:33,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 09:00:37,173][15401] Updated weights for policy 0, policy_version 176660 (0.0036) [2024-06-22 09:00:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2894446592. Throughput: 0: 42921.3. Samples: 2894541320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:38,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 09:00:40,145][15401] Updated weights for policy 0, policy_version 176670 (0.0032) [2024-06-22 09:00:43,390][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2894659584. Throughput: 0: 42877.6. Samples: 2894795660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 09:00:43,781][15349] Signal inference workers to stop experience collection... (42600 times) [2024-06-22 09:00:43,781][15349] Signal inference workers to resume experience collection... (42600 times) [2024-06-22 09:00:43,826][15401] InferenceWorker_p0-w0: stopping experience collection (42600 times) [2024-06-22 09:00:43,826][15401] InferenceWorker_p0-w0: resuming experience collection (42600 times) [2024-06-22 09:00:44,601][15401] Updated weights for policy 0, policy_version 176680 (0.0030) [2024-06-22 09:00:47,670][15401] Updated weights for policy 0, policy_version 176690 (0.0038) [2024-06-22 09:00:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 2894888960. Throughput: 0: 42789.4. Samples: 2895052340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 09:00:52,302][15401] Updated weights for policy 0, policy_version 176700 (0.0031) [2024-06-22 09:00:53,394][15132] Fps is (10 sec: 44216.6, 60 sec: 42868.1, 300 sec: 42819.9). Total num frames: 2895101952. Throughput: 0: 42999.5. Samples: 2895191940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:53,395][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 09:00:55,239][15401] Updated weights for policy 0, policy_version 176710 (0.0034) [2024-06-22 09:00:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 2895298560. Throughput: 0: 42897.6. Samples: 2895441480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:00:58,393][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 09:00:59,846][15401] Updated weights for policy 0, policy_version 176720 (0.0033) [2024-06-22 09:01:02,778][15401] Updated weights for policy 0, policy_version 176730 (0.0028) [2024-06-22 09:01:03,389][15132] Fps is (10 sec: 44257.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2895544320. Throughput: 0: 42729.0. Samples: 2895691620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 09:01:07,399][15401] Updated weights for policy 0, policy_version 176740 (0.0024) [2024-06-22 09:01:08,395][15132] Fps is (10 sec: 42587.2, 60 sec: 42321.7, 300 sec: 42764.3). Total num frames: 2895724544. Throughput: 0: 42946.7. Samples: 2895828160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:08,395][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 09:01:10,645][15401] Updated weights for policy 0, policy_version 176750 (0.0049) [2024-06-22 09:01:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 2895953920. Throughput: 0: 42978.2. Samples: 2896081220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:13,391][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 09:01:15,026][15401] Updated weights for policy 0, policy_version 176760 (0.0041) [2024-06-22 09:01:18,390][15132] Fps is (10 sec: 45898.2, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 2896183296. Throughput: 0: 42909.8. Samples: 2896337300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:18,391][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 09:01:18,818][15401] Updated weights for policy 0, policy_version 176770 (0.0034) [2024-06-22 09:01:22,946][15401] Updated weights for policy 0, policy_version 176780 (0.0030) [2024-06-22 09:01:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2896379904. Throughput: 0: 42876.7. Samples: 2896470780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 09:01:26,422][15401] Updated weights for policy 0, policy_version 176790 (0.0031) [2024-06-22 09:01:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 2896609280. Throughput: 0: 42793.4. Samples: 2896721460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 09:01:28,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 09:01:30,855][15401] Updated weights for policy 0, policy_version 176800 (0.0038) [2024-06-22 09:01:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43419.0, 300 sec: 42932.0). Total num frames: 2896838656. Throughput: 0: 42790.6. Samples: 2896977920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 09:01:33,942][15401] Updated weights for policy 0, policy_version 176810 (0.0023) [2024-06-22 09:01:38,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2897002496. Throughput: 0: 42616.9. Samples: 2897109500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:38,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 09:01:38,433][15401] Updated weights for policy 0, policy_version 176820 (0.0033) [2024-06-22 09:01:41,406][15401] Updated weights for policy 0, policy_version 176830 (0.0029) [2024-06-22 09:01:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2897248256. Throughput: 0: 42867.7. Samples: 2897370420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 09:01:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176835_2897264640.pth... [2024-06-22 09:01:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176208_2886991872.pth [2024-06-22 09:01:45,962][15401] Updated weights for policy 0, policy_version 176840 (0.0037) [2024-06-22 09:01:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2897477632. Throughput: 0: 42959.5. Samples: 2897624800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:48,393][15132] Avg episode reward: [(0, '0.844')] [2024-06-22 09:01:49,156][15401] Updated weights for policy 0, policy_version 176850 (0.0033) [2024-06-22 09:01:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42601.7, 300 sec: 42709.8). Total num frames: 2897657856. Throughput: 0: 42792.4. Samples: 2897753600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 09:01:53,464][15401] Updated weights for policy 0, policy_version 176860 (0.0042) [2024-06-22 09:01:56,968][15401] Updated weights for policy 0, policy_version 176870 (0.0036) [2024-06-22 09:01:57,573][15349] Signal inference workers to stop experience collection... (42650 times) [2024-06-22 09:01:57,574][15349] Signal inference workers to resume experience collection... (42650 times) [2024-06-22 09:01:57,686][15401] InferenceWorker_p0-w0: stopping experience collection (42650 times) [2024-06-22 09:01:57,687][15401] InferenceWorker_p0-w0: resuming experience collection (42650 times) [2024-06-22 09:01:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43419.2, 300 sec: 42876.1). Total num frames: 2897903616. Throughput: 0: 42884.8. Samples: 2898011040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:01:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 09:02:01,032][15401] Updated weights for policy 0, policy_version 176880 (0.0027) [2024-06-22 09:02:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.6). Total num frames: 2898100224. Throughput: 0: 42930.2. Samples: 2898269160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 09:02:04,638][15401] Updated weights for policy 0, policy_version 176890 (0.0037) [2024-06-22 09:02:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42875.1, 300 sec: 42765.0). Total num frames: 2898296832. Throughput: 0: 42596.6. Samples: 2898387620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 09:02:08,814][15401] Updated weights for policy 0, policy_version 176900 (0.0031) [2024-06-22 09:02:12,272][15401] Updated weights for policy 0, policy_version 176910 (0.0031) [2024-06-22 09:02:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 2898558976. Throughput: 0: 42969.3. Samples: 2898655080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:13,393][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 09:02:16,333][15401] Updated weights for policy 0, policy_version 176920 (0.0031) [2024-06-22 09:02:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2898755584. Throughput: 0: 43035.1. Samples: 2898914500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 09:02:19,903][15401] Updated weights for policy 0, policy_version 176930 (0.0030) [2024-06-22 09:02:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2898952192. Throughput: 0: 42760.3. Samples: 2899033720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:23,391][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 09:02:24,095][15401] Updated weights for policy 0, policy_version 176940 (0.0035) [2024-06-22 09:02:27,555][15401] Updated weights for policy 0, policy_version 176950 (0.0030) [2024-06-22 09:02:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 2899181568. Throughput: 0: 42832.9. Samples: 2899297900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 09:02:31,579][15401] Updated weights for policy 0, policy_version 176960 (0.0032) [2024-06-22 09:02:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 2899378176. Throughput: 0: 42907.2. Samples: 2899555620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:02:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 09:02:35,373][15401] Updated weights for policy 0, policy_version 176970 (0.0035) [2024-06-22 09:02:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2899607552. Throughput: 0: 42673.3. Samples: 2899673900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:02:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 09:02:39,225][15401] Updated weights for policy 0, policy_version 176980 (0.0040) [2024-06-22 09:02:43,110][15401] Updated weights for policy 0, policy_version 176990 (0.0030) [2024-06-22 09:02:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 2899804160. Throughput: 0: 42845.9. Samples: 2899939100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:02:43,391][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 09:02:46,847][15401] Updated weights for policy 0, policy_version 177000 (0.0027) [2024-06-22 09:02:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2900017152. Throughput: 0: 42487.1. Samples: 2900181080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:02:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 09:02:50,946][15401] Updated weights for policy 0, policy_version 177010 (0.0026) [2024-06-22 09:02:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2900230144. Throughput: 0: 42793.3. Samples: 2900313320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:02:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 09:02:54,770][15401] Updated weights for policy 0, policy_version 177020 (0.0026) [2024-06-22 09:02:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2900426752. Throughput: 0: 42584.6. Samples: 2900571280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:02:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 09:02:58,676][15401] Updated weights for policy 0, policy_version 177030 (0.0038) [2024-06-22 09:03:02,256][15401] Updated weights for policy 0, policy_version 177040 (0.0032) [2024-06-22 09:03:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2900672512. Throughput: 0: 42346.7. Samples: 2900820100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:03:06,577][15401] Updated weights for policy 0, policy_version 177050 (0.0032) [2024-06-22 09:03:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2900885504. Throughput: 0: 42776.9. Samples: 2900958680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 09:03:09,804][15401] Updated weights for policy 0, policy_version 177060 (0.0024) [2024-06-22 09:03:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41780.9, 300 sec: 42709.5). Total num frames: 2901065728. Throughput: 0: 42532.3. Samples: 2901211860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 09:03:14,110][15401] Updated weights for policy 0, policy_version 177070 (0.0036) [2024-06-22 09:03:17,849][15401] Updated weights for policy 0, policy_version 177080 (0.0034) [2024-06-22 09:03:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2901311488. Throughput: 0: 42435.5. Samples: 2901465220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 09:03:21,588][15401] Updated weights for policy 0, policy_version 177090 (0.0037) [2024-06-22 09:03:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2901524480. Throughput: 0: 42784.9. Samples: 2901599220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 09:03:25,558][15401] Updated weights for policy 0, policy_version 177100 (0.0042) [2024-06-22 09:03:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2901721088. Throughput: 0: 42501.9. Samples: 2901851680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 09:03:29,098][15401] Updated weights for policy 0, policy_version 177110 (0.0042) [2024-06-22 09:03:33,037][15401] Updated weights for policy 0, policy_version 177120 (0.0035) [2024-06-22 09:03:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2901934080. Throughput: 0: 42725.7. Samples: 2902103740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 09:03:36,936][15401] Updated weights for policy 0, policy_version 177130 (0.0025) [2024-06-22 09:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2902147072. Throughput: 0: 42756.5. Samples: 2902237360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:03:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 09:03:40,583][15401] Updated weights for policy 0, policy_version 177140 (0.0038) [2024-06-22 09:03:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2902343680. Throughput: 0: 42626.9. Samples: 2902489500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:03:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 09:03:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177146_2902360064.pth... [2024-06-22 09:03:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176520_2892103680.pth [2024-06-22 09:03:44,469][15401] Updated weights for policy 0, policy_version 177150 (0.0033) [2024-06-22 09:03:48,050][15401] Updated weights for policy 0, policy_version 177160 (0.0033) [2024-06-22 09:03:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2902589440. Throughput: 0: 42754.6. Samples: 2902744060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:03:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:03:49,356][15349] Signal inference workers to stop experience collection... (42700 times) [2024-06-22 09:03:49,399][15401] InferenceWorker_p0-w0: stopping experience collection (42700 times) [2024-06-22 09:03:49,410][15349] Signal inference workers to resume experience collection... (42700 times) [2024-06-22 09:03:49,420][15401] InferenceWorker_p0-w0: resuming experience collection (42700 times) [2024-06-22 09:03:52,118][15401] Updated weights for policy 0, policy_version 177170 (0.0041) [2024-06-22 09:03:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2902802432. Throughput: 0: 42716.4. Samples: 2902880920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:03:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 09:03:56,273][15401] Updated weights for policy 0, policy_version 177180 (0.0027) [2024-06-22 09:03:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2902999040. Throughput: 0: 42597.7. Samples: 2903128760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:03:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 09:03:59,838][15401] Updated weights for policy 0, policy_version 177190 (0.0023) [2024-06-22 09:04:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2903212032. Throughput: 0: 42671.5. Samples: 2903385440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:03,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 09:04:03,834][15401] Updated weights for policy 0, policy_version 177200 (0.0030) [2024-06-22 09:04:07,474][15401] Updated weights for policy 0, policy_version 177210 (0.0026) [2024-06-22 09:04:08,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2903441408. Throughput: 0: 42651.7. Samples: 2903518540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 09:04:11,339][15401] Updated weights for policy 0, policy_version 177220 (0.0030) [2024-06-22 09:04:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2903654400. Throughput: 0: 42698.6. Samples: 2903773120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:13,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 09:04:15,099][15401] Updated weights for policy 0, policy_version 177230 (0.0038) [2024-06-22 09:04:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2903867392. Throughput: 0: 42858.6. Samples: 2904032380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 09:04:19,270][15401] Updated weights for policy 0, policy_version 177240 (0.0026) [2024-06-22 09:04:22,668][15401] Updated weights for policy 0, policy_version 177250 (0.0037) [2024-06-22 09:04:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2904096768. Throughput: 0: 42751.1. Samples: 2904161160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:23,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 09:04:26,702][15401] Updated weights for policy 0, policy_version 177260 (0.0031) [2024-06-22 09:04:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2904293376. Throughput: 0: 42766.3. Samples: 2904413980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 09:04:30,394][15401] Updated weights for policy 0, policy_version 177270 (0.0035) [2024-06-22 09:04:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2904522752. Throughput: 0: 42775.5. Samples: 2904668960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 09:04:34,367][15401] Updated weights for policy 0, policy_version 177280 (0.0043) [2024-06-22 09:04:38,016][15401] Updated weights for policy 0, policy_version 177290 (0.0041) [2024-06-22 09:04:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2904719360. Throughput: 0: 42732.6. Samples: 2904803880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 09:04:41,763][15401] Updated weights for policy 0, policy_version 177300 (0.0038) [2024-06-22 09:04:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2904932352. Throughput: 0: 42844.7. Samples: 2905056760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:04:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 09:04:45,697][15401] Updated weights for policy 0, policy_version 177310 (0.0036) [2024-06-22 09:04:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2905145344. Throughput: 0: 42929.3. Samples: 2905317260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:04:48,391][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 09:04:49,377][15401] Updated weights for policy 0, policy_version 177320 (0.0038) [2024-06-22 09:04:53,146][15401] Updated weights for policy 0, policy_version 177330 (0.0031) [2024-06-22 09:04:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2905374720. Throughput: 0: 42969.2. Samples: 2905452160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:04:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 09:04:56,978][15401] Updated weights for policy 0, policy_version 177340 (0.0044) [2024-06-22 09:04:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2905571328. Throughput: 0: 42928.1. Samples: 2905704880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:04:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 09:05:00,718][15401] Updated weights for policy 0, policy_version 177350 (0.0032) [2024-06-22 09:05:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2905800704. Throughput: 0: 43028.6. Samples: 2905968660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 09:05:04,558][15401] Updated weights for policy 0, policy_version 177360 (0.0034) [2024-06-22 09:05:08,354][15401] Updated weights for policy 0, policy_version 177370 (0.0031) [2024-06-22 09:05:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2906030080. Throughput: 0: 43070.2. Samples: 2906099320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:08,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 09:05:11,912][15349] Signal inference workers to stop experience collection... (42750 times) [2024-06-22 09:05:11,920][15349] Signal inference workers to resume experience collection... (42750 times) [2024-06-22 09:05:11,949][15401] InferenceWorker_p0-w0: stopping experience collection (42750 times) [2024-06-22 09:05:11,949][15401] InferenceWorker_p0-w0: resuming experience collection (42750 times) [2024-06-22 09:05:12,071][15401] Updated weights for policy 0, policy_version 177380 (0.0024) [2024-06-22 09:05:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 2906243072. Throughput: 0: 43106.7. Samples: 2906353780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 09:05:15,765][15401] Updated weights for policy 0, policy_version 177390 (0.0035) [2024-06-22 09:05:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2906456064. Throughput: 0: 43239.7. Samples: 2906614740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 09:05:19,635][15401] Updated weights for policy 0, policy_version 177400 (0.0034) [2024-06-22 09:05:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2906669056. Throughput: 0: 43132.9. Samples: 2906744860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 09:05:23,557][15401] Updated weights for policy 0, policy_version 177410 (0.0034) [2024-06-22 09:05:27,329][15401] Updated weights for policy 0, policy_version 177420 (0.0037) [2024-06-22 09:05:28,396][15132] Fps is (10 sec: 42571.1, 60 sec: 43140.0, 300 sec: 42875.4). Total num frames: 2906882048. Throughput: 0: 43301.3. Samples: 2907005600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:28,397][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 09:05:31,286][15401] Updated weights for policy 0, policy_version 177430 (0.0032) [2024-06-22 09:05:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2907111424. Throughput: 0: 43124.8. Samples: 2907257880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 09:05:34,875][15401] Updated weights for policy 0, policy_version 177440 (0.0030) [2024-06-22 09:05:38,390][15132] Fps is (10 sec: 42625.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2907308032. Throughput: 0: 43047.2. Samples: 2907389280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:38,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 09:05:39,084][15401] Updated weights for policy 0, policy_version 177450 (0.0029) [2024-06-22 09:05:42,813][15401] Updated weights for policy 0, policy_version 177460 (0.0043) [2024-06-22 09:05:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2907521024. Throughput: 0: 42951.5. Samples: 2907637700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 09:05:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177462_2907537408.pth... [2024-06-22 09:05:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000176835_2897264640.pth [2024-06-22 09:05:46,859][15401] Updated weights for policy 0, policy_version 177470 (0.0030) [2024-06-22 09:05:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 2907717632. Throughput: 0: 42829.6. Samples: 2907896000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:48,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 09:05:50,451][15401] Updated weights for policy 0, policy_version 177480 (0.0028) [2024-06-22 09:05:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 2907930624. Throughput: 0: 42736.9. Samples: 2908022480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:05:53,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 09:05:54,274][15401] Updated weights for policy 0, policy_version 177490 (0.0030) [2024-06-22 09:05:58,025][15401] Updated weights for policy 0, policy_version 177500 (0.0029) [2024-06-22 09:05:58,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 2908176384. Throughput: 0: 42710.6. Samples: 2908275760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:05:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 09:06:02,345][15401] Updated weights for policy 0, policy_version 177510 (0.0035) [2024-06-22 09:06:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42821.3). Total num frames: 2908356608. Throughput: 0: 42692.9. Samples: 2908535920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:03,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 09:06:05,654][15401] Updated weights for policy 0, policy_version 177520 (0.0029) [2024-06-22 09:06:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2908569600. Throughput: 0: 42469.3. Samples: 2908655980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:08,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 09:06:10,011][15401] Updated weights for policy 0, policy_version 177530 (0.0028) [2024-06-22 09:06:13,216][15401] Updated weights for policy 0, policy_version 177540 (0.0031) [2024-06-22 09:06:13,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2908815360. Throughput: 0: 42481.6. Samples: 2908917100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:13,393][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 09:06:17,618][15401] Updated weights for policy 0, policy_version 177550 (0.0028) [2024-06-22 09:06:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2908995584. Throughput: 0: 42627.8. Samples: 2909176120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 09:06:20,882][15401] Updated weights for policy 0, policy_version 177560 (0.0028) [2024-06-22 09:06:22,334][15349] Signal inference workers to stop experience collection... (42800 times) [2024-06-22 09:06:22,365][15401] InferenceWorker_p0-w0: stopping experience collection (42800 times) [2024-06-22 09:06:22,396][15349] Signal inference workers to resume experience collection... (42800 times) [2024-06-22 09:06:22,400][15401] InferenceWorker_p0-w0: resuming experience collection (42800 times) [2024-06-22 09:06:23,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.6, 300 sec: 42765.0). Total num frames: 2909224960. Throughput: 0: 42469.7. Samples: 2909300520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:23,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 09:06:25,340][15401] Updated weights for policy 0, policy_version 177570 (0.0035) [2024-06-22 09:06:28,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 2909454336. Throughput: 0: 42709.1. Samples: 2909559620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 09:06:28,741][15401] Updated weights for policy 0, policy_version 177580 (0.0027) [2024-06-22 09:06:32,861][15401] Updated weights for policy 0, policy_version 177590 (0.0038) [2024-06-22 09:06:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2909650944. Throughput: 0: 42777.9. Samples: 2909821000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 09:06:36,360][15401] Updated weights for policy 0, policy_version 177600 (0.0037) [2024-06-22 09:06:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2909880320. Throughput: 0: 42783.0. Samples: 2909947720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 09:06:40,495][15401] Updated weights for policy 0, policy_version 177610 (0.0035) [2024-06-22 09:06:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2910076928. Throughput: 0: 42989.9. Samples: 2910210300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 09:06:44,013][15401] Updated weights for policy 0, policy_version 177620 (0.0040) [2024-06-22 09:06:48,206][15401] Updated weights for policy 0, policy_version 177630 (0.0049) [2024-06-22 09:06:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2910289920. Throughput: 0: 42779.0. Samples: 2910460980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:48,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 09:06:51,499][15401] Updated weights for policy 0, policy_version 177640 (0.0036) [2024-06-22 09:06:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2910502912. Throughput: 0: 42917.8. Samples: 2910587280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:53,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 09:06:55,957][15401] Updated weights for policy 0, policy_version 177650 (0.0035) [2024-06-22 09:06:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2910715904. Throughput: 0: 42913.4. Samples: 2910848100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 09:06:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 09:06:59,052][15401] Updated weights for policy 0, policy_version 177660 (0.0025) [2024-06-22 09:07:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2910928896. Throughput: 0: 42734.0. Samples: 2911099160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 09:07:03,533][15401] Updated weights for policy 0, policy_version 177670 (0.0039) [2024-06-22 09:07:06,544][15401] Updated weights for policy 0, policy_version 177680 (0.0036) [2024-06-22 09:07:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 2911158272. Throughput: 0: 42757.4. Samples: 2911224500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 09:07:10,970][15401] Updated weights for policy 0, policy_version 177690 (0.0029) [2024-06-22 09:07:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2911371264. Throughput: 0: 42864.1. Samples: 2911488500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 09:07:14,295][15401] Updated weights for policy 0, policy_version 177700 (0.0039) [2024-06-22 09:07:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2911567872. Throughput: 0: 42756.6. Samples: 2911745040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 09:07:18,639][15401] Updated weights for policy 0, policy_version 177710 (0.0032) [2024-06-22 09:07:22,529][15401] Updated weights for policy 0, policy_version 177720 (0.0034) [2024-06-22 09:07:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 2911780864. Throughput: 0: 42629.1. Samples: 2911866020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:23,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 09:07:26,325][15401] Updated weights for policy 0, policy_version 177730 (0.0030) [2024-06-22 09:07:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2911993856. Throughput: 0: 42452.0. Samples: 2912120640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 09:07:30,245][15401] Updated weights for policy 0, policy_version 177740 (0.0031) [2024-06-22 09:07:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2912206848. Throughput: 0: 42585.8. Samples: 2912377340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 09:07:33,915][15401] Updated weights for policy 0, policy_version 177750 (0.0035) [2024-06-22 09:07:37,888][15401] Updated weights for policy 0, policy_version 177760 (0.0036) [2024-06-22 09:07:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 2912436224. Throughput: 0: 42678.3. Samples: 2912507800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 09:07:41,529][15401] Updated weights for policy 0, policy_version 177770 (0.0032) [2024-06-22 09:07:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2912632832. Throughput: 0: 42463.2. Samples: 2912758940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 09:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177773_2912632832.pth... [2024-06-22 09:07:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177146_2902360064.pth [2024-06-22 09:07:45,510][15401] Updated weights for policy 0, policy_version 177780 (0.0039) [2024-06-22 09:07:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2912845824. Throughput: 0: 42749.9. Samples: 2913022900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 09:07:49,111][15401] Updated weights for policy 0, policy_version 177790 (0.0035) [2024-06-22 09:07:53,130][15401] Updated weights for policy 0, policy_version 177800 (0.0033) [2024-06-22 09:07:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 2913091584. Throughput: 0: 42801.3. Samples: 2913150560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 09:07:56,795][15401] Updated weights for policy 0, policy_version 177810 (0.0033) [2024-06-22 09:07:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2913288192. Throughput: 0: 42610.6. Samples: 2913405980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:07:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 09:08:00,557][15349] Signal inference workers to stop experience collection... (42850 times) [2024-06-22 09:08:00,557][15349] Signal inference workers to resume experience collection... (42850 times) [2024-06-22 09:08:00,575][15401] InferenceWorker_p0-w0: stopping experience collection (42850 times) [2024-06-22 09:08:00,612][15401] InferenceWorker_p0-w0: resuming experience collection (42850 times) [2024-06-22 09:08:00,702][15401] Updated weights for policy 0, policy_version 177820 (0.0043) [2024-06-22 09:08:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2913484800. Throughput: 0: 42662.5. Samples: 2913664860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 09:08:03,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 09:08:04,444][15401] Updated weights for policy 0, policy_version 177830 (0.0036) [2024-06-22 09:08:08,246][15401] Updated weights for policy 0, policy_version 177840 (0.0046) [2024-06-22 09:08:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2913730560. Throughput: 0: 42870.9. Samples: 2913795220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 09:08:12,012][15401] Updated weights for policy 0, policy_version 177850 (0.0045) [2024-06-22 09:08:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2913927168. Throughput: 0: 42834.3. Samples: 2914048180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 09:08:16,063][15401] Updated weights for policy 0, policy_version 177860 (0.0037) [2024-06-22 09:08:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2914140160. Throughput: 0: 42827.2. Samples: 2914304560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 09:08:19,656][15401] Updated weights for policy 0, policy_version 177870 (0.0044) [2024-06-22 09:08:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2914353152. Throughput: 0: 42758.1. Samples: 2914431920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 09:08:24,487][15401] Updated weights for policy 0, policy_version 177880 (0.0043) [2024-06-22 09:08:27,369][15401] Updated weights for policy 0, policy_version 177890 (0.0028) [2024-06-22 09:08:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2914566144. Throughput: 0: 42838.1. Samples: 2914686660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 09:08:31,954][15401] Updated weights for policy 0, policy_version 177900 (0.0027) [2024-06-22 09:08:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2914779136. Throughput: 0: 42686.6. Samples: 2914943800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 09:08:34,959][15401] Updated weights for policy 0, policy_version 177910 (0.0031) [2024-06-22 09:08:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2914992128. Throughput: 0: 42710.0. Samples: 2915072500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 09:08:39,535][15401] Updated weights for policy 0, policy_version 177920 (0.0032) [2024-06-22 09:08:42,600][15401] Updated weights for policy 0, policy_version 177930 (0.0029) [2024-06-22 09:08:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2915221504. Throughput: 0: 42807.2. Samples: 2915332300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:43,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 09:08:47,004][15401] Updated weights for policy 0, policy_version 177940 (0.0040) [2024-06-22 09:08:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2915418112. Throughput: 0: 42638.3. Samples: 2915583580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 09:08:50,224][15401] Updated weights for policy 0, policy_version 177950 (0.0040) [2024-06-22 09:08:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 2915631104. Throughput: 0: 42677.8. Samples: 2915715820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:53,393][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 09:08:54,471][15401] Updated weights for policy 0, policy_version 177960 (0.0045) [2024-06-22 09:08:58,101][15401] Updated weights for policy 0, policy_version 177970 (0.0042) [2024-06-22 09:08:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2915860480. Throughput: 0: 42690.2. Samples: 2915969240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:08:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 09:09:01,946][15401] Updated weights for policy 0, policy_version 177980 (0.0031) [2024-06-22 09:09:03,391][15132] Fps is (10 sec: 42604.2, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 2916057088. Throughput: 0: 42850.5. Samples: 2916232880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:09:03,391][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 09:09:05,576][15349] Signal inference workers to stop experience collection... (42900 times) [2024-06-22 09:09:05,577][15349] Signal inference workers to resume experience collection... (42900 times) [2024-06-22 09:09:05,617][15401] InferenceWorker_p0-w0: stopping experience collection (42900 times) [2024-06-22 09:09:05,617][15401] InferenceWorker_p0-w0: resuming experience collection (42900 times) [2024-06-22 09:09:05,713][15401] Updated weights for policy 0, policy_version 177990 (0.0039) [2024-06-22 09:09:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2916270080. Throughput: 0: 42734.3. Samples: 2916354960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 09:09:08,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-22 09:09:09,492][15401] Updated weights for policy 0, policy_version 178000 (0.0041) [2024-06-22 09:09:13,389][15132] Fps is (10 sec: 44241.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2916499456. Throughput: 0: 42771.7. Samples: 2916611380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 09:09:13,463][15401] Updated weights for policy 0, policy_version 178010 (0.0040) [2024-06-22 09:09:17,562][15401] Updated weights for policy 0, policy_version 178020 (0.0042) [2024-06-22 09:09:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2916679680. Throughput: 0: 42748.9. Samples: 2916867500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:09:21,293][15401] Updated weights for policy 0, policy_version 178030 (0.0056) [2024-06-22 09:09:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 2916892672. Throughput: 0: 42637.7. Samples: 2916991300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:23,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 09:09:25,296][15401] Updated weights for policy 0, policy_version 178040 (0.0036) [2024-06-22 09:09:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2917122048. Throughput: 0: 42532.9. Samples: 2917246280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:09:29,090][15401] Updated weights for policy 0, policy_version 178050 (0.0034) [2024-06-22 09:09:32,883][15401] Updated weights for policy 0, policy_version 178060 (0.0033) [2024-06-22 09:09:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2917335040. Throughput: 0: 42709.3. Samples: 2917505500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 09:09:37,033][15401] Updated weights for policy 0, policy_version 178070 (0.0029) [2024-06-22 09:09:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2917548032. Throughput: 0: 42759.1. Samples: 2917639880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 09:09:40,744][15401] Updated weights for policy 0, policy_version 178080 (0.0040) [2024-06-22 09:09:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2917761024. Throughput: 0: 42617.7. Samples: 2917887140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:43,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 09:09:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178087_2917777408.pth... [2024-06-22 09:09:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177462_2907537408.pth [2024-06-22 09:09:44,626][15401] Updated weights for policy 0, policy_version 178090 (0.0040) [2024-06-22 09:09:48,362][15401] Updated weights for policy 0, policy_version 178100 (0.0029) [2024-06-22 09:09:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2917990400. Throughput: 0: 42688.1. Samples: 2918153900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:48,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 09:09:52,264][15401] Updated weights for policy 0, policy_version 178110 (0.0026) [2024-06-22 09:09:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 2918187008. Throughput: 0: 42708.3. Samples: 2918276840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 09:09:55,846][15401] Updated weights for policy 0, policy_version 178120 (0.0027) [2024-06-22 09:09:58,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2918432768. Throughput: 0: 42755.1. Samples: 2918535360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:09:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 09:09:59,917][15401] Updated weights for policy 0, policy_version 178130 (0.0050) [2024-06-22 09:10:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42872.3, 300 sec: 42709.5). Total num frames: 2918629376. Throughput: 0: 42763.7. Samples: 2918791860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:10:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 09:10:03,783][15401] Updated weights for policy 0, policy_version 178140 (0.0034) [2024-06-22 09:10:07,654][15401] Updated weights for policy 0, policy_version 178150 (0.0037) [2024-06-22 09:10:08,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2918825984. Throughput: 0: 42867.1. Samples: 2918920220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:10:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 09:10:11,279][15401] Updated weights for policy 0, policy_version 178160 (0.0032) [2024-06-22 09:10:13,395][15132] Fps is (10 sec: 44213.8, 60 sec: 42867.7, 300 sec: 42764.3). Total num frames: 2919071744. Throughput: 0: 42951.9. Samples: 2919179340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 09:10:13,395][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 09:10:15,479][15401] Updated weights for policy 0, policy_version 178170 (0.0031) [2024-06-22 09:10:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2919268352. Throughput: 0: 42798.3. Samples: 2919431420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:18,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 09:10:18,902][15401] Updated weights for policy 0, policy_version 178180 (0.0037) [2024-06-22 09:10:22,924][15401] Updated weights for policy 0, policy_version 178190 (0.0025) [2024-06-22 09:10:23,396][15132] Fps is (10 sec: 40954.9, 60 sec: 43141.6, 300 sec: 42709.5). Total num frames: 2919481344. Throughput: 0: 42733.1. Samples: 2919563140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:23,396][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 09:10:26,681][15401] Updated weights for policy 0, policy_version 178200 (0.0030) [2024-06-22 09:10:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2919710720. Throughput: 0: 43027.3. Samples: 2919823260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 09:10:30,407][15401] Updated weights for policy 0, policy_version 178210 (0.0023) [2024-06-22 09:10:31,661][15349] Signal inference workers to stop experience collection... (42950 times) [2024-06-22 09:10:31,664][15349] Signal inference workers to resume experience collection... (42950 times) [2024-06-22 09:10:31,678][15401] InferenceWorker_p0-w0: stopping experience collection (42950 times) [2024-06-22 09:10:31,679][15401] InferenceWorker_p0-w0: resuming experience collection (42950 times) [2024-06-22 09:10:33,394][15132] Fps is (10 sec: 44243.3, 60 sec: 43141.0, 300 sec: 42764.3). Total num frames: 2919923712. Throughput: 0: 42766.0. Samples: 2920078480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:33,395][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 09:10:34,319][15401] Updated weights for policy 0, policy_version 178220 (0.0033) [2024-06-22 09:10:38,316][15401] Updated weights for policy 0, policy_version 178230 (0.0029) [2024-06-22 09:10:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2920120320. Throughput: 0: 42799.1. Samples: 2920202800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 09:10:41,899][15401] Updated weights for policy 0, policy_version 178240 (0.0047) [2024-06-22 09:10:43,390][15132] Fps is (10 sec: 44258.5, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 2920366080. Throughput: 0: 42860.7. Samples: 2920464100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 09:10:45,946][15401] Updated weights for policy 0, policy_version 178250 (0.0035) [2024-06-22 09:10:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 2920562688. Throughput: 0: 42886.7. Samples: 2920721760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 09:10:49,480][15401] Updated weights for policy 0, policy_version 178260 (0.0027) [2024-06-22 09:10:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2920759296. Throughput: 0: 42785.4. Samples: 2920845560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:10:53,618][15401] Updated weights for policy 0, policy_version 178270 (0.0046) [2024-06-22 09:10:57,078][15401] Updated weights for policy 0, policy_version 178280 (0.0030) [2024-06-22 09:10:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2920988672. Throughput: 0: 42791.9. Samples: 2921104760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:10:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 09:11:01,088][15401] Updated weights for policy 0, policy_version 178290 (0.0045) [2024-06-22 09:11:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2921201664. Throughput: 0: 42788.0. Samples: 2921356880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:11:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 09:11:04,595][15401] Updated weights for policy 0, policy_version 178300 (0.0022) [2024-06-22 09:11:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 2921398272. Throughput: 0: 42672.0. Samples: 2921483100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:11:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 09:11:08,730][15401] Updated weights for policy 0, policy_version 178310 (0.0034) [2024-06-22 09:11:12,033][15401] Updated weights for policy 0, policy_version 178320 (0.0041) [2024-06-22 09:11:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42602.1, 300 sec: 42820.6). Total num frames: 2921627648. Throughput: 0: 42634.2. Samples: 2921741800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:11:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 09:11:16,210][15401] Updated weights for policy 0, policy_version 178330 (0.0036) [2024-06-22 09:11:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 2921840640. Throughput: 0: 42759.3. Samples: 2922002440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-22 09:11:18,395][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 09:11:20,135][15401] Updated weights for policy 0, policy_version 178340 (0.0034) [2024-06-22 09:11:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 2922070016. Throughput: 0: 42785.8. Samples: 2922128160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 09:11:23,806][15401] Updated weights for policy 0, policy_version 178350 (0.0032) [2024-06-22 09:11:27,983][15401] Updated weights for policy 0, policy_version 178360 (0.0037) [2024-06-22 09:11:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2922266624. Throughput: 0: 42808.9. Samples: 2922390500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 09:11:31,314][15401] Updated weights for policy 0, policy_version 178370 (0.0027) [2024-06-22 09:11:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42875.0, 300 sec: 42765.0). Total num frames: 2922496000. Throughput: 0: 42748.8. Samples: 2922645460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 09:11:35,467][15401] Updated weights for policy 0, policy_version 178380 (0.0040) [2024-06-22 09:11:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2922708992. Throughput: 0: 42840.4. Samples: 2922773380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 09:11:38,860][15401] Updated weights for policy 0, policy_version 178390 (0.0031) [2024-06-22 09:11:43,127][15401] Updated weights for policy 0, policy_version 178400 (0.0039) [2024-06-22 09:11:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2922921984. Throughput: 0: 42977.0. Samples: 2923038720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 09:11:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178402_2922938368.pth... [2024-06-22 09:11:43,527][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000177773_2912632832.pth [2024-06-22 09:11:46,560][15401] Updated weights for policy 0, policy_version 178410 (0.0026) [2024-06-22 09:11:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2923118592. Throughput: 0: 43030.3. Samples: 2923293240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:48,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 09:11:50,472][15349] Signal inference workers to stop experience collection... (43000 times) [2024-06-22 09:11:50,472][15349] Signal inference workers to resume experience collection... (43000 times) [2024-06-22 09:11:50,491][15401] InferenceWorker_p0-w0: stopping experience collection (43000 times) [2024-06-22 09:11:50,491][15401] InferenceWorker_p0-w0: resuming experience collection (43000 times) [2024-06-22 09:11:50,779][15401] Updated weights for policy 0, policy_version 178420 (0.0027) [2024-06-22 09:11:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2923331584. Throughput: 0: 42975.1. Samples: 2923416980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 09:11:54,097][15401] Updated weights for policy 0, policy_version 178430 (0.0042) [2024-06-22 09:11:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2923544576. Throughput: 0: 42942.2. Samples: 2923674200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:11:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 09:11:58,641][15401] Updated weights for policy 0, policy_version 178440 (0.0038) [2024-06-22 09:12:01,594][15401] Updated weights for policy 0, policy_version 178450 (0.0038) [2024-06-22 09:12:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2923757568. Throughput: 0: 42968.0. Samples: 2923936000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:12:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 09:12:06,409][15401] Updated weights for policy 0, policy_version 178460 (0.0045) [2024-06-22 09:12:08,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 2924003328. Throughput: 0: 42980.8. Samples: 2924062400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:12:08,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 09:12:09,463][15401] Updated weights for policy 0, policy_version 178470 (0.0042) [2024-06-22 09:12:13,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 2924183552. Throughput: 0: 42842.2. Samples: 2924318500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:12:13,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 09:12:13,926][15401] Updated weights for policy 0, policy_version 178480 (0.0037) [2024-06-22 09:12:17,071][15401] Updated weights for policy 0, policy_version 178490 (0.0022) [2024-06-22 09:12:18,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 2924396544. Throughput: 0: 42880.4. Samples: 2924575180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:12:18,393][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 09:12:21,695][15401] Updated weights for policy 0, policy_version 178500 (0.0034) [2024-06-22 09:12:23,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2924642304. Throughput: 0: 42968.9. Samples: 2924706980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:12:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 09:12:24,907][15401] Updated weights for policy 0, policy_version 178510 (0.0027) [2024-06-22 09:12:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2924822528. Throughput: 0: 42599.9. Samples: 2924955720. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 09:12:29,415][15401] Updated weights for policy 0, policy_version 178520 (0.0047) [2024-06-22 09:12:32,397][15401] Updated weights for policy 0, policy_version 178530 (0.0030) [2024-06-22 09:12:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2925051904. Throughput: 0: 42642.6. Samples: 2925212160. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:33,396][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 09:12:37,016][15401] Updated weights for policy 0, policy_version 178540 (0.0027) [2024-06-22 09:12:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2925281280. Throughput: 0: 42834.5. Samples: 2925344540. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:38,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 09:12:40,158][15401] Updated weights for policy 0, policy_version 178550 (0.0030) [2024-06-22 09:12:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2925477888. Throughput: 0: 42916.9. Samples: 2925605460. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:43,390][15132] Avg episode reward: [(0, '0.189')] [2024-06-22 09:12:44,496][15401] Updated weights for policy 0, policy_version 178560 (0.0035) [2024-06-22 09:12:47,623][15401] Updated weights for policy 0, policy_version 178570 (0.0030) [2024-06-22 09:12:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2925690880. Throughput: 0: 42644.2. Samples: 2925854980. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:48,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 09:12:52,471][15401] Updated weights for policy 0, policy_version 178580 (0.0027) [2024-06-22 09:12:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2925903872. Throughput: 0: 42761.3. Samples: 2925986560. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 09:12:55,114][15401] Updated weights for policy 0, policy_version 178590 (0.0032) [2024-06-22 09:12:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2926116864. Throughput: 0: 42700.9. Samples: 2926239940. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:12:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 09:13:00,267][15401] Updated weights for policy 0, policy_version 178600 (0.0031) [2024-06-22 09:13:02,814][15401] Updated weights for policy 0, policy_version 178610 (0.0040) [2024-06-22 09:13:03,390][15132] Fps is (10 sec: 44234.4, 60 sec: 43144.2, 300 sec: 42764.9). Total num frames: 2926346240. Throughput: 0: 42505.3. Samples: 2926487840. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:03,391][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 09:13:07,947][15401] Updated weights for policy 0, policy_version 178620 (0.0030) [2024-06-22 09:13:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 2926542848. Throughput: 0: 42700.9. Samples: 2926628520. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 09:13:10,558][15401] Updated weights for policy 0, policy_version 178630 (0.0030) [2024-06-22 09:13:13,390][15132] Fps is (10 sec: 39323.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 2926739456. Throughput: 0: 42746.2. Samples: 2926879300. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 09:13:13,582][15349] Signal inference workers to stop experience collection... (43050 times) [2024-06-22 09:13:13,582][15349] Signal inference workers to resume experience collection... (43050 times) [2024-06-22 09:13:13,596][15401] InferenceWorker_p0-w0: stopping experience collection (43050 times) [2024-06-22 09:13:13,596][15401] InferenceWorker_p0-w0: resuming experience collection (43050 times) [2024-06-22 09:13:15,319][15401] Updated weights for policy 0, policy_version 178640 (0.0037) [2024-06-22 09:13:18,250][15401] Updated weights for policy 0, policy_version 178650 (0.0029) [2024-06-22 09:13:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 2927001600. Throughput: 0: 42601.0. Samples: 2927129200. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 09:13:23,020][15401] Updated weights for policy 0, policy_version 178660 (0.0041) [2024-06-22 09:13:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2927165440. Throughput: 0: 42667.6. Samples: 2927264580. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 09:13:25,872][15401] Updated weights for policy 0, policy_version 178670 (0.0036) [2024-06-22 09:13:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2927378432. Throughput: 0: 42440.5. Samples: 2927515280. Policy #0 lag: (min: 3.0, avg: 11.8, max: 23.0) [2024-06-22 09:13:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 09:13:30,705][15401] Updated weights for policy 0, policy_version 178680 (0.0032) [2024-06-22 09:13:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2927624192. Throughput: 0: 42621.2. Samples: 2927772940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:33,399][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 09:13:33,749][15401] Updated weights for policy 0, policy_version 178690 (0.0036) [2024-06-22 09:13:38,181][15401] Updated weights for policy 0, policy_version 178700 (0.0026) [2024-06-22 09:13:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2927820800. Throughput: 0: 42745.0. Samples: 2927910080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 09:13:41,497][15401] Updated weights for policy 0, policy_version 178710 (0.0033) [2024-06-22 09:13:43,391][15132] Fps is (10 sec: 40953.4, 60 sec: 42597.2, 300 sec: 42764.8). Total num frames: 2928033792. Throughput: 0: 42702.9. Samples: 2928161640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:43,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 09:13:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178714_2928050176.pth... [2024-06-22 09:13:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178087_2917777408.pth [2024-06-22 09:13:46,174][15401] Updated weights for policy 0, policy_version 178720 (0.0036) [2024-06-22 09:13:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2928246784. Throughput: 0: 42946.4. Samples: 2928420400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 09:13:49,189][15401] Updated weights for policy 0, policy_version 178730 (0.0041) [2024-06-22 09:13:53,389][15132] Fps is (10 sec: 40967.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2928443392. Throughput: 0: 42704.9. Samples: 2928550240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 09:13:53,638][15401] Updated weights for policy 0, policy_version 178740 (0.0026) [2024-06-22 09:13:56,801][15401] Updated weights for policy 0, policy_version 178750 (0.0037) [2024-06-22 09:13:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 2928689152. Throughput: 0: 42708.6. Samples: 2928801180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:13:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 09:14:01,368][15401] Updated weights for policy 0, policy_version 178760 (0.0037) [2024-06-22 09:14:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.9, 300 sec: 42820.6). Total num frames: 2928902144. Throughput: 0: 43065.8. Samples: 2929067160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:03,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 09:14:04,577][15401] Updated weights for policy 0, policy_version 178770 (0.0031) [2024-06-22 09:14:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2929098752. Throughput: 0: 42821.6. Samples: 2929191560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 09:14:08,806][15401] Updated weights for policy 0, policy_version 178780 (0.0036) [2024-06-22 09:14:12,129][15401] Updated weights for policy 0, policy_version 178790 (0.0048) [2024-06-22 09:14:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 2929344512. Throughput: 0: 42981.7. Samples: 2929449460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 09:14:16,204][15401] Updated weights for policy 0, policy_version 178800 (0.0031) [2024-06-22 09:14:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 2929541120. Throughput: 0: 43121.9. Samples: 2929713420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:18,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-22 09:14:19,666][15401] Updated weights for policy 0, policy_version 178810 (0.0036) [2024-06-22 09:14:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2929754112. Throughput: 0: 42865.4. Samples: 2929839020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 09:14:23,578][15401] Updated weights for policy 0, policy_version 178820 (0.0023) [2024-06-22 09:14:27,243][15401] Updated weights for policy 0, policy_version 178830 (0.0027) [2024-06-22 09:14:28,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43688.9, 300 sec: 42931.3). Total num frames: 2929999872. Throughput: 0: 43112.2. Samples: 2930101720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:28,392][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 09:14:30,332][15349] Signal inference workers to stop experience collection... (43100 times) [2024-06-22 09:14:30,332][15349] Signal inference workers to resume experience collection... (43100 times) [2024-06-22 09:14:30,341][15401] InferenceWorker_p0-w0: stopping experience collection (43100 times) [2024-06-22 09:14:30,341][15401] InferenceWorker_p0-w0: resuming experience collection (43100 times) [2024-06-22 09:14:31,103][15401] Updated weights for policy 0, policy_version 178840 (0.0031) [2024-06-22 09:14:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2930180096. Throughput: 0: 43168.8. Samples: 2930363000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:14:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 09:14:35,230][15401] Updated weights for policy 0, policy_version 178850 (0.0036) [2024-06-22 09:14:38,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 2930409472. Throughput: 0: 43006.1. Samples: 2930485620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:14:38,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 09:14:38,543][15401] Updated weights for policy 0, policy_version 178860 (0.0033) [2024-06-22 09:14:42,668][15401] Updated weights for policy 0, policy_version 178870 (0.0026) [2024-06-22 09:14:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43418.8, 300 sec: 42876.4). Total num frames: 2930638848. Throughput: 0: 43275.5. Samples: 2930748580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:14:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 09:14:46,469][15401] Updated weights for policy 0, policy_version 178880 (0.0034) [2024-06-22 09:14:48,389][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2930835456. Throughput: 0: 43074.2. Samples: 2931005500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:14:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 09:14:50,153][15401] Updated weights for policy 0, policy_version 178890 (0.0036) [2024-06-22 09:14:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2931048448. Throughput: 0: 43016.1. Samples: 2931127280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:14:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 09:14:54,177][15401] Updated weights for policy 0, policy_version 178900 (0.0031) [2024-06-22 09:14:57,910][15401] Updated weights for policy 0, policy_version 178910 (0.0036) [2024-06-22 09:14:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2931294208. Throughput: 0: 43147.5. Samples: 2931391100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:14:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 09:15:01,695][15401] Updated weights for policy 0, policy_version 178920 (0.0043) [2024-06-22 09:15:03,395][15132] Fps is (10 sec: 42575.3, 60 sec: 42867.5, 300 sec: 42875.3). Total num frames: 2931474432. Throughput: 0: 42946.3. Samples: 2931646240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:03,395][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 09:15:05,506][15401] Updated weights for policy 0, policy_version 178930 (0.0044) [2024-06-22 09:15:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43416.0, 300 sec: 42821.0). Total num frames: 2931703808. Throughput: 0: 42959.4. Samples: 2931772300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:08,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 09:15:09,177][15401] Updated weights for policy 0, policy_version 178940 (0.0035) [2024-06-22 09:15:13,390][15132] Fps is (10 sec: 42621.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2931900416. Throughput: 0: 42926.6. Samples: 2932033320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 09:15:13,427][15401] Updated weights for policy 0, policy_version 178950 (0.0028) [2024-06-22 09:15:17,023][15401] Updated weights for policy 0, policy_version 178960 (0.0034) [2024-06-22 09:15:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 42877.0). Total num frames: 2932129792. Throughput: 0: 42765.8. Samples: 2932287460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 09:15:20,982][15401] Updated weights for policy 0, policy_version 178970 (0.0031) [2024-06-22 09:15:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2932326400. Throughput: 0: 42891.5. Samples: 2932415640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 09:15:24,822][15401] Updated weights for policy 0, policy_version 178980 (0.0035) [2024-06-22 09:15:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42821.3). Total num frames: 2932555776. Throughput: 0: 42787.1. Samples: 2932674000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 09:15:28,842][15401] Updated weights for policy 0, policy_version 178990 (0.0033) [2024-06-22 09:15:32,406][15401] Updated weights for policy 0, policy_version 179000 (0.0038) [2024-06-22 09:15:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2932768768. Throughput: 0: 42707.5. Samples: 2932927340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:33,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 09:15:36,258][15401] Updated weights for policy 0, policy_version 179010 (0.0039) [2024-06-22 09:15:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 2932981760. Throughput: 0: 42881.0. Samples: 2933056920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 09:15:40,064][15401] Updated weights for policy 0, policy_version 179020 (0.0034) [2024-06-22 09:15:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2933194752. Throughput: 0: 42650.1. Samples: 2933310360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 09:15:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 09:15:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179028_2933194752.pth... [2024-06-22 09:15:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178402_2922938368.pth [2024-06-22 09:15:43,984][15401] Updated weights for policy 0, policy_version 179030 (0.0042) [2024-06-22 09:15:47,820][15401] Updated weights for policy 0, policy_version 179040 (0.0030) [2024-06-22 09:15:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2933407744. Throughput: 0: 42734.2. Samples: 2933569040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:15:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 09:15:51,738][15401] Updated weights for policy 0, policy_version 179050 (0.0038) [2024-06-22 09:15:53,396][15132] Fps is (10 sec: 42571.9, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 2933620736. Throughput: 0: 42758.4. Samples: 2933696600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:15:53,396][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 09:15:55,494][15401] Updated weights for policy 0, policy_version 179060 (0.0042) [2024-06-22 09:15:58,348][15349] Signal inference workers to stop experience collection... (43150 times) [2024-06-22 09:15:58,348][15349] Signal inference workers to resume experience collection... (43150 times) [2024-06-22 09:15:58,370][15401] InferenceWorker_p0-w0: stopping experience collection (43150 times) [2024-06-22 09:15:58,370][15401] InferenceWorker_p0-w0: resuming experience collection (43150 times) [2024-06-22 09:15:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2933833728. Throughput: 0: 42780.1. Samples: 2933958420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:15:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 09:15:59,284][15401] Updated weights for policy 0, policy_version 179070 (0.0032) [2024-06-22 09:16:03,194][15401] Updated weights for policy 0, policy_version 179080 (0.0041) [2024-06-22 09:16:03,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42875.4, 300 sec: 42876.1). Total num frames: 2934046720. Throughput: 0: 42707.2. Samples: 2934209280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 09:16:07,379][15401] Updated weights for policy 0, policy_version 179090 (0.0031) [2024-06-22 09:16:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 2934259712. Throughput: 0: 42621.5. Samples: 2934333600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 09:16:10,943][15401] Updated weights for policy 0, policy_version 179100 (0.0028) [2024-06-22 09:16:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2934456320. Throughput: 0: 42456.9. Samples: 2934584560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 09:16:14,922][15401] Updated weights for policy 0, policy_version 179110 (0.0025) [2024-06-22 09:16:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2934669312. Throughput: 0: 42529.8. Samples: 2934841180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 09:16:19,204][15401] Updated weights for policy 0, policy_version 179120 (0.0033) [2024-06-22 09:16:22,624][15401] Updated weights for policy 0, policy_version 179130 (0.0037) [2024-06-22 09:16:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2934882304. Throughput: 0: 42525.7. Samples: 2934970580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 09:16:26,757][15401] Updated weights for policy 0, policy_version 179140 (0.0040) [2024-06-22 09:16:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2935095296. Throughput: 0: 42540.6. Samples: 2935224680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 09:16:30,241][15401] Updated weights for policy 0, policy_version 179150 (0.0030) [2024-06-22 09:16:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2935324672. Throughput: 0: 42464.2. Samples: 2935479940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 09:16:34,375][15401] Updated weights for policy 0, policy_version 179160 (0.0029) [2024-06-22 09:16:37,818][15401] Updated weights for policy 0, policy_version 179170 (0.0031) [2024-06-22 09:16:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2935537664. Throughput: 0: 42591.4. Samples: 2935612940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 09:16:41,963][15401] Updated weights for policy 0, policy_version 179180 (0.0027) [2024-06-22 09:16:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2935734272. Throughput: 0: 42553.8. Samples: 2935873340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 09:16:45,304][15401] Updated weights for policy 0, policy_version 179190 (0.0058) [2024-06-22 09:16:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2935963648. Throughput: 0: 42464.8. Samples: 2936120200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:16:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 09:16:49,783][15401] Updated weights for policy 0, policy_version 179200 (0.0030) [2024-06-22 09:16:53,113][15401] Updated weights for policy 0, policy_version 179210 (0.0044) [2024-06-22 09:16:53,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42598.4, 300 sec: 42819.6). Total num frames: 2936176640. Throughput: 0: 42630.8. Samples: 2936252260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:16:53,396][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 09:16:57,448][15401] Updated weights for policy 0, policy_version 179220 (0.0036) [2024-06-22 09:16:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2936373248. Throughput: 0: 42826.8. Samples: 2936511760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:16:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:17:00,677][15401] Updated weights for policy 0, policy_version 179230 (0.0043) [2024-06-22 09:17:03,392][15132] Fps is (10 sec: 42615.6, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 2936602624. Throughput: 0: 42749.3. Samples: 2936765000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:03,392][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 09:17:05,539][15401] Updated weights for policy 0, policy_version 179240 (0.0028) [2024-06-22 09:17:08,364][15401] Updated weights for policy 0, policy_version 179250 (0.0033) [2024-06-22 09:17:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2936832000. Throughput: 0: 42950.3. Samples: 2936903340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 09:17:12,977][15401] Updated weights for policy 0, policy_version 179260 (0.0029) [2024-06-22 09:17:13,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2937012224. Throughput: 0: 42852.4. Samples: 2937153040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 09:17:15,109][15349] Signal inference workers to stop experience collection... (43200 times) [2024-06-22 09:17:15,160][15401] InferenceWorker_p0-w0: stopping experience collection (43200 times) [2024-06-22 09:17:15,171][15349] Signal inference workers to resume experience collection... (43200 times) [2024-06-22 09:17:15,192][15401] InferenceWorker_p0-w0: resuming experience collection (43200 times) [2024-06-22 09:17:15,850][15401] Updated weights for policy 0, policy_version 179270 (0.0038) [2024-06-22 09:17:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2937257984. Throughput: 0: 42743.3. Samples: 2937403380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 09:17:20,369][15401] Updated weights for policy 0, policy_version 179280 (0.0029) [2024-06-22 09:17:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2937470976. Throughput: 0: 42931.9. Samples: 2937544880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:23,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-22 09:17:23,427][15401] Updated weights for policy 0, policy_version 179290 (0.0039) [2024-06-22 09:17:28,090][15401] Updated weights for policy 0, policy_version 179300 (0.0027) [2024-06-22 09:17:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2937651200. Throughput: 0: 42754.6. Samples: 2937797300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 09:17:31,102][15401] Updated weights for policy 0, policy_version 179310 (0.0038) [2024-06-22 09:17:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2937913344. Throughput: 0: 42896.0. Samples: 2938050520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 09:17:35,438][15401] Updated weights for policy 0, policy_version 179320 (0.0038) [2024-06-22 09:17:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2938109952. Throughput: 0: 43288.8. Samples: 2938199980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 09:17:38,607][15401] Updated weights for policy 0, policy_version 179330 (0.0024) [2024-06-22 09:17:43,030][15401] Updated weights for policy 0, policy_version 179340 (0.0038) [2024-06-22 09:17:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2938306560. Throughput: 0: 43101.3. Samples: 2938451320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 09:17:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179340_2938306560.pth... [2024-06-22 09:17:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000178714_2928050176.pth [2024-06-22 09:17:46,192][15401] Updated weights for policy 0, policy_version 179350 (0.0031) [2024-06-22 09:17:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2938568704. Throughput: 0: 42902.2. Samples: 2938695500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 09:17:50,571][15401] Updated weights for policy 0, policy_version 179360 (0.0029) [2024-06-22 09:17:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 2938765312. Throughput: 0: 43050.2. Samples: 2938840600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 09:17:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 09:17:53,662][15401] Updated weights for policy 0, policy_version 179370 (0.0042) [2024-06-22 09:17:57,938][15401] Updated weights for policy 0, policy_version 179380 (0.0040) [2024-06-22 09:17:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 2938961920. Throughput: 0: 43211.6. Samples: 2939097560. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:17:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 09:18:00,850][15349] Signal inference workers to stop experience collection... (43250 times) [2024-06-22 09:18:00,850][15349] Signal inference workers to resume experience collection... (43250 times) [2024-06-22 09:18:00,871][15401] InferenceWorker_p0-w0: stopping experience collection (43250 times) [2024-06-22 09:18:00,871][15401] InferenceWorker_p0-w0: resuming experience collection (43250 times) [2024-06-22 09:18:01,159][15401] Updated weights for policy 0, policy_version 179390 (0.0030) [2024-06-22 09:18:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43692.4, 300 sec: 42987.2). Total num frames: 2939224064. Throughput: 0: 43210.6. Samples: 2939347860. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 09:18:05,442][15401] Updated weights for policy 0, policy_version 179400 (0.0035) [2024-06-22 09:18:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2939404288. Throughput: 0: 43285.9. Samples: 2939492740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 09:18:09,127][15401] Updated weights for policy 0, policy_version 179410 (0.0036) [2024-06-22 09:18:13,135][15401] Updated weights for policy 0, policy_version 179420 (0.0048) [2024-06-22 09:18:13,392][15132] Fps is (10 sec: 39312.4, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 2939617280. Throughput: 0: 43379.0. Samples: 2939749460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:13,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 09:18:16,810][15401] Updated weights for policy 0, policy_version 179430 (0.0029) [2024-06-22 09:18:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 2939863040. Throughput: 0: 43249.9. Samples: 2939996760. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 09:18:21,397][15401] Updated weights for policy 0, policy_version 179440 (0.0045) [2024-06-22 09:18:23,392][15132] Fps is (10 sec: 42597.9, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 2940043264. Throughput: 0: 43034.5. Samples: 2940136640. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:23,392][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 09:18:24,292][15401] Updated weights for policy 0, policy_version 179450 (0.0052) [2024-06-22 09:18:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2940256256. Throughput: 0: 42983.6. Samples: 2940385580. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 09:18:29,070][15401] Updated weights for policy 0, policy_version 179460 (0.0029) [2024-06-22 09:18:31,915][15401] Updated weights for policy 0, policy_version 179470 (0.0033) [2024-06-22 09:18:33,389][15132] Fps is (10 sec: 44248.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 2940485632. Throughput: 0: 43211.3. Samples: 2940640000. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 09:18:36,517][15401] Updated weights for policy 0, policy_version 179480 (0.0026) [2024-06-22 09:18:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.8). Total num frames: 2940665856. Throughput: 0: 42940.6. Samples: 2940772920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 09:18:39,380][15401] Updated weights for policy 0, policy_version 179490 (0.0034) [2024-06-22 09:18:43,390][15132] Fps is (10 sec: 40958.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2940895232. Throughput: 0: 42808.2. Samples: 2941023940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:43,396][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 09:18:44,369][15401] Updated weights for policy 0, policy_version 179500 (0.0040) [2024-06-22 09:18:47,342][15401] Updated weights for policy 0, policy_version 179510 (0.0042) [2024-06-22 09:18:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 2941124608. Throughput: 0: 42880.6. Samples: 2941277480. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 09:18:51,917][15401] Updated weights for policy 0, policy_version 179520 (0.0034) [2024-06-22 09:18:53,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2941304832. Throughput: 0: 42581.8. Samples: 2941408920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:53,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-22 09:18:54,913][15401] Updated weights for policy 0, policy_version 179530 (0.0037) [2024-06-22 09:18:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2941550592. Throughput: 0: 42386.7. Samples: 2941656760. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 09:18:58,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 09:19:00,172][15401] Updated weights for policy 0, policy_version 179540 (0.0022) [2024-06-22 09:19:02,583][15401] Updated weights for policy 0, policy_version 179550 (0.0029) [2024-06-22 09:19:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 2941763584. Throughput: 0: 42503.6. Samples: 2941909420. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 09:19:07,563][15349] Signal inference workers to stop experience collection... (43300 times) [2024-06-22 09:19:07,568][15349] Signal inference workers to resume experience collection... (43300 times) [2024-06-22 09:19:07,587][15401] InferenceWorker_p0-w0: stopping experience collection (43300 times) [2024-06-22 09:19:07,615][15401] InferenceWorker_p0-w0: resuming experience collection (43300 times) [2024-06-22 09:19:07,732][15401] Updated weights for policy 0, policy_version 179560 (0.0039) [2024-06-22 09:19:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2941927424. Throughput: 0: 42371.2. Samples: 2942043240. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:08,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 09:19:10,185][15401] Updated weights for policy 0, policy_version 179570 (0.0036) [2024-06-22 09:19:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 2942189568. Throughput: 0: 42445.3. Samples: 2942295620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 09:19:15,367][15401] Updated weights for policy 0, policy_version 179580 (0.0045) [2024-06-22 09:19:17,776][15401] Updated weights for policy 0, policy_version 179590 (0.0045) [2024-06-22 09:19:18,389][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2942418944. Throughput: 0: 42477.7. Samples: 2942551500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 09:19:23,022][15401] Updated weights for policy 0, policy_version 179600 (0.0039) [2024-06-22 09:19:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.1, 300 sec: 42654.3). Total num frames: 2942582784. Throughput: 0: 42355.4. Samples: 2942678920. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:19:25,542][15401] Updated weights for policy 0, policy_version 179610 (0.0035) [2024-06-22 09:19:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2942828544. Throughput: 0: 42405.5. Samples: 2942932180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 09:19:30,666][15401] Updated weights for policy 0, policy_version 179620 (0.0028) [2024-06-22 09:19:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 2943025152. Throughput: 0: 42400.4. Samples: 2943185500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 09:19:33,660][15401] Updated weights for policy 0, policy_version 179630 (0.0035) [2024-06-22 09:19:38,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2943205376. Throughput: 0: 42211.0. Samples: 2943308420. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 09:19:38,555][15401] Updated weights for policy 0, policy_version 179640 (0.0029) [2024-06-22 09:19:41,270][15401] Updated weights for policy 0, policy_version 179650 (0.0030) [2024-06-22 09:19:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2943451136. Throughput: 0: 42355.1. Samples: 2943562740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:43,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 09:19:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179655_2943467520.pth... [2024-06-22 09:19:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179028_2933194752.pth [2024-06-22 09:19:46,054][15401] Updated weights for policy 0, policy_version 179660 (0.0022) [2024-06-22 09:19:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2943664128. Throughput: 0: 42499.5. Samples: 2943821900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 09:19:48,881][15401] Updated weights for policy 0, policy_version 179670 (0.0036) [2024-06-22 09:19:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2943844352. Throughput: 0: 42312.3. Samples: 2943947300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 09:19:53,944][15401] Updated weights for policy 0, policy_version 179680 (0.0024) [2024-06-22 09:19:56,707][15401] Updated weights for policy 0, policy_version 179690 (0.0036) [2024-06-22 09:19:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.8). Total num frames: 2944090112. Throughput: 0: 42457.9. Samples: 2944206220. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:19:58,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 09:20:01,762][15401] Updated weights for policy 0, policy_version 179700 (0.0039) [2024-06-22 09:20:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2944303104. Throughput: 0: 42316.9. Samples: 2944455760. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-22 09:20:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 09:20:04,536][15401] Updated weights for policy 0, policy_version 179710 (0.0034) [2024-06-22 09:20:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2944483328. Throughput: 0: 42360.4. Samples: 2944585140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 09:20:09,397][15401] Updated weights for policy 0, policy_version 179720 (0.0031) [2024-06-22 09:20:11,414][15349] Signal inference workers to stop experience collection... (43350 times) [2024-06-22 09:20:11,460][15401] InferenceWorker_p0-w0: stopping experience collection (43350 times) [2024-06-22 09:20:11,469][15349] Signal inference workers to resume experience collection... (43350 times) [2024-06-22 09:20:11,476][15401] InferenceWorker_p0-w0: resuming experience collection (43350 times) [2024-06-22 09:20:12,316][15401] Updated weights for policy 0, policy_version 179730 (0.0032) [2024-06-22 09:20:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2944712704. Throughput: 0: 42415.6. Samples: 2944840880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 09:20:17,062][15401] Updated weights for policy 0, policy_version 179740 (0.0030) [2024-06-22 09:20:18,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2944942080. Throughput: 0: 42519.5. Samples: 2945098880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:20:20,218][15401] Updated weights for policy 0, policy_version 179750 (0.0035) [2024-06-22 09:20:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2945122304. Throughput: 0: 42607.7. Samples: 2945225760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 09:20:24,534][15401] Updated weights for policy 0, policy_version 179760 (0.0033) [2024-06-22 09:20:27,886][15401] Updated weights for policy 0, policy_version 179770 (0.0044) [2024-06-22 09:20:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2945368064. Throughput: 0: 42686.6. Samples: 2945483640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 09:20:32,440][15401] Updated weights for policy 0, policy_version 179780 (0.0022) [2024-06-22 09:20:33,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2945564672. Throughput: 0: 42755.0. Samples: 2945745880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 09:20:35,434][15401] Updated weights for policy 0, policy_version 179790 (0.0031) [2024-06-22 09:20:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2945777664. Throughput: 0: 42674.3. Samples: 2945867640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 09:20:40,080][15401] Updated weights for policy 0, policy_version 179800 (0.0043) [2024-06-22 09:20:43,064][15401] Updated weights for policy 0, policy_version 179810 (0.0030) [2024-06-22 09:20:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2946023424. Throughput: 0: 42732.8. Samples: 2946129200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 09:20:47,560][15401] Updated weights for policy 0, policy_version 179820 (0.0035) [2024-06-22 09:20:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 2946203648. Throughput: 0: 43019.0. Samples: 2946391620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:48,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 09:20:50,642][15401] Updated weights for policy 0, policy_version 179830 (0.0039) [2024-06-22 09:20:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2946449408. Throughput: 0: 42889.5. Samples: 2946515160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 09:20:55,003][15401] Updated weights for policy 0, policy_version 179840 (0.0037) [2024-06-22 09:20:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2946646016. Throughput: 0: 42940.6. Samples: 2946773200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:20:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 09:20:58,462][15401] Updated weights for policy 0, policy_version 179850 (0.0039) [2024-06-22 09:21:02,533][15401] Updated weights for policy 0, policy_version 179860 (0.0046) [2024-06-22 09:21:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2946859008. Throughput: 0: 42988.8. Samples: 2947033380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:21:03,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 09:21:06,051][15401] Updated weights for policy 0, policy_version 179870 (0.0028) [2024-06-22 09:21:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 2947088384. Throughput: 0: 42939.8. Samples: 2947158160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 09:21:08,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 09:21:10,251][15401] Updated weights for policy 0, policy_version 179880 (0.0034) [2024-06-22 09:21:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2947284992. Throughput: 0: 42986.4. Samples: 2947418020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 09:21:13,609][15401] Updated weights for policy 0, policy_version 179890 (0.0043) [2024-06-22 09:21:18,010][15401] Updated weights for policy 0, policy_version 179900 (0.0038) [2024-06-22 09:21:18,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2947481600. Throughput: 0: 42779.3. Samples: 2947670940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:18,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 09:21:21,289][15401] Updated weights for policy 0, policy_version 179910 (0.0029) [2024-06-22 09:21:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 2947743744. Throughput: 0: 42857.3. Samples: 2947796220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 09:21:25,515][15401] Updated weights for policy 0, policy_version 179920 (0.0033) [2024-06-22 09:21:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2947923968. Throughput: 0: 42887.7. Samples: 2948059140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 09:21:29,162][15401] Updated weights for policy 0, policy_version 179930 (0.0038) [2024-06-22 09:21:32,989][15401] Updated weights for policy 0, policy_version 179940 (0.0036) [2024-06-22 09:21:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2948136960. Throughput: 0: 42544.0. Samples: 2948306100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 09:21:36,869][15401] Updated weights for policy 0, policy_version 179950 (0.0033) [2024-06-22 09:21:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2948382720. Throughput: 0: 42700.7. Samples: 2948436700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:38,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 09:21:40,563][15401] Updated weights for policy 0, policy_version 179960 (0.0041) [2024-06-22 09:21:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2948562944. Throughput: 0: 42911.3. Samples: 2948704220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 09:21:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179967_2948579328.pth... [2024-06-22 09:21:43,579][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179340_2938306560.pth [2024-06-22 09:21:44,446][15349] Signal inference workers to stop experience collection... (43400 times) [2024-06-22 09:21:44,478][15401] InferenceWorker_p0-w0: stopping experience collection (43400 times) [2024-06-22 09:21:44,504][15349] Signal inference workers to resume experience collection... (43400 times) [2024-06-22 09:21:44,505][15401] InferenceWorker_p0-w0: resuming experience collection (43400 times) [2024-06-22 09:21:44,513][15401] Updated weights for policy 0, policy_version 179970 (0.0037) [2024-06-22 09:21:48,233][15401] Updated weights for policy 0, policy_version 179980 (0.0031) [2024-06-22 09:21:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42766.0). Total num frames: 2948792320. Throughput: 0: 42569.0. Samples: 2948948980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 09:21:52,019][15401] Updated weights for policy 0, policy_version 179990 (0.0041) [2024-06-22 09:21:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2949021696. Throughput: 0: 42759.5. Samples: 2949082240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:21:56,027][15401] Updated weights for policy 0, policy_version 180000 (0.0032) [2024-06-22 09:21:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 2949185536. Throughput: 0: 42622.1. Samples: 2949336020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:21:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 09:21:59,651][15401] Updated weights for policy 0, policy_version 180010 (0.0035) [2024-06-22 09:22:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2949431296. Throughput: 0: 42337.7. Samples: 2949576240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:22:03,392][15132] Avg episode reward: [(0, '0.866')] [2024-06-22 09:22:03,570][15401] Updated weights for policy 0, policy_version 180020 (0.0039) [2024-06-22 09:22:07,654][15401] Updated weights for policy 0, policy_version 180030 (0.0027) [2024-06-22 09:22:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 2949627904. Throughput: 0: 42579.7. Samples: 2949712300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:22:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 09:22:11,084][15401] Updated weights for policy 0, policy_version 180040 (0.0034) [2024-06-22 09:22:13,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2949824512. Throughput: 0: 42284.2. Samples: 2949961940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 09:22:13,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 09:22:15,221][15401] Updated weights for policy 0, policy_version 180050 (0.0034) [2024-06-22 09:22:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2950070272. Throughput: 0: 42466.4. Samples: 2950217080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:18,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 09:22:18,971][15401] Updated weights for policy 0, policy_version 180060 (0.0034) [2024-06-22 09:22:22,766][15401] Updated weights for policy 0, policy_version 180070 (0.0024) [2024-06-22 09:22:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2950266880. Throughput: 0: 42605.9. Samples: 2950353960. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 09:22:26,406][15401] Updated weights for policy 0, policy_version 180080 (0.0027) [2024-06-22 09:22:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2950447104. Throughput: 0: 42164.2. Samples: 2950601600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 09:22:30,630][15401] Updated weights for policy 0, policy_version 180090 (0.0046) [2024-06-22 09:22:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2950725632. Throughput: 0: 42506.1. Samples: 2950861760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:33,396][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 09:22:33,892][15401] Updated weights for policy 0, policy_version 180100 (0.0026) [2024-06-22 09:22:38,326][15401] Updated weights for policy 0, policy_version 180110 (0.0042) [2024-06-22 09:22:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2950922240. Throughput: 0: 42599.6. Samples: 2950999220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:38,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 09:22:41,515][15401] Updated weights for policy 0, policy_version 180120 (0.0032) [2024-06-22 09:22:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 2951118848. Throughput: 0: 42360.6. Samples: 2951242240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 09:22:46,009][15401] Updated weights for policy 0, policy_version 180130 (0.0032) [2024-06-22 09:22:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2951348224. Throughput: 0: 42852.4. Samples: 2951504500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 09:22:49,362][15401] Updated weights for policy 0, policy_version 180140 (0.0044) [2024-06-22 09:22:52,164][15349] Signal inference workers to stop experience collection... (43450 times) [2024-06-22 09:22:52,164][15349] Signal inference workers to resume experience collection... (43450 times) [2024-06-22 09:22:52,211][15401] InferenceWorker_p0-w0: stopping experience collection (43450 times) [2024-06-22 09:22:52,212][15401] InferenceWorker_p0-w0: resuming experience collection (43450 times) [2024-06-22 09:22:53,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2951544832. Throughput: 0: 42746.0. Samples: 2951635880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:53,391][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 09:22:53,843][15401] Updated weights for policy 0, policy_version 180150 (0.0038) [2024-06-22 09:22:56,877][15401] Updated weights for policy 0, policy_version 180160 (0.0045) [2024-06-22 09:22:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2951757824. Throughput: 0: 42763.7. Samples: 2951886300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:22:58,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 09:23:01,254][15401] Updated weights for policy 0, policy_version 180170 (0.0032) [2024-06-22 09:23:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 2951987200. Throughput: 0: 42966.1. Samples: 2952150560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:23:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 09:23:04,614][15401] Updated weights for policy 0, policy_version 180180 (0.0040) [2024-06-22 09:23:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2952200192. Throughput: 0: 42854.1. Samples: 2952282400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:23:08,392][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 09:23:08,867][15401] Updated weights for policy 0, policy_version 180190 (0.0028) [2024-06-22 09:23:12,074][15401] Updated weights for policy 0, policy_version 180200 (0.0036) [2024-06-22 09:23:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 2952413184. Throughput: 0: 42927.0. Samples: 2952533320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:23:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 09:23:16,255][15401] Updated weights for policy 0, policy_version 180210 (0.0026) [2024-06-22 09:23:18,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 2952658944. Throughput: 0: 42971.5. Samples: 2952795580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-22 09:23:18,401][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 09:23:19,606][15401] Updated weights for policy 0, policy_version 180220 (0.0044) [2024-06-22 09:23:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2952839168. Throughput: 0: 42945.4. Samples: 2952931760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 09:23:23,924][15401] Updated weights for policy 0, policy_version 180230 (0.0038) [2024-06-22 09:23:27,581][15401] Updated weights for policy 0, policy_version 180240 (0.0039) [2024-06-22 09:23:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 2953068544. Throughput: 0: 43126.1. Samples: 2953182920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 09:23:31,467][15401] Updated weights for policy 0, policy_version 180250 (0.0022) [2024-06-22 09:23:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2953297920. Throughput: 0: 42976.9. Samples: 2953438460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:33,394][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 09:23:35,351][15401] Updated weights for policy 0, policy_version 180260 (0.0037) [2024-06-22 09:23:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 2953478144. Throughput: 0: 43055.1. Samples: 2953573360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:38,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 09:23:39,804][15401] Updated weights for policy 0, policy_version 180270 (0.0023) [2024-06-22 09:23:42,868][15401] Updated weights for policy 0, policy_version 180280 (0.0031) [2024-06-22 09:23:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2953707520. Throughput: 0: 43076.8. Samples: 2953824760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 09:23:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180280_2953707520.pth... [2024-06-22 09:23:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179655_2943467520.pth [2024-06-22 09:23:47,327][15401] Updated weights for policy 0, policy_version 180290 (0.0034) [2024-06-22 09:23:48,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 2953936896. Throughput: 0: 42996.8. Samples: 2954085520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:48,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 09:23:50,512][15401] Updated weights for policy 0, policy_version 180300 (0.0038) [2024-06-22 09:23:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2954133504. Throughput: 0: 43030.2. Samples: 2954218760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 09:23:54,741][15401] Updated weights for policy 0, policy_version 180310 (0.0041) [2024-06-22 09:23:58,094][15401] Updated weights for policy 0, policy_version 180320 (0.0026) [2024-06-22 09:23:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2954362880. Throughput: 0: 43096.9. Samples: 2954472680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:23:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 09:24:01,924][15349] Signal inference workers to stop experience collection... (43500 times) [2024-06-22 09:24:01,930][15349] Signal inference workers to resume experience collection... (43500 times) [2024-06-22 09:24:01,945][15401] InferenceWorker_p0-w0: stopping experience collection (43500 times) [2024-06-22 09:24:01,945][15401] InferenceWorker_p0-w0: resuming experience collection (43500 times) [2024-06-22 09:24:02,346][15401] Updated weights for policy 0, policy_version 180330 (0.0028) [2024-06-22 09:24:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2954575872. Throughput: 0: 43051.7. Samples: 2954732800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:24:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 09:24:05,766][15401] Updated weights for policy 0, policy_version 180340 (0.0033) [2024-06-22 09:24:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2954756096. Throughput: 0: 42865.3. Samples: 2954860700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:24:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 09:24:10,238][15401] Updated weights for policy 0, policy_version 180350 (0.0034) [2024-06-22 09:24:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2955001856. Throughput: 0: 42867.2. Samples: 2955111940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:24:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 09:24:13,407][15401] Updated weights for policy 0, policy_version 180360 (0.0043) [2024-06-22 09:24:17,931][15401] Updated weights for policy 0, policy_version 180370 (0.0033) [2024-06-22 09:24:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2955198464. Throughput: 0: 42928.9. Samples: 2955370260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:24:18,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 09:24:20,993][15401] Updated weights for policy 0, policy_version 180380 (0.0035) [2024-06-22 09:24:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2955395072. Throughput: 0: 42675.4. Samples: 2955493740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 09:24:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 09:24:25,905][15401] Updated weights for policy 0, policy_version 180390 (0.0042) [2024-06-22 09:24:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2955657216. Throughput: 0: 42773.0. Samples: 2955749540. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:28,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 09:24:28,610][15401] Updated weights for policy 0, policy_version 180400 (0.0032) [2024-06-22 09:24:33,396][15132] Fps is (10 sec: 42570.6, 60 sec: 42047.8, 300 sec: 42764.1). Total num frames: 2955821056. Throughput: 0: 42918.9. Samples: 2956017040. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:33,396][15132] Avg episode reward: [(0, '0.899')] [2024-06-22 09:24:33,499][15401] Updated weights for policy 0, policy_version 180410 (0.0027) [2024-06-22 09:24:36,212][15401] Updated weights for policy 0, policy_version 180420 (0.0034) [2024-06-22 09:24:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2956034048. Throughput: 0: 42511.2. Samples: 2956131760. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:38,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 09:24:41,113][15401] Updated weights for policy 0, policy_version 180430 (0.0037) [2024-06-22 09:24:43,389][15132] Fps is (10 sec: 47544.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2956296192. Throughput: 0: 42589.8. Samples: 2956389220. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 09:24:44,220][15401] Updated weights for policy 0, policy_version 180440 (0.0036) [2024-06-22 09:24:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 2956460032. Throughput: 0: 42815.5. Samples: 2956659500. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 09:24:48,717][15401] Updated weights for policy 0, policy_version 180450 (0.0041) [2024-06-22 09:24:51,801][15401] Updated weights for policy 0, policy_version 180460 (0.0032) [2024-06-22 09:24:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2956689408. Throughput: 0: 42451.6. Samples: 2956771020. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 09:24:56,358][15401] Updated weights for policy 0, policy_version 180470 (0.0033) [2024-06-22 09:24:58,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2956951552. Throughput: 0: 42727.6. Samples: 2957034680. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:24:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 09:24:59,174][15401] Updated weights for policy 0, policy_version 180480 (0.0039) [2024-06-22 09:25:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 2957099008. Throughput: 0: 42959.1. Samples: 2957303420. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 09:25:04,079][15401] Updated weights for policy 0, policy_version 180490 (0.0027) [2024-06-22 09:25:06,729][15401] Updated weights for policy 0, policy_version 180500 (0.0038) [2024-06-22 09:25:08,390][15132] Fps is (10 sec: 37682.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2957328384. Throughput: 0: 42678.8. Samples: 2957414300. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 09:25:08,772][15349] Signal inference workers to stop experience collection... (43550 times) [2024-06-22 09:25:08,772][15349] Signal inference workers to resume experience collection... (43550 times) [2024-06-22 09:25:08,800][15401] InferenceWorker_p0-w0: stopping experience collection (43550 times) [2024-06-22 09:25:08,800][15401] InferenceWorker_p0-w0: resuming experience collection (43550 times) [2024-06-22 09:25:11,740][15401] Updated weights for policy 0, policy_version 180510 (0.0026) [2024-06-22 09:25:13,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2957590528. Throughput: 0: 42863.5. Samples: 2957678400. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 09:25:14,180][15401] Updated weights for policy 0, policy_version 180520 (0.0035) [2024-06-22 09:25:18,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2957737984. Throughput: 0: 42703.8. Samples: 2957938540. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:18,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 09:25:19,237][15401] Updated weights for policy 0, policy_version 180530 (0.0038) [2024-06-22 09:25:22,386][15401] Updated weights for policy 0, policy_version 180540 (0.0042) [2024-06-22 09:25:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 2957983744. Throughput: 0: 42702.6. Samples: 2958053480. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:23,393][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 09:25:26,989][15401] Updated weights for policy 0, policy_version 180550 (0.0029) [2024-06-22 09:25:28,389][15132] Fps is (10 sec: 47525.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2958213120. Throughput: 0: 42835.6. Samples: 2958316820. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 09:25:29,862][15401] Updated weights for policy 0, policy_version 180560 (0.0041) [2024-06-22 09:25:33,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 2958376960. Throughput: 0: 42672.0. Samples: 2958579740. Policy #0 lag: (min: 0.0, avg: 13.3, max: 31.0) [2024-06-22 09:25:33,396][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 09:25:34,833][15401] Updated weights for policy 0, policy_version 180570 (0.0035) [2024-06-22 09:25:38,005][15401] Updated weights for policy 0, policy_version 180580 (0.0042) [2024-06-22 09:25:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2958622720. Throughput: 0: 42723.9. Samples: 2958693600. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:25:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 09:25:42,521][15401] Updated weights for policy 0, policy_version 180590 (0.0029) [2024-06-22 09:25:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 2958819328. Throughput: 0: 42823.3. Samples: 2958961740. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:25:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 09:25:43,605][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180594_2958852096.pth... [2024-06-22 09:25:43,678][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000179967_2948579328.pth [2024-06-22 09:25:45,562][15401] Updated weights for policy 0, policy_version 180600 (0.0035) [2024-06-22 09:25:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2959032320. Throughput: 0: 42323.3. Samples: 2959207960. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:25:48,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 09:25:50,754][15401] Updated weights for policy 0, policy_version 180610 (0.0035) [2024-06-22 09:25:53,153][15401] Updated weights for policy 0, policy_version 180620 (0.0033) [2024-06-22 09:25:53,391][15132] Fps is (10 sec: 45867.1, 60 sec: 43143.1, 300 sec: 42820.3). Total num frames: 2959278080. Throughput: 0: 42695.6. Samples: 2959335680. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:25:53,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 09:25:58,260][15401] Updated weights for policy 0, policy_version 180630 (0.0021) [2024-06-22 09:25:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 42654.0). Total num frames: 2959441920. Throughput: 0: 42577.3. Samples: 2959594380. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:25:58,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 09:26:00,622][15401] Updated weights for policy 0, policy_version 180640 (0.0028) [2024-06-22 09:26:03,390][15132] Fps is (10 sec: 39329.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2959671296. Throughput: 0: 42331.1. Samples: 2959843340. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 09:26:06,001][15401] Updated weights for policy 0, policy_version 180650 (0.0041) [2024-06-22 09:26:06,856][15349] Signal inference workers to stop experience collection... (43600 times) [2024-06-22 09:26:06,909][15401] InferenceWorker_p0-w0: stopping experience collection (43600 times) [2024-06-22 09:26:06,918][15349] Signal inference workers to resume experience collection... (43600 times) [2024-06-22 09:26:06,924][15401] InferenceWorker_p0-w0: resuming experience collection (43600 times) [2024-06-22 09:26:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2959917056. Throughput: 0: 42759.9. Samples: 2959977580. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 09:26:08,568][15401] Updated weights for policy 0, policy_version 180660 (0.0025) [2024-06-22 09:26:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41504.5, 300 sec: 42709.1). Total num frames: 2960080896. Throughput: 0: 42572.4. Samples: 2960232680. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:13,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 09:26:13,431][15401] Updated weights for policy 0, policy_version 180670 (0.0027) [2024-06-22 09:26:16,116][15401] Updated weights for policy 0, policy_version 180680 (0.0039) [2024-06-22 09:26:18,392][15132] Fps is (10 sec: 40950.8, 60 sec: 43144.6, 300 sec: 42653.6). Total num frames: 2960326656. Throughput: 0: 42157.8. Samples: 2960476940. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:18,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 09:26:21,032][15401] Updated weights for policy 0, policy_version 180690 (0.0036) [2024-06-22 09:26:23,389][15132] Fps is (10 sec: 47525.2, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 2960556032. Throughput: 0: 42626.3. Samples: 2960611780. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:23,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 09:26:24,109][15401] Updated weights for policy 0, policy_version 180700 (0.0034) [2024-06-22 09:26:28,389][15132] Fps is (10 sec: 39331.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2960719872. Throughput: 0: 42321.5. Samples: 2960866200. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 09:26:28,746][15401] Updated weights for policy 0, policy_version 180710 (0.0029) [2024-06-22 09:26:31,983][15401] Updated weights for policy 0, policy_version 180720 (0.0023) [2024-06-22 09:26:33,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43412.9, 300 sec: 42708.6). Total num frames: 2960982016. Throughput: 0: 42263.2. Samples: 2961110080. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:33,397][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 09:26:36,228][15401] Updated weights for policy 0, policy_version 180730 (0.0042) [2024-06-22 09:26:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2961178624. Throughput: 0: 42472.5. Samples: 2961246860. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-22 09:26:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 09:26:39,590][15401] Updated weights for policy 0, policy_version 180740 (0.0037) [2024-06-22 09:26:43,389][15132] Fps is (10 sec: 37707.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2961358848. Throughput: 0: 42491.2. Samples: 2961506480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:26:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 09:26:44,070][15401] Updated weights for policy 0, policy_version 180750 (0.0039) [2024-06-22 09:26:47,160][15401] Updated weights for policy 0, policy_version 180760 (0.0040) [2024-06-22 09:26:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2961620992. Throughput: 0: 42438.7. Samples: 2961753080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:26:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 09:26:51,583][15401] Updated weights for policy 0, policy_version 180770 (0.0031) [2024-06-22 09:26:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42326.7, 300 sec: 42820.5). Total num frames: 2961817600. Throughput: 0: 42575.6. Samples: 2961893480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:26:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 09:26:54,915][15401] Updated weights for policy 0, policy_version 180780 (0.0042) [2024-06-22 09:26:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2962014208. Throughput: 0: 42581.8. Samples: 2962148760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:26:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 09:26:59,131][15401] Updated weights for policy 0, policy_version 180790 (0.0024) [2024-06-22 09:27:02,406][15401] Updated weights for policy 0, policy_version 180800 (0.0043) [2024-06-22 09:27:03,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 2962243584. Throughput: 0: 42774.8. Samples: 2962401720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:03,391][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 09:27:03,464][15349] Signal inference workers to stop experience collection... (43650 times) [2024-06-22 09:27:03,464][15349] Signal inference workers to resume experience collection... (43650 times) [2024-06-22 09:27:03,480][15401] InferenceWorker_p0-w0: stopping experience collection (43650 times) [2024-06-22 09:27:03,481][15401] InferenceWorker_p0-w0: resuming experience collection (43650 times) [2024-06-22 09:27:07,009][15401] Updated weights for policy 0, policy_version 180810 (0.0039) [2024-06-22 09:27:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2962456576. Throughput: 0: 42809.8. Samples: 2962538220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 09:27:10,232][15401] Updated weights for policy 0, policy_version 180820 (0.0034) [2024-06-22 09:27:13,390][15132] Fps is (10 sec: 40961.2, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2962653184. Throughput: 0: 42741.3. Samples: 2962789560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 09:27:14,722][15401] Updated weights for policy 0, policy_version 180830 (0.0037) [2024-06-22 09:27:17,731][15401] Updated weights for policy 0, policy_version 180840 (0.0037) [2024-06-22 09:27:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2962898944. Throughput: 0: 42982.3. Samples: 2963044000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 09:27:22,347][15401] Updated weights for policy 0, policy_version 180850 (0.0029) [2024-06-22 09:27:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2963095552. Throughput: 0: 43023.1. Samples: 2963182900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 09:27:25,570][15401] Updated weights for policy 0, policy_version 180860 (0.0034) [2024-06-22 09:27:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2963292160. Throughput: 0: 42765.7. Samples: 2963430940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 09:27:29,919][15401] Updated weights for policy 0, policy_version 180870 (0.0040) [2024-06-22 09:27:33,144][15401] Updated weights for policy 0, policy_version 180880 (0.0025) [2024-06-22 09:27:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42603.1, 300 sec: 42765.0). Total num frames: 2963537920. Throughput: 0: 42933.9. Samples: 2963685100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 09:27:37,510][15401] Updated weights for policy 0, policy_version 180890 (0.0039) [2024-06-22 09:27:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2963718144. Throughput: 0: 42831.6. Samples: 2963820900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:38,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 09:27:40,755][15401] Updated weights for policy 0, policy_version 180900 (0.0038) [2024-06-22 09:27:43,396][15132] Fps is (10 sec: 40933.5, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 2963947520. Throughput: 0: 42731.3. Samples: 2964071940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-22 09:27:43,396][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 09:27:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180905_2963947520.pth... [2024-06-22 09:27:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180280_2953707520.pth [2024-06-22 09:27:45,326][15401] Updated weights for policy 0, policy_version 180910 (0.0033) [2024-06-22 09:27:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2964176896. Throughput: 0: 42596.0. Samples: 2964318520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:27:48,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 09:27:48,402][15401] Updated weights for policy 0, policy_version 180920 (0.0029) [2024-06-22 09:27:52,835][15401] Updated weights for policy 0, policy_version 180930 (0.0041) [2024-06-22 09:27:53,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2964357120. Throughput: 0: 42541.7. Samples: 2964452600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:27:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 09:27:56,222][15401] Updated weights for policy 0, policy_version 180940 (0.0022) [2024-06-22 09:27:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2964602880. Throughput: 0: 42564.4. Samples: 2964704960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:27:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 09:28:00,408][15401] Updated weights for policy 0, policy_version 180950 (0.0043) [2024-06-22 09:28:03,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 2964815872. Throughput: 0: 42707.0. Samples: 2964965920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:03,392][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 09:28:03,900][15401] Updated weights for policy 0, policy_version 180960 (0.0028) [2024-06-22 09:28:07,850][15401] Updated weights for policy 0, policy_version 180970 (0.0032) [2024-06-22 09:28:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2965012480. Throughput: 0: 42516.4. Samples: 2965096140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:08,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 09:28:11,643][15401] Updated weights for policy 0, policy_version 180980 (0.0038) [2024-06-22 09:28:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2965225472. Throughput: 0: 42619.1. Samples: 2965348800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:13,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 09:28:16,354][15401] Updated weights for policy 0, policy_version 180990 (0.0042) [2024-06-22 09:28:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2965454848. Throughput: 0: 42641.7. Samples: 2965603980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 09:28:19,391][15401] Updated weights for policy 0, policy_version 181000 (0.0032) [2024-06-22 09:28:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2965635072. Throughput: 0: 42564.5. Samples: 2965736300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 09:28:23,915][15401] Updated weights for policy 0, policy_version 181010 (0.0028) [2024-06-22 09:28:27,197][15401] Updated weights for policy 0, policy_version 181020 (0.0033) [2024-06-22 09:28:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2965848064. Throughput: 0: 42678.2. Samples: 2965992180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 09:28:28,698][15349] Signal inference workers to stop experience collection... (43700 times) [2024-06-22 09:28:28,698][15349] Signal inference workers to resume experience collection... (43700 times) [2024-06-22 09:28:28,709][15401] InferenceWorker_p0-w0: stopping experience collection (43700 times) [2024-06-22 09:28:28,709][15401] InferenceWorker_p0-w0: resuming experience collection (43700 times) [2024-06-22 09:28:31,471][15401] Updated weights for policy 0, policy_version 181030 (0.0025) [2024-06-22 09:28:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2966093824. Throughput: 0: 42868.7. Samples: 2966247620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 09:28:34,744][15401] Updated weights for policy 0, policy_version 181040 (0.0043) [2024-06-22 09:28:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2966290432. Throughput: 0: 42747.7. Samples: 2966376240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 09:28:39,329][15401] Updated weights for policy 0, policy_version 181050 (0.0045) [2024-06-22 09:28:42,536][15401] Updated weights for policy 0, policy_version 181060 (0.0038) [2024-06-22 09:28:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.8, 300 sec: 42598.7). Total num frames: 2966503424. Throughput: 0: 42705.3. Samples: 2966626700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 09:28:46,861][15401] Updated weights for policy 0, policy_version 181070 (0.0036) [2024-06-22 09:28:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2966716416. Throughput: 0: 42714.8. Samples: 2966887980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:28:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 09:28:50,138][15401] Updated weights for policy 0, policy_version 181080 (0.0028) [2024-06-22 09:28:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2966929408. Throughput: 0: 42518.9. Samples: 2967009480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:28:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 09:28:54,595][15401] Updated weights for policy 0, policy_version 181090 (0.0040) [2024-06-22 09:28:58,309][15401] Updated weights for policy 0, policy_version 181100 (0.0042) [2024-06-22 09:28:58,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2967142400. Throughput: 0: 42678.5. Samples: 2967269340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:28:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 09:29:02,218][15401] Updated weights for policy 0, policy_version 181110 (0.0035) [2024-06-22 09:29:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 2967339008. Throughput: 0: 42621.4. Samples: 2967521940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 09:29:05,909][15401] Updated weights for policy 0, policy_version 181120 (0.0039) [2024-06-22 09:29:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2967568384. Throughput: 0: 42415.0. Samples: 2967644980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 09:29:09,760][15401] Updated weights for policy 0, policy_version 181130 (0.0030) [2024-06-22 09:29:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2967781376. Throughput: 0: 42331.0. Samples: 2967897080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 09:29:13,593][15401] Updated weights for policy 0, policy_version 181140 (0.0040) [2024-06-22 09:29:18,017][15401] Updated weights for policy 0, policy_version 181150 (0.0034) [2024-06-22 09:29:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2967961600. Throughput: 0: 42385.8. Samples: 2968154980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:18,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 09:29:21,236][15401] Updated weights for policy 0, policy_version 181160 (0.0033) [2024-06-22 09:29:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2968207360. Throughput: 0: 42346.0. Samples: 2968281820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 09:29:25,641][15401] Updated weights for policy 0, policy_version 181170 (0.0034) [2024-06-22 09:29:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2968420352. Throughput: 0: 42397.0. Samples: 2968534560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 09:29:29,008][15401] Updated weights for policy 0, policy_version 181180 (0.0038) [2024-06-22 09:29:33,372][15401] Updated weights for policy 0, policy_version 181190 (0.0046) [2024-06-22 09:29:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2968616960. Throughput: 0: 42425.3. Samples: 2968797120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 09:29:36,626][15401] Updated weights for policy 0, policy_version 181200 (0.0029) [2024-06-22 09:29:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2968846336. Throughput: 0: 42268.5. Samples: 2968911560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 09:29:40,998][15401] Updated weights for policy 0, policy_version 181210 (0.0042) [2024-06-22 09:29:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2969059328. Throughput: 0: 42301.8. Samples: 2969172920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 09:29:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181217_2969059328.pth... [2024-06-22 09:29:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180594_2958852096.pth [2024-06-22 09:29:44,284][15401] Updated weights for policy 0, policy_version 181220 (0.0041) [2024-06-22 09:29:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2969255936. Throughput: 0: 42407.9. Samples: 2969430300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 09:29:48,562][15401] Updated weights for policy 0, policy_version 181230 (0.0043) [2024-06-22 09:29:51,592][15349] Signal inference workers to stop experience collection... (43750 times) [2024-06-22 09:29:51,632][15401] InferenceWorker_p0-w0: stopping experience collection (43750 times) [2024-06-22 09:29:51,645][15349] Signal inference workers to resume experience collection... (43750 times) [2024-06-22 09:29:51,647][15401] InferenceWorker_p0-w0: resuming experience collection (43750 times) [2024-06-22 09:29:51,807][15401] Updated weights for policy 0, policy_version 181240 (0.0034) [2024-06-22 09:29:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2969485312. Throughput: 0: 42498.7. Samples: 2969557420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 09:29:53,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 09:29:56,165][15401] Updated weights for policy 0, policy_version 181250 (0.0030) [2024-06-22 09:29:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2969681920. Throughput: 0: 42671.5. Samples: 2969817300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:29:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 09:29:59,269][15401] Updated weights for policy 0, policy_version 181260 (0.0034) [2024-06-22 09:30:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2969911296. Throughput: 0: 42641.3. Samples: 2970073840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 09:30:03,531][15401] Updated weights for policy 0, policy_version 181270 (0.0024) [2024-06-22 09:30:07,111][15401] Updated weights for policy 0, policy_version 181280 (0.0042) [2024-06-22 09:30:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2970140672. Throughput: 0: 42641.0. Samples: 2970200660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:30:11,456][15401] Updated weights for policy 0, policy_version 181290 (0.0029) [2024-06-22 09:30:13,394][15132] Fps is (10 sec: 42578.9, 60 sec: 42595.1, 300 sec: 42709.2). Total num frames: 2970337280. Throughput: 0: 42798.2. Samples: 2970460680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:13,395][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 09:30:14,660][15401] Updated weights for policy 0, policy_version 181300 (0.0028) [2024-06-22 09:30:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 2970533888. Throughput: 0: 42675.5. Samples: 2970717520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 09:30:19,220][15401] Updated weights for policy 0, policy_version 181310 (0.0030) [2024-06-22 09:30:22,351][15401] Updated weights for policy 0, policy_version 181320 (0.0028) [2024-06-22 09:30:23,389][15132] Fps is (10 sec: 44257.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2970779648. Throughput: 0: 42989.2. Samples: 2970846080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 09:30:26,675][15401] Updated weights for policy 0, policy_version 181330 (0.0038) [2024-06-22 09:30:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2970992640. Throughput: 0: 42882.8. Samples: 2971102640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:28,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 09:30:30,173][15401] Updated weights for policy 0, policy_version 181340 (0.0050) [2024-06-22 09:30:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2971172864. Throughput: 0: 42984.9. Samples: 2971364620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 09:30:34,463][15401] Updated weights for policy 0, policy_version 181350 (0.0026) [2024-06-22 09:30:37,798][15401] Updated weights for policy 0, policy_version 181360 (0.0026) [2024-06-22 09:30:38,393][15132] Fps is (10 sec: 40945.7, 60 sec: 42595.8, 300 sec: 42653.5). Total num frames: 2971402240. Throughput: 0: 42904.7. Samples: 2971488280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:38,394][15132] Avg episode reward: [(0, '0.124')] [2024-06-22 09:30:41,919][15401] Updated weights for policy 0, policy_version 181370 (0.0033) [2024-06-22 09:30:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2971615232. Throughput: 0: 42864.6. Samples: 2971746200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:43,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 09:30:45,636][15401] Updated weights for policy 0, policy_version 181380 (0.0034) [2024-06-22 09:30:48,390][15132] Fps is (10 sec: 42613.3, 60 sec: 42871.5, 300 sec: 42543.1). Total num frames: 2971828224. Throughput: 0: 42872.5. Samples: 2972003100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:48,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 09:30:49,527][15401] Updated weights for policy 0, policy_version 181390 (0.0027) [2024-06-22 09:30:53,044][15401] Updated weights for policy 0, policy_version 181400 (0.0032) [2024-06-22 09:30:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2972057600. Throughput: 0: 42839.4. Samples: 2972128440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 09:30:57,047][15401] Updated weights for policy 0, policy_version 181410 (0.0052) [2024-06-22 09:30:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2972254208. Throughput: 0: 42739.5. Samples: 2972383760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 09:30:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 09:31:01,002][15401] Updated weights for policy 0, policy_version 181420 (0.0029) [2024-06-22 09:31:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2972467200. Throughput: 0: 42771.1. Samples: 2972642220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:03,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-22 09:31:04,777][15401] Updated weights for policy 0, policy_version 181430 (0.0023) [2024-06-22 09:31:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2972696576. Throughput: 0: 42726.2. Samples: 2972768760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 09:31:08,514][15401] Updated weights for policy 0, policy_version 181440 (0.0030) [2024-06-22 09:31:12,420][15401] Updated weights for policy 0, policy_version 181450 (0.0037) [2024-06-22 09:31:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42874.8, 300 sec: 42654.3). Total num frames: 2972909568. Throughput: 0: 42791.5. Samples: 2973028260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 09:31:16,397][15401] Updated weights for policy 0, policy_version 181460 (0.0041) [2024-06-22 09:31:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2973122560. Throughput: 0: 42712.4. Samples: 2973286680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:18,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 09:31:19,968][15401] Updated weights for policy 0, policy_version 181470 (0.0031) [2024-06-22 09:31:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2973335552. Throughput: 0: 42734.0. Samples: 2973411160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 09:31:24,081][15401] Updated weights for policy 0, policy_version 181480 (0.0030) [2024-06-22 09:31:26,198][15349] Signal inference workers to stop experience collection... (43800 times) [2024-06-22 09:31:26,256][15401] InferenceWorker_p0-w0: stopping experience collection (43800 times) [2024-06-22 09:31:26,257][15349] Signal inference workers to resume experience collection... (43800 times) [2024-06-22 09:31:26,276][15401] InferenceWorker_p0-w0: resuming experience collection (43800 times) [2024-06-22 09:31:27,443][15401] Updated weights for policy 0, policy_version 181490 (0.0028) [2024-06-22 09:31:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 2973564928. Throughput: 0: 42858.1. Samples: 2973674820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 09:31:31,646][15401] Updated weights for policy 0, policy_version 181500 (0.0042) [2024-06-22 09:31:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2973745152. Throughput: 0: 42886.1. Samples: 2973932980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 09:31:35,012][15401] Updated weights for policy 0, policy_version 181510 (0.0027) [2024-06-22 09:31:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 2973974528. Throughput: 0: 42877.1. Samples: 2974057900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 09:31:39,139][15401] Updated weights for policy 0, policy_version 181520 (0.0027) [2024-06-22 09:31:43,074][15401] Updated weights for policy 0, policy_version 181530 (0.0046) [2024-06-22 09:31:43,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2974220288. Throughput: 0: 43053.4. Samples: 2974321160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:31:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181532_2974220288.pth... [2024-06-22 09:31:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000180905_2963947520.pth [2024-06-22 09:31:46,907][15401] Updated weights for policy 0, policy_version 181540 (0.0037) [2024-06-22 09:31:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2974400512. Throughput: 0: 43013.7. Samples: 2974577840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 09:31:50,579][15401] Updated weights for policy 0, policy_version 181550 (0.0026) [2024-06-22 09:31:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2974613504. Throughput: 0: 42902.6. Samples: 2974699380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:53,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 09:31:54,378][15401] Updated weights for policy 0, policy_version 181560 (0.0042) [2024-06-22 09:31:58,305][15401] Updated weights for policy 0, policy_version 181570 (0.0026) [2024-06-22 09:31:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2974842880. Throughput: 0: 42915.2. Samples: 2974959440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:31:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 09:32:02,182][15401] Updated weights for policy 0, policy_version 181580 (0.0042) [2024-06-22 09:32:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2975039488. Throughput: 0: 42987.2. Samples: 2975221100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 09:32:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 09:32:05,874][15401] Updated weights for policy 0, policy_version 181590 (0.0035) [2024-06-22 09:32:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2975268864. Throughput: 0: 42946.1. Samples: 2975343740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 09:32:09,912][15401] Updated weights for policy 0, policy_version 181600 (0.0047) [2024-06-22 09:32:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 2975481856. Throughput: 0: 42714.7. Samples: 2975597080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:13,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 09:32:13,655][15401] Updated weights for policy 0, policy_version 181610 (0.0037) [2024-06-22 09:32:17,481][15401] Updated weights for policy 0, policy_version 181620 (0.0024) [2024-06-22 09:32:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2975678464. Throughput: 0: 42684.1. Samples: 2975853760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 09:32:21,202][15401] Updated weights for policy 0, policy_version 181630 (0.0029) [2024-06-22 09:32:23,396][15132] Fps is (10 sec: 42581.1, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 2975907840. Throughput: 0: 42771.1. Samples: 2975982880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:23,397][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 09:32:25,068][15401] Updated weights for policy 0, policy_version 181640 (0.0029) [2024-06-22 09:32:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2976104448. Throughput: 0: 42598.2. Samples: 2976238080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:28,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 09:32:29,119][15401] Updated weights for policy 0, policy_version 181650 (0.0024) [2024-06-22 09:32:32,694][15401] Updated weights for policy 0, policy_version 181660 (0.0031) [2024-06-22 09:32:33,390][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2976333824. Throughput: 0: 42589.3. Samples: 2976494360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 09:32:36,687][15401] Updated weights for policy 0, policy_version 181670 (0.0033) [2024-06-22 09:32:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 2976563200. Throughput: 0: 42860.2. Samples: 2976628080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:38,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 09:32:40,499][15401] Updated weights for policy 0, policy_version 181680 (0.0035) [2024-06-22 09:32:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2976759808. Throughput: 0: 42756.0. Samples: 2976883460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:43,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 09:32:44,297][15401] Updated weights for policy 0, policy_version 181690 (0.0037) [2024-06-22 09:32:47,972][15401] Updated weights for policy 0, policy_version 181700 (0.0034) [2024-06-22 09:32:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2976972800. Throughput: 0: 42532.0. Samples: 2977135040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 09:32:51,842][15401] Updated weights for policy 0, policy_version 181710 (0.0043) [2024-06-22 09:32:51,845][15349] Signal inference workers to stop experience collection... (43850 times) [2024-06-22 09:32:51,845][15349] Signal inference workers to resume experience collection... (43850 times) [2024-06-22 09:32:51,862][15401] InferenceWorker_p0-w0: stopping experience collection (43850 times) [2024-06-22 09:32:51,862][15401] InferenceWorker_p0-w0: resuming experience collection (43850 times) [2024-06-22 09:32:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2977185792. Throughput: 0: 42697.3. Samples: 2977265120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 09:32:55,910][15401] Updated weights for policy 0, policy_version 181720 (0.0038) [2024-06-22 09:32:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42593.8, 300 sec: 42653.4). Total num frames: 2977398784. Throughput: 0: 42790.9. Samples: 2977522840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:32:58,396][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 09:32:59,512][15401] Updated weights for policy 0, policy_version 181730 (0.0040) [2024-06-22 09:33:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2977611776. Throughput: 0: 42805.4. Samples: 2977780000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:33:03,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 09:33:03,517][15401] Updated weights for policy 0, policy_version 181740 (0.0036) [2024-06-22 09:33:07,017][15401] Updated weights for policy 0, policy_version 181750 (0.0030) [2024-06-22 09:33:08,389][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2977841152. Throughput: 0: 42819.9. Samples: 2977909500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:33:08,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 09:33:11,105][15401] Updated weights for policy 0, policy_version 181760 (0.0041) [2024-06-22 09:33:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 2978054144. Throughput: 0: 42926.7. Samples: 2978169780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:33:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 09:33:14,547][15401] Updated weights for policy 0, policy_version 181770 (0.0033) [2024-06-22 09:33:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2978250752. Throughput: 0: 42785.3. Samples: 2978419700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 09:33:18,997][15401] Updated weights for policy 0, policy_version 181780 (0.0042) [2024-06-22 09:33:22,648][15401] Updated weights for policy 0, policy_version 181790 (0.0035) [2024-06-22 09:33:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42874.4, 300 sec: 42820.2). Total num frames: 2978480128. Throughput: 0: 42484.8. Samples: 2978540000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:23,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 09:33:26,581][15401] Updated weights for policy 0, policy_version 181800 (0.0033) [2024-06-22 09:33:28,391][15132] Fps is (10 sec: 44229.3, 60 sec: 43143.3, 300 sec: 42709.2). Total num frames: 2978693120. Throughput: 0: 42758.3. Samples: 2978807660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:28,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 09:33:30,047][15401] Updated weights for policy 0, policy_version 181810 (0.0041) [2024-06-22 09:33:33,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2978873344. Throughput: 0: 42800.8. Samples: 2979061080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:33,395][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 09:33:34,450][15401] Updated weights for policy 0, policy_version 181820 (0.0029) [2024-06-22 09:33:37,520][15401] Updated weights for policy 0, policy_version 181830 (0.0034) [2024-06-22 09:33:38,389][15132] Fps is (10 sec: 42606.2, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 2979119104. Throughput: 0: 42703.3. Samples: 2979186760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 09:33:42,041][15401] Updated weights for policy 0, policy_version 181840 (0.0034) [2024-06-22 09:33:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2979332096. Throughput: 0: 42859.5. Samples: 2979451240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 09:33:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181844_2979332096.pth... [2024-06-22 09:33:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181217_2969059328.pth [2024-06-22 09:33:45,076][15401] Updated weights for policy 0, policy_version 181850 (0.0027) [2024-06-22 09:33:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2979528704. Throughput: 0: 42800.9. Samples: 2979706040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 09:33:49,628][15401] Updated weights for policy 0, policy_version 181860 (0.0045) [2024-06-22 09:33:52,582][15401] Updated weights for policy 0, policy_version 181870 (0.0037) [2024-06-22 09:33:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2979758080. Throughput: 0: 42750.2. Samples: 2979833260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 09:33:57,287][15401] Updated weights for policy 0, policy_version 181880 (0.0034) [2024-06-22 09:33:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 2979954688. Throughput: 0: 42677.8. Samples: 2980090280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:33:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 09:34:00,140][15401] Updated weights for policy 0, policy_version 181890 (0.0026) [2024-06-22 09:34:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2980167680. Throughput: 0: 42856.4. Samples: 2980348240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:34:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 09:34:04,921][15401] Updated weights for policy 0, policy_version 181900 (0.0039) [2024-06-22 09:34:08,265][15401] Updated weights for policy 0, policy_version 181910 (0.0044) [2024-06-22 09:34:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2980413440. Throughput: 0: 43082.9. Samples: 2980478620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:34:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 09:34:12,544][15401] Updated weights for policy 0, policy_version 181920 (0.0033) [2024-06-22 09:34:13,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 2980610048. Throughput: 0: 42788.7. Samples: 2980733180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:34:13,392][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 09:34:15,889][15401] Updated weights for policy 0, policy_version 181930 (0.0023) [2024-06-22 09:34:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2980806656. Throughput: 0: 42881.0. Samples: 2980990720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:34:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 09:34:20,135][15401] Updated weights for policy 0, policy_version 181940 (0.0025) [2024-06-22 09:34:22,835][15349] Signal inference workers to stop experience collection... (43900 times) [2024-06-22 09:34:22,840][15349] Signal inference workers to resume experience collection... (43900 times) [2024-06-22 09:34:22,887][15401] InferenceWorker_p0-w0: stopping experience collection (43900 times) [2024-06-22 09:34:22,887][15401] InferenceWorker_p0-w0: resuming experience collection (43900 times) [2024-06-22 09:34:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2981052416. Throughput: 0: 42903.9. Samples: 2981117440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 09:34:23,517][15401] Updated weights for policy 0, policy_version 181950 (0.0032) [2024-06-22 09:34:27,900][15401] Updated weights for policy 0, policy_version 181960 (0.0046) [2024-06-22 09:34:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42599.6, 300 sec: 42820.5). Total num frames: 2981249024. Throughput: 0: 42836.7. Samples: 2981378900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 09:34:31,083][15401] Updated weights for policy 0, policy_version 181970 (0.0042) [2024-06-22 09:34:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2981445632. Throughput: 0: 42889.7. Samples: 2981636080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 09:34:35,571][15401] Updated weights for policy 0, policy_version 181980 (0.0035) [2024-06-22 09:34:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2981707776. Throughput: 0: 42874.7. Samples: 2981762620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:38,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 09:34:38,610][15401] Updated weights for policy 0, policy_version 181990 (0.0029) [2024-06-22 09:34:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2981871616. Throughput: 0: 43025.8. Samples: 2982026440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:43,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 09:34:43,415][15401] Updated weights for policy 0, policy_version 182000 (0.0050) [2024-06-22 09:34:46,363][15401] Updated weights for policy 0, policy_version 182010 (0.0036) [2024-06-22 09:34:48,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 2982100992. Throughput: 0: 42787.4. Samples: 2982273940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:48,396][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 09:34:51,317][15401] Updated weights for policy 0, policy_version 182020 (0.0026) [2024-06-22 09:34:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2982330368. Throughput: 0: 42893.3. Samples: 2982408820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 09:34:54,273][15401] Updated weights for policy 0, policy_version 182030 (0.0032) [2024-06-22 09:34:58,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2982526976. Throughput: 0: 42788.5. Samples: 2982658560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:34:58,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 09:34:58,940][15401] Updated weights for policy 0, policy_version 182040 (0.0047) [2024-06-22 09:35:01,937][15401] Updated weights for policy 0, policy_version 182050 (0.0040) [2024-06-22 09:35:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2982756352. Throughput: 0: 42553.2. Samples: 2982905620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:35:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 09:35:06,483][15401] Updated weights for policy 0, policy_version 182060 (0.0033) [2024-06-22 09:35:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42821.2). Total num frames: 2982969344. Throughput: 0: 42740.1. Samples: 2983040740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:35:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 09:35:09,685][15401] Updated weights for policy 0, policy_version 182070 (0.0031) [2024-06-22 09:35:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 2983165952. Throughput: 0: 42596.4. Samples: 2983295740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:35:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 09:35:14,000][15401] Updated weights for policy 0, policy_version 182080 (0.0031) [2024-06-22 09:35:17,410][15401] Updated weights for policy 0, policy_version 182090 (0.0025) [2024-06-22 09:35:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2983378944. Throughput: 0: 42393.9. Samples: 2983543800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:35:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 09:35:21,527][15401] Updated weights for policy 0, policy_version 182100 (0.0040) [2024-06-22 09:35:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2983608320. Throughput: 0: 42467.1. Samples: 2983673640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 09:35:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 09:35:25,084][15401] Updated weights for policy 0, policy_version 182110 (0.0043) [2024-06-22 09:35:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2983788544. Throughput: 0: 42248.8. Samples: 2983927640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 09:35:29,436][15401] Updated weights for policy 0, policy_version 182120 (0.0040) [2024-06-22 09:35:32,898][15401] Updated weights for policy 0, policy_version 182130 (0.0033) [2024-06-22 09:35:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.5). Total num frames: 2984017920. Throughput: 0: 42462.4. Samples: 2984184480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 09:35:36,927][15401] Updated weights for policy 0, policy_version 182140 (0.0034) [2024-06-22 09:35:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2984230912. Throughput: 0: 42466.2. Samples: 2984319800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 09:35:40,475][15401] Updated weights for policy 0, policy_version 182150 (0.0027) [2024-06-22 09:35:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2984427520. Throughput: 0: 42488.0. Samples: 2984570520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 09:35:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182156_2984443904.pth... [2024-06-22 09:35:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181532_2974220288.pth [2024-06-22 09:35:44,478][15401] Updated weights for policy 0, policy_version 182160 (0.0045) [2024-06-22 09:35:45,836][15349] Signal inference workers to stop experience collection... (43950 times) [2024-06-22 09:35:45,844][15349] Signal inference workers to resume experience collection... (43950 times) [2024-06-22 09:35:45,850][15401] InferenceWorker_p0-w0: stopping experience collection (43950 times) [2024-06-22 09:35:45,883][15401] InferenceWorker_p0-w0: resuming experience collection (43950 times) [2024-06-22 09:35:48,051][15401] Updated weights for policy 0, policy_version 182170 (0.0037) [2024-06-22 09:35:48,390][15132] Fps is (10 sec: 44232.3, 60 sec: 42875.4, 300 sec: 42764.9). Total num frames: 2984673280. Throughput: 0: 42637.8. Samples: 2984824360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:48,391][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 09:35:51,965][15401] Updated weights for policy 0, policy_version 182180 (0.0035) [2024-06-22 09:35:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2984886272. Throughput: 0: 42707.0. Samples: 2984962560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 09:35:55,654][15401] Updated weights for policy 0, policy_version 182190 (0.0036) [2024-06-22 09:35:58,389][15132] Fps is (10 sec: 40964.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2985082880. Throughput: 0: 42807.8. Samples: 2985222080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:35:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 09:35:59,412][15401] Updated weights for policy 0, policy_version 182200 (0.0033) [2024-06-22 09:36:03,202][15401] Updated weights for policy 0, policy_version 182210 (0.0035) [2024-06-22 09:36:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2985328640. Throughput: 0: 43014.2. Samples: 2985479440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:03,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-22 09:36:06,921][15401] Updated weights for policy 0, policy_version 182220 (0.0041) [2024-06-22 09:36:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2985541632. Throughput: 0: 43020.1. Samples: 2985609540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:08,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 09:36:10,779][15401] Updated weights for policy 0, policy_version 182230 (0.0025) [2024-06-22 09:36:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2985738240. Throughput: 0: 43156.9. Samples: 2985869800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:13,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 09:36:15,019][15401] Updated weights for policy 0, policy_version 182240 (0.0026) [2024-06-22 09:36:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2985967616. Throughput: 0: 43108.2. Samples: 2986124340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 09:36:18,724][15401] Updated weights for policy 0, policy_version 182250 (0.0024) [2024-06-22 09:36:22,487][15401] Updated weights for policy 0, policy_version 182260 (0.0028) [2024-06-22 09:36:23,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2986196992. Throughput: 0: 43079.0. Samples: 2986258360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 09:36:26,144][15401] Updated weights for policy 0, policy_version 182270 (0.0037) [2024-06-22 09:36:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 2986393600. Throughput: 0: 43310.7. Samples: 2986519500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:36:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 09:36:29,956][15401] Updated weights for policy 0, policy_version 182280 (0.0035) [2024-06-22 09:36:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2986606592. Throughput: 0: 43301.3. Samples: 2986772880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:33,391][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 09:36:34,438][15401] Updated weights for policy 0, policy_version 182290 (0.0033) [2024-06-22 09:36:37,647][15401] Updated weights for policy 0, policy_version 182300 (0.0024) [2024-06-22 09:36:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2986819584. Throughput: 0: 43109.7. Samples: 2986902500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 09:36:41,825][15401] Updated weights for policy 0, policy_version 182310 (0.0032) [2024-06-22 09:36:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 2987048960. Throughput: 0: 43130.1. Samples: 2987162940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:43,390][15132] Avg episode reward: [(0, '0.130')] [2024-06-22 09:36:45,210][15401] Updated weights for policy 0, policy_version 182320 (0.0033) [2024-06-22 09:36:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43143.5, 300 sec: 42875.8). Total num frames: 2987261952. Throughput: 0: 43071.4. Samples: 2987417760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:48,392][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 09:36:49,465][15401] Updated weights for policy 0, policy_version 182330 (0.0036) [2024-06-22 09:36:52,759][15401] Updated weights for policy 0, policy_version 182340 (0.0040) [2024-06-22 09:36:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2987474944. Throughput: 0: 43049.1. Samples: 2987546760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 09:36:57,142][15401] Updated weights for policy 0, policy_version 182350 (0.0025) [2024-06-22 09:36:58,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2987655168. Throughput: 0: 43089.4. Samples: 2987808720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:36:58,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 09:37:00,273][15401] Updated weights for policy 0, policy_version 182360 (0.0032) [2024-06-22 09:37:01,009][15349] Signal inference workers to stop experience collection... (44000 times) [2024-06-22 09:37:01,010][15349] Signal inference workers to resume experience collection... (44000 times) [2024-06-22 09:37:01,043][15401] InferenceWorker_p0-w0: stopping experience collection (44000 times) [2024-06-22 09:37:01,043][15401] InferenceWorker_p0-w0: resuming experience collection (44000 times) [2024-06-22 09:37:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 2987900928. Throughput: 0: 42979.8. Samples: 2988058540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:03,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 09:37:04,624][15401] Updated weights for policy 0, policy_version 182370 (0.0042) [2024-06-22 09:37:07,953][15401] Updated weights for policy 0, policy_version 182380 (0.0030) [2024-06-22 09:37:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2988113920. Throughput: 0: 42963.1. Samples: 2988191700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 09:37:12,177][15401] Updated weights for policy 0, policy_version 182390 (0.0041) [2024-06-22 09:37:13,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 2988310528. Throughput: 0: 42812.5. Samples: 2988446060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 09:37:15,825][15401] Updated weights for policy 0, policy_version 182400 (0.0041) [2024-06-22 09:37:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2988539904. Throughput: 0: 42872.4. Samples: 2988702140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:18,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 09:37:19,605][15401] Updated weights for policy 0, policy_version 182410 (0.0039) [2024-06-22 09:37:23,273][15401] Updated weights for policy 0, policy_version 182420 (0.0028) [2024-06-22 09:37:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2988769280. Throughput: 0: 42908.1. Samples: 2988833360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 09:37:27,238][15401] Updated weights for policy 0, policy_version 182430 (0.0032) [2024-06-22 09:37:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2988949504. Throughput: 0: 42888.5. Samples: 2989092920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 09:37:30,723][15401] Updated weights for policy 0, policy_version 182440 (0.0032) [2024-06-22 09:37:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2989178880. Throughput: 0: 43008.5. Samples: 2989353040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:37:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 09:37:34,947][15401] Updated weights for policy 0, policy_version 182450 (0.0033) [2024-06-22 09:37:38,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 2989408256. Throughput: 0: 42993.8. Samples: 2989481580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:37:38,393][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 09:37:38,515][15401] Updated weights for policy 0, policy_version 182460 (0.0035) [2024-06-22 09:37:42,492][15401] Updated weights for policy 0, policy_version 182470 (0.0027) [2024-06-22 09:37:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2989621248. Throughput: 0: 43038.9. Samples: 2989745480. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:37:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 09:37:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182472_2989621248.pth... [2024-06-22 09:37:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000181844_2979332096.pth [2024-06-22 09:37:45,889][15401] Updated weights for policy 0, policy_version 182480 (0.0032) [2024-06-22 09:37:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2989834240. Throughput: 0: 43138.4. Samples: 2989999660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:37:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 09:37:50,125][15401] Updated weights for policy 0, policy_version 182490 (0.0023) [2024-06-22 09:37:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 2990063616. Throughput: 0: 43068.4. Samples: 2990129780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:37:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 09:37:53,420][15401] Updated weights for policy 0, policy_version 182500 (0.0026) [2024-06-22 09:37:57,948][15401] Updated weights for policy 0, policy_version 182510 (0.0041) [2024-06-22 09:37:58,395][15132] Fps is (10 sec: 42577.0, 60 sec: 43413.9, 300 sec: 42875.4). Total num frames: 2990260224. Throughput: 0: 43017.8. Samples: 2990382080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:37:58,395][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 09:38:01,405][15401] Updated weights for policy 0, policy_version 182520 (0.0040) [2024-06-22 09:38:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 2990473216. Throughput: 0: 42974.2. Samples: 2990635980. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 09:38:05,549][15401] Updated weights for policy 0, policy_version 182530 (0.0035) [2024-06-22 09:38:08,390][15132] Fps is (10 sec: 42619.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2990686208. Throughput: 0: 42892.8. Samples: 2990763540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 09:38:09,228][15401] Updated weights for policy 0, policy_version 182540 (0.0034) [2024-06-22 09:38:13,273][15401] Updated weights for policy 0, policy_version 182550 (0.0032) [2024-06-22 09:38:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2990899200. Throughput: 0: 42774.7. Samples: 2991017780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 09:38:16,899][15401] Updated weights for policy 0, policy_version 182560 (0.0037) [2024-06-22 09:38:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 2991112192. Throughput: 0: 42706.2. Samples: 2991274820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 09:38:20,858][15401] Updated weights for policy 0, policy_version 182570 (0.0037) [2024-06-22 09:38:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 2991325184. Throughput: 0: 42689.8. Samples: 2991402520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 09:38:23,612][15349] Signal inference workers to stop experience collection... (44050 times) [2024-06-22 09:38:23,612][15349] Signal inference workers to resume experience collection... (44050 times) [2024-06-22 09:38:23,663][15401] InferenceWorker_p0-w0: stopping experience collection (44050 times) [2024-06-22 09:38:23,663][15401] InferenceWorker_p0-w0: resuming experience collection (44050 times) [2024-06-22 09:38:24,503][15401] Updated weights for policy 0, policy_version 182580 (0.0043) [2024-06-22 09:38:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2991538176. Throughput: 0: 42525.1. Samples: 2991659100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 09:38:28,452][15401] Updated weights for policy 0, policy_version 182590 (0.0043) [2024-06-22 09:38:32,239][15401] Updated weights for policy 0, policy_version 182600 (0.0037) [2024-06-22 09:38:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2991751168. Throughput: 0: 42539.6. Samples: 2991913940. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:33,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 09:38:36,058][15401] Updated weights for policy 0, policy_version 182610 (0.0033) [2024-06-22 09:38:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 2991980544. Throughput: 0: 42475.2. Samples: 2992041160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 09:38:40,131][15401] Updated weights for policy 0, policy_version 182620 (0.0039) [2024-06-22 09:38:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2992177152. Throughput: 0: 42680.7. Samples: 2992302500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-22 09:38:43,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 09:38:43,731][15401] Updated weights for policy 0, policy_version 182630 (0.0037) [2024-06-22 09:38:47,558][15401] Updated weights for policy 0, policy_version 182640 (0.0031) [2024-06-22 09:38:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2992406528. Throughput: 0: 42745.4. Samples: 2992559520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:38:48,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 09:38:51,294][15401] Updated weights for policy 0, policy_version 182650 (0.0022) [2024-06-22 09:38:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2992619520. Throughput: 0: 42821.1. Samples: 2992690480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:38:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 09:38:55,465][15401] Updated weights for policy 0, policy_version 182660 (0.0030) [2024-06-22 09:38:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42875.1, 300 sec: 42931.7). Total num frames: 2992832512. Throughput: 0: 42943.1. Samples: 2992950220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:38:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 09:38:59,321][15401] Updated weights for policy 0, policy_version 182670 (0.0029) [2024-06-22 09:39:03,127][15401] Updated weights for policy 0, policy_version 182680 (0.0038) [2024-06-22 09:39:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2993045504. Throughput: 0: 42679.1. Samples: 2993195380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:03,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 09:39:06,946][15401] Updated weights for policy 0, policy_version 182690 (0.0034) [2024-06-22 09:39:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 2993225728. Throughput: 0: 42757.9. Samples: 2993326620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:08,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 09:39:10,677][15401] Updated weights for policy 0, policy_version 182700 (0.0040) [2024-06-22 09:39:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2993455104. Throughput: 0: 42768.5. Samples: 2993583680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 09:39:14,513][15401] Updated weights for policy 0, policy_version 182710 (0.0041) [2024-06-22 09:39:18,293][15401] Updated weights for policy 0, policy_version 182720 (0.0031) [2024-06-22 09:39:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2993684480. Throughput: 0: 42746.2. Samples: 2993837520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:18,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 09:39:22,146][15401] Updated weights for policy 0, policy_version 182730 (0.0028) [2024-06-22 09:39:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2993881088. Throughput: 0: 42846.6. Samples: 2993969260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 09:39:25,927][15401] Updated weights for policy 0, policy_version 182740 (0.0040) [2024-06-22 09:39:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2994094080. Throughput: 0: 42688.1. Samples: 2994223460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 09:39:29,964][15401] Updated weights for policy 0, policy_version 182750 (0.0039) [2024-06-22 09:39:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2994307072. Throughput: 0: 42641.9. Samples: 2994478400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 09:39:33,702][15401] Updated weights for policy 0, policy_version 182760 (0.0034) [2024-06-22 09:39:37,501][15401] Updated weights for policy 0, policy_version 182770 (0.0028) [2024-06-22 09:39:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2994536448. Throughput: 0: 42671.4. Samples: 2994610700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 09:39:41,371][15401] Updated weights for policy 0, policy_version 182780 (0.0035) [2024-06-22 09:39:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 2994733056. Throughput: 0: 42596.9. Samples: 2994867080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 09:39:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182785_2994749440.pth... [2024-06-22 09:39:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182156_2984443904.pth [2024-06-22 09:39:45,223][15401] Updated weights for policy 0, policy_version 182790 (0.0041) [2024-06-22 09:39:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2994962432. Throughput: 0: 42894.3. Samples: 2995125620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-22 09:39:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 09:39:49,001][15401] Updated weights for policy 0, policy_version 182800 (0.0047) [2024-06-22 09:39:52,906][15401] Updated weights for policy 0, policy_version 182810 (0.0033) [2024-06-22 09:39:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2995191808. Throughput: 0: 42806.2. Samples: 2995252900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:39:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 09:39:56,522][15401] Updated weights for policy 0, policy_version 182820 (0.0046) [2024-06-22 09:39:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2995388416. Throughput: 0: 42806.7. Samples: 2995509980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:39:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 09:40:00,418][15349] Signal inference workers to stop experience collection... (44100 times) [2024-06-22 09:40:00,418][15349] Signal inference workers to resume experience collection... (44100 times) [2024-06-22 09:40:00,441][15401] InferenceWorker_p0-w0: stopping experience collection (44100 times) [2024-06-22 09:40:00,441][15401] InferenceWorker_p0-w0: resuming experience collection (44100 times) [2024-06-22 09:40:00,573][15401] Updated weights for policy 0, policy_version 182830 (0.0039) [2024-06-22 09:40:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2995601408. Throughput: 0: 42986.5. Samples: 2995771920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 09:40:03,980][15401] Updated weights for policy 0, policy_version 182840 (0.0040) [2024-06-22 09:40:08,127][15401] Updated weights for policy 0, policy_version 182850 (0.0035) [2024-06-22 09:40:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2995814400. Throughput: 0: 42862.3. Samples: 2995898060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 09:40:11,884][15401] Updated weights for policy 0, policy_version 182860 (0.0034) [2024-06-22 09:40:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2996027392. Throughput: 0: 42841.3. Samples: 2996151320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:13,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 09:40:15,802][15401] Updated weights for policy 0, policy_version 182870 (0.0032) [2024-06-22 09:40:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2996240384. Throughput: 0: 42881.4. Samples: 2996408060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 09:40:19,662][15401] Updated weights for policy 0, policy_version 182880 (0.0036) [2024-06-22 09:40:23,392][15132] Fps is (10 sec: 42590.0, 60 sec: 42870.0, 300 sec: 42931.4). Total num frames: 2996453376. Throughput: 0: 42798.6. Samples: 2996536720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:23,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 09:40:23,569][15401] Updated weights for policy 0, policy_version 182890 (0.0039) [2024-06-22 09:40:27,292][15401] Updated weights for policy 0, policy_version 182900 (0.0044) [2024-06-22 09:40:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2996666368. Throughput: 0: 42725.7. Samples: 2996789740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 09:40:31,318][15401] Updated weights for policy 0, policy_version 182910 (0.0023) [2024-06-22 09:40:33,389][15132] Fps is (10 sec: 42607.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2996879360. Throughput: 0: 42461.5. Samples: 2997036380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 09:40:35,107][15401] Updated weights for policy 0, policy_version 182920 (0.0033) [2024-06-22 09:40:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2997059584. Throughput: 0: 42604.9. Samples: 2997170120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 09:40:39,145][15401] Updated weights for policy 0, policy_version 182930 (0.0050) [2024-06-22 09:40:43,046][15401] Updated weights for policy 0, policy_version 182940 (0.0033) [2024-06-22 09:40:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42764.8). Total num frames: 2997288960. Throughput: 0: 42623.4. Samples: 2997428140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:43,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 09:40:46,839][15401] Updated weights for policy 0, policy_version 182950 (0.0028) [2024-06-22 09:40:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2997518336. Throughput: 0: 42342.5. Samples: 2997677320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 09:40:50,643][15401] Updated weights for policy 0, policy_version 182960 (0.0031) [2024-06-22 09:40:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 2997714944. Throughput: 0: 42430.6. Samples: 2997807440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:40:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 09:40:54,620][15401] Updated weights for policy 0, policy_version 182970 (0.0035) [2024-06-22 09:40:58,218][15401] Updated weights for policy 0, policy_version 182980 (0.0027) [2024-06-22 09:40:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2997944320. Throughput: 0: 42541.9. Samples: 2998065700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:40:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 09:41:02,101][15401] Updated weights for policy 0, policy_version 182990 (0.0042) [2024-06-22 09:41:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2998173696. Throughput: 0: 42494.1. Samples: 2998320300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 09:41:05,748][15401] Updated weights for policy 0, policy_version 183000 (0.0038) [2024-06-22 09:41:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 2998353920. Throughput: 0: 42476.1. Samples: 2998448060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 09:41:09,588][15401] Updated weights for policy 0, policy_version 183010 (0.0042) [2024-06-22 09:41:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2998583296. Throughput: 0: 42583.2. Samples: 2998705980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 09:41:13,439][15401] Updated weights for policy 0, policy_version 183020 (0.0026) [2024-06-22 09:41:17,199][15401] Updated weights for policy 0, policy_version 183030 (0.0031) [2024-06-22 09:41:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2998796288. Throughput: 0: 42682.6. Samples: 2998957100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 09:41:21,137][15401] Updated weights for policy 0, policy_version 183040 (0.0045) [2024-06-22 09:41:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 2998976512. Throughput: 0: 42611.6. Samples: 2999087640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:23,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 09:41:24,151][15349] Signal inference workers to stop experience collection... (44150 times) [2024-06-22 09:41:24,156][15349] Signal inference workers to resume experience collection... (44150 times) [2024-06-22 09:41:24,201][15401] InferenceWorker_p0-w0: stopping experience collection (44150 times) [2024-06-22 09:41:24,201][15401] InferenceWorker_p0-w0: resuming experience collection (44150 times) [2024-06-22 09:41:24,757][15401] Updated weights for policy 0, policy_version 183050 (0.0027) [2024-06-22 09:41:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2999222272. Throughput: 0: 42686.3. Samples: 2999348920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:28,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 09:41:28,627][15401] Updated weights for policy 0, policy_version 183060 (0.0041) [2024-06-22 09:41:32,790][15401] Updated weights for policy 0, policy_version 183070 (0.0045) [2024-06-22 09:41:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2999435264. Throughput: 0: 42755.5. Samples: 2999601320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 09:41:36,263][15401] Updated weights for policy 0, policy_version 183080 (0.0025) [2024-06-22 09:41:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2999631872. Throughput: 0: 42645.8. Samples: 2999726500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 09:41:40,363][15401] Updated weights for policy 0, policy_version 183090 (0.0031) [2024-06-22 09:41:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 2999861248. Throughput: 0: 42656.0. Samples: 2999985220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 09:41:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183098_2999877632.pth... [2024-06-22 09:41:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182472_2989621248.pth [2024-06-22 09:41:43,909][15401] Updated weights for policy 0, policy_version 183100 (0.0028) [2024-06-22 09:41:48,152][15401] Updated weights for policy 0, policy_version 183110 (0.0036) [2024-06-22 09:41:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3000074240. Throughput: 0: 42662.0. Samples: 3000240080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 09:41:51,711][15401] Updated weights for policy 0, policy_version 183120 (0.0042) [2024-06-22 09:41:53,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42871.0, 300 sec: 42820.5). Total num frames: 3000287232. Throughput: 0: 42683.4. Samples: 3000368840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:53,391][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 09:41:55,751][15401] Updated weights for policy 0, policy_version 183130 (0.0024) [2024-06-22 09:41:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3000516608. Throughput: 0: 42729.8. Samples: 3000628820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 09:41:58,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 09:41:59,541][15401] Updated weights for policy 0, policy_version 183140 (0.0042) [2024-06-22 09:42:03,370][15401] Updated weights for policy 0, policy_version 183150 (0.0034) [2024-06-22 09:42:03,389][15132] Fps is (10 sec: 44239.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3000729600. Throughput: 0: 42882.7. Samples: 3000886820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 09:42:07,251][15401] Updated weights for policy 0, policy_version 183160 (0.0033) [2024-06-22 09:42:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3000942592. Throughput: 0: 42813.8. Samples: 3001014260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:08,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 09:42:11,211][15401] Updated weights for policy 0, policy_version 183170 (0.0031) [2024-06-22 09:42:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3001139200. Throughput: 0: 42686.1. Samples: 3001269800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 09:42:14,792][15401] Updated weights for policy 0, policy_version 183180 (0.0046) [2024-06-22 09:42:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3001352192. Throughput: 0: 42831.5. Samples: 3001528740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 09:42:18,670][15401] Updated weights for policy 0, policy_version 183190 (0.0037) [2024-06-22 09:42:22,632][15401] Updated weights for policy 0, policy_version 183200 (0.0033) [2024-06-22 09:42:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3001581568. Throughput: 0: 42976.5. Samples: 3001660440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:23,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 09:42:26,194][15401] Updated weights for policy 0, policy_version 183210 (0.0028) [2024-06-22 09:42:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3001794560. Throughput: 0: 42816.4. Samples: 3001911960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:28,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 09:42:30,290][15401] Updated weights for policy 0, policy_version 183220 (0.0029) [2024-06-22 09:42:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 3002007552. Throughput: 0: 42894.1. Samples: 3002170320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 09:42:33,767][15401] Updated weights for policy 0, policy_version 183230 (0.0036) [2024-06-22 09:42:38,045][15401] Updated weights for policy 0, policy_version 183240 (0.0037) [2024-06-22 09:42:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3002204160. Throughput: 0: 42850.8. Samples: 3002297100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 09:42:41,290][15401] Updated weights for policy 0, policy_version 183250 (0.0041) [2024-06-22 09:42:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3002433536. Throughput: 0: 42763.1. Samples: 3002553160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 09:42:45,705][15401] Updated weights for policy 0, policy_version 183260 (0.0041) [2024-06-22 09:42:47,096][15349] Signal inference workers to stop experience collection... (44200 times) [2024-06-22 09:42:47,096][15349] Signal inference workers to resume experience collection... (44200 times) [2024-06-22 09:42:47,129][15401] InferenceWorker_p0-w0: stopping experience collection (44200 times) [2024-06-22 09:42:47,129][15401] InferenceWorker_p0-w0: resuming experience collection (44200 times) [2024-06-22 09:42:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 3002646528. Throughput: 0: 42669.3. Samples: 3002807040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:48,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 09:42:49,061][15401] Updated weights for policy 0, policy_version 183270 (0.0040) [2024-06-22 09:42:53,310][15401] Updated weights for policy 0, policy_version 183280 (0.0029) [2024-06-22 09:42:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.9, 300 sec: 42710.2). Total num frames: 3002859520. Throughput: 0: 42704.8. Samples: 3002935980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:53,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 09:42:56,637][15401] Updated weights for policy 0, policy_version 183290 (0.0042) [2024-06-22 09:42:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3003072512. Throughput: 0: 42617.4. Samples: 3003187580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:42:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 09:43:01,517][15401] Updated weights for policy 0, policy_version 183300 (0.0032) [2024-06-22 09:43:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3003301888. Throughput: 0: 42591.1. Samples: 3003445340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 09:43:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 09:43:04,594][15401] Updated weights for policy 0, policy_version 183310 (0.0038) [2024-06-22 09:43:08,394][15132] Fps is (10 sec: 40941.2, 60 sec: 42322.1, 300 sec: 42653.3). Total num frames: 3003482112. Throughput: 0: 42491.2. Samples: 3003572740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:08,395][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 09:43:09,140][15401] Updated weights for policy 0, policy_version 183320 (0.0037) [2024-06-22 09:43:12,369][15401] Updated weights for policy 0, policy_version 183330 (0.0029) [2024-06-22 09:43:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3003727872. Throughput: 0: 42610.6. Samples: 3003829440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 09:43:16,742][15401] Updated weights for policy 0, policy_version 183340 (0.0034) [2024-06-22 09:43:18,389][15132] Fps is (10 sec: 45895.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3003940864. Throughput: 0: 42497.4. Samples: 3004082700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 09:43:20,009][15401] Updated weights for policy 0, policy_version 183350 (0.0052) [2024-06-22 09:43:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3004137472. Throughput: 0: 42530.1. Samples: 3004210960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 09:43:24,218][15401] Updated weights for policy 0, policy_version 183360 (0.0040) [2024-06-22 09:43:27,600][15401] Updated weights for policy 0, policy_version 183370 (0.0035) [2024-06-22 09:43:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3004366848. Throughput: 0: 42764.0. Samples: 3004477540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 09:43:31,731][15401] Updated weights for policy 0, policy_version 183380 (0.0036) [2024-06-22 09:43:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3004579840. Throughput: 0: 42638.2. Samples: 3004725660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 09:43:35,337][15401] Updated weights for policy 0, policy_version 183390 (0.0038) [2024-06-22 09:43:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3004760064. Throughput: 0: 42669.8. Samples: 3004856120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 09:43:39,350][15401] Updated weights for policy 0, policy_version 183400 (0.0034) [2024-06-22 09:43:42,805][15401] Updated weights for policy 0, policy_version 183410 (0.0034) [2024-06-22 09:43:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3005005824. Throughput: 0: 42828.3. Samples: 3005114860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 09:43:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183411_3005005824.pth... [2024-06-22 09:43:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000182785_2994749440.pth [2024-06-22 09:43:46,889][15401] Updated weights for policy 0, policy_version 183420 (0.0040) [2024-06-22 09:43:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 3005218816. Throughput: 0: 42863.5. Samples: 3005374200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 09:43:50,593][15401] Updated weights for policy 0, policy_version 183430 (0.0038) [2024-06-22 09:43:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3005415424. Throughput: 0: 42815.4. Samples: 3005499240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:53,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 09:43:54,283][15401] Updated weights for policy 0, policy_version 183440 (0.0023) [2024-06-22 09:43:58,291][15401] Updated weights for policy 0, policy_version 183450 (0.0027) [2024-06-22 09:43:58,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 3005644800. Throughput: 0: 42772.2. Samples: 3005754460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:43:58,396][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 09:44:02,092][15401] Updated weights for policy 0, policy_version 183460 (0.0031) [2024-06-22 09:44:02,908][15349] Signal inference workers to stop experience collection... (44250 times) [2024-06-22 09:44:02,909][15349] Signal inference workers to resume experience collection... (44250 times) [2024-06-22 09:44:02,941][15401] InferenceWorker_p0-w0: stopping experience collection (44250 times) [2024-06-22 09:44:02,941][15401] InferenceWorker_p0-w0: resuming experience collection (44250 times) [2024-06-22 09:44:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3005857792. Throughput: 0: 42919.5. Samples: 3006014080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:44:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 09:44:05,762][15401] Updated weights for policy 0, policy_version 183470 (0.0040) [2024-06-22 09:44:08,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42874.6, 300 sec: 42709.5). Total num frames: 3006054400. Throughput: 0: 42900.9. Samples: 3006141500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:44:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 09:44:09,472][15401] Updated weights for policy 0, policy_version 183480 (0.0050) [2024-06-22 09:44:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3006283776. Throughput: 0: 42700.4. Samples: 3006399060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 09:44:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 09:44:13,558][15401] Updated weights for policy 0, policy_version 183490 (0.0028) [2024-06-22 09:44:17,254][15401] Updated weights for policy 0, policy_version 183500 (0.0027) [2024-06-22 09:44:18,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 3006496768. Throughput: 0: 42975.6. Samples: 3006659660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:18,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 09:44:21,137][15401] Updated weights for policy 0, policy_version 183510 (0.0030) [2024-06-22 09:44:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3006676992. Throughput: 0: 42845.3. Samples: 3006784160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 09:44:25,222][15401] Updated weights for policy 0, policy_version 183520 (0.0038) [2024-06-22 09:44:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3006922752. Throughput: 0: 42806.2. Samples: 3007041140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:28,394][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 09:44:28,760][15401] Updated weights for policy 0, policy_version 183530 (0.0041) [2024-06-22 09:44:32,912][15401] Updated weights for policy 0, policy_version 183540 (0.0041) [2024-06-22 09:44:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3007135744. Throughput: 0: 42827.3. Samples: 3007301420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 09:44:36,323][15401] Updated weights for policy 0, policy_version 183550 (0.0032) [2024-06-22 09:44:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3007332352. Throughput: 0: 42890.3. Samples: 3007429300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 09:44:40,399][15401] Updated weights for policy 0, policy_version 183560 (0.0024) [2024-06-22 09:44:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3007578112. Throughput: 0: 42956.7. Samples: 3007687340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:43,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 09:44:43,854][15401] Updated weights for policy 0, policy_version 183570 (0.0033) [2024-06-22 09:44:47,833][15401] Updated weights for policy 0, policy_version 183580 (0.0032) [2024-06-22 09:44:48,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3007791104. Throughput: 0: 42930.2. Samples: 3007946040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:48,393][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 09:44:51,437][15401] Updated weights for policy 0, policy_version 183590 (0.0033) [2024-06-22 09:44:53,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3007987712. Throughput: 0: 43046.3. Samples: 3008078580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 09:44:55,423][15401] Updated weights for policy 0, policy_version 183600 (0.0031) [2024-06-22 09:44:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 3008233472. Throughput: 0: 43048.0. Samples: 3008336220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:44:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 09:44:58,934][15401] Updated weights for policy 0, policy_version 183610 (0.0041) [2024-06-22 09:45:03,098][15401] Updated weights for policy 0, policy_version 183620 (0.0043) [2024-06-22 09:45:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3008430080. Throughput: 0: 42908.0. Samples: 3008590420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:45:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 09:45:07,000][15401] Updated weights for policy 0, policy_version 183630 (0.0032) [2024-06-22 09:45:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3008626688. Throughput: 0: 42958.3. Samples: 3008717280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:45:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 09:45:10,592][15401] Updated weights for policy 0, policy_version 183640 (0.0040) [2024-06-22 09:45:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3008872448. Throughput: 0: 43184.9. Samples: 3008984460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:45:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 09:45:14,442][15401] Updated weights for policy 0, policy_version 183650 (0.0027) [2024-06-22 09:45:18,252][15401] Updated weights for policy 0, policy_version 183660 (0.0033) [2024-06-22 09:45:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42820.9). Total num frames: 3009085440. Throughput: 0: 42919.5. Samples: 3009232800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 09:45:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 09:45:22,126][15401] Updated weights for policy 0, policy_version 183670 (0.0035) [2024-06-22 09:45:22,152][15349] Signal inference workers to stop experience collection... (44300 times) [2024-06-22 09:45:22,152][15349] Signal inference workers to resume experience collection... (44300 times) [2024-06-22 09:45:22,170][15401] InferenceWorker_p0-w0: stopping experience collection (44300 times) [2024-06-22 09:45:22,170][15401] InferenceWorker_p0-w0: resuming experience collection (44300 times) [2024-06-22 09:45:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 3009282048. Throughput: 0: 42907.9. Samples: 3009360160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:23,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-22 09:45:25,940][15401] Updated weights for policy 0, policy_version 183680 (0.0046) [2024-06-22 09:45:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3009495040. Throughput: 0: 43086.3. Samples: 3009626120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 09:45:29,562][15401] Updated weights for policy 0, policy_version 183690 (0.0041) [2024-06-22 09:45:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3009724416. Throughput: 0: 42863.1. Samples: 3009874780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 09:45:33,629][15401] Updated weights for policy 0, policy_version 183700 (0.0041) [2024-06-22 09:45:37,389][15401] Updated weights for policy 0, policy_version 183710 (0.0033) [2024-06-22 09:45:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 3009937408. Throughput: 0: 42808.0. Samples: 3010004940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 09:45:41,254][15401] Updated weights for policy 0, policy_version 183720 (0.0032) [2024-06-22 09:45:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 3010101248. Throughput: 0: 42737.4. Samples: 3010259400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 09:45:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183723_3010117632.pth... [2024-06-22 09:45:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183098_2999877632.pth [2024-06-22 09:45:45,276][15401] Updated weights for policy 0, policy_version 183730 (0.0046) [2024-06-22 09:45:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3010363392. Throughput: 0: 42526.7. Samples: 3010504120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 09:45:48,986][15401] Updated weights for policy 0, policy_version 183740 (0.0041) [2024-06-22 09:45:52,883][15401] Updated weights for policy 0, policy_version 183750 (0.0038) [2024-06-22 09:45:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3010576384. Throughput: 0: 42819.4. Samples: 3010644160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 09:45:56,743][15401] Updated weights for policy 0, policy_version 183760 (0.0042) [2024-06-22 09:45:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 3010740224. Throughput: 0: 42373.0. Samples: 3010891240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:45:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 09:46:00,944][15401] Updated weights for policy 0, policy_version 183770 (0.0044) [2024-06-22 09:46:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3010969600. Throughput: 0: 42471.1. Samples: 3011144000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:46:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 09:46:04,279][15401] Updated weights for policy 0, policy_version 183780 (0.0042) [2024-06-22 09:46:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3011198976. Throughput: 0: 42620.0. Samples: 3011278060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:46:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 09:46:08,474][15401] Updated weights for policy 0, policy_version 183790 (0.0037) [2024-06-22 09:46:12,166][15401] Updated weights for policy 0, policy_version 183800 (0.0041) [2024-06-22 09:46:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3011395584. Throughput: 0: 42129.7. Samples: 3011521960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:46:13,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 09:46:16,094][15401] Updated weights for policy 0, policy_version 183810 (0.0032) [2024-06-22 09:46:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3011624960. Throughput: 0: 42272.6. Samples: 3011777040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:46:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 09:46:20,218][15401] Updated weights for policy 0, policy_version 183820 (0.0042) [2024-06-22 09:46:23,361][15349] Signal inference workers to stop experience collection... (44350 times) [2024-06-22 09:46:23,362][15349] Signal inference workers to resume experience collection... (44350 times) [2024-06-22 09:46:23,378][15401] InferenceWorker_p0-w0: stopping experience collection (44350 times) [2024-06-22 09:46:23,378][15401] InferenceWorker_p0-w0: resuming experience collection (44350 times) [2024-06-22 09:46:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3011821568. Throughput: 0: 42486.3. Samples: 3011916820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-22 09:46:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 09:46:23,838][15401] Updated weights for policy 0, policy_version 183830 (0.0034) [2024-06-22 09:46:27,792][15401] Updated weights for policy 0, policy_version 183840 (0.0024) [2024-06-22 09:46:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3012034560. Throughput: 0: 42259.9. Samples: 3012161100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 09:46:31,725][15401] Updated weights for policy 0, policy_version 183850 (0.0037) [2024-06-22 09:46:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 3012263936. Throughput: 0: 42559.2. Samples: 3012419280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 09:46:35,304][15401] Updated weights for policy 0, policy_version 183860 (0.0035) [2024-06-22 09:46:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3012460544. Throughput: 0: 42465.5. Samples: 3012555100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 09:46:39,354][15401] Updated weights for policy 0, policy_version 183870 (0.0043) [2024-06-22 09:46:42,903][15401] Updated weights for policy 0, policy_version 183880 (0.0038) [2024-06-22 09:46:43,396][15132] Fps is (10 sec: 42570.9, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 3012689920. Throughput: 0: 42466.3. Samples: 3012802500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:43,396][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 09:46:47,070][15401] Updated weights for policy 0, policy_version 183890 (0.0040) [2024-06-22 09:46:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3012919296. Throughput: 0: 42565.2. Samples: 3013059440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 09:46:50,556][15401] Updated weights for policy 0, policy_version 183900 (0.0045) [2024-06-22 09:46:53,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 3013099520. Throughput: 0: 42511.6. Samples: 3013191080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:53,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 09:46:54,784][15401] Updated weights for policy 0, policy_version 183910 (0.0039) [2024-06-22 09:46:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3013328896. Throughput: 0: 42727.9. Samples: 3013444720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:46:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 09:46:58,533][15401] Updated weights for policy 0, policy_version 183920 (0.0032) [2024-06-22 09:47:02,417][15401] Updated weights for policy 0, policy_version 183930 (0.0032) [2024-06-22 09:47:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 3013574656. Throughput: 0: 42675.8. Samples: 3013697460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 09:47:05,973][15401] Updated weights for policy 0, policy_version 183940 (0.0032) [2024-06-22 09:47:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3013722112. Throughput: 0: 42426.1. Samples: 3013826000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 09:47:10,237][15401] Updated weights for policy 0, policy_version 183950 (0.0041) [2024-06-22 09:47:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3013984256. Throughput: 0: 42789.0. Samples: 3014086600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 09:47:13,397][15401] Updated weights for policy 0, policy_version 183960 (0.0042) [2024-06-22 09:47:17,836][15401] Updated weights for policy 0, policy_version 183970 (0.0056) [2024-06-22 09:47:18,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3014197248. Throughput: 0: 42735.0. Samples: 3014342360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 09:47:21,428][15401] Updated weights for policy 0, policy_version 183980 (0.0049) [2024-06-22 09:47:23,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3014361088. Throughput: 0: 42532.8. Samples: 3014469080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 09:47:25,457][15401] Updated weights for policy 0, policy_version 183990 (0.0033) [2024-06-22 09:47:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3014623232. Throughput: 0: 42735.0. Samples: 3014725300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 09:47:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 09:47:28,999][15401] Updated weights for policy 0, policy_version 184000 (0.0036) [2024-06-22 09:47:32,408][15349] Signal inference workers to stop experience collection... (44400 times) [2024-06-22 09:47:32,440][15401] InferenceWorker_p0-w0: stopping experience collection (44400 times) [2024-06-22 09:47:32,523][15349] Signal inference workers to resume experience collection... (44400 times) [2024-06-22 09:47:32,524][15401] InferenceWorker_p0-w0: resuming experience collection (44400 times) [2024-06-22 09:47:33,137][15401] Updated weights for policy 0, policy_version 184010 (0.0027) [2024-06-22 09:47:33,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3014836224. Throughput: 0: 42826.3. Samples: 3014986620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 09:47:37,017][15401] Updated weights for policy 0, policy_version 184020 (0.0040) [2024-06-22 09:47:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3015000064. Throughput: 0: 42655.5. Samples: 3015110580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 09:47:40,682][15401] Updated weights for policy 0, policy_version 184030 (0.0047) [2024-06-22 09:47:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43149.1, 300 sec: 42820.9). Total num frames: 3015278592. Throughput: 0: 42637.4. Samples: 3015363400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:43,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 09:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184038_3015278592.pth... [2024-06-22 09:47:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183411_3005005824.pth [2024-06-22 09:47:44,626][15401] Updated weights for policy 0, policy_version 184040 (0.0035) [2024-06-22 09:47:48,359][15401] Updated weights for policy 0, policy_version 184050 (0.0021) [2024-06-22 09:47:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3015475200. Throughput: 0: 42884.6. Samples: 3015627260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 09:47:52,150][15401] Updated weights for policy 0, policy_version 184060 (0.0043) [2024-06-22 09:47:53,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3015639040. Throughput: 0: 42660.9. Samples: 3015745740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 09:47:55,934][15401] Updated weights for policy 0, policy_version 184070 (0.0030) [2024-06-22 09:47:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3015901184. Throughput: 0: 42676.0. Samples: 3016007020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:47:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 09:47:59,574][15401] Updated weights for policy 0, policy_version 184080 (0.0029) [2024-06-22 09:48:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42052.4, 300 sec: 42765.7). Total num frames: 3016097792. Throughput: 0: 42945.1. Samples: 3016274880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 09:48:03,594][15401] Updated weights for policy 0, policy_version 184090 (0.0033) [2024-06-22 09:48:07,149][15401] Updated weights for policy 0, policy_version 184100 (0.0033) [2024-06-22 09:48:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3016294400. Throughput: 0: 42841.9. Samples: 3016396960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 09:48:11,125][15401] Updated weights for policy 0, policy_version 184110 (0.0031) [2024-06-22 09:48:13,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3016556544. Throughput: 0: 42950.2. Samples: 3016658060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 09:48:14,779][15401] Updated weights for policy 0, policy_version 184120 (0.0028) [2024-06-22 09:48:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3016736768. Throughput: 0: 42928.6. Samples: 3016918400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 09:48:19,035][15401] Updated weights for policy 0, policy_version 184130 (0.0031) [2024-06-22 09:48:22,964][15401] Updated weights for policy 0, policy_version 184140 (0.0032) [2024-06-22 09:48:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3016949760. Throughput: 0: 42913.7. Samples: 3017041700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 09:48:26,659][15401] Updated weights for policy 0, policy_version 184150 (0.0028) [2024-06-22 09:48:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3017211904. Throughput: 0: 43061.4. Samples: 3017301160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 09:48:30,533][15401] Updated weights for policy 0, policy_version 184160 (0.0042) [2024-06-22 09:48:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3017375744. Throughput: 0: 43074.1. Samples: 3017565600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 09:48:34,411][15401] Updated weights for policy 0, policy_version 184170 (0.0025) [2024-06-22 09:48:34,429][15349] Signal inference workers to stop experience collection... (44450 times) [2024-06-22 09:48:34,429][15349] Signal inference workers to resume experience collection... (44450 times) [2024-06-22 09:48:34,441][15401] InferenceWorker_p0-w0: stopping experience collection (44450 times) [2024-06-22 09:48:34,441][15401] InferenceWorker_p0-w0: resuming experience collection (44450 times) [2024-06-22 09:48:38,079][15401] Updated weights for policy 0, policy_version 184180 (0.0046) [2024-06-22 09:48:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 3017605120. Throughput: 0: 43048.0. Samples: 3017682900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:48:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 09:48:42,038][15401] Updated weights for policy 0, policy_version 184190 (0.0027) [2024-06-22 09:48:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3017850880. Throughput: 0: 43173.2. Samples: 3017949820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:48:43,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 09:48:45,667][15401] Updated weights for policy 0, policy_version 184200 (0.0046) [2024-06-22 09:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3018031104. Throughput: 0: 42928.8. Samples: 3018206680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:48:48,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-22 09:48:49,773][15401] Updated weights for policy 0, policy_version 184210 (0.0042) [2024-06-22 09:48:53,382][15401] Updated weights for policy 0, policy_version 184220 (0.0034) [2024-06-22 09:48:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43690.8, 300 sec: 42766.0). Total num frames: 3018260480. Throughput: 0: 42906.7. Samples: 3018327760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:48:53,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-22 09:48:57,399][15401] Updated weights for policy 0, policy_version 184230 (0.0034) [2024-06-22 09:48:58,391][15132] Fps is (10 sec: 45867.3, 60 sec: 43143.3, 300 sec: 42820.3). Total num frames: 3018489856. Throughput: 0: 42974.4. Samples: 3018591980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:48:58,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 09:49:00,929][15401] Updated weights for policy 0, policy_version 184240 (0.0034) [2024-06-22 09:49:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3018670080. Throughput: 0: 42950.1. Samples: 3018851160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:49:05,049][15401] Updated weights for policy 0, policy_version 184250 (0.0035) [2024-06-22 09:49:08,390][15132] Fps is (10 sec: 40966.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3018899456. Throughput: 0: 42898.7. Samples: 3018972140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:08,391][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 09:49:08,738][15401] Updated weights for policy 0, policy_version 184260 (0.0026) [2024-06-22 09:49:12,603][15401] Updated weights for policy 0, policy_version 184270 (0.0031) [2024-06-22 09:49:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3019112448. Throughput: 0: 42957.4. Samples: 3019234240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 09:49:16,551][15401] Updated weights for policy 0, policy_version 184280 (0.0040) [2024-06-22 09:49:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3019325440. Throughput: 0: 42661.3. Samples: 3019485360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 09:49:20,197][15401] Updated weights for policy 0, policy_version 184290 (0.0036) [2024-06-22 09:49:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 3019538432. Throughput: 0: 42964.8. Samples: 3019616420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:23,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 09:49:24,173][15401] Updated weights for policy 0, policy_version 184300 (0.0026) [2024-06-22 09:49:28,034][15401] Updated weights for policy 0, policy_version 184310 (0.0024) [2024-06-22 09:49:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3019735040. Throughput: 0: 42605.4. Samples: 3019867060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 09:49:31,861][15401] Updated weights for policy 0, policy_version 184320 (0.0031) [2024-06-22 09:49:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3019948032. Throughput: 0: 42504.4. Samples: 3020119380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 09:49:35,639][15401] Updated weights for policy 0, policy_version 184330 (0.0030) [2024-06-22 09:49:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3020177408. Throughput: 0: 42774.6. Samples: 3020252620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 09:49:39,428][15401] Updated weights for policy 0, policy_version 184340 (0.0032) [2024-06-22 09:49:43,114][15401] Updated weights for policy 0, policy_version 184350 (0.0030) [2024-06-22 09:49:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42320.8, 300 sec: 42708.9). Total num frames: 3020390400. Throughput: 0: 42596.4. Samples: 3020509020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 09:49:43,397][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 09:49:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184350_3020390400.pth... [2024-06-22 09:49:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000183723_3010117632.pth [2024-06-22 09:49:46,957][15401] Updated weights for policy 0, policy_version 184360 (0.0032) [2024-06-22 09:49:48,391][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3020603392. Throughput: 0: 42660.4. Samples: 3020770880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:49:48,391][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 09:49:51,214][15401] Updated weights for policy 0, policy_version 184370 (0.0029) [2024-06-22 09:49:53,392][15132] Fps is (10 sec: 42615.7, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 3020816384. Throughput: 0: 42837.3. Samples: 3020899920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:49:53,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 09:49:54,472][15401] Updated weights for policy 0, policy_version 184380 (0.0035) [2024-06-22 09:49:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.5, 300 sec: 42709.5). Total num frames: 3021029376. Throughput: 0: 42612.8. Samples: 3021151820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:49:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 09:49:58,819][15401] Updated weights for policy 0, policy_version 184390 (0.0031) [2024-06-22 09:49:59,747][15349] Signal inference workers to stop experience collection... (44500 times) [2024-06-22 09:49:59,747][15349] Signal inference workers to resume experience collection... (44500 times) [2024-06-22 09:49:59,788][15401] InferenceWorker_p0-w0: stopping experience collection (44500 times) [2024-06-22 09:49:59,789][15401] InferenceWorker_p0-w0: resuming experience collection (44500 times) [2024-06-22 09:50:02,461][15401] Updated weights for policy 0, policy_version 184400 (0.0032) [2024-06-22 09:50:03,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3021258752. Throughput: 0: 42752.4. Samples: 3021409220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 09:50:06,531][15401] Updated weights for policy 0, policy_version 184410 (0.0038) [2024-06-22 09:50:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3021471744. Throughput: 0: 42850.7. Samples: 3021544600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 09:50:10,284][15401] Updated weights for policy 0, policy_version 184420 (0.0029) [2024-06-22 09:50:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3021668352. Throughput: 0: 42760.7. Samples: 3021791300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 09:50:14,123][15401] Updated weights for policy 0, policy_version 184430 (0.0031) [2024-06-22 09:50:17,882][15401] Updated weights for policy 0, policy_version 184440 (0.0033) [2024-06-22 09:50:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3021897728. Throughput: 0: 42892.1. Samples: 3022049520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 09:50:22,037][15401] Updated weights for policy 0, policy_version 184450 (0.0037) [2024-06-22 09:50:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 3022094336. Throughput: 0: 42681.8. Samples: 3022173300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:23,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-22 09:50:25,714][15401] Updated weights for policy 0, policy_version 184460 (0.0030) [2024-06-22 09:50:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3022323712. Throughput: 0: 42669.2. Samples: 3022428860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 09:50:29,485][15401] Updated weights for policy 0, policy_version 184470 (0.0028) [2024-06-22 09:50:33,343][15401] Updated weights for policy 0, policy_version 184480 (0.0029) [2024-06-22 09:50:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3022520320. Throughput: 0: 42613.5. Samples: 3022688480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 09:50:37,080][15401] Updated weights for policy 0, policy_version 184490 (0.0042) [2024-06-22 09:50:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3022733312. Throughput: 0: 42471.6. Samples: 3022811040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 09:50:40,882][15401] Updated weights for policy 0, policy_version 184500 (0.0033) [2024-06-22 09:50:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 3022946304. Throughput: 0: 42588.5. Samples: 3023068300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 09:50:44,669][15401] Updated weights for policy 0, policy_version 184510 (0.0031) [2024-06-22 09:50:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3023159296. Throughput: 0: 42545.9. Samples: 3023323780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-22 09:50:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 09:50:48,591][15401] Updated weights for policy 0, policy_version 184520 (0.0043) [2024-06-22 09:50:52,351][15401] Updated weights for policy 0, policy_version 184530 (0.0045) [2024-06-22 09:50:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 3023355904. Throughput: 0: 42306.2. Samples: 3023448380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:50:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 09:50:56,367][15401] Updated weights for policy 0, policy_version 184540 (0.0038) [2024-06-22 09:50:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3023601664. Throughput: 0: 42710.6. Samples: 3023713280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:50:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 09:50:59,952][15401] Updated weights for policy 0, policy_version 184550 (0.0031) [2024-06-22 09:51:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3023814656. Throughput: 0: 42592.8. Samples: 3023966200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 09:51:03,969][15401] Updated weights for policy 0, policy_version 184560 (0.0029) [2024-06-22 09:51:07,603][15401] Updated weights for policy 0, policy_version 184570 (0.0036) [2024-06-22 09:51:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3024011264. Throughput: 0: 42627.5. Samples: 3024091540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 09:51:12,096][15401] Updated weights for policy 0, policy_version 184580 (0.0040) [2024-06-22 09:51:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3024240640. Throughput: 0: 42713.3. Samples: 3024350960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 09:51:15,347][15401] Updated weights for policy 0, policy_version 184590 (0.0031) [2024-06-22 09:51:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3024437248. Throughput: 0: 42539.4. Samples: 3024602760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 09:51:19,825][15401] Updated weights for policy 0, policy_version 184600 (0.0036) [2024-06-22 09:51:22,923][15401] Updated weights for policy 0, policy_version 184610 (0.0047) [2024-06-22 09:51:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3024666624. Throughput: 0: 42614.6. Samples: 3024728700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 09:51:27,310][15401] Updated weights for policy 0, policy_version 184620 (0.0031) [2024-06-22 09:51:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3024863232. Throughput: 0: 42722.3. Samples: 3024990800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 09:51:30,711][15401] Updated weights for policy 0, policy_version 184630 (0.0040) [2024-06-22 09:51:31,532][15349] Signal inference workers to stop experience collection... (44550 times) [2024-06-22 09:51:31,533][15349] Signal inference workers to resume experience collection... (44550 times) [2024-06-22 09:51:31,542][15401] InferenceWorker_p0-w0: stopping experience collection (44550 times) [2024-06-22 09:51:31,543][15401] InferenceWorker_p0-w0: resuming experience collection (44550 times) [2024-06-22 09:51:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 3025076224. Throughput: 0: 42645.7. Samples: 3025242940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:33,392][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 09:51:34,940][15401] Updated weights for policy 0, policy_version 184640 (0.0026) [2024-06-22 09:51:38,164][15401] Updated weights for policy 0, policy_version 184650 (0.0033) [2024-06-22 09:51:38,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.7, 300 sec: 42765.6). Total num frames: 3025305600. Throughput: 0: 42806.7. Samples: 3025374780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:38,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 09:51:42,529][15401] Updated weights for policy 0, policy_version 184660 (0.0029) [2024-06-22 09:51:43,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3025502208. Throughput: 0: 42728.1. Samples: 3025636040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 09:51:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184663_3025518592.pth... [2024-06-22 09:51:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184038_3015278592.pth [2024-06-22 09:51:45,612][15401] Updated weights for policy 0, policy_version 184670 (0.0026) [2024-06-22 09:51:48,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3025715200. Throughput: 0: 42691.7. Samples: 3025887320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:48,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 09:51:50,494][15401] Updated weights for policy 0, policy_version 184680 (0.0033) [2024-06-22 09:51:53,103][15401] Updated weights for policy 0, policy_version 184690 (0.0039) [2024-06-22 09:51:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3025960960. Throughput: 0: 42731.1. Samples: 3026014440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-22 09:51:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 09:51:57,871][15401] Updated weights for policy 0, policy_version 184700 (0.0033) [2024-06-22 09:51:58,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3026141184. Throughput: 0: 42569.7. Samples: 3026266600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:51:58,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 09:52:01,275][15401] Updated weights for policy 0, policy_version 184710 (0.0034) [2024-06-22 09:52:03,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 3026337792. Throughput: 0: 42682.3. Samples: 3026523460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 09:52:05,370][15401] Updated weights for policy 0, policy_version 184720 (0.0036) [2024-06-22 09:52:08,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3026567168. Throughput: 0: 42646.8. Samples: 3026647800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:08,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 09:52:09,017][15401] Updated weights for policy 0, policy_version 184730 (0.0050) [2024-06-22 09:52:12,858][15401] Updated weights for policy 0, policy_version 184740 (0.0044) [2024-06-22 09:52:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3026780160. Throughput: 0: 42631.5. Samples: 3026909220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 09:52:16,864][15401] Updated weights for policy 0, policy_version 184750 (0.0033) [2024-06-22 09:52:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3026993152. Throughput: 0: 42542.6. Samples: 3027157260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 09:52:20,687][15401] Updated weights for policy 0, policy_version 184760 (0.0025) [2024-06-22 09:52:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3027189760. Throughput: 0: 42409.9. Samples: 3027283120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 09:52:24,291][15401] Updated weights for policy 0, policy_version 184770 (0.0030) [2024-06-22 09:52:28,338][15401] Updated weights for policy 0, policy_version 184780 (0.0032) [2024-06-22 09:52:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3027435520. Throughput: 0: 42461.6. Samples: 3027546820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 09:52:32,410][15401] Updated weights for policy 0, policy_version 184790 (0.0043) [2024-06-22 09:52:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 3027632128. Throughput: 0: 42429.2. Samples: 3027796640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 09:52:35,908][15401] Updated weights for policy 0, policy_version 184800 (0.0040) [2024-06-22 09:52:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 3027845120. Throughput: 0: 42362.3. Samples: 3027920740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 09:52:39,994][15401] Updated weights for policy 0, policy_version 184810 (0.0028) [2024-06-22 09:52:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3028074496. Throughput: 0: 42631.8. Samples: 3028185020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 09:52:43,551][15401] Updated weights for policy 0, policy_version 184820 (0.0027) [2024-06-22 09:52:47,573][15401] Updated weights for policy 0, policy_version 184830 (0.0037) [2024-06-22 09:52:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3028271104. Throughput: 0: 42569.9. Samples: 3028439100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 09:52:51,542][15401] Updated weights for policy 0, policy_version 184840 (0.0036) [2024-06-22 09:52:52,703][15349] Signal inference workers to stop experience collection... (44600 times) [2024-06-22 09:52:52,754][15401] InferenceWorker_p0-w0: stopping experience collection (44600 times) [2024-06-22 09:52:52,761][15349] Signal inference workers to resume experience collection... (44600 times) [2024-06-22 09:52:52,767][15401] InferenceWorker_p0-w0: resuming experience collection (44600 times) [2024-06-22 09:52:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3028500480. Throughput: 0: 42769.2. Samples: 3028572420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 09:52:55,171][15401] Updated weights for policy 0, policy_version 184850 (0.0022) [2024-06-22 09:52:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3028680704. Throughput: 0: 42638.7. Samples: 3028827960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 09:52:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 09:52:59,487][15401] Updated weights for policy 0, policy_version 184860 (0.0034) [2024-06-22 09:53:02,864][15401] Updated weights for policy 0, policy_version 184870 (0.0032) [2024-06-22 09:53:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3028926464. Throughput: 0: 42493.9. Samples: 3029069480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:03,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 09:53:07,043][15401] Updated weights for policy 0, policy_version 184880 (0.0024) [2024-06-22 09:53:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 3029139456. Throughput: 0: 42690.9. Samples: 3029204220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 09:53:10,478][15401] Updated weights for policy 0, policy_version 184890 (0.0040) [2024-06-22 09:53:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3029319680. Throughput: 0: 42637.0. Samples: 3029465480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 09:53:14,682][15401] Updated weights for policy 0, policy_version 184900 (0.0034) [2024-06-22 09:53:18,240][15401] Updated weights for policy 0, policy_version 184910 (0.0037) [2024-06-22 09:53:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3029565440. Throughput: 0: 42652.8. Samples: 3029716120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:18,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 09:53:22,305][15401] Updated weights for policy 0, policy_version 184920 (0.0031) [2024-06-22 09:53:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3029778432. Throughput: 0: 42779.1. Samples: 3029845800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 09:53:25,973][15401] Updated weights for policy 0, policy_version 184930 (0.0038) [2024-06-22 09:53:28,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 3029975040. Throughput: 0: 42627.9. Samples: 3030103380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:28,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 09:53:29,754][15401] Updated weights for policy 0, policy_version 184940 (0.0027) [2024-06-22 09:53:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3030204416. Throughput: 0: 42796.8. Samples: 3030364960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 09:53:33,469][15401] Updated weights for policy 0, policy_version 184950 (0.0045) [2024-06-22 09:53:37,275][15401] Updated weights for policy 0, policy_version 184960 (0.0044) [2024-06-22 09:53:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3030417408. Throughput: 0: 42725.9. Samples: 3030495080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:53:41,022][15401] Updated weights for policy 0, policy_version 184970 (0.0035) [2024-06-22 09:53:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3030630400. Throughput: 0: 42806.2. Samples: 3030754240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 09:53:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184975_3030630400.pth... [2024-06-22 09:53:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184350_3020390400.pth [2024-06-22 09:53:44,783][15401] Updated weights for policy 0, policy_version 184980 (0.0044) [2024-06-22 09:53:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3030843392. Throughput: 0: 43216.9. Samples: 3031014240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 09:53:48,919][15401] Updated weights for policy 0, policy_version 184990 (0.0029) [2024-06-22 09:53:52,238][15401] Updated weights for policy 0, policy_version 185000 (0.0028) [2024-06-22 09:53:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 3031072768. Throughput: 0: 43137.0. Samples: 3031145380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:53,399][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 09:53:56,462][15401] Updated weights for policy 0, policy_version 185010 (0.0047) [2024-06-22 09:53:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3031269376. Throughput: 0: 42974.3. Samples: 3031399320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:53:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 09:54:00,194][15401] Updated weights for policy 0, policy_version 185020 (0.0033) [2024-06-22 09:54:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3031482368. Throughput: 0: 43270.2. Samples: 3031663180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 09:54:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 09:54:03,991][15401] Updated weights for policy 0, policy_version 185030 (0.0024) [2024-06-22 09:54:07,755][15401] Updated weights for policy 0, policy_version 185040 (0.0026) [2024-06-22 09:54:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3031728128. Throughput: 0: 43189.3. Samples: 3031789320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 09:54:11,447][15401] Updated weights for policy 0, policy_version 185050 (0.0040) [2024-06-22 09:54:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 3031908352. Throughput: 0: 43250.8. Samples: 3032049560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 09:54:13,520][15349] Signal inference workers to stop experience collection... (44650 times) [2024-06-22 09:54:13,521][15349] Signal inference workers to resume experience collection... (44650 times) [2024-06-22 09:54:13,535][15401] InferenceWorker_p0-w0: stopping experience collection (44650 times) [2024-06-22 09:54:13,554][15401] InferenceWorker_p0-w0: resuming experience collection (44650 times) [2024-06-22 09:54:15,261][15401] Updated weights for policy 0, policy_version 185060 (0.0043) [2024-06-22 09:54:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42709.8). Total num frames: 3032137728. Throughput: 0: 43003.0. Samples: 3032300100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 09:54:19,047][15401] Updated weights for policy 0, policy_version 185070 (0.0041) [2024-06-22 09:54:22,946][15401] Updated weights for policy 0, policy_version 185080 (0.0031) [2024-06-22 09:54:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3032367104. Throughput: 0: 43026.9. Samples: 3032431300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 09:54:27,310][15401] Updated weights for policy 0, policy_version 185090 (0.0029) [2024-06-22 09:54:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 3032547328. Throughput: 0: 43027.1. Samples: 3032690460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 09:54:30,635][15401] Updated weights for policy 0, policy_version 185100 (0.0028) [2024-06-22 09:54:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 3032776704. Throughput: 0: 42793.0. Samples: 3032939940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:33,396][15132] Avg episode reward: [(0, '0.175')] [2024-06-22 09:54:34,809][15401] Updated weights for policy 0, policy_version 185110 (0.0027) [2024-06-22 09:54:38,184][15401] Updated weights for policy 0, policy_version 185120 (0.0027) [2024-06-22 09:54:38,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.6, 300 sec: 42821.5). Total num frames: 3033022464. Throughput: 0: 42898.8. Samples: 3033075820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 09:54:42,327][15401] Updated weights for policy 0, policy_version 185130 (0.0037) [2024-06-22 09:54:43,392][15132] Fps is (10 sec: 42589.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3033202688. Throughput: 0: 42882.6. Samples: 3033329140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:43,392][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 09:54:45,876][15401] Updated weights for policy 0, policy_version 185140 (0.0040) [2024-06-22 09:54:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3033415680. Throughput: 0: 42578.4. Samples: 3033579200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 09:54:49,983][15401] Updated weights for policy 0, policy_version 185150 (0.0049) [2024-06-22 09:54:53,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3033628672. Throughput: 0: 42708.5. Samples: 3033711200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:53,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 09:54:53,907][15401] Updated weights for policy 0, policy_version 185160 (0.0030) [2024-06-22 09:54:57,648][15401] Updated weights for policy 0, policy_version 185170 (0.0031) [2024-06-22 09:54:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3033841664. Throughput: 0: 42452.9. Samples: 3033959940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:54:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 09:55:01,663][15401] Updated weights for policy 0, policy_version 185180 (0.0032) [2024-06-22 09:55:03,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 3034054656. Throughput: 0: 42566.5. Samples: 3034215860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:55:03,396][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 09:55:05,262][15401] Updated weights for policy 0, policy_version 185190 (0.0036) [2024-06-22 09:55:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3034251264. Throughput: 0: 42569.8. Samples: 3034346940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:55:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 09:55:09,333][15401] Updated weights for policy 0, policy_version 185200 (0.0029) [2024-06-22 09:55:12,860][15401] Updated weights for policy 0, policy_version 185210 (0.0047) [2024-06-22 09:55:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3034480640. Throughput: 0: 42437.8. Samples: 3034600160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 09:55:13,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 09:55:17,273][15349] Signal inference workers to stop experience collection... (44700 times) [2024-06-22 09:55:17,310][15401] InferenceWorker_p0-w0: stopping experience collection (44700 times) [2024-06-22 09:55:17,330][15349] Signal inference workers to resume experience collection... (44700 times) [2024-06-22 09:55:17,331][15401] InferenceWorker_p0-w0: resuming experience collection (44700 times) [2024-06-22 09:55:17,338][15401] Updated weights for policy 0, policy_version 185220 (0.0047) [2024-06-22 09:55:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3034693632. Throughput: 0: 42367.8. Samples: 3034846480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 09:55:21,050][15401] Updated weights for policy 0, policy_version 185230 (0.0047) [2024-06-22 09:55:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3034890240. Throughput: 0: 42202.5. Samples: 3034974940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 09:55:24,847][15401] Updated weights for policy 0, policy_version 185240 (0.0029) [2024-06-22 09:55:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3035119616. Throughput: 0: 42310.7. Samples: 3035233020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 09:55:28,598][15401] Updated weights for policy 0, policy_version 185250 (0.0030) [2024-06-22 09:55:32,568][15401] Updated weights for policy 0, policy_version 185260 (0.0031) [2024-06-22 09:55:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3035332608. Throughput: 0: 42319.0. Samples: 3035483560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 09:55:36,313][15401] Updated weights for policy 0, policy_version 185270 (0.0036) [2024-06-22 09:55:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42598.4). Total num frames: 3035512832. Throughput: 0: 42239.6. Samples: 3035611980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 09:55:40,129][15401] Updated weights for policy 0, policy_version 185280 (0.0044) [2024-06-22 09:55:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 3035758592. Throughput: 0: 42451.1. Samples: 3035870240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 09:55:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185288_3035758592.pth... [2024-06-22 09:55:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184663_3025518592.pth [2024-06-22 09:55:44,011][15401] Updated weights for policy 0, policy_version 185290 (0.0034) [2024-06-22 09:55:47,979][15401] Updated weights for policy 0, policy_version 185300 (0.0042) [2024-06-22 09:55:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3035955200. Throughput: 0: 42395.7. Samples: 3036123400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 09:55:51,693][15401] Updated weights for policy 0, policy_version 185310 (0.0042) [2024-06-22 09:55:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3036168192. Throughput: 0: 42324.9. Samples: 3036251560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 09:55:55,565][15401] Updated weights for policy 0, policy_version 185320 (0.0034) [2024-06-22 09:55:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3036381184. Throughput: 0: 42510.8. Samples: 3036513140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:55:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 09:55:59,247][15401] Updated weights for policy 0, policy_version 185330 (0.0048) [2024-06-22 09:56:03,203][15401] Updated weights for policy 0, policy_version 185340 (0.0034) [2024-06-22 09:56:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 3036610560. Throughput: 0: 42742.3. Samples: 3036769880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:56:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 09:56:06,693][15401] Updated weights for policy 0, policy_version 185350 (0.0034) [2024-06-22 09:56:08,396][15132] Fps is (10 sec: 42570.5, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 3036807168. Throughput: 0: 42743.3. Samples: 3036898660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:56:08,397][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 09:56:10,778][15401] Updated weights for policy 0, policy_version 185360 (0.0035) [2024-06-22 09:56:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3037036544. Throughput: 0: 42696.4. Samples: 3037154360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:56:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 09:56:14,855][15401] Updated weights for policy 0, policy_version 185370 (0.0042) [2024-06-22 09:56:18,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3037249536. Throughput: 0: 42789.0. Samples: 3037409060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 09:56:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 09:56:18,829][15401] Updated weights for policy 0, policy_version 185380 (0.0033) [2024-06-22 09:56:22,474][15401] Updated weights for policy 0, policy_version 185390 (0.0040) [2024-06-22 09:56:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3037462528. Throughput: 0: 42749.8. Samples: 3037535720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 09:56:26,477][15401] Updated weights for policy 0, policy_version 185400 (0.0031) [2024-06-22 09:56:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 3037659136. Throughput: 0: 42773.0. Samples: 3037795020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 09:56:30,012][15401] Updated weights for policy 0, policy_version 185410 (0.0026) [2024-06-22 09:56:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 3037872128. Throughput: 0: 42810.7. Samples: 3038049880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 09:56:34,116][15401] Updated weights for policy 0, policy_version 185420 (0.0029) [2024-06-22 09:56:37,747][15401] Updated weights for policy 0, policy_version 185430 (0.0040) [2024-06-22 09:56:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3038085120. Throughput: 0: 42676.9. Samples: 3038172020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 09:56:39,253][15349] Signal inference workers to stop experience collection... (44750 times) [2024-06-22 09:56:39,299][15401] InferenceWorker_p0-w0: stopping experience collection (44750 times) [2024-06-22 09:56:39,317][15349] Signal inference workers to resume experience collection... (44750 times) [2024-06-22 09:56:39,319][15401] InferenceWorker_p0-w0: resuming experience collection (44750 times) [2024-06-22 09:56:41,639][15401] Updated weights for policy 0, policy_version 185440 (0.0046) [2024-06-22 09:56:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3038298112. Throughput: 0: 42535.0. Samples: 3038427220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 09:56:45,634][15401] Updated weights for policy 0, policy_version 185450 (0.0037) [2024-06-22 09:56:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3038511104. Throughput: 0: 42605.8. Samples: 3038687140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 09:56:49,295][15401] Updated weights for policy 0, policy_version 185460 (0.0033) [2024-06-22 09:56:53,183][15401] Updated weights for policy 0, policy_version 185470 (0.0038) [2024-06-22 09:56:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3038740480. Throughput: 0: 42573.2. Samples: 3038814180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:53,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 09:56:56,815][15401] Updated weights for policy 0, policy_version 185480 (0.0032) [2024-06-22 09:56:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3038920704. Throughput: 0: 42505.0. Samples: 3039067080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:56:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 09:57:00,758][15401] Updated weights for policy 0, policy_version 185490 (0.0037) [2024-06-22 09:57:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 3039133696. Throughput: 0: 42749.2. Samples: 3039332780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:57:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 09:57:04,453][15401] Updated weights for policy 0, policy_version 185500 (0.0034) [2024-06-22 09:57:08,279][15401] Updated weights for policy 0, policy_version 185510 (0.0027) [2024-06-22 09:57:08,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 3039395840. Throughput: 0: 42741.3. Samples: 3039459080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:57:08,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 09:57:12,256][15401] Updated weights for policy 0, policy_version 185520 (0.0041) [2024-06-22 09:57:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3039576064. Throughput: 0: 42505.3. Samples: 3039707760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:57:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 09:57:15,975][15401] Updated weights for policy 0, policy_version 185530 (0.0039) [2024-06-22 09:57:18,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 3039772672. Throughput: 0: 42635.5. Samples: 3039968480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:57:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 09:57:20,223][15401] Updated weights for policy 0, policy_version 185540 (0.0041) [2024-06-22 09:57:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3040034816. Throughput: 0: 42635.2. Samples: 3040090600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 09:57:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:57:23,457][15401] Updated weights for policy 0, policy_version 185550 (0.0027) [2024-06-22 09:57:27,759][15401] Updated weights for policy 0, policy_version 185560 (0.0047) [2024-06-22 09:57:28,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3040231424. Throughput: 0: 42909.4. Samples: 3040358140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 09:57:31,428][15401] Updated weights for policy 0, policy_version 185570 (0.0024) [2024-06-22 09:57:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3040428032. Throughput: 0: 42789.8. Samples: 3040612680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 09:57:35,711][15401] Updated weights for policy 0, policy_version 185580 (0.0028) [2024-06-22 09:57:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3040657408. Throughput: 0: 42667.6. Samples: 3040734220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 09:57:39,053][15401] Updated weights for policy 0, policy_version 185590 (0.0041) [2024-06-22 09:57:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3040854016. Throughput: 0: 42882.7. Samples: 3040996800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:43,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-22 09:57:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185600_3040870400.pth... [2024-06-22 09:57:43,549][15401] Updated weights for policy 0, policy_version 185600 (0.0032) [2024-06-22 09:57:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000184975_3030630400.pth [2024-06-22 09:57:46,641][15349] Signal inference workers to stop experience collection... (44800 times) [2024-06-22 09:57:46,662][15401] InferenceWorker_p0-w0: stopping experience collection (44800 times) [2024-06-22 09:57:46,702][15349] Signal inference workers to resume experience collection... (44800 times) [2024-06-22 09:57:46,703][15401] InferenceWorker_p0-w0: resuming experience collection (44800 times) [2024-06-22 09:57:46,836][15401] Updated weights for policy 0, policy_version 185610 (0.0034) [2024-06-22 09:57:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3041083392. Throughput: 0: 42353.8. Samples: 3041238700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:48,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 09:57:51,199][15401] Updated weights for policy 0, policy_version 185620 (0.0035) [2024-06-22 09:57:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3041296384. Throughput: 0: 42557.6. Samples: 3041374180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 09:57:54,711][15401] Updated weights for policy 0, policy_version 185630 (0.0026) [2024-06-22 09:57:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3041492992. Throughput: 0: 42680.0. Samples: 3041628360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:57:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 09:57:59,145][15401] Updated weights for policy 0, policy_version 185640 (0.0030) [2024-06-22 09:58:02,262][15401] Updated weights for policy 0, policy_version 185650 (0.0040) [2024-06-22 09:58:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 3041722368. Throughput: 0: 42442.3. Samples: 3041878380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 09:58:06,801][15401] Updated weights for policy 0, policy_version 185660 (0.0030) [2024-06-22 09:58:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3041935360. Throughput: 0: 42684.8. Samples: 3042011420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 09:58:09,838][15401] Updated weights for policy 0, policy_version 185670 (0.0038) [2024-06-22 09:58:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 3042115584. Throughput: 0: 42406.8. Samples: 3042266440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 09:58:14,335][15401] Updated weights for policy 0, policy_version 185680 (0.0028) [2024-06-22 09:58:17,657][15401] Updated weights for policy 0, policy_version 185690 (0.0038) [2024-06-22 09:58:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3042361344. Throughput: 0: 42259.1. Samples: 3042514340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:18,393][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 09:58:22,026][15401] Updated weights for policy 0, policy_version 185700 (0.0033) [2024-06-22 09:58:23,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42050.5, 300 sec: 42653.9). Total num frames: 3042557952. Throughput: 0: 42617.6. Samples: 3042652120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:23,392][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 09:58:25,120][15401] Updated weights for policy 0, policy_version 185710 (0.0043) [2024-06-22 09:58:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3042770944. Throughput: 0: 42398.2. Samples: 3042904720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 09:58:28,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 09:58:29,721][15401] Updated weights for policy 0, policy_version 185720 (0.0029) [2024-06-22 09:58:32,739][15401] Updated weights for policy 0, policy_version 185730 (0.0045) [2024-06-22 09:58:33,390][15132] Fps is (10 sec: 45885.6, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 3043016704. Throughput: 0: 42585.7. Samples: 3043155060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 09:58:37,592][15401] Updated weights for policy 0, policy_version 185740 (0.0037) [2024-06-22 09:58:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3043196928. Throughput: 0: 42642.7. Samples: 3043293100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 09:58:40,393][15401] Updated weights for policy 0, policy_version 185750 (0.0032) [2024-06-22 09:58:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3043426304. Throughput: 0: 42637.8. Samples: 3043547060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 09:58:45,139][15401] Updated weights for policy 0, policy_version 185760 (0.0032) [2024-06-22 09:58:48,027][15401] Updated weights for policy 0, policy_version 185770 (0.0036) [2024-06-22 09:58:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3043655680. Throughput: 0: 42627.2. Samples: 3043796600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 09:58:53,071][15401] Updated weights for policy 0, policy_version 185780 (0.0021) [2024-06-22 09:58:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 3043819520. Throughput: 0: 42707.6. Samples: 3043933260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:58:55,704][15401] Updated weights for policy 0, policy_version 185790 (0.0030) [2024-06-22 09:58:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3044065280. Throughput: 0: 42682.6. Samples: 3044187160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:58:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 09:59:00,595][15401] Updated weights for policy 0, policy_version 185800 (0.0042) [2024-06-22 09:59:03,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3044294656. Throughput: 0: 42684.1. Samples: 3044435120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 09:59:03,409][15401] Updated weights for policy 0, policy_version 185810 (0.0042) [2024-06-22 09:59:08,298][15401] Updated weights for policy 0, policy_version 185820 (0.0038) [2024-06-22 09:59:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3044474880. Throughput: 0: 42497.3. Samples: 3044564400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 09:59:11,056][15401] Updated weights for policy 0, policy_version 185830 (0.0032) [2024-06-22 09:59:13,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3044704256. Throughput: 0: 42541.2. Samples: 3044819080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 09:59:15,801][15401] Updated weights for policy 0, policy_version 185840 (0.0043) [2024-06-22 09:59:18,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 3044933632. Throughput: 0: 42666.7. Samples: 3045075160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:18,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 09:59:19,261][15401] Updated weights for policy 0, policy_version 185850 (0.0039) [2024-06-22 09:59:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 3045113856. Throughput: 0: 42532.4. Samples: 3045207060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 09:59:23,543][15401] Updated weights for policy 0, policy_version 185860 (0.0033) [2024-06-22 09:59:26,910][15401] Updated weights for policy 0, policy_version 185870 (0.0034) [2024-06-22 09:59:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3045343232. Throughput: 0: 42455.1. Samples: 3045457540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 09:59:30,344][15349] Signal inference workers to stop experience collection... (44850 times) [2024-06-22 09:59:30,344][15349] Signal inference workers to resume experience collection... (44850 times) [2024-06-22 09:59:30,363][15401] InferenceWorker_p0-w0: stopping experience collection (44850 times) [2024-06-22 09:59:30,363][15401] InferenceWorker_p0-w0: resuming experience collection (44850 times) [2024-06-22 09:59:30,945][15401] Updated weights for policy 0, policy_version 185880 (0.0035) [2024-06-22 09:59:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 3045556224. Throughput: 0: 42688.9. Samples: 3045717600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 09:59:34,408][15401] Updated weights for policy 0, policy_version 185890 (0.0026) [2024-06-22 09:59:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 3045769216. Throughput: 0: 42620.4. Samples: 3045851180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-22 09:59:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 09:59:38,523][15401] Updated weights for policy 0, policy_version 185900 (0.0042) [2024-06-22 09:59:41,787][15401] Updated weights for policy 0, policy_version 185910 (0.0029) [2024-06-22 09:59:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3045998592. Throughput: 0: 42656.0. Samples: 3046106680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 09:59:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 09:59:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185913_3045998592.pth... [2024-06-22 09:59:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185288_3035758592.pth [2024-06-22 09:59:46,433][15401] Updated weights for policy 0, policy_version 185920 (0.0038) [2024-06-22 09:59:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3046211584. Throughput: 0: 42828.9. Samples: 3046362420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 09:59:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 09:59:49,374][15401] Updated weights for policy 0, policy_version 185930 (0.0028) [2024-06-22 09:59:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3046408192. Throughput: 0: 42849.0. Samples: 3046492600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 09:59:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 09:59:54,085][15401] Updated weights for policy 0, policy_version 185940 (0.0028) [2024-06-22 09:59:57,575][15401] Updated weights for policy 0, policy_version 185950 (0.0039) [2024-06-22 09:59:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 3046653952. Throughput: 0: 42900.9. Samples: 3046749620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 09:59:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 10:00:01,731][15401] Updated weights for policy 0, policy_version 185960 (0.0026) [2024-06-22 10:00:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3046850560. Throughput: 0: 42943.2. Samples: 3047007500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 10:00:05,202][15401] Updated weights for policy 0, policy_version 185970 (0.0026) [2024-06-22 10:00:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3047063552. Throughput: 0: 42854.2. Samples: 3047135500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 10:00:09,269][15401] Updated weights for policy 0, policy_version 185980 (0.0039) [2024-06-22 10:00:12,705][15401] Updated weights for policy 0, policy_version 185990 (0.0034) [2024-06-22 10:00:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3047260160. Throughput: 0: 42942.7. Samples: 3047389960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 10:00:16,898][15401] Updated weights for policy 0, policy_version 186000 (0.0025) [2024-06-22 10:00:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3047505920. Throughput: 0: 42939.9. Samples: 3047649900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 10:00:20,416][15401] Updated weights for policy 0, policy_version 186010 (0.0054) [2024-06-22 10:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3047686144. Throughput: 0: 42733.7. Samples: 3047774200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 10:00:24,593][15401] Updated weights for policy 0, policy_version 186020 (0.0039) [2024-06-22 10:00:28,033][15401] Updated weights for policy 0, policy_version 186030 (0.0029) [2024-06-22 10:00:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3047915520. Throughput: 0: 42752.5. Samples: 3048030540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 10:00:32,350][15401] Updated weights for policy 0, policy_version 186040 (0.0023) [2024-06-22 10:00:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3048128512. Throughput: 0: 42785.2. Samples: 3048287760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 10:00:35,824][15401] Updated weights for policy 0, policy_version 186050 (0.0040) [2024-06-22 10:00:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3048341504. Throughput: 0: 42715.6. Samples: 3048414800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 10:00:40,125][15401] Updated weights for policy 0, policy_version 186060 (0.0030) [2024-06-22 10:00:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3048538112. Throughput: 0: 42781.8. Samples: 3048674800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:00:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 10:00:43,701][15401] Updated weights for policy 0, policy_version 186070 (0.0032) [2024-06-22 10:00:47,871][15401] Updated weights for policy 0, policy_version 186080 (0.0030) [2024-06-22 10:00:48,390][15132] Fps is (10 sec: 40958.2, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 3048751104. Throughput: 0: 42780.4. Samples: 3048932640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:00:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 10:00:51,395][15401] Updated weights for policy 0, policy_version 186090 (0.0043) [2024-06-22 10:00:53,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 3048980480. Throughput: 0: 42722.6. Samples: 3049058120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:00:53,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 10:00:55,400][15401] Updated weights for policy 0, policy_version 186100 (0.0026) [2024-06-22 10:00:57,297][15349] Signal inference workers to stop experience collection... (44900 times) [2024-06-22 10:00:57,297][15349] Signal inference workers to resume experience collection... (44900 times) [2024-06-22 10:00:57,322][15401] InferenceWorker_p0-w0: stopping experience collection (44900 times) [2024-06-22 10:00:57,323][15401] InferenceWorker_p0-w0: resuming experience collection (44900 times) [2024-06-22 10:00:58,390][15132] Fps is (10 sec: 44238.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3049193472. Throughput: 0: 42748.9. Samples: 3049313660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:00:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 10:00:59,021][15401] Updated weights for policy 0, policy_version 186110 (0.0046) [2024-06-22 10:01:02,968][15401] Updated weights for policy 0, policy_version 186120 (0.0022) [2024-06-22 10:01:03,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42323.6, 300 sec: 42654.5). Total num frames: 3049390080. Throughput: 0: 42541.2. Samples: 3049564360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:03,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 10:01:06,682][15401] Updated weights for policy 0, policy_version 186130 (0.0028) [2024-06-22 10:01:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3049603072. Throughput: 0: 42651.6. Samples: 3049693520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 10:01:10,508][15401] Updated weights for policy 0, policy_version 186140 (0.0042) [2024-06-22 10:01:13,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3049832448. Throughput: 0: 42739.6. Samples: 3049953820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 10:01:14,290][15401] Updated weights for policy 0, policy_version 186150 (0.0026) [2024-06-22 10:01:18,096][15401] Updated weights for policy 0, policy_version 186160 (0.0034) [2024-06-22 10:01:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3050045440. Throughput: 0: 42561.9. Samples: 3050203040. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 10:01:21,856][15401] Updated weights for policy 0, policy_version 186170 (0.0046) [2024-06-22 10:01:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3050242048. Throughput: 0: 42519.6. Samples: 3050328180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 10:01:26,199][15401] Updated weights for policy 0, policy_version 186180 (0.0037) [2024-06-22 10:01:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3050455040. Throughput: 0: 42564.8. Samples: 3050590220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:28,394][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 10:01:29,609][15401] Updated weights for policy 0, policy_version 186190 (0.0031) [2024-06-22 10:01:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3050684416. Throughput: 0: 42382.5. Samples: 3050839840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 10:01:33,630][15401] Updated weights for policy 0, policy_version 186200 (0.0037) [2024-06-22 10:01:37,597][15401] Updated weights for policy 0, policy_version 186210 (0.0037) [2024-06-22 10:01:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3050897408. Throughput: 0: 42629.9. Samples: 3050976360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 10:01:41,253][15401] Updated weights for policy 0, policy_version 186220 (0.0033) [2024-06-22 10:01:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3051110400. Throughput: 0: 42601.3. Samples: 3051230720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 10:01:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186225_3051110400.pth... [2024-06-22 10:01:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185600_3040870400.pth [2024-06-22 10:01:45,238][15401] Updated weights for policy 0, policy_version 186230 (0.0029) [2024-06-22 10:01:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 3051323392. Throughput: 0: 42537.3. Samples: 3051478440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 10:01:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 10:01:49,129][15401] Updated weights for policy 0, policy_version 186240 (0.0039) [2024-06-22 10:01:52,987][15401] Updated weights for policy 0, policy_version 186250 (0.0039) [2024-06-22 10:01:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3051520000. Throughput: 0: 42536.5. Samples: 3051607660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:01:53,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 10:01:57,017][15401] Updated weights for policy 0, policy_version 186260 (0.0044) [2024-06-22 10:01:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3051732992. Throughput: 0: 42411.6. Samples: 3051862340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:01:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 10:02:00,840][15401] Updated weights for policy 0, policy_version 186270 (0.0042) [2024-06-22 10:02:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 3051962368. Throughput: 0: 42427.1. Samples: 3052112260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 10:02:04,882][15401] Updated weights for policy 0, policy_version 186280 (0.0026) [2024-06-22 10:02:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3052158976. Throughput: 0: 42625.7. Samples: 3052246340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 10:02:08,461][15401] Updated weights for policy 0, policy_version 186290 (0.0023) [2024-06-22 10:02:12,502][15401] Updated weights for policy 0, policy_version 186300 (0.0041) [2024-06-22 10:02:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3052371968. Throughput: 0: 42475.1. Samples: 3052501600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 10:02:16,078][15401] Updated weights for policy 0, policy_version 186310 (0.0035) [2024-06-22 10:02:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3052601344. Throughput: 0: 42554.3. Samples: 3052754780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:18,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 10:02:20,286][15401] Updated weights for policy 0, policy_version 186320 (0.0037) [2024-06-22 10:02:23,391][15132] Fps is (10 sec: 42592.1, 60 sec: 42597.3, 300 sec: 42598.2). Total num frames: 3052797952. Throughput: 0: 42412.4. Samples: 3052884980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:23,392][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 10:02:24,094][15401] Updated weights for policy 0, policy_version 186330 (0.0034) [2024-06-22 10:02:27,847][15401] Updated weights for policy 0, policy_version 186340 (0.0037) [2024-06-22 10:02:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3053010944. Throughput: 0: 42420.9. Samples: 3053139660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:28,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 10:02:32,075][15401] Updated weights for policy 0, policy_version 186350 (0.0027) [2024-06-22 10:02:33,390][15132] Fps is (10 sec: 45881.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3053256704. Throughput: 0: 42509.9. Samples: 3053391380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 10:02:33,756][15349] Signal inference workers to stop experience collection... (44950 times) [2024-06-22 10:02:33,800][15401] InferenceWorker_p0-w0: stopping experience collection (44950 times) [2024-06-22 10:02:33,866][15349] Signal inference workers to resume experience collection... (44950 times) [2024-06-22 10:02:33,866][15401] InferenceWorker_p0-w0: resuming experience collection (44950 times) [2024-06-22 10:02:35,445][15401] Updated weights for policy 0, policy_version 186360 (0.0028) [2024-06-22 10:02:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3053436928. Throughput: 0: 42672.1. Samples: 3053527900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 10:02:39,427][15401] Updated weights for policy 0, policy_version 186370 (0.0024) [2024-06-22 10:02:42,888][15401] Updated weights for policy 0, policy_version 186380 (0.0029) [2024-06-22 10:02:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3053649920. Throughput: 0: 42661.6. Samples: 3053782120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 10:02:47,059][15401] Updated weights for policy 0, policy_version 186390 (0.0034) [2024-06-22 10:02:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3053895680. Throughput: 0: 42758.2. Samples: 3054036380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:02:50,962][15401] Updated weights for policy 0, policy_version 186400 (0.0038) [2024-06-22 10:02:53,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3054092288. Throughput: 0: 42825.8. Samples: 3054173500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:02:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 10:02:54,547][15401] Updated weights for policy 0, policy_version 186410 (0.0029) [2024-06-22 10:02:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3054288896. Throughput: 0: 42819.6. Samples: 3054428480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:02:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 10:02:58,488][15401] Updated weights for policy 0, policy_version 186420 (0.0029) [2024-06-22 10:03:02,119][15401] Updated weights for policy 0, policy_version 186430 (0.0031) [2024-06-22 10:03:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3054534656. Throughput: 0: 42928.0. Samples: 3054686540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 10:03:06,047][15401] Updated weights for policy 0, policy_version 186440 (0.0027) [2024-06-22 10:03:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3054747648. Throughput: 0: 43005.4. Samples: 3054820160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 10:03:09,654][15401] Updated weights for policy 0, policy_version 186450 (0.0034) [2024-06-22 10:03:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3054927872. Throughput: 0: 42931.4. Samples: 3055071580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 10:03:13,718][15401] Updated weights for policy 0, policy_version 186460 (0.0031) [2024-06-22 10:03:17,585][15401] Updated weights for policy 0, policy_version 186470 (0.0042) [2024-06-22 10:03:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 3055157248. Throughput: 0: 42983.9. Samples: 3055325760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:18,392][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 10:03:21,342][15401] Updated weights for policy 0, policy_version 186480 (0.0035) [2024-06-22 10:03:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43145.6, 300 sec: 42765.0). Total num frames: 3055386624. Throughput: 0: 42879.8. Samples: 3055457500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:23,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 10:03:25,156][15401] Updated weights for policy 0, policy_version 186490 (0.0036) [2024-06-22 10:03:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3055583232. Throughput: 0: 42840.6. Samples: 3055709940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 10:03:28,917][15401] Updated weights for policy 0, policy_version 186500 (0.0035) [2024-06-22 10:03:32,552][15401] Updated weights for policy 0, policy_version 186510 (0.0044) [2024-06-22 10:03:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3055796224. Throughput: 0: 42992.4. Samples: 3055971040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 10:03:36,699][15401] Updated weights for policy 0, policy_version 186520 (0.0035) [2024-06-22 10:03:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3056025600. Throughput: 0: 42798.6. Samples: 3056099440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 10:03:40,086][15401] Updated weights for policy 0, policy_version 186530 (0.0032) [2024-06-22 10:03:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3056238592. Throughput: 0: 42756.7. Samples: 3056352540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 10:03:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186538_3056238592.pth... [2024-06-22 10:03:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000185913_3045998592.pth [2024-06-22 10:03:44,347][15401] Updated weights for policy 0, policy_version 186540 (0.0031) [2024-06-22 10:03:45,789][15349] Signal inference workers to stop experience collection... (45000 times) [2024-06-22 10:03:45,791][15349] Signal inference workers to resume experience collection... (45000 times) [2024-06-22 10:03:45,831][15401] InferenceWorker_p0-w0: stopping experience collection (45000 times) [2024-06-22 10:03:45,832][15401] InferenceWorker_p0-w0: resuming experience collection (45000 times) [2024-06-22 10:03:47,892][15401] Updated weights for policy 0, policy_version 186550 (0.0037) [2024-06-22 10:03:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3056451584. Throughput: 0: 42719.0. Samples: 3056608900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 10:03:51,850][15401] Updated weights for policy 0, policy_version 186560 (0.0041) [2024-06-22 10:03:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3056680960. Throughput: 0: 42763.5. Samples: 3056744520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:53,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 10:03:55,379][15401] Updated weights for policy 0, policy_version 186570 (0.0035) [2024-06-22 10:03:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3056877568. Throughput: 0: 42967.2. Samples: 3057005100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:03:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 10:03:59,420][15401] Updated weights for policy 0, policy_version 186580 (0.0034) [2024-06-22 10:04:02,876][15401] Updated weights for policy 0, policy_version 186590 (0.0040) [2024-06-22 10:04:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3057106944. Throughput: 0: 42989.4. Samples: 3057260180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 10:04:03,391][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 10:04:06,995][15401] Updated weights for policy 0, policy_version 186600 (0.0028) [2024-06-22 10:04:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3057336320. Throughput: 0: 43016.1. Samples: 3057393220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 10:04:10,498][15401] Updated weights for policy 0, policy_version 186610 (0.0031) [2024-06-22 10:04:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42654.3). Total num frames: 3057516544. Throughput: 0: 43141.0. Samples: 3057651280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 10:04:14,545][15401] Updated weights for policy 0, policy_version 186620 (0.0035) [2024-06-22 10:04:18,158][15401] Updated weights for policy 0, policy_version 186630 (0.0039) [2024-06-22 10:04:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 3057745920. Throughput: 0: 42972.4. Samples: 3057904800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:18,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-22 10:04:22,040][15401] Updated weights for policy 0, policy_version 186640 (0.0035) [2024-06-22 10:04:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3057958912. Throughput: 0: 43057.1. Samples: 3058037020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 10:04:25,865][15401] Updated weights for policy 0, policy_version 186650 (0.0033) [2024-06-22 10:04:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3058171904. Throughput: 0: 43080.1. Samples: 3058291140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 10:04:29,899][15401] Updated weights for policy 0, policy_version 186660 (0.0036) [2024-06-22 10:04:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3058384896. Throughput: 0: 43140.5. Samples: 3058550220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 10:04:33,498][15401] Updated weights for policy 0, policy_version 186670 (0.0022) [2024-06-22 10:04:37,537][15401] Updated weights for policy 0, policy_version 186680 (0.0037) [2024-06-22 10:04:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3058597888. Throughput: 0: 42938.6. Samples: 3058676760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:38,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 10:04:41,574][15401] Updated weights for policy 0, policy_version 186690 (0.0043) [2024-06-22 10:04:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3058810880. Throughput: 0: 42817.4. Samples: 3058931880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 10:04:45,143][15401] Updated weights for policy 0, policy_version 186700 (0.0038) [2024-06-22 10:04:48,394][15132] Fps is (10 sec: 42579.0, 60 sec: 42868.2, 300 sec: 42764.3). Total num frames: 3059023872. Throughput: 0: 42874.7. Samples: 3059189740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:48,395][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 10:04:49,084][15401] Updated weights for policy 0, policy_version 186710 (0.0030) [2024-06-22 10:04:52,715][15401] Updated weights for policy 0, policy_version 186720 (0.0024) [2024-06-22 10:04:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3059236864. Throughput: 0: 42697.3. Samples: 3059314600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 10:04:57,026][15401] Updated weights for policy 0, policy_version 186730 (0.0043) [2024-06-22 10:04:58,390][15132] Fps is (10 sec: 44257.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3059466240. Throughput: 0: 42787.4. Samples: 3059576720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:04:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 10:05:00,375][15401] Updated weights for policy 0, policy_version 186740 (0.0029) [2024-06-22 10:05:01,541][15349] Signal inference workers to stop experience collection... (45050 times) [2024-06-22 10:05:01,593][15349] Signal inference workers to resume experience collection... (45050 times) [2024-06-22 10:05:01,593][15401] InferenceWorker_p0-w0: stopping experience collection (45050 times) [2024-06-22 10:05:01,614][15401] InferenceWorker_p0-w0: resuming experience collection (45050 times) [2024-06-22 10:05:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3059662848. Throughput: 0: 42932.5. Samples: 3059836760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:05:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 10:05:04,452][15401] Updated weights for policy 0, policy_version 186750 (0.0030) [2024-06-22 10:05:08,078][15401] Updated weights for policy 0, policy_version 186760 (0.0031) [2024-06-22 10:05:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3059892224. Throughput: 0: 42730.8. Samples: 3059959900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-22 10:05:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 10:05:11,943][15401] Updated weights for policy 0, policy_version 186770 (0.0028) [2024-06-22 10:05:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3060105216. Throughput: 0: 42866.7. Samples: 3060220140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 10:05:15,658][15401] Updated weights for policy 0, policy_version 186780 (0.0032) [2024-06-22 10:05:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3060301824. Throughput: 0: 42771.1. Samples: 3060474920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 10:05:19,671][15401] Updated weights for policy 0, policy_version 186790 (0.0025) [2024-06-22 10:05:23,238][15401] Updated weights for policy 0, policy_version 186800 (0.0043) [2024-06-22 10:05:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3060531200. Throughput: 0: 42785.0. Samples: 3060602080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 10:05:27,130][15401] Updated weights for policy 0, policy_version 186810 (0.0040) [2024-06-22 10:05:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3060744192. Throughput: 0: 42988.5. Samples: 3060866360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 10:05:30,898][15401] Updated weights for policy 0, policy_version 186820 (0.0037) [2024-06-22 10:05:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3060957184. Throughput: 0: 42881.9. Samples: 3061119220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 10:05:35,038][15401] Updated weights for policy 0, policy_version 186830 (0.0044) [2024-06-22 10:05:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3061170176. Throughput: 0: 42936.5. Samples: 3061246740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:38,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 10:05:38,425][15401] Updated weights for policy 0, policy_version 186840 (0.0028) [2024-06-22 10:05:42,762][15401] Updated weights for policy 0, policy_version 186850 (0.0031) [2024-06-22 10:05:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3061383168. Throughput: 0: 42991.2. Samples: 3061511320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 10:05:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186852_3061383168.pth... [2024-06-22 10:05:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186225_3051110400.pth [2024-06-22 10:05:45,846][15401] Updated weights for policy 0, policy_version 186860 (0.0035) [2024-06-22 10:05:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43147.9, 300 sec: 42820.9). Total num frames: 3061612544. Throughput: 0: 42750.2. Samples: 3061760520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:48,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 10:05:50,189][15401] Updated weights for policy 0, policy_version 186870 (0.0041) [2024-06-22 10:05:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3061825536. Throughput: 0: 42975.1. Samples: 3061893780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 10:05:53,437][15401] Updated weights for policy 0, policy_version 186880 (0.0032) [2024-06-22 10:05:57,903][15401] Updated weights for policy 0, policy_version 186890 (0.0023) [2024-06-22 10:05:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 3062022144. Throughput: 0: 43144.0. Samples: 3062161620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:05:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 10:06:01,807][15401] Updated weights for policy 0, policy_version 186900 (0.0039) [2024-06-22 10:06:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3062267904. Throughput: 0: 42866.2. Samples: 3062403900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:06:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 10:06:05,458][15401] Updated weights for policy 0, policy_version 186910 (0.0032) [2024-06-22 10:06:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3062464512. Throughput: 0: 42979.5. Samples: 3062536160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:06:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 10:06:09,391][15349] Signal inference workers to stop experience collection... (45100 times) [2024-06-22 10:06:09,393][15349] Signal inference workers to resume experience collection... (45100 times) [2024-06-22 10:06:09,397][15401] Updated weights for policy 0, policy_version 186920 (0.0037) [2024-06-22 10:06:09,409][15401] InferenceWorker_p0-w0: stopping experience collection (45100 times) [2024-06-22 10:06:09,410][15401] InferenceWorker_p0-w0: resuming experience collection (45100 times) [2024-06-22 10:06:12,965][15401] Updated weights for policy 0, policy_version 186930 (0.0029) [2024-06-22 10:06:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3062677504. Throughput: 0: 43081.4. Samples: 3062805020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 10:06:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 10:06:16,840][15401] Updated weights for policy 0, policy_version 186940 (0.0043) [2024-06-22 10:06:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3062906880. Throughput: 0: 42949.2. Samples: 3063051940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:18,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 10:06:20,585][15401] Updated weights for policy 0, policy_version 186950 (0.0036) [2024-06-22 10:06:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3063103488. Throughput: 0: 42996.4. Samples: 3063181580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 10:06:24,455][15401] Updated weights for policy 0, policy_version 186960 (0.0031) [2024-06-22 10:06:28,222][15401] Updated weights for policy 0, policy_version 186970 (0.0025) [2024-06-22 10:06:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 3063332864. Throughput: 0: 42975.9. Samples: 3063445340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:28,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 10:06:31,924][15401] Updated weights for policy 0, policy_version 186980 (0.0033) [2024-06-22 10:06:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3063562240. Throughput: 0: 43043.9. Samples: 3063697500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 10:06:35,801][15401] Updated weights for policy 0, policy_version 186990 (0.0025) [2024-06-22 10:06:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3063758848. Throughput: 0: 43109.3. Samples: 3063833700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:38,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 10:06:39,390][15401] Updated weights for policy 0, policy_version 187000 (0.0027) [2024-06-22 10:06:43,288][15401] Updated weights for policy 0, policy_version 187010 (0.0042) [2024-06-22 10:06:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3063971840. Throughput: 0: 42916.8. Samples: 3064092880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 10:06:47,138][15401] Updated weights for policy 0, policy_version 187020 (0.0027) [2024-06-22 10:06:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3064184832. Throughput: 0: 43088.0. Samples: 3064342860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 10:06:50,935][15401] Updated weights for policy 0, policy_version 187030 (0.0032) [2024-06-22 10:06:53,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43139.9, 300 sec: 42986.2). Total num frames: 3064414208. Throughput: 0: 43129.1. Samples: 3064477240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:53,396][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 10:06:54,662][15401] Updated weights for policy 0, policy_version 187040 (0.0029) [2024-06-22 10:06:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3064594432. Throughput: 0: 42952.5. Samples: 3064737880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:06:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 10:06:58,675][15401] Updated weights for policy 0, policy_version 187050 (0.0036) [2024-06-22 10:07:02,311][15401] Updated weights for policy 0, policy_version 187060 (0.0034) [2024-06-22 10:07:03,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3064823808. Throughput: 0: 43102.7. Samples: 3064991560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:07:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 10:07:06,249][15401] Updated weights for policy 0, policy_version 187070 (0.0042) [2024-06-22 10:07:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3065036800. Throughput: 0: 43037.8. Samples: 3065118280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:07:08,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 10:07:09,864][15401] Updated weights for policy 0, policy_version 187080 (0.0033) [2024-06-22 10:07:12,298][15349] Signal inference workers to stop experience collection... (45150 times) [2024-06-22 10:07:12,298][15349] Signal inference workers to resume experience collection... (45150 times) [2024-06-22 10:07:12,313][15401] InferenceWorker_p0-w0: stopping experience collection (45150 times) [2024-06-22 10:07:12,313][15401] InferenceWorker_p0-w0: resuming experience collection (45150 times) [2024-06-22 10:07:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3065249792. Throughput: 0: 42871.5. Samples: 3065374460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:07:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 10:07:14,055][15401] Updated weights for policy 0, policy_version 187090 (0.0026) [2024-06-22 10:07:17,696][15401] Updated weights for policy 0, policy_version 187100 (0.0036) [2024-06-22 10:07:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42931.8). Total num frames: 3065462784. Throughput: 0: 42825.8. Samples: 3065624660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 10:07:18,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 10:07:21,786][15401] Updated weights for policy 0, policy_version 187110 (0.0041) [2024-06-22 10:07:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3065692160. Throughput: 0: 42722.2. Samples: 3065756200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:23,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 10:07:25,239][15401] Updated weights for policy 0, policy_version 187120 (0.0044) [2024-06-22 10:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3065888768. Throughput: 0: 42633.4. Samples: 3066011380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:28,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 10:07:29,607][15401] Updated weights for policy 0, policy_version 187130 (0.0034) [2024-06-22 10:07:32,790][15401] Updated weights for policy 0, policy_version 187140 (0.0027) [2024-06-22 10:07:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 3066101760. Throughput: 0: 42725.4. Samples: 3066265500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 10:07:37,066][15401] Updated weights for policy 0, policy_version 187150 (0.0024) [2024-06-22 10:07:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42986.9). Total num frames: 3066331136. Throughput: 0: 42685.6. Samples: 3066397920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:38,392][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 10:07:40,658][15401] Updated weights for policy 0, policy_version 187160 (0.0043) [2024-06-22 10:07:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3066527744. Throughput: 0: 42469.6. Samples: 3066649020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 10:07:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187166_3066527744.pth... [2024-06-22 10:07:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186538_3056238592.pth [2024-06-22 10:07:44,674][15401] Updated weights for policy 0, policy_version 187170 (0.0044) [2024-06-22 10:07:48,260][15401] Updated weights for policy 0, policy_version 187180 (0.0038) [2024-06-22 10:07:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3066757120. Throughput: 0: 42464.8. Samples: 3066902480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 10:07:52,503][15401] Updated weights for policy 0, policy_version 187190 (0.0036) [2024-06-22 10:07:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42056.8, 300 sec: 42876.1). Total num frames: 3066937344. Throughput: 0: 42618.7. Samples: 3067036120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 10:07:55,849][15401] Updated weights for policy 0, policy_version 187200 (0.0044) [2024-06-22 10:07:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3067166720. Throughput: 0: 42539.2. Samples: 3067288720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:07:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 10:08:00,201][15401] Updated weights for policy 0, policy_version 187210 (0.0041) [2024-06-22 10:08:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3067379712. Throughput: 0: 42657.7. Samples: 3067544260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 10:08:03,588][15401] Updated weights for policy 0, policy_version 187220 (0.0030) [2024-06-22 10:08:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 3067559936. Throughput: 0: 42582.3. Samples: 3067672400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 10:08:08,442][15401] Updated weights for policy 0, policy_version 187230 (0.0029) [2024-06-22 10:08:11,260][15401] Updated weights for policy 0, policy_version 187240 (0.0037) [2024-06-22 10:08:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 3067789312. Throughput: 0: 42549.8. Samples: 3067926120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:13,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 10:08:16,062][15401] Updated weights for policy 0, policy_version 187250 (0.0044) [2024-06-22 10:08:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3068035072. Throughput: 0: 42481.7. Samples: 3068177180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:18,396][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 10:08:18,884][15401] Updated weights for policy 0, policy_version 187260 (0.0034) [2024-06-22 10:08:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 3068215296. Throughput: 0: 42526.6. Samples: 3068311520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:23,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 10:08:23,661][15401] Updated weights for policy 0, policy_version 187270 (0.0032) [2024-06-22 10:08:26,492][15401] Updated weights for policy 0, policy_version 187280 (0.0039) [2024-06-22 10:08:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 3068428288. Throughput: 0: 42511.2. Samples: 3068562020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-22 10:08:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 10:08:31,207][15401] Updated weights for policy 0, policy_version 187290 (0.0035) [2024-06-22 10:08:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3068674048. Throughput: 0: 42589.4. Samples: 3068819000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 10:08:34,170][15401] Updated weights for policy 0, policy_version 187300 (0.0036) [2024-06-22 10:08:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 3068870656. Throughput: 0: 42566.1. Samples: 3068951600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:38,394][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 10:08:38,852][15401] Updated weights for policy 0, policy_version 187310 (0.0042) [2024-06-22 10:08:40,010][15349] Signal inference workers to stop experience collection... (45200 times) [2024-06-22 10:08:40,011][15349] Signal inference workers to resume experience collection... (45200 times) [2024-06-22 10:08:40,028][15401] InferenceWorker_p0-w0: stopping experience collection (45200 times) [2024-06-22 10:08:40,029][15401] InferenceWorker_p0-w0: resuming experience collection (45200 times) [2024-06-22 10:08:41,982][15401] Updated weights for policy 0, policy_version 187320 (0.0027) [2024-06-22 10:08:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3069067264. Throughput: 0: 42472.4. Samples: 3069199980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 10:08:46,421][15401] Updated weights for policy 0, policy_version 187330 (0.0041) [2024-06-22 10:08:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3069296640. Throughput: 0: 42615.7. Samples: 3069461960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 10:08:49,819][15401] Updated weights for policy 0, policy_version 187340 (0.0034) [2024-06-22 10:08:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3069509632. Throughput: 0: 42556.4. Samples: 3069587440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 10:08:54,537][15401] Updated weights for policy 0, policy_version 187350 (0.0033) [2024-06-22 10:08:57,554][15401] Updated weights for policy 0, policy_version 187360 (0.0047) [2024-06-22 10:08:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3069722624. Throughput: 0: 42381.3. Samples: 3069833280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:08:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 10:09:02,150][15401] Updated weights for policy 0, policy_version 187370 (0.0031) [2024-06-22 10:09:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3069952000. Throughput: 0: 42558.3. Samples: 3070092300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:03,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 10:09:05,338][15401] Updated weights for policy 0, policy_version 187380 (0.0028) [2024-06-22 10:09:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 3070132224. Throughput: 0: 42452.9. Samples: 3070222000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:08,393][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 10:09:09,749][15401] Updated weights for policy 0, policy_version 187390 (0.0038) [2024-06-22 10:09:13,006][15401] Updated weights for policy 0, policy_version 187400 (0.0032) [2024-06-22 10:09:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3070377984. Throughput: 0: 42632.8. Samples: 3070480500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 10:09:17,416][15401] Updated weights for policy 0, policy_version 187410 (0.0035) [2024-06-22 10:09:18,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3070574592. Throughput: 0: 42516.5. Samples: 3070732240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 10:09:20,961][15401] Updated weights for policy 0, policy_version 187420 (0.0028) [2024-06-22 10:09:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3070787584. Throughput: 0: 42484.8. Samples: 3070863420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:23,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 10:09:25,134][15401] Updated weights for policy 0, policy_version 187430 (0.0028) [2024-06-22 10:09:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3070984192. Throughput: 0: 42578.3. Samples: 3071116000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 10:09:28,686][15401] Updated weights for policy 0, policy_version 187440 (0.0040) [2024-06-22 10:09:32,776][15401] Updated weights for policy 0, policy_version 187450 (0.0035) [2024-06-22 10:09:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3071213568. Throughput: 0: 42396.3. Samples: 3071369800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 10:09:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 10:09:36,333][15401] Updated weights for policy 0, policy_version 187460 (0.0024) [2024-06-22 10:09:38,391][15132] Fps is (10 sec: 44228.2, 60 sec: 42597.1, 300 sec: 42764.7). Total num frames: 3071426560. Throughput: 0: 42537.3. Samples: 3071501700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:09:38,392][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 10:09:40,418][15401] Updated weights for policy 0, policy_version 187470 (0.0040) [2024-06-22 10:09:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.7). Total num frames: 3071639552. Throughput: 0: 42640.5. Samples: 3071752100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:09:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 10:09:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187478_3071639552.pth... [2024-06-22 10:09:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000186852_3061383168.pth [2024-06-22 10:09:44,485][15401] Updated weights for policy 0, policy_version 187480 (0.0039) [2024-06-22 10:09:48,046][15401] Updated weights for policy 0, policy_version 187490 (0.0031) [2024-06-22 10:09:48,390][15132] Fps is (10 sec: 40967.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3071836160. Throughput: 0: 42574.2. Samples: 3072008140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:09:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 10:09:52,138][15401] Updated weights for policy 0, policy_version 187500 (0.0042) [2024-06-22 10:09:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3072049152. Throughput: 0: 42604.9. Samples: 3072139120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:09:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 10:09:55,665][15401] Updated weights for policy 0, policy_version 187510 (0.0031) [2024-06-22 10:09:56,820][15349] Signal inference workers to stop experience collection... (45250 times) [2024-06-22 10:09:56,875][15401] InferenceWorker_p0-w0: stopping experience collection (45250 times) [2024-06-22 10:09:56,936][15349] Signal inference workers to resume experience collection... (45250 times) [2024-06-22 10:09:56,936][15401] InferenceWorker_p0-w0: resuming experience collection (45250 times) [2024-06-22 10:09:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3072278528. Throughput: 0: 42406.8. Samples: 3072388800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:09:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 10:09:59,764][15401] Updated weights for policy 0, policy_version 187520 (0.0033) [2024-06-22 10:10:03,238][15401] Updated weights for policy 0, policy_version 187530 (0.0036) [2024-06-22 10:10:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3072491520. Throughput: 0: 42558.5. Samples: 3072647380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 10:10:07,459][15401] Updated weights for policy 0, policy_version 187540 (0.0030) [2024-06-22 10:10:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 3072688128. Throughput: 0: 42455.2. Samples: 3072773900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 10:10:10,847][15401] Updated weights for policy 0, policy_version 187550 (0.0033) [2024-06-22 10:10:13,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 3072917504. Throughput: 0: 42599.0. Samples: 3073033060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:13,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 10:10:15,150][15401] Updated weights for policy 0, policy_version 187560 (0.0045) [2024-06-22 10:10:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3073130496. Throughput: 0: 42561.5. Samples: 3073285060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 10:10:18,498][15401] Updated weights for policy 0, policy_version 187570 (0.0043) [2024-06-22 10:10:22,788][15401] Updated weights for policy 0, policy_version 187580 (0.0041) [2024-06-22 10:10:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3073327104. Throughput: 0: 42456.3. Samples: 3073412160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 10:10:26,623][15401] Updated weights for policy 0, policy_version 187590 (0.0038) [2024-06-22 10:10:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 3073556480. Throughput: 0: 42587.4. Samples: 3073668540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 10:10:30,402][15401] Updated weights for policy 0, policy_version 187600 (0.0034) [2024-06-22 10:10:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 3073753088. Throughput: 0: 42585.0. Samples: 3073924460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 10:10:34,254][15401] Updated weights for policy 0, policy_version 187610 (0.0027) [2024-06-22 10:10:38,346][15401] Updated weights for policy 0, policy_version 187620 (0.0030) [2024-06-22 10:10:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 3073966080. Throughput: 0: 42439.6. Samples: 3074048900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 10:10:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 10:10:41,678][15401] Updated weights for policy 0, policy_version 187630 (0.0032) [2024-06-22 10:10:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3074179072. Throughput: 0: 42702.7. Samples: 3074310420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:10:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 10:10:45,888][15401] Updated weights for policy 0, policy_version 187640 (0.0037) [2024-06-22 10:10:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3074424832. Throughput: 0: 42670.8. Samples: 3074567560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:10:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 10:10:49,453][15401] Updated weights for policy 0, policy_version 187650 (0.0027) [2024-06-22 10:10:53,225][15401] Updated weights for policy 0, policy_version 187660 (0.0031) [2024-06-22 10:10:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3074621440. Throughput: 0: 42863.0. Samples: 3074702740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:10:53,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 10:10:56,980][15401] Updated weights for policy 0, policy_version 187670 (0.0049) [2024-06-22 10:10:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3074834432. Throughput: 0: 42755.5. Samples: 3074956960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:10:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 10:11:01,005][15401] Updated weights for policy 0, policy_version 187680 (0.0045) [2024-06-22 10:11:03,392][15132] Fps is (10 sec: 45862.4, 60 sec: 43142.6, 300 sec: 42764.6). Total num frames: 3075080192. Throughput: 0: 42856.7. Samples: 3075213740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:03,393][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 10:11:04,883][15401] Updated weights for policy 0, policy_version 187690 (0.0034) [2024-06-22 10:11:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3075276800. Throughput: 0: 43000.0. Samples: 3075347160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:08,390][15401] Updated weights for policy 0, policy_version 187700 (0.0053) [2024-06-22 10:11:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 10:11:12,434][15401] Updated weights for policy 0, policy_version 187710 (0.0039) [2024-06-22 10:11:13,390][15132] Fps is (10 sec: 40971.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 3075489792. Throughput: 0: 43163.6. Samples: 3075610900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 10:11:15,799][15401] Updated weights for policy 0, policy_version 187720 (0.0035) [2024-06-22 10:11:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3075702784. Throughput: 0: 43105.2. Samples: 3075864200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 10:11:20,065][15401] Updated weights for policy 0, policy_version 187730 (0.0033) [2024-06-22 10:11:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 3075915776. Throughput: 0: 43125.2. Samples: 3075989540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 10:11:23,699][15401] Updated weights for policy 0, policy_version 187740 (0.0035) [2024-06-22 10:11:27,830][15401] Updated weights for policy 0, policy_version 187750 (0.0029) [2024-06-22 10:11:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3076128768. Throughput: 0: 42991.1. Samples: 3076245020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:28,393][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 10:11:31,591][15401] Updated weights for policy 0, policy_version 187760 (0.0039) [2024-06-22 10:11:31,916][15349] Signal inference workers to stop experience collection... (45300 times) [2024-06-22 10:11:31,916][15349] Signal inference workers to resume experience collection... (45300 times) [2024-06-22 10:11:31,947][15401] InferenceWorker_p0-w0: stopping experience collection (45300 times) [2024-06-22 10:11:31,947][15401] InferenceWorker_p0-w0: resuming experience collection (45300 times) [2024-06-22 10:11:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 3076358144. Throughput: 0: 42944.9. Samples: 3076500080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 10:11:35,325][15401] Updated weights for policy 0, policy_version 187770 (0.0036) [2024-06-22 10:11:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 3076521984. Throughput: 0: 42693.8. Samples: 3076623960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:38,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 10:11:39,232][15401] Updated weights for policy 0, policy_version 187780 (0.0030) [2024-06-22 10:11:42,718][15401] Updated weights for policy 0, policy_version 187790 (0.0030) [2024-06-22 10:11:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 3076767744. Throughput: 0: 42887.1. Samples: 3076886880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:11:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 10:11:43,596][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187793_3076800512.pth... [2024-06-22 10:11:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187166_3066527744.pth [2024-06-22 10:11:46,763][15401] Updated weights for policy 0, policy_version 187800 (0.0035) [2024-06-22 10:11:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 3076980736. Throughput: 0: 42838.2. Samples: 3077141340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:11:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 10:11:50,639][15401] Updated weights for policy 0, policy_version 187810 (0.0026) [2024-06-22 10:11:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3077160960. Throughput: 0: 42691.9. Samples: 3077268300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:11:53,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 10:11:54,442][15401] Updated weights for policy 0, policy_version 187820 (0.0027) [2024-06-22 10:11:58,352][15401] Updated weights for policy 0, policy_version 187830 (0.0031) [2024-06-22 10:11:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3077406720. Throughput: 0: 42456.6. Samples: 3077521440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:11:58,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 10:12:02,113][15401] Updated weights for policy 0, policy_version 187840 (0.0037) [2024-06-22 10:12:03,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42327.3, 300 sec: 42653.9). Total num frames: 3077619712. Throughput: 0: 42596.0. Samples: 3077781020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 10:12:05,975][15401] Updated weights for policy 0, policy_version 187850 (0.0027) [2024-06-22 10:12:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3077816320. Throughput: 0: 42554.2. Samples: 3077904480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 10:12:09,970][15401] Updated weights for policy 0, policy_version 187860 (0.0036) [2024-06-22 10:12:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3078029312. Throughput: 0: 42477.7. Samples: 3078156520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 10:12:13,694][15401] Updated weights for policy 0, policy_version 187870 (0.0032) [2024-06-22 10:12:17,623][15401] Updated weights for policy 0, policy_version 187880 (0.0037) [2024-06-22 10:12:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3078242304. Throughput: 0: 42463.7. Samples: 3078410940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 10:12:21,685][15401] Updated weights for policy 0, policy_version 187890 (0.0038) [2024-06-22 10:12:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 3078438912. Throughput: 0: 42614.6. Samples: 3078541620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:23,391][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 10:12:25,169][15401] Updated weights for policy 0, policy_version 187900 (0.0029) [2024-06-22 10:12:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3078668288. Throughput: 0: 42409.0. Samples: 3078795280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 10:12:29,536][15401] Updated weights for policy 0, policy_version 187910 (0.0040) [2024-06-22 10:12:32,985][15401] Updated weights for policy 0, policy_version 187920 (0.0041) [2024-06-22 10:12:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 3078897664. Throughput: 0: 42227.5. Samples: 3079041580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 10:12:37,171][15401] Updated weights for policy 0, policy_version 187930 (0.0034) [2024-06-22 10:12:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3079094272. Throughput: 0: 42431.2. Samples: 3079177700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 10:12:40,399][15401] Updated weights for policy 0, policy_version 187940 (0.0033) [2024-06-22 10:12:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3079323648. Throughput: 0: 42554.1. Samples: 3079436380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 10:12:44,602][15401] Updated weights for policy 0, policy_version 187950 (0.0038) [2024-06-22 10:12:48,098][15401] Updated weights for policy 0, policy_version 187960 (0.0038) [2024-06-22 10:12:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3079536640. Throughput: 0: 42423.5. Samples: 3079690180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:48,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 10:12:49,813][15349] Signal inference workers to stop experience collection... (45350 times) [2024-06-22 10:12:49,822][15349] Signal inference workers to resume experience collection... (45350 times) [2024-06-22 10:12:49,851][15401] InferenceWorker_p0-w0: stopping experience collection (45350 times) [2024-06-22 10:12:49,851][15401] InferenceWorker_p0-w0: resuming experience collection (45350 times) [2024-06-22 10:12:52,013][15401] Updated weights for policy 0, policy_version 187970 (0.0022) [2024-06-22 10:12:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3079733248. Throughput: 0: 42739.0. Samples: 3079827740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:12:53,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 10:12:55,961][15401] Updated weights for policy 0, policy_version 187980 (0.0038) [2024-06-22 10:12:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3079962624. Throughput: 0: 42747.2. Samples: 3080080140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:12:58,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 10:12:59,561][15401] Updated weights for policy 0, policy_version 187990 (0.0032) [2024-06-22 10:13:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3080159232. Throughput: 0: 42945.2. Samples: 3080343480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 10:13:03,637][15401] Updated weights for policy 0, policy_version 188000 (0.0035) [2024-06-22 10:13:07,262][15401] Updated weights for policy 0, policy_version 188010 (0.0035) [2024-06-22 10:13:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3080372224. Throughput: 0: 42764.0. Samples: 3080466000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:08,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-22 10:13:11,381][15401] Updated weights for policy 0, policy_version 188020 (0.0044) [2024-06-22 10:13:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3080601600. Throughput: 0: 42724.8. Samples: 3080717900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 10:13:14,777][15401] Updated weights for policy 0, policy_version 188030 (0.0032) [2024-06-22 10:13:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3080814592. Throughput: 0: 43053.1. Samples: 3080978960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 10:13:18,947][15401] Updated weights for policy 0, policy_version 188040 (0.0027) [2024-06-22 10:13:22,509][15401] Updated weights for policy 0, policy_version 188050 (0.0033) [2024-06-22 10:13:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3081027584. Throughput: 0: 42893.3. Samples: 3081107900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 10:13:26,791][15401] Updated weights for policy 0, policy_version 188060 (0.0051) [2024-06-22 10:13:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3081240576. Throughput: 0: 42666.2. Samples: 3081356360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 10:13:30,664][15401] Updated weights for policy 0, policy_version 188070 (0.0033) [2024-06-22 10:13:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3081453568. Throughput: 0: 42946.8. Samples: 3081622680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 10:13:34,275][15401] Updated weights for policy 0, policy_version 188080 (0.0042) [2024-06-22 10:13:38,388][15401] Updated weights for policy 0, policy_version 188090 (0.0038) [2024-06-22 10:13:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3081666560. Throughput: 0: 42659.2. Samples: 3081747400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 10:13:42,031][15401] Updated weights for policy 0, policy_version 188100 (0.0023) [2024-06-22 10:13:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3081879552. Throughput: 0: 42625.3. Samples: 3081998280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 10:13:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188104_3081895936.pth... [2024-06-22 10:13:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187478_3071639552.pth [2024-06-22 10:13:45,963][15401] Updated weights for policy 0, policy_version 188110 (0.0032) [2024-06-22 10:13:48,392][15132] Fps is (10 sec: 42589.4, 60 sec: 42598.6, 300 sec: 42653.6). Total num frames: 3082092544. Throughput: 0: 42435.3. Samples: 3082253160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:48,392][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 10:13:49,755][15401] Updated weights for policy 0, policy_version 188120 (0.0051) [2024-06-22 10:13:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3082289152. Throughput: 0: 42508.1. Samples: 3082378860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 10:13:53,959][15401] Updated weights for policy 0, policy_version 188130 (0.0028) [2024-06-22 10:13:57,393][15401] Updated weights for policy 0, policy_version 188140 (0.0035) [2024-06-22 10:13:58,390][15132] Fps is (10 sec: 44246.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3082534912. Throughput: 0: 42682.3. Samples: 3082638600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 10:13:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 10:14:01,586][15401] Updated weights for policy 0, policy_version 188150 (0.0042) [2024-06-22 10:14:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 3082731520. Throughput: 0: 42422.5. Samples: 3082888080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:03,393][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 10:14:05,562][15401] Updated weights for policy 0, policy_version 188160 (0.0041) [2024-06-22 10:14:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3082928128. Throughput: 0: 42356.1. Samples: 3083013920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 10:14:09,123][15401] Updated weights for policy 0, policy_version 188170 (0.0034) [2024-06-22 10:14:13,143][15401] Updated weights for policy 0, policy_version 188180 (0.0041) [2024-06-22 10:14:13,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3083157504. Throughput: 0: 42595.7. Samples: 3083273160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 10:14:16,914][15401] Updated weights for policy 0, policy_version 188190 (0.0037) [2024-06-22 10:14:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3083354112. Throughput: 0: 42191.1. Samples: 3083521280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:18,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 10:14:20,648][15349] Signal inference workers to stop experience collection... (45400 times) [2024-06-22 10:14:20,685][15401] InferenceWorker_p0-w0: stopping experience collection (45400 times) [2024-06-22 10:14:20,721][15349] Signal inference workers to resume experience collection... (45400 times) [2024-06-22 10:14:20,721][15401] InferenceWorker_p0-w0: resuming experience collection (45400 times) [2024-06-22 10:14:20,869][15401] Updated weights for policy 0, policy_version 188200 (0.0032) [2024-06-22 10:14:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3083567104. Throughput: 0: 42235.6. Samples: 3083648000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:23,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 10:14:24,502][15401] Updated weights for policy 0, policy_version 188210 (0.0038) [2024-06-22 10:14:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3083780096. Throughput: 0: 42406.3. Samples: 3083906560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 10:14:28,471][15401] Updated weights for policy 0, policy_version 188220 (0.0027) [2024-06-22 10:14:32,225][15401] Updated weights for policy 0, policy_version 188230 (0.0027) [2024-06-22 10:14:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 3083993088. Throughput: 0: 42450.1. Samples: 3084163320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 10:14:36,051][15401] Updated weights for policy 0, policy_version 188240 (0.0035) [2024-06-22 10:14:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3084206080. Throughput: 0: 42516.4. Samples: 3084292100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 10:14:39,798][15401] Updated weights for policy 0, policy_version 188250 (0.0035) [2024-06-22 10:14:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3084419072. Throughput: 0: 42529.8. Samples: 3084552440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:43,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-22 10:14:43,583][15401] Updated weights for policy 0, policy_version 188260 (0.0036) [2024-06-22 10:14:47,545][15401] Updated weights for policy 0, policy_version 188270 (0.0047) [2024-06-22 10:14:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 3084632064. Throughput: 0: 42590.3. Samples: 3084804540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 10:14:51,635][15401] Updated weights for policy 0, policy_version 188280 (0.0037) [2024-06-22 10:14:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3084845056. Throughput: 0: 42731.1. Samples: 3084936820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 10:14:55,194][15401] Updated weights for policy 0, policy_version 188290 (0.0034) [2024-06-22 10:14:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 3085041664. Throughput: 0: 42690.2. Samples: 3085194220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:14:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 10:14:59,101][15401] Updated weights for policy 0, policy_version 188300 (0.0032) [2024-06-22 10:15:02,757][15401] Updated weights for policy 0, policy_version 188310 (0.0038) [2024-06-22 10:15:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3085303808. Throughput: 0: 42813.8. Samples: 3085447900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 10:15:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 10:15:06,785][15401] Updated weights for policy 0, policy_version 188320 (0.0038) [2024-06-22 10:15:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 3085516800. Throughput: 0: 43040.0. Samples: 3085584800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 10:15:10,270][15401] Updated weights for policy 0, policy_version 188330 (0.0044) [2024-06-22 10:15:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3085697024. Throughput: 0: 42921.7. Samples: 3085838040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 10:15:14,343][15401] Updated weights for policy 0, policy_version 188340 (0.0040) [2024-06-22 10:15:17,863][15401] Updated weights for policy 0, policy_version 188350 (0.0029) [2024-06-22 10:15:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3085942784. Throughput: 0: 42864.0. Samples: 3086092200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:18,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 10:15:21,947][15401] Updated weights for policy 0, policy_version 188360 (0.0028) [2024-06-22 10:15:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3086139392. Throughput: 0: 43092.5. Samples: 3086231260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 10:15:25,390][15401] Updated weights for policy 0, policy_version 188370 (0.0029) [2024-06-22 10:15:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3086336000. Throughput: 0: 42658.7. Samples: 3086472080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 10:15:29,991][15401] Updated weights for policy 0, policy_version 188380 (0.0033) [2024-06-22 10:15:32,971][15401] Updated weights for policy 0, policy_version 188390 (0.0028) [2024-06-22 10:15:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 3086598144. Throughput: 0: 42747.5. Samples: 3086728180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 10:15:37,018][15349] Signal inference workers to stop experience collection... (45450 times) [2024-06-22 10:15:37,067][15401] InferenceWorker_p0-w0: stopping experience collection (45450 times) [2024-06-22 10:15:37,139][15349] Signal inference workers to resume experience collection... (45450 times) [2024-06-22 10:15:37,139][15401] InferenceWorker_p0-w0: resuming experience collection (45450 times) [2024-06-22 10:15:37,601][15401] Updated weights for policy 0, policy_version 188400 (0.0027) [2024-06-22 10:15:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3086794752. Throughput: 0: 42939.5. Samples: 3086869100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 10:15:40,839][15401] Updated weights for policy 0, policy_version 188410 (0.0029) [2024-06-22 10:15:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3086991360. Throughput: 0: 42604.0. Samples: 3087111400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 10:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188415_3086991360.pth... [2024-06-22 10:15:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000187793_3076800512.pth [2024-06-22 10:15:45,365][15401] Updated weights for policy 0, policy_version 188420 (0.0023) [2024-06-22 10:15:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3087220736. Throughput: 0: 42667.2. Samples: 3087367920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 10:15:48,463][15401] Updated weights for policy 0, policy_version 188430 (0.0033) [2024-06-22 10:15:53,171][15401] Updated weights for policy 0, policy_version 188440 (0.0044) [2024-06-22 10:15:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3087400960. Throughput: 0: 42462.7. Samples: 3087495620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 10:15:56,276][15401] Updated weights for policy 0, policy_version 188450 (0.0027) [2024-06-22 10:15:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42598.8). Total num frames: 3087646720. Throughput: 0: 42248.9. Samples: 3087739240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:15:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 10:16:00,952][15401] Updated weights for policy 0, policy_version 188460 (0.0040) [2024-06-22 10:16:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 3087826944. Throughput: 0: 42415.5. Samples: 3088000900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:16:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 10:16:04,213][15401] Updated weights for policy 0, policy_version 188470 (0.0033) [2024-06-22 10:16:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 3088023552. Throughput: 0: 42094.7. Samples: 3088125520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 10:16:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 10:16:08,709][15401] Updated weights for policy 0, policy_version 188480 (0.0039) [2024-06-22 10:16:11,777][15401] Updated weights for policy 0, policy_version 188490 (0.0028) [2024-06-22 10:16:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3088285696. Throughput: 0: 42470.9. Samples: 3088383280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 10:16:16,247][15401] Updated weights for policy 0, policy_version 188500 (0.0043) [2024-06-22 10:16:18,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 3088482304. Throughput: 0: 42555.5. Samples: 3088643280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:18,393][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 10:16:19,322][15401] Updated weights for policy 0, policy_version 188510 (0.0033) [2024-06-22 10:16:23,389][15132] Fps is (10 sec: 36045.3, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 3088646144. Throughput: 0: 42149.4. Samples: 3088765820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 10:16:24,120][15401] Updated weights for policy 0, policy_version 188520 (0.0033) [2024-06-22 10:16:27,146][15401] Updated weights for policy 0, policy_version 188530 (0.0028) [2024-06-22 10:16:28,393][15132] Fps is (10 sec: 45871.8, 60 sec: 43415.2, 300 sec: 42653.5). Total num frames: 3088941056. Throughput: 0: 42454.3. Samples: 3089021980. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:28,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 10:16:31,723][15401] Updated weights for policy 0, policy_version 188540 (0.0034) [2024-06-22 10:16:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 3089088512. Throughput: 0: 42655.6. Samples: 3089287420. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:33,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 10:16:33,954][15349] Signal inference workers to stop experience collection... (45500 times) [2024-06-22 10:16:33,999][15401] InferenceWorker_p0-w0: stopping experience collection (45500 times) [2024-06-22 10:16:34,068][15349] Signal inference workers to resume experience collection... (45500 times) [2024-06-22 10:16:34,068][15401] InferenceWorker_p0-w0: resuming experience collection (45500 times) [2024-06-22 10:16:34,740][15401] Updated weights for policy 0, policy_version 188550 (0.0038) [2024-06-22 10:16:38,392][15132] Fps is (10 sec: 36047.6, 60 sec: 41777.6, 300 sec: 42487.0). Total num frames: 3089301504. Throughput: 0: 42392.8. Samples: 3089403400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:38,392][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 10:16:39,418][15401] Updated weights for policy 0, policy_version 188560 (0.0035) [2024-06-22 10:16:42,285][15401] Updated weights for policy 0, policy_version 188570 (0.0036) [2024-06-22 10:16:43,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3089580032. Throughput: 0: 42736.4. Samples: 3089662380. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 10:16:46,975][15401] Updated weights for policy 0, policy_version 188580 (0.0029) [2024-06-22 10:16:48,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3089760256. Throughput: 0: 42820.4. Samples: 3089927820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 10:16:50,275][15401] Updated weights for policy 0, policy_version 188590 (0.0034) [2024-06-22 10:16:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 3089956864. Throughput: 0: 42826.5. Samples: 3090052720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 10:16:54,459][15401] Updated weights for policy 0, policy_version 188600 (0.0031) [2024-06-22 10:16:57,843][15401] Updated weights for policy 0, policy_version 188610 (0.0033) [2024-06-22 10:16:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3090219008. Throughput: 0: 42828.1. Samples: 3090310540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:16:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 10:17:02,150][15401] Updated weights for policy 0, policy_version 188620 (0.0031) [2024-06-22 10:17:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3090382848. Throughput: 0: 42906.4. Samples: 3090573960. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:17:03,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 10:17:05,376][15401] Updated weights for policy 0, policy_version 188630 (0.0029) [2024-06-22 10:17:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 3090612224. Throughput: 0: 42902.3. Samples: 3090696420. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:17:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 10:17:09,729][15401] Updated weights for policy 0, policy_version 188640 (0.0036) [2024-06-22 10:17:13,008][15401] Updated weights for policy 0, policy_version 188650 (0.0044) [2024-06-22 10:17:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3090841600. Throughput: 0: 43056.0. Samples: 3090959360. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:17:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 10:17:17,342][15401] Updated weights for policy 0, policy_version 188660 (0.0039) [2024-06-22 10:17:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 3091021824. Throughput: 0: 42903.0. Samples: 3091218060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 25.0) [2024-06-22 10:17:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:17:20,695][15401] Updated weights for policy 0, policy_version 188670 (0.0041) [2024-06-22 10:17:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 3091267584. Throughput: 0: 42991.5. Samples: 3091337920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 10:17:24,778][15401] Updated weights for policy 0, policy_version 188680 (0.0042) [2024-06-22 10:17:28,068][15401] Updated weights for policy 0, policy_version 188690 (0.0031) [2024-06-22 10:17:28,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42600.6, 300 sec: 42709.5). Total num frames: 3091496960. Throughput: 0: 43048.0. Samples: 3091599540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:28,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 10:17:32,359][15401] Updated weights for policy 0, policy_version 188700 (0.0039) [2024-06-22 10:17:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3091660800. Throughput: 0: 43017.4. Samples: 3091863600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 10:17:35,570][15401] Updated weights for policy 0, policy_version 188710 (0.0033) [2024-06-22 10:17:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43419.4, 300 sec: 42654.0). Total num frames: 3091906560. Throughput: 0: 42906.0. Samples: 3091983480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 10:17:40,754][15401] Updated weights for policy 0, policy_version 188720 (0.0028) [2024-06-22 10:17:42,920][15349] Signal inference workers to stop experience collection... (45550 times) [2024-06-22 10:17:42,920][15349] Signal inference workers to resume experience collection... (45550 times) [2024-06-22 10:17:42,953][15401] InferenceWorker_p0-w0: stopping experience collection (45550 times) [2024-06-22 10:17:42,954][15401] InferenceWorker_p0-w0: resuming experience collection (45550 times) [2024-06-22 10:17:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3092135936. Throughput: 0: 42979.1. Samples: 3092244600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 10:17:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188730_3092152320.pth... [2024-06-22 10:17:43,482][15401] Updated weights for policy 0, policy_version 188730 (0.0030) [2024-06-22 10:17:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188104_3081895936.pth [2024-06-22 10:17:48,391][15401] Updated weights for policy 0, policy_version 188740 (0.0026) [2024-06-22 10:17:48,392][15132] Fps is (10 sec: 40948.3, 60 sec: 42596.4, 300 sec: 42653.6). Total num frames: 3092316160. Throughput: 0: 42887.5. Samples: 3092504020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:48,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 10:17:51,037][15401] Updated weights for policy 0, policy_version 188750 (0.0046) [2024-06-22 10:17:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3092545536. Throughput: 0: 42739.1. Samples: 3092619680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 10:17:55,857][15401] Updated weights for policy 0, policy_version 188760 (0.0031) [2024-06-22 10:17:58,389][15132] Fps is (10 sec: 45888.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3092774912. Throughput: 0: 42846.3. Samples: 3092887440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:17:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 10:17:58,743][15401] Updated weights for policy 0, policy_version 188770 (0.0033) [2024-06-22 10:18:03,325][15401] Updated weights for policy 0, policy_version 188780 (0.0037) [2024-06-22 10:18:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3092971520. Throughput: 0: 42719.9. Samples: 3093140460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:18:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 10:18:06,657][15401] Updated weights for policy 0, policy_version 188790 (0.0023) [2024-06-22 10:18:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3093184512. Throughput: 0: 42774.3. Samples: 3093262760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:18:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 10:18:10,994][15401] Updated weights for policy 0, policy_version 188800 (0.0043) [2024-06-22 10:18:13,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3093413888. Throughput: 0: 42896.6. Samples: 3093529880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:18:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 10:18:14,267][15401] Updated weights for policy 0, policy_version 188810 (0.0028) [2024-06-22 10:18:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3093610496. Throughput: 0: 42653.7. Samples: 3093783020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:18:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 10:18:18,581][15401] Updated weights for policy 0, policy_version 188820 (0.0033) [2024-06-22 10:18:21,872][15401] Updated weights for policy 0, policy_version 188830 (0.0035) [2024-06-22 10:18:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3093839872. Throughput: 0: 42726.9. Samples: 3093906200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 10:18:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 10:18:26,126][15401] Updated weights for policy 0, policy_version 188840 (0.0036) [2024-06-22 10:18:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3094036480. Throughput: 0: 42647.1. Samples: 3094163720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:28,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 10:18:29,618][15401] Updated weights for policy 0, policy_version 188850 (0.0039) [2024-06-22 10:18:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 3094249472. Throughput: 0: 42562.7. Samples: 3094419220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 10:18:33,785][15401] Updated weights for policy 0, policy_version 188860 (0.0036) [2024-06-22 10:18:37,828][15401] Updated weights for policy 0, policy_version 188870 (0.0028) [2024-06-22 10:18:38,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 3094495232. Throughput: 0: 42823.1. Samples: 3094546820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:38,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 10:18:41,648][15401] Updated weights for policy 0, policy_version 188880 (0.0032) [2024-06-22 10:18:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42654.2). Total num frames: 3094675456. Throughput: 0: 42631.0. Samples: 3094805840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 10:18:45,303][15401] Updated weights for policy 0, policy_version 188890 (0.0030) [2024-06-22 10:18:48,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42873.4, 300 sec: 42709.5). Total num frames: 3094888448. Throughput: 0: 42591.2. Samples: 3095057060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:48,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 10:18:49,586][15401] Updated weights for policy 0, policy_version 188900 (0.0039) [2024-06-22 10:18:52,913][15401] Updated weights for policy 0, policy_version 188910 (0.0036) [2024-06-22 10:18:53,391][15132] Fps is (10 sec: 44229.6, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 3095117824. Throughput: 0: 42658.9. Samples: 3095182480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:53,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 10:18:57,104][15401] Updated weights for policy 0, policy_version 188920 (0.0041) [2024-06-22 10:18:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 3095314432. Throughput: 0: 42517.7. Samples: 3095443180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:18:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 10:19:00,410][15401] Updated weights for policy 0, policy_version 188930 (0.0039) [2024-06-22 10:19:03,390][15132] Fps is (10 sec: 39328.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3095511040. Throughput: 0: 42635.1. Samples: 3095701600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 10:19:04,853][15401] Updated weights for policy 0, policy_version 188940 (0.0036) [2024-06-22 10:19:08,199][15401] Updated weights for policy 0, policy_version 188950 (0.0042) [2024-06-22 10:19:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3095756800. Throughput: 0: 42689.4. Samples: 3095827220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 10:19:12,439][15401] Updated weights for policy 0, policy_version 188960 (0.0028) [2024-06-22 10:19:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3095953408. Throughput: 0: 42684.5. Samples: 3096084520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 10:19:15,779][15401] Updated weights for policy 0, policy_version 188970 (0.0038) [2024-06-22 10:19:18,236][15349] Signal inference workers to stop experience collection... (45600 times) [2024-06-22 10:19:18,238][15349] Signal inference workers to resume experience collection... (45600 times) [2024-06-22 10:19:18,246][15401] InferenceWorker_p0-w0: stopping experience collection (45600 times) [2024-06-22 10:19:18,279][15401] InferenceWorker_p0-w0: resuming experience collection (45600 times) [2024-06-22 10:19:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3096182784. Throughput: 0: 42625.2. Samples: 3096337360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 10:19:20,227][15401] Updated weights for policy 0, policy_version 188980 (0.0034) [2024-06-22 10:19:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3096363008. Throughput: 0: 42571.9. Samples: 3096462460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 10:19:23,791][15401] Updated weights for policy 0, policy_version 188990 (0.0025) [2024-06-22 10:19:27,874][15401] Updated weights for policy 0, policy_version 189000 (0.0031) [2024-06-22 10:19:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3096592384. Throughput: 0: 42533.9. Samples: 3096719860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 10:19:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 10:19:31,359][15401] Updated weights for policy 0, policy_version 189010 (0.0032) [2024-06-22 10:19:33,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3096821760. Throughput: 0: 42498.8. Samples: 3096969500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 10:19:35,598][15401] Updated weights for policy 0, policy_version 189020 (0.0028) [2024-06-22 10:19:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41780.9, 300 sec: 42653.9). Total num frames: 3097001984. Throughput: 0: 42532.2. Samples: 3097096360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 10:19:39,012][15401] Updated weights for policy 0, policy_version 189030 (0.0033) [2024-06-22 10:19:43,182][15401] Updated weights for policy 0, policy_version 189040 (0.0032) [2024-06-22 10:19:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3097231360. Throughput: 0: 42540.0. Samples: 3097357480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 10:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189041_3097247744.pth... [2024-06-22 10:19:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188415_3086991360.pth [2024-06-22 10:19:46,838][15401] Updated weights for policy 0, policy_version 189050 (0.0034) [2024-06-22 10:19:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3097444352. Throughput: 0: 42441.4. Samples: 3097611460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 10:19:50,762][15401] Updated weights for policy 0, policy_version 189060 (0.0032) [2024-06-22 10:19:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41780.3, 300 sec: 42653.9). Total num frames: 3097624576. Throughput: 0: 42409.8. Samples: 3097735660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 10:19:54,466][15401] Updated weights for policy 0, policy_version 189070 (0.0028) [2024-06-22 10:19:58,290][15401] Updated weights for policy 0, policy_version 189080 (0.0037) [2024-06-22 10:19:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3097886720. Throughput: 0: 42473.6. Samples: 3097995840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:19:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 10:20:02,458][15401] Updated weights for policy 0, policy_version 189090 (0.0040) [2024-06-22 10:20:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 3098066944. Throughput: 0: 42537.2. Samples: 3098251540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 10:20:05,769][15401] Updated weights for policy 0, policy_version 189100 (0.0033) [2024-06-22 10:20:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3098296320. Throughput: 0: 42445.1. Samples: 3098372480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 10:20:10,199][15401] Updated weights for policy 0, policy_version 189110 (0.0040) [2024-06-22 10:20:13,279][15401] Updated weights for policy 0, policy_version 189120 (0.0032) [2024-06-22 10:20:13,390][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3098542080. Throughput: 0: 42543.9. Samples: 3098634340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 10:20:17,756][15401] Updated weights for policy 0, policy_version 189130 (0.0036) [2024-06-22 10:20:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3098722304. Throughput: 0: 42790.3. Samples: 3098895060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 10:20:19,536][15349] Signal inference workers to stop experience collection... (45650 times) [2024-06-22 10:20:19,537][15349] Signal inference workers to resume experience collection... (45650 times) [2024-06-22 10:20:19,562][15401] InferenceWorker_p0-w0: stopping experience collection (45650 times) [2024-06-22 10:20:19,562][15401] InferenceWorker_p0-w0: resuming experience collection (45650 times) [2024-06-22 10:20:21,370][15401] Updated weights for policy 0, policy_version 189140 (0.0032) [2024-06-22 10:20:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3098918912. Throughput: 0: 42774.7. Samples: 3099021220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 10:20:25,548][15401] Updated weights for policy 0, policy_version 189150 (0.0040) [2024-06-22 10:20:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3099148288. Throughput: 0: 42633.9. Samples: 3099276000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 10:20:28,879][15401] Updated weights for policy 0, policy_version 189160 (0.0025) [2024-06-22 10:20:33,070][15401] Updated weights for policy 0, policy_version 189170 (0.0031) [2024-06-22 10:20:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3099361280. Throughput: 0: 42823.1. Samples: 3099538500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 10:20:36,410][15401] Updated weights for policy 0, policy_version 189180 (0.0032) [2024-06-22 10:20:38,392][15132] Fps is (10 sec: 42586.8, 60 sec: 42869.6, 300 sec: 42653.6). Total num frames: 3099574272. Throughput: 0: 42789.5. Samples: 3099661300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 10:20:38,393][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 10:20:40,774][15401] Updated weights for policy 0, policy_version 189190 (0.0041) [2024-06-22 10:20:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3099820032. Throughput: 0: 42822.7. Samples: 3099922860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:20:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 10:20:44,003][15401] Updated weights for policy 0, policy_version 189200 (0.0046) [2024-06-22 10:20:48,390][15132] Fps is (10 sec: 42609.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3100000256. Throughput: 0: 42928.1. Samples: 3100183300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:20:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 10:20:48,519][15401] Updated weights for policy 0, policy_version 189210 (0.0032) [2024-06-22 10:20:51,612][15401] Updated weights for policy 0, policy_version 189220 (0.0046) [2024-06-22 10:20:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3100213248. Throughput: 0: 42966.6. Samples: 3100305980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:20:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 10:20:56,139][15401] Updated weights for policy 0, policy_version 189230 (0.0034) [2024-06-22 10:20:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3100459008. Throughput: 0: 42858.6. Samples: 3100562980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:20:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 10:20:59,056][15401] Updated weights for policy 0, policy_version 189240 (0.0031) [2024-06-22 10:21:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3100639232. Throughput: 0: 42820.4. Samples: 3100821980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 10:21:03,739][15401] Updated weights for policy 0, policy_version 189250 (0.0030) [2024-06-22 10:21:06,907][15401] Updated weights for policy 0, policy_version 189260 (0.0031) [2024-06-22 10:21:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3100868608. Throughput: 0: 42795.2. Samples: 3100947000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 10:21:11,273][15401] Updated weights for policy 0, policy_version 189270 (0.0031) [2024-06-22 10:21:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3101097984. Throughput: 0: 43033.7. Samples: 3101212520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 10:21:14,540][15401] Updated weights for policy 0, policy_version 189280 (0.0041) [2024-06-22 10:21:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3101294592. Throughput: 0: 42804.9. Samples: 3101464720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 10:21:19,273][15401] Updated weights for policy 0, policy_version 189290 (0.0034) [2024-06-22 10:21:22,239][15401] Updated weights for policy 0, policy_version 189300 (0.0043) [2024-06-22 10:21:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42598.9). Total num frames: 3101507584. Throughput: 0: 42915.3. Samples: 3101592380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 10:21:26,783][15401] Updated weights for policy 0, policy_version 189310 (0.0028) [2024-06-22 10:21:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3101720576. Throughput: 0: 42928.8. Samples: 3101854660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 10:21:29,718][15401] Updated weights for policy 0, policy_version 189320 (0.0049) [2024-06-22 10:21:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 3101949952. Throughput: 0: 42731.6. Samples: 3102106220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 10:21:34,635][15401] Updated weights for policy 0, policy_version 189330 (0.0039) [2024-06-22 10:21:36,354][15349] Signal inference workers to stop experience collection... (45700 times) [2024-06-22 10:21:36,355][15349] Signal inference workers to resume experience collection... (45700 times) [2024-06-22 10:21:36,392][15401] InferenceWorker_p0-w0: stopping experience collection (45700 times) [2024-06-22 10:21:36,392][15401] InferenceWorker_p0-w0: resuming experience collection (45700 times) [2024-06-22 10:21:37,631][15401] Updated weights for policy 0, policy_version 189340 (0.0037) [2024-06-22 10:21:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.4, 300 sec: 42653.9). Total num frames: 3102162944. Throughput: 0: 42954.3. Samples: 3102238920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 10:21:42,115][15401] Updated weights for policy 0, policy_version 189350 (0.0037) [2024-06-22 10:21:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3102359552. Throughput: 0: 43028.0. Samples: 3102499240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-22 10:21:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 10:21:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189353_3102359552.pth... [2024-06-22 10:21:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000188730_3092152320.pth [2024-06-22 10:21:45,098][15401] Updated weights for policy 0, policy_version 189360 (0.0033) [2024-06-22 10:21:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3102588928. Throughput: 0: 42865.6. Samples: 3102751040. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:21:48,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 10:21:49,564][15401] Updated weights for policy 0, policy_version 189370 (0.0032) [2024-06-22 10:21:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3102785536. Throughput: 0: 42985.3. Samples: 3102881340. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:21:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:21:53,434][15401] Updated weights for policy 0, policy_version 189380 (0.0031) [2024-06-22 10:21:57,194][15401] Updated weights for policy 0, policy_version 189390 (0.0032) [2024-06-22 10:21:58,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3103014912. Throughput: 0: 42751.4. Samples: 3103136340. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:21:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 10:22:01,005][15401] Updated weights for policy 0, policy_version 189400 (0.0044) [2024-06-22 10:22:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3103227904. Throughput: 0: 42738.1. Samples: 3103387940. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 10:22:04,752][15401] Updated weights for policy 0, policy_version 189410 (0.0029) [2024-06-22 10:22:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3103440896. Throughput: 0: 42796.5. Samples: 3103518220. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 10:22:08,775][15401] Updated weights for policy 0, policy_version 189420 (0.0031) [2024-06-22 10:22:12,374][15401] Updated weights for policy 0, policy_version 189430 (0.0026) [2024-06-22 10:22:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3103670272. Throughput: 0: 42675.6. Samples: 3103775060. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 10:22:16,446][15401] Updated weights for policy 0, policy_version 189440 (0.0042) [2024-06-22 10:22:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3103866880. Throughput: 0: 42737.7. Samples: 3104029420. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:18,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 10:22:20,143][15401] Updated weights for policy 0, policy_version 189450 (0.0043) [2024-06-22 10:22:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3104079872. Throughput: 0: 42560.1. Samples: 3104154120. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 10:22:24,379][15401] Updated weights for policy 0, policy_version 189460 (0.0027) [2024-06-22 10:22:27,672][15401] Updated weights for policy 0, policy_version 189470 (0.0043) [2024-06-22 10:22:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3104292864. Throughput: 0: 42490.6. Samples: 3104411320. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 10:22:32,179][15401] Updated weights for policy 0, policy_version 189480 (0.0032) [2024-06-22 10:22:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3104505856. Throughput: 0: 42689.9. Samples: 3104671980. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 10:22:35,232][15401] Updated weights for policy 0, policy_version 189490 (0.0041) [2024-06-22 10:22:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 3104702464. Throughput: 0: 42629.8. Samples: 3104799680. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 10:22:39,619][15401] Updated weights for policy 0, policy_version 189500 (0.0025) [2024-06-22 10:22:42,763][15401] Updated weights for policy 0, policy_version 189510 (0.0037) [2024-06-22 10:22:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 3104948224. Throughput: 0: 42778.2. Samples: 3105061360. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 10:22:44,155][15349] Signal inference workers to stop experience collection... (45750 times) [2024-06-22 10:22:44,155][15349] Signal inference workers to resume experience collection... (45750 times) [2024-06-22 10:22:44,168][15401] InferenceWorker_p0-w0: stopping experience collection (45750 times) [2024-06-22 10:22:44,190][15401] InferenceWorker_p0-w0: resuming experience collection (45750 times) [2024-06-22 10:22:47,207][15401] Updated weights for policy 0, policy_version 189520 (0.0033) [2024-06-22 10:22:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 3105128448. Throughput: 0: 43001.4. Samples: 3105323000. Policy #0 lag: (min: 2.0, avg: 11.5, max: 24.0) [2024-06-22 10:22:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 10:22:50,419][15401] Updated weights for policy 0, policy_version 189530 (0.0038) [2024-06-22 10:22:53,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3105357824. Throughput: 0: 42700.8. Samples: 3105439760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:22:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 10:22:54,886][15401] Updated weights for policy 0, policy_version 189540 (0.0042) [2024-06-22 10:22:57,993][15401] Updated weights for policy 0, policy_version 189550 (0.0036) [2024-06-22 10:22:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3105603584. Throughput: 0: 42773.9. Samples: 3105699880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:22:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 10:23:03,061][15401] Updated weights for policy 0, policy_version 189560 (0.0031) [2024-06-22 10:23:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3105751040. Throughput: 0: 42991.5. Samples: 3105964040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 10:23:05,414][15401] Updated weights for policy 0, policy_version 189570 (0.0030) [2024-06-22 10:23:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3105996800. Throughput: 0: 42832.5. Samples: 3106081580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 10:23:10,598][15401] Updated weights for policy 0, policy_version 189580 (0.0031) [2024-06-22 10:23:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3106226176. Throughput: 0: 42967.5. Samples: 3106344860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:13,395][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 10:23:13,458][15401] Updated weights for policy 0, policy_version 189590 (0.0031) [2024-06-22 10:23:18,245][15401] Updated weights for policy 0, policy_version 189600 (0.0044) [2024-06-22 10:23:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3106422784. Throughput: 0: 43120.0. Samples: 3106612380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 10:23:20,992][15401] Updated weights for policy 0, policy_version 189610 (0.0027) [2024-06-22 10:23:23,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 3106652160. Throughput: 0: 42803.8. Samples: 3106725880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:23,391][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 10:23:25,718][15401] Updated weights for policy 0, policy_version 189620 (0.0031) [2024-06-22 10:23:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3106865152. Throughput: 0: 42802.1. Samples: 3106987440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:28,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 10:23:28,667][15401] Updated weights for policy 0, policy_version 189630 (0.0034) [2024-06-22 10:23:33,389][15132] Fps is (10 sec: 39324.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 3107045376. Throughput: 0: 42911.5. Samples: 3107254020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 10:23:33,425][15401] Updated weights for policy 0, policy_version 189640 (0.0048) [2024-06-22 10:23:36,197][15401] Updated weights for policy 0, policy_version 189650 (0.0038) [2024-06-22 10:23:38,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43412.9, 300 sec: 42819.6). Total num frames: 3107307520. Throughput: 0: 42978.8. Samples: 3107374080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:38,397][15132] Avg episode reward: [(0, '0.147')] [2024-06-22 10:23:41,111][15401] Updated weights for policy 0, policy_version 189660 (0.0029) [2024-06-22 10:23:43,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 3107520512. Throughput: 0: 43077.9. Samples: 3107638380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 10:23:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189669_3107536896.pth... [2024-06-22 10:23:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189041_3097247744.pth [2024-06-22 10:23:43,734][15401] Updated weights for policy 0, policy_version 189670 (0.0040) [2024-06-22 10:23:48,389][15132] Fps is (10 sec: 37707.3, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 3107684352. Throughput: 0: 42956.1. Samples: 3107897060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 10:23:48,577][15401] Updated weights for policy 0, policy_version 189680 (0.0032) [2024-06-22 10:23:50,740][15349] Signal inference workers to stop experience collection... (45800 times) [2024-06-22 10:23:50,743][15349] Signal inference workers to resume experience collection... (45800 times) [2024-06-22 10:23:50,755][15401] InferenceWorker_p0-w0: stopping experience collection (45800 times) [2024-06-22 10:23:50,755][15401] InferenceWorker_p0-w0: resuming experience collection (45800 times) [2024-06-22 10:23:51,569][15401] Updated weights for policy 0, policy_version 189690 (0.0028) [2024-06-22 10:23:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3107962880. Throughput: 0: 42971.9. Samples: 3108015320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 10:23:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 10:23:56,121][15401] Updated weights for policy 0, policy_version 189700 (0.0040) [2024-06-22 10:23:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3108159488. Throughput: 0: 42993.4. Samples: 3108279560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:23:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 10:23:59,106][15401] Updated weights for policy 0, policy_version 189710 (0.0036) [2024-06-22 10:24:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3108339712. Throughput: 0: 42828.4. Samples: 3108539660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 10:24:03,740][15401] Updated weights for policy 0, policy_version 189720 (0.0036) [2024-06-22 10:24:06,735][15401] Updated weights for policy 0, policy_version 189730 (0.0034) [2024-06-22 10:24:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 3108618240. Throughput: 0: 43012.6. Samples: 3108661420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 10:24:11,388][15401] Updated weights for policy 0, policy_version 189740 (0.0033) [2024-06-22 10:24:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 3108814848. Throughput: 0: 43071.1. Samples: 3108925640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:13,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 10:24:14,562][15401] Updated weights for policy 0, policy_version 189750 (0.0041) [2024-06-22 10:24:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3108995072. Throughput: 0: 42789.7. Samples: 3109179560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 10:24:18,907][15401] Updated weights for policy 0, policy_version 189760 (0.0031) [2024-06-22 10:24:22,193][15401] Updated weights for policy 0, policy_version 189770 (0.0030) [2024-06-22 10:24:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43145.0, 300 sec: 42876.1). Total num frames: 3109240832. Throughput: 0: 42909.6. Samples: 3109304740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 10:24:26,582][15401] Updated weights for policy 0, policy_version 189780 (0.0039) [2024-06-22 10:24:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3109437440. Throughput: 0: 42832.0. Samples: 3109565820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 10:24:29,658][15401] Updated weights for policy 0, policy_version 189790 (0.0041) [2024-06-22 10:24:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3109650432. Throughput: 0: 42727.0. Samples: 3109819780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 10:24:34,029][15401] Updated weights for policy 0, policy_version 189800 (0.0033) [2024-06-22 10:24:37,165][15401] Updated weights for policy 0, policy_version 189810 (0.0032) [2024-06-22 10:24:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 3109879808. Throughput: 0: 43056.5. Samples: 3109952860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 10:24:41,611][15401] Updated weights for policy 0, policy_version 189820 (0.0041) [2024-06-22 10:24:43,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3110092800. Throughput: 0: 43030.8. Samples: 3110215940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 10:24:44,767][15401] Updated weights for policy 0, policy_version 189830 (0.0028) [2024-06-22 10:24:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 3110289408. Throughput: 0: 42960.6. Samples: 3110472880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:48,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 10:24:49,130][15401] Updated weights for policy 0, policy_version 189840 (0.0027) [2024-06-22 10:24:52,367][15401] Updated weights for policy 0, policy_version 189850 (0.0036) [2024-06-22 10:24:53,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3110518784. Throughput: 0: 43019.0. Samples: 3110597280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 10:24:56,685][15401] Updated weights for policy 0, policy_version 189860 (0.0026) [2024-06-22 10:24:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3110748160. Throughput: 0: 42927.0. Samples: 3110857360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:24:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 10:24:59,877][15401] Updated weights for policy 0, policy_version 189870 (0.0030) [2024-06-22 10:25:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3110944768. Throughput: 0: 42997.4. Samples: 3111114440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 10:25:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 10:25:04,165][15401] Updated weights for policy 0, policy_version 189880 (0.0040) [2024-06-22 10:25:07,698][15401] Updated weights for policy 0, policy_version 189890 (0.0039) [2024-06-22 10:25:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3111174144. Throughput: 0: 42989.8. Samples: 3111239280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 10:25:11,985][15401] Updated weights for policy 0, policy_version 189900 (0.0027) [2024-06-22 10:25:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3111370752. Throughput: 0: 42978.2. Samples: 3111499840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:13,396][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 10:25:14,696][15349] Signal inference workers to stop experience collection... (45850 times) [2024-06-22 10:25:14,698][15349] Signal inference workers to resume experience collection... (45850 times) [2024-06-22 10:25:14,725][15401] InferenceWorker_p0-w0: stopping experience collection (45850 times) [2024-06-22 10:25:14,725][15401] InferenceWorker_p0-w0: resuming experience collection (45850 times) [2024-06-22 10:25:15,122][15401] Updated weights for policy 0, policy_version 189910 (0.0039) [2024-06-22 10:25:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3111583744. Throughput: 0: 43165.9. Samples: 3111762240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:18,391][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 10:25:19,472][15401] Updated weights for policy 0, policy_version 189920 (0.0038) [2024-06-22 10:25:22,568][15401] Updated weights for policy 0, policy_version 189930 (0.0038) [2024-06-22 10:25:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3111813120. Throughput: 0: 43079.5. Samples: 3111891440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 10:25:27,046][15401] Updated weights for policy 0, policy_version 189940 (0.0029) [2024-06-22 10:25:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3112009728. Throughput: 0: 42833.2. Samples: 3112143440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 10:25:30,636][15401] Updated weights for policy 0, policy_version 189950 (0.0033) [2024-06-22 10:25:33,390][15132] Fps is (10 sec: 42597.0, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 3112239104. Throughput: 0: 42927.1. Samples: 3112404620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 10:25:34,700][15401] Updated weights for policy 0, policy_version 189960 (0.0036) [2024-06-22 10:25:38,060][15401] Updated weights for policy 0, policy_version 189970 (0.0049) [2024-06-22 10:25:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3112468480. Throughput: 0: 43028.1. Samples: 3112533540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 10:25:42,356][15401] Updated weights for policy 0, policy_version 189980 (0.0049) [2024-06-22 10:25:43,390][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3112648704. Throughput: 0: 42885.3. Samples: 3112787200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:43,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-22 10:25:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189981_3112648704.pth... [2024-06-22 10:25:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189353_3102359552.pth [2024-06-22 10:25:46,140][15401] Updated weights for policy 0, policy_version 189990 (0.0034) [2024-06-22 10:25:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 3112878080. Throughput: 0: 42847.7. Samples: 3113042580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:48,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 10:25:50,009][15401] Updated weights for policy 0, policy_version 190000 (0.0036) [2024-06-22 10:25:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3113091072. Throughput: 0: 42900.8. Samples: 3113169820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 10:25:53,777][15401] Updated weights for policy 0, policy_version 190010 (0.0032) [2024-06-22 10:25:57,733][15401] Updated weights for policy 0, policy_version 190020 (0.0022) [2024-06-22 10:25:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3113287680. Throughput: 0: 42717.7. Samples: 3113422140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:25:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 10:26:01,167][15401] Updated weights for policy 0, policy_version 190030 (0.0031) [2024-06-22 10:26:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3113500672. Throughput: 0: 42585.8. Samples: 3113678600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:26:03,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 10:26:05,717][15401] Updated weights for policy 0, policy_version 190040 (0.0040) [2024-06-22 10:26:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3113746432. Throughput: 0: 42648.5. Samples: 3113810620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 10:26:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 10:26:08,697][15401] Updated weights for policy 0, policy_version 190050 (0.0034) [2024-06-22 10:26:13,311][15401] Updated weights for policy 0, policy_version 190060 (0.0034) [2024-06-22 10:26:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3113943040. Throughput: 0: 42760.8. Samples: 3114067680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 10:26:16,197][15401] Updated weights for policy 0, policy_version 190070 (0.0039) [2024-06-22 10:26:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3114156032. Throughput: 0: 42571.4. Samples: 3114320320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 10:26:21,156][15401] Updated weights for policy 0, policy_version 190080 (0.0038) [2024-06-22 10:26:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3114385408. Throughput: 0: 42708.6. Samples: 3114455420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:23,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 10:26:24,124][15349] Signal inference workers to stop experience collection... (45900 times) [2024-06-22 10:26:24,129][15349] Signal inference workers to resume experience collection... (45900 times) [2024-06-22 10:26:24,142][15401] Updated weights for policy 0, policy_version 190090 (0.0038) [2024-06-22 10:26:24,173][15401] InferenceWorker_p0-w0: stopping experience collection (45900 times) [2024-06-22 10:26:24,173][15401] InferenceWorker_p0-w0: resuming experience collection (45900 times) [2024-06-22 10:26:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3114565632. Throughput: 0: 42792.9. Samples: 3114712880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:28,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-22 10:26:28,666][15401] Updated weights for policy 0, policy_version 190100 (0.0057) [2024-06-22 10:26:31,592][15401] Updated weights for policy 0, policy_version 190110 (0.0033) [2024-06-22 10:26:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 3114811392. Throughput: 0: 42738.2. Samples: 3114965800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:33,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-22 10:26:36,299][15401] Updated weights for policy 0, policy_version 190120 (0.0044) [2024-06-22 10:26:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3115024384. Throughput: 0: 42912.6. Samples: 3115100880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 10:26:39,342][15401] Updated weights for policy 0, policy_version 190130 (0.0035) [2024-06-22 10:26:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3115220992. Throughput: 0: 42906.2. Samples: 3115352920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:43,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 10:26:43,934][15401] Updated weights for policy 0, policy_version 190140 (0.0034) [2024-06-22 10:26:47,082][15401] Updated weights for policy 0, policy_version 190150 (0.0043) [2024-06-22 10:26:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3115466752. Throughput: 0: 42724.5. Samples: 3115601200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 10:26:51,555][15401] Updated weights for policy 0, policy_version 190160 (0.0033) [2024-06-22 10:26:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3115646976. Throughput: 0: 42772.9. Samples: 3115735400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:53,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 10:26:54,758][15401] Updated weights for policy 0, policy_version 190170 (0.0042) [2024-06-22 10:26:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3115859968. Throughput: 0: 42833.0. Samples: 3115995160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:26:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 10:26:59,349][15401] Updated weights for policy 0, policy_version 190180 (0.0041) [2024-06-22 10:27:02,554][15401] Updated weights for policy 0, policy_version 190190 (0.0034) [2024-06-22 10:27:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3116105728. Throughput: 0: 42686.7. Samples: 3116241220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:27:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 10:27:06,790][15401] Updated weights for policy 0, policy_version 190200 (0.0039) [2024-06-22 10:27:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3116285952. Throughput: 0: 42696.8. Samples: 3116376780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:27:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 10:27:10,110][15401] Updated weights for policy 0, policy_version 190210 (0.0030) [2024-06-22 10:27:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3116515328. Throughput: 0: 42825.9. Samples: 3116640040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 10:27:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 10:27:14,153][15401] Updated weights for policy 0, policy_version 190220 (0.0034) [2024-06-22 10:27:17,806][15401] Updated weights for policy 0, policy_version 190230 (0.0026) [2024-06-22 10:27:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 3116761088. Throughput: 0: 42698.1. Samples: 3116887220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 10:27:22,368][15401] Updated weights for policy 0, policy_version 190240 (0.0030) [2024-06-22 10:27:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3116941312. Throughput: 0: 42732.8. Samples: 3117023860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:23,400][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 10:27:24,104][15349] Signal inference workers to stop experience collection... (45950 times) [2024-06-22 10:27:24,107][15349] Signal inference workers to resume experience collection... (45950 times) [2024-06-22 10:27:24,128][15401] InferenceWorker_p0-w0: stopping experience collection (45950 times) [2024-06-22 10:27:24,128][15401] InferenceWorker_p0-w0: resuming experience collection (45950 times) [2024-06-22 10:27:25,492][15401] Updated weights for policy 0, policy_version 190250 (0.0033) [2024-06-22 10:27:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3117170688. Throughput: 0: 42780.1. Samples: 3117278020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 10:27:29,874][15401] Updated weights for policy 0, policy_version 190260 (0.0033) [2024-06-22 10:27:33,059][15401] Updated weights for policy 0, policy_version 190270 (0.0026) [2024-06-22 10:27:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3117400064. Throughput: 0: 43048.0. Samples: 3117538360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 10:27:37,474][15401] Updated weights for policy 0, policy_version 190280 (0.0027) [2024-06-22 10:27:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3117580288. Throughput: 0: 42955.5. Samples: 3117668400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 10:27:40,833][15401] Updated weights for policy 0, policy_version 190290 (0.0032) [2024-06-22 10:27:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3117809664. Throughput: 0: 42806.6. Samples: 3117921460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 10:27:43,499][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190297_3117826048.pth... [2024-06-22 10:27:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189669_3107536896.pth [2024-06-22 10:27:45,038][15401] Updated weights for policy 0, policy_version 190300 (0.0034) [2024-06-22 10:27:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3118022656. Throughput: 0: 43322.3. Samples: 3118190720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 10:27:48,398][15401] Updated weights for policy 0, policy_version 190310 (0.0038) [2024-06-22 10:27:52,546][15401] Updated weights for policy 0, policy_version 190320 (0.0035) [2024-06-22 10:27:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3118219264. Throughput: 0: 43153.8. Samples: 3118318700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:53,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 10:27:56,054][15401] Updated weights for policy 0, policy_version 190330 (0.0034) [2024-06-22 10:27:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 3118465024. Throughput: 0: 42884.3. Samples: 3118569840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:27:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 10:28:00,029][15401] Updated weights for policy 0, policy_version 190340 (0.0029) [2024-06-22 10:28:03,393][15132] Fps is (10 sec: 44220.5, 60 sec: 42595.8, 300 sec: 42931.1). Total num frames: 3118661632. Throughput: 0: 43308.6. Samples: 3118836260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:28:03,394][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 10:28:03,818][15401] Updated weights for policy 0, policy_version 190350 (0.0028) [2024-06-22 10:28:07,842][15401] Updated weights for policy 0, policy_version 190360 (0.0031) [2024-06-22 10:28:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3118874624. Throughput: 0: 43083.6. Samples: 3118962620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:28:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 10:28:11,438][15401] Updated weights for policy 0, policy_version 190370 (0.0035) [2024-06-22 10:28:13,389][15132] Fps is (10 sec: 44253.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3119104000. Throughput: 0: 43049.3. Samples: 3119215240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:28:13,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-22 10:28:15,341][15401] Updated weights for policy 0, policy_version 190380 (0.0027) [2024-06-22 10:28:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42931.4). Total num frames: 3119316992. Throughput: 0: 43190.1. Samples: 3119482020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 10:28:18,392][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 10:28:18,996][15401] Updated weights for policy 0, policy_version 190390 (0.0043) [2024-06-22 10:28:22,830][15401] Updated weights for policy 0, policy_version 190400 (0.0043) [2024-06-22 10:28:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3119529984. Throughput: 0: 43092.0. Samples: 3119607540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 10:28:26,628][15401] Updated weights for policy 0, policy_version 190410 (0.0037) [2024-06-22 10:28:28,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3119742976. Throughput: 0: 43054.7. Samples: 3119858920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 10:28:30,356][15401] Updated weights for policy 0, policy_version 190420 (0.0039) [2024-06-22 10:28:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42821.5). Total num frames: 3119939584. Throughput: 0: 42931.0. Samples: 3120122620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:33,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 10:28:34,285][15401] Updated weights for policy 0, policy_version 190430 (0.0038) [2024-06-22 10:28:37,865][15401] Updated weights for policy 0, policy_version 190440 (0.0035) [2024-06-22 10:28:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3120185344. Throughput: 0: 42839.1. Samples: 3120246460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 10:28:42,008][15401] Updated weights for policy 0, policy_version 190450 (0.0046) [2024-06-22 10:28:43,004][15349] Signal inference workers to stop experience collection... (46000 times) [2024-06-22 10:28:43,004][15349] Signal inference workers to resume experience collection... (46000 times) [2024-06-22 10:28:43,039][15401] InferenceWorker_p0-w0: stopping experience collection (46000 times) [2024-06-22 10:28:43,039][15401] InferenceWorker_p0-w0: resuming experience collection (46000 times) [2024-06-22 10:28:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 3120398336. Throughput: 0: 42946.2. Samples: 3120502420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 10:28:45,434][15401] Updated weights for policy 0, policy_version 190460 (0.0050) [2024-06-22 10:28:48,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3120578560. Throughput: 0: 42736.3. Samples: 3120759340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:48,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 10:28:49,835][15401] Updated weights for policy 0, policy_version 190470 (0.0030) [2024-06-22 10:28:52,935][15401] Updated weights for policy 0, policy_version 190480 (0.0037) [2024-06-22 10:28:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3120824320. Throughput: 0: 42723.1. Samples: 3120885160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 10:28:57,605][15401] Updated weights for policy 0, policy_version 190490 (0.0031) [2024-06-22 10:28:58,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3121020928. Throughput: 0: 42924.8. Samples: 3121146860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:28:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 10:29:00,601][15401] Updated weights for policy 0, policy_version 190500 (0.0036) [2024-06-22 10:29:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 3121217536. Throughput: 0: 42634.6. Samples: 3121400480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 10:29:05,290][15401] Updated weights for policy 0, policy_version 190510 (0.0032) [2024-06-22 10:29:08,056][15401] Updated weights for policy 0, policy_version 190520 (0.0036) [2024-06-22 10:29:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3121479680. Throughput: 0: 42689.3. Samples: 3121528560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:08,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 10:29:12,947][15401] Updated weights for policy 0, policy_version 190530 (0.0044) [2024-06-22 10:29:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3121643520. Throughput: 0: 42849.3. Samples: 3121787140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 10:29:15,593][15401] Updated weights for policy 0, policy_version 190540 (0.0029) [2024-06-22 10:29:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 3121856512. Throughput: 0: 42563.5. Samples: 3122037980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:18,394][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 10:29:20,747][15401] Updated weights for policy 0, policy_version 190550 (0.0026) [2024-06-22 10:29:23,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3122118656. Throughput: 0: 42743.1. Samples: 3122169900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 10:29:23,453][15401] Updated weights for policy 0, policy_version 190560 (0.0035) [2024-06-22 10:29:28,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42052.0, 300 sec: 42765.0). Total num frames: 3122266112. Throughput: 0: 42682.3. Samples: 3122423140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 10:29:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 10:29:28,684][15401] Updated weights for policy 0, policy_version 190570 (0.0031) [2024-06-22 10:29:31,097][15401] Updated weights for policy 0, policy_version 190580 (0.0028) [2024-06-22 10:29:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3122511872. Throughput: 0: 42657.3. Samples: 3122678820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 10:29:36,504][15401] Updated weights for policy 0, policy_version 190590 (0.0033) [2024-06-22 10:29:38,390][15132] Fps is (10 sec: 47515.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3122741248. Throughput: 0: 42693.8. Samples: 3122806380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 10:29:39,403][15401] Updated weights for policy 0, policy_version 190600 (0.0021) [2024-06-22 10:29:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3122937856. Throughput: 0: 42683.6. Samples: 3123067620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 10:29:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190609_3122937856.pth... [2024-06-22 10:29:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000189981_3112648704.pth [2024-06-22 10:29:44,159][15401] Updated weights for policy 0, policy_version 190610 (0.0031) [2024-06-22 10:29:46,893][15401] Updated weights for policy 0, policy_version 190620 (0.0035) [2024-06-22 10:29:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3123167232. Throughput: 0: 42604.2. Samples: 3123317660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 10:29:51,774][15401] Updated weights for policy 0, policy_version 190630 (0.0041) [2024-06-22 10:29:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3123380224. Throughput: 0: 42646.2. Samples: 3123447640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 10:29:54,275][15401] Updated weights for policy 0, policy_version 190640 (0.0027) [2024-06-22 10:29:55,766][15349] Signal inference workers to stop experience collection... (46050 times) [2024-06-22 10:29:55,821][15401] InferenceWorker_p0-w0: stopping experience collection (46050 times) [2024-06-22 10:29:55,824][15349] Signal inference workers to resume experience collection... (46050 times) [2024-06-22 10:29:55,836][15401] InferenceWorker_p0-w0: resuming experience collection (46050 times) [2024-06-22 10:29:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 3123544064. Throughput: 0: 42614.4. Samples: 3123704780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:29:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 10:29:59,433][15401] Updated weights for policy 0, policy_version 190650 (0.0040) [2024-06-22 10:30:02,437][15401] Updated weights for policy 0, policy_version 190660 (0.0026) [2024-06-22 10:30:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3123789824. Throughput: 0: 42459.5. Samples: 3123948660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 10:30:07,048][15401] Updated weights for policy 0, policy_version 190670 (0.0032) [2024-06-22 10:30:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 3124002816. Throughput: 0: 42486.1. Samples: 3124081780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 10:30:09,877][15401] Updated weights for policy 0, policy_version 190680 (0.0038) [2024-06-22 10:30:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3124199424. Throughput: 0: 42606.2. Samples: 3124340400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 10:30:14,599][15401] Updated weights for policy 0, policy_version 190690 (0.0033) [2024-06-22 10:30:17,387][15401] Updated weights for policy 0, policy_version 190700 (0.0032) [2024-06-22 10:30:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3124445184. Throughput: 0: 42314.2. Samples: 3124583060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:18,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 10:30:22,446][15401] Updated weights for policy 0, policy_version 190710 (0.0029) [2024-06-22 10:30:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 3124641792. Throughput: 0: 42594.7. Samples: 3124723140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 10:30:25,171][15401] Updated weights for policy 0, policy_version 190720 (0.0038) [2024-06-22 10:30:28,390][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.7, 300 sec: 42654.0). Total num frames: 3124822016. Throughput: 0: 42344.9. Samples: 3124973140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 10:30:30,134][15401] Updated weights for policy 0, policy_version 190730 (0.0047) [2024-06-22 10:30:32,818][15401] Updated weights for policy 0, policy_version 190740 (0.0026) [2024-06-22 10:30:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3125084160. Throughput: 0: 42054.6. Samples: 3125210120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 10:30:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 10:30:37,773][15401] Updated weights for policy 0, policy_version 190750 (0.0036) [2024-06-22 10:30:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 3125264384. Throughput: 0: 42273.8. Samples: 3125349960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:30:38,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 10:30:40,916][15401] Updated weights for policy 0, policy_version 190760 (0.0026) [2024-06-22 10:30:43,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3125460992. Throughput: 0: 42103.4. Samples: 3125599440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:30:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 10:30:45,605][15401] Updated weights for policy 0, policy_version 190770 (0.0036) [2024-06-22 10:30:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3125723136. Throughput: 0: 42166.9. Samples: 3125846160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:30:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 10:30:48,484][15401] Updated weights for policy 0, policy_version 190780 (0.0045) [2024-06-22 10:30:53,174][15401] Updated weights for policy 0, policy_version 190790 (0.0023) [2024-06-22 10:30:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3125903360. Throughput: 0: 42327.5. Samples: 3125986520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:30:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 10:30:56,266][15401] Updated weights for policy 0, policy_version 190800 (0.0032) [2024-06-22 10:30:58,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3126116352. Throughput: 0: 42064.5. Samples: 3126233300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:30:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 10:31:01,192][15401] Updated weights for policy 0, policy_version 190810 (0.0038) [2024-06-22 10:31:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3126345728. Throughput: 0: 42345.9. Samples: 3126488520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 10:31:03,898][15401] Updated weights for policy 0, policy_version 190820 (0.0044) [2024-06-22 10:31:05,639][15349] Signal inference workers to stop experience collection... (46100 times) [2024-06-22 10:31:05,670][15401] InferenceWorker_p0-w0: stopping experience collection (46100 times) [2024-06-22 10:31:05,698][15349] Signal inference workers to resume experience collection... (46100 times) [2024-06-22 10:31:05,704][15401] InferenceWorker_p0-w0: resuming experience collection (46100 times) [2024-06-22 10:31:08,391][15132] Fps is (10 sec: 40952.2, 60 sec: 42051.0, 300 sec: 42653.7). Total num frames: 3126525952. Throughput: 0: 42104.9. Samples: 3126617940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:08,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 10:31:09,067][15401] Updated weights for policy 0, policy_version 190830 (0.0034) [2024-06-22 10:31:11,877][15401] Updated weights for policy 0, policy_version 190840 (0.0036) [2024-06-22 10:31:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 3126771712. Throughput: 0: 42075.5. Samples: 3126866640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:13,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 10:31:16,883][15401] Updated weights for policy 0, policy_version 190850 (0.0032) [2024-06-22 10:31:18,389][15132] Fps is (10 sec: 45884.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3126984704. Throughput: 0: 42642.7. Samples: 3127129040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 10:31:19,524][15401] Updated weights for policy 0, policy_version 190860 (0.0044) [2024-06-22 10:31:23,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3127164928. Throughput: 0: 42420.4. Samples: 3127258880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 10:31:24,706][15401] Updated weights for policy 0, policy_version 190870 (0.0035) [2024-06-22 10:31:27,299][15401] Updated weights for policy 0, policy_version 190880 (0.0040) [2024-06-22 10:31:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3127410688. Throughput: 0: 42492.5. Samples: 3127511600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:28,390][15132] Avg episode reward: [(0, '0.135')] [2024-06-22 10:31:32,334][15401] Updated weights for policy 0, policy_version 190890 (0.0026) [2024-06-22 10:31:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3127607296. Throughput: 0: 42757.7. Samples: 3127770260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 10:31:35,072][15401] Updated weights for policy 0, policy_version 190900 (0.0031) [2024-06-22 10:31:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3127803904. Throughput: 0: 42421.0. Samples: 3127895460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 10:31:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 10:31:40,100][15401] Updated weights for policy 0, policy_version 190910 (0.0030) [2024-06-22 10:31:43,055][15401] Updated weights for policy 0, policy_version 190920 (0.0034) [2024-06-22 10:31:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3128049664. Throughput: 0: 42454.6. Samples: 3128143760. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:31:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 10:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190921_3128049664.pth... [2024-06-22 10:31:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190297_3117826048.pth [2024-06-22 10:31:47,797][15401] Updated weights for policy 0, policy_version 190930 (0.0046) [2024-06-22 10:31:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 3128229888. Throughput: 0: 42567.9. Samples: 3128404080. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:31:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 10:31:50,609][15401] Updated weights for policy 0, policy_version 190940 (0.0037) [2024-06-22 10:31:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3128459264. Throughput: 0: 42397.8. Samples: 3128525760. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:31:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 10:31:55,220][15401] Updated weights for policy 0, policy_version 190950 (0.0040) [2024-06-22 10:31:58,018][15401] Updated weights for policy 0, policy_version 190960 (0.0039) [2024-06-22 10:31:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3128688640. Throughput: 0: 42675.7. Samples: 3128786940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:31:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 10:32:02,734][15401] Updated weights for policy 0, policy_version 190970 (0.0035) [2024-06-22 10:32:03,394][15132] Fps is (10 sec: 42581.2, 60 sec: 42322.5, 300 sec: 42708.9). Total num frames: 3128885248. Throughput: 0: 42721.4. Samples: 3129051680. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:03,394][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 10:32:05,625][15401] Updated weights for policy 0, policy_version 190980 (0.0032) [2024-06-22 10:32:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42872.8, 300 sec: 42653.9). Total num frames: 3129098240. Throughput: 0: 42475.7. Samples: 3129170280. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 10:32:10,445][15401] Updated weights for policy 0, policy_version 190990 (0.0032) [2024-06-22 10:32:13,192][15401] Updated weights for policy 0, policy_version 191000 (0.0034) [2024-06-22 10:32:13,389][15132] Fps is (10 sec: 45893.6, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 3129344000. Throughput: 0: 42790.3. Samples: 3129437160. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 10:32:17,002][15349] Signal inference workers to stop experience collection... (46150 times) [2024-06-22 10:32:17,003][15349] Signal inference workers to resume experience collection... (46150 times) [2024-06-22 10:32:17,045][15401] InferenceWorker_p0-w0: stopping experience collection (46150 times) [2024-06-22 10:32:17,045][15401] InferenceWorker_p0-w0: resuming experience collection (46150 times) [2024-06-22 10:32:17,858][15401] Updated weights for policy 0, policy_version 191010 (0.0027) [2024-06-22 10:32:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 3129524224. Throughput: 0: 42888.4. Samples: 3129700340. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:18,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 10:32:20,777][15401] Updated weights for policy 0, policy_version 191020 (0.0034) [2024-06-22 10:32:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3129753600. Throughput: 0: 42830.6. Samples: 3129822840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 10:32:25,198][15401] Updated weights for policy 0, policy_version 191030 (0.0042) [2024-06-22 10:32:28,389][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3129982976. Throughput: 0: 43181.4. Samples: 3130086920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 10:32:28,485][15401] Updated weights for policy 0, policy_version 191040 (0.0038) [2024-06-22 10:32:33,127][15401] Updated weights for policy 0, policy_version 191050 (0.0045) [2024-06-22 10:32:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3130163200. Throughput: 0: 43182.6. Samples: 3130347300. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 10:32:36,425][15401] Updated weights for policy 0, policy_version 191060 (0.0032) [2024-06-22 10:32:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3130376192. Throughput: 0: 43163.2. Samples: 3130468100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 10:32:40,658][15401] Updated weights for policy 0, policy_version 191070 (0.0036) [2024-06-22 10:32:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3130605568. Throughput: 0: 42978.7. Samples: 3130720980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 10:32:44,167][15401] Updated weights for policy 0, policy_version 191080 (0.0024) [2024-06-22 10:32:48,128][15401] Updated weights for policy 0, policy_version 191090 (0.0044) [2024-06-22 10:32:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 3130818560. Throughput: 0: 42966.0. Samples: 3130985080. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 10:32:48,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 10:32:51,879][15401] Updated weights for policy 0, policy_version 191100 (0.0026) [2024-06-22 10:32:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3131031552. Throughput: 0: 43108.8. Samples: 3131110180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:32:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 10:32:55,733][15401] Updated weights for policy 0, policy_version 191110 (0.0034) [2024-06-22 10:32:58,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 3131260928. Throughput: 0: 42949.0. Samples: 3131369860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:32:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 10:32:59,511][15401] Updated weights for policy 0, policy_version 191120 (0.0038) [2024-06-22 10:33:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42874.3, 300 sec: 42653.9). Total num frames: 3131457536. Throughput: 0: 42837.8. Samples: 3131627940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 10:33:03,482][15401] Updated weights for policy 0, policy_version 191130 (0.0022) [2024-06-22 10:33:07,303][15401] Updated weights for policy 0, policy_version 191140 (0.0037) [2024-06-22 10:33:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3131670528. Throughput: 0: 42709.5. Samples: 3131744760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 10:33:11,198][15401] Updated weights for policy 0, policy_version 191150 (0.0034) [2024-06-22 10:33:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 3131899904. Throughput: 0: 42718.4. Samples: 3132009240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 10:33:14,794][15401] Updated weights for policy 0, policy_version 191160 (0.0043) [2024-06-22 10:33:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 3132096512. Throughput: 0: 42564.6. Samples: 3132262700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 10:33:19,329][15401] Updated weights for policy 0, policy_version 191170 (0.0050) [2024-06-22 10:33:22,614][15401] Updated weights for policy 0, policy_version 191180 (0.0024) [2024-06-22 10:33:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3132309504. Throughput: 0: 42542.5. Samples: 3132382520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 10:33:26,968][15401] Updated weights for policy 0, policy_version 191190 (0.0038) [2024-06-22 10:33:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3132538880. Throughput: 0: 42788.7. Samples: 3132646480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 10:33:30,423][15401] Updated weights for policy 0, policy_version 191200 (0.0029) [2024-06-22 10:33:31,942][15349] Signal inference workers to stop experience collection... (46200 times) [2024-06-22 10:33:31,942][15349] Signal inference workers to resume experience collection... (46200 times) [2024-06-22 10:33:31,973][15401] InferenceWorker_p0-w0: stopping experience collection (46200 times) [2024-06-22 10:33:31,973][15401] InferenceWorker_p0-w0: resuming experience collection (46200 times) [2024-06-22 10:33:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 3132719104. Throughput: 0: 42611.6. Samples: 3132902500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 10:33:34,598][15401] Updated weights for policy 0, policy_version 191210 (0.0035) [2024-06-22 10:33:38,039][15401] Updated weights for policy 0, policy_version 191220 (0.0037) [2024-06-22 10:33:38,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3132964864. Throughput: 0: 42609.0. Samples: 3133027580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 10:33:42,553][15401] Updated weights for policy 0, policy_version 191230 (0.0037) [2024-06-22 10:33:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3133177856. Throughput: 0: 42762.2. Samples: 3133294160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 10:33:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191235_3133194240.pth... [2024-06-22 10:33:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190609_3122937856.pth [2024-06-22 10:33:45,676][15401] Updated weights for policy 0, policy_version 191240 (0.0038) [2024-06-22 10:33:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 3133374464. Throughput: 0: 42578.6. Samples: 3133543980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 10:33:50,060][15401] Updated weights for policy 0, policy_version 191250 (0.0038) [2024-06-22 10:33:53,131][15401] Updated weights for policy 0, policy_version 191260 (0.0025) [2024-06-22 10:33:53,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 3133603840. Throughput: 0: 42782.1. Samples: 3133670060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:33:53,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 10:33:57,585][15401] Updated weights for policy 0, policy_version 191270 (0.0038) [2024-06-22 10:33:58,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 3133816832. Throughput: 0: 42880.4. Samples: 3133939140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:33:58,396][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 10:34:00,770][15401] Updated weights for policy 0, policy_version 191280 (0.0028) [2024-06-22 10:34:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 3134029824. Throughput: 0: 42844.9. Samples: 3134190720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 10:34:05,207][15401] Updated weights for policy 0, policy_version 191290 (0.0027) [2024-06-22 10:34:08,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3134242816. Throughput: 0: 43121.5. Samples: 3134322980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 10:34:08,499][15401] Updated weights for policy 0, policy_version 191300 (0.0037) [2024-06-22 10:34:12,691][15401] Updated weights for policy 0, policy_version 191310 (0.0031) [2024-06-22 10:34:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 3134455808. Throughput: 0: 43019.6. Samples: 3134582460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:13,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 10:34:16,235][15401] Updated weights for policy 0, policy_version 191320 (0.0028) [2024-06-22 10:34:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3134685184. Throughput: 0: 42845.6. Samples: 3134830560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 10:34:20,185][15401] Updated weights for policy 0, policy_version 191330 (0.0037) [2024-06-22 10:34:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 3134881792. Throughput: 0: 43112.4. Samples: 3134967640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 10:34:23,775][15401] Updated weights for policy 0, policy_version 191340 (0.0031) [2024-06-22 10:34:27,679][15401] Updated weights for policy 0, policy_version 191350 (0.0026) [2024-06-22 10:34:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3135078400. Throughput: 0: 42764.8. Samples: 3135218580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 10:34:31,203][15401] Updated weights for policy 0, policy_version 191360 (0.0033) [2024-06-22 10:34:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3135307776. Throughput: 0: 43026.3. Samples: 3135480160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:33,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 10:34:35,117][15401] Updated weights for policy 0, policy_version 191370 (0.0034) [2024-06-22 10:34:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3135537152. Throughput: 0: 43245.1. Samples: 3135615980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 10:34:38,944][15401] Updated weights for policy 0, policy_version 191380 (0.0032) [2024-06-22 10:34:42,711][15401] Updated weights for policy 0, policy_version 191390 (0.0034) [2024-06-22 10:34:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3135733760. Throughput: 0: 42854.9. Samples: 3135867340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 10:34:46,730][15401] Updated weights for policy 0, policy_version 191400 (0.0029) [2024-06-22 10:34:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 3135963136. Throughput: 0: 42894.7. Samples: 3136120980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:48,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-22 10:34:50,381][15401] Updated weights for policy 0, policy_version 191410 (0.0038) [2024-06-22 10:34:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 3136176128. Throughput: 0: 42977.3. Samples: 3136256960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 10:34:54,106][15401] Updated weights for policy 0, policy_version 191420 (0.0037) [2024-06-22 10:34:57,855][15401] Updated weights for policy 0, policy_version 191430 (0.0034) [2024-06-22 10:34:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 3136389120. Throughput: 0: 42944.4. Samples: 3136514860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 10:34:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 10:35:01,875][15401] Updated weights for policy 0, policy_version 191440 (0.0034) [2024-06-22 10:35:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3136618496. Throughput: 0: 42974.3. Samples: 3136764400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 10:35:05,231][15349] Signal inference workers to stop experience collection... (46250 times) [2024-06-22 10:35:05,233][15349] Signal inference workers to resume experience collection... (46250 times) [2024-06-22 10:35:05,274][15401] InferenceWorker_p0-w0: stopping experience collection (46250 times) [2024-06-22 10:35:05,274][15401] InferenceWorker_p0-w0: resuming experience collection (46250 times) [2024-06-22 10:35:05,742][15401] Updated weights for policy 0, policy_version 191450 (0.0033) [2024-06-22 10:35:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3136815104. Throughput: 0: 42867.9. Samples: 3136896700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 10:35:09,671][15401] Updated weights for policy 0, policy_version 191460 (0.0035) [2024-06-22 10:35:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42598.8). Total num frames: 3137011712. Throughput: 0: 42848.0. Samples: 3137146740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 10:35:13,674][15401] Updated weights for policy 0, policy_version 191470 (0.0034) [2024-06-22 10:35:17,282][15401] Updated weights for policy 0, policy_version 191480 (0.0030) [2024-06-22 10:35:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3137273856. Throughput: 0: 42639.5. Samples: 3137398940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 10:35:21,237][15401] Updated weights for policy 0, policy_version 191490 (0.0040) [2024-06-22 10:35:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3137437696. Throughput: 0: 42542.5. Samples: 3137530400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 10:35:25,142][15401] Updated weights for policy 0, policy_version 191500 (0.0038) [2024-06-22 10:35:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3137667072. Throughput: 0: 42760.1. Samples: 3137791540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:28,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 10:35:28,782][15401] Updated weights for policy 0, policy_version 191510 (0.0036) [2024-06-22 10:35:32,678][15401] Updated weights for policy 0, policy_version 191520 (0.0033) [2024-06-22 10:35:33,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3137912832. Throughput: 0: 42697.7. Samples: 3138042380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 10:35:36,468][15401] Updated weights for policy 0, policy_version 191530 (0.0046) [2024-06-22 10:35:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3138093056. Throughput: 0: 42714.6. Samples: 3138179120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:38,395][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 10:35:40,063][15401] Updated weights for policy 0, policy_version 191540 (0.0028) [2024-06-22 10:35:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3138322432. Throughput: 0: 42682.3. Samples: 3138435560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 10:35:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191548_3138322432.pth... [2024-06-22 10:35:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000190921_3128049664.pth [2024-06-22 10:35:44,102][15401] Updated weights for policy 0, policy_version 191550 (0.0027) [2024-06-22 10:35:47,587][15401] Updated weights for policy 0, policy_version 191560 (0.0034) [2024-06-22 10:35:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3138551808. Throughput: 0: 42811.2. Samples: 3138690900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 10:35:51,738][15401] Updated weights for policy 0, policy_version 191570 (0.0034) [2024-06-22 10:35:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3138732032. Throughput: 0: 42804.9. Samples: 3138822920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 10:35:55,062][15401] Updated weights for policy 0, policy_version 191580 (0.0041) [2024-06-22 10:35:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3138977792. Throughput: 0: 42948.8. Samples: 3139079440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:35:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 10:35:59,355][15401] Updated weights for policy 0, policy_version 191590 (0.0036) [2024-06-22 10:36:03,165][15401] Updated weights for policy 0, policy_version 191600 (0.0028) [2024-06-22 10:36:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 3139174400. Throughput: 0: 43028.7. Samples: 3139335220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:36:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 10:36:07,373][15401] Updated weights for policy 0, policy_version 191610 (0.0034) [2024-06-22 10:36:08,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 3139371008. Throughput: 0: 43008.7. Samples: 3139465780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 10:36:08,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 10:36:10,934][15401] Updated weights for policy 0, policy_version 191620 (0.0028) [2024-06-22 10:36:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3139600384. Throughput: 0: 42799.5. Samples: 3139717520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 10:36:15,078][15401] Updated weights for policy 0, policy_version 191630 (0.0039) [2024-06-22 10:36:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 3139796992. Throughput: 0: 43032.6. Samples: 3139978840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 10:36:18,542][15401] Updated weights for policy 0, policy_version 191640 (0.0026) [2024-06-22 10:36:22,606][15401] Updated weights for policy 0, policy_version 191650 (0.0034) [2024-06-22 10:36:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3140009984. Throughput: 0: 42873.3. Samples: 3140108420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 10:36:26,053][15401] Updated weights for policy 0, policy_version 191660 (0.0031) [2024-06-22 10:36:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3140255744. Throughput: 0: 42828.0. Samples: 3140362820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 10:36:30,162][15401] Updated weights for policy 0, policy_version 191670 (0.0029) [2024-06-22 10:36:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3140452352. Throughput: 0: 42939.6. Samples: 3140623180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 10:36:33,611][15401] Updated weights for policy 0, policy_version 191680 (0.0029) [2024-06-22 10:36:34,449][15349] Signal inference workers to stop experience collection... (46300 times) [2024-06-22 10:36:34,482][15401] InferenceWorker_p0-w0: stopping experience collection (46300 times) [2024-06-22 10:36:34,496][15349] Signal inference workers to resume experience collection... (46300 times) [2024-06-22 10:36:34,509][15401] InferenceWorker_p0-w0: resuming experience collection (46300 times) [2024-06-22 10:36:37,974][15401] Updated weights for policy 0, policy_version 191690 (0.0031) [2024-06-22 10:36:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3140665344. Throughput: 0: 42705.4. Samples: 3140744760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:38,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 10:36:41,231][15401] Updated weights for policy 0, policy_version 191700 (0.0030) [2024-06-22 10:36:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 3140894720. Throughput: 0: 42772.6. Samples: 3141004200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 10:36:45,562][15401] Updated weights for policy 0, policy_version 191710 (0.0032) [2024-06-22 10:36:48,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3141091328. Throughput: 0: 42974.2. Samples: 3141269060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 10:36:48,897][15401] Updated weights for policy 0, policy_version 191720 (0.0043) [2024-06-22 10:36:53,328][15401] Updated weights for policy 0, policy_version 191730 (0.0031) [2024-06-22 10:36:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3141304320. Throughput: 0: 42787.5. Samples: 3141391220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 10:36:56,475][15401] Updated weights for policy 0, policy_version 191740 (0.0036) [2024-06-22 10:36:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42932.2). Total num frames: 3141550080. Throughput: 0: 42874.7. Samples: 3141646880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:36:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 10:37:01,018][15401] Updated weights for policy 0, policy_version 191750 (0.0033) [2024-06-22 10:37:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3141746688. Throughput: 0: 42930.7. Samples: 3141910720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:37:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 10:37:04,086][15401] Updated weights for policy 0, policy_version 191760 (0.0031) [2024-06-22 10:37:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3141943296. Throughput: 0: 42790.4. Samples: 3142033980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:37:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 10:37:08,857][15401] Updated weights for policy 0, policy_version 191770 (0.0045) [2024-06-22 10:37:11,731][15401] Updated weights for policy 0, policy_version 191780 (0.0030) [2024-06-22 10:37:13,394][15132] Fps is (10 sec: 45854.9, 60 sec: 43414.5, 300 sec: 42986.9). Total num frames: 3142205440. Throughput: 0: 42849.2. Samples: 3142291220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:37:13,394][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 10:37:16,429][15401] Updated weights for policy 0, policy_version 191790 (0.0036) [2024-06-22 10:37:18,391][15132] Fps is (10 sec: 44228.0, 60 sec: 43143.1, 300 sec: 42820.3). Total num frames: 3142385664. Throughput: 0: 42820.8. Samples: 3142550200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:18,396][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 10:37:19,642][15401] Updated weights for policy 0, policy_version 191800 (0.0029) [2024-06-22 10:37:23,389][15132] Fps is (10 sec: 37699.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3142582272. Throughput: 0: 42848.1. Samples: 3142672820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 10:37:23,908][15401] Updated weights for policy 0, policy_version 191810 (0.0036) [2024-06-22 10:37:27,088][15401] Updated weights for policy 0, policy_version 191820 (0.0026) [2024-06-22 10:37:28,389][15132] Fps is (10 sec: 44246.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 3142828032. Throughput: 0: 42954.2. Samples: 3142937140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 10:37:31,575][15401] Updated weights for policy 0, policy_version 191830 (0.0030) [2024-06-22 10:37:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3143024640. Throughput: 0: 42950.7. Samples: 3143201840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 10:37:34,624][15401] Updated weights for policy 0, policy_version 191840 (0.0030) [2024-06-22 10:37:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 3143237632. Throughput: 0: 42987.6. Samples: 3143325660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 10:37:39,100][15401] Updated weights for policy 0, policy_version 191850 (0.0036) [2024-06-22 10:37:42,093][15401] Updated weights for policy 0, policy_version 191860 (0.0027) [2024-06-22 10:37:43,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 3143483392. Throughput: 0: 43042.7. Samples: 3143583800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 10:37:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191863_3143483392.pth... [2024-06-22 10:37:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191235_3133194240.pth [2024-06-22 10:37:46,857][15401] Updated weights for policy 0, policy_version 191870 (0.0031) [2024-06-22 10:37:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 3143680000. Throughput: 0: 42942.1. Samples: 3143843220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:48,392][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 10:37:49,938][15401] Updated weights for policy 0, policy_version 191880 (0.0047) [2024-06-22 10:37:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3143876608. Throughput: 0: 42970.3. Samples: 3143967640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 10:37:54,437][15401] Updated weights for policy 0, policy_version 191890 (0.0032) [2024-06-22 10:37:57,816][15401] Updated weights for policy 0, policy_version 191900 (0.0020) [2024-06-22 10:37:58,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3144122368. Throughput: 0: 43054.8. Samples: 3144228500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:37:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 10:38:01,957][15401] Updated weights for policy 0, policy_version 191910 (0.0032) [2024-06-22 10:38:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 3144335360. Throughput: 0: 42927.8. Samples: 3144481860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:38:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 10:38:05,203][15349] Signal inference workers to stop experience collection... (46350 times) [2024-06-22 10:38:05,204][15349] Signal inference workers to resume experience collection... (46350 times) [2024-06-22 10:38:05,241][15401] InferenceWorker_p0-w0: stopping experience collection (46350 times) [2024-06-22 10:38:05,242][15401] InferenceWorker_p0-w0: resuming experience collection (46350 times) [2024-06-22 10:38:05,342][15401] Updated weights for policy 0, policy_version 191920 (0.0034) [2024-06-22 10:38:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3144531968. Throughput: 0: 43092.5. Samples: 3144611980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:38:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 10:38:09,751][15401] Updated weights for policy 0, policy_version 191930 (0.0032) [2024-06-22 10:38:12,807][15401] Updated weights for policy 0, policy_version 191940 (0.0026) [2024-06-22 10:38:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42599.8, 300 sec: 42931.3). Total num frames: 3144761344. Throughput: 0: 43148.8. Samples: 3144878940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:38:13,392][15132] Avg episode reward: [(0, '0.827')] [2024-06-22 10:38:17,144][15401] Updated weights for policy 0, policy_version 191950 (0.0041) [2024-06-22 10:38:18,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43418.9, 300 sec: 42987.2). Total num frames: 3144990720. Throughput: 0: 42770.4. Samples: 3145126520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 10:38:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 10:38:20,527][15401] Updated weights for policy 0, policy_version 191960 (0.0027) [2024-06-22 10:38:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3145170944. Throughput: 0: 43040.9. Samples: 3145262500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 10:38:24,462][15401] Updated weights for policy 0, policy_version 191970 (0.0031) [2024-06-22 10:38:28,204][15401] Updated weights for policy 0, policy_version 191980 (0.0038) [2024-06-22 10:38:28,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3145400320. Throughput: 0: 43207.6. Samples: 3145528140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 10:38:31,918][15401] Updated weights for policy 0, policy_version 191990 (0.0020) [2024-06-22 10:38:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3145613312. Throughput: 0: 43063.8. Samples: 3145780980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 10:38:36,091][15401] Updated weights for policy 0, policy_version 192000 (0.0039) [2024-06-22 10:38:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 3145826304. Throughput: 0: 43203.4. Samples: 3145911900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:38,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 10:38:39,396][15401] Updated weights for policy 0, policy_version 192010 (0.0021) [2024-06-22 10:38:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 3146022912. Throughput: 0: 43088.8. Samples: 3146167500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 10:38:43,636][15401] Updated weights for policy 0, policy_version 192020 (0.0035) [2024-06-22 10:38:46,947][15401] Updated weights for policy 0, policy_version 192030 (0.0038) [2024-06-22 10:38:48,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43146.4, 300 sec: 42932.0). Total num frames: 3146268672. Throughput: 0: 43083.5. Samples: 3146420620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 10:38:51,254][15401] Updated weights for policy 0, policy_version 192040 (0.0032) [2024-06-22 10:38:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.6, 300 sec: 42932.6). Total num frames: 3146481664. Throughput: 0: 43290.1. Samples: 3146560040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 10:38:54,486][15401] Updated weights for policy 0, policy_version 192050 (0.0037) [2024-06-22 10:38:58,391][15132] Fps is (10 sec: 39315.9, 60 sec: 42324.4, 300 sec: 42820.4). Total num frames: 3146661888. Throughput: 0: 42901.4. Samples: 3146809460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:38:58,391][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 10:38:59,051][15401] Updated weights for policy 0, policy_version 192060 (0.0044) [2024-06-22 10:39:02,165][15401] Updated weights for policy 0, policy_version 192070 (0.0041) [2024-06-22 10:39:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 3146924032. Throughput: 0: 43040.6. Samples: 3147063340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:39:03,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 10:39:06,848][15401] Updated weights for policy 0, policy_version 192080 (0.0040) [2024-06-22 10:39:08,389][15132] Fps is (10 sec: 45881.5, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 3147120640. Throughput: 0: 43112.4. Samples: 3147202560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:39:08,390][15132] Avg episode reward: [(0, '0.141')] [2024-06-22 10:39:10,075][15401] Updated weights for policy 0, policy_version 192090 (0.0027) [2024-06-22 10:39:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42327.0, 300 sec: 42765.1). Total num frames: 3147300864. Throughput: 0: 42838.6. Samples: 3147455880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:39:13,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-22 10:39:14,279][15401] Updated weights for policy 0, policy_version 192100 (0.0034) [2024-06-22 10:39:17,555][15401] Updated weights for policy 0, policy_version 192110 (0.0027) [2024-06-22 10:39:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3147563008. Throughput: 0: 42864.9. Samples: 3147709900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:39:18,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 10:39:21,761][15401] Updated weights for policy 0, policy_version 192120 (0.0038) [2024-06-22 10:39:23,200][15349] Signal inference workers to stop experience collection... (46400 times) [2024-06-22 10:39:23,200][15349] Signal inference workers to resume experience collection... (46400 times) [2024-06-22 10:39:23,216][15401] InferenceWorker_p0-w0: stopping experience collection (46400 times) [2024-06-22 10:39:23,216][15401] InferenceWorker_p0-w0: resuming experience collection (46400 times) [2024-06-22 10:39:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3147776000. Throughput: 0: 43105.5. Samples: 3147851540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-22 10:39:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 10:39:24,969][15401] Updated weights for policy 0, policy_version 192130 (0.0034) [2024-06-22 10:39:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3147956224. Throughput: 0: 42976.2. Samples: 3148101420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 10:39:29,395][15401] Updated weights for policy 0, policy_version 192140 (0.0028) [2024-06-22 10:39:32,892][15401] Updated weights for policy 0, policy_version 192150 (0.0030) [2024-06-22 10:39:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3148218368. Throughput: 0: 43127.1. Samples: 3148361340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 10:39:37,291][15401] Updated weights for policy 0, policy_version 192160 (0.0030) [2024-06-22 10:39:38,392][15132] Fps is (10 sec: 45863.3, 60 sec: 43144.4, 300 sec: 42986.8). Total num frames: 3148414976. Throughput: 0: 43070.9. Samples: 3148498340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:38,393][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 10:39:40,334][15401] Updated weights for policy 0, policy_version 192170 (0.0031) [2024-06-22 10:39:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3148611584. Throughput: 0: 42968.9. Samples: 3148743000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 10:39:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192176_3148611584.pth... [2024-06-22 10:39:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191548_3138322432.pth [2024-06-22 10:39:44,778][15401] Updated weights for policy 0, policy_version 192180 (0.0037) [2024-06-22 10:39:47,791][15401] Updated weights for policy 0, policy_version 192190 (0.0030) [2024-06-22 10:39:48,395][15132] Fps is (10 sec: 44224.1, 60 sec: 43140.6, 300 sec: 42986.4). Total num frames: 3148857344. Throughput: 0: 43065.1. Samples: 3149001500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:48,395][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 10:39:52,313][15401] Updated weights for policy 0, policy_version 192200 (0.0052) [2024-06-22 10:39:53,391][15132] Fps is (10 sec: 42592.4, 60 sec: 42597.4, 300 sec: 42875.9). Total num frames: 3149037568. Throughput: 0: 43106.2. Samples: 3149142400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:53,391][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 10:39:55,223][15401] Updated weights for policy 0, policy_version 192210 (0.0024) [2024-06-22 10:39:58,389][15132] Fps is (10 sec: 40982.0, 60 sec: 43418.6, 300 sec: 42876.1). Total num frames: 3149266944. Throughput: 0: 43083.1. Samples: 3149394620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:39:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 10:39:59,802][15401] Updated weights for policy 0, policy_version 192220 (0.0029) [2024-06-22 10:40:03,190][15401] Updated weights for policy 0, policy_version 192230 (0.0031) [2024-06-22 10:40:03,389][15132] Fps is (10 sec: 45881.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3149496320. Throughput: 0: 43272.0. Samples: 3149657140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 10:40:07,439][15401] Updated weights for policy 0, policy_version 192240 (0.0038) [2024-06-22 10:40:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3149709312. Throughput: 0: 43054.6. Samples: 3149789000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 10:40:10,831][15401] Updated weights for policy 0, policy_version 192250 (0.0043) [2024-06-22 10:40:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 3149922304. Throughput: 0: 43063.5. Samples: 3150039280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 10:40:15,114][15401] Updated weights for policy 0, policy_version 192260 (0.0034) [2024-06-22 10:40:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 3150135296. Throughput: 0: 43164.7. Samples: 3150303860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:18,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 10:40:18,534][15401] Updated weights for policy 0, policy_version 192270 (0.0028) [2024-06-22 10:40:22,839][15401] Updated weights for policy 0, policy_version 192280 (0.0034) [2024-06-22 10:40:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 3150331904. Throughput: 0: 42987.4. Samples: 3150432660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 10:40:26,154][15401] Updated weights for policy 0, policy_version 192290 (0.0042) [2024-06-22 10:40:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3150561280. Throughput: 0: 43086.2. Samples: 3150681880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 10:40:30,495][15401] Updated weights for policy 0, policy_version 192300 (0.0038) [2024-06-22 10:40:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3150774272. Throughput: 0: 43162.6. Samples: 3150943580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 10:40:33,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 10:40:33,648][15401] Updated weights for policy 0, policy_version 192310 (0.0034) [2024-06-22 10:40:37,912][15401] Updated weights for policy 0, policy_version 192320 (0.0035) [2024-06-22 10:40:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.3, 300 sec: 42931.6). Total num frames: 3150987264. Throughput: 0: 42931.5. Samples: 3151074260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:40:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 10:40:41,042][15349] Signal inference workers to stop experience collection... (46450 times) [2024-06-22 10:40:41,049][15349] Signal inference workers to resume experience collection... (46450 times) [2024-06-22 10:40:41,071][15401] InferenceWorker_p0-w0: stopping experience collection (46450 times) [2024-06-22 10:40:41,071][15401] InferenceWorker_p0-w0: resuming experience collection (46450 times) [2024-06-22 10:40:41,218][15401] Updated weights for policy 0, policy_version 192330 (0.0027) [2024-06-22 10:40:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3151216640. Throughput: 0: 43076.3. Samples: 3151333060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:40:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 10:40:45,476][15401] Updated weights for policy 0, policy_version 192340 (0.0029) [2024-06-22 10:40:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42875.4, 300 sec: 43042.8). Total num frames: 3151429632. Throughput: 0: 43053.8. Samples: 3151594560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:40:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 10:40:48,665][15401] Updated weights for policy 0, policy_version 192350 (0.0035) [2024-06-22 10:40:53,012][15401] Updated weights for policy 0, policy_version 192360 (0.0041) [2024-06-22 10:40:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43145.5, 300 sec: 42876.1). Total num frames: 3151626240. Throughput: 0: 43064.0. Samples: 3151726880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:40:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 10:40:56,206][15401] Updated weights for policy 0, policy_version 192370 (0.0030) [2024-06-22 10:40:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3151872000. Throughput: 0: 43179.1. Samples: 3151982340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:40:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 10:41:00,822][15401] Updated weights for policy 0, policy_version 192380 (0.0040) [2024-06-22 10:41:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 3152084992. Throughput: 0: 42985.5. Samples: 3152238100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 10:41:03,960][15401] Updated weights for policy 0, policy_version 192390 (0.0033) [2024-06-22 10:41:08,378][15401] Updated weights for policy 0, policy_version 192400 (0.0048) [2024-06-22 10:41:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3152281600. Throughput: 0: 43107.5. Samples: 3152372500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 10:41:11,504][15401] Updated weights for policy 0, policy_version 192410 (0.0035) [2024-06-22 10:41:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 3152510976. Throughput: 0: 43304.9. Samples: 3152630600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 10:41:15,810][15401] Updated weights for policy 0, policy_version 192420 (0.0045) [2024-06-22 10:41:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43419.4, 300 sec: 43153.8). Total num frames: 3152740352. Throughput: 0: 43099.1. Samples: 3152883040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 10:41:19,182][15401] Updated weights for policy 0, policy_version 192430 (0.0031) [2024-06-22 10:41:23,340][15401] Updated weights for policy 0, policy_version 192440 (0.0034) [2024-06-22 10:41:23,392][15132] Fps is (10 sec: 42585.4, 60 sec: 43415.4, 300 sec: 42986.8). Total num frames: 3152936960. Throughput: 0: 43155.8. Samples: 3153016400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:23,393][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 10:41:26,620][15401] Updated weights for policy 0, policy_version 192450 (0.0052) [2024-06-22 10:41:28,396][15132] Fps is (10 sec: 40933.2, 60 sec: 43139.8, 300 sec: 43041.8). Total num frames: 3153149952. Throughput: 0: 43150.8. Samples: 3153275120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:28,396][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 10:41:30,827][15401] Updated weights for policy 0, policy_version 192460 (0.0026) [2024-06-22 10:41:33,389][15132] Fps is (10 sec: 45888.5, 60 sec: 43690.6, 300 sec: 43154.1). Total num frames: 3153395712. Throughput: 0: 42993.2. Samples: 3153529260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 10:41:34,120][15401] Updated weights for policy 0, policy_version 192470 (0.0031) [2024-06-22 10:41:38,389][15132] Fps is (10 sec: 42626.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3153575936. Throughput: 0: 43139.5. Samples: 3153668160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-22 10:41:38,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 10:41:38,484][15401] Updated weights for policy 0, policy_version 192480 (0.0050) [2024-06-22 10:41:42,186][15401] Updated weights for policy 0, policy_version 192490 (0.0042) [2024-06-22 10:41:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 3153805312. Throughput: 0: 42981.3. Samples: 3153916500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:41:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 10:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192493_3153805312.pth... [2024-06-22 10:41:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000191863_3143483392.pth [2024-06-22 10:41:46,217][15401] Updated weights for policy 0, policy_version 192500 (0.0050) [2024-06-22 10:41:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 3154018304. Throughput: 0: 42973.8. Samples: 3154171920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:41:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 10:41:49,714][15401] Updated weights for policy 0, policy_version 192510 (0.0032) [2024-06-22 10:41:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3154214912. Throughput: 0: 42823.1. Samples: 3154299540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:41:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 10:41:53,774][15401] Updated weights for policy 0, policy_version 192520 (0.0032) [2024-06-22 10:41:57,277][15401] Updated weights for policy 0, policy_version 192530 (0.0040) [2024-06-22 10:41:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 3154427904. Throughput: 0: 42802.2. Samples: 3154556700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:41:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 10:42:01,704][15401] Updated weights for policy 0, policy_version 192540 (0.0049) [2024-06-22 10:42:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 3154657280. Throughput: 0: 42812.0. Samples: 3154809580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 10:42:05,021][15401] Updated weights for policy 0, policy_version 192550 (0.0033) [2024-06-22 10:42:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42821.2). Total num frames: 3154837504. Throughput: 0: 42821.1. Samples: 3154943220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 10:42:09,253][15401] Updated weights for policy 0, policy_version 192560 (0.0026) [2024-06-22 10:42:12,762][15401] Updated weights for policy 0, policy_version 192570 (0.0036) [2024-06-22 10:42:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 43043.0). Total num frames: 3155083264. Throughput: 0: 42668.4. Samples: 3155194920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 10:42:17,075][15401] Updated weights for policy 0, policy_version 192580 (0.0030) [2024-06-22 10:42:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.7, 300 sec: 43097.9). Total num frames: 3155296256. Throughput: 0: 42743.2. Samples: 3155452800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:18,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 10:42:19,110][15349] Signal inference workers to stop experience collection... (46500 times) [2024-06-22 10:42:19,172][15401] InferenceWorker_p0-w0: stopping experience collection (46500 times) [2024-06-22 10:42:19,174][15349] Signal inference workers to resume experience collection... (46500 times) [2024-06-22 10:42:19,190][15401] InferenceWorker_p0-w0: resuming experience collection (46500 times) [2024-06-22 10:42:20,647][15401] Updated weights for policy 0, policy_version 192590 (0.0033) [2024-06-22 10:42:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.6, 300 sec: 42931.6). Total num frames: 3155492864. Throughput: 0: 42498.3. Samples: 3155580580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 10:42:24,568][15401] Updated weights for policy 0, policy_version 192600 (0.0030) [2024-06-22 10:42:28,350][15401] Updated weights for policy 0, policy_version 192610 (0.0024) [2024-06-22 10:42:28,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42876.1, 300 sec: 43042.7). Total num frames: 3155722240. Throughput: 0: 42728.5. Samples: 3155839280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 10:42:32,117][15401] Updated weights for policy 0, policy_version 192620 (0.0030) [2024-06-22 10:42:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42987.2). Total num frames: 3155918848. Throughput: 0: 42872.4. Samples: 3156101180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 10:42:35,856][15401] Updated weights for policy 0, policy_version 192630 (0.0037) [2024-06-22 10:42:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3156131840. Throughput: 0: 42759.6. Samples: 3156223720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 10:42:39,805][15401] Updated weights for policy 0, policy_version 192640 (0.0031) [2024-06-22 10:42:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42987.5). Total num frames: 3156361216. Throughput: 0: 42760.0. Samples: 3156480900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 10:42:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 10:42:43,504][15401] Updated weights for policy 0, policy_version 192650 (0.0035) [2024-06-22 10:42:47,560][15401] Updated weights for policy 0, policy_version 192660 (0.0038) [2024-06-22 10:42:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 3156557824. Throughput: 0: 42769.3. Samples: 3156734200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:42:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 10:42:51,058][15401] Updated weights for policy 0, policy_version 192670 (0.0036) [2024-06-22 10:42:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3156787200. Throughput: 0: 42514.6. Samples: 3156856380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:42:53,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-22 10:42:55,282][15401] Updated weights for policy 0, policy_version 192680 (0.0052) [2024-06-22 10:42:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3157000192. Throughput: 0: 42825.7. Samples: 3157122080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:42:58,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-22 10:42:58,892][15401] Updated weights for policy 0, policy_version 192690 (0.0031) [2024-06-22 10:43:02,786][15401] Updated weights for policy 0, policy_version 192700 (0.0034) [2024-06-22 10:43:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3157213184. Throughput: 0: 42751.2. Samples: 3157376500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 10:43:06,405][15401] Updated weights for policy 0, policy_version 192710 (0.0043) [2024-06-22 10:43:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.7, 300 sec: 42931.6). Total num frames: 3157426176. Throughput: 0: 42747.8. Samples: 3157504340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:08,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 10:43:10,842][15401] Updated weights for policy 0, policy_version 192720 (0.0035) [2024-06-22 10:43:13,394][15132] Fps is (10 sec: 40942.2, 60 sec: 42322.2, 300 sec: 42820.0). Total num frames: 3157622784. Throughput: 0: 42709.7. Samples: 3157761400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:13,394][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 10:43:14,228][15401] Updated weights for policy 0, policy_version 192730 (0.0033) [2024-06-22 10:43:18,378][15401] Updated weights for policy 0, policy_version 192740 (0.0031) [2024-06-22 10:43:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 3157852160. Throughput: 0: 42608.4. Samples: 3158018560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 10:43:21,689][15401] Updated weights for policy 0, policy_version 192750 (0.0047) [2024-06-22 10:43:23,389][15132] Fps is (10 sec: 44255.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3158065152. Throughput: 0: 42708.8. Samples: 3158145620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 10:43:26,002][15401] Updated weights for policy 0, policy_version 192760 (0.0031) [2024-06-22 10:43:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3158261760. Throughput: 0: 42665.3. Samples: 3158400840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 10:43:29,493][15401] Updated weights for policy 0, policy_version 192770 (0.0034) [2024-06-22 10:43:30,945][15349] Signal inference workers to stop experience collection... (46550 times) [2024-06-22 10:43:30,993][15401] InferenceWorker_p0-w0: stopping experience collection (46550 times) [2024-06-22 10:43:31,002][15349] Signal inference workers to resume experience collection... (46550 times) [2024-06-22 10:43:31,016][15401] InferenceWorker_p0-w0: resuming experience collection (46550 times) [2024-06-22 10:43:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.6, 300 sec: 42876.1). Total num frames: 3158474752. Throughput: 0: 42771.8. Samples: 3158659040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:33,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 10:43:33,946][15401] Updated weights for policy 0, policy_version 192780 (0.0031) [2024-06-22 10:43:36,859][15401] Updated weights for policy 0, policy_version 192790 (0.0029) [2024-06-22 10:43:38,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3158704128. Throughput: 0: 42859.9. Samples: 3158785080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 10:43:41,369][15401] Updated weights for policy 0, policy_version 192800 (0.0038) [2024-06-22 10:43:43,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3158917120. Throughput: 0: 42839.2. Samples: 3159049840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 10:43:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192806_3158933504.pth... [2024-06-22 10:43:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192176_3148611584.pth [2024-06-22 10:43:44,663][15401] Updated weights for policy 0, policy_version 192810 (0.0033) [2024-06-22 10:43:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3159113728. Throughput: 0: 42859.1. Samples: 3159305160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 10:43:48,884][15401] Updated weights for policy 0, policy_version 192820 (0.0034) [2024-06-22 10:43:52,428][15401] Updated weights for policy 0, policy_version 192830 (0.0027) [2024-06-22 10:43:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43098.5). Total num frames: 3159375872. Throughput: 0: 42837.9. Samples: 3159431940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:43:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 10:43:56,307][15401] Updated weights for policy 0, policy_version 192840 (0.0036) [2024-06-22 10:43:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3159556096. Throughput: 0: 43019.3. Samples: 3159697080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:43:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 10:44:00,099][15401] Updated weights for policy 0, policy_version 192850 (0.0037) [2024-06-22 10:44:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3159769088. Throughput: 0: 42922.3. Samples: 3159950060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 10:44:03,901][15401] Updated weights for policy 0, policy_version 192860 (0.0039) [2024-06-22 10:44:08,079][15401] Updated weights for policy 0, policy_version 192870 (0.0026) [2024-06-22 10:44:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42987.2). Total num frames: 3159982080. Throughput: 0: 42858.8. Samples: 3160074260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 10:44:11,598][15401] Updated weights for policy 0, policy_version 192880 (0.0044) [2024-06-22 10:44:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42601.5, 300 sec: 42765.0). Total num frames: 3160178688. Throughput: 0: 42874.7. Samples: 3160330200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 10:44:15,611][15401] Updated weights for policy 0, policy_version 192890 (0.0034) [2024-06-22 10:44:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3160424448. Throughput: 0: 42728.6. Samples: 3160581720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:18,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 10:44:19,390][15401] Updated weights for policy 0, policy_version 192900 (0.0031) [2024-06-22 10:44:23,282][15401] Updated weights for policy 0, policy_version 192910 (0.0044) [2024-06-22 10:44:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3160637440. Throughput: 0: 42958.3. Samples: 3160718200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 10:44:26,844][15401] Updated weights for policy 0, policy_version 192920 (0.0037) [2024-06-22 10:44:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3160834048. Throughput: 0: 42720.9. Samples: 3160972280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 10:44:28,960][15349] Signal inference workers to stop experience collection... (46600 times) [2024-06-22 10:44:28,960][15349] Signal inference workers to resume experience collection... (46600 times) [2024-06-22 10:44:28,993][15401] InferenceWorker_p0-w0: stopping experience collection (46600 times) [2024-06-22 10:44:28,993][15401] InferenceWorker_p0-w0: resuming experience collection (46600 times) [2024-06-22 10:44:30,931][15401] Updated weights for policy 0, policy_version 192930 (0.0035) [2024-06-22 10:44:33,393][15132] Fps is (10 sec: 44222.1, 60 sec: 43417.0, 300 sec: 42931.5). Total num frames: 3161079808. Throughput: 0: 42761.3. Samples: 3161229560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:33,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 10:44:34,354][15401] Updated weights for policy 0, policy_version 192940 (0.0024) [2024-06-22 10:44:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3161276416. Throughput: 0: 42989.2. Samples: 3161366460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 10:44:38,410][15401] Updated weights for policy 0, policy_version 192950 (0.0028) [2024-06-22 10:44:42,282][15401] Updated weights for policy 0, policy_version 192960 (0.0047) [2024-06-22 10:44:43,389][15132] Fps is (10 sec: 39334.9, 60 sec: 42598.4, 300 sec: 42765.8). Total num frames: 3161473024. Throughput: 0: 42663.5. Samples: 3161616940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 10:44:45,916][15401] Updated weights for policy 0, policy_version 192970 (0.0033) [2024-06-22 10:44:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42987.4). Total num frames: 3161718784. Throughput: 0: 42626.6. Samples: 3161868260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:48,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 10:44:49,867][15401] Updated weights for policy 0, policy_version 192980 (0.0038) [2024-06-22 10:44:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3161931776. Throughput: 0: 42976.8. Samples: 3162008220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 10:44:53,659][15401] Updated weights for policy 0, policy_version 192990 (0.0038) [2024-06-22 10:44:57,359][15401] Updated weights for policy 0, policy_version 193000 (0.0045) [2024-06-22 10:44:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3162112000. Throughput: 0: 42798.6. Samples: 3162256140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 10:44:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 10:45:01,043][15401] Updated weights for policy 0, policy_version 193010 (0.0031) [2024-06-22 10:45:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3162357760. Throughput: 0: 42941.3. Samples: 3162514080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:03,396][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 10:45:04,892][15401] Updated weights for policy 0, policy_version 193020 (0.0031) [2024-06-22 10:45:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3162570752. Throughput: 0: 43032.5. Samples: 3162654660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 10:45:08,627][15401] Updated weights for policy 0, policy_version 193030 (0.0028) [2024-06-22 10:45:12,489][15401] Updated weights for policy 0, policy_version 193040 (0.0034) [2024-06-22 10:45:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 3162767360. Throughput: 0: 42867.9. Samples: 3162901340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 10:45:16,227][15401] Updated weights for policy 0, policy_version 193050 (0.0024) [2024-06-22 10:45:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3162980352. Throughput: 0: 42992.6. Samples: 3163164080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 10:45:20,026][15401] Updated weights for policy 0, policy_version 193060 (0.0030) [2024-06-22 10:45:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3163193344. Throughput: 0: 42746.8. Samples: 3163290060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 10:45:24,041][15401] Updated weights for policy 0, policy_version 193070 (0.0024) [2024-06-22 10:45:27,638][15401] Updated weights for policy 0, policy_version 193080 (0.0039) [2024-06-22 10:45:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3163422720. Throughput: 0: 42800.4. Samples: 3163542960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 10:45:31,520][15401] Updated weights for policy 0, policy_version 193090 (0.0032) [2024-06-22 10:45:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42599.0, 300 sec: 42875.7). Total num frames: 3163635712. Throughput: 0: 43004.3. Samples: 3163803560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:33,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 10:45:35,300][15401] Updated weights for policy 0, policy_version 193100 (0.0039) [2024-06-22 10:45:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3163848704. Throughput: 0: 42706.1. Samples: 3163930100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:38,392][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 10:45:38,472][15349] Signal inference workers to stop experience collection... (46650 times) [2024-06-22 10:45:38,473][15349] Signal inference workers to resume experience collection... (46650 times) [2024-06-22 10:45:38,486][15401] InferenceWorker_p0-w0: stopping experience collection (46650 times) [2024-06-22 10:45:38,518][15401] InferenceWorker_p0-w0: resuming experience collection (46650 times) [2024-06-22 10:45:39,090][15401] Updated weights for policy 0, policy_version 193110 (0.0029) [2024-06-22 10:45:42,913][15401] Updated weights for policy 0, policy_version 193120 (0.0029) [2024-06-22 10:45:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3164078080. Throughput: 0: 42957.3. Samples: 3164189220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 10:45:43,520][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193121_3164094464.pth... [2024-06-22 10:45:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192493_3153805312.pth [2024-06-22 10:45:47,246][15401] Updated weights for policy 0, policy_version 193130 (0.0029) [2024-06-22 10:45:48,393][15132] Fps is (10 sec: 42593.7, 60 sec: 42595.9, 300 sec: 42875.6). Total num frames: 3164274688. Throughput: 0: 42928.6. Samples: 3164446020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:48,394][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 10:45:50,667][15401] Updated weights for policy 0, policy_version 193140 (0.0046) [2024-06-22 10:45:53,394][15132] Fps is (10 sec: 40939.8, 60 sec: 42594.9, 300 sec: 42764.3). Total num frames: 3164487680. Throughput: 0: 42394.3. Samples: 3164562620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:53,395][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 10:45:54,926][15401] Updated weights for policy 0, policy_version 193150 (0.0029) [2024-06-22 10:45:58,339][15401] Updated weights for policy 0, policy_version 193160 (0.0032) [2024-06-22 10:45:58,389][15132] Fps is (10 sec: 45891.5, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 3164733440. Throughput: 0: 42754.7. Samples: 3164825300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:45:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 10:46:02,757][15401] Updated weights for policy 0, policy_version 193170 (0.0032) [2024-06-22 10:46:03,390][15132] Fps is (10 sec: 42619.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3164913664. Throughput: 0: 42535.0. Samples: 3165078160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-22 10:46:03,390][15132] Avg episode reward: [(0, '0.887')] [2024-06-22 10:46:06,219][15401] Updated weights for policy 0, policy_version 193180 (0.0037) [2024-06-22 10:46:08,396][15132] Fps is (10 sec: 37658.8, 60 sec: 42320.7, 300 sec: 42708.5). Total num frames: 3165110272. Throughput: 0: 42477.9. Samples: 3165201840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:08,396][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 10:46:10,431][15401] Updated weights for policy 0, policy_version 193190 (0.0028) [2024-06-22 10:46:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3165356032. Throughput: 0: 42695.1. Samples: 3165464240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 10:46:14,205][15401] Updated weights for policy 0, policy_version 193200 (0.0031) [2024-06-22 10:46:18,128][15401] Updated weights for policy 0, policy_version 193210 (0.0048) [2024-06-22 10:46:18,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 3165552640. Throughput: 0: 42581.8. Samples: 3165719640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 10:46:21,725][15401] Updated weights for policy 0, policy_version 193220 (0.0043) [2024-06-22 10:46:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 3165765632. Throughput: 0: 42619.6. Samples: 3165847880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:23,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-22 10:46:25,713][15401] Updated weights for policy 0, policy_version 193230 (0.0038) [2024-06-22 10:46:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3165995008. Throughput: 0: 42601.3. Samples: 3166106280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 10:46:29,330][15401] Updated weights for policy 0, policy_version 193240 (0.0029) [2024-06-22 10:46:33,211][15401] Updated weights for policy 0, policy_version 193250 (0.0042) [2024-06-22 10:46:33,391][15132] Fps is (10 sec: 44229.3, 60 sec: 42871.9, 300 sec: 42820.3). Total num frames: 3166208000. Throughput: 0: 42564.8. Samples: 3166361360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:33,392][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 10:46:37,172][15401] Updated weights for policy 0, policy_version 193260 (0.0034) [2024-06-22 10:46:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 3166388224. Throughput: 0: 42806.1. Samples: 3166488680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 10:46:40,840][15401] Updated weights for policy 0, policy_version 193270 (0.0026) [2024-06-22 10:46:43,389][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3166633984. Throughput: 0: 42770.2. Samples: 3166749960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:43,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 10:46:44,697][15401] Updated weights for policy 0, policy_version 193280 (0.0039) [2024-06-22 10:46:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.9, 300 sec: 42765.0). Total num frames: 3166830592. Throughput: 0: 42781.0. Samples: 3167003300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 10:46:48,899][15401] Updated weights for policy 0, policy_version 193290 (0.0033) [2024-06-22 10:46:52,118][15349] Signal inference workers to stop experience collection... (46700 times) [2024-06-22 10:46:52,118][15349] Signal inference workers to resume experience collection... (46700 times) [2024-06-22 10:46:52,168][15401] InferenceWorker_p0-w0: stopping experience collection (46700 times) [2024-06-22 10:46:52,168][15401] InferenceWorker_p0-w0: resuming experience collection (46700 times) [2024-06-22 10:46:52,263][15401] Updated weights for policy 0, policy_version 193300 (0.0024) [2024-06-22 10:46:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42601.9, 300 sec: 42765.0). Total num frames: 3167043584. Throughput: 0: 42896.8. Samples: 3167131920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 10:46:56,429][15401] Updated weights for policy 0, policy_version 193310 (0.0039) [2024-06-22 10:46:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3167272960. Throughput: 0: 42769.3. Samples: 3167388860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:46:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 10:47:00,393][15401] Updated weights for policy 0, policy_version 193320 (0.0031) [2024-06-22 10:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3167485952. Throughput: 0: 42764.0. Samples: 3167644020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:47:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 10:47:04,073][15401] Updated weights for policy 0, policy_version 193330 (0.0040) [2024-06-22 10:47:08,214][15401] Updated weights for policy 0, policy_version 193340 (0.0036) [2024-06-22 10:47:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 3167682560. Throughput: 0: 42626.7. Samples: 3167766080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:47:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 10:47:11,711][15401] Updated weights for policy 0, policy_version 193350 (0.0035) [2024-06-22 10:47:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 3167928320. Throughput: 0: 42764.9. Samples: 3168030700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 10:47:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 10:47:15,763][15401] Updated weights for policy 0, policy_version 193360 (0.0029) [2024-06-22 10:47:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3168141312. Throughput: 0: 42709.3. Samples: 3168283200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 10:47:19,222][15401] Updated weights for policy 0, policy_version 193370 (0.0032) [2024-06-22 10:47:23,351][15401] Updated weights for policy 0, policy_version 193380 (0.0029) [2024-06-22 10:47:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3168337920. Throughput: 0: 42755.6. Samples: 3168412680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:23,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 10:47:27,020][15401] Updated weights for policy 0, policy_version 193390 (0.0040) [2024-06-22 10:47:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3168567296. Throughput: 0: 42799.0. Samples: 3168675920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 10:47:30,988][15401] Updated weights for policy 0, policy_version 193400 (0.0027) [2024-06-22 10:47:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42872.7, 300 sec: 42876.1). Total num frames: 3168780288. Throughput: 0: 42847.1. Samples: 3168931420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 10:47:34,551][15401] Updated weights for policy 0, policy_version 193410 (0.0041) [2024-06-22 10:47:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3168976896. Throughput: 0: 42821.8. Samples: 3169058900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 10:47:38,443][15401] Updated weights for policy 0, policy_version 193420 (0.0033) [2024-06-22 10:47:42,453][15401] Updated weights for policy 0, policy_version 193430 (0.0041) [2024-06-22 10:47:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3169206272. Throughput: 0: 42872.0. Samples: 3169318100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:43,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 10:47:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193433_3169206272.pth... [2024-06-22 10:47:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000192806_3158933504.pth [2024-06-22 10:47:45,963][15401] Updated weights for policy 0, policy_version 193440 (0.0033) [2024-06-22 10:47:48,396][15132] Fps is (10 sec: 42572.3, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 3169402880. Throughput: 0: 42866.2. Samples: 3169573260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:48,396][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 10:47:49,961][15401] Updated weights for policy 0, policy_version 193450 (0.0022) [2024-06-22 10:47:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3169615872. Throughput: 0: 43010.6. Samples: 3169701560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:53,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 10:47:53,660][15401] Updated weights for policy 0, policy_version 193460 (0.0052) [2024-06-22 10:47:57,390][15401] Updated weights for policy 0, policy_version 193470 (0.0033) [2024-06-22 10:47:58,389][15132] Fps is (10 sec: 44263.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3169845248. Throughput: 0: 42845.3. Samples: 3169958740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:47:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 10:48:01,382][15401] Updated weights for policy 0, policy_version 193480 (0.0034) [2024-06-22 10:48:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 3170025472. Throughput: 0: 43004.3. Samples: 3170218400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:48:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 10:48:04,834][15401] Updated weights for policy 0, policy_version 193490 (0.0041) [2024-06-22 10:48:08,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43139.9, 300 sec: 42875.8). Total num frames: 3170271232. Throughput: 0: 42919.1. Samples: 3170344320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:48:08,405][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:48:09,626][15401] Updated weights for policy 0, policy_version 193500 (0.0040) [2024-06-22 10:48:12,620][15401] Updated weights for policy 0, policy_version 193510 (0.0034) [2024-06-22 10:48:13,394][15132] Fps is (10 sec: 45856.6, 60 sec: 42595.4, 300 sec: 42819.9). Total num frames: 3170484224. Throughput: 0: 42678.3. Samples: 3170596620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:48:13,394][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 10:48:17,240][15401] Updated weights for policy 0, policy_version 193520 (0.0025) [2024-06-22 10:48:18,391][15132] Fps is (10 sec: 40979.1, 60 sec: 42324.0, 300 sec: 42764.8). Total num frames: 3170680832. Throughput: 0: 43029.0. Samples: 3170867800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:48:18,392][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 10:48:19,269][15349] Signal inference workers to stop experience collection... (46750 times) [2024-06-22 10:48:19,270][15349] Signal inference workers to resume experience collection... (46750 times) [2024-06-22 10:48:19,283][15401] InferenceWorker_p0-w0: stopping experience collection (46750 times) [2024-06-22 10:48:19,283][15401] InferenceWorker_p0-w0: resuming experience collection (46750 times) [2024-06-22 10:48:19,995][15401] Updated weights for policy 0, policy_version 193530 (0.0033) [2024-06-22 10:48:23,389][15132] Fps is (10 sec: 44255.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3170926592. Throughput: 0: 42895.5. Samples: 3170989200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 10:48:24,827][15401] Updated weights for policy 0, policy_version 193540 (0.0033) [2024-06-22 10:48:27,786][15401] Updated weights for policy 0, policy_version 193550 (0.0031) [2024-06-22 10:48:28,389][15132] Fps is (10 sec: 45883.4, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 3171139584. Throughput: 0: 42776.5. Samples: 3171243040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 10:48:32,334][15401] Updated weights for policy 0, policy_version 193560 (0.0038) [2024-06-22 10:48:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3171319808. Throughput: 0: 42961.4. Samples: 3171506260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 10:48:35,426][15401] Updated weights for policy 0, policy_version 193570 (0.0026) [2024-06-22 10:48:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3171549184. Throughput: 0: 42797.0. Samples: 3171627420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 10:48:39,806][15401] Updated weights for policy 0, policy_version 193580 (0.0048) [2024-06-22 10:48:43,206][15401] Updated weights for policy 0, policy_version 193590 (0.0040) [2024-06-22 10:48:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3171794944. Throughput: 0: 42802.1. Samples: 3171884840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 10:48:47,289][15401] Updated weights for policy 0, policy_version 193600 (0.0030) [2024-06-22 10:48:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42875.8, 300 sec: 42709.5). Total num frames: 3171975168. Throughput: 0: 42946.3. Samples: 3172150980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:48,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 10:48:50,616][15401] Updated weights for policy 0, policy_version 193610 (0.0032) [2024-06-22 10:48:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 3172204544. Throughput: 0: 42850.5. Samples: 3172272420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:53,392][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 10:48:54,723][15401] Updated weights for policy 0, policy_version 193620 (0.0034) [2024-06-22 10:48:58,092][15401] Updated weights for policy 0, policy_version 193630 (0.0029) [2024-06-22 10:48:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3172450304. Throughput: 0: 43097.7. Samples: 3172535840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:48:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 10:49:02,320][15401] Updated weights for policy 0, policy_version 193640 (0.0047) [2024-06-22 10:49:03,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3172614144. Throughput: 0: 42870.5. Samples: 3172796900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:49:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 10:49:05,743][15401] Updated weights for policy 0, policy_version 193650 (0.0030) [2024-06-22 10:49:08,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 3172827136. Throughput: 0: 42798.2. Samples: 3172915120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:49:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 10:49:10,037][15401] Updated weights for policy 0, policy_version 193660 (0.0034) [2024-06-22 10:49:13,123][15401] Updated weights for policy 0, policy_version 193670 (0.0024) [2024-06-22 10:49:13,392][15132] Fps is (10 sec: 49140.0, 60 sec: 43691.9, 300 sec: 42986.8). Total num frames: 3173105664. Throughput: 0: 43071.0. Samples: 3173181340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:49:13,393][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 10:49:17,447][15401] Updated weights for policy 0, policy_version 193680 (0.0034) [2024-06-22 10:49:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43144.1, 300 sec: 42820.2). Total num frames: 3173269504. Throughput: 0: 43018.1. Samples: 3173442180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:49:18,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 10:49:20,575][15401] Updated weights for policy 0, policy_version 193690 (0.0032) [2024-06-22 10:49:23,389][15132] Fps is (10 sec: 36053.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 3173466112. Throughput: 0: 42963.1. Samples: 3173560760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-22 10:49:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 10:49:24,716][15349] Signal inference workers to stop experience collection... (46800 times) [2024-06-22 10:49:24,757][15401] InferenceWorker_p0-w0: stopping experience collection (46800 times) [2024-06-22 10:49:24,785][15349] Signal inference workers to resume experience collection... (46800 times) [2024-06-22 10:49:24,792][15401] InferenceWorker_p0-w0: resuming experience collection (46800 times) [2024-06-22 10:49:24,927][15401] Updated weights for policy 0, policy_version 193700 (0.0032) [2024-06-22 10:49:28,123][15401] Updated weights for policy 0, policy_version 193710 (0.0024) [2024-06-22 10:49:28,389][15132] Fps is (10 sec: 47525.4, 60 sec: 43417.6, 300 sec: 42932.1). Total num frames: 3173744640. Throughput: 0: 43283.7. Samples: 3173832600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 10:49:32,625][15401] Updated weights for policy 0, policy_version 193720 (0.0041) [2024-06-22 10:49:33,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3173908480. Throughput: 0: 43114.7. Samples: 3174091240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:33,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:49:35,854][15401] Updated weights for policy 0, policy_version 193730 (0.0043) [2024-06-22 10:49:38,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3174105088. Throughput: 0: 43069.0. Samples: 3174210420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 10:49:40,685][15401] Updated weights for policy 0, policy_version 193740 (0.0036) [2024-06-22 10:49:43,389][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3174383616. Throughput: 0: 43191.6. Samples: 3174479460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:43,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 10:49:43,396][15401] Updated weights for policy 0, policy_version 193750 (0.0036) [2024-06-22 10:49:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193750_3174400000.pth... [2024-06-22 10:49:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193121_3164094464.pth [2024-06-22 10:49:48,298][15401] Updated weights for policy 0, policy_version 193760 (0.0026) [2024-06-22 10:49:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3174563840. Throughput: 0: 43104.0. Samples: 3174736680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:48,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 10:49:51,409][15401] Updated weights for policy 0, policy_version 193770 (0.0036) [2024-06-22 10:49:53,389][15132] Fps is (10 sec: 36044.6, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 3174744064. Throughput: 0: 43097.3. Samples: 3174854500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 10:49:55,894][15401] Updated weights for policy 0, policy_version 193780 (0.0044) [2024-06-22 10:49:58,392][15132] Fps is (10 sec: 45874.9, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3175022592. Throughput: 0: 42992.9. Samples: 3175116020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:49:58,392][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 10:49:59,092][15401] Updated weights for policy 0, policy_version 193790 (0.0033) [2024-06-22 10:50:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3175202816. Throughput: 0: 42933.4. Samples: 3175374080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 10:50:03,448][15401] Updated weights for policy 0, policy_version 193800 (0.0035) [2024-06-22 10:50:06,764][15401] Updated weights for policy 0, policy_version 193810 (0.0033) [2024-06-22 10:50:08,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3175399424. Throughput: 0: 43034.7. Samples: 3175497320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 10:50:10,989][15401] Updated weights for policy 0, policy_version 193820 (0.0031) [2024-06-22 10:50:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 3175645184. Throughput: 0: 42765.7. Samples: 3175757060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 10:50:14,413][15401] Updated weights for policy 0, policy_version 193830 (0.0038) [2024-06-22 10:50:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 3175858176. Throughput: 0: 42766.7. Samples: 3176015640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 10:50:19,231][15401] Updated weights for policy 0, policy_version 193840 (0.0032) [2024-06-22 10:50:22,073][15401] Updated weights for policy 0, policy_version 193850 (0.0033) [2024-06-22 10:50:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3176054784. Throughput: 0: 42686.5. Samples: 3176131320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 10:50:26,882][15401] Updated weights for policy 0, policy_version 193860 (0.0047) [2024-06-22 10:50:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42932.0). Total num frames: 3176300544. Throughput: 0: 42620.7. Samples: 3176397400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:28,399][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 10:50:29,807][15401] Updated weights for policy 0, policy_version 193870 (0.0038) [2024-06-22 10:50:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.1, 300 sec: 42820.9). Total num frames: 3176480768. Throughput: 0: 42576.5. Samples: 3176652520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-22 10:50:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 10:50:34,445][15401] Updated weights for policy 0, policy_version 193880 (0.0041) [2024-06-22 10:50:34,454][15349] Signal inference workers to stop experience collection... (46850 times) [2024-06-22 10:50:34,462][15349] Signal inference workers to resume experience collection... (46850 times) [2024-06-22 10:50:34,489][15401] InferenceWorker_p0-w0: stopping experience collection (46850 times) [2024-06-22 10:50:34,489][15401] InferenceWorker_p0-w0: resuming experience collection (46850 times) [2024-06-22 10:50:37,747][15401] Updated weights for policy 0, policy_version 193890 (0.0028) [2024-06-22 10:50:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 3176693760. Throughput: 0: 42634.0. Samples: 3176773040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:50:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 10:50:42,012][15401] Updated weights for policy 0, policy_version 193900 (0.0026) [2024-06-22 10:50:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.6). Total num frames: 3176923136. Throughput: 0: 42608.9. Samples: 3177033320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:50:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 10:50:45,396][15401] Updated weights for policy 0, policy_version 193910 (0.0042) [2024-06-22 10:50:48,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3177119744. Throughput: 0: 42613.7. Samples: 3177291800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:50:48,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 10:50:49,614][15401] Updated weights for policy 0, policy_version 193920 (0.0027) [2024-06-22 10:50:53,055][15401] Updated weights for policy 0, policy_version 193930 (0.0044) [2024-06-22 10:50:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3177349120. Throughput: 0: 42643.6. Samples: 3177416280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:50:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 10:50:57,284][15401] Updated weights for policy 0, policy_version 193940 (0.0044) [2024-06-22 10:50:58,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 3177562112. Throughput: 0: 42587.6. Samples: 3177673500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:50:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 10:51:00,698][15401] Updated weights for policy 0, policy_version 193950 (0.0049) [2024-06-22 10:51:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42877.0). Total num frames: 3177758720. Throughput: 0: 42576.7. Samples: 3177931600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 10:51:04,833][15401] Updated weights for policy 0, policy_version 193960 (0.0034) [2024-06-22 10:51:08,278][15401] Updated weights for policy 0, policy_version 193970 (0.0029) [2024-06-22 10:51:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3178004480. Throughput: 0: 42866.4. Samples: 3178060300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 10:51:12,689][15401] Updated weights for policy 0, policy_version 193980 (0.0038) [2024-06-22 10:51:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3178217472. Throughput: 0: 42775.6. Samples: 3178322300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:13,396][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 10:51:16,011][15401] Updated weights for policy 0, policy_version 193990 (0.0042) [2024-06-22 10:51:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3178397696. Throughput: 0: 42597.3. Samples: 3178569400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 10:51:20,416][15401] Updated weights for policy 0, policy_version 194000 (0.0047) [2024-06-22 10:51:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3178627072. Throughput: 0: 42716.6. Samples: 3178695280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 10:51:23,853][15401] Updated weights for policy 0, policy_version 194010 (0.0027) [2024-06-22 10:51:27,857][15401] Updated weights for policy 0, policy_version 194020 (0.0032) [2024-06-22 10:51:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42820.8). Total num frames: 3178840064. Throughput: 0: 42847.2. Samples: 3178961440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 10:51:31,473][15401] Updated weights for policy 0, policy_version 194030 (0.0027) [2024-06-22 10:51:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3179036672. Throughput: 0: 42781.5. Samples: 3179216860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 10:51:35,698][15401] Updated weights for policy 0, policy_version 194040 (0.0028) [2024-06-22 10:51:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 3179266048. Throughput: 0: 42718.2. Samples: 3179338600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-22 10:51:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 10:51:38,966][15401] Updated weights for policy 0, policy_version 194050 (0.0024) [2024-06-22 10:51:43,074][15401] Updated weights for policy 0, policy_version 194060 (0.0033) [2024-06-22 10:51:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3179479040. Throughput: 0: 42816.5. Samples: 3179600240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:51:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 10:51:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194061_3179495424.pth... [2024-06-22 10:51:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193433_3169206272.pth [2024-06-22 10:51:46,518][15401] Updated weights for policy 0, policy_version 194070 (0.0036) [2024-06-22 10:51:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 3179675648. Throughput: 0: 42844.5. Samples: 3179859600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:51:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 10:51:50,712][15401] Updated weights for policy 0, policy_version 194080 (0.0035) [2024-06-22 10:51:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 3179905024. Throughput: 0: 42783.8. Samples: 3179985680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:51:53,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 10:51:54,175][15401] Updated weights for policy 0, policy_version 194090 (0.0025) [2024-06-22 10:51:58,353][15401] Updated weights for policy 0, policy_version 194100 (0.0028) [2024-06-22 10:51:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 3180134400. Throughput: 0: 42828.1. Samples: 3180249660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:51:58,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 10:52:01,837][15401] Updated weights for policy 0, policy_version 194110 (0.0041) [2024-06-22 10:52:03,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3180314624. Throughput: 0: 42944.0. Samples: 3180501880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 10:52:03,687][15349] Signal inference workers to stop experience collection... (46900 times) [2024-06-22 10:52:03,689][15349] Signal inference workers to resume experience collection... (46900 times) [2024-06-22 10:52:03,740][15401] InferenceWorker_p0-w0: stopping experience collection (46900 times) [2024-06-22 10:52:03,740][15401] InferenceWorker_p0-w0: resuming experience collection (46900 times) [2024-06-22 10:52:06,001][15401] Updated weights for policy 0, policy_version 194120 (0.0025) [2024-06-22 10:52:08,396][15132] Fps is (10 sec: 42579.8, 60 sec: 42593.6, 300 sec: 42819.6). Total num frames: 3180560384. Throughput: 0: 42892.7. Samples: 3180625740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:08,397][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 10:52:09,862][15401] Updated weights for policy 0, policy_version 194130 (0.0036) [2024-06-22 10:52:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3180773376. Throughput: 0: 42764.8. Samples: 3180885860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 10:52:13,743][15401] Updated weights for policy 0, policy_version 194140 (0.0026) [2024-06-22 10:52:17,744][15401] Updated weights for policy 0, policy_version 194150 (0.0031) [2024-06-22 10:52:18,389][15132] Fps is (10 sec: 40988.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3180969984. Throughput: 0: 42856.4. Samples: 3181145400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 10:52:21,283][15401] Updated weights for policy 0, policy_version 194160 (0.0036) [2024-06-22 10:52:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3181199360. Throughput: 0: 42959.9. Samples: 3181271800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 10:52:25,374][15401] Updated weights for policy 0, policy_version 194170 (0.0044) [2024-06-22 10:52:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3181412352. Throughput: 0: 42978.1. Samples: 3181534260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 10:52:28,991][15401] Updated weights for policy 0, policy_version 194180 (0.0041) [2024-06-22 10:52:32,795][15401] Updated weights for policy 0, policy_version 194190 (0.0030) [2024-06-22 10:52:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3181608960. Throughput: 0: 42844.9. Samples: 3181787620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 10:52:36,561][15401] Updated weights for policy 0, policy_version 194200 (0.0041) [2024-06-22 10:52:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3181821952. Throughput: 0: 42788.1. Samples: 3181911040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 10:52:40,368][15401] Updated weights for policy 0, policy_version 194210 (0.0047) [2024-06-22 10:52:43,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42593.7, 300 sec: 42820.5). Total num frames: 3182034944. Throughput: 0: 42834.8. Samples: 3182177400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 10:52:43,397][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 10:52:44,273][15401] Updated weights for policy 0, policy_version 194220 (0.0036) [2024-06-22 10:52:48,345][15401] Updated weights for policy 0, policy_version 194230 (0.0042) [2024-06-22 10:52:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3182264320. Throughput: 0: 42716.9. Samples: 3182424140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:52:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 10:52:52,070][15401] Updated weights for policy 0, policy_version 194240 (0.0031) [2024-06-22 10:52:53,390][15132] Fps is (10 sec: 42624.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 3182460928. Throughput: 0: 42863.5. Samples: 3182554320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:52:53,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 10:52:56,116][15401] Updated weights for policy 0, policy_version 194250 (0.0037) [2024-06-22 10:52:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42931.7). Total num frames: 3182690304. Throughput: 0: 42660.1. Samples: 3182805560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:52:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 10:52:59,805][15401] Updated weights for policy 0, policy_version 194260 (0.0062) [2024-06-22 10:53:03,392][15132] Fps is (10 sec: 42589.3, 60 sec: 42869.7, 300 sec: 42765.6). Total num frames: 3182886912. Throughput: 0: 42591.0. Samples: 3183062100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:03,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 10:53:03,735][15401] Updated weights for policy 0, policy_version 194270 (0.0029) [2024-06-22 10:53:07,447][15401] Updated weights for policy 0, policy_version 194280 (0.0037) [2024-06-22 10:53:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42330.1, 300 sec: 42765.6). Total num frames: 3183099904. Throughput: 0: 42614.3. Samples: 3183189440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 10:53:11,618][15401] Updated weights for policy 0, policy_version 194290 (0.0040) [2024-06-22 10:53:13,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42876.3). Total num frames: 3183329280. Throughput: 0: 42394.6. Samples: 3183442020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 10:53:14,963][15401] Updated weights for policy 0, policy_version 194300 (0.0043) [2024-06-22 10:53:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3183542272. Throughput: 0: 42575.1. Samples: 3183703500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 10:53:19,159][15401] Updated weights for policy 0, policy_version 194310 (0.0023) [2024-06-22 10:53:22,946][15401] Updated weights for policy 0, policy_version 194320 (0.0034) [2024-06-22 10:53:23,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 3183755264. Throughput: 0: 42579.7. Samples: 3183827400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:23,396][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 10:53:26,738][15401] Updated weights for policy 0, policy_version 194330 (0.0028) [2024-06-22 10:53:28,391][15132] Fps is (10 sec: 44228.6, 60 sec: 42870.2, 300 sec: 42931.4). Total num frames: 3183984640. Throughput: 0: 42456.4. Samples: 3184087740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:28,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 10:53:30,550][15401] Updated weights for policy 0, policy_version 194340 (0.0034) [2024-06-22 10:53:33,392][15132] Fps is (10 sec: 40976.2, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3184164864. Throughput: 0: 42932.7. Samples: 3184356220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:33,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 10:53:34,173][15401] Updated weights for policy 0, policy_version 194350 (0.0032) [2024-06-22 10:53:37,980][15401] Updated weights for policy 0, policy_version 194360 (0.0037) [2024-06-22 10:53:38,389][15132] Fps is (10 sec: 42606.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3184410624. Throughput: 0: 42832.3. Samples: 3184481760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:38,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 10:53:39,434][15349] Signal inference workers to stop experience collection... (46950 times) [2024-06-22 10:53:39,434][15349] Signal inference workers to resume experience collection... (46950 times) [2024-06-22 10:53:39,455][15401] InferenceWorker_p0-w0: stopping experience collection (46950 times) [2024-06-22 10:53:39,455][15401] InferenceWorker_p0-w0: resuming experience collection (46950 times) [2024-06-22 10:53:41,693][15401] Updated weights for policy 0, policy_version 194370 (0.0036) [2024-06-22 10:53:43,390][15132] Fps is (10 sec: 47522.9, 60 sec: 43421.9, 300 sec: 42931.6). Total num frames: 3184640000. Throughput: 0: 43110.1. Samples: 3184745540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:43,391][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 10:53:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194375_3184640000.pth... [2024-06-22 10:53:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000193750_3174400000.pth [2024-06-22 10:53:45,543][15401] Updated weights for policy 0, policy_version 194380 (0.0032) [2024-06-22 10:53:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 3184820224. Throughput: 0: 43186.3. Samples: 3185005480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:48,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 10:53:49,244][15401] Updated weights for policy 0, policy_version 194390 (0.0034) [2024-06-22 10:53:53,108][15401] Updated weights for policy 0, policy_version 194400 (0.0041) [2024-06-22 10:53:53,389][15132] Fps is (10 sec: 42600.6, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 3185065984. Throughput: 0: 43074.2. Samples: 3185127780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 10:53:53,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 10:53:56,745][15401] Updated weights for policy 0, policy_version 194410 (0.0055) [2024-06-22 10:53:58,389][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3185262592. Throughput: 0: 43244.6. Samples: 3185388020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:53:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 10:54:00,573][15401] Updated weights for policy 0, policy_version 194420 (0.0037) [2024-06-22 10:54:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 3185459200. Throughput: 0: 43219.6. Samples: 3185648380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:03,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 10:54:04,487][15401] Updated weights for policy 0, policy_version 194430 (0.0043) [2024-06-22 10:54:08,049][15401] Updated weights for policy 0, policy_version 194440 (0.0033) [2024-06-22 10:54:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 42765.4). Total num frames: 3185721344. Throughput: 0: 43298.6. Samples: 3185775560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 10:54:12,107][15401] Updated weights for policy 0, policy_version 194450 (0.0032) [2024-06-22 10:54:13,391][15132] Fps is (10 sec: 44228.4, 60 sec: 42870.2, 300 sec: 42820.6). Total num frames: 3185901568. Throughput: 0: 43051.1. Samples: 3186025040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:13,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 10:54:15,733][15401] Updated weights for policy 0, policy_version 194460 (0.0045) [2024-06-22 10:54:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3186114560. Throughput: 0: 42826.3. Samples: 3186283300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:18,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 10:54:19,817][15401] Updated weights for policy 0, policy_version 194470 (0.0051) [2024-06-22 10:54:23,390][15132] Fps is (10 sec: 44244.9, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 3186343936. Throughput: 0: 42917.7. Samples: 3186413060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 10:54:23,444][15401] Updated weights for policy 0, policy_version 194480 (0.0029) [2024-06-22 10:54:27,692][15401] Updated weights for policy 0, policy_version 194490 (0.0028) [2024-06-22 10:54:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42599.8, 300 sec: 42820.9). Total num frames: 3186540544. Throughput: 0: 42691.7. Samples: 3186666640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:28,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 10:54:31,045][15401] Updated weights for policy 0, policy_version 194500 (0.0027) [2024-06-22 10:54:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43417.6, 300 sec: 42931.3). Total num frames: 3186769920. Throughput: 0: 42682.2. Samples: 3186926180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:33,392][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 10:54:35,196][15401] Updated weights for policy 0, policy_version 194510 (0.0048) [2024-06-22 10:54:38,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 3186982912. Throughput: 0: 42794.8. Samples: 3187053600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:38,391][15132] Avg episode reward: [(0, '0.248')] [2024-06-22 10:54:38,890][15401] Updated weights for policy 0, policy_version 194520 (0.0029) [2024-06-22 10:54:42,878][15401] Updated weights for policy 0, policy_version 194530 (0.0040) [2024-06-22 10:54:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.7, 300 sec: 42820.9). Total num frames: 3187195904. Throughput: 0: 42668.9. Samples: 3187308120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:43,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 10:54:46,492][15401] Updated weights for policy 0, policy_version 194540 (0.0035) [2024-06-22 10:54:48,389][15132] Fps is (10 sec: 42603.5, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 3187408896. Throughput: 0: 42480.4. Samples: 3187560000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 10:54:50,718][15401] Updated weights for policy 0, policy_version 194550 (0.0031) [2024-06-22 10:54:53,145][15349] Signal inference workers to stop experience collection... (47000 times) [2024-06-22 10:54:53,145][15349] Signal inference workers to resume experience collection... (47000 times) [2024-06-22 10:54:53,179][15401] InferenceWorker_p0-w0: stopping experience collection (47000 times) [2024-06-22 10:54:53,180][15401] InferenceWorker_p0-w0: resuming experience collection (47000 times) [2024-06-22 10:54:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 3187621888. Throughput: 0: 42559.6. Samples: 3187690740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 10:54:54,087][15401] Updated weights for policy 0, policy_version 194560 (0.0031) [2024-06-22 10:54:58,267][15401] Updated weights for policy 0, policy_version 194570 (0.0032) [2024-06-22 10:54:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3187834880. Throughput: 0: 42891.5. Samples: 3187955180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 10:54:58,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 10:55:01,971][15401] Updated weights for policy 0, policy_version 194580 (0.0042) [2024-06-22 10:55:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3188047872. Throughput: 0: 42640.1. Samples: 3188202100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 10:55:06,133][15401] Updated weights for policy 0, policy_version 194590 (0.0034) [2024-06-22 10:55:08,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 3188260864. Throughput: 0: 42675.1. Samples: 3188333540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:08,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 10:55:09,654][15401] Updated weights for policy 0, policy_version 194600 (0.0032) [2024-06-22 10:55:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42599.8, 300 sec: 42709.5). Total num frames: 3188457472. Throughput: 0: 42690.3. Samples: 3188587700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 10:55:13,615][15401] Updated weights for policy 0, policy_version 194610 (0.0047) [2024-06-22 10:55:17,474][15401] Updated weights for policy 0, policy_version 194620 (0.0030) [2024-06-22 10:55:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3188686848. Throughput: 0: 42673.4. Samples: 3188846380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 10:55:21,557][15401] Updated weights for policy 0, policy_version 194630 (0.0040) [2024-06-22 10:55:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3188916224. Throughput: 0: 42692.6. Samples: 3188974720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 10:55:24,976][15401] Updated weights for policy 0, policy_version 194640 (0.0032) [2024-06-22 10:55:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3189112832. Throughput: 0: 42788.3. Samples: 3189233600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 10:55:28,923][15401] Updated weights for policy 0, policy_version 194650 (0.0035) [2024-06-22 10:55:32,405][15401] Updated weights for policy 0, policy_version 194660 (0.0037) [2024-06-22 10:55:33,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42325.3, 300 sec: 42764.7). Total num frames: 3189309440. Throughput: 0: 42919.5. Samples: 3189491480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:33,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 10:55:36,735][15401] Updated weights for policy 0, policy_version 194670 (0.0038) [2024-06-22 10:55:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42872.3, 300 sec: 42820.6). Total num frames: 3189555200. Throughput: 0: 42928.5. Samples: 3189622520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 10:55:39,991][15401] Updated weights for policy 0, policy_version 194680 (0.0030) [2024-06-22 10:55:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 3189735424. Throughput: 0: 42577.8. Samples: 3189871080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 10:55:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194686_3189735424.pth... [2024-06-22 10:55:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194061_3179495424.pth [2024-06-22 10:55:44,263][15401] Updated weights for policy 0, policy_version 194690 (0.0042) [2024-06-22 10:55:48,163][15401] Updated weights for policy 0, policy_version 194700 (0.0024) [2024-06-22 10:55:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3189964800. Throughput: 0: 42797.4. Samples: 3190127980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:48,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 10:55:52,405][15401] Updated weights for policy 0, policy_version 194710 (0.0032) [2024-06-22 10:55:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3190177792. Throughput: 0: 42863.6. Samples: 3190262300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 10:55:55,542][15401] Updated weights for policy 0, policy_version 194720 (0.0038) [2024-06-22 10:55:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 3190390784. Throughput: 0: 42839.5. Samples: 3190515480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:55:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 10:55:59,870][15401] Updated weights for policy 0, policy_version 194730 (0.0028) [2024-06-22 10:56:03,106][15401] Updated weights for policy 0, policy_version 194740 (0.0038) [2024-06-22 10:56:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3190620160. Throughput: 0: 42752.4. Samples: 3190770240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 10:56:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 10:56:07,294][15401] Updated weights for policy 0, policy_version 194750 (0.0033) [2024-06-22 10:56:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3190833152. Throughput: 0: 43000.1. Samples: 3190909720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 10:56:10,593][15349] Signal inference workers to stop experience collection... (47050 times) [2024-06-22 10:56:10,593][15349] Signal inference workers to resume experience collection... (47050 times) [2024-06-22 10:56:10,628][15401] InferenceWorker_p0-w0: stopping experience collection (47050 times) [2024-06-22 10:56:10,628][15401] InferenceWorker_p0-w0: resuming experience collection (47050 times) [2024-06-22 10:56:10,725][15401] Updated weights for policy 0, policy_version 194760 (0.0030) [2024-06-22 10:56:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3191029760. Throughput: 0: 42920.5. Samples: 3191165020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 10:56:14,862][15401] Updated weights for policy 0, policy_version 194770 (0.0037) [2024-06-22 10:56:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3191259136. Throughput: 0: 43014.8. Samples: 3191427040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 10:56:18,507][15401] Updated weights for policy 0, policy_version 194780 (0.0042) [2024-06-22 10:56:22,463][15401] Updated weights for policy 0, policy_version 194790 (0.0039) [2024-06-22 10:56:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3191488512. Throughput: 0: 43009.6. Samples: 3191557960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 10:56:26,150][15401] Updated weights for policy 0, policy_version 194800 (0.0027) [2024-06-22 10:56:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3191685120. Throughput: 0: 42920.9. Samples: 3191802520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 10:56:30,012][15401] Updated weights for policy 0, policy_version 194810 (0.0038) [2024-06-22 10:56:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3191881728. Throughput: 0: 43134.6. Samples: 3192069040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 10:56:33,908][15401] Updated weights for policy 0, policy_version 194820 (0.0040) [2024-06-22 10:56:37,905][15401] Updated weights for policy 0, policy_version 194830 (0.0031) [2024-06-22 10:56:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3192111104. Throughput: 0: 42840.1. Samples: 3192190100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 10:56:41,596][15401] Updated weights for policy 0, policy_version 194840 (0.0024) [2024-06-22 10:56:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3192340480. Throughput: 0: 42874.6. Samples: 3192444840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 10:56:45,638][15401] Updated weights for policy 0, policy_version 194850 (0.0036) [2024-06-22 10:56:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3192520704. Throughput: 0: 42975.2. Samples: 3192704120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 10:56:49,283][15401] Updated weights for policy 0, policy_version 194860 (0.0035) [2024-06-22 10:56:53,218][15401] Updated weights for policy 0, policy_version 194870 (0.0028) [2024-06-22 10:56:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3192750080. Throughput: 0: 42576.9. Samples: 3192825680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 10:56:57,056][15401] Updated weights for policy 0, policy_version 194880 (0.0034) [2024-06-22 10:56:58,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 3192979456. Throughput: 0: 42695.5. Samples: 3193086420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:56:58,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 10:57:00,955][15401] Updated weights for policy 0, policy_version 194890 (0.0046) [2024-06-22 10:57:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42710.5). Total num frames: 3193159680. Throughput: 0: 42553.4. Samples: 3193341940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:57:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 10:57:04,493][15401] Updated weights for policy 0, policy_version 194900 (0.0042) [2024-06-22 10:57:08,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3193389056. Throughput: 0: 42374.9. Samples: 3193464820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:57:08,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 10:57:08,440][15401] Updated weights for policy 0, policy_version 194910 (0.0025) [2024-06-22 10:57:12,271][15401] Updated weights for policy 0, policy_version 194920 (0.0028) [2024-06-22 10:57:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3193618432. Throughput: 0: 42733.4. Samples: 3193725520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 10:57:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 10:57:16,369][15401] Updated weights for policy 0, policy_version 194930 (0.0032) [2024-06-22 10:57:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3193815040. Throughput: 0: 42455.5. Samples: 3193979540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 10:57:19,864][15401] Updated weights for policy 0, policy_version 194940 (0.0039) [2024-06-22 10:57:22,927][15349] Signal inference workers to stop experience collection... (47100 times) [2024-06-22 10:57:22,931][15349] Signal inference workers to resume experience collection... (47100 times) [2024-06-22 10:57:22,948][15401] InferenceWorker_p0-w0: stopping experience collection (47100 times) [2024-06-22 10:57:22,948][15401] InferenceWorker_p0-w0: resuming experience collection (47100 times) [2024-06-22 10:57:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3194028032. Throughput: 0: 42632.0. Samples: 3194108540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 10:57:24,003][15401] Updated weights for policy 0, policy_version 194950 (0.0039) [2024-06-22 10:57:27,485][15401] Updated weights for policy 0, policy_version 194960 (0.0040) [2024-06-22 10:57:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3194241024. Throughput: 0: 42740.9. Samples: 3194368180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 10:57:31,594][15401] Updated weights for policy 0, policy_version 194970 (0.0048) [2024-06-22 10:57:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3194470400. Throughput: 0: 42789.7. Samples: 3194629660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:33,399][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 10:57:35,144][15401] Updated weights for policy 0, policy_version 194980 (0.0043) [2024-06-22 10:57:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42821.5). Total num frames: 3194667008. Throughput: 0: 43015.9. Samples: 3194761400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 10:57:39,030][15401] Updated weights for policy 0, policy_version 194990 (0.0035) [2024-06-22 10:57:42,635][15401] Updated weights for policy 0, policy_version 195000 (0.0034) [2024-06-22 10:57:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3194896384. Throughput: 0: 42771.2. Samples: 3195011020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 10:57:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195001_3194896384.pth... [2024-06-22 10:57:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194375_3184640000.pth [2024-06-22 10:57:47,127][15401] Updated weights for policy 0, policy_version 195010 (0.0040) [2024-06-22 10:57:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3195092992. Throughput: 0: 42841.3. Samples: 3195269800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 10:57:50,260][15401] Updated weights for policy 0, policy_version 195020 (0.0035) [2024-06-22 10:57:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3195322368. Throughput: 0: 42872.4. Samples: 3195394080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 10:57:54,847][15401] Updated weights for policy 0, policy_version 195030 (0.0037) [2024-06-22 10:57:58,011][15401] Updated weights for policy 0, policy_version 195040 (0.0033) [2024-06-22 10:57:58,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3195551744. Throughput: 0: 42878.1. Samples: 3195655140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:57:58,392][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 10:58:02,350][15401] Updated weights for policy 0, policy_version 195050 (0.0043) [2024-06-22 10:58:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3195731968. Throughput: 0: 42858.7. Samples: 3195908180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:58:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 10:58:05,765][15401] Updated weights for policy 0, policy_version 195060 (0.0042) [2024-06-22 10:58:08,392][15132] Fps is (10 sec: 40961.5, 60 sec: 42869.9, 300 sec: 42820.3). Total num frames: 3195961344. Throughput: 0: 42827.3. Samples: 3196035860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:58:08,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 10:58:10,017][15401] Updated weights for policy 0, policy_version 195070 (0.0033) [2024-06-22 10:58:13,378][15401] Updated weights for policy 0, policy_version 195080 (0.0028) [2024-06-22 10:58:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3196190720. Throughput: 0: 42868.5. Samples: 3196297260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:58:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 10:58:17,716][15401] Updated weights for policy 0, policy_version 195090 (0.0032) [2024-06-22 10:58:18,390][15132] Fps is (10 sec: 42603.8, 60 sec: 42871.0, 300 sec: 42821.4). Total num frames: 3196387328. Throughput: 0: 42766.0. Samples: 3196554160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 10:58:18,391][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 10:58:21,145][15401] Updated weights for policy 0, policy_version 195100 (0.0028) [2024-06-22 10:58:23,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 3196616704. Throughput: 0: 42616.1. Samples: 3196679120. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 10:58:25,289][15401] Updated weights for policy 0, policy_version 195110 (0.0043) [2024-06-22 10:58:28,389][15132] Fps is (10 sec: 40963.3, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3196796928. Throughput: 0: 42861.8. Samples: 3196939800. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:28,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 10:58:28,873][15401] Updated weights for policy 0, policy_version 195120 (0.0033) [2024-06-22 10:58:32,932][15401] Updated weights for policy 0, policy_version 195130 (0.0038) [2024-06-22 10:58:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3197026304. Throughput: 0: 42714.5. Samples: 3197191960. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:33,393][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 10:58:36,736][15401] Updated weights for policy 0, policy_version 195140 (0.0032) [2024-06-22 10:58:38,396][15132] Fps is (10 sec: 45845.5, 60 sec: 43140.0, 300 sec: 42764.2). Total num frames: 3197255680. Throughput: 0: 42982.7. Samples: 3197328580. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:38,405][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 10:58:40,392][15401] Updated weights for policy 0, policy_version 195150 (0.0043) [2024-06-22 10:58:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 3197435904. Throughput: 0: 42758.4. Samples: 3197579160. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 10:58:44,090][15349] Signal inference workers to stop experience collection... (47150 times) [2024-06-22 10:58:44,136][15401] InferenceWorker_p0-w0: stopping experience collection (47150 times) [2024-06-22 10:58:44,136][15349] Signal inference workers to resume experience collection... (47150 times) [2024-06-22 10:58:44,148][15401] InferenceWorker_p0-w0: resuming experience collection (47150 times) [2024-06-22 10:58:44,281][15401] Updated weights for policy 0, policy_version 195160 (0.0027) [2024-06-22 10:58:48,021][15401] Updated weights for policy 0, policy_version 195170 (0.0038) [2024-06-22 10:58:48,389][15132] Fps is (10 sec: 42625.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3197681664. Throughput: 0: 42887.6. Samples: 3197838120. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 10:58:51,839][15401] Updated weights for policy 0, policy_version 195180 (0.0028) [2024-06-22 10:58:53,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3197894656. Throughput: 0: 43010.8. Samples: 3197971260. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 10:58:55,645][15401] Updated weights for policy 0, policy_version 195190 (0.0033) [2024-06-22 10:58:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 3198091264. Throughput: 0: 42716.3. Samples: 3198219600. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:58:58,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 10:58:59,722][15401] Updated weights for policy 0, policy_version 195200 (0.0023) [2024-06-22 10:59:03,152][15401] Updated weights for policy 0, policy_version 195210 (0.0036) [2024-06-22 10:59:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3198320640. Throughput: 0: 42781.2. Samples: 3198479280. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:59:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 10:59:07,275][15401] Updated weights for policy 0, policy_version 195220 (0.0029) [2024-06-22 10:59:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42599.9, 300 sec: 42765.3). Total num frames: 3198517248. Throughput: 0: 42911.6. Samples: 3198610140. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:59:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 10:59:10,915][15401] Updated weights for policy 0, policy_version 195230 (0.0032) [2024-06-22 10:59:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 3198746624. Throughput: 0: 42612.3. Samples: 3198857460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:59:13,392][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 10:59:15,299][15401] Updated weights for policy 0, policy_version 195240 (0.0027) [2024-06-22 10:59:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 3198943232. Throughput: 0: 42754.3. Samples: 3199115900. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:59:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 10:59:18,661][15401] Updated weights for policy 0, policy_version 195250 (0.0032) [2024-06-22 10:59:23,246][15401] Updated weights for policy 0, policy_version 195260 (0.0036) [2024-06-22 10:59:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3199139840. Throughput: 0: 42555.8. Samples: 3199243320. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-22 10:59:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 10:59:26,129][15401] Updated weights for policy 0, policy_version 195270 (0.0035) [2024-06-22 10:59:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 3199385600. Throughput: 0: 42657.7. Samples: 3199498760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 10:59:30,828][15401] Updated weights for policy 0, policy_version 195280 (0.0024) [2024-06-22 10:59:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 3199598592. Throughput: 0: 42592.9. Samples: 3199754800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 10:59:33,678][15401] Updated weights for policy 0, policy_version 195290 (0.0045) [2024-06-22 10:59:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42330.0, 300 sec: 42709.5). Total num frames: 3199795200. Throughput: 0: 42522.3. Samples: 3199884760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 10:59:38,393][15401] Updated weights for policy 0, policy_version 195300 (0.0035) [2024-06-22 10:59:41,274][15401] Updated weights for policy 0, policy_version 195310 (0.0036) [2024-06-22 10:59:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3200008192. Throughput: 0: 42660.0. Samples: 3200139200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 10:59:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195314_3200024576.pth... [2024-06-22 10:59:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000194686_3189735424.pth [2024-06-22 10:59:46,131][15401] Updated weights for policy 0, policy_version 195320 (0.0025) [2024-06-22 10:59:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3200221184. Throughput: 0: 42475.2. Samples: 3200390660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:48,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 10:59:49,230][15401] Updated weights for policy 0, policy_version 195330 (0.0034) [2024-06-22 10:59:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 3200434176. Throughput: 0: 42403.5. Samples: 3200518300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:53,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 10:59:53,926][15401] Updated weights for policy 0, policy_version 195340 (0.0042) [2024-06-22 10:59:56,741][15401] Updated weights for policy 0, policy_version 195350 (0.0023) [2024-06-22 10:59:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 3200663552. Throughput: 0: 42709.4. Samples: 3200779280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 10:59:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 11:00:01,475][15401] Updated weights for policy 0, policy_version 195360 (0.0035) [2024-06-22 11:00:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3200876544. Throughput: 0: 42750.2. Samples: 3201039660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 11:00:03,456][15349] Signal inference workers to stop experience collection... (47200 times) [2024-06-22 11:00:03,500][15401] InferenceWorker_p0-w0: stopping experience collection (47200 times) [2024-06-22 11:00:03,508][15349] Signal inference workers to resume experience collection... (47200 times) [2024-06-22 11:00:03,515][15401] InferenceWorker_p0-w0: resuming experience collection (47200 times) [2024-06-22 11:00:04,084][15401] Updated weights for policy 0, policy_version 195370 (0.0037) [2024-06-22 11:00:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3201089536. Throughput: 0: 42732.5. Samples: 3201166380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:08,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 11:00:08,910][15401] Updated weights for policy 0, policy_version 195380 (0.0038) [2024-06-22 11:00:12,207][15401] Updated weights for policy 0, policy_version 195390 (0.0030) [2024-06-22 11:00:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 3201302528. Throughput: 0: 42670.6. Samples: 3201419040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:13,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 11:00:16,394][15401] Updated weights for policy 0, policy_version 195400 (0.0047) [2024-06-22 11:00:18,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3201482752. Throughput: 0: 42704.4. Samples: 3201676500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 11:00:19,783][15401] Updated weights for policy 0, policy_version 195410 (0.0027) [2024-06-22 11:00:23,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3201712128. Throughput: 0: 42443.8. Samples: 3201794840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:23,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 11:00:24,035][15401] Updated weights for policy 0, policy_version 195420 (0.0033) [2024-06-22 11:00:27,521][15401] Updated weights for policy 0, policy_version 195430 (0.0038) [2024-06-22 11:00:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3201941504. Throughput: 0: 42566.3. Samples: 3202054680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 11:00:31,896][15401] Updated weights for policy 0, policy_version 195440 (0.0042) [2024-06-22 11:00:33,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3202138112. Throughput: 0: 42770.2. Samples: 3202315320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-22 11:00:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 11:00:35,362][15401] Updated weights for policy 0, policy_version 195450 (0.0041) [2024-06-22 11:00:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3202351104. Throughput: 0: 42579.6. Samples: 3202434380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:00:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 11:00:39,586][15401] Updated weights for policy 0, policy_version 195460 (0.0032) [2024-06-22 11:00:43,031][15401] Updated weights for policy 0, policy_version 195470 (0.0045) [2024-06-22 11:00:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3202596864. Throughput: 0: 42666.3. Samples: 3202699260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:00:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 11:00:47,266][15401] Updated weights for policy 0, policy_version 195480 (0.0031) [2024-06-22 11:00:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3202777088. Throughput: 0: 42530.1. Samples: 3202953520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:00:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 11:00:50,779][15401] Updated weights for policy 0, policy_version 195490 (0.0034) [2024-06-22 11:00:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3202990080. Throughput: 0: 42453.1. Samples: 3203076660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:00:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 11:00:54,867][15401] Updated weights for policy 0, policy_version 195500 (0.0033) [2024-06-22 11:00:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3203235840. Throughput: 0: 42754.8. Samples: 3203342900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:00:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 11:00:58,391][15401] Updated weights for policy 0, policy_version 195510 (0.0027) [2024-06-22 11:01:02,458][15401] Updated weights for policy 0, policy_version 195520 (0.0037) [2024-06-22 11:01:03,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 3203432448. Throughput: 0: 42665.8. Samples: 3203596560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:03,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 11:01:05,982][15401] Updated weights for policy 0, policy_version 195530 (0.0033) [2024-06-22 11:01:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 3203645440. Throughput: 0: 42731.6. Samples: 3203717760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:08,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 11:01:10,196][15401] Updated weights for policy 0, policy_version 195540 (0.0036) [2024-06-22 11:01:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3203874816. Throughput: 0: 42859.9. Samples: 3203983380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 11:01:13,905][15401] Updated weights for policy 0, policy_version 195550 (0.0036) [2024-06-22 11:01:17,990][15401] Updated weights for policy 0, policy_version 195560 (0.0034) [2024-06-22 11:01:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 3204071424. Throughput: 0: 42508.8. Samples: 3204228220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 11:01:21,741][15401] Updated weights for policy 0, policy_version 195570 (0.0033) [2024-06-22 11:01:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 3204284416. Throughput: 0: 42724.3. Samples: 3204357080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:23,393][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 11:01:25,449][15401] Updated weights for policy 0, policy_version 195580 (0.0032) [2024-06-22 11:01:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3204497408. Throughput: 0: 42645.4. Samples: 3204618300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 11:01:29,353][15401] Updated weights for policy 0, policy_version 195590 (0.0044) [2024-06-22 11:01:33,362][15401] Updated weights for policy 0, policy_version 195600 (0.0033) [2024-06-22 11:01:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3204710400. Throughput: 0: 42660.0. Samples: 3204873220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 11:01:35,937][15349] Signal inference workers to stop experience collection... (47250 times) [2024-06-22 11:01:35,943][15349] Signal inference workers to resume experience collection... (47250 times) [2024-06-22 11:01:35,972][15401] InferenceWorker_p0-w0: stopping experience collection (47250 times) [2024-06-22 11:01:35,972][15401] InferenceWorker_p0-w0: resuming experience collection (47250 times) [2024-06-22 11:01:36,932][15401] Updated weights for policy 0, policy_version 195610 (0.0035) [2024-06-22 11:01:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3204907008. Throughput: 0: 42677.3. Samples: 3204997140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 11:01:38,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 11:01:40,991][15401] Updated weights for policy 0, policy_version 195620 (0.0032) [2024-06-22 11:01:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 3205136384. Throughput: 0: 42506.6. Samples: 3205255800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:01:43,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 11:01:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195626_3205136384.pth... [2024-06-22 11:01:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195001_3194896384.pth [2024-06-22 11:01:44,509][15401] Updated weights for policy 0, policy_version 195630 (0.0042) [2024-06-22 11:01:48,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 3205349376. Throughput: 0: 42531.3. Samples: 3205510640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:01:48,396][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 11:01:48,632][15401] Updated weights for policy 0, policy_version 195640 (0.0042) [2024-06-22 11:01:52,355][15401] Updated weights for policy 0, policy_version 195650 (0.0029) [2024-06-22 11:01:53,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 3205562368. Throughput: 0: 42859.7. Samples: 3205646340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:01:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 11:01:56,234][15401] Updated weights for policy 0, policy_version 195660 (0.0033) [2024-06-22 11:01:58,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3205775360. Throughput: 0: 42790.8. Samples: 3205908960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:01:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 11:01:59,765][15401] Updated weights for policy 0, policy_version 195670 (0.0040) [2024-06-22 11:02:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 3205988352. Throughput: 0: 43022.7. Samples: 3206164240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 11:02:03,746][15401] Updated weights for policy 0, policy_version 195680 (0.0032) [2024-06-22 11:02:07,399][15401] Updated weights for policy 0, policy_version 195690 (0.0034) [2024-06-22 11:02:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 3206217728. Throughput: 0: 43058.8. Samples: 3206294620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 11:02:11,333][15401] Updated weights for policy 0, policy_version 195700 (0.0036) [2024-06-22 11:02:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 3206397952. Throughput: 0: 42928.9. Samples: 3206550100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 11:02:14,915][15401] Updated weights for policy 0, policy_version 195710 (0.0030) [2024-06-22 11:02:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 3206643712. Throughput: 0: 43126.3. Samples: 3206814000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:18,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 11:02:18,852][15401] Updated weights for policy 0, policy_version 195720 (0.0029) [2024-06-22 11:02:22,440][15401] Updated weights for policy 0, policy_version 195730 (0.0035) [2024-06-22 11:02:23,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 3206873088. Throughput: 0: 43241.3. Samples: 3206943000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:23,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 11:02:26,323][15401] Updated weights for policy 0, policy_version 195740 (0.0030) [2024-06-22 11:02:28,390][15132] Fps is (10 sec: 40968.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3207053312. Throughput: 0: 43173.8. Samples: 3207198520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:28,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 11:02:29,938][15401] Updated weights for policy 0, policy_version 195750 (0.0032) [2024-06-22 11:02:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3207299072. Throughput: 0: 43283.6. Samples: 3207458120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 11:02:33,719][15401] Updated weights for policy 0, policy_version 195760 (0.0034) [2024-06-22 11:02:37,615][15401] Updated weights for policy 0, policy_version 195770 (0.0049) [2024-06-22 11:02:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3207512064. Throughput: 0: 43201.6. Samples: 3207590420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 11:02:41,188][15401] Updated weights for policy 0, policy_version 195780 (0.0026) [2024-06-22 11:02:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 3207692288. Throughput: 0: 43083.6. Samples: 3207847720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 11:02:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 11:02:44,635][15349] Signal inference workers to stop experience collection... (47300 times) [2024-06-22 11:02:44,640][15349] Signal inference workers to resume experience collection... (47300 times) [2024-06-22 11:02:44,653][15401] InferenceWorker_p0-w0: stopping experience collection (47300 times) [2024-06-22 11:02:44,653][15401] InferenceWorker_p0-w0: resuming experience collection (47300 times) [2024-06-22 11:02:45,123][15401] Updated weights for policy 0, policy_version 195790 (0.0036) [2024-06-22 11:02:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 3207938048. Throughput: 0: 43045.7. Samples: 3208101300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:02:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 11:02:49,371][15401] Updated weights for policy 0, policy_version 195800 (0.0038) [2024-06-22 11:02:52,759][15401] Updated weights for policy 0, policy_version 195810 (0.0031) [2024-06-22 11:02:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 3208151040. Throughput: 0: 43084.1. Samples: 3208233400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:02:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 11:02:56,798][15401] Updated weights for policy 0, policy_version 195820 (0.0024) [2024-06-22 11:02:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3208331264. Throughput: 0: 42966.5. Samples: 3208483700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:02:58,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 11:03:00,360][15401] Updated weights for policy 0, policy_version 195830 (0.0042) [2024-06-22 11:03:03,392][15132] Fps is (10 sec: 42587.5, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 3208577024. Throughput: 0: 42883.8. Samples: 3208743780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:03,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 11:03:04,520][15401] Updated weights for policy 0, policy_version 195840 (0.0034) [2024-06-22 11:03:08,174][15401] Updated weights for policy 0, policy_version 195850 (0.0034) [2024-06-22 11:03:08,391][15132] Fps is (10 sec: 47520.1, 60 sec: 43143.7, 300 sec: 42764.8). Total num frames: 3208806400. Throughput: 0: 42995.9. Samples: 3208877860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:08,391][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 11:03:12,609][15401] Updated weights for policy 0, policy_version 195860 (0.0029) [2024-06-22 11:03:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42709.6). Total num frames: 3208986624. Throughput: 0: 42810.8. Samples: 3209125000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 11:03:15,997][15401] Updated weights for policy 0, policy_version 195870 (0.0030) [2024-06-22 11:03:18,389][15132] Fps is (10 sec: 42603.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 3209232384. Throughput: 0: 42724.4. Samples: 3209380720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 11:03:20,192][15401] Updated weights for policy 0, policy_version 195880 (0.0037) [2024-06-22 11:03:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3209428992. Throughput: 0: 42786.7. Samples: 3209515820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 11:03:23,633][15401] Updated weights for policy 0, policy_version 195890 (0.0036) [2024-06-22 11:03:27,744][15401] Updated weights for policy 0, policy_version 195900 (0.0030) [2024-06-22 11:03:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 3209641984. Throughput: 0: 42613.7. Samples: 3209765440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:28,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 11:03:31,595][15401] Updated weights for policy 0, policy_version 195910 (0.0034) [2024-06-22 11:03:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.6, 300 sec: 42765.6). Total num frames: 3209871360. Throughput: 0: 42750.7. Samples: 3210025180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:33,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 11:03:35,238][15401] Updated weights for policy 0, policy_version 195920 (0.0038) [2024-06-22 11:03:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3210084352. Throughput: 0: 42663.3. Samples: 3210153260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:38,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 11:03:39,120][15401] Updated weights for policy 0, policy_version 195930 (0.0042) [2024-06-22 11:03:43,030][15401] Updated weights for policy 0, policy_version 195940 (0.0033) [2024-06-22 11:03:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 3210280960. Throughput: 0: 42857.3. Samples: 3210412280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:43,393][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 11:03:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195940_3210280960.pth... [2024-06-22 11:03:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195314_3200024576.pth [2024-06-22 11:03:46,736][15401] Updated weights for policy 0, policy_version 195950 (0.0038) [2024-06-22 11:03:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3210493952. Throughput: 0: 42678.4. Samples: 3210664200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 11:03:50,607][15401] Updated weights for policy 0, policy_version 195960 (0.0032) [2024-06-22 11:03:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 3210706944. Throughput: 0: 42503.7. Samples: 3210790480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 11:03:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 11:03:54,861][15401] Updated weights for policy 0, policy_version 195970 (0.0038) [2024-06-22 11:03:55,659][15349] Signal inference workers to stop experience collection... (47350 times) [2024-06-22 11:03:55,712][15401] InferenceWorker_p0-w0: stopping experience collection (47350 times) [2024-06-22 11:03:55,713][15349] Signal inference workers to resume experience collection... (47350 times) [2024-06-22 11:03:55,731][15401] InferenceWorker_p0-w0: resuming experience collection (47350 times) [2024-06-22 11:03:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 3210919936. Throughput: 0: 42711.0. Samples: 3211047000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:03:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 11:03:58,523][15401] Updated weights for policy 0, policy_version 195980 (0.0029) [2024-06-22 11:04:02,324][15401] Updated weights for policy 0, policy_version 195990 (0.0025) [2024-06-22 11:04:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 3211132928. Throughput: 0: 42775.2. Samples: 3211305600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 11:04:06,249][15401] Updated weights for policy 0, policy_version 196000 (0.0030) [2024-06-22 11:04:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42599.2, 300 sec: 42765.4). Total num frames: 3211362304. Throughput: 0: 42688.0. Samples: 3211436780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 11:04:09,742][15401] Updated weights for policy 0, policy_version 196010 (0.0024) [2024-06-22 11:04:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3211558912. Throughput: 0: 42822.8. Samples: 3211692360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 11:04:14,101][15401] Updated weights for policy 0, policy_version 196020 (0.0045) [2024-06-22 11:04:17,306][15401] Updated weights for policy 0, policy_version 196030 (0.0056) [2024-06-22 11:04:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3211771904. Throughput: 0: 42694.9. Samples: 3211946340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 11:04:21,604][15401] Updated weights for policy 0, policy_version 196040 (0.0031) [2024-06-22 11:04:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3212001280. Throughput: 0: 42768.9. Samples: 3212077860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:23,398][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 11:04:24,864][15401] Updated weights for policy 0, policy_version 196050 (0.0028) [2024-06-22 11:04:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3212214272. Throughput: 0: 42702.8. Samples: 3212333800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 11:04:29,139][15401] Updated weights for policy 0, policy_version 196060 (0.0024) [2024-06-22 11:04:32,319][15401] Updated weights for policy 0, policy_version 196070 (0.0035) [2024-06-22 11:04:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42820.5). Total num frames: 3212427264. Throughput: 0: 42879.5. Samples: 3212593780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 11:04:36,851][15401] Updated weights for policy 0, policy_version 196080 (0.0045) [2024-06-22 11:04:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3212656640. Throughput: 0: 42941.0. Samples: 3212722820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 11:04:40,557][15401] Updated weights for policy 0, policy_version 196090 (0.0029) [2024-06-22 11:04:43,391][15132] Fps is (10 sec: 44230.8, 60 sec: 43145.4, 300 sec: 42875.9). Total num frames: 3212869632. Throughput: 0: 43011.7. Samples: 3212982580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:43,391][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 11:04:44,311][15401] Updated weights for policy 0, policy_version 196100 (0.0028) [2024-06-22 11:04:48,124][15401] Updated weights for policy 0, policy_version 196110 (0.0027) [2024-06-22 11:04:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3213066240. Throughput: 0: 42949.8. Samples: 3213238340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 11:04:51,751][15401] Updated weights for policy 0, policy_version 196120 (0.0036) [2024-06-22 11:04:53,392][15132] Fps is (10 sec: 42595.4, 60 sec: 43143.1, 300 sec: 42820.3). Total num frames: 3213295616. Throughput: 0: 42863.4. Samples: 3213365720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:53,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 11:04:55,570][15401] Updated weights for policy 0, policy_version 196130 (0.0046) [2024-06-22 11:04:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3213492224. Throughput: 0: 43007.9. Samples: 3213627820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:04:58,392][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 11:04:59,197][15401] Updated weights for policy 0, policy_version 196140 (0.0033) [2024-06-22 11:05:02,967][15401] Updated weights for policy 0, policy_version 196150 (0.0036) [2024-06-22 11:05:03,389][15132] Fps is (10 sec: 42607.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3213721600. Throughput: 0: 43031.9. Samples: 3213882780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 11:05:06,671][15401] Updated weights for policy 0, policy_version 196160 (0.0031) [2024-06-22 11:05:08,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.8, 300 sec: 42820.6). Total num frames: 3213934592. Throughput: 0: 43150.6. Samples: 3214019740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:08,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 11:05:10,276][15349] Signal inference workers to stop experience collection... (47400 times) [2024-06-22 11:05:10,276][15349] Signal inference workers to resume experience collection... (47400 times) [2024-06-22 11:05:10,307][15401] InferenceWorker_p0-w0: stopping experience collection (47400 times) [2024-06-22 11:05:10,307][15401] InferenceWorker_p0-w0: resuming experience collection (47400 times) [2024-06-22 11:05:10,415][15401] Updated weights for policy 0, policy_version 196170 (0.0036) [2024-06-22 11:05:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3214147584. Throughput: 0: 43127.5. Samples: 3214274540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 11:05:14,329][15401] Updated weights for policy 0, policy_version 196180 (0.0030) [2024-06-22 11:05:17,807][15401] Updated weights for policy 0, policy_version 196190 (0.0028) [2024-06-22 11:05:18,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43415.8, 300 sec: 42931.6). Total num frames: 3214376960. Throughput: 0: 43079.0. Samples: 3214532440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:18,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 11:05:22,167][15401] Updated weights for policy 0, policy_version 196200 (0.0041) [2024-06-22 11:05:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3214589952. Throughput: 0: 43114.2. Samples: 3214662960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 11:05:25,770][15401] Updated weights for policy 0, policy_version 196210 (0.0041) [2024-06-22 11:05:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3214802944. Throughput: 0: 43016.8. Samples: 3214918280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 11:05:29,831][15401] Updated weights for policy 0, policy_version 196220 (0.0034) [2024-06-22 11:05:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3215015936. Throughput: 0: 43086.9. Samples: 3215177260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 11:05:33,544][15401] Updated weights for policy 0, policy_version 196230 (0.0032) [2024-06-22 11:05:37,420][15401] Updated weights for policy 0, policy_version 196240 (0.0044) [2024-06-22 11:05:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3215228928. Throughput: 0: 43127.2. Samples: 3215306460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:38,393][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 11:05:41,202][15401] Updated weights for policy 0, policy_version 196250 (0.0043) [2024-06-22 11:05:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42599.3, 300 sec: 42876.1). Total num frames: 3215425536. Throughput: 0: 42903.6. Samples: 3215558380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 11:05:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196255_3215441920.pth... [2024-06-22 11:05:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195626_3205136384.pth [2024-06-22 11:05:44,940][15401] Updated weights for policy 0, policy_version 196260 (0.0040) [2024-06-22 11:05:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3215654912. Throughput: 0: 42907.6. Samples: 3215813620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 11:05:48,843][15401] Updated weights for policy 0, policy_version 196270 (0.0032) [2024-06-22 11:05:52,384][15401] Updated weights for policy 0, policy_version 196280 (0.0022) [2024-06-22 11:05:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42872.9, 300 sec: 42820.5). Total num frames: 3215867904. Throughput: 0: 42860.4. Samples: 3215948360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:05:56,346][15401] Updated weights for policy 0, policy_version 196290 (0.0032) [2024-06-22 11:05:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42820.9). Total num frames: 3216064512. Throughput: 0: 42821.7. Samples: 3216201520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:05:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 11:06:00,057][15401] Updated weights for policy 0, policy_version 196300 (0.0042) [2024-06-22 11:06:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 3216293888. Throughput: 0: 42927.7. Samples: 3216464080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:06:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 11:06:03,859][15401] Updated weights for policy 0, policy_version 196310 (0.0038) [2024-06-22 11:06:07,519][15401] Updated weights for policy 0, policy_version 196320 (0.0036) [2024-06-22 11:06:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3216523264. Throughput: 0: 42911.5. Samples: 3216593980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 11:06:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 11:06:12,061][15401] Updated weights for policy 0, policy_version 196330 (0.0040) [2024-06-22 11:06:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3216703488. Throughput: 0: 42885.0. Samples: 3216848100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 11:06:15,178][15401] Updated weights for policy 0, policy_version 196340 (0.0038) [2024-06-22 11:06:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42876.5). Total num frames: 3216932864. Throughput: 0: 42916.6. Samples: 3217108500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 11:06:19,790][15401] Updated weights for policy 0, policy_version 196350 (0.0031) [2024-06-22 11:06:22,690][15401] Updated weights for policy 0, policy_version 196360 (0.0023) [2024-06-22 11:06:23,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3217162240. Throughput: 0: 42896.4. Samples: 3217236800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:23,393][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 11:06:27,245][15401] Updated weights for policy 0, policy_version 196370 (0.0030) [2024-06-22 11:06:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3217358848. Throughput: 0: 43057.3. Samples: 3217495960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 11:06:30,232][15401] Updated weights for policy 0, policy_version 196380 (0.0034) [2024-06-22 11:06:33,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 3217571840. Throughput: 0: 43188.7. Samples: 3217757220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:33,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 11:06:34,725][15401] Updated weights for policy 0, policy_version 196390 (0.0036) [2024-06-22 11:06:37,971][15401] Updated weights for policy 0, policy_version 196400 (0.0032) [2024-06-22 11:06:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.3, 300 sec: 42987.5). Total num frames: 3217817600. Throughput: 0: 42993.9. Samples: 3217883080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 11:06:42,452][15401] Updated weights for policy 0, policy_version 196410 (0.0039) [2024-06-22 11:06:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 3218014208. Throughput: 0: 42964.6. Samples: 3218134920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 11:06:45,091][15349] Signal inference workers to stop experience collection... (47450 times) [2024-06-22 11:06:45,116][15401] InferenceWorker_p0-w0: stopping experience collection (47450 times) [2024-06-22 11:06:45,154][15349] Signal inference workers to resume experience collection... (47450 times) [2024-06-22 11:06:45,154][15401] InferenceWorker_p0-w0: resuming experience collection (47450 times) [2024-06-22 11:06:45,869][15401] Updated weights for policy 0, policy_version 196420 (0.0036) [2024-06-22 11:06:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3218210816. Throughput: 0: 42809.7. Samples: 3218390520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 11:06:49,969][15401] Updated weights for policy 0, policy_version 196430 (0.0032) [2024-06-22 11:06:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3218456576. Throughput: 0: 42799.5. Samples: 3218519960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 11:06:53,820][15401] Updated weights for policy 0, policy_version 196440 (0.0025) [2024-06-22 11:06:57,880][15401] Updated weights for policy 0, policy_version 196450 (0.0036) [2024-06-22 11:06:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 3218653184. Throughput: 0: 42822.1. Samples: 3218775200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:06:58,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 11:07:01,756][15401] Updated weights for policy 0, policy_version 196460 (0.0043) [2024-06-22 11:07:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3218849792. Throughput: 0: 42628.8. Samples: 3219026800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:07:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 11:07:05,533][15401] Updated weights for policy 0, policy_version 196470 (0.0038) [2024-06-22 11:07:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3219079168. Throughput: 0: 42618.3. Samples: 3219154520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:07:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 11:07:09,528][15401] Updated weights for policy 0, policy_version 196480 (0.0034) [2024-06-22 11:07:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3219275776. Throughput: 0: 42553.8. Samples: 3219410880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 11:07:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 11:07:13,564][15401] Updated weights for policy 0, policy_version 196490 (0.0034) [2024-06-22 11:07:17,109][15401] Updated weights for policy 0, policy_version 196500 (0.0036) [2024-06-22 11:07:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3219488768. Throughput: 0: 42388.0. Samples: 3219664580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 11:07:21,254][15401] Updated weights for policy 0, policy_version 196510 (0.0038) [2024-06-22 11:07:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 3219734528. Throughput: 0: 42450.2. Samples: 3219793340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 11:07:24,604][15401] Updated weights for policy 0, policy_version 196520 (0.0047) [2024-06-22 11:07:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3219931136. Throughput: 0: 42661.7. Samples: 3220054700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 11:07:28,727][15401] Updated weights for policy 0, policy_version 196530 (0.0034) [2024-06-22 11:07:32,327][15401] Updated weights for policy 0, policy_version 196540 (0.0028) [2024-06-22 11:07:33,395][15132] Fps is (10 sec: 39300.7, 60 sec: 42596.4, 300 sec: 42764.3). Total num frames: 3220127744. Throughput: 0: 42560.3. Samples: 3220305960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:33,395][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 11:07:36,452][15401] Updated weights for policy 0, policy_version 196550 (0.0050) [2024-06-22 11:07:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3220373504. Throughput: 0: 42563.6. Samples: 3220435320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 11:07:39,886][15401] Updated weights for policy 0, policy_version 196560 (0.0047) [2024-06-22 11:07:43,392][15132] Fps is (10 sec: 42610.2, 60 sec: 42323.5, 300 sec: 42764.7). Total num frames: 3220553728. Throughput: 0: 42735.0. Samples: 3220698280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:43,393][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 11:07:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196568_3220570112.pth... [2024-06-22 11:07:43,598][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000195940_3210280960.pth [2024-06-22 11:07:44,013][15401] Updated weights for policy 0, policy_version 196570 (0.0023) [2024-06-22 11:07:47,423][15401] Updated weights for policy 0, policy_version 196580 (0.0039) [2024-06-22 11:07:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3220783104. Throughput: 0: 42824.5. Samples: 3220953900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:48,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 11:07:51,628][15401] Updated weights for policy 0, policy_version 196590 (0.0034) [2024-06-22 11:07:53,390][15132] Fps is (10 sec: 47525.4, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 3221028864. Throughput: 0: 42932.8. Samples: 3221086500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:53,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 11:07:55,001][15401] Updated weights for policy 0, policy_version 196600 (0.0029) [2024-06-22 11:07:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42054.0, 300 sec: 42709.8). Total num frames: 3221176320. Throughput: 0: 42962.8. Samples: 3221344200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:07:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 11:07:59,244][15401] Updated weights for policy 0, policy_version 196610 (0.0026) [2024-06-22 11:08:02,563][15401] Updated weights for policy 0, policy_version 196620 (0.0038) [2024-06-22 11:08:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.7). Total num frames: 3221438464. Throughput: 0: 42968.5. Samples: 3221598160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:08:03,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 11:08:06,458][15349] Signal inference workers to stop experience collection... (47500 times) [2024-06-22 11:08:06,509][15349] Signal inference workers to resume experience collection... (47500 times) [2024-06-22 11:08:06,510][15401] InferenceWorker_p0-w0: stopping experience collection (47500 times) [2024-06-22 11:08:06,523][15401] InferenceWorker_p0-w0: resuming experience collection (47500 times) [2024-06-22 11:08:07,000][15401] Updated weights for policy 0, policy_version 196630 (0.0037) [2024-06-22 11:08:08,389][15132] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3221667840. Throughput: 0: 43069.8. Samples: 3221731480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:08:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 11:08:10,102][15401] Updated weights for policy 0, policy_version 196640 (0.0037) [2024-06-22 11:08:13,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3221798912. Throughput: 0: 42792.5. Samples: 3221980360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:08:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 11:08:14,659][15401] Updated weights for policy 0, policy_version 196650 (0.0028) [2024-06-22 11:08:18,367][15401] Updated weights for policy 0, policy_version 196660 (0.0025) [2024-06-22 11:08:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 3222077440. Throughput: 0: 42765.0. Samples: 3222230260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 11:08:18,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 11:08:22,426][15401] Updated weights for policy 0, policy_version 196670 (0.0029) [2024-06-22 11:08:23,389][15132] Fps is (10 sec: 50790.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 3222306816. Throughput: 0: 42957.8. Samples: 3222368420. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 11:08:25,855][15401] Updated weights for policy 0, policy_version 196680 (0.0029) [2024-06-22 11:08:28,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 3222454272. Throughput: 0: 42674.0. Samples: 3222618500. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 11:08:29,989][15401] Updated weights for policy 0, policy_version 196690 (0.0038) [2024-06-22 11:08:33,391][15132] Fps is (10 sec: 40954.7, 60 sec: 43147.5, 300 sec: 42820.4). Total num frames: 3222716416. Throughput: 0: 42600.2. Samples: 3222870960. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:33,391][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 11:08:33,712][15401] Updated weights for policy 0, policy_version 196700 (0.0037) [2024-06-22 11:08:37,585][15401] Updated weights for policy 0, policy_version 196710 (0.0039) [2024-06-22 11:08:38,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42876.5). Total num frames: 3222929408. Throughput: 0: 42749.4. Samples: 3223010220. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 11:08:41,154][15401] Updated weights for policy 0, policy_version 196720 (0.0039) [2024-06-22 11:08:43,390][15132] Fps is (10 sec: 40964.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 3223126016. Throughput: 0: 42626.5. Samples: 3223262400. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 11:08:45,418][15401] Updated weights for policy 0, policy_version 196730 (0.0045) [2024-06-22 11:08:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3223371776. Throughput: 0: 42484.9. Samples: 3223509980. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 11:08:48,615][15401] Updated weights for policy 0, policy_version 196740 (0.0033) [2024-06-22 11:08:53,131][15401] Updated weights for policy 0, policy_version 196750 (0.0029) [2024-06-22 11:08:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3223568384. Throughput: 0: 42562.1. Samples: 3223646780. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 11:08:56,372][15401] Updated weights for policy 0, policy_version 196760 (0.0034) [2024-06-22 11:08:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3223764992. Throughput: 0: 42600.5. Samples: 3223897380. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:08:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 11:09:00,801][15401] Updated weights for policy 0, policy_version 196770 (0.0032) [2024-06-22 11:09:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3224027136. Throughput: 0: 42569.8. Samples: 3224145800. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 11:09:04,360][15401] Updated weights for policy 0, policy_version 196780 (0.0024) [2024-06-22 11:09:08,306][15401] Updated weights for policy 0, policy_version 196790 (0.0046) [2024-06-22 11:09:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3224207360. Throughput: 0: 42643.0. Samples: 3224287360. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 11:09:09,010][15349] Signal inference workers to stop experience collection... (47550 times) [2024-06-22 11:09:09,010][15349] Signal inference workers to resume experience collection... (47550 times) [2024-06-22 11:09:09,023][15401] InferenceWorker_p0-w0: stopping experience collection (47550 times) [2024-06-22 11:09:09,023][15401] InferenceWorker_p0-w0: resuming experience collection (47550 times) [2024-06-22 11:09:12,027][15401] Updated weights for policy 0, policy_version 196800 (0.0028) [2024-06-22 11:09:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 3224420352. Throughput: 0: 42784.7. Samples: 3224543820. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 11:09:16,084][15401] Updated weights for policy 0, policy_version 196810 (0.0046) [2024-06-22 11:09:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 3224666112. Throughput: 0: 42723.5. Samples: 3224793460. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 11:09:19,540][15401] Updated weights for policy 0, policy_version 196820 (0.0032) [2024-06-22 11:09:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3224846336. Throughput: 0: 42692.9. Samples: 3224931400. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 11:09:23,902][15401] Updated weights for policy 0, policy_version 196830 (0.0040) [2024-06-22 11:09:27,091][15401] Updated weights for policy 0, policy_version 196840 (0.0041) [2024-06-22 11:09:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 3225075712. Throughput: 0: 42737.9. Samples: 3225185600. Policy #0 lag: (min: 3.0, avg: 9.8, max: 20.0) [2024-06-22 11:09:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 11:09:31,506][15401] Updated weights for policy 0, policy_version 196850 (0.0030) [2024-06-22 11:09:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43418.5, 300 sec: 42931.6). Total num frames: 3225321472. Throughput: 0: 42852.1. Samples: 3225438320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:33,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 11:09:34,540][15401] Updated weights for policy 0, policy_version 196860 (0.0039) [2024-06-22 11:09:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 3225485312. Throughput: 0: 42761.8. Samples: 3225571060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 11:09:39,048][15401] Updated weights for policy 0, policy_version 196870 (0.0041) [2024-06-22 11:09:42,190][15401] Updated weights for policy 0, policy_version 196880 (0.0027) [2024-06-22 11:09:43,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3225698304. Throughput: 0: 42934.1. Samples: 3225829420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 11:09:43,579][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196883_3225731072.pth... [2024-06-22 11:09:43,640][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196255_3215441920.pth [2024-06-22 11:09:46,694][15401] Updated weights for policy 0, policy_version 196890 (0.0027) [2024-06-22 11:09:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 3225944064. Throughput: 0: 43025.4. Samples: 3226081940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 11:09:49,934][15401] Updated weights for policy 0, policy_version 196900 (0.0027) [2024-06-22 11:09:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42765.0). Total num frames: 3226107904. Throughput: 0: 42819.0. Samples: 3226214320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:53,393][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 11:09:54,513][15401] Updated weights for policy 0, policy_version 196910 (0.0048) [2024-06-22 11:09:57,853][15401] Updated weights for policy 0, policy_version 196920 (0.0031) [2024-06-22 11:09:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3226353664. Throughput: 0: 42680.5. Samples: 3226464440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:09:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 11:10:02,115][15401] Updated weights for policy 0, policy_version 196930 (0.0043) [2024-06-22 11:10:03,389][15132] Fps is (10 sec: 45886.8, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 3226566656. Throughput: 0: 42891.6. Samples: 3226723580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 11:10:05,454][15401] Updated weights for policy 0, policy_version 196940 (0.0036) [2024-06-22 11:10:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3226746880. Throughput: 0: 42611.0. Samples: 3226848900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:08,390][15132] Avg episode reward: [(0, '0.904')] [2024-06-22 11:10:09,606][15349] Signal inference workers to stop experience collection... (47600 times) [2024-06-22 11:10:09,606][15349] Signal inference workers to resume experience collection... (47600 times) [2024-06-22 11:10:09,610][15401] Updated weights for policy 0, policy_version 196950 (0.0036) [2024-06-22 11:10:09,628][15401] InferenceWorker_p0-w0: stopping experience collection (47600 times) [2024-06-22 11:10:09,628][15401] InferenceWorker_p0-w0: resuming experience collection (47600 times) [2024-06-22 11:10:12,930][15401] Updated weights for policy 0, policy_version 196960 (0.0047) [2024-06-22 11:10:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 3226992640. Throughput: 0: 42827.2. Samples: 3227112820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:13,396][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 11:10:17,172][15401] Updated weights for policy 0, policy_version 196970 (0.0038) [2024-06-22 11:10:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3227205632. Throughput: 0: 42849.6. Samples: 3227366560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:18,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 11:10:20,482][15401] Updated weights for policy 0, policy_version 196980 (0.0039) [2024-06-22 11:10:23,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3227402240. Throughput: 0: 42777.8. Samples: 3227496060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:23,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-22 11:10:25,063][15401] Updated weights for policy 0, policy_version 196990 (0.0040) [2024-06-22 11:10:28,008][15401] Updated weights for policy 0, policy_version 197000 (0.0032) [2024-06-22 11:10:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3227648000. Throughput: 0: 42863.1. Samples: 3227758260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:28,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 11:10:32,679][15401] Updated weights for policy 0, policy_version 197010 (0.0039) [2024-06-22 11:10:33,392][15132] Fps is (10 sec: 42588.4, 60 sec: 41777.5, 300 sec: 42709.5). Total num frames: 3227828224. Throughput: 0: 42875.1. Samples: 3228011420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:10:33,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 11:10:35,963][15401] Updated weights for policy 0, policy_version 197020 (0.0040) [2024-06-22 11:10:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3228041216. Throughput: 0: 42651.7. Samples: 3228133540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:10:38,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 11:10:40,808][15401] Updated weights for policy 0, policy_version 197030 (0.0043) [2024-06-22 11:10:43,392][15132] Fps is (10 sec: 45875.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3228286976. Throughput: 0: 42703.5. Samples: 3228386200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:10:43,392][15132] Avg episode reward: [(0, '0.236')] [2024-06-22 11:10:43,577][15401] Updated weights for policy 0, policy_version 197040 (0.0049) [2024-06-22 11:10:48,381][15401] Updated weights for policy 0, policy_version 197050 (0.0033) [2024-06-22 11:10:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3228467200. Throughput: 0: 42766.2. Samples: 3228648060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:10:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 11:10:51,312][15401] Updated weights for policy 0, policy_version 197060 (0.0035) [2024-06-22 11:10:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3228680192. Throughput: 0: 42641.0. Samples: 3228767740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:10:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 11:10:55,903][15401] Updated weights for policy 0, policy_version 197070 (0.0028) [2024-06-22 11:10:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3228925952. Throughput: 0: 42495.4. Samples: 3229025220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:10:58,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 11:10:58,767][15401] Updated weights for policy 0, policy_version 197080 (0.0029) [2024-06-22 11:11:03,381][15401] Updated weights for policy 0, policy_version 197090 (0.0040) [2024-06-22 11:11:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3229122560. Throughput: 0: 42766.9. Samples: 3229291060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 11:11:06,458][15401] Updated weights for policy 0, policy_version 197100 (0.0022) [2024-06-22 11:11:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 3229335552. Throughput: 0: 42581.0. Samples: 3229412200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 11:11:10,888][15401] Updated weights for policy 0, policy_version 197110 (0.0021) [2024-06-22 11:11:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3229564928. Throughput: 0: 42557.9. Samples: 3229673360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 11:11:14,025][15401] Updated weights for policy 0, policy_version 197120 (0.0046) [2024-06-22 11:11:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3229761536. Throughput: 0: 42783.2. Samples: 3229936560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 11:11:18,446][15401] Updated weights for policy 0, policy_version 197130 (0.0038) [2024-06-22 11:11:21,596][15401] Updated weights for policy 0, policy_version 197140 (0.0047) [2024-06-22 11:11:23,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3229974528. Throughput: 0: 42755.9. Samples: 3230057560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:23,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 11:11:26,171][15401] Updated weights for policy 0, policy_version 197150 (0.0032) [2024-06-22 11:11:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 3230220288. Throughput: 0: 42888.6. Samples: 3230316080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 11:11:29,329][15401] Updated weights for policy 0, policy_version 197160 (0.0031) [2024-06-22 11:11:33,089][15349] Signal inference workers to stop experience collection... (47650 times) [2024-06-22 11:11:33,138][15401] InferenceWorker_p0-w0: stopping experience collection (47650 times) [2024-06-22 11:11:33,203][15349] Signal inference workers to resume experience collection... (47650 times) [2024-06-22 11:11:33,203][15401] InferenceWorker_p0-w0: resuming experience collection (47650 times) [2024-06-22 11:11:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 3230400512. Throughput: 0: 42750.7. Samples: 3230571840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:33,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 11:11:33,980][15401] Updated weights for policy 0, policy_version 197170 (0.0036) [2024-06-22 11:11:37,308][15401] Updated weights for policy 0, policy_version 197180 (0.0021) [2024-06-22 11:11:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3230597120. Throughput: 0: 42708.0. Samples: 3230689600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:11:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 11:11:42,049][15401] Updated weights for policy 0, policy_version 197190 (0.0041) [2024-06-22 11:11:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 3230842880. Throughput: 0: 42858.2. Samples: 3230953740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:11:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 11:11:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197196_3230859264.pth... [2024-06-22 11:11:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196568_3220570112.pth [2024-06-22 11:11:45,090][15401] Updated weights for policy 0, policy_version 197200 (0.0028) [2024-06-22 11:11:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3231023104. Throughput: 0: 42605.6. Samples: 3231208320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:11:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 11:11:49,623][15401] Updated weights for policy 0, policy_version 197210 (0.0049) [2024-06-22 11:11:52,648][15401] Updated weights for policy 0, policy_version 197220 (0.0032) [2024-06-22 11:11:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3231252480. Throughput: 0: 42629.3. Samples: 3231330520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:11:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:11:57,303][15401] Updated weights for policy 0, policy_version 197230 (0.0042) [2024-06-22 11:11:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 3231465472. Throughput: 0: 42646.2. Samples: 3231592440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:11:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 11:12:00,418][15401] Updated weights for policy 0, policy_version 197240 (0.0032) [2024-06-22 11:12:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 3231662080. Throughput: 0: 42399.6. Samples: 3231844540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 11:12:04,898][15401] Updated weights for policy 0, policy_version 197250 (0.0034) [2024-06-22 11:12:08,125][15401] Updated weights for policy 0, policy_version 197260 (0.0031) [2024-06-22 11:12:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3231907840. Throughput: 0: 42378.2. Samples: 3231964680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:08,392][15132] Avg episode reward: [(0, '0.187')] [2024-06-22 11:12:12,984][15401] Updated weights for policy 0, policy_version 197270 (0.0041) [2024-06-22 11:12:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 3232088064. Throughput: 0: 42412.3. Samples: 3232224640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 11:12:15,747][15401] Updated weights for policy 0, policy_version 197280 (0.0023) [2024-06-22 11:12:18,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3232301056. Throughput: 0: 42302.1. Samples: 3232475440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 11:12:20,536][15401] Updated weights for policy 0, policy_version 197290 (0.0033) [2024-06-22 11:12:23,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 3232546816. Throughput: 0: 42541.2. Samples: 3232604060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:23,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 11:12:23,583][15401] Updated weights for policy 0, policy_version 197300 (0.0036) [2024-06-22 11:12:28,166][15401] Updated weights for policy 0, policy_version 197310 (0.0027) [2024-06-22 11:12:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 41777.5, 300 sec: 42709.9). Total num frames: 3232727040. Throughput: 0: 42442.8. Samples: 3232863760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:28,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 11:12:31,300][15401] Updated weights for policy 0, policy_version 197320 (0.0042) [2024-06-22 11:12:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3232956416. Throughput: 0: 42280.1. Samples: 3233110920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 11:12:35,823][15401] Updated weights for policy 0, policy_version 197330 (0.0036) [2024-06-22 11:12:38,392][15132] Fps is (10 sec: 45874.9, 60 sec: 43142.7, 300 sec: 42820.6). Total num frames: 3233185792. Throughput: 0: 42568.3. Samples: 3233246200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:38,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 11:12:38,860][15401] Updated weights for policy 0, policy_version 197340 (0.0023) [2024-06-22 11:12:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.4, 300 sec: 42598.4). Total num frames: 3233349632. Throughput: 0: 42341.4. Samples: 3233497800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 11:12:43,901][15401] Updated weights for policy 0, policy_version 197350 (0.0038) [2024-06-22 11:12:45,509][15349] Signal inference workers to stop experience collection... (47700 times) [2024-06-22 11:12:45,510][15349] Signal inference workers to resume experience collection... (47700 times) [2024-06-22 11:12:45,534][15401] InferenceWorker_p0-w0: stopping experience collection (47700 times) [2024-06-22 11:12:45,534][15401] InferenceWorker_p0-w0: resuming experience collection (47700 times) [2024-06-22 11:12:47,020][15401] Updated weights for policy 0, policy_version 197360 (0.0035) [2024-06-22 11:12:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3233595392. Throughput: 0: 42338.1. Samples: 3233749760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 11:12:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 11:12:51,524][15401] Updated weights for policy 0, policy_version 197370 (0.0033) [2024-06-22 11:12:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3233808384. Throughput: 0: 42656.0. Samples: 3233884100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:12:53,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 11:12:54,792][15401] Updated weights for policy 0, policy_version 197380 (0.0037) [2024-06-22 11:12:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 3233988608. Throughput: 0: 42386.8. Samples: 3234132040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:12:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 11:12:59,586][15401] Updated weights for policy 0, policy_version 197390 (0.0030) [2024-06-22 11:13:02,519][15401] Updated weights for policy 0, policy_version 197400 (0.0033) [2024-06-22 11:13:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3234234368. Throughput: 0: 42429.8. Samples: 3234384780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 11:13:07,471][15401] Updated weights for policy 0, policy_version 197410 (0.0036) [2024-06-22 11:13:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42053.9, 300 sec: 42820.5). Total num frames: 3234430976. Throughput: 0: 42507.6. Samples: 3234516800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 11:13:10,100][15401] Updated weights for policy 0, policy_version 197420 (0.0042) [2024-06-22 11:13:13,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 3234611200. Throughput: 0: 42155.5. Samples: 3234760660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 11:13:15,062][15401] Updated weights for policy 0, policy_version 197430 (0.0032) [2024-06-22 11:13:17,930][15401] Updated weights for policy 0, policy_version 197440 (0.0038) [2024-06-22 11:13:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3234873344. Throughput: 0: 42237.2. Samples: 3235011600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 11:13:22,632][15401] Updated weights for policy 0, policy_version 197450 (0.0029) [2024-06-22 11:13:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41781.0, 300 sec: 42709.5). Total num frames: 3235053568. Throughput: 0: 42281.5. Samples: 3235148760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 11:13:25,789][15401] Updated weights for policy 0, policy_version 197460 (0.0050) [2024-06-22 11:13:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.0, 300 sec: 42543.0). Total num frames: 3235266560. Throughput: 0: 42262.5. Samples: 3235399620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 11:13:30,174][15401] Updated weights for policy 0, policy_version 197470 (0.0036) [2024-06-22 11:13:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3235495936. Throughput: 0: 42242.7. Samples: 3235650680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 11:13:33,399][15401] Updated weights for policy 0, policy_version 197480 (0.0038) [2024-06-22 11:13:37,836][15401] Updated weights for policy 0, policy_version 197490 (0.0027) [2024-06-22 11:13:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 3235692544. Throughput: 0: 42289.8. Samples: 3235787140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:38,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 11:13:41,122][15401] Updated weights for policy 0, policy_version 197500 (0.0032) [2024-06-22 11:13:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 3235921920. Throughput: 0: 42358.7. Samples: 3236038180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 11:13:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197505_3235921920.pth... [2024-06-22 11:13:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000196883_3225731072.pth [2024-06-22 11:13:45,455][15401] Updated weights for policy 0, policy_version 197510 (0.0033) [2024-06-22 11:13:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 3236134912. Throughput: 0: 42425.7. Samples: 3236294040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:48,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 11:13:48,747][15401] Updated weights for policy 0, policy_version 197520 (0.0030) [2024-06-22 11:13:51,576][15349] Signal inference workers to stop experience collection... (47750 times) [2024-06-22 11:13:51,576][15349] Signal inference workers to resume experience collection... (47750 times) [2024-06-22 11:13:51,626][15401] InferenceWorker_p0-w0: stopping experience collection (47750 times) [2024-06-22 11:13:51,626][15401] InferenceWorker_p0-w0: resuming experience collection (47750 times) [2024-06-22 11:13:52,994][15401] Updated weights for policy 0, policy_version 197530 (0.0044) [2024-06-22 11:13:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3236331520. Throughput: 0: 42363.2. Samples: 3236423140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 11:13:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 11:13:56,695][15401] Updated weights for policy 0, policy_version 197540 (0.0036) [2024-06-22 11:13:58,392][15132] Fps is (10 sec: 40960.3, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 3236544512. Throughput: 0: 42669.8. Samples: 3236680900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:13:58,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 11:14:00,552][15401] Updated weights for policy 0, policy_version 197550 (0.0027) [2024-06-22 11:14:03,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 3236790272. Throughput: 0: 42840.3. Samples: 3236939420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 11:14:04,484][15401] Updated weights for policy 0, policy_version 197560 (0.0037) [2024-06-22 11:14:08,061][15401] Updated weights for policy 0, policy_version 197570 (0.0031) [2024-06-22 11:14:08,392][15132] Fps is (10 sec: 44237.9, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 3236986880. Throughput: 0: 42625.0. Samples: 3237066980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:08,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 11:14:12,252][15401] Updated weights for policy 0, policy_version 197580 (0.0027) [2024-06-22 11:14:13,392][15132] Fps is (10 sec: 40952.2, 60 sec: 43143.0, 300 sec: 42487.0). Total num frames: 3237199872. Throughput: 0: 42627.4. Samples: 3237317940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:13,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 11:14:15,662][15401] Updated weights for policy 0, policy_version 197590 (0.0027) [2024-06-22 11:14:18,392][15132] Fps is (10 sec: 42597.2, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 3237412864. Throughput: 0: 42758.1. Samples: 3237574900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:18,392][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 11:14:19,826][15401] Updated weights for policy 0, policy_version 197600 (0.0032) [2024-06-22 11:14:23,357][15401] Updated weights for policy 0, policy_version 197610 (0.0028) [2024-06-22 11:14:23,390][15132] Fps is (10 sec: 44246.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3237642240. Throughput: 0: 42640.9. Samples: 3237705980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:23,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 11:14:27,319][15401] Updated weights for policy 0, policy_version 197620 (0.0029) [2024-06-22 11:14:28,396][15132] Fps is (10 sec: 44218.9, 60 sec: 43139.9, 300 sec: 42486.4). Total num frames: 3237855232. Throughput: 0: 42844.5. Samples: 3237966460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:28,396][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 11:14:30,928][15401] Updated weights for policy 0, policy_version 197630 (0.0035) [2024-06-22 11:14:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3238051840. Throughput: 0: 42782.4. Samples: 3238219140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 11:14:35,424][15401] Updated weights for policy 0, policy_version 197640 (0.0029) [2024-06-22 11:14:38,389][15132] Fps is (10 sec: 42626.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3238281216. Throughput: 0: 42650.6. Samples: 3238342420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 11:14:38,441][15401] Updated weights for policy 0, policy_version 197650 (0.0039) [2024-06-22 11:14:43,024][15401] Updated weights for policy 0, policy_version 197660 (0.0037) [2024-06-22 11:14:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 3238477824. Throughput: 0: 42869.9. Samples: 3238609940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 11:14:46,339][15401] Updated weights for policy 0, policy_version 197670 (0.0032) [2024-06-22 11:14:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 3238707200. Throughput: 0: 42614.0. Samples: 3238857040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:48,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-22 11:14:50,615][15401] Updated weights for policy 0, policy_version 197680 (0.0033) [2024-06-22 11:14:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3238920192. Throughput: 0: 42707.3. Samples: 3238988720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 11:14:54,048][15401] Updated weights for policy 0, policy_version 197690 (0.0038) [2024-06-22 11:14:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 3239100416. Throughput: 0: 42696.7. Samples: 3239239200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:14:58,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 11:14:58,455][15401] Updated weights for policy 0, policy_version 197700 (0.0031) [2024-06-22 11:15:02,227][15401] Updated weights for policy 0, policy_version 197710 (0.0035) [2024-06-22 11:15:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.5, 300 sec: 42598.4). Total num frames: 3239313408. Throughput: 0: 42750.8. Samples: 3239498580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 11:15:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 11:15:06,086][15401] Updated weights for policy 0, policy_version 197720 (0.0032) [2024-06-22 11:15:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 3239559168. Throughput: 0: 42619.1. Samples: 3239623840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 11:15:09,568][15401] Updated weights for policy 0, policy_version 197730 (0.0043) [2024-06-22 11:15:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.9, 300 sec: 42542.9). Total num frames: 3239755776. Throughput: 0: 42592.3. Samples: 3239882840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 11:15:13,566][15401] Updated weights for policy 0, policy_version 197740 (0.0028) [2024-06-22 11:15:17,195][15401] Updated weights for policy 0, policy_version 197750 (0.0038) [2024-06-22 11:15:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 3239968768. Throughput: 0: 42563.5. Samples: 3240134500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 11:15:21,438][15401] Updated weights for policy 0, policy_version 197760 (0.0026) [2024-06-22 11:15:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 3240181760. Throughput: 0: 42734.1. Samples: 3240265460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 11:15:24,910][15401] Updated weights for policy 0, policy_version 197770 (0.0030) [2024-06-22 11:15:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42055.1, 300 sec: 42542.9). Total num frames: 3240378368. Throughput: 0: 42400.8. Samples: 3240518080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:28,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 11:15:29,126][15401] Updated weights for policy 0, policy_version 197780 (0.0028) [2024-06-22 11:15:29,128][15349] Signal inference workers to stop experience collection... (47800 times) [2024-06-22 11:15:29,129][15349] Signal inference workers to resume experience collection... (47800 times) [2024-06-22 11:15:29,149][15401] InferenceWorker_p0-w0: stopping experience collection (47800 times) [2024-06-22 11:15:29,150][15401] InferenceWorker_p0-w0: resuming experience collection (47800 times) [2024-06-22 11:15:32,699][15401] Updated weights for policy 0, policy_version 197790 (0.0027) [2024-06-22 11:15:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 3240624128. Throughput: 0: 42576.8. Samples: 3240773000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 11:15:36,640][15401] Updated weights for policy 0, policy_version 197800 (0.0030) [2024-06-22 11:15:38,389][15132] Fps is (10 sec: 45886.2, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 3240837120. Throughput: 0: 42649.4. Samples: 3240907940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 11:15:40,312][15401] Updated weights for policy 0, policy_version 197810 (0.0041) [2024-06-22 11:15:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 3241033728. Throughput: 0: 42787.4. Samples: 3241164640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 11:15:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197817_3241033728.pth... [2024-06-22 11:15:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197196_3230859264.pth [2024-06-22 11:15:44,165][15401] Updated weights for policy 0, policy_version 197820 (0.0029) [2024-06-22 11:15:47,878][15401] Updated weights for policy 0, policy_version 197830 (0.0040) [2024-06-22 11:15:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3241263104. Throughput: 0: 42544.3. Samples: 3241413080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 11:15:51,902][15401] Updated weights for policy 0, policy_version 197840 (0.0025) [2024-06-22 11:15:53,391][15132] Fps is (10 sec: 42594.6, 60 sec: 42324.6, 300 sec: 42487.5). Total num frames: 3241459712. Throughput: 0: 42641.2. Samples: 3241542740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:53,391][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 11:15:55,472][15401] Updated weights for policy 0, policy_version 197850 (0.0034) [2024-06-22 11:15:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 3241672704. Throughput: 0: 42629.4. Samples: 3241801160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:15:58,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 11:15:59,566][15401] Updated weights for policy 0, policy_version 197860 (0.0036) [2024-06-22 11:16:03,030][15401] Updated weights for policy 0, policy_version 197870 (0.0029) [2024-06-22 11:16:03,394][15132] Fps is (10 sec: 44222.1, 60 sec: 43141.3, 300 sec: 42597.8). Total num frames: 3241902080. Throughput: 0: 42513.7. Samples: 3242047800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:16:03,394][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 11:16:07,158][15401] Updated weights for policy 0, policy_version 197880 (0.0026) [2024-06-22 11:16:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 3242115072. Throughput: 0: 42664.1. Samples: 3242185340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 11:16:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 11:16:10,535][15401] Updated weights for policy 0, policy_version 197890 (0.0037) [2024-06-22 11:16:13,390][15132] Fps is (10 sec: 42616.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3242328064. Throughput: 0: 42794.6. Samples: 3242443740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:13,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 11:16:14,728][15401] Updated weights for policy 0, policy_version 197900 (0.0036) [2024-06-22 11:16:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3242541056. Throughput: 0: 42768.1. Samples: 3242697560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 11:16:18,447][15401] Updated weights for policy 0, policy_version 197910 (0.0038) [2024-06-22 11:16:22,286][15401] Updated weights for policy 0, policy_version 197920 (0.0037) [2024-06-22 11:16:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 3242737664. Throughput: 0: 42667.6. Samples: 3242827980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 11:16:26,077][15401] Updated weights for policy 0, policy_version 197930 (0.0032) [2024-06-22 11:16:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 3242950656. Throughput: 0: 42734.9. Samples: 3243087700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:28,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 11:16:29,785][15401] Updated weights for policy 0, policy_version 197940 (0.0027) [2024-06-22 11:16:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3243196416. Throughput: 0: 42849.3. Samples: 3243341300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:33,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 11:16:33,619][15401] Updated weights for policy 0, policy_version 197950 (0.0040) [2024-06-22 11:16:37,343][15401] Updated weights for policy 0, policy_version 197960 (0.0044) [2024-06-22 11:16:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3243393024. Throughput: 0: 42985.0. Samples: 3243477020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:38,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 11:16:41,574][15401] Updated weights for policy 0, policy_version 197970 (0.0038) [2024-06-22 11:16:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3243589632. Throughput: 0: 42825.8. Samples: 3243728320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 11:16:45,073][15401] Updated weights for policy 0, policy_version 197980 (0.0047) [2024-06-22 11:16:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 3243802624. Throughput: 0: 42992.1. Samples: 3243982260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 11:16:49,230][15401] Updated weights for policy 0, policy_version 197990 (0.0036) [2024-06-22 11:16:51,210][15349] Signal inference workers to stop experience collection... (47850 times) [2024-06-22 11:16:51,240][15401] InferenceWorker_p0-w0: stopping experience collection (47850 times) [2024-06-22 11:16:51,273][15349] Signal inference workers to resume experience collection... (47850 times) [2024-06-22 11:16:51,274][15401] InferenceWorker_p0-w0: resuming experience collection (47850 times) [2024-06-22 11:16:53,011][15401] Updated weights for policy 0, policy_version 198000 (0.0030) [2024-06-22 11:16:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42872.3, 300 sec: 42598.4). Total num frames: 3244032000. Throughput: 0: 42808.0. Samples: 3244111700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 11:16:57,146][15401] Updated weights for policy 0, policy_version 198010 (0.0031) [2024-06-22 11:16:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3244228608. Throughput: 0: 42624.0. Samples: 3244361820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:16:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 11:17:00,650][15401] Updated weights for policy 0, policy_version 198020 (0.0032) [2024-06-22 11:17:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.6, 300 sec: 42543.2). Total num frames: 3244457984. Throughput: 0: 42633.8. Samples: 3244616080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:17:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 11:17:04,747][15401] Updated weights for policy 0, policy_version 198030 (0.0027) [2024-06-22 11:17:08,101][15401] Updated weights for policy 0, policy_version 198040 (0.0041) [2024-06-22 11:17:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3244687360. Throughput: 0: 42736.9. Samples: 3244751140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:17:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 11:17:12,229][15401] Updated weights for policy 0, policy_version 198050 (0.0044) [2024-06-22 11:17:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3244867584. Throughput: 0: 42709.3. Samples: 3245009620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:17:13,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 11:17:15,543][15401] Updated weights for policy 0, policy_version 198060 (0.0027) [2024-06-22 11:17:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 3245113344. Throughput: 0: 42806.7. Samples: 3245267600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 11:17:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 11:17:19,872][15401] Updated weights for policy 0, policy_version 198070 (0.0046) [2024-06-22 11:17:23,061][15401] Updated weights for policy 0, policy_version 198080 (0.0038) [2024-06-22 11:17:23,391][15132] Fps is (10 sec: 47508.5, 60 sec: 43416.8, 300 sec: 42765.2). Total num frames: 3245342720. Throughput: 0: 42846.6. Samples: 3245405160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:23,391][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 11:17:27,341][15401] Updated weights for policy 0, policy_version 198090 (0.0031) [2024-06-22 11:17:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3245522944. Throughput: 0: 42890.2. Samples: 3245658380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 11:17:31,085][15401] Updated weights for policy 0, policy_version 198100 (0.0041) [2024-06-22 11:17:33,390][15132] Fps is (10 sec: 40964.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 3245752320. Throughput: 0: 42876.0. Samples: 3245911680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 11:17:35,342][15401] Updated weights for policy 0, policy_version 198110 (0.0038) [2024-06-22 11:17:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 3245965312. Throughput: 0: 42887.4. Samples: 3246041740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:38,393][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 11:17:39,058][15401] Updated weights for policy 0, policy_version 198120 (0.0031) [2024-06-22 11:17:42,867][15401] Updated weights for policy 0, policy_version 198130 (0.0037) [2024-06-22 11:17:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3246178304. Throughput: 0: 43116.9. Samples: 3246302080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 11:17:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198131_3246178304.pth... [2024-06-22 11:17:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197505_3235921920.pth [2024-06-22 11:17:46,953][15401] Updated weights for policy 0, policy_version 198140 (0.0033) [2024-06-22 11:17:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 3246407680. Throughput: 0: 43033.8. Samples: 3246552600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 11:17:50,301][15401] Updated weights for policy 0, policy_version 198150 (0.0029) [2024-06-22 11:17:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3246604288. Throughput: 0: 42992.0. Samples: 3246685780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 11:17:54,367][15401] Updated weights for policy 0, policy_version 198160 (0.0036) [2024-06-22 11:17:58,159][15401] Updated weights for policy 0, policy_version 198170 (0.0033) [2024-06-22 11:17:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 3246833664. Throughput: 0: 42971.4. Samples: 3246943440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:17:58,393][15132] Avg episode reward: [(0, '0.117')] [2024-06-22 11:18:02,207][15401] Updated weights for policy 0, policy_version 198180 (0.0030) [2024-06-22 11:18:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3247030272. Throughput: 0: 42940.9. Samples: 3247199940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:18:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 11:18:05,802][15401] Updated weights for policy 0, policy_version 198190 (0.0032) [2024-06-22 11:18:08,391][15132] Fps is (10 sec: 42603.3, 60 sec: 42870.5, 300 sec: 42875.9). Total num frames: 3247259648. Throughput: 0: 42729.5. Samples: 3247328000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:18:08,391][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 11:18:09,669][15401] Updated weights for policy 0, policy_version 198200 (0.0039) [2024-06-22 11:18:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 3247456256. Throughput: 0: 42839.6. Samples: 3247586260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:18:13,393][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 11:18:13,557][15401] Updated weights for policy 0, policy_version 198210 (0.0041) [2024-06-22 11:18:17,377][15401] Updated weights for policy 0, policy_version 198220 (0.0040) [2024-06-22 11:18:18,250][15349] Signal inference workers to stop experience collection... (47900 times) [2024-06-22 11:18:18,276][15401] InferenceWorker_p0-w0: stopping experience collection (47900 times) [2024-06-22 11:18:18,310][15349] Signal inference workers to resume experience collection... (47900 times) [2024-06-22 11:18:18,311][15401] InferenceWorker_p0-w0: resuming experience collection (47900 times) [2024-06-22 11:18:18,393][15132] Fps is (10 sec: 40952.8, 60 sec: 42596.3, 300 sec: 42764.6). Total num frames: 3247669248. Throughput: 0: 42962.0. Samples: 3247845100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:18:18,393][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 11:18:21,346][15401] Updated weights for policy 0, policy_version 198230 (0.0032) [2024-06-22 11:18:23,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42599.0, 300 sec: 42820.5). Total num frames: 3247898624. Throughput: 0: 42886.2. Samples: 3247971520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:18:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 11:18:24,938][15401] Updated weights for policy 0, policy_version 198240 (0.0035) [2024-06-22 11:18:28,389][15132] Fps is (10 sec: 42611.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3248095232. Throughput: 0: 42699.3. Samples: 3248223540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 11:18:28,947][15401] Updated weights for policy 0, policy_version 198250 (0.0041) [2024-06-22 11:18:32,542][15401] Updated weights for policy 0, policy_version 198260 (0.0046) [2024-06-22 11:18:33,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3248291840. Throughput: 0: 42870.1. Samples: 3248481760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 11:18:36,455][15401] Updated weights for policy 0, policy_version 198270 (0.0027) [2024-06-22 11:18:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 3248553984. Throughput: 0: 42813.8. Samples: 3248612400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 11:18:40,207][15401] Updated weights for policy 0, policy_version 198280 (0.0033) [2024-06-22 11:18:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3248734208. Throughput: 0: 42850.8. Samples: 3248871620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:43,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 11:18:44,270][15401] Updated weights for policy 0, policy_version 198290 (0.0031) [2024-06-22 11:18:47,704][15401] Updated weights for policy 0, policy_version 198300 (0.0038) [2024-06-22 11:18:48,391][15132] Fps is (10 sec: 39316.4, 60 sec: 42324.4, 300 sec: 42764.8). Total num frames: 3248947200. Throughput: 0: 42750.0. Samples: 3249123740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:48,391][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 11:18:51,974][15401] Updated weights for policy 0, policy_version 198310 (0.0031) [2024-06-22 11:18:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 3249192960. Throughput: 0: 42791.4. Samples: 3249253560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 11:18:55,218][15401] Updated weights for policy 0, policy_version 198320 (0.0040) [2024-06-22 11:18:58,389][15132] Fps is (10 sec: 42603.7, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 3249373184. Throughput: 0: 42803.2. Samples: 3249512300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:18:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 11:18:59,516][15401] Updated weights for policy 0, policy_version 198330 (0.0051) [2024-06-22 11:19:02,743][15401] Updated weights for policy 0, policy_version 198340 (0.0034) [2024-06-22 11:19:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 3249602560. Throughput: 0: 42708.2. Samples: 3249766840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 11:19:07,090][15401] Updated weights for policy 0, policy_version 198350 (0.0031) [2024-06-22 11:19:08,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43143.7, 300 sec: 42876.0). Total num frames: 3249848320. Throughput: 0: 42906.2. Samples: 3249902400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:08,401][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 11:19:10,276][15401] Updated weights for policy 0, policy_version 198360 (0.0045) [2024-06-22 11:19:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 3250012160. Throughput: 0: 42990.5. Samples: 3250158120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 11:19:14,583][15401] Updated weights for policy 0, policy_version 198370 (0.0024) [2024-06-22 11:19:18,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42873.5, 300 sec: 42709.5). Total num frames: 3250241536. Throughput: 0: 42892.8. Samples: 3250411940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:18,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 11:19:18,521][15401] Updated weights for policy 0, policy_version 198380 (0.0037) [2024-06-22 11:19:22,338][15401] Updated weights for policy 0, policy_version 198390 (0.0035) [2024-06-22 11:19:23,392][15132] Fps is (10 sec: 49140.4, 60 sec: 43416.0, 300 sec: 42876.7). Total num frames: 3250503680. Throughput: 0: 42939.0. Samples: 3250544760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:23,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 11:19:26,231][15401] Updated weights for policy 0, policy_version 198400 (0.0029) [2024-06-22 11:19:28,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3250651136. Throughput: 0: 43033.0. Samples: 3250808100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 11:19:28,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 11:19:29,827][15401] Updated weights for policy 0, policy_version 198410 (0.0034) [2024-06-22 11:19:33,392][15132] Fps is (10 sec: 39321.4, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 3250896896. Throughput: 0: 43031.7. Samples: 3251060220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:33,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 11:19:34,390][15401] Updated weights for policy 0, policy_version 198420 (0.0043) [2024-06-22 11:19:37,402][15401] Updated weights for policy 0, policy_version 198430 (0.0032) [2024-06-22 11:19:38,390][15132] Fps is (10 sec: 50789.3, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 3251159040. Throughput: 0: 43107.5. Samples: 3251193400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 11:19:41,975][15401] Updated weights for policy 0, policy_version 198440 (0.0035) [2024-06-22 11:19:43,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3251290112. Throughput: 0: 43048.5. Samples: 3251449480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 11:19:43,455][15349] Signal inference workers to stop experience collection... (47950 times) [2024-06-22 11:19:43,496][15401] InferenceWorker_p0-w0: stopping experience collection (47950 times) [2024-06-22 11:19:43,507][15349] Signal inference workers to resume experience collection... (47950 times) [2024-06-22 11:19:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198444_3251306496.pth... [2024-06-22 11:19:43,511][15401] InferenceWorker_p0-w0: resuming experience collection (47950 times) [2024-06-22 11:19:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000197817_3241033728.pth [2024-06-22 11:19:45,073][15401] Updated weights for policy 0, policy_version 198450 (0.0032) [2024-06-22 11:19:48,389][15132] Fps is (10 sec: 37683.7, 60 sec: 43145.5, 300 sec: 42765.0). Total num frames: 3251535872. Throughput: 0: 42948.5. Samples: 3251699520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:48,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 11:19:49,567][15401] Updated weights for policy 0, policy_version 198460 (0.0036) [2024-06-22 11:19:52,629][15401] Updated weights for policy 0, policy_version 198470 (0.0045) [2024-06-22 11:19:53,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 3251748864. Throughput: 0: 42958.2. Samples: 3251835520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:53,393][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 11:19:57,458][15401] Updated weights for policy 0, policy_version 198480 (0.0032) [2024-06-22 11:19:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3251945472. Throughput: 0: 42961.4. Samples: 3252091380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:19:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 11:20:00,619][15401] Updated weights for policy 0, policy_version 198490 (0.0029) [2024-06-22 11:20:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3252191232. Throughput: 0: 42817.0. Samples: 3252338700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 11:20:05,179][15401] Updated weights for policy 0, policy_version 198500 (0.0033) [2024-06-22 11:20:08,069][15401] Updated weights for policy 0, policy_version 198510 (0.0038) [2024-06-22 11:20:08,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 3252404224. Throughput: 0: 42997.7. Samples: 3252479660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:08,393][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 11:20:12,708][15401] Updated weights for policy 0, policy_version 198520 (0.0051) [2024-06-22 11:20:13,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3252568064. Throughput: 0: 42904.0. Samples: 3252738780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 11:20:15,618][15401] Updated weights for policy 0, policy_version 198530 (0.0027) [2024-06-22 11:20:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3252846592. Throughput: 0: 42704.0. Samples: 3252981800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 11:20:20,243][15401] Updated weights for policy 0, policy_version 198540 (0.0032) [2024-06-22 11:20:23,141][15401] Updated weights for policy 0, policy_version 198550 (0.0046) [2024-06-22 11:20:23,390][15132] Fps is (10 sec: 49150.8, 60 sec: 42600.0, 300 sec: 42987.5). Total num frames: 3253059584. Throughput: 0: 42780.4. Samples: 3253118520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 11:20:27,699][15401] Updated weights for policy 0, policy_version 198560 (0.0030) [2024-06-22 11:20:28,392][15132] Fps is (10 sec: 39312.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 3253239808. Throughput: 0: 42825.2. Samples: 3253376720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:28,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 11:20:30,677][15401] Updated weights for policy 0, policy_version 198570 (0.0034) [2024-06-22 11:20:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3253485568. Throughput: 0: 42722.1. Samples: 3253622020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 11:20:35,240][15401] Updated weights for policy 0, policy_version 198580 (0.0037) [2024-06-22 11:20:38,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42050.6, 300 sec: 42875.8). Total num frames: 3253682176. Throughput: 0: 42796.9. Samples: 3253761380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 11:20:38,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 11:20:38,559][15401] Updated weights for policy 0, policy_version 198590 (0.0038) [2024-06-22 11:20:42,717][15401] Updated weights for policy 0, policy_version 198600 (0.0027) [2024-06-22 11:20:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3253878784. Throughput: 0: 42848.0. Samples: 3254019540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:20:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 11:20:46,035][15401] Updated weights for policy 0, policy_version 198610 (0.0040) [2024-06-22 11:20:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 43144.4, 300 sec: 42931.8). Total num frames: 3254124544. Throughput: 0: 42979.8. Samples: 3254272800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:20:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 11:20:50,424][15401] Updated weights for policy 0, policy_version 198620 (0.0030) [2024-06-22 11:20:52,362][15349] Signal inference workers to stop experience collection... (48000 times) [2024-06-22 11:20:52,404][15401] InferenceWorker_p0-w0: stopping experience collection (48000 times) [2024-06-22 11:20:52,416][15349] Signal inference workers to resume experience collection... (48000 times) [2024-06-22 11:20:52,417][15401] InferenceWorker_p0-w0: resuming experience collection (48000 times) [2024-06-22 11:20:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 3254321152. Throughput: 0: 42868.0. Samples: 3254408620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:20:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 11:20:53,664][15401] Updated weights for policy 0, policy_version 198630 (0.0036) [2024-06-22 11:20:57,856][15401] Updated weights for policy 0, policy_version 198640 (0.0028) [2024-06-22 11:20:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 3254517760. Throughput: 0: 42694.5. Samples: 3254660040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:20:58,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 11:21:01,139][15401] Updated weights for policy 0, policy_version 198650 (0.0023) [2024-06-22 11:21:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3254763520. Throughput: 0: 42983.5. Samples: 3254916060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 11:21:05,381][15401] Updated weights for policy 0, policy_version 198660 (0.0028) [2024-06-22 11:21:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3254976512. Throughput: 0: 43001.5. Samples: 3255053580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 11:21:08,708][15401] Updated weights for policy 0, policy_version 198670 (0.0021) [2024-06-22 11:21:13,045][15401] Updated weights for policy 0, policy_version 198680 (0.0034) [2024-06-22 11:21:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 3255173120. Throughput: 0: 42951.1. Samples: 3255309420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:13,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 11:21:16,147][15401] Updated weights for policy 0, policy_version 198690 (0.0036) [2024-06-22 11:21:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3255418880. Throughput: 0: 43104.8. Samples: 3255561740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 11:21:20,868][15401] Updated weights for policy 0, policy_version 198700 (0.0046) [2024-06-22 11:21:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42987.1). Total num frames: 3255631872. Throughput: 0: 43206.2. Samples: 3255705560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 11:21:23,818][15401] Updated weights for policy 0, policy_version 198710 (0.0029) [2024-06-22 11:21:28,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 3255812096. Throughput: 0: 43083.1. Samples: 3255958380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:28,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 11:21:28,447][15401] Updated weights for policy 0, policy_version 198720 (0.0033) [2024-06-22 11:21:31,376][15401] Updated weights for policy 0, policy_version 198730 (0.0025) [2024-06-22 11:21:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3256074240. Throughput: 0: 43238.8. Samples: 3256218540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 11:21:35,969][15401] Updated weights for policy 0, policy_version 198740 (0.0053) [2024-06-22 11:21:38,390][15132] Fps is (10 sec: 47524.4, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 3256287232. Throughput: 0: 43268.9. Samples: 3256355720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 11:21:38,983][15401] Updated weights for policy 0, policy_version 198750 (0.0029) [2024-06-22 11:21:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3256467456. Throughput: 0: 43277.8. Samples: 3256607540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-22 11:21:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 11:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198759_3256467456.pth... [2024-06-22 11:21:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198131_3246178304.pth [2024-06-22 11:21:43,699][15401] Updated weights for policy 0, policy_version 198760 (0.0038) [2024-06-22 11:21:46,571][15401] Updated weights for policy 0, policy_version 198770 (0.0041) [2024-06-22 11:21:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 3256729600. Throughput: 0: 43177.9. Samples: 3256859060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:21:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 11:21:51,317][15401] Updated weights for policy 0, policy_version 198780 (0.0027) [2024-06-22 11:21:53,083][15349] Signal inference workers to stop experience collection... (48050 times) [2024-06-22 11:21:53,126][15401] InferenceWorker_p0-w0: stopping experience collection (48050 times) [2024-06-22 11:21:53,134][15349] Signal inference workers to resume experience collection... (48050 times) [2024-06-22 11:21:53,145][15401] InferenceWorker_p0-w0: resuming experience collection (48050 times) [2024-06-22 11:21:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3256909824. Throughput: 0: 43231.6. Samples: 3256999000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:21:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 11:21:53,993][15401] Updated weights for policy 0, policy_version 198790 (0.0037) [2024-06-22 11:21:58,392][15132] Fps is (10 sec: 39312.6, 60 sec: 43416.0, 300 sec: 42931.3). Total num frames: 3257122816. Throughput: 0: 43110.3. Samples: 3257249480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:21:58,392][15132] Avg episode reward: [(0, '0.230')] [2024-06-22 11:21:58,874][15401] Updated weights for policy 0, policy_version 198800 (0.0025) [2024-06-22 11:22:01,717][15401] Updated weights for policy 0, policy_version 198810 (0.0029) [2024-06-22 11:22:03,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 3257384960. Throughput: 0: 43004.5. Samples: 3257496940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 11:22:06,392][15401] Updated weights for policy 0, policy_version 198820 (0.0028) [2024-06-22 11:22:08,389][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3257532416. Throughput: 0: 42875.7. Samples: 3257634960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 11:22:09,368][15401] Updated weights for policy 0, policy_version 198830 (0.0027) [2024-06-22 11:22:13,391][15132] Fps is (10 sec: 37676.4, 60 sec: 43143.2, 300 sec: 42875.8). Total num frames: 3257761792. Throughput: 0: 42892.5. Samples: 3257888520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:13,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 11:22:14,609][15401] Updated weights for policy 0, policy_version 198840 (0.0035) [2024-06-22 11:22:17,437][15401] Updated weights for policy 0, policy_version 198850 (0.0033) [2024-06-22 11:22:18,392][15132] Fps is (10 sec: 47501.6, 60 sec: 43142.8, 300 sec: 42931.4). Total num frames: 3258007552. Throughput: 0: 42611.0. Samples: 3258136140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:18,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 11:22:22,120][15401] Updated weights for policy 0, policy_version 198860 (0.0036) [2024-06-22 11:22:23,389][15132] Fps is (10 sec: 40967.9, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 3258171392. Throughput: 0: 42710.8. Samples: 3258277700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 11:22:24,982][15401] Updated weights for policy 0, policy_version 198870 (0.0027) [2024-06-22 11:22:28,390][15132] Fps is (10 sec: 37692.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 3258384384. Throughput: 0: 42546.2. Samples: 3258522120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 11:22:29,606][15401] Updated weights for policy 0, policy_version 198880 (0.0034) [2024-06-22 11:22:32,823][15401] Updated weights for policy 0, policy_version 198890 (0.0028) [2024-06-22 11:22:33,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 3258646528. Throughput: 0: 42672.0. Samples: 3258779300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:22:37,431][15401] Updated weights for policy 0, policy_version 198900 (0.0040) [2024-06-22 11:22:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 3258793984. Throughput: 0: 42575.0. Samples: 3258914880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 11:22:40,353][15401] Updated weights for policy 0, policy_version 198910 (0.0037) [2024-06-22 11:22:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3259039744. Throughput: 0: 42462.5. Samples: 3259160200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 11:22:44,915][15401] Updated weights for policy 0, policy_version 198920 (0.0042) [2024-06-22 11:22:48,149][15401] Updated weights for policy 0, policy_version 198930 (0.0030) [2024-06-22 11:22:48,390][15132] Fps is (10 sec: 49152.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3259285504. Throughput: 0: 42712.9. Samples: 3259419020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 11:22:52,762][15401] Updated weights for policy 0, policy_version 198940 (0.0036) [2024-06-22 11:22:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 3259449344. Throughput: 0: 42613.3. Samples: 3259552560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 11:22:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 11:22:55,682][15401] Updated weights for policy 0, policy_version 198950 (0.0039) [2024-06-22 11:22:56,544][15349] Signal inference workers to stop experience collection... (48100 times) [2024-06-22 11:22:56,545][15349] Signal inference workers to resume experience collection... (48100 times) [2024-06-22 11:22:56,556][15401] InferenceWorker_p0-w0: stopping experience collection (48100 times) [2024-06-22 11:22:56,557][15401] InferenceWorker_p0-w0: resuming experience collection (48100 times) [2024-06-22 11:22:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 3259678720. Throughput: 0: 42469.8. Samples: 3259799580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:22:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 11:23:00,542][15401] Updated weights for policy 0, policy_version 198960 (0.0028) [2024-06-22 11:23:03,275][15401] Updated weights for policy 0, policy_version 198970 (0.0026) [2024-06-22 11:23:03,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.3, 300 sec: 42931.8). Total num frames: 3259924480. Throughput: 0: 42797.5. Samples: 3260061920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:03,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 11:23:08,327][15401] Updated weights for policy 0, policy_version 198980 (0.0031) [2024-06-22 11:23:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3260088320. Throughput: 0: 42515.6. Samples: 3260190900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 11:23:11,141][15401] Updated weights for policy 0, policy_version 198990 (0.0037) [2024-06-22 11:23:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42872.8, 300 sec: 42932.1). Total num frames: 3260334080. Throughput: 0: 42597.9. Samples: 3260439020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 11:23:15,935][15401] Updated weights for policy 0, policy_version 199000 (0.0024) [2024-06-22 11:23:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42054.1, 300 sec: 42820.6). Total num frames: 3260530688. Throughput: 0: 42687.2. Samples: 3260700220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 11:23:18,796][15401] Updated weights for policy 0, policy_version 199010 (0.0024) [2024-06-22 11:23:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3260727296. Throughput: 0: 42465.0. Samples: 3260825800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 11:23:23,574][15401] Updated weights for policy 0, policy_version 199020 (0.0032) [2024-06-22 11:23:26,455][15401] Updated weights for policy 0, policy_version 199030 (0.0034) [2024-06-22 11:23:28,390][15132] Fps is (10 sec: 45873.7, 60 sec: 43417.4, 300 sec: 43042.7). Total num frames: 3260989440. Throughput: 0: 42703.8. Samples: 3261081880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 11:23:31,063][15401] Updated weights for policy 0, policy_version 199040 (0.0026) [2024-06-22 11:23:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3261169664. Throughput: 0: 42911.0. Samples: 3261350020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 11:23:34,056][15401] Updated weights for policy 0, policy_version 199050 (0.0039) [2024-06-22 11:23:38,389][15132] Fps is (10 sec: 37684.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3261366272. Throughput: 0: 42598.7. Samples: 3261469500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 11:23:38,590][15401] Updated weights for policy 0, policy_version 199060 (0.0038) [2024-06-22 11:23:41,878][15401] Updated weights for policy 0, policy_version 199070 (0.0041) [2024-06-22 11:23:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42987.4). Total num frames: 3261628416. Throughput: 0: 42845.3. Samples: 3261727620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 11:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199074_3261628416.pth... [2024-06-22 11:23:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198444_3251306496.pth [2024-06-22 11:23:46,139][15401] Updated weights for policy 0, policy_version 199080 (0.0037) [2024-06-22 11:23:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 3261808640. Throughput: 0: 42824.0. Samples: 3261989000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 11:23:49,390][15401] Updated weights for policy 0, policy_version 199090 (0.0035) [2024-06-22 11:23:53,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3262005248. Throughput: 0: 42704.4. Samples: 3262112600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 11:23:53,729][15401] Updated weights for policy 0, policy_version 199100 (0.0037) [2024-06-22 11:23:57,061][15401] Updated weights for policy 0, policy_version 199110 (0.0039) [2024-06-22 11:23:58,242][15349] Signal inference workers to stop experience collection... (48150 times) [2024-06-22 11:23:58,242][15349] Signal inference workers to resume experience collection... (48150 times) [2024-06-22 11:23:58,289][15401] InferenceWorker_p0-w0: stopping experience collection (48150 times) [2024-06-22 11:23:58,289][15401] InferenceWorker_p0-w0: resuming experience collection (48150 times) [2024-06-22 11:23:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3262267392. Throughput: 0: 42852.9. Samples: 3262367400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-22 11:23:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 11:24:01,682][15401] Updated weights for policy 0, policy_version 199120 (0.0033) [2024-06-22 11:24:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 3262447616. Throughput: 0: 42818.5. Samples: 3262627060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 11:24:04,811][15401] Updated weights for policy 0, policy_version 199130 (0.0027) [2024-06-22 11:24:08,391][15132] Fps is (10 sec: 39317.3, 60 sec: 42870.7, 300 sec: 42875.9). Total num frames: 3262660608. Throughput: 0: 42690.1. Samples: 3262746900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:08,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 11:24:09,315][15401] Updated weights for policy 0, policy_version 199140 (0.0033) [2024-06-22 11:24:12,362][15401] Updated weights for policy 0, policy_version 199150 (0.0040) [2024-06-22 11:24:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.1, 300 sec: 42876.1). Total num frames: 3262889984. Throughput: 0: 42665.3. Samples: 3263001820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:13,391][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 11:24:17,059][15401] Updated weights for policy 0, policy_version 199160 (0.0037) [2024-06-22 11:24:18,389][15132] Fps is (10 sec: 42603.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 3263086592. Throughput: 0: 42518.0. Samples: 3263263320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:18,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-22 11:24:20,261][15401] Updated weights for policy 0, policy_version 199170 (0.0027) [2024-06-22 11:24:23,389][15132] Fps is (10 sec: 42599.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3263315968. Throughput: 0: 42488.0. Samples: 3263381460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:23,390][15132] Avg episode reward: [(0, '0.933')] [2024-06-22 11:24:23,415][15349] Saving new best policy, reward=0.933! [2024-06-22 11:24:24,684][15401] Updated weights for policy 0, policy_version 199180 (0.0034) [2024-06-22 11:24:27,870][15401] Updated weights for policy 0, policy_version 199190 (0.0042) [2024-06-22 11:24:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.6, 300 sec: 42876.5). Total num frames: 3263545344. Throughput: 0: 42698.2. Samples: 3263649040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:28,390][15132] Avg episode reward: [(0, '0.921')] [2024-06-22 11:24:32,205][15401] Updated weights for policy 0, policy_version 199200 (0.0033) [2024-06-22 11:24:33,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3263725568. Throughput: 0: 42710.9. Samples: 3263911000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:33,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 11:24:35,346][15401] Updated weights for policy 0, policy_version 199210 (0.0025) [2024-06-22 11:24:38,391][15132] Fps is (10 sec: 40955.0, 60 sec: 43143.6, 300 sec: 42931.4). Total num frames: 3263954944. Throughput: 0: 42662.4. Samples: 3264032460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:38,391][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 11:24:39,738][15401] Updated weights for policy 0, policy_version 199220 (0.0045) [2024-06-22 11:24:43,015][15401] Updated weights for policy 0, policy_version 199230 (0.0033) [2024-06-22 11:24:43,389][15132] Fps is (10 sec: 47514.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3264200704. Throughput: 0: 42852.4. Samples: 3264295760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 11:24:47,759][15401] Updated weights for policy 0, policy_version 199240 (0.0031) [2024-06-22 11:24:48,390][15132] Fps is (10 sec: 40964.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3264364544. Throughput: 0: 42649.4. Samples: 3264546280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 11:24:50,690][15401] Updated weights for policy 0, policy_version 199250 (0.0031) [2024-06-22 11:24:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3264593920. Throughput: 0: 42741.9. Samples: 3264670240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 11:24:55,394][15401] Updated weights for policy 0, policy_version 199260 (0.0039) [2024-06-22 11:24:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3264823296. Throughput: 0: 42932.4. Samples: 3264933760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:24:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 11:24:58,510][15401] Updated weights for policy 0, policy_version 199270 (0.0025) [2024-06-22 11:25:02,900][15401] Updated weights for policy 0, policy_version 199280 (0.0036) [2024-06-22 11:25:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3265003520. Throughput: 0: 42711.9. Samples: 3265185360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 11:25:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 11:25:06,220][15401] Updated weights for policy 0, policy_version 199290 (0.0043) [2024-06-22 11:25:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43145.2, 300 sec: 42987.1). Total num frames: 3265249280. Throughput: 0: 42862.1. Samples: 3265310260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 11:25:10,807][15401] Updated weights for policy 0, policy_version 199300 (0.0029) [2024-06-22 11:25:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 3265429504. Throughput: 0: 42804.7. Samples: 3265575260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 11:25:14,204][15401] Updated weights for policy 0, policy_version 199310 (0.0030) [2024-06-22 11:25:18,287][15401] Updated weights for policy 0, policy_version 199320 (0.0035) [2024-06-22 11:25:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3265658880. Throughput: 0: 42617.2. Samples: 3265828760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:18,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 11:25:22,046][15401] Updated weights for policy 0, policy_version 199330 (0.0036) [2024-06-22 11:25:22,756][15349] Signal inference workers to stop experience collection... (48200 times) [2024-06-22 11:25:22,769][15401] InferenceWorker_p0-w0: stopping experience collection (48200 times) [2024-06-22 11:25:22,818][15349] Signal inference workers to resume experience collection... (48200 times) [2024-06-22 11:25:22,819][15401] InferenceWorker_p0-w0: resuming experience collection (48200 times) [2024-06-22 11:25:23,391][15132] Fps is (10 sec: 45869.8, 60 sec: 42870.5, 300 sec: 42876.3). Total num frames: 3265888256. Throughput: 0: 42747.0. Samples: 3265956080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:23,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 11:25:25,734][15401] Updated weights for policy 0, policy_version 199340 (0.0035) [2024-06-22 11:25:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3266068480. Throughput: 0: 42576.0. Samples: 3266211680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 11:25:29,589][15401] Updated weights for policy 0, policy_version 199350 (0.0023) [2024-06-22 11:25:33,243][15401] Updated weights for policy 0, policy_version 199360 (0.0032) [2024-06-22 11:25:33,390][15132] Fps is (10 sec: 42603.8, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 3266314240. Throughput: 0: 42590.2. Samples: 3266462840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 11:25:37,254][15401] Updated weights for policy 0, policy_version 199370 (0.0044) [2024-06-22 11:25:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42326.2, 300 sec: 42765.0). Total num frames: 3266494464. Throughput: 0: 42766.2. Samples: 3266594720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:38,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 11:25:40,745][15401] Updated weights for policy 0, policy_version 199380 (0.0041) [2024-06-22 11:25:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3266723840. Throughput: 0: 42547.8. Samples: 3266848420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 11:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199385_3266723840.pth... [2024-06-22 11:25:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000198759_3256467456.pth [2024-06-22 11:25:45,200][15401] Updated weights for policy 0, policy_version 199390 (0.0040) [2024-06-22 11:25:48,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3266969600. Throughput: 0: 42620.9. Samples: 3267103300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 11:25:48,395][15401] Updated weights for policy 0, policy_version 199400 (0.0037) [2024-06-22 11:25:52,848][15401] Updated weights for policy 0, policy_version 199410 (0.0046) [2024-06-22 11:25:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3267149824. Throughput: 0: 42742.0. Samples: 3267233640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 11:25:55,875][15401] Updated weights for policy 0, policy_version 199420 (0.0043) [2024-06-22 11:25:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3267362816. Throughput: 0: 42651.8. Samples: 3267494580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:25:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 11:26:00,501][15401] Updated weights for policy 0, policy_version 199430 (0.0030) [2024-06-22 11:26:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3267608576. Throughput: 0: 42631.1. Samples: 3267747160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:26:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 11:26:03,462][15401] Updated weights for policy 0, policy_version 199440 (0.0042) [2024-06-22 11:26:08,299][15401] Updated weights for policy 0, policy_version 199450 (0.0037) [2024-06-22 11:26:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3267788800. Throughput: 0: 42712.9. Samples: 3267878100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:26:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 11:26:11,262][15401] Updated weights for policy 0, policy_version 199460 (0.0032) [2024-06-22 11:26:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3268001792. Throughput: 0: 42709.7. Samples: 3268133620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 11:26:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 11:26:15,749][15401] Updated weights for policy 0, policy_version 199470 (0.0031) [2024-06-22 11:26:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3268247552. Throughput: 0: 42799.1. Samples: 3268388800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 11:26:18,781][15401] Updated weights for policy 0, policy_version 199480 (0.0028) [2024-06-22 11:26:23,360][15401] Updated weights for policy 0, policy_version 199490 (0.0032) [2024-06-22 11:26:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42599.4, 300 sec: 42820.9). Total num frames: 3268444160. Throughput: 0: 42870.7. Samples: 3268523900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 11:26:26,486][15401] Updated weights for policy 0, policy_version 199500 (0.0037) [2024-06-22 11:26:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3268657152. Throughput: 0: 42739.2. Samples: 3268771680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 11:26:30,961][15401] Updated weights for policy 0, policy_version 199510 (0.0040) [2024-06-22 11:26:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3268870144. Throughput: 0: 42852.0. Samples: 3269031640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 11:26:34,124][15401] Updated weights for policy 0, policy_version 199520 (0.0030) [2024-06-22 11:26:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3269066752. Throughput: 0: 42843.9. Samples: 3269161620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 11:26:38,930][15401] Updated weights for policy 0, policy_version 199530 (0.0034) [2024-06-22 11:26:41,838][15401] Updated weights for policy 0, policy_version 199540 (0.0028) [2024-06-22 11:26:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3269296128. Throughput: 0: 42473.3. Samples: 3269405880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 11:26:45,748][15349] Signal inference workers to stop experience collection... (48250 times) [2024-06-22 11:26:45,756][15349] Signal inference workers to resume experience collection... (48250 times) [2024-06-22 11:26:45,764][15401] InferenceWorker_p0-w0: stopping experience collection (48250 times) [2024-06-22 11:26:45,775][15401] InferenceWorker_p0-w0: resuming experience collection (48250 times) [2024-06-22 11:26:46,776][15401] Updated weights for policy 0, policy_version 199550 (0.0033) [2024-06-22 11:26:48,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 3269509120. Throughput: 0: 42662.1. Samples: 3269667060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:48,392][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 11:26:49,919][15401] Updated weights for policy 0, policy_version 199560 (0.0034) [2024-06-22 11:26:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 3269705728. Throughput: 0: 42541.8. Samples: 3269792480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:53,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 11:26:54,373][15401] Updated weights for policy 0, policy_version 199570 (0.0037) [2024-06-22 11:26:57,546][15401] Updated weights for policy 0, policy_version 199580 (0.0035) [2024-06-22 11:26:58,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 3269935104. Throughput: 0: 42543.5. Samples: 3270048080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:26:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 11:27:01,818][15401] Updated weights for policy 0, policy_version 199590 (0.0041) [2024-06-22 11:27:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3270131712. Throughput: 0: 42572.9. Samples: 3270304580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:27:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 11:27:05,190][15401] Updated weights for policy 0, policy_version 199600 (0.0034) [2024-06-22 11:27:08,393][15132] Fps is (10 sec: 40947.9, 60 sec: 42596.2, 300 sec: 42653.8). Total num frames: 3270344704. Throughput: 0: 42326.0. Samples: 3270428700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:27:08,393][15132] Avg episode reward: [(0, '0.252')] [2024-06-22 11:27:09,533][15401] Updated weights for policy 0, policy_version 199610 (0.0036) [2024-06-22 11:27:13,111][15401] Updated weights for policy 0, policy_version 199620 (0.0025) [2024-06-22 11:27:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 3270574080. Throughput: 0: 42673.2. Samples: 3270691980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:27:13,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 11:27:17,206][15401] Updated weights for policy 0, policy_version 199630 (0.0021) [2024-06-22 11:27:18,389][15132] Fps is (10 sec: 44250.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3270787072. Throughput: 0: 42526.7. Samples: 3270945340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 11:27:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 11:27:20,677][15401] Updated weights for policy 0, policy_version 199640 (0.0036) [2024-06-22 11:27:23,391][15132] Fps is (10 sec: 40952.6, 60 sec: 42323.9, 300 sec: 42709.2). Total num frames: 3270983680. Throughput: 0: 42458.7. Samples: 3271072340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:23,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 11:27:24,715][15401] Updated weights for policy 0, policy_version 199650 (0.0036) [2024-06-22 11:27:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3271213056. Throughput: 0: 42824.3. Samples: 3271332980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 11:27:28,623][15401] Updated weights for policy 0, policy_version 199660 (0.0045) [2024-06-22 11:27:32,275][15401] Updated weights for policy 0, policy_version 199670 (0.0037) [2024-06-22 11:27:33,389][15132] Fps is (10 sec: 44245.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3271426048. Throughput: 0: 42760.6. Samples: 3271591180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 11:27:36,138][15401] Updated weights for policy 0, policy_version 199680 (0.0031) [2024-06-22 11:27:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3271622656. Throughput: 0: 42879.8. Samples: 3271722080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 11:27:39,994][15401] Updated weights for policy 0, policy_version 199690 (0.0028) [2024-06-22 11:27:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 3271868416. Throughput: 0: 42907.1. Samples: 3271979000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:43,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 11:27:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199699_3271868416.pth... [2024-06-22 11:27:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199074_3261628416.pth [2024-06-22 11:27:43,656][15401] Updated weights for policy 0, policy_version 199700 (0.0038) [2024-06-22 11:27:47,633][15401] Updated weights for policy 0, policy_version 199710 (0.0046) [2024-06-22 11:27:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 3272081408. Throughput: 0: 42721.8. Samples: 3272227060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 11:27:51,726][15401] Updated weights for policy 0, policy_version 199720 (0.0037) [2024-06-22 11:27:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3272278016. Throughput: 0: 42880.7. Samples: 3272358200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 11:27:55,319][15401] Updated weights for policy 0, policy_version 199730 (0.0037) [2024-06-22 11:27:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3272474624. Throughput: 0: 42516.5. Samples: 3272605220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:27:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 11:27:59,542][15401] Updated weights for policy 0, policy_version 199740 (0.0029) [2024-06-22 11:28:02,906][15401] Updated weights for policy 0, policy_version 199750 (0.0032) [2024-06-22 11:28:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3272704000. Throughput: 0: 42693.3. Samples: 3272866540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 11:28:07,097][15401] Updated weights for policy 0, policy_version 199760 (0.0031) [2024-06-22 11:28:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.6, 300 sec: 42598.4). Total num frames: 3272900608. Throughput: 0: 42806.3. Samples: 3272998540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 11:28:10,344][15401] Updated weights for policy 0, policy_version 199770 (0.0039) [2024-06-22 11:28:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3273129984. Throughput: 0: 42686.7. Samples: 3273253880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 11:28:14,954][15401] Updated weights for policy 0, policy_version 199780 (0.0046) [2024-06-22 11:28:18,274][15401] Updated weights for policy 0, policy_version 199790 (0.0033) [2024-06-22 11:28:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3273359360. Throughput: 0: 42625.8. Samples: 3273509340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:28:22,460][15401] Updated weights for policy 0, policy_version 199800 (0.0037) [2024-06-22 11:28:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.7, 300 sec: 42542.9). Total num frames: 3273539584. Throughput: 0: 42695.1. Samples: 3273643360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 11:28:25,830][15401] Updated weights for policy 0, policy_version 199810 (0.0031) [2024-06-22 11:28:27,390][15349] Signal inference workers to stop experience collection... (48300 times) [2024-06-22 11:28:27,433][15401] InferenceWorker_p0-w0: stopping experience collection (48300 times) [2024-06-22 11:28:27,440][15349] Signal inference workers to resume experience collection... (48300 times) [2024-06-22 11:28:27,460][15401] InferenceWorker_p0-w0: resuming experience collection (48300 times) [2024-06-22 11:28:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3273785344. Throughput: 0: 42617.4. Samples: 3273896680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 11:28:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 11:28:30,008][15401] Updated weights for policy 0, policy_version 199820 (0.0045) [2024-06-22 11:28:33,293][15401] Updated weights for policy 0, policy_version 199830 (0.0029) [2024-06-22 11:28:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3274014720. Throughput: 0: 42884.7. Samples: 3274156880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 11:28:37,603][15401] Updated weights for policy 0, policy_version 199840 (0.0028) [2024-06-22 11:28:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 3274194944. Throughput: 0: 42831.9. Samples: 3274285740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:38,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 11:28:41,416][15401] Updated weights for policy 0, policy_version 199850 (0.0044) [2024-06-22 11:28:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 3274440704. Throughput: 0: 42929.8. Samples: 3274537060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 11:28:45,284][15401] Updated weights for policy 0, policy_version 199860 (0.0040) [2024-06-22 11:28:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3274637312. Throughput: 0: 42947.6. Samples: 3274799180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 11:28:48,897][15401] Updated weights for policy 0, policy_version 199870 (0.0035) [2024-06-22 11:28:53,058][15401] Updated weights for policy 0, policy_version 199880 (0.0022) [2024-06-22 11:28:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3274833920. Throughput: 0: 42813.8. Samples: 3274925160. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 11:28:56,738][15401] Updated weights for policy 0, policy_version 199890 (0.0030) [2024-06-22 11:28:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3275079680. Throughput: 0: 42857.8. Samples: 3275182480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:28:58,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 11:29:00,723][15401] Updated weights for policy 0, policy_version 199900 (0.0028) [2024-06-22 11:29:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 3275276288. Throughput: 0: 43045.8. Samples: 3275446400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:03,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 11:29:04,288][15401] Updated weights for policy 0, policy_version 199910 (0.0036) [2024-06-22 11:29:08,275][15401] Updated weights for policy 0, policy_version 199920 (0.0031) [2024-06-22 11:29:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3275489280. Throughput: 0: 42783.6. Samples: 3275568620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 11:29:11,845][15401] Updated weights for policy 0, policy_version 199930 (0.0040) [2024-06-22 11:29:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3275735040. Throughput: 0: 42923.9. Samples: 3275828260. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 11:29:15,860][15401] Updated weights for policy 0, policy_version 199940 (0.0039) [2024-06-22 11:29:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 3275882496. Throughput: 0: 43008.1. Samples: 3276092240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:18,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 11:29:19,499][15401] Updated weights for policy 0, policy_version 199950 (0.0042) [2024-06-22 11:29:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3276128256. Throughput: 0: 42615.6. Samples: 3276203340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:23,390][15132] Avg episode reward: [(0, '0.874')] [2024-06-22 11:29:23,519][15401] Updated weights for policy 0, policy_version 199960 (0.0033) [2024-06-22 11:29:27,201][15401] Updated weights for policy 0, policy_version 199970 (0.0036) [2024-06-22 11:29:28,389][15132] Fps is (10 sec: 49152.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3276374016. Throughput: 0: 42857.4. Samples: 3276465640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 11:29:31,141][15401] Updated weights for policy 0, policy_version 199980 (0.0036) [2024-06-22 11:29:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42654.1). Total num frames: 3276537856. Throughput: 0: 42983.1. Samples: 3276733420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-22 11:29:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 11:29:34,729][15401] Updated weights for policy 0, policy_version 199990 (0.0034) [2024-06-22 11:29:35,907][15349] Signal inference workers to stop experience collection... (48350 times) [2024-06-22 11:29:35,944][15401] InferenceWorker_p0-w0: stopping experience collection (48350 times) [2024-06-22 11:29:36,030][15349] Signal inference workers to resume experience collection... (48350 times) [2024-06-22 11:29:36,031][15401] InferenceWorker_p0-w0: resuming experience collection (48350 times) [2024-06-22 11:29:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 3276767232. Throughput: 0: 42683.6. Samples: 3276845920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:29:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 11:29:38,829][15401] Updated weights for policy 0, policy_version 200000 (0.0034) [2024-06-22 11:29:42,738][15401] Updated weights for policy 0, policy_version 200010 (0.0041) [2024-06-22 11:29:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3277012992. Throughput: 0: 42787.9. Samples: 3277107940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:29:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 11:29:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200013_3277012992.pth... [2024-06-22 11:29:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199385_3266723840.pth [2024-06-22 11:29:46,670][15401] Updated weights for policy 0, policy_version 200020 (0.0036) [2024-06-22 11:29:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3277176832. Throughput: 0: 42717.3. Samples: 3277368680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:29:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 11:29:50,362][15401] Updated weights for policy 0, policy_version 200030 (0.0030) [2024-06-22 11:29:53,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 3277406208. Throughput: 0: 42582.1. Samples: 3277484920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:29:53,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:29:54,210][15401] Updated weights for policy 0, policy_version 200040 (0.0029) [2024-06-22 11:29:57,956][15401] Updated weights for policy 0, policy_version 200050 (0.0033) [2024-06-22 11:29:58,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3277651968. Throughput: 0: 42787.2. Samples: 3277753680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:29:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 11:30:01,817][15401] Updated weights for policy 0, policy_version 200060 (0.0036) [2024-06-22 11:30:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3277815808. Throughput: 0: 42529.3. Samples: 3278006060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:03,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-22 11:30:05,473][15401] Updated weights for policy 0, policy_version 200070 (0.0048) [2024-06-22 11:30:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3278045184. Throughput: 0: 42773.7. Samples: 3278128160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 11:30:09,659][15401] Updated weights for policy 0, policy_version 200080 (0.0033) [2024-06-22 11:30:13,363][15401] Updated weights for policy 0, policy_version 200090 (0.0030) [2024-06-22 11:30:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3278274560. Throughput: 0: 42804.3. Samples: 3278391840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 11:30:17,532][15401] Updated weights for policy 0, policy_version 200100 (0.0031) [2024-06-22 11:30:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42654.1). Total num frames: 3278471168. Throughput: 0: 42619.4. Samples: 3278651300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 11:30:21,238][15401] Updated weights for policy 0, policy_version 200110 (0.0034) [2024-06-22 11:30:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3278700544. Throughput: 0: 42942.0. Samples: 3278778320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 11:30:25,237][15401] Updated weights for policy 0, policy_version 200120 (0.0028) [2024-06-22 11:30:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 3278897152. Throughput: 0: 42573.3. Samples: 3279023740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 11:30:28,755][15401] Updated weights for policy 0, policy_version 200130 (0.0037) [2024-06-22 11:30:32,779][15401] Updated weights for policy 0, policy_version 200140 (0.0042) [2024-06-22 11:30:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3279110144. Throughput: 0: 42587.6. Samples: 3279285120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 11:30:36,325][15401] Updated weights for policy 0, policy_version 200150 (0.0032) [2024-06-22 11:30:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3279323136. Throughput: 0: 42893.9. Samples: 3279415040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 11:30:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 11:30:40,399][15401] Updated weights for policy 0, policy_version 200160 (0.0031) [2024-06-22 11:30:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 3279552512. Throughput: 0: 42565.7. Samples: 3279669240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:30:43,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 11:30:43,945][15401] Updated weights for policy 0, policy_version 200170 (0.0033) [2024-06-22 11:30:47,955][15401] Updated weights for policy 0, policy_version 200180 (0.0027) [2024-06-22 11:30:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3279749120. Throughput: 0: 42800.0. Samples: 3279932060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:30:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 11:30:51,666][15401] Updated weights for policy 0, policy_version 200190 (0.0034) [2024-06-22 11:30:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 3279978496. Throughput: 0: 42983.7. Samples: 3280062420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:30:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 11:30:55,561][15401] Updated weights for policy 0, policy_version 200200 (0.0030) [2024-06-22 11:30:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3280191488. Throughput: 0: 42727.7. Samples: 3280314580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:30:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 11:30:59,199][15401] Updated weights for policy 0, policy_version 200210 (0.0045) [2024-06-22 11:31:03,111][15401] Updated weights for policy 0, policy_version 200220 (0.0027) [2024-06-22 11:31:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 3280420864. Throughput: 0: 42868.1. Samples: 3280580360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:03,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 11:31:06,841][15401] Updated weights for policy 0, policy_version 200230 (0.0037) [2024-06-22 11:31:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3280617472. Throughput: 0: 42805.0. Samples: 3280704540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 11:31:08,565][15349] Signal inference workers to stop experience collection... (48400 times) [2024-06-22 11:31:08,573][15349] Signal inference workers to resume experience collection... (48400 times) [2024-06-22 11:31:08,620][15401] InferenceWorker_p0-w0: stopping experience collection (48400 times) [2024-06-22 11:31:08,620][15401] InferenceWorker_p0-w0: resuming experience collection (48400 times) [2024-06-22 11:31:10,903][15401] Updated weights for policy 0, policy_version 200240 (0.0032) [2024-06-22 11:31:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3280830464. Throughput: 0: 42928.1. Samples: 3280955500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 11:31:14,510][15401] Updated weights for policy 0, policy_version 200250 (0.0035) [2024-06-22 11:31:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3281043456. Throughput: 0: 43039.1. Samples: 3281221880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:18,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 11:31:18,505][15401] Updated weights for policy 0, policy_version 200260 (0.0027) [2024-06-22 11:31:22,171][15401] Updated weights for policy 0, policy_version 200270 (0.0031) [2024-06-22 11:31:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3281272832. Throughput: 0: 42965.8. Samples: 3281348500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:23,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 11:31:26,000][15401] Updated weights for policy 0, policy_version 200280 (0.0034) [2024-06-22 11:31:28,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 3281485824. Throughput: 0: 42971.3. Samples: 3281603120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:28,396][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 11:31:29,720][15401] Updated weights for policy 0, policy_version 200290 (0.0041) [2024-06-22 11:31:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3281698816. Throughput: 0: 42910.7. Samples: 3281863040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 11:31:33,641][15401] Updated weights for policy 0, policy_version 200300 (0.0029) [2024-06-22 11:31:37,366][15401] Updated weights for policy 0, policy_version 200310 (0.0031) [2024-06-22 11:31:38,392][15132] Fps is (10 sec: 44254.6, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 3281928192. Throughput: 0: 42749.2. Samples: 3281986240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:38,392][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 11:31:41,196][15401] Updated weights for policy 0, policy_version 200320 (0.0043) [2024-06-22 11:31:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 3282124800. Throughput: 0: 42892.4. Samples: 3282244740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 11:31:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200326_3282141184.pth... [2024-06-22 11:31:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000199699_3271868416.pth [2024-06-22 11:31:45,073][15401] Updated weights for policy 0, policy_version 200330 (0.0030) [2024-06-22 11:31:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3282337792. Throughput: 0: 42679.6. Samples: 3282500940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 11:31:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 11:31:49,006][15401] Updated weights for policy 0, policy_version 200340 (0.0037) [2024-06-22 11:31:52,600][15401] Updated weights for policy 0, policy_version 200350 (0.0035) [2024-06-22 11:31:53,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 3282550784. Throughput: 0: 42667.2. Samples: 3282624840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:31:53,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 11:31:56,586][15401] Updated weights for policy 0, policy_version 200360 (0.0039) [2024-06-22 11:31:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3282763776. Throughput: 0: 42897.7. Samples: 3282885900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:31:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 11:32:00,161][15401] Updated weights for policy 0, policy_version 200370 (0.0042) [2024-06-22 11:32:03,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 3282976768. Throughput: 0: 42799.1. Samples: 3283147840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 11:32:04,072][15401] Updated weights for policy 0, policy_version 200380 (0.0038) [2024-06-22 11:32:07,853][15401] Updated weights for policy 0, policy_version 200390 (0.0030) [2024-06-22 11:32:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3283206144. Throughput: 0: 42794.6. Samples: 3283274260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:32:11,959][15401] Updated weights for policy 0, policy_version 200400 (0.0031) [2024-06-22 11:32:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3283419136. Throughput: 0: 42791.0. Samples: 3283528540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:13,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 11:32:15,406][15401] Updated weights for policy 0, policy_version 200410 (0.0048) [2024-06-22 11:32:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42820.5). Total num frames: 3283615744. Throughput: 0: 42700.3. Samples: 3283784660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:18,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 11:32:19,427][15401] Updated weights for policy 0, policy_version 200420 (0.0029) [2024-06-22 11:32:23,207][15401] Updated weights for policy 0, policy_version 200430 (0.0033) [2024-06-22 11:32:23,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3283845120. Throughput: 0: 42888.1. Samples: 3283916100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 11:32:27,013][15401] Updated weights for policy 0, policy_version 200440 (0.0022) [2024-06-22 11:32:28,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 3284058112. Throughput: 0: 42671.1. Samples: 3284164940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:28,395][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 11:32:31,017][15401] Updated weights for policy 0, policy_version 200450 (0.0043) [2024-06-22 11:32:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3284254720. Throughput: 0: 42888.9. Samples: 3284430940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 11:32:34,649][15401] Updated weights for policy 0, policy_version 200460 (0.0036) [2024-06-22 11:32:38,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42599.6, 300 sec: 42765.3). Total num frames: 3284484096. Throughput: 0: 43069.0. Samples: 3284562700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:38,391][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 11:32:38,684][15401] Updated weights for policy 0, policy_version 200470 (0.0048) [2024-06-22 11:32:40,438][15349] Signal inference workers to stop experience collection... (48450 times) [2024-06-22 11:32:40,493][15401] InferenceWorker_p0-w0: stopping experience collection (48450 times) [2024-06-22 11:32:40,501][15349] Signal inference workers to resume experience collection... (48450 times) [2024-06-22 11:32:40,508][15401] InferenceWorker_p0-w0: resuming experience collection (48450 times) [2024-06-22 11:32:42,376][15401] Updated weights for policy 0, policy_version 200480 (0.0028) [2024-06-22 11:32:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3284680704. Throughput: 0: 42946.7. Samples: 3284818500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 11:32:46,328][15401] Updated weights for policy 0, policy_version 200490 (0.0032) [2024-06-22 11:32:48,390][15132] Fps is (10 sec: 44239.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3284926464. Throughput: 0: 42813.6. Samples: 3285074460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 11:32:50,363][15401] Updated weights for policy 0, policy_version 200500 (0.0025) [2024-06-22 11:32:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 3285106688. Throughput: 0: 42951.2. Samples: 3285207060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 11:32:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 11:32:53,926][15401] Updated weights for policy 0, policy_version 200510 (0.0036) [2024-06-22 11:32:57,866][15401] Updated weights for policy 0, policy_version 200520 (0.0038) [2024-06-22 11:32:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3285336064. Throughput: 0: 42961.4. Samples: 3285461700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:32:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 11:33:01,478][15401] Updated weights for policy 0, policy_version 200530 (0.0032) [2024-06-22 11:33:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3285565440. Throughput: 0: 42963.6. Samples: 3285717920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:03,399][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 11:33:05,382][15401] Updated weights for policy 0, policy_version 200540 (0.0034) [2024-06-22 11:33:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3285745664. Throughput: 0: 43011.6. Samples: 3285851620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 11:33:09,004][15401] Updated weights for policy 0, policy_version 200550 (0.0034) [2024-06-22 11:33:13,068][15401] Updated weights for policy 0, policy_version 200560 (0.0031) [2024-06-22 11:33:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 3285991424. Throughput: 0: 43144.8. Samples: 3286106460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:13,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 11:33:16,860][15401] Updated weights for policy 0, policy_version 200570 (0.0044) [2024-06-22 11:33:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 3286204416. Throughput: 0: 42948.3. Samples: 3286363620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:18,398][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 11:33:20,632][15401] Updated weights for policy 0, policy_version 200580 (0.0035) [2024-06-22 11:33:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3286401024. Throughput: 0: 42950.5. Samples: 3286495440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 11:33:24,292][15401] Updated weights for policy 0, policy_version 200590 (0.0041) [2024-06-22 11:33:28,169][15401] Updated weights for policy 0, policy_version 200600 (0.0043) [2024-06-22 11:33:28,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 3286630400. Throughput: 0: 42861.6. Samples: 3286747300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:28,391][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 11:33:32,104][15401] Updated weights for policy 0, policy_version 200610 (0.0034) [2024-06-22 11:33:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 3286843392. Throughput: 0: 43045.0. Samples: 3287011480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 11:33:35,700][15401] Updated weights for policy 0, policy_version 200620 (0.0031) [2024-06-22 11:33:38,390][15132] Fps is (10 sec: 42601.0, 60 sec: 42871.9, 300 sec: 42765.0). Total num frames: 3287056384. Throughput: 0: 42971.9. Samples: 3287140800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:38,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 11:33:39,728][15401] Updated weights for policy 0, policy_version 200630 (0.0036) [2024-06-22 11:33:43,304][15401] Updated weights for policy 0, policy_version 200640 (0.0039) [2024-06-22 11:33:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3287285760. Throughput: 0: 42807.9. Samples: 3287388060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 11:33:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200640_3287285760.pth... [2024-06-22 11:33:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200013_3277012992.pth [2024-06-22 11:33:47,410][15401] Updated weights for policy 0, policy_version 200650 (0.0043) [2024-06-22 11:33:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3287482368. Throughput: 0: 43003.2. Samples: 3287653060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 11:33:50,867][15401] Updated weights for policy 0, policy_version 200660 (0.0032) [2024-06-22 11:33:53,396][15132] Fps is (10 sec: 40933.6, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 3287695360. Throughput: 0: 42752.4. Samples: 3287775760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:53,396][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 11:33:53,977][15349] Signal inference workers to stop experience collection... (48500 times) [2024-06-22 11:33:53,981][15349] Signal inference workers to resume experience collection... (48500 times) [2024-06-22 11:33:54,026][15401] InferenceWorker_p0-w0: stopping experience collection (48500 times) [2024-06-22 11:33:54,026][15401] InferenceWorker_p0-w0: resuming experience collection (48500 times) [2024-06-22 11:33:55,147][15401] Updated weights for policy 0, policy_version 200670 (0.0043) [2024-06-22 11:33:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3287924736. Throughput: 0: 42781.7. Samples: 3288031640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:33:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 11:33:58,577][15401] Updated weights for policy 0, policy_version 200680 (0.0026) [2024-06-22 11:34:02,902][15401] Updated weights for policy 0, policy_version 200690 (0.0025) [2024-06-22 11:34:03,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3288121344. Throughput: 0: 42928.4. Samples: 3288295400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 11:34:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 11:34:06,098][15401] Updated weights for policy 0, policy_version 200700 (0.0030) [2024-06-22 11:34:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3288334336. Throughput: 0: 42744.8. Samples: 3288418960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:08,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-22 11:34:10,545][15401] Updated weights for policy 0, policy_version 200710 (0.0033) [2024-06-22 11:34:13,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 3288580096. Throughput: 0: 42844.1. Samples: 3288675360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:13,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 11:34:14,057][15401] Updated weights for policy 0, policy_version 200720 (0.0032) [2024-06-22 11:34:18,371][15401] Updated weights for policy 0, policy_version 200730 (0.0030) [2024-06-22 11:34:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3288760320. Throughput: 0: 42846.5. Samples: 3288939580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 11:34:21,607][15401] Updated weights for policy 0, policy_version 200740 (0.0035) [2024-06-22 11:34:23,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3288956928. Throughput: 0: 42535.1. Samples: 3289054880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:23,400][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 11:34:25,993][15401] Updated weights for policy 0, policy_version 200750 (0.0040) [2024-06-22 11:34:28,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43143.3, 300 sec: 42986.8). Total num frames: 3289219072. Throughput: 0: 42743.0. Samples: 3289311600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:28,401][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 11:34:29,119][15401] Updated weights for policy 0, policy_version 200760 (0.0028) [2024-06-22 11:34:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3289382912. Throughput: 0: 42730.2. Samples: 3289575920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 11:34:33,674][15401] Updated weights for policy 0, policy_version 200770 (0.0043) [2024-06-22 11:34:36,747][15401] Updated weights for policy 0, policy_version 200780 (0.0026) [2024-06-22 11:34:38,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3289595904. Throughput: 0: 42567.4. Samples: 3289691020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 11:34:41,327][15401] Updated weights for policy 0, policy_version 200790 (0.0039) [2024-06-22 11:34:43,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3289874432. Throughput: 0: 42674.0. Samples: 3289951960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 11:34:44,308][15401] Updated weights for policy 0, policy_version 200800 (0.0033) [2024-06-22 11:34:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42765.0). Total num frames: 3290021888. Throughput: 0: 42612.0. Samples: 3290213040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:48,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 11:34:49,016][15401] Updated weights for policy 0, policy_version 200810 (0.0037) [2024-06-22 11:34:52,294][15401] Updated weights for policy 0, policy_version 200820 (0.0038) [2024-06-22 11:34:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 3290251264. Throughput: 0: 42293.3. Samples: 3290322160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 11:34:56,907][15401] Updated weights for policy 0, policy_version 200830 (0.0024) [2024-06-22 11:34:58,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3290480640. Throughput: 0: 42557.0. Samples: 3290590320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:34:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 11:34:59,786][15401] Updated weights for policy 0, policy_version 200840 (0.0039) [2024-06-22 11:35:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3290660864. Throughput: 0: 42369.0. Samples: 3290846180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:35:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 11:35:04,572][15401] Updated weights for policy 0, policy_version 200850 (0.0037) [2024-06-22 11:35:07,780][15401] Updated weights for policy 0, policy_version 200860 (0.0049) [2024-06-22 11:35:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3290890240. Throughput: 0: 42438.7. Samples: 3290964620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:35:08,390][15132] Avg episode reward: [(0, '0.069')] [2024-06-22 11:35:09,305][15349] Signal inference workers to stop experience collection... (48550 times) [2024-06-22 11:35:09,316][15349] Signal inference workers to resume experience collection... (48550 times) [2024-06-22 11:35:09,332][15401] InferenceWorker_p0-w0: stopping experience collection (48550 times) [2024-06-22 11:35:09,361][15401] InferenceWorker_p0-w0: resuming experience collection (48550 times) [2024-06-22 11:35:12,188][15401] Updated weights for policy 0, policy_version 200870 (0.0043) [2024-06-22 11:35:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 3291136000. Throughput: 0: 42682.7. Samples: 3291232220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:13,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 11:35:15,402][15401] Updated weights for policy 0, policy_version 200880 (0.0033) [2024-06-22 11:35:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3291299840. Throughput: 0: 42580.0. Samples: 3291492020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 11:35:20,037][15401] Updated weights for policy 0, policy_version 200890 (0.0031) [2024-06-22 11:35:22,873][15401] Updated weights for policy 0, policy_version 200900 (0.0035) [2024-06-22 11:35:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3291545600. Throughput: 0: 42654.4. Samples: 3291610460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:23,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 11:35:27,429][15401] Updated weights for policy 0, policy_version 200910 (0.0033) [2024-06-22 11:35:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 3291774976. Throughput: 0: 42873.3. Samples: 3291881260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 11:35:30,415][15401] Updated weights for policy 0, policy_version 200920 (0.0024) [2024-06-22 11:35:33,390][15132] Fps is (10 sec: 40956.1, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 3291955200. Throughput: 0: 42772.1. Samples: 3292137720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:33,391][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 11:35:35,059][15401] Updated weights for policy 0, policy_version 200930 (0.0038) [2024-06-22 11:35:37,853][15401] Updated weights for policy 0, policy_version 200940 (0.0034) [2024-06-22 11:35:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 3292200960. Throughput: 0: 43131.1. Samples: 3292263060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 11:35:42,864][15401] Updated weights for policy 0, policy_version 200950 (0.0032) [2024-06-22 11:35:43,396][15132] Fps is (10 sec: 45849.8, 60 sec: 42320.8, 300 sec: 42930.7). Total num frames: 3292413952. Throughput: 0: 43167.1. Samples: 3292533120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:43,396][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 11:35:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200953_3292413952.pth... [2024-06-22 11:35:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200326_3282141184.pth [2024-06-22 11:35:45,557][15401] Updated weights for policy 0, policy_version 200960 (0.0037) [2024-06-22 11:35:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 3292610560. Throughput: 0: 43002.3. Samples: 3292781280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 11:35:50,602][15401] Updated weights for policy 0, policy_version 200970 (0.0031) [2024-06-22 11:35:53,381][15401] Updated weights for policy 0, policy_version 200980 (0.0033) [2024-06-22 11:35:53,389][15132] Fps is (10 sec: 44265.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3292856320. Throughput: 0: 43101.4. Samples: 3292904180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 11:35:58,193][15401] Updated weights for policy 0, policy_version 200990 (0.0036) [2024-06-22 11:35:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3293036544. Throughput: 0: 43130.2. Samples: 3293173080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:35:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 11:36:01,098][15401] Updated weights for policy 0, policy_version 201000 (0.0035) [2024-06-22 11:36:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3293249536. Throughput: 0: 43019.1. Samples: 3293427880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:36:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 11:36:05,728][15401] Updated weights for policy 0, policy_version 201010 (0.0040) [2024-06-22 11:36:05,939][15349] Signal inference workers to stop experience collection... (48600 times) [2024-06-22 11:36:05,939][15349] Signal inference workers to resume experience collection... (48600 times) [2024-06-22 11:36:05,978][15401] InferenceWorker_p0-w0: stopping experience collection (48600 times) [2024-06-22 11:36:05,979][15401] InferenceWorker_p0-w0: resuming experience collection (48600 times) [2024-06-22 11:36:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3293478912. Throughput: 0: 43284.4. Samples: 3293558260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:36:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 11:36:08,664][15401] Updated weights for policy 0, policy_version 201020 (0.0027) [2024-06-22 11:36:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3293659136. Throughput: 0: 43084.8. Samples: 3293820080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:36:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 11:36:13,445][15401] Updated weights for policy 0, policy_version 201030 (0.0036) [2024-06-22 11:36:16,340][15401] Updated weights for policy 0, policy_version 201040 (0.0028) [2024-06-22 11:36:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 3293904896. Throughput: 0: 42950.4. Samples: 3294070460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-22 11:36:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 11:36:20,980][15401] Updated weights for policy 0, policy_version 201050 (0.0043) [2024-06-22 11:36:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 3294134272. Throughput: 0: 43206.7. Samples: 3294207360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 11:36:23,979][15401] Updated weights for policy 0, policy_version 201060 (0.0029) [2024-06-22 11:36:28,391][15132] Fps is (10 sec: 40955.5, 60 sec: 42324.4, 300 sec: 42764.8). Total num frames: 3294314496. Throughput: 0: 42956.5. Samples: 3294465940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:28,391][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 11:36:28,544][15401] Updated weights for policy 0, policy_version 201070 (0.0036) [2024-06-22 11:36:31,402][15401] Updated weights for policy 0, policy_version 201080 (0.0038) [2024-06-22 11:36:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43418.3, 300 sec: 42820.9). Total num frames: 3294560256. Throughput: 0: 43068.9. Samples: 3294719380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 11:36:36,011][15401] Updated weights for policy 0, policy_version 201090 (0.0035) [2024-06-22 11:36:38,390][15132] Fps is (10 sec: 47519.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3294789632. Throughput: 0: 43314.6. Samples: 3294853340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:36:38,897][15401] Updated weights for policy 0, policy_version 201100 (0.0031) [2024-06-22 11:36:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 3294953472. Throughput: 0: 43105.9. Samples: 3295112840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 11:36:43,649][15401] Updated weights for policy 0, policy_version 201110 (0.0033) [2024-06-22 11:36:46,757][15401] Updated weights for policy 0, policy_version 201120 (0.0041) [2024-06-22 11:36:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 3295199232. Throughput: 0: 42838.3. Samples: 3295355600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 11:36:51,546][15401] Updated weights for policy 0, policy_version 201130 (0.0034) [2024-06-22 11:36:53,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3295428608. Throughput: 0: 43144.0. Samples: 3295499740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 11:36:54,495][15401] Updated weights for policy 0, policy_version 201140 (0.0041) [2024-06-22 11:36:58,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3295576064. Throughput: 0: 42741.0. Samples: 3295743420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:36:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 11:36:59,103][15349] Signal inference workers to stop experience collection... (48650 times) [2024-06-22 11:36:59,151][15401] InferenceWorker_p0-w0: stopping experience collection (48650 times) [2024-06-22 11:36:59,219][15349] Signal inference workers to resume experience collection... (48650 times) [2024-06-22 11:36:59,219][15401] InferenceWorker_p0-w0: resuming experience collection (48650 times) [2024-06-22 11:36:59,354][15401] Updated weights for policy 0, policy_version 201150 (0.0030) [2024-06-22 11:37:02,017][15401] Updated weights for policy 0, policy_version 201160 (0.0037) [2024-06-22 11:37:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3295854592. Throughput: 0: 42645.0. Samples: 3295989480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:37:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 11:37:07,206][15401] Updated weights for policy 0, policy_version 201170 (0.0040) [2024-06-22 11:37:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 3296051200. Throughput: 0: 42720.4. Samples: 3296129780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:37:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 11:37:09,654][15401] Updated weights for policy 0, policy_version 201180 (0.0045) [2024-06-22 11:37:13,389][15132] Fps is (10 sec: 36045.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3296215040. Throughput: 0: 42527.6. Samples: 3296379620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:37:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 11:37:14,656][15401] Updated weights for policy 0, policy_version 201190 (0.0038) [2024-06-22 11:37:17,341][15401] Updated weights for policy 0, policy_version 201200 (0.0038) [2024-06-22 11:37:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3296493568. Throughput: 0: 42580.8. Samples: 3296635520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:37:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 11:37:22,149][15401] Updated weights for policy 0, policy_version 201210 (0.0029) [2024-06-22 11:37:23,389][15132] Fps is (10 sec: 49151.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3296706560. Throughput: 0: 42691.2. Samples: 3296774440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-22 11:37:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 11:37:24,955][15401] Updated weights for policy 0, policy_version 201220 (0.0034) [2024-06-22 11:37:28,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 3296870400. Throughput: 0: 42484.8. Samples: 3297024660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 11:37:29,672][15401] Updated weights for policy 0, policy_version 201230 (0.0038) [2024-06-22 11:37:32,466][15401] Updated weights for policy 0, policy_version 201240 (0.0035) [2024-06-22 11:37:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 3297148928. Throughput: 0: 42738.1. Samples: 3297278820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 11:37:37,517][15401] Updated weights for policy 0, policy_version 201250 (0.0045) [2024-06-22 11:37:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3297329152. Throughput: 0: 42688.5. Samples: 3297420720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:38,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 11:37:39,998][15401] Updated weights for policy 0, policy_version 201260 (0.0031) [2024-06-22 11:37:43,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 3297509376. Throughput: 0: 42624.4. Samples: 3297661520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 11:37:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201265_3297525760.pth... [2024-06-22 11:37:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200640_3287285760.pth [2024-06-22 11:37:45,329][15401] Updated weights for policy 0, policy_version 201270 (0.0044) [2024-06-22 11:37:47,595][15401] Updated weights for policy 0, policy_version 201280 (0.0033) [2024-06-22 11:37:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 3297787904. Throughput: 0: 42762.7. Samples: 3297913900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:48,393][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 11:37:52,964][15401] Updated weights for policy 0, policy_version 201290 (0.0026) [2024-06-22 11:37:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 3297951744. Throughput: 0: 42858.7. Samples: 3298058420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:53,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 11:37:55,534][15401] Updated weights for policy 0, policy_version 201300 (0.0038) [2024-06-22 11:37:58,390][15132] Fps is (10 sec: 37692.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3298164736. Throughput: 0: 42712.3. Samples: 3298301680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:37:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 11:38:00,110][15349] Signal inference workers to stop experience collection... (48700 times) [2024-06-22 11:38:00,113][15349] Signal inference workers to resume experience collection... (48700 times) [2024-06-22 11:38:00,139][15401] InferenceWorker_p0-w0: stopping experience collection (48700 times) [2024-06-22 11:38:00,140][15401] InferenceWorker_p0-w0: resuming experience collection (48700 times) [2024-06-22 11:38:00,584][15401] Updated weights for policy 0, policy_version 201310 (0.0028) [2024-06-22 11:38:03,159][15401] Updated weights for policy 0, policy_version 201320 (0.0031) [2024-06-22 11:38:03,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 3298426880. Throughput: 0: 42629.2. Samples: 3298553840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 11:38:08,309][15401] Updated weights for policy 0, policy_version 201330 (0.0034) [2024-06-22 11:38:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3298590720. Throughput: 0: 42693.3. Samples: 3298695640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 11:38:10,692][15401] Updated weights for policy 0, policy_version 201340 (0.0044) [2024-06-22 11:38:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3298820096. Throughput: 0: 42489.7. Samples: 3298936700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 11:38:16,115][15401] Updated weights for policy 0, policy_version 201350 (0.0031) [2024-06-22 11:38:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3299049472. Throughput: 0: 42517.3. Samples: 3299192100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 11:38:18,697][15401] Updated weights for policy 0, policy_version 201360 (0.0038) [2024-06-22 11:38:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 3299213312. Throughput: 0: 42417.4. Samples: 3299329500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:23,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 11:38:23,867][15401] Updated weights for policy 0, policy_version 201370 (0.0045) [2024-06-22 11:38:26,189][15401] Updated weights for policy 0, policy_version 201380 (0.0044) [2024-06-22 11:38:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3299459072. Throughput: 0: 42499.2. Samples: 3299573980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-22 11:38:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 11:38:31,479][15401] Updated weights for policy 0, policy_version 201390 (0.0037) [2024-06-22 11:38:33,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3299704832. Throughput: 0: 42800.5. Samples: 3299839820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 11:38:33,753][15401] Updated weights for policy 0, policy_version 201400 (0.0030) [2024-06-22 11:38:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3299868672. Throughput: 0: 42548.8. Samples: 3299973120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 11:38:39,034][15401] Updated weights for policy 0, policy_version 201410 (0.0032) [2024-06-22 11:38:41,413][15401] Updated weights for policy 0, policy_version 201420 (0.0036) [2024-06-22 11:38:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 3300114432. Throughput: 0: 42510.7. Samples: 3300214660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 11:38:46,825][15401] Updated weights for policy 0, policy_version 201430 (0.0041) [2024-06-22 11:38:46,826][15349] Signal inference workers to stop experience collection... (48750 times) [2024-06-22 11:38:46,827][15349] Signal inference workers to resume experience collection... (48750 times) [2024-06-22 11:38:46,847][15401] InferenceWorker_p0-w0: stopping experience collection (48750 times) [2024-06-22 11:38:46,847][15401] InferenceWorker_p0-w0: resuming experience collection (48750 times) [2024-06-22 11:38:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42600.1, 300 sec: 42877.0). Total num frames: 3300343808. Throughput: 0: 42834.3. Samples: 3300481380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 11:38:49,052][15401] Updated weights for policy 0, policy_version 201440 (0.0023) [2024-06-22 11:38:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3300491264. Throughput: 0: 42582.7. Samples: 3300611860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 11:38:54,350][15401] Updated weights for policy 0, policy_version 201450 (0.0024) [2024-06-22 11:38:56,623][15401] Updated weights for policy 0, policy_version 201460 (0.0035) [2024-06-22 11:38:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3300753408. Throughput: 0: 42745.8. Samples: 3300860260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:38:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 11:39:02,075][15401] Updated weights for policy 0, policy_version 201470 (0.0027) [2024-06-22 11:39:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 3300966400. Throughput: 0: 42784.5. Samples: 3301117400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 11:39:04,712][15401] Updated weights for policy 0, policy_version 201480 (0.0043) [2024-06-22 11:39:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 3301130240. Throughput: 0: 42647.5. Samples: 3301248640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 11:39:09,752][15401] Updated weights for policy 0, policy_version 201490 (0.0028) [2024-06-22 11:39:12,442][15401] Updated weights for policy 0, policy_version 201500 (0.0039) [2024-06-22 11:39:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3301392384. Throughput: 0: 42825.2. Samples: 3301501220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:13,392][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 11:39:17,351][15401] Updated weights for policy 0, policy_version 201510 (0.0028) [2024-06-22 11:39:18,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 3301605376. Throughput: 0: 42640.8. Samples: 3301758760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:18,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 11:39:20,007][15401] Updated weights for policy 0, policy_version 201520 (0.0037) [2024-06-22 11:39:23,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 3301785600. Throughput: 0: 42482.7. Samples: 3301884840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 11:39:25,063][15401] Updated weights for policy 0, policy_version 201530 (0.0038) [2024-06-22 11:39:27,675][15401] Updated weights for policy 0, policy_version 201540 (0.0038) [2024-06-22 11:39:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3302047744. Throughput: 0: 42789.4. Samples: 3302140180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:39:32,793][15401] Updated weights for policy 0, policy_version 201550 (0.0034) [2024-06-22 11:39:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3302244352. Throughput: 0: 42649.3. Samples: 3302400600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 11:39:35,185][15401] Updated weights for policy 0, policy_version 201560 (0.0029) [2024-06-22 11:39:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3302440960. Throughput: 0: 42541.7. Samples: 3302526240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-22 11:39:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 11:39:40,311][15401] Updated weights for policy 0, policy_version 201570 (0.0043) [2024-06-22 11:39:42,754][15401] Updated weights for policy 0, policy_version 201580 (0.0034) [2024-06-22 11:39:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42931.6). Total num frames: 3302686720. Throughput: 0: 42687.1. Samples: 3302781280. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:39:43,393][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 11:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201580_3302686720.pth... [2024-06-22 11:39:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000200953_3292413952.pth [2024-06-22 11:39:47,921][15401] Updated weights for policy 0, policy_version 201590 (0.0022) [2024-06-22 11:39:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3302866944. Throughput: 0: 43067.0. Samples: 3303055420. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:39:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 11:39:50,275][15401] Updated weights for policy 0, policy_version 201600 (0.0036) [2024-06-22 11:39:53,146][15349] Signal inference workers to stop experience collection... (48800 times) [2024-06-22 11:39:53,150][15349] Signal inference workers to resume experience collection... (48800 times) [2024-06-22 11:39:53,170][15401] InferenceWorker_p0-w0: stopping experience collection (48800 times) [2024-06-22 11:39:53,170][15401] InferenceWorker_p0-w0: resuming experience collection (48800 times) [2024-06-22 11:39:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3303096320. Throughput: 0: 42797.2. Samples: 3303174520. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:39:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 11:39:55,478][15401] Updated weights for policy 0, policy_version 201610 (0.0046) [2024-06-22 11:39:58,343][15401] Updated weights for policy 0, policy_version 201620 (0.0023) [2024-06-22 11:39:58,396][15132] Fps is (10 sec: 47483.4, 60 sec: 43140.0, 300 sec: 42986.2). Total num frames: 3303342080. Throughput: 0: 42844.7. Samples: 3303429400. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:39:58,397][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 11:40:03,062][15401] Updated weights for policy 0, policy_version 201630 (0.0046) [2024-06-22 11:40:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3303522304. Throughput: 0: 43167.3. Samples: 3303701180. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 11:40:05,766][15401] Updated weights for policy 0, policy_version 201640 (0.0034) [2024-06-22 11:40:08,389][15132] Fps is (10 sec: 39346.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 3303735296. Throughput: 0: 43075.6. Samples: 3303823240. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:08,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 11:40:10,497][15401] Updated weights for policy 0, policy_version 201650 (0.0025) [2024-06-22 11:40:13,381][15401] Updated weights for policy 0, policy_version 201660 (0.0022) [2024-06-22 11:40:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43419.4, 300 sec: 43042.7). Total num frames: 3303997440. Throughput: 0: 43135.2. Samples: 3304081260. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 11:40:17,954][15401] Updated weights for policy 0, policy_version 201670 (0.0035) [2024-06-22 11:40:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3304194048. Throughput: 0: 43255.1. Samples: 3304347080. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 11:40:20,900][15401] Updated weights for policy 0, policy_version 201680 (0.0039) [2024-06-22 11:40:23,390][15132] Fps is (10 sec: 37682.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3304374272. Throughput: 0: 43151.8. Samples: 3304468080. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 11:40:25,437][15401] Updated weights for policy 0, policy_version 201690 (0.0036) [2024-06-22 11:40:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42987.3). Total num frames: 3304636416. Throughput: 0: 43307.6. Samples: 3304730020. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 11:40:28,715][15401] Updated weights for policy 0, policy_version 201700 (0.0028) [2024-06-22 11:40:33,099][15401] Updated weights for policy 0, policy_version 201710 (0.0037) [2024-06-22 11:40:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3304833024. Throughput: 0: 42957.8. Samples: 3304988520. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 11:40:36,636][15401] Updated weights for policy 0, policy_version 201720 (0.0033) [2024-06-22 11:40:38,392][15132] Fps is (10 sec: 39312.5, 60 sec: 43142.7, 300 sec: 42765.6). Total num frames: 3305029632. Throughput: 0: 43169.3. Samples: 3305117240. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:38,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 11:40:40,658][15401] Updated weights for policy 0, policy_version 201730 (0.0047) [2024-06-22 11:40:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3305259008. Throughput: 0: 43162.1. Samples: 3305371420. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-22 11:40:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 11:40:44,344][15401] Updated weights for policy 0, policy_version 201740 (0.0029) [2024-06-22 11:40:48,366][15401] Updated weights for policy 0, policy_version 201750 (0.0043) [2024-06-22 11:40:48,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3305472000. Throughput: 0: 42949.8. Samples: 3305633920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:40:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 11:40:51,908][15401] Updated weights for policy 0, policy_version 201760 (0.0025) [2024-06-22 11:40:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3305684992. Throughput: 0: 43120.4. Samples: 3305763660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:40:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 11:40:56,017][15401] Updated weights for policy 0, policy_version 201770 (0.0054) [2024-06-22 11:40:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 3305897984. Throughput: 0: 43000.8. Samples: 3306016300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:40:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 11:41:00,008][15401] Updated weights for policy 0, policy_version 201780 (0.0039) [2024-06-22 11:41:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3306094592. Throughput: 0: 42930.7. Samples: 3306278960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 11:41:03,673][15401] Updated weights for policy 0, policy_version 201790 (0.0041) [2024-06-22 11:41:07,707][15401] Updated weights for policy 0, policy_version 201800 (0.0037) [2024-06-22 11:41:08,393][15132] Fps is (10 sec: 42582.6, 60 sec: 43141.9, 300 sec: 42931.1). Total num frames: 3306323968. Throughput: 0: 42912.2. Samples: 3306399280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:08,394][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 11:41:11,211][15401] Updated weights for policy 0, policy_version 201810 (0.0027) [2024-06-22 11:41:13,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42596.6, 300 sec: 42875.8). Total num frames: 3306553344. Throughput: 0: 42790.2. Samples: 3306655680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:13,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 11:41:15,401][15401] Updated weights for policy 0, policy_version 201820 (0.0034) [2024-06-22 11:41:18,389][15132] Fps is (10 sec: 40975.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3306733568. Throughput: 0: 42741.4. Samples: 3306911880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 11:41:18,682][15349] Signal inference workers to stop experience collection... (48850 times) [2024-06-22 11:41:18,685][15349] Signal inference workers to resume experience collection... (48850 times) [2024-06-22 11:41:18,712][15401] InferenceWorker_p0-w0: stopping experience collection (48850 times) [2024-06-22 11:41:18,712][15401] InferenceWorker_p0-w0: resuming experience collection (48850 times) [2024-06-22 11:41:18,836][15401] Updated weights for policy 0, policy_version 201830 (0.0039) [2024-06-22 11:41:23,118][15401] Updated weights for policy 0, policy_version 201840 (0.0039) [2024-06-22 11:41:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.6, 300 sec: 42820.7). Total num frames: 3306946560. Throughput: 0: 42568.5. Samples: 3307032720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 11:41:26,638][15401] Updated weights for policy 0, policy_version 201850 (0.0036) [2024-06-22 11:41:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3307192320. Throughput: 0: 42677.4. Samples: 3307291900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 11:41:30,907][15401] Updated weights for policy 0, policy_version 201860 (0.0036) [2024-06-22 11:41:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3307388928. Throughput: 0: 42617.2. Samples: 3307551700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 11:41:34,016][15401] Updated weights for policy 0, policy_version 201870 (0.0041) [2024-06-22 11:41:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 3307585536. Throughput: 0: 42448.4. Samples: 3307673840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 11:41:38,503][15401] Updated weights for policy 0, policy_version 201880 (0.0033) [2024-06-22 11:41:41,498][15401] Updated weights for policy 0, policy_version 201890 (0.0025) [2024-06-22 11:41:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3307814912. Throughput: 0: 42551.0. Samples: 3307931100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 11:41:43,574][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201894_3307831296.pth... [2024-06-22 11:41:43,618][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201265_3297525760.pth [2024-06-22 11:41:46,391][15401] Updated weights for policy 0, policy_version 201900 (0.0020) [2024-06-22 11:41:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3308027904. Throughput: 0: 42490.5. Samples: 3308191040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 11:41:49,195][15401] Updated weights for policy 0, policy_version 201910 (0.0037) [2024-06-22 11:41:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 3308208128. Throughput: 0: 42634.2. Samples: 3308317660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-22 11:41:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 11:41:53,981][15401] Updated weights for policy 0, policy_version 201920 (0.0040) [2024-06-22 11:41:56,926][15401] Updated weights for policy 0, policy_version 201930 (0.0034) [2024-06-22 11:41:58,394][15132] Fps is (10 sec: 44218.6, 60 sec: 42868.4, 300 sec: 42764.4). Total num frames: 3308470272. Throughput: 0: 42519.2. Samples: 3308569120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:41:58,394][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 11:42:01,482][15401] Updated weights for policy 0, policy_version 201940 (0.0044) [2024-06-22 11:42:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3308666880. Throughput: 0: 42617.3. Samples: 3308829660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:03,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 11:42:04,496][15401] Updated weights for policy 0, policy_version 201950 (0.0041) [2024-06-22 11:42:08,389][15132] Fps is (10 sec: 39338.3, 60 sec: 42328.0, 300 sec: 42876.1). Total num frames: 3308863488. Throughput: 0: 42759.2. Samples: 3308956880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 11:42:09,424][15401] Updated weights for policy 0, policy_version 201960 (0.0026) [2024-06-22 11:42:12,014][15401] Updated weights for policy 0, policy_version 201970 (0.0039) [2024-06-22 11:42:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 3309109248. Throughput: 0: 42544.9. Samples: 3309206420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:13,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 11:42:17,110][15401] Updated weights for policy 0, policy_version 201980 (0.0046) [2024-06-22 11:42:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 3309305856. Throughput: 0: 42649.8. Samples: 3309471040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:18,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 11:42:19,706][15401] Updated weights for policy 0, policy_version 201990 (0.0035) [2024-06-22 11:42:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3309502464. Throughput: 0: 42642.4. Samples: 3309592740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 11:42:24,930][15401] Updated weights for policy 0, policy_version 202000 (0.0032) [2024-06-22 11:42:27,371][15401] Updated weights for policy 0, policy_version 202010 (0.0043) [2024-06-22 11:42:28,389][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3309764608. Throughput: 0: 42577.8. Samples: 3309847100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 11:42:32,625][15401] Updated weights for policy 0, policy_version 202020 (0.0044) [2024-06-22 11:42:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3309944832. Throughput: 0: 42731.6. Samples: 3310113960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 11:42:34,941][15401] Updated weights for policy 0, policy_version 202030 (0.0029) [2024-06-22 11:42:38,391][15132] Fps is (10 sec: 37678.0, 60 sec: 42597.5, 300 sec: 42820.4). Total num frames: 3310141440. Throughput: 0: 42693.7. Samples: 3310238940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:38,391][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 11:42:40,214][15401] Updated weights for policy 0, policy_version 202040 (0.0031) [2024-06-22 11:42:42,556][15401] Updated weights for policy 0, policy_version 202050 (0.0031) [2024-06-22 11:42:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 3310403584. Throughput: 0: 42647.5. Samples: 3310488080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 11:42:48,023][15401] Updated weights for policy 0, policy_version 202060 (0.0037) [2024-06-22 11:42:48,389][15132] Fps is (10 sec: 42604.7, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3310567424. Throughput: 0: 42712.5. Samples: 3310751720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 11:42:48,962][15349] Signal inference workers to stop experience collection... (48900 times) [2024-06-22 11:42:48,963][15349] Signal inference workers to resume experience collection... (48900 times) [2024-06-22 11:42:49,012][15401] InferenceWorker_p0-w0: stopping experience collection (48900 times) [2024-06-22 11:42:49,012][15401] InferenceWorker_p0-w0: resuming experience collection (48900 times) [2024-06-22 11:42:50,550][15401] Updated weights for policy 0, policy_version 202070 (0.0043) [2024-06-22 11:42:53,390][15132] Fps is (10 sec: 37681.6, 60 sec: 42871.1, 300 sec: 42765.0). Total num frames: 3310780416. Throughput: 0: 42492.4. Samples: 3310869060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:53,391][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 11:42:55,894][15401] Updated weights for policy 0, policy_version 202080 (0.0029) [2024-06-22 11:42:58,099][15401] Updated weights for policy 0, policy_version 202090 (0.0030) [2024-06-22 11:42:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42874.4, 300 sec: 42765.0). Total num frames: 3311042560. Throughput: 0: 42690.5. Samples: 3311127500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 11:42:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 11:43:03,375][15401] Updated weights for policy 0, policy_version 202100 (0.0033) [2024-06-22 11:43:03,389][15132] Fps is (10 sec: 42600.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3311206400. Throughput: 0: 42911.7. Samples: 3311401960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 11:43:05,986][15401] Updated weights for policy 0, policy_version 202110 (0.0040) [2024-06-22 11:43:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3311435776. Throughput: 0: 42644.3. Samples: 3311511740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:08,390][15132] Avg episode reward: [(0, '0.138')] [2024-06-22 11:43:10,946][15401] Updated weights for policy 0, policy_version 202120 (0.0029) [2024-06-22 11:43:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3311665152. Throughput: 0: 42686.6. Samples: 3311768100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:13,393][15132] Avg episode reward: [(0, '0.230')] [2024-06-22 11:43:13,827][15401] Updated weights for policy 0, policy_version 202130 (0.0042) [2024-06-22 11:43:18,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 3311845376. Throughput: 0: 42733.1. Samples: 3312036940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 11:43:18,417][15401] Updated weights for policy 0, policy_version 202140 (0.0041) [2024-06-22 11:43:21,479][15401] Updated weights for policy 0, policy_version 202150 (0.0038) [2024-06-22 11:43:23,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3312074752. Throughput: 0: 42536.0. Samples: 3312153000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 11:43:26,022][15401] Updated weights for policy 0, policy_version 202160 (0.0041) [2024-06-22 11:43:28,396][15132] Fps is (10 sec: 45845.0, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 3312304128. Throughput: 0: 42660.6. Samples: 3312408080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:28,396][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 11:43:29,411][15401] Updated weights for policy 0, policy_version 202170 (0.0048) [2024-06-22 11:43:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3312484352. Throughput: 0: 42696.3. Samples: 3312673060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:33,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 11:43:33,835][15401] Updated weights for policy 0, policy_version 202180 (0.0042) [2024-06-22 11:43:37,131][15401] Updated weights for policy 0, policy_version 202190 (0.0027) [2024-06-22 11:43:38,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42872.4, 300 sec: 42709.5). Total num frames: 3312713728. Throughput: 0: 42656.0. Samples: 3312788560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 11:43:41,459][15401] Updated weights for policy 0, policy_version 202200 (0.0047) [2024-06-22 11:43:43,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 3312943104. Throughput: 0: 42635.6. Samples: 3313046200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:43,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 11:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202206_3312943104.pth... [2024-06-22 11:43:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201580_3302686720.pth [2024-06-22 11:43:44,924][15401] Updated weights for policy 0, policy_version 202210 (0.0041) [2024-06-22 11:43:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3313123328. Throughput: 0: 42287.5. Samples: 3313304900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 11:43:49,064][15401] Updated weights for policy 0, policy_version 202220 (0.0037) [2024-06-22 11:43:52,770][15401] Updated weights for policy 0, policy_version 202230 (0.0044) [2024-06-22 11:43:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 3313352704. Throughput: 0: 42535.1. Samples: 3313425820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:53,394][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 11:43:56,646][15401] Updated weights for policy 0, policy_version 202240 (0.0032) [2024-06-22 11:43:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 3313565696. Throughput: 0: 42453.1. Samples: 3313678380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:43:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 11:44:00,517][15401] Updated weights for policy 0, policy_version 202250 (0.0036) [2024-06-22 11:44:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3313762304. Throughput: 0: 42211.3. Samples: 3313936460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:44:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 11:44:04,397][15401] Updated weights for policy 0, policy_version 202260 (0.0032) [2024-06-22 11:44:08,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 3313975296. Throughput: 0: 42359.0. Samples: 3314059160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 11:44:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 11:44:08,409][15401] Updated weights for policy 0, policy_version 202270 (0.0040) [2024-06-22 11:44:10,257][15349] Signal inference workers to stop experience collection... (48950 times) [2024-06-22 11:44:10,257][15349] Signal inference workers to resume experience collection... (48950 times) [2024-06-22 11:44:10,316][15401] InferenceWorker_p0-w0: stopping experience collection (48950 times) [2024-06-22 11:44:10,317][15401] InferenceWorker_p0-w0: resuming experience collection (48950 times) [2024-06-22 11:44:12,578][15401] Updated weights for policy 0, policy_version 202280 (0.0058) [2024-06-22 11:44:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 3314204672. Throughput: 0: 42480.2. Samples: 3314319420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 11:44:16,122][15401] Updated weights for policy 0, policy_version 202290 (0.0034) [2024-06-22 11:44:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3314401280. Throughput: 0: 42102.7. Samples: 3314567780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:18,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 11:44:20,380][15401] Updated weights for policy 0, policy_version 202300 (0.0038) [2024-06-22 11:44:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3314614272. Throughput: 0: 42284.9. Samples: 3314691380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 11:44:23,804][15401] Updated weights for policy 0, policy_version 202310 (0.0045) [2024-06-22 11:44:27,991][15401] Updated weights for policy 0, policy_version 202320 (0.0040) [2024-06-22 11:44:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42056.8, 300 sec: 42654.0). Total num frames: 3314827264. Throughput: 0: 42353.4. Samples: 3314952000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:28,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-22 11:44:31,671][15401] Updated weights for policy 0, policy_version 202330 (0.0034) [2024-06-22 11:44:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 3315023872. Throughput: 0: 42313.5. Samples: 3315209000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 11:44:35,645][15401] Updated weights for policy 0, policy_version 202340 (0.0026) [2024-06-22 11:44:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 3315269632. Throughput: 0: 42357.4. Samples: 3315331900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 11:44:39,313][15401] Updated weights for policy 0, policy_version 202350 (0.0041) [2024-06-22 11:44:43,252][15401] Updated weights for policy 0, policy_version 202360 (0.0039) [2024-06-22 11:44:43,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 3315466240. Throughput: 0: 42585.2. Samples: 3315594720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:43,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-22 11:44:46,915][15401] Updated weights for policy 0, policy_version 202370 (0.0031) [2024-06-22 11:44:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3315662848. Throughput: 0: 42564.8. Samples: 3315851880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 11:44:50,729][15401] Updated weights for policy 0, policy_version 202380 (0.0028) [2024-06-22 11:44:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 3315908608. Throughput: 0: 42556.0. Samples: 3315974180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 11:44:55,478][15401] Updated weights for policy 0, policy_version 202390 (0.0040) [2024-06-22 11:44:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3316105216. Throughput: 0: 42476.6. Samples: 3316230860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:44:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 11:44:58,642][15401] Updated weights for policy 0, policy_version 202400 (0.0043) [2024-06-22 11:45:02,995][15401] Updated weights for policy 0, policy_version 202410 (0.0031) [2024-06-22 11:45:03,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 3316301824. Throughput: 0: 42843.2. Samples: 3316495720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:45:03,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 11:45:06,220][15401] Updated weights for policy 0, policy_version 202420 (0.0039) [2024-06-22 11:45:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 3316563968. Throughput: 0: 42861.8. Samples: 3316620160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:45:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 11:45:10,378][15401] Updated weights for policy 0, policy_version 202430 (0.0033) [2024-06-22 11:45:13,391][15132] Fps is (10 sec: 45878.6, 60 sec: 42597.3, 300 sec: 42598.2). Total num frames: 3316760576. Throughput: 0: 42864.6. Samples: 3316880980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 11:45:13,392][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 11:45:13,492][15401] Updated weights for policy 0, policy_version 202440 (0.0035) [2024-06-22 11:45:18,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 3316924416. Throughput: 0: 42945.2. Samples: 3317141540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 11:45:18,417][15401] Updated weights for policy 0, policy_version 202450 (0.0046) [2024-06-22 11:45:21,035][15401] Updated weights for policy 0, policy_version 202460 (0.0029) [2024-06-22 11:45:23,389][15132] Fps is (10 sec: 44244.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 3317202944. Throughput: 0: 42760.9. Samples: 3317256140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 11:45:26,125][15401] Updated weights for policy 0, policy_version 202470 (0.0054) [2024-06-22 11:45:27,326][15349] Signal inference workers to stop experience collection... (49000 times) [2024-06-22 11:45:27,365][15401] InferenceWorker_p0-w0: stopping experience collection (49000 times) [2024-06-22 11:45:27,374][15349] Signal inference workers to resume experience collection... (49000 times) [2024-06-22 11:45:27,385][15401] InferenceWorker_p0-w0: resuming experience collection (49000 times) [2024-06-22 11:45:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3317399552. Throughput: 0: 42805.8. Samples: 3317520980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 11:45:28,734][15401] Updated weights for policy 0, policy_version 202480 (0.0030) [2024-06-22 11:45:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.2, 300 sec: 42543.2). Total num frames: 3317579776. Throughput: 0: 42740.5. Samples: 3317775200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:33,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 11:45:33,685][15401] Updated weights for policy 0, policy_version 202490 (0.0036) [2024-06-22 11:45:36,479][15401] Updated weights for policy 0, policy_version 202500 (0.0033) [2024-06-22 11:45:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 3317825536. Throughput: 0: 42663.6. Samples: 3317894140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:38,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 11:45:41,429][15401] Updated weights for policy 0, policy_version 202510 (0.0033) [2024-06-22 11:45:43,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3318054912. Throughput: 0: 42914.1. Samples: 3318162000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 11:45:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202518_3318054912.pth... [2024-06-22 11:45:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000201894_3307831296.pth [2024-06-22 11:45:44,190][15401] Updated weights for policy 0, policy_version 202520 (0.0040) [2024-06-22 11:45:48,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 3318218752. Throughput: 0: 42662.4. Samples: 3318415420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 11:45:48,910][15401] Updated weights for policy 0, policy_version 202530 (0.0039) [2024-06-22 11:45:51,775][15401] Updated weights for policy 0, policy_version 202540 (0.0036) [2024-06-22 11:45:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3318464512. Throughput: 0: 42723.6. Samples: 3318542720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 11:45:56,492][15401] Updated weights for policy 0, policy_version 202550 (0.0032) [2024-06-22 11:45:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3318661120. Throughput: 0: 42754.0. Samples: 3318804840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:45:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 11:45:59,429][15401] Updated weights for policy 0, policy_version 202560 (0.0043) [2024-06-22 11:46:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42543.4). Total num frames: 3318874112. Throughput: 0: 42523.9. Samples: 3319055120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:46:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 11:46:04,285][15401] Updated weights for policy 0, policy_version 202570 (0.0048) [2024-06-22 11:46:07,059][15401] Updated weights for policy 0, policy_version 202580 (0.0026) [2024-06-22 11:46:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 3319119872. Throughput: 0: 42892.4. Samples: 3319186300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:46:08,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 11:46:11,947][15401] Updated weights for policy 0, policy_version 202590 (0.0038) [2024-06-22 11:46:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42326.6, 300 sec: 42598.4). Total num frames: 3319300096. Throughput: 0: 42859.7. Samples: 3319449660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:46:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 11:46:14,658][15401] Updated weights for policy 0, policy_version 202600 (0.0031) [2024-06-22 11:46:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 3319529472. Throughput: 0: 42700.0. Samples: 3319696700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-22 11:46:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 11:46:19,540][15401] Updated weights for policy 0, policy_version 202610 (0.0039) [2024-06-22 11:46:22,343][15401] Updated weights for policy 0, policy_version 202620 (0.0031) [2024-06-22 11:46:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3319758848. Throughput: 0: 43028.2. Samples: 3319830300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 11:46:27,099][15401] Updated weights for policy 0, policy_version 202630 (0.0039) [2024-06-22 11:46:28,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 3319939072. Throughput: 0: 42774.2. Samples: 3320086940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:28,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 11:46:29,972][15401] Updated weights for policy 0, policy_version 202640 (0.0041) [2024-06-22 11:46:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 3320184832. Throughput: 0: 42567.3. Samples: 3320330960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 11:46:34,747][15401] Updated weights for policy 0, policy_version 202650 (0.0045) [2024-06-22 11:46:36,874][15349] Signal inference workers to stop experience collection... (49050 times) [2024-06-22 11:46:36,932][15401] InferenceWorker_p0-w0: stopping experience collection (49050 times) [2024-06-22 11:46:36,986][15349] Signal inference workers to resume experience collection... (49050 times) [2024-06-22 11:46:36,986][15401] InferenceWorker_p0-w0: resuming experience collection (49050 times) [2024-06-22 11:46:37,584][15401] Updated weights for policy 0, policy_version 202660 (0.0033) [2024-06-22 11:46:38,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 3320381440. Throughput: 0: 42820.1. Samples: 3320469620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 11:46:42,187][15401] Updated weights for policy 0, policy_version 202670 (0.0032) [2024-06-22 11:46:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 3320561664. Throughput: 0: 42526.1. Samples: 3320718520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 11:46:45,654][15401] Updated weights for policy 0, policy_version 202680 (0.0030) [2024-06-22 11:46:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3320807424. Throughput: 0: 42598.8. Samples: 3320972060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 11:46:49,751][15401] Updated weights for policy 0, policy_version 202690 (0.0029) [2024-06-22 11:46:53,293][15401] Updated weights for policy 0, policy_version 202700 (0.0026) [2024-06-22 11:46:53,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42599.0). Total num frames: 3321036800. Throughput: 0: 42703.1. Samples: 3321107940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 11:46:57,303][15401] Updated weights for policy 0, policy_version 202710 (0.0035) [2024-06-22 11:46:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3321217024. Throughput: 0: 42480.8. Samples: 3321361300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:46:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 11:47:00,951][15401] Updated weights for policy 0, policy_version 202720 (0.0031) [2024-06-22 11:47:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 3321462784. Throughput: 0: 42679.6. Samples: 3321617380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:03,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 11:47:05,078][15401] Updated weights for policy 0, policy_version 202730 (0.0044) [2024-06-22 11:47:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3321659392. Throughput: 0: 42653.3. Samples: 3321749700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 11:47:08,543][15401] Updated weights for policy 0, policy_version 202740 (0.0043) [2024-06-22 11:47:12,698][15401] Updated weights for policy 0, policy_version 202750 (0.0032) [2024-06-22 11:47:13,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 3321872384. Throughput: 0: 42540.3. Samples: 3322001160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 11:47:16,202][15401] Updated weights for policy 0, policy_version 202760 (0.0043) [2024-06-22 11:47:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3322101760. Throughput: 0: 42726.7. Samples: 3322253760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:18,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 11:47:20,229][15401] Updated weights for policy 0, policy_version 202770 (0.0032) [2024-06-22 11:47:23,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 3322314752. Throughput: 0: 42543.0. Samples: 3322384060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:23,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 11:47:23,798][15401] Updated weights for policy 0, policy_version 202780 (0.0036) [2024-06-22 11:47:27,975][15401] Updated weights for policy 0, policy_version 202790 (0.0031) [2024-06-22 11:47:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 3322511360. Throughput: 0: 42641.0. Samples: 3322637360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 11:47:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 11:47:31,549][15401] Updated weights for policy 0, policy_version 202800 (0.0027) [2024-06-22 11:47:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.7). Total num frames: 3322740736. Throughput: 0: 42844.1. Samples: 3322900040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 11:47:35,411][15401] Updated weights for policy 0, policy_version 202810 (0.0031) [2024-06-22 11:47:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 3322937344. Throughput: 0: 42602.3. Samples: 3323025040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 11:47:39,192][15401] Updated weights for policy 0, policy_version 202820 (0.0040) [2024-06-22 11:47:43,151][15401] Updated weights for policy 0, policy_version 202830 (0.0042) [2024-06-22 11:47:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 3323166720. Throughput: 0: 42693.4. Samples: 3323282500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:43,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 11:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202830_3323166720.pth... [2024-06-22 11:47:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202206_3312943104.pth [2024-06-22 11:47:47,180][15401] Updated weights for policy 0, policy_version 202840 (0.0038) [2024-06-22 11:47:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42653.7). Total num frames: 3323363328. Throughput: 0: 42641.3. Samples: 3323536240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:48,392][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 11:47:50,699][15401] Updated weights for policy 0, policy_version 202850 (0.0042) [2024-06-22 11:47:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3323592704. Throughput: 0: 42574.6. Samples: 3323665560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 11:47:55,057][15401] Updated weights for policy 0, policy_version 202860 (0.0040) [2024-06-22 11:47:55,458][15349] Signal inference workers to stop experience collection... (49100 times) [2024-06-22 11:47:55,458][15349] Signal inference workers to resume experience collection... (49100 times) [2024-06-22 11:47:55,493][15401] InferenceWorker_p0-w0: stopping experience collection (49100 times) [2024-06-22 11:47:55,494][15401] InferenceWorker_p0-w0: resuming experience collection (49100 times) [2024-06-22 11:47:58,317][15401] Updated weights for policy 0, policy_version 202870 (0.0031) [2024-06-22 11:47:58,392][15132] Fps is (10 sec: 45875.2, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 3323822080. Throughput: 0: 42645.8. Samples: 3323920320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:47:58,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 11:48:02,870][15401] Updated weights for policy 0, policy_version 202880 (0.0033) [2024-06-22 11:48:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 3324002304. Throughput: 0: 42861.9. Samples: 3324182440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 11:48:05,974][15401] Updated weights for policy 0, policy_version 202890 (0.0036) [2024-06-22 11:48:08,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 3324248064. Throughput: 0: 42726.2. Samples: 3324306740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 11:48:10,407][15401] Updated weights for policy 0, policy_version 202900 (0.0031) [2024-06-22 11:48:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3324461056. Throughput: 0: 42788.9. Samples: 3324562860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:13,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 11:48:14,146][15401] Updated weights for policy 0, policy_version 202910 (0.0048) [2024-06-22 11:48:18,379][15401] Updated weights for policy 0, policy_version 202920 (0.0038) [2024-06-22 11:48:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 3324641280. Throughput: 0: 42787.9. Samples: 3324825500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 11:48:21,636][15401] Updated weights for policy 0, policy_version 202930 (0.0031) [2024-06-22 11:48:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 3324870656. Throughput: 0: 42629.2. Samples: 3324943360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 11:48:25,941][15401] Updated weights for policy 0, policy_version 202940 (0.0029) [2024-06-22 11:48:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3325116416. Throughput: 0: 42764.3. Samples: 3325206900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 11:48:29,079][15401] Updated weights for policy 0, policy_version 202950 (0.0028) [2024-06-22 11:48:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3325280256. Throughput: 0: 43028.1. Samples: 3325472400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 11:48:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 11:48:33,449][15401] Updated weights for policy 0, policy_version 202960 (0.0040) [2024-06-22 11:48:36,555][15401] Updated weights for policy 0, policy_version 202970 (0.0028) [2024-06-22 11:48:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 3325509632. Throughput: 0: 42837.0. Samples: 3325593220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:48:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 11:48:41,126][15401] Updated weights for policy 0, policy_version 202980 (0.0031) [2024-06-22 11:48:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3325739008. Throughput: 0: 42996.9. Samples: 3325855080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:48:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 11:48:44,547][15401] Updated weights for policy 0, policy_version 202990 (0.0024) [2024-06-22 11:48:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 3325935616. Throughput: 0: 42952.0. Samples: 3326115280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:48:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 11:48:49,085][15401] Updated weights for policy 0, policy_version 203000 (0.0033) [2024-06-22 11:48:51,980][15401] Updated weights for policy 0, policy_version 203010 (0.0029) [2024-06-22 11:48:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3326181376. Throughput: 0: 42827.1. Samples: 3326233960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:48:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 11:48:56,734][15401] Updated weights for policy 0, policy_version 203020 (0.0038) [2024-06-22 11:48:58,258][15349] Signal inference workers to stop experience collection... (49150 times) [2024-06-22 11:48:58,309][15401] InferenceWorker_p0-w0: stopping experience collection (49150 times) [2024-06-22 11:48:58,318][15349] Signal inference workers to resume experience collection... (49150 times) [2024-06-22 11:48:58,330][15401] InferenceWorker_p0-w0: resuming experience collection (49150 times) [2024-06-22 11:48:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 3326394368. Throughput: 0: 43079.9. Samples: 3326501460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:48:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 11:48:59,780][15401] Updated weights for policy 0, policy_version 203030 (0.0036) [2024-06-22 11:49:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3326558208. Throughput: 0: 43023.0. Samples: 3326761540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 11:49:04,279][15401] Updated weights for policy 0, policy_version 203040 (0.0033) [2024-06-22 11:49:07,289][15401] Updated weights for policy 0, policy_version 203050 (0.0041) [2024-06-22 11:49:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3326803968. Throughput: 0: 42964.0. Samples: 3326876840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:08,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 11:49:11,746][15401] Updated weights for policy 0, policy_version 203060 (0.0024) [2024-06-22 11:49:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 3327033344. Throughput: 0: 43008.6. Samples: 3327142280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 11:49:14,841][15401] Updated weights for policy 0, policy_version 203070 (0.0041) [2024-06-22 11:49:18,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 3327197184. Throughput: 0: 42799.5. Samples: 3327398480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:18,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 11:49:19,370][15401] Updated weights for policy 0, policy_version 203080 (0.0030) [2024-06-22 11:49:22,290][15401] Updated weights for policy 0, policy_version 203090 (0.0032) [2024-06-22 11:49:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3327442944. Throughput: 0: 42794.5. Samples: 3327518980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 11:49:26,977][15401] Updated weights for policy 0, policy_version 203100 (0.0045) [2024-06-22 11:49:28,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3327672320. Throughput: 0: 42923.7. Samples: 3327786640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 11:49:29,923][15401] Updated weights for policy 0, policy_version 203110 (0.0036) [2024-06-22 11:49:33,396][15132] Fps is (10 sec: 40934.5, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 3327852544. Throughput: 0: 42862.8. Samples: 3328044380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:33,405][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 11:49:34,582][15401] Updated weights for policy 0, policy_version 203120 (0.0040) [2024-06-22 11:49:37,693][15401] Updated weights for policy 0, policy_version 203130 (0.0034) [2024-06-22 11:49:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3328098304. Throughput: 0: 42861.8. Samples: 3328162740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 11:49:42,197][15401] Updated weights for policy 0, policy_version 203140 (0.0033) [2024-06-22 11:49:43,390][15132] Fps is (10 sec: 45904.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3328311296. Throughput: 0: 42755.1. Samples: 3328425440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 11:49:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 11:49:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203144_3328311296.pth... [2024-06-22 11:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202518_3318054912.pth [2024-06-22 11:49:45,601][15401] Updated weights for policy 0, policy_version 203150 (0.0033) [2024-06-22 11:49:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3328491520. Throughput: 0: 42652.8. Samples: 3328680920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:49:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 11:49:49,833][15401] Updated weights for policy 0, policy_version 203160 (0.0027) [2024-06-22 11:49:53,120][15401] Updated weights for policy 0, policy_version 203170 (0.0031) [2024-06-22 11:49:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3328753664. Throughput: 0: 42882.2. Samples: 3328806440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:49:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 11:49:57,394][15401] Updated weights for policy 0, policy_version 203180 (0.0038) [2024-06-22 11:49:58,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 3328950272. Throughput: 0: 42832.8. Samples: 3329069760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:49:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 11:50:00,531][15401] Updated weights for policy 0, policy_version 203190 (0.0038) [2024-06-22 11:50:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3329146880. Throughput: 0: 42893.8. Samples: 3329328600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 11:50:04,947][15401] Updated weights for policy 0, policy_version 203200 (0.0038) [2024-06-22 11:50:08,138][15401] Updated weights for policy 0, policy_version 203210 (0.0031) [2024-06-22 11:50:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42820.8). Total num frames: 3329392640. Throughput: 0: 42961.0. Samples: 3329452220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 11:50:12,793][15401] Updated weights for policy 0, policy_version 203220 (0.0042) [2024-06-22 11:50:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 3329589248. Throughput: 0: 42792.3. Samples: 3329712400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:13,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:50:16,015][15401] Updated weights for policy 0, policy_version 203230 (0.0028) [2024-06-22 11:50:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 3329769472. Throughput: 0: 42761.7. Samples: 3329968380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 11:50:18,445][15349] Signal inference workers to stop experience collection... (49200 times) [2024-06-22 11:50:18,452][15349] Signal inference workers to resume experience collection... (49200 times) [2024-06-22 11:50:18,492][15401] InferenceWorker_p0-w0: stopping experience collection (49200 times) [2024-06-22 11:50:18,492][15401] InferenceWorker_p0-w0: resuming experience collection (49200 times) [2024-06-22 11:50:20,353][15401] Updated weights for policy 0, policy_version 203240 (0.0031) [2024-06-22 11:50:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3330015232. Throughput: 0: 42796.4. Samples: 3330088580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 11:50:23,846][15401] Updated weights for policy 0, policy_version 203250 (0.0038) [2024-06-22 11:50:27,946][15401] Updated weights for policy 0, policy_version 203260 (0.0029) [2024-06-22 11:50:28,396][15132] Fps is (10 sec: 45845.3, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 3330228224. Throughput: 0: 42821.5. Samples: 3330352680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:28,396][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 11:50:31,262][15401] Updated weights for policy 0, policy_version 203270 (0.0038) [2024-06-22 11:50:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42876.0, 300 sec: 42709.8). Total num frames: 3330424832. Throughput: 0: 42760.5. Samples: 3330605140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 11:50:35,468][15401] Updated weights for policy 0, policy_version 203280 (0.0029) [2024-06-22 11:50:38,390][15132] Fps is (10 sec: 44264.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3330670592. Throughput: 0: 42832.0. Samples: 3330733880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 11:50:39,395][15401] Updated weights for policy 0, policy_version 203290 (0.0032) [2024-06-22 11:50:43,071][15401] Updated weights for policy 0, policy_version 203300 (0.0031) [2024-06-22 11:50:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3330883584. Throughput: 0: 42904.8. Samples: 3331000480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:50:46,713][15401] Updated weights for policy 0, policy_version 203310 (0.0033) [2024-06-22 11:50:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3331080192. Throughput: 0: 42788.9. Samples: 3331254100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 11:50:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 11:50:50,613][15401] Updated weights for policy 0, policy_version 203320 (0.0037) [2024-06-22 11:50:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3331325952. Throughput: 0: 42954.1. Samples: 3331385160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:50:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 11:50:54,113][15401] Updated weights for policy 0, policy_version 203330 (0.0029) [2024-06-22 11:50:58,102][15401] Updated weights for policy 0, policy_version 203340 (0.0033) [2024-06-22 11:50:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3331522560. Throughput: 0: 43176.4. Samples: 3331655240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:50:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 11:51:01,512][15401] Updated weights for policy 0, policy_version 203350 (0.0032) [2024-06-22 11:51:03,391][15132] Fps is (10 sec: 42593.9, 60 sec: 43416.7, 300 sec: 42820.4). Total num frames: 3331751936. Throughput: 0: 43230.7. Samples: 3331913820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:03,391][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 11:51:05,639][15401] Updated weights for policy 0, policy_version 203360 (0.0042) [2024-06-22 11:51:08,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3331981312. Throughput: 0: 43455.3. Samples: 3332044060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 11:51:09,135][15401] Updated weights for policy 0, policy_version 203370 (0.0028) [2024-06-22 11:51:13,237][15401] Updated weights for policy 0, policy_version 203380 (0.0039) [2024-06-22 11:51:13,392][15132] Fps is (10 sec: 42593.5, 60 sec: 43144.6, 300 sec: 42875.8). Total num frames: 3332177920. Throughput: 0: 43424.3. Samples: 3332306600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:13,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 11:51:16,770][15401] Updated weights for policy 0, policy_version 203390 (0.0036) [2024-06-22 11:51:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43963.7, 300 sec: 42876.1). Total num frames: 3332407296. Throughput: 0: 43384.5. Samples: 3332557440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 11:51:20,975][15401] Updated weights for policy 0, policy_version 203400 (0.0027) [2024-06-22 11:51:23,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 3332620288. Throughput: 0: 43432.1. Samples: 3332688320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 11:51:24,264][15401] Updated weights for policy 0, policy_version 203410 (0.0029) [2024-06-22 11:51:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 3332800512. Throughput: 0: 43271.2. Samples: 3332947680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 11:51:28,793][15401] Updated weights for policy 0, policy_version 203420 (0.0037) [2024-06-22 11:51:30,492][15349] Signal inference workers to stop experience collection... (49250 times) [2024-06-22 11:51:30,492][15349] Signal inference workers to resume experience collection... (49250 times) [2024-06-22 11:51:30,512][15401] InferenceWorker_p0-w0: stopping experience collection (49250 times) [2024-06-22 11:51:30,544][15401] InferenceWorker_p0-w0: resuming experience collection (49250 times) [2024-06-22 11:51:31,792][15401] Updated weights for policy 0, policy_version 203430 (0.0037) [2024-06-22 11:51:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 3333046272. Throughput: 0: 43283.6. Samples: 3333201860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 11:51:36,284][15401] Updated weights for policy 0, policy_version 203440 (0.0035) [2024-06-22 11:51:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 3333259264. Throughput: 0: 43361.2. Samples: 3333336400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 11:51:39,479][15401] Updated weights for policy 0, policy_version 203450 (0.0037) [2024-06-22 11:51:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3333455872. Throughput: 0: 43059.2. Samples: 3333592900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 11:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203458_3333455872.pth... [2024-06-22 11:51:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000202830_3323166720.pth [2024-06-22 11:51:43,911][15401] Updated weights for policy 0, policy_version 203460 (0.0032) [2024-06-22 11:51:47,144][15401] Updated weights for policy 0, policy_version 203470 (0.0031) [2024-06-22 11:51:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 3333701632. Throughput: 0: 42848.2. Samples: 3333841940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 11:51:51,633][15401] Updated weights for policy 0, policy_version 203480 (0.0042) [2024-06-22 11:51:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3333898240. Throughput: 0: 42934.1. Samples: 3333976100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 11:51:54,851][15401] Updated weights for policy 0, policy_version 203490 (0.0029) [2024-06-22 11:51:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 3334111232. Throughput: 0: 42957.4. Samples: 3334239580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 11:51:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 11:51:59,294][15401] Updated weights for policy 0, policy_version 203500 (0.0028) [2024-06-22 11:52:02,356][15401] Updated weights for policy 0, policy_version 203510 (0.0034) [2024-06-22 11:52:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42872.5, 300 sec: 42931.6). Total num frames: 3334324224. Throughput: 0: 42964.1. Samples: 3334490820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 11:52:06,754][15401] Updated weights for policy 0, policy_version 203520 (0.0025) [2024-06-22 11:52:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42931.7). Total num frames: 3334537216. Throughput: 0: 42970.7. Samples: 3334622000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 11:52:09,863][15401] Updated weights for policy 0, policy_version 203530 (0.0023) [2024-06-22 11:52:13,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 3334750208. Throughput: 0: 42867.6. Samples: 3334876720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 11:52:14,665][15401] Updated weights for policy 0, policy_version 203540 (0.0035) [2024-06-22 11:52:17,587][15401] Updated weights for policy 0, policy_version 203550 (0.0026) [2024-06-22 11:52:18,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3334979584. Throughput: 0: 42897.9. Samples: 3335132280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 11:52:22,542][15401] Updated weights for policy 0, policy_version 203560 (0.0037) [2024-06-22 11:52:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3335176192. Throughput: 0: 42970.1. Samples: 3335270060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:23,396][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 11:52:25,319][15401] Updated weights for policy 0, policy_version 203570 (0.0038) [2024-06-22 11:52:28,389][15132] Fps is (10 sec: 40961.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3335389184. Throughput: 0: 42753.0. Samples: 3335516780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 11:52:30,002][15401] Updated weights for policy 0, policy_version 203580 (0.0046) [2024-06-22 11:52:32,847][15401] Updated weights for policy 0, policy_version 203590 (0.0023) [2024-06-22 11:52:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3335618560. Throughput: 0: 42986.0. Samples: 3335776300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 11:52:37,510][15401] Updated weights for policy 0, policy_version 203600 (0.0031) [2024-06-22 11:52:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3335815168. Throughput: 0: 43025.0. Samples: 3335912220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 11:52:40,852][15401] Updated weights for policy 0, policy_version 203610 (0.0031) [2024-06-22 11:52:43,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42932.0). Total num frames: 3336028160. Throughput: 0: 42677.1. Samples: 3336160060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 11:52:44,392][15349] Signal inference workers to stop experience collection... (49300 times) [2024-06-22 11:52:44,392][15349] Signal inference workers to resume experience collection... (49300 times) [2024-06-22 11:52:44,416][15401] InferenceWorker_p0-w0: stopping experience collection (49300 times) [2024-06-22 11:52:44,416][15401] InferenceWorker_p0-w0: resuming experience collection (49300 times) [2024-06-22 11:52:45,311][15401] Updated weights for policy 0, policy_version 203620 (0.0044) [2024-06-22 11:52:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 3336257536. Throughput: 0: 42761.6. Samples: 3336415200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:48,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 11:52:48,749][15401] Updated weights for policy 0, policy_version 203630 (0.0027) [2024-06-22 11:52:52,813][15401] Updated weights for policy 0, policy_version 203640 (0.0033) [2024-06-22 11:52:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 3336454144. Throughput: 0: 42840.3. Samples: 3336549820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 11:52:56,481][15401] Updated weights for policy 0, policy_version 203650 (0.0030) [2024-06-22 11:52:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3336683520. Throughput: 0: 42924.0. Samples: 3336808300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:52:58,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-22 11:53:00,283][15401] Updated weights for policy 0, policy_version 203660 (0.0031) [2024-06-22 11:53:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3336896512. Throughput: 0: 42964.2. Samples: 3337065660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 11:53:03,396][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 11:53:03,910][15401] Updated weights for policy 0, policy_version 203670 (0.0046) [2024-06-22 11:53:07,731][15401] Updated weights for policy 0, policy_version 203680 (0.0031) [2024-06-22 11:53:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3337109504. Throughput: 0: 42919.2. Samples: 3337201420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 11:53:11,369][15401] Updated weights for policy 0, policy_version 203690 (0.0042) [2024-06-22 11:53:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3337306112. Throughput: 0: 43074.2. Samples: 3337455120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 11:53:15,400][15401] Updated weights for policy 0, policy_version 203700 (0.0047) [2024-06-22 11:53:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3337551872. Throughput: 0: 42816.3. Samples: 3337703040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 11:53:19,264][15401] Updated weights for policy 0, policy_version 203710 (0.0041) [2024-06-22 11:53:22,996][15401] Updated weights for policy 0, policy_version 203720 (0.0029) [2024-06-22 11:53:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3337748480. Throughput: 0: 42846.7. Samples: 3337840320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 11:53:27,142][15401] Updated weights for policy 0, policy_version 203730 (0.0040) [2024-06-22 11:53:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3337945088. Throughput: 0: 42942.0. Samples: 3338092440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 11:53:30,487][15401] Updated weights for policy 0, policy_version 203740 (0.0027) [2024-06-22 11:53:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3338207232. Throughput: 0: 42913.0. Samples: 3338346180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 11:53:34,660][15401] Updated weights for policy 0, policy_version 203750 (0.0030) [2024-06-22 11:53:38,205][15401] Updated weights for policy 0, policy_version 203760 (0.0037) [2024-06-22 11:53:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 3338403840. Throughput: 0: 43026.0. Samples: 3338485980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 11:53:42,085][15401] Updated weights for policy 0, policy_version 203770 (0.0031) [2024-06-22 11:53:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 3338616832. Throughput: 0: 42962.0. Samples: 3338741700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:43,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 11:53:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203773_3338616832.pth... [2024-06-22 11:53:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203144_3328311296.pth [2024-06-22 11:53:46,097][15401] Updated weights for policy 0, policy_version 203780 (0.0031) [2024-06-22 11:53:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 3338846208. Throughput: 0: 42772.4. Samples: 3338990420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 11:53:49,604][15401] Updated weights for policy 0, policy_version 203790 (0.0038) [2024-06-22 11:53:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3339026432. Throughput: 0: 42743.5. Samples: 3339124880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 11:53:53,441][15349] Signal inference workers to stop experience collection... (49350 times) [2024-06-22 11:53:53,441][15349] Signal inference workers to resume experience collection... (49350 times) [2024-06-22 11:53:53,459][15401] InferenceWorker_p0-w0: stopping experience collection (49350 times) [2024-06-22 11:53:53,459][15401] InferenceWorker_p0-w0: resuming experience collection (49350 times) [2024-06-22 11:53:53,590][15401] Updated weights for policy 0, policy_version 203800 (0.0028) [2024-06-22 11:53:57,077][15401] Updated weights for policy 0, policy_version 203810 (0.0033) [2024-06-22 11:53:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3339255808. Throughput: 0: 42692.5. Samples: 3339376280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:53:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 11:54:01,173][15401] Updated weights for policy 0, policy_version 203820 (0.0030) [2024-06-22 11:54:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 3339485184. Throughput: 0: 42904.0. Samples: 3339633720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:54:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 11:54:04,662][15401] Updated weights for policy 0, policy_version 203830 (0.0026) [2024-06-22 11:54:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3339649024. Throughput: 0: 42777.4. Samples: 3339765300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:54:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 11:54:08,895][15401] Updated weights for policy 0, policy_version 203840 (0.0029) [2024-06-22 11:54:12,550][15401] Updated weights for policy 0, policy_version 203850 (0.0036) [2024-06-22 11:54:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 3339894784. Throughput: 0: 42831.0. Samples: 3340019840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 11:54:13,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 11:54:16,476][15401] Updated weights for policy 0, policy_version 203860 (0.0039) [2024-06-22 11:54:18,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3340124160. Throughput: 0: 42798.7. Samples: 3340272120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 11:54:20,265][15401] Updated weights for policy 0, policy_version 203870 (0.0028) [2024-06-22 11:54:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3340304384. Throughput: 0: 42680.0. Samples: 3340406580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 11:54:24,030][15401] Updated weights for policy 0, policy_version 203880 (0.0038) [2024-06-22 11:54:27,781][15401] Updated weights for policy 0, policy_version 203890 (0.0041) [2024-06-22 11:54:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 43043.7). Total num frames: 3340550144. Throughput: 0: 42674.9. Samples: 3340661960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 11:54:31,799][15401] Updated weights for policy 0, policy_version 203900 (0.0044) [2024-06-22 11:54:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3340779520. Throughput: 0: 42774.6. Samples: 3340915280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 11:54:35,439][15401] Updated weights for policy 0, policy_version 203910 (0.0028) [2024-06-22 11:54:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 3340943360. Throughput: 0: 42772.7. Samples: 3341049660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:38,396][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 11:54:39,407][15401] Updated weights for policy 0, policy_version 203920 (0.0035) [2024-06-22 11:54:43,039][15401] Updated weights for policy 0, policy_version 203930 (0.0033) [2024-06-22 11:54:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 43098.3). Total num frames: 3341205504. Throughput: 0: 42814.5. Samples: 3341302940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:43,394][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 11:54:47,157][15401] Updated weights for policy 0, policy_version 203940 (0.0043) [2024-06-22 11:54:48,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3341418496. Throughput: 0: 42764.8. Samples: 3341558140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 11:54:50,650][15401] Updated weights for policy 0, policy_version 203950 (0.0035) [2024-06-22 11:54:53,389][15132] Fps is (10 sec: 36045.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3341565952. Throughput: 0: 42678.2. Samples: 3341685820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 11:54:55,003][15401] Updated weights for policy 0, policy_version 203960 (0.0026) [2024-06-22 11:54:58,240][15401] Updated weights for policy 0, policy_version 203970 (0.0021) [2024-06-22 11:54:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3341844480. Throughput: 0: 42743.2. Samples: 3341943280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:54:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 11:55:02,523][15401] Updated weights for policy 0, policy_version 203980 (0.0035) [2024-06-22 11:55:03,396][15132] Fps is (10 sec: 49120.2, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 3342057472. Throughput: 0: 42758.8. Samples: 3342196540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:55:03,396][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 11:55:05,807][15401] Updated weights for policy 0, policy_version 203990 (0.0043) [2024-06-22 11:55:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3342221312. Throughput: 0: 42634.6. Samples: 3342325140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:55:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 11:55:10,089][15401] Updated weights for policy 0, policy_version 204000 (0.0028) [2024-06-22 11:55:12,584][15349] Signal inference workers to stop experience collection... (49400 times) [2024-06-22 11:55:12,590][15349] Signal inference workers to resume experience collection... (49400 times) [2024-06-22 11:55:12,608][15401] InferenceWorker_p0-w0: stopping experience collection (49400 times) [2024-06-22 11:55:12,608][15401] InferenceWorker_p0-w0: resuming experience collection (49400 times) [2024-06-22 11:55:13,389][15132] Fps is (10 sec: 42626.0, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 3342483456. Throughput: 0: 42744.5. Samples: 3342585460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:55:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 11:55:13,473][15401] Updated weights for policy 0, policy_version 204010 (0.0027) [2024-06-22 11:55:17,635][15401] Updated weights for policy 0, policy_version 204020 (0.0036) [2024-06-22 11:55:18,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 3342680064. Throughput: 0: 42790.6. Samples: 3342840960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 11:55:18,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 11:55:21,235][15401] Updated weights for policy 0, policy_version 204030 (0.0022) [2024-06-22 11:55:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 3342860288. Throughput: 0: 42715.3. Samples: 3342971840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 11:55:25,224][15401] Updated weights for policy 0, policy_version 204040 (0.0024) [2024-06-22 11:55:28,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 3343122432. Throughput: 0: 42763.5. Samples: 3343227400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:28,393][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 11:55:29,114][15401] Updated weights for policy 0, policy_version 204050 (0.0039) [2024-06-22 11:55:33,220][15401] Updated weights for policy 0, policy_version 204060 (0.0026) [2024-06-22 11:55:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3343319040. Throughput: 0: 42887.7. Samples: 3343488080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 11:55:36,720][15401] Updated weights for policy 0, policy_version 204070 (0.0021) [2024-06-22 11:55:38,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3343515648. Throughput: 0: 42920.8. Samples: 3343617260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:38,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 11:55:40,945][15401] Updated weights for policy 0, policy_version 204080 (0.0039) [2024-06-22 11:55:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3343745024. Throughput: 0: 42668.3. Samples: 3343863360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 11:55:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204086_3343745024.pth... [2024-06-22 11:55:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203458_3333455872.pth [2024-06-22 11:55:44,526][15401] Updated weights for policy 0, policy_version 204090 (0.0040) [2024-06-22 11:55:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42765.1). Total num frames: 3343941632. Throughput: 0: 42778.2. Samples: 3344121280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 11:55:48,690][15401] Updated weights for policy 0, policy_version 204100 (0.0049) [2024-06-22 11:55:52,455][15401] Updated weights for policy 0, policy_version 204110 (0.0046) [2024-06-22 11:55:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3344171008. Throughput: 0: 42675.6. Samples: 3344245540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 11:55:56,452][15401] Updated weights for policy 0, policy_version 204120 (0.0038) [2024-06-22 11:55:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 3344400384. Throughput: 0: 42540.8. Samples: 3344499800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:55:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 11:56:00,049][15401] Updated weights for policy 0, policy_version 204130 (0.0037) [2024-06-22 11:56:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42056.6, 300 sec: 42709.4). Total num frames: 3344580608. Throughput: 0: 42638.2. Samples: 3344759580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 11:56:04,346][15401] Updated weights for policy 0, policy_version 204140 (0.0038) [2024-06-22 11:56:07,796][15401] Updated weights for policy 0, policy_version 204150 (0.0037) [2024-06-22 11:56:08,393][15132] Fps is (10 sec: 40945.4, 60 sec: 43142.0, 300 sec: 42820.4). Total num frames: 3344809984. Throughput: 0: 42506.2. Samples: 3344884780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:08,394][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 11:56:11,935][15401] Updated weights for policy 0, policy_version 204160 (0.0033) [2024-06-22 11:56:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3345039360. Throughput: 0: 42639.7. Samples: 3345146080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 11:56:15,403][15401] Updated weights for policy 0, policy_version 204170 (0.0033) [2024-06-22 11:56:16,316][15349] Signal inference workers to stop experience collection... (49450 times) [2024-06-22 11:56:16,316][15349] Signal inference workers to resume experience collection... (49450 times) [2024-06-22 11:56:16,345][15401] InferenceWorker_p0-w0: stopping experience collection (49450 times) [2024-06-22 11:56:16,346][15401] InferenceWorker_p0-w0: resuming experience collection (49450 times) [2024-06-22 11:56:18,389][15132] Fps is (10 sec: 40974.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 3345219584. Throughput: 0: 42469.3. Samples: 3345399200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 11:56:19,528][15401] Updated weights for policy 0, policy_version 204180 (0.0041) [2024-06-22 11:56:23,353][15401] Updated weights for policy 0, policy_version 204190 (0.0039) [2024-06-22 11:56:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3345448960. Throughput: 0: 42252.4. Samples: 3345518620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 11:56:27,119][15401] Updated weights for policy 0, policy_version 204200 (0.0034) [2024-06-22 11:56:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 3345661952. Throughput: 0: 42696.9. Samples: 3345784720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 11:56:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 11:56:31,151][15401] Updated weights for policy 0, policy_version 204210 (0.0036) [2024-06-22 11:56:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3345858560. Throughput: 0: 42681.7. Samples: 3346041960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 11:56:34,781][15401] Updated weights for policy 0, policy_version 204220 (0.0037) [2024-06-22 11:56:38,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3346071552. Throughput: 0: 42594.0. Samples: 3346162260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 11:56:38,667][15401] Updated weights for policy 0, policy_version 204230 (0.0033) [2024-06-22 11:56:42,438][15401] Updated weights for policy 0, policy_version 204240 (0.0033) [2024-06-22 11:56:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3346317312. Throughput: 0: 42850.7. Samples: 3346428080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 11:56:46,342][15401] Updated weights for policy 0, policy_version 204250 (0.0040) [2024-06-22 11:56:48,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3346513920. Throughput: 0: 42740.0. Samples: 3346682880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 11:56:50,185][15401] Updated weights for policy 0, policy_version 204260 (0.0024) [2024-06-22 11:56:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3346726912. Throughput: 0: 42679.0. Samples: 3346805180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 11:56:54,018][15401] Updated weights for policy 0, policy_version 204270 (0.0040) [2024-06-22 11:56:57,618][15401] Updated weights for policy 0, policy_version 204280 (0.0043) [2024-06-22 11:56:58,397][15132] Fps is (10 sec: 45841.7, 60 sec: 42866.2, 300 sec: 42875.0). Total num frames: 3346972672. Throughput: 0: 42786.7. Samples: 3347071800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:56:58,397][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 11:57:01,814][15401] Updated weights for policy 0, policy_version 204290 (0.0034) [2024-06-22 11:57:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3347152896. Throughput: 0: 42855.6. Samples: 3347327700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 11:57:05,243][15401] Updated weights for policy 0, policy_version 204300 (0.0029) [2024-06-22 11:57:08,392][15132] Fps is (10 sec: 40980.4, 60 sec: 42872.3, 300 sec: 42820.2). Total num frames: 3347382272. Throughput: 0: 42941.3. Samples: 3347451080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:08,393][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 11:57:09,373][15401] Updated weights for policy 0, policy_version 204310 (0.0034) [2024-06-22 11:57:12,705][15401] Updated weights for policy 0, policy_version 204320 (0.0041) [2024-06-22 11:57:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3347595264. Throughput: 0: 42893.4. Samples: 3347715020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:13,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 11:57:16,774][15401] Updated weights for policy 0, policy_version 204330 (0.0037) [2024-06-22 11:57:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3347808256. Throughput: 0: 43009.7. Samples: 3347977400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 11:57:20,557][15401] Updated weights for policy 0, policy_version 204340 (0.0039) [2024-06-22 11:57:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3348037632. Throughput: 0: 43014.5. Samples: 3348097920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 11:57:24,478][15401] Updated weights for policy 0, policy_version 204350 (0.0031) [2024-06-22 11:57:28,157][15401] Updated weights for policy 0, policy_version 204360 (0.0032) [2024-06-22 11:57:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3348250624. Throughput: 0: 43014.1. Samples: 3348363720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 11:57:31,866][15401] Updated weights for policy 0, policy_version 204370 (0.0038) [2024-06-22 11:57:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3348447232. Throughput: 0: 43067.2. Samples: 3348620900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 11:57:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 11:57:35,784][15401] Updated weights for policy 0, policy_version 204380 (0.0038) [2024-06-22 11:57:38,382][15349] Signal inference workers to stop experience collection... (49500 times) [2024-06-22 11:57:38,382][15349] Signal inference workers to resume experience collection... (49500 times) [2024-06-22 11:57:38,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43415.7, 300 sec: 42875.8). Total num frames: 3348676608. Throughput: 0: 43104.7. Samples: 3348745000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:57:38,392][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 11:57:38,420][15401] InferenceWorker_p0-w0: stopping experience collection (49500 times) [2024-06-22 11:57:38,420][15401] InferenceWorker_p0-w0: resuming experience collection (49500 times) [2024-06-22 11:57:39,598][15401] Updated weights for policy 0, policy_version 204390 (0.0032) [2024-06-22 11:57:43,324][15401] Updated weights for policy 0, policy_version 204400 (0.0033) [2024-06-22 11:57:43,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42866.8, 300 sec: 42820.0). Total num frames: 3348889600. Throughput: 0: 42972.0. Samples: 3349005500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:57:43,397][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 11:57:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204401_3348905984.pth... [2024-06-22 11:57:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000203773_3338616832.pth [2024-06-22 11:57:47,181][15401] Updated weights for policy 0, policy_version 204410 (0.0031) [2024-06-22 11:57:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3349086208. Throughput: 0: 43009.8. Samples: 3349263140. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:57:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 11:57:50,770][15401] Updated weights for policy 0, policy_version 204420 (0.0031) [2024-06-22 11:57:53,392][15132] Fps is (10 sec: 40976.2, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 3349299200. Throughput: 0: 42970.6. Samples: 3349384760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:57:53,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 11:57:54,877][15401] Updated weights for policy 0, policy_version 204430 (0.0026) [2024-06-22 11:57:58,284][15401] Updated weights for policy 0, policy_version 204440 (0.0047) [2024-06-22 11:57:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42876.7, 300 sec: 42876.1). Total num frames: 3349544960. Throughput: 0: 42949.0. Samples: 3349647620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:57:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 11:58:02,603][15401] Updated weights for policy 0, policy_version 204450 (0.0037) [2024-06-22 11:58:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3349725184. Throughput: 0: 42807.1. Samples: 3349903720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 11:58:06,365][15401] Updated weights for policy 0, policy_version 204460 (0.0041) [2024-06-22 11:58:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3349954560. Throughput: 0: 42997.8. Samples: 3350032820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 11:58:10,471][15401] Updated weights for policy 0, policy_version 204470 (0.0038) [2024-06-22 11:58:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3350167552. Throughput: 0: 42828.1. Samples: 3350290980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:13,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 11:58:13,958][15401] Updated weights for policy 0, policy_version 204480 (0.0034) [2024-06-22 11:58:18,069][15401] Updated weights for policy 0, policy_version 204490 (0.0033) [2024-06-22 11:58:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3350380544. Throughput: 0: 42867.1. Samples: 3350549920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 11:58:21,674][15401] Updated weights for policy 0, policy_version 204500 (0.0039) [2024-06-22 11:58:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3350609920. Throughput: 0: 42930.7. Samples: 3350676780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 11:58:25,598][15401] Updated weights for policy 0, policy_version 204510 (0.0038) [2024-06-22 11:58:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3350822912. Throughput: 0: 42831.4. Samples: 3350932740. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:28,392][15132] Avg episode reward: [(0, '0.183')] [2024-06-22 11:58:29,293][15401] Updated weights for policy 0, policy_version 204520 (0.0038) [2024-06-22 11:58:33,032][15401] Updated weights for policy 0, policy_version 204530 (0.0028) [2024-06-22 11:58:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3351035904. Throughput: 0: 42733.7. Samples: 3351186160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 11:58:37,013][15401] Updated weights for policy 0, policy_version 204540 (0.0046) [2024-06-22 11:58:38,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3351248896. Throughput: 0: 42897.4. Samples: 3351315140. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-22 11:58:38,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 11:58:40,679][15401] Updated weights for policy 0, policy_version 204550 (0.0033) [2024-06-22 11:58:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 3351445504. Throughput: 0: 42796.8. Samples: 3351573480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:58:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 11:58:44,508][15401] Updated weights for policy 0, policy_version 204560 (0.0029) [2024-06-22 11:58:48,386][15401] Updated weights for policy 0, policy_version 204570 (0.0042) [2024-06-22 11:58:48,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3351674880. Throughput: 0: 42807.6. Samples: 3351830060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:58:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 11:58:52,069][15401] Updated weights for policy 0, policy_version 204580 (0.0032) [2024-06-22 11:58:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 3351887872. Throughput: 0: 42789.3. Samples: 3351958340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:58:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 11:58:55,820][15401] Updated weights for policy 0, policy_version 204590 (0.0038) [2024-06-22 11:58:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3352100864. Throughput: 0: 42754.6. Samples: 3352214940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:58:58,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 11:58:59,892][15401] Updated weights for policy 0, policy_version 204600 (0.0043) [2024-06-22 11:59:00,621][15349] Signal inference workers to stop experience collection... (49550 times) [2024-06-22 11:59:00,621][15349] Signal inference workers to resume experience collection... (49550 times) [2024-06-22 11:59:00,671][15401] InferenceWorker_p0-w0: stopping experience collection (49550 times) [2024-06-22 11:59:00,671][15401] InferenceWorker_p0-w0: resuming experience collection (49550 times) [2024-06-22 11:59:03,387][15401] Updated weights for policy 0, policy_version 204610 (0.0024) [2024-06-22 11:59:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3352330240. Throughput: 0: 42850.7. Samples: 3352478200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:03,390][15132] Avg episode reward: [(0, '0.154')] [2024-06-22 11:59:07,746][15401] Updated weights for policy 0, policy_version 204620 (0.0033) [2024-06-22 11:59:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3352543232. Throughput: 0: 42920.4. Samples: 3352608200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:08,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 11:59:11,230][15401] Updated weights for policy 0, policy_version 204630 (0.0033) [2024-06-22 11:59:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3352739840. Throughput: 0: 42876.1. Samples: 3352862060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 11:59:15,362][15401] Updated weights for policy 0, policy_version 204640 (0.0041) [2024-06-22 11:59:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 3352952832. Throughput: 0: 42935.1. Samples: 3353118340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:18,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 11:59:18,810][15401] Updated weights for policy 0, policy_version 204650 (0.0043) [2024-06-22 11:59:22,978][15401] Updated weights for policy 0, policy_version 204660 (0.0033) [2024-06-22 11:59:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3353165824. Throughput: 0: 42995.5. Samples: 3353249840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:23,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 11:59:26,357][15401] Updated weights for policy 0, policy_version 204670 (0.0033) [2024-06-22 11:59:28,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3353395200. Throughput: 0: 42868.9. Samples: 3353502580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:28,394][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 11:59:30,509][15401] Updated weights for policy 0, policy_version 204680 (0.0040) [2024-06-22 11:59:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3353591808. Throughput: 0: 42977.8. Samples: 3353764060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 11:59:34,121][15401] Updated weights for policy 0, policy_version 204690 (0.0044) [2024-06-22 11:59:38,387][15401] Updated weights for policy 0, policy_version 204700 (0.0038) [2024-06-22 11:59:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 3353804800. Throughput: 0: 42856.4. Samples: 3353886880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:38,391][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 11:59:41,784][15401] Updated weights for policy 0, policy_version 204710 (0.0040) [2024-06-22 11:59:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3354017792. Throughput: 0: 42794.6. Samples: 3354140700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 11:59:43,472][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204714_3354034176.pth... [2024-06-22 11:59:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204086_3343745024.pth [2024-06-22 11:59:46,199][15401] Updated weights for policy 0, policy_version 204720 (0.0025) [2024-06-22 11:59:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3354230784. Throughput: 0: 42617.3. Samples: 3354395980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 11:59:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 11:59:49,555][15401] Updated weights for policy 0, policy_version 204730 (0.0036) [2024-06-22 11:59:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3354443776. Throughput: 0: 42480.9. Samples: 3354519840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 11:59:53,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 11:59:53,764][15401] Updated weights for policy 0, policy_version 204740 (0.0036) [2024-06-22 11:59:56,967][15401] Updated weights for policy 0, policy_version 204750 (0.0035) [2024-06-22 11:59:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 3354656768. Throughput: 0: 42549.8. Samples: 3354776800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 11:59:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 12:00:01,302][15401] Updated weights for policy 0, policy_version 204760 (0.0026) [2024-06-22 12:00:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3354886144. Throughput: 0: 42701.0. Samples: 3355039780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 12:00:04,977][15401] Updated weights for policy 0, policy_version 204770 (0.0032) [2024-06-22 12:00:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3355066368. Throughput: 0: 42587.3. Samples: 3355166260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 12:00:09,122][15401] Updated weights for policy 0, policy_version 204780 (0.0032) [2024-06-22 12:00:12,636][15401] Updated weights for policy 0, policy_version 204790 (0.0025) [2024-06-22 12:00:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 3355312128. Throughput: 0: 42817.5. Samples: 3355429360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 12:00:16,580][15401] Updated weights for policy 0, policy_version 204800 (0.0032) [2024-06-22 12:00:18,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42873.0, 300 sec: 42931.6). Total num frames: 3355525120. Throughput: 0: 42679.7. Samples: 3355684660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 12:00:20,349][15401] Updated weights for policy 0, policy_version 204810 (0.0043) [2024-06-22 12:00:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3355738112. Throughput: 0: 42864.5. Samples: 3355815780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 12:00:24,190][15401] Updated weights for policy 0, policy_version 204820 (0.0036) [2024-06-22 12:00:24,965][15349] Signal inference workers to stop experience collection... (49600 times) [2024-06-22 12:00:24,966][15349] Signal inference workers to resume experience collection... (49600 times) [2024-06-22 12:00:25,025][15401] InferenceWorker_p0-w0: stopping experience collection (49600 times) [2024-06-22 12:00:25,026][15401] InferenceWorker_p0-w0: resuming experience collection (49600 times) [2024-06-22 12:00:27,995][15401] Updated weights for policy 0, policy_version 204830 (0.0037) [2024-06-22 12:00:28,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3355951104. Throughput: 0: 42922.1. Samples: 3356072200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 12:00:31,743][15401] Updated weights for policy 0, policy_version 204840 (0.0028) [2024-06-22 12:00:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3356180480. Throughput: 0: 42926.2. Samples: 3356327660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 12:00:35,605][15401] Updated weights for policy 0, policy_version 204850 (0.0033) [2024-06-22 12:00:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3356360704. Throughput: 0: 42941.4. Samples: 3356452200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 12:00:39,313][15401] Updated weights for policy 0, policy_version 204860 (0.0027) [2024-06-22 12:00:43,239][15401] Updated weights for policy 0, policy_version 204870 (0.0043) [2024-06-22 12:00:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3356590080. Throughput: 0: 42936.0. Samples: 3356708920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:43,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 12:00:46,908][15401] Updated weights for policy 0, policy_version 204880 (0.0048) [2024-06-22 12:00:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3356786688. Throughput: 0: 42868.9. Samples: 3356968880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:48,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 12:00:50,950][15401] Updated weights for policy 0, policy_version 204890 (0.0030) [2024-06-22 12:00:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3356999680. Throughput: 0: 42817.7. Samples: 3357093060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:00:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 12:00:54,528][15401] Updated weights for policy 0, policy_version 204900 (0.0039) [2024-06-22 12:00:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3357212672. Throughput: 0: 42676.4. Samples: 3357349800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:00:58,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 12:00:58,579][15401] Updated weights for policy 0, policy_version 204910 (0.0026) [2024-06-22 12:01:02,749][15401] Updated weights for policy 0, policy_version 204920 (0.0029) [2024-06-22 12:01:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42765.5). Total num frames: 3357425664. Throughput: 0: 42604.6. Samples: 3357601860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 12:01:06,308][15401] Updated weights for policy 0, policy_version 204930 (0.0035) [2024-06-22 12:01:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3357655040. Throughput: 0: 42581.5. Samples: 3357731940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 12:01:10,235][15401] Updated weights for policy 0, policy_version 204940 (0.0032) [2024-06-22 12:01:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3357868032. Throughput: 0: 42637.0. Samples: 3357990860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 12:01:14,128][15401] Updated weights for policy 0, policy_version 204950 (0.0028) [2024-06-22 12:01:17,681][15401] Updated weights for policy 0, policy_version 204960 (0.0040) [2024-06-22 12:01:18,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 3358081024. Throughput: 0: 42661.3. Samples: 3358247520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:18,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 12:01:21,681][15401] Updated weights for policy 0, policy_version 204970 (0.0037) [2024-06-22 12:01:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3358310400. Throughput: 0: 42814.2. Samples: 3358378840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:23,390][15132] Avg episode reward: [(0, '0.911')] [2024-06-22 12:01:25,332][15401] Updated weights for policy 0, policy_version 204980 (0.0032) [2024-06-22 12:01:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3358507008. Throughput: 0: 42743.1. Samples: 3358632360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:28,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 12:01:29,687][15401] Updated weights for policy 0, policy_version 204990 (0.0044) [2024-06-22 12:01:32,836][15401] Updated weights for policy 0, policy_version 205000 (0.0053) [2024-06-22 12:01:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3358736384. Throughput: 0: 42559.9. Samples: 3358884080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 12:01:37,247][15401] Updated weights for policy 0, policy_version 205010 (0.0023) [2024-06-22 12:01:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3358932992. Throughput: 0: 42815.1. Samples: 3359019740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 12:01:40,406][15401] Updated weights for policy 0, policy_version 205020 (0.0041) [2024-06-22 12:01:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3359162368. Throughput: 0: 42738.6. Samples: 3359273040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:43,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 12:01:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205027_3359162368.pth... [2024-06-22 12:01:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204401_3348905984.pth [2024-06-22 12:01:44,697][15401] Updated weights for policy 0, policy_version 205030 (0.0032) [2024-06-22 12:01:48,006][15401] Updated weights for policy 0, policy_version 205040 (0.0045) [2024-06-22 12:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3359375360. Throughput: 0: 42718.2. Samples: 3359524180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 12:01:52,227][15401] Updated weights for policy 0, policy_version 205050 (0.0036) [2024-06-22 12:01:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42766.1). Total num frames: 3359588352. Throughput: 0: 42761.1. Samples: 3359656200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 12:01:55,587][15401] Updated weights for policy 0, policy_version 205060 (0.0041) [2024-06-22 12:01:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3359801344. Throughput: 0: 42767.5. Samples: 3359915400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:01:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 12:01:59,730][15401] Updated weights for policy 0, policy_version 205070 (0.0028) [2024-06-22 12:02:03,293][15401] Updated weights for policy 0, policy_version 205080 (0.0040) [2024-06-22 12:02:03,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43415.9, 300 sec: 42876.1). Total num frames: 3360030720. Throughput: 0: 42770.3. Samples: 3360172180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:02:03,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 12:02:07,652][15401] Updated weights for policy 0, policy_version 205090 (0.0036) [2024-06-22 12:02:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 3360210944. Throughput: 0: 42774.3. Samples: 3360303680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 12:02:10,921][15401] Updated weights for policy 0, policy_version 205100 (0.0032) [2024-06-22 12:02:13,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3360440320. Throughput: 0: 42868.8. Samples: 3360561460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 12:02:15,093][15349] Signal inference workers to stop experience collection... (49650 times) [2024-06-22 12:02:15,093][15349] Signal inference workers to resume experience collection... (49650 times) [2024-06-22 12:02:15,109][15401] InferenceWorker_p0-w0: stopping experience collection (49650 times) [2024-06-22 12:02:15,109][15401] InferenceWorker_p0-w0: resuming experience collection (49650 times) [2024-06-22 12:02:15,253][15401] Updated weights for policy 0, policy_version 205110 (0.0057) [2024-06-22 12:02:18,390][15132] Fps is (10 sec: 44233.0, 60 sec: 42872.6, 300 sec: 42764.9). Total num frames: 3360653312. Throughput: 0: 43019.3. Samples: 3360819980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:18,391][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 12:02:18,690][15401] Updated weights for policy 0, policy_version 205120 (0.0044) [2024-06-22 12:02:23,046][15401] Updated weights for policy 0, policy_version 205130 (0.0031) [2024-06-22 12:02:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3360866304. Throughput: 0: 42868.0. Samples: 3360948800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 12:02:26,261][15401] Updated weights for policy 0, policy_version 205140 (0.0027) [2024-06-22 12:02:28,390][15132] Fps is (10 sec: 40963.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3361062912. Throughput: 0: 42859.1. Samples: 3361201700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 12:02:30,483][15401] Updated weights for policy 0, policy_version 205150 (0.0038) [2024-06-22 12:02:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42709.5). Total num frames: 3361275904. Throughput: 0: 43065.3. Samples: 3361462220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:33,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 12:02:34,082][15401] Updated weights for policy 0, policy_version 205160 (0.0035) [2024-06-22 12:02:38,141][15401] Updated weights for policy 0, policy_version 205170 (0.0040) [2024-06-22 12:02:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 3361505280. Throughput: 0: 42968.1. Samples: 3361589760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 12:02:41,563][15401] Updated weights for policy 0, policy_version 205180 (0.0037) [2024-06-22 12:02:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3361718272. Throughput: 0: 42835.6. Samples: 3361843000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:43,395][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 12:02:45,717][15401] Updated weights for policy 0, policy_version 205190 (0.0030) [2024-06-22 12:02:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 3361914880. Throughput: 0: 42875.7. Samples: 3362101480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 12:02:49,316][15401] Updated weights for policy 0, policy_version 205200 (0.0032) [2024-06-22 12:02:53,243][15401] Updated weights for policy 0, policy_version 205210 (0.0042) [2024-06-22 12:02:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3362160640. Throughput: 0: 42768.4. Samples: 3362228260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:53,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 12:02:57,350][15401] Updated weights for policy 0, policy_version 205220 (0.0022) [2024-06-22 12:02:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3362373632. Throughput: 0: 42657.0. Samples: 3362481020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:02:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 12:03:01,203][15401] Updated weights for policy 0, policy_version 205230 (0.0033) [2024-06-22 12:03:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 3362570240. Throughput: 0: 42432.3. Samples: 3362729400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:03:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 12:03:05,371][15401] Updated weights for policy 0, policy_version 205240 (0.0023) [2024-06-22 12:03:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3362783232. Throughput: 0: 42443.1. Samples: 3362858740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 12:03:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 12:03:08,966][15401] Updated weights for policy 0, policy_version 205250 (0.0035) [2024-06-22 12:03:12,967][15401] Updated weights for policy 0, policy_version 205260 (0.0035) [2024-06-22 12:03:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3362996224. Throughput: 0: 42497.8. Samples: 3363114100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 12:03:16,861][15401] Updated weights for policy 0, policy_version 205270 (0.0027) [2024-06-22 12:03:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 3363209216. Throughput: 0: 42283.3. Samples: 3363364860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 12:03:20,582][15401] Updated weights for policy 0, policy_version 205280 (0.0037) [2024-06-22 12:03:22,944][15349] Signal inference workers to stop experience collection... (49700 times) [2024-06-22 12:03:22,944][15349] Signal inference workers to resume experience collection... (49700 times) [2024-06-22 12:03:22,979][15401] InferenceWorker_p0-w0: stopping experience collection (49700 times) [2024-06-22 12:03:22,979][15401] InferenceWorker_p0-w0: resuming experience collection (49700 times) [2024-06-22 12:03:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 3363405824. Throughput: 0: 42401.4. Samples: 3363497820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 12:03:24,305][15401] Updated weights for policy 0, policy_version 205290 (0.0040) [2024-06-22 12:03:28,210][15401] Updated weights for policy 0, policy_version 205300 (0.0039) [2024-06-22 12:03:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3363635200. Throughput: 0: 42459.3. Samples: 3363753660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 12:03:31,930][15401] Updated weights for policy 0, policy_version 205310 (0.0027) [2024-06-22 12:03:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.2, 300 sec: 42765.3). Total num frames: 3363864576. Throughput: 0: 42354.9. Samples: 3364007460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 12:03:36,136][15401] Updated weights for policy 0, policy_version 205320 (0.0030) [2024-06-22 12:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3364061184. Throughput: 0: 42545.0. Samples: 3364142780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:38,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 12:03:39,485][15401] Updated weights for policy 0, policy_version 205330 (0.0041) [2024-06-22 12:03:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3364274176. Throughput: 0: 42525.3. Samples: 3364394660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:43,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 12:03:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205339_3364274176.pth... [2024-06-22 12:03:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000204714_3354034176.pth [2024-06-22 12:03:43,783][15401] Updated weights for policy 0, policy_version 205340 (0.0036) [2024-06-22 12:03:47,172][15401] Updated weights for policy 0, policy_version 205350 (0.0036) [2024-06-22 12:03:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3364503552. Throughput: 0: 42697.3. Samples: 3364650780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 12:03:51,343][15401] Updated weights for policy 0, policy_version 205360 (0.0022) [2024-06-22 12:03:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3364683776. Throughput: 0: 42754.3. Samples: 3364782680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:53,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 12:03:54,810][15401] Updated weights for policy 0, policy_version 205370 (0.0028) [2024-06-22 12:03:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3364929536. Throughput: 0: 42692.9. Samples: 3365035280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:03:58,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 12:03:59,706][15401] Updated weights for policy 0, policy_version 205380 (0.0031) [2024-06-22 12:04:02,435][15401] Updated weights for policy 0, policy_version 205390 (0.0021) [2024-06-22 12:04:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3365142528. Throughput: 0: 42841.8. Samples: 3365292740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:04:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 12:04:07,279][15401] Updated weights for policy 0, policy_version 205400 (0.0036) [2024-06-22 12:04:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3365322752. Throughput: 0: 42759.1. Samples: 3365421980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:04:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 12:04:10,147][15401] Updated weights for policy 0, policy_version 205410 (0.0029) [2024-06-22 12:04:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3365584896. Throughput: 0: 42768.3. Samples: 3365678240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:04:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 12:04:14,873][15401] Updated weights for policy 0, policy_version 205420 (0.0027) [2024-06-22 12:04:17,838][15401] Updated weights for policy 0, policy_version 205430 (0.0031) [2024-06-22 12:04:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3365781504. Throughput: 0: 42795.3. Samples: 3365933240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 12:04:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 12:04:22,313][15401] Updated weights for policy 0, policy_version 205440 (0.0042) [2024-06-22 12:04:23,396][15132] Fps is (10 sec: 39296.9, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 3365978112. Throughput: 0: 42692.1. Samples: 3366064200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:23,396][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 12:04:25,179][15401] Updated weights for policy 0, policy_version 205450 (0.0042) [2024-06-22 12:04:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3366223872. Throughput: 0: 43014.3. Samples: 3366330300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 12:04:29,794][15401] Updated weights for policy 0, policy_version 205460 (0.0030) [2024-06-22 12:04:32,835][15401] Updated weights for policy 0, policy_version 205470 (0.0035) [2024-06-22 12:04:33,389][15132] Fps is (10 sec: 45904.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3366436864. Throughput: 0: 42793.4. Samples: 3366576480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 12:04:37,386][15401] Updated weights for policy 0, policy_version 205480 (0.0026) [2024-06-22 12:04:38,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3366600704. Throughput: 0: 42674.2. Samples: 3366703020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 12:04:40,659][15401] Updated weights for policy 0, policy_version 205490 (0.0032) [2024-06-22 12:04:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3366846464. Throughput: 0: 42912.6. Samples: 3366966340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 12:04:43,537][15349] Signal inference workers to stop experience collection... (49750 times) [2024-06-22 12:04:43,538][15349] Signal inference workers to resume experience collection... (49750 times) [2024-06-22 12:04:43,586][15401] InferenceWorker_p0-w0: stopping experience collection (49750 times) [2024-06-22 12:04:43,587][15401] InferenceWorker_p0-w0: resuming experience collection (49750 times) [2024-06-22 12:04:44,992][15401] Updated weights for policy 0, policy_version 205500 (0.0047) [2024-06-22 12:04:48,361][15401] Updated weights for policy 0, policy_version 205510 (0.0031) [2024-06-22 12:04:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3367075840. Throughput: 0: 42726.5. Samples: 3367215440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:48,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 12:04:52,535][15401] Updated weights for policy 0, policy_version 205520 (0.0043) [2024-06-22 12:04:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3367256064. Throughput: 0: 42765.3. Samples: 3367346420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 12:04:56,135][15401] Updated weights for policy 0, policy_version 205530 (0.0045) [2024-06-22 12:04:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3367485440. Throughput: 0: 42835.7. Samples: 3367605840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:04:58,398][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 12:04:59,987][15401] Updated weights for policy 0, policy_version 205540 (0.0030) [2024-06-22 12:05:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3367698432. Throughput: 0: 42704.8. Samples: 3367854960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:05:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 12:05:03,974][15401] Updated weights for policy 0, policy_version 205550 (0.0023) [2024-06-22 12:05:07,731][15401] Updated weights for policy 0, policy_version 205560 (0.0032) [2024-06-22 12:05:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3367911424. Throughput: 0: 42784.8. Samples: 3367989240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:05:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 12:05:11,673][15401] Updated weights for policy 0, policy_version 205570 (0.0041) [2024-06-22 12:05:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 3368124416. Throughput: 0: 42510.0. Samples: 3368243360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:05:13,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 12:05:15,338][15401] Updated weights for policy 0, policy_version 205580 (0.0035) [2024-06-22 12:05:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3368337408. Throughput: 0: 42700.9. Samples: 3368498020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:05:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 12:05:19,413][15401] Updated weights for policy 0, policy_version 205590 (0.0028) [2024-06-22 12:05:22,890][15401] Updated weights for policy 0, policy_version 205600 (0.0033) [2024-06-22 12:05:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 3368550400. Throughput: 0: 42904.4. Samples: 3368633720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 12:05:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 12:05:27,099][15401] Updated weights for policy 0, policy_version 205610 (0.0032) [2024-06-22 12:05:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 3368763392. Throughput: 0: 42584.9. Samples: 3368882660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 12:05:30,908][15401] Updated weights for policy 0, policy_version 205620 (0.0033) [2024-06-22 12:05:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3368976384. Throughput: 0: 42711.9. Samples: 3369137480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 12:05:34,682][15401] Updated weights for policy 0, policy_version 205630 (0.0039) [2024-06-22 12:05:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3369189376. Throughput: 0: 42674.3. Samples: 3369266760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 12:05:38,522][15401] Updated weights for policy 0, policy_version 205640 (0.0054) [2024-06-22 12:05:42,381][15401] Updated weights for policy 0, policy_version 205650 (0.0044) [2024-06-22 12:05:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3369385984. Throughput: 0: 42554.1. Samples: 3369520780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 12:05:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205651_3369385984.pth... [2024-06-22 12:05:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205027_3359162368.pth [2024-06-22 12:05:46,085][15401] Updated weights for policy 0, policy_version 205660 (0.0042) [2024-06-22 12:05:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3369631744. Throughput: 0: 42732.5. Samples: 3369777920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 12:05:50,490][15401] Updated weights for policy 0, policy_version 205670 (0.0033) [2024-06-22 12:05:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3369811968. Throughput: 0: 42626.2. Samples: 3369907420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 12:05:53,839][15401] Updated weights for policy 0, policy_version 205680 (0.0035) [2024-06-22 12:05:58,148][15401] Updated weights for policy 0, policy_version 205690 (0.0041) [2024-06-22 12:05:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3370024960. Throughput: 0: 42409.4. Samples: 3370151680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:05:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 12:06:01,652][15401] Updated weights for policy 0, policy_version 205700 (0.0044) [2024-06-22 12:06:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3370254336. Throughput: 0: 42476.1. Samples: 3370409440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 12:06:05,846][15401] Updated weights for policy 0, policy_version 205710 (0.0036) [2024-06-22 12:06:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3370450944. Throughput: 0: 42349.9. Samples: 3370539460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 12:06:09,455][15401] Updated weights for policy 0, policy_version 205720 (0.0033) [2024-06-22 12:06:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.1, 300 sec: 42654.3). Total num frames: 3370663936. Throughput: 0: 42272.4. Samples: 3370784920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 12:06:13,574][15401] Updated weights for policy 0, policy_version 205730 (0.0033) [2024-06-22 12:06:17,226][15401] Updated weights for policy 0, policy_version 205740 (0.0033) [2024-06-22 12:06:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3370876928. Throughput: 0: 42534.2. Samples: 3371051520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 12:06:18,472][15349] Signal inference workers to stop experience collection... (49800 times) [2024-06-22 12:06:18,472][15349] Signal inference workers to resume experience collection... (49800 times) [2024-06-22 12:06:18,509][15401] InferenceWorker_p0-w0: stopping experience collection (49800 times) [2024-06-22 12:06:18,510][15401] InferenceWorker_p0-w0: resuming experience collection (49800 times) [2024-06-22 12:06:21,197][15401] Updated weights for policy 0, policy_version 205750 (0.0041) [2024-06-22 12:06:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3371089920. Throughput: 0: 42516.0. Samples: 3371179980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 12:06:24,929][15401] Updated weights for policy 0, policy_version 205760 (0.0036) [2024-06-22 12:06:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3371319296. Throughput: 0: 42381.9. Samples: 3371427960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 12:06:29,054][15401] Updated weights for policy 0, policy_version 205770 (0.0034) [2024-06-22 12:06:32,649][15401] Updated weights for policy 0, policy_version 205780 (0.0029) [2024-06-22 12:06:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 3371515904. Throughput: 0: 42466.3. Samples: 3371688900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:06:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 12:06:36,891][15401] Updated weights for policy 0, policy_version 205790 (0.0037) [2024-06-22 12:06:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3371728896. Throughput: 0: 42378.7. Samples: 3371814460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:06:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 12:06:40,159][15401] Updated weights for policy 0, policy_version 205800 (0.0042) [2024-06-22 12:06:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3371974656. Throughput: 0: 42570.7. Samples: 3372067360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:06:43,396][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 12:06:44,833][15401] Updated weights for policy 0, policy_version 205810 (0.0043) [2024-06-22 12:06:47,643][15401] Updated weights for policy 0, policy_version 205820 (0.0024) [2024-06-22 12:06:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3372154880. Throughput: 0: 42659.5. Samples: 3372329120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:06:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 12:06:52,333][15401] Updated weights for policy 0, policy_version 205830 (0.0036) [2024-06-22 12:06:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3372351488. Throughput: 0: 42561.3. Samples: 3372454720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:06:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 12:06:55,392][15401] Updated weights for policy 0, policy_version 205840 (0.0035) [2024-06-22 12:06:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 3372613632. Throughput: 0: 42693.6. Samples: 3372706140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:06:58,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 12:07:00,043][15401] Updated weights for policy 0, policy_version 205850 (0.0043) [2024-06-22 12:07:03,220][15401] Updated weights for policy 0, policy_version 205860 (0.0043) [2024-06-22 12:07:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3372810240. Throughput: 0: 42444.1. Samples: 3372961500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 12:07:07,939][15401] Updated weights for policy 0, policy_version 205870 (0.0029) [2024-06-22 12:07:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3372990464. Throughput: 0: 42367.9. Samples: 3373086540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 12:07:10,857][15401] Updated weights for policy 0, policy_version 205880 (0.0034) [2024-06-22 12:07:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3373236224. Throughput: 0: 42626.1. Samples: 3373346140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:13,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-22 12:07:15,471][15401] Updated weights for policy 0, policy_version 205890 (0.0034) [2024-06-22 12:07:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 3373449216. Throughput: 0: 42637.8. Samples: 3373607600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:18,390][15132] Avg episode reward: [(0, '0.151')] [2024-06-22 12:07:18,489][15401] Updated weights for policy 0, policy_version 205900 (0.0033) [2024-06-22 12:07:22,972][15401] Updated weights for policy 0, policy_version 205910 (0.0046) [2024-06-22 12:07:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3373645824. Throughput: 0: 42594.1. Samples: 3373731200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:23,391][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 12:07:26,216][15401] Updated weights for policy 0, policy_version 205920 (0.0040) [2024-06-22 12:07:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 3373891584. Throughput: 0: 42789.2. Samples: 3373992880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 12:07:30,419][15401] Updated weights for policy 0, policy_version 205930 (0.0024) [2024-06-22 12:07:33,065][15349] Signal inference workers to stop experience collection... (49850 times) [2024-06-22 12:07:33,101][15401] InferenceWorker_p0-w0: stopping experience collection (49850 times) [2024-06-22 12:07:33,127][15349] Signal inference workers to resume experience collection... (49850 times) [2024-06-22 12:07:33,127][15401] InferenceWorker_p0-w0: resuming experience collection (49850 times) [2024-06-22 12:07:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3374088192. Throughput: 0: 42777.2. Samples: 3374254100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 12:07:33,979][15401] Updated weights for policy 0, policy_version 205940 (0.0048) [2024-06-22 12:07:37,861][15401] Updated weights for policy 0, policy_version 205950 (0.0034) [2024-06-22 12:07:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3374301184. Throughput: 0: 42694.7. Samples: 3374375980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 12:07:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 12:07:41,461][15401] Updated weights for policy 0, policy_version 205960 (0.0034) [2024-06-22 12:07:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3374546944. Throughput: 0: 42952.0. Samples: 3374638980. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:07:43,395][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 12:07:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205966_3374546944.pth... [2024-06-22 12:07:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205339_3364274176.pth [2024-06-22 12:07:45,461][15401] Updated weights for policy 0, policy_version 205970 (0.0031) [2024-06-22 12:07:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3374727168. Throughput: 0: 43160.1. Samples: 3374903700. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:07:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 12:07:48,950][15401] Updated weights for policy 0, policy_version 205980 (0.0033) [2024-06-22 12:07:53,065][15401] Updated weights for policy 0, policy_version 205990 (0.0046) [2024-06-22 12:07:53,392][15132] Fps is (10 sec: 39312.6, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 3374940160. Throughput: 0: 43076.9. Samples: 3375025100. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:07:53,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 12:07:56,494][15401] Updated weights for policy 0, policy_version 206000 (0.0030) [2024-06-22 12:07:58,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3375169536. Throughput: 0: 43017.7. Samples: 3375282040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:07:58,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 12:08:00,604][15401] Updated weights for policy 0, policy_version 206010 (0.0026) [2024-06-22 12:08:03,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 3375366144. Throughput: 0: 43136.3. Samples: 3375548840. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:03,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 12:08:04,147][15401] Updated weights for policy 0, policy_version 206020 (0.0032) [2024-06-22 12:08:08,392][15132] Fps is (10 sec: 40960.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 3375579136. Throughput: 0: 43002.7. Samples: 3375666420. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:08,392][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 12:08:08,447][15401] Updated weights for policy 0, policy_version 206030 (0.0033) [2024-06-22 12:08:11,903][15401] Updated weights for policy 0, policy_version 206040 (0.0030) [2024-06-22 12:08:13,392][15132] Fps is (10 sec: 45875.0, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 3375824896. Throughput: 0: 43008.4. Samples: 3375928360. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:13,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 12:08:15,933][15401] Updated weights for policy 0, policy_version 206050 (0.0040) [2024-06-22 12:08:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3376005120. Throughput: 0: 43089.5. Samples: 3376193120. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 12:08:19,516][15401] Updated weights for policy 0, policy_version 206060 (0.0038) [2024-06-22 12:08:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3376218112. Throughput: 0: 42935.5. Samples: 3376308080. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 12:08:23,692][15401] Updated weights for policy 0, policy_version 206070 (0.0035) [2024-06-22 12:08:27,228][15401] Updated weights for policy 0, policy_version 206080 (0.0034) [2024-06-22 12:08:28,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 3376463872. Throughput: 0: 42828.5. Samples: 3376566360. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:28,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 12:08:31,164][15401] Updated weights for policy 0, policy_version 206090 (0.0032) [2024-06-22 12:08:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3376627712. Throughput: 0: 42916.3. Samples: 3376834940. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 12:08:34,701][15401] Updated weights for policy 0, policy_version 206100 (0.0035) [2024-06-22 12:08:38,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3376889856. Throughput: 0: 42836.0. Samples: 3376952620. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 12:08:38,714][15401] Updated weights for policy 0, policy_version 206110 (0.0029) [2024-06-22 12:08:42,316][15401] Updated weights for policy 0, policy_version 206120 (0.0023) [2024-06-22 12:08:43,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3377102848. Throughput: 0: 42999.7. Samples: 3377216920. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 12:08:43,905][15349] Signal inference workers to stop experience collection... (49900 times) [2024-06-22 12:08:43,945][15401] InferenceWorker_p0-w0: stopping experience collection (49900 times) [2024-06-22 12:08:43,965][15349] Signal inference workers to resume experience collection... (49900 times) [2024-06-22 12:08:43,966][15401] InferenceWorker_p0-w0: resuming experience collection (49900 times) [2024-06-22 12:08:46,190][15401] Updated weights for policy 0, policy_version 206130 (0.0030) [2024-06-22 12:08:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3377283072. Throughput: 0: 42829.4. Samples: 3377476060. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-22 12:08:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-22 12:08:50,034][15401] Updated weights for policy 0, policy_version 206140 (0.0036) [2024-06-22 12:08:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 3377528832. Throughput: 0: 42846.7. Samples: 3377594420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:08:53,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 12:08:54,050][15401] Updated weights for policy 0, policy_version 206150 (0.0035) [2024-06-22 12:08:57,855][15401] Updated weights for policy 0, policy_version 206160 (0.0039) [2024-06-22 12:08:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 3377741824. Throughput: 0: 42941.9. Samples: 3377860640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:08:58,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 12:09:01,599][15401] Updated weights for policy 0, policy_version 206170 (0.0036) [2024-06-22 12:09:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3377938432. Throughput: 0: 42700.9. Samples: 3378114660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 12:09:05,376][15401] Updated weights for policy 0, policy_version 206180 (0.0027) [2024-06-22 12:09:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 3378167808. Throughput: 0: 42813.4. Samples: 3378234680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 12:09:09,182][15401] Updated weights for policy 0, policy_version 206190 (0.0024) [2024-06-22 12:09:13,135][15401] Updated weights for policy 0, policy_version 206200 (0.0037) [2024-06-22 12:09:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 3378380800. Throughput: 0: 42998.4. Samples: 3378501180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 12:09:17,134][15401] Updated weights for policy 0, policy_version 206210 (0.0024) [2024-06-22 12:09:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 3378577408. Throughput: 0: 42648.9. Samples: 3378754140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:18,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 12:09:20,989][15401] Updated weights for policy 0, policy_version 206220 (0.0041) [2024-06-22 12:09:23,389][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3378806784. Throughput: 0: 42795.6. Samples: 3378878420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:23,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 12:09:24,638][15401] Updated weights for policy 0, policy_version 206230 (0.0031) [2024-06-22 12:09:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 3379003392. Throughput: 0: 42711.0. Samples: 3379138920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 12:09:28,795][15401] Updated weights for policy 0, policy_version 206240 (0.0039) [2024-06-22 12:09:32,393][15401] Updated weights for policy 0, policy_version 206250 (0.0034) [2024-06-22 12:09:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 3379232768. Throughput: 0: 42524.8. Samples: 3379389680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 12:09:36,497][15401] Updated weights for policy 0, policy_version 206260 (0.0027) [2024-06-22 12:09:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3379429376. Throughput: 0: 42815.6. Samples: 3379521120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 12:09:40,018][15401] Updated weights for policy 0, policy_version 206270 (0.0049) [2024-06-22 12:09:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3379658752. Throughput: 0: 42563.0. Samples: 3379775980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 12:09:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206278_3379658752.pth... [2024-06-22 12:09:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205651_3369385984.pth [2024-06-22 12:09:44,230][15401] Updated weights for policy 0, policy_version 206280 (0.0034) [2024-06-22 12:09:47,886][15401] Updated weights for policy 0, policy_version 206290 (0.0028) [2024-06-22 12:09:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3379871744. Throughput: 0: 42506.6. Samples: 3380027460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 12:09:51,842][15401] Updated weights for policy 0, policy_version 206300 (0.0039) [2024-06-22 12:09:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3380084736. Throughput: 0: 42746.2. Samples: 3380158260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 12:09:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 12:09:55,552][15401] Updated weights for policy 0, policy_version 206310 (0.0035) [2024-06-22 12:09:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3380281344. Throughput: 0: 42458.4. Samples: 3380411820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:09:58,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-22 12:09:59,561][15401] Updated weights for policy 0, policy_version 206320 (0.0035) [2024-06-22 12:10:03,119][15401] Updated weights for policy 0, policy_version 206330 (0.0035) [2024-06-22 12:10:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3380527104. Throughput: 0: 42361.8. Samples: 3380660420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 12:10:07,212][15401] Updated weights for policy 0, policy_version 206340 (0.0033) [2024-06-22 12:10:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 3380723712. Throughput: 0: 42630.2. Samples: 3380796780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 12:10:10,864][15401] Updated weights for policy 0, policy_version 206350 (0.0037) [2024-06-22 12:10:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 3380903936. Throughput: 0: 42497.8. Samples: 3381051320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 12:10:14,821][15401] Updated weights for policy 0, policy_version 206360 (0.0031) [2024-06-22 12:10:18,334][15401] Updated weights for policy 0, policy_version 206370 (0.0043) [2024-06-22 12:10:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3381166080. Throughput: 0: 42603.6. Samples: 3381306840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 12:10:21,097][15349] Signal inference workers to stop experience collection... (49950 times) [2024-06-22 12:10:21,120][15401] InferenceWorker_p0-w0: stopping experience collection (49950 times) [2024-06-22 12:10:21,157][15349] Signal inference workers to resume experience collection... (49950 times) [2024-06-22 12:10:21,158][15401] InferenceWorker_p0-w0: resuming experience collection (49950 times) [2024-06-22 12:10:22,937][15401] Updated weights for policy 0, policy_version 206380 (0.0026) [2024-06-22 12:10:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 3381346304. Throughput: 0: 42628.4. Samples: 3381439500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:23,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 12:10:25,874][15401] Updated weights for policy 0, policy_version 206390 (0.0032) [2024-06-22 12:10:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 3381542912. Throughput: 0: 42503.3. Samples: 3381688640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 12:10:30,442][15401] Updated weights for policy 0, policy_version 206400 (0.0033) [2024-06-22 12:10:33,389][15132] Fps is (10 sec: 45887.1, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 3381805056. Throughput: 0: 42612.2. Samples: 3381945000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 12:10:33,579][15401] Updated weights for policy 0, policy_version 206410 (0.0024) [2024-06-22 12:10:38,047][15401] Updated weights for policy 0, policy_version 206420 (0.0041) [2024-06-22 12:10:38,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3381985280. Throughput: 0: 42644.8. Samples: 3382077380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:38,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 12:10:41,092][15401] Updated weights for policy 0, policy_version 206430 (0.0050) [2024-06-22 12:10:43,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3382198272. Throughput: 0: 42602.0. Samples: 3382328900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 12:10:46,007][15401] Updated weights for policy 0, policy_version 206440 (0.0028) [2024-06-22 12:10:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3382427648. Throughput: 0: 42787.4. Samples: 3382585860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:48,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 12:10:48,960][15401] Updated weights for policy 0, policy_version 206450 (0.0040) [2024-06-22 12:10:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 3382607872. Throughput: 0: 42621.0. Samples: 3382714720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 12:10:53,760][15401] Updated weights for policy 0, policy_version 206460 (0.0025) [2024-06-22 12:10:56,755][15401] Updated weights for policy 0, policy_version 206470 (0.0038) [2024-06-22 12:10:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3382870016. Throughput: 0: 42577.6. Samples: 3382967320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:10:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 12:11:01,589][15401] Updated weights for policy 0, policy_version 206480 (0.0044) [2024-06-22 12:11:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3383050240. Throughput: 0: 42630.7. Samples: 3383225220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:11:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 12:11:04,561][15401] Updated weights for policy 0, policy_version 206490 (0.0041) [2024-06-22 12:11:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3383263232. Throughput: 0: 42483.5. Samples: 3383351160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 12:11:09,198][15401] Updated weights for policy 0, policy_version 206500 (0.0032) [2024-06-22 12:11:12,394][15401] Updated weights for policy 0, policy_version 206510 (0.0029) [2024-06-22 12:11:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 3383508992. Throughput: 0: 42674.4. Samples: 3383608980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:13,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 12:11:16,732][15401] Updated weights for policy 0, policy_version 206520 (0.0035) [2024-06-22 12:11:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 3383689216. Throughput: 0: 42794.6. Samples: 3383870760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 12:11:19,897][15401] Updated weights for policy 0, policy_version 206530 (0.0045) [2024-06-22 12:11:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 3383918592. Throughput: 0: 42598.6. Samples: 3383994220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 12:11:24,824][15401] Updated weights for policy 0, policy_version 206540 (0.0030) [2024-06-22 12:11:27,677][15401] Updated weights for policy 0, policy_version 206550 (0.0042) [2024-06-22 12:11:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3384131584. Throughput: 0: 42743.0. Samples: 3384252340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 12:11:32,259][15401] Updated weights for policy 0, policy_version 206560 (0.0030) [2024-06-22 12:11:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3384328192. Throughput: 0: 42903.3. Samples: 3384516500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:33,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-22 12:11:35,415][15401] Updated weights for policy 0, policy_version 206570 (0.0035) [2024-06-22 12:11:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 3384573952. Throughput: 0: 42842.6. Samples: 3384642640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 12:11:39,751][15401] Updated weights for policy 0, policy_version 206580 (0.0037) [2024-06-22 12:11:42,823][15401] Updated weights for policy 0, policy_version 206590 (0.0033) [2024-06-22 12:11:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3384770560. Throughput: 0: 43009.4. Samples: 3384902740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 12:11:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206591_3384786944.pth... [2024-06-22 12:11:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000205966_3374546944.pth [2024-06-22 12:11:47,207][15401] Updated weights for policy 0, policy_version 206600 (0.0034) [2024-06-22 12:11:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3384983552. Throughput: 0: 43158.7. Samples: 3385167360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:48,393][15132] Avg episode reward: [(0, '0.121')] [2024-06-22 12:11:50,390][15401] Updated weights for policy 0, policy_version 206610 (0.0033) [2024-06-22 12:11:53,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42764.7). Total num frames: 3385229312. Throughput: 0: 43018.2. Samples: 3385287080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:53,393][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 12:11:54,757][15401] Updated weights for policy 0, policy_version 206620 (0.0041) [2024-06-22 12:11:56,298][15349] Signal inference workers to stop experience collection... (50000 times) [2024-06-22 12:11:56,347][15401] InferenceWorker_p0-w0: stopping experience collection (50000 times) [2024-06-22 12:11:56,356][15349] Signal inference workers to resume experience collection... (50000 times) [2024-06-22 12:11:56,369][15401] InferenceWorker_p0-w0: resuming experience collection (50000 times) [2024-06-22 12:11:57,854][15401] Updated weights for policy 0, policy_version 206630 (0.0035) [2024-06-22 12:11:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3385425920. Throughput: 0: 43102.2. Samples: 3385548580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:11:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 12:12:02,287][15401] Updated weights for policy 0, policy_version 206640 (0.0042) [2024-06-22 12:12:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3385638912. Throughput: 0: 43152.4. Samples: 3385812620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:12:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 12:12:05,459][15401] Updated weights for policy 0, policy_version 206650 (0.0024) [2024-06-22 12:12:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3385868288. Throughput: 0: 43054.8. Samples: 3385931680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 12:12:08,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-22 12:12:09,796][15401] Updated weights for policy 0, policy_version 206660 (0.0034) [2024-06-22 12:12:13,375][15401] Updated weights for policy 0, policy_version 206670 (0.0032) [2024-06-22 12:12:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3386081280. Throughput: 0: 43185.3. Samples: 3386195680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 12:12:17,220][15401] Updated weights for policy 0, policy_version 206680 (0.0031) [2024-06-22 12:12:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 3386277888. Throughput: 0: 43048.8. Samples: 3386453700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 12:12:20,874][15401] Updated weights for policy 0, policy_version 206690 (0.0038) [2024-06-22 12:12:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3386507264. Throughput: 0: 42972.4. Samples: 3386576400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:23,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 12:12:24,714][15401] Updated weights for policy 0, policy_version 206700 (0.0026) [2024-06-22 12:12:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3386720256. Throughput: 0: 42974.2. Samples: 3386836580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:28,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 12:12:28,610][15401] Updated weights for policy 0, policy_version 206710 (0.0044) [2024-06-22 12:12:32,639][15401] Updated weights for policy 0, policy_version 206720 (0.0032) [2024-06-22 12:12:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3386916864. Throughput: 0: 42916.5. Samples: 3387098600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 12:12:36,453][15401] Updated weights for policy 0, policy_version 206730 (0.0036) [2024-06-22 12:12:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3387162624. Throughput: 0: 43120.4. Samples: 3387227400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 12:12:40,084][15401] Updated weights for policy 0, policy_version 206740 (0.0030) [2024-06-22 12:12:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3387359232. Throughput: 0: 42943.5. Samples: 3387481140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:43,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 12:12:43,905][15401] Updated weights for policy 0, policy_version 206750 (0.0031) [2024-06-22 12:12:47,535][15401] Updated weights for policy 0, policy_version 206760 (0.0028) [2024-06-22 12:12:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.9, 300 sec: 42820.6). Total num frames: 3387572224. Throughput: 0: 42927.4. Samples: 3387744460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:48,392][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 12:12:51,369][15401] Updated weights for policy 0, policy_version 206770 (0.0036) [2024-06-22 12:12:53,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43146.3, 300 sec: 42876.5). Total num frames: 3387817984. Throughput: 0: 43112.5. Samples: 3387871740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 12:12:55,680][15401] Updated weights for policy 0, policy_version 206780 (0.0026) [2024-06-22 12:12:58,396][15132] Fps is (10 sec: 44219.1, 60 sec: 43139.9, 300 sec: 42875.5). Total num frames: 3388014592. Throughput: 0: 42965.4. Samples: 3388129400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:12:58,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 12:12:58,968][15401] Updated weights for policy 0, policy_version 206790 (0.0047) [2024-06-22 12:13:03,325][15401] Updated weights for policy 0, policy_version 206800 (0.0043) [2024-06-22 12:13:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3388211200. Throughput: 0: 42836.1. Samples: 3388381320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:13:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 12:13:06,683][15401] Updated weights for policy 0, policy_version 206810 (0.0034) [2024-06-22 12:13:08,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3388440576. Throughput: 0: 42950.8. Samples: 3388509180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:13:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 12:13:10,833][15401] Updated weights for policy 0, policy_version 206820 (0.0030) [2024-06-22 12:13:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3388653568. Throughput: 0: 43128.0. Samples: 3388777340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:13:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 12:13:14,103][15401] Updated weights for policy 0, policy_version 206830 (0.0025) [2024-06-22 12:13:18,328][15401] Updated weights for policy 0, policy_version 206840 (0.0039) [2024-06-22 12:13:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3388866560. Throughput: 0: 42979.5. Samples: 3389032680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-22 12:13:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 12:13:21,563][15401] Updated weights for policy 0, policy_version 206850 (0.0029) [2024-06-22 12:13:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3389079552. Throughput: 0: 42869.9. Samples: 3389156540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 12:13:24,444][15349] Signal inference workers to stop experience collection... (50050 times) [2024-06-22 12:13:24,444][15349] Signal inference workers to resume experience collection... (50050 times) [2024-06-22 12:13:24,483][15401] InferenceWorker_p0-w0: stopping experience collection (50050 times) [2024-06-22 12:13:24,483][15401] InferenceWorker_p0-w0: resuming experience collection (50050 times) [2024-06-22 12:13:25,850][15401] Updated weights for policy 0, policy_version 206860 (0.0040) [2024-06-22 12:13:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3389292544. Throughput: 0: 43181.7. Samples: 3389424320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:28,393][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:13:29,265][15401] Updated weights for policy 0, policy_version 206870 (0.0048) [2024-06-22 12:13:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3389505536. Throughput: 0: 42937.0. Samples: 3389676520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 12:13:33,524][15401] Updated weights for policy 0, policy_version 206880 (0.0032) [2024-06-22 12:13:36,831][15401] Updated weights for policy 0, policy_version 206890 (0.0040) [2024-06-22 12:13:38,392][15132] Fps is (10 sec: 44237.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3389734912. Throughput: 0: 42918.1. Samples: 3389803160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:38,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 12:13:40,983][15401] Updated weights for policy 0, policy_version 206900 (0.0027) [2024-06-22 12:13:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 3389947904. Throughput: 0: 43047.1. Samples: 3390066240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 12:13:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206907_3389964288.pth... [2024-06-22 12:13:43,541][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206278_3379658752.pth [2024-06-22 12:13:44,892][15401] Updated weights for policy 0, policy_version 206910 (0.0033) [2024-06-22 12:13:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 3390160896. Throughput: 0: 43102.7. Samples: 3390320940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 12:13:48,678][15401] Updated weights for policy 0, policy_version 206920 (0.0027) [2024-06-22 12:13:52,473][15401] Updated weights for policy 0, policy_version 206930 (0.0036) [2024-06-22 12:13:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3390357504. Throughput: 0: 43083.1. Samples: 3390447920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:53,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 12:13:56,813][15401] Updated weights for policy 0, policy_version 206940 (0.0033) [2024-06-22 12:13:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 3390586880. Throughput: 0: 42839.0. Samples: 3390705100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:13:58,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-22 12:14:00,048][15401] Updated weights for policy 0, policy_version 206950 (0.0032) [2024-06-22 12:14:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3390799872. Throughput: 0: 42838.3. Samples: 3390960400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:14:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 12:14:04,245][15401] Updated weights for policy 0, policy_version 206960 (0.0033) [2024-06-22 12:14:07,655][15401] Updated weights for policy 0, policy_version 206970 (0.0024) [2024-06-22 12:14:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3390996480. Throughput: 0: 42907.4. Samples: 3391087380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:14:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 12:14:11,869][15401] Updated weights for policy 0, policy_version 206980 (0.0031) [2024-06-22 12:14:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 3391209472. Throughput: 0: 42780.5. Samples: 3391349440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:14:13,392][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 12:14:15,174][15401] Updated weights for policy 0, policy_version 206990 (0.0050) [2024-06-22 12:14:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3391438848. Throughput: 0: 42783.4. Samples: 3391601780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:14:18,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 12:14:19,579][15401] Updated weights for policy 0, policy_version 207000 (0.0030) [2024-06-22 12:14:22,898][15401] Updated weights for policy 0, policy_version 207010 (0.0061) [2024-06-22 12:14:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3391651840. Throughput: 0: 42819.2. Samples: 3391729920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 12:14:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 12:14:27,509][15401] Updated weights for policy 0, policy_version 207020 (0.0036) [2024-06-22 12:14:28,396][15132] Fps is (10 sec: 42571.8, 60 sec: 42868.7, 300 sec: 42819.6). Total num frames: 3391864832. Throughput: 0: 42741.8. Samples: 3391989900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:28,396][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 12:14:31,315][15401] Updated weights for policy 0, policy_version 207030 (0.0030) [2024-06-22 12:14:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3392077824. Throughput: 0: 42571.9. Samples: 3392236680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 12:14:35,079][15401] Updated weights for policy 0, policy_version 207040 (0.0043) [2024-06-22 12:14:38,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 3392290816. Throughput: 0: 42616.7. Samples: 3392365680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:38,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 12:14:38,867][15401] Updated weights for policy 0, policy_version 207050 (0.0028) [2024-06-22 12:14:42,617][15401] Updated weights for policy 0, policy_version 207060 (0.0036) [2024-06-22 12:14:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3392503808. Throughput: 0: 42631.3. Samples: 3392623500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 12:14:46,977][15401] Updated weights for policy 0, policy_version 207070 (0.0034) [2024-06-22 12:14:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3392716800. Throughput: 0: 42504.4. Samples: 3392873100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 12:14:50,141][15401] Updated weights for policy 0, policy_version 207080 (0.0032) [2024-06-22 12:14:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3392929792. Throughput: 0: 42616.9. Samples: 3393005140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 12:14:54,646][15401] Updated weights for policy 0, policy_version 207090 (0.0039) [2024-06-22 12:14:57,868][15401] Updated weights for policy 0, policy_version 207100 (0.0033) [2024-06-22 12:14:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3393142784. Throughput: 0: 42476.5. Samples: 3393260780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:14:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 12:15:02,292][15401] Updated weights for policy 0, policy_version 207110 (0.0046) [2024-06-22 12:15:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3393355776. Throughput: 0: 42500.1. Samples: 3393514280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 12:15:05,761][15401] Updated weights for policy 0, policy_version 207120 (0.0036) [2024-06-22 12:15:07,000][15349] Signal inference workers to stop experience collection... (50100 times) [2024-06-22 12:15:07,001][15349] Signal inference workers to resume experience collection... (50100 times) [2024-06-22 12:15:07,027][15401] InferenceWorker_p0-w0: stopping experience collection (50100 times) [2024-06-22 12:15:07,027][15401] InferenceWorker_p0-w0: resuming experience collection (50100 times) [2024-06-22 12:15:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3393585152. Throughput: 0: 42644.9. Samples: 3393648940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 12:15:09,957][15401] Updated weights for policy 0, policy_version 207130 (0.0037) [2024-06-22 12:15:13,325][15401] Updated weights for policy 0, policy_version 207140 (0.0045) [2024-06-22 12:15:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 3393781760. Throughput: 0: 42463.3. Samples: 3393900480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:13,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 12:15:17,492][15401] Updated weights for policy 0, policy_version 207150 (0.0048) [2024-06-22 12:15:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.4, 300 sec: 42765.4). Total num frames: 3393961984. Throughput: 0: 42622.7. Samples: 3394154700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 12:15:21,319][15401] Updated weights for policy 0, policy_version 207160 (0.0037) [2024-06-22 12:15:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3394224128. Throughput: 0: 42514.7. Samples: 3394278840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:23,394][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 12:15:24,976][15401] Updated weights for policy 0, policy_version 207170 (0.0026) [2024-06-22 12:15:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42329.8, 300 sec: 42709.4). Total num frames: 3394404352. Throughput: 0: 42450.0. Samples: 3394533760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 12:15:28,803][15401] Updated weights for policy 0, policy_version 207180 (0.0030) [2024-06-22 12:15:32,480][15401] Updated weights for policy 0, policy_version 207190 (0.0034) [2024-06-22 12:15:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 3394617344. Throughput: 0: 42631.2. Samples: 3394791500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:15:33,396][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 12:15:36,455][15401] Updated weights for policy 0, policy_version 207200 (0.0047) [2024-06-22 12:15:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3394846720. Throughput: 0: 42534.8. Samples: 3394919200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:15:38,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 12:15:40,011][15401] Updated weights for policy 0, policy_version 207210 (0.0032) [2024-06-22 12:15:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3395043328. Throughput: 0: 42649.3. Samples: 3395180000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:15:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 12:15:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207218_3395059712.pth... [2024-06-22 12:15:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206591_3384786944.pth [2024-06-22 12:15:44,239][15401] Updated weights for policy 0, policy_version 207220 (0.0032) [2024-06-22 12:15:48,038][15401] Updated weights for policy 0, policy_version 207230 (0.0032) [2024-06-22 12:15:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3395272704. Throughput: 0: 42613.7. Samples: 3395431900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:15:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 12:15:51,700][15401] Updated weights for policy 0, policy_version 207240 (0.0033) [2024-06-22 12:15:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3395469312. Throughput: 0: 42464.4. Samples: 3395559840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:15:53,390][15132] Avg episode reward: [(0, '0.065')] [2024-06-22 12:15:55,405][15401] Updated weights for policy 0, policy_version 207250 (0.0028) [2024-06-22 12:15:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3395698688. Throughput: 0: 42591.3. Samples: 3395817080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:15:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 12:15:59,416][15401] Updated weights for policy 0, policy_version 207260 (0.0031) [2024-06-22 12:16:02,804][15401] Updated weights for policy 0, policy_version 207270 (0.0043) [2024-06-22 12:16:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3395911680. Throughput: 0: 42582.5. Samples: 3396070920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:03,396][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 12:16:06,887][15401] Updated weights for policy 0, policy_version 207280 (0.0031) [2024-06-22 12:16:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3396124672. Throughput: 0: 42732.1. Samples: 3396201780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 12:16:10,547][15401] Updated weights for policy 0, policy_version 207290 (0.0036) [2024-06-22 12:16:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3396337664. Throughput: 0: 42803.6. Samples: 3396459920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:13,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-22 12:16:14,595][15401] Updated weights for policy 0, policy_version 207300 (0.0029) [2024-06-22 12:16:18,064][15401] Updated weights for policy 0, policy_version 207310 (0.0039) [2024-06-22 12:16:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3396567040. Throughput: 0: 42621.6. Samples: 3396709480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:18,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-22 12:16:22,449][15401] Updated weights for policy 0, policy_version 207320 (0.0037) [2024-06-22 12:16:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3396780032. Throughput: 0: 42724.8. Samples: 3396841820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:16:25,600][15401] Updated weights for policy 0, policy_version 207330 (0.0037) [2024-06-22 12:16:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3396976640. Throughput: 0: 42711.1. Samples: 3397102000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:28,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 12:16:29,206][15349] Signal inference workers to stop experience collection... (50150 times) [2024-06-22 12:16:29,208][15349] Signal inference workers to resume experience collection... (50150 times) [2024-06-22 12:16:29,244][15401] InferenceWorker_p0-w0: stopping experience collection (50150 times) [2024-06-22 12:16:29,244][15401] InferenceWorker_p0-w0: resuming experience collection (50150 times) [2024-06-22 12:16:29,985][15401] Updated weights for policy 0, policy_version 207340 (0.0038) [2024-06-22 12:16:33,194][15401] Updated weights for policy 0, policy_version 207350 (0.0031) [2024-06-22 12:16:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.8, 300 sec: 42875.7). Total num frames: 3397222400. Throughput: 0: 42627.1. Samples: 3397350220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:33,392][15132] Avg episode reward: [(0, '0.843')] [2024-06-22 12:16:37,416][15401] Updated weights for policy 0, policy_version 207360 (0.0033) [2024-06-22 12:16:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3397402624. Throughput: 0: 42859.0. Samples: 3397488500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-22 12:16:38,396][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 12:16:40,691][15401] Updated weights for policy 0, policy_version 207370 (0.0047) [2024-06-22 12:16:43,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3397615616. Throughput: 0: 42880.0. Samples: 3397746680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:16:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 12:16:44,948][15401] Updated weights for policy 0, policy_version 207380 (0.0033) [2024-06-22 12:16:48,356][15401] Updated weights for policy 0, policy_version 207390 (0.0032) [2024-06-22 12:16:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 3397877760. Throughput: 0: 42854.3. Samples: 3397999360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:16:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 12:16:52,834][15401] Updated weights for policy 0, policy_version 207400 (0.0042) [2024-06-22 12:16:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3398041600. Throughput: 0: 42911.1. Samples: 3398132780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:16:53,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 12:16:56,147][15401] Updated weights for policy 0, policy_version 207410 (0.0041) [2024-06-22 12:16:58,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3398254592. Throughput: 0: 42789.0. Samples: 3398385420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:16:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 12:17:00,282][15401] Updated weights for policy 0, policy_version 207420 (0.0030) [2024-06-22 12:17:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3398467584. Throughput: 0: 42951.5. Samples: 3398642300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 12:17:04,268][15401] Updated weights for policy 0, policy_version 207430 (0.0038) [2024-06-22 12:17:08,093][15401] Updated weights for policy 0, policy_version 207440 (0.0041) [2024-06-22 12:17:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3398696960. Throughput: 0: 42787.6. Samples: 3398767360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:08,392][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 12:17:11,928][15401] Updated weights for policy 0, policy_version 207450 (0.0042) [2024-06-22 12:17:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3398909952. Throughput: 0: 42733.3. Samples: 3399025000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 12:17:15,539][15401] Updated weights for policy 0, policy_version 207460 (0.0036) [2024-06-22 12:17:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3399122944. Throughput: 0: 43073.9. Samples: 3399288440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:18,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 12:17:19,429][15401] Updated weights for policy 0, policy_version 207470 (0.0031) [2024-06-22 12:17:23,300][15401] Updated weights for policy 0, policy_version 207480 (0.0045) [2024-06-22 12:17:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3399352320. Throughput: 0: 42908.0. Samples: 3399419360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:23,394][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 12:17:26,891][15401] Updated weights for policy 0, policy_version 207490 (0.0032) [2024-06-22 12:17:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3399548928. Throughput: 0: 42888.3. Samples: 3399676660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 12:17:30,737][15401] Updated weights for policy 0, policy_version 207500 (0.0031) [2024-06-22 12:17:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3399761920. Throughput: 0: 43006.8. Samples: 3399934660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 12:17:34,726][15401] Updated weights for policy 0, policy_version 207510 (0.0031) [2024-06-22 12:17:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3399991296. Throughput: 0: 42817.2. Samples: 3400059560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 12:17:38,674][15401] Updated weights for policy 0, policy_version 207520 (0.0032) [2024-06-22 12:17:41,836][15349] Signal inference workers to stop experience collection... (50200 times) [2024-06-22 12:17:41,873][15401] InferenceWorker_p0-w0: stopping experience collection (50200 times) [2024-06-22 12:17:41,903][15349] Signal inference workers to resume experience collection... (50200 times) [2024-06-22 12:17:41,904][15401] InferenceWorker_p0-w0: resuming experience collection (50200 times) [2024-06-22 12:17:42,540][15401] Updated weights for policy 0, policy_version 207530 (0.0044) [2024-06-22 12:17:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 3400187904. Throughput: 0: 42931.1. Samples: 3400317320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 12:17:43,661][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207533_3400220672.pth... [2024-06-22 12:17:43,731][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000206907_3389964288.pth [2024-06-22 12:17:46,783][15401] Updated weights for policy 0, policy_version 207540 (0.0027) [2024-06-22 12:17:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3400400896. Throughput: 0: 42908.2. Samples: 3400573160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:17:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 12:17:50,314][15401] Updated weights for policy 0, policy_version 207550 (0.0042) [2024-06-22 12:17:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 3400630272. Throughput: 0: 42917.4. Samples: 3400698540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:17:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 12:17:54,388][15401] Updated weights for policy 0, policy_version 207560 (0.0035) [2024-06-22 12:17:57,940][15401] Updated weights for policy 0, policy_version 207570 (0.0026) [2024-06-22 12:17:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3400826880. Throughput: 0: 42931.6. Samples: 3400956920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:17:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 12:18:02,131][15401] Updated weights for policy 0, policy_version 207580 (0.0028) [2024-06-22 12:18:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 3401039872. Throughput: 0: 42744.2. Samples: 3401211940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:03,391][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 12:18:05,594][15401] Updated weights for policy 0, policy_version 207590 (0.0035) [2024-06-22 12:18:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 3401252864. Throughput: 0: 42631.6. Samples: 3401337780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 12:18:09,795][15401] Updated weights for policy 0, policy_version 207600 (0.0036) [2024-06-22 12:18:13,150][15401] Updated weights for policy 0, policy_version 207610 (0.0026) [2024-06-22 12:18:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3401482240. Throughput: 0: 42659.4. Samples: 3401596340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 12:18:17,507][15401] Updated weights for policy 0, policy_version 207620 (0.0038) [2024-06-22 12:18:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3401662464. Throughput: 0: 42575.9. Samples: 3401850580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 12:18:20,868][15401] Updated weights for policy 0, policy_version 207630 (0.0041) [2024-06-22 12:18:23,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 3401908224. Throughput: 0: 42442.2. Samples: 3401969560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:23,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 12:18:25,195][15401] Updated weights for policy 0, policy_version 207640 (0.0034) [2024-06-22 12:18:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3402104832. Throughput: 0: 42407.6. Samples: 3402225660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 12:18:28,766][15401] Updated weights for policy 0, policy_version 207650 (0.0031) [2024-06-22 12:18:32,735][15401] Updated weights for policy 0, policy_version 207660 (0.0021) [2024-06-22 12:18:33,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 3402301440. Throughput: 0: 42500.9. Samples: 3402485700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 12:18:36,578][15401] Updated weights for policy 0, policy_version 207670 (0.0033) [2024-06-22 12:18:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 3402547200. Throughput: 0: 42511.1. Samples: 3402611640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:38,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 12:18:40,772][15401] Updated weights for policy 0, policy_version 207680 (0.0037) [2024-06-22 12:18:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3402727424. Throughput: 0: 42347.3. Samples: 3402862560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 12:18:44,407][15401] Updated weights for policy 0, policy_version 207690 (0.0036) [2024-06-22 12:18:48,333][15401] Updated weights for policy 0, policy_version 207700 (0.0047) [2024-06-22 12:18:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3402956800. Throughput: 0: 42344.2. Samples: 3403117420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:48,394][15132] Avg episode reward: [(0, '0.219')] [2024-06-22 12:18:51,978][15401] Updated weights for policy 0, policy_version 207710 (0.0039) [2024-06-22 12:18:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3403169792. Throughput: 0: 42419.9. Samples: 3403246680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-22 12:18:53,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 12:18:55,753][15401] Updated weights for policy 0, policy_version 207720 (0.0045) [2024-06-22 12:18:57,075][15349] Signal inference workers to stop experience collection... (50250 times) [2024-06-22 12:18:57,075][15349] Signal inference workers to resume experience collection... (50250 times) [2024-06-22 12:18:57,126][15401] InferenceWorker_p0-w0: stopping experience collection (50250 times) [2024-06-22 12:18:57,126][15401] InferenceWorker_p0-w0: resuming experience collection (50250 times) [2024-06-22 12:18:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3403366400. Throughput: 0: 42450.0. Samples: 3403506580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:18:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 12:18:59,482][15401] Updated weights for policy 0, policy_version 207730 (0.0045) [2024-06-22 12:19:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3403595776. Throughput: 0: 42527.5. Samples: 3403764320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 12:19:03,803][15401] Updated weights for policy 0, policy_version 207740 (0.0026) [2024-06-22 12:19:07,172][15401] Updated weights for policy 0, policy_version 207750 (0.0031) [2024-06-22 12:19:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 3403808768. Throughput: 0: 42692.5. Samples: 3403890620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 12:19:11,623][15401] Updated weights for policy 0, policy_version 207760 (0.0038) [2024-06-22 12:19:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3404021760. Throughput: 0: 42758.5. Samples: 3404149800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 12:19:14,884][15401] Updated weights for policy 0, policy_version 207770 (0.0027) [2024-06-22 12:19:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3404234752. Throughput: 0: 42656.0. Samples: 3404405220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 12:19:19,351][15401] Updated weights for policy 0, policy_version 207780 (0.0040) [2024-06-22 12:19:22,587][15401] Updated weights for policy 0, policy_version 207790 (0.0037) [2024-06-22 12:19:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42654.9). Total num frames: 3404447744. Throughput: 0: 42748.1. Samples: 3404535200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 12:19:26,796][15401] Updated weights for policy 0, policy_version 207800 (0.0024) [2024-06-22 12:19:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3404660736. Throughput: 0: 42846.5. Samples: 3404790640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 12:19:30,191][15401] Updated weights for policy 0, policy_version 207810 (0.0036) [2024-06-22 12:19:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3404890112. Throughput: 0: 42840.8. Samples: 3405045260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 12:19:34,526][15401] Updated weights for policy 0, policy_version 207820 (0.0029) [2024-06-22 12:19:37,782][15401] Updated weights for policy 0, policy_version 207830 (0.0031) [2024-06-22 12:19:38,392][15132] Fps is (10 sec: 44225.4, 60 sec: 42598.3, 300 sec: 42709.1). Total num frames: 3405103104. Throughput: 0: 42861.2. Samples: 3405175540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:38,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 12:19:42,062][15401] Updated weights for policy 0, policy_version 207840 (0.0034) [2024-06-22 12:19:43,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 3405299712. Throughput: 0: 42819.8. Samples: 3405433580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:43,393][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 12:19:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207843_3405299712.pth... [2024-06-22 12:19:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207218_3395059712.pth [2024-06-22 12:19:45,703][15401] Updated weights for policy 0, policy_version 207850 (0.0039) [2024-06-22 12:19:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3405529088. Throughput: 0: 42744.1. Samples: 3405687800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:48,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 12:19:49,667][15401] Updated weights for policy 0, policy_version 207860 (0.0026) [2024-06-22 12:19:53,357][15401] Updated weights for policy 0, policy_version 207870 (0.0038) [2024-06-22 12:19:53,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3405742080. Throughput: 0: 42845.5. Samples: 3405818660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 12:19:57,202][15401] Updated weights for policy 0, policy_version 207880 (0.0035) [2024-06-22 12:19:58,393][15132] Fps is (10 sec: 40947.0, 60 sec: 42869.2, 300 sec: 42653.5). Total num frames: 3405938688. Throughput: 0: 42818.9. Samples: 3406076780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:19:58,393][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 12:20:00,848][15401] Updated weights for policy 0, policy_version 207890 (0.0039) [2024-06-22 12:20:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3406168064. Throughput: 0: 42705.4. Samples: 3406326960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-22 12:20:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 12:20:05,211][15401] Updated weights for policy 0, policy_version 207900 (0.0040) [2024-06-22 12:20:08,390][15132] Fps is (10 sec: 44249.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3406381056. Throughput: 0: 42766.4. Samples: 3406459700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 12:20:08,737][15401] Updated weights for policy 0, policy_version 207910 (0.0041) [2024-06-22 12:20:12,872][15401] Updated weights for policy 0, policy_version 207920 (0.0034) [2024-06-22 12:20:13,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3406594048. Throughput: 0: 42793.6. Samples: 3406716460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:13,393][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 12:20:15,449][15349] Signal inference workers to stop experience collection... (50300 times) [2024-06-22 12:20:15,449][15349] Signal inference workers to resume experience collection... (50300 times) [2024-06-22 12:20:15,500][15401] InferenceWorker_p0-w0: stopping experience collection (50300 times) [2024-06-22 12:20:15,500][15401] InferenceWorker_p0-w0: resuming experience collection (50300 times) [2024-06-22 12:20:16,297][15401] Updated weights for policy 0, policy_version 207930 (0.0033) [2024-06-22 12:20:18,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 3406823424. Throughput: 0: 42608.9. Samples: 3406962760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:18,393][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 12:20:20,498][15401] Updated weights for policy 0, policy_version 207940 (0.0037) [2024-06-22 12:20:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3407020032. Throughput: 0: 42781.1. Samples: 3407100580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 12:20:23,750][15401] Updated weights for policy 0, policy_version 207950 (0.0025) [2024-06-22 12:20:28,143][15401] Updated weights for policy 0, policy_version 207960 (0.0028) [2024-06-22 12:20:28,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3407233024. Throughput: 0: 42646.3. Samples: 3407352560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 12:20:31,397][15401] Updated weights for policy 0, policy_version 207970 (0.0029) [2024-06-22 12:20:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3407462400. Throughput: 0: 42638.5. Samples: 3407606540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 12:20:35,669][15401] Updated weights for policy 0, policy_version 207980 (0.0023) [2024-06-22 12:20:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3407642624. Throughput: 0: 42745.7. Samples: 3407742220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:38,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 12:20:39,120][15401] Updated weights for policy 0, policy_version 207990 (0.0039) [2024-06-22 12:20:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 3407855616. Throughput: 0: 42616.4. Samples: 3407994380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 12:20:43,459][15401] Updated weights for policy 0, policy_version 208000 (0.0035) [2024-06-22 12:20:47,026][15401] Updated weights for policy 0, policy_version 208010 (0.0031) [2024-06-22 12:20:48,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3408117760. Throughput: 0: 42701.3. Samples: 3408248520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 12:20:50,893][15401] Updated weights for policy 0, policy_version 208020 (0.0043) [2024-06-22 12:20:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3408297984. Throughput: 0: 42745.2. Samples: 3408383220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 12:20:54,508][15401] Updated weights for policy 0, policy_version 208030 (0.0036) [2024-06-22 12:20:58,355][15401] Updated weights for policy 0, policy_version 208040 (0.0026) [2024-06-22 12:20:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.8, 300 sec: 42765.0). Total num frames: 3408527360. Throughput: 0: 42785.0. Samples: 3408641680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:20:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 12:21:02,318][15401] Updated weights for policy 0, policy_version 208050 (0.0041) [2024-06-22 12:21:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3408740352. Throughput: 0: 42949.8. Samples: 3408895400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:21:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 12:21:05,847][15401] Updated weights for policy 0, policy_version 208060 (0.0033) [2024-06-22 12:21:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3408936960. Throughput: 0: 42787.6. Samples: 3409026020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 12:21:08,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 12:21:10,103][15401] Updated weights for policy 0, policy_version 208070 (0.0034) [2024-06-22 12:21:13,241][15401] Updated weights for policy 0, policy_version 208080 (0.0041) [2024-06-22 12:21:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 3409182720. Throughput: 0: 42957.3. Samples: 3409285640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 12:21:17,766][15401] Updated weights for policy 0, policy_version 208090 (0.0022) [2024-06-22 12:21:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42598.5, 300 sec: 42709.2). Total num frames: 3409379328. Throughput: 0: 43039.7. Samples: 3409543420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:18,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 12:21:20,893][15401] Updated weights for policy 0, policy_version 208100 (0.0030) [2024-06-22 12:21:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3409592320. Throughput: 0: 42786.7. Samples: 3409667620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 12:21:25,189][15401] Updated weights for policy 0, policy_version 208110 (0.0035) [2024-06-22 12:21:28,321][15401] Updated weights for policy 0, policy_version 208120 (0.0033) [2024-06-22 12:21:28,392][15132] Fps is (10 sec: 45874.9, 60 sec: 43415.9, 300 sec: 42765.0). Total num frames: 3409838080. Throughput: 0: 43022.6. Samples: 3409930500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:28,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 12:21:32,703][15401] Updated weights for policy 0, policy_version 208130 (0.0031) [2024-06-22 12:21:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3410018304. Throughput: 0: 43195.5. Samples: 3410192320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 12:21:36,251][15401] Updated weights for policy 0, policy_version 208140 (0.0026) [2024-06-22 12:21:38,390][15132] Fps is (10 sec: 37691.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3410214912. Throughput: 0: 42950.1. Samples: 3410315980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 12:21:38,577][15349] Signal inference workers to stop experience collection... (50350 times) [2024-06-22 12:21:38,597][15401] InferenceWorker_p0-w0: stopping experience collection (50350 times) [2024-06-22 12:21:38,637][15349] Signal inference workers to resume experience collection... (50350 times) [2024-06-22 12:21:38,638][15401] InferenceWorker_p0-w0: resuming experience collection (50350 times) [2024-06-22 12:21:40,483][15401] Updated weights for policy 0, policy_version 208150 (0.0024) [2024-06-22 12:21:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3410444288. Throughput: 0: 42832.4. Samples: 3410569140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 12:21:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208158_3410460672.pth... [2024-06-22 12:21:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207533_3400220672.pth [2024-06-22 12:21:43,861][15401] Updated weights for policy 0, policy_version 208160 (0.0035) [2024-06-22 12:21:48,174][15401] Updated weights for policy 0, policy_version 208170 (0.0030) [2024-06-22 12:21:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3410657280. Throughput: 0: 43082.3. Samples: 3410834100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:48,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 12:21:51,383][15401] Updated weights for policy 0, policy_version 208180 (0.0039) [2024-06-22 12:21:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3410870272. Throughput: 0: 43011.9. Samples: 3410961560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 12:21:55,703][15401] Updated weights for policy 0, policy_version 208190 (0.0023) [2024-06-22 12:21:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3411083264. Throughput: 0: 43032.9. Samples: 3411222120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:21:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 12:21:59,006][15401] Updated weights for policy 0, policy_version 208200 (0.0029) [2024-06-22 12:22:03,291][15401] Updated weights for policy 0, policy_version 208210 (0.0037) [2024-06-22 12:22:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 3411312640. Throughput: 0: 43056.1. Samples: 3411480840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:22:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 12:22:06,700][15401] Updated weights for policy 0, policy_version 208220 (0.0029) [2024-06-22 12:22:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3411525632. Throughput: 0: 43174.6. Samples: 3411610480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:22:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 12:22:10,755][15401] Updated weights for policy 0, policy_version 208230 (0.0026) [2024-06-22 12:22:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3411738624. Throughput: 0: 43052.1. Samples: 3411867740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:22:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 12:22:14,442][15401] Updated weights for policy 0, policy_version 208240 (0.0029) [2024-06-22 12:22:18,178][15401] Updated weights for policy 0, policy_version 208250 (0.0041) [2024-06-22 12:22:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 3411968000. Throughput: 0: 42868.4. Samples: 3412121400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 12:22:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 12:22:21,965][15401] Updated weights for policy 0, policy_version 208260 (0.0040) [2024-06-22 12:22:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3412164608. Throughput: 0: 43044.6. Samples: 3412252980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 12:22:25,659][15401] Updated weights for policy 0, policy_version 208270 (0.0030) [2024-06-22 12:22:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 3412377600. Throughput: 0: 43049.0. Samples: 3412506340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 12:22:29,983][15401] Updated weights for policy 0, policy_version 208280 (0.0035) [2024-06-22 12:22:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3412590592. Throughput: 0: 42900.8. Samples: 3412764640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 12:22:33,532][15401] Updated weights for policy 0, policy_version 208290 (0.0038) [2024-06-22 12:22:37,423][15401] Updated weights for policy 0, policy_version 208300 (0.0027) [2024-06-22 12:22:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 3412819968. Throughput: 0: 43035.9. Samples: 3412898180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 12:22:40,972][15401] Updated weights for policy 0, policy_version 208310 (0.0047) [2024-06-22 12:22:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3413032960. Throughput: 0: 42878.4. Samples: 3413151660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 12:22:45,448][15401] Updated weights for policy 0, policy_version 208320 (0.0047) [2024-06-22 12:22:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3413245952. Throughput: 0: 42827.5. Samples: 3413408080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 12:22:48,976][15401] Updated weights for policy 0, policy_version 208330 (0.0038) [2024-06-22 12:22:53,199][15401] Updated weights for policy 0, policy_version 208340 (0.0042) [2024-06-22 12:22:53,390][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3413442560. Throughput: 0: 42824.9. Samples: 3413537600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 12:22:56,829][15401] Updated weights for policy 0, policy_version 208350 (0.0045) [2024-06-22 12:22:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3413655552. Throughput: 0: 42811.0. Samples: 3413794240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:22:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 12:23:00,731][15401] Updated weights for policy 0, policy_version 208360 (0.0027) [2024-06-22 12:23:03,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 3413901312. Throughput: 0: 42694.3. Samples: 3414042740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:23:03,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 12:23:04,580][15401] Updated weights for policy 0, policy_version 208370 (0.0028) [2024-06-22 12:23:08,301][15401] Updated weights for policy 0, policy_version 208380 (0.0037) [2024-06-22 12:23:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3414097920. Throughput: 0: 42791.4. Samples: 3414178600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:23:08,396][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 12:23:12,115][15401] Updated weights for policy 0, policy_version 208390 (0.0028) [2024-06-22 12:23:13,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3414310912. Throughput: 0: 42854.5. Samples: 3414434800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:23:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 12:23:15,852][15401] Updated weights for policy 0, policy_version 208400 (0.0028) [2024-06-22 12:23:18,103][15349] Signal inference workers to stop experience collection... (50400 times) [2024-06-22 12:23:18,134][15401] InferenceWorker_p0-w0: stopping experience collection (50400 times) [2024-06-22 12:23:18,152][15349] Signal inference workers to resume experience collection... (50400 times) [2024-06-22 12:23:18,153][15401] InferenceWorker_p0-w0: resuming experience collection (50400 times) [2024-06-22 12:23:18,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 42876.1). Total num frames: 3414556672. Throughput: 0: 42770.3. Samples: 3414689400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:23:18,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 12:23:19,601][15401] Updated weights for policy 0, policy_version 208410 (0.0047) [2024-06-22 12:23:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3414736896. Throughput: 0: 42760.1. Samples: 3414822380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-22 12:23:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 12:23:23,524][15401] Updated weights for policy 0, policy_version 208420 (0.0048) [2024-06-22 12:23:27,147][15401] Updated weights for policy 0, policy_version 208430 (0.0031) [2024-06-22 12:23:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3414966272. Throughput: 0: 42748.6. Samples: 3415075340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 12:23:31,249][15401] Updated weights for policy 0, policy_version 208440 (0.0022) [2024-06-22 12:23:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 3415179264. Throughput: 0: 42724.4. Samples: 3415330680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 12:23:35,078][15401] Updated weights for policy 0, policy_version 208450 (0.0035) [2024-06-22 12:23:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3415375872. Throughput: 0: 42691.9. Samples: 3415458740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 12:23:38,687][15401] Updated weights for policy 0, policy_version 208460 (0.0037) [2024-06-22 12:23:42,646][15401] Updated weights for policy 0, policy_version 208470 (0.0024) [2024-06-22 12:23:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.8, 300 sec: 42931.7). Total num frames: 3415621632. Throughput: 0: 42856.2. Samples: 3415722760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 12:23:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208473_3415621632.pth... [2024-06-22 12:23:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000207843_3405299712.pth [2024-06-22 12:23:46,128][15401] Updated weights for policy 0, policy_version 208480 (0.0043) [2024-06-22 12:23:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3415801856. Throughput: 0: 43051.7. Samples: 3415979960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 12:23:50,146][15401] Updated weights for policy 0, policy_version 208490 (0.0037) [2024-06-22 12:23:53,389][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3416031232. Throughput: 0: 42705.5. Samples: 3416100340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:53,390][15132] Avg episode reward: [(0, '0.142')] [2024-06-22 12:23:54,121][15401] Updated weights for policy 0, policy_version 208500 (0.0033) [2024-06-22 12:23:57,716][15401] Updated weights for policy 0, policy_version 208510 (0.0034) [2024-06-22 12:23:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3416244224. Throughput: 0: 42770.3. Samples: 3416359460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:23:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 12:24:01,882][15401] Updated weights for policy 0, policy_version 208520 (0.0030) [2024-06-22 12:24:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 3416440832. Throughput: 0: 42866.3. Samples: 3416618280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 12:24:05,231][15401] Updated weights for policy 0, policy_version 208530 (0.0035) [2024-06-22 12:24:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3416653824. Throughput: 0: 42696.0. Samples: 3416743700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 12:24:09,479][15401] Updated weights for policy 0, policy_version 208540 (0.0032) [2024-06-22 12:24:12,954][15401] Updated weights for policy 0, policy_version 208550 (0.0041) [2024-06-22 12:24:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3416883200. Throughput: 0: 42910.2. Samples: 3417006300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 12:24:16,928][15401] Updated weights for policy 0, policy_version 208560 (0.0031) [2024-06-22 12:24:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 3417096192. Throughput: 0: 43012.4. Samples: 3417266240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 12:24:20,498][15401] Updated weights for policy 0, policy_version 208570 (0.0023) [2024-06-22 12:24:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3417309184. Throughput: 0: 42901.8. Samples: 3417389320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:23,396][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 12:24:24,571][15401] Updated weights for policy 0, policy_version 208580 (0.0028) [2024-06-22 12:24:27,960][15401] Updated weights for policy 0, policy_version 208590 (0.0030) [2024-06-22 12:24:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 3417538560. Throughput: 0: 42866.0. Samples: 3417651840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:28,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 12:24:32,143][15401] Updated weights for policy 0, policy_version 208600 (0.0031) [2024-06-22 12:24:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 3417751552. Throughput: 0: 42903.5. Samples: 3417910620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 12:24:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 12:24:35,521][15401] Updated weights for policy 0, policy_version 208610 (0.0029) [2024-06-22 12:24:38,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.9, 300 sec: 42876.1). Total num frames: 3417948160. Throughput: 0: 43017.7. Samples: 3418036240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:24:38,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 12:24:39,173][15349] Signal inference workers to stop experience collection... (50450 times) [2024-06-22 12:24:39,180][15349] Signal inference workers to resume experience collection... (50450 times) [2024-06-22 12:24:39,185][15401] InferenceWorker_p0-w0: stopping experience collection (50450 times) [2024-06-22 12:24:39,221][15401] InferenceWorker_p0-w0: resuming experience collection (50450 times) [2024-06-22 12:24:39,756][15401] Updated weights for policy 0, policy_version 208620 (0.0036) [2024-06-22 12:24:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3418177536. Throughput: 0: 43065.7. Samples: 3418297420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:24:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 12:24:43,581][15401] Updated weights for policy 0, policy_version 208630 (0.0038) [2024-06-22 12:24:47,244][15401] Updated weights for policy 0, policy_version 208640 (0.0035) [2024-06-22 12:24:48,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3418406912. Throughput: 0: 42901.7. Samples: 3418548860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:24:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 12:24:51,071][15401] Updated weights for policy 0, policy_version 208650 (0.0033) [2024-06-22 12:24:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 3418587136. Throughput: 0: 42915.9. Samples: 3418674920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:24:53,394][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 12:24:55,099][15401] Updated weights for policy 0, policy_version 208660 (0.0046) [2024-06-22 12:24:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3418816512. Throughput: 0: 42870.7. Samples: 3418935480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:24:58,393][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 12:24:58,986][15401] Updated weights for policy 0, policy_version 208670 (0.0028) [2024-06-22 12:25:02,690][15401] Updated weights for policy 0, policy_version 208680 (0.0037) [2024-06-22 12:25:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3419029504. Throughput: 0: 42860.4. Samples: 3419194960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 12:25:06,670][15401] Updated weights for policy 0, policy_version 208690 (0.0031) [2024-06-22 12:25:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 3419242496. Throughput: 0: 43004.6. Samples: 3419324520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 12:25:10,346][15401] Updated weights for policy 0, policy_version 208700 (0.0031) [2024-06-22 12:25:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3419439104. Throughput: 0: 42711.5. Samples: 3419573760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 12:25:14,354][15401] Updated weights for policy 0, policy_version 208710 (0.0041) [2024-06-22 12:25:17,894][15401] Updated weights for policy 0, policy_version 208720 (0.0026) [2024-06-22 12:25:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3419684864. Throughput: 0: 42816.5. Samples: 3419837360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 12:25:22,138][15401] Updated weights for policy 0, policy_version 208730 (0.0032) [2024-06-22 12:25:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3419881472. Throughput: 0: 43076.4. Samples: 3419974580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:23,399][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 12:25:25,691][15401] Updated weights for policy 0, policy_version 208740 (0.0030) [2024-06-22 12:25:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3420094464. Throughput: 0: 42673.8. Samples: 3420217740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 12:25:29,830][15401] Updated weights for policy 0, policy_version 208750 (0.0028) [2024-06-22 12:25:33,229][15401] Updated weights for policy 0, policy_version 208760 (0.0047) [2024-06-22 12:25:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3420323840. Throughput: 0: 42898.6. Samples: 3420479300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 12:25:37,312][15401] Updated weights for policy 0, policy_version 208770 (0.0038) [2024-06-22 12:25:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 3420520448. Throughput: 0: 43013.7. Samples: 3420610540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 12:25:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 12:25:40,730][15401] Updated weights for policy 0, policy_version 208780 (0.0038) [2024-06-22 12:25:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3420749824. Throughput: 0: 42819.5. Samples: 3420862360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:25:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 12:25:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208786_3420749824.pth... [2024-06-22 12:25:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208158_3410460672.pth [2024-06-22 12:25:44,957][15401] Updated weights for policy 0, policy_version 208790 (0.0045) [2024-06-22 12:25:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3420962816. Throughput: 0: 42792.1. Samples: 3421120600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:25:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 12:25:48,395][15401] Updated weights for policy 0, policy_version 208800 (0.0026) [2024-06-22 12:25:52,473][15401] Updated weights for policy 0, policy_version 208810 (0.0027) [2024-06-22 12:25:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3421175808. Throughput: 0: 42844.8. Samples: 3421252540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:25:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 12:25:55,983][15401] Updated weights for policy 0, policy_version 208820 (0.0035) [2024-06-22 12:25:57,619][15349] Signal inference workers to stop experience collection... (50500 times) [2024-06-22 12:25:57,674][15401] InferenceWorker_p0-w0: stopping experience collection (50500 times) [2024-06-22 12:25:57,675][15349] Signal inference workers to resume experience collection... (50500 times) [2024-06-22 12:25:57,696][15401] InferenceWorker_p0-w0: resuming experience collection (50500 times) [2024-06-22 12:25:58,396][15132] Fps is (10 sec: 42570.3, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 3421388800. Throughput: 0: 42847.7. Samples: 3421502180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:25:58,397][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 12:26:00,222][15401] Updated weights for policy 0, policy_version 208830 (0.0027) [2024-06-22 12:26:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3421601792. Throughput: 0: 42815.4. Samples: 3421764060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 12:26:03,776][15401] Updated weights for policy 0, policy_version 208840 (0.0038) [2024-06-22 12:26:07,865][15401] Updated weights for policy 0, policy_version 208850 (0.0039) [2024-06-22 12:26:08,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3421814784. Throughput: 0: 42566.8. Samples: 3421890080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 12:26:11,349][15401] Updated weights for policy 0, policy_version 208860 (0.0033) [2024-06-22 12:26:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 3422027776. Throughput: 0: 42708.4. Samples: 3422139620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 12:26:15,756][15401] Updated weights for policy 0, policy_version 208870 (0.0031) [2024-06-22 12:26:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 3422224384. Throughput: 0: 42858.7. Samples: 3422407940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:18,391][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 12:26:19,055][15401] Updated weights for policy 0, policy_version 208880 (0.0037) [2024-06-22 12:26:23,219][15401] Updated weights for policy 0, policy_version 208890 (0.0031) [2024-06-22 12:26:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3422453760. Throughput: 0: 42682.3. Samples: 3422531240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 12:26:26,817][15401] Updated weights for policy 0, policy_version 208900 (0.0030) [2024-06-22 12:26:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3422666752. Throughput: 0: 42660.0. Samples: 3422782060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 12:26:30,752][15401] Updated weights for policy 0, policy_version 208910 (0.0038) [2024-06-22 12:26:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3422863360. Throughput: 0: 42914.6. Samples: 3423051760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 12:26:34,627][15401] Updated weights for policy 0, policy_version 208920 (0.0034) [2024-06-22 12:26:38,222][15401] Updated weights for policy 0, policy_version 208930 (0.0032) [2024-06-22 12:26:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3423109120. Throughput: 0: 42749.2. Samples: 3423176260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 12:26:42,183][15401] Updated weights for policy 0, policy_version 208940 (0.0037) [2024-06-22 12:26:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3423322112. Throughput: 0: 42914.5. Samples: 3423433060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 12:26:45,793][15401] Updated weights for policy 0, policy_version 208950 (0.0039) [2024-06-22 12:26:48,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3423502336. Throughput: 0: 42813.5. Samples: 3423690660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 12:26:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 12:26:50,011][15401] Updated weights for policy 0, policy_version 208960 (0.0043) [2024-06-22 12:26:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3423748096. Throughput: 0: 42813.2. Samples: 3423816680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:26:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 12:26:53,755][15401] Updated weights for policy 0, policy_version 208970 (0.0030) [2024-06-22 12:26:57,649][15401] Updated weights for policy 0, policy_version 208980 (0.0039) [2024-06-22 12:26:58,390][15132] Fps is (10 sec: 44235.3, 60 sec: 42602.8, 300 sec: 42820.5). Total num frames: 3423944704. Throughput: 0: 43035.9. Samples: 3424076240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:26:58,390][15132] Avg episode reward: [(0, '0.882')] [2024-06-22 12:27:01,382][15401] Updated weights for policy 0, policy_version 208990 (0.0040) [2024-06-22 12:27:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3424157696. Throughput: 0: 42773.4. Samples: 3424332740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 12:27:05,293][15401] Updated weights for policy 0, policy_version 209000 (0.0033) [2024-06-22 12:27:08,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3424387072. Throughput: 0: 42887.0. Samples: 3424461160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 12:27:08,828][15401] Updated weights for policy 0, policy_version 209010 (0.0039) [2024-06-22 12:27:12,942][15401] Updated weights for policy 0, policy_version 209020 (0.0029) [2024-06-22 12:27:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3424583680. Throughput: 0: 42946.4. Samples: 3424714640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 12:27:16,756][15401] Updated weights for policy 0, policy_version 209030 (0.0035) [2024-06-22 12:27:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3424796672. Throughput: 0: 42604.5. Samples: 3424968960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 12:27:20,529][15401] Updated weights for policy 0, policy_version 209040 (0.0039) [2024-06-22 12:27:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3425026048. Throughput: 0: 42673.0. Samples: 3425096540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 12:27:24,246][15401] Updated weights for policy 0, policy_version 209050 (0.0035) [2024-06-22 12:27:28,259][15401] Updated weights for policy 0, policy_version 209060 (0.0033) [2024-06-22 12:27:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3425239040. Throughput: 0: 42757.5. Samples: 3425357140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 12:27:30,285][15349] Signal inference workers to stop experience collection... (50550 times) [2024-06-22 12:27:30,285][15349] Signal inference workers to resume experience collection... (50550 times) [2024-06-22 12:27:30,294][15401] InferenceWorker_p0-w0: stopping experience collection (50550 times) [2024-06-22 12:27:30,295][15401] InferenceWorker_p0-w0: resuming experience collection (50550 times) [2024-06-22 12:27:31,946][15401] Updated weights for policy 0, policy_version 209070 (0.0028) [2024-06-22 12:27:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3425452032. Throughput: 0: 42659.9. Samples: 3425610360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 12:27:35,991][15401] Updated weights for policy 0, policy_version 209080 (0.0036) [2024-06-22 12:27:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3425648640. Throughput: 0: 42543.1. Samples: 3425731120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 12:27:39,514][15401] Updated weights for policy 0, policy_version 209090 (0.0033) [2024-06-22 12:27:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 3425878016. Throughput: 0: 42607.4. Samples: 3425993560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 12:27:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209100_3425894400.pth... [2024-06-22 12:27:43,531][15401] Updated weights for policy 0, policy_version 209100 (0.0030) [2024-06-22 12:27:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208473_3415621632.pth [2024-06-22 12:27:47,909][15401] Updated weights for policy 0, policy_version 209110 (0.0035) [2024-06-22 12:27:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3426091008. Throughput: 0: 42473.4. Samples: 3426244040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:27:51,365][15401] Updated weights for policy 0, policy_version 209120 (0.0043) [2024-06-22 12:27:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.7, 300 sec: 42764.7). Total num frames: 3426271232. Throughput: 0: 42403.6. Samples: 3426369420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 12:27:53,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 12:27:55,473][15401] Updated weights for policy 0, policy_version 209130 (0.0030) [2024-06-22 12:27:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 3426500608. Throughput: 0: 42499.9. Samples: 3426627140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:27:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 12:27:59,642][15401] Updated weights for policy 0, policy_version 209140 (0.0031) [2024-06-22 12:28:03,080][15401] Updated weights for policy 0, policy_version 209150 (0.0033) [2024-06-22 12:28:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3426713600. Throughput: 0: 42471.6. Samples: 3426880180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 12:28:07,078][15401] Updated weights for policy 0, policy_version 209160 (0.0038) [2024-06-22 12:28:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3426926592. Throughput: 0: 42456.0. Samples: 3427007060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 12:28:10,656][15401] Updated weights for policy 0, policy_version 209170 (0.0037) [2024-06-22 12:28:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 3427155968. Throughput: 0: 42527.1. Samples: 3427270860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 12:28:14,571][15401] Updated weights for policy 0, policy_version 209180 (0.0040) [2024-06-22 12:28:18,209][15401] Updated weights for policy 0, policy_version 209190 (0.0033) [2024-06-22 12:28:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3427368960. Throughput: 0: 42653.3. Samples: 3427529760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 12:28:22,184][15401] Updated weights for policy 0, policy_version 209200 (0.0034) [2024-06-22 12:28:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3427581952. Throughput: 0: 42730.9. Samples: 3427654000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:23,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 12:28:25,802][15401] Updated weights for policy 0, policy_version 209210 (0.0030) [2024-06-22 12:28:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3427794944. Throughput: 0: 42829.3. Samples: 3427920880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 12:28:29,881][15401] Updated weights for policy 0, policy_version 209220 (0.0040) [2024-06-22 12:28:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3428007936. Throughput: 0: 42844.0. Samples: 3428172020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 12:28:33,538][15401] Updated weights for policy 0, policy_version 209230 (0.0031) [2024-06-22 12:28:37,418][15401] Updated weights for policy 0, policy_version 209240 (0.0022) [2024-06-22 12:28:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3428237312. Throughput: 0: 42937.8. Samples: 3428301520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 12:28:41,186][15401] Updated weights for policy 0, policy_version 209250 (0.0031) [2024-06-22 12:28:43,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 3428417536. Throughput: 0: 43084.5. Samples: 3428566220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:43,397][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 12:28:44,882][15401] Updated weights for policy 0, policy_version 209260 (0.0033) [2024-06-22 12:28:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3428646912. Throughput: 0: 42990.5. Samples: 3428814760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:28:48,734][15401] Updated weights for policy 0, policy_version 209270 (0.0029) [2024-06-22 12:28:52,449][15401] Updated weights for policy 0, policy_version 209280 (0.0032) [2024-06-22 12:28:53,389][15132] Fps is (10 sec: 45905.1, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 3428876288. Throughput: 0: 43206.4. Samples: 3428951340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 12:28:56,103][15401] Updated weights for policy 0, policy_version 209290 (0.0032) [2024-06-22 12:28:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 3429056512. Throughput: 0: 43127.5. Samples: 3429211700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:28:58,392][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 12:29:00,146][15401] Updated weights for policy 0, policy_version 209300 (0.0037) [2024-06-22 12:29:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3429302272. Throughput: 0: 42877.4. Samples: 3429459240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-22 12:29:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 12:29:03,550][15401] Updated weights for policy 0, policy_version 209310 (0.0023) [2024-06-22 12:29:06,449][15349] Signal inference workers to stop experience collection... (50600 times) [2024-06-22 12:29:06,453][15349] Signal inference workers to resume experience collection... (50600 times) [2024-06-22 12:29:06,497][15401] InferenceWorker_p0-w0: stopping experience collection (50600 times) [2024-06-22 12:29:06,498][15401] InferenceWorker_p0-w0: resuming experience collection (50600 times) [2024-06-22 12:29:07,883][15401] Updated weights for policy 0, policy_version 209320 (0.0024) [2024-06-22 12:29:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3429498880. Throughput: 0: 43175.9. Samples: 3429596920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 12:29:11,016][15401] Updated weights for policy 0, policy_version 209330 (0.0038) [2024-06-22 12:29:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3429695488. Throughput: 0: 42903.6. Samples: 3429851540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:13,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 12:29:15,396][15401] Updated weights for policy 0, policy_version 209340 (0.0032) [2024-06-22 12:29:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3429974016. Throughput: 0: 42893.4. Samples: 3430102220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 12:29:18,482][15401] Updated weights for policy 0, policy_version 209350 (0.0037) [2024-06-22 12:29:22,890][15401] Updated weights for policy 0, policy_version 209360 (0.0030) [2024-06-22 12:29:23,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 3430170624. Throughput: 0: 43146.6. Samples: 3430243120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 12:29:26,333][15401] Updated weights for policy 0, policy_version 209370 (0.0037) [2024-06-22 12:29:28,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3430334464. Throughput: 0: 42738.1. Samples: 3430489160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 12:29:30,612][15401] Updated weights for policy 0, policy_version 209380 (0.0037) [2024-06-22 12:29:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 3430612992. Throughput: 0: 42861.0. Samples: 3430743500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 12:29:34,280][15401] Updated weights for policy 0, policy_version 209390 (0.0031) [2024-06-22 12:29:38,244][15401] Updated weights for policy 0, policy_version 209400 (0.0033) [2024-06-22 12:29:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3430809600. Throughput: 0: 43023.1. Samples: 3430887380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 12:29:41,925][15401] Updated weights for policy 0, policy_version 209410 (0.0034) [2024-06-22 12:29:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 3430989824. Throughput: 0: 42692.0. Samples: 3431132740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 12:29:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209411_3430989824.pth... [2024-06-22 12:29:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000208786_3420749824.pth [2024-06-22 12:29:45,909][15401] Updated weights for policy 0, policy_version 209420 (0.0026) [2024-06-22 12:29:48,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43689.0, 300 sec: 42986.8). Total num frames: 3431268352. Throughput: 0: 42927.8. Samples: 3431391100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:48,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 12:29:49,532][15401] Updated weights for policy 0, policy_version 209430 (0.0036) [2024-06-22 12:29:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3431448576. Throughput: 0: 43024.5. Samples: 3431533020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 12:29:53,524][15401] Updated weights for policy 0, policy_version 209440 (0.0041) [2024-06-22 12:29:57,350][15401] Updated weights for policy 0, policy_version 209450 (0.0028) [2024-06-22 12:29:58,390][15132] Fps is (10 sec: 37692.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 3431645184. Throughput: 0: 42835.0. Samples: 3431779120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:29:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 12:30:01,291][15401] Updated weights for policy 0, policy_version 209460 (0.0032) [2024-06-22 12:30:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3431874560. Throughput: 0: 42913.2. Samples: 3432033320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:30:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 12:30:04,918][15401] Updated weights for policy 0, policy_version 209470 (0.0030) [2024-06-22 12:30:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3432071168. Throughput: 0: 42838.7. Samples: 3432170860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 12:30:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 12:30:09,110][15401] Updated weights for policy 0, policy_version 209480 (0.0029) [2024-06-22 12:30:12,476][15401] Updated weights for policy 0, policy_version 209490 (0.0038) [2024-06-22 12:30:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 3432300544. Throughput: 0: 42888.8. Samples: 3432419260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:13,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 12:30:16,813][15401] Updated weights for policy 0, policy_version 209500 (0.0042) [2024-06-22 12:30:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3432513536. Throughput: 0: 42986.4. Samples: 3432677880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 12:30:20,517][15401] Updated weights for policy 0, policy_version 209510 (0.0031) [2024-06-22 12:30:21,951][15349] Signal inference workers to stop experience collection... (50650 times) [2024-06-22 12:30:21,951][15349] Signal inference workers to resume experience collection... (50650 times) [2024-06-22 12:30:21,997][15401] InferenceWorker_p0-w0: stopping experience collection (50650 times) [2024-06-22 12:30:21,997][15401] InferenceWorker_p0-w0: resuming experience collection (50650 times) [2024-06-22 12:30:23,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3432726528. Throughput: 0: 42684.8. Samples: 3432808200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 12:30:24,221][15401] Updated weights for policy 0, policy_version 209520 (0.0028) [2024-06-22 12:30:28,043][15401] Updated weights for policy 0, policy_version 209530 (0.0028) [2024-06-22 12:30:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3432939520. Throughput: 0: 42835.1. Samples: 3433060320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 12:30:31,760][15401] Updated weights for policy 0, policy_version 209540 (0.0025) [2024-06-22 12:30:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3433152512. Throughput: 0: 42765.8. Samples: 3433315460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 12:30:35,643][15401] Updated weights for policy 0, policy_version 209550 (0.0047) [2024-06-22 12:30:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3433365504. Throughput: 0: 42419.5. Samples: 3433441900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:38,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 12:30:39,827][15401] Updated weights for policy 0, policy_version 209560 (0.0036) [2024-06-22 12:30:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3433578496. Throughput: 0: 42695.4. Samples: 3433700420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 12:30:43,626][15401] Updated weights for policy 0, policy_version 209570 (0.0037) [2024-06-22 12:30:47,626][15401] Updated weights for policy 0, policy_version 209580 (0.0032) [2024-06-22 12:30:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41780.8, 300 sec: 42709.5). Total num frames: 3433775104. Throughput: 0: 42668.0. Samples: 3433953380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 12:30:51,403][15401] Updated weights for policy 0, policy_version 209590 (0.0030) [2024-06-22 12:30:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 3434020864. Throughput: 0: 42457.8. Samples: 3434081460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 12:30:55,163][15401] Updated weights for policy 0, policy_version 209600 (0.0028) [2024-06-22 12:30:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3434201088. Throughput: 0: 42727.7. Samples: 3434341900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:30:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 12:30:59,032][15401] Updated weights for policy 0, policy_version 209610 (0.0028) [2024-06-22 12:31:02,892][15401] Updated weights for policy 0, policy_version 209620 (0.0041) [2024-06-22 12:31:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3434430464. Throughput: 0: 42560.8. Samples: 3434593120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:31:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 12:31:06,557][15401] Updated weights for policy 0, policy_version 209630 (0.0036) [2024-06-22 12:31:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3434659840. Throughput: 0: 42573.0. Samples: 3434723980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:31:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 12:31:10,593][15401] Updated weights for policy 0, policy_version 209640 (0.0032) [2024-06-22 12:31:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3434856448. Throughput: 0: 42697.9. Samples: 3434981720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:31:13,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 12:31:14,311][15401] Updated weights for policy 0, policy_version 209650 (0.0040) [2024-06-22 12:31:18,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3435069440. Throughput: 0: 42742.6. Samples: 3435238980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:31:18,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 12:31:18,398][15401] Updated weights for policy 0, policy_version 209660 (0.0041) [2024-06-22 12:31:21,885][15401] Updated weights for policy 0, policy_version 209670 (0.0046) [2024-06-22 12:31:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 3435298816. Throughput: 0: 42731.3. Samples: 3435364800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 12:31:25,861][15401] Updated weights for policy 0, policy_version 209680 (0.0041) [2024-06-22 12:31:28,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 3435479040. Throughput: 0: 42667.2. Samples: 3435620540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:28,392][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 12:31:29,808][15401] Updated weights for policy 0, policy_version 209690 (0.0023) [2024-06-22 12:31:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3435708416. Throughput: 0: 42806.3. Samples: 3435879660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 12:31:33,770][15401] Updated weights for policy 0, policy_version 209700 (0.0032) [2024-06-22 12:31:37,397][15401] Updated weights for policy 0, policy_version 209710 (0.0036) [2024-06-22 12:31:38,389][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3435937792. Throughput: 0: 42891.6. Samples: 3436011580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 12:31:41,375][15401] Updated weights for policy 0, policy_version 209720 (0.0035) [2024-06-22 12:31:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3436118016. Throughput: 0: 42746.6. Samples: 3436265500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 12:31:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209725_3436134400.pth... [2024-06-22 12:31:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209100_3425894400.pth [2024-06-22 12:31:44,975][15401] Updated weights for policy 0, policy_version 209730 (0.0028) [2024-06-22 12:31:44,976][15349] Signal inference workers to stop experience collection... (50700 times) [2024-06-22 12:31:44,976][15349] Signal inference workers to resume experience collection... (50700 times) [2024-06-22 12:31:45,024][15401] InferenceWorker_p0-w0: stopping experience collection (50700 times) [2024-06-22 12:31:45,024][15401] InferenceWorker_p0-w0: resuming experience collection (50700 times) [2024-06-22 12:31:48,391][15132] Fps is (10 sec: 40955.6, 60 sec: 42870.8, 300 sec: 42709.3). Total num frames: 3436347392. Throughput: 0: 42756.7. Samples: 3436517220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:48,391][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 12:31:48,869][15401] Updated weights for policy 0, policy_version 209740 (0.0039) [2024-06-22 12:31:52,489][15401] Updated weights for policy 0, policy_version 209750 (0.0026) [2024-06-22 12:31:53,396][15132] Fps is (10 sec: 47483.0, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 3436593152. Throughput: 0: 42868.9. Samples: 3436653360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:53,397][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 12:31:56,615][15401] Updated weights for policy 0, policy_version 209760 (0.0036) [2024-06-22 12:31:58,389][15132] Fps is (10 sec: 40964.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3436756992. Throughput: 0: 42932.1. Samples: 3436913660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:31:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 12:32:00,057][15401] Updated weights for policy 0, policy_version 209770 (0.0041) [2024-06-22 12:32:03,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3437002752. Throughput: 0: 42695.6. Samples: 3437160180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:32:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 12:32:04,083][15401] Updated weights for policy 0, policy_version 209780 (0.0029) [2024-06-22 12:32:07,955][15401] Updated weights for policy 0, policy_version 209790 (0.0029) [2024-06-22 12:32:08,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3437232128. Throughput: 0: 42911.0. Samples: 3437295800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:32:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 12:32:11,589][15401] Updated weights for policy 0, policy_version 209800 (0.0030) [2024-06-22 12:32:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3437412352. Throughput: 0: 42896.4. Samples: 3437550780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:32:13,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 12:32:15,526][15401] Updated weights for policy 0, policy_version 209810 (0.0035) [2024-06-22 12:32:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 3437658112. Throughput: 0: 42714.2. Samples: 3437801800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:32:18,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-22 12:32:19,523][15401] Updated weights for policy 0, policy_version 209820 (0.0027) [2024-06-22 12:32:23,090][15401] Updated weights for policy 0, policy_version 209830 (0.0031) [2024-06-22 12:32:23,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3437871104. Throughput: 0: 42882.1. Samples: 3437941380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-22 12:32:23,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 12:32:27,084][15401] Updated weights for policy 0, policy_version 209840 (0.0023) [2024-06-22 12:32:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 3438067712. Throughput: 0: 42858.7. Samples: 3438194140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 12:32:30,853][15401] Updated weights for policy 0, policy_version 209850 (0.0047) [2024-06-22 12:32:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 3438313472. Throughput: 0: 42912.6. Samples: 3438448240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 12:32:34,532][15401] Updated weights for policy 0, policy_version 209860 (0.0028) [2024-06-22 12:32:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 3438493696. Throughput: 0: 42740.3. Samples: 3438576500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:38,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 12:32:38,607][15401] Updated weights for policy 0, policy_version 209870 (0.0043) [2024-06-22 12:32:42,205][15401] Updated weights for policy 0, policy_version 209880 (0.0033) [2024-06-22 12:32:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3438723072. Throughput: 0: 42730.6. Samples: 3438836540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 12:32:46,249][15401] Updated weights for policy 0, policy_version 209890 (0.0039) [2024-06-22 12:32:47,917][15349] Signal inference workers to stop experience collection... (50750 times) [2024-06-22 12:32:47,924][15349] Signal inference workers to resume experience collection... (50750 times) [2024-06-22 12:32:47,956][15401] InferenceWorker_p0-w0: stopping experience collection (50750 times) [2024-06-22 12:32:47,956][15401] InferenceWorker_p0-w0: resuming experience collection (50750 times) [2024-06-22 12:32:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43145.3, 300 sec: 42932.0). Total num frames: 3438936064. Throughput: 0: 42881.4. Samples: 3439089840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:48,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 12:32:49,803][15401] Updated weights for policy 0, policy_version 209900 (0.0025) [2024-06-22 12:32:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42329.8, 300 sec: 42820.5). Total num frames: 3439132672. Throughput: 0: 42696.8. Samples: 3439217160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:53,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 12:32:53,840][15401] Updated weights for policy 0, policy_version 209910 (0.0033) [2024-06-22 12:32:57,286][15401] Updated weights for policy 0, policy_version 209920 (0.0034) [2024-06-22 12:32:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3439345664. Throughput: 0: 42857.0. Samples: 3439479340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:32:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 12:33:01,397][15401] Updated weights for policy 0, policy_version 209930 (0.0031) [2024-06-22 12:33:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3439591424. Throughput: 0: 42953.8. Samples: 3439734720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 12:33:05,165][15401] Updated weights for policy 0, policy_version 209940 (0.0032) [2024-06-22 12:33:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3439755264. Throughput: 0: 42611.6. Samples: 3439858800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 12:33:09,141][15401] Updated weights for policy 0, policy_version 209950 (0.0035) [2024-06-22 12:33:12,882][15401] Updated weights for policy 0, policy_version 209960 (0.0030) [2024-06-22 12:33:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3440001024. Throughput: 0: 42772.3. Samples: 3440118900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 12:33:16,771][15401] Updated weights for policy 0, policy_version 209970 (0.0030) [2024-06-22 12:33:18,390][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3440246784. Throughput: 0: 42768.4. Samples: 3440372820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 12:33:20,518][15401] Updated weights for policy 0, policy_version 209980 (0.0047) [2024-06-22 12:33:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 3440394240. Throughput: 0: 42908.6. Samples: 3440507280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 12:33:24,478][15401] Updated weights for policy 0, policy_version 209990 (0.0037) [2024-06-22 12:33:27,986][15401] Updated weights for policy 0, policy_version 210000 (0.0031) [2024-06-22 12:33:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3440656384. Throughput: 0: 42898.6. Samples: 3440766980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:28,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 12:33:31,953][15401] Updated weights for policy 0, policy_version 210010 (0.0033) [2024-06-22 12:33:33,389][15132] Fps is (10 sec: 49151.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3440885760. Throughput: 0: 42999.1. Samples: 3441024800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 12:33:35,456][15401] Updated weights for policy 0, policy_version 210020 (0.0032) [2024-06-22 12:33:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42877.0). Total num frames: 3441065984. Throughput: 0: 43075.1. Samples: 3441155540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 12:33:39,556][15401] Updated weights for policy 0, policy_version 210030 (0.0046) [2024-06-22 12:33:42,822][15401] Updated weights for policy 0, policy_version 210040 (0.0031) [2024-06-22 12:33:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3441295360. Throughput: 0: 43054.9. Samples: 3441416820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 12:33:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210041_3441311744.pth... [2024-06-22 12:33:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209411_3430989824.pth [2024-06-22 12:33:47,046][15401] Updated weights for policy 0, policy_version 210050 (0.0045) [2024-06-22 12:33:48,389][15132] Fps is (10 sec: 47514.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3441541120. Throughput: 0: 43184.5. Samples: 3441678020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 12:33:50,714][15401] Updated weights for policy 0, policy_version 210060 (0.0029) [2024-06-22 12:33:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 3441721344. Throughput: 0: 43427.6. Samples: 3441813040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 12:33:54,639][15401] Updated weights for policy 0, policy_version 210070 (0.0029) [2024-06-22 12:33:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3441934336. Throughput: 0: 43217.1. Samples: 3442063660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:33:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 12:33:58,431][15401] Updated weights for policy 0, policy_version 210080 (0.0035) [2024-06-22 12:34:02,163][15401] Updated weights for policy 0, policy_version 210090 (0.0033) [2024-06-22 12:34:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 3442180096. Throughput: 0: 43389.2. Samples: 3442325340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 12:34:05,884][15401] Updated weights for policy 0, policy_version 210100 (0.0032) [2024-06-22 12:34:08,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3442360320. Throughput: 0: 43357.6. Samples: 3442458380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 12:34:09,938][15401] Updated weights for policy 0, policy_version 210110 (0.0039) [2024-06-22 12:34:10,724][15349] Signal inference workers to stop experience collection... (50800 times) [2024-06-22 12:34:10,725][15349] Signal inference workers to resume experience collection... (50800 times) [2024-06-22 12:34:10,755][15401] InferenceWorker_p0-w0: stopping experience collection (50800 times) [2024-06-22 12:34:10,755][15401] InferenceWorker_p0-w0: resuming experience collection (50800 times) [2024-06-22 12:34:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3442589696. Throughput: 0: 43261.4. Samples: 3442713740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:13,396][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 12:34:13,549][15401] Updated weights for policy 0, policy_version 210120 (0.0033) [2024-06-22 12:34:17,423][15401] Updated weights for policy 0, policy_version 210130 (0.0030) [2024-06-22 12:34:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3442819072. Throughput: 0: 43298.6. Samples: 3442973240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 12:34:21,075][15401] Updated weights for policy 0, policy_version 210140 (0.0024) [2024-06-22 12:34:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 3443015680. Throughput: 0: 43304.6. Samples: 3443104240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 12:34:25,011][15401] Updated weights for policy 0, policy_version 210150 (0.0034) [2024-06-22 12:34:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3443245056. Throughput: 0: 43188.5. Samples: 3443360300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 12:34:28,834][15401] Updated weights for policy 0, policy_version 210160 (0.0039) [2024-06-22 12:34:32,669][15401] Updated weights for policy 0, policy_version 210170 (0.0033) [2024-06-22 12:34:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3443474432. Throughput: 0: 43169.6. Samples: 3443620660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 12:34:36,588][15401] Updated weights for policy 0, policy_version 210180 (0.0039) [2024-06-22 12:34:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 3443654656. Throughput: 0: 43056.5. Samples: 3443750580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 12:34:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 12:34:40,237][15401] Updated weights for policy 0, policy_version 210190 (0.0041) [2024-06-22 12:34:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 3443900416. Throughput: 0: 43035.8. Samples: 3444000280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:34:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 12:34:44,008][15401] Updated weights for policy 0, policy_version 210200 (0.0037) [2024-06-22 12:34:47,718][15401] Updated weights for policy 0, policy_version 210210 (0.0049) [2024-06-22 12:34:48,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3444113408. Throughput: 0: 43153.8. Samples: 3444267260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:34:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 12:34:51,572][15401] Updated weights for policy 0, policy_version 210220 (0.0040) [2024-06-22 12:34:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3444293632. Throughput: 0: 43084.9. Samples: 3444397200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:34:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 12:34:55,603][15401] Updated weights for policy 0, policy_version 210230 (0.0037) [2024-06-22 12:34:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 3444539392. Throughput: 0: 43084.8. Samples: 3444652560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:34:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 12:34:58,947][15401] Updated weights for policy 0, policy_version 210240 (0.0038) [2024-06-22 12:35:03,059][15401] Updated weights for policy 0, policy_version 210250 (0.0029) [2024-06-22 12:35:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3444752384. Throughput: 0: 43163.1. Samples: 3444915580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 12:35:06,637][15401] Updated weights for policy 0, policy_version 210260 (0.0027) [2024-06-22 12:35:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 3444948992. Throughput: 0: 43134.6. Samples: 3445045300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:08,400][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 12:35:10,505][15401] Updated weights for policy 0, policy_version 210270 (0.0035) [2024-06-22 12:35:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3445194752. Throughput: 0: 43208.1. Samples: 3445304660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 12:35:14,160][15401] Updated weights for policy 0, policy_version 210280 (0.0033) [2024-06-22 12:35:18,086][15401] Updated weights for policy 0, policy_version 210290 (0.0046) [2024-06-22 12:35:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3445407744. Throughput: 0: 43328.1. Samples: 3445570420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 12:35:18,534][15349] Signal inference workers to stop experience collection... (50850 times) [2024-06-22 12:35:18,583][15401] InferenceWorker_p0-w0: stopping experience collection (50850 times) [2024-06-22 12:35:18,586][15349] Signal inference workers to resume experience collection... (50850 times) [2024-06-22 12:35:18,598][15401] InferenceWorker_p0-w0: resuming experience collection (50850 times) [2024-06-22 12:35:21,943][15401] Updated weights for policy 0, policy_version 210300 (0.0038) [2024-06-22 12:35:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3445604352. Throughput: 0: 43221.6. Samples: 3445695560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 12:35:25,622][15401] Updated weights for policy 0, policy_version 210310 (0.0033) [2024-06-22 12:35:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 3445850112. Throughput: 0: 43402.7. Samples: 3445953400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 12:35:29,372][15401] Updated weights for policy 0, policy_version 210320 (0.0031) [2024-06-22 12:35:33,084][15401] Updated weights for policy 0, policy_version 210330 (0.0040) [2024-06-22 12:35:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3446063104. Throughput: 0: 43208.1. Samples: 3446211620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 12:35:36,944][15401] Updated weights for policy 0, policy_version 210340 (0.0032) [2024-06-22 12:35:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3446259712. Throughput: 0: 43157.9. Samples: 3446339300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 12:35:40,560][15401] Updated weights for policy 0, policy_version 210350 (0.0027) [2024-06-22 12:35:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 3446489088. Throughput: 0: 43234.8. Samples: 3446598120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 12:35:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210357_3446489088.pth... [2024-06-22 12:35:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000209725_3436134400.pth [2024-06-22 12:35:44,478][15401] Updated weights for policy 0, policy_version 210360 (0.0029) [2024-06-22 12:35:48,181][15401] Updated weights for policy 0, policy_version 210370 (0.0043) [2024-06-22 12:35:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 3446702080. Throughput: 0: 43243.0. Samples: 3446861620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 12:35:48,393][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 12:35:52,035][15401] Updated weights for policy 0, policy_version 210380 (0.0034) [2024-06-22 12:35:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 3446915072. Throughput: 0: 43247.0. Samples: 3446991420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:35:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 12:35:55,868][15401] Updated weights for policy 0, policy_version 210390 (0.0030) [2024-06-22 12:35:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 3447144448. Throughput: 0: 43190.1. Samples: 3447248220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:35:58,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 12:35:59,599][15401] Updated weights for policy 0, policy_version 210400 (0.0030) [2024-06-22 12:36:03,343][15401] Updated weights for policy 0, policy_version 210410 (0.0026) [2024-06-22 12:36:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 3447357440. Throughput: 0: 43133.9. Samples: 3447511440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 12:36:07,075][15401] Updated weights for policy 0, policy_version 210420 (0.0028) [2024-06-22 12:36:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 3447570432. Throughput: 0: 43293.9. Samples: 3447643780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 12:36:10,956][15401] Updated weights for policy 0, policy_version 210430 (0.0033) [2024-06-22 12:36:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 43098.6). Total num frames: 3447783424. Throughput: 0: 43223.5. Samples: 3447898460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 12:36:14,755][15401] Updated weights for policy 0, policy_version 210440 (0.0029) [2024-06-22 12:36:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 43042.3). Total num frames: 3447996416. Throughput: 0: 43258.1. Samples: 3448158340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:18,392][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 12:36:18,618][15401] Updated weights for policy 0, policy_version 210450 (0.0035) [2024-06-22 12:36:22,748][15401] Updated weights for policy 0, policy_version 210460 (0.0036) [2024-06-22 12:36:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 43154.1). Total num frames: 3448209408. Throughput: 0: 43309.2. Samples: 3448288220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 12:36:26,007][15401] Updated weights for policy 0, policy_version 210470 (0.0037) [2024-06-22 12:36:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 3448422400. Throughput: 0: 43246.6. Samples: 3448544220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:28,391][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 12:36:30,169][15401] Updated weights for policy 0, policy_version 210480 (0.0031) [2024-06-22 12:36:31,490][15349] Signal inference workers to stop experience collection... (50900 times) [2024-06-22 12:36:31,490][15349] Signal inference workers to resume experience collection... (50900 times) [2024-06-22 12:36:31,512][15401] InferenceWorker_p0-w0: stopping experience collection (50900 times) [2024-06-22 12:36:31,512][15401] InferenceWorker_p0-w0: resuming experience collection (50900 times) [2024-06-22 12:36:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 3448651776. Throughput: 0: 43178.3. Samples: 3448804540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 12:36:33,519][15401] Updated weights for policy 0, policy_version 210490 (0.0028) [2024-06-22 12:36:37,741][15401] Updated weights for policy 0, policy_version 210500 (0.0028) [2024-06-22 12:36:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 3448832000. Throughput: 0: 43180.6. Samples: 3448934540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 12:36:41,496][15401] Updated weights for policy 0, policy_version 210510 (0.0045) [2024-06-22 12:36:43,396][15132] Fps is (10 sec: 42571.4, 60 sec: 43139.9, 300 sec: 43153.0). Total num frames: 3449077760. Throughput: 0: 43077.5. Samples: 3449186980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:43,397][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 12:36:45,780][15401] Updated weights for policy 0, policy_version 210520 (0.0034) [2024-06-22 12:36:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.3, 300 sec: 43043.7). Total num frames: 3449290752. Throughput: 0: 43101.7. Samples: 3449451020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 12:36:48,996][15401] Updated weights for policy 0, policy_version 210530 (0.0030) [2024-06-22 12:36:53,265][15401] Updated weights for policy 0, policy_version 210540 (0.0039) [2024-06-22 12:36:53,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 3449487360. Throughput: 0: 42973.8. Samples: 3449577600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 12:36:56,507][15401] Updated weights for policy 0, policy_version 210550 (0.0041) [2024-06-22 12:36:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 3449733120. Throughput: 0: 43011.1. Samples: 3449833960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 12:36:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 12:37:00,783][15401] Updated weights for policy 0, policy_version 210560 (0.0043) [2024-06-22 12:37:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3449929728. Throughput: 0: 43023.6. Samples: 3450094300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 12:37:04,014][15401] Updated weights for policy 0, policy_version 210570 (0.0022) [2024-06-22 12:37:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 3450109952. Throughput: 0: 42927.2. Samples: 3450219940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 12:37:08,555][15401] Updated weights for policy 0, policy_version 210580 (0.0036) [2024-06-22 12:37:11,618][15401] Updated weights for policy 0, policy_version 210590 (0.0044) [2024-06-22 12:37:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 3450372096. Throughput: 0: 42773.4. Samples: 3450469020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 12:37:16,269][15401] Updated weights for policy 0, policy_version 210600 (0.0044) [2024-06-22 12:37:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 43043.1). Total num frames: 3450568704. Throughput: 0: 42747.2. Samples: 3450728160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 12:37:19,686][15401] Updated weights for policy 0, policy_version 210610 (0.0040) [2024-06-22 12:37:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.5, 300 sec: 42987.2). Total num frames: 3450748928. Throughput: 0: 42583.6. Samples: 3450850800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 12:37:24,187][15401] Updated weights for policy 0, policy_version 210620 (0.0039) [2024-06-22 12:37:27,374][15401] Updated weights for policy 0, policy_version 210630 (0.0035) [2024-06-22 12:37:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 3450994688. Throughput: 0: 42650.9. Samples: 3451106100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:28,392][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 12:37:31,877][15401] Updated weights for policy 0, policy_version 210640 (0.0050) [2024-06-22 12:37:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 43043.1). Total num frames: 3451191296. Throughput: 0: 42491.1. Samples: 3451363120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:33,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 12:37:35,198][15401] Updated weights for policy 0, policy_version 210650 (0.0037) [2024-06-22 12:37:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3451404288. Throughput: 0: 42458.2. Samples: 3451488220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 12:37:39,745][15401] Updated weights for policy 0, policy_version 210660 (0.0044) [2024-06-22 12:37:42,931][15401] Updated weights for policy 0, policy_version 210670 (0.0031) [2024-06-22 12:37:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42329.8, 300 sec: 42987.2). Total num frames: 3451617280. Throughput: 0: 42460.8. Samples: 3451744700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 12:37:43,530][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210672_3451650048.pth... [2024-06-22 12:37:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210041_3441311744.pth [2024-06-22 12:37:47,249][15401] Updated weights for policy 0, policy_version 210680 (0.0053) [2024-06-22 12:37:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 3451830272. Throughput: 0: 42206.3. Samples: 3451993580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:48,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 12:37:50,881][15349] Signal inference workers to stop experience collection... (50950 times) [2024-06-22 12:37:50,922][15401] InferenceWorker_p0-w0: stopping experience collection (50950 times) [2024-06-22 12:37:50,929][15349] Signal inference workers to resume experience collection... (50950 times) [2024-06-22 12:37:50,935][15401] InferenceWorker_p0-w0: resuming experience collection (50950 times) [2024-06-22 12:37:50,944][15401] Updated weights for policy 0, policy_version 210690 (0.0041) [2024-06-22 12:37:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 3452043264. Throughput: 0: 42163.9. Samples: 3452117320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 12:37:55,236][15401] Updated weights for policy 0, policy_version 210700 (0.0025) [2024-06-22 12:37:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42931.6). Total num frames: 3452256256. Throughput: 0: 42420.4. Samples: 3452377940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:37:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 12:37:58,477][15401] Updated weights for policy 0, policy_version 210710 (0.0028) [2024-06-22 12:38:02,682][15401] Updated weights for policy 0, policy_version 210720 (0.0034) [2024-06-22 12:38:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 43153.4). Total num frames: 3452485632. Throughput: 0: 42387.0. Samples: 3452635680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 12:38:03,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 12:38:06,131][15401] Updated weights for policy 0, policy_version 210730 (0.0038) [2024-06-22 12:38:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3452682240. Throughput: 0: 42565.7. Samples: 3452766260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 12:38:10,221][15401] Updated weights for policy 0, policy_version 210740 (0.0035) [2024-06-22 12:38:13,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42050.6, 300 sec: 42875.8). Total num frames: 3452895232. Throughput: 0: 42597.3. Samples: 3453022980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:13,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 12:38:13,688][15401] Updated weights for policy 0, policy_version 210750 (0.0041) [2024-06-22 12:38:17,681][15401] Updated weights for policy 0, policy_version 210760 (0.0039) [2024-06-22 12:38:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 43153.8). Total num frames: 3453124608. Throughput: 0: 42661.3. Samples: 3453282880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 12:38:21,299][15401] Updated weights for policy 0, policy_version 210770 (0.0029) [2024-06-22 12:38:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3453321216. Throughput: 0: 42782.7. Samples: 3453413440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 12:38:25,277][15401] Updated weights for policy 0, policy_version 210780 (0.0033) [2024-06-22 12:38:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 3453550592. Throughput: 0: 42774.2. Samples: 3453669540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 12:38:28,842][15401] Updated weights for policy 0, policy_version 210790 (0.0044) [2024-06-22 12:38:32,874][15401] Updated weights for policy 0, policy_version 210800 (0.0036) [2024-06-22 12:38:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3453763584. Throughput: 0: 42951.5. Samples: 3453926400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 12:38:36,463][15401] Updated weights for policy 0, policy_version 210810 (0.0036) [2024-06-22 12:38:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3453976576. Throughput: 0: 43162.7. Samples: 3454059640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:38,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 12:38:40,507][15401] Updated weights for policy 0, policy_version 210820 (0.0026) [2024-06-22 12:38:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3454189568. Throughput: 0: 42896.4. Samples: 3454308280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 12:38:44,541][15401] Updated weights for policy 0, policy_version 210830 (0.0037) [2024-06-22 12:38:48,118][15401] Updated weights for policy 0, policy_version 210840 (0.0036) [2024-06-22 12:38:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3454402560. Throughput: 0: 42852.6. Samples: 3454563940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 12:38:52,235][15401] Updated weights for policy 0, policy_version 210850 (0.0041) [2024-06-22 12:38:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3454631936. Throughput: 0: 42901.0. Samples: 3454696800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 12:38:55,301][15349] Signal inference workers to stop experience collection... (51000 times) [2024-06-22 12:38:55,301][15349] Signal inference workers to resume experience collection... (51000 times) [2024-06-22 12:38:55,311][15401] InferenceWorker_p0-w0: stopping experience collection (51000 times) [2024-06-22 12:38:55,325][15401] InferenceWorker_p0-w0: resuming experience collection (51000 times) [2024-06-22 12:38:55,765][15401] Updated weights for policy 0, policy_version 210860 (0.0034) [2024-06-22 12:38:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3454844928. Throughput: 0: 42885.8. Samples: 3454952740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:38:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 12:39:00,059][15401] Updated weights for policy 0, policy_version 210870 (0.0036) [2024-06-22 12:39:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 3455041536. Throughput: 0: 42740.4. Samples: 3455206200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:39:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 12:39:03,620][15401] Updated weights for policy 0, policy_version 210880 (0.0040) [2024-06-22 12:39:07,637][15401] Updated weights for policy 0, policy_version 210890 (0.0043) [2024-06-22 12:39:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3455254528. Throughput: 0: 42642.8. Samples: 3455332360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:39:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 12:39:11,271][15401] Updated weights for policy 0, policy_version 210900 (0.0038) [2024-06-22 12:39:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3455467520. Throughput: 0: 42657.8. Samples: 3455589140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 12:39:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 12:39:15,445][15401] Updated weights for policy 0, policy_version 210910 (0.0039) [2024-06-22 12:39:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3455680512. Throughput: 0: 42653.4. Samples: 3455845800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 12:39:18,879][15401] Updated weights for policy 0, policy_version 210920 (0.0041) [2024-06-22 12:39:23,050][15401] Updated weights for policy 0, policy_version 210930 (0.0038) [2024-06-22 12:39:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3455877120. Throughput: 0: 42359.7. Samples: 3455965820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 12:39:26,451][15401] Updated weights for policy 0, policy_version 210940 (0.0036) [2024-06-22 12:39:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3456122880. Throughput: 0: 42568.5. Samples: 3456223860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 12:39:30,586][15401] Updated weights for policy 0, policy_version 210950 (0.0031) [2024-06-22 12:39:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3456319488. Throughput: 0: 42493.4. Samples: 3456476140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 12:39:34,571][15401] Updated weights for policy 0, policy_version 210960 (0.0042) [2024-06-22 12:39:38,164][15401] Updated weights for policy 0, policy_version 210970 (0.0043) [2024-06-22 12:39:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3456532480. Throughput: 0: 42279.9. Samples: 3456599400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 12:39:42,395][15401] Updated weights for policy 0, policy_version 210980 (0.0045) [2024-06-22 12:39:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3456745472. Throughput: 0: 42473.8. Samples: 3456864060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:43,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-22 12:39:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210984_3456761856.pth... [2024-06-22 12:39:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210357_3446489088.pth [2024-06-22 12:39:45,765][15401] Updated weights for policy 0, policy_version 210990 (0.0029) [2024-06-22 12:39:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 3456942080. Throughput: 0: 42506.2. Samples: 3457118980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 12:39:49,994][15401] Updated weights for policy 0, policy_version 211000 (0.0033) [2024-06-22 12:39:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3457155072. Throughput: 0: 42351.5. Samples: 3457238180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 12:39:53,921][15401] Updated weights for policy 0, policy_version 211010 (0.0028) [2024-06-22 12:39:57,600][15401] Updated weights for policy 0, policy_version 211020 (0.0039) [2024-06-22 12:39:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3457400832. Throughput: 0: 42398.3. Samples: 3457497060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:39:58,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 12:40:01,732][15401] Updated weights for policy 0, policy_version 211030 (0.0033) [2024-06-22 12:40:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 3457581056. Throughput: 0: 42445.6. Samples: 3457755860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:40:03,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 12:40:05,048][15401] Updated weights for policy 0, policy_version 211040 (0.0041) [2024-06-22 12:40:08,391][15132] Fps is (10 sec: 40953.5, 60 sec: 42597.2, 300 sec: 42764.8). Total num frames: 3457810432. Throughput: 0: 42405.1. Samples: 3457874120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:40:08,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 12:40:09,382][15401] Updated weights for policy 0, policy_version 211050 (0.0039) [2024-06-22 12:40:10,646][15349] Signal inference workers to stop experience collection... (51050 times) [2024-06-22 12:40:10,668][15401] InferenceWorker_p0-w0: stopping experience collection (51050 times) [2024-06-22 12:40:10,705][15349] Signal inference workers to resume experience collection... (51050 times) [2024-06-22 12:40:10,707][15401] InferenceWorker_p0-w0: resuming experience collection (51050 times) [2024-06-22 12:40:13,027][15401] Updated weights for policy 0, policy_version 211060 (0.0052) [2024-06-22 12:40:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3458023424. Throughput: 0: 42558.2. Samples: 3458138980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:40:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 12:40:17,105][15401] Updated weights for policy 0, policy_version 211070 (0.0035) [2024-06-22 12:40:18,390][15132] Fps is (10 sec: 39327.3, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 3458203648. Throughput: 0: 42622.9. Samples: 3458394180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 12:40:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 12:40:20,529][15401] Updated weights for policy 0, policy_version 211080 (0.0035) [2024-06-22 12:40:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3458449408. Throughput: 0: 42509.9. Samples: 3458512340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 12:40:24,715][15401] Updated weights for policy 0, policy_version 211090 (0.0032) [2024-06-22 12:40:28,090][15401] Updated weights for policy 0, policy_version 211100 (0.0031) [2024-06-22 12:40:28,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3458662400. Throughput: 0: 42514.9. Samples: 3458777220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 12:40:32,353][15401] Updated weights for policy 0, policy_version 211110 (0.0040) [2024-06-22 12:40:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3458859008. Throughput: 0: 42717.3. Samples: 3459041260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 12:40:35,709][15401] Updated weights for policy 0, policy_version 211120 (0.0041) [2024-06-22 12:40:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3459088384. Throughput: 0: 42664.0. Samples: 3459158060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 12:40:40,458][15401] Updated weights for policy 0, policy_version 211130 (0.0032) [2024-06-22 12:40:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.6, 300 sec: 42709.9). Total num frames: 3459301376. Throughput: 0: 42778.8. Samples: 3459422100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 12:40:43,490][15401] Updated weights for policy 0, policy_version 211140 (0.0037) [2024-06-22 12:40:48,009][15401] Updated weights for policy 0, policy_version 211150 (0.0045) [2024-06-22 12:40:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3459497984. Throughput: 0: 42715.6. Samples: 3459678060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:48,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 12:40:51,131][15401] Updated weights for policy 0, policy_version 211160 (0.0034) [2024-06-22 12:40:53,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3459743744. Throughput: 0: 42759.2. Samples: 3459798220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 12:40:55,541][15401] Updated weights for policy 0, policy_version 211170 (0.0038) [2024-06-22 12:40:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3459940352. Throughput: 0: 42742.3. Samples: 3460062380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:40:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 12:40:58,775][15401] Updated weights for policy 0, policy_version 211180 (0.0024) [2024-06-22 12:41:03,085][15401] Updated weights for policy 0, policy_version 211190 (0.0037) [2024-06-22 12:41:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3460136960. Throughput: 0: 42559.6. Samples: 3460309360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 12:41:06,603][15401] Updated weights for policy 0, policy_version 211200 (0.0040) [2024-06-22 12:41:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.6, 300 sec: 42709.5). Total num frames: 3460382720. Throughput: 0: 42854.6. Samples: 3460440800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 12:41:11,001][15401] Updated weights for policy 0, policy_version 211210 (0.0048) [2024-06-22 12:41:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 3460562944. Throughput: 0: 42794.1. Samples: 3460702960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 12:41:13,990][15401] Updated weights for policy 0, policy_version 211220 (0.0033) [2024-06-22 12:41:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3460775936. Throughput: 0: 42596.4. Samples: 3460958100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 12:41:18,459][15401] Updated weights for policy 0, policy_version 211230 (0.0040) [2024-06-22 12:41:21,523][15401] Updated weights for policy 0, policy_version 211240 (0.0032) [2024-06-22 12:41:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3461021696. Throughput: 0: 42784.8. Samples: 3461083380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 12:41:25,957][15401] Updated weights for policy 0, policy_version 211250 (0.0028) [2024-06-22 12:41:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3461218304. Throughput: 0: 42869.6. Samples: 3461351240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:41:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 12:41:29,136][15401] Updated weights for policy 0, policy_version 211260 (0.0037) [2024-06-22 12:41:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3461431296. Throughput: 0: 42788.9. Samples: 3461603560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:33,390][15132] Avg episode reward: [(0, '0.131')] [2024-06-22 12:41:33,533][15401] Updated weights for policy 0, policy_version 211270 (0.0034) [2024-06-22 12:41:36,937][15401] Updated weights for policy 0, policy_version 211280 (0.0025) [2024-06-22 12:41:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 3461677056. Throughput: 0: 43008.1. Samples: 3461733580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 12:41:41,135][15401] Updated weights for policy 0, policy_version 211290 (0.0031) [2024-06-22 12:41:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 3461857280. Throughput: 0: 42952.8. Samples: 3461995260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 12:41:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211295_3461857280.pth... [2024-06-22 12:41:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210672_3451650048.pth [2024-06-22 12:41:44,045][15349] Signal inference workers to stop experience collection... (51100 times) [2024-06-22 12:41:44,045][15349] Signal inference workers to resume experience collection... (51100 times) [2024-06-22 12:41:44,087][15401] InferenceWorker_p0-w0: stopping experience collection (51100 times) [2024-06-22 12:41:44,087][15401] InferenceWorker_p0-w0: resuming experience collection (51100 times) [2024-06-22 12:41:44,654][15401] Updated weights for policy 0, policy_version 211300 (0.0030) [2024-06-22 12:41:48,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 3462070272. Throughput: 0: 43033.8. Samples: 3462245980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:48,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 12:41:48,944][15401] Updated weights for policy 0, policy_version 211310 (0.0026) [2024-06-22 12:41:52,322][15401] Updated weights for policy 0, policy_version 211320 (0.0028) [2024-06-22 12:41:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3462316032. Throughput: 0: 43059.0. Samples: 3462378460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 12:41:56,447][15401] Updated weights for policy 0, policy_version 211330 (0.0025) [2024-06-22 12:41:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3462479872. Throughput: 0: 42792.9. Samples: 3462628640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:41:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:42:00,302][15401] Updated weights for policy 0, policy_version 211340 (0.0029) [2024-06-22 12:42:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3462725632. Throughput: 0: 42778.7. Samples: 3462883140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:42:04,062][15401] Updated weights for policy 0, policy_version 211350 (0.0029) [2024-06-22 12:42:08,086][15401] Updated weights for policy 0, policy_version 211360 (0.0034) [2024-06-22 12:42:08,396][15132] Fps is (10 sec: 47482.9, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 3462955008. Throughput: 0: 42955.7. Samples: 3463016660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:08,397][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 12:42:11,639][15401] Updated weights for policy 0, policy_version 211370 (0.0034) [2024-06-22 12:42:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3463135232. Throughput: 0: 42562.6. Samples: 3463266560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 12:42:15,599][15401] Updated weights for policy 0, policy_version 211380 (0.0041) [2024-06-22 12:42:18,389][15132] Fps is (10 sec: 40986.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3463364608. Throughput: 0: 42506.3. Samples: 3463516340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 12:42:19,489][15401] Updated weights for policy 0, policy_version 211390 (0.0033) [2024-06-22 12:42:23,241][15401] Updated weights for policy 0, policy_version 211400 (0.0037) [2024-06-22 12:42:23,391][15132] Fps is (10 sec: 44232.7, 60 sec: 42597.7, 300 sec: 42654.1). Total num frames: 3463577600. Throughput: 0: 42683.9. Samples: 3463654400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:23,391][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 12:42:27,144][15401] Updated weights for policy 0, policy_version 211410 (0.0028) [2024-06-22 12:42:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3463790592. Throughput: 0: 42530.9. Samples: 3463909140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 12:42:31,022][15401] Updated weights for policy 0, policy_version 211420 (0.0031) [2024-06-22 12:42:33,390][15132] Fps is (10 sec: 45879.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3464036352. Throughput: 0: 42646.2. Samples: 3464164960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 12:42:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 12:42:34,642][15401] Updated weights for policy 0, policy_version 211430 (0.0023) [2024-06-22 12:42:38,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 3464200192. Throughput: 0: 42723.4. Samples: 3464301020. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:42:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 12:42:38,734][15401] Updated weights for policy 0, policy_version 211440 (0.0021) [2024-06-22 12:42:42,192][15401] Updated weights for policy 0, policy_version 211450 (0.0027) [2024-06-22 12:42:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3464413184. Throughput: 0: 42805.2. Samples: 3464554880. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:42:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 12:42:46,352][15401] Updated weights for policy 0, policy_version 211460 (0.0031) [2024-06-22 12:42:48,389][15132] Fps is (10 sec: 47514.8, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 3464675328. Throughput: 0: 42676.0. Samples: 3464803560. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:42:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 12:42:49,672][15401] Updated weights for policy 0, policy_version 211470 (0.0035) [2024-06-22 12:42:52,424][15349] Signal inference workers to stop experience collection... (51150 times) [2024-06-22 12:42:52,424][15349] Signal inference workers to resume experience collection... (51150 times) [2024-06-22 12:42:52,476][15401] InferenceWorker_p0-w0: stopping experience collection (51150 times) [2024-06-22 12:42:52,476][15401] InferenceWorker_p0-w0: resuming experience collection (51150 times) [2024-06-22 12:42:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 3464822784. Throughput: 0: 42819.5. Samples: 3464943260. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:42:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 12:42:53,964][15401] Updated weights for policy 0, policy_version 211480 (0.0028) [2024-06-22 12:42:57,267][15401] Updated weights for policy 0, policy_version 211490 (0.0040) [2024-06-22 12:42:58,396][15132] Fps is (10 sec: 37658.8, 60 sec: 42866.8, 300 sec: 42597.8). Total num frames: 3465052160. Throughput: 0: 42695.3. Samples: 3465188120. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:42:58,397][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 12:43:01,715][15401] Updated weights for policy 0, policy_version 211500 (0.0031) [2024-06-22 12:43:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3465297920. Throughput: 0: 42942.1. Samples: 3465448740. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:03,392][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 12:43:05,099][15401] Updated weights for policy 0, policy_version 211510 (0.0038) [2024-06-22 12:43:08,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42056.8, 300 sec: 42654.3). Total num frames: 3465478144. Throughput: 0: 42837.9. Samples: 3465582060. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:08,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 12:43:09,587][15401] Updated weights for policy 0, policy_version 211520 (0.0042) [2024-06-22 12:43:12,628][15401] Updated weights for policy 0, policy_version 211530 (0.0035) [2024-06-22 12:43:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3465707520. Throughput: 0: 42658.5. Samples: 3465828780. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:13,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 12:43:17,205][15401] Updated weights for policy 0, policy_version 211540 (0.0033) [2024-06-22 12:43:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3465920512. Throughput: 0: 42777.9. Samples: 3466089960. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 12:43:20,049][15401] Updated weights for policy 0, policy_version 211550 (0.0042) [2024-06-22 12:43:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42326.1, 300 sec: 42598.4). Total num frames: 3466117120. Throughput: 0: 42582.1. Samples: 3466217200. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 12:43:24,876][15401] Updated weights for policy 0, policy_version 211560 (0.0036) [2024-06-22 12:43:27,759][15401] Updated weights for policy 0, policy_version 211570 (0.0044) [2024-06-22 12:43:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3466362880. Throughput: 0: 42406.7. Samples: 3466463180. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:28,392][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 12:43:32,477][15401] Updated weights for policy 0, policy_version 211580 (0.0052) [2024-06-22 12:43:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 3466559488. Throughput: 0: 42544.9. Samples: 3466718080. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 12:43:35,825][15401] Updated weights for policy 0, policy_version 211590 (0.0038) [2024-06-22 12:43:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 3466739712. Throughput: 0: 42151.1. Samples: 3466840060. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 12:43:40,014][15401] Updated weights for policy 0, policy_version 211600 (0.0031) [2024-06-22 12:43:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 3467001856. Throughput: 0: 42439.3. Samples: 3467097620. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-06-22 12:43:43,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-22 12:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211609_3467001856.pth... [2024-06-22 12:43:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000210984_3456761856.pth [2024-06-22 12:43:43,919][15401] Updated weights for policy 0, policy_version 211610 (0.0036) [2024-06-22 12:43:47,690][15401] Updated weights for policy 0, policy_version 211620 (0.0024) [2024-06-22 12:43:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 3467182080. Throughput: 0: 42376.5. Samples: 3467355680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:43:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 12:43:51,521][15401] Updated weights for policy 0, policy_version 211630 (0.0042) [2024-06-22 12:43:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 3467395072. Throughput: 0: 42292.8. Samples: 3467485240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:43:53,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 12:43:55,818][15401] Updated weights for policy 0, policy_version 211640 (0.0030) [2024-06-22 12:43:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42876.1, 300 sec: 42653.9). Total num frames: 3467624448. Throughput: 0: 42400.1. Samples: 3467736780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:43:58,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 12:43:59,330][15401] Updated weights for policy 0, policy_version 211650 (0.0034) [2024-06-22 12:44:03,371][15401] Updated weights for policy 0, policy_version 211660 (0.0024) [2024-06-22 12:44:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3467837440. Throughput: 0: 42260.4. Samples: 3467991680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 12:44:07,009][15401] Updated weights for policy 0, policy_version 211670 (0.0051) [2024-06-22 12:44:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3468017664. Throughput: 0: 42190.5. Samples: 3468115780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:08,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 12:44:10,945][15401] Updated weights for policy 0, policy_version 211680 (0.0029) [2024-06-22 12:44:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3468263424. Throughput: 0: 42460.5. Samples: 3468373900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:13,396][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 12:44:14,951][15401] Updated weights for policy 0, policy_version 211690 (0.0038) [2024-06-22 12:44:16,844][15349] Signal inference workers to stop experience collection... (51200 times) [2024-06-22 12:44:16,878][15401] InferenceWorker_p0-w0: stopping experience collection (51200 times) [2024-06-22 12:44:16,893][15349] Signal inference workers to resume experience collection... (51200 times) [2024-06-22 12:44:16,894][15401] InferenceWorker_p0-w0: resuming experience collection (51200 times) [2024-06-22 12:44:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3468460032. Throughput: 0: 42332.0. Samples: 3468623020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 12:44:18,854][15401] Updated weights for policy 0, policy_version 211700 (0.0029) [2024-06-22 12:44:23,273][15401] Updated weights for policy 0, policy_version 211710 (0.0026) [2024-06-22 12:44:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 3468656640. Throughput: 0: 42460.3. Samples: 3468750780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 12:44:26,519][15401] Updated weights for policy 0, policy_version 211720 (0.0026) [2024-06-22 12:44:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3468886016. Throughput: 0: 42303.7. Samples: 3469001280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 12:44:31,013][15401] Updated weights for policy 0, policy_version 211730 (0.0038) [2024-06-22 12:44:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3469099008. Throughput: 0: 42299.5. Samples: 3469259160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 12:44:34,063][15401] Updated weights for policy 0, policy_version 211740 (0.0042) [2024-06-22 12:44:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3469295616. Throughput: 0: 42345.0. Samples: 3469390760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 12:44:38,721][15401] Updated weights for policy 0, policy_version 211750 (0.0038) [2024-06-22 12:44:41,970][15401] Updated weights for policy 0, policy_version 211760 (0.0035) [2024-06-22 12:44:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 3469524992. Throughput: 0: 42215.0. Samples: 3469636460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 12:44:46,259][15401] Updated weights for policy 0, policy_version 211770 (0.0031) [2024-06-22 12:44:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3469721600. Throughput: 0: 42351.0. Samples: 3469897480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 12:44:49,601][15401] Updated weights for policy 0, policy_version 211780 (0.0028) [2024-06-22 12:44:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 3469934592. Throughput: 0: 42357.8. Samples: 3470021880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 12:44:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 12:44:53,907][15401] Updated weights for policy 0, policy_version 211790 (0.0033) [2024-06-22 12:44:57,386][15401] Updated weights for policy 0, policy_version 211800 (0.0033) [2024-06-22 12:44:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3470147584. Throughput: 0: 42287.2. Samples: 3470276820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:44:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 12:45:02,011][15401] Updated weights for policy 0, policy_version 211810 (0.0031) [2024-06-22 12:45:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42543.1). Total num frames: 3470360576. Throughput: 0: 42459.1. Samples: 3470533680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 12:45:04,954][15401] Updated weights for policy 0, policy_version 211820 (0.0033) [2024-06-22 12:45:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3470589952. Throughput: 0: 42418.3. Samples: 3470659600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 12:45:09,516][15401] Updated weights for policy 0, policy_version 211830 (0.0034) [2024-06-22 12:45:12,397][15401] Updated weights for policy 0, policy_version 211840 (0.0033) [2024-06-22 12:45:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3470802944. Throughput: 0: 42607.9. Samples: 3470918640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 12:45:17,086][15401] Updated weights for policy 0, policy_version 211850 (0.0033) [2024-06-22 12:45:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 3470983168. Throughput: 0: 42733.3. Samples: 3471182160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 12:45:20,445][15401] Updated weights for policy 0, policy_version 211860 (0.0033) [2024-06-22 12:45:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3471212544. Throughput: 0: 42473.4. Samples: 3471302060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 12:45:24,834][15401] Updated weights for policy 0, policy_version 211870 (0.0043) [2024-06-22 12:45:28,059][15401] Updated weights for policy 0, policy_version 211880 (0.0029) [2024-06-22 12:45:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3471441920. Throughput: 0: 42669.3. Samples: 3471556580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 12:45:32,454][15401] Updated weights for policy 0, policy_version 211890 (0.0038) [2024-06-22 12:45:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 3471638528. Throughput: 0: 42586.1. Samples: 3471813860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 12:45:35,627][15401] Updated weights for policy 0, policy_version 211900 (0.0042) [2024-06-22 12:45:37,566][15349] Signal inference workers to stop experience collection... (51250 times) [2024-06-22 12:45:37,620][15401] InferenceWorker_p0-w0: stopping experience collection (51250 times) [2024-06-22 12:45:37,624][15349] Signal inference workers to resume experience collection... (51250 times) [2024-06-22 12:45:37,635][15401] InferenceWorker_p0-w0: resuming experience collection (51250 times) [2024-06-22 12:45:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 3471867904. Throughput: 0: 42557.2. Samples: 3471936960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 12:45:40,053][15401] Updated weights for policy 0, policy_version 211910 (0.0028) [2024-06-22 12:45:43,391][15132] Fps is (10 sec: 44231.0, 60 sec: 42597.4, 300 sec: 42653.7). Total num frames: 3472080896. Throughput: 0: 42572.4. Samples: 3472192640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:43,391][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 12:45:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211919_3472080896.pth... [2024-06-22 12:45:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211295_3461857280.pth [2024-06-22 12:45:43,670][15401] Updated weights for policy 0, policy_version 211920 (0.0042) [2024-06-22 12:45:48,046][15401] Updated weights for policy 0, policy_version 211930 (0.0026) [2024-06-22 12:45:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 3472277504. Throughput: 0: 42499.8. Samples: 3472446180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 12:45:51,284][15401] Updated weights for policy 0, policy_version 211940 (0.0042) [2024-06-22 12:45:53,390][15132] Fps is (10 sec: 44242.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3472523264. Throughput: 0: 42502.6. Samples: 3472572220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 12:45:55,560][15401] Updated weights for policy 0, policy_version 211950 (0.0032) [2024-06-22 12:45:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3472703488. Throughput: 0: 42555.1. Samples: 3472833620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 12:45:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 12:45:59,048][15401] Updated weights for policy 0, policy_version 211960 (0.0039) [2024-06-22 12:46:03,116][15401] Updated weights for policy 0, policy_version 211970 (0.0035) [2024-06-22 12:46:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 3472916480. Throughput: 0: 42316.6. Samples: 3473086400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 12:46:06,762][15401] Updated weights for policy 0, policy_version 211980 (0.0037) [2024-06-22 12:46:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3473129472. Throughput: 0: 42517.3. Samples: 3473215340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 12:46:10,746][15401] Updated weights for policy 0, policy_version 211990 (0.0048) [2024-06-22 12:46:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3473358848. Throughput: 0: 42542.7. Samples: 3473471000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 12:46:14,616][15401] Updated weights for policy 0, policy_version 212000 (0.0029) [2024-06-22 12:46:18,275][15401] Updated weights for policy 0, policy_version 212010 (0.0046) [2024-06-22 12:46:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 3473571840. Throughput: 0: 42628.1. Samples: 3473732120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:18,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 12:46:22,140][15401] Updated weights for policy 0, policy_version 212020 (0.0036) [2024-06-22 12:46:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3473784832. Throughput: 0: 42633.4. Samples: 3473855460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 12:46:25,878][15401] Updated weights for policy 0, policy_version 212030 (0.0026) [2024-06-22 12:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3473997824. Throughput: 0: 42649.7. Samples: 3474111820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 12:46:30,389][15401] Updated weights for policy 0, policy_version 212040 (0.0035) [2024-06-22 12:46:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 3474210816. Throughput: 0: 42708.9. Samples: 3474368080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:33,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 12:46:33,535][15401] Updated weights for policy 0, policy_version 212050 (0.0029) [2024-06-22 12:46:37,950][15401] Updated weights for policy 0, policy_version 212060 (0.0039) [2024-06-22 12:46:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 3474407424. Throughput: 0: 42695.6. Samples: 3474493520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:38,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 12:46:41,334][15401] Updated weights for policy 0, policy_version 212070 (0.0026) [2024-06-22 12:46:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42599.3, 300 sec: 42598.7). Total num frames: 3474636800. Throughput: 0: 42470.6. Samples: 3474744800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 12:46:45,548][15401] Updated weights for policy 0, policy_version 212080 (0.0039) [2024-06-22 12:46:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3474817024. Throughput: 0: 42463.8. Samples: 3474997280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 12:46:49,464][15401] Updated weights for policy 0, policy_version 212090 (0.0043) [2024-06-22 12:46:53,109][15401] Updated weights for policy 0, policy_version 212100 (0.0028) [2024-06-22 12:46:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3475062784. Throughput: 0: 42334.2. Samples: 3475120380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 12:46:56,916][15401] Updated weights for policy 0, policy_version 212110 (0.0041) [2024-06-22 12:46:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 3475243008. Throughput: 0: 42462.7. Samples: 3475381820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:46:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 12:46:59,509][15349] Signal inference workers to stop experience collection... (51300 times) [2024-06-22 12:46:59,564][15401] InferenceWorker_p0-w0: stopping experience collection (51300 times) [2024-06-22 12:46:59,622][15349] Signal inference workers to resume experience collection... (51300 times) [2024-06-22 12:46:59,622][15401] InferenceWorker_p0-w0: resuming experience collection (51300 times) [2024-06-22 12:47:00,788][15401] Updated weights for policy 0, policy_version 212120 (0.0024) [2024-06-22 12:47:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42432.7). Total num frames: 3475472384. Throughput: 0: 42442.7. Samples: 3475642040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:47:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 12:47:04,561][15401] Updated weights for policy 0, policy_version 212130 (0.0028) [2024-06-22 12:47:08,211][15401] Updated weights for policy 0, policy_version 212140 (0.0030) [2024-06-22 12:47:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3475701760. Throughput: 0: 42560.0. Samples: 3475770660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 12:47:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 12:47:12,264][15401] Updated weights for policy 0, policy_version 212150 (0.0032) [2024-06-22 12:47:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 3475898368. Throughput: 0: 42538.6. Samples: 3476026060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 12:47:16,104][15401] Updated weights for policy 0, policy_version 212160 (0.0028) [2024-06-22 12:47:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42542.7). Total num frames: 3476127744. Throughput: 0: 42467.2. Samples: 3476279200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:18,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 12:47:20,302][15401] Updated weights for policy 0, policy_version 212170 (0.0030) [2024-06-22 12:47:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 3476340736. Throughput: 0: 42535.9. Samples: 3476407640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 12:47:23,580][15401] Updated weights for policy 0, policy_version 212180 (0.0034) [2024-06-22 12:47:27,951][15401] Updated weights for policy 0, policy_version 212190 (0.0024) [2024-06-22 12:47:28,390][15132] Fps is (10 sec: 39329.1, 60 sec: 42051.9, 300 sec: 42320.6). Total num frames: 3476520960. Throughput: 0: 42641.0. Samples: 3476663660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:28,391][15132] Avg episode reward: [(0, '0.864')] [2024-06-22 12:47:31,303][15401] Updated weights for policy 0, policy_version 212200 (0.0042) [2024-06-22 12:47:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3476750336. Throughput: 0: 42768.9. Samples: 3476921880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 12:47:35,577][15401] Updated weights for policy 0, policy_version 212210 (0.0036) [2024-06-22 12:47:38,392][15132] Fps is (10 sec: 45866.3, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 3476979712. Throughput: 0: 42903.0. Samples: 3477051120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:38,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 12:47:38,785][15401] Updated weights for policy 0, policy_version 212220 (0.0035) [2024-06-22 12:47:43,170][15401] Updated weights for policy 0, policy_version 212230 (0.0029) [2024-06-22 12:47:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3477176320. Throughput: 0: 42866.0. Samples: 3477310800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 12:47:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212230_3477176320.pth... [2024-06-22 12:47:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211609_3467001856.pth [2024-06-22 12:47:46,631][15401] Updated weights for policy 0, policy_version 212240 (0.0039) [2024-06-22 12:47:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3477389312. Throughput: 0: 42807.0. Samples: 3477568360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 12:47:50,790][15401] Updated weights for policy 0, policy_version 212250 (0.0027) [2024-06-22 12:47:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42654.8). Total num frames: 3477635072. Throughput: 0: 42817.6. Samples: 3477697460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 12:47:54,468][15401] Updated weights for policy 0, policy_version 212260 (0.0034) [2024-06-22 12:47:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 3477815296. Throughput: 0: 42859.7. Samples: 3477954740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:47:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 12:47:58,530][15401] Updated weights for policy 0, policy_version 212270 (0.0036) [2024-06-22 12:48:02,156][15401] Updated weights for policy 0, policy_version 212280 (0.0037) [2024-06-22 12:48:03,389][15132] Fps is (10 sec: 37684.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 3478011904. Throughput: 0: 42861.9. Samples: 3478207880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:48:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 12:48:06,339][15401] Updated weights for policy 0, policy_version 212290 (0.0022) [2024-06-22 12:48:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3478274048. Throughput: 0: 42815.7. Samples: 3478334340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:48:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 12:48:09,723][15401] Updated weights for policy 0, policy_version 212300 (0.0023) [2024-06-22 12:48:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 3478437888. Throughput: 0: 42825.4. Samples: 3478590780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:48:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 12:48:14,043][15401] Updated weights for policy 0, policy_version 212310 (0.0024) [2024-06-22 12:48:14,169][15349] Signal inference workers to stop experience collection... (51350 times) [2024-06-22 12:48:14,221][15401] InferenceWorker_p0-w0: stopping experience collection (51350 times) [2024-06-22 12:48:14,227][15349] Signal inference workers to resume experience collection... (51350 times) [2024-06-22 12:48:14,237][15401] InferenceWorker_p0-w0: resuming experience collection (51350 times) [2024-06-22 12:48:17,569][15401] Updated weights for policy 0, policy_version 212320 (0.0038) [2024-06-22 12:48:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 3478667264. Throughput: 0: 42568.5. Samples: 3478837460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:18,399][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 12:48:21,752][15401] Updated weights for policy 0, policy_version 212330 (0.0041) [2024-06-22 12:48:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 3478880256. Throughput: 0: 42575.6. Samples: 3478966920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:23,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 12:48:25,265][15401] Updated weights for policy 0, policy_version 212340 (0.0033) [2024-06-22 12:48:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.8, 300 sec: 42431.8). Total num frames: 3479076864. Throughput: 0: 42310.4. Samples: 3479214760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 12:48:29,608][15401] Updated weights for policy 0, policy_version 212350 (0.0038) [2024-06-22 12:48:32,765][15401] Updated weights for policy 0, policy_version 212360 (0.0033) [2024-06-22 12:48:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3479306240. Throughput: 0: 42375.6. Samples: 3479475260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 12:48:37,097][15401] Updated weights for policy 0, policy_version 212370 (0.0036) [2024-06-22 12:48:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 3479535616. Throughput: 0: 42509.5. Samples: 3479610380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 12:48:40,204][15401] Updated weights for policy 0, policy_version 212380 (0.0041) [2024-06-22 12:48:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3479732224. Throughput: 0: 42605.8. Samples: 3479872000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 12:48:44,651][15401] Updated weights for policy 0, policy_version 212390 (0.0036) [2024-06-22 12:48:47,871][15401] Updated weights for policy 0, policy_version 212400 (0.0042) [2024-06-22 12:48:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3479961600. Throughput: 0: 42592.5. Samples: 3480124540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 12:48:52,185][15401] Updated weights for policy 0, policy_version 212410 (0.0033) [2024-06-22 12:48:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 3480190976. Throughput: 0: 42856.0. Samples: 3480262860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 12:48:55,422][15401] Updated weights for policy 0, policy_version 212420 (0.0027) [2024-06-22 12:48:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 3480387584. Throughput: 0: 42871.0. Samples: 3480519980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:48:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 12:48:59,751][15401] Updated weights for policy 0, policy_version 212430 (0.0031) [2024-06-22 12:49:03,025][15401] Updated weights for policy 0, policy_version 212440 (0.0030) [2024-06-22 12:49:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43415.8, 300 sec: 42709.1). Total num frames: 3480616960. Throughput: 0: 43063.5. Samples: 3480775420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:49:03,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 12:49:07,281][15401] Updated weights for policy 0, policy_version 212450 (0.0028) [2024-06-22 12:49:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3480846336. Throughput: 0: 43260.9. Samples: 3480913660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:49:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 12:49:10,714][15401] Updated weights for policy 0, policy_version 212460 (0.0036) [2024-06-22 12:49:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3481026560. Throughput: 0: 43225.3. Samples: 3481159900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:49:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 12:49:15,047][15401] Updated weights for policy 0, policy_version 212470 (0.0026) [2024-06-22 12:49:18,327][15401] Updated weights for policy 0, policy_version 212480 (0.0041) [2024-06-22 12:49:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3481272320. Throughput: 0: 43125.3. Samples: 3481415900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:49:18,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 12:49:22,665][15401] Updated weights for policy 0, policy_version 212490 (0.0037) [2024-06-22 12:49:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 3481468928. Throughput: 0: 43066.5. Samples: 3481548380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 12:49:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 12:49:25,942][15401] Updated weights for policy 0, policy_version 212500 (0.0036) [2024-06-22 12:49:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3481665536. Throughput: 0: 42976.3. Samples: 3481805940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 12:49:30,294][15401] Updated weights for policy 0, policy_version 212510 (0.0030) [2024-06-22 12:49:33,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3481911296. Throughput: 0: 42984.0. Samples: 3482058820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 12:49:33,497][15401] Updated weights for policy 0, policy_version 212520 (0.0025) [2024-06-22 12:49:36,297][15349] Signal inference workers to stop experience collection... (51400 times) [2024-06-22 12:49:36,298][15349] Signal inference workers to resume experience collection... (51400 times) [2024-06-22 12:49:36,326][15401] InferenceWorker_p0-w0: stopping experience collection (51400 times) [2024-06-22 12:49:36,326][15401] InferenceWorker_p0-w0: resuming experience collection (51400 times) [2024-06-22 12:49:37,894][15401] Updated weights for policy 0, policy_version 212530 (0.0040) [2024-06-22 12:49:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3482124288. Throughput: 0: 42942.6. Samples: 3482195280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:38,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 12:49:41,424][15401] Updated weights for policy 0, policy_version 212540 (0.0047) [2024-06-22 12:49:43,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42869.6, 300 sec: 42653.6). Total num frames: 3482304512. Throughput: 0: 42793.7. Samples: 3482445800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:43,393][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 12:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212543_3482304512.pth... [2024-06-22 12:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000211919_3472080896.pth [2024-06-22 12:49:45,571][15401] Updated weights for policy 0, policy_version 212550 (0.0031) [2024-06-22 12:49:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3482550272. Throughput: 0: 42628.1. Samples: 3482693580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:48,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 12:49:49,066][15401] Updated weights for policy 0, policy_version 212560 (0.0031) [2024-06-22 12:49:53,301][15401] Updated weights for policy 0, policy_version 212570 (0.0028) [2024-06-22 12:49:53,390][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3482746880. Throughput: 0: 42693.7. Samples: 3482834880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 12:49:56,613][15401] Updated weights for policy 0, policy_version 212580 (0.0039) [2024-06-22 12:49:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3482943488. Throughput: 0: 42637.7. Samples: 3483078600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:49:58,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 12:50:01,059][15401] Updated weights for policy 0, policy_version 212590 (0.0033) [2024-06-22 12:50:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42871.4, 300 sec: 42709.1). Total num frames: 3483189248. Throughput: 0: 42606.1. Samples: 3483333280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:03,392][15132] Avg episode reward: [(0, '0.179')] [2024-06-22 12:50:04,462][15401] Updated weights for policy 0, policy_version 212600 (0.0032) [2024-06-22 12:50:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3483369472. Throughput: 0: 42680.2. Samples: 3483468980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:08,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 12:50:08,823][15401] Updated weights for policy 0, policy_version 212610 (0.0030) [2024-06-22 12:50:12,336][15401] Updated weights for policy 0, policy_version 212620 (0.0027) [2024-06-22 12:50:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3483598848. Throughput: 0: 42592.1. Samples: 3483722580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 12:50:16,655][15401] Updated weights for policy 0, policy_version 212630 (0.0040) [2024-06-22 12:50:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3483828224. Throughput: 0: 42547.5. Samples: 3483973460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 12:50:20,170][15401] Updated weights for policy 0, policy_version 212640 (0.0044) [2024-06-22 12:50:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3484008448. Throughput: 0: 42417.3. Samples: 3484104060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 12:50:24,270][15401] Updated weights for policy 0, policy_version 212650 (0.0027) [2024-06-22 12:50:27,760][15401] Updated weights for policy 0, policy_version 212660 (0.0037) [2024-06-22 12:50:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 3484254208. Throughput: 0: 42466.2. Samples: 3484356780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 12:50:28,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 12:50:31,859][15401] Updated weights for policy 0, policy_version 212670 (0.0028) [2024-06-22 12:50:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3484450816. Throughput: 0: 42671.6. Samples: 3484613800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 12:50:35,725][15401] Updated weights for policy 0, policy_version 212680 (0.0029) [2024-06-22 12:50:38,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42052.2, 300 sec: 42598.6). Total num frames: 3484647424. Throughput: 0: 42312.8. Samples: 3484738960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 12:50:39,766][15401] Updated weights for policy 0, policy_version 212690 (0.0031) [2024-06-22 12:50:43,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 3484860416. Throughput: 0: 42476.5. Samples: 3484990040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 12:50:43,507][15401] Updated weights for policy 0, policy_version 212700 (0.0025) [2024-06-22 12:50:47,295][15401] Updated weights for policy 0, policy_version 212710 (0.0027) [2024-06-22 12:50:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3485089792. Throughput: 0: 42549.8. Samples: 3485247920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 12:50:51,198][15401] Updated weights for policy 0, policy_version 212720 (0.0024) [2024-06-22 12:50:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3485286400. Throughput: 0: 42462.6. Samples: 3485379800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 12:50:54,880][15401] Updated weights for policy 0, policy_version 212730 (0.0030) [2024-06-22 12:50:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3485483008. Throughput: 0: 42343.1. Samples: 3485628020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:50:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 12:50:58,677][15349] Signal inference workers to stop experience collection... (51450 times) [2024-06-22 12:50:58,705][15401] InferenceWorker_p0-w0: stopping experience collection (51450 times) [2024-06-22 12:50:58,745][15349] Signal inference workers to resume experience collection... (51450 times) [2024-06-22 12:50:58,746][15401] InferenceWorker_p0-w0: resuming experience collection (51450 times) [2024-06-22 12:50:58,913][15401] Updated weights for policy 0, policy_version 212740 (0.0034) [2024-06-22 12:51:02,526][15401] Updated weights for policy 0, policy_version 212750 (0.0036) [2024-06-22 12:51:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3485728768. Throughput: 0: 42647.6. Samples: 3485892600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 12:51:06,491][15401] Updated weights for policy 0, policy_version 212760 (0.0027) [2024-06-22 12:51:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3485925376. Throughput: 0: 42596.9. Samples: 3486020920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 12:51:10,073][15401] Updated weights for policy 0, policy_version 212770 (0.0037) [2024-06-22 12:51:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3486138368. Throughput: 0: 42549.0. Samples: 3486271380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 12:51:14,123][15401] Updated weights for policy 0, policy_version 212780 (0.0038) [2024-06-22 12:51:17,954][15401] Updated weights for policy 0, policy_version 212790 (0.0032) [2024-06-22 12:51:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3486351360. Throughput: 0: 42578.1. Samples: 3486529820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 12:51:21,789][15401] Updated weights for policy 0, policy_version 212800 (0.0034) [2024-06-22 12:51:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3486580736. Throughput: 0: 42680.0. Samples: 3486659560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 12:51:25,674][15401] Updated weights for policy 0, policy_version 212810 (0.0031) [2024-06-22 12:51:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 3486793728. Throughput: 0: 42560.9. Samples: 3486905280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:28,391][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 12:51:29,400][15401] Updated weights for policy 0, policy_version 212820 (0.0030) [2024-06-22 12:51:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3486990336. Throughput: 0: 42684.1. Samples: 3487168700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 12:51:33,651][15401] Updated weights for policy 0, policy_version 212830 (0.0045) [2024-06-22 12:51:37,212][15401] Updated weights for policy 0, policy_version 212840 (0.0035) [2024-06-22 12:51:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3487219712. Throughput: 0: 42569.2. Samples: 3487295420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-22 12:51:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 12:51:41,241][15401] Updated weights for policy 0, policy_version 212850 (0.0041) [2024-06-22 12:51:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3487432704. Throughput: 0: 42752.8. Samples: 3487551900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:51:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 12:51:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212857_3487449088.pth... [2024-06-22 12:51:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212230_3477176320.pth [2024-06-22 12:51:44,838][15401] Updated weights for policy 0, policy_version 212860 (0.0035) [2024-06-22 12:51:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3487645696. Throughput: 0: 42463.1. Samples: 3487803440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:51:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 12:51:48,906][15401] Updated weights for policy 0, policy_version 212870 (0.0045) [2024-06-22 12:51:52,450][15401] Updated weights for policy 0, policy_version 212880 (0.0032) [2024-06-22 12:51:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3487842304. Throughput: 0: 42508.9. Samples: 3487933820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:51:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 12:51:56,575][15401] Updated weights for policy 0, policy_version 212890 (0.0034) [2024-06-22 12:51:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3488071680. Throughput: 0: 42619.2. Samples: 3488189240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:51:58,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 12:52:00,551][15401] Updated weights for policy 0, policy_version 212900 (0.0042) [2024-06-22 12:52:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3488284672. Throughput: 0: 42431.5. Samples: 3488439240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 12:52:04,218][15401] Updated weights for policy 0, policy_version 212910 (0.0039) [2024-06-22 12:52:08,152][15401] Updated weights for policy 0, policy_version 212920 (0.0032) [2024-06-22 12:52:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3488481280. Throughput: 0: 42418.2. Samples: 3488568380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 12:52:11,714][15401] Updated weights for policy 0, policy_version 212930 (0.0036) [2024-06-22 12:52:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 3488694272. Throughput: 0: 42715.6. Samples: 3488827480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 12:52:16,015][15401] Updated weights for policy 0, policy_version 212940 (0.0041) [2024-06-22 12:52:16,767][15349] Signal inference workers to stop experience collection... (51500 times) [2024-06-22 12:52:16,810][15401] InferenceWorker_p0-w0: stopping experience collection (51500 times) [2024-06-22 12:52:16,830][15349] Signal inference workers to resume experience collection... (51500 times) [2024-06-22 12:52:16,832][15401] InferenceWorker_p0-w0: resuming experience collection (51500 times) [2024-06-22 12:52:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3488940032. Throughput: 0: 42357.3. Samples: 3489074780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 12:52:19,341][15401] Updated weights for policy 0, policy_version 212950 (0.0029) [2024-06-22 12:52:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.6). Total num frames: 3489120256. Throughput: 0: 42559.7. Samples: 3489210600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 12:52:23,626][15401] Updated weights for policy 0, policy_version 212960 (0.0044) [2024-06-22 12:52:26,799][15401] Updated weights for policy 0, policy_version 212970 (0.0024) [2024-06-22 12:52:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 3489316864. Throughput: 0: 42349.3. Samples: 3489457620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 12:52:31,245][15401] Updated weights for policy 0, policy_version 212980 (0.0033) [2024-06-22 12:52:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 3489562624. Throughput: 0: 42439.4. Samples: 3489713220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 12:52:34,375][15401] Updated weights for policy 0, policy_version 212990 (0.0029) [2024-06-22 12:52:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3489759232. Throughput: 0: 42482.7. Samples: 3489845540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 12:52:38,843][15401] Updated weights for policy 0, policy_version 213000 (0.0027) [2024-06-22 12:52:41,929][15401] Updated weights for policy 0, policy_version 213010 (0.0033) [2024-06-22 12:52:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3489972224. Throughput: 0: 42348.0. Samples: 3490094900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:43,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 12:52:46,389][15401] Updated weights for policy 0, policy_version 213020 (0.0029) [2024-06-22 12:52:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3490201600. Throughput: 0: 42625.0. Samples: 3490357360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 12:52:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 12:52:49,605][15401] Updated weights for policy 0, policy_version 213030 (0.0038) [2024-06-22 12:52:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3490398208. Throughput: 0: 42744.0. Samples: 3490491860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:52:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 12:52:54,500][15401] Updated weights for policy 0, policy_version 213040 (0.0024) [2024-06-22 12:52:57,169][15401] Updated weights for policy 0, policy_version 213050 (0.0030) [2024-06-22 12:52:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3490611200. Throughput: 0: 42340.4. Samples: 3490732800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:52:58,390][15132] Avg episode reward: [(0, '0.152')] [2024-06-22 12:53:02,394][15401] Updated weights for policy 0, policy_version 213060 (0.0040) [2024-06-22 12:53:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3490840576. Throughput: 0: 42657.8. Samples: 3490994380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 12:53:04,673][15401] Updated weights for policy 0, policy_version 213070 (0.0032) [2024-06-22 12:53:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3491020800. Throughput: 0: 42551.0. Samples: 3491125400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 12:53:09,939][15401] Updated weights for policy 0, policy_version 213080 (0.0038) [2024-06-22 12:53:12,192][15401] Updated weights for policy 0, policy_version 213090 (0.0050) [2024-06-22 12:53:13,391][15132] Fps is (10 sec: 42592.1, 60 sec: 42870.4, 300 sec: 42709.3). Total num frames: 3491266560. Throughput: 0: 42493.8. Samples: 3491369900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:13,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 12:53:17,515][15401] Updated weights for policy 0, policy_version 213100 (0.0036) [2024-06-22 12:53:18,390][15132] Fps is (10 sec: 45871.6, 60 sec: 42324.7, 300 sec: 42709.3). Total num frames: 3491479552. Throughput: 0: 42718.8. Samples: 3491635600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:18,391][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 12:53:20,178][15401] Updated weights for policy 0, policy_version 213110 (0.0033) [2024-06-22 12:53:23,389][15132] Fps is (10 sec: 39327.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3491659776. Throughput: 0: 42601.5. Samples: 3491762600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 12:53:24,875][15401] Updated weights for policy 0, policy_version 213120 (0.0025) [2024-06-22 12:53:27,772][15401] Updated weights for policy 0, policy_version 213130 (0.0036) [2024-06-22 12:53:28,390][15132] Fps is (10 sec: 44240.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3491921920. Throughput: 0: 42755.4. Samples: 3492018900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 12:53:31,185][15349] Signal inference workers to stop experience collection... (51550 times) [2024-06-22 12:53:31,186][15349] Signal inference workers to resume experience collection... (51550 times) [2024-06-22 12:53:31,228][15401] InferenceWorker_p0-w0: stopping experience collection (51550 times) [2024-06-22 12:53:31,228][15401] InferenceWorker_p0-w0: resuming experience collection (51550 times) [2024-06-22 12:53:32,441][15401] Updated weights for policy 0, policy_version 213140 (0.0029) [2024-06-22 12:53:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3492102144. Throughput: 0: 42721.3. Samples: 3492279820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 12:53:35,719][15401] Updated weights for policy 0, policy_version 213150 (0.0051) [2024-06-22 12:53:38,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3492298752. Throughput: 0: 42365.7. Samples: 3492398320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 12:53:39,972][15401] Updated weights for policy 0, policy_version 213160 (0.0025) [2024-06-22 12:53:43,390][15132] Fps is (10 sec: 45873.9, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 3492560896. Throughput: 0: 42977.1. Samples: 3492666780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 12:53:43,445][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213170_3492577280.pth... [2024-06-22 12:53:43,451][15401] Updated weights for policy 0, policy_version 213170 (0.0030) [2024-06-22 12:53:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212543_3482304512.pth [2024-06-22 12:53:47,745][15401] Updated weights for policy 0, policy_version 213180 (0.0032) [2024-06-22 12:53:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3492757504. Throughput: 0: 42748.8. Samples: 3492918080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 12:53:51,098][15401] Updated weights for policy 0, policy_version 213190 (0.0046) [2024-06-22 12:53:53,389][15132] Fps is (10 sec: 37684.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3492937728. Throughput: 0: 42608.6. Samples: 3493042780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 12:53:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 12:53:55,383][15401] Updated weights for policy 0, policy_version 213200 (0.0041) [2024-06-22 12:53:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 3493183488. Throughput: 0: 42964.9. Samples: 3493303260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:53:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 12:53:58,875][15401] Updated weights for policy 0, policy_version 213210 (0.0030) [2024-06-22 12:54:03,054][15401] Updated weights for policy 0, policy_version 213220 (0.0027) [2024-06-22 12:54:03,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 3493396480. Throughput: 0: 42728.7. Samples: 3493558360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 12:54:06,416][15401] Updated weights for policy 0, policy_version 213230 (0.0030) [2024-06-22 12:54:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3493593088. Throughput: 0: 42815.5. Samples: 3493689300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 12:54:10,666][15401] Updated weights for policy 0, policy_version 213240 (0.0041) [2024-06-22 12:54:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.4, 300 sec: 42542.9). Total num frames: 3493822464. Throughput: 0: 42895.6. Samples: 3493949200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 12:54:14,166][15401] Updated weights for policy 0, policy_version 213250 (0.0035) [2024-06-22 12:54:18,358][15401] Updated weights for policy 0, policy_version 213260 (0.0037) [2024-06-22 12:54:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42872.0, 300 sec: 42654.0). Total num frames: 3494051840. Throughput: 0: 42760.3. Samples: 3494204040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 12:54:21,984][15401] Updated weights for policy 0, policy_version 213270 (0.0028) [2024-06-22 12:54:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 3494248448. Throughput: 0: 43046.6. Samples: 3494335420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:23,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 12:54:25,969][15401] Updated weights for policy 0, policy_version 213280 (0.0037) [2024-06-22 12:54:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 3494461440. Throughput: 0: 42772.2. Samples: 3494591520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:28,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 12:54:29,708][15401] Updated weights for policy 0, policy_version 213290 (0.0047) [2024-06-22 12:54:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 3494674432. Throughput: 0: 42921.8. Samples: 3494849560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:33,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 12:54:33,654][15401] Updated weights for policy 0, policy_version 213300 (0.0029) [2024-06-22 12:54:34,591][15349] Signal inference workers to stop experience collection... (51600 times) [2024-06-22 12:54:34,592][15349] Signal inference workers to resume experience collection... (51600 times) [2024-06-22 12:54:34,639][15401] InferenceWorker_p0-w0: stopping experience collection (51600 times) [2024-06-22 12:54:34,640][15401] InferenceWorker_p0-w0: resuming experience collection (51600 times) [2024-06-22 12:54:37,345][15401] Updated weights for policy 0, policy_version 213310 (0.0037) [2024-06-22 12:54:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.9, 300 sec: 42653.9). Total num frames: 3494887424. Throughput: 0: 43100.7. Samples: 3494982420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:38,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 12:54:41,227][15401] Updated weights for policy 0, policy_version 213320 (0.0023) [2024-06-22 12:54:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3495116800. Throughput: 0: 42868.0. Samples: 3495232320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 12:54:44,927][15401] Updated weights for policy 0, policy_version 213330 (0.0021) [2024-06-22 12:54:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3495329792. Throughput: 0: 42929.5. Samples: 3495490180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 12:54:48,906][15401] Updated weights for policy 0, policy_version 213340 (0.0030) [2024-06-22 12:54:52,856][15401] Updated weights for policy 0, policy_version 213350 (0.0036) [2024-06-22 12:54:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 3495542784. Throughput: 0: 42962.2. Samples: 3495622600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 12:54:56,390][15401] Updated weights for policy 0, policy_version 213360 (0.0023) [2024-06-22 12:54:58,392][15132] Fps is (10 sec: 42589.6, 60 sec: 42870.1, 300 sec: 42598.5). Total num frames: 3495755776. Throughput: 0: 42768.8. Samples: 3495873880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:54:58,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 12:55:00,594][15401] Updated weights for policy 0, policy_version 213370 (0.0028) [2024-06-22 12:55:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3495968768. Throughput: 0: 42967.7. Samples: 3496137580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 12:55:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 12:55:04,311][15401] Updated weights for policy 0, policy_version 213380 (0.0026) [2024-06-22 12:55:08,087][15401] Updated weights for policy 0, policy_version 213390 (0.0045) [2024-06-22 12:55:08,390][15132] Fps is (10 sec: 42606.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 3496181760. Throughput: 0: 42785.8. Samples: 3496260780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 12:55:11,980][15401] Updated weights for policy 0, policy_version 213400 (0.0028) [2024-06-22 12:55:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3496411136. Throughput: 0: 42868.9. Samples: 3496520620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:13,390][15132] Avg episode reward: [(0, '0.079')] [2024-06-22 12:55:15,806][15401] Updated weights for policy 0, policy_version 213410 (0.0032) [2024-06-22 12:55:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3496607744. Throughput: 0: 42945.0. Samples: 3496782080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:18,391][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 12:55:19,489][15401] Updated weights for policy 0, policy_version 213420 (0.0033) [2024-06-22 12:55:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 3496804352. Throughput: 0: 42658.7. Samples: 3496901960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:23,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 12:55:23,893][15401] Updated weights for policy 0, policy_version 213430 (0.0024) [2024-06-22 12:55:27,218][15401] Updated weights for policy 0, policy_version 213440 (0.0039) [2024-06-22 12:55:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 3497066496. Throughput: 0: 42911.8. Samples: 3497163340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 12:55:31,442][15401] Updated weights for policy 0, policy_version 213450 (0.0026) [2024-06-22 12:55:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3497246720. Throughput: 0: 42862.2. Samples: 3497418980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 12:55:34,945][15401] Updated weights for policy 0, policy_version 213460 (0.0037) [2024-06-22 12:55:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 3497459712. Throughput: 0: 42578.7. Samples: 3497538640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 12:55:38,918][15401] Updated weights for policy 0, policy_version 213470 (0.0040) [2024-06-22 12:55:42,411][15401] Updated weights for policy 0, policy_version 213480 (0.0036) [2024-06-22 12:55:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3497721856. Throughput: 0: 43007.3. Samples: 3497809120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 12:55:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213484_3497721856.pth... [2024-06-22 12:55:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000212857_3487449088.pth [2024-06-22 12:55:46,438][15401] Updated weights for policy 0, policy_version 213490 (0.0033) [2024-06-22 12:55:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3497885696. Throughput: 0: 42865.6. Samples: 3498066640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:48,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 12:55:50,089][15401] Updated weights for policy 0, policy_version 213500 (0.0036) [2024-06-22 12:55:50,745][15349] Signal inference workers to stop experience collection... (51650 times) [2024-06-22 12:55:50,747][15349] Signal inference workers to resume experience collection... (51650 times) [2024-06-22 12:55:50,761][15401] InferenceWorker_p0-w0: stopping experience collection (51650 times) [2024-06-22 12:55:50,800][15401] InferenceWorker_p0-w0: resuming experience collection (51650 times) [2024-06-22 12:55:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3498115072. Throughput: 0: 42707.3. Samples: 3498182600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 12:55:54,347][15401] Updated weights for policy 0, policy_version 213510 (0.0038) [2024-06-22 12:55:57,542][15401] Updated weights for policy 0, policy_version 213520 (0.0047) [2024-06-22 12:55:58,390][15132] Fps is (10 sec: 45886.4, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 3498344448. Throughput: 0: 42845.8. Samples: 3498448680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:55:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 12:56:02,044][15401] Updated weights for policy 0, policy_version 213530 (0.0038) [2024-06-22 12:56:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 3498524672. Throughput: 0: 42873.2. Samples: 3498711380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:56:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 12:56:05,312][15401] Updated weights for policy 0, policy_version 213540 (0.0029) [2024-06-22 12:56:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3498770432. Throughput: 0: 42844.5. Samples: 3498829960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-22 12:56:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 12:56:09,774][15401] Updated weights for policy 0, policy_version 213550 (0.0042) [2024-06-22 12:56:12,879][15401] Updated weights for policy 0, policy_version 213560 (0.0036) [2024-06-22 12:56:13,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3498983424. Throughput: 0: 42867.0. Samples: 3499092360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 12:56:17,436][15401] Updated weights for policy 0, policy_version 213570 (0.0040) [2024-06-22 12:56:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 3499163648. Throughput: 0: 42959.5. Samples: 3499352260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:18,392][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 12:56:20,735][15401] Updated weights for policy 0, policy_version 213580 (0.0023) [2024-06-22 12:56:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 3499409408. Throughput: 0: 42962.6. Samples: 3499472060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:23,392][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 12:56:25,190][15401] Updated weights for policy 0, policy_version 213590 (0.0047) [2024-06-22 12:56:28,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3499606016. Throughput: 0: 42714.6. Samples: 3499731280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 12:56:28,418][15401] Updated weights for policy 0, policy_version 213600 (0.0042) [2024-06-22 12:56:32,762][15401] Updated weights for policy 0, policy_version 213610 (0.0031) [2024-06-22 12:56:33,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3499802624. Throughput: 0: 42677.4. Samples: 3499987020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 12:56:36,183][15401] Updated weights for policy 0, policy_version 213620 (0.0027) [2024-06-22 12:56:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3500048384. Throughput: 0: 42878.1. Samples: 3500112120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 12:56:40,216][15401] Updated weights for policy 0, policy_version 213630 (0.0039) [2024-06-22 12:56:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3500244992. Throughput: 0: 42887.1. Samples: 3500378600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 12:56:43,714][15401] Updated weights for policy 0, policy_version 213640 (0.0026) [2024-06-22 12:56:47,730][15401] Updated weights for policy 0, policy_version 213650 (0.0031) [2024-06-22 12:56:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3500457984. Throughput: 0: 42717.9. Samples: 3500633680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 12:56:51,247][15401] Updated weights for policy 0, policy_version 213660 (0.0042) [2024-06-22 12:56:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3500703744. Throughput: 0: 42917.7. Samples: 3500761260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 12:56:55,334][15401] Updated weights for policy 0, policy_version 213670 (0.0040) [2024-06-22 12:56:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3500900352. Throughput: 0: 42855.7. Samples: 3501020860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:56:58,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-22 12:56:58,964][15401] Updated weights for policy 0, policy_version 213680 (0.0031) [2024-06-22 12:57:02,992][15401] Updated weights for policy 0, policy_version 213690 (0.0042) [2024-06-22 12:57:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3501096960. Throughput: 0: 42738.2. Samples: 3501275380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:57:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 12:57:06,578][15401] Updated weights for policy 0, policy_version 213700 (0.0044) [2024-06-22 12:57:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3501326336. Throughput: 0: 42930.2. Samples: 3501403820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:57:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 12:57:10,564][15401] Updated weights for policy 0, policy_version 213710 (0.0023) [2024-06-22 12:57:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3501522944. Throughput: 0: 42881.3. Samples: 3501660940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:57:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 12:57:14,287][15401] Updated weights for policy 0, policy_version 213720 (0.0026) [2024-06-22 12:57:18,162][15401] Updated weights for policy 0, policy_version 213730 (0.0042) [2024-06-22 12:57:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 3501752320. Throughput: 0: 42795.1. Samples: 3501912900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 12:57:18,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 12:57:21,903][15401] Updated weights for policy 0, policy_version 213740 (0.0029) [2024-06-22 12:57:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 3501948928. Throughput: 0: 42914.7. Samples: 3502043280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 12:57:25,749][15401] Updated weights for policy 0, policy_version 213750 (0.0034) [2024-06-22 12:57:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3502178304. Throughput: 0: 42760.9. Samples: 3502302840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 12:57:29,668][15401] Updated weights for policy 0, policy_version 213760 (0.0034) [2024-06-22 12:57:32,998][15349] Signal inference workers to stop experience collection... (51700 times) [2024-06-22 12:57:33,045][15349] Signal inference workers to resume experience collection... (51700 times) [2024-06-22 12:57:33,052][15401] InferenceWorker_p0-w0: stopping experience collection (51700 times) [2024-06-22 12:57:33,071][15401] InferenceWorker_p0-w0: resuming experience collection (51700 times) [2024-06-22 12:57:33,193][15401] Updated weights for policy 0, policy_version 213770 (0.0031) [2024-06-22 12:57:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3502407680. Throughput: 0: 42769.8. Samples: 3502558320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 12:57:37,085][15401] Updated weights for policy 0, policy_version 213780 (0.0035) [2024-06-22 12:57:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3502587904. Throughput: 0: 42781.9. Samples: 3502686440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 12:57:40,695][15401] Updated weights for policy 0, policy_version 213790 (0.0035) [2024-06-22 12:57:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3502833664. Throughput: 0: 42735.4. Samples: 3502943960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 12:57:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213796_3502833664.pth... [2024-06-22 12:57:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213170_3492577280.pth [2024-06-22 12:57:44,651][15401] Updated weights for policy 0, policy_version 213800 (0.0033) [2024-06-22 12:57:48,301][15401] Updated weights for policy 0, policy_version 213810 (0.0030) [2024-06-22 12:57:48,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3503063040. Throughput: 0: 42721.3. Samples: 3503197840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 12:57:52,754][15401] Updated weights for policy 0, policy_version 213820 (0.0030) [2024-06-22 12:57:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 3503243264. Throughput: 0: 42761.7. Samples: 3503328100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 12:57:56,216][15401] Updated weights for policy 0, policy_version 213830 (0.0035) [2024-06-22 12:57:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3503472640. Throughput: 0: 42790.4. Samples: 3503586500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:57:58,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 12:58:00,518][15401] Updated weights for policy 0, policy_version 213840 (0.0039) [2024-06-22 12:58:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3503685632. Throughput: 0: 42901.2. Samples: 3503843360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:58:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 12:58:03,743][15401] Updated weights for policy 0, policy_version 213850 (0.0034) [2024-06-22 12:58:08,098][15401] Updated weights for policy 0, policy_version 213860 (0.0053) [2024-06-22 12:58:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 3503898624. Throughput: 0: 42834.7. Samples: 3503970840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:58:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 12:58:11,314][15401] Updated weights for policy 0, policy_version 213870 (0.0025) [2024-06-22 12:58:13,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 42820.7). Total num frames: 3504111616. Throughput: 0: 42718.7. Samples: 3504225180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:58:13,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 12:58:15,710][15401] Updated weights for policy 0, policy_version 213880 (0.0028) [2024-06-22 12:58:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 3504324608. Throughput: 0: 42843.0. Samples: 3504486260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:58:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 12:58:19,387][15401] Updated weights for policy 0, policy_version 213890 (0.0042) [2024-06-22 12:58:23,129][15401] Updated weights for policy 0, policy_version 213900 (0.0042) [2024-06-22 12:58:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3504553984. Throughput: 0: 42735.4. Samples: 3504609540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 12:58:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 12:58:26,851][15401] Updated weights for policy 0, policy_version 213910 (0.0027) [2024-06-22 12:58:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3504766976. Throughput: 0: 42871.1. Samples: 3504873160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 12:58:30,947][15401] Updated weights for policy 0, policy_version 213920 (0.0039) [2024-06-22 12:58:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3504963584. Throughput: 0: 42944.1. Samples: 3505130320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:33,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 12:58:34,667][15401] Updated weights for policy 0, policy_version 213930 (0.0046) [2024-06-22 12:58:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3505176576. Throughput: 0: 42722.8. Samples: 3505250620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 12:58:38,579][15401] Updated weights for policy 0, policy_version 213940 (0.0041) [2024-06-22 12:58:42,318][15401] Updated weights for policy 0, policy_version 213950 (0.0034) [2024-06-22 12:58:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3505422336. Throughput: 0: 42967.9. Samples: 3505520060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 12:58:46,008][15401] Updated weights for policy 0, policy_version 213960 (0.0044) [2024-06-22 12:58:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 3505602560. Throughput: 0: 43030.9. Samples: 3505779740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 12:58:50,006][15401] Updated weights for policy 0, policy_version 213970 (0.0034) [2024-06-22 12:58:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3505831936. Throughput: 0: 42994.5. Samples: 3505905600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 12:58:53,508][15401] Updated weights for policy 0, policy_version 213980 (0.0031) [2024-06-22 12:58:57,686][15401] Updated weights for policy 0, policy_version 213990 (0.0023) [2024-06-22 12:58:58,396][15132] Fps is (10 sec: 45846.2, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 3506061312. Throughput: 0: 43184.6. Samples: 3506168760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:58:58,396][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 12:59:01,343][15401] Updated weights for policy 0, policy_version 214000 (0.0033) [2024-06-22 12:59:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 3506241536. Throughput: 0: 43073.9. Samples: 3506424580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 12:59:05,091][15349] Signal inference workers to stop experience collection... (51750 times) [2024-06-22 12:59:05,092][15349] Signal inference workers to resume experience collection... (51750 times) [2024-06-22 12:59:05,118][15401] InferenceWorker_p0-w0: stopping experience collection (51750 times) [2024-06-22 12:59:05,119][15401] InferenceWorker_p0-w0: resuming experience collection (51750 times) [2024-06-22 12:59:05,269][15401] Updated weights for policy 0, policy_version 214010 (0.0027) [2024-06-22 12:59:08,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3506470912. Throughput: 0: 42980.1. Samples: 3506543640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 12:59:09,080][15401] Updated weights for policy 0, policy_version 214020 (0.0035) [2024-06-22 12:59:12,906][15401] Updated weights for policy 0, policy_version 214030 (0.0036) [2024-06-22 12:59:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3506683904. Throughput: 0: 42979.1. Samples: 3506807220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 12:59:17,035][15401] Updated weights for policy 0, policy_version 214040 (0.0036) [2024-06-22 12:59:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3506880512. Throughput: 0: 43028.1. Samples: 3507066580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 12:59:20,527][15401] Updated weights for policy 0, policy_version 214050 (0.0029) [2024-06-22 12:59:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3507109888. Throughput: 0: 43012.4. Samples: 3507186180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:23,391][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 12:59:24,479][15401] Updated weights for policy 0, policy_version 214060 (0.0028) [2024-06-22 12:59:28,043][15401] Updated weights for policy 0, policy_version 214070 (0.0037) [2024-06-22 12:59:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3507339264. Throughput: 0: 42987.6. Samples: 3507454500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 12:59:31,909][15401] Updated weights for policy 0, policy_version 214080 (0.0037) [2024-06-22 12:59:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3507535872. Throughput: 0: 42962.1. Samples: 3507713040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 12:59:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 12:59:35,673][15401] Updated weights for policy 0, policy_version 214090 (0.0022) [2024-06-22 12:59:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3507765248. Throughput: 0: 42789.1. Samples: 3507831100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:59:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 12:59:40,001][15401] Updated weights for policy 0, policy_version 214100 (0.0041) [2024-06-22 12:59:43,214][15401] Updated weights for policy 0, policy_version 214110 (0.0041) [2024-06-22 12:59:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3507994624. Throughput: 0: 42870.8. Samples: 3508097680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:59:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 12:59:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214111_3507994624.pth... [2024-06-22 12:59:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213484_3497721856.pth [2024-06-22 12:59:47,582][15401] Updated weights for policy 0, policy_version 214120 (0.0025) [2024-06-22 12:59:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3508158464. Throughput: 0: 43011.9. Samples: 3508360120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:59:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 12:59:50,841][15401] Updated weights for policy 0, policy_version 214130 (0.0030) [2024-06-22 12:59:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 3508404224. Throughput: 0: 42898.5. Samples: 3508474080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:59:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 12:59:55,186][15401] Updated weights for policy 0, policy_version 214140 (0.0038) [2024-06-22 12:59:58,282][15401] Updated weights for policy 0, policy_version 214150 (0.0040) [2024-06-22 12:59:58,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42876.1, 300 sec: 42931.6). Total num frames: 3508633600. Throughput: 0: 43039.7. Samples: 3508744000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 12:59:58,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 13:00:03,158][15401] Updated weights for policy 0, policy_version 214160 (0.0035) [2024-06-22 13:00:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3508797440. Throughput: 0: 43027.5. Samples: 3509002820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 13:00:06,007][15401] Updated weights for policy 0, policy_version 214170 (0.0032) [2024-06-22 13:00:06,535][15349] Signal inference workers to stop experience collection... (51800 times) [2024-06-22 13:00:06,537][15349] Signal inference workers to resume experience collection... (51800 times) [2024-06-22 13:00:06,564][15401] InferenceWorker_p0-w0: stopping experience collection (51800 times) [2024-06-22 13:00:06,564][15401] InferenceWorker_p0-w0: resuming experience collection (51800 times) [2024-06-22 13:00:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3509059584. Throughput: 0: 42988.0. Samples: 3509120640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 13:00:10,587][15401] Updated weights for policy 0, policy_version 214180 (0.0035) [2024-06-22 13:00:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 3509256192. Throughput: 0: 42965.7. Samples: 3509388060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:13,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 13:00:13,621][15401] Updated weights for policy 0, policy_version 214190 (0.0038) [2024-06-22 13:00:18,066][15401] Updated weights for policy 0, policy_version 214200 (0.0023) [2024-06-22 13:00:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3509452800. Throughput: 0: 42876.9. Samples: 3509642500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 13:00:21,257][15401] Updated weights for policy 0, policy_version 214210 (0.0035) [2024-06-22 13:00:23,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3509714944. Throughput: 0: 42997.2. Samples: 3509765980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 13:00:25,624][15401] Updated weights for policy 0, policy_version 214220 (0.0029) [2024-06-22 13:00:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3509878784. Throughput: 0: 42883.7. Samples: 3510027440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 13:00:28,992][15401] Updated weights for policy 0, policy_version 214230 (0.0036) [2024-06-22 13:00:33,201][15401] Updated weights for policy 0, policy_version 214240 (0.0032) [2024-06-22 13:00:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 3510108160. Throughput: 0: 42712.6. Samples: 3510282180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 13:00:36,795][15401] Updated weights for policy 0, policy_version 214250 (0.0032) [2024-06-22 13:00:38,393][15132] Fps is (10 sec: 49133.9, 60 sec: 43414.9, 300 sec: 42875.6). Total num frames: 3510370304. Throughput: 0: 43006.0. Samples: 3510409500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:38,394][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 13:00:40,981][15401] Updated weights for policy 0, policy_version 214260 (0.0037) [2024-06-22 13:00:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42876.5). Total num frames: 3510534144. Throughput: 0: 42958.7. Samples: 3510677140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 13:00:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 13:00:44,326][15401] Updated weights for policy 0, policy_version 214270 (0.0037) [2024-06-22 13:00:48,389][15132] Fps is (10 sec: 37697.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 3510747136. Throughput: 0: 42742.0. Samples: 3510926200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:00:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 13:00:48,483][15401] Updated weights for policy 0, policy_version 214280 (0.0038) [2024-06-22 13:00:51,906][15401] Updated weights for policy 0, policy_version 214290 (0.0032) [2024-06-22 13:00:53,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3511009280. Throughput: 0: 43053.8. Samples: 3511058060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:00:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 13:00:55,961][15401] Updated weights for policy 0, policy_version 214300 (0.0033) [2024-06-22 13:00:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 3511156736. Throughput: 0: 42889.5. Samples: 3511317980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:00:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 13:00:59,472][15401] Updated weights for policy 0, policy_version 214310 (0.0029) [2024-06-22 13:01:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3511402496. Throughput: 0: 42826.0. Samples: 3511569660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:03,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 13:01:03,428][15401] Updated weights for policy 0, policy_version 214320 (0.0042) [2024-06-22 13:01:07,318][15401] Updated weights for policy 0, policy_version 214330 (0.0043) [2024-06-22 13:01:08,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 3511648256. Throughput: 0: 42905.1. Samples: 3511696700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 13:01:11,526][15401] Updated weights for policy 0, policy_version 214340 (0.0028) [2024-06-22 13:01:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42932.0). Total num frames: 3511828480. Throughput: 0: 43007.1. Samples: 3511962760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 13:01:14,972][15401] Updated weights for policy 0, policy_version 214350 (0.0041) [2024-06-22 13:01:18,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3512041472. Throughput: 0: 42931.8. Samples: 3512214120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 13:01:19,009][15401] Updated weights for policy 0, policy_version 214360 (0.0036) [2024-06-22 13:01:19,826][15349] Signal inference workers to stop experience collection... (51850 times) [2024-06-22 13:01:19,853][15401] InferenceWorker_p0-w0: stopping experience collection (51850 times) [2024-06-22 13:01:19,938][15349] Signal inference workers to resume experience collection... (51850 times) [2024-06-22 13:01:19,939][15401] InferenceWorker_p0-w0: resuming experience collection (51850 times) [2024-06-22 13:01:22,589][15401] Updated weights for policy 0, policy_version 214370 (0.0035) [2024-06-22 13:01:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3512287232. Throughput: 0: 42989.2. Samples: 3512343860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 13:01:27,143][15401] Updated weights for policy 0, policy_version 214380 (0.0035) [2024-06-22 13:01:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3512467456. Throughput: 0: 42813.6. Samples: 3512603760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 13:01:30,075][15401] Updated weights for policy 0, policy_version 214390 (0.0028) [2024-06-22 13:01:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3512696832. Throughput: 0: 42908.2. Samples: 3512857080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 13:01:34,602][15401] Updated weights for policy 0, policy_version 214400 (0.0028) [2024-06-22 13:01:37,611][15401] Updated weights for policy 0, policy_version 214410 (0.0028) [2024-06-22 13:01:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42874.1, 300 sec: 43042.7). Total num frames: 3512942592. Throughput: 0: 42879.9. Samples: 3512987660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 13:01:42,337][15401] Updated weights for policy 0, policy_version 214420 (0.0037) [2024-06-22 13:01:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3513106432. Throughput: 0: 42978.2. Samples: 3513252000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 13:01:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214424_3513122816.pth... [2024-06-22 13:01:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000213796_3502833664.pth [2024-06-22 13:01:45,271][15401] Updated weights for policy 0, policy_version 214430 (0.0030) [2024-06-22 13:01:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3513352192. Throughput: 0: 42956.3. Samples: 3513502700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 13:01:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 13:01:49,980][15401] Updated weights for policy 0, policy_version 214440 (0.0039) [2024-06-22 13:01:52,728][15401] Updated weights for policy 0, policy_version 214450 (0.0035) [2024-06-22 13:01:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3513565184. Throughput: 0: 43077.6. Samples: 3513635200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:01:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 13:01:57,408][15401] Updated weights for policy 0, policy_version 214460 (0.0035) [2024-06-22 13:01:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3513761792. Throughput: 0: 43054.2. Samples: 3513900200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:01:58,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 13:02:00,769][15401] Updated weights for policy 0, policy_version 214470 (0.0039) [2024-06-22 13:02:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3513974784. Throughput: 0: 42917.8. Samples: 3514145420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 13:02:04,876][15401] Updated weights for policy 0, policy_version 214480 (0.0024) [2024-06-22 13:02:08,091][15401] Updated weights for policy 0, policy_version 214490 (0.0022) [2024-06-22 13:02:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 43042.8). Total num frames: 3514220544. Throughput: 0: 42938.9. Samples: 3514276100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 13:02:12,629][15401] Updated weights for policy 0, policy_version 214500 (0.0032) [2024-06-22 13:02:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3514400768. Throughput: 0: 43053.7. Samples: 3514541180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:13,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 13:02:15,512][15401] Updated weights for policy 0, policy_version 214510 (0.0038) [2024-06-22 13:02:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3514630144. Throughput: 0: 43109.4. Samples: 3514797000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 13:02:20,202][15401] Updated weights for policy 0, policy_version 214520 (0.0028) [2024-06-22 13:02:22,912][15401] Updated weights for policy 0, policy_version 214530 (0.0037) [2024-06-22 13:02:23,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3514859520. Throughput: 0: 43114.3. Samples: 3514927800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 13:02:27,670][15401] Updated weights for policy 0, policy_version 214540 (0.0023) [2024-06-22 13:02:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3515039744. Throughput: 0: 43032.9. Samples: 3515188480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:28,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 13:02:30,552][15401] Updated weights for policy 0, policy_version 214550 (0.0030) [2024-06-22 13:02:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3515269120. Throughput: 0: 43089.3. Samples: 3515441720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:33,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 13:02:35,213][15401] Updated weights for policy 0, policy_version 214560 (0.0029) [2024-06-22 13:02:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 3515498496. Throughput: 0: 42884.1. Samples: 3515564980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 13:02:38,550][15401] Updated weights for policy 0, policy_version 214570 (0.0046) [2024-06-22 13:02:42,713][15401] Updated weights for policy 0, policy_version 214580 (0.0026) [2024-06-22 13:02:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3515695104. Throughput: 0: 42802.2. Samples: 3515826300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:02:46,182][15401] Updated weights for policy 0, policy_version 214590 (0.0028) [2024-06-22 13:02:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 3515908096. Throughput: 0: 43057.1. Samples: 3516082980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 13:02:50,418][15401] Updated weights for policy 0, policy_version 214600 (0.0033) [2024-06-22 13:02:51,255][15349] Signal inference workers to stop experience collection... (51900 times) [2024-06-22 13:02:51,255][15349] Signal inference workers to resume experience collection... (51900 times) [2024-06-22 13:02:51,282][15401] InferenceWorker_p0-w0: stopping experience collection (51900 times) [2024-06-22 13:02:51,282][15401] InferenceWorker_p0-w0: resuming experience collection (51900 times) [2024-06-22 13:02:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3516137472. Throughput: 0: 43090.6. Samples: 3516215180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 13:02:53,944][15401] Updated weights for policy 0, policy_version 214610 (0.0041) [2024-06-22 13:02:57,938][15401] Updated weights for policy 0, policy_version 214620 (0.0027) [2024-06-22 13:02:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3516334080. Throughput: 0: 42903.7. Samples: 3516471840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:02:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 13:03:01,623][15401] Updated weights for policy 0, policy_version 214630 (0.0028) [2024-06-22 13:03:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3516547072. Throughput: 0: 42826.1. Samples: 3516724180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:03,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 13:03:05,455][15401] Updated weights for policy 0, policy_version 214640 (0.0036) [2024-06-22 13:03:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3516776448. Throughput: 0: 42862.3. Samples: 3516856600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 13:03:09,113][15401] Updated weights for policy 0, policy_version 214650 (0.0035) [2024-06-22 13:03:13,055][15401] Updated weights for policy 0, policy_version 214660 (0.0025) [2024-06-22 13:03:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3516989440. Throughput: 0: 42794.2. Samples: 3517114220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 13:03:16,866][15401] Updated weights for policy 0, policy_version 214670 (0.0032) [2024-06-22 13:03:18,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3517202432. Throughput: 0: 42868.7. Samples: 3517370820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 13:03:20,662][15401] Updated weights for policy 0, policy_version 214680 (0.0047) [2024-06-22 13:03:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3517431808. Throughput: 0: 43071.4. Samples: 3517503200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:23,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:03:24,899][15401] Updated weights for policy 0, policy_version 214690 (0.0036) [2024-06-22 13:03:28,294][15401] Updated weights for policy 0, policy_version 214700 (0.0023) [2024-06-22 13:03:28,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43415.8, 300 sec: 42986.8). Total num frames: 3517644800. Throughput: 0: 42924.0. Samples: 3517757980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:28,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 13:03:32,525][15401] Updated weights for policy 0, policy_version 214710 (0.0031) [2024-06-22 13:03:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3517841408. Throughput: 0: 42907.1. Samples: 3518013800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 13:03:36,415][15401] Updated weights for policy 0, policy_version 214720 (0.0030) [2024-06-22 13:03:38,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 3518070784. Throughput: 0: 42852.3. Samples: 3518143640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:38,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 13:03:40,138][15401] Updated weights for policy 0, policy_version 214730 (0.0039) [2024-06-22 13:03:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3518267392. Throughput: 0: 42710.8. Samples: 3518393820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 13:03:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214738_3518267392.pth... [2024-06-22 13:03:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214111_3507994624.pth [2024-06-22 13:03:44,132][15401] Updated weights for policy 0, policy_version 214740 (0.0028) [2024-06-22 13:03:47,730][15401] Updated weights for policy 0, policy_version 214750 (0.0038) [2024-06-22 13:03:48,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3518464000. Throughput: 0: 42927.2. Samples: 3518655900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:03:51,648][15401] Updated weights for policy 0, policy_version 214760 (0.0026) [2024-06-22 13:03:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 3518709760. Throughput: 0: 42860.8. Samples: 3518785340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:03:55,222][15401] Updated weights for policy 0, policy_version 214770 (0.0040) [2024-06-22 13:03:58,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 3518922752. Throughput: 0: 42785.3. Samples: 3519039660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:03:58,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 13:03:59,254][15401] Updated weights for policy 0, policy_version 214780 (0.0042) [2024-06-22 13:04:02,668][15401] Updated weights for policy 0, policy_version 214790 (0.0045) [2024-06-22 13:04:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3519119360. Throughput: 0: 42848.5. Samples: 3519299000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:04:03,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-22 13:04:07,404][15401] Updated weights for policy 0, policy_version 214800 (0.0028) [2024-06-22 13:04:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3519332352. Throughput: 0: 42694.7. Samples: 3519424460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 13:04:10,107][15401] Updated weights for policy 0, policy_version 214810 (0.0030) [2024-06-22 13:04:12,172][15349] Signal inference workers to stop experience collection... (51950 times) [2024-06-22 13:04:12,209][15401] InferenceWorker_p0-w0: stopping experience collection (51950 times) [2024-06-22 13:04:12,232][15349] Signal inference workers to resume experience collection... (51950 times) [2024-06-22 13:04:12,232][15401] InferenceWorker_p0-w0: resuming experience collection (51950 times) [2024-06-22 13:04:13,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3519578112. Throughput: 0: 42742.3. Samples: 3519681280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 13:04:14,824][15401] Updated weights for policy 0, policy_version 214820 (0.0042) [2024-06-22 13:04:17,471][15401] Updated weights for policy 0, policy_version 214830 (0.0037) [2024-06-22 13:04:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3519774720. Throughput: 0: 42805.7. Samples: 3519940060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 13:04:22,299][15401] Updated weights for policy 0, policy_version 214840 (0.0037) [2024-06-22 13:04:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 3519987712. Throughput: 0: 42793.8. Samples: 3520069360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:23,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 13:04:24,912][15401] Updated weights for policy 0, policy_version 214850 (0.0025) [2024-06-22 13:04:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 3520233472. Throughput: 0: 43097.7. Samples: 3520333220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 13:04:29,938][15401] Updated weights for policy 0, policy_version 214860 (0.0038) [2024-06-22 13:04:32,735][15401] Updated weights for policy 0, policy_version 214870 (0.0030) [2024-06-22 13:04:33,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3520430080. Throughput: 0: 43019.6. Samples: 3520591780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 13:04:37,562][15401] Updated weights for policy 0, policy_version 214880 (0.0034) [2024-06-22 13:04:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3520643072. Throughput: 0: 43024.5. Samples: 3520721440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 13:04:40,450][15401] Updated weights for policy 0, policy_version 214890 (0.0036) [2024-06-22 13:04:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 3520872448. Throughput: 0: 43323.3. Samples: 3520989100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:43,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 13:04:45,088][15401] Updated weights for policy 0, policy_version 214900 (0.0039) [2024-06-22 13:04:48,131][15401] Updated weights for policy 0, policy_version 214910 (0.0024) [2024-06-22 13:04:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43690.5, 300 sec: 42987.2). Total num frames: 3521085440. Throughput: 0: 43183.1. Samples: 3521242240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 13:04:52,623][15401] Updated weights for policy 0, policy_version 214920 (0.0034) [2024-06-22 13:04:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3521298432. Throughput: 0: 43206.7. Samples: 3521368760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:53,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 13:04:55,799][15401] Updated weights for policy 0, policy_version 214930 (0.0035) [2024-06-22 13:04:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 43098.3). Total num frames: 3521511424. Throughput: 0: 43270.3. Samples: 3521628440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:04:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 13:05:00,038][15401] Updated weights for policy 0, policy_version 214940 (0.0042) [2024-06-22 13:05:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43416.0, 300 sec: 42931.3). Total num frames: 3521724416. Throughput: 0: 43091.5. Samples: 3521879280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:05:03,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 13:05:03,452][15401] Updated weights for policy 0, policy_version 214950 (0.0030) [2024-06-22 13:05:07,981][15401] Updated weights for policy 0, policy_version 214960 (0.0028) [2024-06-22 13:05:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 3521921024. Throughput: 0: 43135.3. Samples: 3522010340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:05:08,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-22 13:05:08,499][15349] Signal inference workers to stop experience collection... (52000 times) [2024-06-22 13:05:08,545][15401] InferenceWorker_p0-w0: stopping experience collection (52000 times) [2024-06-22 13:05:08,555][15349] Signal inference workers to resume experience collection... (52000 times) [2024-06-22 13:05:08,570][15401] InferenceWorker_p0-w0: resuming experience collection (52000 times) [2024-06-22 13:05:11,161][15401] Updated weights for policy 0, policy_version 214970 (0.0042) [2024-06-22 13:05:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3522150400. Throughput: 0: 43043.6. Samples: 3522270180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 13:05:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 13:05:15,342][15401] Updated weights for policy 0, policy_version 214980 (0.0044) [2024-06-22 13:05:18,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3522363392. Throughput: 0: 42886.0. Samples: 3522521660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 13:05:18,927][15401] Updated weights for policy 0, policy_version 214990 (0.0043) [2024-06-22 13:05:22,805][15401] Updated weights for policy 0, policy_version 215000 (0.0041) [2024-06-22 13:05:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 3522576384. Throughput: 0: 42947.1. Samples: 3522654060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 13:05:26,506][15401] Updated weights for policy 0, policy_version 215010 (0.0028) [2024-06-22 13:05:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42987.1). Total num frames: 3522789376. Throughput: 0: 42781.2. Samples: 3522914260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:05:30,515][15401] Updated weights for policy 0, policy_version 215020 (0.0027) [2024-06-22 13:05:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 3523018752. Throughput: 0: 42907.3. Samples: 3523173060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 13:05:33,955][15401] Updated weights for policy 0, policy_version 215030 (0.0036) [2024-06-22 13:05:38,001][15401] Updated weights for policy 0, policy_version 215040 (0.0043) [2024-06-22 13:05:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3523231744. Throughput: 0: 43061.8. Samples: 3523306540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 13:05:41,507][15401] Updated weights for policy 0, policy_version 215050 (0.0028) [2024-06-22 13:05:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3523428352. Throughput: 0: 42895.1. Samples: 3523558720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 13:05:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215054_3523444736.pth... [2024-06-22 13:05:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214424_3513122816.pth [2024-06-22 13:05:45,623][15401] Updated weights for policy 0, policy_version 215060 (0.0033) [2024-06-22 13:05:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3523657728. Throughput: 0: 43129.8. Samples: 3523820020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 13:05:49,075][15401] Updated weights for policy 0, policy_version 215070 (0.0034) [2024-06-22 13:05:53,275][15401] Updated weights for policy 0, policy_version 215080 (0.0031) [2024-06-22 13:05:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 3523870720. Throughput: 0: 43158.1. Samples: 3523952460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:53,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 13:05:56,842][15401] Updated weights for policy 0, policy_version 215090 (0.0028) [2024-06-22 13:05:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3524083712. Throughput: 0: 42970.7. Samples: 3524203860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:05:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 13:06:01,070][15401] Updated weights for policy 0, policy_version 215100 (0.0042) [2024-06-22 13:06:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3524296704. Throughput: 0: 43151.7. Samples: 3524463480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:06:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 13:06:04,269][15401] Updated weights for policy 0, policy_version 215110 (0.0030) [2024-06-22 13:06:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 3524509696. Throughput: 0: 43226.1. Samples: 3524599240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:06:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 13:06:08,480][15401] Updated weights for policy 0, policy_version 215120 (0.0032) [2024-06-22 13:06:12,197][15401] Updated weights for policy 0, policy_version 215130 (0.0044) [2024-06-22 13:06:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3524739072. Throughput: 0: 42993.4. Samples: 3524848960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:06:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 13:06:16,192][15401] Updated weights for policy 0, policy_version 215140 (0.0032) [2024-06-22 13:06:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3524935680. Throughput: 0: 42867.0. Samples: 3525102080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:06:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 13:06:19,786][15401] Updated weights for policy 0, policy_version 215150 (0.0027) [2024-06-22 13:06:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3525148672. Throughput: 0: 42752.1. Samples: 3525230380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-22 13:06:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 13:06:23,919][15401] Updated weights for policy 0, policy_version 215160 (0.0036) [2024-06-22 13:06:27,362][15401] Updated weights for policy 0, policy_version 215170 (0.0036) [2024-06-22 13:06:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3525378048. Throughput: 0: 42987.9. Samples: 3525493180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 13:06:31,547][15401] Updated weights for policy 0, policy_version 215180 (0.0029) [2024-06-22 13:06:31,697][15349] Signal inference workers to stop experience collection... (52050 times) [2024-06-22 13:06:31,749][15401] InferenceWorker_p0-w0: stopping experience collection (52050 times) [2024-06-22 13:06:31,753][15349] Signal inference workers to resume experience collection... (52050 times) [2024-06-22 13:06:31,759][15401] InferenceWorker_p0-w0: resuming experience collection (52050 times) [2024-06-22 13:06:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3525574656. Throughput: 0: 42886.2. Samples: 3525749900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 13:06:34,799][15401] Updated weights for policy 0, policy_version 215190 (0.0036) [2024-06-22 13:06:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 3525787648. Throughput: 0: 42911.5. Samples: 3525883480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:38,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-22 13:06:39,095][15401] Updated weights for policy 0, policy_version 215200 (0.0038) [2024-06-22 13:06:42,446][15401] Updated weights for policy 0, policy_version 215210 (0.0036) [2024-06-22 13:06:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3526017024. Throughput: 0: 42927.0. Samples: 3526135580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 13:06:46,619][15401] Updated weights for policy 0, policy_version 215220 (0.0024) [2024-06-22 13:06:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3526246400. Throughput: 0: 42907.2. Samples: 3526394300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 13:06:50,034][15401] Updated weights for policy 0, policy_version 215230 (0.0029) [2024-06-22 13:06:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3526426624. Throughput: 0: 42800.9. Samples: 3526525280. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 13:06:54,550][15401] Updated weights for policy 0, policy_version 215240 (0.0034) [2024-06-22 13:06:57,621][15401] Updated weights for policy 0, policy_version 215250 (0.0029) [2024-06-22 13:06:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3526672384. Throughput: 0: 42828.4. Samples: 3526776240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:06:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 13:07:02,142][15401] Updated weights for policy 0, policy_version 215260 (0.0030) [2024-06-22 13:07:03,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 3526885376. Throughput: 0: 42972.0. Samples: 3527035920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:03,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 13:07:05,266][15401] Updated weights for policy 0, policy_version 215270 (0.0026) [2024-06-22 13:07:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 3527081984. Throughput: 0: 43005.7. Samples: 3527165740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:08,393][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 13:07:09,856][15401] Updated weights for policy 0, policy_version 215280 (0.0032) [2024-06-22 13:07:12,768][15401] Updated weights for policy 0, policy_version 215290 (0.0033) [2024-06-22 13:07:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 3527327744. Throughput: 0: 42951.5. Samples: 3527426000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:13,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 13:07:17,355][15401] Updated weights for policy 0, policy_version 215300 (0.0032) [2024-06-22 13:07:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3527524352. Throughput: 0: 43005.5. Samples: 3527685140. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 13:07:20,784][15401] Updated weights for policy 0, policy_version 215310 (0.0035) [2024-06-22 13:07:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 3527720960. Throughput: 0: 42782.2. Samples: 3527808680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 13:07:24,997][15401] Updated weights for policy 0, policy_version 215320 (0.0047) [2024-06-22 13:07:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 3527950336. Throughput: 0: 42906.2. Samples: 3528066460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:28,393][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 13:07:28,414][15401] Updated weights for policy 0, policy_version 215330 (0.0042) [2024-06-22 13:07:32,601][15401] Updated weights for policy 0, policy_version 215340 (0.0025) [2024-06-22 13:07:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3528146944. Throughput: 0: 42976.4. Samples: 3528328240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:07:35,944][15401] Updated weights for policy 0, policy_version 215350 (0.0031) [2024-06-22 13:07:38,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 3528359936. Throughput: 0: 42788.6. Samples: 3528450760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 13:07:40,121][15401] Updated weights for policy 0, policy_version 215360 (0.0030) [2024-06-22 13:07:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3528605696. Throughput: 0: 42948.9. Samples: 3528708940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 13:07:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215369_3528605696.pth... [2024-06-22 13:07:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000214738_3518267392.pth [2024-06-22 13:07:43,635][15401] Updated weights for policy 0, policy_version 215370 (0.0028) [2024-06-22 13:07:47,840][15401] Updated weights for policy 0, policy_version 215380 (0.0032) [2024-06-22 13:07:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3528785920. Throughput: 0: 42892.6. Samples: 3528965980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:48,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 13:07:49,985][15349] Signal inference workers to stop experience collection... (52100 times) [2024-06-22 13:07:50,025][15401] InferenceWorker_p0-w0: stopping experience collection (52100 times) [2024-06-22 13:07:50,046][15349] Signal inference workers to resume experience collection... (52100 times) [2024-06-22 13:07:50,052][15401] InferenceWorker_p0-w0: resuming experience collection (52100 times) [2024-06-22 13:07:51,201][15401] Updated weights for policy 0, policy_version 215390 (0.0032) [2024-06-22 13:07:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 3528998912. Throughput: 0: 42713.3. Samples: 3529087840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:53,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 13:07:55,376][15401] Updated weights for policy 0, policy_version 215400 (0.0031) [2024-06-22 13:07:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3529244672. Throughput: 0: 42780.5. Samples: 3529351120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:07:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 13:07:58,725][15401] Updated weights for policy 0, policy_version 215410 (0.0036) [2024-06-22 13:08:03,163][15401] Updated weights for policy 0, policy_version 215420 (0.0039) [2024-06-22 13:08:03,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 3529441280. Throughput: 0: 42769.6. Samples: 3529609780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 13:08:06,337][15401] Updated weights for policy 0, policy_version 215430 (0.0034) [2024-06-22 13:08:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 3529654272. Throughput: 0: 42775.7. Samples: 3529733580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 13:08:10,759][15401] Updated weights for policy 0, policy_version 215440 (0.0037) [2024-06-22 13:08:13,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 3529883648. Throughput: 0: 42719.3. Samples: 3529988720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 13:08:13,944][15401] Updated weights for policy 0, policy_version 215450 (0.0032) [2024-06-22 13:08:18,393][15132] Fps is (10 sec: 40944.6, 60 sec: 42322.6, 300 sec: 42820.0). Total num frames: 3530063872. Throughput: 0: 42732.8. Samples: 3530251380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:18,394][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 13:08:18,818][15401] Updated weights for policy 0, policy_version 215460 (0.0032) [2024-06-22 13:08:21,595][15401] Updated weights for policy 0, policy_version 215470 (0.0029) [2024-06-22 13:08:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 3530276864. Throughput: 0: 42725.7. Samples: 3530373420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:23,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 13:08:26,365][15401] Updated weights for policy 0, policy_version 215480 (0.0038) [2024-06-22 13:08:28,389][15132] Fps is (10 sec: 45892.7, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 3530522624. Throughput: 0: 42521.3. Samples: 3530622400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 13:08:29,322][15401] Updated weights for policy 0, policy_version 215490 (0.0030) [2024-06-22 13:08:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3530702848. Throughput: 0: 42784.5. Samples: 3530891280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 13:08:33,979][15401] Updated weights for policy 0, policy_version 215500 (0.0042) [2024-06-22 13:08:37,047][15401] Updated weights for policy 0, policy_version 215510 (0.0040) [2024-06-22 13:08:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3530932224. Throughput: 0: 42782.2. Samples: 3531012940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 13:08:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 13:08:41,958][15401] Updated weights for policy 0, policy_version 215520 (0.0040) [2024-06-22 13:08:43,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42325.2, 300 sec: 42987.1). Total num frames: 3531145216. Throughput: 0: 42671.9. Samples: 3531271360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:08:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 13:08:45,335][15401] Updated weights for policy 0, policy_version 215530 (0.0049) [2024-06-22 13:08:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3531341824. Throughput: 0: 42602.7. Samples: 3531526900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:08:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 13:08:49,592][15401] Updated weights for policy 0, policy_version 215540 (0.0040) [2024-06-22 13:08:53,079][15401] Updated weights for policy 0, policy_version 215550 (0.0029) [2024-06-22 13:08:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 3531571200. Throughput: 0: 42560.9. Samples: 3531648820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:08:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 13:08:57,607][15401] Updated weights for policy 0, policy_version 215560 (0.0041) [2024-06-22 13:08:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3531784192. Throughput: 0: 42710.0. Samples: 3531910680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:08:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 13:09:00,569][15401] Updated weights for policy 0, policy_version 215570 (0.0037) [2024-06-22 13:09:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 3531980800. Throughput: 0: 42540.0. Samples: 3532165620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:03,393][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 13:09:05,149][15401] Updated weights for policy 0, policy_version 215580 (0.0026) [2024-06-22 13:09:08,116][15401] Updated weights for policy 0, policy_version 215590 (0.0036) [2024-06-22 13:09:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3532226560. Throughput: 0: 42641.8. Samples: 3532292300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:09:12,719][15401] Updated weights for policy 0, policy_version 215600 (0.0036) [2024-06-22 13:09:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3532439552. Throughput: 0: 42999.5. Samples: 3532557380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 13:09:15,779][15401] Updated weights for policy 0, policy_version 215610 (0.0033) [2024-06-22 13:09:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42874.2, 300 sec: 42876.5). Total num frames: 3532636160. Throughput: 0: 42786.2. Samples: 3532816660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 13:09:20,075][15349] Signal inference workers to stop experience collection... (52150 times) [2024-06-22 13:09:20,077][15349] Signal inference workers to resume experience collection... (52150 times) [2024-06-22 13:09:20,088][15401] InferenceWorker_p0-w0: stopping experience collection (52150 times) [2024-06-22 13:09:20,100][15401] InferenceWorker_p0-w0: resuming experience collection (52150 times) [2024-06-22 13:09:20,231][15401] Updated weights for policy 0, policy_version 215620 (0.0033) [2024-06-22 13:09:23,315][15401] Updated weights for policy 0, policy_version 215630 (0.0038) [2024-06-22 13:09:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3532881920. Throughput: 0: 42808.8. Samples: 3532939340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 13:09:27,915][15401] Updated weights for policy 0, policy_version 215640 (0.0045) [2024-06-22 13:09:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3533078528. Throughput: 0: 42955.7. Samples: 3533204360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 13:09:30,816][15401] Updated weights for policy 0, policy_version 215650 (0.0045) [2024-06-22 13:09:33,392][15132] Fps is (10 sec: 40950.8, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 3533291520. Throughput: 0: 42973.4. Samples: 3533460800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:33,392][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 13:09:35,365][15401] Updated weights for policy 0, policy_version 215660 (0.0033) [2024-06-22 13:09:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3533520896. Throughput: 0: 43041.7. Samples: 3533585700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 13:09:38,838][15401] Updated weights for policy 0, policy_version 215670 (0.0035) [2024-06-22 13:09:42,945][15401] Updated weights for policy 0, policy_version 215680 (0.0022) [2024-06-22 13:09:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3533733888. Throughput: 0: 43216.0. Samples: 3533855400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 13:09:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:09:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215682_3533733888.pth... [2024-06-22 13:09:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215054_3523444736.pth [2024-06-22 13:09:46,303][15401] Updated weights for policy 0, policy_version 215690 (0.0023) [2024-06-22 13:09:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3533930496. Throughput: 0: 43176.1. Samples: 3534108440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:09:48,396][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 13:09:50,333][15401] Updated weights for policy 0, policy_version 215700 (0.0031) [2024-06-22 13:09:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3534159872. Throughput: 0: 43101.3. Samples: 3534231860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:09:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 13:09:53,887][15401] Updated weights for policy 0, policy_version 215710 (0.0042) [2024-06-22 13:09:58,074][15401] Updated weights for policy 0, policy_version 215720 (0.0033) [2024-06-22 13:09:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 3534372864. Throughput: 0: 43078.3. Samples: 3534495900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:09:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 13:10:01,563][15401] Updated weights for policy 0, policy_version 215730 (0.0024) [2024-06-22 13:10:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3534569472. Throughput: 0: 43079.5. Samples: 3534755240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 13:10:05,596][15401] Updated weights for policy 0, policy_version 215740 (0.0036) [2024-06-22 13:10:08,391][15132] Fps is (10 sec: 44231.6, 60 sec: 43143.7, 300 sec: 42931.5). Total num frames: 3534815232. Throughput: 0: 42986.1. Samples: 3534873760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:08,391][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 13:10:09,207][15401] Updated weights for policy 0, policy_version 215750 (0.0042) [2024-06-22 13:10:13,265][15401] Updated weights for policy 0, policy_version 215760 (0.0037) [2024-06-22 13:10:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3535011840. Throughput: 0: 43002.5. Samples: 3535139480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 13:10:16,794][15401] Updated weights for policy 0, policy_version 215770 (0.0033) [2024-06-22 13:10:18,389][15132] Fps is (10 sec: 40965.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3535224832. Throughput: 0: 42933.0. Samples: 3535392680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 13:10:20,820][15401] Updated weights for policy 0, policy_version 215780 (0.0033) [2024-06-22 13:10:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 3535437824. Throughput: 0: 43014.2. Samples: 3535521440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:23,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 13:10:24,345][15401] Updated weights for policy 0, policy_version 215790 (0.0031) [2024-06-22 13:10:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3535650816. Throughput: 0: 42689.8. Samples: 3535776440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 13:10:28,763][15401] Updated weights for policy 0, policy_version 215800 (0.0026) [2024-06-22 13:10:31,977][15401] Updated weights for policy 0, policy_version 215810 (0.0031) [2024-06-22 13:10:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 3535863808. Throughput: 0: 42749.3. Samples: 3536032160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 13:10:36,270][15401] Updated weights for policy 0, policy_version 215820 (0.0027) [2024-06-22 13:10:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3536076800. Throughput: 0: 42989.0. Samples: 3536166360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:10:39,449][15401] Updated weights for policy 0, policy_version 215830 (0.0024) [2024-06-22 13:10:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3536289792. Throughput: 0: 42847.1. Samples: 3536424020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:43,400][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 13:10:43,886][15401] Updated weights for policy 0, policy_version 215840 (0.0036) [2024-06-22 13:10:47,054][15401] Updated weights for policy 0, policy_version 215850 (0.0033) [2024-06-22 13:10:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3536502784. Throughput: 0: 42718.5. Samples: 3536677580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 13:10:51,428][15401] Updated weights for policy 0, policy_version 215860 (0.0028) [2024-06-22 13:10:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3536715776. Throughput: 0: 42919.8. Samples: 3536805100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 13:10:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 13:10:54,961][15401] Updated weights for policy 0, policy_version 215870 (0.0042) [2024-06-22 13:10:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3536928768. Throughput: 0: 42729.3. Samples: 3537062300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:10:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 13:10:58,969][15401] Updated weights for policy 0, policy_version 215880 (0.0037) [2024-06-22 13:11:00,904][15349] Signal inference workers to stop experience collection... (52200 times) [2024-06-22 13:11:00,959][15401] InferenceWorker_p0-w0: stopping experience collection (52200 times) [2024-06-22 13:11:00,967][15349] Signal inference workers to resume experience collection... (52200 times) [2024-06-22 13:11:00,982][15401] InferenceWorker_p0-w0: resuming experience collection (52200 times) [2024-06-22 13:11:02,712][15401] Updated weights for policy 0, policy_version 215890 (0.0034) [2024-06-22 13:11:03,391][15132] Fps is (10 sec: 44228.9, 60 sec: 43143.2, 300 sec: 42875.8). Total num frames: 3537158144. Throughput: 0: 42769.4. Samples: 3537317380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:03,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 13:11:06,522][15401] Updated weights for policy 0, policy_version 215900 (0.0045) [2024-06-22 13:11:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.1, 300 sec: 42765.0). Total num frames: 3537354752. Throughput: 0: 42836.0. Samples: 3537448960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 13:11:10,527][15401] Updated weights for policy 0, policy_version 215910 (0.0031) [2024-06-22 13:11:13,389][15132] Fps is (10 sec: 40967.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3537567744. Throughput: 0: 42747.1. Samples: 3537700060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 13:11:14,031][15401] Updated weights for policy 0, policy_version 215920 (0.0031) [2024-06-22 13:11:18,087][15401] Updated weights for policy 0, policy_version 215930 (0.0038) [2024-06-22 13:11:18,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3537813504. Throughput: 0: 42797.9. Samples: 3537958060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 13:11:21,581][15401] Updated weights for policy 0, policy_version 215940 (0.0040) [2024-06-22 13:11:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 3537977344. Throughput: 0: 42705.6. Samples: 3538088120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 13:11:25,881][15401] Updated weights for policy 0, policy_version 215950 (0.0038) [2024-06-22 13:11:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 3538223104. Throughput: 0: 42531.6. Samples: 3538338040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:28,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 13:11:29,353][15401] Updated weights for policy 0, policy_version 215960 (0.0031) [2024-06-22 13:11:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3538436096. Throughput: 0: 42738.3. Samples: 3538600800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 13:11:33,533][15401] Updated weights for policy 0, policy_version 215970 (0.0034) [2024-06-22 13:11:36,970][15401] Updated weights for policy 0, policy_version 215980 (0.0030) [2024-06-22 13:11:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3538632704. Throughput: 0: 42632.8. Samples: 3538723580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:38,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 13:11:41,316][15401] Updated weights for policy 0, policy_version 215990 (0.0038) [2024-06-22 13:11:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3538862080. Throughput: 0: 42529.8. Samples: 3538976140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:43,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 13:11:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215996_3538878464.pth... [2024-06-22 13:11:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215369_3528605696.pth [2024-06-22 13:11:45,114][15401] Updated weights for policy 0, policy_version 216000 (0.0031) [2024-06-22 13:11:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3539058688. Throughput: 0: 42633.8. Samples: 3539235820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 13:11:48,781][15401] Updated weights for policy 0, policy_version 216010 (0.0033) [2024-06-22 13:11:52,573][15401] Updated weights for policy 0, policy_version 216020 (0.0024) [2024-06-22 13:11:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3539288064. Throughput: 0: 42519.3. Samples: 3539362320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 13:11:56,612][15401] Updated weights for policy 0, policy_version 216030 (0.0051) [2024-06-22 13:11:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 3539484672. Throughput: 0: 42605.4. Samples: 3539617300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:11:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 13:12:00,317][15401] Updated weights for policy 0, policy_version 216040 (0.0045) [2024-06-22 13:12:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42326.5, 300 sec: 42765.3). Total num frames: 3539697664. Throughput: 0: 42688.7. Samples: 3539879060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 13:12:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 13:12:04,153][15401] Updated weights for policy 0, policy_version 216050 (0.0030) [2024-06-22 13:12:07,878][15401] Updated weights for policy 0, policy_version 216060 (0.0043) [2024-06-22 13:12:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 3539927040. Throughput: 0: 42540.2. Samples: 3540002420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 13:12:11,746][15401] Updated weights for policy 0, policy_version 216070 (0.0034) [2024-06-22 13:12:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3540140032. Throughput: 0: 42736.5. Samples: 3540261080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 13:12:15,601][15401] Updated weights for policy 0, policy_version 216080 (0.0026) [2024-06-22 13:12:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3540336640. Throughput: 0: 42624.9. Samples: 3540518920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:18,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 13:12:19,709][15401] Updated weights for policy 0, policy_version 216090 (0.0027) [2024-06-22 13:12:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 3540549632. Throughput: 0: 42617.5. Samples: 3540641360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 13:12:23,652][15401] Updated weights for policy 0, policy_version 216100 (0.0037) [2024-06-22 13:12:27,291][15349] Signal inference workers to stop experience collection... (52250 times) [2024-06-22 13:12:27,292][15349] Signal inference workers to resume experience collection... (52250 times) [2024-06-22 13:12:27,322][15401] InferenceWorker_p0-w0: stopping experience collection (52250 times) [2024-06-22 13:12:27,322][15401] InferenceWorker_p0-w0: resuming experience collection (52250 times) [2024-06-22 13:12:27,439][15401] Updated weights for policy 0, policy_version 216110 (0.0038) [2024-06-22 13:12:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 3540795392. Throughput: 0: 42761.4. Samples: 3540900400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 13:12:31,233][15401] Updated weights for policy 0, policy_version 216120 (0.0024) [2024-06-22 13:12:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3540959232. Throughput: 0: 42707.0. Samples: 3541157640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 13:12:35,398][15401] Updated weights for policy 0, policy_version 216130 (0.0044) [2024-06-22 13:12:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3541204992. Throughput: 0: 42475.4. Samples: 3541273820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:38,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 13:12:39,542][15401] Updated weights for policy 0, policy_version 216140 (0.0053) [2024-06-22 13:12:42,929][15401] Updated weights for policy 0, policy_version 216150 (0.0037) [2024-06-22 13:12:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3541401600. Throughput: 0: 42569.3. Samples: 3541532920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 13:12:46,994][15401] Updated weights for policy 0, policy_version 216160 (0.0039) [2024-06-22 13:12:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3541614592. Throughput: 0: 42399.8. Samples: 3541787040. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 13:12:50,936][15401] Updated weights for policy 0, policy_version 216170 (0.0032) [2024-06-22 13:12:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3541843968. Throughput: 0: 42408.9. Samples: 3541910820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 13:12:55,022][15401] Updated weights for policy 0, policy_version 216180 (0.0033) [2024-06-22 13:12:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42709.2). Total num frames: 3542040576. Throughput: 0: 42406.1. Samples: 3542169460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:12:58,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 13:12:58,570][15401] Updated weights for policy 0, policy_version 216190 (0.0034) [2024-06-22 13:13:02,706][15401] Updated weights for policy 0, policy_version 216200 (0.0031) [2024-06-22 13:13:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3542253568. Throughput: 0: 42393.8. Samples: 3542426640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:13:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 13:13:06,182][15401] Updated weights for policy 0, policy_version 216210 (0.0028) [2024-06-22 13:13:08,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 3542482944. Throughput: 0: 42486.5. Samples: 3542553360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-22 13:13:08,393][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 13:13:10,271][15401] Updated weights for policy 0, policy_version 216220 (0.0034) [2024-06-22 13:13:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42765.5). Total num frames: 3542679552. Throughput: 0: 42461.3. Samples: 3542811160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:13,391][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 13:13:13,734][15401] Updated weights for policy 0, policy_version 216230 (0.0033) [2024-06-22 13:13:17,784][15401] Updated weights for policy 0, policy_version 216240 (0.0026) [2024-06-22 13:13:18,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3542892544. Throughput: 0: 42339.3. Samples: 3543062900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 13:13:21,646][15401] Updated weights for policy 0, policy_version 216250 (0.0026) [2024-06-22 13:13:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3543121920. Throughput: 0: 42534.7. Samples: 3543187780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:23,391][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 13:13:25,944][15401] Updated weights for policy 0, policy_version 216260 (0.0040) [2024-06-22 13:13:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 3543318528. Throughput: 0: 42507.2. Samples: 3543445740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:13:29,452][15401] Updated weights for policy 0, policy_version 216270 (0.0033) [2024-06-22 13:13:33,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 3543515136. Throughput: 0: 42658.1. Samples: 3543706760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:33,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 13:13:33,522][15401] Updated weights for policy 0, policy_version 216280 (0.0030) [2024-06-22 13:13:36,969][15401] Updated weights for policy 0, policy_version 216290 (0.0037) [2024-06-22 13:13:37,305][15349] Signal inference workers to stop experience collection... (52300 times) [2024-06-22 13:13:37,314][15349] Signal inference workers to resume experience collection... (52300 times) [2024-06-22 13:13:37,340][15401] InferenceWorker_p0-w0: stopping experience collection (52300 times) [2024-06-22 13:13:37,340][15401] InferenceWorker_p0-w0: resuming experience collection (52300 times) [2024-06-22 13:13:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 3543777280. Throughput: 0: 42739.4. Samples: 3543834100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:38,396][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 13:13:40,972][15401] Updated weights for policy 0, policy_version 216300 (0.0034) [2024-06-22 13:13:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3543957504. Throughput: 0: 42717.0. Samples: 3544091620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 13:13:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216307_3543973888.pth... [2024-06-22 13:13:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215682_3533733888.pth [2024-06-22 13:13:44,577][15401] Updated weights for policy 0, policy_version 216310 (0.0031) [2024-06-22 13:13:48,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3544170496. Throughput: 0: 42659.1. Samples: 3544346300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 13:13:48,857][15401] Updated weights for policy 0, policy_version 216320 (0.0031) [2024-06-22 13:13:52,078][15401] Updated weights for policy 0, policy_version 216330 (0.0031) [2024-06-22 13:13:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3544432640. Throughput: 0: 42761.0. Samples: 3544477500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:53,396][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 13:13:56,370][15401] Updated weights for policy 0, policy_version 216340 (0.0035) [2024-06-22 13:13:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 3544596480. Throughput: 0: 42985.9. Samples: 3544745520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:13:58,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 13:13:59,480][15401] Updated weights for policy 0, policy_version 216350 (0.0032) [2024-06-22 13:14:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3544809472. Throughput: 0: 42929.2. Samples: 3544994720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:14:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 13:14:03,878][15401] Updated weights for policy 0, policy_version 216360 (0.0027) [2024-06-22 13:14:07,225][15401] Updated weights for policy 0, policy_version 216370 (0.0034) [2024-06-22 13:14:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 3545055232. Throughput: 0: 43079.2. Samples: 3545126340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:14:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 13:14:11,602][15401] Updated weights for policy 0, policy_version 216380 (0.0024) [2024-06-22 13:14:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3545235456. Throughput: 0: 43115.2. Samples: 3545385920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:14:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 13:14:14,826][15401] Updated weights for policy 0, policy_version 216390 (0.0042) [2024-06-22 13:14:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3545464832. Throughput: 0: 42806.2. Samples: 3545632940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:14:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 13:14:19,165][15401] Updated weights for policy 0, policy_version 216400 (0.0042) [2024-06-22 13:14:22,509][15401] Updated weights for policy 0, policy_version 216410 (0.0032) [2024-06-22 13:14:23,392][15132] Fps is (10 sec: 47501.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3545710592. Throughput: 0: 43003.1. Samples: 3545769340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:23,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 13:14:26,848][15401] Updated weights for policy 0, policy_version 216420 (0.0028) [2024-06-22 13:14:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 3545858048. Throughput: 0: 42897.3. Samples: 3546022000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:28,392][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 13:14:30,237][15401] Updated weights for policy 0, policy_version 216430 (0.0032) [2024-06-22 13:14:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 3546120192. Throughput: 0: 42733.0. Samples: 3546269280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 13:14:34,528][15401] Updated weights for policy 0, policy_version 216440 (0.0030) [2024-06-22 13:14:37,900][15401] Updated weights for policy 0, policy_version 216450 (0.0034) [2024-06-22 13:14:38,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3546349568. Throughput: 0: 42883.5. Samples: 3546407260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 13:14:42,000][15401] Updated weights for policy 0, policy_version 216460 (0.0036) [2024-06-22 13:14:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3546513408. Throughput: 0: 42487.6. Samples: 3546657460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:43,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 13:14:45,331][15349] Signal inference workers to stop experience collection... (52350 times) [2024-06-22 13:14:45,380][15401] InferenceWorker_p0-w0: stopping experience collection (52350 times) [2024-06-22 13:14:45,386][15349] Signal inference workers to resume experience collection... (52350 times) [2024-06-22 13:14:45,401][15401] InferenceWorker_p0-w0: resuming experience collection (52350 times) [2024-06-22 13:14:45,409][15401] Updated weights for policy 0, policy_version 216470 (0.0034) [2024-06-22 13:14:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 3546775552. Throughput: 0: 42573.8. Samples: 3546910640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:48,392][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 13:14:49,498][15401] Updated weights for policy 0, policy_version 216480 (0.0050) [2024-06-22 13:14:53,121][15401] Updated weights for policy 0, policy_version 216490 (0.0040) [2024-06-22 13:14:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3546988544. Throughput: 0: 42843.9. Samples: 3547054320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 13:14:57,117][15401] Updated weights for policy 0, policy_version 216500 (0.0046) [2024-06-22 13:14:58,389][15132] Fps is (10 sec: 37692.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3547152384. Throughput: 0: 42486.1. Samples: 3547297800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:14:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 13:15:00,694][15401] Updated weights for policy 0, policy_version 216510 (0.0032) [2024-06-22 13:15:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.1). Total num frames: 3547398144. Throughput: 0: 42576.6. Samples: 3547548880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 13:15:04,566][15401] Updated weights for policy 0, policy_version 216520 (0.0038) [2024-06-22 13:15:08,236][15401] Updated weights for policy 0, policy_version 216530 (0.0044) [2024-06-22 13:15:08,396][15132] Fps is (10 sec: 47483.4, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 3547627520. Throughput: 0: 42673.2. Samples: 3547689800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:08,396][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 13:15:12,801][15401] Updated weights for policy 0, policy_version 216540 (0.0040) [2024-06-22 13:15:13,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3547791360. Throughput: 0: 42616.5. Samples: 3547939740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 13:15:15,961][15401] Updated weights for policy 0, policy_version 216550 (0.0030) [2024-06-22 13:15:18,390][15132] Fps is (10 sec: 42625.0, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 3548053504. Throughput: 0: 42664.3. Samples: 3548189180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 13:15:20,493][15401] Updated weights for policy 0, policy_version 216560 (0.0036) [2024-06-22 13:15:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 3548233728. Throughput: 0: 42664.0. Samples: 3548327140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 13:15:24,068][15401] Updated weights for policy 0, policy_version 216570 (0.0040) [2024-06-22 13:15:28,222][15401] Updated weights for policy 0, policy_version 216580 (0.0032) [2024-06-22 13:15:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3548446720. Throughput: 0: 42683.4. Samples: 3548578220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 13:15:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 13:15:31,544][15401] Updated weights for policy 0, policy_version 216590 (0.0029) [2024-06-22 13:15:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3548692480. Throughput: 0: 42773.7. Samples: 3548835360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 13:15:35,737][15401] Updated weights for policy 0, policy_version 216600 (0.0045) [2024-06-22 13:15:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3548872704. Throughput: 0: 42628.4. Samples: 3548972600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 13:15:38,675][15349] Signal inference workers to stop experience collection... (52400 times) [2024-06-22 13:15:38,720][15401] InferenceWorker_p0-w0: stopping experience collection (52400 times) [2024-06-22 13:15:38,731][15349] Signal inference workers to resume experience collection... (52400 times) [2024-06-22 13:15:38,733][15401] InferenceWorker_p0-w0: resuming experience collection (52400 times) [2024-06-22 13:15:39,027][15401] Updated weights for policy 0, policy_version 216610 (0.0032) [2024-06-22 13:15:43,216][15401] Updated weights for policy 0, policy_version 216620 (0.0023) [2024-06-22 13:15:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3549102080. Throughput: 0: 42729.8. Samples: 3549220640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 13:15:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216620_3549102080.pth... [2024-06-22 13:15:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000215996_3538878464.pth [2024-06-22 13:15:46,868][15401] Updated weights for policy 0, policy_version 216630 (0.0037) [2024-06-22 13:15:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 3549331456. Throughput: 0: 42759.3. Samples: 3549473060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 13:15:50,749][15401] Updated weights for policy 0, policy_version 216640 (0.0038) [2024-06-22 13:15:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3549511680. Throughput: 0: 42441.4. Samples: 3549599400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 13:15:54,710][15401] Updated weights for policy 0, policy_version 216650 (0.0043) [2024-06-22 13:15:58,314][15401] Updated weights for policy 0, policy_version 216660 (0.0028) [2024-06-22 13:15:58,392][15132] Fps is (10 sec: 42588.9, 60 sec: 43415.9, 300 sec: 42709.4). Total num frames: 3549757440. Throughput: 0: 42602.7. Samples: 3549856960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:15:58,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 13:16:02,735][15401] Updated weights for policy 0, policy_version 216670 (0.0030) [2024-06-22 13:16:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3549970432. Throughput: 0: 42660.2. Samples: 3550108880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 13:16:06,413][15401] Updated weights for policy 0, policy_version 216680 (0.0042) [2024-06-22 13:16:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42056.8, 300 sec: 42653.9). Total num frames: 3550150656. Throughput: 0: 42482.4. Samples: 3550238840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 13:16:10,227][15401] Updated weights for policy 0, policy_version 216690 (0.0023) [2024-06-22 13:16:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 3550396416. Throughput: 0: 42582.3. Samples: 3550494420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 13:16:14,036][15401] Updated weights for policy 0, policy_version 216700 (0.0040) [2024-06-22 13:16:17,706][15401] Updated weights for policy 0, policy_version 216710 (0.0024) [2024-06-22 13:16:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3550593024. Throughput: 0: 42559.2. Samples: 3550750520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:18,394][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 13:16:21,711][15401] Updated weights for policy 0, policy_version 216720 (0.0047) [2024-06-22 13:16:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 3550789632. Throughput: 0: 42262.2. Samples: 3550874400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 13:16:25,540][15401] Updated weights for policy 0, policy_version 216730 (0.0051) [2024-06-22 13:16:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 3551019008. Throughput: 0: 42440.1. Samples: 3551130440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 13:16:29,388][15401] Updated weights for policy 0, policy_version 216740 (0.0046) [2024-06-22 13:16:33,151][15401] Updated weights for policy 0, policy_version 216750 (0.0032) [2024-06-22 13:16:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 3551248384. Throughput: 0: 42547.3. Samples: 3551387680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 13:16:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 13:16:36,977][15401] Updated weights for policy 0, policy_version 216760 (0.0031) [2024-06-22 13:16:38,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3551412224. Throughput: 0: 42649.8. Samples: 3551518640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:16:38,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 13:16:40,731][15401] Updated weights for policy 0, policy_version 216770 (0.0038) [2024-06-22 13:16:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3551657984. Throughput: 0: 42526.2. Samples: 3551770540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:16:43,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 13:16:45,213][15401] Updated weights for policy 0, policy_version 216780 (0.0028) [2024-06-22 13:16:48,323][15401] Updated weights for policy 0, policy_version 216790 (0.0025) [2024-06-22 13:16:48,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3551887360. Throughput: 0: 42689.4. Samples: 3552029900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:16:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 13:16:52,753][15401] Updated weights for policy 0, policy_version 216800 (0.0040) [2024-06-22 13:16:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 3552051200. Throughput: 0: 42593.3. Samples: 3552155540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:16:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 13:16:53,936][15349] Signal inference workers to stop experience collection... (52450 times) [2024-06-22 13:16:53,938][15349] Signal inference workers to resume experience collection... (52450 times) [2024-06-22 13:16:53,980][15401] InferenceWorker_p0-w0: stopping experience collection (52450 times) [2024-06-22 13:16:53,980][15401] InferenceWorker_p0-w0: resuming experience collection (52450 times) [2024-06-22 13:16:56,489][15401] Updated weights for policy 0, policy_version 216810 (0.0037) [2024-06-22 13:16:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 3552296960. Throughput: 0: 42482.7. Samples: 3552406140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:16:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 13:17:00,193][15401] Updated weights for policy 0, policy_version 216820 (0.0036) [2024-06-22 13:17:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3552509952. Throughput: 0: 42572.0. Samples: 3552666260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 13:17:04,070][15401] Updated weights for policy 0, policy_version 216830 (0.0036) [2024-06-22 13:17:08,153][15401] Updated weights for policy 0, policy_version 216840 (0.0024) [2024-06-22 13:17:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3552706560. Throughput: 0: 42619.7. Samples: 3552792280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:08,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 13:17:11,906][15401] Updated weights for policy 0, policy_version 216850 (0.0033) [2024-06-22 13:17:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3552935936. Throughput: 0: 42466.2. Samples: 3553041420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 13:17:15,845][15401] Updated weights for policy 0, policy_version 216860 (0.0032) [2024-06-22 13:17:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3553132544. Throughput: 0: 42492.8. Samples: 3553299860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 13:17:19,471][15401] Updated weights for policy 0, policy_version 216870 (0.0031) [2024-06-22 13:17:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 3553329152. Throughput: 0: 42262.8. Samples: 3553420460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 13:17:23,798][15401] Updated weights for policy 0, policy_version 216880 (0.0046) [2024-06-22 13:17:27,076][15401] Updated weights for policy 0, policy_version 216890 (0.0033) [2024-06-22 13:17:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3553574912. Throughput: 0: 42345.0. Samples: 3553676060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 13:17:31,419][15401] Updated weights for policy 0, policy_version 216900 (0.0042) [2024-06-22 13:17:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42543.2). Total num frames: 3553755136. Throughput: 0: 42497.7. Samples: 3553942300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 13:17:34,667][15401] Updated weights for policy 0, policy_version 216910 (0.0025) [2024-06-22 13:17:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3553968128. Throughput: 0: 42257.2. Samples: 3554057120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 13:17:39,164][15401] Updated weights for policy 0, policy_version 216920 (0.0038) [2024-06-22 13:17:42,285][15401] Updated weights for policy 0, policy_version 216930 (0.0032) [2024-06-22 13:17:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3554230272. Throughput: 0: 42435.1. Samples: 3554315720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 13:17:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 13:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216933_3554230272.pth... [2024-06-22 13:17:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216307_3543973888.pth [2024-06-22 13:17:47,304][15401] Updated weights for policy 0, policy_version 216940 (0.0040) [2024-06-22 13:17:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 3554394112. Throughput: 0: 42591.2. Samples: 3554582860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:17:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 13:17:49,912][15401] Updated weights for policy 0, policy_version 216950 (0.0027) [2024-06-22 13:17:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 3554607104. Throughput: 0: 42351.4. Samples: 3554698100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:17:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 13:17:54,991][15401] Updated weights for policy 0, policy_version 216960 (0.0035) [2024-06-22 13:17:57,552][15401] Updated weights for policy 0, policy_version 216970 (0.0034) [2024-06-22 13:17:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3554852864. Throughput: 0: 42484.3. Samples: 3554953220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:17:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 13:18:02,713][15401] Updated weights for policy 0, policy_version 216980 (0.0041) [2024-06-22 13:18:03,392][15132] Fps is (10 sec: 40950.7, 60 sec: 41777.6, 300 sec: 42487.3). Total num frames: 3555016704. Throughput: 0: 42488.9. Samples: 3555211960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:03,392][15132] Avg episode reward: [(0, '0.827')] [2024-06-22 13:18:04,670][15349] Signal inference workers to stop experience collection... (52500 times) [2024-06-22 13:18:04,672][15349] Signal inference workers to resume experience collection... (52500 times) [2024-06-22 13:18:04,693][15401] InferenceWorker_p0-w0: stopping experience collection (52500 times) [2024-06-22 13:18:04,694][15401] InferenceWorker_p0-w0: resuming experience collection (52500 times) [2024-06-22 13:18:05,702][15401] Updated weights for policy 0, policy_version 216990 (0.0049) [2024-06-22 13:18:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3555246080. Throughput: 0: 42405.9. Samples: 3555328720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 13:18:10,393][15401] Updated weights for policy 0, policy_version 217000 (0.0029) [2024-06-22 13:18:13,377][15401] Updated weights for policy 0, policy_version 217010 (0.0050) [2024-06-22 13:18:13,390][15132] Fps is (10 sec: 47525.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3555491840. Throughput: 0: 42479.1. Samples: 3555587620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 13:18:17,996][15401] Updated weights for policy 0, policy_version 217020 (0.0042) [2024-06-22 13:18:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 3555655680. Throughput: 0: 42300.4. Samples: 3555845820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 13:18:21,119][15401] Updated weights for policy 0, policy_version 217030 (0.0037) [2024-06-22 13:18:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3555901440. Throughput: 0: 42421.3. Samples: 3555966080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 13:18:25,655][15401] Updated weights for policy 0, policy_version 217040 (0.0027) [2024-06-22 13:18:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 3556114432. Throughput: 0: 42468.9. Samples: 3556226820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 13:18:29,144][15401] Updated weights for policy 0, policy_version 217050 (0.0039) [2024-06-22 13:18:33,348][15401] Updated weights for policy 0, policy_version 217060 (0.0026) [2024-06-22 13:18:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 3556311040. Throughput: 0: 42235.5. Samples: 3556483460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 13:18:36,789][15401] Updated weights for policy 0, policy_version 217070 (0.0033) [2024-06-22 13:18:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 3556524032. Throughput: 0: 42312.0. Samples: 3556602240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:38,393][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 13:18:41,247][15401] Updated weights for policy 0, policy_version 217080 (0.0040) [2024-06-22 13:18:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3556753408. Throughput: 0: 42461.4. Samples: 3556863980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 13:18:44,687][15401] Updated weights for policy 0, policy_version 217090 (0.0030) [2024-06-22 13:18:48,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3556933632. Throughput: 0: 42268.9. Samples: 3557113960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 13:18:48,923][15401] Updated weights for policy 0, policy_version 217100 (0.0031) [2024-06-22 13:18:52,451][15401] Updated weights for policy 0, policy_version 217110 (0.0027) [2024-06-22 13:18:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3557179392. Throughput: 0: 42492.8. Samples: 3557240900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 13:18:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 13:18:56,506][15401] Updated weights for policy 0, policy_version 217120 (0.0032) [2024-06-22 13:18:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 3557376000. Throughput: 0: 42479.5. Samples: 3557499200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:18:58,391][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 13:19:00,137][15401] Updated weights for policy 0, policy_version 217130 (0.0034) [2024-06-22 13:19:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 3557588992. Throughput: 0: 42394.2. Samples: 3557753560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 13:19:04,117][15401] Updated weights for policy 0, policy_version 217140 (0.0036) [2024-06-22 13:19:07,712][15401] Updated weights for policy 0, policy_version 217150 (0.0031) [2024-06-22 13:19:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3557818368. Throughput: 0: 42621.9. Samples: 3557884060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 13:19:11,580][15401] Updated weights for policy 0, policy_version 217160 (0.0031) [2024-06-22 13:19:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3558031360. Throughput: 0: 42591.7. Samples: 3558143440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:13,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 13:19:15,166][15401] Updated weights for policy 0, policy_version 217170 (0.0040) [2024-06-22 13:19:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.9, 300 sec: 42487.3). Total num frames: 3558244352. Throughput: 0: 42636.8. Samples: 3558402220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:18,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 13:19:19,009][15349] Signal inference workers to stop experience collection... (52550 times) [2024-06-22 13:19:19,009][15349] Signal inference workers to resume experience collection... (52550 times) [2024-06-22 13:19:19,039][15401] InferenceWorker_p0-w0: stopping experience collection (52550 times) [2024-06-22 13:19:19,039][15401] InferenceWorker_p0-w0: resuming experience collection (52550 times) [2024-06-22 13:19:19,161][15401] Updated weights for policy 0, policy_version 217180 (0.0031) [2024-06-22 13:19:22,714][15401] Updated weights for policy 0, policy_version 217190 (0.0036) [2024-06-22 13:19:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3558457344. Throughput: 0: 42802.7. Samples: 3558528260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 13:19:27,099][15401] Updated weights for policy 0, policy_version 217200 (0.0041) [2024-06-22 13:19:28,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 3558653952. Throughput: 0: 42735.1. Samples: 3558787060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:28,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 13:19:30,233][15401] Updated weights for policy 0, policy_version 217210 (0.0029) [2024-06-22 13:19:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 3558850560. Throughput: 0: 42979.6. Samples: 3559048040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 13:19:34,487][15401] Updated weights for policy 0, policy_version 217220 (0.0029) [2024-06-22 13:19:37,731][15401] Updated weights for policy 0, policy_version 217230 (0.0036) [2024-06-22 13:19:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 3559096320. Throughput: 0: 42964.8. Samples: 3559174320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 13:19:42,035][15401] Updated weights for policy 0, policy_version 217240 (0.0030) [2024-06-22 13:19:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 3559309312. Throughput: 0: 42908.4. Samples: 3559430080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 13:19:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217244_3559325696.pth... [2024-06-22 13:19:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216620_3549102080.pth [2024-06-22 13:19:45,349][15401] Updated weights for policy 0, policy_version 217250 (0.0038) [2024-06-22 13:19:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 3559505920. Throughput: 0: 43036.0. Samples: 3559690180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:19:49,569][15401] Updated weights for policy 0, policy_version 217260 (0.0039) [2024-06-22 13:19:53,288][15401] Updated weights for policy 0, policy_version 217270 (0.0035) [2024-06-22 13:19:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3559751680. Throughput: 0: 42928.0. Samples: 3559815820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:53,396][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 13:19:57,529][15401] Updated weights for policy 0, policy_version 217280 (0.0027) [2024-06-22 13:19:58,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 3559948288. Throughput: 0: 42898.8. Samples: 3560073880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 13:19:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 13:20:00,797][15401] Updated weights for policy 0, policy_version 217290 (0.0030) [2024-06-22 13:20:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42488.2). Total num frames: 3560161280. Throughput: 0: 42824.0. Samples: 3560329200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 13:20:04,982][15401] Updated weights for policy 0, policy_version 217300 (0.0034) [2024-06-22 13:20:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3560390656. Throughput: 0: 42921.8. Samples: 3560459740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 13:20:08,667][15401] Updated weights for policy 0, policy_version 217310 (0.0041) [2024-06-22 13:20:12,558][15401] Updated weights for policy 0, policy_version 217320 (0.0028) [2024-06-22 13:20:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 3560603648. Throughput: 0: 42899.2. Samples: 3560717520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 13:20:16,383][15401] Updated weights for policy 0, policy_version 217330 (0.0021) [2024-06-22 13:20:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 3560816640. Throughput: 0: 42699.9. Samples: 3560969640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:18,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 13:20:20,432][15401] Updated weights for policy 0, policy_version 217340 (0.0031) [2024-06-22 13:20:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3561029632. Throughput: 0: 42812.9. Samples: 3561100900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 13:20:23,920][15401] Updated weights for policy 0, policy_version 217350 (0.0034) [2024-06-22 13:20:28,350][15401] Updated weights for policy 0, policy_version 217360 (0.0029) [2024-06-22 13:20:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 3561226240. Throughput: 0: 42960.0. Samples: 3561363280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 13:20:31,512][15401] Updated weights for policy 0, policy_version 217370 (0.0047) [2024-06-22 13:20:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 3561472000. Throughput: 0: 42760.4. Samples: 3561614400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:33,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 13:20:35,083][15349] Signal inference workers to stop experience collection... (52600 times) [2024-06-22 13:20:35,084][15349] Signal inference workers to resume experience collection... (52600 times) [2024-06-22 13:20:35,132][15401] InferenceWorker_p0-w0: stopping experience collection (52600 times) [2024-06-22 13:20:35,132][15401] InferenceWorker_p0-w0: resuming experience collection (52600 times) [2024-06-22 13:20:35,862][15401] Updated weights for policy 0, policy_version 217380 (0.0037) [2024-06-22 13:20:38,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3561668608. Throughput: 0: 42913.0. Samples: 3561746900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 13:20:39,171][15401] Updated weights for policy 0, policy_version 217390 (0.0026) [2024-06-22 13:20:43,367][15401] Updated weights for policy 0, policy_version 217400 (0.0028) [2024-06-22 13:20:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 3561881600. Throughput: 0: 43030.4. Samples: 3562010260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 13:20:46,879][15401] Updated weights for policy 0, policy_version 217410 (0.0039) [2024-06-22 13:20:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 3562110976. Throughput: 0: 42861.8. Samples: 3562257980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:48,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 13:20:50,871][15401] Updated weights for policy 0, policy_version 217420 (0.0035) [2024-06-22 13:20:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 3562291200. Throughput: 0: 42846.8. Samples: 3562387840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:53,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 13:20:54,582][15401] Updated weights for policy 0, policy_version 217430 (0.0030) [2024-06-22 13:20:58,284][15401] Updated weights for policy 0, policy_version 217440 (0.0033) [2024-06-22 13:20:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3562536960. Throughput: 0: 42810.3. Samples: 3562643980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:20:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 13:21:02,534][15401] Updated weights for policy 0, policy_version 217450 (0.0026) [2024-06-22 13:21:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3562733568. Throughput: 0: 42815.6. Samples: 3562896240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:21:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 13:21:05,947][15401] Updated weights for policy 0, policy_version 217460 (0.0032) [2024-06-22 13:21:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 3562930176. Throughput: 0: 42604.9. Samples: 3563018120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 13:21:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 13:21:10,445][15401] Updated weights for policy 0, policy_version 217470 (0.0039) [2024-06-22 13:21:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3563175936. Throughput: 0: 42664.9. Samples: 3563283200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 13:21:13,455][15401] Updated weights for policy 0, policy_version 217480 (0.0031) [2024-06-22 13:21:18,061][15401] Updated weights for policy 0, policy_version 217490 (0.0033) [2024-06-22 13:21:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 3563372544. Throughput: 0: 42727.2. Samples: 3563537120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 13:21:20,984][15401] Updated weights for policy 0, policy_version 217500 (0.0030) [2024-06-22 13:21:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 3563585536. Throughput: 0: 42450.5. Samples: 3563657280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:23,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 13:21:25,702][15401] Updated weights for policy 0, policy_version 217510 (0.0036) [2024-06-22 13:21:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 3563798528. Throughput: 0: 42458.0. Samples: 3563920860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 13:21:29,190][15401] Updated weights for policy 0, policy_version 217520 (0.0031) [2024-06-22 13:21:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 3563995136. Throughput: 0: 42760.0. Samples: 3564182180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 13:21:33,512][15401] Updated weights for policy 0, policy_version 217530 (0.0039) [2024-06-22 13:21:36,632][15401] Updated weights for policy 0, policy_version 217540 (0.0051) [2024-06-22 13:21:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3564240896. Throughput: 0: 42561.7. Samples: 3564303120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 13:21:41,069][15401] Updated weights for policy 0, policy_version 217550 (0.0036) [2024-06-22 13:21:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 3564437504. Throughput: 0: 42629.2. Samples: 3564562300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:43,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 13:21:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217557_3564453888.pth... [2024-06-22 13:21:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000216933_3554230272.pth [2024-06-22 13:21:44,181][15401] Updated weights for policy 0, policy_version 217560 (0.0040) [2024-06-22 13:21:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3564634112. Throughput: 0: 42822.7. Samples: 3564823260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:48,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-22 13:21:48,600][15401] Updated weights for policy 0, policy_version 217570 (0.0037) [2024-06-22 13:21:51,711][15401] Updated weights for policy 0, policy_version 217580 (0.0030) [2024-06-22 13:21:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3564863488. Throughput: 0: 42788.5. Samples: 3564943600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:21:56,583][15401] Updated weights for policy 0, policy_version 217590 (0.0035) [2024-06-22 13:21:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3565076480. Throughput: 0: 42551.5. Samples: 3565198020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:21:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 13:21:59,413][15349] Signal inference workers to stop experience collection... (52650 times) [2024-06-22 13:21:59,417][15349] Signal inference workers to resume experience collection... (52650 times) [2024-06-22 13:21:59,438][15401] InferenceWorker_p0-w0: stopping experience collection (52650 times) [2024-06-22 13:21:59,438][15401] InferenceWorker_p0-w0: resuming experience collection (52650 times) [2024-06-22 13:21:59,582][15401] Updated weights for policy 0, policy_version 217600 (0.0034) [2024-06-22 13:22:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3565273088. Throughput: 0: 42611.1. Samples: 3565454620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:22:03,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 13:22:04,338][15401] Updated weights for policy 0, policy_version 217610 (0.0035) [2024-06-22 13:22:07,419][15401] Updated weights for policy 0, policy_version 217620 (0.0029) [2024-06-22 13:22:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3565502464. Throughput: 0: 42653.4. Samples: 3565576580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:22:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 13:22:12,014][15401] Updated weights for policy 0, policy_version 217630 (0.0024) [2024-06-22 13:22:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 3565699072. Throughput: 0: 42475.3. Samples: 3565832260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:22:13,391][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 13:22:15,358][15401] Updated weights for policy 0, policy_version 217640 (0.0044) [2024-06-22 13:22:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3565912064. Throughput: 0: 42263.5. Samples: 3566084040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 13:22:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 13:22:19,580][15401] Updated weights for policy 0, policy_version 217650 (0.0036) [2024-06-22 13:22:23,249][15401] Updated weights for policy 0, policy_version 217660 (0.0034) [2024-06-22 13:22:23,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 3566141440. Throughput: 0: 42450.6. Samples: 3566213400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 13:22:27,129][15401] Updated weights for policy 0, policy_version 217670 (0.0035) [2024-06-22 13:22:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 3566338048. Throughput: 0: 42386.4. Samples: 3566469680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:28,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-22 13:22:30,839][15401] Updated weights for policy 0, policy_version 217680 (0.0040) [2024-06-22 13:22:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3566551040. Throughput: 0: 42147.5. Samples: 3566719900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 13:22:35,155][15401] Updated weights for policy 0, policy_version 217690 (0.0040) [2024-06-22 13:22:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3566780416. Throughput: 0: 42345.2. Samples: 3566849140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 13:22:38,587][15401] Updated weights for policy 0, policy_version 217700 (0.0036) [2024-06-22 13:22:42,881][15401] Updated weights for policy 0, policy_version 217710 (0.0041) [2024-06-22 13:22:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 3566960640. Throughput: 0: 42382.1. Samples: 3567105220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 13:22:46,372][15401] Updated weights for policy 0, policy_version 217720 (0.0033) [2024-06-22 13:22:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3567206400. Throughput: 0: 42290.3. Samples: 3567357680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 13:22:50,522][15401] Updated weights for policy 0, policy_version 217730 (0.0031) [2024-06-22 13:22:53,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3567419392. Throughput: 0: 42587.6. Samples: 3567493020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 13:22:54,050][15401] Updated weights for policy 0, policy_version 217740 (0.0034) [2024-06-22 13:22:58,275][15401] Updated weights for policy 0, policy_version 217750 (0.0039) [2024-06-22 13:22:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42709.5). Total num frames: 3567616000. Throughput: 0: 42445.5. Samples: 3567742400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:22:58,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 13:23:01,909][15401] Updated weights for policy 0, policy_version 217760 (0.0038) [2024-06-22 13:23:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 3567845376. Throughput: 0: 42507.4. Samples: 3567996880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:23:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 13:23:05,961][15401] Updated weights for policy 0, policy_version 217770 (0.0051) [2024-06-22 13:23:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 3568025600. Throughput: 0: 42489.0. Samples: 3568125400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:23:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 13:23:09,587][15401] Updated weights for policy 0, policy_version 217780 (0.0030) [2024-06-22 13:23:13,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 3568238592. Throughput: 0: 42395.5. Samples: 3568377580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:23:13,392][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 13:23:13,975][15401] Updated weights for policy 0, policy_version 217790 (0.0037) [2024-06-22 13:23:17,322][15401] Updated weights for policy 0, policy_version 217800 (0.0037) [2024-06-22 13:23:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3568484352. Throughput: 0: 42332.0. Samples: 3568624840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:23:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 13:23:21,455][15401] Updated weights for policy 0, policy_version 217810 (0.0041) [2024-06-22 13:23:22,987][15349] Signal inference workers to stop experience collection... (52700 times) [2024-06-22 13:23:22,987][15349] Signal inference workers to resume experience collection... (52700 times) [2024-06-22 13:23:23,039][15401] InferenceWorker_p0-w0: stopping experience collection (52700 times) [2024-06-22 13:23:23,039][15401] InferenceWorker_p0-w0: resuming experience collection (52700 times) [2024-06-22 13:23:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3568680960. Throughput: 0: 42396.0. Samples: 3568756960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:23:23,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 13:23:25,097][15401] Updated weights for policy 0, policy_version 217820 (0.0050) [2024-06-22 13:23:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3568893952. Throughput: 0: 42396.3. Samples: 3569013040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 13:23:28,951][15401] Updated weights for policy 0, policy_version 217830 (0.0043) [2024-06-22 13:23:32,744][15401] Updated weights for policy 0, policy_version 217840 (0.0021) [2024-06-22 13:23:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 3569123328. Throughput: 0: 42359.6. Samples: 3569263860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 13:23:36,767][15401] Updated weights for policy 0, policy_version 217850 (0.0037) [2024-06-22 13:23:38,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 3569303552. Throughput: 0: 42297.5. Samples: 3569396420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 13:23:40,333][15401] Updated weights for policy 0, policy_version 217860 (0.0036) [2024-06-22 13:23:43,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3569516544. Throughput: 0: 42303.9. Samples: 3569645980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 13:23:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217866_3569516544.pth... [2024-06-22 13:23:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217244_3559325696.pth [2024-06-22 13:23:44,296][15401] Updated weights for policy 0, policy_version 217870 (0.0042) [2024-06-22 13:23:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 3569729536. Throughput: 0: 42464.2. Samples: 3569907760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 13:23:48,482][15401] Updated weights for policy 0, policy_version 217880 (0.0032) [2024-06-22 13:23:51,957][15401] Updated weights for policy 0, policy_version 217890 (0.0037) [2024-06-22 13:23:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3569958912. Throughput: 0: 42535.0. Samples: 3570039480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 13:23:56,059][15401] Updated weights for policy 0, policy_version 217900 (0.0038) [2024-06-22 13:23:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 3570155520. Throughput: 0: 42419.6. Samples: 3570286360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:23:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 13:23:59,783][15401] Updated weights for policy 0, policy_version 217910 (0.0033) [2024-06-22 13:24:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 3570384896. Throughput: 0: 42548.4. Samples: 3570539620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:03,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 13:24:03,706][15401] Updated weights for policy 0, policy_version 217920 (0.0037) [2024-06-22 13:24:07,875][15401] Updated weights for policy 0, policy_version 217930 (0.0023) [2024-06-22 13:24:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 3570565120. Throughput: 0: 42581.7. Samples: 3570673240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:08,393][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 13:24:11,362][15401] Updated weights for policy 0, policy_version 217940 (0.0034) [2024-06-22 13:24:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42600.1, 300 sec: 42543.2). Total num frames: 3570794496. Throughput: 0: 42345.6. Samples: 3570918600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:13,391][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 13:24:15,597][15401] Updated weights for policy 0, policy_version 217950 (0.0035) [2024-06-22 13:24:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3571023872. Throughput: 0: 42423.8. Samples: 3571172940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:18,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 13:24:18,967][15401] Updated weights for policy 0, policy_version 217960 (0.0028) [2024-06-22 13:24:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 3571204096. Throughput: 0: 42470.9. Samples: 3571307600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 13:24:23,808][15401] Updated weights for policy 0, policy_version 217970 (0.0037) [2024-06-22 13:24:26,845][15401] Updated weights for policy 0, policy_version 217980 (0.0034) [2024-06-22 13:24:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3571433472. Throughput: 0: 42447.2. Samples: 3571556100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 13:24:31,267][15401] Updated weights for policy 0, policy_version 217990 (0.0032) [2024-06-22 13:24:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 3571646464. Throughput: 0: 42216.7. Samples: 3571807520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:24:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 13:24:34,486][15401] Updated weights for policy 0, policy_version 218000 (0.0038) [2024-06-22 13:24:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3571859456. Throughput: 0: 42162.3. Samples: 3571936780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:24:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 13:24:38,706][15401] Updated weights for policy 0, policy_version 218010 (0.0027) [2024-06-22 13:24:42,175][15401] Updated weights for policy 0, policy_version 218020 (0.0043) [2024-06-22 13:24:43,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3572056064. Throughput: 0: 42292.0. Samples: 3572189500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:24:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 13:24:46,352][15401] Updated weights for policy 0, policy_version 218030 (0.0034) [2024-06-22 13:24:47,279][15349] Signal inference workers to stop experience collection... (52750 times) [2024-06-22 13:24:47,326][15401] InferenceWorker_p0-w0: stopping experience collection (52750 times) [2024-06-22 13:24:47,393][15349] Signal inference workers to resume experience collection... (52750 times) [2024-06-22 13:24:47,393][15401] InferenceWorker_p0-w0: resuming experience collection (52750 times) [2024-06-22 13:24:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 3572301824. Throughput: 0: 42315.5. Samples: 3572443720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:24:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 13:24:49,964][15401] Updated weights for policy 0, policy_version 218040 (0.0037) [2024-06-22 13:24:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 3572482048. Throughput: 0: 42201.0. Samples: 3572572180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:24:53,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-22 13:24:53,991][15401] Updated weights for policy 0, policy_version 218050 (0.0036) [2024-06-22 13:24:57,655][15401] Updated weights for policy 0, policy_version 218060 (0.0040) [2024-06-22 13:24:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3572711424. Throughput: 0: 42452.9. Samples: 3572828980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:24:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 13:25:01,544][15401] Updated weights for policy 0, policy_version 218070 (0.0040) [2024-06-22 13:25:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 3572908032. Throughput: 0: 42499.2. Samples: 3573085400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:03,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 13:25:05,318][15401] Updated weights for policy 0, policy_version 218080 (0.0043) [2024-06-22 13:25:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 3573137408. Throughput: 0: 42129.2. Samples: 3573203420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:08,395][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 13:25:09,614][15401] Updated weights for policy 0, policy_version 218090 (0.0040) [2024-06-22 13:25:12,971][15401] Updated weights for policy 0, policy_version 218100 (0.0035) [2024-06-22 13:25:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 3573366784. Throughput: 0: 42424.8. Samples: 3573465220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 13:25:17,210][15401] Updated weights for policy 0, policy_version 218110 (0.0038) [2024-06-22 13:25:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 3573530624. Throughput: 0: 42534.7. Samples: 3573721580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 13:25:20,952][15401] Updated weights for policy 0, policy_version 218120 (0.0046) [2024-06-22 13:25:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 3573760000. Throughput: 0: 42316.8. Samples: 3573841040. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 13:25:25,294][15401] Updated weights for policy 0, policy_version 218130 (0.0033) [2024-06-22 13:25:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 3573989376. Throughput: 0: 42388.5. Samples: 3574096980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 13:25:28,515][15401] Updated weights for policy 0, policy_version 218140 (0.0039) [2024-06-22 13:25:33,077][15401] Updated weights for policy 0, policy_version 218150 (0.0044) [2024-06-22 13:25:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 3574169600. Throughput: 0: 42417.0. Samples: 3574352480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 13:25:36,427][15401] Updated weights for policy 0, policy_version 218160 (0.0037) [2024-06-22 13:25:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 3574382592. Throughput: 0: 42338.2. Samples: 3574477400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 13:25:40,780][15401] Updated weights for policy 0, policy_version 218170 (0.0034) [2024-06-22 13:25:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 3574611968. Throughput: 0: 42211.5. Samples: 3574728500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-22 13:25:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 13:25:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218178_3574628352.pth... [2024-06-22 13:25:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217557_3564453888.pth [2024-06-22 13:25:44,293][15401] Updated weights for policy 0, policy_version 218180 (0.0041) [2024-06-22 13:25:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 3574808576. Throughput: 0: 42170.3. Samples: 3574983060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:25:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 13:25:48,490][15401] Updated weights for policy 0, policy_version 218190 (0.0043) [2024-06-22 13:25:51,873][15401] Updated weights for policy 0, policy_version 218200 (0.0039) [2024-06-22 13:25:53,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42324.0, 300 sec: 42320.4). Total num frames: 3575021568. Throughput: 0: 42205.0. Samples: 3575102720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:25:53,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 13:25:56,483][15401] Updated weights for policy 0, policy_version 218210 (0.0027) [2024-06-22 13:25:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 3575250944. Throughput: 0: 42237.9. Samples: 3575365920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:25:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 13:25:59,367][15401] Updated weights for policy 0, policy_version 218220 (0.0025) [2024-06-22 13:26:02,822][15349] Signal inference workers to stop experience collection... (52800 times) [2024-06-22 13:26:02,826][15349] Signal inference workers to resume experience collection... (52800 times) [2024-06-22 13:26:02,872][15401] InferenceWorker_p0-w0: stopping experience collection (52800 times) [2024-06-22 13:26:02,872][15401] InferenceWorker_p0-w0: resuming experience collection (52800 times) [2024-06-22 13:26:03,389][15132] Fps is (10 sec: 42606.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 3575447552. Throughput: 0: 42144.1. Samples: 3575618060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 13:26:04,178][15401] Updated weights for policy 0, policy_version 218230 (0.0037) [2024-06-22 13:26:07,347][15401] Updated weights for policy 0, policy_version 218240 (0.0030) [2024-06-22 13:26:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 3575660544. Throughput: 0: 42237.1. Samples: 3575741700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 13:26:11,973][15401] Updated weights for policy 0, policy_version 218250 (0.0048) [2024-06-22 13:26:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42050.6, 300 sec: 42431.4). Total num frames: 3575889920. Throughput: 0: 42336.8. Samples: 3576002240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:13,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 13:26:15,234][15401] Updated weights for policy 0, policy_version 218260 (0.0036) [2024-06-22 13:26:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42432.1). Total num frames: 3576102912. Throughput: 0: 42241.3. Samples: 3576253340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:18,396][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:26:19,694][15401] Updated weights for policy 0, policy_version 218270 (0.0042) [2024-06-22 13:26:22,852][15401] Updated weights for policy 0, policy_version 218280 (0.0046) [2024-06-22 13:26:23,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 3576299520. Throughput: 0: 42327.0. Samples: 3576382220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:23,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 13:26:27,160][15401] Updated weights for policy 0, policy_version 218290 (0.0038) [2024-06-22 13:26:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 3576512512. Throughput: 0: 42530.8. Samples: 3576642380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 13:26:30,377][15401] Updated weights for policy 0, policy_version 218300 (0.0031) [2024-06-22 13:26:33,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 3576741888. Throughput: 0: 42506.6. Samples: 3576895860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 13:26:34,766][15401] Updated weights for policy 0, policy_version 218310 (0.0037) [2024-06-22 13:26:37,914][15401] Updated weights for policy 0, policy_version 218320 (0.0038) [2024-06-22 13:26:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 3576954880. Throughput: 0: 42817.3. Samples: 3577029420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 13:26:42,774][15401] Updated weights for policy 0, policy_version 218330 (0.0046) [2024-06-22 13:26:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 3577151488. Throughput: 0: 42732.3. Samples: 3577288880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 13:26:45,471][15401] Updated weights for policy 0, policy_version 218340 (0.0038) [2024-06-22 13:26:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 3577380864. Throughput: 0: 42679.5. Samples: 3577538640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 13:26:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 13:26:50,181][15401] Updated weights for policy 0, policy_version 218350 (0.0046) [2024-06-22 13:26:53,051][15401] Updated weights for policy 0, policy_version 218360 (0.0021) [2024-06-22 13:26:53,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43145.9, 300 sec: 42487.3). Total num frames: 3577610240. Throughput: 0: 42872.8. Samples: 3577670980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:26:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 13:26:57,784][15401] Updated weights for policy 0, policy_version 218370 (0.0033) [2024-06-22 13:26:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 3577790464. Throughput: 0: 42817.8. Samples: 3577928940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:26:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 13:27:00,940][15401] Updated weights for policy 0, policy_version 218380 (0.0040) [2024-06-22 13:27:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 3578036224. Throughput: 0: 42760.5. Samples: 3578177560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 13:27:05,510][15401] Updated weights for policy 0, policy_version 218390 (0.0031) [2024-06-22 13:27:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42487.4). Total num frames: 3578232832. Throughput: 0: 42789.8. Samples: 3578307660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 13:27:09,310][15401] Updated weights for policy 0, policy_version 218400 (0.0029) [2024-06-22 13:27:10,712][15349] Signal inference workers to stop experience collection... (52850 times) [2024-06-22 13:27:10,713][15349] Signal inference workers to resume experience collection... (52850 times) [2024-06-22 13:27:10,744][15401] InferenceWorker_p0-w0: stopping experience collection (52850 times) [2024-06-22 13:27:10,745][15401] InferenceWorker_p0-w0: resuming experience collection (52850 times) [2024-06-22 13:27:13,219][15401] Updated weights for policy 0, policy_version 218410 (0.0030) [2024-06-22 13:27:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 3578429440. Throughput: 0: 42600.3. Samples: 3578559400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 13:27:16,996][15401] Updated weights for policy 0, policy_version 218420 (0.0036) [2024-06-22 13:27:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 3578658816. Throughput: 0: 42609.0. Samples: 3578813260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:18,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 13:27:21,199][15401] Updated weights for policy 0, policy_version 218430 (0.0038) [2024-06-22 13:27:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 3578855424. Throughput: 0: 42553.7. Samples: 3578944340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 13:27:24,558][15401] Updated weights for policy 0, policy_version 218440 (0.0034) [2024-06-22 13:27:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 3579052032. Throughput: 0: 42373.0. Samples: 3579195660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:28,398][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 13:27:28,751][15401] Updated weights for policy 0, policy_version 218450 (0.0030) [2024-06-22 13:27:32,247][15401] Updated weights for policy 0, policy_version 218460 (0.0036) [2024-06-22 13:27:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.8, 300 sec: 42431.4). Total num frames: 3579297792. Throughput: 0: 42434.2. Samples: 3579448280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:33,401][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 13:27:36,229][15401] Updated weights for policy 0, policy_version 218470 (0.0031) [2024-06-22 13:27:38,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 3579494400. Throughput: 0: 42494.7. Samples: 3579583340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:38,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 13:27:39,861][15401] Updated weights for policy 0, policy_version 218480 (0.0038) [2024-06-22 13:27:43,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.6, 300 sec: 42376.3). Total num frames: 3579707392. Throughput: 0: 42396.6. Samples: 3579836780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 13:27:43,455][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218489_3579723776.pth... [2024-06-22 13:27:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000217866_3569516544.pth [2024-06-22 13:27:43,686][15401] Updated weights for policy 0, policy_version 218490 (0.0030) [2024-06-22 13:27:47,668][15401] Updated weights for policy 0, policy_version 218500 (0.0035) [2024-06-22 13:27:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 3579936768. Throughput: 0: 42666.7. Samples: 3580097560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:48,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 13:27:51,736][15401] Updated weights for policy 0, policy_version 218510 (0.0032) [2024-06-22 13:27:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 3580149760. Throughput: 0: 42684.9. Samples: 3580228480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 13:27:55,074][15401] Updated weights for policy 0, policy_version 218520 (0.0033) [2024-06-22 13:27:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 3580362752. Throughput: 0: 42776.9. Samples: 3580484360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:27:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 13:27:59,195][15401] Updated weights for policy 0, policy_version 218530 (0.0039) [2024-06-22 13:28:02,650][15401] Updated weights for policy 0, policy_version 218540 (0.0048) [2024-06-22 13:28:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3580575744. Throughput: 0: 42785.8. Samples: 3580738620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:03,396][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 13:28:06,678][15401] Updated weights for policy 0, policy_version 218550 (0.0036) [2024-06-22 13:28:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 3580788736. Throughput: 0: 42832.4. Samples: 3580871800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 13:28:10,252][15401] Updated weights for policy 0, policy_version 218560 (0.0045) [2024-06-22 13:28:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 3581001728. Throughput: 0: 42900.0. Samples: 3581126160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 13:28:14,142][15401] Updated weights for policy 0, policy_version 218570 (0.0035) [2024-06-22 13:28:17,873][15401] Updated weights for policy 0, policy_version 218580 (0.0042) [2024-06-22 13:28:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 3581231104. Throughput: 0: 43070.8. Samples: 3581386360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 13:28:22,025][15401] Updated weights for policy 0, policy_version 218590 (0.0037) [2024-06-22 13:28:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 3581411328. Throughput: 0: 42967.6. Samples: 3581516780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 13:28:25,754][15401] Updated weights for policy 0, policy_version 218600 (0.0031) [2024-06-22 13:28:28,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42430.8). Total num frames: 3581640704. Throughput: 0: 42896.5. Samples: 3581767400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:28,396][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 13:28:29,694][15401] Updated weights for policy 0, policy_version 218610 (0.0030) [2024-06-22 13:28:33,295][15401] Updated weights for policy 0, policy_version 218620 (0.0026) [2024-06-22 13:28:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 3581870080. Throughput: 0: 42924.7. Samples: 3582029280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:33,393][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 13:28:36,986][15349] Signal inference workers to stop experience collection... (52900 times) [2024-06-22 13:28:37,027][15401] InferenceWorker_p0-w0: stopping experience collection (52900 times) [2024-06-22 13:28:37,039][15349] Signal inference workers to resume experience collection... (52900 times) [2024-06-22 13:28:37,045][15401] InferenceWorker_p0-w0: resuming experience collection (52900 times) [2024-06-22 13:28:37,186][15401] Updated weights for policy 0, policy_version 218630 (0.0037) [2024-06-22 13:28:38,392][15132] Fps is (10 sec: 42615.3, 60 sec: 42871.4, 300 sec: 42542.5). Total num frames: 3582066688. Throughput: 0: 42980.4. Samples: 3582162700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:38,393][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 13:28:40,719][15401] Updated weights for policy 0, policy_version 218640 (0.0031) [2024-06-22 13:28:43,394][15132] Fps is (10 sec: 42590.8, 60 sec: 43141.4, 300 sec: 42597.8). Total num frames: 3582296064. Throughput: 0: 42737.9. Samples: 3582407740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:43,394][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 13:28:44,974][15401] Updated weights for policy 0, policy_version 218650 (0.0024) [2024-06-22 13:28:48,326][15401] Updated weights for policy 0, policy_version 218660 (0.0027) [2024-06-22 13:28:48,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3582525440. Throughput: 0: 42846.7. Samples: 3582666720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 13:28:52,500][15401] Updated weights for policy 0, policy_version 218670 (0.0028) [2024-06-22 13:28:53,391][15132] Fps is (10 sec: 40971.5, 60 sec: 42597.4, 300 sec: 42542.7). Total num frames: 3582705664. Throughput: 0: 42777.4. Samples: 3582796840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:53,392][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 13:28:55,888][15401] Updated weights for policy 0, policy_version 218680 (0.0045) [2024-06-22 13:28:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.7). Total num frames: 3582951424. Throughput: 0: 42848.5. Samples: 3583054340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:28:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 13:28:59,996][15401] Updated weights for policy 0, policy_version 218690 (0.0042) [2024-06-22 13:29:03,389][15132] Fps is (10 sec: 44243.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 3583148032. Throughput: 0: 42864.0. Samples: 3583315240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:29:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 13:29:03,631][15401] Updated weights for policy 0, policy_version 218700 (0.0035) [2024-06-22 13:29:07,879][15401] Updated weights for policy 0, policy_version 218710 (0.0032) [2024-06-22 13:29:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3583361024. Throughput: 0: 42717.2. Samples: 3583439060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 13:29:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 13:29:11,096][15401] Updated weights for policy 0, policy_version 218720 (0.0026) [2024-06-22 13:29:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 3583590400. Throughput: 0: 42850.4. Samples: 3583695500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:13,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 13:29:15,394][15401] Updated weights for policy 0, policy_version 218730 (0.0031) [2024-06-22 13:29:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3583787008. Throughput: 0: 42736.5. Samples: 3583952320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 13:29:19,177][15401] Updated weights for policy 0, policy_version 218740 (0.0040) [2024-06-22 13:29:23,207][15401] Updated weights for policy 0, policy_version 218750 (0.0031) [2024-06-22 13:29:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3584000000. Throughput: 0: 42541.7. Samples: 3584076980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 13:29:26,725][15401] Updated weights for policy 0, policy_version 218760 (0.0026) [2024-06-22 13:29:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43422.1, 300 sec: 42709.5). Total num frames: 3584245760. Throughput: 0: 42898.1. Samples: 3584337980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 13:29:30,758][15401] Updated weights for policy 0, policy_version 218770 (0.0042) [2024-06-22 13:29:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 3584425984. Throughput: 0: 42818.2. Samples: 3584593540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 13:29:34,688][15401] Updated weights for policy 0, policy_version 218780 (0.0033) [2024-06-22 13:29:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 3584638976. Throughput: 0: 42541.7. Samples: 3584711160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 13:29:38,692][15401] Updated weights for policy 0, policy_version 218790 (0.0034) [2024-06-22 13:29:42,335][15401] Updated weights for policy 0, policy_version 218800 (0.0033) [2024-06-22 13:29:42,358][15349] Signal inference workers to stop experience collection... (52950 times) [2024-06-22 13:29:42,358][15349] Signal inference workers to resume experience collection... (52950 times) [2024-06-22 13:29:42,399][15401] InferenceWorker_p0-w0: stopping experience collection (52950 times) [2024-06-22 13:29:42,399][15401] InferenceWorker_p0-w0: resuming experience collection (52950 times) [2024-06-22 13:29:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42874.4, 300 sec: 42598.4). Total num frames: 3584868352. Throughput: 0: 42599.0. Samples: 3584971300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 13:29:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218803_3584868352.pth... [2024-06-22 13:29:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218178_3574628352.pth [2024-06-22 13:29:46,200][15401] Updated weights for policy 0, policy_version 218810 (0.0028) [2024-06-22 13:29:48,390][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 3585032192. Throughput: 0: 42609.7. Samples: 3585232680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 13:29:49,977][15401] Updated weights for policy 0, policy_version 218820 (0.0030) [2024-06-22 13:29:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42872.5, 300 sec: 42598.4). Total num frames: 3585277952. Throughput: 0: 42432.0. Samples: 3585348500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 13:29:53,826][15401] Updated weights for policy 0, policy_version 218830 (0.0038) [2024-06-22 13:29:57,486][15401] Updated weights for policy 0, policy_version 218840 (0.0035) [2024-06-22 13:29:58,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3585507328. Throughput: 0: 42528.2. Samples: 3585609160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:29:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 13:30:01,415][15401] Updated weights for policy 0, policy_version 218850 (0.0035) [2024-06-22 13:30:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 3585671168. Throughput: 0: 42652.5. Samples: 3585871680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:30:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 13:30:05,183][15401] Updated weights for policy 0, policy_version 218860 (0.0031) [2024-06-22 13:30:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3585916928. Throughput: 0: 42482.3. Samples: 3585988680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:30:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 13:30:08,917][15401] Updated weights for policy 0, policy_version 218870 (0.0046) [2024-06-22 13:30:12,938][15401] Updated weights for policy 0, policy_version 218880 (0.0026) [2024-06-22 13:30:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3586129920. Throughput: 0: 42529.2. Samples: 3586251780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:30:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 13:30:16,773][15401] Updated weights for policy 0, policy_version 218890 (0.0034) [2024-06-22 13:30:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 3586310144. Throughput: 0: 42640.8. Samples: 3586512380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 13:30:20,777][15401] Updated weights for policy 0, policy_version 218900 (0.0034) [2024-06-22 13:30:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3586555904. Throughput: 0: 42636.1. Samples: 3586629780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 13:30:24,721][15401] Updated weights for policy 0, policy_version 218910 (0.0032) [2024-06-22 13:30:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.4, 300 sec: 42653.9). Total num frames: 3586752512. Throughput: 0: 42697.1. Samples: 3586892660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 13:30:28,571][15401] Updated weights for policy 0, policy_version 218920 (0.0035) [2024-06-22 13:30:32,353][15401] Updated weights for policy 0, policy_version 218930 (0.0040) [2024-06-22 13:30:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3586965504. Throughput: 0: 42486.5. Samples: 3587144580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 13:30:36,285][15401] Updated weights for policy 0, policy_version 218940 (0.0031) [2024-06-22 13:30:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3587211264. Throughput: 0: 42747.1. Samples: 3587272120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 13:30:39,804][15401] Updated weights for policy 0, policy_version 218950 (0.0045) [2024-06-22 13:30:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3587407872. Throughput: 0: 42756.8. Samples: 3587533220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 13:30:43,806][15401] Updated weights for policy 0, policy_version 218960 (0.0030) [2024-06-22 13:30:47,570][15401] Updated weights for policy 0, policy_version 218970 (0.0028) [2024-06-22 13:30:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.7). Total num frames: 3587620864. Throughput: 0: 42647.5. Samples: 3587790820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:48,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 13:30:51,333][15401] Updated weights for policy 0, policy_version 218980 (0.0027) [2024-06-22 13:30:53,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 3587850240. Throughput: 0: 42914.4. Samples: 3587919840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:53,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 13:30:55,095][15401] Updated weights for policy 0, policy_version 218990 (0.0031) [2024-06-22 13:30:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3588046848. Throughput: 0: 42892.8. Samples: 3588181960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:30:58,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 13:30:58,932][15401] Updated weights for policy 0, policy_version 219000 (0.0031) [2024-06-22 13:31:03,303][15401] Updated weights for policy 0, policy_version 219010 (0.0038) [2024-06-22 13:31:03,390][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3588259840. Throughput: 0: 42801.8. Samples: 3588438460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:31:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 13:31:06,674][15401] Updated weights for policy 0, policy_version 219020 (0.0037) [2024-06-22 13:31:08,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 3588505600. Throughput: 0: 42941.7. Samples: 3588562260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:31:08,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 13:31:09,441][15349] Signal inference workers to stop experience collection... (53000 times) [2024-06-22 13:31:09,441][15349] Signal inference workers to resume experience collection... (53000 times) [2024-06-22 13:31:09,475][15401] InferenceWorker_p0-w0: stopping experience collection (53000 times) [2024-06-22 13:31:09,475][15401] InferenceWorker_p0-w0: resuming experience collection (53000 times) [2024-06-22 13:31:10,822][15401] Updated weights for policy 0, policy_version 219030 (0.0042) [2024-06-22 13:31:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 3588685824. Throughput: 0: 42876.7. Samples: 3588822120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:31:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 13:31:14,302][15401] Updated weights for policy 0, policy_version 219040 (0.0036) [2024-06-22 13:31:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 3588898816. Throughput: 0: 43003.3. Samples: 3589079720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:31:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 13:31:18,431][15401] Updated weights for policy 0, policy_version 219050 (0.0038) [2024-06-22 13:31:22,251][15401] Updated weights for policy 0, policy_version 219060 (0.0030) [2024-06-22 13:31:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3589128192. Throughput: 0: 42892.0. Samples: 3589202260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-22 13:31:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 13:31:25,944][15401] Updated weights for policy 0, policy_version 219070 (0.0036) [2024-06-22 13:31:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3589324800. Throughput: 0: 42888.9. Samples: 3589463220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 13:31:29,751][15401] Updated weights for policy 0, policy_version 219080 (0.0044) [2024-06-22 13:31:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 3589537792. Throughput: 0: 42757.4. Samples: 3589714900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:33,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 13:31:33,899][15401] Updated weights for policy 0, policy_version 219090 (0.0036) [2024-06-22 13:31:37,242][15401] Updated weights for policy 0, policy_version 219100 (0.0027) [2024-06-22 13:31:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3589783552. Throughput: 0: 42873.0. Samples: 3589849120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:38,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 13:31:41,447][15401] Updated weights for policy 0, policy_version 219110 (0.0029) [2024-06-22 13:31:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 3589980160. Throughput: 0: 42758.1. Samples: 3590106180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:43,393][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 13:31:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219115_3589980160.pth... [2024-06-22 13:31:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218489_3579723776.pth [2024-06-22 13:31:44,873][15401] Updated weights for policy 0, policy_version 219120 (0.0042) [2024-06-22 13:31:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 3590193152. Throughput: 0: 42727.7. Samples: 3590361200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 13:31:48,965][15401] Updated weights for policy 0, policy_version 219130 (0.0035) [2024-06-22 13:31:52,336][15401] Updated weights for policy 0, policy_version 219140 (0.0034) [2024-06-22 13:31:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3590422528. Throughput: 0: 42910.3. Samples: 3590493120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 13:31:56,956][15401] Updated weights for policy 0, policy_version 219150 (0.0034) [2024-06-22 13:31:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3590619136. Throughput: 0: 42914.7. Samples: 3590753280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:31:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 13:32:00,244][15401] Updated weights for policy 0, policy_version 219160 (0.0023) [2024-06-22 13:32:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3590832128. Throughput: 0: 42891.1. Samples: 3591009820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 13:32:04,452][15401] Updated weights for policy 0, policy_version 219170 (0.0042) [2024-06-22 13:32:07,695][15401] Updated weights for policy 0, policy_version 219180 (0.0036) [2024-06-22 13:32:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 3591077888. Throughput: 0: 43024.0. Samples: 3591138340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 13:32:12,354][15401] Updated weights for policy 0, policy_version 219190 (0.0038) [2024-06-22 13:32:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3591274496. Throughput: 0: 42902.1. Samples: 3591393820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 13:32:15,522][15401] Updated weights for policy 0, policy_version 219200 (0.0027) [2024-06-22 13:32:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3591471104. Throughput: 0: 42887.5. Samples: 3591644840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 13:32:19,881][15401] Updated weights for policy 0, policy_version 219210 (0.0037) [2024-06-22 13:32:23,114][15401] Updated weights for policy 0, policy_version 219220 (0.0042) [2024-06-22 13:32:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3591716864. Throughput: 0: 42777.0. Samples: 3591774080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 13:32:27,372][15401] Updated weights for policy 0, policy_version 219230 (0.0029) [2024-06-22 13:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3591897088. Throughput: 0: 42853.9. Samples: 3592034500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 13:32:29,154][15349] Signal inference workers to stop experience collection... (53050 times) [2024-06-22 13:32:29,154][15349] Signal inference workers to resume experience collection... (53050 times) [2024-06-22 13:32:29,177][15401] InferenceWorker_p0-w0: stopping experience collection (53050 times) [2024-06-22 13:32:29,177][15401] InferenceWorker_p0-w0: resuming experience collection (53050 times) [2024-06-22 13:32:30,658][15401] Updated weights for policy 0, policy_version 219240 (0.0036) [2024-06-22 13:32:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 3592110080. Throughput: 0: 42817.2. Samples: 3592287980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:32:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 13:32:34,911][15401] Updated weights for policy 0, policy_version 219250 (0.0034) [2024-06-22 13:32:38,167][15401] Updated weights for policy 0, policy_version 219260 (0.0044) [2024-06-22 13:32:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3592355840. Throughput: 0: 42677.8. Samples: 3592413620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:32:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 13:32:42,864][15401] Updated weights for policy 0, policy_version 219270 (0.0039) [2024-06-22 13:32:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3592552448. Throughput: 0: 42626.2. Samples: 3592671460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:32:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 13:32:46,147][15401] Updated weights for policy 0, policy_version 219280 (0.0028) [2024-06-22 13:32:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3592749056. Throughput: 0: 42684.9. Samples: 3592930640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:32:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 13:32:50,390][15401] Updated weights for policy 0, policy_version 219290 (0.0024) [2024-06-22 13:32:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3592962048. Throughput: 0: 42562.4. Samples: 3593053640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:32:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 13:32:53,744][15401] Updated weights for policy 0, policy_version 219300 (0.0047) [2024-06-22 13:32:57,900][15401] Updated weights for policy 0, policy_version 219310 (0.0039) [2024-06-22 13:32:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3593191424. Throughput: 0: 42650.7. Samples: 3593313100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:32:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 13:33:01,306][15401] Updated weights for policy 0, policy_version 219320 (0.0035) [2024-06-22 13:33:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3593388032. Throughput: 0: 42859.1. Samples: 3593573500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 13:33:05,526][15401] Updated weights for policy 0, policy_version 219330 (0.0029) [2024-06-22 13:33:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3593617408. Throughput: 0: 42750.6. Samples: 3593697860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 13:33:09,022][15401] Updated weights for policy 0, policy_version 219340 (0.0024) [2024-06-22 13:33:13,256][15401] Updated weights for policy 0, policy_version 219350 (0.0036) [2024-06-22 13:33:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3593846784. Throughput: 0: 42814.5. Samples: 3593961160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:13,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 13:33:16,830][15401] Updated weights for policy 0, policy_version 219360 (0.0028) [2024-06-22 13:33:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3594043392. Throughput: 0: 42722.6. Samples: 3594210500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 13:33:20,823][15401] Updated weights for policy 0, policy_version 219370 (0.0027) [2024-06-22 13:33:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42766.0). Total num frames: 3594256384. Throughput: 0: 42752.5. Samples: 3594337480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 13:33:24,694][15401] Updated weights for policy 0, policy_version 219380 (0.0029) [2024-06-22 13:33:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 3594469376. Throughput: 0: 42739.6. Samples: 3594594740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 13:33:28,678][15401] Updated weights for policy 0, policy_version 219390 (0.0027) [2024-06-22 13:33:32,331][15401] Updated weights for policy 0, policy_version 219400 (0.0028) [2024-06-22 13:33:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 3594698752. Throughput: 0: 42594.2. Samples: 3594847380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 13:33:36,357][15401] Updated weights for policy 0, policy_version 219410 (0.0030) [2024-06-22 13:33:37,861][15349] Signal inference workers to stop experience collection... (53100 times) [2024-06-22 13:33:37,862][15349] Signal inference workers to resume experience collection... (53100 times) [2024-06-22 13:33:37,911][15401] InferenceWorker_p0-w0: stopping experience collection (53100 times) [2024-06-22 13:33:37,911][15401] InferenceWorker_p0-w0: resuming experience collection (53100 times) [2024-06-22 13:33:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 3594911744. Throughput: 0: 42792.8. Samples: 3594979320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 13:33:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 13:33:39,927][15401] Updated weights for policy 0, policy_version 219420 (0.0039) [2024-06-22 13:33:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 3595108352. Throughput: 0: 42683.1. Samples: 3595233940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:33:43,393][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 13:33:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219428_3595108352.pth... [2024-06-22 13:33:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000218803_3584868352.pth [2024-06-22 13:33:44,147][15401] Updated weights for policy 0, policy_version 219430 (0.0037) [2024-06-22 13:33:47,708][15401] Updated weights for policy 0, policy_version 219440 (0.0035) [2024-06-22 13:33:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42764.9). Total num frames: 3595321344. Throughput: 0: 42572.4. Samples: 3595489360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:33:48,393][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 13:33:51,692][15401] Updated weights for policy 0, policy_version 219450 (0.0037) [2024-06-22 13:33:53,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3595550720. Throughput: 0: 42784.9. Samples: 3595623180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:33:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 13:33:55,577][15401] Updated weights for policy 0, policy_version 219460 (0.0031) [2024-06-22 13:33:58,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3595763712. Throughput: 0: 42625.1. Samples: 3595879280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:33:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 13:33:59,097][15401] Updated weights for policy 0, policy_version 219470 (0.0032) [2024-06-22 13:34:03,054][15401] Updated weights for policy 0, policy_version 219480 (0.0034) [2024-06-22 13:34:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3595976704. Throughput: 0: 42963.7. Samples: 3596143860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 13:34:06,565][15401] Updated weights for policy 0, policy_version 219490 (0.0051) [2024-06-22 13:34:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3596189696. Throughput: 0: 42982.2. Samples: 3596271680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 13:34:10,461][15401] Updated weights for policy 0, policy_version 219500 (0.0040) [2024-06-22 13:34:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3596402688. Throughput: 0: 43140.9. Samples: 3596536080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 13:34:14,024][15401] Updated weights for policy 0, policy_version 219510 (0.0031) [2024-06-22 13:34:18,199][15401] Updated weights for policy 0, policy_version 219520 (0.0037) [2024-06-22 13:34:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3596615680. Throughput: 0: 43083.5. Samples: 3596786140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 13:34:21,699][15401] Updated weights for policy 0, policy_version 219530 (0.0023) [2024-06-22 13:34:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3596828672. Throughput: 0: 42998.3. Samples: 3596914240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 13:34:25,753][15401] Updated weights for policy 0, policy_version 219540 (0.0038) [2024-06-22 13:34:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3597041664. Throughput: 0: 43242.4. Samples: 3597179740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 13:34:29,078][15401] Updated weights for policy 0, policy_version 219550 (0.0037) [2024-06-22 13:34:33,378][15401] Updated weights for policy 0, policy_version 219560 (0.0048) [2024-06-22 13:34:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3597271040. Throughput: 0: 43341.9. Samples: 3597439640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 13:34:36,534][15401] Updated weights for policy 0, policy_version 219570 (0.0027) [2024-06-22 13:34:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3597484032. Throughput: 0: 43191.6. Samples: 3597566800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 13:34:40,919][15401] Updated weights for policy 0, policy_version 219580 (0.0038) [2024-06-22 13:34:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43417.6, 300 sec: 42986.8). Total num frames: 3597713408. Throughput: 0: 43258.1. Samples: 3597826000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:43,392][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 13:34:44,156][15401] Updated weights for policy 0, policy_version 219590 (0.0039) [2024-06-22 13:34:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3597893632. Throughput: 0: 43132.4. Samples: 3598084820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 13:34:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 13:34:48,692][15401] Updated weights for policy 0, policy_version 219600 (0.0028) [2024-06-22 13:34:51,778][15401] Updated weights for policy 0, policy_version 219610 (0.0033) [2024-06-22 13:34:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3598139392. Throughput: 0: 42980.9. Samples: 3598205820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:34:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 13:34:56,354][15401] Updated weights for policy 0, policy_version 219620 (0.0037) [2024-06-22 13:34:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3598352384. Throughput: 0: 42921.9. Samples: 3598467560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:34:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 13:34:59,005][15349] Signal inference workers to stop experience collection... (53150 times) [2024-06-22 13:34:59,006][15349] Signal inference workers to resume experience collection... (53150 times) [2024-06-22 13:34:59,016][15401] InferenceWorker_p0-w0: stopping experience collection (53150 times) [2024-06-22 13:34:59,016][15401] InferenceWorker_p0-w0: resuming experience collection (53150 times) [2024-06-22 13:34:59,462][15401] Updated weights for policy 0, policy_version 219630 (0.0024) [2024-06-22 13:35:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3598532608. Throughput: 0: 43178.3. Samples: 3598729160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 13:35:03,927][15401] Updated weights for policy 0, policy_version 219640 (0.0041) [2024-06-22 13:35:07,306][15401] Updated weights for policy 0, policy_version 219650 (0.0025) [2024-06-22 13:35:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3598794752. Throughput: 0: 43025.8. Samples: 3598850400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 13:35:11,611][15401] Updated weights for policy 0, policy_version 219660 (0.0028) [2024-06-22 13:35:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3598991360. Throughput: 0: 42978.8. Samples: 3599113780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 13:35:15,118][15401] Updated weights for policy 0, policy_version 219670 (0.0030) [2024-06-22 13:35:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3599187968. Throughput: 0: 42972.3. Samples: 3599373400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 13:35:19,173][15401] Updated weights for policy 0, policy_version 219680 (0.0036) [2024-06-22 13:35:22,839][15401] Updated weights for policy 0, policy_version 219690 (0.0033) [2024-06-22 13:35:23,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 3599450112. Throughput: 0: 42897.8. Samples: 3599497200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 13:35:26,762][15401] Updated weights for policy 0, policy_version 219700 (0.0030) [2024-06-22 13:35:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 3599646720. Throughput: 0: 42892.9. Samples: 3599756080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 13:35:30,672][15401] Updated weights for policy 0, policy_version 219710 (0.0032) [2024-06-22 13:35:33,389][15132] Fps is (10 sec: 36044.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3599810560. Throughput: 0: 42856.4. Samples: 3600013360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:33,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-22 13:35:34,354][15401] Updated weights for policy 0, policy_version 219720 (0.0047) [2024-06-22 13:35:38,236][15401] Updated weights for policy 0, policy_version 219730 (0.0032) [2024-06-22 13:35:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3600056320. Throughput: 0: 42885.8. Samples: 3600135680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 13:35:42,536][15401] Updated weights for policy 0, policy_version 219740 (0.0039) [2024-06-22 13:35:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 3600269312. Throughput: 0: 42827.5. Samples: 3600394800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 13:35:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219744_3600285696.pth... [2024-06-22 13:35:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219115_3589980160.pth [2024-06-22 13:35:45,770][15401] Updated weights for policy 0, policy_version 219750 (0.0022) [2024-06-22 13:35:48,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 3600465920. Throughput: 0: 42522.4. Samples: 3600642940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:48,396][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 13:35:50,028][15401] Updated weights for policy 0, policy_version 219760 (0.0038) [2024-06-22 13:35:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3600695296. Throughput: 0: 42709.3. Samples: 3600772320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 13:35:53,422][15401] Updated weights for policy 0, policy_version 219770 (0.0036) [2024-06-22 13:35:57,593][15401] Updated weights for policy 0, policy_version 219780 (0.0032) [2024-06-22 13:35:58,392][15132] Fps is (10 sec: 42615.5, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 3600891904. Throughput: 0: 42675.4. Samples: 3601034280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-22 13:35:58,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 13:36:01,270][15401] Updated weights for policy 0, policy_version 219790 (0.0028) [2024-06-22 13:36:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 3601121280. Throughput: 0: 42446.6. Samples: 3601283500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 13:36:05,160][15401] Updated weights for policy 0, policy_version 219800 (0.0034) [2024-06-22 13:36:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3601334272. Throughput: 0: 42611.6. Samples: 3601414720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 13:36:08,792][15401] Updated weights for policy 0, policy_version 219810 (0.0028) [2024-06-22 13:36:12,798][15401] Updated weights for policy 0, policy_version 219820 (0.0034) [2024-06-22 13:36:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3601547264. Throughput: 0: 42807.3. Samples: 3601682400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 13:36:13,457][15349] Signal inference workers to stop experience collection... (53200 times) [2024-06-22 13:36:13,499][15401] InferenceWorker_p0-w0: stopping experience collection (53200 times) [2024-06-22 13:36:13,509][15349] Signal inference workers to resume experience collection... (53200 times) [2024-06-22 13:36:13,516][15401] InferenceWorker_p0-w0: resuming experience collection (53200 times) [2024-06-22 13:36:16,187][15401] Updated weights for policy 0, policy_version 219830 (0.0039) [2024-06-22 13:36:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3601760256. Throughput: 0: 42625.4. Samples: 3601931500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 13:36:20,363][15401] Updated weights for policy 0, policy_version 219840 (0.0027) [2024-06-22 13:36:23,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42323.6, 300 sec: 42931.3). Total num frames: 3601989632. Throughput: 0: 42742.6. Samples: 3602059200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:23,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 13:36:23,725][15401] Updated weights for policy 0, policy_version 219850 (0.0049) [2024-06-22 13:36:27,824][15401] Updated weights for policy 0, policy_version 219860 (0.0034) [2024-06-22 13:36:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3602202624. Throughput: 0: 42811.0. Samples: 3602321300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 13:36:31,523][15401] Updated weights for policy 0, policy_version 219870 (0.0032) [2024-06-22 13:36:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3602399232. Throughput: 0: 43017.3. Samples: 3602578440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:33,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 13:36:35,395][15401] Updated weights for policy 0, policy_version 219880 (0.0036) [2024-06-22 13:36:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 3602628608. Throughput: 0: 42855.1. Samples: 3602700800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:38,396][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 13:36:39,136][15401] Updated weights for policy 0, policy_version 219890 (0.0033) [2024-06-22 13:36:42,982][15401] Updated weights for policy 0, policy_version 219900 (0.0023) [2024-06-22 13:36:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3602841600. Throughput: 0: 42917.9. Samples: 3602965480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 13:36:46,824][15401] Updated weights for policy 0, policy_version 219910 (0.0027) [2024-06-22 13:36:48,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42871.5, 300 sec: 42764.1). Total num frames: 3603038208. Throughput: 0: 43010.9. Samples: 3603219260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:48,396][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 13:36:50,614][15401] Updated weights for policy 0, policy_version 219920 (0.0034) [2024-06-22 13:36:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3603267584. Throughput: 0: 42870.6. Samples: 3603343900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 13:36:54,144][15401] Updated weights for policy 0, policy_version 219930 (0.0025) [2024-06-22 13:36:58,273][15401] Updated weights for policy 0, policy_version 219940 (0.0031) [2024-06-22 13:36:58,390][15132] Fps is (10 sec: 45903.9, 60 sec: 43419.2, 300 sec: 42931.6). Total num frames: 3603496960. Throughput: 0: 42854.5. Samples: 3603610860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:36:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 13:37:01,785][15401] Updated weights for policy 0, policy_version 219950 (0.0032) [2024-06-22 13:37:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3603677184. Throughput: 0: 42919.6. Samples: 3603862880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 13:37:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 13:37:06,023][15401] Updated weights for policy 0, policy_version 219960 (0.0037) [2024-06-22 13:37:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3603922944. Throughput: 0: 42781.0. Samples: 3603984240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 13:37:09,722][15401] Updated weights for policy 0, policy_version 219970 (0.0042) [2024-06-22 13:37:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3604103168. Throughput: 0: 42834.0. Samples: 3604248820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 13:37:13,962][15401] Updated weights for policy 0, policy_version 219980 (0.0029) [2024-06-22 13:37:17,138][15401] Updated weights for policy 0, policy_version 219990 (0.0041) [2024-06-22 13:37:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3604332544. Throughput: 0: 42717.7. Samples: 3604500740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:18,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 13:37:21,696][15401] Updated weights for policy 0, policy_version 220000 (0.0034) [2024-06-22 13:37:23,086][15349] Signal inference workers to stop experience collection... (53250 times) [2024-06-22 13:37:23,086][15349] Signal inference workers to resume experience collection... (53250 times) [2024-06-22 13:37:23,108][15401] InferenceWorker_p0-w0: stopping experience collection (53250 times) [2024-06-22 13:37:23,134][15401] InferenceWorker_p0-w0: resuming experience collection (53250 times) [2024-06-22 13:37:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 3604561920. Throughput: 0: 42903.9. Samples: 3604631480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 13:37:25,287][15401] Updated weights for policy 0, policy_version 220010 (0.0048) [2024-06-22 13:37:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3604725760. Throughput: 0: 42693.1. Samples: 3604886680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 13:37:29,337][15401] Updated weights for policy 0, policy_version 220020 (0.0038) [2024-06-22 13:37:32,907][15401] Updated weights for policy 0, policy_version 220030 (0.0025) [2024-06-22 13:37:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3604971520. Throughput: 0: 42650.0. Samples: 3605138240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 13:37:36,918][15401] Updated weights for policy 0, policy_version 220040 (0.0041) [2024-06-22 13:37:38,390][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3605200896. Throughput: 0: 42902.7. Samples: 3605274520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 13:37:40,535][15401] Updated weights for policy 0, policy_version 220050 (0.0035) [2024-06-22 13:37:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3605381120. Throughput: 0: 42604.2. Samples: 3605528040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 13:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220055_3605381120.pth... [2024-06-22 13:37:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219428_3595108352.pth [2024-06-22 13:37:44,500][15401] Updated weights for policy 0, policy_version 220060 (0.0032) [2024-06-22 13:37:48,083][15401] Updated weights for policy 0, policy_version 220070 (0.0034) [2024-06-22 13:37:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43149.0, 300 sec: 42931.6). Total num frames: 3605626880. Throughput: 0: 42628.3. Samples: 3605781160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 13:37:52,053][15401] Updated weights for policy 0, policy_version 220080 (0.0027) [2024-06-22 13:37:53,391][15132] Fps is (10 sec: 44229.1, 60 sec: 42597.2, 300 sec: 42820.3). Total num frames: 3605823488. Throughput: 0: 43097.0. Samples: 3605923680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:53,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 13:37:55,720][15401] Updated weights for policy 0, policy_version 220090 (0.0036) [2024-06-22 13:37:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 3606020096. Throughput: 0: 42813.1. Samples: 3606175420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:37:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 13:37:59,566][15401] Updated weights for policy 0, policy_version 220100 (0.0039) [2024-06-22 13:38:03,390][15132] Fps is (10 sec: 44244.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3606265856. Throughput: 0: 42852.9. Samples: 3606429120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:38:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 13:38:03,641][15401] Updated weights for policy 0, policy_version 220110 (0.0045) [2024-06-22 13:38:07,170][15401] Updated weights for policy 0, policy_version 220120 (0.0046) [2024-06-22 13:38:08,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3606462464. Throughput: 0: 42980.1. Samples: 3606565580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:38:08,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 13:38:11,430][15401] Updated weights for policy 0, policy_version 220130 (0.0033) [2024-06-22 13:38:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 3606675456. Throughput: 0: 42849.0. Samples: 3606814880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-22 13:38:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 13:38:14,882][15401] Updated weights for policy 0, policy_version 220140 (0.0027) [2024-06-22 13:38:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3606904832. Throughput: 0: 42809.3. Samples: 3607064660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 13:38:19,140][15401] Updated weights for policy 0, policy_version 220150 (0.0038) [2024-06-22 13:38:22,630][15401] Updated weights for policy 0, policy_version 220160 (0.0025) [2024-06-22 13:38:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3607117824. Throughput: 0: 42756.5. Samples: 3607198560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 13:38:26,741][15401] Updated weights for policy 0, policy_version 220170 (0.0039) [2024-06-22 13:38:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 3607330816. Throughput: 0: 42783.9. Samples: 3607453320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 13:38:30,354][15401] Updated weights for policy 0, policy_version 220180 (0.0041) [2024-06-22 13:38:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3607560192. Throughput: 0: 42722.3. Samples: 3607703660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:33,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 13:38:34,342][15401] Updated weights for policy 0, policy_version 220190 (0.0043) [2024-06-22 13:38:35,636][15349] Signal inference workers to stop experience collection... (53300 times) [2024-06-22 13:38:35,637][15349] Signal inference workers to resume experience collection... (53300 times) [2024-06-22 13:38:35,660][15401] InferenceWorker_p0-w0: stopping experience collection (53300 times) [2024-06-22 13:38:35,660][15401] InferenceWorker_p0-w0: resuming experience collection (53300 times) [2024-06-22 13:38:38,199][15401] Updated weights for policy 0, policy_version 220200 (0.0040) [2024-06-22 13:38:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 3607756800. Throughput: 0: 42485.1. Samples: 3607835440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:38,400][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 13:38:41,799][15401] Updated weights for policy 0, policy_version 220210 (0.0040) [2024-06-22 13:38:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 3607969792. Throughput: 0: 42642.8. Samples: 3608094340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 13:38:45,627][15401] Updated weights for policy 0, policy_version 220220 (0.0036) [2024-06-22 13:38:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3608199168. Throughput: 0: 42700.6. Samples: 3608350640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 13:38:49,659][15401] Updated weights for policy 0, policy_version 220230 (0.0033) [2024-06-22 13:38:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42872.6, 300 sec: 42820.5). Total num frames: 3608395776. Throughput: 0: 42592.8. Samples: 3608482260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 13:38:53,537][15401] Updated weights for policy 0, policy_version 220240 (0.0032) [2024-06-22 13:38:57,291][15401] Updated weights for policy 0, policy_version 220250 (0.0036) [2024-06-22 13:38:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43416.0, 300 sec: 42875.7). Total num frames: 3608625152. Throughput: 0: 42894.7. Samples: 3608745240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:38:58,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 13:39:00,982][15401] Updated weights for policy 0, policy_version 220260 (0.0040) [2024-06-22 13:39:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3608838144. Throughput: 0: 42922.4. Samples: 3608996160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:39:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 13:39:04,863][15401] Updated weights for policy 0, policy_version 220270 (0.0042) [2024-06-22 13:39:08,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3609051136. Throughput: 0: 42830.1. Samples: 3609125920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:39:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 13:39:08,522][15401] Updated weights for policy 0, policy_version 220280 (0.0032) [2024-06-22 13:39:12,626][15401] Updated weights for policy 0, policy_version 220290 (0.0023) [2024-06-22 13:39:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3609264128. Throughput: 0: 42999.7. Samples: 3609388300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:39:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 13:39:15,971][15401] Updated weights for policy 0, policy_version 220300 (0.0037) [2024-06-22 13:39:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3609477120. Throughput: 0: 43078.7. Samples: 3609642200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-22 13:39:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 13:39:20,182][15401] Updated weights for policy 0, policy_version 220310 (0.0037) [2024-06-22 13:39:23,391][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3609706496. Throughput: 0: 43032.5. Samples: 3609771900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:23,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 13:39:23,652][15401] Updated weights for policy 0, policy_version 220320 (0.0038) [2024-06-22 13:39:27,667][15401] Updated weights for policy 0, policy_version 220330 (0.0030) [2024-06-22 13:39:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3609919488. Throughput: 0: 43121.3. Samples: 3610034800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 13:39:31,249][15401] Updated weights for policy 0, policy_version 220340 (0.0036) [2024-06-22 13:39:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3610116096. Throughput: 0: 43189.7. Samples: 3610294180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:33,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 13:39:35,142][15401] Updated weights for policy 0, policy_version 220350 (0.0041) [2024-06-22 13:39:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3610345472. Throughput: 0: 42999.0. Samples: 3610417220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 13:39:39,313][15401] Updated weights for policy 0, policy_version 220360 (0.0027) [2024-06-22 13:39:42,899][15401] Updated weights for policy 0, policy_version 220370 (0.0027) [2024-06-22 13:39:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3610558464. Throughput: 0: 42910.6. Samples: 3610676120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 13:39:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220371_3610558464.pth... [2024-06-22 13:39:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000219744_3600285696.pth [2024-06-22 13:39:47,114][15401] Updated weights for policy 0, policy_version 220380 (0.0040) [2024-06-22 13:39:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3610755072. Throughput: 0: 43104.4. Samples: 3610935860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 13:39:49,583][15349] Signal inference workers to stop experience collection... (53350 times) [2024-06-22 13:39:49,583][15349] Signal inference workers to resume experience collection... (53350 times) [2024-06-22 13:39:49,598][15401] InferenceWorker_p0-w0: stopping experience collection (53350 times) [2024-06-22 13:39:49,598][15401] InferenceWorker_p0-w0: resuming experience collection (53350 times) [2024-06-22 13:39:50,569][15401] Updated weights for policy 0, policy_version 220390 (0.0024) [2024-06-22 13:39:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3610984448. Throughput: 0: 42950.3. Samples: 3611058680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 13:39:54,750][15401] Updated weights for policy 0, policy_version 220400 (0.0033) [2024-06-22 13:39:58,037][15401] Updated weights for policy 0, policy_version 220410 (0.0036) [2024-06-22 13:39:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.3, 300 sec: 42931.7). Total num frames: 3611197440. Throughput: 0: 43017.0. Samples: 3611324060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:39:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 13:40:02,227][15401] Updated weights for policy 0, policy_version 220420 (0.0028) [2024-06-22 13:40:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3611426816. Throughput: 0: 43020.5. Samples: 3611578120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 13:40:05,689][15401] Updated weights for policy 0, policy_version 220430 (0.0032) [2024-06-22 13:40:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3611607040. Throughput: 0: 42962.7. Samples: 3611705220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 13:40:09,737][15401] Updated weights for policy 0, policy_version 220440 (0.0028) [2024-06-22 13:40:13,357][15401] Updated weights for policy 0, policy_version 220450 (0.0041) [2024-06-22 13:40:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3611852800. Throughput: 0: 42968.8. Samples: 3611968400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 13:40:17,335][15401] Updated weights for policy 0, policy_version 220460 (0.0029) [2024-06-22 13:40:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3612065792. Throughput: 0: 42970.6. Samples: 3612227860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 13:40:20,872][15401] Updated weights for policy 0, policy_version 220470 (0.0027) [2024-06-22 13:40:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3612262400. Throughput: 0: 42941.1. Samples: 3612349560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 13:40:24,942][15401] Updated weights for policy 0, policy_version 220480 (0.0040) [2024-06-22 13:40:28,357][15401] Updated weights for policy 0, policy_version 220490 (0.0025) [2024-06-22 13:40:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3612508160. Throughput: 0: 43025.5. Samples: 3612612260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 13:40:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:40:32,555][15401] Updated weights for policy 0, policy_version 220500 (0.0032) [2024-06-22 13:40:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3612704768. Throughput: 0: 42972.9. Samples: 3612869640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 13:40:36,022][15401] Updated weights for policy 0, policy_version 220510 (0.0029) [2024-06-22 13:40:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3612917760. Throughput: 0: 43035.2. Samples: 3612995260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 13:40:40,238][15401] Updated weights for policy 0, policy_version 220520 (0.0041) [2024-06-22 13:40:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42877.0). Total num frames: 3613114368. Throughput: 0: 42825.2. Samples: 3613251200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 13:40:43,926][15401] Updated weights for policy 0, policy_version 220530 (0.0026) [2024-06-22 13:40:47,783][15401] Updated weights for policy 0, policy_version 220540 (0.0033) [2024-06-22 13:40:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3613343744. Throughput: 0: 42840.5. Samples: 3613505940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 13:40:51,574][15401] Updated weights for policy 0, policy_version 220550 (0.0033) [2024-06-22 13:40:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 3613556736. Throughput: 0: 42918.3. Samples: 3613636540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 13:40:55,283][15401] Updated weights for policy 0, policy_version 220560 (0.0033) [2024-06-22 13:40:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3613753344. Throughput: 0: 42625.8. Samples: 3613886560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:40:58,390][15132] Avg episode reward: [(0, '0.113')] [2024-06-22 13:40:59,152][15401] Updated weights for policy 0, policy_version 220570 (0.0026) [2024-06-22 13:41:03,340][15401] Updated weights for policy 0, policy_version 220580 (0.0027) [2024-06-22 13:41:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3613982720. Throughput: 0: 42685.0. Samples: 3614148680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 13:41:06,268][15349] Signal inference workers to stop experience collection... (53400 times) [2024-06-22 13:41:06,268][15349] Signal inference workers to resume experience collection... (53400 times) [2024-06-22 13:41:06,300][15401] InferenceWorker_p0-w0: stopping experience collection (53400 times) [2024-06-22 13:41:06,300][15401] InferenceWorker_p0-w0: resuming experience collection (53400 times) [2024-06-22 13:41:06,814][15401] Updated weights for policy 0, policy_version 220590 (0.0048) [2024-06-22 13:41:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3614195712. Throughput: 0: 42840.8. Samples: 3614277400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 13:41:11,092][15401] Updated weights for policy 0, policy_version 220600 (0.0050) [2024-06-22 13:41:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3614408704. Throughput: 0: 42562.7. Samples: 3614527580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:13,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 13:41:14,611][15401] Updated weights for policy 0, policy_version 220610 (0.0027) [2024-06-22 13:41:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 3614605312. Throughput: 0: 42663.0. Samples: 3614789480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:18,390][15132] Avg episode reward: [(0, '0.070')] [2024-06-22 13:41:18,663][15401] Updated weights for policy 0, policy_version 220620 (0.0041) [2024-06-22 13:41:22,231][15401] Updated weights for policy 0, policy_version 220630 (0.0035) [2024-06-22 13:41:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3614851072. Throughput: 0: 42636.8. Samples: 3614913920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 13:41:26,438][15401] Updated weights for policy 0, policy_version 220640 (0.0022) [2024-06-22 13:41:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3615047680. Throughput: 0: 42626.7. Samples: 3615169400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 13:41:29,864][15401] Updated weights for policy 0, policy_version 220650 (0.0027) [2024-06-22 13:41:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3615227904. Throughput: 0: 42709.7. Samples: 3615427880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 13:41:34,252][15401] Updated weights for policy 0, policy_version 220660 (0.0036) [2024-06-22 13:41:37,507][15401] Updated weights for policy 0, policy_version 220670 (0.0040) [2024-06-22 13:41:38,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.8, 300 sec: 42875.2). Total num frames: 3615490048. Throughput: 0: 42487.7. Samples: 3615548760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 13:41:38,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 13:41:42,028][15401] Updated weights for policy 0, policy_version 220680 (0.0042) [2024-06-22 13:41:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42877.0). Total num frames: 3615686656. Throughput: 0: 42607.8. Samples: 3615803920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:41:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 13:41:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220684_3615686656.pth... [2024-06-22 13:41:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220055_3605381120.pth [2024-06-22 13:41:44,995][15401] Updated weights for policy 0, policy_version 220690 (0.0036) [2024-06-22 13:41:48,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3615883264. Throughput: 0: 42495.0. Samples: 3616060960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:41:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 13:41:49,740][15401] Updated weights for policy 0, policy_version 220700 (0.0029) [2024-06-22 13:41:52,499][15401] Updated weights for policy 0, policy_version 220710 (0.0035) [2024-06-22 13:41:53,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3616129024. Throughput: 0: 42539.2. Samples: 3616191660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:41:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 13:41:57,493][15401] Updated weights for policy 0, policy_version 220720 (0.0032) [2024-06-22 13:41:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3616309248. Throughput: 0: 42753.7. Samples: 3616451500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:41:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 13:42:00,292][15401] Updated weights for policy 0, policy_version 220730 (0.0050) [2024-06-22 13:42:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 3616522240. Throughput: 0: 42368.0. Samples: 3616696040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 13:42:05,255][15401] Updated weights for policy 0, policy_version 220740 (0.0029) [2024-06-22 13:42:08,156][15401] Updated weights for policy 0, policy_version 220750 (0.0037) [2024-06-22 13:42:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3616768000. Throughput: 0: 42490.8. Samples: 3616826000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 13:42:12,782][15401] Updated weights for policy 0, policy_version 220760 (0.0021) [2024-06-22 13:42:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3616948224. Throughput: 0: 42559.0. Samples: 3617084560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 13:42:15,747][15401] Updated weights for policy 0, policy_version 220770 (0.0036) [2024-06-22 13:42:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3617161216. Throughput: 0: 42532.5. Samples: 3617341840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 13:42:20,466][15401] Updated weights for policy 0, policy_version 220780 (0.0035) [2024-06-22 13:42:22,153][15349] Signal inference workers to stop experience collection... (53450 times) [2024-06-22 13:42:22,154][15349] Signal inference workers to resume experience collection... (53450 times) [2024-06-22 13:42:22,196][15401] InferenceWorker_p0-w0: stopping experience collection (53450 times) [2024-06-22 13:42:22,196][15401] InferenceWorker_p0-w0: resuming experience collection (53450 times) [2024-06-22 13:42:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3617406976. Throughput: 0: 42660.3. Samples: 3617468200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 13:42:23,519][15401] Updated weights for policy 0, policy_version 220790 (0.0033) [2024-06-22 13:42:27,958][15401] Updated weights for policy 0, policy_version 220800 (0.0033) [2024-06-22 13:42:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3617587200. Throughput: 0: 42751.3. Samples: 3617727720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:42:31,090][15401] Updated weights for policy 0, policy_version 220810 (0.0042) [2024-06-22 13:42:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3617800192. Throughput: 0: 42726.7. Samples: 3617983660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 13:42:35,458][15401] Updated weights for policy 0, policy_version 220820 (0.0030) [2024-06-22 13:42:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42603.0, 300 sec: 42931.6). Total num frames: 3618045952. Throughput: 0: 42765.8. Samples: 3618116120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 13:42:38,731][15401] Updated weights for policy 0, policy_version 220830 (0.0027) [2024-06-22 13:42:43,082][15401] Updated weights for policy 0, policy_version 220840 (0.0037) [2024-06-22 13:42:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3618242560. Throughput: 0: 42516.8. Samples: 3618364760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:43,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 13:42:46,696][15401] Updated weights for policy 0, policy_version 220850 (0.0030) [2024-06-22 13:42:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.8). Total num frames: 3618455552. Throughput: 0: 42577.6. Samples: 3618612020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-22 13:42:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 13:42:50,778][15401] Updated weights for policy 0, policy_version 220860 (0.0033) [2024-06-22 13:42:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3618668544. Throughput: 0: 42627.5. Samples: 3618744240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:42:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:42:54,248][15401] Updated weights for policy 0, policy_version 220870 (0.0037) [2024-06-22 13:42:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3618848768. Throughput: 0: 42621.8. Samples: 3619002540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:42:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 13:42:58,849][15401] Updated weights for policy 0, policy_version 220880 (0.0047) [2024-06-22 13:43:02,183][15401] Updated weights for policy 0, policy_version 220890 (0.0031) [2024-06-22 13:43:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3619127296. Throughput: 0: 42377.6. Samples: 3619248840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 13:43:06,784][15401] Updated weights for policy 0, policy_version 220900 (0.0039) [2024-06-22 13:43:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3619291136. Throughput: 0: 42633.3. Samples: 3619386700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:08,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 13:43:09,979][15401] Updated weights for policy 0, policy_version 220910 (0.0041) [2024-06-22 13:43:13,391][15132] Fps is (10 sec: 37678.8, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 3619504128. Throughput: 0: 42421.9. Samples: 3619636760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:13,391][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 13:43:14,369][15401] Updated weights for policy 0, policy_version 220920 (0.0041) [2024-06-22 13:43:17,678][15401] Updated weights for policy 0, policy_version 220930 (0.0040) [2024-06-22 13:43:18,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3619766272. Throughput: 0: 42315.0. Samples: 3619887840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 13:43:22,173][15401] Updated weights for policy 0, policy_version 220940 (0.0039) [2024-06-22 13:43:23,390][15132] Fps is (10 sec: 42603.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3619930112. Throughput: 0: 42559.9. Samples: 3620031320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:23,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 13:43:25,203][15401] Updated weights for policy 0, policy_version 220950 (0.0028) [2024-06-22 13:43:28,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3620143104. Throughput: 0: 42438.4. Samples: 3620274480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 13:43:29,955][15401] Updated weights for policy 0, policy_version 220960 (0.0038) [2024-06-22 13:43:32,930][15401] Updated weights for policy 0, policy_version 220970 (0.0033) [2024-06-22 13:43:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3620372480. Throughput: 0: 42628.3. Samples: 3620530300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 13:43:37,510][15401] Updated weights for policy 0, policy_version 220980 (0.0039) [2024-06-22 13:43:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3620585472. Throughput: 0: 42712.8. Samples: 3620666320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 13:43:40,689][15401] Updated weights for policy 0, policy_version 220990 (0.0041) [2024-06-22 13:43:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3620798464. Throughput: 0: 42384.0. Samples: 3620909820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 13:43:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220996_3620798464.pth... [2024-06-22 13:43:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220371_3610558464.pth [2024-06-22 13:43:45,037][15401] Updated weights for policy 0, policy_version 221000 (0.0036) [2024-06-22 13:43:45,433][15349] Signal inference workers to stop experience collection... (53500 times) [2024-06-22 13:43:45,463][15401] InferenceWorker_p0-w0: stopping experience collection (53500 times) [2024-06-22 13:43:45,489][15349] Signal inference workers to resume experience collection... (53500 times) [2024-06-22 13:43:45,489][15401] InferenceWorker_p0-w0: resuming experience collection (53500 times) [2024-06-22 13:43:48,342][15401] Updated weights for policy 0, policy_version 221010 (0.0041) [2024-06-22 13:43:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3621027840. Throughput: 0: 42872.6. Samples: 3621178100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:48,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 13:43:52,777][15401] Updated weights for policy 0, policy_version 221020 (0.0031) [2024-06-22 13:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 3621208064. Throughput: 0: 42635.2. Samples: 3621305280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 13:43:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 13:43:56,027][15401] Updated weights for policy 0, policy_version 221030 (0.0028) [2024-06-22 13:43:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3621453824. Throughput: 0: 42751.4. Samples: 3621560520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:43:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 13:44:00,346][15401] Updated weights for policy 0, policy_version 221040 (0.0028) [2024-06-22 13:44:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.5, 300 sec: 42709.5). Total num frames: 3621650432. Throughput: 0: 43049.2. Samples: 3621825040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:03,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 13:44:03,683][15401] Updated weights for policy 0, policy_version 221050 (0.0032) [2024-06-22 13:44:07,754][15401] Updated weights for policy 0, policy_version 221060 (0.0027) [2024-06-22 13:44:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3621863424. Throughput: 0: 42740.5. Samples: 3621954640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:08,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-22 13:44:11,231][15401] Updated weights for policy 0, policy_version 221070 (0.0029) [2024-06-22 13:44:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43145.5, 300 sec: 42765.0). Total num frames: 3622092800. Throughput: 0: 42823.0. Samples: 3622201520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 13:44:15,695][15401] Updated weights for policy 0, policy_version 221080 (0.0035) [2024-06-22 13:44:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 3622273024. Throughput: 0: 42851.0. Samples: 3622458600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:44:19,243][15401] Updated weights for policy 0, policy_version 221090 (0.0039) [2024-06-22 13:44:23,296][15401] Updated weights for policy 0, policy_version 221100 (0.0030) [2024-06-22 13:44:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3622502400. Throughput: 0: 42545.0. Samples: 3622580840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 13:44:26,952][15401] Updated weights for policy 0, policy_version 221110 (0.0040) [2024-06-22 13:44:28,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3622731776. Throughput: 0: 42913.4. Samples: 3622840920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 13:44:30,910][15401] Updated weights for policy 0, policy_version 221120 (0.0033) [2024-06-22 13:44:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3622928384. Throughput: 0: 42667.1. Samples: 3623098120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 13:44:34,689][15401] Updated weights for policy 0, policy_version 221130 (0.0042) [2024-06-22 13:44:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3623141376. Throughput: 0: 42610.1. Samples: 3623222740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 13:44:38,508][15401] Updated weights for policy 0, policy_version 221140 (0.0031) [2024-06-22 13:44:42,198][15401] Updated weights for policy 0, policy_version 221150 (0.0048) [2024-06-22 13:44:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3623387136. Throughput: 0: 42796.1. Samples: 3623486340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 13:44:46,133][15401] Updated weights for policy 0, policy_version 221160 (0.0034) [2024-06-22 13:44:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3623567360. Throughput: 0: 42593.6. Samples: 3623741760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 13:44:49,904][15401] Updated weights for policy 0, policy_version 221170 (0.0030) [2024-06-22 13:44:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3623780352. Throughput: 0: 42488.1. Samples: 3623866600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 13:44:53,843][15401] Updated weights for policy 0, policy_version 221180 (0.0027) [2024-06-22 13:44:55,152][15349] Signal inference workers to stop experience collection... (53550 times) [2024-06-22 13:44:55,184][15401] InferenceWorker_p0-w0: stopping experience collection (53550 times) [2024-06-22 13:44:55,206][15349] Signal inference workers to resume experience collection... (53550 times) [2024-06-22 13:44:55,212][15401] InferenceWorker_p0-w0: resuming experience collection (53550 times) [2024-06-22 13:44:57,769][15401] Updated weights for policy 0, policy_version 221190 (0.0033) [2024-06-22 13:44:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3624009728. Throughput: 0: 42888.1. Samples: 3624131480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:44:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 13:45:01,324][15401] Updated weights for policy 0, policy_version 221200 (0.0047) [2024-06-22 13:45:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3624222720. Throughput: 0: 42834.7. Samples: 3624386160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 13:45:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 13:45:05,159][15401] Updated weights for policy 0, policy_version 221210 (0.0032) [2024-06-22 13:45:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3624435712. Throughput: 0: 42887.9. Samples: 3624510800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:08,396][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 13:45:08,941][15401] Updated weights for policy 0, policy_version 221220 (0.0031) [2024-06-22 13:45:12,684][15401] Updated weights for policy 0, policy_version 221230 (0.0033) [2024-06-22 13:45:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3624665088. Throughput: 0: 42988.9. Samples: 3624775420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 13:45:16,464][15401] Updated weights for policy 0, policy_version 221240 (0.0031) [2024-06-22 13:45:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3624861696. Throughput: 0: 42862.3. Samples: 3625026920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:18,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 13:45:20,438][15401] Updated weights for policy 0, policy_version 221250 (0.0048) [2024-06-22 13:45:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3625074688. Throughput: 0: 42974.3. Samples: 3625156580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:23,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 13:45:24,062][15401] Updated weights for policy 0, policy_version 221260 (0.0024) [2024-06-22 13:45:28,177][15401] Updated weights for policy 0, policy_version 221270 (0.0037) [2024-06-22 13:45:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3625287680. Throughput: 0: 42866.7. Samples: 3625415340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:28,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-22 13:45:32,018][15401] Updated weights for policy 0, policy_version 221280 (0.0034) [2024-06-22 13:45:33,394][15132] Fps is (10 sec: 44218.3, 60 sec: 43141.5, 300 sec: 42708.9). Total num frames: 3625517056. Throughput: 0: 42749.8. Samples: 3625665680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:33,394][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 13:45:35,883][15401] Updated weights for policy 0, policy_version 221290 (0.0034) [2024-06-22 13:45:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 3625713664. Throughput: 0: 42802.1. Samples: 3625792800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:38,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 13:45:39,610][15401] Updated weights for policy 0, policy_version 221300 (0.0033) [2024-06-22 13:45:43,389][15132] Fps is (10 sec: 40977.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3625926656. Throughput: 0: 42665.3. Samples: 3626051420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:43,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 13:45:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221309_3625926656.pth... [2024-06-22 13:45:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220684_3615686656.pth [2024-06-22 13:45:44,038][15401] Updated weights for policy 0, policy_version 221310 (0.0027) [2024-06-22 13:45:47,648][15401] Updated weights for policy 0, policy_version 221320 (0.0035) [2024-06-22 13:45:48,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3626139648. Throughput: 0: 42571.6. Samples: 3626301880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 13:45:51,646][15401] Updated weights for policy 0, policy_version 221330 (0.0034) [2024-06-22 13:45:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3626352640. Throughput: 0: 42633.3. Samples: 3626429300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 13:45:55,521][15401] Updated weights for policy 0, policy_version 221340 (0.0030) [2024-06-22 13:45:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3626565632. Throughput: 0: 42464.8. Samples: 3626686340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:45:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 13:45:59,135][15401] Updated weights for policy 0, policy_version 221350 (0.0028) [2024-06-22 13:46:03,079][15401] Updated weights for policy 0, policy_version 221360 (0.0033) [2024-06-22 13:46:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3626795008. Throughput: 0: 42615.9. Samples: 3626944640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:46:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 13:46:06,752][15401] Updated weights for policy 0, policy_version 221370 (0.0033) [2024-06-22 13:46:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3626991616. Throughput: 0: 42654.3. Samples: 3627076020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:46:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 13:46:10,602][15401] Updated weights for policy 0, policy_version 221380 (0.0036) [2024-06-22 13:46:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3627204608. Throughput: 0: 42495.5. Samples: 3627327640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 13:46:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 13:46:14,169][15401] Updated weights for policy 0, policy_version 221390 (0.0051) [2024-06-22 13:46:18,218][15401] Updated weights for policy 0, policy_version 221400 (0.0043) [2024-06-22 13:46:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3627417600. Throughput: 0: 42766.7. Samples: 3627590000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 13:46:21,792][15401] Updated weights for policy 0, policy_version 221410 (0.0033) [2024-06-22 13:46:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3627614208. Throughput: 0: 42694.6. Samples: 3627713960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 13:46:25,777][15349] Signal inference workers to stop experience collection... (53600 times) [2024-06-22 13:46:25,777][15349] Signal inference workers to resume experience collection... (53600 times) [2024-06-22 13:46:25,793][15401] InferenceWorker_p0-w0: stopping experience collection (53600 times) [2024-06-22 13:46:25,824][15401] InferenceWorker_p0-w0: resuming experience collection (53600 times) [2024-06-22 13:46:25,931][15401] Updated weights for policy 0, policy_version 221420 (0.0038) [2024-06-22 13:46:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3627843584. Throughput: 0: 42499.6. Samples: 3627963900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 13:46:29,445][15401] Updated weights for policy 0, policy_version 221430 (0.0040) [2024-06-22 13:46:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42055.2, 300 sec: 42543.8). Total num frames: 3628040192. Throughput: 0: 42800.9. Samples: 3628227920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 13:46:33,673][15401] Updated weights for policy 0, policy_version 221440 (0.0047) [2024-06-22 13:46:37,041][15401] Updated weights for policy 0, policy_version 221450 (0.0036) [2024-06-22 13:46:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 3628269568. Throughput: 0: 42681.4. Samples: 3628349960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 13:46:41,179][15401] Updated weights for policy 0, policy_version 221460 (0.0035) [2024-06-22 13:46:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3628482560. Throughput: 0: 42635.6. Samples: 3628604940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:43,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 13:46:44,962][15401] Updated weights for policy 0, policy_version 221470 (0.0039) [2024-06-22 13:46:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 3628695552. Throughput: 0: 42566.6. Samples: 3628860140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 13:46:48,839][15401] Updated weights for policy 0, policy_version 221480 (0.0034) [2024-06-22 13:46:52,650][15401] Updated weights for policy 0, policy_version 221490 (0.0038) [2024-06-22 13:46:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3628908544. Throughput: 0: 42491.9. Samples: 3628988160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 13:46:56,331][15401] Updated weights for policy 0, policy_version 221500 (0.0039) [2024-06-22 13:46:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3629137920. Throughput: 0: 42652.4. Samples: 3629247000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:46:58,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 13:47:00,138][15401] Updated weights for policy 0, policy_version 221510 (0.0032) [2024-06-22 13:47:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3629350912. Throughput: 0: 42517.2. Samples: 3629503280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:47:03,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 13:47:04,054][15401] Updated weights for policy 0, policy_version 221520 (0.0033) [2024-06-22 13:47:07,641][15401] Updated weights for policy 0, policy_version 221530 (0.0030) [2024-06-22 13:47:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3629547520. Throughput: 0: 42583.2. Samples: 3629630200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:47:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 13:47:11,730][15401] Updated weights for policy 0, policy_version 221540 (0.0033) [2024-06-22 13:47:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3629776896. Throughput: 0: 42722.1. Samples: 3629886400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:47:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 13:47:15,681][15401] Updated weights for policy 0, policy_version 221550 (0.0033) [2024-06-22 13:47:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3629957120. Throughput: 0: 42617.0. Samples: 3630145680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 13:47:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 13:47:19,295][15401] Updated weights for policy 0, policy_version 221560 (0.0046) [2024-06-22 13:47:23,207][15401] Updated weights for policy 0, policy_version 221570 (0.0037) [2024-06-22 13:47:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3630202880. Throughput: 0: 42588.5. Samples: 3630266440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 13:47:26,919][15401] Updated weights for policy 0, policy_version 221580 (0.0032) [2024-06-22 13:47:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3630415872. Throughput: 0: 42672.0. Samples: 3630525180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:28,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 13:47:30,670][15401] Updated weights for policy 0, policy_version 221590 (0.0041) [2024-06-22 13:47:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 3630596096. Throughput: 0: 42956.6. Samples: 3630793180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 13:47:34,406][15401] Updated weights for policy 0, policy_version 221600 (0.0029) [2024-06-22 13:47:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3630825472. Throughput: 0: 42693.4. Samples: 3630909360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:38,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 13:47:38,686][15401] Updated weights for policy 0, policy_version 221610 (0.0038) [2024-06-22 13:47:41,410][15349] Signal inference workers to stop experience collection... (53650 times) [2024-06-22 13:47:41,460][15401] InferenceWorker_p0-w0: stopping experience collection (53650 times) [2024-06-22 13:47:41,465][15349] Signal inference workers to resume experience collection... (53650 times) [2024-06-22 13:47:41,476][15401] InferenceWorker_p0-w0: resuming experience collection (53650 times) [2024-06-22 13:47:42,441][15401] Updated weights for policy 0, policy_version 221620 (0.0022) [2024-06-22 13:47:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3631071232. Throughput: 0: 42761.8. Samples: 3631171280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 13:47:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221623_3631071232.pth... [2024-06-22 13:47:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000220996_3620798464.pth [2024-06-22 13:47:45,998][15401] Updated weights for policy 0, policy_version 221630 (0.0036) [2024-06-22 13:47:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3631251456. Throughput: 0: 43022.7. Samples: 3631439300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:48,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-22 13:47:49,977][15401] Updated weights for policy 0, policy_version 221640 (0.0037) [2024-06-22 13:47:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3631497216. Throughput: 0: 42903.2. Samples: 3631560840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:53,396][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 13:47:53,598][15401] Updated weights for policy 0, policy_version 221650 (0.0038) [2024-06-22 13:47:57,341][15401] Updated weights for policy 0, policy_version 221660 (0.0036) [2024-06-22 13:47:58,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3631726592. Throughput: 0: 43129.9. Samples: 3631827240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:47:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 13:48:01,069][15401] Updated weights for policy 0, policy_version 221670 (0.0033) [2024-06-22 13:48:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3631906816. Throughput: 0: 43273.6. Samples: 3632093000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:03,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 13:48:04,859][15401] Updated weights for policy 0, policy_version 221680 (0.0027) [2024-06-22 13:48:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 3632136192. Throughput: 0: 43100.8. Samples: 3632205980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:08,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 13:48:08,687][15401] Updated weights for policy 0, policy_version 221690 (0.0029) [2024-06-22 13:48:12,485][15401] Updated weights for policy 0, policy_version 221700 (0.0024) [2024-06-22 13:48:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3632365568. Throughput: 0: 43291.6. Samples: 3632473300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:13,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 13:48:16,313][15401] Updated weights for policy 0, policy_version 221710 (0.0033) [2024-06-22 13:48:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3632545792. Throughput: 0: 43105.3. Samples: 3632732920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:18,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-22 13:48:20,183][15401] Updated weights for policy 0, policy_version 221720 (0.0030) [2024-06-22 13:48:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3632758784. Throughput: 0: 43110.3. Samples: 3632849320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 13:48:24,119][15401] Updated weights for policy 0, policy_version 221730 (0.0040) [2024-06-22 13:48:28,203][15401] Updated weights for policy 0, policy_version 221740 (0.0025) [2024-06-22 13:48:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3632988160. Throughput: 0: 43194.7. Samples: 3633115040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 13:48:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 13:48:31,830][15401] Updated weights for policy 0, policy_version 221750 (0.0023) [2024-06-22 13:48:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3633168384. Throughput: 0: 42997.8. Samples: 3633374200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 13:48:35,697][15401] Updated weights for policy 0, policy_version 221760 (0.0021) [2024-06-22 13:48:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3633414144. Throughput: 0: 42890.5. Samples: 3633490920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:48:39,837][15401] Updated weights for policy 0, policy_version 221770 (0.0032) [2024-06-22 13:48:43,205][15401] Updated weights for policy 0, policy_version 221780 (0.0035) [2024-06-22 13:48:43,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3633643520. Throughput: 0: 42862.2. Samples: 3633756040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 13:48:47,836][15401] Updated weights for policy 0, policy_version 221790 (0.0046) [2024-06-22 13:48:48,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3633823744. Throughput: 0: 42645.8. Samples: 3634012060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 13:48:49,265][15349] Signal inference workers to stop experience collection... (53700 times) [2024-06-22 13:48:49,266][15349] Signal inference workers to resume experience collection... (53700 times) [2024-06-22 13:48:49,284][15401] InferenceWorker_p0-w0: stopping experience collection (53700 times) [2024-06-22 13:48:49,284][15401] InferenceWorker_p0-w0: resuming experience collection (53700 times) [2024-06-22 13:48:50,782][15401] Updated weights for policy 0, policy_version 221800 (0.0032) [2024-06-22 13:48:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3634069504. Throughput: 0: 42861.9. Samples: 3634134760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 13:48:55,489][15401] Updated weights for policy 0, policy_version 221810 (0.0038) [2024-06-22 13:48:58,329][15401] Updated weights for policy 0, policy_version 221820 (0.0027) [2024-06-22 13:48:58,396][15132] Fps is (10 sec: 47483.4, 60 sec: 42866.9, 300 sec: 42875.1). Total num frames: 3634298880. Throughput: 0: 42673.9. Samples: 3634393900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:48:58,397][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:49:03,092][15401] Updated weights for policy 0, policy_version 221830 (0.0033) [2024-06-22 13:49:03,393][15132] Fps is (10 sec: 39306.2, 60 sec: 42595.7, 300 sec: 42708.9). Total num frames: 3634462720. Throughput: 0: 42531.5. Samples: 3634647000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:03,394][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 13:49:06,269][15401] Updated weights for policy 0, policy_version 221840 (0.0038) [2024-06-22 13:49:08,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3634708480. Throughput: 0: 42692.7. Samples: 3634770500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:08,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 13:49:10,703][15401] Updated weights for policy 0, policy_version 221850 (0.0032) [2024-06-22 13:49:13,396][15132] Fps is (10 sec: 45863.0, 60 sec: 42593.7, 300 sec: 42875.2). Total num frames: 3634921472. Throughput: 0: 42708.1. Samples: 3635037180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:13,397][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 13:49:14,064][15401] Updated weights for policy 0, policy_version 221860 (0.0041) [2024-06-22 13:49:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3635101696. Throughput: 0: 42660.0. Samples: 3635294000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:18,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 13:49:18,544][15401] Updated weights for policy 0, policy_version 221870 (0.0042) [2024-06-22 13:49:21,738][15401] Updated weights for policy 0, policy_version 221880 (0.0039) [2024-06-22 13:49:23,389][15132] Fps is (10 sec: 42626.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3635347456. Throughput: 0: 42780.3. Samples: 3635416020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 13:49:26,182][15401] Updated weights for policy 0, policy_version 221890 (0.0032) [2024-06-22 13:49:28,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3635544064. Throughput: 0: 42713.8. Samples: 3635678160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 13:49:29,294][15401] Updated weights for policy 0, policy_version 221900 (0.0035) [2024-06-22 13:49:33,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3635740672. Throughput: 0: 42639.5. Samples: 3635930840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 13:49:33,855][15401] Updated weights for policy 0, policy_version 221910 (0.0038) [2024-06-22 13:49:37,087][15401] Updated weights for policy 0, policy_version 221920 (0.0026) [2024-06-22 13:49:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3636002816. Throughput: 0: 42739.1. Samples: 3636058020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:49:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 13:49:41,455][15401] Updated weights for policy 0, policy_version 221930 (0.0037) [2024-06-22 13:49:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3636183040. Throughput: 0: 42739.7. Samples: 3636316920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:49:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 13:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221935_3636183040.pth... [2024-06-22 13:49:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221309_3625926656.pth [2024-06-22 13:49:44,546][15401] Updated weights for policy 0, policy_version 221940 (0.0026) [2024-06-22 13:49:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3636396032. Throughput: 0: 42824.1. Samples: 3636573920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:49:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 13:49:48,932][15401] Updated weights for policy 0, policy_version 221950 (0.0030) [2024-06-22 13:49:52,137][15401] Updated weights for policy 0, policy_version 221960 (0.0029) [2024-06-22 13:49:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3636641792. Throughput: 0: 43056.6. Samples: 3636708040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:49:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 13:49:56,546][15401] Updated weights for policy 0, policy_version 221970 (0.0028) [2024-06-22 13:49:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41783.7, 300 sec: 42653.9). Total num frames: 3636805632. Throughput: 0: 42884.5. Samples: 3636966700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:49:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 13:49:58,839][15349] Signal inference workers to stop experience collection... (53750 times) [2024-06-22 13:49:58,866][15401] InferenceWorker_p0-w0: stopping experience collection (53750 times) [2024-06-22 13:49:58,908][15349] Signal inference workers to resume experience collection... (53750 times) [2024-06-22 13:49:58,908][15401] InferenceWorker_p0-w0: resuming experience collection (53750 times) [2024-06-22 13:49:59,681][15401] Updated weights for policy 0, policy_version 221980 (0.0030) [2024-06-22 13:50:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43420.4, 300 sec: 42820.6). Total num frames: 3637067776. Throughput: 0: 42705.9. Samples: 3637215660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 13:50:04,233][15401] Updated weights for policy 0, policy_version 221990 (0.0046) [2024-06-22 13:50:07,354][15401] Updated weights for policy 0, policy_version 222000 (0.0032) [2024-06-22 13:50:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3637280768. Throughput: 0: 42996.3. Samples: 3637350860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:08,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 13:50:11,777][15401] Updated weights for policy 0, policy_version 222010 (0.0024) [2024-06-22 13:50:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 3637460992. Throughput: 0: 42828.3. Samples: 3637605440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 13:50:15,575][15401] Updated weights for policy 0, policy_version 222020 (0.0025) [2024-06-22 13:50:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 3637706752. Throughput: 0: 42708.6. Samples: 3637852720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:18,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-22 13:50:19,801][15401] Updated weights for policy 0, policy_version 222030 (0.0041) [2024-06-22 13:50:23,059][15401] Updated weights for policy 0, policy_version 222040 (0.0035) [2024-06-22 13:50:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3637919744. Throughput: 0: 42898.6. Samples: 3637988460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 13:50:27,604][15401] Updated weights for policy 0, policy_version 222050 (0.0044) [2024-06-22 13:50:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.6). Total num frames: 3638099968. Throughput: 0: 42937.5. Samples: 3638249100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 13:50:30,526][15401] Updated weights for policy 0, policy_version 222060 (0.0028) [2024-06-22 13:50:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43690.7, 300 sec: 42876.4). Total num frames: 3638362112. Throughput: 0: 42722.6. Samples: 3638496440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 13:50:35,026][15401] Updated weights for policy 0, policy_version 222070 (0.0030) [2024-06-22 13:50:38,051][15401] Updated weights for policy 0, policy_version 222080 (0.0029) [2024-06-22 13:50:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3638575104. Throughput: 0: 42796.9. Samples: 3638633900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 13:50:42,540][15401] Updated weights for policy 0, policy_version 222090 (0.0031) [2024-06-22 13:50:43,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3638738944. Throughput: 0: 42668.9. Samples: 3638886800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 13:50:45,753][15401] Updated weights for policy 0, policy_version 222100 (0.0031) [2024-06-22 13:50:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3639001088. Throughput: 0: 42709.3. Samples: 3639137580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 13:50:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 13:50:50,095][15401] Updated weights for policy 0, policy_version 222110 (0.0040) [2024-06-22 13:50:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3639181312. Throughput: 0: 42665.2. Samples: 3639270800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:50:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 13:50:53,953][15401] Updated weights for policy 0, policy_version 222120 (0.0030) [2024-06-22 13:50:57,607][15401] Updated weights for policy 0, policy_version 222130 (0.0034) [2024-06-22 13:50:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 3639394304. Throughput: 0: 42558.7. Samples: 3639520680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:50:58,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 13:51:01,091][15349] Signal inference workers to stop experience collection... (53800 times) [2024-06-22 13:51:01,103][15401] InferenceWorker_p0-w0: stopping experience collection (53800 times) [2024-06-22 13:51:01,156][15349] Signal inference workers to resume experience collection... (53800 times) [2024-06-22 13:51:01,156][15401] InferenceWorker_p0-w0: resuming experience collection (53800 times) [2024-06-22 13:51:01,464][15401] Updated weights for policy 0, policy_version 222140 (0.0037) [2024-06-22 13:51:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3639607296. Throughput: 0: 42903.1. Samples: 3639783360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 13:51:05,077][15401] Updated weights for policy 0, policy_version 222150 (0.0028) [2024-06-22 13:51:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 3639803904. Throughput: 0: 42697.3. Samples: 3639909840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 13:51:09,218][15401] Updated weights for policy 0, policy_version 222160 (0.0032) [2024-06-22 13:51:12,569][15401] Updated weights for policy 0, policy_version 222170 (0.0023) [2024-06-22 13:51:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3640033280. Throughput: 0: 42427.6. Samples: 3640158340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 13:51:16,858][15401] Updated weights for policy 0, policy_version 222180 (0.0042) [2024-06-22 13:51:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 3640246272. Throughput: 0: 42891.1. Samples: 3640426540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 13:51:19,919][15401] Updated weights for policy 0, policy_version 222190 (0.0038) [2024-06-22 13:51:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3640459264. Throughput: 0: 42666.5. Samples: 3640553900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:23,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-22 13:51:24,463][15401] Updated weights for policy 0, policy_version 222200 (0.0022) [2024-06-22 13:51:27,941][15401] Updated weights for policy 0, policy_version 222210 (0.0045) [2024-06-22 13:51:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3640688640. Throughput: 0: 42709.8. Samples: 3640808740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:28,394][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 13:51:32,194][15401] Updated weights for policy 0, policy_version 222220 (0.0033) [2024-06-22 13:51:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 3640901632. Throughput: 0: 42973.3. Samples: 3641071380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 13:51:35,725][15401] Updated weights for policy 0, policy_version 222230 (0.0025) [2024-06-22 13:51:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3641114624. Throughput: 0: 42783.2. Samples: 3641196040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 13:51:39,819][15401] Updated weights for policy 0, policy_version 222240 (0.0029) [2024-06-22 13:51:43,395][15132] Fps is (10 sec: 42576.5, 60 sec: 43140.8, 300 sec: 42819.8). Total num frames: 3641327616. Throughput: 0: 42975.1. Samples: 3641454680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:43,395][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 13:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222249_3641327616.pth... [2024-06-22 13:51:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221623_3631071232.pth [2024-06-22 13:51:43,798][15401] Updated weights for policy 0, policy_version 222250 (0.0037) [2024-06-22 13:51:47,546][15401] Updated weights for policy 0, policy_version 222260 (0.0027) [2024-06-22 13:51:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3641540608. Throughput: 0: 42808.5. Samples: 3641709740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 13:51:51,372][15401] Updated weights for policy 0, policy_version 222270 (0.0038) [2024-06-22 13:51:53,389][15132] Fps is (10 sec: 44260.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3641769984. Throughput: 0: 42846.3. Samples: 3641837920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-22 13:51:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 13:51:55,174][15401] Updated weights for policy 0, policy_version 222280 (0.0023) [2024-06-22 13:51:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3641966592. Throughput: 0: 43041.6. Samples: 3642095220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:51:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 13:51:58,824][15401] Updated weights for policy 0, policy_version 222290 (0.0027) [2024-06-22 13:52:02,858][15401] Updated weights for policy 0, policy_version 222300 (0.0027) [2024-06-22 13:52:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3642179584. Throughput: 0: 42858.4. Samples: 3642355160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 13:52:06,279][15401] Updated weights for policy 0, policy_version 222310 (0.0042) [2024-06-22 13:52:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3642408960. Throughput: 0: 42815.3. Samples: 3642480580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 13:52:10,420][15401] Updated weights for policy 0, policy_version 222320 (0.0027) [2024-06-22 13:52:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3642621952. Throughput: 0: 42900.4. Samples: 3642739260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 13:52:13,907][15401] Updated weights for policy 0, policy_version 222330 (0.0030) [2024-06-22 13:52:18,043][15401] Updated weights for policy 0, policy_version 222340 (0.0039) [2024-06-22 13:52:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3642834944. Throughput: 0: 42657.3. Samples: 3642990960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:18,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-22 13:52:21,775][15401] Updated weights for policy 0, policy_version 222350 (0.0038) [2024-06-22 13:52:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3643031552. Throughput: 0: 42722.1. Samples: 3643118540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 13:52:25,440][15349] Signal inference workers to stop experience collection... (53850 times) [2024-06-22 13:52:25,467][15401] InferenceWorker_p0-w0: stopping experience collection (53850 times) [2024-06-22 13:52:25,493][15349] Signal inference workers to resume experience collection... (53850 times) [2024-06-22 13:52:25,494][15401] InferenceWorker_p0-w0: resuming experience collection (53850 times) [2024-06-22 13:52:25,663][15401] Updated weights for policy 0, policy_version 222360 (0.0031) [2024-06-22 13:52:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3643260928. Throughput: 0: 42665.4. Samples: 3643374400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 13:52:29,502][15401] Updated weights for policy 0, policy_version 222370 (0.0036) [2024-06-22 13:52:33,230][15401] Updated weights for policy 0, policy_version 222380 (0.0034) [2024-06-22 13:52:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3643473920. Throughput: 0: 42491.9. Samples: 3643621880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:33,391][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 13:52:36,995][15401] Updated weights for policy 0, policy_version 222390 (0.0022) [2024-06-22 13:52:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3643654144. Throughput: 0: 42551.5. Samples: 3643752740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:38,394][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 13:52:40,906][15401] Updated weights for policy 0, policy_version 222400 (0.0031) [2024-06-22 13:52:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42602.1, 300 sec: 42820.6). Total num frames: 3643883520. Throughput: 0: 42594.2. Samples: 3644011960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:43,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 13:52:44,578][15401] Updated weights for policy 0, policy_version 222410 (0.0036) [2024-06-22 13:52:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3644112896. Throughput: 0: 42444.5. Samples: 3644265160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 13:52:48,549][15401] Updated weights for policy 0, policy_version 222420 (0.0033) [2024-06-22 13:52:52,557][15401] Updated weights for policy 0, policy_version 222430 (0.0039) [2024-06-22 13:52:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3644309504. Throughput: 0: 42539.5. Samples: 3644394860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 13:52:56,295][15401] Updated weights for policy 0, policy_version 222440 (0.0029) [2024-06-22 13:52:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3644538880. Throughput: 0: 42521.4. Samples: 3644652720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:52:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:53:00,117][15401] Updated weights for policy 0, policy_version 222450 (0.0029) [2024-06-22 13:53:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3644735488. Throughput: 0: 42688.1. Samples: 3644911920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 13:53:04,116][15401] Updated weights for policy 0, policy_version 222460 (0.0030) [2024-06-22 13:53:07,865][15401] Updated weights for policy 0, policy_version 222470 (0.0037) [2024-06-22 13:53:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3644964864. Throughput: 0: 42629.9. Samples: 3645036880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 13:53:11,722][15401] Updated weights for policy 0, policy_version 222480 (0.0031) [2024-06-22 13:53:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3645194240. Throughput: 0: 42676.9. Samples: 3645294860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 13:53:15,306][15401] Updated weights for policy 0, policy_version 222490 (0.0033) [2024-06-22 13:53:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3645374464. Throughput: 0: 42931.2. Samples: 3645553780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 13:53:19,133][15401] Updated weights for policy 0, policy_version 222500 (0.0044) [2024-06-22 13:53:22,951][15401] Updated weights for policy 0, policy_version 222510 (0.0042) [2024-06-22 13:53:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3645603840. Throughput: 0: 42830.3. Samples: 3645680100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 13:53:26,901][15401] Updated weights for policy 0, policy_version 222520 (0.0035) [2024-06-22 13:53:28,393][15132] Fps is (10 sec: 45858.9, 60 sec: 42868.9, 300 sec: 42931.1). Total num frames: 3645833216. Throughput: 0: 42901.9. Samples: 3645942700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:28,394][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 13:53:30,564][15401] Updated weights for policy 0, policy_version 222530 (0.0040) [2024-06-22 13:53:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 3646029824. Throughput: 0: 42920.7. Samples: 3646196700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:33,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 13:53:34,612][15401] Updated weights for policy 0, policy_version 222540 (0.0041) [2024-06-22 13:53:38,090][15401] Updated weights for policy 0, policy_version 222550 (0.0033) [2024-06-22 13:53:38,389][15132] Fps is (10 sec: 42614.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 3646259200. Throughput: 0: 42855.6. Samples: 3646323360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 13:53:41,916][15349] Signal inference workers to stop experience collection... (53900 times) [2024-06-22 13:53:41,916][15349] Signal inference workers to resume experience collection... (53900 times) [2024-06-22 13:53:41,946][15401] InferenceWorker_p0-w0: stopping experience collection (53900 times) [2024-06-22 13:53:41,946][15401] InferenceWorker_p0-w0: resuming experience collection (53900 times) [2024-06-22 13:53:42,052][15401] Updated weights for policy 0, policy_version 222560 (0.0047) [2024-06-22 13:53:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3646455808. Throughput: 0: 42749.3. Samples: 3646576440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:43,399][15132] Avg episode reward: [(0, '0.825')] [2024-06-22 13:53:43,629][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222564_3646488576.pth... [2024-06-22 13:53:43,678][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000221935_3636183040.pth [2024-06-22 13:53:46,183][15401] Updated weights for policy 0, policy_version 222570 (0.0028) [2024-06-22 13:53:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3646668800. Throughput: 0: 42858.2. Samples: 3646840540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:48,398][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 13:53:49,608][15401] Updated weights for policy 0, policy_version 222580 (0.0033) [2024-06-22 13:53:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 3646898176. Throughput: 0: 42976.8. Samples: 3646970840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 13:53:53,887][15401] Updated weights for policy 0, policy_version 222590 (0.0036) [2024-06-22 13:53:56,995][15401] Updated weights for policy 0, policy_version 222600 (0.0033) [2024-06-22 13:53:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42821.1). Total num frames: 3647094784. Throughput: 0: 42762.3. Samples: 3647219160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:53:58,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 13:54:01,209][15401] Updated weights for policy 0, policy_version 222610 (0.0032) [2024-06-22 13:54:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3647307776. Throughput: 0: 42956.9. Samples: 3647486840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:54:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 13:54:04,479][15401] Updated weights for policy 0, policy_version 222620 (0.0038) [2024-06-22 13:54:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42766.0). Total num frames: 3647537152. Throughput: 0: 42947.5. Samples: 3647612740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:54:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 13:54:08,705][15401] Updated weights for policy 0, policy_version 222630 (0.0034) [2024-06-22 13:54:12,375][15401] Updated weights for policy 0, policy_version 222640 (0.0033) [2024-06-22 13:54:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 3647750144. Throughput: 0: 42827.8. Samples: 3647869800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 13:54:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 13:54:16,698][15401] Updated weights for policy 0, policy_version 222650 (0.0037) [2024-06-22 13:54:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3647946752. Throughput: 0: 43003.7. Samples: 3648131760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 13:54:19,976][15401] Updated weights for policy 0, policy_version 222660 (0.0049) [2024-06-22 13:54:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3648159744. Throughput: 0: 42974.1. Samples: 3648257200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 13:54:24,056][15401] Updated weights for policy 0, policy_version 222670 (0.0030) [2024-06-22 13:54:27,409][15401] Updated weights for policy 0, policy_version 222680 (0.0034) [2024-06-22 13:54:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42601.0, 300 sec: 42876.1). Total num frames: 3648389120. Throughput: 0: 43117.4. Samples: 3648516720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 13:54:31,455][15401] Updated weights for policy 0, policy_version 222690 (0.0036) [2024-06-22 13:54:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 3648602112. Throughput: 0: 43131.4. Samples: 3648781560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:33,393][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 13:54:35,353][15401] Updated weights for policy 0, policy_version 222700 (0.0041) [2024-06-22 13:54:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3648815104. Throughput: 0: 42993.8. Samples: 3648905560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 13:54:39,260][15401] Updated weights for policy 0, policy_version 222710 (0.0036) [2024-06-22 13:54:42,213][15349] Signal inference workers to stop experience collection... (53950 times) [2024-06-22 13:54:42,216][15349] Signal inference workers to resume experience collection... (53950 times) [2024-06-22 13:54:42,234][15401] InferenceWorker_p0-w0: stopping experience collection (53950 times) [2024-06-22 13:54:42,234][15401] InferenceWorker_p0-w0: resuming experience collection (53950 times) [2024-06-22 13:54:43,001][15401] Updated weights for policy 0, policy_version 222720 (0.0025) [2024-06-22 13:54:43,390][15132] Fps is (10 sec: 45886.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3649060864. Throughput: 0: 43221.3. Samples: 3649164120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 13:54:47,079][15401] Updated weights for policy 0, policy_version 222730 (0.0032) [2024-06-22 13:54:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3649257472. Throughput: 0: 42962.6. Samples: 3649420160. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 13:54:50,785][15401] Updated weights for policy 0, policy_version 222740 (0.0028) [2024-06-22 13:54:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3649454080. Throughput: 0: 42874.3. Samples: 3649542080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 13:54:54,557][15401] Updated weights for policy 0, policy_version 222750 (0.0039) [2024-06-22 13:54:58,305][15401] Updated weights for policy 0, policy_version 222760 (0.0031) [2024-06-22 13:54:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3649699840. Throughput: 0: 42982.3. Samples: 3649804000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:54:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 13:55:01,977][15401] Updated weights for policy 0, policy_version 222770 (0.0038) [2024-06-22 13:55:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3649880064. Throughput: 0: 42937.6. Samples: 3650063960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:55:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 13:55:05,754][15401] Updated weights for policy 0, policy_version 222780 (0.0032) [2024-06-22 13:55:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3650093056. Throughput: 0: 42944.0. Samples: 3650189680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:55:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 13:55:09,968][15401] Updated weights for policy 0, policy_version 222790 (0.0034) [2024-06-22 13:55:13,223][15401] Updated weights for policy 0, policy_version 222800 (0.0033) [2024-06-22 13:55:13,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3650355200. Throughput: 0: 42884.4. Samples: 3650446520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:55:13,390][15132] Avg episode reward: [(0, '0.893')] [2024-06-22 13:55:17,524][15401] Updated weights for policy 0, policy_version 222810 (0.0038) [2024-06-22 13:55:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3650535424. Throughput: 0: 42898.7. Samples: 3650711900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-22 13:55:18,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 13:55:20,705][15401] Updated weights for policy 0, policy_version 222820 (0.0037) [2024-06-22 13:55:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3650748416. Throughput: 0: 42926.6. Samples: 3650837260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 13:55:25,046][15401] Updated weights for policy 0, policy_version 222830 (0.0033) [2024-06-22 13:55:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3650994176. Throughput: 0: 42910.3. Samples: 3651095080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:28,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-22 13:55:28,429][15401] Updated weights for policy 0, policy_version 222840 (0.0025) [2024-06-22 13:55:32,588][15401] Updated weights for policy 0, policy_version 222850 (0.0035) [2024-06-22 13:55:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 3651190784. Throughput: 0: 43160.9. Samples: 3651362400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 13:55:35,858][15401] Updated weights for policy 0, policy_version 222860 (0.0037) [2024-06-22 13:55:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 3651387392. Throughput: 0: 43043.5. Samples: 3651479140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:38,392][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 13:55:39,512][15349] Signal inference workers to stop experience collection... (54000 times) [2024-06-22 13:55:39,577][15401] InferenceWorker_p0-w0: stopping experience collection (54000 times) [2024-06-22 13:55:39,633][15349] Signal inference workers to resume experience collection... (54000 times) [2024-06-22 13:55:39,633][15401] InferenceWorker_p0-w0: resuming experience collection (54000 times) [2024-06-22 13:55:40,532][15401] Updated weights for policy 0, policy_version 222870 (0.0029) [2024-06-22 13:55:43,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 3651649536. Throughput: 0: 43243.4. Samples: 3651750060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:43,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 13:55:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222879_3651649536.pth... [2024-06-22 13:55:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222249_3641327616.pth [2024-06-22 13:55:43,675][15401] Updated weights for policy 0, policy_version 222880 (0.0040) [2024-06-22 13:55:48,007][15401] Updated weights for policy 0, policy_version 222890 (0.0032) [2024-06-22 13:55:48,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3651846144. Throughput: 0: 43113.9. Samples: 3652004080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 13:55:51,163][15401] Updated weights for policy 0, policy_version 222900 (0.0027) [2024-06-22 13:55:53,390][15132] Fps is (10 sec: 39330.5, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 3652042752. Throughput: 0: 43071.8. Samples: 3652127920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:53,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 13:55:55,423][15401] Updated weights for policy 0, policy_version 222910 (0.0038) [2024-06-22 13:55:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3652304896. Throughput: 0: 43301.0. Samples: 3652395060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:55:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 13:55:58,894][15401] Updated weights for policy 0, policy_version 222920 (0.0035) [2024-06-22 13:56:02,909][15401] Updated weights for policy 0, policy_version 222930 (0.0038) [2024-06-22 13:56:03,392][15132] Fps is (10 sec: 44227.0, 60 sec: 43416.0, 300 sec: 42986.8). Total num frames: 3652485120. Throughput: 0: 43045.8. Samples: 3652649060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:03,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 13:56:06,498][15401] Updated weights for policy 0, policy_version 222940 (0.0030) [2024-06-22 13:56:08,390][15132] Fps is (10 sec: 37682.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3652681728. Throughput: 0: 43076.8. Samples: 3652775720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 13:56:10,402][15401] Updated weights for policy 0, policy_version 222950 (0.0026) [2024-06-22 13:56:13,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3652927488. Throughput: 0: 43134.7. Samples: 3653036140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:13,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 13:56:14,051][15401] Updated weights for policy 0, policy_version 222960 (0.0045) [2024-06-22 13:56:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3653124096. Throughput: 0: 42882.2. Samples: 3653292100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 13:56:18,522][15401] Updated weights for policy 0, policy_version 222970 (0.0031) [2024-06-22 13:56:21,741][15401] Updated weights for policy 0, policy_version 222980 (0.0041) [2024-06-22 13:56:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3653320704. Throughput: 0: 43104.0. Samples: 3653418720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 13:56:26,267][15401] Updated weights for policy 0, policy_version 222990 (0.0030) [2024-06-22 13:56:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3653566464. Throughput: 0: 42854.8. Samples: 3653678420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-22 13:56:28,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-22 13:56:29,377][15401] Updated weights for policy 0, policy_version 223000 (0.0029) [2024-06-22 13:56:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3653763072. Throughput: 0: 42876.4. Samples: 3653933520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 13:56:33,959][15401] Updated weights for policy 0, policy_version 223010 (0.0032) [2024-06-22 13:56:37,211][15401] Updated weights for policy 0, policy_version 223020 (0.0028) [2024-06-22 13:56:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.3, 300 sec: 42876.9). Total num frames: 3653976064. Throughput: 0: 42883.8. Samples: 3654057680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 13:56:41,639][15401] Updated weights for policy 0, policy_version 223030 (0.0024) [2024-06-22 13:56:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42598.4, 300 sec: 42931.3). Total num frames: 3654205440. Throughput: 0: 42726.0. Samples: 3654317840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:43,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 13:56:44,734][15401] Updated weights for policy 0, policy_version 223040 (0.0031) [2024-06-22 13:56:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3654402048. Throughput: 0: 42837.3. Samples: 3654576640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 13:56:49,385][15401] Updated weights for policy 0, policy_version 223050 (0.0041) [2024-06-22 13:56:52,778][15401] Updated weights for policy 0, policy_version 223060 (0.0038) [2024-06-22 13:56:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 3654631424. Throughput: 0: 42745.9. Samples: 3654699280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 13:56:57,003][15401] Updated weights for policy 0, policy_version 223070 (0.0033) [2024-06-22 13:56:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3654844416. Throughput: 0: 42711.9. Samples: 3654958180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:56:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 13:57:00,456][15401] Updated weights for policy 0, policy_version 223080 (0.0033) [2024-06-22 13:57:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 3655024640. Throughput: 0: 42827.1. Samples: 3655219320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:03,396][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 13:57:04,538][15401] Updated weights for policy 0, policy_version 223090 (0.0038) [2024-06-22 13:57:08,193][15401] Updated weights for policy 0, policy_version 223100 (0.0037) [2024-06-22 13:57:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3655270400. Throughput: 0: 42768.2. Samples: 3655343280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 13:57:09,617][15349] Signal inference workers to stop experience collection... (54050 times) [2024-06-22 13:57:09,670][15401] InferenceWorker_p0-w0: stopping experience collection (54050 times) [2024-06-22 13:57:09,670][15349] Signal inference workers to resume experience collection... (54050 times) [2024-06-22 13:57:09,684][15401] InferenceWorker_p0-w0: resuming experience collection (54050 times) [2024-06-22 13:57:12,091][15401] Updated weights for policy 0, policy_version 223110 (0.0035) [2024-06-22 13:57:13,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.6, 300 sec: 42875.8). Total num frames: 3655483392. Throughput: 0: 42720.3. Samples: 3655600940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:13,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 13:57:15,697][15401] Updated weights for policy 0, policy_version 223120 (0.0040) [2024-06-22 13:57:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3655680000. Throughput: 0: 42653.0. Samples: 3655852900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 13:57:19,948][15401] Updated weights for policy 0, policy_version 223130 (0.0028) [2024-06-22 13:57:23,307][15401] Updated weights for policy 0, policy_version 223140 (0.0038) [2024-06-22 13:57:23,391][15132] Fps is (10 sec: 44242.2, 60 sec: 43416.8, 300 sec: 42931.5). Total num frames: 3655925760. Throughput: 0: 42679.7. Samples: 3655978320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:23,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 13:57:27,662][15401] Updated weights for policy 0, policy_version 223150 (0.0031) [2024-06-22 13:57:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3656122368. Throughput: 0: 42770.4. Samples: 3656242400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:28,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 13:57:30,874][15401] Updated weights for policy 0, policy_version 223160 (0.0035) [2024-06-22 13:57:33,390][15132] Fps is (10 sec: 40964.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3656335360. Throughput: 0: 42585.8. Samples: 3656493000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 13:57:35,290][15401] Updated weights for policy 0, policy_version 223170 (0.0041) [2024-06-22 13:57:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3656548352. Throughput: 0: 42739.5. Samples: 3656622560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 13:57:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 13:57:38,819][15401] Updated weights for policy 0, policy_version 223180 (0.0027) [2024-06-22 13:57:42,810][15401] Updated weights for policy 0, policy_version 223190 (0.0028) [2024-06-22 13:57:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 3656761344. Throughput: 0: 42772.9. Samples: 3656882960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:57:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 13:57:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223192_3656777728.pth... [2024-06-22 13:57:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222564_3646488576.pth [2024-06-22 13:57:46,683][15401] Updated weights for policy 0, policy_version 223200 (0.0039) [2024-06-22 13:57:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3656990720. Throughput: 0: 42543.6. Samples: 3657133780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:57:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 13:57:50,644][15401] Updated weights for policy 0, policy_version 223210 (0.0026) [2024-06-22 13:57:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 3657170944. Throughput: 0: 42703.5. Samples: 3657265040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:57:53,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 13:57:54,469][15401] Updated weights for policy 0, policy_version 223220 (0.0031) [2024-06-22 13:57:58,321][15401] Updated weights for policy 0, policy_version 223230 (0.0039) [2024-06-22 13:57:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3657400320. Throughput: 0: 42790.6. Samples: 3657526420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:57:58,391][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 13:58:02,117][15401] Updated weights for policy 0, policy_version 223240 (0.0033) [2024-06-22 13:58:03,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3657629696. Throughput: 0: 42768.8. Samples: 3657777500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:03,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 13:58:05,752][15401] Updated weights for policy 0, policy_version 223250 (0.0025) [2024-06-22 13:58:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3657826304. Throughput: 0: 42942.4. Samples: 3657910680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 13:58:09,626][15401] Updated weights for policy 0, policy_version 223260 (0.0035) [2024-06-22 13:58:13,174][15401] Updated weights for policy 0, policy_version 223270 (0.0033) [2024-06-22 13:58:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 42987.2). Total num frames: 3658055680. Throughput: 0: 42794.7. Samples: 3658168160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 13:58:17,133][15401] Updated weights for policy 0, policy_version 223280 (0.0026) [2024-06-22 13:58:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3658268672. Throughput: 0: 43017.5. Samples: 3658428780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 13:58:20,938][15401] Updated weights for policy 0, policy_version 223290 (0.0029) [2024-06-22 13:58:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42326.1, 300 sec: 42821.1). Total num frames: 3658465280. Throughput: 0: 42878.1. Samples: 3658552080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 13:58:24,944][15401] Updated weights for policy 0, policy_version 223300 (0.0031) [2024-06-22 13:58:28,267][15401] Updated weights for policy 0, policy_version 223310 (0.0040) [2024-06-22 13:58:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42987.5). Total num frames: 3658711040. Throughput: 0: 42976.0. Samples: 3658816880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 13:58:30,772][15349] Signal inference workers to stop experience collection... (54100 times) [2024-06-22 13:58:30,772][15349] Signal inference workers to resume experience collection... (54100 times) [2024-06-22 13:58:30,789][15401] InferenceWorker_p0-w0: stopping experience collection (54100 times) [2024-06-22 13:58:30,789][15401] InferenceWorker_p0-w0: resuming experience collection (54100 times) [2024-06-22 13:58:32,562][15401] Updated weights for policy 0, policy_version 223320 (0.0035) [2024-06-22 13:58:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3658924032. Throughput: 0: 43099.1. Samples: 3659073240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 13:58:35,745][15401] Updated weights for policy 0, policy_version 223330 (0.0031) [2024-06-22 13:58:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3659104256. Throughput: 0: 43023.1. Samples: 3659200980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 13:58:40,181][15401] Updated weights for policy 0, policy_version 223340 (0.0045) [2024-06-22 13:58:43,197][15401] Updated weights for policy 0, policy_version 223350 (0.0036) [2024-06-22 13:58:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3659366400. Throughput: 0: 42909.4. Samples: 3659457340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 13:58:47,763][15401] Updated weights for policy 0, policy_version 223360 (0.0026) [2024-06-22 13:58:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3659563008. Throughput: 0: 43100.4. Samples: 3659717020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 13:58:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 13:58:51,163][15401] Updated weights for policy 0, policy_version 223370 (0.0047) [2024-06-22 13:58:53,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42871.4, 300 sec: 42875.7). Total num frames: 3659743232. Throughput: 0: 42932.8. Samples: 3659842760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:58:53,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 13:58:55,400][15401] Updated weights for policy 0, policy_version 223380 (0.0037) [2024-06-22 13:58:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3659988992. Throughput: 0: 42888.3. Samples: 3660098140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:58:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 13:58:58,613][15401] Updated weights for policy 0, policy_version 223390 (0.0030) [2024-06-22 13:59:03,264][15401] Updated weights for policy 0, policy_version 223400 (0.0052) [2024-06-22 13:59:03,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3660185600. Throughput: 0: 43012.7. Samples: 3660364360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 13:59:06,141][15401] Updated weights for policy 0, policy_version 223410 (0.0030) [2024-06-22 13:59:08,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3660382208. Throughput: 0: 42905.2. Samples: 3660482820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 13:59:10,777][15401] Updated weights for policy 0, policy_version 223420 (0.0028) [2024-06-22 13:59:13,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 3660660736. Throughput: 0: 42695.1. Samples: 3660738160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 13:59:13,655][15401] Updated weights for policy 0, policy_version 223430 (0.0033) [2024-06-22 13:59:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.2, 300 sec: 42931.6). Total num frames: 3660824576. Throughput: 0: 42964.8. Samples: 3661006660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 13:59:18,530][15401] Updated weights for policy 0, policy_version 223440 (0.0032) [2024-06-22 13:59:21,477][15401] Updated weights for policy 0, policy_version 223450 (0.0032) [2024-06-22 13:59:23,396][15132] Fps is (10 sec: 36022.1, 60 sec: 42594.0, 300 sec: 42819.6). Total num frames: 3661021184. Throughput: 0: 42639.4. Samples: 3661120020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:23,397][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 13:59:26,024][15401] Updated weights for policy 0, policy_version 223460 (0.0030) [2024-06-22 13:59:28,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 3661299712. Throughput: 0: 42800.5. Samples: 3661383360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:28,396][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 13:59:29,002][15401] Updated weights for policy 0, policy_version 223470 (0.0029) [2024-06-22 13:59:33,392][15132] Fps is (10 sec: 44254.4, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 3661463552. Throughput: 0: 43141.3. Samples: 3661658480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:33,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 13:59:33,549][15401] Updated weights for policy 0, policy_version 223480 (0.0035) [2024-06-22 13:59:33,841][15349] Signal inference workers to stop experience collection... (54150 times) [2024-06-22 13:59:33,841][15349] Signal inference workers to resume experience collection... (54150 times) [2024-06-22 13:59:33,856][15401] InferenceWorker_p0-w0: stopping experience collection (54150 times) [2024-06-22 13:59:33,856][15401] InferenceWorker_p0-w0: resuming experience collection (54150 times) [2024-06-22 13:59:36,554][15401] Updated weights for policy 0, policy_version 223490 (0.0037) [2024-06-22 13:59:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3661692928. Throughput: 0: 42960.4. Samples: 3661775880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 13:59:41,170][15401] Updated weights for policy 0, policy_version 223500 (0.0042) [2024-06-22 13:59:43,390][15132] Fps is (10 sec: 49163.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3661955072. Throughput: 0: 43143.1. Samples: 3662039580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 13:59:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223508_3661955072.pth... [2024-06-22 13:59:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000222879_3651649536.pth [2024-06-22 13:59:44,145][15401] Updated weights for policy 0, policy_version 223510 (0.0036) [2024-06-22 13:59:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3662135296. Throughput: 0: 43212.6. Samples: 3662308920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 13:59:48,612][15401] Updated weights for policy 0, policy_version 223520 (0.0041) [2024-06-22 13:59:51,889][15401] Updated weights for policy 0, policy_version 223530 (0.0028) [2024-06-22 13:59:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 3662331904. Throughput: 0: 43171.4. Samples: 3662425520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 13:59:56,058][15401] Updated weights for policy 0, policy_version 223540 (0.0039) [2024-06-22 13:59:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43690.7, 300 sec: 43153.8). Total num frames: 3662610432. Throughput: 0: 43340.5. Samples: 3662688480. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 13:59:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 13:59:59,510][15401] Updated weights for policy 0, policy_version 223550 (0.0037) [2024-06-22 14:00:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3662774272. Throughput: 0: 43414.4. Samples: 3662960300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 14:00:03,752][15401] Updated weights for policy 0, policy_version 223560 (0.0035) [2024-06-22 14:00:07,122][15401] Updated weights for policy 0, policy_version 223570 (0.0036) [2024-06-22 14:00:08,392][15132] Fps is (10 sec: 37674.0, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 3662987264. Throughput: 0: 43427.8. Samples: 3663074100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:08,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 14:00:11,422][15401] Updated weights for policy 0, policy_version 223580 (0.0039) [2024-06-22 14:00:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 3663249408. Throughput: 0: 43459.0. Samples: 3663339020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 14:00:14,667][15401] Updated weights for policy 0, policy_version 223590 (0.0036) [2024-06-22 14:00:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 3663413248. Throughput: 0: 43315.7. Samples: 3663607580. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 14:00:19,033][15401] Updated weights for policy 0, policy_version 223600 (0.0033) [2024-06-22 14:00:22,448][15401] Updated weights for policy 0, policy_version 223610 (0.0026) [2024-06-22 14:00:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43695.2, 300 sec: 42876.1). Total num frames: 3663642624. Throughput: 0: 43329.4. Samples: 3663725700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 14:00:26,495][15401] Updated weights for policy 0, policy_version 223620 (0.0033) [2024-06-22 14:00:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3663888384. Throughput: 0: 43233.8. Samples: 3663985100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 14:00:29,979][15401] Updated weights for policy 0, policy_version 223630 (0.0030) [2024-06-22 14:00:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43419.4, 300 sec: 42987.5). Total num frames: 3664068608. Throughput: 0: 43261.4. Samples: 3664255680. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 14:00:33,888][15401] Updated weights for policy 0, policy_version 223640 (0.0043) [2024-06-22 14:00:37,523][15401] Updated weights for policy 0, policy_version 223650 (0.0025) [2024-06-22 14:00:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 3664281600. Throughput: 0: 43330.2. Samples: 3664375380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 14:00:41,582][15401] Updated weights for policy 0, policy_version 223660 (0.0033) [2024-06-22 14:00:42,490][15349] Signal inference workers to stop experience collection... (54200 times) [2024-06-22 14:00:42,490][15349] Signal inference workers to resume experience collection... (54200 times) [2024-06-22 14:00:42,538][15401] InferenceWorker_p0-w0: stopping experience collection (54200 times) [2024-06-22 14:00:42,538][15401] InferenceWorker_p0-w0: resuming experience collection (54200 times) [2024-06-22 14:00:43,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3664543744. Throughput: 0: 43310.1. Samples: 3664637440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:43,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 14:00:45,120][15401] Updated weights for policy 0, policy_version 223670 (0.0032) [2024-06-22 14:00:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 3664707584. Throughput: 0: 43133.3. Samples: 3664901300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 14:00:49,117][15401] Updated weights for policy 0, policy_version 223680 (0.0038) [2024-06-22 14:00:52,852][15401] Updated weights for policy 0, policy_version 223690 (0.0028) [2024-06-22 14:00:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 3664953344. Throughput: 0: 43411.2. Samples: 3665027500. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 14:00:56,740][15401] Updated weights for policy 0, policy_version 223700 (0.0028) [2024-06-22 14:00:58,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.3, 300 sec: 43043.0). Total num frames: 3665182720. Throughput: 0: 43275.9. Samples: 3665286440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:00:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 14:01:00,729][15401] Updated weights for policy 0, policy_version 223710 (0.0047) [2024-06-22 14:01:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 3665362944. Throughput: 0: 43119.4. Samples: 3665547960. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 14:01:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 14:01:04,365][15401] Updated weights for policy 0, policy_version 223720 (0.0031) [2024-06-22 14:01:08,261][15401] Updated weights for policy 0, policy_version 223730 (0.0048) [2024-06-22 14:01:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 3665592320. Throughput: 0: 43156.9. Samples: 3665667760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 14:01:12,185][15401] Updated weights for policy 0, policy_version 223740 (0.0028) [2024-06-22 14:01:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3665821696. Throughput: 0: 43328.4. Samples: 3665934880. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 14:01:15,675][15401] Updated weights for policy 0, policy_version 223750 (0.0030) [2024-06-22 14:01:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 3666018304. Throughput: 0: 42994.1. Samples: 3666190420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 14:01:19,813][15401] Updated weights for policy 0, policy_version 223760 (0.0040) [2024-06-22 14:01:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3666231296. Throughput: 0: 42976.0. Samples: 3666309300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:23,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 14:01:23,549][15401] Updated weights for policy 0, policy_version 223770 (0.0024) [2024-06-22 14:01:27,382][15401] Updated weights for policy 0, policy_version 223780 (0.0027) [2024-06-22 14:01:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3666460672. Throughput: 0: 43048.0. Samples: 3666574600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:28,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 14:01:31,043][15401] Updated weights for policy 0, policy_version 223790 (0.0031) [2024-06-22 14:01:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 3666657280. Throughput: 0: 42940.4. Samples: 3666833620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:33,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 14:01:34,691][15401] Updated weights for policy 0, policy_version 223800 (0.0041) [2024-06-22 14:01:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 3666886656. Throughput: 0: 43022.2. Samples: 3666963500. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 14:01:38,441][15401] Updated weights for policy 0, policy_version 223810 (0.0045) [2024-06-22 14:01:42,224][15401] Updated weights for policy 0, policy_version 223820 (0.0032) [2024-06-22 14:01:43,390][15132] Fps is (10 sec: 45871.5, 60 sec: 42870.9, 300 sec: 43098.1). Total num frames: 3667116032. Throughput: 0: 43021.1. Samples: 3667222420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:43,391][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 14:01:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223823_3667116032.pth... [2024-06-22 14:01:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223192_3656777728.pth [2024-06-22 14:01:46,692][15401] Updated weights for policy 0, policy_version 223830 (0.0030) [2024-06-22 14:01:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 3667312640. Throughput: 0: 42926.8. Samples: 3667479660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 14:01:49,982][15401] Updated weights for policy 0, policy_version 223840 (0.0031) [2024-06-22 14:01:53,389][15132] Fps is (10 sec: 42602.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3667542016. Throughput: 0: 43060.6. Samples: 3667605480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 14:01:54,336][15401] Updated weights for policy 0, policy_version 223850 (0.0037) [2024-06-22 14:01:57,420][15401] Updated weights for policy 0, policy_version 223860 (0.0034) [2024-06-22 14:01:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 43153.8). Total num frames: 3667755008. Throughput: 0: 42996.9. Samples: 3667869740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:01:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 14:02:01,821][15401] Updated weights for policy 0, policy_version 223870 (0.0043) [2024-06-22 14:02:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3667951616. Throughput: 0: 43014.2. Samples: 3668126060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:02:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 14:02:05,144][15401] Updated weights for policy 0, policy_version 223880 (0.0043) [2024-06-22 14:02:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 43043.0). Total num frames: 3668180992. Throughput: 0: 43108.8. Samples: 3668249200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:02:08,396][15132] Avg episode reward: [(0, '0.129')] [2024-06-22 14:02:09,262][15401] Updated weights for policy 0, policy_version 223890 (0.0031) [2024-06-22 14:02:12,902][15401] Updated weights for policy 0, policy_version 223900 (0.0029) [2024-06-22 14:02:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 3668393984. Throughput: 0: 42968.4. Samples: 3668508180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 14:02:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 14:02:15,428][15349] Signal inference workers to stop experience collection... (54250 times) [2024-06-22 14:02:15,429][15349] Signal inference workers to resume experience collection... (54250 times) [2024-06-22 14:02:15,449][15401] InferenceWorker_p0-w0: stopping experience collection (54250 times) [2024-06-22 14:02:15,449][15401] InferenceWorker_p0-w0: resuming experience collection (54250 times) [2024-06-22 14:02:17,321][15401] Updated weights for policy 0, policy_version 223910 (0.0039) [2024-06-22 14:02:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42931.8). Total num frames: 3668590592. Throughput: 0: 42957.4. Samples: 3668766700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 14:02:20,493][15401] Updated weights for policy 0, policy_version 223920 (0.0031) [2024-06-22 14:02:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3668803584. Throughput: 0: 42753.5. Samples: 3668887400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:02:25,318][15401] Updated weights for policy 0, policy_version 223930 (0.0026) [2024-06-22 14:02:28,287][15401] Updated weights for policy 0, policy_version 223940 (0.0028) [2024-06-22 14:02:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3669032960. Throughput: 0: 42605.2. Samples: 3669139620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 14:02:32,764][15401] Updated weights for policy 0, policy_version 223950 (0.0033) [2024-06-22 14:02:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3669213184. Throughput: 0: 42764.3. Samples: 3669404060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 14:02:36,099][15401] Updated weights for policy 0, policy_version 223960 (0.0033) [2024-06-22 14:02:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 3669458944. Throughput: 0: 42586.2. Samples: 3669521860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 14:02:40,276][15401] Updated weights for policy 0, policy_version 223970 (0.0032) [2024-06-22 14:02:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.9, 300 sec: 42987.2). Total num frames: 3669671936. Throughput: 0: 42459.5. Samples: 3669780420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 14:02:43,607][15401] Updated weights for policy 0, policy_version 223980 (0.0030) [2024-06-22 14:02:47,847][15401] Updated weights for policy 0, policy_version 223990 (0.0036) [2024-06-22 14:02:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 3669852160. Throughput: 0: 42393.0. Samples: 3670033740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 14:02:51,411][15401] Updated weights for policy 0, policy_version 224000 (0.0030) [2024-06-22 14:02:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 3670081536. Throughput: 0: 42492.1. Samples: 3670161340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 14:02:55,322][15401] Updated weights for policy 0, policy_version 224010 (0.0032) [2024-06-22 14:02:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 3670278144. Throughput: 0: 42357.9. Samples: 3670414280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:02:58,396][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 14:02:59,319][15401] Updated weights for policy 0, policy_version 224020 (0.0040) [2024-06-22 14:03:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3670491136. Throughput: 0: 42184.9. Samples: 3670665020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:03:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 14:03:03,406][15401] Updated weights for policy 0, policy_version 224030 (0.0026) [2024-06-22 14:03:06,909][15401] Updated weights for policy 0, policy_version 224040 (0.0043) [2024-06-22 14:03:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3670720512. Throughput: 0: 42386.0. Samples: 3670794780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:03:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 14:03:11,005][15401] Updated weights for policy 0, policy_version 224050 (0.0033) [2024-06-22 14:03:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 42765.0). Total num frames: 3670884352. Throughput: 0: 42334.7. Samples: 3671044680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:03:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 14:03:14,695][15401] Updated weights for policy 0, policy_version 224060 (0.0039) [2024-06-22 14:03:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 3671130112. Throughput: 0: 42128.6. Samples: 3671299840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:03:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 14:03:18,651][15401] Updated weights for policy 0, policy_version 224070 (0.0045) [2024-06-22 14:03:22,491][15401] Updated weights for policy 0, policy_version 224080 (0.0028) [2024-06-22 14:03:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3671359488. Throughput: 0: 42466.2. Samples: 3671432840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 14:03:26,159][15401] Updated weights for policy 0, policy_version 224090 (0.0034) [2024-06-22 14:03:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 3671539712. Throughput: 0: 42353.4. Samples: 3671686320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 14:03:30,006][15401] Updated weights for policy 0, policy_version 224100 (0.0038) [2024-06-22 14:03:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3671785472. Throughput: 0: 42481.7. Samples: 3671945420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 14:03:33,898][15401] Updated weights for policy 0, policy_version 224110 (0.0033) [2024-06-22 14:03:37,681][15401] Updated weights for policy 0, policy_version 224120 (0.0055) [2024-06-22 14:03:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 3671998464. Throughput: 0: 42519.9. Samples: 3672074740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 14:03:41,425][15401] Updated weights for policy 0, policy_version 224130 (0.0028) [2024-06-22 14:03:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 3672195072. Throughput: 0: 42505.6. Samples: 3672327040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 14:03:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224133_3672195072.pth... [2024-06-22 14:03:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223508_3661955072.pth [2024-06-22 14:03:44,803][15349] Signal inference workers to stop experience collection... (54300 times) [2024-06-22 14:03:44,803][15349] Signal inference workers to resume experience collection... (54300 times) [2024-06-22 14:03:44,833][15401] InferenceWorker_p0-w0: stopping experience collection (54300 times) [2024-06-22 14:03:44,833][15401] InferenceWorker_p0-w0: resuming experience collection (54300 times) [2024-06-22 14:03:45,462][15401] Updated weights for policy 0, policy_version 224140 (0.0040) [2024-06-22 14:03:48,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 3672408064. Throughput: 0: 42552.6. Samples: 3672579880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 14:03:49,092][15401] Updated weights for policy 0, policy_version 224150 (0.0046) [2024-06-22 14:03:52,912][15401] Updated weights for policy 0, policy_version 224160 (0.0025) [2024-06-22 14:03:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3672637440. Throughput: 0: 42636.2. Samples: 3672713400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 14:03:56,569][15401] Updated weights for policy 0, policy_version 224170 (0.0024) [2024-06-22 14:03:58,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3672850432. Throughput: 0: 42764.0. Samples: 3672969160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:03:58,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:04:00,854][15401] Updated weights for policy 0, policy_version 224180 (0.0030) [2024-06-22 14:04:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3673063424. Throughput: 0: 42840.2. Samples: 3673227660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 14:04:04,431][15401] Updated weights for policy 0, policy_version 224190 (0.0026) [2024-06-22 14:04:08,215][15401] Updated weights for policy 0, policy_version 224200 (0.0033) [2024-06-22 14:04:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3673292800. Throughput: 0: 42837.3. Samples: 3673360520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 14:04:12,265][15401] Updated weights for policy 0, policy_version 224210 (0.0036) [2024-06-22 14:04:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 3673489408. Throughput: 0: 42863.6. Samples: 3673615180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 14:04:15,833][15401] Updated weights for policy 0, policy_version 224220 (0.0038) [2024-06-22 14:04:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42988.1). Total num frames: 3673702400. Throughput: 0: 42729.8. Samples: 3673868260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 14:04:19,880][15401] Updated weights for policy 0, policy_version 224230 (0.0026) [2024-06-22 14:04:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3673931776. Throughput: 0: 42926.6. Samples: 3674006440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 14:04:23,527][15401] Updated weights for policy 0, policy_version 224240 (0.0033) [2024-06-22 14:04:27,501][15401] Updated weights for policy 0, policy_version 224250 (0.0037) [2024-06-22 14:04:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42931.6). Total num frames: 3674128384. Throughput: 0: 42956.9. Samples: 3674260200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:04:28,401][15132] Avg episode reward: [(0, '0.252')] [2024-06-22 14:04:31,211][15401] Updated weights for policy 0, policy_version 224260 (0.0031) [2024-06-22 14:04:33,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3674357760. Throughput: 0: 43051.8. Samples: 3674517220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:33,392][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 14:04:34,986][15401] Updated weights for policy 0, policy_version 224270 (0.0027) [2024-06-22 14:04:38,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3674587136. Throughput: 0: 42993.2. Samples: 3674648100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 14:04:38,874][15401] Updated weights for policy 0, policy_version 224280 (0.0035) [2024-06-22 14:04:42,891][15401] Updated weights for policy 0, policy_version 224290 (0.0046) [2024-06-22 14:04:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3674767360. Throughput: 0: 42978.2. Samples: 3674903080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 14:04:47,051][15401] Updated weights for policy 0, policy_version 224300 (0.0036) [2024-06-22 14:04:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3674996736. Throughput: 0: 42838.0. Samples: 3675155360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 14:04:50,952][15401] Updated weights for policy 0, policy_version 224310 (0.0024) [2024-06-22 14:04:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3675209728. Throughput: 0: 42847.3. Samples: 3675288640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 14:04:54,557][15401] Updated weights for policy 0, policy_version 224320 (0.0049) [2024-06-22 14:04:58,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42595.6, 300 sec: 42819.6). Total num frames: 3675406336. Throughput: 0: 42814.8. Samples: 3675542120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:04:58,396][15132] Avg episode reward: [(0, '0.195')] [2024-06-22 14:04:58,553][15401] Updated weights for policy 0, policy_version 224330 (0.0030) [2024-06-22 14:05:02,018][15401] Updated weights for policy 0, policy_version 224340 (0.0030) [2024-06-22 14:05:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 3675652096. Throughput: 0: 42946.2. Samples: 3675800840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:03,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 14:05:05,948][15401] Updated weights for policy 0, policy_version 224350 (0.0040) [2024-06-22 14:05:08,390][15132] Fps is (10 sec: 45904.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3675865088. Throughput: 0: 42854.0. Samples: 3675934860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 14:05:09,447][15401] Updated weights for policy 0, policy_version 224360 (0.0023) [2024-06-22 14:05:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3676061696. Throughput: 0: 42929.0. Samples: 3676191900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 14:05:13,476][15401] Updated weights for policy 0, policy_version 224370 (0.0030) [2024-06-22 14:05:16,748][15349] Signal inference workers to stop experience collection... (54350 times) [2024-06-22 14:05:16,792][15401] InferenceWorker_p0-w0: stopping experience collection (54350 times) [2024-06-22 14:05:16,859][15349] Signal inference workers to resume experience collection... (54350 times) [2024-06-22 14:05:16,859][15401] InferenceWorker_p0-w0: resuming experience collection (54350 times) [2024-06-22 14:05:16,999][15401] Updated weights for policy 0, policy_version 224380 (0.0031) [2024-06-22 14:05:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3676291072. Throughput: 0: 42876.2. Samples: 3676446640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 14:05:21,208][15401] Updated weights for policy 0, policy_version 224390 (0.0036) [2024-06-22 14:05:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3676487680. Throughput: 0: 42819.6. Samples: 3676574980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 14:05:24,641][15401] Updated weights for policy 0, policy_version 224400 (0.0035) [2024-06-22 14:05:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 3676700672. Throughput: 0: 42708.6. Samples: 3676824960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 14:05:28,914][15401] Updated weights for policy 0, policy_version 224410 (0.0039) [2024-06-22 14:05:32,662][15401] Updated weights for policy 0, policy_version 224420 (0.0035) [2024-06-22 14:05:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3676930048. Throughput: 0: 42808.8. Samples: 3677081760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 14:05:36,995][15401] Updated weights for policy 0, policy_version 224430 (0.0033) [2024-06-22 14:05:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3677126656. Throughput: 0: 42794.2. Samples: 3677214380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 14:05:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 14:05:40,459][15401] Updated weights for policy 0, policy_version 224440 (0.0036) [2024-06-22 14:05:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3677356032. Throughput: 0: 42707.9. Samples: 3677463700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:05:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 14:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224448_3677356032.pth... [2024-06-22 14:05:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000223823_3667116032.pth [2024-06-22 14:05:44,511][15401] Updated weights for policy 0, policy_version 224450 (0.0032) [2024-06-22 14:05:47,918][15401] Updated weights for policy 0, policy_version 224460 (0.0034) [2024-06-22 14:05:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3677569024. Throughput: 0: 42810.2. Samples: 3677727300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:05:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 14:05:51,976][15401] Updated weights for policy 0, policy_version 224470 (0.0036) [2024-06-22 14:05:53,390][15132] Fps is (10 sec: 42595.6, 60 sec: 42871.0, 300 sec: 42709.4). Total num frames: 3677782016. Throughput: 0: 42715.1. Samples: 3677857060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:05:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 14:05:55,828][15401] Updated weights for policy 0, policy_version 224480 (0.0043) [2024-06-22 14:05:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 3677995008. Throughput: 0: 42619.5. Samples: 3678109780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:05:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 14:05:59,728][15401] Updated weights for policy 0, policy_version 224490 (0.0047) [2024-06-22 14:06:03,390][15132] Fps is (10 sec: 40962.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3678191616. Throughput: 0: 42692.3. Samples: 3678367800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 14:06:03,450][15401] Updated weights for policy 0, policy_version 224500 (0.0037) [2024-06-22 14:06:07,294][15401] Updated weights for policy 0, policy_version 224510 (0.0038) [2024-06-22 14:06:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3678420992. Throughput: 0: 42637.7. Samples: 3678493680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:08,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 14:06:11,086][15401] Updated weights for policy 0, policy_version 224520 (0.0028) [2024-06-22 14:06:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3678633984. Throughput: 0: 42800.3. Samples: 3678750980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 14:06:15,164][15401] Updated weights for policy 0, policy_version 224530 (0.0033) [2024-06-22 14:06:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3678846976. Throughput: 0: 42712.8. Samples: 3679003940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:18,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 14:06:18,723][15401] Updated weights for policy 0, policy_version 224540 (0.0031) [2024-06-22 14:06:22,690][15401] Updated weights for policy 0, policy_version 224550 (0.0031) [2024-06-22 14:06:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3679059968. Throughput: 0: 42652.0. Samples: 3679133720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 14:06:26,146][15401] Updated weights for policy 0, policy_version 224560 (0.0032) [2024-06-22 14:06:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3679272960. Throughput: 0: 42943.5. Samples: 3679396160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 14:06:30,142][15401] Updated weights for policy 0, policy_version 224570 (0.0039) [2024-06-22 14:06:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3679502336. Throughput: 0: 42631.2. Samples: 3679645700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:33,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 14:06:33,595][15401] Updated weights for policy 0, policy_version 224580 (0.0029) [2024-06-22 14:06:37,854][15401] Updated weights for policy 0, policy_version 224590 (0.0037) [2024-06-22 14:06:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 3679698944. Throughput: 0: 42729.9. Samples: 3679779880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:38,396][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 14:06:41,033][15401] Updated weights for policy 0, policy_version 224600 (0.0026) [2024-06-22 14:06:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3679911936. Throughput: 0: 42910.7. Samples: 3680040760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 14:06:45,525][15401] Updated weights for policy 0, policy_version 224610 (0.0029) [2024-06-22 14:06:47,414][15349] Signal inference workers to stop experience collection... (54400 times) [2024-06-22 14:06:47,416][15349] Signal inference workers to resume experience collection... (54400 times) [2024-06-22 14:06:47,444][15401] InferenceWorker_p0-w0: stopping experience collection (54400 times) [2024-06-22 14:06:47,444][15401] InferenceWorker_p0-w0: resuming experience collection (54400 times) [2024-06-22 14:06:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3680157696. Throughput: 0: 42762.3. Samples: 3680292100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 14:06:48,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 14:06:48,402][15401] Updated weights for policy 0, policy_version 224620 (0.0026) [2024-06-22 14:06:53,343][15401] Updated weights for policy 0, policy_version 224630 (0.0041) [2024-06-22 14:06:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.8, 300 sec: 42654.0). Total num frames: 3680337920. Throughput: 0: 42885.4. Samples: 3680423520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:06:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 14:06:56,053][15401] Updated weights for policy 0, policy_version 224640 (0.0034) [2024-06-22 14:06:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3680534528. Throughput: 0: 42894.8. Samples: 3680681240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:06:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:07:00,760][15401] Updated weights for policy 0, policy_version 224650 (0.0032) [2024-06-22 14:07:03,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 3680813056. Throughput: 0: 42948.5. Samples: 3680936520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:07:03,401][15401] Updated weights for policy 0, policy_version 224660 (0.0040) [2024-06-22 14:07:08,106][15401] Updated weights for policy 0, policy_version 224670 (0.0034) [2024-06-22 14:07:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3680993280. Throughput: 0: 43198.1. Samples: 3681077640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 14:07:10,830][15401] Updated weights for policy 0, policy_version 224680 (0.0023) [2024-06-22 14:07:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3681189888. Throughput: 0: 43012.0. Samples: 3681331700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 14:07:16,145][15401] Updated weights for policy 0, policy_version 224690 (0.0043) [2024-06-22 14:07:18,371][15401] Updated weights for policy 0, policy_version 224700 (0.0024) [2024-06-22 14:07:18,389][15132] Fps is (10 sec: 49152.8, 60 sec: 43965.5, 300 sec: 42987.2). Total num frames: 3681484800. Throughput: 0: 42956.5. Samples: 3681578740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:18,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 14:07:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 3681615872. Throughput: 0: 43215.4. Samples: 3681724680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:23,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 14:07:23,740][15401] Updated weights for policy 0, policy_version 224710 (0.0033) [2024-06-22 14:07:26,514][15401] Updated weights for policy 0, policy_version 224720 (0.0041) [2024-06-22 14:07:28,389][15132] Fps is (10 sec: 34406.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3681828864. Throughput: 0: 43004.0. Samples: 3681975940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 14:07:31,366][15401] Updated weights for policy 0, policy_version 224730 (0.0026) [2024-06-22 14:07:33,392][15132] Fps is (10 sec: 49152.7, 60 sec: 43416.0, 300 sec: 42875.8). Total num frames: 3682107392. Throughput: 0: 42976.1. Samples: 3682226120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:33,404][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 14:07:33,964][15401] Updated weights for policy 0, policy_version 224740 (0.0035) [2024-06-22 14:07:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3682254848. Throughput: 0: 43259.1. Samples: 3682370180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 14:07:38,946][15401] Updated weights for policy 0, policy_version 224750 (0.0047) [2024-06-22 14:07:40,695][15349] Signal inference workers to stop experience collection... (54450 times) [2024-06-22 14:07:40,744][15401] InferenceWorker_p0-w0: stopping experience collection (54450 times) [2024-06-22 14:07:40,752][15349] Signal inference workers to resume experience collection... (54450 times) [2024-06-22 14:07:40,768][15401] InferenceWorker_p0-w0: resuming experience collection (54450 times) [2024-06-22 14:07:42,218][15401] Updated weights for policy 0, policy_version 224760 (0.0036) [2024-06-22 14:07:43,390][15132] Fps is (10 sec: 37691.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3682484224. Throughput: 0: 42879.0. Samples: 3682610800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 14:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224761_3682484224.pth... [2024-06-22 14:07:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224133_3672195072.pth [2024-06-22 14:07:46,870][15401] Updated weights for policy 0, policy_version 224770 (0.0033) [2024-06-22 14:07:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3682729984. Throughput: 0: 42840.5. Samples: 3682864340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 14:07:50,103][15401] Updated weights for policy 0, policy_version 224780 (0.0038) [2024-06-22 14:07:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3682893824. Throughput: 0: 42692.1. Samples: 3682998780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-22 14:07:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 14:07:54,521][15401] Updated weights for policy 0, policy_version 224790 (0.0041) [2024-06-22 14:07:57,494][15401] Updated weights for policy 0, policy_version 224800 (0.0031) [2024-06-22 14:07:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3683139584. Throughput: 0: 42578.1. Samples: 3683247720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:07:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 14:08:02,005][15401] Updated weights for policy 0, policy_version 224810 (0.0040) [2024-06-22 14:08:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3683368960. Throughput: 0: 42789.7. Samples: 3683504280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 14:08:04,851][15401] Updated weights for policy 0, policy_version 224820 (0.0030) [2024-06-22 14:08:08,396][15132] Fps is (10 sec: 40934.4, 60 sec: 42594.0, 300 sec: 42930.7). Total num frames: 3683549184. Throughput: 0: 42547.8. Samples: 3683639500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:08,396][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 14:08:09,635][15401] Updated weights for policy 0, policy_version 224830 (0.0034) [2024-06-22 14:08:13,151][15401] Updated weights for policy 0, policy_version 224840 (0.0032) [2024-06-22 14:08:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3683778560. Throughput: 0: 42479.9. Samples: 3683887540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 14:08:17,454][15401] Updated weights for policy 0, policy_version 224850 (0.0039) [2024-06-22 14:08:18,389][15132] Fps is (10 sec: 47543.8, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3684024320. Throughput: 0: 42721.2. Samples: 3684148480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 14:08:20,653][15401] Updated weights for policy 0, policy_version 224860 (0.0043) [2024-06-22 14:08:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 3684188160. Throughput: 0: 42467.4. Samples: 3684281220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 14:08:24,835][15401] Updated weights for policy 0, policy_version 224870 (0.0036) [2024-06-22 14:08:28,071][15401] Updated weights for policy 0, policy_version 224880 (0.0029) [2024-06-22 14:08:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3684433920. Throughput: 0: 42668.2. Samples: 3684530860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 14:08:32,550][15401] Updated weights for policy 0, policy_version 224890 (0.0045) [2024-06-22 14:08:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 3684646912. Throughput: 0: 42909.2. Samples: 3684795260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 14:08:35,539][15401] Updated weights for policy 0, policy_version 224900 (0.0034) [2024-06-22 14:08:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3684843520. Throughput: 0: 42817.8. Samples: 3684925580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 14:08:40,059][15401] Updated weights for policy 0, policy_version 224910 (0.0030) [2024-06-22 14:08:43,319][15401] Updated weights for policy 0, policy_version 224920 (0.0038) [2024-06-22 14:08:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 3685089280. Throughput: 0: 42929.0. Samples: 3685179520. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 14:08:47,612][15401] Updated weights for policy 0, policy_version 224930 (0.0028) [2024-06-22 14:08:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3685269504. Throughput: 0: 43108.6. Samples: 3685444160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 14:08:48,636][15349] Signal inference workers to stop experience collection... (54500 times) [2024-06-22 14:08:48,636][15349] Signal inference workers to resume experience collection... (54500 times) [2024-06-22 14:08:48,674][15401] InferenceWorker_p0-w0: stopping experience collection (54500 times) [2024-06-22 14:08:48,674][15401] InferenceWorker_p0-w0: resuming experience collection (54500 times) [2024-06-22 14:08:51,206][15401] Updated weights for policy 0, policy_version 224940 (0.0034) [2024-06-22 14:08:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3685482496. Throughput: 0: 42863.4. Samples: 3685568080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 14:08:55,775][15401] Updated weights for policy 0, policy_version 224950 (0.0031) [2024-06-22 14:08:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3685711872. Throughput: 0: 42822.2. Samples: 3685814540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:08:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 14:08:59,163][15401] Updated weights for policy 0, policy_version 224960 (0.0027) [2024-06-22 14:09:03,332][15401] Updated weights for policy 0, policy_version 224970 (0.0024) [2024-06-22 14:09:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3685908480. Throughput: 0: 42981.8. Samples: 3686082660. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-22 14:09:03,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 14:09:06,856][15401] Updated weights for policy 0, policy_version 224980 (0.0027) [2024-06-22 14:09:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 3686121472. Throughput: 0: 42647.2. Samples: 3686200340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 14:09:11,034][15401] Updated weights for policy 0, policy_version 224990 (0.0042) [2024-06-22 14:09:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3686334464. Throughput: 0: 42807.1. Samples: 3686457180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:13,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-22 14:09:14,485][15401] Updated weights for policy 0, policy_version 225000 (0.0030) [2024-06-22 14:09:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 3686531072. Throughput: 0: 42674.3. Samples: 3686715600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:18,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-22 14:09:18,661][15401] Updated weights for policy 0, policy_version 225010 (0.0023) [2024-06-22 14:09:22,254][15401] Updated weights for policy 0, policy_version 225020 (0.0032) [2024-06-22 14:09:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 3686760448. Throughput: 0: 42468.0. Samples: 3686836640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 14:09:26,327][15401] Updated weights for policy 0, policy_version 225030 (0.0035) [2024-06-22 14:09:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3686973440. Throughput: 0: 42502.6. Samples: 3687092140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 14:09:29,853][15401] Updated weights for policy 0, policy_version 225040 (0.0029) [2024-06-22 14:09:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3687186432. Throughput: 0: 42359.0. Samples: 3687350320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 14:09:33,881][15401] Updated weights for policy 0, policy_version 225050 (0.0039) [2024-06-22 14:09:37,394][15401] Updated weights for policy 0, policy_version 225060 (0.0036) [2024-06-22 14:09:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 3687383040. Throughput: 0: 42475.0. Samples: 3687479560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:38,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 14:09:41,662][15401] Updated weights for policy 0, policy_version 225070 (0.0036) [2024-06-22 14:09:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 3687612416. Throughput: 0: 42742.6. Samples: 3687737960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 14:09:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225074_3687612416.pth... [2024-06-22 14:09:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224448_3677356032.pth [2024-06-22 14:09:45,095][15401] Updated weights for policy 0, policy_version 225080 (0.0029) [2024-06-22 14:09:48,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3687825408. Throughput: 0: 42501.8. Samples: 3687995240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 14:09:49,216][15401] Updated weights for policy 0, policy_version 225090 (0.0039) [2024-06-22 14:09:52,758][15401] Updated weights for policy 0, policy_version 225100 (0.0054) [2024-06-22 14:09:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 3688038400. Throughput: 0: 42750.6. Samples: 3688124120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 14:09:56,806][15401] Updated weights for policy 0, policy_version 225110 (0.0027) [2024-06-22 14:09:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 3688251392. Throughput: 0: 42643.0. Samples: 3688376220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:09:58,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 14:10:00,281][15401] Updated weights for policy 0, policy_version 225120 (0.0026) [2024-06-22 14:10:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3688464384. Throughput: 0: 42775.1. Samples: 3688640480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:10:03,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 14:10:04,263][15401] Updated weights for policy 0, policy_version 225130 (0.0029) [2024-06-22 14:10:06,301][15349] Signal inference workers to stop experience collection... (54550 times) [2024-06-22 14:10:06,313][15401] InferenceWorker_p0-w0: stopping experience collection (54550 times) [2024-06-22 14:10:06,359][15349] Signal inference workers to resume experience collection... (54550 times) [2024-06-22 14:10:06,360][15401] InferenceWorker_p0-w0: resuming experience collection (54550 times) [2024-06-22 14:10:08,068][15401] Updated weights for policy 0, policy_version 225140 (0.0049) [2024-06-22 14:10:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3688693760. Throughput: 0: 42981.8. Samples: 3688770820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:10:08,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 14:10:11,831][15401] Updated weights for policy 0, policy_version 225150 (0.0028) [2024-06-22 14:10:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3688890368. Throughput: 0: 42905.3. Samples: 3689022880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 14:10:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 14:10:15,518][15401] Updated weights for policy 0, policy_version 225160 (0.0039) [2024-06-22 14:10:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3689119744. Throughput: 0: 43107.0. Samples: 3689290140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:18,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 14:10:19,330][15401] Updated weights for policy 0, policy_version 225170 (0.0041) [2024-06-22 14:10:22,986][15401] Updated weights for policy 0, policy_version 225180 (0.0027) [2024-06-22 14:10:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3689349120. Throughput: 0: 43133.5. Samples: 3689420460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 14:10:26,809][15401] Updated weights for policy 0, policy_version 225190 (0.0033) [2024-06-22 14:10:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3689545728. Throughput: 0: 42917.8. Samples: 3689669260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 14:10:31,063][15401] Updated weights for policy 0, policy_version 225200 (0.0028) [2024-06-22 14:10:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3689742336. Throughput: 0: 43115.5. Samples: 3689935440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 14:10:34,509][15401] Updated weights for policy 0, policy_version 225210 (0.0023) [2024-06-22 14:10:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 3689971712. Throughput: 0: 42965.7. Samples: 3690057580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 14:10:38,719][15401] Updated weights for policy 0, policy_version 225220 (0.0054) [2024-06-22 14:10:42,077][15401] Updated weights for policy 0, policy_version 225230 (0.0036) [2024-06-22 14:10:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3690201088. Throughput: 0: 43009.9. Samples: 3690311560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 14:10:46,218][15401] Updated weights for policy 0, policy_version 225240 (0.0030) [2024-06-22 14:10:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 3690397696. Throughput: 0: 43075.3. Samples: 3690578860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 14:10:49,887][15401] Updated weights for policy 0, policy_version 225250 (0.0028) [2024-06-22 14:10:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3690610688. Throughput: 0: 42967.0. Samples: 3690704340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 14:10:53,848][15401] Updated weights for policy 0, policy_version 225260 (0.0041) [2024-06-22 14:10:57,666][15401] Updated weights for policy 0, policy_version 225270 (0.0031) [2024-06-22 14:10:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 3690840064. Throughput: 0: 43002.6. Samples: 3690958000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:10:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 14:11:01,471][15401] Updated weights for policy 0, policy_version 225280 (0.0028) [2024-06-22 14:11:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3691036672. Throughput: 0: 42836.9. Samples: 3691217800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:11:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 14:11:05,238][15401] Updated weights for policy 0, policy_version 225290 (0.0053) [2024-06-22 14:11:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3691249664. Throughput: 0: 42674.2. Samples: 3691340800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:11:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 14:11:09,643][15401] Updated weights for policy 0, policy_version 225300 (0.0029) [2024-06-22 14:11:13,187][15401] Updated weights for policy 0, policy_version 225310 (0.0036) [2024-06-22 14:11:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42820.6). Total num frames: 3691479040. Throughput: 0: 42839.1. Samples: 3691597120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:11:13,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 14:11:17,162][15401] Updated weights for policy 0, policy_version 225320 (0.0027) [2024-06-22 14:11:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3691692032. Throughput: 0: 42720.5. Samples: 3691857860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 14:11:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 14:11:20,595][15401] Updated weights for policy 0, policy_version 225330 (0.0026) [2024-06-22 14:11:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3691905024. Throughput: 0: 42789.8. Samples: 3691983120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 14:11:24,790][15401] Updated weights for policy 0, policy_version 225340 (0.0026) [2024-06-22 14:11:27,457][15349] Signal inference workers to stop experience collection... (54600 times) [2024-06-22 14:11:27,511][15401] InferenceWorker_p0-w0: stopping experience collection (54600 times) [2024-06-22 14:11:27,517][15349] Signal inference workers to resume experience collection... (54600 times) [2024-06-22 14:11:27,526][15401] InferenceWorker_p0-w0: resuming experience collection (54600 times) [2024-06-22 14:11:28,098][15401] Updated weights for policy 0, policy_version 225350 (0.0024) [2024-06-22 14:11:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3692134400. Throughput: 0: 42892.9. Samples: 3692241740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 14:11:32,272][15401] Updated weights for policy 0, policy_version 225360 (0.0034) [2024-06-22 14:11:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 3692347392. Throughput: 0: 42705.6. Samples: 3692500620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 14:11:35,652][15401] Updated weights for policy 0, policy_version 225370 (0.0047) [2024-06-22 14:11:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3692544000. Throughput: 0: 42727.6. Samples: 3692627080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 14:11:39,857][15401] Updated weights for policy 0, policy_version 225380 (0.0041) [2024-06-22 14:11:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3692756992. Throughput: 0: 42844.0. Samples: 3692885980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 14:11:43,581][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225390_3692789760.pth... [2024-06-22 14:11:43,583][15401] Updated weights for policy 0, policy_version 225390 (0.0021) [2024-06-22 14:11:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000224761_3682484224.pth [2024-06-22 14:11:47,625][15401] Updated weights for policy 0, policy_version 225400 (0.0042) [2024-06-22 14:11:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3693002752. Throughput: 0: 42820.5. Samples: 3693144720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 14:11:51,297][15401] Updated weights for policy 0, policy_version 225410 (0.0034) [2024-06-22 14:11:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3693199360. Throughput: 0: 42884.0. Samples: 3693270580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 14:11:55,466][15401] Updated weights for policy 0, policy_version 225420 (0.0041) [2024-06-22 14:11:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3693395968. Throughput: 0: 42793.1. Samples: 3693522700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:11:58,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 14:11:59,024][15401] Updated weights for policy 0, policy_version 225430 (0.0042) [2024-06-22 14:12:03,098][15401] Updated weights for policy 0, policy_version 225440 (0.0025) [2024-06-22 14:12:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3693625344. Throughput: 0: 42826.3. Samples: 3693785040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:03,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-22 14:12:06,678][15401] Updated weights for policy 0, policy_version 225450 (0.0041) [2024-06-22 14:12:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3693821952. Throughput: 0: 42868.6. Samples: 3693912200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 14:12:10,777][15401] Updated weights for policy 0, policy_version 225460 (0.0025) [2024-06-22 14:12:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 3694034944. Throughput: 0: 42687.0. Samples: 3694162660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 14:12:14,570][15401] Updated weights for policy 0, policy_version 225470 (0.0031) [2024-06-22 14:12:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 3694247936. Throughput: 0: 42797.9. Samples: 3694426520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 14:12:18,463][15401] Updated weights for policy 0, policy_version 225480 (0.0025) [2024-06-22 14:12:22,192][15401] Updated weights for policy 0, policy_version 225490 (0.0037) [2024-06-22 14:12:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3694460928. Throughput: 0: 42743.1. Samples: 3694550520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 14:12:26,406][15401] Updated weights for policy 0, policy_version 225500 (0.0029) [2024-06-22 14:12:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 3694690304. Throughput: 0: 42708.6. Samples: 3694807860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 14:12:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 14:12:29,724][15401] Updated weights for policy 0, policy_version 225510 (0.0043) [2024-06-22 14:12:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3694886912. Throughput: 0: 42682.3. Samples: 3695065420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 14:12:33,899][15401] Updated weights for policy 0, policy_version 225520 (0.0031) [2024-06-22 14:12:37,270][15401] Updated weights for policy 0, policy_version 225530 (0.0032) [2024-06-22 14:12:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3695099904. Throughput: 0: 42615.2. Samples: 3695188260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 14:12:41,371][15401] Updated weights for policy 0, policy_version 225540 (0.0024) [2024-06-22 14:12:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3695312896. Throughput: 0: 42865.2. Samples: 3695451640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 14:12:44,978][15401] Updated weights for policy 0, policy_version 225550 (0.0027) [2024-06-22 14:12:47,902][15349] Signal inference workers to stop experience collection... (54650 times) [2024-06-22 14:12:47,953][15401] InferenceWorker_p0-w0: stopping experience collection (54650 times) [2024-06-22 14:12:47,960][15349] Signal inference workers to resume experience collection... (54650 times) [2024-06-22 14:12:47,968][15401] InferenceWorker_p0-w0: resuming experience collection (54650 times) [2024-06-22 14:12:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3695542272. Throughput: 0: 42711.0. Samples: 3695707040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 14:12:49,170][15401] Updated weights for policy 0, policy_version 225560 (0.0032) [2024-06-22 14:12:52,578][15401] Updated weights for policy 0, policy_version 225570 (0.0023) [2024-06-22 14:12:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3695738880. Throughput: 0: 42671.1. Samples: 3695832400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 14:12:56,789][15401] Updated weights for policy 0, policy_version 225580 (0.0040) [2024-06-22 14:12:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3695968256. Throughput: 0: 42839.1. Samples: 3696090420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:12:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 14:13:00,320][15401] Updated weights for policy 0, policy_version 225590 (0.0042) [2024-06-22 14:13:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 3696181248. Throughput: 0: 42780.0. Samples: 3696351620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 14:13:04,450][15401] Updated weights for policy 0, policy_version 225600 (0.0035) [2024-06-22 14:13:08,001][15401] Updated weights for policy 0, policy_version 225610 (0.0030) [2024-06-22 14:13:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3696394240. Throughput: 0: 42768.3. Samples: 3696475100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 14:13:12,044][15401] Updated weights for policy 0, policy_version 225620 (0.0030) [2024-06-22 14:13:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 3696607232. Throughput: 0: 42833.7. Samples: 3696735380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 14:13:15,786][15401] Updated weights for policy 0, policy_version 225630 (0.0043) [2024-06-22 14:13:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 3696820224. Throughput: 0: 42750.1. Samples: 3696989280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:18,392][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 14:13:19,747][15401] Updated weights for policy 0, policy_version 225640 (0.0036) [2024-06-22 14:13:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3697016832. Throughput: 0: 42729.3. Samples: 3697111080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 14:13:23,690][15401] Updated weights for policy 0, policy_version 225650 (0.0033) [2024-06-22 14:13:27,395][15401] Updated weights for policy 0, policy_version 225660 (0.0034) [2024-06-22 14:13:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3697262592. Throughput: 0: 42713.8. Samples: 3697373760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 14:13:31,333][15401] Updated weights for policy 0, policy_version 225670 (0.0044) [2024-06-22 14:13:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3697459200. Throughput: 0: 42628.4. Samples: 3697625320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 14:13:35,089][15401] Updated weights for policy 0, policy_version 225680 (0.0034) [2024-06-22 14:13:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3697672192. Throughput: 0: 42698.6. Samples: 3697753840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:13:38,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 14:13:38,983][15401] Updated weights for policy 0, policy_version 225690 (0.0028) [2024-06-22 14:13:42,887][15401] Updated weights for policy 0, policy_version 225700 (0.0031) [2024-06-22 14:13:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 3697868800. Throughput: 0: 42622.6. Samples: 3698008440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:13:43,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 14:13:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225701_3697885184.pth... [2024-06-22 14:13:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225074_3687612416.pth [2024-06-22 14:13:46,782][15401] Updated weights for policy 0, policy_version 225710 (0.0039) [2024-06-22 14:13:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3698081792. Throughput: 0: 42485.3. Samples: 3698263460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:13:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 14:13:50,440][15401] Updated weights for policy 0, policy_version 225720 (0.0050) [2024-06-22 14:13:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3698327552. Throughput: 0: 42534.3. Samples: 3698389140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:13:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 14:13:54,619][15401] Updated weights for policy 0, policy_version 225730 (0.0022) [2024-06-22 14:13:58,035][15401] Updated weights for policy 0, policy_version 225740 (0.0022) [2024-06-22 14:13:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3698524160. Throughput: 0: 42517.4. Samples: 3698648660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:13:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 14:14:02,101][15401] Updated weights for policy 0, policy_version 225750 (0.0035) [2024-06-22 14:14:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3698737152. Throughput: 0: 42533.3. Samples: 3698903180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 14:14:05,895][15401] Updated weights for policy 0, policy_version 225760 (0.0044) [2024-06-22 14:14:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3698966528. Throughput: 0: 42653.8. Samples: 3699030500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 14:14:09,895][15401] Updated weights for policy 0, policy_version 225770 (0.0034) [2024-06-22 14:14:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3699146752. Throughput: 0: 42452.5. Samples: 3699284120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 14:14:13,678][15401] Updated weights for policy 0, policy_version 225780 (0.0041) [2024-06-22 14:14:15,378][15349] Signal inference workers to stop experience collection... (54700 times) [2024-06-22 14:14:15,434][15349] Signal inference workers to resume experience collection... (54700 times) [2024-06-22 14:14:15,436][15401] InferenceWorker_p0-w0: stopping experience collection (54700 times) [2024-06-22 14:14:15,448][15401] InferenceWorker_p0-w0: resuming experience collection (54700 times) [2024-06-22 14:14:17,777][15401] Updated weights for policy 0, policy_version 225790 (0.0046) [2024-06-22 14:14:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 3699376128. Throughput: 0: 42455.8. Samples: 3699535820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 14:14:21,214][15401] Updated weights for policy 0, policy_version 225800 (0.0028) [2024-06-22 14:14:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3699605504. Throughput: 0: 42448.1. Samples: 3699664000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 14:14:25,650][15401] Updated weights for policy 0, policy_version 225810 (0.0029) [2024-06-22 14:14:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3699802112. Throughput: 0: 42557.5. Samples: 3699923520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 14:14:29,077][15401] Updated weights for policy 0, policy_version 225820 (0.0036) [2024-06-22 14:14:33,287][15401] Updated weights for policy 0, policy_version 225830 (0.0039) [2024-06-22 14:14:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 3699998720. Throughput: 0: 42696.3. Samples: 3700184800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 14:14:36,849][15401] Updated weights for policy 0, policy_version 225840 (0.0040) [2024-06-22 14:14:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3700244480. Throughput: 0: 42654.7. Samples: 3700308600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 14:14:40,679][15401] Updated weights for policy 0, policy_version 225850 (0.0022) [2024-06-22 14:14:43,391][15132] Fps is (10 sec: 42590.8, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 3700424704. Throughput: 0: 42591.5. Samples: 3700565360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:43,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 14:14:44,308][15401] Updated weights for policy 0, policy_version 225860 (0.0041) [2024-06-22 14:14:48,239][15401] Updated weights for policy 0, policy_version 225870 (0.0046) [2024-06-22 14:14:48,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 3700654080. Throughput: 0: 42585.3. Samples: 3700819620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:48,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 14:14:51,970][15401] Updated weights for policy 0, policy_version 225880 (0.0037) [2024-06-22 14:14:53,396][15132] Fps is (10 sec: 47492.2, 60 sec: 42866.9, 300 sec: 42875.5). Total num frames: 3700899840. Throughput: 0: 42616.6. Samples: 3700948520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:53,396][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 14:14:56,451][15401] Updated weights for policy 0, policy_version 225890 (0.0033) [2024-06-22 14:14:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3701063680. Throughput: 0: 42589.8. Samples: 3701200660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:14:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 14:14:59,506][15401] Updated weights for policy 0, policy_version 225900 (0.0032) [2024-06-22 14:15:03,389][15132] Fps is (10 sec: 34428.7, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 3701243904. Throughput: 0: 42951.9. Samples: 3701468660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:03,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 14:15:03,933][15401] Updated weights for policy 0, policy_version 225910 (0.0035) [2024-06-22 14:15:07,460][15401] Updated weights for policy 0, policy_version 225920 (0.0038) [2024-06-22 14:15:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3701538816. Throughput: 0: 42752.0. Samples: 3701587840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 14:15:11,834][15401] Updated weights for policy 0, policy_version 225930 (0.0032) [2024-06-22 14:15:13,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3701719040. Throughput: 0: 42617.2. Samples: 3701841300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 14:15:15,051][15349] Signal inference workers to stop experience collection... (54750 times) [2024-06-22 14:15:15,098][15349] Signal inference workers to resume experience collection... (54750 times) [2024-06-22 14:15:15,098][15401] InferenceWorker_p0-w0: stopping experience collection (54750 times) [2024-06-22 14:15:15,103][15401] Updated weights for policy 0, policy_version 225940 (0.0024) [2024-06-22 14:15:15,114][15401] InferenceWorker_p0-w0: resuming experience collection (54750 times) [2024-06-22 14:15:18,390][15132] Fps is (10 sec: 36044.2, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 3701899264. Throughput: 0: 42548.5. Samples: 3702099480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 14:15:19,704][15401] Updated weights for policy 0, policy_version 225950 (0.0043) [2024-06-22 14:15:22,656][15401] Updated weights for policy 0, policy_version 225960 (0.0039) [2024-06-22 14:15:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3702177792. Throughput: 0: 42552.4. Samples: 3702223460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 14:15:27,264][15401] Updated weights for policy 0, policy_version 225970 (0.0039) [2024-06-22 14:15:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3702358016. Throughput: 0: 42618.2. Samples: 3702483100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 14:15:30,239][15401] Updated weights for policy 0, policy_version 225980 (0.0034) [2024-06-22 14:15:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3702554624. Throughput: 0: 42525.8. Samples: 3702733180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:33,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 14:15:34,898][15401] Updated weights for policy 0, policy_version 225990 (0.0032) [2024-06-22 14:15:37,916][15401] Updated weights for policy 0, policy_version 226000 (0.0038) [2024-06-22 14:15:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3702816768. Throughput: 0: 42510.6. Samples: 3702861220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 14:15:42,424][15401] Updated weights for policy 0, policy_version 226010 (0.0036) [2024-06-22 14:15:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42872.8, 300 sec: 42709.4). Total num frames: 3702996992. Throughput: 0: 42688.7. Samples: 3703121660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:43,399][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 14:15:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226013_3702996992.pth... [2024-06-22 14:15:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225390_3692789760.pth [2024-06-22 14:15:45,575][15401] Updated weights for policy 0, policy_version 226020 (0.0031) [2024-06-22 14:15:48,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 3703193600. Throughput: 0: 42404.8. Samples: 3703376880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 14:15:49,999][15401] Updated weights for policy 0, policy_version 226030 (0.0029) [2024-06-22 14:15:53,121][15401] Updated weights for policy 0, policy_version 226040 (0.0037) [2024-06-22 14:15:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 3703455744. Throughput: 0: 42572.0. Samples: 3703503580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:15:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 14:15:57,499][15401] Updated weights for policy 0, policy_version 226050 (0.0036) [2024-06-22 14:15:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3703635968. Throughput: 0: 42792.6. Samples: 3703766960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:15:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 14:16:00,583][15401] Updated weights for policy 0, policy_version 226060 (0.0029) [2024-06-22 14:16:03,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 3703848960. Throughput: 0: 42647.1. Samples: 3704018600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 14:16:05,453][15401] Updated weights for policy 0, policy_version 226070 (0.0036) [2024-06-22 14:16:08,088][15401] Updated weights for policy 0, policy_version 226080 (0.0028) [2024-06-22 14:16:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 3704094720. Throughput: 0: 42752.4. Samples: 3704147320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 14:16:12,996][15401] Updated weights for policy 0, policy_version 226090 (0.0032) [2024-06-22 14:16:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3704274944. Throughput: 0: 42856.9. Samples: 3704411660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 14:16:15,722][15349] Signal inference workers to stop experience collection... (54800 times) [2024-06-22 14:16:15,729][15349] Signal inference workers to resume experience collection... (54800 times) [2024-06-22 14:16:15,764][15401] InferenceWorker_p0-w0: stopping experience collection (54800 times) [2024-06-22 14:16:15,764][15401] InferenceWorker_p0-w0: resuming experience collection (54800 times) [2024-06-22 14:16:15,878][15401] Updated weights for policy 0, policy_version 226100 (0.0035) [2024-06-22 14:16:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3704471552. Throughput: 0: 42799.5. Samples: 3704659160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 14:16:20,531][15401] Updated weights for policy 0, policy_version 226110 (0.0038) [2024-06-22 14:16:23,330][15401] Updated weights for policy 0, policy_version 226120 (0.0026) [2024-06-22 14:16:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3704750080. Throughput: 0: 42867.1. Samples: 3704790240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 14:16:28,030][15401] Updated weights for policy 0, policy_version 226130 (0.0038) [2024-06-22 14:16:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3704913920. Throughput: 0: 42871.7. Samples: 3705050880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 14:16:31,010][15401] Updated weights for policy 0, policy_version 226140 (0.0044) [2024-06-22 14:16:33,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3705126912. Throughput: 0: 42661.7. Samples: 3705296660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 14:16:35,772][15401] Updated weights for policy 0, policy_version 226150 (0.0034) [2024-06-22 14:16:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 3705339904. Throughput: 0: 42775.6. Samples: 3705428480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 14:16:38,897][15401] Updated weights for policy 0, policy_version 226160 (0.0036) [2024-06-22 14:16:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 3705536512. Throughput: 0: 42666.1. Samples: 3705686940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 14:16:43,558][15401] Updated weights for policy 0, policy_version 226170 (0.0028) [2024-06-22 14:16:46,433][15401] Updated weights for policy 0, policy_version 226180 (0.0034) [2024-06-22 14:16:48,391][15132] Fps is (10 sec: 40954.4, 60 sec: 42597.5, 300 sec: 42542.7). Total num frames: 3705749504. Throughput: 0: 42784.7. Samples: 3705943960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:48,391][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 14:16:51,048][15401] Updated weights for policy 0, policy_version 226190 (0.0024) [2024-06-22 14:16:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3705995264. Throughput: 0: 42737.0. Samples: 3706070480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:53,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 14:16:54,061][15401] Updated weights for policy 0, policy_version 226200 (0.0031) [2024-06-22 14:16:58,389][15132] Fps is (10 sec: 44242.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3706191872. Throughput: 0: 42672.6. Samples: 3706331920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:16:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 14:16:58,567][15401] Updated weights for policy 0, policy_version 226210 (0.0035) [2024-06-22 14:17:01,697][15401] Updated weights for policy 0, policy_version 226220 (0.0033) [2024-06-22 14:17:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 3706404864. Throughput: 0: 42932.4. Samples: 3706591220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 14:17:03,393][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 14:17:06,349][15401] Updated weights for policy 0, policy_version 226230 (0.0038) [2024-06-22 14:17:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3706650624. Throughput: 0: 42929.2. Samples: 3706722060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 14:17:09,527][15401] Updated weights for policy 0, policy_version 226240 (0.0037) [2024-06-22 14:17:13,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3706830848. Throughput: 0: 42730.9. Samples: 3706973780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:13,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-22 14:17:14,253][15401] Updated weights for policy 0, policy_version 226250 (0.0033) [2024-06-22 14:17:17,056][15401] Updated weights for policy 0, policy_version 226260 (0.0043) [2024-06-22 14:17:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3707060224. Throughput: 0: 42889.4. Samples: 3707226680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 14:17:19,914][15349] Signal inference workers to stop experience collection... (54850 times) [2024-06-22 14:17:19,914][15349] Signal inference workers to resume experience collection... (54850 times) [2024-06-22 14:17:19,929][15401] InferenceWorker_p0-w0: stopping experience collection (54850 times) [2024-06-22 14:17:19,929][15401] InferenceWorker_p0-w0: resuming experience collection (54850 times) [2024-06-22 14:17:21,715][15401] Updated weights for policy 0, policy_version 226270 (0.0031) [2024-06-22 14:17:23,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3707289600. Throughput: 0: 42894.2. Samples: 3707358720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 14:17:24,577][15401] Updated weights for policy 0, policy_version 226280 (0.0028) [2024-06-22 14:17:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3707486208. Throughput: 0: 42978.3. Samples: 3707620960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 14:17:29,202][15401] Updated weights for policy 0, policy_version 226290 (0.0028) [2024-06-22 14:17:32,208][15401] Updated weights for policy 0, policy_version 226300 (0.0038) [2024-06-22 14:17:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3707715584. Throughput: 0: 42743.4. Samples: 3707867360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 14:17:37,083][15401] Updated weights for policy 0, policy_version 226310 (0.0035) [2024-06-22 14:17:38,393][15132] Fps is (10 sec: 44222.9, 60 sec: 43142.2, 300 sec: 42764.6). Total num frames: 3707928576. Throughput: 0: 42977.4. Samples: 3708004600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:38,393][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 14:17:39,841][15401] Updated weights for policy 0, policy_version 226320 (0.0034) [2024-06-22 14:17:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3708108800. Throughput: 0: 42797.1. Samples: 3708257800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 14:17:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226326_3708125184.pth... [2024-06-22 14:17:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000225701_3697885184.pth [2024-06-22 14:17:44,922][15401] Updated weights for policy 0, policy_version 226330 (0.0037) [2024-06-22 14:17:47,936][15401] Updated weights for policy 0, policy_version 226340 (0.0032) [2024-06-22 14:17:48,390][15132] Fps is (10 sec: 42611.3, 60 sec: 43418.5, 300 sec: 42765.0). Total num frames: 3708354560. Throughput: 0: 42582.2. Samples: 3708507320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 14:17:52,381][15401] Updated weights for policy 0, policy_version 226350 (0.0036) [2024-06-22 14:17:53,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3708567552. Throughput: 0: 42718.6. Samples: 3708644400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 14:17:55,395][15401] Updated weights for policy 0, policy_version 226360 (0.0039) [2024-06-22 14:17:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3708731392. Throughput: 0: 42764.2. Samples: 3708898160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:17:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 14:17:59,950][15401] Updated weights for policy 0, policy_version 226370 (0.0026) [2024-06-22 14:18:02,909][15401] Updated weights for policy 0, policy_version 226380 (0.0034) [2024-06-22 14:18:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 3709009920. Throughput: 0: 42538.4. Samples: 3709140900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:18:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 14:18:07,546][15401] Updated weights for policy 0, policy_version 226390 (0.0033) [2024-06-22 14:18:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3709206528. Throughput: 0: 42758.9. Samples: 3709282880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:18:08,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 14:18:10,410][15401] Updated weights for policy 0, policy_version 226400 (0.0027) [2024-06-22 14:18:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 3709403136. Throughput: 0: 42520.3. Samples: 3709534380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 14:18:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 14:18:15,478][15401] Updated weights for policy 0, policy_version 226410 (0.0023) [2024-06-22 14:18:18,321][15401] Updated weights for policy 0, policy_version 226420 (0.0037) [2024-06-22 14:18:18,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 3709665280. Throughput: 0: 42472.0. Samples: 3709778600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 14:18:23,022][15401] Updated weights for policy 0, policy_version 226430 (0.0037) [2024-06-22 14:18:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 3709829120. Throughput: 0: 42552.2. Samples: 3709919320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 14:18:25,759][15401] Updated weights for policy 0, policy_version 226440 (0.0030) [2024-06-22 14:18:28,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3710025728. Throughput: 0: 42582.8. Samples: 3710174020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 14:18:30,972][15401] Updated weights for policy 0, policy_version 226450 (0.0032) [2024-06-22 14:18:31,897][15349] Signal inference workers to stop experience collection... (54900 times) [2024-06-22 14:18:31,926][15401] InferenceWorker_p0-w0: stopping experience collection (54900 times) [2024-06-22 14:18:32,012][15349] Signal inference workers to resume experience collection... (54900 times) [2024-06-22 14:18:32,012][15401] InferenceWorker_p0-w0: resuming experience collection (54900 times) [2024-06-22 14:18:33,339][15401] Updated weights for policy 0, policy_version 226460 (0.0025) [2024-06-22 14:18:33,391][15132] Fps is (10 sec: 49145.3, 60 sec: 43416.6, 300 sec: 42875.9). Total num frames: 3710320640. Throughput: 0: 42630.7. Samples: 3710425760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:33,391][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 14:18:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.6, 300 sec: 42709.5). Total num frames: 3710468096. Throughput: 0: 42826.3. Samples: 3710571580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 14:18:38,514][15401] Updated weights for policy 0, policy_version 226470 (0.0043) [2024-06-22 14:18:41,392][15401] Updated weights for policy 0, policy_version 226480 (0.0036) [2024-06-22 14:18:43,389][15132] Fps is (10 sec: 36049.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3710681088. Throughput: 0: 42747.0. Samples: 3710821780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:43,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 14:18:46,023][15401] Updated weights for policy 0, policy_version 226490 (0.0032) [2024-06-22 14:18:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3710943232. Throughput: 0: 43096.4. Samples: 3711080240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 14:18:48,893][15401] Updated weights for policy 0, policy_version 226500 (0.0031) [2024-06-22 14:18:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3711107072. Throughput: 0: 42833.0. Samples: 3711210360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 14:18:53,595][15401] Updated weights for policy 0, policy_version 226510 (0.0034) [2024-06-22 14:18:57,731][15401] Updated weights for policy 0, policy_version 226520 (0.0033) [2024-06-22 14:18:58,391][15132] Fps is (10 sec: 40951.9, 60 sec: 43689.2, 300 sec: 42764.7). Total num frames: 3711352832. Throughput: 0: 42921.3. Samples: 3711465920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:18:58,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 14:19:01,201][15401] Updated weights for policy 0, policy_version 226530 (0.0033) [2024-06-22 14:19:03,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3711582208. Throughput: 0: 43100.4. Samples: 3711718120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:19:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 14:19:05,097][15401] Updated weights for policy 0, policy_version 226540 (0.0037) [2024-06-22 14:19:08,390][15132] Fps is (10 sec: 40967.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3711762432. Throughput: 0: 42970.9. Samples: 3711853020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:19:08,391][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 14:19:08,695][15401] Updated weights for policy 0, policy_version 226550 (0.0029) [2024-06-22 14:19:12,582][15401] Updated weights for policy 0, policy_version 226560 (0.0040) [2024-06-22 14:19:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3711975424. Throughput: 0: 43179.7. Samples: 3712117100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:19:13,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 14:19:16,081][15401] Updated weights for policy 0, policy_version 226570 (0.0036) [2024-06-22 14:19:18,389][15132] Fps is (10 sec: 47515.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3712237568. Throughput: 0: 43153.4. Samples: 3712367600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:19:18,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 14:19:20,287][15401] Updated weights for policy 0, policy_version 226580 (0.0039) [2024-06-22 14:19:23,390][15132] Fps is (10 sec: 44235.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3712417792. Throughput: 0: 42936.6. Samples: 3712503740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 14:19:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 14:19:23,844][15401] Updated weights for policy 0, policy_version 226590 (0.0030) [2024-06-22 14:19:27,778][15401] Updated weights for policy 0, policy_version 226600 (0.0037) [2024-06-22 14:19:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3712630784. Throughput: 0: 42986.6. Samples: 3712756180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 14:19:30,413][15349] Signal inference workers to stop experience collection... (54950 times) [2024-06-22 14:19:30,446][15401] InferenceWorker_p0-w0: stopping experience collection (54950 times) [2024-06-22 14:19:30,471][15349] Signal inference workers to resume experience collection... (54950 times) [2024-06-22 14:19:30,476][15401] InferenceWorker_p0-w0: resuming experience collection (54950 times) [2024-06-22 14:19:31,313][15401] Updated weights for policy 0, policy_version 226610 (0.0041) [2024-06-22 14:19:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42326.3, 300 sec: 42765.0). Total num frames: 3712860160. Throughput: 0: 42972.0. Samples: 3713013980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 14:19:35,180][15401] Updated weights for policy 0, policy_version 226620 (0.0038) [2024-06-22 14:19:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 3713056768. Throughput: 0: 43153.8. Samples: 3713152280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 14:19:38,958][15401] Updated weights for policy 0, policy_version 226630 (0.0038) [2024-06-22 14:19:42,924][15401] Updated weights for policy 0, policy_version 226640 (0.0034) [2024-06-22 14:19:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 3713269760. Throughput: 0: 42976.4. Samples: 3713399780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 14:19:43,552][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226641_3713286144.pth... [2024-06-22 14:19:43,603][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226013_3702996992.pth [2024-06-22 14:19:46,631][15401] Updated weights for policy 0, policy_version 226650 (0.0033) [2024-06-22 14:19:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 3713499136. Throughput: 0: 43193.1. Samples: 3713661800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 14:19:50,802][15401] Updated weights for policy 0, policy_version 226660 (0.0050) [2024-06-22 14:19:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3713679360. Throughput: 0: 43086.4. Samples: 3713791900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:53,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 14:19:54,666][15401] Updated weights for policy 0, policy_version 226670 (0.0036) [2024-06-22 14:19:58,307][15401] Updated weights for policy 0, policy_version 226680 (0.0032) [2024-06-22 14:19:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.9, 300 sec: 42987.2). Total num frames: 3713925120. Throughput: 0: 42831.9. Samples: 3714044540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:19:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 14:20:02,091][15401] Updated weights for policy 0, policy_version 226690 (0.0036) [2024-06-22 14:20:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3714138112. Throughput: 0: 43020.4. Samples: 3714303520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 14:20:05,833][15401] Updated weights for policy 0, policy_version 226700 (0.0035) [2024-06-22 14:20:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 3714351104. Throughput: 0: 42841.1. Samples: 3714431580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 14:20:09,619][15401] Updated weights for policy 0, policy_version 226710 (0.0037) [2024-06-22 14:20:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3714564096. Throughput: 0: 42884.5. Samples: 3714685980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 14:20:13,709][15401] Updated weights for policy 0, policy_version 226720 (0.0040) [2024-06-22 14:20:17,527][15401] Updated weights for policy 0, policy_version 226730 (0.0031) [2024-06-22 14:20:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 3714777088. Throughput: 0: 42913.2. Samples: 3714945180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:18,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 14:20:21,234][15401] Updated weights for policy 0, policy_version 226740 (0.0037) [2024-06-22 14:20:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3714990080. Throughput: 0: 42671.5. Samples: 3715072500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:23,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 14:20:24,973][15401] Updated weights for policy 0, policy_version 226750 (0.0041) [2024-06-22 14:20:28,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3715219456. Throughput: 0: 42954.7. Samples: 3715332740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:20:28,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 14:20:28,849][15401] Updated weights for policy 0, policy_version 226760 (0.0033) [2024-06-22 14:20:32,603][15401] Updated weights for policy 0, policy_version 226770 (0.0036) [2024-06-22 14:20:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3715416064. Throughput: 0: 42814.1. Samples: 3715588440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 14:20:36,284][15401] Updated weights for policy 0, policy_version 226780 (0.0037) [2024-06-22 14:20:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3715645440. Throughput: 0: 42745.4. Samples: 3715715440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 14:20:40,085][15401] Updated weights for policy 0, policy_version 226790 (0.0032) [2024-06-22 14:20:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3715858432. Throughput: 0: 42842.1. Samples: 3715972440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:43,394][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 14:20:43,802][15401] Updated weights for policy 0, policy_version 226800 (0.0040) [2024-06-22 14:20:47,878][15401] Updated weights for policy 0, policy_version 226810 (0.0027) [2024-06-22 14:20:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3716071424. Throughput: 0: 42875.8. Samples: 3716232940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:48,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 14:20:51,476][15401] Updated weights for policy 0, policy_version 226820 (0.0037) [2024-06-22 14:20:53,394][15132] Fps is (10 sec: 42578.1, 60 sec: 43414.1, 300 sec: 42875.4). Total num frames: 3716284416. Throughput: 0: 42765.5. Samples: 3716356240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:53,395][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 14:20:55,596][15401] Updated weights for policy 0, policy_version 226830 (0.0048) [2024-06-22 14:20:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3716497408. Throughput: 0: 42855.0. Samples: 3716614460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:20:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 14:20:59,264][15401] Updated weights for policy 0, policy_version 226840 (0.0025) [2024-06-22 14:21:03,164][15401] Updated weights for policy 0, policy_version 226850 (0.0045) [2024-06-22 14:21:03,390][15132] Fps is (10 sec: 42619.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3716710400. Throughput: 0: 42935.2. Samples: 3716877160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:03,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 14:21:06,909][15401] Updated weights for policy 0, policy_version 226860 (0.0031) [2024-06-22 14:21:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3716923392. Throughput: 0: 42914.2. Samples: 3717003640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 14:21:10,859][15401] Updated weights for policy 0, policy_version 226870 (0.0034) [2024-06-22 14:21:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3717136384. Throughput: 0: 42832.1. Samples: 3717260180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 14:21:14,411][15349] Signal inference workers to stop experience collection... (55000 times) [2024-06-22 14:21:14,456][15401] InferenceWorker_p0-w0: stopping experience collection (55000 times) [2024-06-22 14:21:14,478][15349] Signal inference workers to resume experience collection... (55000 times) [2024-06-22 14:21:14,478][15401] InferenceWorker_p0-w0: resuming experience collection (55000 times) [2024-06-22 14:21:14,649][15401] Updated weights for policy 0, policy_version 226880 (0.0040) [2024-06-22 14:21:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 3717332992. Throughput: 0: 42878.7. Samples: 3717517980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 14:21:18,747][15401] Updated weights for policy 0, policy_version 226890 (0.0055) [2024-06-22 14:21:22,312][15401] Updated weights for policy 0, policy_version 226900 (0.0028) [2024-06-22 14:21:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3717562368. Throughput: 0: 42900.0. Samples: 3717645940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 14:21:26,308][15401] Updated weights for policy 0, policy_version 226910 (0.0031) [2024-06-22 14:21:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3717791744. Throughput: 0: 42708.1. Samples: 3717894300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 14:21:30,024][15401] Updated weights for policy 0, policy_version 226920 (0.0049) [2024-06-22 14:21:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 3717988352. Throughput: 0: 42847.2. Samples: 3718161160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:33,392][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 14:21:33,814][15401] Updated weights for policy 0, policy_version 226930 (0.0037) [2024-06-22 14:21:37,980][15401] Updated weights for policy 0, policy_version 226940 (0.0044) [2024-06-22 14:21:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3718201344. Throughput: 0: 42785.6. Samples: 3718281380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 14:21:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:21:41,788][15401] Updated weights for policy 0, policy_version 226950 (0.0028) [2024-06-22 14:21:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42987.4). Total num frames: 3718430720. Throughput: 0: 42699.3. Samples: 3718535920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:21:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 14:21:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226955_3718430720.pth... [2024-06-22 14:21:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226326_3708125184.pth [2024-06-22 14:21:45,556][15401] Updated weights for policy 0, policy_version 226960 (0.0029) [2024-06-22 14:21:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3718627328. Throughput: 0: 42611.1. Samples: 3718794660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:21:48,404][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 14:21:49,320][15401] Updated weights for policy 0, policy_version 226970 (0.0040) [2024-06-22 14:21:53,325][15401] Updated weights for policy 0, policy_version 226980 (0.0043) [2024-06-22 14:21:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42601.8, 300 sec: 42876.1). Total num frames: 3718840320. Throughput: 0: 42492.4. Samples: 3718915800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:21:53,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 14:21:56,847][15401] Updated weights for policy 0, policy_version 226990 (0.0042) [2024-06-22 14:21:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 3719053312. Throughput: 0: 42512.4. Samples: 3719173240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:21:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 14:22:01,245][15401] Updated weights for policy 0, policy_version 227000 (0.0034) [2024-06-22 14:22:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3719249920. Throughput: 0: 42442.7. Samples: 3719427900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 14:22:04,530][15401] Updated weights for policy 0, policy_version 227010 (0.0047) [2024-06-22 14:22:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3719479296. Throughput: 0: 42433.3. Samples: 3719555440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 14:22:08,658][15401] Updated weights for policy 0, policy_version 227020 (0.0030) [2024-06-22 14:22:12,248][15401] Updated weights for policy 0, policy_version 227030 (0.0050) [2024-06-22 14:22:13,393][15132] Fps is (10 sec: 45860.9, 60 sec: 42869.3, 300 sec: 42875.7). Total num frames: 3719708672. Throughput: 0: 42601.1. Samples: 3719811480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:13,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 14:22:16,678][15401] Updated weights for policy 0, policy_version 227040 (0.0038) [2024-06-22 14:22:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3719888896. Throughput: 0: 42377.8. Samples: 3720068060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 14:22:19,789][15401] Updated weights for policy 0, policy_version 227050 (0.0033) [2024-06-22 14:22:23,389][15132] Fps is (10 sec: 39333.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3720101888. Throughput: 0: 42342.2. Samples: 3720186780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 14:22:24,216][15401] Updated weights for policy 0, policy_version 227060 (0.0038) [2024-06-22 14:22:27,368][15401] Updated weights for policy 0, policy_version 227070 (0.0037) [2024-06-22 14:22:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3720331264. Throughput: 0: 42439.5. Samples: 3720445700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 14:22:31,739][15401] Updated weights for policy 0, policy_version 227080 (0.0034) [2024-06-22 14:22:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42709.9). Total num frames: 3720527872. Throughput: 0: 42449.3. Samples: 3720704880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 14:22:35,092][15401] Updated weights for policy 0, policy_version 227090 (0.0046) [2024-06-22 14:22:38,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.1, 300 sec: 42820.5). Total num frames: 3720740864. Throughput: 0: 42522.9. Samples: 3720829340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 14:22:39,749][15401] Updated weights for policy 0, policy_version 227100 (0.0033) [2024-06-22 14:22:43,218][15401] Updated weights for policy 0, policy_version 227110 (0.0038) [2024-06-22 14:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3720970240. Throughput: 0: 42384.3. Samples: 3721080540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 14:22:47,377][15401] Updated weights for policy 0, policy_version 227120 (0.0044) [2024-06-22 14:22:48,396][15132] Fps is (10 sec: 40934.5, 60 sec: 42047.8, 300 sec: 42653.0). Total num frames: 3721150464. Throughput: 0: 42496.9. Samples: 3721340540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 14:22:48,397][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 14:22:51,078][15401] Updated weights for policy 0, policy_version 227130 (0.0028) [2024-06-22 14:22:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3721379840. Throughput: 0: 42350.6. Samples: 3721461220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:22:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 14:22:54,785][15401] Updated weights for policy 0, policy_version 227140 (0.0036) [2024-06-22 14:22:56,676][15349] Signal inference workers to stop experience collection... (55050 times) [2024-06-22 14:22:56,676][15349] Signal inference workers to resume experience collection... (55050 times) [2024-06-22 14:22:56,715][15401] InferenceWorker_p0-w0: stopping experience collection (55050 times) [2024-06-22 14:22:56,715][15401] InferenceWorker_p0-w0: resuming experience collection (55050 times) [2024-06-22 14:22:58,389][15132] Fps is (10 sec: 45905.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3721609216. Throughput: 0: 42507.4. Samples: 3721724180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:22:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 14:22:58,505][15401] Updated weights for policy 0, policy_version 227150 (0.0033) [2024-06-22 14:23:02,216][15401] Updated weights for policy 0, policy_version 227160 (0.0046) [2024-06-22 14:23:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 3721805824. Throughput: 0: 42637.3. Samples: 3721986840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:03,392][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 14:23:05,931][15401] Updated weights for policy 0, policy_version 227170 (0.0041) [2024-06-22 14:23:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3722018816. Throughput: 0: 42696.4. Samples: 3722108120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 14:23:09,746][15401] Updated weights for policy 0, policy_version 227180 (0.0037) [2024-06-22 14:23:13,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42600.6, 300 sec: 42709.5). Total num frames: 3722264576. Throughput: 0: 42863.6. Samples: 3722374560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 14:23:13,518][15401] Updated weights for policy 0, policy_version 227190 (0.0051) [2024-06-22 14:23:17,678][15401] Updated weights for policy 0, policy_version 227200 (0.0035) [2024-06-22 14:23:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3722444800. Throughput: 0: 42835.6. Samples: 3722632480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 14:23:20,981][15401] Updated weights for policy 0, policy_version 227210 (0.0032) [2024-06-22 14:23:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3722674176. Throughput: 0: 42810.8. Samples: 3722755820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:23,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 14:23:24,973][15401] Updated weights for policy 0, policy_version 227220 (0.0033) [2024-06-22 14:23:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42709.7). Total num frames: 3722919936. Throughput: 0: 43264.1. Samples: 3723027420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 14:23:28,428][15401] Updated weights for policy 0, policy_version 227230 (0.0030) [2024-06-22 14:23:33,057][15401] Updated weights for policy 0, policy_version 227240 (0.0031) [2024-06-22 14:23:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3723100160. Throughput: 0: 43046.6. Samples: 3723277360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 14:23:36,312][15401] Updated weights for policy 0, policy_version 227250 (0.0036) [2024-06-22 14:23:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.8, 300 sec: 42931.6). Total num frames: 3723345920. Throughput: 0: 43074.8. Samples: 3723399580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 14:23:40,729][15401] Updated weights for policy 0, policy_version 227260 (0.0037) [2024-06-22 14:23:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3723558912. Throughput: 0: 43185.6. Samples: 3723667540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 14:23:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227268_3723558912.pth... [2024-06-22 14:23:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226641_3713286144.pth [2024-06-22 14:23:43,950][15401] Updated weights for policy 0, policy_version 227270 (0.0036) [2024-06-22 14:23:48,282][15401] Updated weights for policy 0, policy_version 227280 (0.0032) [2024-06-22 14:23:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 3723755520. Throughput: 0: 43055.6. Samples: 3723924240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 14:23:51,634][15401] Updated weights for policy 0, policy_version 227290 (0.0032) [2024-06-22 14:23:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42765.3). Total num frames: 3723968512. Throughput: 0: 43228.5. Samples: 3724053400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 14:23:55,665][15401] Updated weights for policy 0, policy_version 227300 (0.0033) [2024-06-22 14:23:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 3724165120. Throughput: 0: 43041.2. Samples: 3724311520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-22 14:23:58,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 14:23:59,222][15401] Updated weights for policy 0, policy_version 227310 (0.0040) [2024-06-22 14:24:01,451][15349] Signal inference workers to stop experience collection... (55100 times) [2024-06-22 14:24:01,451][15349] Signal inference workers to resume experience collection... (55100 times) [2024-06-22 14:24:01,495][15401] InferenceWorker_p0-w0: stopping experience collection (55100 times) [2024-06-22 14:24:01,495][15401] InferenceWorker_p0-w0: resuming experience collection (55100 times) [2024-06-22 14:24:03,237][15401] Updated weights for policy 0, policy_version 227320 (0.0022) [2024-06-22 14:24:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 3724410880. Throughput: 0: 42998.6. Samples: 3724567420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 14:24:06,725][15401] Updated weights for policy 0, policy_version 227330 (0.0024) [2024-06-22 14:24:08,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 3724623872. Throughput: 0: 43164.6. Samples: 3724698220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 14:24:10,869][15401] Updated weights for policy 0, policy_version 227340 (0.0030) [2024-06-22 14:24:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3724820480. Throughput: 0: 42795.1. Samples: 3724953200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 14:24:14,334][15401] Updated weights for policy 0, policy_version 227350 (0.0043) [2024-06-22 14:24:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3725049856. Throughput: 0: 43092.9. Samples: 3725216540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:18,394][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 14:24:18,453][15401] Updated weights for policy 0, policy_version 227360 (0.0031) [2024-06-22 14:24:21,833][15401] Updated weights for policy 0, policy_version 227370 (0.0040) [2024-06-22 14:24:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3725262848. Throughput: 0: 43196.3. Samples: 3725343420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 14:24:26,215][15401] Updated weights for policy 0, policy_version 227380 (0.0031) [2024-06-22 14:24:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3725475840. Throughput: 0: 42921.8. Samples: 3725599020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 14:24:29,652][15401] Updated weights for policy 0, policy_version 227390 (0.0050) [2024-06-22 14:24:33,396][15132] Fps is (10 sec: 42571.8, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 3725688832. Throughput: 0: 43079.8. Samples: 3725863100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:33,396][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 14:24:33,733][15401] Updated weights for policy 0, policy_version 227400 (0.0032) [2024-06-22 14:24:37,550][15401] Updated weights for policy 0, policy_version 227410 (0.0029) [2024-06-22 14:24:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3725901824. Throughput: 0: 42972.0. Samples: 3725987140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 14:24:41,275][15401] Updated weights for policy 0, policy_version 227420 (0.0030) [2024-06-22 14:24:43,389][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3726131200. Throughput: 0: 43004.6. Samples: 3726246620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 14:24:44,967][15401] Updated weights for policy 0, policy_version 227430 (0.0033) [2024-06-22 14:24:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3726327808. Throughput: 0: 43126.3. Samples: 3726508100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 14:24:49,134][15401] Updated weights for policy 0, policy_version 227440 (0.0036) [2024-06-22 14:24:52,787][15401] Updated weights for policy 0, policy_version 227450 (0.0033) [2024-06-22 14:24:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3726557184. Throughput: 0: 42939.5. Samples: 3726630500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 14:24:56,548][15401] Updated weights for policy 0, policy_version 227460 (0.0042) [2024-06-22 14:24:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 3726770176. Throughput: 0: 42981.2. Samples: 3726887360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:24:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 14:25:00,546][15401] Updated weights for policy 0, policy_version 227470 (0.0041) [2024-06-22 14:25:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3726983168. Throughput: 0: 42959.1. Samples: 3727149700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:25:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 14:25:04,359][15401] Updated weights for policy 0, policy_version 227480 (0.0041) [2024-06-22 14:25:08,087][15401] Updated weights for policy 0, policy_version 227490 (0.0033) [2024-06-22 14:25:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3727212544. Throughput: 0: 42936.5. Samples: 3727275560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:25:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 14:25:11,889][15349] Signal inference workers to stop experience collection... (55150 times) [2024-06-22 14:25:11,889][15349] Signal inference workers to resume experience collection... (55150 times) [2024-06-22 14:25:11,896][15401] Updated weights for policy 0, policy_version 227500 (0.0032) [2024-06-22 14:25:11,936][15401] InferenceWorker_p0-w0: stopping experience collection (55150 times) [2024-06-22 14:25:11,936][15401] InferenceWorker_p0-w0: resuming experience collection (55150 times) [2024-06-22 14:25:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3727409152. Throughput: 0: 42912.0. Samples: 3727530060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 14:25:15,902][15401] Updated weights for policy 0, policy_version 227510 (0.0030) [2024-06-22 14:25:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3727622144. Throughput: 0: 42754.9. Samples: 3727786800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 14:25:19,444][15401] Updated weights for policy 0, policy_version 227520 (0.0042) [2024-06-22 14:25:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3727835136. Throughput: 0: 42818.0. Samples: 3727913960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 14:25:23,556][15401] Updated weights for policy 0, policy_version 227530 (0.0026) [2024-06-22 14:25:27,241][15401] Updated weights for policy 0, policy_version 227540 (0.0040) [2024-06-22 14:25:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3728064512. Throughput: 0: 42803.5. Samples: 3728172780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 14:25:31,077][15401] Updated weights for policy 0, policy_version 227550 (0.0046) [2024-06-22 14:25:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 3728261120. Throughput: 0: 42689.3. Samples: 3728429120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 14:25:34,871][15401] Updated weights for policy 0, policy_version 227560 (0.0037) [2024-06-22 14:25:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3728490496. Throughput: 0: 42918.3. Samples: 3728561820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 14:25:38,529][15401] Updated weights for policy 0, policy_version 227570 (0.0042) [2024-06-22 14:25:42,567][15401] Updated weights for policy 0, policy_version 227580 (0.0033) [2024-06-22 14:25:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3728687104. Throughput: 0: 42965.3. Samples: 3728820900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:43,393][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 14:25:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227581_3728687104.pth... [2024-06-22 14:25:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000226955_3718430720.pth [2024-06-22 14:25:46,163][15401] Updated weights for policy 0, policy_version 227590 (0.0029) [2024-06-22 14:25:48,396][15132] Fps is (10 sec: 42570.4, 60 sec: 43139.9, 300 sec: 42820.3). Total num frames: 3728916480. Throughput: 0: 42691.2. Samples: 3729071080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:48,397][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 14:25:50,462][15401] Updated weights for policy 0, policy_version 227600 (0.0043) [2024-06-22 14:25:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3729129472. Throughput: 0: 42842.2. Samples: 3729203460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:53,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 14:25:54,005][15401] Updated weights for policy 0, policy_version 227610 (0.0027) [2024-06-22 14:25:58,144][15401] Updated weights for policy 0, policy_version 227620 (0.0033) [2024-06-22 14:25:58,396][15132] Fps is (10 sec: 40960.0, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 3729326080. Throughput: 0: 42873.4. Samples: 3729459640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:25:58,397][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:26:01,772][15401] Updated weights for policy 0, policy_version 227630 (0.0043) [2024-06-22 14:26:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3729555456. Throughput: 0: 42584.5. Samples: 3729703100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:26:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 14:26:06,087][15401] Updated weights for policy 0, policy_version 227640 (0.0044) [2024-06-22 14:26:08,392][15132] Fps is (10 sec: 42615.6, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 3729752064. Throughput: 0: 42689.0. Samples: 3729835060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:26:08,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 14:26:09,622][15401] Updated weights for policy 0, policy_version 227650 (0.0041) [2024-06-22 14:26:13,391][15132] Fps is (10 sec: 39315.5, 60 sec: 42324.3, 300 sec: 42764.8). Total num frames: 3729948672. Throughput: 0: 42734.6. Samples: 3730095900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:26:13,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 14:26:13,760][15401] Updated weights for policy 0, policy_version 227660 (0.0027) [2024-06-22 14:26:17,178][15401] Updated weights for policy 0, policy_version 227670 (0.0035) [2024-06-22 14:26:18,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3730194432. Throughput: 0: 42487.2. Samples: 3730341040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 14:26:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 14:26:21,330][15401] Updated weights for policy 0, policy_version 227680 (0.0028) [2024-06-22 14:26:23,389][15132] Fps is (10 sec: 44243.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 3730391040. Throughput: 0: 42553.3. Samples: 3730476720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 14:26:24,929][15401] Updated weights for policy 0, policy_version 227690 (0.0038) [2024-06-22 14:26:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 3730604032. Throughput: 0: 42365.9. Samples: 3730727260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:28,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 14:26:29,144][15401] Updated weights for policy 0, policy_version 227700 (0.0026) [2024-06-22 14:26:32,204][15349] Signal inference workers to stop experience collection... (55200 times) [2024-06-22 14:26:32,205][15349] Signal inference workers to resume experience collection... (55200 times) [2024-06-22 14:26:32,250][15401] InferenceWorker_p0-w0: stopping experience collection (55200 times) [2024-06-22 14:26:32,250][15401] InferenceWorker_p0-w0: resuming experience collection (55200 times) [2024-06-22 14:26:32,571][15401] Updated weights for policy 0, policy_version 227710 (0.0025) [2024-06-22 14:26:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3730833408. Throughput: 0: 42412.3. Samples: 3730979360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:33,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 14:26:36,824][15401] Updated weights for policy 0, policy_version 227720 (0.0029) [2024-06-22 14:26:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3731013632. Throughput: 0: 42349.4. Samples: 3731109180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 14:26:40,229][15401] Updated weights for policy 0, policy_version 227730 (0.0049) [2024-06-22 14:26:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 3731259392. Throughput: 0: 42256.2. Samples: 3731360900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 14:26:44,603][15401] Updated weights for policy 0, policy_version 227740 (0.0039) [2024-06-22 14:26:47,952][15401] Updated weights for policy 0, policy_version 227750 (0.0047) [2024-06-22 14:26:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 3731456000. Throughput: 0: 42437.3. Samples: 3731612780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 14:26:52,075][15401] Updated weights for policy 0, policy_version 227760 (0.0032) [2024-06-22 14:26:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 3731636224. Throughput: 0: 42323.2. Samples: 3731739500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 14:26:55,843][15401] Updated weights for policy 0, policy_version 227770 (0.0043) [2024-06-22 14:26:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42603.0, 300 sec: 42820.5). Total num frames: 3731881984. Throughput: 0: 42255.6. Samples: 3731997340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:26:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 14:26:59,663][15401] Updated weights for policy 0, policy_version 227780 (0.0027) [2024-06-22 14:27:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3732094976. Throughput: 0: 42626.7. Samples: 3732259240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:27:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 14:27:03,466][15401] Updated weights for policy 0, policy_version 227790 (0.0037) [2024-06-22 14:27:07,071][15401] Updated weights for policy 0, policy_version 227800 (0.0036) [2024-06-22 14:27:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42654.4). Total num frames: 3732291584. Throughput: 0: 42447.1. Samples: 3732386840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:27:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:27:11,099][15401] Updated weights for policy 0, policy_version 227810 (0.0023) [2024-06-22 14:27:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43145.6, 300 sec: 42876.1). Total num frames: 3732537344. Throughput: 0: 42561.3. Samples: 3732642520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:27:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 14:27:14,740][15401] Updated weights for policy 0, policy_version 227820 (0.0035) [2024-06-22 14:27:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3732733952. Throughput: 0: 42924.9. Samples: 3732910980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:27:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 14:27:18,753][15401] Updated weights for policy 0, policy_version 227830 (0.0029) [2024-06-22 14:27:22,122][15401] Updated weights for policy 0, policy_version 227840 (0.0043) [2024-06-22 14:27:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3732946944. Throughput: 0: 42738.2. Samples: 3733032400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 14:27:23,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 14:27:26,500][15401] Updated weights for policy 0, policy_version 227850 (0.0038) [2024-06-22 14:27:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3733176320. Throughput: 0: 42962.0. Samples: 3733294180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 14:27:29,576][15401] Updated weights for policy 0, policy_version 227860 (0.0033) [2024-06-22 14:27:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3733372928. Throughput: 0: 43289.8. Samples: 3733560820. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 14:27:34,175][15401] Updated weights for policy 0, policy_version 227870 (0.0026) [2024-06-22 14:27:37,329][15401] Updated weights for policy 0, policy_version 227880 (0.0039) [2024-06-22 14:27:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3733602304. Throughput: 0: 43176.7. Samples: 3733682460. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 14:27:41,671][15401] Updated weights for policy 0, policy_version 227890 (0.0033) [2024-06-22 14:27:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42988.1). Total num frames: 3733831680. Throughput: 0: 43168.4. Samples: 3733939920. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:27:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227895_3733831680.pth... [2024-06-22 14:27:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227268_3723558912.pth [2024-06-22 14:27:45,119][15401] Updated weights for policy 0, policy_version 227900 (0.0035) [2024-06-22 14:27:48,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3734028288. Throughput: 0: 43083.0. Samples: 3734197980. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:27:49,549][15401] Updated weights for policy 0, policy_version 227910 (0.0045) [2024-06-22 14:27:52,882][15401] Updated weights for policy 0, policy_version 227920 (0.0041) [2024-06-22 14:27:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 3734257664. Throughput: 0: 43005.7. Samples: 3734322100. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 14:27:55,852][15349] Signal inference workers to stop experience collection... (55250 times) [2024-06-22 14:27:55,896][15401] InferenceWorker_p0-w0: stopping experience collection (55250 times) [2024-06-22 14:27:55,906][15349] Signal inference workers to resume experience collection... (55250 times) [2024-06-22 14:27:55,916][15401] InferenceWorker_p0-w0: resuming experience collection (55250 times) [2024-06-22 14:27:57,112][15401] Updated weights for policy 0, policy_version 227930 (0.0031) [2024-06-22 14:27:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 3734470656. Throughput: 0: 43107.2. Samples: 3734582340. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:27:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 14:28:00,581][15401] Updated weights for policy 0, policy_version 227940 (0.0036) [2024-06-22 14:28:03,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 3734650880. Throughput: 0: 42741.6. Samples: 3734834460. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:03,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 14:28:04,714][15401] Updated weights for policy 0, policy_version 227950 (0.0045) [2024-06-22 14:28:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3734880256. Throughput: 0: 42786.3. Samples: 3734957780. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 14:28:08,425][15401] Updated weights for policy 0, policy_version 227960 (0.0041) [2024-06-22 14:28:12,452][15401] Updated weights for policy 0, policy_version 227970 (0.0035) [2024-06-22 14:28:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3735109632. Throughput: 0: 42735.9. Samples: 3735217300. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 14:28:16,000][15401] Updated weights for policy 0, policy_version 227980 (0.0029) [2024-06-22 14:28:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3735306240. Throughput: 0: 42386.6. Samples: 3735468220. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 14:28:20,519][15401] Updated weights for policy 0, policy_version 227990 (0.0034) [2024-06-22 14:28:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3735519232. Throughput: 0: 42492.6. Samples: 3735594620. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 14:28:23,607][15401] Updated weights for policy 0, policy_version 228000 (0.0045) [2024-06-22 14:28:28,270][15401] Updated weights for policy 0, policy_version 228010 (0.0036) [2024-06-22 14:28:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3735715840. Throughput: 0: 42462.2. Samples: 3735850720. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 14:28:31,692][15401] Updated weights for policy 0, policy_version 228020 (0.0039) [2024-06-22 14:28:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3735945216. Throughput: 0: 42339.1. Samples: 3736103240. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-22 14:28:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 14:28:36,088][15401] Updated weights for policy 0, policy_version 228030 (0.0041) [2024-06-22 14:28:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3736174592. Throughput: 0: 42376.8. Samples: 3736229060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:28:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 14:28:39,263][15401] Updated weights for policy 0, policy_version 228040 (0.0039) [2024-06-22 14:28:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 3736354816. Throughput: 0: 42318.8. Samples: 3736486680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:28:43,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 14:28:43,504][15401] Updated weights for policy 0, policy_version 228050 (0.0024) [2024-06-22 14:28:47,233][15401] Updated weights for policy 0, policy_version 228060 (0.0033) [2024-06-22 14:28:48,393][15132] Fps is (10 sec: 39310.0, 60 sec: 42323.2, 300 sec: 42709.0). Total num frames: 3736567808. Throughput: 0: 42319.0. Samples: 3736738840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:28:48,393][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 14:28:51,074][15401] Updated weights for policy 0, policy_version 228070 (0.0049) [2024-06-22 14:28:53,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42323.6, 300 sec: 42820.6). Total num frames: 3736797184. Throughput: 0: 42338.0. Samples: 3736863100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:28:53,392][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 14:28:54,927][15401] Updated weights for policy 0, policy_version 228080 (0.0030) [2024-06-22 14:28:58,389][15132] Fps is (10 sec: 40972.9, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 3736977408. Throughput: 0: 42350.9. Samples: 3737123080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:28:58,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 14:28:58,816][15401] Updated weights for policy 0, policy_version 228090 (0.0032) [2024-06-22 14:29:02,653][15401] Updated weights for policy 0, policy_version 228100 (0.0031) [2024-06-22 14:29:03,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 3737206784. Throughput: 0: 42429.3. Samples: 3737377540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:03,390][15132] Avg episode reward: [(0, '0.115')] [2024-06-22 14:29:06,511][15401] Updated weights for policy 0, policy_version 228110 (0.0043) [2024-06-22 14:29:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3737436160. Throughput: 0: 42455.1. Samples: 3737505100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:08,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 14:29:10,274][15401] Updated weights for policy 0, policy_version 228120 (0.0038) [2024-06-22 14:29:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 3737616384. Throughput: 0: 42513.9. Samples: 3737763840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 14:29:14,031][15349] Signal inference workers to stop experience collection... (55300 times) [2024-06-22 14:29:14,032][15349] Signal inference workers to resume experience collection... (55300 times) [2024-06-22 14:29:14,078][15401] InferenceWorker_p0-w0: stopping experience collection (55300 times) [2024-06-22 14:29:14,078][15401] InferenceWorker_p0-w0: resuming experience collection (55300 times) [2024-06-22 14:29:14,178][15401] Updated weights for policy 0, policy_version 228130 (0.0043) [2024-06-22 14:29:17,971][15401] Updated weights for policy 0, policy_version 228140 (0.0046) [2024-06-22 14:29:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3737862144. Throughput: 0: 42401.8. Samples: 3738011320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 14:29:21,852][15401] Updated weights for policy 0, policy_version 228150 (0.0034) [2024-06-22 14:29:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3738091520. Throughput: 0: 42586.4. Samples: 3738145440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 14:29:25,668][15401] Updated weights for policy 0, policy_version 228160 (0.0028) [2024-06-22 14:29:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 3738255360. Throughput: 0: 42460.8. Samples: 3738397420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 14:29:29,551][15401] Updated weights for policy 0, policy_version 228170 (0.0031) [2024-06-22 14:29:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3738484736. Throughput: 0: 42519.3. Samples: 3738652080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 14:29:33,715][15401] Updated weights for policy 0, policy_version 228180 (0.0041) [2024-06-22 14:29:37,478][15401] Updated weights for policy 0, policy_version 228190 (0.0031) [2024-06-22 14:29:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 3738697728. Throughput: 0: 42721.9. Samples: 3738785480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 14:29:41,456][15401] Updated weights for policy 0, policy_version 228200 (0.0036) [2024-06-22 14:29:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3738894336. Throughput: 0: 42459.9. Samples: 3739033780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 14:29:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 14:29:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228204_3738894336.pth... [2024-06-22 14:29:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227581_3728687104.pth [2024-06-22 14:29:45,045][15401] Updated weights for policy 0, policy_version 228210 (0.0032) [2024-06-22 14:29:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42600.5, 300 sec: 42598.4). Total num frames: 3739123712. Throughput: 0: 42379.0. Samples: 3739284600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:29:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 14:29:49,045][15401] Updated weights for policy 0, policy_version 228220 (0.0046) [2024-06-22 14:29:52,861][15401] Updated weights for policy 0, policy_version 228230 (0.0034) [2024-06-22 14:29:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 3739353088. Throughput: 0: 42585.8. Samples: 3739421460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:29:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 14:29:56,605][15401] Updated weights for policy 0, policy_version 228240 (0.0025) [2024-06-22 14:29:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3739533312. Throughput: 0: 42464.9. Samples: 3739674760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:29:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 14:30:00,583][15401] Updated weights for policy 0, policy_version 228250 (0.0035) [2024-06-22 14:30:03,391][15132] Fps is (10 sec: 42592.7, 60 sec: 42870.6, 300 sec: 42598.2). Total num frames: 3739779072. Throughput: 0: 42629.9. Samples: 3739929720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:03,391][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 14:30:04,218][15401] Updated weights for policy 0, policy_version 228260 (0.0027) [2024-06-22 14:30:08,159][15401] Updated weights for policy 0, policy_version 228270 (0.0034) [2024-06-22 14:30:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3739992064. Throughput: 0: 42618.6. Samples: 3740063280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:08,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 14:30:11,872][15401] Updated weights for policy 0, policy_version 228280 (0.0033) [2024-06-22 14:30:13,390][15132] Fps is (10 sec: 40964.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3740188672. Throughput: 0: 42602.6. Samples: 3740314540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 14:30:15,626][15401] Updated weights for policy 0, policy_version 228290 (0.0038) [2024-06-22 14:30:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3740401664. Throughput: 0: 42691.2. Samples: 3740573180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 14:30:19,313][15401] Updated weights for policy 0, policy_version 228300 (0.0032) [2024-06-22 14:30:23,294][15401] Updated weights for policy 0, policy_version 228310 (0.0033) [2024-06-22 14:30:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3740631040. Throughput: 0: 42611.0. Samples: 3740702980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 14:30:24,888][15349] Signal inference workers to stop experience collection... (55350 times) [2024-06-22 14:30:24,936][15401] InferenceWorker_p0-w0: stopping experience collection (55350 times) [2024-06-22 14:30:24,938][15349] Signal inference workers to resume experience collection... (55350 times) [2024-06-22 14:30:24,946][15401] InferenceWorker_p0-w0: resuming experience collection (55350 times) [2024-06-22 14:30:26,809][15401] Updated weights for policy 0, policy_version 228320 (0.0031) [2024-06-22 14:30:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3740811264. Throughput: 0: 42653.4. Samples: 3740953180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 14:30:31,014][15401] Updated weights for policy 0, policy_version 228330 (0.0023) [2024-06-22 14:30:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 3741040640. Throughput: 0: 42809.0. Samples: 3741211000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 14:30:34,860][15401] Updated weights for policy 0, policy_version 228340 (0.0035) [2024-06-22 14:30:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 3741253632. Throughput: 0: 42762.7. Samples: 3741345780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:38,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 14:30:38,560][15401] Updated weights for policy 0, policy_version 228350 (0.0028) [2024-06-22 14:30:42,386][15401] Updated weights for policy 0, policy_version 228360 (0.0028) [2024-06-22 14:30:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42543.8). Total num frames: 3741466624. Throughput: 0: 42853.3. Samples: 3741603160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 14:30:46,290][15401] Updated weights for policy 0, policy_version 228370 (0.0024) [2024-06-22 14:30:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3741696000. Throughput: 0: 42685.6. Samples: 3741850520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 14:30:49,878][15401] Updated weights for policy 0, policy_version 228380 (0.0032) [2024-06-22 14:30:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 3741892608. Throughput: 0: 42699.6. Samples: 3741984760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-22 14:30:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 14:30:54,034][15401] Updated weights for policy 0, policy_version 228390 (0.0033) [2024-06-22 14:30:57,389][15401] Updated weights for policy 0, policy_version 228400 (0.0022) [2024-06-22 14:30:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3742121984. Throughput: 0: 42805.4. Samples: 3742240780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:30:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 14:31:01,645][15401] Updated weights for policy 0, policy_version 228410 (0.0037) [2024-06-22 14:31:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42872.4, 300 sec: 42709.8). Total num frames: 3742351360. Throughput: 0: 42724.9. Samples: 3742495800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 14:31:05,435][15401] Updated weights for policy 0, policy_version 228420 (0.0025) [2024-06-22 14:31:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42654.1). Total num frames: 3742531584. Throughput: 0: 42770.6. Samples: 3742627660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:08,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 14:31:09,451][15401] Updated weights for policy 0, policy_version 228430 (0.0037) [2024-06-22 14:31:13,010][15401] Updated weights for policy 0, policy_version 228440 (0.0036) [2024-06-22 14:31:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3742777344. Throughput: 0: 42868.4. Samples: 3742882260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 14:31:16,919][15401] Updated weights for policy 0, policy_version 228450 (0.0035) [2024-06-22 14:31:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3742973952. Throughput: 0: 42780.7. Samples: 3743136120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 14:31:20,606][15401] Updated weights for policy 0, policy_version 228460 (0.0044) [2024-06-22 14:31:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3743170560. Throughput: 0: 42752.0. Samples: 3743269620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 14:31:24,427][15401] Updated weights for policy 0, policy_version 228470 (0.0035) [2024-06-22 14:31:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 3743399936. Throughput: 0: 42654.0. Samples: 3743522600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 14:31:28,583][15401] Updated weights for policy 0, policy_version 228480 (0.0025) [2024-06-22 14:31:31,464][15349] Signal inference workers to stop experience collection... (55400 times) [2024-06-22 14:31:31,465][15349] Signal inference workers to resume experience collection... (55400 times) [2024-06-22 14:31:31,511][15401] InferenceWorker_p0-w0: stopping experience collection (55400 times) [2024-06-22 14:31:31,511][15401] InferenceWorker_p0-w0: resuming experience collection (55400 times) [2024-06-22 14:31:32,196][15401] Updated weights for policy 0, policy_version 228490 (0.0031) [2024-06-22 14:31:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3743612928. Throughput: 0: 42922.1. Samples: 3743782020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 14:31:36,113][15401] Updated weights for policy 0, policy_version 228500 (0.0037) [2024-06-22 14:31:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3743825920. Throughput: 0: 42732.0. Samples: 3743907700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 14:31:40,255][15401] Updated weights for policy 0, policy_version 228510 (0.0038) [2024-06-22 14:31:43,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43139.8, 300 sec: 42708.5). Total num frames: 3744055296. Throughput: 0: 42761.0. Samples: 3744165300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:43,397][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 14:31:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228519_3744055296.pth... [2024-06-22 14:31:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000227895_3733831680.pth [2024-06-22 14:31:43,651][15401] Updated weights for policy 0, policy_version 228520 (0.0042) [2024-06-22 14:31:47,831][15401] Updated weights for policy 0, policy_version 228530 (0.0037) [2024-06-22 14:31:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3744251904. Throughput: 0: 42878.7. Samples: 3744425340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 14:31:51,121][15401] Updated weights for policy 0, policy_version 228540 (0.0042) [2024-06-22 14:31:53,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 3744464896. Throughput: 0: 42719.0. Samples: 3744550020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 14:31:55,326][15401] Updated weights for policy 0, policy_version 228550 (0.0032) [2024-06-22 14:31:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3744694272. Throughput: 0: 42999.6. Samples: 3744817240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 14:31:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 14:31:59,016][15401] Updated weights for policy 0, policy_version 228560 (0.0030) [2024-06-22 14:32:02,759][15401] Updated weights for policy 0, policy_version 228570 (0.0037) [2024-06-22 14:32:03,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3744907264. Throughput: 0: 42962.0. Samples: 3745069420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 14:32:06,548][15401] Updated weights for policy 0, policy_version 228580 (0.0039) [2024-06-22 14:32:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3745103872. Throughput: 0: 42743.8. Samples: 3745193100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:08,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 14:32:10,244][15401] Updated weights for policy 0, policy_version 228590 (0.0028) [2024-06-22 14:32:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3745349632. Throughput: 0: 43000.6. Samples: 3745457620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 14:32:14,028][15401] Updated weights for policy 0, policy_version 228600 (0.0025) [2024-06-22 14:32:17,707][15401] Updated weights for policy 0, policy_version 228610 (0.0039) [2024-06-22 14:32:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3745546240. Throughput: 0: 42784.4. Samples: 3745707320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 14:32:21,526][15401] Updated weights for policy 0, policy_version 228620 (0.0028) [2024-06-22 14:32:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3745726464. Throughput: 0: 42828.0. Samples: 3745834960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 14:32:25,302][15401] Updated weights for policy 0, policy_version 228630 (0.0028) [2024-06-22 14:32:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3745972224. Throughput: 0: 43009.7. Samples: 3746100460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 14:32:29,026][15401] Updated weights for policy 0, policy_version 228640 (0.0033) [2024-06-22 14:32:33,187][15401] Updated weights for policy 0, policy_version 228650 (0.0032) [2024-06-22 14:32:33,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3746201600. Throughput: 0: 42905.7. Samples: 3746356100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:33,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 14:32:35,597][15349] Signal inference workers to stop experience collection... (55450 times) [2024-06-22 14:32:35,598][15349] Signal inference workers to resume experience collection... (55450 times) [2024-06-22 14:32:35,613][15401] InferenceWorker_p0-w0: stopping experience collection (55450 times) [2024-06-22 14:32:35,613][15401] InferenceWorker_p0-w0: resuming experience collection (55450 times) [2024-06-22 14:32:36,821][15401] Updated weights for policy 0, policy_version 228660 (0.0043) [2024-06-22 14:32:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 3746381824. Throughput: 0: 43026.7. Samples: 3746486220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:38,391][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 14:32:40,898][15401] Updated weights for policy 0, policy_version 228670 (0.0033) [2024-06-22 14:32:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 3746611200. Throughput: 0: 42736.4. Samples: 3746740380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 14:32:44,399][15401] Updated weights for policy 0, policy_version 228680 (0.0035) [2024-06-22 14:32:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3746840576. Throughput: 0: 42837.8. Samples: 3746997120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 14:32:48,516][15401] Updated weights for policy 0, policy_version 228690 (0.0042) [2024-06-22 14:32:52,076][15401] Updated weights for policy 0, policy_version 228700 (0.0034) [2024-06-22 14:32:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 3747037184. Throughput: 0: 43042.4. Samples: 3747130000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 14:32:56,372][15401] Updated weights for policy 0, policy_version 228710 (0.0028) [2024-06-22 14:32:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 3747250176. Throughput: 0: 42762.6. Samples: 3747381940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:32:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 14:33:00,142][15401] Updated weights for policy 0, policy_version 228720 (0.0028) [2024-06-22 14:33:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3747463168. Throughput: 0: 42952.6. Samples: 3747640180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:33:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 14:33:04,023][15401] Updated weights for policy 0, policy_version 228730 (0.0025) [2024-06-22 14:33:07,591][15401] Updated weights for policy 0, policy_version 228740 (0.0027) [2024-06-22 14:33:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3747692544. Throughput: 0: 42947.8. Samples: 3747767620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 14:33:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 14:33:11,836][15401] Updated weights for policy 0, policy_version 228750 (0.0044) [2024-06-22 14:33:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3747905536. Throughput: 0: 42808.9. Samples: 3748026860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:13,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 14:33:15,227][15401] Updated weights for policy 0, policy_version 228760 (0.0033) [2024-06-22 14:33:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3748118528. Throughput: 0: 42692.4. Samples: 3748277260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 14:33:19,324][15401] Updated weights for policy 0, policy_version 228770 (0.0035) [2024-06-22 14:33:22,799][15401] Updated weights for policy 0, policy_version 228780 (0.0032) [2024-06-22 14:33:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3748331520. Throughput: 0: 42736.5. Samples: 3748409360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 14:33:26,940][15401] Updated weights for policy 0, policy_version 228790 (0.0040) [2024-06-22 14:33:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3748544512. Throughput: 0: 42854.7. Samples: 3748668840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 14:33:30,306][15401] Updated weights for policy 0, policy_version 228800 (0.0033) [2024-06-22 14:33:33,392][15132] Fps is (10 sec: 44224.1, 60 sec: 42869.4, 300 sec: 42709.1). Total num frames: 3748773888. Throughput: 0: 42742.6. Samples: 3748920660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:33,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 14:33:34,642][15401] Updated weights for policy 0, policy_version 228810 (0.0034) [2024-06-22 14:33:38,043][15401] Updated weights for policy 0, policy_version 228820 (0.0028) [2024-06-22 14:33:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 3748986880. Throughput: 0: 42775.4. Samples: 3749055000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:38,393][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 14:33:42,017][15401] Updated weights for policy 0, policy_version 228830 (0.0032) [2024-06-22 14:33:43,390][15132] Fps is (10 sec: 40971.3, 60 sec: 42871.3, 300 sec: 42765.4). Total num frames: 3749183488. Throughput: 0: 42789.6. Samples: 3749307480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 14:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228832_3749183488.pth... [2024-06-22 14:33:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228204_3738894336.pth [2024-06-22 14:33:46,066][15401] Updated weights for policy 0, policy_version 228840 (0.0041) [2024-06-22 14:33:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3749412864. Throughput: 0: 42714.3. Samples: 3749562320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 14:33:50,202][15401] Updated weights for policy 0, policy_version 228850 (0.0024) [2024-06-22 14:33:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3749609472. Throughput: 0: 42879.7. Samples: 3749697200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 14:33:53,735][15401] Updated weights for policy 0, policy_version 228860 (0.0041) [2024-06-22 14:33:57,630][15401] Updated weights for policy 0, policy_version 228870 (0.0024) [2024-06-22 14:33:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3749806080. Throughput: 0: 42730.7. Samples: 3749949740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:33:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 14:33:58,622][15349] Signal inference workers to stop experience collection... (55500 times) [2024-06-22 14:33:58,672][15401] InferenceWorker_p0-w0: stopping experience collection (55500 times) [2024-06-22 14:33:58,738][15349] Signal inference workers to resume experience collection... (55500 times) [2024-06-22 14:33:58,738][15401] InferenceWorker_p0-w0: resuming experience collection (55500 times) [2024-06-22 14:34:01,463][15401] Updated weights for policy 0, policy_version 228880 (0.0031) [2024-06-22 14:34:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3750051840. Throughput: 0: 42746.7. Samples: 3750200860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:34:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 14:34:05,359][15401] Updated weights for policy 0, policy_version 228890 (0.0029) [2024-06-22 14:34:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3750248448. Throughput: 0: 42764.9. Samples: 3750333780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:34:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 14:34:09,228][15401] Updated weights for policy 0, policy_version 228900 (0.0034) [2024-06-22 14:34:13,272][15401] Updated weights for policy 0, policy_version 228910 (0.0032) [2024-06-22 14:34:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3750461440. Throughput: 0: 42664.1. Samples: 3750588720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:34:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 14:34:16,980][15401] Updated weights for policy 0, policy_version 228920 (0.0036) [2024-06-22 14:34:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3750674432. Throughput: 0: 42709.7. Samples: 3750842480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:34:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 14:34:20,819][15401] Updated weights for policy 0, policy_version 228930 (0.0034) [2024-06-22 14:34:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3750887424. Throughput: 0: 42709.3. Samples: 3750976820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:23,399][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 14:34:24,522][15401] Updated weights for policy 0, policy_version 228940 (0.0038) [2024-06-22 14:34:28,367][15401] Updated weights for policy 0, policy_version 228950 (0.0040) [2024-06-22 14:34:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3751116800. Throughput: 0: 42808.5. Samples: 3751233860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 14:34:32,187][15401] Updated weights for policy 0, policy_version 228960 (0.0037) [2024-06-22 14:34:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.4, 300 sec: 42820.5). Total num frames: 3751329792. Throughput: 0: 42707.0. Samples: 3751484140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 14:34:36,293][15401] Updated weights for policy 0, policy_version 228970 (0.0026) [2024-06-22 14:34:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 3751542784. Throughput: 0: 42764.9. Samples: 3751621620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 14:34:39,831][15401] Updated weights for policy 0, policy_version 228980 (0.0031) [2024-06-22 14:34:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3751739392. Throughput: 0: 42764.8. Samples: 3751874160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 14:34:43,746][15401] Updated weights for policy 0, policy_version 228990 (0.0028) [2024-06-22 14:34:47,405][15401] Updated weights for policy 0, policy_version 229000 (0.0040) [2024-06-22 14:34:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3751968768. Throughput: 0: 42875.2. Samples: 3752130240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 14:34:51,318][15401] Updated weights for policy 0, policy_version 229010 (0.0038) [2024-06-22 14:34:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3752181760. Throughput: 0: 42875.0. Samples: 3752263160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 14:34:54,958][15401] Updated weights for policy 0, policy_version 229020 (0.0032) [2024-06-22 14:34:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 3752378368. Throughput: 0: 42841.3. Samples: 3752516580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:34:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 14:34:58,991][15401] Updated weights for policy 0, policy_version 229030 (0.0038) [2024-06-22 14:35:02,637][15401] Updated weights for policy 0, policy_version 229040 (0.0041) [2024-06-22 14:35:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3752607744. Throughput: 0: 42903.8. Samples: 3752773140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:03,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-22 14:35:06,634][15401] Updated weights for policy 0, policy_version 229050 (0.0031) [2024-06-22 14:35:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3752820736. Throughput: 0: 42738.5. Samples: 3752900040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 14:35:10,251][15401] Updated weights for policy 0, policy_version 229060 (0.0046) [2024-06-22 14:35:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3753033728. Throughput: 0: 42787.3. Samples: 3753159280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 14:35:14,268][15401] Updated weights for policy 0, policy_version 229070 (0.0028) [2024-06-22 14:35:17,802][15401] Updated weights for policy 0, policy_version 229080 (0.0031) [2024-06-22 14:35:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 3753279488. Throughput: 0: 42948.5. Samples: 3753416820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 14:35:21,853][15401] Updated weights for policy 0, policy_version 229090 (0.0035) [2024-06-22 14:35:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 3753476096. Throughput: 0: 42964.9. Samples: 3753555040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:23,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 14:35:25,545][15401] Updated weights for policy 0, policy_version 229100 (0.0034) [2024-06-22 14:35:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3753672704. Throughput: 0: 42748.8. Samples: 3753797860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 14:35:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 14:35:29,447][15401] Updated weights for policy 0, policy_version 229110 (0.0035) [2024-06-22 14:35:33,094][15349] Signal inference workers to stop experience collection... (55550 times) [2024-06-22 14:35:33,095][15349] Signal inference workers to resume experience collection... (55550 times) [2024-06-22 14:35:33,132][15401] InferenceWorker_p0-w0: stopping experience collection (55550 times) [2024-06-22 14:35:33,132][15401] InferenceWorker_p0-w0: resuming experience collection (55550 times) [2024-06-22 14:35:33,245][15401] Updated weights for policy 0, policy_version 229120 (0.0029) [2024-06-22 14:35:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3753902080. Throughput: 0: 42845.2. Samples: 3754058280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 14:35:37,013][15401] Updated weights for policy 0, policy_version 229130 (0.0040) [2024-06-22 14:35:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3754115072. Throughput: 0: 42896.1. Samples: 3754193480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 14:35:41,074][15401] Updated weights for policy 0, policy_version 229140 (0.0042) [2024-06-22 14:35:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3754311680. Throughput: 0: 42747.9. Samples: 3754440240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:43,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 14:35:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229145_3754311680.pth... [2024-06-22 14:35:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228519_3744055296.pth [2024-06-22 14:35:44,826][15401] Updated weights for policy 0, policy_version 229150 (0.0042) [2024-06-22 14:35:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3754524672. Throughput: 0: 42620.5. Samples: 3754691060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 14:35:48,855][15401] Updated weights for policy 0, policy_version 229160 (0.0042) [2024-06-22 14:35:52,782][15401] Updated weights for policy 0, policy_version 229170 (0.0038) [2024-06-22 14:35:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 3754721280. Throughput: 0: 42548.7. Samples: 3754814840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:53,392][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 14:35:56,780][15401] Updated weights for policy 0, policy_version 229180 (0.0039) [2024-06-22 14:35:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3754934272. Throughput: 0: 42476.4. Samples: 3755070720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:35:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 14:36:00,360][15401] Updated weights for policy 0, policy_version 229190 (0.0040) [2024-06-22 14:36:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3755163648. Throughput: 0: 42450.6. Samples: 3755327100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:03,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 14:36:04,522][15401] Updated weights for policy 0, policy_version 229200 (0.0031) [2024-06-22 14:36:08,237][15401] Updated weights for policy 0, policy_version 229210 (0.0033) [2024-06-22 14:36:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 3755376640. Throughput: 0: 42166.1. Samples: 3755452620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:08,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 14:36:12,198][15401] Updated weights for policy 0, policy_version 229220 (0.0042) [2024-06-22 14:36:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3755606016. Throughput: 0: 42664.9. Samples: 3755717780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 14:36:15,949][15401] Updated weights for policy 0, policy_version 229230 (0.0043) [2024-06-22 14:36:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 3755802624. Throughput: 0: 42551.5. Samples: 3755973100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 14:36:19,888][15401] Updated weights for policy 0, policy_version 229240 (0.0040) [2024-06-22 14:36:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3755999232. Throughput: 0: 42328.5. Samples: 3756098260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 14:36:23,643][15401] Updated weights for policy 0, policy_version 229250 (0.0036) [2024-06-22 14:36:27,509][15401] Updated weights for policy 0, policy_version 229260 (0.0029) [2024-06-22 14:36:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3756244992. Throughput: 0: 42573.8. Samples: 3756356060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 14:36:31,489][15401] Updated weights for policy 0, policy_version 229270 (0.0023) [2024-06-22 14:36:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3756425216. Throughput: 0: 42645.7. Samples: 3756610120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 14:36:35,451][15401] Updated weights for policy 0, policy_version 229280 (0.0038) [2024-06-22 14:36:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 3756654592. Throughput: 0: 42648.5. Samples: 3756733920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 14:36:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 14:36:38,926][15401] Updated weights for policy 0, policy_version 229290 (0.0024) [2024-06-22 14:36:42,879][15401] Updated weights for policy 0, policy_version 229300 (0.0027) [2024-06-22 14:36:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3756867584. Throughput: 0: 42866.1. Samples: 3756999700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:36:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 14:36:46,667][15401] Updated weights for policy 0, policy_version 229310 (0.0024) [2024-06-22 14:36:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3757080576. Throughput: 0: 42814.2. Samples: 3757253740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:36:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 14:36:50,357][15401] Updated weights for policy 0, policy_version 229320 (0.0025) [2024-06-22 14:36:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 3757309952. Throughput: 0: 42820.5. Samples: 3757379440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:36:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 14:36:54,279][15401] Updated weights for policy 0, policy_version 229330 (0.0027) [2024-06-22 14:36:57,758][15349] Signal inference workers to stop experience collection... (55600 times) [2024-06-22 14:36:57,764][15349] Signal inference workers to resume experience collection... (55600 times) [2024-06-22 14:36:57,809][15401] InferenceWorker_p0-w0: stopping experience collection (55600 times) [2024-06-22 14:36:57,810][15401] InferenceWorker_p0-w0: resuming experience collection (55600 times) [2024-06-22 14:36:57,891][15401] Updated weights for policy 0, policy_version 229340 (0.0023) [2024-06-22 14:36:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3757506560. Throughput: 0: 42673.7. Samples: 3757638100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:36:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 14:37:01,849][15401] Updated weights for policy 0, policy_version 229350 (0.0039) [2024-06-22 14:37:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3757735936. Throughput: 0: 42800.1. Samples: 3757899100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 14:37:05,588][15401] Updated weights for policy 0, policy_version 229360 (0.0043) [2024-06-22 14:37:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 3757932544. Throughput: 0: 42769.3. Samples: 3758022880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 14:37:09,459][15401] Updated weights for policy 0, policy_version 229370 (0.0031) [2024-06-22 14:37:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3758145536. Throughput: 0: 42688.4. Samples: 3758277040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:13,399][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 14:37:13,508][15401] Updated weights for policy 0, policy_version 229380 (0.0035) [2024-06-22 14:37:17,021][15401] Updated weights for policy 0, policy_version 229390 (0.0033) [2024-06-22 14:37:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 3758358528. Throughput: 0: 42749.8. Samples: 3758533860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 14:37:21,079][15401] Updated weights for policy 0, policy_version 229400 (0.0040) [2024-06-22 14:37:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3758587904. Throughput: 0: 42897.0. Samples: 3758664280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 14:37:24,587][15401] Updated weights for policy 0, policy_version 229410 (0.0029) [2024-06-22 14:37:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 3758800896. Throughput: 0: 42702.7. Samples: 3758921420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:28,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 14:37:28,583][15401] Updated weights for policy 0, policy_version 229420 (0.0036) [2024-06-22 14:37:32,177][15401] Updated weights for policy 0, policy_version 229430 (0.0051) [2024-06-22 14:37:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3758997504. Throughput: 0: 42756.1. Samples: 3759177760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 14:37:36,114][15401] Updated weights for policy 0, policy_version 229440 (0.0032) [2024-06-22 14:37:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3759243264. Throughput: 0: 42672.8. Samples: 3759299720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 14:37:40,230][15401] Updated weights for policy 0, policy_version 229450 (0.0042) [2024-06-22 14:37:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 3759456256. Throughput: 0: 42740.1. Samples: 3759561400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 14:37:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 14:37:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229459_3759456256.pth... [2024-06-22 14:37:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000228832_3749183488.pth [2024-06-22 14:37:43,693][15401] Updated weights for policy 0, policy_version 229460 (0.0037) [2024-06-22 14:37:47,975][15401] Updated weights for policy 0, policy_version 229470 (0.0025) [2024-06-22 14:37:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3759652864. Throughput: 0: 42549.7. Samples: 3759813840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:37:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 14:37:51,137][15401] Updated weights for policy 0, policy_version 229480 (0.0047) [2024-06-22 14:37:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3759865856. Throughput: 0: 42542.7. Samples: 3759937300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:37:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 14:37:55,676][15401] Updated weights for policy 0, policy_version 229490 (0.0028) [2024-06-22 14:37:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3760078848. Throughput: 0: 42703.6. Samples: 3760198700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:37:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 14:37:59,054][15401] Updated weights for policy 0, policy_version 229500 (0.0028) [2024-06-22 14:38:03,315][15401] Updated weights for policy 0, policy_version 229510 (0.0038) [2024-06-22 14:38:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3760291840. Throughput: 0: 42842.2. Samples: 3760461760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:03,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 14:38:06,891][15401] Updated weights for policy 0, policy_version 229520 (0.0039) [2024-06-22 14:38:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3760504832. Throughput: 0: 42700.4. Samples: 3760585800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 14:38:10,940][15401] Updated weights for policy 0, policy_version 229530 (0.0031) [2024-06-22 14:38:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 3760734208. Throughput: 0: 42675.1. Samples: 3760841800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:13,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 14:38:14,665][15401] Updated weights for policy 0, policy_version 229540 (0.0033) [2024-06-22 14:38:18,375][15349] Signal inference workers to stop experience collection... (55650 times) [2024-06-22 14:38:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3760930816. Throughput: 0: 42975.2. Samples: 3761111640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 14:38:18,427][15401] InferenceWorker_p0-w0: stopping experience collection (55650 times) [2024-06-22 14:38:18,428][15349] Signal inference workers to resume experience collection... (55650 times) [2024-06-22 14:38:18,439][15401] InferenceWorker_p0-w0: resuming experience collection (55650 times) [2024-06-22 14:38:18,441][15401] Updated weights for policy 0, policy_version 229550 (0.0027) [2024-06-22 14:38:22,312][15401] Updated weights for policy 0, policy_version 229560 (0.0038) [2024-06-22 14:38:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3761160192. Throughput: 0: 42976.1. Samples: 3761233640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 14:38:25,896][15401] Updated weights for policy 0, policy_version 229570 (0.0029) [2024-06-22 14:38:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 3761389568. Throughput: 0: 42916.5. Samples: 3761492640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 14:38:29,833][15401] Updated weights for policy 0, policy_version 229580 (0.0025) [2024-06-22 14:38:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 3761586176. Throughput: 0: 43233.2. Samples: 3761759340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 14:38:33,485][15401] Updated weights for policy 0, policy_version 229590 (0.0034) [2024-06-22 14:38:37,358][15401] Updated weights for policy 0, policy_version 229600 (0.0047) [2024-06-22 14:38:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3761815552. Throughput: 0: 43334.7. Samples: 3761887360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 14:38:41,017][15401] Updated weights for policy 0, policy_version 229610 (0.0036) [2024-06-22 14:38:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3762028544. Throughput: 0: 43140.9. Samples: 3762140040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 14:38:45,225][15401] Updated weights for policy 0, policy_version 229620 (0.0039) [2024-06-22 14:38:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3762208768. Throughput: 0: 43127.6. Samples: 3762402500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 14:38:49,052][15401] Updated weights for policy 0, policy_version 229630 (0.0038) [2024-06-22 14:38:52,683][15401] Updated weights for policy 0, policy_version 229640 (0.0033) [2024-06-22 14:38:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3762454528. Throughput: 0: 43067.2. Samples: 3762523820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 14:38:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 14:38:56,458][15401] Updated weights for policy 0, policy_version 229650 (0.0044) [2024-06-22 14:38:58,392][15132] Fps is (10 sec: 47501.6, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 3762683904. Throughput: 0: 43268.5. Samples: 3762788880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:38:58,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 14:39:00,558][15401] Updated weights for policy 0, policy_version 229660 (0.0037) [2024-06-22 14:39:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3762880512. Throughput: 0: 43031.1. Samples: 3763048040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:03,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 14:39:04,096][15401] Updated weights for policy 0, policy_version 229670 (0.0035) [2024-06-22 14:39:08,154][15401] Updated weights for policy 0, policy_version 229680 (0.0031) [2024-06-22 14:39:08,389][15132] Fps is (10 sec: 40970.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3763093504. Throughput: 0: 43069.0. Samples: 3763171740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:08,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 14:39:11,585][15401] Updated weights for policy 0, policy_version 229690 (0.0041) [2024-06-22 14:39:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 3763322880. Throughput: 0: 43076.8. Samples: 3763431100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 14:39:15,642][15401] Updated weights for policy 0, policy_version 229700 (0.0024) [2024-06-22 14:39:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3763519488. Throughput: 0: 42948.1. Samples: 3763692000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:18,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 14:39:19,174][15401] Updated weights for policy 0, policy_version 229710 (0.0029) [2024-06-22 14:39:23,214][15401] Updated weights for policy 0, policy_version 229720 (0.0027) [2024-06-22 14:39:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3763732480. Throughput: 0: 42919.1. Samples: 3763818720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 14:39:26,792][15401] Updated weights for policy 0, policy_version 229730 (0.0036) [2024-06-22 14:39:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3763978240. Throughput: 0: 43073.4. Samples: 3764078340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:28,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-22 14:39:30,808][15401] Updated weights for policy 0, policy_version 229740 (0.0035) [2024-06-22 14:39:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3764174848. Throughput: 0: 42933.2. Samples: 3764334500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 14:39:34,382][15401] Updated weights for policy 0, policy_version 229750 (0.0022) [2024-06-22 14:39:38,396][15132] Fps is (10 sec: 39296.3, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 3764371456. Throughput: 0: 42952.9. Samples: 3764456980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:38,397][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:39:38,456][15401] Updated weights for policy 0, policy_version 229760 (0.0032) [2024-06-22 14:39:42,027][15401] Updated weights for policy 0, policy_version 229770 (0.0028) [2024-06-22 14:39:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3764617216. Throughput: 0: 42904.6. Samples: 3764719480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 14:39:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229775_3764633600.pth... [2024-06-22 14:39:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229145_3754311680.pth [2024-06-22 14:39:45,929][15401] Updated weights for policy 0, policy_version 229780 (0.0038) [2024-06-22 14:39:45,930][15349] Signal inference workers to stop experience collection... (55700 times) [2024-06-22 14:39:45,930][15349] Signal inference workers to resume experience collection... (55700 times) [2024-06-22 14:39:45,971][15401] InferenceWorker_p0-w0: stopping experience collection (55700 times) [2024-06-22 14:39:45,971][15401] InferenceWorker_p0-w0: resuming experience collection (55700 times) [2024-06-22 14:39:48,389][15132] Fps is (10 sec: 42626.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3764797440. Throughput: 0: 42945.0. Samples: 3764980560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 14:39:49,523][15401] Updated weights for policy 0, policy_version 229790 (0.0029) [2024-06-22 14:39:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3765026816. Throughput: 0: 42906.4. Samples: 3765102540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 14:39:53,442][15401] Updated weights for policy 0, policy_version 229800 (0.0042) [2024-06-22 14:39:57,182][15401] Updated weights for policy 0, policy_version 229810 (0.0043) [2024-06-22 14:39:58,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 3765272576. Throughput: 0: 42949.0. Samples: 3765363800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:39:58,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 14:40:01,018][15401] Updated weights for policy 0, policy_version 229820 (0.0047) [2024-06-22 14:40:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3765452800. Throughput: 0: 42939.1. Samples: 3765624260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:40:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 14:40:04,991][15401] Updated weights for policy 0, policy_version 229830 (0.0037) [2024-06-22 14:40:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 3765682176. Throughput: 0: 42831.8. Samples: 3765746160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:40:08,483][15401] Updated weights for policy 0, policy_version 229840 (0.0043) [2024-06-22 14:40:12,398][15401] Updated weights for policy 0, policy_version 229850 (0.0024) [2024-06-22 14:40:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3765895168. Throughput: 0: 42845.2. Samples: 3766006380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 14:40:16,210][15401] Updated weights for policy 0, policy_version 229860 (0.0029) [2024-06-22 14:40:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3766091776. Throughput: 0: 42980.1. Samples: 3766268600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 14:40:19,937][15401] Updated weights for policy 0, policy_version 229870 (0.0043) [2024-06-22 14:40:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3766304768. Throughput: 0: 42995.9. Samples: 3766391520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 14:40:23,871][15401] Updated weights for policy 0, policy_version 229880 (0.0025) [2024-06-22 14:40:27,664][15401] Updated weights for policy 0, policy_version 229890 (0.0032) [2024-06-22 14:40:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3766550528. Throughput: 0: 42962.2. Samples: 3766652780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 14:40:31,334][15401] Updated weights for policy 0, policy_version 229900 (0.0045) [2024-06-22 14:40:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3766730752. Throughput: 0: 43024.8. Samples: 3766916680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 14:40:35,152][15401] Updated weights for policy 0, policy_version 229910 (0.0027) [2024-06-22 14:40:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 3766960128. Throughput: 0: 43058.3. Samples: 3767040160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 14:40:39,153][15401] Updated weights for policy 0, policy_version 229920 (0.0032) [2024-06-22 14:40:42,567][15401] Updated weights for policy 0, policy_version 229930 (0.0029) [2024-06-22 14:40:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3767189504. Throughput: 0: 42869.8. Samples: 3767292940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 14:40:47,019][15401] Updated weights for policy 0, policy_version 229940 (0.0045) [2024-06-22 14:40:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 3767386112. Throughput: 0: 42937.7. Samples: 3767556460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 14:40:50,186][15401] Updated weights for policy 0, policy_version 229950 (0.0023) [2024-06-22 14:40:52,820][15349] Signal inference workers to stop experience collection... (55750 times) [2024-06-22 14:40:52,821][15349] Signal inference workers to resume experience collection... (55750 times) [2024-06-22 14:40:52,876][15401] InferenceWorker_p0-w0: stopping experience collection (55750 times) [2024-06-22 14:40:52,876][15401] InferenceWorker_p0-w0: resuming experience collection (55750 times) [2024-06-22 14:40:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3767599104. Throughput: 0: 43175.2. Samples: 3767689040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 14:40:54,533][15401] Updated weights for policy 0, policy_version 229960 (0.0039) [2024-06-22 14:40:57,583][15401] Updated weights for policy 0, policy_version 229970 (0.0032) [2024-06-22 14:40:58,394][15132] Fps is (10 sec: 45856.2, 60 sec: 42868.4, 300 sec: 42986.6). Total num frames: 3767844864. Throughput: 0: 43062.7. Samples: 3767944380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:40:58,394][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 14:41:02,168][15401] Updated weights for policy 0, policy_version 229980 (0.0037) [2024-06-22 14:41:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 3768041472. Throughput: 0: 43048.9. Samples: 3768205800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:41:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 14:41:05,369][15401] Updated weights for policy 0, policy_version 229990 (0.0043) [2024-06-22 14:41:08,389][15132] Fps is (10 sec: 39338.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3768238080. Throughput: 0: 43091.1. Samples: 3768330620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:41:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 14:41:09,820][15401] Updated weights for policy 0, policy_version 230000 (0.0029) [2024-06-22 14:41:13,037][15401] Updated weights for policy 0, policy_version 230010 (0.0035) [2024-06-22 14:41:13,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43140.0, 300 sec: 42986.2). Total num frames: 3768483840. Throughput: 0: 43002.7. Samples: 3768588180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-22 14:41:13,396][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 14:41:17,327][15401] Updated weights for policy 0, policy_version 230020 (0.0043) [2024-06-22 14:41:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3768664064. Throughput: 0: 42960.9. Samples: 3768849920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 14:41:20,522][15401] Updated weights for policy 0, policy_version 230030 (0.0037) [2024-06-22 14:41:23,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3768877056. Throughput: 0: 42980.9. Samples: 3768974300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 14:41:24,879][15401] Updated weights for policy 0, policy_version 230040 (0.0037) [2024-06-22 14:41:28,033][15401] Updated weights for policy 0, policy_version 230050 (0.0034) [2024-06-22 14:41:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 3769139200. Throughput: 0: 43247.2. Samples: 3769239060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:28,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-22 14:41:32,743][15401] Updated weights for policy 0, policy_version 230060 (0.0034) [2024-06-22 14:41:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3769319424. Throughput: 0: 43161.0. Samples: 3769498700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:33,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 14:41:35,957][15401] Updated weights for policy 0, policy_version 230070 (0.0040) [2024-06-22 14:41:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 3769532416. Throughput: 0: 42920.9. Samples: 3769620480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 14:41:40,474][15401] Updated weights for policy 0, policy_version 230080 (0.0038) [2024-06-22 14:41:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3769778176. Throughput: 0: 43104.1. Samples: 3769883880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 14:41:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230090_3769794560.pth... [2024-06-22 14:41:43,477][15401] Updated weights for policy 0, policy_version 230090 (0.0031) [2024-06-22 14:41:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229459_3759456256.pth [2024-06-22 14:41:48,206][15401] Updated weights for policy 0, policy_version 230100 (0.0033) [2024-06-22 14:41:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3769958400. Throughput: 0: 43092.4. Samples: 3770144960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:48,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 14:41:51,246][15401] Updated weights for policy 0, policy_version 230110 (0.0034) [2024-06-22 14:41:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3770187776. Throughput: 0: 42993.3. Samples: 3770265320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 14:41:55,927][15401] Updated weights for policy 0, policy_version 230120 (0.0050) [2024-06-22 14:41:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42874.6, 300 sec: 42987.2). Total num frames: 3770417152. Throughput: 0: 43103.2. Samples: 3770527540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:41:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 14:41:58,831][15401] Updated weights for policy 0, policy_version 230130 (0.0043) [2024-06-22 14:42:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3770597376. Throughput: 0: 43088.9. Samples: 3770788920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:42:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 14:42:03,613][15401] Updated weights for policy 0, policy_version 230140 (0.0029) [2024-06-22 14:42:06,555][15401] Updated weights for policy 0, policy_version 230150 (0.0033) [2024-06-22 14:42:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3770826752. Throughput: 0: 42989.8. Samples: 3770908840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:42:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 14:42:11,265][15401] Updated weights for policy 0, policy_version 230160 (0.0027) [2024-06-22 14:42:11,967][15349] Signal inference workers to stop experience collection... (55800 times) [2024-06-22 14:42:11,972][15349] Signal inference workers to resume experience collection... (55800 times) [2024-06-22 14:42:12,017][15401] InferenceWorker_p0-w0: stopping experience collection (55800 times) [2024-06-22 14:42:12,017][15401] InferenceWorker_p0-w0: resuming experience collection (55800 times) [2024-06-22 14:42:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42603.0, 300 sec: 42987.2). Total num frames: 3771039744. Throughput: 0: 42911.9. Samples: 3771170100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:42:13,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 14:42:14,177][15401] Updated weights for policy 0, policy_version 230170 (0.0038) [2024-06-22 14:42:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3771236352. Throughput: 0: 43006.3. Samples: 3771433980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:42:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 14:42:18,846][15401] Updated weights for policy 0, policy_version 230180 (0.0030) [2024-06-22 14:42:21,879][15401] Updated weights for policy 0, policy_version 230190 (0.0031) [2024-06-22 14:42:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 3771482112. Throughput: 0: 43133.7. Samples: 3771561500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:42:23,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 14:42:26,605][15401] Updated weights for policy 0, policy_version 230200 (0.0038) [2024-06-22 14:42:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 3771695104. Throughput: 0: 42964.0. Samples: 3771817260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 14:42:29,771][15401] Updated weights for policy 0, policy_version 230210 (0.0032) [2024-06-22 14:42:33,395][15132] Fps is (10 sec: 40939.2, 60 sec: 42867.7, 300 sec: 42875.3). Total num frames: 3771891712. Throughput: 0: 42866.2. Samples: 3772074160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:33,395][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 14:42:34,113][15401] Updated weights for policy 0, policy_version 230220 (0.0046) [2024-06-22 14:42:37,472][15401] Updated weights for policy 0, policy_version 230230 (0.0042) [2024-06-22 14:42:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 3772153856. Throughput: 0: 42896.4. Samples: 3772195660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 14:42:41,516][15401] Updated weights for policy 0, policy_version 230240 (0.0040) [2024-06-22 14:42:43,390][15132] Fps is (10 sec: 45898.5, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 3772350464. Throughput: 0: 43010.0. Samples: 3772463000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 14:42:45,155][15401] Updated weights for policy 0, policy_version 230250 (0.0037) [2024-06-22 14:42:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3772547072. Throughput: 0: 42908.9. Samples: 3772719820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 14:42:48,988][15401] Updated weights for policy 0, policy_version 230260 (0.0022) [2024-06-22 14:42:52,743][15401] Updated weights for policy 0, policy_version 230270 (0.0028) [2024-06-22 14:42:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3772776448. Throughput: 0: 43060.9. Samples: 3772846580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 14:42:56,574][15401] Updated weights for policy 0, policy_version 230280 (0.0036) [2024-06-22 14:42:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3772989440. Throughput: 0: 43071.5. Samples: 3773108320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:42:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 14:43:00,391][15401] Updated weights for policy 0, policy_version 230290 (0.0040) [2024-06-22 14:43:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3773186048. Throughput: 0: 42889.8. Samples: 3773364020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 14:43:03,957][15401] Updated weights for policy 0, policy_version 230300 (0.0042) [2024-06-22 14:43:06,903][15349] Signal inference workers to stop experience collection... (55850 times) [2024-06-22 14:43:06,904][15349] Signal inference workers to resume experience collection... (55850 times) [2024-06-22 14:43:06,913][15401] InferenceWorker_p0-w0: stopping experience collection (55850 times) [2024-06-22 14:43:06,952][15401] InferenceWorker_p0-w0: resuming experience collection (55850 times) [2024-06-22 14:43:08,083][15401] Updated weights for policy 0, policy_version 230310 (0.0030) [2024-06-22 14:43:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42987.2). Total num frames: 3773415424. Throughput: 0: 42964.9. Samples: 3773495020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:08,393][15132] Avg episode reward: [(0, '0.154')] [2024-06-22 14:43:11,517][15401] Updated weights for policy 0, policy_version 230320 (0.0032) [2024-06-22 14:43:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3773628416. Throughput: 0: 42916.8. Samples: 3773748520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 14:43:15,594][15401] Updated weights for policy 0, policy_version 230330 (0.0033) [2024-06-22 14:43:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3773841408. Throughput: 0: 43073.1. Samples: 3774012220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 14:43:18,979][15401] Updated weights for policy 0, policy_version 230340 (0.0021) [2024-06-22 14:43:23,102][15401] Updated weights for policy 0, policy_version 230350 (0.0030) [2024-06-22 14:43:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3774054400. Throughput: 0: 43192.1. Samples: 3774139300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 14:43:26,907][15401] Updated weights for policy 0, policy_version 230360 (0.0046) [2024-06-22 14:43:28,391][15132] Fps is (10 sec: 42590.5, 60 sec: 42870.2, 300 sec: 42986.9). Total num frames: 3774267392. Throughput: 0: 42918.4. Samples: 3774394400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-22 14:43:28,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:43:30,642][15401] Updated weights for policy 0, policy_version 230370 (0.0037) [2024-06-22 14:43:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43148.3, 300 sec: 42931.6). Total num frames: 3774480384. Throughput: 0: 42943.5. Samples: 3774652280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 14:43:34,615][15401] Updated weights for policy 0, policy_version 230380 (0.0035) [2024-06-22 14:43:38,116][15401] Updated weights for policy 0, policy_version 230390 (0.0045) [2024-06-22 14:43:38,390][15132] Fps is (10 sec: 44244.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3774709760. Throughput: 0: 42842.1. Samples: 3774774480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 14:43:42,316][15401] Updated weights for policy 0, policy_version 230400 (0.0040) [2024-06-22 14:43:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 3774906368. Throughput: 0: 42856.8. Samples: 3775036880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 14:43:43,581][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230404_3774939136.pth... [2024-06-22 14:43:43,634][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000229775_3764633600.pth [2024-06-22 14:43:46,276][15401] Updated weights for policy 0, policy_version 230410 (0.0033) [2024-06-22 14:43:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 3775135744. Throughput: 0: 42903.9. Samples: 3775294800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:48,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 14:43:49,817][15401] Updated weights for policy 0, policy_version 230420 (0.0030) [2024-06-22 14:43:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 3775332352. Throughput: 0: 42791.1. Samples: 3775420520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 14:43:53,747][15401] Updated weights for policy 0, policy_version 230430 (0.0035) [2024-06-22 14:43:57,532][15401] Updated weights for policy 0, policy_version 230440 (0.0047) [2024-06-22 14:43:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3775561728. Throughput: 0: 42952.9. Samples: 3775681400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:43:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 14:44:01,358][15401] Updated weights for policy 0, policy_version 230450 (0.0036) [2024-06-22 14:44:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3775774720. Throughput: 0: 42676.8. Samples: 3775932680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 14:44:05,033][15401] Updated weights for policy 0, policy_version 230460 (0.0039) [2024-06-22 14:44:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 3775954944. Throughput: 0: 42745.4. Samples: 3776062840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 14:44:09,166][15401] Updated weights for policy 0, policy_version 230470 (0.0036) [2024-06-22 14:44:12,951][15401] Updated weights for policy 0, policy_version 230480 (0.0033) [2024-06-22 14:44:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3776200704. Throughput: 0: 42720.3. Samples: 3776316740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:13,399][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 14:44:16,787][15401] Updated weights for policy 0, policy_version 230490 (0.0034) [2024-06-22 14:44:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3776413696. Throughput: 0: 42751.6. Samples: 3776576100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 14:44:20,573][15401] Updated weights for policy 0, policy_version 230500 (0.0038) [2024-06-22 14:44:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3776610304. Throughput: 0: 42805.8. Samples: 3776700740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 14:44:24,351][15401] Updated weights for policy 0, policy_version 230510 (0.0030) [2024-06-22 14:44:28,125][15401] Updated weights for policy 0, policy_version 230520 (0.0052) [2024-06-22 14:44:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43144.1, 300 sec: 42986.8). Total num frames: 3776856064. Throughput: 0: 42701.8. Samples: 3776958560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:28,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 14:44:31,846][15401] Updated weights for policy 0, policy_version 230530 (0.0033) [2024-06-22 14:44:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42988.1). Total num frames: 3777052672. Throughput: 0: 42759.1. Samples: 3777218860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 14:44:35,691][15401] Updated weights for policy 0, policy_version 230540 (0.0030) [2024-06-22 14:44:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3777265664. Throughput: 0: 42734.7. Samples: 3777343580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 14:44:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 14:44:39,995][15401] Updated weights for policy 0, policy_version 230550 (0.0043) [2024-06-22 14:44:41,343][15349] Signal inference workers to stop experience collection... (55900 times) [2024-06-22 14:44:41,343][15349] Signal inference workers to resume experience collection... (55900 times) [2024-06-22 14:44:41,379][15401] InferenceWorker_p0-w0: stopping experience collection (55900 times) [2024-06-22 14:44:41,379][15401] InferenceWorker_p0-w0: resuming experience collection (55900 times) [2024-06-22 14:44:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 3777478656. Throughput: 0: 42601.2. Samples: 3777598460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:44:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 14:44:43,611][15401] Updated weights for policy 0, policy_version 230560 (0.0045) [2024-06-22 14:44:47,579][15401] Updated weights for policy 0, policy_version 230570 (0.0030) [2024-06-22 14:44:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42325.3, 300 sec: 42875.8). Total num frames: 3777675264. Throughput: 0: 42875.5. Samples: 3777862180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:44:48,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 14:44:51,128][15401] Updated weights for policy 0, policy_version 230580 (0.0030) [2024-06-22 14:44:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3777888256. Throughput: 0: 42652.4. Samples: 3777982200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:44:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 14:44:55,243][15401] Updated weights for policy 0, policy_version 230590 (0.0034) [2024-06-22 14:44:58,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3778117632. Throughput: 0: 42709.4. Samples: 3778238660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:44:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 14:44:58,677][15401] Updated weights for policy 0, policy_version 230600 (0.0030) [2024-06-22 14:45:02,963][15401] Updated weights for policy 0, policy_version 230610 (0.0023) [2024-06-22 14:45:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3778314240. Throughput: 0: 42735.9. Samples: 3778499220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 14:45:06,244][15401] Updated weights for policy 0, policy_version 230620 (0.0049) [2024-06-22 14:45:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 3778543616. Throughput: 0: 42690.2. Samples: 3778621900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:08,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 14:45:10,667][15401] Updated weights for policy 0, policy_version 230630 (0.0031) [2024-06-22 14:45:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3778772992. Throughput: 0: 42916.9. Samples: 3778889720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:13,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 14:45:13,810][15401] Updated weights for policy 0, policy_version 230640 (0.0032) [2024-06-22 14:45:18,286][15401] Updated weights for policy 0, policy_version 230650 (0.0030) [2024-06-22 14:45:18,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3778969600. Throughput: 0: 42878.2. Samples: 3779148380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 14:45:21,547][15401] Updated weights for policy 0, policy_version 230660 (0.0034) [2024-06-22 14:45:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3779182592. Throughput: 0: 42755.2. Samples: 3779267560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 14:45:26,408][15401] Updated weights for policy 0, policy_version 230670 (0.0037) [2024-06-22 14:45:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.2, 300 sec: 43042.7). Total num frames: 3779428352. Throughput: 0: 43043.7. Samples: 3779535420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 14:45:29,160][15401] Updated weights for policy 0, policy_version 230680 (0.0035) [2024-06-22 14:45:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3779608576. Throughput: 0: 42877.7. Samples: 3779791580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 14:45:33,826][15401] Updated weights for policy 0, policy_version 230690 (0.0035) [2024-06-22 14:45:36,643][15401] Updated weights for policy 0, policy_version 230700 (0.0034) [2024-06-22 14:45:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3779837952. Throughput: 0: 42901.9. Samples: 3779912780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 14:45:41,244][15401] Updated weights for policy 0, policy_version 230710 (0.0028) [2024-06-22 14:45:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3780050944. Throughput: 0: 43278.6. Samples: 3780186200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 14:45:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230717_3780067328.pth... [2024-06-22 14:45:43,585][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230090_3769794560.pth [2024-06-22 14:45:44,058][15401] Updated weights for policy 0, policy_version 230720 (0.0039) [2024-06-22 14:45:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 3780247552. Throughput: 0: 43080.6. Samples: 3780437840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 14:45:48,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 14:45:48,767][15401] Updated weights for policy 0, policy_version 230730 (0.0049) [2024-06-22 14:45:51,665][15401] Updated weights for policy 0, policy_version 230740 (0.0037) [2024-06-22 14:45:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.7). Total num frames: 3780493312. Throughput: 0: 43177.0. Samples: 3780564760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:45:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 14:45:56,425][15401] Updated weights for policy 0, policy_version 230750 (0.0031) [2024-06-22 14:45:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3780689920. Throughput: 0: 43031.7. Samples: 3780826140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:45:58,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 14:45:59,507][15401] Updated weights for policy 0, policy_version 230760 (0.0039) [2024-06-22 14:46:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3780902912. Throughput: 0: 42880.1. Samples: 3781077980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 14:46:04,137][15401] Updated weights for policy 0, policy_version 230770 (0.0035) [2024-06-22 14:46:05,292][15349] Signal inference workers to stop experience collection... (55950 times) [2024-06-22 14:46:05,292][15349] Signal inference workers to resume experience collection... (55950 times) [2024-06-22 14:46:05,315][15401] InferenceWorker_p0-w0: stopping experience collection (55950 times) [2024-06-22 14:46:05,315][15401] InferenceWorker_p0-w0: resuming experience collection (55950 times) [2024-06-22 14:46:06,999][15401] Updated weights for policy 0, policy_version 230780 (0.0028) [2024-06-22 14:46:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.3, 300 sec: 42877.0). Total num frames: 3781132288. Throughput: 0: 43073.8. Samples: 3781205880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 14:46:11,810][15401] Updated weights for policy 0, policy_version 230790 (0.0038) [2024-06-22 14:46:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3781328896. Throughput: 0: 42937.2. Samples: 3781467600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 14:46:14,823][15401] Updated weights for policy 0, policy_version 230800 (0.0036) [2024-06-22 14:46:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3781541888. Throughput: 0: 42902.8. Samples: 3781722200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 14:46:19,320][15401] Updated weights for policy 0, policy_version 230810 (0.0027) [2024-06-22 14:46:22,373][15401] Updated weights for policy 0, policy_version 230820 (0.0032) [2024-06-22 14:46:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3781787648. Throughput: 0: 43107.5. Samples: 3781852620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:23,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 14:46:26,816][15401] Updated weights for policy 0, policy_version 230830 (0.0030) [2024-06-22 14:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3781967872. Throughput: 0: 42810.7. Samples: 3782112680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 14:46:30,314][15401] Updated weights for policy 0, policy_version 230840 (0.0036) [2024-06-22 14:46:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3782197248. Throughput: 0: 42814.1. Samples: 3782364480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 14:46:34,534][15401] Updated weights for policy 0, policy_version 230850 (0.0025) [2024-06-22 14:46:37,854][15401] Updated weights for policy 0, policy_version 230860 (0.0042) [2024-06-22 14:46:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3782426624. Throughput: 0: 42997.4. Samples: 3782499640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 14:46:42,242][15401] Updated weights for policy 0, policy_version 230870 (0.0043) [2024-06-22 14:46:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3782590464. Throughput: 0: 42878.1. Samples: 3782755660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 14:46:45,504][15401] Updated weights for policy 0, policy_version 230880 (0.0028) [2024-06-22 14:46:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3782852608. Throughput: 0: 42873.3. Samples: 3783007280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:48,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 14:46:50,243][15401] Updated weights for policy 0, policy_version 230890 (0.0035) [2024-06-22 14:46:52,945][15401] Updated weights for policy 0, policy_version 230900 (0.0034) [2024-06-22 14:46:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3783065600. Throughput: 0: 43079.1. Samples: 3783144440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 14:46:57,711][15401] Updated weights for policy 0, policy_version 230910 (0.0035) [2024-06-22 14:46:58,391][15132] Fps is (10 sec: 39314.5, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 3783245824. Throughput: 0: 42975.3. Samples: 3783401560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 14:46:58,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 14:47:00,787][15401] Updated weights for policy 0, policy_version 230920 (0.0023) [2024-06-22 14:47:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3783491584. Throughput: 0: 42782.3. Samples: 3783647400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:03,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 14:47:05,228][15401] Updated weights for policy 0, policy_version 230930 (0.0034) [2024-06-22 14:47:08,389][15132] Fps is (10 sec: 45883.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3783704576. Throughput: 0: 42996.4. Samples: 3783787460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:08,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-22 14:47:08,400][15401] Updated weights for policy 0, policy_version 230940 (0.0036) [2024-06-22 14:47:12,653][15401] Updated weights for policy 0, policy_version 230950 (0.0030) [2024-06-22 14:47:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3783901184. Throughput: 0: 42976.0. Samples: 3784046600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:13,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 14:47:16,010][15401] Updated weights for policy 0, policy_version 230960 (0.0028) [2024-06-22 14:47:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 3784146944. Throughput: 0: 42797.5. Samples: 3784290360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 14:47:20,192][15401] Updated weights for policy 0, policy_version 230970 (0.0033) [2024-06-22 14:47:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3784343552. Throughput: 0: 42898.2. Samples: 3784430060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:23,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 14:47:23,570][15349] Signal inference workers to stop experience collection... (56000 times) [2024-06-22 14:47:23,624][15401] InferenceWorker_p0-w0: stopping experience collection (56000 times) [2024-06-22 14:47:23,632][15349] Signal inference workers to resume experience collection... (56000 times) [2024-06-22 14:47:23,633][15401] InferenceWorker_p0-w0: resuming experience collection (56000 times) [2024-06-22 14:47:23,770][15401] Updated weights for policy 0, policy_version 230980 (0.0026) [2024-06-22 14:47:27,741][15401] Updated weights for policy 0, policy_version 230990 (0.0026) [2024-06-22 14:47:28,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42876.8). Total num frames: 3784540160. Throughput: 0: 42827.0. Samples: 3784682880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:28,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 14:47:31,533][15401] Updated weights for policy 0, policy_version 231000 (0.0029) [2024-06-22 14:47:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3784785920. Throughput: 0: 42985.4. Samples: 3784941620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:33,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 14:47:35,316][15401] Updated weights for policy 0, policy_version 231010 (0.0042) [2024-06-22 14:47:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3784982528. Throughput: 0: 42925.7. Samples: 3785076100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 14:47:38,959][15401] Updated weights for policy 0, policy_version 231020 (0.0034) [2024-06-22 14:47:42,674][15401] Updated weights for policy 0, policy_version 231030 (0.0029) [2024-06-22 14:47:43,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43413.0, 300 sec: 42875.2). Total num frames: 3785195520. Throughput: 0: 42873.8. Samples: 3785331080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:43,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:47:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231030_3785195520.pth... [2024-06-22 14:47:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230404_3774939136.pth [2024-06-22 14:47:46,798][15401] Updated weights for policy 0, policy_version 231040 (0.0032) [2024-06-22 14:47:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3785424896. Throughput: 0: 43239.2. Samples: 3785593160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 14:47:50,134][15401] Updated weights for policy 0, policy_version 231050 (0.0046) [2024-06-22 14:47:53,389][15132] Fps is (10 sec: 45905.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3785654272. Throughput: 0: 43036.1. Samples: 3785724080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 14:47:54,334][15401] Updated weights for policy 0, policy_version 231060 (0.0034) [2024-06-22 14:47:57,998][15401] Updated weights for policy 0, policy_version 231070 (0.0037) [2024-06-22 14:47:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43418.8, 300 sec: 42931.6). Total num frames: 3785850880. Throughput: 0: 43045.4. Samples: 3785983640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:47:58,396][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 14:48:01,911][15401] Updated weights for policy 0, policy_version 231080 (0.0030) [2024-06-22 14:48:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3786063872. Throughput: 0: 43199.8. Samples: 3786234360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:48:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 14:48:05,892][15401] Updated weights for policy 0, policy_version 231090 (0.0032) [2024-06-22 14:48:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3786293248. Throughput: 0: 43086.6. Samples: 3786368960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 14:48:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 14:48:09,702][15401] Updated weights for policy 0, policy_version 231100 (0.0032) [2024-06-22 14:48:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3786489856. Throughput: 0: 43082.9. Samples: 3786621600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 14:48:13,439][15401] Updated weights for policy 0, policy_version 231110 (0.0031) [2024-06-22 14:48:17,342][15401] Updated weights for policy 0, policy_version 231120 (0.0040) [2024-06-22 14:48:18,394][15132] Fps is (10 sec: 40941.1, 60 sec: 42595.1, 300 sec: 42875.4). Total num frames: 3786702848. Throughput: 0: 43071.5. Samples: 3786880040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:18,395][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:48:21,049][15401] Updated weights for policy 0, policy_version 231130 (0.0048) [2024-06-22 14:48:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42931.9). Total num frames: 3786932224. Throughput: 0: 42851.9. Samples: 3787004440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 14:48:25,239][15401] Updated weights for policy 0, policy_version 231140 (0.0043) [2024-06-22 14:48:28,389][15132] Fps is (10 sec: 44257.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3787145216. Throughput: 0: 42885.2. Samples: 3787260640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:48:28,479][15401] Updated weights for policy 0, policy_version 231150 (0.0029) [2024-06-22 14:48:32,589][15401] Updated weights for policy 0, policy_version 231160 (0.0040) [2024-06-22 14:48:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3787341824. Throughput: 0: 43035.4. Samples: 3787529760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 14:48:35,931][15401] Updated weights for policy 0, policy_version 231170 (0.0026) [2024-06-22 14:48:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 3787587584. Throughput: 0: 42848.2. Samples: 3787652260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 14:48:39,908][15401] Updated weights for policy 0, policy_version 231180 (0.0043) [2024-06-22 14:48:43,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43422.3, 300 sec: 42932.0). Total num frames: 3787800576. Throughput: 0: 42976.1. Samples: 3787917560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 14:48:43,443][15401] Updated weights for policy 0, policy_version 231190 (0.0031) [2024-06-22 14:48:47,311][15401] Updated weights for policy 0, policy_version 231200 (0.0022) [2024-06-22 14:48:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3787997184. Throughput: 0: 43175.5. Samples: 3788177260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 14:48:51,108][15401] Updated weights for policy 0, policy_version 231210 (0.0041) [2024-06-22 14:48:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3788226560. Throughput: 0: 42895.1. Samples: 3788299240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 14:48:55,091][15401] Updated weights for policy 0, policy_version 231220 (0.0033) [2024-06-22 14:48:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3788439552. Throughput: 0: 43174.6. Samples: 3788564460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:48:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 14:48:58,640][15401] Updated weights for policy 0, policy_version 231230 (0.0025) [2024-06-22 14:48:58,665][15349] Signal inference workers to stop experience collection... (56050 times) [2024-06-22 14:48:58,666][15349] Signal inference workers to resume experience collection... (56050 times) [2024-06-22 14:48:58,685][15401] InferenceWorker_p0-w0: stopping experience collection (56050 times) [2024-06-22 14:48:58,685][15401] InferenceWorker_p0-w0: resuming experience collection (56050 times) [2024-06-22 14:49:02,579][15401] Updated weights for policy 0, policy_version 231240 (0.0027) [2024-06-22 14:49:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3788636160. Throughput: 0: 43098.7. Samples: 3788819280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:49:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 14:49:06,447][15401] Updated weights for policy 0, policy_version 231250 (0.0031) [2024-06-22 14:49:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3788865536. Throughput: 0: 43051.2. Samples: 3788941740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:49:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 14:49:10,336][15401] Updated weights for policy 0, policy_version 231260 (0.0033) [2024-06-22 14:49:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3789062144. Throughput: 0: 43192.0. Samples: 3789204280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:49:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 14:49:14,206][15401] Updated weights for policy 0, policy_version 231270 (0.0029) [2024-06-22 14:49:17,924][15401] Updated weights for policy 0, policy_version 231280 (0.0032) [2024-06-22 14:49:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43147.9, 300 sec: 42987.2). Total num frames: 3789291520. Throughput: 0: 42799.7. Samples: 3789455740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 14:49:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 14:49:21,853][15401] Updated weights for policy 0, policy_version 231290 (0.0041) [2024-06-22 14:49:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 3789504512. Throughput: 0: 43105.4. Samples: 3789592000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 14:49:25,627][15401] Updated weights for policy 0, policy_version 231300 (0.0036) [2024-06-22 14:49:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3789701120. Throughput: 0: 42799.9. Samples: 3789843560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 14:49:29,783][15401] Updated weights for policy 0, policy_version 231310 (0.0045) [2024-06-22 14:49:33,243][15401] Updated weights for policy 0, policy_version 231320 (0.0033) [2024-06-22 14:49:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3789946880. Throughput: 0: 42661.8. Samples: 3790097040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 14:49:37,321][15401] Updated weights for policy 0, policy_version 231330 (0.0039) [2024-06-22 14:49:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3790159872. Throughput: 0: 42858.7. Samples: 3790227880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:38,390][15132] Avg episode reward: [(0, '0.886')] [2024-06-22 14:49:41,418][15401] Updated weights for policy 0, policy_version 231340 (0.0031) [2024-06-22 14:49:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.6, 300 sec: 42987.2). Total num frames: 3790356480. Throughput: 0: 42664.3. Samples: 3790484460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:43,393][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 14:49:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231345_3790356480.pth... [2024-06-22 14:49:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000230717_3780067328.pth [2024-06-22 14:49:45,133][15401] Updated weights for policy 0, policy_version 231350 (0.0046) [2024-06-22 14:49:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3790569472. Throughput: 0: 42618.7. Samples: 3790737120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 14:49:49,186][15401] Updated weights for policy 0, policy_version 231360 (0.0029) [2024-06-22 14:49:52,840][15401] Updated weights for policy 0, policy_version 231370 (0.0033) [2024-06-22 14:49:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3790798848. Throughput: 0: 42824.7. Samples: 3790868860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 14:49:56,781][15401] Updated weights for policy 0, policy_version 231380 (0.0035) [2024-06-22 14:49:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 3790995456. Throughput: 0: 42696.8. Samples: 3791125640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:49:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 14:50:00,474][15401] Updated weights for policy 0, policy_version 231390 (0.0027) [2024-06-22 14:50:03,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.9, 300 sec: 42987.2). Total num frames: 3791224832. Throughput: 0: 42751.1. Samples: 3791379640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:50:03,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 14:50:04,264][15401] Updated weights for policy 0, policy_version 231400 (0.0030) [2024-06-22 14:50:08,035][15401] Updated weights for policy 0, policy_version 231410 (0.0035) [2024-06-22 14:50:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3791421440. Throughput: 0: 42712.1. Samples: 3791514040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:50:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 14:50:11,792][15401] Updated weights for policy 0, policy_version 231420 (0.0038) [2024-06-22 14:50:13,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3791618048. Throughput: 0: 42804.5. Samples: 3791769760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:50:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 14:50:15,814][15401] Updated weights for policy 0, policy_version 231430 (0.0042) [2024-06-22 14:50:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3791880192. Throughput: 0: 42750.4. Samples: 3792020800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:50:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 14:50:19,358][15401] Updated weights for policy 0, policy_version 231440 (0.0039) [2024-06-22 14:50:23,274][15401] Updated weights for policy 0, policy_version 231450 (0.0029) [2024-06-22 14:50:23,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3792076800. Throughput: 0: 42973.6. Samples: 3792161700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 14:50:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 14:50:26,869][15401] Updated weights for policy 0, policy_version 231460 (0.0032) [2024-06-22 14:50:28,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3792257024. Throughput: 0: 42856.1. Samples: 3792412880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 14:50:31,082][15401] Updated weights for policy 0, policy_version 231470 (0.0038) [2024-06-22 14:50:31,566][15349] Signal inference workers to stop experience collection... (56100 times) [2024-06-22 14:50:31,567][15349] Signal inference workers to resume experience collection... (56100 times) [2024-06-22 14:50:31,583][15401] InferenceWorker_p0-w0: stopping experience collection (56100 times) [2024-06-22 14:50:31,583][15401] InferenceWorker_p0-w0: resuming experience collection (56100 times) [2024-06-22 14:50:33,392][15132] Fps is (10 sec: 44227.4, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 3792519168. Throughput: 0: 42834.6. Samples: 3792664780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:33,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 14:50:34,527][15401] Updated weights for policy 0, policy_version 231480 (0.0039) [2024-06-22 14:50:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 3792715776. Throughput: 0: 42991.3. Samples: 3792803460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 14:50:38,602][15401] Updated weights for policy 0, policy_version 231490 (0.0037) [2024-06-22 14:50:42,525][15401] Updated weights for policy 0, policy_version 231500 (0.0028) [2024-06-22 14:50:43,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 3792896000. Throughput: 0: 42760.1. Samples: 3793049840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 14:50:46,357][15401] Updated weights for policy 0, policy_version 231510 (0.0032) [2024-06-22 14:50:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3793174528. Throughput: 0: 42708.1. Samples: 3793301400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 14:50:49,950][15401] Updated weights for policy 0, policy_version 231520 (0.0034) [2024-06-22 14:50:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3793354752. Throughput: 0: 42946.6. Samples: 3793446640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 14:50:53,810][15401] Updated weights for policy 0, policy_version 231530 (0.0041) [2024-06-22 14:50:57,545][15401] Updated weights for policy 0, policy_version 231540 (0.0033) [2024-06-22 14:50:58,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3793551360. Throughput: 0: 42797.7. Samples: 3793695660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:50:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 14:51:01,156][15401] Updated weights for policy 0, policy_version 231550 (0.0035) [2024-06-22 14:51:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 3793797120. Throughput: 0: 42917.7. Samples: 3793952100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 14:51:04,969][15401] Updated weights for policy 0, policy_version 231560 (0.0037) [2024-06-22 14:51:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3794010112. Throughput: 0: 42814.1. Samples: 3794088320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 14:51:08,824][15401] Updated weights for policy 0, policy_version 231570 (0.0035) [2024-06-22 14:51:12,411][15401] Updated weights for policy 0, policy_version 231580 (0.0032) [2024-06-22 14:51:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3794206720. Throughput: 0: 42859.2. Samples: 3794341540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 14:51:16,458][15401] Updated weights for policy 0, policy_version 231590 (0.0029) [2024-06-22 14:51:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3794452480. Throughput: 0: 42968.8. Samples: 3794598280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:18,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-22 14:51:20,643][15401] Updated weights for policy 0, policy_version 231600 (0.0040) [2024-06-22 14:51:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3794649088. Throughput: 0: 42933.7. Samples: 3794735480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 14:51:23,897][15401] Updated weights for policy 0, policy_version 231610 (0.0043) [2024-06-22 14:51:28,143][15401] Updated weights for policy 0, policy_version 231620 (0.0043) [2024-06-22 14:51:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3794862080. Throughput: 0: 43120.3. Samples: 3794990260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:28,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 14:51:31,275][15401] Updated weights for policy 0, policy_version 231630 (0.0027) [2024-06-22 14:51:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 3795107840. Throughput: 0: 43255.8. Samples: 3795247920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 14:51:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 14:51:35,659][15401] Updated weights for policy 0, policy_version 231640 (0.0031) [2024-06-22 14:51:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3795288064. Throughput: 0: 42996.0. Samples: 3795381460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:51:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 14:51:38,999][15401] Updated weights for policy 0, policy_version 231650 (0.0039) [2024-06-22 14:51:43,316][15401] Updated weights for policy 0, policy_version 231660 (0.0054) [2024-06-22 14:51:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 3795517440. Throughput: 0: 43062.6. Samples: 3795633480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:51:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 14:51:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231660_3795517440.pth... [2024-06-22 14:51:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231030_3785195520.pth [2024-06-22 14:51:44,732][15349] Signal inference workers to stop experience collection... (56150 times) [2024-06-22 14:51:44,733][15349] Signal inference workers to resume experience collection... (56150 times) [2024-06-22 14:51:44,748][15401] InferenceWorker_p0-w0: stopping experience collection (56150 times) [2024-06-22 14:51:44,749][15401] InferenceWorker_p0-w0: resuming experience collection (56150 times) [2024-06-22 14:51:46,898][15401] Updated weights for policy 0, policy_version 231670 (0.0031) [2024-06-22 14:51:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3795730432. Throughput: 0: 43040.5. Samples: 3795888920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:51:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 14:51:51,087][15401] Updated weights for policy 0, policy_version 231680 (0.0032) [2024-06-22 14:51:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43043.0). Total num frames: 3795943424. Throughput: 0: 42927.4. Samples: 3796020060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:51:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 14:51:54,394][15401] Updated weights for policy 0, policy_version 231690 (0.0034) [2024-06-22 14:51:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3796156416. Throughput: 0: 42956.9. Samples: 3796274600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:51:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 14:51:58,517][15401] Updated weights for policy 0, policy_version 231700 (0.0040) [2024-06-22 14:52:02,124][15401] Updated weights for policy 0, policy_version 231710 (0.0026) [2024-06-22 14:52:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3796369408. Throughput: 0: 42936.2. Samples: 3796530400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:52:06,740][15401] Updated weights for policy 0, policy_version 231720 (0.0042) [2024-06-22 14:52:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 3796582400. Throughput: 0: 42788.9. Samples: 3796660980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 14:52:10,228][15401] Updated weights for policy 0, policy_version 231730 (0.0043) [2024-06-22 14:52:13,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3796779008. Throughput: 0: 42697.7. Samples: 3796911660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 14:52:14,338][15401] Updated weights for policy 0, policy_version 231740 (0.0032) [2024-06-22 14:52:17,955][15401] Updated weights for policy 0, policy_version 231750 (0.0029) [2024-06-22 14:52:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3797008384. Throughput: 0: 42682.7. Samples: 3797168640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 14:52:22,174][15401] Updated weights for policy 0, policy_version 231760 (0.0038) [2024-06-22 14:52:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 3797204992. Throughput: 0: 42518.7. Samples: 3797294800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 14:52:25,544][15401] Updated weights for policy 0, policy_version 231770 (0.0026) [2024-06-22 14:52:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3797434368. Throughput: 0: 42498.6. Samples: 3797545920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 14:52:29,774][15401] Updated weights for policy 0, policy_version 231780 (0.0029) [2024-06-22 14:52:33,057][15401] Updated weights for policy 0, policy_version 231790 (0.0034) [2024-06-22 14:52:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 3797647360. Throughput: 0: 42520.9. Samples: 3797802360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 14:52:37,308][15401] Updated weights for policy 0, policy_version 231800 (0.0022) [2024-06-22 14:52:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42932.6). Total num frames: 3797860352. Throughput: 0: 42629.8. Samples: 3797938400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:38,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-22 14:52:40,554][15401] Updated weights for policy 0, policy_version 231810 (0.0041) [2024-06-22 14:52:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3798073344. Throughput: 0: 42535.9. Samples: 3798188720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 14:52:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 14:52:44,765][15401] Updated weights for policy 0, policy_version 231820 (0.0028) [2024-06-22 14:52:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3798269952. Throughput: 0: 42634.7. Samples: 3798448960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:52:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 14:52:48,727][15401] Updated weights for policy 0, policy_version 231830 (0.0037) [2024-06-22 14:52:52,409][15401] Updated weights for policy 0, policy_version 231840 (0.0027) [2024-06-22 14:52:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3798499328. Throughput: 0: 42760.9. Samples: 3798585220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:52:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 14:52:56,143][15401] Updated weights for policy 0, policy_version 231850 (0.0036) [2024-06-22 14:52:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3798712320. Throughput: 0: 42729.8. Samples: 3798834500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:52:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 14:53:00,063][15401] Updated weights for policy 0, policy_version 231860 (0.0036) [2024-06-22 14:53:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3798941696. Throughput: 0: 43052.9. Samples: 3799106020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 14:53:03,646][15401] Updated weights for policy 0, policy_version 231870 (0.0041) [2024-06-22 14:53:07,274][15349] Signal inference workers to stop experience collection... (56200 times) [2024-06-22 14:53:07,275][15349] Signal inference workers to resume experience collection... (56200 times) [2024-06-22 14:53:07,317][15401] InferenceWorker_p0-w0: stopping experience collection (56200 times) [2024-06-22 14:53:07,317][15401] InferenceWorker_p0-w0: resuming experience collection (56200 times) [2024-06-22 14:53:07,417][15401] Updated weights for policy 0, policy_version 231880 (0.0038) [2024-06-22 14:53:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3799154688. Throughput: 0: 43072.8. Samples: 3799233080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 14:53:11,207][15401] Updated weights for policy 0, policy_version 231890 (0.0031) [2024-06-22 14:53:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42932.3). Total num frames: 3799367680. Throughput: 0: 43046.2. Samples: 3799483000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 14:53:15,236][15401] Updated weights for policy 0, policy_version 231900 (0.0039) [2024-06-22 14:53:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3799580672. Throughput: 0: 43151.1. Samples: 3799744160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:18,394][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 14:53:18,783][15401] Updated weights for policy 0, policy_version 231910 (0.0022) [2024-06-22 14:53:22,746][15401] Updated weights for policy 0, policy_version 231920 (0.0044) [2024-06-22 14:53:23,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3799810048. Throughput: 0: 43036.5. Samples: 3799875040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 14:53:26,339][15401] Updated weights for policy 0, policy_version 231930 (0.0027) [2024-06-22 14:53:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 3800023040. Throughput: 0: 43153.2. Samples: 3800130720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:28,392][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 14:53:30,497][15401] Updated weights for policy 0, policy_version 231940 (0.0037) [2024-06-22 14:53:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3800219648. Throughput: 0: 43186.2. Samples: 3800392340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 14:53:33,937][15401] Updated weights for policy 0, policy_version 231950 (0.0034) [2024-06-22 14:53:37,984][15401] Updated weights for policy 0, policy_version 231960 (0.0041) [2024-06-22 14:53:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3800449024. Throughput: 0: 42928.5. Samples: 3800517000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 14:53:41,543][15401] Updated weights for policy 0, policy_version 231970 (0.0022) [2024-06-22 14:53:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3800662016. Throughput: 0: 43116.4. Samples: 3800774740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 14:53:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231974_3800662016.pth... [2024-06-22 14:53:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231345_3790356480.pth [2024-06-22 14:53:45,688][15401] Updated weights for policy 0, policy_version 231980 (0.0040) [2024-06-22 14:53:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3800858624. Throughput: 0: 42771.2. Samples: 3801030720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 14:53:49,310][15401] Updated weights for policy 0, policy_version 231990 (0.0030) [2024-06-22 14:53:53,287][15401] Updated weights for policy 0, policy_version 232000 (0.0028) [2024-06-22 14:53:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3801088000. Throughput: 0: 42805.4. Samples: 3801159320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 14:53:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 14:53:56,998][15401] Updated weights for policy 0, policy_version 232010 (0.0026) [2024-06-22 14:53:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3801284608. Throughput: 0: 42788.1. Samples: 3801408460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:53:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 14:54:01,019][15401] Updated weights for policy 0, policy_version 232020 (0.0039) [2024-06-22 14:54:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3801497600. Throughput: 0: 42754.7. Samples: 3801668120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 14:54:04,672][15401] Updated weights for policy 0, policy_version 232030 (0.0034) [2024-06-22 14:54:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3801710592. Throughput: 0: 42645.0. Samples: 3801794060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 14:54:08,716][15401] Updated weights for policy 0, policy_version 232040 (0.0033) [2024-06-22 14:54:12,503][15401] Updated weights for policy 0, policy_version 232050 (0.0031) [2024-06-22 14:54:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3801939968. Throughput: 0: 42631.2. Samples: 3802049020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 14:54:16,773][15401] Updated weights for policy 0, policy_version 232060 (0.0031) [2024-06-22 14:54:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3802152960. Throughput: 0: 42695.5. Samples: 3802313640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 14:54:20,146][15401] Updated weights for policy 0, policy_version 232070 (0.0026) [2024-06-22 14:54:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3802349568. Throughput: 0: 42650.5. Samples: 3802436280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 14:54:24,233][15401] Updated weights for policy 0, policy_version 232080 (0.0026) [2024-06-22 14:54:27,707][15401] Updated weights for policy 0, policy_version 232090 (0.0037) [2024-06-22 14:54:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3802578944. Throughput: 0: 42690.3. Samples: 3802695800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:28,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 14:54:31,801][15401] Updated weights for policy 0, policy_version 232100 (0.0033) [2024-06-22 14:54:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3802775552. Throughput: 0: 42699.0. Samples: 3802952180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 14:54:35,263][15401] Updated weights for policy 0, policy_version 232110 (0.0032) [2024-06-22 14:54:36,477][15349] Signal inference workers to stop experience collection... (56250 times) [2024-06-22 14:54:36,477][15349] Signal inference workers to resume experience collection... (56250 times) [2024-06-22 14:54:36,513][15401] InferenceWorker_p0-w0: stopping experience collection (56250 times) [2024-06-22 14:54:36,513][15401] InferenceWorker_p0-w0: resuming experience collection (56250 times) [2024-06-22 14:54:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 3802988544. Throughput: 0: 42537.7. Samples: 3803073520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 14:54:39,642][15401] Updated weights for policy 0, policy_version 232120 (0.0032) [2024-06-22 14:54:43,114][15401] Updated weights for policy 0, policy_version 232130 (0.0038) [2024-06-22 14:54:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3803234304. Throughput: 0: 42878.3. Samples: 3803337980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 14:54:47,226][15401] Updated weights for policy 0, policy_version 232140 (0.0035) [2024-06-22 14:54:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3803430912. Throughput: 0: 42783.5. Samples: 3803593380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 14:54:50,718][15401] Updated weights for policy 0, policy_version 232150 (0.0028) [2024-06-22 14:54:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3803643904. Throughput: 0: 42783.8. Samples: 3803719340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 14:54:54,751][15401] Updated weights for policy 0, policy_version 232160 (0.0038) [2024-06-22 14:54:58,255][15401] Updated weights for policy 0, policy_version 232170 (0.0031) [2024-06-22 14:54:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 3803873280. Throughput: 0: 42877.3. Samples: 3803978500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:54:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 14:55:02,854][15401] Updated weights for policy 0, policy_version 232180 (0.0033) [2024-06-22 14:55:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3804069888. Throughput: 0: 42873.7. Samples: 3804242960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 14:55:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 14:55:05,742][15401] Updated weights for policy 0, policy_version 232190 (0.0032) [2024-06-22 14:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3804282880. Throughput: 0: 42814.4. Samples: 3804362920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 14:55:10,444][15401] Updated weights for policy 0, policy_version 232200 (0.0044) [2024-06-22 14:55:13,265][15401] Updated weights for policy 0, policy_version 232210 (0.0040) [2024-06-22 14:55:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3804528640. Throughput: 0: 42882.5. Samples: 3804625520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 14:55:18,176][15401] Updated weights for policy 0, policy_version 232220 (0.0027) [2024-06-22 14:55:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3804708864. Throughput: 0: 43069.9. Samples: 3804890320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 14:55:20,943][15401] Updated weights for policy 0, policy_version 232230 (0.0030) [2024-06-22 14:55:23,392][15132] Fps is (10 sec: 42588.9, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 3804954624. Throughput: 0: 42976.4. Samples: 3805007560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:23,393][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 14:55:25,709][15401] Updated weights for policy 0, policy_version 232240 (0.0046) [2024-06-22 14:55:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3805151232. Throughput: 0: 43049.3. Samples: 3805275200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 14:55:28,734][15401] Updated weights for policy 0, policy_version 232250 (0.0024) [2024-06-22 14:55:33,380][15401] Updated weights for policy 0, policy_version 232260 (0.0033) [2024-06-22 14:55:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3805347840. Throughput: 0: 43093.0. Samples: 3805532560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 14:55:36,291][15401] Updated weights for policy 0, policy_version 232270 (0.0025) [2024-06-22 14:55:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 3805593600. Throughput: 0: 42940.6. Samples: 3805651660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 14:55:41,035][15401] Updated weights for policy 0, policy_version 232280 (0.0035) [2024-06-22 14:55:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3805790208. Throughput: 0: 43037.7. Samples: 3805915200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 14:55:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232287_3805790208.pth... [2024-06-22 14:55:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231660_3795517440.pth [2024-06-22 14:55:43,927][15401] Updated weights for policy 0, policy_version 232290 (0.0045) [2024-06-22 14:55:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3805986816. Throughput: 0: 42988.5. Samples: 3806177440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 14:55:48,542][15401] Updated weights for policy 0, policy_version 232300 (0.0041) [2024-06-22 14:55:51,624][15401] Updated weights for policy 0, policy_version 232310 (0.0028) [2024-06-22 14:55:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3806232576. Throughput: 0: 43028.5. Samples: 3806299200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 14:55:56,441][15401] Updated weights for policy 0, policy_version 232320 (0.0033) [2024-06-22 14:55:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3806429184. Throughput: 0: 42985.2. Samples: 3806559840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:55:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 14:55:59,150][15401] Updated weights for policy 0, policy_version 232330 (0.0034) [2024-06-22 14:56:03,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3806625792. Throughput: 0: 42927.4. Samples: 3806822060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:56:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 14:56:04,066][15401] Updated weights for policy 0, policy_version 232340 (0.0022) [2024-06-22 14:56:06,026][15349] Signal inference workers to stop experience collection... (56300 times) [2024-06-22 14:56:06,026][15349] Signal inference workers to resume experience collection... (56300 times) [2024-06-22 14:56:06,067][15401] InferenceWorker_p0-w0: stopping experience collection (56300 times) [2024-06-22 14:56:06,067][15401] InferenceWorker_p0-w0: resuming experience collection (56300 times) [2024-06-22 14:56:06,705][15401] Updated weights for policy 0, policy_version 232350 (0.0046) [2024-06-22 14:56:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3806887936. Throughput: 0: 43046.8. Samples: 3806944560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 14:56:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 14:56:11,540][15401] Updated weights for policy 0, policy_version 232360 (0.0028) [2024-06-22 14:56:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 3807084544. Throughput: 0: 43015.2. Samples: 3807210880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 14:56:14,282][15401] Updated weights for policy 0, policy_version 232370 (0.0036) [2024-06-22 14:56:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3807281152. Throughput: 0: 42966.1. Samples: 3807466040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 14:56:19,200][15401] Updated weights for policy 0, policy_version 232380 (0.0032) [2024-06-22 14:56:22,145][15401] Updated weights for policy 0, policy_version 232390 (0.0040) [2024-06-22 14:56:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 3807543296. Throughput: 0: 43063.2. Samples: 3807589500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 14:56:26,598][15401] Updated weights for policy 0, policy_version 232400 (0.0044) [2024-06-22 14:56:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3807739904. Throughput: 0: 43152.1. Samples: 3807857040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 14:56:29,744][15401] Updated weights for policy 0, policy_version 232410 (0.0037) [2024-06-22 14:56:33,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3807920128. Throughput: 0: 43079.1. Samples: 3808116000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 14:56:34,216][15401] Updated weights for policy 0, policy_version 232420 (0.0048) [2024-06-22 14:56:37,386][15401] Updated weights for policy 0, policy_version 232430 (0.0025) [2024-06-22 14:56:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3808165888. Throughput: 0: 43061.2. Samples: 3808236960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 14:56:41,772][15401] Updated weights for policy 0, policy_version 232440 (0.0038) [2024-06-22 14:56:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3808395264. Throughput: 0: 43146.5. Samples: 3808501440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 14:56:44,818][15401] Updated weights for policy 0, policy_version 232450 (0.0034) [2024-06-22 14:56:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3808575488. Throughput: 0: 43134.7. Samples: 3808763120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 14:56:49,287][15401] Updated weights for policy 0, policy_version 232460 (0.0032) [2024-06-22 14:56:52,439][15401] Updated weights for policy 0, policy_version 232470 (0.0029) [2024-06-22 14:56:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3808821248. Throughput: 0: 43086.6. Samples: 3808883460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 14:56:56,918][15401] Updated weights for policy 0, policy_version 232480 (0.0042) [2024-06-22 14:56:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3809017856. Throughput: 0: 42881.2. Samples: 3809140540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:56:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-22 14:57:00,006][15401] Updated weights for policy 0, policy_version 232490 (0.0048) [2024-06-22 14:57:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3809214464. Throughput: 0: 42964.4. Samples: 3809399440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:57:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 14:57:04,340][15401] Updated weights for policy 0, policy_version 232500 (0.0048) [2024-06-22 14:57:07,807][15401] Updated weights for policy 0, policy_version 232510 (0.0033) [2024-06-22 14:57:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3809460224. Throughput: 0: 43162.6. Samples: 3809531820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:57:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 14:57:12,043][15401] Updated weights for policy 0, policy_version 232520 (0.0029) [2024-06-22 14:57:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3809640448. Throughput: 0: 42830.6. Samples: 3809784420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:57:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 14:57:15,231][15401] Updated weights for policy 0, policy_version 232530 (0.0032) [2024-06-22 14:57:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 3809869824. Throughput: 0: 42697.3. Samples: 3810037480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 14:57:18,392][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 14:57:19,998][15401] Updated weights for policy 0, policy_version 232540 (0.0044) [2024-06-22 14:57:21,442][15349] Signal inference workers to stop experience collection... (56350 times) [2024-06-22 14:57:21,443][15349] Signal inference workers to resume experience collection... (56350 times) [2024-06-22 14:57:21,479][15401] InferenceWorker_p0-w0: stopping experience collection (56350 times) [2024-06-22 14:57:21,479][15401] InferenceWorker_p0-w0: resuming experience collection (56350 times) [2024-06-22 14:57:23,122][15401] Updated weights for policy 0, policy_version 232550 (0.0032) [2024-06-22 14:57:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 3810099200. Throughput: 0: 43066.3. Samples: 3810174940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 14:57:27,555][15401] Updated weights for policy 0, policy_version 232560 (0.0024) [2024-06-22 14:57:28,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 3810279424. Throughput: 0: 42939.6. Samples: 3810433720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:28,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-22 14:57:30,728][15401] Updated weights for policy 0, policy_version 232570 (0.0038) [2024-06-22 14:57:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3810508800. Throughput: 0: 42622.2. Samples: 3810681120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 14:57:35,235][15401] Updated weights for policy 0, policy_version 232580 (0.0026) [2024-06-22 14:57:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3810738176. Throughput: 0: 42875.1. Samples: 3810812840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 14:57:38,542][15401] Updated weights for policy 0, policy_version 232590 (0.0033) [2024-06-22 14:57:43,028][15401] Updated weights for policy 0, policy_version 232600 (0.0044) [2024-06-22 14:57:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3810934784. Throughput: 0: 42838.2. Samples: 3811068260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 14:57:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232602_3810951168.pth... [2024-06-22 14:57:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000231974_3800662016.pth [2024-06-22 14:57:46,110][15401] Updated weights for policy 0, policy_version 232610 (0.0043) [2024-06-22 14:57:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3811164160. Throughput: 0: 42672.5. Samples: 3811319700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 14:57:50,745][15401] Updated weights for policy 0, policy_version 232620 (0.0042) [2024-06-22 14:57:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3811393536. Throughput: 0: 42779.6. Samples: 3811456900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 14:57:53,600][15401] Updated weights for policy 0, policy_version 232630 (0.0025) [2024-06-22 14:57:58,354][15401] Updated weights for policy 0, policy_version 232640 (0.0028) [2024-06-22 14:57:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3811573760. Throughput: 0: 42913.7. Samples: 3811715540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:57:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 14:58:01,141][15401] Updated weights for policy 0, policy_version 232650 (0.0036) [2024-06-22 14:58:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3811803136. Throughput: 0: 42936.0. Samples: 3811969500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 14:58:05,911][15401] Updated weights for policy 0, policy_version 232660 (0.0039) [2024-06-22 14:58:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3812032512. Throughput: 0: 42778.6. Samples: 3812099980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 14:58:09,105][15401] Updated weights for policy 0, policy_version 232670 (0.0032) [2024-06-22 14:58:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3812212736. Throughput: 0: 42768.5. Samples: 3812358300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 14:58:13,415][15401] Updated weights for policy 0, policy_version 232680 (0.0036) [2024-06-22 14:58:16,719][15401] Updated weights for policy 0, policy_version 232690 (0.0027) [2024-06-22 14:58:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 3812442112. Throughput: 0: 43001.0. Samples: 3812616160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 14:58:21,170][15401] Updated weights for policy 0, policy_version 232700 (0.0039) [2024-06-22 14:58:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3812671488. Throughput: 0: 42979.4. Samples: 3812746920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 14:58:24,232][15401] Updated weights for policy 0, policy_version 232710 (0.0052) [2024-06-22 14:58:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3812868096. Throughput: 0: 42913.0. Samples: 3812999340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 14:58:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 14:58:28,783][15401] Updated weights for policy 0, policy_version 232720 (0.0037) [2024-06-22 14:58:31,713][15401] Updated weights for policy 0, policy_version 232730 (0.0030) [2024-06-22 14:58:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3813097472. Throughput: 0: 43091.5. Samples: 3813258820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 14:58:36,480][15401] Updated weights for policy 0, policy_version 232740 (0.0032) [2024-06-22 14:58:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 3813326848. Throughput: 0: 42979.6. Samples: 3813390980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 14:58:39,367][15401] Updated weights for policy 0, policy_version 232750 (0.0038) [2024-06-22 14:58:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3813523456. Throughput: 0: 42908.0. Samples: 3813646400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 14:58:43,788][15401] Updated weights for policy 0, policy_version 232760 (0.0034) [2024-06-22 14:58:46,852][15349] Signal inference workers to stop experience collection... (56400 times) [2024-06-22 14:58:46,884][15401] InferenceWorker_p0-w0: stopping experience collection (56400 times) [2024-06-22 14:58:46,922][15349] Signal inference workers to resume experience collection... (56400 times) [2024-06-22 14:58:46,922][15401] InferenceWorker_p0-w0: resuming experience collection (56400 times) [2024-06-22 14:58:47,060][15401] Updated weights for policy 0, policy_version 232770 (0.0042) [2024-06-22 14:58:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3813736448. Throughput: 0: 43002.4. Samples: 3813904600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 14:58:51,378][15401] Updated weights for policy 0, policy_version 232780 (0.0031) [2024-06-22 14:58:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3813949440. Throughput: 0: 43079.1. Samples: 3814038540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 14:58:54,589][15401] Updated weights for policy 0, policy_version 232790 (0.0033) [2024-06-22 14:58:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3814162432. Throughput: 0: 42910.2. Samples: 3814289260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:58:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 14:58:59,210][15401] Updated weights for policy 0, policy_version 232800 (0.0032) [2024-06-22 14:59:02,212][15401] Updated weights for policy 0, policy_version 232810 (0.0040) [2024-06-22 14:59:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 3814391808. Throughput: 0: 42996.7. Samples: 3814551020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 14:59:06,596][15401] Updated weights for policy 0, policy_version 232820 (0.0044) [2024-06-22 14:59:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3814588416. Throughput: 0: 43077.9. Samples: 3814685420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 14:59:09,903][15401] Updated weights for policy 0, policy_version 232830 (0.0037) [2024-06-22 14:59:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3814817792. Throughput: 0: 43120.8. Samples: 3814939780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 14:59:13,978][15401] Updated weights for policy 0, policy_version 232840 (0.0027) [2024-06-22 14:59:17,428][15401] Updated weights for policy 0, policy_version 232850 (0.0052) [2024-06-22 14:59:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 3815030784. Throughput: 0: 43088.0. Samples: 3815197780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 14:59:21,364][15401] Updated weights for policy 0, policy_version 232860 (0.0027) [2024-06-22 14:59:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3815227392. Throughput: 0: 43062.2. Samples: 3815328780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 14:59:25,392][15401] Updated weights for policy 0, policy_version 232870 (0.0031) [2024-06-22 14:59:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3815456768. Throughput: 0: 43018.7. Samples: 3815582240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 14:59:28,816][15401] Updated weights for policy 0, policy_version 232880 (0.0035) [2024-06-22 14:59:32,876][15401] Updated weights for policy 0, policy_version 232890 (0.0027) [2024-06-22 14:59:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3815669760. Throughput: 0: 43026.5. Samples: 3815840800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 14:59:36,948][15401] Updated weights for policy 0, policy_version 232900 (0.0027) [2024-06-22 14:59:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3815882752. Throughput: 0: 42947.1. Samples: 3815971160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-22 14:59:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 14:59:40,441][15401] Updated weights for policy 0, policy_version 232910 (0.0036) [2024-06-22 14:59:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3816095744. Throughput: 0: 43068.3. Samples: 3816227340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 14:59:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 14:59:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232917_3816112128.pth... [2024-06-22 14:59:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232287_3805790208.pth [2024-06-22 14:59:44,640][15401] Updated weights for policy 0, policy_version 232920 (0.0036) [2024-06-22 14:59:48,095][15401] Updated weights for policy 0, policy_version 232930 (0.0035) [2024-06-22 14:59:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 3816325120. Throughput: 0: 42889.9. Samples: 3816481160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 14:59:48,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 14:59:52,180][15401] Updated weights for policy 0, policy_version 232940 (0.0039) [2024-06-22 14:59:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3816538112. Throughput: 0: 42799.1. Samples: 3816611380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 14:59:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 14:59:55,814][15401] Updated weights for policy 0, policy_version 232950 (0.0035) [2024-06-22 14:59:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3816751104. Throughput: 0: 42875.6. Samples: 3816869180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 14:59:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 15:00:00,048][15401] Updated weights for policy 0, policy_version 232960 (0.0034) [2024-06-22 15:00:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.9, 300 sec: 42986.8). Total num frames: 3816964096. Throughput: 0: 42711.6. Samples: 3817119900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:03,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 15:00:03,738][15401] Updated weights for policy 0, policy_version 232970 (0.0027) [2024-06-22 15:00:07,614][15401] Updated weights for policy 0, policy_version 232980 (0.0041) [2024-06-22 15:00:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3817177088. Throughput: 0: 42584.9. Samples: 3817245100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 15:00:09,406][15349] Signal inference workers to stop experience collection... (56450 times) [2024-06-22 15:00:09,408][15349] Signal inference workers to resume experience collection... (56450 times) [2024-06-22 15:00:09,429][15401] InferenceWorker_p0-w0: stopping experience collection (56450 times) [2024-06-22 15:00:09,429][15401] InferenceWorker_p0-w0: resuming experience collection (56450 times) [2024-06-22 15:00:11,843][15401] Updated weights for policy 0, policy_version 232990 (0.0027) [2024-06-22 15:00:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3817373696. Throughput: 0: 42714.7. Samples: 3817504400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 15:00:15,170][15401] Updated weights for policy 0, policy_version 233000 (0.0028) [2024-06-22 15:00:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 3817603072. Throughput: 0: 42766.8. Samples: 3817765300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 15:00:19,256][15401] Updated weights for policy 0, policy_version 233010 (0.0033) [2024-06-22 15:00:22,640][15401] Updated weights for policy 0, policy_version 233020 (0.0028) [2024-06-22 15:00:23,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 3817832448. Throughput: 0: 42694.5. Samples: 3817892420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 15:00:26,996][15401] Updated weights for policy 0, policy_version 233030 (0.0042) [2024-06-22 15:00:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3818029056. Throughput: 0: 42814.8. Samples: 3818154000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:28,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 15:00:30,409][15401] Updated weights for policy 0, policy_version 233040 (0.0030) [2024-06-22 15:00:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3818225664. Throughput: 0: 42838.6. Samples: 3818408800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 15:00:34,675][15401] Updated weights for policy 0, policy_version 233050 (0.0038) [2024-06-22 15:00:37,797][15401] Updated weights for policy 0, policy_version 233060 (0.0034) [2024-06-22 15:00:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3818471424. Throughput: 0: 42822.3. Samples: 3818538380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 15:00:42,252][15401] Updated weights for policy 0, policy_version 233070 (0.0033) [2024-06-22 15:00:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3818668032. Throughput: 0: 42847.0. Samples: 3818797300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 15:00:45,380][15401] Updated weights for policy 0, policy_version 233080 (0.0043) [2024-06-22 15:00:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 3818881024. Throughput: 0: 42813.4. Samples: 3819046400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 15:00:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 15:00:49,795][15401] Updated weights for policy 0, policy_version 233090 (0.0039) [2024-06-22 15:00:53,177][15401] Updated weights for policy 0, policy_version 233100 (0.0044) [2024-06-22 15:00:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3819110400. Throughput: 0: 42949.7. Samples: 3819177840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:00:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 15:00:57,270][15401] Updated weights for policy 0, policy_version 233110 (0.0042) [2024-06-22 15:00:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 3819290624. Throughput: 0: 42798.1. Samples: 3819430320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:00:58,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 15:01:00,757][15401] Updated weights for policy 0, policy_version 233120 (0.0036) [2024-06-22 15:01:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 3819503616. Throughput: 0: 42739.0. Samples: 3819688560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 15:01:04,987][15401] Updated weights for policy 0, policy_version 233130 (0.0026) [2024-06-22 15:01:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3819749376. Throughput: 0: 42678.8. Samples: 3819812960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 15:01:08,443][15401] Updated weights for policy 0, policy_version 233140 (0.0025) [2024-06-22 15:01:12,564][15401] Updated weights for policy 0, policy_version 233150 (0.0033) [2024-06-22 15:01:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3819945984. Throughput: 0: 42696.3. Samples: 3820075440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:13,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 15:01:16,333][15401] Updated weights for policy 0, policy_version 233160 (0.0031) [2024-06-22 15:01:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3820142592. Throughput: 0: 42701.0. Samples: 3820330340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 15:01:20,556][15401] Updated weights for policy 0, policy_version 233170 (0.0030) [2024-06-22 15:01:23,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 3820388352. Throughput: 0: 42558.5. Samples: 3820453620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:23,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 15:01:23,860][15401] Updated weights for policy 0, policy_version 233180 (0.0034) [2024-06-22 15:01:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3820568576. Throughput: 0: 42508.5. Samples: 3820710180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 15:01:28,416][15401] Updated weights for policy 0, policy_version 233190 (0.0043) [2024-06-22 15:01:31,120][15349] Signal inference workers to stop experience collection... (56500 times) [2024-06-22 15:01:31,120][15349] Signal inference workers to resume experience collection... (56500 times) [2024-06-22 15:01:31,168][15401] InferenceWorker_p0-w0: stopping experience collection (56500 times) [2024-06-22 15:01:31,168][15401] InferenceWorker_p0-w0: resuming experience collection (56500 times) [2024-06-22 15:01:31,436][15401] Updated weights for policy 0, policy_version 233200 (0.0035) [2024-06-22 15:01:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3820797952. Throughput: 0: 42527.2. Samples: 3820960120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 15:01:36,042][15401] Updated weights for policy 0, policy_version 233210 (0.0038) [2024-06-22 15:01:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3821010944. Throughput: 0: 42521.9. Samples: 3821091320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 15:01:39,076][15401] Updated weights for policy 0, policy_version 233220 (0.0029) [2024-06-22 15:01:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 3821223936. Throughput: 0: 42675.1. Samples: 3821350800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:43,393][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 15:01:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233229_3821223936.pth... [2024-06-22 15:01:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232602_3810951168.pth [2024-06-22 15:01:43,643][15401] Updated weights for policy 0, policy_version 233230 (0.0037) [2024-06-22 15:01:46,494][15401] Updated weights for policy 0, policy_version 233240 (0.0023) [2024-06-22 15:01:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3821420544. Throughput: 0: 42758.8. Samples: 3821612700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:48,396][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 15:01:51,202][15401] Updated weights for policy 0, policy_version 233250 (0.0035) [2024-06-22 15:01:53,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3821666304. Throughput: 0: 42836.1. Samples: 3821740580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 15:01:54,052][15401] Updated weights for policy 0, policy_version 233260 (0.0031) [2024-06-22 15:01:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3821862912. Throughput: 0: 42692.9. Samples: 3821996520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 15:01:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 15:01:58,730][15401] Updated weights for policy 0, policy_version 233270 (0.0040) [2024-06-22 15:02:01,918][15401] Updated weights for policy 0, policy_version 233280 (0.0037) [2024-06-22 15:02:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3822075904. Throughput: 0: 42764.8. Samples: 3822254860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:03,393][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 15:02:06,161][15401] Updated weights for policy 0, policy_version 233290 (0.0020) [2024-06-22 15:02:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3822305280. Throughput: 0: 42868.9. Samples: 3822382620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 15:02:09,386][15401] Updated weights for policy 0, policy_version 233300 (0.0024) [2024-06-22 15:02:13,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 3822518272. Throughput: 0: 43025.8. Samples: 3822646340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 15:02:14,104][15401] Updated weights for policy 0, policy_version 233310 (0.0033) [2024-06-22 15:02:17,098][15401] Updated weights for policy 0, policy_version 233320 (0.0036) [2024-06-22 15:02:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3822731264. Throughput: 0: 43001.2. Samples: 3822895180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 15:02:21,709][15401] Updated weights for policy 0, policy_version 233330 (0.0033) [2024-06-22 15:02:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42871.5, 300 sec: 42986.8). Total num frames: 3822960640. Throughput: 0: 43105.6. Samples: 3823031180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:23,392][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 15:02:24,624][15401] Updated weights for policy 0, policy_version 233340 (0.0030) [2024-06-22 15:02:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3823157248. Throughput: 0: 43142.4. Samples: 3823292100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:28,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 15:02:29,297][15401] Updated weights for policy 0, policy_version 233350 (0.0036) [2024-06-22 15:02:32,138][15401] Updated weights for policy 0, policy_version 233360 (0.0025) [2024-06-22 15:02:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3823386624. Throughput: 0: 42893.7. Samples: 3823542920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:33,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 15:02:36,726][15401] Updated weights for policy 0, policy_version 233370 (0.0033) [2024-06-22 15:02:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3823583232. Throughput: 0: 43116.8. Samples: 3823680840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 15:02:39,919][15401] Updated weights for policy 0, policy_version 233380 (0.0021) [2024-06-22 15:02:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 3823796224. Throughput: 0: 43259.7. Samples: 3823943200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 15:02:44,083][15349] Signal inference workers to stop experience collection... (56550 times) [2024-06-22 15:02:44,126][15401] InferenceWorker_p0-w0: stopping experience collection (56550 times) [2024-06-22 15:02:44,133][15349] Signal inference workers to resume experience collection... (56550 times) [2024-06-22 15:02:44,150][15401] InferenceWorker_p0-w0: resuming experience collection (56550 times) [2024-06-22 15:02:44,286][15401] Updated weights for policy 0, policy_version 233390 (0.0042) [2024-06-22 15:02:47,348][15401] Updated weights for policy 0, policy_version 233400 (0.0046) [2024-06-22 15:02:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 3824041984. Throughput: 0: 43025.8. Samples: 3824190920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:48,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 15:02:51,846][15401] Updated weights for policy 0, policy_version 233410 (0.0040) [2024-06-22 15:02:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3824238592. Throughput: 0: 43360.9. Samples: 3824333860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 15:02:54,909][15401] Updated weights for policy 0, policy_version 233420 (0.0021) [2024-06-22 15:02:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3824435200. Throughput: 0: 43054.2. Samples: 3824583780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:02:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 15:02:59,485][15401] Updated weights for policy 0, policy_version 233430 (0.0037) [2024-06-22 15:03:02,731][15401] Updated weights for policy 0, policy_version 233440 (0.0041) [2024-06-22 15:03:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 3824680960. Throughput: 0: 43136.0. Samples: 3824836300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-22 15:03:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 15:03:07,124][15401] Updated weights for policy 0, policy_version 233450 (0.0032) [2024-06-22 15:03:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3824893952. Throughput: 0: 43175.6. Samples: 3824973980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:08,399][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 15:03:10,653][15401] Updated weights for policy 0, policy_version 233460 (0.0031) [2024-06-22 15:03:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3825090560. Throughput: 0: 43066.2. Samples: 3825230080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 15:03:14,684][15401] Updated weights for policy 0, policy_version 233470 (0.0031) [2024-06-22 15:03:18,166][15401] Updated weights for policy 0, policy_version 233480 (0.0030) [2024-06-22 15:03:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3825336320. Throughput: 0: 43064.0. Samples: 3825480800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:18,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 15:03:22,458][15401] Updated weights for policy 0, policy_version 233490 (0.0040) [2024-06-22 15:03:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 3825516544. Throughput: 0: 42989.9. Samples: 3825615380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 15:03:25,835][15401] Updated weights for policy 0, policy_version 233500 (0.0044) [2024-06-22 15:03:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3825729536. Throughput: 0: 42877.3. Samples: 3825872680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 15:03:30,078][15401] Updated weights for policy 0, policy_version 233510 (0.0044) [2024-06-22 15:03:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3825975296. Throughput: 0: 42916.9. Samples: 3826122180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 15:03:33,464][15401] Updated weights for policy 0, policy_version 233520 (0.0032) [2024-06-22 15:03:37,657][15401] Updated weights for policy 0, policy_version 233530 (0.0031) [2024-06-22 15:03:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3826188288. Throughput: 0: 42759.0. Samples: 3826258020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 15:03:41,134][15401] Updated weights for policy 0, policy_version 233540 (0.0031) [2024-06-22 15:03:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3826368512. Throughput: 0: 42862.8. Samples: 3826512600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 15:03:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233544_3826384896.pth... [2024-06-22 15:03:43,532][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000232917_3816112128.pth [2024-06-22 15:03:45,247][15401] Updated weights for policy 0, policy_version 233550 (0.0023) [2024-06-22 15:03:45,267][15349] Signal inference workers to stop experience collection... (56600 times) [2024-06-22 15:03:45,267][15349] Signal inference workers to resume experience collection... (56600 times) [2024-06-22 15:03:45,315][15401] InferenceWorker_p0-w0: stopping experience collection (56600 times) [2024-06-22 15:03:45,315][15401] InferenceWorker_p0-w0: resuming experience collection (56600 times) [2024-06-22 15:03:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3826614272. Throughput: 0: 42910.7. Samples: 3826767280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 15:03:48,975][15401] Updated weights for policy 0, policy_version 233560 (0.0040) [2024-06-22 15:03:52,882][15401] Updated weights for policy 0, policy_version 233570 (0.0041) [2024-06-22 15:03:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3826810880. Throughput: 0: 42793.8. Samples: 3826899700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 15:03:56,612][15401] Updated weights for policy 0, policy_version 233580 (0.0027) [2024-06-22 15:03:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3827007488. Throughput: 0: 42665.8. Samples: 3827150040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:03:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 15:04:00,684][15401] Updated weights for policy 0, policy_version 233590 (0.0031) [2024-06-22 15:04:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3827253248. Throughput: 0: 42787.6. Samples: 3827406240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:04:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 15:04:04,210][15401] Updated weights for policy 0, policy_version 233600 (0.0028) [2024-06-22 15:04:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3827449856. Throughput: 0: 42808.9. Samples: 3827541780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:04:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 15:04:08,427][15401] Updated weights for policy 0, policy_version 233610 (0.0043) [2024-06-22 15:04:12,079][15401] Updated weights for policy 0, policy_version 233620 (0.0026) [2024-06-22 15:04:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3827646464. Throughput: 0: 42610.3. Samples: 3827790140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 15:04:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 15:04:16,222][15401] Updated weights for policy 0, policy_version 233630 (0.0039) [2024-06-22 15:04:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3827892224. Throughput: 0: 42726.3. Samples: 3828044860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 15:04:20,021][15401] Updated weights for policy 0, policy_version 233640 (0.0023) [2024-06-22 15:04:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3828088832. Throughput: 0: 42666.3. Samples: 3828178000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:23,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 15:04:23,655][15401] Updated weights for policy 0, policy_version 233650 (0.0037) [2024-06-22 15:04:27,558][15401] Updated weights for policy 0, policy_version 233660 (0.0037) [2024-06-22 15:04:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3828285440. Throughput: 0: 42636.0. Samples: 3828431220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 15:04:31,497][15401] Updated weights for policy 0, policy_version 233670 (0.0026) [2024-06-22 15:04:33,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 3828547584. Throughput: 0: 42586.9. Samples: 3828683960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:33,396][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 15:04:35,171][15401] Updated weights for policy 0, policy_version 233680 (0.0042) [2024-06-22 15:04:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3828727808. Throughput: 0: 42667.1. Samples: 3828819720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 15:04:39,085][15401] Updated weights for policy 0, policy_version 233690 (0.0049) [2024-06-22 15:04:43,106][15401] Updated weights for policy 0, policy_version 233700 (0.0045) [2024-06-22 15:04:43,390][15132] Fps is (10 sec: 39346.1, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 3828940800. Throughput: 0: 42704.6. Samples: 3829071760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 15:04:46,611][15401] Updated weights for policy 0, policy_version 233710 (0.0031) [2024-06-22 15:04:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3829186560. Throughput: 0: 42644.0. Samples: 3829325220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 15:04:50,475][15401] Updated weights for policy 0, policy_version 233720 (0.0027) [2024-06-22 15:04:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3829366784. Throughput: 0: 42638.0. Samples: 3829460500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 15:04:54,377][15401] Updated weights for policy 0, policy_version 233730 (0.0031) [2024-06-22 15:04:58,306][15401] Updated weights for policy 0, policy_version 233740 (0.0033) [2024-06-22 15:04:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3829596160. Throughput: 0: 42840.5. Samples: 3829717960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:04:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 15:05:01,853][15401] Updated weights for policy 0, policy_version 233750 (0.0039) [2024-06-22 15:05:03,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 3829809152. Throughput: 0: 42745.2. Samples: 3829968500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:05:03,393][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 15:05:06,378][15401] Updated weights for policy 0, policy_version 233760 (0.0033) [2024-06-22 15:05:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3830022144. Throughput: 0: 42818.1. Samples: 3830104820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:05:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 15:05:09,466][15401] Updated weights for policy 0, policy_version 233770 (0.0031) [2024-06-22 15:05:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3830218752. Throughput: 0: 42707.0. Samples: 3830353040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:05:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 15:05:13,961][15401] Updated weights for policy 0, policy_version 233780 (0.0035) [2024-06-22 15:05:16,769][15349] Signal inference workers to stop experience collection... (56650 times) [2024-06-22 15:05:16,804][15401] InferenceWorker_p0-w0: stopping experience collection (56650 times) [2024-06-22 15:05:16,827][15349] Signal inference workers to resume experience collection... (56650 times) [2024-06-22 15:05:16,828][15401] InferenceWorker_p0-w0: resuming experience collection (56650 times) [2024-06-22 15:05:17,113][15401] Updated weights for policy 0, policy_version 233790 (0.0037) [2024-06-22 15:05:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3830464512. Throughput: 0: 42786.1. Samples: 3830609060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:05:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 15:05:21,837][15401] Updated weights for policy 0, policy_version 233800 (0.0034) [2024-06-22 15:05:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3830661120. Throughput: 0: 42771.5. Samples: 3830744440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-22 15:05:23,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-22 15:05:24,883][15401] Updated weights for policy 0, policy_version 233810 (0.0022) [2024-06-22 15:05:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3830857728. Throughput: 0: 42723.3. Samples: 3830994300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:28,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 15:05:29,147][15401] Updated weights for policy 0, policy_version 233820 (0.0026) [2024-06-22 15:05:32,425][15401] Updated weights for policy 0, policy_version 233830 (0.0037) [2024-06-22 15:05:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 3831087104. Throughput: 0: 43001.0. Samples: 3831260260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 15:05:36,564][15401] Updated weights for policy 0, policy_version 233840 (0.0027) [2024-06-22 15:05:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3831316480. Throughput: 0: 42903.1. Samples: 3831391140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 15:05:40,147][15401] Updated weights for policy 0, policy_version 233850 (0.0050) [2024-06-22 15:05:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3831513088. Throughput: 0: 42778.9. Samples: 3831643020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 15:05:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233857_3831513088.pth... [2024-06-22 15:05:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233229_3821223936.pth [2024-06-22 15:05:44,120][15401] Updated weights for policy 0, policy_version 233860 (0.0033) [2024-06-22 15:05:47,827][15401] Updated weights for policy 0, policy_version 233870 (0.0033) [2024-06-22 15:05:48,396][15132] Fps is (10 sec: 40934.4, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 3831726080. Throughput: 0: 42927.4. Samples: 3831900400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:48,396][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 15:05:52,116][15401] Updated weights for policy 0, policy_version 233880 (0.0034) [2024-06-22 15:05:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3831939072. Throughput: 0: 42759.7. Samples: 3832029000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 15:05:55,602][15401] Updated weights for policy 0, policy_version 233890 (0.0027) [2024-06-22 15:05:58,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3832168448. Throughput: 0: 42916.5. Samples: 3832284280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:05:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 15:05:59,632][15401] Updated weights for policy 0, policy_version 233900 (0.0036) [2024-06-22 15:06:03,294][15401] Updated weights for policy 0, policy_version 233910 (0.0050) [2024-06-22 15:06:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 3832381440. Throughput: 0: 42937.3. Samples: 3832541240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 15:06:07,459][15401] Updated weights for policy 0, policy_version 233920 (0.0036) [2024-06-22 15:06:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 3832594432. Throughput: 0: 42754.3. Samples: 3832668380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 15:06:11,101][15401] Updated weights for policy 0, policy_version 233930 (0.0042) [2024-06-22 15:06:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3832807424. Throughput: 0: 42907.6. Samples: 3832925140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 15:06:15,262][15401] Updated weights for policy 0, policy_version 233940 (0.0028) [2024-06-22 15:06:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 3833004032. Throughput: 0: 42646.6. Samples: 3833179360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 15:06:18,886][15401] Updated weights for policy 0, policy_version 233950 (0.0043) [2024-06-22 15:06:22,790][15401] Updated weights for policy 0, policy_version 233960 (0.0042) [2024-06-22 15:06:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3833249792. Throughput: 0: 42572.6. Samples: 3833306900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 15:06:26,913][15401] Updated weights for policy 0, policy_version 233970 (0.0036) [2024-06-22 15:06:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 3833446400. Throughput: 0: 42747.6. Samples: 3833566760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:28,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 15:06:30,255][15401] Updated weights for policy 0, policy_version 233980 (0.0046) [2024-06-22 15:06:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3833643008. Throughput: 0: 42639.7. Samples: 3833818920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-22 15:06:33,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 15:06:34,391][15401] Updated weights for policy 0, policy_version 233990 (0.0027) [2024-06-22 15:06:37,300][15349] Signal inference workers to stop experience collection... (56700 times) [2024-06-22 15:06:37,300][15349] Signal inference workers to resume experience collection... (56700 times) [2024-06-22 15:06:37,339][15401] InferenceWorker_p0-w0: stopping experience collection (56700 times) [2024-06-22 15:06:37,340][15401] InferenceWorker_p0-w0: resuming experience collection (56700 times) [2024-06-22 15:06:37,718][15401] Updated weights for policy 0, policy_version 234000 (0.0036) [2024-06-22 15:06:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 3833872384. Throughput: 0: 42610.1. Samples: 3833946460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:06:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 15:06:41,944][15401] Updated weights for policy 0, policy_version 234010 (0.0044) [2024-06-22 15:06:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3834068992. Throughput: 0: 42563.9. Samples: 3834199660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:06:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 15:06:45,516][15401] Updated weights for policy 0, policy_version 234020 (0.0039) [2024-06-22 15:06:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 3834298368. Throughput: 0: 42534.8. Samples: 3834455300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:06:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 15:06:49,565][15401] Updated weights for policy 0, policy_version 234030 (0.0039) [2024-06-22 15:06:53,177][15401] Updated weights for policy 0, policy_version 234040 (0.0042) [2024-06-22 15:06:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3834511360. Throughput: 0: 42579.9. Samples: 3834584480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:06:53,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 15:06:57,459][15401] Updated weights for policy 0, policy_version 234050 (0.0031) [2024-06-22 15:06:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 3834724352. Throughput: 0: 42524.9. Samples: 3834838760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:06:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 15:07:01,095][15401] Updated weights for policy 0, policy_version 234060 (0.0030) [2024-06-22 15:07:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3834937344. Throughput: 0: 42433.4. Samples: 3835088860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:03,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-22 15:07:05,435][15401] Updated weights for policy 0, policy_version 234070 (0.0031) [2024-06-22 15:07:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3835133952. Throughput: 0: 42427.6. Samples: 3835216140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 15:07:08,643][15401] Updated weights for policy 0, policy_version 234080 (0.0027) [2024-06-22 15:07:13,069][15401] Updated weights for policy 0, policy_version 234090 (0.0032) [2024-06-22 15:07:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3835363328. Throughput: 0: 42453.3. Samples: 3835477060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:13,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 15:07:16,336][15401] Updated weights for policy 0, policy_version 234100 (0.0036) [2024-06-22 15:07:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3835576320. Throughput: 0: 42261.0. Samples: 3835720660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 15:07:20,694][15401] Updated weights for policy 0, policy_version 234110 (0.0030) [2024-06-22 15:07:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 3835756544. Throughput: 0: 42388.6. Samples: 3835853940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 15:07:24,191][15401] Updated weights for policy 0, policy_version 234120 (0.0039) [2024-06-22 15:07:28,295][15401] Updated weights for policy 0, policy_version 234130 (0.0034) [2024-06-22 15:07:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 3835985920. Throughput: 0: 42396.4. Samples: 3836107500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 15:07:32,123][15401] Updated weights for policy 0, policy_version 234140 (0.0026) [2024-06-22 15:07:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3836198912. Throughput: 0: 42306.9. Samples: 3836359120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 15:07:36,005][15401] Updated weights for policy 0, policy_version 234150 (0.0029) [2024-06-22 15:07:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3836411904. Throughput: 0: 42314.1. Samples: 3836488620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 15:07:39,857][15401] Updated weights for policy 0, policy_version 234160 (0.0031) [2024-06-22 15:07:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3836608512. Throughput: 0: 42256.9. Samples: 3836740320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 15:07:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 15:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234168_3836608512.pth... [2024-06-22 15:07:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233544_3826384896.pth [2024-06-22 15:07:44,072][15401] Updated weights for policy 0, policy_version 234170 (0.0035) [2024-06-22 15:07:45,547][15349] Signal inference workers to stop experience collection... (56750 times) [2024-06-22 15:07:45,555][15349] Signal inference workers to resume experience collection... (56750 times) [2024-06-22 15:07:45,561][15401] InferenceWorker_p0-w0: stopping experience collection (56750 times) [2024-06-22 15:07:45,588][15401] InferenceWorker_p0-w0: resuming experience collection (56750 times) [2024-06-22 15:07:47,495][15401] Updated weights for policy 0, policy_version 234180 (0.0038) [2024-06-22 15:07:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3836854272. Throughput: 0: 42391.5. Samples: 3836996480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:07:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 15:07:51,622][15401] Updated weights for policy 0, policy_version 234190 (0.0029) [2024-06-22 15:07:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3837034496. Throughput: 0: 42602.7. Samples: 3837133260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:07:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 15:07:55,037][15401] Updated weights for policy 0, policy_version 234200 (0.0034) [2024-06-22 15:07:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3837263872. Throughput: 0: 42398.3. Samples: 3837384980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:07:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 15:07:59,102][15401] Updated weights for policy 0, policy_version 234210 (0.0039) [2024-06-22 15:08:02,950][15401] Updated weights for policy 0, policy_version 234220 (0.0048) [2024-06-22 15:08:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3837476864. Throughput: 0: 42646.1. Samples: 3837639740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 15:08:07,091][15401] Updated weights for policy 0, policy_version 234230 (0.0029) [2024-06-22 15:08:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3837673472. Throughput: 0: 42560.7. Samples: 3837769180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 15:08:10,652][15401] Updated weights for policy 0, policy_version 234240 (0.0030) [2024-06-22 15:08:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 3837886464. Throughput: 0: 42342.7. Samples: 3838012920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 15:08:15,043][15401] Updated weights for policy 0, policy_version 234250 (0.0035) [2024-06-22 15:08:18,262][15401] Updated weights for policy 0, policy_version 234260 (0.0033) [2024-06-22 15:08:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3838115840. Throughput: 0: 42461.1. Samples: 3838269860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:18,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 15:08:22,830][15401] Updated weights for policy 0, policy_version 234270 (0.0036) [2024-06-22 15:08:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3838296064. Throughput: 0: 42469.9. Samples: 3838399760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:23,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 15:08:25,835][15401] Updated weights for policy 0, policy_version 234280 (0.0037) [2024-06-22 15:08:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3838541824. Throughput: 0: 42510.7. Samples: 3838653300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 15:08:30,394][15401] Updated weights for policy 0, policy_version 234290 (0.0038) [2024-06-22 15:08:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 3838738432. Throughput: 0: 42632.1. Samples: 3838914920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 15:08:33,596][15401] Updated weights for policy 0, policy_version 234300 (0.0039) [2024-06-22 15:08:37,942][15401] Updated weights for policy 0, policy_version 234310 (0.0038) [2024-06-22 15:08:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 3838951424. Throughput: 0: 42386.7. Samples: 3839040660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:38,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 15:08:41,089][15401] Updated weights for policy 0, policy_version 234320 (0.0029) [2024-06-22 15:08:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 3839197184. Throughput: 0: 42620.5. Samples: 3839302900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 15:08:45,459][15401] Updated weights for policy 0, policy_version 234330 (0.0026) [2024-06-22 15:08:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3839393792. Throughput: 0: 42764.0. Samples: 3839564120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 15:08:48,676][15401] Updated weights for policy 0, policy_version 234340 (0.0047) [2024-06-22 15:08:53,128][15401] Updated weights for policy 0, policy_version 234350 (0.0037) [2024-06-22 15:08:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3839606784. Throughput: 0: 42556.5. Samples: 3839684220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 15:08:53,390][15132] Avg episode reward: [(0, '0.145')] [2024-06-22 15:08:56,236][15401] Updated weights for policy 0, policy_version 234360 (0.0034) [2024-06-22 15:08:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3839852544. Throughput: 0: 43017.4. Samples: 3839948700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:08:58,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 15:09:00,643][15401] Updated weights for policy 0, policy_version 234370 (0.0037) [2024-06-22 15:09:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3840049152. Throughput: 0: 43135.0. Samples: 3840210940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:03,395][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 15:09:03,483][15349] Signal inference workers to stop experience collection... (56800 times) [2024-06-22 15:09:03,531][15401] InferenceWorker_p0-w0: stopping experience collection (56800 times) [2024-06-22 15:09:03,539][15349] Signal inference workers to resume experience collection... (56800 times) [2024-06-22 15:09:03,546][15401] InferenceWorker_p0-w0: resuming experience collection (56800 times) [2024-06-22 15:09:03,699][15401] Updated weights for policy 0, policy_version 234380 (0.0041) [2024-06-22 15:09:08,277][15401] Updated weights for policy 0, policy_version 234390 (0.0046) [2024-06-22 15:09:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3840245760. Throughput: 0: 42901.8. Samples: 3840330340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 15:09:11,346][15401] Updated weights for policy 0, policy_version 234400 (0.0047) [2024-06-22 15:09:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3840475136. Throughput: 0: 43029.7. Samples: 3840589640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:13,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 15:09:15,715][15401] Updated weights for policy 0, policy_version 234410 (0.0030) [2024-06-22 15:09:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3840688128. Throughput: 0: 42944.3. Samples: 3840847420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 15:09:19,479][15401] Updated weights for policy 0, policy_version 234420 (0.0037) [2024-06-22 15:09:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 3840884736. Throughput: 0: 42929.2. Samples: 3840972580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:23,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 15:09:23,914][15401] Updated weights for policy 0, policy_version 234430 (0.0026) [2024-06-22 15:09:26,995][15401] Updated weights for policy 0, policy_version 234440 (0.0033) [2024-06-22 15:09:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 3841114112. Throughput: 0: 42776.4. Samples: 3841227840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 15:09:31,572][15401] Updated weights for policy 0, policy_version 234450 (0.0036) [2024-06-22 15:09:33,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3841310720. Throughput: 0: 42665.9. Samples: 3841484080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 15:09:34,626][15401] Updated weights for policy 0, policy_version 234460 (0.0038) [2024-06-22 15:09:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 3841523712. Throughput: 0: 42836.5. Samples: 3841611860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 15:09:39,010][15401] Updated weights for policy 0, policy_version 234470 (0.0037) [2024-06-22 15:09:42,213][15401] Updated weights for policy 0, policy_version 234480 (0.0022) [2024-06-22 15:09:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 3841769472. Throughput: 0: 42596.7. Samples: 3841865560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 15:09:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234483_3841769472.pth... [2024-06-22 15:09:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000233857_3831513088.pth [2024-06-22 15:09:46,684][15401] Updated weights for policy 0, policy_version 234490 (0.0038) [2024-06-22 15:09:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3841949696. Throughput: 0: 42591.2. Samples: 3842127540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 15:09:49,854][15401] Updated weights for policy 0, policy_version 234500 (0.0026) [2024-06-22 15:09:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 3842162688. Throughput: 0: 42735.6. Samples: 3842253440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 15:09:54,236][15401] Updated weights for policy 0, policy_version 234510 (0.0032) [2024-06-22 15:09:57,502][15401] Updated weights for policy 0, policy_version 234520 (0.0029) [2024-06-22 15:09:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 3842408448. Throughput: 0: 42561.3. Samples: 3842504900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:09:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 15:10:01,834][15401] Updated weights for policy 0, policy_version 234530 (0.0046) [2024-06-22 15:10:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3842588672. Throughput: 0: 42644.4. Samples: 3842766420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:10:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 15:10:05,059][15401] Updated weights for policy 0, policy_version 234540 (0.0048) [2024-06-22 15:10:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3842801664. Throughput: 0: 42631.0. Samples: 3842890880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 15:10:09,615][15401] Updated weights for policy 0, policy_version 234550 (0.0037) [2024-06-22 15:10:12,591][15401] Updated weights for policy 0, policy_version 234560 (0.0039) [2024-06-22 15:10:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3843031040. Throughput: 0: 42595.6. Samples: 3843144640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:13,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 15:10:17,322][15401] Updated weights for policy 0, policy_version 234570 (0.0036) [2024-06-22 15:10:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3843244032. Throughput: 0: 42566.5. Samples: 3843399580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 15:10:20,706][15401] Updated weights for policy 0, policy_version 234580 (0.0042) [2024-06-22 15:10:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 3843440640. Throughput: 0: 42555.9. Samples: 3843526880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 15:10:24,963][15401] Updated weights for policy 0, policy_version 234590 (0.0048) [2024-06-22 15:10:26,729][15349] Signal inference workers to stop experience collection... (56850 times) [2024-06-22 15:10:26,734][15349] Signal inference workers to resume experience collection... (56850 times) [2024-06-22 15:10:26,757][15401] InferenceWorker_p0-w0: stopping experience collection (56850 times) [2024-06-22 15:10:26,757][15401] InferenceWorker_p0-w0: resuming experience collection (56850 times) [2024-06-22 15:10:28,190][15401] Updated weights for policy 0, policy_version 234600 (0.0044) [2024-06-22 15:10:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3843686400. Throughput: 0: 42632.9. Samples: 3843784040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:28,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 15:10:32,587][15401] Updated weights for policy 0, policy_version 234610 (0.0036) [2024-06-22 15:10:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3843866624. Throughput: 0: 42520.9. Samples: 3844040980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:33,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 15:10:35,939][15401] Updated weights for policy 0, policy_version 234620 (0.0027) [2024-06-22 15:10:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 3844063232. Throughput: 0: 42405.6. Samples: 3844161700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 15:10:40,095][15401] Updated weights for policy 0, policy_version 234630 (0.0044) [2024-06-22 15:10:43,391][15132] Fps is (10 sec: 44231.2, 60 sec: 42324.6, 300 sec: 42654.7). Total num frames: 3844308992. Throughput: 0: 42475.7. Samples: 3844416360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:43,391][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 15:10:43,572][15401] Updated weights for policy 0, policy_version 234640 (0.0039) [2024-06-22 15:10:47,878][15401] Updated weights for policy 0, policy_version 234650 (0.0031) [2024-06-22 15:10:48,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 3844521984. Throughput: 0: 42361.0. Samples: 3844672760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:48,392][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 15:10:51,593][15401] Updated weights for policy 0, policy_version 234660 (0.0032) [2024-06-22 15:10:53,394][15132] Fps is (10 sec: 40948.4, 60 sec: 42595.5, 300 sec: 42542.3). Total num frames: 3844718592. Throughput: 0: 42366.6. Samples: 3844797540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:53,394][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 15:10:55,605][15401] Updated weights for policy 0, policy_version 234670 (0.0053) [2024-06-22 15:10:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3844947968. Throughput: 0: 42504.8. Samples: 3845057360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:10:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 15:10:59,238][15401] Updated weights for policy 0, policy_version 234680 (0.0038) [2024-06-22 15:11:03,390][15132] Fps is (10 sec: 42615.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 3845144576. Throughput: 0: 42577.8. Samples: 3845315580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:11:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 15:11:03,752][15401] Updated weights for policy 0, policy_version 234690 (0.0035) [2024-06-22 15:11:07,020][15401] Updated weights for policy 0, policy_version 234700 (0.0035) [2024-06-22 15:11:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3845357568. Throughput: 0: 42408.1. Samples: 3845435240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:11:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 15:11:11,300][15401] Updated weights for policy 0, policy_version 234710 (0.0041) [2024-06-22 15:11:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3845586944. Throughput: 0: 42464.1. Samples: 3845694920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:11:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 15:11:14,856][15401] Updated weights for policy 0, policy_version 234720 (0.0044) [2024-06-22 15:11:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 3845767168. Throughput: 0: 42624.0. Samples: 3845959060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 15:11:19,008][15401] Updated weights for policy 0, policy_version 234730 (0.0047) [2024-06-22 15:11:22,558][15401] Updated weights for policy 0, policy_version 234740 (0.0032) [2024-06-22 15:11:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 3846012928. Throughput: 0: 42606.3. Samples: 3846078980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 15:11:26,497][15401] Updated weights for policy 0, policy_version 234750 (0.0038) [2024-06-22 15:11:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3846225920. Throughput: 0: 42743.8. Samples: 3846339780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:28,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 15:11:30,079][15401] Updated weights for policy 0, policy_version 234760 (0.0034) [2024-06-22 15:11:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 3846406144. Throughput: 0: 42843.6. Samples: 3846600620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 15:11:34,066][15401] Updated weights for policy 0, policy_version 234770 (0.0024) [2024-06-22 15:11:37,514][15401] Updated weights for policy 0, policy_version 234780 (0.0031) [2024-06-22 15:11:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 3846651904. Throughput: 0: 42740.7. Samples: 3846720700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:38,391][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 15:11:42,000][15401] Updated weights for policy 0, policy_version 234790 (0.0033) [2024-06-22 15:11:43,006][15349] Signal inference workers to stop experience collection... (56900 times) [2024-06-22 15:11:43,053][15401] InferenceWorker_p0-w0: stopping experience collection (56900 times) [2024-06-22 15:11:43,064][15349] Signal inference workers to resume experience collection... (56900 times) [2024-06-22 15:11:43,069][15401] InferenceWorker_p0-w0: resuming experience collection (56900 times) [2024-06-22 15:11:43,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42872.3, 300 sec: 42653.9). Total num frames: 3846881280. Throughput: 0: 42601.8. Samples: 3846974440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:43,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-22 15:11:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234795_3846881280.pth... [2024-06-22 15:11:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234168_3836608512.pth [2024-06-22 15:11:45,252][15401] Updated weights for policy 0, policy_version 234800 (0.0037) [2024-06-22 15:11:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 3847061504. Throughput: 0: 42630.3. Samples: 3847234040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:48,392][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 15:11:49,519][15401] Updated weights for policy 0, policy_version 234810 (0.0030) [2024-06-22 15:11:53,127][15401] Updated weights for policy 0, policy_version 234820 (0.0030) [2024-06-22 15:11:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42874.4, 300 sec: 42598.4). Total num frames: 3847290880. Throughput: 0: 42599.2. Samples: 3847352200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 15:11:57,402][15401] Updated weights for policy 0, policy_version 234830 (0.0037) [2024-06-22 15:11:58,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3847503872. Throughput: 0: 42676.0. Samples: 3847615340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:11:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 15:12:00,936][15401] Updated weights for policy 0, policy_version 234840 (0.0035) [2024-06-22 15:12:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3847684096. Throughput: 0: 42458.5. Samples: 3847869700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:12:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 15:12:05,202][15401] Updated weights for policy 0, policy_version 234850 (0.0030) [2024-06-22 15:12:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3847929856. Throughput: 0: 42423.9. Samples: 3847988060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:12:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 15:12:08,794][15401] Updated weights for policy 0, policy_version 234860 (0.0022) [2024-06-22 15:12:12,851][15401] Updated weights for policy 0, policy_version 234870 (0.0033) [2024-06-22 15:12:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 3848126464. Throughput: 0: 42435.8. Samples: 3848249400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:12:13,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 15:12:16,619][15401] Updated weights for policy 0, policy_version 234880 (0.0028) [2024-06-22 15:12:18,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3848323072. Throughput: 0: 42300.2. Samples: 3848504120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:12:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 15:12:20,573][15401] Updated weights for policy 0, policy_version 234890 (0.0032) [2024-06-22 15:12:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3848568832. Throughput: 0: 42368.1. Samples: 3848627260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:23,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 15:12:24,019][15401] Updated weights for policy 0, policy_version 234900 (0.0028) [2024-06-22 15:12:28,135][15401] Updated weights for policy 0, policy_version 234910 (0.0041) [2024-06-22 15:12:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 3848765440. Throughput: 0: 42457.8. Samples: 3848885040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 15:12:32,166][15401] Updated weights for policy 0, policy_version 234920 (0.0032) [2024-06-22 15:12:33,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 3848945664. Throughput: 0: 42428.8. Samples: 3849143240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 15:12:35,766][15401] Updated weights for policy 0, policy_version 234930 (0.0033) [2024-06-22 15:12:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3849191424. Throughput: 0: 42556.9. Samples: 3849267260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 15:12:39,702][15401] Updated weights for policy 0, policy_version 234940 (0.0036) [2024-06-22 15:12:43,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 3849404416. Throughput: 0: 42467.5. Samples: 3849526380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 15:12:43,439][15401] Updated weights for policy 0, policy_version 234950 (0.0034) [2024-06-22 15:12:47,294][15401] Updated weights for policy 0, policy_version 234960 (0.0042) [2024-06-22 15:12:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 3849601024. Throughput: 0: 42580.9. Samples: 3849785840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 15:12:51,153][15401] Updated weights for policy 0, policy_version 234970 (0.0030) [2024-06-22 15:12:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3849830400. Throughput: 0: 42608.6. Samples: 3849905440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 15:12:54,854][15401] Updated weights for policy 0, policy_version 234980 (0.0029) [2024-06-22 15:12:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 3850043392. Throughput: 0: 42637.4. Samples: 3850168180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:12:58,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 15:12:58,875][15401] Updated weights for policy 0, policy_version 234990 (0.0031) [2024-06-22 15:13:02,719][15401] Updated weights for policy 0, policy_version 235000 (0.0021) [2024-06-22 15:13:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3850240000. Throughput: 0: 42663.0. Samples: 3850423960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 15:13:06,804][15401] Updated weights for policy 0, policy_version 235010 (0.0036) [2024-06-22 15:13:08,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3850485760. Throughput: 0: 42766.6. Samples: 3850551760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 15:13:10,224][15401] Updated weights for policy 0, policy_version 235020 (0.0031) [2024-06-22 15:13:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3850682368. Throughput: 0: 42844.2. Samples: 3850813040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 15:13:14,335][15401] Updated weights for policy 0, policy_version 235030 (0.0023) [2024-06-22 15:13:15,748][15349] Signal inference workers to stop experience collection... (56950 times) [2024-06-22 15:13:15,768][15401] InferenceWorker_p0-w0: stopping experience collection (56950 times) [2024-06-22 15:13:15,807][15349] Signal inference workers to resume experience collection... (56950 times) [2024-06-22 15:13:15,808][15401] InferenceWorker_p0-w0: resuming experience collection (56950 times) [2024-06-22 15:13:17,810][15401] Updated weights for policy 0, policy_version 235040 (0.0024) [2024-06-22 15:13:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 3850895360. Throughput: 0: 42692.6. Samples: 3851064400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 15:13:21,912][15401] Updated weights for policy 0, policy_version 235050 (0.0027) [2024-06-22 15:13:23,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3851124736. Throughput: 0: 42845.9. Samples: 3851195320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:23,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 15:13:25,418][15401] Updated weights for policy 0, policy_version 235060 (0.0035) [2024-06-22 15:13:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3851321344. Throughput: 0: 42936.5. Samples: 3851458520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 15:13:28,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 15:13:29,480][15401] Updated weights for policy 0, policy_version 235070 (0.0026) [2024-06-22 15:13:33,020][15401] Updated weights for policy 0, policy_version 235080 (0.0025) [2024-06-22 15:13:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 3851550720. Throughput: 0: 42711.1. Samples: 3851707840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:33,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-22 15:13:37,229][15401] Updated weights for policy 0, policy_version 235090 (0.0039) [2024-06-22 15:13:38,390][15132] Fps is (10 sec: 44234.0, 60 sec: 42871.0, 300 sec: 42598.3). Total num frames: 3851763712. Throughput: 0: 42942.9. Samples: 3851837900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:38,391][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 15:13:40,731][15401] Updated weights for policy 0, policy_version 235100 (0.0029) [2024-06-22 15:13:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 3851976704. Throughput: 0: 42943.1. Samples: 3852100620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:43,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 15:13:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235106_3851976704.pth... [2024-06-22 15:13:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234483_3841769472.pth [2024-06-22 15:13:44,991][15401] Updated weights for policy 0, policy_version 235110 (0.0034) [2024-06-22 15:13:48,392][15132] Fps is (10 sec: 42590.7, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 3852189696. Throughput: 0: 42732.8. Samples: 3852347040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:48,393][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 15:13:48,455][15401] Updated weights for policy 0, policy_version 235120 (0.0029) [2024-06-22 15:13:52,866][15401] Updated weights for policy 0, policy_version 235130 (0.0044) [2024-06-22 15:13:53,390][15132] Fps is (10 sec: 40967.9, 60 sec: 42598.0, 300 sec: 42487.2). Total num frames: 3852386304. Throughput: 0: 42666.2. Samples: 3852471760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:53,391][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 15:13:55,964][15401] Updated weights for policy 0, policy_version 235140 (0.0034) [2024-06-22 15:13:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 3852599296. Throughput: 0: 42696.1. Samples: 3852734360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:13:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 15:14:00,529][15401] Updated weights for policy 0, policy_version 235150 (0.0034) [2024-06-22 15:14:03,389][15132] Fps is (10 sec: 44239.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 3852828672. Throughput: 0: 42600.2. Samples: 3852981400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 15:14:03,915][15401] Updated weights for policy 0, policy_version 235160 (0.0031) [2024-06-22 15:14:08,143][15401] Updated weights for policy 0, policy_version 235170 (0.0024) [2024-06-22 15:14:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3853041664. Throughput: 0: 42597.2. Samples: 3853112200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 15:14:11,742][15401] Updated weights for policy 0, policy_version 235180 (0.0038) [2024-06-22 15:14:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 3853238272. Throughput: 0: 42527.5. Samples: 3853372260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:13,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-22 15:14:15,829][15401] Updated weights for policy 0, policy_version 235190 (0.0033) [2024-06-22 15:14:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 3853467648. Throughput: 0: 42598.3. Samples: 3853624760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:18,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 15:14:19,601][15401] Updated weights for policy 0, policy_version 235200 (0.0036) [2024-06-22 15:14:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 3853664256. Throughput: 0: 42545.0. Samples: 3853752400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 15:14:23,564][15401] Updated weights for policy 0, policy_version 235210 (0.0039) [2024-06-22 15:14:24,490][15349] Signal inference workers to stop experience collection... (57000 times) [2024-06-22 15:14:24,537][15401] InferenceWorker_p0-w0: stopping experience collection (57000 times) [2024-06-22 15:14:24,546][15349] Signal inference workers to resume experience collection... (57000 times) [2024-06-22 15:14:24,553][15401] InferenceWorker_p0-w0: resuming experience collection (57000 times) [2024-06-22 15:14:27,425][15401] Updated weights for policy 0, policy_version 235220 (0.0037) [2024-06-22 15:14:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3853877248. Throughput: 0: 42501.4. Samples: 3854013080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 15:14:31,398][15401] Updated weights for policy 0, policy_version 235230 (0.0026) [2024-06-22 15:14:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3854106624. Throughput: 0: 42675.1. Samples: 3854267320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 15:14:34,837][15401] Updated weights for policy 0, policy_version 235240 (0.0031) [2024-06-22 15:14:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.7, 300 sec: 42487.3). Total num frames: 3854303232. Throughput: 0: 42853.7. Samples: 3854400160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 15:14:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 15:14:38,864][15401] Updated weights for policy 0, policy_version 235250 (0.0030) [2024-06-22 15:14:42,311][15401] Updated weights for policy 0, policy_version 235260 (0.0030) [2024-06-22 15:14:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 3854532608. Throughput: 0: 42870.6. Samples: 3854663540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:14:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 15:14:46,311][15401] Updated weights for policy 0, policy_version 235270 (0.0036) [2024-06-22 15:14:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 3854761984. Throughput: 0: 42934.1. Samples: 3854913440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:14:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 15:14:49,921][15401] Updated weights for policy 0, policy_version 235280 (0.0022) [2024-06-22 15:14:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.8, 300 sec: 42487.3). Total num frames: 3854942208. Throughput: 0: 42952.1. Samples: 3855045040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:14:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 15:14:54,026][15401] Updated weights for policy 0, policy_version 235290 (0.0036) [2024-06-22 15:14:57,959][15401] Updated weights for policy 0, policy_version 235300 (0.0041) [2024-06-22 15:14:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3855171584. Throughput: 0: 42932.0. Samples: 3855304200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:14:58,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 15:15:01,487][15401] Updated weights for policy 0, policy_version 235310 (0.0026) [2024-06-22 15:15:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 3855384576. Throughput: 0: 43093.9. Samples: 3855563980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:03,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 15:15:05,344][15401] Updated weights for policy 0, policy_version 235320 (0.0032) [2024-06-22 15:15:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3855597568. Throughput: 0: 43102.7. Samples: 3855692020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:08,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 15:15:08,959][15401] Updated weights for policy 0, policy_version 235330 (0.0043) [2024-06-22 15:15:12,715][15401] Updated weights for policy 0, policy_version 235340 (0.0045) [2024-06-22 15:15:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 3855826944. Throughput: 0: 43140.9. Samples: 3855954420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 15:15:16,642][15401] Updated weights for policy 0, policy_version 235350 (0.0039) [2024-06-22 15:15:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3856039936. Throughput: 0: 43198.7. Samples: 3856211260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 15:15:20,249][15401] Updated weights for policy 0, policy_version 235360 (0.0033) [2024-06-22 15:15:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3856252928. Throughput: 0: 43031.1. Samples: 3856336560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:15:24,403][15401] Updated weights for policy 0, policy_version 235370 (0.0038) [2024-06-22 15:15:27,988][15401] Updated weights for policy 0, policy_version 235380 (0.0030) [2024-06-22 15:15:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3856465920. Throughput: 0: 43068.9. Samples: 3856601640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 15:15:31,986][15401] Updated weights for policy 0, policy_version 235390 (0.0023) [2024-06-22 15:15:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3856678912. Throughput: 0: 43007.9. Samples: 3856848800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 15:15:35,622][15401] Updated weights for policy 0, policy_version 235400 (0.0029) [2024-06-22 15:15:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42709.6). Total num frames: 3856908288. Throughput: 0: 42962.1. Samples: 3856978340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 15:15:39,501][15401] Updated weights for policy 0, policy_version 235410 (0.0039) [2024-06-22 15:15:43,153][15401] Updated weights for policy 0, policy_version 235420 (0.0040) [2024-06-22 15:15:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 3857121280. Throughput: 0: 43053.8. Samples: 3857241620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:43,394][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 15:15:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235420_3857121280.pth... [2024-06-22 15:15:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000234795_3846881280.pth [2024-06-22 15:15:47,322][15401] Updated weights for policy 0, policy_version 235430 (0.0039) [2024-06-22 15:15:48,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42765.3). Total num frames: 3857334272. Throughput: 0: 42797.2. Samples: 3857489960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 15:15:48,393][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 15:15:50,883][15401] Updated weights for policy 0, policy_version 235440 (0.0049) [2024-06-22 15:15:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 3857547264. Throughput: 0: 42858.8. Samples: 3857620660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:15:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 15:15:54,949][15401] Updated weights for policy 0, policy_version 235450 (0.0040) [2024-06-22 15:15:55,254][15349] Signal inference workers to stop experience collection... (57050 times) [2024-06-22 15:15:55,298][15401] InferenceWorker_p0-w0: stopping experience collection (57050 times) [2024-06-22 15:15:55,312][15349] Signal inference workers to resume experience collection... (57050 times) [2024-06-22 15:15:55,313][15401] InferenceWorker_p0-w0: resuming experience collection (57050 times) [2024-06-22 15:15:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3857743872. Throughput: 0: 42782.6. Samples: 3857879640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:15:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 15:15:58,708][15401] Updated weights for policy 0, policy_version 235460 (0.0033) [2024-06-22 15:16:02,267][15401] Updated weights for policy 0, policy_version 235470 (0.0032) [2024-06-22 15:16:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3857989632. Throughput: 0: 42779.2. Samples: 3858136320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:03,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 15:16:06,564][15401] Updated weights for policy 0, policy_version 235480 (0.0045) [2024-06-22 15:16:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3858202624. Throughput: 0: 42941.9. Samples: 3858268940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 15:16:09,766][15401] Updated weights for policy 0, policy_version 235490 (0.0028) [2024-06-22 15:16:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3858382848. Throughput: 0: 42669.4. Samples: 3858521760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 15:16:14,085][15401] Updated weights for policy 0, policy_version 235500 (0.0033) [2024-06-22 15:16:17,591][15401] Updated weights for policy 0, policy_version 235510 (0.0029) [2024-06-22 15:16:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3858628608. Throughput: 0: 42813.0. Samples: 3858775380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 15:16:22,000][15401] Updated weights for policy 0, policy_version 235520 (0.0026) [2024-06-22 15:16:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3858825216. Throughput: 0: 43006.8. Samples: 3858913640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 15:16:25,270][15401] Updated weights for policy 0, policy_version 235530 (0.0037) [2024-06-22 15:16:28,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3859005440. Throughput: 0: 42586.6. Samples: 3859158020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 15:16:29,484][15401] Updated weights for policy 0, policy_version 235540 (0.0024) [2024-06-22 15:16:32,742][15401] Updated weights for policy 0, policy_version 235550 (0.0032) [2024-06-22 15:16:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3859267584. Throughput: 0: 42875.1. Samples: 3859419240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 15:16:36,870][15401] Updated weights for policy 0, policy_version 235560 (0.0031) [2024-06-22 15:16:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3859480576. Throughput: 0: 43097.7. Samples: 3859560060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 15:16:40,388][15401] Updated weights for policy 0, policy_version 235570 (0.0038) [2024-06-22 15:16:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 3859677184. Throughput: 0: 42744.1. Samples: 3859803120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 15:16:44,833][15401] Updated weights for policy 0, policy_version 235580 (0.0036) [2024-06-22 15:16:47,835][15401] Updated weights for policy 0, policy_version 235590 (0.0029) [2024-06-22 15:16:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 3859906560. Throughput: 0: 42868.8. Samples: 3860065420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 15:16:52,336][15401] Updated weights for policy 0, policy_version 235600 (0.0028) [2024-06-22 15:16:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 3860119552. Throughput: 0: 43010.5. Samples: 3860204420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:16:55,600][15401] Updated weights for policy 0, policy_version 235610 (0.0037) [2024-06-22 15:16:58,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 3860316160. Throughput: 0: 42898.7. Samples: 3860452480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:16:58,396][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 15:16:59,855][15401] Updated weights for policy 0, policy_version 235620 (0.0042) [2024-06-22 15:17:03,155][15401] Updated weights for policy 0, policy_version 235630 (0.0044) [2024-06-22 15:17:03,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3860561920. Throughput: 0: 42908.8. Samples: 3860706380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:03,393][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 15:17:07,442][15401] Updated weights for policy 0, policy_version 235640 (0.0053) [2024-06-22 15:17:08,392][15132] Fps is (10 sec: 44254.5, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 3860758528. Throughput: 0: 42885.6. Samples: 3860843600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:08,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 15:17:11,134][15401] Updated weights for policy 0, policy_version 235650 (0.0029) [2024-06-22 15:17:13,392][15132] Fps is (10 sec: 40959.9, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 3860971520. Throughput: 0: 43131.6. Samples: 3861099040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:13,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 15:17:14,923][15401] Updated weights for policy 0, policy_version 235660 (0.0031) [2024-06-22 15:17:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3861200896. Throughput: 0: 42845.8. Samples: 3861347300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 15:17:18,822][15401] Updated weights for policy 0, policy_version 235670 (0.0037) [2024-06-22 15:17:22,785][15401] Updated weights for policy 0, policy_version 235680 (0.0035) [2024-06-22 15:17:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3861397504. Throughput: 0: 42728.8. Samples: 3861482860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 15:17:24,705][15349] Signal inference workers to stop experience collection... (57100 times) [2024-06-22 15:17:24,705][15349] Signal inference workers to resume experience collection... (57100 times) [2024-06-22 15:17:24,746][15401] InferenceWorker_p0-w0: stopping experience collection (57100 times) [2024-06-22 15:17:24,746][15401] InferenceWorker_p0-w0: resuming experience collection (57100 times) [2024-06-22 15:17:26,320][15401] Updated weights for policy 0, policy_version 235690 (0.0035) [2024-06-22 15:17:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 3861626880. Throughput: 0: 43000.2. Samples: 3861738140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 15:17:30,271][15401] Updated weights for policy 0, policy_version 235700 (0.0032) [2024-06-22 15:17:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3861839872. Throughput: 0: 42644.9. Samples: 3861984440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 15:17:34,194][15401] Updated weights for policy 0, policy_version 235710 (0.0034) [2024-06-22 15:17:38,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3862020096. Throughput: 0: 42474.9. Samples: 3862115780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 15:17:38,410][15401] Updated weights for policy 0, policy_version 235720 (0.0030) [2024-06-22 15:17:41,772][15401] Updated weights for policy 0, policy_version 235730 (0.0029) [2024-06-22 15:17:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.7, 300 sec: 42986.8). Total num frames: 3862282240. Throughput: 0: 42575.7. Samples: 3862368220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:43,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 15:17:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235735_3862282240.pth... [2024-06-22 15:17:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235106_3851976704.pth [2024-06-22 15:17:46,206][15401] Updated weights for policy 0, policy_version 235740 (0.0032) [2024-06-22 15:17:48,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3862462464. Throughput: 0: 42555.4. Samples: 3862621280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 15:17:49,501][15401] Updated weights for policy 0, policy_version 235750 (0.0032) [2024-06-22 15:17:53,389][15132] Fps is (10 sec: 37692.9, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 3862659072. Throughput: 0: 42411.2. Samples: 3862752000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 15:17:53,649][15401] Updated weights for policy 0, policy_version 235760 (0.0040) [2024-06-22 15:17:57,226][15401] Updated weights for policy 0, policy_version 235770 (0.0030) [2024-06-22 15:17:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43149.1, 300 sec: 42931.6). Total num frames: 3862904832. Throughput: 0: 42464.5. Samples: 3863009840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:17:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 15:18:01,128][15401] Updated weights for policy 0, policy_version 235780 (0.0032) [2024-06-22 15:18:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 3863117824. Throughput: 0: 42705.3. Samples: 3863269040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:18:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 15:18:04,966][15401] Updated weights for policy 0, policy_version 235790 (0.0024) [2024-06-22 15:18:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3863314432. Throughput: 0: 42534.3. Samples: 3863396900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 15:18:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 15:18:08,671][15401] Updated weights for policy 0, policy_version 235800 (0.0037) [2024-06-22 15:18:12,667][15401] Updated weights for policy 0, policy_version 235810 (0.0039) [2024-06-22 15:18:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 3863543808. Throughput: 0: 42645.5. Samples: 3863657180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 15:18:16,213][15401] Updated weights for policy 0, policy_version 235820 (0.0026) [2024-06-22 15:18:18,391][15132] Fps is (10 sec: 42590.0, 60 sec: 42324.0, 300 sec: 42764.7). Total num frames: 3863740416. Throughput: 0: 42939.0. Samples: 3863916780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:18,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 15:18:20,319][15401] Updated weights for policy 0, policy_version 235830 (0.0043) [2024-06-22 15:18:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3863969792. Throughput: 0: 42767.0. Samples: 3864040300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 15:18:23,815][15401] Updated weights for policy 0, policy_version 235840 (0.0032) [2024-06-22 15:18:27,824][15401] Updated weights for policy 0, policy_version 235850 (0.0038) [2024-06-22 15:18:28,390][15132] Fps is (10 sec: 44245.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3864182784. Throughput: 0: 43118.4. Samples: 3864308440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 15:18:31,411][15401] Updated weights for policy 0, policy_version 235860 (0.0034) [2024-06-22 15:18:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 3864379392. Throughput: 0: 43242.3. Samples: 3864567180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 15:18:35,550][15401] Updated weights for policy 0, policy_version 235870 (0.0031) [2024-06-22 15:18:36,754][15349] Signal inference workers to stop experience collection... (57150 times) [2024-06-22 15:18:36,795][15401] InferenceWorker_p0-w0: stopping experience collection (57150 times) [2024-06-22 15:18:36,803][15349] Signal inference workers to resume experience collection... (57150 times) [2024-06-22 15:18:36,818][15401] InferenceWorker_p0-w0: resuming experience collection (57150 times) [2024-06-22 15:18:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 3864625152. Throughput: 0: 43103.1. Samples: 3864691640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 15:18:39,049][15401] Updated weights for policy 0, policy_version 235880 (0.0046) [2024-06-22 15:18:43,136][15401] Updated weights for policy 0, policy_version 235890 (0.0027) [2024-06-22 15:18:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.1, 300 sec: 42820.9). Total num frames: 3864821760. Throughput: 0: 43294.7. Samples: 3864958100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 15:18:46,686][15401] Updated weights for policy 0, policy_version 235900 (0.0022) [2024-06-22 15:18:48,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42875.2). Total num frames: 3865034752. Throughput: 0: 43147.8. Samples: 3865210960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:48,396][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 15:18:50,860][15401] Updated weights for policy 0, policy_version 235910 (0.0037) [2024-06-22 15:18:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 3865280512. Throughput: 0: 43231.6. Samples: 3865342320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 15:18:54,260][15401] Updated weights for policy 0, policy_version 235920 (0.0032) [2024-06-22 15:18:58,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3865460736. Throughput: 0: 43263.1. Samples: 3865604020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:18:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 15:18:58,410][15401] Updated weights for policy 0, policy_version 235930 (0.0029) [2024-06-22 15:19:01,929][15401] Updated weights for policy 0, policy_version 235940 (0.0033) [2024-06-22 15:19:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 3865690112. Throughput: 0: 43167.6. Samples: 3865859340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:19:03,393][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 15:19:05,968][15401] Updated weights for policy 0, policy_version 235950 (0.0029) [2024-06-22 15:19:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3865903104. Throughput: 0: 43283.1. Samples: 3865988040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:19:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 15:19:09,608][15401] Updated weights for policy 0, policy_version 235960 (0.0039) [2024-06-22 15:19:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3866116096. Throughput: 0: 42948.4. Samples: 3866241120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:19:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 15:19:13,767][15401] Updated weights for policy 0, policy_version 235970 (0.0029) [2024-06-22 15:19:17,271][15401] Updated weights for policy 0, policy_version 235980 (0.0028) [2024-06-22 15:19:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.0, 300 sec: 42931.6). Total num frames: 3866329088. Throughput: 0: 42921.0. Samples: 3866498620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-22 15:19:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 15:19:21,657][15401] Updated weights for policy 0, policy_version 235990 (0.0035) [2024-06-22 15:19:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3866558464. Throughput: 0: 43084.4. Samples: 3866630440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 15:19:24,758][15401] Updated weights for policy 0, policy_version 236000 (0.0038) [2024-06-22 15:19:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3866755072. Throughput: 0: 42748.9. Samples: 3866881800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 15:19:29,192][15401] Updated weights for policy 0, policy_version 236010 (0.0032) [2024-06-22 15:19:32,259][15401] Updated weights for policy 0, policy_version 236020 (0.0033) [2024-06-22 15:19:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 3866968064. Throughput: 0: 42846.5. Samples: 3867138880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:33,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 15:19:36,674][15401] Updated weights for policy 0, policy_version 236030 (0.0032) [2024-06-22 15:19:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3867164672. Throughput: 0: 42826.2. Samples: 3867269500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 15:19:39,816][15401] Updated weights for policy 0, policy_version 236040 (0.0032) [2024-06-22 15:19:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3867410432. Throughput: 0: 42609.3. Samples: 3867521440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 15:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236048_3867410432.pth... [2024-06-22 15:19:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235420_3857121280.pth [2024-06-22 15:19:44,060][15401] Updated weights for policy 0, policy_version 236050 (0.0028) [2024-06-22 15:19:47,423][15401] Updated weights for policy 0, policy_version 236060 (0.0034) [2024-06-22 15:19:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43149.1, 300 sec: 42987.2). Total num frames: 3867623424. Throughput: 0: 42593.9. Samples: 3867775960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 15:19:51,515][15401] Updated weights for policy 0, policy_version 236070 (0.0035) [2024-06-22 15:19:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3867820032. Throughput: 0: 42688.8. Samples: 3867909040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 15:19:55,029][15401] Updated weights for policy 0, policy_version 236080 (0.0025) [2024-06-22 15:19:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3868065792. Throughput: 0: 42893.8. Samples: 3868171340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:19:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 15:19:59,745][15401] Updated weights for policy 0, policy_version 236090 (0.0030) [2024-06-22 15:20:02,885][15401] Updated weights for policy 0, policy_version 236100 (0.0038) [2024-06-22 15:20:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 3868278784. Throughput: 0: 42713.8. Samples: 3868420740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 15:20:05,170][15349] Signal inference workers to stop experience collection... (57200 times) [2024-06-22 15:20:05,172][15349] Signal inference workers to resume experience collection... (57200 times) [2024-06-22 15:20:05,195][15401] InferenceWorker_p0-w0: stopping experience collection (57200 times) [2024-06-22 15:20:05,195][15401] InferenceWorker_p0-w0: resuming experience collection (57200 times) [2024-06-22 15:20:07,217][15401] Updated weights for policy 0, policy_version 236110 (0.0038) [2024-06-22 15:20:08,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3868459008. Throughput: 0: 42787.9. Samples: 3868555900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:08,396][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 15:20:10,492][15401] Updated weights for policy 0, policy_version 236120 (0.0028) [2024-06-22 15:20:13,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43413.0, 300 sec: 42986.2). Total num frames: 3868721152. Throughput: 0: 43017.0. Samples: 3868817840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:13,396][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 15:20:15,195][15401] Updated weights for policy 0, policy_version 236130 (0.0026) [2024-06-22 15:20:18,115][15401] Updated weights for policy 0, policy_version 236140 (0.0036) [2024-06-22 15:20:18,396][15132] Fps is (10 sec: 45846.7, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 3868917760. Throughput: 0: 42823.3. Samples: 3869066100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:18,397][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 15:20:22,792][15401] Updated weights for policy 0, policy_version 236150 (0.0031) [2024-06-22 15:20:23,390][15132] Fps is (10 sec: 37707.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3869097984. Throughput: 0: 42744.4. Samples: 3869193000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 15:20:25,752][15401] Updated weights for policy 0, policy_version 236160 (0.0047) [2024-06-22 15:20:28,389][15132] Fps is (10 sec: 44265.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3869360128. Throughput: 0: 43112.1. Samples: 3869461480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:20:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 15:20:30,437][15401] Updated weights for policy 0, policy_version 236170 (0.0028) [2024-06-22 15:20:33,256][15401] Updated weights for policy 0, policy_version 236180 (0.0047) [2024-06-22 15:20:33,392][15132] Fps is (10 sec: 47502.3, 60 sec: 43417.6, 300 sec: 42931.3). Total num frames: 3869573120. Throughput: 0: 42988.8. Samples: 3869710560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:33,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 15:20:38,129][15401] Updated weights for policy 0, policy_version 236190 (0.0034) [2024-06-22 15:20:38,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3869736960. Throughput: 0: 42937.3. Samples: 3869841220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 15:20:40,859][15401] Updated weights for policy 0, policy_version 236200 (0.0033) [2024-06-22 15:20:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 3869999104. Throughput: 0: 42927.0. Samples: 3870103060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 15:20:45,502][15401] Updated weights for policy 0, policy_version 236210 (0.0026) [2024-06-22 15:20:48,230][15401] Updated weights for policy 0, policy_version 236220 (0.0030) [2024-06-22 15:20:48,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3870228480. Throughput: 0: 42941.3. Samples: 3870353100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 15:20:53,164][15401] Updated weights for policy 0, policy_version 236230 (0.0034) [2024-06-22 15:20:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3870392320. Throughput: 0: 42943.6. Samples: 3870488360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 15:20:56,079][15401] Updated weights for policy 0, policy_version 236240 (0.0024) [2024-06-22 15:20:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3870638080. Throughput: 0: 42912.9. Samples: 3870748640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:20:58,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 15:21:00,586][15401] Updated weights for policy 0, policy_version 236250 (0.0028) [2024-06-22 15:21:03,391][15132] Fps is (10 sec: 45869.1, 60 sec: 42870.4, 300 sec: 42875.9). Total num frames: 3870851072. Throughput: 0: 43026.5. Samples: 3871002080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:03,391][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 15:21:03,843][15401] Updated weights for policy 0, policy_version 236260 (0.0021) [2024-06-22 15:21:08,091][15401] Updated weights for policy 0, policy_version 236270 (0.0033) [2024-06-22 15:21:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 3871047680. Throughput: 0: 42999.5. Samples: 3871128080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:08,393][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 15:21:08,978][15349] Signal inference workers to stop experience collection... (57250 times) [2024-06-22 15:21:08,979][15349] Signal inference workers to resume experience collection... (57250 times) [2024-06-22 15:21:09,018][15401] InferenceWorker_p0-w0: stopping experience collection (57250 times) [2024-06-22 15:21:09,018][15401] InferenceWorker_p0-w0: resuming experience collection (57250 times) [2024-06-22 15:21:11,354][15401] Updated weights for policy 0, policy_version 236280 (0.0022) [2024-06-22 15:21:13,389][15132] Fps is (10 sec: 42604.4, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 3871277056. Throughput: 0: 42935.6. Samples: 3871393580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 15:21:15,573][15401] Updated weights for policy 0, policy_version 236290 (0.0031) [2024-06-22 15:21:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 3871490048. Throughput: 0: 43102.3. Samples: 3871650060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 15:21:18,890][15401] Updated weights for policy 0, policy_version 236300 (0.0040) [2024-06-22 15:21:23,031][15401] Updated weights for policy 0, policy_version 236310 (0.0033) [2024-06-22 15:21:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3871703040. Throughput: 0: 43057.0. Samples: 3871778780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 15:21:26,390][15401] Updated weights for policy 0, policy_version 236320 (0.0031) [2024-06-22 15:21:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3871916032. Throughput: 0: 43023.1. Samples: 3872039100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 15:21:30,738][15401] Updated weights for policy 0, policy_version 236330 (0.0028) [2024-06-22 15:21:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 3872112640. Throughput: 0: 43248.0. Samples: 3872299260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 15:21:34,358][15401] Updated weights for policy 0, policy_version 236340 (0.0028) [2024-06-22 15:21:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3872342016. Throughput: 0: 43025.3. Samples: 3872424500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 25.0) [2024-06-22 15:21:38,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 15:21:38,683][15401] Updated weights for policy 0, policy_version 236350 (0.0035) [2024-06-22 15:21:41,929][15401] Updated weights for policy 0, policy_version 236360 (0.0031) [2024-06-22 15:21:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3872571392. Throughput: 0: 42815.9. Samples: 3872675360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:21:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 15:21:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236363_3872571392.pth... [2024-06-22 15:21:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000235735_3862282240.pth [2024-06-22 15:21:46,345][15401] Updated weights for policy 0, policy_version 236370 (0.0048) [2024-06-22 15:21:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 3872751616. Throughput: 0: 43129.8. Samples: 3872942860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:21:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 15:21:49,637][15401] Updated weights for policy 0, policy_version 236380 (0.0038) [2024-06-22 15:21:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42932.5). Total num frames: 3872980992. Throughput: 0: 42957.7. Samples: 3873061080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:21:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:21:54,140][15401] Updated weights for policy 0, policy_version 236390 (0.0027) [2024-06-22 15:21:57,401][15401] Updated weights for policy 0, policy_version 236400 (0.0036) [2024-06-22 15:21:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3873210368. Throughput: 0: 42885.2. Samples: 3873323420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:21:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 15:22:01,710][15401] Updated weights for policy 0, policy_version 236410 (0.0023) [2024-06-22 15:22:02,233][15349] Signal inference workers to stop experience collection... (57300 times) [2024-06-22 15:22:02,280][15401] InferenceWorker_p0-w0: stopping experience collection (57300 times) [2024-06-22 15:22:02,288][15349] Signal inference workers to resume experience collection... (57300 times) [2024-06-22 15:22:02,294][15401] InferenceWorker_p0-w0: resuming experience collection (57300 times) [2024-06-22 15:22:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42326.3, 300 sec: 42820.9). Total num frames: 3873390592. Throughput: 0: 43034.1. Samples: 3873586600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:03,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 15:22:04,887][15401] Updated weights for policy 0, policy_version 236420 (0.0027) [2024-06-22 15:22:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 3873619968. Throughput: 0: 42861.3. Samples: 3873707540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 15:22:09,372][15401] Updated weights for policy 0, policy_version 236430 (0.0033) [2024-06-22 15:22:12,472][15401] Updated weights for policy 0, policy_version 236440 (0.0034) [2024-06-22 15:22:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3873849344. Throughput: 0: 42823.1. Samples: 3873966140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:13,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 15:22:16,886][15401] Updated weights for policy 0, policy_version 236450 (0.0041) [2024-06-22 15:22:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 3874045952. Throughput: 0: 42812.8. Samples: 3874225940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:18,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 15:22:20,247][15401] Updated weights for policy 0, policy_version 236460 (0.0035) [2024-06-22 15:22:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3874258944. Throughput: 0: 42904.4. Samples: 3874355200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:23,391][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 15:22:24,378][15401] Updated weights for policy 0, policy_version 236470 (0.0029) [2024-06-22 15:22:27,635][15401] Updated weights for policy 0, policy_version 236480 (0.0037) [2024-06-22 15:22:28,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3874504704. Throughput: 0: 43021.4. Samples: 3874611320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 15:22:31,947][15401] Updated weights for policy 0, policy_version 236490 (0.0033) [2024-06-22 15:22:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 3874701312. Throughput: 0: 43007.0. Samples: 3874878280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:33,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 15:22:35,228][15401] Updated weights for policy 0, policy_version 236500 (0.0039) [2024-06-22 15:22:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 3874930688. Throughput: 0: 43164.5. Samples: 3875003480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 15:22:39,419][15401] Updated weights for policy 0, policy_version 236510 (0.0030) [2024-06-22 15:22:42,987][15401] Updated weights for policy 0, policy_version 236520 (0.0030) [2024-06-22 15:22:43,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 3875160064. Throughput: 0: 43061.9. Samples: 3875261200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 15:22:47,177][15401] Updated weights for policy 0, policy_version 236530 (0.0030) [2024-06-22 15:22:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3875323904. Throughput: 0: 42926.7. Samples: 3875518300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 15:22:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 15:22:50,684][15401] Updated weights for policy 0, policy_version 236540 (0.0030) [2024-06-22 15:22:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3875553280. Throughput: 0: 42892.6. Samples: 3875637700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:22:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 15:22:55,219][15401] Updated weights for policy 0, policy_version 236550 (0.0036) [2024-06-22 15:22:58,330][15401] Updated weights for policy 0, policy_version 236560 (0.0028) [2024-06-22 15:22:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3875799040. Throughput: 0: 42879.6. Samples: 3875895720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:22:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 15:23:02,803][15401] Updated weights for policy 0, policy_version 236570 (0.0039) [2024-06-22 15:23:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3875979264. Throughput: 0: 42937.5. Samples: 3876158020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 15:23:05,969][15401] Updated weights for policy 0, policy_version 236580 (0.0029) [2024-06-22 15:23:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3876208640. Throughput: 0: 42837.8. Samples: 3876282900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:08,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-22 15:23:10,369][15401] Updated weights for policy 0, policy_version 236590 (0.0043) [2024-06-22 15:23:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 43043.0). Total num frames: 3876438016. Throughput: 0: 42958.7. Samples: 3876544460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 15:23:13,514][15401] Updated weights for policy 0, policy_version 236600 (0.0034) [2024-06-22 15:23:18,009][15401] Updated weights for policy 0, policy_version 236610 (0.0041) [2024-06-22 15:23:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 3876634624. Throughput: 0: 42725.0. Samples: 3876800800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 15:23:20,335][15349] Signal inference workers to stop experience collection... (57350 times) [2024-06-22 15:23:20,363][15401] InferenceWorker_p0-w0: stopping experience collection (57350 times) [2024-06-22 15:23:20,386][15349] Signal inference workers to resume experience collection... (57350 times) [2024-06-22 15:23:20,387][15401] InferenceWorker_p0-w0: resuming experience collection (57350 times) [2024-06-22 15:23:21,394][15401] Updated weights for policy 0, policy_version 236620 (0.0034) [2024-06-22 15:23:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 3876864000. Throughput: 0: 42692.5. Samples: 3876924640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 15:23:25,529][15401] Updated weights for policy 0, policy_version 236630 (0.0025) [2024-06-22 15:23:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 43042.4). Total num frames: 3877076992. Throughput: 0: 42809.7. Samples: 3877187740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:28,392][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 15:23:29,157][15401] Updated weights for policy 0, policy_version 236640 (0.0040) [2024-06-22 15:23:33,356][15401] Updated weights for policy 0, policy_version 236650 (0.0036) [2024-06-22 15:23:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42871.4, 300 sec: 42875.7). Total num frames: 3877273600. Throughput: 0: 42838.1. Samples: 3877446120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:33,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 15:23:36,793][15401] Updated weights for policy 0, policy_version 236660 (0.0044) [2024-06-22 15:23:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 3877519360. Throughput: 0: 43035.9. Samples: 3877574320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:38,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 15:23:40,859][15401] Updated weights for policy 0, policy_version 236670 (0.0031) [2024-06-22 15:23:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.3, 300 sec: 42932.6). Total num frames: 3877699584. Throughput: 0: 43046.7. Samples: 3877832820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 15:23:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236676_3877699584.pth... [2024-06-22 15:23:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236048_3867410432.pth [2024-06-22 15:23:44,420][15401] Updated weights for policy 0, policy_version 236680 (0.0052) [2024-06-22 15:23:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3877912576. Throughput: 0: 42845.8. Samples: 3878086080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 15:23:48,463][15401] Updated weights for policy 0, policy_version 236690 (0.0041) [2024-06-22 15:23:52,045][15401] Updated weights for policy 0, policy_version 236700 (0.0046) [2024-06-22 15:23:53,392][15132] Fps is (10 sec: 44227.1, 60 sec: 43142.9, 300 sec: 42986.9). Total num frames: 3878141952. Throughput: 0: 43002.0. Samples: 3878218080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:53,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 15:23:56,031][15401] Updated weights for policy 0, policy_version 236710 (0.0033) [2024-06-22 15:23:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.5). Total num frames: 3878338560. Throughput: 0: 42871.6. Samples: 3878473680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 15:23:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 15:23:59,738][15401] Updated weights for policy 0, policy_version 236720 (0.0035) [2024-06-22 15:24:03,390][15132] Fps is (10 sec: 42607.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3878567936. Throughput: 0: 42903.5. Samples: 3878731460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 15:24:03,544][15401] Updated weights for policy 0, policy_version 236730 (0.0030) [2024-06-22 15:24:07,421][15401] Updated weights for policy 0, policy_version 236740 (0.0025) [2024-06-22 15:24:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3878780928. Throughput: 0: 42998.2. Samples: 3878859560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 15:24:11,098][15401] Updated weights for policy 0, policy_version 236750 (0.0041) [2024-06-22 15:24:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3878993920. Throughput: 0: 42897.0. Samples: 3879118000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 15:24:15,031][15401] Updated weights for policy 0, policy_version 236760 (0.0032) [2024-06-22 15:24:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3879206912. Throughput: 0: 42775.7. Samples: 3879370920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 15:24:19,001][15401] Updated weights for policy 0, policy_version 236770 (0.0034) [2024-06-22 15:24:23,040][15401] Updated weights for policy 0, policy_version 236780 (0.0038) [2024-06-22 15:24:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3879419904. Throughput: 0: 42717.4. Samples: 3879496600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 15:24:26,725][15401] Updated weights for policy 0, policy_version 236790 (0.0031) [2024-06-22 15:24:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42932.0). Total num frames: 3879632896. Throughput: 0: 42700.9. Samples: 3879754360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 15:24:30,594][15401] Updated weights for policy 0, policy_version 236800 (0.0034) [2024-06-22 15:24:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 3879845888. Throughput: 0: 42600.8. Samples: 3880003120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 15:24:34,480][15401] Updated weights for policy 0, policy_version 236810 (0.0047) [2024-06-22 15:24:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 3880026112. Throughput: 0: 42432.3. Samples: 3880127440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 15:24:38,832][15401] Updated weights for policy 0, policy_version 236820 (0.0039) [2024-06-22 15:24:41,961][15401] Updated weights for policy 0, policy_version 236830 (0.0041) [2024-06-22 15:24:43,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 3880288256. Throughput: 0: 42589.7. Samples: 3880390320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:43,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 15:24:46,332][15401] Updated weights for policy 0, policy_version 236840 (0.0044) [2024-06-22 15:24:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 3880484864. Throughput: 0: 42614.3. Samples: 3880649100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 15:24:49,578][15401] Updated weights for policy 0, policy_version 236850 (0.0029) [2024-06-22 15:24:53,392][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42764.7). Total num frames: 3880681472. Throughput: 0: 42593.4. Samples: 3880776360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:53,393][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 15:24:53,828][15401] Updated weights for policy 0, policy_version 236860 (0.0033) [2024-06-22 15:24:57,059][15401] Updated weights for policy 0, policy_version 236870 (0.0036) [2024-06-22 15:24:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3880943616. Throughput: 0: 42604.7. Samples: 3881035220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:24:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 15:25:01,436][15401] Updated weights for policy 0, policy_version 236880 (0.0034) [2024-06-22 15:25:03,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3881140224. Throughput: 0: 42763.5. Samples: 3881295280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 15:25:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 15:25:04,602][15401] Updated weights for policy 0, policy_version 236890 (0.0029) [2024-06-22 15:25:08,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 3881320448. Throughput: 0: 42819.6. Samples: 3881423480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 15:25:08,929][15401] Updated weights for policy 0, policy_version 236900 (0.0031) [2024-06-22 15:25:10,474][15349] Signal inference workers to stop experience collection... (57400 times) [2024-06-22 15:25:10,531][15401] InferenceWorker_p0-w0: stopping experience collection (57400 times) [2024-06-22 15:25:10,534][15349] Signal inference workers to resume experience collection... (57400 times) [2024-06-22 15:25:10,541][15401] InferenceWorker_p0-w0: resuming experience collection (57400 times) [2024-06-22 15:25:12,138][15401] Updated weights for policy 0, policy_version 236910 (0.0026) [2024-06-22 15:25:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42821.5). Total num frames: 3881549824. Throughput: 0: 42701.2. Samples: 3881675920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 15:25:16,725][15401] Updated weights for policy 0, policy_version 236920 (0.0042) [2024-06-22 15:25:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 3881762816. Throughput: 0: 43003.7. Samples: 3881938280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 15:25:20,168][15401] Updated weights for policy 0, policy_version 236930 (0.0037) [2024-06-22 15:25:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3881975808. Throughput: 0: 43044.0. Samples: 3882064420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 15:25:24,415][15401] Updated weights for policy 0, policy_version 236940 (0.0045) [2024-06-22 15:25:27,673][15401] Updated weights for policy 0, policy_version 236950 (0.0039) [2024-06-22 15:25:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 3882221568. Throughput: 0: 42943.9. Samples: 3882322700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:28,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 15:25:32,115][15401] Updated weights for policy 0, policy_version 236960 (0.0028) [2024-06-22 15:25:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 3882401792. Throughput: 0: 42967.1. Samples: 3882582620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 15:25:35,469][15401] Updated weights for policy 0, policy_version 236970 (0.0039) [2024-06-22 15:25:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 3882631168. Throughput: 0: 42954.7. Samples: 3882709220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:38,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 15:25:39,850][15401] Updated weights for policy 0, policy_version 236980 (0.0038) [2024-06-22 15:25:43,253][15401] Updated weights for policy 0, policy_version 236990 (0.0047) [2024-06-22 15:25:43,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42595.5, 300 sec: 42764.1). Total num frames: 3882844160. Throughput: 0: 42984.1. Samples: 3882969780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:43,397][15132] Avg episode reward: [(0, '0.187')] [2024-06-22 15:25:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236990_3882844160.pth... [2024-06-22 15:25:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236363_3872571392.pth [2024-06-22 15:25:47,371][15401] Updated weights for policy 0, policy_version 237000 (0.0034) [2024-06-22 15:25:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3883040768. Throughput: 0: 42867.6. Samples: 3883224320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 15:25:50,818][15401] Updated weights for policy 0, policy_version 237010 (0.0040) [2024-06-22 15:25:53,390][15132] Fps is (10 sec: 44265.1, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 3883286528. Throughput: 0: 42790.5. Samples: 3883349060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 15:25:54,835][15401] Updated weights for policy 0, policy_version 237020 (0.0037) [2024-06-22 15:25:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.8). Total num frames: 3883483136. Throughput: 0: 43070.8. Samples: 3883614100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:25:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 15:25:58,507][15401] Updated weights for policy 0, policy_version 237030 (0.0034) [2024-06-22 15:26:02,414][15401] Updated weights for policy 0, policy_version 237040 (0.0041) [2024-06-22 15:26:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 3883696128. Throughput: 0: 42854.6. Samples: 3883866740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:26:03,400][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 15:26:06,393][15401] Updated weights for policy 0, policy_version 237050 (0.0038) [2024-06-22 15:26:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3883925504. Throughput: 0: 42854.3. Samples: 3883992860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:26:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 15:26:10,108][15401] Updated weights for policy 0, policy_version 237060 (0.0029) [2024-06-22 15:26:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3884138496. Throughput: 0: 42764.0. Samples: 3884247080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-22 15:26:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 15:26:13,858][15401] Updated weights for policy 0, policy_version 237070 (0.0037) [2024-06-22 15:26:18,040][15401] Updated weights for policy 0, policy_version 237080 (0.0041) [2024-06-22 15:26:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3884335104. Throughput: 0: 42680.8. Samples: 3884503260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 15:26:21,588][15401] Updated weights for policy 0, policy_version 237090 (0.0039) [2024-06-22 15:26:23,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 3884548096. Throughput: 0: 42692.3. Samples: 3884630640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:23,396][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 15:26:25,789][15401] Updated weights for policy 0, policy_version 237100 (0.0023) [2024-06-22 15:26:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 3884777472. Throughput: 0: 42547.9. Samples: 3884884260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:28,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 15:26:29,369][15401] Updated weights for policy 0, policy_version 237110 (0.0047) [2024-06-22 15:26:33,278][15401] Updated weights for policy 0, policy_version 237120 (0.0030) [2024-06-22 15:26:33,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3884974080. Throughput: 0: 42578.7. Samples: 3885140360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 15:26:34,518][15349] Signal inference workers to stop experience collection... (57450 times) [2024-06-22 15:26:34,518][15349] Signal inference workers to resume experience collection... (57450 times) [2024-06-22 15:26:34,546][15401] InferenceWorker_p0-w0: stopping experience collection (57450 times) [2024-06-22 15:26:34,546][15401] InferenceWorker_p0-w0: resuming experience collection (57450 times) [2024-06-22 15:26:36,951][15401] Updated weights for policy 0, policy_version 237130 (0.0033) [2024-06-22 15:26:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3885187072. Throughput: 0: 42626.4. Samples: 3885267240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 15:26:40,879][15401] Updated weights for policy 0, policy_version 237140 (0.0034) [2024-06-22 15:26:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 3885400064. Throughput: 0: 42403.9. Samples: 3885522280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 15:26:44,635][15401] Updated weights for policy 0, policy_version 237150 (0.0033) [2024-06-22 15:26:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3885596672. Throughput: 0: 42398.2. Samples: 3885774660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:48,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 15:26:48,696][15401] Updated weights for policy 0, policy_version 237160 (0.0042) [2024-06-22 15:26:52,217][15401] Updated weights for policy 0, policy_version 237170 (0.0036) [2024-06-22 15:26:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3885842432. Throughput: 0: 42518.5. Samples: 3885906200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:53,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-22 15:26:56,343][15401] Updated weights for policy 0, policy_version 237180 (0.0033) [2024-06-22 15:26:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3886039040. Throughput: 0: 42508.5. Samples: 3886159960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:26:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 15:26:59,801][15401] Updated weights for policy 0, policy_version 237190 (0.0026) [2024-06-22 15:27:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3886235648. Throughput: 0: 42465.9. Samples: 3886414220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:27:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 15:27:03,959][15401] Updated weights for policy 0, policy_version 237200 (0.0043) [2024-06-22 15:27:07,439][15401] Updated weights for policy 0, policy_version 237210 (0.0023) [2024-06-22 15:27:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3886481408. Throughput: 0: 42501.7. Samples: 3886542940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:27:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 15:27:11,872][15401] Updated weights for policy 0, policy_version 237220 (0.0048) [2024-06-22 15:27:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42765.4). Total num frames: 3886661632. Throughput: 0: 42585.9. Samples: 3886800520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:27:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 15:27:15,053][15401] Updated weights for policy 0, policy_version 237230 (0.0047) [2024-06-22 15:27:18,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3886891008. Throughput: 0: 42606.9. Samples: 3887057680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:27:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 15:27:19,394][15401] Updated weights for policy 0, policy_version 237240 (0.0047) [2024-06-22 15:27:22,950][15401] Updated weights for policy 0, policy_version 237250 (0.0035) [2024-06-22 15:27:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 3887120384. Throughput: 0: 42627.1. Samples: 3887185460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 15:27:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 15:27:27,019][15401] Updated weights for policy 0, policy_version 237260 (0.0037) [2024-06-22 15:27:28,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42054.0, 300 sec: 42709.8). Total num frames: 3887300608. Throughput: 0: 42541.4. Samples: 3887436640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 15:27:30,619][15401] Updated weights for policy 0, policy_version 237270 (0.0033) [2024-06-22 15:27:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3887529984. Throughput: 0: 42571.1. Samples: 3887690360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 15:27:34,672][15401] Updated weights for policy 0, policy_version 237280 (0.0032) [2024-06-22 15:27:38,118][15401] Updated weights for policy 0, policy_version 237290 (0.0036) [2024-06-22 15:27:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3887775744. Throughput: 0: 42701.0. Samples: 3887827740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 15:27:42,348][15401] Updated weights for policy 0, policy_version 237300 (0.0033) [2024-06-22 15:27:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3887939584. Throughput: 0: 42618.6. Samples: 3888077800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:43,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 15:27:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237301_3887939584.pth... [2024-06-22 15:27:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236676_3877699584.pth [2024-06-22 15:27:45,762][15401] Updated weights for policy 0, policy_version 237310 (0.0037) [2024-06-22 15:27:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3888185344. Throughput: 0: 42635.9. Samples: 3888332840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 15:27:50,229][15401] Updated weights for policy 0, policy_version 237320 (0.0026) [2024-06-22 15:27:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3888398336. Throughput: 0: 42761.7. Samples: 3888467220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:53,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 15:27:53,438][15401] Updated weights for policy 0, policy_version 237330 (0.0030) [2024-06-22 15:27:57,784][15401] Updated weights for policy 0, policy_version 237340 (0.0029) [2024-06-22 15:27:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3888578560. Throughput: 0: 42619.5. Samples: 3888718400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:27:58,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 15:27:59,662][15349] Signal inference workers to stop experience collection... (57500 times) [2024-06-22 15:27:59,663][15349] Signal inference workers to resume experience collection... (57500 times) [2024-06-22 15:27:59,705][15401] InferenceWorker_p0-w0: stopping experience collection (57500 times) [2024-06-22 15:27:59,705][15401] InferenceWorker_p0-w0: resuming experience collection (57500 times) [2024-06-22 15:28:01,055][15401] Updated weights for policy 0, policy_version 237350 (0.0035) [2024-06-22 15:28:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3888824320. Throughput: 0: 42610.8. Samples: 3888975160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 15:28:05,502][15401] Updated weights for policy 0, policy_version 237360 (0.0030) [2024-06-22 15:28:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3889020928. Throughput: 0: 42735.1. Samples: 3889108540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 15:28:09,225][15401] Updated weights for policy 0, policy_version 237370 (0.0031) [2024-06-22 15:28:13,182][15401] Updated weights for policy 0, policy_version 237380 (0.0037) [2024-06-22 15:28:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3889233920. Throughput: 0: 42747.5. Samples: 3889360280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 15:28:16,834][15401] Updated weights for policy 0, policy_version 237390 (0.0041) [2024-06-22 15:28:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3889479680. Throughput: 0: 42783.1. Samples: 3889615600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:18,396][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 15:28:20,963][15401] Updated weights for policy 0, policy_version 237400 (0.0027) [2024-06-22 15:28:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 3889659904. Throughput: 0: 42696.7. Samples: 3889749100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:23,390][15132] Avg episode reward: [(0, '0.908')] [2024-06-22 15:28:24,673][15401] Updated weights for policy 0, policy_version 237410 (0.0029) [2024-06-22 15:28:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 3889856512. Throughput: 0: 42695.2. Samples: 3889999080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:28,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 15:28:28,561][15401] Updated weights for policy 0, policy_version 237420 (0.0038) [2024-06-22 15:28:32,211][15401] Updated weights for policy 0, policy_version 237430 (0.0043) [2024-06-22 15:28:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3890118656. Throughput: 0: 42754.1. Samples: 3890256780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:28:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 15:28:36,175][15401] Updated weights for policy 0, policy_version 237440 (0.0032) [2024-06-22 15:28:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 3890298880. Throughput: 0: 42642.7. Samples: 3890386140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:28:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 15:28:40,056][15401] Updated weights for policy 0, policy_version 237450 (0.0036) [2024-06-22 15:28:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3890511872. Throughput: 0: 42675.4. Samples: 3890638800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:28:43,392][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 15:28:43,709][15401] Updated weights for policy 0, policy_version 237460 (0.0040) [2024-06-22 15:28:47,917][15401] Updated weights for policy 0, policy_version 237470 (0.0033) [2024-06-22 15:28:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 3890741248. Throughput: 0: 42757.9. Samples: 3890899260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:28:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 15:28:51,215][15401] Updated weights for policy 0, policy_version 237480 (0.0027) [2024-06-22 15:28:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3890954240. Throughput: 0: 42554.1. Samples: 3891023480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:28:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 15:28:55,642][15401] Updated weights for policy 0, policy_version 237490 (0.0048) [2024-06-22 15:28:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 3891167232. Throughput: 0: 42721.3. Samples: 3891282840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:28:58,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 15:28:58,903][15401] Updated weights for policy 0, policy_version 237500 (0.0026) [2024-06-22 15:29:03,377][15401] Updated weights for policy 0, policy_version 237510 (0.0029) [2024-06-22 15:29:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 3891363840. Throughput: 0: 42880.5. Samples: 3891545220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 15:29:06,340][15401] Updated weights for policy 0, policy_version 237520 (0.0035) [2024-06-22 15:29:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 3891576832. Throughput: 0: 42666.0. Samples: 3891669060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 15:29:10,974][15401] Updated weights for policy 0, policy_version 237530 (0.0045) [2024-06-22 15:29:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3891822592. Throughput: 0: 42855.2. Samples: 3891927560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 15:29:13,921][15401] Updated weights for policy 0, policy_version 237540 (0.0040) [2024-06-22 15:29:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 3892002816. Throughput: 0: 42942.3. Samples: 3892189280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:18,393][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 15:29:18,460][15401] Updated weights for policy 0, policy_version 237550 (0.0044) [2024-06-22 15:29:21,747][15401] Updated weights for policy 0, policy_version 237560 (0.0029) [2024-06-22 15:29:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3892232192. Throughput: 0: 42777.2. Samples: 3892311120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-22 15:29:25,969][15401] Updated weights for policy 0, policy_version 237570 (0.0030) [2024-06-22 15:29:28,390][15132] Fps is (10 sec: 47525.0, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 3892477952. Throughput: 0: 43016.5. Samples: 3892574540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 15:29:29,367][15401] Updated weights for policy 0, policy_version 237580 (0.0031) [2024-06-22 15:29:33,119][15349] Signal inference workers to stop experience collection... (57550 times) [2024-06-22 15:29:33,147][15401] InferenceWorker_p0-w0: stopping experience collection (57550 times) [2024-06-22 15:29:33,237][15349] Signal inference workers to resume experience collection... (57550 times) [2024-06-22 15:29:33,238][15401] InferenceWorker_p0-w0: resuming experience collection (57550 times) [2024-06-22 15:29:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 3892641792. Throughput: 0: 42986.6. Samples: 3892833660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 15:29:33,554][15401] Updated weights for policy 0, policy_version 237590 (0.0041) [2024-06-22 15:29:37,206][15401] Updated weights for policy 0, policy_version 237600 (0.0042) [2024-06-22 15:29:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 3892887552. Throughput: 0: 43009.9. Samples: 3892958920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 15:29:41,130][15401] Updated weights for policy 0, policy_version 237610 (0.0032) [2024-06-22 15:29:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3893100544. Throughput: 0: 43084.4. Samples: 3893221540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 15:29:43,399][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 15:29:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237616_3893100544.pth... [2024-06-22 15:29:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000236990_3882844160.pth [2024-06-22 15:29:44,654][15401] Updated weights for policy 0, policy_version 237620 (0.0036) [2024-06-22 15:29:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 3893297152. Throughput: 0: 42942.7. Samples: 3893477640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:29:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 15:29:48,666][15401] Updated weights for policy 0, policy_version 237630 (0.0034) [2024-06-22 15:29:52,215][15401] Updated weights for policy 0, policy_version 237640 (0.0037) [2024-06-22 15:29:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 3893510144. Throughput: 0: 42957.2. Samples: 3893602140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:29:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 15:29:56,611][15401] Updated weights for policy 0, policy_version 237650 (0.0022) [2024-06-22 15:29:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 3893739520. Throughput: 0: 42943.9. Samples: 3893860040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:29:58,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-22 15:29:59,748][15401] Updated weights for policy 0, policy_version 237660 (0.0023) [2024-06-22 15:30:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3893936128. Throughput: 0: 42940.6. Samples: 3894121500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:03,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 15:30:04,438][15401] Updated weights for policy 0, policy_version 237670 (0.0024) [2024-06-22 15:30:07,937][15401] Updated weights for policy 0, policy_version 237680 (0.0032) [2024-06-22 15:30:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3894165504. Throughput: 0: 42979.2. Samples: 3894245180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 15:30:12,062][15401] Updated weights for policy 0, policy_version 237690 (0.0030) [2024-06-22 15:30:13,394][15132] Fps is (10 sec: 45856.1, 60 sec: 42868.5, 300 sec: 42819.9). Total num frames: 3894394880. Throughput: 0: 42933.9. Samples: 3894506740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:13,394][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 15:30:15,415][15401] Updated weights for policy 0, policy_version 237700 (0.0032) [2024-06-22 15:30:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 3894575104. Throughput: 0: 42931.7. Samples: 3894765580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:18,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 15:30:19,474][15401] Updated weights for policy 0, policy_version 237710 (0.0024) [2024-06-22 15:30:23,001][15401] Updated weights for policy 0, policy_version 237720 (0.0035) [2024-06-22 15:30:23,389][15132] Fps is (10 sec: 40976.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3894804480. Throughput: 0: 42864.0. Samples: 3894887800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 15:30:27,016][15401] Updated weights for policy 0, policy_version 237730 (0.0037) [2024-06-22 15:30:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3895033856. Throughput: 0: 42759.7. Samples: 3895145720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 15:30:30,495][15401] Updated weights for policy 0, policy_version 237740 (0.0027) [2024-06-22 15:30:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3895214080. Throughput: 0: 42861.1. Samples: 3895406400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 15:30:34,682][15401] Updated weights for policy 0, policy_version 237750 (0.0030) [2024-06-22 15:30:36,171][15349] Signal inference workers to stop experience collection... (57600 times) [2024-06-22 15:30:36,226][15401] InferenceWorker_p0-w0: stopping experience collection (57600 times) [2024-06-22 15:30:36,226][15349] Signal inference workers to resume experience collection... (57600 times) [2024-06-22 15:30:36,249][15401] InferenceWorker_p0-w0: resuming experience collection (57600 times) [2024-06-22 15:30:38,101][15401] Updated weights for policy 0, policy_version 237760 (0.0041) [2024-06-22 15:30:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 3895459840. Throughput: 0: 42817.9. Samples: 3895528940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 15:30:42,350][15401] Updated weights for policy 0, policy_version 237770 (0.0038) [2024-06-22 15:30:43,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3895672832. Throughput: 0: 43004.6. Samples: 3895795240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 15:30:45,532][15401] Updated weights for policy 0, policy_version 237780 (0.0038) [2024-06-22 15:30:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3895869440. Throughput: 0: 42843.0. Samples: 3896049440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:48,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 15:30:49,840][15401] Updated weights for policy 0, policy_version 237790 (0.0036) [2024-06-22 15:30:53,256][15401] Updated weights for policy 0, policy_version 237800 (0.0036) [2024-06-22 15:30:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 3896115200. Throughput: 0: 42993.1. Samples: 3896179880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 15:30:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 15:30:57,393][15401] Updated weights for policy 0, policy_version 237810 (0.0037) [2024-06-22 15:30:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3896328192. Throughput: 0: 42943.9. Samples: 3896439040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:30:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 15:31:00,978][15401] Updated weights for policy 0, policy_version 237820 (0.0033) [2024-06-22 15:31:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3896524800. Throughput: 0: 42822.1. Samples: 3896692580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 15:31:05,188][15401] Updated weights for policy 0, policy_version 237830 (0.0035) [2024-06-22 15:31:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3896754176. Throughput: 0: 42988.1. Samples: 3896822260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 15:31:08,485][15401] Updated weights for policy 0, policy_version 237840 (0.0038) [2024-06-22 15:31:12,707][15401] Updated weights for policy 0, policy_version 237850 (0.0030) [2024-06-22 15:31:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42599.6, 300 sec: 42764.7). Total num frames: 3896950784. Throughput: 0: 43070.1. Samples: 3897083980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:13,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 15:31:15,990][15401] Updated weights for policy 0, policy_version 237860 (0.0035) [2024-06-22 15:31:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42765.9). Total num frames: 3897163776. Throughput: 0: 43128.1. Samples: 3897347160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 15:31:20,198][15401] Updated weights for policy 0, policy_version 237870 (0.0028) [2024-06-22 15:31:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 3897393152. Throughput: 0: 43241.4. Samples: 3897474800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 15:31:23,590][15401] Updated weights for policy 0, policy_version 237880 (0.0040) [2024-06-22 15:31:27,654][15401] Updated weights for policy 0, policy_version 237890 (0.0023) [2024-06-22 15:31:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3897606144. Throughput: 0: 43032.8. Samples: 3897731720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 15:31:31,379][15401] Updated weights for policy 0, policy_version 237900 (0.0035) [2024-06-22 15:31:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3897802752. Throughput: 0: 43075.0. Samples: 3897987820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 15:31:35,647][15401] Updated weights for policy 0, policy_version 237910 (0.0027) [2024-06-22 15:31:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3898048512. Throughput: 0: 42949.5. Samples: 3898112600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 15:31:39,302][15401] Updated weights for policy 0, policy_version 237920 (0.0031) [2024-06-22 15:31:43,178][15401] Updated weights for policy 0, policy_version 237930 (0.0038) [2024-06-22 15:31:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 3898245120. Throughput: 0: 42916.8. Samples: 3898370300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 15:31:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237930_3898245120.pth... [2024-06-22 15:31:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237301_3887939584.pth [2024-06-22 15:31:46,955][15401] Updated weights for policy 0, policy_version 237940 (0.0037) [2024-06-22 15:31:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3898458112. Throughput: 0: 42955.6. Samples: 3898625580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 15:31:50,784][15401] Updated weights for policy 0, policy_version 237950 (0.0026) [2024-06-22 15:31:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3898671104. Throughput: 0: 42941.7. Samples: 3898754640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 15:31:54,683][15401] Updated weights for policy 0, policy_version 237960 (0.0029) [2024-06-22 15:31:58,336][15401] Updated weights for policy 0, policy_version 237970 (0.0033) [2024-06-22 15:31:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3898900480. Throughput: 0: 42813.4. Samples: 3899010480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:31:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 15:32:02,338][15401] Updated weights for policy 0, policy_version 237980 (0.0035) [2024-06-22 15:32:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3899097088. Throughput: 0: 42546.2. Samples: 3899261740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 15:32:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 15:32:05,986][15401] Updated weights for policy 0, policy_version 237990 (0.0044) [2024-06-22 15:32:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3899310080. Throughput: 0: 42520.0. Samples: 3899388200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 15:32:09,818][15401] Updated weights for policy 0, policy_version 238000 (0.0033) [2024-06-22 15:32:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 3899523072. Throughput: 0: 42683.9. Samples: 3899652500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 15:32:13,779][15401] Updated weights for policy 0, policy_version 238010 (0.0039) [2024-06-22 15:32:17,593][15401] Updated weights for policy 0, policy_version 238020 (0.0030) [2024-06-22 15:32:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3899736064. Throughput: 0: 42607.6. Samples: 3899905160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:18,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 15:32:18,773][15349] Signal inference workers to stop experience collection... (57650 times) [2024-06-22 15:32:18,773][15349] Signal inference workers to resume experience collection... (57650 times) [2024-06-22 15:32:18,811][15401] InferenceWorker_p0-w0: stopping experience collection (57650 times) [2024-06-22 15:32:18,811][15401] InferenceWorker_p0-w0: resuming experience collection (57650 times) [2024-06-22 15:32:21,414][15401] Updated weights for policy 0, policy_version 238030 (0.0035) [2024-06-22 15:32:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 3899965440. Throughput: 0: 42675.9. Samples: 3900033020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 15:32:25,033][15401] Updated weights for policy 0, policy_version 238040 (0.0034) [2024-06-22 15:32:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3900178432. Throughput: 0: 42817.8. Samples: 3900297100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 15:32:29,102][15401] Updated weights for policy 0, policy_version 238050 (0.0045) [2024-06-22 15:32:32,967][15401] Updated weights for policy 0, policy_version 238060 (0.0027) [2024-06-22 15:32:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3900375040. Throughput: 0: 42728.8. Samples: 3900548380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 15:32:36,675][15401] Updated weights for policy 0, policy_version 238070 (0.0033) [2024-06-22 15:32:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 3900588032. Throughput: 0: 42622.6. Samples: 3900672660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 15:32:40,493][15401] Updated weights for policy 0, policy_version 238080 (0.0035) [2024-06-22 15:32:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3900801024. Throughput: 0: 42722.7. Samples: 3900933000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:32:44,215][15401] Updated weights for policy 0, policy_version 238090 (0.0034) [2024-06-22 15:32:48,271][15401] Updated weights for policy 0, policy_version 238100 (0.0039) [2024-06-22 15:32:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3901030400. Throughput: 0: 42805.4. Samples: 3901187980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 15:32:51,930][15401] Updated weights for policy 0, policy_version 238110 (0.0030) [2024-06-22 15:32:53,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 3901243392. Throughput: 0: 42948.1. Samples: 3901321140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:53,397][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 15:32:56,055][15401] Updated weights for policy 0, policy_version 238120 (0.0032) [2024-06-22 15:32:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3901440000. Throughput: 0: 42809.9. Samples: 3901578940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:32:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 15:32:59,540][15401] Updated weights for policy 0, policy_version 238130 (0.0032) [2024-06-22 15:33:03,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3901669376. Throughput: 0: 42996.4. Samples: 3901840000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:33:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 15:33:03,713][15401] Updated weights for policy 0, policy_version 238140 (0.0029) [2024-06-22 15:33:06,998][15401] Updated weights for policy 0, policy_version 238150 (0.0039) [2024-06-22 15:33:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3901898752. Throughput: 0: 43063.1. Samples: 3901970860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:33:08,399][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 15:33:11,251][15401] Updated weights for policy 0, policy_version 238160 (0.0040) [2024-06-22 15:33:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3902095360. Throughput: 0: 42775.2. Samples: 3902221980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 15:33:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 15:33:14,492][15401] Updated weights for policy 0, policy_version 238170 (0.0031) [2024-06-22 15:33:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3902308352. Throughput: 0: 42982.7. Samples: 3902482600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 15:33:18,821][15401] Updated weights for policy 0, policy_version 238180 (0.0041) [2024-06-22 15:33:22,415][15401] Updated weights for policy 0, policy_version 238190 (0.0028) [2024-06-22 15:33:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3902537728. Throughput: 0: 43023.2. Samples: 3902608700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 15:33:26,424][15401] Updated weights for policy 0, policy_version 238200 (0.0042) [2024-06-22 15:33:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3902734336. Throughput: 0: 42942.6. Samples: 3902865420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 15:33:30,037][15401] Updated weights for policy 0, policy_version 238210 (0.0034) [2024-06-22 15:33:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3902930944. Throughput: 0: 42984.4. Samples: 3903122280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 15:33:34,067][15401] Updated weights for policy 0, policy_version 238220 (0.0036) [2024-06-22 15:33:35,932][15349] Signal inference workers to stop experience collection... (57700 times) [2024-06-22 15:33:35,936][15349] Signal inference workers to resume experience collection... (57700 times) [2024-06-22 15:33:35,958][15401] InferenceWorker_p0-w0: stopping experience collection (57700 times) [2024-06-22 15:33:35,958][15401] InferenceWorker_p0-w0: resuming experience collection (57700 times) [2024-06-22 15:33:37,742][15401] Updated weights for policy 0, policy_version 238230 (0.0024) [2024-06-22 15:33:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3903160320. Throughput: 0: 42742.4. Samples: 3903244280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 15:33:41,612][15401] Updated weights for policy 0, policy_version 238240 (0.0036) [2024-06-22 15:33:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3903389696. Throughput: 0: 42729.4. Samples: 3903501760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 15:33:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238245_3903406080.pth... [2024-06-22 15:33:43,532][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237616_3893100544.pth [2024-06-22 15:33:45,393][15401] Updated weights for policy 0, policy_version 238250 (0.0037) [2024-06-22 15:33:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3903586304. Throughput: 0: 42742.3. Samples: 3903763400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 15:33:49,196][15401] Updated weights for policy 0, policy_version 238260 (0.0042) [2024-06-22 15:33:53,194][15401] Updated weights for policy 0, policy_version 238270 (0.0029) [2024-06-22 15:33:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42876.0, 300 sec: 42876.4). Total num frames: 3903815680. Throughput: 0: 42542.3. Samples: 3903885260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 15:33:57,096][15401] Updated weights for policy 0, policy_version 238280 (0.0041) [2024-06-22 15:33:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3904028672. Throughput: 0: 42725.7. Samples: 3904144640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:33:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 15:34:00,960][15401] Updated weights for policy 0, policy_version 238290 (0.0040) [2024-06-22 15:34:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 3904225280. Throughput: 0: 42648.4. Samples: 3904401880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:34:03,393][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 15:34:04,971][15401] Updated weights for policy 0, policy_version 238300 (0.0032) [2024-06-22 15:34:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3904438272. Throughput: 0: 42681.4. Samples: 3904529360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:34:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 15:34:08,797][15401] Updated weights for policy 0, policy_version 238310 (0.0027) [2024-06-22 15:34:12,634][15401] Updated weights for policy 0, policy_version 238320 (0.0038) [2024-06-22 15:34:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 3904651264. Throughput: 0: 42518.6. Samples: 3904778760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:34:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 15:34:16,691][15401] Updated weights for policy 0, policy_version 238330 (0.0037) [2024-06-22 15:34:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3904847872. Throughput: 0: 42491.7. Samples: 3905034400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:34:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 15:34:20,317][15401] Updated weights for policy 0, policy_version 238340 (0.0044) [2024-06-22 15:34:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3905077248. Throughput: 0: 42644.0. Samples: 3905163260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 15:34:23,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-22 15:34:24,114][15401] Updated weights for policy 0, policy_version 238350 (0.0042) [2024-06-22 15:34:27,914][15401] Updated weights for policy 0, policy_version 238360 (0.0030) [2024-06-22 15:34:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3905290240. Throughput: 0: 42531.5. Samples: 3905415680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 15:34:32,043][15401] Updated weights for policy 0, policy_version 238370 (0.0029) [2024-06-22 15:34:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3905503232. Throughput: 0: 42488.0. Samples: 3905675360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:33,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 15:34:35,453][15401] Updated weights for policy 0, policy_version 238380 (0.0036) [2024-06-22 15:34:38,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 3905716224. Throughput: 0: 42584.6. Samples: 3905801840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:38,396][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 15:34:39,663][15401] Updated weights for policy 0, policy_version 238390 (0.0030) [2024-06-22 15:34:43,188][15401] Updated weights for policy 0, policy_version 238400 (0.0036) [2024-06-22 15:34:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3905945600. Throughput: 0: 42511.0. Samples: 3906057640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 15:34:47,215][15401] Updated weights for policy 0, policy_version 238410 (0.0031) [2024-06-22 15:34:48,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3906125824. Throughput: 0: 42568.6. Samples: 3906317360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 15:34:50,996][15401] Updated weights for policy 0, policy_version 238420 (0.0034) [2024-06-22 15:34:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3906371584. Throughput: 0: 42444.3. Samples: 3906439360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 15:34:54,800][15401] Updated weights for policy 0, policy_version 238430 (0.0039) [2024-06-22 15:34:58,089][15349] Signal inference workers to stop experience collection... (57750 times) [2024-06-22 15:34:58,090][15349] Signal inference workers to resume experience collection... (57750 times) [2024-06-22 15:34:58,124][15401] InferenceWorker_p0-w0: stopping experience collection (57750 times) [2024-06-22 15:34:58,124][15401] InferenceWorker_p0-w0: resuming experience collection (57750 times) [2024-06-22 15:34:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3906568192. Throughput: 0: 42711.3. Samples: 3906700760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:34:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 15:34:58,655][15401] Updated weights for policy 0, policy_version 238440 (0.0030) [2024-06-22 15:35:02,429][15401] Updated weights for policy 0, policy_version 238450 (0.0034) [2024-06-22 15:35:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 3906781184. Throughput: 0: 42806.5. Samples: 3906960700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 15:35:06,202][15401] Updated weights for policy 0, policy_version 238460 (0.0030) [2024-06-22 15:35:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 3907026944. Throughput: 0: 42837.4. Samples: 3907090940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 15:35:10,745][15401] Updated weights for policy 0, policy_version 238470 (0.0026) [2024-06-22 15:35:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3907223552. Throughput: 0: 42951.5. Samples: 3907348500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 15:35:13,778][15401] Updated weights for policy 0, policy_version 238480 (0.0027) [2024-06-22 15:35:18,200][15401] Updated weights for policy 0, policy_version 238490 (0.0033) [2024-06-22 15:35:18,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 3907420160. Throughput: 0: 42951.7. Samples: 3907608200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 15:35:21,342][15401] Updated weights for policy 0, policy_version 238500 (0.0026) [2024-06-22 15:35:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3907665920. Throughput: 0: 42898.5. Samples: 3907732000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 15:35:26,004][15401] Updated weights for policy 0, policy_version 238510 (0.0039) [2024-06-22 15:35:28,389][15132] Fps is (10 sec: 42600.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3907846144. Throughput: 0: 42848.7. Samples: 3907985820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 15:35:29,105][15401] Updated weights for policy 0, policy_version 238520 (0.0043) [2024-06-22 15:35:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3908042752. Throughput: 0: 42792.8. Samples: 3908243040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 15:35:33,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 15:35:33,657][15401] Updated weights for policy 0, policy_version 238530 (0.0025) [2024-06-22 15:35:36,870][15401] Updated weights for policy 0, policy_version 238540 (0.0034) [2024-06-22 15:35:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43149.1, 300 sec: 42820.5). Total num frames: 3908304896. Throughput: 0: 42770.3. Samples: 3908364020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:35:38,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 15:35:41,506][15401] Updated weights for policy 0, policy_version 238550 (0.0028) [2024-06-22 15:35:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3908485120. Throughput: 0: 42775.8. Samples: 3908625680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:35:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 15:35:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238556_3908501504.pth... [2024-06-22 15:35:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000237930_3898245120.pth [2024-06-22 15:35:44,858][15401] Updated weights for policy 0, policy_version 238560 (0.0034) [2024-06-22 15:35:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 3908698112. Throughput: 0: 42585.2. Samples: 3908877040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:35:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 15:35:48,992][15401] Updated weights for policy 0, policy_version 238570 (0.0053) [2024-06-22 15:35:52,511][15401] Updated weights for policy 0, policy_version 238580 (0.0041) [2024-06-22 15:35:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3908927488. Throughput: 0: 42486.8. Samples: 3909002840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:35:53,390][15132] Avg episode reward: [(0, '0.131')] [2024-06-22 15:35:56,885][15401] Updated weights for policy 0, policy_version 238590 (0.0033) [2024-06-22 15:35:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3909140480. Throughput: 0: 42549.8. Samples: 3909263240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:35:58,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-22 15:36:00,543][15401] Updated weights for policy 0, policy_version 238600 (0.0045) [2024-06-22 15:36:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3909337088. Throughput: 0: 42375.4. Samples: 3909515080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:03,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 15:36:04,430][15401] Updated weights for policy 0, policy_version 238610 (0.0033) [2024-06-22 15:36:05,231][15349] Signal inference workers to stop experience collection... (57800 times) [2024-06-22 15:36:05,252][15401] InferenceWorker_p0-w0: stopping experience collection (57800 times) [2024-06-22 15:36:05,292][15349] Signal inference workers to resume experience collection... (57800 times) [2024-06-22 15:36:05,293][15401] InferenceWorker_p0-w0: resuming experience collection (57800 times) [2024-06-22 15:36:08,142][15401] Updated weights for policy 0, policy_version 238620 (0.0021) [2024-06-22 15:36:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 3909566464. Throughput: 0: 42338.8. Samples: 3909637240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 15:36:11,929][15401] Updated weights for policy 0, policy_version 238630 (0.0046) [2024-06-22 15:36:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3909763072. Throughput: 0: 42419.9. Samples: 3909894720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 15:36:15,714][15401] Updated weights for policy 0, policy_version 238640 (0.0036) [2024-06-22 15:36:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.7, 300 sec: 42653.9). Total num frames: 3909976064. Throughput: 0: 42409.4. Samples: 3910151460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:18,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 15:36:19,664][15401] Updated weights for policy 0, policy_version 238650 (0.0037) [2024-06-22 15:36:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 3910189056. Throughput: 0: 42611.7. Samples: 3910281540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 15:36:23,489][15401] Updated weights for policy 0, policy_version 238660 (0.0033) [2024-06-22 15:36:27,149][15401] Updated weights for policy 0, policy_version 238670 (0.0050) [2024-06-22 15:36:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3910402048. Throughput: 0: 42437.0. Samples: 3910535340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 15:36:31,030][15401] Updated weights for policy 0, policy_version 238680 (0.0033) [2024-06-22 15:36:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 3910631424. Throughput: 0: 42485.4. Samples: 3910788880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 15:36:34,731][15401] Updated weights for policy 0, policy_version 238690 (0.0036) [2024-06-22 15:36:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3910828032. Throughput: 0: 42602.0. Samples: 3910919940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 15:36:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 15:36:38,756][15401] Updated weights for policy 0, policy_version 238700 (0.0037) [2024-06-22 15:36:42,918][15401] Updated weights for policy 0, policy_version 238710 (0.0035) [2024-06-22 15:36:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3911041024. Throughput: 0: 42562.5. Samples: 3911178560. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:36:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 15:36:46,455][15401] Updated weights for policy 0, policy_version 238720 (0.0032) [2024-06-22 15:36:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 3911286784. Throughput: 0: 42480.4. Samples: 3911426700. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:36:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 15:36:50,508][15401] Updated weights for policy 0, policy_version 238730 (0.0034) [2024-06-22 15:36:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 3911450624. Throughput: 0: 42830.6. Samples: 3911564620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:36:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 15:36:54,093][15401] Updated weights for policy 0, policy_version 238740 (0.0042) [2024-06-22 15:36:57,994][15401] Updated weights for policy 0, policy_version 238750 (0.0024) [2024-06-22 15:36:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 3911680000. Throughput: 0: 42775.1. Samples: 3911819600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:36:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 15:37:01,669][15401] Updated weights for policy 0, policy_version 238760 (0.0041) [2024-06-22 15:37:03,392][15132] Fps is (10 sec: 47503.0, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 3911925760. Throughput: 0: 42627.5. Samples: 3912069800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:03,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 15:37:05,798][15401] Updated weights for policy 0, policy_version 238770 (0.0036) [2024-06-22 15:37:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3912122368. Throughput: 0: 42914.7. Samples: 3912212700. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:08,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 15:37:09,322][15401] Updated weights for policy 0, policy_version 238780 (0.0027) [2024-06-22 15:37:13,390][15132] Fps is (10 sec: 39330.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3912318976. Throughput: 0: 42754.1. Samples: 3912459280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 15:37:13,429][15401] Updated weights for policy 0, policy_version 238790 (0.0036) [2024-06-22 15:37:16,943][15401] Updated weights for policy 0, policy_version 238800 (0.0038) [2024-06-22 15:37:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 3912564736. Throughput: 0: 42808.0. Samples: 3912715240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 15:37:20,891][15401] Updated weights for policy 0, policy_version 238810 (0.0036) [2024-06-22 15:37:23,162][15349] Signal inference workers to stop experience collection... (57850 times) [2024-06-22 15:37:23,162][15349] Signal inference workers to resume experience collection... (57850 times) [2024-06-22 15:37:23,177][15401] InferenceWorker_p0-w0: stopping experience collection (57850 times) [2024-06-22 15:37:23,177][15401] InferenceWorker_p0-w0: resuming experience collection (57850 times) [2024-06-22 15:37:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3912777728. Throughput: 0: 42870.8. Samples: 3912849120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 15:37:24,455][15401] Updated weights for policy 0, policy_version 238820 (0.0029) [2024-06-22 15:37:28,370][15401] Updated weights for policy 0, policy_version 238830 (0.0038) [2024-06-22 15:37:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3912990720. Throughput: 0: 42787.7. Samples: 3913104000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 15:37:31,957][15401] Updated weights for policy 0, policy_version 238840 (0.0027) [2024-06-22 15:37:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 3913203712. Throughput: 0: 43005.3. Samples: 3913362040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:33,393][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 15:37:36,323][15401] Updated weights for policy 0, policy_version 238850 (0.0033) [2024-06-22 15:37:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 3913400320. Throughput: 0: 42904.5. Samples: 3913495320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 15:37:39,485][15401] Updated weights for policy 0, policy_version 238860 (0.0039) [2024-06-22 15:37:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 3913613312. Throughput: 0: 42755.9. Samples: 3913743720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:43,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 15:37:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238868_3913613312.pth... [2024-06-22 15:37:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238245_3903406080.pth [2024-06-22 15:37:44,129][15401] Updated weights for policy 0, policy_version 238870 (0.0036) [2024-06-22 15:37:47,066][15401] Updated weights for policy 0, policy_version 238880 (0.0037) [2024-06-22 15:37:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 3913859072. Throughput: 0: 43105.3. Samples: 3914009440. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 15:37:48,397][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 15:37:51,626][15401] Updated weights for policy 0, policy_version 238890 (0.0030) [2024-06-22 15:37:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3914039296. Throughput: 0: 42778.1. Samples: 3914137720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:37:53,399][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 15:37:54,662][15401] Updated weights for policy 0, policy_version 238900 (0.0035) [2024-06-22 15:37:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3914268672. Throughput: 0: 42877.1. Samples: 3914388740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:37:58,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 15:37:59,135][15401] Updated weights for policy 0, policy_version 238910 (0.0038) [2024-06-22 15:38:02,461][15401] Updated weights for policy 0, policy_version 238920 (0.0043) [2024-06-22 15:38:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 3914481664. Throughput: 0: 43008.5. Samples: 3914650620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 15:38:06,567][15401] Updated weights for policy 0, policy_version 238930 (0.0023) [2024-06-22 15:38:08,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 3914678272. Throughput: 0: 43001.2. Samples: 3914784280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:08,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 15:38:09,878][15401] Updated weights for policy 0, policy_version 238940 (0.0032) [2024-06-22 15:38:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 3914924032. Throughput: 0: 42964.4. Samples: 3915037400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 15:38:14,033][15401] Updated weights for policy 0, policy_version 238950 (0.0026) [2024-06-22 15:38:17,811][15401] Updated weights for policy 0, policy_version 238960 (0.0026) [2024-06-22 15:38:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3915120640. Throughput: 0: 42971.7. Samples: 3915295660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 15:38:21,506][15401] Updated weights for policy 0, policy_version 238970 (0.0032) [2024-06-22 15:38:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3915333632. Throughput: 0: 42915.5. Samples: 3915426520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 15:38:25,332][15401] Updated weights for policy 0, policy_version 238980 (0.0037) [2024-06-22 15:38:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3915563008. Throughput: 0: 43043.9. Samples: 3915680600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 15:38:29,331][15401] Updated weights for policy 0, policy_version 238990 (0.0035) [2024-06-22 15:38:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 3915759616. Throughput: 0: 42929.2. Samples: 3915941260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 15:38:33,520][15401] Updated weights for policy 0, policy_version 239000 (0.0045) [2024-06-22 15:38:33,614][15349] Signal inference workers to stop experience collection... (57900 times) [2024-06-22 15:38:33,666][15401] InferenceWorker_p0-w0: stopping experience collection (57900 times) [2024-06-22 15:38:33,671][15349] Signal inference workers to resume experience collection... (57900 times) [2024-06-22 15:38:33,676][15401] InferenceWorker_p0-w0: resuming experience collection (57900 times) [2024-06-22 15:38:36,889][15401] Updated weights for policy 0, policy_version 239010 (0.0033) [2024-06-22 15:38:38,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 3915988992. Throughput: 0: 42887.1. Samples: 3916067640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:38,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-22 15:38:40,877][15401] Updated weights for policy 0, policy_version 239020 (0.0032) [2024-06-22 15:38:43,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43692.4, 300 sec: 42876.1). Total num frames: 3916234752. Throughput: 0: 43198.9. Samples: 3916332700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 15:38:44,674][15401] Updated weights for policy 0, policy_version 239030 (0.0040) [2024-06-22 15:38:48,284][15401] Updated weights for policy 0, policy_version 239040 (0.0031) [2024-06-22 15:38:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3916431360. Throughput: 0: 43156.3. Samples: 3916592660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 15:38:52,386][15401] Updated weights for policy 0, policy_version 239050 (0.0046) [2024-06-22 15:38:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3916644352. Throughput: 0: 43013.3. Samples: 3916719780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:53,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 15:38:55,825][15401] Updated weights for policy 0, policy_version 239060 (0.0037) [2024-06-22 15:38:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.3, 300 sec: 42820.9). Total num frames: 3916857344. Throughput: 0: 43173.7. Samples: 3916980220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 15:38:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 15:38:59,851][15401] Updated weights for policy 0, policy_version 239070 (0.0027) [2024-06-22 15:39:03,386][15401] Updated weights for policy 0, policy_version 239080 (0.0037) [2024-06-22 15:39:03,392][15132] Fps is (10 sec: 44227.6, 60 sec: 43415.9, 300 sec: 42875.8). Total num frames: 3917086720. Throughput: 0: 43117.0. Samples: 3917236020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:03,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 15:39:07,579][15401] Updated weights for policy 0, policy_version 239090 (0.0055) [2024-06-22 15:39:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 3917283328. Throughput: 0: 43090.7. Samples: 3917365600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:08,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 15:39:11,005][15401] Updated weights for policy 0, policy_version 239100 (0.0026) [2024-06-22 15:39:13,389][15132] Fps is (10 sec: 40969.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3917496320. Throughput: 0: 43135.3. Samples: 3917621680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 15:39:15,283][15401] Updated weights for policy 0, policy_version 239110 (0.0031) [2024-06-22 15:39:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3917725696. Throughput: 0: 43109.5. Samples: 3917881180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:18,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 15:39:18,668][15401] Updated weights for policy 0, policy_version 239120 (0.0037) [2024-06-22 15:39:22,697][15401] Updated weights for policy 0, policy_version 239130 (0.0047) [2024-06-22 15:39:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3917922304. Throughput: 0: 43155.2. Samples: 3918009620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:23,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 15:39:26,347][15401] Updated weights for policy 0, policy_version 239140 (0.0041) [2024-06-22 15:39:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3918151680. Throughput: 0: 43080.4. Samples: 3918271320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 15:39:30,465][15401] Updated weights for policy 0, policy_version 239150 (0.0036) [2024-06-22 15:39:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 42932.5). Total num frames: 3918381056. Throughput: 0: 43053.2. Samples: 3918530060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 15:39:33,798][15401] Updated weights for policy 0, policy_version 239160 (0.0026) [2024-06-22 15:39:38,244][15401] Updated weights for policy 0, policy_version 239170 (0.0030) [2024-06-22 15:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3918561280. Throughput: 0: 43145.0. Samples: 3918661300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 15:39:41,389][15401] Updated weights for policy 0, policy_version 239180 (0.0040) [2024-06-22 15:39:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3918790656. Throughput: 0: 43105.5. Samples: 3918919960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 15:39:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239184_3918790656.pth... [2024-06-22 15:39:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238556_3908501504.pth [2024-06-22 15:39:45,857][15401] Updated weights for policy 0, policy_version 239190 (0.0041) [2024-06-22 15:39:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3919020032. Throughput: 0: 42910.5. Samples: 3919166900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 15:39:48,943][15401] Updated weights for policy 0, policy_version 239200 (0.0032) [2024-06-22 15:39:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 3919200256. Throughput: 0: 43093.0. Samples: 3919304780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 15:39:53,413][15401] Updated weights for policy 0, policy_version 239210 (0.0036) [2024-06-22 15:39:56,871][15401] Updated weights for policy 0, policy_version 239220 (0.0032) [2024-06-22 15:39:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3919429632. Throughput: 0: 43156.0. Samples: 3919563700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:39:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 15:39:59,431][15349] Signal inference workers to stop experience collection... (57950 times) [2024-06-22 15:39:59,431][15349] Signal inference workers to resume experience collection... (57950 times) [2024-06-22 15:39:59,461][15401] InferenceWorker_p0-w0: stopping experience collection (57950 times) [2024-06-22 15:39:59,461][15401] InferenceWorker_p0-w0: resuming experience collection (57950 times) [2024-06-22 15:40:00,834][15401] Updated weights for policy 0, policy_version 239230 (0.0026) [2024-06-22 15:40:03,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 3919675392. Throughput: 0: 42966.2. Samples: 3919814660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:40:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 15:40:04,277][15401] Updated weights for policy 0, policy_version 239240 (0.0031) [2024-06-22 15:40:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3919855616. Throughput: 0: 43247.6. Samples: 3919955760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 15:40:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 15:40:08,511][15401] Updated weights for policy 0, policy_version 239250 (0.0047) [2024-06-22 15:40:11,757][15401] Updated weights for policy 0, policy_version 239260 (0.0034) [2024-06-22 15:40:13,391][15132] Fps is (10 sec: 39314.4, 60 sec: 42870.2, 300 sec: 42875.9). Total num frames: 3920068608. Throughput: 0: 42949.9. Samples: 3920204140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:13,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 15:40:16,108][15401] Updated weights for policy 0, policy_version 239270 (0.0044) [2024-06-22 15:40:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3920297984. Throughput: 0: 42870.0. Samples: 3920459200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:18,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 15:40:19,331][15401] Updated weights for policy 0, policy_version 239280 (0.0026) [2024-06-22 15:40:23,389][15132] Fps is (10 sec: 44244.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3920510976. Throughput: 0: 43084.5. Samples: 3920600100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:23,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 15:40:23,503][15401] Updated weights for policy 0, policy_version 239290 (0.0036) [2024-06-22 15:40:27,095][15401] Updated weights for policy 0, policy_version 239300 (0.0034) [2024-06-22 15:40:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3920723968. Throughput: 0: 42951.5. Samples: 3920852780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 15:40:31,579][15401] Updated weights for policy 0, policy_version 239310 (0.0031) [2024-06-22 15:40:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3920936960. Throughput: 0: 43049.0. Samples: 3921104100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:33,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 15:40:34,645][15401] Updated weights for policy 0, policy_version 239320 (0.0031) [2024-06-22 15:40:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 3921149952. Throughput: 0: 43004.9. Samples: 3921240000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:38,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-22 15:40:39,006][15401] Updated weights for policy 0, policy_version 239330 (0.0029) [2024-06-22 15:40:42,550][15401] Updated weights for policy 0, policy_version 239340 (0.0034) [2024-06-22 15:40:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3921362944. Throughput: 0: 42848.9. Samples: 3921491900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 15:40:46,696][15401] Updated weights for policy 0, policy_version 239350 (0.0027) [2024-06-22 15:40:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3921592320. Throughput: 0: 42961.3. Samples: 3921747920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 15:40:50,093][15401] Updated weights for policy 0, policy_version 239360 (0.0029) [2024-06-22 15:40:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 3921805312. Throughput: 0: 42850.5. Samples: 3921884140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:53,393][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 15:40:54,208][15401] Updated weights for policy 0, policy_version 239370 (0.0041) [2024-06-22 15:40:57,602][15401] Updated weights for policy 0, policy_version 239380 (0.0032) [2024-06-22 15:40:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3922018304. Throughput: 0: 42993.8. Samples: 3922138780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:40:58,390][15132] Avg episode reward: [(0, '0.088')] [2024-06-22 15:41:02,241][15401] Updated weights for policy 0, policy_version 239390 (0.0030) [2024-06-22 15:41:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3922231296. Throughput: 0: 42937.2. Samples: 3922391380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:41:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 15:41:05,227][15401] Updated weights for policy 0, policy_version 239400 (0.0033) [2024-06-22 15:41:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3922444288. Throughput: 0: 42730.2. Samples: 3922522960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:41:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 15:41:09,847][15401] Updated weights for policy 0, policy_version 239410 (0.0039) [2024-06-22 15:41:13,030][15401] Updated weights for policy 0, policy_version 239420 (0.0039) [2024-06-22 15:41:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43145.8, 300 sec: 42987.1). Total num frames: 3922657280. Throughput: 0: 42761.3. Samples: 3922777040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:41:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 15:41:17,340][15401] Updated weights for policy 0, policy_version 239430 (0.0031) [2024-06-22 15:41:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 3922870272. Throughput: 0: 42797.7. Samples: 3923030100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 15:41:18,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 15:41:21,203][15401] Updated weights for policy 0, policy_version 239440 (0.0039) [2024-06-22 15:41:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 3923099648. Throughput: 0: 42694.5. Samples: 3923161360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:23,393][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 15:41:24,727][15401] Updated weights for policy 0, policy_version 239450 (0.0036) [2024-06-22 15:41:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3923279872. Throughput: 0: 42653.9. Samples: 3923411320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:28,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 15:41:28,926][15401] Updated weights for policy 0, policy_version 239460 (0.0037) [2024-06-22 15:41:30,362][15349] Signal inference workers to stop experience collection... (58000 times) [2024-06-22 15:41:30,408][15401] InferenceWorker_p0-w0: stopping experience collection (58000 times) [2024-06-22 15:41:30,417][15349] Signal inference workers to resume experience collection... (58000 times) [2024-06-22 15:41:30,423][15401] InferenceWorker_p0-w0: resuming experience collection (58000 times) [2024-06-22 15:41:32,144][15401] Updated weights for policy 0, policy_version 239470 (0.0041) [2024-06-22 15:41:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 3923525632. Throughput: 0: 42747.0. Samples: 3923671540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 15:41:36,508][15401] Updated weights for policy 0, policy_version 239480 (0.0032) [2024-06-22 15:41:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 3923722240. Throughput: 0: 42615.2. Samples: 3923801720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 15:41:39,687][15401] Updated weights for policy 0, policy_version 239490 (0.0026) [2024-06-22 15:41:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3923935232. Throughput: 0: 42717.1. Samples: 3924061060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 15:41:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239498_3923935232.pth... [2024-06-22 15:41:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000238868_3913613312.pth [2024-06-22 15:41:44,106][15401] Updated weights for policy 0, policy_version 239500 (0.0034) [2024-06-22 15:41:47,276][15401] Updated weights for policy 0, policy_version 239510 (0.0035) [2024-06-22 15:41:48,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.2, 300 sec: 43042.7). Total num frames: 3924148224. Throughput: 0: 42742.9. Samples: 3924314820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 15:41:51,717][15401] Updated weights for policy 0, policy_version 239520 (0.0032) [2024-06-22 15:41:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 3924361216. Throughput: 0: 42731.9. Samples: 3924445900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 15:41:54,924][15401] Updated weights for policy 0, policy_version 239530 (0.0039) [2024-06-22 15:41:58,389][15132] Fps is (10 sec: 42599.9, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 3924574208. Throughput: 0: 42672.6. Samples: 3924697300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:41:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 15:41:59,590][15401] Updated weights for policy 0, policy_version 239540 (0.0039) [2024-06-22 15:42:02,679][15401] Updated weights for policy 0, policy_version 239550 (0.0033) [2024-06-22 15:42:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3924803584. Throughput: 0: 42871.2. Samples: 3924959200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 15:42:07,118][15401] Updated weights for policy 0, policy_version 239560 (0.0038) [2024-06-22 15:42:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 3925000192. Throughput: 0: 42864.5. Samples: 3925090160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 15:42:10,369][15401] Updated weights for policy 0, policy_version 239570 (0.0041) [2024-06-22 15:42:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3925213184. Throughput: 0: 42887.0. Samples: 3925341240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:13,399][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 15:42:14,675][15401] Updated weights for policy 0, policy_version 239580 (0.0032) [2024-06-22 15:42:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 3925426176. Throughput: 0: 42932.9. Samples: 3925603620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:18,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 15:42:18,414][15401] Updated weights for policy 0, policy_version 239590 (0.0023) [2024-06-22 15:42:22,116][15401] Updated weights for policy 0, policy_version 239600 (0.0023) [2024-06-22 15:42:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 3925639168. Throughput: 0: 42969.2. Samples: 3925735340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 15:42:26,070][15401] Updated weights for policy 0, policy_version 239610 (0.0045) [2024-06-22 15:42:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 3925868544. Throughput: 0: 42685.5. Samples: 3925981900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-22 15:42:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 15:42:29,626][15401] Updated weights for policy 0, policy_version 239620 (0.0045) [2024-06-22 15:42:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 3926081536. Throughput: 0: 42982.4. Samples: 3926249020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 15:42:33,526][15401] Updated weights for policy 0, policy_version 239630 (0.0048) [2024-06-22 15:42:37,249][15349] Signal inference workers to stop experience collection... (58050 times) [2024-06-22 15:42:37,296][15401] InferenceWorker_p0-w0: stopping experience collection (58050 times) [2024-06-22 15:42:37,354][15349] Signal inference workers to resume experience collection... (58050 times) [2024-06-22 15:42:37,355][15401] InferenceWorker_p0-w0: resuming experience collection (58050 times) [2024-06-22 15:42:37,500][15401] Updated weights for policy 0, policy_version 239640 (0.0039) [2024-06-22 15:42:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 3926278144. Throughput: 0: 42923.1. Samples: 3926377440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 15:42:41,065][15401] Updated weights for policy 0, policy_version 239650 (0.0039) [2024-06-22 15:42:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42875.7). Total num frames: 3926507520. Throughput: 0: 42780.3. Samples: 3926622520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:43,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 15:42:45,378][15401] Updated weights for policy 0, policy_version 239660 (0.0037) [2024-06-22 15:42:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 3926720512. Throughput: 0: 42695.5. Samples: 3926880500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 15:42:49,021][15401] Updated weights for policy 0, policy_version 239670 (0.0044) [2024-06-22 15:42:52,970][15401] Updated weights for policy 0, policy_version 239680 (0.0039) [2024-06-22 15:42:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3926917120. Throughput: 0: 42763.5. Samples: 3927014520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 15:42:56,628][15401] Updated weights for policy 0, policy_version 239690 (0.0036) [2024-06-22 15:42:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 3927162880. Throughput: 0: 42801.4. Samples: 3927267300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:42:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 15:43:00,491][15401] Updated weights for policy 0, policy_version 239700 (0.0046) [2024-06-22 15:43:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 3927375872. Throughput: 0: 42625.8. Samples: 3927521680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 15:43:04,469][15401] Updated weights for policy 0, policy_version 239710 (0.0043) [2024-06-22 15:43:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3927556096. Throughput: 0: 42545.8. Samples: 3927649900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:08,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 15:43:08,500][15401] Updated weights for policy 0, policy_version 239720 (0.0028) [2024-06-22 15:43:12,087][15401] Updated weights for policy 0, policy_version 239730 (0.0031) [2024-06-22 15:43:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3927785472. Throughput: 0: 42820.0. Samples: 3927908800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 15:43:16,134][15401] Updated weights for policy 0, policy_version 239740 (0.0030) [2024-06-22 15:43:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 3928014848. Throughput: 0: 42476.6. Samples: 3928160460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 15:43:19,607][15401] Updated weights for policy 0, policy_version 239750 (0.0041) [2024-06-22 15:43:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 3928211456. Throughput: 0: 42534.7. Samples: 3928291500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 15:43:23,925][15401] Updated weights for policy 0, policy_version 239760 (0.0034) [2024-06-22 15:43:27,439][15401] Updated weights for policy 0, policy_version 239770 (0.0033) [2024-06-22 15:43:28,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3928424448. Throughput: 0: 42706.1. Samples: 3928544200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:28,399][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 15:43:31,478][15401] Updated weights for policy 0, policy_version 239780 (0.0022) [2024-06-22 15:43:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3928621056. Throughput: 0: 42745.3. Samples: 3928804040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 15:43:35,040][15401] Updated weights for policy 0, policy_version 239790 (0.0028) [2024-06-22 15:43:38,392][15132] Fps is (10 sec: 44227.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 3928866816. Throughput: 0: 42603.6. Samples: 3928931780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 15:43:38,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 15:43:39,006][15401] Updated weights for policy 0, policy_version 239800 (0.0026) [2024-06-22 15:43:42,674][15401] Updated weights for policy 0, policy_version 239810 (0.0031) [2024-06-22 15:43:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 3929063424. Throughput: 0: 42731.9. Samples: 3929190240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:43:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 15:43:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239811_3929063424.pth... [2024-06-22 15:43:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239184_3918790656.pth [2024-06-22 15:43:46,641][15401] Updated weights for policy 0, policy_version 239820 (0.0037) [2024-06-22 15:43:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3929276416. Throughput: 0: 42784.9. Samples: 3929447000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:43:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 15:43:50,289][15401] Updated weights for policy 0, policy_version 239830 (0.0042) [2024-06-22 15:43:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3929489408. Throughput: 0: 42797.8. Samples: 3929575800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:43:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 15:43:54,022][15401] Updated weights for policy 0, policy_version 239840 (0.0045) [2024-06-22 15:43:58,213][15401] Updated weights for policy 0, policy_version 239850 (0.0025) [2024-06-22 15:43:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42765.3). Total num frames: 3929702400. Throughput: 0: 42841.3. Samples: 3929836660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:43:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 15:44:01,475][15401] Updated weights for policy 0, policy_version 239860 (0.0032) [2024-06-22 15:44:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 3929931776. Throughput: 0: 42832.3. Samples: 3930088020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:03,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 15:44:05,847][15401] Updated weights for policy 0, policy_version 239870 (0.0033) [2024-06-22 15:44:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3930128384. Throughput: 0: 42747.0. Samples: 3930215120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 15:44:09,383][15401] Updated weights for policy 0, policy_version 239880 (0.0036) [2024-06-22 15:44:13,381][15401] Updated weights for policy 0, policy_version 239890 (0.0026) [2024-06-22 15:44:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3930357760. Throughput: 0: 42934.9. Samples: 3930476260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 15:44:17,315][15401] Updated weights for policy 0, policy_version 239900 (0.0034) [2024-06-22 15:44:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3930570752. Throughput: 0: 42720.9. Samples: 3930726480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 15:44:21,012][15401] Updated weights for policy 0, policy_version 239910 (0.0040) [2024-06-22 15:44:21,668][15349] Signal inference workers to stop experience collection... (58100 times) [2024-06-22 15:44:21,698][15401] InferenceWorker_p0-w0: stopping experience collection (58100 times) [2024-06-22 15:44:21,723][15349] Signal inference workers to resume experience collection... (58100 times) [2024-06-22 15:44:21,724][15401] InferenceWorker_p0-w0: resuming experience collection (58100 times) [2024-06-22 15:44:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3930783744. Throughput: 0: 42829.4. Samples: 3930859000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 15:44:24,828][15401] Updated weights for policy 0, policy_version 239920 (0.0032) [2024-06-22 15:44:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3930980352. Throughput: 0: 42846.1. Samples: 3931118320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 15:44:28,652][15401] Updated weights for policy 0, policy_version 239930 (0.0027) [2024-06-22 15:44:32,189][15401] Updated weights for policy 0, policy_version 239940 (0.0031) [2024-06-22 15:44:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3931209728. Throughput: 0: 42877.4. Samples: 3931376480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 15:44:36,174][15401] Updated weights for policy 0, policy_version 239950 (0.0031) [2024-06-22 15:44:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 3931422720. Throughput: 0: 42885.3. Samples: 3931505640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 15:44:39,715][15401] Updated weights for policy 0, policy_version 239960 (0.0043) [2024-06-22 15:44:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3931619328. Throughput: 0: 42726.3. Samples: 3931759340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 15:44:44,005][15401] Updated weights for policy 0, policy_version 239970 (0.0038) [2024-06-22 15:44:47,579][15401] Updated weights for policy 0, policy_version 239980 (0.0026) [2024-06-22 15:44:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3931848704. Throughput: 0: 42764.0. Samples: 3932012300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 15:44:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 15:44:51,656][15401] Updated weights for policy 0, policy_version 239990 (0.0026) [2024-06-22 15:44:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3932078080. Throughput: 0: 42878.6. Samples: 3932144660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:44:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 15:44:55,287][15401] Updated weights for policy 0, policy_version 240000 (0.0034) [2024-06-22 15:44:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3932258304. Throughput: 0: 42624.4. Samples: 3932394360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:44:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 15:44:59,597][15401] Updated weights for policy 0, policy_version 240010 (0.0043) [2024-06-22 15:45:03,051][15401] Updated weights for policy 0, policy_version 240020 (0.0041) [2024-06-22 15:45:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 3932487680. Throughput: 0: 42759.6. Samples: 3932650660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 15:45:07,363][15401] Updated weights for policy 0, policy_version 240030 (0.0032) [2024-06-22 15:45:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42876.3). Total num frames: 3932717056. Throughput: 0: 42760.8. Samples: 3932783240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:08,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-22 15:45:10,633][15401] Updated weights for policy 0, policy_version 240040 (0.0027) [2024-06-22 15:45:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 3932913664. Throughput: 0: 42600.5. Samples: 3933035440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:13,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 15:45:14,923][15401] Updated weights for policy 0, policy_version 240050 (0.0025) [2024-06-22 15:45:18,132][15401] Updated weights for policy 0, policy_version 240060 (0.0036) [2024-06-22 15:45:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3933143040. Throughput: 0: 42570.6. Samples: 3933292160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 15:45:22,500][15401] Updated weights for policy 0, policy_version 240070 (0.0033) [2024-06-22 15:45:23,396][15132] Fps is (10 sec: 44219.2, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 3933356032. Throughput: 0: 42700.3. Samples: 3933427420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:23,396][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 15:45:26,034][15401] Updated weights for policy 0, policy_version 240080 (0.0035) [2024-06-22 15:45:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 3933569024. Throughput: 0: 42656.0. Samples: 3933678860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:28,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 15:45:30,128][15401] Updated weights for policy 0, policy_version 240090 (0.0035) [2024-06-22 15:45:33,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3933765632. Throughput: 0: 42785.7. Samples: 3933937660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 15:45:33,530][15349] Signal inference workers to stop experience collection... (58150 times) [2024-06-22 15:45:33,530][15349] Signal inference workers to resume experience collection... (58150 times) [2024-06-22 15:45:33,577][15401] InferenceWorker_p0-w0: stopping experience collection (58150 times) [2024-06-22 15:45:33,577][15401] InferenceWorker_p0-w0: resuming experience collection (58150 times) [2024-06-22 15:45:33,667][15401] Updated weights for policy 0, policy_version 240100 (0.0032) [2024-06-22 15:45:37,662][15401] Updated weights for policy 0, policy_version 240110 (0.0026) [2024-06-22 15:45:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3933995008. Throughput: 0: 42752.6. Samples: 3934068520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:38,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 15:45:41,198][15401] Updated weights for policy 0, policy_version 240120 (0.0023) [2024-06-22 15:45:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 3934191616. Throughput: 0: 42895.6. Samples: 3934324660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:43,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-22 15:45:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240125_3934208000.pth... [2024-06-22 15:45:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239498_3923935232.pth [2024-06-22 15:45:45,534][15401] Updated weights for policy 0, policy_version 240130 (0.0039) [2024-06-22 15:45:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 3934420992. Throughput: 0: 42782.5. Samples: 3934575880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 15:45:48,891][15401] Updated weights for policy 0, policy_version 240140 (0.0037) [2024-06-22 15:45:52,978][15401] Updated weights for policy 0, policy_version 240150 (0.0038) [2024-06-22 15:45:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3934633984. Throughput: 0: 42768.1. Samples: 3934707800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:53,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-22 15:45:56,593][15401] Updated weights for policy 0, policy_version 240160 (0.0033) [2024-06-22 15:45:58,394][15132] Fps is (10 sec: 40943.8, 60 sec: 42868.5, 300 sec: 42708.9). Total num frames: 3934830592. Throughput: 0: 42745.9. Samples: 3934959080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-22 15:45:58,394][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 15:46:00,610][15401] Updated weights for policy 0, policy_version 240170 (0.0037) [2024-06-22 15:46:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3935076352. Throughput: 0: 42729.4. Samples: 3935214980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 15:46:04,870][15401] Updated weights for policy 0, policy_version 240180 (0.0029) [2024-06-22 15:46:08,011][15401] Updated weights for policy 0, policy_version 240190 (0.0033) [2024-06-22 15:46:08,390][15132] Fps is (10 sec: 44254.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3935272960. Throughput: 0: 42734.5. Samples: 3935350200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 15:46:12,487][15401] Updated weights for policy 0, policy_version 240200 (0.0034) [2024-06-22 15:46:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 3935469568. Throughput: 0: 42840.9. Samples: 3935606700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 15:46:15,555][15401] Updated weights for policy 0, policy_version 240210 (0.0036) [2024-06-22 15:46:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 3935698944. Throughput: 0: 42724.9. Samples: 3935860280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 15:46:19,880][15401] Updated weights for policy 0, policy_version 240220 (0.0040) [2024-06-22 15:46:23,364][15401] Updated weights for policy 0, policy_version 240230 (0.0044) [2024-06-22 15:46:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42875.9, 300 sec: 42876.1). Total num frames: 3935928320. Throughput: 0: 42850.9. Samples: 3935996820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 15:46:27,602][15401] Updated weights for policy 0, policy_version 240240 (0.0041) [2024-06-22 15:46:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3936124928. Throughput: 0: 42727.0. Samples: 3936247380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 15:46:31,253][15401] Updated weights for policy 0, policy_version 240250 (0.0052) [2024-06-22 15:46:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3936337920. Throughput: 0: 42785.5. Samples: 3936501220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 15:46:35,172][15401] Updated weights for policy 0, policy_version 240260 (0.0030) [2024-06-22 15:46:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3936567296. Throughput: 0: 42899.9. Samples: 3936638300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:38,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 15:46:38,869][15401] Updated weights for policy 0, policy_version 240270 (0.0032) [2024-06-22 15:46:42,686][15401] Updated weights for policy 0, policy_version 240280 (0.0032) [2024-06-22 15:46:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 3936763904. Throughput: 0: 42938.6. Samples: 3936891140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:43,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 15:46:46,345][15401] Updated weights for policy 0, policy_version 240290 (0.0035) [2024-06-22 15:46:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3936976896. Throughput: 0: 43072.3. Samples: 3937153240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 15:46:50,253][15401] Updated weights for policy 0, policy_version 240300 (0.0041) [2024-06-22 15:46:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3937206272. Throughput: 0: 42931.5. Samples: 3937282120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 15:46:53,862][15401] Updated weights for policy 0, policy_version 240310 (0.0030) [2024-06-22 15:46:57,776][15401] Updated weights for policy 0, policy_version 240320 (0.0029) [2024-06-22 15:46:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43147.5, 300 sec: 42765.0). Total num frames: 3937419264. Throughput: 0: 42881.8. Samples: 3937536380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:46:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 15:47:01,498][15401] Updated weights for policy 0, policy_version 240330 (0.0037) [2024-06-22 15:47:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3937615872. Throughput: 0: 42922.2. Samples: 3937791780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:47:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 15:47:05,517][15349] Signal inference workers to stop experience collection... (58200 times) [2024-06-22 15:47:05,518][15349] Signal inference workers to resume experience collection... (58200 times) [2024-06-22 15:47:05,531][15401] InferenceWorker_p0-w0: stopping experience collection (58200 times) [2024-06-22 15:47:05,531][15401] InferenceWorker_p0-w0: resuming experience collection (58200 times) [2024-06-22 15:47:05,668][15401] Updated weights for policy 0, policy_version 240340 (0.0037) [2024-06-22 15:47:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 3937845248. Throughput: 0: 42714.3. Samples: 3937919060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-22 15:47:08,393][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 15:47:09,205][15401] Updated weights for policy 0, policy_version 240350 (0.0032) [2024-06-22 15:47:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 3938041856. Throughput: 0: 42953.0. Samples: 3938180260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 15:47:13,510][15401] Updated weights for policy 0, policy_version 240360 (0.0033) [2024-06-22 15:47:16,867][15401] Updated weights for policy 0, policy_version 240370 (0.0050) [2024-06-22 15:47:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3938271232. Throughput: 0: 42899.0. Samples: 3938431680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:18,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-22 15:47:21,317][15401] Updated weights for policy 0, policy_version 240380 (0.0042) [2024-06-22 15:47:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 3938467840. Throughput: 0: 42858.4. Samples: 3938566920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 15:47:24,334][15401] Updated weights for policy 0, policy_version 240390 (0.0039) [2024-06-22 15:47:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3938680832. Throughput: 0: 42864.5. Samples: 3938820040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 15:47:28,830][15401] Updated weights for policy 0, policy_version 240400 (0.0036) [2024-06-22 15:47:31,922][15401] Updated weights for policy 0, policy_version 240410 (0.0026) [2024-06-22 15:47:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3938926592. Throughput: 0: 42645.0. Samples: 3939072260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 15:47:36,770][15401] Updated weights for policy 0, policy_version 240420 (0.0032) [2024-06-22 15:47:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 3939123200. Throughput: 0: 42923.5. Samples: 3939213680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 15:47:39,391][15401] Updated weights for policy 0, policy_version 240430 (0.0029) [2024-06-22 15:47:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3939336192. Throughput: 0: 42725.7. Samples: 3939459040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:43,392][15132] Avg episode reward: [(0, '0.252')] [2024-06-22 15:47:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240438_3939336192.pth... [2024-06-22 15:47:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000239811_3929063424.pth [2024-06-22 15:47:44,248][15401] Updated weights for policy 0, policy_version 240440 (0.0045) [2024-06-22 15:47:47,242][15401] Updated weights for policy 0, policy_version 240450 (0.0029) [2024-06-22 15:47:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3939581952. Throughput: 0: 42540.3. Samples: 3939706100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 15:47:51,895][15401] Updated weights for policy 0, policy_version 240460 (0.0031) [2024-06-22 15:47:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 3939745792. Throughput: 0: 42820.0. Samples: 3939845960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:53,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 15:47:55,137][15401] Updated weights for policy 0, policy_version 240470 (0.0031) [2024-06-22 15:47:58,390][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3939958784. Throughput: 0: 42578.5. Samples: 3940096300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:47:58,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 15:47:59,527][15401] Updated weights for policy 0, policy_version 240480 (0.0035) [2024-06-22 15:48:02,755][15401] Updated weights for policy 0, policy_version 240490 (0.0035) [2024-06-22 15:48:03,389][15132] Fps is (10 sec: 47525.7, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 3940220928. Throughput: 0: 42722.8. Samples: 3940354200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:48:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 15:48:07,595][15401] Updated weights for policy 0, policy_version 240500 (0.0041) [2024-06-22 15:48:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 3940384768. Throughput: 0: 42766.2. Samples: 3940491400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:48:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 15:48:10,314][15401] Updated weights for policy 0, policy_version 240510 (0.0037) [2024-06-22 15:48:13,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 3940614144. Throughput: 0: 42570.0. Samples: 3940735700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:48:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 15:48:15,122][15401] Updated weights for policy 0, policy_version 240520 (0.0045) [2024-06-22 15:48:16,734][15349] Signal inference workers to stop experience collection... (58250 times) [2024-06-22 15:48:16,792][15401] InferenceWorker_p0-w0: stopping experience collection (58250 times) [2024-06-22 15:48:16,793][15349] Signal inference workers to resume experience collection... (58250 times) [2024-06-22 15:48:16,809][15401] InferenceWorker_p0-w0: resuming experience collection (58250 times) [2024-06-22 15:48:17,894][15401] Updated weights for policy 0, policy_version 240530 (0.0033) [2024-06-22 15:48:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3940843520. Throughput: 0: 42673.6. Samples: 3940992580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 15:48:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 15:48:22,522][15401] Updated weights for policy 0, policy_version 240540 (0.0028) [2024-06-22 15:48:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3941023744. Throughput: 0: 42600.1. Samples: 3941130680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:23,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 15:48:25,927][15401] Updated weights for policy 0, policy_version 240550 (0.0024) [2024-06-22 15:48:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 3941269504. Throughput: 0: 42730.7. Samples: 3941382020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:28,393][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 15:48:29,995][15401] Updated weights for policy 0, policy_version 240560 (0.0033) [2024-06-22 15:48:33,390][15401] Updated weights for policy 0, policy_version 240570 (0.0030) [2024-06-22 15:48:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 3941498880. Throughput: 0: 43150.4. Samples: 3941647860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 15:48:38,054][15401] Updated weights for policy 0, policy_version 240580 (0.0039) [2024-06-22 15:48:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3941679104. Throughput: 0: 42886.3. Samples: 3941775740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 15:48:40,859][15401] Updated weights for policy 0, policy_version 240590 (0.0027) [2024-06-22 15:48:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3941924864. Throughput: 0: 42872.3. Samples: 3942025560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 15:48:45,595][15401] Updated weights for policy 0, policy_version 240600 (0.0036) [2024-06-22 15:48:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 3942137856. Throughput: 0: 42915.5. Samples: 3942285400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 15:48:48,532][15401] Updated weights for policy 0, policy_version 240610 (0.0040) [2024-06-22 15:48:53,025][15401] Updated weights for policy 0, policy_version 240620 (0.0035) [2024-06-22 15:48:53,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 3942318080. Throughput: 0: 42695.8. Samples: 3942412820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:53,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 15:48:56,211][15401] Updated weights for policy 0, policy_version 240630 (0.0025) [2024-06-22 15:48:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 42876.4). Total num frames: 3942580224. Throughput: 0: 42982.8. Samples: 3942669920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:48:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 15:49:00,472][15401] Updated weights for policy 0, policy_version 240640 (0.0031) [2024-06-22 15:49:03,392][15132] Fps is (10 sec: 47513.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 3942793216. Throughput: 0: 43043.6. Samples: 3942929640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:03,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 15:49:03,853][15401] Updated weights for policy 0, policy_version 240650 (0.0026) [2024-06-22 15:49:07,967][15401] Updated weights for policy 0, policy_version 240660 (0.0037) [2024-06-22 15:49:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3942973440. Throughput: 0: 42879.5. Samples: 3943060260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 15:49:11,473][15349] Signal inference workers to stop experience collection... (58300 times) [2024-06-22 15:49:11,478][15349] Signal inference workers to resume experience collection... (58300 times) [2024-06-22 15:49:11,495][15401] Updated weights for policy 0, policy_version 240670 (0.0032) [2024-06-22 15:49:11,527][15401] InferenceWorker_p0-w0: stopping experience collection (58300 times) [2024-06-22 15:49:11,528][15401] InferenceWorker_p0-w0: resuming experience collection (58300 times) [2024-06-22 15:49:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 3943219200. Throughput: 0: 43084.5. Samples: 3943320720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:13,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 15:49:15,381][15401] Updated weights for policy 0, policy_version 240680 (0.0045) [2024-06-22 15:49:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3943432192. Throughput: 0: 42930.2. Samples: 3943579720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 15:49:18,992][15401] Updated weights for policy 0, policy_version 240690 (0.0027) [2024-06-22 15:49:23,039][15401] Updated weights for policy 0, policy_version 240700 (0.0039) [2024-06-22 15:49:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3943628800. Throughput: 0: 42788.9. Samples: 3943701240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 15:49:26,564][15401] Updated weights for policy 0, policy_version 240710 (0.0026) [2024-06-22 15:49:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 3943858176. Throughput: 0: 42979.6. Samples: 3943959640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 15:49:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 15:49:30,934][15401] Updated weights for policy 0, policy_version 240720 (0.0027) [2024-06-22 15:49:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3944038400. Throughput: 0: 43086.1. Samples: 3944224280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 15:49:34,505][15401] Updated weights for policy 0, policy_version 240730 (0.0039) [2024-06-22 15:49:38,391][15401] Updated weights for policy 0, policy_version 240740 (0.0031) [2024-06-22 15:49:38,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 3944284160. Throughput: 0: 42979.6. Samples: 3944346900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:38,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 15:49:42,166][15401] Updated weights for policy 0, policy_version 240750 (0.0034) [2024-06-22 15:49:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3944497152. Throughput: 0: 42995.4. Samples: 3944604720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 15:49:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240753_3944497152.pth... [2024-06-22 15:49:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240125_3934208000.pth [2024-06-22 15:49:45,959][15401] Updated weights for policy 0, policy_version 240760 (0.0028) [2024-06-22 15:49:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3944693760. Throughput: 0: 42990.8. Samples: 3944864120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 15:49:49,803][15401] Updated weights for policy 0, policy_version 240770 (0.0034) [2024-06-22 15:49:53,393][15132] Fps is (10 sec: 42583.2, 60 sec: 43416.7, 300 sec: 42931.1). Total num frames: 3944923136. Throughput: 0: 42812.1. Samples: 3944986960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:53,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 15:49:54,140][15401] Updated weights for policy 0, policy_version 240780 (0.0027) [2024-06-22 15:49:57,297][15401] Updated weights for policy 0, policy_version 240790 (0.0033) [2024-06-22 15:49:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3945136128. Throughput: 0: 42809.3. Samples: 3945247140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:49:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 15:50:01,790][15401] Updated weights for policy 0, policy_version 240800 (0.0039) [2024-06-22 15:50:03,390][15132] Fps is (10 sec: 39335.9, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 3945316352. Throughput: 0: 42782.2. Samples: 3945504920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 15:50:05,366][15401] Updated weights for policy 0, policy_version 240810 (0.0032) [2024-06-22 15:50:08,390][15132] Fps is (10 sec: 42596.9, 60 sec: 43144.2, 300 sec: 42876.4). Total num frames: 3945562112. Throughput: 0: 42812.0. Samples: 3945627800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:08,391][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 15:50:09,459][15401] Updated weights for policy 0, policy_version 240820 (0.0042) [2024-06-22 15:50:12,885][15401] Updated weights for policy 0, policy_version 240830 (0.0044) [2024-06-22 15:50:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3945775104. Throughput: 0: 42847.2. Samples: 3945887760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 15:50:17,201][15401] Updated weights for policy 0, policy_version 240840 (0.0036) [2024-06-22 15:50:18,390][15132] Fps is (10 sec: 40961.8, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 3945971712. Throughput: 0: 42560.0. Samples: 3946139480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 15:50:20,694][15401] Updated weights for policy 0, policy_version 240850 (0.0038) [2024-06-22 15:50:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3946184704. Throughput: 0: 42599.5. Samples: 3946263780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 15:50:24,761][15401] Updated weights for policy 0, policy_version 240860 (0.0030) [2024-06-22 15:50:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3946397696. Throughput: 0: 42731.2. Samples: 3946527620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 15:50:28,451][15401] Updated weights for policy 0, policy_version 240870 (0.0032) [2024-06-22 15:50:32,254][15401] Updated weights for policy 0, policy_version 240880 (0.0028) [2024-06-22 15:50:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 3946594304. Throughput: 0: 42657.2. Samples: 3946783700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 15:50:35,960][15401] Updated weights for policy 0, policy_version 240890 (0.0038) [2024-06-22 15:50:36,753][15349] Signal inference workers to stop experience collection... (58350 times) [2024-06-22 15:50:36,753][15349] Signal inference workers to resume experience collection... (58350 times) [2024-06-22 15:50:36,788][15401] InferenceWorker_p0-w0: stopping experience collection (58350 times) [2024-06-22 15:50:36,789][15401] InferenceWorker_p0-w0: resuming experience collection (58350 times) [2024-06-22 15:50:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 3946840064. Throughput: 0: 42640.7. Samples: 3946905640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 15:50:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 15:50:39,912][15401] Updated weights for policy 0, policy_version 240900 (0.0030) [2024-06-22 15:50:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3947053056. Throughput: 0: 42786.8. Samples: 3947172540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:50:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 15:50:43,514][15401] Updated weights for policy 0, policy_version 240910 (0.0039) [2024-06-22 15:50:47,431][15401] Updated weights for policy 0, policy_version 240920 (0.0031) [2024-06-22 15:50:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3947233280. Throughput: 0: 42726.3. Samples: 3947427600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:50:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 15:50:51,585][15401] Updated weights for policy 0, policy_version 240930 (0.0036) [2024-06-22 15:50:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.9, 300 sec: 42876.7). Total num frames: 3947479040. Throughput: 0: 42864.3. Samples: 3947556680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:50:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 15:50:55,273][15401] Updated weights for policy 0, policy_version 240940 (0.0043) [2024-06-22 15:50:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 3947692032. Throughput: 0: 42866.7. Samples: 3947816760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:50:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 15:50:59,042][15401] Updated weights for policy 0, policy_version 240950 (0.0026) [2024-06-22 15:51:03,184][15401] Updated weights for policy 0, policy_version 240960 (0.0033) [2024-06-22 15:51:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3947888640. Throughput: 0: 43109.8. Samples: 3948079420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 15:51:06,646][15401] Updated weights for policy 0, policy_version 240970 (0.0035) [2024-06-22 15:51:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.8, 300 sec: 42931.6). Total num frames: 3948134400. Throughput: 0: 43124.5. Samples: 3948204380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 15:51:10,614][15401] Updated weights for policy 0, policy_version 240980 (0.0034) [2024-06-22 15:51:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3948331008. Throughput: 0: 43017.8. Samples: 3948463420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 15:51:14,104][15401] Updated weights for policy 0, policy_version 240990 (0.0025) [2024-06-22 15:51:17,988][15401] Updated weights for policy 0, policy_version 241000 (0.0039) [2024-06-22 15:51:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3948544000. Throughput: 0: 43058.7. Samples: 3948721340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 15:51:21,698][15401] Updated weights for policy 0, policy_version 241010 (0.0032) [2024-06-22 15:51:23,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 3948789760. Throughput: 0: 43305.7. Samples: 3948854400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 15:51:25,832][15401] Updated weights for policy 0, policy_version 241020 (0.0034) [2024-06-22 15:51:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3948969984. Throughput: 0: 43132.4. Samples: 3949113500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 15:51:29,243][15401] Updated weights for policy 0, policy_version 241030 (0.0026) [2024-06-22 15:51:33,290][15401] Updated weights for policy 0, policy_version 241040 (0.0038) [2024-06-22 15:51:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 3949199360. Throughput: 0: 43032.0. Samples: 3949364040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:33,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 15:51:36,846][15401] Updated weights for policy 0, policy_version 241050 (0.0026) [2024-06-22 15:51:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42987.1). Total num frames: 3949445120. Throughput: 0: 43098.2. Samples: 3949496100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 15:51:41,269][15401] Updated weights for policy 0, policy_version 241060 (0.0032) [2024-06-22 15:51:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3949608960. Throughput: 0: 43086.3. Samples: 3949755640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 15:51:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241066_3949625344.pth... [2024-06-22 15:51:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240438_3939336192.pth [2024-06-22 15:51:44,331][15349] Signal inference workers to stop experience collection... (58400 times) [2024-06-22 15:51:44,331][15349] Signal inference workers to resume experience collection... (58400 times) [2024-06-22 15:51:44,366][15401] InferenceWorker_p0-w0: stopping experience collection (58400 times) [2024-06-22 15:51:44,366][15401] InferenceWorker_p0-w0: resuming experience collection (58400 times) [2024-06-22 15:51:44,506][15401] Updated weights for policy 0, policy_version 241070 (0.0046) [2024-06-22 15:51:48,390][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3949838336. Throughput: 0: 42764.9. Samples: 3950003840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 15:51:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 15:51:48,807][15401] Updated weights for policy 0, policy_version 241080 (0.0028) [2024-06-22 15:51:52,172][15401] Updated weights for policy 0, policy_version 241090 (0.0045) [2024-06-22 15:51:53,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 3950084096. Throughput: 0: 43003.6. Samples: 3950139540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:51:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 15:51:56,332][15401] Updated weights for policy 0, policy_version 241100 (0.0040) [2024-06-22 15:51:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3950231552. Throughput: 0: 42975.8. Samples: 3950397340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:51:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 15:51:59,762][15401] Updated weights for policy 0, policy_version 241110 (0.0042) [2024-06-22 15:52:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 3950477312. Throughput: 0: 42651.6. Samples: 3950640660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 15:52:03,925][15401] Updated weights for policy 0, policy_version 241120 (0.0049) [2024-06-22 15:52:07,515][15401] Updated weights for policy 0, policy_version 241130 (0.0029) [2024-06-22 15:52:08,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 3950723072. Throughput: 0: 42832.0. Samples: 3950781840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 15:52:11,382][15401] Updated weights for policy 0, policy_version 241140 (0.0030) [2024-06-22 15:52:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 3950870528. Throughput: 0: 42617.5. Samples: 3951031280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 15:52:15,240][15401] Updated weights for policy 0, policy_version 241150 (0.0034) [2024-06-22 15:52:18,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3951099904. Throughput: 0: 42647.2. Samples: 3951283160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 15:52:19,450][15401] Updated weights for policy 0, policy_version 241160 (0.0032) [2024-06-22 15:52:23,206][15401] Updated weights for policy 0, policy_version 241170 (0.0027) [2024-06-22 15:52:23,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 3951345664. Throughput: 0: 42607.3. Samples: 3951413420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 15:52:27,016][15401] Updated weights for policy 0, policy_version 241180 (0.0028) [2024-06-22 15:52:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3951525888. Throughput: 0: 42460.8. Samples: 3951666380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:28,395][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 15:52:30,749][15401] Updated weights for policy 0, policy_version 241190 (0.0031) [2024-06-22 15:52:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 3951755264. Throughput: 0: 42699.5. Samples: 3951925420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:33,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 15:52:34,400][15401] Updated weights for policy 0, policy_version 241200 (0.0051) [2024-06-22 15:52:38,381][15401] Updated weights for policy 0, policy_version 241210 (0.0031) [2024-06-22 15:52:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 3951984640. Throughput: 0: 42648.0. Samples: 3952058700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 15:52:42,058][15401] Updated weights for policy 0, policy_version 241220 (0.0049) [2024-06-22 15:52:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 3952181248. Throughput: 0: 42368.5. Samples: 3952303920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:43,398][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 15:52:46,026][15401] Updated weights for policy 0, policy_version 241230 (0.0035) [2024-06-22 15:52:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 3952394240. Throughput: 0: 42769.8. Samples: 3952565300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 15:52:49,733][15401] Updated weights for policy 0, policy_version 241240 (0.0041) [2024-06-22 15:52:52,836][15349] Signal inference workers to stop experience collection... (58450 times) [2024-06-22 15:52:52,888][15401] InferenceWorker_p0-w0: stopping experience collection (58450 times) [2024-06-22 15:52:52,945][15349] Signal inference workers to resume experience collection... (58450 times) [2024-06-22 15:52:52,945][15401] InferenceWorker_p0-w0: resuming experience collection (58450 times) [2024-06-22 15:52:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 3952607232. Throughput: 0: 42402.8. Samples: 3952689960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 15:52:53,625][15401] Updated weights for policy 0, policy_version 241250 (0.0039) [2024-06-22 15:52:57,571][15401] Updated weights for policy 0, policy_version 241260 (0.0039) [2024-06-22 15:52:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 3952836608. Throughput: 0: 42579.5. Samples: 3952947360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:52:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 15:53:01,437][15401] Updated weights for policy 0, policy_version 241270 (0.0035) [2024-06-22 15:53:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3953049600. Throughput: 0: 42423.8. Samples: 3953192240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:53:05,214][15401] Updated weights for policy 0, policy_version 241280 (0.0038) [2024-06-22 15:53:08,389][15132] Fps is (10 sec: 37682.9, 60 sec: 41506.2, 300 sec: 42709.5). Total num frames: 3953213440. Throughput: 0: 42358.2. Samples: 3953319540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 15:53:09,483][15401] Updated weights for policy 0, policy_version 241290 (0.0031) [2024-06-22 15:53:13,058][15401] Updated weights for policy 0, policy_version 241300 (0.0035) [2024-06-22 15:53:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 3953475584. Throughput: 0: 42617.0. Samples: 3953584140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 15:53:17,186][15401] Updated weights for policy 0, policy_version 241310 (0.0033) [2024-06-22 15:53:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3953672192. Throughput: 0: 42405.5. Samples: 3953833560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 15:53:20,670][15401] Updated weights for policy 0, policy_version 241320 (0.0038) [2024-06-22 15:53:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 3953868800. Throughput: 0: 42288.0. Samples: 3953961660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 15:53:24,791][15401] Updated weights for policy 0, policy_version 241330 (0.0030) [2024-06-22 15:53:28,283][15401] Updated weights for policy 0, policy_version 241340 (0.0038) [2024-06-22 15:53:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 3954114560. Throughput: 0: 42657.4. Samples: 3954223500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 15:53:32,402][15401] Updated weights for policy 0, policy_version 241350 (0.0030) [2024-06-22 15:53:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42871.4, 300 sec: 42875.7). Total num frames: 3954327552. Throughput: 0: 42369.3. Samples: 3954472020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:33,393][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 15:53:35,866][15401] Updated weights for policy 0, policy_version 241360 (0.0036) [2024-06-22 15:53:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 3954507776. Throughput: 0: 42410.5. Samples: 3954598440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:38,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 15:53:40,307][15401] Updated weights for policy 0, policy_version 241370 (0.0039) [2024-06-22 15:53:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3954753536. Throughput: 0: 42453.2. Samples: 3954857760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 15:53:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241379_3954753536.pth... [2024-06-22 15:53:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000240753_3944497152.pth [2024-06-22 15:53:43,620][15401] Updated weights for policy 0, policy_version 241380 (0.0035) [2024-06-22 15:53:47,849][15401] Updated weights for policy 0, policy_version 241390 (0.0029) [2024-06-22 15:53:48,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 3954966528. Throughput: 0: 42633.3. Samples: 3955110840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:48,393][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 15:53:51,411][15401] Updated weights for policy 0, policy_version 241400 (0.0030) [2024-06-22 15:53:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 3955163136. Throughput: 0: 42667.6. Samples: 3955239580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 15:53:55,399][15401] Updated weights for policy 0, policy_version 241410 (0.0029) [2024-06-22 15:53:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 3955359744. Throughput: 0: 42391.5. Samples: 3955491760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:53:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 15:53:59,041][15401] Updated weights for policy 0, policy_version 241420 (0.0036) [2024-06-22 15:54:03,099][15401] Updated weights for policy 0, policy_version 241430 (0.0039) [2024-06-22 15:54:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3955589120. Throughput: 0: 42454.5. Samples: 3955744020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:54:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 15:54:04,251][15349] Signal inference workers to stop experience collection... (58500 times) [2024-06-22 15:54:04,253][15349] Signal inference workers to resume experience collection... (58500 times) [2024-06-22 15:54:04,265][15401] InferenceWorker_p0-w0: stopping experience collection (58500 times) [2024-06-22 15:54:04,278][15401] InferenceWorker_p0-w0: resuming experience collection (58500 times) [2024-06-22 15:54:07,015][15401] Updated weights for policy 0, policy_version 241440 (0.0030) [2024-06-22 15:54:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3955785728. Throughput: 0: 42558.2. Samples: 3955876780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 15:54:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 15:54:10,947][15401] Updated weights for policy 0, policy_version 241450 (0.0044) [2024-06-22 15:54:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 3956015104. Throughput: 0: 42402.2. Samples: 3956131600. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 15:54:14,771][15401] Updated weights for policy 0, policy_version 241460 (0.0032) [2024-06-22 15:54:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 3956211712. Throughput: 0: 42631.6. Samples: 3956390440. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:18,401][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 15:54:18,661][15401] Updated weights for policy 0, policy_version 241470 (0.0034) [2024-06-22 15:54:22,327][15401] Updated weights for policy 0, policy_version 241480 (0.0021) [2024-06-22 15:54:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3956441088. Throughput: 0: 42670.3. Samples: 3956518600. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:23,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 15:54:26,265][15401] Updated weights for policy 0, policy_version 241490 (0.0036) [2024-06-22 15:54:28,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 3956654080. Throughput: 0: 42613.8. Samples: 3956775380. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 15:54:30,382][15401] Updated weights for policy 0, policy_version 241500 (0.0031) [2024-06-22 15:54:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 3956867072. Throughput: 0: 42688.0. Samples: 3957031700. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 15:54:33,904][15401] Updated weights for policy 0, policy_version 241510 (0.0038) [2024-06-22 15:54:38,097][15401] Updated weights for policy 0, policy_version 241520 (0.0046) [2024-06-22 15:54:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 3957080064. Throughput: 0: 42555.6. Samples: 3957154580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:38,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 15:54:41,638][15401] Updated weights for policy 0, policy_version 241530 (0.0034) [2024-06-22 15:54:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 3957293056. Throughput: 0: 42623.0. Samples: 3957409800. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 15:54:45,691][15401] Updated weights for policy 0, policy_version 241540 (0.0041) [2024-06-22 15:54:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42052.3, 300 sec: 42598.6). Total num frames: 3957489664. Throughput: 0: 42629.4. Samples: 3957662440. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:48,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 15:54:49,771][15401] Updated weights for policy 0, policy_version 241550 (0.0037) [2024-06-22 15:54:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 3957702656. Throughput: 0: 42444.4. Samples: 3957786780. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 15:54:53,651][15401] Updated weights for policy 0, policy_version 241560 (0.0045) [2024-06-22 15:54:57,319][15401] Updated weights for policy 0, policy_version 241570 (0.0024) [2024-06-22 15:54:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3957915648. Throughput: 0: 42504.5. Samples: 3958044300. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:54:58,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 15:55:01,114][15401] Updated weights for policy 0, policy_version 241580 (0.0036) [2024-06-22 15:55:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 3958128640. Throughput: 0: 42393.4. Samples: 3958298040. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:55:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 15:55:04,898][15401] Updated weights for policy 0, policy_version 241590 (0.0047) [2024-06-22 15:55:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 3958358016. Throughput: 0: 42278.3. Samples: 3958421120. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:55:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 15:55:08,569][15401] Updated weights for policy 0, policy_version 241600 (0.0041) [2024-06-22 15:55:12,884][15401] Updated weights for policy 0, policy_version 241610 (0.0042) [2024-06-22 15:55:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 3958554624. Throughput: 0: 42436.5. Samples: 3958685020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:55:13,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 15:55:16,058][15401] Updated weights for policy 0, policy_version 241620 (0.0032) [2024-06-22 15:55:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 3958751232. Throughput: 0: 42316.6. Samples: 3958935940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-22 15:55:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 15:55:20,524][15401] Updated weights for policy 0, policy_version 241630 (0.0038) [2024-06-22 15:55:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3959013376. Throughput: 0: 42401.6. Samples: 3959062660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 15:55:23,753][15401] Updated weights for policy 0, policy_version 241640 (0.0051) [2024-06-22 15:55:26,822][15349] Signal inference workers to stop experience collection... (58550 times) [2024-06-22 15:55:26,829][15349] Signal inference workers to resume experience collection... (58550 times) [2024-06-22 15:55:26,850][15401] InferenceWorker_p0-w0: stopping experience collection (58550 times) [2024-06-22 15:55:26,856][15401] InferenceWorker_p0-w0: resuming experience collection (58550 times) [2024-06-22 15:55:27,975][15401] Updated weights for policy 0, policy_version 241650 (0.0040) [2024-06-22 15:55:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 3959193600. Throughput: 0: 42646.7. Samples: 3959329000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:28,392][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 15:55:31,233][15401] Updated weights for policy 0, policy_version 241660 (0.0036) [2024-06-22 15:55:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 3959390208. Throughput: 0: 42723.7. Samples: 3959584900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:33,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 15:55:35,994][15401] Updated weights for policy 0, policy_version 241670 (0.0038) [2024-06-22 15:55:38,390][15132] Fps is (10 sec: 47524.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 3959668736. Throughput: 0: 42737.7. Samples: 3959709980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 15:55:38,621][15401] Updated weights for policy 0, policy_version 241680 (0.0034) [2024-06-22 15:55:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 3959816192. Throughput: 0: 42794.6. Samples: 3959970060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 15:55:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241689_3959832576.pth... [2024-06-22 15:55:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241066_3949625344.pth [2024-06-22 15:55:43,622][15401] Updated weights for policy 0, policy_version 241690 (0.0027) [2024-06-22 15:55:46,768][15401] Updated weights for policy 0, policy_version 241700 (0.0029) [2024-06-22 15:55:48,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 3960029184. Throughput: 0: 42847.0. Samples: 3960226160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 15:55:51,238][15401] Updated weights for policy 0, policy_version 241710 (0.0033) [2024-06-22 15:55:53,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 3960291328. Throughput: 0: 43047.5. Samples: 3960358260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 15:55:54,665][15401] Updated weights for policy 0, policy_version 241720 (0.0040) [2024-06-22 15:55:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 3960438784. Throughput: 0: 42762.3. Samples: 3960609320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:55:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 15:55:58,902][15401] Updated weights for policy 0, policy_version 241730 (0.0039) [2024-06-22 15:56:02,178][15401] Updated weights for policy 0, policy_version 241740 (0.0035) [2024-06-22 15:56:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 3960700928. Throughput: 0: 42959.0. Samples: 3960869100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 15:56:06,429][15401] Updated weights for policy 0, policy_version 241750 (0.0037) [2024-06-22 15:56:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 3960913920. Throughput: 0: 43043.6. Samples: 3960999620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 15:56:09,819][15401] Updated weights for policy 0, policy_version 241760 (0.0037) [2024-06-22 15:56:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 3961094144. Throughput: 0: 42763.7. Samples: 3961253260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 15:56:14,176][15401] Updated weights for policy 0, policy_version 241770 (0.0033) [2024-06-22 15:56:17,580][15401] Updated weights for policy 0, policy_version 241780 (0.0027) [2024-06-22 15:56:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 3961339904. Throughput: 0: 42801.3. Samples: 3961510960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 15:56:21,610][15401] Updated weights for policy 0, policy_version 241790 (0.0032) [2024-06-22 15:56:23,389][15132] Fps is (10 sec: 49151.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3961585664. Throughput: 0: 43088.5. Samples: 3961648960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 15:56:25,097][15401] Updated weights for policy 0, policy_version 241800 (0.0046) [2024-06-22 15:56:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 3961765888. Throughput: 0: 42997.4. Samples: 3961904940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-22 15:56:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 15:56:29,314][15401] Updated weights for policy 0, policy_version 241810 (0.0026) [2024-06-22 15:56:29,402][15349] Signal inference workers to stop experience collection... (58600 times) [2024-06-22 15:56:29,456][15401] InferenceWorker_p0-w0: stopping experience collection (58600 times) [2024-06-22 15:56:29,459][15349] Signal inference workers to resume experience collection... (58600 times) [2024-06-22 15:56:29,468][15401] InferenceWorker_p0-w0: resuming experience collection (58600 times) [2024-06-22 15:56:32,697][15401] Updated weights for policy 0, policy_version 241820 (0.0037) [2024-06-22 15:56:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 3961995264. Throughput: 0: 43137.9. Samples: 3962167360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 15:56:36,733][15401] Updated weights for policy 0, policy_version 241830 (0.0038) [2024-06-22 15:56:38,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3962241024. Throughput: 0: 43167.1. Samples: 3962300780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 15:56:40,495][15401] Updated weights for policy 0, policy_version 241840 (0.0044) [2024-06-22 15:56:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 3962388480. Throughput: 0: 43221.3. Samples: 3962554280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 15:56:44,344][15401] Updated weights for policy 0, policy_version 241850 (0.0043) [2024-06-22 15:56:48,028][15401] Updated weights for policy 0, policy_version 241860 (0.0028) [2024-06-22 15:56:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43690.8, 300 sec: 42598.4). Total num frames: 3962650624. Throughput: 0: 43172.1. Samples: 3962811840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 15:56:51,912][15401] Updated weights for policy 0, policy_version 241870 (0.0031) [2024-06-22 15:56:53,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3962880000. Throughput: 0: 43236.9. Samples: 3962945280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 15:56:55,468][15401] Updated weights for policy 0, policy_version 241880 (0.0030) [2024-06-22 15:56:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 3963043840. Throughput: 0: 43195.9. Samples: 3963197080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:56:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 15:56:59,507][15401] Updated weights for policy 0, policy_version 241890 (0.0037) [2024-06-22 15:57:03,345][15401] Updated weights for policy 0, policy_version 241900 (0.0033) [2024-06-22 15:57:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 3963289600. Throughput: 0: 43084.4. Samples: 3963449760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:03,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 15:57:07,005][15401] Updated weights for policy 0, policy_version 241910 (0.0026) [2024-06-22 15:57:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 3963518976. Throughput: 0: 43076.8. Samples: 3963587420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 15:57:10,920][15401] Updated weights for policy 0, policy_version 241920 (0.0037) [2024-06-22 15:57:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.4, 300 sec: 42709.4). Total num frames: 3963699200. Throughput: 0: 42910.9. Samples: 3963835940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 15:57:14,997][15401] Updated weights for policy 0, policy_version 241930 (0.0025) [2024-06-22 15:57:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 3963912192. Throughput: 0: 42704.9. Samples: 3964089080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 15:57:18,688][15401] Updated weights for policy 0, policy_version 241940 (0.0046) [2024-06-22 15:57:22,475][15401] Updated weights for policy 0, policy_version 241950 (0.0028) [2024-06-22 15:57:23,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3964157952. Throughput: 0: 42703.2. Samples: 3964222420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 15:57:26,401][15401] Updated weights for policy 0, policy_version 241960 (0.0034) [2024-06-22 15:57:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 3964338176. Throughput: 0: 42747.5. Samples: 3964477920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 15:57:30,031][15401] Updated weights for policy 0, policy_version 241970 (0.0028) [2024-06-22 15:57:30,934][15349] Signal inference workers to stop experience collection... (58650 times) [2024-06-22 15:57:30,935][15349] Signal inference workers to resume experience collection... (58650 times) [2024-06-22 15:57:30,976][15401] InferenceWorker_p0-w0: stopping experience collection (58650 times) [2024-06-22 15:57:30,976][15401] InferenceWorker_p0-w0: resuming experience collection (58650 times) [2024-06-22 15:57:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 3964567552. Throughput: 0: 42683.5. Samples: 3964732600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 15:57:34,122][15401] Updated weights for policy 0, policy_version 241980 (0.0028) [2024-06-22 15:57:37,989][15401] Updated weights for policy 0, policy_version 241990 (0.0037) [2024-06-22 15:57:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3964796928. Throughput: 0: 42579.6. Samples: 3964861360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 15:57:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 15:57:41,664][15401] Updated weights for policy 0, policy_version 242000 (0.0034) [2024-06-22 15:57:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 3964993536. Throughput: 0: 42742.2. Samples: 3965120480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:57:43,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-22 15:57:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242004_3964993536.pth... [2024-06-22 15:57:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241379_3954753536.pth [2024-06-22 15:57:45,500][15401] Updated weights for policy 0, policy_version 242010 (0.0033) [2024-06-22 15:57:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3965206528. Throughput: 0: 42795.2. Samples: 3965375540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:57:48,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-22 15:57:49,223][15401] Updated weights for policy 0, policy_version 242020 (0.0034) [2024-06-22 15:57:53,169][15401] Updated weights for policy 0, policy_version 242030 (0.0027) [2024-06-22 15:57:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 3965435904. Throughput: 0: 42591.7. Samples: 3965504040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:57:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 15:57:56,986][15401] Updated weights for policy 0, policy_version 242040 (0.0036) [2024-06-22 15:57:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 3965599744. Throughput: 0: 42634.8. Samples: 3965754500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:57:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 15:58:00,730][15401] Updated weights for policy 0, policy_version 242050 (0.0027) [2024-06-22 15:58:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3965845504. Throughput: 0: 42724.2. Samples: 3966011680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:03,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 15:58:04,438][15401] Updated weights for policy 0, policy_version 242060 (0.0024) [2024-06-22 15:58:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 3966058496. Throughput: 0: 42780.0. Samples: 3966147520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 15:58:08,565][15401] Updated weights for policy 0, policy_version 242070 (0.0032) [2024-06-22 15:58:12,032][15401] Updated weights for policy 0, policy_version 242080 (0.0036) [2024-06-22 15:58:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 3966255104. Throughput: 0: 42608.4. Samples: 3966395300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 15:58:16,014][15401] Updated weights for policy 0, policy_version 242090 (0.0037) [2024-06-22 15:58:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3966500864. Throughput: 0: 42711.1. Samples: 3966654600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:18,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 15:58:19,562][15401] Updated weights for policy 0, policy_version 242100 (0.0029) [2024-06-22 15:58:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 3966697472. Throughput: 0: 42890.3. Samples: 3966791420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 15:58:23,858][15401] Updated weights for policy 0, policy_version 242110 (0.0029) [2024-06-22 15:58:27,202][15401] Updated weights for policy 0, policy_version 242120 (0.0033) [2024-06-22 15:58:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 3966910464. Throughput: 0: 42676.1. Samples: 3967040900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:28,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 15:58:31,346][15401] Updated weights for policy 0, policy_version 242130 (0.0040) [2024-06-22 15:58:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3967139840. Throughput: 0: 42727.5. Samples: 3967298280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 15:58:35,015][15401] Updated weights for policy 0, policy_version 242140 (0.0028) [2024-06-22 15:58:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3967352832. Throughput: 0: 42918.1. Samples: 3967435360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 15:58:38,842][15401] Updated weights for policy 0, policy_version 242150 (0.0038) [2024-06-22 15:58:39,597][15349] Signal inference workers to stop experience collection... (58700 times) [2024-06-22 15:58:39,598][15349] Signal inference workers to resume experience collection... (58700 times) [2024-06-22 15:58:39,643][15401] InferenceWorker_p0-w0: stopping experience collection (58700 times) [2024-06-22 15:58:39,644][15401] InferenceWorker_p0-w0: resuming experience collection (58700 times) [2024-06-22 15:58:42,508][15401] Updated weights for policy 0, policy_version 242160 (0.0045) [2024-06-22 15:58:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 3967565824. Throughput: 0: 42924.0. Samples: 3967686080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 15:58:46,442][15401] Updated weights for policy 0, policy_version 242170 (0.0038) [2024-06-22 15:58:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3967795200. Throughput: 0: 42772.6. Samples: 3967936440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-22 15:58:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 15:58:50,058][15401] Updated weights for policy 0, policy_version 242180 (0.0033) [2024-06-22 15:58:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 3967975424. Throughput: 0: 42840.2. Samples: 3968075340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:58:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 15:58:54,030][15401] Updated weights for policy 0, policy_version 242190 (0.0027) [2024-06-22 15:58:58,108][15401] Updated weights for policy 0, policy_version 242200 (0.0032) [2024-06-22 15:58:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 3968204800. Throughput: 0: 42876.8. Samples: 3968324760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:58:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 15:59:01,710][15401] Updated weights for policy 0, policy_version 242210 (0.0033) [2024-06-22 15:59:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3968434176. Throughput: 0: 42853.3. Samples: 3968583000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 15:59:05,955][15401] Updated weights for policy 0, policy_version 242220 (0.0029) [2024-06-22 15:59:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 3968614400. Throughput: 0: 42750.0. Samples: 3968715160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 15:59:09,307][15401] Updated weights for policy 0, policy_version 242230 (0.0033) [2024-06-22 15:59:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 3968843776. Throughput: 0: 42882.7. Samples: 3968970620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 15:59:13,606][15401] Updated weights for policy 0, policy_version 242240 (0.0035) [2024-06-22 15:59:17,144][15401] Updated weights for policy 0, policy_version 242250 (0.0056) [2024-06-22 15:59:18,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3969089536. Throughput: 0: 42918.6. Samples: 3969229620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:18,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-22 15:59:21,264][15401] Updated weights for policy 0, policy_version 242260 (0.0036) [2024-06-22 15:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3969269760. Throughput: 0: 42937.8. Samples: 3969367560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 15:59:24,770][15401] Updated weights for policy 0, policy_version 242270 (0.0045) [2024-06-22 15:59:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 3969482752. Throughput: 0: 42855.9. Samples: 3969614600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 15:59:28,860][15401] Updated weights for policy 0, policy_version 242280 (0.0042) [2024-06-22 15:59:32,302][15401] Updated weights for policy 0, policy_version 242290 (0.0032) [2024-06-22 15:59:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3969712128. Throughput: 0: 43206.7. Samples: 3969880740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 15:59:36,399][15401] Updated weights for policy 0, policy_version 242300 (0.0025) [2024-06-22 15:59:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3969925120. Throughput: 0: 43007.3. Samples: 3970010660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 15:59:39,769][15401] Updated weights for policy 0, policy_version 242310 (0.0030) [2024-06-22 15:59:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 3970138112. Throughput: 0: 43016.9. Samples: 3970260520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 15:59:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242318_3970138112.pth... [2024-06-22 15:59:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000241689_3959832576.pth [2024-06-22 15:59:43,888][15401] Updated weights for policy 0, policy_version 242320 (0.0043) [2024-06-22 15:59:47,645][15401] Updated weights for policy 0, policy_version 242330 (0.0035) [2024-06-22 15:59:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3970351104. Throughput: 0: 42935.5. Samples: 3970515100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 15:59:51,957][15401] Updated weights for policy 0, policy_version 242340 (0.0043) [2024-06-22 15:59:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3970564096. Throughput: 0: 42972.8. Samples: 3970648940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:53,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 15:59:55,256][15401] Updated weights for policy 0, policy_version 242350 (0.0039) [2024-06-22 15:59:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3970793472. Throughput: 0: 43029.7. Samples: 3970906960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 15:59:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 15:59:59,365][15401] Updated weights for policy 0, policy_version 242360 (0.0035) [2024-06-22 16:00:02,659][15401] Updated weights for policy 0, policy_version 242370 (0.0037) [2024-06-22 16:00:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3971006464. Throughput: 0: 43090.3. Samples: 3971168680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 16:00:06,823][15401] Updated weights for policy 0, policy_version 242380 (0.0024) [2024-06-22 16:00:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3971219456. Throughput: 0: 42879.7. Samples: 3971297140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 16:00:09,335][15349] Signal inference workers to stop experience collection... (58750 times) [2024-06-22 16:00:09,337][15349] Signal inference workers to resume experience collection... (58750 times) [2024-06-22 16:00:09,373][15401] InferenceWorker_p0-w0: stopping experience collection (58750 times) [2024-06-22 16:00:09,374][15401] InferenceWorker_p0-w0: resuming experience collection (58750 times) [2024-06-22 16:00:10,714][15401] Updated weights for policy 0, policy_version 242390 (0.0027) [2024-06-22 16:00:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3971448832. Throughput: 0: 43145.0. Samples: 3971556120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 16:00:14,249][15401] Updated weights for policy 0, policy_version 242400 (0.0026) [2024-06-22 16:00:18,218][15401] Updated weights for policy 0, policy_version 242410 (0.0030) [2024-06-22 16:00:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3971645440. Throughput: 0: 42874.2. Samples: 3971810080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 16:00:21,699][15401] Updated weights for policy 0, policy_version 242420 (0.0039) [2024-06-22 16:00:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 3971858432. Throughput: 0: 42821.6. Samples: 3971937640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 16:00:25,659][15401] Updated weights for policy 0, policy_version 242430 (0.0026) [2024-06-22 16:00:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 3972087808. Throughput: 0: 43051.1. Samples: 3972197820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 16:00:29,203][15401] Updated weights for policy 0, policy_version 242440 (0.0034) [2024-06-22 16:00:33,159][15401] Updated weights for policy 0, policy_version 242450 (0.0028) [2024-06-22 16:00:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 3972300800. Throughput: 0: 43146.8. Samples: 3972456700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 16:00:36,789][15401] Updated weights for policy 0, policy_version 242460 (0.0039) [2024-06-22 16:00:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3972497408. Throughput: 0: 43046.7. Samples: 3972586040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 16:00:40,788][15401] Updated weights for policy 0, policy_version 242470 (0.0033) [2024-06-22 16:00:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3972710400. Throughput: 0: 43120.1. Samples: 3972847360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 16:00:44,418][15401] Updated weights for policy 0, policy_version 242480 (0.0031) [2024-06-22 16:00:48,272][15401] Updated weights for policy 0, policy_version 242490 (0.0036) [2024-06-22 16:00:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3972956160. Throughput: 0: 42938.2. Samples: 3973100900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 16:00:52,101][15401] Updated weights for policy 0, policy_version 242500 (0.0032) [2024-06-22 16:00:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 3973152768. Throughput: 0: 43074.6. Samples: 3973235500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 16:00:55,902][15401] Updated weights for policy 0, policy_version 242510 (0.0046) [2024-06-22 16:00:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3973349376. Throughput: 0: 43064.8. Samples: 3973494040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:00:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 16:00:59,870][15401] Updated weights for policy 0, policy_version 242520 (0.0028) [2024-06-22 16:01:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3973578752. Throughput: 0: 43008.4. Samples: 3973745460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:01:03,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 16:01:03,840][15401] Updated weights for policy 0, policy_version 242530 (0.0029) [2024-06-22 16:01:07,547][15401] Updated weights for policy 0, policy_version 242540 (0.0031) [2024-06-22 16:01:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3973791744. Throughput: 0: 43087.3. Samples: 3973876560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 16:01:08,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-22 16:01:11,506][15401] Updated weights for policy 0, policy_version 242550 (0.0034) [2024-06-22 16:01:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 3973988352. Throughput: 0: 42904.9. Samples: 3974128540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 16:01:15,265][15401] Updated weights for policy 0, policy_version 242560 (0.0043) [2024-06-22 16:01:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3974217728. Throughput: 0: 42850.2. Samples: 3974384960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 16:01:19,096][15401] Updated weights for policy 0, policy_version 242570 (0.0026) [2024-06-22 16:01:22,833][15401] Updated weights for policy 0, policy_version 242580 (0.0042) [2024-06-22 16:01:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3974447104. Throughput: 0: 42914.2. Samples: 3974517180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 16:01:26,623][15401] Updated weights for policy 0, policy_version 242590 (0.0042) [2024-06-22 16:01:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3974627328. Throughput: 0: 42581.8. Samples: 3974763540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 16:01:29,729][15349] Signal inference workers to stop experience collection... (58800 times) [2024-06-22 16:01:29,735][15349] Signal inference workers to resume experience collection... (58800 times) [2024-06-22 16:01:29,779][15401] InferenceWorker_p0-w0: stopping experience collection (58800 times) [2024-06-22 16:01:29,779][15401] InferenceWorker_p0-w0: resuming experience collection (58800 times) [2024-06-22 16:01:30,386][15401] Updated weights for policy 0, policy_version 242600 (0.0034) [2024-06-22 16:01:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 3974856704. Throughput: 0: 42863.6. Samples: 3975029760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 16:01:33,971][15401] Updated weights for policy 0, policy_version 242610 (0.0024) [2024-06-22 16:01:38,109][15401] Updated weights for policy 0, policy_version 242620 (0.0034) [2024-06-22 16:01:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 3975086080. Throughput: 0: 42836.8. Samples: 3975163160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 16:01:41,695][15401] Updated weights for policy 0, policy_version 242630 (0.0035) [2024-06-22 16:01:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3975282688. Throughput: 0: 42591.0. Samples: 3975410640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:43,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 16:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242632_3975282688.pth... [2024-06-22 16:01:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242004_3964993536.pth [2024-06-22 16:01:46,077][15401] Updated weights for policy 0, policy_version 242640 (0.0025) [2024-06-22 16:01:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3975495680. Throughput: 0: 42686.3. Samples: 3975666340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 16:01:49,366][15401] Updated weights for policy 0, policy_version 242650 (0.0038) [2024-06-22 16:01:53,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3975725056. Throughput: 0: 42696.9. Samples: 3975797920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 16:01:53,479][15401] Updated weights for policy 0, policy_version 242660 (0.0043) [2024-06-22 16:01:57,362][15401] Updated weights for policy 0, policy_version 242670 (0.0033) [2024-06-22 16:01:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 3975921664. Throughput: 0: 42754.4. Samples: 3976052480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:01:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 16:02:01,255][15401] Updated weights for policy 0, policy_version 242680 (0.0046) [2024-06-22 16:02:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3976151040. Throughput: 0: 42847.0. Samples: 3976313080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:02:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 16:02:04,803][15401] Updated weights for policy 0, policy_version 242690 (0.0024) [2024-06-22 16:02:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3976347648. Throughput: 0: 42826.7. Samples: 3976444380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:02:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 16:02:08,847][15401] Updated weights for policy 0, policy_version 242700 (0.0030) [2024-06-22 16:02:12,267][15401] Updated weights for policy 0, policy_version 242710 (0.0038) [2024-06-22 16:02:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 3976577024. Throughput: 0: 42901.8. Samples: 3976694120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:02:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 16:02:16,656][15401] Updated weights for policy 0, policy_version 242720 (0.0041) [2024-06-22 16:02:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3976790016. Throughput: 0: 42752.1. Samples: 3976953600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 16:02:20,071][15401] Updated weights for policy 0, policy_version 242730 (0.0027) [2024-06-22 16:02:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3977003008. Throughput: 0: 42637.8. Samples: 3977081860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 16:02:24,166][15401] Updated weights for policy 0, policy_version 242740 (0.0024) [2024-06-22 16:02:27,684][15401] Updated weights for policy 0, policy_version 242750 (0.0021) [2024-06-22 16:02:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3977216000. Throughput: 0: 42734.3. Samples: 3977333680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 16:02:31,907][15401] Updated weights for policy 0, policy_version 242760 (0.0032) [2024-06-22 16:02:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3977428992. Throughput: 0: 42952.0. Samples: 3977599180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 16:02:35,327][15401] Updated weights for policy 0, policy_version 242770 (0.0046) [2024-06-22 16:02:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 3977641984. Throughput: 0: 42883.0. Samples: 3977727660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 16:02:39,402][15401] Updated weights for policy 0, policy_version 242780 (0.0034) [2024-06-22 16:02:43,214][15401] Updated weights for policy 0, policy_version 242790 (0.0032) [2024-06-22 16:02:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3977871360. Throughput: 0: 42899.4. Samples: 3977982960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 16:02:45,591][15349] Signal inference workers to stop experience collection... (58850 times) [2024-06-22 16:02:45,596][15349] Signal inference workers to resume experience collection... (58850 times) [2024-06-22 16:02:45,636][15401] InferenceWorker_p0-w0: stopping experience collection (58850 times) [2024-06-22 16:02:45,636][15401] InferenceWorker_p0-w0: resuming experience collection (58850 times) [2024-06-22 16:02:46,915][15401] Updated weights for policy 0, policy_version 242800 (0.0032) [2024-06-22 16:02:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3978084352. Throughput: 0: 42948.8. Samples: 3978245780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 16:02:50,749][15401] Updated weights for policy 0, policy_version 242810 (0.0043) [2024-06-22 16:02:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 3978297344. Throughput: 0: 42815.9. Samples: 3978371100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 16:02:54,656][15401] Updated weights for policy 0, policy_version 242820 (0.0027) [2024-06-22 16:02:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 3978510336. Throughput: 0: 42935.0. Samples: 3978626200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:02:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 16:02:58,791][15401] Updated weights for policy 0, policy_version 242830 (0.0026) [2024-06-22 16:03:02,449][15401] Updated weights for policy 0, policy_version 242840 (0.0039) [2024-06-22 16:03:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3978723328. Throughput: 0: 42850.6. Samples: 3978881880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 16:03:06,414][15401] Updated weights for policy 0, policy_version 242850 (0.0044) [2024-06-22 16:03:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 3978936320. Throughput: 0: 43009.9. Samples: 3979017300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 16:03:10,056][15401] Updated weights for policy 0, policy_version 242860 (0.0038) [2024-06-22 16:03:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3979149312. Throughput: 0: 43050.2. Samples: 3979270940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 16:03:14,705][15401] Updated weights for policy 0, policy_version 242870 (0.0025) [2024-06-22 16:03:17,734][15401] Updated weights for policy 0, policy_version 242880 (0.0032) [2024-06-22 16:03:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3979345920. Throughput: 0: 42705.0. Samples: 3979520900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 16:03:22,099][15401] Updated weights for policy 0, policy_version 242890 (0.0035) [2024-06-22 16:03:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3979575296. Throughput: 0: 42680.4. Samples: 3979648280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 16:03:25,891][15401] Updated weights for policy 0, policy_version 242900 (0.0034) [2024-06-22 16:03:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3979771904. Throughput: 0: 42721.0. Samples: 3979905400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 16:03:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 16:03:29,655][15401] Updated weights for policy 0, policy_version 242910 (0.0033) [2024-06-22 16:03:33,378][15401] Updated weights for policy 0, policy_version 242920 (0.0032) [2024-06-22 16:03:33,394][15132] Fps is (10 sec: 42578.8, 60 sec: 42868.1, 300 sec: 42875.4). Total num frames: 3980001280. Throughput: 0: 42561.5. Samples: 3980161240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:33,395][15132] Avg episode reward: [(0, '0.837')] [2024-06-22 16:03:37,477][15401] Updated weights for policy 0, policy_version 242930 (0.0029) [2024-06-22 16:03:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3980197888. Throughput: 0: 42522.3. Samples: 3980284600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 16:03:41,039][15401] Updated weights for policy 0, policy_version 242940 (0.0032) [2024-06-22 16:03:43,390][15132] Fps is (10 sec: 40979.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 3980410880. Throughput: 0: 42523.1. Samples: 3980539740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 16:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242945_3980410880.pth... [2024-06-22 16:03:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242318_3970138112.pth [2024-06-22 16:03:45,102][15401] Updated weights for policy 0, policy_version 242950 (0.0045) [2024-06-22 16:03:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 3980623872. Throughput: 0: 42560.1. Samples: 3980797080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:48,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-22 16:03:48,662][15401] Updated weights for policy 0, policy_version 242960 (0.0039) [2024-06-22 16:03:52,889][15401] Updated weights for policy 0, policy_version 242970 (0.0033) [2024-06-22 16:03:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3980853248. Throughput: 0: 42277.2. Samples: 3980919780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:53,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 16:03:56,165][15401] Updated weights for policy 0, policy_version 242980 (0.0025) [2024-06-22 16:03:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3981066240. Throughput: 0: 42486.2. Samples: 3981182820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:03:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 16:04:00,381][15401] Updated weights for policy 0, policy_version 242990 (0.0039) [2024-06-22 16:04:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 3981262848. Throughput: 0: 42648.8. Samples: 3981440100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 16:04:04,014][15401] Updated weights for policy 0, policy_version 243000 (0.0046) [2024-06-22 16:04:04,963][15349] Signal inference workers to stop experience collection... (58900 times) [2024-06-22 16:04:04,971][15349] Signal inference workers to resume experience collection... (58900 times) [2024-06-22 16:04:04,974][15401] InferenceWorker_p0-w0: stopping experience collection (58900 times) [2024-06-22 16:04:05,004][15401] InferenceWorker_p0-w0: resuming experience collection (58900 times) [2024-06-22 16:04:08,049][15401] Updated weights for policy 0, policy_version 243010 (0.0033) [2024-06-22 16:04:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3981492224. Throughput: 0: 42555.6. Samples: 3981563280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:08,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 16:04:11,655][15401] Updated weights for policy 0, policy_version 243020 (0.0032) [2024-06-22 16:04:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3981721600. Throughput: 0: 42643.5. Samples: 3981824360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 16:04:15,426][15401] Updated weights for policy 0, policy_version 243030 (0.0034) [2024-06-22 16:04:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 3981901824. Throughput: 0: 42755.5. Samples: 3982085040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 16:04:19,465][15401] Updated weights for policy 0, policy_version 243040 (0.0037) [2024-06-22 16:04:23,005][15401] Updated weights for policy 0, policy_version 243050 (0.0041) [2024-06-22 16:04:23,394][15132] Fps is (10 sec: 42581.0, 60 sec: 42868.6, 300 sec: 42931.0). Total num frames: 3982147584. Throughput: 0: 42660.1. Samples: 3982204480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:23,394][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 16:04:27,172][15401] Updated weights for policy 0, policy_version 243060 (0.0032) [2024-06-22 16:04:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3982344192. Throughput: 0: 42814.7. Samples: 3982466400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:28,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-22 16:04:30,624][15401] Updated weights for policy 0, policy_version 243070 (0.0029) [2024-06-22 16:04:33,389][15132] Fps is (10 sec: 40977.1, 60 sec: 42601.8, 300 sec: 42820.6). Total num frames: 3982557184. Throughput: 0: 42826.2. Samples: 3982724260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 16:04:34,730][15401] Updated weights for policy 0, policy_version 243080 (0.0033) [2024-06-22 16:04:38,030][15401] Updated weights for policy 0, policy_version 243090 (0.0035) [2024-06-22 16:04:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3982786560. Throughput: 0: 42978.7. Samples: 3982853820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 16:04:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 16:04:42,319][15401] Updated weights for policy 0, policy_version 243100 (0.0033) [2024-06-22 16:04:43,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3982966784. Throughput: 0: 42854.9. Samples: 3983111300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:04:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 16:04:45,680][15401] Updated weights for policy 0, policy_version 243110 (0.0034) [2024-06-22 16:04:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3983196160. Throughput: 0: 42795.0. Samples: 3983365880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:04:48,399][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 16:04:50,108][15401] Updated weights for policy 0, policy_version 243120 (0.0045) [2024-06-22 16:04:53,232][15401] Updated weights for policy 0, policy_version 243130 (0.0044) [2024-06-22 16:04:53,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3983441920. Throughput: 0: 43088.5. Samples: 3983502260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:04:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 16:04:57,687][15401] Updated weights for policy 0, policy_version 243140 (0.0027) [2024-06-22 16:04:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 3983622144. Throughput: 0: 42986.6. Samples: 3983758760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:04:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 16:05:00,888][15401] Updated weights for policy 0, policy_version 243150 (0.0046) [2024-06-22 16:05:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3983851520. Throughput: 0: 42851.0. Samples: 3984013340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 16:05:05,355][15401] Updated weights for policy 0, policy_version 243160 (0.0034) [2024-06-22 16:05:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 3984064512. Throughput: 0: 43179.1. Samples: 3984147360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 16:05:08,526][15401] Updated weights for policy 0, policy_version 243170 (0.0036) [2024-06-22 16:05:12,829][15401] Updated weights for policy 0, policy_version 243180 (0.0035) [2024-06-22 16:05:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3984277504. Throughput: 0: 43074.9. Samples: 3984404780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:13,394][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 16:05:16,014][15401] Updated weights for policy 0, policy_version 243190 (0.0031) [2024-06-22 16:05:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3984490496. Throughput: 0: 43092.4. Samples: 3984663420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 16:05:20,438][15401] Updated weights for policy 0, policy_version 243200 (0.0027) [2024-06-22 16:05:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42874.4, 300 sec: 42820.6). Total num frames: 3984719872. Throughput: 0: 43136.4. Samples: 3984794960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 16:05:23,602][15401] Updated weights for policy 0, policy_version 243210 (0.0034) [2024-06-22 16:05:23,618][15349] Signal inference workers to stop experience collection... (58950 times) [2024-06-22 16:05:23,618][15349] Signal inference workers to resume experience collection... (58950 times) [2024-06-22 16:05:23,646][15401] InferenceWorker_p0-w0: stopping experience collection (58950 times) [2024-06-22 16:05:23,646][15401] InferenceWorker_p0-w0: resuming experience collection (58950 times) [2024-06-22 16:05:27,995][15401] Updated weights for policy 0, policy_version 243220 (0.0032) [2024-06-22 16:05:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3984932864. Throughput: 0: 43095.7. Samples: 3985050600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 16:05:31,179][15401] Updated weights for policy 0, policy_version 243230 (0.0026) [2024-06-22 16:05:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3985129472. Throughput: 0: 43013.4. Samples: 3985301480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 16:05:35,723][15401] Updated weights for policy 0, policy_version 243240 (0.0040) [2024-06-22 16:05:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3985358848. Throughput: 0: 42857.8. Samples: 3985430860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:38,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 16:05:38,767][15401] Updated weights for policy 0, policy_version 243250 (0.0028) [2024-06-22 16:05:43,234][15401] Updated weights for policy 0, policy_version 243260 (0.0047) [2024-06-22 16:05:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 3985571840. Throughput: 0: 43124.1. Samples: 3985699340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 16:05:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243261_3985588224.pth... [2024-06-22 16:05:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242632_3975282688.pth [2024-06-22 16:05:46,287][15401] Updated weights for policy 0, policy_version 243270 (0.0028) [2024-06-22 16:05:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 3985784832. Throughput: 0: 42976.9. Samples: 3985947300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 16:05:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 16:05:51,026][15401] Updated weights for policy 0, policy_version 243280 (0.0029) [2024-06-22 16:05:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3985997824. Throughput: 0: 42936.8. Samples: 3986079520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:05:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 16:05:54,004][15401] Updated weights for policy 0, policy_version 243290 (0.0033) [2024-06-22 16:05:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3986194432. Throughput: 0: 43030.7. Samples: 3986341160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:05:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 16:05:58,613][15401] Updated weights for policy 0, policy_version 243300 (0.0033) [2024-06-22 16:06:01,942][15401] Updated weights for policy 0, policy_version 243310 (0.0029) [2024-06-22 16:06:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 3986440192. Throughput: 0: 42688.4. Samples: 3986584400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 16:06:06,248][15401] Updated weights for policy 0, policy_version 243320 (0.0025) [2024-06-22 16:06:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3986636800. Throughput: 0: 42799.1. Samples: 3986720920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 16:06:09,648][15401] Updated weights for policy 0, policy_version 243330 (0.0023) [2024-06-22 16:06:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3986849792. Throughput: 0: 42842.5. Samples: 3986978520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 16:06:13,777][15401] Updated weights for policy 0, policy_version 243340 (0.0033) [2024-06-22 16:06:17,185][15401] Updated weights for policy 0, policy_version 243350 (0.0030) [2024-06-22 16:06:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 3987079168. Throughput: 0: 42790.2. Samples: 3987227040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 16:06:21,876][15401] Updated weights for policy 0, policy_version 243360 (0.0039) [2024-06-22 16:06:23,211][15349] Signal inference workers to stop experience collection... (59000 times) [2024-06-22 16:06:23,212][15349] Signal inference workers to resume experience collection... (59000 times) [2024-06-22 16:06:23,241][15401] InferenceWorker_p0-w0: stopping experience collection (59000 times) [2024-06-22 16:06:23,241][15401] InferenceWorker_p0-w0: resuming experience collection (59000 times) [2024-06-22 16:06:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3987292160. Throughput: 0: 42993.7. Samples: 3987365580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 16:06:24,938][15401] Updated weights for policy 0, policy_version 243370 (0.0023) [2024-06-22 16:06:28,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3987488768. Throughput: 0: 42591.7. Samples: 3987615960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 16:06:29,366][15401] Updated weights for policy 0, policy_version 243380 (0.0026) [2024-06-22 16:06:32,467][15401] Updated weights for policy 0, policy_version 243390 (0.0032) [2024-06-22 16:06:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3987718144. Throughput: 0: 42757.4. Samples: 3987871380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:06:36,882][15401] Updated weights for policy 0, policy_version 243400 (0.0035) [2024-06-22 16:06:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3987914752. Throughput: 0: 42782.3. Samples: 3988004720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 16:06:40,300][15401] Updated weights for policy 0, policy_version 243410 (0.0033) [2024-06-22 16:06:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 3988127744. Throughput: 0: 42573.9. Samples: 3988256980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 16:06:44,737][15401] Updated weights for policy 0, policy_version 243420 (0.0042) [2024-06-22 16:06:48,013][15401] Updated weights for policy 0, policy_version 243430 (0.0027) [2024-06-22 16:06:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3988373504. Throughput: 0: 42913.2. Samples: 3988515500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:48,399][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 16:06:52,361][15401] Updated weights for policy 0, policy_version 243440 (0.0040) [2024-06-22 16:06:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3988553728. Throughput: 0: 42890.7. Samples: 3988651000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 16:06:55,638][15401] Updated weights for policy 0, policy_version 243450 (0.0038) [2024-06-22 16:06:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3988783104. Throughput: 0: 42794.4. Samples: 3988904260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 16:06:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 16:06:59,906][15401] Updated weights for policy 0, policy_version 243460 (0.0040) [2024-06-22 16:07:03,193][15401] Updated weights for policy 0, policy_version 243470 (0.0030) [2024-06-22 16:07:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 3989012480. Throughput: 0: 43087.7. Samples: 3989165980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:07:07,281][15401] Updated weights for policy 0, policy_version 243480 (0.0029) [2024-06-22 16:07:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 3989209088. Throughput: 0: 42975.5. Samples: 3989299480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 16:07:10,725][15401] Updated weights for policy 0, policy_version 243490 (0.0026) [2024-06-22 16:07:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 3989422080. Throughput: 0: 43074.5. Samples: 3989554320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 16:07:14,821][15401] Updated weights for policy 0, policy_version 243500 (0.0037) [2024-06-22 16:07:18,362][15401] Updated weights for policy 0, policy_version 243510 (0.0033) [2024-06-22 16:07:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 3989667840. Throughput: 0: 43102.8. Samples: 3989811000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:18,392][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 16:07:22,442][15401] Updated weights for policy 0, policy_version 243520 (0.0036) [2024-06-22 16:07:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 3989831680. Throughput: 0: 42953.4. Samples: 3989937620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:23,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 16:07:25,967][15401] Updated weights for policy 0, policy_version 243530 (0.0034) [2024-06-22 16:07:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 3990077440. Throughput: 0: 42981.8. Samples: 3990191160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 16:07:29,994][15401] Updated weights for policy 0, policy_version 243540 (0.0029) [2024-06-22 16:07:33,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3990290432. Throughput: 0: 43047.0. Samples: 3990452620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 16:07:33,993][15401] Updated weights for policy 0, policy_version 243550 (0.0048) [2024-06-22 16:07:37,661][15401] Updated weights for policy 0, policy_version 243560 (0.0035) [2024-06-22 16:07:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 3990487040. Throughput: 0: 42870.3. Samples: 3990580160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 16:07:41,373][15349] Signal inference workers to stop experience collection... (59050 times) [2024-06-22 16:07:41,397][15401] InferenceWorker_p0-w0: stopping experience collection (59050 times) [2024-06-22 16:07:41,435][15349] Signal inference workers to resume experience collection... (59050 times) [2024-06-22 16:07:41,443][15401] InferenceWorker_p0-w0: resuming experience collection (59050 times) [2024-06-22 16:07:41,597][15401] Updated weights for policy 0, policy_version 243570 (0.0043) [2024-06-22 16:07:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 3990716416. Throughput: 0: 42918.9. Samples: 3990835620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 16:07:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243575_3990732800.pth... [2024-06-22 16:07:43,658][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000242945_3980410880.pth [2024-06-22 16:07:45,420][15401] Updated weights for policy 0, policy_version 243580 (0.0031) [2024-06-22 16:07:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 3990929408. Throughput: 0: 42866.4. Samples: 3991094960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:48,396][15132] Avg episode reward: [(0, '0.863')] [2024-06-22 16:07:49,135][15401] Updated weights for policy 0, policy_version 243590 (0.0035) [2024-06-22 16:07:52,997][15401] Updated weights for policy 0, policy_version 243600 (0.0046) [2024-06-22 16:07:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 3991142400. Throughput: 0: 42644.2. Samples: 3991218460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:53,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 16:07:56,910][15401] Updated weights for policy 0, policy_version 243610 (0.0034) [2024-06-22 16:07:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 3991355392. Throughput: 0: 42684.6. Samples: 3991475120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:07:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:08:00,809][15401] Updated weights for policy 0, policy_version 243620 (0.0026) [2024-06-22 16:08:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 3991568384. Throughput: 0: 42736.8. Samples: 3991734160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:08:03,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 16:08:04,754][15401] Updated weights for policy 0, policy_version 243630 (0.0034) [2024-06-22 16:08:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 3991764992. Throughput: 0: 42684.9. Samples: 3991858440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:08:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 16:08:08,897][15401] Updated weights for policy 0, policy_version 243640 (0.0035) [2024-06-22 16:08:12,228][15401] Updated weights for policy 0, policy_version 243650 (0.0036) [2024-06-22 16:08:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3992010752. Throughput: 0: 42831.9. Samples: 3992118600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 16:08:16,427][15401] Updated weights for policy 0, policy_version 243660 (0.0030) [2024-06-22 16:08:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 3992223744. Throughput: 0: 42692.6. Samples: 3992373780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 16:08:20,003][15401] Updated weights for policy 0, policy_version 243670 (0.0038) [2024-06-22 16:08:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 3992420352. Throughput: 0: 42719.4. Samples: 3992502540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 16:08:24,121][15401] Updated weights for policy 0, policy_version 243680 (0.0031) [2024-06-22 16:08:27,667][15401] Updated weights for policy 0, policy_version 243690 (0.0030) [2024-06-22 16:08:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42821.2). Total num frames: 3992633344. Throughput: 0: 42889.4. Samples: 3992765640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 16:08:31,678][15401] Updated weights for policy 0, policy_version 243700 (0.0035) [2024-06-22 16:08:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 3992862720. Throughput: 0: 42736.4. Samples: 3993018100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:33,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-22 16:08:35,198][15401] Updated weights for policy 0, policy_version 243710 (0.0033) [2024-06-22 16:08:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3993059328. Throughput: 0: 42915.5. Samples: 3993149660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:38,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 16:08:39,157][15401] Updated weights for policy 0, policy_version 243720 (0.0041) [2024-06-22 16:08:42,601][15401] Updated weights for policy 0, policy_version 243730 (0.0036) [2024-06-22 16:08:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3993305088. Throughput: 0: 42996.0. Samples: 3993409940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 16:08:46,741][15401] Updated weights for policy 0, policy_version 243740 (0.0038) [2024-06-22 16:08:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 3993501696. Throughput: 0: 42942.3. Samples: 3993666560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 16:08:50,555][15401] Updated weights for policy 0, policy_version 243750 (0.0030) [2024-06-22 16:08:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 3993731072. Throughput: 0: 43027.0. Samples: 3993794660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 16:08:54,343][15401] Updated weights for policy 0, policy_version 243760 (0.0036) [2024-06-22 16:08:58,241][15401] Updated weights for policy 0, policy_version 243770 (0.0028) [2024-06-22 16:08:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3993927680. Throughput: 0: 43153.0. Samples: 3994060480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:08:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 16:09:01,896][15401] Updated weights for policy 0, policy_version 243780 (0.0035) [2024-06-22 16:09:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 3994173440. Throughput: 0: 43100.1. Samples: 3994313280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:09:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 16:09:05,801][15401] Updated weights for policy 0, policy_version 243790 (0.0035) [2024-06-22 16:09:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 3994370048. Throughput: 0: 43187.5. Samples: 3994445980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:09:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 16:09:09,765][15401] Updated weights for policy 0, policy_version 243800 (0.0030) [2024-06-22 16:09:13,334][15401] Updated weights for policy 0, policy_version 243810 (0.0043) [2024-06-22 16:09:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3994583040. Throughput: 0: 43062.7. Samples: 3994703460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:09:13,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 16:09:17,315][15401] Updated weights for policy 0, policy_version 243820 (0.0033) [2024-06-22 16:09:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42876.7). Total num frames: 3994796032. Throughput: 0: 43060.5. Samples: 3994955820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 16:09:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 16:09:21,038][15401] Updated weights for policy 0, policy_version 243830 (0.0039) [2024-06-22 16:09:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 3995009024. Throughput: 0: 43089.3. Samples: 3995088680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:23,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 16:09:24,776][15401] Updated weights for policy 0, policy_version 243840 (0.0032) [2024-06-22 16:09:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3995222016. Throughput: 0: 42945.6. Samples: 3995342500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 16:09:28,525][15401] Updated weights for policy 0, policy_version 243850 (0.0028) [2024-06-22 16:09:32,527][15401] Updated weights for policy 0, policy_version 243860 (0.0031) [2024-06-22 16:09:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 3995418624. Throughput: 0: 43076.3. Samples: 3995605000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 16:09:33,513][15349] Signal inference workers to stop experience collection... (59100 times) [2024-06-22 16:09:33,568][15401] InferenceWorker_p0-w0: stopping experience collection (59100 times) [2024-06-22 16:09:33,631][15349] Signal inference workers to resume experience collection... (59100 times) [2024-06-22 16:09:33,631][15401] InferenceWorker_p0-w0: resuming experience collection (59100 times) [2024-06-22 16:09:36,083][15401] Updated weights for policy 0, policy_version 243870 (0.0030) [2024-06-22 16:09:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 3995664384. Throughput: 0: 43000.0. Samples: 3995729660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 16:09:40,315][15401] Updated weights for policy 0, policy_version 243880 (0.0032) [2024-06-22 16:09:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 3995844608. Throughput: 0: 42686.5. Samples: 3995981380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 16:09:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243887_3995844608.pth... [2024-06-22 16:09:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243261_3985588224.pth [2024-06-22 16:09:43,939][15401] Updated weights for policy 0, policy_version 243890 (0.0040) [2024-06-22 16:09:47,814][15401] Updated weights for policy 0, policy_version 243900 (0.0034) [2024-06-22 16:09:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 3996073984. Throughput: 0: 43050.1. Samples: 3996250540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 16:09:51,729][15401] Updated weights for policy 0, policy_version 243910 (0.0027) [2024-06-22 16:09:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 3996303360. Throughput: 0: 43033.9. Samples: 3996382500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 16:09:55,272][15401] Updated weights for policy 0, policy_version 243920 (0.0038) [2024-06-22 16:09:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 3996483584. Throughput: 0: 42786.6. Samples: 3996628860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:09:58,399][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 16:09:59,365][15401] Updated weights for policy 0, policy_version 243930 (0.0044) [2024-06-22 16:10:02,807][15401] Updated weights for policy 0, policy_version 243940 (0.0032) [2024-06-22 16:10:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 3996729344. Throughput: 0: 43011.0. Samples: 3996891320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 16:10:07,205][15401] Updated weights for policy 0, policy_version 243950 (0.0035) [2024-06-22 16:10:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 3996958720. Throughput: 0: 43031.1. Samples: 3997025080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 16:10:10,281][15401] Updated weights for policy 0, policy_version 243960 (0.0042) [2024-06-22 16:10:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3997138944. Throughput: 0: 42945.5. Samples: 3997275040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 16:10:14,738][15401] Updated weights for policy 0, policy_version 243970 (0.0033) [2024-06-22 16:10:17,922][15401] Updated weights for policy 0, policy_version 243980 (0.0029) [2024-06-22 16:10:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3997384704. Throughput: 0: 42910.8. Samples: 3997535980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 16:10:22,308][15401] Updated weights for policy 0, policy_version 243990 (0.0045) [2024-06-22 16:10:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 3997597696. Throughput: 0: 43108.1. Samples: 3997669520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 16:10:25,793][15401] Updated weights for policy 0, policy_version 244000 (0.0038) [2024-06-22 16:10:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 3997794304. Throughput: 0: 43042.3. Samples: 3997918280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:10:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 16:10:29,846][15401] Updated weights for policy 0, policy_version 244010 (0.0040) [2024-06-22 16:10:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 3998007296. Throughput: 0: 42853.0. Samples: 3998178920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 16:10:33,467][15401] Updated weights for policy 0, policy_version 244020 (0.0027) [2024-06-22 16:10:37,655][15401] Updated weights for policy 0, policy_version 244030 (0.0043) [2024-06-22 16:10:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 3998203904. Throughput: 0: 42707.5. Samples: 3998304340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 16:10:41,162][15401] Updated weights for policy 0, policy_version 244040 (0.0045) [2024-06-22 16:10:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 3998449664. Throughput: 0: 42914.7. Samples: 3998560020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 16:10:45,104][15401] Updated weights for policy 0, policy_version 244050 (0.0038) [2024-06-22 16:10:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 3998646272. Throughput: 0: 42710.7. Samples: 3998813300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:48,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 16:10:48,962][15401] Updated weights for policy 0, policy_version 244060 (0.0032) [2024-06-22 16:10:49,716][15349] Signal inference workers to stop experience collection... (59150 times) [2024-06-22 16:10:49,749][15401] InferenceWorker_p0-w0: stopping experience collection (59150 times) [2024-06-22 16:10:49,773][15349] Signal inference workers to resume experience collection... (59150 times) [2024-06-22 16:10:49,773][15401] InferenceWorker_p0-w0: resuming experience collection (59150 times) [2024-06-22 16:10:52,617][15401] Updated weights for policy 0, policy_version 244070 (0.0039) [2024-06-22 16:10:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 3998859264. Throughput: 0: 42451.9. Samples: 3998935420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:53,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 16:10:56,478][15401] Updated weights for policy 0, policy_version 244080 (0.0033) [2024-06-22 16:10:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 3999088640. Throughput: 0: 42674.7. Samples: 3999195400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:10:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 16:11:00,143][15401] Updated weights for policy 0, policy_version 244090 (0.0026) [2024-06-22 16:11:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 3999285248. Throughput: 0: 42602.6. Samples: 3999453100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:11:04,616][15401] Updated weights for policy 0, policy_version 244100 (0.0030) [2024-06-22 16:11:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 3999481856. Throughput: 0: 42395.1. Samples: 3999577300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 16:11:08,394][15401] Updated weights for policy 0, policy_version 244110 (0.0028) [2024-06-22 16:11:12,308][15401] Updated weights for policy 0, policy_version 244120 (0.0048) [2024-06-22 16:11:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 3999711232. Throughput: 0: 42775.9. Samples: 3999843200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:13,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 16:11:15,913][15401] Updated weights for policy 0, policy_version 244130 (0.0036) [2024-06-22 16:11:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 3999924224. Throughput: 0: 42472.8. Samples: 4000090200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 16:11:20,023][15401] Updated weights for policy 0, policy_version 244140 (0.0031) [2024-06-22 16:11:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4000137216. Throughput: 0: 42582.8. Samples: 4000220560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 16:11:23,467][15401] Updated weights for policy 0, policy_version 244150 (0.0043) [2024-06-22 16:11:27,645][15401] Updated weights for policy 0, policy_version 244160 (0.0038) [2024-06-22 16:11:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4000350208. Throughput: 0: 42859.7. Samples: 4000488700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:28,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 16:11:30,918][15401] Updated weights for policy 0, policy_version 244170 (0.0045) [2024-06-22 16:11:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4000579584. Throughput: 0: 42728.0. Samples: 4000736060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 16:11:35,294][15401] Updated weights for policy 0, policy_version 244180 (0.0037) [2024-06-22 16:11:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4000792576. Throughput: 0: 42985.3. Samples: 4000869760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 16:11:38,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 16:11:38,566][15401] Updated weights for policy 0, policy_version 244190 (0.0029) [2024-06-22 16:11:42,706][15401] Updated weights for policy 0, policy_version 244200 (0.0026) [2024-06-22 16:11:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4001005568. Throughput: 0: 42938.7. Samples: 4001127640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:11:43,390][15132] Avg episode reward: [(0, '0.138')] [2024-06-22 16:11:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244202_4001005568.pth... [2024-06-22 16:11:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243575_3990732800.pth [2024-06-22 16:11:46,245][15401] Updated weights for policy 0, policy_version 244210 (0.0038) [2024-06-22 16:11:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4001218560. Throughput: 0: 42842.7. Samples: 4001381020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:11:48,390][15132] Avg episode reward: [(0, '0.124')] [2024-06-22 16:11:50,294][15401] Updated weights for policy 0, policy_version 244220 (0.0036) [2024-06-22 16:11:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4001431552. Throughput: 0: 42911.0. Samples: 4001508300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:11:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 16:11:53,798][15401] Updated weights for policy 0, policy_version 244230 (0.0034) [2024-06-22 16:11:58,119][15401] Updated weights for policy 0, policy_version 244240 (0.0030) [2024-06-22 16:11:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4001644544. Throughput: 0: 42781.9. Samples: 4001768380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:11:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 16:12:01,667][15401] Updated weights for policy 0, policy_version 244250 (0.0032) [2024-06-22 16:12:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4001873920. Throughput: 0: 42791.6. Samples: 4002015820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:03,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 16:12:05,765][15401] Updated weights for policy 0, policy_version 244260 (0.0037) [2024-06-22 16:12:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4002054144. Throughput: 0: 42880.8. Samples: 4002150300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:08,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 16:12:09,371][15401] Updated weights for policy 0, policy_version 244270 (0.0044) [2024-06-22 16:12:13,246][15401] Updated weights for policy 0, policy_version 244280 (0.0028) [2024-06-22 16:12:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4002283520. Throughput: 0: 42473.8. Samples: 4002400020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:13,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 16:12:16,188][15349] Signal inference workers to stop experience collection... (59200 times) [2024-06-22 16:12:16,189][15349] Signal inference workers to resume experience collection... (59200 times) [2024-06-22 16:12:16,233][15401] InferenceWorker_p0-w0: stopping experience collection (59200 times) [2024-06-22 16:12:16,233][15401] InferenceWorker_p0-w0: resuming experience collection (59200 times) [2024-06-22 16:12:17,126][15401] Updated weights for policy 0, policy_version 244290 (0.0033) [2024-06-22 16:12:18,390][15132] Fps is (10 sec: 44245.6, 60 sec: 42871.2, 300 sec: 42931.6). Total num frames: 4002496512. Throughput: 0: 42651.2. Samples: 4002655380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 16:12:20,981][15401] Updated weights for policy 0, policy_version 244300 (0.0047) [2024-06-22 16:12:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4002709504. Throughput: 0: 42698.9. Samples: 4002791220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 16:12:24,590][15401] Updated weights for policy 0, policy_version 244310 (0.0037) [2024-06-22 16:12:28,371][15401] Updated weights for policy 0, policy_version 244320 (0.0041) [2024-06-22 16:12:28,390][15132] Fps is (10 sec: 44238.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4002938880. Throughput: 0: 42652.8. Samples: 4003047020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:28,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 16:12:32,162][15401] Updated weights for policy 0, policy_version 244330 (0.0030) [2024-06-22 16:12:33,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4003151872. Throughput: 0: 42710.6. Samples: 4003303000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 16:12:36,007][15401] Updated weights for policy 0, policy_version 244340 (0.0042) [2024-06-22 16:12:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4003348480. Throughput: 0: 42732.9. Samples: 4003431280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 16:12:39,711][15401] Updated weights for policy 0, policy_version 244350 (0.0032) [2024-06-22 16:12:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4003577856. Throughput: 0: 42757.6. Samples: 4003692480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:43,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 16:12:43,465][15401] Updated weights for policy 0, policy_version 244360 (0.0037) [2024-06-22 16:12:47,220][15401] Updated weights for policy 0, policy_version 244370 (0.0036) [2024-06-22 16:12:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4003807232. Throughput: 0: 42850.6. Samples: 4003944100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-22 16:12:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 16:12:51,441][15401] Updated weights for policy 0, policy_version 244380 (0.0034) [2024-06-22 16:12:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4004003840. Throughput: 0: 42788.4. Samples: 4004075680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:12:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 16:12:55,075][15401] Updated weights for policy 0, policy_version 244390 (0.0035) [2024-06-22 16:12:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4004233216. Throughput: 0: 43033.7. Samples: 4004336540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:12:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 16:12:59,105][15401] Updated weights for policy 0, policy_version 244400 (0.0031) [2024-06-22 16:13:02,784][15401] Updated weights for policy 0, policy_version 244410 (0.0032) [2024-06-22 16:13:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 4004446208. Throughput: 0: 42924.6. Samples: 4004586980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 16:13:06,846][15401] Updated weights for policy 0, policy_version 244420 (0.0036) [2024-06-22 16:13:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 4004610048. Throughput: 0: 42780.5. Samples: 4004716340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:08,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 16:13:10,409][15401] Updated weights for policy 0, policy_version 244430 (0.0029) [2024-06-22 16:13:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4004839424. Throughput: 0: 42641.8. Samples: 4004965900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:13:14,477][15401] Updated weights for policy 0, policy_version 244440 (0.0040) [2024-06-22 16:13:17,982][15401] Updated weights for policy 0, policy_version 244450 (0.0023) [2024-06-22 16:13:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 4005068800. Throughput: 0: 42627.6. Samples: 4005221240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 16:13:22,215][15401] Updated weights for policy 0, policy_version 244460 (0.0032) [2024-06-22 16:13:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 4005265408. Throughput: 0: 42746.3. Samples: 4005354860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 16:13:25,672][15401] Updated weights for policy 0, policy_version 244470 (0.0033) [2024-06-22 16:13:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4005494784. Throughput: 0: 42579.3. Samples: 4005608540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 16:13:30,012][15401] Updated weights for policy 0, policy_version 244480 (0.0038) [2024-06-22 16:13:31,439][15349] Signal inference workers to stop experience collection... (59250 times) [2024-06-22 16:13:31,444][15349] Signal inference workers to resume experience collection... (59250 times) [2024-06-22 16:13:31,488][15401] InferenceWorker_p0-w0: stopping experience collection (59250 times) [2024-06-22 16:13:31,488][15401] InferenceWorker_p0-w0: resuming experience collection (59250 times) [2024-06-22 16:13:33,354][15401] Updated weights for policy 0, policy_version 244490 (0.0045) [2024-06-22 16:13:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4005724160. Throughput: 0: 42663.6. Samples: 4005863960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 16:13:37,751][15401] Updated weights for policy 0, policy_version 244500 (0.0040) [2024-06-22 16:13:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4005904384. Throughput: 0: 42622.8. Samples: 4005993700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 16:13:41,101][15401] Updated weights for policy 0, policy_version 244510 (0.0052) [2024-06-22 16:13:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4006150144. Throughput: 0: 42522.6. Samples: 4006250060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 16:13:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244516_4006150144.pth... [2024-06-22 16:13:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000243887_3995844608.pth [2024-06-22 16:13:45,563][15401] Updated weights for policy 0, policy_version 244520 (0.0033) [2024-06-22 16:13:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4006346752. Throughput: 0: 42637.4. Samples: 4006505660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 16:13:48,772][15401] Updated weights for policy 0, policy_version 244530 (0.0047) [2024-06-22 16:13:53,159][15401] Updated weights for policy 0, policy_version 244540 (0.0036) [2024-06-22 16:13:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4006543360. Throughput: 0: 42729.9. Samples: 4006639180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 16:13:56,337][15401] Updated weights for policy 0, policy_version 244550 (0.0028) [2024-06-22 16:13:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4006789120. Throughput: 0: 42749.7. Samples: 4006889640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 16:13:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 16:14:00,832][15401] Updated weights for policy 0, policy_version 244560 (0.0027) [2024-06-22 16:14:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4006985728. Throughput: 0: 42813.4. Samples: 4007147840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 16:14:04,096][15401] Updated weights for policy 0, policy_version 244570 (0.0039) [2024-06-22 16:14:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4007165952. Throughput: 0: 42609.8. Samples: 4007272300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 16:14:08,611][15401] Updated weights for policy 0, policy_version 244580 (0.0027) [2024-06-22 16:14:11,738][15401] Updated weights for policy 0, policy_version 244590 (0.0038) [2024-06-22 16:14:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4007411712. Throughput: 0: 42611.0. Samples: 4007526040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 16:14:16,487][15401] Updated weights for policy 0, policy_version 244600 (0.0040) [2024-06-22 16:14:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4007608320. Throughput: 0: 42661.8. Samples: 4007783740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 16:14:19,673][15401] Updated weights for policy 0, policy_version 244610 (0.0045) [2024-06-22 16:14:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4007804928. Throughput: 0: 42493.7. Samples: 4007905920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 16:14:24,184][15401] Updated weights for policy 0, policy_version 244620 (0.0033) [2024-06-22 16:14:27,311][15401] Updated weights for policy 0, policy_version 244630 (0.0030) [2024-06-22 16:14:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4008067072. Throughput: 0: 42595.2. Samples: 4008166840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 16:14:31,732][15401] Updated weights for policy 0, policy_version 244640 (0.0037) [2024-06-22 16:14:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4008247296. Throughput: 0: 42613.9. Samples: 4008423280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 16:14:34,959][15401] Updated weights for policy 0, policy_version 244650 (0.0028) [2024-06-22 16:14:38,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 4008443904. Throughput: 0: 42387.9. Samples: 4008546740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:38,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 16:14:39,550][15401] Updated weights for policy 0, policy_version 244660 (0.0026) [2024-06-22 16:14:42,516][15401] Updated weights for policy 0, policy_version 244670 (0.0030) [2024-06-22 16:14:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4008689664. Throughput: 0: 42476.9. Samples: 4008801100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:43,391][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 16:14:47,232][15401] Updated weights for policy 0, policy_version 244680 (0.0041) [2024-06-22 16:14:48,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4008886272. Throughput: 0: 42471.6. Samples: 4009059060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 16:14:50,398][15401] Updated weights for policy 0, policy_version 244690 (0.0029) [2024-06-22 16:14:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4009099264. Throughput: 0: 42451.9. Samples: 4009182640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 16:14:54,378][15349] Signal inference workers to stop experience collection... (59300 times) [2024-06-22 16:14:54,378][15349] Signal inference workers to resume experience collection... (59300 times) [2024-06-22 16:14:54,388][15401] InferenceWorker_p0-w0: stopping experience collection (59300 times) [2024-06-22 16:14:54,388][15401] InferenceWorker_p0-w0: resuming experience collection (59300 times) [2024-06-22 16:14:54,722][15401] Updated weights for policy 0, policy_version 244700 (0.0035) [2024-06-22 16:14:58,031][15401] Updated weights for policy 0, policy_version 244710 (0.0031) [2024-06-22 16:14:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4009345024. Throughput: 0: 42620.6. Samples: 4009443960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:14:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 16:15:02,320][15401] Updated weights for policy 0, policy_version 244720 (0.0031) [2024-06-22 16:15:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4009541632. Throughput: 0: 42601.8. Samples: 4009700820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:15:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 16:15:05,827][15401] Updated weights for policy 0, policy_version 244730 (0.0026) [2024-06-22 16:15:08,390][15132] Fps is (10 sec: 40958.8, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 4009754624. Throughput: 0: 42570.5. Samples: 4009821600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 16:15:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 16:15:10,340][15401] Updated weights for policy 0, policy_version 244740 (0.0034) [2024-06-22 16:15:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4009967616. Throughput: 0: 42533.8. Samples: 4010080860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 16:15:13,472][15401] Updated weights for policy 0, policy_version 244750 (0.0034) [2024-06-22 16:15:17,959][15401] Updated weights for policy 0, policy_version 244760 (0.0028) [2024-06-22 16:15:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4010164224. Throughput: 0: 42529.6. Samples: 4010337120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 16:15:21,163][15401] Updated weights for policy 0, policy_version 244770 (0.0034) [2024-06-22 16:15:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4010393600. Throughput: 0: 42482.3. Samples: 4010458340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 16:15:25,542][15401] Updated weights for policy 0, policy_version 244780 (0.0033) [2024-06-22 16:15:28,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4010606592. Throughput: 0: 42431.3. Samples: 4010710500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 16:15:29,333][15401] Updated weights for policy 0, policy_version 244790 (0.0028) [2024-06-22 16:15:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4010786816. Throughput: 0: 42529.7. Samples: 4010972900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 16:15:33,550][15401] Updated weights for policy 0, policy_version 244800 (0.0029) [2024-06-22 16:15:37,096][15401] Updated weights for policy 0, policy_version 244810 (0.0029) [2024-06-22 16:15:38,391][15132] Fps is (10 sec: 42590.0, 60 sec: 43145.0, 300 sec: 42653.7). Total num frames: 4011032576. Throughput: 0: 42445.4. Samples: 4011092760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:38,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 16:15:41,175][15401] Updated weights for policy 0, policy_version 244820 (0.0034) [2024-06-22 16:15:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 4011229184. Throughput: 0: 42320.5. Samples: 4011348380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 16:15:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244826_4011229184.pth... [2024-06-22 16:15:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244202_4001005568.pth [2024-06-22 16:15:44,858][15401] Updated weights for policy 0, policy_version 244830 (0.0033) [2024-06-22 16:15:48,389][15132] Fps is (10 sec: 39329.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4011425792. Throughput: 0: 42486.7. Samples: 4011612720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 16:15:48,825][15401] Updated weights for policy 0, policy_version 244840 (0.0041) [2024-06-22 16:15:52,536][15401] Updated weights for policy 0, policy_version 244850 (0.0026) [2024-06-22 16:15:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4011655168. Throughput: 0: 42498.0. Samples: 4011734000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 16:15:56,475][15401] Updated weights for policy 0, policy_version 244860 (0.0039) [2024-06-22 16:15:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4011851776. Throughput: 0: 42222.7. Samples: 4011980880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:15:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 16:16:00,243][15401] Updated weights for policy 0, policy_version 244870 (0.0032) [2024-06-22 16:16:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4012048384. Throughput: 0: 42268.6. Samples: 4012239200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:16:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 16:16:04,274][15401] Updated weights for policy 0, policy_version 244880 (0.0024) [2024-06-22 16:16:07,626][15349] Signal inference workers to stop experience collection... (59350 times) [2024-06-22 16:16:07,674][15401] InferenceWorker_p0-w0: stopping experience collection (59350 times) [2024-06-22 16:16:07,685][15349] Signal inference workers to resume experience collection... (59350 times) [2024-06-22 16:16:07,700][15401] InferenceWorker_p0-w0: resuming experience collection (59350 times) [2024-06-22 16:16:07,824][15401] Updated weights for policy 0, policy_version 244890 (0.0038) [2024-06-22 16:16:08,390][15132] Fps is (10 sec: 45871.8, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 4012310528. Throughput: 0: 42363.3. Samples: 4012364720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:16:08,391][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 16:16:11,725][15401] Updated weights for policy 0, policy_version 244900 (0.0049) [2024-06-22 16:16:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4012507136. Throughput: 0: 42523.4. Samples: 4012624060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:16:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-22 16:16:15,378][15401] Updated weights for policy 0, policy_version 244910 (0.0045) [2024-06-22 16:16:18,390][15132] Fps is (10 sec: 39324.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4012703744. Throughput: 0: 42482.2. Samples: 4012884600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 16:16:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 16:16:19,457][15401] Updated weights for policy 0, policy_version 244920 (0.0025) [2024-06-22 16:16:23,030][15401] Updated weights for policy 0, policy_version 244930 (0.0038) [2024-06-22 16:16:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4012933120. Throughput: 0: 42580.0. Samples: 4013008780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 16:16:27,054][15401] Updated weights for policy 0, policy_version 244940 (0.0032) [2024-06-22 16:16:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 4013146112. Throughput: 0: 42581.2. Samples: 4013264640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:28,392][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 16:16:30,953][15401] Updated weights for policy 0, policy_version 244950 (0.0033) [2024-06-22 16:16:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4013342720. Throughput: 0: 42408.1. Samples: 4013521080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 16:16:34,762][15401] Updated weights for policy 0, policy_version 244960 (0.0030) [2024-06-22 16:16:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42053.6, 300 sec: 42542.9). Total num frames: 4013555712. Throughput: 0: 42489.0. Samples: 4013646000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 16:16:38,722][15401] Updated weights for policy 0, policy_version 244970 (0.0042) [2024-06-22 16:16:42,347][15401] Updated weights for policy 0, policy_version 244980 (0.0033) [2024-06-22 16:16:43,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 4013801472. Throughput: 0: 42678.6. Samples: 4013901520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:43,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 16:16:46,342][15401] Updated weights for policy 0, policy_version 244990 (0.0025) [2024-06-22 16:16:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4013981696. Throughput: 0: 42652.9. Samples: 4014158580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 16:16:49,868][15401] Updated weights for policy 0, policy_version 245000 (0.0028) [2024-06-22 16:16:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4014211072. Throughput: 0: 42622.0. Samples: 4014282680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 16:16:53,797][15401] Updated weights for policy 0, policy_version 245010 (0.0042) [2024-06-22 16:16:57,346][15401] Updated weights for policy 0, policy_version 245020 (0.0040) [2024-06-22 16:16:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4014424064. Throughput: 0: 42651.7. Samples: 4014543380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:16:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 16:17:01,867][15401] Updated weights for policy 0, policy_version 245030 (0.0041) [2024-06-22 16:17:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 4014637056. Throughput: 0: 42729.3. Samples: 4014807420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 16:17:04,867][15401] Updated weights for policy 0, policy_version 245040 (0.0040) [2024-06-22 16:17:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.8, 300 sec: 42598.4). Total num frames: 4014850048. Throughput: 0: 42808.4. Samples: 4014935160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 16:17:09,309][15401] Updated weights for policy 0, policy_version 245050 (0.0026) [2024-06-22 16:17:10,968][15349] Signal inference workers to stop experience collection... (59400 times) [2024-06-22 16:17:10,968][15349] Signal inference workers to resume experience collection... (59400 times) [2024-06-22 16:17:10,997][15401] InferenceWorker_p0-w0: stopping experience collection (59400 times) [2024-06-22 16:17:10,997][15401] InferenceWorker_p0-w0: resuming experience collection (59400 times) [2024-06-22 16:17:12,369][15401] Updated weights for policy 0, policy_version 245060 (0.0027) [2024-06-22 16:17:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4015079424. Throughput: 0: 42770.3. Samples: 4015189200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 16:17:16,944][15401] Updated weights for policy 0, policy_version 245070 (0.0029) [2024-06-22 16:17:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4015276032. Throughput: 0: 42990.7. Samples: 4015455660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 16:17:20,299][15401] Updated weights for policy 0, policy_version 245080 (0.0027) [2024-06-22 16:17:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4015505408. Throughput: 0: 42996.8. Samples: 4015580860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 16:17:24,497][15401] Updated weights for policy 0, policy_version 245090 (0.0031) [2024-06-22 16:17:27,666][15401] Updated weights for policy 0, policy_version 245100 (0.0036) [2024-06-22 16:17:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 4015718400. Throughput: 0: 43066.2. Samples: 4015839400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-22 16:17:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 16:17:32,155][15401] Updated weights for policy 0, policy_version 245110 (0.0041) [2024-06-22 16:17:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4015915008. Throughput: 0: 43089.4. Samples: 4016097600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 16:17:35,767][15401] Updated weights for policy 0, policy_version 245120 (0.0047) [2024-06-22 16:17:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4016144384. Throughput: 0: 43075.6. Samples: 4016221080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 16:17:39,807][15401] Updated weights for policy 0, policy_version 245130 (0.0037) [2024-06-22 16:17:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 4016357376. Throughput: 0: 42950.2. Samples: 4016476140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 16:17:43,514][15401] Updated weights for policy 0, policy_version 245140 (0.0033) [2024-06-22 16:17:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245140_4016373760.pth... [2024-06-22 16:17:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244516_4006150144.pth [2024-06-22 16:17:47,447][15401] Updated weights for policy 0, policy_version 245150 (0.0029) [2024-06-22 16:17:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4016570368. Throughput: 0: 42685.7. Samples: 4016728280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 16:17:51,222][15401] Updated weights for policy 0, policy_version 245160 (0.0033) [2024-06-22 16:17:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4016799744. Throughput: 0: 42708.4. Samples: 4016857040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 16:17:55,105][15401] Updated weights for policy 0, policy_version 245170 (0.0044) [2024-06-22 16:17:58,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.8, 300 sec: 42542.0). Total num frames: 4016996352. Throughput: 0: 42762.7. Samples: 4017113800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:17:58,397][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 16:17:59,110][15401] Updated weights for policy 0, policy_version 245180 (0.0035) [2024-06-22 16:18:02,907][15401] Updated weights for policy 0, policy_version 245190 (0.0030) [2024-06-22 16:18:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4017209344. Throughput: 0: 42570.5. Samples: 4017371340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 16:18:07,048][15401] Updated weights for policy 0, policy_version 245200 (0.0038) [2024-06-22 16:18:08,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4017438720. Throughput: 0: 42647.9. Samples: 4017500020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:08,399][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 16:18:10,574][15401] Updated weights for policy 0, policy_version 245210 (0.0039) [2024-06-22 16:18:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4017635328. Throughput: 0: 42509.0. Samples: 4017752300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 16:18:14,726][15401] Updated weights for policy 0, policy_version 245220 (0.0027) [2024-06-22 16:18:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4017831936. Throughput: 0: 42562.6. Samples: 4018012920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 16:18:18,540][15401] Updated weights for policy 0, policy_version 245230 (0.0036) [2024-06-22 16:18:22,461][15349] Signal inference workers to stop experience collection... (59450 times) [2024-06-22 16:18:22,496][15401] InferenceWorker_p0-w0: stopping experience collection (59450 times) [2024-06-22 16:18:22,520][15349] Signal inference workers to resume experience collection... (59450 times) [2024-06-22 16:18:22,524][15401] InferenceWorker_p0-w0: resuming experience collection (59450 times) [2024-06-22 16:18:22,527][15401] Updated weights for policy 0, policy_version 245240 (0.0037) [2024-06-22 16:18:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4018061312. Throughput: 0: 42555.2. Samples: 4018136060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 16:18:26,234][15401] Updated weights for policy 0, policy_version 245250 (0.0039) [2024-06-22 16:18:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4018274304. Throughput: 0: 42570.6. Samples: 4018391820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 16:18:29,996][15401] Updated weights for policy 0, policy_version 245260 (0.0044) [2024-06-22 16:18:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4018470912. Throughput: 0: 42727.7. Samples: 4018651020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 16:18:33,739][15401] Updated weights for policy 0, policy_version 245270 (0.0037) [2024-06-22 16:18:37,927][15401] Updated weights for policy 0, policy_version 245280 (0.0038) [2024-06-22 16:18:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 4018683904. Throughput: 0: 42614.3. Samples: 4018774780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 16:18:38,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 16:18:41,272][15401] Updated weights for policy 0, policy_version 245290 (0.0030) [2024-06-22 16:18:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 4018929664. Throughput: 0: 42611.0. Samples: 4019031020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:18:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 16:18:45,394][15401] Updated weights for policy 0, policy_version 245300 (0.0031) [2024-06-22 16:18:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4019109888. Throughput: 0: 42698.7. Samples: 4019292780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:18:48,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 16:18:48,983][15401] Updated weights for policy 0, policy_version 245310 (0.0031) [2024-06-22 16:18:53,118][15401] Updated weights for policy 0, policy_version 245320 (0.0035) [2024-06-22 16:18:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 4019322880. Throughput: 0: 42513.5. Samples: 4019413120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:18:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 16:18:56,516][15401] Updated weights for policy 0, policy_version 245330 (0.0031) [2024-06-22 16:18:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42603.1, 300 sec: 42598.4). Total num frames: 4019552256. Throughput: 0: 42597.4. Samples: 4019669180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:18:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 16:19:00,891][15401] Updated weights for policy 0, policy_version 245340 (0.0030) [2024-06-22 16:19:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4019748864. Throughput: 0: 42588.4. Samples: 4019929400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 16:19:04,567][15401] Updated weights for policy 0, policy_version 245350 (0.0027) [2024-06-22 16:19:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 4019961856. Throughput: 0: 42508.4. Samples: 4020048940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 16:19:08,493][15401] Updated weights for policy 0, policy_version 245360 (0.0032) [2024-06-22 16:19:11,967][15401] Updated weights for policy 0, policy_version 245370 (0.0028) [2024-06-22 16:19:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4020191232. Throughput: 0: 42582.5. Samples: 4020308040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 16:19:16,052][15401] Updated weights for policy 0, policy_version 245380 (0.0036) [2024-06-22 16:19:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4020387840. Throughput: 0: 42677.4. Samples: 4020571500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 16:19:19,588][15401] Updated weights for policy 0, policy_version 245390 (0.0032) [2024-06-22 16:19:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4020617216. Throughput: 0: 42619.5. Samples: 4020692560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 16:19:23,600][15401] Updated weights for policy 0, policy_version 245400 (0.0028) [2024-06-22 16:19:27,124][15401] Updated weights for policy 0, policy_version 245410 (0.0043) [2024-06-22 16:19:28,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4020830208. Throughput: 0: 42697.8. Samples: 4020952420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 16:19:31,106][15401] Updated weights for policy 0, policy_version 245420 (0.0031) [2024-06-22 16:19:33,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 4021026816. Throughput: 0: 42783.5. Samples: 4021218140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:33,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 16:19:34,707][15401] Updated weights for policy 0, policy_version 245430 (0.0031) [2024-06-22 16:19:38,394][15132] Fps is (10 sec: 44215.9, 60 sec: 43142.8, 300 sec: 42653.3). Total num frames: 4021272576. Throughput: 0: 42851.9. Samples: 4021341660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:38,395][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 16:19:38,535][15401] Updated weights for policy 0, policy_version 245440 (0.0033) [2024-06-22 16:19:41,536][15349] Signal inference workers to stop experience collection... (59500 times) [2024-06-22 16:19:41,536][15349] Signal inference workers to resume experience collection... (59500 times) [2024-06-22 16:19:41,571][15401] InferenceWorker_p0-w0: stopping experience collection (59500 times) [2024-06-22 16:19:41,571][15401] InferenceWorker_p0-w0: resuming experience collection (59500 times) [2024-06-22 16:19:42,262][15401] Updated weights for policy 0, policy_version 245450 (0.0041) [2024-06-22 16:19:43,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4021469184. Throughput: 0: 42860.9. Samples: 4021597920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:43,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 16:19:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245452_4021485568.pth... [2024-06-22 16:19:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000244826_4011229184.pth [2024-06-22 16:19:46,749][15401] Updated weights for policy 0, policy_version 245460 (0.0036) [2024-06-22 16:19:48,389][15132] Fps is (10 sec: 39340.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4021665792. Throughput: 0: 42889.4. Samples: 4021859420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-22 16:19:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 16:19:50,227][15401] Updated weights for policy 0, policy_version 245470 (0.0029) [2024-06-22 16:19:53,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4021911552. Throughput: 0: 42976.8. Samples: 4021982900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:19:53,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-22 16:19:54,169][15401] Updated weights for policy 0, policy_version 245480 (0.0039) [2024-06-22 16:19:57,824][15401] Updated weights for policy 0, policy_version 245490 (0.0037) [2024-06-22 16:19:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4022124544. Throughput: 0: 42893.8. Samples: 4022238260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:19:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 16:20:02,301][15401] Updated weights for policy 0, policy_version 245500 (0.0043) [2024-06-22 16:20:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4022321152. Throughput: 0: 42864.7. Samples: 4022500420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:03,399][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 16:20:05,620][15401] Updated weights for policy 0, policy_version 245510 (0.0027) [2024-06-22 16:20:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 4022566912. Throughput: 0: 42961.4. Samples: 4022625820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 16:20:09,665][15401] Updated weights for policy 0, policy_version 245520 (0.0039) [2024-06-22 16:20:13,067][15401] Updated weights for policy 0, policy_version 245530 (0.0037) [2024-06-22 16:20:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4022763520. Throughput: 0: 42991.9. Samples: 4022887060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 16:20:17,220][15401] Updated weights for policy 0, policy_version 245540 (0.0039) [2024-06-22 16:20:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4022960128. Throughput: 0: 42909.9. Samples: 4023148980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 16:20:20,720][15401] Updated weights for policy 0, policy_version 245550 (0.0034) [2024-06-22 16:20:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4023205888. Throughput: 0: 42843.2. Samples: 4023269400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 16:20:24,799][15401] Updated weights for policy 0, policy_version 245560 (0.0038) [2024-06-22 16:20:28,150][15401] Updated weights for policy 0, policy_version 245570 (0.0034) [2024-06-22 16:20:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4023418880. Throughput: 0: 42942.5. Samples: 4023530340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 16:20:32,437][15401] Updated weights for policy 0, policy_version 245580 (0.0028) [2024-06-22 16:20:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42598.7). Total num frames: 4023599104. Throughput: 0: 42982.7. Samples: 4023793640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 16:20:36,056][15401] Updated weights for policy 0, policy_version 245590 (0.0026) [2024-06-22 16:20:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42874.9, 300 sec: 42765.0). Total num frames: 4023844864. Throughput: 0: 42864.9. Samples: 4023911820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 16:20:40,163][15401] Updated weights for policy 0, policy_version 245600 (0.0044) [2024-06-22 16:20:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4024057856. Throughput: 0: 43021.8. Samples: 4024174240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:43,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 16:20:43,751][15401] Updated weights for policy 0, policy_version 245610 (0.0034) [2024-06-22 16:20:47,916][15401] Updated weights for policy 0, policy_version 245620 (0.0050) [2024-06-22 16:20:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4024254464. Throughput: 0: 42949.3. Samples: 4024433140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:48,394][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 16:20:51,319][15401] Updated weights for policy 0, policy_version 245630 (0.0029) [2024-06-22 16:20:52,436][15349] Signal inference workers to stop experience collection... (59550 times) [2024-06-22 16:20:52,436][15349] Signal inference workers to resume experience collection... (59550 times) [2024-06-22 16:20:52,451][15401] InferenceWorker_p0-w0: stopping experience collection (59550 times) [2024-06-22 16:20:52,451][15401] InferenceWorker_p0-w0: resuming experience collection (59550 times) [2024-06-22 16:20:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4024483840. Throughput: 0: 43005.3. Samples: 4024561060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 16:20:55,288][15401] Updated weights for policy 0, policy_version 245640 (0.0028) [2024-06-22 16:20:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4024696832. Throughput: 0: 42951.7. Samples: 4024819880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 16:20:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 16:20:58,776][15401] Updated weights for policy 0, policy_version 245650 (0.0044) [2024-06-22 16:21:02,854][15401] Updated weights for policy 0, policy_version 245660 (0.0032) [2024-06-22 16:21:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42765.1). Total num frames: 4024926208. Throughput: 0: 42810.9. Samples: 4025075480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 16:21:06,248][15401] Updated weights for policy 0, policy_version 245670 (0.0036) [2024-06-22 16:21:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4025122816. Throughput: 0: 42933.0. Samples: 4025201380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 16:21:10,437][15401] Updated weights for policy 0, policy_version 245680 (0.0035) [2024-06-22 16:21:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4025352192. Throughput: 0: 42811.6. Samples: 4025456860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 16:21:13,910][15401] Updated weights for policy 0, policy_version 245690 (0.0033) [2024-06-22 16:21:18,185][15401] Updated weights for policy 0, policy_version 245700 (0.0044) [2024-06-22 16:21:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 4025565184. Throughput: 0: 42820.8. Samples: 4025720580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 16:21:21,520][15401] Updated weights for policy 0, policy_version 245710 (0.0028) [2024-06-22 16:21:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4025778176. Throughput: 0: 43024.9. Samples: 4025847940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 16:21:25,850][15401] Updated weights for policy 0, policy_version 245720 (0.0029) [2024-06-22 16:21:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4026007552. Throughput: 0: 42881.4. Samples: 4026103900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 16:21:29,143][15401] Updated weights for policy 0, policy_version 245730 (0.0033) [2024-06-22 16:21:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4026171392. Throughput: 0: 43066.3. Samples: 4026371120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 16:21:33,712][15401] Updated weights for policy 0, policy_version 245740 (0.0038) [2024-06-22 16:21:36,655][15401] Updated weights for policy 0, policy_version 245750 (0.0033) [2024-06-22 16:21:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4026417152. Throughput: 0: 42918.2. Samples: 4026492380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 16:21:41,179][15401] Updated weights for policy 0, policy_version 245760 (0.0030) [2024-06-22 16:21:43,395][15132] Fps is (10 sec: 47486.0, 60 sec: 43140.4, 300 sec: 42930.8). Total num frames: 4026646528. Throughput: 0: 42932.2. Samples: 4026752080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:43,396][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 16:21:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245767_4026646528.pth... [2024-06-22 16:21:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245140_4016373760.pth [2024-06-22 16:21:44,197][15401] Updated weights for policy 0, policy_version 245770 (0.0044) [2024-06-22 16:21:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4026826752. Throughput: 0: 42951.3. Samples: 4027008280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 16:21:48,709][15401] Updated weights for policy 0, policy_version 245780 (0.0036) [2024-06-22 16:21:52,275][15401] Updated weights for policy 0, policy_version 245790 (0.0037) [2024-06-22 16:21:53,390][15132] Fps is (10 sec: 40983.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4027056128. Throughput: 0: 42901.1. Samples: 4027131940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 16:21:56,239][15401] Updated weights for policy 0, policy_version 245800 (0.0030) [2024-06-22 16:21:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4027285504. Throughput: 0: 43056.9. Samples: 4027394420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:21:58,396][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 16:21:59,747][15401] Updated weights for policy 0, policy_version 245810 (0.0035) [2024-06-22 16:22:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4027465728. Throughput: 0: 42977.0. Samples: 4027654540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:22:03,398][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 16:22:03,872][15401] Updated weights for policy 0, policy_version 245820 (0.0030) [2024-06-22 16:22:05,042][15349] Signal inference workers to stop experience collection... (59600 times) [2024-06-22 16:22:05,093][15349] Signal inference workers to resume experience collection... (59600 times) [2024-06-22 16:22:05,095][15401] InferenceWorker_p0-w0: stopping experience collection (59600 times) [2024-06-22 16:22:05,107][15401] InferenceWorker_p0-w0: resuming experience collection (59600 times) [2024-06-22 16:22:07,270][15401] Updated weights for policy 0, policy_version 245830 (0.0032) [2024-06-22 16:22:08,390][15132] Fps is (10 sec: 40957.1, 60 sec: 42870.9, 300 sec: 42764.9). Total num frames: 4027695104. Throughput: 0: 42880.7. Samples: 4027777600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 16:22:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 16:22:11,749][15401] Updated weights for policy 0, policy_version 245840 (0.0033) [2024-06-22 16:22:13,393][15132] Fps is (10 sec: 45859.8, 60 sec: 42869.1, 300 sec: 42875.6). Total num frames: 4027924480. Throughput: 0: 42967.0. Samples: 4028037560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:13,393][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 16:22:15,024][15401] Updated weights for policy 0, policy_version 245850 (0.0038) [2024-06-22 16:22:18,396][15132] Fps is (10 sec: 42574.1, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 4028121088. Throughput: 0: 42814.4. Samples: 4028298040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:18,397][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 16:22:19,423][15401] Updated weights for policy 0, policy_version 245860 (0.0047) [2024-06-22 16:22:22,521][15401] Updated weights for policy 0, policy_version 245870 (0.0044) [2024-06-22 16:22:23,389][15132] Fps is (10 sec: 42612.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4028350464. Throughput: 0: 42946.2. Samples: 4028424960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 16:22:27,065][15401] Updated weights for policy 0, policy_version 245880 (0.0023) [2024-06-22 16:22:28,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4028563456. Throughput: 0: 42882.0. Samples: 4028681520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:28,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 16:22:29,962][15401] Updated weights for policy 0, policy_version 245890 (0.0037) [2024-06-22 16:22:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4028776448. Throughput: 0: 42946.6. Samples: 4028940880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 16:22:34,435][15401] Updated weights for policy 0, policy_version 245900 (0.0027) [2024-06-22 16:22:38,042][15401] Updated weights for policy 0, policy_version 245910 (0.0027) [2024-06-22 16:22:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4029005824. Throughput: 0: 43018.3. Samples: 4029067760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:38,399][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 16:22:41,926][15401] Updated weights for policy 0, policy_version 245920 (0.0050) [2024-06-22 16:22:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42602.5, 300 sec: 42820.6). Total num frames: 4029202432. Throughput: 0: 42979.5. Samples: 4029328500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 16:22:45,566][15401] Updated weights for policy 0, policy_version 245930 (0.0033) [2024-06-22 16:22:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4029415424. Throughput: 0: 42930.1. Samples: 4029586400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 16:22:49,339][15401] Updated weights for policy 0, policy_version 245940 (0.0031) [2024-06-22 16:22:53,040][15401] Updated weights for policy 0, policy_version 245950 (0.0032) [2024-06-22 16:22:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 4029644800. Throughput: 0: 43113.6. Samples: 4029717680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 16:22:57,019][15401] Updated weights for policy 0, policy_version 245960 (0.0038) [2024-06-22 16:22:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4029841408. Throughput: 0: 42996.1. Samples: 4029972240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:22:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 16:23:00,826][15401] Updated weights for policy 0, policy_version 245970 (0.0043) [2024-06-22 16:23:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4030054400. Throughput: 0: 42922.6. Samples: 4030229280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:23:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 16:23:04,621][15401] Updated weights for policy 0, policy_version 245980 (0.0035) [2024-06-22 16:23:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42872.0, 300 sec: 42820.6). Total num frames: 4030267392. Throughput: 0: 42941.0. Samples: 4030357300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:23:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 16:23:08,602][15401] Updated weights for policy 0, policy_version 245990 (0.0024) [2024-06-22 16:23:11,836][15349] Signal inference workers to stop experience collection... (59650 times) [2024-06-22 16:23:11,836][15349] Signal inference workers to resume experience collection... (59650 times) [2024-06-22 16:23:11,882][15401] InferenceWorker_p0-w0: stopping experience collection (59650 times) [2024-06-22 16:23:11,882][15401] InferenceWorker_p0-w0: resuming experience collection (59650 times) [2024-06-22 16:23:12,175][15401] Updated weights for policy 0, policy_version 246000 (0.0038) [2024-06-22 16:23:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.7, 300 sec: 42876.1). Total num frames: 4030480384. Throughput: 0: 42921.3. Samples: 4030612980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:23:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 16:23:16,341][15401] Updated weights for policy 0, policy_version 246010 (0.0037) [2024-06-22 16:23:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 4030709760. Throughput: 0: 42801.8. Samples: 4030866960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 16:23:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 16:23:19,903][15401] Updated weights for policy 0, policy_version 246020 (0.0031) [2024-06-22 16:23:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4030906368. Throughput: 0: 42821.3. Samples: 4030994720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:23,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 16:23:23,990][15401] Updated weights for policy 0, policy_version 246030 (0.0032) [2024-06-22 16:23:27,801][15401] Updated weights for policy 0, policy_version 246040 (0.0035) [2024-06-22 16:23:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4031135744. Throughput: 0: 42742.2. Samples: 4031251900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 16:23:31,626][15401] Updated weights for policy 0, policy_version 246050 (0.0034) [2024-06-22 16:23:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 4031365120. Throughput: 0: 42685.8. Samples: 4031507260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 16:23:35,318][15401] Updated weights for policy 0, policy_version 246060 (0.0032) [2024-06-22 16:23:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4031545344. Throughput: 0: 42705.7. Samples: 4031639440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 16:23:39,324][15401] Updated weights for policy 0, policy_version 246070 (0.0031) [2024-06-22 16:23:42,933][15401] Updated weights for policy 0, policy_version 246080 (0.0029) [2024-06-22 16:23:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4031791104. Throughput: 0: 42777.2. Samples: 4031897220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 16:23:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246081_4031791104.pth... [2024-06-22 16:23:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245452_4021485568.pth [2024-06-22 16:23:47,007][15401] Updated weights for policy 0, policy_version 246090 (0.0040) [2024-06-22 16:23:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4032004096. Throughput: 0: 42706.6. Samples: 4032151080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 16:23:50,701][15401] Updated weights for policy 0, policy_version 246100 (0.0033) [2024-06-22 16:23:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4032184320. Throughput: 0: 42675.9. Samples: 4032277720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:53,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 16:23:54,681][15401] Updated weights for policy 0, policy_version 246110 (0.0038) [2024-06-22 16:23:58,211][15401] Updated weights for policy 0, policy_version 246120 (0.0034) [2024-06-22 16:23:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 4032430080. Throughput: 0: 42747.9. Samples: 4032536640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:23:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 16:24:02,574][15401] Updated weights for policy 0, policy_version 246130 (0.0034) [2024-06-22 16:24:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4032626688. Throughput: 0: 42791.6. Samples: 4032792580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 16:24:05,810][15401] Updated weights for policy 0, policy_version 246140 (0.0028) [2024-06-22 16:24:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4032839680. Throughput: 0: 42634.2. Samples: 4032913260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:08,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 16:24:10,354][15401] Updated weights for policy 0, policy_version 246150 (0.0040) [2024-06-22 16:24:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 4033069056. Throughput: 0: 42630.7. Samples: 4033170280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 16:24:13,714][15401] Updated weights for policy 0, policy_version 246160 (0.0034) [2024-06-22 16:24:17,876][15401] Updated weights for policy 0, policy_version 246170 (0.0032) [2024-06-22 16:24:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4033249280. Throughput: 0: 42759.7. Samples: 4033431440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 16:24:21,173][15401] Updated weights for policy 0, policy_version 246180 (0.0041) [2024-06-22 16:24:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4033478656. Throughput: 0: 42602.1. Samples: 4033556540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 16:24:25,378][15401] Updated weights for policy 0, policy_version 246190 (0.0039) [2024-06-22 16:24:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 4033708032. Throughput: 0: 42760.5. Samples: 4033821440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 16:24:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 16:24:28,706][15401] Updated weights for policy 0, policy_version 246200 (0.0036) [2024-06-22 16:24:32,881][15401] Updated weights for policy 0, policy_version 246210 (0.0038) [2024-06-22 16:24:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42821.2). Total num frames: 4033904640. Throughput: 0: 42774.5. Samples: 4034075940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 16:24:36,258][15401] Updated weights for policy 0, policy_version 246220 (0.0031) [2024-06-22 16:24:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4034117632. Throughput: 0: 42727.1. Samples: 4034200440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:38,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 16:24:40,411][15401] Updated weights for policy 0, policy_version 246230 (0.0034) [2024-06-22 16:24:42,227][15349] Signal inference workers to stop experience collection... (59700 times) [2024-06-22 16:24:42,228][15349] Signal inference workers to resume experience collection... (59700 times) [2024-06-22 16:24:42,252][15401] InferenceWorker_p0-w0: stopping experience collection (59700 times) [2024-06-22 16:24:42,253][15401] InferenceWorker_p0-w0: resuming experience collection (59700 times) [2024-06-22 16:24:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4034347008. Throughput: 0: 42961.9. Samples: 4034469920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:43,398][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 16:24:43,787][15401] Updated weights for policy 0, policy_version 246240 (0.0035) [2024-06-22 16:24:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4034543616. Throughput: 0: 42795.9. Samples: 4034718400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:48,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 16:24:48,530][15401] Updated weights for policy 0, policy_version 246250 (0.0029) [2024-06-22 16:24:51,799][15401] Updated weights for policy 0, policy_version 246260 (0.0041) [2024-06-22 16:24:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 4034772992. Throughput: 0: 42867.5. Samples: 4034842400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:53,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 16:24:56,758][15401] Updated weights for policy 0, policy_version 246270 (0.0036) [2024-06-22 16:24:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4034985984. Throughput: 0: 43094.2. Samples: 4035109520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:24:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 16:24:59,305][15401] Updated weights for policy 0, policy_version 246280 (0.0027) [2024-06-22 16:25:03,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4035182592. Throughput: 0: 42895.8. Samples: 4035361760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 16:25:04,303][15401] Updated weights for policy 0, policy_version 246290 (0.0043) [2024-06-22 16:25:07,012][15401] Updated weights for policy 0, policy_version 246300 (0.0031) [2024-06-22 16:25:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4035411968. Throughput: 0: 42905.8. Samples: 4035487300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 16:25:11,961][15401] Updated weights for policy 0, policy_version 246310 (0.0046) [2024-06-22 16:25:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4035624960. Throughput: 0: 42972.4. Samples: 4035755200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 16:25:14,611][15401] Updated weights for policy 0, policy_version 246320 (0.0031) [2024-06-22 16:25:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4035837952. Throughput: 0: 42882.9. Samples: 4036005660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 16:25:19,679][15401] Updated weights for policy 0, policy_version 246330 (0.0033) [2024-06-22 16:25:22,177][15401] Updated weights for policy 0, policy_version 246340 (0.0035) [2024-06-22 16:25:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4036067328. Throughput: 0: 42919.5. Samples: 4036131820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:23,404][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 16:25:27,240][15401] Updated weights for policy 0, policy_version 246350 (0.0034) [2024-06-22 16:25:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4036247552. Throughput: 0: 42740.8. Samples: 4036393260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 16:25:29,955][15401] Updated weights for policy 0, policy_version 246360 (0.0029) [2024-06-22 16:25:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4036476928. Throughput: 0: 42748.1. Samples: 4036642060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 16:25:34,770][15401] Updated weights for policy 0, policy_version 246370 (0.0041) [2024-06-22 16:25:37,569][15401] Updated weights for policy 0, policy_version 246380 (0.0037) [2024-06-22 16:25:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4036706304. Throughput: 0: 43118.3. Samples: 4036782620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-22 16:25:38,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 16:25:42,280][15401] Updated weights for policy 0, policy_version 246390 (0.0035) [2024-06-22 16:25:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 4036870144. Throughput: 0: 42760.6. Samples: 4037033740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:25:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 16:25:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246391_4036870144.pth... [2024-06-22 16:25:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000245767_4026646528.pth [2024-06-22 16:25:44,811][15349] Signal inference workers to stop experience collection... (59750 times) [2024-06-22 16:25:44,814][15349] Signal inference workers to resume experience collection... (59750 times) [2024-06-22 16:25:44,830][15401] InferenceWorker_p0-w0: stopping experience collection (59750 times) [2024-06-22 16:25:44,862][15401] InferenceWorker_p0-w0: resuming experience collection (59750 times) [2024-06-22 16:25:45,306][15401] Updated weights for policy 0, policy_version 246400 (0.0033) [2024-06-22 16:25:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4037132288. Throughput: 0: 42765.9. Samples: 4037286220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:25:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 16:25:49,660][15401] Updated weights for policy 0, policy_version 246410 (0.0036) [2024-06-22 16:25:52,962][15401] Updated weights for policy 0, policy_version 246420 (0.0040) [2024-06-22 16:25:53,389][15132] Fps is (10 sec: 49151.9, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 4037361664. Throughput: 0: 43169.4. Samples: 4037429920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:25:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 16:25:57,074][15401] Updated weights for policy 0, policy_version 246430 (0.0037) [2024-06-22 16:25:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4037525504. Throughput: 0: 42827.7. Samples: 4037682440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:25:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 16:26:00,597][15401] Updated weights for policy 0, policy_version 246440 (0.0031) [2024-06-22 16:26:03,390][15132] Fps is (10 sec: 44233.8, 60 sec: 43690.3, 300 sec: 42987.1). Total num frames: 4037804032. Throughput: 0: 42838.0. Samples: 4037933400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:03,391][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 16:26:04,497][15401] Updated weights for policy 0, policy_version 246450 (0.0037) [2024-06-22 16:26:08,254][15401] Updated weights for policy 0, policy_version 246460 (0.0029) [2024-06-22 16:26:08,390][15132] Fps is (10 sec: 47509.5, 60 sec: 43144.0, 300 sec: 42876.0). Total num frames: 4038000640. Throughput: 0: 43210.4. Samples: 4038076320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:08,391][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 16:26:11,979][15401] Updated weights for policy 0, policy_version 246470 (0.0033) [2024-06-22 16:26:13,390][15132] Fps is (10 sec: 39323.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4038197248. Throughput: 0: 42919.6. Samples: 4038324640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:13,395][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 16:26:15,997][15401] Updated weights for policy 0, policy_version 246480 (0.0039) [2024-06-22 16:26:18,390][15132] Fps is (10 sec: 44239.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4038443008. Throughput: 0: 43090.1. Samples: 4038581120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 16:26:19,737][15401] Updated weights for policy 0, policy_version 246490 (0.0032) [2024-06-22 16:26:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4038606848. Throughput: 0: 43017.8. Samples: 4038718420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:23,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-22 16:26:23,708][15401] Updated weights for policy 0, policy_version 246500 (0.0043) [2024-06-22 16:26:27,529][15401] Updated weights for policy 0, policy_version 246510 (0.0036) [2024-06-22 16:26:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4038836224. Throughput: 0: 42934.1. Samples: 4038965780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 16:26:31,499][15401] Updated weights for policy 0, policy_version 246520 (0.0032) [2024-06-22 16:26:33,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4039081984. Throughput: 0: 42971.8. Samples: 4039219960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 16:26:35,001][15401] Updated weights for policy 0, policy_version 246530 (0.0041) [2024-06-22 16:26:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42710.3). Total num frames: 4039245824. Throughput: 0: 42869.8. Samples: 4039359060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:38,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 16:26:39,425][15401] Updated weights for policy 0, policy_version 246540 (0.0039) [2024-06-22 16:26:42,417][15401] Updated weights for policy 0, policy_version 246550 (0.0030) [2024-06-22 16:26:43,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 4039491584. Throughput: 0: 42783.5. Samples: 4039607700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 16:26:47,045][15401] Updated weights for policy 0, policy_version 246560 (0.0036) [2024-06-22 16:26:48,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4039737344. Throughput: 0: 42971.7. Samples: 4039867100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 16:26:48,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 16:26:49,986][15401] Updated weights for policy 0, policy_version 246570 (0.0034) [2024-06-22 16:26:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4039884800. Throughput: 0: 42683.4. Samples: 4039997040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:26:53,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 16:26:54,624][15401] Updated weights for policy 0, policy_version 246580 (0.0034) [2024-06-22 16:26:54,780][15349] Signal inference workers to stop experience collection... (59800 times) [2024-06-22 16:26:54,781][15349] Signal inference workers to resume experience collection... (59800 times) [2024-06-22 16:26:54,819][15401] InferenceWorker_p0-w0: stopping experience collection (59800 times) [2024-06-22 16:26:54,820][15401] InferenceWorker_p0-w0: resuming experience collection (59800 times) [2024-06-22 16:26:57,534][15401] Updated weights for policy 0, policy_version 246590 (0.0047) [2024-06-22 16:26:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 4040146944. Throughput: 0: 42753.4. Samples: 4040248540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:26:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 16:27:02,121][15401] Updated weights for policy 0, policy_version 246600 (0.0043) [2024-06-22 16:27:03,389][15132] Fps is (10 sec: 49152.6, 60 sec: 42871.9, 300 sec: 42987.3). Total num frames: 4040376320. Throughput: 0: 42933.9. Samples: 4040513140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 16:27:05,025][15401] Updated weights for policy 0, policy_version 246610 (0.0038) [2024-06-22 16:27:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.8, 300 sec: 42765.5). Total num frames: 4040540160. Throughput: 0: 42822.9. Samples: 4040645460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 16:27:09,582][15401] Updated weights for policy 0, policy_version 246620 (0.0041) [2024-06-22 16:27:12,629][15401] Updated weights for policy 0, policy_version 246630 (0.0034) [2024-06-22 16:27:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 4040785920. Throughput: 0: 42888.0. Samples: 4040895740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 16:27:17,568][15401] Updated weights for policy 0, policy_version 246640 (0.0041) [2024-06-22 16:27:18,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 4041015296. Throughput: 0: 42959.3. Samples: 4041153120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:18,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-22 16:27:20,617][15401] Updated weights for policy 0, policy_version 246650 (0.0036) [2024-06-22 16:27:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4041179136. Throughput: 0: 42668.8. Samples: 4041279160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 16:27:25,098][15401] Updated weights for policy 0, policy_version 246660 (0.0028) [2024-06-22 16:27:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4041424896. Throughput: 0: 42874.3. Samples: 4041537040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 16:27:28,438][15401] Updated weights for policy 0, policy_version 246670 (0.0048) [2024-06-22 16:27:32,527][15401] Updated weights for policy 0, policy_version 246680 (0.0033) [2024-06-22 16:27:33,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42594.0, 300 sec: 42819.6). Total num frames: 4041637888. Throughput: 0: 42980.5. Samples: 4041801500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:33,397][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 16:27:35,927][15401] Updated weights for policy 0, policy_version 246690 (0.0026) [2024-06-22 16:27:38,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4041818112. Throughput: 0: 42848.4. Samples: 4041925320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:38,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 16:27:40,457][15401] Updated weights for policy 0, policy_version 246700 (0.0036) [2024-06-22 16:27:43,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4042080256. Throughput: 0: 42890.2. Samples: 4042178600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 16:27:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246710_4042096640.pth... [2024-06-22 16:27:43,447][15401] Updated weights for policy 0, policy_version 246710 (0.0034) [2024-06-22 16:27:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246081_4031791104.pth [2024-06-22 16:27:48,008][15401] Updated weights for policy 0, policy_version 246720 (0.0033) [2024-06-22 16:27:48,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4042276864. Throughput: 0: 42893.2. Samples: 4042443340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 16:27:50,936][15401] Updated weights for policy 0, policy_version 246730 (0.0020) [2024-06-22 16:27:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4042473472. Throughput: 0: 42642.4. Samples: 4042564360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:53,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 16:27:55,519][15401] Updated weights for policy 0, policy_version 246740 (0.0023) [2024-06-22 16:27:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4042719232. Throughput: 0: 42761.3. Samples: 4042820000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 16:27:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 16:27:59,102][15401] Updated weights for policy 0, policy_version 246750 (0.0033) [2024-06-22 16:28:03,110][15401] Updated weights for policy 0, policy_version 246760 (0.0039) [2024-06-22 16:28:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4042915840. Throughput: 0: 42686.7. Samples: 4043074020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 16:28:06,641][15401] Updated weights for policy 0, policy_version 246770 (0.0035) [2024-06-22 16:28:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4043096064. Throughput: 0: 42779.1. Samples: 4043204220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 16:28:10,838][15401] Updated weights for policy 0, policy_version 246780 (0.0027) [2024-06-22 16:28:11,956][15349] Signal inference workers to stop experience collection... (59850 times) [2024-06-22 16:28:12,005][15401] InferenceWorker_p0-w0: stopping experience collection (59850 times) [2024-06-22 16:28:12,014][15349] Signal inference workers to resume experience collection... (59850 times) [2024-06-22 16:28:12,019][15401] InferenceWorker_p0-w0: resuming experience collection (59850 times) [2024-06-22 16:28:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4043341824. Throughput: 0: 42705.2. Samples: 4043458780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 16:28:14,354][15401] Updated weights for policy 0, policy_version 246790 (0.0031) [2024-06-22 16:28:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 4043538432. Throughput: 0: 42444.7. Samples: 4043711240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 16:28:18,993][15401] Updated weights for policy 0, policy_version 246800 (0.0038) [2024-06-22 16:28:22,236][15401] Updated weights for policy 0, policy_version 246810 (0.0032) [2024-06-22 16:28:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4043751424. Throughput: 0: 42563.0. Samples: 4043840560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 16:28:26,602][15401] Updated weights for policy 0, policy_version 246820 (0.0038) [2024-06-22 16:28:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 4043980800. Throughput: 0: 42632.7. Samples: 4044097080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 16:28:30,207][15401] Updated weights for policy 0, policy_version 246830 (0.0038) [2024-06-22 16:28:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 4044177408. Throughput: 0: 42468.5. Samples: 4044354420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 16:28:34,292][15401] Updated weights for policy 0, policy_version 246840 (0.0032) [2024-06-22 16:28:37,653][15401] Updated weights for policy 0, policy_version 246850 (0.0033) [2024-06-22 16:28:38,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 4044406784. Throughput: 0: 42426.6. Samples: 4044473560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 16:28:41,968][15401] Updated weights for policy 0, policy_version 246860 (0.0027) [2024-06-22 16:28:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4044603392. Throughput: 0: 42578.3. Samples: 4044736020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 16:28:45,267][15401] Updated weights for policy 0, policy_version 246870 (0.0030) [2024-06-22 16:28:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4044816384. Throughput: 0: 42671.1. Samples: 4044994220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 16:28:49,829][15401] Updated weights for policy 0, policy_version 246880 (0.0034) [2024-06-22 16:28:52,866][15401] Updated weights for policy 0, policy_version 246890 (0.0029) [2024-06-22 16:28:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4045062144. Throughput: 0: 42527.0. Samples: 4045117940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 16:28:57,863][15401] Updated weights for policy 0, policy_version 246900 (0.0041) [2024-06-22 16:28:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4045242368. Throughput: 0: 42682.4. Samples: 4045379480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:28:58,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 16:29:00,441][15401] Updated weights for policy 0, policy_version 246910 (0.0032) [2024-06-22 16:29:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4045455360. Throughput: 0: 42639.1. Samples: 4045630000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:29:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 16:29:05,446][15401] Updated weights for policy 0, policy_version 246920 (0.0030) [2024-06-22 16:29:07,839][15401] Updated weights for policy 0, policy_version 246930 (0.0025) [2024-06-22 16:29:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4045701120. Throughput: 0: 42635.8. Samples: 4045759160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:29:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 16:29:13,374][15401] Updated weights for policy 0, policy_version 246940 (0.0030) [2024-06-22 16:29:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4045864960. Throughput: 0: 42624.1. Samples: 4046015160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:13,391][15132] Avg episode reward: [(0, '0.823')] [2024-06-22 16:29:13,656][15349] Signal inference workers to stop experience collection... (59900 times) [2024-06-22 16:29:13,709][15401] InferenceWorker_p0-w0: stopping experience collection (59900 times) [2024-06-22 16:29:13,716][15349] Signal inference workers to resume experience collection... (59900 times) [2024-06-22 16:29:13,725][15401] InferenceWorker_p0-w0: resuming experience collection (59900 times) [2024-06-22 16:29:16,095][15401] Updated weights for policy 0, policy_version 246950 (0.0037) [2024-06-22 16:29:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4046110720. Throughput: 0: 42336.8. Samples: 4046259580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:18,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-22 16:29:21,040][15401] Updated weights for policy 0, policy_version 246960 (0.0053) [2024-06-22 16:29:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4046323712. Throughput: 0: 42643.5. Samples: 4046392520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:23,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 16:29:23,726][15401] Updated weights for policy 0, policy_version 246970 (0.0044) [2024-06-22 16:29:28,392][15132] Fps is (10 sec: 37674.5, 60 sec: 41777.7, 300 sec: 42653.6). Total num frames: 4046487552. Throughput: 0: 42519.9. Samples: 4046649520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:28,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 16:29:28,690][15401] Updated weights for policy 0, policy_version 246980 (0.0033) [2024-06-22 16:29:31,297][15401] Updated weights for policy 0, policy_version 246990 (0.0029) [2024-06-22 16:29:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4046733312. Throughput: 0: 42328.0. Samples: 4046898980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 16:29:36,354][15401] Updated weights for policy 0, policy_version 247000 (0.0031) [2024-06-22 16:29:38,389][15132] Fps is (10 sec: 47525.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4046962688. Throughput: 0: 42520.6. Samples: 4047031360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 16:29:39,300][15401] Updated weights for policy 0, policy_version 247010 (0.0034) [2024-06-22 16:29:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4047142912. Throughput: 0: 42435.4. Samples: 4047289080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 16:29:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247019_4047159296.pth... [2024-06-22 16:29:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246391_4036870144.pth [2024-06-22 16:29:43,802][15401] Updated weights for policy 0, policy_version 247020 (0.0039) [2024-06-22 16:29:47,014][15401] Updated weights for policy 0, policy_version 247030 (0.0037) [2024-06-22 16:29:48,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 4047372288. Throughput: 0: 42455.0. Samples: 4047540580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:48,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 16:29:51,293][15401] Updated weights for policy 0, policy_version 247040 (0.0027) [2024-06-22 16:29:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4047601664. Throughput: 0: 42607.8. Samples: 4047676520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 16:29:54,532][15401] Updated weights for policy 0, policy_version 247050 (0.0031) [2024-06-22 16:29:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4047798272. Throughput: 0: 42545.4. Samples: 4047929700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:29:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 16:29:58,855][15401] Updated weights for policy 0, policy_version 247060 (0.0031) [2024-06-22 16:30:01,919][15401] Updated weights for policy 0, policy_version 247070 (0.0039) [2024-06-22 16:30:03,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42870.4, 300 sec: 42764.8). Total num frames: 4048027648. Throughput: 0: 42916.9. Samples: 4048190900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:30:03,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 16:30:06,721][15401] Updated weights for policy 0, policy_version 247080 (0.0044) [2024-06-22 16:30:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4048240640. Throughput: 0: 42870.9. Samples: 4048321700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:30:08,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 16:30:09,941][15401] Updated weights for policy 0, policy_version 247090 (0.0032) [2024-06-22 16:30:13,390][15132] Fps is (10 sec: 40965.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4048437248. Throughput: 0: 42756.4. Samples: 4048573460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:30:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 16:30:14,177][15401] Updated weights for policy 0, policy_version 247100 (0.0039) [2024-06-22 16:30:17,599][15401] Updated weights for policy 0, policy_version 247110 (0.0034) [2024-06-22 16:30:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4048666624. Throughput: 0: 43038.6. Samples: 4048835720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 16:30:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 16:30:21,685][15401] Updated weights for policy 0, policy_version 247120 (0.0041) [2024-06-22 16:30:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4048896000. Throughput: 0: 42972.1. Samples: 4048965120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 16:30:25,780][15401] Updated weights for policy 0, policy_version 247130 (0.0026) [2024-06-22 16:30:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4049076224. Throughput: 0: 42859.1. Samples: 4049217740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 16:30:29,300][15401] Updated weights for policy 0, policy_version 247140 (0.0039) [2024-06-22 16:30:33,294][15401] Updated weights for policy 0, policy_version 247150 (0.0033) [2024-06-22 16:30:33,392][15132] Fps is (10 sec: 40951.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 4049305600. Throughput: 0: 43089.3. Samples: 4049479600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:33,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 16:30:37,130][15401] Updated weights for policy 0, policy_version 247160 (0.0032) [2024-06-22 16:30:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4049534976. Throughput: 0: 42979.2. Samples: 4049610580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 16:30:40,860][15401] Updated weights for policy 0, policy_version 247170 (0.0030) [2024-06-22 16:30:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4049731584. Throughput: 0: 42988.8. Samples: 4049864200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 16:30:44,651][15401] Updated weights for policy 0, policy_version 247180 (0.0028) [2024-06-22 16:30:45,570][15349] Signal inference workers to stop experience collection... (59950 times) [2024-06-22 16:30:45,611][15401] InferenceWorker_p0-w0: stopping experience collection (59950 times) [2024-06-22 16:30:45,619][15349] Signal inference workers to resume experience collection... (59950 times) [2024-06-22 16:30:45,630][15401] InferenceWorker_p0-w0: resuming experience collection (59950 times) [2024-06-22 16:30:48,326][15401] Updated weights for policy 0, policy_version 247190 (0.0028) [2024-06-22 16:30:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4049960960. Throughput: 0: 42904.6. Samples: 4050121540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 16:30:52,209][15401] Updated weights for policy 0, policy_version 247200 (0.0027) [2024-06-22 16:30:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4050173952. Throughput: 0: 42870.2. Samples: 4050250860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 16:30:56,085][15401] Updated weights for policy 0, policy_version 247210 (0.0038) [2024-06-22 16:30:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.5). Total num frames: 4050370560. Throughput: 0: 42942.2. Samples: 4050505860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:30:58,394][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 16:30:59,711][15401] Updated weights for policy 0, policy_version 247220 (0.0034) [2024-06-22 16:31:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.5, 300 sec: 42654.1). Total num frames: 4050583552. Throughput: 0: 42836.1. Samples: 4050763340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 16:31:03,735][15401] Updated weights for policy 0, policy_version 247230 (0.0032) [2024-06-22 16:31:07,626][15401] Updated weights for policy 0, policy_version 247240 (0.0025) [2024-06-22 16:31:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4050829312. Throughput: 0: 42797.1. Samples: 4050890980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 16:31:11,494][15401] Updated weights for policy 0, policy_version 247250 (0.0035) [2024-06-22 16:31:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4051025920. Throughput: 0: 42875.9. Samples: 4051147160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 16:31:15,253][15401] Updated weights for policy 0, policy_version 247260 (0.0031) [2024-06-22 16:31:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4051222528. Throughput: 0: 42813.0. Samples: 4051406080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 16:31:19,302][15401] Updated weights for policy 0, policy_version 247270 (0.0029) [2024-06-22 16:31:22,977][15401] Updated weights for policy 0, policy_version 247280 (0.0040) [2024-06-22 16:31:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 4051451904. Throughput: 0: 42655.6. Samples: 4051530080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:23,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 16:31:26,725][15401] Updated weights for policy 0, policy_version 247290 (0.0038) [2024-06-22 16:31:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 4051697664. Throughput: 0: 42811.1. Samples: 4051790700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 16:31:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 16:31:30,669][15401] Updated weights for policy 0, policy_version 247300 (0.0034) [2024-06-22 16:31:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 4051861504. Throughput: 0: 42930.3. Samples: 4052053400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 16:31:34,235][15401] Updated weights for policy 0, policy_version 247310 (0.0036) [2024-06-22 16:31:38,232][15401] Updated weights for policy 0, policy_version 247320 (0.0029) [2024-06-22 16:31:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4052090880. Throughput: 0: 42753.4. Samples: 4052174760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 16:31:42,168][15401] Updated weights for policy 0, policy_version 247330 (0.0045) [2024-06-22 16:31:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4052320256. Throughput: 0: 42821.3. Samples: 4052432820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:43,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-22 16:31:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247335_4052336640.pth... [2024-06-22 16:31:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000246710_4042096640.pth [2024-06-22 16:31:45,790][15401] Updated weights for policy 0, policy_version 247340 (0.0032) [2024-06-22 16:31:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4052516864. Throughput: 0: 42783.9. Samples: 4052688620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:48,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 16:31:49,804][15401] Updated weights for policy 0, policy_version 247350 (0.0028) [2024-06-22 16:31:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4052729856. Throughput: 0: 42648.9. Samples: 4052810180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 16:31:53,542][15401] Updated weights for policy 0, policy_version 247360 (0.0043) [2024-06-22 16:31:57,283][15401] Updated weights for policy 0, policy_version 247370 (0.0036) [2024-06-22 16:31:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4052926464. Throughput: 0: 42788.6. Samples: 4053072640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:31:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 16:32:01,212][15401] Updated weights for policy 0, policy_version 247380 (0.0037) [2024-06-22 16:32:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4053172224. Throughput: 0: 42562.9. Samples: 4053321420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 16:32:04,839][15401] Updated weights for policy 0, policy_version 247390 (0.0038) [2024-06-22 16:32:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4053385216. Throughput: 0: 42759.6. Samples: 4053454260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 16:32:08,807][15401] Updated weights for policy 0, policy_version 247400 (0.0043) [2024-06-22 16:32:09,397][15349] Signal inference workers to stop experience collection... (60000 times) [2024-06-22 16:32:09,447][15401] InferenceWorker_p0-w0: stopping experience collection (60000 times) [2024-06-22 16:32:09,455][15349] Signal inference workers to resume experience collection... (60000 times) [2024-06-22 16:32:09,471][15401] InferenceWorker_p0-w0: resuming experience collection (60000 times) [2024-06-22 16:32:12,303][15401] Updated weights for policy 0, policy_version 247410 (0.0028) [2024-06-22 16:32:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4053581824. Throughput: 0: 42694.7. Samples: 4053711960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 16:32:16,321][15401] Updated weights for policy 0, policy_version 247420 (0.0029) [2024-06-22 16:32:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4053827584. Throughput: 0: 42568.3. Samples: 4053968980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:18,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 16:32:19,870][15401] Updated weights for policy 0, policy_version 247430 (0.0027) [2024-06-22 16:32:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 4053991424. Throughput: 0: 42820.3. Samples: 4054101780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:23,393][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 16:32:24,079][15401] Updated weights for policy 0, policy_version 247440 (0.0048) [2024-06-22 16:32:27,496][15401] Updated weights for policy 0, policy_version 247450 (0.0027) [2024-06-22 16:32:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 4054237184. Throughput: 0: 42680.5. Samples: 4054353440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:28,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 16:32:31,802][15401] Updated weights for policy 0, policy_version 247460 (0.0031) [2024-06-22 16:32:33,396][15132] Fps is (10 sec: 47494.7, 60 sec: 43412.9, 300 sec: 42875.5). Total num frames: 4054466560. Throughput: 0: 42623.4. Samples: 4054606940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:33,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 16:32:35,578][15401] Updated weights for policy 0, policy_version 247470 (0.0042) [2024-06-22 16:32:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4054630400. Throughput: 0: 42844.0. Samples: 4054738160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 16:32:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 16:32:39,851][15401] Updated weights for policy 0, policy_version 247480 (0.0031) [2024-06-22 16:32:43,252][15401] Updated weights for policy 0, policy_version 247490 (0.0044) [2024-06-22 16:32:43,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4054876160. Throughput: 0: 42487.9. Samples: 4054984600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:32:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 16:32:47,589][15401] Updated weights for policy 0, policy_version 247500 (0.0038) [2024-06-22 16:32:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4055089152. Throughput: 0: 42714.2. Samples: 4055243560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:32:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 16:32:51,022][15401] Updated weights for policy 0, policy_version 247510 (0.0035) [2024-06-22 16:32:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4055285760. Throughput: 0: 42653.2. Samples: 4055373660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:32:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 16:32:55,116][15401] Updated weights for policy 0, policy_version 247520 (0.0039) [2024-06-22 16:32:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4055498752. Throughput: 0: 42603.7. Samples: 4055629120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:32:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 16:32:58,593][15401] Updated weights for policy 0, policy_version 247530 (0.0028) [2024-06-22 16:33:02,708][15401] Updated weights for policy 0, policy_version 247540 (0.0030) [2024-06-22 16:33:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4055728128. Throughput: 0: 42674.8. Samples: 4055889340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 16:33:06,246][15401] Updated weights for policy 0, policy_version 247550 (0.0035) [2024-06-22 16:33:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4055924736. Throughput: 0: 42554.8. Samples: 4056016640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 16:33:10,285][15401] Updated weights for policy 0, policy_version 247560 (0.0034) [2024-06-22 16:33:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4056154112. Throughput: 0: 42655.1. Samples: 4056272920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:13,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 16:33:14,015][15401] Updated weights for policy 0, policy_version 247570 (0.0028) [2024-06-22 16:33:17,835][15401] Updated weights for policy 0, policy_version 247580 (0.0048) [2024-06-22 16:33:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4056350720. Throughput: 0: 42747.0. Samples: 4056530280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 16:33:21,559][15401] Updated weights for policy 0, policy_version 247590 (0.0038) [2024-06-22 16:33:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.3, 300 sec: 42654.0). Total num frames: 4056563712. Throughput: 0: 42719.1. Samples: 4056660520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:23,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-22 16:33:25,625][15401] Updated weights for policy 0, policy_version 247600 (0.0046) [2024-06-22 16:33:26,412][15349] Signal inference workers to stop experience collection... (60050 times) [2024-06-22 16:33:26,466][15401] InferenceWorker_p0-w0: stopping experience collection (60050 times) [2024-06-22 16:33:26,474][15349] Signal inference workers to resume experience collection... (60050 times) [2024-06-22 16:33:26,476][15401] InferenceWorker_p0-w0: resuming experience collection (60050 times) [2024-06-22 16:33:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4056809472. Throughput: 0: 42831.6. Samples: 4056912020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 16:33:29,153][15401] Updated weights for policy 0, policy_version 247610 (0.0050) [2024-06-22 16:33:33,043][15401] Updated weights for policy 0, policy_version 247620 (0.0036) [2024-06-22 16:33:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 4057006080. Throughput: 0: 42905.5. Samples: 4057174300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 16:33:36,684][15401] Updated weights for policy 0, policy_version 247630 (0.0033) [2024-06-22 16:33:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4057219072. Throughput: 0: 42904.5. Samples: 4057304360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 16:33:40,597][15401] Updated weights for policy 0, policy_version 247640 (0.0028) [2024-06-22 16:33:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4057448448. Throughput: 0: 42895.0. Samples: 4057559400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 16:33:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247648_4057464832.pth... [2024-06-22 16:33:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247019_4047159296.pth [2024-06-22 16:33:44,966][15401] Updated weights for policy 0, policy_version 247650 (0.0036) [2024-06-22 16:33:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4057645056. Throughput: 0: 43011.1. Samples: 4057824840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 16:33:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 16:33:48,599][15401] Updated weights for policy 0, policy_version 247660 (0.0031) [2024-06-22 16:33:52,468][15401] Updated weights for policy 0, policy_version 247670 (0.0038) [2024-06-22 16:33:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 4057874432. Throughput: 0: 42930.2. Samples: 4057948500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:33:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 16:33:56,069][15401] Updated weights for policy 0, policy_version 247680 (0.0043) [2024-06-22 16:33:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4058087424. Throughput: 0: 43089.5. Samples: 4058211940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:33:58,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 16:33:59,845][15401] Updated weights for policy 0, policy_version 247690 (0.0026) [2024-06-22 16:34:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4058300416. Throughput: 0: 43152.9. Samples: 4058472160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:03,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-22 16:34:03,561][15401] Updated weights for policy 0, policy_version 247700 (0.0038) [2024-06-22 16:34:07,274][15401] Updated weights for policy 0, policy_version 247710 (0.0031) [2024-06-22 16:34:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4058513408. Throughput: 0: 43053.4. Samples: 4058597920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 16:34:10,961][15401] Updated weights for policy 0, policy_version 247720 (0.0029) [2024-06-22 16:34:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4058759168. Throughput: 0: 43315.9. Samples: 4058861240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 16:34:15,187][15401] Updated weights for policy 0, policy_version 247730 (0.0030) [2024-06-22 16:34:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4058939392. Throughput: 0: 43282.6. Samples: 4059122020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 16:34:18,893][15401] Updated weights for policy 0, policy_version 247740 (0.0048) [2024-06-22 16:34:22,511][15401] Updated weights for policy 0, policy_version 247750 (0.0030) [2024-06-22 16:34:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 4059168768. Throughput: 0: 43140.4. Samples: 4059245680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 16:34:26,374][15401] Updated weights for policy 0, policy_version 247760 (0.0027) [2024-06-22 16:34:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4059381760. Throughput: 0: 43356.5. Samples: 4059510440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 16:34:30,204][15401] Updated weights for policy 0, policy_version 247770 (0.0038) [2024-06-22 16:34:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4059578368. Throughput: 0: 43336.0. Samples: 4059774960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 16:34:34,006][15401] Updated weights for policy 0, policy_version 247780 (0.0032) [2024-06-22 16:34:37,733][15401] Updated weights for policy 0, policy_version 247790 (0.0035) [2024-06-22 16:34:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4059807744. Throughput: 0: 43299.9. Samples: 4059897000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 16:34:41,646][15401] Updated weights for policy 0, policy_version 247800 (0.0031) [2024-06-22 16:34:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 4060037120. Throughput: 0: 43250.2. Samples: 4060158200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 16:34:45,197][15401] Updated weights for policy 0, policy_version 247810 (0.0030) [2024-06-22 16:34:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4060233728. Throughput: 0: 43179.1. Samples: 4060415220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 16:34:49,379][15401] Updated weights for policy 0, policy_version 247820 (0.0035) [2024-06-22 16:34:50,830][15349] Signal inference workers to stop experience collection... (60100 times) [2024-06-22 16:34:50,831][15349] Signal inference workers to resume experience collection... (60100 times) [2024-06-22 16:34:50,851][15401] InferenceWorker_p0-w0: stopping experience collection (60100 times) [2024-06-22 16:34:50,852][15401] InferenceWorker_p0-w0: resuming experience collection (60100 times) [2024-06-22 16:34:53,031][15401] Updated weights for policy 0, policy_version 247830 (0.0038) [2024-06-22 16:34:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4060463104. Throughput: 0: 43026.1. Samples: 4060534100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 16:34:56,985][15401] Updated weights for policy 0, policy_version 247840 (0.0022) [2024-06-22 16:34:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42931.9). Total num frames: 4060692480. Throughput: 0: 43047.2. Samples: 4060798360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 16:34:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 16:35:00,607][15401] Updated weights for policy 0, policy_version 247850 (0.0031) [2024-06-22 16:35:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4060889088. Throughput: 0: 42936.1. Samples: 4061054140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 16:35:04,661][15401] Updated weights for policy 0, policy_version 247860 (0.0050) [2024-06-22 16:35:08,024][15401] Updated weights for policy 0, policy_version 247870 (0.0029) [2024-06-22 16:35:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4061102080. Throughput: 0: 42966.3. Samples: 4061179160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 16:35:12,042][15401] Updated weights for policy 0, policy_version 247880 (0.0031) [2024-06-22 16:35:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4061331456. Throughput: 0: 42945.3. Samples: 4061442980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:13,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 16:35:15,635][15401] Updated weights for policy 0, policy_version 247890 (0.0032) [2024-06-22 16:35:18,394][15132] Fps is (10 sec: 42578.7, 60 sec: 43141.2, 300 sec: 42819.9). Total num frames: 4061528064. Throughput: 0: 42812.9. Samples: 4061701740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:18,395][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 16:35:19,792][15401] Updated weights for policy 0, policy_version 247900 (0.0023) [2024-06-22 16:35:23,331][15401] Updated weights for policy 0, policy_version 247910 (0.0023) [2024-06-22 16:35:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4061757440. Throughput: 0: 42739.1. Samples: 4061820260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 16:35:27,455][15401] Updated weights for policy 0, policy_version 247920 (0.0023) [2024-06-22 16:35:28,389][15132] Fps is (10 sec: 44257.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 4061970432. Throughput: 0: 42834.2. Samples: 4062085740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 16:35:30,921][15401] Updated weights for policy 0, policy_version 247930 (0.0024) [2024-06-22 16:35:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4062150656. Throughput: 0: 42844.3. Samples: 4062343220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 16:35:35,121][15401] Updated weights for policy 0, policy_version 247940 (0.0032) [2024-06-22 16:35:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4062396416. Throughput: 0: 42931.6. Samples: 4062466020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 16:35:38,425][15401] Updated weights for policy 0, policy_version 247950 (0.0035) [2024-06-22 16:35:42,663][15401] Updated weights for policy 0, policy_version 247960 (0.0033) [2024-06-22 16:35:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4062609408. Throughput: 0: 42860.4. Samples: 4062727080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:35:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247963_4062625792.pth... [2024-06-22 16:35:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247335_4052336640.pth [2024-06-22 16:35:45,888][15401] Updated weights for policy 0, policy_version 247970 (0.0025) [2024-06-22 16:35:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4062806016. Throughput: 0: 42889.3. Samples: 4062984160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 16:35:50,459][15401] Updated weights for policy 0, policy_version 247980 (0.0028) [2024-06-22 16:35:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 4063051776. Throughput: 0: 42769.7. Samples: 4063103800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 16:35:53,457][15401] Updated weights for policy 0, policy_version 247990 (0.0047) [2024-06-22 16:35:58,071][15401] Updated weights for policy 0, policy_version 248000 (0.0039) [2024-06-22 16:35:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4063248384. Throughput: 0: 42827.2. Samples: 4063370200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:35:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 16:36:00,996][15401] Updated weights for policy 0, policy_version 248010 (0.0037) [2024-06-22 16:36:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4063461376. Throughput: 0: 42859.8. Samples: 4063630240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:36:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 16:36:05,564][15401] Updated weights for policy 0, policy_version 248020 (0.0039) [2024-06-22 16:36:08,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 4063707136. Throughput: 0: 42968.4. Samples: 4063753940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-22 16:36:08,392][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 16:36:08,536][15401] Updated weights for policy 0, policy_version 248030 (0.0031) [2024-06-22 16:36:13,262][15401] Updated weights for policy 0, policy_version 248040 (0.0034) [2024-06-22 16:36:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4063887360. Throughput: 0: 42939.5. Samples: 4064018020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 16:36:15,377][15349] Signal inference workers to stop experience collection... (60150 times) [2024-06-22 16:36:15,378][15349] Signal inference workers to resume experience collection... (60150 times) [2024-06-22 16:36:15,414][15401] InferenceWorker_p0-w0: stopping experience collection (60150 times) [2024-06-22 16:36:15,414][15401] InferenceWorker_p0-w0: resuming experience collection (60150 times) [2024-06-22 16:36:16,185][15401] Updated weights for policy 0, policy_version 248050 (0.0025) [2024-06-22 16:36:18,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42874.8, 300 sec: 42876.1). Total num frames: 4064100352. Throughput: 0: 42819.2. Samples: 4064270080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 16:36:20,935][15401] Updated weights for policy 0, policy_version 248060 (0.0040) [2024-06-22 16:36:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4064313344. Throughput: 0: 42939.5. Samples: 4064398300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 16:36:24,349][15401] Updated weights for policy 0, policy_version 248070 (0.0040) [2024-06-22 16:36:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4064526336. Throughput: 0: 42890.3. Samples: 4064657140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:28,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 16:36:28,522][15401] Updated weights for policy 0, policy_version 248080 (0.0035) [2024-06-22 16:36:31,813][15401] Updated weights for policy 0, policy_version 248090 (0.0039) [2024-06-22 16:36:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4064755712. Throughput: 0: 42637.7. Samples: 4064902860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 16:36:36,303][15401] Updated weights for policy 0, policy_version 248100 (0.0041) [2024-06-22 16:36:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4064952320. Throughput: 0: 42973.3. Samples: 4065037600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 16:36:39,402][15401] Updated weights for policy 0, policy_version 248110 (0.0032) [2024-06-22 16:36:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4065165312. Throughput: 0: 42749.3. Samples: 4065293920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 16:36:43,825][15401] Updated weights for policy 0, policy_version 248120 (0.0028) [2024-06-22 16:36:47,217][15401] Updated weights for policy 0, policy_version 248130 (0.0032) [2024-06-22 16:36:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4065394688. Throughput: 0: 42622.3. Samples: 4065548240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 16:36:51,329][15401] Updated weights for policy 0, policy_version 248140 (0.0048) [2024-06-22 16:36:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 4065607680. Throughput: 0: 42753.4. Samples: 4065677740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 16:36:55,028][15401] Updated weights for policy 0, policy_version 248150 (0.0035) [2024-06-22 16:36:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4065804288. Throughput: 0: 42717.4. Samples: 4065940300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:36:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 16:36:58,912][15401] Updated weights for policy 0, policy_version 248160 (0.0034) [2024-06-22 16:37:02,551][15401] Updated weights for policy 0, policy_version 248170 (0.0034) [2024-06-22 16:37:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4066033664. Throughput: 0: 42613.4. Samples: 4066187680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:37:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 16:37:06,545][15401] Updated weights for policy 0, policy_version 248180 (0.0029) [2024-06-22 16:37:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42053.9, 300 sec: 42876.1). Total num frames: 4066230272. Throughput: 0: 42641.3. Samples: 4066317160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:37:08,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 16:37:10,117][15401] Updated weights for policy 0, policy_version 248190 (0.0027) [2024-06-22 16:37:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4066443264. Throughput: 0: 42532.4. Samples: 4066571100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:37:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 16:37:14,726][15401] Updated weights for policy 0, policy_version 248200 (0.0033) [2024-06-22 16:37:17,888][15401] Updated weights for policy 0, policy_version 248210 (0.0037) [2024-06-22 16:37:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 4066672640. Throughput: 0: 42604.1. Samples: 4066820040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 16:37:18,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 16:37:22,742][15401] Updated weights for policy 0, policy_version 248220 (0.0037) [2024-06-22 16:37:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4066885632. Throughput: 0: 42594.8. Samples: 4066954360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 16:37:25,563][15401] Updated weights for policy 0, policy_version 248230 (0.0027) [2024-06-22 16:37:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 4067082240. Throughput: 0: 42552.5. Samples: 4067208780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 16:37:30,280][15401] Updated weights for policy 0, policy_version 248240 (0.0035) [2024-06-22 16:37:32,832][15349] Signal inference workers to stop experience collection... (60200 times) [2024-06-22 16:37:32,862][15401] InferenceWorker_p0-w0: stopping experience collection (60200 times) [2024-06-22 16:37:32,889][15349] Signal inference workers to resume experience collection... (60200 times) [2024-06-22 16:37:32,896][15401] InferenceWorker_p0-w0: resuming experience collection (60200 times) [2024-06-22 16:37:33,238][15401] Updated weights for policy 0, policy_version 248250 (0.0034) [2024-06-22 16:37:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4067328000. Throughput: 0: 42428.4. Samples: 4067457520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 16:37:38,120][15401] Updated weights for policy 0, policy_version 248260 (0.0039) [2024-06-22 16:37:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4067491840. Throughput: 0: 42562.2. Samples: 4067593040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 16:37:41,099][15401] Updated weights for policy 0, policy_version 248270 (0.0028) [2024-06-22 16:37:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4067737600. Throughput: 0: 42385.6. Samples: 4067847660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 16:37:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248275_4067737600.pth... [2024-06-22 16:37:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247648_4057464832.pth [2024-06-22 16:37:45,982][15401] Updated weights for policy 0, policy_version 248280 (0.0040) [2024-06-22 16:37:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4067966976. Throughput: 0: 42455.0. Samples: 4068098160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:48,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:37:48,682][15401] Updated weights for policy 0, policy_version 248290 (0.0023) [2024-06-22 16:37:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 4068130816. Throughput: 0: 42552.5. Samples: 4068232020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 16:37:53,471][15401] Updated weights for policy 0, policy_version 248300 (0.0030) [2024-06-22 16:37:56,882][15401] Updated weights for policy 0, policy_version 248310 (0.0024) [2024-06-22 16:37:58,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4068343808. Throughput: 0: 42531.1. Samples: 4068485000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:37:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 16:38:01,272][15401] Updated weights for policy 0, policy_version 248320 (0.0034) [2024-06-22 16:38:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4068573184. Throughput: 0: 42620.9. Samples: 4068737980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 16:38:04,476][15401] Updated weights for policy 0, policy_version 248330 (0.0039) [2024-06-22 16:38:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4068769792. Throughput: 0: 42559.6. Samples: 4068869540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 16:38:08,977][15401] Updated weights for policy 0, policy_version 248340 (0.0034) [2024-06-22 16:38:12,472][15401] Updated weights for policy 0, policy_version 248350 (0.0036) [2024-06-22 16:38:13,396][15132] Fps is (10 sec: 42569.8, 60 sec: 42593.7, 300 sec: 42875.1). Total num frames: 4068999168. Throughput: 0: 42598.5. Samples: 4069126000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:13,397][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 16:38:16,668][15401] Updated weights for policy 0, policy_version 248360 (0.0033) [2024-06-22 16:38:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4069228544. Throughput: 0: 42746.3. Samples: 4069381100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 16:38:20,023][15401] Updated weights for policy 0, policy_version 248370 (0.0022) [2024-06-22 16:38:23,389][15132] Fps is (10 sec: 42626.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4069425152. Throughput: 0: 42618.7. Samples: 4069510880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 16:38:24,111][15401] Updated weights for policy 0, policy_version 248380 (0.0036) [2024-06-22 16:38:27,437][15401] Updated weights for policy 0, policy_version 248390 (0.0034) [2024-06-22 16:38:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4069638144. Throughput: 0: 42761.9. Samples: 4069772040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-22 16:38:28,392][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 16:38:31,592][15401] Updated weights for policy 0, policy_version 248400 (0.0044) [2024-06-22 16:38:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4069883904. Throughput: 0: 42897.8. Samples: 4070028560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 16:38:34,986][15401] Updated weights for policy 0, policy_version 248410 (0.0038) [2024-06-22 16:38:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4070080512. Throughput: 0: 42788.9. Samples: 4070157520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 16:38:39,250][15401] Updated weights for policy 0, policy_version 248420 (0.0043) [2024-06-22 16:38:42,790][15401] Updated weights for policy 0, policy_version 248430 (0.0040) [2024-06-22 16:38:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4070277120. Throughput: 0: 42864.1. Samples: 4070413880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 16:38:46,855][15401] Updated weights for policy 0, policy_version 248440 (0.0027) [2024-06-22 16:38:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4070490112. Throughput: 0: 42978.6. Samples: 4070672020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 16:38:50,287][15401] Updated weights for policy 0, policy_version 248450 (0.0027) [2024-06-22 16:38:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4070719488. Throughput: 0: 42912.0. Samples: 4070800580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 16:38:54,427][15401] Updated weights for policy 0, policy_version 248460 (0.0031) [2024-06-22 16:38:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4070916096. Throughput: 0: 42952.6. Samples: 4071058580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:38:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 16:38:58,432][15401] Updated weights for policy 0, policy_version 248470 (0.0032) [2024-06-22 16:38:58,451][15349] Signal inference workers to stop experience collection... (60250 times) [2024-06-22 16:38:58,452][15349] Signal inference workers to resume experience collection... (60250 times) [2024-06-22 16:38:58,496][15401] InferenceWorker_p0-w0: stopping experience collection (60250 times) [2024-06-22 16:38:58,497][15401] InferenceWorker_p0-w0: resuming experience collection (60250 times) [2024-06-22 16:39:01,856][15401] Updated weights for policy 0, policy_version 248480 (0.0029) [2024-06-22 16:39:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4071145472. Throughput: 0: 42960.1. Samples: 4071314300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 16:39:05,820][15401] Updated weights for policy 0, policy_version 248490 (0.0043) [2024-06-22 16:39:08,394][15132] Fps is (10 sec: 44214.9, 60 sec: 43141.0, 300 sec: 42708.8). Total num frames: 4071358464. Throughput: 0: 42961.0. Samples: 4071444340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:08,395][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 16:39:09,314][15401] Updated weights for policy 0, policy_version 248500 (0.0021) [2024-06-22 16:39:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.2, 300 sec: 42820.6). Total num frames: 4071571456. Throughput: 0: 42941.8. Samples: 4071704320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 16:39:13,420][15401] Updated weights for policy 0, policy_version 248510 (0.0047) [2024-06-22 16:39:16,929][15401] Updated weights for policy 0, policy_version 248520 (0.0034) [2024-06-22 16:39:18,389][15132] Fps is (10 sec: 42619.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4071784448. Throughput: 0: 42810.3. Samples: 4071955020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 16:39:21,178][15401] Updated weights for policy 0, policy_version 248530 (0.0046) [2024-06-22 16:39:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4071997440. Throughput: 0: 42795.6. Samples: 4072083320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 16:39:24,799][15401] Updated weights for policy 0, policy_version 248540 (0.0037) [2024-06-22 16:39:28,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42871.4, 300 sec: 42820.2). Total num frames: 4072210432. Throughput: 0: 42721.6. Samples: 4072336460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:28,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 16:39:28,860][15401] Updated weights for policy 0, policy_version 248550 (0.0035) [2024-06-22 16:39:32,310][15401] Updated weights for policy 0, policy_version 248560 (0.0030) [2024-06-22 16:39:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4072423424. Throughput: 0: 42660.5. Samples: 4072591740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 16:39:36,599][15401] Updated weights for policy 0, policy_version 248570 (0.0042) [2024-06-22 16:39:38,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4072620032. Throughput: 0: 42727.5. Samples: 4072723320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-22 16:39:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 16:39:40,266][15401] Updated weights for policy 0, policy_version 248580 (0.0039) [2024-06-22 16:39:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4072849408. Throughput: 0: 42632.0. Samples: 4072977020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:39:43,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 16:39:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248587_4072849408.pth... [2024-06-22 16:39:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000247963_4062625792.pth [2024-06-22 16:39:44,215][15401] Updated weights for policy 0, policy_version 248590 (0.0047) [2024-06-22 16:39:48,041][15401] Updated weights for policy 0, policy_version 248600 (0.0037) [2024-06-22 16:39:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4073062400. Throughput: 0: 42540.5. Samples: 4073228620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:39:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 16:39:51,810][15401] Updated weights for policy 0, policy_version 248610 (0.0034) [2024-06-22 16:39:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4073275392. Throughput: 0: 42543.0. Samples: 4073358560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:39:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 16:39:55,763][15401] Updated weights for policy 0, policy_version 248620 (0.0052) [2024-06-22 16:39:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4073488384. Throughput: 0: 42352.9. Samples: 4073610200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:39:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 16:39:59,879][15401] Updated weights for policy 0, policy_version 248630 (0.0027) [2024-06-22 16:40:03,354][15401] Updated weights for policy 0, policy_version 248640 (0.0037) [2024-06-22 16:40:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4073717760. Throughput: 0: 42536.4. Samples: 4073869160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 16:40:07,286][15401] Updated weights for policy 0, policy_version 248650 (0.0030) [2024-06-22 16:40:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42875.0, 300 sec: 42709.5). Total num frames: 4073930752. Throughput: 0: 42560.8. Samples: 4073998560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:08,395][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 16:40:11,040][15401] Updated weights for policy 0, policy_version 248660 (0.0034) [2024-06-22 16:40:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42710.1). Total num frames: 4074127360. Throughput: 0: 42618.6. Samples: 4074254200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 16:40:14,882][15401] Updated weights for policy 0, policy_version 248670 (0.0032) [2024-06-22 16:40:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4074340352. Throughput: 0: 42631.0. Samples: 4074510140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 16:40:18,825][15401] Updated weights for policy 0, policy_version 248680 (0.0032) [2024-06-22 16:40:20,019][15349] Signal inference workers to stop experience collection... (60300 times) [2024-06-22 16:40:20,024][15349] Signal inference workers to resume experience collection... (60300 times) [2024-06-22 16:40:20,029][15401] InferenceWorker_p0-w0: stopping experience collection (60300 times) [2024-06-22 16:40:20,057][15401] InferenceWorker_p0-w0: resuming experience collection (60300 times) [2024-06-22 16:40:22,605][15401] Updated weights for policy 0, policy_version 248690 (0.0037) [2024-06-22 16:40:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4074569728. Throughput: 0: 42620.5. Samples: 4074641240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 16:40:26,605][15401] Updated weights for policy 0, policy_version 248700 (0.0032) [2024-06-22 16:40:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4074766336. Throughput: 0: 42613.7. Samples: 4074894640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 16:40:30,100][15401] Updated weights for policy 0, policy_version 248710 (0.0022) [2024-06-22 16:40:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4074962944. Throughput: 0: 42762.1. Samples: 4075152920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 16:40:34,286][15401] Updated weights for policy 0, policy_version 248720 (0.0046) [2024-06-22 16:40:37,907][15401] Updated weights for policy 0, policy_version 248730 (0.0026) [2024-06-22 16:40:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4075208704. Throughput: 0: 42688.9. Samples: 4075279560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 16:40:42,098][15401] Updated weights for policy 0, policy_version 248740 (0.0045) [2024-06-22 16:40:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4075405312. Throughput: 0: 42700.4. Samples: 4075531720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 16:40:45,588][15401] Updated weights for policy 0, policy_version 248750 (0.0028) [2024-06-22 16:40:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4075618304. Throughput: 0: 42749.8. Samples: 4075792900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:40:49,618][15401] Updated weights for policy 0, policy_version 248760 (0.0045) [2024-06-22 16:40:53,247][15401] Updated weights for policy 0, policy_version 248770 (0.0037) [2024-06-22 16:40:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4075847680. Throughput: 0: 42697.1. Samples: 4075919920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-22 16:40:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 16:40:57,102][15401] Updated weights for policy 0, policy_version 248780 (0.0029) [2024-06-22 16:40:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4076044288. Throughput: 0: 42633.9. Samples: 4076172720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:40:58,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 16:41:01,017][15401] Updated weights for policy 0, policy_version 248790 (0.0034) [2024-06-22 16:41:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 4076257280. Throughput: 0: 42689.0. Samples: 4076431140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:41:04,696][15401] Updated weights for policy 0, policy_version 248800 (0.0028) [2024-06-22 16:41:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4076486656. Throughput: 0: 42637.9. Samples: 4076559940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:41:08,565][15401] Updated weights for policy 0, policy_version 248810 (0.0030) [2024-06-22 16:41:12,231][15401] Updated weights for policy 0, policy_version 248820 (0.0035) [2024-06-22 16:41:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 4076699648. Throughput: 0: 42735.1. Samples: 4076817820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:13,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 16:41:16,235][15401] Updated weights for policy 0, policy_version 248830 (0.0027) [2024-06-22 16:41:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4076912640. Throughput: 0: 42691.0. Samples: 4077074020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 16:41:20,321][15401] Updated weights for policy 0, policy_version 248840 (0.0034) [2024-06-22 16:41:23,390][15132] Fps is (10 sec: 42606.6, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 4077125632. Throughput: 0: 42783.9. Samples: 4077204860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:23,391][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 16:41:23,767][15401] Updated weights for policy 0, policy_version 248850 (0.0037) [2024-06-22 16:41:28,020][15401] Updated weights for policy 0, policy_version 248860 (0.0043) [2024-06-22 16:41:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4077338624. Throughput: 0: 42979.5. Samples: 4077465800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 16:41:31,490][15401] Updated weights for policy 0, policy_version 248870 (0.0027) [2024-06-22 16:41:33,390][15132] Fps is (10 sec: 44238.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4077568000. Throughput: 0: 42870.0. Samples: 4077722060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:41:35,471][15401] Updated weights for policy 0, policy_version 248880 (0.0031) [2024-06-22 16:41:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4077764608. Throughput: 0: 43060.4. Samples: 4077857640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 16:41:39,230][15401] Updated weights for policy 0, policy_version 248890 (0.0032) [2024-06-22 16:41:42,842][15401] Updated weights for policy 0, policy_version 248900 (0.0033) [2024-06-22 16:41:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4077977600. Throughput: 0: 43097.8. Samples: 4078112120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 16:41:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248901_4077993984.pth... [2024-06-22 16:41:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248275_4067737600.pth [2024-06-22 16:41:46,669][15401] Updated weights for policy 0, policy_version 248910 (0.0038) [2024-06-22 16:41:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4078223360. Throughput: 0: 43066.5. Samples: 4078369140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:48,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-22 16:41:50,612][15401] Updated weights for policy 0, policy_version 248920 (0.0033) [2024-06-22 16:41:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4078419968. Throughput: 0: 43213.6. Samples: 4078504560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:53,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-22 16:41:54,239][15401] Updated weights for policy 0, policy_version 248930 (0.0031) [2024-06-22 16:41:58,111][15401] Updated weights for policy 0, policy_version 248940 (0.0043) [2024-06-22 16:41:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 4078632960. Throughput: 0: 43141.3. Samples: 4078759180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:41:58,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 16:42:01,971][15401] Updated weights for policy 0, policy_version 248950 (0.0028) [2024-06-22 16:42:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 4078862336. Throughput: 0: 43136.9. Samples: 4079015180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 16:42:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:42:06,012][15401] Updated weights for policy 0, policy_version 248960 (0.0032) [2024-06-22 16:42:07,544][15349] Signal inference workers to stop experience collection... (60350 times) [2024-06-22 16:42:07,545][15349] Signal inference workers to resume experience collection... (60350 times) [2024-06-22 16:42:07,564][15401] InferenceWorker_p0-w0: stopping experience collection (60350 times) [2024-06-22 16:42:07,564][15401] InferenceWorker_p0-w0: resuming experience collection (60350 times) [2024-06-22 16:42:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4079075328. Throughput: 0: 43159.6. Samples: 4079147020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 16:42:09,493][15401] Updated weights for policy 0, policy_version 248970 (0.0031) [2024-06-22 16:42:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42709.4). Total num frames: 4079271936. Throughput: 0: 43071.5. Samples: 4079404020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:13,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 16:42:13,578][15401] Updated weights for policy 0, policy_version 248980 (0.0029) [2024-06-22 16:42:17,162][15401] Updated weights for policy 0, policy_version 248990 (0.0036) [2024-06-22 16:42:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4079501312. Throughput: 0: 43116.2. Samples: 4079662280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 16:42:21,132][15401] Updated weights for policy 0, policy_version 249000 (0.0031) [2024-06-22 16:42:23,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43416.2, 300 sec: 42875.7). Total num frames: 4079730688. Throughput: 0: 42981.6. Samples: 4079791920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:23,393][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 16:42:24,835][15401] Updated weights for policy 0, policy_version 249010 (0.0023) [2024-06-22 16:42:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 4079927296. Throughput: 0: 42977.8. Samples: 4080046120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 16:42:28,989][15401] Updated weights for policy 0, policy_version 249020 (0.0038) [2024-06-22 16:42:32,486][15401] Updated weights for policy 0, policy_version 249030 (0.0036) [2024-06-22 16:42:33,393][15132] Fps is (10 sec: 40953.7, 60 sec: 42868.8, 300 sec: 42875.5). Total num frames: 4080140288. Throughput: 0: 42838.1. Samples: 4080297020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:33,394][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 16:42:36,638][15401] Updated weights for policy 0, policy_version 249040 (0.0035) [2024-06-22 16:42:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4080353280. Throughput: 0: 42784.5. Samples: 4080429860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 16:42:40,185][15401] Updated weights for policy 0, policy_version 249050 (0.0042) [2024-06-22 16:42:43,389][15132] Fps is (10 sec: 40976.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4080549888. Throughput: 0: 42858.8. Samples: 4080687720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:43,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 16:42:44,333][15401] Updated weights for policy 0, policy_version 249060 (0.0030) [2024-06-22 16:42:47,619][15401] Updated weights for policy 0, policy_version 249070 (0.0039) [2024-06-22 16:42:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4080795648. Throughput: 0: 42765.3. Samples: 4080939620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:48,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 16:42:51,959][15401] Updated weights for policy 0, policy_version 249080 (0.0026) [2024-06-22 16:42:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4081008640. Throughput: 0: 42763.2. Samples: 4081071360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 16:42:55,391][15401] Updated weights for policy 0, policy_version 249090 (0.0050) [2024-06-22 16:42:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 4081188864. Throughput: 0: 42782.7. Samples: 4081329240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:42:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 16:42:59,807][15401] Updated weights for policy 0, policy_version 249100 (0.0030) [2024-06-22 16:43:03,060][15401] Updated weights for policy 0, policy_version 249110 (0.0043) [2024-06-22 16:43:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4081418240. Throughput: 0: 42603.1. Samples: 4081579420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:43:03,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 16:43:07,570][15401] Updated weights for policy 0, policy_version 249120 (0.0045) [2024-06-22 16:43:08,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 4081664000. Throughput: 0: 42684.9. Samples: 4081712640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:43:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 16:43:10,829][15401] Updated weights for policy 0, policy_version 249130 (0.0041) [2024-06-22 16:43:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4081844224. Throughput: 0: 42666.5. Samples: 4081966120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 16:43:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 16:43:15,015][15401] Updated weights for policy 0, policy_version 249140 (0.0032) [2024-06-22 16:43:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4082057216. Throughput: 0: 42814.4. Samples: 4082223500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:18,390][15132] Avg episode reward: [(0, '0.892')] [2024-06-22 16:43:18,547][15401] Updated weights for policy 0, policy_version 249150 (0.0041) [2024-06-22 16:43:22,663][15401] Updated weights for policy 0, policy_version 249160 (0.0022) [2024-06-22 16:43:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42820.9). Total num frames: 4082270208. Throughput: 0: 42830.7. Samples: 4082357240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 16:43:26,011][15401] Updated weights for policy 0, policy_version 249170 (0.0036) [2024-06-22 16:43:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4082483200. Throughput: 0: 42602.5. Samples: 4082604840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 16:43:30,250][15401] Updated weights for policy 0, policy_version 249180 (0.0033) [2024-06-22 16:43:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.2, 300 sec: 42765.0). Total num frames: 4082696192. Throughput: 0: 42951.7. Samples: 4082872440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 16:43:33,586][15401] Updated weights for policy 0, policy_version 249190 (0.0028) [2024-06-22 16:43:36,965][15349] Signal inference workers to stop experience collection... (60400 times) [2024-06-22 16:43:36,969][15349] Signal inference workers to resume experience collection... (60400 times) [2024-06-22 16:43:36,994][15401] InferenceWorker_p0-w0: stopping experience collection (60400 times) [2024-06-22 16:43:36,994][15401] InferenceWorker_p0-w0: resuming experience collection (60400 times) [2024-06-22 16:43:37,704][15401] Updated weights for policy 0, policy_version 249200 (0.0037) [2024-06-22 16:43:38,396][15132] Fps is (10 sec: 45846.2, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 4082941952. Throughput: 0: 42930.3. Samples: 4083003500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:38,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:43:41,366][15401] Updated weights for policy 0, policy_version 249210 (0.0033) [2024-06-22 16:43:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4083138560. Throughput: 0: 42839.2. Samples: 4083257000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 16:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249215_4083138560.pth... [2024-06-22 16:43:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248587_4072849408.pth [2024-06-22 16:43:45,421][15401] Updated weights for policy 0, policy_version 249220 (0.0033) [2024-06-22 16:43:48,396][15132] Fps is (10 sec: 40959.9, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 4083351552. Throughput: 0: 43062.3. Samples: 4083517500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:48,396][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 16:43:49,044][15401] Updated weights for policy 0, policy_version 249230 (0.0038) [2024-06-22 16:43:53,264][15401] Updated weights for policy 0, policy_version 249240 (0.0037) [2024-06-22 16:43:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 4083548160. Throughput: 0: 42811.5. Samples: 4083639160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 16:43:56,637][15401] Updated weights for policy 0, policy_version 249250 (0.0038) [2024-06-22 16:43:58,389][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4083777536. Throughput: 0: 42750.7. Samples: 4083889900. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:43:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 16:44:00,848][15401] Updated weights for policy 0, policy_version 249260 (0.0041) [2024-06-22 16:44:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.7). Total num frames: 4083974144. Throughput: 0: 42793.7. Samples: 4084149220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:44:03,392][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 16:44:04,190][15401] Updated weights for policy 0, policy_version 249270 (0.0031) [2024-06-22 16:44:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4084187136. Throughput: 0: 42538.6. Samples: 4084271480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:44:08,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-22 16:44:08,529][15401] Updated weights for policy 0, policy_version 249280 (0.0035) [2024-06-22 16:44:11,948][15401] Updated weights for policy 0, policy_version 249290 (0.0026) [2024-06-22 16:44:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4084432896. Throughput: 0: 42765.4. Samples: 4084529280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:44:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 16:44:15,860][15401] Updated weights for policy 0, policy_version 249300 (0.0042) [2024-06-22 16:44:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4084613120. Throughput: 0: 42807.1. Samples: 4084798760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:44:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 16:44:19,639][15401] Updated weights for policy 0, policy_version 249310 (0.0036) [2024-06-22 16:44:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4084842496. Throughput: 0: 42635.8. Samples: 4084921840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 16:44:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 16:44:23,416][15401] Updated weights for policy 0, policy_version 249320 (0.0038) [2024-06-22 16:44:27,226][15401] Updated weights for policy 0, policy_version 249330 (0.0034) [2024-06-22 16:44:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4085071872. Throughput: 0: 42718.3. Samples: 4085179320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 16:44:31,171][15401] Updated weights for policy 0, policy_version 249340 (0.0047) [2024-06-22 16:44:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4085268480. Throughput: 0: 42707.9. Samples: 4085439080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 16:44:34,905][15401] Updated weights for policy 0, policy_version 249350 (0.0049) [2024-06-22 16:44:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42328.1, 300 sec: 42820.2). Total num frames: 4085481472. Throughput: 0: 42760.9. Samples: 4085563500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:38,392][15132] Avg episode reward: [(0, '0.158')] [2024-06-22 16:44:38,990][15401] Updated weights for policy 0, policy_version 249360 (0.0034) [2024-06-22 16:44:42,573][15401] Updated weights for policy 0, policy_version 249370 (0.0034) [2024-06-22 16:44:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4085710848. Throughput: 0: 42955.5. Samples: 4085822900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 16:44:46,586][15401] Updated weights for policy 0, policy_version 249380 (0.0029) [2024-06-22 16:44:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42602.9, 300 sec: 42820.5). Total num frames: 4085907456. Throughput: 0: 43079.6. Samples: 4086087800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:48,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 16:44:50,270][15401] Updated weights for policy 0, policy_version 249390 (0.0033) [2024-06-22 16:44:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4086136832. Throughput: 0: 43232.8. Samples: 4086216960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 16:44:54,008][15401] Updated weights for policy 0, policy_version 249400 (0.0042) [2024-06-22 16:44:54,627][15349] Signal inference workers to stop experience collection... (60450 times) [2024-06-22 16:44:54,628][15349] Signal inference workers to resume experience collection... (60450 times) [2024-06-22 16:44:54,650][15401] InferenceWorker_p0-w0: stopping experience collection (60450 times) [2024-06-22 16:44:54,650][15401] InferenceWorker_p0-w0: resuming experience collection (60450 times) [2024-06-22 16:44:57,675][15401] Updated weights for policy 0, policy_version 249410 (0.0040) [2024-06-22 16:44:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4086349824. Throughput: 0: 43100.0. Samples: 4086468780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:44:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 16:45:01,660][15401] Updated weights for policy 0, policy_version 249420 (0.0052) [2024-06-22 16:45:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4086562816. Throughput: 0: 42917.2. Samples: 4086730040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 16:45:05,172][15401] Updated weights for policy 0, policy_version 249430 (0.0029) [2024-06-22 16:45:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4086759424. Throughput: 0: 43059.7. Samples: 4086859520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 16:45:09,523][15401] Updated weights for policy 0, policy_version 249440 (0.0028) [2024-06-22 16:45:12,759][15401] Updated weights for policy 0, policy_version 249450 (0.0036) [2024-06-22 16:45:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4087005184. Throughput: 0: 42892.9. Samples: 4087109500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 16:45:17,219][15401] Updated weights for policy 0, policy_version 249460 (0.0034) [2024-06-22 16:45:18,392][15132] Fps is (10 sec: 44225.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4087201792. Throughput: 0: 42851.0. Samples: 4087367480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:18,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 16:45:20,542][15401] Updated weights for policy 0, policy_version 249470 (0.0031) [2024-06-22 16:45:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4087414784. Throughput: 0: 42927.2. Samples: 4087495120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 16:45:24,759][15401] Updated weights for policy 0, policy_version 249480 (0.0034) [2024-06-22 16:45:28,220][15401] Updated weights for policy 0, policy_version 249490 (0.0033) [2024-06-22 16:45:28,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4087660544. Throughput: 0: 42858.2. Samples: 4087751520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:28,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 16:45:32,727][15401] Updated weights for policy 0, policy_version 249500 (0.0035) [2024-06-22 16:45:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4087824384. Throughput: 0: 42713.4. Samples: 4088009900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 16:45:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 16:45:35,645][15401] Updated weights for policy 0, policy_version 249510 (0.0037) [2024-06-22 16:45:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 4088053760. Throughput: 0: 42491.7. Samples: 4088129080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:45:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 16:45:40,362][15401] Updated weights for policy 0, policy_version 249520 (0.0037) [2024-06-22 16:45:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4088283136. Throughput: 0: 42578.2. Samples: 4088384800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:45:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 16:45:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249529_4088283136.pth... [2024-06-22 16:45:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000248901_4077993984.pth [2024-06-22 16:45:43,645][15401] Updated weights for policy 0, policy_version 249530 (0.0039) [2024-06-22 16:45:47,929][15401] Updated weights for policy 0, policy_version 249540 (0.0032) [2024-06-22 16:45:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4088479744. Throughput: 0: 42452.9. Samples: 4088640420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:45:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 16:45:51,226][15401] Updated weights for policy 0, policy_version 249550 (0.0036) [2024-06-22 16:45:53,392][15132] Fps is (10 sec: 40947.8, 60 sec: 42596.3, 300 sec: 42875.7). Total num frames: 4088692736. Throughput: 0: 42446.8. Samples: 4088769760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:45:53,393][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 16:45:55,500][15401] Updated weights for policy 0, policy_version 249560 (0.0040) [2024-06-22 16:45:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4088922112. Throughput: 0: 42627.0. Samples: 4089027720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:45:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 16:45:58,946][15401] Updated weights for policy 0, policy_version 249570 (0.0036) [2024-06-22 16:46:03,092][15401] Updated weights for policy 0, policy_version 249580 (0.0028) [2024-06-22 16:46:03,390][15132] Fps is (10 sec: 42610.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4089118720. Throughput: 0: 42608.9. Samples: 4089284780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 16:46:06,966][15401] Updated weights for policy 0, policy_version 249590 (0.0038) [2024-06-22 16:46:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 4089348096. Throughput: 0: 42586.5. Samples: 4089411520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:46:10,733][15401] Updated weights for policy 0, policy_version 249600 (0.0044) [2024-06-22 16:46:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4089561088. Throughput: 0: 42703.2. Samples: 4089673160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 16:46:14,616][15401] Updated weights for policy 0, policy_version 249610 (0.0024) [2024-06-22 16:46:17,562][15349] Signal inference workers to stop experience collection... (60500 times) [2024-06-22 16:46:17,565][15349] Signal inference workers to resume experience collection... (60500 times) [2024-06-22 16:46:17,591][15401] InferenceWorker_p0-w0: stopping experience collection (60500 times) [2024-06-22 16:46:17,620][15401] InferenceWorker_p0-w0: resuming experience collection (60500 times) [2024-06-22 16:46:18,335][15401] Updated weights for policy 0, policy_version 249620 (0.0040) [2024-06-22 16:46:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42876.2). Total num frames: 4089774080. Throughput: 0: 42692.9. Samples: 4089931080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 16:46:22,434][15401] Updated weights for policy 0, policy_version 249630 (0.0037) [2024-06-22 16:46:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4089987072. Throughput: 0: 42943.1. Samples: 4090061520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 16:46:26,169][15401] Updated weights for policy 0, policy_version 249640 (0.0034) [2024-06-22 16:46:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4090200064. Throughput: 0: 42896.0. Samples: 4090315120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:46:30,002][15401] Updated weights for policy 0, policy_version 249650 (0.0035) [2024-06-22 16:46:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4090396672. Throughput: 0: 43008.9. Samples: 4090575820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 16:46:33,729][15401] Updated weights for policy 0, policy_version 249660 (0.0028) [2024-06-22 16:46:37,591][15401] Updated weights for policy 0, policy_version 249670 (0.0034) [2024-06-22 16:46:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4090626048. Throughput: 0: 43028.7. Samples: 4090705920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 16:46:41,270][15401] Updated weights for policy 0, policy_version 249680 (0.0021) [2024-06-22 16:46:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4090839040. Throughput: 0: 43017.7. Samples: 4090963520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 16:46:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 16:46:45,158][15401] Updated weights for policy 0, policy_version 249690 (0.0027) [2024-06-22 16:46:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4091052032. Throughput: 0: 43033.3. Samples: 4091221280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:46:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 16:46:48,791][15401] Updated weights for policy 0, policy_version 249700 (0.0032) [2024-06-22 16:46:52,885][15401] Updated weights for policy 0, policy_version 249710 (0.0048) [2024-06-22 16:46:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.5, 300 sec: 42820.9). Total num frames: 4091265024. Throughput: 0: 43044.4. Samples: 4091348520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:46:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 16:46:56,340][15401] Updated weights for policy 0, policy_version 249720 (0.0044) [2024-06-22 16:46:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4091494400. Throughput: 0: 42967.5. Samples: 4091606700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:46:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 16:47:00,432][15401] Updated weights for policy 0, policy_version 249730 (0.0038) [2024-06-22 16:47:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4091691008. Throughput: 0: 42814.8. Samples: 4091857740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 16:47:03,933][15401] Updated weights for policy 0, policy_version 249740 (0.0031) [2024-06-22 16:47:08,177][15401] Updated weights for policy 0, policy_version 249750 (0.0037) [2024-06-22 16:47:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4091904000. Throughput: 0: 42768.4. Samples: 4091986100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 16:47:11,976][15401] Updated weights for policy 0, policy_version 249760 (0.0041) [2024-06-22 16:47:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4092133376. Throughput: 0: 42877.0. Samples: 4092244580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 16:47:15,723][15401] Updated weights for policy 0, policy_version 249770 (0.0031) [2024-06-22 16:47:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4092346368. Throughput: 0: 42744.6. Samples: 4092499320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 16:47:19,641][15401] Updated weights for policy 0, policy_version 249780 (0.0029) [2024-06-22 16:47:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4092542976. Throughput: 0: 42654.1. Samples: 4092625360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 16:47:23,740][15401] Updated weights for policy 0, policy_version 249790 (0.0031) [2024-06-22 16:47:27,449][15401] Updated weights for policy 0, policy_version 249800 (0.0026) [2024-06-22 16:47:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 4092755968. Throughput: 0: 42580.1. Samples: 4092879620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 16:47:31,534][15401] Updated weights for policy 0, policy_version 249810 (0.0043) [2024-06-22 16:47:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4092985344. Throughput: 0: 42375.1. Samples: 4093128160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 16:47:35,059][15401] Updated weights for policy 0, policy_version 249820 (0.0039) [2024-06-22 16:47:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4093181952. Throughput: 0: 42541.4. Samples: 4093262880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:38,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 16:47:39,134][15401] Updated weights for policy 0, policy_version 249830 (0.0026) [2024-06-22 16:47:40,568][15349] Signal inference workers to stop experience collection... (60550 times) [2024-06-22 16:47:40,568][15349] Signal inference workers to resume experience collection... (60550 times) [2024-06-22 16:47:40,608][15401] InferenceWorker_p0-w0: stopping experience collection (60550 times) [2024-06-22 16:47:40,608][15401] InferenceWorker_p0-w0: resuming experience collection (60550 times) [2024-06-22 16:47:42,724][15401] Updated weights for policy 0, policy_version 249840 (0.0034) [2024-06-22 16:47:43,391][15132] Fps is (10 sec: 39316.9, 60 sec: 42324.5, 300 sec: 42653.8). Total num frames: 4093378560. Throughput: 0: 42405.1. Samples: 4093514980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:43,391][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 16:47:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249841_4093394944.pth... [2024-06-22 16:47:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249215_4083138560.pth [2024-06-22 16:47:46,905][15401] Updated weights for policy 0, policy_version 249850 (0.0037) [2024-06-22 16:47:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4093607936. Throughput: 0: 42529.7. Samples: 4093771580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:48,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 16:47:50,307][15401] Updated weights for policy 0, policy_version 249860 (0.0028) [2024-06-22 16:47:53,389][15132] Fps is (10 sec: 42604.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4093804544. Throughput: 0: 42559.6. Samples: 4093901280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 16:47:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 16:47:54,510][15401] Updated weights for policy 0, policy_version 249870 (0.0040) [2024-06-22 16:47:58,047][15401] Updated weights for policy 0, policy_version 249880 (0.0033) [2024-06-22 16:47:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4094033920. Throughput: 0: 42278.0. Samples: 4094147100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:47:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 16:48:02,186][15401] Updated weights for policy 0, policy_version 249890 (0.0040) [2024-06-22 16:48:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4094246912. Throughput: 0: 42372.0. Samples: 4094406060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 16:48:05,686][15401] Updated weights for policy 0, policy_version 249900 (0.0044) [2024-06-22 16:48:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4094443520. Throughput: 0: 42369.5. Samples: 4094531980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 16:48:09,898][15401] Updated weights for policy 0, policy_version 249910 (0.0028) [2024-06-22 16:48:13,206][15401] Updated weights for policy 0, policy_version 249920 (0.0039) [2024-06-22 16:48:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4094689280. Throughput: 0: 42418.1. Samples: 4094788540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:13,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 16:48:17,536][15401] Updated weights for policy 0, policy_version 249930 (0.0034) [2024-06-22 16:48:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4094885888. Throughput: 0: 42583.2. Samples: 4095044400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 16:48:21,347][15401] Updated weights for policy 0, policy_version 249940 (0.0024) [2024-06-22 16:48:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4095098880. Throughput: 0: 42447.6. Samples: 4095173020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 16:48:25,292][15401] Updated weights for policy 0, policy_version 249950 (0.0031) [2024-06-22 16:48:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4095311872. Throughput: 0: 42494.2. Samples: 4095427160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:28,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-22 16:48:29,018][15401] Updated weights for policy 0, policy_version 249960 (0.0044) [2024-06-22 16:48:33,002][15401] Updated weights for policy 0, policy_version 249970 (0.0040) [2024-06-22 16:48:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42599.3). Total num frames: 4095508480. Throughput: 0: 42560.4. Samples: 4095686800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 16:48:36,482][15401] Updated weights for policy 0, policy_version 249980 (0.0033) [2024-06-22 16:48:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4095737856. Throughput: 0: 42480.8. Samples: 4095812920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 16:48:41,010][15401] Updated weights for policy 0, policy_version 249990 (0.0023) [2024-06-22 16:48:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42872.3, 300 sec: 42710.4). Total num frames: 4095950848. Throughput: 0: 42571.1. Samples: 4096062800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 16:48:44,242][15401] Updated weights for policy 0, policy_version 250000 (0.0031) [2024-06-22 16:48:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 4096131072. Throughput: 0: 42635.5. Samples: 4096324660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 16:48:48,681][15401] Updated weights for policy 0, policy_version 250010 (0.0043) [2024-06-22 16:48:51,945][15401] Updated weights for policy 0, policy_version 250020 (0.0032) [2024-06-22 16:48:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4096376832. Throughput: 0: 42432.9. Samples: 4096441460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:53,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 16:48:56,667][15401] Updated weights for policy 0, policy_version 250030 (0.0040) [2024-06-22 16:48:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4096589824. Throughput: 0: 42464.5. Samples: 4096699340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:48:58,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 16:48:59,659][15401] Updated weights for policy 0, policy_version 250040 (0.0037) [2024-06-22 16:49:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4096770048. Throughput: 0: 42609.3. Samples: 4096961820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 16:49:03,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 16:49:04,206][15401] Updated weights for policy 0, policy_version 250050 (0.0028) [2024-06-22 16:49:05,273][15349] Signal inference workers to stop experience collection... (60600 times) [2024-06-22 16:49:05,300][15401] InferenceWorker_p0-w0: stopping experience collection (60600 times) [2024-06-22 16:49:05,328][15349] Signal inference workers to resume experience collection... (60600 times) [2024-06-22 16:49:05,329][15401] InferenceWorker_p0-w0: resuming experience collection (60600 times) [2024-06-22 16:49:07,260][15401] Updated weights for policy 0, policy_version 250060 (0.0037) [2024-06-22 16:49:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4097015808. Throughput: 0: 42404.4. Samples: 4097081220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 16:49:11,862][15401] Updated weights for policy 0, policy_version 250070 (0.0042) [2024-06-22 16:49:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 4097228800. Throughput: 0: 42553.6. Samples: 4097342080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:13,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 16:49:15,154][15401] Updated weights for policy 0, policy_version 250080 (0.0026) [2024-06-22 16:49:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4097409024. Throughput: 0: 42466.6. Samples: 4097597800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 16:49:19,674][15401] Updated weights for policy 0, policy_version 250090 (0.0032) [2024-06-22 16:49:22,860][15401] Updated weights for policy 0, policy_version 250100 (0.0036) [2024-06-22 16:49:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4097654784. Throughput: 0: 42314.2. Samples: 4097717060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 16:49:27,163][15401] Updated weights for policy 0, policy_version 250110 (0.0024) [2024-06-22 16:49:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4097867776. Throughput: 0: 42623.7. Samples: 4097980860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 16:49:30,403][15401] Updated weights for policy 0, policy_version 250120 (0.0032) [2024-06-22 16:49:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 4098048000. Throughput: 0: 42482.2. Samples: 4098236360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 16:49:35,028][15401] Updated weights for policy 0, policy_version 250130 (0.0038) [2024-06-22 16:49:38,127][15401] Updated weights for policy 0, policy_version 250140 (0.0029) [2024-06-22 16:49:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4098310144. Throughput: 0: 42542.7. Samples: 4098355880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 16:49:42,507][15401] Updated weights for policy 0, policy_version 250150 (0.0036) [2024-06-22 16:49:43,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 4098506752. Throughput: 0: 42784.8. Samples: 4098624760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:43,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 16:49:43,638][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250155_4098539520.pth... [2024-06-22 16:49:43,701][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249529_4088283136.pth [2024-06-22 16:49:45,676][15401] Updated weights for policy 0, policy_version 250160 (0.0026) [2024-06-22 16:49:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4098703360. Throughput: 0: 42515.5. Samples: 4098875020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 16:49:50,196][15401] Updated weights for policy 0, policy_version 250170 (0.0035) [2024-06-22 16:49:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4098932736. Throughput: 0: 42689.0. Samples: 4099002220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 16:49:53,501][15401] Updated weights for policy 0, policy_version 250180 (0.0030) [2024-06-22 16:49:57,756][15401] Updated weights for policy 0, policy_version 250190 (0.0044) [2024-06-22 16:49:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4099162112. Throughput: 0: 42874.7. Samples: 4099271440. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:49:58,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-22 16:50:01,109][15401] Updated weights for policy 0, policy_version 250200 (0.0036) [2024-06-22 16:50:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4099358720. Throughput: 0: 42794.6. Samples: 4099523560. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:50:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 16:50:05,485][15401] Updated weights for policy 0, policy_version 250210 (0.0022) [2024-06-22 16:50:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4099588096. Throughput: 0: 42908.3. Samples: 4099647940. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:50:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 16:50:08,621][15401] Updated weights for policy 0, policy_version 250220 (0.0032) [2024-06-22 16:50:13,122][15401] Updated weights for policy 0, policy_version 250230 (0.0025) [2024-06-22 16:50:13,314][15349] Signal inference workers to stop experience collection... (60650 times) [2024-06-22 16:50:13,337][15401] InferenceWorker_p0-w0: stopping experience collection (60650 times) [2024-06-22 16:50:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 4099784704. Throughput: 0: 43000.5. Samples: 4099915880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-22 16:50:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 16:50:13,427][15349] Signal inference workers to resume experience collection... (60650 times) [2024-06-22 16:50:13,427][15401] InferenceWorker_p0-w0: resuming experience collection (60650 times) [2024-06-22 16:50:16,171][15401] Updated weights for policy 0, policy_version 250240 (0.0028) [2024-06-22 16:50:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4099981312. Throughput: 0: 42941.8. Samples: 4100168740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:18,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 16:50:20,575][15401] Updated weights for policy 0, policy_version 250250 (0.0031) [2024-06-22 16:50:23,393][15132] Fps is (10 sec: 44221.5, 60 sec: 42869.0, 300 sec: 42597.9). Total num frames: 4100227072. Throughput: 0: 42947.0. Samples: 4100288640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:23,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 16:50:23,775][15401] Updated weights for policy 0, policy_version 250260 (0.0041) [2024-06-22 16:50:28,084][15401] Updated weights for policy 0, policy_version 250270 (0.0031) [2024-06-22 16:50:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4100440064. Throughput: 0: 43011.7. Samples: 4100560180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 16:50:31,335][15401] Updated weights for policy 0, policy_version 250280 (0.0032) [2024-06-22 16:50:33,390][15132] Fps is (10 sec: 40973.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4100636672. Throughput: 0: 42935.2. Samples: 4100807100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 16:50:35,895][15401] Updated weights for policy 0, policy_version 250290 (0.0038) [2024-06-22 16:50:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4100882432. Throughput: 0: 42927.0. Samples: 4100933940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 16:50:39,047][15401] Updated weights for policy 0, policy_version 250300 (0.0027) [2024-06-22 16:50:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 4101046272. Throughput: 0: 42726.7. Samples: 4101194140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 16:50:43,724][15401] Updated weights for policy 0, policy_version 250310 (0.0037) [2024-06-22 16:50:46,857][15401] Updated weights for policy 0, policy_version 250320 (0.0032) [2024-06-22 16:50:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.9). Total num frames: 4101292032. Throughput: 0: 42624.5. Samples: 4101441660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:48,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 16:50:51,645][15401] Updated weights for policy 0, policy_version 250330 (0.0033) [2024-06-22 16:50:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4101521408. Throughput: 0: 42936.9. Samples: 4101580100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 16:50:54,216][15401] Updated weights for policy 0, policy_version 250340 (0.0027) [2024-06-22 16:50:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4101685248. Throughput: 0: 42668.8. Samples: 4101835980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:50:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:50:59,087][15401] Updated weights for policy 0, policy_version 250350 (0.0034) [2024-06-22 16:51:02,216][15401] Updated weights for policy 0, policy_version 250360 (0.0036) [2024-06-22 16:51:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4101931008. Throughput: 0: 42506.5. Samples: 4102081540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:51:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 16:51:06,646][15401] Updated weights for policy 0, policy_version 250370 (0.0036) [2024-06-22 16:51:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4102144000. Throughput: 0: 42962.7. Samples: 4102221820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:51:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 16:51:09,786][15401] Updated weights for policy 0, policy_version 250380 (0.0031) [2024-06-22 16:51:13,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 4102307840. Throughput: 0: 42570.5. Samples: 4102475860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:51:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 16:51:14,439][15401] Updated weights for policy 0, policy_version 250390 (0.0029) [2024-06-22 16:51:17,643][15401] Updated weights for policy 0, policy_version 250400 (0.0027) [2024-06-22 16:51:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4102569984. Throughput: 0: 42482.7. Samples: 4102718820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:51:18,394][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 16:51:22,252][15401] Updated weights for policy 0, policy_version 250410 (0.0029) [2024-06-22 16:51:23,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42600.8, 300 sec: 42653.9). Total num frames: 4102782976. Throughput: 0: 42794.7. Samples: 4102859700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 16:51:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 16:51:25,060][15349] Signal inference workers to stop experience collection... (60700 times) [2024-06-22 16:51:25,061][15349] Signal inference workers to resume experience collection... (60700 times) [2024-06-22 16:51:25,112][15401] InferenceWorker_p0-w0: stopping experience collection (60700 times) [2024-06-22 16:51:25,112][15401] InferenceWorker_p0-w0: resuming experience collection (60700 times) [2024-06-22 16:51:25,204][15401] Updated weights for policy 0, policy_version 250420 (0.0039) [2024-06-22 16:51:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4102963200. Throughput: 0: 42537.4. Samples: 4103108320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 16:51:29,837][15401] Updated weights for policy 0, policy_version 250430 (0.0038) [2024-06-22 16:51:32,836][15401] Updated weights for policy 0, policy_version 250440 (0.0044) [2024-06-22 16:51:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4103208960. Throughput: 0: 42607.9. Samples: 4103359020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 16:51:37,654][15401] Updated weights for policy 0, policy_version 250450 (0.0030) [2024-06-22 16:51:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4103421952. Throughput: 0: 42571.3. Samples: 4103495800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 16:51:40,518][15401] Updated weights for policy 0, policy_version 250460 (0.0041) [2024-06-22 16:51:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4103618560. Throughput: 0: 42443.4. Samples: 4103745940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 16:51:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250465_4103618560.pth... [2024-06-22 16:51:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000249841_4093394944.pth [2024-06-22 16:51:45,254][15401] Updated weights for policy 0, policy_version 250470 (0.0038) [2024-06-22 16:51:48,377][15401] Updated weights for policy 0, policy_version 250480 (0.0037) [2024-06-22 16:51:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4103864320. Throughput: 0: 42533.4. Samples: 4103995540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 16:51:53,031][15401] Updated weights for policy 0, policy_version 250490 (0.0032) [2024-06-22 16:51:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 4104028160. Throughput: 0: 42362.2. Samples: 4104128120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:53,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 16:51:55,897][15401] Updated weights for policy 0, policy_version 250500 (0.0038) [2024-06-22 16:51:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4104257536. Throughput: 0: 42312.1. Samples: 4104379900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:51:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 16:52:00,482][15401] Updated weights for policy 0, policy_version 250510 (0.0033) [2024-06-22 16:52:03,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4104503296. Throughput: 0: 42700.5. Samples: 4104640340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 16:52:03,491][15401] Updated weights for policy 0, policy_version 250520 (0.0030) [2024-06-22 16:52:07,917][15401] Updated weights for policy 0, policy_version 250530 (0.0042) [2024-06-22 16:52:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4104683520. Throughput: 0: 42648.0. Samples: 4104778860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 16:52:10,965][15401] Updated weights for policy 0, policy_version 250540 (0.0033) [2024-06-22 16:52:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 4104896512. Throughput: 0: 42742.2. Samples: 4105031720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 16:52:15,603][15401] Updated weights for policy 0, policy_version 250550 (0.0042) [2024-06-22 16:52:18,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4105158656. Throughput: 0: 42927.5. Samples: 4105290760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 16:52:18,455][15401] Updated weights for policy 0, policy_version 250560 (0.0035) [2024-06-22 16:52:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 4105306112. Throughput: 0: 42858.2. Samples: 4105424420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:23,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 16:52:23,598][15401] Updated weights for policy 0, policy_version 250570 (0.0022) [2024-06-22 16:52:25,465][15349] Signal inference workers to stop experience collection... (60750 times) [2024-06-22 16:52:25,466][15349] Signal inference workers to resume experience collection... (60750 times) [2024-06-22 16:52:25,476][15401] InferenceWorker_p0-w0: stopping experience collection (60750 times) [2024-06-22 16:52:25,502][15401] InferenceWorker_p0-w0: resuming experience collection (60750 times) [2024-06-22 16:52:26,335][15401] Updated weights for policy 0, policy_version 250580 (0.0042) [2024-06-22 16:52:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 4105568256. Throughput: 0: 42958.3. Samples: 4105679060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:28,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 16:52:31,032][15401] Updated weights for policy 0, policy_version 250590 (0.0024) [2024-06-22 16:52:33,390][15132] Fps is (10 sec: 49150.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4105797632. Throughput: 0: 43157.7. Samples: 4105937640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 16:52:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 16:52:34,051][15401] Updated weights for policy 0, policy_version 250600 (0.0039) [2024-06-22 16:52:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.7). Total num frames: 4105977856. Throughput: 0: 43172.5. Samples: 4106070880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:52:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 16:52:38,528][15401] Updated weights for policy 0, policy_version 250610 (0.0028) [2024-06-22 16:52:41,604][15401] Updated weights for policy 0, policy_version 250620 (0.0044) [2024-06-22 16:52:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 4106223616. Throughput: 0: 43208.4. Samples: 4106324280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:52:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 16:52:45,981][15401] Updated weights for policy 0, policy_version 250630 (0.0050) [2024-06-22 16:52:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4106436608. Throughput: 0: 43243.8. Samples: 4106586320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:52:48,396][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 16:52:49,349][15401] Updated weights for policy 0, policy_version 250640 (0.0043) [2024-06-22 16:52:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 4106633216. Throughput: 0: 43081.3. Samples: 4106717520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:52:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 16:52:53,439][15401] Updated weights for policy 0, policy_version 250650 (0.0041) [2024-06-22 16:52:57,052][15401] Updated weights for policy 0, policy_version 250660 (0.0040) [2024-06-22 16:52:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4106862592. Throughput: 0: 43175.2. Samples: 4106974600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:52:58,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 16:53:01,150][15401] Updated weights for policy 0, policy_version 250670 (0.0037) [2024-06-22 16:53:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4107075584. Throughput: 0: 43237.8. Samples: 4107236460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 16:53:04,577][15401] Updated weights for policy 0, policy_version 250680 (0.0033) [2024-06-22 16:53:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 4107272192. Throughput: 0: 43072.4. Samples: 4107362680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 16:53:08,917][15401] Updated weights for policy 0, policy_version 250690 (0.0027) [2024-06-22 16:53:12,098][15401] Updated weights for policy 0, policy_version 250700 (0.0028) [2024-06-22 16:53:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4107501568. Throughput: 0: 43113.7. Samples: 4107619180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 16:53:16,647][15401] Updated weights for policy 0, policy_version 250710 (0.0033) [2024-06-22 16:53:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4107714560. Throughput: 0: 43199.3. Samples: 4107881600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 16:53:19,488][15401] Updated weights for policy 0, policy_version 250720 (0.0030) [2024-06-22 16:53:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 4107927552. Throughput: 0: 43109.8. Samples: 4108010820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 16:53:24,066][15401] Updated weights for policy 0, policy_version 250730 (0.0046) [2024-06-22 16:53:26,906][15401] Updated weights for policy 0, policy_version 250740 (0.0039) [2024-06-22 16:53:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4108156928. Throughput: 0: 43254.3. Samples: 4108270720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 16:53:31,675][15401] Updated weights for policy 0, policy_version 250750 (0.0032) [2024-06-22 16:53:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4108369920. Throughput: 0: 43210.7. Samples: 4108530800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 16:53:35,017][15401] Updated weights for policy 0, policy_version 250760 (0.0034) [2024-06-22 16:53:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4108582912. Throughput: 0: 43162.2. Samples: 4108659820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 16:53:39,086][15401] Updated weights for policy 0, policy_version 250770 (0.0028) [2024-06-22 16:53:40,004][15349] Signal inference workers to stop experience collection... (60800 times) [2024-06-22 16:53:40,005][15349] Signal inference workers to resume experience collection... (60800 times) [2024-06-22 16:53:40,038][15401] InferenceWorker_p0-w0: stopping experience collection (60800 times) [2024-06-22 16:53:40,047][15401] InferenceWorker_p0-w0: resuming experience collection (60800 times) [2024-06-22 16:53:42,649][15401] Updated weights for policy 0, policy_version 250780 (0.0039) [2024-06-22 16:53:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4108795904. Throughput: 0: 43319.7. Samples: 4108924000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 16:53:43,391][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 16:53:43,460][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250782_4108812288.pth... [2024-06-22 16:53:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250155_4098539520.pth [2024-06-22 16:53:46,757][15401] Updated weights for policy 0, policy_version 250790 (0.0040) [2024-06-22 16:53:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4109025280. Throughput: 0: 43190.2. Samples: 4109180020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:53:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 16:53:50,463][15401] Updated weights for policy 0, policy_version 250800 (0.0033) [2024-06-22 16:53:53,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4109205504. Throughput: 0: 43191.8. Samples: 4109306320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:53:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 16:53:54,318][15401] Updated weights for policy 0, policy_version 250810 (0.0035) [2024-06-22 16:53:57,864][15401] Updated weights for policy 0, policy_version 250820 (0.0029) [2024-06-22 16:53:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4109451264. Throughput: 0: 43159.6. Samples: 4109561360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:53:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 16:54:01,911][15401] Updated weights for policy 0, policy_version 250830 (0.0045) [2024-06-22 16:54:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4109647872. Throughput: 0: 43040.9. Samples: 4109818440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 16:54:05,353][15401] Updated weights for policy 0, policy_version 250840 (0.0034) [2024-06-22 16:54:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4109860864. Throughput: 0: 42937.9. Samples: 4109943020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:08,390][15132] Avg episode reward: [(0, '0.914')] [2024-06-22 16:54:09,783][15401] Updated weights for policy 0, policy_version 250850 (0.0027) [2024-06-22 16:54:12,980][15401] Updated weights for policy 0, policy_version 250860 (0.0032) [2024-06-22 16:54:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4110090240. Throughput: 0: 42871.5. Samples: 4110199940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 16:54:17,358][15401] Updated weights for policy 0, policy_version 250870 (0.0036) [2024-06-22 16:54:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4110286848. Throughput: 0: 42852.2. Samples: 4110459140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:18,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 16:54:20,672][15401] Updated weights for policy 0, policy_version 250880 (0.0042) [2024-06-22 16:54:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4110499840. Throughput: 0: 42647.1. Samples: 4110578940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:23,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 16:54:24,914][15401] Updated weights for policy 0, policy_version 250890 (0.0032) [2024-06-22 16:54:28,365][15401] Updated weights for policy 0, policy_version 250900 (0.0047) [2024-06-22 16:54:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4110745600. Throughput: 0: 42659.3. Samples: 4110843660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 16:54:32,396][15401] Updated weights for policy 0, policy_version 250910 (0.0022) [2024-06-22 16:54:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4110942208. Throughput: 0: 42777.4. Samples: 4111105000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:33,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 16:54:35,953][15401] Updated weights for policy 0, policy_version 250920 (0.0037) [2024-06-22 16:54:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 4111155200. Throughput: 0: 42707.2. Samples: 4111228140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 16:54:40,320][15401] Updated weights for policy 0, policy_version 250930 (0.0034) [2024-06-22 16:54:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.8, 300 sec: 42987.2). Total num frames: 4111384576. Throughput: 0: 42944.6. Samples: 4111493860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 16:54:43,425][15401] Updated weights for policy 0, policy_version 250940 (0.0031) [2024-06-22 16:54:47,780][15401] Updated weights for policy 0, policy_version 250950 (0.0035) [2024-06-22 16:54:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4111581184. Throughput: 0: 42926.2. Samples: 4111750120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:48,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 16:54:50,853][15401] Updated weights for policy 0, policy_version 250960 (0.0029) [2024-06-22 16:54:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4111794176. Throughput: 0: 43000.9. Samples: 4111878060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 16:54:55,139][15401] Updated weights for policy 0, policy_version 250970 (0.0037) [2024-06-22 16:54:58,376][15401] Updated weights for policy 0, policy_version 250980 (0.0042) [2024-06-22 16:54:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 4112056320. Throughput: 0: 43192.5. Samples: 4112143600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-22 16:54:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 16:55:02,812][15401] Updated weights for policy 0, policy_version 250990 (0.0038) [2024-06-22 16:55:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4112220160. Throughput: 0: 43122.9. Samples: 4112399680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 16:55:04,520][15349] Signal inference workers to stop experience collection... (60850 times) [2024-06-22 16:55:04,563][15401] InferenceWorker_p0-w0: stopping experience collection (60850 times) [2024-06-22 16:55:04,572][15349] Signal inference workers to resume experience collection... (60850 times) [2024-06-22 16:55:04,578][15401] InferenceWorker_p0-w0: resuming experience collection (60850 times) [2024-06-22 16:55:06,191][15401] Updated weights for policy 0, policy_version 251000 (0.0032) [2024-06-22 16:55:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4112449536. Throughput: 0: 43195.0. Samples: 4112522720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 16:55:10,786][15401] Updated weights for policy 0, policy_version 251010 (0.0027) [2024-06-22 16:55:13,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 4112678912. Throughput: 0: 43080.5. Samples: 4112782280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 16:55:13,920][15401] Updated weights for policy 0, policy_version 251020 (0.0030) [2024-06-22 16:55:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42821.0). Total num frames: 4112859136. Throughput: 0: 43003.5. Samples: 4113040160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:18,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 16:55:18,403][15401] Updated weights for policy 0, policy_version 251030 (0.0033) [2024-06-22 16:55:21,656][15401] Updated weights for policy 0, policy_version 251040 (0.0036) [2024-06-22 16:55:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4113088512. Throughput: 0: 42947.6. Samples: 4113160780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 16:55:26,070][15401] Updated weights for policy 0, policy_version 251050 (0.0026) [2024-06-22 16:55:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4113317888. Throughput: 0: 43009.7. Samples: 4113429300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:28,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 16:55:29,254][15401] Updated weights for policy 0, policy_version 251060 (0.0044) [2024-06-22 16:55:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4113514496. Throughput: 0: 42984.4. Samples: 4113684420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 16:55:33,530][15401] Updated weights for policy 0, policy_version 251070 (0.0025) [2024-06-22 16:55:36,812][15401] Updated weights for policy 0, policy_version 251080 (0.0031) [2024-06-22 16:55:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4113727488. Throughput: 0: 42914.3. Samples: 4113809200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 16:55:41,143][15401] Updated weights for policy 0, policy_version 251090 (0.0035) [2024-06-22 16:55:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4113956864. Throughput: 0: 42963.2. Samples: 4114076940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 16:55:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251097_4113973248.pth... [2024-06-22 16:55:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250465_4103618560.pth [2024-06-22 16:55:44,368][15401] Updated weights for policy 0, policy_version 251100 (0.0032) [2024-06-22 16:55:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4114153472. Throughput: 0: 42990.0. Samples: 4114334220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:48,398][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 16:55:48,649][15401] Updated weights for policy 0, policy_version 251110 (0.0029) [2024-06-22 16:55:52,611][15401] Updated weights for policy 0, policy_version 251120 (0.0034) [2024-06-22 16:55:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4114382848. Throughput: 0: 42916.0. Samples: 4114453940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 16:55:56,565][15401] Updated weights for policy 0, policy_version 251130 (0.0033) [2024-06-22 16:55:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 4114595840. Throughput: 0: 43010.4. Samples: 4114717740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:55:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 16:56:00,091][15401] Updated weights for policy 0, policy_version 251140 (0.0025) [2024-06-22 16:56:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4114792448. Throughput: 0: 42963.2. Samples: 4114973500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:56:03,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 16:56:04,467][15401] Updated weights for policy 0, policy_version 251150 (0.0039) [2024-06-22 16:56:07,549][15401] Updated weights for policy 0, policy_version 251160 (0.0030) [2024-06-22 16:56:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.7, 300 sec: 43153.8). Total num frames: 4115038208. Throughput: 0: 43064.1. Samples: 4115098660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 16:56:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 16:56:10,533][15349] Signal inference workers to stop experience collection... (60900 times) [2024-06-22 16:56:10,533][15349] Signal inference workers to resume experience collection... (60900 times) [2024-06-22 16:56:10,546][15401] InferenceWorker_p0-w0: stopping experience collection (60900 times) [2024-06-22 16:56:10,546][15401] InferenceWorker_p0-w0: resuming experience collection (60900 times) [2024-06-22 16:56:11,777][15401] Updated weights for policy 0, policy_version 251170 (0.0035) [2024-06-22 16:56:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4115234816. Throughput: 0: 42901.4. Samples: 4115359860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 16:56:14,985][15401] Updated weights for policy 0, policy_version 251180 (0.0036) [2024-06-22 16:56:18,390][15132] Fps is (10 sec: 40958.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4115447808. Throughput: 0: 43056.7. Samples: 4115621980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 16:56:19,287][15401] Updated weights for policy 0, policy_version 251190 (0.0025) [2024-06-22 16:56:22,573][15401] Updated weights for policy 0, policy_version 251200 (0.0032) [2024-06-22 16:56:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 4115677184. Throughput: 0: 43133.2. Samples: 4115750200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 16:56:26,712][15401] Updated weights for policy 0, policy_version 251210 (0.0027) [2024-06-22 16:56:28,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 4115873792. Throughput: 0: 42979.5. Samples: 4116011020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 16:56:30,156][15401] Updated weights for policy 0, policy_version 251220 (0.0034) [2024-06-22 16:56:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 4116103168. Throughput: 0: 43112.7. Samples: 4116274400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:33,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 16:56:34,178][15401] Updated weights for policy 0, policy_version 251230 (0.0041) [2024-06-22 16:56:38,057][15401] Updated weights for policy 0, policy_version 251240 (0.0030) [2024-06-22 16:56:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 4116332544. Throughput: 0: 43324.2. Samples: 4116403520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 16:56:41,668][15401] Updated weights for policy 0, policy_version 251250 (0.0035) [2024-06-22 16:56:43,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4116529152. Throughput: 0: 43328.6. Samples: 4116667540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 16:56:45,575][15401] Updated weights for policy 0, policy_version 251260 (0.0029) [2024-06-22 16:56:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 4116774912. Throughput: 0: 43453.7. Samples: 4116928920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 16:56:49,307][15401] Updated weights for policy 0, policy_version 251270 (0.0037) [2024-06-22 16:56:53,070][15401] Updated weights for policy 0, policy_version 251280 (0.0027) [2024-06-22 16:56:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 4116987904. Throughput: 0: 43671.9. Samples: 4117063900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 16:56:56,922][15401] Updated weights for policy 0, policy_version 251290 (0.0034) [2024-06-22 16:56:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 4117184512. Throughput: 0: 43451.9. Samples: 4117315200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:56:58,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 16:57:00,570][15401] Updated weights for policy 0, policy_version 251300 (0.0042) [2024-06-22 16:57:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43153.8). Total num frames: 4117413888. Throughput: 0: 43220.6. Samples: 4117566900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:57:03,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 16:57:04,710][15401] Updated weights for policy 0, policy_version 251310 (0.0043) [2024-06-22 16:57:08,154][15401] Updated weights for policy 0, policy_version 251320 (0.0041) [2024-06-22 16:57:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 43209.3). Total num frames: 4117643264. Throughput: 0: 43343.2. Samples: 4117700640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:57:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 16:57:12,210][15401] Updated weights for policy 0, policy_version 251330 (0.0028) [2024-06-22 16:57:13,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 4117807104. Throughput: 0: 43308.3. Samples: 4117960000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:57:13,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 16:57:15,736][15401] Updated weights for policy 0, policy_version 251340 (0.0029) [2024-06-22 16:57:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43264.9). Total num frames: 4118069248. Throughput: 0: 43167.2. Samples: 4118216820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-22 16:57:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 16:57:19,661][15401] Updated weights for policy 0, policy_version 251350 (0.0050) [2024-06-22 16:57:23,145][15401] Updated weights for policy 0, policy_version 251360 (0.0034) [2024-06-22 16:57:23,390][15132] Fps is (10 sec: 47524.3, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 4118282240. Throughput: 0: 43265.5. Samples: 4118350480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:23,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 16:57:23,906][15349] Signal inference workers to stop experience collection... (60950 times) [2024-06-22 16:57:23,906][15349] Signal inference workers to resume experience collection... (60950 times) [2024-06-22 16:57:23,924][15401] InferenceWorker_p0-w0: stopping experience collection (60950 times) [2024-06-22 16:57:23,924][15401] InferenceWorker_p0-w0: resuming experience collection (60950 times) [2024-06-22 16:57:27,175][15401] Updated weights for policy 0, policy_version 251370 (0.0033) [2024-06-22 16:57:28,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 4118446080. Throughput: 0: 43043.2. Samples: 4118604580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:28,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 16:57:30,793][15401] Updated weights for policy 0, policy_version 251380 (0.0038) [2024-06-22 16:57:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43146.3, 300 sec: 43098.3). Total num frames: 4118691840. Throughput: 0: 43025.8. Samples: 4118865080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 16:57:34,700][15401] Updated weights for policy 0, policy_version 251390 (0.0044) [2024-06-22 16:57:38,392][15132] Fps is (10 sec: 47513.6, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 4118921216. Throughput: 0: 43028.8. Samples: 4119000300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:38,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 16:57:38,497][15401] Updated weights for policy 0, policy_version 251400 (0.0032) [2024-06-22 16:57:42,252][15401] Updated weights for policy 0, policy_version 251410 (0.0023) [2024-06-22 16:57:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 4119117824. Throughput: 0: 43177.4. Samples: 4119258180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 16:57:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251411_4119117824.pth... [2024-06-22 16:57:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000250782_4108812288.pth [2024-06-22 16:57:45,955][15401] Updated weights for policy 0, policy_version 251420 (0.0026) [2024-06-22 16:57:48,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 4119363584. Throughput: 0: 43264.4. Samples: 4119513800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 16:57:49,890][15401] Updated weights for policy 0, policy_version 251430 (0.0023) [2024-06-22 16:57:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 4119560192. Throughput: 0: 43366.3. Samples: 4119652120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 16:57:53,641][15401] Updated weights for policy 0, policy_version 251440 (0.0041) [2024-06-22 16:57:57,724][15401] Updated weights for policy 0, policy_version 251450 (0.0041) [2024-06-22 16:57:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4119773184. Throughput: 0: 43115.2. Samples: 4119900080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:57:58,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 16:58:01,283][15401] Updated weights for policy 0, policy_version 251460 (0.0037) [2024-06-22 16:58:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 4120018944. Throughput: 0: 42994.6. Samples: 4120151580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 16:58:05,543][15401] Updated weights for policy 0, policy_version 251470 (0.0033) [2024-06-22 16:58:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4120199168. Throughput: 0: 43074.0. Samples: 4120288800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 16:58:09,092][15401] Updated weights for policy 0, policy_version 251480 (0.0034) [2024-06-22 16:58:13,041][15401] Updated weights for policy 0, policy_version 251490 (0.0033) [2024-06-22 16:58:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43692.5, 300 sec: 43098.3). Total num frames: 4120428544. Throughput: 0: 43037.0. Samples: 4120541140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 16:58:16,985][15401] Updated weights for policy 0, policy_version 251500 (0.0033) [2024-06-22 16:58:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 4120657920. Throughput: 0: 42821.8. Samples: 4120792060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:18,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 16:58:20,626][15401] Updated weights for policy 0, policy_version 251510 (0.0031) [2024-06-22 16:58:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 4120821760. Throughput: 0: 42886.7. Samples: 4120930100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:23,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 16:58:24,520][15401] Updated weights for policy 0, policy_version 251520 (0.0038) [2024-06-22 16:58:28,154][15401] Updated weights for policy 0, policy_version 251530 (0.0031) [2024-06-22 16:58:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43965.5, 300 sec: 43098.3). Total num frames: 4121083904. Throughput: 0: 42916.4. Samples: 4121189420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 16:58:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 16:58:32,136][15349] Signal inference workers to stop experience collection... (61000 times) [2024-06-22 16:58:32,185][15401] InferenceWorker_p0-w0: stopping experience collection (61000 times) [2024-06-22 16:58:32,194][15349] Signal inference workers to resume experience collection... (61000 times) [2024-06-22 16:58:32,201][15401] InferenceWorker_p0-w0: resuming experience collection (61000 times) [2024-06-22 16:58:32,208][15401] Updated weights for policy 0, policy_version 251540 (0.0033) [2024-06-22 16:58:33,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43153.8). Total num frames: 4121313280. Throughput: 0: 42758.3. Samples: 4121437920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 16:58:35,661][15401] Updated weights for policy 0, policy_version 251550 (0.0042) [2024-06-22 16:58:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 4121477120. Throughput: 0: 42685.8. Samples: 4121572980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 16:58:39,725][15401] Updated weights for policy 0, policy_version 251560 (0.0029) [2024-06-22 16:58:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4121706496. Throughput: 0: 42944.5. Samples: 4121832580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 16:58:43,501][15401] Updated weights for policy 0, policy_version 251570 (0.0038) [2024-06-22 16:58:47,291][15401] Updated weights for policy 0, policy_version 251580 (0.0031) [2024-06-22 16:58:48,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43417.7, 300 sec: 43264.9). Total num frames: 4121968640. Throughput: 0: 42868.1. Samples: 4122080640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 16:58:51,127][15401] Updated weights for policy 0, policy_version 251590 (0.0025) [2024-06-22 16:58:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4122116096. Throughput: 0: 42997.7. Samples: 4122223700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 16:58:54,660][15401] Updated weights for policy 0, policy_version 251600 (0.0049) [2024-06-22 16:58:58,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4122345472. Throughput: 0: 43107.9. Samples: 4122481000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:58:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 16:58:58,783][15401] Updated weights for policy 0, policy_version 251610 (0.0036) [2024-06-22 16:59:02,315][15401] Updated weights for policy 0, policy_version 251620 (0.0025) [2024-06-22 16:59:03,392][15132] Fps is (10 sec: 49140.6, 60 sec: 43142.8, 300 sec: 43209.0). Total num frames: 4122607616. Throughput: 0: 43267.9. Samples: 4122739220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:03,392][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 16:59:06,361][15401] Updated weights for policy 0, policy_version 251630 (0.0040) [2024-06-22 16:59:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4122771456. Throughput: 0: 43320.5. Samples: 4122879520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 16:59:09,916][15401] Updated weights for policy 0, policy_version 251640 (0.0036) [2024-06-22 16:59:13,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 4123000832. Throughput: 0: 43055.6. Samples: 4123126920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:13,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 16:59:13,886][15401] Updated weights for policy 0, policy_version 251650 (0.0032) [2024-06-22 16:59:17,534][15401] Updated weights for policy 0, policy_version 251660 (0.0028) [2024-06-22 16:59:18,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43144.4, 300 sec: 43209.3). Total num frames: 4123246592. Throughput: 0: 43247.9. Samples: 4123384080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 16:59:21,447][15401] Updated weights for policy 0, policy_version 251670 (0.0045) [2024-06-22 16:59:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4123410432. Throughput: 0: 43219.9. Samples: 4123517880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 16:59:25,203][15401] Updated weights for policy 0, policy_version 251680 (0.0037) [2024-06-22 16:59:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4123639808. Throughput: 0: 42993.3. Samples: 4123767280. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 16:59:29,191][15401] Updated weights for policy 0, policy_version 251690 (0.0033) [2024-06-22 16:59:32,686][15401] Updated weights for policy 0, policy_version 251700 (0.0025) [2024-06-22 16:59:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 43098.2). Total num frames: 4123869184. Throughput: 0: 43327.4. Samples: 4124030380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 16:59:36,588][15401] Updated weights for policy 0, policy_version 251710 (0.0028) [2024-06-22 16:59:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4124049408. Throughput: 0: 43078.7. Samples: 4124162240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-22 16:59:38,399][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 16:59:40,334][15401] Updated weights for policy 0, policy_version 251720 (0.0038) [2024-06-22 16:59:40,353][15349] Signal inference workers to stop experience collection... (61050 times) [2024-06-22 16:59:40,354][15349] Signal inference workers to resume experience collection... (61050 times) [2024-06-22 16:59:40,387][15401] InferenceWorker_p0-w0: stopping experience collection (61050 times) [2024-06-22 16:59:40,387][15401] InferenceWorker_p0-w0: resuming experience collection (61050 times) [2024-06-22 16:59:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 4124295168. Throughput: 0: 42902.8. Samples: 4124411620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 16:59:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 16:59:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251727_4124295168.pth... [2024-06-22 16:59:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251097_4113973248.pth [2024-06-22 16:59:44,430][15401] Updated weights for policy 0, policy_version 251730 (0.0042) [2024-06-22 16:59:47,946][15401] Updated weights for policy 0, policy_version 251740 (0.0023) [2024-06-22 16:59:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.2, 300 sec: 43153.8). Total num frames: 4124524544. Throughput: 0: 42875.0. Samples: 4124668500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 16:59:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 16:59:52,214][15401] Updated weights for policy 0, policy_version 251750 (0.0040) [2024-06-22 16:59:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4124688384. Throughput: 0: 42703.6. Samples: 4124801180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 16:59:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 16:59:55,682][15401] Updated weights for policy 0, policy_version 251760 (0.0039) [2024-06-22 16:59:58,390][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 4124934144. Throughput: 0: 42712.4. Samples: 4125048980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 16:59:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 17:00:00,247][15401] Updated weights for policy 0, policy_version 251770 (0.0027) [2024-06-22 17:00:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42327.0, 300 sec: 43042.7). Total num frames: 4125147136. Throughput: 0: 42733.4. Samples: 4125307080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 17:00:03,562][15401] Updated weights for policy 0, policy_version 251780 (0.0037) [2024-06-22 17:00:07,860][15401] Updated weights for policy 0, policy_version 251790 (0.0037) [2024-06-22 17:00:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4125343744. Throughput: 0: 42605.3. Samples: 4125435120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 17:00:11,204][15401] Updated weights for policy 0, policy_version 251800 (0.0030) [2024-06-22 17:00:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 4125589504. Throughput: 0: 42670.3. Samples: 4125687440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:13,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 17:00:15,379][15401] Updated weights for policy 0, policy_version 251810 (0.0034) [2024-06-22 17:00:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 43042.7). Total num frames: 4125786112. Throughput: 0: 42685.0. Samples: 4125951200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 17:00:18,756][15401] Updated weights for policy 0, policy_version 251820 (0.0040) [2024-06-22 17:00:22,936][15401] Updated weights for policy 0, policy_version 251830 (0.0031) [2024-06-22 17:00:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4125982720. Throughput: 0: 42508.5. Samples: 4126075120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 17:00:26,431][15401] Updated weights for policy 0, policy_version 251840 (0.0027) [2024-06-22 17:00:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 4126228480. Throughput: 0: 42684.8. Samples: 4126332440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:28,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 17:00:30,471][15401] Updated weights for policy 0, policy_version 251850 (0.0029) [2024-06-22 17:00:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 4126441472. Throughput: 0: 42735.2. Samples: 4126591580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:33,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 17:00:33,982][15401] Updated weights for policy 0, policy_version 251860 (0.0033) [2024-06-22 17:00:38,083][15401] Updated weights for policy 0, policy_version 251870 (0.0028) [2024-06-22 17:00:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 4126638080. Throughput: 0: 42674.2. Samples: 4126721520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 17:00:41,581][15401] Updated weights for policy 0, policy_version 251880 (0.0025) [2024-06-22 17:00:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 4126851072. Throughput: 0: 42869.8. Samples: 4126978120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:43,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 17:00:45,734][15401] Updated weights for policy 0, policy_version 251890 (0.0032) [2024-06-22 17:00:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.6, 300 sec: 43042.7). Total num frames: 4127080448. Throughput: 0: 42919.6. Samples: 4127238460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-22 17:00:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 17:00:49,350][15349] Signal inference workers to stop experience collection... (61100 times) [2024-06-22 17:00:49,370][15401] InferenceWorker_p0-w0: stopping experience collection (61100 times) [2024-06-22 17:00:49,409][15349] Signal inference workers to resume experience collection... (61100 times) [2024-06-22 17:00:49,409][15401] InferenceWorker_p0-w0: resuming experience collection (61100 times) [2024-06-22 17:00:49,410][15401] Updated weights for policy 0, policy_version 251900 (0.0039) [2024-06-22 17:00:53,393][15132] Fps is (10 sec: 40944.7, 60 sec: 42868.7, 300 sec: 42931.1). Total num frames: 4127260672. Throughput: 0: 43017.4. Samples: 4127371060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:00:53,394][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 17:00:53,763][15401] Updated weights for policy 0, policy_version 251910 (0.0036) [2024-06-22 17:00:56,926][15401] Updated weights for policy 0, policy_version 251920 (0.0035) [2024-06-22 17:00:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4127490048. Throughput: 0: 42827.9. Samples: 4127614700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:00:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 17:01:01,323][15401] Updated weights for policy 0, policy_version 251930 (0.0039) [2024-06-22 17:01:03,390][15132] Fps is (10 sec: 45892.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4127719424. Throughput: 0: 42836.4. Samples: 4127878840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 17:01:04,528][15401] Updated weights for policy 0, policy_version 251940 (0.0035) [2024-06-22 17:01:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4127916032. Throughput: 0: 42974.6. Samples: 4128008980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:08,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 17:01:08,892][15401] Updated weights for policy 0, policy_version 251950 (0.0041) [2024-06-22 17:01:12,486][15401] Updated weights for policy 0, policy_version 251960 (0.0037) [2024-06-22 17:01:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4128145408. Throughput: 0: 42796.5. Samples: 4128258280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 17:01:16,617][15401] Updated weights for policy 0, policy_version 251970 (0.0025) [2024-06-22 17:01:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4128374784. Throughput: 0: 42718.6. Samples: 4128513920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 17:01:20,017][15401] Updated weights for policy 0, policy_version 251980 (0.0044) [2024-06-22 17:01:23,394][15132] Fps is (10 sec: 42578.5, 60 sec: 43141.2, 300 sec: 43042.0). Total num frames: 4128571392. Throughput: 0: 42763.5. Samples: 4128646080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:23,395][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 17:01:24,374][15401] Updated weights for policy 0, policy_version 251990 (0.0030) [2024-06-22 17:01:27,550][15401] Updated weights for policy 0, policy_version 252000 (0.0029) [2024-06-22 17:01:28,393][15132] Fps is (10 sec: 40945.3, 60 sec: 42595.8, 300 sec: 42987.0). Total num frames: 4128784384. Throughput: 0: 42818.2. Samples: 4128905100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:28,394][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:01:32,003][15401] Updated weights for policy 0, policy_version 252010 (0.0036) [2024-06-22 17:01:33,390][15132] Fps is (10 sec: 44257.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4129013760. Throughput: 0: 42632.0. Samples: 4129156900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:33,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 17:01:35,232][15401] Updated weights for policy 0, policy_version 252020 (0.0045) [2024-06-22 17:01:38,390][15132] Fps is (10 sec: 40975.3, 60 sec: 42598.3, 300 sec: 42931.7). Total num frames: 4129193984. Throughput: 0: 42600.0. Samples: 4129287900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 17:01:39,679][15401] Updated weights for policy 0, policy_version 252030 (0.0030) [2024-06-22 17:01:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4129406976. Throughput: 0: 42752.9. Samples: 4129538580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 17:01:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252040_4129423360.pth... [2024-06-22 17:01:43,494][15401] Updated weights for policy 0, policy_version 252040 (0.0032) [2024-06-22 17:01:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251411_4119117824.pth [2024-06-22 17:01:47,247][15401] Updated weights for policy 0, policy_version 252050 (0.0037) [2024-06-22 17:01:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4129636352. Throughput: 0: 42564.0. Samples: 4129794220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 17:01:51,110][15401] Updated weights for policy 0, policy_version 252060 (0.0032) [2024-06-22 17:01:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.2, 300 sec: 42876.1). Total num frames: 4129832960. Throughput: 0: 42473.4. Samples: 4129920280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:53,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 17:01:54,869][15401] Updated weights for policy 0, policy_version 252070 (0.0039) [2024-06-22 17:01:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4130062336. Throughput: 0: 42538.2. Samples: 4130172500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 17:01:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 17:01:58,798][15401] Updated weights for policy 0, policy_version 252080 (0.0032) [2024-06-22 17:02:02,815][15401] Updated weights for policy 0, policy_version 252090 (0.0030) [2024-06-22 17:02:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4130258944. Throughput: 0: 42602.3. Samples: 4130431020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 17:02:04,003][15349] Signal inference workers to stop experience collection... (61150 times) [2024-06-22 17:02:04,064][15401] InferenceWorker_p0-w0: stopping experience collection (61150 times) [2024-06-22 17:02:04,066][15349] Signal inference workers to resume experience collection... (61150 times) [2024-06-22 17:02:04,076][15401] InferenceWorker_p0-w0: resuming experience collection (61150 times) [2024-06-22 17:02:06,498][15401] Updated weights for policy 0, policy_version 252100 (0.0036) [2024-06-22 17:02:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 4130471936. Throughput: 0: 42461.7. Samples: 4130556660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 17:02:10,336][15401] Updated weights for policy 0, policy_version 252110 (0.0030) [2024-06-22 17:02:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4130701312. Throughput: 0: 42407.6. Samples: 4130813280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 17:02:13,986][15401] Updated weights for policy 0, policy_version 252120 (0.0036) [2024-06-22 17:02:18,049][15401] Updated weights for policy 0, policy_version 252130 (0.0032) [2024-06-22 17:02:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4130897920. Throughput: 0: 42498.6. Samples: 4131069340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 17:02:21,549][15401] Updated weights for policy 0, policy_version 252140 (0.0041) [2024-06-22 17:02:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42601.7, 300 sec: 42987.5). Total num frames: 4131127296. Throughput: 0: 42391.2. Samples: 4131195500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 17:02:25,689][15401] Updated weights for policy 0, policy_version 252150 (0.0028) [2024-06-22 17:02:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42874.1, 300 sec: 42931.6). Total num frames: 4131356672. Throughput: 0: 42640.0. Samples: 4131457380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 17:02:29,501][15401] Updated weights for policy 0, policy_version 252160 (0.0029) [2024-06-22 17:02:33,267][15401] Updated weights for policy 0, policy_version 252170 (0.0037) [2024-06-22 17:02:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 4131553280. Throughput: 0: 42549.3. Samples: 4131708940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 17:02:36,984][15401] Updated weights for policy 0, policy_version 252180 (0.0023) [2024-06-22 17:02:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4131766272. Throughput: 0: 42621.0. Samples: 4131838220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 17:02:41,288][15401] Updated weights for policy 0, policy_version 252190 (0.0032) [2024-06-22 17:02:43,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 4131979264. Throughput: 0: 42673.4. Samples: 4132093080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:43,397][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 17:02:44,706][15401] Updated weights for policy 0, policy_version 252200 (0.0041) [2024-06-22 17:02:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4132175872. Throughput: 0: 42720.4. Samples: 4132353440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 17:02:48,710][15401] Updated weights for policy 0, policy_version 252210 (0.0029) [2024-06-22 17:02:52,355][15401] Updated weights for policy 0, policy_version 252220 (0.0032) [2024-06-22 17:02:53,390][15132] Fps is (10 sec: 42625.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4132405248. Throughput: 0: 42734.2. Samples: 4132479700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:53,394][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 17:02:56,414][15401] Updated weights for policy 0, policy_version 252230 (0.0041) [2024-06-22 17:02:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4132618240. Throughput: 0: 42763.5. Samples: 4132737640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:02:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 17:02:59,970][15401] Updated weights for policy 0, policy_version 252240 (0.0036) [2024-06-22 17:03:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4132814848. Throughput: 0: 42777.7. Samples: 4132994340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:03:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 17:03:04,103][15401] Updated weights for policy 0, policy_version 252250 (0.0027) [2024-06-22 17:03:07,658][15401] Updated weights for policy 0, policy_version 252260 (0.0033) [2024-06-22 17:03:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4133044224. Throughput: 0: 42684.3. Samples: 4133116300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 17:03:08,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 17:03:11,669][15401] Updated weights for policy 0, policy_version 252270 (0.0042) [2024-06-22 17:03:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4133257216. Throughput: 0: 42660.5. Samples: 4133377100. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:13,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 17:03:15,186][15401] Updated weights for policy 0, policy_version 252280 (0.0040) [2024-06-22 17:03:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 4133470208. Throughput: 0: 42745.7. Samples: 4133632600. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:18,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 17:03:19,571][15401] Updated weights for policy 0, policy_version 252290 (0.0031) [2024-06-22 17:03:22,617][15401] Updated weights for policy 0, policy_version 252300 (0.0031) [2024-06-22 17:03:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4133699584. Throughput: 0: 42720.4. Samples: 4133760640. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 17:03:26,981][15401] Updated weights for policy 0, policy_version 252310 (0.0032) [2024-06-22 17:03:28,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 4133896192. Throughput: 0: 42816.3. Samples: 4134019640. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:28,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 17:03:30,277][15401] Updated weights for policy 0, policy_version 252320 (0.0039) [2024-06-22 17:03:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4134109184. Throughput: 0: 42798.8. Samples: 4134279380. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 17:03:34,804][15401] Updated weights for policy 0, policy_version 252330 (0.0033) [2024-06-22 17:03:37,100][15349] Signal inference workers to stop experience collection... (61200 times) [2024-06-22 17:03:37,101][15349] Signal inference workers to resume experience collection... (61200 times) [2024-06-22 17:03:37,148][15401] InferenceWorker_p0-w0: stopping experience collection (61200 times) [2024-06-22 17:03:37,148][15401] InferenceWorker_p0-w0: resuming experience collection (61200 times) [2024-06-22 17:03:38,204][15401] Updated weights for policy 0, policy_version 252340 (0.0042) [2024-06-22 17:03:38,390][15132] Fps is (10 sec: 45885.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4134354944. Throughput: 0: 42815.0. Samples: 4134406380. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 17:03:42,256][15401] Updated weights for policy 0, policy_version 252350 (0.0040) [2024-06-22 17:03:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 4134551552. Throughput: 0: 42792.8. Samples: 4134663320. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:43,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-22 17:03:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252353_4134551552.pth... [2024-06-22 17:03:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000251727_4124295168.pth [2024-06-22 17:03:46,056][15401] Updated weights for policy 0, policy_version 252360 (0.0032) [2024-06-22 17:03:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4134748160. Throughput: 0: 42835.2. Samples: 4134921920. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 17:03:49,898][15401] Updated weights for policy 0, policy_version 252370 (0.0028) [2024-06-22 17:03:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4134961152. Throughput: 0: 42908.0. Samples: 4135047160. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 17:03:54,096][15401] Updated weights for policy 0, policy_version 252380 (0.0035) [2024-06-22 17:03:57,480][15401] Updated weights for policy 0, policy_version 252390 (0.0023) [2024-06-22 17:03:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 4135206912. Throughput: 0: 42823.5. Samples: 4135304160. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:03:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 17:04:01,644][15401] Updated weights for policy 0, policy_version 252400 (0.0029) [2024-06-22 17:04:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4135403520. Throughput: 0: 42878.2. Samples: 4135562020. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:04:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 17:04:05,334][15401] Updated weights for policy 0, policy_version 252410 (0.0037) [2024-06-22 17:04:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4135616512. Throughput: 0: 42821.3. Samples: 4135687600. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:04:08,390][15132] Avg episode reward: [(0, '0.175')] [2024-06-22 17:04:09,512][15401] Updated weights for policy 0, policy_version 252420 (0.0038) [2024-06-22 17:04:12,829][15401] Updated weights for policy 0, policy_version 252430 (0.0037) [2024-06-22 17:04:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4135813120. Throughput: 0: 42830.3. Samples: 4135946900. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:04:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 17:04:17,319][15401] Updated weights for policy 0, policy_version 252440 (0.0034) [2024-06-22 17:04:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 4136042496. Throughput: 0: 42558.0. Samples: 4136194500. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-06-22 17:04:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 17:04:20,925][15401] Updated weights for policy 0, policy_version 252450 (0.0053) [2024-06-22 17:04:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4136239104. Throughput: 0: 42527.2. Samples: 4136320100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 17:04:25,061][15401] Updated weights for policy 0, policy_version 252460 (0.0036) [2024-06-22 17:04:28,253][15401] Updated weights for policy 0, policy_version 252470 (0.0035) [2024-06-22 17:04:28,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 4136468480. Throughput: 0: 42620.9. Samples: 4136581360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:28,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:04:32,576][15401] Updated weights for policy 0, policy_version 252480 (0.0056) [2024-06-22 17:04:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4136681472. Throughput: 0: 42520.9. Samples: 4136835360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:04:35,941][15401] Updated weights for policy 0, policy_version 252490 (0.0038) [2024-06-22 17:04:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4136878080. Throughput: 0: 42458.7. Samples: 4136957800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 17:04:40,353][15401] Updated weights for policy 0, policy_version 252500 (0.0023) [2024-06-22 17:04:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4137107456. Throughput: 0: 42590.8. Samples: 4137220740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 17:04:43,452][15401] Updated weights for policy 0, policy_version 252510 (0.0031) [2024-06-22 17:04:47,830][15401] Updated weights for policy 0, policy_version 252520 (0.0030) [2024-06-22 17:04:48,155][15349] Signal inference workers to stop experience collection... (61250 times) [2024-06-22 17:04:48,155][15349] Signal inference workers to resume experience collection... (61250 times) [2024-06-22 17:04:48,185][15401] InferenceWorker_p0-w0: stopping experience collection (61250 times) [2024-06-22 17:04:48,186][15401] InferenceWorker_p0-w0: resuming experience collection (61250 times) [2024-06-22 17:04:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4137320448. Throughput: 0: 42598.3. Samples: 4137478940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 17:04:51,203][15401] Updated weights for policy 0, policy_version 252530 (0.0030) [2024-06-22 17:04:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4137533440. Throughput: 0: 42617.7. Samples: 4137605400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 17:04:55,491][15401] Updated weights for policy 0, policy_version 252540 (0.0044) [2024-06-22 17:04:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4137746432. Throughput: 0: 42550.7. Samples: 4137861680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:04:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 17:04:58,911][15401] Updated weights for policy 0, policy_version 252550 (0.0027) [2024-06-22 17:05:03,262][15401] Updated weights for policy 0, policy_version 252560 (0.0039) [2024-06-22 17:05:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4137943040. Throughput: 0: 42842.4. Samples: 4138122400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 17:05:06,410][15401] Updated weights for policy 0, policy_version 252570 (0.0038) [2024-06-22 17:05:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4138172416. Throughput: 0: 42797.8. Samples: 4138246000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 17:05:10,722][15401] Updated weights for policy 0, policy_version 252580 (0.0031) [2024-06-22 17:05:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4138401792. Throughput: 0: 42790.7. Samples: 4138506840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 17:05:14,323][15401] Updated weights for policy 0, policy_version 252590 (0.0031) [2024-06-22 17:05:18,333][15401] Updated weights for policy 0, policy_version 252600 (0.0039) [2024-06-22 17:05:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4138598400. Throughput: 0: 42820.9. Samples: 4138762300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 17:05:21,911][15401] Updated weights for policy 0, policy_version 252610 (0.0031) [2024-06-22 17:05:23,395][15132] Fps is (10 sec: 40939.0, 60 sec: 42867.8, 300 sec: 42653.2). Total num frames: 4138811392. Throughput: 0: 42782.7. Samples: 4138883240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:23,395][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 17:05:25,851][15401] Updated weights for policy 0, policy_version 252620 (0.0036) [2024-06-22 17:05:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 4139040768. Throughput: 0: 42728.8. Samples: 4139143540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 17:05:29,392][15401] Updated weights for policy 0, policy_version 252630 (0.0042) [2024-06-22 17:05:33,390][15132] Fps is (10 sec: 42620.0, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 4139237376. Throughput: 0: 42799.0. Samples: 4139404900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 17:05:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 17:05:33,455][15401] Updated weights for policy 0, policy_version 252640 (0.0022) [2024-06-22 17:05:37,347][15401] Updated weights for policy 0, policy_version 252650 (0.0038) [2024-06-22 17:05:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4139450368. Throughput: 0: 42749.0. Samples: 4139529100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:05:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 17:05:41,112][15401] Updated weights for policy 0, policy_version 252660 (0.0045) [2024-06-22 17:05:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4139679744. Throughput: 0: 42889.7. Samples: 4139791720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:05:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 17:05:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252666_4139679744.pth... [2024-06-22 17:05:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252040_4129423360.pth [2024-06-22 17:05:44,836][15401] Updated weights for policy 0, policy_version 252670 (0.0040) [2024-06-22 17:05:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 4139876352. Throughput: 0: 42879.6. Samples: 4140051980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:05:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 17:05:48,634][15401] Updated weights for policy 0, policy_version 252680 (0.0032) [2024-06-22 17:05:52,543][15401] Updated weights for policy 0, policy_version 252690 (0.0033) [2024-06-22 17:05:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4140089344. Throughput: 0: 42873.0. Samples: 4140175280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:05:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 17:05:56,668][15401] Updated weights for policy 0, policy_version 252700 (0.0046) [2024-06-22 17:05:57,325][15349] Signal inference workers to stop experience collection... (61300 times) [2024-06-22 17:05:57,370][15401] InferenceWorker_p0-w0: stopping experience collection (61300 times) [2024-06-22 17:05:57,376][15349] Signal inference workers to resume experience collection... (61300 times) [2024-06-22 17:05:57,385][15401] InferenceWorker_p0-w0: resuming experience collection (61300 times) [2024-06-22 17:05:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4140302336. Throughput: 0: 42759.6. Samples: 4140431020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:05:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 17:06:00,093][15401] Updated weights for policy 0, policy_version 252710 (0.0042) [2024-06-22 17:06:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4140515328. Throughput: 0: 42794.1. Samples: 4140688040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 17:06:04,299][15401] Updated weights for policy 0, policy_version 252720 (0.0031) [2024-06-22 17:06:07,622][15401] Updated weights for policy 0, policy_version 252730 (0.0024) [2024-06-22 17:06:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4140744704. Throughput: 0: 42851.1. Samples: 4140811320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 17:06:11,821][15401] Updated weights for policy 0, policy_version 252740 (0.0035) [2024-06-22 17:06:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4140957696. Throughput: 0: 42856.5. Samples: 4141072080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 17:06:15,176][15401] Updated weights for policy 0, policy_version 252750 (0.0030) [2024-06-22 17:06:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 4141154304. Throughput: 0: 42716.5. Samples: 4141327140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:18,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:06:19,739][15401] Updated weights for policy 0, policy_version 252760 (0.0036) [2024-06-22 17:06:23,247][15401] Updated weights for policy 0, policy_version 252770 (0.0028) [2024-06-22 17:06:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42875.2, 300 sec: 42710.0). Total num frames: 4141383680. Throughput: 0: 42722.6. Samples: 4141451620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 17:06:27,358][15401] Updated weights for policy 0, policy_version 252780 (0.0026) [2024-06-22 17:06:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4141613056. Throughput: 0: 42779.1. Samples: 4141716780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 17:06:30,890][15401] Updated weights for policy 0, policy_version 252790 (0.0026) [2024-06-22 17:06:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4141793280. Throughput: 0: 42560.3. Samples: 4141967200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:33,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 17:06:34,955][15401] Updated weights for policy 0, policy_version 252800 (0.0046) [2024-06-22 17:06:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4142022656. Throughput: 0: 42594.1. Samples: 4142092020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:38,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 17:06:38,439][15401] Updated weights for policy 0, policy_version 252810 (0.0049) [2024-06-22 17:06:42,718][15401] Updated weights for policy 0, policy_version 252820 (0.0042) [2024-06-22 17:06:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4142219264. Throughput: 0: 42708.4. Samples: 4142352900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 17:06:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 17:06:45,999][15401] Updated weights for policy 0, policy_version 252830 (0.0033) [2024-06-22 17:06:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4142432256. Throughput: 0: 42638.2. Samples: 4142606760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:06:48,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 17:06:50,373][15401] Updated weights for policy 0, policy_version 252840 (0.0026) [2024-06-22 17:06:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4142661632. Throughput: 0: 42629.9. Samples: 4142729660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:06:53,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 17:06:53,658][15401] Updated weights for policy 0, policy_version 252850 (0.0039) [2024-06-22 17:06:58,127][15401] Updated weights for policy 0, policy_version 252860 (0.0034) [2024-06-22 17:06:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4142858240. Throughput: 0: 42541.2. Samples: 4142986440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:06:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 17:07:01,587][15401] Updated weights for policy 0, policy_version 252870 (0.0027) [2024-06-22 17:07:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4143038464. Throughput: 0: 42488.5. Samples: 4143239120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 17:07:05,785][15401] Updated weights for policy 0, policy_version 252880 (0.0029) [2024-06-22 17:07:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4143300608. Throughput: 0: 42538.2. Samples: 4143365840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 17:07:09,427][15401] Updated weights for policy 0, policy_version 252890 (0.0030) [2024-06-22 17:07:11,428][15349] Signal inference workers to stop experience collection... (61350 times) [2024-06-22 17:07:11,428][15349] Signal inference workers to resume experience collection... (61350 times) [2024-06-22 17:07:11,472][15401] InferenceWorker_p0-w0: stopping experience collection (61350 times) [2024-06-22 17:07:11,472][15401] InferenceWorker_p0-w0: resuming experience collection (61350 times) [2024-06-22 17:07:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4143497216. Throughput: 0: 42403.6. Samples: 4143624940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 17:07:13,643][15401] Updated weights for policy 0, policy_version 252900 (0.0037) [2024-06-22 17:07:16,997][15401] Updated weights for policy 0, policy_version 252910 (0.0035) [2024-06-22 17:07:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4143693824. Throughput: 0: 42500.5. Samples: 4143879720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 17:07:21,321][15401] Updated weights for policy 0, policy_version 252920 (0.0041) [2024-06-22 17:07:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4143939584. Throughput: 0: 42628.8. Samples: 4144010320. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 17:07:24,663][15401] Updated weights for policy 0, policy_version 252930 (0.0051) [2024-06-22 17:07:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4144136192. Throughput: 0: 42574.2. Samples: 4144268740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 17:07:29,038][15401] Updated weights for policy 0, policy_version 252940 (0.0029) [2024-06-22 17:07:32,119][15401] Updated weights for policy 0, policy_version 252950 (0.0040) [2024-06-22 17:07:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4144349184. Throughput: 0: 42585.8. Samples: 4144523120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 17:07:36,499][15401] Updated weights for policy 0, policy_version 252960 (0.0034) [2024-06-22 17:07:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 4144594944. Throughput: 0: 42670.1. Samples: 4144649820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 17:07:39,670][15401] Updated weights for policy 0, policy_version 252970 (0.0029) [2024-06-22 17:07:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4144758784. Throughput: 0: 42752.2. Samples: 4144910280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 17:07:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252978_4144791552.pth... [2024-06-22 17:07:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252353_4134551552.pth [2024-06-22 17:07:44,114][15401] Updated weights for policy 0, policy_version 252980 (0.0038) [2024-06-22 17:07:47,499][15401] Updated weights for policy 0, policy_version 252990 (0.0027) [2024-06-22 17:07:48,394][15132] Fps is (10 sec: 39303.5, 60 sec: 42595.2, 300 sec: 42653.3). Total num frames: 4144988160. Throughput: 0: 42634.3. Samples: 4145157860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:48,395][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 17:07:51,688][15401] Updated weights for policy 0, policy_version 253000 (0.0030) [2024-06-22 17:07:53,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4145233920. Throughput: 0: 42620.0. Samples: 4145283740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 25.0) [2024-06-22 17:07:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 17:07:55,250][15401] Updated weights for policy 0, policy_version 253010 (0.0032) [2024-06-22 17:07:58,389][15132] Fps is (10 sec: 40978.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4145397760. Throughput: 0: 42717.8. Samples: 4145547240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:07:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 17:07:59,403][15401] Updated weights for policy 0, policy_version 253020 (0.0023) [2024-06-22 17:08:03,361][15401] Updated weights for policy 0, policy_version 253030 (0.0032) [2024-06-22 17:08:03,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 4145643520. Throughput: 0: 42480.0. Samples: 4145791420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:03,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 17:08:05,388][15349] Signal inference workers to stop experience collection... (61400 times) [2024-06-22 17:08:05,402][15401] InferenceWorker_p0-w0: stopping experience collection (61400 times) [2024-06-22 17:08:05,451][15349] Signal inference workers to resume experience collection... (61400 times) [2024-06-22 17:08:05,451][15401] InferenceWorker_p0-w0: resuming experience collection (61400 times) [2024-06-22 17:08:06,954][15401] Updated weights for policy 0, policy_version 253040 (0.0035) [2024-06-22 17:08:08,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4145872896. Throughput: 0: 42537.9. Samples: 4145924520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 17:08:10,946][15401] Updated weights for policy 0, policy_version 253050 (0.0033) [2024-06-22 17:08:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 4146036736. Throughput: 0: 42739.3. Samples: 4146192000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:08:14,696][15401] Updated weights for policy 0, policy_version 253060 (0.0028) [2024-06-22 17:08:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4146282496. Throughput: 0: 42627.2. Samples: 4146441340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 17:08:18,490][15401] Updated weights for policy 0, policy_version 253070 (0.0032) [2024-06-22 17:08:22,386][15401] Updated weights for policy 0, policy_version 253080 (0.0029) [2024-06-22 17:08:23,390][15132] Fps is (10 sec: 49148.4, 60 sec: 43144.1, 300 sec: 42820.8). Total num frames: 4146528256. Throughput: 0: 42805.6. Samples: 4146576100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:23,391][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 17:08:26,111][15401] Updated weights for policy 0, policy_version 253090 (0.0045) [2024-06-22 17:08:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4146675712. Throughput: 0: 42780.5. Samples: 4146835400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 17:08:30,055][15401] Updated weights for policy 0, policy_version 253100 (0.0038) [2024-06-22 17:08:33,390][15132] Fps is (10 sec: 40962.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4146937856. Throughput: 0: 42793.6. Samples: 4147083380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 17:08:33,649][15401] Updated weights for policy 0, policy_version 253110 (0.0027) [2024-06-22 17:08:37,975][15401] Updated weights for policy 0, policy_version 253120 (0.0029) [2024-06-22 17:08:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4147150848. Throughput: 0: 43062.9. Samples: 4147221560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 17:08:41,228][15401] Updated weights for policy 0, policy_version 253130 (0.0037) [2024-06-22 17:08:43,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 4147314688. Throughput: 0: 42807.9. Samples: 4147473700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:43,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 17:08:45,475][15401] Updated weights for policy 0, policy_version 253140 (0.0030) [2024-06-22 17:08:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43147.9, 300 sec: 42765.0). Total num frames: 4147576832. Throughput: 0: 42937.0. Samples: 4147723480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:48,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 17:08:49,472][15401] Updated weights for policy 0, policy_version 253150 (0.0038) [2024-06-22 17:08:52,738][15349] Signal inference workers to stop experience collection... (61450 times) [2024-06-22 17:08:52,739][15349] Signal inference workers to resume experience collection... (61450 times) [2024-06-22 17:08:52,768][15401] InferenceWorker_p0-w0: stopping experience collection (61450 times) [2024-06-22 17:08:52,768][15401] InferenceWorker_p0-w0: resuming experience collection (61450 times) [2024-06-22 17:08:53,215][15401] Updated weights for policy 0, policy_version 253160 (0.0025) [2024-06-22 17:08:53,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4147789824. Throughput: 0: 43062.6. Samples: 4147862340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 17:08:57,025][15401] Updated weights for policy 0, policy_version 253170 (0.0033) [2024-06-22 17:08:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4147970048. Throughput: 0: 42631.8. Samples: 4148110440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:08:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 17:09:00,859][15401] Updated weights for policy 0, policy_version 253180 (0.0036) [2024-06-22 17:09:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43144.5, 300 sec: 42764.7). Total num frames: 4148232192. Throughput: 0: 42590.5. Samples: 4148358020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 17:09:03,393][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 17:09:04,605][15401] Updated weights for policy 0, policy_version 253190 (0.0030) [2024-06-22 17:09:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 4148412416. Throughput: 0: 42750.7. Samples: 4148499960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:08,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 17:09:08,534][15401] Updated weights for policy 0, policy_version 253200 (0.0037) [2024-06-22 17:09:12,693][15401] Updated weights for policy 0, policy_version 253210 (0.0043) [2024-06-22 17:09:13,390][15132] Fps is (10 sec: 37692.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4148609024. Throughput: 0: 42579.5. Samples: 4148751480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 17:09:16,270][15401] Updated weights for policy 0, policy_version 253220 (0.0029) [2024-06-22 17:09:18,390][15132] Fps is (10 sec: 47525.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4148887552. Throughput: 0: 42477.0. Samples: 4148994840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 17:09:20,139][15401] Updated weights for policy 0, policy_version 253230 (0.0029) [2024-06-22 17:09:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.7, 300 sec: 42598.8). Total num frames: 4149035008. Throughput: 0: 42667.1. Samples: 4149141580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 17:09:23,851][15401] Updated weights for policy 0, policy_version 253240 (0.0050) [2024-06-22 17:09:27,706][15401] Updated weights for policy 0, policy_version 253250 (0.0040) [2024-06-22 17:09:28,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4149248000. Throughput: 0: 42603.9. Samples: 4149390780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 17:09:31,622][15401] Updated weights for policy 0, policy_version 253260 (0.0028) [2024-06-22 17:09:32,084][15349] Signal inference workers to stop experience collection... (61500 times) [2024-06-22 17:09:32,086][15349] Signal inference workers to resume experience collection... (61500 times) [2024-06-22 17:09:32,097][15401] InferenceWorker_p0-w0: stopping experience collection (61500 times) [2024-06-22 17:09:32,127][15401] InferenceWorker_p0-w0: resuming experience collection (61500 times) [2024-06-22 17:09:33,389][15132] Fps is (10 sec: 50790.2, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 4149542912. Throughput: 0: 42536.9. Samples: 4149637640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:33,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-22 17:09:35,218][15401] Updated weights for policy 0, policy_version 253270 (0.0022) [2024-06-22 17:09:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 4149657600. Throughput: 0: 42710.6. Samples: 4149784320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:38,395][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 17:09:39,204][15401] Updated weights for policy 0, policy_version 253280 (0.0046) [2024-06-22 17:09:42,975][15401] Updated weights for policy 0, policy_version 253290 (0.0034) [2024-06-22 17:09:43,390][15132] Fps is (10 sec: 36044.5, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 4149903360. Throughput: 0: 42656.0. Samples: 4150029960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 17:09:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253290_4149903360.pth... [2024-06-22 17:09:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252666_4139679744.pth [2024-06-22 17:09:46,906][15401] Updated weights for policy 0, policy_version 253300 (0.0045) [2024-06-22 17:09:48,389][15132] Fps is (10 sec: 50791.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4150165504. Throughput: 0: 42666.8. Samples: 4150277920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 17:09:50,575][15401] Updated weights for policy 0, policy_version 253310 (0.0031) [2024-06-22 17:09:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 4150280192. Throughput: 0: 42600.4. Samples: 4150416880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 17:09:54,557][15401] Updated weights for policy 0, policy_version 253320 (0.0032) [2024-06-22 17:09:58,116][15401] Updated weights for policy 0, policy_version 253330 (0.0037) [2024-06-22 17:09:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4150558720. Throughput: 0: 42496.4. Samples: 4150663820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:09:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 17:10:02,222][15401] Updated weights for policy 0, policy_version 253340 (0.0031) [2024-06-22 17:10:03,390][15132] Fps is (10 sec: 52428.8, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 4150804480. Throughput: 0: 42875.9. Samples: 4150924260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:10:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 17:10:05,686][15401] Updated weights for policy 0, policy_version 253350 (0.0034) [2024-06-22 17:10:08,396][15132] Fps is (10 sec: 37659.3, 60 sec: 42049.5, 300 sec: 42486.4). Total num frames: 4150935552. Throughput: 0: 42512.5. Samples: 4151054920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:10:08,397][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 17:10:09,835][15401] Updated weights for policy 0, policy_version 253360 (0.0039) [2024-06-22 17:10:13,375][15401] Updated weights for policy 0, policy_version 253370 (0.0025) [2024-06-22 17:10:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4151214080. Throughput: 0: 42591.2. Samples: 4151307380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-22 17:10:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 17:10:15,886][15349] Signal inference workers to stop experience collection... (61550 times) [2024-06-22 17:10:15,887][15349] Signal inference workers to resume experience collection... (61550 times) [2024-06-22 17:10:15,910][15401] InferenceWorker_p0-w0: stopping experience collection (61550 times) [2024-06-22 17:10:15,910][15401] InferenceWorker_p0-w0: resuming experience collection (61550 times) [2024-06-22 17:10:17,816][15401] Updated weights for policy 0, policy_version 253380 (0.0027) [2024-06-22 17:10:18,389][15132] Fps is (10 sec: 49183.8, 60 sec: 42325.4, 300 sec: 42765.8). Total num frames: 4151427072. Throughput: 0: 42928.0. Samples: 4151569400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 17:10:21,099][15401] Updated weights for policy 0, policy_version 253390 (0.0034) [2024-06-22 17:10:23,389][15132] Fps is (10 sec: 36044.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4151574528. Throughput: 0: 42419.2. Samples: 4151693180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:10:25,512][15401] Updated weights for policy 0, policy_version 253400 (0.0033) [2024-06-22 17:10:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4151836672. Throughput: 0: 42552.9. Samples: 4151944840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 17:10:28,656][15401] Updated weights for policy 0, policy_version 253410 (0.0035) [2024-06-22 17:10:33,075][15401] Updated weights for policy 0, policy_version 253420 (0.0023) [2024-06-22 17:10:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 4152049664. Throughput: 0: 43030.7. Samples: 4152214300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 17:10:36,087][15401] Updated weights for policy 0, policy_version 253430 (0.0036) [2024-06-22 17:10:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4152229888. Throughput: 0: 42624.9. Samples: 4152335000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:10:40,573][15401] Updated weights for policy 0, policy_version 253440 (0.0026) [2024-06-22 17:10:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4152492032. Throughput: 0: 42924.0. Samples: 4152595400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:10:43,596][15401] Updated weights for policy 0, policy_version 253450 (0.0033) [2024-06-22 17:10:48,355][15401] Updated weights for policy 0, policy_version 253460 (0.0043) [2024-06-22 17:10:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4152688640. Throughput: 0: 42905.9. Samples: 4152855020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 17:10:51,404][15401] Updated weights for policy 0, policy_version 253470 (0.0036) [2024-06-22 17:10:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 4152885248. Throughput: 0: 42649.2. Samples: 4152973860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 17:10:55,881][15401] Updated weights for policy 0, policy_version 253480 (0.0034) [2024-06-22 17:10:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4153131008. Throughput: 0: 42725.5. Samples: 4153230020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:10:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 17:10:58,942][15401] Updated weights for policy 0, policy_version 253490 (0.0037) [2024-06-22 17:11:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4153327616. Throughput: 0: 42790.0. Samples: 4153494960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:11:03,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-22 17:11:03,413][15401] Updated weights for policy 0, policy_version 253500 (0.0033) [2024-06-22 17:11:06,780][15401] Updated weights for policy 0, policy_version 253510 (0.0033) [2024-06-22 17:11:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43422.2, 300 sec: 42653.9). Total num frames: 4153540608. Throughput: 0: 42724.0. Samples: 4153615760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:11:08,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-22 17:11:10,836][15401] Updated weights for policy 0, policy_version 253520 (0.0030) [2024-06-22 17:11:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4153786368. Throughput: 0: 43152.0. Samples: 4153886680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:11:13,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 17:11:14,123][15401] Updated weights for policy 0, policy_version 253530 (0.0035) [2024-06-22 17:11:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4153982976. Throughput: 0: 42914.7. Samples: 4154145460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:11:18,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 17:11:18,483][15401] Updated weights for policy 0, policy_version 253540 (0.0032) [2024-06-22 17:11:20,881][15349] Signal inference workers to stop experience collection... (61600 times) [2024-06-22 17:11:20,892][15349] Signal inference workers to resume experience collection... (61600 times) [2024-06-22 17:11:20,901][15401] InferenceWorker_p0-w0: stopping experience collection (61600 times) [2024-06-22 17:11:20,919][15401] InferenceWorker_p0-w0: resuming experience collection (61600 times) [2024-06-22 17:11:21,689][15401] Updated weights for policy 0, policy_version 253550 (0.0029) [2024-06-22 17:11:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 4154195968. Throughput: 0: 42946.8. Samples: 4154267600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-22 17:11:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 17:11:26,114][15401] Updated weights for policy 0, policy_version 253560 (0.0029) [2024-06-22 17:11:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4154425344. Throughput: 0: 43059.1. Samples: 4154533060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 17:11:29,257][15401] Updated weights for policy 0, policy_version 253570 (0.0029) [2024-06-22 17:11:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 4154621952. Throughput: 0: 43107.9. Samples: 4154794880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:33,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 17:11:34,117][15401] Updated weights for policy 0, policy_version 253580 (0.0045) [2024-06-22 17:11:37,066][15401] Updated weights for policy 0, policy_version 253590 (0.0029) [2024-06-22 17:11:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4154834944. Throughput: 0: 43121.7. Samples: 4154914340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 17:11:41,835][15401] Updated weights for policy 0, policy_version 253600 (0.0036) [2024-06-22 17:11:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4155047936. Throughput: 0: 43175.4. Samples: 4155172920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:43,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 17:11:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253605_4155064320.pth... [2024-06-22 17:11:43,640][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000252978_4144791552.pth [2024-06-22 17:11:45,283][15401] Updated weights for policy 0, policy_version 253610 (0.0031) [2024-06-22 17:11:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4155244544. Throughput: 0: 43098.3. Samples: 4155434380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 17:11:49,493][15401] Updated weights for policy 0, policy_version 253620 (0.0040) [2024-06-22 17:11:52,851][15401] Updated weights for policy 0, policy_version 253630 (0.0022) [2024-06-22 17:11:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 4155490304. Throughput: 0: 43147.5. Samples: 4155557400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:53,402][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 17:11:57,059][15401] Updated weights for policy 0, policy_version 253640 (0.0041) [2024-06-22 17:11:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4155686912. Throughput: 0: 42839.2. Samples: 4155814440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:11:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 17:12:00,419][15401] Updated weights for policy 0, policy_version 253650 (0.0026) [2024-06-22 17:12:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4155883520. Throughput: 0: 42805.6. Samples: 4156071720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:03,404][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 17:12:04,786][15401] Updated weights for policy 0, policy_version 253660 (0.0027) [2024-06-22 17:12:07,764][15349] Signal inference workers to stop experience collection... (61650 times) [2024-06-22 17:12:07,787][15401] InferenceWorker_p0-w0: stopping experience collection (61650 times) [2024-06-22 17:12:07,875][15349] Signal inference workers to resume experience collection... (61650 times) [2024-06-22 17:12:07,875][15401] InferenceWorker_p0-w0: resuming experience collection (61650 times) [2024-06-22 17:12:08,010][15401] Updated weights for policy 0, policy_version 253670 (0.0034) [2024-06-22 17:12:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4156129280. Throughput: 0: 42844.0. Samples: 4156195580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 17:12:12,516][15401] Updated weights for policy 0, policy_version 253680 (0.0047) [2024-06-22 17:12:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4156342272. Throughput: 0: 42678.9. Samples: 4156453600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:12:15,705][15401] Updated weights for policy 0, policy_version 253690 (0.0025) [2024-06-22 17:12:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4156538880. Throughput: 0: 42549.1. Samples: 4156709580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 17:12:20,054][15401] Updated weights for policy 0, policy_version 253700 (0.0028) [2024-06-22 17:12:23,301][15401] Updated weights for policy 0, policy_version 253710 (0.0028) [2024-06-22 17:12:23,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4156784640. Throughput: 0: 42672.5. Samples: 4156834600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 17:12:27,682][15401] Updated weights for policy 0, policy_version 253720 (0.0039) [2024-06-22 17:12:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4156981248. Throughput: 0: 42812.5. Samples: 4157099480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 17:12:30,825][15401] Updated weights for policy 0, policy_version 253730 (0.0036) [2024-06-22 17:12:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4157194240. Throughput: 0: 42697.3. Samples: 4157355760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:12:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 17:12:35,517][15401] Updated weights for policy 0, policy_version 253740 (0.0032) [2024-06-22 17:12:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4157423616. Throughput: 0: 42853.9. Samples: 4157485820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:12:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 17:12:38,478][15401] Updated weights for policy 0, policy_version 253750 (0.0036) [2024-06-22 17:12:43,005][15401] Updated weights for policy 0, policy_version 253760 (0.0029) [2024-06-22 17:12:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 4157620224. Throughput: 0: 42967.1. Samples: 4157747960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:12:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 17:12:46,238][15401] Updated weights for policy 0, policy_version 253770 (0.0042) [2024-06-22 17:12:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4157833216. Throughput: 0: 42792.6. Samples: 4157997380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:12:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 17:12:50,490][15401] Updated weights for policy 0, policy_version 253780 (0.0035) [2024-06-22 17:12:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4158046208. Throughput: 0: 42991.0. Samples: 4158130180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:12:53,399][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 17:12:53,943][15401] Updated weights for policy 0, policy_version 253790 (0.0039) [2024-06-22 17:12:58,015][15401] Updated weights for policy 0, policy_version 253800 (0.0031) [2024-06-22 17:12:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 4158259200. Throughput: 0: 43008.3. Samples: 4158388980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:12:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 17:13:01,701][15401] Updated weights for policy 0, policy_version 253810 (0.0036) [2024-06-22 17:13:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4158472192. Throughput: 0: 42893.6. Samples: 4158639800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 17:13:05,913][15401] Updated weights for policy 0, policy_version 253820 (0.0032) [2024-06-22 17:13:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4158685184. Throughput: 0: 43000.0. Samples: 4158769600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 17:13:09,507][15401] Updated weights for policy 0, policy_version 253830 (0.0030) [2024-06-22 17:13:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4158898176. Throughput: 0: 42786.2. Samples: 4159024860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:13,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 17:13:13,589][15401] Updated weights for policy 0, policy_version 253840 (0.0032) [2024-06-22 17:13:17,153][15401] Updated weights for policy 0, policy_version 253850 (0.0029) [2024-06-22 17:13:18,391][15132] Fps is (10 sec: 44230.8, 60 sec: 43143.5, 300 sec: 42709.4). Total num frames: 4159127552. Throughput: 0: 42452.1. Samples: 4159266160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:18,391][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 17:13:21,691][15401] Updated weights for policy 0, policy_version 253860 (0.0041) [2024-06-22 17:13:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4159324160. Throughput: 0: 42705.8. Samples: 4159407580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:23,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 17:13:24,837][15401] Updated weights for policy 0, policy_version 253870 (0.0035) [2024-06-22 17:13:28,389][15132] Fps is (10 sec: 40965.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4159537152. Throughput: 0: 42425.3. Samples: 4159657100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 17:13:29,179][15401] Updated weights for policy 0, policy_version 253880 (0.0033) [2024-06-22 17:13:32,522][15401] Updated weights for policy 0, policy_version 253890 (0.0027) [2024-06-22 17:13:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4159766528. Throughput: 0: 42544.3. Samples: 4159911980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:33,393][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 17:13:36,684][15401] Updated weights for policy 0, policy_version 253900 (0.0033) [2024-06-22 17:13:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42876.4). Total num frames: 4159963136. Throughput: 0: 42574.6. Samples: 4160046040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:13:40,059][15349] Signal inference workers to stop experience collection... (61700 times) [2024-06-22 17:13:40,097][15401] InferenceWorker_p0-w0: stopping experience collection (61700 times) [2024-06-22 17:13:40,121][15349] Signal inference workers to resume experience collection... (61700 times) [2024-06-22 17:13:40,128][15401] InferenceWorker_p0-w0: resuming experience collection (61700 times) [2024-06-22 17:13:40,264][15401] Updated weights for policy 0, policy_version 253910 (0.0034) [2024-06-22 17:13:43,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4160159744. Throughput: 0: 42486.3. Samples: 4160300860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 17:13:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 17:13:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253917_4160176128.pth... [2024-06-22 17:13:43,525][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253290_4149903360.pth [2024-06-22 17:13:44,277][15401] Updated weights for policy 0, policy_version 253920 (0.0044) [2024-06-22 17:13:47,963][15401] Updated weights for policy 0, policy_version 253930 (0.0033) [2024-06-22 17:13:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4160405504. Throughput: 0: 42560.6. Samples: 4160555020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:13:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 17:13:51,915][15401] Updated weights for policy 0, policy_version 253940 (0.0033) [2024-06-22 17:13:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4160585728. Throughput: 0: 42618.7. Samples: 4160687440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:13:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 17:13:55,854][15401] Updated weights for policy 0, policy_version 253950 (0.0040) [2024-06-22 17:13:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 4160815104. Throughput: 0: 42539.5. Samples: 4160939140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:13:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 17:13:59,604][15401] Updated weights for policy 0, policy_version 253960 (0.0042) [2024-06-22 17:14:03,318][15401] Updated weights for policy 0, policy_version 253970 (0.0036) [2024-06-22 17:14:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4161044480. Throughput: 0: 42908.8. Samples: 4161197000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 17:14:07,357][15401] Updated weights for policy 0, policy_version 253980 (0.0045) [2024-06-22 17:14:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4161224704. Throughput: 0: 42674.6. Samples: 4161327940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 17:14:10,766][15401] Updated weights for policy 0, policy_version 253990 (0.0040) [2024-06-22 17:14:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4161470464. Throughput: 0: 42779.9. Samples: 4161582200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 17:14:14,843][15401] Updated weights for policy 0, policy_version 254000 (0.0026) [2024-06-22 17:14:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42599.3, 300 sec: 42876.1). Total num frames: 4161683456. Throughput: 0: 42829.4. Samples: 4161839200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 17:14:18,667][15401] Updated weights for policy 0, policy_version 254010 (0.0024) [2024-06-22 17:14:22,525][15401] Updated weights for policy 0, policy_version 254020 (0.0034) [2024-06-22 17:14:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4161880064. Throughput: 0: 42693.8. Samples: 4161967260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 17:14:26,109][15401] Updated weights for policy 0, policy_version 254030 (0.0031) [2024-06-22 17:14:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4162109440. Throughput: 0: 42808.3. Samples: 4162227240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 17:14:30,014][15401] Updated weights for policy 0, policy_version 254040 (0.0034) [2024-06-22 17:14:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 4162322432. Throughput: 0: 42825.8. Samples: 4162482180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 17:14:33,632][15401] Updated weights for policy 0, policy_version 254050 (0.0033) [2024-06-22 17:14:38,220][15401] Updated weights for policy 0, policy_version 254060 (0.0031) [2024-06-22 17:14:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4162519040. Throughput: 0: 42745.3. Samples: 4162610980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:38,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 17:14:41,750][15401] Updated weights for policy 0, policy_version 254070 (0.0026) [2024-06-22 17:14:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4162732032. Throughput: 0: 42735.2. Samples: 4162862220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 17:14:45,931][15401] Updated weights for policy 0, policy_version 254080 (0.0024) [2024-06-22 17:14:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4162961408. Throughput: 0: 42705.9. Samples: 4163118760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 17:14:49,134][15401] Updated weights for policy 0, policy_version 254090 (0.0047) [2024-06-22 17:14:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4163158016. Throughput: 0: 42767.0. Samples: 4163252460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 17:14:53,516][15401] Updated weights for policy 0, policy_version 254100 (0.0040) [2024-06-22 17:14:54,114][15349] Signal inference workers to stop experience collection... (61750 times) [2024-06-22 17:14:54,158][15401] InferenceWorker_p0-w0: stopping experience collection (61750 times) [2024-06-22 17:14:54,170][15349] Signal inference workers to resume experience collection... (61750 times) [2024-06-22 17:14:54,179][15401] InferenceWorker_p0-w0: resuming experience collection (61750 times) [2024-06-22 17:14:56,660][15401] Updated weights for policy 0, policy_version 254110 (0.0024) [2024-06-22 17:14:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4163387392. Throughput: 0: 42619.5. Samples: 4163500080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:14:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 17:15:01,038][15401] Updated weights for policy 0, policy_version 254120 (0.0031) [2024-06-22 17:15:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42932.6). Total num frames: 4163600384. Throughput: 0: 42771.2. Samples: 4163763900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 17:15:04,166][15401] Updated weights for policy 0, policy_version 254130 (0.0027) [2024-06-22 17:15:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4163796992. Throughput: 0: 42815.7. Samples: 4163893960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:08,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 17:15:08,656][15401] Updated weights for policy 0, policy_version 254140 (0.0035) [2024-06-22 17:15:11,674][15401] Updated weights for policy 0, policy_version 254150 (0.0035) [2024-06-22 17:15:13,393][15132] Fps is (10 sec: 42582.0, 60 sec: 42595.8, 300 sec: 42708.9). Total num frames: 4164026368. Throughput: 0: 42558.7. Samples: 4164142540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:13,394][15132] Avg episode reward: [(0, '0.220')] [2024-06-22 17:15:16,152][15401] Updated weights for policy 0, policy_version 254160 (0.0030) [2024-06-22 17:15:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4164255744. Throughput: 0: 42859.0. Samples: 4164410840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:18,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-22 17:15:19,234][15401] Updated weights for policy 0, policy_version 254170 (0.0033) [2024-06-22 17:15:23,390][15132] Fps is (10 sec: 42614.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4164452352. Throughput: 0: 42840.4. Samples: 4164538800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:23,400][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 17:15:23,937][15401] Updated weights for policy 0, policy_version 254180 (0.0031) [2024-06-22 17:15:27,494][15401] Updated weights for policy 0, policy_version 254190 (0.0029) [2024-06-22 17:15:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4164681728. Throughput: 0: 42888.6. Samples: 4164792200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 17:15:31,739][15401] Updated weights for policy 0, policy_version 254200 (0.0040) [2024-06-22 17:15:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4164894720. Throughput: 0: 42780.4. Samples: 4165043880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 17:15:34,971][15401] Updated weights for policy 0, policy_version 254210 (0.0023) [2024-06-22 17:15:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4165074944. Throughput: 0: 42808.6. Samples: 4165178840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 17:15:39,427][15401] Updated weights for policy 0, policy_version 254220 (0.0027) [2024-06-22 17:15:42,458][15401] Updated weights for policy 0, policy_version 254230 (0.0037) [2024-06-22 17:15:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4165320704. Throughput: 0: 42861.4. Samples: 4165428840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:43,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-22 17:15:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254231_4165320704.pth... [2024-06-22 17:15:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253605_4155064320.pth [2024-06-22 17:15:46,922][15401] Updated weights for policy 0, policy_version 254240 (0.0029) [2024-06-22 17:15:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4165517312. Throughput: 0: 42715.0. Samples: 4165686080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 17:15:49,984][15401] Updated weights for policy 0, policy_version 254250 (0.0037) [2024-06-22 17:15:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4165730304. Throughput: 0: 42718.1. Samples: 4165816280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 17:15:54,859][15401] Updated weights for policy 0, policy_version 254260 (0.0035) [2024-06-22 17:15:58,181][15401] Updated weights for policy 0, policy_version 254270 (0.0042) [2024-06-22 17:15:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4165959680. Throughput: 0: 42799.1. Samples: 4166068340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:15:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 17:16:02,417][15401] Updated weights for policy 0, policy_version 254280 (0.0037) [2024-06-22 17:16:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4166172672. Throughput: 0: 42749.4. Samples: 4166334560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:16:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 17:16:05,567][15401] Updated weights for policy 0, policy_version 254290 (0.0044) [2024-06-22 17:16:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4166385664. Throughput: 0: 42758.7. Samples: 4166462940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 17:16:08,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 17:16:10,002][15401] Updated weights for policy 0, policy_version 254300 (0.0024) [2024-06-22 17:16:13,125][15401] Updated weights for policy 0, policy_version 254310 (0.0040) [2024-06-22 17:16:13,393][15132] Fps is (10 sec: 44223.6, 60 sec: 43145.1, 300 sec: 42820.1). Total num frames: 4166615040. Throughput: 0: 42864.2. Samples: 4166721220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:13,393][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 17:16:17,534][15401] Updated weights for policy 0, policy_version 254320 (0.0029) [2024-06-22 17:16:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4166828032. Throughput: 0: 42977.3. Samples: 4166977860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:16:20,611][15401] Updated weights for policy 0, policy_version 254330 (0.0024) [2024-06-22 17:16:23,390][15132] Fps is (10 sec: 40972.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4167024640. Throughput: 0: 42866.6. Samples: 4167107840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 17:16:24,256][15349] Signal inference workers to stop experience collection... (61800 times) [2024-06-22 17:16:24,257][15349] Signal inference workers to resume experience collection... (61800 times) [2024-06-22 17:16:24,301][15401] InferenceWorker_p0-w0: stopping experience collection (61800 times) [2024-06-22 17:16:24,302][15401] InferenceWorker_p0-w0: resuming experience collection (61800 times) [2024-06-22 17:16:24,999][15401] Updated weights for policy 0, policy_version 254340 (0.0033) [2024-06-22 17:16:28,189][15401] Updated weights for policy 0, policy_version 254350 (0.0035) [2024-06-22 17:16:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4167270400. Throughput: 0: 43034.7. Samples: 4167365400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 17:16:32,544][15401] Updated weights for policy 0, policy_version 254360 (0.0026) [2024-06-22 17:16:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4167450624. Throughput: 0: 43070.7. Samples: 4167624260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 17:16:36,106][15401] Updated weights for policy 0, policy_version 254370 (0.0039) [2024-06-22 17:16:38,396][15132] Fps is (10 sec: 40934.2, 60 sec: 43412.9, 300 sec: 42819.6). Total num frames: 4167680000. Throughput: 0: 42974.4. Samples: 4167750400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:38,396][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 17:16:40,441][15401] Updated weights for policy 0, policy_version 254380 (0.0036) [2024-06-22 17:16:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4167909376. Throughput: 0: 43215.6. Samples: 4168013040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 17:16:43,658][15401] Updated weights for policy 0, policy_version 254390 (0.0033) [2024-06-22 17:16:48,067][15401] Updated weights for policy 0, policy_version 254400 (0.0038) [2024-06-22 17:16:48,389][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4168105984. Throughput: 0: 42943.2. Samples: 4168267000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 17:16:51,574][15401] Updated weights for policy 0, policy_version 254410 (0.0044) [2024-06-22 17:16:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4168318976. Throughput: 0: 42840.6. Samples: 4168390760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:53,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 17:16:55,773][15401] Updated weights for policy 0, policy_version 254420 (0.0029) [2024-06-22 17:16:58,391][15132] Fps is (10 sec: 42590.6, 60 sec: 42870.3, 300 sec: 42875.9). Total num frames: 4168531968. Throughput: 0: 42807.0. Samples: 4168647480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:16:58,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 17:16:59,224][15401] Updated weights for policy 0, policy_version 254430 (0.0024) [2024-06-22 17:17:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4168728576. Throughput: 0: 42881.4. Samples: 4168907520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:17:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 17:17:03,451][15401] Updated weights for policy 0, policy_version 254440 (0.0047) [2024-06-22 17:17:06,853][15401] Updated weights for policy 0, policy_version 254450 (0.0050) [2024-06-22 17:17:08,389][15132] Fps is (10 sec: 40967.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4168941568. Throughput: 0: 42672.1. Samples: 4169028080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:17:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 17:17:11,079][15401] Updated weights for policy 0, policy_version 254460 (0.0029) [2024-06-22 17:17:13,394][15132] Fps is (10 sec: 44217.9, 60 sec: 42597.5, 300 sec: 42819.9). Total num frames: 4169170944. Throughput: 0: 42797.3. Samples: 4169291460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:17:13,394][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 17:17:14,381][15401] Updated weights for policy 0, policy_version 254470 (0.0046) [2024-06-22 17:17:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4169367552. Throughput: 0: 42660.9. Samples: 4169544000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 17:17:18,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 17:17:18,818][15401] Updated weights for policy 0, policy_version 254480 (0.0030) [2024-06-22 17:17:21,912][15401] Updated weights for policy 0, policy_version 254490 (0.0034) [2024-06-22 17:17:23,390][15132] Fps is (10 sec: 40977.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4169580544. Throughput: 0: 42657.1. Samples: 4169669700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 17:17:26,549][15401] Updated weights for policy 0, policy_version 254500 (0.0026) [2024-06-22 17:17:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4169809920. Throughput: 0: 42579.6. Samples: 4169929120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:17:29,576][15401] Updated weights for policy 0, policy_version 254510 (0.0033) [2024-06-22 17:17:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4170022912. Throughput: 0: 42602.2. Samples: 4170184100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 17:17:34,369][15401] Updated weights for policy 0, policy_version 254520 (0.0035) [2024-06-22 17:17:37,267][15401] Updated weights for policy 0, policy_version 254530 (0.0030) [2024-06-22 17:17:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 4170235904. Throughput: 0: 42647.4. Samples: 4170309900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 17:17:41,862][15401] Updated weights for policy 0, policy_version 254540 (0.0034) [2024-06-22 17:17:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4170432512. Throughput: 0: 42663.3. Samples: 4170567260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 17:17:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254544_4170448896.pth... [2024-06-22 17:17:43,483][15349] Signal inference workers to stop experience collection... (61850 times) [2024-06-22 17:17:43,484][15349] Signal inference workers to resume experience collection... (61850 times) [2024-06-22 17:17:43,500][15401] InferenceWorker_p0-w0: stopping experience collection (61850 times) [2024-06-22 17:17:43,500][15401] InferenceWorker_p0-w0: resuming experience collection (61850 times) [2024-06-22 17:17:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000253917_4160176128.pth [2024-06-22 17:17:45,327][15401] Updated weights for policy 0, policy_version 254550 (0.0041) [2024-06-22 17:17:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4170661888. Throughput: 0: 42509.4. Samples: 4170820440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 17:17:49,783][15401] Updated weights for policy 0, policy_version 254560 (0.0031) [2024-06-22 17:17:53,137][15401] Updated weights for policy 0, policy_version 254570 (0.0032) [2024-06-22 17:17:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4170874880. Throughput: 0: 42625.7. Samples: 4170946240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 17:17:57,233][15401] Updated weights for policy 0, policy_version 254580 (0.0023) [2024-06-22 17:17:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42326.6, 300 sec: 42709.5). Total num frames: 4171071488. Throughput: 0: 42479.6. Samples: 4171202860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:17:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 17:18:00,728][15401] Updated weights for policy 0, policy_version 254590 (0.0029) [2024-06-22 17:18:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4171300864. Throughput: 0: 42629.7. Samples: 4171462340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:03,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 17:18:05,305][15401] Updated weights for policy 0, policy_version 254600 (0.0039) [2024-06-22 17:18:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4171497472. Throughput: 0: 42528.5. Samples: 4171583480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 17:18:08,620][15401] Updated weights for policy 0, policy_version 254610 (0.0026) [2024-06-22 17:18:12,882][15401] Updated weights for policy 0, policy_version 254620 (0.0029) [2024-06-22 17:18:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42055.3, 300 sec: 42598.6). Total num frames: 4171694080. Throughput: 0: 42446.6. Samples: 4171839220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 17:18:16,298][15401] Updated weights for policy 0, policy_version 254630 (0.0037) [2024-06-22 17:18:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4171907072. Throughput: 0: 42444.5. Samples: 4172094100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:18,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-22 17:18:20,454][15401] Updated weights for policy 0, policy_version 254640 (0.0041) [2024-06-22 17:18:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4172152832. Throughput: 0: 42428.9. Samples: 4172219200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 17:18:24,204][15401] Updated weights for policy 0, policy_version 254650 (0.0028) [2024-06-22 17:18:28,231][15401] Updated weights for policy 0, policy_version 254660 (0.0041) [2024-06-22 17:18:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 4172349440. Throughput: 0: 42391.6. Samples: 4172474980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-22 17:18:28,393][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 17:18:31,657][15401] Updated weights for policy 0, policy_version 254670 (0.0032) [2024-06-22 17:18:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4172546048. Throughput: 0: 42481.7. Samples: 4172732120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:33,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 17:18:36,105][15401] Updated weights for policy 0, policy_version 254680 (0.0040) [2024-06-22 17:18:38,390][15132] Fps is (10 sec: 44246.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4172791808. Throughput: 0: 42402.0. Samples: 4172854340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:38,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 17:18:39,304][15401] Updated weights for policy 0, policy_version 254690 (0.0028) [2024-06-22 17:18:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4172988416. Throughput: 0: 42447.9. Samples: 4173113020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 17:18:43,770][15401] Updated weights for policy 0, policy_version 254700 (0.0027) [2024-06-22 17:18:47,035][15401] Updated weights for policy 0, policy_version 254710 (0.0034) [2024-06-22 17:18:48,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 4173201408. Throughput: 0: 42307.1. Samples: 4173366260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:48,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 17:18:51,612][15401] Updated weights for policy 0, policy_version 254720 (0.0031) [2024-06-22 17:18:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4173414400. Throughput: 0: 42525.8. Samples: 4173497140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 17:18:54,837][15401] Updated weights for policy 0, policy_version 254730 (0.0043) [2024-06-22 17:18:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4173611008. Throughput: 0: 42404.9. Samples: 4173747440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:18:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 17:18:59,290][15401] Updated weights for policy 0, policy_version 254740 (0.0042) [2024-06-22 17:19:00,605][15349] Signal inference workers to stop experience collection... (61900 times) [2024-06-22 17:19:00,606][15349] Signal inference workers to resume experience collection... (61900 times) [2024-06-22 17:19:00,617][15401] InferenceWorker_p0-w0: stopping experience collection (61900 times) [2024-06-22 17:19:00,640][15401] InferenceWorker_p0-w0: resuming experience collection (61900 times) [2024-06-22 17:19:03,059][15401] Updated weights for policy 0, policy_version 254750 (0.0030) [2024-06-22 17:19:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4173840384. Throughput: 0: 42290.6. Samples: 4173997180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 17:19:07,038][15401] Updated weights for policy 0, policy_version 254760 (0.0040) [2024-06-22 17:19:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 4174036992. Throughput: 0: 42235.8. Samples: 4174119820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 17:19:10,623][15401] Updated weights for policy 0, policy_version 254770 (0.0030) [2024-06-22 17:19:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4174249984. Throughput: 0: 42200.5. Samples: 4174373900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 17:19:14,971][15401] Updated weights for policy 0, policy_version 254780 (0.0029) [2024-06-22 17:19:18,111][15401] Updated weights for policy 0, policy_version 254790 (0.0034) [2024-06-22 17:19:18,395][15132] Fps is (10 sec: 44214.9, 60 sec: 42867.8, 300 sec: 42708.7). Total num frames: 4174479360. Throughput: 0: 42104.5. Samples: 4174627040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:18,395][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 17:19:22,550][15401] Updated weights for policy 0, policy_version 254800 (0.0045) [2024-06-22 17:19:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 4174659584. Throughput: 0: 42365.0. Samples: 4174760760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 17:19:25,739][15401] Updated weights for policy 0, policy_version 254810 (0.0029) [2024-06-22 17:19:28,389][15132] Fps is (10 sec: 42620.5, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 4174905344. Throughput: 0: 42239.7. Samples: 4175013800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 17:19:30,161][15401] Updated weights for policy 0, policy_version 254820 (0.0033) [2024-06-22 17:19:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4175118336. Throughput: 0: 42323.7. Samples: 4175270720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:19:33,925][15401] Updated weights for policy 0, policy_version 254830 (0.0038) [2024-06-22 17:19:37,809][15401] Updated weights for policy 0, policy_version 254840 (0.0038) [2024-06-22 17:19:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 4175314944. Throughput: 0: 42278.3. Samples: 4175399660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-22 17:19:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 17:19:41,729][15401] Updated weights for policy 0, policy_version 254850 (0.0043) [2024-06-22 17:19:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4175527936. Throughput: 0: 42425.4. Samples: 4175656580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:19:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 17:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254855_4175544320.pth... [2024-06-22 17:19:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254231_4165320704.pth [2024-06-22 17:19:45,278][15401] Updated weights for policy 0, policy_version 254860 (0.0035) [2024-06-22 17:19:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 4175724544. Throughput: 0: 42574.7. Samples: 4175913040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:19:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 17:19:49,450][15401] Updated weights for policy 0, policy_version 254870 (0.0032) [2024-06-22 17:19:52,799][15401] Updated weights for policy 0, policy_version 254880 (0.0034) [2024-06-22 17:19:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4175953920. Throughput: 0: 42666.5. Samples: 4176039800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:19:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 17:19:57,393][15401] Updated weights for policy 0, policy_version 254890 (0.0028) [2024-06-22 17:19:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4176166912. Throughput: 0: 42808.8. Samples: 4176300300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:19:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 17:20:00,697][15401] Updated weights for policy 0, policy_version 254900 (0.0038) [2024-06-22 17:20:03,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4176379904. Throughput: 0: 42717.7. Samples: 4176549220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:03,393][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 17:20:04,962][15401] Updated weights for policy 0, policy_version 254910 (0.0043) [2024-06-22 17:20:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.7, 300 sec: 42654.5). Total num frames: 4176609280. Throughput: 0: 42569.9. Samples: 4176676400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:08,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 17:20:08,395][15401] Updated weights for policy 0, policy_version 254920 (0.0035) [2024-06-22 17:20:12,319][15401] Updated weights for policy 0, policy_version 254930 (0.0034) [2024-06-22 17:20:13,395][15132] Fps is (10 sec: 44224.5, 60 sec: 42867.7, 300 sec: 42597.7). Total num frames: 4176822272. Throughput: 0: 42853.7. Samples: 4176942440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:13,395][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 17:20:16,071][15401] Updated weights for policy 0, policy_version 254940 (0.0040) [2024-06-22 17:20:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42329.1, 300 sec: 42598.4). Total num frames: 4177018880. Throughput: 0: 42681.4. Samples: 4177191380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 17:20:19,993][15401] Updated weights for policy 0, policy_version 254950 (0.0041) [2024-06-22 17:20:23,390][15132] Fps is (10 sec: 42620.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4177248256. Throughput: 0: 42691.9. Samples: 4177320800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:23,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 17:20:23,757][15401] Updated weights for policy 0, policy_version 254960 (0.0043) [2024-06-22 17:20:27,504][15401] Updated weights for policy 0, policy_version 254970 (0.0034) [2024-06-22 17:20:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 4177444864. Throughput: 0: 42681.2. Samples: 4177577240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 17:20:29,904][15349] Signal inference workers to stop experience collection... (61950 times) [2024-06-22 17:20:29,905][15349] Signal inference workers to resume experience collection... (61950 times) [2024-06-22 17:20:29,915][15401] InferenceWorker_p0-w0: stopping experience collection (61950 times) [2024-06-22 17:20:29,915][15401] InferenceWorker_p0-w0: resuming experience collection (61950 times) [2024-06-22 17:20:31,508][15401] Updated weights for policy 0, policy_version 254980 (0.0030) [2024-06-22 17:20:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4177674240. Throughput: 0: 42599.0. Samples: 4177830000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 17:20:35,212][15401] Updated weights for policy 0, policy_version 254990 (0.0037) [2024-06-22 17:20:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4177887232. Throughput: 0: 42670.9. Samples: 4177960000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 17:20:39,046][15401] Updated weights for policy 0, policy_version 255000 (0.0034) [2024-06-22 17:20:43,213][15401] Updated weights for policy 0, policy_version 255010 (0.0033) [2024-06-22 17:20:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4178083840. Throughput: 0: 42631.1. Samples: 4178218700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 17:20:46,802][15401] Updated weights for policy 0, policy_version 255020 (0.0033) [2024-06-22 17:20:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4178313216. Throughput: 0: 42724.5. Samples: 4178471720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:20:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 17:20:50,668][15401] Updated weights for policy 0, policy_version 255030 (0.0042) [2024-06-22 17:20:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4178509824. Throughput: 0: 42868.8. Samples: 4178605500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:20:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 17:20:54,224][15401] Updated weights for policy 0, policy_version 255040 (0.0027) [2024-06-22 17:20:58,203][15401] Updated weights for policy 0, policy_version 255050 (0.0031) [2024-06-22 17:20:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4178739200. Throughput: 0: 42731.6. Samples: 4178865140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:20:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 17:21:01,770][15401] Updated weights for policy 0, policy_version 255060 (0.0028) [2024-06-22 17:21:03,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 4178968576. Throughput: 0: 42893.1. Samples: 4179121580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 17:21:05,594][15401] Updated weights for policy 0, policy_version 255070 (0.0037) [2024-06-22 17:21:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.6, 300 sec: 42543.0). Total num frames: 4179165184. Throughput: 0: 43060.9. Samples: 4179258640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:08,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 17:21:09,572][15401] Updated weights for policy 0, policy_version 255080 (0.0041) [2024-06-22 17:21:12,952][15401] Updated weights for policy 0, policy_version 255090 (0.0037) [2024-06-22 17:21:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42875.1, 300 sec: 42598.4). Total num frames: 4179394560. Throughput: 0: 43152.0. Samples: 4179519080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 17:21:17,030][15401] Updated weights for policy 0, policy_version 255100 (0.0036) [2024-06-22 17:21:18,393][15132] Fps is (10 sec: 45870.1, 60 sec: 43415.0, 300 sec: 42709.0). Total num frames: 4179623936. Throughput: 0: 43190.0. Samples: 4179773700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:18,394][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 17:21:20,580][15401] Updated weights for policy 0, policy_version 255110 (0.0041) [2024-06-22 17:21:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4179820544. Throughput: 0: 43135.2. Samples: 4179901080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 17:21:24,610][15401] Updated weights for policy 0, policy_version 255120 (0.0041) [2024-06-22 17:21:28,390][15132] Fps is (10 sec: 40974.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4180033536. Throughput: 0: 43077.4. Samples: 4180157180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 17:21:28,510][15401] Updated weights for policy 0, policy_version 255130 (0.0023) [2024-06-22 17:21:32,323][15401] Updated weights for policy 0, policy_version 255140 (0.0028) [2024-06-22 17:21:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42654.5). Total num frames: 4180262912. Throughput: 0: 43144.8. Samples: 4180413340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:33,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 17:21:36,142][15401] Updated weights for policy 0, policy_version 255150 (0.0036) [2024-06-22 17:21:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4180475904. Throughput: 0: 43076.8. Samples: 4180543960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 17:21:39,874][15401] Updated weights for policy 0, policy_version 255160 (0.0038) [2024-06-22 17:21:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4180672512. Throughput: 0: 43104.5. Samples: 4180804840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 17:21:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255168_4180672512.pth... [2024-06-22 17:21:43,444][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254544_4170448896.pth [2024-06-22 17:21:43,920][15401] Updated weights for policy 0, policy_version 255170 (0.0047) [2024-06-22 17:21:47,535][15401] Updated weights for policy 0, policy_version 255180 (0.0037) [2024-06-22 17:21:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4180885504. Throughput: 0: 42895.1. Samples: 4181051860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 17:21:51,570][15401] Updated weights for policy 0, policy_version 255190 (0.0028) [2024-06-22 17:21:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42654.2). Total num frames: 4181114880. Throughput: 0: 42753.0. Samples: 4181182420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 17:21:55,117][15401] Updated weights for policy 0, policy_version 255200 (0.0032) [2024-06-22 17:21:58,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4181295104. Throughput: 0: 42708.5. Samples: 4181440960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 17:21:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 17:21:59,299][15401] Updated weights for policy 0, policy_version 255210 (0.0034) [2024-06-22 17:22:02,861][15401] Updated weights for policy 0, policy_version 255220 (0.0031) [2024-06-22 17:22:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4181524480. Throughput: 0: 42700.2. Samples: 4181695060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 17:22:06,839][15401] Updated weights for policy 0, policy_version 255230 (0.0037) [2024-06-22 17:22:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42654.6). Total num frames: 4181753856. Throughput: 0: 42811.5. Samples: 4181827600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 17:22:10,852][15401] Updated weights for policy 0, policy_version 255240 (0.0042) [2024-06-22 17:22:12,218][15349] Signal inference workers to stop experience collection... (62000 times) [2024-06-22 17:22:12,252][15401] InferenceWorker_p0-w0: stopping experience collection (62000 times) [2024-06-22 17:22:12,274][15349] Signal inference workers to resume experience collection... (62000 times) [2024-06-22 17:22:12,275][15401] InferenceWorker_p0-w0: resuming experience collection (62000 times) [2024-06-22 17:22:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4181950464. Throughput: 0: 42842.7. Samples: 4182085100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 17:22:14,382][15401] Updated weights for policy 0, policy_version 255250 (0.0028) [2024-06-22 17:22:18,322][15401] Updated weights for policy 0, policy_version 255260 (0.0037) [2024-06-22 17:22:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 4182179840. Throughput: 0: 42877.4. Samples: 4182342720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 17:22:21,941][15401] Updated weights for policy 0, policy_version 255270 (0.0033) [2024-06-22 17:22:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4182392832. Throughput: 0: 42920.4. Samples: 4182475380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 17:22:26,174][15401] Updated weights for policy 0, policy_version 255280 (0.0039) [2024-06-22 17:22:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4182605824. Throughput: 0: 42673.3. Samples: 4182725140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 17:22:29,639][15401] Updated weights for policy 0, policy_version 255290 (0.0033) [2024-06-22 17:22:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 4182802432. Throughput: 0: 42987.7. Samples: 4182986300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 17:22:33,825][15401] Updated weights for policy 0, policy_version 255300 (0.0037) [2024-06-22 17:22:37,350][15401] Updated weights for policy 0, policy_version 255310 (0.0033) [2024-06-22 17:22:38,393][15132] Fps is (10 sec: 44220.2, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 4183048192. Throughput: 0: 42891.9. Samples: 4183112720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:38,394][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 17:22:41,373][15401] Updated weights for policy 0, policy_version 255320 (0.0036) [2024-06-22 17:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4183244800. Throughput: 0: 42813.3. Samples: 4183367560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 17:22:45,026][15401] Updated weights for policy 0, policy_version 255330 (0.0054) [2024-06-22 17:22:48,389][15132] Fps is (10 sec: 39336.7, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 4183441408. Throughput: 0: 43015.2. Samples: 4183630740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:48,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 17:22:48,884][15401] Updated weights for policy 0, policy_version 255340 (0.0048) [2024-06-22 17:22:52,626][15401] Updated weights for policy 0, policy_version 255350 (0.0037) [2024-06-22 17:22:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4183687168. Throughput: 0: 42915.1. Samples: 4183758780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 17:22:56,454][15401] Updated weights for policy 0, policy_version 255360 (0.0035) [2024-06-22 17:22:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 4183900160. Throughput: 0: 42928.5. Samples: 4184016880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:22:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 17:23:00,337][15401] Updated weights for policy 0, policy_version 255370 (0.0027) [2024-06-22 17:23:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4184096768. Throughput: 0: 42977.9. Samples: 4184276720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:23:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 17:23:03,943][15401] Updated weights for policy 0, policy_version 255380 (0.0050) [2024-06-22 17:23:07,973][15401] Updated weights for policy 0, policy_version 255390 (0.0041) [2024-06-22 17:23:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4184326144. Throughput: 0: 42776.5. Samples: 4184400320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:23:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 17:23:11,419][15401] Updated weights for policy 0, policy_version 255400 (0.0033) [2024-06-22 17:23:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4184555520. Throughput: 0: 42898.3. Samples: 4184655560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:23:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 17:23:15,551][15401] Updated weights for policy 0, policy_version 255410 (0.0042) [2024-06-22 17:23:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4184752128. Throughput: 0: 42934.2. Samples: 4184918340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 17:23:19,050][15401] Updated weights for policy 0, policy_version 255420 (0.0027) [2024-06-22 17:23:23,047][15401] Updated weights for policy 0, policy_version 255430 (0.0037) [2024-06-22 17:23:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 4184965120. Throughput: 0: 43064.0. Samples: 4185050540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:23,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 17:23:26,688][15401] Updated weights for policy 0, policy_version 255440 (0.0044) [2024-06-22 17:23:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4185178112. Throughput: 0: 42967.1. Samples: 4185301080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:28,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 17:23:30,747][15401] Updated weights for policy 0, policy_version 255450 (0.0037) [2024-06-22 17:23:33,392][15132] Fps is (10 sec: 44237.0, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 4185407488. Throughput: 0: 42872.8. Samples: 4185560120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:33,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 17:23:34,557][15401] Updated weights for policy 0, policy_version 255460 (0.0045) [2024-06-22 17:23:38,201][15401] Updated weights for policy 0, policy_version 255470 (0.0028) [2024-06-22 17:23:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42872.4, 300 sec: 42820.2). Total num frames: 4185620480. Throughput: 0: 42931.4. Samples: 4185690800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:38,393][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 17:23:39,490][15349] Signal inference workers to stop experience collection... (62050 times) [2024-06-22 17:23:39,490][15349] Signal inference workers to resume experience collection... (62050 times) [2024-06-22 17:23:39,536][15401] InferenceWorker_p0-w0: stopping experience collection (62050 times) [2024-06-22 17:23:39,536][15401] InferenceWorker_p0-w0: resuming experience collection (62050 times) [2024-06-22 17:23:42,111][15401] Updated weights for policy 0, policy_version 255480 (0.0035) [2024-06-22 17:23:43,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4185833472. Throughput: 0: 42943.3. Samples: 4185949340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 17:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255483_4185833472.pth... [2024-06-22 17:23:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000254855_4175544320.pth [2024-06-22 17:23:46,290][15401] Updated weights for policy 0, policy_version 255490 (0.0038) [2024-06-22 17:23:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 4186046464. Throughput: 0: 42922.5. Samples: 4186208240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 17:23:49,702][15401] Updated weights for policy 0, policy_version 255500 (0.0044) [2024-06-22 17:23:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4186259456. Throughput: 0: 43098.7. Samples: 4186339760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 17:23:53,918][15401] Updated weights for policy 0, policy_version 255510 (0.0037) [2024-06-22 17:23:57,533][15401] Updated weights for policy 0, policy_version 255520 (0.0038) [2024-06-22 17:23:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 4186472448. Throughput: 0: 43005.2. Samples: 4186590900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:23:58,393][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 17:24:01,486][15401] Updated weights for policy 0, policy_version 255530 (0.0031) [2024-06-22 17:24:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4186685440. Throughput: 0: 42767.4. Samples: 4186842880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:24:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 17:24:05,362][15401] Updated weights for policy 0, policy_version 255540 (0.0050) [2024-06-22 17:24:08,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4186898432. Throughput: 0: 42722.2. Samples: 4186972940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:24:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 17:24:09,079][15401] Updated weights for policy 0, policy_version 255550 (0.0034) [2024-06-22 17:24:12,909][15401] Updated weights for policy 0, policy_version 255560 (0.0032) [2024-06-22 17:24:13,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42593.8, 300 sec: 42820.4). Total num frames: 4187111424. Throughput: 0: 42885.9. Samples: 4187231220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:24:13,396][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 17:24:16,703][15401] Updated weights for policy 0, policy_version 255570 (0.0044) [2024-06-22 17:24:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4187308032. Throughput: 0: 42797.0. Samples: 4187485880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:24:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 17:24:20,491][15401] Updated weights for policy 0, policy_version 255580 (0.0030) [2024-06-22 17:24:23,390][15132] Fps is (10 sec: 44264.7, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 4187553792. Throughput: 0: 42743.1. Samples: 4187614140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 17:24:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 17:24:24,551][15401] Updated weights for policy 0, policy_version 255590 (0.0039) [2024-06-22 17:24:28,242][15401] Updated weights for policy 0, policy_version 255600 (0.0029) [2024-06-22 17:24:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4187750400. Throughput: 0: 42704.2. Samples: 4187871020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:24:32,158][15401] Updated weights for policy 0, policy_version 255610 (0.0031) [2024-06-22 17:24:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 4187947008. Throughput: 0: 42567.5. Samples: 4188123880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:33,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 17:24:35,897][15401] Updated weights for policy 0, policy_version 255620 (0.0029) [2024-06-22 17:24:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 4188176384. Throughput: 0: 42349.8. Samples: 4188245500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:38,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-22 17:24:39,792][15401] Updated weights for policy 0, policy_version 255630 (0.0039) [2024-06-22 17:24:43,389][15132] Fps is (10 sec: 44248.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4188389376. Throughput: 0: 42574.3. Samples: 4188506640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 17:24:43,482][15401] Updated weights for policy 0, policy_version 255640 (0.0039) [2024-06-22 17:24:47,395][15401] Updated weights for policy 0, policy_version 255650 (0.0031) [2024-06-22 17:24:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 4188585984. Throughput: 0: 42599.3. Samples: 4188759840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 17:24:51,071][15401] Updated weights for policy 0, policy_version 255660 (0.0031) [2024-06-22 17:24:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4188815360. Throughput: 0: 42436.7. Samples: 4188882580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 17:24:55,188][15401] Updated weights for policy 0, policy_version 255670 (0.0044) [2024-06-22 17:24:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42876.4). Total num frames: 4189028352. Throughput: 0: 42550.9. Samples: 4189145740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:24:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 17:24:58,607][15401] Updated weights for policy 0, policy_version 255680 (0.0027) [2024-06-22 17:25:03,046][15401] Updated weights for policy 0, policy_version 255690 (0.0031) [2024-06-22 17:25:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4189241344. Throughput: 0: 42522.0. Samples: 4189399380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 17:25:06,609][15401] Updated weights for policy 0, policy_version 255700 (0.0039) [2024-06-22 17:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.8). Total num frames: 4189437952. Throughput: 0: 42436.5. Samples: 4189523780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 17:25:10,517][15401] Updated weights for policy 0, policy_version 255710 (0.0026) [2024-06-22 17:25:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 4189683712. Throughput: 0: 42481.7. Samples: 4189782700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:13,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 17:25:14,027][15401] Updated weights for policy 0, policy_version 255720 (0.0042) [2024-06-22 17:25:15,509][15349] Signal inference workers to stop experience collection... (62100 times) [2024-06-22 17:25:15,549][15401] InferenceWorker_p0-w0: stopping experience collection (62100 times) [2024-06-22 17:25:15,569][15349] Signal inference workers to resume experience collection... (62100 times) [2024-06-22 17:25:15,570][15401] InferenceWorker_p0-w0: resuming experience collection (62100 times) [2024-06-22 17:25:18,056][15401] Updated weights for policy 0, policy_version 255730 (0.0036) [2024-06-22 17:25:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4189880320. Throughput: 0: 42590.3. Samples: 4190040340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:18,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 17:25:22,346][15401] Updated weights for policy 0, policy_version 255740 (0.0029) [2024-06-22 17:25:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 4190076928. Throughput: 0: 42722.7. Samples: 4190168020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 17:25:25,741][15401] Updated weights for policy 0, policy_version 255750 (0.0034) [2024-06-22 17:25:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4190322688. Throughput: 0: 42614.9. Samples: 4190424320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 17:25:29,907][15401] Updated weights for policy 0, policy_version 255760 (0.0038) [2024-06-22 17:25:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 4190502912. Throughput: 0: 42797.7. Samples: 4190685740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 17:25:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 17:25:33,632][15401] Updated weights for policy 0, policy_version 255770 (0.0026) [2024-06-22 17:25:37,619][15401] Updated weights for policy 0, policy_version 255780 (0.0041) [2024-06-22 17:25:38,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4190715904. Throughput: 0: 42809.7. Samples: 4190809020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:25:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 17:25:41,140][15401] Updated weights for policy 0, policy_version 255790 (0.0032) [2024-06-22 17:25:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4190961664. Throughput: 0: 42759.1. Samples: 4191069900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:25:43,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 17:25:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255796_4190961664.pth... [2024-06-22 17:25:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255168_4180672512.pth [2024-06-22 17:25:45,213][15401] Updated weights for policy 0, policy_version 255800 (0.0036) [2024-06-22 17:25:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4191158272. Throughput: 0: 42787.7. Samples: 4191324820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:25:48,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 17:25:48,932][15401] Updated weights for policy 0, policy_version 255810 (0.0030) [2024-06-22 17:25:52,809][15401] Updated weights for policy 0, policy_version 255820 (0.0033) [2024-06-22 17:25:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4191371264. Throughput: 0: 42834.7. Samples: 4191451340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:25:53,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 17:25:56,553][15401] Updated weights for policy 0, policy_version 255830 (0.0032) [2024-06-22 17:25:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 4191584256. Throughput: 0: 42847.2. Samples: 4191710820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:25:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 17:26:00,325][15401] Updated weights for policy 0, policy_version 255840 (0.0030) [2024-06-22 17:26:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 4191797248. Throughput: 0: 42867.7. Samples: 4191969380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 17:26:04,063][15401] Updated weights for policy 0, policy_version 255850 (0.0037) [2024-06-22 17:26:07,810][15401] Updated weights for policy 0, policy_version 255860 (0.0028) [2024-06-22 17:26:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4192026624. Throughput: 0: 42907.5. Samples: 4192098860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 17:26:11,636][15401] Updated weights for policy 0, policy_version 255870 (0.0031) [2024-06-22 17:26:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 4192239616. Throughput: 0: 42992.1. Samples: 4192358960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 17:26:15,478][15401] Updated weights for policy 0, policy_version 255880 (0.0024) [2024-06-22 17:26:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4192436224. Throughput: 0: 43014.7. Samples: 4192621400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:26:19,134][15401] Updated weights for policy 0, policy_version 255890 (0.0027) [2024-06-22 17:26:22,823][15401] Updated weights for policy 0, policy_version 255900 (0.0036) [2024-06-22 17:26:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4192665600. Throughput: 0: 43122.3. Samples: 4192749520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 17:26:26,920][15401] Updated weights for policy 0, policy_version 255910 (0.0031) [2024-06-22 17:26:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 4192878592. Throughput: 0: 43059.4. Samples: 4193007580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 17:26:30,438][15401] Updated weights for policy 0, policy_version 255920 (0.0031) [2024-06-22 17:26:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4193091584. Throughput: 0: 43264.4. Samples: 4193271720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:33,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 17:26:34,371][15401] Updated weights for policy 0, policy_version 255930 (0.0029) [2024-06-22 17:26:38,309][15401] Updated weights for policy 0, policy_version 255940 (0.0052) [2024-06-22 17:26:38,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4193320960. Throughput: 0: 43215.1. Samples: 4193396020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 17:26:41,950][15401] Updated weights for policy 0, policy_version 255950 (0.0033) [2024-06-22 17:26:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4193533952. Throughput: 0: 43108.8. Samples: 4193650720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 17:26:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 17:26:45,806][15401] Updated weights for policy 0, policy_version 255960 (0.0040) [2024-06-22 17:26:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4193730560. Throughput: 0: 43171.0. Samples: 4193912080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:26:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 17:26:49,568][15401] Updated weights for policy 0, policy_version 255970 (0.0027) [2024-06-22 17:26:53,227][15401] Updated weights for policy 0, policy_version 255980 (0.0029) [2024-06-22 17:26:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 4193976320. Throughput: 0: 43089.2. Samples: 4194037880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:26:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 17:26:57,251][15401] Updated weights for policy 0, policy_version 255990 (0.0044) [2024-06-22 17:26:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 4194172928. Throughput: 0: 42930.6. Samples: 4194290940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:26:58,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 17:27:00,895][15401] Updated weights for policy 0, policy_version 256000 (0.0032) [2024-06-22 17:27:02,316][15349] Signal inference workers to stop experience collection... (62150 times) [2024-06-22 17:27:02,348][15401] InferenceWorker_p0-w0: stopping experience collection (62150 times) [2024-06-22 17:27:02,383][15349] Signal inference workers to resume experience collection... (62150 times) [2024-06-22 17:27:02,383][15401] InferenceWorker_p0-w0: resuming experience collection (62150 times) [2024-06-22 17:27:03,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4194353152. Throughput: 0: 42918.6. Samples: 4194552740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 17:27:04,894][15401] Updated weights for policy 0, policy_version 256010 (0.0032) [2024-06-22 17:27:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4194615296. Throughput: 0: 42891.4. Samples: 4194679640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 17:27:08,867][15401] Updated weights for policy 0, policy_version 256020 (0.0027) [2024-06-22 17:27:12,497][15401] Updated weights for policy 0, policy_version 256030 (0.0028) [2024-06-22 17:27:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4194811904. Throughput: 0: 42763.2. Samples: 4194931920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:13,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-22 17:27:16,404][15401] Updated weights for policy 0, policy_version 256040 (0.0041) [2024-06-22 17:27:18,390][15132] Fps is (10 sec: 37681.8, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 4194992128. Throughput: 0: 42669.8. Samples: 4195191880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:18,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 17:27:20,322][15401] Updated weights for policy 0, policy_version 256050 (0.0021) [2024-06-22 17:27:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4195254272. Throughput: 0: 42683.6. Samples: 4195316780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 17:27:24,373][15401] Updated weights for policy 0, policy_version 256060 (0.0042) [2024-06-22 17:27:28,358][15401] Updated weights for policy 0, policy_version 256070 (0.0043) [2024-06-22 17:27:28,390][15132] Fps is (10 sec: 45877.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4195450880. Throughput: 0: 42742.2. Samples: 4195574120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 17:27:32,062][15401] Updated weights for policy 0, policy_version 256080 (0.0030) [2024-06-22 17:27:33,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.7, 300 sec: 42709.7). Total num frames: 4195647488. Throughput: 0: 42454.6. Samples: 4195822640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:33,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 17:27:36,524][15401] Updated weights for policy 0, policy_version 256090 (0.0028) [2024-06-22 17:27:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4195893248. Throughput: 0: 42588.1. Samples: 4195954340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 17:27:39,559][15401] Updated weights for policy 0, policy_version 256100 (0.0034) [2024-06-22 17:27:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4196073472. Throughput: 0: 42671.6. Samples: 4196211060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 17:27:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256109_4196089856.pth... [2024-06-22 17:27:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255483_4185833472.pth [2024-06-22 17:27:44,110][15401] Updated weights for policy 0, policy_version 256110 (0.0036) [2024-06-22 17:27:47,140][15401] Updated weights for policy 0, policy_version 256120 (0.0026) [2024-06-22 17:27:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4196302848. Throughput: 0: 42317.8. Samples: 4196457040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 17:27:51,806][15401] Updated weights for policy 0, policy_version 256130 (0.0039) [2024-06-22 17:27:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4196515840. Throughput: 0: 42539.2. Samples: 4196593900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:27:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:27:54,731][15401] Updated weights for policy 0, policy_version 256140 (0.0034) [2024-06-22 17:27:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42325.4, 300 sec: 42764.7). Total num frames: 4196712448. Throughput: 0: 42562.7. Samples: 4196847340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:27:58,392][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 17:27:59,676][15401] Updated weights for policy 0, policy_version 256150 (0.0039) [2024-06-22 17:28:02,279][15401] Updated weights for policy 0, policy_version 256160 (0.0033) [2024-06-22 17:28:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4196941824. Throughput: 0: 42268.4. Samples: 4197093940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 17:28:07,282][15401] Updated weights for policy 0, policy_version 256170 (0.0036) [2024-06-22 17:28:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4197138432. Throughput: 0: 42433.7. Samples: 4197226300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 17:28:09,763][15401] Updated weights for policy 0, policy_version 256180 (0.0029) [2024-06-22 17:28:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4197351424. Throughput: 0: 42473.3. Samples: 4197485420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:28:14,872][15349] Signal inference workers to stop experience collection... (62200 times) [2024-06-22 17:28:14,899][15401] InferenceWorker_p0-w0: stopping experience collection (62200 times) [2024-06-22 17:28:14,937][15349] Signal inference workers to resume experience collection... (62200 times) [2024-06-22 17:28:14,939][15401] InferenceWorker_p0-w0: resuming experience collection (62200 times) [2024-06-22 17:28:14,942][15401] Updated weights for policy 0, policy_version 256190 (0.0029) [2024-06-22 17:28:17,789][15401] Updated weights for policy 0, policy_version 256200 (0.0028) [2024-06-22 17:28:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.9, 300 sec: 42820.9). Total num frames: 4197597184. Throughput: 0: 42432.9. Samples: 4197732020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:28:22,380][15401] Updated weights for policy 0, policy_version 256210 (0.0029) [2024-06-22 17:28:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 4197777408. Throughput: 0: 42502.1. Samples: 4197866940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 17:28:25,497][15401] Updated weights for policy 0, policy_version 256220 (0.0023) [2024-06-22 17:28:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4197990400. Throughput: 0: 42505.0. Samples: 4198123780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 17:28:30,112][15401] Updated weights for policy 0, policy_version 256230 (0.0032) [2024-06-22 17:28:33,146][15401] Updated weights for policy 0, policy_version 256240 (0.0024) [2024-06-22 17:28:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 4198236160. Throughput: 0: 42680.4. Samples: 4198377660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:33,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 17:28:37,704][15401] Updated weights for policy 0, policy_version 256250 (0.0034) [2024-06-22 17:28:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4198432768. Throughput: 0: 42666.3. Samples: 4198513880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 17:28:40,754][15401] Updated weights for policy 0, policy_version 256260 (0.0030) [2024-06-22 17:28:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4198629376. Throughput: 0: 42668.0. Samples: 4198767300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-22 17:28:45,477][15401] Updated weights for policy 0, policy_version 256270 (0.0031) [2024-06-22 17:28:48,341][15401] Updated weights for policy 0, policy_version 256280 (0.0030) [2024-06-22 17:28:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4198891520. Throughput: 0: 42786.7. Samples: 4199019340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 17:28:52,960][15401] Updated weights for policy 0, policy_version 256290 (0.0034) [2024-06-22 17:28:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 4199071744. Throughput: 0: 42924.0. Samples: 4199157980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:53,401][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:28:55,891][15401] Updated weights for policy 0, policy_version 256300 (0.0027) [2024-06-22 17:28:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 4199268352. Throughput: 0: 42736.9. Samples: 4199408580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:28:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 17:29:00,539][15401] Updated weights for policy 0, policy_version 256310 (0.0032) [2024-06-22 17:29:03,392][15132] Fps is (10 sec: 45875.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4199530496. Throughput: 0: 42836.0. Samples: 4199659740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:29:03,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 17:29:03,483][15401] Updated weights for policy 0, policy_version 256320 (0.0035) [2024-06-22 17:29:08,070][15401] Updated weights for policy 0, policy_version 256330 (0.0025) [2024-06-22 17:29:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 4199710720. Throughput: 0: 43013.4. Samples: 4199802540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-22 17:29:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:29:11,626][15401] Updated weights for policy 0, policy_version 256340 (0.0035) [2024-06-22 17:29:13,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4199907328. Throughput: 0: 42862.6. Samples: 4200052600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 17:29:15,839][15401] Updated weights for policy 0, policy_version 256350 (0.0041) [2024-06-22 17:29:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4200153088. Throughput: 0: 42945.8. Samples: 4200310220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 17:29:19,229][15401] Updated weights for policy 0, policy_version 256360 (0.0029) [2024-06-22 17:29:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4200349696. Throughput: 0: 42875.0. Samples: 4200443260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 17:29:23,452][15401] Updated weights for policy 0, policy_version 256370 (0.0035) [2024-06-22 17:29:27,026][15401] Updated weights for policy 0, policy_version 256380 (0.0044) [2024-06-22 17:29:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 4200546304. Throughput: 0: 42784.0. Samples: 4200692580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:28,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 17:29:30,933][15401] Updated weights for policy 0, policy_version 256390 (0.0026) [2024-06-22 17:29:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4200808448. Throughput: 0: 42861.2. Samples: 4200948100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 17:29:34,705][15401] Updated weights for policy 0, policy_version 256400 (0.0039) [2024-06-22 17:29:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4200988672. Throughput: 0: 42830.3. Samples: 4201085240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:38,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-22 17:29:38,835][15401] Updated weights for policy 0, policy_version 256410 (0.0030) [2024-06-22 17:29:40,105][15349] Signal inference workers to stop experience collection... (62250 times) [2024-06-22 17:29:40,105][15349] Signal inference workers to resume experience collection... (62250 times) [2024-06-22 17:29:40,154][15401] InferenceWorker_p0-w0: stopping experience collection (62250 times) [2024-06-22 17:29:40,154][15401] InferenceWorker_p0-w0: resuming experience collection (62250 times) [2024-06-22 17:29:42,318][15401] Updated weights for policy 0, policy_version 256420 (0.0039) [2024-06-22 17:29:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4201201664. Throughput: 0: 42772.9. Samples: 4201333360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 17:29:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256421_4201201664.pth... [2024-06-22 17:29:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000255796_4190961664.pth [2024-06-22 17:29:46,283][15401] Updated weights for policy 0, policy_version 256430 (0.0035) [2024-06-22 17:29:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4201447424. Throughput: 0: 42942.8. Samples: 4201592060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 17:29:49,851][15401] Updated weights for policy 0, policy_version 256440 (0.0024) [2024-06-22 17:29:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 4201644032. Throughput: 0: 42814.2. Samples: 4201729280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:53,393][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:29:53,871][15401] Updated weights for policy 0, policy_version 256450 (0.0033) [2024-06-22 17:29:57,433][15401] Updated weights for policy 0, policy_version 256460 (0.0044) [2024-06-22 17:29:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4201857024. Throughput: 0: 42841.4. Samples: 4201980460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:29:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 17:30:01,745][15401] Updated weights for policy 0, policy_version 256470 (0.0030) [2024-06-22 17:30:03,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 4202102784. Throughput: 0: 42850.6. Samples: 4202238600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:30:03,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 17:30:05,636][15401] Updated weights for policy 0, policy_version 256480 (0.0031) [2024-06-22 17:30:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4202283008. Throughput: 0: 42829.8. Samples: 4202370600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:30:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:30:09,182][15401] Updated weights for policy 0, policy_version 256490 (0.0035) [2024-06-22 17:30:13,108][15401] Updated weights for policy 0, policy_version 256500 (0.0033) [2024-06-22 17:30:13,390][15132] Fps is (10 sec: 39330.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4202496000. Throughput: 0: 43017.2. Samples: 4202628360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:30:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 17:30:16,766][15401] Updated weights for policy 0, policy_version 256510 (0.0039) [2024-06-22 17:30:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4202741760. Throughput: 0: 43080.5. Samples: 4202886720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 17:30:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 17:30:20,712][15401] Updated weights for policy 0, policy_version 256520 (0.0033) [2024-06-22 17:30:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4202921984. Throughput: 0: 43046.5. Samples: 4203022340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 17:30:24,380][15401] Updated weights for policy 0, policy_version 256530 (0.0036) [2024-06-22 17:30:28,244][15401] Updated weights for policy 0, policy_version 256540 (0.0033) [2024-06-22 17:30:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4203151360. Throughput: 0: 43124.0. Samples: 4203273940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 17:30:32,102][15401] Updated weights for policy 0, policy_version 256550 (0.0040) [2024-06-22 17:30:33,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4203397120. Throughput: 0: 43153.8. Samples: 4203533980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 17:30:35,773][15401] Updated weights for policy 0, policy_version 256560 (0.0041) [2024-06-22 17:30:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 4203577344. Throughput: 0: 43060.3. Samples: 4203666880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 17:30:39,923][15401] Updated weights for policy 0, policy_version 256570 (0.0042) [2024-06-22 17:30:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4203790336. Throughput: 0: 43001.3. Samples: 4203915520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 17:30:43,755][15401] Updated weights for policy 0, policy_version 256580 (0.0031) [2024-06-22 17:30:47,443][15401] Updated weights for policy 0, policy_version 256590 (0.0032) [2024-06-22 17:30:48,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4204019712. Throughput: 0: 43084.0. Samples: 4204177280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:48,391][15132] Avg episode reward: [(0, '0.210')] [2024-06-22 17:30:51,440][15401] Updated weights for policy 0, policy_version 256600 (0.0033) [2024-06-22 17:30:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 4204216320. Throughput: 0: 43103.1. Samples: 4204310240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:30:54,849][15401] Updated weights for policy 0, policy_version 256610 (0.0027) [2024-06-22 17:30:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4204429312. Throughput: 0: 42787.9. Samples: 4204553820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:30:58,391][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 17:30:58,829][15401] Updated weights for policy 0, policy_version 256620 (0.0036) [2024-06-22 17:31:03,024][15401] Updated weights for policy 0, policy_version 256630 (0.0037) [2024-06-22 17:31:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 4204642304. Throughput: 0: 43022.3. Samples: 4204822720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 17:31:04,075][15349] Signal inference workers to stop experience collection... (62300 times) [2024-06-22 17:31:04,127][15401] InferenceWorker_p0-w0: stopping experience collection (62300 times) [2024-06-22 17:31:04,127][15349] Signal inference workers to resume experience collection... (62300 times) [2024-06-22 17:31:04,139][15401] InferenceWorker_p0-w0: resuming experience collection (62300 times) [2024-06-22 17:31:06,669][15401] Updated weights for policy 0, policy_version 256640 (0.0037) [2024-06-22 17:31:08,390][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4204871680. Throughput: 0: 42815.6. Samples: 4204949040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 17:31:10,537][15401] Updated weights for policy 0, policy_version 256650 (0.0049) [2024-06-22 17:31:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 4205084672. Throughput: 0: 42816.9. Samples: 4205200700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 17:31:14,136][15401] Updated weights for policy 0, policy_version 256660 (0.0026) [2024-06-22 17:31:18,070][15401] Updated weights for policy 0, policy_version 256670 (0.0040) [2024-06-22 17:31:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4205297664. Throughput: 0: 42883.5. Samples: 4205463740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 17:31:21,742][15401] Updated weights for policy 0, policy_version 256680 (0.0030) [2024-06-22 17:31:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4205494272. Throughput: 0: 42684.7. Samples: 4205587700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 17:31:25,688][15401] Updated weights for policy 0, policy_version 256690 (0.0040) [2024-06-22 17:31:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4205740032. Throughput: 0: 42887.2. Samples: 4205845440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 17:31:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 17:31:29,393][15401] Updated weights for policy 0, policy_version 256700 (0.0031) [2024-06-22 17:31:33,297][15401] Updated weights for policy 0, policy_version 256710 (0.0031) [2024-06-22 17:31:33,390][15132] Fps is (10 sec: 44234.4, 60 sec: 42324.9, 300 sec: 42764.9). Total num frames: 4205936640. Throughput: 0: 42910.2. Samples: 4206108260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:33,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 17:31:37,048][15401] Updated weights for policy 0, policy_version 256720 (0.0034) [2024-06-22 17:31:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4206149632. Throughput: 0: 42617.4. Samples: 4206228020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:38,404][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:31:41,250][15401] Updated weights for policy 0, policy_version 256730 (0.0037) [2024-06-22 17:31:43,390][15132] Fps is (10 sec: 45877.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4206395392. Throughput: 0: 42903.8. Samples: 4206484480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:43,399][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 17:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256738_4206395392.pth... [2024-06-22 17:31:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256109_4196089856.pth [2024-06-22 17:31:44,614][15401] Updated weights for policy 0, policy_version 256740 (0.0035) [2024-06-22 17:31:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 4206559232. Throughput: 0: 42826.2. Samples: 4206749900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 17:31:48,678][15401] Updated weights for policy 0, policy_version 256750 (0.0028) [2024-06-22 17:31:52,167][15401] Updated weights for policy 0, policy_version 256760 (0.0038) [2024-06-22 17:31:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4206788608. Throughput: 0: 42661.0. Samples: 4206868780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 17:31:56,197][15401] Updated weights for policy 0, policy_version 256770 (0.0037) [2024-06-22 17:31:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 4207017984. Throughput: 0: 42962.6. Samples: 4207134020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:31:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 17:31:59,712][15401] Updated weights for policy 0, policy_version 256780 (0.0029) [2024-06-22 17:32:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4207214592. Throughput: 0: 42768.9. Samples: 4207388340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 17:32:03,761][15401] Updated weights for policy 0, policy_version 256790 (0.0036) [2024-06-22 17:32:07,418][15401] Updated weights for policy 0, policy_version 256800 (0.0026) [2024-06-22 17:32:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4207443968. Throughput: 0: 42893.8. Samples: 4207517920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 17:32:11,483][15401] Updated weights for policy 0, policy_version 256810 (0.0041) [2024-06-22 17:32:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4207673344. Throughput: 0: 42976.8. Samples: 4207779400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 17:32:14,938][15401] Updated weights for policy 0, policy_version 256820 (0.0035) [2024-06-22 17:32:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4207869952. Throughput: 0: 42835.7. Samples: 4208035840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:18,396][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 17:32:18,937][15401] Updated weights for policy 0, policy_version 256830 (0.0026) [2024-06-22 17:32:22,641][15401] Updated weights for policy 0, policy_version 256840 (0.0026) [2024-06-22 17:32:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4208082944. Throughput: 0: 43021.3. Samples: 4208163980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 17:32:24,270][15349] Signal inference workers to stop experience collection... (62350 times) [2024-06-22 17:32:24,298][15401] InferenceWorker_p0-w0: stopping experience collection (62350 times) [2024-06-22 17:32:24,317][15349] Signal inference workers to resume experience collection... (62350 times) [2024-06-22 17:32:24,319][15401] InferenceWorker_p0-w0: resuming experience collection (62350 times) [2024-06-22 17:32:26,499][15401] Updated weights for policy 0, policy_version 256850 (0.0037) [2024-06-22 17:32:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 4208312320. Throughput: 0: 43168.5. Samples: 4208427060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 17:32:30,372][15401] Updated weights for policy 0, policy_version 256860 (0.0023) [2024-06-22 17:32:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.9, 300 sec: 42820.5). Total num frames: 4208525312. Throughput: 0: 42891.4. Samples: 4208680020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 17:32:34,053][15401] Updated weights for policy 0, policy_version 256870 (0.0027) [2024-06-22 17:32:38,002][15401] Updated weights for policy 0, policy_version 256880 (0.0037) [2024-06-22 17:32:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4208738304. Throughput: 0: 43163.6. Samples: 4208811140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 17:32:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 17:32:41,673][15401] Updated weights for policy 0, policy_version 256890 (0.0042) [2024-06-22 17:32:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4208967680. Throughput: 0: 42939.0. Samples: 4209066280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:32:43,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 17:32:45,481][15401] Updated weights for policy 0, policy_version 256900 (0.0024) [2024-06-22 17:32:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4209164288. Throughput: 0: 43038.7. Samples: 4209325080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:32:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 17:32:49,338][15401] Updated weights for policy 0, policy_version 256910 (0.0036) [2024-06-22 17:32:53,183][15401] Updated weights for policy 0, policy_version 256920 (0.0032) [2024-06-22 17:32:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 4209377280. Throughput: 0: 43049.8. Samples: 4209455160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:32:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 17:32:57,057][15401] Updated weights for policy 0, policy_version 256930 (0.0037) [2024-06-22 17:32:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 4209590272. Throughput: 0: 42959.9. Samples: 4209712700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:32:58,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 17:33:00,706][15401] Updated weights for policy 0, policy_version 256940 (0.0048) [2024-06-22 17:33:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4209819648. Throughput: 0: 43062.2. Samples: 4209973640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 17:33:04,896][15401] Updated weights for policy 0, policy_version 256950 (0.0028) [2024-06-22 17:33:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4210016256. Throughput: 0: 43161.4. Samples: 4210106240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 17:33:08,894][15401] Updated weights for policy 0, policy_version 256960 (0.0035) [2024-06-22 17:33:12,470][15401] Updated weights for policy 0, policy_version 256970 (0.0048) [2024-06-22 17:33:13,390][15132] Fps is (10 sec: 42594.9, 60 sec: 42870.9, 300 sec: 42876.0). Total num frames: 4210245632. Throughput: 0: 43020.1. Samples: 4210363000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:13,391][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 17:33:16,319][15401] Updated weights for policy 0, policy_version 256980 (0.0036) [2024-06-22 17:33:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4210458624. Throughput: 0: 43066.4. Samples: 4210618000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 17:33:20,034][15401] Updated weights for policy 0, policy_version 256990 (0.0031) [2024-06-22 17:33:23,390][15132] Fps is (10 sec: 40963.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4210655232. Throughput: 0: 43139.0. Samples: 4210752400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 17:33:23,786][15401] Updated weights for policy 0, policy_version 257000 (0.0025) [2024-06-22 17:33:27,576][15401] Updated weights for policy 0, policy_version 257010 (0.0028) [2024-06-22 17:33:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4210868224. Throughput: 0: 43148.9. Samples: 4211007980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 17:33:31,480][15401] Updated weights for policy 0, policy_version 257020 (0.0036) [2024-06-22 17:33:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4211113984. Throughput: 0: 43097.8. Samples: 4211264480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 17:33:35,051][15401] Updated weights for policy 0, policy_version 257030 (0.0032) [2024-06-22 17:33:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 4211310592. Throughput: 0: 43086.6. Samples: 4211394060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 17:33:38,882][15349] Signal inference workers to stop experience collection... (62400 times) [2024-06-22 17:33:38,912][15401] InferenceWorker_p0-w0: stopping experience collection (62400 times) [2024-06-22 17:33:38,942][15349] Signal inference workers to resume experience collection... (62400 times) [2024-06-22 17:33:38,942][15401] InferenceWorker_p0-w0: resuming experience collection (62400 times) [2024-06-22 17:33:38,945][15401] Updated weights for policy 0, policy_version 257040 (0.0038) [2024-06-22 17:33:42,457][15401] Updated weights for policy 0, policy_version 257050 (0.0032) [2024-06-22 17:33:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 4211523584. Throughput: 0: 43094.6. Samples: 4211651960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:43,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 17:33:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257051_4211523584.pth... [2024-06-22 17:33:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256421_4201201664.pth [2024-06-22 17:33:46,336][15401] Updated weights for policy 0, policy_version 257060 (0.0033) [2024-06-22 17:33:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 4211752960. Throughput: 0: 43115.5. Samples: 4211913840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-22 17:33:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 17:33:50,321][15401] Updated weights for policy 0, policy_version 257070 (0.0039) [2024-06-22 17:33:53,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4211965952. Throughput: 0: 43053.3. Samples: 4212043640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:33:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 17:33:54,178][15401] Updated weights for policy 0, policy_version 257080 (0.0035) [2024-06-22 17:33:58,002][15401] Updated weights for policy 0, policy_version 257090 (0.0034) [2024-06-22 17:33:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42876.5). Total num frames: 4212178944. Throughput: 0: 43146.1. Samples: 4212304540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:33:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 17:34:01,739][15401] Updated weights for policy 0, policy_version 257100 (0.0029) [2024-06-22 17:34:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4212391936. Throughput: 0: 43088.5. Samples: 4212556980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:34:05,543][15401] Updated weights for policy 0, policy_version 257110 (0.0034) [2024-06-22 17:34:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 4212604928. Throughput: 0: 43052.8. Samples: 4212689780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 17:34:09,104][15401] Updated weights for policy 0, policy_version 257120 (0.0041) [2024-06-22 17:34:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.9, 300 sec: 42876.1). Total num frames: 4212801536. Throughput: 0: 43153.8. Samples: 4212949900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 17:34:13,556][15401] Updated weights for policy 0, policy_version 257130 (0.0031) [2024-06-22 17:34:16,388][15401] Updated weights for policy 0, policy_version 257140 (0.0032) [2024-06-22 17:34:18,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 4213063680. Throughput: 0: 43133.7. Samples: 4213205500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 17:34:21,159][15401] Updated weights for policy 0, policy_version 257150 (0.0030) [2024-06-22 17:34:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 4213260288. Throughput: 0: 43414.4. Samples: 4213347700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 17:34:23,883][15401] Updated weights for policy 0, policy_version 257160 (0.0040) [2024-06-22 17:34:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 4213456896. Throughput: 0: 43231.1. Samples: 4213597360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:28,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-22 17:34:28,651][15401] Updated weights for policy 0, policy_version 257170 (0.0032) [2024-06-22 17:34:31,417][15401] Updated weights for policy 0, policy_version 257180 (0.0039) [2024-06-22 17:34:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 4213702656. Throughput: 0: 43145.6. Samples: 4213855400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:33,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 17:34:36,177][15401] Updated weights for policy 0, policy_version 257190 (0.0023) [2024-06-22 17:34:38,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 4213915648. Throughput: 0: 43304.5. Samples: 4213992340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 17:34:39,061][15401] Updated weights for policy 0, policy_version 257200 (0.0039) [2024-06-22 17:34:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 4214112256. Throughput: 0: 43059.6. Samples: 4214242220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 17:34:43,697][15401] Updated weights for policy 0, policy_version 257210 (0.0032) [2024-06-22 17:34:46,834][15401] Updated weights for policy 0, policy_version 257220 (0.0022) [2024-06-22 17:34:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 4214341632. Throughput: 0: 43073.3. Samples: 4214495280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 17:34:51,570][15401] Updated weights for policy 0, policy_version 257230 (0.0033) [2024-06-22 17:34:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4214538240. Throughput: 0: 43112.0. Samples: 4214629820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 17:34:54,359][15401] Updated weights for policy 0, policy_version 257240 (0.0020) [2024-06-22 17:34:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 4214751232. Throughput: 0: 42932.9. Samples: 4214881880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-22 17:34:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 17:34:59,166][15401] Updated weights for policy 0, policy_version 257250 (0.0037) [2024-06-22 17:35:00,945][15349] Signal inference workers to stop experience collection... (62450 times) [2024-06-22 17:35:00,945][15349] Signal inference workers to resume experience collection... (62450 times) [2024-06-22 17:35:00,985][15401] InferenceWorker_p0-w0: stopping experience collection (62450 times) [2024-06-22 17:35:00,986][15401] InferenceWorker_p0-w0: resuming experience collection (62450 times) [2024-06-22 17:35:02,016][15401] Updated weights for policy 0, policy_version 257260 (0.0037) [2024-06-22 17:35:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4214980608. Throughput: 0: 43040.0. Samples: 4215142300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 17:35:06,618][15401] Updated weights for policy 0, policy_version 257270 (0.0035) [2024-06-22 17:35:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 4215177216. Throughput: 0: 42973.8. Samples: 4215281520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 17:35:09,875][15401] Updated weights for policy 0, policy_version 257280 (0.0035) [2024-06-22 17:35:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4215406592. Throughput: 0: 43040.0. Samples: 4215534060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 17:35:14,061][15401] Updated weights for policy 0, policy_version 257290 (0.0027) [2024-06-22 17:35:17,713][15401] Updated weights for policy 0, policy_version 257300 (0.0048) [2024-06-22 17:35:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 4215635968. Throughput: 0: 43105.9. Samples: 4215795160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 17:35:21,586][15401] Updated weights for policy 0, policy_version 257310 (0.0026) [2024-06-22 17:35:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4215832576. Throughput: 0: 42986.1. Samples: 4215926720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 17:35:25,245][15401] Updated weights for policy 0, policy_version 257320 (0.0030) [2024-06-22 17:35:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 4216061952. Throughput: 0: 43044.3. Samples: 4216179220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:28,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-22 17:35:29,090][15401] Updated weights for policy 0, policy_version 257330 (0.0030) [2024-06-22 17:35:33,052][15401] Updated weights for policy 0, policy_version 257340 (0.0038) [2024-06-22 17:35:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 4216291328. Throughput: 0: 43312.8. Samples: 4216444360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:33,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 17:35:36,514][15401] Updated weights for policy 0, policy_version 257350 (0.0042) [2024-06-22 17:35:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 4216471552. Throughput: 0: 43196.5. Samples: 4216573660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:38,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 17:35:40,479][15401] Updated weights for policy 0, policy_version 257360 (0.0027) [2024-06-22 17:35:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 4216717312. Throughput: 0: 43208.4. Samples: 4216826260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 17:35:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257368_4216717312.pth... [2024-06-22 17:35:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000256738_4206395392.pth [2024-06-22 17:35:44,333][15401] Updated weights for policy 0, policy_version 257370 (0.0048) [2024-06-22 17:35:48,084][15401] Updated weights for policy 0, policy_version 257380 (0.0024) [2024-06-22 17:35:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4216913920. Throughput: 0: 43067.1. Samples: 4217080320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:48,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 17:35:51,901][15401] Updated weights for policy 0, policy_version 257390 (0.0026) [2024-06-22 17:35:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4217110528. Throughput: 0: 42778.6. Samples: 4217206560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 17:35:55,505][15401] Updated weights for policy 0, policy_version 257400 (0.0040) [2024-06-22 17:35:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 4217356288. Throughput: 0: 42987.1. Samples: 4217468480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:35:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:35:59,390][15401] Updated weights for policy 0, policy_version 257410 (0.0036) [2024-06-22 17:36:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4217552896. Throughput: 0: 42856.8. Samples: 4217723720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:36:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 17:36:03,521][15401] Updated weights for policy 0, policy_version 257420 (0.0024) [2024-06-22 17:36:06,913][15401] Updated weights for policy 0, policy_version 257430 (0.0037) [2024-06-22 17:36:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4217749504. Throughput: 0: 42800.0. Samples: 4217852720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:36:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 17:36:10,979][15401] Updated weights for policy 0, policy_version 257440 (0.0027) [2024-06-22 17:36:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 4218011648. Throughput: 0: 42990.2. Samples: 4218113780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 17:36:13,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 17:36:15,227][15401] Updated weights for policy 0, policy_version 257450 (0.0040) [2024-06-22 17:36:18,360][15349] Signal inference workers to stop experience collection... (62500 times) [2024-06-22 17:36:18,360][15349] Signal inference workers to resume experience collection... (62500 times) [2024-06-22 17:36:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 43098.3). Total num frames: 4218208256. Throughput: 0: 42797.9. Samples: 4218370260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 17:36:18,390][15401] InferenceWorker_p0-w0: stopping experience collection (62500 times) [2024-06-22 17:36:18,390][15401] InferenceWorker_p0-w0: resuming experience collection (62500 times) [2024-06-22 17:36:18,498][15401] Updated weights for policy 0, policy_version 257460 (0.0033) [2024-06-22 17:36:22,808][15401] Updated weights for policy 0, policy_version 257470 (0.0035) [2024-06-22 17:36:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4218404864. Throughput: 0: 42650.7. Samples: 4218492940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:23,395][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 17:36:26,231][15401] Updated weights for policy 0, policy_version 257480 (0.0032) [2024-06-22 17:36:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 4218650624. Throughput: 0: 42871.6. Samples: 4218755480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:28,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-22 17:36:30,264][15401] Updated weights for policy 0, policy_version 257490 (0.0032) [2024-06-22 17:36:33,390][15132] Fps is (10 sec: 44233.0, 60 sec: 42597.9, 300 sec: 43042.6). Total num frames: 4218847232. Throughput: 0: 42914.4. Samples: 4219011500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:33,391][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 17:36:33,929][15401] Updated weights for policy 0, policy_version 257500 (0.0031) [2024-06-22 17:36:37,831][15401] Updated weights for policy 0, policy_version 257510 (0.0023) [2024-06-22 17:36:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4219060224. Throughput: 0: 42822.3. Samples: 4219133560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:38,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 17:36:41,686][15401] Updated weights for policy 0, policy_version 257520 (0.0026) [2024-06-22 17:36:43,389][15132] Fps is (10 sec: 44241.2, 60 sec: 42871.6, 300 sec: 43153.8). Total num frames: 4219289600. Throughput: 0: 42846.8. Samples: 4219396580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 17:36:45,250][15401] Updated weights for policy 0, policy_version 257530 (0.0033) [2024-06-22 17:36:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 4219486208. Throughput: 0: 42837.8. Samples: 4219651420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:48,391][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 17:36:49,360][15401] Updated weights for policy 0, policy_version 257540 (0.0032) [2024-06-22 17:36:52,778][15401] Updated weights for policy 0, policy_version 257550 (0.0033) [2024-06-22 17:36:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4219699200. Throughput: 0: 42731.4. Samples: 4219775640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 17:36:56,898][15401] Updated weights for policy 0, policy_version 257560 (0.0042) [2024-06-22 17:36:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 4219928576. Throughput: 0: 42796.1. Samples: 4220039600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:36:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 17:37:00,443][15401] Updated weights for policy 0, policy_version 257570 (0.0033) [2024-06-22 17:37:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4220125184. Throughput: 0: 42852.0. Samples: 4220298600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:37:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 17:37:04,724][15401] Updated weights for policy 0, policy_version 257580 (0.0031) [2024-06-22 17:37:08,373][15401] Updated weights for policy 0, policy_version 257590 (0.0035) [2024-06-22 17:37:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 4220354560. Throughput: 0: 42894.1. Samples: 4220423180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:37:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 17:37:12,322][15401] Updated weights for policy 0, policy_version 257600 (0.0024) [2024-06-22 17:37:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 4220567552. Throughput: 0: 42940.0. Samples: 4220687780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:37:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 17:37:16,317][15401] Updated weights for policy 0, policy_version 257610 (0.0038) [2024-06-22 17:37:18,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 4220796928. Throughput: 0: 42763.5. Samples: 4220935820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:37:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 17:37:19,891][15401] Updated weights for policy 0, policy_version 257620 (0.0042) [2024-06-22 17:37:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4220977152. Throughput: 0: 42949.7. Samples: 4221066300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-22 17:37:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 17:37:23,854][15401] Updated weights for policy 0, policy_version 257630 (0.0029) [2024-06-22 17:37:27,511][15401] Updated weights for policy 0, policy_version 257640 (0.0034) [2024-06-22 17:37:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 4221190144. Throughput: 0: 42725.2. Samples: 4221319220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 17:37:31,523][15401] Updated weights for policy 0, policy_version 257650 (0.0038) [2024-06-22 17:37:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43145.1, 300 sec: 43042.7). Total num frames: 4221435904. Throughput: 0: 42851.1. Samples: 4221579720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:33,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 17:37:35,162][15401] Updated weights for policy 0, policy_version 257660 (0.0032) [2024-06-22 17:37:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 4221632512. Throughput: 0: 43033.9. Samples: 4221712160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 17:37:39,267][15401] Updated weights for policy 0, policy_version 257670 (0.0033) [2024-06-22 17:37:42,879][15401] Updated weights for policy 0, policy_version 257680 (0.0029) [2024-06-22 17:37:43,390][15132] Fps is (10 sec: 39320.1, 60 sec: 42325.0, 300 sec: 42931.6). Total num frames: 4221829120. Throughput: 0: 42747.2. Samples: 4221963240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:43,391][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 17:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257680_4221829120.pth... [2024-06-22 17:37:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257051_4211523584.pth [2024-06-22 17:37:46,885][15401] Updated weights for policy 0, policy_version 257690 (0.0030) [2024-06-22 17:37:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4222058496. Throughput: 0: 42643.6. Samples: 4222217560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 17:37:50,729][15401] Updated weights for policy 0, policy_version 257700 (0.0025) [2024-06-22 17:37:53,390][15132] Fps is (10 sec: 44238.6, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 4222271488. Throughput: 0: 42776.6. Samples: 4222348120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 17:37:54,698][15401] Updated weights for policy 0, policy_version 257710 (0.0043) [2024-06-22 17:37:58,250][15401] Updated weights for policy 0, policy_version 257720 (0.0037) [2024-06-22 17:37:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4222484480. Throughput: 0: 42492.0. Samples: 4222599920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:37:58,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 17:37:59,240][15349] Signal inference workers to stop experience collection... (62550 times) [2024-06-22 17:37:59,248][15349] Signal inference workers to resume experience collection... (62550 times) [2024-06-22 17:37:59,273][15401] InferenceWorker_p0-w0: stopping experience collection (62550 times) [2024-06-22 17:37:59,273][15401] InferenceWorker_p0-w0: resuming experience collection (62550 times) [2024-06-22 17:38:02,585][15401] Updated weights for policy 0, policy_version 257730 (0.0042) [2024-06-22 17:38:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4222697472. Throughput: 0: 42745.3. Samples: 4222859360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 17:38:05,863][15401] Updated weights for policy 0, policy_version 257740 (0.0041) [2024-06-22 17:38:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 4222910464. Throughput: 0: 42641.8. Samples: 4222985180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 17:38:09,965][15401] Updated weights for policy 0, policy_version 257750 (0.0043) [2024-06-22 17:38:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4223123456. Throughput: 0: 42676.1. Samples: 4223239640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 17:38:13,720][15401] Updated weights for policy 0, policy_version 257760 (0.0032) [2024-06-22 17:38:17,439][15401] Updated weights for policy 0, policy_version 257770 (0.0028) [2024-06-22 17:38:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 4223320064. Throughput: 0: 42681.0. Samples: 4223500360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 17:38:21,335][15401] Updated weights for policy 0, policy_version 257780 (0.0035) [2024-06-22 17:38:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4223549440. Throughput: 0: 42607.5. Samples: 4223629500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 17:38:24,875][15401] Updated weights for policy 0, policy_version 257790 (0.0027) [2024-06-22 17:38:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4223762432. Throughput: 0: 42671.6. Samples: 4223883440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:28,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-22 17:38:29,016][15401] Updated weights for policy 0, policy_version 257800 (0.0033) [2024-06-22 17:38:32,701][15401] Updated weights for policy 0, policy_version 257810 (0.0046) [2024-06-22 17:38:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 4223975424. Throughput: 0: 42649.4. Samples: 4224136780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:38:33,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-22 17:38:37,192][15401] Updated weights for policy 0, policy_version 257820 (0.0037) [2024-06-22 17:38:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 4224188416. Throughput: 0: 42640.9. Samples: 4224266960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:38:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 17:38:40,355][15401] Updated weights for policy 0, policy_version 257830 (0.0029) [2024-06-22 17:38:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.8, 300 sec: 42931.6). Total num frames: 4224417792. Throughput: 0: 42639.1. Samples: 4224518680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:38:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 17:38:44,662][15401] Updated weights for policy 0, policy_version 257840 (0.0043) [2024-06-22 17:38:48,175][15401] Updated weights for policy 0, policy_version 257850 (0.0026) [2024-06-22 17:38:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4224614400. Throughput: 0: 42670.6. Samples: 4224779540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:38:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 17:38:52,009][15401] Updated weights for policy 0, policy_version 257860 (0.0032) [2024-06-22 17:38:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4224827392. Throughput: 0: 42620.3. Samples: 4224903100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:38:53,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-22 17:38:55,816][15401] Updated weights for policy 0, policy_version 257870 (0.0040) [2024-06-22 17:38:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4225040384. Throughput: 0: 42703.5. Samples: 4225161300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:38:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 17:38:59,402][15401] Updated weights for policy 0, policy_version 257880 (0.0029) [2024-06-22 17:39:03,353][15401] Updated weights for policy 0, policy_version 257890 (0.0039) [2024-06-22 17:39:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4225269760. Throughput: 0: 42785.3. Samples: 4225425700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 17:39:07,460][15401] Updated weights for policy 0, policy_version 257900 (0.0040) [2024-06-22 17:39:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4225466368. Throughput: 0: 42732.5. Samples: 4225552460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:39:10,944][15401] Updated weights for policy 0, policy_version 257910 (0.0042) [2024-06-22 17:39:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4225695744. Throughput: 0: 42787.4. Samples: 4225808880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 17:39:14,965][15401] Updated weights for policy 0, policy_version 257920 (0.0049) [2024-06-22 17:39:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4225908736. Throughput: 0: 43051.0. Samples: 4226074080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 17:39:18,591][15401] Updated weights for policy 0, policy_version 257930 (0.0027) [2024-06-22 17:39:22,419][15401] Updated weights for policy 0, policy_version 257940 (0.0038) [2024-06-22 17:39:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 4226105344. Throughput: 0: 43069.8. Samples: 4226205100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 17:39:26,388][15401] Updated weights for policy 0, policy_version 257950 (0.0033) [2024-06-22 17:39:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4226351104. Throughput: 0: 43120.3. Samples: 4226459100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:28,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 17:39:30,081][15401] Updated weights for policy 0, policy_version 257960 (0.0030) [2024-06-22 17:39:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4226547712. Throughput: 0: 43189.0. Samples: 4226723040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 17:39:33,819][15401] Updated weights for policy 0, policy_version 257970 (0.0034) [2024-06-22 17:39:37,675][15401] Updated weights for policy 0, policy_version 257980 (0.0045) [2024-06-22 17:39:38,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4226760704. Throughput: 0: 43214.0. Samples: 4226847720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:38,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 17:39:38,864][15349] Signal inference workers to stop experience collection... (62600 times) [2024-06-22 17:39:38,865][15349] Signal inference workers to resume experience collection... (62600 times) [2024-06-22 17:39:38,907][15401] InferenceWorker_p0-w0: stopping experience collection (62600 times) [2024-06-22 17:39:38,907][15401] InferenceWorker_p0-w0: resuming experience collection (62600 times) [2024-06-22 17:39:41,530][15401] Updated weights for policy 0, policy_version 257990 (0.0029) [2024-06-22 17:39:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4226990080. Throughput: 0: 43321.8. Samples: 4227110780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:39:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 17:39:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257995_4226990080.pth... [2024-06-22 17:39:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257368_4216717312.pth [2024-06-22 17:39:45,586][15401] Updated weights for policy 0, policy_version 258000 (0.0034) [2024-06-22 17:39:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4227203072. Throughput: 0: 43153.8. Samples: 4227367620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:39:48,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 17:39:48,962][15401] Updated weights for policy 0, policy_version 258010 (0.0034) [2024-06-22 17:39:53,209][15401] Updated weights for policy 0, policy_version 258020 (0.0036) [2024-06-22 17:39:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4227416064. Throughput: 0: 43093.3. Samples: 4227491660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:39:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 17:39:56,731][15401] Updated weights for policy 0, policy_version 258030 (0.0039) [2024-06-22 17:39:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4227629056. Throughput: 0: 43236.0. Samples: 4227754500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:39:58,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-22 17:40:00,645][15401] Updated weights for policy 0, policy_version 258040 (0.0031) [2024-06-22 17:40:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4227842048. Throughput: 0: 43001.4. Samples: 4228009140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 17:40:04,316][15401] Updated weights for policy 0, policy_version 258050 (0.0036) [2024-06-22 17:40:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4228038656. Throughput: 0: 42888.7. Samples: 4228135100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 17:40:08,650][15401] Updated weights for policy 0, policy_version 258060 (0.0035) [2024-06-22 17:40:11,813][15401] Updated weights for policy 0, policy_version 258070 (0.0030) [2024-06-22 17:40:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4228268032. Throughput: 0: 43003.2. Samples: 4228394240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 17:40:16,037][15401] Updated weights for policy 0, policy_version 258080 (0.0037) [2024-06-22 17:40:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4228481024. Throughput: 0: 42743.0. Samples: 4228646480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 17:40:19,411][15401] Updated weights for policy 0, policy_version 258090 (0.0035) [2024-06-22 17:40:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4228694016. Throughput: 0: 42977.7. Samples: 4228781720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:23,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 17:40:23,589][15401] Updated weights for policy 0, policy_version 258100 (0.0044) [2024-06-22 17:40:27,019][15401] Updated weights for policy 0, policy_version 258110 (0.0036) [2024-06-22 17:40:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4228907008. Throughput: 0: 42763.5. Samples: 4229035140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 17:40:31,053][15401] Updated weights for policy 0, policy_version 258120 (0.0027) [2024-06-22 17:40:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4229136384. Throughput: 0: 42769.8. Samples: 4229292260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 17:40:34,666][15401] Updated weights for policy 0, policy_version 258130 (0.0031) [2024-06-22 17:40:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4229332992. Throughput: 0: 42933.4. Samples: 4229423660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 17:40:38,910][15401] Updated weights for policy 0, policy_version 258140 (0.0043) [2024-06-22 17:40:42,273][15401] Updated weights for policy 0, policy_version 258150 (0.0035) [2024-06-22 17:40:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4229545984. Throughput: 0: 42733.1. Samples: 4229677480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 17:40:46,568][15401] Updated weights for policy 0, policy_version 258160 (0.0044) [2024-06-22 17:40:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 4229775360. Throughput: 0: 42849.0. Samples: 4229937340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:40:50,314][15401] Updated weights for policy 0, policy_version 258170 (0.0033) [2024-06-22 17:40:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4229971968. Throughput: 0: 42969.4. Samples: 4230068720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 17:40:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 17:40:54,083][15401] Updated weights for policy 0, policy_version 258180 (0.0036) [2024-06-22 17:40:54,649][15349] Signal inference workers to stop experience collection... (62650 times) [2024-06-22 17:40:54,649][15349] Signal inference workers to resume experience collection... (62650 times) [2024-06-22 17:40:54,695][15401] InferenceWorker_p0-w0: stopping experience collection (62650 times) [2024-06-22 17:40:54,695][15401] InferenceWorker_p0-w0: resuming experience collection (62650 times) [2024-06-22 17:40:57,812][15401] Updated weights for policy 0, policy_version 258190 (0.0043) [2024-06-22 17:40:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4230184960. Throughput: 0: 42806.2. Samples: 4230320520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:40:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 17:41:01,762][15401] Updated weights for policy 0, policy_version 258200 (0.0034) [2024-06-22 17:41:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4230397952. Throughput: 0: 42884.4. Samples: 4230576280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 17:41:05,534][15401] Updated weights for policy 0, policy_version 258210 (0.0035) [2024-06-22 17:41:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 4230610944. Throughput: 0: 42720.4. Samples: 4230704240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:08,392][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 17:41:09,412][15401] Updated weights for policy 0, policy_version 258220 (0.0045) [2024-06-22 17:41:13,126][15401] Updated weights for policy 0, policy_version 258230 (0.0037) [2024-06-22 17:41:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4230840320. Throughput: 0: 42709.3. Samples: 4230957160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:13,392][15132] Avg episode reward: [(0, '0.266')] [2024-06-22 17:41:17,105][15401] Updated weights for policy 0, policy_version 258240 (0.0033) [2024-06-22 17:41:18,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 4231053312. Throughput: 0: 42594.6. Samples: 4231209120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:18,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 17:41:20,966][15401] Updated weights for policy 0, policy_version 258250 (0.0052) [2024-06-22 17:41:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4231249920. Throughput: 0: 42546.6. Samples: 4231338260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 17:41:24,707][15401] Updated weights for policy 0, policy_version 258260 (0.0025) [2024-06-22 17:41:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 4231462912. Throughput: 0: 42687.0. Samples: 4231598400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 17:41:28,668][15401] Updated weights for policy 0, policy_version 258270 (0.0029) [2024-06-22 17:41:32,255][15401] Updated weights for policy 0, policy_version 258280 (0.0037) [2024-06-22 17:41:33,392][15132] Fps is (10 sec: 44225.2, 60 sec: 42596.5, 300 sec: 42820.2). Total num frames: 4231692288. Throughput: 0: 42633.4. Samples: 4231855960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:33,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 17:41:36,421][15401] Updated weights for policy 0, policy_version 258290 (0.0049) [2024-06-22 17:41:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4231888896. Throughput: 0: 42595.6. Samples: 4231985520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:41:39,907][15401] Updated weights for policy 0, policy_version 258300 (0.0041) [2024-06-22 17:41:43,389][15132] Fps is (10 sec: 42610.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4232118272. Throughput: 0: 42652.2. Samples: 4232239860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:41:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258309_4232134656.pth... [2024-06-22 17:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257680_4221829120.pth [2024-06-22 17:41:44,057][15401] Updated weights for policy 0, policy_version 258310 (0.0045) [2024-06-22 17:41:47,514][15401] Updated weights for policy 0, policy_version 258320 (0.0026) [2024-06-22 17:41:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4232347648. Throughput: 0: 42598.7. Samples: 4232493220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 17:41:51,620][15401] Updated weights for policy 0, policy_version 258330 (0.0043) [2024-06-22 17:41:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4232527872. Throughput: 0: 42548.4. Samples: 4232618820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:53,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 17:41:55,567][15401] Updated weights for policy 0, policy_version 258340 (0.0035) [2024-06-22 17:41:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4232757248. Throughput: 0: 42566.2. Samples: 4232872540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:41:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 17:41:59,388][15401] Updated weights for policy 0, policy_version 258350 (0.0032) [2024-06-22 17:42:03,072][15401] Updated weights for policy 0, policy_version 258360 (0.0037) [2024-06-22 17:42:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4232986624. Throughput: 0: 42671.2. Samples: 4233129220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:42:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 17:42:07,079][15401] Updated weights for policy 0, policy_version 258370 (0.0045) [2024-06-22 17:42:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4233166848. Throughput: 0: 42682.3. Samples: 4233258960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 17:42:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 17:42:10,773][15401] Updated weights for policy 0, policy_version 258380 (0.0035) [2024-06-22 17:42:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4233412608. Throughput: 0: 42430.1. Samples: 4233507760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 17:42:14,737][15401] Updated weights for policy 0, policy_version 258390 (0.0036) [2024-06-22 17:42:17,974][15349] Signal inference workers to stop experience collection... (62700 times) [2024-06-22 17:42:17,974][15349] Signal inference workers to resume experience collection... (62700 times) [2024-06-22 17:42:18,017][15401] InferenceWorker_p0-w0: stopping experience collection (62700 times) [2024-06-22 17:42:18,017][15401] InferenceWorker_p0-w0: resuming experience collection (62700 times) [2024-06-22 17:42:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 4233609216. Throughput: 0: 42450.9. Samples: 4233766140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 17:42:18,528][15401] Updated weights for policy 0, policy_version 258400 (0.0039) [2024-06-22 17:42:22,620][15401] Updated weights for policy 0, policy_version 258410 (0.0035) [2024-06-22 17:42:23,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4233789440. Throughput: 0: 42369.8. Samples: 4233892160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 17:42:26,147][15401] Updated weights for policy 0, policy_version 258420 (0.0032) [2024-06-22 17:42:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4234035200. Throughput: 0: 42362.1. Samples: 4234146160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:28,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 17:42:30,551][15401] Updated weights for policy 0, policy_version 258430 (0.0032) [2024-06-22 17:42:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.3, 300 sec: 42709.5). Total num frames: 4234231808. Throughput: 0: 42567.3. Samples: 4234408740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 17:42:33,938][15401] Updated weights for policy 0, policy_version 258440 (0.0049) [2024-06-22 17:42:38,123][15401] Updated weights for policy 0, policy_version 258450 (0.0034) [2024-06-22 17:42:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 4234444800. Throughput: 0: 42382.7. Samples: 4234526040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:42:41,559][15401] Updated weights for policy 0, policy_version 258460 (0.0030) [2024-06-22 17:42:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4234674176. Throughput: 0: 42521.8. Samples: 4234786020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:43,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-22 17:42:45,642][15401] Updated weights for policy 0, policy_version 258470 (0.0033) [2024-06-22 17:42:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4234870784. Throughput: 0: 42833.2. Samples: 4235056720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:48,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-22 17:42:49,045][15401] Updated weights for policy 0, policy_version 258480 (0.0029) [2024-06-22 17:42:53,155][15401] Updated weights for policy 0, policy_version 258490 (0.0037) [2024-06-22 17:42:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4235100160. Throughput: 0: 42607.0. Samples: 4235176280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:53,396][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 17:42:56,630][15401] Updated weights for policy 0, policy_version 258500 (0.0029) [2024-06-22 17:42:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4235329536. Throughput: 0: 42830.9. Samples: 4235435140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:42:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 17:43:00,643][15401] Updated weights for policy 0, policy_version 258510 (0.0033) [2024-06-22 17:43:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4235526144. Throughput: 0: 42992.4. Samples: 4235700800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:43:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 17:43:04,281][15401] Updated weights for policy 0, policy_version 258520 (0.0025) [2024-06-22 17:43:08,026][15401] Updated weights for policy 0, policy_version 258530 (0.0033) [2024-06-22 17:43:08,389][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4235755520. Throughput: 0: 42885.7. Samples: 4235822020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:43:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 17:43:11,727][15401] Updated weights for policy 0, policy_version 258540 (0.0039) [2024-06-22 17:43:13,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4236001280. Throughput: 0: 43123.1. Samples: 4236086700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:43:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 17:43:15,775][15401] Updated weights for policy 0, policy_version 258550 (0.0044) [2024-06-22 17:43:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4236181504. Throughput: 0: 43044.3. Samples: 4236345740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 17:43:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 17:43:19,396][15401] Updated weights for policy 0, policy_version 258560 (0.0041) [2024-06-22 17:43:23,253][15401] Updated weights for policy 0, policy_version 258570 (0.0042) [2024-06-22 17:43:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 4236410880. Throughput: 0: 43083.2. Samples: 4236464780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 17:43:27,282][15401] Updated weights for policy 0, policy_version 258580 (0.0033) [2024-06-22 17:43:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4236623872. Throughput: 0: 43253.7. Samples: 4236732440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 17:43:31,019][15401] Updated weights for policy 0, policy_version 258590 (0.0022) [2024-06-22 17:43:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4236820480. Throughput: 0: 42964.1. Samples: 4236990100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 17:43:34,795][15401] Updated weights for policy 0, policy_version 258600 (0.0032) [2024-06-22 17:43:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4237033472. Throughput: 0: 43105.3. Samples: 4237116020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 17:43:38,895][15401] Updated weights for policy 0, policy_version 258610 (0.0036) [2024-06-22 17:43:42,392][15401] Updated weights for policy 0, policy_version 258620 (0.0037) [2024-06-22 17:43:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4237279232. Throughput: 0: 43134.1. Samples: 4237376180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:43:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258623_4237279232.pth... [2024-06-22 17:43:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000257995_4226990080.pth [2024-06-22 17:43:46,445][15401] Updated weights for policy 0, policy_version 258630 (0.0050) [2024-06-22 17:43:48,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4237443072. Throughput: 0: 42974.8. Samples: 4237634660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 17:43:50,122][15401] Updated weights for policy 0, policy_version 258640 (0.0040) [2024-06-22 17:43:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4237672448. Throughput: 0: 42949.7. Samples: 4237754760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 17:43:53,718][15349] Signal inference workers to stop experience collection... (62750 times) [2024-06-22 17:43:53,720][15349] Signal inference workers to resume experience collection... (62750 times) [2024-06-22 17:43:53,741][15401] InferenceWorker_p0-w0: stopping experience collection (62750 times) [2024-06-22 17:43:53,741][15401] InferenceWorker_p0-w0: resuming experience collection (62750 times) [2024-06-22 17:43:54,010][15401] Updated weights for policy 0, policy_version 258650 (0.0027) [2024-06-22 17:43:57,636][15401] Updated weights for policy 0, policy_version 258660 (0.0031) [2024-06-22 17:43:58,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4237918208. Throughput: 0: 43012.6. Samples: 4238022260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:43:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 17:44:01,796][15401] Updated weights for policy 0, policy_version 258670 (0.0028) [2024-06-22 17:44:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4238098432. Throughput: 0: 42901.7. Samples: 4238276320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 17:44:05,430][15401] Updated weights for policy 0, policy_version 258680 (0.0031) [2024-06-22 17:44:08,390][15132] Fps is (10 sec: 40956.3, 60 sec: 42870.9, 300 sec: 42820.4). Total num frames: 4238327808. Throughput: 0: 42950.7. Samples: 4238397600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:08,391][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 17:44:09,350][15401] Updated weights for policy 0, policy_version 258690 (0.0029) [2024-06-22 17:44:12,960][15401] Updated weights for policy 0, policy_version 258700 (0.0036) [2024-06-22 17:44:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4238557184. Throughput: 0: 43024.1. Samples: 4238668520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 17:44:16,778][15401] Updated weights for policy 0, policy_version 258710 (0.0042) [2024-06-22 17:44:18,390][15132] Fps is (10 sec: 40962.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4238737408. Throughput: 0: 43033.6. Samples: 4238926620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:44:20,454][15401] Updated weights for policy 0, policy_version 258720 (0.0024) [2024-06-22 17:44:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4238966784. Throughput: 0: 42861.0. Samples: 4239044760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 17:44:24,907][15401] Updated weights for policy 0, policy_version 258730 (0.0032) [2024-06-22 17:44:27,926][15401] Updated weights for policy 0, policy_version 258740 (0.0034) [2024-06-22 17:44:28,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4239212544. Throughput: 0: 43063.1. Samples: 4239314020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 17:44:28,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 17:44:32,333][15401] Updated weights for policy 0, policy_version 258750 (0.0035) [2024-06-22 17:44:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4239376384. Throughput: 0: 43071.5. Samples: 4239572880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 17:44:35,487][15401] Updated weights for policy 0, policy_version 258760 (0.0031) [2024-06-22 17:44:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4239605760. Throughput: 0: 43074.3. Samples: 4239693100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:38,396][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 17:44:39,825][15401] Updated weights for policy 0, policy_version 258770 (0.0042) [2024-06-22 17:44:42,991][15401] Updated weights for policy 0, policy_version 258780 (0.0027) [2024-06-22 17:44:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4239851520. Throughput: 0: 42979.4. Samples: 4239956340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 17:44:47,705][15401] Updated weights for policy 0, policy_version 258790 (0.0034) [2024-06-22 17:44:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4240031744. Throughput: 0: 43010.7. Samples: 4240211800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 17:44:50,566][15401] Updated weights for policy 0, policy_version 258800 (0.0042) [2024-06-22 17:44:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4240261120. Throughput: 0: 43066.2. Samples: 4240335540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 17:44:55,212][15401] Updated weights for policy 0, policy_version 258810 (0.0037) [2024-06-22 17:44:58,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4240506880. Throughput: 0: 43054.7. Samples: 4240605980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:44:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 17:44:58,392][15401] Updated weights for policy 0, policy_version 258820 (0.0032) [2024-06-22 17:45:02,652][15401] Updated weights for policy 0, policy_version 258830 (0.0036) [2024-06-22 17:45:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 4240687104. Throughput: 0: 43010.8. Samples: 4240862100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 17:45:05,977][15401] Updated weights for policy 0, policy_version 258840 (0.0033) [2024-06-22 17:45:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43145.1, 300 sec: 42876.1). Total num frames: 4240916480. Throughput: 0: 43108.4. Samples: 4240984640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 17:45:10,338][15401] Updated weights for policy 0, policy_version 258850 (0.0033) [2024-06-22 17:45:12,922][15349] Signal inference workers to stop experience collection... (62800 times) [2024-06-22 17:45:12,923][15349] Signal inference workers to resume experience collection... (62800 times) [2024-06-22 17:45:12,936][15401] InferenceWorker_p0-w0: stopping experience collection (62800 times) [2024-06-22 17:45:12,936][15401] InferenceWorker_p0-w0: resuming experience collection (62800 times) [2024-06-22 17:45:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4241129472. Throughput: 0: 43032.5. Samples: 4241250480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 17:45:13,662][15401] Updated weights for policy 0, policy_version 258860 (0.0038) [2024-06-22 17:45:17,859][15401] Updated weights for policy 0, policy_version 258870 (0.0042) [2024-06-22 17:45:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 4241342464. Throughput: 0: 42868.6. Samples: 4241501960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 17:45:21,468][15401] Updated weights for policy 0, policy_version 258880 (0.0036) [2024-06-22 17:45:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4241539072. Throughput: 0: 43070.6. Samples: 4241631280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 17:45:25,361][15401] Updated weights for policy 0, policy_version 258890 (0.0026) [2024-06-22 17:45:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 4241784832. Throughput: 0: 43056.0. Samples: 4241893960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:28,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:45:29,284][15401] Updated weights for policy 0, policy_version 258900 (0.0041) [2024-06-22 17:45:32,847][15401] Updated weights for policy 0, policy_version 258910 (0.0025) [2024-06-22 17:45:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4241981440. Throughput: 0: 43032.5. Samples: 4242148260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 17:45:36,688][15401] Updated weights for policy 0, policy_version 258920 (0.0031) [2024-06-22 17:45:38,390][15132] Fps is (10 sec: 40969.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4242194432. Throughput: 0: 43218.5. Samples: 4242280380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 17:45:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:45:40,468][15401] Updated weights for policy 0, policy_version 258930 (0.0024) [2024-06-22 17:45:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4242440192. Throughput: 0: 43015.1. Samples: 4242541660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:45:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 17:45:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258938_4242440192.pth... [2024-06-22 17:45:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258309_4232134656.pth [2024-06-22 17:45:44,087][15401] Updated weights for policy 0, policy_version 258940 (0.0027) [2024-06-22 17:45:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4242620416. Throughput: 0: 43009.3. Samples: 4242797520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:45:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 17:45:48,405][15401] Updated weights for policy 0, policy_version 258950 (0.0045) [2024-06-22 17:45:51,669][15401] Updated weights for policy 0, policy_version 258960 (0.0028) [2024-06-22 17:45:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4242817024. Throughput: 0: 42867.6. Samples: 4242913680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:45:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 17:45:56,007][15401] Updated weights for policy 0, policy_version 258970 (0.0037) [2024-06-22 17:45:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 4243062784. Throughput: 0: 42828.7. Samples: 4243177880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:45:58,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:45:59,334][15401] Updated weights for policy 0, policy_version 258980 (0.0039) [2024-06-22 17:46:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 4243259392. Throughput: 0: 42930.7. Samples: 4243433840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 17:46:03,870][15401] Updated weights for policy 0, policy_version 258990 (0.0030) [2024-06-22 17:46:06,970][15401] Updated weights for policy 0, policy_version 259000 (0.0034) [2024-06-22 17:46:08,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 4243456000. Throughput: 0: 42878.8. Samples: 4243560820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 17:46:11,454][15401] Updated weights for policy 0, policy_version 259010 (0.0037) [2024-06-22 17:46:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 4243718144. Throughput: 0: 42782.8. Samples: 4243819080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 17:46:14,521][15401] Updated weights for policy 0, policy_version 259020 (0.0042) [2024-06-22 17:46:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4243898368. Throughput: 0: 42996.0. Samples: 4244083080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 17:46:19,028][15401] Updated weights for policy 0, policy_version 259030 (0.0031) [2024-06-22 17:46:22,832][15401] Updated weights for policy 0, policy_version 259040 (0.0030) [2024-06-22 17:46:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4244111360. Throughput: 0: 42754.3. Samples: 4244204320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 17:46:26,609][15401] Updated weights for policy 0, policy_version 259050 (0.0035) [2024-06-22 17:46:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42932.0). Total num frames: 4244357120. Throughput: 0: 42777.3. Samples: 4244466640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 17:46:30,466][15401] Updated weights for policy 0, policy_version 259060 (0.0042) [2024-06-22 17:46:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4244537344. Throughput: 0: 42700.8. Samples: 4244719060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 17:46:34,111][15349] Signal inference workers to stop experience collection... (62850 times) [2024-06-22 17:46:34,154][15401] InferenceWorker_p0-w0: stopping experience collection (62850 times) [2024-06-22 17:46:34,227][15349] Signal inference workers to resume experience collection... (62850 times) [2024-06-22 17:46:34,227][15401] InferenceWorker_p0-w0: resuming experience collection (62850 times) [2024-06-22 17:46:34,369][15401] Updated weights for policy 0, policy_version 259070 (0.0042) [2024-06-22 17:46:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 4244750336. Throughput: 0: 42837.0. Samples: 4244841340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 17:46:38,426][15401] Updated weights for policy 0, policy_version 259080 (0.0031) [2024-06-22 17:46:42,034][15401] Updated weights for policy 0, policy_version 259090 (0.0037) [2024-06-22 17:46:43,392][15132] Fps is (10 sec: 47502.7, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 4245012480. Throughput: 0: 42758.7. Samples: 4245102020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:43,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 17:46:46,054][15401] Updated weights for policy 0, policy_version 259100 (0.0027) [2024-06-22 17:46:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4245192704. Throughput: 0: 42827.8. Samples: 4245361100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:48,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-22 17:46:49,849][15401] Updated weights for policy 0, policy_version 259110 (0.0033) [2024-06-22 17:46:53,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4245389312. Throughput: 0: 42678.2. Samples: 4245481340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 17:46:53,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 17:46:53,709][15401] Updated weights for policy 0, policy_version 259120 (0.0034) [2024-06-22 17:46:57,335][15401] Updated weights for policy 0, policy_version 259130 (0.0043) [2024-06-22 17:46:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 4245635072. Throughput: 0: 42771.4. Samples: 4245743800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:46:58,399][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 17:47:01,355][15401] Updated weights for policy 0, policy_version 259140 (0.0038) [2024-06-22 17:47:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4245831680. Throughput: 0: 42543.5. Samples: 4245997540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:03,391][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 17:47:05,291][15401] Updated weights for policy 0, policy_version 259150 (0.0031) [2024-06-22 17:47:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4246044672. Throughput: 0: 42653.7. Samples: 4246123740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 17:47:08,907][15401] Updated weights for policy 0, policy_version 259160 (0.0030) [2024-06-22 17:47:12,955][15401] Updated weights for policy 0, policy_version 259170 (0.0041) [2024-06-22 17:47:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4246274048. Throughput: 0: 42565.2. Samples: 4246382080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:13,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 17:47:16,699][15401] Updated weights for policy 0, policy_version 259180 (0.0040) [2024-06-22 17:47:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4246454272. Throughput: 0: 42843.2. Samples: 4246647000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 17:47:20,610][15401] Updated weights for policy 0, policy_version 259190 (0.0039) [2024-06-22 17:47:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4246700032. Throughput: 0: 42706.2. Samples: 4246763120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 17:47:24,718][15401] Updated weights for policy 0, policy_version 259200 (0.0026) [2024-06-22 17:47:28,078][15401] Updated weights for policy 0, policy_version 259210 (0.0024) [2024-06-22 17:47:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4246913024. Throughput: 0: 42756.9. Samples: 4247025980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 17:47:32,222][15401] Updated weights for policy 0, policy_version 259220 (0.0040) [2024-06-22 17:47:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4247093248. Throughput: 0: 42750.4. Samples: 4247284860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 17:47:35,509][15401] Updated weights for policy 0, policy_version 259230 (0.0034) [2024-06-22 17:47:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4247322624. Throughput: 0: 42775.1. Samples: 4247406220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:38,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 17:47:39,594][15401] Updated weights for policy 0, policy_version 259240 (0.0038) [2024-06-22 17:47:42,919][15401] Updated weights for policy 0, policy_version 259250 (0.0026) [2024-06-22 17:47:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42327.0, 300 sec: 42987.2). Total num frames: 4247552000. Throughput: 0: 42816.4. Samples: 4247670540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 17:47:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259251_4247568384.pth... [2024-06-22 17:47:43,513][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258623_4237279232.pth [2024-06-22 17:47:47,200][15401] Updated weights for policy 0, policy_version 259260 (0.0032) [2024-06-22 17:47:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4247732224. Throughput: 0: 42935.7. Samples: 4247929640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 17:47:50,845][15401] Updated weights for policy 0, policy_version 259270 (0.0030) [2024-06-22 17:47:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4247977984. Throughput: 0: 42804.9. Samples: 4248049960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 17:47:55,259][15401] Updated weights for policy 0, policy_version 259280 (0.0035) [2024-06-22 17:47:58,279][15349] Signal inference workers to stop experience collection... (62900 times) [2024-06-22 17:47:58,316][15401] InferenceWorker_p0-w0: stopping experience collection (62900 times) [2024-06-22 17:47:58,338][15349] Signal inference workers to resume experience collection... (62900 times) [2024-06-22 17:47:58,339][15401] InferenceWorker_p0-w0: resuming experience collection (62900 times) [2024-06-22 17:47:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4248190976. Throughput: 0: 42777.9. Samples: 4248307080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:47:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 17:47:58,491][15401] Updated weights for policy 0, policy_version 259290 (0.0040) [2024-06-22 17:48:02,771][15401] Updated weights for policy 0, policy_version 259300 (0.0036) [2024-06-22 17:48:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 4248387584. Throughput: 0: 42695.4. Samples: 4248568400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 17:48:03,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 17:48:05,957][15401] Updated weights for policy 0, policy_version 259310 (0.0023) [2024-06-22 17:48:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 4248633344. Throughput: 0: 42799.0. Samples: 4248689180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:08,392][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 17:48:10,440][15401] Updated weights for policy 0, policy_version 259320 (0.0026) [2024-06-22 17:48:13,390][15132] Fps is (10 sec: 45882.9, 60 sec: 42871.0, 300 sec: 42931.5). Total num frames: 4248846336. Throughput: 0: 42911.3. Samples: 4248957020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:13,391][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 17:48:13,517][15401] Updated weights for policy 0, policy_version 259330 (0.0028) [2024-06-22 17:48:18,029][15401] Updated weights for policy 0, policy_version 259340 (0.0026) [2024-06-22 17:48:18,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4249042944. Throughput: 0: 42789.3. Samples: 4249210380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:18,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 17:48:21,523][15401] Updated weights for policy 0, policy_version 259350 (0.0027) [2024-06-22 17:48:23,389][15132] Fps is (10 sec: 42602.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4249272320. Throughput: 0: 42897.8. Samples: 4249336620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 17:48:25,560][15401] Updated weights for policy 0, policy_version 259360 (0.0045) [2024-06-22 17:48:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4249468928. Throughput: 0: 42781.4. Samples: 4249595700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 17:48:29,048][15401] Updated weights for policy 0, policy_version 259370 (0.0037) [2024-06-22 17:48:33,167][15401] Updated weights for policy 0, policy_version 259380 (0.0047) [2024-06-22 17:48:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4249681920. Throughput: 0: 42791.4. Samples: 4249855260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 17:48:36,770][15401] Updated weights for policy 0, policy_version 259390 (0.0034) [2024-06-22 17:48:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4249927680. Throughput: 0: 42962.8. Samples: 4249983280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 17:48:40,919][15401] Updated weights for policy 0, policy_version 259400 (0.0034) [2024-06-22 17:48:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4250107904. Throughput: 0: 43122.7. Samples: 4250247600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 17:48:44,297][15401] Updated weights for policy 0, policy_version 259410 (0.0039) [2024-06-22 17:48:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4250320896. Throughput: 0: 43039.7. Samples: 4250505080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 17:48:48,603][15401] Updated weights for policy 0, policy_version 259420 (0.0038) [2024-06-22 17:48:51,778][15401] Updated weights for policy 0, policy_version 259430 (0.0043) [2024-06-22 17:48:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 4250566656. Throughput: 0: 43154.9. Samples: 4250631040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 17:48:56,150][15401] Updated weights for policy 0, policy_version 259440 (0.0028) [2024-06-22 17:48:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 4250763264. Throughput: 0: 43103.0. Samples: 4250896620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:48:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 17:48:59,261][15401] Updated weights for policy 0, policy_version 259450 (0.0035) [2024-06-22 17:49:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.3, 300 sec: 42876.2). Total num frames: 4250976256. Throughput: 0: 43298.2. Samples: 4251158800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:49:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 17:49:03,589][15401] Updated weights for policy 0, policy_version 259460 (0.0036) [2024-06-22 17:49:06,692][15401] Updated weights for policy 0, policy_version 259470 (0.0025) [2024-06-22 17:49:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 4251205632. Throughput: 0: 43262.1. Samples: 4251283420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:49:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 17:49:11,252][15401] Updated weights for policy 0, policy_version 259480 (0.0033) [2024-06-22 17:49:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42872.0, 300 sec: 42987.2). Total num frames: 4251418624. Throughput: 0: 43398.5. Samples: 4251548640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-22 17:49:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 17:49:14,242][15401] Updated weights for policy 0, policy_version 259490 (0.0033) [2024-06-22 17:49:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4251631616. Throughput: 0: 43308.8. Samples: 4251804160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:18,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 17:49:18,871][15401] Updated weights for policy 0, policy_version 259500 (0.0021) [2024-06-22 17:49:21,638][15401] Updated weights for policy 0, policy_version 259510 (0.0030) [2024-06-22 17:49:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4251860992. Throughput: 0: 43320.4. Samples: 4251932700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 17:49:26,419][15401] Updated weights for policy 0, policy_version 259520 (0.0032) [2024-06-22 17:49:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4252057600. Throughput: 0: 43383.1. Samples: 4252199840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 17:49:28,837][15349] Signal inference workers to stop experience collection... (62950 times) [2024-06-22 17:49:28,844][15349] Signal inference workers to resume experience collection... (62950 times) [2024-06-22 17:49:28,874][15401] InferenceWorker_p0-w0: stopping experience collection (62950 times) [2024-06-22 17:49:28,874][15401] InferenceWorker_p0-w0: resuming experience collection (62950 times) [2024-06-22 17:49:29,334][15401] Updated weights for policy 0, policy_version 259530 (0.0038) [2024-06-22 17:49:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4252270592. Throughput: 0: 43289.7. Samples: 4252453120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 17:49:33,866][15401] Updated weights for policy 0, policy_version 259540 (0.0037) [2024-06-22 17:49:37,104][15401] Updated weights for policy 0, policy_version 259550 (0.0043) [2024-06-22 17:49:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4252516352. Throughput: 0: 43340.3. Samples: 4252581360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 17:49:41,361][15401] Updated weights for policy 0, policy_version 259560 (0.0034) [2024-06-22 17:49:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4252696576. Throughput: 0: 43246.2. Samples: 4252842700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 17:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259565_4252712960.pth... [2024-06-22 17:49:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000258938_4242440192.pth [2024-06-22 17:49:44,742][15401] Updated weights for policy 0, policy_version 259570 (0.0028) [2024-06-22 17:49:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4252925952. Throughput: 0: 43140.0. Samples: 4253100100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 17:49:48,947][15401] Updated weights for policy 0, policy_version 259580 (0.0028) [2024-06-22 17:49:52,335][15401] Updated weights for policy 0, policy_version 259590 (0.0037) [2024-06-22 17:49:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4253171712. Throughput: 0: 43260.0. Samples: 4253230120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 17:49:56,519][15401] Updated weights for policy 0, policy_version 259600 (0.0028) [2024-06-22 17:49:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43415.8, 300 sec: 42986.8). Total num frames: 4253368320. Throughput: 0: 43176.4. Samples: 4253491680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:49:58,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 17:49:59,964][15401] Updated weights for policy 0, policy_version 259610 (0.0035) [2024-06-22 17:50:03,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 4253581312. Throughput: 0: 43062.7. Samples: 4253742080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:50:03,392][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:50:04,122][15401] Updated weights for policy 0, policy_version 259620 (0.0046) [2024-06-22 17:50:07,565][15401] Updated weights for policy 0, policy_version 259630 (0.0023) [2024-06-22 17:50:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4253810688. Throughput: 0: 43157.3. Samples: 4253874780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:50:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 17:50:11,631][15401] Updated weights for policy 0, policy_version 259640 (0.0041) [2024-06-22 17:50:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4254007296. Throughput: 0: 43086.6. Samples: 4254138740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:50:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 17:50:15,577][15401] Updated weights for policy 0, policy_version 259650 (0.0036) [2024-06-22 17:50:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.8, 300 sec: 43042.7). Total num frames: 4254236672. Throughput: 0: 42975.3. Samples: 4254387000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:50:18,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 17:50:19,094][15401] Updated weights for policy 0, policy_version 259660 (0.0039) [2024-06-22 17:50:22,951][15401] Updated weights for policy 0, policy_version 259670 (0.0028) [2024-06-22 17:50:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 4254433280. Throughput: 0: 43213.9. Samples: 4254525980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 17:50:23,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 17:50:26,609][15401] Updated weights for policy 0, policy_version 259680 (0.0038) [2024-06-22 17:50:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4254646272. Throughput: 0: 43178.2. Samples: 4254785720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 17:50:30,402][15401] Updated weights for policy 0, policy_version 259690 (0.0028) [2024-06-22 17:50:33,396][15132] Fps is (10 sec: 45845.2, 60 sec: 43686.0, 300 sec: 43041.8). Total num frames: 4254892032. Throughput: 0: 43062.7. Samples: 4255038200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:33,397][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 17:50:34,368][15401] Updated weights for policy 0, policy_version 259700 (0.0034) [2024-06-22 17:50:38,019][15401] Updated weights for policy 0, policy_version 259710 (0.0031) [2024-06-22 17:50:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4255088640. Throughput: 0: 43237.8. Samples: 4255175820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 17:50:41,763][15401] Updated weights for policy 0, policy_version 259720 (0.0031) [2024-06-22 17:50:43,390][15132] Fps is (10 sec: 40986.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 4255301632. Throughput: 0: 43168.0. Samples: 4255434140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 17:50:45,198][15349] Signal inference workers to stop experience collection... (63000 times) [2024-06-22 17:50:45,199][15349] Signal inference workers to resume experience collection... (63000 times) [2024-06-22 17:50:45,216][15401] InferenceWorker_p0-w0: stopping experience collection (63000 times) [2024-06-22 17:50:45,216][15401] InferenceWorker_p0-w0: resuming experience collection (63000 times) [2024-06-22 17:50:45,527][15401] Updated weights for policy 0, policy_version 259730 (0.0029) [2024-06-22 17:50:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43690.7, 300 sec: 43153.8). Total num frames: 4255547392. Throughput: 0: 43197.5. Samples: 4255685860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 17:50:49,396][15401] Updated weights for policy 0, policy_version 259740 (0.0039) [2024-06-22 17:50:53,138][15401] Updated weights for policy 0, policy_version 259750 (0.0029) [2024-06-22 17:50:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 4255760384. Throughput: 0: 43367.1. Samples: 4255826300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 17:50:56,837][15401] Updated weights for policy 0, policy_version 259760 (0.0031) [2024-06-22 17:50:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42873.2, 300 sec: 42987.1). Total num frames: 4255940608. Throughput: 0: 43109.3. Samples: 4256078660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:50:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 17:51:00,743][15401] Updated weights for policy 0, policy_version 259770 (0.0027) [2024-06-22 17:51:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43692.4, 300 sec: 43209.3). Total num frames: 4256202752. Throughput: 0: 43364.8. Samples: 4256338420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 17:51:04,223][15401] Updated weights for policy 0, policy_version 259780 (0.0036) [2024-06-22 17:51:08,357][15401] Updated weights for policy 0, policy_version 259790 (0.0024) [2024-06-22 17:51:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4256399360. Throughput: 0: 43268.3. Samples: 4256473060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 17:51:11,687][15401] Updated weights for policy 0, policy_version 259800 (0.0027) [2024-06-22 17:51:13,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 4256579584. Throughput: 0: 43014.4. Samples: 4256721360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:13,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 17:51:15,923][15401] Updated weights for policy 0, policy_version 259810 (0.0025) [2024-06-22 17:51:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 4256841728. Throughput: 0: 43182.1. Samples: 4256981120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-22 17:51:19,414][15401] Updated weights for policy 0, policy_version 259820 (0.0027) [2024-06-22 17:51:23,350][15401] Updated weights for policy 0, policy_version 259830 (0.0048) [2024-06-22 17:51:23,389][15132] Fps is (10 sec: 47513.1, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 4257054720. Throughput: 0: 43184.1. Samples: 4257119100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 17:51:27,126][15401] Updated weights for policy 0, policy_version 259840 (0.0041) [2024-06-22 17:51:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 4257251328. Throughput: 0: 43034.3. Samples: 4257370680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 17:51:31,335][15401] Updated weights for policy 0, policy_version 259850 (0.0039) [2024-06-22 17:51:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42876.1, 300 sec: 43098.2). Total num frames: 4257464320. Throughput: 0: 43069.3. Samples: 4257623980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 17:51:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 17:51:34,834][15401] Updated weights for policy 0, policy_version 259860 (0.0035) [2024-06-22 17:51:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 4257660928. Throughput: 0: 42863.2. Samples: 4257755140. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:51:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 17:51:39,059][15401] Updated weights for policy 0, policy_version 259870 (0.0038) [2024-06-22 17:51:42,367][15401] Updated weights for policy 0, policy_version 259880 (0.0033) [2024-06-22 17:51:43,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43140.0, 300 sec: 43041.8). Total num frames: 4257890304. Throughput: 0: 42957.0. Samples: 4258012000. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:51:43,401][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 17:51:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259881_4257890304.pth... [2024-06-22 17:51:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259251_4247568384.pth [2024-06-22 17:51:43,992][15349] Signal inference workers to stop experience collection... (63050 times) [2024-06-22 17:51:43,992][15349] Signal inference workers to resume experience collection... (63050 times) [2024-06-22 17:51:44,022][15401] InferenceWorker_p0-w0: stopping experience collection (63050 times) [2024-06-22 17:51:44,022][15401] InferenceWorker_p0-w0: resuming experience collection (63050 times) [2024-06-22 17:51:46,731][15401] Updated weights for policy 0, policy_version 259890 (0.0046) [2024-06-22 17:51:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 4258136064. Throughput: 0: 42789.3. Samples: 4258263940. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:51:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 17:51:49,882][15401] Updated weights for policy 0, policy_version 259900 (0.0032) [2024-06-22 17:51:53,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 4258299904. Throughput: 0: 42861.0. Samples: 4258401800. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:51:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 17:51:54,304][15401] Updated weights for policy 0, policy_version 259910 (0.0038) [2024-06-22 17:51:57,382][15401] Updated weights for policy 0, policy_version 259920 (0.0043) [2024-06-22 17:51:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4258529280. Throughput: 0: 42968.4. Samples: 4258654940. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:51:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 17:52:02,087][15401] Updated weights for policy 0, policy_version 259930 (0.0029) [2024-06-22 17:52:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 43098.3). Total num frames: 4258758656. Throughput: 0: 42946.2. Samples: 4258913700. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 17:52:04,987][15401] Updated weights for policy 0, policy_version 259940 (0.0028) [2024-06-22 17:52:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 4258955264. Throughput: 0: 42726.7. Samples: 4259041800. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:08,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 17:52:09,802][15401] Updated weights for policy 0, policy_version 259950 (0.0040) [2024-06-22 17:52:12,655][15401] Updated weights for policy 0, policy_version 259960 (0.0036) [2024-06-22 17:52:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 4259201024. Throughput: 0: 42868.1. Samples: 4259299740. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 17:52:17,607][15401] Updated weights for policy 0, policy_version 259970 (0.0029) [2024-06-22 17:52:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4259397632. Throughput: 0: 42842.6. Samples: 4259551900. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 17:52:20,427][15401] Updated weights for policy 0, policy_version 259980 (0.0041) [2024-06-22 17:52:23,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.2, 300 sec: 42987.2). Total num frames: 4259594240. Throughput: 0: 42695.8. Samples: 4259676460. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 17:52:25,301][15401] Updated weights for policy 0, policy_version 259990 (0.0023) [2024-06-22 17:52:28,374][15401] Updated weights for policy 0, policy_version 260000 (0.0027) [2024-06-22 17:52:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 4259840000. Throughput: 0: 42684.7. Samples: 4259932540. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 17:52:32,866][15401] Updated weights for policy 0, policy_version 260010 (0.0035) [2024-06-22 17:52:33,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 4260003840. Throughput: 0: 42832.1. Samples: 4260191380. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:33,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 17:52:35,992][15401] Updated weights for policy 0, policy_version 260020 (0.0034) [2024-06-22 17:52:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4260233216. Throughput: 0: 42512.5. Samples: 4260314860. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 17:52:40,367][15401] Updated weights for policy 0, policy_version 260030 (0.0031) [2024-06-22 17:52:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42876.1, 300 sec: 43153.8). Total num frames: 4260462592. Throughput: 0: 42690.2. Samples: 4260576000. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 17:52:43,862][15401] Updated weights for policy 0, policy_version 260040 (0.0037) [2024-06-22 17:52:47,978][15401] Updated weights for policy 0, policy_version 260050 (0.0032) [2024-06-22 17:52:48,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42050.6, 300 sec: 42986.8). Total num frames: 4260659200. Throughput: 0: 42684.0. Samples: 4260834580. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-22 17:52:48,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 17:52:51,478][15401] Updated weights for policy 0, policy_version 260060 (0.0026) [2024-06-22 17:52:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.7, 300 sec: 43042.4). Total num frames: 4260888576. Throughput: 0: 42707.4. Samples: 4260963740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:52:53,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 17:52:55,583][15401] Updated weights for policy 0, policy_version 260070 (0.0033) [2024-06-22 17:52:58,390][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.5, 300 sec: 43154.1). Total num frames: 4261117952. Throughput: 0: 42765.3. Samples: 4261224180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:52:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 17:52:59,172][15401] Updated weights for policy 0, policy_version 260080 (0.0026) [2024-06-22 17:53:03,183][15401] Updated weights for policy 0, policy_version 260090 (0.0039) [2024-06-22 17:53:03,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 4261314560. Throughput: 0: 42907.5. Samples: 4261482740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 17:53:05,468][15349] Signal inference workers to stop experience collection... (63100 times) [2024-06-22 17:53:05,469][15349] Signal inference workers to resume experience collection... (63100 times) [2024-06-22 17:53:05,519][15401] InferenceWorker_p0-w0: stopping experience collection (63100 times) [2024-06-22 17:53:05,519][15401] InferenceWorker_p0-w0: resuming experience collection (63100 times) [2024-06-22 17:53:06,775][15401] Updated weights for policy 0, policy_version 260100 (0.0027) [2024-06-22 17:53:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 43042.8). Total num frames: 4261543936. Throughput: 0: 42940.6. Samples: 4261608780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 17:53:10,707][15401] Updated weights for policy 0, policy_version 260110 (0.0039) [2024-06-22 17:53:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 43098.2). Total num frames: 4261756928. Throughput: 0: 43121.3. Samples: 4261873000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 17:53:14,790][15401] Updated weights for policy 0, policy_version 260120 (0.0040) [2024-06-22 17:53:18,389][15401] Updated weights for policy 0, policy_version 260130 (0.0037) [2024-06-22 17:53:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 4261969920. Throughput: 0: 42867.9. Samples: 4262120440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 17:53:22,635][15401] Updated weights for policy 0, policy_version 260140 (0.0034) [2024-06-22 17:53:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 43098.2). Total num frames: 4262182912. Throughput: 0: 42967.0. Samples: 4262248380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 17:53:26,390][15401] Updated weights for policy 0, policy_version 260150 (0.0030) [2024-06-22 17:53:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 43042.7). Total num frames: 4262379520. Throughput: 0: 42871.9. Samples: 4262505240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 17:53:30,344][15401] Updated weights for policy 0, policy_version 260160 (0.0042) [2024-06-22 17:53:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4262608896. Throughput: 0: 42817.9. Samples: 4262761280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 17:53:33,937][15401] Updated weights for policy 0, policy_version 260170 (0.0038) [2024-06-22 17:53:37,866][15401] Updated weights for policy 0, policy_version 260180 (0.0040) [2024-06-22 17:53:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4262805504. Throughput: 0: 42803.6. Samples: 4262889800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:53:41,621][15401] Updated weights for policy 0, policy_version 260190 (0.0034) [2024-06-22 17:53:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 4263018496. Throughput: 0: 42713.3. Samples: 4263146280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 17:53:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260195_4263034880.pth... [2024-06-22 17:53:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259565_4252712960.pth [2024-06-22 17:53:45,357][15401] Updated weights for policy 0, policy_version 260200 (0.0033) [2024-06-22 17:53:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 4263231488. Throughput: 0: 42553.9. Samples: 4263397660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:48,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 17:53:49,267][15401] Updated weights for policy 0, policy_version 260210 (0.0036) [2024-06-22 17:53:53,096][15401] Updated weights for policy 0, policy_version 260220 (0.0030) [2024-06-22 17:53:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 4263444480. Throughput: 0: 42706.2. Samples: 4263530560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 17:53:56,859][15401] Updated weights for policy 0, policy_version 260230 (0.0023) [2024-06-22 17:53:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 43042.4). Total num frames: 4263673856. Throughput: 0: 42585.4. Samples: 4263789440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 17:53:58,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 17:54:00,666][15401] Updated weights for policy 0, policy_version 260240 (0.0028) [2024-06-22 17:54:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 4263886848. Throughput: 0: 42770.7. Samples: 4264045120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 17:54:04,416][15401] Updated weights for policy 0, policy_version 260250 (0.0038) [2024-06-22 17:54:08,359][15401] Updated weights for policy 0, policy_version 260260 (0.0039) [2024-06-22 17:54:08,390][15132] Fps is (10 sec: 42607.1, 60 sec: 42598.2, 300 sec: 42987.1). Total num frames: 4264099840. Throughput: 0: 42856.1. Samples: 4264176920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 17:54:12,151][15401] Updated weights for policy 0, policy_version 260270 (0.0029) [2024-06-22 17:54:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 4264312832. Throughput: 0: 42925.0. Samples: 4264436860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 17:54:15,930][15401] Updated weights for policy 0, policy_version 260280 (0.0024) [2024-06-22 17:54:18,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4264525824. Throughput: 0: 42879.4. Samples: 4264690860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 17:54:19,669][15401] Updated weights for policy 0, policy_version 260290 (0.0037) [2024-06-22 17:54:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42987.1). Total num frames: 4264738816. Throughput: 0: 43012.4. Samples: 4264825360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:23,399][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 17:54:23,516][15401] Updated weights for policy 0, policy_version 260300 (0.0040) [2024-06-22 17:54:27,087][15401] Updated weights for policy 0, policy_version 260310 (0.0043) [2024-06-22 17:54:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 4264968192. Throughput: 0: 43000.1. Samples: 4265081280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:28,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-22 17:54:31,186][15401] Updated weights for policy 0, policy_version 260320 (0.0033) [2024-06-22 17:54:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4265181184. Throughput: 0: 43155.2. Samples: 4265339640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 17:54:34,781][15401] Updated weights for policy 0, policy_version 260330 (0.0026) [2024-06-22 17:54:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4265377792. Throughput: 0: 43039.0. Samples: 4265467320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 17:54:38,667][15401] Updated weights for policy 0, policy_version 260340 (0.0035) [2024-06-22 17:54:40,000][15349] Signal inference workers to stop experience collection... (63150 times) [2024-06-22 17:54:40,000][15349] Signal inference workers to resume experience collection... (63150 times) [2024-06-22 17:54:40,031][15401] InferenceWorker_p0-w0: stopping experience collection (63150 times) [2024-06-22 17:54:40,031][15401] InferenceWorker_p0-w0: resuming experience collection (63150 times) [2024-06-22 17:54:42,701][15401] Updated weights for policy 0, policy_version 260350 (0.0039) [2024-06-22 17:54:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4265607168. Throughput: 0: 43052.5. Samples: 4265726700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 17:54:46,136][15401] Updated weights for policy 0, policy_version 260360 (0.0034) [2024-06-22 17:54:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4265820160. Throughput: 0: 43039.9. Samples: 4265981920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 17:54:50,284][15401] Updated weights for policy 0, policy_version 260370 (0.0027) [2024-06-22 17:54:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 4266033152. Throughput: 0: 43112.7. Samples: 4266116980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 17:54:53,948][15401] Updated weights for policy 0, policy_version 260380 (0.0039) [2024-06-22 17:54:57,996][15401] Updated weights for policy 0, policy_version 260390 (0.0032) [2024-06-22 17:54:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42932.0). Total num frames: 4266246144. Throughput: 0: 43028.9. Samples: 4266373160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:54:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 17:55:01,521][15401] Updated weights for policy 0, policy_version 260400 (0.0027) [2024-06-22 17:55:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4266475520. Throughput: 0: 43051.3. Samples: 4266628160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:55:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 17:55:05,571][15401] Updated weights for policy 0, policy_version 260410 (0.0035) [2024-06-22 17:55:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 4266672128. Throughput: 0: 43007.5. Samples: 4266760700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 17:55:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 17:55:09,158][15401] Updated weights for policy 0, policy_version 260420 (0.0037) [2024-06-22 17:55:13,128][15401] Updated weights for policy 0, policy_version 260430 (0.0026) [2024-06-22 17:55:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4266901504. Throughput: 0: 43054.6. Samples: 4267018740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 17:55:16,758][15401] Updated weights for policy 0, policy_version 260440 (0.0031) [2024-06-22 17:55:18,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43413.0, 300 sec: 43041.8). Total num frames: 4267130880. Throughput: 0: 42839.6. Samples: 4267267700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:18,397][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 17:55:20,877][15401] Updated weights for policy 0, policy_version 260450 (0.0040) [2024-06-22 17:55:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 4267311104. Throughput: 0: 42982.0. Samples: 4267401500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 17:55:24,312][15401] Updated weights for policy 0, policy_version 260460 (0.0048) [2024-06-22 17:55:28,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 4267524096. Throughput: 0: 42999.2. Samples: 4267661660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 17:55:28,416][15401] Updated weights for policy 0, policy_version 260470 (0.0034) [2024-06-22 17:55:32,275][15401] Updated weights for policy 0, policy_version 260480 (0.0040) [2024-06-22 17:55:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4267753472. Throughput: 0: 42832.5. Samples: 4267909380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 17:55:35,967][15401] Updated weights for policy 0, policy_version 260490 (0.0030) [2024-06-22 17:55:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4267966464. Throughput: 0: 42706.8. Samples: 4268038780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 17:55:40,023][15401] Updated weights for policy 0, policy_version 260500 (0.0040) [2024-06-22 17:55:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4268163072. Throughput: 0: 42667.2. Samples: 4268293180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:43,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 17:55:43,561][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260510_4268195840.pth... [2024-06-22 17:55:43,563][15401] Updated weights for policy 0, policy_version 260510 (0.0044) [2024-06-22 17:55:43,619][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000259881_4257890304.pth [2024-06-22 17:55:47,608][15401] Updated weights for policy 0, policy_version 260520 (0.0039) [2024-06-22 17:55:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4268392448. Throughput: 0: 42704.5. Samples: 4268549860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 17:55:51,366][15401] Updated weights for policy 0, policy_version 260530 (0.0029) [2024-06-22 17:55:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4268589056. Throughput: 0: 42489.0. Samples: 4268672700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 17:55:55,194][15401] Updated weights for policy 0, policy_version 260540 (0.0032) [2024-06-22 17:55:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4268818432. Throughput: 0: 42480.0. Samples: 4268930340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:55:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 17:55:58,990][15401] Updated weights for policy 0, policy_version 260550 (0.0023) [2024-06-22 17:56:02,678][15401] Updated weights for policy 0, policy_version 260560 (0.0037) [2024-06-22 17:56:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 4269031424. Throughput: 0: 42763.4. Samples: 4269191880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:56:03,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 17:56:06,762][15401] Updated weights for policy 0, policy_version 260570 (0.0032) [2024-06-22 17:56:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4269228032. Throughput: 0: 42639.1. Samples: 4269320260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:56:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 17:56:08,509][15349] Signal inference workers to stop experience collection... (63200 times) [2024-06-22 17:56:08,509][15349] Signal inference workers to resume experience collection... (63200 times) [2024-06-22 17:56:08,530][15401] InferenceWorker_p0-w0: stopping experience collection (63200 times) [2024-06-22 17:56:08,564][15401] InferenceWorker_p0-w0: resuming experience collection (63200 times) [2024-06-22 17:56:10,338][15401] Updated weights for policy 0, policy_version 260580 (0.0037) [2024-06-22 17:56:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4269457408. Throughput: 0: 42483.5. Samples: 4269573420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:56:13,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 17:56:14,552][15401] Updated weights for policy 0, policy_version 260590 (0.0043) [2024-06-22 17:56:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42056.8, 300 sec: 42709.5). Total num frames: 4269654016. Throughput: 0: 42787.6. Samples: 4269834820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 17:56:18,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-22 17:56:18,462][15401] Updated weights for policy 0, policy_version 260600 (0.0043) [2024-06-22 17:56:22,137][15401] Updated weights for policy 0, policy_version 260610 (0.0039) [2024-06-22 17:56:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4269867008. Throughput: 0: 42631.0. Samples: 4269957180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:23,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-22 17:56:25,992][15401] Updated weights for policy 0, policy_version 260620 (0.0036) [2024-06-22 17:56:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4270112768. Throughput: 0: 42789.4. Samples: 4270218700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 17:56:29,685][15401] Updated weights for policy 0, policy_version 260630 (0.0037) [2024-06-22 17:56:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4270292992. Throughput: 0: 42854.1. Samples: 4270478300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 17:56:33,676][15401] Updated weights for policy 0, policy_version 260640 (0.0046) [2024-06-22 17:56:37,456][15401] Updated weights for policy 0, policy_version 260650 (0.0034) [2024-06-22 17:56:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 4270505984. Throughput: 0: 42884.8. Samples: 4270602520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 17:56:41,349][15401] Updated weights for policy 0, policy_version 260660 (0.0026) [2024-06-22 17:56:43,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 4270751744. Throughput: 0: 42807.9. Samples: 4270856800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:43,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 17:56:45,405][15401] Updated weights for policy 0, policy_version 260670 (0.0039) [2024-06-22 17:56:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4270931968. Throughput: 0: 42775.2. Samples: 4271116660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 17:56:48,977][15401] Updated weights for policy 0, policy_version 260680 (0.0027) [2024-06-22 17:56:53,050][15401] Updated weights for policy 0, policy_version 260690 (0.0032) [2024-06-22 17:56:53,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4271144960. Throughput: 0: 42589.3. Samples: 4271236780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 17:56:56,568][15401] Updated weights for policy 0, policy_version 260700 (0.0043) [2024-06-22 17:56:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4271407104. Throughput: 0: 42736.4. Samples: 4271496560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:56:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 17:57:01,003][15401] Updated weights for policy 0, policy_version 260710 (0.0040) [2024-06-22 17:57:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 4271587328. Throughput: 0: 42595.4. Samples: 4271751620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:03,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 17:57:04,223][15401] Updated weights for policy 0, policy_version 260720 (0.0027) [2024-06-22 17:57:08,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 4271783936. Throughput: 0: 42661.4. Samples: 4271877040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:08,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 17:57:08,662][15401] Updated weights for policy 0, policy_version 260730 (0.0033) [2024-06-22 17:57:11,825][15401] Updated weights for policy 0, policy_version 260740 (0.0040) [2024-06-22 17:57:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4272013312. Throughput: 0: 42633.3. Samples: 4272137200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:13,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 17:57:16,270][15401] Updated weights for policy 0, policy_version 260750 (0.0038) [2024-06-22 17:57:18,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 4272226304. Throughput: 0: 42570.7. Samples: 4272394080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:18,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 17:57:19,392][15349] Signal inference workers to stop experience collection... (63250 times) [2024-06-22 17:57:19,427][15401] InferenceWorker_p0-w0: stopping experience collection (63250 times) [2024-06-22 17:57:19,439][15349] Signal inference workers to resume experience collection... (63250 times) [2024-06-22 17:57:19,449][15401] InferenceWorker_p0-w0: resuming experience collection (63250 times) [2024-06-22 17:57:19,578][15401] Updated weights for policy 0, policy_version 260760 (0.0040) [2024-06-22 17:57:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4272439296. Throughput: 0: 42590.2. Samples: 4272519080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 17:57:23,919][15401] Updated weights for policy 0, policy_version 260770 (0.0038) [2024-06-22 17:57:27,191][15401] Updated weights for policy 0, policy_version 260780 (0.0033) [2024-06-22 17:57:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4272668672. Throughput: 0: 42805.1. Samples: 4272782920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 17:57:31,342][15401] Updated weights for policy 0, policy_version 260790 (0.0029) [2024-06-22 17:57:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4272865280. Throughput: 0: 42679.1. Samples: 4273037220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 17:57:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 17:57:34,836][15401] Updated weights for policy 0, policy_version 260800 (0.0030) [2024-06-22 17:57:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4273094656. Throughput: 0: 42823.9. Samples: 4273163860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:57:38,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 17:57:38,794][15401] Updated weights for policy 0, policy_version 260810 (0.0047) [2024-06-22 17:57:42,479][15401] Updated weights for policy 0, policy_version 260820 (0.0032) [2024-06-22 17:57:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.2, 300 sec: 42876.5). Total num frames: 4273307648. Throughput: 0: 42972.6. Samples: 4273430320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:57:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 17:57:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260823_4273324032.pth... [2024-06-22 17:57:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260195_4263034880.pth [2024-06-22 17:57:46,228][15401] Updated weights for policy 0, policy_version 260830 (0.0052) [2024-06-22 17:57:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4273520640. Throughput: 0: 42974.7. Samples: 4273685480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:57:48,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 17:57:50,425][15401] Updated weights for policy 0, policy_version 260840 (0.0035) [2024-06-22 17:57:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4273733632. Throughput: 0: 42966.8. Samples: 4273810440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:57:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 17:57:53,755][15401] Updated weights for policy 0, policy_version 260850 (0.0038) [2024-06-22 17:57:57,828][15401] Updated weights for policy 0, policy_version 260860 (0.0047) [2024-06-22 17:57:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4273946624. Throughput: 0: 43002.2. Samples: 4274072300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:57:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 17:58:01,427][15401] Updated weights for policy 0, policy_version 260870 (0.0037) [2024-06-22 17:58:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4274159616. Throughput: 0: 42899.6. Samples: 4274324460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 17:58:05,609][15401] Updated weights for policy 0, policy_version 260880 (0.0035) [2024-06-22 17:58:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 4274388992. Throughput: 0: 43021.5. Samples: 4274455040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 17:58:09,433][15401] Updated weights for policy 0, policy_version 260890 (0.0038) [2024-06-22 17:58:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4274569216. Throughput: 0: 42826.6. Samples: 4274710120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 17:58:13,400][15401] Updated weights for policy 0, policy_version 260900 (0.0047) [2024-06-22 17:58:17,140][15401] Updated weights for policy 0, policy_version 260910 (0.0041) [2024-06-22 17:58:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4274798592. Throughput: 0: 42698.6. Samples: 4274958660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 17:58:20,999][15401] Updated weights for policy 0, policy_version 260920 (0.0033) [2024-06-22 17:58:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4275027968. Throughput: 0: 42741.3. Samples: 4275087220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:23,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 17:58:24,612][15401] Updated weights for policy 0, policy_version 260930 (0.0038) [2024-06-22 17:58:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4275224576. Throughput: 0: 42745.7. Samples: 4275353880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:28,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 17:58:28,603][15401] Updated weights for policy 0, policy_version 260940 (0.0037) [2024-06-22 17:58:32,308][15401] Updated weights for policy 0, policy_version 260950 (0.0035) [2024-06-22 17:58:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4275437568. Throughput: 0: 42553.7. Samples: 4275600400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 17:58:36,140][15401] Updated weights for policy 0, policy_version 260960 (0.0040) [2024-06-22 17:58:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4275666944. Throughput: 0: 42686.2. Samples: 4275731320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 17:58:39,838][15401] Updated weights for policy 0, policy_version 260970 (0.0025) [2024-06-22 17:58:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4275879936. Throughput: 0: 42709.3. Samples: 4275994220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 17:58:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 17:58:43,780][15401] Updated weights for policy 0, policy_version 260980 (0.0041) [2024-06-22 17:58:45,235][15349] Signal inference workers to stop experience collection... (63300 times) [2024-06-22 17:58:45,287][15401] InferenceWorker_p0-w0: stopping experience collection (63300 times) [2024-06-22 17:58:45,294][15349] Signal inference workers to resume experience collection... (63300 times) [2024-06-22 17:58:45,301][15401] InferenceWorker_p0-w0: resuming experience collection (63300 times) [2024-06-22 17:58:47,450][15401] Updated weights for policy 0, policy_version 260990 (0.0044) [2024-06-22 17:58:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4276076544. Throughput: 0: 42671.1. Samples: 4276244660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:58:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 17:58:51,734][15401] Updated weights for policy 0, policy_version 261000 (0.0036) [2024-06-22 17:58:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 4276305920. Throughput: 0: 42636.7. Samples: 4276373700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:58:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 17:58:55,061][15401] Updated weights for policy 0, policy_version 261010 (0.0032) [2024-06-22 17:58:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4276502528. Throughput: 0: 42604.5. Samples: 4276627320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:58:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 17:58:59,803][15401] Updated weights for policy 0, policy_version 261020 (0.0032) [2024-06-22 17:59:02,988][15401] Updated weights for policy 0, policy_version 261030 (0.0036) [2024-06-22 17:59:03,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 4276715520. Throughput: 0: 42797.0. Samples: 4276884520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 17:59:07,151][15401] Updated weights for policy 0, policy_version 261040 (0.0029) [2024-06-22 17:59:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4276944896. Throughput: 0: 42971.6. Samples: 4277020940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 17:59:10,507][15401] Updated weights for policy 0, policy_version 261050 (0.0032) [2024-06-22 17:59:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4277141504. Throughput: 0: 42790.2. Samples: 4277279440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 17:59:14,575][15401] Updated weights for policy 0, policy_version 261060 (0.0028) [2024-06-22 17:59:18,074][15401] Updated weights for policy 0, policy_version 261070 (0.0030) [2024-06-22 17:59:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4277370880. Throughput: 0: 42925.3. Samples: 4277532040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 17:59:22,298][15401] Updated weights for policy 0, policy_version 261080 (0.0027) [2024-06-22 17:59:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4277583872. Throughput: 0: 42995.0. Samples: 4277666100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 17:59:25,532][15401] Updated weights for policy 0, policy_version 261090 (0.0027) [2024-06-22 17:59:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4277796864. Throughput: 0: 42870.3. Samples: 4277923380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 17:59:29,886][15401] Updated weights for policy 0, policy_version 261100 (0.0033) [2024-06-22 17:59:32,963][15401] Updated weights for policy 0, policy_version 261110 (0.0032) [2024-06-22 17:59:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4278026240. Throughput: 0: 42860.2. Samples: 4278173380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:33,396][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 17:59:37,460][15401] Updated weights for policy 0, policy_version 261120 (0.0023) [2024-06-22 17:59:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4278239232. Throughput: 0: 43148.6. Samples: 4278315380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 17:59:40,716][15401] Updated weights for policy 0, policy_version 261130 (0.0035) [2024-06-22 17:59:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4278435840. Throughput: 0: 43180.5. Samples: 4278570440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 17:59:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261135_4278435840.pth... [2024-06-22 17:59:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260510_4268195840.pth [2024-06-22 17:59:44,827][15401] Updated weights for policy 0, policy_version 261140 (0.0026) [2024-06-22 17:59:48,243][15401] Updated weights for policy 0, policy_version 261150 (0.0035) [2024-06-22 17:59:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 4278681600. Throughput: 0: 43075.7. Samples: 4278822940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:48,391][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 17:59:52,725][15401] Updated weights for policy 0, policy_version 261160 (0.0025) [2024-06-22 17:59:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4278878208. Throughput: 0: 43085.7. Samples: 4278959800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-22 17:59:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 17:59:55,932][15401] Updated weights for policy 0, policy_version 261170 (0.0026) [2024-06-22 17:59:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4279091200. Throughput: 0: 42999.2. Samples: 4279214400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 17:59:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 18:00:00,216][15401] Updated weights for policy 0, policy_version 261180 (0.0042) [2024-06-22 18:00:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4279304192. Throughput: 0: 43101.1. Samples: 4279471580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 18:00:03,586][15401] Updated weights for policy 0, policy_version 261190 (0.0032) [2024-06-22 18:00:07,860][15401] Updated weights for policy 0, policy_version 261200 (0.0033) [2024-06-22 18:00:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4279533568. Throughput: 0: 43052.4. Samples: 4279603560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:08,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 18:00:11,148][15401] Updated weights for policy 0, policy_version 261210 (0.0030) [2024-06-22 18:00:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42765.9). Total num frames: 4279746560. Throughput: 0: 42994.5. Samples: 4279858140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 18:00:14,613][15349] Signal inference workers to stop experience collection... (63350 times) [2024-06-22 18:00:14,663][15401] InferenceWorker_p0-w0: stopping experience collection (63350 times) [2024-06-22 18:00:14,671][15349] Signal inference workers to resume experience collection... (63350 times) [2024-06-22 18:00:14,679][15401] InferenceWorker_p0-w0: resuming experience collection (63350 times) [2024-06-22 18:00:15,530][15401] Updated weights for policy 0, policy_version 261220 (0.0021) [2024-06-22 18:00:18,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4279959552. Throughput: 0: 43167.7. Samples: 4280115920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 18:00:18,685][15401] Updated weights for policy 0, policy_version 261230 (0.0049) [2024-06-22 18:00:23,064][15401] Updated weights for policy 0, policy_version 261240 (0.0037) [2024-06-22 18:00:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4280172544. Throughput: 0: 42900.9. Samples: 4280245920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:23,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 18:00:26,543][15401] Updated weights for policy 0, policy_version 261250 (0.0033) [2024-06-22 18:00:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4280369152. Throughput: 0: 42885.3. Samples: 4280500280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:28,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 18:00:30,678][15401] Updated weights for policy 0, policy_version 261260 (0.0030) [2024-06-22 18:00:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4280598528. Throughput: 0: 43126.7. Samples: 4280763640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 18:00:34,187][15401] Updated weights for policy 0, policy_version 261270 (0.0044) [2024-06-22 18:00:38,306][15401] Updated weights for policy 0, policy_version 261280 (0.0025) [2024-06-22 18:00:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4280811520. Throughput: 0: 42886.3. Samples: 4280889680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 18:00:41,779][15401] Updated weights for policy 0, policy_version 261290 (0.0037) [2024-06-22 18:00:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4281024512. Throughput: 0: 42848.4. Samples: 4281142580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 18:00:45,844][15401] Updated weights for policy 0, policy_version 261300 (0.0033) [2024-06-22 18:00:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4281221120. Throughput: 0: 42864.4. Samples: 4281400480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 18:00:49,826][15401] Updated weights for policy 0, policy_version 261310 (0.0032) [2024-06-22 18:00:53,387][15401] Updated weights for policy 0, policy_version 261320 (0.0038) [2024-06-22 18:00:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 4281466880. Throughput: 0: 42768.4. Samples: 4281528140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:53,401][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 18:00:57,511][15401] Updated weights for policy 0, policy_version 261330 (0.0048) [2024-06-22 18:00:58,394][15132] Fps is (10 sec: 44216.8, 60 sec: 42868.2, 300 sec: 42820.2). Total num frames: 4281663488. Throughput: 0: 42723.8. Samples: 4281780900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:00:58,395][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 18:01:01,444][15401] Updated weights for policy 0, policy_version 261340 (0.0035) [2024-06-22 18:01:03,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4281876480. Throughput: 0: 42621.3. Samples: 4282033880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:01:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 18:01:05,042][15401] Updated weights for policy 0, policy_version 261350 (0.0033) [2024-06-22 18:01:08,392][15132] Fps is (10 sec: 40968.7, 60 sec: 42325.4, 300 sec: 42764.7). Total num frames: 4282073088. Throughput: 0: 42549.8. Samples: 4282160760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:08,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 18:01:09,084][15401] Updated weights for policy 0, policy_version 261360 (0.0034) [2024-06-22 18:01:12,860][15401] Updated weights for policy 0, policy_version 261370 (0.0039) [2024-06-22 18:01:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4282302464. Throughput: 0: 42674.3. Samples: 4282420620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 18:01:16,785][15401] Updated weights for policy 0, policy_version 261380 (0.0029) [2024-06-22 18:01:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4282515456. Throughput: 0: 42454.8. Samples: 4282674100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:01:20,421][15401] Updated weights for policy 0, policy_version 261390 (0.0037) [2024-06-22 18:01:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4282712064. Throughput: 0: 42554.6. Samples: 4282804640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 18:01:24,611][15401] Updated weights for policy 0, policy_version 261400 (0.0035) [2024-06-22 18:01:28,208][15401] Updated weights for policy 0, policy_version 261410 (0.0034) [2024-06-22 18:01:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4282941440. Throughput: 0: 42557.4. Samples: 4283057660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 18:01:32,173][15401] Updated weights for policy 0, policy_version 261420 (0.0039) [2024-06-22 18:01:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4283154432. Throughput: 0: 42460.0. Samples: 4283311180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 18:01:35,862][15401] Updated weights for policy 0, policy_version 261430 (0.0028) [2024-06-22 18:01:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 4283351040. Throughput: 0: 42394.7. Samples: 4283435800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:38,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-22 18:01:39,890][15401] Updated weights for policy 0, policy_version 261440 (0.0048) [2024-06-22 18:01:42,152][15349] Signal inference workers to stop experience collection... (63400 times) [2024-06-22 18:01:42,152][15349] Signal inference workers to resume experience collection... (63400 times) [2024-06-22 18:01:42,188][15401] InferenceWorker_p0-w0: stopping experience collection (63400 times) [2024-06-22 18:01:42,188][15401] InferenceWorker_p0-w0: resuming experience collection (63400 times) [2024-06-22 18:01:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4283580416. Throughput: 0: 42569.4. Samples: 4283696340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 18:01:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261449_4283580416.pth... [2024-06-22 18:01:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000260823_4273324032.pth [2024-06-22 18:01:43,741][15401] Updated weights for policy 0, policy_version 261450 (0.0028) [2024-06-22 18:01:47,502][15401] Updated weights for policy 0, policy_version 261460 (0.0037) [2024-06-22 18:01:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4283777024. Throughput: 0: 42608.4. Samples: 4283951260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 18:01:51,358][15401] Updated weights for policy 0, policy_version 261470 (0.0027) [2024-06-22 18:01:53,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 4284006400. Throughput: 0: 42528.4. Samples: 4284074540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:53,393][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 18:01:55,054][15401] Updated weights for policy 0, policy_version 261480 (0.0039) [2024-06-22 18:01:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42601.6, 300 sec: 42820.6). Total num frames: 4284219392. Throughput: 0: 42647.9. Samples: 4284339780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:01:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 18:01:59,128][15401] Updated weights for policy 0, policy_version 261490 (0.0029) [2024-06-22 18:02:02,736][15401] Updated weights for policy 0, policy_version 261500 (0.0038) [2024-06-22 18:02:03,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 4284416000. Throughput: 0: 42513.4. Samples: 4284587200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:02:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 18:02:07,191][15401] Updated weights for policy 0, policy_version 261510 (0.0037) [2024-06-22 18:02:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 4284645376. Throughput: 0: 42432.6. Samples: 4284714100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:02:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 18:02:10,378][15401] Updated weights for policy 0, policy_version 261520 (0.0045) [2024-06-22 18:02:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 4284874752. Throughput: 0: 42684.9. Samples: 4284978480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 18:02:13,399][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 18:02:14,640][15401] Updated weights for policy 0, policy_version 261530 (0.0043) [2024-06-22 18:02:18,248][15401] Updated weights for policy 0, policy_version 261540 (0.0037) [2024-06-22 18:02:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4285071360. Throughput: 0: 42695.2. Samples: 4285232460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 18:02:22,318][15401] Updated weights for policy 0, policy_version 261550 (0.0033) [2024-06-22 18:02:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4285300736. Throughput: 0: 42753.9. Samples: 4285359720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 18:02:25,857][15401] Updated weights for policy 0, policy_version 261560 (0.0032) [2024-06-22 18:02:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4285480960. Throughput: 0: 42758.5. Samples: 4285620460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 18:02:29,893][15401] Updated weights for policy 0, policy_version 261570 (0.0048) [2024-06-22 18:02:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4285710336. Throughput: 0: 42733.4. Samples: 4285874260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 18:02:33,819][15401] Updated weights for policy 0, policy_version 261580 (0.0032) [2024-06-22 18:02:37,482][15401] Updated weights for policy 0, policy_version 261590 (0.0036) [2024-06-22 18:02:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4285939712. Throughput: 0: 42942.3. Samples: 4286006840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 18:02:41,586][15401] Updated weights for policy 0, policy_version 261600 (0.0044) [2024-06-22 18:02:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.5, 300 sec: 42653.9). Total num frames: 4286103552. Throughput: 0: 42640.1. Samples: 4286258580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:43,392][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 18:02:43,528][15349] Signal inference workers to stop experience collection... (63450 times) [2024-06-22 18:02:43,584][15401] InferenceWorker_p0-w0: stopping experience collection (63450 times) [2024-06-22 18:02:43,648][15349] Signal inference workers to resume experience collection... (63450 times) [2024-06-22 18:02:43,648][15401] InferenceWorker_p0-w0: resuming experience collection (63450 times) [2024-06-22 18:02:45,080][15401] Updated weights for policy 0, policy_version 261610 (0.0029) [2024-06-22 18:02:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4286349312. Throughput: 0: 42819.5. Samples: 4286514080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 18:02:49,103][15401] Updated weights for policy 0, policy_version 261620 (0.0029) [2024-06-22 18:02:52,751][15401] Updated weights for policy 0, policy_version 261630 (0.0043) [2024-06-22 18:02:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4286562304. Throughput: 0: 42883.9. Samples: 4286643880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 18:02:56,635][15401] Updated weights for policy 0, policy_version 261640 (0.0033) [2024-06-22 18:02:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4286758912. Throughput: 0: 42575.1. Samples: 4286894360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:02:58,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 18:03:00,357][15401] Updated weights for policy 0, policy_version 261650 (0.0041) [2024-06-22 18:03:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4286988288. Throughput: 0: 42672.4. Samples: 4287152720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 18:03:04,166][15401] Updated weights for policy 0, policy_version 261660 (0.0046) [2024-06-22 18:03:07,964][15401] Updated weights for policy 0, policy_version 261670 (0.0038) [2024-06-22 18:03:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4287201280. Throughput: 0: 42739.5. Samples: 4287283000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 18:03:11,622][15401] Updated weights for policy 0, policy_version 261680 (0.0033) [2024-06-22 18:03:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4287414272. Throughput: 0: 42630.1. Samples: 4287538820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:13,391][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 18:03:15,912][15401] Updated weights for policy 0, policy_version 261690 (0.0046) [2024-06-22 18:03:18,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42597.9, 300 sec: 42709.4). Total num frames: 4287627264. Throughput: 0: 42755.0. Samples: 4287798260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:18,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 18:03:19,151][15401] Updated weights for policy 0, policy_version 261700 (0.0027) [2024-06-22 18:03:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 4287840256. Throughput: 0: 42804.4. Samples: 4287933140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:23,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 18:03:23,494][15401] Updated weights for policy 0, policy_version 261710 (0.0024) [2024-06-22 18:03:26,653][15401] Updated weights for policy 0, policy_version 261720 (0.0032) [2024-06-22 18:03:28,390][15132] Fps is (10 sec: 42600.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4288053248. Throughput: 0: 42766.6. Samples: 4288183080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-22 18:03:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 18:03:31,039][15401] Updated weights for policy 0, policy_version 261730 (0.0034) [2024-06-22 18:03:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4288282624. Throughput: 0: 42859.5. Samples: 4288442760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 18:03:34,304][15401] Updated weights for policy 0, policy_version 261740 (0.0026) [2024-06-22 18:03:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4288495616. Throughput: 0: 43005.0. Samples: 4288579100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:38,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-22 18:03:38,436][15401] Updated weights for policy 0, policy_version 261750 (0.0034) [2024-06-22 18:03:42,001][15401] Updated weights for policy 0, policy_version 261760 (0.0050) [2024-06-22 18:03:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4288692224. Throughput: 0: 42981.0. Samples: 4288828500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 18:03:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261761_4288692224.pth... [2024-06-22 18:03:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261135_4278435840.pth [2024-06-22 18:03:46,468][15401] Updated weights for policy 0, policy_version 261770 (0.0035) [2024-06-22 18:03:48,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4288937984. Throughput: 0: 42868.2. Samples: 4289081800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 18:03:49,886][15401] Updated weights for policy 0, policy_version 261780 (0.0025) [2024-06-22 18:03:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4289118208. Throughput: 0: 43027.2. Samples: 4289219220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:53,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 18:03:54,259][15401] Updated weights for policy 0, policy_version 261790 (0.0045) [2024-06-22 18:03:57,844][15401] Updated weights for policy 0, policy_version 261800 (0.0033) [2024-06-22 18:03:58,390][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 4289347584. Throughput: 0: 42870.3. Samples: 4289467980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:03:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 18:04:01,885][15401] Updated weights for policy 0, policy_version 261810 (0.0034) [2024-06-22 18:04:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4289576960. Throughput: 0: 42678.4. Samples: 4289718760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 18:04:05,634][15401] Updated weights for policy 0, policy_version 261820 (0.0027) [2024-06-22 18:04:08,107][15349] Signal inference workers to stop experience collection... (63500 times) [2024-06-22 18:04:08,107][15349] Signal inference workers to resume experience collection... (63500 times) [2024-06-22 18:04:08,148][15401] InferenceWorker_p0-w0: stopping experience collection (63500 times) [2024-06-22 18:04:08,148][15401] InferenceWorker_p0-w0: resuming experience collection (63500 times) [2024-06-22 18:04:08,396][15132] Fps is (10 sec: 42572.7, 60 sec: 42867.2, 300 sec: 42819.7). Total num frames: 4289773568. Throughput: 0: 42594.8. Samples: 4289850060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:08,396][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 18:04:09,541][15401] Updated weights for policy 0, policy_version 261830 (0.0042) [2024-06-22 18:04:13,312][15401] Updated weights for policy 0, policy_version 261840 (0.0040) [2024-06-22 18:04:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4289986560. Throughput: 0: 42681.9. Samples: 4290103760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 18:04:17,214][15401] Updated weights for policy 0, policy_version 261850 (0.0047) [2024-06-22 18:04:18,390][15132] Fps is (10 sec: 44263.4, 60 sec: 43144.9, 300 sec: 42820.6). Total num frames: 4290215936. Throughput: 0: 42640.0. Samples: 4290361560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 18:04:20,900][15401] Updated weights for policy 0, policy_version 261860 (0.0030) [2024-06-22 18:04:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 4290396160. Throughput: 0: 42543.1. Samples: 4290493540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 18:04:24,863][15401] Updated weights for policy 0, policy_version 261870 (0.0035) [2024-06-22 18:04:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4290625536. Throughput: 0: 42522.2. Samples: 4290742000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 18:04:28,564][15401] Updated weights for policy 0, policy_version 261880 (0.0038) [2024-06-22 18:04:32,702][15401] Updated weights for policy 0, policy_version 261890 (0.0035) [2024-06-22 18:04:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4290854912. Throughput: 0: 42612.2. Samples: 4290999340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 18:04:36,484][15401] Updated weights for policy 0, policy_version 261900 (0.0028) [2024-06-22 18:04:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4291051520. Throughput: 0: 42443.1. Samples: 4291129160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-22 18:04:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 18:04:40,231][15401] Updated weights for policy 0, policy_version 261910 (0.0038) [2024-06-22 18:04:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4291280896. Throughput: 0: 42546.1. Samples: 4291382560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:04:43,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-22 18:04:44,193][15401] Updated weights for policy 0, policy_version 261920 (0.0045) [2024-06-22 18:04:47,891][15401] Updated weights for policy 0, policy_version 261930 (0.0027) [2024-06-22 18:04:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4291477504. Throughput: 0: 42654.3. Samples: 4291638200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:04:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 18:04:51,728][15401] Updated weights for policy 0, policy_version 261940 (0.0036) [2024-06-22 18:04:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4291690496. Throughput: 0: 42654.3. Samples: 4291769240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:04:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 18:04:55,393][15401] Updated weights for policy 0, policy_version 261950 (0.0031) [2024-06-22 18:04:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4291919872. Throughput: 0: 42566.2. Samples: 4292019240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:04:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 18:04:59,412][15401] Updated weights for policy 0, policy_version 261960 (0.0035) [2024-06-22 18:05:03,089][15401] Updated weights for policy 0, policy_version 261970 (0.0033) [2024-06-22 18:05:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4292116480. Throughput: 0: 42589.0. Samples: 4292278060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 18:05:07,011][15401] Updated weights for policy 0, policy_version 261980 (0.0028) [2024-06-22 18:05:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42056.6, 300 sec: 42542.9). Total num frames: 4292296704. Throughput: 0: 42458.7. Samples: 4292404180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:05:11,067][15401] Updated weights for policy 0, policy_version 261990 (0.0034) [2024-06-22 18:05:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4292558848. Throughput: 0: 42625.3. Samples: 4292660140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:13,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-22 18:05:14,546][15401] Updated weights for policy 0, policy_version 262000 (0.0041) [2024-06-22 18:05:18,392][15132] Fps is (10 sec: 44227.4, 60 sec: 42050.9, 300 sec: 42598.1). Total num frames: 4292739072. Throughput: 0: 42753.2. Samples: 4292923320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:18,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 18:05:18,565][15401] Updated weights for policy 0, policy_version 262010 (0.0036) [2024-06-22 18:05:18,765][15349] Signal inference workers to stop experience collection... (63550 times) [2024-06-22 18:05:18,822][15401] InferenceWorker_p0-w0: stopping experience collection (63550 times) [2024-06-22 18:05:18,826][15349] Signal inference workers to resume experience collection... (63550 times) [2024-06-22 18:05:18,837][15401] InferenceWorker_p0-w0: resuming experience collection (63550 times) [2024-06-22 18:05:22,798][15401] Updated weights for policy 0, policy_version 262020 (0.0024) [2024-06-22 18:05:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4292952064. Throughput: 0: 42720.5. Samples: 4293051580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 18:05:26,145][15401] Updated weights for policy 0, policy_version 262030 (0.0038) [2024-06-22 18:05:28,390][15132] Fps is (10 sec: 44245.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4293181440. Throughput: 0: 42691.6. Samples: 4293303680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:28,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 18:05:30,442][15401] Updated weights for policy 0, policy_version 262040 (0.0024) [2024-06-22 18:05:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4293410816. Throughput: 0: 42951.9. Samples: 4293571040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 18:05:33,754][15401] Updated weights for policy 0, policy_version 262050 (0.0033) [2024-06-22 18:05:38,033][15401] Updated weights for policy 0, policy_version 262060 (0.0033) [2024-06-22 18:05:38,391][15132] Fps is (10 sec: 42591.7, 60 sec: 42597.2, 300 sec: 42653.7). Total num frames: 4293607424. Throughput: 0: 42741.5. Samples: 4293692680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:38,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 18:05:41,367][15401] Updated weights for policy 0, policy_version 262070 (0.0022) [2024-06-22 18:05:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4293836800. Throughput: 0: 42825.3. Samples: 4293946380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 18:05:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262076_4293853184.pth... [2024-06-22 18:05:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261449_4283580416.pth [2024-06-22 18:05:45,670][15401] Updated weights for policy 0, policy_version 262080 (0.0034) [2024-06-22 18:05:48,390][15132] Fps is (10 sec: 44243.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 4294049792. Throughput: 0: 42874.5. Samples: 4294207420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:05:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 18:05:49,054][15401] Updated weights for policy 0, policy_version 262090 (0.0044) [2024-06-22 18:05:53,324][15401] Updated weights for policy 0, policy_version 262100 (0.0032) [2024-06-22 18:05:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 4294246400. Throughput: 0: 42756.4. Samples: 4294328220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:05:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 18:05:56,778][15401] Updated weights for policy 0, policy_version 262110 (0.0028) [2024-06-22 18:05:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4294475776. Throughput: 0: 42694.3. Samples: 4294581380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:05:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 18:06:00,950][15401] Updated weights for policy 0, policy_version 262120 (0.0022) [2024-06-22 18:06:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4294672384. Throughput: 0: 42638.9. Samples: 4294841980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 18:06:04,508][15401] Updated weights for policy 0, policy_version 262130 (0.0038) [2024-06-22 18:06:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4294885376. Throughput: 0: 42553.7. Samples: 4294966500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 18:06:08,526][15401] Updated weights for policy 0, policy_version 262140 (0.0035) [2024-06-22 18:06:12,107][15401] Updated weights for policy 0, policy_version 262150 (0.0024) [2024-06-22 18:06:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4295114752. Throughput: 0: 42617.0. Samples: 4295221440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 18:06:16,340][15401] Updated weights for policy 0, policy_version 262160 (0.0028) [2024-06-22 18:06:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 4295294976. Throughput: 0: 42429.3. Samples: 4295480360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 18:06:19,759][15401] Updated weights for policy 0, policy_version 262170 (0.0028) [2024-06-22 18:06:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4295524352. Throughput: 0: 42513.1. Samples: 4295605700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 18:06:24,144][15401] Updated weights for policy 0, policy_version 262180 (0.0040) [2024-06-22 18:06:27,689][15401] Updated weights for policy 0, policy_version 262190 (0.0040) [2024-06-22 18:06:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4295737344. Throughput: 0: 42560.0. Samples: 4295861580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 18:06:31,655][15401] Updated weights for policy 0, policy_version 262200 (0.0039) [2024-06-22 18:06:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4295933952. Throughput: 0: 42370.7. Samples: 4296114100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 18:06:35,473][15401] Updated weights for policy 0, policy_version 262210 (0.0040) [2024-06-22 18:06:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42872.6, 300 sec: 42709.5). Total num frames: 4296179712. Throughput: 0: 42474.2. Samples: 4296239560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 18:06:39,287][15401] Updated weights for policy 0, policy_version 262220 (0.0041) [2024-06-22 18:06:43,340][15401] Updated weights for policy 0, policy_version 262230 (0.0041) [2024-06-22 18:06:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4296376320. Throughput: 0: 42677.2. Samples: 4296501860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:06:47,066][15401] Updated weights for policy 0, policy_version 262240 (0.0032) [2024-06-22 18:06:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 4296589312. Throughput: 0: 42501.2. Samples: 4296754540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:06:49,110][15349] Signal inference workers to stop experience collection... (63600 times) [2024-06-22 18:06:49,111][15349] Signal inference workers to resume experience collection... (63600 times) [2024-06-22 18:06:49,125][15401] InferenceWorker_p0-w0: stopping experience collection (63600 times) [2024-06-22 18:06:49,125][15401] InferenceWorker_p0-w0: resuming experience collection (63600 times) [2024-06-22 18:06:51,031][15401] Updated weights for policy 0, policy_version 262250 (0.0028) [2024-06-22 18:06:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4296818688. Throughput: 0: 42578.6. Samples: 4296882540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 18:06:54,653][15401] Updated weights for policy 0, policy_version 262260 (0.0037) [2024-06-22 18:06:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4297015296. Throughput: 0: 42770.2. Samples: 4297146100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 18:06:58,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 18:06:58,511][15401] Updated weights for policy 0, policy_version 262270 (0.0039) [2024-06-22 18:07:02,243][15401] Updated weights for policy 0, policy_version 262280 (0.0034) [2024-06-22 18:07:03,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4297244672. Throughput: 0: 42495.1. Samples: 4297392740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:03,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 18:07:06,166][15401] Updated weights for policy 0, policy_version 262290 (0.0028) [2024-06-22 18:07:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4297474048. Throughput: 0: 42752.4. Samples: 4297529560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:08,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 18:07:09,901][15401] Updated weights for policy 0, policy_version 262300 (0.0034) [2024-06-22 18:07:13,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4297654272. Throughput: 0: 42836.3. Samples: 4297789320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:13,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 18:07:14,018][15401] Updated weights for policy 0, policy_version 262310 (0.0035) [2024-06-22 18:07:17,398][15401] Updated weights for policy 0, policy_version 262320 (0.0041) [2024-06-22 18:07:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4297883648. Throughput: 0: 42996.1. Samples: 4298048920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 18:07:21,520][15401] Updated weights for policy 0, policy_version 262330 (0.0045) [2024-06-22 18:07:23,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4298129408. Throughput: 0: 43128.0. Samples: 4298180320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 18:07:24,916][15401] Updated weights for policy 0, policy_version 262340 (0.0030) [2024-06-22 18:07:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4298309632. Throughput: 0: 43037.8. Samples: 4298438560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 18:07:28,976][15401] Updated weights for policy 0, policy_version 262350 (0.0028) [2024-06-22 18:07:32,470][15401] Updated weights for policy 0, policy_version 262360 (0.0033) [2024-06-22 18:07:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4298522624. Throughput: 0: 43170.3. Samples: 4298697200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:07:36,428][15401] Updated weights for policy 0, policy_version 262370 (0.0042) [2024-06-22 18:07:38,392][15132] Fps is (10 sec: 45865.6, 60 sec: 43143.0, 300 sec: 42931.3). Total num frames: 4298768384. Throughput: 0: 43264.8. Samples: 4298829540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:38,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 18:07:40,139][15401] Updated weights for policy 0, policy_version 262380 (0.0040) [2024-06-22 18:07:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4298964992. Throughput: 0: 43047.9. Samples: 4299083260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 18:07:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262388_4298964992.pth... [2024-06-22 18:07:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000261761_4288692224.pth [2024-06-22 18:07:43,909][15401] Updated weights for policy 0, policy_version 262390 (0.0036) [2024-06-22 18:07:47,740][15401] Updated weights for policy 0, policy_version 262400 (0.0028) [2024-06-22 18:07:48,389][15132] Fps is (10 sec: 39330.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4299161600. Throughput: 0: 43305.5. Samples: 4299341380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:48,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 18:07:51,783][15401] Updated weights for policy 0, policy_version 262410 (0.0023) [2024-06-22 18:07:53,391][15132] Fps is (10 sec: 42591.7, 60 sec: 42870.4, 300 sec: 42820.3). Total num frames: 4299390976. Throughput: 0: 43108.6. Samples: 4299469520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:53,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 18:07:55,317][15401] Updated weights for policy 0, policy_version 262420 (0.0057) [2024-06-22 18:07:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4299603968. Throughput: 0: 43081.4. Samples: 4299727880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:07:58,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 18:07:59,449][15401] Updated weights for policy 0, policy_version 262430 (0.0041) [2024-06-22 18:08:00,977][15349] Signal inference workers to stop experience collection... (63650 times) [2024-06-22 18:08:00,988][15349] Signal inference workers to resume experience collection... (63650 times) [2024-06-22 18:08:01,021][15401] InferenceWorker_p0-w0: stopping experience collection (63650 times) [2024-06-22 18:08:01,021][15401] InferenceWorker_p0-w0: resuming experience collection (63650 times) [2024-06-22 18:08:02,887][15401] Updated weights for policy 0, policy_version 262440 (0.0038) [2024-06-22 18:08:03,389][15132] Fps is (10 sec: 42605.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4299816960. Throughput: 0: 42876.5. Samples: 4299978360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:08:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 18:08:07,515][15401] Updated weights for policy 0, policy_version 262450 (0.0033) [2024-06-22 18:08:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4300013568. Throughput: 0: 42930.3. Samples: 4300112180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:08:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 18:08:10,302][15401] Updated weights for policy 0, policy_version 262460 (0.0029) [2024-06-22 18:08:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43146.2, 300 sec: 42765.1). Total num frames: 4300242944. Throughput: 0: 42878.6. Samples: 4300368100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-22 18:08:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 18:08:15,097][15401] Updated weights for policy 0, policy_version 262470 (0.0037) [2024-06-22 18:08:17,765][15401] Updated weights for policy 0, policy_version 262480 (0.0036) [2024-06-22 18:08:18,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 4300488704. Throughput: 0: 42775.6. Samples: 4300622100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 18:08:22,570][15401] Updated weights for policy 0, policy_version 262490 (0.0029) [2024-06-22 18:08:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4300652544. Throughput: 0: 42791.9. Samples: 4300755080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:23,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 18:08:25,486][15401] Updated weights for policy 0, policy_version 262500 (0.0043) [2024-06-22 18:08:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4300881920. Throughput: 0: 42815.8. Samples: 4301009960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:28,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 18:08:30,037][15401] Updated weights for policy 0, policy_version 262510 (0.0050) [2024-06-22 18:08:33,167][15401] Updated weights for policy 0, policy_version 262520 (0.0044) [2024-06-22 18:08:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 4301127680. Throughput: 0: 42888.0. Samples: 4301271340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 18:08:37,549][15401] Updated weights for policy 0, policy_version 262530 (0.0033) [2024-06-22 18:08:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42599.9, 300 sec: 42820.5). Total num frames: 4301324288. Throughput: 0: 43005.1. Samples: 4301404680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 18:08:41,009][15401] Updated weights for policy 0, policy_version 262540 (0.0040) [2024-06-22 18:08:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4301537280. Throughput: 0: 42979.9. Samples: 4301661980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 18:08:44,974][15401] Updated weights for policy 0, policy_version 262550 (0.0038) [2024-06-22 18:08:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4301766656. Throughput: 0: 43110.7. Samples: 4301918340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 18:08:48,499][15401] Updated weights for policy 0, policy_version 262560 (0.0040) [2024-06-22 18:08:52,842][15401] Updated weights for policy 0, policy_version 262570 (0.0028) [2024-06-22 18:08:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42599.6, 300 sec: 42709.5). Total num frames: 4301946880. Throughput: 0: 43034.7. Samples: 4302048740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 18:08:56,054][15401] Updated weights for policy 0, policy_version 262580 (0.0047) [2024-06-22 18:08:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4302176256. Throughput: 0: 43079.7. Samples: 4302306680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:08:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 18:09:00,290][15401] Updated weights for policy 0, policy_version 262590 (0.0038) [2024-06-22 18:09:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42821.4). Total num frames: 4302405632. Throughput: 0: 43219.2. Samples: 4302566960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:09:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 18:09:03,606][15401] Updated weights for policy 0, policy_version 262600 (0.0034) [2024-06-22 18:09:07,699][15401] Updated weights for policy 0, policy_version 262610 (0.0038) [2024-06-22 18:09:08,394][15132] Fps is (10 sec: 44216.2, 60 sec: 43414.2, 300 sec: 42819.9). Total num frames: 4302618624. Throughput: 0: 43140.4. Samples: 4302696600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:09:08,395][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 18:09:10,994][15349] Signal inference workers to stop experience collection... (63700 times) [2024-06-22 18:09:10,994][15349] Signal inference workers to resume experience collection... (63700 times) [2024-06-22 18:09:11,014][15401] InferenceWorker_p0-w0: stopping experience collection (63700 times) [2024-06-22 18:09:11,014][15401] InferenceWorker_p0-w0: resuming experience collection (63700 times) [2024-06-22 18:09:11,784][15401] Updated weights for policy 0, policy_version 262620 (0.0028) [2024-06-22 18:09:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4302831616. Throughput: 0: 43039.0. Samples: 4302946720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:09:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 18:09:15,246][15401] Updated weights for policy 0, policy_version 262630 (0.0034) [2024-06-22 18:09:18,390][15132] Fps is (10 sec: 39339.6, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4303011840. Throughput: 0: 43123.4. Samples: 4303211900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:09:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 18:09:19,298][15401] Updated weights for policy 0, policy_version 262640 (0.0031) [2024-06-22 18:09:23,075][15401] Updated weights for policy 0, policy_version 262650 (0.0037) [2024-06-22 18:09:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 4303273984. Throughput: 0: 42900.0. Samples: 4303335180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 18:09:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 18:09:26,919][15401] Updated weights for policy 0, policy_version 262660 (0.0042) [2024-06-22 18:09:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 4303486976. Throughput: 0: 42945.8. Samples: 4303594540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 18:09:30,766][15401] Updated weights for policy 0, policy_version 262670 (0.0023) [2024-06-22 18:09:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4303667200. Throughput: 0: 43146.7. Samples: 4303859940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 18:09:34,420][15401] Updated weights for policy 0, policy_version 262680 (0.0030) [2024-06-22 18:09:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4303896576. Throughput: 0: 42970.1. Samples: 4303982400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 18:09:38,403][15401] Updated weights for policy 0, policy_version 262690 (0.0030) [2024-06-22 18:09:42,004][15401] Updated weights for policy 0, policy_version 262700 (0.0046) [2024-06-22 18:09:43,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.8, 300 sec: 42931.6). Total num frames: 4304142336. Throughput: 0: 42993.4. Samples: 4304241380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 18:09:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262704_4304142336.pth... [2024-06-22 18:09:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262076_4293853184.pth [2024-06-22 18:09:45,949][15401] Updated weights for policy 0, policy_version 262710 (0.0028) [2024-06-22 18:09:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4304322560. Throughput: 0: 42901.2. Samples: 4304497520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:48,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-22 18:09:49,592][15401] Updated weights for policy 0, policy_version 262720 (0.0045) [2024-06-22 18:09:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4304551936. Throughput: 0: 42890.3. Samples: 4304626460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 18:09:53,529][15401] Updated weights for policy 0, policy_version 262730 (0.0026) [2024-06-22 18:09:57,327][15401] Updated weights for policy 0, policy_version 262740 (0.0022) [2024-06-22 18:09:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4304748544. Throughput: 0: 42946.0. Samples: 4304879300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:09:58,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-22 18:10:01,221][15401] Updated weights for policy 0, policy_version 262750 (0.0030) [2024-06-22 18:10:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4304961536. Throughput: 0: 42921.9. Samples: 4305143380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:03,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-22 18:10:05,219][15401] Updated weights for policy 0, policy_version 262760 (0.0035) [2024-06-22 18:10:08,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42601.7, 300 sec: 42765.0). Total num frames: 4305174528. Throughput: 0: 42946.4. Samples: 4305267760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:10:09,055][15401] Updated weights for policy 0, policy_version 262770 (0.0046) [2024-06-22 18:10:13,026][15401] Updated weights for policy 0, policy_version 262780 (0.0040) [2024-06-22 18:10:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 4305387520. Throughput: 0: 42832.2. Samples: 4305521980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:13,396][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 18:10:16,677][15401] Updated weights for policy 0, policy_version 262790 (0.0034) [2024-06-22 18:10:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4305600512. Throughput: 0: 42648.8. Samples: 4305779140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 18:10:20,610][15401] Updated weights for policy 0, policy_version 262800 (0.0048) [2024-06-22 18:10:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4305813504. Throughput: 0: 42725.4. Samples: 4305905040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 18:10:24,125][15401] Updated weights for policy 0, policy_version 262810 (0.0032) [2024-06-22 18:10:28,098][15401] Updated weights for policy 0, policy_version 262820 (0.0036) [2024-06-22 18:10:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 4306042880. Throughput: 0: 42772.9. Samples: 4306166160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 18:10:31,775][15401] Updated weights for policy 0, policy_version 262830 (0.0029) [2024-06-22 18:10:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 4306255872. Throughput: 0: 42657.0. Samples: 4306417080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-22 18:10:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 18:10:35,861][15401] Updated weights for policy 0, policy_version 262840 (0.0036) [2024-06-22 18:10:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4306452480. Throughput: 0: 42717.3. Samples: 4306548740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:10:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 18:10:39,085][15349] Signal inference workers to stop experience collection... (63750 times) [2024-06-22 18:10:39,122][15401] InferenceWorker_p0-w0: stopping experience collection (63750 times) [2024-06-22 18:10:39,145][15349] Signal inference workers to resume experience collection... (63750 times) [2024-06-22 18:10:39,146][15401] InferenceWorker_p0-w0: resuming experience collection (63750 times) [2024-06-22 18:10:39,439][15401] Updated weights for policy 0, policy_version 262850 (0.0040) [2024-06-22 18:10:43,322][15401] Updated weights for policy 0, policy_version 262860 (0.0048) [2024-06-22 18:10:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4306698240. Throughput: 0: 42824.7. Samples: 4306806400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:10:43,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 18:10:47,254][15401] Updated weights for policy 0, policy_version 262870 (0.0032) [2024-06-22 18:10:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4306911232. Throughput: 0: 42597.3. Samples: 4307060260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:10:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 18:10:50,958][15401] Updated weights for policy 0, policy_version 262880 (0.0029) [2024-06-22 18:10:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4307107840. Throughput: 0: 42655.1. Samples: 4307187240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:10:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 18:10:54,955][15401] Updated weights for policy 0, policy_version 262890 (0.0032) [2024-06-22 18:10:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 4307320832. Throughput: 0: 42773.8. Samples: 4307446800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:10:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 18:10:58,678][15401] Updated weights for policy 0, policy_version 262900 (0.0035) [2024-06-22 18:11:02,815][15401] Updated weights for policy 0, policy_version 262910 (0.0025) [2024-06-22 18:11:03,391][15132] Fps is (10 sec: 44228.0, 60 sec: 43143.1, 300 sec: 42931.4). Total num frames: 4307550208. Throughput: 0: 42696.8. Samples: 4307700580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:03,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 18:11:06,307][15401] Updated weights for policy 0, policy_version 262920 (0.0033) [2024-06-22 18:11:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4307746816. Throughput: 0: 42752.5. Samples: 4307828900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 18:11:10,402][15401] Updated weights for policy 0, policy_version 262930 (0.0042) [2024-06-22 18:11:13,390][15132] Fps is (10 sec: 40967.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4307959808. Throughput: 0: 42738.9. Samples: 4308089420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 18:11:14,285][15401] Updated weights for policy 0, policy_version 262940 (0.0031) [2024-06-22 18:11:18,030][15401] Updated weights for policy 0, policy_version 262950 (0.0039) [2024-06-22 18:11:18,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 4308189184. Throughput: 0: 42688.8. Samples: 4308338180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:18,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 18:11:21,849][15401] Updated weights for policy 0, policy_version 262960 (0.0035) [2024-06-22 18:11:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 4308402176. Throughput: 0: 42677.2. Samples: 4308469320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:23,393][15132] Avg episode reward: [(0, '0.182')] [2024-06-22 18:11:25,739][15401] Updated weights for policy 0, policy_version 262970 (0.0039) [2024-06-22 18:11:28,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 4308582400. Throughput: 0: 42618.6. Samples: 4308724240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:28,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 18:11:29,557][15401] Updated weights for policy 0, policy_version 262980 (0.0030) [2024-06-22 18:11:33,357][15401] Updated weights for policy 0, policy_version 262990 (0.0030) [2024-06-22 18:11:33,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4308828160. Throughput: 0: 42706.3. Samples: 4308982040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 18:11:37,177][15401] Updated weights for policy 0, policy_version 263000 (0.0045) [2024-06-22 18:11:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 4309057536. Throughput: 0: 42756.7. Samples: 4309111300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 18:11:41,002][15401] Updated weights for policy 0, policy_version 263010 (0.0036) [2024-06-22 18:11:43,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 4309204992. Throughput: 0: 42513.2. Samples: 4309359900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:11:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 18:11:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263013_4309204992.pth... [2024-06-22 18:11:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262388_4298964992.pth [2024-06-22 18:11:44,865][15401] Updated weights for policy 0, policy_version 263020 (0.0041) [2024-06-22 18:11:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4309450752. Throughput: 0: 42501.5. Samples: 4309613060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:11:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 18:11:48,961][15401] Updated weights for policy 0, policy_version 263030 (0.0041) [2024-06-22 18:11:52,205][15349] Signal inference workers to stop experience collection... (63800 times) [2024-06-22 18:11:52,205][15349] Signal inference workers to resume experience collection... (63800 times) [2024-06-22 18:11:52,223][15401] InferenceWorker_p0-w0: stopping experience collection (63800 times) [2024-06-22 18:11:52,223][15401] InferenceWorker_p0-w0: resuming experience collection (63800 times) [2024-06-22 18:11:52,722][15401] Updated weights for policy 0, policy_version 263040 (0.0036) [2024-06-22 18:11:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4309663744. Throughput: 0: 42648.3. Samples: 4309748080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:11:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 18:11:56,581][15401] Updated weights for policy 0, policy_version 263050 (0.0027) [2024-06-22 18:11:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 4309860352. Throughput: 0: 42391.3. Samples: 4309997020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:11:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 18:12:00,736][15401] Updated weights for policy 0, policy_version 263060 (0.0039) [2024-06-22 18:12:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42053.6, 300 sec: 42709.5). Total num frames: 4310073344. Throughput: 0: 42473.4. Samples: 4310249380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:03,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 18:12:04,279][15401] Updated weights for policy 0, policy_version 263070 (0.0042) [2024-06-22 18:12:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 4310286336. Throughput: 0: 42496.5. Samples: 4310381560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:12:08,418][15401] Updated weights for policy 0, policy_version 263080 (0.0028) [2024-06-22 18:12:11,891][15401] Updated weights for policy 0, policy_version 263090 (0.0028) [2024-06-22 18:12:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4310499328. Throughput: 0: 42358.3. Samples: 4310630360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:12:16,054][15401] Updated weights for policy 0, policy_version 263100 (0.0038) [2024-06-22 18:12:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 4310712320. Throughput: 0: 42323.4. Samples: 4310886600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 18:12:19,518][15401] Updated weights for policy 0, policy_version 263110 (0.0030) [2024-06-22 18:12:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 4310925312. Throughput: 0: 42442.8. Samples: 4311021220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 18:12:23,671][15401] Updated weights for policy 0, policy_version 263120 (0.0034) [2024-06-22 18:12:27,251][15401] Updated weights for policy 0, policy_version 263130 (0.0033) [2024-06-22 18:12:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4311138304. Throughput: 0: 42412.8. Samples: 4311268480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 18:12:31,432][15401] Updated weights for policy 0, policy_version 263140 (0.0032) [2024-06-22 18:12:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 4311367680. Throughput: 0: 42456.7. Samples: 4311523620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 18:12:34,747][15401] Updated weights for policy 0, policy_version 263150 (0.0036) [2024-06-22 18:12:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 4311564288. Throughput: 0: 42524.4. Samples: 4311661680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 18:12:39,001][15401] Updated weights for policy 0, policy_version 263160 (0.0031) [2024-06-22 18:12:42,331][15401] Updated weights for policy 0, policy_version 263170 (0.0034) [2024-06-22 18:12:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4311793664. Throughput: 0: 42536.3. Samples: 4311911260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:43,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 18:12:46,766][15401] Updated weights for policy 0, policy_version 263180 (0.0036) [2024-06-22 18:12:48,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42869.7, 300 sec: 42820.5). Total num frames: 4312023040. Throughput: 0: 42588.4. Samples: 4312165960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:48,393][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 18:12:49,851][15401] Updated weights for policy 0, policy_version 263190 (0.0043) [2024-06-22 18:12:53,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4312186880. Throughput: 0: 42672.3. Samples: 4312301820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:53,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-22 18:12:54,364][15401] Updated weights for policy 0, policy_version 263200 (0.0040) [2024-06-22 18:12:57,757][15401] Updated weights for policy 0, policy_version 263210 (0.0041) [2024-06-22 18:12:58,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4312432640. Throughput: 0: 42644.7. Samples: 4312549380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:12:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 18:13:02,325][15401] Updated weights for policy 0, policy_version 263220 (0.0048) [2024-06-22 18:13:03,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4312662016. Throughput: 0: 42552.5. Samples: 4312801460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 18:13:05,600][15401] Updated weights for policy 0, policy_version 263230 (0.0038) [2024-06-22 18:13:07,285][15349] Signal inference workers to stop experience collection... (63850 times) [2024-06-22 18:13:07,292][15349] Signal inference workers to resume experience collection... (63850 times) [2024-06-22 18:13:07,296][15401] InferenceWorker_p0-w0: stopping experience collection (63850 times) [2024-06-22 18:13:07,320][15401] InferenceWorker_p0-w0: resuming experience collection (63850 times) [2024-06-22 18:13:08,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 4312842240. Throughput: 0: 42436.9. Samples: 4312930980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:08,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 18:13:09,964][15401] Updated weights for policy 0, policy_version 263240 (0.0034) [2024-06-22 18:13:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4313071616. Throughput: 0: 42566.7. Samples: 4313183980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 18:13:13,541][15401] Updated weights for policy 0, policy_version 263250 (0.0046) [2024-06-22 18:13:17,642][15401] Updated weights for policy 0, policy_version 263260 (0.0034) [2024-06-22 18:13:18,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4313284608. Throughput: 0: 42718.7. Samples: 4313445960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 18:13:21,196][15401] Updated weights for policy 0, policy_version 263270 (0.0043) [2024-06-22 18:13:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4313464832. Throughput: 0: 42401.9. Samples: 4313569760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 18:13:25,551][15401] Updated weights for policy 0, policy_version 263280 (0.0033) [2024-06-22 18:13:28,390][15132] Fps is (10 sec: 42594.4, 60 sec: 42870.9, 300 sec: 42653.8). Total num frames: 4313710592. Throughput: 0: 42540.1. Samples: 4313825500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:28,391][15132] Avg episode reward: [(0, '0.225')] [2024-06-22 18:13:28,870][15401] Updated weights for policy 0, policy_version 263290 (0.0034) [2024-06-22 18:13:33,211][15401] Updated weights for policy 0, policy_version 263300 (0.0036) [2024-06-22 18:13:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4313907200. Throughput: 0: 42561.4. Samples: 4314081120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 18:13:36,619][15401] Updated weights for policy 0, policy_version 263310 (0.0034) [2024-06-22 18:13:38,389][15132] Fps is (10 sec: 40964.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4314120192. Throughput: 0: 42389.5. Samples: 4314209340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 18:13:40,886][15401] Updated weights for policy 0, policy_version 263320 (0.0033) [2024-06-22 18:13:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 4314349568. Throughput: 0: 42481.7. Samples: 4314461060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 18:13:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263327_4314349568.pth... [2024-06-22 18:13:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000262704_4304142336.pth [2024-06-22 18:13:44,612][15401] Updated weights for policy 0, policy_version 263330 (0.0041) [2024-06-22 18:13:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 4314546176. Throughput: 0: 42803.1. Samples: 4314727600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 18:13:48,595][15401] Updated weights for policy 0, policy_version 263340 (0.0044) [2024-06-22 18:13:52,051][15401] Updated weights for policy 0, policy_version 263350 (0.0025) [2024-06-22 18:13:53,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4314775552. Throughput: 0: 42719.1. Samples: 4314853240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:53,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 18:13:56,229][15401] Updated weights for policy 0, policy_version 263360 (0.0032) [2024-06-22 18:13:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4314988544. Throughput: 0: 42721.9. Samples: 4315106460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:13:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:13:59,664][15401] Updated weights for policy 0, policy_version 263370 (0.0036) [2024-06-22 18:14:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42599.1). Total num frames: 4315185152. Throughput: 0: 42879.1. Samples: 4315375520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:14:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 18:14:03,905][15401] Updated weights for policy 0, policy_version 263380 (0.0029) [2024-06-22 18:14:07,361][15401] Updated weights for policy 0, policy_version 263390 (0.0031) [2024-06-22 18:14:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 4315430912. Throughput: 0: 42889.7. Samples: 4315499800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-22 18:14:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 18:14:11,652][15401] Updated weights for policy 0, policy_version 263400 (0.0049) [2024-06-22 18:14:11,913][15349] Signal inference workers to stop experience collection... (63900 times) [2024-06-22 18:14:11,937][15401] InferenceWorker_p0-w0: stopping experience collection (63900 times) [2024-06-22 18:14:11,971][15349] Signal inference workers to resume experience collection... (63900 times) [2024-06-22 18:14:11,975][15401] InferenceWorker_p0-w0: resuming experience collection (63900 times) [2024-06-22 18:14:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4315643904. Throughput: 0: 42851.6. Samples: 4315753780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 18:14:15,029][15401] Updated weights for policy 0, policy_version 263410 (0.0037) [2024-06-22 18:14:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4315824128. Throughput: 0: 43037.6. Samples: 4316017820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 18:14:19,175][15401] Updated weights for policy 0, policy_version 263420 (0.0041) [2024-06-22 18:14:22,764][15401] Updated weights for policy 0, policy_version 263430 (0.0033) [2024-06-22 18:14:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 4316069888. Throughput: 0: 42941.8. Samples: 4316141720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 18:14:26,734][15401] Updated weights for policy 0, policy_version 263440 (0.0037) [2024-06-22 18:14:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43145.2, 300 sec: 42820.5). Total num frames: 4316299264. Throughput: 0: 42949.9. Samples: 4316393800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 18:14:30,422][15401] Updated weights for policy 0, policy_version 263450 (0.0041) [2024-06-22 18:14:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4316463104. Throughput: 0: 42753.4. Samples: 4316651500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 18:14:34,442][15401] Updated weights for policy 0, policy_version 263460 (0.0034) [2024-06-22 18:14:38,008][15401] Updated weights for policy 0, policy_version 263470 (0.0034) [2024-06-22 18:14:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 4316692480. Throughput: 0: 42690.7. Samples: 4316774320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 18:14:41,929][15401] Updated weights for policy 0, policy_version 263480 (0.0027) [2024-06-22 18:14:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4316938240. Throughput: 0: 42927.9. Samples: 4317038220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:43,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 18:14:45,554][15401] Updated weights for policy 0, policy_version 263490 (0.0030) [2024-06-22 18:14:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4317118464. Throughput: 0: 42513.4. Samples: 4317288620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 18:14:49,681][15401] Updated weights for policy 0, policy_version 263500 (0.0036) [2024-06-22 18:14:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4317331456. Throughput: 0: 42571.2. Samples: 4317415500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 18:14:53,531][15401] Updated weights for policy 0, policy_version 263510 (0.0043) [2024-06-22 18:14:57,118][15401] Updated weights for policy 0, policy_version 263520 (0.0038) [2024-06-22 18:14:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4317560832. Throughput: 0: 42691.5. Samples: 4317674900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:14:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 18:15:01,099][15401] Updated weights for policy 0, policy_version 263530 (0.0029) [2024-06-22 18:15:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4317773824. Throughput: 0: 42464.9. Samples: 4317928740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:15:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 18:15:04,634][15401] Updated weights for policy 0, policy_version 263540 (0.0046) [2024-06-22 18:15:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4317970432. Throughput: 0: 42567.1. Samples: 4318057240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:15:08,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 18:15:08,813][15401] Updated weights for policy 0, policy_version 263550 (0.0044) [2024-06-22 18:15:12,320][15401] Updated weights for policy 0, policy_version 263560 (0.0032) [2024-06-22 18:15:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4318183424. Throughput: 0: 42629.8. Samples: 4318312140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:15:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 18:15:16,530][15401] Updated weights for policy 0, policy_version 263570 (0.0029) [2024-06-22 18:15:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 4318412800. Throughput: 0: 42654.3. Samples: 4318570940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 18:15:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 18:15:19,978][15401] Updated weights for policy 0, policy_version 263580 (0.0033) [2024-06-22 18:15:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4318625792. Throughput: 0: 42780.4. Samples: 4318699440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 18:15:24,048][15401] Updated weights for policy 0, policy_version 263590 (0.0034) [2024-06-22 18:15:27,472][15401] Updated weights for policy 0, policy_version 263600 (0.0033) [2024-06-22 18:15:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4318838784. Throughput: 0: 42574.8. Samples: 4318954080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:28,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 18:15:29,127][15349] Signal inference workers to stop experience collection... (63950 times) [2024-06-22 18:15:29,127][15349] Signal inference workers to resume experience collection... (63950 times) [2024-06-22 18:15:29,169][15401] InferenceWorker_p0-w0: stopping experience collection (63950 times) [2024-06-22 18:15:29,169][15401] InferenceWorker_p0-w0: resuming experience collection (63950 times) [2024-06-22 18:15:31,693][15401] Updated weights for policy 0, policy_version 263610 (0.0057) [2024-06-22 18:15:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4319035392. Throughput: 0: 42724.9. Samples: 4319211240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 18:15:35,197][15401] Updated weights for policy 0, policy_version 263620 (0.0045) [2024-06-22 18:15:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4319248384. Throughput: 0: 42754.7. Samples: 4319339460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 18:15:39,431][15401] Updated weights for policy 0, policy_version 263630 (0.0023) [2024-06-22 18:15:43,165][15401] Updated weights for policy 0, policy_version 263640 (0.0036) [2024-06-22 18:15:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4319477760. Throughput: 0: 42601.8. Samples: 4319591980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 18:15:43,531][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263641_4319494144.pth... [2024-06-22 18:15:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263013_4309204992.pth [2024-06-22 18:15:46,971][15401] Updated weights for policy 0, policy_version 263650 (0.0028) [2024-06-22 18:15:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4319690752. Throughput: 0: 42849.8. Samples: 4319856980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:15:50,586][15401] Updated weights for policy 0, policy_version 263660 (0.0023) [2024-06-22 18:15:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4319887360. Throughput: 0: 42793.6. Samples: 4319982960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 18:15:54,603][15401] Updated weights for policy 0, policy_version 263670 (0.0046) [2024-06-22 18:15:58,168][15401] Updated weights for policy 0, policy_version 263680 (0.0029) [2024-06-22 18:15:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 4320133120. Throughput: 0: 42800.8. Samples: 4320238180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:15:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 18:16:02,471][15401] Updated weights for policy 0, policy_version 263690 (0.0043) [2024-06-22 18:16:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4320313344. Throughput: 0: 42807.8. Samples: 4320497300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 18:16:05,667][15401] Updated weights for policy 0, policy_version 263700 (0.0028) [2024-06-22 18:16:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4320542720. Throughput: 0: 42735.9. Samples: 4320622560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:08,396][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 18:16:10,022][15401] Updated weights for policy 0, policy_version 263710 (0.0035) [2024-06-22 18:16:13,203][15401] Updated weights for policy 0, policy_version 263720 (0.0037) [2024-06-22 18:16:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 4320788480. Throughput: 0: 42811.5. Samples: 4320880600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:13,390][15132] Avg episode reward: [(0, '0.153')] [2024-06-22 18:16:17,564][15401] Updated weights for policy 0, policy_version 263730 (0.0034) [2024-06-22 18:16:18,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 4320968704. Throughput: 0: 42835.0. Samples: 4321138820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 18:16:21,198][15401] Updated weights for policy 0, policy_version 263740 (0.0031) [2024-06-22 18:16:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4321181696. Throughput: 0: 42823.4. Samples: 4321266520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:23,390][15132] Avg episode reward: [(0, '0.143')] [2024-06-22 18:16:25,021][15401] Updated weights for policy 0, policy_version 263750 (0.0039) [2024-06-22 18:16:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4321411072. Throughput: 0: 42959.7. Samples: 4321525160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:16:28,624][15401] Updated weights for policy 0, policy_version 263760 (0.0022) [2024-06-22 18:16:33,314][15401] Updated weights for policy 0, policy_version 263770 (0.0022) [2024-06-22 18:16:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4321607680. Throughput: 0: 42948.0. Samples: 4321789640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 18:16:33,396][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 18:16:34,784][15349] Signal inference workers to stop experience collection... (64000 times) [2024-06-22 18:16:34,785][15349] Signal inference workers to resume experience collection... (64000 times) [2024-06-22 18:16:34,817][15401] InferenceWorker_p0-w0: stopping experience collection (64000 times) [2024-06-22 18:16:34,817][15401] InferenceWorker_p0-w0: resuming experience collection (64000 times) [2024-06-22 18:16:36,075][15401] Updated weights for policy 0, policy_version 263780 (0.0040) [2024-06-22 18:16:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4321820672. Throughput: 0: 42760.1. Samples: 4321907160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:16:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 18:16:40,878][15401] Updated weights for policy 0, policy_version 263790 (0.0041) [2024-06-22 18:16:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4322066432. Throughput: 0: 42882.4. Samples: 4322167880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:16:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 18:16:44,014][15401] Updated weights for policy 0, policy_version 263800 (0.0033) [2024-06-22 18:16:48,285][15401] Updated weights for policy 0, policy_version 263810 (0.0027) [2024-06-22 18:16:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4322263040. Throughput: 0: 42927.6. Samples: 4322429040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:16:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 18:16:51,401][15401] Updated weights for policy 0, policy_version 263820 (0.0029) [2024-06-22 18:16:53,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 4322459648. Throughput: 0: 42965.4. Samples: 4322556100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:16:53,393][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 18:16:55,863][15401] Updated weights for policy 0, policy_version 263830 (0.0037) [2024-06-22 18:16:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4322721792. Throughput: 0: 42903.4. Samples: 4322811260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:16:58,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-22 18:16:59,021][15401] Updated weights for policy 0, policy_version 263840 (0.0033) [2024-06-22 18:17:03,317][15401] Updated weights for policy 0, policy_version 263850 (0.0033) [2024-06-22 18:17:03,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 4322918400. Throughput: 0: 42901.8. Samples: 4323069400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 18:17:06,559][15401] Updated weights for policy 0, policy_version 263860 (0.0029) [2024-06-22 18:17:08,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4323115008. Throughput: 0: 42889.9. Samples: 4323196560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 18:17:10,924][15401] Updated weights for policy 0, policy_version 263870 (0.0037) [2024-06-22 18:17:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4323344384. Throughput: 0: 42879.8. Samples: 4323454760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 18:17:14,322][15401] Updated weights for policy 0, policy_version 263880 (0.0050) [2024-06-22 18:17:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4323540992. Throughput: 0: 42668.5. Samples: 4323709720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:18,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 18:17:18,855][15401] Updated weights for policy 0, policy_version 263890 (0.0033) [2024-06-22 18:17:22,293][15401] Updated weights for policy 0, policy_version 263900 (0.0032) [2024-06-22 18:17:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4323753984. Throughput: 0: 42890.2. Samples: 4323837220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:23,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-22 18:17:26,367][15401] Updated weights for policy 0, policy_version 263910 (0.0031) [2024-06-22 18:17:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4323966976. Throughput: 0: 42884.4. Samples: 4324097680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:28,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 18:17:30,675][15401] Updated weights for policy 0, policy_version 263920 (0.0022) [2024-06-22 18:17:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4324196352. Throughput: 0: 42754.8. Samples: 4324353000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:33,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 18:17:34,322][15401] Updated weights for policy 0, policy_version 263930 (0.0033) [2024-06-22 18:17:38,291][15401] Updated weights for policy 0, policy_version 263940 (0.0040) [2024-06-22 18:17:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4324392960. Throughput: 0: 42769.4. Samples: 4324480620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:38,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 18:17:41,882][15401] Updated weights for policy 0, policy_version 263950 (0.0033) [2024-06-22 18:17:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 4324622336. Throughput: 0: 42924.9. Samples: 4324742880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-22 18:17:43,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 18:17:43,533][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263955_4324638720.pth... [2024-06-22 18:17:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263327_4314349568.pth [2024-06-22 18:17:45,763][15401] Updated weights for policy 0, policy_version 263960 (0.0030) [2024-06-22 18:17:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4324835328. Throughput: 0: 42848.9. Samples: 4324997600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:17:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 18:17:49,439][15401] Updated weights for policy 0, policy_version 263970 (0.0038) [2024-06-22 18:17:53,288][15401] Updated weights for policy 0, policy_version 263980 (0.0035) [2024-06-22 18:17:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 4325048320. Throughput: 0: 42813.8. Samples: 4325123180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:17:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 18:17:56,983][15401] Updated weights for policy 0, policy_version 263990 (0.0045) [2024-06-22 18:17:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4325261312. Throughput: 0: 42852.5. Samples: 4325383120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:17:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 18:18:00,994][15401] Updated weights for policy 0, policy_version 264000 (0.0027) [2024-06-22 18:18:01,605][15349] Signal inference workers to stop experience collection... (64050 times) [2024-06-22 18:18:01,606][15349] Signal inference workers to resume experience collection... (64050 times) [2024-06-22 18:18:01,637][15401] InferenceWorker_p0-w0: stopping experience collection (64050 times) [2024-06-22 18:18:01,638][15401] InferenceWorker_p0-w0: resuming experience collection (64050 times) [2024-06-22 18:18:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42820.5). Total num frames: 4325474304. Throughput: 0: 42946.6. Samples: 4325642420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:03,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 18:18:04,557][15401] Updated weights for policy 0, policy_version 264010 (0.0042) [2024-06-22 18:18:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4325687296. Throughput: 0: 42956.1. Samples: 4325770240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 18:18:08,415][15401] Updated weights for policy 0, policy_version 264020 (0.0036) [2024-06-22 18:18:12,156][15401] Updated weights for policy 0, policy_version 264030 (0.0036) [2024-06-22 18:18:13,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4325916672. Throughput: 0: 42971.1. Samples: 4326031380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 18:18:16,207][15401] Updated weights for policy 0, policy_version 264040 (0.0034) [2024-06-22 18:18:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4326113280. Throughput: 0: 43072.7. Samples: 4326291280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 18:18:19,721][15401] Updated weights for policy 0, policy_version 264050 (0.0031) [2024-06-22 18:18:23,391][15132] Fps is (10 sec: 40953.1, 60 sec: 42870.3, 300 sec: 42764.9). Total num frames: 4326326272. Throughput: 0: 43013.1. Samples: 4326416280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:23,400][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 18:18:23,987][15401] Updated weights for policy 0, policy_version 264060 (0.0026) [2024-06-22 18:18:27,411][15401] Updated weights for policy 0, policy_version 264070 (0.0040) [2024-06-22 18:18:28,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4326572032. Throughput: 0: 42978.7. Samples: 4326676920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 18:18:31,464][15401] Updated weights for policy 0, policy_version 264080 (0.0044) [2024-06-22 18:18:33,389][15132] Fps is (10 sec: 42605.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4326752256. Throughput: 0: 43184.1. Samples: 4326940880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 18:18:34,950][15401] Updated weights for policy 0, policy_version 264090 (0.0050) [2024-06-22 18:18:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4326981632. Throughput: 0: 43038.6. Samples: 4327059920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:38,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 18:18:38,920][15401] Updated weights for policy 0, policy_version 264100 (0.0027) [2024-06-22 18:18:42,704][15401] Updated weights for policy 0, policy_version 264110 (0.0034) [2024-06-22 18:18:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4327211008. Throughput: 0: 43027.1. Samples: 4327319340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:43,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 18:18:46,923][15401] Updated weights for policy 0, policy_version 264120 (0.0033) [2024-06-22 18:18:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4327391232. Throughput: 0: 43065.4. Samples: 4327580260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 18:18:50,288][15401] Updated weights for policy 0, policy_version 264130 (0.0041) [2024-06-22 18:18:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4327636992. Throughput: 0: 42873.2. Samples: 4327699540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:18:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:18:54,631][15401] Updated weights for policy 0, policy_version 264140 (0.0040) [2024-06-22 18:18:58,077][15401] Updated weights for policy 0, policy_version 264150 (0.0031) [2024-06-22 18:18:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4327833600. Throughput: 0: 42876.0. Samples: 4327960800. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:18:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 18:19:02,162][15401] Updated weights for policy 0, policy_version 264160 (0.0030) [2024-06-22 18:19:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 4328013824. Throughput: 0: 42840.6. Samples: 4328219100. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 18:19:05,577][15401] Updated weights for policy 0, policy_version 264170 (0.0033) [2024-06-22 18:19:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4328292352. Throughput: 0: 42743.8. Samples: 4328339680. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 18:19:09,727][15401] Updated weights for policy 0, policy_version 264180 (0.0026) [2024-06-22 18:19:13,291][15401] Updated weights for policy 0, policy_version 264190 (0.0039) [2024-06-22 18:19:13,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4328488960. Throughput: 0: 42771.9. Samples: 4328601660. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 18:19:17,226][15401] Updated weights for policy 0, policy_version 264200 (0.0030) [2024-06-22 18:19:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4328669184. Throughput: 0: 42632.8. Samples: 4328859360. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 18:19:20,992][15401] Updated weights for policy 0, policy_version 264210 (0.0040) [2024-06-22 18:19:23,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43145.8, 300 sec: 42765.0). Total num frames: 4328914944. Throughput: 0: 42738.4. Samples: 4328983140. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 18:19:25,046][15401] Updated weights for policy 0, policy_version 264220 (0.0048) [2024-06-22 18:19:28,381][15349] Signal inference workers to stop experience collection... (64100 times) [2024-06-22 18:19:28,384][15349] Signal inference workers to resume experience collection... (64100 times) [2024-06-22 18:19:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4329127936. Throughput: 0: 42722.8. Samples: 4329241860. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:28,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 18:19:28,424][15401] InferenceWorker_p0-w0: stopping experience collection (64100 times) [2024-06-22 18:19:28,424][15401] InferenceWorker_p0-w0: resuming experience collection (64100 times) [2024-06-22 18:19:28,528][15401] Updated weights for policy 0, policy_version 264230 (0.0038) [2024-06-22 18:19:32,719][15401] Updated weights for policy 0, policy_version 264240 (0.0030) [2024-06-22 18:19:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4329324544. Throughput: 0: 42654.7. Samples: 4329499720. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:33,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 18:19:36,294][15401] Updated weights for policy 0, policy_version 264250 (0.0022) [2024-06-22 18:19:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4329553920. Throughput: 0: 42748.0. Samples: 4329623200. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 18:19:40,264][15401] Updated weights for policy 0, policy_version 264260 (0.0042) [2024-06-22 18:19:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4329766912. Throughput: 0: 42787.9. Samples: 4329886260. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 18:19:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264269_4329783296.pth... [2024-06-22 18:19:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263641_4319494144.pth [2024-06-22 18:19:43,838][15401] Updated weights for policy 0, policy_version 264270 (0.0038) [2024-06-22 18:19:47,974][15401] Updated weights for policy 0, policy_version 264280 (0.0042) [2024-06-22 18:19:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4329979904. Throughput: 0: 42644.4. Samples: 4330138100. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 18:19:51,709][15401] Updated weights for policy 0, policy_version 264290 (0.0041) [2024-06-22 18:19:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4330192896. Throughput: 0: 42739.0. Samples: 4330262940. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 18:19:55,595][15401] Updated weights for policy 0, policy_version 264300 (0.0033) [2024-06-22 18:19:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4330405888. Throughput: 0: 42734.7. Samples: 4330524720. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:19:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 18:19:59,148][15401] Updated weights for policy 0, policy_version 264310 (0.0033) [2024-06-22 18:20:03,244][15401] Updated weights for policy 0, policy_version 264320 (0.0046) [2024-06-22 18:20:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4330618880. Throughput: 0: 42632.1. Samples: 4330777800. Policy #0 lag: (min: 2.0, avg: 9.5, max: 23.0) [2024-06-22 18:20:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 18:20:06,738][15401] Updated weights for policy 0, policy_version 264330 (0.0036) [2024-06-22 18:20:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4330831872. Throughput: 0: 42584.2. Samples: 4330899440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 18:20:10,936][15401] Updated weights for policy 0, policy_version 264340 (0.0027) [2024-06-22 18:20:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4331028480. Throughput: 0: 42668.7. Samples: 4331161960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:13,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 18:20:14,233][15401] Updated weights for policy 0, policy_version 264350 (0.0031) [2024-06-22 18:20:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4331241472. Throughput: 0: 42536.0. Samples: 4331413840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 18:20:19,093][15401] Updated weights for policy 0, policy_version 264360 (0.0041) [2024-06-22 18:20:21,913][15401] Updated weights for policy 0, policy_version 264370 (0.0033) [2024-06-22 18:20:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4331470848. Throughput: 0: 42539.5. Samples: 4331537480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 18:20:26,912][15401] Updated weights for policy 0, policy_version 264380 (0.0041) [2024-06-22 18:20:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 4331667456. Throughput: 0: 42512.3. Samples: 4331799320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 18:20:29,548][15401] Updated weights for policy 0, policy_version 264390 (0.0032) [2024-06-22 18:20:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4331864064. Throughput: 0: 42600.5. Samples: 4332055120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 18:20:34,471][15401] Updated weights for policy 0, policy_version 264400 (0.0033) [2024-06-22 18:20:37,275][15401] Updated weights for policy 0, policy_version 264410 (0.0026) [2024-06-22 18:20:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4332109824. Throughput: 0: 42666.2. Samples: 4332182920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 18:20:41,923][15401] Updated weights for policy 0, policy_version 264420 (0.0031) [2024-06-22 18:20:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4332306432. Throughput: 0: 42554.6. Samples: 4332439680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 18:20:45,100][15401] Updated weights for policy 0, policy_version 264430 (0.0033) [2024-06-22 18:20:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4332519424. Throughput: 0: 42655.9. Samples: 4332697320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 18:20:49,698][15401] Updated weights for policy 0, policy_version 264440 (0.0029) [2024-06-22 18:20:52,550][15401] Updated weights for policy 0, policy_version 264450 (0.0038) [2024-06-22 18:20:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4332748800. Throughput: 0: 42831.2. Samples: 4332826840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 18:20:57,481][15401] Updated weights for policy 0, policy_version 264460 (0.0034) [2024-06-22 18:20:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4332945408. Throughput: 0: 42750.8. Samples: 4333085740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:20:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 18:20:59,761][15349] Signal inference workers to stop experience collection... (64150 times) [2024-06-22 18:20:59,813][15401] InferenceWorker_p0-w0: stopping experience collection (64150 times) [2024-06-22 18:20:59,822][15349] Signal inference workers to resume experience collection... (64150 times) [2024-06-22 18:20:59,837][15401] InferenceWorker_p0-w0: resuming experience collection (64150 times) [2024-06-22 18:21:00,337][15401] Updated weights for policy 0, policy_version 264470 (0.0045) [2024-06-22 18:21:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4333174784. Throughput: 0: 42700.5. Samples: 4333335360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:21:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 18:21:05,099][15401] Updated weights for policy 0, policy_version 264480 (0.0045) [2024-06-22 18:21:08,171][15401] Updated weights for policy 0, policy_version 264490 (0.0048) [2024-06-22 18:21:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4333404160. Throughput: 0: 42818.7. Samples: 4333464320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:21:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 18:21:12,648][15401] Updated weights for policy 0, policy_version 264500 (0.0030) [2024-06-22 18:21:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4333584384. Throughput: 0: 42811.2. Samples: 4333725820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:21:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:21:15,647][15401] Updated weights for policy 0, policy_version 264510 (0.0035) [2024-06-22 18:21:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4333830144. Throughput: 0: 42751.1. Samples: 4333978920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 18:21:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 18:21:20,422][15401] Updated weights for policy 0, policy_version 264520 (0.0029) [2024-06-22 18:21:23,361][15401] Updated weights for policy 0, policy_version 264530 (0.0042) [2024-06-22 18:21:23,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4334059520. Throughput: 0: 42914.3. Samples: 4334114060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 18:21:28,031][15401] Updated weights for policy 0, policy_version 264540 (0.0038) [2024-06-22 18:21:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 4334239744. Throughput: 0: 42998.3. Samples: 4334374700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:28,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 18:21:30,980][15401] Updated weights for policy 0, policy_version 264550 (0.0046) [2024-06-22 18:21:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4334469120. Throughput: 0: 42826.4. Samples: 4334624500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 18:21:35,777][15401] Updated weights for policy 0, policy_version 264560 (0.0034) [2024-06-22 18:21:38,389][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4334682112. Throughput: 0: 42761.7. Samples: 4334751120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 18:21:39,036][15401] Updated weights for policy 0, policy_version 264570 (0.0051) [2024-06-22 18:21:43,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 4334862336. Throughput: 0: 42719.2. Samples: 4335008380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:43,397][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 18:21:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264580_4334878720.pth... [2024-06-22 18:21:43,435][15401] Updated weights for policy 0, policy_version 264580 (0.0043) [2024-06-22 18:21:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000263955_4324638720.pth [2024-06-22 18:21:46,760][15401] Updated weights for policy 0, policy_version 264590 (0.0029) [2024-06-22 18:21:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 4335108096. Throughput: 0: 42853.7. Samples: 4335263780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 18:21:50,951][15401] Updated weights for policy 0, policy_version 264600 (0.0022) [2024-06-22 18:21:53,390][15132] Fps is (10 sec: 47543.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4335337472. Throughput: 0: 42952.9. Samples: 4335397200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 18:21:54,045][15401] Updated weights for policy 0, policy_version 264610 (0.0028) [2024-06-22 18:21:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4335517696. Throughput: 0: 42803.1. Samples: 4335651960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:21:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 18:21:58,555][15401] Updated weights for policy 0, policy_version 264620 (0.0044) [2024-06-22 18:22:01,659][15401] Updated weights for policy 0, policy_version 264630 (0.0024) [2024-06-22 18:22:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4335763456. Throughput: 0: 42858.7. Samples: 4335907560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 18:22:06,269][15401] Updated weights for policy 0, policy_version 264640 (0.0029) [2024-06-22 18:22:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4335960064. Throughput: 0: 42860.1. Samples: 4336042760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 18:22:09,338][15401] Updated weights for policy 0, policy_version 264650 (0.0043) [2024-06-22 18:22:13,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4336156672. Throughput: 0: 42784.3. Samples: 4336300000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:13,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:22:13,717][15401] Updated weights for policy 0, policy_version 264660 (0.0030) [2024-06-22 18:22:16,942][15401] Updated weights for policy 0, policy_version 264670 (0.0028) [2024-06-22 18:22:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4336386048. Throughput: 0: 42829.4. Samples: 4336551820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 18:22:21,397][15401] Updated weights for policy 0, policy_version 264680 (0.0035) [2024-06-22 18:22:23,390][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4336615424. Throughput: 0: 42998.2. Samples: 4336686040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 18:22:24,588][15401] Updated weights for policy 0, policy_version 264690 (0.0042) [2024-06-22 18:22:28,327][15349] Signal inference workers to stop experience collection... (64200 times) [2024-06-22 18:22:28,383][15401] InferenceWorker_p0-w0: stopping experience collection (64200 times) [2024-06-22 18:22:28,389][15349] Signal inference workers to resume experience collection... (64200 times) [2024-06-22 18:22:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4336795648. Throughput: 0: 42861.8. Samples: 4336936880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 18:22:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 18:22:28,395][15401] InferenceWorker_p0-w0: resuming experience collection (64200 times) [2024-06-22 18:22:28,866][15401] Updated weights for policy 0, policy_version 264700 (0.0038) [2024-06-22 18:22:32,123][15401] Updated weights for policy 0, policy_version 264710 (0.0024) [2024-06-22 18:22:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4337041408. Throughput: 0: 43055.2. Samples: 4337201260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 18:22:36,446][15401] Updated weights for policy 0, policy_version 264720 (0.0030) [2024-06-22 18:22:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4337254400. Throughput: 0: 42989.1. Samples: 4337331700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 18:22:39,959][15401] Updated weights for policy 0, policy_version 264730 (0.0033) [2024-06-22 18:22:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 4337451008. Throughput: 0: 42911.7. Samples: 4337582980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 18:22:44,243][15401] Updated weights for policy 0, policy_version 264740 (0.0037) [2024-06-22 18:22:47,407][15401] Updated weights for policy 0, policy_version 264750 (0.0038) [2024-06-22 18:22:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4337680384. Throughput: 0: 43002.6. Samples: 4337842680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 18:22:51,798][15401] Updated weights for policy 0, policy_version 264760 (0.0033) [2024-06-22 18:22:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4337893376. Throughput: 0: 42935.1. Samples: 4337974840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 18:22:55,029][15401] Updated weights for policy 0, policy_version 264770 (0.0034) [2024-06-22 18:22:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 4338089984. Throughput: 0: 42793.5. Samples: 4338225600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:22:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 18:22:59,778][15401] Updated weights for policy 0, policy_version 264780 (0.0035) [2024-06-22 18:23:03,051][15401] Updated weights for policy 0, policy_version 264790 (0.0030) [2024-06-22 18:23:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4338319360. Throughput: 0: 42838.2. Samples: 4338479540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 18:23:07,183][15401] Updated weights for policy 0, policy_version 264800 (0.0026) [2024-06-22 18:23:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4338532352. Throughput: 0: 42792.5. Samples: 4338611700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 18:23:10,631][15401] Updated weights for policy 0, policy_version 264810 (0.0037) [2024-06-22 18:23:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 4338745344. Throughput: 0: 42826.9. Samples: 4338864100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 18:23:15,109][15401] Updated weights for policy 0, policy_version 264820 (0.0026) [2024-06-22 18:23:18,120][15401] Updated weights for policy 0, policy_version 264830 (0.0035) [2024-06-22 18:23:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 4338974720. Throughput: 0: 42672.4. Samples: 4339121520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 18:23:22,573][15401] Updated weights for policy 0, policy_version 264840 (0.0045) [2024-06-22 18:23:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4339171328. Throughput: 0: 42737.3. Samples: 4339254880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 18:23:25,929][15401] Updated weights for policy 0, policy_version 264850 (0.0037) [2024-06-22 18:23:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4339400704. Throughput: 0: 42838.6. Samples: 4339510720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 18:23:30,090][15401] Updated weights for policy 0, policy_version 264860 (0.0036) [2024-06-22 18:23:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4339597312. Throughput: 0: 42822.6. Samples: 4339769700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 18:23:33,599][15401] Updated weights for policy 0, policy_version 264870 (0.0040) [2024-06-22 18:23:37,444][15401] Updated weights for policy 0, policy_version 264880 (0.0031) [2024-06-22 18:23:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4339826688. Throughput: 0: 42660.5. Samples: 4339894560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-22 18:23:38,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 18:23:41,251][15401] Updated weights for policy 0, policy_version 264890 (0.0030) [2024-06-22 18:23:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4340023296. Throughput: 0: 42843.3. Samples: 4340153560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:23:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 18:23:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264895_4340039680.pth... [2024-06-22 18:23:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264269_4329783296.pth [2024-06-22 18:23:45,031][15401] Updated weights for policy 0, policy_version 264900 (0.0033) [2024-06-22 18:23:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4340219904. Throughput: 0: 43044.4. Samples: 4340416540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:23:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 18:23:48,576][15349] Signal inference workers to stop experience collection... (64250 times) [2024-06-22 18:23:48,578][15349] Signal inference workers to resume experience collection... (64250 times) [2024-06-22 18:23:48,624][15401] InferenceWorker_p0-w0: stopping experience collection (64250 times) [2024-06-22 18:23:48,624][15401] InferenceWorker_p0-w0: resuming experience collection (64250 times) [2024-06-22 18:23:49,017][15401] Updated weights for policy 0, policy_version 264910 (0.0041) [2024-06-22 18:23:52,762][15401] Updated weights for policy 0, policy_version 264920 (0.0027) [2024-06-22 18:23:53,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4340465664. Throughput: 0: 42807.7. Samples: 4340538040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:23:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 18:23:56,703][15401] Updated weights for policy 0, policy_version 264930 (0.0043) [2024-06-22 18:23:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4340678656. Throughput: 0: 42827.2. Samples: 4340791320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:23:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 18:24:00,454][15401] Updated weights for policy 0, policy_version 264940 (0.0023) [2024-06-22 18:24:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4340858880. Throughput: 0: 42955.2. Samples: 4341054500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 18:24:04,225][15401] Updated weights for policy 0, policy_version 264950 (0.0037) [2024-06-22 18:24:08,154][15401] Updated weights for policy 0, policy_version 264960 (0.0022) [2024-06-22 18:24:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 4341104640. Throughput: 0: 42634.2. Samples: 4341173420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 18:24:11,742][15401] Updated weights for policy 0, policy_version 264970 (0.0029) [2024-06-22 18:24:13,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4341317632. Throughput: 0: 42659.9. Samples: 4341430420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:13,390][15132] Avg episode reward: [(0, '0.051')] [2024-06-22 18:24:15,658][15401] Updated weights for policy 0, policy_version 264980 (0.0025) [2024-06-22 18:24:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4341514240. Throughput: 0: 42894.7. Samples: 4341699960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:18,390][15132] Avg episode reward: [(0, '0.125')] [2024-06-22 18:24:19,471][15401] Updated weights for policy 0, policy_version 264990 (0.0028) [2024-06-22 18:24:23,321][15401] Updated weights for policy 0, policy_version 265000 (0.0040) [2024-06-22 18:24:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4341760000. Throughput: 0: 42794.1. Samples: 4341820300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 18:24:27,092][15401] Updated weights for policy 0, policy_version 265010 (0.0042) [2024-06-22 18:24:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4341972992. Throughput: 0: 42591.8. Samples: 4342070180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:28,390][15132] Avg episode reward: [(0, '0.093')] [2024-06-22 18:24:30,811][15401] Updated weights for policy 0, policy_version 265020 (0.0027) [2024-06-22 18:24:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4342153216. Throughput: 0: 42604.9. Samples: 4342333760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 18:24:35,222][15401] Updated weights for policy 0, policy_version 265030 (0.0045) [2024-06-22 18:24:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4342382592. Throughput: 0: 42521.3. Samples: 4342451500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 18:24:38,703][15401] Updated weights for policy 0, policy_version 265040 (0.0027) [2024-06-22 18:24:42,802][15401] Updated weights for policy 0, policy_version 265050 (0.0039) [2024-06-22 18:24:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4342595584. Throughput: 0: 42823.1. Samples: 4342718360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 18:24:46,644][15401] Updated weights for policy 0, policy_version 265060 (0.0053) [2024-06-22 18:24:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 4342792192. Throughput: 0: 42515.0. Samples: 4342967780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-22 18:24:48,392][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 18:24:50,782][15401] Updated weights for policy 0, policy_version 265070 (0.0032) [2024-06-22 18:24:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4343021568. Throughput: 0: 42565.3. Samples: 4343088860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:24:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:24:54,330][15401] Updated weights for policy 0, policy_version 265080 (0.0041) [2024-06-22 18:24:58,306][15401] Updated weights for policy 0, policy_version 265090 (0.0036) [2024-06-22 18:24:58,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4343234560. Throughput: 0: 42720.0. Samples: 4343352820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:24:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 18:25:02,015][15401] Updated weights for policy 0, policy_version 265100 (0.0031) [2024-06-22 18:25:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4343414784. Throughput: 0: 42473.9. Samples: 4343611280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 18:25:05,198][15349] Signal inference workers to stop experience collection... (64300 times) [2024-06-22 18:25:05,201][15349] Signal inference workers to resume experience collection... (64300 times) [2024-06-22 18:25:05,218][15401] InferenceWorker_p0-w0: stopping experience collection (64300 times) [2024-06-22 18:25:05,219][15401] InferenceWorker_p0-w0: resuming experience collection (64300 times) [2024-06-22 18:25:05,752][15401] Updated weights for policy 0, policy_version 265110 (0.0038) [2024-06-22 18:25:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4343660544. Throughput: 0: 42606.2. Samples: 4343737580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 18:25:09,649][15401] Updated weights for policy 0, policy_version 265120 (0.0043) [2024-06-22 18:25:13,354][15401] Updated weights for policy 0, policy_version 265130 (0.0037) [2024-06-22 18:25:13,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4343889920. Throughput: 0: 42856.4. Samples: 4343998720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 18:25:17,332][15401] Updated weights for policy 0, policy_version 265140 (0.0037) [2024-06-22 18:25:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4344070144. Throughput: 0: 42658.2. Samples: 4344253380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 18:25:21,041][15401] Updated weights for policy 0, policy_version 265150 (0.0036) [2024-06-22 18:25:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4344315904. Throughput: 0: 42714.3. Samples: 4344373640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 18:25:25,447][15401] Updated weights for policy 0, policy_version 265160 (0.0039) [2024-06-22 18:25:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4344512512. Throughput: 0: 42616.5. Samples: 4344636100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 18:25:28,672][15401] Updated weights for policy 0, policy_version 265170 (0.0033) [2024-06-22 18:25:33,050][15401] Updated weights for policy 0, policy_version 265180 (0.0038) [2024-06-22 18:25:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4344709120. Throughput: 0: 42705.8. Samples: 4344889440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 18:25:36,585][15401] Updated weights for policy 0, policy_version 265190 (0.0040) [2024-06-22 18:25:38,392][15132] Fps is (10 sec: 44227.3, 60 sec: 42869.9, 300 sec: 42875.8). Total num frames: 4344954880. Throughput: 0: 42843.7. Samples: 4345016920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:38,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 18:25:40,773][15401] Updated weights for policy 0, policy_version 265200 (0.0038) [2024-06-22 18:25:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4345135104. Throughput: 0: 42726.8. Samples: 4345275520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 18:25:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265207_4345151488.pth... [2024-06-22 18:25:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264580_4334878720.pth [2024-06-22 18:25:44,337][15401] Updated weights for policy 0, policy_version 265210 (0.0034) [2024-06-22 18:25:48,390][15132] Fps is (10 sec: 39330.0, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 4345348096. Throughput: 0: 42528.8. Samples: 4345525080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 18:25:48,542][15401] Updated weights for policy 0, policy_version 265220 (0.0028) [2024-06-22 18:25:52,320][15401] Updated weights for policy 0, policy_version 265230 (0.0023) [2024-06-22 18:25:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4345577472. Throughput: 0: 42568.4. Samples: 4345653160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 18:25:56,282][15401] Updated weights for policy 0, policy_version 265240 (0.0026) [2024-06-22 18:25:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 4345741312. Throughput: 0: 42329.8. Samples: 4345903560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:25:58,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 18:26:00,099][15401] Updated weights for policy 0, policy_version 265250 (0.0032) [2024-06-22 18:26:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4345987072. Throughput: 0: 42223.1. Samples: 4346153420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 18:26:03,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-22 18:26:03,875][15401] Updated weights for policy 0, policy_version 265260 (0.0041) [2024-06-22 18:26:07,619][15401] Updated weights for policy 0, policy_version 265270 (0.0031) [2024-06-22 18:26:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4346216448. Throughput: 0: 42644.8. Samples: 4346292660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 18:26:11,766][15401] Updated weights for policy 0, policy_version 265280 (0.0037) [2024-06-22 18:26:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4346396672. Throughput: 0: 42291.4. Samples: 4346539220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 18:26:13,799][15349] Signal inference workers to stop experience collection... (64350 times) [2024-06-22 18:26:13,800][15349] Signal inference workers to resume experience collection... (64350 times) [2024-06-22 18:26:13,842][15401] InferenceWorker_p0-w0: stopping experience collection (64350 times) [2024-06-22 18:26:13,842][15401] InferenceWorker_p0-w0: resuming experience collection (64350 times) [2024-06-22 18:26:15,204][15401] Updated weights for policy 0, policy_version 265290 (0.0031) [2024-06-22 18:26:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4346642432. Throughput: 0: 42271.6. Samples: 4346791660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:26:19,367][15401] Updated weights for policy 0, policy_version 265300 (0.0027) [2024-06-22 18:26:23,105][15401] Updated weights for policy 0, policy_version 265310 (0.0036) [2024-06-22 18:26:23,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 4346855424. Throughput: 0: 42530.5. Samples: 4346930700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 18:26:26,997][15401] Updated weights for policy 0, policy_version 265320 (0.0031) [2024-06-22 18:26:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4347052032. Throughput: 0: 42312.4. Samples: 4347179580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 18:26:30,686][15401] Updated weights for policy 0, policy_version 265330 (0.0032) [2024-06-22 18:26:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4347297792. Throughput: 0: 42475.6. Samples: 4347436480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 18:26:34,497][15401] Updated weights for policy 0, policy_version 265340 (0.0038) [2024-06-22 18:26:38,223][15401] Updated weights for policy 0, policy_version 265350 (0.0040) [2024-06-22 18:26:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42325.2, 300 sec: 42821.1). Total num frames: 4347494400. Throughput: 0: 42568.9. Samples: 4347568860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:38,392][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 18:26:41,905][15401] Updated weights for policy 0, policy_version 265360 (0.0035) [2024-06-22 18:26:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4347691008. Throughput: 0: 42577.7. Samples: 4347819560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 18:26:45,752][15401] Updated weights for policy 0, policy_version 265370 (0.0047) [2024-06-22 18:26:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4347920384. Throughput: 0: 42751.7. Samples: 4348077240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 18:26:50,009][15401] Updated weights for policy 0, policy_version 265380 (0.0044) [2024-06-22 18:26:53,396][15132] Fps is (10 sec: 44209.1, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 4348133376. Throughput: 0: 42723.3. Samples: 4348215480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:53,396][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 18:26:53,637][15401] Updated weights for policy 0, policy_version 265390 (0.0040) [2024-06-22 18:26:57,540][15401] Updated weights for policy 0, policy_version 265400 (0.0032) [2024-06-22 18:26:58,396][15132] Fps is (10 sec: 39296.0, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 4348313600. Throughput: 0: 42631.9. Samples: 4348457920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:26:58,397][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 18:27:01,173][15401] Updated weights for policy 0, policy_version 265410 (0.0040) [2024-06-22 18:27:03,390][15132] Fps is (10 sec: 44264.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4348575744. Throughput: 0: 42719.0. Samples: 4348714020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:27:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 18:27:05,560][15401] Updated weights for policy 0, policy_version 265420 (0.0036) [2024-06-22 18:27:08,390][15132] Fps is (10 sec: 45904.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 4348772352. Throughput: 0: 42659.1. Samples: 4348850360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:27:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 18:27:08,765][15401] Updated weights for policy 0, policy_version 265430 (0.0035) [2024-06-22 18:27:13,223][15401] Updated weights for policy 0, policy_version 265440 (0.0033) [2024-06-22 18:27:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4348968960. Throughput: 0: 42674.6. Samples: 4349099940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-22 18:27:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 18:27:16,577][15401] Updated weights for policy 0, policy_version 265450 (0.0026) [2024-06-22 18:27:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4349198336. Throughput: 0: 42660.1. Samples: 4349356180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 18:27:20,755][15401] Updated weights for policy 0, policy_version 265460 (0.0034) [2024-06-22 18:27:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4349394944. Throughput: 0: 42682.6. Samples: 4349489480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 18:27:24,321][15401] Updated weights for policy 0, policy_version 265470 (0.0044) [2024-06-22 18:27:28,221][15401] Updated weights for policy 0, policy_version 265480 (0.0043) [2024-06-22 18:27:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4349624320. Throughput: 0: 42652.4. Samples: 4349738920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 18:27:32,246][15401] Updated weights for policy 0, policy_version 265490 (0.0034) [2024-06-22 18:27:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4349837312. Throughput: 0: 42476.4. Samples: 4349988680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 18:27:35,787][15401] Updated weights for policy 0, policy_version 265500 (0.0033) [2024-06-22 18:27:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 4350033920. Throughput: 0: 42306.0. Samples: 4350118980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 18:27:40,006][15401] Updated weights for policy 0, policy_version 265510 (0.0032) [2024-06-22 18:27:43,287][15401] Updated weights for policy 0, policy_version 265520 (0.0041) [2024-06-22 18:27:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4350279680. Throughput: 0: 42694.5. Samples: 4350378900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:27:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265520_4350279680.pth... [2024-06-22 18:27:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000264895_4340039680.pth [2024-06-22 18:27:47,665][15401] Updated weights for policy 0, policy_version 265530 (0.0026) [2024-06-22 18:27:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 4350476288. Throughput: 0: 42729.8. Samples: 4350636960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:48,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 18:27:50,395][15349] Signal inference workers to stop experience collection... (64400 times) [2024-06-22 18:27:50,395][15349] Signal inference workers to resume experience collection... (64400 times) [2024-06-22 18:27:50,427][15401] InferenceWorker_p0-w0: stopping experience collection (64400 times) [2024-06-22 18:27:50,427][15401] InferenceWorker_p0-w0: resuming experience collection (64400 times) [2024-06-22 18:27:51,264][15401] Updated weights for policy 0, policy_version 265540 (0.0034) [2024-06-22 18:27:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 4350672896. Throughput: 0: 42455.2. Samples: 4350760840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 18:27:55,467][15401] Updated weights for policy 0, policy_version 265550 (0.0029) [2024-06-22 18:27:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43149.2, 300 sec: 42653.9). Total num frames: 4350902272. Throughput: 0: 42523.7. Samples: 4351013500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:27:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 18:27:58,915][15401] Updated weights for policy 0, policy_version 265560 (0.0031) [2024-06-22 18:28:03,038][15401] Updated weights for policy 0, policy_version 265570 (0.0028) [2024-06-22 18:28:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4351098880. Throughput: 0: 42558.1. Samples: 4351271300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:28:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 18:28:06,589][15401] Updated weights for policy 0, policy_version 265580 (0.0034) [2024-06-22 18:28:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4351311872. Throughput: 0: 42531.1. Samples: 4351403380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:28:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 18:28:10,568][15401] Updated weights for policy 0, policy_version 265590 (0.0045) [2024-06-22 18:28:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4351524864. Throughput: 0: 42493.9. Samples: 4351651140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:28:13,391][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 18:28:14,339][15401] Updated weights for policy 0, policy_version 265600 (0.0034) [2024-06-22 18:28:18,323][15401] Updated weights for policy 0, policy_version 265610 (0.0028) [2024-06-22 18:28:18,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42597.4, 300 sec: 42653.7). Total num frames: 4351754240. Throughput: 0: 42654.7. Samples: 4351908200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:28:18,391][15132] Avg episode reward: [(0, '0.820')] [2024-06-22 18:28:22,339][15401] Updated weights for policy 0, policy_version 265620 (0.0027) [2024-06-22 18:28:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4351967232. Throughput: 0: 42698.2. Samples: 4352040400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:28:23,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 18:28:26,023][15401] Updated weights for policy 0, policy_version 265630 (0.0028) [2024-06-22 18:28:28,390][15132] Fps is (10 sec: 42604.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4352180224. Throughput: 0: 42515.1. Samples: 4352292080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 18:28:30,107][15401] Updated weights for policy 0, policy_version 265640 (0.0040) [2024-06-22 18:28:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 4352376832. Throughput: 0: 42498.2. Samples: 4352549280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 18:28:33,771][15401] Updated weights for policy 0, policy_version 265650 (0.0037) [2024-06-22 18:28:37,846][15401] Updated weights for policy 0, policy_version 265660 (0.0028) [2024-06-22 18:28:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4352589824. Throughput: 0: 42473.2. Samples: 4352672140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:38,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 18:28:41,549][15401] Updated weights for policy 0, policy_version 265670 (0.0025) [2024-06-22 18:28:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4352819200. Throughput: 0: 42481.3. Samples: 4352925160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 18:28:45,599][15401] Updated weights for policy 0, policy_version 265680 (0.0028) [2024-06-22 18:28:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 4353015808. Throughput: 0: 42626.7. Samples: 4353189500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 18:28:49,027][15401] Updated weights for policy 0, policy_version 265690 (0.0035) [2024-06-22 18:28:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4353212416. Throughput: 0: 42342.3. Samples: 4353308780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 18:28:53,414][15401] Updated weights for policy 0, policy_version 265700 (0.0038) [2024-06-22 18:28:56,787][15401] Updated weights for policy 0, policy_version 265710 (0.0037) [2024-06-22 18:28:58,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 4353458176. Throughput: 0: 42545.7. Samples: 4353565800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:28:58,393][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 18:29:01,325][15401] Updated weights for policy 0, policy_version 265720 (0.0028) [2024-06-22 18:29:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4353654784. Throughput: 0: 42531.1. Samples: 4353822040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 18:29:04,447][15401] Updated weights for policy 0, policy_version 265730 (0.0032) [2024-06-22 18:29:08,394][15132] Fps is (10 sec: 39314.7, 60 sec: 42322.5, 300 sec: 42486.7). Total num frames: 4353851392. Throughput: 0: 42344.1. Samples: 4353946060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:08,394][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 18:29:08,915][15401] Updated weights for policy 0, policy_version 265740 (0.0035) [2024-06-22 18:29:12,180][15401] Updated weights for policy 0, policy_version 265750 (0.0033) [2024-06-22 18:29:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4354080768. Throughput: 0: 42408.1. Samples: 4354200440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 18:29:16,434][15401] Updated weights for policy 0, policy_version 265760 (0.0033) [2024-06-22 18:29:18,390][15132] Fps is (10 sec: 42615.9, 60 sec: 42053.2, 300 sec: 42431.8). Total num frames: 4354277376. Throughput: 0: 42451.2. Samples: 4354459580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 18:29:19,840][15401] Updated weights for policy 0, policy_version 265770 (0.0035) [2024-06-22 18:29:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4354506752. Throughput: 0: 42555.5. Samples: 4354587140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 18:29:24,078][15401] Updated weights for policy 0, policy_version 265780 (0.0028) [2024-06-22 18:29:27,396][15401] Updated weights for policy 0, policy_version 265790 (0.0034) [2024-06-22 18:29:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4354719744. Throughput: 0: 42560.4. Samples: 4354840380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 18:29:31,663][15401] Updated weights for policy 0, policy_version 265800 (0.0031) [2024-06-22 18:29:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4354916352. Throughput: 0: 42418.1. Samples: 4355098320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 18:29:33,854][15349] Signal inference workers to stop experience collection... (64450 times) [2024-06-22 18:29:33,902][15401] InferenceWorker_p0-w0: stopping experience collection (64450 times) [2024-06-22 18:29:33,911][15349] Signal inference workers to resume experience collection... (64450 times) [2024-06-22 18:29:33,917][15401] InferenceWorker_p0-w0: resuming experience collection (64450 times) [2024-06-22 18:29:35,125][15401] Updated weights for policy 0, policy_version 265810 (0.0040) [2024-06-22 18:29:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4355129344. Throughput: 0: 42588.5. Samples: 4355225260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 18:29:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 18:29:39,382][15401] Updated weights for policy 0, policy_version 265820 (0.0036) [2024-06-22 18:29:42,868][15401] Updated weights for policy 0, policy_version 265830 (0.0038) [2024-06-22 18:29:43,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 4355375104. Throughput: 0: 42515.1. Samples: 4355478980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:29:43,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 18:29:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265831_4355375104.pth... [2024-06-22 18:29:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265207_4345151488.pth [2024-06-22 18:29:47,575][15401] Updated weights for policy 0, policy_version 265840 (0.0037) [2024-06-22 18:29:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 4355571712. Throughput: 0: 42647.0. Samples: 4355741160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:29:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 18:29:50,735][15401] Updated weights for policy 0, policy_version 265850 (0.0031) [2024-06-22 18:29:53,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4355768320. Throughput: 0: 42593.7. Samples: 4355862600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:29:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 18:29:55,388][15401] Updated weights for policy 0, policy_version 265860 (0.0039) [2024-06-22 18:29:58,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 4355997696. Throughput: 0: 42577.2. Samples: 4356116520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:29:58,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 18:29:58,554][15401] Updated weights for policy 0, policy_version 265870 (0.0035) [2024-06-22 18:30:03,088][15401] Updated weights for policy 0, policy_version 265880 (0.0041) [2024-06-22 18:30:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 4356194304. Throughput: 0: 42619.1. Samples: 4356377540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:03,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 18:30:06,163][15401] Updated weights for policy 0, policy_version 265890 (0.0035) [2024-06-22 18:30:08,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42874.3, 300 sec: 42487.3). Total num frames: 4356423680. Throughput: 0: 42522.7. Samples: 4356500660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 18:30:10,641][15401] Updated weights for policy 0, policy_version 265900 (0.0043) [2024-06-22 18:30:13,396][15132] Fps is (10 sec: 45856.9, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 4356653056. Throughput: 0: 42555.8. Samples: 4356755660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:13,396][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 18:30:13,770][15401] Updated weights for policy 0, policy_version 265910 (0.0040) [2024-06-22 18:30:18,341][15401] Updated weights for policy 0, policy_version 265920 (0.0035) [2024-06-22 18:30:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4356833280. Throughput: 0: 42870.9. Samples: 4357027500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:30:21,241][15401] Updated weights for policy 0, policy_version 265930 (0.0035) [2024-06-22 18:30:23,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4357095424. Throughput: 0: 42786.1. Samples: 4357150640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 18:30:25,859][15401] Updated weights for policy 0, policy_version 265940 (0.0040) [2024-06-22 18:30:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4357292032. Throughput: 0: 42781.9. Samples: 4357404060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 18:30:28,810][15401] Updated weights for policy 0, policy_version 265950 (0.0022) [2024-06-22 18:30:33,349][15401] Updated weights for policy 0, policy_version 265960 (0.0039) [2024-06-22 18:30:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42487.6). Total num frames: 4357488640. Throughput: 0: 42907.6. Samples: 4357672000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 18:30:36,383][15401] Updated weights for policy 0, policy_version 265970 (0.0033) [2024-06-22 18:30:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 4357750784. Throughput: 0: 43012.6. Samples: 4357798160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:38,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-22 18:30:40,798][15401] Updated weights for policy 0, policy_version 265980 (0.0036) [2024-06-22 18:30:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 4357931008. Throughput: 0: 43045.4. Samples: 4358053460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 18:30:44,335][15401] Updated weights for policy 0, policy_version 265990 (0.0026) [2024-06-22 18:30:48,385][15401] Updated weights for policy 0, policy_version 266000 (0.0044) [2024-06-22 18:30:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4358144000. Throughput: 0: 42932.2. Samples: 4358309380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:30:48,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 18:30:52,050][15401] Updated weights for policy 0, policy_version 266010 (0.0024) [2024-06-22 18:30:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4358356992. Throughput: 0: 42923.9. Samples: 4358432240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:30:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 18:30:55,705][15349] Signal inference workers to stop experience collection... (64500 times) [2024-06-22 18:30:55,755][15349] Signal inference workers to resume experience collection... (64500 times) [2024-06-22 18:30:55,756][15401] InferenceWorker_p0-w0: stopping experience collection (64500 times) [2024-06-22 18:30:55,771][15401] InferenceWorker_p0-w0: resuming experience collection (64500 times) [2024-06-22 18:30:55,902][15401] Updated weights for policy 0, policy_version 266020 (0.0032) [2024-06-22 18:30:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4358586368. Throughput: 0: 43273.2. Samples: 4358702680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:30:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 18:30:59,741][15401] Updated weights for policy 0, policy_version 266030 (0.0041) [2024-06-22 18:31:03,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 4358782976. Throughput: 0: 42981.2. Samples: 4358961660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:31:03,569][15401] Updated weights for policy 0, policy_version 266040 (0.0036) [2024-06-22 18:31:07,222][15401] Updated weights for policy 0, policy_version 266050 (0.0032) [2024-06-22 18:31:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 4359012352. Throughput: 0: 42957.3. Samples: 4359083820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:08,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 18:31:11,139][15401] Updated weights for policy 0, policy_version 266060 (0.0039) [2024-06-22 18:31:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 4359241728. Throughput: 0: 43172.9. Samples: 4359346840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:13,395][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 18:31:14,753][15401] Updated weights for policy 0, policy_version 266070 (0.0046) [2024-06-22 18:31:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 4359421952. Throughput: 0: 42810.7. Samples: 4359598480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 18:31:18,737][15401] Updated weights for policy 0, policy_version 266080 (0.0034) [2024-06-22 18:31:22,867][15401] Updated weights for policy 0, policy_version 266090 (0.0039) [2024-06-22 18:31:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4359618560. Throughput: 0: 42715.4. Samples: 4359720360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 18:31:26,562][15401] Updated weights for policy 0, policy_version 266100 (0.0035) [2024-06-22 18:31:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4359864320. Throughput: 0: 42827.5. Samples: 4359980700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 18:31:30,445][15401] Updated weights for policy 0, policy_version 266110 (0.0038) [2024-06-22 18:31:33,394][15132] Fps is (10 sec: 45855.2, 60 sec: 43141.4, 300 sec: 42653.6). Total num frames: 4360077312. Throughput: 0: 42858.4. Samples: 4360238200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:33,394][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 18:31:34,345][15401] Updated weights for policy 0, policy_version 266120 (0.0036) [2024-06-22 18:31:37,891][15401] Updated weights for policy 0, policy_version 266130 (0.0030) [2024-06-22 18:31:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 4360273920. Throughput: 0: 42896.7. Samples: 4360362580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 18:31:41,859][15401] Updated weights for policy 0, policy_version 266140 (0.0029) [2024-06-22 18:31:43,390][15132] Fps is (10 sec: 44255.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4360519680. Throughput: 0: 42722.2. Samples: 4360625180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 18:31:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266145_4360519680.pth... [2024-06-22 18:31:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265520_4350279680.pth [2024-06-22 18:31:45,929][15401] Updated weights for policy 0, policy_version 266150 (0.0035) [2024-06-22 18:31:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 4360716288. Throughput: 0: 42590.3. Samples: 4360878220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 18:31:49,355][15401] Updated weights for policy 0, policy_version 266160 (0.0037) [2024-06-22 18:31:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.6, 300 sec: 42710.4). Total num frames: 4360912896. Throughput: 0: 42587.7. Samples: 4361000160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 18:31:53,485][15401] Updated weights for policy 0, policy_version 266170 (0.0033) [2024-06-22 18:31:56,874][15401] Updated weights for policy 0, policy_version 266180 (0.0030) [2024-06-22 18:31:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4361158656. Throughput: 0: 42535.1. Samples: 4361260920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:31:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 18:32:00,994][15401] Updated weights for policy 0, policy_version 266190 (0.0027) [2024-06-22 18:32:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4361338880. Throughput: 0: 42750.3. Samples: 4361522240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 18:32:04,472][15401] Updated weights for policy 0, policy_version 266200 (0.0036) [2024-06-22 18:32:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4361568256. Throughput: 0: 42803.6. Samples: 4361646520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 18:32:08,484][15401] Updated weights for policy 0, policy_version 266210 (0.0044) [2024-06-22 18:32:12,076][15401] Updated weights for policy 0, policy_version 266220 (0.0039) [2024-06-22 18:32:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4361797632. Throughput: 0: 42725.4. Samples: 4361903340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 18:32:16,081][15401] Updated weights for policy 0, policy_version 266230 (0.0028) [2024-06-22 18:32:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4361994240. Throughput: 0: 42865.1. Samples: 4362166940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 18:32:19,842][15401] Updated weights for policy 0, policy_version 266240 (0.0028) [2024-06-22 18:32:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 4362207232. Throughput: 0: 42856.0. Samples: 4362291100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 18:32:23,802][15401] Updated weights for policy 0, policy_version 266250 (0.0035) [2024-06-22 18:32:25,639][15349] Signal inference workers to stop experience collection... (64550 times) [2024-06-22 18:32:25,689][15401] InferenceWorker_p0-w0: stopping experience collection (64550 times) [2024-06-22 18:32:25,751][15349] Signal inference workers to resume experience collection... (64550 times) [2024-06-22 18:32:25,751][15401] InferenceWorker_p0-w0: resuming experience collection (64550 times) [2024-06-22 18:32:27,344][15401] Updated weights for policy 0, policy_version 266260 (0.0026) [2024-06-22 18:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4362420224. Throughput: 0: 42650.4. Samples: 4362544440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 18:32:31,561][15401] Updated weights for policy 0, policy_version 266270 (0.0029) [2024-06-22 18:32:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42328.4, 300 sec: 42653.9). Total num frames: 4362616832. Throughput: 0: 42815.5. Samples: 4362804920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 18:32:35,259][15401] Updated weights for policy 0, policy_version 266280 (0.0029) [2024-06-22 18:32:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4362862592. Throughput: 0: 42742.1. Samples: 4362923560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 18:32:39,318][15401] Updated weights for policy 0, policy_version 266290 (0.0038) [2024-06-22 18:32:43,051][15401] Updated weights for policy 0, policy_version 266300 (0.0033) [2024-06-22 18:32:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 4363059200. Throughput: 0: 42667.0. Samples: 4363180940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 18:32:47,189][15401] Updated weights for policy 0, policy_version 266310 (0.0041) [2024-06-22 18:32:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4363255808. Throughput: 0: 42571.9. Samples: 4363437980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 18:32:50,609][15401] Updated weights for policy 0, policy_version 266320 (0.0045) [2024-06-22 18:32:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4363485184. Throughput: 0: 42524.5. Samples: 4363560120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 18:32:54,903][15401] Updated weights for policy 0, policy_version 266330 (0.0032) [2024-06-22 18:32:58,370][15401] Updated weights for policy 0, policy_version 266340 (0.0030) [2024-06-22 18:32:58,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4363714560. Throughput: 0: 42600.0. Samples: 4363820340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:32:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:33:02,659][15401] Updated weights for policy 0, policy_version 266350 (0.0031) [2024-06-22 18:33:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4363911168. Throughput: 0: 42422.2. Samples: 4364075940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:33:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 18:33:05,946][15401] Updated weights for policy 0, policy_version 266360 (0.0037) [2024-06-22 18:33:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4364124160. Throughput: 0: 42291.9. Samples: 4364194240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-22 18:33:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 18:33:10,432][15401] Updated weights for policy 0, policy_version 266370 (0.0040) [2024-06-22 18:33:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.7). Total num frames: 4364353536. Throughput: 0: 42646.9. Samples: 4364463560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 18:33:13,457][15401] Updated weights for policy 0, policy_version 266380 (0.0034) [2024-06-22 18:33:18,141][15401] Updated weights for policy 0, policy_version 266390 (0.0042) [2024-06-22 18:33:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 4364550144. Throughput: 0: 42526.2. Samples: 4364718700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:18,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 18:33:21,211][15401] Updated weights for policy 0, policy_version 266400 (0.0038) [2024-06-22 18:33:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4364779520. Throughput: 0: 42633.0. Samples: 4364842040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 18:33:25,618][15401] Updated weights for policy 0, policy_version 266410 (0.0035) [2024-06-22 18:33:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4364976128. Throughput: 0: 42654.8. Samples: 4365100400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 18:33:28,946][15401] Updated weights for policy 0, policy_version 266420 (0.0028) [2024-06-22 18:33:33,348][15401] Updated weights for policy 0, policy_version 266430 (0.0039) [2024-06-22 18:33:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4365189120. Throughput: 0: 42772.5. Samples: 4365362740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 18:33:36,592][15401] Updated weights for policy 0, policy_version 266440 (0.0045) [2024-06-22 18:33:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4365434880. Throughput: 0: 42896.9. Samples: 4365490480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 18:33:40,934][15401] Updated weights for policy 0, policy_version 266450 (0.0041) [2024-06-22 18:33:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4365598720. Throughput: 0: 42803.2. Samples: 4365746480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 18:33:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266456_4365615104.pth... [2024-06-22 18:33:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000265831_4355375104.pth [2024-06-22 18:33:44,190][15401] Updated weights for policy 0, policy_version 266460 (0.0027) [2024-06-22 18:33:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4365828096. Throughput: 0: 42894.5. Samples: 4366006200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 18:33:48,544][15401] Updated weights for policy 0, policy_version 266470 (0.0046) [2024-06-22 18:33:51,746][15401] Updated weights for policy 0, policy_version 266480 (0.0024) [2024-06-22 18:33:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4366057472. Throughput: 0: 43150.3. Samples: 4366136000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 18:33:55,739][15349] Signal inference workers to stop experience collection... (64600 times) [2024-06-22 18:33:55,744][15349] Signal inference workers to resume experience collection... (64600 times) [2024-06-22 18:33:55,762][15401] InferenceWorker_p0-w0: stopping experience collection (64600 times) [2024-06-22 18:33:55,762][15401] InferenceWorker_p0-w0: resuming experience collection (64600 times) [2024-06-22 18:33:56,385][15401] Updated weights for policy 0, policy_version 266490 (0.0043) [2024-06-22 18:33:58,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 4366254080. Throughput: 0: 42734.3. Samples: 4366386700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:33:58,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 18:33:59,242][15401] Updated weights for policy 0, policy_version 266500 (0.0032) [2024-06-22 18:34:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 4366467072. Throughput: 0: 42899.2. Samples: 4366649060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:34:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 18:34:03,887][15401] Updated weights for policy 0, policy_version 266510 (0.0033) [2024-06-22 18:34:06,933][15401] Updated weights for policy 0, policy_version 266520 (0.0031) [2024-06-22 18:34:08,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4366712832. Throughput: 0: 42977.4. Samples: 4366776020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:34:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 18:34:11,989][15401] Updated weights for policy 0, policy_version 266530 (0.0043) [2024-06-22 18:34:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4366893056. Throughput: 0: 42941.7. Samples: 4367032780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:34:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 18:34:14,765][15401] Updated weights for policy 0, policy_version 266540 (0.0039) [2024-06-22 18:34:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4367122432. Throughput: 0: 42852.4. Samples: 4367291100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:34:18,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 18:34:19,417][15401] Updated weights for policy 0, policy_version 266550 (0.0039) [2024-06-22 18:34:22,488][15401] Updated weights for policy 0, policy_version 266560 (0.0034) [2024-06-22 18:34:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4367351808. Throughput: 0: 42749.8. Samples: 4367414220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 18:34:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 18:34:26,950][15401] Updated weights for policy 0, policy_version 266570 (0.0027) [2024-06-22 18:34:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4367548416. Throughput: 0: 42679.9. Samples: 4367667080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:28,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 18:34:30,453][15401] Updated weights for policy 0, policy_version 266580 (0.0033) [2024-06-22 18:34:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4367761408. Throughput: 0: 42706.4. Samples: 4367927980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 18:34:34,402][15401] Updated weights for policy 0, policy_version 266590 (0.0031) [2024-06-22 18:34:37,928][15401] Updated weights for policy 0, policy_version 266600 (0.0027) [2024-06-22 18:34:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 4367974400. Throughput: 0: 42686.6. Samples: 4368056900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:38,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 18:34:42,206][15401] Updated weights for policy 0, policy_version 266610 (0.0027) [2024-06-22 18:34:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4368187392. Throughput: 0: 42920.9. Samples: 4368318040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:34:45,451][15401] Updated weights for policy 0, policy_version 266620 (0.0038) [2024-06-22 18:34:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4368400384. Throughput: 0: 42662.6. Samples: 4368568880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 18:34:49,690][15401] Updated weights for policy 0, policy_version 266630 (0.0039) [2024-06-22 18:34:52,931][15401] Updated weights for policy 0, policy_version 266640 (0.0045) [2024-06-22 18:34:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4368629760. Throughput: 0: 42804.9. Samples: 4368702240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 18:34:57,284][15401] Updated weights for policy 0, policy_version 266650 (0.0042) [2024-06-22 18:34:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42820.9). Total num frames: 4368826368. Throughput: 0: 42780.0. Samples: 4368957880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:34:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-22 18:35:00,567][15401] Updated weights for policy 0, policy_version 266660 (0.0040) [2024-06-22 18:35:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4369055744. Throughput: 0: 42655.5. Samples: 4369210600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:03,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 18:35:04,949][15401] Updated weights for policy 0, policy_version 266670 (0.0035) [2024-06-22 18:35:08,231][15401] Updated weights for policy 0, policy_version 266680 (0.0032) [2024-06-22 18:35:08,396][15132] Fps is (10 sec: 45846.2, 60 sec: 42866.8, 300 sec: 42820.6). Total num frames: 4369285120. Throughput: 0: 42850.7. Samples: 4369342780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:08,396][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 18:35:09,541][15349] Signal inference workers to stop experience collection... (64650 times) [2024-06-22 18:35:09,541][15349] Signal inference workers to resume experience collection... (64650 times) [2024-06-22 18:35:09,591][15401] InferenceWorker_p0-w0: stopping experience collection (64650 times) [2024-06-22 18:35:09,591][15401] InferenceWorker_p0-w0: resuming experience collection (64650 times) [2024-06-22 18:35:12,744][15401] Updated weights for policy 0, policy_version 266690 (0.0043) [2024-06-22 18:35:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4369465344. Throughput: 0: 42913.3. Samples: 4369598180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 18:35:16,103][15401] Updated weights for policy 0, policy_version 266700 (0.0038) [2024-06-22 18:35:18,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4369694720. Throughput: 0: 42729.8. Samples: 4369850820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 18:35:20,317][15401] Updated weights for policy 0, policy_version 266710 (0.0037) [2024-06-22 18:35:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4369907712. Throughput: 0: 42869.4. Samples: 4369986020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 18:35:23,701][15401] Updated weights for policy 0, policy_version 266720 (0.0036) [2024-06-22 18:35:27,896][15401] Updated weights for policy 0, policy_version 266730 (0.0033) [2024-06-22 18:35:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4370120704. Throughput: 0: 42613.3. Samples: 4370235640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 18:35:31,554][15401] Updated weights for policy 0, policy_version 266740 (0.0038) [2024-06-22 18:35:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4370350080. Throughput: 0: 42761.4. Samples: 4370493140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 18:35:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 18:35:35,343][15401] Updated weights for policy 0, policy_version 266750 (0.0025) [2024-06-22 18:35:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4370530304. Throughput: 0: 42642.3. Samples: 4370621140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:35:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 18:35:39,317][15401] Updated weights for policy 0, policy_version 266760 (0.0043) [2024-06-22 18:35:43,190][15401] Updated weights for policy 0, policy_version 266770 (0.0042) [2024-06-22 18:35:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4370759680. Throughput: 0: 42559.1. Samples: 4370873140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:35:43,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 18:35:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266770_4370759680.pth... [2024-06-22 18:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266145_4360519680.pth [2024-06-22 18:35:47,520][15401] Updated weights for policy 0, policy_version 266780 (0.0032) [2024-06-22 18:35:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4370956288. Throughput: 0: 42812.5. Samples: 4371137160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:35:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 18:35:50,639][15401] Updated weights for policy 0, policy_version 266790 (0.0026) [2024-06-22 18:35:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4371185664. Throughput: 0: 42716.4. Samples: 4371264740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:35:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 18:35:55,045][15401] Updated weights for policy 0, policy_version 266800 (0.0040) [2024-06-22 18:35:58,127][15401] Updated weights for policy 0, policy_version 266810 (0.0031) [2024-06-22 18:35:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 4371415040. Throughput: 0: 42739.7. Samples: 4371521460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:35:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 18:36:02,610][15401] Updated weights for policy 0, policy_version 266820 (0.0028) [2024-06-22 18:36:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 4371611648. Throughput: 0: 42978.7. Samples: 4371784860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 18:36:05,571][15401] Updated weights for policy 0, policy_version 266830 (0.0036) [2024-06-22 18:36:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 4371841024. Throughput: 0: 42734.6. Samples: 4371909080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 18:36:10,206][15401] Updated weights for policy 0, policy_version 266840 (0.0030) [2024-06-22 18:36:13,097][15401] Updated weights for policy 0, policy_version 266850 (0.0032) [2024-06-22 18:36:13,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 4372070400. Throughput: 0: 42944.3. Samples: 4372168240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:13,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 18:36:17,689][15401] Updated weights for policy 0, policy_version 266860 (0.0039) [2024-06-22 18:36:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4372250624. Throughput: 0: 42938.1. Samples: 4372425360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 18:36:21,199][15401] Updated weights for policy 0, policy_version 266870 (0.0035) [2024-06-22 18:36:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4372480000. Throughput: 0: 42828.7. Samples: 4372548440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 18:36:25,127][15349] Signal inference workers to stop experience collection... (64700 times) [2024-06-22 18:36:25,127][15349] Signal inference workers to resume experience collection... (64700 times) [2024-06-22 18:36:25,168][15401] InferenceWorker_p0-w0: stopping experience collection (64700 times) [2024-06-22 18:36:25,169][15401] InferenceWorker_p0-w0: resuming experience collection (64700 times) [2024-06-22 18:36:25,259][15401] Updated weights for policy 0, policy_version 266880 (0.0027) [2024-06-22 18:36:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.7). Total num frames: 4372692992. Throughput: 0: 42994.4. Samples: 4372807780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 18:36:28,691][15401] Updated weights for policy 0, policy_version 266890 (0.0028) [2024-06-22 18:36:33,174][15401] Updated weights for policy 0, policy_version 266900 (0.0032) [2024-06-22 18:36:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4372889600. Throughput: 0: 42907.9. Samples: 4373068020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 18:36:36,212][15401] Updated weights for policy 0, policy_version 266910 (0.0039) [2024-06-22 18:36:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4373135360. Throughput: 0: 42747.0. Samples: 4373188360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 18:36:40,783][15401] Updated weights for policy 0, policy_version 266920 (0.0031) [2024-06-22 18:36:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4373331968. Throughput: 0: 42881.1. Samples: 4373451120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 18:36:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 18:36:43,961][15401] Updated weights for policy 0, policy_version 266930 (0.0042) [2024-06-22 18:36:48,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4373512192. Throughput: 0: 42760.5. Samples: 4373709080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:36:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 18:36:48,695][15401] Updated weights for policy 0, policy_version 266940 (0.0033) [2024-06-22 18:36:51,710][15401] Updated weights for policy 0, policy_version 266950 (0.0029) [2024-06-22 18:36:53,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4373774336. Throughput: 0: 42680.1. Samples: 4373829680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:36:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 18:36:56,256][15401] Updated weights for policy 0, policy_version 266960 (0.0035) [2024-06-22 18:36:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4373954560. Throughput: 0: 42717.6. Samples: 4374090420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:36:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 18:36:59,680][15401] Updated weights for policy 0, policy_version 266970 (0.0033) [2024-06-22 18:37:03,389][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4374151168. Throughput: 0: 42661.4. Samples: 4374345120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 18:37:04,039][15401] Updated weights for policy 0, policy_version 266980 (0.0029) [2024-06-22 18:37:07,317][15401] Updated weights for policy 0, policy_version 266990 (0.0043) [2024-06-22 18:37:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4374396928. Throughput: 0: 42687.6. Samples: 4374469380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 18:37:11,901][15401] Updated weights for policy 0, policy_version 267000 (0.0042) [2024-06-22 18:37:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 4374593536. Throughput: 0: 42674.0. Samples: 4374728120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 18:37:15,077][15401] Updated weights for policy 0, policy_version 267010 (0.0037) [2024-06-22 18:37:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4374806528. Throughput: 0: 42514.3. Samples: 4374981160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:18,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 18:37:19,525][15401] Updated weights for policy 0, policy_version 267020 (0.0033) [2024-06-22 18:37:22,898][15401] Updated weights for policy 0, policy_version 267030 (0.0033) [2024-06-22 18:37:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4375035904. Throughput: 0: 42615.2. Samples: 4375106040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 18:37:27,158][15401] Updated weights for policy 0, policy_version 267040 (0.0053) [2024-06-22 18:37:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4375232512. Throughput: 0: 42565.4. Samples: 4375366560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 18:37:30,553][15401] Updated weights for policy 0, policy_version 267050 (0.0037) [2024-06-22 18:37:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4375445504. Throughput: 0: 42461.5. Samples: 4375619860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:33,391][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 18:37:34,958][15401] Updated weights for policy 0, policy_version 267060 (0.0042) [2024-06-22 18:37:38,111][15401] Updated weights for policy 0, policy_version 267070 (0.0034) [2024-06-22 18:37:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4375674880. Throughput: 0: 42670.1. Samples: 4375749840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 18:37:42,346][15401] Updated weights for policy 0, policy_version 267080 (0.0033) [2024-06-22 18:37:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 4375855104. Throughput: 0: 42713.7. Samples: 4376012540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 18:37:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267082_4375871488.pth... [2024-06-22 18:37:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266456_4365615104.pth [2024-06-22 18:37:45,575][15401] Updated weights for policy 0, policy_version 267090 (0.0029) [2024-06-22 18:37:48,393][15132] Fps is (10 sec: 44219.2, 60 sec: 43414.7, 300 sec: 42820.0). Total num frames: 4376117248. Throughput: 0: 42601.1. Samples: 4376262340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:48,394][15132] Avg episode reward: [(0, '0.855')] [2024-06-22 18:37:49,946][15401] Updated weights for policy 0, policy_version 267100 (0.0024) [2024-06-22 18:37:53,213][15401] Updated weights for policy 0, policy_version 267110 (0.0042) [2024-06-22 18:37:53,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4376330240. Throughput: 0: 42911.7. Samples: 4376400400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 18:37:57,633][15401] Updated weights for policy 0, policy_version 267120 (0.0035) [2024-06-22 18:37:58,390][15132] Fps is (10 sec: 39336.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 4376510464. Throughput: 0: 42782.3. Samples: 4376653320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 18:37:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 18:38:01,066][15401] Updated weights for policy 0, policy_version 267130 (0.0041) [2024-06-22 18:38:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4376756224. Throughput: 0: 42705.6. Samples: 4376902920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 18:38:05,184][15401] Updated weights for policy 0, policy_version 267140 (0.0033) [2024-06-22 18:38:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4376969216. Throughput: 0: 42982.7. Samples: 4377040260. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:38:08,978][15401] Updated weights for policy 0, policy_version 267150 (0.0036) [2024-06-22 18:38:09,439][15349] Signal inference workers to stop experience collection... (64750 times) [2024-06-22 18:38:09,486][15401] InferenceWorker_p0-w0: stopping experience collection (64750 times) [2024-06-22 18:38:09,555][15349] Signal inference workers to resume experience collection... (64750 times) [2024-06-22 18:38:09,556][15401] InferenceWorker_p0-w0: resuming experience collection (64750 times) [2024-06-22 18:38:12,855][15401] Updated weights for policy 0, policy_version 267160 (0.0034) [2024-06-22 18:38:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4377149440. Throughput: 0: 42784.0. Samples: 4377291840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 18:38:16,642][15401] Updated weights for policy 0, policy_version 267170 (0.0037) [2024-06-22 18:38:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4377395200. Throughput: 0: 42746.8. Samples: 4377543460. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 18:38:20,414][15401] Updated weights for policy 0, policy_version 267180 (0.0031) [2024-06-22 18:38:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4377591808. Throughput: 0: 42752.9. Samples: 4377673720. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 18:38:24,124][15401] Updated weights for policy 0, policy_version 267190 (0.0029) [2024-06-22 18:38:28,175][15401] Updated weights for policy 0, policy_version 267200 (0.0029) [2024-06-22 18:38:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4377804800. Throughput: 0: 42544.4. Samples: 4377927040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 18:38:31,923][15401] Updated weights for policy 0, policy_version 267210 (0.0028) [2024-06-22 18:38:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 4378017792. Throughput: 0: 42658.1. Samples: 4378181780. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 18:38:35,854][15401] Updated weights for policy 0, policy_version 267220 (0.0048) [2024-06-22 18:38:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4378230784. Throughput: 0: 42433.8. Samples: 4378309920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:38,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 18:38:39,310][15401] Updated weights for policy 0, policy_version 267230 (0.0040) [2024-06-22 18:38:43,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4378427392. Throughput: 0: 42477.0. Samples: 4378564780. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 18:38:43,809][15401] Updated weights for policy 0, policy_version 267240 (0.0040) [2024-06-22 18:38:47,327][15401] Updated weights for policy 0, policy_version 267250 (0.0040) [2024-06-22 18:38:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42328.0, 300 sec: 42709.4). Total num frames: 4378656768. Throughput: 0: 42640.8. Samples: 4378821760. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 18:38:51,392][15401] Updated weights for policy 0, policy_version 267260 (0.0040) [2024-06-22 18:38:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 4378869760. Throughput: 0: 42467.7. Samples: 4378951300. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 18:38:54,921][15401] Updated weights for policy 0, policy_version 267270 (0.0041) [2024-06-22 18:38:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4379082752. Throughput: 0: 42623.6. Samples: 4379209900. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:38:58,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 18:38:58,807][15401] Updated weights for policy 0, policy_version 267280 (0.0041) [2024-06-22 18:39:02,468][15401] Updated weights for policy 0, policy_version 267290 (0.0027) [2024-06-22 18:39:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4379295744. Throughput: 0: 42688.4. Samples: 4379464440. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:39:03,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-22 18:39:06,307][15401] Updated weights for policy 0, policy_version 267300 (0.0034) [2024-06-22 18:39:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4379525120. Throughput: 0: 42664.9. Samples: 4379593640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-22 18:39:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 18:39:10,076][15401] Updated weights for policy 0, policy_version 267310 (0.0027) [2024-06-22 18:39:13,391][15132] Fps is (10 sec: 42594.1, 60 sec: 42870.8, 300 sec: 42709.3). Total num frames: 4379721728. Throughput: 0: 42882.6. Samples: 4379856800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:13,391][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 18:39:14,200][15401] Updated weights for policy 0, policy_version 267320 (0.0034) [2024-06-22 18:39:17,842][15401] Updated weights for policy 0, policy_version 267330 (0.0032) [2024-06-22 18:39:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4379934720. Throughput: 0: 42731.9. Samples: 4380104720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 18:39:21,695][15401] Updated weights for policy 0, policy_version 267340 (0.0031) [2024-06-22 18:39:23,392][15132] Fps is (10 sec: 44230.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4380164096. Throughput: 0: 42857.6. Samples: 4380238620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:23,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 18:39:25,543][15401] Updated weights for policy 0, policy_version 267350 (0.0022) [2024-06-22 18:39:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4380360704. Throughput: 0: 42784.0. Samples: 4380490060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 18:39:29,557][15401] Updated weights for policy 0, policy_version 267360 (0.0043) [2024-06-22 18:39:33,101][15349] Signal inference workers to stop experience collection... (64800 times) [2024-06-22 18:39:33,103][15349] Signal inference workers to resume experience collection... (64800 times) [2024-06-22 18:39:33,136][15401] InferenceWorker_p0-w0: stopping experience collection (64800 times) [2024-06-22 18:39:33,136][15401] InferenceWorker_p0-w0: resuming experience collection (64800 times) [2024-06-22 18:39:33,252][15401] Updated weights for policy 0, policy_version 267370 (0.0034) [2024-06-22 18:39:33,396][15132] Fps is (10 sec: 42581.8, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 4380590080. Throughput: 0: 42707.9. Samples: 4380743880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:33,396][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 18:39:37,666][15401] Updated weights for policy 0, policy_version 267380 (0.0031) [2024-06-22 18:39:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4380803072. Throughput: 0: 42785.7. Samples: 4380876660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 18:39:40,869][15401] Updated weights for policy 0, policy_version 267390 (0.0031) [2024-06-22 18:39:43,390][15132] Fps is (10 sec: 39346.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4380983296. Throughput: 0: 42626.6. Samples: 4381128100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 18:39:43,503][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267395_4380999680.pth... [2024-06-22 18:39:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000266770_4370759680.pth [2024-06-22 18:39:45,199][15401] Updated weights for policy 0, policy_version 267400 (0.0048) [2024-06-22 18:39:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4381212672. Throughput: 0: 42655.9. Samples: 4381383960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 18:39:49,226][15401] Updated weights for policy 0, policy_version 267410 (0.0034) [2024-06-22 18:39:52,750][15401] Updated weights for policy 0, policy_version 267420 (0.0033) [2024-06-22 18:39:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4381442048. Throughput: 0: 42737.3. Samples: 4381516820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:53,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-22 18:39:56,948][15401] Updated weights for policy 0, policy_version 267430 (0.0046) [2024-06-22 18:39:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4381638656. Throughput: 0: 42480.0. Samples: 4381768360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:39:58,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 18:40:00,645][15401] Updated weights for policy 0, policy_version 267440 (0.0040) [2024-06-22 18:40:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 4381868032. Throughput: 0: 42460.4. Samples: 4382015440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:40:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 18:40:04,495][15401] Updated weights for policy 0, policy_version 267450 (0.0033) [2024-06-22 18:40:08,282][15401] Updated weights for policy 0, policy_version 267460 (0.0030) [2024-06-22 18:40:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4382064640. Throughput: 0: 42495.3. Samples: 4382150800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:40:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 18:40:12,060][15401] Updated weights for policy 0, policy_version 267470 (0.0031) [2024-06-22 18:40:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42872.2, 300 sec: 42709.5). Total num frames: 4382294016. Throughput: 0: 42525.8. Samples: 4382403720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:40:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 18:40:15,794][15401] Updated weights for policy 0, policy_version 267480 (0.0038) [2024-06-22 18:40:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4382507008. Throughput: 0: 42610.0. Samples: 4382661060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 18:40:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 18:40:19,567][15401] Updated weights for policy 0, policy_version 267490 (0.0030) [2024-06-22 18:40:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 4382703616. Throughput: 0: 42523.9. Samples: 4382790240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 18:40:23,436][15401] Updated weights for policy 0, policy_version 267500 (0.0033) [2024-06-22 18:40:27,085][15401] Updated weights for policy 0, policy_version 267510 (0.0033) [2024-06-22 18:40:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4382932992. Throughput: 0: 42627.7. Samples: 4383046340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:28,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-22 18:40:31,123][15401] Updated weights for policy 0, policy_version 267520 (0.0024) [2024-06-22 18:40:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 4383145984. Throughput: 0: 42741.4. Samples: 4383307320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:33,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-22 18:40:34,737][15401] Updated weights for policy 0, policy_version 267530 (0.0030) [2024-06-22 18:40:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 4383342592. Throughput: 0: 42679.0. Samples: 4383437380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:38,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 18:40:38,595][15401] Updated weights for policy 0, policy_version 267540 (0.0034) [2024-06-22 18:40:42,302][15401] Updated weights for policy 0, policy_version 267550 (0.0029) [2024-06-22 18:40:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 4383588352. Throughput: 0: 42868.5. Samples: 4383697540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:43,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 18:40:46,044][15349] Signal inference workers to stop experience collection... (64850 times) [2024-06-22 18:40:46,052][15349] Signal inference workers to resume experience collection... (64850 times) [2024-06-22 18:40:46,058][15401] InferenceWorker_p0-w0: stopping experience collection (64850 times) [2024-06-22 18:40:46,080][15401] InferenceWorker_p0-w0: resuming experience collection (64850 times) [2024-06-22 18:40:46,186][15401] Updated weights for policy 0, policy_version 267560 (0.0032) [2024-06-22 18:40:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4383801344. Throughput: 0: 43004.9. Samples: 4383950660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:40:50,016][15401] Updated weights for policy 0, policy_version 267570 (0.0033) [2024-06-22 18:40:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4383997952. Throughput: 0: 42809.6. Samples: 4384077240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:53,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 18:40:53,899][15401] Updated weights for policy 0, policy_version 267580 (0.0039) [2024-06-22 18:40:57,515][15401] Updated weights for policy 0, policy_version 267590 (0.0041) [2024-06-22 18:40:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4384227328. Throughput: 0: 43069.3. Samples: 4384341840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:40:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-22 18:41:01,645][15401] Updated weights for policy 0, policy_version 267600 (0.0041) [2024-06-22 18:41:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4384407552. Throughput: 0: 42955.4. Samples: 4384594060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 18:41:05,229][15401] Updated weights for policy 0, policy_version 267610 (0.0027) [2024-06-22 18:41:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 4384653312. Throughput: 0: 42938.3. Samples: 4384722460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 18:41:09,056][15401] Updated weights for policy 0, policy_version 267620 (0.0036) [2024-06-22 18:41:13,335][15401] Updated weights for policy 0, policy_version 267630 (0.0038) [2024-06-22 18:41:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4384849920. Throughput: 0: 43003.1. Samples: 4384981480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 18:41:16,671][15401] Updated weights for policy 0, policy_version 267640 (0.0040) [2024-06-22 18:41:18,392][15132] Fps is (10 sec: 42589.8, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 4385079296. Throughput: 0: 42827.4. Samples: 4385234640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:18,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 18:41:21,152][15401] Updated weights for policy 0, policy_version 267650 (0.0032) [2024-06-22 18:41:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4385292288. Throughput: 0: 42848.0. Samples: 4385365540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 18:41:24,095][15401] Updated weights for policy 0, policy_version 267660 (0.0035) [2024-06-22 18:41:28,396][15132] Fps is (10 sec: 40942.0, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 4385488896. Throughput: 0: 42625.6. Samples: 4385615860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:28,405][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 18:41:28,719][15401] Updated weights for policy 0, policy_version 267670 (0.0028) [2024-06-22 18:41:32,091][15401] Updated weights for policy 0, policy_version 267680 (0.0033) [2024-06-22 18:41:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4385718272. Throughput: 0: 42621.2. Samples: 4385868620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-22 18:41:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 18:41:36,018][15401] Updated weights for policy 0, policy_version 267690 (0.0043) [2024-06-22 18:41:38,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4385914880. Throughput: 0: 42735.6. Samples: 4386000340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:41:38,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-22 18:41:39,449][15401] Updated weights for policy 0, policy_version 267700 (0.0034) [2024-06-22 18:41:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 4386144256. Throughput: 0: 42670.6. Samples: 4386262020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:41:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 18:41:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267709_4386144256.pth... [2024-06-22 18:41:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267082_4375871488.pth [2024-06-22 18:41:43,809][15401] Updated weights for policy 0, policy_version 267710 (0.0028) [2024-06-22 18:41:47,172][15401] Updated weights for policy 0, policy_version 267720 (0.0042) [2024-06-22 18:41:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4386357248. Throughput: 0: 42803.2. Samples: 4386520200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:41:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 18:41:51,362][15401] Updated weights for policy 0, policy_version 267730 (0.0039) [2024-06-22 18:41:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4386553856. Throughput: 0: 42873.3. Samples: 4386651760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:41:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 18:41:54,950][15401] Updated weights for policy 0, policy_version 267740 (0.0039) [2024-06-22 18:41:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4386783232. Throughput: 0: 42797.4. Samples: 4386907360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:41:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 18:41:58,801][15401] Updated weights for policy 0, policy_version 267750 (0.0038) [2024-06-22 18:42:02,422][15401] Updated weights for policy 0, policy_version 267760 (0.0039) [2024-06-22 18:42:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 4387012608. Throughput: 0: 42751.3. Samples: 4387158360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 18:42:07,124][15401] Updated weights for policy 0, policy_version 267770 (0.0029) [2024-06-22 18:42:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4387192832. Throughput: 0: 42781.4. Samples: 4387290700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:42:10,103][15401] Updated weights for policy 0, policy_version 267780 (0.0032) [2024-06-22 18:42:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4387422208. Throughput: 0: 42955.9. Samples: 4387548600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 18:42:14,480][15401] Updated weights for policy 0, policy_version 267790 (0.0044) [2024-06-22 18:42:17,803][15401] Updated weights for policy 0, policy_version 267800 (0.0028) [2024-06-22 18:42:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 4387651584. Throughput: 0: 42887.6. Samples: 4387798560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 18:42:21,958][15401] Updated weights for policy 0, policy_version 267810 (0.0040) [2024-06-22 18:42:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4387848192. Throughput: 0: 43000.6. Samples: 4387935360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 18:42:25,360][15401] Updated weights for policy 0, policy_version 267820 (0.0031) [2024-06-22 18:42:26,412][15349] Signal inference workers to stop experience collection... (64900 times) [2024-06-22 18:42:26,460][15401] InferenceWorker_p0-w0: stopping experience collection (64900 times) [2024-06-22 18:42:26,464][15349] Signal inference workers to resume experience collection... (64900 times) [2024-06-22 18:42:26,475][15401] InferenceWorker_p0-w0: resuming experience collection (64900 times) [2024-06-22 18:42:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 4388061184. Throughput: 0: 42882.3. Samples: 4388191720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 18:42:29,787][15401] Updated weights for policy 0, policy_version 267830 (0.0035) [2024-06-22 18:42:33,059][15401] Updated weights for policy 0, policy_version 267840 (0.0042) [2024-06-22 18:42:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 4388306944. Throughput: 0: 42624.8. Samples: 4388438320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 18:42:37,506][15401] Updated weights for policy 0, policy_version 267850 (0.0029) [2024-06-22 18:42:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4388487168. Throughput: 0: 42672.0. Samples: 4388572100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:38,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 18:42:40,635][15401] Updated weights for policy 0, policy_version 267860 (0.0033) [2024-06-22 18:42:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 4388700160. Throughput: 0: 42701.3. Samples: 4388828920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 18:42:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 18:42:45,191][15401] Updated weights for policy 0, policy_version 267870 (0.0037) [2024-06-22 18:42:48,304][15401] Updated weights for policy 0, policy_version 267880 (0.0039) [2024-06-22 18:42:48,392][15132] Fps is (10 sec: 45876.7, 60 sec: 43143.1, 300 sec: 42764.7). Total num frames: 4388945920. Throughput: 0: 42744.7. Samples: 4389081960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:42:48,392][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 18:42:52,827][15401] Updated weights for policy 0, policy_version 267890 (0.0034) [2024-06-22 18:42:53,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 4389126144. Throughput: 0: 42650.8. Samples: 4389210040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:42:53,391][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 18:42:56,260][15401] Updated weights for policy 0, policy_version 267900 (0.0029) [2024-06-22 18:42:58,389][15132] Fps is (10 sec: 39329.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4389339136. Throughput: 0: 42638.2. Samples: 4389467320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:42:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 18:43:00,413][15401] Updated weights for policy 0, policy_version 267910 (0.0033) [2024-06-22 18:43:03,389][15132] Fps is (10 sec: 42604.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4389552128. Throughput: 0: 42885.6. Samples: 4389728400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 18:43:04,070][15401] Updated weights for policy 0, policy_version 267920 (0.0035) [2024-06-22 18:43:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4389748736. Throughput: 0: 42553.8. Samples: 4389850280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 18:43:08,444][15401] Updated weights for policy 0, policy_version 267930 (0.0036) [2024-06-22 18:43:11,834][15401] Updated weights for policy 0, policy_version 267940 (0.0041) [2024-06-22 18:43:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 4389978112. Throughput: 0: 42530.8. Samples: 4390105700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:13,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 18:43:16,207][15401] Updated weights for policy 0, policy_version 267950 (0.0028) [2024-06-22 18:43:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4390191104. Throughput: 0: 42693.5. Samples: 4390359520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 18:43:19,555][15401] Updated weights for policy 0, policy_version 267960 (0.0027) [2024-06-22 18:43:23,389][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4390387712. Throughput: 0: 42607.2. Samples: 4390489320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 18:43:23,695][15401] Updated weights for policy 0, policy_version 267970 (0.0037) [2024-06-22 18:43:27,092][15401] Updated weights for policy 0, policy_version 267980 (0.0035) [2024-06-22 18:43:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4390633472. Throughput: 0: 42651.9. Samples: 4390748260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 18:43:31,143][15401] Updated weights for policy 0, policy_version 267990 (0.0034) [2024-06-22 18:43:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4390846464. Throughput: 0: 42771.2. Samples: 4391006580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:33,390][15132] Avg episode reward: [(0, '0.118')] [2024-06-22 18:43:34,754][15401] Updated weights for policy 0, policy_version 268000 (0.0027) [2024-06-22 18:43:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 4391043072. Throughput: 0: 42844.5. Samples: 4391137980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 18:43:38,580][15401] Updated weights for policy 0, policy_version 268010 (0.0039) [2024-06-22 18:43:42,121][15401] Updated weights for policy 0, policy_version 268020 (0.0033) [2024-06-22 18:43:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4391272448. Throughput: 0: 42950.6. Samples: 4391400100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 18:43:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268022_4391272448.pth... [2024-06-22 18:43:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267395_4380999680.pth [2024-06-22 18:43:45,975][15401] Updated weights for policy 0, policy_version 268030 (0.0033) [2024-06-22 18:43:47,062][15349] Signal inference workers to stop experience collection... (64950 times) [2024-06-22 18:43:47,120][15401] InferenceWorker_p0-w0: stopping experience collection (64950 times) [2024-06-22 18:43:47,176][15349] Signal inference workers to resume experience collection... (64950 times) [2024-06-22 18:43:47,176][15401] InferenceWorker_p0-w0: resuming experience collection (64950 times) [2024-06-22 18:43:48,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42599.8, 300 sec: 42820.5). Total num frames: 4391501824. Throughput: 0: 42830.5. Samples: 4391655780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:48,391][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 18:43:49,693][15401] Updated weights for policy 0, policy_version 268040 (0.0024) [2024-06-22 18:43:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43145.5, 300 sec: 42820.6). Total num frames: 4391714816. Throughput: 0: 43153.3. Samples: 4391792180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-22 18:43:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 18:43:53,442][15401] Updated weights for policy 0, policy_version 268050 (0.0028) [2024-06-22 18:43:57,425][15401] Updated weights for policy 0, policy_version 268060 (0.0026) [2024-06-22 18:43:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4391911424. Throughput: 0: 43177.7. Samples: 4392048600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:43:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 18:44:01,298][15401] Updated weights for policy 0, policy_version 268070 (0.0041) [2024-06-22 18:44:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 4392157184. Throughput: 0: 43078.5. Samples: 4392298160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:03,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 18:44:05,028][15401] Updated weights for policy 0, policy_version 268080 (0.0040) [2024-06-22 18:44:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42820.7). Total num frames: 4392353792. Throughput: 0: 43239.1. Samples: 4392435080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 18:44:08,799][15401] Updated weights for policy 0, policy_version 268090 (0.0026) [2024-06-22 18:44:13,090][15401] Updated weights for policy 0, policy_version 268100 (0.0023) [2024-06-22 18:44:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43146.1, 300 sec: 42820.5). Total num frames: 4392566784. Throughput: 0: 43323.6. Samples: 4392697820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:44:16,356][15401] Updated weights for policy 0, policy_version 268110 (0.0037) [2024-06-22 18:44:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 42876.4). Total num frames: 4392812544. Throughput: 0: 43272.1. Samples: 4392953820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:44:20,647][15401] Updated weights for policy 0, policy_version 268120 (0.0037) [2024-06-22 18:44:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43688.9, 300 sec: 42875.8). Total num frames: 4393009152. Throughput: 0: 43386.5. Samples: 4393090480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:23,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 18:44:23,725][15401] Updated weights for policy 0, policy_version 268130 (0.0047) [2024-06-22 18:44:28,192][15401] Updated weights for policy 0, policy_version 268140 (0.0032) [2024-06-22 18:44:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 4393222144. Throughput: 0: 43408.6. Samples: 4393353480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 18:44:31,338][15401] Updated weights for policy 0, policy_version 268150 (0.0037) [2024-06-22 18:44:33,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 4393467904. Throughput: 0: 43301.9. Samples: 4393604360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:44:35,702][15401] Updated weights for policy 0, policy_version 268160 (0.0040) [2024-06-22 18:44:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 4393664512. Throughput: 0: 43286.2. Samples: 4393740060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 18:44:38,761][15401] Updated weights for policy 0, policy_version 268170 (0.0040) [2024-06-22 18:44:43,169][15401] Updated weights for policy 0, policy_version 268180 (0.0042) [2024-06-22 18:44:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4393861120. Throughput: 0: 43303.5. Samples: 4393997260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 18:44:46,701][15401] Updated weights for policy 0, policy_version 268190 (0.0041) [2024-06-22 18:44:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4394106880. Throughput: 0: 43455.2. Samples: 4394253540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:48,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 18:44:50,871][15401] Updated weights for policy 0, policy_version 268200 (0.0026) [2024-06-22 18:44:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4394303488. Throughput: 0: 43389.9. Samples: 4394387620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 18:44:54,210][15401] Updated weights for policy 0, policy_version 268210 (0.0033) [2024-06-22 18:44:58,385][15401] Updated weights for policy 0, policy_version 268220 (0.0030) [2024-06-22 18:44:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4394516480. Throughput: 0: 43315.6. Samples: 4394647020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:44:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 18:45:01,771][15401] Updated weights for policy 0, policy_version 268230 (0.0030) [2024-06-22 18:45:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 4394762240. Throughput: 0: 43229.7. Samples: 4394899160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:45:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 18:45:06,019][15401] Updated weights for policy 0, policy_version 268240 (0.0036) [2024-06-22 18:45:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4394958848. Throughput: 0: 43158.3. Samples: 4395032500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 18:45:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 18:45:09,448][15401] Updated weights for policy 0, policy_version 268250 (0.0028) [2024-06-22 18:45:12,722][15349] Signal inference workers to stop experience collection... (65000 times) [2024-06-22 18:45:12,722][15349] Signal inference workers to resume experience collection... (65000 times) [2024-06-22 18:45:12,757][15401] InferenceWorker_p0-w0: stopping experience collection (65000 times) [2024-06-22 18:45:12,757][15401] InferenceWorker_p0-w0: resuming experience collection (65000 times) [2024-06-22 18:45:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4395155456. Throughput: 0: 42985.6. Samples: 4395287840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 18:45:13,631][15401] Updated weights for policy 0, policy_version 268260 (0.0028) [2024-06-22 18:45:17,097][15401] Updated weights for policy 0, policy_version 268270 (0.0037) [2024-06-22 18:45:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 4395417600. Throughput: 0: 43030.2. Samples: 4395540720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:18,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 18:45:21,174][15401] Updated weights for policy 0, policy_version 268280 (0.0030) [2024-06-22 18:45:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 4395581440. Throughput: 0: 42988.8. Samples: 4395674560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:45:24,586][15401] Updated weights for policy 0, policy_version 268290 (0.0029) [2024-06-22 18:45:28,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 4395794432. Throughput: 0: 42832.5. Samples: 4395924820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:28,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 18:45:29,144][15401] Updated weights for policy 0, policy_version 268300 (0.0041) [2024-06-22 18:45:32,352][15401] Updated weights for policy 0, policy_version 268310 (0.0041) [2024-06-22 18:45:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4396040192. Throughput: 0: 42768.0. Samples: 4396178100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 18:45:36,842][15401] Updated weights for policy 0, policy_version 268320 (0.0029) [2024-06-22 18:45:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 4396220416. Throughput: 0: 42669.8. Samples: 4396307760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 18:45:40,133][15401] Updated weights for policy 0, policy_version 268330 (0.0031) [2024-06-22 18:45:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 4396449792. Throughput: 0: 42594.1. Samples: 4396563860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:43,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 18:45:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268339_4396466176.pth... [2024-06-22 18:45:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000267709_4386144256.pth [2024-06-22 18:45:44,489][15401] Updated weights for policy 0, policy_version 268340 (0.0050) [2024-06-22 18:45:47,778][15401] Updated weights for policy 0, policy_version 268350 (0.0033) [2024-06-22 18:45:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4396662784. Throughput: 0: 42614.2. Samples: 4396816800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 18:45:51,969][15401] Updated weights for policy 0, policy_version 268360 (0.0029) [2024-06-22 18:45:53,390][15132] Fps is (10 sec: 40966.0, 60 sec: 42597.6, 300 sec: 42820.4). Total num frames: 4396859392. Throughput: 0: 42544.0. Samples: 4396947020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:53,391][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 18:45:55,495][15401] Updated weights for policy 0, policy_version 268370 (0.0038) [2024-06-22 18:45:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 4397072384. Throughput: 0: 42576.1. Samples: 4397203760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:45:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 18:45:59,490][15401] Updated weights for policy 0, policy_version 268380 (0.0024) [2024-06-22 18:46:03,022][15401] Updated weights for policy 0, policy_version 268390 (0.0027) [2024-06-22 18:46:03,390][15132] Fps is (10 sec: 44241.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4397301760. Throughput: 0: 42628.0. Samples: 4397458980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:46:03,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 18:46:07,062][15401] Updated weights for policy 0, policy_version 268400 (0.0038) [2024-06-22 18:46:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 4397481984. Throughput: 0: 42564.1. Samples: 4397589940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:46:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 18:46:10,570][15401] Updated weights for policy 0, policy_version 268410 (0.0039) [2024-06-22 18:46:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.9). Total num frames: 4397744128. Throughput: 0: 42697.7. Samples: 4397846120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:46:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 18:46:14,592][15401] Updated weights for policy 0, policy_version 268420 (0.0026) [2024-06-22 18:46:18,162][15401] Updated weights for policy 0, policy_version 268430 (0.0033) [2024-06-22 18:46:18,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 4397957120. Throughput: 0: 42667.1. Samples: 4398098120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-22 18:46:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 18:46:22,661][15401] Updated weights for policy 0, policy_version 268440 (0.0027) [2024-06-22 18:46:23,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42821.5). Total num frames: 4398120960. Throughput: 0: 42584.7. Samples: 4398224080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 18:46:26,140][15401] Updated weights for policy 0, policy_version 268450 (0.0032) [2024-06-22 18:46:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42931.7). Total num frames: 4398383104. Throughput: 0: 42580.6. Samples: 4398479880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:28,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 18:46:30,310][15401] Updated weights for policy 0, policy_version 268460 (0.0038) [2024-06-22 18:46:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 4398579712. Throughput: 0: 42758.3. Samples: 4398740920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 18:46:33,612][15349] Signal inference workers to stop experience collection... (65050 times) [2024-06-22 18:46:33,613][15349] Signal inference workers to resume experience collection... (65050 times) [2024-06-22 18:46:33,645][15401] InferenceWorker_p0-w0: stopping experience collection (65050 times) [2024-06-22 18:46:33,646][15401] InferenceWorker_p0-w0: resuming experience collection (65050 times) [2024-06-22 18:46:33,753][15401] Updated weights for policy 0, policy_version 268470 (0.0045) [2024-06-22 18:46:38,022][15401] Updated weights for policy 0, policy_version 268480 (0.0033) [2024-06-22 18:46:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4398776320. Throughput: 0: 42562.7. Samples: 4398862300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 18:46:41,599][15401] Updated weights for policy 0, policy_version 268490 (0.0032) [2024-06-22 18:46:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 4399005696. Throughput: 0: 42511.5. Samples: 4399116780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 18:46:45,599][15401] Updated weights for policy 0, policy_version 268500 (0.0029) [2024-06-22 18:46:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4399202304. Throughput: 0: 42757.2. Samples: 4399383060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 18:46:49,301][15401] Updated weights for policy 0, policy_version 268510 (0.0036) [2024-06-22 18:46:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42599.1, 300 sec: 42820.6). Total num frames: 4399415296. Throughput: 0: 42502.7. Samples: 4399502560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 18:46:53,436][15401] Updated weights for policy 0, policy_version 268520 (0.0040) [2024-06-22 18:46:56,993][15401] Updated weights for policy 0, policy_version 268530 (0.0033) [2024-06-22 18:46:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4399644672. Throughput: 0: 42397.0. Samples: 4399753980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:46:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:47:01,031][15401] Updated weights for policy 0, policy_version 268540 (0.0039) [2024-06-22 18:47:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4399841280. Throughput: 0: 42692.5. Samples: 4400019280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 18:47:04,609][15401] Updated weights for policy 0, policy_version 268550 (0.0030) [2024-06-22 18:47:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4400070656. Throughput: 0: 42612.8. Samples: 4400141660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 18:47:08,559][15401] Updated weights for policy 0, policy_version 268560 (0.0027) [2024-06-22 18:47:12,327][15401] Updated weights for policy 0, policy_version 268570 (0.0028) [2024-06-22 18:47:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4400283648. Throughput: 0: 42666.7. Samples: 4400399880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 18:47:16,097][15401] Updated weights for policy 0, policy_version 268580 (0.0039) [2024-06-22 18:47:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 4400480256. Throughput: 0: 42618.7. Samples: 4400658760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 18:47:20,076][15401] Updated weights for policy 0, policy_version 268590 (0.0049) [2024-06-22 18:47:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4400693248. Throughput: 0: 42648.5. Samples: 4400781480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 18:47:24,204][15401] Updated weights for policy 0, policy_version 268600 (0.0028) [2024-06-22 18:47:27,774][15401] Updated weights for policy 0, policy_version 268610 (0.0023) [2024-06-22 18:47:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4400939008. Throughput: 0: 42876.5. Samples: 4401046220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 18:47:28,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 18:47:31,873][15401] Updated weights for policy 0, policy_version 268620 (0.0035) [2024-06-22 18:47:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 4401119232. Throughput: 0: 42613.8. Samples: 4401300680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 18:47:35,394][15401] Updated weights for policy 0, policy_version 268630 (0.0039) [2024-06-22 18:47:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4401348608. Throughput: 0: 42725.7. Samples: 4401425220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:38,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 18:47:39,532][15401] Updated weights for policy 0, policy_version 268640 (0.0032) [2024-06-22 18:47:43,057][15401] Updated weights for policy 0, policy_version 268650 (0.0042) [2024-06-22 18:47:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.8). Total num frames: 4401577984. Throughput: 0: 42963.5. Samples: 4401687340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 18:47:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268652_4401594368.pth... [2024-06-22 18:47:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268022_4391272448.pth [2024-06-22 18:47:47,157][15401] Updated weights for policy 0, policy_version 268660 (0.0037) [2024-06-22 18:47:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42875.9). Total num frames: 4401774592. Throughput: 0: 42751.0. Samples: 4401943180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:48,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 18:47:50,784][15401] Updated weights for policy 0, policy_version 268670 (0.0027) [2024-06-22 18:47:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4401987584. Throughput: 0: 42712.5. Samples: 4402063720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:53,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 18:47:55,268][15401] Updated weights for policy 0, policy_version 268680 (0.0027) [2024-06-22 18:47:58,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 4402200576. Throughput: 0: 42736.8. Samples: 4402323140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:47:58,393][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 18:47:58,469][15401] Updated weights for policy 0, policy_version 268690 (0.0042) [2024-06-22 18:47:59,967][15349] Signal inference workers to stop experience collection... (65100 times) [2024-06-22 18:47:59,967][15349] Signal inference workers to resume experience collection... (65100 times) [2024-06-22 18:47:59,999][15401] InferenceWorker_p0-w0: stopping experience collection (65100 times) [2024-06-22 18:47:59,999][15401] InferenceWorker_p0-w0: resuming experience collection (65100 times) [2024-06-22 18:48:02,947][15401] Updated weights for policy 0, policy_version 268700 (0.0037) [2024-06-22 18:48:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4402397184. Throughput: 0: 42689.4. Samples: 4402579780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 18:48:06,193][15401] Updated weights for policy 0, policy_version 268710 (0.0041) [2024-06-22 18:48:08,396][15132] Fps is (10 sec: 44219.6, 60 sec: 42867.1, 300 sec: 42931.0). Total num frames: 4402642944. Throughput: 0: 42612.2. Samples: 4402699300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:08,396][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 18:48:10,517][15401] Updated weights for policy 0, policy_version 268720 (0.0043) [2024-06-22 18:48:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4402823168. Throughput: 0: 42497.3. Samples: 4402958600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 18:48:13,854][15401] Updated weights for policy 0, policy_version 268730 (0.0027) [2024-06-22 18:48:18,008][15401] Updated weights for policy 0, policy_version 268740 (0.0022) [2024-06-22 18:48:18,389][15132] Fps is (10 sec: 39346.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4403036160. Throughput: 0: 42592.2. Samples: 4403217320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:18,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-22 18:48:21,366][15401] Updated weights for policy 0, policy_version 268750 (0.0027) [2024-06-22 18:48:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4403281920. Throughput: 0: 42718.3. Samples: 4403347540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:48:25,938][15401] Updated weights for policy 0, policy_version 268760 (0.0039) [2024-06-22 18:48:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4403462144. Throughput: 0: 42512.1. Samples: 4403600380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:28,395][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 18:48:29,108][15401] Updated weights for policy 0, policy_version 268770 (0.0032) [2024-06-22 18:48:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 4403675136. Throughput: 0: 42365.5. Samples: 4403849520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 18:48:33,445][15401] Updated weights for policy 0, policy_version 268780 (0.0038) [2024-06-22 18:48:37,065][15401] Updated weights for policy 0, policy_version 268790 (0.0030) [2024-06-22 18:48:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4403904512. Throughput: 0: 42545.0. Samples: 4403978240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 18:48:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 18:48:40,865][15401] Updated weights for policy 0, policy_version 268800 (0.0037) [2024-06-22 18:48:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4404117504. Throughput: 0: 42600.0. Samples: 4404240040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:48:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 18:48:44,876][15401] Updated weights for policy 0, policy_version 268810 (0.0033) [2024-06-22 18:48:48,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 4404330496. Throughput: 0: 42408.1. Samples: 4404488160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:48:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 18:48:48,995][15401] Updated weights for policy 0, policy_version 268820 (0.0027) [2024-06-22 18:48:52,441][15401] Updated weights for policy 0, policy_version 268830 (0.0032) [2024-06-22 18:48:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4404543488. Throughput: 0: 42667.4. Samples: 4404619060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:48:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 18:48:56,477][15401] Updated weights for policy 0, policy_version 268840 (0.0047) [2024-06-22 18:48:58,389][15132] Fps is (10 sec: 39322.9, 60 sec: 42054.0, 300 sec: 42598.8). Total num frames: 4404723712. Throughput: 0: 42664.5. Samples: 4404878500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:48:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 18:48:59,992][15401] Updated weights for policy 0, policy_version 268850 (0.0032) [2024-06-22 18:49:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4404969472. Throughput: 0: 42527.9. Samples: 4405131080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:03,391][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 18:49:03,979][15401] Updated weights for policy 0, policy_version 268860 (0.0038) [2024-06-22 18:49:07,648][15401] Updated weights for policy 0, policy_version 268870 (0.0034) [2024-06-22 18:49:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 4405198848. Throughput: 0: 42515.1. Samples: 4405260720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 18:49:11,430][15401] Updated weights for policy 0, policy_version 268880 (0.0029) [2024-06-22 18:49:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4405379072. Throughput: 0: 42675.1. Samples: 4405520760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 18:49:15,365][15401] Updated weights for policy 0, policy_version 268890 (0.0040) [2024-06-22 18:49:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 4405641216. Throughput: 0: 42624.0. Samples: 4405767600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 18:49:19,176][15401] Updated weights for policy 0, policy_version 268900 (0.0028) [2024-06-22 18:49:23,047][15401] Updated weights for policy 0, policy_version 268910 (0.0050) [2024-06-22 18:49:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4405821440. Throughput: 0: 42902.1. Samples: 4405908840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 18:49:25,314][15349] Signal inference workers to stop experience collection... (65150 times) [2024-06-22 18:49:25,341][15401] InferenceWorker_p0-w0: stopping experience collection (65150 times) [2024-06-22 18:49:25,376][15349] Signal inference workers to resume experience collection... (65150 times) [2024-06-22 18:49:25,376][15401] InferenceWorker_p0-w0: resuming experience collection (65150 times) [2024-06-22 18:49:26,642][15401] Updated weights for policy 0, policy_version 268920 (0.0024) [2024-06-22 18:49:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4406034432. Throughput: 0: 42726.6. Samples: 4406162740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 18:49:30,575][15401] Updated weights for policy 0, policy_version 268930 (0.0029) [2024-06-22 18:49:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 4406296576. Throughput: 0: 42784.6. Samples: 4406413460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 18:49:34,090][15401] Updated weights for policy 0, policy_version 268940 (0.0028) [2024-06-22 18:49:38,359][15401] Updated weights for policy 0, policy_version 268950 (0.0038) [2024-06-22 18:49:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 4406476800. Throughput: 0: 43086.7. Samples: 4406557960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 18:49:41,518][15401] Updated weights for policy 0, policy_version 268960 (0.0027) [2024-06-22 18:49:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4406689792. Throughput: 0: 42978.1. Samples: 4406812520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 18:49:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268963_4406689792.pth... [2024-06-22 18:49:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268339_4396466176.pth [2024-06-22 18:49:45,981][15401] Updated weights for policy 0, policy_version 268970 (0.0032) [2024-06-22 18:49:48,389][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 4406919168. Throughput: 0: 42861.8. Samples: 4407059860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 18:49:49,039][15401] Updated weights for policy 0, policy_version 268980 (0.0036) [2024-06-22 18:49:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4407083008. Throughput: 0: 42934.2. Samples: 4407192760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 18:49:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 18:49:53,944][15401] Updated weights for policy 0, policy_version 268990 (0.0034) [2024-06-22 18:49:56,560][15401] Updated weights for policy 0, policy_version 269000 (0.0029) [2024-06-22 18:49:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43415.8, 300 sec: 42598.1). Total num frames: 4407328768. Throughput: 0: 42681.3. Samples: 4407441520. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:49:58,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 18:50:01,921][15401] Updated weights for policy 0, policy_version 269010 (0.0038) [2024-06-22 18:50:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4407558144. Throughput: 0: 42947.6. Samples: 4407700240. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 18:50:04,161][15401] Updated weights for policy 0, policy_version 269020 (0.0041) [2024-06-22 18:50:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4407738368. Throughput: 0: 42724.5. Samples: 4407831440. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 18:50:09,486][15401] Updated weights for policy 0, policy_version 269030 (0.0030) [2024-06-22 18:50:12,550][15401] Updated weights for policy 0, policy_version 269040 (0.0039) [2024-06-22 18:50:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 4407984128. Throughput: 0: 42666.2. Samples: 4408082720. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 18:50:16,963][15401] Updated weights for policy 0, policy_version 269050 (0.0033) [2024-06-22 18:50:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4408197120. Throughput: 0: 42835.2. Samples: 4408341040. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 18:50:20,094][15401] Updated weights for policy 0, policy_version 269060 (0.0041) [2024-06-22 18:50:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 4408360960. Throughput: 0: 42546.6. Samples: 4408472560. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 18:50:24,659][15401] Updated weights for policy 0, policy_version 269070 (0.0037) [2024-06-22 18:50:27,477][15349] Signal inference workers to stop experience collection... (65200 times) [2024-06-22 18:50:27,478][15349] Signal inference workers to resume experience collection... (65200 times) [2024-06-22 18:50:27,508][15401] InferenceWorker_p0-w0: stopping experience collection (65200 times) [2024-06-22 18:50:27,508][15401] InferenceWorker_p0-w0: resuming experience collection (65200 times) [2024-06-22 18:50:27,620][15401] Updated weights for policy 0, policy_version 269080 (0.0040) [2024-06-22 18:50:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4408623104. Throughput: 0: 42485.3. Samples: 4408724360. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:28,391][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 18:50:32,312][15401] Updated weights for policy 0, policy_version 269090 (0.0036) [2024-06-22 18:50:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 4408819712. Throughput: 0: 42834.8. Samples: 4408987420. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 18:50:35,167][15401] Updated weights for policy 0, policy_version 269100 (0.0040) [2024-06-22 18:50:38,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 4409016320. Throughput: 0: 42680.9. Samples: 4409113400. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 18:50:39,907][15401] Updated weights for policy 0, policy_version 269110 (0.0029) [2024-06-22 18:50:43,066][15401] Updated weights for policy 0, policy_version 269120 (0.0034) [2024-06-22 18:50:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4409262080. Throughput: 0: 42862.3. Samples: 4409370220. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 18:50:47,478][15401] Updated weights for policy 0, policy_version 269130 (0.0035) [2024-06-22 18:50:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42654.1). Total num frames: 4409442304. Throughput: 0: 42883.9. Samples: 4409630020. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 18:50:50,922][15401] Updated weights for policy 0, policy_version 269140 (0.0030) [2024-06-22 18:50:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4409655296. Throughput: 0: 42733.2. Samples: 4409754440. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 18:50:55,044][15401] Updated weights for policy 0, policy_version 269150 (0.0033) [2024-06-22 18:50:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 4409901056. Throughput: 0: 42823.6. Samples: 4410009780. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:50:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 18:50:58,721][15401] Updated weights for policy 0, policy_version 269160 (0.0028) [2024-06-22 18:51:02,669][15401] Updated weights for policy 0, policy_version 269170 (0.0034) [2024-06-22 18:51:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4410097664. Throughput: 0: 42981.3. Samples: 4410275200. Policy #0 lag: (min: 2.0, avg: 10.3, max: 20.0) [2024-06-22 18:51:03,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 18:51:06,249][15401] Updated weights for policy 0, policy_version 269180 (0.0031) [2024-06-22 18:51:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4410310656. Throughput: 0: 42947.1. Samples: 4410405180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 18:51:10,310][15401] Updated weights for policy 0, policy_version 269190 (0.0037) [2024-06-22 18:51:13,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 4410540032. Throughput: 0: 42824.3. Samples: 4410651540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:13,401][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 18:51:14,015][15401] Updated weights for policy 0, policy_version 269200 (0.0043) [2024-06-22 18:51:17,954][15401] Updated weights for policy 0, policy_version 269210 (0.0038) [2024-06-22 18:51:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4410736640. Throughput: 0: 42781.4. Samples: 4410912580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 18:51:21,579][15401] Updated weights for policy 0, policy_version 269220 (0.0025) [2024-06-22 18:51:23,392][15132] Fps is (10 sec: 40959.2, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 4410949632. Throughput: 0: 42824.3. Samples: 4411040600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:23,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 18:51:25,860][15401] Updated weights for policy 0, policy_version 269230 (0.0034) [2024-06-22 18:51:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4411195392. Throughput: 0: 42718.7. Samples: 4411292560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 18:51:29,406][15401] Updated weights for policy 0, policy_version 269240 (0.0035) [2024-06-22 18:51:33,395][15132] Fps is (10 sec: 42585.2, 60 sec: 42594.4, 300 sec: 42708.7). Total num frames: 4411375616. Throughput: 0: 42771.7. Samples: 4411554980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:33,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 18:51:33,465][15401] Updated weights for policy 0, policy_version 269250 (0.0040) [2024-06-22 18:51:36,822][15401] Updated weights for policy 0, policy_version 269260 (0.0028) [2024-06-22 18:51:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4411604992. Throughput: 0: 42717.0. Samples: 4411676700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 18:51:40,967][15401] Updated weights for policy 0, policy_version 269270 (0.0032) [2024-06-22 18:51:43,390][15132] Fps is (10 sec: 44260.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4411817984. Throughput: 0: 42839.4. Samples: 4411937560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:43,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-22 18:51:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269277_4411834368.pth... [2024-06-22 18:51:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268652_4401594368.pth [2024-06-22 18:51:44,434][15401] Updated weights for policy 0, policy_version 269280 (0.0035) [2024-06-22 18:51:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4412014592. Throughput: 0: 42682.2. Samples: 4412195900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:48,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 18:51:48,576][15401] Updated weights for policy 0, policy_version 269290 (0.0044) [2024-06-22 18:51:51,890][15401] Updated weights for policy 0, policy_version 269300 (0.0027) [2024-06-22 18:51:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4412243968. Throughput: 0: 42600.4. Samples: 4412322200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 18:51:56,095][15401] Updated weights for policy 0, policy_version 269310 (0.0033) [2024-06-22 18:51:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4412473344. Throughput: 0: 42903.9. Samples: 4412582120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:51:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 18:51:59,537][15401] Updated weights for policy 0, policy_version 269320 (0.0023) [2024-06-22 18:52:03,071][15349] Signal inference workers to stop experience collection... (65250 times) [2024-06-22 18:52:03,071][15349] Signal inference workers to resume experience collection... (65250 times) [2024-06-22 18:52:03,083][15401] InferenceWorker_p0-w0: stopping experience collection (65250 times) [2024-06-22 18:52:03,083][15401] InferenceWorker_p0-w0: resuming experience collection (65250 times) [2024-06-22 18:52:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4412669952. Throughput: 0: 42913.7. Samples: 4412843700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:52:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 18:52:04,078][15401] Updated weights for policy 0, policy_version 269330 (0.0030) [2024-06-22 18:52:07,256][15401] Updated weights for policy 0, policy_version 269340 (0.0044) [2024-06-22 18:52:08,390][15132] Fps is (10 sec: 40955.9, 60 sec: 42870.8, 300 sec: 42709.3). Total num frames: 4412882944. Throughput: 0: 42809.3. Samples: 4412966960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:52:08,391][15132] Avg episode reward: [(0, '0.000')] [2024-06-22 18:52:11,663][15401] Updated weights for policy 0, policy_version 269350 (0.0027) [2024-06-22 18:52:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42871.3, 300 sec: 42820.2). Total num frames: 4413112320. Throughput: 0: 42827.4. Samples: 4413219900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 18:52:13,393][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 18:52:15,383][15401] Updated weights for policy 0, policy_version 269360 (0.0045) [2024-06-22 18:52:18,390][15132] Fps is (10 sec: 40963.6, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 4413292544. Throughput: 0: 42798.0. Samples: 4413480660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 18:52:19,197][15401] Updated weights for policy 0, policy_version 269370 (0.0037) [2024-06-22 18:52:23,087][15401] Updated weights for policy 0, policy_version 269380 (0.0037) [2024-06-22 18:52:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 4413521920. Throughput: 0: 42780.9. Samples: 4413601840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 18:52:27,065][15401] Updated weights for policy 0, policy_version 269390 (0.0038) [2024-06-22 18:52:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4413751296. Throughput: 0: 42585.3. Samples: 4413853900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:28,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 18:52:30,730][15401] Updated weights for policy 0, policy_version 269400 (0.0037) [2024-06-22 18:52:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42602.2, 300 sec: 42653.9). Total num frames: 4413931520. Throughput: 0: 42631.4. Samples: 4414114320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 18:52:34,911][15401] Updated weights for policy 0, policy_version 269410 (0.0048) [2024-06-22 18:52:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4414160896. Throughput: 0: 42515.7. Samples: 4414235400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 18:52:38,463][15401] Updated weights for policy 0, policy_version 269420 (0.0030) [2024-06-22 18:52:42,446][15401] Updated weights for policy 0, policy_version 269430 (0.0028) [2024-06-22 18:52:43,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 4414390272. Throughput: 0: 42670.3. Samples: 4414502280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 18:52:45,983][15401] Updated weights for policy 0, policy_version 269440 (0.0037) [2024-06-22 18:52:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4414586880. Throughput: 0: 42406.7. Samples: 4414752000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 18:52:50,330][15401] Updated weights for policy 0, policy_version 269450 (0.0031) [2024-06-22 18:52:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 4414799872. Throughput: 0: 42406.6. Samples: 4414875220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 18:52:53,963][15401] Updated weights for policy 0, policy_version 269460 (0.0039) [2024-06-22 18:52:58,221][15401] Updated weights for policy 0, policy_version 269470 (0.0030) [2024-06-22 18:52:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4415012864. Throughput: 0: 42620.5. Samples: 4415137720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:52:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 18:53:01,584][15401] Updated weights for policy 0, policy_version 269480 (0.0026) [2024-06-22 18:53:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 4415225856. Throughput: 0: 42356.9. Samples: 4415386720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:03,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 18:53:05,849][15401] Updated weights for policy 0, policy_version 269490 (0.0041) [2024-06-22 18:53:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 4415438848. Throughput: 0: 42473.4. Samples: 4415513140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 18:53:09,257][15401] Updated weights for policy 0, policy_version 269500 (0.0035) [2024-06-22 18:53:13,365][15401] Updated weights for policy 0, policy_version 269510 (0.0027) [2024-06-22 18:53:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 4415651840. Throughput: 0: 42757.9. Samples: 4415778000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 18:53:16,877][15401] Updated weights for policy 0, policy_version 269520 (0.0026) [2024-06-22 18:53:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4415848448. Throughput: 0: 42475.2. Samples: 4416025700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 18:53:21,110][15349] Signal inference workers to stop experience collection... (65300 times) [2024-06-22 18:53:21,119][15349] Signal inference workers to resume experience collection... (65300 times) [2024-06-22 18:53:21,122][15401] InferenceWorker_p0-w0: stopping experience collection (65300 times) [2024-06-22 18:53:21,127][15401] Updated weights for policy 0, policy_version 269530 (0.0043) [2024-06-22 18:53:21,141][15401] InferenceWorker_p0-w0: resuming experience collection (65300 times) [2024-06-22 18:53:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4416094208. Throughput: 0: 42658.6. Samples: 4416155040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:23,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 18:53:24,563][15401] Updated weights for policy 0, policy_version 269540 (0.0037) [2024-06-22 18:53:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4416274432. Throughput: 0: 42573.6. Samples: 4416418100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 18:53:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 18:53:28,737][15401] Updated weights for policy 0, policy_version 269550 (0.0044) [2024-06-22 18:53:32,187][15401] Updated weights for policy 0, policy_version 269560 (0.0038) [2024-06-22 18:53:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4416487424. Throughput: 0: 42532.8. Samples: 4416665980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 18:53:36,459][15401] Updated weights for policy 0, policy_version 269570 (0.0040) [2024-06-22 18:53:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4416733184. Throughput: 0: 42719.1. Samples: 4416797580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 18:53:39,897][15401] Updated weights for policy 0, policy_version 269580 (0.0044) [2024-06-22 18:53:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4416897024. Throughput: 0: 42621.7. Samples: 4417055700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 18:53:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269587_4416913408.pth... [2024-06-22 18:53:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000268963_4406689792.pth [2024-06-22 18:53:44,062][15401] Updated weights for policy 0, policy_version 269590 (0.0025) [2024-06-22 18:53:47,559][15401] Updated weights for policy 0, policy_version 269600 (0.0033) [2024-06-22 18:53:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4417142784. Throughput: 0: 42537.9. Samples: 4417300920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:53:51,889][15401] Updated weights for policy 0, policy_version 269610 (0.0034) [2024-06-22 18:53:53,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4417372160. Throughput: 0: 42678.1. Samples: 4417433660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 18:53:55,345][15401] Updated weights for policy 0, policy_version 269620 (0.0027) [2024-06-22 18:53:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4417536000. Throughput: 0: 42473.7. Samples: 4417689320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:53:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 18:53:59,529][15401] Updated weights for policy 0, policy_version 269630 (0.0036) [2024-06-22 18:54:03,134][15401] Updated weights for policy 0, policy_version 269640 (0.0047) [2024-06-22 18:54:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4417798144. Throughput: 0: 42342.6. Samples: 4417931120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 18:54:07,532][15401] Updated weights for policy 0, policy_version 269650 (0.0030) [2024-06-22 18:54:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4417994752. Throughput: 0: 42540.8. Samples: 4418069380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 18:54:10,753][15401] Updated weights for policy 0, policy_version 269660 (0.0028) [2024-06-22 18:54:13,390][15132] Fps is (10 sec: 36044.4, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 4418158592. Throughput: 0: 42234.2. Samples: 4418318640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:13,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 18:54:15,113][15401] Updated weights for policy 0, policy_version 269670 (0.0030) [2024-06-22 18:54:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4418420736. Throughput: 0: 41999.7. Samples: 4418555960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 18:54:18,491][15401] Updated weights for policy 0, policy_version 269680 (0.0040) [2024-06-22 18:54:22,328][15349] Signal inference workers to stop experience collection... (65350 times) [2024-06-22 18:54:22,363][15401] InferenceWorker_p0-w0: stopping experience collection (65350 times) [2024-06-22 18:54:22,390][15349] Signal inference workers to resume experience collection... (65350 times) [2024-06-22 18:54:22,396][15401] InferenceWorker_p0-w0: resuming experience collection (65350 times) [2024-06-22 18:54:22,695][15401] Updated weights for policy 0, policy_version 269690 (0.0027) [2024-06-22 18:54:23,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4418633728. Throughput: 0: 42332.6. Samples: 4418702540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 18:54:26,227][15401] Updated weights for policy 0, policy_version 269700 (0.0033) [2024-06-22 18:54:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4418797568. Throughput: 0: 42184.1. Samples: 4418953980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 18:54:30,542][15401] Updated weights for policy 0, policy_version 269710 (0.0027) [2024-06-22 18:54:33,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 4419059712. Throughput: 0: 42193.5. Samples: 4419199900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:33,396][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 18:54:34,099][15401] Updated weights for policy 0, policy_version 269720 (0.0043) [2024-06-22 18:54:38,346][15401] Updated weights for policy 0, policy_version 269730 (0.0036) [2024-06-22 18:54:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4419256320. Throughput: 0: 42365.8. Samples: 4419340120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-22 18:54:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 18:54:41,958][15401] Updated weights for policy 0, policy_version 269740 (0.0034) [2024-06-22 18:54:43,390][15132] Fps is (10 sec: 39346.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4419452928. Throughput: 0: 42114.6. Samples: 4419584480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:54:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 18:54:46,008][15401] Updated weights for policy 0, policy_version 269750 (0.0038) [2024-06-22 18:54:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4419698688. Throughput: 0: 42406.2. Samples: 4419839500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:54:48,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 18:54:49,989][15401] Updated weights for policy 0, policy_version 269760 (0.0027) [2024-06-22 18:54:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41777.5, 300 sec: 42542.9). Total num frames: 4419878912. Throughput: 0: 42290.6. Samples: 4419972560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:54:53,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 18:54:53,589][15401] Updated weights for policy 0, policy_version 269770 (0.0042) [2024-06-22 18:54:57,534][15401] Updated weights for policy 0, policy_version 269780 (0.0026) [2024-06-22 18:54:58,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4420091904. Throughput: 0: 42354.4. Samples: 4420224580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:54:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 18:55:01,321][15401] Updated weights for policy 0, policy_version 269790 (0.0037) [2024-06-22 18:55:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4420321280. Throughput: 0: 42741.6. Samples: 4420479340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 18:55:05,127][15401] Updated weights for policy 0, policy_version 269800 (0.0041) [2024-06-22 18:55:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 4420517888. Throughput: 0: 42387.1. Samples: 4420609960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 18:55:08,916][15401] Updated weights for policy 0, policy_version 269810 (0.0027) [2024-06-22 18:55:12,718][15401] Updated weights for policy 0, policy_version 269820 (0.0032) [2024-06-22 18:55:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4420747264. Throughput: 0: 42406.5. Samples: 4420862280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 18:55:16,949][15401] Updated weights for policy 0, policy_version 269830 (0.0036) [2024-06-22 18:55:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4420976640. Throughput: 0: 42503.9. Samples: 4421112300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:18,391][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 18:55:20,394][15401] Updated weights for policy 0, policy_version 269840 (0.0032) [2024-06-22 18:55:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4421140480. Throughput: 0: 42340.1. Samples: 4421245420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:23,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 18:55:24,459][15401] Updated weights for policy 0, policy_version 269850 (0.0033) [2024-06-22 18:55:28,103][15401] Updated weights for policy 0, policy_version 269860 (0.0036) [2024-06-22 18:55:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 4421386240. Throughput: 0: 42516.4. Samples: 4421497720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 18:55:32,438][15401] Updated weights for policy 0, policy_version 269870 (0.0034) [2024-06-22 18:55:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42056.8, 300 sec: 42598.4). Total num frames: 4421582848. Throughput: 0: 42454.8. Samples: 4421749860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 18:55:35,696][15401] Updated weights for policy 0, policy_version 269880 (0.0039) [2024-06-22 18:55:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4421795840. Throughput: 0: 42354.3. Samples: 4421878400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 18:55:39,864][15401] Updated weights for policy 0, policy_version 269890 (0.0046) [2024-06-22 18:55:43,280][15401] Updated weights for policy 0, policy_version 269900 (0.0029) [2024-06-22 18:55:43,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 4422041600. Throughput: 0: 42491.0. Samples: 4422136780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:43,393][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 18:55:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269900_4422041600.pth... [2024-06-22 18:55:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269277_4411834368.pth [2024-06-22 18:55:47,511][15401] Updated weights for policy 0, policy_version 269910 (0.0033) [2024-06-22 18:55:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 4422238208. Throughput: 0: 42406.7. Samples: 4422387640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 18:55:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 18:55:51,163][15401] Updated weights for policy 0, policy_version 269920 (0.0032) [2024-06-22 18:55:53,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 4422418432. Throughput: 0: 42320.0. Samples: 4422514360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:55:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 18:55:55,134][15401] Updated weights for policy 0, policy_version 269930 (0.0031) [2024-06-22 18:55:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4422664192. Throughput: 0: 42470.8. Samples: 4422773460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:55:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 18:55:58,812][15401] Updated weights for policy 0, policy_version 269940 (0.0038) [2024-06-22 18:56:03,081][15401] Updated weights for policy 0, policy_version 269950 (0.0035) [2024-06-22 18:56:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4422877184. Throughput: 0: 42578.0. Samples: 4423028320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 18:56:06,611][15401] Updated weights for policy 0, policy_version 269960 (0.0040) [2024-06-22 18:56:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 4423073792. Throughput: 0: 42449.7. Samples: 4423155660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:08,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-22 18:56:10,355][15349] Signal inference workers to stop experience collection... (65400 times) [2024-06-22 18:56:10,356][15349] Signal inference workers to resume experience collection... (65400 times) [2024-06-22 18:56:10,390][15401] InferenceWorker_p0-w0: stopping experience collection (65400 times) [2024-06-22 18:56:10,390][15401] InferenceWorker_p0-w0: resuming experience collection (65400 times) [2024-06-22 18:56:10,490][15401] Updated weights for policy 0, policy_version 269970 (0.0037) [2024-06-22 18:56:13,393][15132] Fps is (10 sec: 40946.2, 60 sec: 42322.9, 300 sec: 42542.3). Total num frames: 4423286784. Throughput: 0: 42505.7. Samples: 4423410620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:13,394][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 18:56:14,433][15401] Updated weights for policy 0, policy_version 269980 (0.0034) [2024-06-22 18:56:18,143][15401] Updated weights for policy 0, policy_version 269990 (0.0033) [2024-06-22 18:56:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42598.7). Total num frames: 4423516160. Throughput: 0: 42602.1. Samples: 4423666960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 18:56:22,020][15401] Updated weights for policy 0, policy_version 270000 (0.0035) [2024-06-22 18:56:23,392][15132] Fps is (10 sec: 42604.7, 60 sec: 42870.0, 300 sec: 42431.5). Total num frames: 4423712768. Throughput: 0: 42669.3. Samples: 4423798600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:23,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 18:56:25,905][15401] Updated weights for policy 0, policy_version 270010 (0.0037) [2024-06-22 18:56:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42543.7). Total num frames: 4423925760. Throughput: 0: 42540.1. Samples: 4424050980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 18:56:29,511][15401] Updated weights for policy 0, policy_version 270020 (0.0038) [2024-06-22 18:56:33,390][15132] Fps is (10 sec: 42606.3, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 4424138752. Throughput: 0: 42699.0. Samples: 4424309100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 18:56:33,735][15401] Updated weights for policy 0, policy_version 270030 (0.0045) [2024-06-22 18:56:37,470][15401] Updated weights for policy 0, policy_version 270040 (0.0036) [2024-06-22 18:56:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4424351744. Throughput: 0: 42808.9. Samples: 4424440760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:38,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-22 18:56:41,603][15401] Updated weights for policy 0, policy_version 270050 (0.0037) [2024-06-22 18:56:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 4424581120. Throughput: 0: 42689.3. Samples: 4424694480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:43,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-22 18:56:45,208][15401] Updated weights for policy 0, policy_version 270060 (0.0035) [2024-06-22 18:56:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4424777728. Throughput: 0: 42586.3. Samples: 4424944700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 18:56:49,380][15401] Updated weights for policy 0, policy_version 270070 (0.0037) [2024-06-22 18:56:52,904][15401] Updated weights for policy 0, policy_version 270080 (0.0041) [2024-06-22 18:56:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 4425007104. Throughput: 0: 42460.0. Samples: 4425066360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 18:56:57,039][15401] Updated weights for policy 0, policy_version 270090 (0.0030) [2024-06-22 18:56:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4425203712. Throughput: 0: 42697.1. Samples: 4425331840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:56:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 18:57:00,695][15401] Updated weights for policy 0, policy_version 270100 (0.0033) [2024-06-22 18:57:03,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.5, 300 sec: 42487.5). Total num frames: 4425416704. Throughput: 0: 42724.2. Samples: 4425589540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 18:57:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 18:57:04,774][15401] Updated weights for policy 0, policy_version 270110 (0.0039) [2024-06-22 18:57:08,328][15401] Updated weights for policy 0, policy_version 270120 (0.0042) [2024-06-22 18:57:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 4425646080. Throughput: 0: 42693.5. Samples: 4425719720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 18:57:12,312][15401] Updated weights for policy 0, policy_version 270130 (0.0028) [2024-06-22 18:57:13,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42600.6, 300 sec: 42542.8). Total num frames: 4425842688. Throughput: 0: 42776.4. Samples: 4425975940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:13,391][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 18:57:13,560][15349] Signal inference workers to stop experience collection... (65450 times) [2024-06-22 18:57:13,560][15349] Signal inference workers to resume experience collection... (65450 times) [2024-06-22 18:57:13,603][15401] InferenceWorker_p0-w0: stopping experience collection (65450 times) [2024-06-22 18:57:13,603][15401] InferenceWorker_p0-w0: resuming experience collection (65450 times) [2024-06-22 18:57:15,983][15401] Updated weights for policy 0, policy_version 270140 (0.0032) [2024-06-22 18:57:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4426055680. Throughput: 0: 42692.5. Samples: 4426230260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 18:57:20,101][15401] Updated weights for policy 0, policy_version 270150 (0.0038) [2024-06-22 18:57:23,389][15132] Fps is (10 sec: 42600.6, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 4426268672. Throughput: 0: 42586.2. Samples: 4426357140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 18:57:23,906][15401] Updated weights for policy 0, policy_version 270160 (0.0034) [2024-06-22 18:57:27,656][15401] Updated weights for policy 0, policy_version 270170 (0.0046) [2024-06-22 18:57:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4426481664. Throughput: 0: 42701.7. Samples: 4426616060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 18:57:31,943][15401] Updated weights for policy 0, policy_version 270180 (0.0027) [2024-06-22 18:57:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 4426694656. Throughput: 0: 42791.6. Samples: 4426870320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 18:57:35,379][15401] Updated weights for policy 0, policy_version 270190 (0.0050) [2024-06-22 18:57:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 4426924032. Throughput: 0: 42832.9. Samples: 4426993840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 18:57:39,448][15401] Updated weights for policy 0, policy_version 270200 (0.0028) [2024-06-22 18:57:42,990][15401] Updated weights for policy 0, policy_version 270210 (0.0032) [2024-06-22 18:57:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 4427137024. Throughput: 0: 42621.2. Samples: 4427249800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 18:57:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270211_4427137024.pth... [2024-06-22 18:57:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269587_4416913408.pth [2024-06-22 18:57:46,993][15401] Updated weights for policy 0, policy_version 270220 (0.0032) [2024-06-22 18:57:48,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42323.7, 300 sec: 42431.5). Total num frames: 4427317248. Throughput: 0: 42711.0. Samples: 4427511640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:48,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 18:57:50,470][15401] Updated weights for policy 0, policy_version 270230 (0.0042) [2024-06-22 18:57:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4427579392. Throughput: 0: 42492.4. Samples: 4427631880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 18:57:54,409][15401] Updated weights for policy 0, policy_version 270240 (0.0048) [2024-06-22 18:57:58,263][15401] Updated weights for policy 0, policy_version 270250 (0.0034) [2024-06-22 18:57:58,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4427776000. Throughput: 0: 42621.8. Samples: 4427893900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:57:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 18:58:01,926][15401] Updated weights for policy 0, policy_version 270260 (0.0034) [2024-06-22 18:58:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4427956224. Throughput: 0: 42655.1. Samples: 4428149740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:58:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 18:58:05,904][15401] Updated weights for policy 0, policy_version 270270 (0.0041) [2024-06-22 18:58:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4428218368. Throughput: 0: 42611.4. Samples: 4428274660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:58:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 18:58:09,362][15401] Updated weights for policy 0, policy_version 270280 (0.0024) [2024-06-22 18:58:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4428414976. Throughput: 0: 42674.5. Samples: 4428536420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 18:58:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 18:58:13,480][15401] Updated weights for policy 0, policy_version 270290 (0.0028) [2024-06-22 18:58:17,808][15401] Updated weights for policy 0, policy_version 270300 (0.0031) [2024-06-22 18:58:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4428611584. Throughput: 0: 42798.1. Samples: 4428796240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:18,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 18:58:21,230][15401] Updated weights for policy 0, policy_version 270310 (0.0026) [2024-06-22 18:58:23,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 4428840960. Throughput: 0: 42780.1. Samples: 4428919040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:23,392][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 18:58:25,196][15401] Updated weights for policy 0, policy_version 270320 (0.0044) [2024-06-22 18:58:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4429021184. Throughput: 0: 42828.4. Samples: 4429177080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 18:58:28,905][15401] Updated weights for policy 0, policy_version 270330 (0.0032) [2024-06-22 18:58:32,639][15401] Updated weights for policy 0, policy_version 270340 (0.0037) [2024-06-22 18:58:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4429250560. Throughput: 0: 42799.7. Samples: 4429437520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 18:58:36,699][15401] Updated weights for policy 0, policy_version 270350 (0.0026) [2024-06-22 18:58:36,721][15349] Signal inference workers to stop experience collection... (65500 times) [2024-06-22 18:58:36,722][15349] Signal inference workers to resume experience collection... (65500 times) [2024-06-22 18:58:36,737][15401] InferenceWorker_p0-w0: stopping experience collection (65500 times) [2024-06-22 18:58:36,738][15401] InferenceWorker_p0-w0: resuming experience collection (65500 times) [2024-06-22 18:58:38,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4429496320. Throughput: 0: 43034.2. Samples: 4429568420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 18:58:40,265][15401] Updated weights for policy 0, policy_version 270360 (0.0023) [2024-06-22 18:58:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4429676544. Throughput: 0: 42926.5. Samples: 4429825600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 18:58:44,259][15401] Updated weights for policy 0, policy_version 270370 (0.0037) [2024-06-22 18:58:48,031][15401] Updated weights for policy 0, policy_version 270380 (0.0039) [2024-06-22 18:58:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43146.3, 300 sec: 42487.3). Total num frames: 4429905920. Throughput: 0: 42858.8. Samples: 4430078380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 18:58:51,814][15401] Updated weights for policy 0, policy_version 270390 (0.0041) [2024-06-22 18:58:53,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4430135296. Throughput: 0: 43098.7. Samples: 4430214100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:53,390][15132] Avg episode reward: [(0, '0.110')] [2024-06-22 18:58:55,759][15401] Updated weights for policy 0, policy_version 270400 (0.0032) [2024-06-22 18:58:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4430331904. Throughput: 0: 42747.3. Samples: 4430460040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:58:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 18:58:59,799][15401] Updated weights for policy 0, policy_version 270410 (0.0026) [2024-06-22 18:59:03,250][15401] Updated weights for policy 0, policy_version 270420 (0.0044) [2024-06-22 18:59:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 4430561280. Throughput: 0: 42710.3. Samples: 4430718200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:59:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 18:59:07,435][15401] Updated weights for policy 0, policy_version 270430 (0.0035) [2024-06-22 18:59:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4430774272. Throughput: 0: 43014.6. Samples: 4430854600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:59:08,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-22 18:59:10,853][15401] Updated weights for policy 0, policy_version 270440 (0.0031) [2024-06-22 18:59:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4430987264. Throughput: 0: 42871.2. Samples: 4431106280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:59:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 18:59:15,019][15401] Updated weights for policy 0, policy_version 270450 (0.0031) [2024-06-22 18:59:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4431200256. Throughput: 0: 42880.7. Samples: 4431367160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:59:18,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 18:59:18,411][15401] Updated weights for policy 0, policy_version 270460 (0.0030) [2024-06-22 18:59:22,665][15401] Updated weights for policy 0, policy_version 270470 (0.0033) [2024-06-22 18:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4431413248. Throughput: 0: 42834.7. Samples: 4431495980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 18:59:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 18:59:25,902][15401] Updated weights for policy 0, policy_version 270480 (0.0034) [2024-06-22 18:59:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42599.3). Total num frames: 4431626240. Throughput: 0: 42802.4. Samples: 4431751700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 18:59:30,377][15401] Updated weights for policy 0, policy_version 270490 (0.0041) [2024-06-22 18:59:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 4431839232. Throughput: 0: 42876.4. Samples: 4432007820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 18:59:33,617][15401] Updated weights for policy 0, policy_version 270500 (0.0031) [2024-06-22 18:59:37,813][15401] Updated weights for policy 0, policy_version 270510 (0.0050) [2024-06-22 18:59:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4432052224. Throughput: 0: 42741.3. Samples: 4432137460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 18:59:41,115][15401] Updated weights for policy 0, policy_version 270520 (0.0046) [2024-06-22 18:59:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42654.3). Total num frames: 4432281600. Throughput: 0: 43105.7. Samples: 4432399800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 18:59:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270526_4432297984.pth... [2024-06-22 18:59:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000269900_4422041600.pth [2024-06-22 18:59:45,598][15401] Updated weights for policy 0, policy_version 270530 (0.0040) [2024-06-22 18:59:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 4432494592. Throughput: 0: 42973.8. Samples: 4432652020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 18:59:48,825][15401] Updated weights for policy 0, policy_version 270540 (0.0025) [2024-06-22 18:59:53,149][15401] Updated weights for policy 0, policy_version 270550 (0.0033) [2024-06-22 18:59:53,391][15132] Fps is (10 sec: 40954.6, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 4432691200. Throughput: 0: 42788.1. Samples: 4432780120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:53,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 18:59:56,498][15401] Updated weights for policy 0, policy_version 270560 (0.0030) [2024-06-22 18:59:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4432920576. Throughput: 0: 42988.5. Samples: 4433040760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 18:59:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 19:00:00,824][15401] Updated weights for policy 0, policy_version 270570 (0.0034) [2024-06-22 19:00:03,390][15132] Fps is (10 sec: 44242.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4433133568. Throughput: 0: 42748.0. Samples: 4433290820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 19:00:04,070][15401] Updated weights for policy 0, policy_version 270580 (0.0045) [2024-06-22 19:00:08,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4433330176. Throughput: 0: 42743.5. Samples: 4433419440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 19:00:08,701][15401] Updated weights for policy 0, policy_version 270590 (0.0033) [2024-06-22 19:00:11,512][15349] Signal inference workers to stop experience collection... (65550 times) [2024-06-22 19:00:11,512][15349] Signal inference workers to resume experience collection... (65550 times) [2024-06-22 19:00:11,550][15401] InferenceWorker_p0-w0: stopping experience collection (65550 times) [2024-06-22 19:00:11,551][15401] InferenceWorker_p0-w0: resuming experience collection (65550 times) [2024-06-22 19:00:11,645][15401] Updated weights for policy 0, policy_version 270600 (0.0042) [2024-06-22 19:00:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4433575936. Throughput: 0: 42916.9. Samples: 4433682960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 19:00:16,348][15401] Updated weights for policy 0, policy_version 270610 (0.0046) [2024-06-22 19:00:18,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4433788928. Throughput: 0: 42858.5. Samples: 4433936460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 19:00:19,262][15401] Updated weights for policy 0, policy_version 270620 (0.0031) [2024-06-22 19:00:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4433969152. Throughput: 0: 42726.8. Samples: 4434060160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 19:00:24,079][15401] Updated weights for policy 0, policy_version 270630 (0.0023) [2024-06-22 19:00:27,740][15401] Updated weights for policy 0, policy_version 270640 (0.0045) [2024-06-22 19:00:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4434198528. Throughput: 0: 42660.2. Samples: 4434319500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 19:00:31,680][15401] Updated weights for policy 0, policy_version 270650 (0.0037) [2024-06-22 19:00:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4434411520. Throughput: 0: 42762.6. Samples: 4434576440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:33,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 19:00:35,200][15401] Updated weights for policy 0, policy_version 270660 (0.0029) [2024-06-22 19:00:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 4434624512. Throughput: 0: 42738.7. Samples: 4434703300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:00:38,395][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 19:00:39,082][15401] Updated weights for policy 0, policy_version 270670 (0.0037) [2024-06-22 19:00:42,602][15401] Updated weights for policy 0, policy_version 270680 (0.0040) [2024-06-22 19:00:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4434853888. Throughput: 0: 42748.8. Samples: 4434964460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:00:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 19:00:46,610][15401] Updated weights for policy 0, policy_version 270690 (0.0034) [2024-06-22 19:00:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4435066880. Throughput: 0: 42916.2. Samples: 4435222040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:00:48,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 19:00:50,363][15401] Updated weights for policy 0, policy_version 270700 (0.0038) [2024-06-22 19:00:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43145.5, 300 sec: 42765.0). Total num frames: 4435279872. Throughput: 0: 42877.5. Samples: 4435348920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:00:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 19:00:54,298][15401] Updated weights for policy 0, policy_version 270710 (0.0033) [2024-06-22 19:00:57,979][15401] Updated weights for policy 0, policy_version 270720 (0.0037) [2024-06-22 19:00:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4435492864. Throughput: 0: 42957.4. Samples: 4435616040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:00:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 19:01:01,804][15401] Updated weights for policy 0, policy_version 270730 (0.0047) [2024-06-22 19:01:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4435689472. Throughput: 0: 42778.6. Samples: 4435861500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 19:01:05,815][15401] Updated weights for policy 0, policy_version 270740 (0.0042) [2024-06-22 19:01:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42821.1). Total num frames: 4435918848. Throughput: 0: 42851.5. Samples: 4435988480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 19:01:09,352][15401] Updated weights for policy 0, policy_version 270750 (0.0038) [2024-06-22 19:01:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4436115456. Throughput: 0: 42871.0. Samples: 4436248700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 19:01:13,424][15401] Updated weights for policy 0, policy_version 270760 (0.0040) [2024-06-22 19:01:16,889][15401] Updated weights for policy 0, policy_version 270770 (0.0041) [2024-06-22 19:01:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 4436328448. Throughput: 0: 42861.0. Samples: 4436505080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 19:01:21,239][15401] Updated weights for policy 0, policy_version 270780 (0.0026) [2024-06-22 19:01:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4436557824. Throughput: 0: 43020.3. Samples: 4436639220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 19:01:24,524][15401] Updated weights for policy 0, policy_version 270790 (0.0030) [2024-06-22 19:01:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4436754432. Throughput: 0: 42853.8. Samples: 4436892880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 19:01:28,982][15401] Updated weights for policy 0, policy_version 270800 (0.0035) [2024-06-22 19:01:31,156][15349] Signal inference workers to stop experience collection... (65600 times) [2024-06-22 19:01:31,156][15349] Signal inference workers to resume experience collection... (65600 times) [2024-06-22 19:01:31,189][15401] InferenceWorker_p0-w0: stopping experience collection (65600 times) [2024-06-22 19:01:31,190][15401] InferenceWorker_p0-w0: resuming experience collection (65600 times) [2024-06-22 19:01:31,959][15401] Updated weights for policy 0, policy_version 270810 (0.0048) [2024-06-22 19:01:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4436967424. Throughput: 0: 42847.1. Samples: 4437150160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:33,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 19:01:36,481][15401] Updated weights for policy 0, policy_version 270820 (0.0033) [2024-06-22 19:01:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4437196800. Throughput: 0: 42870.2. Samples: 4437278080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 19:01:40,117][15401] Updated weights for policy 0, policy_version 270830 (0.0035) [2024-06-22 19:01:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4437409792. Throughput: 0: 42540.8. Samples: 4437530380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 19:01:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270838_4437409792.pth... [2024-06-22 19:01:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270211_4427137024.pth [2024-06-22 19:01:43,994][15401] Updated weights for policy 0, policy_version 270840 (0.0027) [2024-06-22 19:01:47,874][15401] Updated weights for policy 0, policy_version 270850 (0.0024) [2024-06-22 19:01:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4437622784. Throughput: 0: 42771.3. Samples: 4437786200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:01:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 19:01:51,515][15401] Updated weights for policy 0, policy_version 270860 (0.0032) [2024-06-22 19:01:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4437835776. Throughput: 0: 42896.5. Samples: 4437918820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:01:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 19:01:55,308][15401] Updated weights for policy 0, policy_version 270870 (0.0032) [2024-06-22 19:01:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4438048768. Throughput: 0: 42847.4. Samples: 4438176840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:01:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 19:01:59,150][15401] Updated weights for policy 0, policy_version 270880 (0.0025) [2024-06-22 19:02:02,798][15401] Updated weights for policy 0, policy_version 270890 (0.0031) [2024-06-22 19:02:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 4438278144. Throughput: 0: 42912.3. Samples: 4438436140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 19:02:06,581][15401] Updated weights for policy 0, policy_version 270900 (0.0033) [2024-06-22 19:02:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4438474752. Throughput: 0: 42850.3. Samples: 4438567480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 19:02:10,308][15401] Updated weights for policy 0, policy_version 270910 (0.0033) [2024-06-22 19:02:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4438671360. Throughput: 0: 42798.3. Samples: 4438818800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 19:02:14,605][15401] Updated weights for policy 0, policy_version 270920 (0.0032) [2024-06-22 19:02:17,791][15401] Updated weights for policy 0, policy_version 270930 (0.0026) [2024-06-22 19:02:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4438917120. Throughput: 0: 42622.6. Samples: 4439068180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 19:02:22,191][15401] Updated weights for policy 0, policy_version 270940 (0.0034) [2024-06-22 19:02:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4439130112. Throughput: 0: 42840.1. Samples: 4439205880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 19:02:25,429][15401] Updated weights for policy 0, policy_version 270950 (0.0040) [2024-06-22 19:02:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4439326720. Throughput: 0: 42808.0. Samples: 4439456740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 19:02:29,758][15401] Updated weights for policy 0, policy_version 270960 (0.0023) [2024-06-22 19:02:33,015][15401] Updated weights for policy 0, policy_version 270970 (0.0025) [2024-06-22 19:02:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4439572480. Throughput: 0: 42580.4. Samples: 4439702320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 19:02:37,507][15401] Updated weights for policy 0, policy_version 270980 (0.0031) [2024-06-22 19:02:38,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 4439752704. Throughput: 0: 42686.3. Samples: 4439839980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:38,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 19:02:40,330][15349] Signal inference workers to stop experience collection... (65650 times) [2024-06-22 19:02:40,377][15401] InferenceWorker_p0-w0: stopping experience collection (65650 times) [2024-06-22 19:02:40,386][15349] Signal inference workers to resume experience collection... (65650 times) [2024-06-22 19:02:40,393][15401] InferenceWorker_p0-w0: resuming experience collection (65650 times) [2024-06-22 19:02:40,674][15401] Updated weights for policy 0, policy_version 270990 (0.0034) [2024-06-22 19:02:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 4439965696. Throughput: 0: 42636.9. Samples: 4440095500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 19:02:45,419][15401] Updated weights for policy 0, policy_version 271000 (0.0038) [2024-06-22 19:02:48,389][15132] Fps is (10 sec: 45904.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4440211456. Throughput: 0: 42407.6. Samples: 4440344480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 19:02:48,606][15401] Updated weights for policy 0, policy_version 271010 (0.0026) [2024-06-22 19:02:52,944][15401] Updated weights for policy 0, policy_version 271020 (0.0040) [2024-06-22 19:02:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4440391680. Throughput: 0: 42549.0. Samples: 4440482180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 19:02:56,505][15401] Updated weights for policy 0, policy_version 271030 (0.0049) [2024-06-22 19:02:58,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 4440604672. Throughput: 0: 42425.7. Samples: 4440728060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:02:58,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 19:03:00,844][15401] Updated weights for policy 0, policy_version 271040 (0.0049) [2024-06-22 19:03:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4440834048. Throughput: 0: 42634.8. Samples: 4440986740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 19:03:04,099][15401] Updated weights for policy 0, policy_version 271050 (0.0029) [2024-06-22 19:03:08,382][15401] Updated weights for policy 0, policy_version 271060 (0.0033) [2024-06-22 19:03:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4441047040. Throughput: 0: 42582.6. Samples: 4441122100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 19:03:11,609][15401] Updated weights for policy 0, policy_version 271070 (0.0041) [2024-06-22 19:03:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4441260032. Throughput: 0: 42569.2. Samples: 4441372360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:13,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 19:03:15,974][15401] Updated weights for policy 0, policy_version 271080 (0.0038) [2024-06-22 19:03:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 4441473024. Throughput: 0: 42854.2. Samples: 4441630760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 19:03:19,850][15401] Updated weights for policy 0, policy_version 271090 (0.0030) [2024-06-22 19:03:23,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 4441669632. Throughput: 0: 42710.9. Samples: 4441761800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:23,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 19:03:23,878][15401] Updated weights for policy 0, policy_version 271100 (0.0031) [2024-06-22 19:03:27,479][15401] Updated weights for policy 0, policy_version 271110 (0.0025) [2024-06-22 19:03:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4441899008. Throughput: 0: 42733.8. Samples: 4442018520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 19:03:31,633][15401] Updated weights for policy 0, policy_version 271120 (0.0036) [2024-06-22 19:03:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4442095616. Throughput: 0: 42895.2. Samples: 4442274760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 19:03:35,057][15401] Updated weights for policy 0, policy_version 271130 (0.0022) [2024-06-22 19:03:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 4442324992. Throughput: 0: 42633.3. Samples: 4442400680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 19:03:39,507][15401] Updated weights for policy 0, policy_version 271140 (0.0026) [2024-06-22 19:03:42,564][15401] Updated weights for policy 0, policy_version 271150 (0.0038) [2024-06-22 19:03:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4442537984. Throughput: 0: 42837.7. Samples: 4442655660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 19:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271151_4442537984.pth... [2024-06-22 19:03:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270526_4432297984.pth [2024-06-22 19:03:47,201][15401] Updated weights for policy 0, policy_version 271160 (0.0032) [2024-06-22 19:03:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4442734592. Throughput: 0: 42839.5. Samples: 4442914520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 19:03:50,330][15401] Updated weights for policy 0, policy_version 271170 (0.0038) [2024-06-22 19:03:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4442963968. Throughput: 0: 42531.9. Samples: 4443036040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 19:03:54,889][15401] Updated weights for policy 0, policy_version 271180 (0.0034) [2024-06-22 19:03:56,912][15349] Signal inference workers to stop experience collection... (65700 times) [2024-06-22 19:03:56,913][15349] Signal inference workers to resume experience collection... (65700 times) [2024-06-22 19:03:56,957][15401] InferenceWorker_p0-w0: stopping experience collection (65700 times) [2024-06-22 19:03:56,957][15401] InferenceWorker_p0-w0: resuming experience collection (65700 times) [2024-06-22 19:03:57,883][15401] Updated weights for policy 0, policy_version 271190 (0.0025) [2024-06-22 19:03:58,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 4443193344. Throughput: 0: 42656.6. Samples: 4443291900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:03:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 19:04:02,493][15401] Updated weights for policy 0, policy_version 271200 (0.0041) [2024-06-22 19:04:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4443357184. Throughput: 0: 42595.6. Samples: 4443547560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:04:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 19:04:05,631][15401] Updated weights for policy 0, policy_version 271210 (0.0039) [2024-06-22 19:04:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4443602944. Throughput: 0: 42473.0. Samples: 4443672980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:04:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 19:04:10,130][15401] Updated weights for policy 0, policy_version 271220 (0.0025) [2024-06-22 19:04:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4443815936. Throughput: 0: 42473.7. Samples: 4443929840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:04:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 19:04:13,654][15401] Updated weights for policy 0, policy_version 271230 (0.0037) [2024-06-22 19:04:17,937][15401] Updated weights for policy 0, policy_version 271240 (0.0045) [2024-06-22 19:04:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4444012544. Throughput: 0: 42361.7. Samples: 4444181040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 19:04:21,588][15401] Updated weights for policy 0, policy_version 271250 (0.0028) [2024-06-22 19:04:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 4444258304. Throughput: 0: 42364.0. Samples: 4444307060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 19:04:25,363][15401] Updated weights for policy 0, policy_version 271260 (0.0039) [2024-06-22 19:04:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4444438528. Throughput: 0: 42509.1. Samples: 4444568560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:28,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-22 19:04:29,031][15401] Updated weights for policy 0, policy_version 271270 (0.0023) [2024-06-22 19:04:32,871][15401] Updated weights for policy 0, policy_version 271280 (0.0030) [2024-06-22 19:04:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4444667904. Throughput: 0: 42330.2. Samples: 4444819380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:33,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 19:04:36,878][15401] Updated weights for policy 0, policy_version 271290 (0.0043) [2024-06-22 19:04:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4444880896. Throughput: 0: 42560.5. Samples: 4444951260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:38,391][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:04:40,679][15401] Updated weights for policy 0, policy_version 271300 (0.0029) [2024-06-22 19:04:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 4445077504. Throughput: 0: 42609.3. Samples: 4445209420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:43,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 19:04:44,650][15401] Updated weights for policy 0, policy_version 271310 (0.0030) [2024-06-22 19:04:48,198][15401] Updated weights for policy 0, policy_version 271320 (0.0031) [2024-06-22 19:04:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 4445306880. Throughput: 0: 42592.8. Samples: 4445464240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:48,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 19:04:52,222][15401] Updated weights for policy 0, policy_version 271330 (0.0029) [2024-06-22 19:04:53,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4445536256. Throughput: 0: 42729.2. Samples: 4445595800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 19:04:55,706][15401] Updated weights for policy 0, policy_version 271340 (0.0034) [2024-06-22 19:04:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 4445716480. Throughput: 0: 42770.2. Samples: 4445854600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:04:58,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 19:04:59,933][15401] Updated weights for policy 0, policy_version 271350 (0.0026) [2024-06-22 19:05:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 4445945856. Throughput: 0: 42747.5. Samples: 4446104780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:05:03,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 19:05:03,618][15401] Updated weights for policy 0, policy_version 271360 (0.0035) [2024-06-22 19:05:07,752][15401] Updated weights for policy 0, policy_version 271370 (0.0031) [2024-06-22 19:05:08,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4446158848. Throughput: 0: 42904.0. Samples: 4446237740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:05:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 19:05:11,326][15401] Updated weights for policy 0, policy_version 271380 (0.0040) [2024-06-22 19:05:13,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4446355456. Throughput: 0: 42689.3. Samples: 4446489580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:05:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 19:05:14,243][15349] Signal inference workers to stop experience collection... (65750 times) [2024-06-22 19:05:14,243][15349] Signal inference workers to resume experience collection... (65750 times) [2024-06-22 19:05:14,289][15401] InferenceWorker_p0-w0: stopping experience collection (65750 times) [2024-06-22 19:05:14,289][15401] InferenceWorker_p0-w0: resuming experience collection (65750 times) [2024-06-22 19:05:15,550][15401] Updated weights for policy 0, policy_version 271390 (0.0045) [2024-06-22 19:05:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4446584832. Throughput: 0: 42686.8. Samples: 4446740280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:05:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 19:05:19,106][15401] Updated weights for policy 0, policy_version 271400 (0.0027) [2024-06-22 19:05:23,098][15401] Updated weights for policy 0, policy_version 271410 (0.0031) [2024-06-22 19:05:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4446797824. Throughput: 0: 42834.3. Samples: 4446878800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 19:05:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 19:05:26,591][15401] Updated weights for policy 0, policy_version 271420 (0.0040) [2024-06-22 19:05:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 4447010816. Throughput: 0: 42782.2. Samples: 4447134520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 19:05:30,787][15401] Updated weights for policy 0, policy_version 271430 (0.0029) [2024-06-22 19:05:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4447240192. Throughput: 0: 42626.7. Samples: 4447382440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 19:05:34,148][15401] Updated weights for policy 0, policy_version 271440 (0.0037) [2024-06-22 19:05:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4447420416. Throughput: 0: 42759.7. Samples: 4447519980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:38,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-22 19:05:38,526][15401] Updated weights for policy 0, policy_version 271450 (0.0040) [2024-06-22 19:05:41,704][15401] Updated weights for policy 0, policy_version 271460 (0.0036) [2024-06-22 19:05:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42871.4, 300 sec: 42653.6). Total num frames: 4447649792. Throughput: 0: 42594.2. Samples: 4447771340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:43,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 19:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271463_4447649792.pth... [2024-06-22 19:05:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000270838_4437409792.pth [2024-06-22 19:05:46,362][15401] Updated weights for policy 0, policy_version 271470 (0.0034) [2024-06-22 19:05:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4447895552. Throughput: 0: 42652.9. Samples: 4448024060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 19:05:49,332][15401] Updated weights for policy 0, policy_version 271480 (0.0031) [2024-06-22 19:05:53,390][15132] Fps is (10 sec: 40967.5, 60 sec: 42051.9, 300 sec: 42598.3). Total num frames: 4448059392. Throughput: 0: 42665.1. Samples: 4448157700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:53,391][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 19:05:54,045][15401] Updated weights for policy 0, policy_version 271490 (0.0035) [2024-06-22 19:05:57,098][15401] Updated weights for policy 0, policy_version 271500 (0.0044) [2024-06-22 19:05:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 4448288768. Throughput: 0: 42765.8. Samples: 4448414040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:05:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 19:06:01,613][15401] Updated weights for policy 0, policy_version 271510 (0.0046) [2024-06-22 19:06:03,390][15132] Fps is (10 sec: 47516.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 4448534528. Throughput: 0: 42817.6. Samples: 4448667080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 19:06:04,635][15401] Updated weights for policy 0, policy_version 271520 (0.0036) [2024-06-22 19:06:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4448698368. Throughput: 0: 42801.2. Samples: 4448804960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:08,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 19:06:09,101][15401] Updated weights for policy 0, policy_version 271530 (0.0025) [2024-06-22 19:06:12,123][15401] Updated weights for policy 0, policy_version 271540 (0.0033) [2024-06-22 19:06:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4448927744. Throughput: 0: 42695.2. Samples: 4449055800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 19:06:16,636][15401] Updated weights for policy 0, policy_version 271550 (0.0024) [2024-06-22 19:06:18,390][15132] Fps is (10 sec: 47524.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4449173504. Throughput: 0: 42895.5. Samples: 4449312740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 19:06:19,675][15349] Signal inference workers to stop experience collection... (65800 times) [2024-06-22 19:06:19,728][15401] InferenceWorker_p0-w0: stopping experience collection (65800 times) [2024-06-22 19:06:19,732][15349] Signal inference workers to resume experience collection... (65800 times) [2024-06-22 19:06:19,750][15401] InferenceWorker_p0-w0: resuming experience collection (65800 times) [2024-06-22 19:06:19,876][15401] Updated weights for policy 0, policy_version 271560 (0.0036) [2024-06-22 19:06:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4449353728. Throughput: 0: 42926.9. Samples: 4449451700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 19:06:24,011][15401] Updated weights for policy 0, policy_version 271570 (0.0040) [2024-06-22 19:06:27,472][15401] Updated weights for policy 0, policy_version 271580 (0.0036) [2024-06-22 19:06:28,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4449583104. Throughput: 0: 42989.8. Samples: 4449705880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:28,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 19:06:31,803][15401] Updated weights for policy 0, policy_version 271590 (0.0041) [2024-06-22 19:06:33,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4449812480. Throughput: 0: 43165.8. Samples: 4449966620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:06:33,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 19:06:34,995][15401] Updated weights for policy 0, policy_version 271600 (0.0038) [2024-06-22 19:06:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4450009088. Throughput: 0: 43226.8. Samples: 4450102880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:06:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 19:06:39,368][15401] Updated weights for policy 0, policy_version 271610 (0.0043) [2024-06-22 19:06:42,473][15401] Updated weights for policy 0, policy_version 271620 (0.0048) [2024-06-22 19:06:43,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4450222080. Throughput: 0: 43197.2. Samples: 4450357920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:06:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 19:06:46,986][15401] Updated weights for policy 0, policy_version 271630 (0.0033) [2024-06-22 19:06:48,394][15132] Fps is (10 sec: 45853.6, 60 sec: 42868.1, 300 sec: 42819.8). Total num frames: 4450467840. Throughput: 0: 43108.9. Samples: 4450607180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:06:48,395][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 19:06:50,330][15401] Updated weights for policy 0, policy_version 271640 (0.0030) [2024-06-22 19:06:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43418.0, 300 sec: 42765.0). Total num frames: 4450664448. Throughput: 0: 43113.3. Samples: 4450744960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:06:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 19:06:54,509][15401] Updated weights for policy 0, policy_version 271650 (0.0026) [2024-06-22 19:06:57,840][15401] Updated weights for policy 0, policy_version 271660 (0.0042) [2024-06-22 19:06:58,390][15132] Fps is (10 sec: 40979.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4450877440. Throughput: 0: 43117.2. Samples: 4450996080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:06:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 19:07:02,513][15401] Updated weights for policy 0, policy_version 271670 (0.0033) [2024-06-22 19:07:03,393][15132] Fps is (10 sec: 42582.0, 60 sec: 42595.7, 300 sec: 42764.5). Total num frames: 4451090432. Throughput: 0: 43040.8. Samples: 4451249740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:03,394][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 19:07:05,446][15401] Updated weights for policy 0, policy_version 271680 (0.0027) [2024-06-22 19:07:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 4451303424. Throughput: 0: 42818.7. Samples: 4451378540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 19:07:10,200][15401] Updated weights for policy 0, policy_version 271690 (0.0035) [2024-06-22 19:07:13,389][15132] Fps is (10 sec: 44254.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4451532800. Throughput: 0: 42907.2. Samples: 4451636600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:13,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 19:07:13,391][15401] Updated weights for policy 0, policy_version 271700 (0.0031) [2024-06-22 19:07:17,726][15401] Updated weights for policy 0, policy_version 271710 (0.0030) [2024-06-22 19:07:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4451729408. Throughput: 0: 42836.5. Samples: 4451894160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 19:07:20,866][15401] Updated weights for policy 0, policy_version 271720 (0.0033) [2024-06-22 19:07:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4451942400. Throughput: 0: 42599.1. Samples: 4452019840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 19:07:25,111][15401] Updated weights for policy 0, policy_version 271730 (0.0037) [2024-06-22 19:07:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4452171776. Throughput: 0: 42607.3. Samples: 4452275240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 19:07:28,484][15401] Updated weights for policy 0, policy_version 271740 (0.0034) [2024-06-22 19:07:32,644][15401] Updated weights for policy 0, policy_version 271750 (0.0031) [2024-06-22 19:07:33,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 4452368384. Throughput: 0: 42970.2. Samples: 4452540740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:33,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 19:07:34,287][15349] Signal inference workers to stop experience collection... (65850 times) [2024-06-22 19:07:34,290][15349] Signal inference workers to resume experience collection... (65850 times) [2024-06-22 19:07:34,304][15401] InferenceWorker_p0-w0: stopping experience collection (65850 times) [2024-06-22 19:07:34,305][15401] InferenceWorker_p0-w0: resuming experience collection (65850 times) [2024-06-22 19:07:36,276][15401] Updated weights for policy 0, policy_version 271760 (0.0034) [2024-06-22 19:07:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4452597760. Throughput: 0: 42723.1. Samples: 4452667500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 19:07:40,529][15401] Updated weights for policy 0, policy_version 271770 (0.0032) [2024-06-22 19:07:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4452810752. Throughput: 0: 42936.5. Samples: 4452928220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 19:07:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271778_4452810752.pth... [2024-06-22 19:07:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271151_4442537984.pth [2024-06-22 19:07:43,856][15401] Updated weights for policy 0, policy_version 271780 (0.0030) [2024-06-22 19:07:48,029][15401] Updated weights for policy 0, policy_version 271790 (0.0034) [2024-06-22 19:07:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42601.7, 300 sec: 42820.5). Total num frames: 4453023744. Throughput: 0: 43047.7. Samples: 4453186720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 19:07:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 19:07:51,525][15401] Updated weights for policy 0, policy_version 271800 (0.0032) [2024-06-22 19:07:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4453236736. Throughput: 0: 42979.1. Samples: 4453312600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:07:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 19:07:55,608][15401] Updated weights for policy 0, policy_version 271810 (0.0032) [2024-06-22 19:07:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4453466112. Throughput: 0: 43026.2. Samples: 4453572780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:07:58,398][15132] Avg episode reward: [(0, '0.192')] [2024-06-22 19:07:59,524][15401] Updated weights for policy 0, policy_version 271820 (0.0040) [2024-06-22 19:08:03,162][15401] Updated weights for policy 0, policy_version 271830 (0.0039) [2024-06-22 19:08:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43147.4, 300 sec: 42820.6). Total num frames: 4453679104. Throughput: 0: 43153.3. Samples: 4453836060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:03,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-22 19:08:07,063][15401] Updated weights for policy 0, policy_version 271840 (0.0036) [2024-06-22 19:08:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4453875712. Throughput: 0: 43025.0. Samples: 4453955960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 19:08:10,737][15401] Updated weights for policy 0, policy_version 271850 (0.0024) [2024-06-22 19:08:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4454105088. Throughput: 0: 43231.5. Samples: 4454220660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 19:08:14,555][15401] Updated weights for policy 0, policy_version 271860 (0.0039) [2024-06-22 19:08:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4454301696. Throughput: 0: 42957.9. Samples: 4454473740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 19:08:18,524][15401] Updated weights for policy 0, policy_version 271870 (0.0049) [2024-06-22 19:08:22,178][15401] Updated weights for policy 0, policy_version 271880 (0.0045) [2024-06-22 19:08:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4454514688. Throughput: 0: 42856.9. Samples: 4454596060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 19:08:26,082][15401] Updated weights for policy 0, policy_version 271890 (0.0039) [2024-06-22 19:08:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4454760448. Throughput: 0: 42848.5. Samples: 4454856400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 19:08:30,071][15401] Updated weights for policy 0, policy_version 271900 (0.0033) [2024-06-22 19:08:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4454940672. Throughput: 0: 42846.3. Samples: 4455114800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 19:08:33,813][15401] Updated weights for policy 0, policy_version 271910 (0.0039) [2024-06-22 19:08:37,698][15401] Updated weights for policy 0, policy_version 271920 (0.0045) [2024-06-22 19:08:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4455153664. Throughput: 0: 42788.0. Samples: 4455238060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:38,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 19:08:41,326][15401] Updated weights for policy 0, policy_version 271930 (0.0029) [2024-06-22 19:08:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4455399424. Throughput: 0: 42671.2. Samples: 4455492980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 19:08:45,423][15401] Updated weights for policy 0, policy_version 271940 (0.0025) [2024-06-22 19:08:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4455563264. Throughput: 0: 42698.2. Samples: 4455757480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 19:08:49,054][15401] Updated weights for policy 0, policy_version 271950 (0.0043) [2024-06-22 19:08:53,271][15401] Updated weights for policy 0, policy_version 271960 (0.0037) [2024-06-22 19:08:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4455792640. Throughput: 0: 42577.4. Samples: 4455871940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 19:08:56,852][15401] Updated weights for policy 0, policy_version 271970 (0.0035) [2024-06-22 19:08:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4456038400. Throughput: 0: 42304.0. Samples: 4456124340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:08:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 19:09:01,201][15401] Updated weights for policy 0, policy_version 271980 (0.0032) [2024-06-22 19:09:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4456202240. Throughput: 0: 42535.0. Samples: 4456387820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:03,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 19:09:04,566][15401] Updated weights for policy 0, policy_version 271990 (0.0036) [2024-06-22 19:09:08,082][15349] Signal inference workers to stop experience collection... (65900 times) [2024-06-22 19:09:08,082][15349] Signal inference workers to resume experience collection... (65900 times) [2024-06-22 19:09:08,105][15401] InferenceWorker_p0-w0: stopping experience collection (65900 times) [2024-06-22 19:09:08,105][15401] InferenceWorker_p0-w0: resuming experience collection (65900 times) [2024-06-22 19:09:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4456415232. Throughput: 0: 42324.5. Samples: 4456500660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 19:09:08,723][15401] Updated weights for policy 0, policy_version 272000 (0.0031) [2024-06-22 19:09:12,230][15401] Updated weights for policy 0, policy_version 272010 (0.0032) [2024-06-22 19:09:13,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4456693760. Throughput: 0: 42506.6. Samples: 4456769200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:13,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-22 19:09:16,374][15401] Updated weights for policy 0, policy_version 272020 (0.0033) [2024-06-22 19:09:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4456841216. Throughput: 0: 42608.9. Samples: 4457032200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 19:09:19,970][15401] Updated weights for policy 0, policy_version 272030 (0.0038) [2024-06-22 19:09:23,392][15132] Fps is (10 sec: 37674.1, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 4457070592. Throughput: 0: 42400.8. Samples: 4457146200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:23,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 19:09:23,873][15401] Updated weights for policy 0, policy_version 272040 (0.0035) [2024-06-22 19:09:27,485][15401] Updated weights for policy 0, policy_version 272050 (0.0026) [2024-06-22 19:09:28,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4457332736. Throughput: 0: 42840.4. Samples: 4457420800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 19:09:31,428][15401] Updated weights for policy 0, policy_version 272060 (0.0031) [2024-06-22 19:09:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4457480192. Throughput: 0: 42700.3. Samples: 4457679000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 19:09:35,041][15401] Updated weights for policy 0, policy_version 272070 (0.0026) [2024-06-22 19:09:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 4457725952. Throughput: 0: 42735.1. Samples: 4457795020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:38,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 19:09:39,347][15401] Updated weights for policy 0, policy_version 272080 (0.0051) [2024-06-22 19:09:42,719][15401] Updated weights for policy 0, policy_version 272090 (0.0040) [2024-06-22 19:09:43,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 4457971712. Throughput: 0: 43155.0. Samples: 4458066320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 19:09:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272094_4457988096.pth... [2024-06-22 19:09:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271463_4447649792.pth [2024-06-22 19:09:47,162][15401] Updated weights for policy 0, policy_version 272100 (0.0042) [2024-06-22 19:09:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4458135552. Throughput: 0: 42991.6. Samples: 4458322440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 19:09:50,418][15401] Updated weights for policy 0, policy_version 272110 (0.0033) [2024-06-22 19:09:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 4458364928. Throughput: 0: 43136.0. Samples: 4458441780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 19:09:54,893][15401] Updated weights for policy 0, policy_version 272120 (0.0028) [2024-06-22 19:09:57,843][15401] Updated weights for policy 0, policy_version 272130 (0.0046) [2024-06-22 19:09:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 4458594304. Throughput: 0: 43096.1. Samples: 4458708520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:09:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 19:10:02,400][15401] Updated weights for policy 0, policy_version 272140 (0.0038) [2024-06-22 19:10:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4458790912. Throughput: 0: 42910.6. Samples: 4458963280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:10:03,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 19:10:05,646][15401] Updated weights for policy 0, policy_version 272150 (0.0034) [2024-06-22 19:10:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4459020288. Throughput: 0: 43068.9. Samples: 4459084200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:10:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 19:10:09,973][15401] Updated weights for policy 0, policy_version 272160 (0.0043) [2024-06-22 19:10:10,604][15349] Signal inference workers to stop experience collection... (65950 times) [2024-06-22 19:10:10,604][15349] Signal inference workers to resume experience collection... (65950 times) [2024-06-22 19:10:10,656][15401] InferenceWorker_p0-w0: stopping experience collection (65950 times) [2024-06-22 19:10:10,656][15401] InferenceWorker_p0-w0: resuming experience collection (65950 times) [2024-06-22 19:10:13,108][15401] Updated weights for policy 0, policy_version 272170 (0.0044) [2024-06-22 19:10:13,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4459233280. Throughput: 0: 42919.2. Samples: 4459352160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-22 19:10:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 19:10:17,471][15401] Updated weights for policy 0, policy_version 272180 (0.0032) [2024-06-22 19:10:18,392][15132] Fps is (10 sec: 40951.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 4459429888. Throughput: 0: 42883.7. Samples: 4459608860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:18,392][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 19:10:20,576][15401] Updated weights for policy 0, policy_version 272190 (0.0034) [2024-06-22 19:10:23,392][15132] Fps is (10 sec: 42587.6, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 4459659264. Throughput: 0: 43062.1. Samples: 4459732920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:23,393][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 19:10:24,936][15401] Updated weights for policy 0, policy_version 272200 (0.0036) [2024-06-22 19:10:28,389][15132] Fps is (10 sec: 44246.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4459872256. Throughput: 0: 42966.8. Samples: 4459999820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 19:10:28,433][15401] Updated weights for policy 0, policy_version 272210 (0.0028) [2024-06-22 19:10:32,557][15401] Updated weights for policy 0, policy_version 272220 (0.0040) [2024-06-22 19:10:33,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 4460068864. Throughput: 0: 42901.2. Samples: 4460253100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:33,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 19:10:36,056][15401] Updated weights for policy 0, policy_version 272230 (0.0033) [2024-06-22 19:10:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 4460298240. Throughput: 0: 42972.5. Samples: 4460375540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 19:10:40,058][15401] Updated weights for policy 0, policy_version 272240 (0.0035) [2024-06-22 19:10:43,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4460511232. Throughput: 0: 42969.3. Samples: 4460642140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 19:10:43,680][15401] Updated weights for policy 0, policy_version 272250 (0.0031) [2024-06-22 19:10:48,014][15401] Updated weights for policy 0, policy_version 272260 (0.0038) [2024-06-22 19:10:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42931.7). Total num frames: 4460724224. Throughput: 0: 42940.8. Samples: 4460895520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 19:10:51,265][15401] Updated weights for policy 0, policy_version 272270 (0.0036) [2024-06-22 19:10:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4460953600. Throughput: 0: 43059.6. Samples: 4461021880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 19:10:55,487][15401] Updated weights for policy 0, policy_version 272280 (0.0038) [2024-06-22 19:10:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4461133824. Throughput: 0: 42832.5. Samples: 4461279620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:10:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 19:10:59,044][15401] Updated weights for policy 0, policy_version 272290 (0.0030) [2024-06-22 19:11:02,972][15401] Updated weights for policy 0, policy_version 272300 (0.0033) [2024-06-22 19:11:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42932.0). Total num frames: 4461363200. Throughput: 0: 42811.0. Samples: 4461535260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:11:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 19:11:06,467][15401] Updated weights for policy 0, policy_version 272310 (0.0025) [2024-06-22 19:11:08,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4461608960. Throughput: 0: 42947.2. Samples: 4461665440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:11:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 19:11:11,123][15401] Updated weights for policy 0, policy_version 272320 (0.0027) [2024-06-22 19:11:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4461805568. Throughput: 0: 42790.3. Samples: 4461925380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:11:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 19:11:14,014][15401] Updated weights for policy 0, policy_version 272330 (0.0023) [2024-06-22 19:11:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 4462002176. Throughput: 0: 42998.8. Samples: 4462187940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:11:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 19:11:18,733][15401] Updated weights for policy 0, policy_version 272340 (0.0032) [2024-06-22 19:11:21,842][15401] Updated weights for policy 0, policy_version 272350 (0.0032) [2024-06-22 19:11:23,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 4462231552. Throughput: 0: 42983.3. Samples: 4462309880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 19:11:23,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 19:11:26,151][15401] Updated weights for policy 0, policy_version 272360 (0.0048) [2024-06-22 19:11:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4462444544. Throughput: 0: 42890.5. Samples: 4462572220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 19:11:29,562][15401] Updated weights for policy 0, policy_version 272370 (0.0038) [2024-06-22 19:11:30,385][15349] Signal inference workers to stop experience collection... (66000 times) [2024-06-22 19:11:30,385][15349] Signal inference workers to resume experience collection... (66000 times) [2024-06-22 19:11:30,420][15401] InferenceWorker_p0-w0: stopping experience collection (66000 times) [2024-06-22 19:11:30,420][15401] InferenceWorker_p0-w0: resuming experience collection (66000 times) [2024-06-22 19:11:33,389][15132] Fps is (10 sec: 40968.8, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 4462641152. Throughput: 0: 42958.8. Samples: 4462828660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 19:11:33,814][15401] Updated weights for policy 0, policy_version 272380 (0.0030) [2024-06-22 19:11:37,094][15401] Updated weights for policy 0, policy_version 272390 (0.0033) [2024-06-22 19:11:38,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4462886912. Throughput: 0: 42973.5. Samples: 4462955680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:38,395][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 19:11:41,559][15401] Updated weights for policy 0, policy_version 272400 (0.0027) [2024-06-22 19:11:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 4463083520. Throughput: 0: 42959.9. Samples: 4463212820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 19:11:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272405_4463083520.pth... [2024-06-22 19:11:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000271778_4452810752.pth [2024-06-22 19:11:44,836][15401] Updated weights for policy 0, policy_version 272410 (0.0042) [2024-06-22 19:11:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4463296512. Throughput: 0: 42909.4. Samples: 4463466180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 19:11:49,386][15401] Updated weights for policy 0, policy_version 272420 (0.0037) [2024-06-22 19:11:52,521][15401] Updated weights for policy 0, policy_version 272430 (0.0050) [2024-06-22 19:11:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4463509504. Throughput: 0: 42845.3. Samples: 4463593480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 19:11:56,989][15401] Updated weights for policy 0, policy_version 272440 (0.0028) [2024-06-22 19:11:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 4463722496. Throughput: 0: 42779.1. Samples: 4463850440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:11:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 19:12:00,065][15401] Updated weights for policy 0, policy_version 272450 (0.0041) [2024-06-22 19:12:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4463935488. Throughput: 0: 42540.0. Samples: 4464102240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 19:12:04,505][15401] Updated weights for policy 0, policy_version 272460 (0.0039) [2024-06-22 19:12:07,714][15401] Updated weights for policy 0, policy_version 272470 (0.0045) [2024-06-22 19:12:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4464164864. Throughput: 0: 42627.3. Samples: 4464228020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 19:12:12,448][15401] Updated weights for policy 0, policy_version 272480 (0.0028) [2024-06-22 19:12:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4464361472. Throughput: 0: 42558.8. Samples: 4464487360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:13,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-22 19:12:15,351][15401] Updated weights for policy 0, policy_version 272490 (0.0041) [2024-06-22 19:12:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4464558080. Throughput: 0: 42271.5. Samples: 4464730980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:18,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 19:12:20,330][15401] Updated weights for policy 0, policy_version 272500 (0.0041) [2024-06-22 19:12:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42598.2, 300 sec: 42764.7). Total num frames: 4464787456. Throughput: 0: 42246.1. Samples: 4464856860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:23,393][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 19:12:23,551][15401] Updated weights for policy 0, policy_version 272510 (0.0035) [2024-06-22 19:12:27,824][15401] Updated weights for policy 0, policy_version 272520 (0.0034) [2024-06-22 19:12:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 4464984064. Throughput: 0: 42244.5. Samples: 4465113820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:28,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 19:12:31,157][15401] Updated weights for policy 0, policy_version 272530 (0.0037) [2024-06-22 19:12:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4465213440. Throughput: 0: 42396.9. Samples: 4465374040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 19:12:33,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-22 19:12:35,238][15401] Updated weights for policy 0, policy_version 272540 (0.0037) [2024-06-22 19:12:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4465426432. Throughput: 0: 42474.2. Samples: 4465504820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:12:38,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 19:12:38,781][15401] Updated weights for policy 0, policy_version 272550 (0.0027) [2024-06-22 19:12:43,114][15401] Updated weights for policy 0, policy_version 272560 (0.0031) [2024-06-22 19:12:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4465639424. Throughput: 0: 42438.7. Samples: 4465760180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:12:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 19:12:46,353][15401] Updated weights for policy 0, policy_version 272570 (0.0030) [2024-06-22 19:12:48,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4465852416. Throughput: 0: 42357.8. Samples: 4466008440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:12:48,393][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 19:12:50,665][15401] Updated weights for policy 0, policy_version 272580 (0.0046) [2024-06-22 19:12:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4466065408. Throughput: 0: 42336.9. Samples: 4466133180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:12:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 19:12:54,702][15401] Updated weights for policy 0, policy_version 272590 (0.0023) [2024-06-22 19:12:58,278][15401] Updated weights for policy 0, policy_version 272600 (0.0032) [2024-06-22 19:12:58,394][15132] Fps is (10 sec: 42588.3, 60 sec: 42595.0, 300 sec: 42708.8). Total num frames: 4466278400. Throughput: 0: 42222.2. Samples: 4466387560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:12:58,395][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 19:13:02,302][15401] Updated weights for policy 0, policy_version 272610 (0.0028) [2024-06-22 19:13:03,275][15349] Signal inference workers to stop experience collection... (66050 times) [2024-06-22 19:13:03,277][15349] Signal inference workers to resume experience collection... (66050 times) [2024-06-22 19:13:03,294][15401] InferenceWorker_p0-w0: stopping experience collection (66050 times) [2024-06-22 19:13:03,332][15401] InferenceWorker_p0-w0: resuming experience collection (66050 times) [2024-06-22 19:13:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4466475008. Throughput: 0: 42503.2. Samples: 4466643520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 19:13:05,965][15401] Updated weights for policy 0, policy_version 272620 (0.0032) [2024-06-22 19:13:08,390][15132] Fps is (10 sec: 39340.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 4466671616. Throughput: 0: 42504.9. Samples: 4466769480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:08,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 19:13:09,845][15401] Updated weights for policy 0, policy_version 272630 (0.0021) [2024-06-22 19:13:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 4466900992. Throughput: 0: 42490.5. Samples: 4467025900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 19:13:14,049][15401] Updated weights for policy 0, policy_version 272640 (0.0049) [2024-06-22 19:13:17,538][15401] Updated weights for policy 0, policy_version 272650 (0.0033) [2024-06-22 19:13:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4467130368. Throughput: 0: 42390.6. Samples: 4467281620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 19:13:21,653][15401] Updated weights for policy 0, policy_version 272660 (0.0041) [2024-06-22 19:13:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 4467326976. Throughput: 0: 42480.2. Samples: 4467416420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:23,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-22 19:13:25,558][15401] Updated weights for policy 0, policy_version 272670 (0.0039) [2024-06-22 19:13:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4467523584. Throughput: 0: 42431.1. Samples: 4467669580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 19:13:29,131][15401] Updated weights for policy 0, policy_version 272680 (0.0032) [2024-06-22 19:13:33,115][15401] Updated weights for policy 0, policy_version 272690 (0.0033) [2024-06-22 19:13:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4467769344. Throughput: 0: 42843.2. Samples: 4467936280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 19:13:36,685][15401] Updated weights for policy 0, policy_version 272700 (0.0031) [2024-06-22 19:13:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4467998720. Throughput: 0: 42915.5. Samples: 4468064380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 19:13:40,890][15401] Updated weights for policy 0, policy_version 272710 (0.0037) [2024-06-22 19:13:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4468195328. Throughput: 0: 42792.9. Samples: 4468313040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 19:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272717_4468195328.pth... [2024-06-22 19:13:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272094_4457988096.pth [2024-06-22 19:13:44,578][15401] Updated weights for policy 0, policy_version 272720 (0.0031) [2024-06-22 19:13:48,372][15401] Updated weights for policy 0, policy_version 272730 (0.0043) [2024-06-22 19:13:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4468408320. Throughput: 0: 43018.6. Samples: 4468579360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-06-22 19:13:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 19:13:52,157][15401] Updated weights for policy 0, policy_version 272740 (0.0046) [2024-06-22 19:13:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4468637696. Throughput: 0: 43090.2. Samples: 4468708540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:13:53,394][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 19:13:56,005][15401] Updated weights for policy 0, policy_version 272750 (0.0027) [2024-06-22 19:13:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42601.7, 300 sec: 42820.5). Total num frames: 4468834304. Throughput: 0: 42868.0. Samples: 4468954960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:13:58,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 19:13:59,790][15401] Updated weights for policy 0, policy_version 272760 (0.0024) [2024-06-22 19:14:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4469047296. Throughput: 0: 42903.6. Samples: 4469212280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 19:14:03,587][15401] Updated weights for policy 0, policy_version 272770 (0.0038) [2024-06-22 19:14:07,386][15401] Updated weights for policy 0, policy_version 272780 (0.0025) [2024-06-22 19:14:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 4469276672. Throughput: 0: 42755.4. Samples: 4469340420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 19:14:11,645][15401] Updated weights for policy 0, policy_version 272790 (0.0029) [2024-06-22 19:14:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4469473280. Throughput: 0: 42658.1. Samples: 4469589200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 19:14:15,250][15401] Updated weights for policy 0, policy_version 272800 (0.0034) [2024-06-22 19:14:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 4469669888. Throughput: 0: 42433.7. Samples: 4469845800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:18,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 19:14:19,295][15401] Updated weights for policy 0, policy_version 272810 (0.0029) [2024-06-22 19:14:19,474][15349] Signal inference workers to stop experience collection... (66100 times) [2024-06-22 19:14:19,474][15349] Signal inference workers to resume experience collection... (66100 times) [2024-06-22 19:14:19,512][15401] InferenceWorker_p0-w0: stopping experience collection (66100 times) [2024-06-22 19:14:19,513][15401] InferenceWorker_p0-w0: resuming experience collection (66100 times) [2024-06-22 19:14:22,822][15401] Updated weights for policy 0, policy_version 272820 (0.0024) [2024-06-22 19:14:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4469899264. Throughput: 0: 42525.0. Samples: 4469978000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 19:14:26,798][15401] Updated weights for policy 0, policy_version 272830 (0.0033) [2024-06-22 19:14:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4470112256. Throughput: 0: 42600.6. Samples: 4470230060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 19:14:30,569][15401] Updated weights for policy 0, policy_version 272840 (0.0041) [2024-06-22 19:14:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4470325248. Throughput: 0: 42462.6. Samples: 4470490180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 19:14:34,332][15401] Updated weights for policy 0, policy_version 272850 (0.0025) [2024-06-22 19:14:38,159][15401] Updated weights for policy 0, policy_version 272860 (0.0038) [2024-06-22 19:14:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4470554624. Throughput: 0: 42479.9. Samples: 4470620140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:38,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-22 19:14:41,932][15401] Updated weights for policy 0, policy_version 272870 (0.0034) [2024-06-22 19:14:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4470751232. Throughput: 0: 42754.7. Samples: 4470878920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:43,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-22 19:14:45,788][15401] Updated weights for policy 0, policy_version 272880 (0.0021) [2024-06-22 19:14:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4470964224. Throughput: 0: 42654.1. Samples: 4471131720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:48,394][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 19:14:49,524][15401] Updated weights for policy 0, policy_version 272890 (0.0035) [2024-06-22 19:14:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4471160832. Throughput: 0: 42719.5. Samples: 4471262800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 19:14:53,603][15401] Updated weights for policy 0, policy_version 272900 (0.0036) [2024-06-22 19:14:57,156][15401] Updated weights for policy 0, policy_version 272910 (0.0046) [2024-06-22 19:14:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4471373824. Throughput: 0: 42913.9. Samples: 4471520320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-22 19:14:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 19:15:01,234][15401] Updated weights for policy 0, policy_version 272920 (0.0031) [2024-06-22 19:15:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4471619584. Throughput: 0: 42775.0. Samples: 4471770780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:03,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 19:15:04,715][15401] Updated weights for policy 0, policy_version 272930 (0.0030) [2024-06-22 19:15:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4471816192. Throughput: 0: 42946.0. Samples: 4471910580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 19:15:08,761][15401] Updated weights for policy 0, policy_version 272940 (0.0037) [2024-06-22 19:15:12,110][15401] Updated weights for policy 0, policy_version 272950 (0.0041) [2024-06-22 19:15:13,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4472029184. Throughput: 0: 42942.1. Samples: 4472162460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 19:15:16,354][15401] Updated weights for policy 0, policy_version 272960 (0.0030) [2024-06-22 19:15:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 4472274944. Throughput: 0: 42731.6. Samples: 4472413100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:18,394][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 19:15:20,172][15401] Updated weights for policy 0, policy_version 272970 (0.0038) [2024-06-22 19:15:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4472455168. Throughput: 0: 42886.9. Samples: 4472550040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 19:15:24,278][15401] Updated weights for policy 0, policy_version 272980 (0.0030) [2024-06-22 19:15:27,561][15401] Updated weights for policy 0, policy_version 272990 (0.0034) [2024-06-22 19:15:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4472668160. Throughput: 0: 42865.9. Samples: 4472807880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 19:15:31,773][15401] Updated weights for policy 0, policy_version 273000 (0.0037) [2024-06-22 19:15:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4472897536. Throughput: 0: 42924.6. Samples: 4473063320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 19:15:35,671][15401] Updated weights for policy 0, policy_version 273010 (0.0039) [2024-06-22 19:15:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4473110528. Throughput: 0: 42835.6. Samples: 4473190400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:38,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 19:15:39,454][15401] Updated weights for policy 0, policy_version 273020 (0.0028) [2024-06-22 19:15:43,170][15401] Updated weights for policy 0, policy_version 273030 (0.0040) [2024-06-22 19:15:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4473323520. Throughput: 0: 42714.5. Samples: 4473442580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:43,393][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 19:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273030_4473323520.pth... [2024-06-22 19:15:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272405_4463083520.pth [2024-06-22 19:15:47,096][15401] Updated weights for policy 0, policy_version 273040 (0.0023) [2024-06-22 19:15:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4473536512. Throughput: 0: 42889.5. Samples: 4473700700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 19:15:48,896][15349] Signal inference workers to stop experience collection... (66150 times) [2024-06-22 19:15:48,896][15349] Signal inference workers to resume experience collection... (66150 times) [2024-06-22 19:15:48,939][15401] InferenceWorker_p0-w0: stopping experience collection (66150 times) [2024-06-22 19:15:48,940][15401] InferenceWorker_p0-w0: resuming experience collection (66150 times) [2024-06-22 19:15:50,627][15401] Updated weights for policy 0, policy_version 273050 (0.0032) [2024-06-22 19:15:53,392][15132] Fps is (10 sec: 40959.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 4473733120. Throughput: 0: 42720.4. Samples: 4473833100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:53,393][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 19:15:54,899][15401] Updated weights for policy 0, policy_version 273060 (0.0047) [2024-06-22 19:15:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4473962496. Throughput: 0: 42548.0. Samples: 4474077120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:15:58,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-22 19:15:58,505][15401] Updated weights for policy 0, policy_version 273070 (0.0028) [2024-06-22 19:16:02,479][15401] Updated weights for policy 0, policy_version 273080 (0.0035) [2024-06-22 19:16:03,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 4474175488. Throughput: 0: 42829.9. Samples: 4474340440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:16:03,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-22 19:16:06,109][15401] Updated weights for policy 0, policy_version 273090 (0.0035) [2024-06-22 19:16:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42598.0). Total num frames: 4474372096. Throughput: 0: 42587.4. Samples: 4474466580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:16:08,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 19:16:10,044][15401] Updated weights for policy 0, policy_version 273100 (0.0028) [2024-06-22 19:16:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4474601472. Throughput: 0: 42503.4. Samples: 4474720540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-22 19:16:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 19:16:13,859][15401] Updated weights for policy 0, policy_version 273110 (0.0042) [2024-06-22 19:16:17,649][15401] Updated weights for policy 0, policy_version 273120 (0.0026) [2024-06-22 19:16:18,390][15132] Fps is (10 sec: 44246.0, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 4474814464. Throughput: 0: 42579.2. Samples: 4474979400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 19:16:21,424][15401] Updated weights for policy 0, policy_version 273130 (0.0042) [2024-06-22 19:16:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 4474994688. Throughput: 0: 42628.9. Samples: 4475108700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 19:16:25,259][15401] Updated weights for policy 0, policy_version 273140 (0.0042) [2024-06-22 19:16:28,389][15132] Fps is (10 sec: 44238.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4475256832. Throughput: 0: 42596.1. Samples: 4475359300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 19:16:29,269][15401] Updated weights for policy 0, policy_version 273150 (0.0041) [2024-06-22 19:16:32,756][15401] Updated weights for policy 0, policy_version 273160 (0.0028) [2024-06-22 19:16:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4475453440. Throughput: 0: 42734.1. Samples: 4475623740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 19:16:36,932][15401] Updated weights for policy 0, policy_version 273170 (0.0030) [2024-06-22 19:16:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4475650048. Throughput: 0: 42616.6. Samples: 4475750740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 19:16:40,878][15401] Updated weights for policy 0, policy_version 273180 (0.0026) [2024-06-22 19:16:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4475895808. Throughput: 0: 42966.3. Samples: 4476010600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 19:16:44,532][15401] Updated weights for policy 0, policy_version 273190 (0.0042) [2024-06-22 19:16:48,221][15401] Updated weights for policy 0, policy_version 273200 (0.0043) [2024-06-22 19:16:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4476108800. Throughput: 0: 42813.7. Samples: 4476267160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:48,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 19:16:52,054][15401] Updated weights for policy 0, policy_version 273210 (0.0028) [2024-06-22 19:16:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 4476305408. Throughput: 0: 42901.3. Samples: 4476397040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 19:16:55,857][15401] Updated weights for policy 0, policy_version 273220 (0.0043) [2024-06-22 19:16:58,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 4476534784. Throughput: 0: 43020.4. Samples: 4476656560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:16:58,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 19:16:58,807][15349] Signal inference workers to stop experience collection... (66200 times) [2024-06-22 19:16:58,807][15349] Signal inference workers to resume experience collection... (66200 times) [2024-06-22 19:16:58,844][15401] InferenceWorker_p0-w0: stopping experience collection (66200 times) [2024-06-22 19:16:58,849][15401] InferenceWorker_p0-w0: resuming experience collection (66200 times) [2024-06-22 19:16:59,670][15401] Updated weights for policy 0, policy_version 273230 (0.0036) [2024-06-22 19:17:03,357][15401] Updated weights for policy 0, policy_version 273240 (0.0039) [2024-06-22 19:17:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4476764160. Throughput: 0: 43052.8. Samples: 4476916760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:17:03,391][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 19:17:07,065][15401] Updated weights for policy 0, policy_version 273250 (0.0049) [2024-06-22 19:17:08,390][15132] Fps is (10 sec: 42608.0, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 4476960768. Throughput: 0: 43019.9. Samples: 4477044600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:17:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 19:17:11,035][15401] Updated weights for policy 0, policy_version 273260 (0.0036) [2024-06-22 19:17:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4477190144. Throughput: 0: 43139.9. Samples: 4477300600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:17:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 19:17:14,605][15401] Updated weights for policy 0, policy_version 273270 (0.0036) [2024-06-22 19:17:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.7, 300 sec: 42765.3). Total num frames: 4477403136. Throughput: 0: 43154.6. Samples: 4477565700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:17:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 19:17:18,553][15401] Updated weights for policy 0, policy_version 273280 (0.0039) [2024-06-22 19:17:22,097][15401] Updated weights for policy 0, policy_version 273290 (0.0033) [2024-06-22 19:17:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 4477599744. Throughput: 0: 43150.7. Samples: 4477692520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 19:17:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 19:17:26,338][15401] Updated weights for policy 0, policy_version 273300 (0.0027) [2024-06-22 19:17:28,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4477829120. Throughput: 0: 43088.0. Samples: 4477949560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 19:17:29,916][15401] Updated weights for policy 0, policy_version 273310 (0.0024) [2024-06-22 19:17:33,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 4478042112. Throughput: 0: 43269.0. Samples: 4478214440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:33,396][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 19:17:33,866][15401] Updated weights for policy 0, policy_version 273320 (0.0033) [2024-06-22 19:17:37,490][15401] Updated weights for policy 0, policy_version 273330 (0.0038) [2024-06-22 19:17:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4478255104. Throughput: 0: 43154.8. Samples: 4478339000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:17:41,637][15401] Updated weights for policy 0, policy_version 273340 (0.0036) [2024-06-22 19:17:43,390][15132] Fps is (10 sec: 44264.5, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 4478484480. Throughput: 0: 43176.4. Samples: 4478599400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 19:17:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273345_4478484480.pth... [2024-06-22 19:17:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000272717_4468195328.pth [2024-06-22 19:17:44,897][15401] Updated weights for policy 0, policy_version 273350 (0.0042) [2024-06-22 19:17:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 4478697472. Throughput: 0: 43116.7. Samples: 4478857120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:48,393][15132] Avg episode reward: [(0, '0.832')] [2024-06-22 19:17:49,104][15401] Updated weights for policy 0, policy_version 273360 (0.0024) [2024-06-22 19:17:52,675][15401] Updated weights for policy 0, policy_version 273370 (0.0028) [2024-06-22 19:17:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 42821.3). Total num frames: 4478910464. Throughput: 0: 43073.9. Samples: 4478982920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 19:17:57,059][15401] Updated weights for policy 0, policy_version 273380 (0.0038) [2024-06-22 19:17:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 4479123456. Throughput: 0: 43125.3. Samples: 4479241240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:17:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 19:18:00,235][15401] Updated weights for policy 0, policy_version 273390 (0.0035) [2024-06-22 19:18:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4479320064. Throughput: 0: 42910.8. Samples: 4479496680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 19:18:04,536][15401] Updated weights for policy 0, policy_version 273400 (0.0046) [2024-06-22 19:18:07,996][15401] Updated weights for policy 0, policy_version 273410 (0.0038) [2024-06-22 19:18:08,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 4479549440. Throughput: 0: 42965.2. Samples: 4479626060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:08,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 19:18:12,258][15401] Updated weights for policy 0, policy_version 273420 (0.0038) [2024-06-22 19:18:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4479762432. Throughput: 0: 43021.3. Samples: 4479885520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 19:18:15,677][15401] Updated weights for policy 0, policy_version 273430 (0.0040) [2024-06-22 19:18:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4479975424. Throughput: 0: 42772.7. Samples: 4480138940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:18,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 19:18:19,827][15401] Updated weights for policy 0, policy_version 273440 (0.0037) [2024-06-22 19:18:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4480188416. Throughput: 0: 42884.5. Samples: 4480268800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 19:18:23,456][15401] Updated weights for policy 0, policy_version 273450 (0.0023) [2024-06-22 19:18:27,582][15401] Updated weights for policy 0, policy_version 273460 (0.0027) [2024-06-22 19:18:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4480401408. Throughput: 0: 42858.8. Samples: 4480528040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:28,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 19:18:31,311][15401] Updated weights for policy 0, policy_version 273470 (0.0028) [2024-06-22 19:18:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 4480614400. Throughput: 0: 42813.4. Samples: 4480783620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:18:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 19:18:35,170][15401] Updated weights for policy 0, policy_version 273480 (0.0039) [2024-06-22 19:18:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4480827392. Throughput: 0: 42664.0. Samples: 4480902800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:18:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 19:18:38,942][15401] Updated weights for policy 0, policy_version 273490 (0.0032) [2024-06-22 19:18:42,660][15401] Updated weights for policy 0, policy_version 273500 (0.0042) [2024-06-22 19:18:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4481040384. Throughput: 0: 42633.3. Samples: 4481159740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:18:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 19:18:46,665][15401] Updated weights for policy 0, policy_version 273510 (0.0040) [2024-06-22 19:18:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 4481253376. Throughput: 0: 42700.9. Samples: 4481418320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:18:48,393][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 19:18:50,420][15401] Updated weights for policy 0, policy_version 273520 (0.0044) [2024-06-22 19:18:52,342][15349] Signal inference workers to stop experience collection... (66250 times) [2024-06-22 19:18:52,396][15401] InferenceWorker_p0-w0: stopping experience collection (66250 times) [2024-06-22 19:18:52,468][15349] Signal inference workers to resume experience collection... (66250 times) [2024-06-22 19:18:52,468][15401] InferenceWorker_p0-w0: resuming experience collection (66250 times) [2024-06-22 19:18:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4481466368. Throughput: 0: 42657.4. Samples: 4481545540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:18:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 19:18:54,121][15401] Updated weights for policy 0, policy_version 273530 (0.0033) [2024-06-22 19:18:58,281][15401] Updated weights for policy 0, policy_version 273540 (0.0027) [2024-06-22 19:18:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4481679360. Throughput: 0: 42524.1. Samples: 4481799100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:18:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 19:19:01,693][15401] Updated weights for policy 0, policy_version 273550 (0.0045) [2024-06-22 19:19:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4481892352. Throughput: 0: 42609.0. Samples: 4482056340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 19:19:05,748][15401] Updated weights for policy 0, policy_version 273560 (0.0043) [2024-06-22 19:19:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 4482088960. Throughput: 0: 42540.1. Samples: 4482183100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 19:19:09,503][15401] Updated weights for policy 0, policy_version 273570 (0.0032) [2024-06-22 19:19:13,283][15401] Updated weights for policy 0, policy_version 273580 (0.0032) [2024-06-22 19:19:13,393][15132] Fps is (10 sec: 44218.9, 60 sec: 42868.7, 300 sec: 42931.1). Total num frames: 4482334720. Throughput: 0: 42420.7. Samples: 4482437140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:13,394][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 19:19:17,263][15401] Updated weights for policy 0, policy_version 273590 (0.0041) [2024-06-22 19:19:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4482531328. Throughput: 0: 42561.8. Samples: 4482698900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 19:19:20,827][15401] Updated weights for policy 0, policy_version 273600 (0.0032) [2024-06-22 19:19:23,390][15132] Fps is (10 sec: 39336.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4482727936. Throughput: 0: 42547.0. Samples: 4482817420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 19:19:25,124][15401] Updated weights for policy 0, policy_version 273610 (0.0029) [2024-06-22 19:19:28,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 4482973696. Throughput: 0: 42582.3. Samples: 4483076040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:28,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 19:19:28,708][15401] Updated weights for policy 0, policy_version 273620 (0.0049) [2024-06-22 19:19:32,706][15401] Updated weights for policy 0, policy_version 273630 (0.0023) [2024-06-22 19:19:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4483153920. Throughput: 0: 42601.3. Samples: 4483335280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:33,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 19:19:36,113][15401] Updated weights for policy 0, policy_version 273640 (0.0030) [2024-06-22 19:19:38,390][15132] Fps is (10 sec: 39330.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4483366912. Throughput: 0: 42534.5. Samples: 4483459600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:38,396][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:19:40,445][15401] Updated weights for policy 0, policy_version 273650 (0.0046) [2024-06-22 19:19:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4483629056. Throughput: 0: 42738.6. Samples: 4483722340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 19:19:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273660_4483645440.pth... [2024-06-22 19:19:43,508][15401] Updated weights for policy 0, policy_version 273660 (0.0032) [2024-06-22 19:19:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273030_4473323520.pth [2024-06-22 19:19:48,376][15401] Updated weights for policy 0, policy_version 273670 (0.0040) [2024-06-22 19:19:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 4483809280. Throughput: 0: 42740.4. Samples: 4483979660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-22 19:19:48,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 19:19:51,330][15401] Updated weights for policy 0, policy_version 273680 (0.0023) [2024-06-22 19:19:53,391][15132] Fps is (10 sec: 39314.3, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 4484022272. Throughput: 0: 42687.4. Samples: 4484104120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:19:53,392][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 19:19:56,046][15401] Updated weights for policy 0, policy_version 273690 (0.0035) [2024-06-22 19:19:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 4484268032. Throughput: 0: 42901.5. Samples: 4484367540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:19:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 19:19:58,998][15401] Updated weights for policy 0, policy_version 273700 (0.0023) [2024-06-22 19:20:03,390][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 4484448256. Throughput: 0: 42662.2. Samples: 4484618700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 19:20:03,748][15401] Updated weights for policy 0, policy_version 273710 (0.0037) [2024-06-22 19:20:07,118][15401] Updated weights for policy 0, policy_version 273720 (0.0033) [2024-06-22 19:20:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4484661248. Throughput: 0: 42683.3. Samples: 4484738160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 19:20:11,321][15401] Updated weights for policy 0, policy_version 273730 (0.0038) [2024-06-22 19:20:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42601.2, 300 sec: 42765.0). Total num frames: 4484890624. Throughput: 0: 42842.8. Samples: 4485003860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:13,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-22 19:20:14,646][15401] Updated weights for policy 0, policy_version 273740 (0.0033) [2024-06-22 19:20:18,149][15349] Signal inference workers to stop experience collection... (66300 times) [2024-06-22 19:20:18,153][15349] Signal inference workers to resume experience collection... (66300 times) [2024-06-22 19:20:18,167][15401] InferenceWorker_p0-w0: stopping experience collection (66300 times) [2024-06-22 19:20:18,167][15401] InferenceWorker_p0-w0: resuming experience collection (66300 times) [2024-06-22 19:20:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4485087232. Throughput: 0: 42961.4. Samples: 4485268540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:18,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 19:20:18,775][15401] Updated weights for policy 0, policy_version 273750 (0.0028) [2024-06-22 19:20:22,076][15401] Updated weights for policy 0, policy_version 273760 (0.0040) [2024-06-22 19:20:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4485300224. Throughput: 0: 42877.0. Samples: 4485389060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 19:20:26,207][15401] Updated weights for policy 0, policy_version 273770 (0.0053) [2024-06-22 19:20:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 4485545984. Throughput: 0: 42869.9. Samples: 4485651480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 19:20:30,153][15401] Updated weights for policy 0, policy_version 273780 (0.0029) [2024-06-22 19:20:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4485742592. Throughput: 0: 42900.9. Samples: 4485910200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 19:20:33,863][15401] Updated weights for policy 0, policy_version 273790 (0.0039) [2024-06-22 19:20:37,684][15401] Updated weights for policy 0, policy_version 273800 (0.0025) [2024-06-22 19:20:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 4485955584. Throughput: 0: 42827.5. Samples: 4486031280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 19:20:41,529][15401] Updated weights for policy 0, policy_version 273810 (0.0024) [2024-06-22 19:20:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4486168576. Throughput: 0: 42622.3. Samples: 4486285540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 19:20:45,374][15401] Updated weights for policy 0, policy_version 273820 (0.0044) [2024-06-22 19:20:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 4486365184. Throughput: 0: 42863.7. Samples: 4486547560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 19:20:49,097][15401] Updated weights for policy 0, policy_version 273830 (0.0032) [2024-06-22 19:20:53,267][15401] Updated weights for policy 0, policy_version 273840 (0.0032) [2024-06-22 19:20:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 4486594560. Throughput: 0: 43010.5. Samples: 4486673640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 19:20:57,079][15401] Updated weights for policy 0, policy_version 273850 (0.0029) [2024-06-22 19:20:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4486823936. Throughput: 0: 42774.2. Samples: 4486928700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-22 19:20:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 19:21:00,854][15401] Updated weights for policy 0, policy_version 273860 (0.0036) [2024-06-22 19:21:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 4487004160. Throughput: 0: 42749.8. Samples: 4487192280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 19:21:04,483][15401] Updated weights for policy 0, policy_version 273870 (0.0023) [2024-06-22 19:21:08,313][15401] Updated weights for policy 0, policy_version 273880 (0.0042) [2024-06-22 19:21:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4487249920. Throughput: 0: 42827.9. Samples: 4487316320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 19:21:12,067][15401] Updated weights for policy 0, policy_version 273890 (0.0030) [2024-06-22 19:21:13,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4487462912. Throughput: 0: 42779.3. Samples: 4487576560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 19:21:16,326][15401] Updated weights for policy 0, policy_version 273900 (0.0045) [2024-06-22 19:21:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4487659520. Throughput: 0: 42751.5. Samples: 4487834020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 19:21:19,739][15401] Updated weights for policy 0, policy_version 273910 (0.0027) [2024-06-22 19:21:21,351][15349] Signal inference workers to stop experience collection... (66350 times) [2024-06-22 19:21:21,379][15401] InferenceWorker_p0-w0: stopping experience collection (66350 times) [2024-06-22 19:21:21,414][15349] Signal inference workers to resume experience collection... (66350 times) [2024-06-22 19:21:21,415][15401] InferenceWorker_p0-w0: resuming experience collection (66350 times) [2024-06-22 19:21:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4487872512. Throughput: 0: 42863.6. Samples: 4487960140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 19:21:23,962][15401] Updated weights for policy 0, policy_version 273920 (0.0033) [2024-06-22 19:21:27,435][15401] Updated weights for policy 0, policy_version 273930 (0.0034) [2024-06-22 19:21:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4488085504. Throughput: 0: 42801.3. Samples: 4488211600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:28,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 19:21:31,626][15401] Updated weights for policy 0, policy_version 273940 (0.0030) [2024-06-22 19:21:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4488282112. Throughput: 0: 42862.6. Samples: 4488476380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 19:21:35,275][15401] Updated weights for policy 0, policy_version 273950 (0.0033) [2024-06-22 19:21:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4488527872. Throughput: 0: 42785.1. Samples: 4488598960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 19:21:39,187][15401] Updated weights for policy 0, policy_version 273960 (0.0035) [2024-06-22 19:21:43,178][15401] Updated weights for policy 0, policy_version 273970 (0.0038) [2024-06-22 19:21:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4488740864. Throughput: 0: 42872.7. Samples: 4488857980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 19:21:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273971_4488740864.pth... [2024-06-22 19:21:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273345_4478484480.pth [2024-06-22 19:21:46,842][15401] Updated weights for policy 0, policy_version 273980 (0.0033) [2024-06-22 19:21:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4488937472. Throughput: 0: 42757.2. Samples: 4489116360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 19:21:50,808][15401] Updated weights for policy 0, policy_version 273990 (0.0036) [2024-06-22 19:21:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 4489150464. Throughput: 0: 42829.0. Samples: 4489243620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 19:21:54,497][15401] Updated weights for policy 0, policy_version 274000 (0.0025) [2024-06-22 19:21:58,259][15401] Updated weights for policy 0, policy_version 274010 (0.0035) [2024-06-22 19:21:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4489379840. Throughput: 0: 42721.6. Samples: 4489499020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:21:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 19:22:02,309][15401] Updated weights for policy 0, policy_version 274020 (0.0044) [2024-06-22 19:22:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 4489592832. Throughput: 0: 42796.0. Samples: 4489759940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:22:03,393][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 19:22:05,871][15401] Updated weights for policy 0, policy_version 274030 (0.0032) [2024-06-22 19:22:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4489805824. Throughput: 0: 42857.3. Samples: 4489888720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:22:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 19:22:09,906][15401] Updated weights for policy 0, policy_version 274040 (0.0039) [2024-06-22 19:22:13,379][15401] Updated weights for policy 0, policy_version 274050 (0.0033) [2024-06-22 19:22:13,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4490035200. Throughput: 0: 42895.6. Samples: 4490141900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 19:22:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 19:22:17,457][15401] Updated weights for policy 0, policy_version 274060 (0.0028) [2024-06-22 19:22:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4490231808. Throughput: 0: 42812.4. Samples: 4490402940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 19:22:20,903][15401] Updated weights for policy 0, policy_version 274070 (0.0027) [2024-06-22 19:22:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4490444800. Throughput: 0: 42855.7. Samples: 4490527480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 19:22:25,220][15401] Updated weights for policy 0, policy_version 274080 (0.0044) [2024-06-22 19:22:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 4490674176. Throughput: 0: 42748.6. Samples: 4490781660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 19:22:28,414][15401] Updated weights for policy 0, policy_version 274090 (0.0029) [2024-06-22 19:22:32,796][15401] Updated weights for policy 0, policy_version 274100 (0.0037) [2024-06-22 19:22:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4490854400. Throughput: 0: 42783.6. Samples: 4491041620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:33,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 19:22:36,083][15401] Updated weights for policy 0, policy_version 274110 (0.0035) [2024-06-22 19:22:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42654.0). Total num frames: 4491067392. Throughput: 0: 42624.8. Samples: 4491161740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 19:22:40,339][15401] Updated weights for policy 0, policy_version 274120 (0.0028) [2024-06-22 19:22:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 4491296768. Throughput: 0: 42791.5. Samples: 4491424640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 19:22:43,751][15401] Updated weights for policy 0, policy_version 274130 (0.0047) [2024-06-22 19:22:47,903][15401] Updated weights for policy 0, policy_version 274140 (0.0029) [2024-06-22 19:22:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4491526144. Throughput: 0: 42703.7. Samples: 4491681500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 19:22:51,566][15401] Updated weights for policy 0, policy_version 274150 (0.0035) [2024-06-22 19:22:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4491722752. Throughput: 0: 42720.9. Samples: 4491811160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 19:22:53,596][15349] Signal inference workers to stop experience collection... (66400 times) [2024-06-22 19:22:53,604][15349] Signal inference workers to resume experience collection... (66400 times) [2024-06-22 19:22:53,615][15401] InferenceWorker_p0-w0: stopping experience collection (66400 times) [2024-06-22 19:22:53,643][15401] InferenceWorker_p0-w0: resuming experience collection (66400 times) [2024-06-22 19:22:55,499][15401] Updated weights for policy 0, policy_version 274160 (0.0035) [2024-06-22 19:22:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4491935744. Throughput: 0: 42880.4. Samples: 4492071520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:22:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 19:22:59,082][15401] Updated weights for policy 0, policy_version 274170 (0.0036) [2024-06-22 19:23:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 4492132352. Throughput: 0: 42763.9. Samples: 4492327320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:23:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 19:23:03,688][15401] Updated weights for policy 0, policy_version 274180 (0.0049) [2024-06-22 19:23:06,726][15401] Updated weights for policy 0, policy_version 274190 (0.0030) [2024-06-22 19:23:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4492378112. Throughput: 0: 42733.1. Samples: 4492450460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:23:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 19:23:11,263][15401] Updated weights for policy 0, policy_version 274200 (0.0024) [2024-06-22 19:23:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4492591104. Throughput: 0: 42958.2. Samples: 4492714780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:23:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 19:23:14,423][15401] Updated weights for policy 0, policy_version 274210 (0.0036) [2024-06-22 19:23:18,390][15132] Fps is (10 sec: 39320.1, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 4492771328. Throughput: 0: 42744.1. Samples: 4492965120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:23:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 19:23:19,130][15401] Updated weights for policy 0, policy_version 274220 (0.0040) [2024-06-22 19:23:22,127][15401] Updated weights for policy 0, policy_version 274230 (0.0040) [2024-06-22 19:23:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4493017088. Throughput: 0: 42847.1. Samples: 4493089860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:23:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 19:23:26,786][15401] Updated weights for policy 0, policy_version 274240 (0.0030) [2024-06-22 19:23:28,389][15132] Fps is (10 sec: 45876.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4493230080. Throughput: 0: 42766.7. Samples: 4493349140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 19:23:29,660][15401] Updated weights for policy 0, policy_version 274250 (0.0028) [2024-06-22 19:23:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4493410304. Throughput: 0: 42866.9. Samples: 4493610520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:33,399][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 19:23:34,527][15401] Updated weights for policy 0, policy_version 274260 (0.0037) [2024-06-22 19:23:37,163][15401] Updated weights for policy 0, policy_version 274270 (0.0026) [2024-06-22 19:23:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4493656064. Throughput: 0: 42725.9. Samples: 4493733820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 19:23:41,990][15401] Updated weights for policy 0, policy_version 274280 (0.0026) [2024-06-22 19:23:43,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4493885440. Throughput: 0: 42737.3. Samples: 4493994700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 19:23:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274285_4493885440.pth... [2024-06-22 19:23:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273660_4483645440.pth [2024-06-22 19:23:44,644][15401] Updated weights for policy 0, policy_version 274290 (0.0031) [2024-06-22 19:23:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4494065664. Throughput: 0: 42795.2. Samples: 4494253100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:48,404][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 19:23:49,840][15401] Updated weights for policy 0, policy_version 274300 (0.0049) [2024-06-22 19:23:52,274][15401] Updated weights for policy 0, policy_version 274310 (0.0031) [2024-06-22 19:23:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4494311424. Throughput: 0: 42746.6. Samples: 4494374060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 19:23:57,424][15401] Updated weights for policy 0, policy_version 274320 (0.0034) [2024-06-22 19:23:58,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4494540800. Throughput: 0: 42920.4. Samples: 4494646200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:23:58,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 19:23:59,823][15401] Updated weights for policy 0, policy_version 274330 (0.0039) [2024-06-22 19:24:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4494721024. Throughput: 0: 42990.0. Samples: 4494899660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 19:24:05,135][15401] Updated weights for policy 0, policy_version 274340 (0.0029) [2024-06-22 19:24:05,461][15349] Signal inference workers to stop experience collection... (66450 times) [2024-06-22 19:24:05,513][15401] InferenceWorker_p0-w0: stopping experience collection (66450 times) [2024-06-22 19:24:05,518][15349] Signal inference workers to resume experience collection... (66450 times) [2024-06-22 19:24:05,529][15401] InferenceWorker_p0-w0: resuming experience collection (66450 times) [2024-06-22 19:24:07,516][15401] Updated weights for policy 0, policy_version 274350 (0.0039) [2024-06-22 19:24:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 4494966784. Throughput: 0: 42958.3. Samples: 4495022980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:08,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 19:24:12,722][15401] Updated weights for policy 0, policy_version 274360 (0.0038) [2024-06-22 19:24:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4495163392. Throughput: 0: 43167.5. Samples: 4495291680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 19:24:15,177][15401] Updated weights for policy 0, policy_version 274370 (0.0048) [2024-06-22 19:24:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.8, 300 sec: 42820.6). Total num frames: 4495360000. Throughput: 0: 43043.2. Samples: 4495547460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 19:24:20,225][15401] Updated weights for policy 0, policy_version 274380 (0.0032) [2024-06-22 19:24:22,905][15401] Updated weights for policy 0, policy_version 274390 (0.0033) [2024-06-22 19:24:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 4495605760. Throughput: 0: 43011.6. Samples: 4495669340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 19:24:27,625][15401] Updated weights for policy 0, policy_version 274400 (0.0026) [2024-06-22 19:24:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4495818752. Throughput: 0: 43172.4. Samples: 4495937460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 19:24:30,569][15401] Updated weights for policy 0, policy_version 274410 (0.0035) [2024-06-22 19:24:33,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 4495998976. Throughput: 0: 43229.7. Samples: 4496198540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:24:33,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 19:24:35,134][15401] Updated weights for policy 0, policy_version 274420 (0.0033) [2024-06-22 19:24:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4496244736. Throughput: 0: 43139.5. Samples: 4496315340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:24:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 19:24:38,455][15401] Updated weights for policy 0, policy_version 274430 (0.0041) [2024-06-22 19:24:42,764][15401] Updated weights for policy 0, policy_version 274440 (0.0029) [2024-06-22 19:24:43,390][15132] Fps is (10 sec: 47524.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4496474112. Throughput: 0: 42982.7. Samples: 4496580420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:24:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 19:24:45,928][15401] Updated weights for policy 0, policy_version 274450 (0.0027) [2024-06-22 19:24:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 4496637952. Throughput: 0: 43026.7. Samples: 4496835860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:24:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 19:24:50,340][15401] Updated weights for policy 0, policy_version 274460 (0.0033) [2024-06-22 19:24:53,390][15132] Fps is (10 sec: 42594.7, 60 sec: 43143.9, 300 sec: 42820.4). Total num frames: 4496900096. Throughput: 0: 42991.9. Samples: 4496957660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:24:53,391][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 19:24:53,714][15401] Updated weights for policy 0, policy_version 274470 (0.0035) [2024-06-22 19:24:57,900][15401] Updated weights for policy 0, policy_version 274480 (0.0036) [2024-06-22 19:24:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4497096704. Throughput: 0: 42956.1. Samples: 4497224700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:24:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 19:25:01,211][15401] Updated weights for policy 0, policy_version 274490 (0.0037) [2024-06-22 19:25:03,390][15132] Fps is (10 sec: 39324.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4497293312. Throughput: 0: 42795.0. Samples: 4497473240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:03,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-22 19:25:05,567][15401] Updated weights for policy 0, policy_version 274500 (0.0030) [2024-06-22 19:25:06,672][15349] Signal inference workers to stop experience collection... (66500 times) [2024-06-22 19:25:06,726][15401] InferenceWorker_p0-w0: stopping experience collection (66500 times) [2024-06-22 19:25:06,734][15349] Signal inference workers to resume experience collection... (66500 times) [2024-06-22 19:25:06,745][15401] InferenceWorker_p0-w0: resuming experience collection (66500 times) [2024-06-22 19:25:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4497522688. Throughput: 0: 42820.4. Samples: 4497596260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:08,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 19:25:09,378][15401] Updated weights for policy 0, policy_version 274510 (0.0044) [2024-06-22 19:25:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4497702912. Throughput: 0: 42576.5. Samples: 4497853400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 19:25:13,580][15401] Updated weights for policy 0, policy_version 274520 (0.0033) [2024-06-22 19:25:16,910][15401] Updated weights for policy 0, policy_version 274530 (0.0027) [2024-06-22 19:25:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4497932288. Throughput: 0: 42285.3. Samples: 4498101280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:18,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-22 19:25:21,727][15401] Updated weights for policy 0, policy_version 274540 (0.0039) [2024-06-22 19:25:23,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4498161664. Throughput: 0: 42593.4. Samples: 4498232140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:23,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-22 19:25:24,470][15401] Updated weights for policy 0, policy_version 274550 (0.0027) [2024-06-22 19:25:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4498358272. Throughput: 0: 42452.1. Samples: 4498490760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 19:25:29,305][15401] Updated weights for policy 0, policy_version 274560 (0.0039) [2024-06-22 19:25:32,270][15401] Updated weights for policy 0, policy_version 274570 (0.0033) [2024-06-22 19:25:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4498571264. Throughput: 0: 42276.6. Samples: 4498738300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:33,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 19:25:37,040][15401] Updated weights for policy 0, policy_version 274580 (0.0046) [2024-06-22 19:25:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4498800640. Throughput: 0: 42460.4. Samples: 4498868340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 19:25:39,847][15401] Updated weights for policy 0, policy_version 274590 (0.0028) [2024-06-22 19:25:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 4498980864. Throughput: 0: 42227.1. Samples: 4499124920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 19:25:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274597_4498997248.pth... [2024-06-22 19:25:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000273971_4488740864.pth [2024-06-22 19:25:44,678][15401] Updated weights for policy 0, policy_version 274600 (0.0042) [2024-06-22 19:25:47,683][15401] Updated weights for policy 0, policy_version 274610 (0.0023) [2024-06-22 19:25:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4499226624. Throughput: 0: 42166.9. Samples: 4499370740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-22 19:25:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 19:25:52,325][15401] Updated weights for policy 0, policy_version 274620 (0.0048) [2024-06-22 19:25:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.9, 300 sec: 42709.5). Total num frames: 4499423232. Throughput: 0: 42449.3. Samples: 4499506480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:25:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 19:25:55,242][15401] Updated weights for policy 0, policy_version 274630 (0.0037) [2024-06-22 19:25:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4499619840. Throughput: 0: 42284.5. Samples: 4499756200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:25:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 19:26:00,267][15401] Updated weights for policy 0, policy_version 274640 (0.0030) [2024-06-22 19:26:03,296][15401] Updated weights for policy 0, policy_version 274650 (0.0040) [2024-06-22 19:26:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4499865600. Throughput: 0: 42279.1. Samples: 4500003840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 19:26:07,961][15401] Updated weights for policy 0, policy_version 274660 (0.0033) [2024-06-22 19:26:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4500045824. Throughput: 0: 42392.5. Samples: 4500139700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:08,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 19:26:11,060][15401] Updated weights for policy 0, policy_version 274670 (0.0029) [2024-06-22 19:26:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4500275200. Throughput: 0: 42286.0. Samples: 4500393640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:13,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-22 19:26:15,544][15401] Updated weights for policy 0, policy_version 274680 (0.0035) [2024-06-22 19:26:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4500488192. Throughput: 0: 42319.6. Samples: 4500642680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:26:18,617][15401] Updated weights for policy 0, policy_version 274690 (0.0035) [2024-06-22 19:26:23,265][15401] Updated weights for policy 0, policy_version 274700 (0.0037) [2024-06-22 19:26:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 4500684800. Throughput: 0: 42381.7. Samples: 4500775520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 19:26:26,441][15401] Updated weights for policy 0, policy_version 274710 (0.0037) [2024-06-22 19:26:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4500914176. Throughput: 0: 42349.3. Samples: 4501030640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 19:26:30,623][15401] Updated weights for policy 0, policy_version 274720 (0.0029) [2024-06-22 19:26:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4501127168. Throughput: 0: 42625.8. Samples: 4501288900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 19:26:34,046][15401] Updated weights for policy 0, policy_version 274730 (0.0029) [2024-06-22 19:26:37,987][15401] Updated weights for policy 0, policy_version 274740 (0.0037) [2024-06-22 19:26:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4501340160. Throughput: 0: 42453.8. Samples: 4501416900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:38,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 19:26:41,395][15401] Updated weights for policy 0, policy_version 274750 (0.0036) [2024-06-22 19:26:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4501569536. Throughput: 0: 42717.3. Samples: 4501678480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:43,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 19:26:45,328][15349] Signal inference workers to stop experience collection... (66550 times) [2024-06-22 19:26:45,328][15349] Signal inference workers to resume experience collection... (66550 times) [2024-06-22 19:26:45,366][15401] InferenceWorker_p0-w0: stopping experience collection (66550 times) [2024-06-22 19:26:45,366][15401] InferenceWorker_p0-w0: resuming experience collection (66550 times) [2024-06-22 19:26:45,469][15401] Updated weights for policy 0, policy_version 274760 (0.0034) [2024-06-22 19:26:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4501782528. Throughput: 0: 42928.0. Samples: 4501935700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:48,393][15132] Avg episode reward: [(0, '0.853')] [2024-06-22 19:26:49,115][15401] Updated weights for policy 0, policy_version 274770 (0.0041) [2024-06-22 19:26:53,267][15401] Updated weights for policy 0, policy_version 274780 (0.0031) [2024-06-22 19:26:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4501995520. Throughput: 0: 42877.7. Samples: 4502069200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 19:26:56,693][15401] Updated weights for policy 0, policy_version 274790 (0.0028) [2024-06-22 19:26:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 4502208512. Throughput: 0: 42854.8. Samples: 4502322100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-22 19:26:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 19:27:01,441][15401] Updated weights for policy 0, policy_version 274800 (0.0037) [2024-06-22 19:27:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4502421504. Throughput: 0: 43167.4. Samples: 4502585220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 19:27:04,105][15401] Updated weights for policy 0, policy_version 274810 (0.0027) [2024-06-22 19:27:08,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 4502618112. Throughput: 0: 43077.0. Samples: 4502714260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:08,397][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 19:27:08,931][15401] Updated weights for policy 0, policy_version 274820 (0.0034) [2024-06-22 19:27:11,965][15401] Updated weights for policy 0, policy_version 274830 (0.0027) [2024-06-22 19:27:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4502863872. Throughput: 0: 43025.4. Samples: 4502966780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 19:27:16,879][15401] Updated weights for policy 0, policy_version 274840 (0.0040) [2024-06-22 19:27:18,394][15132] Fps is (10 sec: 44245.7, 60 sec: 42868.3, 300 sec: 42764.4). Total num frames: 4503060480. Throughput: 0: 43130.8. Samples: 4503229980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:18,394][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 19:27:19,436][15401] Updated weights for policy 0, policy_version 274850 (0.0040) [2024-06-22 19:27:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 4503257088. Throughput: 0: 43193.3. Samples: 4503360700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:23,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 19:27:24,305][15401] Updated weights for policy 0, policy_version 274860 (0.0030) [2024-06-22 19:27:27,173][15401] Updated weights for policy 0, policy_version 274870 (0.0034) [2024-06-22 19:27:28,392][15132] Fps is (10 sec: 45884.3, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 4503519232. Throughput: 0: 42962.1. Samples: 4503611880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:28,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 19:27:31,777][15401] Updated weights for policy 0, policy_version 274880 (0.0023) [2024-06-22 19:27:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4503699456. Throughput: 0: 43176.5. Samples: 4503878540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:27:34,859][15401] Updated weights for policy 0, policy_version 274890 (0.0029) [2024-06-22 19:27:38,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4503912448. Throughput: 0: 42939.2. Samples: 4504001460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:27:39,404][15401] Updated weights for policy 0, policy_version 274900 (0.0033) [2024-06-22 19:27:42,468][15401] Updated weights for policy 0, policy_version 274910 (0.0030) [2024-06-22 19:27:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4504174592. Throughput: 0: 43189.3. Samples: 4504265620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:43,396][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 19:27:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274913_4504174592.pth... [2024-06-22 19:27:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274285_4493885440.pth [2024-06-22 19:27:46,988][15401] Updated weights for policy 0, policy_version 274920 (0.0024) [2024-06-22 19:27:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4504354816. Throughput: 0: 43151.3. Samples: 4504527020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 19:27:49,987][15401] Updated weights for policy 0, policy_version 274930 (0.0031) [2024-06-22 19:27:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4504551424. Throughput: 0: 42932.3. Samples: 4504645940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 19:27:54,515][15401] Updated weights for policy 0, policy_version 274940 (0.0038) [2024-06-22 19:27:57,568][15401] Updated weights for policy 0, policy_version 274950 (0.0027) [2024-06-22 19:27:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4504797184. Throughput: 0: 43248.1. Samples: 4504912940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:27:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 19:28:01,986][15401] Updated weights for policy 0, policy_version 274960 (0.0028) [2024-06-22 19:28:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4504993792. Throughput: 0: 43089.1. Samples: 4505168800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:28:03,391][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 19:28:05,353][15401] Updated weights for policy 0, policy_version 274970 (0.0030) [2024-06-22 19:28:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 4505206784. Throughput: 0: 43026.3. Samples: 4505296780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:28:08,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-22 19:28:09,317][15401] Updated weights for policy 0, policy_version 274980 (0.0033) [2024-06-22 19:28:10,898][15349] Signal inference workers to stop experience collection... (66600 times) [2024-06-22 19:28:10,898][15349] Signal inference workers to resume experience collection... (66600 times) [2024-06-22 19:28:10,935][15401] InferenceWorker_p0-w0: stopping experience collection (66600 times) [2024-06-22 19:28:10,935][15401] InferenceWorker_p0-w0: resuming experience collection (66600 times) [2024-06-22 19:28:13,092][15401] Updated weights for policy 0, policy_version 274990 (0.0046) [2024-06-22 19:28:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 4505436160. Throughput: 0: 43252.9. Samples: 4505558160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 19:28:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 19:28:17,473][15401] Updated weights for policy 0, policy_version 275000 (0.0034) [2024-06-22 19:28:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42874.7, 300 sec: 42765.0). Total num frames: 4505632768. Throughput: 0: 42936.9. Samples: 4505810700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 19:28:21,054][15401] Updated weights for policy 0, policy_version 275010 (0.0049) [2024-06-22 19:28:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 4505845760. Throughput: 0: 42951.4. Samples: 4505934280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 19:28:25,046][15401] Updated weights for policy 0, policy_version 275020 (0.0032) [2024-06-22 19:28:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 4506058752. Throughput: 0: 42824.8. Samples: 4506192740. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 19:28:28,748][15401] Updated weights for policy 0, policy_version 275030 (0.0037) [2024-06-22 19:28:32,909][15401] Updated weights for policy 0, policy_version 275040 (0.0031) [2024-06-22 19:28:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4506271744. Throughput: 0: 42724.4. Samples: 4506449620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:33,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-22 19:28:36,275][15401] Updated weights for policy 0, policy_version 275050 (0.0038) [2024-06-22 19:28:38,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 4506517504. Throughput: 0: 42871.6. Samples: 4506575260. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:38,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 19:28:40,571][15401] Updated weights for policy 0, policy_version 275060 (0.0033) [2024-06-22 19:28:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4506714112. Throughput: 0: 42792.8. Samples: 4506838620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 19:28:43,689][15401] Updated weights for policy 0, policy_version 275070 (0.0039) [2024-06-22 19:28:48,031][15401] Updated weights for policy 0, policy_version 275080 (0.0045) [2024-06-22 19:28:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4506927104. Throughput: 0: 42808.1. Samples: 4507095160. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 19:28:51,278][15401] Updated weights for policy 0, policy_version 275090 (0.0026) [2024-06-22 19:28:53,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43413.0, 300 sec: 42764.1). Total num frames: 4507156480. Throughput: 0: 42650.3. Samples: 4507216320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:53,396][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 19:28:55,571][15401] Updated weights for policy 0, policy_version 275100 (0.0031) [2024-06-22 19:28:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4507369472. Throughput: 0: 42778.3. Samples: 4507483180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:28:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 19:28:58,939][15401] Updated weights for policy 0, policy_version 275110 (0.0034) [2024-06-22 19:29:03,307][15401] Updated weights for policy 0, policy_version 275120 (0.0023) [2024-06-22 19:29:03,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4507566080. Throughput: 0: 42867.5. Samples: 4507739740. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:29:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 19:29:06,403][15401] Updated weights for policy 0, policy_version 275130 (0.0036) [2024-06-22 19:29:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4507795456. Throughput: 0: 42807.3. Samples: 4507860600. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:29:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 19:29:10,799][15401] Updated weights for policy 0, policy_version 275140 (0.0036) [2024-06-22 19:29:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4508008448. Throughput: 0: 42994.0. Samples: 4508127460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:29:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 19:29:13,992][15401] Updated weights for policy 0, policy_version 275150 (0.0044) [2024-06-22 19:29:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4508205056. Throughput: 0: 42938.7. Samples: 4508381860. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:29:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 19:29:18,711][15401] Updated weights for policy 0, policy_version 275160 (0.0031) [2024-06-22 19:29:21,521][15401] Updated weights for policy 0, policy_version 275170 (0.0047) [2024-06-22 19:29:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4508418048. Throughput: 0: 42853.0. Samples: 4508503540. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-22 19:29:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 19:29:26,355][15401] Updated weights for policy 0, policy_version 275180 (0.0029) [2024-06-22 19:29:27,708][15349] Signal inference workers to stop experience collection... (66650 times) [2024-06-22 19:29:27,708][15349] Signal inference workers to resume experience collection... (66650 times) [2024-06-22 19:29:27,718][15401] InferenceWorker_p0-w0: stopping experience collection (66650 times) [2024-06-22 19:29:27,718][15401] InferenceWorker_p0-w0: resuming experience collection (66650 times) [2024-06-22 19:29:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 4508647424. Throughput: 0: 42776.0. Samples: 4508763540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 19:29:29,344][15401] Updated weights for policy 0, policy_version 275190 (0.0034) [2024-06-22 19:29:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4508844032. Throughput: 0: 42762.5. Samples: 4509019480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 19:29:33,769][15401] Updated weights for policy 0, policy_version 275200 (0.0028) [2024-06-22 19:29:36,795][15401] Updated weights for policy 0, policy_version 275210 (0.0041) [2024-06-22 19:29:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 4509057024. Throughput: 0: 42866.6. Samples: 4509145040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:29:41,513][15401] Updated weights for policy 0, policy_version 275220 (0.0039) [2024-06-22 19:29:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4509286400. Throughput: 0: 42751.2. Samples: 4509406980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 19:29:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275226_4509302784.pth... [2024-06-22 19:29:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274597_4498997248.pth [2024-06-22 19:29:44,918][15401] Updated weights for policy 0, policy_version 275230 (0.0032) [2024-06-22 19:29:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 4509483008. Throughput: 0: 42699.1. Samples: 4509661200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 19:29:49,243][15401] Updated weights for policy 0, policy_version 275240 (0.0027) [2024-06-22 19:29:52,401][15401] Updated weights for policy 0, policy_version 275250 (0.0038) [2024-06-22 19:29:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 4509728768. Throughput: 0: 42876.7. Samples: 4509790060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 19:29:56,938][15401] Updated weights for policy 0, policy_version 275260 (0.0037) [2024-06-22 19:29:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4509925376. Throughput: 0: 42695.0. Samples: 4510048740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:29:58,392][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 19:29:59,977][15401] Updated weights for policy 0, policy_version 275270 (0.0036) [2024-06-22 19:30:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4510138368. Throughput: 0: 42758.8. Samples: 4510306000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:03,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 19:30:04,783][15401] Updated weights for policy 0, policy_version 275280 (0.0035) [2024-06-22 19:30:07,535][15401] Updated weights for policy 0, policy_version 275290 (0.0044) [2024-06-22 19:30:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4510384128. Throughput: 0: 43019.0. Samples: 4510439400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 19:30:12,388][15401] Updated weights for policy 0, policy_version 275300 (0.0033) [2024-06-22 19:30:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4510564352. Throughput: 0: 42820.5. Samples: 4510690460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 19:30:15,481][15401] Updated weights for policy 0, policy_version 275310 (0.0042) [2024-06-22 19:30:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4510777344. Throughput: 0: 42822.3. Samples: 4510946480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 19:30:19,829][15401] Updated weights for policy 0, policy_version 275320 (0.0031) [2024-06-22 19:30:23,229][15401] Updated weights for policy 0, policy_version 275330 (0.0035) [2024-06-22 19:30:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4511006720. Throughput: 0: 42956.4. Samples: 4511078080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 19:30:27,351][15401] Updated weights for policy 0, policy_version 275340 (0.0034) [2024-06-22 19:30:28,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 4511186944. Throughput: 0: 42721.4. Samples: 4511329720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:28,397][15132] Avg episode reward: [(0, '0.211')] [2024-06-22 19:30:30,683][15401] Updated weights for policy 0, policy_version 275350 (0.0028) [2024-06-22 19:30:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4511432704. Throughput: 0: 42833.4. Samples: 4511588700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:33,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 19:30:34,878][15401] Updated weights for policy 0, policy_version 275360 (0.0024) [2024-06-22 19:30:38,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4511629312. Throughput: 0: 42902.8. Samples: 4511720680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:30:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 19:30:38,730][15401] Updated weights for policy 0, policy_version 275370 (0.0026) [2024-06-22 19:30:42,421][15401] Updated weights for policy 0, policy_version 275380 (0.0036) [2024-06-22 19:30:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4511842304. Throughput: 0: 42713.3. Samples: 4511970840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:30:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 19:30:46,257][15401] Updated weights for policy 0, policy_version 275390 (0.0027) [2024-06-22 19:30:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4512071680. Throughput: 0: 42659.0. Samples: 4512225660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:30:48,399][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:30:50,001][15401] Updated weights for policy 0, policy_version 275400 (0.0040) [2024-06-22 19:30:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 4512251904. Throughput: 0: 42641.3. Samples: 4512358260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:30:53,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-22 19:30:53,854][15349] Signal inference workers to stop experience collection... (66700 times) [2024-06-22 19:30:53,857][15349] Signal inference workers to resume experience collection... (66700 times) [2024-06-22 19:30:53,875][15401] InferenceWorker_p0-w0: stopping experience collection (66700 times) [2024-06-22 19:30:53,875][15401] InferenceWorker_p0-w0: resuming experience collection (66700 times) [2024-06-22 19:30:54,009][15401] Updated weights for policy 0, policy_version 275410 (0.0061) [2024-06-22 19:30:57,610][15401] Updated weights for policy 0, policy_version 275420 (0.0033) [2024-06-22 19:30:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4512481280. Throughput: 0: 42628.5. Samples: 4512608740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:30:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 19:31:01,926][15401] Updated weights for policy 0, policy_version 275430 (0.0040) [2024-06-22 19:31:03,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 4512727040. Throughput: 0: 42531.5. Samples: 4512860500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:03,393][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 19:31:05,624][15401] Updated weights for policy 0, policy_version 275440 (0.0044) [2024-06-22 19:31:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41777.5, 300 sec: 42764.7). Total num frames: 4512890880. Throughput: 0: 42555.9. Samples: 4512993200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:08,393][15132] Avg episode reward: [(0, '0.867')] [2024-06-22 19:31:09,689][15401] Updated weights for policy 0, policy_version 275450 (0.0032) [2024-06-22 19:31:13,210][15401] Updated weights for policy 0, policy_version 275460 (0.0034) [2024-06-22 19:31:13,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4513136640. Throughput: 0: 42446.6. Samples: 4513239540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 19:31:17,468][15401] Updated weights for policy 0, policy_version 275470 (0.0044) [2024-06-22 19:31:18,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4513349632. Throughput: 0: 42343.0. Samples: 4513494140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 19:31:21,248][15401] Updated weights for policy 0, policy_version 275480 (0.0031) [2024-06-22 19:31:23,392][15132] Fps is (10 sec: 37673.8, 60 sec: 41777.5, 300 sec: 42709.1). Total num frames: 4513513472. Throughput: 0: 42363.9. Samples: 4513627160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:23,393][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 19:31:25,289][15401] Updated weights for policy 0, policy_version 275490 (0.0031) [2024-06-22 19:31:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 4513759232. Throughput: 0: 42359.1. Samples: 4513877000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 19:31:28,705][15401] Updated weights for policy 0, policy_version 275500 (0.0030) [2024-06-22 19:31:33,369][15401] Updated weights for policy 0, policy_version 275510 (0.0022) [2024-06-22 19:31:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4513955840. Throughput: 0: 42585.0. Samples: 4514141980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 19:31:36,371][15401] Updated weights for policy 0, policy_version 275520 (0.0025) [2024-06-22 19:31:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4514168832. Throughput: 0: 42320.4. Samples: 4514262680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 19:31:40,931][15401] Updated weights for policy 0, policy_version 275530 (0.0029) [2024-06-22 19:31:43,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4514414592. Throughput: 0: 42292.7. Samples: 4514511920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:43,391][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 19:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275538_4514414592.pth... [2024-06-22 19:31:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000274913_4504174592.pth [2024-06-22 19:31:44,001][15401] Updated weights for policy 0, policy_version 275540 (0.0050) [2024-06-22 19:31:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4514594816. Throughput: 0: 42610.8. Samples: 4514777880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:31:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 19:31:48,571][15401] Updated weights for policy 0, policy_version 275550 (0.0033) [2024-06-22 19:31:51,649][15401] Updated weights for policy 0, policy_version 275560 (0.0039) [2024-06-22 19:31:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4514807808. Throughput: 0: 42387.0. Samples: 4514900520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:31:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 19:31:56,335][15401] Updated weights for policy 0, policy_version 275570 (0.0036) [2024-06-22 19:31:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4515037184. Throughput: 0: 42472.8. Samples: 4515150820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:31:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 19:31:59,634][15401] Updated weights for policy 0, policy_version 275580 (0.0027) [2024-06-22 19:32:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41780.9, 300 sec: 42766.0). Total num frames: 4515233792. Throughput: 0: 42665.9. Samples: 4515414100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 19:32:04,075][15401] Updated weights for policy 0, policy_version 275590 (0.0033) [2024-06-22 19:32:07,324][15401] Updated weights for policy 0, policy_version 275600 (0.0037) [2024-06-22 19:32:08,393][15132] Fps is (10 sec: 42584.6, 60 sec: 42870.9, 300 sec: 42709.0). Total num frames: 4515463168. Throughput: 0: 42329.5. Samples: 4515532020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:08,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 19:32:11,923][15401] Updated weights for policy 0, policy_version 275610 (0.0031) [2024-06-22 19:32:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42821.2). Total num frames: 4515692544. Throughput: 0: 42561.5. Samples: 4515792260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 19:32:14,940][15401] Updated weights for policy 0, policy_version 275620 (0.0029) [2024-06-22 19:32:18,389][15132] Fps is (10 sec: 39334.7, 60 sec: 41779.3, 300 sec: 42709.8). Total num frames: 4515856384. Throughput: 0: 42315.2. Samples: 4516046160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:18,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 19:32:19,445][15349] Signal inference workers to stop experience collection... (66750 times) [2024-06-22 19:32:19,467][15401] InferenceWorker_p0-w0: stopping experience collection (66750 times) [2024-06-22 19:32:19,503][15349] Signal inference workers to resume experience collection... (66750 times) [2024-06-22 19:32:19,503][15401] InferenceWorker_p0-w0: resuming experience collection (66750 times) [2024-06-22 19:32:19,643][15401] Updated weights for policy 0, policy_version 275630 (0.0049) [2024-06-22 19:32:22,504][15401] Updated weights for policy 0, policy_version 275640 (0.0038) [2024-06-22 19:32:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43419.4, 300 sec: 42709.8). Total num frames: 4516118528. Throughput: 0: 42274.7. Samples: 4516165040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:23,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 19:32:27,257][15401] Updated weights for policy 0, policy_version 275650 (0.0041) [2024-06-22 19:32:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4516315136. Throughput: 0: 42720.2. Samples: 4516434320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 19:32:30,293][15401] Updated weights for policy 0, policy_version 275660 (0.0040) [2024-06-22 19:32:33,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4516495360. Throughput: 0: 42517.7. Samples: 4516691180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 19:32:34,865][15401] Updated weights for policy 0, policy_version 275670 (0.0033) [2024-06-22 19:32:37,909][15401] Updated weights for policy 0, policy_version 275680 (0.0028) [2024-06-22 19:32:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4516757504. Throughput: 0: 42498.3. Samples: 4516812940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 19:32:42,760][15401] Updated weights for policy 0, policy_version 275690 (0.0034) [2024-06-22 19:32:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.5, 300 sec: 42653.9). Total num frames: 4516937728. Throughput: 0: 42751.6. Samples: 4517074640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 19:32:45,584][15401] Updated weights for policy 0, policy_version 275700 (0.0042) [2024-06-22 19:32:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4517150720. Throughput: 0: 42545.8. Samples: 4517328660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 19:32:50,448][15401] Updated weights for policy 0, policy_version 275710 (0.0033) [2024-06-22 19:32:53,075][15401] Updated weights for policy 0, policy_version 275720 (0.0036) [2024-06-22 19:32:53,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43140.0, 300 sec: 42708.5). Total num frames: 4517396480. Throughput: 0: 42681.4. Samples: 4517452820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:53,397][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 19:32:57,951][15401] Updated weights for policy 0, policy_version 275730 (0.0030) [2024-06-22 19:32:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4517576704. Throughput: 0: 42776.4. Samples: 4517717200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:32:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 19:33:00,977][15401] Updated weights for policy 0, policy_version 275740 (0.0028) [2024-06-22 19:33:03,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4517789696. Throughput: 0: 42599.1. Samples: 4517963120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 19:33:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 19:33:05,695][15401] Updated weights for policy 0, policy_version 275750 (0.0038) [2024-06-22 19:33:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.7, 300 sec: 42654.0). Total num frames: 4518019072. Throughput: 0: 42786.7. Samples: 4518090440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 19:33:08,588][15401] Updated weights for policy 0, policy_version 275760 (0.0047) [2024-06-22 19:33:13,184][15401] Updated weights for policy 0, policy_version 275770 (0.0029) [2024-06-22 19:33:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 4518232064. Throughput: 0: 42659.9. Samples: 4518354120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:13,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 19:33:16,123][15401] Updated weights for policy 0, policy_version 275780 (0.0046) [2024-06-22 19:33:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4518445056. Throughput: 0: 42612.0. Samples: 4518608720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 19:33:20,652][15401] Updated weights for policy 0, policy_version 275790 (0.0038) [2024-06-22 19:33:23,294][15349] Signal inference workers to stop experience collection... (66800 times) [2024-06-22 19:33:23,342][15401] InferenceWorker_p0-w0: stopping experience collection (66800 times) [2024-06-22 19:33:23,350][15349] Signal inference workers to resume experience collection... (66800 times) [2024-06-22 19:33:23,361][15401] InferenceWorker_p0-w0: resuming experience collection (66800 times) [2024-06-22 19:33:23,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4518674432. Throughput: 0: 42688.2. Samples: 4518733900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:23,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 19:33:23,633][15401] Updated weights for policy 0, policy_version 275800 (0.0029) [2024-06-22 19:33:28,133][15401] Updated weights for policy 0, policy_version 275810 (0.0029) [2024-06-22 19:33:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4518871040. Throughput: 0: 42768.4. Samples: 4518999220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 19:33:31,725][15401] Updated weights for policy 0, policy_version 275820 (0.0040) [2024-06-22 19:33:33,392][15132] Fps is (10 sec: 40949.4, 60 sec: 43142.8, 300 sec: 42598.4). Total num frames: 4519084032. Throughput: 0: 42683.4. Samples: 4519249520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:33,393][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:33:35,653][15401] Updated weights for policy 0, policy_version 275830 (0.0036) [2024-06-22 19:33:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 4519280640. Throughput: 0: 42687.5. Samples: 4519373480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 19:33:39,441][15401] Updated weights for policy 0, policy_version 275840 (0.0043) [2024-06-22 19:33:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4519510016. Throughput: 0: 42513.7. Samples: 4519630320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 19:33:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275849_4519510016.pth... [2024-06-22 19:33:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275226_4509302784.pth [2024-06-22 19:33:43,626][15401] Updated weights for policy 0, policy_version 275850 (0.0037) [2024-06-22 19:33:47,366][15401] Updated weights for policy 0, policy_version 275860 (0.0027) [2024-06-22 19:33:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 4519723008. Throughput: 0: 42609.9. Samples: 4519880560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 19:33:51,038][15401] Updated weights for policy 0, policy_version 275870 (0.0036) [2024-06-22 19:33:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 4519936000. Throughput: 0: 42651.6. Samples: 4520009760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 19:33:55,196][15401] Updated weights for policy 0, policy_version 275880 (0.0030) [2024-06-22 19:33:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4520165376. Throughput: 0: 42761.4. Samples: 4520278280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:33:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 19:33:58,529][15401] Updated weights for policy 0, policy_version 275890 (0.0038) [2024-06-22 19:34:02,694][15401] Updated weights for policy 0, policy_version 275900 (0.0022) [2024-06-22 19:34:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4520361984. Throughput: 0: 42719.2. Samples: 4520531080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:34:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 19:34:06,186][15401] Updated weights for policy 0, policy_version 275910 (0.0035) [2024-06-22 19:34:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4520574976. Throughput: 0: 42728.3. Samples: 4520656680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:34:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 19:34:10,340][15401] Updated weights for policy 0, policy_version 275920 (0.0034) [2024-06-22 19:34:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4520804352. Throughput: 0: 42631.5. Samples: 4520917640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 19:34:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 19:34:14,103][15401] Updated weights for policy 0, policy_version 275930 (0.0027) [2024-06-22 19:34:18,117][15401] Updated weights for policy 0, policy_version 275940 (0.0042) [2024-06-22 19:34:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4521017344. Throughput: 0: 42740.5. Samples: 4521172740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:18,391][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 19:34:21,859][15401] Updated weights for policy 0, policy_version 275950 (0.0025) [2024-06-22 19:34:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4521230336. Throughput: 0: 42827.4. Samples: 4521300720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 19:34:25,656][15401] Updated weights for policy 0, policy_version 275960 (0.0036) [2024-06-22 19:34:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4521426944. Throughput: 0: 42807.2. Samples: 4521556640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-22 19:34:29,654][15401] Updated weights for policy 0, policy_version 275970 (0.0028) [2024-06-22 19:34:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 4521639936. Throughput: 0: 42883.4. Samples: 4521810420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:33,392][15132] Avg episode reward: [(0, '0.218')] [2024-06-22 19:34:33,603][15401] Updated weights for policy 0, policy_version 275980 (0.0036) [2024-06-22 19:34:37,201][15401] Updated weights for policy 0, policy_version 275990 (0.0040) [2024-06-22 19:34:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4521869312. Throughput: 0: 42799.5. Samples: 4521935740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-22 19:34:41,217][15401] Updated weights for policy 0, policy_version 276000 (0.0033) [2024-06-22 19:34:43,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4522065920. Throughput: 0: 42509.7. Samples: 4522191220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 19:34:44,719][15401] Updated weights for policy 0, policy_version 276010 (0.0030) [2024-06-22 19:34:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4522262528. Throughput: 0: 42678.1. Samples: 4522451600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 19:34:48,796][15401] Updated weights for policy 0, policy_version 276020 (0.0034) [2024-06-22 19:34:52,249][15401] Updated weights for policy 0, policy_version 276030 (0.0024) [2024-06-22 19:34:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4522491904. Throughput: 0: 42575.6. Samples: 4522572580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 19:34:55,931][15349] Signal inference workers to stop experience collection... (66850 times) [2024-06-22 19:34:55,932][15349] Signal inference workers to resume experience collection... (66850 times) [2024-06-22 19:34:55,945][15401] InferenceWorker_p0-w0: stopping experience collection (66850 times) [2024-06-22 19:34:55,945][15401] InferenceWorker_p0-w0: resuming experience collection (66850 times) [2024-06-22 19:34:56,784][15401] Updated weights for policy 0, policy_version 276040 (0.0027) [2024-06-22 19:34:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4522721280. Throughput: 0: 42560.5. Samples: 4522832860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:34:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 19:34:59,934][15401] Updated weights for policy 0, policy_version 276050 (0.0039) [2024-06-22 19:35:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4522901504. Throughput: 0: 42613.7. Samples: 4523090360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:35:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 19:35:04,556][15401] Updated weights for policy 0, policy_version 276060 (0.0037) [2024-06-22 19:35:07,662][15401] Updated weights for policy 0, policy_version 276070 (0.0030) [2024-06-22 19:35:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4523147264. Throughput: 0: 42526.3. Samples: 4523214400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:35:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 19:35:12,179][15401] Updated weights for policy 0, policy_version 276080 (0.0029) [2024-06-22 19:35:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4523343872. Throughput: 0: 42610.3. Samples: 4523474100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:35:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 19:35:15,290][15401] Updated weights for policy 0, policy_version 276090 (0.0027) [2024-06-22 19:35:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4523556864. Throughput: 0: 42633.8. Samples: 4523728840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:35:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 19:35:19,894][15401] Updated weights for policy 0, policy_version 276100 (0.0026) [2024-06-22 19:35:23,269][15401] Updated weights for policy 0, policy_version 276110 (0.0030) [2024-06-22 19:35:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 4523786240. Throughput: 0: 42543.9. Samples: 4523850220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-22 19:35:23,390][15132] Avg episode reward: [(0, '0.110')] [2024-06-22 19:35:27,430][15401] Updated weights for policy 0, policy_version 276120 (0.0030) [2024-06-22 19:35:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4523999232. Throughput: 0: 42674.3. Samples: 4524111560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 19:35:30,839][15401] Updated weights for policy 0, policy_version 276130 (0.0041) [2024-06-22 19:35:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 4524195840. Throughput: 0: 42544.0. Samples: 4524366080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 19:35:35,004][15401] Updated weights for policy 0, policy_version 276140 (0.0027) [2024-06-22 19:35:38,366][15401] Updated weights for policy 0, policy_version 276150 (0.0042) [2024-06-22 19:35:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4524441600. Throughput: 0: 42690.6. Samples: 4524493660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 19:35:42,725][15401] Updated weights for policy 0, policy_version 276160 (0.0039) [2024-06-22 19:35:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4524621824. Throughput: 0: 42715.5. Samples: 4524755060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 19:35:43,476][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276162_4524638208.pth... [2024-06-22 19:35:43,532][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275538_4514414592.pth [2024-06-22 19:35:46,022][15401] Updated weights for policy 0, policy_version 276170 (0.0042) [2024-06-22 19:35:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4524834816. Throughput: 0: 42575.2. Samples: 4525006240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 19:35:50,561][15401] Updated weights for policy 0, policy_version 276180 (0.0022) [2024-06-22 19:35:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4525064192. Throughput: 0: 42684.5. Samples: 4525135200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 19:35:53,730][15401] Updated weights for policy 0, policy_version 276190 (0.0029) [2024-06-22 19:35:58,105][15401] Updated weights for policy 0, policy_version 276200 (0.0036) [2024-06-22 19:35:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 4525277184. Throughput: 0: 42753.4. Samples: 4525398000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:35:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 19:36:01,435][15401] Updated weights for policy 0, policy_version 276210 (0.0032) [2024-06-22 19:36:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 4525473792. Throughput: 0: 42851.7. Samples: 4525657160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 19:36:05,689][15401] Updated weights for policy 0, policy_version 276220 (0.0050) [2024-06-22 19:36:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4525703168. Throughput: 0: 42876.5. Samples: 4525779660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 19:36:09,257][15401] Updated weights for policy 0, policy_version 276230 (0.0047) [2024-06-22 19:36:13,243][15401] Updated weights for policy 0, policy_version 276240 (0.0034) [2024-06-22 19:36:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4525932544. Throughput: 0: 42878.2. Samples: 4526041080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 19:36:14,714][15349] Signal inference workers to stop experience collection... (66900 times) [2024-06-22 19:36:14,714][15349] Signal inference workers to resume experience collection... (66900 times) [2024-06-22 19:36:14,726][15401] InferenceWorker_p0-w0: stopping experience collection (66900 times) [2024-06-22 19:36:14,726][15401] InferenceWorker_p0-w0: resuming experience collection (66900 times) [2024-06-22 19:36:16,958][15401] Updated weights for policy 0, policy_version 276250 (0.0033) [2024-06-22 19:36:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4526112768. Throughput: 0: 42962.7. Samples: 4526299400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 19:36:20,688][15401] Updated weights for policy 0, policy_version 276260 (0.0032) [2024-06-22 19:36:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4526358528. Throughput: 0: 42953.4. Samples: 4526426560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 19:36:24,746][15401] Updated weights for policy 0, policy_version 276270 (0.0025) [2024-06-22 19:36:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4526555136. Throughput: 0: 42927.3. Samples: 4526686780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 19:36:28,467][15401] Updated weights for policy 0, policy_version 276280 (0.0025) [2024-06-22 19:36:32,360][15401] Updated weights for policy 0, policy_version 276290 (0.0038) [2024-06-22 19:36:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4526768128. Throughput: 0: 42941.7. Samples: 4526938620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 19:36:36,213][15401] Updated weights for policy 0, policy_version 276300 (0.0034) [2024-06-22 19:36:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4526997504. Throughput: 0: 42934.7. Samples: 4527067260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-22 19:36:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 19:36:39,922][15401] Updated weights for policy 0, policy_version 276310 (0.0024) [2024-06-22 19:36:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4527177728. Throughput: 0: 42854.6. Samples: 4527326460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:36:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 19:36:43,927][15401] Updated weights for policy 0, policy_version 276320 (0.0030) [2024-06-22 19:36:47,426][15401] Updated weights for policy 0, policy_version 276330 (0.0044) [2024-06-22 19:36:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4527407104. Throughput: 0: 42737.8. Samples: 4527580360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:36:48,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-22 19:36:51,523][15401] Updated weights for policy 0, policy_version 276340 (0.0038) [2024-06-22 19:36:53,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4527636480. Throughput: 0: 42932.8. Samples: 4527711740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:36:53,393][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 19:36:54,927][15401] Updated weights for policy 0, policy_version 276350 (0.0035) [2024-06-22 19:36:58,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 4527816704. Throughput: 0: 42687.3. Samples: 4527962020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:36:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 19:36:59,291][15401] Updated weights for policy 0, policy_version 276360 (0.0032) [2024-06-22 19:37:03,050][15401] Updated weights for policy 0, policy_version 276370 (0.0039) [2024-06-22 19:37:03,392][15132] Fps is (10 sec: 42598.3, 60 sec: 43142.7, 300 sec: 42709.6). Total num frames: 4528062464. Throughput: 0: 42609.7. Samples: 4528216940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:03,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 19:37:06,967][15401] Updated weights for policy 0, policy_version 276380 (0.0037) [2024-06-22 19:37:08,389][15132] Fps is (10 sec: 45877.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4528275456. Throughput: 0: 42745.8. Samples: 4528350120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:37:10,470][15401] Updated weights for policy 0, policy_version 276390 (0.0031) [2024-06-22 19:37:13,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 4528472064. Throughput: 0: 42568.3. Samples: 4528602460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:13,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 19:37:14,688][15401] Updated weights for policy 0, policy_version 276400 (0.0036) [2024-06-22 19:37:18,015][15401] Updated weights for policy 0, policy_version 276410 (0.0029) [2024-06-22 19:37:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4528701440. Throughput: 0: 42567.6. Samples: 4528854160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:18,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 19:37:22,614][15401] Updated weights for policy 0, policy_version 276420 (0.0033) [2024-06-22 19:37:23,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4528930816. Throughput: 0: 42734.2. Samples: 4528990300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 19:37:25,663][15401] Updated weights for policy 0, policy_version 276430 (0.0025) [2024-06-22 19:37:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4529111040. Throughput: 0: 42659.1. Samples: 4529246120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 19:37:30,355][15401] Updated weights for policy 0, policy_version 276440 (0.0027) [2024-06-22 19:37:33,324][15401] Updated weights for policy 0, policy_version 276450 (0.0036) [2024-06-22 19:37:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4529356800. Throughput: 0: 42509.2. Samples: 4529493280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 19:37:37,930][15401] Updated weights for policy 0, policy_version 276460 (0.0041) [2024-06-22 19:37:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4529553408. Throughput: 0: 42633.8. Samples: 4529630160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 19:37:41,014][15401] Updated weights for policy 0, policy_version 276470 (0.0034) [2024-06-22 19:37:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4529750016. Throughput: 0: 42846.2. Samples: 4529890080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:43,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 19:37:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276474_4529750016.pth... [2024-06-22 19:37:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000275849_4519510016.pth [2024-06-22 19:37:45,433][15401] Updated weights for policy 0, policy_version 276480 (0.0033) [2024-06-22 19:37:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 4529979392. Throughput: 0: 42664.1. Samples: 4530136720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-22 19:37:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 19:37:48,441][15349] Signal inference workers to stop experience collection... (66950 times) [2024-06-22 19:37:48,477][15401] InferenceWorker_p0-w0: stopping experience collection (66950 times) [2024-06-22 19:37:48,487][15349] Signal inference workers to resume experience collection... (66950 times) [2024-06-22 19:37:48,497][15401] InferenceWorker_p0-w0: resuming experience collection (66950 times) [2024-06-22 19:37:48,624][15401] Updated weights for policy 0, policy_version 276490 (0.0028) [2024-06-22 19:37:52,872][15401] Updated weights for policy 0, policy_version 276500 (0.0028) [2024-06-22 19:37:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 4530176000. Throughput: 0: 42675.5. Samples: 4530270520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:37:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 19:37:56,588][15401] Updated weights for policy 0, policy_version 276510 (0.0045) [2024-06-22 19:37:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 4530388992. Throughput: 0: 42802.0. Samples: 4530528440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:37:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 19:38:00,681][15401] Updated weights for policy 0, policy_version 276520 (0.0039) [2024-06-22 19:38:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4530618368. Throughput: 0: 42722.7. Samples: 4530776680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:03,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 19:38:04,211][15401] Updated weights for policy 0, policy_version 276530 (0.0038) [2024-06-22 19:38:08,366][15401] Updated weights for policy 0, policy_version 276540 (0.0046) [2024-06-22 19:38:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 4530831360. Throughput: 0: 42666.7. Samples: 4530910300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 19:38:11,863][15401] Updated weights for policy 0, policy_version 276550 (0.0036) [2024-06-22 19:38:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 4531011584. Throughput: 0: 42575.1. Samples: 4531162000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 19:38:16,342][15401] Updated weights for policy 0, policy_version 276560 (0.0032) [2024-06-22 19:38:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4531257344. Throughput: 0: 42735.7. Samples: 4531416380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 19:38:19,478][15401] Updated weights for policy 0, policy_version 276570 (0.0026) [2024-06-22 19:38:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4531470336. Throughput: 0: 42617.3. Samples: 4531547940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 19:38:23,795][15401] Updated weights for policy 0, policy_version 276580 (0.0038) [2024-06-22 19:38:27,081][15401] Updated weights for policy 0, policy_version 276590 (0.0050) [2024-06-22 19:38:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 4531666944. Throughput: 0: 42303.5. Samples: 4531793740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:28,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 19:38:31,348][15401] Updated weights for policy 0, policy_version 276600 (0.0037) [2024-06-22 19:38:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 4531863552. Throughput: 0: 42734.1. Samples: 4532059760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 19:38:34,707][15401] Updated weights for policy 0, policy_version 276610 (0.0027) [2024-06-22 19:38:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4532109312. Throughput: 0: 42564.8. Samples: 4532185940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 19:38:38,760][15401] Updated weights for policy 0, policy_version 276620 (0.0038) [2024-06-22 19:38:42,556][15401] Updated weights for policy 0, policy_version 276630 (0.0033) [2024-06-22 19:38:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4532322304. Throughput: 0: 42476.3. Samples: 4532439880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:43,402][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 19:38:46,348][15401] Updated weights for policy 0, policy_version 276640 (0.0038) [2024-06-22 19:38:48,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 4532518912. Throughput: 0: 42746.8. Samples: 4532700380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:48,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 19:38:49,271][15349] Signal inference workers to stop experience collection... (67000 times) [2024-06-22 19:38:49,277][15349] Signal inference workers to resume experience collection... (67000 times) [2024-06-22 19:38:49,294][15401] InferenceWorker_p0-w0: stopping experience collection (67000 times) [2024-06-22 19:38:49,294][15401] InferenceWorker_p0-w0: resuming experience collection (67000 times) [2024-06-22 19:38:50,296][15401] Updated weights for policy 0, policy_version 276650 (0.0035) [2024-06-22 19:38:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4532731904. Throughput: 0: 42634.3. Samples: 4532828840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 19:38:54,305][15401] Updated weights for policy 0, policy_version 276660 (0.0029) [2024-06-22 19:38:57,893][15401] Updated weights for policy 0, policy_version 276670 (0.0037) [2024-06-22 19:38:58,389][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4532977664. Throughput: 0: 42674.6. Samples: 4533082360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:38:58,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-22 19:39:02,095][15401] Updated weights for policy 0, policy_version 276680 (0.0024) [2024-06-22 19:39:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4533157888. Throughput: 0: 42768.9. Samples: 4533340980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 19:39:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 19:39:05,470][15401] Updated weights for policy 0, policy_version 276690 (0.0027) [2024-06-22 19:39:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4533387264. Throughput: 0: 42684.9. Samples: 4533468760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 19:39:09,828][15401] Updated weights for policy 0, policy_version 276700 (0.0041) [2024-06-22 19:39:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4533600256. Throughput: 0: 42713.7. Samples: 4533715860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 19:39:13,450][15401] Updated weights for policy 0, policy_version 276710 (0.0027) [2024-06-22 19:39:17,531][15401] Updated weights for policy 0, policy_version 276720 (0.0033) [2024-06-22 19:39:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4533813248. Throughput: 0: 42527.2. Samples: 4533973480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 19:39:21,349][15401] Updated weights for policy 0, policy_version 276730 (0.0038) [2024-06-22 19:39:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4534026240. Throughput: 0: 42603.1. Samples: 4534103080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 19:39:25,179][15401] Updated weights for policy 0, policy_version 276740 (0.0029) [2024-06-22 19:39:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4534239232. Throughput: 0: 42704.6. Samples: 4534361580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 19:39:28,985][15401] Updated weights for policy 0, policy_version 276750 (0.0047) [2024-06-22 19:39:32,889][15401] Updated weights for policy 0, policy_version 276760 (0.0042) [2024-06-22 19:39:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4534452224. Throughput: 0: 42503.0. Samples: 4534612920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 19:39:36,636][15401] Updated weights for policy 0, policy_version 276770 (0.0022) [2024-06-22 19:39:38,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4534648832. Throughput: 0: 42612.7. Samples: 4534746520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:38,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 19:39:40,378][15401] Updated weights for policy 0, policy_version 276780 (0.0026) [2024-06-22 19:39:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4534894592. Throughput: 0: 42590.1. Samples: 4534998920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 19:39:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276788_4534894592.pth... [2024-06-22 19:39:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276162_4524638208.pth [2024-06-22 19:39:44,259][15401] Updated weights for policy 0, policy_version 276790 (0.0040) [2024-06-22 19:39:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 4535074816. Throughput: 0: 42625.3. Samples: 4535259120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 19:39:48,418][15401] Updated weights for policy 0, policy_version 276800 (0.0045) [2024-06-22 19:39:51,723][15401] Updated weights for policy 0, policy_version 276810 (0.0036) [2024-06-22 19:39:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4535304192. Throughput: 0: 42555.1. Samples: 4535383740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:53,394][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 19:39:56,020][15401] Updated weights for policy 0, policy_version 276820 (0.0034) [2024-06-22 19:39:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4535533568. Throughput: 0: 42896.6. Samples: 4535646200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:39:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 19:39:59,334][15401] Updated weights for policy 0, policy_version 276830 (0.0036) [2024-06-22 19:40:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4535713792. Throughput: 0: 43012.9. Samples: 4535909060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:40:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 19:40:03,542][15401] Updated weights for policy 0, policy_version 276840 (0.0031) [2024-06-22 19:40:06,898][15401] Updated weights for policy 0, policy_version 276850 (0.0028) [2024-06-22 19:40:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4535943168. Throughput: 0: 42784.4. Samples: 4536028380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:40:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 19:40:11,018][15401] Updated weights for policy 0, policy_version 276860 (0.0041) [2024-06-22 19:40:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4536172544. Throughput: 0: 42791.1. Samples: 4536287180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 19:40:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 19:40:14,497][15401] Updated weights for policy 0, policy_version 276870 (0.0040) [2024-06-22 19:40:15,340][15349] Signal inference workers to stop experience collection... (67050 times) [2024-06-22 19:40:15,342][15349] Signal inference workers to resume experience collection... (67050 times) [2024-06-22 19:40:15,355][15401] InferenceWorker_p0-w0: stopping experience collection (67050 times) [2024-06-22 19:40:15,355][15401] InferenceWorker_p0-w0: resuming experience collection (67050 times) [2024-06-22 19:40:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 4536369152. Throughput: 0: 43054.6. Samples: 4536550480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:18,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 19:40:18,895][15401] Updated weights for policy 0, policy_version 276880 (0.0033) [2024-06-22 19:40:22,159][15401] Updated weights for policy 0, policy_version 276890 (0.0038) [2024-06-22 19:40:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4536598528. Throughput: 0: 42785.4. Samples: 4536671760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 19:40:26,481][15401] Updated weights for policy 0, policy_version 276900 (0.0032) [2024-06-22 19:40:28,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4536827904. Throughput: 0: 42905.0. Samples: 4536929640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 19:40:29,675][15401] Updated weights for policy 0, policy_version 276910 (0.0036) [2024-06-22 19:40:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4536991744. Throughput: 0: 42995.9. Samples: 4537193940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:33,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 19:40:34,033][15401] Updated weights for policy 0, policy_version 276920 (0.0036) [2024-06-22 19:40:37,817][15401] Updated weights for policy 0, policy_version 276930 (0.0031) [2024-06-22 19:40:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43146.4, 300 sec: 42765.0). Total num frames: 4537237504. Throughput: 0: 42889.9. Samples: 4537313780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 19:40:41,935][15401] Updated weights for policy 0, policy_version 276940 (0.0040) [2024-06-22 19:40:43,392][15132] Fps is (10 sec: 47502.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4537466880. Throughput: 0: 42599.8. Samples: 4537563300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:43,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 19:40:45,415][15401] Updated weights for policy 0, policy_version 276950 (0.0034) [2024-06-22 19:40:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4537630720. Throughput: 0: 42629.3. Samples: 4537827380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 19:40:49,725][15401] Updated weights for policy 0, policy_version 276960 (0.0037) [2024-06-22 19:40:52,970][15401] Updated weights for policy 0, policy_version 276970 (0.0040) [2024-06-22 19:40:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4537876480. Throughput: 0: 42589.7. Samples: 4537944920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 19:40:57,280][15401] Updated weights for policy 0, policy_version 276980 (0.0033) [2024-06-22 19:40:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4538089472. Throughput: 0: 42691.0. Samples: 4538208280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:40:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 19:41:00,841][15401] Updated weights for policy 0, policy_version 276990 (0.0041) [2024-06-22 19:41:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4538286080. Throughput: 0: 42516.8. Samples: 4538463640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 19:41:05,200][15401] Updated weights for policy 0, policy_version 277000 (0.0027) [2024-06-22 19:41:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4538515456. Throughput: 0: 42492.8. Samples: 4538583940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 19:41:08,634][15401] Updated weights for policy 0, policy_version 277010 (0.0036) [2024-06-22 19:41:12,935][15401] Updated weights for policy 0, policy_version 277020 (0.0042) [2024-06-22 19:41:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4538728448. Throughput: 0: 42634.3. Samples: 4538848180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 19:41:16,160][15401] Updated weights for policy 0, policy_version 277030 (0.0036) [2024-06-22 19:41:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 4538925056. Throughput: 0: 42282.7. Samples: 4539096660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 19:41:20,682][15401] Updated weights for policy 0, policy_version 277040 (0.0046) [2024-06-22 19:41:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4539170816. Throughput: 0: 42380.3. Samples: 4539220900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 19:41:23,743][15401] Updated weights for policy 0, policy_version 277050 (0.0032) [2024-06-22 19:41:28,126][15401] Updated weights for policy 0, policy_version 277060 (0.0026) [2024-06-22 19:41:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4539351040. Throughput: 0: 42811.2. Samples: 4539489700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:41:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 19:41:31,554][15401] Updated weights for policy 0, policy_version 277070 (0.0030) [2024-06-22 19:41:33,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 4539564032. Throughput: 0: 42520.3. Samples: 4539740900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:33,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 19:41:35,781][15401] Updated weights for policy 0, policy_version 277080 (0.0032) [2024-06-22 19:41:38,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 4539809792. Throughput: 0: 42752.2. Samples: 4539869040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:38,396][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 19:41:39,081][15401] Updated weights for policy 0, policy_version 277090 (0.0029) [2024-06-22 19:41:43,301][15401] Updated weights for policy 0, policy_version 277100 (0.0029) [2024-06-22 19:41:43,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 4540006400. Throughput: 0: 42691.7. Samples: 4540129400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 19:41:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277100_4540006400.pth... [2024-06-22 19:41:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276474_4529750016.pth [2024-06-22 19:41:44,240][15349] Signal inference workers to stop experience collection... (67100 times) [2024-06-22 19:41:44,300][15401] InferenceWorker_p0-w0: stopping experience collection (67100 times) [2024-06-22 19:41:44,357][15349] Signal inference workers to resume experience collection... (67100 times) [2024-06-22 19:41:44,358][15401] InferenceWorker_p0-w0: resuming experience collection (67100 times) [2024-06-22 19:41:46,918][15401] Updated weights for policy 0, policy_version 277110 (0.0022) [2024-06-22 19:41:48,390][15132] Fps is (10 sec: 40985.7, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 4540219392. Throughput: 0: 42512.4. Samples: 4540376700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 19:41:50,935][15401] Updated weights for policy 0, policy_version 277120 (0.0037) [2024-06-22 19:41:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 4540432384. Throughput: 0: 42756.2. Samples: 4540507960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 19:41:54,521][15401] Updated weights for policy 0, policy_version 277130 (0.0040) [2024-06-22 19:41:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 4540645376. Throughput: 0: 42728.0. Samples: 4540770940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:41:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 19:41:58,566][15401] Updated weights for policy 0, policy_version 277140 (0.0028) [2024-06-22 19:42:02,177][15401] Updated weights for policy 0, policy_version 277150 (0.0037) [2024-06-22 19:42:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4540874752. Throughput: 0: 42578.8. Samples: 4541012700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 19:42:06,397][15401] Updated weights for policy 0, policy_version 277160 (0.0040) [2024-06-22 19:42:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 4541071360. Throughput: 0: 42822.8. Samples: 4541147920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 19:42:10,012][15401] Updated weights for policy 0, policy_version 277170 (0.0045) [2024-06-22 19:42:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 4541267968. Throughput: 0: 42675.4. Samples: 4541410100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 19:42:14,129][15401] Updated weights for policy 0, policy_version 277180 (0.0024) [2024-06-22 19:42:17,596][15401] Updated weights for policy 0, policy_version 277190 (0.0030) [2024-06-22 19:42:18,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 4541530112. Throughput: 0: 42528.9. Samples: 4541654700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:18,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 19:42:21,639][15401] Updated weights for policy 0, policy_version 277200 (0.0027) [2024-06-22 19:42:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4541710336. Throughput: 0: 42955.9. Samples: 4541801780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 19:42:25,066][15401] Updated weights for policy 0, policy_version 277210 (0.0031) [2024-06-22 19:42:28,389][15132] Fps is (10 sec: 36053.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4541890560. Throughput: 0: 42726.1. Samples: 4542052080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 19:42:29,358][15401] Updated weights for policy 0, policy_version 277220 (0.0030) [2024-06-22 19:42:32,730][15401] Updated weights for policy 0, policy_version 277230 (0.0038) [2024-06-22 19:42:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4542152704. Throughput: 0: 42809.5. Samples: 4542303120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 19:42:37,190][15401] Updated weights for policy 0, policy_version 277240 (0.0037) [2024-06-22 19:42:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 4542365696. Throughput: 0: 42935.2. Samples: 4542440040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 19:42:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 19:42:40,306][15401] Updated weights for policy 0, policy_version 277250 (0.0039) [2024-06-22 19:42:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 4542545920. Throughput: 0: 42648.0. Samples: 4542690100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:42:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 19:42:44,719][15401] Updated weights for policy 0, policy_version 277260 (0.0028) [2024-06-22 19:42:48,040][15401] Updated weights for policy 0, policy_version 277270 (0.0034) [2024-06-22 19:42:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4542791680. Throughput: 0: 42876.3. Samples: 4542942140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:42:48,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-22 19:42:52,465][15401] Updated weights for policy 0, policy_version 277280 (0.0027) [2024-06-22 19:42:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4543004672. Throughput: 0: 43044.3. Samples: 4543084920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:42:53,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-22 19:42:55,902][15401] Updated weights for policy 0, policy_version 277290 (0.0031) [2024-06-22 19:42:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4543184896. Throughput: 0: 42774.3. Samples: 4543334940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:42:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 19:42:59,396][15349] Signal inference workers to stop experience collection... (67150 times) [2024-06-22 19:42:59,396][15349] Signal inference workers to resume experience collection... (67150 times) [2024-06-22 19:42:59,443][15401] InferenceWorker_p0-w0: stopping experience collection (67150 times) [2024-06-22 19:42:59,443][15401] InferenceWorker_p0-w0: resuming experience collection (67150 times) [2024-06-22 19:43:00,015][15401] Updated weights for policy 0, policy_version 277300 (0.0029) [2024-06-22 19:43:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4543430656. Throughput: 0: 42970.4. Samples: 4543588260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 19:43:03,506][15401] Updated weights for policy 0, policy_version 277310 (0.0029) [2024-06-22 19:43:07,788][15401] Updated weights for policy 0, policy_version 277320 (0.0035) [2024-06-22 19:43:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4543643648. Throughput: 0: 42706.6. Samples: 4543723580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 19:43:10,997][15401] Updated weights for policy 0, policy_version 277330 (0.0033) [2024-06-22 19:43:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4543823872. Throughput: 0: 42601.4. Samples: 4543969140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 19:43:15,380][15401] Updated weights for policy 0, policy_version 277340 (0.0046) [2024-06-22 19:43:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 4544069632. Throughput: 0: 42703.9. Samples: 4544224800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 19:43:18,949][15401] Updated weights for policy 0, policy_version 277350 (0.0031) [2024-06-22 19:43:22,911][15401] Updated weights for policy 0, policy_version 277360 (0.0038) [2024-06-22 19:43:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4544266240. Throughput: 0: 42680.8. Samples: 4544360680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 19:43:26,507][15401] Updated weights for policy 0, policy_version 277370 (0.0030) [2024-06-22 19:43:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4544479232. Throughput: 0: 42560.0. Samples: 4544605300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 19:43:30,857][15401] Updated weights for policy 0, policy_version 277380 (0.0038) [2024-06-22 19:43:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 4544708608. Throughput: 0: 42836.5. Samples: 4544869880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:33,393][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 19:43:34,065][15401] Updated weights for policy 0, policy_version 277390 (0.0045) [2024-06-22 19:43:38,309][15401] Updated weights for policy 0, policy_version 277400 (0.0035) [2024-06-22 19:43:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4544921600. Throughput: 0: 42552.6. Samples: 4544999780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 19:43:41,978][15401] Updated weights for policy 0, policy_version 277410 (0.0036) [2024-06-22 19:43:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 4545134592. Throughput: 0: 42572.9. Samples: 4545250720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 19:43:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277413_4545134592.pth... [2024-06-22 19:43:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000276788_4534894592.pth [2024-06-22 19:43:46,165][15401] Updated weights for policy 0, policy_version 277420 (0.0026) [2024-06-22 19:43:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4545363968. Throughput: 0: 42630.9. Samples: 4545506660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 19:43:49,634][15401] Updated weights for policy 0, policy_version 277430 (0.0032) [2024-06-22 19:43:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 4545544192. Throughput: 0: 42500.9. Samples: 4545636220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-22 19:43:53,393][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 19:43:53,895][15401] Updated weights for policy 0, policy_version 277440 (0.0032) [2024-06-22 19:43:57,342][15401] Updated weights for policy 0, policy_version 277450 (0.0026) [2024-06-22 19:43:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4545773568. Throughput: 0: 42740.8. Samples: 4545892480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:43:58,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-22 19:44:01,665][15401] Updated weights for policy 0, policy_version 277460 (0.0031) [2024-06-22 19:44:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4545986560. Throughput: 0: 42688.8. Samples: 4546145800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 19:44:05,133][15401] Updated weights for policy 0, policy_version 277470 (0.0044) [2024-06-22 19:44:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4546183168. Throughput: 0: 42433.0. Samples: 4546270160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 19:44:09,677][15401] Updated weights for policy 0, policy_version 277480 (0.0039) [2024-06-22 19:44:12,517][15401] Updated weights for policy 0, policy_version 277490 (0.0035) [2024-06-22 19:44:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4546412544. Throughput: 0: 42715.6. Samples: 4546527500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 19:44:16,988][15349] Signal inference workers to stop experience collection... (67200 times) [2024-06-22 19:44:17,016][15401] InferenceWorker_p0-w0: stopping experience collection (67200 times) [2024-06-22 19:44:17,054][15349] Signal inference workers to resume experience collection... (67200 times) [2024-06-22 19:44:17,055][15401] InferenceWorker_p0-w0: resuming experience collection (67200 times) [2024-06-22 19:44:17,193][15401] Updated weights for policy 0, policy_version 277500 (0.0033) [2024-06-22 19:44:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4546625536. Throughput: 0: 42658.2. Samples: 4546789400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 19:44:20,175][15401] Updated weights for policy 0, policy_version 277510 (0.0030) [2024-06-22 19:44:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4546822144. Throughput: 0: 42479.0. Samples: 4546911340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 19:44:24,637][15401] Updated weights for policy 0, policy_version 277520 (0.0027) [2024-06-22 19:44:28,112][15401] Updated weights for policy 0, policy_version 277530 (0.0035) [2024-06-22 19:44:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4547067904. Throughput: 0: 42736.8. Samples: 4547173880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:28,396][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 19:44:32,085][15401] Updated weights for policy 0, policy_version 277540 (0.0039) [2024-06-22 19:44:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 4547280896. Throughput: 0: 42816.5. Samples: 4547433400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 19:44:35,773][15401] Updated weights for policy 0, policy_version 277550 (0.0031) [2024-06-22 19:44:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4547477504. Throughput: 0: 42752.5. Samples: 4547559980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 19:44:39,800][15401] Updated weights for policy 0, policy_version 277560 (0.0035) [2024-06-22 19:44:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4547690496. Throughput: 0: 42864.4. Samples: 4547821380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:43,398][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 19:44:43,428][15401] Updated weights for policy 0, policy_version 277570 (0.0031) [2024-06-22 19:44:47,571][15401] Updated weights for policy 0, policy_version 277580 (0.0030) [2024-06-22 19:44:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4547919872. Throughput: 0: 42948.4. Samples: 4548078480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:48,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 19:44:51,035][15401] Updated weights for policy 0, policy_version 277590 (0.0031) [2024-06-22 19:44:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 4548116480. Throughput: 0: 42990.2. Samples: 4548204720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:53,398][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 19:44:55,045][15401] Updated weights for policy 0, policy_version 277600 (0.0037) [2024-06-22 19:44:58,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4548329472. Throughput: 0: 43012.8. Samples: 4548463180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:44:58,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 19:44:59,130][15401] Updated weights for policy 0, policy_version 277610 (0.0031) [2024-06-22 19:45:02,628][15401] Updated weights for policy 0, policy_version 277620 (0.0032) [2024-06-22 19:45:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4548558848. Throughput: 0: 42889.9. Samples: 4548719440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-22 19:45:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 19:45:06,800][15401] Updated weights for policy 0, policy_version 277630 (0.0040) [2024-06-22 19:45:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4548771840. Throughput: 0: 43134.3. Samples: 4548852380. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 19:45:10,152][15401] Updated weights for policy 0, policy_version 277640 (0.0035) [2024-06-22 19:45:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4548984832. Throughput: 0: 42831.6. Samples: 4549101300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 19:45:14,289][15401] Updated weights for policy 0, policy_version 277650 (0.0041) [2024-06-22 19:45:18,005][15401] Updated weights for policy 0, policy_version 277660 (0.0034) [2024-06-22 19:45:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4549181440. Throughput: 0: 42726.8. Samples: 4549356100. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 19:45:21,668][15349] Signal inference workers to stop experience collection... (67250 times) [2024-06-22 19:45:21,668][15349] Signal inference workers to resume experience collection... (67250 times) [2024-06-22 19:45:21,707][15401] InferenceWorker_p0-w0: stopping experience collection (67250 times) [2024-06-22 19:45:21,707][15401] InferenceWorker_p0-w0: resuming experience collection (67250 times) [2024-06-22 19:45:21,810][15401] Updated weights for policy 0, policy_version 277670 (0.0041) [2024-06-22 19:45:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4549410816. Throughput: 0: 42774.6. Samples: 4549484840. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 19:45:25,495][15401] Updated weights for policy 0, policy_version 277680 (0.0034) [2024-06-22 19:45:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4549607424. Throughput: 0: 42460.9. Samples: 4549732120. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:28,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 19:45:29,938][15401] Updated weights for policy 0, policy_version 277690 (0.0035) [2024-06-22 19:45:33,119][15401] Updated weights for policy 0, policy_version 277700 (0.0027) [2024-06-22 19:45:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4549836800. Throughput: 0: 42466.8. Samples: 4549989480. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 19:45:37,532][15401] Updated weights for policy 0, policy_version 277710 (0.0031) [2024-06-22 19:45:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 4550049792. Throughput: 0: 42635.1. Samples: 4550123300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 19:45:40,763][15401] Updated weights for policy 0, policy_version 277720 (0.0038) [2024-06-22 19:45:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4550246400. Throughput: 0: 42528.5. Samples: 4550376860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 19:45:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277725_4550246400.pth... [2024-06-22 19:45:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277100_4540006400.pth [2024-06-22 19:45:45,052][15401] Updated weights for policy 0, policy_version 277730 (0.0033) [2024-06-22 19:45:48,339][15401] Updated weights for policy 0, policy_version 277740 (0.0024) [2024-06-22 19:45:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4550492160. Throughput: 0: 42621.3. Samples: 4550637400. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 19:45:52,977][15401] Updated weights for policy 0, policy_version 277750 (0.0032) [2024-06-22 19:45:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4550672384. Throughput: 0: 42521.7. Samples: 4550765860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 19:45:56,072][15401] Updated weights for policy 0, policy_version 277760 (0.0040) [2024-06-22 19:45:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4550901760. Throughput: 0: 42551.8. Samples: 4551016140. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:45:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 19:46:00,495][15401] Updated weights for policy 0, policy_version 277770 (0.0025) [2024-06-22 19:46:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4551114752. Throughput: 0: 42707.0. Samples: 4551277920. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:46:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 19:46:03,753][15401] Updated weights for policy 0, policy_version 277780 (0.0037) [2024-06-22 19:46:08,290][15401] Updated weights for policy 0, policy_version 277790 (0.0022) [2024-06-22 19:46:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4551311360. Throughput: 0: 42729.3. Samples: 4551407660. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:46:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 19:46:11,490][15401] Updated weights for policy 0, policy_version 277800 (0.0037) [2024-06-22 19:46:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4551557120. Throughput: 0: 42780.0. Samples: 4551657220. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:46:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 19:46:16,277][15401] Updated weights for policy 0, policy_version 277810 (0.0041) [2024-06-22 19:46:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4551737344. Throughput: 0: 42691.5. Samples: 4551910600. Policy #0 lag: (min: 1.0, avg: 8.8, max: 24.0) [2024-06-22 19:46:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 19:46:19,281][15401] Updated weights for policy 0, policy_version 277820 (0.0044) [2024-06-22 19:46:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4551950336. Throughput: 0: 42417.7. Samples: 4552032100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 19:46:24,033][15401] Updated weights for policy 0, policy_version 277830 (0.0047) [2024-06-22 19:46:26,973][15401] Updated weights for policy 0, policy_version 277840 (0.0039) [2024-06-22 19:46:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 4552196096. Throughput: 0: 42544.5. Samples: 4552291360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:46:31,539][15401] Updated weights for policy 0, policy_version 277850 (0.0022) [2024-06-22 19:46:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 4552376320. Throughput: 0: 42653.7. Samples: 4552556820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 19:46:34,725][15401] Updated weights for policy 0, policy_version 277860 (0.0042) [2024-06-22 19:46:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 4552605696. Throughput: 0: 42419.5. Samples: 4552674840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:38,392][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 19:46:39,004][15401] Updated weights for policy 0, policy_version 277870 (0.0027) [2024-06-22 19:46:41,084][15349] Signal inference workers to stop experience collection... (67300 times) [2024-06-22 19:46:41,085][15349] Signal inference workers to resume experience collection... (67300 times) [2024-06-22 19:46:41,125][15401] InferenceWorker_p0-w0: stopping experience collection (67300 times) [2024-06-22 19:46:41,125][15401] InferenceWorker_p0-w0: resuming experience collection (67300 times) [2024-06-22 19:46:42,066][15401] Updated weights for policy 0, policy_version 277880 (0.0035) [2024-06-22 19:46:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4552835072. Throughput: 0: 42718.8. Samples: 4552938480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:43,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-22 19:46:46,665][15401] Updated weights for policy 0, policy_version 277890 (0.0034) [2024-06-22 19:46:48,389][15132] Fps is (10 sec: 39331.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 4552998912. Throughput: 0: 42850.3. Samples: 4553206180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:48,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 19:46:49,691][15401] Updated weights for policy 0, policy_version 277900 (0.0041) [2024-06-22 19:46:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4553244672. Throughput: 0: 42421.8. Samples: 4553316640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:53,391][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 19:46:54,002][15401] Updated weights for policy 0, policy_version 277910 (0.0042) [2024-06-22 19:46:57,308][15401] Updated weights for policy 0, policy_version 277920 (0.0030) [2024-06-22 19:46:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4553474048. Throughput: 0: 42647.2. Samples: 4553576340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:46:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 19:47:01,584][15401] Updated weights for policy 0, policy_version 277930 (0.0031) [2024-06-22 19:47:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4553654272. Throughput: 0: 43066.1. Samples: 4553848680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:03,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 19:47:04,794][15401] Updated weights for policy 0, policy_version 277940 (0.0033) [2024-06-22 19:47:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4553883648. Throughput: 0: 42858.8. Samples: 4553960740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 19:47:09,055][15401] Updated weights for policy 0, policy_version 277950 (0.0037) [2024-06-22 19:47:12,636][15401] Updated weights for policy 0, policy_version 277960 (0.0038) [2024-06-22 19:47:13,389][15132] Fps is (10 sec: 47525.5, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 4554129408. Throughput: 0: 42875.1. Samples: 4554220740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 19:47:17,352][15401] Updated weights for policy 0, policy_version 277970 (0.0042) [2024-06-22 19:47:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4554276864. Throughput: 0: 42896.5. Samples: 4554487160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 19:47:20,319][15401] Updated weights for policy 0, policy_version 277980 (0.0034) [2024-06-22 19:47:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 4554555392. Throughput: 0: 42849.8. Samples: 4554603080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:23,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 19:47:24,811][15401] Updated weights for policy 0, policy_version 277990 (0.0030) [2024-06-22 19:47:27,761][15401] Updated weights for policy 0, policy_version 278000 (0.0037) [2024-06-22 19:47:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4554752000. Throughput: 0: 42852.0. Samples: 4554866820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-22 19:47:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 19:47:32,330][15401] Updated weights for policy 0, policy_version 278010 (0.0030) [2024-06-22 19:47:33,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4554932224. Throughput: 0: 42880.8. Samples: 4555135820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 19:47:35,521][15401] Updated weights for policy 0, policy_version 278020 (0.0029) [2024-06-22 19:47:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 4555194368. Throughput: 0: 43040.4. Samples: 4555253460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 19:47:40,145][15401] Updated weights for policy 0, policy_version 278030 (0.0029) [2024-06-22 19:47:43,030][15401] Updated weights for policy 0, policy_version 278040 (0.0037) [2024-06-22 19:47:43,396][15132] Fps is (10 sec: 47483.6, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 4555407360. Throughput: 0: 43095.2. Samples: 4555515900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:43,396][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 19:47:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278040_4555407360.pth... [2024-06-22 19:47:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277413_4545134592.pth [2024-06-22 19:47:47,769][15401] Updated weights for policy 0, policy_version 278050 (0.0030) [2024-06-22 19:47:48,394][15132] Fps is (10 sec: 39305.6, 60 sec: 43141.5, 300 sec: 42653.4). Total num frames: 4555587584. Throughput: 0: 42816.1. Samples: 4555775480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:48,394][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 19:47:50,635][15401] Updated weights for policy 0, policy_version 278060 (0.0046) [2024-06-22 19:47:53,394][15132] Fps is (10 sec: 40967.4, 60 sec: 42868.2, 300 sec: 42819.9). Total num frames: 4555816960. Throughput: 0: 43015.1. Samples: 4555896620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:53,395][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 19:47:55,238][15401] Updated weights for policy 0, policy_version 278070 (0.0042) [2024-06-22 19:47:57,542][15349] Signal inference workers to stop experience collection... (67350 times) [2024-06-22 19:47:57,542][15349] Signal inference workers to resume experience collection... (67350 times) [2024-06-22 19:47:57,561][15401] InferenceWorker_p0-w0: stopping experience collection (67350 times) [2024-06-22 19:47:57,561][15401] InferenceWorker_p0-w0: resuming experience collection (67350 times) [2024-06-22 19:47:58,195][15401] Updated weights for policy 0, policy_version 278080 (0.0027) [2024-06-22 19:47:58,389][15132] Fps is (10 sec: 47533.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4556062720. Throughput: 0: 43113.8. Samples: 4556160860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:47:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 19:48:02,908][15401] Updated weights for policy 0, policy_version 278090 (0.0032) [2024-06-22 19:48:03,396][15132] Fps is (10 sec: 42590.8, 60 sec: 43141.7, 300 sec: 42708.6). Total num frames: 4556242944. Throughput: 0: 42835.6. Samples: 4556415040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:03,396][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 19:48:06,167][15401] Updated weights for policy 0, policy_version 278100 (0.0027) [2024-06-22 19:48:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4556472320. Throughput: 0: 43123.1. Samples: 4556543520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 19:48:10,518][15401] Updated weights for policy 0, policy_version 278110 (0.0037) [2024-06-22 19:48:13,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4556685312. Throughput: 0: 43084.4. Samples: 4556805620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 19:48:14,052][15401] Updated weights for policy 0, policy_version 278120 (0.0039) [2024-06-22 19:48:18,007][15401] Updated weights for policy 0, policy_version 278130 (0.0036) [2024-06-22 19:48:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4556881920. Throughput: 0: 42707.7. Samples: 4557057660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 19:48:21,864][15401] Updated weights for policy 0, policy_version 278140 (0.0042) [2024-06-22 19:48:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42871.5, 300 sec: 42875.8). Total num frames: 4557127680. Throughput: 0: 42941.8. Samples: 4557185940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:23,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 19:48:25,699][15401] Updated weights for policy 0, policy_version 278150 (0.0038) [2024-06-22 19:48:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4557324288. Throughput: 0: 42910.2. Samples: 4557446580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 19:48:29,301][15401] Updated weights for policy 0, policy_version 278160 (0.0039) [2024-06-22 19:48:33,171][15401] Updated weights for policy 0, policy_version 278170 (0.0036) [2024-06-22 19:48:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4557537280. Throughput: 0: 42876.8. Samples: 4557704760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:33,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-22 19:48:36,822][15401] Updated weights for policy 0, policy_version 278180 (0.0025) [2024-06-22 19:48:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4557783040. Throughput: 0: 42954.6. Samples: 4557829380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:38,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 19:48:41,074][15401] Updated weights for policy 0, policy_version 278190 (0.0039) [2024-06-22 19:48:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 4557963264. Throughput: 0: 42848.9. Samples: 4558089060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-22 19:48:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 19:48:44,473][15401] Updated weights for policy 0, policy_version 278200 (0.0030) [2024-06-22 19:48:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42874.4, 300 sec: 42765.4). Total num frames: 4558159872. Throughput: 0: 42818.5. Samples: 4558341600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:48:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 19:48:48,873][15401] Updated weights for policy 0, policy_version 278210 (0.0037) [2024-06-22 19:48:52,581][15401] Updated weights for policy 0, policy_version 278220 (0.0025) [2024-06-22 19:48:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43420.8, 300 sec: 42876.1). Total num frames: 4558422016. Throughput: 0: 42729.3. Samples: 4558466340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:48:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 19:48:56,794][15401] Updated weights for policy 0, policy_version 278230 (0.0037) [2024-06-22 19:48:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4558585856. Throughput: 0: 42699.1. Samples: 4558727080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:48:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 19:49:00,157][15401] Updated weights for policy 0, policy_version 278240 (0.0022) [2024-06-22 19:49:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 4558815232. Throughput: 0: 42768.9. Samples: 4558982260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 19:49:04,373][15401] Updated weights for policy 0, policy_version 278250 (0.0037) [2024-06-22 19:49:07,722][15401] Updated weights for policy 0, policy_version 278260 (0.0036) [2024-06-22 19:49:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4559044608. Throughput: 0: 42721.0. Samples: 4559108280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:08,404][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 19:49:12,159][15401] Updated weights for policy 0, policy_version 278270 (0.0040) [2024-06-22 19:49:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4559224832. Throughput: 0: 42716.0. Samples: 4559368800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 19:49:15,187][15401] Updated weights for policy 0, policy_version 278280 (0.0024) [2024-06-22 19:49:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4559437824. Throughput: 0: 42495.2. Samples: 4559617040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 19:49:19,773][15401] Updated weights for policy 0, policy_version 278290 (0.0035) [2024-06-22 19:49:21,948][15349] Signal inference workers to stop experience collection... (67400 times) [2024-06-22 19:49:21,967][15401] InferenceWorker_p0-w0: stopping experience collection (67400 times) [2024-06-22 19:49:22,006][15349] Signal inference workers to resume experience collection... (67400 times) [2024-06-22 19:49:22,006][15401] InferenceWorker_p0-w0: resuming experience collection (67400 times) [2024-06-22 19:49:22,975][15401] Updated weights for policy 0, policy_version 278300 (0.0027) [2024-06-22 19:49:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 4559667200. Throughput: 0: 42631.5. Samples: 4559747800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 19:49:27,291][15401] Updated weights for policy 0, policy_version 278310 (0.0035) [2024-06-22 19:49:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4559863808. Throughput: 0: 42627.4. Samples: 4560007300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:28,396][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 19:49:30,516][15401] Updated weights for policy 0, policy_version 278320 (0.0041) [2024-06-22 19:49:33,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 4560109568. Throughput: 0: 42681.4. Samples: 4560262280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 19:49:34,855][15401] Updated weights for policy 0, policy_version 278330 (0.0037) [2024-06-22 19:49:38,026][15401] Updated weights for policy 0, policy_version 278340 (0.0029) [2024-06-22 19:49:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4560322560. Throughput: 0: 42878.3. Samples: 4560395860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 19:49:42,313][15401] Updated weights for policy 0, policy_version 278350 (0.0036) [2024-06-22 19:49:43,389][15132] Fps is (10 sec: 40961.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4560519168. Throughput: 0: 42862.6. Samples: 4560655900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 19:49:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278353_4560535552.pth... [2024-06-22 19:49:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000277725_4550246400.pth [2024-06-22 19:49:45,907][15401] Updated weights for policy 0, policy_version 278360 (0.0034) [2024-06-22 19:49:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4560764928. Throughput: 0: 42797.3. Samples: 4560908140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 19:49:49,738][15401] Updated weights for policy 0, policy_version 278370 (0.0037) [2024-06-22 19:49:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 4560961536. Throughput: 0: 42958.5. Samples: 4561041420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-22 19:49:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 19:49:53,499][15401] Updated weights for policy 0, policy_version 278380 (0.0033) [2024-06-22 19:49:57,353][15401] Updated weights for policy 0, policy_version 278390 (0.0037) [2024-06-22 19:49:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4561158144. Throughput: 0: 42915.1. Samples: 4561299980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:49:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 19:50:01,083][15401] Updated weights for policy 0, policy_version 278400 (0.0035) [2024-06-22 19:50:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4561403904. Throughput: 0: 43131.6. Samples: 4561557960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 19:50:05,105][15401] Updated weights for policy 0, policy_version 278410 (0.0041) [2024-06-22 19:50:08,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4561616896. Throughput: 0: 43171.2. Samples: 4561690500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 19:50:08,660][15401] Updated weights for policy 0, policy_version 278420 (0.0029) [2024-06-22 19:50:12,542][15401] Updated weights for policy 0, policy_version 278430 (0.0031) [2024-06-22 19:50:13,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4561813504. Throughput: 0: 43079.5. Samples: 4561945880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 19:50:16,050][15401] Updated weights for policy 0, policy_version 278440 (0.0039) [2024-06-22 19:50:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 4562059264. Throughput: 0: 43155.5. Samples: 4562204260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 19:50:20,096][15401] Updated weights for policy 0, policy_version 278450 (0.0032) [2024-06-22 19:50:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4562255872. Throughput: 0: 43199.5. Samples: 4562339840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:23,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 19:50:23,960][15401] Updated weights for policy 0, policy_version 278460 (0.0040) [2024-06-22 19:50:27,688][15401] Updated weights for policy 0, policy_version 278470 (0.0045) [2024-06-22 19:50:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 4562468864. Throughput: 0: 43122.7. Samples: 4562596420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:28,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 19:50:31,550][15401] Updated weights for policy 0, policy_version 278480 (0.0030) [2024-06-22 19:50:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.8, 300 sec: 42876.1). Total num frames: 4562698240. Throughput: 0: 43177.2. Samples: 4562851120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 19:50:35,766][15401] Updated weights for policy 0, policy_version 278490 (0.0028) [2024-06-22 19:50:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4562911232. Throughput: 0: 43148.5. Samples: 4562983100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:38,392][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 19:50:39,104][15401] Updated weights for policy 0, policy_version 278500 (0.0042) [2024-06-22 19:50:43,239][15401] Updated weights for policy 0, policy_version 278510 (0.0029) [2024-06-22 19:50:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4563107840. Throughput: 0: 43044.3. Samples: 4563236980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 19:50:44,622][15349] Signal inference workers to stop experience collection... (67450 times) [2024-06-22 19:50:44,623][15349] Signal inference workers to resume experience collection... (67450 times) [2024-06-22 19:50:44,645][15401] InferenceWorker_p0-w0: stopping experience collection (67450 times) [2024-06-22 19:50:44,646][15401] InferenceWorker_p0-w0: resuming experience collection (67450 times) [2024-06-22 19:50:46,826][15401] Updated weights for policy 0, policy_version 278520 (0.0025) [2024-06-22 19:50:48,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 4563320832. Throughput: 0: 43036.7. Samples: 4563494720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:48,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 19:50:50,802][15401] Updated weights for policy 0, policy_version 278530 (0.0026) [2024-06-22 19:50:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4563533824. Throughput: 0: 42957.4. Samples: 4563623580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 19:50:54,352][15401] Updated weights for policy 0, policy_version 278540 (0.0031) [2024-06-22 19:50:58,390][15401] Updated weights for policy 0, policy_version 278550 (0.0033) [2024-06-22 19:50:58,390][15132] Fps is (10 sec: 44246.5, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 4563763200. Throughput: 0: 42843.0. Samples: 4563873820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:50:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 19:51:02,000][15401] Updated weights for policy 0, policy_version 278560 (0.0047) [2024-06-22 19:51:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4563959808. Throughput: 0: 42701.8. Samples: 4564125840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:51:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 19:51:06,561][15401] Updated weights for policy 0, policy_version 278570 (0.0027) [2024-06-22 19:51:08,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4564172800. Throughput: 0: 42605.8. Samples: 4564257100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-22 19:51:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 19:51:09,558][15401] Updated weights for policy 0, policy_version 278580 (0.0033) [2024-06-22 19:51:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4564385792. Throughput: 0: 42464.3. Samples: 4564507320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 19:51:14,079][15401] Updated weights for policy 0, policy_version 278590 (0.0030) [2024-06-22 19:51:17,326][15401] Updated weights for policy 0, policy_version 278600 (0.0028) [2024-06-22 19:51:18,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42593.8, 300 sec: 42930.7). Total num frames: 4564615168. Throughput: 0: 42459.8. Samples: 4564762080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:18,397][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 19:51:21,698][15401] Updated weights for policy 0, policy_version 278610 (0.0043) [2024-06-22 19:51:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 4564795392. Throughput: 0: 42419.1. Samples: 4564892060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:23,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 19:51:25,138][15401] Updated weights for policy 0, policy_version 278620 (0.0033) [2024-06-22 19:51:28,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4565024768. Throughput: 0: 42344.0. Samples: 4565142460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 19:51:29,240][15401] Updated weights for policy 0, policy_version 278630 (0.0039) [2024-06-22 19:51:32,874][15401] Updated weights for policy 0, policy_version 278640 (0.0027) [2024-06-22 19:51:33,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 4565254144. Throughput: 0: 42360.1. Samples: 4565400820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:51:36,925][15401] Updated weights for policy 0, policy_version 278650 (0.0043) [2024-06-22 19:51:38,390][15132] Fps is (10 sec: 39317.9, 60 sec: 41778.6, 300 sec: 42653.8). Total num frames: 4565417984. Throughput: 0: 42387.5. Samples: 4565531060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:38,391][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 19:51:40,400][15401] Updated weights for policy 0, policy_version 278660 (0.0034) [2024-06-22 19:51:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 4565663744. Throughput: 0: 42341.1. Samples: 4565779260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:43,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 19:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278666_4565663744.pth... [2024-06-22 19:51:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278040_4555407360.pth [2024-06-22 19:51:45,001][15401] Updated weights for policy 0, policy_version 278670 (0.0042) [2024-06-22 19:51:47,998][15401] Updated weights for policy 0, policy_version 278680 (0.0028) [2024-06-22 19:51:48,391][15132] Fps is (10 sec: 47511.2, 60 sec: 42872.2, 300 sec: 42875.9). Total num frames: 4565893120. Throughput: 0: 42400.9. Samples: 4566033940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:48,391][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 19:51:52,971][15401] Updated weights for policy 0, policy_version 278690 (0.0042) [2024-06-22 19:51:53,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4566056960. Throughput: 0: 42401.4. Samples: 4566165160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:53,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 19:51:56,110][15401] Updated weights for policy 0, policy_version 278700 (0.0033) [2024-06-22 19:51:58,390][15132] Fps is (10 sec: 42603.7, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 4566319104. Throughput: 0: 42386.6. Samples: 4566414720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:51:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 19:52:00,513][15401] Updated weights for policy 0, policy_version 278710 (0.0026) [2024-06-22 19:52:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4566515712. Throughput: 0: 42586.4. Samples: 4566678200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:52:03,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-22 19:52:03,924][15401] Updated weights for policy 0, policy_version 278720 (0.0045) [2024-06-22 19:52:08,025][15401] Updated weights for policy 0, policy_version 278730 (0.0038) [2024-06-22 19:52:08,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4566712320. Throughput: 0: 42559.2. Samples: 4566807120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:52:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 19:52:09,842][15349] Signal inference workers to stop experience collection... (67500 times) [2024-06-22 19:52:09,842][15349] Signal inference workers to resume experience collection... (67500 times) [2024-06-22 19:52:09,858][15401] InferenceWorker_p0-w0: stopping experience collection (67500 times) [2024-06-22 19:52:09,892][15401] InferenceWorker_p0-w0: resuming experience collection (67500 times) [2024-06-22 19:52:11,368][15401] Updated weights for policy 0, policy_version 278740 (0.0035) [2024-06-22 19:52:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4566958080. Throughput: 0: 42700.0. Samples: 4567063960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:52:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 19:52:15,527][15401] Updated weights for policy 0, policy_version 278750 (0.0035) [2024-06-22 19:52:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42329.8, 300 sec: 42709.8). Total num frames: 4567154688. Throughput: 0: 42736.9. Samples: 4567323980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 19:52:18,394][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 19:52:19,207][15401] Updated weights for policy 0, policy_version 278760 (0.0037) [2024-06-22 19:52:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4567351296. Throughput: 0: 42521.7. Samples: 4567444500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 19:52:23,458][15401] Updated weights for policy 0, policy_version 278770 (0.0031) [2024-06-22 19:52:26,757][15401] Updated weights for policy 0, policy_version 278780 (0.0034) [2024-06-22 19:52:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4567613440. Throughput: 0: 42888.9. Samples: 4567709160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 19:52:30,873][15401] Updated weights for policy 0, policy_version 278790 (0.0034) [2024-06-22 19:52:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4567793664. Throughput: 0: 42987.5. Samples: 4567968320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 19:52:34,328][15401] Updated weights for policy 0, policy_version 278800 (0.0027) [2024-06-22 19:52:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43145.2, 300 sec: 42710.4). Total num frames: 4568006656. Throughput: 0: 42780.0. Samples: 4568090260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 19:52:38,443][15401] Updated weights for policy 0, policy_version 278810 (0.0028) [2024-06-22 19:52:42,083][15401] Updated weights for policy 0, policy_version 278820 (0.0033) [2024-06-22 19:52:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43146.3, 300 sec: 42932.3). Total num frames: 4568252416. Throughput: 0: 43055.4. Samples: 4568352200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 19:52:46,091][15401] Updated weights for policy 0, policy_version 278830 (0.0044) [2024-06-22 19:52:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.4, 300 sec: 42765.7). Total num frames: 4568432640. Throughput: 0: 42926.8. Samples: 4568609900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:48,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-22 19:52:49,758][15401] Updated weights for policy 0, policy_version 278840 (0.0032) [2024-06-22 19:52:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4568645632. Throughput: 0: 42763.7. Samples: 4568731480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 19:52:53,884][15401] Updated weights for policy 0, policy_version 278850 (0.0042) [2024-06-22 19:52:57,531][15401] Updated weights for policy 0, policy_version 278860 (0.0031) [2024-06-22 19:52:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 4568891392. Throughput: 0: 42966.1. Samples: 4568997440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:52:58,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 19:53:01,600][15401] Updated weights for policy 0, policy_version 278870 (0.0033) [2024-06-22 19:53:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4569071616. Throughput: 0: 42846.7. Samples: 4569252080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 19:53:05,342][15401] Updated weights for policy 0, policy_version 278880 (0.0043) [2024-06-22 19:53:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4569300992. Throughput: 0: 42775.5. Samples: 4569369400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 19:53:09,297][15401] Updated weights for policy 0, policy_version 278890 (0.0032) [2024-06-22 19:53:12,489][15349] Signal inference workers to stop experience collection... (67550 times) [2024-06-22 19:53:12,489][15349] Signal inference workers to resume experience collection... (67550 times) [2024-06-22 19:53:12,533][15401] InferenceWorker_p0-w0: stopping experience collection (67550 times) [2024-06-22 19:53:12,533][15401] InferenceWorker_p0-w0: resuming experience collection (67550 times) [2024-06-22 19:53:12,806][15401] Updated weights for policy 0, policy_version 278900 (0.0048) [2024-06-22 19:53:13,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 4569513984. Throughput: 0: 42960.5. Samples: 4569642660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:13,397][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 19:53:17,043][15401] Updated weights for policy 0, policy_version 278910 (0.0031) [2024-06-22 19:53:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 4569710592. Throughput: 0: 42718.8. Samples: 4569890660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 19:53:20,794][15401] Updated weights for policy 0, policy_version 278920 (0.0041) [2024-06-22 19:53:23,389][15132] Fps is (10 sec: 42626.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4569939968. Throughput: 0: 42762.3. Samples: 4570014560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 19:53:24,650][15401] Updated weights for policy 0, policy_version 278930 (0.0042) [2024-06-22 19:53:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4570136576. Throughput: 0: 42745.7. Samples: 4570275760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 19:53:28,415][15401] Updated weights for policy 0, policy_version 278940 (0.0038) [2024-06-22 19:53:32,104][15401] Updated weights for policy 0, policy_version 278950 (0.0045) [2024-06-22 19:53:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4570365952. Throughput: 0: 42522.0. Samples: 4570523400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 19:53:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 19:53:35,898][15401] Updated weights for policy 0, policy_version 278960 (0.0037) [2024-06-22 19:53:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4570578944. Throughput: 0: 42798.2. Samples: 4570657400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:53:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 19:53:39,542][15401] Updated weights for policy 0, policy_version 278970 (0.0041) [2024-06-22 19:53:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 4570791936. Throughput: 0: 42701.8. Samples: 4570919020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:53:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 19:53:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278979_4570791936.pth... [2024-06-22 19:53:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278353_4560535552.pth [2024-06-22 19:53:43,650][15401] Updated weights for policy 0, policy_version 278980 (0.0031) [2024-06-22 19:53:47,097][15401] Updated weights for policy 0, policy_version 278990 (0.0036) [2024-06-22 19:53:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4571021312. Throughput: 0: 42504.5. Samples: 4571164780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:53:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 19:53:51,578][15401] Updated weights for policy 0, policy_version 279000 (0.0032) [2024-06-22 19:53:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4571234304. Throughput: 0: 42907.2. Samples: 4571300220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:53:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 19:53:54,957][15401] Updated weights for policy 0, policy_version 279010 (0.0041) [2024-06-22 19:53:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4571414528. Throughput: 0: 42520.8. Samples: 4571555820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:53:58,394][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 19:53:59,235][15401] Updated weights for policy 0, policy_version 279020 (0.0035) [2024-06-22 19:54:02,485][15401] Updated weights for policy 0, policy_version 279030 (0.0037) [2024-06-22 19:54:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4571660288. Throughput: 0: 42567.9. Samples: 4571806220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 19:54:06,996][15401] Updated weights for policy 0, policy_version 279040 (0.0031) [2024-06-22 19:54:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4571873280. Throughput: 0: 42972.4. Samples: 4571948320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 19:54:10,007][15401] Updated weights for policy 0, policy_version 279050 (0.0033) [2024-06-22 19:54:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 4572053504. Throughput: 0: 42703.0. Samples: 4572197400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 19:54:14,676][15401] Updated weights for policy 0, policy_version 279060 (0.0034) [2024-06-22 19:54:17,599][15401] Updated weights for policy 0, policy_version 279070 (0.0042) [2024-06-22 19:54:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4572315648. Throughput: 0: 42694.9. Samples: 4572444660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 19:54:22,361][15401] Updated weights for policy 0, policy_version 279080 (0.0039) [2024-06-22 19:54:23,237][15349] Signal inference workers to stop experience collection... (67600 times) [2024-06-22 19:54:23,283][15401] InferenceWorker_p0-w0: stopping experience collection (67600 times) [2024-06-22 19:54:23,288][15349] Signal inference workers to resume experience collection... (67600 times) [2024-06-22 19:54:23,299][15401] InferenceWorker_p0-w0: resuming experience collection (67600 times) [2024-06-22 19:54:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4572512256. Throughput: 0: 42872.3. Samples: 4572586660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 19:54:25,211][15401] Updated weights for policy 0, policy_version 279090 (0.0029) [2024-06-22 19:54:28,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4572692480. Throughput: 0: 42655.1. Samples: 4572838500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 19:54:29,820][15401] Updated weights for policy 0, policy_version 279100 (0.0033) [2024-06-22 19:54:32,975][15401] Updated weights for policy 0, policy_version 279110 (0.0050) [2024-06-22 19:54:33,392][15132] Fps is (10 sec: 44228.0, 60 sec: 43143.2, 300 sec: 42820.3). Total num frames: 4572954624. Throughput: 0: 42770.9. Samples: 4573089560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:33,392][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 19:54:37,662][15401] Updated weights for policy 0, policy_version 279120 (0.0032) [2024-06-22 19:54:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4573134848. Throughput: 0: 42794.7. Samples: 4573225980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 19:54:40,599][15401] Updated weights for policy 0, policy_version 279130 (0.0026) [2024-06-22 19:54:43,390][15132] Fps is (10 sec: 39329.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4573347840. Throughput: 0: 42675.6. Samples: 4573476220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 19:54:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 19:54:45,289][15401] Updated weights for policy 0, policy_version 279140 (0.0049) [2024-06-22 19:54:48,290][15401] Updated weights for policy 0, policy_version 279150 (0.0033) [2024-06-22 19:54:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 4573593600. Throughput: 0: 42751.6. Samples: 4573730140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:54:48,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 19:54:53,151][15401] Updated weights for policy 0, policy_version 279160 (0.0038) [2024-06-22 19:54:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4573773824. Throughput: 0: 42545.1. Samples: 4573862860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:54:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 19:54:55,855][15401] Updated weights for policy 0, policy_version 279170 (0.0032) [2024-06-22 19:54:58,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4573970432. Throughput: 0: 42493.0. Samples: 4574109580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:54:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 19:55:00,761][15401] Updated weights for policy 0, policy_version 279180 (0.0038) [2024-06-22 19:55:03,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4574216192. Throughput: 0: 42788.8. Samples: 4574370160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:03,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 19:55:03,567][15401] Updated weights for policy 0, policy_version 279190 (0.0042) [2024-06-22 19:55:08,255][15401] Updated weights for policy 0, policy_version 279200 (0.0029) [2024-06-22 19:55:08,391][15132] Fps is (10 sec: 44230.2, 60 sec: 42324.2, 300 sec: 42709.3). Total num frames: 4574412800. Throughput: 0: 42630.6. Samples: 4574505100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:08,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 19:55:11,013][15401] Updated weights for policy 0, policy_version 279210 (0.0037) [2024-06-22 19:55:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4574625792. Throughput: 0: 42623.1. Samples: 4574756540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:13,391][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 19:55:15,787][15401] Updated weights for policy 0, policy_version 279220 (0.0036) [2024-06-22 19:55:18,389][15132] Fps is (10 sec: 47520.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4574887936. Throughput: 0: 42649.5. Samples: 4575008700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 19:55:18,761][15401] Updated weights for policy 0, policy_version 279230 (0.0024) [2024-06-22 19:55:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4575051776. Throughput: 0: 42653.2. Samples: 4575145380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-22 19:55:23,515][15401] Updated weights for policy 0, policy_version 279240 (0.0027) [2024-06-22 19:55:26,735][15401] Updated weights for policy 0, policy_version 279250 (0.0030) [2024-06-22 19:55:28,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4575264768. Throughput: 0: 42731.1. Samples: 4575399120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 19:55:31,003][15401] Updated weights for policy 0, policy_version 279260 (0.0037) [2024-06-22 19:55:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42599.8, 300 sec: 42709.5). Total num frames: 4575510528. Throughput: 0: 42779.6. Samples: 4575655120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 19:55:34,197][15401] Updated weights for policy 0, policy_version 279270 (0.0034) [2024-06-22 19:55:37,998][15349] Signal inference workers to stop experience collection... (67650 times) [2024-06-22 19:55:38,037][15401] InferenceWorker_p0-w0: stopping experience collection (67650 times) [2024-06-22 19:55:38,121][15349] Signal inference workers to resume experience collection... (67650 times) [2024-06-22 19:55:38,121][15401] InferenceWorker_p0-w0: resuming experience collection (67650 times) [2024-06-22 19:55:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4575690752. Throughput: 0: 42744.6. Samples: 4575786360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 19:55:39,062][15401] Updated weights for policy 0, policy_version 279280 (0.0037) [2024-06-22 19:55:42,069][15401] Updated weights for policy 0, policy_version 279290 (0.0026) [2024-06-22 19:55:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4575920128. Throughput: 0: 42868.9. Samples: 4576038680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 19:55:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279292_4575920128.pth... [2024-06-22 19:55:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278666_4565663744.pth [2024-06-22 19:55:46,769][15401] Updated weights for policy 0, policy_version 279300 (0.0036) [2024-06-22 19:55:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 4576133120. Throughput: 0: 42804.4. Samples: 4576296360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:48,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 19:55:49,751][15401] Updated weights for policy 0, policy_version 279310 (0.0046) [2024-06-22 19:55:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 4576329728. Throughput: 0: 42693.0. Samples: 4576426220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 19:55:54,116][15401] Updated weights for policy 0, policy_version 279320 (0.0032) [2024-06-22 19:55:57,306][15401] Updated weights for policy 0, policy_version 279330 (0.0031) [2024-06-22 19:55:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4576559104. Throughput: 0: 42758.7. Samples: 4576680680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-22 19:55:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 19:56:01,935][15401] Updated weights for policy 0, policy_version 279340 (0.0022) [2024-06-22 19:56:03,394][15132] Fps is (10 sec: 45853.4, 60 sec: 42868.1, 300 sec: 42764.3). Total num frames: 4576788480. Throughput: 0: 42991.5. Samples: 4576943520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:03,395][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 19:56:04,858][15401] Updated weights for policy 0, policy_version 279350 (0.0032) [2024-06-22 19:56:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.4, 300 sec: 42653.9). Total num frames: 4576968704. Throughput: 0: 42723.1. Samples: 4577067920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 19:56:09,554][15401] Updated weights for policy 0, policy_version 279360 (0.0035) [2024-06-22 19:56:12,464][15401] Updated weights for policy 0, policy_version 279370 (0.0041) [2024-06-22 19:56:13,390][15132] Fps is (10 sec: 40979.1, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 4577198080. Throughput: 0: 42561.4. Samples: 4577314380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 19:56:17,080][15401] Updated weights for policy 0, policy_version 279380 (0.0031) [2024-06-22 19:56:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 4577411072. Throughput: 0: 42667.1. Samples: 4577575140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 19:56:20,001][15401] Updated weights for policy 0, policy_version 279390 (0.0038) [2024-06-22 19:56:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4577624064. Throughput: 0: 42651.9. Samples: 4577705700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 19:56:24,589][15401] Updated weights for policy 0, policy_version 279400 (0.0027) [2024-06-22 19:56:27,891][15401] Updated weights for policy 0, policy_version 279410 (0.0037) [2024-06-22 19:56:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4577853440. Throughput: 0: 42645.7. Samples: 4577957740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:28,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-22 19:56:32,647][15401] Updated weights for policy 0, policy_version 279420 (0.0033) [2024-06-22 19:56:33,392][15132] Fps is (10 sec: 44227.3, 60 sec: 42596.7, 300 sec: 42875.9). Total num frames: 4578066432. Throughput: 0: 42743.0. Samples: 4578219900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:33,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 19:56:35,733][15401] Updated weights for policy 0, policy_version 279430 (0.0035) [2024-06-22 19:56:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 4578279424. Throughput: 0: 42638.6. Samples: 4578344960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 19:56:40,090][15401] Updated weights for policy 0, policy_version 279440 (0.0041) [2024-06-22 19:56:41,780][15349] Signal inference workers to stop experience collection... (67700 times) [2024-06-22 19:56:41,818][15401] InferenceWorker_p0-w0: stopping experience collection (67700 times) [2024-06-22 19:56:41,831][15349] Signal inference workers to resume experience collection... (67700 times) [2024-06-22 19:56:41,836][15401] InferenceWorker_p0-w0: resuming experience collection (67700 times) [2024-06-22 19:56:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 4578492416. Throughput: 0: 42713.4. Samples: 4578602780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 19:56:43,597][15401] Updated weights for policy 0, policy_version 279450 (0.0036) [2024-06-22 19:56:47,651][15401] Updated weights for policy 0, policy_version 279460 (0.0049) [2024-06-22 19:56:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4578689024. Throughput: 0: 42673.8. Samples: 4578863640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 19:56:51,430][15401] Updated weights for policy 0, policy_version 279470 (0.0031) [2024-06-22 19:56:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4578902016. Throughput: 0: 42732.1. Samples: 4578990860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 19:56:55,106][15401] Updated weights for policy 0, policy_version 279480 (0.0037) [2024-06-22 19:56:58,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4579131392. Throughput: 0: 42913.2. Samples: 4579245480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:56:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 19:56:59,163][15401] Updated weights for policy 0, policy_version 279490 (0.0036) [2024-06-22 19:57:02,795][15401] Updated weights for policy 0, policy_version 279500 (0.0022) [2024-06-22 19:57:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42601.7, 300 sec: 42820.6). Total num frames: 4579344384. Throughput: 0: 42870.6. Samples: 4579504320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:57:03,394][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 19:57:07,322][15401] Updated weights for policy 0, policy_version 279510 (0.0035) [2024-06-22 19:57:08,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4579540992. Throughput: 0: 42753.7. Samples: 4579629600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 19:57:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 19:57:10,385][15401] Updated weights for policy 0, policy_version 279520 (0.0035) [2024-06-22 19:57:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4579786752. Throughput: 0: 42958.7. Samples: 4579890880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 19:57:14,739][15401] Updated weights for policy 0, policy_version 279530 (0.0038) [2024-06-22 19:57:18,068][15401] Updated weights for policy 0, policy_version 279540 (0.0042) [2024-06-22 19:57:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4579983360. Throughput: 0: 42725.7. Samples: 4580142460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 19:57:22,231][15401] Updated weights for policy 0, policy_version 279550 (0.0037) [2024-06-22 19:57:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 4580179968. Throughput: 0: 42734.2. Samples: 4580268100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:23,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 19:57:25,837][15401] Updated weights for policy 0, policy_version 279560 (0.0037) [2024-06-22 19:57:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4580409344. Throughput: 0: 42818.2. Samples: 4580529600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 19:57:29,840][15401] Updated weights for policy 0, policy_version 279570 (0.0035) [2024-06-22 19:57:33,359][15401] Updated weights for policy 0, policy_version 279580 (0.0036) [2024-06-22 19:57:33,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4580638720. Throughput: 0: 42779.9. Samples: 4580788740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:33,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-22 19:57:37,286][15401] Updated weights for policy 0, policy_version 279590 (0.0024) [2024-06-22 19:57:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4580835328. Throughput: 0: 42668.4. Samples: 4580910940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 19:57:40,954][15401] Updated weights for policy 0, policy_version 279600 (0.0028) [2024-06-22 19:57:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4581064704. Throughput: 0: 42770.4. Samples: 4581170140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 19:57:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279606_4581064704.pth... [2024-06-22 19:57:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000278979_4570791936.pth [2024-06-22 19:57:44,826][15401] Updated weights for policy 0, policy_version 279610 (0.0028) [2024-06-22 19:57:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4581277696. Throughput: 0: 42894.2. Samples: 4581434660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:48,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 19:57:48,608][15401] Updated weights for policy 0, policy_version 279620 (0.0040) [2024-06-22 19:57:52,303][15401] Updated weights for policy 0, policy_version 279630 (0.0041) [2024-06-22 19:57:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4581474304. Throughput: 0: 42868.4. Samples: 4581558680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 19:57:56,198][15401] Updated weights for policy 0, policy_version 279640 (0.0040) [2024-06-22 19:57:58,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4581703680. Throughput: 0: 42849.2. Samples: 4581819100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:57:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 19:57:59,862][15401] Updated weights for policy 0, policy_version 279650 (0.0034) [2024-06-22 19:58:02,250][15349] Signal inference workers to stop experience collection... (67750 times) [2024-06-22 19:58:02,252][15349] Signal inference workers to resume experience collection... (67750 times) [2024-06-22 19:58:02,266][15401] InferenceWorker_p0-w0: stopping experience collection (67750 times) [2024-06-22 19:58:02,266][15401] InferenceWorker_p0-w0: resuming experience collection (67750 times) [2024-06-22 19:58:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4581916672. Throughput: 0: 43113.5. Samples: 4582082560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:58:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 19:58:03,917][15401] Updated weights for policy 0, policy_version 279660 (0.0028) [2024-06-22 19:58:07,505][15401] Updated weights for policy 0, policy_version 279670 (0.0034) [2024-06-22 19:58:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.5, 300 sec: 42821.5). Total num frames: 4582146048. Throughput: 0: 43115.3. Samples: 4582208180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:58:08,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 19:58:11,317][15401] Updated weights for policy 0, policy_version 279680 (0.0034) [2024-06-22 19:58:13,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 4582359040. Throughput: 0: 42958.3. Samples: 4582463000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:58:13,396][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 19:58:15,333][15401] Updated weights for policy 0, policy_version 279690 (0.0027) [2024-06-22 19:58:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4582555648. Throughput: 0: 43119.6. Samples: 4582729120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-22 19:58:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 19:58:19,053][15401] Updated weights for policy 0, policy_version 279700 (0.0032) [2024-06-22 19:58:23,324][15401] Updated weights for policy 0, policy_version 279710 (0.0033) [2024-06-22 19:58:23,389][15132] Fps is (10 sec: 40986.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 4582768640. Throughput: 0: 43081.3. Samples: 4582849600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 19:58:27,032][15401] Updated weights for policy 0, policy_version 279720 (0.0033) [2024-06-22 19:58:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4583014400. Throughput: 0: 43027.5. Samples: 4583106380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 19:58:30,952][15401] Updated weights for policy 0, policy_version 279730 (0.0030) [2024-06-22 19:58:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4583194624. Throughput: 0: 42899.2. Samples: 4583365020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 19:58:34,804][15401] Updated weights for policy 0, policy_version 279740 (0.0039) [2024-06-22 19:58:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4583407616. Throughput: 0: 42807.5. Samples: 4583485020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:38,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 19:58:38,615][15401] Updated weights for policy 0, policy_version 279750 (0.0038) [2024-06-22 19:58:42,321][15401] Updated weights for policy 0, policy_version 279760 (0.0030) [2024-06-22 19:58:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4583653376. Throughput: 0: 42931.3. Samples: 4583751000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 19:58:46,074][15401] Updated weights for policy 0, policy_version 279770 (0.0031) [2024-06-22 19:58:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 4583817216. Throughput: 0: 42718.7. Samples: 4584004900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 19:58:49,799][15401] Updated weights for policy 0, policy_version 279780 (0.0030) [2024-06-22 19:58:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4584046592. Throughput: 0: 42596.8. Samples: 4584125040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 19:58:53,982][15401] Updated weights for policy 0, policy_version 279790 (0.0031) [2024-06-22 19:58:57,589][15401] Updated weights for policy 0, policy_version 279800 (0.0029) [2024-06-22 19:58:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4584259584. Throughput: 0: 42820.4. Samples: 4584389640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:58:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 19:59:01,445][15401] Updated weights for policy 0, policy_version 279810 (0.0036) [2024-06-22 19:59:02,400][15349] Signal inference workers to stop experience collection... (67800 times) [2024-06-22 19:59:02,452][15401] InferenceWorker_p0-w0: stopping experience collection (67800 times) [2024-06-22 19:59:02,524][15349] Signal inference workers to resume experience collection... (67800 times) [2024-06-22 19:59:02,524][15401] InferenceWorker_p0-w0: resuming experience collection (67800 times) [2024-06-22 19:59:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4584472576. Throughput: 0: 42582.6. Samples: 4584645340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:03,393][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 19:59:05,254][15401] Updated weights for policy 0, policy_version 279820 (0.0034) [2024-06-22 19:59:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4584701952. Throughput: 0: 42757.3. Samples: 4584773680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 19:59:09,107][15401] Updated weights for policy 0, policy_version 279830 (0.0036) [2024-06-22 19:59:12,790][15401] Updated weights for policy 0, policy_version 279840 (0.0024) [2024-06-22 19:59:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 4584914944. Throughput: 0: 42817.8. Samples: 4585033180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 19:59:16,758][15401] Updated weights for policy 0, policy_version 279850 (0.0027) [2024-06-22 19:59:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4585127936. Throughput: 0: 42721.2. Samples: 4585287580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:18,393][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 19:59:20,334][15401] Updated weights for policy 0, policy_version 279860 (0.0047) [2024-06-22 19:59:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4585340928. Throughput: 0: 42967.1. Samples: 4585418540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 19:59:24,393][15401] Updated weights for policy 0, policy_version 279870 (0.0038) [2024-06-22 19:59:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42654.2). Total num frames: 4585537536. Throughput: 0: 42863.4. Samples: 4585679860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 19:59:28,454][15401] Updated weights for policy 0, policy_version 279880 (0.0028) [2024-06-22 19:59:31,879][15401] Updated weights for policy 0, policy_version 279890 (0.0035) [2024-06-22 19:59:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4585783296. Throughput: 0: 42749.8. Samples: 4585928640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-22 19:59:33,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 19:59:35,963][15401] Updated weights for policy 0, policy_version 279900 (0.0038) [2024-06-22 19:59:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4585979904. Throughput: 0: 43070.5. Samples: 4586063220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 19:59:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 19:59:39,784][15401] Updated weights for policy 0, policy_version 279910 (0.0034) [2024-06-22 19:59:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 4586192896. Throughput: 0: 42797.3. Samples: 4586315520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 19:59:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 19:59:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279920_4586209280.pth... [2024-06-22 19:59:43,470][15401] Updated weights for policy 0, policy_version 279920 (0.0037) [2024-06-22 19:59:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279292_4575920128.pth [2024-06-22 19:59:47,303][15401] Updated weights for policy 0, policy_version 279930 (0.0036) [2024-06-22 19:59:48,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4586405888. Throughput: 0: 42871.9. Samples: 4586574580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 19:59:48,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 19:59:51,096][15401] Updated weights for policy 0, policy_version 279940 (0.0034) [2024-06-22 19:59:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4586618880. Throughput: 0: 42696.5. Samples: 4586695020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 19:59:53,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 19:59:54,915][15401] Updated weights for policy 0, policy_version 279950 (0.0027) [2024-06-22 19:59:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4586831872. Throughput: 0: 42672.8. Samples: 4586953460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 19:59:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 19:59:58,686][15401] Updated weights for policy 0, policy_version 279960 (0.0029) [2024-06-22 20:00:02,500][15401] Updated weights for policy 0, policy_version 279970 (0.0038) [2024-06-22 20:00:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.8). Total num frames: 4587044864. Throughput: 0: 42739.1. Samples: 4587210740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 20:00:06,733][15401] Updated weights for policy 0, policy_version 279980 (0.0045) [2024-06-22 20:00:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4587241472. Throughput: 0: 42746.2. Samples: 4587342120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 20:00:10,281][15401] Updated weights for policy 0, policy_version 279990 (0.0039) [2024-06-22 20:00:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4587487232. Throughput: 0: 42587.5. Samples: 4587596300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 20:00:14,287][15401] Updated weights for policy 0, policy_version 280000 (0.0046) [2024-06-22 20:00:18,064][15401] Updated weights for policy 0, policy_version 280010 (0.0048) [2024-06-22 20:00:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 4587683840. Throughput: 0: 42523.5. Samples: 4587842200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:18,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 20:00:22,365][15401] Updated weights for policy 0, policy_version 280020 (0.0034) [2024-06-22 20:00:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4587880448. Throughput: 0: 42345.5. Samples: 4587968760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 20:00:24,511][15349] Signal inference workers to stop experience collection... (67850 times) [2024-06-22 20:00:24,516][15349] Signal inference workers to resume experience collection... (67850 times) [2024-06-22 20:00:24,553][15401] InferenceWorker_p0-w0: stopping experience collection (67850 times) [2024-06-22 20:00:24,553][15401] InferenceWorker_p0-w0: resuming experience collection (67850 times) [2024-06-22 20:00:25,957][15401] Updated weights for policy 0, policy_version 280030 (0.0033) [2024-06-22 20:00:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4588109824. Throughput: 0: 42365.8. Samples: 4588221980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 20:00:29,989][15401] Updated weights for policy 0, policy_version 280040 (0.0037) [2024-06-22 20:00:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4588322816. Throughput: 0: 42303.8. Samples: 4588478240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 20:00:33,559][15401] Updated weights for policy 0, policy_version 280050 (0.0032) [2024-06-22 20:00:37,612][15401] Updated weights for policy 0, policy_version 280060 (0.0044) [2024-06-22 20:00:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4588519424. Throughput: 0: 42454.1. Samples: 4588605460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 20:00:41,526][15401] Updated weights for policy 0, policy_version 280070 (0.0036) [2024-06-22 20:00:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4588748800. Throughput: 0: 42486.8. Samples: 4588865360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 20:00:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:00:45,798][15401] Updated weights for policy 0, policy_version 280080 (0.0029) [2024-06-22 20:00:48,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 4588945408. Throughput: 0: 42283.4. Samples: 4589113760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:00:48,397][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 20:00:49,285][15401] Updated weights for policy 0, policy_version 280090 (0.0031) [2024-06-22 20:00:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4589142016. Throughput: 0: 42137.4. Samples: 4589238300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:00:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-22 20:00:53,437][15401] Updated weights for policy 0, policy_version 280100 (0.0043) [2024-06-22 20:00:57,172][15401] Updated weights for policy 0, policy_version 280110 (0.0038) [2024-06-22 20:00:58,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.5, 300 sec: 42710.2). Total num frames: 4589387776. Throughput: 0: 42259.6. Samples: 4589497980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:00:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 20:01:01,055][15401] Updated weights for policy 0, policy_version 280120 (0.0042) [2024-06-22 20:01:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4589568000. Throughput: 0: 42416.0. Samples: 4589750920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 20:01:04,865][15401] Updated weights for policy 0, policy_version 280130 (0.0027) [2024-06-22 20:01:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4589780992. Throughput: 0: 42361.4. Samples: 4589875020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 20:01:08,643][15401] Updated weights for policy 0, policy_version 280140 (0.0046) [2024-06-22 20:01:12,437][15401] Updated weights for policy 0, policy_version 280150 (0.0037) [2024-06-22 20:01:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4590010368. Throughput: 0: 42503.1. Samples: 4590134620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 20:01:16,711][15401] Updated weights for policy 0, policy_version 280160 (0.0037) [2024-06-22 20:01:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4590223360. Throughput: 0: 42404.8. Samples: 4590386460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 20:01:20,313][15401] Updated weights for policy 0, policy_version 280170 (0.0035) [2024-06-22 20:01:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4590436352. Throughput: 0: 42366.3. Samples: 4590511940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 20:01:24,406][15401] Updated weights for policy 0, policy_version 280180 (0.0031) [2024-06-22 20:01:27,799][15401] Updated weights for policy 0, policy_version 280190 (0.0032) [2024-06-22 20:01:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4590649344. Throughput: 0: 42309.0. Samples: 4590769260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 20:01:31,934][15401] Updated weights for policy 0, policy_version 280200 (0.0026) [2024-06-22 20:01:33,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 4590862336. Throughput: 0: 42439.1. Samples: 4591023260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:33,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 20:01:35,554][15401] Updated weights for policy 0, policy_version 280210 (0.0031) [2024-06-22 20:01:38,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 4591075328. Throughput: 0: 42531.8. Samples: 4591152500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:38,396][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 20:01:39,620][15401] Updated weights for policy 0, policy_version 280220 (0.0028) [2024-06-22 20:01:43,237][15401] Updated weights for policy 0, policy_version 280230 (0.0036) [2024-06-22 20:01:43,389][15132] Fps is (10 sec: 42599.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4591288320. Throughput: 0: 42495.6. Samples: 4591410280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:01:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280231_4591304704.pth... [2024-06-22 20:01:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279606_4581064704.pth [2024-06-22 20:01:47,204][15401] Updated weights for policy 0, policy_version 280240 (0.0035) [2024-06-22 20:01:48,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 4591501312. Throughput: 0: 42641.3. Samples: 4591669780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 20:01:50,677][15401] Updated weights for policy 0, policy_version 280250 (0.0043) [2024-06-22 20:01:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 4591714304. Throughput: 0: 42626.6. Samples: 4591793220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 20:01:54,629][15401] Updated weights for policy 0, policy_version 280260 (0.0034) [2024-06-22 20:01:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4591927296. Throughput: 0: 42722.6. Samples: 4592057140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 20:01:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 20:01:58,450][15401] Updated weights for policy 0, policy_version 280270 (0.0026) [2024-06-22 20:02:02,307][15401] Updated weights for policy 0, policy_version 280280 (0.0036) [2024-06-22 20:02:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4592123904. Throughput: 0: 42804.5. Samples: 4592312660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 20:02:06,042][15401] Updated weights for policy 0, policy_version 280290 (0.0047) [2024-06-22 20:02:06,722][15349] Signal inference workers to stop experience collection... (67900 times) [2024-06-22 20:02:06,722][15349] Signal inference workers to resume experience collection... (67900 times) [2024-06-22 20:02:06,734][15401] InferenceWorker_p0-w0: stopping experience collection (67900 times) [2024-06-22 20:02:06,734][15401] InferenceWorker_p0-w0: resuming experience collection (67900 times) [2024-06-22 20:02:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4592353280. Throughput: 0: 42760.5. Samples: 4592436160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 20:02:10,153][15401] Updated weights for policy 0, policy_version 280300 (0.0048) [2024-06-22 20:02:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4592582656. Throughput: 0: 42863.9. Samples: 4592698140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 20:02:13,609][15401] Updated weights for policy 0, policy_version 280310 (0.0044) [2024-06-22 20:02:17,674][15401] Updated weights for policy 0, policy_version 280320 (0.0033) [2024-06-22 20:02:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4592795648. Throughput: 0: 42980.7. Samples: 4592957380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 20:02:20,973][15401] Updated weights for policy 0, policy_version 280330 (0.0028) [2024-06-22 20:02:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4593008640. Throughput: 0: 42901.6. Samples: 4593082800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 20:02:25,459][15401] Updated weights for policy 0, policy_version 280340 (0.0037) [2024-06-22 20:02:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4593221632. Throughput: 0: 42995.0. Samples: 4593345060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 20:02:28,968][15401] Updated weights for policy 0, policy_version 280350 (0.0037) [2024-06-22 20:02:33,018][15401] Updated weights for policy 0, policy_version 280360 (0.0032) [2024-06-22 20:02:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 4593418240. Throughput: 0: 42965.8. Samples: 4593603240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 20:02:36,489][15401] Updated weights for policy 0, policy_version 280370 (0.0028) [2024-06-22 20:02:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43149.2, 300 sec: 42709.5). Total num frames: 4593664000. Throughput: 0: 43042.4. Samples: 4593730120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 20:02:40,615][15401] Updated weights for policy 0, policy_version 280380 (0.0034) [2024-06-22 20:02:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 4593860608. Throughput: 0: 42925.0. Samples: 4593988760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 20:02:43,961][15401] Updated weights for policy 0, policy_version 280390 (0.0033) [2024-06-22 20:02:48,043][15401] Updated weights for policy 0, policy_version 280400 (0.0028) [2024-06-22 20:02:48,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4594073600. Throughput: 0: 42966.9. Samples: 4594246180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 20:02:51,806][15401] Updated weights for policy 0, policy_version 280410 (0.0032) [2024-06-22 20:02:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4594302976. Throughput: 0: 43075.9. Samples: 4594374580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 20:02:55,888][15401] Updated weights for policy 0, policy_version 280420 (0.0033) [2024-06-22 20:02:58,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4594499584. Throughput: 0: 43088.6. Samples: 4594637120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:02:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:02:59,252][15401] Updated weights for policy 0, policy_version 280430 (0.0034) [2024-06-22 20:03:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4594712576. Throughput: 0: 43049.4. Samples: 4594894600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:03:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 20:03:03,633][15401] Updated weights for policy 0, policy_version 280440 (0.0043) [2024-06-22 20:03:06,940][15401] Updated weights for policy 0, policy_version 280450 (0.0028) [2024-06-22 20:03:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42710.4). Total num frames: 4594958336. Throughput: 0: 43044.9. Samples: 4595019820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:03:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 20:03:11,023][15401] Updated weights for policy 0, policy_version 280460 (0.0030) [2024-06-22 20:03:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4595154944. Throughput: 0: 43027.6. Samples: 4595281300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 20:03:14,548][15401] Updated weights for policy 0, policy_version 280470 (0.0040) [2024-06-22 20:03:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4595367936. Throughput: 0: 42825.8. Samples: 4595530400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 20:03:18,485][15401] Updated weights for policy 0, policy_version 280480 (0.0032) [2024-06-22 20:03:22,014][15401] Updated weights for policy 0, policy_version 280490 (0.0034) [2024-06-22 20:03:23,396][15132] Fps is (10 sec: 45846.0, 60 sec: 43413.0, 300 sec: 42708.6). Total num frames: 4595613696. Throughput: 0: 42918.2. Samples: 4595661720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:23,397][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 20:03:25,547][15349] Signal inference workers to stop experience collection... (67950 times) [2024-06-22 20:03:25,547][15349] Signal inference workers to resume experience collection... (67950 times) [2024-06-22 20:03:25,566][15401] InferenceWorker_p0-w0: stopping experience collection (67950 times) [2024-06-22 20:03:25,566][15401] InferenceWorker_p0-w0: resuming experience collection (67950 times) [2024-06-22 20:03:26,071][15401] Updated weights for policy 0, policy_version 280500 (0.0040) [2024-06-22 20:03:28,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 4595777536. Throughput: 0: 42927.0. Samples: 4595920580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:28,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 20:03:29,586][15401] Updated weights for policy 0, policy_version 280510 (0.0027) [2024-06-22 20:03:33,389][15132] Fps is (10 sec: 40986.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4596023296. Throughput: 0: 42750.4. Samples: 4596169940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 20:03:33,635][15401] Updated weights for policy 0, policy_version 280520 (0.0032) [2024-06-22 20:03:37,637][15401] Updated weights for policy 0, policy_version 280530 (0.0039) [2024-06-22 20:03:38,389][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4596252672. Throughput: 0: 42959.1. Samples: 4596307740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 20:03:41,251][15401] Updated weights for policy 0, policy_version 280540 (0.0035) [2024-06-22 20:03:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4596416512. Throughput: 0: 42577.7. Samples: 4596553120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 20:03:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280543_4596416512.pth... [2024-06-22 20:03:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000279920_4586209280.pth [2024-06-22 20:03:45,470][15401] Updated weights for policy 0, policy_version 280550 (0.0029) [2024-06-22 20:03:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4596662272. Throughput: 0: 42609.3. Samples: 4596812020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 20:03:49,163][15401] Updated weights for policy 0, policy_version 280560 (0.0036) [2024-06-22 20:03:53,075][15401] Updated weights for policy 0, policy_version 280570 (0.0038) [2024-06-22 20:03:53,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4596891648. Throughput: 0: 42743.6. Samples: 4596943280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 20:03:56,824][15401] Updated weights for policy 0, policy_version 280580 (0.0036) [2024-06-22 20:03:58,393][15132] Fps is (10 sec: 39306.6, 60 sec: 42595.6, 300 sec: 42653.4). Total num frames: 4597055488. Throughput: 0: 42588.0. Samples: 4597197920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:03:58,394][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 20:04:00,784][15401] Updated weights for policy 0, policy_version 280590 (0.0030) [2024-06-22 20:04:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4597301248. Throughput: 0: 42567.5. Samples: 4597445940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:04:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 20:04:04,243][15401] Updated weights for policy 0, policy_version 280600 (0.0035) [2024-06-22 20:04:08,389][15132] Fps is (10 sec: 44253.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4597497856. Throughput: 0: 42633.7. Samples: 4597579960. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:04:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 20:04:08,466][15401] Updated weights for policy 0, policy_version 280610 (0.0034) [2024-06-22 20:04:12,542][15401] Updated weights for policy 0, policy_version 280620 (0.0036) [2024-06-22 20:04:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 4597694464. Throughput: 0: 42571.2. Samples: 4597836180. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:04:13,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 20:04:16,164][15401] Updated weights for policy 0, policy_version 280630 (0.0047) [2024-06-22 20:04:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 4597940224. Throughput: 0: 42615.8. Samples: 4598087660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:04:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 20:04:20,071][15401] Updated weights for policy 0, policy_version 280640 (0.0033) [2024-06-22 20:04:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41783.7, 300 sec: 42653.9). Total num frames: 4598120448. Throughput: 0: 42501.4. Samples: 4598220300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-22 20:04:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 20:04:23,894][15401] Updated weights for policy 0, policy_version 280650 (0.0031) [2024-06-22 20:04:27,627][15401] Updated weights for policy 0, policy_version 280660 (0.0048) [2024-06-22 20:04:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 4598349824. Throughput: 0: 42660.9. Samples: 4598472860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 20:04:31,460][15401] Updated weights for policy 0, policy_version 280670 (0.0031) [2024-06-22 20:04:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4598562816. Throughput: 0: 42680.0. Samples: 4598732620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 20:04:35,217][15401] Updated weights for policy 0, policy_version 280680 (0.0027) [2024-06-22 20:04:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4598759424. Throughput: 0: 42529.2. Samples: 4598857100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:38,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 20:04:38,992][15401] Updated weights for policy 0, policy_version 280690 (0.0030) [2024-06-22 20:04:42,744][15401] Updated weights for policy 0, policy_version 280700 (0.0045) [2024-06-22 20:04:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4599005184. Throughput: 0: 42542.6. Samples: 4599112180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:04:46,042][15349] Signal inference workers to stop experience collection... (68000 times) [2024-06-22 20:04:46,048][15349] Signal inference workers to resume experience collection... (68000 times) [2024-06-22 20:04:46,058][15401] InferenceWorker_p0-w0: stopping experience collection (68000 times) [2024-06-22 20:04:46,079][15401] InferenceWorker_p0-w0: resuming experience collection (68000 times) [2024-06-22 20:04:46,750][15401] Updated weights for policy 0, policy_version 280710 (0.0035) [2024-06-22 20:04:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4599201792. Throughput: 0: 42826.5. Samples: 4599373140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:48,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 20:04:50,214][15401] Updated weights for policy 0, policy_version 280720 (0.0026) [2024-06-22 20:04:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 4599414784. Throughput: 0: 42668.7. Samples: 4599500060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 20:04:54,612][15401] Updated weights for policy 0, policy_version 280730 (0.0028) [2024-06-22 20:04:57,975][15401] Updated weights for policy 0, policy_version 280740 (0.0044) [2024-06-22 20:04:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43147.2, 300 sec: 42709.5). Total num frames: 4599644160. Throughput: 0: 42608.7. Samples: 4599753580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:04:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 20:05:02,235][15401] Updated weights for policy 0, policy_version 280750 (0.0040) [2024-06-22 20:05:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4599840768. Throughput: 0: 42977.4. Samples: 4600021640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 20:05:05,389][15401] Updated weights for policy 0, policy_version 280760 (0.0036) [2024-06-22 20:05:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4600053760. Throughput: 0: 42779.2. Samples: 4600145360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 20:05:09,919][15401] Updated weights for policy 0, policy_version 280770 (0.0046) [2024-06-22 20:05:12,814][15401] Updated weights for policy 0, policy_version 280780 (0.0031) [2024-06-22 20:05:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4600299520. Throughput: 0: 42903.5. Samples: 4600403520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 20:05:17,570][15401] Updated weights for policy 0, policy_version 280790 (0.0039) [2024-06-22 20:05:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4600479744. Throughput: 0: 42854.6. Samples: 4600661080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 20:05:20,986][15401] Updated weights for policy 0, policy_version 280800 (0.0040) [2024-06-22 20:05:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4600709120. Throughput: 0: 42833.0. Samples: 4600784580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 20:05:25,208][15401] Updated weights for policy 0, policy_version 280810 (0.0036) [2024-06-22 20:05:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4600938496. Throughput: 0: 42971.2. Samples: 4601045880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 20:05:28,527][15401] Updated weights for policy 0, policy_version 280820 (0.0034) [2024-06-22 20:05:33,272][15401] Updated weights for policy 0, policy_version 280830 (0.0035) [2024-06-22 20:05:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4601118720. Throughput: 0: 43000.2. Samples: 4601308140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-22 20:05:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 20:05:36,178][15401] Updated weights for policy 0, policy_version 280840 (0.0032) [2024-06-22 20:05:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4601348096. Throughput: 0: 42784.5. Samples: 4601425360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:05:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 20:05:40,823][15401] Updated weights for policy 0, policy_version 280850 (0.0036) [2024-06-22 20:05:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 4601577472. Throughput: 0: 43043.2. Samples: 4601690520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:05:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 20:05:43,453][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280859_4601593856.pth... [2024-06-22 20:05:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280231_4591304704.pth [2024-06-22 20:05:43,937][15401] Updated weights for policy 0, policy_version 280860 (0.0034) [2024-06-22 20:05:48,357][15401] Updated weights for policy 0, policy_version 280870 (0.0032) [2024-06-22 20:05:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4601774080. Throughput: 0: 42789.9. Samples: 4601947180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:05:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 20:05:50,893][15349] Signal inference workers to stop experience collection... (68050 times) [2024-06-22 20:05:50,893][15349] Signal inference workers to resume experience collection... (68050 times) [2024-06-22 20:05:50,905][15401] InferenceWorker_p0-w0: stopping experience collection (68050 times) [2024-06-22 20:05:50,928][15401] InferenceWorker_p0-w0: resuming experience collection (68050 times) [2024-06-22 20:05:51,455][15401] Updated weights for policy 0, policy_version 280880 (0.0033) [2024-06-22 20:05:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4601987072. Throughput: 0: 42691.5. Samples: 4602066480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:05:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 20:05:55,853][15401] Updated weights for policy 0, policy_version 280890 (0.0032) [2024-06-22 20:05:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.9, 300 sec: 42875.8). Total num frames: 4602216448. Throughput: 0: 42896.0. Samples: 4602333940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:05:58,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 20:05:59,194][15401] Updated weights for policy 0, policy_version 280900 (0.0032) [2024-06-22 20:06:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4602396672. Throughput: 0: 42972.9. Samples: 4602594860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:06:03,629][15401] Updated weights for policy 0, policy_version 280910 (0.0035) [2024-06-22 20:06:06,711][15401] Updated weights for policy 0, policy_version 280920 (0.0030) [2024-06-22 20:06:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4602658816. Throughput: 0: 43101.7. Samples: 4602724160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 20:06:11,045][15401] Updated weights for policy 0, policy_version 280930 (0.0026) [2024-06-22 20:06:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4602839040. Throughput: 0: 43132.5. Samples: 4602986840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 20:06:14,139][15401] Updated weights for policy 0, policy_version 280940 (0.0027) [2024-06-22 20:06:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4603068416. Throughput: 0: 43135.1. Samples: 4603249220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:18,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 20:06:18,517][15401] Updated weights for policy 0, policy_version 280950 (0.0035) [2024-06-22 20:06:21,957][15401] Updated weights for policy 0, policy_version 280960 (0.0022) [2024-06-22 20:06:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4603314176. Throughput: 0: 43408.4. Samples: 4603378740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 20:06:26,308][15401] Updated weights for policy 0, policy_version 280970 (0.0046) [2024-06-22 20:06:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 4603494400. Throughput: 0: 43101.2. Samples: 4603630080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:28,391][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 20:06:29,576][15401] Updated weights for policy 0, policy_version 280980 (0.0035) [2024-06-22 20:06:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 4603707392. Throughput: 0: 43140.4. Samples: 4603888500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:33,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 20:06:33,803][15401] Updated weights for policy 0, policy_version 280990 (0.0024) [2024-06-22 20:06:37,548][15401] Updated weights for policy 0, policy_version 281000 (0.0048) [2024-06-22 20:06:38,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4603953152. Throughput: 0: 43456.8. Samples: 4604022040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 20:06:41,372][15401] Updated weights for policy 0, policy_version 281010 (0.0033) [2024-06-22 20:06:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4604149760. Throughput: 0: 43066.2. Samples: 4604271820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 20:06:45,306][15401] Updated weights for policy 0, policy_version 281020 (0.0041) [2024-06-22 20:06:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4604362752. Throughput: 0: 42949.2. Samples: 4604527580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 20:06:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 20:06:48,860][15401] Updated weights for policy 0, policy_version 281030 (0.0024) [2024-06-22 20:06:52,823][15401] Updated weights for policy 0, policy_version 281040 (0.0021) [2024-06-22 20:06:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4604592128. Throughput: 0: 43061.9. Samples: 4604661940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:06:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 20:06:56,684][15401] Updated weights for policy 0, policy_version 281050 (0.0032) [2024-06-22 20:06:58,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42595.5, 300 sec: 42875.1). Total num frames: 4604772352. Throughput: 0: 42700.1. Samples: 4604908620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:06:58,397][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 20:07:00,433][15401] Updated weights for policy 0, policy_version 281060 (0.0031) [2024-06-22 20:07:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4605001728. Throughput: 0: 42746.6. Samples: 4605172820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 20:07:04,418][15401] Updated weights for policy 0, policy_version 281070 (0.0031) [2024-06-22 20:07:08,055][15401] Updated weights for policy 0, policy_version 281080 (0.0032) [2024-06-22 20:07:08,390][15132] Fps is (10 sec: 45904.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4605231104. Throughput: 0: 42843.6. Samples: 4605306700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:08,391][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 20:07:12,008][15401] Updated weights for policy 0, policy_version 281090 (0.0037) [2024-06-22 20:07:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4605427712. Throughput: 0: 42833.9. Samples: 4605557600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:13,398][15132] Avg episode reward: [(0, '0.837')] [2024-06-22 20:07:15,688][15401] Updated weights for policy 0, policy_version 281100 (0.0023) [2024-06-22 20:07:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4605640704. Throughput: 0: 42816.0. Samples: 4605815220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:18,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 20:07:18,648][15349] Signal inference workers to stop experience collection... (68100 times) [2024-06-22 20:07:18,650][15349] Signal inference workers to resume experience collection... (68100 times) [2024-06-22 20:07:18,689][15401] InferenceWorker_p0-w0: stopping experience collection (68100 times) [2024-06-22 20:07:18,696][15401] InferenceWorker_p0-w0: resuming experience collection (68100 times) [2024-06-22 20:07:19,579][15401] Updated weights for policy 0, policy_version 281110 (0.0038) [2024-06-22 20:07:23,378][15401] Updated weights for policy 0, policy_version 281120 (0.0044) [2024-06-22 20:07:23,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 4605870080. Throughput: 0: 42663.6. Samples: 4605942000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:23,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 20:07:27,255][15401] Updated weights for policy 0, policy_version 281130 (0.0029) [2024-06-22 20:07:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 4606083072. Throughput: 0: 42743.6. Samples: 4606195280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:07:31,213][15401] Updated weights for policy 0, policy_version 281140 (0.0033) [2024-06-22 20:07:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4606296064. Throughput: 0: 42660.0. Samples: 4606447280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 20:07:34,736][15401] Updated weights for policy 0, policy_version 281150 (0.0033) [2024-06-22 20:07:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4606492672. Throughput: 0: 42659.4. Samples: 4606581620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 20:07:38,818][15401] Updated weights for policy 0, policy_version 281160 (0.0041) [2024-06-22 20:07:42,335][15401] Updated weights for policy 0, policy_version 281170 (0.0042) [2024-06-22 20:07:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4606705664. Throughput: 0: 42848.3. Samples: 4606836520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:43,398][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 20:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281171_4606705664.pth... [2024-06-22 20:07:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280543_4596416512.pth [2024-06-22 20:07:46,475][15401] Updated weights for policy 0, policy_version 281180 (0.0047) [2024-06-22 20:07:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4606935040. Throughput: 0: 42661.2. Samples: 4607092580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:48,399][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 20:07:49,918][15401] Updated weights for policy 0, policy_version 281190 (0.0033) [2024-06-22 20:07:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4607115264. Throughput: 0: 42597.8. Samples: 4607223600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 20:07:54,145][15401] Updated weights for policy 0, policy_version 281200 (0.0037) [2024-06-22 20:07:57,601][15401] Updated weights for policy 0, policy_version 281210 (0.0028) [2024-06-22 20:07:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43147.4, 300 sec: 42875.7). Total num frames: 4607361024. Throughput: 0: 42609.8. Samples: 4607475140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:07:58,393][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 20:08:01,812][15401] Updated weights for policy 0, policy_version 281220 (0.0029) [2024-06-22 20:08:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4607574016. Throughput: 0: 42529.6. Samples: 4607729060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:03,396][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 20:08:05,535][15401] Updated weights for policy 0, policy_version 281230 (0.0029) [2024-06-22 20:08:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4607770624. Throughput: 0: 42600.1. Samples: 4607858900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 20:08:09,406][15401] Updated weights for policy 0, policy_version 281240 (0.0038) [2024-06-22 20:08:13,057][15401] Updated weights for policy 0, policy_version 281250 (0.0028) [2024-06-22 20:08:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4608016384. Throughput: 0: 42582.6. Samples: 4608111500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 20:08:17,074][15401] Updated weights for policy 0, policy_version 281260 (0.0036) [2024-06-22 20:08:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 4608196608. Throughput: 0: 42691.6. Samples: 4608368400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 20:08:20,559][15401] Updated weights for policy 0, policy_version 281270 (0.0040) [2024-06-22 20:08:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42327.0, 300 sec: 42820.9). Total num frames: 4608409600. Throughput: 0: 42493.9. Samples: 4608493840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 20:08:25,233][15401] Updated weights for policy 0, policy_version 281280 (0.0029) [2024-06-22 20:08:28,304][15401] Updated weights for policy 0, policy_version 281290 (0.0042) [2024-06-22 20:08:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4608655360. Throughput: 0: 42470.4. Samples: 4608747680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 20:08:32,748][15401] Updated weights for policy 0, policy_version 281300 (0.0038) [2024-06-22 20:08:33,395][15132] Fps is (10 sec: 42575.8, 60 sec: 42321.7, 300 sec: 42653.2). Total num frames: 4608835584. Throughput: 0: 42689.4. Samples: 4609013820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:33,395][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 20:08:35,837][15401] Updated weights for policy 0, policy_version 281310 (0.0036) [2024-06-22 20:08:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4609064960. Throughput: 0: 42422.7. Samples: 4609132620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:38,392][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 20:08:40,667][15401] Updated weights for policy 0, policy_version 281320 (0.0024) [2024-06-22 20:08:41,382][15349] Signal inference workers to stop experience collection... (68150 times) [2024-06-22 20:08:41,416][15401] InferenceWorker_p0-w0: stopping experience collection (68150 times) [2024-06-22 20:08:41,490][15349] Signal inference workers to resume experience collection... (68150 times) [2024-06-22 20:08:41,491][15401] InferenceWorker_p0-w0: resuming experience collection (68150 times) [2024-06-22 20:08:43,244][15401] Updated weights for policy 0, policy_version 281330 (0.0042) [2024-06-22 20:08:43,389][15132] Fps is (10 sec: 47538.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4609310720. Throughput: 0: 42577.9. Samples: 4609391040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:43,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-22 20:08:48,278][15401] Updated weights for policy 0, policy_version 281340 (0.0033) [2024-06-22 20:08:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4609474560. Throughput: 0: 42745.5. Samples: 4609652600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 20:08:51,318][15401] Updated weights for policy 0, policy_version 281350 (0.0030) [2024-06-22 20:08:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42932.2). Total num frames: 4609720320. Throughput: 0: 42579.9. Samples: 4609775000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:53,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 20:08:55,745][15401] Updated weights for policy 0, policy_version 281360 (0.0044) [2024-06-22 20:08:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 4609933312. Throughput: 0: 42769.7. Samples: 4610036140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:08:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 20:08:59,157][15401] Updated weights for policy 0, policy_version 281370 (0.0031) [2024-06-22 20:09:03,295][15401] Updated weights for policy 0, policy_version 281380 (0.0030) [2024-06-22 20:09:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 4610129920. Throughput: 0: 42873.3. Samples: 4610297700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:09:03,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-22 20:09:07,019][15401] Updated weights for policy 0, policy_version 281390 (0.0036) [2024-06-22 20:09:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4610342912. Throughput: 0: 42835.5. Samples: 4610421440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:09:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 20:09:11,029][15401] Updated weights for policy 0, policy_version 281400 (0.0034) [2024-06-22 20:09:13,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 4610572288. Throughput: 0: 42931.0. Samples: 4610679680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 20:09:13,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 20:09:14,789][15401] Updated weights for policy 0, policy_version 281410 (0.0035) [2024-06-22 20:09:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4610768896. Throughput: 0: 42857.5. Samples: 4610942180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:18,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-22 20:09:18,506][15401] Updated weights for policy 0, policy_version 281420 (0.0031) [2024-06-22 20:09:22,401][15401] Updated weights for policy 0, policy_version 281430 (0.0033) [2024-06-22 20:09:23,391][15132] Fps is (10 sec: 42602.7, 60 sec: 43143.5, 300 sec: 42875.9). Total num frames: 4610998272. Throughput: 0: 42881.0. Samples: 4611062320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:23,391][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 20:09:26,663][15401] Updated weights for policy 0, policy_version 281440 (0.0030) [2024-06-22 20:09:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4611211264. Throughput: 0: 42837.4. Samples: 4611318720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:09:30,175][15401] Updated weights for policy 0, policy_version 281450 (0.0042) [2024-06-22 20:09:33,390][15132] Fps is (10 sec: 40965.4, 60 sec: 42875.2, 300 sec: 42876.1). Total num frames: 4611407872. Throughput: 0: 42721.3. Samples: 4611575060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 20:09:34,291][15401] Updated weights for policy 0, policy_version 281460 (0.0028) [2024-06-22 20:09:37,762][15401] Updated weights for policy 0, policy_version 281470 (0.0036) [2024-06-22 20:09:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4611637248. Throughput: 0: 42777.8. Samples: 4611700000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 20:09:41,748][15401] Updated weights for policy 0, policy_version 281480 (0.0028) [2024-06-22 20:09:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 4611833856. Throughput: 0: 42696.0. Samples: 4611957460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 20:09:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281484_4611833856.pth... [2024-06-22 20:09:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000280859_4601593856.pth [2024-06-22 20:09:45,395][15401] Updated weights for policy 0, policy_version 281490 (0.0029) [2024-06-22 20:09:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4612063232. Throughput: 0: 42576.9. Samples: 4612213660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 20:09:49,338][15401] Updated weights for policy 0, policy_version 281500 (0.0040) [2024-06-22 20:09:53,185][15401] Updated weights for policy 0, policy_version 281510 (0.0025) [2024-06-22 20:09:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4612259840. Throughput: 0: 42622.2. Samples: 4612339440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 20:09:56,953][15401] Updated weights for policy 0, policy_version 281520 (0.0029) [2024-06-22 20:09:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4612472832. Throughput: 0: 42528.9. Samples: 4612593380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:09:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 20:10:00,935][15401] Updated weights for policy 0, policy_version 281530 (0.0026) [2024-06-22 20:10:01,840][15349] Signal inference workers to stop experience collection... (68200 times) [2024-06-22 20:10:01,840][15349] Signal inference workers to resume experience collection... (68200 times) [2024-06-22 20:10:01,854][15401] InferenceWorker_p0-w0: stopping experience collection (68200 times) [2024-06-22 20:10:01,855][15401] InferenceWorker_p0-w0: resuming experience collection (68200 times) [2024-06-22 20:10:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4612685824. Throughput: 0: 42322.6. Samples: 4612846700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:10:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 20:10:04,543][15401] Updated weights for policy 0, policy_version 281540 (0.0038) [2024-06-22 20:10:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4612898816. Throughput: 0: 42516.0. Samples: 4612975480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:10:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 20:10:08,976][15401] Updated weights for policy 0, policy_version 281550 (0.0044) [2024-06-22 20:10:12,194][15401] Updated weights for policy 0, policy_version 281560 (0.0037) [2024-06-22 20:10:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 4613095424. Throughput: 0: 42515.6. Samples: 4613231920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:10:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 20:10:16,497][15401] Updated weights for policy 0, policy_version 281570 (0.0032) [2024-06-22 20:10:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4613324800. Throughput: 0: 42395.2. Samples: 4613482840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:10:18,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 20:10:19,895][15401] Updated weights for policy 0, policy_version 281580 (0.0034) [2024-06-22 20:10:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42053.3, 300 sec: 42654.0). Total num frames: 4613521408. Throughput: 0: 42576.9. Samples: 4613615960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-22 20:10:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 20:10:24,306][15401] Updated weights for policy 0, policy_version 281590 (0.0038) [2024-06-22 20:10:27,566][15401] Updated weights for policy 0, policy_version 281600 (0.0033) [2024-06-22 20:10:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4613750784. Throughput: 0: 42426.7. Samples: 4613866660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 20:10:31,962][15401] Updated weights for policy 0, policy_version 281610 (0.0035) [2024-06-22 20:10:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4613980160. Throughput: 0: 42525.4. Samples: 4614127300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 20:10:35,177][15401] Updated weights for policy 0, policy_version 281620 (0.0028) [2024-06-22 20:10:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 4614160384. Throughput: 0: 42611.2. Samples: 4614256940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 20:10:39,965][15401] Updated weights for policy 0, policy_version 281630 (0.0043) [2024-06-22 20:10:43,101][15401] Updated weights for policy 0, policy_version 281640 (0.0039) [2024-06-22 20:10:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4614389760. Throughput: 0: 42473.8. Samples: 4614504700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 20:10:47,453][15401] Updated weights for policy 0, policy_version 281650 (0.0027) [2024-06-22 20:10:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4614602752. Throughput: 0: 42528.9. Samples: 4614760500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 20:10:51,381][15401] Updated weights for policy 0, policy_version 281660 (0.0041) [2024-06-22 20:10:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4614799360. Throughput: 0: 42593.3. Samples: 4614892180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 20:10:55,018][15401] Updated weights for policy 0, policy_version 281670 (0.0033) [2024-06-22 20:10:58,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 4615012352. Throughput: 0: 42491.5. Samples: 4615144140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:10:58,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:10:59,009][15401] Updated weights for policy 0, policy_version 281680 (0.0036) [2024-06-22 20:11:02,789][15401] Updated weights for policy 0, policy_version 281690 (0.0032) [2024-06-22 20:11:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4615258112. Throughput: 0: 42636.3. Samples: 4615401480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 20:11:06,377][15349] Signal inference workers to stop experience collection... (68250 times) [2024-06-22 20:11:06,377][15349] Signal inference workers to resume experience collection... (68250 times) [2024-06-22 20:11:06,391][15401] InferenceWorker_p0-w0: stopping experience collection (68250 times) [2024-06-22 20:11:06,416][15401] InferenceWorker_p0-w0: resuming experience collection (68250 times) [2024-06-22 20:11:06,524][15401] Updated weights for policy 0, policy_version 281700 (0.0031) [2024-06-22 20:11:08,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4615438336. Throughput: 0: 42536.8. Samples: 4615530120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 20:11:10,337][15401] Updated weights for policy 0, policy_version 281710 (0.0041) [2024-06-22 20:11:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4615667712. Throughput: 0: 42494.3. Samples: 4615778900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 20:11:14,028][15401] Updated weights for policy 0, policy_version 281720 (0.0042) [2024-06-22 20:11:17,886][15401] Updated weights for policy 0, policy_version 281730 (0.0024) [2024-06-22 20:11:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4615880704. Throughput: 0: 42374.6. Samples: 4616034160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 20:11:21,990][15401] Updated weights for policy 0, policy_version 281740 (0.0039) [2024-06-22 20:11:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4616093696. Throughput: 0: 42251.5. Samples: 4616158260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:23,398][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 20:11:26,009][15401] Updated weights for policy 0, policy_version 281750 (0.0034) [2024-06-22 20:11:28,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 4616290304. Throughput: 0: 42473.8. Samples: 4616416040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 20:11:29,435][15401] Updated weights for policy 0, policy_version 281760 (0.0038) [2024-06-22 20:11:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 4616486912. Throughput: 0: 42536.6. Samples: 4616674640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:33,398][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:11:33,578][15401] Updated weights for policy 0, policy_version 281770 (0.0029) [2024-06-22 20:11:37,099][15401] Updated weights for policy 0, policy_version 281780 (0.0033) [2024-06-22 20:11:38,390][15132] Fps is (10 sec: 42600.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4616716288. Throughput: 0: 42472.4. Samples: 4616803440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:11:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 20:11:41,039][15401] Updated weights for policy 0, policy_version 281790 (0.0037) [2024-06-22 20:11:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4616945664. Throughput: 0: 42563.2. Samples: 4617059380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:11:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 20:11:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281796_4616945664.pth... [2024-06-22 20:11:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281171_4606705664.pth [2024-06-22 20:11:44,580][15401] Updated weights for policy 0, policy_version 281800 (0.0031) [2024-06-22 20:11:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4617142272. Throughput: 0: 42578.0. Samples: 4617317480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:11:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-22 20:11:48,612][15401] Updated weights for policy 0, policy_version 281810 (0.0033) [2024-06-22 20:11:52,595][15401] Updated weights for policy 0, policy_version 281820 (0.0032) [2024-06-22 20:11:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 4617355264. Throughput: 0: 42584.0. Samples: 4617446400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:11:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 20:11:56,631][15401] Updated weights for policy 0, policy_version 281830 (0.0033) [2024-06-22 20:11:58,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42868.6, 300 sec: 42653.0). Total num frames: 4617584640. Throughput: 0: 42640.5. Samples: 4617698000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:11:58,396][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 20:12:00,158][15401] Updated weights for policy 0, policy_version 281840 (0.0042) [2024-06-22 20:12:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 4617781248. Throughput: 0: 42768.5. Samples: 4617958740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:03,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-22 20:12:03,942][15349] Signal inference workers to stop experience collection... (68300 times) [2024-06-22 20:12:04,000][15401] InferenceWorker_p0-w0: stopping experience collection (68300 times) [2024-06-22 20:12:04,061][15349] Signal inference workers to resume experience collection... (68300 times) [2024-06-22 20:12:04,061][15401] InferenceWorker_p0-w0: resuming experience collection (68300 times) [2024-06-22 20:12:04,207][15401] Updated weights for policy 0, policy_version 281850 (0.0029) [2024-06-22 20:12:07,779][15401] Updated weights for policy 0, policy_version 281860 (0.0031) [2024-06-22 20:12:08,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4618010624. Throughput: 0: 42825.4. Samples: 4618085400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:08,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-22 20:12:11,657][15401] Updated weights for policy 0, policy_version 281870 (0.0027) [2024-06-22 20:12:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4618240000. Throughput: 0: 42758.2. Samples: 4618340140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 20:12:15,566][15401] Updated weights for policy 0, policy_version 281880 (0.0029) [2024-06-22 20:12:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 4618420224. Throughput: 0: 42946.7. Samples: 4618607240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 20:12:19,330][15401] Updated weights for policy 0, policy_version 281890 (0.0043) [2024-06-22 20:12:23,202][15401] Updated weights for policy 0, policy_version 281900 (0.0032) [2024-06-22 20:12:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4618649600. Throughput: 0: 42770.3. Samples: 4618728100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 20:12:27,023][15401] Updated weights for policy 0, policy_version 281910 (0.0032) [2024-06-22 20:12:28,396][15132] Fps is (10 sec: 47483.0, 60 sec: 43413.3, 300 sec: 42708.6). Total num frames: 4618895360. Throughput: 0: 42835.2. Samples: 4618987240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:28,396][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 20:12:30,824][15401] Updated weights for policy 0, policy_version 281920 (0.0040) [2024-06-22 20:12:33,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 4619075584. Throughput: 0: 42871.2. Samples: 4619246960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:33,396][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 20:12:34,661][15401] Updated weights for policy 0, policy_version 281930 (0.0032) [2024-06-22 20:12:38,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4619288576. Throughput: 0: 42688.0. Samples: 4619367360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 20:12:38,398][15401] Updated weights for policy 0, policy_version 281940 (0.0034) [2024-06-22 20:12:42,248][15401] Updated weights for policy 0, policy_version 281950 (0.0029) [2024-06-22 20:12:43,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4619517952. Throughput: 0: 43073.0. Samples: 4619636000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 20:12:46,217][15401] Updated weights for policy 0, policy_version 281960 (0.0034) [2024-06-22 20:12:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4619730944. Throughput: 0: 43009.8. Samples: 4619894180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-22 20:12:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 20:12:49,669][15401] Updated weights for policy 0, policy_version 281970 (0.0039) [2024-06-22 20:12:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 4619927552. Throughput: 0: 42978.1. Samples: 4620019420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:12:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 20:12:53,945][15401] Updated weights for policy 0, policy_version 281980 (0.0035) [2024-06-22 20:12:57,358][15401] Updated weights for policy 0, policy_version 281990 (0.0027) [2024-06-22 20:12:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 4620140544. Throughput: 0: 43054.3. Samples: 4620277580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:12:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 20:13:01,504][15401] Updated weights for policy 0, policy_version 282000 (0.0031) [2024-06-22 20:13:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4620369920. Throughput: 0: 42817.3. Samples: 4620534020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 20:13:04,862][15401] Updated weights for policy 0, policy_version 282010 (0.0029) [2024-06-22 20:13:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4620566528. Throughput: 0: 43094.1. Samples: 4620667340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:08,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 20:13:09,011][15401] Updated weights for policy 0, policy_version 282020 (0.0037) [2024-06-22 20:13:12,446][15401] Updated weights for policy 0, policy_version 282030 (0.0042) [2024-06-22 20:13:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4620795904. Throughput: 0: 42967.9. Samples: 4620920520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 20:13:16,798][15401] Updated weights for policy 0, policy_version 282040 (0.0037) [2024-06-22 20:13:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4621025280. Throughput: 0: 42900.6. Samples: 4621177220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:18,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-22 20:13:19,947][15401] Updated weights for policy 0, policy_version 282050 (0.0026) [2024-06-22 20:13:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4621221888. Throughput: 0: 43139.5. Samples: 4621308640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-22 20:13:24,094][15401] Updated weights for policy 0, policy_version 282060 (0.0028) [2024-06-22 20:13:27,441][15401] Updated weights for policy 0, policy_version 282070 (0.0032) [2024-06-22 20:13:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42601.2, 300 sec: 42765.4). Total num frames: 4621451264. Throughput: 0: 42934.9. Samples: 4621568180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:28,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 20:13:31,923][15401] Updated weights for policy 0, policy_version 282080 (0.0039) [2024-06-22 20:13:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43422.1, 300 sec: 42765.0). Total num frames: 4621680640. Throughput: 0: 42939.0. Samples: 4621826440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 20:13:34,511][15349] Signal inference workers to stop experience collection... (68350 times) [2024-06-22 20:13:34,548][15401] InferenceWorker_p0-w0: stopping experience collection (68350 times) [2024-06-22 20:13:34,572][15349] Signal inference workers to resume experience collection... (68350 times) [2024-06-22 20:13:34,576][15401] InferenceWorker_p0-w0: resuming experience collection (68350 times) [2024-06-22 20:13:35,017][15401] Updated weights for policy 0, policy_version 282090 (0.0025) [2024-06-22 20:13:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4621844480. Throughput: 0: 43043.2. Samples: 4621956360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:13:39,587][15401] Updated weights for policy 0, policy_version 282100 (0.0028) [2024-06-22 20:13:43,275][15401] Updated weights for policy 0, policy_version 282110 (0.0033) [2024-06-22 20:13:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4622090240. Throughput: 0: 42947.1. Samples: 4622210200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 20:13:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282111_4622106624.pth... [2024-06-22 20:13:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281484_4611833856.pth [2024-06-22 20:13:47,128][15401] Updated weights for policy 0, policy_version 282120 (0.0029) [2024-06-22 20:13:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4622303232. Throughput: 0: 43048.0. Samples: 4622471180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 20:13:50,908][15401] Updated weights for policy 0, policy_version 282130 (0.0032) [2024-06-22 20:13:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4622499840. Throughput: 0: 42869.1. Samples: 4622596440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 20:13:54,722][15401] Updated weights for policy 0, policy_version 282140 (0.0041) [2024-06-22 20:13:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4622729216. Throughput: 0: 42944.9. Samples: 4622853040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:13:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 20:13:58,417][15401] Updated weights for policy 0, policy_version 282150 (0.0044) [2024-06-22 20:14:02,350][15401] Updated weights for policy 0, policy_version 282160 (0.0033) [2024-06-22 20:14:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4622942208. Throughput: 0: 43032.6. Samples: 4623113680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 20:14:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 20:14:05,996][15401] Updated weights for policy 0, policy_version 282170 (0.0031) [2024-06-22 20:14:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 4623138816. Throughput: 0: 42839.1. Samples: 4623236400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 20:14:10,094][15401] Updated weights for policy 0, policy_version 282180 (0.0026) [2024-06-22 20:14:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4623384576. Throughput: 0: 42885.4. Samples: 4623497920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 20:14:13,591][15401] Updated weights for policy 0, policy_version 282190 (0.0028) [2024-06-22 20:14:17,472][15401] Updated weights for policy 0, policy_version 282200 (0.0030) [2024-06-22 20:14:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 4623597568. Throughput: 0: 42935.0. Samples: 4623758520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-22 20:14:21,420][15401] Updated weights for policy 0, policy_version 282210 (0.0034) [2024-06-22 20:14:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4623794176. Throughput: 0: 42902.3. Samples: 4623886960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 20:14:25,106][15401] Updated weights for policy 0, policy_version 282220 (0.0043) [2024-06-22 20:14:28,393][15132] Fps is (10 sec: 42582.1, 60 sec: 42870.4, 300 sec: 42764.5). Total num frames: 4624023552. Throughput: 0: 42985.5. Samples: 4624144720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:28,394][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 20:14:28,864][15401] Updated weights for policy 0, policy_version 282230 (0.0030) [2024-06-22 20:14:32,715][15401] Updated weights for policy 0, policy_version 282240 (0.0043) [2024-06-22 20:14:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4624220160. Throughput: 0: 42968.7. Samples: 4624404780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:33,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 20:14:36,334][15401] Updated weights for policy 0, policy_version 282250 (0.0048) [2024-06-22 20:14:38,389][15132] Fps is (10 sec: 42615.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4624449536. Throughput: 0: 43016.4. Samples: 4624532180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:38,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 20:14:40,178][15401] Updated weights for policy 0, policy_version 282260 (0.0030) [2024-06-22 20:14:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4624646144. Throughput: 0: 43063.9. Samples: 4624790920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 20:14:44,249][15401] Updated weights for policy 0, policy_version 282270 (0.0029) [2024-06-22 20:14:47,624][15401] Updated weights for policy 0, policy_version 282280 (0.0023) [2024-06-22 20:14:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4624875520. Throughput: 0: 42856.3. Samples: 4625042220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 20:14:51,809][15401] Updated weights for policy 0, policy_version 282290 (0.0046) [2024-06-22 20:14:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4625088512. Throughput: 0: 43131.4. Samples: 4625177320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 20:14:55,509][15401] Updated weights for policy 0, policy_version 282300 (0.0031) [2024-06-22 20:14:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4625285120. Throughput: 0: 42956.0. Samples: 4625430940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:14:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 20:14:59,752][15401] Updated weights for policy 0, policy_version 282310 (0.0032) [2024-06-22 20:15:02,949][15401] Updated weights for policy 0, policy_version 282320 (0.0027) [2024-06-22 20:15:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4625530880. Throughput: 0: 42733.0. Samples: 4625681500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:15:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 20:15:07,537][15401] Updated weights for policy 0, policy_version 282330 (0.0039) [2024-06-22 20:15:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4625727488. Throughput: 0: 42962.1. Samples: 4625820260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:15:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 20:15:10,834][15401] Updated weights for policy 0, policy_version 282340 (0.0021) [2024-06-22 20:15:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4625940480. Throughput: 0: 42927.7. Samples: 4626076300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-22 20:15:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 20:15:15,177][15401] Updated weights for policy 0, policy_version 282350 (0.0034) [2024-06-22 20:15:16,859][15349] Signal inference workers to stop experience collection... (68400 times) [2024-06-22 20:15:16,900][15401] InferenceWorker_p0-w0: stopping experience collection (68400 times) [2024-06-22 20:15:16,915][15349] Signal inference workers to resume experience collection... (68400 times) [2024-06-22 20:15:16,920][15401] InferenceWorker_p0-w0: resuming experience collection (68400 times) [2024-06-22 20:15:18,293][15401] Updated weights for policy 0, policy_version 282360 (0.0026) [2024-06-22 20:15:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4626186240. Throughput: 0: 42709.4. Samples: 4626326700. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:18,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 20:15:22,969][15401] Updated weights for policy 0, policy_version 282370 (0.0035) [2024-06-22 20:15:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4626366464. Throughput: 0: 42830.1. Samples: 4626459540. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 20:15:25,769][15401] Updated weights for policy 0, policy_version 282380 (0.0035) [2024-06-22 20:15:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42874.2, 300 sec: 42765.0). Total num frames: 4626595840. Throughput: 0: 42765.3. Samples: 4626715360. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 20:15:30,461][15401] Updated weights for policy 0, policy_version 282390 (0.0035) [2024-06-22 20:15:33,259][15401] Updated weights for policy 0, policy_version 282400 (0.0030) [2024-06-22 20:15:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 4626841600. Throughput: 0: 42722.2. Samples: 4626964720. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 20:15:38,007][15401] Updated weights for policy 0, policy_version 282410 (0.0033) [2024-06-22 20:15:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4627005440. Throughput: 0: 42776.0. Samples: 4627102240. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 20:15:40,891][15401] Updated weights for policy 0, policy_version 282420 (0.0031) [2024-06-22 20:15:43,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4627218432. Throughput: 0: 42728.9. Samples: 4627353740. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:43,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 20:15:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282423_4627218432.pth... [2024-06-22 20:15:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000281796_4616945664.pth [2024-06-22 20:15:45,536][15401] Updated weights for policy 0, policy_version 282430 (0.0042) [2024-06-22 20:15:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4627464192. Throughput: 0: 42812.4. Samples: 4627608060. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 20:15:48,718][15401] Updated weights for policy 0, policy_version 282440 (0.0036) [2024-06-22 20:15:53,137][15401] Updated weights for policy 0, policy_version 282450 (0.0034) [2024-06-22 20:15:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 4627660800. Throughput: 0: 42789.6. Samples: 4627745800. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 20:15:56,465][15401] Updated weights for policy 0, policy_version 282460 (0.0046) [2024-06-22 20:15:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4627873792. Throughput: 0: 42705.3. Samples: 4627998040. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:15:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 20:16:00,741][15401] Updated weights for policy 0, policy_version 282470 (0.0031) [2024-06-22 20:16:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4628086784. Throughput: 0: 42783.2. Samples: 4628251940. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 20:16:04,322][15401] Updated weights for policy 0, policy_version 282480 (0.0030) [2024-06-22 20:16:08,379][15401] Updated weights for policy 0, policy_version 282490 (0.0040) [2024-06-22 20:16:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4628316160. Throughput: 0: 42624.5. Samples: 4628377640. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 20:16:12,435][15401] Updated weights for policy 0, policy_version 282500 (0.0037) [2024-06-22 20:16:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4628496384. Throughput: 0: 42555.2. Samples: 4628630340. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 20:16:16,156][15401] Updated weights for policy 0, policy_version 282510 (0.0039) [2024-06-22 20:16:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4628709376. Throughput: 0: 42634.7. Samples: 4628883280. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:18,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 20:16:20,381][15401] Updated weights for policy 0, policy_version 282520 (0.0046) [2024-06-22 20:16:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4628922368. Throughput: 0: 42379.6. Samples: 4629009320. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 20:16:23,840][15401] Updated weights for policy 0, policy_version 282530 (0.0025) [2024-06-22 20:16:28,233][15401] Updated weights for policy 0, policy_version 282540 (0.0037) [2024-06-22 20:16:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4629135360. Throughput: 0: 42552.8. Samples: 4629268620. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-22 20:16:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 20:16:31,483][15401] Updated weights for policy 0, policy_version 282550 (0.0034) [2024-06-22 20:16:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42820.6). Total num frames: 4629348352. Throughput: 0: 42541.8. Samples: 4629522440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:33,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 20:16:36,015][15401] Updated weights for policy 0, policy_version 282560 (0.0030) [2024-06-22 20:16:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4629561344. Throughput: 0: 42412.2. Samples: 4629654340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 20:16:38,697][15349] Signal inference workers to stop experience collection... (68450 times) [2024-06-22 20:16:38,698][15349] Signal inference workers to resume experience collection... (68450 times) [2024-06-22 20:16:38,732][15401] InferenceWorker_p0-w0: stopping experience collection (68450 times) [2024-06-22 20:16:38,732][15401] InferenceWorker_p0-w0: resuming experience collection (68450 times) [2024-06-22 20:16:39,070][15401] Updated weights for policy 0, policy_version 282570 (0.0032) [2024-06-22 20:16:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4629774336. Throughput: 0: 42533.0. Samples: 4629912020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:16:43,592][15401] Updated weights for policy 0, policy_version 282580 (0.0035) [2024-06-22 20:16:46,904][15401] Updated weights for policy 0, policy_version 282590 (0.0034) [2024-06-22 20:16:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4630003712. Throughput: 0: 42567.1. Samples: 4630167460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 20:16:51,152][15401] Updated weights for policy 0, policy_version 282600 (0.0027) [2024-06-22 20:16:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 4630216704. Throughput: 0: 42656.7. Samples: 4630297200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:53,391][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 20:16:54,691][15401] Updated weights for policy 0, policy_version 282610 (0.0031) [2024-06-22 20:16:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4630413312. Throughput: 0: 42791.9. Samples: 4630555980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:16:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 20:16:58,967][15401] Updated weights for policy 0, policy_version 282620 (0.0035) [2024-06-22 20:17:02,322][15401] Updated weights for policy 0, policy_version 282630 (0.0036) [2024-06-22 20:17:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42876.0). Total num frames: 4630659072. Throughput: 0: 42765.6. Samples: 4630807740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 20:17:06,519][15401] Updated weights for policy 0, policy_version 282640 (0.0027) [2024-06-22 20:17:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4630855680. Throughput: 0: 42957.9. Samples: 4630942420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 20:17:09,870][15401] Updated weights for policy 0, policy_version 282650 (0.0036) [2024-06-22 20:17:13,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4631052288. Throughput: 0: 42597.4. Samples: 4631185500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 20:17:14,278][15401] Updated weights for policy 0, policy_version 282660 (0.0041) [2024-06-22 20:17:18,121][15401] Updated weights for policy 0, policy_version 282670 (0.0030) [2024-06-22 20:17:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4631281664. Throughput: 0: 42745.6. Samples: 4631446000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 20:17:21,947][15401] Updated weights for policy 0, policy_version 282680 (0.0037) [2024-06-22 20:17:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42710.4). Total num frames: 4631494656. Throughput: 0: 42629.1. Samples: 4631572660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 20:17:25,689][15401] Updated weights for policy 0, policy_version 282690 (0.0033) [2024-06-22 20:17:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 4631707648. Throughput: 0: 42552.4. Samples: 4631826880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 20:17:29,574][15401] Updated weights for policy 0, policy_version 282700 (0.0034) [2024-06-22 20:17:33,254][15401] Updated weights for policy 0, policy_version 282710 (0.0032) [2024-06-22 20:17:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4631920640. Throughput: 0: 42667.6. Samples: 4632087500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 20:17:37,195][15401] Updated weights for policy 0, policy_version 282720 (0.0046) [2024-06-22 20:17:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4632133632. Throughput: 0: 42616.6. Samples: 4632214940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-22 20:17:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 20:17:40,821][15401] Updated weights for policy 0, policy_version 282730 (0.0049) [2024-06-22 20:17:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4632363008. Throughput: 0: 42413.1. Samples: 4632464580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:17:43,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-22 20:17:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282737_4632363008.pth... [2024-06-22 20:17:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282111_4622106624.pth [2024-06-22 20:17:44,901][15401] Updated weights for policy 0, policy_version 282740 (0.0047) [2024-06-22 20:17:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4632559616. Throughput: 0: 42662.4. Samples: 4632727540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:17:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 20:17:48,533][15401] Updated weights for policy 0, policy_version 282750 (0.0041) [2024-06-22 20:17:53,020][15401] Updated weights for policy 0, policy_version 282760 (0.0025) [2024-06-22 20:17:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 4632739840. Throughput: 0: 42416.4. Samples: 4632851160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:17:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 20:17:56,170][15401] Updated weights for policy 0, policy_version 282770 (0.0030) [2024-06-22 20:17:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4633001984. Throughput: 0: 42719.9. Samples: 4633107900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:17:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 20:18:00,566][15401] Updated weights for policy 0, policy_version 282780 (0.0027) [2024-06-22 20:18:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4633198592. Throughput: 0: 42777.9. Samples: 4633371000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 20:18:03,802][15401] Updated weights for policy 0, policy_version 282790 (0.0049) [2024-06-22 20:18:08,051][15401] Updated weights for policy 0, policy_version 282800 (0.0039) [2024-06-22 20:18:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 4633395200. Throughput: 0: 42783.6. Samples: 4633497920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 20:18:11,485][15401] Updated weights for policy 0, policy_version 282810 (0.0030) [2024-06-22 20:18:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4633640960. Throughput: 0: 42886.3. Samples: 4633756760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 20:18:15,479][15401] Updated weights for policy 0, policy_version 282820 (0.0029) [2024-06-22 20:18:16,281][15349] Signal inference workers to stop experience collection... (68500 times) [2024-06-22 20:18:16,332][15401] InferenceWorker_p0-w0: stopping experience collection (68500 times) [2024-06-22 20:18:16,339][15349] Signal inference workers to resume experience collection... (68500 times) [2024-06-22 20:18:16,352][15401] InferenceWorker_p0-w0: resuming experience collection (68500 times) [2024-06-22 20:18:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4633853952. Throughput: 0: 42745.8. Samples: 4634011060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 20:18:19,099][15401] Updated weights for policy 0, policy_version 282830 (0.0025) [2024-06-22 20:18:23,282][15401] Updated weights for policy 0, policy_version 282840 (0.0041) [2024-06-22 20:18:23,394][15132] Fps is (10 sec: 40941.2, 60 sec: 42595.3, 300 sec: 42709.2). Total num frames: 4634050560. Throughput: 0: 42797.4. Samples: 4634141020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:23,395][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 20:18:26,593][15401] Updated weights for policy 0, policy_version 282850 (0.0036) [2024-06-22 20:18:28,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 4634296320. Throughput: 0: 42975.3. Samples: 4634398560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:28,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:18:30,769][15401] Updated weights for policy 0, policy_version 282860 (0.0042) [2024-06-22 20:18:33,389][15132] Fps is (10 sec: 42617.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4634476544. Throughput: 0: 42779.6. Samples: 4634652620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 20:18:34,339][15401] Updated weights for policy 0, policy_version 282870 (0.0042) [2024-06-22 20:18:38,360][15401] Updated weights for policy 0, policy_version 282880 (0.0028) [2024-06-22 20:18:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4634705920. Throughput: 0: 42814.7. Samples: 4634777820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 20:18:41,963][15401] Updated weights for policy 0, policy_version 282890 (0.0036) [2024-06-22 20:18:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4634935296. Throughput: 0: 42949.4. Samples: 4635040620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 20:18:45,972][15401] Updated weights for policy 0, policy_version 282900 (0.0035) [2024-06-22 20:18:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4635131904. Throughput: 0: 42874.6. Samples: 4635300360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 20:18:49,558][15401] Updated weights for policy 0, policy_version 282910 (0.0036) [2024-06-22 20:18:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4635328512. Throughput: 0: 42788.5. Samples: 4635423400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 20:18:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 20:18:53,776][15401] Updated weights for policy 0, policy_version 282920 (0.0027) [2024-06-22 20:18:57,082][15401] Updated weights for policy 0, policy_version 282930 (0.0044) [2024-06-22 20:18:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4635574272. Throughput: 0: 42730.7. Samples: 4635679640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:18:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 20:19:01,139][15401] Updated weights for policy 0, policy_version 282940 (0.0032) [2024-06-22 20:19:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4635770880. Throughput: 0: 42954.3. Samples: 4635944000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 20:19:04,963][15401] Updated weights for policy 0, policy_version 282950 (0.0031) [2024-06-22 20:19:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4636000256. Throughput: 0: 42866.9. Samples: 4636069840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:08,396][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:19:09,065][15401] Updated weights for policy 0, policy_version 282960 (0.0037) [2024-06-22 20:19:12,703][15401] Updated weights for policy 0, policy_version 282970 (0.0039) [2024-06-22 20:19:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4636213248. Throughput: 0: 42888.1. Samples: 4636328420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 20:19:16,660][15401] Updated weights for policy 0, policy_version 282980 (0.0026) [2024-06-22 20:19:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4636426240. Throughput: 0: 42964.4. Samples: 4636586020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:18,396][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 20:19:20,517][15401] Updated weights for policy 0, policy_version 282990 (0.0038) [2024-06-22 20:19:23,394][15132] Fps is (10 sec: 44216.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 4636655616. Throughput: 0: 42934.8. Samples: 4636710080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:23,395][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 20:19:24,006][15401] Updated weights for policy 0, policy_version 283000 (0.0025) [2024-06-22 20:19:28,048][15401] Updated weights for policy 0, policy_version 283010 (0.0023) [2024-06-22 20:19:28,394][15132] Fps is (10 sec: 42580.7, 60 sec: 42597.1, 300 sec: 42820.0). Total num frames: 4636852224. Throughput: 0: 42987.1. Samples: 4636975220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:28,394][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 20:19:31,404][15401] Updated weights for policy 0, policy_version 283020 (0.0028) [2024-06-22 20:19:33,392][15132] Fps is (10 sec: 42607.5, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 4637081600. Throughput: 0: 42881.4. Samples: 4637230120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:33,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 20:19:35,526][15401] Updated weights for policy 0, policy_version 283030 (0.0034) [2024-06-22 20:19:38,390][15132] Fps is (10 sec: 42615.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4637278208. Throughput: 0: 43114.2. Samples: 4637363540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 20:19:39,220][15401] Updated weights for policy 0, policy_version 283040 (0.0038) [2024-06-22 20:19:43,146][15401] Updated weights for policy 0, policy_version 283050 (0.0033) [2024-06-22 20:19:43,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4637491200. Throughput: 0: 43087.0. Samples: 4637618560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 20:19:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283051_4637507584.pth... [2024-06-22 20:19:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282423_4627218432.pth [2024-06-22 20:19:46,929][15401] Updated weights for policy 0, policy_version 283060 (0.0046) [2024-06-22 20:19:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4637720576. Throughput: 0: 42795.4. Samples: 4637869800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 20:19:49,751][15349] Signal inference workers to stop experience collection... (68550 times) [2024-06-22 20:19:49,751][15349] Signal inference workers to resume experience collection... (68550 times) [2024-06-22 20:19:49,792][15401] InferenceWorker_p0-w0: stopping experience collection (68550 times) [2024-06-22 20:19:49,792][15401] InferenceWorker_p0-w0: resuming experience collection (68550 times) [2024-06-22 20:19:50,816][15401] Updated weights for policy 0, policy_version 283070 (0.0040) [2024-06-22 20:19:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4637933568. Throughput: 0: 42852.8. Samples: 4637998220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 20:19:54,622][15401] Updated weights for policy 0, policy_version 283080 (0.0039) [2024-06-22 20:19:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4638130176. Throughput: 0: 42874.6. Samples: 4638257780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:19:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 20:19:58,561][15401] Updated weights for policy 0, policy_version 283090 (0.0043) [2024-06-22 20:20:02,326][15401] Updated weights for policy 0, policy_version 283100 (0.0028) [2024-06-22 20:20:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4638359552. Throughput: 0: 42854.6. Samples: 4638514480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 20:20:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 20:20:06,028][15401] Updated weights for policy 0, policy_version 283110 (0.0027) [2024-06-22 20:20:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4638572544. Throughput: 0: 42884.3. Samples: 4638639680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 20:20:10,065][15401] Updated weights for policy 0, policy_version 283120 (0.0036) [2024-06-22 20:20:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 4638769152. Throughput: 0: 42848.4. Samples: 4638903220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 20:20:13,961][15401] Updated weights for policy 0, policy_version 283130 (0.0032) [2024-06-22 20:20:17,796][15401] Updated weights for policy 0, policy_version 283140 (0.0034) [2024-06-22 20:20:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4638965760. Throughput: 0: 42855.6. Samples: 4639158520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 20:20:21,464][15401] Updated weights for policy 0, policy_version 283150 (0.0033) [2024-06-22 20:20:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42601.6, 300 sec: 42765.0). Total num frames: 4639211520. Throughput: 0: 42664.9. Samples: 4639283460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 20:20:25,273][15401] Updated weights for policy 0, policy_version 283160 (0.0031) [2024-06-22 20:20:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42601.4, 300 sec: 42598.4). Total num frames: 4639408128. Throughput: 0: 42793.8. Samples: 4639544280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 20:20:29,016][15401] Updated weights for policy 0, policy_version 283170 (0.0031) [2024-06-22 20:20:32,788][15401] Updated weights for policy 0, policy_version 283180 (0.0041) [2024-06-22 20:20:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 4639637504. Throughput: 0: 43040.0. Samples: 4639806600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 20:20:36,849][15401] Updated weights for policy 0, policy_version 283190 (0.0039) [2024-06-22 20:20:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4639850496. Throughput: 0: 42961.1. Samples: 4639931460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 20:20:40,646][15401] Updated weights for policy 0, policy_version 283200 (0.0039) [2024-06-22 20:20:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4640063488. Throughput: 0: 42862.3. Samples: 4640186580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 20:20:44,776][15401] Updated weights for policy 0, policy_version 283210 (0.0042) [2024-06-22 20:20:48,070][15401] Updated weights for policy 0, policy_version 283220 (0.0029) [2024-06-22 20:20:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4640276480. Throughput: 0: 42883.1. Samples: 4640444220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:48,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 20:20:52,268][15401] Updated weights for policy 0, policy_version 283230 (0.0032) [2024-06-22 20:20:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4640505856. Throughput: 0: 43057.3. Samples: 4640577260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 20:20:56,210][15401] Updated weights for policy 0, policy_version 283240 (0.0037) [2024-06-22 20:20:58,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4640702464. Throughput: 0: 42928.8. Samples: 4640835120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:20:58,392][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 20:20:59,672][15401] Updated weights for policy 0, policy_version 283250 (0.0036) [2024-06-22 20:21:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4640899072. Throughput: 0: 43012.3. Samples: 4641094080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:21:03,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 20:21:03,750][15401] Updated weights for policy 0, policy_version 283260 (0.0025) [2024-06-22 20:21:04,995][15349] Signal inference workers to stop experience collection... (68600 times) [2024-06-22 20:21:04,995][15349] Signal inference workers to resume experience collection... (68600 times) [2024-06-22 20:21:05,046][15401] InferenceWorker_p0-w0: stopping experience collection (68600 times) [2024-06-22 20:21:05,046][15401] InferenceWorker_p0-w0: resuming experience collection (68600 times) [2024-06-22 20:21:07,099][15401] Updated weights for policy 0, policy_version 283270 (0.0029) [2024-06-22 20:21:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4641144832. Throughput: 0: 43035.2. Samples: 4641220040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:21:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 20:21:11,327][15401] Updated weights for policy 0, policy_version 283280 (0.0027) [2024-06-22 20:21:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4641357824. Throughput: 0: 43021.3. Samples: 4641480240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:21:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 20:21:14,587][15401] Updated weights for policy 0, policy_version 283290 (0.0030) [2024-06-22 20:21:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4641554432. Throughput: 0: 42957.5. Samples: 4641739680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 20:21:18,390][15132] Avg episode reward: [(0, '0.899')] [2024-06-22 20:21:18,800][15401] Updated weights for policy 0, policy_version 283300 (0.0032) [2024-06-22 20:21:21,978][15401] Updated weights for policy 0, policy_version 283310 (0.0036) [2024-06-22 20:21:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4641800192. Throughput: 0: 43091.4. Samples: 4641870580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 20:21:26,511][15401] Updated weights for policy 0, policy_version 283320 (0.0040) [2024-06-22 20:21:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4642013184. Throughput: 0: 43190.3. Samples: 4642130140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 20:21:29,682][15401] Updated weights for policy 0, policy_version 283330 (0.0037) [2024-06-22 20:21:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4642193408. Throughput: 0: 43067.3. Samples: 4642382240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 20:21:34,154][15401] Updated weights for policy 0, policy_version 283340 (0.0045) [2024-06-22 20:21:37,274][15401] Updated weights for policy 0, policy_version 283350 (0.0028) [2024-06-22 20:21:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4642455552. Throughput: 0: 42932.1. Samples: 4642509200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 20:21:41,727][15401] Updated weights for policy 0, policy_version 283360 (0.0026) [2024-06-22 20:21:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4642635776. Throughput: 0: 42993.4. Samples: 4642769720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:21:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283365_4642652160.pth... [2024-06-22 20:21:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000282737_4632363008.pth [2024-06-22 20:21:44,734][15401] Updated weights for policy 0, policy_version 283370 (0.0036) [2024-06-22 20:21:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4642865152. Throughput: 0: 42863.2. Samples: 4643022920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 20:21:49,627][15401] Updated weights for policy 0, policy_version 283380 (0.0032) [2024-06-22 20:21:52,476][15401] Updated weights for policy 0, policy_version 283390 (0.0027) [2024-06-22 20:21:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4643078144. Throughput: 0: 42952.8. Samples: 4643152920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 20:21:57,276][15401] Updated weights for policy 0, policy_version 283400 (0.0031) [2024-06-22 20:21:58,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42868.6, 300 sec: 42764.1). Total num frames: 4643274752. Throughput: 0: 43040.5. Samples: 4643417340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:21:58,396][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 20:22:00,253][15401] Updated weights for policy 0, policy_version 283410 (0.0030) [2024-06-22 20:22:03,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43686.0, 300 sec: 42930.7). Total num frames: 4643520512. Throughput: 0: 42793.9. Samples: 4643665680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:03,396][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 20:22:04,717][15401] Updated weights for policy 0, policy_version 283420 (0.0028) [2024-06-22 20:22:07,818][15401] Updated weights for policy 0, policy_version 283430 (0.0051) [2024-06-22 20:22:08,389][15132] Fps is (10 sec: 45905.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4643733504. Throughput: 0: 42946.7. Samples: 4643803180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 20:22:12,527][15401] Updated weights for policy 0, policy_version 283440 (0.0030) [2024-06-22 20:22:13,389][15132] Fps is (10 sec: 37707.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4643897344. Throughput: 0: 42951.9. Samples: 4644062980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 20:22:14,853][15349] Signal inference workers to stop experience collection... (68650 times) [2024-06-22 20:22:14,888][15401] InferenceWorker_p0-w0: stopping experience collection (68650 times) [2024-06-22 20:22:14,912][15349] Signal inference workers to resume experience collection... (68650 times) [2024-06-22 20:22:14,916][15401] InferenceWorker_p0-w0: resuming experience collection (68650 times) [2024-06-22 20:22:15,554][15401] Updated weights for policy 0, policy_version 283450 (0.0037) [2024-06-22 20:22:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42931.7). Total num frames: 4644159488. Throughput: 0: 42808.3. Samples: 4644308620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:18,390][15132] Avg episode reward: [(0, '0.917')] [2024-06-22 20:22:19,955][15401] Updated weights for policy 0, policy_version 283460 (0.0038) [2024-06-22 20:22:23,245][15401] Updated weights for policy 0, policy_version 283470 (0.0040) [2024-06-22 20:22:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4644372480. Throughput: 0: 43009.3. Samples: 4644444620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 20:22:27,505][15401] Updated weights for policy 0, policy_version 283480 (0.0035) [2024-06-22 20:22:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4644536320. Throughput: 0: 42916.4. Samples: 4644700960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 20:22:28,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 20:22:31,007][15401] Updated weights for policy 0, policy_version 283490 (0.0049) [2024-06-22 20:22:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4644798464. Throughput: 0: 42837.4. Samples: 4644950600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:33,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 20:22:34,971][15401] Updated weights for policy 0, policy_version 283500 (0.0036) [2024-06-22 20:22:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4645011456. Throughput: 0: 43049.8. Samples: 4645090160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 20:22:38,495][15401] Updated weights for policy 0, policy_version 283510 (0.0040) [2024-06-22 20:22:42,466][15401] Updated weights for policy 0, policy_version 283520 (0.0029) [2024-06-22 20:22:43,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 4645191680. Throughput: 0: 42719.7. Samples: 4645339460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 20:22:46,312][15401] Updated weights for policy 0, policy_version 283530 (0.0039) [2024-06-22 20:22:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 4645437440. Throughput: 0: 42955.6. Samples: 4645598400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 20:22:49,883][15401] Updated weights for policy 0, policy_version 283540 (0.0030) [2024-06-22 20:22:53,389][15132] Fps is (10 sec: 44238.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4645634048. Throughput: 0: 42725.4. Samples: 4645725820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 20:22:54,007][15401] Updated weights for policy 0, policy_version 283550 (0.0049) [2024-06-22 20:22:57,454][15401] Updated weights for policy 0, policy_version 283560 (0.0037) [2024-06-22 20:22:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 4645847040. Throughput: 0: 42429.4. Samples: 4645972300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:22:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 20:23:01,794][15401] Updated weights for policy 0, policy_version 283570 (0.0035) [2024-06-22 20:23:03,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42328.2, 300 sec: 42931.3). Total num frames: 4646060032. Throughput: 0: 42667.1. Samples: 4646228740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:03,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 20:23:05,074][15401] Updated weights for policy 0, policy_version 283580 (0.0020) [2024-06-22 20:23:08,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.5, 300 sec: 42764.7). Total num frames: 4646256640. Throughput: 0: 42593.3. Samples: 4646361420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:08,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 20:23:09,529][15401] Updated weights for policy 0, policy_version 283590 (0.0031) [2024-06-22 20:23:12,594][15401] Updated weights for policy 0, policy_version 283600 (0.0026) [2024-06-22 20:23:13,390][15132] Fps is (10 sec: 44246.6, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 4646502400. Throughput: 0: 42427.8. Samples: 4646610220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 20:23:17,165][15401] Updated weights for policy 0, policy_version 283610 (0.0032) [2024-06-22 20:23:18,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42596.7, 300 sec: 42931.9). Total num frames: 4646715392. Throughput: 0: 42523.0. Samples: 4646864240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:18,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 20:23:19,740][15349] Signal inference workers to stop experience collection... (68700 times) [2024-06-22 20:23:19,740][15349] Signal inference workers to resume experience collection... (68700 times) [2024-06-22 20:23:19,753][15401] InferenceWorker_p0-w0: stopping experience collection (68700 times) [2024-06-22 20:23:19,753][15401] InferenceWorker_p0-w0: resuming experience collection (68700 times) [2024-06-22 20:23:20,720][15401] Updated weights for policy 0, policy_version 283620 (0.0037) [2024-06-22 20:23:23,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 4646895616. Throughput: 0: 42351.2. Samples: 4646995960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 20:23:24,657][15401] Updated weights for policy 0, policy_version 283630 (0.0039) [2024-06-22 20:23:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 4647141376. Throughput: 0: 42640.2. Samples: 4647258260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 20:23:28,510][15401] Updated weights for policy 0, policy_version 283640 (0.0032) [2024-06-22 20:23:32,242][15401] Updated weights for policy 0, policy_version 283650 (0.0031) [2024-06-22 20:23:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4647370752. Throughput: 0: 42458.1. Samples: 4647509020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 20:23:36,565][15401] Updated weights for policy 0, policy_version 283660 (0.0030) [2024-06-22 20:23:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4647550976. Throughput: 0: 42579.9. Samples: 4647641920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 20:23:39,963][15401] Updated weights for policy 0, policy_version 283670 (0.0024) [2024-06-22 20:23:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4647780352. Throughput: 0: 42682.1. Samples: 4647893000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 20:23:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 20:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283678_4647780352.pth... [2024-06-22 20:23:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283051_4637507584.pth [2024-06-22 20:23:44,271][15401] Updated weights for policy 0, policy_version 283680 (0.0040) [2024-06-22 20:23:47,605][15401] Updated weights for policy 0, policy_version 283690 (0.0034) [2024-06-22 20:23:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4648009728. Throughput: 0: 42649.5. Samples: 4648147860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:23:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 20:23:51,637][15401] Updated weights for policy 0, policy_version 283700 (0.0034) [2024-06-22 20:23:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4648189952. Throughput: 0: 42733.3. Samples: 4648284320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:23:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 20:23:54,973][15401] Updated weights for policy 0, policy_version 283710 (0.0038) [2024-06-22 20:23:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4648435712. Throughput: 0: 42995.3. Samples: 4648545000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:23:58,392][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 20:23:59,031][15401] Updated weights for policy 0, policy_version 283720 (0.0029) [2024-06-22 20:24:02,599][15401] Updated weights for policy 0, policy_version 283730 (0.0028) [2024-06-22 20:24:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43144.6, 300 sec: 42875.8). Total num frames: 4648648704. Throughput: 0: 43028.4. Samples: 4648800520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:03,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 20:24:06,478][15401] Updated weights for policy 0, policy_version 283740 (0.0042) [2024-06-22 20:24:08,394][15132] Fps is (10 sec: 40943.3, 60 sec: 43143.3, 300 sec: 42819.9). Total num frames: 4648845312. Throughput: 0: 43106.2. Samples: 4648935920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:08,394][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:24:10,192][15401] Updated weights for policy 0, policy_version 283750 (0.0025) [2024-06-22 20:24:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 4649058304. Throughput: 0: 43047.1. Samples: 4649195380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 20:24:13,974][15401] Updated weights for policy 0, policy_version 283760 (0.0029) [2024-06-22 20:24:17,783][15401] Updated weights for policy 0, policy_version 283770 (0.0035) [2024-06-22 20:24:18,390][15132] Fps is (10 sec: 45894.0, 60 sec: 43146.3, 300 sec: 42876.8). Total num frames: 4649304064. Throughput: 0: 43134.7. Samples: 4649450080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 20:24:22,193][15401] Updated weights for policy 0, policy_version 283780 (0.0043) [2024-06-22 20:24:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42821.2). Total num frames: 4649484288. Throughput: 0: 43129.7. Samples: 4649582760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 20:24:23,642][15349] Signal inference workers to stop experience collection... (68750 times) [2024-06-22 20:24:23,697][15401] InferenceWorker_p0-w0: stopping experience collection (68750 times) [2024-06-22 20:24:23,704][15349] Signal inference workers to resume experience collection... (68750 times) [2024-06-22 20:24:23,714][15401] InferenceWorker_p0-w0: resuming experience collection (68750 times) [2024-06-22 20:24:25,362][15401] Updated weights for policy 0, policy_version 283790 (0.0033) [2024-06-22 20:24:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 4649697280. Throughput: 0: 43148.1. Samples: 4649834660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 20:24:29,744][15401] Updated weights for policy 0, policy_version 283800 (0.0032) [2024-06-22 20:24:33,166][15401] Updated weights for policy 0, policy_version 283810 (0.0030) [2024-06-22 20:24:33,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4649959424. Throughput: 0: 43218.7. Samples: 4650092700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 20:24:37,374][15401] Updated weights for policy 0, policy_version 283820 (0.0039) [2024-06-22 20:24:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4650123264. Throughput: 0: 43049.5. Samples: 4650221540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:38,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 20:24:40,949][15401] Updated weights for policy 0, policy_version 283830 (0.0037) [2024-06-22 20:24:43,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4650352640. Throughput: 0: 42832.8. Samples: 4650472580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:43,393][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 20:24:45,028][15401] Updated weights for policy 0, policy_version 283840 (0.0044) [2024-06-22 20:24:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4650565632. Throughput: 0: 42922.7. Samples: 4650731940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 20:24:48,772][15401] Updated weights for policy 0, policy_version 283850 (0.0047) [2024-06-22 20:24:52,547][15401] Updated weights for policy 0, policy_version 283860 (0.0035) [2024-06-22 20:24:53,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4650762240. Throughput: 0: 42774.7. Samples: 4650860600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 20:24:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 20:24:56,306][15401] Updated weights for policy 0, policy_version 283870 (0.0031) [2024-06-22 20:24:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4651008000. Throughput: 0: 42580.3. Samples: 4651111500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:24:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 20:25:00,149][15401] Updated weights for policy 0, policy_version 283880 (0.0051) [2024-06-22 20:25:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 4651188224. Throughput: 0: 42824.9. Samples: 4651377200. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 20:25:03,940][15401] Updated weights for policy 0, policy_version 283890 (0.0028) [2024-06-22 20:25:08,274][15401] Updated weights for policy 0, policy_version 283900 (0.0028) [2024-06-22 20:25:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42874.4, 300 sec: 42876.1). Total num frames: 4651417600. Throughput: 0: 42661.4. Samples: 4651502520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 20:25:11,886][15401] Updated weights for policy 0, policy_version 283910 (0.0032) [2024-06-22 20:25:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 4651646976. Throughput: 0: 42633.7. Samples: 4651753280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:13,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 20:25:15,849][15401] Updated weights for policy 0, policy_version 283920 (0.0041) [2024-06-22 20:25:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4651843584. Throughput: 0: 42769.1. Samples: 4652017320. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 20:25:19,528][15401] Updated weights for policy 0, policy_version 283930 (0.0038) [2024-06-22 20:25:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4652056576. Throughput: 0: 42586.6. Samples: 4652137940. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 20:25:23,507][15401] Updated weights for policy 0, policy_version 283940 (0.0036) [2024-06-22 20:25:27,191][15401] Updated weights for policy 0, policy_version 283950 (0.0039) [2024-06-22 20:25:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4652285952. Throughput: 0: 42635.6. Samples: 4652391080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 20:25:31,126][15401] Updated weights for policy 0, policy_version 283960 (0.0042) [2024-06-22 20:25:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 4652466176. Throughput: 0: 42725.8. Samples: 4652654600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:25:34,819][15401] Updated weights for policy 0, policy_version 283970 (0.0025) [2024-06-22 20:25:35,596][15349] Signal inference workers to stop experience collection... (68800 times) [2024-06-22 20:25:35,596][15349] Signal inference workers to resume experience collection... (68800 times) [2024-06-22 20:25:35,617][15401] InferenceWorker_p0-w0: stopping experience collection (68800 times) [2024-06-22 20:25:35,618][15401] InferenceWorker_p0-w0: resuming experience collection (68800 times) [2024-06-22 20:25:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4652695552. Throughput: 0: 42549.6. Samples: 4652775340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 20:25:38,855][15401] Updated weights for policy 0, policy_version 283980 (0.0027) [2024-06-22 20:25:42,343][15401] Updated weights for policy 0, policy_version 283990 (0.0048) [2024-06-22 20:25:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 4652941312. Throughput: 0: 42661.8. Samples: 4653031280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 20:25:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283993_4652941312.pth... [2024-06-22 20:25:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283365_4642652160.pth [2024-06-22 20:25:46,832][15401] Updated weights for policy 0, policy_version 284000 (0.0033) [2024-06-22 20:25:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4653105152. Throughput: 0: 42491.9. Samples: 4653289340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 20:25:49,926][15401] Updated weights for policy 0, policy_version 284010 (0.0025) [2024-06-22 20:25:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.2, 300 sec: 42765.3). Total num frames: 4653318144. Throughput: 0: 42431.9. Samples: 4653411960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 20:25:54,176][15401] Updated weights for policy 0, policy_version 284020 (0.0038) [2024-06-22 20:25:57,592][15401] Updated weights for policy 0, policy_version 284030 (0.0026) [2024-06-22 20:25:58,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 4653580288. Throughput: 0: 42819.2. Samples: 4653680040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:25:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 20:26:01,661][15401] Updated weights for policy 0, policy_version 284040 (0.0046) [2024-06-22 20:26:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4653744128. Throughput: 0: 42673.5. Samples: 4653937620. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:26:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 20:26:05,400][15401] Updated weights for policy 0, policy_version 284050 (0.0035) [2024-06-22 20:26:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4653973504. Throughput: 0: 42663.1. Samples: 4654057780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-22 20:26:08,399][15132] Avg episode reward: [(0, '0.823')] [2024-06-22 20:26:09,485][15401] Updated weights for policy 0, policy_version 284060 (0.0040) [2024-06-22 20:26:12,996][15401] Updated weights for policy 0, policy_version 284070 (0.0043) [2024-06-22 20:26:13,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 4654219264. Throughput: 0: 42907.6. Samples: 4654321920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 20:26:16,866][15401] Updated weights for policy 0, policy_version 284080 (0.0032) [2024-06-22 20:26:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4654399488. Throughput: 0: 42781.3. Samples: 4654579760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 20:26:20,482][15401] Updated weights for policy 0, policy_version 284090 (0.0041) [2024-06-22 20:26:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 4654612480. Throughput: 0: 42826.1. Samples: 4654702520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 20:26:24,671][15401] Updated weights for policy 0, policy_version 284100 (0.0042) [2024-06-22 20:26:28,055][15401] Updated weights for policy 0, policy_version 284110 (0.0036) [2024-06-22 20:26:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4654858240. Throughput: 0: 43028.9. Samples: 4654967580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:28,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 20:26:32,289][15401] Updated weights for policy 0, policy_version 284120 (0.0028) [2024-06-22 20:26:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4655038464. Throughput: 0: 43007.1. Samples: 4655224660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 20:26:35,726][15401] Updated weights for policy 0, policy_version 284130 (0.0037) [2024-06-22 20:26:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4655267840. Throughput: 0: 43062.3. Samples: 4655349760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 20:26:39,643][15401] Updated weights for policy 0, policy_version 284140 (0.0033) [2024-06-22 20:26:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4655497216. Throughput: 0: 42946.6. Samples: 4655612640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 20:26:43,441][15401] Updated weights for policy 0, policy_version 284150 (0.0043) [2024-06-22 20:26:46,748][15349] Signal inference workers to stop experience collection... (68850 times) [2024-06-22 20:26:46,749][15349] Signal inference workers to resume experience collection... (68850 times) [2024-06-22 20:26:46,775][15401] InferenceWorker_p0-w0: stopping experience collection (68850 times) [2024-06-22 20:26:46,775][15401] InferenceWorker_p0-w0: resuming experience collection (68850 times) [2024-06-22 20:26:47,236][15401] Updated weights for policy 0, policy_version 284160 (0.0037) [2024-06-22 20:26:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4655677440. Throughput: 0: 42820.0. Samples: 4655864520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 20:26:51,327][15401] Updated weights for policy 0, policy_version 284170 (0.0046) [2024-06-22 20:26:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42877.0). Total num frames: 4655923200. Throughput: 0: 42939.2. Samples: 4655990040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 20:26:54,885][15401] Updated weights for policy 0, policy_version 284180 (0.0035) [2024-06-22 20:26:58,396][15132] Fps is (10 sec: 44207.9, 60 sec: 42320.8, 300 sec: 42709.5). Total num frames: 4656119808. Throughput: 0: 42939.2. Samples: 4656254460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:26:58,397][15132] Avg episode reward: [(0, '0.175')] [2024-06-22 20:26:58,855][15401] Updated weights for policy 0, policy_version 284190 (0.0033) [2024-06-22 20:27:03,007][15401] Updated weights for policy 0, policy_version 284200 (0.0049) [2024-06-22 20:27:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4656332800. Throughput: 0: 42746.6. Samples: 4656503360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:27:03,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 20:27:06,538][15401] Updated weights for policy 0, policy_version 284210 (0.0029) [2024-06-22 20:27:08,389][15132] Fps is (10 sec: 44265.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4656562176. Throughput: 0: 42877.9. Samples: 4656632020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:27:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 20:27:10,646][15401] Updated weights for policy 0, policy_version 284220 (0.0042) [2024-06-22 20:27:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4656742400. Throughput: 0: 42763.9. Samples: 4656891960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:27:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 20:27:14,519][15401] Updated weights for policy 0, policy_version 284230 (0.0033) [2024-06-22 20:27:18,285][15401] Updated weights for policy 0, policy_version 284240 (0.0039) [2024-06-22 20:27:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4656988160. Throughput: 0: 42620.7. Samples: 4657142580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-22 20:27:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 20:27:22,252][15401] Updated weights for policy 0, policy_version 284250 (0.0035) [2024-06-22 20:27:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4657201152. Throughput: 0: 42747.8. Samples: 4657273420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 20:27:25,898][15401] Updated weights for policy 0, policy_version 284260 (0.0038) [2024-06-22 20:27:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 4657381376. Throughput: 0: 42650.2. Samples: 4657531900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 20:27:29,715][15401] Updated weights for policy 0, policy_version 284270 (0.0033) [2024-06-22 20:27:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4657627136. Throughput: 0: 42630.6. Samples: 4657782900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 20:27:33,472][15401] Updated weights for policy 0, policy_version 284280 (0.0051) [2024-06-22 20:27:37,293][15401] Updated weights for policy 0, policy_version 284290 (0.0035) [2024-06-22 20:27:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4657823744. Throughput: 0: 42777.8. Samples: 4657915040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 20:27:41,134][15401] Updated weights for policy 0, policy_version 284300 (0.0030) [2024-06-22 20:27:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4658020352. Throughput: 0: 42637.7. Samples: 4658172880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 20:27:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284303_4658020352.pth... [2024-06-22 20:27:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283678_4647780352.pth [2024-06-22 20:27:44,962][15401] Updated weights for policy 0, policy_version 284310 (0.0038) [2024-06-22 20:27:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4658266112. Throughput: 0: 42535.5. Samples: 4658417460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 20:27:48,939][15401] Updated weights for policy 0, policy_version 284320 (0.0028) [2024-06-22 20:27:52,730][15401] Updated weights for policy 0, policy_version 284330 (0.0035) [2024-06-22 20:27:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 4658479104. Throughput: 0: 42811.8. Samples: 4658558560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 20:27:56,427][15401] Updated weights for policy 0, policy_version 284340 (0.0036) [2024-06-22 20:27:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42329.9, 300 sec: 42709.8). Total num frames: 4658659328. Throughput: 0: 42672.6. Samples: 4658812220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:27:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 20:28:00,430][15401] Updated weights for policy 0, policy_version 284350 (0.0054) [2024-06-22 20:28:01,772][15349] Signal inference workers to stop experience collection... (68900 times) [2024-06-22 20:28:01,772][15349] Signal inference workers to resume experience collection... (68900 times) [2024-06-22 20:28:01,798][15401] InferenceWorker_p0-w0: stopping experience collection (68900 times) [2024-06-22 20:28:01,799][15401] InferenceWorker_p0-w0: resuming experience collection (68900 times) [2024-06-22 20:28:03,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42870.5, 300 sec: 42876.2). Total num frames: 4658905088. Throughput: 0: 42566.1. Samples: 4659058120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:03,391][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 20:28:04,245][15401] Updated weights for policy 0, policy_version 284360 (0.0033) [2024-06-22 20:28:08,288][15401] Updated weights for policy 0, policy_version 284370 (0.0036) [2024-06-22 20:28:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4659118080. Throughput: 0: 42801.8. Samples: 4659199500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 20:28:12,438][15401] Updated weights for policy 0, policy_version 284380 (0.0047) [2024-06-22 20:28:13,390][15132] Fps is (10 sec: 40965.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4659314688. Throughput: 0: 42620.0. Samples: 4659449800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:13,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-22 20:28:15,910][15401] Updated weights for policy 0, policy_version 284390 (0.0034) [2024-06-22 20:28:18,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 4659544064. Throughput: 0: 42722.8. Samples: 4659705700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:18,396][15132] Avg episode reward: [(0, '0.879')] [2024-06-22 20:28:19,945][15401] Updated weights for policy 0, policy_version 284400 (0.0029) [2024-06-22 20:28:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4659740672. Throughput: 0: 42856.0. Samples: 4659843560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:23,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-22 20:28:23,647][15401] Updated weights for policy 0, policy_version 284410 (0.0033) [2024-06-22 20:28:27,486][15401] Updated weights for policy 0, policy_version 284420 (0.0031) [2024-06-22 20:28:28,396][15132] Fps is (10 sec: 40960.0, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 4659953664. Throughput: 0: 42628.6. Samples: 4660091440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:28,397][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 20:28:31,347][15401] Updated weights for policy 0, policy_version 284430 (0.0040) [2024-06-22 20:28:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4660199424. Throughput: 0: 42784.9. Samples: 4660342780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-22 20:28:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 20:28:35,032][15401] Updated weights for policy 0, policy_version 284440 (0.0031) [2024-06-22 20:28:38,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4660379648. Throughput: 0: 42665.6. Samples: 4660478500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:28:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 20:28:39,037][15401] Updated weights for policy 0, policy_version 284450 (0.0037) [2024-06-22 20:28:42,681][15401] Updated weights for policy 0, policy_version 284460 (0.0033) [2024-06-22 20:28:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4660592640. Throughput: 0: 42680.5. Samples: 4660732840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:28:43,396][15132] Avg episode reward: [(0, '0.258')] [2024-06-22 20:28:46,474][15401] Updated weights for policy 0, policy_version 284470 (0.0035) [2024-06-22 20:28:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4660838400. Throughput: 0: 42759.9. Samples: 4660982260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:28:48,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 20:28:50,402][15401] Updated weights for policy 0, policy_version 284480 (0.0040) [2024-06-22 20:28:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.9, 300 sec: 42709.1). Total num frames: 4661035008. Throughput: 0: 42766.8. Samples: 4661124100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:28:53,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 20:28:53,936][15401] Updated weights for policy 0, policy_version 284490 (0.0035) [2024-06-22 20:28:58,333][15401] Updated weights for policy 0, policy_version 284500 (0.0037) [2024-06-22 20:28:58,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.8, 300 sec: 42709.5). Total num frames: 4661248000. Throughput: 0: 42838.6. Samples: 4661377640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:28:58,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 20:29:01,391][15401] Updated weights for policy 0, policy_version 284510 (0.0041) [2024-06-22 20:29:03,390][15132] Fps is (10 sec: 45885.7, 60 sec: 43145.5, 300 sec: 42876.7). Total num frames: 4661493760. Throughput: 0: 42832.7. Samples: 4661632900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:03,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 20:29:05,754][15401] Updated weights for policy 0, policy_version 284520 (0.0027) [2024-06-22 20:29:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 4661690368. Throughput: 0: 42832.4. Samples: 4661771020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 20:29:09,199][15401] Updated weights for policy 0, policy_version 284530 (0.0029) [2024-06-22 20:29:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4661886976. Throughput: 0: 42927.9. Samples: 4662022920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:13,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-22 20:29:13,491][15401] Updated weights for policy 0, policy_version 284540 (0.0029) [2024-06-22 20:29:16,741][15401] Updated weights for policy 0, policy_version 284550 (0.0031) [2024-06-22 20:29:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 4662116352. Throughput: 0: 43066.8. Samples: 4662280780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 20:29:21,111][15401] Updated weights for policy 0, policy_version 284560 (0.0033) [2024-06-22 20:29:23,391][15132] Fps is (10 sec: 44228.1, 60 sec: 43143.1, 300 sec: 42820.3). Total num frames: 4662329344. Throughput: 0: 43086.5. Samples: 4662417480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:23,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 20:29:24,323][15401] Updated weights for policy 0, policy_version 284570 (0.0037) [2024-06-22 20:29:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43149.1, 300 sec: 42653.9). Total num frames: 4662542336. Throughput: 0: 42986.2. Samples: 4662667220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 20:29:28,866][15401] Updated weights for policy 0, policy_version 284580 (0.0050) [2024-06-22 20:29:30,675][15349] Signal inference workers to stop experience collection... (68950 times) [2024-06-22 20:29:30,706][15401] InferenceWorker_p0-w0: stopping experience collection (68950 times) [2024-06-22 20:29:30,790][15349] Signal inference workers to resume experience collection... (68950 times) [2024-06-22 20:29:30,790][15401] InferenceWorker_p0-w0: resuming experience collection (68950 times) [2024-06-22 20:29:31,954][15401] Updated weights for policy 0, policy_version 284590 (0.0034) [2024-06-22 20:29:33,390][15132] Fps is (10 sec: 42606.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4662755328. Throughput: 0: 43132.4. Samples: 4662923220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 20:29:36,345][15401] Updated weights for policy 0, policy_version 284600 (0.0036) [2024-06-22 20:29:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 4662968320. Throughput: 0: 42870.6. Samples: 4663053180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 20:29:39,602][15401] Updated weights for policy 0, policy_version 284610 (0.0038) [2024-06-22 20:29:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4663181312. Throughput: 0: 42811.2. Samples: 4663304040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-22 20:29:43,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 20:29:43,520][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284619_4663197696.pth... [2024-06-22 20:29:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000283993_4652941312.pth [2024-06-22 20:29:43,956][15401] Updated weights for policy 0, policy_version 284620 (0.0030) [2024-06-22 20:29:47,209][15401] Updated weights for policy 0, policy_version 284630 (0.0032) [2024-06-22 20:29:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4663410688. Throughput: 0: 42934.3. Samples: 4663564940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:29:48,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 20:29:51,503][15401] Updated weights for policy 0, policy_version 284640 (0.0033) [2024-06-22 20:29:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42871.4, 300 sec: 42709.1). Total num frames: 4663607296. Throughput: 0: 42787.9. Samples: 4663696580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:29:53,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 20:29:54,750][15401] Updated weights for policy 0, policy_version 284650 (0.0028) [2024-06-22 20:29:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 4663820288. Throughput: 0: 42910.3. Samples: 4663953880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:29:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 20:29:58,906][15401] Updated weights for policy 0, policy_version 284660 (0.0027) [2024-06-22 20:30:02,307][15401] Updated weights for policy 0, policy_version 284670 (0.0026) [2024-06-22 20:30:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4664049664. Throughput: 0: 42863.2. Samples: 4664209620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 20:30:06,980][15401] Updated weights for policy 0, policy_version 284680 (0.0035) [2024-06-22 20:30:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 4664262656. Throughput: 0: 42769.8. Samples: 4664342040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 20:30:10,178][15401] Updated weights for policy 0, policy_version 284690 (0.0033) [2024-06-22 20:30:13,391][15132] Fps is (10 sec: 42591.6, 60 sec: 43143.4, 300 sec: 42820.3). Total num frames: 4664475648. Throughput: 0: 42951.4. Samples: 4664600100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:13,391][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 20:30:14,611][15401] Updated weights for policy 0, policy_version 284700 (0.0032) [2024-06-22 20:30:17,876][15401] Updated weights for policy 0, policy_version 284710 (0.0038) [2024-06-22 20:30:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4664688640. Throughput: 0: 42867.3. Samples: 4664852240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 20:30:22,168][15401] Updated weights for policy 0, policy_version 284720 (0.0036) [2024-06-22 20:30:23,390][15132] Fps is (10 sec: 44243.5, 60 sec: 43145.9, 300 sec: 42820.6). Total num frames: 4664918016. Throughput: 0: 42931.2. Samples: 4664985080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 20:30:25,470][15401] Updated weights for policy 0, policy_version 284730 (0.0042) [2024-06-22 20:30:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 4665114624. Throughput: 0: 43070.2. Samples: 4665242300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:28,392][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 20:30:29,841][15401] Updated weights for policy 0, policy_version 284740 (0.0038) [2024-06-22 20:30:33,004][15401] Updated weights for policy 0, policy_version 284750 (0.0035) [2024-06-22 20:30:33,392][15132] Fps is (10 sec: 42589.2, 60 sec: 43143.1, 300 sec: 42875.8). Total num frames: 4665344000. Throughput: 0: 42895.6. Samples: 4665495340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:33,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 20:30:37,425][15401] Updated weights for policy 0, policy_version 284760 (0.0033) [2024-06-22 20:30:38,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 4665556992. Throughput: 0: 43015.6. Samples: 4665632280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:38,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 20:30:38,613][15349] Signal inference workers to stop experience collection... (69000 times) [2024-06-22 20:30:38,614][15349] Signal inference workers to resume experience collection... (69000 times) [2024-06-22 20:30:38,668][15401] InferenceWorker_p0-w0: stopping experience collection (69000 times) [2024-06-22 20:30:38,668][15401] InferenceWorker_p0-w0: resuming experience collection (69000 times) [2024-06-22 20:30:40,604][15401] Updated weights for policy 0, policy_version 284770 (0.0037) [2024-06-22 20:30:43,389][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 4665769984. Throughput: 0: 43039.1. Samples: 4665890640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:43,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 20:30:45,059][15401] Updated weights for policy 0, policy_version 284780 (0.0029) [2024-06-22 20:30:48,275][15401] Updated weights for policy 0, policy_version 284790 (0.0037) [2024-06-22 20:30:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4665999360. Throughput: 0: 42868.8. Samples: 4666138720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:48,392][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 20:30:52,766][15401] Updated weights for policy 0, policy_version 284800 (0.0026) [2024-06-22 20:30:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4666179584. Throughput: 0: 42944.5. Samples: 4666274540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:53,396][15132] Avg episode reward: [(0, '0.303')] [2024-06-22 20:30:55,946][15401] Updated weights for policy 0, policy_version 284810 (0.0024) [2024-06-22 20:30:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4666392576. Throughput: 0: 42853.5. Samples: 4666528440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 20:30:58,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 20:31:00,315][15401] Updated weights for policy 0, policy_version 284820 (0.0042) [2024-06-22 20:31:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4666638336. Throughput: 0: 42841.3. Samples: 4666780100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 20:31:03,525][15401] Updated weights for policy 0, policy_version 284830 (0.0033) [2024-06-22 20:31:08,026][15401] Updated weights for policy 0, policy_version 284840 (0.0036) [2024-06-22 20:31:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4666818560. Throughput: 0: 42911.2. Samples: 4666916080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 20:31:11,203][15401] Updated weights for policy 0, policy_version 284850 (0.0039) [2024-06-22 20:31:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42326.4, 300 sec: 42765.0). Total num frames: 4667015168. Throughput: 0: 42769.4. Samples: 4667166820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 20:31:15,752][15401] Updated weights for policy 0, policy_version 284860 (0.0031) [2024-06-22 20:31:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4667277312. Throughput: 0: 42921.2. Samples: 4667426700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-22 20:31:19,037][15401] Updated weights for policy 0, policy_version 284870 (0.0036) [2024-06-22 20:31:23,326][15401] Updated weights for policy 0, policy_version 284880 (0.0033) [2024-06-22 20:31:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4667473920. Throughput: 0: 42882.2. Samples: 4667561880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 20:31:26,716][15401] Updated weights for policy 0, policy_version 284890 (0.0034) [2024-06-22 20:31:28,395][15132] Fps is (10 sec: 40939.3, 60 sec: 42869.5, 300 sec: 42875.4). Total num frames: 4667686912. Throughput: 0: 42794.6. Samples: 4667816620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:28,395][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 20:31:30,728][15401] Updated weights for policy 0, policy_version 284900 (0.0031) [2024-06-22 20:31:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42598.2, 300 sec: 42820.2). Total num frames: 4667899904. Throughput: 0: 43083.9. Samples: 4668077600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:33,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:31:34,359][15401] Updated weights for policy 0, policy_version 284910 (0.0038) [2024-06-22 20:31:38,282][15401] Updated weights for policy 0, policy_version 284920 (0.0035) [2024-06-22 20:31:38,389][15132] Fps is (10 sec: 44259.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4668129280. Throughput: 0: 42946.2. Samples: 4668207120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 20:31:42,152][15401] Updated weights for policy 0, policy_version 284930 (0.0043) [2024-06-22 20:31:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4668342272. Throughput: 0: 42968.0. Samples: 4668462000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 20:31:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284933_4668342272.pth... [2024-06-22 20:31:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284303_4658020352.pth [2024-06-22 20:31:45,872][15401] Updated weights for policy 0, policy_version 284940 (0.0042) [2024-06-22 20:31:48,394][15132] Fps is (10 sec: 42576.9, 60 sec: 42594.9, 300 sec: 42819.8). Total num frames: 4668555264. Throughput: 0: 43024.1. Samples: 4668716400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:48,395][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 20:31:49,703][15401] Updated weights for policy 0, policy_version 284950 (0.0031) [2024-06-22 20:31:53,334][15401] Updated weights for policy 0, policy_version 284960 (0.0038) [2024-06-22 20:31:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42932.6). Total num frames: 4668784640. Throughput: 0: 42883.0. Samples: 4668845820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 20:31:57,358][15401] Updated weights for policy 0, policy_version 284970 (0.0039) [2024-06-22 20:31:58,390][15132] Fps is (10 sec: 40980.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4668964864. Throughput: 0: 43119.5. Samples: 4669107200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:31:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 20:32:01,359][15401] Updated weights for policy 0, policy_version 284980 (0.0027) [2024-06-22 20:32:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4669194240. Throughput: 0: 42822.6. Samples: 4669353720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:32:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 20:32:05,123][15401] Updated weights for policy 0, policy_version 284990 (0.0028) [2024-06-22 20:32:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4669407232. Throughput: 0: 42850.2. Samples: 4669490140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:32:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 20:32:08,869][15401] Updated weights for policy 0, policy_version 285000 (0.0026) [2024-06-22 20:32:12,582][15401] Updated weights for policy 0, policy_version 285010 (0.0030) [2024-06-22 20:32:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 4669620224. Throughput: 0: 42911.9. Samples: 4669747440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:32:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 20:32:16,099][15349] Signal inference workers to stop experience collection... (69050 times) [2024-06-22 20:32:16,160][15401] InferenceWorker_p0-w0: stopping experience collection (69050 times) [2024-06-22 20:32:16,221][15349] Signal inference workers to resume experience collection... (69050 times) [2024-06-22 20:32:16,221][15401] InferenceWorker_p0-w0: resuming experience collection (69050 times) [2024-06-22 20:32:16,355][15401] Updated weights for policy 0, policy_version 285020 (0.0025) [2024-06-22 20:32:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4669849600. Throughput: 0: 42816.6. Samples: 4670004240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 20:32:20,109][15401] Updated weights for policy 0, policy_version 285030 (0.0042) [2024-06-22 20:32:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4670029824. Throughput: 0: 42774.2. Samples: 4670131960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 20:32:24,067][15401] Updated weights for policy 0, policy_version 285040 (0.0033) [2024-06-22 20:32:27,969][15401] Updated weights for policy 0, policy_version 285050 (0.0041) [2024-06-22 20:32:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43148.1, 300 sec: 42876.1). Total num frames: 4670275584. Throughput: 0: 42847.1. Samples: 4670390120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-22 20:32:31,561][15401] Updated weights for policy 0, policy_version 285060 (0.0035) [2024-06-22 20:32:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42871.5, 300 sec: 42875.7). Total num frames: 4670472192. Throughput: 0: 42982.5. Samples: 4670650500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:33,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 20:32:35,546][15401] Updated weights for policy 0, policy_version 285070 (0.0043) [2024-06-22 20:32:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 4670685184. Throughput: 0: 42862.6. Samples: 4670774740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:38,393][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 20:32:39,142][15401] Updated weights for policy 0, policy_version 285080 (0.0022) [2024-06-22 20:32:43,059][15401] Updated weights for policy 0, policy_version 285090 (0.0028) [2024-06-22 20:32:43,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4670930944. Throughput: 0: 42860.5. Samples: 4671035920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 20:32:46,890][15401] Updated weights for policy 0, policy_version 285100 (0.0033) [2024-06-22 20:32:48,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42875.0, 300 sec: 42876.1). Total num frames: 4671127552. Throughput: 0: 43173.3. Samples: 4671296520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:48,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 20:32:50,609][15401] Updated weights for policy 0, policy_version 285110 (0.0038) [2024-06-22 20:32:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 4671324160. Throughput: 0: 42857.2. Samples: 4671418720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 20:32:54,515][15401] Updated weights for policy 0, policy_version 285120 (0.0043) [2024-06-22 20:32:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42876.3). Total num frames: 4671553536. Throughput: 0: 42845.4. Samples: 4671675480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:32:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 20:32:58,447][15401] Updated weights for policy 0, policy_version 285130 (0.0040) [2024-06-22 20:33:02,303][15401] Updated weights for policy 0, policy_version 285140 (0.0037) [2024-06-22 20:33:03,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4671782912. Throughput: 0: 42845.7. Samples: 4671932300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:33:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 20:33:05,862][15401] Updated weights for policy 0, policy_version 285150 (0.0040) [2024-06-22 20:33:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4671963136. Throughput: 0: 42801.7. Samples: 4672058040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:33:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 20:33:10,201][15401] Updated weights for policy 0, policy_version 285160 (0.0041) [2024-06-22 20:33:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 4672208896. Throughput: 0: 42896.5. Samples: 4672320460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:33:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 20:33:13,487][15401] Updated weights for policy 0, policy_version 285170 (0.0027) [2024-06-22 20:33:17,785][15401] Updated weights for policy 0, policy_version 285180 (0.0033) [2024-06-22 20:33:18,396][15132] Fps is (10 sec: 45846.0, 60 sec: 42866.8, 300 sec: 42986.2). Total num frames: 4672421888. Throughput: 0: 42710.9. Samples: 4672572660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:33:18,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 20:33:21,100][15401] Updated weights for policy 0, policy_version 285190 (0.0039) [2024-06-22 20:33:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 4672618496. Throughput: 0: 42778.7. Samples: 4672699680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 20:33:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 20:33:25,426][15401] Updated weights for policy 0, policy_version 285200 (0.0028) [2024-06-22 20:33:28,129][15349] Signal inference workers to stop experience collection... (69100 times) [2024-06-22 20:33:28,155][15401] InferenceWorker_p0-w0: stopping experience collection (69100 times) [2024-06-22 20:33:28,186][15349] Signal inference workers to resume experience collection... (69100 times) [2024-06-22 20:33:28,187][15401] InferenceWorker_p0-w0: resuming experience collection (69100 times) [2024-06-22 20:33:28,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4672847872. Throughput: 0: 42666.3. Samples: 4672955900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 20:33:29,424][15401] Updated weights for policy 0, policy_version 285210 (0.0033) [2024-06-22 20:33:33,219][15401] Updated weights for policy 0, policy_version 285220 (0.0025) [2024-06-22 20:33:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 4673060864. Throughput: 0: 42646.3. Samples: 4673215600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 20:33:37,086][15401] Updated weights for policy 0, policy_version 285230 (0.0035) [2024-06-22 20:33:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 4673257472. Throughput: 0: 42686.8. Samples: 4673339620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:38,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 20:33:40,958][15401] Updated weights for policy 0, policy_version 285240 (0.0035) [2024-06-22 20:33:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4673486848. Throughput: 0: 42668.0. Samples: 4673595540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 20:33:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285248_4673503232.pth... [2024-06-22 20:33:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284619_4663197696.pth [2024-06-22 20:33:44,708][15401] Updated weights for policy 0, policy_version 285250 (0.0048) [2024-06-22 20:33:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 4673683456. Throughput: 0: 42624.5. Samples: 4673850400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 20:33:48,587][15401] Updated weights for policy 0, policy_version 285260 (0.0033) [2024-06-22 20:33:52,135][15401] Updated weights for policy 0, policy_version 285270 (0.0024) [2024-06-22 20:33:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 4673896448. Throughput: 0: 42598.6. Samples: 4673974980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 20:33:56,246][15401] Updated weights for policy 0, policy_version 285280 (0.0026) [2024-06-22 20:33:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4674125824. Throughput: 0: 42504.0. Samples: 4674233140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:33:58,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-22 20:33:59,699][15401] Updated weights for policy 0, policy_version 285290 (0.0025) [2024-06-22 20:34:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 4674306048. Throughput: 0: 42679.7. Samples: 4674492980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:03,390][15132] Avg episode reward: [(0, '0.151')] [2024-06-22 20:34:03,762][15401] Updated weights for policy 0, policy_version 285300 (0.0028) [2024-06-22 20:34:07,701][15401] Updated weights for policy 0, policy_version 285310 (0.0038) [2024-06-22 20:34:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4674535424. Throughput: 0: 42560.5. Samples: 4674614900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 20:34:11,878][15401] Updated weights for policy 0, policy_version 285320 (0.0030) [2024-06-22 20:34:13,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4674764800. Throughput: 0: 42667.9. Samples: 4674875960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 20:34:15,208][15401] Updated weights for policy 0, policy_version 285330 (0.0026) [2024-06-22 20:34:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42056.7, 300 sec: 42765.3). Total num frames: 4674945024. Throughput: 0: 42500.4. Samples: 4675128120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:18,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 20:34:19,862][15401] Updated weights for policy 0, policy_version 285340 (0.0027) [2024-06-22 20:34:22,825][15401] Updated weights for policy 0, policy_version 285350 (0.0030) [2024-06-22 20:34:23,391][15132] Fps is (10 sec: 44229.3, 60 sec: 43143.3, 300 sec: 42931.4). Total num frames: 4675207168. Throughput: 0: 42506.4. Samples: 4675252480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:23,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 20:34:27,546][15401] Updated weights for policy 0, policy_version 285360 (0.0046) [2024-06-22 20:34:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 4675387392. Throughput: 0: 42786.7. Samples: 4675520940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 20:34:30,365][15401] Updated weights for policy 0, policy_version 285370 (0.0041) [2024-06-22 20:34:33,389][15132] Fps is (10 sec: 40967.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4675616768. Throughput: 0: 42740.0. Samples: 4675773700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:33,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 20:34:35,070][15401] Updated weights for policy 0, policy_version 285380 (0.0033) [2024-06-22 20:34:38,101][15401] Updated weights for policy 0, policy_version 285390 (0.0032) [2024-06-22 20:34:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4675846144. Throughput: 0: 42900.1. Samples: 4675905480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:34:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:34:42,580][15401] Updated weights for policy 0, policy_version 285400 (0.0039) [2024-06-22 20:34:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4676009984. Throughput: 0: 42989.3. Samples: 4676167660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:34:43,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 20:34:45,627][15401] Updated weights for policy 0, policy_version 285410 (0.0038) [2024-06-22 20:34:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 4676255744. Throughput: 0: 42758.4. Samples: 4676417100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:34:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 20:34:50,138][15401] Updated weights for policy 0, policy_version 285420 (0.0029) [2024-06-22 20:34:53,093][15401] Updated weights for policy 0, policy_version 285430 (0.0041) [2024-06-22 20:34:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4676485120. Throughput: 0: 43138.7. Samples: 4676556140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:34:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 20:34:53,685][15349] Signal inference workers to stop experience collection... (69150 times) [2024-06-22 20:34:53,687][15349] Signal inference workers to resume experience collection... (69150 times) [2024-06-22 20:34:53,705][15401] InferenceWorker_p0-w0: stopping experience collection (69150 times) [2024-06-22 20:34:53,705][15401] InferenceWorker_p0-w0: resuming experience collection (69150 times) [2024-06-22 20:34:58,313][15401] Updated weights for policy 0, policy_version 285440 (0.0034) [2024-06-22 20:34:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4676648960. Throughput: 0: 42883.1. Samples: 4676805700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:34:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 20:35:00,782][15401] Updated weights for policy 0, policy_version 285450 (0.0034) [2024-06-22 20:35:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4676911104. Throughput: 0: 42754.2. Samples: 4677052060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:03,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 20:35:05,982][15401] Updated weights for policy 0, policy_version 285460 (0.0039) [2024-06-22 20:35:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 4677124096. Throughput: 0: 43034.1. Samples: 4677188940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 20:35:08,746][15401] Updated weights for policy 0, policy_version 285470 (0.0027) [2024-06-22 20:35:13,389][15132] Fps is (10 sec: 36045.4, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 4677271552. Throughput: 0: 42624.5. Samples: 4677439040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 20:35:13,713][15401] Updated weights for policy 0, policy_version 285480 (0.0031) [2024-06-22 20:35:16,722][15401] Updated weights for policy 0, policy_version 285490 (0.0035) [2024-06-22 20:35:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4677550080. Throughput: 0: 42525.8. Samples: 4677687360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 20:35:21,485][15401] Updated weights for policy 0, policy_version 285500 (0.0037) [2024-06-22 20:35:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42326.6, 300 sec: 42820.9). Total num frames: 4677746688. Throughput: 0: 42709.8. Samples: 4677827420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 20:35:24,440][15401] Updated weights for policy 0, policy_version 285510 (0.0049) [2024-06-22 20:35:28,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 4677926912. Throughput: 0: 42246.2. Samples: 4678068840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:28,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 20:35:29,082][15401] Updated weights for policy 0, policy_version 285520 (0.0046) [2024-06-22 20:35:32,061][15401] Updated weights for policy 0, policy_version 285530 (0.0030) [2024-06-22 20:35:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 4678172672. Throughput: 0: 42362.9. Samples: 4678323440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:33,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-22 20:35:36,876][15401] Updated weights for policy 0, policy_version 285540 (0.0038) [2024-06-22 20:35:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4678369280. Throughput: 0: 42317.7. Samples: 4678460440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 20:35:39,524][15401] Updated weights for policy 0, policy_version 285550 (0.0030) [2024-06-22 20:35:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4678565888. Throughput: 0: 42182.5. Samples: 4678703920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 20:35:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285557_4678565888.pth... [2024-06-22 20:35:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000284933_4668342272.pth [2024-06-22 20:35:44,625][15401] Updated weights for policy 0, policy_version 285560 (0.0038) [2024-06-22 20:35:47,134][15401] Updated weights for policy 0, policy_version 285570 (0.0051) [2024-06-22 20:35:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4678828032. Throughput: 0: 42235.6. Samples: 4678952660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 20:35:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 20:35:52,814][15401] Updated weights for policy 0, policy_version 285580 (0.0024) [2024-06-22 20:35:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 41506.1, 300 sec: 42653.9). Total num frames: 4678975488. Throughput: 0: 42137.8. Samples: 4679085140. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:35:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 20:35:55,154][15401] Updated weights for policy 0, policy_version 285590 (0.0030) [2024-06-22 20:35:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4679221248. Throughput: 0: 42194.6. Samples: 4679337800. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:35:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 20:36:00,548][15401] Updated weights for policy 0, policy_version 285600 (0.0022) [2024-06-22 20:36:03,024][15401] Updated weights for policy 0, policy_version 285610 (0.0052) [2024-06-22 20:36:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4679450624. Throughput: 0: 42097.7. Samples: 4679581760. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 20:36:04,852][15349] Signal inference workers to stop experience collection... (69200 times) [2024-06-22 20:36:04,861][15349] Signal inference workers to resume experience collection... (69200 times) [2024-06-22 20:36:04,894][15401] InferenceWorker_p0-w0: stopping experience collection (69200 times) [2024-06-22 20:36:04,894][15401] InferenceWorker_p0-w0: resuming experience collection (69200 times) [2024-06-22 20:36:08,098][15401] Updated weights for policy 0, policy_version 285620 (0.0026) [2024-06-22 20:36:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42709.5). Total num frames: 4679614464. Throughput: 0: 42009.3. Samples: 4679717840. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 20:36:10,414][15401] Updated weights for policy 0, policy_version 285630 (0.0039) [2024-06-22 20:36:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 4679860224. Throughput: 0: 42273.9. Samples: 4679971060. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 20:36:15,633][15401] Updated weights for policy 0, policy_version 285640 (0.0028) [2024-06-22 20:36:18,241][15401] Updated weights for policy 0, policy_version 285650 (0.0033) [2024-06-22 20:36:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4680089600. Throughput: 0: 42270.8. Samples: 4680225620. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 20:36:23,030][15401] Updated weights for policy 0, policy_version 285660 (0.0040) [2024-06-22 20:36:23,392][15132] Fps is (10 sec: 40949.0, 60 sec: 42050.4, 300 sec: 42654.3). Total num frames: 4680269824. Throughput: 0: 42200.3. Samples: 4680359560. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:23,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 20:36:26,087][15401] Updated weights for policy 0, policy_version 285670 (0.0039) [2024-06-22 20:36:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 4680499200. Throughput: 0: 42332.2. Samples: 4680608860. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 20:36:30,770][15401] Updated weights for policy 0, policy_version 285680 (0.0031) [2024-06-22 20:36:33,390][15132] Fps is (10 sec: 44248.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4680712192. Throughput: 0: 42524.5. Samples: 4680866260. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 20:36:33,878][15401] Updated weights for policy 0, policy_version 285690 (0.0035) [2024-06-22 20:36:38,386][15401] Updated weights for policy 0, policy_version 285700 (0.0037) [2024-06-22 20:36:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4680908800. Throughput: 0: 42414.6. Samples: 4680993800. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:38,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 20:36:41,653][15401] Updated weights for policy 0, policy_version 285710 (0.0030) [2024-06-22 20:36:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42710.2). Total num frames: 4681154560. Throughput: 0: 42441.3. Samples: 4681247660. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 20:36:46,083][15401] Updated weights for policy 0, policy_version 285720 (0.0030) [2024-06-22 20:36:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 4681334784. Throughput: 0: 42576.6. Samples: 4681497700. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 20:36:49,433][15401] Updated weights for policy 0, policy_version 285730 (0.0031) [2024-06-22 20:36:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4681531392. Throughput: 0: 42250.5. Samples: 4681619120. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 20:36:53,910][15401] Updated weights for policy 0, policy_version 285740 (0.0039) [2024-06-22 20:36:57,367][15401] Updated weights for policy 0, policy_version 285750 (0.0039) [2024-06-22 20:36:58,395][15132] Fps is (10 sec: 44212.3, 60 sec: 42594.6, 300 sec: 42653.2). Total num frames: 4681777152. Throughput: 0: 42344.1. Samples: 4681876780. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:36:58,396][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 20:37:01,688][15401] Updated weights for policy 0, policy_version 285760 (0.0031) [2024-06-22 20:37:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 4681973760. Throughput: 0: 42392.6. Samples: 4682133280. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-22 20:37:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 20:37:05,064][15401] Updated weights for policy 0, policy_version 285770 (0.0042) [2024-06-22 20:37:08,389][15132] Fps is (10 sec: 37704.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4682153984. Throughput: 0: 42213.6. Samples: 4682259060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 20:37:09,102][15401] Updated weights for policy 0, policy_version 285780 (0.0036) [2024-06-22 20:37:12,529][15401] Updated weights for policy 0, policy_version 285790 (0.0035) [2024-06-22 20:37:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4682416128. Throughput: 0: 42465.8. Samples: 4682519820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 20:37:16,761][15349] Signal inference workers to stop experience collection... (69250 times) [2024-06-22 20:37:16,762][15349] Signal inference workers to resume experience collection... (69250 times) [2024-06-22 20:37:16,778][15401] InferenceWorker_p0-w0: stopping experience collection (69250 times) [2024-06-22 20:37:16,804][15401] InferenceWorker_p0-w0: resuming experience collection (69250 times) [2024-06-22 20:37:17,084][15401] Updated weights for policy 0, policy_version 285800 (0.0029) [2024-06-22 20:37:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4682612736. Throughput: 0: 42327.1. Samples: 4682770980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 20:37:20,282][15401] Updated weights for policy 0, policy_version 285810 (0.0043) [2024-06-22 20:37:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42054.1, 300 sec: 42431.8). Total num frames: 4682792960. Throughput: 0: 42151.2. Samples: 4682890600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 20:37:24,773][15401] Updated weights for policy 0, policy_version 285820 (0.0020) [2024-06-22 20:37:28,030][15401] Updated weights for policy 0, policy_version 285830 (0.0036) [2024-06-22 20:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 4683038720. Throughput: 0: 42341.0. Samples: 4683153000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 20:37:32,758][15401] Updated weights for policy 0, policy_version 285840 (0.0033) [2024-06-22 20:37:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 4683251712. Throughput: 0: 42513.2. Samples: 4683410800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:33,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-22 20:37:35,595][15401] Updated weights for policy 0, policy_version 285850 (0.0029) [2024-06-22 20:37:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4683448320. Throughput: 0: 42579.2. Samples: 4683535180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 20:37:40,410][15401] Updated weights for policy 0, policy_version 285860 (0.0039) [2024-06-22 20:37:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 4683677696. Throughput: 0: 42398.5. Samples: 4683784480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 20:37:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285869_4683677696.pth... [2024-06-22 20:37:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285248_4673503232.pth [2024-06-22 20:37:43,763][15401] Updated weights for policy 0, policy_version 285870 (0.0051) [2024-06-22 20:37:48,229][15401] Updated weights for policy 0, policy_version 285880 (0.0028) [2024-06-22 20:37:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4683874304. Throughput: 0: 42580.7. Samples: 4684049420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 20:37:51,475][15401] Updated weights for policy 0, policy_version 285890 (0.0034) [2024-06-22 20:37:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 4684103680. Throughput: 0: 42540.8. Samples: 4684173400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 20:37:55,752][15401] Updated weights for policy 0, policy_version 285900 (0.0033) [2024-06-22 20:37:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42329.1, 300 sec: 42487.3). Total num frames: 4684316672. Throughput: 0: 42232.3. Samples: 4684420280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:37:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 20:37:59,096][15401] Updated weights for policy 0, policy_version 285910 (0.0043) [2024-06-22 20:38:03,288][15401] Updated weights for policy 0, policy_version 285920 (0.0036) [2024-06-22 20:38:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42320.8, 300 sec: 42541.9). Total num frames: 4684513280. Throughput: 0: 42450.0. Samples: 4684681500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:38:03,396][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 20:38:06,798][15401] Updated weights for policy 0, policy_version 285930 (0.0035) [2024-06-22 20:38:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4684742656. Throughput: 0: 42450.6. Samples: 4684800880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:38:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 20:38:11,275][15401] Updated weights for policy 0, policy_version 285940 (0.0041) [2024-06-22 20:38:13,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42052.1, 300 sec: 42432.7). Total num frames: 4684939264. Throughput: 0: 42374.1. Samples: 4685059840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-22 20:38:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 20:38:14,369][15401] Updated weights for policy 0, policy_version 285950 (0.0032) [2024-06-22 20:38:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4685135872. Throughput: 0: 42422.7. Samples: 4685319820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 20:38:18,790][15401] Updated weights for policy 0, policy_version 285960 (0.0042) [2024-06-22 20:38:22,071][15401] Updated weights for policy 0, policy_version 285970 (0.0041) [2024-06-22 20:38:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4685381632. Throughput: 0: 42331.1. Samples: 4685440080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 20:38:26,263][15401] Updated weights for policy 0, policy_version 285980 (0.0026) [2024-06-22 20:38:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4685594624. Throughput: 0: 42687.1. Samples: 4685705400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 20:38:29,522][15401] Updated weights for policy 0, policy_version 285990 (0.0047) [2024-06-22 20:38:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4685791232. Throughput: 0: 42446.6. Samples: 4685959520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 20:38:33,991][15401] Updated weights for policy 0, policy_version 286000 (0.0031) [2024-06-22 20:38:37,027][15401] Updated weights for policy 0, policy_version 286010 (0.0024) [2024-06-22 20:38:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 4686004224. Throughput: 0: 42521.7. Samples: 4686086980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:38,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 20:38:41,601][15401] Updated weights for policy 0, policy_version 286020 (0.0033) [2024-06-22 20:38:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4686200832. Throughput: 0: 42705.9. Samples: 4686342040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 20:38:45,043][15401] Updated weights for policy 0, policy_version 286030 (0.0025) [2024-06-22 20:38:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4686430208. Throughput: 0: 42606.0. Samples: 4686598500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:48,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 20:38:49,221][15401] Updated weights for policy 0, policy_version 286040 (0.0022) [2024-06-22 20:38:50,493][15349] Signal inference workers to stop experience collection... (69300 times) [2024-06-22 20:38:50,493][15349] Signal inference workers to resume experience collection... (69300 times) [2024-06-22 20:38:50,505][15401] InferenceWorker_p0-w0: stopping experience collection (69300 times) [2024-06-22 20:38:50,506][15401] InferenceWorker_p0-w0: resuming experience collection (69300 times) [2024-06-22 20:38:52,658][15401] Updated weights for policy 0, policy_version 286050 (0.0032) [2024-06-22 20:38:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4686659584. Throughput: 0: 42876.4. Samples: 4686730320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 20:38:56,825][15401] Updated weights for policy 0, policy_version 286060 (0.0033) [2024-06-22 20:38:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4686856192. Throughput: 0: 42669.4. Samples: 4686979960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:38:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 20:39:00,266][15401] Updated weights for policy 0, policy_version 286070 (0.0037) [2024-06-22 20:39:03,391][15132] Fps is (10 sec: 40952.7, 60 sec: 42601.7, 300 sec: 42487.1). Total num frames: 4687069184. Throughput: 0: 42552.9. Samples: 4687234780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:03,392][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 20:39:04,857][15401] Updated weights for policy 0, policy_version 286080 (0.0047) [2024-06-22 20:39:07,969][15401] Updated weights for policy 0, policy_version 286090 (0.0040) [2024-06-22 20:39:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4687298560. Throughput: 0: 42767.6. Samples: 4687364620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 20:39:12,736][15401] Updated weights for policy 0, policy_version 286100 (0.0028) [2024-06-22 20:39:13,390][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4687495168. Throughput: 0: 42560.0. Samples: 4687620600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 20:39:16,083][15401] Updated weights for policy 0, policy_version 286110 (0.0027) [2024-06-22 20:39:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42376.5). Total num frames: 4687708160. Throughput: 0: 42552.9. Samples: 4687874400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 20:39:20,290][15401] Updated weights for policy 0, policy_version 286120 (0.0028) [2024-06-22 20:39:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4687937536. Throughput: 0: 42531.2. Samples: 4688000780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:23,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 20:39:23,696][15401] Updated weights for policy 0, policy_version 286130 (0.0029) [2024-06-22 20:39:27,899][15401] Updated weights for policy 0, policy_version 286140 (0.0029) [2024-06-22 20:39:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4688134144. Throughput: 0: 42608.5. Samples: 4688259420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 20:39:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 20:39:31,614][15401] Updated weights for policy 0, policy_version 286150 (0.0041) [2024-06-22 20:39:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4688347136. Throughput: 0: 42432.1. Samples: 4688507940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 20:39:35,490][15401] Updated weights for policy 0, policy_version 286160 (0.0040) [2024-06-22 20:39:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 4688560128. Throughput: 0: 42391.6. Samples: 4688637940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 20:39:39,364][15401] Updated weights for policy 0, policy_version 286170 (0.0036) [2024-06-22 20:39:43,341][15401] Updated weights for policy 0, policy_version 286180 (0.0038) [2024-06-22 20:39:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4688773120. Throughput: 0: 42556.9. Samples: 4688895020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 20:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286180_4688773120.pth... [2024-06-22 20:39:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285557_4678565888.pth [2024-06-22 20:39:46,891][15401] Updated weights for policy 0, policy_version 286190 (0.0033) [2024-06-22 20:39:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4688986112. Throughput: 0: 42401.2. Samples: 4689142760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 20:39:50,927][15401] Updated weights for policy 0, policy_version 286200 (0.0041) [2024-06-22 20:39:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4689182720. Throughput: 0: 42469.3. Samples: 4689275740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 20:39:54,640][15401] Updated weights for policy 0, policy_version 286210 (0.0029) [2024-06-22 20:39:58,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 4689395712. Throughput: 0: 42443.2. Samples: 4689530640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:39:58,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-22 20:39:58,822][15401] Updated weights for policy 0, policy_version 286220 (0.0045) [2024-06-22 20:40:02,745][15401] Updated weights for policy 0, policy_version 286230 (0.0035) [2024-06-22 20:40:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42599.7, 300 sec: 42376.3). Total num frames: 4689625088. Throughput: 0: 42504.6. Samples: 4689787100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:03,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 20:40:06,492][15401] Updated weights for policy 0, policy_version 286240 (0.0042) [2024-06-22 20:40:08,389][15132] Fps is (10 sec: 44246.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4689838080. Throughput: 0: 42532.4. Samples: 4689914740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 20:40:10,491][15401] Updated weights for policy 0, policy_version 286250 (0.0028) [2024-06-22 20:40:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4690034688. Throughput: 0: 42321.7. Samples: 4690163900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 20:40:14,228][15401] Updated weights for policy 0, policy_version 286260 (0.0040) [2024-06-22 20:40:18,171][15401] Updated weights for policy 0, policy_version 286270 (0.0034) [2024-06-22 20:40:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4690264064. Throughput: 0: 42638.3. Samples: 4690426660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 20:40:21,746][15401] Updated weights for policy 0, policy_version 286280 (0.0036) [2024-06-22 20:40:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 4690477056. Throughput: 0: 42453.9. Samples: 4690548360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 20:40:25,727][15401] Updated weights for policy 0, policy_version 286290 (0.0039) [2024-06-22 20:40:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4690673664. Throughput: 0: 42417.0. Samples: 4690803780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 20:40:29,649][15401] Updated weights for policy 0, policy_version 286300 (0.0033) [2024-06-22 20:40:29,804][15349] Signal inference workers to stop experience collection... (69350 times) [2024-06-22 20:40:29,804][15349] Signal inference workers to resume experience collection... (69350 times) [2024-06-22 20:40:29,852][15401] InferenceWorker_p0-w0: stopping experience collection (69350 times) [2024-06-22 20:40:29,853][15401] InferenceWorker_p0-w0: resuming experience collection (69350 times) [2024-06-22 20:40:33,370][15401] Updated weights for policy 0, policy_version 286310 (0.0040) [2024-06-22 20:40:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4690903040. Throughput: 0: 42697.8. Samples: 4691064160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 20:40:37,145][15401] Updated weights for policy 0, policy_version 286320 (0.0033) [2024-06-22 20:40:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4691132416. Throughput: 0: 42555.1. Samples: 4691190720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 20:40:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 20:40:40,841][15401] Updated weights for policy 0, policy_version 286330 (0.0039) [2024-06-22 20:40:43,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42593.9, 300 sec: 42375.3). Total num frames: 4691329024. Throughput: 0: 42556.5. Samples: 4691445860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:40:43,396][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 20:40:44,721][15401] Updated weights for policy 0, policy_version 286340 (0.0038) [2024-06-22 20:40:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4691525632. Throughput: 0: 42664.5. Samples: 4691707000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:40:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 20:40:48,635][15401] Updated weights for policy 0, policy_version 286350 (0.0033) [2024-06-22 20:40:52,258][15401] Updated weights for policy 0, policy_version 286360 (0.0027) [2024-06-22 20:40:53,389][15132] Fps is (10 sec: 44265.3, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4691771392. Throughput: 0: 42610.7. Samples: 4691832220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:40:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 20:40:56,665][15401] Updated weights for policy 0, policy_version 286370 (0.0045) [2024-06-22 20:40:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42431.8). Total num frames: 4691968000. Throughput: 0: 42819.6. Samples: 4692090780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:40:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 20:40:59,935][15401] Updated weights for policy 0, policy_version 286380 (0.0025) [2024-06-22 20:41:03,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 4692164608. Throughput: 0: 42715.8. Samples: 4692348980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:03,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 20:41:04,139][15401] Updated weights for policy 0, policy_version 286390 (0.0035) [2024-06-22 20:41:07,433][15401] Updated weights for policy 0, policy_version 286400 (0.0027) [2024-06-22 20:41:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 4692410368. Throughput: 0: 42754.6. Samples: 4692472320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 20:41:11,715][15401] Updated weights for policy 0, policy_version 286410 (0.0024) [2024-06-22 20:41:13,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 4692606976. Throughput: 0: 42890.3. Samples: 4692733840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 20:41:14,929][15401] Updated weights for policy 0, policy_version 286420 (0.0032) [2024-06-22 20:41:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42487.7). Total num frames: 4692803584. Throughput: 0: 42857.2. Samples: 4692992740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:41:19,341][15401] Updated weights for policy 0, policy_version 286430 (0.0038) [2024-06-22 20:41:22,729][15401] Updated weights for policy 0, policy_version 286440 (0.0040) [2024-06-22 20:41:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 4693049344. Throughput: 0: 42887.1. Samples: 4693120640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-22 20:41:27,025][15401] Updated weights for policy 0, policy_version 286450 (0.0037) [2024-06-22 20:41:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 4693245952. Throughput: 0: 42802.0. Samples: 4693371680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 20:41:30,513][15401] Updated weights for policy 0, policy_version 286460 (0.0035) [2024-06-22 20:41:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4693442560. Throughput: 0: 42838.1. Samples: 4693634720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 20:41:34,560][15401] Updated weights for policy 0, policy_version 286470 (0.0032) [2024-06-22 20:41:37,942][15401] Updated weights for policy 0, policy_version 286480 (0.0035) [2024-06-22 20:41:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4693688320. Throughput: 0: 42818.2. Samples: 4693759040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-22 20:41:42,320][15401] Updated weights for policy 0, policy_version 286490 (0.0032) [2024-06-22 20:41:43,391][15132] Fps is (10 sec: 44228.0, 60 sec: 42601.6, 300 sec: 42542.6). Total num frames: 4693884928. Throughput: 0: 42837.3. Samples: 4694018540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:43,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 20:41:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286492_4693884928.pth... [2024-06-22 20:41:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000285869_4683677696.pth [2024-06-22 20:41:44,180][15349] Signal inference workers to stop experience collection... (69400 times) [2024-06-22 20:41:44,182][15349] Signal inference workers to resume experience collection... (69400 times) [2024-06-22 20:41:44,229][15401] InferenceWorker_p0-w0: stopping experience collection (69400 times) [2024-06-22 20:41:44,230][15401] InferenceWorker_p0-w0: resuming experience collection (69400 times) [2024-06-22 20:41:45,453][15401] Updated weights for policy 0, policy_version 286500 (0.0038) [2024-06-22 20:41:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4694097920. Throughput: 0: 42835.6. Samples: 4694276480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 20:41:49,943][15401] Updated weights for policy 0, policy_version 286510 (0.0034) [2024-06-22 20:41:53,138][15401] Updated weights for policy 0, policy_version 286520 (0.0033) [2024-06-22 20:41:53,393][15132] Fps is (10 sec: 45866.7, 60 sec: 42868.7, 300 sec: 42598.6). Total num frames: 4694343680. Throughput: 0: 42851.9. Samples: 4694400820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 20:41:53,394][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 20:41:57,568][15401] Updated weights for policy 0, policy_version 286530 (0.0038) [2024-06-22 20:41:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4694540288. Throughput: 0: 42716.7. Samples: 4694656100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:41:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 20:42:01,272][15401] Updated weights for policy 0, policy_version 286540 (0.0035) [2024-06-22 20:42:03,389][15132] Fps is (10 sec: 37697.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 4694720512. Throughput: 0: 42607.7. Samples: 4694910080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 20:42:05,167][15401] Updated weights for policy 0, policy_version 286550 (0.0036) [2024-06-22 20:42:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 4694966272. Throughput: 0: 42531.9. Samples: 4695034580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 20:42:08,868][15401] Updated weights for policy 0, policy_version 286560 (0.0033) [2024-06-22 20:42:12,623][15401] Updated weights for policy 0, policy_version 286570 (0.0032) [2024-06-22 20:42:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4695195648. Throughput: 0: 42794.7. Samples: 4695297440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 20:42:16,515][15401] Updated weights for policy 0, policy_version 286580 (0.0040) [2024-06-22 20:42:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4695375872. Throughput: 0: 42546.1. Samples: 4695549300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 20:42:20,542][15401] Updated weights for policy 0, policy_version 286590 (0.0028) [2024-06-22 20:42:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4695572480. Throughput: 0: 42601.8. Samples: 4695676120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 20:42:24,077][15401] Updated weights for policy 0, policy_version 286600 (0.0028) [2024-06-22 20:42:28,152][15401] Updated weights for policy 0, policy_version 286610 (0.0037) [2024-06-22 20:42:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4695834624. Throughput: 0: 42499.2. Samples: 4695930920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 20:42:31,848][15401] Updated weights for policy 0, policy_version 286620 (0.0047) [2024-06-22 20:42:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4696014848. Throughput: 0: 42607.6. Samples: 4696193820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 20:42:35,775][15401] Updated weights for policy 0, policy_version 286630 (0.0024) [2024-06-22 20:42:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4696227840. Throughput: 0: 42581.4. Samples: 4696316820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 20:42:39,551][15401] Updated weights for policy 0, policy_version 286640 (0.0026) [2024-06-22 20:42:43,287][15401] Updated weights for policy 0, policy_version 286650 (0.0042) [2024-06-22 20:42:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43145.8, 300 sec: 42709.5). Total num frames: 4696473600. Throughput: 0: 42743.9. Samples: 4696579580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 20:42:47,253][15401] Updated weights for policy 0, policy_version 286660 (0.0036) [2024-06-22 20:42:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 4696653824. Throughput: 0: 42739.0. Samples: 4696833340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 20:42:51,139][15401] Updated weights for policy 0, policy_version 286670 (0.0026) [2024-06-22 20:42:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42328.1, 300 sec: 42598.4). Total num frames: 4696883200. Throughput: 0: 42827.2. Samples: 4696961800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-22 20:42:54,792][15401] Updated weights for policy 0, policy_version 286680 (0.0036) [2024-06-22 20:42:58,012][15349] Signal inference workers to stop experience collection... (69450 times) [2024-06-22 20:42:58,065][15401] InferenceWorker_p0-w0: stopping experience collection (69450 times) [2024-06-22 20:42:58,127][15349] Signal inference workers to resume experience collection... (69450 times) [2024-06-22 20:42:58,128][15401] InferenceWorker_p0-w0: resuming experience collection (69450 times) [2024-06-22 20:42:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 4697096192. Throughput: 0: 42801.0. Samples: 4697223480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:42:58,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-22 20:42:58,779][15401] Updated weights for policy 0, policy_version 286690 (0.0030) [2024-06-22 20:43:02,421][15401] Updated weights for policy 0, policy_version 286700 (0.0042) [2024-06-22 20:43:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4697309184. Throughput: 0: 42883.3. Samples: 4697479040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:43:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 20:43:06,332][15401] Updated weights for policy 0, policy_version 286710 (0.0029) [2024-06-22 20:43:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4697538560. Throughput: 0: 42964.9. Samples: 4697609540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 20:43:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 20:43:10,006][15401] Updated weights for policy 0, policy_version 286720 (0.0032) [2024-06-22 20:43:13,391][15132] Fps is (10 sec: 42590.8, 60 sec: 42324.2, 300 sec: 42709.2). Total num frames: 4697735168. Throughput: 0: 43129.0. Samples: 4697871800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:13,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 20:43:13,997][15401] Updated weights for policy 0, policy_version 286730 (0.0044) [2024-06-22 20:43:17,836][15401] Updated weights for policy 0, policy_version 286740 (0.0039) [2024-06-22 20:43:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4697964544. Throughput: 0: 42745.6. Samples: 4698117380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 20:43:21,738][15401] Updated weights for policy 0, policy_version 286750 (0.0026) [2024-06-22 20:43:23,390][15132] Fps is (10 sec: 45883.0, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 4698193920. Throughput: 0: 42952.3. Samples: 4698249680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:23,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 20:43:25,462][15401] Updated weights for policy 0, policy_version 286760 (0.0035) [2024-06-22 20:43:28,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4698357760. Throughput: 0: 42870.5. Samples: 4698508740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 20:43:29,430][15401] Updated weights for policy 0, policy_version 286770 (0.0052) [2024-06-22 20:43:33,031][15401] Updated weights for policy 0, policy_version 286780 (0.0032) [2024-06-22 20:43:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 4698603520. Throughput: 0: 42824.2. Samples: 4698760420. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 20:43:36,914][15401] Updated weights for policy 0, policy_version 286790 (0.0033) [2024-06-22 20:43:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4698816512. Throughput: 0: 42977.3. Samples: 4698895780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 20:43:40,652][15401] Updated weights for policy 0, policy_version 286800 (0.0038) [2024-06-22 20:43:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 4699029504. Throughput: 0: 43013.8. Samples: 4699159100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 20:43:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286806_4699029504.pth... [2024-06-22 20:43:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286180_4688773120.pth [2024-06-22 20:43:44,615][15401] Updated weights for policy 0, policy_version 286810 (0.0044) [2024-06-22 20:43:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4699242496. Throughput: 0: 42840.4. Samples: 4699406860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 20:43:48,717][15401] Updated weights for policy 0, policy_version 286820 (0.0042) [2024-06-22 20:43:52,214][15401] Updated weights for policy 0, policy_version 286830 (0.0028) [2024-06-22 20:43:53,391][15132] Fps is (10 sec: 44230.1, 60 sec: 43143.4, 300 sec: 42764.8). Total num frames: 4699471872. Throughput: 0: 42914.6. Samples: 4699540760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:53,391][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 20:43:56,351][15401] Updated weights for policy 0, policy_version 286840 (0.0034) [2024-06-22 20:43:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 4699652096. Throughput: 0: 42753.6. Samples: 4699795740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:43:58,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 20:43:59,687][15401] Updated weights for policy 0, policy_version 286850 (0.0036) [2024-06-22 20:44:03,389][15132] Fps is (10 sec: 40966.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4699881472. Throughput: 0: 42874.9. Samples: 4700046740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:44:03,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 20:44:03,950][15401] Updated weights for policy 0, policy_version 286860 (0.0026) [2024-06-22 20:44:07,600][15401] Updated weights for policy 0, policy_version 286870 (0.0036) [2024-06-22 20:44:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4700094464. Throughput: 0: 42916.4. Samples: 4700180920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:44:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 20:44:11,512][15401] Updated weights for policy 0, policy_version 286880 (0.0040) [2024-06-22 20:44:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.7, 300 sec: 42654.0). Total num frames: 4700291072. Throughput: 0: 42964.9. Samples: 4700442160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:44:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 20:44:15,162][15401] Updated weights for policy 0, policy_version 286890 (0.0041) [2024-06-22 20:44:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4700536832. Throughput: 0: 42931.0. Samples: 4700692320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-06-22 20:44:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 20:44:18,918][15401] Updated weights for policy 0, policy_version 286900 (0.0031) [2024-06-22 20:44:23,099][15349] Signal inference workers to stop experience collection... (69500 times) [2024-06-22 20:44:23,147][15401] InferenceWorker_p0-w0: stopping experience collection (69500 times) [2024-06-22 20:44:23,155][15349] Signal inference workers to resume experience collection... (69500 times) [2024-06-22 20:44:23,171][15401] InferenceWorker_p0-w0: resuming experience collection (69500 times) [2024-06-22 20:44:23,175][15401] Updated weights for policy 0, policy_version 286910 (0.0036) [2024-06-22 20:44:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4700749824. Throughput: 0: 42913.4. Samples: 4700826880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 20:44:26,689][15401] Updated weights for policy 0, policy_version 286920 (0.0039) [2024-06-22 20:44:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4700962816. Throughput: 0: 42881.4. Samples: 4701088760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 20:44:30,706][15401] Updated weights for policy 0, policy_version 286930 (0.0034) [2024-06-22 20:44:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4701192192. Throughput: 0: 43078.7. Samples: 4701345400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:33,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 20:44:34,223][15401] Updated weights for policy 0, policy_version 286940 (0.0038) [2024-06-22 20:44:38,055][15401] Updated weights for policy 0, policy_version 286950 (0.0028) [2024-06-22 20:44:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4701405184. Throughput: 0: 43083.1. Samples: 4701479540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:38,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 20:44:41,785][15401] Updated weights for policy 0, policy_version 286960 (0.0032) [2024-06-22 20:44:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4701618176. Throughput: 0: 43163.0. Samples: 4701737980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:43,391][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 20:44:45,678][15401] Updated weights for policy 0, policy_version 286970 (0.0043) [2024-06-22 20:44:48,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4701847552. Throughput: 0: 43313.3. Samples: 4701995840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 20:44:49,390][15401] Updated weights for policy 0, policy_version 286980 (0.0033) [2024-06-22 20:44:53,224][15401] Updated weights for policy 0, policy_version 286990 (0.0023) [2024-06-22 20:44:53,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42870.8, 300 sec: 42876.1). Total num frames: 4702044160. Throughput: 0: 43254.6. Samples: 4702127480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:53,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 20:44:56,927][15401] Updated weights for policy 0, policy_version 287000 (0.0027) [2024-06-22 20:44:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 4702257152. Throughput: 0: 43077.3. Samples: 4702380640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:44:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 20:45:00,826][15401] Updated weights for policy 0, policy_version 287010 (0.0053) [2024-06-22 20:45:03,391][15132] Fps is (10 sec: 44239.1, 60 sec: 43416.2, 300 sec: 42875.8). Total num frames: 4702486528. Throughput: 0: 43261.0. Samples: 4702639140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:03,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 20:45:04,412][15401] Updated weights for policy 0, policy_version 287020 (0.0030) [2024-06-22 20:45:08,335][15401] Updated weights for policy 0, policy_version 287030 (0.0035) [2024-06-22 20:45:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4702699520. Throughput: 0: 43159.9. Samples: 4702769080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 20:45:11,877][15401] Updated weights for policy 0, policy_version 287040 (0.0036) [2024-06-22 20:45:13,390][15132] Fps is (10 sec: 42606.1, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 4702912512. Throughput: 0: 42997.2. Samples: 4703023640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:13,391][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 20:45:16,163][15401] Updated weights for policy 0, policy_version 287050 (0.0042) [2024-06-22 20:45:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4703109120. Throughput: 0: 42955.5. Samples: 4703278400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 20:45:19,626][15401] Updated weights for policy 0, policy_version 287060 (0.0042) [2024-06-22 20:45:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4703322112. Throughput: 0: 42789.4. Samples: 4703404960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 20:45:23,775][15401] Updated weights for policy 0, policy_version 287070 (0.0038) [2024-06-22 20:45:27,578][15401] Updated weights for policy 0, policy_version 287080 (0.0037) [2024-06-22 20:45:28,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 4703567872. Throughput: 0: 42835.8. Samples: 4703665680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:28,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 20:45:31,279][15401] Updated weights for policy 0, policy_version 287090 (0.0043) [2024-06-22 20:45:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4703764480. Throughput: 0: 42752.8. Samples: 4703919720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-06-22 20:45:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 20:45:35,193][15401] Updated weights for policy 0, policy_version 287100 (0.0037) [2024-06-22 20:45:38,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42821.2). Total num frames: 4703961088. Throughput: 0: 42622.3. Samples: 4704045480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:45:38,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 20:45:39,285][15401] Updated weights for policy 0, policy_version 287110 (0.0055) [2024-06-22 20:45:42,977][15401] Updated weights for policy 0, policy_version 287120 (0.0054) [2024-06-22 20:45:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 4704190464. Throughput: 0: 42730.6. Samples: 4704303520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:45:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 20:45:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287121_4704190464.pth... [2024-06-22 20:45:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286492_4693884928.pth [2024-06-22 20:45:46,887][15401] Updated weights for policy 0, policy_version 287130 (0.0038) [2024-06-22 20:45:48,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4704387072. Throughput: 0: 42539.5. Samples: 4704553340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:45:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 20:45:50,819][15401] Updated weights for policy 0, policy_version 287140 (0.0049) [2024-06-22 20:45:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 4704600064. Throughput: 0: 42446.7. Samples: 4704679180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:45:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 20:45:54,508][15401] Updated weights for policy 0, policy_version 287150 (0.0041) [2024-06-22 20:45:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 4704796672. Throughput: 0: 42627.6. Samples: 4704941880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:45:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 20:45:58,879][15401] Updated weights for policy 0, policy_version 287160 (0.0040) [2024-06-22 20:46:02,080][15401] Updated weights for policy 0, policy_version 287170 (0.0041) [2024-06-22 20:46:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.7, 300 sec: 42820.5). Total num frames: 4705042432. Throughput: 0: 42570.2. Samples: 4705194060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 20:46:06,381][15401] Updated weights for policy 0, policy_version 287180 (0.0041) [2024-06-22 20:46:07,400][15349] Signal inference workers to stop experience collection... (69550 times) [2024-06-22 20:46:07,402][15349] Signal inference workers to resume experience collection... (69550 times) [2024-06-22 20:46:07,445][15401] InferenceWorker_p0-w0: stopping experience collection (69550 times) [2024-06-22 20:46:07,445][15401] InferenceWorker_p0-w0: resuming experience collection (69550 times) [2024-06-22 20:46:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4705239040. Throughput: 0: 42770.5. Samples: 4705329640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:08,391][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:46:09,611][15401] Updated weights for policy 0, policy_version 287190 (0.0038) [2024-06-22 20:46:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4705452032. Throughput: 0: 42664.3. Samples: 4705585480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 20:46:14,127][15401] Updated weights for policy 0, policy_version 287200 (0.0040) [2024-06-22 20:46:17,464][15401] Updated weights for policy 0, policy_version 287210 (0.0022) [2024-06-22 20:46:18,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4705697792. Throughput: 0: 42585.5. Samples: 4705836060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 20:46:21,708][15401] Updated weights for policy 0, policy_version 287220 (0.0036) [2024-06-22 20:46:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4705894400. Throughput: 0: 42815.0. Samples: 4705972060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:23,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 20:46:25,304][15401] Updated weights for policy 0, policy_version 287230 (0.0046) [2024-06-22 20:46:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42053.9, 300 sec: 42876.1). Total num frames: 4706091008. Throughput: 0: 42669.9. Samples: 4706223660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 20:46:29,254][15401] Updated weights for policy 0, policy_version 287240 (0.0029) [2024-06-22 20:46:32,715][15401] Updated weights for policy 0, policy_version 287250 (0.0033) [2024-06-22 20:46:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4706320384. Throughput: 0: 42865.8. Samples: 4706482300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 20:46:36,725][15401] Updated weights for policy 0, policy_version 287260 (0.0025) [2024-06-22 20:46:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.2, 300 sec: 42931.9). Total num frames: 4706549760. Throughput: 0: 42972.9. Samples: 4706612960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 20:46:40,285][15401] Updated weights for policy 0, policy_version 287270 (0.0046) [2024-06-22 20:46:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4706729984. Throughput: 0: 42726.1. Samples: 4706864560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:46:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 20:46:44,585][15401] Updated weights for policy 0, policy_version 287280 (0.0042) [2024-06-22 20:46:48,244][15401] Updated weights for policy 0, policy_version 287290 (0.0035) [2024-06-22 20:46:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.6). Total num frames: 4706959360. Throughput: 0: 42877.1. Samples: 4707123520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:46:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 20:46:52,485][15401] Updated weights for policy 0, policy_version 287300 (0.0029) [2024-06-22 20:46:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4707172352. Throughput: 0: 42681.9. Samples: 4707250320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:46:53,394][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 20:46:55,745][15401] Updated weights for policy 0, policy_version 287310 (0.0040) [2024-06-22 20:46:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4707385344. Throughput: 0: 42681.8. Samples: 4707506160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:46:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 20:47:00,011][15401] Updated weights for policy 0, policy_version 287320 (0.0031) [2024-06-22 20:47:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4707614720. Throughput: 0: 42882.7. Samples: 4707765780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 20:47:03,395][15401] Updated weights for policy 0, policy_version 287330 (0.0027) [2024-06-22 20:47:07,641][15401] Updated weights for policy 0, policy_version 287340 (0.0038) [2024-06-22 20:47:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4707811328. Throughput: 0: 42678.4. Samples: 4707892580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 20:47:10,818][15401] Updated weights for policy 0, policy_version 287350 (0.0043) [2024-06-22 20:47:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4708040704. Throughput: 0: 42845.3. Samples: 4708151700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 20:47:15,179][15401] Updated weights for policy 0, policy_version 287360 (0.0028) [2024-06-22 20:47:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4708253696. Throughput: 0: 42768.8. Samples: 4708406900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:18,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-22 20:47:18,515][15401] Updated weights for policy 0, policy_version 287370 (0.0038) [2024-06-22 20:47:22,783][15401] Updated weights for policy 0, policy_version 287380 (0.0030) [2024-06-22 20:47:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4708466688. Throughput: 0: 42735.2. Samples: 4708536040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 20:47:26,324][15401] Updated weights for policy 0, policy_version 287390 (0.0041) [2024-06-22 20:47:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4708663296. Throughput: 0: 42836.2. Samples: 4708792180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 20:47:30,464][15401] Updated weights for policy 0, policy_version 287400 (0.0043) [2024-06-22 20:47:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4708892672. Throughput: 0: 42854.5. Samples: 4709051980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 20:47:33,728][15349] Signal inference workers to stop experience collection... (69600 times) [2024-06-22 20:47:33,734][15349] Signal inference workers to resume experience collection... (69600 times) [2024-06-22 20:47:33,776][15401] InferenceWorker_p0-w0: stopping experience collection (69600 times) [2024-06-22 20:47:33,776][15401] InferenceWorker_p0-w0: resuming experience collection (69600 times) [2024-06-22 20:47:33,879][15401] Updated weights for policy 0, policy_version 287410 (0.0037) [2024-06-22 20:47:37,989][15401] Updated weights for policy 0, policy_version 287420 (0.0033) [2024-06-22 20:47:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4709105664. Throughput: 0: 42913.4. Samples: 4709181420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:38,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 20:47:41,603][15401] Updated weights for policy 0, policy_version 287430 (0.0030) [2024-06-22 20:47:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4709302272. Throughput: 0: 42966.7. Samples: 4709439660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 20:47:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287434_4709318656.pth... [2024-06-22 20:47:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000286806_4699029504.pth [2024-06-22 20:47:45,667][15401] Updated weights for policy 0, policy_version 287440 (0.0037) [2024-06-22 20:47:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4709515264. Throughput: 0: 42802.6. Samples: 4709691900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 20:47:49,264][15401] Updated weights for policy 0, policy_version 287450 (0.0034) [2024-06-22 20:47:53,110][15401] Updated weights for policy 0, policy_version 287460 (0.0039) [2024-06-22 20:47:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4709761024. Throughput: 0: 42874.2. Samples: 4709821920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 20:47:56,974][15401] Updated weights for policy 0, policy_version 287470 (0.0036) [2024-06-22 20:47:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4709941248. Throughput: 0: 42815.5. Samples: 4710078400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-22 20:47:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 20:48:00,664][15401] Updated weights for policy 0, policy_version 287480 (0.0028) [2024-06-22 20:48:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4710170624. Throughput: 0: 42751.0. Samples: 4710330700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 20:48:05,054][15401] Updated weights for policy 0, policy_version 287490 (0.0042) [2024-06-22 20:48:08,335][15401] Updated weights for policy 0, policy_version 287500 (0.0033) [2024-06-22 20:48:08,392][15132] Fps is (10 sec: 45865.2, 60 sec: 43143.0, 300 sec: 42931.6). Total num frames: 4710400000. Throughput: 0: 42841.0. Samples: 4710463980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:08,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 20:48:12,654][15401] Updated weights for policy 0, policy_version 287510 (0.0031) [2024-06-22 20:48:13,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 4710596608. Throughput: 0: 42873.7. Samples: 4710721600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:13,401][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 20:48:16,187][15401] Updated weights for policy 0, policy_version 287520 (0.0031) [2024-06-22 20:48:18,392][15132] Fps is (10 sec: 40958.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4710809600. Throughput: 0: 42665.8. Samples: 4710972040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:18,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 20:48:20,319][15401] Updated weights for policy 0, policy_version 287530 (0.0039) [2024-06-22 20:48:23,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4711022592. Throughput: 0: 42702.7. Samples: 4711103040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 20:48:23,761][15401] Updated weights for policy 0, policy_version 287540 (0.0030) [2024-06-22 20:48:27,881][15401] Updated weights for policy 0, policy_version 287550 (0.0028) [2024-06-22 20:48:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4711219200. Throughput: 0: 42748.9. Samples: 4711363360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 20:48:31,725][15401] Updated weights for policy 0, policy_version 287560 (0.0028) [2024-06-22 20:48:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4711464960. Throughput: 0: 42646.6. Samples: 4711611000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:33,391][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 20:48:35,611][15401] Updated weights for policy 0, policy_version 287570 (0.0036) [2024-06-22 20:48:38,390][15132] Fps is (10 sec: 44232.9, 60 sec: 42597.8, 300 sec: 42820.4). Total num frames: 4711661568. Throughput: 0: 42693.8. Samples: 4711743180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:38,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 20:48:39,338][15401] Updated weights for policy 0, policy_version 287580 (0.0023) [2024-06-22 20:48:43,295][15401] Updated weights for policy 0, policy_version 287590 (0.0028) [2024-06-22 20:48:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4711874560. Throughput: 0: 42675.0. Samples: 4711998780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 20:48:46,930][15401] Updated weights for policy 0, policy_version 287600 (0.0037) [2024-06-22 20:48:48,391][15132] Fps is (10 sec: 44233.9, 60 sec: 43143.4, 300 sec: 42820.6). Total num frames: 4712103936. Throughput: 0: 42529.3. Samples: 4712244580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:48,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 20:48:50,933][15401] Updated weights for policy 0, policy_version 287610 (0.0036) [2024-06-22 20:48:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.2, 300 sec: 42932.0). Total num frames: 4712316928. Throughput: 0: 42572.1. Samples: 4712379640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 20:48:54,649][15401] Updated weights for policy 0, policy_version 287620 (0.0039) [2024-06-22 20:48:58,389][15132] Fps is (10 sec: 39327.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4712497152. Throughput: 0: 42553.0. Samples: 4712636380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:48:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 20:48:58,796][15401] Updated weights for policy 0, policy_version 287630 (0.0034) [2024-06-22 20:49:02,220][15401] Updated weights for policy 0, policy_version 287640 (0.0037) [2024-06-22 20:49:02,821][15349] Signal inference workers to stop experience collection... (69650 times) [2024-06-22 20:49:02,823][15349] Signal inference workers to resume experience collection... (69650 times) [2024-06-22 20:49:02,874][15401] InferenceWorker_p0-w0: stopping experience collection (69650 times) [2024-06-22 20:49:02,874][15401] InferenceWorker_p0-w0: resuming experience collection (69650 times) [2024-06-22 20:49:03,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4712742912. Throughput: 0: 42570.8. Samples: 4712887620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:49:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 20:49:06,723][15401] Updated weights for policy 0, policy_version 287650 (0.0040) [2024-06-22 20:49:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42599.8, 300 sec: 42931.6). Total num frames: 4712955904. Throughput: 0: 42718.6. Samples: 4713025380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-22 20:49:08,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-22 20:49:09,945][15401] Updated weights for policy 0, policy_version 287660 (0.0034) [2024-06-22 20:49:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 4713152512. Throughput: 0: 42566.1. Samples: 4713278840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 20:49:14,315][15401] Updated weights for policy 0, policy_version 287670 (0.0039) [2024-06-22 20:49:17,562][15401] Updated weights for policy 0, policy_version 287680 (0.0049) [2024-06-22 20:49:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 4713381888. Throughput: 0: 42612.9. Samples: 4713528680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:18,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 20:49:21,851][15401] Updated weights for policy 0, policy_version 287690 (0.0025) [2024-06-22 20:49:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4713578496. Throughput: 0: 42653.7. Samples: 4713662560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:23,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 20:49:25,023][15401] Updated weights for policy 0, policy_version 287700 (0.0030) [2024-06-22 20:49:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4713791488. Throughput: 0: 42640.1. Samples: 4713917580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:28,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-22 20:49:29,687][15401] Updated weights for policy 0, policy_version 287710 (0.0022) [2024-06-22 20:49:32,849][15401] Updated weights for policy 0, policy_version 287720 (0.0037) [2024-06-22 20:49:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 4714020864. Throughput: 0: 42770.3. Samples: 4714169180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 20:49:37,175][15401] Updated weights for policy 0, policy_version 287730 (0.0029) [2024-06-22 20:49:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42597.3, 300 sec: 42709.2). Total num frames: 4714217472. Throughput: 0: 42713.5. Samples: 4714301840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:38,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 20:49:40,383][15401] Updated weights for policy 0, policy_version 287740 (0.0024) [2024-06-22 20:49:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4714430464. Throughput: 0: 42685.2. Samples: 4714557220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 20:49:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287746_4714430464.pth... [2024-06-22 20:49:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287121_4704190464.pth [2024-06-22 20:49:44,659][15401] Updated weights for policy 0, policy_version 287750 (0.0032) [2024-06-22 20:49:47,967][15401] Updated weights for policy 0, policy_version 287760 (0.0038) [2024-06-22 20:49:48,392][15132] Fps is (10 sec: 45874.9, 60 sec: 42870.8, 300 sec: 42820.6). Total num frames: 4714676224. Throughput: 0: 42649.6. Samples: 4714806960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:48,393][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 20:49:52,314][15401] Updated weights for policy 0, policy_version 287770 (0.0039) [2024-06-22 20:49:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4714856448. Throughput: 0: 42568.9. Samples: 4714940980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 20:49:55,645][15401] Updated weights for policy 0, policy_version 287780 (0.0035) [2024-06-22 20:49:58,390][15132] Fps is (10 sec: 37691.6, 60 sec: 42598.2, 300 sec: 42598.6). Total num frames: 4715053056. Throughput: 0: 42501.6. Samples: 4715191420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:49:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 20:50:00,524][15401] Updated weights for policy 0, policy_version 287790 (0.0030) [2024-06-22 20:50:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4715298816. Throughput: 0: 42583.7. Samples: 4715444840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:50:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 20:50:03,418][15401] Updated weights for policy 0, policy_version 287800 (0.0044) [2024-06-22 20:50:08,079][15401] Updated weights for policy 0, policy_version 287810 (0.0033) [2024-06-22 20:50:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4715479040. Throughput: 0: 42463.6. Samples: 4715573420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:50:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 20:50:11,470][15401] Updated weights for policy 0, policy_version 287820 (0.0044) [2024-06-22 20:50:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4715692032. Throughput: 0: 42397.3. Samples: 4715825460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:50:13,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 20:50:15,641][15401] Updated weights for policy 0, policy_version 287830 (0.0041) [2024-06-22 20:50:18,390][15132] Fps is (10 sec: 45871.7, 60 sec: 42599.6, 300 sec: 42764.9). Total num frames: 4715937792. Throughput: 0: 42458.4. Samples: 4716079840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:50:18,391][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 20:50:19,022][15401] Updated weights for policy 0, policy_version 287840 (0.0023) [2024-06-22 20:50:23,244][15401] Updated weights for policy 0, policy_version 287850 (0.0041) [2024-06-22 20:50:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 4716134400. Throughput: 0: 42506.6. Samples: 4716214540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-22 20:50:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 20:50:26,709][15401] Updated weights for policy 0, policy_version 287860 (0.0036) [2024-06-22 20:50:28,390][15132] Fps is (10 sec: 37685.5, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 4716314624. Throughput: 0: 42237.2. Samples: 4716457900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:28,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 20:50:30,979][15401] Updated weights for policy 0, policy_version 287870 (0.0041) [2024-06-22 20:50:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 4716560384. Throughput: 0: 42369.7. Samples: 4716713500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 20:50:34,429][15401] Updated weights for policy 0, policy_version 287880 (0.0046) [2024-06-22 20:50:36,981][15349] Signal inference workers to stop experience collection... (69700 times) [2024-06-22 20:50:36,982][15349] Signal inference workers to resume experience collection... (69700 times) [2024-06-22 20:50:37,020][15401] InferenceWorker_p0-w0: stopping experience collection (69700 times) [2024-06-22 20:50:37,020][15401] InferenceWorker_p0-w0: resuming experience collection (69700 times) [2024-06-22 20:50:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 4716756992. Throughput: 0: 42334.8. Samples: 4716846040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 20:50:38,816][15401] Updated weights for policy 0, policy_version 287890 (0.0039) [2024-06-22 20:50:42,536][15401] Updated weights for policy 0, policy_version 287900 (0.0043) [2024-06-22 20:50:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4716969984. Throughput: 0: 42202.3. Samples: 4717090620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:43,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 20:50:46,360][15401] Updated weights for policy 0, policy_version 287910 (0.0046) [2024-06-22 20:50:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 4717199360. Throughput: 0: 42190.5. Samples: 4717343420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 20:50:50,217][15401] Updated weights for policy 0, policy_version 287920 (0.0033) [2024-06-22 20:50:53,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4717379584. Throughput: 0: 42248.4. Samples: 4717474600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 20:50:54,829][15401] Updated weights for policy 0, policy_version 287930 (0.0029) [2024-06-22 20:50:58,085][15401] Updated weights for policy 0, policy_version 287940 (0.0038) [2024-06-22 20:50:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 4717625344. Throughput: 0: 42214.2. Samples: 4717725100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:50:58,391][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 20:51:02,444][15401] Updated weights for policy 0, policy_version 287950 (0.0032) [2024-06-22 20:51:03,392][15132] Fps is (10 sec: 47501.7, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 4717854720. Throughput: 0: 42331.7. Samples: 4717984840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:03,393][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 20:51:05,783][15401] Updated weights for policy 0, policy_version 287960 (0.0041) [2024-06-22 20:51:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4718018560. Throughput: 0: 42253.4. Samples: 4718115940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 20:51:10,067][15401] Updated weights for policy 0, policy_version 287970 (0.0040) [2024-06-22 20:51:13,360][15401] Updated weights for policy 0, policy_version 287980 (0.0030) [2024-06-22 20:51:13,390][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4718264320. Throughput: 0: 42355.7. Samples: 4718363900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 20:51:17,581][15401] Updated weights for policy 0, policy_version 287990 (0.0030) [2024-06-22 20:51:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.8, 300 sec: 42653.9). Total num frames: 4718477312. Throughput: 0: 42380.5. Samples: 4718620620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 20:51:20,958][15401] Updated weights for policy 0, policy_version 288000 (0.0037) [2024-06-22 20:51:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4718657536. Throughput: 0: 42309.6. Samples: 4718749980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 20:51:25,273][15401] Updated weights for policy 0, policy_version 288010 (0.0035) [2024-06-22 20:51:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4718886912. Throughput: 0: 42469.9. Samples: 4719001660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 20:51:29,151][15401] Updated weights for policy 0, policy_version 288020 (0.0033) [2024-06-22 20:51:32,946][15401] Updated weights for policy 0, policy_version 288030 (0.0039) [2024-06-22 20:51:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4719099904. Throughput: 0: 42580.1. Samples: 4719259520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 20:51:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 20:51:36,677][15401] Updated weights for policy 0, policy_version 288040 (0.0029) [2024-06-22 20:51:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 4719296512. Throughput: 0: 42565.3. Samples: 4719390040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:51:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 20:51:40,561][15401] Updated weights for policy 0, policy_version 288050 (0.0035) [2024-06-22 20:51:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 4719542272. Throughput: 0: 42547.6. Samples: 4719639740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:51:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 20:51:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288058_4719542272.pth... [2024-06-22 20:51:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287434_4709318656.pth [2024-06-22 20:51:44,433][15401] Updated weights for policy 0, policy_version 288060 (0.0040) [2024-06-22 20:51:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 4719722496. Throughput: 0: 42662.5. Samples: 4719904540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:51:48,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 20:51:48,430][15401] Updated weights for policy 0, policy_version 288070 (0.0027) [2024-06-22 20:51:51,862][15401] Updated weights for policy 0, policy_version 288080 (0.0029) [2024-06-22 20:51:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4719935488. Throughput: 0: 42472.4. Samples: 4720027200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:51:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 20:51:56,211][15401] Updated weights for policy 0, policy_version 288090 (0.0026) [2024-06-22 20:51:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4720181248. Throughput: 0: 42686.3. Samples: 4720284780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:51:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 20:51:59,444][15401] Updated weights for policy 0, policy_version 288100 (0.0025) [2024-06-22 20:52:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41780.9, 300 sec: 42542.8). Total num frames: 4720361472. Throughput: 0: 42879.1. Samples: 4720550180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 20:52:03,656][15401] Updated weights for policy 0, policy_version 288110 (0.0036) [2024-06-22 20:52:04,396][15349] Signal inference workers to stop experience collection... (69750 times) [2024-06-22 20:52:04,446][15401] InferenceWorker_p0-w0: stopping experience collection (69750 times) [2024-06-22 20:52:04,454][15349] Signal inference workers to resume experience collection... (69750 times) [2024-06-22 20:52:04,461][15401] InferenceWorker_p0-w0: resuming experience collection (69750 times) [2024-06-22 20:52:06,973][15401] Updated weights for policy 0, policy_version 288120 (0.0030) [2024-06-22 20:52:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 4720590848. Throughput: 0: 42647.6. Samples: 4720669120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 20:52:11,111][15401] Updated weights for policy 0, policy_version 288130 (0.0026) [2024-06-22 20:52:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4720803840. Throughput: 0: 42915.6. Samples: 4720932860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-22 20:52:14,500][15401] Updated weights for policy 0, policy_version 288140 (0.0050) [2024-06-22 20:52:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4721016832. Throughput: 0: 42899.7. Samples: 4721190000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 20:52:19,095][15401] Updated weights for policy 0, policy_version 288150 (0.0030) [2024-06-22 20:52:22,022][15401] Updated weights for policy 0, policy_version 288160 (0.0047) [2024-06-22 20:52:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4721229824. Throughput: 0: 42812.5. Samples: 4721316600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 20:52:26,623][15401] Updated weights for policy 0, policy_version 288170 (0.0023) [2024-06-22 20:52:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4721442816. Throughput: 0: 42995.6. Samples: 4721574540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 20:52:29,872][15401] Updated weights for policy 0, policy_version 288180 (0.0038) [2024-06-22 20:52:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4721639424. Throughput: 0: 42892.8. Samples: 4721834720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 20:52:34,172][15401] Updated weights for policy 0, policy_version 288190 (0.0040) [2024-06-22 20:52:37,799][15401] Updated weights for policy 0, policy_version 288200 (0.0031) [2024-06-22 20:52:38,390][15132] Fps is (10 sec: 44233.7, 60 sec: 43144.1, 300 sec: 42653.8). Total num frames: 4721885184. Throughput: 0: 42912.3. Samples: 4721958280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:38,391][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:52:41,723][15401] Updated weights for policy 0, policy_version 288210 (0.0028) [2024-06-22 20:52:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4722081792. Throughput: 0: 42925.7. Samples: 4722216440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 20:52:45,430][15401] Updated weights for policy 0, policy_version 288220 (0.0054) [2024-06-22 20:52:48,389][15132] Fps is (10 sec: 40962.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4722294784. Throughput: 0: 42810.8. Samples: 4722476660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-22 20:52:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 20:52:49,293][15401] Updated weights for policy 0, policy_version 288230 (0.0027) [2024-06-22 20:52:52,877][15401] Updated weights for policy 0, policy_version 288240 (0.0031) [2024-06-22 20:52:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 4722540544. Throughput: 0: 43013.7. Samples: 4722604740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:52:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 20:52:56,877][15401] Updated weights for policy 0, policy_version 288250 (0.0037) [2024-06-22 20:52:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 4722753536. Throughput: 0: 42802.0. Samples: 4722858960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:52:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 20:53:00,359][15401] Updated weights for policy 0, policy_version 288260 (0.0041) [2024-06-22 20:53:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42543.2). Total num frames: 4722950144. Throughput: 0: 42968.0. Samples: 4723123560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 20:53:04,399][15401] Updated weights for policy 0, policy_version 288270 (0.0028) [2024-06-22 20:53:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 4723163136. Throughput: 0: 42786.2. Samples: 4723241980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:08,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 20:53:08,517][15401] Updated weights for policy 0, policy_version 288280 (0.0034) [2024-06-22 20:53:12,103][15401] Updated weights for policy 0, policy_version 288290 (0.0032) [2024-06-22 20:53:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 4723376128. Throughput: 0: 42710.6. Samples: 4723496520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 20:53:16,223][15401] Updated weights for policy 0, policy_version 288300 (0.0031) [2024-06-22 20:53:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4723589120. Throughput: 0: 42617.7. Samples: 4723752520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 20:53:20,047][15349] Signal inference workers to stop experience collection... (69800 times) [2024-06-22 20:53:20,047][15349] Signal inference workers to resume experience collection... (69800 times) [2024-06-22 20:53:20,101][15401] InferenceWorker_p0-w0: stopping experience collection (69800 times) [2024-06-22 20:53:20,101][15401] InferenceWorker_p0-w0: resuming experience collection (69800 times) [2024-06-22 20:53:20,186][15401] Updated weights for policy 0, policy_version 288310 (0.0038) [2024-06-22 20:53:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4723802112. Throughput: 0: 42692.1. Samples: 4723879400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 20:53:23,693][15401] Updated weights for policy 0, policy_version 288320 (0.0036) [2024-06-22 20:53:27,777][15401] Updated weights for policy 0, policy_version 288330 (0.0043) [2024-06-22 20:53:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4724015104. Throughput: 0: 42732.0. Samples: 4724139380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:28,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 20:53:31,581][15401] Updated weights for policy 0, policy_version 288340 (0.0026) [2024-06-22 20:53:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.5). Total num frames: 4724228096. Throughput: 0: 42755.1. Samples: 4724400640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 20:53:35,211][15401] Updated weights for policy 0, policy_version 288350 (0.0039) [2024-06-22 20:53:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42870.2, 300 sec: 42653.6). Total num frames: 4724457472. Throughput: 0: 42594.2. Samples: 4724521580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:38,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 20:53:38,929][15401] Updated weights for policy 0, policy_version 288360 (0.0026) [2024-06-22 20:53:42,940][15401] Updated weights for policy 0, policy_version 288370 (0.0032) [2024-06-22 20:53:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42598.6). Total num frames: 4724670464. Throughput: 0: 42908.2. Samples: 4724789820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 20:53:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288372_4724686848.pth... [2024-06-22 20:53:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000287746_4714430464.pth [2024-06-22 20:53:46,768][15401] Updated weights for policy 0, policy_version 288380 (0.0040) [2024-06-22 20:53:48,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 4724867072. Throughput: 0: 42573.6. Samples: 4725039380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 20:53:50,576][15401] Updated weights for policy 0, policy_version 288390 (0.0035) [2024-06-22 20:53:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4725096448. Throughput: 0: 42784.5. Samples: 4725167280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 20:53:54,435][15401] Updated weights for policy 0, policy_version 288400 (0.0029) [2024-06-22 20:53:58,277][15401] Updated weights for policy 0, policy_version 288410 (0.0024) [2024-06-22 20:53:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4725309440. Throughput: 0: 42883.6. Samples: 4725426280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:53:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 20:54:01,990][15401] Updated weights for policy 0, policy_version 288420 (0.0037) [2024-06-22 20:54:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4725506048. Throughput: 0: 42831.6. Samples: 4725679940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 20:54:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 20:54:05,852][15401] Updated weights for policy 0, policy_version 288430 (0.0040) [2024-06-22 20:54:08,392][15132] Fps is (10 sec: 40951.1, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 4725719040. Throughput: 0: 42822.9. Samples: 4725806520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:08,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 20:54:09,629][15401] Updated weights for policy 0, policy_version 288440 (0.0034) [2024-06-22 20:54:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 4725948416. Throughput: 0: 42850.7. Samples: 4726067660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 20:54:13,783][15401] Updated weights for policy 0, policy_version 288450 (0.0030) [2024-06-22 20:54:17,301][15401] Updated weights for policy 0, policy_version 288460 (0.0042) [2024-06-22 20:54:18,391][15132] Fps is (10 sec: 42602.3, 60 sec: 42597.6, 300 sec: 42598.2). Total num frames: 4726145024. Throughput: 0: 42536.6. Samples: 4726314840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:18,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 20:54:21,360][15401] Updated weights for policy 0, policy_version 288470 (0.0025) [2024-06-22 20:54:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4726341632. Throughput: 0: 42611.5. Samples: 4726439000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 20:54:25,387][15401] Updated weights for policy 0, policy_version 288480 (0.0034) [2024-06-22 20:54:28,389][15132] Fps is (10 sec: 40965.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4726554624. Throughput: 0: 42346.3. Samples: 4726695400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 20:54:29,026][15401] Updated weights for policy 0, policy_version 288490 (0.0032) [2024-06-22 20:54:32,927][15401] Updated weights for policy 0, policy_version 288500 (0.0038) [2024-06-22 20:54:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 4726800384. Throughput: 0: 42384.9. Samples: 4726946700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 20:54:37,170][15401] Updated weights for policy 0, policy_version 288510 (0.0039) [2024-06-22 20:54:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 4726996992. Throughput: 0: 42491.4. Samples: 4727079400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:38,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-22 20:54:40,426][15401] Updated weights for policy 0, policy_version 288520 (0.0034) [2024-06-22 20:54:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42487.7). Total num frames: 4727209984. Throughput: 0: 42544.4. Samples: 4727340780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:43,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-22 20:54:44,740][15401] Updated weights for policy 0, policy_version 288530 (0.0026) [2024-06-22 20:54:47,064][15349] Signal inference workers to stop experience collection... (69850 times) [2024-06-22 20:54:47,064][15349] Signal inference workers to resume experience collection... (69850 times) [2024-06-22 20:54:47,111][15401] InferenceWorker_p0-w0: stopping experience collection (69850 times) [2024-06-22 20:54:47,111][15401] InferenceWorker_p0-w0: resuming experience collection (69850 times) [2024-06-22 20:54:47,923][15401] Updated weights for policy 0, policy_version 288540 (0.0042) [2024-06-22 20:54:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 4727439360. Throughput: 0: 42556.0. Samples: 4727595060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:48,393][15132] Avg episode reward: [(0, '0.192')] [2024-06-22 20:54:52,247][15401] Updated weights for policy 0, policy_version 288550 (0.0033) [2024-06-22 20:54:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4727635968. Throughput: 0: 42686.1. Samples: 4727727300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 20:54:55,524][15401] Updated weights for policy 0, policy_version 288560 (0.0033) [2024-06-22 20:54:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4727865344. Throughput: 0: 42666.7. Samples: 4727987660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:54:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 20:54:59,828][15401] Updated weights for policy 0, policy_version 288570 (0.0032) [2024-06-22 20:55:03,172][15401] Updated weights for policy 0, policy_version 288580 (0.0044) [2024-06-22 20:55:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4728094720. Throughput: 0: 42782.4. Samples: 4728240000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:55:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 20:55:07,501][15401] Updated weights for policy 0, policy_version 288590 (0.0042) [2024-06-22 20:55:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42598.2, 300 sec: 42653.6). Total num frames: 4728274944. Throughput: 0: 42968.5. Samples: 4728372680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:55:08,392][15132] Avg episode reward: [(0, '0.338')] [2024-06-22 20:55:10,693][15401] Updated weights for policy 0, policy_version 288600 (0.0037) [2024-06-22 20:55:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4728520704. Throughput: 0: 42934.1. Samples: 4728627440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-22 20:55:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 20:55:15,218][15401] Updated weights for policy 0, policy_version 288610 (0.0040) [2024-06-22 20:55:18,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43145.5, 300 sec: 42709.5). Total num frames: 4728733696. Throughput: 0: 42947.2. Samples: 4728879320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 20:55:18,426][15401] Updated weights for policy 0, policy_version 288620 (0.0042) [2024-06-22 20:55:22,915][15401] Updated weights for policy 0, policy_version 288630 (0.0041) [2024-06-22 20:55:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4728913920. Throughput: 0: 42855.1. Samples: 4729007880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:23,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-22 20:55:25,990][15401] Updated weights for policy 0, policy_version 288640 (0.0037) [2024-06-22 20:55:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 4729143296. Throughput: 0: 42787.6. Samples: 4729266220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 20:55:30,798][15401] Updated weights for policy 0, policy_version 288650 (0.0029) [2024-06-22 20:55:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4729372672. Throughput: 0: 42826.3. Samples: 4729522140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 20:55:33,651][15401] Updated weights for policy 0, policy_version 288660 (0.0030) [2024-06-22 20:55:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 4729552896. Throughput: 0: 42699.0. Samples: 4729648860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:38,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 20:55:38,462][15401] Updated weights for policy 0, policy_version 288670 (0.0028) [2024-06-22 20:55:41,403][15401] Updated weights for policy 0, policy_version 288680 (0.0040) [2024-06-22 20:55:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4729782272. Throughput: 0: 42576.9. Samples: 4729903620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 20:55:43,641][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288685_4729815040.pth... [2024-06-22 20:55:43,697][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288058_4719542272.pth [2024-06-22 20:55:46,207][15401] Updated weights for policy 0, policy_version 288690 (0.0035) [2024-06-22 20:55:48,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4729995264. Throughput: 0: 42749.8. Samples: 4730163740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 20:55:49,128][15401] Updated weights for policy 0, policy_version 288700 (0.0038) [2024-06-22 20:55:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4730191872. Throughput: 0: 42502.6. Samples: 4730285200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 20:55:53,925][15401] Updated weights for policy 0, policy_version 288710 (0.0034) [2024-06-22 20:55:56,743][15401] Updated weights for policy 0, policy_version 288720 (0.0028) [2024-06-22 20:55:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 4730437632. Throughput: 0: 42429.3. Samples: 4730536760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:55:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 20:56:01,412][15349] Signal inference workers to stop experience collection... (69900 times) [2024-06-22 20:56:01,467][15401] InferenceWorker_p0-w0: stopping experience collection (69900 times) [2024-06-22 20:56:01,476][15349] Signal inference workers to resume experience collection... (69900 times) [2024-06-22 20:56:01,483][15401] InferenceWorker_p0-w0: resuming experience collection (69900 times) [2024-06-22 20:56:01,618][15401] Updated weights for policy 0, policy_version 288730 (0.0042) [2024-06-22 20:56:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4730617856. Throughput: 0: 42696.8. Samples: 4730800680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 20:56:04,364][15401] Updated weights for policy 0, policy_version 288740 (0.0038) [2024-06-22 20:56:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 4730830848. Throughput: 0: 42505.8. Samples: 4730920640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:08,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 20:56:09,451][15401] Updated weights for policy 0, policy_version 288750 (0.0031) [2024-06-22 20:56:12,443][15401] Updated weights for policy 0, policy_version 288760 (0.0041) [2024-06-22 20:56:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4731060224. Throughput: 0: 42357.7. Samples: 4731172320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-22 20:56:17,166][15401] Updated weights for policy 0, policy_version 288770 (0.0031) [2024-06-22 20:56:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 4731240448. Throughput: 0: 42573.4. Samples: 4731437940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 20:56:20,306][15401] Updated weights for policy 0, policy_version 288780 (0.0024) [2024-06-22 20:56:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4731486208. Throughput: 0: 42365.3. Samples: 4731555200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 20:56:25,098][15401] Updated weights for policy 0, policy_version 288790 (0.0034) [2024-06-22 20:56:28,059][15401] Updated weights for policy 0, policy_version 288800 (0.0024) [2024-06-22 20:56:28,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4731715584. Throughput: 0: 42481.7. Samples: 4731815300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-22 20:56:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 20:56:32,657][15401] Updated weights for policy 0, policy_version 288810 (0.0028) [2024-06-22 20:56:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 4731879424. Throughput: 0: 42587.2. Samples: 4732080160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:33,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 20:56:35,674][15401] Updated weights for policy 0, policy_version 288820 (0.0030) [2024-06-22 20:56:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 4732125184. Throughput: 0: 42439.7. Samples: 4732194980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 20:56:40,659][15401] Updated weights for policy 0, policy_version 288830 (0.0045) [2024-06-22 20:56:43,380][15401] Updated weights for policy 0, policy_version 288840 (0.0040) [2024-06-22 20:56:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4732354560. Throughput: 0: 42612.9. Samples: 4732454340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 20:56:48,234][15401] Updated weights for policy 0, policy_version 288850 (0.0047) [2024-06-22 20:56:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4732518400. Throughput: 0: 42543.6. Samples: 4732715140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 20:56:51,113][15401] Updated weights for policy 0, policy_version 288860 (0.0041) [2024-06-22 20:56:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4732780544. Throughput: 0: 42475.6. Samples: 4732832040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 20:56:56,017][15401] Updated weights for policy 0, policy_version 288870 (0.0029) [2024-06-22 20:56:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4732977152. Throughput: 0: 42760.1. Samples: 4733096520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:56:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 20:56:59,043][15401] Updated weights for policy 0, policy_version 288880 (0.0043) [2024-06-22 20:57:03,393][15132] Fps is (10 sec: 37670.9, 60 sec: 42323.0, 300 sec: 42597.9). Total num frames: 4733157376. Throughput: 0: 42551.4. Samples: 4733352900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:03,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 20:57:03,549][15401] Updated weights for policy 0, policy_version 288890 (0.0032) [2024-06-22 20:57:06,663][15401] Updated weights for policy 0, policy_version 288900 (0.0035) [2024-06-22 20:57:07,894][15349] Signal inference workers to stop experience collection... (69950 times) [2024-06-22 20:57:07,894][15349] Signal inference workers to resume experience collection... (69950 times) [2024-06-22 20:57:07,924][15401] InferenceWorker_p0-w0: stopping experience collection (69950 times) [2024-06-22 20:57:07,924][15401] InferenceWorker_p0-w0: resuming experience collection (69950 times) [2024-06-22 20:57:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4733419520. Throughput: 0: 42603.1. Samples: 4733472340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 20:57:11,135][15401] Updated weights for policy 0, policy_version 288910 (0.0036) [2024-06-22 20:57:13,389][15132] Fps is (10 sec: 44251.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4733599744. Throughput: 0: 42535.7. Samples: 4733729400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:13,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-22 20:57:14,413][15401] Updated weights for policy 0, policy_version 288920 (0.0043) [2024-06-22 20:57:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4733796352. Throughput: 0: 42204.0. Samples: 4733979340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 20:57:19,241][15401] Updated weights for policy 0, policy_version 288930 (0.0035) [2024-06-22 20:57:22,174][15401] Updated weights for policy 0, policy_version 288940 (0.0039) [2024-06-22 20:57:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4734042112. Throughput: 0: 42459.8. Samples: 4734105680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 20:57:27,122][15401] Updated weights for policy 0, policy_version 288950 (0.0043) [2024-06-22 20:57:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 4734238720. Throughput: 0: 42482.3. Samples: 4734366040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 20:57:29,838][15401] Updated weights for policy 0, policy_version 288960 (0.0038) [2024-06-22 20:57:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.5). Total num frames: 4734451712. Throughput: 0: 42193.7. Samples: 4734613860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 20:57:34,558][15401] Updated weights for policy 0, policy_version 288970 (0.0026) [2024-06-22 20:57:37,540][15401] Updated weights for policy 0, policy_version 288980 (0.0047) [2024-06-22 20:57:38,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 4734681088. Throughput: 0: 42535.9. Samples: 4734746160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-22 20:57:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 20:57:42,036][15401] Updated weights for policy 0, policy_version 288990 (0.0032) [2024-06-22 20:57:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 4734877696. Throughput: 0: 42362.2. Samples: 4735002820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:57:43,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-22 20:57:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288994_4734877696.pth... [2024-06-22 20:57:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288372_4724686848.pth [2024-06-22 20:57:45,193][15401] Updated weights for policy 0, policy_version 289000 (0.0034) [2024-06-22 20:57:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 4735090688. Throughput: 0: 42387.8. Samples: 4735260220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:57:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 20:57:49,562][15401] Updated weights for policy 0, policy_version 289010 (0.0022) [2024-06-22 20:57:52,765][15401] Updated weights for policy 0, policy_version 289020 (0.0038) [2024-06-22 20:57:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 4735320064. Throughput: 0: 42659.3. Samples: 4735392000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:57:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 20:57:57,203][15401] Updated weights for policy 0, policy_version 289030 (0.0045) [2024-06-22 20:57:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4735516672. Throughput: 0: 42704.0. Samples: 4735651080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:57:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 20:58:00,428][15401] Updated weights for policy 0, policy_version 289040 (0.0027) [2024-06-22 20:58:03,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43146.8, 300 sec: 42653.9). Total num frames: 4735746048. Throughput: 0: 42881.1. Samples: 4735909000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 20:58:04,633][15401] Updated weights for policy 0, policy_version 289050 (0.0029) [2024-06-22 20:58:08,030][15401] Updated weights for policy 0, policy_version 289060 (0.0025) [2024-06-22 20:58:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4735959040. Throughput: 0: 43055.2. Samples: 4736043160. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 20:58:12,074][15401] Updated weights for policy 0, policy_version 289070 (0.0028) [2024-06-22 20:58:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4736172032. Throughput: 0: 42947.0. Samples: 4736298660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 20:58:15,586][15401] Updated weights for policy 0, policy_version 289080 (0.0034) [2024-06-22 20:58:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4736385024. Throughput: 0: 43129.8. Samples: 4736554700. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:18,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 20:58:19,691][15401] Updated weights for policy 0, policy_version 289090 (0.0034) [2024-06-22 20:58:23,320][15401] Updated weights for policy 0, policy_version 289100 (0.0028) [2024-06-22 20:58:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4736614400. Throughput: 0: 43035.8. Samples: 4736682760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 20:58:27,320][15401] Updated weights for policy 0, policy_version 289110 (0.0038) [2024-06-22 20:58:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4736794624. Throughput: 0: 43070.1. Samples: 4736940980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 20:58:30,931][15401] Updated weights for policy 0, policy_version 289120 (0.0026) [2024-06-22 20:58:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 4737024000. Throughput: 0: 43021.4. Samples: 4737196180. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:33,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-22 20:58:35,318][15401] Updated weights for policy 0, policy_version 289130 (0.0030) [2024-06-22 20:58:37,059][15349] Signal inference workers to stop experience collection... (70000 times) [2024-06-22 20:58:37,060][15349] Signal inference workers to resume experience collection... (70000 times) [2024-06-22 20:58:37,113][15401] InferenceWorker_p0-w0: stopping experience collection (70000 times) [2024-06-22 20:58:37,113][15401] InferenceWorker_p0-w0: resuming experience collection (70000 times) [2024-06-22 20:58:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4737236992. Throughput: 0: 42864.7. Samples: 4737320920. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 20:58:38,610][15401] Updated weights for policy 0, policy_version 289140 (0.0037) [2024-06-22 20:58:42,832][15401] Updated weights for policy 0, policy_version 289150 (0.0036) [2024-06-22 20:58:43,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 4737449984. Throughput: 0: 42907.0. Samples: 4737582000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:43,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 20:58:46,267][15401] Updated weights for policy 0, policy_version 289160 (0.0022) [2024-06-22 20:58:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 4737679360. Throughput: 0: 42759.2. Samples: 4737833160. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 20:58:50,460][15401] Updated weights for policy 0, policy_version 289170 (0.0033) [2024-06-22 20:58:53,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4737875968. Throughput: 0: 42690.2. Samples: 4737964220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-22 20:58:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 20:58:53,971][15401] Updated weights for policy 0, policy_version 289180 (0.0035) [2024-06-22 20:58:58,363][15401] Updated weights for policy 0, policy_version 289190 (0.0034) [2024-06-22 20:58:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4738088960. Throughput: 0: 42697.3. Samples: 4738220040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:58:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 20:59:01,721][15401] Updated weights for policy 0, policy_version 289200 (0.0035) [2024-06-22 20:59:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 4738334720. Throughput: 0: 42461.7. Samples: 4738465480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:03,390][15132] Avg episode reward: [(0, '0.138')] [2024-06-22 20:59:06,158][15401] Updated weights for policy 0, policy_version 289210 (0.0044) [2024-06-22 20:59:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4738514944. Throughput: 0: 42693.6. Samples: 4738603980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 20:59:09,371][15401] Updated weights for policy 0, policy_version 289220 (0.0045) [2024-06-22 20:59:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42598.6). Total num frames: 4738711552. Throughput: 0: 42609.3. Samples: 4738858400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 20:59:13,807][15401] Updated weights for policy 0, policy_version 289230 (0.0037) [2024-06-22 20:59:17,161][15401] Updated weights for policy 0, policy_version 289240 (0.0036) [2024-06-22 20:59:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4738957312. Throughput: 0: 42461.8. Samples: 4739106960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 20:59:21,347][15401] Updated weights for policy 0, policy_version 289250 (0.0022) [2024-06-22 20:59:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4739153920. Throughput: 0: 42708.5. Samples: 4739242800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 20:59:24,865][15401] Updated weights for policy 0, policy_version 289260 (0.0038) [2024-06-22 20:59:28,395][15132] Fps is (10 sec: 40938.4, 60 sec: 42867.7, 300 sec: 42597.6). Total num frames: 4739366912. Throughput: 0: 42496.8. Samples: 4739494480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:28,395][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 20:59:28,927][15401] Updated weights for policy 0, policy_version 289270 (0.0020) [2024-06-22 20:59:32,473][15401] Updated weights for policy 0, policy_version 289280 (0.0050) [2024-06-22 20:59:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4739579904. Throughput: 0: 42646.3. Samples: 4739752240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 20:59:36,506][15401] Updated weights for policy 0, policy_version 289290 (0.0039) [2024-06-22 20:59:38,390][15132] Fps is (10 sec: 42621.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4739792896. Throughput: 0: 42574.2. Samples: 4739880060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 20:59:40,227][15401] Updated weights for policy 0, policy_version 289300 (0.0031) [2024-06-22 20:59:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4740022272. Throughput: 0: 42473.7. Samples: 4740131460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:43,392][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 20:59:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289308_4740022272.pth... [2024-06-22 20:59:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288685_4729815040.pth [2024-06-22 20:59:44,337][15401] Updated weights for policy 0, policy_version 289310 (0.0025) [2024-06-22 20:59:47,806][15401] Updated weights for policy 0, policy_version 289320 (0.0029) [2024-06-22 20:59:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4740235264. Throughput: 0: 42645.9. Samples: 4740384540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 20:59:51,796][15349] Signal inference workers to stop experience collection... (70050 times) [2024-06-22 20:59:51,796][15349] Signal inference workers to resume experience collection... (70050 times) [2024-06-22 20:59:51,842][15401] InferenceWorker_p0-w0: stopping experience collection (70050 times) [2024-06-22 20:59:51,842][15401] InferenceWorker_p0-w0: resuming experience collection (70050 times) [2024-06-22 20:59:51,934][15401] Updated weights for policy 0, policy_version 289330 (0.0036) [2024-06-22 20:59:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4740415488. Throughput: 0: 42453.0. Samples: 4740514360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 20:59:55,347][15401] Updated weights for policy 0, policy_version 289340 (0.0036) [2024-06-22 20:59:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4740661248. Throughput: 0: 42566.2. Samples: 4740773880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 20:59:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 20:59:59,534][15401] Updated weights for policy 0, policy_version 289350 (0.0034) [2024-06-22 21:00:02,925][15401] Updated weights for policy 0, policy_version 289360 (0.0047) [2024-06-22 21:00:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 4740874240. Throughput: 0: 42534.7. Samples: 4741021020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:00:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 21:00:07,074][15401] Updated weights for policy 0, policy_version 289370 (0.0023) [2024-06-22 21:00:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4741070848. Throughput: 0: 42399.1. Samples: 4741150760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:00:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 21:00:10,984][15401] Updated weights for policy 0, policy_version 289380 (0.0056) [2024-06-22 21:00:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 4741300224. Throughput: 0: 42649.2. Samples: 4741413460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 21:00:15,087][15401] Updated weights for policy 0, policy_version 289390 (0.0038) [2024-06-22 21:00:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 4741513216. Throughput: 0: 42509.4. Samples: 4741665160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 21:00:18,666][15401] Updated weights for policy 0, policy_version 289400 (0.0028) [2024-06-22 21:00:22,691][15401] Updated weights for policy 0, policy_version 289410 (0.0025) [2024-06-22 21:00:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4741693440. Throughput: 0: 42478.6. Samples: 4741791600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 21:00:26,270][15401] Updated weights for policy 0, policy_version 289420 (0.0031) [2024-06-22 21:00:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42875.4, 300 sec: 42598.4). Total num frames: 4741939200. Throughput: 0: 42615.7. Samples: 4742049060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 21:00:30,429][15401] Updated weights for policy 0, policy_version 289430 (0.0025) [2024-06-22 21:00:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 4742152192. Throughput: 0: 42730.1. Samples: 4742307400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 21:00:33,954][15401] Updated weights for policy 0, policy_version 289440 (0.0038) [2024-06-22 21:00:38,067][15401] Updated weights for policy 0, policy_version 289450 (0.0034) [2024-06-22 21:00:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 4742348800. Throughput: 0: 42692.8. Samples: 4742435640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:38,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:00:41,604][15401] Updated weights for policy 0, policy_version 289460 (0.0021) [2024-06-22 21:00:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 4742561792. Throughput: 0: 42694.8. Samples: 4742695140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 21:00:46,106][15401] Updated weights for policy 0, policy_version 289470 (0.0029) [2024-06-22 21:00:48,390][15132] Fps is (10 sec: 44246.6, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 4742791168. Throughput: 0: 42857.2. Samples: 4742949600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 21:00:49,719][15401] Updated weights for policy 0, policy_version 289480 (0.0034) [2024-06-22 21:00:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4742987776. Throughput: 0: 42894.2. Samples: 4743081000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 21:00:53,543][15401] Updated weights for policy 0, policy_version 289490 (0.0034) [2024-06-22 21:00:57,253][15401] Updated weights for policy 0, policy_version 289500 (0.0030) [2024-06-22 21:00:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4743217152. Throughput: 0: 42875.6. Samples: 4743342860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:00:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 21:01:01,002][15401] Updated weights for policy 0, policy_version 289510 (0.0037) [2024-06-22 21:01:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4743446528. Throughput: 0: 43032.1. Samples: 4743601620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:01:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 21:01:04,779][15401] Updated weights for policy 0, policy_version 289520 (0.0028) [2024-06-22 21:01:08,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 4743626752. Throughput: 0: 43062.1. Samples: 4743729400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:01:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 21:01:08,484][15349] Signal inference workers to stop experience collection... (70100 times) [2024-06-22 21:01:08,485][15349] Signal inference workers to resume experience collection... (70100 times) [2024-06-22 21:01:08,519][15401] InferenceWorker_p0-w0: stopping experience collection (70100 times) [2024-06-22 21:01:08,519][15401] InferenceWorker_p0-w0: resuming experience collection (70100 times) [2024-06-22 21:01:08,622][15401] Updated weights for policy 0, policy_version 289530 (0.0031) [2024-06-22 21:01:12,243][15401] Updated weights for policy 0, policy_version 289540 (0.0038) [2024-06-22 21:01:13,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4743872512. Throughput: 0: 43081.8. Samples: 4743987740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:01:13,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 21:01:16,444][15401] Updated weights for policy 0, policy_version 289550 (0.0036) [2024-06-22 21:01:18,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 4744085504. Throughput: 0: 42927.5. Samples: 4744239140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:01:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 21:01:20,023][15401] Updated weights for policy 0, policy_version 289560 (0.0032) [2024-06-22 21:01:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4744282112. Throughput: 0: 42957.3. Samples: 4744368620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 21:01:24,297][15401] Updated weights for policy 0, policy_version 289570 (0.0040) [2024-06-22 21:01:27,669][15401] Updated weights for policy 0, policy_version 289580 (0.0051) [2024-06-22 21:01:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4744495104. Throughput: 0: 42913.6. Samples: 4744626260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 21:01:31,794][15401] Updated weights for policy 0, policy_version 289590 (0.0036) [2024-06-22 21:01:33,390][15132] Fps is (10 sec: 45873.9, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 4744740864. Throughput: 0: 42931.8. Samples: 4744881540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 21:01:35,310][15401] Updated weights for policy 0, policy_version 289600 (0.0027) [2024-06-22 21:01:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 4744937472. Throughput: 0: 42978.2. Samples: 4745015020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 21:01:39,251][15401] Updated weights for policy 0, policy_version 289610 (0.0031) [2024-06-22 21:01:42,797][15401] Updated weights for policy 0, policy_version 289620 (0.0030) [2024-06-22 21:01:43,390][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4745134080. Throughput: 0: 42793.6. Samples: 4745268580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 21:01:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289620_4745134080.pth... [2024-06-22 21:01:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000288994_4734877696.pth [2024-06-22 21:01:46,975][15401] Updated weights for policy 0, policy_version 289630 (0.0032) [2024-06-22 21:01:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4745363456. Throughput: 0: 42808.2. Samples: 4745527980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:48,390][15132] Avg episode reward: [(0, '0.110')] [2024-06-22 21:01:50,224][15401] Updated weights for policy 0, policy_version 289640 (0.0036) [2024-06-22 21:01:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4745576448. Throughput: 0: 42924.7. Samples: 4745661000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 21:01:54,476][15401] Updated weights for policy 0, policy_version 289650 (0.0033) [2024-06-22 21:01:57,976][15401] Updated weights for policy 0, policy_version 289660 (0.0026) [2024-06-22 21:01:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42820.7). Total num frames: 4745789440. Throughput: 0: 42719.9. Samples: 4745910240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:01:58,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 21:02:01,953][15401] Updated weights for policy 0, policy_version 289670 (0.0035) [2024-06-22 21:02:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4745986048. Throughput: 0: 42873.0. Samples: 4746168420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 21:02:06,179][15401] Updated weights for policy 0, policy_version 289680 (0.0029) [2024-06-22 21:02:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 4746215424. Throughput: 0: 42888.1. Samples: 4746298580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 21:02:09,395][15401] Updated weights for policy 0, policy_version 289690 (0.0036) [2024-06-22 21:02:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4746428416. Throughput: 0: 42827.6. Samples: 4746553500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:13,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 21:02:13,851][15401] Updated weights for policy 0, policy_version 289700 (0.0032) [2024-06-22 21:02:17,440][15401] Updated weights for policy 0, policy_version 289710 (0.0032) [2024-06-22 21:02:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4746657792. Throughput: 0: 42939.8. Samples: 4746813820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 21:02:21,438][15401] Updated weights for policy 0, policy_version 289720 (0.0038) [2024-06-22 21:02:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4746870784. Throughput: 0: 42830.2. Samples: 4746942380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 21:02:25,231][15401] Updated weights for policy 0, policy_version 289730 (0.0037) [2024-06-22 21:02:27,793][15349] Signal inference workers to stop experience collection... (70150 times) [2024-06-22 21:02:27,799][15349] Signal inference workers to resume experience collection... (70150 times) [2024-06-22 21:02:27,811][15401] InferenceWorker_p0-w0: stopping experience collection (70150 times) [2024-06-22 21:02:27,844][15401] InferenceWorker_p0-w0: resuming experience collection (70150 times) [2024-06-22 21:02:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4747083776. Throughput: 0: 42941.4. Samples: 4747200940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 21:02:28,984][15401] Updated weights for policy 0, policy_version 289740 (0.0036) [2024-06-22 21:02:32,678][15401] Updated weights for policy 0, policy_version 289750 (0.0031) [2024-06-22 21:02:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 4747313152. Throughput: 0: 42887.5. Samples: 4747457920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-22 21:02:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 21:02:36,635][15401] Updated weights for policy 0, policy_version 289760 (0.0030) [2024-06-22 21:02:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4747526144. Throughput: 0: 42901.8. Samples: 4747591580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:02:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 21:02:40,248][15401] Updated weights for policy 0, policy_version 289770 (0.0039) [2024-06-22 21:02:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 4747722752. Throughput: 0: 43173.5. Samples: 4747852940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:02:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 21:02:44,088][15401] Updated weights for policy 0, policy_version 289780 (0.0027) [2024-06-22 21:02:47,714][15401] Updated weights for policy 0, policy_version 289790 (0.0029) [2024-06-22 21:02:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4747952128. Throughput: 0: 43020.1. Samples: 4748104320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:02:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 21:02:51,619][15401] Updated weights for policy 0, policy_version 289800 (0.0033) [2024-06-22 21:02:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4748165120. Throughput: 0: 43073.7. Samples: 4748236900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:02:53,391][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 21:02:55,086][15401] Updated weights for policy 0, policy_version 289810 (0.0029) [2024-06-22 21:02:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4748361728. Throughput: 0: 43224.9. Samples: 4748498620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:02:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 21:02:58,995][15401] Updated weights for policy 0, policy_version 289820 (0.0030) [2024-06-22 21:03:02,853][15401] Updated weights for policy 0, policy_version 289830 (0.0043) [2024-06-22 21:03:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 4748591104. Throughput: 0: 43070.6. Samples: 4748752000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:03,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-22 21:03:06,667][15401] Updated weights for policy 0, policy_version 289840 (0.0030) [2024-06-22 21:03:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4748804096. Throughput: 0: 43170.7. Samples: 4748885060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 21:03:10,316][15401] Updated weights for policy 0, policy_version 289850 (0.0036) [2024-06-22 21:03:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4749017088. Throughput: 0: 43251.0. Samples: 4749147240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 21:03:14,104][15401] Updated weights for policy 0, policy_version 289860 (0.0032) [2024-06-22 21:03:17,798][15401] Updated weights for policy 0, policy_version 289870 (0.0034) [2024-06-22 21:03:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4749246464. Throughput: 0: 43096.1. Samples: 4749397240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:18,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-22 21:03:21,631][15401] Updated weights for policy 0, policy_version 289880 (0.0029) [2024-06-22 21:03:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4749459456. Throughput: 0: 43244.0. Samples: 4749537560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:23,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-22 21:03:25,345][15401] Updated weights for policy 0, policy_version 289890 (0.0029) [2024-06-22 21:03:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4749639680. Throughput: 0: 43161.2. Samples: 4749795200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 21:03:29,129][15401] Updated weights for policy 0, policy_version 289900 (0.0028) [2024-06-22 21:03:32,974][15401] Updated weights for policy 0, policy_version 289910 (0.0049) [2024-06-22 21:03:33,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 4749885440. Throughput: 0: 43234.7. Samples: 4750050160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:33,396][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 21:03:36,812][15401] Updated weights for policy 0, policy_version 289920 (0.0029) [2024-06-22 21:03:38,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 4750098432. Throughput: 0: 43353.2. Samples: 4750187900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:38,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 21:03:40,628][15401] Updated weights for policy 0, policy_version 289930 (0.0042) [2024-06-22 21:03:43,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4750295040. Throughput: 0: 43256.0. Samples: 4750445140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 21:03:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289936_4750311424.pth... [2024-06-22 21:03:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289308_4740022272.pth [2024-06-22 21:03:44,494][15401] Updated weights for policy 0, policy_version 289940 (0.0044) [2024-06-22 21:03:48,220][15401] Updated weights for policy 0, policy_version 289950 (0.0039) [2024-06-22 21:03:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4750540800. Throughput: 0: 43244.1. Samples: 4750697980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:03:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 21:03:51,486][15349] Signal inference workers to stop experience collection... (70200 times) [2024-06-22 21:03:51,487][15349] Signal inference workers to resume experience collection... (70200 times) [2024-06-22 21:03:51,528][15401] InferenceWorker_p0-w0: stopping experience collection (70200 times) [2024-06-22 21:03:51,528][15401] InferenceWorker_p0-w0: resuming experience collection (70200 times) [2024-06-22 21:03:51,986][15401] Updated weights for policy 0, policy_version 289960 (0.0030) [2024-06-22 21:03:53,396][15132] Fps is (10 sec: 45846.0, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 4750753792. Throughput: 0: 43153.9. Samples: 4750827260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:03:53,396][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 21:03:56,141][15401] Updated weights for policy 0, policy_version 289970 (0.0033) [2024-06-22 21:03:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4750934016. Throughput: 0: 42991.3. Samples: 4751081840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:03:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 21:03:59,926][15401] Updated weights for policy 0, policy_version 289980 (0.0029) [2024-06-22 21:04:03,390][15132] Fps is (10 sec: 40985.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4751163392. Throughput: 0: 43163.8. Samples: 4751339620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 21:04:03,703][15401] Updated weights for policy 0, policy_version 289990 (0.0026) [2024-06-22 21:04:07,591][15401] Updated weights for policy 0, policy_version 290000 (0.0025) [2024-06-22 21:04:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4751392768. Throughput: 0: 43029.8. Samples: 4751473900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-22 21:04:11,429][15401] Updated weights for policy 0, policy_version 290010 (0.0040) [2024-06-22 21:04:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4751589376. Throughput: 0: 43086.7. Samples: 4751734100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 21:04:15,190][15401] Updated weights for policy 0, policy_version 290020 (0.0042) [2024-06-22 21:04:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4751818752. Throughput: 0: 43009.2. Samples: 4751985300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 21:04:18,975][15401] Updated weights for policy 0, policy_version 290030 (0.0027) [2024-06-22 21:04:22,739][15401] Updated weights for policy 0, policy_version 290040 (0.0032) [2024-06-22 21:04:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42932.4). Total num frames: 4752031744. Throughput: 0: 42874.7. Samples: 4752117160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:23,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 21:04:26,607][15401] Updated weights for policy 0, policy_version 290050 (0.0043) [2024-06-22 21:04:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 4752244736. Throughput: 0: 42921.0. Samples: 4752376580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 21:04:30,359][15401] Updated weights for policy 0, policy_version 290060 (0.0025) [2024-06-22 21:04:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43149.2, 300 sec: 42987.2). Total num frames: 4752474112. Throughput: 0: 43036.4. Samples: 4752634620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 21:04:34,130][15401] Updated weights for policy 0, policy_version 290070 (0.0033) [2024-06-22 21:04:38,010][15401] Updated weights for policy 0, policy_version 290080 (0.0041) [2024-06-22 21:04:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 4752670720. Throughput: 0: 42974.5. Samples: 4752760840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 21:04:41,721][15401] Updated weights for policy 0, policy_version 290090 (0.0031) [2024-06-22 21:04:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 4752916480. Throughput: 0: 43163.1. Samples: 4753024180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 21:04:45,877][15401] Updated weights for policy 0, policy_version 290100 (0.0030) [2024-06-22 21:04:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 4753129472. Throughput: 0: 42986.4. Samples: 4753274000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:48,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 21:04:49,430][15401] Updated weights for policy 0, policy_version 290110 (0.0037) [2024-06-22 21:04:53,326][15401] Updated weights for policy 0, policy_version 290120 (0.0041) [2024-06-22 21:04:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 4753326080. Throughput: 0: 42922.7. Samples: 4753405420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 21:04:56,905][15401] Updated weights for policy 0, policy_version 290130 (0.0027) [2024-06-22 21:04:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4753539072. Throughput: 0: 42953.7. Samples: 4753667020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 21:04:58,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 21:05:01,168][15401] Updated weights for policy 0, policy_version 290140 (0.0030) [2024-06-22 21:05:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 4753768448. Throughput: 0: 43013.4. Samples: 4753920900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 21:05:04,633][15401] Updated weights for policy 0, policy_version 290150 (0.0034) [2024-06-22 21:05:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4753965056. Throughput: 0: 43028.0. Samples: 4754053420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 21:05:08,691][15401] Updated weights for policy 0, policy_version 290160 (0.0031) [2024-06-22 21:05:12,323][15401] Updated weights for policy 0, policy_version 290170 (0.0031) [2024-06-22 21:05:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4754178048. Throughput: 0: 43027.5. Samples: 4754312820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 21:05:16,052][15401] Updated weights for policy 0, policy_version 290180 (0.0045) [2024-06-22 21:05:17,769][15349] Signal inference workers to stop experience collection... (70250 times) [2024-06-22 21:05:17,769][15349] Signal inference workers to resume experience collection... (70250 times) [2024-06-22 21:05:17,793][15401] InferenceWorker_p0-w0: stopping experience collection (70250 times) [2024-06-22 21:05:17,793][15401] InferenceWorker_p0-w0: resuming experience collection (70250 times) [2024-06-22 21:05:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 4754407424. Throughput: 0: 42951.8. Samples: 4754567460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 21:05:19,770][15401] Updated weights for policy 0, policy_version 290190 (0.0028) [2024-06-22 21:05:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4754620416. Throughput: 0: 43098.7. Samples: 4754700280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:23,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 21:05:23,478][15401] Updated weights for policy 0, policy_version 290200 (0.0035) [2024-06-22 21:05:27,319][15401] Updated weights for policy 0, policy_version 290210 (0.0040) [2024-06-22 21:05:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 4754833408. Throughput: 0: 43033.7. Samples: 4754960700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:28,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 21:05:31,027][15401] Updated weights for policy 0, policy_version 290220 (0.0048) [2024-06-22 21:05:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 4755046400. Throughput: 0: 43202.6. Samples: 4755218120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:33,394][15132] Avg episode reward: [(0, '0.819')] [2024-06-22 21:05:34,870][15401] Updated weights for policy 0, policy_version 290230 (0.0040) [2024-06-22 21:05:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4755259392. Throughput: 0: 43071.6. Samples: 4755343640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 21:05:38,710][15401] Updated weights for policy 0, policy_version 290240 (0.0030) [2024-06-22 21:05:42,445][15401] Updated weights for policy 0, policy_version 290250 (0.0033) [2024-06-22 21:05:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 4755488768. Throughput: 0: 43079.9. Samples: 4755605620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 21:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290252_4755488768.pth... [2024-06-22 21:05:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289620_4745134080.pth [2024-06-22 21:05:46,290][15401] Updated weights for policy 0, policy_version 290260 (0.0037) [2024-06-22 21:05:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 4755701760. Throughput: 0: 43038.6. Samples: 4755857640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 21:05:50,355][15401] Updated weights for policy 0, policy_version 290270 (0.0027) [2024-06-22 21:05:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4755914752. Throughput: 0: 43041.4. Samples: 4755990280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 21:05:53,753][15401] Updated weights for policy 0, policy_version 290280 (0.0037) [2024-06-22 21:05:57,996][15401] Updated weights for policy 0, policy_version 290290 (0.0037) [2024-06-22 21:05:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 4756127744. Throughput: 0: 43090.1. Samples: 4756251880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:05:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 21:06:01,234][15401] Updated weights for policy 0, policy_version 290300 (0.0031) [2024-06-22 21:06:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 4756307968. Throughput: 0: 43021.5. Samples: 4756503420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:06:03,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 21:06:05,690][15401] Updated weights for policy 0, policy_version 290310 (0.0036) [2024-06-22 21:06:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 4756570112. Throughput: 0: 42825.4. Samples: 4756627420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:06:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 21:06:08,749][15401] Updated weights for policy 0, policy_version 290320 (0.0031) [2024-06-22 21:06:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 4756750336. Throughput: 0: 42914.4. Samples: 4756891840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-22 21:06:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 21:06:13,425][15401] Updated weights for policy 0, policy_version 290330 (0.0033) [2024-06-22 21:06:16,631][15401] Updated weights for policy 0, policy_version 290340 (0.0028) [2024-06-22 21:06:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4756963328. Throughput: 0: 42912.4. Samples: 4757149180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-22 21:06:21,099][15401] Updated weights for policy 0, policy_version 290350 (0.0034) [2024-06-22 21:06:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 4757209088. Throughput: 0: 42890.6. Samples: 4757273720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 21:06:24,086][15401] Updated weights for policy 0, policy_version 290360 (0.0036) [2024-06-22 21:06:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4757389312. Throughput: 0: 42783.2. Samples: 4757530860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:28,399][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 21:06:28,710][15401] Updated weights for policy 0, policy_version 290370 (0.0034) [2024-06-22 21:06:31,705][15401] Updated weights for policy 0, policy_version 290380 (0.0032) [2024-06-22 21:06:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4757618688. Throughput: 0: 43058.2. Samples: 4757795260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 21:06:36,245][15401] Updated weights for policy 0, policy_version 290390 (0.0032) [2024-06-22 21:06:37,751][15349] Signal inference workers to stop experience collection... (70300 times) [2024-06-22 21:06:37,801][15349] Signal inference workers to resume experience collection... (70300 times) [2024-06-22 21:06:37,801][15401] InferenceWorker_p0-w0: stopping experience collection (70300 times) [2024-06-22 21:06:37,822][15401] InferenceWorker_p0-w0: resuming experience collection (70300 times) [2024-06-22 21:06:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 4757848064. Throughput: 0: 42986.1. Samples: 4757924660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:38,399][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 21:06:39,313][15401] Updated weights for policy 0, policy_version 290400 (0.0035) [2024-06-22 21:06:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42986.8). Total num frames: 4758044672. Throughput: 0: 42804.9. Samples: 4758178200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:43,392][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 21:06:43,974][15401] Updated weights for policy 0, policy_version 290410 (0.0022) [2024-06-22 21:06:47,415][15401] Updated weights for policy 0, policy_version 290420 (0.0036) [2024-06-22 21:06:48,390][15132] Fps is (10 sec: 42595.9, 60 sec: 42871.0, 300 sec: 43042.6). Total num frames: 4758274048. Throughput: 0: 42942.9. Samples: 4758435880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:48,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 21:06:51,499][15401] Updated weights for policy 0, policy_version 290430 (0.0031) [2024-06-22 21:06:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 4758487040. Throughput: 0: 43045.3. Samples: 4758564460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 21:06:54,939][15401] Updated weights for policy 0, policy_version 290440 (0.0043) [2024-06-22 21:06:58,390][15132] Fps is (10 sec: 42601.0, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 4758700032. Throughput: 0: 42831.9. Samples: 4758819280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:06:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 21:06:58,996][15401] Updated weights for policy 0, policy_version 290450 (0.0038) [2024-06-22 21:07:02,413][15401] Updated weights for policy 0, policy_version 290460 (0.0031) [2024-06-22 21:07:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43415.8, 300 sec: 43042.3). Total num frames: 4758913024. Throughput: 0: 42788.0. Samples: 4759074740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:03,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 21:07:06,408][15401] Updated weights for policy 0, policy_version 290470 (0.0035) [2024-06-22 21:07:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 4759126016. Throughput: 0: 42974.4. Samples: 4759207560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 21:07:09,949][15401] Updated weights for policy 0, policy_version 290480 (0.0034) [2024-06-22 21:07:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 4759355392. Throughput: 0: 43106.6. Samples: 4759470660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:13,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 21:07:13,922][15401] Updated weights for policy 0, policy_version 290490 (0.0040) [2024-06-22 21:07:17,487][15401] Updated weights for policy 0, policy_version 290500 (0.0036) [2024-06-22 21:07:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 4759568384. Throughput: 0: 42887.6. Samples: 4759725200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 21:07:22,479][15401] Updated weights for policy 0, policy_version 290510 (0.0031) [2024-06-22 21:07:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4759764992. Throughput: 0: 42892.9. Samples: 4759854840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 21:07:25,113][15401] Updated weights for policy 0, policy_version 290520 (0.0028) [2024-06-22 21:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 4759994368. Throughput: 0: 42966.4. Samples: 4760111580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:28,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 21:07:30,346][15401] Updated weights for policy 0, policy_version 290530 (0.0027) [2024-06-22 21:07:32,655][15401] Updated weights for policy 0, policy_version 290540 (0.0030) [2024-06-22 21:07:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 4760223744. Throughput: 0: 42811.2. Samples: 4760362360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 21:07:37,791][15401] Updated weights for policy 0, policy_version 290550 (0.0036) [2024-06-22 21:07:38,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42986.2). Total num frames: 4760403968. Throughput: 0: 42909.5. Samples: 4760495660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:38,396][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 21:07:40,152][15401] Updated weights for policy 0, policy_version 290560 (0.0034) [2024-06-22 21:07:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43144.5, 300 sec: 42986.8). Total num frames: 4760633344. Throughput: 0: 43030.1. Samples: 4760755740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:43,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 21:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290566_4760633344.pth... [2024-06-22 21:07:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000289936_4750311424.pth [2024-06-22 21:07:45,302][15401] Updated weights for policy 0, policy_version 290570 (0.0038) [2024-06-22 21:07:47,679][15401] Updated weights for policy 0, policy_version 290580 (0.0034) [2024-06-22 21:07:48,392][15132] Fps is (10 sec: 45893.4, 60 sec: 43143.3, 300 sec: 43042.4). Total num frames: 4760862720. Throughput: 0: 43046.8. Samples: 4761011840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:48,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 21:07:52,937][15401] Updated weights for policy 0, policy_version 290590 (0.0028) [2024-06-22 21:07:53,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4761042944. Throughput: 0: 43085.7. Samples: 4761146420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 21:07:55,538][15401] Updated weights for policy 0, policy_version 290600 (0.0043) [2024-06-22 21:07:58,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4761272320. Throughput: 0: 42846.8. Samples: 4761398760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:07:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 21:07:59,080][15349] Signal inference workers to stop experience collection... (70350 times) [2024-06-22 21:07:59,080][15349] Signal inference workers to resume experience collection... (70350 times) [2024-06-22 21:07:59,126][15401] InferenceWorker_p0-w0: stopping experience collection (70350 times) [2024-06-22 21:07:59,126][15401] InferenceWorker_p0-w0: resuming experience collection (70350 times) [2024-06-22 21:08:00,490][15401] Updated weights for policy 0, policy_version 290610 (0.0037) [2024-06-22 21:08:02,990][15401] Updated weights for policy 0, policy_version 290620 (0.0035) [2024-06-22 21:08:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43419.3, 300 sec: 43098.2). Total num frames: 4761518080. Throughput: 0: 42838.6. Samples: 4761652940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 21:08:07,979][15401] Updated weights for policy 0, policy_version 290630 (0.0053) [2024-06-22 21:08:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 4761681920. Throughput: 0: 42955.6. Samples: 4761787840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 21:08:10,827][15401] Updated weights for policy 0, policy_version 290640 (0.0036) [2024-06-22 21:08:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4761894912. Throughput: 0: 42870.1. Samples: 4762040740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 21:08:15,553][15401] Updated weights for policy 0, policy_version 290650 (0.0033) [2024-06-22 21:08:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4762140672. Throughput: 0: 42793.8. Samples: 4762288080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 21:08:18,670][15401] Updated weights for policy 0, policy_version 290660 (0.0043) [2024-06-22 21:08:23,014][15401] Updated weights for policy 0, policy_version 290670 (0.0034) [2024-06-22 21:08:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 4762337280. Throughput: 0: 42793.5. Samples: 4762421100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 21:08:26,629][15401] Updated weights for policy 0, policy_version 290680 (0.0034) [2024-06-22 21:08:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42932.6). Total num frames: 4762550272. Throughput: 0: 42712.1. Samples: 4762677680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 21:08:30,413][15401] Updated weights for policy 0, policy_version 290690 (0.0033) [2024-06-22 21:08:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42932.0). Total num frames: 4762763264. Throughput: 0: 42676.4. Samples: 4762932180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:33,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 21:08:34,486][15401] Updated weights for policy 0, policy_version 290700 (0.0031) [2024-06-22 21:08:37,925][15401] Updated weights for policy 0, policy_version 290710 (0.0035) [2024-06-22 21:08:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43149.0, 300 sec: 43042.7). Total num frames: 4762992640. Throughput: 0: 42504.3. Samples: 4763059120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-22 21:08:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-22 21:08:42,155][15401] Updated weights for policy 0, policy_version 290720 (0.0029) [2024-06-22 21:08:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 4763205632. Throughput: 0: 42740.8. Samples: 4763322100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:08:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 21:08:45,446][15401] Updated weights for policy 0, policy_version 290730 (0.0035) [2024-06-22 21:08:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42327.0, 300 sec: 42877.0). Total num frames: 4763402240. Throughput: 0: 42797.9. Samples: 4763578840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:08:48,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 21:08:49,982][15401] Updated weights for policy 0, policy_version 290740 (0.0031) [2024-06-22 21:08:53,315][15401] Updated weights for policy 0, policy_version 290750 (0.0034) [2024-06-22 21:08:53,391][15132] Fps is (10 sec: 44232.4, 60 sec: 43416.8, 300 sec: 43098.1). Total num frames: 4763648000. Throughput: 0: 42598.6. Samples: 4763704820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:08:53,391][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 21:08:57,444][15401] Updated weights for policy 0, policy_version 290760 (0.0032) [2024-06-22 21:08:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4763844608. Throughput: 0: 42812.1. Samples: 4763967280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:08:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 21:09:00,915][15401] Updated weights for policy 0, policy_version 290770 (0.0036) [2024-06-22 21:09:03,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 4764057600. Throughput: 0: 43034.1. Samples: 4764224620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 21:09:04,996][15401] Updated weights for policy 0, policy_version 290780 (0.0031) [2024-06-22 21:09:07,322][15349] Signal inference workers to stop experience collection... (70400 times) [2024-06-22 21:09:07,323][15349] Signal inference workers to resume experience collection... (70400 times) [2024-06-22 21:09:07,341][15401] InferenceWorker_p0-w0: stopping experience collection (70400 times) [2024-06-22 21:09:07,341][15401] InferenceWorker_p0-w0: resuming experience collection (70400 times) [2024-06-22 21:09:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 4764286976. Throughput: 0: 43071.1. Samples: 4764359300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 21:09:08,532][15401] Updated weights for policy 0, policy_version 290790 (0.0025) [2024-06-22 21:09:12,922][15401] Updated weights for policy 0, policy_version 290800 (0.0048) [2024-06-22 21:09:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4764467200. Throughput: 0: 42932.5. Samples: 4764609640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 21:09:16,450][15401] Updated weights for policy 0, policy_version 290810 (0.0023) [2024-06-22 21:09:18,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4764696576. Throughput: 0: 43102.3. Samples: 4764871780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 21:09:20,431][15401] Updated weights for policy 0, policy_version 290820 (0.0024) [2024-06-22 21:09:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 4764925952. Throughput: 0: 43178.2. Samples: 4765002140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 21:09:24,064][15401] Updated weights for policy 0, policy_version 290830 (0.0023) [2024-06-22 21:09:27,912][15401] Updated weights for policy 0, policy_version 290840 (0.0034) [2024-06-22 21:09:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4765122560. Throughput: 0: 42935.1. Samples: 4765254180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 21:09:31,879][15401] Updated weights for policy 0, policy_version 290850 (0.0035) [2024-06-22 21:09:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4765351936. Throughput: 0: 42813.3. Samples: 4765505440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 21:09:35,574][15401] Updated weights for policy 0, policy_version 290860 (0.0035) [2024-06-22 21:09:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4765564928. Throughput: 0: 42907.6. Samples: 4765635620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 21:09:39,512][15401] Updated weights for policy 0, policy_version 290870 (0.0028) [2024-06-22 21:09:43,168][15401] Updated weights for policy 0, policy_version 290880 (0.0028) [2024-06-22 21:09:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4765777920. Throughput: 0: 42813.3. Samples: 4765893880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 21:09:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290880_4765777920.pth... [2024-06-22 21:09:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290252_4755488768.pth [2024-06-22 21:09:47,130][15401] Updated weights for policy 0, policy_version 290890 (0.0042) [2024-06-22 21:09:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4765990912. Throughput: 0: 42645.3. Samples: 4766143660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 21:09:50,940][15401] Updated weights for policy 0, policy_version 290900 (0.0024) [2024-06-22 21:09:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42599.0, 300 sec: 42931.6). Total num frames: 4766203904. Throughput: 0: 42594.7. Samples: 4766276060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-22 21:09:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 21:09:54,631][15401] Updated weights for policy 0, policy_version 290910 (0.0036) [2024-06-22 21:09:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4766416896. Throughput: 0: 42767.0. Samples: 4766534160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:09:58,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 21:09:58,503][15401] Updated weights for policy 0, policy_version 290920 (0.0035) [2024-06-22 21:10:02,175][15401] Updated weights for policy 0, policy_version 290930 (0.0040) [2024-06-22 21:10:03,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 4766629888. Throughput: 0: 42562.1. Samples: 4766787180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:03,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-22 21:10:06,238][15401] Updated weights for policy 0, policy_version 290940 (0.0029) [2024-06-22 21:10:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 4766826496. Throughput: 0: 42451.7. Samples: 4766912460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 21:10:09,868][15401] Updated weights for policy 0, policy_version 290950 (0.0039) [2024-06-22 21:10:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4767039488. Throughput: 0: 42502.2. Samples: 4767166780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 21:10:14,197][15401] Updated weights for policy 0, policy_version 290960 (0.0033) [2024-06-22 21:10:18,064][15401] Updated weights for policy 0, policy_version 290970 (0.0042) [2024-06-22 21:10:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4767252480. Throughput: 0: 42490.5. Samples: 4767417620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:18,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 21:10:21,714][15401] Updated weights for policy 0, policy_version 290980 (0.0032) [2024-06-22 21:10:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4767465472. Throughput: 0: 42495.6. Samples: 4767547920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 21:10:25,423][15401] Updated weights for policy 0, policy_version 290990 (0.0045) [2024-06-22 21:10:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4767678464. Throughput: 0: 42566.6. Samples: 4767809380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 21:10:29,234][15401] Updated weights for policy 0, policy_version 291000 (0.0032) [2024-06-22 21:10:31,548][15349] Signal inference workers to stop experience collection... (70450 times) [2024-06-22 21:10:31,549][15349] Signal inference workers to resume experience collection... (70450 times) [2024-06-22 21:10:31,565][15401] InferenceWorker_p0-w0: stopping experience collection (70450 times) [2024-06-22 21:10:31,565][15401] InferenceWorker_p0-w0: resuming experience collection (70450 times) [2024-06-22 21:10:32,927][15401] Updated weights for policy 0, policy_version 291010 (0.0043) [2024-06-22 21:10:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4767907840. Throughput: 0: 42672.1. Samples: 4768063900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 21:10:37,019][15401] Updated weights for policy 0, policy_version 291020 (0.0045) [2024-06-22 21:10:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 4768104448. Throughput: 0: 42720.6. Samples: 4768198480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 21:10:40,636][15401] Updated weights for policy 0, policy_version 291030 (0.0037) [2024-06-22 21:10:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4768333824. Throughput: 0: 42651.5. Samples: 4768453480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 21:10:44,711][15401] Updated weights for policy 0, policy_version 291040 (0.0030) [2024-06-22 21:10:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4768546816. Throughput: 0: 42711.7. Samples: 4768709100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 21:10:48,438][15401] Updated weights for policy 0, policy_version 291050 (0.0028) [2024-06-22 21:10:52,439][15401] Updated weights for policy 0, policy_version 291060 (0.0037) [2024-06-22 21:10:53,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 4768743424. Throughput: 0: 42696.1. Samples: 4768834060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:53,397][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 21:10:56,010][15401] Updated weights for policy 0, policy_version 291070 (0.0043) [2024-06-22 21:10:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4768972800. Throughput: 0: 42982.7. Samples: 4769101000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:10:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 21:11:00,188][15401] Updated weights for policy 0, policy_version 291080 (0.0038) [2024-06-22 21:11:03,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 4769185792. Throughput: 0: 43027.7. Samples: 4769353760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-22 21:11:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 21:11:03,743][15401] Updated weights for policy 0, policy_version 291090 (0.0041) [2024-06-22 21:11:07,772][15401] Updated weights for policy 0, policy_version 291100 (0.0036) [2024-06-22 21:11:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4769398784. Throughput: 0: 42942.3. Samples: 4769480320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:08,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 21:11:11,283][15401] Updated weights for policy 0, policy_version 291110 (0.0037) [2024-06-22 21:11:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4769611776. Throughput: 0: 42940.0. Samples: 4769741680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 21:11:15,482][15401] Updated weights for policy 0, policy_version 291120 (0.0023) [2024-06-22 21:11:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4769824768. Throughput: 0: 42965.6. Samples: 4769997360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 21:11:18,887][15401] Updated weights for policy 0, policy_version 291130 (0.0045) [2024-06-22 21:11:22,947][15401] Updated weights for policy 0, policy_version 291140 (0.0035) [2024-06-22 21:11:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4770054144. Throughput: 0: 42862.9. Samples: 4770127320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-22 21:11:26,369][15401] Updated weights for policy 0, policy_version 291150 (0.0039) [2024-06-22 21:11:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4770250752. Throughput: 0: 42909.3. Samples: 4770384400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 21:11:30,795][15401] Updated weights for policy 0, policy_version 291160 (0.0023) [2024-06-22 21:11:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4770480128. Throughput: 0: 42894.6. Samples: 4770639360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 21:11:34,027][15401] Updated weights for policy 0, policy_version 291170 (0.0037) [2024-06-22 21:11:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4770676736. Throughput: 0: 43037.7. Samples: 4770770480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 21:11:38,434][15401] Updated weights for policy 0, policy_version 291180 (0.0033) [2024-06-22 21:11:41,735][15401] Updated weights for policy 0, policy_version 291190 (0.0035) [2024-06-22 21:11:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 4770889728. Throughput: 0: 42712.3. Samples: 4771023060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:43,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-22 21:11:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291193_4770906112.pth... [2024-06-22 21:11:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290566_4760633344.pth [2024-06-22 21:11:45,960][15401] Updated weights for policy 0, policy_version 291200 (0.0039) [2024-06-22 21:11:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4771135488. Throughput: 0: 42864.5. Samples: 4771282660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 21:11:49,210][15401] Updated weights for policy 0, policy_version 291210 (0.0040) [2024-06-22 21:11:50,990][15349] Signal inference workers to stop experience collection... (70500 times) [2024-06-22 21:11:50,991][15349] Signal inference workers to resume experience collection... (70500 times) [2024-06-22 21:11:51,005][15401] InferenceWorker_p0-w0: stopping experience collection (70500 times) [2024-06-22 21:11:51,006][15401] InferenceWorker_p0-w0: resuming experience collection (70500 times) [2024-06-22 21:11:53,388][15401] Updated weights for policy 0, policy_version 291220 (0.0037) [2024-06-22 21:11:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 4771348480. Throughput: 0: 42993.6. Samples: 4771415040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:53,390][15132] Avg episode reward: [(0, '0.006')] [2024-06-22 21:11:57,050][15401] Updated weights for policy 0, policy_version 291230 (0.0035) [2024-06-22 21:11:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 4771528704. Throughput: 0: 42714.6. Samples: 4771663840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:11:58,392][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 21:12:01,108][15401] Updated weights for policy 0, policy_version 291240 (0.0024) [2024-06-22 21:12:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4771774464. Throughput: 0: 42755.6. Samples: 4771921360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:12:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:12:04,514][15401] Updated weights for policy 0, policy_version 291250 (0.0033) [2024-06-22 21:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4771971072. Throughput: 0: 42886.2. Samples: 4772057200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:12:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 21:12:08,833][15401] Updated weights for policy 0, policy_version 291260 (0.0039) [2024-06-22 21:12:12,046][15401] Updated weights for policy 0, policy_version 291270 (0.0034) [2024-06-22 21:12:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4772184064. Throughput: 0: 42554.3. Samples: 4772299340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:12:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 21:12:16,527][15401] Updated weights for policy 0, policy_version 291280 (0.0037) [2024-06-22 21:12:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4772413440. Throughput: 0: 42715.1. Samples: 4772561540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 21:12:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 21:12:19,838][15401] Updated weights for policy 0, policy_version 291290 (0.0032) [2024-06-22 21:12:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4772593664. Throughput: 0: 42699.2. Samples: 4772691940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 21:12:24,340][15401] Updated weights for policy 0, policy_version 291300 (0.0033) [2024-06-22 21:12:27,352][15401] Updated weights for policy 0, policy_version 291310 (0.0036) [2024-06-22 21:12:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 4772839424. Throughput: 0: 42733.4. Samples: 4772946160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:28,393][15132] Avg episode reward: [(0, '0.762')] [2024-06-22 21:12:32,114][15401] Updated weights for policy 0, policy_version 291320 (0.0032) [2024-06-22 21:12:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 4773052416. Throughput: 0: 42733.3. Samples: 4773205660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 21:12:34,983][15401] Updated weights for policy 0, policy_version 291330 (0.0034) [2024-06-22 21:12:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 4773249024. Throughput: 0: 42633.7. Samples: 4773333560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 21:12:39,745][15401] Updated weights for policy 0, policy_version 291340 (0.0032) [2024-06-22 21:12:42,654][15401] Updated weights for policy 0, policy_version 291350 (0.0032) [2024-06-22 21:12:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42765.4). Total num frames: 4773478400. Throughput: 0: 42710.8. Samples: 4773585820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 21:12:47,305][15401] Updated weights for policy 0, policy_version 291360 (0.0032) [2024-06-22 21:12:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4773675008. Throughput: 0: 42953.8. Samples: 4773854280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:48,391][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 21:12:50,237][15401] Updated weights for policy 0, policy_version 291370 (0.0033) [2024-06-22 21:12:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4773888000. Throughput: 0: 42628.9. Samples: 4773975500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 21:12:55,260][15401] Updated weights for policy 0, policy_version 291380 (0.0036) [2024-06-22 21:12:56,991][15349] Signal inference workers to stop experience collection... (70550 times) [2024-06-22 21:12:56,991][15349] Signal inference workers to resume experience collection... (70550 times) [2024-06-22 21:12:57,007][15401] InferenceWorker_p0-w0: stopping experience collection (70550 times) [2024-06-22 21:12:57,007][15401] InferenceWorker_p0-w0: resuming experience collection (70550 times) [2024-06-22 21:12:58,199][15401] Updated weights for policy 0, policy_version 291390 (0.0029) [2024-06-22 21:12:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4774133760. Throughput: 0: 42883.5. Samples: 4774229100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:12:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 21:13:02,893][15401] Updated weights for policy 0, policy_version 291400 (0.0036) [2024-06-22 21:13:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4774313984. Throughput: 0: 43072.8. Samples: 4774499820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 21:13:05,641][15401] Updated weights for policy 0, policy_version 291410 (0.0029) [2024-06-22 21:13:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4774543360. Throughput: 0: 42804.4. Samples: 4774618140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 21:13:10,673][15401] Updated weights for policy 0, policy_version 291420 (0.0043) [2024-06-22 21:13:13,111][15401] Updated weights for policy 0, policy_version 291430 (0.0042) [2024-06-22 21:13:13,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4774789120. Throughput: 0: 43055.2. Samples: 4774883540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 21:13:18,201][15401] Updated weights for policy 0, policy_version 291440 (0.0022) [2024-06-22 21:13:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 4774969344. Throughput: 0: 43396.8. Samples: 4775158620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:18,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 21:13:20,503][15401] Updated weights for policy 0, policy_version 291450 (0.0035) [2024-06-22 21:13:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4775198720. Throughput: 0: 43221.0. Samples: 4775278500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 21:13:25,719][15401] Updated weights for policy 0, policy_version 291460 (0.0041) [2024-06-22 21:13:28,389][15132] Fps is (10 sec: 45886.8, 60 sec: 43146.4, 300 sec: 42931.7). Total num frames: 4775428096. Throughput: 0: 43369.4. Samples: 4775537440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 21:13:28,526][15401] Updated weights for policy 0, policy_version 291470 (0.0041) [2024-06-22 21:13:33,313][15401] Updated weights for policy 0, policy_version 291480 (0.0023) [2024-06-22 21:13:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4775608320. Throughput: 0: 43327.2. Samples: 4775804000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-22 21:13:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 21:13:36,051][15401] Updated weights for policy 0, policy_version 291490 (0.0037) [2024-06-22 21:13:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4775837696. Throughput: 0: 43163.6. Samples: 4775917860. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:13:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 21:13:40,752][15401] Updated weights for policy 0, policy_version 291500 (0.0035) [2024-06-22 21:13:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4776083456. Throughput: 0: 43474.9. Samples: 4776185460. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:13:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 21:13:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291510_4776099840.pth... [2024-06-22 21:13:43,500][15401] Updated weights for policy 0, policy_version 291510 (0.0037) [2024-06-22 21:13:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000290880_4765777920.pth [2024-06-22 21:13:48,218][15401] Updated weights for policy 0, policy_version 291520 (0.0041) [2024-06-22 21:13:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.7). Total num frames: 4776280064. Throughput: 0: 43067.7. Samples: 4776437860. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:13:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 21:13:51,371][15401] Updated weights for policy 0, policy_version 291530 (0.0037) [2024-06-22 21:13:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4776493056. Throughput: 0: 43238.8. Samples: 4776563880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:13:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 21:13:55,648][15401] Updated weights for policy 0, policy_version 291540 (0.0039) [2024-06-22 21:13:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4776706048. Throughput: 0: 43161.2. Samples: 4776825800. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:13:58,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-22 21:13:58,811][15349] Signal inference workers to stop experience collection... (70600 times) [2024-06-22 21:13:58,832][15401] InferenceWorker_p0-w0: stopping experience collection (70600 times) [2024-06-22 21:13:58,869][15349] Signal inference workers to resume experience collection... (70600 times) [2024-06-22 21:13:58,870][15401] InferenceWorker_p0-w0: resuming experience collection (70600 times) [2024-06-22 21:13:59,010][15401] Updated weights for policy 0, policy_version 291550 (0.0037) [2024-06-22 21:14:03,082][15401] Updated weights for policy 0, policy_version 291560 (0.0029) [2024-06-22 21:14:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 4776935424. Throughput: 0: 42777.7. Samples: 4777083520. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 21:14:06,918][15401] Updated weights for policy 0, policy_version 291570 (0.0029) [2024-06-22 21:14:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4777148416. Throughput: 0: 43097.3. Samples: 4777217880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:08,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-22 21:14:10,567][15401] Updated weights for policy 0, policy_version 291580 (0.0038) [2024-06-22 21:14:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 4777345024. Throughput: 0: 42897.6. Samples: 4777467940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:13,393][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 21:14:14,402][15401] Updated weights for policy 0, policy_version 291590 (0.0027) [2024-06-22 21:14:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 4777558016. Throughput: 0: 42782.1. Samples: 4777729200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 21:14:18,671][15401] Updated weights for policy 0, policy_version 291600 (0.0028) [2024-06-22 21:14:22,029][15401] Updated weights for policy 0, policy_version 291610 (0.0033) [2024-06-22 21:14:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4777787392. Throughput: 0: 43126.2. Samples: 4777858540. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 21:14:26,285][15401] Updated weights for policy 0, policy_version 291620 (0.0048) [2024-06-22 21:14:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4777984000. Throughput: 0: 42770.4. Samples: 4778110240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:28,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 21:14:29,896][15401] Updated weights for policy 0, policy_version 291630 (0.0027) [2024-06-22 21:14:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4778196992. Throughput: 0: 43008.3. Samples: 4778373340. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:33,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 21:14:33,827][15401] Updated weights for policy 0, policy_version 291640 (0.0024) [2024-06-22 21:14:37,497][15401] Updated weights for policy 0, policy_version 291650 (0.0032) [2024-06-22 21:14:38,389][15132] Fps is (10 sec: 44248.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4778426368. Throughput: 0: 43101.8. Samples: 4778503460. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 21:14:41,422][15401] Updated weights for policy 0, policy_version 291660 (0.0033) [2024-06-22 21:14:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4778639360. Throughput: 0: 42954.8. Samples: 4778758760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-22 21:14:43,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 21:14:45,157][15401] Updated weights for policy 0, policy_version 291670 (0.0046) [2024-06-22 21:14:48,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4778835968. Throughput: 0: 43003.2. Samples: 4779018660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:14:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 21:14:49,099][15401] Updated weights for policy 0, policy_version 291680 (0.0029) [2024-06-22 21:14:52,917][15401] Updated weights for policy 0, policy_version 291690 (0.0040) [2024-06-22 21:14:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 4779081728. Throughput: 0: 42802.8. Samples: 4779144000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:14:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 21:14:56,700][15401] Updated weights for policy 0, policy_version 291700 (0.0037) [2024-06-22 21:14:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 4779261952. Throughput: 0: 42989.0. Samples: 4779402340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:14:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 21:15:00,358][15401] Updated weights for policy 0, policy_version 291710 (0.0038) [2024-06-22 21:15:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4779491328. Throughput: 0: 43073.8. Samples: 4779667520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 21:15:04,051][15401] Updated weights for policy 0, policy_version 291720 (0.0033) [2024-06-22 21:15:07,834][15401] Updated weights for policy 0, policy_version 291730 (0.0036) [2024-06-22 21:15:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4779720704. Throughput: 0: 43026.1. Samples: 4779794720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:08,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 21:15:11,531][15401] Updated weights for policy 0, policy_version 291740 (0.0023) [2024-06-22 21:15:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42932.0). Total num frames: 4779917312. Throughput: 0: 43111.7. Samples: 4780050160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 21:15:15,660][15401] Updated weights for policy 0, policy_version 291750 (0.0029) [2024-06-22 21:15:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4780130304. Throughput: 0: 42961.8. Samples: 4780306520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 21:15:18,697][15349] Signal inference workers to stop experience collection... (70650 times) [2024-06-22 21:15:18,697][15349] Signal inference workers to resume experience collection... (70650 times) [2024-06-22 21:15:18,747][15401] InferenceWorker_p0-w0: stopping experience collection (70650 times) [2024-06-22 21:15:18,747][15401] InferenceWorker_p0-w0: resuming experience collection (70650 times) [2024-06-22 21:15:19,013][15401] Updated weights for policy 0, policy_version 291760 (0.0042) [2024-06-22 21:15:23,030][15401] Updated weights for policy 0, policy_version 291770 (0.0033) [2024-06-22 21:15:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4780359680. Throughput: 0: 42976.7. Samples: 4780437420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 21:15:26,965][15401] Updated weights for policy 0, policy_version 291780 (0.0032) [2024-06-22 21:15:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 4780556288. Throughput: 0: 42970.3. Samples: 4780692420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 21:15:30,495][15401] Updated weights for policy 0, policy_version 291790 (0.0032) [2024-06-22 21:15:33,391][15132] Fps is (10 sec: 44229.9, 60 sec: 43418.2, 300 sec: 43042.5). Total num frames: 4780802048. Throughput: 0: 42867.3. Samples: 4780947760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:33,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 21:15:34,399][15401] Updated weights for policy 0, policy_version 291800 (0.0032) [2024-06-22 21:15:37,933][15401] Updated weights for policy 0, policy_version 291810 (0.0028) [2024-06-22 21:15:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 4781015040. Throughput: 0: 43257.7. Samples: 4781090600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 21:15:41,965][15401] Updated weights for policy 0, policy_version 291820 (0.0044) [2024-06-22 21:15:43,390][15132] Fps is (10 sec: 39327.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4781195264. Throughput: 0: 42969.2. Samples: 4781335960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 21:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291821_4781195264.pth... [2024-06-22 21:15:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291193_4770906112.pth [2024-06-22 21:15:45,592][15401] Updated weights for policy 0, policy_version 291830 (0.0036) [2024-06-22 21:15:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43690.8, 300 sec: 43099.2). Total num frames: 4781457408. Throughput: 0: 42776.2. Samples: 4781592440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 21:15:49,522][15401] Updated weights for policy 0, policy_version 291840 (0.0038) [2024-06-22 21:15:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 4781654016. Throughput: 0: 43009.7. Samples: 4781730160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 21:15:53,405][15401] Updated weights for policy 0, policy_version 291850 (0.0032) [2024-06-22 21:15:57,022][15401] Updated weights for policy 0, policy_version 291860 (0.0043) [2024-06-22 21:15:58,389][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4781850624. Throughput: 0: 42729.8. Samples: 4781973000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 21:15:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 21:16:01,664][15401] Updated weights for policy 0, policy_version 291870 (0.0037) [2024-06-22 21:16:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4782080000. Throughput: 0: 42732.1. Samples: 4782229460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 21:16:04,558][15401] Updated weights for policy 0, policy_version 291880 (0.0041) [2024-06-22 21:16:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4782260224. Throughput: 0: 42872.9. Samples: 4782366700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:08,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 21:16:09,030][15401] Updated weights for policy 0, policy_version 291890 (0.0024) [2024-06-22 21:16:12,207][15401] Updated weights for policy 0, policy_version 291900 (0.0031) [2024-06-22 21:16:13,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43140.0, 300 sec: 42986.3). Total num frames: 4782505984. Throughput: 0: 42709.4. Samples: 4782614620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:13,397][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 21:16:16,825][15401] Updated weights for policy 0, policy_version 291910 (0.0032) [2024-06-22 21:16:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4782702592. Throughput: 0: 42899.0. Samples: 4782878140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 21:16:19,795][15401] Updated weights for policy 0, policy_version 291920 (0.0031) [2024-06-22 21:16:23,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 4782915584. Throughput: 0: 42610.7. Samples: 4783008080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 21:16:24,339][15401] Updated weights for policy 0, policy_version 291930 (0.0033) [2024-06-22 21:16:27,794][15401] Updated weights for policy 0, policy_version 291940 (0.0037) [2024-06-22 21:16:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4783144960. Throughput: 0: 42756.5. Samples: 4783260000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 21:16:31,559][15349] Signal inference workers to stop experience collection... (70700 times) [2024-06-22 21:16:31,560][15349] Signal inference workers to resume experience collection... (70700 times) [2024-06-22 21:16:31,606][15401] InferenceWorker_p0-w0: stopping experience collection (70700 times) [2024-06-22 21:16:31,607][15401] InferenceWorker_p0-w0: resuming experience collection (70700 times) [2024-06-22 21:16:31,866][15401] Updated weights for policy 0, policy_version 291950 (0.0028) [2024-06-22 21:16:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.5, 300 sec: 42987.2). Total num frames: 4783357952. Throughput: 0: 42949.2. Samples: 4783525160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:33,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 21:16:35,365][15401] Updated weights for policy 0, policy_version 291960 (0.0033) [2024-06-22 21:16:38,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.8, 300 sec: 42986.3). Total num frames: 4783570944. Throughput: 0: 42753.1. Samples: 4783654320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:38,396][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 21:16:39,473][15401] Updated weights for policy 0, policy_version 291970 (0.0038) [2024-06-22 21:16:43,143][15401] Updated weights for policy 0, policy_version 291980 (0.0032) [2024-06-22 21:16:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 4783800320. Throughput: 0: 42991.6. Samples: 4783907620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 21:16:47,556][15401] Updated weights for policy 0, policy_version 291990 (0.0045) [2024-06-22 21:16:48,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 4783980544. Throughput: 0: 43154.5. Samples: 4784171420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 21:16:50,695][15401] Updated weights for policy 0, policy_version 292000 (0.0027) [2024-06-22 21:16:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 4784209920. Throughput: 0: 42863.7. Samples: 4784295560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 21:16:55,204][15401] Updated weights for policy 0, policy_version 292010 (0.0024) [2024-06-22 21:16:58,226][15401] Updated weights for policy 0, policy_version 292020 (0.0033) [2024-06-22 21:16:58,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 4784455680. Throughput: 0: 43025.8. Samples: 4784550500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:16:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 21:17:02,662][15401] Updated weights for policy 0, policy_version 292030 (0.0041) [2024-06-22 21:17:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4784635904. Throughput: 0: 43067.4. Samples: 4784816180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:17:03,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 21:17:05,742][15401] Updated weights for policy 0, policy_version 292040 (0.0024) [2024-06-22 21:17:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4784848896. Throughput: 0: 42872.8. Samples: 4784937360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 21:17:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 21:17:10,043][15401] Updated weights for policy 0, policy_version 292050 (0.0031) [2024-06-22 21:17:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42875.9, 300 sec: 42931.6). Total num frames: 4785078272. Throughput: 0: 43167.9. Samples: 4785202560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 21:17:13,686][15401] Updated weights for policy 0, policy_version 292060 (0.0029) [2024-06-22 21:17:17,413][15401] Updated weights for policy 0, policy_version 292070 (0.0033) [2024-06-22 21:17:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4785291264. Throughput: 0: 42981.8. Samples: 4785459340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 21:17:21,153][15401] Updated weights for policy 0, policy_version 292080 (0.0030) [2024-06-22 21:17:23,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43415.8, 300 sec: 42987.2). Total num frames: 4785520640. Throughput: 0: 42816.7. Samples: 4785580900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:23,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 21:17:25,857][15401] Updated weights for policy 0, policy_version 292090 (0.0023) [2024-06-22 21:17:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 4785733632. Throughput: 0: 43107.9. Samples: 4785847580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:28,393][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 21:17:29,122][15401] Updated weights for policy 0, policy_version 292100 (0.0039) [2024-06-22 21:17:33,388][15401] Updated weights for policy 0, policy_version 292110 (0.0036) [2024-06-22 21:17:33,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 4785930240. Throughput: 0: 42960.5. Samples: 4786104740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:33,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 21:17:36,911][15401] Updated weights for policy 0, policy_version 292120 (0.0027) [2024-06-22 21:17:38,391][15132] Fps is (10 sec: 42601.9, 60 sec: 43148.0, 300 sec: 42986.9). Total num frames: 4786159616. Throughput: 0: 42825.5. Samples: 4786222780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:38,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 21:17:41,047][15401] Updated weights for policy 0, policy_version 292130 (0.0043) [2024-06-22 21:17:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 4786356224. Throughput: 0: 42916.4. Samples: 4786481740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:43,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 21:17:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292136_4786356224.pth... [2024-06-22 21:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291510_4776099840.pth [2024-06-22 21:17:44,351][15401] Updated weights for policy 0, policy_version 292140 (0.0035) [2024-06-22 21:17:48,390][15132] Fps is (10 sec: 37689.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4786536448. Throughput: 0: 42748.5. Samples: 4786739860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 21:17:48,794][15401] Updated weights for policy 0, policy_version 292150 (0.0039) [2024-06-22 21:17:50,517][15349] Signal inference workers to stop experience collection... (70750 times) [2024-06-22 21:17:50,548][15401] InferenceWorker_p0-w0: stopping experience collection (70750 times) [2024-06-22 21:17:50,566][15349] Signal inference workers to resume experience collection... (70750 times) [2024-06-22 21:17:50,575][15401] InferenceWorker_p0-w0: resuming experience collection (70750 times) [2024-06-22 21:17:51,947][15401] Updated weights for policy 0, policy_version 292160 (0.0035) [2024-06-22 21:17:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4786798592. Throughput: 0: 42711.7. Samples: 4786859380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 21:17:56,586][15401] Updated weights for policy 0, policy_version 292170 (0.0028) [2024-06-22 21:17:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 4786995200. Throughput: 0: 42658.9. Samples: 4787122200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:17:58,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 21:17:59,594][15401] Updated weights for policy 0, policy_version 292180 (0.0031) [2024-06-22 21:18:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4787191808. Throughput: 0: 42747.6. Samples: 4787382980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:18:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 21:18:04,198][15401] Updated weights for policy 0, policy_version 292190 (0.0035) [2024-06-22 21:18:07,275][15401] Updated weights for policy 0, policy_version 292200 (0.0032) [2024-06-22 21:18:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4787437568. Throughput: 0: 42733.0. Samples: 4787503780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:18:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 21:18:11,713][15401] Updated weights for policy 0, policy_version 292210 (0.0033) [2024-06-22 21:18:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42932.0). Total num frames: 4787634176. Throughput: 0: 42778.7. Samples: 4787772520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:18:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 21:18:14,885][15401] Updated weights for policy 0, policy_version 292220 (0.0043) [2024-06-22 21:18:18,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4787830784. Throughput: 0: 42686.6. Samples: 4788025540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:18:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 21:18:19,665][15401] Updated weights for policy 0, policy_version 292230 (0.0033) [2024-06-22 21:18:22,504][15401] Updated weights for policy 0, policy_version 292240 (0.0038) [2024-06-22 21:18:23,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 4788092928. Throughput: 0: 42768.1. Samples: 4788147380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-22 21:18:23,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 21:18:27,256][15401] Updated weights for policy 0, policy_version 292250 (0.0038) [2024-06-22 21:18:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.1, 300 sec: 42931.6). Total num frames: 4788273152. Throughput: 0: 42920.5. Samples: 4788413160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 21:18:30,083][15401] Updated weights for policy 0, policy_version 292260 (0.0030) [2024-06-22 21:18:33,390][15132] Fps is (10 sec: 39330.0, 60 sec: 42599.9, 300 sec: 42876.0). Total num frames: 4788486144. Throughput: 0: 42778.9. Samples: 4788664920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 21:18:34,695][15401] Updated weights for policy 0, policy_version 292270 (0.0033) [2024-06-22 21:18:37,780][15401] Updated weights for policy 0, policy_version 292280 (0.0042) [2024-06-22 21:18:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42872.7, 300 sec: 42876.1). Total num frames: 4788731904. Throughput: 0: 42934.2. Samples: 4788791420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 21:18:42,491][15401] Updated weights for policy 0, policy_version 292290 (0.0028) [2024-06-22 21:18:43,390][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4788928512. Throughput: 0: 42956.8. Samples: 4789055260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 21:18:45,380][15401] Updated weights for policy 0, policy_version 292300 (0.0035) [2024-06-22 21:18:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4789141504. Throughput: 0: 42754.7. Samples: 4789306940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 21:18:50,246][15401] Updated weights for policy 0, policy_version 292310 (0.0029) [2024-06-22 21:18:53,273][15401] Updated weights for policy 0, policy_version 292320 (0.0037) [2024-06-22 21:18:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 4789370880. Throughput: 0: 42849.8. Samples: 4789432020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 21:18:57,985][15401] Updated weights for policy 0, policy_version 292330 (0.0031) [2024-06-22 21:18:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4789551104. Throughput: 0: 42697.0. Samples: 4789693880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:18:58,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 21:19:00,823][15401] Updated weights for policy 0, policy_version 292340 (0.0035) [2024-06-22 21:19:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4789780480. Throughput: 0: 42667.8. Samples: 4789945580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 21:19:05,711][15401] Updated weights for policy 0, policy_version 292350 (0.0021) [2024-06-22 21:19:08,375][15401] Updated weights for policy 0, policy_version 292360 (0.0037) [2024-06-22 21:19:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 4790026240. Throughput: 0: 42907.2. Samples: 4790078100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 21:19:13,214][15401] Updated weights for policy 0, policy_version 292370 (0.0042) [2024-06-22 21:19:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4790190080. Throughput: 0: 42771.1. Samples: 4790337860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 21:19:16,139][15401] Updated weights for policy 0, policy_version 292380 (0.0038) [2024-06-22 21:19:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4790435840. Throughput: 0: 42709.6. Samples: 4790586840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 21:19:21,084][15401] Updated weights for policy 0, policy_version 292390 (0.0046) [2024-06-22 21:19:22,348][15349] Signal inference workers to stop experience collection... (70800 times) [2024-06-22 21:19:22,348][15349] Signal inference workers to resume experience collection... (70800 times) [2024-06-22 21:19:22,364][15401] InferenceWorker_p0-w0: stopping experience collection (70800 times) [2024-06-22 21:19:22,364][15401] InferenceWorker_p0-w0: resuming experience collection (70800 times) [2024-06-22 21:19:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42600.0, 300 sec: 42932.0). Total num frames: 4790648832. Throughput: 0: 42849.2. Samples: 4790719640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 21:19:23,908][15401] Updated weights for policy 0, policy_version 292400 (0.0034) [2024-06-22 21:19:28,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42820.6). Total num frames: 4790829056. Throughput: 0: 42531.1. Samples: 4790969260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:28,392][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 21:19:28,604][15401] Updated weights for policy 0, policy_version 292410 (0.0034) [2024-06-22 21:19:31,528][15401] Updated weights for policy 0, policy_version 292420 (0.0031) [2024-06-22 21:19:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 4791074816. Throughput: 0: 42461.3. Samples: 4791217700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 21:19:36,323][15401] Updated weights for policy 0, policy_version 292430 (0.0039) [2024-06-22 21:19:38,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4791287808. Throughput: 0: 42850.1. Samples: 4791360280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 21:19:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 21:19:39,110][15401] Updated weights for policy 0, policy_version 292440 (0.0024) [2024-06-22 21:19:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4791468032. Throughput: 0: 42553.1. Samples: 4791608780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:19:43,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 21:19:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292448_4791468032.pth... [2024-06-22 21:19:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000291821_4781195264.pth [2024-06-22 21:19:43,833][15401] Updated weights for policy 0, policy_version 292450 (0.0038) [2024-06-22 21:19:46,948][15401] Updated weights for policy 0, policy_version 292460 (0.0038) [2024-06-22 21:19:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4791713792. Throughput: 0: 42568.8. Samples: 4791861180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:19:48,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 21:19:51,533][15401] Updated weights for policy 0, policy_version 292470 (0.0031) [2024-06-22 21:19:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 4791894016. Throughput: 0: 42695.6. Samples: 4791999400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:19:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 21:19:54,768][15401] Updated weights for policy 0, policy_version 292480 (0.0038) [2024-06-22 21:19:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 4792123392. Throughput: 0: 42420.7. Samples: 4792246800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:19:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 21:19:59,386][15401] Updated weights for policy 0, policy_version 292490 (0.0033) [2024-06-22 21:20:02,320][15401] Updated weights for policy 0, policy_version 292500 (0.0039) [2024-06-22 21:20:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4792369152. Throughput: 0: 42487.5. Samples: 4792498780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:03,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 21:20:07,007][15401] Updated weights for policy 0, policy_version 292510 (0.0034) [2024-06-22 21:20:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4792565760. Throughput: 0: 42669.9. Samples: 4792639780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 21:20:09,845][15401] Updated weights for policy 0, policy_version 292520 (0.0035) [2024-06-22 21:20:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4792745984. Throughput: 0: 42592.9. Samples: 4792885840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 21:20:14,668][15401] Updated weights for policy 0, policy_version 292530 (0.0043) [2024-06-22 21:20:17,470][15401] Updated weights for policy 0, policy_version 292540 (0.0034) [2024-06-22 21:20:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4792991744. Throughput: 0: 42739.2. Samples: 4793140960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:18,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-22 21:20:22,280][15401] Updated weights for policy 0, policy_version 292550 (0.0036) [2024-06-22 21:20:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4793204736. Throughput: 0: 42568.1. Samples: 4793275840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-22 21:20:25,146][15401] Updated weights for policy 0, policy_version 292560 (0.0040) [2024-06-22 21:20:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42709.7). Total num frames: 4793401344. Throughput: 0: 42701.0. Samples: 4793530320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 21:20:29,888][15401] Updated weights for policy 0, policy_version 292570 (0.0039) [2024-06-22 21:20:32,547][15349] Signal inference workers to stop experience collection... (70850 times) [2024-06-22 21:20:32,556][15349] Signal inference workers to resume experience collection... (70850 times) [2024-06-22 21:20:32,557][15401] InferenceWorker_p0-w0: stopping experience collection (70850 times) [2024-06-22 21:20:32,571][15401] InferenceWorker_p0-w0: resuming experience collection (70850 times) [2024-06-22 21:20:32,705][15401] Updated weights for policy 0, policy_version 292580 (0.0039) [2024-06-22 21:20:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4793630720. Throughput: 0: 42848.8. Samples: 4793789380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 21:20:37,369][15401] Updated weights for policy 0, policy_version 292590 (0.0028) [2024-06-22 21:20:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4793843712. Throughput: 0: 42745.7. Samples: 4793922960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 21:20:40,651][15401] Updated weights for policy 0, policy_version 292600 (0.0023) [2024-06-22 21:20:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 4794056704. Throughput: 0: 42946.6. Samples: 4794179400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 21:20:44,861][15401] Updated weights for policy 0, policy_version 292610 (0.0026) [2024-06-22 21:20:48,159][15401] Updated weights for policy 0, policy_version 292620 (0.0045) [2024-06-22 21:20:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4794286080. Throughput: 0: 43102.8. Samples: 4794438400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 21:20:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 21:20:52,407][15401] Updated weights for policy 0, policy_version 292630 (0.0021) [2024-06-22 21:20:53,394][15132] Fps is (10 sec: 42580.7, 60 sec: 43141.4, 300 sec: 42819.9). Total num frames: 4794482688. Throughput: 0: 43012.3. Samples: 4794575520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:20:53,395][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 21:20:55,983][15401] Updated weights for policy 0, policy_version 292640 (0.0026) [2024-06-22 21:20:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4794695680. Throughput: 0: 43060.9. Samples: 4794823580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:20:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 21:21:00,259][15401] Updated weights for policy 0, policy_version 292650 (0.0035) [2024-06-22 21:21:03,390][15132] Fps is (10 sec: 44255.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4794925056. Throughput: 0: 43045.3. Samples: 4795078000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 21:21:03,622][15401] Updated weights for policy 0, policy_version 292660 (0.0037) [2024-06-22 21:21:07,725][15401] Updated weights for policy 0, policy_version 292670 (0.0026) [2024-06-22 21:21:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 4795121664. Throughput: 0: 43007.8. Samples: 4795211200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 21:21:11,288][15401] Updated weights for policy 0, policy_version 292680 (0.0027) [2024-06-22 21:21:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 4795334656. Throughput: 0: 42900.0. Samples: 4795460820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 21:21:15,247][15401] Updated weights for policy 0, policy_version 292690 (0.0032) [2024-06-22 21:21:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4795564032. Throughput: 0: 43009.3. Samples: 4795724800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 21:21:18,809][15401] Updated weights for policy 0, policy_version 292700 (0.0032) [2024-06-22 21:21:22,858][15401] Updated weights for policy 0, policy_version 292710 (0.0029) [2024-06-22 21:21:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4795760640. Throughput: 0: 42904.4. Samples: 4795853660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 21:21:26,351][15401] Updated weights for policy 0, policy_version 292720 (0.0033) [2024-06-22 21:21:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4795973632. Throughput: 0: 42653.5. Samples: 4796098800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 21:21:30,764][15401] Updated weights for policy 0, policy_version 292730 (0.0035) [2024-06-22 21:21:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 4796203008. Throughput: 0: 42698.2. Samples: 4796359820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 21:21:34,047][15401] Updated weights for policy 0, policy_version 292740 (0.0032) [2024-06-22 21:21:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4796399616. Throughput: 0: 42407.7. Samples: 4796483680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:38,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-22 21:21:38,490][15401] Updated weights for policy 0, policy_version 292750 (0.0032) [2024-06-22 21:21:42,038][15401] Updated weights for policy 0, policy_version 292760 (0.0027) [2024-06-22 21:21:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.9, 300 sec: 42875.8). Total num frames: 4796628992. Throughput: 0: 42555.2. Samples: 4796738660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:43,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 21:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292763_4796628992.pth... [2024-06-22 21:21:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292136_4786356224.pth [2024-06-22 21:21:46,612][15401] Updated weights for policy 0, policy_version 292770 (0.0033) [2024-06-22 21:21:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4796825600. Throughput: 0: 42703.1. Samples: 4796999640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 21:21:49,738][15401] Updated weights for policy 0, policy_version 292780 (0.0040) [2024-06-22 21:21:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 4797054976. Throughput: 0: 42610.3. Samples: 4797128660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 21:21:53,819][15401] Updated weights for policy 0, policy_version 292790 (0.0031) [2024-06-22 21:21:57,533][15401] Updated weights for policy 0, policy_version 292800 (0.0039) [2024-06-22 21:21:58,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 4797251584. Throughput: 0: 42753.4. Samples: 4797384820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:21:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 21:22:01,845][15401] Updated weights for policy 0, policy_version 292810 (0.0032) [2024-06-22 21:22:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 4797464576. Throughput: 0: 42628.4. Samples: 4797643180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:22:03,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 21:22:05,035][15401] Updated weights for policy 0, policy_version 292820 (0.0030) [2024-06-22 21:22:06,898][15349] Signal inference workers to stop experience collection... (70900 times) [2024-06-22 21:22:06,902][15349] Signal inference workers to resume experience collection... (70900 times) [2024-06-22 21:22:06,948][15401] InferenceWorker_p0-w0: stopping experience collection (70900 times) [2024-06-22 21:22:06,948][15401] InferenceWorker_p0-w0: resuming experience collection (70900 times) [2024-06-22 21:22:08,389][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4797693952. Throughput: 0: 42681.1. Samples: 4797774300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:08,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 21:22:09,283][15401] Updated weights for policy 0, policy_version 292830 (0.0029) [2024-06-22 21:22:12,986][15401] Updated weights for policy 0, policy_version 292840 (0.0038) [2024-06-22 21:22:13,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4797906944. Throughput: 0: 42840.3. Samples: 4798026720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:13,392][15132] Avg episode reward: [(0, '0.199')] [2024-06-22 21:22:16,766][15401] Updated weights for policy 0, policy_version 292850 (0.0033) [2024-06-22 21:22:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 4798103552. Throughput: 0: 42809.4. Samples: 4798286240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 21:22:20,504][15401] Updated weights for policy 0, policy_version 292860 (0.0032) [2024-06-22 21:22:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 4798349312. Throughput: 0: 42871.4. Samples: 4798412900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 21:22:24,195][15401] Updated weights for policy 0, policy_version 292870 (0.0040) [2024-06-22 21:22:28,035][15401] Updated weights for policy 0, policy_version 292880 (0.0037) [2024-06-22 21:22:28,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42764.4). Total num frames: 4798545920. Throughput: 0: 42957.1. Samples: 4798671900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:28,396][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 21:22:31,813][15401] Updated weights for policy 0, policy_version 292890 (0.0041) [2024-06-22 21:22:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.7). Total num frames: 4798758912. Throughput: 0: 42974.2. Samples: 4798933480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 21:22:35,667][15401] Updated weights for policy 0, policy_version 292900 (0.0034) [2024-06-22 21:22:38,389][15132] Fps is (10 sec: 45904.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4799004672. Throughput: 0: 42832.9. Samples: 4799056140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 21:22:39,463][15401] Updated weights for policy 0, policy_version 292910 (0.0028) [2024-06-22 21:22:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 4799184896. Throughput: 0: 42906.2. Samples: 4799315500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:43,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 21:22:43,459][15401] Updated weights for policy 0, policy_version 292920 (0.0034) [2024-06-22 21:22:47,070][15401] Updated weights for policy 0, policy_version 292930 (0.0048) [2024-06-22 21:22:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4799381504. Throughput: 0: 42753.8. Samples: 4799567000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 21:22:51,323][15401] Updated weights for policy 0, policy_version 292940 (0.0026) [2024-06-22 21:22:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4799610880. Throughput: 0: 42641.4. Samples: 4799693160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 21:22:54,762][15401] Updated weights for policy 0, policy_version 292950 (0.0034) [2024-06-22 21:22:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42598.3, 300 sec: 42764.7). Total num frames: 4799807488. Throughput: 0: 42789.3. Samples: 4799952240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:22:58,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 21:22:58,874][15401] Updated weights for policy 0, policy_version 292960 (0.0030) [2024-06-22 21:23:02,275][15401] Updated weights for policy 0, policy_version 292970 (0.0034) [2024-06-22 21:23:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4800036864. Throughput: 0: 42565.3. Samples: 4800201680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:23:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 21:23:06,407][15401] Updated weights for policy 0, policy_version 292980 (0.0032) [2024-06-22 21:23:08,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4800266240. Throughput: 0: 42718.2. Samples: 4800335220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:23:08,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 21:23:09,822][15401] Updated weights for policy 0, policy_version 292990 (0.0036) [2024-06-22 21:23:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 4800446464. Throughput: 0: 42718.9. Samples: 4800593980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:23:13,390][15132] Avg episode reward: [(0, '0.111')] [2024-06-22 21:23:13,944][15401] Updated weights for policy 0, policy_version 293000 (0.0051) [2024-06-22 21:23:15,933][15349] Signal inference workers to stop experience collection... (70950 times) [2024-06-22 21:23:15,941][15349] Signal inference workers to resume experience collection... (70950 times) [2024-06-22 21:23:15,967][15401] InferenceWorker_p0-w0: stopping experience collection (70950 times) [2024-06-22 21:23:15,967][15401] InferenceWorker_p0-w0: resuming experience collection (70950 times) [2024-06-22 21:23:17,375][15401] Updated weights for policy 0, policy_version 293010 (0.0039) [2024-06-22 21:23:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 4800692224. Throughput: 0: 42393.9. Samples: 4800841200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 21:23:18,390][15132] Avg episode reward: [(0, '0.153')] [2024-06-22 21:23:21,509][15401] Updated weights for policy 0, policy_version 293020 (0.0031) [2024-06-22 21:23:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4800888832. Throughput: 0: 42606.6. Samples: 4800973440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 21:23:25,260][15401] Updated weights for policy 0, policy_version 293030 (0.0036) [2024-06-22 21:23:28,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42602.8, 300 sec: 42765.0). Total num frames: 4801101824. Throughput: 0: 42565.6. Samples: 4801230960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:28,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 21:23:29,084][15401] Updated weights for policy 0, policy_version 293040 (0.0035) [2024-06-22 21:23:32,935][15401] Updated weights for policy 0, policy_version 293050 (0.0031) [2024-06-22 21:23:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4801331200. Throughput: 0: 42475.5. Samples: 4801478400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:33,391][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 21:23:36,754][15401] Updated weights for policy 0, policy_version 293060 (0.0039) [2024-06-22 21:23:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 4801511424. Throughput: 0: 42743.4. Samples: 4801616620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 21:23:40,555][15401] Updated weights for policy 0, policy_version 293070 (0.0038) [2024-06-22 21:23:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4801740800. Throughput: 0: 42604.0. Samples: 4801869320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 21:23:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293076_4801757184.pth... [2024-06-22 21:23:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292448_4791468032.pth [2024-06-22 21:23:45,080][15401] Updated weights for policy 0, policy_version 293080 (0.0026) [2024-06-22 21:23:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4801970176. Throughput: 0: 42701.8. Samples: 4802123260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:48,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 21:23:48,623][15401] Updated weights for policy 0, policy_version 293090 (0.0037) [2024-06-22 21:23:52,488][15401] Updated weights for policy 0, policy_version 293100 (0.0033) [2024-06-22 21:23:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 4802150400. Throughput: 0: 42614.2. Samples: 4802252860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 21:23:56,539][15401] Updated weights for policy 0, policy_version 293110 (0.0028) [2024-06-22 21:23:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 4802396160. Throughput: 0: 42644.5. Samples: 4802512980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:23:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 21:23:59,950][15401] Updated weights for policy 0, policy_version 293120 (0.0028) [2024-06-22 21:24:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4802609152. Throughput: 0: 42787.0. Samples: 4802766620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:03,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 21:24:03,998][15401] Updated weights for policy 0, policy_version 293130 (0.0042) [2024-06-22 21:24:07,509][15401] Updated weights for policy 0, policy_version 293140 (0.0038) [2024-06-22 21:24:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4802805760. Throughput: 0: 42809.7. Samples: 4802899880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 21:24:11,710][15401] Updated weights for policy 0, policy_version 293150 (0.0032) [2024-06-22 21:24:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4803018752. Throughput: 0: 42752.6. Samples: 4803154820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:13,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-22 21:24:15,342][15401] Updated weights for policy 0, policy_version 293160 (0.0028) [2024-06-22 21:24:18,391][15132] Fps is (10 sec: 45868.1, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 4803264512. Throughput: 0: 43030.5. Samples: 4803414840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:18,392][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 21:24:19,246][15401] Updated weights for policy 0, policy_version 293170 (0.0036) [2024-06-22 21:24:23,328][15401] Updated weights for policy 0, policy_version 293180 (0.0042) [2024-06-22 21:24:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 4803461120. Throughput: 0: 42798.7. Samples: 4803542560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 21:24:25,616][15349] Signal inference workers to stop experience collection... (71000 times) [2024-06-22 21:24:25,616][15349] Signal inference workers to resume experience collection... (71000 times) [2024-06-22 21:24:25,635][15401] InferenceWorker_p0-w0: stopping experience collection (71000 times) [2024-06-22 21:24:25,635][15401] InferenceWorker_p0-w0: resuming experience collection (71000 times) [2024-06-22 21:24:26,739][15401] Updated weights for policy 0, policy_version 293190 (0.0039) [2024-06-22 21:24:28,390][15132] Fps is (10 sec: 40966.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4803674112. Throughput: 0: 42738.2. Samples: 4803792540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 21:24:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 21:24:30,886][15401] Updated weights for policy 0, policy_version 293200 (0.0032) [2024-06-22 21:24:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4803887104. Throughput: 0: 43128.9. Samples: 4804064060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 21:24:34,327][15401] Updated weights for policy 0, policy_version 293210 (0.0028) [2024-06-22 21:24:38,323][15401] Updated weights for policy 0, policy_version 293220 (0.0031) [2024-06-22 21:24:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4804116480. Throughput: 0: 43047.5. Samples: 4804190000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:38,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 21:24:41,761][15401] Updated weights for policy 0, policy_version 293230 (0.0040) [2024-06-22 21:24:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4804329472. Throughput: 0: 42831.0. Samples: 4804440380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 21:24:46,255][15401] Updated weights for policy 0, policy_version 293240 (0.0028) [2024-06-22 21:24:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4804526080. Throughput: 0: 43133.5. Samples: 4804707620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 21:24:49,311][15401] Updated weights for policy 0, policy_version 293250 (0.0023) [2024-06-22 21:24:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4804739072. Throughput: 0: 42777.4. Samples: 4804824860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 21:24:53,983][15401] Updated weights for policy 0, policy_version 293260 (0.0031) [2024-06-22 21:24:57,321][15401] Updated weights for policy 0, policy_version 293270 (0.0038) [2024-06-22 21:24:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4804968448. Throughput: 0: 42812.0. Samples: 4805081360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:24:58,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 21:25:01,719][15401] Updated weights for policy 0, policy_version 293280 (0.0047) [2024-06-22 21:25:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4805148672. Throughput: 0: 42791.8. Samples: 4805340400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 21:25:05,130][15401] Updated weights for policy 0, policy_version 293290 (0.0026) [2024-06-22 21:25:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4805394432. Throughput: 0: 42696.5. Samples: 4805463900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:08,399][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 21:25:09,332][15401] Updated weights for policy 0, policy_version 293300 (0.0043) [2024-06-22 21:25:12,879][15401] Updated weights for policy 0, policy_version 293310 (0.0029) [2024-06-22 21:25:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4805607424. Throughput: 0: 42916.5. Samples: 4805723780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 21:25:16,913][15401] Updated weights for policy 0, policy_version 293320 (0.0033) [2024-06-22 21:25:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42053.5, 300 sec: 42653.9). Total num frames: 4805787648. Throughput: 0: 42578.7. Samples: 4805980100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 21:25:20,683][15401] Updated weights for policy 0, policy_version 293330 (0.0027) [2024-06-22 21:25:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4806049792. Throughput: 0: 42411.2. Samples: 4806098500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 21:25:24,463][15401] Updated weights for policy 0, policy_version 293340 (0.0026) [2024-06-22 21:25:28,373][15401] Updated weights for policy 0, policy_version 293350 (0.0033) [2024-06-22 21:25:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4806246400. Throughput: 0: 42643.7. Samples: 4806359340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 21:25:32,011][15401] Updated weights for policy 0, policy_version 293360 (0.0032) [2024-06-22 21:25:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4806443008. Throughput: 0: 42450.1. Samples: 4806617880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:33,404][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 21:25:35,514][15349] Signal inference workers to stop experience collection... (71050 times) [2024-06-22 21:25:35,544][15401] InferenceWorker_p0-w0: stopping experience collection (71050 times) [2024-06-22 21:25:35,561][15349] Signal inference workers to resume experience collection... (71050 times) [2024-06-22 21:25:35,562][15401] InferenceWorker_p0-w0: resuming experience collection (71050 times) [2024-06-22 21:25:36,170][15401] Updated weights for policy 0, policy_version 293370 (0.0034) [2024-06-22 21:25:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4806688768. Throughput: 0: 42580.5. Samples: 4806740980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 21:25:39,700][15401] Updated weights for policy 0, policy_version 293380 (0.0036) [2024-06-22 21:25:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 4806836224. Throughput: 0: 42473.8. Samples: 4806992680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 21:25:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 21:25:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293387_4806852608.pth... [2024-06-22 21:25:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000292763_4796628992.pth [2024-06-22 21:25:44,165][15401] Updated weights for policy 0, policy_version 293390 (0.0041) [2024-06-22 21:25:47,489][15401] Updated weights for policy 0, policy_version 293400 (0.0039) [2024-06-22 21:25:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42654.6). Total num frames: 4807065600. Throughput: 0: 42229.3. Samples: 4807240720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:25:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 21:25:51,773][15401] Updated weights for policy 0, policy_version 293410 (0.0034) [2024-06-22 21:25:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4807311360. Throughput: 0: 42456.0. Samples: 4807374420. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:25:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 21:25:55,681][15401] Updated weights for policy 0, policy_version 293420 (0.0041) [2024-06-22 21:25:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4807491584. Throughput: 0: 42343.1. Samples: 4807629220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:25:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 21:25:59,439][15401] Updated weights for policy 0, policy_version 293430 (0.0039) [2024-06-22 21:26:03,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 4807704576. Throughput: 0: 42181.1. Samples: 4807878260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 21:26:03,448][15401] Updated weights for policy 0, policy_version 293440 (0.0032) [2024-06-22 21:26:07,090][15401] Updated weights for policy 0, policy_version 293450 (0.0047) [2024-06-22 21:26:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4807950336. Throughput: 0: 42584.7. Samples: 4808014820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 21:26:11,374][15401] Updated weights for policy 0, policy_version 293460 (0.0041) [2024-06-22 21:26:13,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4808130560. Throughput: 0: 42425.3. Samples: 4808268480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 21:26:14,700][15401] Updated weights for policy 0, policy_version 293470 (0.0037) [2024-06-22 21:26:18,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4808343552. Throughput: 0: 42224.6. Samples: 4808517980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 21:26:18,948][15401] Updated weights for policy 0, policy_version 293480 (0.0045) [2024-06-22 21:26:22,359][15401] Updated weights for policy 0, policy_version 293490 (0.0033) [2024-06-22 21:26:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4808572928. Throughput: 0: 42462.6. Samples: 4808651800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 21:26:26,461][15401] Updated weights for policy 0, policy_version 293500 (0.0031) [2024-06-22 21:26:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4808769536. Throughput: 0: 42599.9. Samples: 4808909680. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 21:26:29,872][15401] Updated weights for policy 0, policy_version 293510 (0.0036) [2024-06-22 21:26:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4808998912. Throughput: 0: 42583.6. Samples: 4809156980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:33,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 21:26:34,415][15401] Updated weights for policy 0, policy_version 293520 (0.0031) [2024-06-22 21:26:37,552][15401] Updated weights for policy 0, policy_version 293530 (0.0037) [2024-06-22 21:26:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 4809211904. Throughput: 0: 42623.9. Samples: 4809292500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:38,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 21:26:41,946][15401] Updated weights for policy 0, policy_version 293540 (0.0042) [2024-06-22 21:26:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4809408512. Throughput: 0: 42624.1. Samples: 4809547300. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 21:26:45,285][15401] Updated weights for policy 0, policy_version 293550 (0.0028) [2024-06-22 21:26:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4809637888. Throughput: 0: 42590.5. Samples: 4809794820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:48,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 21:26:49,686][15401] Updated weights for policy 0, policy_version 293560 (0.0033) [2024-06-22 21:26:52,864][15349] Signal inference workers to stop experience collection... (71100 times) [2024-06-22 21:26:52,864][15349] Signal inference workers to resume experience collection... (71100 times) [2024-06-22 21:26:52,894][15401] InferenceWorker_p0-w0: stopping experience collection (71100 times) [2024-06-22 21:26:52,894][15401] InferenceWorker_p0-w0: resuming experience collection (71100 times) [2024-06-22 21:26:53,001][15401] Updated weights for policy 0, policy_version 293570 (0.0045) [2024-06-22 21:26:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 4809850880. Throughput: 0: 42448.9. Samples: 4809925020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 21:26:57,305][15401] Updated weights for policy 0, policy_version 293580 (0.0038) [2024-06-22 21:26:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 4810031104. Throughput: 0: 42545.4. Samples: 4810183020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 25.0) [2024-06-22 21:26:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 21:27:00,580][15401] Updated weights for policy 0, policy_version 293590 (0.0024) [2024-06-22 21:27:03,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 4810293248. Throughput: 0: 42646.5. Samples: 4810437180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:03,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 21:27:04,899][15401] Updated weights for policy 0, policy_version 293600 (0.0041) [2024-06-22 21:27:08,219][15401] Updated weights for policy 0, policy_version 293610 (0.0031) [2024-06-22 21:27:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 4810506240. Throughput: 0: 42704.1. Samples: 4810573480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 21:27:12,489][15401] Updated weights for policy 0, policy_version 293620 (0.0055) [2024-06-22 21:27:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4810686464. Throughput: 0: 42604.6. Samples: 4810826880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 21:27:15,984][15401] Updated weights for policy 0, policy_version 293630 (0.0030) [2024-06-22 21:27:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4810932224. Throughput: 0: 42741.7. Samples: 4811080360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 21:27:20,055][15401] Updated weights for policy 0, policy_version 293640 (0.0030) [2024-06-22 21:27:23,390][15132] Fps is (10 sec: 45871.0, 60 sec: 42870.9, 300 sec: 42710.3). Total num frames: 4811145216. Throughput: 0: 42700.6. Samples: 4811214060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:23,391][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 21:27:23,559][15401] Updated weights for policy 0, policy_version 293650 (0.0034) [2024-06-22 21:27:27,556][15401] Updated weights for policy 0, policy_version 293660 (0.0044) [2024-06-22 21:27:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4811341824. Throughput: 0: 42752.8. Samples: 4811471180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 21:27:31,225][15401] Updated weights for policy 0, policy_version 293670 (0.0026) [2024-06-22 21:27:33,390][15132] Fps is (10 sec: 42601.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4811571200. Throughput: 0: 42716.7. Samples: 4811717080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 21:27:35,326][15401] Updated weights for policy 0, policy_version 293680 (0.0027) [2024-06-22 21:27:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4811767808. Throughput: 0: 42935.8. Samples: 4811857120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 21:27:39,269][15401] Updated weights for policy 0, policy_version 293690 (0.0036) [2024-06-22 21:27:42,787][15401] Updated weights for policy 0, policy_version 293700 (0.0029) [2024-06-22 21:27:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4811997184. Throughput: 0: 42891.9. Samples: 4812113160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:43,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 21:27:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293701_4811997184.pth... [2024-06-22 21:27:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293076_4801757184.pth [2024-06-22 21:27:46,913][15401] Updated weights for policy 0, policy_version 293710 (0.0034) [2024-06-22 21:27:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4812226560. Throughput: 0: 42708.4. Samples: 4812358960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:48,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 21:27:50,412][15401] Updated weights for policy 0, policy_version 293720 (0.0034) [2024-06-22 21:27:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 4812406784. Throughput: 0: 42816.8. Samples: 4812500240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 21:27:54,472][15401] Updated weights for policy 0, policy_version 293730 (0.0033) [2024-06-22 21:27:58,297][15401] Updated weights for policy 0, policy_version 293740 (0.0037) [2024-06-22 21:27:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 4812636160. Throughput: 0: 42917.7. Samples: 4812758180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:27:58,390][15132] Avg episode reward: [(0, '0.114')] [2024-06-22 21:28:02,215][15401] Updated weights for policy 0, policy_version 293750 (0.0041) [2024-06-22 21:28:03,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 4812881920. Throughput: 0: 42708.5. Samples: 4813002240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:28:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 21:28:06,117][15401] Updated weights for policy 0, policy_version 293760 (0.0036) [2024-06-22 21:28:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4813045760. Throughput: 0: 42786.6. Samples: 4813139420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-22 21:28:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 21:28:09,767][15401] Updated weights for policy 0, policy_version 293770 (0.0025) [2024-06-22 21:28:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4813275136. Throughput: 0: 42842.7. Samples: 4813399100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 21:28:13,776][15401] Updated weights for policy 0, policy_version 293780 (0.0026) [2024-06-22 21:28:17,194][15349] Signal inference workers to stop experience collection... (71150 times) [2024-06-22 21:28:17,244][15349] Signal inference workers to resume experience collection... (71150 times) [2024-06-22 21:28:17,245][15401] InferenceWorker_p0-w0: stopping experience collection (71150 times) [2024-06-22 21:28:17,262][15401] InferenceWorker_p0-w0: resuming experience collection (71150 times) [2024-06-22 21:28:17,382][15401] Updated weights for policy 0, policy_version 293790 (0.0030) [2024-06-22 21:28:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4813520896. Throughput: 0: 42854.7. Samples: 4813645540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 21:28:21,189][15401] Updated weights for policy 0, policy_version 293800 (0.0041) [2024-06-22 21:28:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 4813701120. Throughput: 0: 42694.6. Samples: 4813778380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 21:28:25,009][15401] Updated weights for policy 0, policy_version 293810 (0.0035) [2024-06-22 21:28:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4813914112. Throughput: 0: 42692.5. Samples: 4814034320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-22 21:28:28,803][15401] Updated weights for policy 0, policy_version 293820 (0.0041) [2024-06-22 21:28:32,951][15401] Updated weights for policy 0, policy_version 293830 (0.0046) [2024-06-22 21:28:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4814143488. Throughput: 0: 42875.1. Samples: 4814288340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 21:28:37,078][15401] Updated weights for policy 0, policy_version 293840 (0.0043) [2024-06-22 21:28:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4814340096. Throughput: 0: 42630.6. Samples: 4814418720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:38,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 21:28:40,440][15401] Updated weights for policy 0, policy_version 293850 (0.0030) [2024-06-22 21:28:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4814553088. Throughput: 0: 42593.8. Samples: 4814674900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 21:28:44,558][15401] Updated weights for policy 0, policy_version 293860 (0.0029) [2024-06-22 21:28:48,118][15401] Updated weights for policy 0, policy_version 293870 (0.0028) [2024-06-22 21:28:48,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4814766080. Throughput: 0: 42870.4. Samples: 4814931400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 21:28:52,024][15401] Updated weights for policy 0, policy_version 293880 (0.0023) [2024-06-22 21:28:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4814979072. Throughput: 0: 42689.8. Samples: 4815060460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 21:28:55,706][15401] Updated weights for policy 0, policy_version 293890 (0.0037) [2024-06-22 21:28:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4815192064. Throughput: 0: 42735.7. Samples: 4815322200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:28:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 21:28:59,503][15401] Updated weights for policy 0, policy_version 293900 (0.0026) [2024-06-22 21:29:03,307][15401] Updated weights for policy 0, policy_version 293910 (0.0049) [2024-06-22 21:29:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4815421440. Throughput: 0: 42977.2. Samples: 4815579520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:29:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-22 21:29:07,281][15401] Updated weights for policy 0, policy_version 293920 (0.0034) [2024-06-22 21:29:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4815634432. Throughput: 0: 42946.1. Samples: 4815710960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:29:08,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 21:29:10,979][15401] Updated weights for policy 0, policy_version 293930 (0.0036) [2024-06-22 21:29:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 4815847424. Throughput: 0: 42892.0. Samples: 4815964460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:29:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 21:29:15,115][15401] Updated weights for policy 0, policy_version 293940 (0.0025) [2024-06-22 21:29:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4816060416. Throughput: 0: 42944.0. Samples: 4816220820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:29:18,395][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 21:29:18,721][15401] Updated weights for policy 0, policy_version 293950 (0.0032) [2024-06-22 21:29:22,652][15401] Updated weights for policy 0, policy_version 293960 (0.0037) [2024-06-22 21:29:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4816289792. Throughput: 0: 42977.8. Samples: 4816352620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 21:29:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-22 21:29:26,116][15401] Updated weights for policy 0, policy_version 293970 (0.0034) [2024-06-22 21:29:26,660][15349] Signal inference workers to stop experience collection... (71200 times) [2024-06-22 21:29:26,660][15349] Signal inference workers to resume experience collection... (71200 times) [2024-06-22 21:29:26,694][15401] InferenceWorker_p0-w0: stopping experience collection (71200 times) [2024-06-22 21:29:26,694][15401] InferenceWorker_p0-w0: resuming experience collection (71200 times) [2024-06-22 21:29:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4816486400. Throughput: 0: 42979.5. Samples: 4816608980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-22 21:29:29,969][15401] Updated weights for policy 0, policy_version 293980 (0.0028) [2024-06-22 21:29:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4816699392. Throughput: 0: 43172.3. Samples: 4816874160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 21:29:33,574][15401] Updated weights for policy 0, policy_version 293990 (0.0038) [2024-06-22 21:29:37,788][15401] Updated weights for policy 0, policy_version 294000 (0.0038) [2024-06-22 21:29:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 4816945152. Throughput: 0: 43188.8. Samples: 4817003960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 21:29:41,348][15401] Updated weights for policy 0, policy_version 294010 (0.0036) [2024-06-22 21:29:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4817141760. Throughput: 0: 43043.1. Samples: 4817259140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 21:29:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294015_4817141760.pth... [2024-06-22 21:29:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293387_4806852608.pth [2024-06-22 21:29:45,176][15401] Updated weights for policy 0, policy_version 294020 (0.0037) [2024-06-22 21:29:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4817354752. Throughput: 0: 43022.0. Samples: 4817515500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 21:29:48,773][15401] Updated weights for policy 0, policy_version 294030 (0.0028) [2024-06-22 21:29:52,650][15401] Updated weights for policy 0, policy_version 294040 (0.0032) [2024-06-22 21:29:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4817567744. Throughput: 0: 43111.6. Samples: 4817650980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 21:29:56,055][15401] Updated weights for policy 0, policy_version 294050 (0.0027) [2024-06-22 21:29:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4817780736. Throughput: 0: 43240.0. Samples: 4817910260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:29:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 21:30:00,548][15401] Updated weights for policy 0, policy_version 294060 (0.0033) [2024-06-22 21:30:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4818010112. Throughput: 0: 43197.3. Samples: 4818164700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 21:30:03,701][15401] Updated weights for policy 0, policy_version 294070 (0.0037) [2024-06-22 21:30:08,144][15401] Updated weights for policy 0, policy_version 294080 (0.0033) [2024-06-22 21:30:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4818206720. Throughput: 0: 43110.6. Samples: 4818292600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 21:30:11,591][15401] Updated weights for policy 0, policy_version 294090 (0.0028) [2024-06-22 21:30:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4818436096. Throughput: 0: 43120.5. Samples: 4818549400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 21:30:15,645][15401] Updated weights for policy 0, policy_version 294100 (0.0026) [2024-06-22 21:30:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4818665472. Throughput: 0: 42924.0. Samples: 4818805740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 21:30:19,142][15401] Updated weights for policy 0, policy_version 294110 (0.0037) [2024-06-22 21:30:23,196][15401] Updated weights for policy 0, policy_version 294120 (0.0030) [2024-06-22 21:30:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4818878464. Throughput: 0: 43003.5. Samples: 4818939220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:23,392][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 21:30:26,781][15401] Updated weights for policy 0, policy_version 294130 (0.0031) [2024-06-22 21:30:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4819075072. Throughput: 0: 42843.5. Samples: 4819187100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:28,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-22 21:30:30,754][15401] Updated weights for policy 0, policy_version 294140 (0.0028) [2024-06-22 21:30:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4819271680. Throughput: 0: 43010.7. Samples: 4819450980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 21:30:34,517][15401] Updated weights for policy 0, policy_version 294150 (0.0030) [2024-06-22 21:30:38,295][15401] Updated weights for policy 0, policy_version 294160 (0.0030) [2024-06-22 21:30:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4819517440. Throughput: 0: 42884.9. Samples: 4819580800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 21:30:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 21:30:42,019][15401] Updated weights for policy 0, policy_version 294170 (0.0035) [2024-06-22 21:30:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4819714048. Throughput: 0: 42694.3. Samples: 4819831500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:30:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 21:30:46,405][15401] Updated weights for policy 0, policy_version 294180 (0.0031) [2024-06-22 21:30:46,682][15349] Signal inference workers to stop experience collection... (71250 times) [2024-06-22 21:30:46,682][15349] Signal inference workers to resume experience collection... (71250 times) [2024-06-22 21:30:46,730][15401] InferenceWorker_p0-w0: stopping experience collection (71250 times) [2024-06-22 21:30:46,730][15401] InferenceWorker_p0-w0: resuming experience collection (71250 times) [2024-06-22 21:30:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 4819910656. Throughput: 0: 42887.6. Samples: 4820094740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:30:48,393][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 21:30:49,547][15401] Updated weights for policy 0, policy_version 294190 (0.0041) [2024-06-22 21:30:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4820156416. Throughput: 0: 42920.9. Samples: 4820224040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:30:53,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-22 21:30:53,919][15401] Updated weights for policy 0, policy_version 294200 (0.0033) [2024-06-22 21:30:57,417][15401] Updated weights for policy 0, policy_version 294210 (0.0045) [2024-06-22 21:30:58,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 4820369408. Throughput: 0: 42845.3. Samples: 4820477440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:30:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 21:31:01,295][15401] Updated weights for policy 0, policy_version 294220 (0.0030) [2024-06-22 21:31:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4820566016. Throughput: 0: 43035.1. Samples: 4820742320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 21:31:04,776][15401] Updated weights for policy 0, policy_version 294230 (0.0028) [2024-06-22 21:31:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4820779008. Throughput: 0: 42856.1. Samples: 4820867640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 21:31:09,034][15401] Updated weights for policy 0, policy_version 294240 (0.0041) [2024-06-22 21:31:13,146][15401] Updated weights for policy 0, policy_version 294250 (0.0037) [2024-06-22 21:31:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4820992000. Throughput: 0: 42885.7. Samples: 4821116960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:13,391][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 21:31:16,627][15401] Updated weights for policy 0, policy_version 294260 (0.0025) [2024-06-22 21:31:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4821221376. Throughput: 0: 42853.3. Samples: 4821379380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 21:31:20,803][15401] Updated weights for policy 0, policy_version 294270 (0.0042) [2024-06-22 21:31:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 4821417984. Throughput: 0: 42823.1. Samples: 4821507840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 21:31:24,142][15401] Updated weights for policy 0, policy_version 294280 (0.0045) [2024-06-22 21:31:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4821630976. Throughput: 0: 42823.1. Samples: 4821758540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 21:31:28,533][15401] Updated weights for policy 0, policy_version 294290 (0.0043) [2024-06-22 21:31:31,922][15401] Updated weights for policy 0, policy_version 294300 (0.0032) [2024-06-22 21:31:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4821860352. Throughput: 0: 42898.6. Samples: 4822025080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 21:31:36,107][15401] Updated weights for policy 0, policy_version 294310 (0.0028) [2024-06-22 21:31:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 4822056960. Throughput: 0: 42810.6. Samples: 4822150520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-22 21:31:40,037][15401] Updated weights for policy 0, policy_version 294320 (0.0036) [2024-06-22 21:31:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4822269952. Throughput: 0: 42794.7. Samples: 4822403200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 21:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294329_4822286336.pth... [2024-06-22 21:31:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000293701_4811997184.pth [2024-06-22 21:31:43,703][15401] Updated weights for policy 0, policy_version 294330 (0.0036) [2024-06-22 21:31:47,555][15401] Updated weights for policy 0, policy_version 294340 (0.0024) [2024-06-22 21:31:48,392][15132] Fps is (10 sec: 45865.3, 60 sec: 43417.7, 300 sec: 42931.3). Total num frames: 4822515712. Throughput: 0: 42731.2. Samples: 4822665320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-22 21:31:48,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 21:31:51,370][15401] Updated weights for policy 0, policy_version 294350 (0.0043) [2024-06-22 21:31:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 4822712320. Throughput: 0: 42856.0. Samples: 4822796160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:31:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 21:31:55,101][15401] Updated weights for policy 0, policy_version 294360 (0.0044) [2024-06-22 21:31:58,390][15132] Fps is (10 sec: 40968.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 4822925312. Throughput: 0: 42928.9. Samples: 4823048760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:31:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 21:31:59,144][15401] Updated weights for policy 0, policy_version 294370 (0.0037) [2024-06-22 21:32:02,629][15401] Updated weights for policy 0, policy_version 294380 (0.0037) [2024-06-22 21:32:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 4823171072. Throughput: 0: 42801.8. Samples: 4823305460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 21:32:06,819][15401] Updated weights for policy 0, policy_version 294390 (0.0048) [2024-06-22 21:32:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4823334912. Throughput: 0: 42901.3. Samples: 4823438400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 21:32:10,111][15401] Updated weights for policy 0, policy_version 294400 (0.0027) [2024-06-22 21:32:13,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4823547904. Throughput: 0: 42900.8. Samples: 4823689080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 21:32:14,322][15401] Updated weights for policy 0, policy_version 294410 (0.0028) [2024-06-22 21:32:18,019][15401] Updated weights for policy 0, policy_version 294420 (0.0042) [2024-06-22 21:32:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.7). Total num frames: 4823777280. Throughput: 0: 42652.0. Samples: 4823944420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 21:32:19,661][15349] Signal inference workers to stop experience collection... (71300 times) [2024-06-22 21:32:19,662][15349] Signal inference workers to resume experience collection... (71300 times) [2024-06-22 21:32:19,692][15401] InferenceWorker_p0-w0: stopping experience collection (71300 times) [2024-06-22 21:32:19,692][15401] InferenceWorker_p0-w0: resuming experience collection (71300 times) [2024-06-22 21:32:22,173][15401] Updated weights for policy 0, policy_version 294430 (0.0028) [2024-06-22 21:32:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4823973888. Throughput: 0: 42755.3. Samples: 4824074500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 21:32:25,646][15401] Updated weights for policy 0, policy_version 294440 (0.0036) [2024-06-22 21:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4824203264. Throughput: 0: 42664.9. Samples: 4824323120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 21:32:29,777][15401] Updated weights for policy 0, policy_version 294450 (0.0029) [2024-06-22 21:32:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4824416256. Throughput: 0: 42507.3. Samples: 4824578060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:33,399][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 21:32:33,556][15401] Updated weights for policy 0, policy_version 294460 (0.0028) [2024-06-22 21:32:37,564][15401] Updated weights for policy 0, policy_version 294470 (0.0031) [2024-06-22 21:32:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4824629248. Throughput: 0: 42522.6. Samples: 4824709680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:38,390][15132] Avg episode reward: [(0, '0.141')] [2024-06-22 21:32:41,036][15401] Updated weights for policy 0, policy_version 294480 (0.0036) [2024-06-22 21:32:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4824842240. Throughput: 0: 42545.3. Samples: 4824963300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:43,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-22 21:32:45,193][15401] Updated weights for policy 0, policy_version 294490 (0.0042) [2024-06-22 21:32:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 4825055232. Throughput: 0: 42674.6. Samples: 4825225820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 21:32:48,768][15401] Updated weights for policy 0, policy_version 294500 (0.0039) [2024-06-22 21:32:53,027][15401] Updated weights for policy 0, policy_version 294510 (0.0033) [2024-06-22 21:32:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4825251840. Throughput: 0: 42420.3. Samples: 4825347320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 21:32:56,366][15401] Updated weights for policy 0, policy_version 294520 (0.0026) [2024-06-22 21:32:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4825481216. Throughput: 0: 42537.0. Samples: 4825603240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:32:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 21:33:00,914][15401] Updated weights for policy 0, policy_version 294530 (0.0038) [2024-06-22 21:33:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 4825710592. Throughput: 0: 42607.1. Samples: 4825861740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 21:33:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 21:33:04,118][15401] Updated weights for policy 0, policy_version 294540 (0.0044) [2024-06-22 21:33:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4825890816. Throughput: 0: 42582.9. Samples: 4825990740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 21:33:08,606][15401] Updated weights for policy 0, policy_version 294550 (0.0047) [2024-06-22 21:33:11,569][15401] Updated weights for policy 0, policy_version 294560 (0.0039) [2024-06-22 21:33:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4826120192. Throughput: 0: 42646.5. Samples: 4826242220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:13,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-22 21:33:16,179][15401] Updated weights for policy 0, policy_version 294570 (0.0037) [2024-06-22 21:33:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4826333184. Throughput: 0: 42747.6. Samples: 4826501700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 21:33:19,291][15401] Updated weights for policy 0, policy_version 294580 (0.0042) [2024-06-22 21:33:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4826529792. Throughput: 0: 42745.4. Samples: 4826633220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 21:33:23,950][15401] Updated weights for policy 0, policy_version 294590 (0.0046) [2024-06-22 21:33:26,899][15401] Updated weights for policy 0, policy_version 294600 (0.0038) [2024-06-22 21:33:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4826759168. Throughput: 0: 42653.9. Samples: 4826882720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 21:33:31,551][15401] Updated weights for policy 0, policy_version 294610 (0.0041) [2024-06-22 21:33:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 4826972160. Throughput: 0: 42571.9. Samples: 4827141560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 21:33:34,459][15401] Updated weights for policy 0, policy_version 294620 (0.0030) [2024-06-22 21:33:36,829][15349] Signal inference workers to stop experience collection... (71350 times) [2024-06-22 21:33:36,830][15349] Signal inference workers to resume experience collection... (71350 times) [2024-06-22 21:33:36,866][15401] InferenceWorker_p0-w0: stopping experience collection (71350 times) [2024-06-22 21:33:36,866][15401] InferenceWorker_p0-w0: resuming experience collection (71350 times) [2024-06-22 21:33:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4827185152. Throughput: 0: 42845.5. Samples: 4827275360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 21:33:38,989][15401] Updated weights for policy 0, policy_version 294630 (0.0038) [2024-06-22 21:33:42,252][15401] Updated weights for policy 0, policy_version 294640 (0.0040) [2024-06-22 21:33:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4827414528. Throughput: 0: 42750.6. Samples: 4827527020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 21:33:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294642_4827414528.pth... [2024-06-22 21:33:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294015_4817141760.pth [2024-06-22 21:33:46,753][15401] Updated weights for policy 0, policy_version 294650 (0.0043) [2024-06-22 21:33:48,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 4827627520. Throughput: 0: 42726.4. Samples: 4827784700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:48,397][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 21:33:49,822][15401] Updated weights for policy 0, policy_version 294660 (0.0036) [2024-06-22 21:33:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4827824128. Throughput: 0: 42618.0. Samples: 4827908540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 21:33:54,311][15401] Updated weights for policy 0, policy_version 294670 (0.0040) [2024-06-22 21:33:57,539][15401] Updated weights for policy 0, policy_version 294680 (0.0044) [2024-06-22 21:33:58,390][15132] Fps is (10 sec: 44264.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4828069888. Throughput: 0: 42599.6. Samples: 4828159200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:33:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 21:34:01,942][15401] Updated weights for policy 0, policy_version 294690 (0.0031) [2024-06-22 21:34:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 4828233728. Throughput: 0: 42605.8. Samples: 4828418960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:34:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 21:34:05,350][15401] Updated weights for policy 0, policy_version 294700 (0.0031) [2024-06-22 21:34:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4828446720. Throughput: 0: 42287.9. Samples: 4828536180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:34:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 21:34:09,882][15401] Updated weights for policy 0, policy_version 294710 (0.0039) [2024-06-22 21:34:13,056][15401] Updated weights for policy 0, policy_version 294720 (0.0026) [2024-06-22 21:34:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4828708864. Throughput: 0: 42593.3. Samples: 4828799420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:34:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 21:34:17,823][15401] Updated weights for policy 0, policy_version 294730 (0.0029) [2024-06-22 21:34:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4828872704. Throughput: 0: 42474.7. Samples: 4829052920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-22 21:34:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 21:34:20,773][15401] Updated weights for policy 0, policy_version 294740 (0.0039) [2024-06-22 21:34:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4829085696. Throughput: 0: 42002.6. Samples: 4829165480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 21:34:25,375][15401] Updated weights for policy 0, policy_version 294750 (0.0027) [2024-06-22 21:34:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4829331456. Throughput: 0: 42399.7. Samples: 4829435000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:34:28,496][15401] Updated weights for policy 0, policy_version 294760 (0.0028) [2024-06-22 21:34:32,965][15401] Updated weights for policy 0, policy_version 294770 (0.0029) [2024-06-22 21:34:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4829528064. Throughput: 0: 42207.7. Samples: 4829683780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 21:34:36,607][15401] Updated weights for policy 0, policy_version 294780 (0.0037) [2024-06-22 21:34:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4829724672. Throughput: 0: 42244.3. Samples: 4829809540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 21:34:40,547][15401] Updated weights for policy 0, policy_version 294790 (0.0049) [2024-06-22 21:34:43,305][15349] Signal inference workers to stop experience collection... (71400 times) [2024-06-22 21:34:43,307][15349] Signal inference workers to resume experience collection... (71400 times) [2024-06-22 21:34:43,317][15401] InferenceWorker_p0-w0: stopping experience collection (71400 times) [2024-06-22 21:34:43,350][15401] InferenceWorker_p0-w0: resuming experience collection (71400 times) [2024-06-22 21:34:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4829954048. Throughput: 0: 42464.1. Samples: 4830070080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 21:34:44,311][15401] Updated weights for policy 0, policy_version 294800 (0.0031) [2024-06-22 21:34:48,384][15401] Updated weights for policy 0, policy_version 294810 (0.0042) [2024-06-22 21:34:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 4830167040. Throughput: 0: 42327.3. Samples: 4830323680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 21:34:51,902][15401] Updated weights for policy 0, policy_version 294820 (0.0036) [2024-06-22 21:34:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 4830363648. Throughput: 0: 42546.3. Samples: 4830450760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 21:34:56,032][15401] Updated weights for policy 0, policy_version 294830 (0.0041) [2024-06-22 21:34:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4830609408. Throughput: 0: 42366.2. Samples: 4830705900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:34:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 21:34:59,499][15401] Updated weights for policy 0, policy_version 294840 (0.0030) [2024-06-22 21:35:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4830789632. Throughput: 0: 42591.6. Samples: 4830969540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 21:35:03,689][15401] Updated weights for policy 0, policy_version 294850 (0.0028) [2024-06-22 21:35:07,321][15401] Updated weights for policy 0, policy_version 294860 (0.0046) [2024-06-22 21:35:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4831002624. Throughput: 0: 42801.7. Samples: 4831091560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:08,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 21:35:11,417][15401] Updated weights for policy 0, policy_version 294870 (0.0043) [2024-06-22 21:35:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4831248384. Throughput: 0: 42538.5. Samples: 4831349240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:13,396][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 21:35:15,286][15401] Updated weights for policy 0, policy_version 294880 (0.0036) [2024-06-22 21:35:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 4831412224. Throughput: 0: 42680.4. Samples: 4831604400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 21:35:19,509][15401] Updated weights for policy 0, policy_version 294890 (0.0044) [2024-06-22 21:35:23,057][15401] Updated weights for policy 0, policy_version 294900 (0.0031) [2024-06-22 21:35:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4831641600. Throughput: 0: 42425.4. Samples: 4831718680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:23,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 21:35:27,211][15401] Updated weights for policy 0, policy_version 294910 (0.0039) [2024-06-22 21:35:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4831870976. Throughput: 0: 42516.9. Samples: 4831983340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 21:35:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 21:35:30,647][15401] Updated weights for policy 0, policy_version 294920 (0.0036) [2024-06-22 21:35:33,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 4832051200. Throughput: 0: 42500.7. Samples: 4832236320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:33,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 21:35:34,878][15401] Updated weights for policy 0, policy_version 294930 (0.0041) [2024-06-22 21:35:38,158][15401] Updated weights for policy 0, policy_version 294940 (0.0034) [2024-06-22 21:35:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4832296960. Throughput: 0: 42369.3. Samples: 4832357380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 21:35:42,353][15401] Updated weights for policy 0, policy_version 294950 (0.0032) [2024-06-22 21:35:43,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 4832493568. Throughput: 0: 42695.0. Samples: 4832627180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 21:35:43,468][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294953_4832509952.pth... [2024-06-22 21:35:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294329_4822286336.pth [2024-06-22 21:35:45,650][15401] Updated weights for policy 0, policy_version 294960 (0.0036) [2024-06-22 21:35:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 4832690176. Throughput: 0: 42484.5. Samples: 4832881340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 21:35:50,010][15401] Updated weights for policy 0, policy_version 294970 (0.0027) [2024-06-22 21:35:53,371][15401] Updated weights for policy 0, policy_version 294980 (0.0048) [2024-06-22 21:35:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4832952320. Throughput: 0: 42508.9. Samples: 4833004460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 21:35:57,830][15401] Updated weights for policy 0, policy_version 294990 (0.0053) [2024-06-22 21:35:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4833132544. Throughput: 0: 42651.6. Samples: 4833268560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:35:58,399][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 21:35:58,581][15349] Signal inference workers to stop experience collection... (71450 times) [2024-06-22 21:35:58,583][15349] Signal inference workers to resume experience collection... (71450 times) [2024-06-22 21:35:58,602][15401] InferenceWorker_p0-w0: stopping experience collection (71450 times) [2024-06-22 21:35:58,635][15401] InferenceWorker_p0-w0: resuming experience collection (71450 times) [2024-06-22 21:36:00,866][15401] Updated weights for policy 0, policy_version 295000 (0.0051) [2024-06-22 21:36:03,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4833312768. Throughput: 0: 42637.5. Samples: 4833523080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 21:36:05,562][15401] Updated weights for policy 0, policy_version 295010 (0.0038) [2024-06-22 21:36:08,392][15132] Fps is (10 sec: 45865.3, 60 sec: 43143.0, 300 sec: 42709.2). Total num frames: 4833591296. Throughput: 0: 42748.1. Samples: 4833642440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:08,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 21:36:08,898][15401] Updated weights for policy 0, policy_version 295020 (0.0046) [2024-06-22 21:36:13,177][15401] Updated weights for policy 0, policy_version 295030 (0.0043) [2024-06-22 21:36:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 4833771520. Throughput: 0: 42571.0. Samples: 4833899040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 21:36:16,831][15401] Updated weights for policy 0, policy_version 295040 (0.0037) [2024-06-22 21:36:18,390][15132] Fps is (10 sec: 36052.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4833951744. Throughput: 0: 42665.8. Samples: 4834156180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:18,392][15132] Avg episode reward: [(0, '0.226')] [2024-06-22 21:36:20,826][15401] Updated weights for policy 0, policy_version 295050 (0.0024) [2024-06-22 21:36:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4834213888. Throughput: 0: 42642.2. Samples: 4834276280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 21:36:24,686][15401] Updated weights for policy 0, policy_version 295060 (0.0040) [2024-06-22 21:36:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4834410496. Throughput: 0: 42434.4. Samples: 4834536720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 21:36:28,644][15401] Updated weights for policy 0, policy_version 295070 (0.0035) [2024-06-22 21:36:32,667][15401] Updated weights for policy 0, policy_version 295080 (0.0042) [2024-06-22 21:36:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 4834607104. Throughput: 0: 42560.4. Samples: 4834796560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 21:36:36,132][15401] Updated weights for policy 0, policy_version 295090 (0.0033) [2024-06-22 21:36:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4834852864. Throughput: 0: 42546.7. Samples: 4834919060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 21:36:40,234][15401] Updated weights for policy 0, policy_version 295100 (0.0030) [2024-06-22 21:36:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 4835049472. Throughput: 0: 42485.3. Samples: 4835180400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 21:36:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 21:36:43,640][15401] Updated weights for policy 0, policy_version 295110 (0.0029) [2024-06-22 21:36:47,639][15401] Updated weights for policy 0, policy_version 295120 (0.0033) [2024-06-22 21:36:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4835246080. Throughput: 0: 42607.8. Samples: 4835440440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:36:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 21:36:51,104][15401] Updated weights for policy 0, policy_version 295130 (0.0043) [2024-06-22 21:36:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4835491840. Throughput: 0: 42637.6. Samples: 4835561040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:36:53,390][15132] Avg episode reward: [(0, '0.143')] [2024-06-22 21:36:55,170][15401] Updated weights for policy 0, policy_version 295140 (0.0042) [2024-06-22 21:36:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4835704832. Throughput: 0: 42857.9. Samples: 4835827640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:36:58,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-22 21:36:58,899][15401] Updated weights for policy 0, policy_version 295150 (0.0030) [2024-06-22 21:37:02,694][15401] Updated weights for policy 0, policy_version 295160 (0.0045) [2024-06-22 21:37:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 4835901440. Throughput: 0: 42765.8. Samples: 4836080640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 21:37:06,564][15401] Updated weights for policy 0, policy_version 295170 (0.0031) [2024-06-22 21:37:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42326.8, 300 sec: 42653.9). Total num frames: 4836130816. Throughput: 0: 42831.4. Samples: 4836203700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 21:37:10,339][15401] Updated weights for policy 0, policy_version 295180 (0.0042) [2024-06-22 21:37:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4836327424. Throughput: 0: 42783.4. Samples: 4836461980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:13,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-22 21:37:13,954][15349] Signal inference workers to stop experience collection... (71500 times) [2024-06-22 21:37:13,980][15401] InferenceWorker_p0-w0: stopping experience collection (71500 times) [2024-06-22 21:37:14,016][15349] Signal inference workers to resume experience collection... (71500 times) [2024-06-22 21:37:14,020][15401] InferenceWorker_p0-w0: resuming experience collection (71500 times) [2024-06-22 21:37:14,365][15401] Updated weights for policy 0, policy_version 295190 (0.0030) [2024-06-22 21:37:18,277][15401] Updated weights for policy 0, policy_version 295200 (0.0031) [2024-06-22 21:37:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43415.9, 300 sec: 42653.6). Total num frames: 4836556800. Throughput: 0: 42463.0. Samples: 4836707500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:18,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 21:37:22,161][15401] Updated weights for policy 0, policy_version 295210 (0.0045) [2024-06-22 21:37:23,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 4836769792. Throughput: 0: 42612.0. Samples: 4836836700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:23,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:37:26,020][15401] Updated weights for policy 0, policy_version 295220 (0.0047) [2024-06-22 21:37:28,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4836966400. Throughput: 0: 42605.4. Samples: 4837097640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:37:29,817][15401] Updated weights for policy 0, policy_version 295230 (0.0041) [2024-06-22 21:37:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4837179392. Throughput: 0: 42549.9. Samples: 4837355180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 21:37:33,626][15401] Updated weights for policy 0, policy_version 295240 (0.0044) [2024-06-22 21:37:37,354][15401] Updated weights for policy 0, policy_version 295250 (0.0032) [2024-06-22 21:37:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4837408768. Throughput: 0: 42648.0. Samples: 4837480200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 21:37:41,607][15401] Updated weights for policy 0, policy_version 295260 (0.0030) [2024-06-22 21:37:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4837605376. Throughput: 0: 42392.4. Samples: 4837735300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 21:37:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295265_4837621760.pth... [2024-06-22 21:37:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294642_4827414528.pth [2024-06-22 21:37:45,161][15401] Updated weights for policy 0, policy_version 295270 (0.0035) [2024-06-22 21:37:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4837834752. Throughput: 0: 42426.7. Samples: 4837989840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 21:37:49,136][15401] Updated weights for policy 0, policy_version 295280 (0.0036) [2024-06-22 21:37:52,693][15401] Updated weights for policy 0, policy_version 295290 (0.0035) [2024-06-22 21:37:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4838047744. Throughput: 0: 42577.4. Samples: 4838119680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 21:37:56,719][15401] Updated weights for policy 0, policy_version 295300 (0.0045) [2024-06-22 21:37:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4838244352. Throughput: 0: 42556.3. Samples: 4838377020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-22 21:37:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 21:38:00,593][15401] Updated weights for policy 0, policy_version 295310 (0.0029) [2024-06-22 21:38:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4838473728. Throughput: 0: 42752.5. Samples: 4838631260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 21:38:04,309][15401] Updated weights for policy 0, policy_version 295320 (0.0029) [2024-06-22 21:38:08,151][15401] Updated weights for policy 0, policy_version 295330 (0.0033) [2024-06-22 21:38:08,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4838686720. Throughput: 0: 42825.8. Samples: 4838763760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 21:38:11,976][15401] Updated weights for policy 0, policy_version 295340 (0.0035) [2024-06-22 21:38:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4838883328. Throughput: 0: 42577.7. Samples: 4839013640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 21:38:15,754][15401] Updated weights for policy 0, policy_version 295350 (0.0040) [2024-06-22 21:38:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 4839112704. Throughput: 0: 42607.4. Samples: 4839272520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 21:38:19,597][15401] Updated weights for policy 0, policy_version 295360 (0.0034) [2024-06-22 21:38:23,334][15401] Updated weights for policy 0, policy_version 295370 (0.0024) [2024-06-22 21:38:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 4839342080. Throughput: 0: 42857.2. Samples: 4839408780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:23,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-22 21:38:27,234][15401] Updated weights for policy 0, policy_version 295380 (0.0036) [2024-06-22 21:38:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4839522304. Throughput: 0: 42831.6. Samples: 4839662720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 21:38:30,940][15401] Updated weights for policy 0, policy_version 295390 (0.0047) [2024-06-22 21:38:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4839751680. Throughput: 0: 42906.3. Samples: 4839920620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 21:38:34,802][15401] Updated weights for policy 0, policy_version 295400 (0.0040) [2024-06-22 21:38:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 4839964672. Throughput: 0: 42845.3. Samples: 4840047820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:38,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 21:38:38,722][15401] Updated weights for policy 0, policy_version 295410 (0.0028) [2024-06-22 21:38:42,379][15401] Updated weights for policy 0, policy_version 295420 (0.0031) [2024-06-22 21:38:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42543.8). Total num frames: 4840177664. Throughput: 0: 42909.5. Samples: 4840307940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-22 21:38:46,683][15401] Updated weights for policy 0, policy_version 295430 (0.0028) [2024-06-22 21:38:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 4840390656. Throughput: 0: 42930.7. Samples: 4840563140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 21:38:50,451][15401] Updated weights for policy 0, policy_version 295440 (0.0039) [2024-06-22 21:38:50,934][15349] Signal inference workers to stop experience collection... (71550 times) [2024-06-22 21:38:50,992][15401] InferenceWorker_p0-w0: stopping experience collection (71550 times) [2024-06-22 21:38:50,994][15349] Signal inference workers to resume experience collection... (71550 times) [2024-06-22 21:38:51,010][15401] InferenceWorker_p0-w0: resuming experience collection (71550 times) [2024-06-22 21:38:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4840620032. Throughput: 0: 42823.6. Samples: 4840690820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 21:38:54,115][15401] Updated weights for policy 0, policy_version 295450 (0.0041) [2024-06-22 21:38:58,006][15401] Updated weights for policy 0, policy_version 295460 (0.0035) [2024-06-22 21:38:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 4840833024. Throughput: 0: 43226.8. Samples: 4840958840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:38:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 21:39:01,544][15401] Updated weights for policy 0, policy_version 295470 (0.0024) [2024-06-22 21:39:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4841046016. Throughput: 0: 43063.6. Samples: 4841210380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:39:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 21:39:05,425][15401] Updated weights for policy 0, policy_version 295480 (0.0044) [2024-06-22 21:39:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4841259008. Throughput: 0: 42939.2. Samples: 4841341040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-22 21:39:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 21:39:09,247][15401] Updated weights for policy 0, policy_version 295490 (0.0024) [2024-06-22 21:39:13,278][15401] Updated weights for policy 0, policy_version 295500 (0.0029) [2024-06-22 21:39:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4841472000. Throughput: 0: 43146.6. Samples: 4841604320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 21:39:16,933][15401] Updated weights for policy 0, policy_version 295510 (0.0024) [2024-06-22 21:39:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4841684992. Throughput: 0: 42890.2. Samples: 4841850680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 21:39:20,950][15401] Updated weights for policy 0, policy_version 295520 (0.0033) [2024-06-22 21:39:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4841897984. Throughput: 0: 42923.5. Samples: 4841979280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 21:39:24,536][15401] Updated weights for policy 0, policy_version 295530 (0.0033) [2024-06-22 21:39:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 4842110976. Throughput: 0: 42946.2. Samples: 4842240520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 21:39:28,619][15401] Updated weights for policy 0, policy_version 295540 (0.0034) [2024-06-22 21:39:32,248][15401] Updated weights for policy 0, policy_version 295550 (0.0034) [2024-06-22 21:39:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4842307584. Throughput: 0: 42834.6. Samples: 4842490700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 21:39:36,268][15401] Updated weights for policy 0, policy_version 295560 (0.0034) [2024-06-22 21:39:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 4842553344. Throughput: 0: 42806.3. Samples: 4842617100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 21:39:39,941][15401] Updated weights for policy 0, policy_version 295570 (0.0042) [2024-06-22 21:39:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4842733568. Throughput: 0: 42725.3. Samples: 4842881480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 21:39:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295578_4842749952.pth... [2024-06-22 21:39:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000294953_4832509952.pth [2024-06-22 21:39:43,848][15401] Updated weights for policy 0, policy_version 295580 (0.0036) [2024-06-22 21:39:47,984][15401] Updated weights for policy 0, policy_version 295590 (0.0032) [2024-06-22 21:39:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4842946560. Throughput: 0: 42701.4. Samples: 4843131940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 21:39:51,487][15401] Updated weights for policy 0, policy_version 295600 (0.0035) [2024-06-22 21:39:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4843208704. Throughput: 0: 42673.8. Samples: 4843261360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 21:39:55,790][15401] Updated weights for policy 0, policy_version 295610 (0.0032) [2024-06-22 21:39:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 4843372544. Throughput: 0: 42492.9. Samples: 4843516600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:39:58,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 21:39:59,307][15401] Updated weights for policy 0, policy_version 295620 (0.0042) [2024-06-22 21:40:03,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4843585536. Throughput: 0: 42561.9. Samples: 4843765960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:40:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 21:40:03,470][15401] Updated weights for policy 0, policy_version 295630 (0.0034) [2024-06-22 21:40:06,875][15401] Updated weights for policy 0, policy_version 295640 (0.0030) [2024-06-22 21:40:08,390][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4843847680. Throughput: 0: 42640.1. Samples: 4843898080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:40:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 21:40:11,027][15401] Updated weights for policy 0, policy_version 295650 (0.0035) [2024-06-22 21:40:12,627][15349] Signal inference workers to stop experience collection... (71600 times) [2024-06-22 21:40:12,628][15349] Signal inference workers to resume experience collection... (71600 times) [2024-06-22 21:40:12,678][15401] InferenceWorker_p0-w0: stopping experience collection (71600 times) [2024-06-22 21:40:12,678][15401] InferenceWorker_p0-w0: resuming experience collection (71600 times) [2024-06-22 21:40:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4844027904. Throughput: 0: 42598.7. Samples: 4844157460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:40:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 21:40:14,559][15401] Updated weights for policy 0, policy_version 295660 (0.0035) [2024-06-22 21:40:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4844240896. Throughput: 0: 42568.9. Samples: 4844406300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:40:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 21:40:18,440][15401] Updated weights for policy 0, policy_version 295670 (0.0046) [2024-06-22 21:40:22,042][15401] Updated weights for policy 0, policy_version 295680 (0.0037) [2024-06-22 21:40:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4844470272. Throughput: 0: 42776.4. Samples: 4844542040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-22 21:40:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 21:40:25,847][15401] Updated weights for policy 0, policy_version 295690 (0.0039) [2024-06-22 21:40:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 4844650496. Throughput: 0: 42697.4. Samples: 4844802860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 21:40:29,715][15401] Updated weights for policy 0, policy_version 295700 (0.0032) [2024-06-22 21:40:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4844896256. Throughput: 0: 42714.6. Samples: 4845054100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 21:40:33,816][15401] Updated weights for policy 0, policy_version 295710 (0.0035) [2024-06-22 21:40:37,499][15401] Updated weights for policy 0, policy_version 295720 (0.0029) [2024-06-22 21:40:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4845109248. Throughput: 0: 42730.3. Samples: 4845184220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:38,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 21:40:41,843][15401] Updated weights for policy 0, policy_version 295730 (0.0032) [2024-06-22 21:40:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4845289472. Throughput: 0: 42731.7. Samples: 4845439420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-22 21:40:45,247][15401] Updated weights for policy 0, policy_version 295740 (0.0038) [2024-06-22 21:40:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4845535232. Throughput: 0: 42711.5. Samples: 4845687980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 21:40:49,401][15401] Updated weights for policy 0, policy_version 295750 (0.0030) [2024-06-22 21:40:52,738][15401] Updated weights for policy 0, policy_version 295760 (0.0042) [2024-06-22 21:40:53,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4845764608. Throughput: 0: 42794.6. Samples: 4845823840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 21:40:56,803][15401] Updated weights for policy 0, policy_version 295770 (0.0043) [2024-06-22 21:40:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 4845944832. Throughput: 0: 42842.6. Samples: 4846085480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:40:58,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 21:41:00,168][15401] Updated weights for policy 0, policy_version 295780 (0.0038) [2024-06-22 21:41:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42709.8). Total num frames: 4846190592. Throughput: 0: 42863.0. Samples: 4846335140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:03,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 21:41:04,447][15401] Updated weights for policy 0, policy_version 295790 (0.0034) [2024-06-22 21:41:07,827][15401] Updated weights for policy 0, policy_version 295800 (0.0029) [2024-06-22 21:41:08,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4846403584. Throughput: 0: 43045.3. Samples: 4846479080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 21:41:12,241][15401] Updated weights for policy 0, policy_version 295810 (0.0042) [2024-06-22 21:41:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4846567424. Throughput: 0: 42863.8. Samples: 4846731740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 21:41:13,812][15349] Signal inference workers to stop experience collection... (71650 times) [2024-06-22 21:41:13,814][15349] Signal inference workers to resume experience collection... (71650 times) [2024-06-22 21:41:13,827][15401] InferenceWorker_p0-w0: stopping experience collection (71650 times) [2024-06-22 21:41:13,828][15401] InferenceWorker_p0-w0: resuming experience collection (71650 times) [2024-06-22 21:41:15,795][15401] Updated weights for policy 0, policy_version 295820 (0.0042) [2024-06-22 21:41:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4846829568. Throughput: 0: 42697.8. Samples: 4846975500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 21:41:19,943][15401] Updated weights for policy 0, policy_version 295830 (0.0032) [2024-06-22 21:41:23,183][15401] Updated weights for policy 0, policy_version 295840 (0.0029) [2024-06-22 21:41:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4847042560. Throughput: 0: 43050.1. Samples: 4847121480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 21:41:27,460][15401] Updated weights for policy 0, policy_version 295850 (0.0026) [2024-06-22 21:41:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4847222784. Throughput: 0: 43037.7. Samples: 4847376120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 21:41:30,643][15401] Updated weights for policy 0, policy_version 295860 (0.0043) [2024-06-22 21:41:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4847484928. Throughput: 0: 42992.9. Samples: 4847622660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 21:41:34,951][15401] Updated weights for policy 0, policy_version 295870 (0.0033) [2024-06-22 21:41:38,084][15401] Updated weights for policy 0, policy_version 295880 (0.0036) [2024-06-22 21:41:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4847697920. Throughput: 0: 43206.3. Samples: 4847768120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-22 21:41:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 21:41:42,502][15401] Updated weights for policy 0, policy_version 295890 (0.0040) [2024-06-22 21:41:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4847861760. Throughput: 0: 42918.8. Samples: 4848016720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:41:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 21:41:43,490][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295891_4847878144.pth... [2024-06-22 21:41:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295265_4837621760.pth [2024-06-22 21:41:45,704][15401] Updated weights for policy 0, policy_version 295900 (0.0031) [2024-06-22 21:41:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4848123904. Throughput: 0: 42874.2. Samples: 4848264480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:41:48,400][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 21:41:49,951][15401] Updated weights for policy 0, policy_version 295910 (0.0025) [2024-06-22 21:41:53,303][15401] Updated weights for policy 0, policy_version 295920 (0.0024) [2024-06-22 21:41:53,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4848353280. Throughput: 0: 42932.9. Samples: 4848411060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:41:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 21:41:57,534][15401] Updated weights for policy 0, policy_version 295930 (0.0035) [2024-06-22 21:41:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4848517120. Throughput: 0: 42808.9. Samples: 4848658140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:41:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 21:42:01,551][15401] Updated weights for policy 0, policy_version 295940 (0.0035) [2024-06-22 21:42:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4848762880. Throughput: 0: 42952.9. Samples: 4848908380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 21:42:05,136][15401] Updated weights for policy 0, policy_version 295950 (0.0034) [2024-06-22 21:42:08,390][15132] Fps is (10 sec: 45873.5, 60 sec: 42871.2, 300 sec: 42876.0). Total num frames: 4848975872. Throughput: 0: 42780.1. Samples: 4849046600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 21:42:09,242][15401] Updated weights for policy 0, policy_version 295960 (0.0031) [2024-06-22 21:42:13,159][15401] Updated weights for policy 0, policy_version 295970 (0.0036) [2024-06-22 21:42:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.7, 300 sec: 42765.4). Total num frames: 4849172480. Throughput: 0: 42697.0. Samples: 4849297480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 21:42:17,206][15401] Updated weights for policy 0, policy_version 295980 (0.0043) [2024-06-22 21:42:18,389][15132] Fps is (10 sec: 42600.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4849401856. Throughput: 0: 42795.6. Samples: 4849548460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-22 21:42:21,096][15401] Updated weights for policy 0, policy_version 295990 (0.0032) [2024-06-22 21:42:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4849614848. Throughput: 0: 42583.6. Samples: 4849684380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:23,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 21:42:24,828][15401] Updated weights for policy 0, policy_version 296000 (0.0031) [2024-06-22 21:42:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4849795072. Throughput: 0: 42702.2. Samples: 4849938320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 21:42:28,809][15401] Updated weights for policy 0, policy_version 296010 (0.0027) [2024-06-22 21:42:29,877][15349] Signal inference workers to stop experience collection... (71700 times) [2024-06-22 21:42:29,877][15349] Signal inference workers to resume experience collection... (71700 times) [2024-06-22 21:42:29,908][15401] InferenceWorker_p0-w0: stopping experience collection (71700 times) [2024-06-22 21:42:29,909][15401] InferenceWorker_p0-w0: resuming experience collection (71700 times) [2024-06-22 21:42:32,306][15401] Updated weights for policy 0, policy_version 296020 (0.0041) [2024-06-22 21:42:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 4850057216. Throughput: 0: 43021.3. Samples: 4850200540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:33,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 21:42:36,439][15401] Updated weights for policy 0, policy_version 296030 (0.0040) [2024-06-22 21:42:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4850253824. Throughput: 0: 42748.5. Samples: 4850334740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:38,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-22 21:42:40,382][15401] Updated weights for policy 0, policy_version 296040 (0.0037) [2024-06-22 21:42:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4850450432. Throughput: 0: 42698.6. Samples: 4850579580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:43,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-22 21:42:44,082][15401] Updated weights for policy 0, policy_version 296050 (0.0039) [2024-06-22 21:42:47,867][15401] Updated weights for policy 0, policy_version 296060 (0.0044) [2024-06-22 21:42:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4850679808. Throughput: 0: 42998.3. Samples: 4850843300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-22 21:42:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 21:42:51,715][15401] Updated weights for policy 0, policy_version 296070 (0.0042) [2024-06-22 21:42:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 4850909184. Throughput: 0: 42834.2. Samples: 4850974120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:42:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 21:42:55,487][15401] Updated weights for policy 0, policy_version 296080 (0.0036) [2024-06-22 21:42:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4851089408. Throughput: 0: 42725.8. Samples: 4851220140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:42:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 21:42:59,358][15401] Updated weights for policy 0, policy_version 296090 (0.0030) [2024-06-22 21:43:02,966][15401] Updated weights for policy 0, policy_version 296100 (0.0030) [2024-06-22 21:43:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4851302400. Throughput: 0: 43112.3. Samples: 4851488520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 21:43:06,959][15401] Updated weights for policy 0, policy_version 296110 (0.0036) [2024-06-22 21:43:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.7, 300 sec: 42931.6). Total num frames: 4851548160. Throughput: 0: 42978.5. Samples: 4851618420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 21:43:10,829][15401] Updated weights for policy 0, policy_version 296120 (0.0051) [2024-06-22 21:43:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4851728384. Throughput: 0: 42830.2. Samples: 4851865680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 21:43:14,398][15401] Updated weights for policy 0, policy_version 296130 (0.0031) [2024-06-22 21:43:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4851941376. Throughput: 0: 42934.8. Samples: 4852132500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 21:43:18,414][15401] Updated weights for policy 0, policy_version 296140 (0.0050) [2024-06-22 21:43:21,555][15349] Signal inference workers to stop experience collection... (71750 times) [2024-06-22 21:43:21,606][15401] InferenceWorker_p0-w0: stopping experience collection (71750 times) [2024-06-22 21:43:21,612][15349] Signal inference workers to resume experience collection... (71750 times) [2024-06-22 21:43:21,631][15401] InferenceWorker_p0-w0: resuming experience collection (71750 times) [2024-06-22 21:43:22,209][15401] Updated weights for policy 0, policy_version 296150 (0.0045) [2024-06-22 21:43:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4852170752. Throughput: 0: 42653.8. Samples: 4852254160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 21:43:25,932][15401] Updated weights for policy 0, policy_version 296160 (0.0036) [2024-06-22 21:43:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4852383744. Throughput: 0: 42818.7. Samples: 4852506420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 21:43:30,122][15401] Updated weights for policy 0, policy_version 296170 (0.0036) [2024-06-22 21:43:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42053.9, 300 sec: 42765.4). Total num frames: 4852580352. Throughput: 0: 42900.9. Samples: 4852773840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:33,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-22 21:43:33,618][15401] Updated weights for policy 0, policy_version 296180 (0.0025) [2024-06-22 21:43:37,645][15401] Updated weights for policy 0, policy_version 296190 (0.0030) [2024-06-22 21:43:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4852809728. Throughput: 0: 42771.1. Samples: 4852898820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:38,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-22 21:43:41,218][15401] Updated weights for policy 0, policy_version 296200 (0.0036) [2024-06-22 21:43:43,391][15132] Fps is (10 sec: 45869.9, 60 sec: 43143.8, 300 sec: 42875.9). Total num frames: 4853039104. Throughput: 0: 42869.5. Samples: 4853149320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:43,391][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 21:43:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296206_4853039104.pth... [2024-06-22 21:43:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295578_4842749952.pth [2024-06-22 21:43:45,281][15401] Updated weights for policy 0, policy_version 296210 (0.0037) [2024-06-22 21:43:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4853235712. Throughput: 0: 42681.0. Samples: 4853409160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 21:43:48,718][15401] Updated weights for policy 0, policy_version 296220 (0.0030) [2024-06-22 21:43:52,990][15401] Updated weights for policy 0, policy_version 296230 (0.0039) [2024-06-22 21:43:53,389][15132] Fps is (10 sec: 39326.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4853432320. Throughput: 0: 42577.5. Samples: 4853534400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 21:43:56,399][15401] Updated weights for policy 0, policy_version 296240 (0.0030) [2024-06-22 21:43:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4853678080. Throughput: 0: 42800.5. Samples: 4853791700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:43:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 21:44:00,582][15401] Updated weights for policy 0, policy_version 296250 (0.0034) [2024-06-22 21:44:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4853858304. Throughput: 0: 42655.1. Samples: 4854051980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-22 21:44:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 21:44:04,019][15401] Updated weights for policy 0, policy_version 296260 (0.0036) [2024-06-22 21:44:08,157][15401] Updated weights for policy 0, policy_version 296270 (0.0047) [2024-06-22 21:44:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4854087680. Throughput: 0: 42747.6. Samples: 4854177800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 21:44:11,686][15401] Updated weights for policy 0, policy_version 296280 (0.0033) [2024-06-22 21:44:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4854300672. Throughput: 0: 42692.9. Samples: 4854427600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 21:44:15,596][15401] Updated weights for policy 0, policy_version 296290 (0.0040) [2024-06-22 21:44:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4854513664. Throughput: 0: 42576.5. Samples: 4854689780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 21:44:19,586][15401] Updated weights for policy 0, policy_version 296300 (0.0034) [2024-06-22 21:44:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4854726656. Throughput: 0: 42567.6. Samples: 4854814360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 21:44:23,407][15401] Updated weights for policy 0, policy_version 296310 (0.0046) [2024-06-22 21:44:27,267][15401] Updated weights for policy 0, policy_version 296320 (0.0031) [2024-06-22 21:44:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4854956032. Throughput: 0: 42606.8. Samples: 4855066580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 21:44:30,939][15401] Updated weights for policy 0, policy_version 296330 (0.0031) [2024-06-22 21:44:31,841][15349] Signal inference workers to stop experience collection... (71800 times) [2024-06-22 21:44:31,873][15401] InferenceWorker_p0-w0: stopping experience collection (71800 times) [2024-06-22 21:44:31,909][15349] Signal inference workers to resume experience collection... (71800 times) [2024-06-22 21:44:31,909][15401] InferenceWorker_p0-w0: resuming experience collection (71800 times) [2024-06-22 21:44:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4855152640. Throughput: 0: 42541.3. Samples: 4855323520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 21:44:34,988][15401] Updated weights for policy 0, policy_version 296340 (0.0040) [2024-06-22 21:44:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4855365632. Throughput: 0: 42545.7. Samples: 4855448960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:38,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-22 21:44:38,630][15401] Updated weights for policy 0, policy_version 296350 (0.0031) [2024-06-22 21:44:42,512][15401] Updated weights for policy 0, policy_version 296360 (0.0020) [2024-06-22 21:44:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42326.1, 300 sec: 42820.5). Total num frames: 4855578624. Throughput: 0: 42538.9. Samples: 4855705960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:43,395][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 21:44:46,671][15401] Updated weights for policy 0, policy_version 296370 (0.0038) [2024-06-22 21:44:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4855775232. Throughput: 0: 42454.1. Samples: 4855962420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 21:44:50,055][15401] Updated weights for policy 0, policy_version 296380 (0.0028) [2024-06-22 21:44:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 4856004608. Throughput: 0: 42427.4. Samples: 4856087040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 21:44:54,195][15401] Updated weights for policy 0, policy_version 296390 (0.0042) [2024-06-22 21:44:57,912][15401] Updated weights for policy 0, policy_version 296400 (0.0025) [2024-06-22 21:44:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 4856217600. Throughput: 0: 42612.0. Samples: 4856345140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:44:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 21:45:01,883][15401] Updated weights for policy 0, policy_version 296410 (0.0028) [2024-06-22 21:45:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4856430592. Throughput: 0: 42475.5. Samples: 4856601180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:45:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 21:45:05,527][15401] Updated weights for policy 0, policy_version 296420 (0.0028) [2024-06-22 21:45:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 4856643584. Throughput: 0: 42543.4. Samples: 4856728920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:45:08,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 21:45:09,393][15401] Updated weights for policy 0, policy_version 296430 (0.0024) [2024-06-22 21:45:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4856872960. Throughput: 0: 42684.7. Samples: 4856987380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:45:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 21:45:13,395][15401] Updated weights for policy 0, policy_version 296440 (0.0030) [2024-06-22 21:45:17,098][15401] Updated weights for policy 0, policy_version 296450 (0.0027) [2024-06-22 21:45:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4857053184. Throughput: 0: 42589.1. Samples: 4857240020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-22 21:45:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 21:45:20,927][15401] Updated weights for policy 0, policy_version 296460 (0.0039) [2024-06-22 21:45:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4857282560. Throughput: 0: 42587.9. Samples: 4857365420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-22 21:45:24,455][15401] Updated weights for policy 0, policy_version 296470 (0.0039) [2024-06-22 21:45:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4857495552. Throughput: 0: 42611.2. Samples: 4857623460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 21:45:28,723][15401] Updated weights for policy 0, policy_version 296480 (0.0044) [2024-06-22 21:45:32,520][15401] Updated weights for policy 0, policy_version 296490 (0.0039) [2024-06-22 21:45:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4857708544. Throughput: 0: 42537.8. Samples: 4857876620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 21:45:36,636][15401] Updated weights for policy 0, policy_version 296500 (0.0034) [2024-06-22 21:45:37,994][15349] Signal inference workers to stop experience collection... (71850 times) [2024-06-22 21:45:37,994][15349] Signal inference workers to resume experience collection... (71850 times) [2024-06-22 21:45:38,009][15401] InferenceWorker_p0-w0: stopping experience collection (71850 times) [2024-06-22 21:45:38,009][15401] InferenceWorker_p0-w0: resuming experience collection (71850 times) [2024-06-22 21:45:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4857937920. Throughput: 0: 42665.3. Samples: 4858006980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 21:45:40,078][15401] Updated weights for policy 0, policy_version 296510 (0.0033) [2024-06-22 21:45:43,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4858150912. Throughput: 0: 42762.2. Samples: 4858269540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:43,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 21:45:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296518_4858150912.pth... [2024-06-22 21:45:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000295891_4847878144.pth [2024-06-22 21:45:44,109][15401] Updated weights for policy 0, policy_version 296520 (0.0039) [2024-06-22 21:45:47,563][15401] Updated weights for policy 0, policy_version 296530 (0.0038) [2024-06-22 21:45:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 4858347520. Throughput: 0: 42696.6. Samples: 4858522520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 21:45:51,857][15401] Updated weights for policy 0, policy_version 296540 (0.0037) [2024-06-22 21:45:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 4858560512. Throughput: 0: 42784.1. Samples: 4858654100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:53,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 21:45:55,339][15401] Updated weights for policy 0, policy_version 296550 (0.0023) [2024-06-22 21:45:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4858773504. Throughput: 0: 42620.7. Samples: 4858905320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:45:58,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-22 21:45:59,789][15401] Updated weights for policy 0, policy_version 296560 (0.0052) [2024-06-22 21:46:02,792][15401] Updated weights for policy 0, policy_version 296570 (0.0037) [2024-06-22 21:46:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4859002880. Throughput: 0: 42616.3. Samples: 4859157760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 21:46:07,270][15401] Updated weights for policy 0, policy_version 296580 (0.0049) [2024-06-22 21:46:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 4859215872. Throughput: 0: 42803.3. Samples: 4859291560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 21:46:10,862][15401] Updated weights for policy 0, policy_version 296590 (0.0040) [2024-06-22 21:46:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4859412480. Throughput: 0: 42664.5. Samples: 4859543360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:13,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 21:46:14,754][15401] Updated weights for policy 0, policy_version 296600 (0.0030) [2024-06-22 21:46:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4859641856. Throughput: 0: 42706.3. Samples: 4859798400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 21:46:18,467][15401] Updated weights for policy 0, policy_version 296610 (0.0038) [2024-06-22 21:46:22,584][15401] Updated weights for policy 0, policy_version 296620 (0.0045) [2024-06-22 21:46:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4859854848. Throughput: 0: 42729.5. Samples: 4859929800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 21:46:26,178][15401] Updated weights for policy 0, policy_version 296630 (0.0036) [2024-06-22 21:46:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4860035072. Throughput: 0: 42612.1. Samples: 4860186980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 21:46:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 21:46:30,223][15401] Updated weights for policy 0, policy_version 296640 (0.0043) [2024-06-22 21:46:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4860297216. Throughput: 0: 42607.9. Samples: 4860439880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 21:46:33,620][15401] Updated weights for policy 0, policy_version 296650 (0.0044) [2024-06-22 21:46:37,842][15401] Updated weights for policy 0, policy_version 296660 (0.0022) [2024-06-22 21:46:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4860477440. Throughput: 0: 42684.4. Samples: 4860574900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-22 21:46:41,442][15401] Updated weights for policy 0, policy_version 296670 (0.0036) [2024-06-22 21:46:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 4860690432. Throughput: 0: 42674.6. Samples: 4860825680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 21:46:45,369][15401] Updated weights for policy 0, policy_version 296680 (0.0022) [2024-06-22 21:46:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4860936192. Throughput: 0: 42754.6. Samples: 4861081720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 21:46:49,097][15401] Updated weights for policy 0, policy_version 296690 (0.0032) [2024-06-22 21:46:52,971][15401] Updated weights for policy 0, policy_version 296700 (0.0035) [2024-06-22 21:46:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4861132800. Throughput: 0: 42772.4. Samples: 4861216320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 21:46:56,610][15401] Updated weights for policy 0, policy_version 296710 (0.0028) [2024-06-22 21:46:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4861345792. Throughput: 0: 42784.9. Samples: 4861468680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:46:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 21:46:59,440][15349] Signal inference workers to stop experience collection... (71900 times) [2024-06-22 21:46:59,440][15349] Signal inference workers to resume experience collection... (71900 times) [2024-06-22 21:46:59,498][15401] InferenceWorker_p0-w0: stopping experience collection (71900 times) [2024-06-22 21:46:59,499][15401] InferenceWorker_p0-w0: resuming experience collection (71900 times) [2024-06-22 21:47:00,655][15401] Updated weights for policy 0, policy_version 296720 (0.0035) [2024-06-22 21:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4861575168. Throughput: 0: 42799.1. Samples: 4861724360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 21:47:04,691][15401] Updated weights for policy 0, policy_version 296730 (0.0039) [2024-06-22 21:47:08,331][15401] Updated weights for policy 0, policy_version 296740 (0.0046) [2024-06-22 21:47:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4861788160. Throughput: 0: 42766.1. Samples: 4861854380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:08,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 21:47:12,236][15401] Updated weights for policy 0, policy_version 296750 (0.0035) [2024-06-22 21:47:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4861984768. Throughput: 0: 42756.4. Samples: 4862111020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 21:47:15,845][15401] Updated weights for policy 0, policy_version 296760 (0.0034) [2024-06-22 21:47:18,396][15132] Fps is (10 sec: 42581.5, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 4862214144. Throughput: 0: 42768.2. Samples: 4862364720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:18,396][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 21:47:20,144][15401] Updated weights for policy 0, policy_version 296770 (0.0041) [2024-06-22 21:47:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4862427136. Throughput: 0: 42669.3. Samples: 4862495020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 21:47:23,511][15401] Updated weights for policy 0, policy_version 296780 (0.0029) [2024-06-22 21:47:27,579][15401] Updated weights for policy 0, policy_version 296790 (0.0029) [2024-06-22 21:47:28,389][15132] Fps is (10 sec: 40986.4, 60 sec: 43144.5, 300 sec: 42598.8). Total num frames: 4862623744. Throughput: 0: 42815.7. Samples: 4862752380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:28,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 21:47:31,449][15401] Updated weights for policy 0, policy_version 296800 (0.0043) [2024-06-22 21:47:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4862853120. Throughput: 0: 42706.3. Samples: 4863003500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 21:47:35,552][15401] Updated weights for policy 0, policy_version 296810 (0.0046) [2024-06-22 21:47:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4863066112. Throughput: 0: 42763.2. Samples: 4863140660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 21:47:38,945][15401] Updated weights for policy 0, policy_version 296820 (0.0041) [2024-06-22 21:47:42,925][15401] Updated weights for policy 0, policy_version 296830 (0.0047) [2024-06-22 21:47:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4863262720. Throughput: 0: 42647.8. Samples: 4863387840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-22 21:47:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:47:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296830_4863262720.pth... [2024-06-22 21:47:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296206_4853039104.pth [2024-06-22 21:47:46,831][15401] Updated weights for policy 0, policy_version 296840 (0.0026) [2024-06-22 21:47:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 4863475712. Throughput: 0: 42650.8. Samples: 4863643640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:47:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 21:47:50,510][15401] Updated weights for policy 0, policy_version 296850 (0.0033) [2024-06-22 21:47:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4863688704. Throughput: 0: 42689.7. Samples: 4863775320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:47:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 21:47:54,406][15401] Updated weights for policy 0, policy_version 296860 (0.0040) [2024-06-22 21:47:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4863901696. Throughput: 0: 42548.8. Samples: 4864025720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:47:58,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-22 21:47:58,477][15401] Updated weights for policy 0, policy_version 296870 (0.0031) [2024-06-22 21:48:01,973][15401] Updated weights for policy 0, policy_version 296880 (0.0041) [2024-06-22 21:48:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4864131072. Throughput: 0: 42568.7. Samples: 4864280040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 21:48:06,230][15401] Updated weights for policy 0, policy_version 296890 (0.0029) [2024-06-22 21:48:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 4864327680. Throughput: 0: 42782.3. Samples: 4864420220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 21:48:09,539][15401] Updated weights for policy 0, policy_version 296900 (0.0034) [2024-06-22 21:48:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 4864540672. Throughput: 0: 42712.7. Samples: 4864674460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:48:13,787][15401] Updated weights for policy 0, policy_version 296910 (0.0049) [2024-06-22 21:48:17,505][15401] Updated weights for policy 0, policy_version 296920 (0.0033) [2024-06-22 21:48:18,389][15132] Fps is (10 sec: 47513.1, 60 sec: 43149.1, 300 sec: 42820.5). Total num frames: 4864802816. Throughput: 0: 42821.0. Samples: 4864930440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 21:48:21,352][15401] Updated weights for policy 0, policy_version 296930 (0.0028) [2024-06-22 21:48:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4864966656. Throughput: 0: 42819.6. Samples: 4865067540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 21:48:25,018][15401] Updated weights for policy 0, policy_version 296940 (0.0031) [2024-06-22 21:48:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4865196032. Throughput: 0: 42965.9. Samples: 4865321300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 21:48:28,992][15401] Updated weights for policy 0, policy_version 296950 (0.0027) [2024-06-22 21:48:30,876][15349] Signal inference workers to stop experience collection... (71950 times) [2024-06-22 21:48:30,926][15401] InferenceWorker_p0-w0: stopping experience collection (71950 times) [2024-06-22 21:48:30,998][15349] Signal inference workers to resume experience collection... (71950 times) [2024-06-22 21:48:30,998][15401] InferenceWorker_p0-w0: resuming experience collection (71950 times) [2024-06-22 21:48:32,460][15401] Updated weights for policy 0, policy_version 296960 (0.0041) [2024-06-22 21:48:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4865425408. Throughput: 0: 43023.9. Samples: 4865579720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 21:48:36,531][15401] Updated weights for policy 0, policy_version 296970 (0.0034) [2024-06-22 21:48:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.6). Total num frames: 4865605632. Throughput: 0: 42958.9. Samples: 4865708460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 21:48:39,983][15401] Updated weights for policy 0, policy_version 296980 (0.0036) [2024-06-22 21:48:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4865851392. Throughput: 0: 43002.3. Samples: 4865960820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 21:48:44,203][15401] Updated weights for policy 0, policy_version 296990 (0.0032) [2024-06-22 21:48:47,917][15401] Updated weights for policy 0, policy_version 297000 (0.0038) [2024-06-22 21:48:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4866064384. Throughput: 0: 43029.4. Samples: 4866216360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 21:48:51,677][15401] Updated weights for policy 0, policy_version 297010 (0.0032) [2024-06-22 21:48:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4866228224. Throughput: 0: 42794.2. Samples: 4866345960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 21:48:55,514][15401] Updated weights for policy 0, policy_version 297020 (0.0032) [2024-06-22 21:48:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4866490368. Throughput: 0: 42894.4. Samples: 4866604700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:48:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 21:48:59,219][15401] Updated weights for policy 0, policy_version 297030 (0.0036) [2024-06-22 21:49:03,137][15401] Updated weights for policy 0, policy_version 297040 (0.0037) [2024-06-22 21:49:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4866703360. Throughput: 0: 42867.5. Samples: 4866859480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 21:49:07,102][15401] Updated weights for policy 0, policy_version 297050 (0.0031) [2024-06-22 21:49:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4866883584. Throughput: 0: 42650.7. Samples: 4866986820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:08,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-22 21:49:10,874][15401] Updated weights for policy 0, policy_version 297060 (0.0038) [2024-06-22 21:49:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 4867145728. Throughput: 0: 42719.6. Samples: 4867243680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 21:49:14,768][15401] Updated weights for policy 0, policy_version 297070 (0.0041) [2024-06-22 21:49:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4867325952. Throughput: 0: 42882.7. Samples: 4867509440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 21:49:18,679][15401] Updated weights for policy 0, policy_version 297080 (0.0021) [2024-06-22 21:49:22,388][15401] Updated weights for policy 0, policy_version 297090 (0.0049) [2024-06-22 21:49:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 4867555328. Throughput: 0: 42585.6. Samples: 4867624820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 21:49:26,219][15401] Updated weights for policy 0, policy_version 297100 (0.0036) [2024-06-22 21:49:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4867784704. Throughput: 0: 42748.9. Samples: 4867884520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:28,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 21:49:30,005][15401] Updated weights for policy 0, policy_version 297110 (0.0033) [2024-06-22 21:49:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4867964928. Throughput: 0: 43007.1. Samples: 4868151680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 21:49:33,894][15401] Updated weights for policy 0, policy_version 297120 (0.0041) [2024-06-22 21:49:37,832][15401] Updated weights for policy 0, policy_version 297130 (0.0033) [2024-06-22 21:49:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4868194304. Throughput: 0: 42805.8. Samples: 4868272220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 21:49:41,484][15401] Updated weights for policy 0, policy_version 297140 (0.0029) [2024-06-22 21:49:43,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4868423680. Throughput: 0: 42806.0. Samples: 4868530980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 21:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297145_4868423680.pth... [2024-06-22 21:49:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296518_4858150912.pth [2024-06-22 21:49:43,588][15349] Signal inference workers to stop experience collection... (72000 times) [2024-06-22 21:49:43,632][15401] InferenceWorker_p0-w0: stopping experience collection (72000 times) [2024-06-22 21:49:43,642][15349] Signal inference workers to resume experience collection... (72000 times) [2024-06-22 21:49:43,657][15401] InferenceWorker_p0-w0: resuming experience collection (72000 times) [2024-06-22 21:49:45,218][15401] Updated weights for policy 0, policy_version 297150 (0.0035) [2024-06-22 21:49:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4868620288. Throughput: 0: 43001.5. Samples: 4868794540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 21:49:49,339][15401] Updated weights for policy 0, policy_version 297160 (0.0030) [2024-06-22 21:49:52,734][15401] Updated weights for policy 0, policy_version 297170 (0.0039) [2024-06-22 21:49:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4868833280. Throughput: 0: 42905.2. Samples: 4868917560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 21:49:56,968][15401] Updated weights for policy 0, policy_version 297180 (0.0031) [2024-06-22 21:49:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4869079040. Throughput: 0: 42984.3. Samples: 4869177980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:49:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 21:50:00,551][15401] Updated weights for policy 0, policy_version 297190 (0.0029) [2024-06-22 21:50:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 4869259264. Throughput: 0: 42750.6. Samples: 4869433220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:50:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 21:50:04,698][15401] Updated weights for policy 0, policy_version 297200 (0.0037) [2024-06-22 21:50:08,226][15401] Updated weights for policy 0, policy_version 297210 (0.0036) [2024-06-22 21:50:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 4869488640. Throughput: 0: 42956.9. Samples: 4869557880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-22 21:50:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 21:50:12,311][15401] Updated weights for policy 0, policy_version 297220 (0.0032) [2024-06-22 21:50:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 4869734400. Throughput: 0: 43138.7. Samples: 4869825760. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 21:50:15,738][15401] Updated weights for policy 0, policy_version 297230 (0.0028) [2024-06-22 21:50:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4869898240. Throughput: 0: 42842.6. Samples: 4870079600. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:50:19,848][15401] Updated weights for policy 0, policy_version 297240 (0.0034) [2024-06-22 21:50:23,392][15132] Fps is (10 sec: 40950.7, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 4870144000. Throughput: 0: 42828.9. Samples: 4870199620. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:23,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 21:50:23,397][15401] Updated weights for policy 0, policy_version 297250 (0.0043) [2024-06-22 21:50:27,593][15401] Updated weights for policy 0, policy_version 297260 (0.0040) [2024-06-22 21:50:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4870356992. Throughput: 0: 42985.4. Samples: 4870465320. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:50:31,222][15401] Updated weights for policy 0, policy_version 297270 (0.0033) [2024-06-22 21:50:33,390][15132] Fps is (10 sec: 40969.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4870553600. Throughput: 0: 42810.5. Samples: 4870721020. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 21:50:35,226][15401] Updated weights for policy 0, policy_version 297280 (0.0032) [2024-06-22 21:50:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 4870766592. Throughput: 0: 42775.0. Samples: 4870842440. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 21:50:38,964][15401] Updated weights for policy 0, policy_version 297290 (0.0030) [2024-06-22 21:50:43,063][15401] Updated weights for policy 0, policy_version 297300 (0.0034) [2024-06-22 21:50:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4870963200. Throughput: 0: 42797.8. Samples: 4871103880. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 21:50:46,695][15401] Updated weights for policy 0, policy_version 297310 (0.0033) [2024-06-22 21:50:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4871176192. Throughput: 0: 42723.9. Samples: 4871355800. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 21:50:50,799][15401] Updated weights for policy 0, policy_version 297320 (0.0032) [2024-06-22 21:50:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4871421952. Throughput: 0: 42803.2. Samples: 4871484020. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 21:50:54,141][15401] Updated weights for policy 0, policy_version 297330 (0.0039) [2024-06-22 21:50:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4871602176. Throughput: 0: 42750.7. Samples: 4871749540. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:50:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 21:50:58,500][15401] Updated weights for policy 0, policy_version 297340 (0.0033) [2024-06-22 21:51:01,627][15401] Updated weights for policy 0, policy_version 297350 (0.0024) [2024-06-22 21:51:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4871831552. Throughput: 0: 42635.0. Samples: 4871998180. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:51:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 21:51:04,770][15349] Signal inference workers to stop experience collection... (72050 times) [2024-06-22 21:51:04,770][15349] Signal inference workers to resume experience collection... (72050 times) [2024-06-22 21:51:04,819][15401] InferenceWorker_p0-w0: stopping experience collection (72050 times) [2024-06-22 21:51:04,820][15401] InferenceWorker_p0-w0: resuming experience collection (72050 times) [2024-06-22 21:51:06,062][15401] Updated weights for policy 0, policy_version 297360 (0.0042) [2024-06-22 21:51:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4872060928. Throughput: 0: 42894.7. Samples: 4872129780. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:51:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 21:51:09,509][15401] Updated weights for policy 0, policy_version 297370 (0.0027) [2024-06-22 21:51:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 4872241152. Throughput: 0: 42731.3. Samples: 4872388220. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:51:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 21:51:13,629][15401] Updated weights for policy 0, policy_version 297380 (0.0045) [2024-06-22 21:51:17,386][15401] Updated weights for policy 0, policy_version 297390 (0.0038) [2024-06-22 21:51:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4872470528. Throughput: 0: 42530.2. Samples: 4872634880. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:51:18,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 21:51:21,422][15401] Updated weights for policy 0, policy_version 297400 (0.0028) [2024-06-22 21:51:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 4872699904. Throughput: 0: 42728.1. Samples: 4872765200. Policy #0 lag: (min: 1.0, avg: 11.9, max: 20.0) [2024-06-22 21:51:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 21:51:24,926][15401] Updated weights for policy 0, policy_version 297410 (0.0029) [2024-06-22 21:51:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4872896512. Throughput: 0: 42869.7. Samples: 4873033020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 21:51:28,985][15401] Updated weights for policy 0, policy_version 297420 (0.0032) [2024-06-22 21:51:32,367][15401] Updated weights for policy 0, policy_version 297430 (0.0028) [2024-06-22 21:51:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4873109504. Throughput: 0: 42869.0. Samples: 4873284900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 21:51:36,430][15401] Updated weights for policy 0, policy_version 297440 (0.0037) [2024-06-22 21:51:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4873355264. Throughput: 0: 42879.4. Samples: 4873413600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 21:51:39,974][15401] Updated weights for policy 0, policy_version 297450 (0.0031) [2024-06-22 21:51:43,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 4873519104. Throughput: 0: 42778.8. Samples: 4873674860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:43,397][15132] Avg episode reward: [(0, '0.594')] [2024-06-22 21:51:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297457_4873535488.pth... [2024-06-22 21:51:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000296830_4863262720.pth [2024-06-22 21:51:44,018][15401] Updated weights for policy 0, policy_version 297460 (0.0037) [2024-06-22 21:51:47,386][15401] Updated weights for policy 0, policy_version 297470 (0.0045) [2024-06-22 21:51:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4873764864. Throughput: 0: 42785.5. Samples: 4873923520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-22 21:51:51,667][15401] Updated weights for policy 0, policy_version 297480 (0.0040) [2024-06-22 21:51:53,389][15132] Fps is (10 sec: 47544.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4873994240. Throughput: 0: 42867.5. Samples: 4874058820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 21:51:55,146][15401] Updated weights for policy 0, policy_version 297490 (0.0027) [2024-06-22 21:51:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4874174464. Throughput: 0: 42856.3. Samples: 4874316760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:51:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 21:51:59,428][15401] Updated weights for policy 0, policy_version 297500 (0.0037) [2024-06-22 21:52:02,622][15401] Updated weights for policy 0, policy_version 297510 (0.0033) [2024-06-22 21:52:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 4874420224. Throughput: 0: 42985.4. Samples: 4874569220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 21:52:07,305][15401] Updated weights for policy 0, policy_version 297520 (0.0041) [2024-06-22 21:52:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4874616832. Throughput: 0: 43026.7. Samples: 4874701400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 21:52:10,288][15401] Updated weights for policy 0, policy_version 297530 (0.0037) [2024-06-22 21:52:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 4874813440. Throughput: 0: 42708.1. Samples: 4874954880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 21:52:14,915][15401] Updated weights for policy 0, policy_version 297540 (0.0027) [2024-06-22 21:52:17,752][15401] Updated weights for policy 0, policy_version 297550 (0.0042) [2024-06-22 21:52:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4875075584. Throughput: 0: 42748.9. Samples: 4875208600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 21:52:22,546][15401] Updated weights for policy 0, policy_version 297560 (0.0033) [2024-06-22 21:52:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 4875239424. Throughput: 0: 42841.3. Samples: 4875341560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:23,392][15132] Avg episode reward: [(0, '0.281')] [2024-06-22 21:52:25,391][15401] Updated weights for policy 0, policy_version 297570 (0.0032) [2024-06-22 21:52:28,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 4875436032. Throughput: 0: 42583.0. Samples: 4875590820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 21:52:30,218][15401] Updated weights for policy 0, policy_version 297580 (0.0039) [2024-06-22 21:52:31,549][15349] Signal inference workers to stop experience collection... (72100 times) [2024-06-22 21:52:31,553][15349] Signal inference workers to resume experience collection... (72100 times) [2024-06-22 21:52:31,590][15401] InferenceWorker_p0-w0: stopping experience collection (72100 times) [2024-06-22 21:52:31,590][15401] InferenceWorker_p0-w0: resuming experience collection (72100 times) [2024-06-22 21:52:33,357][15401] Updated weights for policy 0, policy_version 297590 (0.0028) [2024-06-22 21:52:33,389][15132] Fps is (10 sec: 47525.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4875714560. Throughput: 0: 42756.0. Samples: 4875847540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 21:52:37,832][15401] Updated weights for policy 0, policy_version 297600 (0.0040) [2024-06-22 21:52:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4875894784. Throughput: 0: 42824.9. Samples: 4875985940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 21:52:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 21:52:40,898][15401] Updated weights for policy 0, policy_version 297610 (0.0040) [2024-06-22 21:52:43,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 4876091392. Throughput: 0: 42658.6. Samples: 4876236400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:52:43,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-22 21:52:45,675][15401] Updated weights for policy 0, policy_version 297620 (0.0026) [2024-06-22 21:52:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4876337152. Throughput: 0: 42415.1. Samples: 4876477900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:52:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 21:52:49,190][15401] Updated weights for policy 0, policy_version 297630 (0.0041) [2024-06-22 21:52:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4876517376. Throughput: 0: 42619.6. Samples: 4876619280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:52:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 21:52:53,451][15401] Updated weights for policy 0, policy_version 297640 (0.0042) [2024-06-22 21:52:56,682][15401] Updated weights for policy 0, policy_version 297650 (0.0047) [2024-06-22 21:52:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4876746752. Throughput: 0: 42503.9. Samples: 4876867560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:52:58,394][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 21:53:01,240][15401] Updated weights for policy 0, policy_version 297660 (0.0035) [2024-06-22 21:53:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4876992512. Throughput: 0: 42486.6. Samples: 4877120500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 21:53:04,065][15401] Updated weights for policy 0, policy_version 297670 (0.0024) [2024-06-22 21:53:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4877156352. Throughput: 0: 42565.8. Samples: 4877256920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 21:53:08,896][15401] Updated weights for policy 0, policy_version 297680 (0.0023) [2024-06-22 21:53:11,856][15401] Updated weights for policy 0, policy_version 297690 (0.0025) [2024-06-22 21:53:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4877402112. Throughput: 0: 42674.5. Samples: 4877511180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:53:16,425][15401] Updated weights for policy 0, policy_version 297700 (0.0038) [2024-06-22 21:53:18,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4877631488. Throughput: 0: 42643.5. Samples: 4877766500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 21:53:19,563][15401] Updated weights for policy 0, policy_version 297710 (0.0025) [2024-06-22 21:53:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 4877811712. Throughput: 0: 42542.7. Samples: 4877900360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:23,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-22 21:53:24,018][15401] Updated weights for policy 0, policy_version 297720 (0.0049) [2024-06-22 21:53:27,062][15401] Updated weights for policy 0, policy_version 297730 (0.0033) [2024-06-22 21:53:28,396][15132] Fps is (10 sec: 40933.9, 60 sec: 43412.9, 300 sec: 42764.1). Total num frames: 4878041088. Throughput: 0: 42491.0. Samples: 4878148760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:28,397][15132] Avg episode reward: [(0, '0.794')] [2024-06-22 21:53:31,657][15401] Updated weights for policy 0, policy_version 297740 (0.0024) [2024-06-22 21:53:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4878270464. Throughput: 0: 42888.4. Samples: 4878407880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:33,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-22 21:53:34,670][15401] Updated weights for policy 0, policy_version 297750 (0.0030) [2024-06-22 21:53:38,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4878450688. Throughput: 0: 42745.2. Samples: 4878542820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:38,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-22 21:53:39,489][15401] Updated weights for policy 0, policy_version 297760 (0.0033) [2024-06-22 21:53:42,539][15401] Updated weights for policy 0, policy_version 297770 (0.0032) [2024-06-22 21:53:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 4878663680. Throughput: 0: 42721.7. Samples: 4878790040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 21:53:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297771_4878680064.pth... [2024-06-22 21:53:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297145_4868423680.pth [2024-06-22 21:53:47,220][15349] Signal inference workers to stop experience collection... (72150 times) [2024-06-22 21:53:47,221][15349] Signal inference workers to resume experience collection... (72150 times) [2024-06-22 21:53:47,229][15401] Updated weights for policy 0, policy_version 297780 (0.0032) [2024-06-22 21:53:47,250][15401] InferenceWorker_p0-w0: stopping experience collection (72150 times) [2024-06-22 21:53:47,250][15401] InferenceWorker_p0-w0: resuming experience collection (72150 times) [2024-06-22 21:53:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4878893056. Throughput: 0: 42936.0. Samples: 4879052620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 21:53:49,960][15401] Updated weights for policy 0, policy_version 297790 (0.0036) [2024-06-22 21:53:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4879089664. Throughput: 0: 42785.8. Samples: 4879182280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 21:53:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 21:53:54,735][15401] Updated weights for policy 0, policy_version 297800 (0.0031) [2024-06-22 21:53:57,821][15401] Updated weights for policy 0, policy_version 297810 (0.0045) [2024-06-22 21:53:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4879319040. Throughput: 0: 42735.6. Samples: 4879434280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:53:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 21:54:02,276][15401] Updated weights for policy 0, policy_version 297820 (0.0031) [2024-06-22 21:54:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4879532032. Throughput: 0: 42922.3. Samples: 4879698000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 21:54:05,311][15401] Updated weights for policy 0, policy_version 297830 (0.0031) [2024-06-22 21:54:08,391][15132] Fps is (10 sec: 40956.1, 60 sec: 42870.8, 300 sec: 42653.8). Total num frames: 4879728640. Throughput: 0: 42777.2. Samples: 4879825380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:08,391][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 21:54:09,945][15401] Updated weights for policy 0, policy_version 297840 (0.0049) [2024-06-22 21:54:13,106][15401] Updated weights for policy 0, policy_version 297850 (0.0024) [2024-06-22 21:54:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4879974400. Throughput: 0: 42784.4. Samples: 4880073780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 21:54:17,597][15401] Updated weights for policy 0, policy_version 297860 (0.0029) [2024-06-22 21:54:18,389][15132] Fps is (10 sec: 44241.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4880171008. Throughput: 0: 42985.8. Samples: 4880342240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 21:54:20,933][15401] Updated weights for policy 0, policy_version 297870 (0.0029) [2024-06-22 21:54:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 4880384000. Throughput: 0: 42778.6. Samples: 4880467860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 21:54:25,204][15401] Updated weights for policy 0, policy_version 297880 (0.0030) [2024-06-22 21:54:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42874.4, 300 sec: 42875.7). Total num frames: 4880613376. Throughput: 0: 43031.6. Samples: 4880726560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:28,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 21:54:28,444][15401] Updated weights for policy 0, policy_version 297890 (0.0027) [2024-06-22 21:54:32,810][15401] Updated weights for policy 0, policy_version 297900 (0.0031) [2024-06-22 21:54:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4880809984. Throughput: 0: 42931.2. Samples: 4880984520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:33,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 21:54:36,094][15401] Updated weights for policy 0, policy_version 297910 (0.0033) [2024-06-22 21:54:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4881022976. Throughput: 0: 42822.3. Samples: 4881109280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 21:54:40,364][15401] Updated weights for policy 0, policy_version 297920 (0.0033) [2024-06-22 21:54:43,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43416.0, 300 sec: 42875.7). Total num frames: 4881268736. Throughput: 0: 43009.3. Samples: 4881369800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:43,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 21:54:43,565][15401] Updated weights for policy 0, policy_version 297930 (0.0033) [2024-06-22 21:54:47,924][15401] Updated weights for policy 0, policy_version 297940 (0.0032) [2024-06-22 21:54:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4881448960. Throughput: 0: 42890.2. Samples: 4881628060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 21:54:51,419][15401] Updated weights for policy 0, policy_version 297950 (0.0049) [2024-06-22 21:54:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4881678336. Throughput: 0: 42796.9. Samples: 4881751200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 21:54:55,491][15401] Updated weights for policy 0, policy_version 297960 (0.0038) [2024-06-22 21:54:58,040][15349] Signal inference workers to stop experience collection... (72200 times) [2024-06-22 21:54:58,094][15401] InferenceWorker_p0-w0: stopping experience collection (72200 times) [2024-06-22 21:54:58,153][15349] Signal inference workers to resume experience collection... (72200 times) [2024-06-22 21:54:58,153][15401] InferenceWorker_p0-w0: resuming experience collection (72200 times) [2024-06-22 21:54:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4881907712. Throughput: 0: 42965.2. Samples: 4882007220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:54:58,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-22 21:54:59,089][15401] Updated weights for policy 0, policy_version 297970 (0.0022) [2024-06-22 21:55:03,120][15401] Updated weights for policy 0, policy_version 297980 (0.0040) [2024-06-22 21:55:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4882104320. Throughput: 0: 42791.9. Samples: 4882267880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 21:55:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 21:55:07,033][15401] Updated weights for policy 0, policy_version 297990 (0.0032) [2024-06-22 21:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43145.3, 300 sec: 42653.9). Total num frames: 4882317312. Throughput: 0: 42821.1. Samples: 4882394800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:08,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 21:55:11,107][15401] Updated weights for policy 0, policy_version 298000 (0.0031) [2024-06-22 21:55:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4882546688. Throughput: 0: 42713.0. Samples: 4882648540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 21:55:14,725][15401] Updated weights for policy 0, policy_version 298010 (0.0023) [2024-06-22 21:55:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4882743296. Throughput: 0: 42814.7. Samples: 4882911180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 21:55:18,476][15401] Updated weights for policy 0, policy_version 298020 (0.0026) [2024-06-22 21:55:22,360][15401] Updated weights for policy 0, policy_version 298030 (0.0035) [2024-06-22 21:55:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4882956288. Throughput: 0: 42835.4. Samples: 4883036880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 21:55:25,928][15401] Updated weights for policy 0, policy_version 298040 (0.0034) [2024-06-22 21:55:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 4883202048. Throughput: 0: 42887.6. Samples: 4883299640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 21:55:29,896][15401] Updated weights for policy 0, policy_version 298050 (0.0030) [2024-06-22 21:55:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4883398656. Throughput: 0: 42832.0. Samples: 4883555500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 21:55:33,729][15401] Updated weights for policy 0, policy_version 298060 (0.0055) [2024-06-22 21:55:37,400][15401] Updated weights for policy 0, policy_version 298070 (0.0041) [2024-06-22 21:55:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 4883595264. Throughput: 0: 42871.5. Samples: 4883680520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:38,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 21:55:41,278][15401] Updated weights for policy 0, policy_version 298080 (0.0034) [2024-06-22 21:55:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 4883808256. Throughput: 0: 42884.5. Samples: 4883937020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-22 21:55:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298085_4883824640.pth... [2024-06-22 21:55:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297457_4873535488.pth [2024-06-22 21:55:45,239][15401] Updated weights for policy 0, policy_version 298090 (0.0033) [2024-06-22 21:55:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4884037632. Throughput: 0: 42874.3. Samples: 4884197220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 21:55:48,953][15401] Updated weights for policy 0, policy_version 298100 (0.0037) [2024-06-22 21:55:52,745][15401] Updated weights for policy 0, policy_version 298110 (0.0036) [2024-06-22 21:55:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 4884250624. Throughput: 0: 42878.6. Samples: 4884324440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:53,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 21:55:56,573][15401] Updated weights for policy 0, policy_version 298120 (0.0030) [2024-06-22 21:55:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4884463616. Throughput: 0: 43007.5. Samples: 4884583880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:55:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 21:56:00,434][15401] Updated weights for policy 0, policy_version 298130 (0.0032) [2024-06-22 21:56:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 4884692992. Throughput: 0: 42788.9. Samples: 4884836680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:56:03,396][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 21:56:04,147][15401] Updated weights for policy 0, policy_version 298140 (0.0024) [2024-06-22 21:56:08,082][15401] Updated weights for policy 0, policy_version 298150 (0.0034) [2024-06-22 21:56:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4884905984. Throughput: 0: 42959.2. Samples: 4884970040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:56:08,395][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 21:56:11,628][15401] Updated weights for policy 0, policy_version 298160 (0.0037) [2024-06-22 21:56:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4885102592. Throughput: 0: 42728.1. Samples: 4885222400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:56:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 21:56:16,174][15401] Updated weights for policy 0, policy_version 298170 (0.0041) [2024-06-22 21:56:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4885315584. Throughput: 0: 42804.9. Samples: 4885481720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-22 21:56:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:56:19,197][15401] Updated weights for policy 0, policy_version 298180 (0.0033) [2024-06-22 21:56:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4885512192. Throughput: 0: 42949.0. Samples: 4885613120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 21:56:23,823][15401] Updated weights for policy 0, policy_version 298190 (0.0022) [2024-06-22 21:56:26,879][15401] Updated weights for policy 0, policy_version 298200 (0.0034) [2024-06-22 21:56:28,391][15132] Fps is (10 sec: 44231.4, 60 sec: 42597.6, 300 sec: 42875.9). Total num frames: 4885757952. Throughput: 0: 42915.3. Samples: 4885868260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:28,391][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 21:56:31,286][15401] Updated weights for policy 0, policy_version 298210 (0.0034) [2024-06-22 21:56:32,775][15349] Signal inference workers to stop experience collection... (72250 times) [2024-06-22 21:56:32,775][15349] Signal inference workers to resume experience collection... (72250 times) [2024-06-22 21:56:32,823][15401] InferenceWorker_p0-w0: stopping experience collection (72250 times) [2024-06-22 21:56:32,823][15401] InferenceWorker_p0-w0: resuming experience collection (72250 times) [2024-06-22 21:56:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4885970944. Throughput: 0: 42973.3. Samples: 4886131020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 21:56:34,561][15401] Updated weights for policy 0, policy_version 298220 (0.0039) [2024-06-22 21:56:38,389][15132] Fps is (10 sec: 39326.5, 60 sec: 42600.1, 300 sec: 42821.5). Total num frames: 4886151168. Throughput: 0: 43020.6. Samples: 4886260260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 21:56:38,948][15401] Updated weights for policy 0, policy_version 298230 (0.0031) [2024-06-22 21:56:42,378][15401] Updated weights for policy 0, policy_version 298240 (0.0024) [2024-06-22 21:56:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 4886413312. Throughput: 0: 42783.6. Samples: 4886509140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 21:56:46,521][15401] Updated weights for policy 0, policy_version 298250 (0.0038) [2024-06-22 21:56:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4886593536. Throughput: 0: 42885.3. Samples: 4886766520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 21:56:50,056][15401] Updated weights for policy 0, policy_version 298260 (0.0029) [2024-06-22 21:56:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 4886790144. Throughput: 0: 42665.8. Samples: 4886890000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 21:56:54,125][15401] Updated weights for policy 0, policy_version 298270 (0.0032) [2024-06-22 21:56:57,908][15401] Updated weights for policy 0, policy_version 298280 (0.0056) [2024-06-22 21:56:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4887035904. Throughput: 0: 42731.5. Samples: 4887145320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:56:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 21:57:01,781][15401] Updated weights for policy 0, policy_version 298290 (0.0030) [2024-06-22 21:57:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 4887232512. Throughput: 0: 42673.7. Samples: 4887402140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:03,392][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 21:57:05,491][15401] Updated weights for policy 0, policy_version 298300 (0.0036) [2024-06-22 21:57:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4887445504. Throughput: 0: 42616.0. Samples: 4887530840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 21:57:09,334][15401] Updated weights for policy 0, policy_version 298310 (0.0031) [2024-06-22 21:57:13,382][15401] Updated weights for policy 0, policy_version 298320 (0.0028) [2024-06-22 21:57:13,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4887674880. Throughput: 0: 42687.0. Samples: 4887789120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:13,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 21:57:17,009][15401] Updated weights for policy 0, policy_version 298330 (0.0034) [2024-06-22 21:57:18,391][15132] Fps is (10 sec: 42590.3, 60 sec: 42597.0, 300 sec: 42820.6). Total num frames: 4887871488. Throughput: 0: 42553.8. Samples: 4888046020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:18,392][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 21:57:21,039][15401] Updated weights for policy 0, policy_version 298340 (0.0044) [2024-06-22 21:57:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4888084480. Throughput: 0: 42430.6. Samples: 4888169640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 21:57:24,629][15401] Updated weights for policy 0, policy_version 298350 (0.0032) [2024-06-22 21:57:28,389][15132] Fps is (10 sec: 42606.7, 60 sec: 42326.2, 300 sec: 42653.9). Total num frames: 4888297472. Throughput: 0: 42685.8. Samples: 4888430000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 21:57:28,691][15401] Updated weights for policy 0, policy_version 298360 (0.0042) [2024-06-22 21:57:32,462][15401] Updated weights for policy 0, policy_version 298370 (0.0027) [2024-06-22 21:57:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4888526848. Throughput: 0: 42521.2. Samples: 4888679980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-22 21:57:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 21:57:36,481][15401] Updated weights for policy 0, policy_version 298380 (0.0047) [2024-06-22 21:57:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4888707072. Throughput: 0: 42529.8. Samples: 4888803840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:57:38,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 21:57:40,051][15401] Updated weights for policy 0, policy_version 298390 (0.0033) [2024-06-22 21:57:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4888952832. Throughput: 0: 42596.8. Samples: 4889062180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:57:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 21:57:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298398_4888952832.pth... [2024-06-22 21:57:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000297771_4878680064.pth [2024-06-22 21:57:44,019][15401] Updated weights for policy 0, policy_version 298400 (0.0037) [2024-06-22 21:57:48,171][15401] Updated weights for policy 0, policy_version 298410 (0.0031) [2024-06-22 21:57:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4889149440. Throughput: 0: 42505.0. Samples: 4889314760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:57:48,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 21:57:51,678][15401] Updated weights for policy 0, policy_version 298420 (0.0036) [2024-06-22 21:57:53,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4889346048. Throughput: 0: 42368.4. Samples: 4889437420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:57:53,391][15132] Avg episode reward: [(0, '0.297')] [2024-06-22 21:57:55,809][15401] Updated weights for policy 0, policy_version 298430 (0.0037) [2024-06-22 21:57:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4889591808. Throughput: 0: 42457.7. Samples: 4889699720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:57:58,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-22 21:57:59,598][15401] Updated weights for policy 0, policy_version 298440 (0.0031) [2024-06-22 21:58:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 4889788416. Throughput: 0: 42366.3. Samples: 4889952420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 21:58:03,487][15401] Updated weights for policy 0, policy_version 298450 (0.0037) [2024-06-22 21:58:07,064][15401] Updated weights for policy 0, policy_version 298460 (0.0040) [2024-06-22 21:58:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4889985024. Throughput: 0: 42537.9. Samples: 4890083840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 21:58:09,621][15349] Signal inference workers to stop experience collection... (72300 times) [2024-06-22 21:58:09,660][15401] InferenceWorker_p0-w0: stopping experience collection (72300 times) [2024-06-22 21:58:09,668][15349] Signal inference workers to resume experience collection... (72300 times) [2024-06-22 21:58:09,675][15401] InferenceWorker_p0-w0: resuming experience collection (72300 times) [2024-06-22 21:58:10,774][15401] Updated weights for policy 0, policy_version 298470 (0.0031) [2024-06-22 21:58:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4890230784. Throughput: 0: 42672.3. Samples: 4890350260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 21:58:14,427][15401] Updated weights for policy 0, policy_version 298480 (0.0025) [2024-06-22 21:58:18,318][15401] Updated weights for policy 0, policy_version 298490 (0.0023) [2024-06-22 21:58:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43145.9, 300 sec: 42876.1). Total num frames: 4890460160. Throughput: 0: 42830.7. Samples: 4890607360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:18,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 21:58:21,972][15401] Updated weights for policy 0, policy_version 298500 (0.0040) [2024-06-22 21:58:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 4890640384. Throughput: 0: 42993.3. Samples: 4890738540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 21:58:25,903][15401] Updated weights for policy 0, policy_version 298510 (0.0041) [2024-06-22 21:58:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 4890869760. Throughput: 0: 42928.0. Samples: 4890994040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:28,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 21:58:30,027][15401] Updated weights for policy 0, policy_version 298520 (0.0027) [2024-06-22 21:58:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4891099136. Throughput: 0: 43055.0. Samples: 4891252240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 21:58:33,501][15401] Updated weights for policy 0, policy_version 298530 (0.0035) [2024-06-22 21:58:37,463][15401] Updated weights for policy 0, policy_version 298540 (0.0039) [2024-06-22 21:58:38,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4891295744. Throughput: 0: 43254.1. Samples: 4891383860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:38,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 21:58:41,080][15401] Updated weights for policy 0, policy_version 298550 (0.0036) [2024-06-22 21:58:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4891525120. Throughput: 0: 43156.4. Samples: 4891641860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-22 21:58:43,392][15132] Avg episode reward: [(0, '0.061')] [2024-06-22 21:58:45,380][15401] Updated weights for policy 0, policy_version 298560 (0.0025) [2024-06-22 21:58:48,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4891754496. Throughput: 0: 43230.6. Samples: 4891897800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:58:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 21:58:48,533][15401] Updated weights for policy 0, policy_version 298570 (0.0028) [2024-06-22 21:58:52,924][15401] Updated weights for policy 0, policy_version 298580 (0.0031) [2024-06-22 21:58:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4891934720. Throughput: 0: 43294.6. Samples: 4892032100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:58:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 21:58:56,046][15401] Updated weights for policy 0, policy_version 298590 (0.0039) [2024-06-22 21:58:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4892164096. Throughput: 0: 43043.1. Samples: 4892287200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:58:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 21:59:00,456][15401] Updated weights for policy 0, policy_version 298600 (0.0053) [2024-06-22 21:59:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42931.8). Total num frames: 4892393472. Throughput: 0: 43120.5. Samples: 4892547780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:03,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 21:59:04,015][15401] Updated weights for policy 0, policy_version 298610 (0.0027) [2024-06-22 21:59:07,983][15401] Updated weights for policy 0, policy_version 298620 (0.0029) [2024-06-22 21:59:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 4892590080. Throughput: 0: 43141.4. Samples: 4892679900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:08,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 21:59:11,642][15401] Updated weights for policy 0, policy_version 298630 (0.0039) [2024-06-22 21:59:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4892819456. Throughput: 0: 43076.0. Samples: 4892932360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 21:59:15,495][15401] Updated weights for policy 0, policy_version 298640 (0.0039) [2024-06-22 21:59:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4893016064. Throughput: 0: 42971.2. Samples: 4893185940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 21:59:19,138][15401] Updated weights for policy 0, policy_version 298650 (0.0025) [2024-06-22 21:59:23,340][15401] Updated weights for policy 0, policy_version 298660 (0.0027) [2024-06-22 21:59:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 42820.9). Total num frames: 4893245440. Throughput: 0: 42915.2. Samples: 4893315040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 21:59:26,748][15401] Updated weights for policy 0, policy_version 298670 (0.0040) [2024-06-22 21:59:28,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 4893458432. Throughput: 0: 42845.7. Samples: 4893569920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:28,393][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 21:59:31,274][15401] Updated weights for policy 0, policy_version 298680 (0.0032) [2024-06-22 21:59:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4893671424. Throughput: 0: 42799.0. Samples: 4893823760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 21:59:34,636][15401] Updated weights for policy 0, policy_version 298690 (0.0032) [2024-06-22 21:59:38,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 4893868032. Throughput: 0: 42680.4. Samples: 4893952720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:38,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-22 21:59:38,775][15401] Updated weights for policy 0, policy_version 298700 (0.0037) [2024-06-22 21:59:42,602][15401] Updated weights for policy 0, policy_version 298710 (0.0038) [2024-06-22 21:59:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 4894113792. Throughput: 0: 42877.5. Samples: 4894216680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:43,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 21:59:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298714_4894130176.pth... [2024-06-22 21:59:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298085_4883824640.pth [2024-06-22 21:59:47,086][15401] Updated weights for policy 0, policy_version 298720 (0.0038) [2024-06-22 21:59:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4894310400. Throughput: 0: 42628.0. Samples: 4894466040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 21:59:50,168][15401] Updated weights for policy 0, policy_version 298730 (0.0034) [2024-06-22 21:59:53,312][15349] Signal inference workers to stop experience collection... (72350 times) [2024-06-22 21:59:53,317][15349] Signal inference workers to resume experience collection... (72350 times) [2024-06-22 21:59:53,334][15401] InferenceWorker_p0-w0: stopping experience collection (72350 times) [2024-06-22 21:59:53,334][15401] InferenceWorker_p0-w0: resuming experience collection (72350 times) [2024-06-22 21:59:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4894507008. Throughput: 0: 42504.4. Samples: 4894592600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-22 21:59:54,646][15401] Updated weights for policy 0, policy_version 298740 (0.0030) [2024-06-22 21:59:57,687][15401] Updated weights for policy 0, policy_version 298750 (0.0035) [2024-06-22 21:59:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4894736384. Throughput: 0: 42708.6. Samples: 4894854240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-22 21:59:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 22:00:02,315][15401] Updated weights for policy 0, policy_version 298760 (0.0040) [2024-06-22 22:00:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4894965760. Throughput: 0: 42859.9. Samples: 4895114640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 22:00:05,134][15401] Updated weights for policy 0, policy_version 298770 (0.0031) [2024-06-22 22:00:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4895162368. Throughput: 0: 42792.4. Samples: 4895240700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:08,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 22:00:09,755][15401] Updated weights for policy 0, policy_version 298780 (0.0033) [2024-06-22 22:00:13,023][15401] Updated weights for policy 0, policy_version 298790 (0.0035) [2024-06-22 22:00:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4895375360. Throughput: 0: 42844.9. Samples: 4895497840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 22:00:17,303][15401] Updated weights for policy 0, policy_version 298800 (0.0032) [2024-06-22 22:00:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 4895571968. Throughput: 0: 43000.8. Samples: 4895758900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:18,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 22:00:20,525][15401] Updated weights for policy 0, policy_version 298810 (0.0036) [2024-06-22 22:00:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4895817728. Throughput: 0: 42928.9. Samples: 4895884520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 22:00:25,003][15401] Updated weights for policy 0, policy_version 298820 (0.0029) [2024-06-22 22:00:28,061][15401] Updated weights for policy 0, policy_version 298830 (0.0041) [2024-06-22 22:00:28,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 4896030720. Throughput: 0: 42879.0. Samples: 4896146240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 22:00:32,698][15401] Updated weights for policy 0, policy_version 298840 (0.0031) [2024-06-22 22:00:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 4896194560. Throughput: 0: 43163.6. Samples: 4896408400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 22:00:35,532][15401] Updated weights for policy 0, policy_version 298850 (0.0029) [2024-06-22 22:00:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 4896473088. Throughput: 0: 42997.3. Samples: 4896527480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 22:00:40,330][15401] Updated weights for policy 0, policy_version 298860 (0.0041) [2024-06-22 22:00:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4896669696. Throughput: 0: 42996.0. Samples: 4896789060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:43,390][15132] Avg episode reward: [(0, '0.165')] [2024-06-22 22:00:43,469][15401] Updated weights for policy 0, policy_version 298870 (0.0034) [2024-06-22 22:00:48,316][15401] Updated weights for policy 0, policy_version 298880 (0.0029) [2024-06-22 22:00:48,395][15132] Fps is (10 sec: 37663.1, 60 sec: 42321.6, 300 sec: 42709.1). Total num frames: 4896849920. Throughput: 0: 43159.4. Samples: 4897057040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:48,395][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 22:00:51,010][15401] Updated weights for policy 0, policy_version 298890 (0.0030) [2024-06-22 22:00:51,320][15349] Signal inference workers to stop experience collection... (72400 times) [2024-06-22 22:00:51,362][15401] InferenceWorker_p0-w0: stopping experience collection (72400 times) [2024-06-22 22:00:51,436][15349] Signal inference workers to resume experience collection... (72400 times) [2024-06-22 22:00:51,436][15401] InferenceWorker_p0-w0: resuming experience collection (72400 times) [2024-06-22 22:00:53,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43415.8, 300 sec: 42875.7). Total num frames: 4897112064. Throughput: 0: 42918.2. Samples: 4897172120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:53,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 22:00:55,875][15401] Updated weights for policy 0, policy_version 298900 (0.0033) [2024-06-22 22:00:58,392][15132] Fps is (10 sec: 47527.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4897325056. Throughput: 0: 43144.4. Samples: 4897439440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:00:58,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 22:00:58,549][15401] Updated weights for policy 0, policy_version 298910 (0.0028) [2024-06-22 22:01:03,339][15401] Updated weights for policy 0, policy_version 298920 (0.0031) [2024-06-22 22:01:03,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4897505280. Throughput: 0: 43226.3. Samples: 4897703980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:01:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 22:01:06,174][15401] Updated weights for policy 0, policy_version 298930 (0.0065) [2024-06-22 22:01:08,389][15132] Fps is (10 sec: 44247.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 4897767424. Throughput: 0: 43057.8. Samples: 4897822120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:01:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 22:01:10,746][15401] Updated weights for policy 0, policy_version 298940 (0.0026) [2024-06-22 22:01:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4897947648. Throughput: 0: 43165.5. Samples: 4898088680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-22 22:01:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 22:01:13,792][15401] Updated weights for policy 0, policy_version 298950 (0.0045) [2024-06-22 22:01:18,262][15401] Updated weights for policy 0, policy_version 298960 (0.0033) [2024-06-22 22:01:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 4898160640. Throughput: 0: 42984.7. Samples: 4898342720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 22:01:21,376][15401] Updated weights for policy 0, policy_version 298970 (0.0037) [2024-06-22 22:01:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 4898406400. Throughput: 0: 43056.7. Samples: 4898465040. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 22:01:26,203][15401] Updated weights for policy 0, policy_version 298980 (0.0036) [2024-06-22 22:01:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4898586624. Throughput: 0: 43176.0. Samples: 4898731980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 22:01:28,940][15401] Updated weights for policy 0, policy_version 298990 (0.0036) [2024-06-22 22:01:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4898799616. Throughput: 0: 42948.0. Samples: 4898989480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 22:01:33,542][15401] Updated weights for policy 0, policy_version 299000 (0.0033) [2024-06-22 22:01:36,659][15401] Updated weights for policy 0, policy_version 299010 (0.0029) [2024-06-22 22:01:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4899061760. Throughput: 0: 43119.7. Samples: 4899112400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 22:01:41,363][15401] Updated weights for policy 0, policy_version 299020 (0.0021) [2024-06-22 22:01:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4899241984. Throughput: 0: 43127.9. Samples: 4899380100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 22:01:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299026_4899241984.pth... [2024-06-22 22:01:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298398_4888952832.pth [2024-06-22 22:01:44,401][15401] Updated weights for policy 0, policy_version 299030 (0.0032) [2024-06-22 22:01:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 43148.3, 300 sec: 42876.1). Total num frames: 4899438592. Throughput: 0: 42852.5. Samples: 4899632340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 22:01:48,953][15401] Updated weights for policy 0, policy_version 299040 (0.0030) [2024-06-22 22:01:52,083][15401] Updated weights for policy 0, policy_version 299050 (0.0030) [2024-06-22 22:01:52,625][15349] Signal inference workers to stop experience collection... (72450 times) [2024-06-22 22:01:52,682][15401] InferenceWorker_p0-w0: stopping experience collection (72450 times) [2024-06-22 22:01:52,742][15349] Signal inference workers to resume experience collection... (72450 times) [2024-06-22 22:01:52,743][15401] InferenceWorker_p0-w0: resuming experience collection (72450 times) [2024-06-22 22:01:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43419.3, 300 sec: 42987.2). Total num frames: 4899717120. Throughput: 0: 43027.4. Samples: 4899758360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 22:01:56,457][15401] Updated weights for policy 0, policy_version 299060 (0.0038) [2024-06-22 22:01:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42876.5). Total num frames: 4899880960. Throughput: 0: 42929.3. Samples: 4900020500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:01:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 22:01:59,667][15401] Updated weights for policy 0, policy_version 299070 (0.0026) [2024-06-22 22:02:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4900093952. Throughput: 0: 42759.1. Samples: 4900266880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:03,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-22 22:02:04,043][15401] Updated weights for policy 0, policy_version 299080 (0.0034) [2024-06-22 22:02:07,264][15401] Updated weights for policy 0, policy_version 299090 (0.0043) [2024-06-22 22:02:08,392][15132] Fps is (10 sec: 47501.8, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 4900356096. Throughput: 0: 43090.2. Samples: 4900404200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:08,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 22:02:11,532][15401] Updated weights for policy 0, policy_version 299100 (0.0030) [2024-06-22 22:02:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.8). Total num frames: 4900503552. Throughput: 0: 42732.3. Samples: 4900654940. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 22:02:15,167][15401] Updated weights for policy 0, policy_version 299110 (0.0039) [2024-06-22 22:02:18,389][15132] Fps is (10 sec: 39331.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4900749312. Throughput: 0: 42608.6. Samples: 4900906860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:18,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 22:02:19,118][15401] Updated weights for policy 0, policy_version 299120 (0.0045) [2024-06-22 22:02:22,778][15401] Updated weights for policy 0, policy_version 299130 (0.0038) [2024-06-22 22:02:23,395][15132] Fps is (10 sec: 47489.7, 60 sec: 42867.9, 300 sec: 42986.4). Total num frames: 4900978688. Throughput: 0: 42958.1. Samples: 4901045740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:23,395][15132] Avg episode reward: [(0, '0.792')] [2024-06-22 22:02:26,758][15401] Updated weights for policy 0, policy_version 299140 (0.0039) [2024-06-22 22:02:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4901142528. Throughput: 0: 42675.3. Samples: 4901300480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-22 22:02:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 22:02:30,253][15401] Updated weights for policy 0, policy_version 299150 (0.0043) [2024-06-22 22:02:33,389][15132] Fps is (10 sec: 42620.5, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 4901404672. Throughput: 0: 42749.9. Samples: 4901556080. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 22:02:34,513][15401] Updated weights for policy 0, policy_version 299160 (0.0036) [2024-06-22 22:02:37,814][15401] Updated weights for policy 0, policy_version 299170 (0.0033) [2024-06-22 22:02:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 4901617664. Throughput: 0: 43014.4. Samples: 4901694000. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 22:02:42,166][15401] Updated weights for policy 0, policy_version 299180 (0.0034) [2024-06-22 22:02:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4901797888. Throughput: 0: 42813.6. Samples: 4901947120. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 22:02:45,584][15401] Updated weights for policy 0, policy_version 299190 (0.0027) [2024-06-22 22:02:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4902027264. Throughput: 0: 42872.6. Samples: 4902196140. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 22:02:49,888][15401] Updated weights for policy 0, policy_version 299200 (0.0036) [2024-06-22 22:02:53,284][15401] Updated weights for policy 0, policy_version 299210 (0.0042) [2024-06-22 22:02:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 4902256640. Throughput: 0: 42873.0. Samples: 4902333380. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 22:02:57,351][15401] Updated weights for policy 0, policy_version 299220 (0.0037) [2024-06-22 22:02:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4902453248. Throughput: 0: 42957.0. Samples: 4902588000. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:02:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 22:02:59,641][15349] Signal inference workers to stop experience collection... (72500 times) [2024-06-22 22:02:59,642][15349] Signal inference workers to resume experience collection... (72500 times) [2024-06-22 22:02:59,666][15401] InferenceWorker_p0-w0: stopping experience collection (72500 times) [2024-06-22 22:02:59,667][15401] InferenceWorker_p0-w0: resuming experience collection (72500 times) [2024-06-22 22:03:00,979][15401] Updated weights for policy 0, policy_version 299230 (0.0031) [2024-06-22 22:03:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 4902682624. Throughput: 0: 42847.1. Samples: 4902834980. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:03,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-22 22:03:04,938][15401] Updated weights for policy 0, policy_version 299240 (0.0042) [2024-06-22 22:03:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42053.9, 300 sec: 42876.1). Total num frames: 4902879232. Throughput: 0: 42771.0. Samples: 4902970220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 22:03:08,639][15401] Updated weights for policy 0, policy_version 299250 (0.0032) [2024-06-22 22:03:12,627][15401] Updated weights for policy 0, policy_version 299260 (0.0031) [2024-06-22 22:03:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4903092224. Throughput: 0: 42733.7. Samples: 4903223500. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 22:03:16,336][15401] Updated weights for policy 0, policy_version 299270 (0.0033) [2024-06-22 22:03:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 4903337984. Throughput: 0: 42656.2. Samples: 4903475620. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 22:03:20,465][15401] Updated weights for policy 0, policy_version 299280 (0.0040) [2024-06-22 22:03:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42055.8, 300 sec: 42820.9). Total num frames: 4903501824. Throughput: 0: 42530.1. Samples: 4903607860. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:23,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 22:03:24,037][15401] Updated weights for policy 0, policy_version 299290 (0.0041) [2024-06-22 22:03:27,913][15401] Updated weights for policy 0, policy_version 299300 (0.0023) [2024-06-22 22:03:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4903747584. Throughput: 0: 42580.5. Samples: 4903863240. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 22:03:31,720][15401] Updated weights for policy 0, policy_version 299310 (0.0046) [2024-06-22 22:03:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 4903976960. Throughput: 0: 42706.2. Samples: 4904117920. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-22 22:03:35,598][15401] Updated weights for policy 0, policy_version 299320 (0.0035) [2024-06-22 22:03:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 4904157184. Throughput: 0: 42572.4. Samples: 4904249140. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-22 22:03:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 22:03:39,351][15401] Updated weights for policy 0, policy_version 299330 (0.0035) [2024-06-22 22:03:42,986][15401] Updated weights for policy 0, policy_version 299340 (0.0026) [2024-06-22 22:03:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4904386560. Throughput: 0: 42514.6. Samples: 4904501260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:03:43,393][15132] Avg episode reward: [(0, '0.712')] [2024-06-22 22:03:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299341_4904402944.pth... [2024-06-22 22:03:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000298714_4894130176.pth [2024-06-22 22:03:47,028][15401] Updated weights for policy 0, policy_version 299350 (0.0033) [2024-06-22 22:03:48,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43139.8, 300 sec: 42986.2). Total num frames: 4904615936. Throughput: 0: 42796.5. Samples: 4904761100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:03:48,397][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 22:03:50,414][15401] Updated weights for policy 0, policy_version 299360 (0.0040) [2024-06-22 22:03:53,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4904796160. Throughput: 0: 42770.0. Samples: 4904894860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:03:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 22:03:54,822][15401] Updated weights for policy 0, policy_version 299370 (0.0028) [2024-06-22 22:03:58,012][15401] Updated weights for policy 0, policy_version 299380 (0.0031) [2024-06-22 22:03:58,390][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4905041920. Throughput: 0: 42818.2. Samples: 4905150320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:03:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 22:04:02,530][15401] Updated weights for policy 0, policy_version 299390 (0.0026) [2024-06-22 22:04:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4905238528. Throughput: 0: 42891.8. Samples: 4905405740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 22:04:05,618][15401] Updated weights for policy 0, policy_version 299400 (0.0032) [2024-06-22 22:04:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4905451520. Throughput: 0: 42739.1. Samples: 4905531220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:08,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 22:04:10,081][15401] Updated weights for policy 0, policy_version 299410 (0.0031) [2024-06-22 22:04:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4905680896. Throughput: 0: 42763.9. Samples: 4905787620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 22:04:13,938][15401] Updated weights for policy 0, policy_version 299420 (0.0043) [2024-06-22 22:04:17,626][15401] Updated weights for policy 0, policy_version 299430 (0.0031) [2024-06-22 22:04:18,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42050.7, 300 sec: 42764.7). Total num frames: 4905861120. Throughput: 0: 42983.0. Samples: 4906052260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:18,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 22:04:19,208][15349] Signal inference workers to stop experience collection... (72550 times) [2024-06-22 22:04:19,245][15401] InferenceWorker_p0-w0: stopping experience collection (72550 times) [2024-06-22 22:04:19,266][15349] Signal inference workers to resume experience collection... (72550 times) [2024-06-22 22:04:19,269][15401] InferenceWorker_p0-w0: resuming experience collection (72550 times) [2024-06-22 22:04:21,482][15401] Updated weights for policy 0, policy_version 299440 (0.0038) [2024-06-22 22:04:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4906090496. Throughput: 0: 42865.3. Samples: 4906178080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 22:04:25,139][15401] Updated weights for policy 0, policy_version 299450 (0.0036) [2024-06-22 22:04:28,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4906319872. Throughput: 0: 43088.6. Samples: 4906440140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 22:04:28,920][15401] Updated weights for policy 0, policy_version 299460 (0.0026) [2024-06-22 22:04:33,047][15401] Updated weights for policy 0, policy_version 299470 (0.0035) [2024-06-22 22:04:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4906532864. Throughput: 0: 43123.1. Samples: 4906701360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 22:04:36,343][15401] Updated weights for policy 0, policy_version 299480 (0.0026) [2024-06-22 22:04:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4906762240. Throughput: 0: 42865.6. Samples: 4906823820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 22:04:40,656][15401] Updated weights for policy 0, policy_version 299490 (0.0032) [2024-06-22 22:04:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 4906975232. Throughput: 0: 42996.0. Samples: 4907085140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 22:04:43,814][15401] Updated weights for policy 0, policy_version 299500 (0.0039) [2024-06-22 22:04:48,214][15401] Updated weights for policy 0, policy_version 299510 (0.0032) [2024-06-22 22:04:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42603.0, 300 sec: 42931.6). Total num frames: 4907171840. Throughput: 0: 43177.7. Samples: 4907348740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 22:04:51,671][15401] Updated weights for policy 0, policy_version 299520 (0.0033) [2024-06-22 22:04:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 4907401216. Throughput: 0: 43168.9. Samples: 4907473720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-22 22:04:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 22:04:55,729][15401] Updated weights for policy 0, policy_version 299530 (0.0030) [2024-06-22 22:04:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 4907597824. Throughput: 0: 43196.8. Samples: 4907731580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:04:58,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 22:04:59,142][15401] Updated weights for policy 0, policy_version 299540 (0.0036) [2024-06-22 22:05:03,203][15401] Updated weights for policy 0, policy_version 299550 (0.0032) [2024-06-22 22:05:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4907827200. Throughput: 0: 42983.1. Samples: 4907986400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 22:05:07,066][15401] Updated weights for policy 0, policy_version 299560 (0.0029) [2024-06-22 22:05:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 4908040192. Throughput: 0: 43086.7. Samples: 4908116980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 22:05:10,667][15401] Updated weights for policy 0, policy_version 299570 (0.0030) [2024-06-22 22:05:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 4908253184. Throughput: 0: 43003.9. Samples: 4908375320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 22:05:14,691][15401] Updated weights for policy 0, policy_version 299580 (0.0043) [2024-06-22 22:05:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 4908466176. Throughput: 0: 42921.0. Samples: 4908632800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 22:05:18,495][15401] Updated weights for policy 0, policy_version 299590 (0.0042) [2024-06-22 22:05:22,288][15401] Updated weights for policy 0, policy_version 299600 (0.0036) [2024-06-22 22:05:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4908679168. Throughput: 0: 43017.9. Samples: 4908759620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 22:05:26,415][15401] Updated weights for policy 0, policy_version 299610 (0.0033) [2024-06-22 22:05:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 4908892160. Throughput: 0: 42938.7. Samples: 4909017380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 22:05:29,959][15401] Updated weights for policy 0, policy_version 299620 (0.0032) [2024-06-22 22:05:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4909121536. Throughput: 0: 42837.8. Samples: 4909276440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 22:05:34,010][15401] Updated weights for policy 0, policy_version 299630 (0.0046) [2024-06-22 22:05:37,521][15401] Updated weights for policy 0, policy_version 299640 (0.0045) [2024-06-22 22:05:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4909334528. Throughput: 0: 42924.4. Samples: 4909405320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:38,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 22:05:41,642][15401] Updated weights for policy 0, policy_version 299650 (0.0023) [2024-06-22 22:05:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 43043.5). Total num frames: 4909547520. Throughput: 0: 42956.1. Samples: 4909664500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:43,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 22:05:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299655_4909547520.pth... [2024-06-22 22:05:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299026_4899241984.pth [2024-06-22 22:05:45,143][15401] Updated weights for policy 0, policy_version 299660 (0.0035) [2024-06-22 22:05:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 4909760512. Throughput: 0: 42881.8. Samples: 4909916080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:48,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 22:05:49,351][15401] Updated weights for policy 0, policy_version 299670 (0.0025) [2024-06-22 22:05:53,155][15401] Updated weights for policy 0, policy_version 299680 (0.0034) [2024-06-22 22:05:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 4909957120. Throughput: 0: 42778.8. Samples: 4910042020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 22:05:57,046][15401] Updated weights for policy 0, policy_version 299690 (0.0026) [2024-06-22 22:05:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 4910186496. Throughput: 0: 42713.0. Samples: 4910297400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:05:58,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-22 22:06:00,714][15401] Updated weights for policy 0, policy_version 299700 (0.0037) [2024-06-22 22:06:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4910399488. Throughput: 0: 42814.9. Samples: 4910559480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:06:03,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-22 22:06:04,875][15401] Updated weights for policy 0, policy_version 299710 (0.0032) [2024-06-22 22:06:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4910579712. Throughput: 0: 42774.7. Samples: 4910684480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:06:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 22:06:08,564][15349] Signal inference workers to stop experience collection... (72600 times) [2024-06-22 22:06:08,564][15349] Signal inference workers to resume experience collection... (72600 times) [2024-06-22 22:06:08,594][15401] InferenceWorker_p0-w0: stopping experience collection (72600 times) [2024-06-22 22:06:08,594][15401] InferenceWorker_p0-w0: resuming experience collection (72600 times) [2024-06-22 22:06:08,738][15401] Updated weights for policy 0, policy_version 299720 (0.0047) [2024-06-22 22:06:12,383][15401] Updated weights for policy 0, policy_version 299730 (0.0034) [2024-06-22 22:06:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 4910841856. Throughput: 0: 42950.7. Samples: 4910950160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 22:06:16,267][15401] Updated weights for policy 0, policy_version 299740 (0.0024) [2024-06-22 22:06:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4911054848. Throughput: 0: 42740.8. Samples: 4911199780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 22:06:19,962][15401] Updated weights for policy 0, policy_version 299750 (0.0038) [2024-06-22 22:06:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4911235072. Throughput: 0: 42709.0. Samples: 4911327220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 22:06:23,940][15401] Updated weights for policy 0, policy_version 299760 (0.0035) [2024-06-22 22:06:27,429][15401] Updated weights for policy 0, policy_version 299770 (0.0030) [2024-06-22 22:06:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 4911464448. Throughput: 0: 42795.2. Samples: 4911590280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 22:06:31,473][15401] Updated weights for policy 0, policy_version 299780 (0.0023) [2024-06-22 22:06:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4911677440. Throughput: 0: 42771.6. Samples: 4911840800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 22:06:35,464][15401] Updated weights for policy 0, policy_version 299790 (0.0026) [2024-06-22 22:06:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4911890432. Throughput: 0: 42872.4. Samples: 4911971280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:38,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 22:06:39,304][15401] Updated weights for policy 0, policy_version 299800 (0.0042) [2024-06-22 22:06:43,057][15401] Updated weights for policy 0, policy_version 299810 (0.0023) [2024-06-22 22:06:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 4912103424. Throughput: 0: 42915.5. Samples: 4912228600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 22:06:46,854][15401] Updated weights for policy 0, policy_version 299820 (0.0033) [2024-06-22 22:06:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4912332800. Throughput: 0: 42622.7. Samples: 4912477500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-22 22:06:50,625][15401] Updated weights for policy 0, policy_version 299830 (0.0039) [2024-06-22 22:06:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4912529408. Throughput: 0: 42837.2. Samples: 4912612160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:53,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-22 22:06:54,295][15401] Updated weights for policy 0, policy_version 299840 (0.0047) [2024-06-22 22:06:58,211][15401] Updated weights for policy 0, policy_version 299850 (0.0035) [2024-06-22 22:06:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4912742400. Throughput: 0: 42729.3. Samples: 4912872980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:06:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-22 22:07:01,744][15401] Updated weights for policy 0, policy_version 299860 (0.0048) [2024-06-22 22:07:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.8, 300 sec: 42709.5). Total num frames: 4912955392. Throughput: 0: 42741.7. Samples: 4913123260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:07:03,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 22:07:05,934][15401] Updated weights for policy 0, policy_version 299870 (0.0029) [2024-06-22 22:07:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4913168384. Throughput: 0: 42722.1. Samples: 4913249720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:07:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-22 22:07:09,465][15401] Updated weights for policy 0, policy_version 299880 (0.0050) [2024-06-22 22:07:13,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4913364992. Throughput: 0: 42631.1. Samples: 4913508680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:07:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 22:07:13,567][15401] Updated weights for policy 0, policy_version 299890 (0.0029) [2024-06-22 22:07:14,453][15349] Signal inference workers to stop experience collection... (72650 times) [2024-06-22 22:07:14,500][15349] Signal inference workers to resume experience collection... (72650 times) [2024-06-22 22:07:14,500][15401] InferenceWorker_p0-w0: stopping experience collection (72650 times) [2024-06-22 22:07:14,515][15401] InferenceWorker_p0-w0: resuming experience collection (72650 times) [2024-06-22 22:07:16,894][15401] Updated weights for policy 0, policy_version 299900 (0.0051) [2024-06-22 22:07:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42821.3). Total num frames: 4913610752. Throughput: 0: 42750.7. Samples: 4913764580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:07:18,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-22 22:07:21,332][15401] Updated weights for policy 0, policy_version 299910 (0.0031) [2024-06-22 22:07:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4913807360. Throughput: 0: 42856.9. Samples: 4913899840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-22 22:07:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 22:07:24,729][15401] Updated weights for policy 0, policy_version 299920 (0.0030) [2024-06-22 22:07:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4914020352. Throughput: 0: 42709.4. Samples: 4914150520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 22:07:28,897][15401] Updated weights for policy 0, policy_version 299930 (0.0026) [2024-06-22 22:07:32,218][15401] Updated weights for policy 0, policy_version 299940 (0.0037) [2024-06-22 22:07:33,393][15132] Fps is (10 sec: 45857.4, 60 sec: 43141.8, 300 sec: 42875.5). Total num frames: 4914266112. Throughput: 0: 42912.3. Samples: 4914408720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:33,394][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 22:07:36,383][15401] Updated weights for policy 0, policy_version 299950 (0.0042) [2024-06-22 22:07:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 4914462720. Throughput: 0: 42908.4. Samples: 4914543040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 22:07:39,890][15401] Updated weights for policy 0, policy_version 299960 (0.0031) [2024-06-22 22:07:43,390][15132] Fps is (10 sec: 40976.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4914675712. Throughput: 0: 42825.8. Samples: 4914800140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 22:07:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299968_4914675712.pth... [2024-06-22 22:07:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299341_4904402944.pth [2024-06-22 22:07:43,829][15401] Updated weights for policy 0, policy_version 299970 (0.0037) [2024-06-22 22:07:47,614][15401] Updated weights for policy 0, policy_version 299980 (0.0037) [2024-06-22 22:07:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4914905088. Throughput: 0: 43001.9. Samples: 4915058240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 22:07:51,376][15401] Updated weights for policy 0, policy_version 299990 (0.0037) [2024-06-22 22:07:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 4915118080. Throughput: 0: 43111.2. Samples: 4915189720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:53,396][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 22:07:55,435][15401] Updated weights for policy 0, policy_version 300000 (0.0044) [2024-06-22 22:07:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4915331072. Throughput: 0: 42939.9. Samples: 4915440980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:07:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 22:07:58,910][15401] Updated weights for policy 0, policy_version 300010 (0.0041) [2024-06-22 22:08:02,857][15401] Updated weights for policy 0, policy_version 300020 (0.0033) [2024-06-22 22:08:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 4915544064. Throughput: 0: 43124.8. Samples: 4915705200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:08:06,610][15401] Updated weights for policy 0, policy_version 300030 (0.0038) [2024-06-22 22:08:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4915740672. Throughput: 0: 42974.3. Samples: 4915833680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 22:08:10,492][15401] Updated weights for policy 0, policy_version 300040 (0.0033) [2024-06-22 22:08:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 4915986432. Throughput: 0: 43092.1. Samples: 4916089660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-22 22:08:14,211][15401] Updated weights for policy 0, policy_version 300050 (0.0027) [2024-06-22 22:08:18,000][15401] Updated weights for policy 0, policy_version 300060 (0.0036) [2024-06-22 22:08:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4916183040. Throughput: 0: 43105.5. Samples: 4916348300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 22:08:21,731][15401] Updated weights for policy 0, policy_version 300070 (0.0044) [2024-06-22 22:08:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4916396032. Throughput: 0: 42980.0. Samples: 4916477140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 22:08:25,935][15401] Updated weights for policy 0, policy_version 300080 (0.0031) [2024-06-22 22:08:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 4916625408. Throughput: 0: 43035.5. Samples: 4916736740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 22:08:29,263][15401] Updated weights for policy 0, policy_version 300090 (0.0030) [2024-06-22 22:08:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42601.2, 300 sec: 42931.6). Total num frames: 4916822016. Throughput: 0: 43083.6. Samples: 4916997000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-22 22:08:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 22:08:33,478][15401] Updated weights for policy 0, policy_version 300100 (0.0032) [2024-06-22 22:08:36,830][15401] Updated weights for policy 0, policy_version 300110 (0.0039) [2024-06-22 22:08:38,390][15132] Fps is (10 sec: 40957.3, 60 sec: 42871.0, 300 sec: 42876.3). Total num frames: 4917035008. Throughput: 0: 42937.5. Samples: 4917121940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:08:38,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 22:08:40,290][15349] Signal inference workers to stop experience collection... (72700 times) [2024-06-22 22:08:40,290][15349] Signal inference workers to resume experience collection... (72700 times) [2024-06-22 22:08:40,309][15401] InferenceWorker_p0-w0: stopping experience collection (72700 times) [2024-06-22 22:08:40,340][15401] InferenceWorker_p0-w0: resuming experience collection (72700 times) [2024-06-22 22:08:41,081][15401] Updated weights for policy 0, policy_version 300120 (0.0035) [2024-06-22 22:08:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 4917264384. Throughput: 0: 43068.0. Samples: 4917379040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:08:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 22:08:44,476][15401] Updated weights for policy 0, policy_version 300130 (0.0029) [2024-06-22 22:08:48,389][15132] Fps is (10 sec: 42601.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 4917460992. Throughput: 0: 42846.3. Samples: 4917633280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:08:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-22 22:08:48,635][15401] Updated weights for policy 0, policy_version 300140 (0.0035) [2024-06-22 22:08:52,406][15401] Updated weights for policy 0, policy_version 300150 (0.0034) [2024-06-22 22:08:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4917690368. Throughput: 0: 42772.3. Samples: 4917758440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:08:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 22:08:56,423][15401] Updated weights for policy 0, policy_version 300160 (0.0024) [2024-06-22 22:08:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4917903360. Throughput: 0: 42896.4. Samples: 4918020000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:08:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 22:08:59,948][15401] Updated weights for policy 0, policy_version 300170 (0.0038) [2024-06-22 22:09:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 4918099968. Throughput: 0: 42738.7. Samples: 4918271540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 22:09:04,312][15401] Updated weights for policy 0, policy_version 300180 (0.0031) [2024-06-22 22:09:07,544][15401] Updated weights for policy 0, policy_version 300190 (0.0031) [2024-06-22 22:09:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4918329344. Throughput: 0: 42652.5. Samples: 4918396500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 22:09:12,309][15401] Updated weights for policy 0, policy_version 300200 (0.0044) [2024-06-22 22:09:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42932.0). Total num frames: 4918525952. Throughput: 0: 42512.9. Samples: 4918649820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 22:09:15,699][15401] Updated weights for policy 0, policy_version 300210 (0.0040) [2024-06-22 22:09:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4918738944. Throughput: 0: 42343.5. Samples: 4918902460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 22:09:19,770][15401] Updated weights for policy 0, policy_version 300220 (0.0032) [2024-06-22 22:09:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4918951936. Throughput: 0: 42440.7. Samples: 4919031740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-22 22:09:23,566][15401] Updated weights for policy 0, policy_version 300230 (0.0025) [2024-06-22 22:09:27,754][15401] Updated weights for policy 0, policy_version 300240 (0.0043) [2024-06-22 22:09:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4919164928. Throughput: 0: 42489.4. Samples: 4919291060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 22:09:31,145][15401] Updated weights for policy 0, policy_version 300250 (0.0030) [2024-06-22 22:09:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4919377920. Throughput: 0: 42392.9. Samples: 4919540960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 22:09:35,265][15401] Updated weights for policy 0, policy_version 300260 (0.0044) [2024-06-22 22:09:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 4919590912. Throughput: 0: 42449.9. Samples: 4919668680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:38,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 22:09:39,088][15401] Updated weights for policy 0, policy_version 300270 (0.0033) [2024-06-22 22:09:43,340][15401] Updated weights for policy 0, policy_version 300280 (0.0043) [2024-06-22 22:09:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 4919787520. Throughput: 0: 42284.5. Samples: 4919922800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 22:09:43,496][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300281_4919803904.pth... [2024-06-22 22:09:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299655_4909547520.pth [2024-06-22 22:09:46,720][15401] Updated weights for policy 0, policy_version 300290 (0.0024) [2024-06-22 22:09:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4920033280. Throughput: 0: 42280.9. Samples: 4920174180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:09:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-22 22:09:50,887][15401] Updated weights for policy 0, policy_version 300300 (0.0039) [2024-06-22 22:09:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 4920246272. Throughput: 0: 42457.4. Samples: 4920307080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:09:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 22:09:54,325][15401] Updated weights for policy 0, policy_version 300310 (0.0047) [2024-06-22 22:09:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4920426496. Throughput: 0: 42509.8. Samples: 4920562760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:09:58,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 22:09:58,497][15401] Updated weights for policy 0, policy_version 300320 (0.0029) [2024-06-22 22:10:01,942][15349] Signal inference workers to stop experience collection... (72750 times) [2024-06-22 22:10:01,942][15349] Signal inference workers to resume experience collection... (72750 times) [2024-06-22 22:10:01,965][15401] InferenceWorker_p0-w0: stopping experience collection (72750 times) [2024-06-22 22:10:01,965][15401] InferenceWorker_p0-w0: resuming experience collection (72750 times) [2024-06-22 22:10:02,095][15401] Updated weights for policy 0, policy_version 300330 (0.0031) [2024-06-22 22:10:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 4920655872. Throughput: 0: 42321.7. Samples: 4920806940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 22:10:06,175][15401] Updated weights for policy 0, policy_version 300340 (0.0031) [2024-06-22 22:10:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 4920852480. Throughput: 0: 42367.6. Samples: 4920938280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:08,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 22:10:09,626][15401] Updated weights for policy 0, policy_version 300350 (0.0031) [2024-06-22 22:10:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4921065472. Throughput: 0: 42406.7. Samples: 4921199360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:13,396][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 22:10:13,743][15401] Updated weights for policy 0, policy_version 300360 (0.0033) [2024-06-22 22:10:17,738][15401] Updated weights for policy 0, policy_version 300370 (0.0034) [2024-06-22 22:10:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4921278464. Throughput: 0: 42403.5. Samples: 4921449120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:18,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-22 22:10:21,447][15401] Updated weights for policy 0, policy_version 300380 (0.0041) [2024-06-22 22:10:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4921507840. Throughput: 0: 42475.6. Samples: 4921580080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:23,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 22:10:25,295][15401] Updated weights for policy 0, policy_version 300390 (0.0040) [2024-06-22 22:10:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 4921704448. Throughput: 0: 42549.4. Samples: 4921837520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 22:10:29,077][15401] Updated weights for policy 0, policy_version 300400 (0.0029) [2024-06-22 22:10:32,765][15401] Updated weights for policy 0, policy_version 300410 (0.0031) [2024-06-22 22:10:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4921933824. Throughput: 0: 42711.5. Samples: 4922096200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 22:10:36,510][15401] Updated weights for policy 0, policy_version 300420 (0.0032) [2024-06-22 22:10:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4922146816. Throughput: 0: 42553.4. Samples: 4922221980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 22:10:40,661][15401] Updated weights for policy 0, policy_version 300430 (0.0028) [2024-06-22 22:10:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4922343424. Throughput: 0: 42693.3. Samples: 4922483960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 22:10:44,388][15401] Updated weights for policy 0, policy_version 300440 (0.0038) [2024-06-22 22:10:48,255][15401] Updated weights for policy 0, policy_version 300450 (0.0027) [2024-06-22 22:10:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4922572800. Throughput: 0: 43022.8. Samples: 4922742960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 22:10:51,967][15401] Updated weights for policy 0, policy_version 300460 (0.0040) [2024-06-22 22:10:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4922802176. Throughput: 0: 42916.7. Samples: 4922869540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 22:10:55,652][15401] Updated weights for policy 0, policy_version 300470 (0.0037) [2024-06-22 22:10:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4922982400. Throughput: 0: 42864.0. Samples: 4923128240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:10:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 22:10:59,510][15401] Updated weights for policy 0, policy_version 300480 (0.0043) [2024-06-22 22:11:03,367][15401] Updated weights for policy 0, policy_version 300490 (0.0036) [2024-06-22 22:11:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4923228160. Throughput: 0: 42997.4. Samples: 4923384000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-22 22:11:03,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 22:11:07,190][15401] Updated weights for policy 0, policy_version 300500 (0.0043) [2024-06-22 22:11:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4923441152. Throughput: 0: 42948.0. Samples: 4923512740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 22:11:11,133][15401] Updated weights for policy 0, policy_version 300510 (0.0037) [2024-06-22 22:11:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4923621376. Throughput: 0: 42671.5. Samples: 4923757740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 22:11:15,263][15401] Updated weights for policy 0, policy_version 300520 (0.0029) [2024-06-22 22:11:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4923850752. Throughput: 0: 42774.8. Samples: 4924021060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-22 22:11:18,802][15401] Updated weights for policy 0, policy_version 300530 (0.0034) [2024-06-22 22:11:22,915][15401] Updated weights for policy 0, policy_version 300540 (0.0031) [2024-06-22 22:11:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4924063744. Throughput: 0: 42799.2. Samples: 4924147940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 22:11:26,611][15401] Updated weights for policy 0, policy_version 300550 (0.0041) [2024-06-22 22:11:28,395][15132] Fps is (10 sec: 40936.2, 60 sec: 42594.3, 300 sec: 42653.1). Total num frames: 4924260352. Throughput: 0: 42519.0. Samples: 4924397560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:28,396][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 22:11:30,402][15401] Updated weights for policy 0, policy_version 300560 (0.0032) [2024-06-22 22:11:33,391][15132] Fps is (10 sec: 42590.0, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 4924489728. Throughput: 0: 42496.3. Samples: 4924655380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:33,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 22:11:34,278][15401] Updated weights for policy 0, policy_version 300570 (0.0030) [2024-06-22 22:11:37,728][15349] Signal inference workers to stop experience collection... (72800 times) [2024-06-22 22:11:37,776][15401] InferenceWorker_p0-w0: stopping experience collection (72800 times) [2024-06-22 22:11:37,845][15349] Signal inference workers to resume experience collection... (72800 times) [2024-06-22 22:11:37,845][15401] InferenceWorker_p0-w0: resuming experience collection (72800 times) [2024-06-22 22:11:37,987][15401] Updated weights for policy 0, policy_version 300580 (0.0029) [2024-06-22 22:11:38,389][15132] Fps is (10 sec: 44262.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4924702720. Throughput: 0: 42624.2. Samples: 4924787620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:38,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-22 22:11:41,880][15401] Updated weights for policy 0, policy_version 300590 (0.0031) [2024-06-22 22:11:43,390][15132] Fps is (10 sec: 42606.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4924915712. Throughput: 0: 42478.5. Samples: 4925039780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 22:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300593_4924915712.pth... [2024-06-22 22:11:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000299968_4914675712.pth [2024-06-22 22:11:45,897][15401] Updated weights for policy 0, policy_version 300600 (0.0038) [2024-06-22 22:11:48,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4925128704. Throughput: 0: 42454.1. Samples: 4925294440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 22:11:49,541][15401] Updated weights for policy 0, policy_version 300610 (0.0054) [2024-06-22 22:11:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 4925325312. Throughput: 0: 42404.5. Samples: 4925420940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 22:11:53,656][15401] Updated weights for policy 0, policy_version 300620 (0.0043) [2024-06-22 22:11:57,356][15401] Updated weights for policy 0, policy_version 300630 (0.0043) [2024-06-22 22:11:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 4925554688. Throughput: 0: 42715.8. Samples: 4925679960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:11:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 22:12:01,449][15401] Updated weights for policy 0, policy_version 300640 (0.0039) [2024-06-22 22:12:03,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42325.1, 300 sec: 42709.5). Total num frames: 4925767680. Throughput: 0: 42471.3. Samples: 4925932280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:12:03,391][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 22:12:04,842][15401] Updated weights for policy 0, policy_version 300650 (0.0027) [2024-06-22 22:12:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4925980672. Throughput: 0: 42521.3. Samples: 4926061400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:12:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 22:12:09,223][15401] Updated weights for policy 0, policy_version 300660 (0.0031) [2024-06-22 22:12:12,871][15401] Updated weights for policy 0, policy_version 300670 (0.0034) [2024-06-22 22:12:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4926210048. Throughput: 0: 42785.9. Samples: 4926322680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:12:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 22:12:16,859][15401] Updated weights for policy 0, policy_version 300680 (0.0029) [2024-06-22 22:12:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4926423040. Throughput: 0: 42727.6. Samples: 4926578040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 22:12:18,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-22 22:12:20,297][15401] Updated weights for policy 0, policy_version 300690 (0.0044) [2024-06-22 22:12:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4926603264. Throughput: 0: 42710.9. Samples: 4926709620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:23,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 22:12:24,196][15401] Updated weights for policy 0, policy_version 300700 (0.0027) [2024-06-22 22:12:27,725][15401] Updated weights for policy 0, policy_version 300710 (0.0028) [2024-06-22 22:12:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43148.6, 300 sec: 42654.5). Total num frames: 4926849024. Throughput: 0: 42875.6. Samples: 4926969180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:28,392][15132] Avg episode reward: [(0, '0.359')] [2024-06-22 22:12:31,688][15401] Updated weights for policy 0, policy_version 300720 (0.0041) [2024-06-22 22:12:33,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43145.9, 300 sec: 42765.0). Total num frames: 4927078400. Throughput: 0: 42913.3. Samples: 4927225540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 22:12:35,239][15401] Updated weights for policy 0, policy_version 300730 (0.0036) [2024-06-22 22:12:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4927258624. Throughput: 0: 42979.1. Samples: 4927355000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 22:12:39,257][15401] Updated weights for policy 0, policy_version 300740 (0.0028) [2024-06-22 22:12:42,844][15401] Updated weights for policy 0, policy_version 300750 (0.0037) [2024-06-22 22:12:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4927504384. Throughput: 0: 43053.0. Samples: 4927617340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 22:12:46,967][15401] Updated weights for policy 0, policy_version 300760 (0.0032) [2024-06-22 22:12:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4927717376. Throughput: 0: 43012.2. Samples: 4927867820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:48,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-22 22:12:50,443][15401] Updated weights for policy 0, policy_version 300770 (0.0031) [2024-06-22 22:12:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4927897600. Throughput: 0: 43050.1. Samples: 4927998660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 22:12:54,793][15401] Updated weights for policy 0, policy_version 300780 (0.0036) [2024-06-22 22:12:57,182][15349] Signal inference workers to stop experience collection... (72850 times) [2024-06-22 22:12:57,183][15349] Signal inference workers to resume experience collection... (72850 times) [2024-06-22 22:12:57,216][15401] InferenceWorker_p0-w0: stopping experience collection (72850 times) [2024-06-22 22:12:57,216][15401] InferenceWorker_p0-w0: resuming experience collection (72850 times) [2024-06-22 22:12:58,238][15401] Updated weights for policy 0, policy_version 300790 (0.0035) [2024-06-22 22:12:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4928143360. Throughput: 0: 42931.5. Samples: 4928254600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:12:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 22:13:02,432][15401] Updated weights for policy 0, policy_version 300800 (0.0035) [2024-06-22 22:13:03,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 4928372736. Throughput: 0: 42942.5. Samples: 4928510460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 22:13:06,382][15401] Updated weights for policy 0, policy_version 300810 (0.0044) [2024-06-22 22:13:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4928552960. Throughput: 0: 42970.4. Samples: 4928643280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 22:13:09,907][15401] Updated weights for policy 0, policy_version 300820 (0.0042) [2024-06-22 22:13:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4928765952. Throughput: 0: 42813.3. Samples: 4928895780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 22:13:13,947][15401] Updated weights for policy 0, policy_version 300830 (0.0043) [2024-06-22 22:13:17,370][15401] Updated weights for policy 0, policy_version 300840 (0.0035) [2024-06-22 22:13:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4928995328. Throughput: 0: 42747.5. Samples: 4929149180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-22 22:13:21,598][15401] Updated weights for policy 0, policy_version 300850 (0.0040) [2024-06-22 22:13:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 4929208320. Throughput: 0: 42896.0. Samples: 4929285320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 22:13:25,095][15401] Updated weights for policy 0, policy_version 300860 (0.0033) [2024-06-22 22:13:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4929388544. Throughput: 0: 42613.0. Samples: 4929534920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 22:13:29,165][15401] Updated weights for policy 0, policy_version 300870 (0.0026) [2024-06-22 22:13:32,857][15401] Updated weights for policy 0, policy_version 300880 (0.0028) [2024-06-22 22:13:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 4929634304. Throughput: 0: 42765.3. Samples: 4929792260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 22:13:33,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 22:13:37,219][15401] Updated weights for policy 0, policy_version 300890 (0.0034) [2024-06-22 22:13:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 4929847296. Throughput: 0: 42777.4. Samples: 4929923640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:13:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 22:13:40,543][15401] Updated weights for policy 0, policy_version 300900 (0.0035) [2024-06-22 22:13:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4930043904. Throughput: 0: 42508.4. Samples: 4930167480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:13:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 22:13:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300906_4930043904.pth... [2024-06-22 22:13:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300281_4919803904.pth [2024-06-22 22:13:44,782][15401] Updated weights for policy 0, policy_version 300910 (0.0048) [2024-06-22 22:13:48,324][15401] Updated weights for policy 0, policy_version 300920 (0.0033) [2024-06-22 22:13:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4930273280. Throughput: 0: 42504.1. Samples: 4930423140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:13:48,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-22 22:13:52,660][15401] Updated weights for policy 0, policy_version 300930 (0.0033) [2024-06-22 22:13:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4930469888. Throughput: 0: 42562.1. Samples: 4930558580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:13:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 22:13:55,897][15401] Updated weights for policy 0, policy_version 300940 (0.0026) [2024-06-22 22:13:58,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 4930682880. Throughput: 0: 42398.9. Samples: 4930804000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:13:58,396][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 22:14:00,197][15401] Updated weights for policy 0, policy_version 300950 (0.0036) [2024-06-22 22:14:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 4930912256. Throughput: 0: 42540.6. Samples: 4931063500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:03,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-22 22:14:03,433][15401] Updated weights for policy 0, policy_version 300960 (0.0042) [2024-06-22 22:14:07,695][15401] Updated weights for policy 0, policy_version 300970 (0.0030) [2024-06-22 22:14:08,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4931108864. Throughput: 0: 42431.4. Samples: 4931194740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 22:14:11,222][15401] Updated weights for policy 0, policy_version 300980 (0.0032) [2024-06-22 22:14:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4931305472. Throughput: 0: 42535.0. Samples: 4931449000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 22:14:15,226][15401] Updated weights for policy 0, policy_version 300990 (0.0042) [2024-06-22 22:14:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4931551232. Throughput: 0: 42451.1. Samples: 4931702560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 22:14:18,673][15401] Updated weights for policy 0, policy_version 301000 (0.0043) [2024-06-22 22:14:22,730][15401] Updated weights for policy 0, policy_version 301010 (0.0029) [2024-06-22 22:14:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4931747840. Throughput: 0: 42573.8. Samples: 4931839460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 22:14:26,369][15401] Updated weights for policy 0, policy_version 301020 (0.0034) [2024-06-22 22:14:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4931960832. Throughput: 0: 42800.9. Samples: 4932093520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 22:14:30,283][15401] Updated weights for policy 0, policy_version 301030 (0.0037) [2024-06-22 22:14:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4932206592. Throughput: 0: 42812.4. Samples: 4932349700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 22:14:33,952][15401] Updated weights for policy 0, policy_version 301040 (0.0037) [2024-06-22 22:14:37,739][15349] Signal inference workers to stop experience collection... (72900 times) [2024-06-22 22:14:37,744][15349] Signal inference workers to resume experience collection... (72900 times) [2024-06-22 22:14:37,781][15401] InferenceWorker_p0-w0: stopping experience collection (72900 times) [2024-06-22 22:14:37,781][15401] InferenceWorker_p0-w0: resuming experience collection (72900 times) [2024-06-22 22:14:37,878][15401] Updated weights for policy 0, policy_version 301050 (0.0032) [2024-06-22 22:14:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4932403200. Throughput: 0: 42786.3. Samples: 4932483960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 22:14:41,592][15401] Updated weights for policy 0, policy_version 301060 (0.0035) [2024-06-22 22:14:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4932599808. Throughput: 0: 42952.2. Samples: 4932736580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-22 22:14:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 22:14:45,440][15401] Updated weights for policy 0, policy_version 301070 (0.0037) [2024-06-22 22:14:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4932829184. Throughput: 0: 42922.6. Samples: 4932995020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:14:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 22:14:49,070][15401] Updated weights for policy 0, policy_version 301080 (0.0041) [2024-06-22 22:14:53,102][15401] Updated weights for policy 0, policy_version 301090 (0.0026) [2024-06-22 22:14:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4933058560. Throughput: 0: 43009.0. Samples: 4933130140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:14:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 22:14:56,900][15401] Updated weights for policy 0, policy_version 301100 (0.0032) [2024-06-22 22:14:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 4933238784. Throughput: 0: 42931.5. Samples: 4933380920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:14:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 22:15:00,723][15401] Updated weights for policy 0, policy_version 301110 (0.0034) [2024-06-22 22:15:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 4933484544. Throughput: 0: 43108.3. Samples: 4933642540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:03,392][15132] Avg episode reward: [(0, '0.185')] [2024-06-22 22:15:04,341][15401] Updated weights for policy 0, policy_version 301120 (0.0033) [2024-06-22 22:15:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4933697536. Throughput: 0: 43088.5. Samples: 4933778440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 22:15:08,410][15401] Updated weights for policy 0, policy_version 301130 (0.0040) [2024-06-22 22:15:11,865][15401] Updated weights for policy 0, policy_version 301140 (0.0028) [2024-06-22 22:15:13,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 4933894144. Throughput: 0: 42911.1. Samples: 4934024620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:13,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 22:15:16,151][15401] Updated weights for policy 0, policy_version 301150 (0.0024) [2024-06-22 22:15:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 4934123520. Throughput: 0: 43011.1. Samples: 4934285200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 22:15:19,433][15401] Updated weights for policy 0, policy_version 301160 (0.0036) [2024-06-22 22:15:23,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 4934336512. Throughput: 0: 43080.7. Samples: 4934422700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:23,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 22:15:23,819][15401] Updated weights for policy 0, policy_version 301170 (0.0027) [2024-06-22 22:15:26,982][15401] Updated weights for policy 0, policy_version 301180 (0.0027) [2024-06-22 22:15:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4934549504. Throughput: 0: 42890.3. Samples: 4934666640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 22:15:31,552][15401] Updated weights for policy 0, policy_version 301190 (0.0039) [2024-06-22 22:15:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4934778880. Throughput: 0: 43011.9. Samples: 4934930560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 22:15:34,535][15401] Updated weights for policy 0, policy_version 301200 (0.0035) [2024-06-22 22:15:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4934975488. Throughput: 0: 42991.5. Samples: 4935064760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 22:15:39,219][15401] Updated weights for policy 0, policy_version 301210 (0.0026) [2024-06-22 22:15:42,269][15401] Updated weights for policy 0, policy_version 301220 (0.0031) [2024-06-22 22:15:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 4935204864. Throughput: 0: 42941.9. Samples: 4935313300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 22:15:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301221_4935204864.pth... [2024-06-22 22:15:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300593_4924915712.pth [2024-06-22 22:15:46,996][15401] Updated weights for policy 0, policy_version 301230 (0.0054) [2024-06-22 22:15:47,887][15349] Signal inference workers to stop experience collection... (72950 times) [2024-06-22 22:15:47,939][15349] Signal inference workers to resume experience collection... (72950 times) [2024-06-22 22:15:47,939][15401] InferenceWorker_p0-w0: stopping experience collection (72950 times) [2024-06-22 22:15:47,955][15401] InferenceWorker_p0-w0: resuming experience collection (72950 times) [2024-06-22 22:15:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4935417856. Throughput: 0: 42981.4. Samples: 4935576600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 22:15:49,824][15401] Updated weights for policy 0, policy_version 301240 (0.0028) [2024-06-22 22:15:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4935614464. Throughput: 0: 42773.2. Samples: 4935703240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-22 22:15:54,550][15401] Updated weights for policy 0, policy_version 301250 (0.0043) [2024-06-22 22:15:57,476][15401] Updated weights for policy 0, policy_version 301260 (0.0040) [2024-06-22 22:15:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 4935860224. Throughput: 0: 42871.5. Samples: 4935953840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 22:15:58,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 22:16:02,883][15401] Updated weights for policy 0, policy_version 301270 (0.0031) [2024-06-22 22:16:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 4936024064. Throughput: 0: 42991.3. Samples: 4936219800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 22:16:05,141][15401] Updated weights for policy 0, policy_version 301280 (0.0046) [2024-06-22 22:16:08,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4936253440. Throughput: 0: 42514.4. Samples: 4936335740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 22:16:10,459][15401] Updated weights for policy 0, policy_version 301290 (0.0034) [2024-06-22 22:16:13,059][15401] Updated weights for policy 0, policy_version 301300 (0.0028) [2024-06-22 22:16:13,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43417.6, 300 sec: 42875.7). Total num frames: 4936499200. Throughput: 0: 42856.0. Samples: 4936595260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:13,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 22:16:17,926][15401] Updated weights for policy 0, policy_version 301310 (0.0032) [2024-06-22 22:16:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4936679424. Throughput: 0: 42774.3. Samples: 4936855400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:18,391][15132] Avg episode reward: [(0, '0.447')] [2024-06-22 22:16:20,822][15401] Updated weights for policy 0, policy_version 301320 (0.0033) [2024-06-22 22:16:23,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42600.2, 300 sec: 42821.4). Total num frames: 4936892416. Throughput: 0: 42489.8. Samples: 4936976800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 22:16:25,411][15401] Updated weights for policy 0, policy_version 301330 (0.0022) [2024-06-22 22:16:28,308][15401] Updated weights for policy 0, policy_version 301340 (0.0035) [2024-06-22 22:16:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42931.9). Total num frames: 4937154560. Throughput: 0: 42826.2. Samples: 4937240480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 22:16:32,907][15401] Updated weights for policy 0, policy_version 301350 (0.0026) [2024-06-22 22:16:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4937318400. Throughput: 0: 42702.7. Samples: 4937498220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 22:16:36,004][15401] Updated weights for policy 0, policy_version 301360 (0.0038) [2024-06-22 22:16:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4937547776. Throughput: 0: 42546.9. Samples: 4937617840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:38,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 22:16:40,516][15401] Updated weights for policy 0, policy_version 301370 (0.0046) [2024-06-22 22:16:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4937777152. Throughput: 0: 42891.3. Samples: 4937883840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 22:16:43,746][15401] Updated weights for policy 0, policy_version 301380 (0.0041) [2024-06-22 22:16:48,178][15401] Updated weights for policy 0, policy_version 301390 (0.0038) [2024-06-22 22:16:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4937973760. Throughput: 0: 42643.9. Samples: 4938138780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 22:16:50,525][15349] Signal inference workers to stop experience collection... (73000 times) [2024-06-22 22:16:50,577][15401] InferenceWorker_p0-w0: stopping experience collection (73000 times) [2024-06-22 22:16:50,587][15349] Signal inference workers to resume experience collection... (73000 times) [2024-06-22 22:16:50,592][15401] InferenceWorker_p0-w0: resuming experience collection (73000 times) [2024-06-22 22:16:51,376][15401] Updated weights for policy 0, policy_version 301400 (0.0027) [2024-06-22 22:16:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4938186752. Throughput: 0: 42794.5. Samples: 4938261500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 22:16:56,037][15401] Updated weights for policy 0, policy_version 301410 (0.0027) [2024-06-22 22:16:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 4938399744. Throughput: 0: 42906.7. Samples: 4938525960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:16:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 22:16:59,142][15401] Updated weights for policy 0, policy_version 301420 (0.0047) [2024-06-22 22:17:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4938596352. Throughput: 0: 42807.1. Samples: 4938781820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:17:03,392][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 22:17:03,667][15401] Updated weights for policy 0, policy_version 301430 (0.0035) [2024-06-22 22:17:06,778][15401] Updated weights for policy 0, policy_version 301440 (0.0031) [2024-06-22 22:17:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4938825728. Throughput: 0: 42876.8. Samples: 4938906260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:17:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 22:17:11,484][15401] Updated weights for policy 0, policy_version 301450 (0.0037) [2024-06-22 22:17:13,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 4939038720. Throughput: 0: 42786.6. Samples: 4939165880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-22 22:17:13,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-22 22:17:14,385][15401] Updated weights for policy 0, policy_version 301460 (0.0030) [2024-06-22 22:17:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 4939235328. Throughput: 0: 42743.4. Samples: 4939421680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 22:17:19,128][15401] Updated weights for policy 0, policy_version 301470 (0.0044) [2024-06-22 22:17:22,106][15401] Updated weights for policy 0, policy_version 301480 (0.0040) [2024-06-22 22:17:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4939481088. Throughput: 0: 42854.6. Samples: 4939546300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 22:17:26,773][15401] Updated weights for policy 0, policy_version 301490 (0.0027) [2024-06-22 22:17:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 4939661312. Throughput: 0: 42644.7. Samples: 4939802860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 22:17:29,775][15401] Updated weights for policy 0, policy_version 301500 (0.0037) [2024-06-22 22:17:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4939874304. Throughput: 0: 42571.6. Samples: 4940054500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 22:17:34,506][15401] Updated weights for policy 0, policy_version 301510 (0.0035) [2024-06-22 22:17:37,498][15401] Updated weights for policy 0, policy_version 301520 (0.0033) [2024-06-22 22:17:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4940120064. Throughput: 0: 42831.5. Samples: 4940188920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 22:17:42,390][15401] Updated weights for policy 0, policy_version 301530 (0.0030) [2024-06-22 22:17:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 4940283904. Throughput: 0: 42515.9. Samples: 4940439180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 22:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301531_4940283904.pth... [2024-06-22 22:17:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000300906_4930043904.pth [2024-06-22 22:17:45,328][15401] Updated weights for policy 0, policy_version 301540 (0.0029) [2024-06-22 22:17:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4940529664. Throughput: 0: 42589.0. Samples: 4940698220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:48,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 22:17:49,854][15401] Updated weights for policy 0, policy_version 301550 (0.0036) [2024-06-22 22:17:52,941][15401] Updated weights for policy 0, policy_version 301560 (0.0034) [2024-06-22 22:17:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4940759040. Throughput: 0: 42824.9. Samples: 4940833380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 22:17:57,618][15401] Updated weights for policy 0, policy_version 301570 (0.0028) [2024-06-22 22:17:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4940939264. Throughput: 0: 42611.5. Samples: 4941083400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:17:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 22:18:00,813][15401] Updated weights for policy 0, policy_version 301580 (0.0034) [2024-06-22 22:18:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 4941168640. Throughput: 0: 42573.4. Samples: 4941337480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:03,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 22:18:05,083][15401] Updated weights for policy 0, policy_version 301590 (0.0028) [2024-06-22 22:18:07,819][15349] Signal inference workers to stop experience collection... (73050 times) [2024-06-22 22:18:07,819][15349] Signal inference workers to resume experience collection... (73050 times) [2024-06-22 22:18:07,872][15401] InferenceWorker_p0-w0: stopping experience collection (73050 times) [2024-06-22 22:18:07,872][15401] InferenceWorker_p0-w0: resuming experience collection (73050 times) [2024-06-22 22:18:08,392][15132] Fps is (10 sec: 44224.7, 60 sec: 42596.5, 300 sec: 42764.6). Total num frames: 4941381632. Throughput: 0: 42829.9. Samples: 4941473760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:08,393][15132] Avg episode reward: [(0, '0.424')] [2024-06-22 22:18:08,635][15401] Updated weights for policy 0, policy_version 301600 (0.0043) [2024-06-22 22:18:12,739][15401] Updated weights for policy 0, policy_version 301610 (0.0039) [2024-06-22 22:18:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4941594624. Throughput: 0: 42743.9. Samples: 4941726340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 22:18:16,114][15401] Updated weights for policy 0, policy_version 301620 (0.0030) [2024-06-22 22:18:18,389][15132] Fps is (10 sec: 44249.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4941824000. Throughput: 0: 42891.1. Samples: 4941984600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-22 22:18:20,542][15401] Updated weights for policy 0, policy_version 301630 (0.0024) [2024-06-22 22:18:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4942036992. Throughput: 0: 42877.0. Samples: 4942118380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 22:18:23,678][15401] Updated weights for policy 0, policy_version 301640 (0.0023) [2024-06-22 22:18:28,050][15401] Updated weights for policy 0, policy_version 301650 (0.0043) [2024-06-22 22:18:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 4942233600. Throughput: 0: 42882.6. Samples: 4942369000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-22 22:18:28,393][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 22:18:31,328][15401] Updated weights for policy 0, policy_version 301660 (0.0032) [2024-06-22 22:18:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 4942479360. Throughput: 0: 42838.1. Samples: 4942625940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 22:18:35,578][15401] Updated weights for policy 0, policy_version 301670 (0.0031) [2024-06-22 22:18:38,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4942659584. Throughput: 0: 42790.3. Samples: 4942758940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 22:18:39,070][15401] Updated weights for policy 0, policy_version 301680 (0.0063) [2024-06-22 22:18:43,371][15401] Updated weights for policy 0, policy_version 301690 (0.0029) [2024-06-22 22:18:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 4942888960. Throughput: 0: 42875.6. Samples: 4943012800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 22:18:46,925][15401] Updated weights for policy 0, policy_version 301700 (0.0020) [2024-06-22 22:18:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4943118336. Throughput: 0: 42867.9. Samples: 4943266540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 22:18:51,013][15401] Updated weights for policy 0, policy_version 301710 (0.0031) [2024-06-22 22:18:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 4943314944. Throughput: 0: 42858.1. Samples: 4943402260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 22:18:54,351][15401] Updated weights for policy 0, policy_version 301720 (0.0036) [2024-06-22 22:18:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4943527936. Throughput: 0: 42931.6. Samples: 4943658260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:18:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 22:18:58,646][15401] Updated weights for policy 0, policy_version 301730 (0.0039) [2024-06-22 22:19:01,809][15401] Updated weights for policy 0, policy_version 301740 (0.0031) [2024-06-22 22:19:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4943757312. Throughput: 0: 43059.2. Samples: 4943922260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 22:19:06,131][15401] Updated weights for policy 0, policy_version 301750 (0.0028) [2024-06-22 22:19:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.4, 300 sec: 42931.6). Total num frames: 4943970304. Throughput: 0: 43008.7. Samples: 4944053780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 22:19:09,590][15401] Updated weights for policy 0, policy_version 301760 (0.0030) [2024-06-22 22:19:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 4944183296. Throughput: 0: 43096.0. Samples: 4944308320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:13,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 22:19:13,791][15401] Updated weights for policy 0, policy_version 301770 (0.0036) [2024-06-22 22:19:17,148][15401] Updated weights for policy 0, policy_version 301780 (0.0023) [2024-06-22 22:19:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4944396288. Throughput: 0: 42972.0. Samples: 4944559680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 22:19:21,320][15401] Updated weights for policy 0, policy_version 301790 (0.0035) [2024-06-22 22:19:23,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 4944609280. Throughput: 0: 42964.4. Samples: 4944692440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:23,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 22:19:24,625][15401] Updated weights for policy 0, policy_version 301800 (0.0038) [2024-06-22 22:19:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43417.6, 300 sec: 42820.2). Total num frames: 4944838656. Throughput: 0: 43107.4. Samples: 4944952740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:28,393][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 22:19:28,718][15401] Updated weights for policy 0, policy_version 301810 (0.0024) [2024-06-22 22:19:32,438][15401] Updated weights for policy 0, policy_version 301820 (0.0034) [2024-06-22 22:19:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4945051648. Throughput: 0: 43196.6. Samples: 4945210380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 22:19:36,260][15401] Updated weights for policy 0, policy_version 301830 (0.0035) [2024-06-22 22:19:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4945248256. Throughput: 0: 43042.3. Samples: 4945339160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-22 22:19:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 22:19:39,944][15401] Updated weights for policy 0, policy_version 301840 (0.0057) [2024-06-22 22:19:42,531][15349] Signal inference workers to stop experience collection... (73100 times) [2024-06-22 22:19:42,579][15401] InferenceWorker_p0-w0: stopping experience collection (73100 times) [2024-06-22 22:19:42,649][15349] Signal inference workers to resume experience collection... (73100 times) [2024-06-22 22:19:42,649][15401] InferenceWorker_p0-w0: resuming experience collection (73100 times) [2024-06-22 22:19:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 4945477632. Throughput: 0: 43136.4. Samples: 4945599500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:19:43,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 22:19:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301848_4945477632.pth... [2024-06-22 22:19:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301221_4935204864.pth [2024-06-22 22:19:44,003][15401] Updated weights for policy 0, policy_version 301850 (0.0026) [2024-06-22 22:19:47,643][15401] Updated weights for policy 0, policy_version 301860 (0.0023) [2024-06-22 22:19:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 4945707008. Throughput: 0: 42900.4. Samples: 4945852780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:19:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 22:19:51,567][15401] Updated weights for policy 0, policy_version 301870 (0.0041) [2024-06-22 22:19:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4945887232. Throughput: 0: 42967.3. Samples: 4945987300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:19:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 22:19:55,133][15401] Updated weights for policy 0, policy_version 301880 (0.0036) [2024-06-22 22:19:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 4946116608. Throughput: 0: 42998.7. Samples: 4946243160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:19:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 22:19:59,107][15401] Updated weights for policy 0, policy_version 301890 (0.0027) [2024-06-22 22:20:02,632][15401] Updated weights for policy 0, policy_version 301900 (0.0035) [2024-06-22 22:20:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 4946345984. Throughput: 0: 43011.5. Samples: 4946495200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 22:20:06,717][15401] Updated weights for policy 0, policy_version 301910 (0.0042) [2024-06-22 22:20:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 4946509824. Throughput: 0: 43055.2. Samples: 4946629820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:20:10,457][15401] Updated weights for policy 0, policy_version 301920 (0.0035) [2024-06-22 22:20:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4946755584. Throughput: 0: 42826.7. Samples: 4946879840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 22:20:14,287][15401] Updated weights for policy 0, policy_version 301930 (0.0028) [2024-06-22 22:20:18,314][15401] Updated weights for policy 0, policy_version 301940 (0.0039) [2024-06-22 22:20:18,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 4946984960. Throughput: 0: 42909.7. Samples: 4947141320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 22:20:21,920][15401] Updated weights for policy 0, policy_version 301950 (0.0036) [2024-06-22 22:20:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4947165184. Throughput: 0: 42952.8. Samples: 4947272040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 22:20:25,849][15401] Updated weights for policy 0, policy_version 301960 (0.0039) [2024-06-22 22:20:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4947410944. Throughput: 0: 42731.1. Samples: 4947522300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 22:20:29,696][15401] Updated weights for policy 0, policy_version 301970 (0.0039) [2024-06-22 22:20:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4947623936. Throughput: 0: 42926.3. Samples: 4947784460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 22:20:33,430][15401] Updated weights for policy 0, policy_version 301980 (0.0036) [2024-06-22 22:20:37,296][15401] Updated weights for policy 0, policy_version 301990 (0.0047) [2024-06-22 22:20:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4947820544. Throughput: 0: 42817.3. Samples: 4947914080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 22:20:40,936][15349] Signal inference workers to stop experience collection... (73150 times) [2024-06-22 22:20:40,986][15401] InferenceWorker_p0-w0: stopping experience collection (73150 times) [2024-06-22 22:20:40,995][15349] Signal inference workers to resume experience collection... (73150 times) [2024-06-22 22:20:41,004][15401] InferenceWorker_p0-w0: resuming experience collection (73150 times) [2024-06-22 22:20:41,007][15401] Updated weights for policy 0, policy_version 302000 (0.0035) [2024-06-22 22:20:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4948049920. Throughput: 0: 42748.5. Samples: 4948166840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 22:20:44,884][15401] Updated weights for policy 0, policy_version 302010 (0.0038) [2024-06-22 22:20:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4948262912. Throughput: 0: 42986.4. Samples: 4948429580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 22:20:48,595][15401] Updated weights for policy 0, policy_version 302020 (0.0032) [2024-06-22 22:20:52,539][15401] Updated weights for policy 0, policy_version 302030 (0.0031) [2024-06-22 22:20:53,395][15132] Fps is (10 sec: 42576.1, 60 sec: 43140.7, 300 sec: 42764.6). Total num frames: 4948475904. Throughput: 0: 42904.3. Samples: 4948560740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-22 22:20:53,395][15132] Avg episode reward: [(0, '0.694')] [2024-06-22 22:20:56,110][15401] Updated weights for policy 0, policy_version 302040 (0.0030) [2024-06-22 22:20:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4948672512. Throughput: 0: 42887.2. Samples: 4948809760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:20:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 22:21:00,223][15401] Updated weights for policy 0, policy_version 302050 (0.0028) [2024-06-22 22:21:03,389][15132] Fps is (10 sec: 44260.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 4948918272. Throughput: 0: 43035.2. Samples: 4949077900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 22:21:03,820][15401] Updated weights for policy 0, policy_version 302060 (0.0036) [2024-06-22 22:21:07,882][15401] Updated weights for policy 0, policy_version 302070 (0.0035) [2024-06-22 22:21:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42820.9). Total num frames: 4949131264. Throughput: 0: 42973.8. Samples: 4949205860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 22:21:11,408][15401] Updated weights for policy 0, policy_version 302080 (0.0029) [2024-06-22 22:21:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4949344256. Throughput: 0: 43016.8. Samples: 4949458060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 22:21:15,544][15401] Updated weights for policy 0, policy_version 302090 (0.0039) [2024-06-22 22:21:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 4949540864. Throughput: 0: 43019.5. Samples: 4949720340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 22:21:18,970][15401] Updated weights for policy 0, policy_version 302100 (0.0032) [2024-06-22 22:21:23,144][15401] Updated weights for policy 0, policy_version 302110 (0.0047) [2024-06-22 22:21:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 4949786624. Throughput: 0: 42915.9. Samples: 4949845300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:23,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 22:21:26,813][15401] Updated weights for policy 0, policy_version 302120 (0.0037) [2024-06-22 22:21:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4949966848. Throughput: 0: 42982.2. Samples: 4950101040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 22:21:31,163][15401] Updated weights for policy 0, policy_version 302130 (0.0035) [2024-06-22 22:21:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4950179840. Throughput: 0: 42930.1. Samples: 4950361440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 22:21:34,451][15401] Updated weights for policy 0, policy_version 302140 (0.0030) [2024-06-22 22:21:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4950409216. Throughput: 0: 42918.5. Samples: 4950491840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 22:21:38,585][15401] Updated weights for policy 0, policy_version 302150 (0.0039) [2024-06-22 22:21:42,072][15401] Updated weights for policy 0, policy_version 302160 (0.0030) [2024-06-22 22:21:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4950638592. Throughput: 0: 42867.5. Samples: 4950738800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-22 22:21:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302163_4950638592.pth... [2024-06-22 22:21:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301531_4940283904.pth [2024-06-22 22:21:46,173][15401] Updated weights for policy 0, policy_version 302170 (0.0040) [2024-06-22 22:21:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4950835200. Throughput: 0: 42776.3. Samples: 4951002840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 22:21:49,721][15401] Updated weights for policy 0, policy_version 302180 (0.0028) [2024-06-22 22:21:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42875.1, 300 sec: 42876.1). Total num frames: 4951048192. Throughput: 0: 42704.3. Samples: 4951127560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 22:21:53,735][15401] Updated weights for policy 0, policy_version 302190 (0.0038) [2024-06-22 22:21:57,543][15401] Updated weights for policy 0, policy_version 302200 (0.0031) [2024-06-22 22:21:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 4951277568. Throughput: 0: 42828.1. Samples: 4951385320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:21:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 22:22:01,821][15401] Updated weights for policy 0, policy_version 302210 (0.0031) [2024-06-22 22:22:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 4951474176. Throughput: 0: 42675.4. Samples: 4951640740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:22:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 22:22:04,469][15349] Signal inference workers to stop experience collection... (73200 times) [2024-06-22 22:22:04,498][15401] InferenceWorker_p0-w0: stopping experience collection (73200 times) [2024-06-22 22:22:04,525][15349] Signal inference workers to resume experience collection... (73200 times) [2024-06-22 22:22:04,532][15401] InferenceWorker_p0-w0: resuming experience collection (73200 times) [2024-06-22 22:22:05,057][15401] Updated weights for policy 0, policy_version 302220 (0.0032) [2024-06-22 22:22:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 4951670784. Throughput: 0: 42749.3. Samples: 4951769020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-22 22:22:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 22:22:09,455][15401] Updated weights for policy 0, policy_version 302230 (0.0034) [2024-06-22 22:22:12,977][15401] Updated weights for policy 0, policy_version 302240 (0.0028) [2024-06-22 22:22:13,390][15132] Fps is (10 sec: 44235.2, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 4951916544. Throughput: 0: 42857.9. Samples: 4952029660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 22:22:17,294][15401] Updated weights for policy 0, policy_version 302250 (0.0029) [2024-06-22 22:22:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4952113152. Throughput: 0: 42529.8. Samples: 4952275280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 22:22:20,754][15401] Updated weights for policy 0, policy_version 302260 (0.0034) [2024-06-22 22:22:23,389][15132] Fps is (10 sec: 40962.2, 60 sec: 42325.5, 300 sec: 42931.7). Total num frames: 4952326144. Throughput: 0: 42358.2. Samples: 4952397960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 22:22:24,890][15401] Updated weights for policy 0, policy_version 302270 (0.0029) [2024-06-22 22:22:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 4952539136. Throughput: 0: 42759.6. Samples: 4952662980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 22:22:28,503][15401] Updated weights for policy 0, policy_version 302280 (0.0028) [2024-06-22 22:22:32,504][15401] Updated weights for policy 0, policy_version 302290 (0.0036) [2024-06-22 22:22:33,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 4952752128. Throughput: 0: 42519.6. Samples: 4952916320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:33,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 22:22:36,113][15401] Updated weights for policy 0, policy_version 302300 (0.0035) [2024-06-22 22:22:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 4952981504. Throughput: 0: 42661.8. Samples: 4953047340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:38,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-22 22:22:40,072][15401] Updated weights for policy 0, policy_version 302310 (0.0027) [2024-06-22 22:22:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 4953178112. Throughput: 0: 42681.5. Samples: 4953305980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 22:22:43,827][15401] Updated weights for policy 0, policy_version 302320 (0.0031) [2024-06-22 22:22:47,913][15401] Updated weights for policy 0, policy_version 302330 (0.0039) [2024-06-22 22:22:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 4953374720. Throughput: 0: 42628.5. Samples: 4953559020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 22:22:51,526][15401] Updated weights for policy 0, policy_version 302340 (0.0026) [2024-06-22 22:22:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42931.6). Total num frames: 4953604096. Throughput: 0: 42623.3. Samples: 4953687060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 22:22:55,899][15401] Updated weights for policy 0, policy_version 302350 (0.0031) [2024-06-22 22:22:58,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42323.6, 300 sec: 42875.7). Total num frames: 4953817088. Throughput: 0: 42492.7. Samples: 4953941920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:22:58,393][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 22:22:59,051][15401] Updated weights for policy 0, policy_version 302360 (0.0023) [2024-06-22 22:23:03,363][15401] Updated weights for policy 0, policy_version 302370 (0.0034) [2024-06-22 22:23:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 4954030080. Throughput: 0: 42892.0. Samples: 4954205420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:23:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 22:23:06,583][15401] Updated weights for policy 0, policy_version 302380 (0.0041) [2024-06-22 22:23:08,389][15132] Fps is (10 sec: 44248.3, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 4954259456. Throughput: 0: 42974.7. Samples: 4954331820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:23:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 22:23:10,890][15401] Updated weights for policy 0, policy_version 302390 (0.0037) [2024-06-22 22:23:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.7, 300 sec: 42876.1). Total num frames: 4954472448. Throughput: 0: 42791.6. Samples: 4954588600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:23:13,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 22:23:14,422][15401] Updated weights for policy 0, policy_version 302400 (0.0037) [2024-06-22 22:23:18,385][15401] Updated weights for policy 0, policy_version 302410 (0.0041) [2024-06-22 22:23:18,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4954685440. Throughput: 0: 43021.8. Samples: 4954852200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:23:18,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 22:23:22,074][15401] Updated weights for policy 0, policy_version 302420 (0.0023) [2024-06-22 22:23:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 4954914816. Throughput: 0: 42899.6. Samples: 4954977820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:23:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 22:23:25,823][15401] Updated weights for policy 0, policy_version 302430 (0.0023) [2024-06-22 22:23:26,915][15349] Signal inference workers to stop experience collection... (73250 times) [2024-06-22 22:23:26,947][15401] InferenceWorker_p0-w0: stopping experience collection (73250 times) [2024-06-22 22:23:26,978][15349] Signal inference workers to resume experience collection... (73250 times) [2024-06-22 22:23:26,980][15401] InferenceWorker_p0-w0: resuming experience collection (73250 times) [2024-06-22 22:23:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4955111424. Throughput: 0: 42788.4. Samples: 4955231460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:28,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-22 22:23:29,645][15401] Updated weights for policy 0, policy_version 302440 (0.0035) [2024-06-22 22:23:33,363][15401] Updated weights for policy 0, policy_version 302450 (0.0025) [2024-06-22 22:23:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 4955340800. Throughput: 0: 42819.6. Samples: 4955485900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 22:23:37,394][15401] Updated weights for policy 0, policy_version 302460 (0.0033) [2024-06-22 22:23:38,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 4955553792. Throughput: 0: 42881.3. Samples: 4955617000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:38,396][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 22:23:41,402][15401] Updated weights for policy 0, policy_version 302470 (0.0039) [2024-06-22 22:23:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4955750400. Throughput: 0: 42899.7. Samples: 4955872300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 22:23:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302475_4955750400.pth... [2024-06-22 22:23:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000301848_4945477632.pth [2024-06-22 22:23:44,977][15401] Updated weights for policy 0, policy_version 302480 (0.0027) [2024-06-22 22:23:48,389][15132] Fps is (10 sec: 40986.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4955963392. Throughput: 0: 42625.0. Samples: 4956123540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 22:23:48,956][15401] Updated weights for policy 0, policy_version 302490 (0.0034) [2024-06-22 22:23:52,767][15401] Updated weights for policy 0, policy_version 302500 (0.0029) [2024-06-22 22:23:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 4956176384. Throughput: 0: 42733.9. Samples: 4956254860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 22:23:56,550][15401] Updated weights for policy 0, policy_version 302510 (0.0036) [2024-06-22 22:23:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 4956372992. Throughput: 0: 42674.0. Samples: 4956508940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:23:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 22:24:00,564][15401] Updated weights for policy 0, policy_version 302520 (0.0026) [2024-06-22 22:24:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 4956618752. Throughput: 0: 42468.1. Samples: 4956763260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 22:24:04,199][15401] Updated weights for policy 0, policy_version 302530 (0.0026) [2024-06-22 22:24:08,394][15132] Fps is (10 sec: 44216.1, 60 sec: 42594.9, 300 sec: 42820.2). Total num frames: 4956815360. Throughput: 0: 42650.1. Samples: 4956897280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:08,395][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 22:24:08,405][15401] Updated weights for policy 0, policy_version 302540 (0.0035) [2024-06-22 22:24:11,907][15401] Updated weights for policy 0, policy_version 302550 (0.0031) [2024-06-22 22:24:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4957011968. Throughput: 0: 42506.1. Samples: 4957144240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 22:24:16,009][15401] Updated weights for policy 0, policy_version 302560 (0.0032) [2024-06-22 22:24:18,389][15132] Fps is (10 sec: 44258.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 4957257728. Throughput: 0: 42533.7. Samples: 4957399920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 22:24:19,466][15401] Updated weights for policy 0, policy_version 302570 (0.0033) [2024-06-22 22:24:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 4957437952. Throughput: 0: 42661.2. Samples: 4957536480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 22:24:23,581][15401] Updated weights for policy 0, policy_version 302580 (0.0036) [2024-06-22 22:24:27,251][15401] Updated weights for policy 0, policy_version 302590 (0.0028) [2024-06-22 22:24:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4957650944. Throughput: 0: 42602.6. Samples: 4957789420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:28,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-22 22:24:31,187][15401] Updated weights for policy 0, policy_version 302600 (0.0038) [2024-06-22 22:24:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4957896704. Throughput: 0: 42735.0. Samples: 4958046620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-22 22:24:33,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 22:24:34,831][15401] Updated weights for policy 0, policy_version 302610 (0.0027) [2024-06-22 22:24:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42056.8, 300 sec: 42709.8). Total num frames: 4958076928. Throughput: 0: 42743.8. Samples: 4958178320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:24:38,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-22 22:24:38,912][15401] Updated weights for policy 0, policy_version 302620 (0.0028) [2024-06-22 22:24:42,680][15401] Updated weights for policy 0, policy_version 302630 (0.0033) [2024-06-22 22:24:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4958289920. Throughput: 0: 42839.2. Samples: 4958436700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:24:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 22:24:46,605][15401] Updated weights for policy 0, policy_version 302640 (0.0045) [2024-06-22 22:24:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 4958552064. Throughput: 0: 42655.1. Samples: 4958682740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:24:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 22:24:48,842][15349] Signal inference workers to stop experience collection... (73300 times) [2024-06-22 22:24:48,889][15401] InferenceWorker_p0-w0: stopping experience collection (73300 times) [2024-06-22 22:24:48,959][15349] Signal inference workers to resume experience collection... (73300 times) [2024-06-22 22:24:48,959][15401] InferenceWorker_p0-w0: resuming experience collection (73300 times) [2024-06-22 22:24:50,354][15401] Updated weights for policy 0, policy_version 302650 (0.0024) [2024-06-22 22:24:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4958715904. Throughput: 0: 42611.1. Samples: 4958814580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:24:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 22:24:54,576][15401] Updated weights for policy 0, policy_version 302660 (0.0035) [2024-06-22 22:24:57,916][15401] Updated weights for policy 0, policy_version 302670 (0.0030) [2024-06-22 22:24:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4958945280. Throughput: 0: 42630.2. Samples: 4959062600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:24:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 22:25:02,238][15401] Updated weights for policy 0, policy_version 302680 (0.0038) [2024-06-22 22:25:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 4959174656. Throughput: 0: 42672.5. Samples: 4959320180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 22:25:05,675][15401] Updated weights for policy 0, policy_version 302690 (0.0038) [2024-06-22 22:25:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42328.7, 300 sec: 42709.5). Total num frames: 4959354880. Throughput: 0: 42568.3. Samples: 4959452060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 22:25:09,848][15401] Updated weights for policy 0, policy_version 302700 (0.0027) [2024-06-22 22:25:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4959584256. Throughput: 0: 42492.0. Samples: 4959701560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 22:25:13,670][15401] Updated weights for policy 0, policy_version 302710 (0.0044) [2024-06-22 22:25:17,442][15401] Updated weights for policy 0, policy_version 302720 (0.0039) [2024-06-22 22:25:18,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 4959813632. Throughput: 0: 42569.7. Samples: 4959962360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:18,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 22:25:21,252][15401] Updated weights for policy 0, policy_version 302730 (0.0040) [2024-06-22 22:25:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4959993856. Throughput: 0: 42570.2. Samples: 4960093980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 22:25:24,973][15401] Updated weights for policy 0, policy_version 302740 (0.0037) [2024-06-22 22:25:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4960239616. Throughput: 0: 42391.2. Samples: 4960344300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-22 22:25:28,916][15401] Updated weights for policy 0, policy_version 302750 (0.0048) [2024-06-22 22:25:32,775][15401] Updated weights for policy 0, policy_version 302760 (0.0040) [2024-06-22 22:25:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 4960436224. Throughput: 0: 42533.2. Samples: 4960596740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-22 22:25:36,823][15401] Updated weights for policy 0, policy_version 302770 (0.0033) [2024-06-22 22:25:38,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 4960649216. Throughput: 0: 42435.9. Samples: 4960724460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:38,397][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 22:25:40,359][15401] Updated weights for policy 0, policy_version 302780 (0.0030) [2024-06-22 22:25:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4960862208. Throughput: 0: 42589.4. Samples: 4960979120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 22:25:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302787_4960862208.pth... [2024-06-22 22:25:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302163_4950638592.pth [2024-06-22 22:25:44,309][15401] Updated weights for policy 0, policy_version 302790 (0.0034) [2024-06-22 22:25:48,220][15401] Updated weights for policy 0, policy_version 302800 (0.0032) [2024-06-22 22:25:48,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42052.3, 300 sec: 42710.2). Total num frames: 4961075200. Throughput: 0: 42656.4. Samples: 4961239720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 22:25:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 22:25:52,004][15401] Updated weights for policy 0, policy_version 302810 (0.0032) [2024-06-22 22:25:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4961288192. Throughput: 0: 42575.2. Samples: 4961367940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:25:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 22:25:55,773][15401] Updated weights for policy 0, policy_version 302820 (0.0047) [2024-06-22 22:25:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4961484800. Throughput: 0: 42639.6. Samples: 4961620340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:25:58,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 22:25:59,709][15401] Updated weights for policy 0, policy_version 302830 (0.0038) [2024-06-22 22:26:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 4961714176. Throughput: 0: 42725.3. Samples: 4961884900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:03,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-22 22:26:03,671][15401] Updated weights for policy 0, policy_version 302840 (0.0029) [2024-06-22 22:26:07,371][15401] Updated weights for policy 0, policy_version 302850 (0.0031) [2024-06-22 22:26:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4961927168. Throughput: 0: 42658.2. Samples: 4962013600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 22:26:11,108][15401] Updated weights for policy 0, policy_version 302860 (0.0039) [2024-06-22 22:26:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4962140160. Throughput: 0: 42689.3. Samples: 4962265320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 22:26:15,091][15401] Updated weights for policy 0, policy_version 302870 (0.0028) [2024-06-22 22:26:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 4962369536. Throughput: 0: 42841.9. Samples: 4962524620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 22:26:18,710][15401] Updated weights for policy 0, policy_version 302880 (0.0037) [2024-06-22 22:26:21,671][15349] Signal inference workers to stop experience collection... (73350 times) [2024-06-22 22:26:21,672][15349] Signal inference workers to resume experience collection... (73350 times) [2024-06-22 22:26:21,706][15401] InferenceWorker_p0-w0: stopping experience collection (73350 times) [2024-06-22 22:26:21,706][15401] InferenceWorker_p0-w0: resuming experience collection (73350 times) [2024-06-22 22:26:22,947][15401] Updated weights for policy 0, policy_version 302890 (0.0028) [2024-06-22 22:26:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4962566144. Throughput: 0: 42947.4. Samples: 4962656820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 22:26:26,277][15401] Updated weights for policy 0, policy_version 302900 (0.0045) [2024-06-22 22:26:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4962795520. Throughput: 0: 42963.6. Samples: 4962912480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 22:26:30,430][15401] Updated weights for policy 0, policy_version 302910 (0.0034) [2024-06-22 22:26:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4963008512. Throughput: 0: 42812.4. Samples: 4963166280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 22:26:34,098][15401] Updated weights for policy 0, policy_version 302920 (0.0039) [2024-06-22 22:26:37,972][15401] Updated weights for policy 0, policy_version 302930 (0.0037) [2024-06-22 22:26:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 4963205120. Throughput: 0: 42773.3. Samples: 4963292740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 22:26:41,733][15401] Updated weights for policy 0, policy_version 302940 (0.0033) [2024-06-22 22:26:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4963434496. Throughput: 0: 42912.9. Samples: 4963551420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-22 22:26:45,508][15401] Updated weights for policy 0, policy_version 302950 (0.0033) [2024-06-22 22:26:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4963647488. Throughput: 0: 42668.5. Samples: 4963804980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 22:26:49,469][15401] Updated weights for policy 0, policy_version 302960 (0.0043) [2024-06-22 22:26:53,159][15401] Updated weights for policy 0, policy_version 302970 (0.0038) [2024-06-22 22:26:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 4963860480. Throughput: 0: 42616.3. Samples: 4963931440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:53,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 22:26:57,149][15401] Updated weights for policy 0, policy_version 302980 (0.0047) [2024-06-22 22:26:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4964057088. Throughput: 0: 42749.2. Samples: 4964189040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:26:58,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 22:27:00,872][15401] Updated weights for policy 0, policy_version 302990 (0.0038) [2024-06-22 22:27:03,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4964286464. Throughput: 0: 42682.6. Samples: 4964445340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-22 22:27:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-22 22:27:05,075][15401] Updated weights for policy 0, policy_version 303000 (0.0038) [2024-06-22 22:27:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4964499456. Throughput: 0: 42530.8. Samples: 4964570700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 22:27:08,433][15401] Updated weights for policy 0, policy_version 303010 (0.0033) [2024-06-22 22:27:12,574][15401] Updated weights for policy 0, policy_version 303020 (0.0033) [2024-06-22 22:27:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4964696064. Throughput: 0: 42603.1. Samples: 4964829620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 22:27:16,381][15401] Updated weights for policy 0, policy_version 303030 (0.0047) [2024-06-22 22:27:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4964925440. Throughput: 0: 42504.5. Samples: 4965078980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:18,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-22 22:27:20,282][15401] Updated weights for policy 0, policy_version 303040 (0.0035) [2024-06-22 22:27:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4965138432. Throughput: 0: 42711.9. Samples: 4965214780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 22:27:23,917][15401] Updated weights for policy 0, policy_version 303050 (0.0023) [2024-06-22 22:27:27,827][15401] Updated weights for policy 0, policy_version 303060 (0.0029) [2024-06-22 22:27:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 4965335040. Throughput: 0: 42534.1. Samples: 4965465460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-22 22:27:31,748][15401] Updated weights for policy 0, policy_version 303070 (0.0035) [2024-06-22 22:27:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4965580800. Throughput: 0: 42580.1. Samples: 4965721080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 22:27:35,312][15401] Updated weights for policy 0, policy_version 303080 (0.0040) [2024-06-22 22:27:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 4965777408. Throughput: 0: 42871.1. Samples: 4965860540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 22:27:39,142][15401] Updated weights for policy 0, policy_version 303090 (0.0030) [2024-06-22 22:27:39,467][15349] Signal inference workers to stop experience collection... (73400 times) [2024-06-22 22:27:39,496][15401] InferenceWorker_p0-w0: stopping experience collection (73400 times) [2024-06-22 22:27:39,585][15349] Signal inference workers to resume experience collection... (73400 times) [2024-06-22 22:27:39,586][15401] InferenceWorker_p0-w0: resuming experience collection (73400 times) [2024-06-22 22:27:42,962][15401] Updated weights for policy 0, policy_version 303100 (0.0027) [2024-06-22 22:27:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 4965990400. Throughput: 0: 42861.8. Samples: 4966117920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:43,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 22:27:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303100_4965990400.pth... [2024-06-22 22:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302475_4955750400.pth [2024-06-22 22:27:46,915][15401] Updated weights for policy 0, policy_version 303110 (0.0035) [2024-06-22 22:27:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42764.6). Total num frames: 4966219776. Throughput: 0: 42688.8. Samples: 4966366440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:48,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 22:27:50,958][15401] Updated weights for policy 0, policy_version 303120 (0.0037) [2024-06-22 22:27:53,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 4966400000. Throughput: 0: 42931.5. Samples: 4966502620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 22:27:54,505][15401] Updated weights for policy 0, policy_version 303130 (0.0029) [2024-06-22 22:27:58,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4966629376. Throughput: 0: 42866.2. Samples: 4966758600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:27:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 22:27:58,601][15401] Updated weights for policy 0, policy_version 303140 (0.0041) [2024-06-22 22:28:02,101][15401] Updated weights for policy 0, policy_version 303150 (0.0034) [2024-06-22 22:28:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4966858752. Throughput: 0: 42923.6. Samples: 4967010540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:28:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 22:28:06,205][15401] Updated weights for policy 0, policy_version 303160 (0.0028) [2024-06-22 22:28:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4967055360. Throughput: 0: 42855.2. Samples: 4967143260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:28:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 22:28:09,725][15401] Updated weights for policy 0, policy_version 303170 (0.0040) [2024-06-22 22:28:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4967268352. Throughput: 0: 42871.2. Samples: 4967394660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:28:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 22:28:13,881][15401] Updated weights for policy 0, policy_version 303180 (0.0028) [2024-06-22 22:28:17,356][15401] Updated weights for policy 0, policy_version 303190 (0.0034) [2024-06-22 22:28:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4967514112. Throughput: 0: 42959.0. Samples: 4967654240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 22:28:18,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 22:28:21,568][15401] Updated weights for policy 0, policy_version 303200 (0.0036) [2024-06-22 22:28:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4967694336. Throughput: 0: 42867.1. Samples: 4967789560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:23,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 22:28:25,017][15401] Updated weights for policy 0, policy_version 303210 (0.0031) [2024-06-22 22:28:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4967907328. Throughput: 0: 42768.4. Samples: 4968042400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 22:28:29,114][15401] Updated weights for policy 0, policy_version 303220 (0.0032) [2024-06-22 22:28:32,619][15401] Updated weights for policy 0, policy_version 303230 (0.0029) [2024-06-22 22:28:33,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 4968136704. Throughput: 0: 42958.5. Samples: 4968299460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 22:28:36,840][15401] Updated weights for policy 0, policy_version 303240 (0.0034) [2024-06-22 22:28:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4968333312. Throughput: 0: 42789.7. Samples: 4968428160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 22:28:40,424][15401] Updated weights for policy 0, policy_version 303250 (0.0035) [2024-06-22 22:28:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 4968562688. Throughput: 0: 42688.4. Samples: 4968679580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 22:28:44,450][15401] Updated weights for policy 0, policy_version 303260 (0.0033) [2024-06-22 22:28:48,078][15401] Updated weights for policy 0, policy_version 303270 (0.0036) [2024-06-22 22:28:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 4968775680. Throughput: 0: 42714.6. Samples: 4968932700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 22:28:52,073][15401] Updated weights for policy 0, policy_version 303280 (0.0034) [2024-06-22 22:28:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4968972288. Throughput: 0: 42642.2. Samples: 4969062160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 22:28:56,067][15401] Updated weights for policy 0, policy_version 303290 (0.0032) [2024-06-22 22:28:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4969201664. Throughput: 0: 42815.9. Samples: 4969321380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:28:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-22 22:28:58,647][15349] Signal inference workers to stop experience collection... (73450 times) [2024-06-22 22:28:58,703][15401] InferenceWorker_p0-w0: stopping experience collection (73450 times) [2024-06-22 22:28:58,706][15349] Signal inference workers to resume experience collection... (73450 times) [2024-06-22 22:28:58,726][15401] InferenceWorker_p0-w0: resuming experience collection (73450 times) [2024-06-22 22:28:59,699][15401] Updated weights for policy 0, policy_version 303300 (0.0025) [2024-06-22 22:29:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.6). Total num frames: 4969398272. Throughput: 0: 42601.4. Samples: 4969571300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 22:29:03,730][15401] Updated weights for policy 0, policy_version 303310 (0.0030) [2024-06-22 22:29:07,505][15401] Updated weights for policy 0, policy_version 303320 (0.0035) [2024-06-22 22:29:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 4969611264. Throughput: 0: 42457.4. Samples: 4969700240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:08,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 22:29:11,378][15401] Updated weights for policy 0, policy_version 303330 (0.0029) [2024-06-22 22:29:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 4969857024. Throughput: 0: 42615.7. Samples: 4969960100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 22:29:15,338][15401] Updated weights for policy 0, policy_version 303340 (0.0038) [2024-06-22 22:29:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 4970053632. Throughput: 0: 42636.7. Samples: 4970218120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:29:19,157][15401] Updated weights for policy 0, policy_version 303350 (0.0027) [2024-06-22 22:29:23,054][15401] Updated weights for policy 0, policy_version 303360 (0.0029) [2024-06-22 22:29:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4970266624. Throughput: 0: 42582.7. Samples: 4970344380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 22:29:26,850][15401] Updated weights for policy 0, policy_version 303370 (0.0028) [2024-06-22 22:29:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 4970496000. Throughput: 0: 42873.0. Samples: 4970608860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-22 22:29:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 22:29:30,524][15401] Updated weights for policy 0, policy_version 303380 (0.0029) [2024-06-22 22:29:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 4970708992. Throughput: 0: 42913.4. Samples: 4970863800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 22:29:34,457][15401] Updated weights for policy 0, policy_version 303390 (0.0035) [2024-06-22 22:29:38,197][15401] Updated weights for policy 0, policy_version 303400 (0.0043) [2024-06-22 22:29:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 4970921984. Throughput: 0: 42852.0. Samples: 4970990500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 22:29:42,247][15401] Updated weights for policy 0, policy_version 303410 (0.0042) [2024-06-22 22:29:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4971118592. Throughput: 0: 42912.0. Samples: 4971252420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 22:29:43,625][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303415_4971151360.pth... [2024-06-22 22:29:43,669][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000302787_4960862208.pth [2024-06-22 22:29:45,851][15401] Updated weights for policy 0, policy_version 303420 (0.0035) [2024-06-22 22:29:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4971331584. Throughput: 0: 42765.7. Samples: 4971495760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 22:29:49,852][15401] Updated weights for policy 0, policy_version 303430 (0.0043) [2024-06-22 22:29:53,370][15401] Updated weights for policy 0, policy_version 303440 (0.0027) [2024-06-22 22:29:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4971560960. Throughput: 0: 42817.5. Samples: 4971626920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 22:29:57,478][15401] Updated weights for policy 0, policy_version 303450 (0.0030) [2024-06-22 22:29:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 4971757568. Throughput: 0: 42756.1. Samples: 4971884120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:29:58,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-22 22:30:01,127][15401] Updated weights for policy 0, policy_version 303460 (0.0026) [2024-06-22 22:30:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4971986944. Throughput: 0: 42647.6. Samples: 4972137260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 22:30:05,131][15401] Updated weights for policy 0, policy_version 303470 (0.0033) [2024-06-22 22:30:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 4972183552. Throughput: 0: 42781.5. Samples: 4972269540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 22:30:08,595][15401] Updated weights for policy 0, policy_version 303480 (0.0026) [2024-06-22 22:30:12,710][15401] Updated weights for policy 0, policy_version 303490 (0.0036) [2024-06-22 22:30:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 4972396544. Throughput: 0: 42628.7. Samples: 4972527160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 22:30:15,970][15349] Signal inference workers to stop experience collection... (73500 times) [2024-06-22 22:30:16,024][15401] InferenceWorker_p0-w0: stopping experience collection (73500 times) [2024-06-22 22:30:16,033][15349] Signal inference workers to resume experience collection... (73500 times) [2024-06-22 22:30:16,040][15401] InferenceWorker_p0-w0: resuming experience collection (73500 times) [2024-06-22 22:30:16,204][15401] Updated weights for policy 0, policy_version 303500 (0.0035) [2024-06-22 22:30:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 4972625920. Throughput: 0: 42636.0. Samples: 4972782420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 22:30:20,670][15401] Updated weights for policy 0, policy_version 303510 (0.0024) [2024-06-22 22:30:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4972822528. Throughput: 0: 42638.7. Samples: 4972909240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 22:30:23,724][15401] Updated weights for policy 0, policy_version 303520 (0.0038) [2024-06-22 22:30:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4973035520. Throughput: 0: 42435.6. Samples: 4973162020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 22:30:28,392][15401] Updated weights for policy 0, policy_version 303530 (0.0031) [2024-06-22 22:30:31,172][15401] Updated weights for policy 0, policy_version 303540 (0.0034) [2024-06-22 22:30:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 4973264896. Throughput: 0: 42728.8. Samples: 4973418560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 22:30:36,039][15401] Updated weights for policy 0, policy_version 303550 (0.0030) [2024-06-22 22:30:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4973477888. Throughput: 0: 42736.4. Samples: 4973550060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 22:30:38,984][15401] Updated weights for policy 0, policy_version 303560 (0.0038) [2024-06-22 22:30:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4973674496. Throughput: 0: 42600.6. Samples: 4973801160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 22:30:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 22:30:43,586][15401] Updated weights for policy 0, policy_version 303570 (0.0027) [2024-06-22 22:30:46,983][15401] Updated weights for policy 0, policy_version 303580 (0.0028) [2024-06-22 22:30:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4973887488. Throughput: 0: 42727.7. Samples: 4974060000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:30:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 22:30:51,345][15401] Updated weights for policy 0, policy_version 303590 (0.0027) [2024-06-22 22:30:53,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 4974116864. Throughput: 0: 42672.3. Samples: 4974189900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:30:53,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 22:30:54,583][15401] Updated weights for policy 0, policy_version 303600 (0.0029) [2024-06-22 22:30:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4974313472. Throughput: 0: 42510.7. Samples: 4974440140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:30:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 22:30:59,015][15401] Updated weights for policy 0, policy_version 303610 (0.0024) [2024-06-22 22:31:02,335][15401] Updated weights for policy 0, policy_version 303620 (0.0027) [2024-06-22 22:31:03,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4974542848. Throughput: 0: 42607.6. Samples: 4974699760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 22:31:06,727][15401] Updated weights for policy 0, policy_version 303630 (0.0034) [2024-06-22 22:31:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4974755840. Throughput: 0: 42648.4. Samples: 4974828420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:08,395][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 22:31:10,553][15401] Updated weights for policy 0, policy_version 303640 (0.0035) [2024-06-22 22:31:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4974968832. Throughput: 0: 42659.5. Samples: 4975081700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 22:31:14,257][15401] Updated weights for policy 0, policy_version 303650 (0.0033) [2024-06-22 22:31:18,166][15401] Updated weights for policy 0, policy_version 303660 (0.0031) [2024-06-22 22:31:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 4975165440. Throughput: 0: 42666.7. Samples: 4975338660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:18,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 22:31:21,902][15401] Updated weights for policy 0, policy_version 303670 (0.0040) [2024-06-22 22:31:23,326][15349] Signal inference workers to stop experience collection... (73550 times) [2024-06-22 22:31:23,350][15401] InferenceWorker_p0-w0: stopping experience collection (73550 times) [2024-06-22 22:31:23,387][15349] Signal inference workers to resume experience collection... (73550 times) [2024-06-22 22:31:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4975394816. Throughput: 0: 42560.0. Samples: 4975465260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 22:31:23,398][15401] InferenceWorker_p0-w0: resuming experience collection (73550 times) [2024-06-22 22:31:25,673][15401] Updated weights for policy 0, policy_version 303680 (0.0035) [2024-06-22 22:31:28,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4975591424. Throughput: 0: 42645.9. Samples: 4975720220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 22:31:29,726][15401] Updated weights for policy 0, policy_version 303690 (0.0035) [2024-06-22 22:31:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4975804416. Throughput: 0: 42629.8. Samples: 4975978340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:33,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-22 22:31:33,430][15401] Updated weights for policy 0, policy_version 303700 (0.0047) [2024-06-22 22:31:37,432][15401] Updated weights for policy 0, policy_version 303710 (0.0039) [2024-06-22 22:31:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 4976033792. Throughput: 0: 42617.3. Samples: 4976107680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:38,393][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 22:31:41,157][15401] Updated weights for policy 0, policy_version 303720 (0.0037) [2024-06-22 22:31:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 4976214016. Throughput: 0: 42580.6. Samples: 4976356260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 22:31:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303725_4976230400.pth... [2024-06-22 22:31:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303100_4965990400.pth [2024-06-22 22:31:45,109][15401] Updated weights for policy 0, policy_version 303730 (0.0047) [2024-06-22 22:31:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 4976443392. Throughput: 0: 42524.8. Samples: 4976613380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 22:31:48,843][15401] Updated weights for policy 0, policy_version 303740 (0.0041) [2024-06-22 22:31:52,688][15401] Updated weights for policy 0, policy_version 303750 (0.0040) [2024-06-22 22:31:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 4976672768. Throughput: 0: 42486.6. Samples: 4976740320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:53,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 22:31:56,642][15401] Updated weights for policy 0, policy_version 303760 (0.0029) [2024-06-22 22:31:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4976869376. Throughput: 0: 42575.1. Samples: 4976997580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-22 22:31:58,390][15132] Avg episode reward: [(0, '0.143')] [2024-06-22 22:32:00,386][15401] Updated weights for policy 0, policy_version 303770 (0.0040) [2024-06-22 22:32:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4977082368. Throughput: 0: 42521.0. Samples: 4977252000. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-22 22:32:04,280][15401] Updated weights for policy 0, policy_version 303780 (0.0044) [2024-06-22 22:32:07,915][15401] Updated weights for policy 0, policy_version 303790 (0.0034) [2024-06-22 22:32:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4977328128. Throughput: 0: 42594.5. Samples: 4977382020. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-22 22:32:12,249][15401] Updated weights for policy 0, policy_version 303800 (0.0030) [2024-06-22 22:32:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4977508352. Throughput: 0: 42700.4. Samples: 4977641740. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 22:32:15,291][15401] Updated weights for policy 0, policy_version 303810 (0.0029) [2024-06-22 22:32:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 4977737728. Throughput: 0: 42545.3. Samples: 4977892880. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-22 22:32:19,763][15401] Updated weights for policy 0, policy_version 303820 (0.0032) [2024-06-22 22:32:22,912][15401] Updated weights for policy 0, policy_version 303830 (0.0034) [2024-06-22 22:32:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4977950720. Throughput: 0: 42684.6. Samples: 4978028380. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 22:32:27,263][15401] Updated weights for policy 0, policy_version 303840 (0.0027) [2024-06-22 22:32:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4978130944. Throughput: 0: 42814.5. Samples: 4978282920. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:28,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-22 22:32:30,447][15401] Updated weights for policy 0, policy_version 303850 (0.0040) [2024-06-22 22:32:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4978376704. Throughput: 0: 42685.7. Samples: 4978534240. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 22:32:34,813][15401] Updated weights for policy 0, policy_version 303860 (0.0029) [2024-06-22 22:32:38,353][15401] Updated weights for policy 0, policy_version 303870 (0.0035) [2024-06-22 22:32:38,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 4978606080. Throughput: 0: 42963.2. Samples: 4978673660. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 22:32:42,416][15401] Updated weights for policy 0, policy_version 303880 (0.0037) [2024-06-22 22:32:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42598.8). Total num frames: 4978786304. Throughput: 0: 42904.5. Samples: 4978928280. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 22:32:45,841][15401] Updated weights for policy 0, policy_version 303890 (0.0029) [2024-06-22 22:32:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 4979032064. Throughput: 0: 42847.0. Samples: 4979180120. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 22:32:48,400][15349] Signal inference workers to stop experience collection... (73600 times) [2024-06-22 22:32:48,404][15349] Signal inference workers to resume experience collection... (73600 times) [2024-06-22 22:32:48,453][15401] InferenceWorker_p0-w0: stopping experience collection (73600 times) [2024-06-22 22:32:48,454][15401] InferenceWorker_p0-w0: resuming experience collection (73600 times) [2024-06-22 22:32:50,697][15401] Updated weights for policy 0, policy_version 303900 (0.0029) [2024-06-22 22:32:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4979228672. Throughput: 0: 42918.3. Samples: 4979313340. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 22:32:53,594][15401] Updated weights for policy 0, policy_version 303910 (0.0042) [2024-06-22 22:32:58,187][15401] Updated weights for policy 0, policy_version 303920 (0.0040) [2024-06-22 22:32:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4979425280. Throughput: 0: 42751.1. Samples: 4979565540. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:32:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 22:33:01,299][15401] Updated weights for policy 0, policy_version 303930 (0.0039) [2024-06-22 22:33:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4979671040. Throughput: 0: 42834.6. Samples: 4979820440. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:33:03,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-22 22:33:05,744][15401] Updated weights for policy 0, policy_version 303940 (0.0026) [2024-06-22 22:33:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4979884032. Throughput: 0: 42817.7. Samples: 4979955180. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:33:08,394][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 22:33:08,903][15401] Updated weights for policy 0, policy_version 303950 (0.0028) [2024-06-22 22:33:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4980064256. Throughput: 0: 42691.7. Samples: 4980204040. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-06-22 22:33:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 22:33:13,770][15401] Updated weights for policy 0, policy_version 303960 (0.0042) [2024-06-22 22:33:16,501][15401] Updated weights for policy 0, policy_version 303970 (0.0042) [2024-06-22 22:33:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4980293632. Throughput: 0: 42752.6. Samples: 4980458100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 22:33:21,131][15401] Updated weights for policy 0, policy_version 303980 (0.0027) [2024-06-22 22:33:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 4980506624. Throughput: 0: 42711.0. Samples: 4980595660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 22:33:24,072][15401] Updated weights for policy 0, policy_version 303990 (0.0036) [2024-06-22 22:33:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4980703232. Throughput: 0: 42739.5. Samples: 4980851560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 22:33:28,574][15401] Updated weights for policy 0, policy_version 304000 (0.0036) [2024-06-22 22:33:31,811][15401] Updated weights for policy 0, policy_version 304010 (0.0036) [2024-06-22 22:33:33,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4980948992. Throughput: 0: 42720.8. Samples: 4981102560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 22:33:36,167][15401] Updated weights for policy 0, policy_version 304020 (0.0029) [2024-06-22 22:33:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4981161984. Throughput: 0: 42838.3. Samples: 4981241060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 22:33:39,338][15401] Updated weights for policy 0, policy_version 304030 (0.0039) [2024-06-22 22:33:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4981358592. Throughput: 0: 42792.5. Samples: 4981491200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 22:33:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304038_4981358592.pth... [2024-06-22 22:33:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303415_4971151360.pth [2024-06-22 22:33:43,805][15401] Updated weights for policy 0, policy_version 304040 (0.0029) [2024-06-22 22:33:47,146][15401] Updated weights for policy 0, policy_version 304050 (0.0030) [2024-06-22 22:33:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4981587968. Throughput: 0: 42933.7. Samples: 4981752460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:33:51,551][15401] Updated weights for policy 0, policy_version 304060 (0.0030) [2024-06-22 22:33:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4981817344. Throughput: 0: 42876.8. Samples: 4981884640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 22:33:54,683][15401] Updated weights for policy 0, policy_version 304070 (0.0030) [2024-06-22 22:33:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4982013952. Throughput: 0: 43021.7. Samples: 4982140020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:33:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 22:33:58,981][15401] Updated weights for policy 0, policy_version 304080 (0.0041) [2024-06-22 22:34:00,270][15349] Signal inference workers to stop experience collection... (73650 times) [2024-06-22 22:34:00,315][15401] InferenceWorker_p0-w0: stopping experience collection (73650 times) [2024-06-22 22:34:00,385][15349] Signal inference workers to resume experience collection... (73650 times) [2024-06-22 22:34:00,386][15401] InferenceWorker_p0-w0: resuming experience collection (73650 times) [2024-06-22 22:34:02,875][15401] Updated weights for policy 0, policy_version 304090 (0.0029) [2024-06-22 22:34:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 4982243328. Throughput: 0: 43119.6. Samples: 4982398480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:03,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-22 22:34:06,719][15401] Updated weights for policy 0, policy_version 304100 (0.0030) [2024-06-22 22:34:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4982439936. Throughput: 0: 42906.9. Samples: 4982526460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 22:34:10,475][15401] Updated weights for policy 0, policy_version 304110 (0.0023) [2024-06-22 22:34:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 4982652928. Throughput: 0: 42863.2. Samples: 4982780400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:13,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-22 22:34:14,182][15401] Updated weights for policy 0, policy_version 304120 (0.0028) [2024-06-22 22:34:18,078][15401] Updated weights for policy 0, policy_version 304130 (0.0024) [2024-06-22 22:34:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 4982882304. Throughput: 0: 42939.7. Samples: 4983034840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 22:34:21,796][15401] Updated weights for policy 0, policy_version 304140 (0.0032) [2024-06-22 22:34:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 4983078912. Throughput: 0: 42731.5. Samples: 4983163980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 22:34:25,711][15401] Updated weights for policy 0, policy_version 304150 (0.0037) [2024-06-22 22:34:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4983291904. Throughput: 0: 42900.1. Samples: 4983421700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-22 22:34:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 22:34:29,249][15401] Updated weights for policy 0, policy_version 304160 (0.0033) [2024-06-22 22:34:33,345][15401] Updated weights for policy 0, policy_version 304170 (0.0037) [2024-06-22 22:34:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4983521280. Throughput: 0: 42830.7. Samples: 4983679840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 22:34:37,273][15401] Updated weights for policy 0, policy_version 304180 (0.0041) [2024-06-22 22:34:38,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 4983734272. Throughput: 0: 42687.6. Samples: 4983805680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:38,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 22:34:41,146][15401] Updated weights for policy 0, policy_version 304190 (0.0039) [2024-06-22 22:34:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4983947264. Throughput: 0: 42748.5. Samples: 4984063700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 22:34:44,724][15401] Updated weights for policy 0, policy_version 304200 (0.0031) [2024-06-22 22:34:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4984143872. Throughput: 0: 42753.3. Samples: 4984322380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 22:34:48,886][15401] Updated weights for policy 0, policy_version 304210 (0.0044) [2024-06-22 22:34:52,474][15401] Updated weights for policy 0, policy_version 304220 (0.0038) [2024-06-22 22:34:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4984373248. Throughput: 0: 42732.4. Samples: 4984449420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-22 22:34:56,455][15401] Updated weights for policy 0, policy_version 304230 (0.0028) [2024-06-22 22:34:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4984586240. Throughput: 0: 42730.2. Samples: 4984703260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:34:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-22 22:35:00,092][15401] Updated weights for policy 0, policy_version 304240 (0.0027) [2024-06-22 22:35:03,395][15132] Fps is (10 sec: 42575.2, 60 sec: 42594.4, 300 sec: 42764.2). Total num frames: 4984799232. Throughput: 0: 43017.3. Samples: 4984970860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:03,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 22:35:03,971][15401] Updated weights for policy 0, policy_version 304250 (0.0041) [2024-06-22 22:35:07,722][15401] Updated weights for policy 0, policy_version 304260 (0.0038) [2024-06-22 22:35:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4985028608. Throughput: 0: 42977.2. Samples: 4985097960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 22:35:11,498][15401] Updated weights for policy 0, policy_version 304270 (0.0025) [2024-06-22 22:35:13,390][15132] Fps is (10 sec: 44260.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4985241600. Throughput: 0: 42854.9. Samples: 4985350180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-22 22:35:15,083][15401] Updated weights for policy 0, policy_version 304280 (0.0034) [2024-06-22 22:35:18,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 4985421824. Throughput: 0: 43014.8. Samples: 4985615500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-22 22:35:19,029][15401] Updated weights for policy 0, policy_version 304290 (0.0035) [2024-06-22 22:35:22,716][15401] Updated weights for policy 0, policy_version 304300 (0.0034) [2024-06-22 22:35:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4985667584. Throughput: 0: 42920.9. Samples: 4985737020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:23,398][15132] Avg episode reward: [(0, '0.826')] [2024-06-22 22:35:26,653][15401] Updated weights for policy 0, policy_version 304310 (0.0039) [2024-06-22 22:35:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 4985880576. Throughput: 0: 42885.9. Samples: 4985993560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 22:35:29,464][15349] Signal inference workers to stop experience collection... (73700 times) [2024-06-22 22:35:29,464][15349] Signal inference workers to resume experience collection... (73700 times) [2024-06-22 22:35:29,507][15401] InferenceWorker_p0-w0: stopping experience collection (73700 times) [2024-06-22 22:35:29,507][15401] InferenceWorker_p0-w0: resuming experience collection (73700 times) [2024-06-22 22:35:30,314][15401] Updated weights for policy 0, policy_version 304320 (0.0031) [2024-06-22 22:35:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4986077184. Throughput: 0: 42964.1. Samples: 4986255760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 22:35:34,839][15401] Updated weights for policy 0, policy_version 304330 (0.0035) [2024-06-22 22:35:38,118][15401] Updated weights for policy 0, policy_version 304340 (0.0028) [2024-06-22 22:35:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 4986306560. Throughput: 0: 42840.1. Samples: 4986377220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-22 22:35:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 22:35:42,365][15401] Updated weights for policy 0, policy_version 304350 (0.0043) [2024-06-22 22:35:43,391][15132] Fps is (10 sec: 44227.6, 60 sec: 42870.1, 300 sec: 42820.3). Total num frames: 4986519552. Throughput: 0: 43054.0. Samples: 4986640780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:35:43,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 22:35:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304354_4986535936.pth... [2024-06-22 22:35:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000303725_4976230400.pth [2024-06-22 22:35:45,807][15401] Updated weights for policy 0, policy_version 304360 (0.0041) [2024-06-22 22:35:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 4986716160. Throughput: 0: 42686.6. Samples: 4986891520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:35:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 22:35:49,929][15401] Updated weights for policy 0, policy_version 304370 (0.0044) [2024-06-22 22:35:53,387][15401] Updated weights for policy 0, policy_version 304380 (0.0044) [2024-06-22 22:35:53,390][15132] Fps is (10 sec: 44245.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4986961920. Throughput: 0: 42650.7. Samples: 4987017240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:35:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 22:35:57,514][15401] Updated weights for policy 0, policy_version 304390 (0.0031) [2024-06-22 22:35:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4987158528. Throughput: 0: 42877.4. Samples: 4987279660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:35:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 22:36:00,955][15401] Updated weights for policy 0, policy_version 304400 (0.0045) [2024-06-22 22:36:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42602.4, 300 sec: 42709.5). Total num frames: 4987355136. Throughput: 0: 42571.5. Samples: 4987531220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 22:36:05,083][15401] Updated weights for policy 0, policy_version 304410 (0.0034) [2024-06-22 22:36:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4987584512. Throughput: 0: 42705.9. Samples: 4987658780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-22 22:36:08,868][15401] Updated weights for policy 0, policy_version 304420 (0.0026) [2024-06-22 22:36:12,718][15401] Updated weights for policy 0, policy_version 304430 (0.0041) [2024-06-22 22:36:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 4987797504. Throughput: 0: 42847.0. Samples: 4987921680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 22:36:16,423][15401] Updated weights for policy 0, policy_version 304440 (0.0046) [2024-06-22 22:36:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 4988010496. Throughput: 0: 42685.1. Samples: 4988176600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 22:36:20,476][15401] Updated weights for policy 0, policy_version 304450 (0.0036) [2024-06-22 22:36:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 4988239872. Throughput: 0: 42817.3. Samples: 4988304000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 22:36:24,064][15401] Updated weights for policy 0, policy_version 304460 (0.0042) [2024-06-22 22:36:28,018][15401] Updated weights for policy 0, policy_version 304470 (0.0036) [2024-06-22 22:36:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4988452864. Throughput: 0: 42762.3. Samples: 4988565000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:28,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 22:36:31,570][15401] Updated weights for policy 0, policy_version 304480 (0.0037) [2024-06-22 22:36:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 4988649472. Throughput: 0: 42809.8. Samples: 4988817960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 22:36:35,695][15401] Updated weights for policy 0, policy_version 304490 (0.0037) [2024-06-22 22:36:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 4988862464. Throughput: 0: 42784.6. Samples: 4988942540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 22:36:39,185][15401] Updated weights for policy 0, policy_version 304500 (0.0039) [2024-06-22 22:36:40,470][15349] Signal inference workers to stop experience collection... (73750 times) [2024-06-22 22:36:40,476][15349] Signal inference workers to resume experience collection... (73750 times) [2024-06-22 22:36:40,517][15401] InferenceWorker_p0-w0: stopping experience collection (73750 times) [2024-06-22 22:36:40,518][15401] InferenceWorker_p0-w0: resuming experience collection (73750 times) [2024-06-22 22:36:43,367][15401] Updated weights for policy 0, policy_version 304510 (0.0031) [2024-06-22 22:36:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42872.8, 300 sec: 42876.1). Total num frames: 4989091840. Throughput: 0: 42820.3. Samples: 4989206580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 22:36:47,199][15401] Updated weights for policy 0, policy_version 304520 (0.0033) [2024-06-22 22:36:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 4989304832. Throughput: 0: 42785.2. Samples: 4989456560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 22:36:51,554][15401] Updated weights for policy 0, policy_version 304530 (0.0031) [2024-06-22 22:36:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4989501440. Throughput: 0: 42910.2. Samples: 4989589740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-22 22:36:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 22:36:54,698][15401] Updated weights for policy 0, policy_version 304540 (0.0026) [2024-06-22 22:36:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 4989714432. Throughput: 0: 42790.2. Samples: 4989847240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:36:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 22:36:59,190][15401] Updated weights for policy 0, policy_version 304550 (0.0032) [2024-06-22 22:37:02,138][15401] Updated weights for policy 0, policy_version 304560 (0.0038) [2024-06-22 22:37:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4989927424. Throughput: 0: 42658.3. Samples: 4990096220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 22:37:06,811][15401] Updated weights for policy 0, policy_version 304570 (0.0035) [2024-06-22 22:37:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 4990156800. Throughput: 0: 42860.5. Samples: 4990232720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 22:37:09,602][15401] Updated weights for policy 0, policy_version 304580 (0.0044) [2024-06-22 22:37:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4990337024. Throughput: 0: 42680.0. Samples: 4990485600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 22:37:14,422][15401] Updated weights for policy 0, policy_version 304590 (0.0030) [2024-06-22 22:37:17,507][15401] Updated weights for policy 0, policy_version 304600 (0.0038) [2024-06-22 22:37:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 4990582784. Throughput: 0: 42648.4. Samples: 4990737140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 22:37:22,186][15401] Updated weights for policy 0, policy_version 304610 (0.0033) [2024-06-22 22:37:23,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 4990812160. Throughput: 0: 43061.6. Samples: 4990880320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 22:37:25,096][15401] Updated weights for policy 0, policy_version 304620 (0.0034) [2024-06-22 22:37:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 4990992384. Throughput: 0: 42722.4. Samples: 4991129080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:28,396][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 22:37:29,956][15401] Updated weights for policy 0, policy_version 304630 (0.0027) [2024-06-22 22:37:32,805][15401] Updated weights for policy 0, policy_version 304640 (0.0032) [2024-06-22 22:37:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4991221760. Throughput: 0: 42727.2. Samples: 4991379280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 22:37:37,550][15401] Updated weights for policy 0, policy_version 304650 (0.0040) [2024-06-22 22:37:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 4991451136. Throughput: 0: 42747.1. Samples: 4991513360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:37:40,383][15401] Updated weights for policy 0, policy_version 304660 (0.0031) [2024-06-22 22:37:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 4991631360. Throughput: 0: 42686.4. Samples: 4991768120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 22:37:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304666_4991647744.pth... [2024-06-22 22:37:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304038_4981358592.pth [2024-06-22 22:37:45,328][15401] Updated weights for policy 0, policy_version 304670 (0.0036) [2024-06-22 22:37:48,117][15401] Updated weights for policy 0, policy_version 304680 (0.0027) [2024-06-22 22:37:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 4991877120. Throughput: 0: 42632.0. Samples: 4992014660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-22 22:37:52,906][15401] Updated weights for policy 0, policy_version 304690 (0.0034) [2024-06-22 22:37:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 4992057344. Throughput: 0: 42610.3. Samples: 4992150180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 22:37:53,474][15349] Signal inference workers to stop experience collection... (73800 times) [2024-06-22 22:37:53,480][15349] Signal inference workers to resume experience collection... (73800 times) [2024-06-22 22:37:53,493][15401] InferenceWorker_p0-w0: stopping experience collection (73800 times) [2024-06-22 22:37:53,516][15401] InferenceWorker_p0-w0: resuming experience collection (73800 times) [2024-06-22 22:37:56,034][15401] Updated weights for policy 0, policy_version 304700 (0.0030) [2024-06-22 22:37:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 4992270336. Throughput: 0: 42628.9. Samples: 4992403900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:37:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 22:38:00,731][15401] Updated weights for policy 0, policy_version 304710 (0.0035) [2024-06-22 22:38:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4992499712. Throughput: 0: 42660.1. Samples: 4992656840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:38:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 22:38:03,853][15401] Updated weights for policy 0, policy_version 304720 (0.0033) [2024-06-22 22:38:08,289][15401] Updated weights for policy 0, policy_version 304730 (0.0045) [2024-06-22 22:38:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 4992696320. Throughput: 0: 42409.9. Samples: 4992788760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-22 22:38:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 22:38:11,744][15401] Updated weights for policy 0, policy_version 304740 (0.0032) [2024-06-22 22:38:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 4992925696. Throughput: 0: 42434.1. Samples: 4993038620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 22:38:15,961][15401] Updated weights for policy 0, policy_version 304750 (0.0037) [2024-06-22 22:38:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42765.1). Total num frames: 4993122304. Throughput: 0: 42587.7. Samples: 4993295720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-22 22:38:19,379][15401] Updated weights for policy 0, policy_version 304760 (0.0034) [2024-06-22 22:38:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 4993335296. Throughput: 0: 42457.0. Samples: 4993423920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:23,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 22:38:23,517][15401] Updated weights for policy 0, policy_version 304770 (0.0029) [2024-06-22 22:38:26,949][15401] Updated weights for policy 0, policy_version 304780 (0.0031) [2024-06-22 22:38:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4993564672. Throughput: 0: 42456.3. Samples: 4993678660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:28,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-22 22:38:31,098][15401] Updated weights for policy 0, policy_version 304790 (0.0039) [2024-06-22 22:38:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 4993761280. Throughput: 0: 42707.0. Samples: 4993936480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:33,391][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 22:38:34,683][15401] Updated weights for policy 0, policy_version 304800 (0.0043) [2024-06-22 22:38:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 4993990656. Throughput: 0: 42582.1. Samples: 4994066480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:38,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 22:38:38,614][15401] Updated weights for policy 0, policy_version 304810 (0.0047) [2024-06-22 22:38:42,157][15401] Updated weights for policy 0, policy_version 304820 (0.0032) [2024-06-22 22:38:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4994203648. Throughput: 0: 42625.7. Samples: 4994322060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 22:38:46,319][15401] Updated weights for policy 0, policy_version 304830 (0.0026) [2024-06-22 22:38:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 4994416640. Throughput: 0: 42794.1. Samples: 4994582580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 22:38:49,830][15401] Updated weights for policy 0, policy_version 304840 (0.0038) [2024-06-22 22:38:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4994629632. Throughput: 0: 42760.3. Samples: 4994712980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-22 22:38:53,784][15401] Updated weights for policy 0, policy_version 304850 (0.0036) [2024-06-22 22:38:57,260][15401] Updated weights for policy 0, policy_version 304860 (0.0038) [2024-06-22 22:38:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4994842624. Throughput: 0: 42785.9. Samples: 4994963980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:38:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 22:39:01,675][15401] Updated weights for policy 0, policy_version 304870 (0.0029) [2024-06-22 22:39:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4995072000. Throughput: 0: 42748.2. Samples: 4995219400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:39:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 22:39:04,898][15401] Updated weights for policy 0, policy_version 304880 (0.0039) [2024-06-22 22:39:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4995268608. Throughput: 0: 42768.5. Samples: 4995348500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:39:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 22:39:09,531][15401] Updated weights for policy 0, policy_version 304890 (0.0031) [2024-06-22 22:39:12,240][15401] Updated weights for policy 0, policy_version 304900 (0.0033) [2024-06-22 22:39:13,393][15132] Fps is (10 sec: 40945.1, 60 sec: 42595.9, 300 sec: 42708.9). Total num frames: 4995481600. Throughput: 0: 42767.2. Samples: 4995603340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:39:13,394][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 22:39:16,981][15401] Updated weights for policy 0, policy_version 304910 (0.0046) [2024-06-22 22:39:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 4995727360. Throughput: 0: 43012.1. Samples: 4995872020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:39:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 22:39:19,740][15401] Updated weights for policy 0, policy_version 304920 (0.0033) [2024-06-22 22:39:23,389][15132] Fps is (10 sec: 40975.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4995891200. Throughput: 0: 43070.8. Samples: 4996004560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-22 22:39:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 22:39:24,522][15401] Updated weights for policy 0, policy_version 304930 (0.0036) [2024-06-22 22:39:27,391][15401] Updated weights for policy 0, policy_version 304940 (0.0032) [2024-06-22 22:39:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4996136960. Throughput: 0: 42954.7. Samples: 4996255020. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:28,391][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 22:39:30,677][15349] Signal inference workers to stop experience collection... (73850 times) [2024-06-22 22:39:30,677][15349] Signal inference workers to resume experience collection... (73850 times) [2024-06-22 22:39:30,687][15401] InferenceWorker_p0-w0: stopping experience collection (73850 times) [2024-06-22 22:39:30,695][15401] InferenceWorker_p0-w0: resuming experience collection (73850 times) [2024-06-22 22:39:32,072][15401] Updated weights for policy 0, policy_version 304950 (0.0022) [2024-06-22 22:39:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 4996366336. Throughput: 0: 42983.2. Samples: 4996516820. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 22:39:34,925][15401] Updated weights for policy 0, policy_version 304960 (0.0030) [2024-06-22 22:39:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 4996546560. Throughput: 0: 43017.5. Samples: 4996648760. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:38,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 22:39:39,808][15401] Updated weights for policy 0, policy_version 304970 (0.0031) [2024-06-22 22:39:42,419][15401] Updated weights for policy 0, policy_version 304980 (0.0034) [2024-06-22 22:39:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 4996792320. Throughput: 0: 43032.3. Samples: 4996900540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:43,393][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 22:39:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304980_4996792320.pth... [2024-06-22 22:39:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304354_4986535936.pth [2024-06-22 22:39:47,634][15401] Updated weights for policy 0, policy_version 304990 (0.0037) [2024-06-22 22:39:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 4997005312. Throughput: 0: 43205.3. Samples: 4997163640. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 22:39:50,629][15401] Updated weights for policy 0, policy_version 305000 (0.0036) [2024-06-22 22:39:53,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 4997201920. Throughput: 0: 43150.6. Samples: 4997290280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 22:39:55,136][15401] Updated weights for policy 0, policy_version 305010 (0.0036) [2024-06-22 22:39:58,279][15401] Updated weights for policy 0, policy_version 305020 (0.0032) [2024-06-22 22:39:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42876.9). Total num frames: 4997447680. Throughput: 0: 42952.3. Samples: 4997536040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:39:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 22:40:02,737][15401] Updated weights for policy 0, policy_version 305030 (0.0040) [2024-06-22 22:40:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 4997644288. Throughput: 0: 42909.2. Samples: 4997803040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:03,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 22:40:05,765][15401] Updated weights for policy 0, policy_version 305040 (0.0032) [2024-06-22 22:40:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 4997824512. Throughput: 0: 42643.1. Samples: 4997923500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 22:40:10,495][15401] Updated weights for policy 0, policy_version 305050 (0.0033) [2024-06-22 22:40:13,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43420.4, 300 sec: 42931.6). Total num frames: 4998086656. Throughput: 0: 42861.5. Samples: 4998183780. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:13,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 22:40:14,013][15401] Updated weights for policy 0, policy_version 305060 (0.0045) [2024-06-22 22:40:18,147][15401] Updated weights for policy 0, policy_version 305070 (0.0028) [2024-06-22 22:40:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4998283264. Throughput: 0: 42797.5. Samples: 4998442700. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 22:40:21,638][15401] Updated weights for policy 0, policy_version 305080 (0.0035) [2024-06-22 22:40:23,392][15132] Fps is (10 sec: 39311.6, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 4998479872. Throughput: 0: 42684.7. Samples: 4998569680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:23,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 22:40:25,728][15401] Updated weights for policy 0, policy_version 305090 (0.0023) [2024-06-22 22:40:28,053][15349] Signal inference workers to stop experience collection... (73900 times) [2024-06-22 22:40:28,107][15401] InferenceWorker_p0-w0: stopping experience collection (73900 times) [2024-06-22 22:40:28,109][15349] Signal inference workers to resume experience collection... (73900 times) [2024-06-22 22:40:28,131][15401] InferenceWorker_p0-w0: resuming experience collection (73900 times) [2024-06-22 22:40:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 4998725632. Throughput: 0: 42853.4. Samples: 4998828840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 22:40:29,082][15401] Updated weights for policy 0, policy_version 305100 (0.0041) [2024-06-22 22:40:33,215][15401] Updated weights for policy 0, policy_version 305110 (0.0026) [2024-06-22 22:40:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4998922240. Throughput: 0: 42693.4. Samples: 4999084840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 22:40:36,582][15401] Updated weights for policy 0, policy_version 305120 (0.0046) [2024-06-22 22:40:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 4999102464. Throughput: 0: 42691.1. Samples: 4999211380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-22 22:40:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 22:40:41,037][15401] Updated weights for policy 0, policy_version 305130 (0.0028) [2024-06-22 22:40:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 4999348224. Throughput: 0: 42814.2. Samples: 4999462680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:40:43,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 22:40:44,508][15401] Updated weights for policy 0, policy_version 305140 (0.0032) [2024-06-22 22:40:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 4999544832. Throughput: 0: 42691.7. Samples: 4999724060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:40:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 22:40:48,651][15401] Updated weights for policy 0, policy_version 305150 (0.0035) [2024-06-22 22:40:52,100][15401] Updated weights for policy 0, policy_version 305160 (0.0040) [2024-06-22 22:40:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4999774208. Throughput: 0: 42756.0. Samples: 4999847520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:40:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 22:40:56,265][15401] Updated weights for policy 0, policy_version 305170 (0.0026) [2024-06-22 22:40:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 4999987200. Throughput: 0: 42607.1. Samples: 5000101100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:40:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 22:40:59,642][15401] Updated weights for policy 0, policy_version 305180 (0.0048) [2024-06-22 22:41:03,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 5000183808. Throughput: 0: 42846.5. Samples: 5000370800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 22:41:03,953][15401] Updated weights for policy 0, policy_version 305190 (0.0027) [2024-06-22 22:41:07,134][15401] Updated weights for policy 0, policy_version 305200 (0.0026) [2024-06-22 22:41:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5000413184. Throughput: 0: 42788.1. Samples: 5000495040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-22 22:41:11,465][15401] Updated weights for policy 0, policy_version 305210 (0.0027) [2024-06-22 22:41:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5000642560. Throughput: 0: 42689.3. Samples: 5000749860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 22:41:15,268][15401] Updated weights for policy 0, policy_version 305220 (0.0047) [2024-06-22 22:41:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5000839168. Throughput: 0: 42886.3. Samples: 5001014720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 22:41:18,951][15401] Updated weights for policy 0, policy_version 305230 (0.0035) [2024-06-22 22:41:22,847][15401] Updated weights for policy 0, policy_version 305240 (0.0028) [2024-06-22 22:41:23,396][15132] Fps is (10 sec: 42571.4, 60 sec: 43141.7, 300 sec: 42764.1). Total num frames: 5001068544. Throughput: 0: 42872.5. Samples: 5001140920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:23,396][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 22:41:26,840][15401] Updated weights for policy 0, policy_version 305250 (0.0032) [2024-06-22 22:41:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5001265152. Throughput: 0: 42921.4. Samples: 5001394140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 22:41:30,462][15401] Updated weights for policy 0, policy_version 305260 (0.0040) [2024-06-22 22:41:33,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5001478144. Throughput: 0: 42970.3. Samples: 5001657720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 22:41:34,402][15401] Updated weights for policy 0, policy_version 305270 (0.0032) [2024-06-22 22:41:36,785][15349] Signal inference workers to stop experience collection... (73950 times) [2024-06-22 22:41:36,785][15349] Signal inference workers to resume experience collection... (73950 times) [2024-06-22 22:41:36,818][15401] InferenceWorker_p0-w0: stopping experience collection (73950 times) [2024-06-22 22:41:36,818][15401] InferenceWorker_p0-w0: resuming experience collection (73950 times) [2024-06-22 22:41:38,046][15401] Updated weights for policy 0, policy_version 305280 (0.0036) [2024-06-22 22:41:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 5001723904. Throughput: 0: 43064.4. Samples: 5001785420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-22 22:41:41,960][15401] Updated weights for policy 0, policy_version 305290 (0.0046) [2024-06-22 22:41:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5001904128. Throughput: 0: 42963.5. Samples: 5002034460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-22 22:41:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305292_5001904128.pth... [2024-06-22 22:41:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304666_4991647744.pth [2024-06-22 22:41:45,647][15401] Updated weights for policy 0, policy_version 305300 (0.0027) [2024-06-22 22:41:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5002117120. Throughput: 0: 42801.0. Samples: 5002296840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:41:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 22:41:49,550][15401] Updated weights for policy 0, policy_version 305310 (0.0026) [2024-06-22 22:41:53,339][15401] Updated weights for policy 0, policy_version 305320 (0.0034) [2024-06-22 22:41:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5002362880. Throughput: 0: 42819.5. Samples: 5002421920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:41:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 22:41:57,380][15401] Updated weights for policy 0, policy_version 305330 (0.0034) [2024-06-22 22:41:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5002543104. Throughput: 0: 42705.9. Samples: 5002671620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:41:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 22:42:01,074][15401] Updated weights for policy 0, policy_version 305340 (0.0036) [2024-06-22 22:42:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5002772480. Throughput: 0: 42571.0. Samples: 5002930420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 22:42:05,020][15401] Updated weights for policy 0, policy_version 305350 (0.0039) [2024-06-22 22:42:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5002969088. Throughput: 0: 42670.2. Samples: 5003060800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 22:42:09,161][15401] Updated weights for policy 0, policy_version 305360 (0.0033) [2024-06-22 22:42:12,787][15401] Updated weights for policy 0, policy_version 305370 (0.0045) [2024-06-22 22:42:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5003182080. Throughput: 0: 42516.9. Samples: 5003307400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-22 22:42:16,918][15401] Updated weights for policy 0, policy_version 305380 (0.0044) [2024-06-22 22:42:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5003395072. Throughput: 0: 42256.8. Samples: 5003559280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-22 22:42:20,490][15401] Updated weights for policy 0, policy_version 305390 (0.0037) [2024-06-22 22:42:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 5003608064. Throughput: 0: 42230.3. Samples: 5003685780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 22:42:24,546][15401] Updated weights for policy 0, policy_version 305400 (0.0051) [2024-06-22 22:42:28,132][15401] Updated weights for policy 0, policy_version 305410 (0.0048) [2024-06-22 22:42:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5003837440. Throughput: 0: 42462.2. Samples: 5003945260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 22:42:32,329][15401] Updated weights for policy 0, policy_version 305420 (0.0054) [2024-06-22 22:42:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5004034048. Throughput: 0: 42394.7. Samples: 5004204600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 22:42:35,725][15401] Updated weights for policy 0, policy_version 305430 (0.0051) [2024-06-22 22:42:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 5004247040. Throughput: 0: 42410.2. Samples: 5004330380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-22 22:42:40,070][15401] Updated weights for policy 0, policy_version 305440 (0.0044) [2024-06-22 22:42:43,300][15401] Updated weights for policy 0, policy_version 305450 (0.0033) [2024-06-22 22:42:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5004492800. Throughput: 0: 42560.0. Samples: 5004586820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 22:42:48,070][15401] Updated weights for policy 0, policy_version 305460 (0.0042) [2024-06-22 22:42:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5004673024. Throughput: 0: 42513.0. Samples: 5004843500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:48,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-22 22:42:51,535][15401] Updated weights for policy 0, policy_version 305470 (0.0039) [2024-06-22 22:42:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 5004886016. Throughput: 0: 42291.8. Samples: 5004964040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:53,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 22:42:54,540][15349] Signal inference workers to stop experience collection... (74000 times) [2024-06-22 22:42:54,563][15401] InferenceWorker_p0-w0: stopping experience collection (74000 times) [2024-06-22 22:42:54,655][15349] Signal inference workers to resume experience collection... (74000 times) [2024-06-22 22:42:54,655][15401] InferenceWorker_p0-w0: resuming experience collection (74000 times) [2024-06-22 22:42:55,635][15401] Updated weights for policy 0, policy_version 305480 (0.0036) [2024-06-22 22:42:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5005115392. Throughput: 0: 42604.0. Samples: 5005224580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:42:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 22:42:59,264][15401] Updated weights for policy 0, policy_version 305490 (0.0034) [2024-06-22 22:43:03,238][15401] Updated weights for policy 0, policy_version 305500 (0.0033) [2024-06-22 22:43:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5005312000. Throughput: 0: 42644.5. Samples: 5005478280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-22 22:43:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 22:43:06,755][15401] Updated weights for policy 0, policy_version 305510 (0.0038) [2024-06-22 22:43:08,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 5005492224. Throughput: 0: 42564.7. Samples: 5005601200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 22:43:11,015][15401] Updated weights for policy 0, policy_version 305520 (0.0030) [2024-06-22 22:43:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5005737984. Throughput: 0: 42487.2. Samples: 5005857180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 22:43:14,256][15401] Updated weights for policy 0, policy_version 305530 (0.0036) [2024-06-22 22:43:18,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 5005934592. Throughput: 0: 42481.7. Samples: 5006116380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:18,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:43:18,714][15401] Updated weights for policy 0, policy_version 305540 (0.0031) [2024-06-22 22:43:21,983][15401] Updated weights for policy 0, policy_version 305550 (0.0038) [2024-06-22 22:43:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5006147584. Throughput: 0: 42476.4. Samples: 5006241820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 22:43:26,235][15401] Updated weights for policy 0, policy_version 305560 (0.0038) [2024-06-22 22:43:28,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5006393344. Throughput: 0: 42633.3. Samples: 5006505320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-22 22:43:29,390][15401] Updated weights for policy 0, policy_version 305570 (0.0034) [2024-06-22 22:43:33,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42709.9). Total num frames: 5006589952. Throughput: 0: 42775.2. Samples: 5006768380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 22:43:33,727][15401] Updated weights for policy 0, policy_version 305580 (0.0033) [2024-06-22 22:43:36,919][15401] Updated weights for policy 0, policy_version 305590 (0.0029) [2024-06-22 22:43:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5006802944. Throughput: 0: 42945.8. Samples: 5006896500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:38,396][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 22:43:41,296][15401] Updated weights for policy 0, policy_version 305600 (0.0024) [2024-06-22 22:43:43,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5007048704. Throughput: 0: 43013.8. Samples: 5007160200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 22:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305606_5007048704.pth... [2024-06-22 22:43:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000304980_4996792320.pth [2024-06-22 22:43:44,542][15401] Updated weights for policy 0, policy_version 305610 (0.0042) [2024-06-22 22:43:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5007228928. Throughput: 0: 43035.5. Samples: 5007414980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:48,393][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 22:43:48,855][15401] Updated weights for policy 0, policy_version 305620 (0.0034) [2024-06-22 22:43:52,199][15401] Updated weights for policy 0, policy_version 305630 (0.0030) [2024-06-22 22:43:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 5007458304. Throughput: 0: 43051.1. Samples: 5007538500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 22:43:56,417][15401] Updated weights for policy 0, policy_version 305640 (0.0030) [2024-06-22 22:43:58,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5007687680. Throughput: 0: 43124.4. Samples: 5007797780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:43:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 22:44:00,447][15401] Updated weights for policy 0, policy_version 305650 (0.0029) [2024-06-22 22:44:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 5007867904. Throughput: 0: 43255.5. Samples: 5008062780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:44:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 22:44:04,046][15401] Updated weights for policy 0, policy_version 305660 (0.0031) [2024-06-22 22:44:06,489][15349] Signal inference workers to stop experience collection... (74050 times) [2024-06-22 22:44:06,490][15349] Signal inference workers to resume experience collection... (74050 times) [2024-06-22 22:44:06,526][15401] InferenceWorker_p0-w0: stopping experience collection (74050 times) [2024-06-22 22:44:06,526][15401] InferenceWorker_p0-w0: resuming experience collection (74050 times) [2024-06-22 22:44:08,033][15401] Updated weights for policy 0, policy_version 305670 (0.0028) [2024-06-22 22:44:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 42821.1). Total num frames: 5008113664. Throughput: 0: 43184.6. Samples: 5008185120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:44:08,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 22:44:11,624][15401] Updated weights for policy 0, policy_version 305680 (0.0035) [2024-06-22 22:44:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5008310272. Throughput: 0: 43159.0. Samples: 5008447480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:44:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 22:44:15,494][15401] Updated weights for policy 0, policy_version 305690 (0.0030) [2024-06-22 22:44:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 5008523264. Throughput: 0: 43091.9. Samples: 5008707520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-22 22:44:18,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 22:44:19,378][15401] Updated weights for policy 0, policy_version 305700 (0.0031) [2024-06-22 22:44:23,019][15401] Updated weights for policy 0, policy_version 305710 (0.0028) [2024-06-22 22:44:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 5008769024. Throughput: 0: 43040.0. Samples: 5008833300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:23,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-22 22:44:27,165][15401] Updated weights for policy 0, policy_version 305720 (0.0032) [2024-06-22 22:44:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5008965632. Throughput: 0: 42914.1. Samples: 5009091340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 22:44:30,528][15401] Updated weights for policy 0, policy_version 305730 (0.0039) [2024-06-22 22:44:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5009178624. Throughput: 0: 43030.7. Samples: 5009351260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 22:44:34,745][15401] Updated weights for policy 0, policy_version 305740 (0.0045) [2024-06-22 22:44:37,926][15401] Updated weights for policy 0, policy_version 305750 (0.0032) [2024-06-22 22:44:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 5009408000. Throughput: 0: 43111.3. Samples: 5009478500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:38,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-22 22:44:42,823][15401] Updated weights for policy 0, policy_version 305760 (0.0031) [2024-06-22 22:44:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5009604608. Throughput: 0: 43069.6. Samples: 5009735920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 22:44:45,503][15401] Updated weights for policy 0, policy_version 305770 (0.0037) [2024-06-22 22:44:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 5009817600. Throughput: 0: 42757.5. Samples: 5009986860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-22 22:44:50,631][15401] Updated weights for policy 0, policy_version 305780 (0.0024) [2024-06-22 22:44:53,290][15401] Updated weights for policy 0, policy_version 305790 (0.0028) [2024-06-22 22:44:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5010063360. Throughput: 0: 42912.3. Samples: 5010116180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 22:44:58,146][15401] Updated weights for policy 0, policy_version 305800 (0.0031) [2024-06-22 22:44:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 5010243584. Throughput: 0: 42854.4. Samples: 5010375920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:44:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-22 22:45:00,812][15401] Updated weights for policy 0, policy_version 305810 (0.0038) [2024-06-22 22:45:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5010472960. Throughput: 0: 42771.1. Samples: 5010632220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:45:05,639][15401] Updated weights for policy 0, policy_version 305820 (0.0036) [2024-06-22 22:45:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5010702336. Throughput: 0: 42940.0. Samples: 5010765600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-22 22:45:08,557][15401] Updated weights for policy 0, policy_version 305830 (0.0030) [2024-06-22 22:45:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5010866176. Throughput: 0: 42855.1. Samples: 5011019820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 22:45:13,515][15401] Updated weights for policy 0, policy_version 305840 (0.0040) [2024-06-22 22:45:16,188][15401] Updated weights for policy 0, policy_version 305850 (0.0045) [2024-06-22 22:45:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 5011128320. Throughput: 0: 42788.0. Samples: 5011276720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 22:45:20,943][15401] Updated weights for policy 0, policy_version 305860 (0.0047) [2024-06-22 22:45:23,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5011341312. Throughput: 0: 42947.5. Samples: 5011411140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 22:45:23,883][15349] Signal inference workers to stop experience collection... (74100 times) [2024-06-22 22:45:23,893][15401] InferenceWorker_p0-w0: stopping experience collection (74100 times) [2024-06-22 22:45:23,945][15349] Signal inference workers to resume experience collection... (74100 times) [2024-06-22 22:45:23,945][15401] InferenceWorker_p0-w0: resuming experience collection (74100 times) [2024-06-22 22:45:23,946][15401] Updated weights for policy 0, policy_version 305870 (0.0029) [2024-06-22 22:45:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5011521536. Throughput: 0: 42886.4. Samples: 5011665800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-22 22:45:28,767][15401] Updated weights for policy 0, policy_version 305880 (0.0033) [2024-06-22 22:45:31,563][15401] Updated weights for policy 0, policy_version 305890 (0.0042) [2024-06-22 22:45:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 5011783680. Throughput: 0: 42970.6. Samples: 5011920540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-22 22:45:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 22:45:36,324][15401] Updated weights for policy 0, policy_version 305900 (0.0047) [2024-06-22 22:45:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5011980288. Throughput: 0: 43216.9. Samples: 5012060940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:45:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 22:45:39,242][15401] Updated weights for policy 0, policy_version 305910 (0.0046) [2024-06-22 22:45:43,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5012160512. Throughput: 0: 42975.7. Samples: 5012309840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:45:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 22:45:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305918_5012160512.pth... [2024-06-22 22:45:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305292_5001904128.pth [2024-06-22 22:45:43,790][15401] Updated weights for policy 0, policy_version 305920 (0.0041) [2024-06-22 22:45:46,899][15401] Updated weights for policy 0, policy_version 305930 (0.0035) [2024-06-22 22:45:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.5, 300 sec: 42931.6). Total num frames: 5012439040. Throughput: 0: 42969.6. Samples: 5012565860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:45:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:45:51,209][15401] Updated weights for policy 0, policy_version 305940 (0.0031) [2024-06-22 22:45:53,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5012619264. Throughput: 0: 42961.4. Samples: 5012698860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:45:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 22:45:54,534][15401] Updated weights for policy 0, policy_version 305950 (0.0029) [2024-06-22 22:45:58,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5012815872. Throughput: 0: 42910.8. Samples: 5012950800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:45:58,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 22:45:58,919][15401] Updated weights for policy 0, policy_version 305960 (0.0024) [2024-06-22 22:46:02,155][15401] Updated weights for policy 0, policy_version 305970 (0.0032) [2024-06-22 22:46:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5013078016. Throughput: 0: 42951.6. Samples: 5013209540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:03,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 22:46:06,452][15401] Updated weights for policy 0, policy_version 305980 (0.0034) [2024-06-22 22:46:08,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5013274624. Throughput: 0: 43034.6. Samples: 5013347700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 22:46:09,880][15401] Updated weights for policy 0, policy_version 305990 (0.0026) [2024-06-22 22:46:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5013471232. Throughput: 0: 42960.0. Samples: 5013599000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 22:46:14,163][15401] Updated weights for policy 0, policy_version 306000 (0.0032) [2024-06-22 22:46:17,499][15401] Updated weights for policy 0, policy_version 306010 (0.0026) [2024-06-22 22:46:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 5013716992. Throughput: 0: 43004.5. Samples: 5013855740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 22:46:21,546][15401] Updated weights for policy 0, policy_version 306020 (0.0043) [2024-06-22 22:46:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5013913600. Throughput: 0: 42962.3. Samples: 5013994240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-22 22:46:25,011][15401] Updated weights for policy 0, policy_version 306030 (0.0035) [2024-06-22 22:46:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5014110208. Throughput: 0: 43091.3. Samples: 5014248940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 22:46:29,222][15401] Updated weights for policy 0, policy_version 306040 (0.0048) [2024-06-22 22:46:30,761][15349] Signal inference workers to stop experience collection... (74150 times) [2024-06-22 22:46:30,816][15401] InferenceWorker_p0-w0: stopping experience collection (74150 times) [2024-06-22 22:46:30,879][15349] Signal inference workers to resume experience collection... (74150 times) [2024-06-22 22:46:30,879][15401] InferenceWorker_p0-w0: resuming experience collection (74150 times) [2024-06-22 22:46:32,572][15401] Updated weights for policy 0, policy_version 306050 (0.0029) [2024-06-22 22:46:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5014372352. Throughput: 0: 42955.2. Samples: 5014498840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:33,400][15132] Avg episode reward: [(0, '0.431')] [2024-06-22 22:46:37,198][15401] Updated weights for policy 0, policy_version 306060 (0.0024) [2024-06-22 22:46:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5014552576. Throughput: 0: 42975.4. Samples: 5014632760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:46:40,354][15401] Updated weights for policy 0, policy_version 306070 (0.0037) [2024-06-22 22:46:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 5014749184. Throughput: 0: 43062.2. Samples: 5014888600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 22:46:44,727][15401] Updated weights for policy 0, policy_version 306080 (0.0024) [2024-06-22 22:46:47,862][15401] Updated weights for policy 0, policy_version 306090 (0.0034) [2024-06-22 22:46:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5014994944. Throughput: 0: 43023.0. Samples: 5015145580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-22 22:46:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 22:46:52,221][15401] Updated weights for policy 0, policy_version 306100 (0.0035) [2024-06-22 22:46:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5015191552. Throughput: 0: 42850.7. Samples: 5015275980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:46:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 22:46:55,544][15401] Updated weights for policy 0, policy_version 306110 (0.0043) [2024-06-22 22:46:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5015420928. Throughput: 0: 42986.6. Samples: 5015533400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:46:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 22:46:59,699][15401] Updated weights for policy 0, policy_version 306120 (0.0035) [2024-06-22 22:47:02,932][15401] Updated weights for policy 0, policy_version 306130 (0.0027) [2024-06-22 22:47:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5015650304. Throughput: 0: 43021.8. Samples: 5015791720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:03,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-22 22:47:07,210][15401] Updated weights for policy 0, policy_version 306140 (0.0031) [2024-06-22 22:47:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5015846912. Throughput: 0: 42846.7. Samples: 5015922340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 22:47:10,634][15401] Updated weights for policy 0, policy_version 306150 (0.0042) [2024-06-22 22:47:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5016059904. Throughput: 0: 42930.7. Samples: 5016180820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-22 22:47:14,706][15401] Updated weights for policy 0, policy_version 306160 (0.0039) [2024-06-22 22:47:18,110][15401] Updated weights for policy 0, policy_version 306170 (0.0037) [2024-06-22 22:47:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 5016289280. Throughput: 0: 43060.4. Samples: 5016436560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:18,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 22:47:22,375][15401] Updated weights for policy 0, policy_version 306180 (0.0032) [2024-06-22 22:47:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5016485888. Throughput: 0: 43012.6. Samples: 5016568320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:47:25,713][15401] Updated weights for policy 0, policy_version 306190 (0.0040) [2024-06-22 22:47:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 5016715264. Throughput: 0: 43030.1. Samples: 5016824960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:47:30,556][15401] Updated weights for policy 0, policy_version 306200 (0.0043) [2024-06-22 22:47:33,281][15401] Updated weights for policy 0, policy_version 306210 (0.0034) [2024-06-22 22:47:33,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 5016944640. Throughput: 0: 42974.4. Samples: 5017079440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 22:47:38,019][15401] Updated weights for policy 0, policy_version 306220 (0.0027) [2024-06-22 22:47:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 5017124864. Throughput: 0: 43003.1. Samples: 5017211220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:38,401][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 22:47:40,909][15401] Updated weights for policy 0, policy_version 306230 (0.0044) [2024-06-22 22:47:43,390][15132] Fps is (10 sec: 40961.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 5017354240. Throughput: 0: 43092.9. Samples: 5017472580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:43,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-22 22:47:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306235_5017354240.pth... [2024-06-22 22:47:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305606_5007048704.pth [2024-06-22 22:47:45,519][15401] Updated weights for policy 0, policy_version 306240 (0.0033) [2024-06-22 22:47:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 5017567232. Throughput: 0: 43028.5. Samples: 5017728000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 22:47:48,594][15401] Updated weights for policy 0, policy_version 306250 (0.0024) [2024-06-22 22:47:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5017747456. Throughput: 0: 43026.1. Samples: 5017858520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 22:47:53,547][15401] Updated weights for policy 0, policy_version 306260 (0.0031) [2024-06-22 22:47:55,479][15349] Signal inference workers to stop experience collection... (74200 times) [2024-06-22 22:47:55,479][15349] Signal inference workers to resume experience collection... (74200 times) [2024-06-22 22:47:55,528][15401] InferenceWorker_p0-w0: stopping experience collection (74200 times) [2024-06-22 22:47:55,528][15401] InferenceWorker_p0-w0: resuming experience collection (74200 times) [2024-06-22 22:47:56,098][15401] Updated weights for policy 0, policy_version 306270 (0.0033) [2024-06-22 22:47:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5018009600. Throughput: 0: 42903.5. Samples: 5018111480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 22:47:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 22:48:01,072][15401] Updated weights for policy 0, policy_version 306280 (0.0041) [2024-06-22 22:48:03,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 5018222592. Throughput: 0: 42864.6. Samples: 5018365460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:03,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-22 22:48:03,882][15401] Updated weights for policy 0, policy_version 306290 (0.0035) [2024-06-22 22:48:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5018402816. Throughput: 0: 42830.6. Samples: 5018495700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-22 22:48:08,466][15401] Updated weights for policy 0, policy_version 306300 (0.0041) [2024-06-22 22:48:11,573][15401] Updated weights for policy 0, policy_version 306310 (0.0035) [2024-06-22 22:48:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 5018632192. Throughput: 0: 42886.3. Samples: 5018754840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:13,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-22 22:48:15,995][15401] Updated weights for policy 0, policy_version 306320 (0.0031) [2024-06-22 22:48:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5018845184. Throughput: 0: 42840.2. Samples: 5019007240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 22:48:19,470][15401] Updated weights for policy 0, policy_version 306330 (0.0045) [2024-06-22 22:48:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5019041792. Throughput: 0: 42849.8. Samples: 5019139360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 22:48:23,742][15401] Updated weights for policy 0, policy_version 306340 (0.0034) [2024-06-22 22:48:27,123][15401] Updated weights for policy 0, policy_version 306350 (0.0030) [2024-06-22 22:48:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5019287552. Throughput: 0: 42793.3. Samples: 5019398280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 22:48:31,346][15401] Updated weights for policy 0, policy_version 306360 (0.0037) [2024-06-22 22:48:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.7, 300 sec: 43042.7). Total num frames: 5019500544. Throughput: 0: 42857.4. Samples: 5019656580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 22:48:34,738][15401] Updated weights for policy 0, policy_version 306370 (0.0034) [2024-06-22 22:48:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 5019697152. Throughput: 0: 42840.3. Samples: 5019786340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 22:48:39,223][15401] Updated weights for policy 0, policy_version 306380 (0.0037) [2024-06-22 22:48:42,640][15401] Updated weights for policy 0, policy_version 306390 (0.0031) [2024-06-22 22:48:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 5019926528. Throughput: 0: 43014.7. Samples: 5020047140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 22:48:46,809][15401] Updated weights for policy 0, policy_version 306400 (0.0042) [2024-06-22 22:48:48,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5020139520. Throughput: 0: 42920.7. Samples: 5020296900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 22:48:50,180][15401] Updated weights for policy 0, policy_version 306410 (0.0030) [2024-06-22 22:48:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5020352512. Throughput: 0: 42992.9. Samples: 5020430380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 22:48:54,323][15401] Updated weights for policy 0, policy_version 306420 (0.0044) [2024-06-22 22:48:57,871][15401] Updated weights for policy 0, policy_version 306430 (0.0033) [2024-06-22 22:48:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 5020565504. Throughput: 0: 42875.6. Samples: 5020684240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:48:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 22:49:01,761][15401] Updated weights for policy 0, policy_version 306440 (0.0035) [2024-06-22 22:49:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5020778496. Throughput: 0: 43017.9. Samples: 5020943040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:49:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 22:49:05,501][15401] Updated weights for policy 0, policy_version 306450 (0.0032) [2024-06-22 22:49:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5020975104. Throughput: 0: 42926.3. Samples: 5021071040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:49:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 22:49:09,310][15401] Updated weights for policy 0, policy_version 306460 (0.0031) [2024-06-22 22:49:12,779][15401] Updated weights for policy 0, policy_version 306470 (0.0022) [2024-06-22 22:49:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5021220864. Throughput: 0: 42919.2. Samples: 5021329640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-22 22:49:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-22 22:49:16,892][15401] Updated weights for policy 0, policy_version 306480 (0.0033) [2024-06-22 22:49:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5021417472. Throughput: 0: 43021.8. Samples: 5021592560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:18,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-22 22:49:20,295][15401] Updated weights for policy 0, policy_version 306490 (0.0034) [2024-06-22 22:49:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5021630464. Throughput: 0: 42907.7. Samples: 5021717180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 22:49:24,470][15401] Updated weights for policy 0, policy_version 306500 (0.0035) [2024-06-22 22:49:26,836][15349] Signal inference workers to stop experience collection... (74250 times) [2024-06-22 22:49:26,836][15349] Signal inference workers to resume experience collection... (74250 times) [2024-06-22 22:49:26,889][15401] InferenceWorker_p0-w0: stopping experience collection (74250 times) [2024-06-22 22:49:26,889][15401] InferenceWorker_p0-w0: resuming experience collection (74250 times) [2024-06-22 22:49:28,297][15401] Updated weights for policy 0, policy_version 306510 (0.0028) [2024-06-22 22:49:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5021859840. Throughput: 0: 42708.4. Samples: 5021969020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 22:49:32,001][15401] Updated weights for policy 0, policy_version 306520 (0.0031) [2024-06-22 22:49:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 5022040064. Throughput: 0: 43027.6. Samples: 5022233140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 22:49:35,883][15401] Updated weights for policy 0, policy_version 306530 (0.0038) [2024-06-22 22:49:38,395][15132] Fps is (10 sec: 40936.2, 60 sec: 42867.4, 300 sec: 42930.8). Total num frames: 5022269440. Throughput: 0: 42803.4. Samples: 5022356780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:38,396][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 22:49:39,916][15401] Updated weights for policy 0, policy_version 306540 (0.0037) [2024-06-22 22:49:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5022498816. Throughput: 0: 42843.5. Samples: 5022612200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 22:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306550_5022515200.pth... [2024-06-22 22:49:43,423][15401] Updated weights for policy 0, policy_version 306550 (0.0041) [2024-06-22 22:49:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000305918_5012160512.pth [2024-06-22 22:49:47,807][15401] Updated weights for policy 0, policy_version 306560 (0.0030) [2024-06-22 22:49:48,389][15132] Fps is (10 sec: 42623.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5022695424. Throughput: 0: 42830.7. Samples: 5022870420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 22:49:50,987][15401] Updated weights for policy 0, policy_version 306570 (0.0032) [2024-06-22 22:49:53,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42986.2). Total num frames: 5022924800. Throughput: 0: 42714.8. Samples: 5022993480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:53,396][15132] Avg episode reward: [(0, '0.644')] [2024-06-22 22:49:55,335][15401] Updated weights for policy 0, policy_version 306580 (0.0033) [2024-06-22 22:49:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5023137792. Throughput: 0: 42786.3. Samples: 5023255020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:49:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-22 22:49:58,742][15401] Updated weights for policy 0, policy_version 306590 (0.0028) [2024-06-22 22:50:02,678][15401] Updated weights for policy 0, policy_version 306600 (0.0031) [2024-06-22 22:50:03,394][15132] Fps is (10 sec: 42606.6, 60 sec: 42868.3, 300 sec: 42875.5). Total num frames: 5023350784. Throughput: 0: 42538.8. Samples: 5023507000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:03,394][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 22:50:06,447][15401] Updated weights for policy 0, policy_version 306610 (0.0042) [2024-06-22 22:50:08,391][15132] Fps is (10 sec: 40953.8, 60 sec: 42870.4, 300 sec: 42987.0). Total num frames: 5023547392. Throughput: 0: 42721.8. Samples: 5023639720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:08,391][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 22:50:10,312][15401] Updated weights for policy 0, policy_version 306620 (0.0042) [2024-06-22 22:50:13,390][15132] Fps is (10 sec: 44256.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5023793152. Throughput: 0: 42956.9. Samples: 5023902080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:13,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 22:50:14,160][15401] Updated weights for policy 0, policy_version 306630 (0.0030) [2024-06-22 22:50:17,945][15401] Updated weights for policy 0, policy_version 306640 (0.0036) [2024-06-22 22:50:18,390][15132] Fps is (10 sec: 45881.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5024006144. Throughput: 0: 42809.4. Samples: 5024159560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 22:50:22,095][15401] Updated weights for policy 0, policy_version 306650 (0.0048) [2024-06-22 22:50:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5024202752. Throughput: 0: 42892.1. Samples: 5024286680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 22:50:25,463][15401] Updated weights for policy 0, policy_version 306660 (0.0031) [2024-06-22 22:50:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5024432128. Throughput: 0: 42939.5. Samples: 5024544480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-22 22:50:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 22:50:29,806][15401] Updated weights for policy 0, policy_version 306670 (0.0027) [2024-06-22 22:50:33,247][15401] Updated weights for policy 0, policy_version 306680 (0.0033) [2024-06-22 22:50:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5024645120. Throughput: 0: 42859.5. Samples: 5024799100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-22 22:50:37,365][15401] Updated weights for policy 0, policy_version 306690 (0.0035) [2024-06-22 22:50:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42875.7, 300 sec: 42987.2). Total num frames: 5024841728. Throughput: 0: 43023.1. Samples: 5024929240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 22:50:40,851][15401] Updated weights for policy 0, policy_version 306700 (0.0036) [2024-06-22 22:50:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5025071104. Throughput: 0: 42915.4. Samples: 5025186220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-22 22:50:45,038][15401] Updated weights for policy 0, policy_version 306710 (0.0042) [2024-06-22 22:50:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5025284096. Throughput: 0: 42951.7. Samples: 5025439640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 22:50:48,821][15401] Updated weights for policy 0, policy_version 306720 (0.0027) [2024-06-22 22:50:52,821][15401] Updated weights for policy 0, policy_version 306730 (0.0056) [2024-06-22 22:50:53,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42601.2, 300 sec: 42931.3). Total num frames: 5025480704. Throughput: 0: 42902.6. Samples: 5025570380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:53,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 22:50:56,495][15401] Updated weights for policy 0, policy_version 306740 (0.0036) [2024-06-22 22:50:57,061][15349] Signal inference workers to stop experience collection... (74300 times) [2024-06-22 22:50:57,099][15401] InferenceWorker_p0-w0: stopping experience collection (74300 times) [2024-06-22 22:50:57,187][15349] Signal inference workers to resume experience collection... (74300 times) [2024-06-22 22:50:57,187][15401] InferenceWorker_p0-w0: resuming experience collection (74300 times) [2024-06-22 22:50:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5025710080. Throughput: 0: 42760.9. Samples: 5025826320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:50:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 22:51:00,859][15401] Updated weights for policy 0, policy_version 306750 (0.0036) [2024-06-22 22:51:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42874.7, 300 sec: 42876.1). Total num frames: 5025923072. Throughput: 0: 42736.9. Samples: 5026082720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 22:51:03,851][15401] Updated weights for policy 0, policy_version 306760 (0.0047) [2024-06-22 22:51:08,322][15401] Updated weights for policy 0, policy_version 306770 (0.0038) [2024-06-22 22:51:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42872.5, 300 sec: 42876.1). Total num frames: 5026119680. Throughput: 0: 42787.7. Samples: 5026212120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 22:51:11,380][15401] Updated weights for policy 0, policy_version 306780 (0.0054) [2024-06-22 22:51:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5026365440. Throughput: 0: 42811.5. Samples: 5026471000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 22:51:15,928][15401] Updated weights for policy 0, policy_version 306790 (0.0042) [2024-06-22 22:51:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5026562048. Throughput: 0: 42911.6. Samples: 5026730120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 22:51:19,341][15401] Updated weights for policy 0, policy_version 306800 (0.0032) [2024-06-22 22:51:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5026758656. Throughput: 0: 42779.1. Samples: 5026854300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 22:51:23,511][15401] Updated weights for policy 0, policy_version 306810 (0.0031) [2024-06-22 22:51:26,849][15401] Updated weights for policy 0, policy_version 306820 (0.0036) [2024-06-22 22:51:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5027020800. Throughput: 0: 42898.3. Samples: 5027116640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 22:51:31,166][15401] Updated weights for policy 0, policy_version 306830 (0.0032) [2024-06-22 22:51:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5027201024. Throughput: 0: 42905.3. Samples: 5027370380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-22 22:51:34,631][15401] Updated weights for policy 0, policy_version 306840 (0.0032) [2024-06-22 22:51:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5027414016. Throughput: 0: 42699.6. Samples: 5027491760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:38,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-22 22:51:38,775][15401] Updated weights for policy 0, policy_version 306850 (0.0033) [2024-06-22 22:51:42,400][15401] Updated weights for policy 0, policy_version 306860 (0.0040) [2024-06-22 22:51:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5027659776. Throughput: 0: 42807.1. Samples: 5027752640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:51:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 22:51:43,503][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306865_5027676160.pth... [2024-06-22 22:51:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306235_5017354240.pth [2024-06-22 22:51:46,426][15401] Updated weights for policy 0, policy_version 306870 (0.0040) [2024-06-22 22:51:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5027840000. Throughput: 0: 42761.9. Samples: 5028007020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:51:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-22 22:51:50,202][15401] Updated weights for policy 0, policy_version 306880 (0.0036) [2024-06-22 22:51:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 5028052992. Throughput: 0: 42467.8. Samples: 5028123180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:51:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 22:51:54,129][15401] Updated weights for policy 0, policy_version 306890 (0.0028) [2024-06-22 22:51:57,855][15401] Updated weights for policy 0, policy_version 306900 (0.0025) [2024-06-22 22:51:58,396][15132] Fps is (10 sec: 45847.2, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 5028298752. Throughput: 0: 42617.1. Samples: 5028389040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:51:58,396][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 22:52:02,079][15401] Updated weights for policy 0, policy_version 306910 (0.0038) [2024-06-22 22:52:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 5028429824. Throughput: 0: 42550.1. Samples: 5028644880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 22:52:04,180][15349] Signal inference workers to stop experience collection... (74350 times) [2024-06-22 22:52:04,181][15349] Signal inference workers to resume experience collection... (74350 times) [2024-06-22 22:52:04,222][15401] InferenceWorker_p0-w0: stopping experience collection (74350 times) [2024-06-22 22:52:04,223][15401] InferenceWorker_p0-w0: resuming experience collection (74350 times) [2024-06-22 22:52:05,528][15401] Updated weights for policy 0, policy_version 306920 (0.0034) [2024-06-22 22:52:08,389][15132] Fps is (10 sec: 37707.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5028675584. Throughput: 0: 42303.1. Samples: 5028757940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:08,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 22:52:09,875][15401] Updated weights for policy 0, policy_version 306930 (0.0043) [2024-06-22 22:52:13,227][15401] Updated weights for policy 0, policy_version 306940 (0.0034) [2024-06-22 22:52:13,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5028904960. Throughput: 0: 42211.6. Samples: 5029016160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-22 22:52:17,541][15401] Updated weights for policy 0, policy_version 306950 (0.0042) [2024-06-22 22:52:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5029085184. Throughput: 0: 42253.9. Samples: 5029271800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:18,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 22:52:21,098][15401] Updated weights for policy 0, policy_version 306960 (0.0035) [2024-06-22 22:52:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5029330944. Throughput: 0: 42353.1. Samples: 5029397640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 22:52:25,029][15401] Updated weights for policy 0, policy_version 306970 (0.0030) [2024-06-22 22:52:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 42654.0). Total num frames: 5029527552. Throughput: 0: 42360.3. Samples: 5029658860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 22:52:28,661][15401] Updated weights for policy 0, policy_version 306980 (0.0030) [2024-06-22 22:52:32,465][15401] Updated weights for policy 0, policy_version 306990 (0.0035) [2024-06-22 22:52:33,396][15132] Fps is (10 sec: 40933.1, 60 sec: 42320.8, 300 sec: 42764.4). Total num frames: 5029740544. Throughput: 0: 42420.0. Samples: 5029916180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:33,397][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 22:52:36,364][15401] Updated weights for policy 0, policy_version 307000 (0.0037) [2024-06-22 22:52:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5029969920. Throughput: 0: 42580.9. Samples: 5030039320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:38,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 22:52:40,034][15401] Updated weights for policy 0, policy_version 307010 (0.0045) [2024-06-22 22:52:43,389][15132] Fps is (10 sec: 42625.9, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 5030166528. Throughput: 0: 42466.0. Samples: 5030299740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 22:52:44,028][15401] Updated weights for policy 0, policy_version 307020 (0.0049) [2024-06-22 22:52:47,543][15401] Updated weights for policy 0, policy_version 307030 (0.0025) [2024-06-22 22:52:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 5030379520. Throughput: 0: 42314.3. Samples: 5030549020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 22:52:51,839][15401] Updated weights for policy 0, policy_version 307040 (0.0038) [2024-06-22 22:52:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5030608896. Throughput: 0: 42761.7. Samples: 5030682220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 22:52:55,411][15401] Updated weights for policy 0, policy_version 307050 (0.0041) [2024-06-22 22:52:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41510.5, 300 sec: 42598.4). Total num frames: 5030789120. Throughput: 0: 42689.8. Samples: 5030937200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-22 22:52:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 22:52:59,534][15401] Updated weights for policy 0, policy_version 307060 (0.0038) [2024-06-22 22:53:02,836][15401] Updated weights for policy 0, policy_version 307070 (0.0035) [2024-06-22 22:53:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 5031034880. Throughput: 0: 42586.5. Samples: 5031188200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 22:53:07,216][15401] Updated weights for policy 0, policy_version 307080 (0.0032) [2024-06-22 22:53:07,826][15349] Signal inference workers to stop experience collection... (74400 times) [2024-06-22 22:53:07,868][15401] InferenceWorker_p0-w0: stopping experience collection (74400 times) [2024-06-22 22:53:07,874][15349] Signal inference workers to resume experience collection... (74400 times) [2024-06-22 22:53:07,884][15401] InferenceWorker_p0-w0: resuming experience collection (74400 times) [2024-06-22 22:53:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5031247872. Throughput: 0: 42877.6. Samples: 5031327140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-22 22:53:10,794][15401] Updated weights for policy 0, policy_version 307090 (0.0030) [2024-06-22 22:53:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5031444480. Throughput: 0: 42615.2. Samples: 5031576540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-22 22:53:14,907][15401] Updated weights for policy 0, policy_version 307100 (0.0032) [2024-06-22 22:53:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5031673856. Throughput: 0: 42524.7. Samples: 5031829520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:18,391][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 22:53:18,545][15401] Updated weights for policy 0, policy_version 307110 (0.0037) [2024-06-22 22:53:22,729][15401] Updated weights for policy 0, policy_version 307120 (0.0028) [2024-06-22 22:53:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5031870464. Throughput: 0: 42672.9. Samples: 5031959600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-22 22:53:26,202][15401] Updated weights for policy 0, policy_version 307130 (0.0028) [2024-06-22 22:53:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5032083456. Throughput: 0: 42599.0. Samples: 5032216700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 22:53:30,262][15401] Updated weights for policy 0, policy_version 307140 (0.0028) [2024-06-22 22:53:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42876.1, 300 sec: 42765.1). Total num frames: 5032312832. Throughput: 0: 42722.3. Samples: 5032471520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 22:53:33,691][15401] Updated weights for policy 0, policy_version 307150 (0.0034) [2024-06-22 22:53:37,814][15401] Updated weights for policy 0, policy_version 307160 (0.0042) [2024-06-22 22:53:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 5032525824. Throughput: 0: 42749.6. Samples: 5032605960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 22:53:41,357][15401] Updated weights for policy 0, policy_version 307170 (0.0048) [2024-06-22 22:53:43,392][15132] Fps is (10 sec: 40948.6, 60 sec: 42596.5, 300 sec: 42653.6). Total num frames: 5032722432. Throughput: 0: 42631.2. Samples: 5032855720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:43,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 22:53:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307173_5032722432.pth... [2024-06-22 22:53:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306550_5022515200.pth [2024-06-22 22:53:46,138][15401] Updated weights for policy 0, policy_version 307180 (0.0032) [2024-06-22 22:53:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5032951808. Throughput: 0: 42717.5. Samples: 5033110480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 22:53:49,054][15401] Updated weights for policy 0, policy_version 307190 (0.0036) [2024-06-22 22:53:53,390][15132] Fps is (10 sec: 42609.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5033148416. Throughput: 0: 42501.8. Samples: 5033239720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:53,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-22 22:53:53,605][15401] Updated weights for policy 0, policy_version 307200 (0.0034) [2024-06-22 22:53:56,727][15401] Updated weights for policy 0, policy_version 307210 (0.0035) [2024-06-22 22:53:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5033361408. Throughput: 0: 42527.7. Samples: 5033490280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:53:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 22:54:01,227][15401] Updated weights for policy 0, policy_version 307220 (0.0037) [2024-06-22 22:54:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5033590784. Throughput: 0: 42769.4. Samples: 5033754140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:54:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-22 22:54:04,362][15401] Updated weights for policy 0, policy_version 307230 (0.0022) [2024-06-22 22:54:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5033787392. Throughput: 0: 42834.4. Samples: 5033887140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:54:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 22:54:08,712][15401] Updated weights for policy 0, policy_version 307240 (0.0036) [2024-06-22 22:54:11,985][15401] Updated weights for policy 0, policy_version 307250 (0.0038) [2024-06-22 22:54:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 5034016768. Throughput: 0: 42550.7. Samples: 5034131580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-22 22:54:13,392][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 22:54:16,309][15401] Updated weights for policy 0, policy_version 307260 (0.0027) [2024-06-22 22:54:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5034229760. Throughput: 0: 42834.7. Samples: 5034399080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-22 22:54:19,599][15401] Updated weights for policy 0, policy_version 307270 (0.0040) [2024-06-22 22:54:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5034442752. Throughput: 0: 42787.2. Samples: 5034531380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 22:54:23,888][15401] Updated weights for policy 0, policy_version 307280 (0.0034) [2024-06-22 22:54:27,258][15401] Updated weights for policy 0, policy_version 307290 (0.0037) [2024-06-22 22:54:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5034655744. Throughput: 0: 42835.9. Samples: 5034783220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 22:54:31,245][15401] Updated weights for policy 0, policy_version 307300 (0.0045) [2024-06-22 22:54:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.8). Total num frames: 5034885120. Throughput: 0: 43023.8. Samples: 5035046560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-22 22:54:34,889][15401] Updated weights for policy 0, policy_version 307310 (0.0024) [2024-06-22 22:54:38,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42594.0, 300 sec: 42653.0). Total num frames: 5035081728. Throughput: 0: 43061.5. Samples: 5035177760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:38,397][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 22:54:39,224][15401] Updated weights for policy 0, policy_version 307320 (0.0038) [2024-06-22 22:54:42,636][15401] Updated weights for policy 0, policy_version 307330 (0.0037) [2024-06-22 22:54:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.2, 300 sec: 42709.4). Total num frames: 5035294720. Throughput: 0: 43041.0. Samples: 5035427140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 22:54:46,889][15401] Updated weights for policy 0, policy_version 307340 (0.0035) [2024-06-22 22:54:46,915][15349] Signal inference workers to stop experience collection... (74450 times) [2024-06-22 22:54:46,915][15349] Signal inference workers to resume experience collection... (74450 times) [2024-06-22 22:54:46,934][15401] InferenceWorker_p0-w0: stopping experience collection (74450 times) [2024-06-22 22:54:46,935][15401] InferenceWorker_p0-w0: resuming experience collection (74450 times) [2024-06-22 22:54:48,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 5035524096. Throughput: 0: 42865.8. Samples: 5035683100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:48,391][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 22:54:50,366][15401] Updated weights for policy 0, policy_version 307350 (0.0034) [2024-06-22 22:54:53,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5035720704. Throughput: 0: 42844.4. Samples: 5035815140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 22:54:54,352][15401] Updated weights for policy 0, policy_version 307360 (0.0027) [2024-06-22 22:54:58,336][15401] Updated weights for policy 0, policy_version 307370 (0.0035) [2024-06-22 22:54:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42710.1). Total num frames: 5035950080. Throughput: 0: 43056.5. Samples: 5036069020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:54:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 22:55:01,883][15401] Updated weights for policy 0, policy_version 307380 (0.0031) [2024-06-22 22:55:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42876.3). Total num frames: 5036195840. Throughput: 0: 42664.7. Samples: 5036319000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:55:03,395][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 22:55:06,607][15401] Updated weights for policy 0, policy_version 307390 (0.0036) [2024-06-22 22:55:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5036376064. Throughput: 0: 42766.3. Samples: 5036455860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:55:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-22 22:55:09,809][15401] Updated weights for policy 0, policy_version 307400 (0.0031) [2024-06-22 22:55:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 5036589056. Throughput: 0: 42655.1. Samples: 5036702700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:55:13,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 22:55:14,212][15401] Updated weights for policy 0, policy_version 307410 (0.0030) [2024-06-22 22:55:17,535][15401] Updated weights for policy 0, policy_version 307420 (0.0036) [2024-06-22 22:55:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5036802048. Throughput: 0: 42478.0. Samples: 5036958060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:55:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-22 22:55:21,815][15401] Updated weights for policy 0, policy_version 307430 (0.0028) [2024-06-22 22:55:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5037015040. Throughput: 0: 42592.2. Samples: 5037094140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-22 22:55:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 22:55:25,252][15401] Updated weights for policy 0, policy_version 307440 (0.0037) [2024-06-22 22:55:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5037228032. Throughput: 0: 42545.2. Samples: 5037341660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-22 22:55:29,393][15401] Updated weights for policy 0, policy_version 307450 (0.0045) [2024-06-22 22:55:32,757][15401] Updated weights for policy 0, policy_version 307460 (0.0033) [2024-06-22 22:55:33,395][15132] Fps is (10 sec: 42576.8, 60 sec: 42594.8, 300 sec: 42708.7). Total num frames: 5037441024. Throughput: 0: 42571.6. Samples: 5037599040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:33,395][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 22:55:36,916][15401] Updated weights for policy 0, policy_version 307470 (0.0034) [2024-06-22 22:55:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 5037637632. Throughput: 0: 42639.1. Samples: 5037733900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 22:55:40,258][15401] Updated weights for policy 0, policy_version 307480 (0.0042) [2024-06-22 22:55:43,390][15132] Fps is (10 sec: 44259.2, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5037883392. Throughput: 0: 42582.1. Samples: 5037985220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 22:55:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307488_5037883392.pth... [2024-06-22 22:55:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000306865_5027676160.pth [2024-06-22 22:55:44,478][15401] Updated weights for policy 0, policy_version 307490 (0.0039) [2024-06-22 22:55:47,802][15401] Updated weights for policy 0, policy_version 307500 (0.0033) [2024-06-22 22:55:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 5038096384. Throughput: 0: 42743.3. Samples: 5038242440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 22:55:52,110][15401] Updated weights for policy 0, policy_version 307510 (0.0038) [2024-06-22 22:55:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5038276608. Throughput: 0: 42719.9. Samples: 5038378260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 22:55:55,497][15401] Updated weights for policy 0, policy_version 307520 (0.0031) [2024-06-22 22:55:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5038505984. Throughput: 0: 42747.6. Samples: 5038626340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:55:58,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 22:55:59,801][15401] Updated weights for policy 0, policy_version 307530 (0.0038) [2024-06-22 22:56:01,661][15349] Signal inference workers to stop experience collection... (74500 times) [2024-06-22 22:56:01,686][15401] InferenceWorker_p0-w0: stopping experience collection (74500 times) [2024-06-22 22:56:01,721][15349] Signal inference workers to resume experience collection... (74500 times) [2024-06-22 22:56:01,721][15401] InferenceWorker_p0-w0: resuming experience collection (74500 times) [2024-06-22 22:56:03,089][15401] Updated weights for policy 0, policy_version 307540 (0.0032) [2024-06-22 22:56:03,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5038751744. Throughput: 0: 42789.7. Samples: 5038883600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:03,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-22 22:56:07,958][15401] Updated weights for policy 0, policy_version 307550 (0.0030) [2024-06-22 22:56:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5038899200. Throughput: 0: 42724.6. Samples: 5039016740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 22:56:10,615][15401] Updated weights for policy 0, policy_version 307560 (0.0031) [2024-06-22 22:56:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5039161344. Throughput: 0: 42859.0. Samples: 5039270320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 22:56:15,485][15401] Updated weights for policy 0, policy_version 307570 (0.0038) [2024-06-22 22:56:18,125][15401] Updated weights for policy 0, policy_version 307580 (0.0032) [2024-06-22 22:56:18,389][15132] Fps is (10 sec: 50790.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5039407104. Throughput: 0: 42867.2. Samples: 5039527840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 22:56:22,981][15401] Updated weights for policy 0, policy_version 307590 (0.0037) [2024-06-22 22:56:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5039554560. Throughput: 0: 42883.4. Samples: 5039663660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-22 22:56:25,883][15401] Updated weights for policy 0, policy_version 307600 (0.0034) [2024-06-22 22:56:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5039800320. Throughput: 0: 42796.2. Samples: 5039911040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-22 22:56:30,454][15401] Updated weights for policy 0, policy_version 307610 (0.0044) [2024-06-22 22:56:33,330][15401] Updated weights for policy 0, policy_version 307620 (0.0032) [2024-06-22 22:56:33,389][15132] Fps is (10 sec: 49152.6, 60 sec: 43421.3, 300 sec: 42820.6). Total num frames: 5040046080. Throughput: 0: 42859.0. Samples: 5040171100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 22:56:37,923][15401] Updated weights for policy 0, policy_version 307630 (0.0045) [2024-06-22 22:56:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5040209920. Throughput: 0: 42768.4. Samples: 5040302840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 22:56:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 22:56:41,066][15401] Updated weights for policy 0, policy_version 307640 (0.0043) [2024-06-22 22:56:43,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 5040455680. Throughput: 0: 42820.1. Samples: 5040553520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:56:43,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 22:56:45,390][15401] Updated weights for policy 0, policy_version 307650 (0.0030) [2024-06-22 22:56:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5040668672. Throughput: 0: 42963.6. Samples: 5040816960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:56:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 22:56:48,817][15401] Updated weights for policy 0, policy_version 307660 (0.0038) [2024-06-22 22:56:53,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42871.5, 300 sec: 42543.8). Total num frames: 5040848896. Throughput: 0: 42848.9. Samples: 5040944940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:56:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 22:56:53,605][15401] Updated weights for policy 0, policy_version 307670 (0.0052) [2024-06-22 22:56:56,336][15401] Updated weights for policy 0, policy_version 307680 (0.0026) [2024-06-22 22:56:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5041111040. Throughput: 0: 42909.4. Samples: 5041201240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:56:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 22:57:01,322][15401] Updated weights for policy 0, policy_version 307690 (0.0046) [2024-06-22 22:57:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5041291264. Throughput: 0: 42944.4. Samples: 5041460340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 22:57:04,137][15401] Updated weights for policy 0, policy_version 307700 (0.0023) [2024-06-22 22:57:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5041504256. Throughput: 0: 42620.2. Samples: 5041581560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 22:57:08,819][15401] Updated weights for policy 0, policy_version 307710 (0.0042) [2024-06-22 22:57:12,189][15401] Updated weights for policy 0, policy_version 307720 (0.0057) [2024-06-22 22:57:12,210][15349] Signal inference workers to stop experience collection... (74550 times) [2024-06-22 22:57:12,216][15349] Signal inference workers to resume experience collection... (74550 times) [2024-06-22 22:57:12,267][15401] InferenceWorker_p0-w0: stopping experience collection (74550 times) [2024-06-22 22:57:12,267][15401] InferenceWorker_p0-w0: resuming experience collection (74550 times) [2024-06-22 22:57:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5041750016. Throughput: 0: 42892.4. Samples: 5041841200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 22:57:16,441][15401] Updated weights for policy 0, policy_version 307730 (0.0038) [2024-06-22 22:57:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5041930240. Throughput: 0: 42726.2. Samples: 5042093780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 22:57:20,117][15401] Updated weights for policy 0, policy_version 307740 (0.0026) [2024-06-22 22:57:23,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5042126848. Throughput: 0: 42472.4. Samples: 5042214100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-22 22:57:24,024][15401] Updated weights for policy 0, policy_version 307750 (0.0034) [2024-06-22 22:57:27,670][15401] Updated weights for policy 0, policy_version 307760 (0.0021) [2024-06-22 22:57:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 5042372608. Throughput: 0: 42803.4. Samples: 5042479400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 22:57:32,351][15401] Updated weights for policy 0, policy_version 307770 (0.0024) [2024-06-22 22:57:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5042569216. Throughput: 0: 42578.1. Samples: 5042732980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:33,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-22 22:57:35,355][15401] Updated weights for policy 0, policy_version 307780 (0.0030) [2024-06-22 22:57:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5042765824. Throughput: 0: 42487.9. Samples: 5042856900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:38,390][15132] Avg episode reward: [(0, '0.119')] [2024-06-22 22:57:39,814][15401] Updated weights for policy 0, policy_version 307790 (0.0033) [2024-06-22 22:57:43,166][15401] Updated weights for policy 0, policy_version 307800 (0.0037) [2024-06-22 22:57:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 5042995200. Throughput: 0: 42481.6. Samples: 5043112920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 22:57:43,549][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307801_5043011584.pth... [2024-06-22 22:57:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307173_5032722432.pth [2024-06-22 22:57:47,412][15401] Updated weights for policy 0, policy_version 307810 (0.0039) [2024-06-22 22:57:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 5043191808. Throughput: 0: 42611.1. Samples: 5043377940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:48,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-22 22:57:50,672][15401] Updated weights for policy 0, policy_version 307820 (0.0040) [2024-06-22 22:57:53,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5043421184. Throughput: 0: 42541.2. Samples: 5043496020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-22 22:57:53,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 22:57:55,453][15401] Updated weights for policy 0, policy_version 307830 (0.0042) [2024-06-22 22:57:58,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5043634176. Throughput: 0: 42528.9. Samples: 5043755000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:57:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-22 22:57:58,424][15401] Updated weights for policy 0, policy_version 307840 (0.0033) [2024-06-22 22:58:03,105][15401] Updated weights for policy 0, policy_version 307850 (0.0036) [2024-06-22 22:58:03,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5043814400. Throughput: 0: 42752.5. Samples: 5044017640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 22:58:06,018][15401] Updated weights for policy 0, policy_version 307860 (0.0029) [2024-06-22 22:58:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5044060160. Throughput: 0: 42700.4. Samples: 5044135620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 22:58:10,791][15401] Updated weights for policy 0, policy_version 307870 (0.0044) [2024-06-22 22:58:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5044273152. Throughput: 0: 42434.8. Samples: 5044388960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-22 22:58:13,833][15401] Updated weights for policy 0, policy_version 307880 (0.0026) [2024-06-22 22:58:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5044453376. Throughput: 0: 42503.6. Samples: 5044645640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:18,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-22 22:58:18,399][15401] Updated weights for policy 0, policy_version 307890 (0.0035) [2024-06-22 22:58:21,436][15401] Updated weights for policy 0, policy_version 307900 (0.0033) [2024-06-22 22:58:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5044682752. Throughput: 0: 42504.6. Samples: 5044769600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:23,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-22 22:58:25,810][15401] Updated weights for policy 0, policy_version 307910 (0.0030) [2024-06-22 22:58:26,572][15349] Signal inference workers to stop experience collection... (74600 times) [2024-06-22 22:58:26,572][15349] Signal inference workers to resume experience collection... (74600 times) [2024-06-22 22:58:26,625][15401] InferenceWorker_p0-w0: stopping experience collection (74600 times) [2024-06-22 22:58:26,625][15401] InferenceWorker_p0-w0: resuming experience collection (74600 times) [2024-06-22 22:58:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5044912128. Throughput: 0: 42699.1. Samples: 5045034380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 22:58:28,935][15401] Updated weights for policy 0, policy_version 307920 (0.0037) [2024-06-22 22:58:33,363][15401] Updated weights for policy 0, policy_version 307930 (0.0028) [2024-06-22 22:58:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5045125120. Throughput: 0: 42425.7. Samples: 5045287000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 22:58:36,683][15401] Updated weights for policy 0, policy_version 307940 (0.0028) [2024-06-22 22:58:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5045338112. Throughput: 0: 42537.4. Samples: 5045410100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-22 22:58:40,908][15401] Updated weights for policy 0, policy_version 307950 (0.0028) [2024-06-22 22:58:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 5045551104. Throughput: 0: 42699.3. Samples: 5045676480. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 22:58:44,690][15401] Updated weights for policy 0, policy_version 307960 (0.0051) [2024-06-22 22:58:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 5045764096. Throughput: 0: 42482.5. Samples: 5045929360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 22:58:48,533][15401] Updated weights for policy 0, policy_version 307970 (0.0036) [2024-06-22 22:58:52,300][15401] Updated weights for policy 0, policy_version 307980 (0.0029) [2024-06-22 22:58:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5045977088. Throughput: 0: 42764.1. Samples: 5046060000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 22:58:56,175][15401] Updated weights for policy 0, policy_version 307990 (0.0031) [2024-06-22 22:58:58,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 5046173696. Throughput: 0: 42702.7. Samples: 5046310860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:58:58,396][15132] Avg episode reward: [(0, '0.000')] [2024-06-22 22:58:59,943][15401] Updated weights for policy 0, policy_version 308000 (0.0038) [2024-06-22 22:59:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5046403072. Throughput: 0: 42591.0. Samples: 5046562240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:59:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 22:59:03,789][15401] Updated weights for policy 0, policy_version 308010 (0.0027) [2024-06-22 22:59:07,891][15401] Updated weights for policy 0, policy_version 308020 (0.0035) [2024-06-22 22:59:08,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5046616064. Throughput: 0: 42749.2. Samples: 5046693320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-22 22:59:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-22 22:59:11,469][15401] Updated weights for policy 0, policy_version 308030 (0.0042) [2024-06-22 22:59:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 5046812672. Throughput: 0: 42496.8. Samples: 5046946740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 22:59:15,617][15401] Updated weights for policy 0, policy_version 308040 (0.0058) [2024-06-22 22:59:18,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 5047058432. Throughput: 0: 42558.7. Samples: 5047202240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:18,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 22:59:19,230][15401] Updated weights for policy 0, policy_version 308050 (0.0031) [2024-06-22 22:59:23,126][15401] Updated weights for policy 0, policy_version 308060 (0.0032) [2024-06-22 22:59:23,392][15132] Fps is (10 sec: 45865.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 5047271424. Throughput: 0: 42822.6. Samples: 5047337220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:23,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 22:59:26,770][15401] Updated weights for policy 0, policy_version 308070 (0.0028) [2024-06-22 22:59:28,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5047468032. Throughput: 0: 42524.1. Samples: 5047590060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:28,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-22 22:59:30,921][15401] Updated weights for policy 0, policy_version 308080 (0.0046) [2024-06-22 22:59:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 5047713792. Throughput: 0: 42483.1. Samples: 5047841100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 22:59:34,438][15401] Updated weights for policy 0, policy_version 308090 (0.0026) [2024-06-22 22:59:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5047877632. Throughput: 0: 42613.0. Samples: 5047977580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 22:59:38,609][15401] Updated weights for policy 0, policy_version 308100 (0.0028) [2024-06-22 22:59:41,901][15401] Updated weights for policy 0, policy_version 308110 (0.0033) [2024-06-22 22:59:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5048107008. Throughput: 0: 42690.6. Samples: 5048231660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-22 22:59:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308112_5048107008.pth... [2024-06-22 22:59:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307488_5037883392.pth [2024-06-22 22:59:46,382][15401] Updated weights for policy 0, policy_version 308120 (0.0033) [2024-06-22 22:59:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5048352768. Throughput: 0: 42691.7. Samples: 5048483360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 22:59:49,545][15401] Updated weights for policy 0, policy_version 308130 (0.0034) [2024-06-22 22:59:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5048532992. Throughput: 0: 42854.8. Samples: 5048621780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 22:59:53,948][15401] Updated weights for policy 0, policy_version 308140 (0.0035) [2024-06-22 22:59:55,144][15349] Signal inference workers to stop experience collection... (74650 times) [2024-06-22 22:59:55,144][15349] Signal inference workers to resume experience collection... (74650 times) [2024-06-22 22:59:55,186][15401] InferenceWorker_p0-w0: stopping experience collection (74650 times) [2024-06-22 22:59:55,186][15401] InferenceWorker_p0-w0: resuming experience collection (74650 times) [2024-06-22 22:59:57,321][15401] Updated weights for policy 0, policy_version 308150 (0.0044) [2024-06-22 22:59:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42876.1, 300 sec: 42542.9). Total num frames: 5048745984. Throughput: 0: 42873.1. Samples: 5048876020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 22:59:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 23:00:01,522][15401] Updated weights for policy 0, policy_version 308160 (0.0034) [2024-06-22 23:00:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5048991744. Throughput: 0: 42881.0. Samples: 5049131780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 23:00:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 23:00:04,764][15401] Updated weights for policy 0, policy_version 308170 (0.0033) [2024-06-22 23:00:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5049155584. Throughput: 0: 42913.8. Samples: 5049268240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 23:00:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 23:00:09,386][15401] Updated weights for policy 0, policy_version 308180 (0.0028) [2024-06-22 23:00:12,882][15401] Updated weights for policy 0, policy_version 308190 (0.0026) [2024-06-22 23:00:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 5049417728. Throughput: 0: 42937.0. Samples: 5049522220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 23:00:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 23:00:16,937][15401] Updated weights for policy 0, policy_version 308200 (0.0028) [2024-06-22 23:00:18,392][15132] Fps is (10 sec: 49140.4, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 5049647104. Throughput: 0: 42869.4. Samples: 5049770320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 23:00:18,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 23:00:20,403][15401] Updated weights for policy 0, policy_version 308210 (0.0051) [2024-06-22 23:00:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 5049794560. Throughput: 0: 42869.8. Samples: 5049906720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-22 23:00:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-22 23:00:24,736][15401] Updated weights for policy 0, policy_version 308220 (0.0051) [2024-06-22 23:00:27,788][15401] Updated weights for policy 0, policy_version 308230 (0.0039) [2024-06-22 23:00:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42710.2). Total num frames: 5050040320. Throughput: 0: 42803.2. Samples: 5050157800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 23:00:32,142][15401] Updated weights for policy 0, policy_version 308240 (0.0034) [2024-06-22 23:00:33,390][15132] Fps is (10 sec: 47512.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5050269696. Throughput: 0: 42967.3. Samples: 5050416900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 23:00:35,593][15401] Updated weights for policy 0, policy_version 308250 (0.0036) [2024-06-22 23:00:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5050449920. Throughput: 0: 42840.8. Samples: 5050549620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-22 23:00:39,917][15401] Updated weights for policy 0, policy_version 308260 (0.0042) [2024-06-22 23:00:43,010][15401] Updated weights for policy 0, policy_version 308270 (0.0032) [2024-06-22 23:00:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5050695680. Throughput: 0: 42729.6. Samples: 5050798860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-22 23:00:47,620][15401] Updated weights for policy 0, policy_version 308280 (0.0041) [2024-06-22 23:00:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5050908672. Throughput: 0: 42831.0. Samples: 5051059180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:48,395][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 23:00:50,705][15401] Updated weights for policy 0, policy_version 308290 (0.0034) [2024-06-22 23:00:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5051105280. Throughput: 0: 42719.6. Samples: 5051190620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:53,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-22 23:00:55,100][15349] Signal inference workers to stop experience collection... (74700 times) [2024-06-22 23:00:55,143][15401] InferenceWorker_p0-w0: stopping experience collection (74700 times) [2024-06-22 23:00:55,156][15349] Signal inference workers to resume experience collection... (74700 times) [2024-06-22 23:00:55,168][15401] InferenceWorker_p0-w0: resuming experience collection (74700 times) [2024-06-22 23:00:55,170][15401] Updated weights for policy 0, policy_version 308300 (0.0030) [2024-06-22 23:00:58,246][15401] Updated weights for policy 0, policy_version 308310 (0.0025) [2024-06-22 23:00:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5051351040. Throughput: 0: 42760.8. Samples: 5051446460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:00:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 23:01:02,685][15401] Updated weights for policy 0, policy_version 308320 (0.0037) [2024-06-22 23:01:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5051547648. Throughput: 0: 43051.2. Samples: 5051707520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 23:01:05,788][15401] Updated weights for policy 0, policy_version 308330 (0.0041) [2024-06-22 23:01:08,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5051727872. Throughput: 0: 42784.8. Samples: 5051832040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 23:01:10,268][15401] Updated weights for policy 0, policy_version 308340 (0.0029) [2024-06-22 23:01:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5051990016. Throughput: 0: 42881.2. Samples: 5052087460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 23:01:13,606][15401] Updated weights for policy 0, policy_version 308350 (0.0030) [2024-06-22 23:01:18,088][15401] Updated weights for policy 0, policy_version 308360 (0.0042) [2024-06-22 23:01:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 5052186624. Throughput: 0: 43000.7. Samples: 5052351920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 23:01:21,182][15401] Updated weights for policy 0, policy_version 308370 (0.0039) [2024-06-22 23:01:23,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5052366848. Throughput: 0: 42862.5. Samples: 5052478440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 23:01:25,656][15401] Updated weights for policy 0, policy_version 308380 (0.0036) [2024-06-22 23:01:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5052628992. Throughput: 0: 42918.7. Samples: 5052730200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 23:01:29,314][15401] Updated weights for policy 0, policy_version 308390 (0.0031) [2024-06-22 23:01:33,355][15401] Updated weights for policy 0, policy_version 308400 (0.0040) [2024-06-22 23:01:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5052825600. Throughput: 0: 42959.5. Samples: 5052992360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:33,391][15132] Avg episode reward: [(0, '0.406')] [2024-06-22 23:01:36,792][15401] Updated weights for policy 0, policy_version 308410 (0.0044) [2024-06-22 23:01:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 5053005824. Throughput: 0: 42715.6. Samples: 5053112820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-22 23:01:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 23:01:40,957][15401] Updated weights for policy 0, policy_version 308420 (0.0028) [2024-06-22 23:01:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 5053267968. Throughput: 0: 42786.1. Samples: 5053371940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:01:43,393][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 23:01:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308428_5053284352.pth... [2024-06-22 23:01:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000307801_5043011584.pth [2024-06-22 23:01:44,367][15401] Updated weights for policy 0, policy_version 308430 (0.0041) [2024-06-22 23:01:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5053431808. Throughput: 0: 42811.6. Samples: 5053634040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:01:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-22 23:01:48,928][15401] Updated weights for policy 0, policy_version 308440 (0.0038) [2024-06-22 23:01:51,916][15401] Updated weights for policy 0, policy_version 308450 (0.0026) [2024-06-22 23:01:53,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5053661184. Throughput: 0: 42666.1. Samples: 5053752020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:01:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 23:01:56,503][15401] Updated weights for policy 0, policy_version 308460 (0.0043) [2024-06-22 23:01:58,390][15132] Fps is (10 sec: 49151.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5053923328. Throughput: 0: 42793.8. Samples: 5054013180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:01:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 23:01:59,471][15401] Updated weights for policy 0, policy_version 308470 (0.0034) [2024-06-22 23:02:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 5054087168. Throughput: 0: 42709.6. Samples: 5054273960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:03,393][15132] Avg episode reward: [(0, '0.734')] [2024-06-22 23:02:04,344][15401] Updated weights for policy 0, policy_version 308480 (0.0031) [2024-06-22 23:02:07,026][15401] Updated weights for policy 0, policy_version 308490 (0.0027) [2024-06-22 23:02:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 5054316544. Throughput: 0: 42542.7. Samples: 5054392860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 23:02:12,000][15349] Signal inference workers to stop experience collection... (74750 times) [2024-06-22 23:02:12,031][15401] InferenceWorker_p0-w0: stopping experience collection (74750 times) [2024-06-22 23:02:12,069][15349] Signal inference workers to resume experience collection... (74750 times) [2024-06-22 23:02:12,069][15401] InferenceWorker_p0-w0: resuming experience collection (74750 times) [2024-06-22 23:02:12,071][15401] Updated weights for policy 0, policy_version 308500 (0.0027) [2024-06-22 23:02:13,390][15132] Fps is (10 sec: 45886.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5054545920. Throughput: 0: 42739.6. Samples: 5054653480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 23:02:14,736][15401] Updated weights for policy 0, policy_version 308510 (0.0038) [2024-06-22 23:02:18,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5054726144. Throughput: 0: 42757.8. Samples: 5054916460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-22 23:02:19,459][15401] Updated weights for policy 0, policy_version 308520 (0.0043) [2024-06-22 23:02:22,301][15401] Updated weights for policy 0, policy_version 308530 (0.0033) [2024-06-22 23:02:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5054971904. Throughput: 0: 42731.4. Samples: 5055035740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-22 23:02:27,159][15401] Updated weights for policy 0, policy_version 308540 (0.0031) [2024-06-22 23:02:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5055184896. Throughput: 0: 42919.8. Samples: 5055303220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:02:29,850][15401] Updated weights for policy 0, policy_version 308550 (0.0030) [2024-06-22 23:02:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5055381504. Throughput: 0: 42853.3. Samples: 5055562440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-22 23:02:34,743][15401] Updated weights for policy 0, policy_version 308560 (0.0039) [2024-06-22 23:02:37,455][15401] Updated weights for policy 0, policy_version 308570 (0.0039) [2024-06-22 23:02:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 5055627264. Throughput: 0: 43030.3. Samples: 5055688380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-22 23:02:42,336][15401] Updated weights for policy 0, policy_version 308580 (0.0050) [2024-06-22 23:02:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42765.4). Total num frames: 5055807488. Throughput: 0: 43046.8. Samples: 5055950280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-22 23:02:45,421][15401] Updated weights for policy 0, policy_version 308590 (0.0037) [2024-06-22 23:02:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 5056036864. Throughput: 0: 43039.7. Samples: 5056210640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-22 23:02:50,030][15401] Updated weights for policy 0, policy_version 308600 (0.0034) [2024-06-22 23:02:52,850][15401] Updated weights for policy 0, policy_version 308610 (0.0029) [2024-06-22 23:02:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 5056282624. Throughput: 0: 43163.2. Samples: 5056335200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-22 23:02:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-22 23:02:57,745][15401] Updated weights for policy 0, policy_version 308620 (0.0034) [2024-06-22 23:02:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5056462848. Throughput: 0: 43283.9. Samples: 5056601260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:02:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 23:03:00,277][15401] Updated weights for policy 0, policy_version 308630 (0.0040) [2024-06-22 23:03:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 5056692224. Throughput: 0: 43125.7. Samples: 5056857120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 23:03:05,327][15401] Updated weights for policy 0, policy_version 308640 (0.0021) [2024-06-22 23:03:08,119][15401] Updated weights for policy 0, policy_version 308650 (0.0043) [2024-06-22 23:03:08,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 5056921600. Throughput: 0: 43293.5. Samples: 5056983940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 23:03:13,055][15401] Updated weights for policy 0, policy_version 308660 (0.0031) [2024-06-22 23:03:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5057101824. Throughput: 0: 43138.6. Samples: 5057244460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 23:03:15,843][15401] Updated weights for policy 0, policy_version 308670 (0.0047) [2024-06-22 23:03:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5057331200. Throughput: 0: 42949.2. Samples: 5057495160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:18,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-22 23:03:20,733][15401] Updated weights for policy 0, policy_version 308680 (0.0037) [2024-06-22 23:03:21,081][15349] Signal inference workers to stop experience collection... (74800 times) [2024-06-22 23:03:21,081][15349] Signal inference workers to resume experience collection... (74800 times) [2024-06-22 23:03:21,089][15401] InferenceWorker_p0-w0: stopping experience collection (74800 times) [2024-06-22 23:03:21,089][15401] InferenceWorker_p0-w0: resuming experience collection (74800 times) [2024-06-22 23:03:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5057560576. Throughput: 0: 43115.1. Samples: 5057628560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 23:03:23,462][15401] Updated weights for policy 0, policy_version 308690 (0.0046) [2024-06-22 23:03:28,313][15401] Updated weights for policy 0, policy_version 308700 (0.0027) [2024-06-22 23:03:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5057740800. Throughput: 0: 43084.4. Samples: 5057889080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 23:03:31,028][15401] Updated weights for policy 0, policy_version 308710 (0.0038) [2024-06-22 23:03:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5057986560. Throughput: 0: 42830.7. Samples: 5058138020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:33,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 23:03:35,750][15401] Updated weights for policy 0, policy_version 308720 (0.0034) [2024-06-22 23:03:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5058199552. Throughput: 0: 43125.4. Samples: 5058275840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:38,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-22 23:03:38,933][15401] Updated weights for policy 0, policy_version 308730 (0.0033) [2024-06-22 23:03:43,358][15401] Updated weights for policy 0, policy_version 308740 (0.0031) [2024-06-22 23:03:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5058396160. Throughput: 0: 42853.8. Samples: 5058529680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-22 23:03:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308741_5058412544.pth... [2024-06-22 23:03:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308112_5048107008.pth [2024-06-22 23:03:46,750][15401] Updated weights for policy 0, policy_version 308750 (0.0034) [2024-06-22 23:03:48,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43413.0, 300 sec: 42930.7). Total num frames: 5058641920. Throughput: 0: 42833.2. Samples: 5058784880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:48,396][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 23:03:50,838][15401] Updated weights for policy 0, policy_version 308760 (0.0037) [2024-06-22 23:03:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42932.6). Total num frames: 5058838528. Throughput: 0: 42905.3. Samples: 5058914680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 23:03:54,247][15401] Updated weights for policy 0, policy_version 308770 (0.0031) [2024-06-22 23:03:58,364][15401] Updated weights for policy 0, policy_version 308780 (0.0035) [2024-06-22 23:03:58,390][15132] Fps is (10 sec: 40986.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5059051520. Throughput: 0: 42951.6. Samples: 5059177280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:03:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-22 23:04:01,953][15401] Updated weights for policy 0, policy_version 308790 (0.0033) [2024-06-22 23:04:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5059280896. Throughput: 0: 42733.4. Samples: 5059418160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:04:03,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 23:04:06,526][15401] Updated weights for policy 0, policy_version 308800 (0.0031) [2024-06-22 23:04:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5059493888. Throughput: 0: 42907.0. Samples: 5059559380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-22 23:04:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-22 23:04:09,622][15401] Updated weights for policy 0, policy_version 308810 (0.0036) [2024-06-22 23:04:13,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5059657728. Throughput: 0: 42736.4. Samples: 5059812220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:13,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 23:04:14,048][15401] Updated weights for policy 0, policy_version 308820 (0.0031) [2024-06-22 23:04:17,145][15401] Updated weights for policy 0, policy_version 308830 (0.0021) [2024-06-22 23:04:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 5059919872. Throughput: 0: 42773.3. Samples: 5060062820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 23:04:21,840][15401] Updated weights for policy 0, policy_version 308840 (0.0026) [2024-06-22 23:04:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5060116480. Throughput: 0: 42883.2. Samples: 5060205580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 23:04:24,714][15401] Updated weights for policy 0, policy_version 308850 (0.0033) [2024-06-22 23:04:28,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5060296704. Throughput: 0: 42658.7. Samples: 5060449320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 23:04:29,567][15401] Updated weights for policy 0, policy_version 308860 (0.0037) [2024-06-22 23:04:29,787][15349] Signal inference workers to stop experience collection... (74850 times) [2024-06-22 23:04:29,789][15349] Signal inference workers to resume experience collection... (74850 times) [2024-06-22 23:04:29,806][15401] InferenceWorker_p0-w0: stopping experience collection (74850 times) [2024-06-22 23:04:29,840][15401] InferenceWorker_p0-w0: resuming experience collection (74850 times) [2024-06-22 23:04:32,305][15401] Updated weights for policy 0, policy_version 308870 (0.0038) [2024-06-22 23:04:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5060542464. Throughput: 0: 42614.1. Samples: 5060702240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:04:37,339][15401] Updated weights for policy 0, policy_version 308880 (0.0041) [2024-06-22 23:04:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5060771840. Throughput: 0: 42864.3. Samples: 5060843580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 23:04:40,199][15401] Updated weights for policy 0, policy_version 308890 (0.0029) [2024-06-22 23:04:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5060952064. Throughput: 0: 42418.2. Samples: 5061086100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 23:04:44,908][15401] Updated weights for policy 0, policy_version 308900 (0.0031) [2024-06-22 23:04:47,786][15401] Updated weights for policy 0, policy_version 308910 (0.0032) [2024-06-22 23:04:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42329.9, 300 sec: 42876.1). Total num frames: 5061181440. Throughput: 0: 42846.8. Samples: 5061346260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 23:04:52,329][15401] Updated weights for policy 0, policy_version 308920 (0.0035) [2024-06-22 23:04:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5061410816. Throughput: 0: 42720.0. Samples: 5061481780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 23:04:55,485][15401] Updated weights for policy 0, policy_version 308930 (0.0023) [2024-06-22 23:04:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5061591040. Throughput: 0: 42624.9. Samples: 5061730340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:04:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 23:04:59,895][15401] Updated weights for policy 0, policy_version 308940 (0.0033) [2024-06-22 23:05:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5061836800. Throughput: 0: 42732.9. Samples: 5061985800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:05:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 23:05:03,395][15401] Updated weights for policy 0, policy_version 308950 (0.0035) [2024-06-22 23:05:07,669][15401] Updated weights for policy 0, policy_version 308960 (0.0034) [2024-06-22 23:05:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5062017024. Throughput: 0: 42375.4. Samples: 5062112480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:05:08,403][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 23:05:11,122][15401] Updated weights for policy 0, policy_version 308970 (0.0035) [2024-06-22 23:05:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 5062246400. Throughput: 0: 42649.8. Samples: 5062368560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:05:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 23:05:15,375][15401] Updated weights for policy 0, policy_version 308980 (0.0023) [2024-06-22 23:05:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5062459392. Throughput: 0: 42635.5. Samples: 5062620840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:05:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-22 23:05:18,831][15401] Updated weights for policy 0, policy_version 308990 (0.0026) [2024-06-22 23:05:18,845][15349] Signal inference workers to stop experience collection... (74900 times) [2024-06-22 23:05:18,852][15349] Signal inference workers to resume experience collection... (74900 times) [2024-06-22 23:05:18,889][15401] InferenceWorker_p0-w0: stopping experience collection (74900 times) [2024-06-22 23:05:18,889][15401] InferenceWorker_p0-w0: resuming experience collection (74900 times) [2024-06-22 23:05:23,386][15401] Updated weights for policy 0, policy_version 309000 (0.0033) [2024-06-22 23:05:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5062656000. Throughput: 0: 42279.6. Samples: 5062746160. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-22 23:05:26,499][15401] Updated weights for policy 0, policy_version 309010 (0.0043) [2024-06-22 23:05:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 5062885376. Throughput: 0: 42413.8. Samples: 5062994820. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:28,392][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 23:05:31,170][15401] Updated weights for policy 0, policy_version 309020 (0.0032) [2024-06-22 23:05:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 5063114752. Throughput: 0: 42275.3. Samples: 5063248660. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 23:05:34,079][15401] Updated weights for policy 0, policy_version 309030 (0.0035) [2024-06-22 23:05:38,392][15132] Fps is (10 sec: 39321.3, 60 sec: 41777.5, 300 sec: 42653.6). Total num frames: 5063278592. Throughput: 0: 42248.9. Samples: 5063383080. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:38,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 23:05:38,779][15401] Updated weights for policy 0, policy_version 309040 (0.0029) [2024-06-22 23:05:41,904][15401] Updated weights for policy 0, policy_version 309050 (0.0025) [2024-06-22 23:05:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5063524352. Throughput: 0: 42380.5. Samples: 5063637460. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 23:05:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309053_5063524352.pth... [2024-06-22 23:05:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308428_5053284352.pth [2024-06-22 23:05:46,456][15401] Updated weights for policy 0, policy_version 309060 (0.0042) [2024-06-22 23:05:48,390][15132] Fps is (10 sec: 47525.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5063753728. Throughput: 0: 42271.0. Samples: 5063888000. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 23:05:49,697][15401] Updated weights for policy 0, policy_version 309070 (0.0035) [2024-06-22 23:05:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 41777.6, 300 sec: 42598.0). Total num frames: 5063917568. Throughput: 0: 42423.6. Samples: 5064021640. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:53,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 23:05:54,247][15401] Updated weights for policy 0, policy_version 309080 (0.0026) [2024-06-22 23:05:57,418][15401] Updated weights for policy 0, policy_version 309090 (0.0033) [2024-06-22 23:05:58,395][15132] Fps is (10 sec: 39300.9, 60 sec: 42594.7, 300 sec: 42708.7). Total num frames: 5064146944. Throughput: 0: 42433.3. Samples: 5064278280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:05:58,395][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 23:06:01,830][15401] Updated weights for policy 0, policy_version 309100 (0.0041) [2024-06-22 23:06:03,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5064392704. Throughput: 0: 42426.3. Samples: 5064530020. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 23:06:04,996][15401] Updated weights for policy 0, policy_version 309110 (0.0042) [2024-06-22 23:06:08,389][15132] Fps is (10 sec: 39342.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5064540160. Throughput: 0: 42599.1. Samples: 5064663120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 23:06:09,379][15401] Updated weights for policy 0, policy_version 309120 (0.0049) [2024-06-22 23:06:12,578][15401] Updated weights for policy 0, policy_version 309130 (0.0027) [2024-06-22 23:06:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5064802304. Throughput: 0: 42711.1. Samples: 5064916720. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 23:06:16,920][15401] Updated weights for policy 0, policy_version 309140 (0.0035) [2024-06-22 23:06:18,390][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5065031680. Throughput: 0: 42704.9. Samples: 5065170380. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 23:06:20,183][15401] Updated weights for policy 0, policy_version 309150 (0.0038) [2024-06-22 23:06:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5065195520. Throughput: 0: 42703.6. Samples: 5065304640. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:23,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-22 23:06:24,777][15401] Updated weights for policy 0, policy_version 309160 (0.0042) [2024-06-22 23:06:27,766][15401] Updated weights for policy 0, policy_version 309170 (0.0033) [2024-06-22 23:06:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5065441280. Throughput: 0: 42605.8. Samples: 5065554720. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 23:06:32,274][15401] Updated weights for policy 0, policy_version 309180 (0.0038) [2024-06-22 23:06:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 5065654272. Throughput: 0: 42943.6. Samples: 5065820460. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-06-22 23:06:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 23:06:35,108][15349] Signal inference workers to stop experience collection... (74950 times) [2024-06-22 23:06:35,108][15349] Signal inference workers to resume experience collection... (74950 times) [2024-06-22 23:06:35,153][15401] InferenceWorker_p0-w0: stopping experience collection (74950 times) [2024-06-22 23:06:35,153][15401] InferenceWorker_p0-w0: resuming experience collection (74950 times) [2024-06-22 23:06:35,245][15401] Updated weights for policy 0, policy_version 309190 (0.0041) [2024-06-22 23:06:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42654.3). Total num frames: 5065850880. Throughput: 0: 42694.7. Samples: 5065942800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:06:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 23:06:40,117][15401] Updated weights for policy 0, policy_version 309200 (0.0030) [2024-06-22 23:06:42,894][15401] Updated weights for policy 0, policy_version 309210 (0.0035) [2024-06-22 23:06:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5066096640. Throughput: 0: 42580.6. Samples: 5066194180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:06:43,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-22 23:06:47,773][15401] Updated weights for policy 0, policy_version 309220 (0.0039) [2024-06-22 23:06:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5066293248. Throughput: 0: 42958.1. Samples: 5066463140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:06:48,390][15132] Avg episode reward: [(0, '0.141')] [2024-06-22 23:06:50,433][15401] Updated weights for policy 0, policy_version 309230 (0.0038) [2024-06-22 23:06:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 5066506240. Throughput: 0: 42693.4. Samples: 5066584320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:06:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 23:06:55,307][15401] Updated weights for policy 0, policy_version 309240 (0.0035) [2024-06-22 23:06:58,202][15401] Updated weights for policy 0, policy_version 309250 (0.0036) [2024-06-22 23:06:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43421.4, 300 sec: 42932.0). Total num frames: 5066752000. Throughput: 0: 42858.2. Samples: 5066845340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:06:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-22 23:07:03,052][15401] Updated weights for policy 0, policy_version 309260 (0.0047) [2024-06-22 23:07:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5066932224. Throughput: 0: 43011.6. Samples: 5067105900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:03,391][15132] Avg episode reward: [(0, '0.545')] [2024-06-22 23:07:05,989][15401] Updated weights for policy 0, policy_version 309270 (0.0045) [2024-06-22 23:07:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5067128832. Throughput: 0: 42720.0. Samples: 5067227040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 23:07:10,509][15401] Updated weights for policy 0, policy_version 309280 (0.0035) [2024-06-22 23:07:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5067390976. Throughput: 0: 42985.3. Samples: 5067489060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:07:13,502][15401] Updated weights for policy 0, policy_version 309290 (0.0039) [2024-06-22 23:07:17,992][15401] Updated weights for policy 0, policy_version 309300 (0.0039) [2024-06-22 23:07:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5067571200. Throughput: 0: 42748.9. Samples: 5067744160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 23:07:21,014][15401] Updated weights for policy 0, policy_version 309310 (0.0041) [2024-06-22 23:07:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5067784192. Throughput: 0: 42704.9. Samples: 5067864520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 23:07:26,004][15401] Updated weights for policy 0, policy_version 309320 (0.0039) [2024-06-22 23:07:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5068013568. Throughput: 0: 43016.5. Samples: 5068129920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 23:07:28,737][15401] Updated weights for policy 0, policy_version 309330 (0.0039) [2024-06-22 23:07:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5068210176. Throughput: 0: 42673.4. Samples: 5068383440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 23:07:33,434][15401] Updated weights for policy 0, policy_version 309340 (0.0034) [2024-06-22 23:07:36,789][15401] Updated weights for policy 0, policy_version 309350 (0.0038) [2024-06-22 23:07:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5068423168. Throughput: 0: 42661.8. Samples: 5068504100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-22 23:07:40,819][15349] Signal inference workers to stop experience collection... (75000 times) [2024-06-22 23:07:40,870][15401] InferenceWorker_p0-w0: stopping experience collection (75000 times) [2024-06-22 23:07:40,879][15349] Signal inference workers to resume experience collection... (75000 times) [2024-06-22 23:07:40,888][15401] InferenceWorker_p0-w0: resuming experience collection (75000 times) [2024-06-22 23:07:41,014][15401] Updated weights for policy 0, policy_version 309360 (0.0038) [2024-06-22 23:07:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5068652544. Throughput: 0: 42763.2. Samples: 5068769680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:43,390][15132] Avg episode reward: [(0, '0.896')] [2024-06-22 23:07:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309367_5068668928.pth... [2024-06-22 23:07:43,596][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000308741_5058412544.pth [2024-06-22 23:07:44,520][15401] Updated weights for policy 0, policy_version 309370 (0.0037) [2024-06-22 23:07:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5068849152. Throughput: 0: 42622.3. Samples: 5069023900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-22 23:07:48,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-22 23:07:48,544][15401] Updated weights for policy 0, policy_version 309380 (0.0030) [2024-06-22 23:07:52,072][15401] Updated weights for policy 0, policy_version 309390 (0.0035) [2024-06-22 23:07:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5069062144. Throughput: 0: 42702.8. Samples: 5069148660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:07:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-22 23:07:56,130][15401] Updated weights for policy 0, policy_version 309400 (0.0040) [2024-06-22 23:07:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 5069275136. Throughput: 0: 42719.0. Samples: 5069411520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:07:58,392][15132] Avg episode reward: [(0, '0.381')] [2024-06-22 23:07:59,718][15401] Updated weights for policy 0, policy_version 309410 (0.0035) [2024-06-22 23:08:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5069488128. Throughput: 0: 42719.9. Samples: 5069666560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 23:08:04,086][15401] Updated weights for policy 0, policy_version 309420 (0.0037) [2024-06-22 23:08:07,237][15401] Updated weights for policy 0, policy_version 309430 (0.0035) [2024-06-22 23:08:08,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5069717504. Throughput: 0: 42844.0. Samples: 5069792500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 23:08:12,252][15401] Updated weights for policy 0, policy_version 309440 (0.0032) [2024-06-22 23:08:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 5069914112. Throughput: 0: 42706.1. Samples: 5070051700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 23:08:15,389][15401] Updated weights for policy 0, policy_version 309450 (0.0036) [2024-06-22 23:08:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5070127104. Throughput: 0: 42723.2. Samples: 5070305980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 23:08:19,924][15401] Updated weights for policy 0, policy_version 309460 (0.0039) [2024-06-22 23:08:22,970][15401] Updated weights for policy 0, policy_version 309470 (0.0030) [2024-06-22 23:08:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5070356480. Throughput: 0: 42878.2. Samples: 5070433620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:08:27,398][15401] Updated weights for policy 0, policy_version 309480 (0.0034) [2024-06-22 23:08:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 5070553088. Throughput: 0: 42715.8. Samples: 5070691900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 23:08:30,582][15401] Updated weights for policy 0, policy_version 309490 (0.0031) [2024-06-22 23:08:33,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 5070766080. Throughput: 0: 42721.2. Samples: 5070946460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:33,393][15132] Avg episode reward: [(0, '0.274')] [2024-06-22 23:08:34,967][15401] Updated weights for policy 0, policy_version 309500 (0.0029) [2024-06-22 23:08:38,162][15401] Updated weights for policy 0, policy_version 309510 (0.0039) [2024-06-22 23:08:38,394][15132] Fps is (10 sec: 47493.6, 60 sec: 43414.4, 300 sec: 42819.9). Total num frames: 5071028224. Throughput: 0: 42810.5. Samples: 5071075320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:38,394][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 23:08:42,703][15401] Updated weights for policy 0, policy_version 309520 (0.0031) [2024-06-22 23:08:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.2, 300 sec: 42543.8). Total num frames: 5071192064. Throughput: 0: 42599.9. Samples: 5071328420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:08:45,879][15401] Updated weights for policy 0, policy_version 309530 (0.0045) [2024-06-22 23:08:48,392][15132] Fps is (10 sec: 37690.2, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 5071405056. Throughput: 0: 42555.4. Samples: 5071581660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:48,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-22 23:08:50,535][15401] Updated weights for policy 0, policy_version 309540 (0.0029) [2024-06-22 23:08:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5071634432. Throughput: 0: 42549.7. Samples: 5071707240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 23:08:53,646][15401] Updated weights for policy 0, policy_version 309550 (0.0041) [2024-06-22 23:08:58,323][15401] Updated weights for policy 0, policy_version 309560 (0.0028) [2024-06-22 23:08:58,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 5071831040. Throughput: 0: 42491.1. Samples: 5071963800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:08:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 23:09:01,013][15401] Updated weights for policy 0, policy_version 309570 (0.0034) [2024-06-22 23:09:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5072044032. Throughput: 0: 42519.4. Samples: 5072219360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-22 23:09:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 23:09:05,912][15401] Updated weights for policy 0, policy_version 309580 (0.0036) [2024-06-22 23:09:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5072289792. Throughput: 0: 42611.0. Samples: 5072351120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 23:09:08,927][15349] Signal inference workers to stop experience collection... (75050 times) [2024-06-22 23:09:08,928][15349] Signal inference workers to resume experience collection... (75050 times) [2024-06-22 23:09:08,950][15401] InferenceWorker_p0-w0: stopping experience collection (75050 times) [2024-06-22 23:09:08,950][15401] InferenceWorker_p0-w0: resuming experience collection (75050 times) [2024-06-22 23:09:09,082][15401] Updated weights for policy 0, policy_version 309590 (0.0037) [2024-06-22 23:09:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5072470016. Throughput: 0: 42491.6. Samples: 5072604020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 23:09:13,627][15401] Updated weights for policy 0, policy_version 309600 (0.0040) [2024-06-22 23:09:16,811][15401] Updated weights for policy 0, policy_version 309610 (0.0037) [2024-06-22 23:09:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5072683008. Throughput: 0: 42458.2. Samples: 5072856980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:18,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-22 23:09:21,576][15401] Updated weights for policy 0, policy_version 309620 (0.0031) [2024-06-22 23:09:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 5072896000. Throughput: 0: 42549.4. Samples: 5072989960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:23,393][15132] Avg episode reward: [(0, '0.345')] [2024-06-22 23:09:24,262][15401] Updated weights for policy 0, policy_version 309630 (0.0031) [2024-06-22 23:09:28,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5073108992. Throughput: 0: 42442.8. Samples: 5073238340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-22 23:09:29,309][15401] Updated weights for policy 0, policy_version 309640 (0.0048) [2024-06-22 23:09:31,980][15401] Updated weights for policy 0, policy_version 309650 (0.0028) [2024-06-22 23:09:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 5073321984. Throughput: 0: 42546.8. Samples: 5073496160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 23:09:36,791][15401] Updated weights for policy 0, policy_version 309660 (0.0032) [2024-06-22 23:09:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42055.2, 300 sec: 42709.5). Total num frames: 5073551360. Throughput: 0: 42686.6. Samples: 5073628140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 23:09:39,746][15401] Updated weights for policy 0, policy_version 309670 (0.0043) [2024-06-22 23:09:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5073764352. Throughput: 0: 42526.3. Samples: 5073877480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:43,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-22 23:09:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309678_5073764352.pth... [2024-06-22 23:09:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309053_5063524352.pth [2024-06-22 23:09:44,448][15401] Updated weights for policy 0, policy_version 309680 (0.0032) [2024-06-22 23:09:47,610][15401] Updated weights for policy 0, policy_version 309690 (0.0044) [2024-06-22 23:09:48,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42871.6, 300 sec: 42598.1). Total num frames: 5073977344. Throughput: 0: 42410.2. Samples: 5074127920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:48,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 23:09:52,284][15401] Updated weights for policy 0, policy_version 309700 (0.0029) [2024-06-22 23:09:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 5074157568. Throughput: 0: 42359.7. Samples: 5074257300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 23:09:55,290][15401] Updated weights for policy 0, policy_version 309710 (0.0033) [2024-06-22 23:09:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5074403328. Throughput: 0: 42563.7. Samples: 5074519380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:09:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 23:09:59,758][15401] Updated weights for policy 0, policy_version 309720 (0.0034) [2024-06-22 23:10:02,928][15401] Updated weights for policy 0, policy_version 309730 (0.0028) [2024-06-22 23:10:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5074616320. Throughput: 0: 42505.6. Samples: 5074769720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:10:03,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 23:10:07,351][15401] Updated weights for policy 0, policy_version 309740 (0.0035) [2024-06-22 23:10:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5074812928. Throughput: 0: 42489.8. Samples: 5074901900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:10:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 23:10:10,570][15401] Updated weights for policy 0, policy_version 309750 (0.0029) [2024-06-22 23:10:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5075042304. Throughput: 0: 42486.6. Samples: 5075150240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:10:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-22 23:10:14,994][15401] Updated weights for policy 0, policy_version 309760 (0.0025) [2024-06-22 23:10:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5075255296. Throughput: 0: 42432.8. Samples: 5075405640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-22 23:10:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-22 23:10:18,406][15401] Updated weights for policy 0, policy_version 309770 (0.0024) [2024-06-22 23:10:22,896][15401] Updated weights for policy 0, policy_version 309780 (0.0036) [2024-06-22 23:10:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5075451904. Throughput: 0: 42472.1. Samples: 5075539480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:23,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-22 23:10:26,351][15401] Updated weights for policy 0, policy_version 309790 (0.0031) [2024-06-22 23:10:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5075664896. Throughput: 0: 42393.8. Samples: 5075785200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-22 23:10:30,814][15401] Updated weights for policy 0, policy_version 309800 (0.0041) [2024-06-22 23:10:33,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 5075877888. Throughput: 0: 42651.0. Samples: 5076047120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 23:10:34,159][15401] Updated weights for policy 0, policy_version 309810 (0.0029) [2024-06-22 23:10:37,435][15349] Signal inference workers to stop experience collection... (75100 times) [2024-06-22 23:10:37,436][15349] Signal inference workers to resume experience collection... (75100 times) [2024-06-22 23:10:37,454][15401] InferenceWorker_p0-w0: stopping experience collection (75100 times) [2024-06-22 23:10:37,454][15401] InferenceWorker_p0-w0: resuming experience collection (75100 times) [2024-06-22 23:10:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5076074496. Throughput: 0: 42592.8. Samples: 5076173980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 23:10:38,440][15401] Updated weights for policy 0, policy_version 309820 (0.0053) [2024-06-22 23:10:41,916][15401] Updated weights for policy 0, policy_version 309830 (0.0032) [2024-06-22 23:10:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5076303872. Throughput: 0: 42297.7. Samples: 5076422780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 23:10:45,936][15401] Updated weights for policy 0, policy_version 309840 (0.0036) [2024-06-22 23:10:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.0, 300 sec: 42654.3). Total num frames: 5076500480. Throughput: 0: 42442.6. Samples: 5076679640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-22 23:10:49,790][15401] Updated weights for policy 0, policy_version 309850 (0.0048) [2024-06-22 23:10:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.7). Total num frames: 5076729856. Throughput: 0: 42348.0. Samples: 5076807560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 23:10:53,399][15401] Updated weights for policy 0, policy_version 309860 (0.0034) [2024-06-22 23:10:57,538][15401] Updated weights for policy 0, policy_version 309870 (0.0029) [2024-06-22 23:10:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5076942848. Throughput: 0: 42543.1. Samples: 5077064680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:10:58,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-22 23:11:01,016][15401] Updated weights for policy 0, policy_version 309880 (0.0037) [2024-06-22 23:11:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5077139456. Throughput: 0: 42580.0. Samples: 5077321740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 23:11:05,226][15401] Updated weights for policy 0, policy_version 309890 (0.0027) [2024-06-22 23:11:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5077368832. Throughput: 0: 42491.2. Samples: 5077451480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 23:11:08,956][15401] Updated weights for policy 0, policy_version 309900 (0.0037) [2024-06-22 23:11:12,777][15401] Updated weights for policy 0, policy_version 309910 (0.0038) [2024-06-22 23:11:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5077565440. Throughput: 0: 42523.6. Samples: 5077698760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-22 23:11:16,745][15401] Updated weights for policy 0, policy_version 309920 (0.0032) [2024-06-22 23:11:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5077794816. Throughput: 0: 42444.3. Samples: 5077957100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-22 23:11:20,252][15401] Updated weights for policy 0, policy_version 309930 (0.0040) [2024-06-22 23:11:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 5078007808. Throughput: 0: 42564.3. Samples: 5078089380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-22 23:11:24,465][15401] Updated weights for policy 0, policy_version 309940 (0.0051) [2024-06-22 23:11:28,030][15401] Updated weights for policy 0, policy_version 309950 (0.0023) [2024-06-22 23:11:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5078220800. Throughput: 0: 42734.7. Samples: 5078345840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 23:11:32,030][15401] Updated weights for policy 0, policy_version 309960 (0.0029) [2024-06-22 23:11:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5078417408. Throughput: 0: 42810.2. Samples: 5078606100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-22 23:11:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-22 23:11:35,738][15401] Updated weights for policy 0, policy_version 309970 (0.0035) [2024-06-22 23:11:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5078646784. Throughput: 0: 42741.7. Samples: 5078730940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:11:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 23:11:39,539][15401] Updated weights for policy 0, policy_version 309980 (0.0030) [2024-06-22 23:11:43,286][15401] Updated weights for policy 0, policy_version 309990 (0.0037) [2024-06-22 23:11:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5078876160. Throughput: 0: 42800.1. Samples: 5078990680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:11:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 23:11:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309990_5078876160.pth... [2024-06-22 23:11:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309367_5068668928.pth [2024-06-22 23:11:47,054][15401] Updated weights for policy 0, policy_version 310000 (0.0034) [2024-06-22 23:11:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5079056384. Throughput: 0: 42690.7. Samples: 5079242820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:11:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 23:11:50,987][15401] Updated weights for policy 0, policy_version 310010 (0.0040) [2024-06-22 23:11:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5079285760. Throughput: 0: 42492.0. Samples: 5079363620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:11:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-22 23:11:54,614][15401] Updated weights for policy 0, policy_version 310020 (0.0041) [2024-06-22 23:11:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5079498752. Throughput: 0: 42773.3. Samples: 5079623560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:11:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-22 23:11:58,598][15349] Signal inference workers to stop experience collection... (75150 times) [2024-06-22 23:11:58,598][15349] Signal inference workers to resume experience collection... (75150 times) [2024-06-22 23:11:58,626][15401] InferenceWorker_p0-w0: stopping experience collection (75150 times) [2024-06-22 23:11:58,626][15401] InferenceWorker_p0-w0: resuming experience collection (75150 times) [2024-06-22 23:11:58,747][15401] Updated weights for policy 0, policy_version 310030 (0.0031) [2024-06-22 23:12:02,117][15401] Updated weights for policy 0, policy_version 310040 (0.0030) [2024-06-22 23:12:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5079711744. Throughput: 0: 42636.3. Samples: 5079875740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 23:12:06,457][15401] Updated weights for policy 0, policy_version 310050 (0.0039) [2024-06-22 23:12:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5079924736. Throughput: 0: 42506.8. Samples: 5080002180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 23:12:10,241][15401] Updated weights for policy 0, policy_version 310060 (0.0043) [2024-06-22 23:12:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5080121344. Throughput: 0: 42663.6. Samples: 5080265700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-22 23:12:14,090][15401] Updated weights for policy 0, policy_version 310070 (0.0034) [2024-06-22 23:12:17,970][15401] Updated weights for policy 0, policy_version 310080 (0.0033) [2024-06-22 23:12:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5080367104. Throughput: 0: 42413.0. Samples: 5080514680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 23:12:22,031][15401] Updated weights for policy 0, policy_version 310090 (0.0042) [2024-06-22 23:12:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5080563712. Throughput: 0: 42547.5. Samples: 5080645580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 23:12:25,510][15401] Updated weights for policy 0, policy_version 310100 (0.0030) [2024-06-22 23:12:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5080760320. Throughput: 0: 42419.5. Samples: 5080899560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 23:12:29,666][15401] Updated weights for policy 0, policy_version 310110 (0.0034) [2024-06-22 23:12:32,939][15401] Updated weights for policy 0, policy_version 310120 (0.0034) [2024-06-22 23:12:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5081006080. Throughput: 0: 42329.3. Samples: 5081147640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 23:12:37,297][15401] Updated weights for policy 0, policy_version 310130 (0.0025) [2024-06-22 23:12:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5081202688. Throughput: 0: 42689.8. Samples: 5081284660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 23:12:40,917][15401] Updated weights for policy 0, policy_version 310140 (0.0036) [2024-06-22 23:12:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5081399296. Throughput: 0: 42578.5. Samples: 5081539600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 23:12:44,957][15401] Updated weights for policy 0, policy_version 310150 (0.0035) [2024-06-22 23:12:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 5081645056. Throughput: 0: 42530.2. Samples: 5081789700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:12:48,392][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 23:12:48,501][15401] Updated weights for policy 0, policy_version 310160 (0.0032) [2024-06-22 23:12:52,821][15401] Updated weights for policy 0, policy_version 310170 (0.0037) [2024-06-22 23:12:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 5081841664. Throughput: 0: 42687.9. Samples: 5081923140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:12:53,394][15132] Avg episode reward: [(0, '0.328')] [2024-06-22 23:12:56,289][15401] Updated weights for policy 0, policy_version 310180 (0.0038) [2024-06-22 23:12:58,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5082038272. Throughput: 0: 42424.5. Samples: 5082174800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:12:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 23:13:00,908][15401] Updated weights for policy 0, policy_version 310190 (0.0034) [2024-06-22 23:13:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5082284032. Throughput: 0: 42522.6. Samples: 5082428200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 23:13:04,241][15401] Updated weights for policy 0, policy_version 310200 (0.0033) [2024-06-22 23:13:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5082464256. Throughput: 0: 42695.7. Samples: 5082566880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 23:13:08,505][15401] Updated weights for policy 0, policy_version 310210 (0.0040) [2024-06-22 23:13:12,009][15401] Updated weights for policy 0, policy_version 310220 (0.0042) [2024-06-22 23:13:13,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 5082677248. Throughput: 0: 42623.5. Samples: 5082817720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:13,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-22 23:13:15,976][15401] Updated weights for policy 0, policy_version 310230 (0.0036) [2024-06-22 23:13:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5082923008. Throughput: 0: 42729.3. Samples: 5083070460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:13:19,737][15401] Updated weights for policy 0, policy_version 310240 (0.0033) [2024-06-22 23:13:23,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5083103232. Throughput: 0: 42627.9. Samples: 5083202920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 23:13:23,834][15401] Updated weights for policy 0, policy_version 310250 (0.0029) [2024-06-22 23:13:27,615][15401] Updated weights for policy 0, policy_version 310260 (0.0026) [2024-06-22 23:13:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 5083332608. Throughput: 0: 42665.4. Samples: 5083459540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-22 23:13:31,403][15401] Updated weights for policy 0, policy_version 310270 (0.0026) [2024-06-22 23:13:31,581][15349] Signal inference workers to stop experience collection... (75200 times) [2024-06-22 23:13:31,581][15349] Signal inference workers to resume experience collection... (75200 times) [2024-06-22 23:13:31,605][15401] InferenceWorker_p0-w0: stopping experience collection (75200 times) [2024-06-22 23:13:31,605][15401] InferenceWorker_p0-w0: resuming experience collection (75200 times) [2024-06-22 23:13:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42487.9). Total num frames: 5083561984. Throughput: 0: 42660.0. Samples: 5083709300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 23:13:35,197][15401] Updated weights for policy 0, policy_version 310280 (0.0035) [2024-06-22 23:13:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5083758592. Throughput: 0: 42716.6. Samples: 5083845380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-22 23:13:39,046][15401] Updated weights for policy 0, policy_version 310290 (0.0036) [2024-06-22 23:13:42,755][15401] Updated weights for policy 0, policy_version 310300 (0.0034) [2024-06-22 23:13:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 5083971584. Throughput: 0: 42685.6. Samples: 5084095660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-22 23:13:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310301_5083971584.pth... [2024-06-22 23:13:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309678_5073764352.pth [2024-06-22 23:13:46,608][15401] Updated weights for policy 0, policy_version 310310 (0.0037) [2024-06-22 23:13:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 5084200960. Throughput: 0: 42666.2. Samples: 5084348180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 23:13:50,306][15401] Updated weights for policy 0, policy_version 310320 (0.0036) [2024-06-22 23:13:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5084381184. Throughput: 0: 42505.2. Samples: 5084479620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:53,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 23:13:54,311][15401] Updated weights for policy 0, policy_version 310330 (0.0035) [2024-06-22 23:13:58,030][15401] Updated weights for policy 0, policy_version 310340 (0.0042) [2024-06-22 23:13:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5084610560. Throughput: 0: 42434.3. Samples: 5084727160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:13:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 23:14:01,916][15401] Updated weights for policy 0, policy_version 310350 (0.0033) [2024-06-22 23:14:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5084839936. Throughput: 0: 42543.1. Samples: 5084984900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-22 23:14:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-22 23:14:05,753][15401] Updated weights for policy 0, policy_version 310360 (0.0046) [2024-06-22 23:14:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5085020160. Throughput: 0: 42548.9. Samples: 5085117620. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:08,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-22 23:14:09,796][15401] Updated weights for policy 0, policy_version 310370 (0.0035) [2024-06-22 23:14:13,203][15401] Updated weights for policy 0, policy_version 310380 (0.0039) [2024-06-22 23:14:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 5085265920. Throughput: 0: 42295.9. Samples: 5085362860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 23:14:17,677][15401] Updated weights for policy 0, policy_version 310390 (0.0035) [2024-06-22 23:14:18,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 5085478912. Throughput: 0: 42469.8. Samples: 5085620540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:18,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 23:14:21,170][15401] Updated weights for policy 0, policy_version 310400 (0.0036) [2024-06-22 23:14:23,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5085642752. Throughput: 0: 42253.3. Samples: 5085746780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 23:14:25,331][15401] Updated weights for policy 0, policy_version 310410 (0.0027) [2024-06-22 23:14:28,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5085888512. Throughput: 0: 42485.0. Samples: 5086007480. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 23:14:28,726][15401] Updated weights for policy 0, policy_version 310420 (0.0037) [2024-06-22 23:14:32,993][15401] Updated weights for policy 0, policy_version 310430 (0.0040) [2024-06-22 23:14:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5086101504. Throughput: 0: 42498.3. Samples: 5086260600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 23:14:36,983][15401] Updated weights for policy 0, policy_version 310440 (0.0037) [2024-06-22 23:14:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5086298112. Throughput: 0: 42464.8. Samples: 5086390540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-22 23:14:40,716][15401] Updated weights for policy 0, policy_version 310450 (0.0038) [2024-06-22 23:14:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 5086543872. Throughput: 0: 42630.6. Samples: 5086645540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:43,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-22 23:14:44,526][15401] Updated weights for policy 0, policy_version 310460 (0.0049) [2024-06-22 23:14:48,303][15401] Updated weights for policy 0, policy_version 310470 (0.0034) [2024-06-22 23:14:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5086740480. Throughput: 0: 42684.9. Samples: 5086905720. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-22 23:14:52,104][15401] Updated weights for policy 0, policy_version 310480 (0.0029) [2024-06-22 23:14:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5086937088. Throughput: 0: 42489.7. Samples: 5087029660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:53,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-22 23:14:55,984][15401] Updated weights for policy 0, policy_version 310490 (0.0036) [2024-06-22 23:14:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5087182848. Throughput: 0: 42706.0. Samples: 5087284620. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:14:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-22 23:14:59,669][15401] Updated weights for policy 0, policy_version 310500 (0.0037) [2024-06-22 23:15:02,208][15349] Signal inference workers to stop experience collection... (75250 times) [2024-06-22 23:15:02,208][15349] Signal inference workers to resume experience collection... (75250 times) [2024-06-22 23:15:02,220][15401] InferenceWorker_p0-w0: stopping experience collection (75250 times) [2024-06-22 23:15:02,221][15401] InferenceWorker_p0-w0: resuming experience collection (75250 times) [2024-06-22 23:15:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5087363072. Throughput: 0: 42836.9. Samples: 5087548100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:15:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 23:15:03,788][15401] Updated weights for policy 0, policy_version 310510 (0.0048) [2024-06-22 23:15:07,279][15401] Updated weights for policy 0, policy_version 310520 (0.0029) [2024-06-22 23:15:08,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5087576064. Throughput: 0: 42803.5. Samples: 5087672940. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:15:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 23:15:11,487][15401] Updated weights for policy 0, policy_version 310530 (0.0054) [2024-06-22 23:15:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5087821824. Throughput: 0: 42591.1. Samples: 5087924080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:15:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 23:15:14,911][15401] Updated weights for policy 0, policy_version 310540 (0.0032) [2024-06-22 23:15:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.1, 300 sec: 42598.8). Total num frames: 5088018432. Throughput: 0: 42908.0. Samples: 5088191460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-22 23:15:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:15:19,017][15401] Updated weights for policy 0, policy_version 310550 (0.0050) [2024-06-22 23:15:22,532][15401] Updated weights for policy 0, policy_version 310560 (0.0050) [2024-06-22 23:15:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5088231424. Throughput: 0: 42771.2. Samples: 5088315240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:23,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-22 23:15:26,720][15401] Updated weights for policy 0, policy_version 310570 (0.0036) [2024-06-22 23:15:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5088460800. Throughput: 0: 42855.2. Samples: 5088574020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:28,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 23:15:30,424][15401] Updated weights for policy 0, policy_version 310580 (0.0036) [2024-06-22 23:15:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5088641024. Throughput: 0: 42966.1. Samples: 5088839200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 23:15:34,282][15401] Updated weights for policy 0, policy_version 310590 (0.0040) [2024-06-22 23:15:37,889][15401] Updated weights for policy 0, policy_version 310600 (0.0030) [2024-06-22 23:15:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5088870400. Throughput: 0: 42849.0. Samples: 5088957860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-22 23:15:41,770][15401] Updated weights for policy 0, policy_version 310610 (0.0032) [2024-06-22 23:15:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5089099776. Throughput: 0: 42948.8. Samples: 5089217320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:43,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 23:15:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310614_5089099776.pth... [2024-06-22 23:15:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000309990_5078876160.pth [2024-06-22 23:15:45,685][15401] Updated weights for policy 0, policy_version 310620 (0.0040) [2024-06-22 23:15:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5089296384. Throughput: 0: 43017.8. Samples: 5089483900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 23:15:49,468][15401] Updated weights for policy 0, policy_version 310630 (0.0033) [2024-06-22 23:15:53,230][15401] Updated weights for policy 0, policy_version 310640 (0.0045) [2024-06-22 23:15:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5089525760. Throughput: 0: 42957.2. Samples: 5089606020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 23:15:57,136][15401] Updated weights for policy 0, policy_version 310650 (0.0035) [2024-06-22 23:15:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5089738752. Throughput: 0: 43211.6. Samples: 5089868600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:15:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 23:16:00,855][15401] Updated weights for policy 0, policy_version 310660 (0.0026) [2024-06-22 23:16:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5089935360. Throughput: 0: 42999.5. Samples: 5090126440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-22 23:16:04,829][15401] Updated weights for policy 0, policy_version 310670 (0.0041) [2024-06-22 23:16:08,297][15401] Updated weights for policy 0, policy_version 310680 (0.0043) [2024-06-22 23:16:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5090181120. Throughput: 0: 43040.1. Samples: 5090252040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 23:16:12,354][15401] Updated weights for policy 0, policy_version 310690 (0.0028) [2024-06-22 23:16:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 5090377728. Throughput: 0: 43208.0. Samples: 5090518480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:13,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 23:16:15,688][15401] Updated weights for policy 0, policy_version 310700 (0.0033) [2024-06-22 23:16:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 5090590720. Throughput: 0: 42876.5. Samples: 5090768740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:18,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 23:16:20,040][15401] Updated weights for policy 0, policy_version 310710 (0.0039) [2024-06-22 23:16:23,346][15401] Updated weights for policy 0, policy_version 310720 (0.0033) [2024-06-22 23:16:23,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5090836480. Throughput: 0: 43185.7. Samples: 5090901220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 23:16:27,687][15401] Updated weights for policy 0, policy_version 310730 (0.0029) [2024-06-22 23:16:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5091016704. Throughput: 0: 43164.8. Samples: 5091159740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 23:16:30,638][15349] Signal inference workers to stop experience collection... (75300 times) [2024-06-22 23:16:30,640][15349] Signal inference workers to resume experience collection... (75300 times) [2024-06-22 23:16:30,668][15401] InferenceWorker_p0-w0: stopping experience collection (75300 times) [2024-06-22 23:16:30,669][15401] InferenceWorker_p0-w0: resuming experience collection (75300 times) [2024-06-22 23:16:30,785][15401] Updated weights for policy 0, policy_version 310740 (0.0034) [2024-06-22 23:16:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5091246080. Throughput: 0: 42857.8. Samples: 5091412500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:16:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 23:16:35,269][15401] Updated weights for policy 0, policy_version 310750 (0.0027) [2024-06-22 23:16:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5091475456. Throughput: 0: 43131.3. Samples: 5091546920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:16:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 23:16:38,477][15401] Updated weights for policy 0, policy_version 310760 (0.0038) [2024-06-22 23:16:42,917][15401] Updated weights for policy 0, policy_version 310770 (0.0031) [2024-06-22 23:16:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5091672064. Throughput: 0: 43017.4. Samples: 5091804380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:16:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 23:16:46,275][15401] Updated weights for policy 0, policy_version 310780 (0.0024) [2024-06-22 23:16:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5091901440. Throughput: 0: 42940.8. Samples: 5092058780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:16:48,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-22 23:16:50,474][15401] Updated weights for policy 0, policy_version 310790 (0.0028) [2024-06-22 23:16:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 5092114432. Throughput: 0: 43153.7. Samples: 5092194060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:16:53,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 23:16:53,764][15401] Updated weights for policy 0, policy_version 310800 (0.0038) [2024-06-22 23:16:58,085][15401] Updated weights for policy 0, policy_version 310810 (0.0036) [2024-06-22 23:16:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5092311040. Throughput: 0: 42892.5. Samples: 5092448540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:16:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-22 23:17:01,523][15401] Updated weights for policy 0, policy_version 310820 (0.0022) [2024-06-22 23:17:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5092540416. Throughput: 0: 42830.3. Samples: 5092696000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-22 23:17:05,589][15401] Updated weights for policy 0, policy_version 310830 (0.0036) [2024-06-22 23:17:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5092737024. Throughput: 0: 42888.9. Samples: 5092831220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:08,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 23:17:09,058][15401] Updated weights for policy 0, policy_version 310840 (0.0045) [2024-06-22 23:17:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 5092950016. Throughput: 0: 42786.2. Samples: 5093085120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 23:17:13,579][15401] Updated weights for policy 0, policy_version 310850 (0.0042) [2024-06-22 23:17:16,895][15401] Updated weights for policy 0, policy_version 310860 (0.0034) [2024-06-22 23:17:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 5093195776. Throughput: 0: 42751.5. Samples: 5093336320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 23:17:21,181][15401] Updated weights for policy 0, policy_version 310870 (0.0040) [2024-06-22 23:17:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5093376000. Throughput: 0: 42891.4. Samples: 5093477040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 23:17:24,569][15401] Updated weights for policy 0, policy_version 310880 (0.0027) [2024-06-22 23:17:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5093588992. Throughput: 0: 42876.9. Samples: 5093733840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-22 23:17:28,725][15401] Updated weights for policy 0, policy_version 310890 (0.0035) [2024-06-22 23:17:32,091][15401] Updated weights for policy 0, policy_version 310900 (0.0036) [2024-06-22 23:17:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5093834752. Throughput: 0: 42717.3. Samples: 5093981060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-22 23:17:36,238][15401] Updated weights for policy 0, policy_version 310910 (0.0036) [2024-06-22 23:17:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 5094014976. Throughput: 0: 42796.1. Samples: 5094119780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 23:17:39,314][15349] Signal inference workers to stop experience collection... (75350 times) [2024-06-22 23:17:39,315][15349] Signal inference workers to resume experience collection... (75350 times) [2024-06-22 23:17:39,333][15401] InferenceWorker_p0-w0: stopping experience collection (75350 times) [2024-06-22 23:17:39,333][15401] InferenceWorker_p0-w0: resuming experience collection (75350 times) [2024-06-22 23:17:39,609][15401] Updated weights for policy 0, policy_version 310920 (0.0040) [2024-06-22 23:17:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5094244352. Throughput: 0: 42756.3. Samples: 5094372580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-22 23:17:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 23:17:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310928_5094244352.pth... [2024-06-22 23:17:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310301_5083971584.pth [2024-06-22 23:17:44,195][15401] Updated weights for policy 0, policy_version 310930 (0.0029) [2024-06-22 23:17:47,215][15401] Updated weights for policy 0, policy_version 310940 (0.0038) [2024-06-22 23:17:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5094490112. Throughput: 0: 43025.0. Samples: 5094632120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:17:48,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-22 23:17:51,707][15401] Updated weights for policy 0, policy_version 310950 (0.0031) [2024-06-22 23:17:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5094670336. Throughput: 0: 43061.9. Samples: 5094769000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:17:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 23:17:54,878][15401] Updated weights for policy 0, policy_version 310960 (0.0023) [2024-06-22 23:17:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5094899712. Throughput: 0: 42929.9. Samples: 5095016960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:17:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 23:17:59,226][15401] Updated weights for policy 0, policy_version 310970 (0.0028) [2024-06-22 23:18:02,567][15401] Updated weights for policy 0, policy_version 310980 (0.0040) [2024-06-22 23:18:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5095112704. Throughput: 0: 43079.1. Samples: 5095274880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 23:18:06,882][15401] Updated weights for policy 0, policy_version 310990 (0.0039) [2024-06-22 23:18:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 5095325696. Throughput: 0: 42868.9. Samples: 5095406140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-22 23:18:10,460][15401] Updated weights for policy 0, policy_version 311000 (0.0038) [2024-06-22 23:18:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 5095555072. Throughput: 0: 42817.2. Samples: 5095660620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 23:18:14,582][15401] Updated weights for policy 0, policy_version 311010 (0.0038) [2024-06-22 23:18:18,006][15401] Updated weights for policy 0, policy_version 311020 (0.0026) [2024-06-22 23:18:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5095751680. Throughput: 0: 42969.9. Samples: 5095914700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 23:18:22,505][15401] Updated weights for policy 0, policy_version 311030 (0.0021) [2024-06-22 23:18:23,396][15132] Fps is (10 sec: 40934.3, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 5095964672. Throughput: 0: 42736.9. Samples: 5096043220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:23,397][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 23:18:25,938][15401] Updated weights for policy 0, policy_version 311040 (0.0041) [2024-06-22 23:18:28,396][15132] Fps is (10 sec: 42570.9, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 5096177664. Throughput: 0: 42778.9. Samples: 5096297900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:28,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-22 23:18:29,964][15401] Updated weights for policy 0, policy_version 311050 (0.0023) [2024-06-22 23:18:33,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5096374272. Throughput: 0: 42863.5. Samples: 5096560980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-22 23:18:33,610][15401] Updated weights for policy 0, policy_version 311060 (0.0031) [2024-06-22 23:18:37,414][15401] Updated weights for policy 0, policy_version 311070 (0.0027) [2024-06-22 23:18:38,390][15132] Fps is (10 sec: 42625.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 5096603648. Throughput: 0: 42610.1. Samples: 5096686460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-22 23:18:41,188][15401] Updated weights for policy 0, policy_version 311080 (0.0036) [2024-06-22 23:18:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5096833024. Throughput: 0: 42822.6. Samples: 5096943980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:18:44,831][15401] Updated weights for policy 0, policy_version 311090 (0.0033) [2024-06-22 23:18:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5097029632. Throughput: 0: 42832.1. Samples: 5097202320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:48,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-22 23:18:48,764][15401] Updated weights for policy 0, policy_version 311100 (0.0033) [2024-06-22 23:18:52,590][15401] Updated weights for policy 0, policy_version 311110 (0.0031) [2024-06-22 23:18:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5097259008. Throughput: 0: 42774.0. Samples: 5097330960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-22 23:18:56,331][15401] Updated weights for policy 0, policy_version 311120 (0.0047) [2024-06-22 23:18:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5097472000. Throughput: 0: 42888.5. Samples: 5097590600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-22 23:18:58,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-22 23:19:00,149][15401] Updated weights for policy 0, policy_version 311130 (0.0042) [2024-06-22 23:19:02,473][15349] Signal inference workers to stop experience collection... (75400 times) [2024-06-22 23:19:02,473][15349] Signal inference workers to resume experience collection... (75400 times) [2024-06-22 23:19:02,515][15401] InferenceWorker_p0-w0: stopping experience collection (75400 times) [2024-06-22 23:19:02,515][15401] InferenceWorker_p0-w0: resuming experience collection (75400 times) [2024-06-22 23:19:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5097668608. Throughput: 0: 42936.9. Samples: 5097846860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-22 23:19:04,126][15401] Updated weights for policy 0, policy_version 311140 (0.0045) [2024-06-22 23:19:07,836][15401] Updated weights for policy 0, policy_version 311150 (0.0027) [2024-06-22 23:19:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5097897984. Throughput: 0: 42797.3. Samples: 5097968820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 23:19:11,588][15401] Updated weights for policy 0, policy_version 311160 (0.0032) [2024-06-22 23:19:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 5098110976. Throughput: 0: 42867.4. Samples: 5098226660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 23:19:15,512][15401] Updated weights for policy 0, policy_version 311170 (0.0034) [2024-06-22 23:19:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5098323968. Throughput: 0: 42909.3. Samples: 5098491900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 23:19:19,049][15401] Updated weights for policy 0, policy_version 311180 (0.0031) [2024-06-22 23:19:23,139][15401] Updated weights for policy 0, policy_version 311190 (0.0036) [2024-06-22 23:19:23,394][15132] Fps is (10 sec: 44218.7, 60 sec: 43146.2, 300 sec: 42931.0). Total num frames: 5098553344. Throughput: 0: 42907.7. Samples: 5098617480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:23,394][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 23:19:26,835][15401] Updated weights for policy 0, policy_version 311200 (0.0030) [2024-06-22 23:19:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43149.1, 300 sec: 42931.6). Total num frames: 5098766336. Throughput: 0: 42836.9. Samples: 5098871640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:28,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 23:19:30,605][15401] Updated weights for policy 0, policy_version 311210 (0.0040) [2024-06-22 23:19:33,389][15132] Fps is (10 sec: 42616.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5098979328. Throughput: 0: 42872.0. Samples: 5099131560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 23:19:34,514][15401] Updated weights for policy 0, policy_version 311220 (0.0037) [2024-06-22 23:19:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5099192320. Throughput: 0: 42890.2. Samples: 5099261020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-22 23:19:38,397][15401] Updated weights for policy 0, policy_version 311230 (0.0040) [2024-06-22 23:19:41,982][15401] Updated weights for policy 0, policy_version 311240 (0.0032) [2024-06-22 23:19:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5099405312. Throughput: 0: 42894.3. Samples: 5099520840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-22 23:19:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311244_5099421696.pth... [2024-06-22 23:19:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310614_5089099776.pth [2024-06-22 23:19:45,884][15401] Updated weights for policy 0, policy_version 311250 (0.0040) [2024-06-22 23:19:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5099618304. Throughput: 0: 42943.4. Samples: 5099779320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-22 23:19:49,480][15401] Updated weights for policy 0, policy_version 311260 (0.0030) [2024-06-22 23:19:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5099831296. Throughput: 0: 42989.2. Samples: 5099903340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 23:19:53,685][15401] Updated weights for policy 0, policy_version 311270 (0.0032) [2024-06-22 23:19:56,957][15401] Updated weights for policy 0, policy_version 311280 (0.0026) [2024-06-22 23:19:58,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 5100027904. Throughput: 0: 42846.2. Samples: 5100154840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:19:58,393][15132] Avg episode reward: [(0, '0.305')] [2024-06-22 23:20:01,379][15401] Updated weights for policy 0, policy_version 311290 (0.0034) [2024-06-22 23:20:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5100257280. Throughput: 0: 42770.7. Samples: 5100416580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:20:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-22 23:20:04,664][15401] Updated weights for policy 0, policy_version 311300 (0.0030) [2024-06-22 23:20:08,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5100470272. Throughput: 0: 42814.2. Samples: 5100543940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:20:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 23:20:08,956][15401] Updated weights for policy 0, policy_version 311310 (0.0040) [2024-06-22 23:20:12,247][15401] Updated weights for policy 0, policy_version 311320 (0.0023) [2024-06-22 23:20:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5100666880. Throughput: 0: 42632.4. Samples: 5100790100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-22 23:20:13,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-22 23:20:16,621][15401] Updated weights for policy 0, policy_version 311330 (0.0037) [2024-06-22 23:20:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5100879872. Throughput: 0: 42747.5. Samples: 5101055200. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:18,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-22 23:20:20,279][15401] Updated weights for policy 0, policy_version 311340 (0.0039) [2024-06-22 23:20:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42328.3, 300 sec: 42820.6). Total num frames: 5101092864. Throughput: 0: 42639.1. Samples: 5101179780. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-22 23:20:24,022][15349] Signal inference workers to stop experience collection... (75450 times) [2024-06-22 23:20:24,070][15401] InferenceWorker_p0-w0: stopping experience collection (75450 times) [2024-06-22 23:20:24,082][15349] Signal inference workers to resume experience collection... (75450 times) [2024-06-22 23:20:24,089][15401] InferenceWorker_p0-w0: resuming experience collection (75450 times) [2024-06-22 23:20:24,223][15401] Updated weights for policy 0, policy_version 311350 (0.0038) [2024-06-22 23:20:28,100][15401] Updated weights for policy 0, policy_version 311360 (0.0033) [2024-06-22 23:20:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5101322240. Throughput: 0: 42471.9. Samples: 5101432080. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 23:20:31,746][15401] Updated weights for policy 0, policy_version 311370 (0.0021) [2024-06-22 23:20:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5101518848. Throughput: 0: 42681.1. Samples: 5101699960. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 23:20:35,612][15401] Updated weights for policy 0, policy_version 311380 (0.0042) [2024-06-22 23:20:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5101731840. Throughput: 0: 42635.7. Samples: 5101821940. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:38,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-22 23:20:39,859][15401] Updated weights for policy 0, policy_version 311390 (0.0041) [2024-06-22 23:20:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5101961216. Throughput: 0: 42676.4. Samples: 5102075180. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:20:43,805][15401] Updated weights for policy 0, policy_version 311400 (0.0026) [2024-06-22 23:20:47,481][15401] Updated weights for policy 0, policy_version 311410 (0.0051) [2024-06-22 23:20:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5102157824. Throughput: 0: 42637.4. Samples: 5102335260. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-22 23:20:51,418][15401] Updated weights for policy 0, policy_version 311420 (0.0038) [2024-06-22 23:20:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5102370816. Throughput: 0: 42584.9. Samples: 5102460260. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-22 23:20:54,902][15401] Updated weights for policy 0, policy_version 311430 (0.0028) [2024-06-22 23:20:58,396][15132] Fps is (10 sec: 45845.6, 60 sec: 43141.6, 300 sec: 42986.2). Total num frames: 5102616576. Throughput: 0: 42827.2. Samples: 5102717600. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:20:58,397][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 23:20:59,090][15401] Updated weights for policy 0, policy_version 311440 (0.0030) [2024-06-22 23:21:02,431][15401] Updated weights for policy 0, policy_version 311450 (0.0033) [2024-06-22 23:21:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5102796800. Throughput: 0: 42714.6. Samples: 5102977360. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-22 23:21:06,601][15401] Updated weights for policy 0, policy_version 311460 (0.0039) [2024-06-22 23:21:08,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 5103026176. Throughput: 0: 42739.6. Samples: 5103103060. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 23:21:10,373][15401] Updated weights for policy 0, policy_version 311470 (0.0027) [2024-06-22 23:21:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 5103239168. Throughput: 0: 42895.2. Samples: 5103362360. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:21:14,060][15401] Updated weights for policy 0, policy_version 311480 (0.0033) [2024-06-22 23:21:17,801][15401] Updated weights for policy 0, policy_version 311490 (0.0040) [2024-06-22 23:21:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5103452160. Throughput: 0: 42515.9. Samples: 5103613180. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 23:21:21,635][15401] Updated weights for policy 0, policy_version 311500 (0.0033) [2024-06-22 23:21:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5103665152. Throughput: 0: 42743.6. Samples: 5103745400. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-22 23:21:25,563][15401] Updated weights for policy 0, policy_version 311510 (0.0029) [2024-06-22 23:21:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5103861760. Throughput: 0: 42745.1. Samples: 5103998700. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-22 23:21:28,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 23:21:29,426][15401] Updated weights for policy 0, policy_version 311520 (0.0044) [2024-06-22 23:21:33,138][15401] Updated weights for policy 0, policy_version 311530 (0.0030) [2024-06-22 23:21:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5104107520. Throughput: 0: 42547.1. Samples: 5104249880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:33,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-22 23:21:37,454][15401] Updated weights for policy 0, policy_version 311540 (0.0035) [2024-06-22 23:21:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5104287744. Throughput: 0: 42751.1. Samples: 5104384060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:21:40,757][15401] Updated weights for policy 0, policy_version 311550 (0.0029) [2024-06-22 23:21:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 5104517120. Throughput: 0: 42673.3. Samples: 5104637620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 23:21:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311555_5104517120.pth... [2024-06-22 23:21:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000310928_5094244352.pth [2024-06-22 23:21:45,050][15401] Updated weights for policy 0, policy_version 311560 (0.0026) [2024-06-22 23:21:45,800][15349] Signal inference workers to stop experience collection... (75500 times) [2024-06-22 23:21:45,800][15349] Signal inference workers to resume experience collection... (75500 times) [2024-06-22 23:21:45,829][15401] InferenceWorker_p0-w0: stopping experience collection (75500 times) [2024-06-22 23:21:45,829][15401] InferenceWorker_p0-w0: resuming experience collection (75500 times) [2024-06-22 23:21:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5104746496. Throughput: 0: 42487.7. Samples: 5104889300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-22 23:21:48,477][15401] Updated weights for policy 0, policy_version 311570 (0.0034) [2024-06-22 23:21:52,817][15401] Updated weights for policy 0, policy_version 311580 (0.0044) [2024-06-22 23:21:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5104926720. Throughput: 0: 42717.3. Samples: 5105025340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-22 23:21:56,132][15401] Updated weights for policy 0, policy_version 311590 (0.0041) [2024-06-22 23:21:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42056.7, 300 sec: 42709.5). Total num frames: 5105139712. Throughput: 0: 42416.7. Samples: 5105271120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:21:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-22 23:22:01,103][15401] Updated weights for policy 0, policy_version 311600 (0.0029) [2024-06-22 23:22:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 5105369088. Throughput: 0: 42495.5. Samples: 5105525580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:03,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-22 23:22:04,054][15401] Updated weights for policy 0, policy_version 311610 (0.0031) [2024-06-22 23:22:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5105549312. Throughput: 0: 42480.8. Samples: 5105657040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-22 23:22:08,744][15401] Updated weights for policy 0, policy_version 311620 (0.0030) [2024-06-22 23:22:11,618][15401] Updated weights for policy 0, policy_version 311630 (0.0033) [2024-06-22 23:22:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5105778688. Throughput: 0: 42297.8. Samples: 5105902100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-22 23:22:16,848][15401] Updated weights for policy 0, policy_version 311640 (0.0034) [2024-06-22 23:22:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5106008064. Throughput: 0: 42448.9. Samples: 5106160080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-22 23:22:19,497][15401] Updated weights for policy 0, policy_version 311650 (0.0030) [2024-06-22 23:22:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 5106188288. Throughput: 0: 42336.7. Samples: 5106289220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-22 23:22:24,427][15401] Updated weights for policy 0, policy_version 311660 (0.0031) [2024-06-22 23:22:27,103][15401] Updated weights for policy 0, policy_version 311670 (0.0026) [2024-06-22 23:22:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5106417664. Throughput: 0: 42274.2. Samples: 5106539960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-22 23:22:32,006][15401] Updated weights for policy 0, policy_version 311680 (0.0034) [2024-06-22 23:22:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 5106647040. Throughput: 0: 42529.7. Samples: 5106803140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-22 23:22:34,766][15401] Updated weights for policy 0, policy_version 311690 (0.0028) [2024-06-22 23:22:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5106843648. Throughput: 0: 42227.4. Samples: 5106925580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-22 23:22:39,621][15401] Updated weights for policy 0, policy_version 311700 (0.0044) [2024-06-22 23:22:42,444][15401] Updated weights for policy 0, policy_version 311710 (0.0024) [2024-06-22 23:22:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5107073024. Throughput: 0: 42353.9. Samples: 5107177040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-22 23:22:43,400][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 23:22:47,340][15401] Updated weights for policy 0, policy_version 311720 (0.0045) [2024-06-22 23:22:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5107269632. Throughput: 0: 42616.5. Samples: 5107443220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:22:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 23:22:50,153][15401] Updated weights for policy 0, policy_version 311730 (0.0043) [2024-06-22 23:22:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5107482624. Throughput: 0: 42404.0. Samples: 5107565220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:22:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 23:22:54,856][15401] Updated weights for policy 0, policy_version 311740 (0.0036) [2024-06-22 23:22:57,893][15401] Updated weights for policy 0, policy_version 311750 (0.0023) [2024-06-22 23:22:58,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 5107712000. Throughput: 0: 42650.7. Samples: 5107821480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:22:58,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 23:23:02,695][15401] Updated weights for policy 0, policy_version 311760 (0.0029) [2024-06-22 23:23:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42052.2, 300 sec: 42598.1). Total num frames: 5107892224. Throughput: 0: 42726.2. Samples: 5108082860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:03,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-22 23:23:03,514][15349] Signal inference workers to stop experience collection... (75550 times) [2024-06-22 23:23:03,519][15349] Signal inference workers to resume experience collection... (75550 times) [2024-06-22 23:23:03,540][15401] InferenceWorker_p0-w0: stopping experience collection (75550 times) [2024-06-22 23:23:03,540][15401] InferenceWorker_p0-w0: resuming experience collection (75550 times) [2024-06-22 23:23:05,547][15401] Updated weights for policy 0, policy_version 311770 (0.0036) [2024-06-22 23:23:08,389][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5108137984. Throughput: 0: 42509.0. Samples: 5108202120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:08,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-22 23:23:10,522][15401] Updated weights for policy 0, policy_version 311780 (0.0031) [2024-06-22 23:23:13,228][15401] Updated weights for policy 0, policy_version 311790 (0.0038) [2024-06-22 23:23:13,390][15132] Fps is (10 sec: 47525.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5108367360. Throughput: 0: 42624.4. Samples: 5108458060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 23:23:18,187][15401] Updated weights for policy 0, policy_version 311800 (0.0048) [2024-06-22 23:23:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42599.3). Total num frames: 5108531200. Throughput: 0: 42593.8. Samples: 5108719860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:18,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-22 23:23:21,162][15401] Updated weights for policy 0, policy_version 311810 (0.0032) [2024-06-22 23:23:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42654.9). Total num frames: 5108760576. Throughput: 0: 42526.4. Samples: 5108839260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:23,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-22 23:23:26,021][15401] Updated weights for policy 0, policy_version 311820 (0.0037) [2024-06-22 23:23:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5109006336. Throughput: 0: 42625.4. Samples: 5109095180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 23:23:28,714][15401] Updated weights for policy 0, policy_version 311830 (0.0034) [2024-06-22 23:23:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 5109153792. Throughput: 0: 42495.6. Samples: 5109355520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-22 23:23:33,791][15401] Updated weights for policy 0, policy_version 311840 (0.0034) [2024-06-22 23:23:36,202][15401] Updated weights for policy 0, policy_version 311850 (0.0027) [2024-06-22 23:23:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5109399552. Throughput: 0: 42349.3. Samples: 5109470940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 23:23:41,448][15401] Updated weights for policy 0, policy_version 311860 (0.0040) [2024-06-22 23:23:43,390][15132] Fps is (10 sec: 49150.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5109645312. Throughput: 0: 42526.2. Samples: 5109735080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:43,391][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 23:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311868_5109645312.pth... [2024-06-22 23:23:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311244_5099421696.pth [2024-06-22 23:23:43,779][15401] Updated weights for policy 0, policy_version 311870 (0.0034) [2024-06-22 23:23:48,390][15132] Fps is (10 sec: 39319.4, 60 sec: 42051.9, 300 sec: 42487.2). Total num frames: 5109792768. Throughput: 0: 42505.7. Samples: 5109995540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:48,391][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 23:23:49,142][15401] Updated weights for policy 0, policy_version 311880 (0.0038) [2024-06-22 23:23:51,496][15401] Updated weights for policy 0, policy_version 311890 (0.0044) [2024-06-22 23:23:53,390][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5110054912. Throughput: 0: 42475.5. Samples: 5110113520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-22 23:23:56,673][15401] Updated weights for policy 0, policy_version 311900 (0.0028) [2024-06-22 23:23:58,390][15132] Fps is (10 sec: 47515.3, 60 sec: 42599.8, 300 sec: 42709.4). Total num frames: 5110267904. Throughput: 0: 42592.3. Samples: 5110374720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-22 23:23:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 23:23:59,286][15401] Updated weights for policy 0, policy_version 311910 (0.0037) [2024-06-22 23:24:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 5110448128. Throughput: 0: 42413.7. Samples: 5110628480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-22 23:24:04,318][15401] Updated weights for policy 0, policy_version 311920 (0.0032) [2024-06-22 23:24:06,971][15401] Updated weights for policy 0, policy_version 311930 (0.0030) [2024-06-22 23:24:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5110677504. Throughput: 0: 42531.8. Samples: 5110753200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-22 23:24:11,771][15401] Updated weights for policy 0, policy_version 311940 (0.0031) [2024-06-22 23:24:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5110890496. Throughput: 0: 42606.6. Samples: 5111012480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-22 23:24:14,759][15401] Updated weights for policy 0, policy_version 311950 (0.0032) [2024-06-22 23:24:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42487.9). Total num frames: 5111087104. Throughput: 0: 42682.5. Samples: 5111276240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 23:24:19,320][15401] Updated weights for policy 0, policy_version 311960 (0.0054) [2024-06-22 23:24:22,897][15401] Updated weights for policy 0, policy_version 311970 (0.0026) [2024-06-22 23:24:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 5111332864. Throughput: 0: 42880.0. Samples: 5111400640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:23,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 23:24:26,782][15349] Signal inference workers to stop experience collection... (75600 times) [2024-06-22 23:24:26,839][15401] InferenceWorker_p0-w0: stopping experience collection (75600 times) [2024-06-22 23:24:26,845][15349] Signal inference workers to resume experience collection... (75600 times) [2024-06-22 23:24:26,856][15401] InferenceWorker_p0-w0: resuming experience collection (75600 times) [2024-06-22 23:24:26,980][15401] Updated weights for policy 0, policy_version 311980 (0.0036) [2024-06-22 23:24:28,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5111545856. Throughput: 0: 42736.3. Samples: 5111658200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:28,395][15132] Avg episode reward: [(0, '0.766')] [2024-06-22 23:24:30,377][15401] Updated weights for policy 0, policy_version 311990 (0.0029) [2024-06-22 23:24:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5111742464. Throughput: 0: 42642.8. Samples: 5111914440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 23:24:34,492][15401] Updated weights for policy 0, policy_version 312000 (0.0037) [2024-06-22 23:24:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5111955456. Throughput: 0: 42809.0. Samples: 5112039920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 23:24:38,517][15401] Updated weights for policy 0, policy_version 312010 (0.0042) [2024-06-22 23:24:42,153][15401] Updated weights for policy 0, policy_version 312020 (0.0038) [2024-06-22 23:24:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 5112184832. Throughput: 0: 42810.3. Samples: 5112301280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:43,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-22 23:24:46,163][15401] Updated weights for policy 0, policy_version 312030 (0.0025) [2024-06-22 23:24:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43145.0, 300 sec: 42542.9). Total num frames: 5112381440. Throughput: 0: 42963.3. Samples: 5112561820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 23:24:49,708][15401] Updated weights for policy 0, policy_version 312040 (0.0035) [2024-06-22 23:24:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 5112594432. Throughput: 0: 42959.7. Samples: 5112686380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 23:24:53,702][15401] Updated weights for policy 0, policy_version 312050 (0.0031) [2024-06-22 23:24:57,456][15401] Updated weights for policy 0, policy_version 312060 (0.0028) [2024-06-22 23:24:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 5112807424. Throughput: 0: 42845.0. Samples: 5112940500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:24:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 23:25:01,340][15401] Updated weights for policy 0, policy_version 312070 (0.0035) [2024-06-22 23:25:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 5113020416. Throughput: 0: 42823.6. Samples: 5113203300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:25:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 23:25:05,166][15401] Updated weights for policy 0, policy_version 312080 (0.0029) [2024-06-22 23:25:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 5113249792. Throughput: 0: 42793.5. Samples: 5113326240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:25:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 23:25:09,514][15401] Updated weights for policy 0, policy_version 312090 (0.0031) [2024-06-22 23:25:12,927][15401] Updated weights for policy 0, policy_version 312100 (0.0024) [2024-06-22 23:25:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5113462784. Throughput: 0: 42817.8. Samples: 5113585000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-22 23:25:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-22 23:25:17,035][15401] Updated weights for policy 0, policy_version 312110 (0.0034) [2024-06-22 23:25:18,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5113659392. Throughput: 0: 42937.6. Samples: 5113846640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-22 23:25:20,703][15401] Updated weights for policy 0, policy_version 312120 (0.0031) [2024-06-22 23:25:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5113905152. Throughput: 0: 42885.3. Samples: 5113969760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 23:25:24,790][15401] Updated weights for policy 0, policy_version 312130 (0.0048) [2024-06-22 23:25:28,237][15401] Updated weights for policy 0, policy_version 312140 (0.0038) [2024-06-22 23:25:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5114101760. Throughput: 0: 42597.9. Samples: 5114218080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 23:25:32,450][15401] Updated weights for policy 0, policy_version 312150 (0.0032) [2024-06-22 23:25:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5114281984. Throughput: 0: 42722.6. Samples: 5114484340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 23:25:35,730][15401] Updated weights for policy 0, policy_version 312160 (0.0028) [2024-06-22 23:25:36,380][15349] Signal inference workers to stop experience collection... (75650 times) [2024-06-22 23:25:36,431][15401] InferenceWorker_p0-w0: stopping experience collection (75650 times) [2024-06-22 23:25:36,441][15349] Signal inference workers to resume experience collection... (75650 times) [2024-06-22 23:25:36,450][15401] InferenceWorker_p0-w0: resuming experience collection (75650 times) [2024-06-22 23:25:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5114544128. Throughput: 0: 42742.2. Samples: 5114609780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:38,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-22 23:25:40,364][15401] Updated weights for policy 0, policy_version 312170 (0.0032) [2024-06-22 23:25:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5114740736. Throughput: 0: 42785.2. Samples: 5114865840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-22 23:25:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312179_5114740736.pth... [2024-06-22 23:25:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311555_5104517120.pth [2024-06-22 23:25:43,669][15401] Updated weights for policy 0, policy_version 312180 (0.0019) [2024-06-22 23:25:47,692][15401] Updated weights for policy 0, policy_version 312190 (0.0035) [2024-06-22 23:25:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5114920960. Throughput: 0: 42886.4. Samples: 5115133180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-22 23:25:51,054][15401] Updated weights for policy 0, policy_version 312200 (0.0025) [2024-06-22 23:25:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 5115183104. Throughput: 0: 42784.0. Samples: 5115251520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-22 23:25:55,358][15401] Updated weights for policy 0, policy_version 312210 (0.0043) [2024-06-22 23:25:58,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 5115396096. Throughput: 0: 42758.2. Samples: 5115509220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:25:58,393][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 23:25:58,579][15401] Updated weights for policy 0, policy_version 312220 (0.0034) [2024-06-22 23:26:02,690][15401] Updated weights for policy 0, policy_version 312230 (0.0038) [2024-06-22 23:26:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5115576320. Throughput: 0: 42836.1. Samples: 5115774260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:03,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 23:26:05,999][15401] Updated weights for policy 0, policy_version 312240 (0.0023) [2024-06-22 23:26:08,392][15132] Fps is (10 sec: 44236.5, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 5115838464. Throughput: 0: 42778.1. Samples: 5115894880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:08,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-22 23:26:10,726][15401] Updated weights for policy 0, policy_version 312250 (0.0044) [2024-06-22 23:26:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5116035072. Throughput: 0: 42988.0. Samples: 5116152540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 23:26:13,828][15401] Updated weights for policy 0, policy_version 312260 (0.0036) [2024-06-22 23:26:18,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 5116215296. Throughput: 0: 43040.5. Samples: 5116421160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 23:26:18,601][15401] Updated weights for policy 0, policy_version 312270 (0.0025) [2024-06-22 23:26:21,271][15401] Updated weights for policy 0, policy_version 312280 (0.0040) [2024-06-22 23:26:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5116461056. Throughput: 0: 42976.9. Samples: 5116543740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-22 23:26:26,070][15401] Updated weights for policy 0, policy_version 312290 (0.0030) [2024-06-22 23:26:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5116674048. Throughput: 0: 43021.0. Samples: 5116801780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-22 23:26:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-22 23:26:28,847][15401] Updated weights for policy 0, policy_version 312300 (0.0024) [2024-06-22 23:26:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5116870656. Throughput: 0: 42988.1. Samples: 5117067640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:33,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-22 23:26:33,418][15401] Updated weights for policy 0, policy_version 312310 (0.0030) [2024-06-22 23:26:35,706][15349] Signal inference workers to stop experience collection... (75700 times) [2024-06-22 23:26:35,708][15349] Signal inference workers to resume experience collection... (75700 times) [2024-06-22 23:26:35,760][15401] InferenceWorker_p0-w0: stopping experience collection (75700 times) [2024-06-22 23:26:35,760][15401] InferenceWorker_p0-w0: resuming experience collection (75700 times) [2024-06-22 23:26:36,306][15401] Updated weights for policy 0, policy_version 312320 (0.0046) [2024-06-22 23:26:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5117132800. Throughput: 0: 43136.7. Samples: 5117192680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 23:26:40,917][15401] Updated weights for policy 0, policy_version 312330 (0.0035) [2024-06-22 23:26:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5117329408. Throughput: 0: 43173.0. Samples: 5117451900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-22 23:26:44,376][15401] Updated weights for policy 0, policy_version 312340 (0.0040) [2024-06-22 23:26:48,389][15132] Fps is (10 sec: 39322.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5117526016. Throughput: 0: 43191.7. Samples: 5117717880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 23:26:48,432][15401] Updated weights for policy 0, policy_version 312350 (0.0028) [2024-06-22 23:26:51,782][15401] Updated weights for policy 0, policy_version 312360 (0.0031) [2024-06-22 23:26:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5117788160. Throughput: 0: 43247.3. Samples: 5117840900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-22 23:26:56,145][15401] Updated weights for policy 0, policy_version 312370 (0.0036) [2024-06-22 23:26:58,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5117968384. Throughput: 0: 43326.1. Samples: 5118102320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:26:58,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 23:26:59,226][15401] Updated weights for policy 0, policy_version 312380 (0.0032) [2024-06-22 23:27:03,396][15132] Fps is (10 sec: 39296.3, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 5118181376. Throughput: 0: 43274.2. Samples: 5118368780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:03,397][15132] Avg episode reward: [(0, '0.604')] [2024-06-22 23:27:03,543][15401] Updated weights for policy 0, policy_version 312390 (0.0025) [2024-06-22 23:27:06,610][15401] Updated weights for policy 0, policy_version 312400 (0.0028) [2024-06-22 23:27:08,389][15132] Fps is (10 sec: 47525.1, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 5118443520. Throughput: 0: 43366.2. Samples: 5118495220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 23:27:10,924][15401] Updated weights for policy 0, policy_version 312410 (0.0032) [2024-06-22 23:27:13,392][15132] Fps is (10 sec: 44254.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 5118623744. Throughput: 0: 43313.6. Samples: 5118751000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:13,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 23:27:14,281][15401] Updated weights for policy 0, policy_version 312420 (0.0028) [2024-06-22 23:27:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 5118836736. Throughput: 0: 43095.9. Samples: 5119006960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 23:27:18,486][15401] Updated weights for policy 0, policy_version 312430 (0.0033) [2024-06-22 23:27:22,008][15401] Updated weights for policy 0, policy_version 312440 (0.0034) [2024-06-22 23:27:23,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5119066112. Throughput: 0: 43093.4. Samples: 5119131880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-22 23:27:26,138][15401] Updated weights for policy 0, policy_version 312450 (0.0028) [2024-06-22 23:27:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5119279104. Throughput: 0: 43206.9. Samples: 5119396220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 23:27:29,667][15401] Updated weights for policy 0, policy_version 312460 (0.0036) [2024-06-22 23:27:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 5119492096. Throughput: 0: 42957.2. Samples: 5119650960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 23:27:33,616][15401] Updated weights for policy 0, policy_version 312470 (0.0032) [2024-06-22 23:27:37,299][15401] Updated weights for policy 0, policy_version 312480 (0.0030) [2024-06-22 23:27:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5119705088. Throughput: 0: 43055.1. Samples: 5119778380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 23:27:41,110][15401] Updated weights for policy 0, policy_version 312490 (0.0025) [2024-06-22 23:27:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5119918080. Throughput: 0: 43103.6. Samples: 5120041980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-22 23:27:43,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 23:27:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312495_5119918080.pth... [2024-06-22 23:27:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000311868_5109645312.pth [2024-06-22 23:27:44,868][15401] Updated weights for policy 0, policy_version 312500 (0.0043) [2024-06-22 23:27:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5120131072. Throughput: 0: 42728.2. Samples: 5120291280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:27:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 23:27:49,276][15401] Updated weights for policy 0, policy_version 312510 (0.0033) [2024-06-22 23:27:52,995][15401] Updated weights for policy 0, policy_version 312520 (0.0040) [2024-06-22 23:27:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5120344064. Throughput: 0: 42801.4. Samples: 5120421280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:27:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 23:27:56,779][15401] Updated weights for policy 0, policy_version 312530 (0.0042) [2024-06-22 23:27:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43419.3, 300 sec: 42987.5). Total num frames: 5120573440. Throughput: 0: 42996.8. Samples: 5120685760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:27:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 23:28:00,721][15401] Updated weights for policy 0, policy_version 312540 (0.0022) [2024-06-22 23:28:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 5120786432. Throughput: 0: 42773.6. Samples: 5120931780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-22 23:28:04,362][15401] Updated weights for policy 0, policy_version 312550 (0.0036) [2024-06-22 23:28:08,213][15401] Updated weights for policy 0, policy_version 312560 (0.0053) [2024-06-22 23:28:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5120983040. Throughput: 0: 42869.8. Samples: 5121061020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-22 23:28:11,794][15349] Signal inference workers to stop experience collection... (75750 times) [2024-06-22 23:28:11,801][15349] Signal inference workers to resume experience collection... (75750 times) [2024-06-22 23:28:11,817][15401] InferenceWorker_p0-w0: stopping experience collection (75750 times) [2024-06-22 23:28:11,818][15401] InferenceWorker_p0-w0: resuming experience collection (75750 times) [2024-06-22 23:28:11,947][15401] Updated weights for policy 0, policy_version 312570 (0.0041) [2024-06-22 23:28:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 5121196032. Throughput: 0: 42846.4. Samples: 5121324300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-22 23:28:15,657][15401] Updated weights for policy 0, policy_version 312580 (0.0030) [2024-06-22 23:28:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5121425408. Throughput: 0: 42848.0. Samples: 5121579120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 23:28:19,485][15401] Updated weights for policy 0, policy_version 312590 (0.0026) [2024-06-22 23:28:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5121622016. Throughput: 0: 42838.1. Samples: 5121706100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:23,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 23:28:23,653][15401] Updated weights for policy 0, policy_version 312600 (0.0025) [2024-06-22 23:28:27,122][15401] Updated weights for policy 0, policy_version 312610 (0.0037) [2024-06-22 23:28:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 5121867776. Throughput: 0: 42872.1. Samples: 5121971120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:28,396][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 23:28:31,136][15401] Updated weights for policy 0, policy_version 312620 (0.0027) [2024-06-22 23:28:33,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 5122080768. Throughput: 0: 43000.8. Samples: 5122226420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:33,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 23:28:34,546][15401] Updated weights for policy 0, policy_version 312630 (0.0045) [2024-06-22 23:28:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5122277376. Throughput: 0: 43111.6. Samples: 5122361300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 23:28:38,590][15401] Updated weights for policy 0, policy_version 312640 (0.0034) [2024-06-22 23:28:42,013][15401] Updated weights for policy 0, policy_version 312650 (0.0035) [2024-06-22 23:28:43,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43146.1, 300 sec: 43098.3). Total num frames: 5122506752. Throughput: 0: 42837.7. Samples: 5122613460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 23:28:46,172][15401] Updated weights for policy 0, policy_version 312660 (0.0033) [2024-06-22 23:28:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5122703360. Throughput: 0: 43163.2. Samples: 5122874120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-22 23:28:49,997][15401] Updated weights for policy 0, policy_version 312670 (0.0026) [2024-06-22 23:28:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5122916352. Throughput: 0: 43007.2. Samples: 5122996340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 23:28:53,714][15401] Updated weights for policy 0, policy_version 312680 (0.0036) [2024-06-22 23:28:57,681][15401] Updated weights for policy 0, policy_version 312690 (0.0045) [2024-06-22 23:28:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5123129344. Throughput: 0: 43034.2. Samples: 5123260840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-22 23:28:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:29:01,276][15401] Updated weights for policy 0, policy_version 312700 (0.0037) [2024-06-22 23:29:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42931.7). Total num frames: 5123342336. Throughput: 0: 43027.7. Samples: 5123515360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-22 23:29:05,345][15401] Updated weights for policy 0, policy_version 312710 (0.0026) [2024-06-22 23:29:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5123571712. Throughput: 0: 43081.9. Samples: 5123644780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:29:09,197][15401] Updated weights for policy 0, policy_version 312720 (0.0033) [2024-06-22 23:29:12,994][15401] Updated weights for policy 0, policy_version 312730 (0.0033) [2024-06-22 23:29:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5123784704. Throughput: 0: 42925.4. Samples: 5123902760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-22 23:29:17,229][15401] Updated weights for policy 0, policy_version 312740 (0.0041) [2024-06-22 23:29:18,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42593.8, 300 sec: 42875.5). Total num frames: 5123981312. Throughput: 0: 42802.9. Samples: 5124152720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:18,397][15132] Avg episode reward: [(0, '0.319')] [2024-06-22 23:29:20,766][15401] Updated weights for policy 0, policy_version 312750 (0.0031) [2024-06-22 23:29:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5124194304. Throughput: 0: 42519.6. Samples: 5124274680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 23:29:24,787][15401] Updated weights for policy 0, policy_version 312760 (0.0033) [2024-06-22 23:29:28,390][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5124423680. Throughput: 0: 42674.4. Samples: 5124533800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:28,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-22 23:29:28,394][15401] Updated weights for policy 0, policy_version 312770 (0.0025) [2024-06-22 23:29:32,278][15401] Updated weights for policy 0, policy_version 312780 (0.0035) [2024-06-22 23:29:33,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42052.3, 300 sec: 42875.7). Total num frames: 5124603904. Throughput: 0: 42598.2. Samples: 5124791140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:33,392][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 23:29:36,038][15401] Updated weights for policy 0, policy_version 312790 (0.0035) [2024-06-22 23:29:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 5124849664. Throughput: 0: 42668.8. Samples: 5124916440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 23:29:39,792][15401] Updated weights for policy 0, policy_version 312800 (0.0039) [2024-06-22 23:29:43,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 5125029888. Throughput: 0: 42550.9. Samples: 5125175640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:43,391][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 23:29:43,608][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312809_5125062656.pth... [2024-06-22 23:29:43,668][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312179_5114740736.pth [2024-06-22 23:29:43,810][15401] Updated weights for policy 0, policy_version 312810 (0.0038) [2024-06-22 23:29:44,343][15349] Signal inference workers to stop experience collection... (75800 times) [2024-06-22 23:29:44,375][15401] InferenceWorker_p0-w0: stopping experience collection (75800 times) [2024-06-22 23:29:44,397][15349] Signal inference workers to resume experience collection... (75800 times) [2024-06-22 23:29:44,398][15401] InferenceWorker_p0-w0: resuming experience collection (75800 times) [2024-06-22 23:29:47,536][15401] Updated weights for policy 0, policy_version 312820 (0.0028) [2024-06-22 23:29:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5125259264. Throughput: 0: 42558.9. Samples: 5125430520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 23:29:51,391][15401] Updated weights for policy 0, policy_version 312830 (0.0032) [2024-06-22 23:29:53,390][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5125488640. Throughput: 0: 42469.3. Samples: 5125555900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-22 23:29:55,312][15401] Updated weights for policy 0, policy_version 312840 (0.0039) [2024-06-22 23:29:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5125668864. Throughput: 0: 42451.5. Samples: 5125813080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:29:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-22 23:29:59,133][15401] Updated weights for policy 0, policy_version 312850 (0.0035) [2024-06-22 23:30:03,244][15401] Updated weights for policy 0, policy_version 312860 (0.0032) [2024-06-22 23:30:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5125914624. Throughput: 0: 42655.0. Samples: 5126071920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:30:03,391][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 23:30:06,830][15401] Updated weights for policy 0, policy_version 312870 (0.0029) [2024-06-22 23:30:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5126127616. Throughput: 0: 42777.7. Samples: 5126199680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-22 23:30:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-22 23:30:10,899][15401] Updated weights for policy 0, policy_version 312880 (0.0047) [2024-06-22 23:30:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 5126307840. Throughput: 0: 42541.3. Samples: 5126448160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 23:30:14,673][15401] Updated weights for policy 0, policy_version 312890 (0.0036) [2024-06-22 23:30:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 5126520832. Throughput: 0: 42446.7. Samples: 5126701140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 23:30:18,564][15401] Updated weights for policy 0, policy_version 312900 (0.0049) [2024-06-22 23:30:22,366][15401] Updated weights for policy 0, policy_version 312910 (0.0032) [2024-06-22 23:30:23,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 5126750208. Throughput: 0: 42587.5. Samples: 5126832980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:23,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 23:30:26,137][15401] Updated weights for policy 0, policy_version 312920 (0.0037) [2024-06-22 23:30:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41777.6, 300 sec: 42875.7). Total num frames: 5126930432. Throughput: 0: 42526.0. Samples: 5127089400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:28,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 23:30:29,909][15401] Updated weights for policy 0, policy_version 312930 (0.0022) [2024-06-22 23:30:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 5127176192. Throughput: 0: 42410.7. Samples: 5127339000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-22 23:30:34,044][15401] Updated weights for policy 0, policy_version 312940 (0.0033) [2024-06-22 23:30:37,768][15401] Updated weights for policy 0, policy_version 312950 (0.0050) [2024-06-22 23:30:38,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 5127389184. Throughput: 0: 42611.6. Samples: 5127473520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:38,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 23:30:42,019][15401] Updated weights for policy 0, policy_version 312960 (0.0031) [2024-06-22 23:30:43,394][15132] Fps is (10 sec: 40942.9, 60 sec: 42595.6, 300 sec: 42931.0). Total num frames: 5127585792. Throughput: 0: 42445.9. Samples: 5127723320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:43,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 23:30:45,406][15401] Updated weights for policy 0, policy_version 312970 (0.0038) [2024-06-22 23:30:48,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5127798784. Throughput: 0: 42256.9. Samples: 5127973480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-22 23:30:49,599][15401] Updated weights for policy 0, policy_version 312980 (0.0031) [2024-06-22 23:30:52,935][15401] Updated weights for policy 0, policy_version 312990 (0.0023) [2024-06-22 23:30:53,389][15132] Fps is (10 sec: 44255.6, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 5128028160. Throughput: 0: 42291.2. Samples: 5128102780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-22 23:30:57,217][15401] Updated weights for policy 0, policy_version 313000 (0.0028) [2024-06-22 23:30:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5128224768. Throughput: 0: 42375.1. Samples: 5128355040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:30:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 23:31:00,283][15349] Signal inference workers to stop experience collection... (75850 times) [2024-06-22 23:31:00,283][15349] Signal inference workers to resume experience collection... (75850 times) [2024-06-22 23:31:00,294][15401] InferenceWorker_p0-w0: stopping experience collection (75850 times) [2024-06-22 23:31:00,311][15401] InferenceWorker_p0-w0: resuming experience collection (75850 times) [2024-06-22 23:31:00,436][15401] Updated weights for policy 0, policy_version 313010 (0.0039) [2024-06-22 23:31:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 5128454144. Throughput: 0: 42511.6. Samples: 5128614160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:31:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 23:31:04,835][15401] Updated weights for policy 0, policy_version 313020 (0.0030) [2024-06-22 23:31:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5128650752. Throughput: 0: 42458.7. Samples: 5128743520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:31:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 23:31:08,714][15401] Updated weights for policy 0, policy_version 313030 (0.0032) [2024-06-22 23:31:12,555][15401] Updated weights for policy 0, policy_version 313040 (0.0045) [2024-06-22 23:31:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5128863744. Throughput: 0: 42383.9. Samples: 5128996580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:31:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 23:31:16,314][15401] Updated weights for policy 0, policy_version 313050 (0.0038) [2024-06-22 23:31:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5129076736. Throughput: 0: 42479.6. Samples: 5129250580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:31:18,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-22 23:31:20,367][15401] Updated weights for policy 0, policy_version 313060 (0.0038) [2024-06-22 23:31:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 5129306112. Throughput: 0: 42325.4. Samples: 5129378060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-22 23:31:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 23:31:24,122][15401] Updated weights for policy 0, policy_version 313070 (0.0046) [2024-06-22 23:31:28,068][15401] Updated weights for policy 0, policy_version 313080 (0.0043) [2024-06-22 23:31:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 5129502720. Throughput: 0: 42328.9. Samples: 5129627940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-22 23:31:31,577][15401] Updated weights for policy 0, policy_version 313090 (0.0038) [2024-06-22 23:31:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5129715712. Throughput: 0: 42589.7. Samples: 5129890020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-22 23:31:35,629][15401] Updated weights for policy 0, policy_version 313100 (0.0037) [2024-06-22 23:31:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 5129945088. Throughput: 0: 42604.3. Samples: 5130019980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-22 23:31:39,219][15401] Updated weights for policy 0, policy_version 313110 (0.0045) [2024-06-22 23:31:43,142][15401] Updated weights for policy 0, policy_version 313120 (0.0038) [2024-06-22 23:31:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42874.4, 300 sec: 42820.5). Total num frames: 5130158080. Throughput: 0: 42608.4. Samples: 5130272420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 23:31:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313120_5130158080.pth... [2024-06-22 23:31:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312495_5119918080.pth [2024-06-22 23:31:46,771][15401] Updated weights for policy 0, policy_version 313130 (0.0032) [2024-06-22 23:31:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5130354688. Throughput: 0: 42539.6. Samples: 5130528440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:31:50,969][15401] Updated weights for policy 0, policy_version 313140 (0.0029) [2024-06-22 23:31:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 5130567680. Throughput: 0: 42421.8. Samples: 5130652500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-22 23:31:54,430][15401] Updated weights for policy 0, policy_version 313150 (0.0040) [2024-06-22 23:31:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 5130797056. Throughput: 0: 42575.3. Samples: 5130912460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:31:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 23:31:58,619][15401] Updated weights for policy 0, policy_version 313160 (0.0041) [2024-06-22 23:32:02,112][15401] Updated weights for policy 0, policy_version 313170 (0.0038) [2024-06-22 23:32:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5131026432. Throughput: 0: 42539.4. Samples: 5131164860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:03,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-22 23:32:06,507][15401] Updated weights for policy 0, policy_version 313180 (0.0041) [2024-06-22 23:32:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5131206656. Throughput: 0: 42662.7. Samples: 5131297880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 23:32:09,624][15401] Updated weights for policy 0, policy_version 313190 (0.0022) [2024-06-22 23:32:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 5131419648. Throughput: 0: 42863.6. Samples: 5131556800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 23:32:14,491][15401] Updated weights for policy 0, policy_version 313200 (0.0037) [2024-06-22 23:32:17,661][15401] Updated weights for policy 0, policy_version 313210 (0.0032) [2024-06-22 23:32:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5131649024. Throughput: 0: 42443.5. Samples: 5131799980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-22 23:32:22,089][15401] Updated weights for policy 0, policy_version 313220 (0.0046) [2024-06-22 23:32:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5131845632. Throughput: 0: 42577.8. Samples: 5131935980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-22 23:32:24,979][15401] Updated weights for policy 0, policy_version 313230 (0.0030) [2024-06-22 23:32:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5132058624. Throughput: 0: 42883.7. Samples: 5132202180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-22 23:32:29,474][15349] Signal inference workers to stop experience collection... (75900 times) [2024-06-22 23:32:29,482][15349] Signal inference workers to resume experience collection... (75900 times) [2024-06-22 23:32:29,529][15401] InferenceWorker_p0-w0: stopping experience collection (75900 times) [2024-06-22 23:32:29,529][15401] InferenceWorker_p0-w0: resuming experience collection (75900 times) [2024-06-22 23:32:29,617][15401] Updated weights for policy 0, policy_version 313240 (0.0034) [2024-06-22 23:32:32,186][15401] Updated weights for policy 0, policy_version 313250 (0.0038) [2024-06-22 23:32:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5132304384. Throughput: 0: 42858.2. Samples: 5132457060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 23:32:37,205][15401] Updated weights for policy 0, policy_version 313260 (0.0035) [2024-06-22 23:32:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5132517376. Throughput: 0: 43129.3. Samples: 5132593320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-22 23:32:38,400][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 23:32:40,302][15401] Updated weights for policy 0, policy_version 313270 (0.0026) [2024-06-22 23:32:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5132713984. Throughput: 0: 42898.1. Samples: 5132842880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:32:43,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-22 23:32:44,704][15401] Updated weights for policy 0, policy_version 313280 (0.0033) [2024-06-22 23:32:48,018][15401] Updated weights for policy 0, policy_version 313290 (0.0040) [2024-06-22 23:32:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5132943360. Throughput: 0: 42991.7. Samples: 5133099480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:32:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 23:32:52,280][15401] Updated weights for policy 0, policy_version 313300 (0.0039) [2024-06-22 23:32:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5133156352. Throughput: 0: 43062.6. Samples: 5133235700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:32:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 23:32:55,921][15401] Updated weights for policy 0, policy_version 313310 (0.0042) [2024-06-22 23:32:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5133352960. Throughput: 0: 42996.0. Samples: 5133491620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:32:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 23:33:00,022][15401] Updated weights for policy 0, policy_version 313320 (0.0034) [2024-06-22 23:33:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5133582336. Throughput: 0: 43153.0. Samples: 5133741860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 23:33:03,657][15401] Updated weights for policy 0, policy_version 313330 (0.0033) [2024-06-22 23:33:07,769][15401] Updated weights for policy 0, policy_version 313340 (0.0039) [2024-06-22 23:33:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5133795328. Throughput: 0: 43141.0. Samples: 5133877320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-22 23:33:11,138][15401] Updated weights for policy 0, policy_version 313350 (0.0038) [2024-06-22 23:33:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5133991936. Throughput: 0: 42911.5. Samples: 5134133200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-22 23:33:15,427][15401] Updated weights for policy 0, policy_version 313360 (0.0024) [2024-06-22 23:33:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5134221312. Throughput: 0: 42677.4. Samples: 5134377540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 23:33:18,691][15401] Updated weights for policy 0, policy_version 313370 (0.0040) [2024-06-22 23:33:23,182][15401] Updated weights for policy 0, policy_version 313380 (0.0033) [2024-06-22 23:33:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5134434304. Throughput: 0: 42607.5. Samples: 5134510660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 23:33:26,255][15401] Updated weights for policy 0, policy_version 313390 (0.0026) [2024-06-22 23:33:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42598.7). Total num frames: 5134647296. Throughput: 0: 42779.0. Samples: 5134767940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-22 23:33:30,792][15401] Updated weights for policy 0, policy_version 313400 (0.0033) [2024-06-22 23:33:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5134876672. Throughput: 0: 42630.2. Samples: 5135017840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 23:33:34,044][15401] Updated weights for policy 0, policy_version 313410 (0.0026) [2024-06-22 23:33:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5135056896. Throughput: 0: 42597.8. Samples: 5135152600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 23:33:38,518][15401] Updated weights for policy 0, policy_version 313420 (0.0037) [2024-06-22 23:33:40,130][15349] Signal inference workers to stop experience collection... (75950 times) [2024-06-22 23:33:40,130][15349] Signal inference workers to resume experience collection... (75950 times) [2024-06-22 23:33:40,157][15401] InferenceWorker_p0-w0: stopping experience collection (75950 times) [2024-06-22 23:33:40,157][15401] InferenceWorker_p0-w0: resuming experience collection (75950 times) [2024-06-22 23:33:41,706][15401] Updated weights for policy 0, policy_version 313430 (0.0026) [2024-06-22 23:33:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5135286272. Throughput: 0: 42549.4. Samples: 5135406360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 23:33:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313433_5135286272.pth... [2024-06-22 23:33:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000312809_5125062656.pth [2024-06-22 23:33:45,962][15401] Updated weights for policy 0, policy_version 313440 (0.0044) [2024-06-22 23:33:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5135499264. Throughput: 0: 42797.0. Samples: 5135667720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 23:33:49,571][15401] Updated weights for policy 0, policy_version 313450 (0.0036) [2024-06-22 23:33:53,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5135712256. Throughput: 0: 42621.3. Samples: 5135795280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-22 23:33:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 23:33:53,553][15401] Updated weights for policy 0, policy_version 313460 (0.0036) [2024-06-22 23:33:57,240][15401] Updated weights for policy 0, policy_version 313470 (0.0024) [2024-06-22 23:33:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5135925248. Throughput: 0: 42634.2. Samples: 5136051740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:33:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-22 23:34:01,405][15401] Updated weights for policy 0, policy_version 313480 (0.0042) [2024-06-22 23:34:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5136154624. Throughput: 0: 42908.4. Samples: 5136308420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 23:34:04,921][15401] Updated weights for policy 0, policy_version 313490 (0.0032) [2024-06-22 23:34:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5136351232. Throughput: 0: 42721.9. Samples: 5136433140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 23:34:08,793][15401] Updated weights for policy 0, policy_version 313500 (0.0025) [2024-06-22 23:34:12,635][15401] Updated weights for policy 0, policy_version 313510 (0.0030) [2024-06-22 23:34:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42766.0). Total num frames: 5136596992. Throughput: 0: 42902.8. Samples: 5136698560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-22 23:34:16,953][15401] Updated weights for policy 0, policy_version 313520 (0.0042) [2024-06-22 23:34:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5136809984. Throughput: 0: 42992.8. Samples: 5136952520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-22 23:34:20,273][15401] Updated weights for policy 0, policy_version 313530 (0.0042) [2024-06-22 23:34:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5137006592. Throughput: 0: 42868.0. Samples: 5137081660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 23:34:24,297][15401] Updated weights for policy 0, policy_version 313540 (0.0031) [2024-06-22 23:34:27,775][15401] Updated weights for policy 0, policy_version 313550 (0.0036) [2024-06-22 23:34:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 5137219584. Throughput: 0: 43158.5. Samples: 5137348480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-22 23:34:31,764][15401] Updated weights for policy 0, policy_version 313560 (0.0044) [2024-06-22 23:34:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5137465344. Throughput: 0: 42883.3. Samples: 5137597480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-22 23:34:35,615][15401] Updated weights for policy 0, policy_version 313570 (0.0036) [2024-06-22 23:34:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5137629184. Throughput: 0: 42985.3. Samples: 5137729620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 23:34:39,227][15401] Updated weights for policy 0, policy_version 313580 (0.0036) [2024-06-22 23:34:43,377][15401] Updated weights for policy 0, policy_version 313590 (0.0042) [2024-06-22 23:34:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 5137858560. Throughput: 0: 43068.5. Samples: 5137989820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-22 23:34:46,838][15401] Updated weights for policy 0, policy_version 313600 (0.0034) [2024-06-22 23:34:47,952][15349] Signal inference workers to stop experience collection... (76000 times) [2024-06-22 23:34:47,999][15401] InferenceWorker_p0-w0: stopping experience collection (76000 times) [2024-06-22 23:34:48,007][15349] Signal inference workers to resume experience collection... (76000 times) [2024-06-22 23:34:48,018][15401] InferenceWorker_p0-w0: resuming experience collection (76000 times) [2024-06-22 23:34:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5138104320. Throughput: 0: 43015.0. Samples: 5138244100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:48,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-22 23:34:50,910][15401] Updated weights for policy 0, policy_version 313610 (0.0030) [2024-06-22 23:34:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5138284544. Throughput: 0: 43269.3. Samples: 5138380260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-22 23:34:54,279][15401] Updated weights for policy 0, policy_version 313620 (0.0027) [2024-06-22 23:34:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5138497536. Throughput: 0: 43053.6. Samples: 5138635980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:34:58,394][15132] Avg episode reward: [(0, '0.727')] [2024-06-22 23:34:58,557][15401] Updated weights for policy 0, policy_version 313630 (0.0028) [2024-06-22 23:35:01,968][15401] Updated weights for policy 0, policy_version 313640 (0.0038) [2024-06-22 23:35:03,391][15132] Fps is (10 sec: 47504.5, 60 sec: 43416.1, 300 sec: 42820.3). Total num frames: 5138759680. Throughput: 0: 42990.2. Samples: 5138887160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:35:03,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-22 23:35:06,217][15401] Updated weights for policy 0, policy_version 313650 (0.0028) [2024-06-22 23:35:08,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 5138939904. Throughput: 0: 43247.5. Samples: 5139027900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-22 23:35:08,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 23:35:09,470][15401] Updated weights for policy 0, policy_version 313660 (0.0044) [2024-06-22 23:35:13,396][15132] Fps is (10 sec: 37666.4, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 5139136512. Throughput: 0: 42767.2. Samples: 5139273280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:13,397][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 23:35:13,830][15401] Updated weights for policy 0, policy_version 313670 (0.0039) [2024-06-22 23:35:17,191][15401] Updated weights for policy 0, policy_version 313680 (0.0041) [2024-06-22 23:35:18,392][15132] Fps is (10 sec: 47513.7, 60 sec: 43415.9, 300 sec: 42931.6). Total num frames: 5139415040. Throughput: 0: 42985.8. Samples: 5139531940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:18,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 23:35:21,476][15401] Updated weights for policy 0, policy_version 313690 (0.0023) [2024-06-22 23:35:23,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 5139562496. Throughput: 0: 43130.8. Samples: 5139670500. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 23:35:24,984][15401] Updated weights for policy 0, policy_version 313700 (0.0032) [2024-06-22 23:35:28,389][15132] Fps is (10 sec: 36053.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5139775488. Throughput: 0: 42816.1. Samples: 5139916540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-22 23:35:29,107][15401] Updated weights for policy 0, policy_version 313710 (0.0035) [2024-06-22 23:35:32,646][15401] Updated weights for policy 0, policy_version 313720 (0.0024) [2024-06-22 23:35:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 5140037632. Throughput: 0: 42838.3. Samples: 5140171820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 23:35:36,700][15401] Updated weights for policy 0, policy_version 313730 (0.0037) [2024-06-22 23:35:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 5140217856. Throughput: 0: 42933.7. Samples: 5140312280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-22 23:35:40,277][15401] Updated weights for policy 0, policy_version 313740 (0.0042) [2024-06-22 23:35:43,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5140430848. Throughput: 0: 42626.7. Samples: 5140554280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:43,392][15132] Avg episode reward: [(0, '0.411')] [2024-06-22 23:35:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313747_5140430848.pth... [2024-06-22 23:35:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313120_5130158080.pth [2024-06-22 23:35:44,383][15401] Updated weights for policy 0, policy_version 313750 (0.0039) [2024-06-22 23:35:47,849][15401] Updated weights for policy 0, policy_version 313760 (0.0035) [2024-06-22 23:35:48,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5140676608. Throughput: 0: 42826.0. Samples: 5140814240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:48,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-22 23:35:52,101][15401] Updated weights for policy 0, policy_version 313770 (0.0025) [2024-06-22 23:35:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5140856832. Throughput: 0: 42706.7. Samples: 5140949600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:53,391][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 23:35:55,363][15401] Updated weights for policy 0, policy_version 313780 (0.0041) [2024-06-22 23:35:58,390][15132] Fps is (10 sec: 40958.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5141086208. Throughput: 0: 42814.9. Samples: 5141199680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:35:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 23:35:59,905][15401] Updated weights for policy 0, policy_version 313790 (0.0030) [2024-06-22 23:36:02,905][15401] Updated weights for policy 0, policy_version 313800 (0.0033) [2024-06-22 23:36:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42599.8, 300 sec: 42931.6). Total num frames: 5141315584. Throughput: 0: 42771.2. Samples: 5141456540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:36:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-22 23:36:06,305][15349] Signal inference workers to stop experience collection... (76050 times) [2024-06-22 23:36:06,305][15349] Signal inference workers to resume experience collection... (76050 times) [2024-06-22 23:36:06,356][15401] InferenceWorker_p0-w0: stopping experience collection (76050 times) [2024-06-22 23:36:06,356][15401] InferenceWorker_p0-w0: resuming experience collection (76050 times) [2024-06-22 23:36:07,520][15401] Updated weights for policy 0, policy_version 313810 (0.0033) [2024-06-22 23:36:08,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 5141512192. Throughput: 0: 42683.0. Samples: 5141591240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:36:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 23:36:10,536][15401] Updated weights for policy 0, policy_version 313820 (0.0035) [2024-06-22 23:36:13,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42875.9, 300 sec: 42820.5). Total num frames: 5141708800. Throughput: 0: 42680.6. Samples: 5141837180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:36:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-22 23:36:15,142][15401] Updated weights for policy 0, policy_version 313830 (0.0027) [2024-06-22 23:36:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41779.2, 300 sec: 42764.7). Total num frames: 5141921792. Throughput: 0: 42656.8. Samples: 5142091480. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:36:18,392][15132] Avg episode reward: [(0, '0.457')] [2024-06-22 23:36:18,560][15401] Updated weights for policy 0, policy_version 313840 (0.0031) [2024-06-22 23:36:22,794][15401] Updated weights for policy 0, policy_version 313850 (0.0032) [2024-06-22 23:36:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5142118400. Throughput: 0: 42422.8. Samples: 5142221300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-22 23:36:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 23:36:26,080][15401] Updated weights for policy 0, policy_version 313860 (0.0043) [2024-06-22 23:36:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5142347776. Throughput: 0: 42652.1. Samples: 5142473520. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-22 23:36:30,490][15401] Updated weights for policy 0, policy_version 313870 (0.0033) [2024-06-22 23:36:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5142577152. Throughput: 0: 42693.3. Samples: 5142735440. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 23:36:33,627][15401] Updated weights for policy 0, policy_version 313880 (0.0041) [2024-06-22 23:36:38,067][15401] Updated weights for policy 0, policy_version 313890 (0.0040) [2024-06-22 23:36:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5142773760. Throughput: 0: 42576.1. Samples: 5142865520. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-22 23:36:41,079][15401] Updated weights for policy 0, policy_version 313900 (0.0029) [2024-06-22 23:36:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5142986752. Throughput: 0: 42587.3. Samples: 5143116100. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-22 23:36:45,694][15401] Updated weights for policy 0, policy_version 313910 (0.0038) [2024-06-22 23:36:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 5143216128. Throughput: 0: 42527.9. Samples: 5143370300. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-22 23:36:48,859][15401] Updated weights for policy 0, policy_version 313920 (0.0041) [2024-06-22 23:36:53,374][15401] Updated weights for policy 0, policy_version 313930 (0.0030) [2024-06-22 23:36:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5143429120. Throughput: 0: 42440.9. Samples: 5143501080. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:53,391][15132] Avg episode reward: [(0, '0.563')] [2024-06-22 23:36:56,327][15401] Updated weights for policy 0, policy_version 313940 (0.0035) [2024-06-22 23:36:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5143642112. Throughput: 0: 42670.8. Samples: 5143757360. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:36:58,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-22 23:37:01,040][15401] Updated weights for policy 0, policy_version 313950 (0.0033) [2024-06-22 23:37:01,814][15349] Signal inference workers to stop experience collection... (76100 times) [2024-06-22 23:37:01,863][15401] InferenceWorker_p0-w0: stopping experience collection (76100 times) [2024-06-22 23:37:01,929][15349] Signal inference workers to resume experience collection... (76100 times) [2024-06-22 23:37:01,930][15401] InferenceWorker_p0-w0: resuming experience collection (76100 times) [2024-06-22 23:37:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5143871488. Throughput: 0: 42855.6. Samples: 5144019880. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-22 23:37:04,078][15401] Updated weights for policy 0, policy_version 313960 (0.0024) [2024-06-22 23:37:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5144051712. Throughput: 0: 42718.2. Samples: 5144143620. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 23:37:08,586][15401] Updated weights for policy 0, policy_version 313970 (0.0034) [2024-06-22 23:37:11,722][15401] Updated weights for policy 0, policy_version 313980 (0.0027) [2024-06-22 23:37:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5144297472. Throughput: 0: 42912.8. Samples: 5144404600. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-22 23:37:16,684][15401] Updated weights for policy 0, policy_version 313990 (0.0031) [2024-06-22 23:37:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 5144510464. Throughput: 0: 42910.6. Samples: 5144666420. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 23:37:19,604][15401] Updated weights for policy 0, policy_version 314000 (0.0031) [2024-06-22 23:37:23,390][15132] Fps is (10 sec: 39319.8, 60 sec: 42871.1, 300 sec: 42820.5). Total num frames: 5144690688. Throughput: 0: 42717.7. Samples: 5144787840. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:23,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-22 23:37:24,081][15401] Updated weights for policy 0, policy_version 314010 (0.0036) [2024-06-22 23:37:27,244][15401] Updated weights for policy 0, policy_version 314020 (0.0030) [2024-06-22 23:37:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5144952832. Throughput: 0: 42943.0. Samples: 5145048540. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-22 23:37:31,641][15401] Updated weights for policy 0, policy_version 314030 (0.0045) [2024-06-22 23:37:33,389][15132] Fps is (10 sec: 45877.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5145149440. Throughput: 0: 43100.4. Samples: 5145309820. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-22 23:37:35,054][15401] Updated weights for policy 0, policy_version 314040 (0.0020) [2024-06-22 23:37:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5145329664. Throughput: 0: 42899.2. Samples: 5145431540. Policy #0 lag: (min: 1.0, avg: 12.7, max: 22.0) [2024-06-22 23:37:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-22 23:37:39,484][15401] Updated weights for policy 0, policy_version 314050 (0.0035) [2024-06-22 23:37:42,937][15401] Updated weights for policy 0, policy_version 314060 (0.0034) [2024-06-22 23:37:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 5145575424. Throughput: 0: 42778.7. Samples: 5145682500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:37:43,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 23:37:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314062_5145591808.pth... [2024-06-22 23:37:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313433_5135286272.pth [2024-06-22 23:37:47,532][15401] Updated weights for policy 0, policy_version 314070 (0.0036) [2024-06-22 23:37:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5145772032. Throughput: 0: 42705.0. Samples: 5145941600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:37:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:37:50,486][15401] Updated weights for policy 0, policy_version 314080 (0.0027) [2024-06-22 23:37:53,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5145968640. Throughput: 0: 42558.1. Samples: 5146058740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:37:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-22 23:37:55,325][15401] Updated weights for policy 0, policy_version 314090 (0.0034) [2024-06-22 23:37:58,020][15401] Updated weights for policy 0, policy_version 314100 (0.0037) [2024-06-22 23:37:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5146214400. Throughput: 0: 42467.5. Samples: 5146315640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:37:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 23:38:03,013][15401] Updated weights for policy 0, policy_version 314110 (0.0033) [2024-06-22 23:38:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5146394624. Throughput: 0: 42572.5. Samples: 5146582180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 23:38:05,661][15401] Updated weights for policy 0, policy_version 314120 (0.0042) [2024-06-22 23:38:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5146624000. Throughput: 0: 42481.4. Samples: 5146699480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-22 23:38:10,892][15401] Updated weights for policy 0, policy_version 314130 (0.0042) [2024-06-22 23:38:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5146853376. Throughput: 0: 42373.8. Samples: 5146955360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-22 23:38:13,526][15401] Updated weights for policy 0, policy_version 314140 (0.0026) [2024-06-22 23:38:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 5147017216. Throughput: 0: 42497.8. Samples: 5147222220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 23:38:18,501][15401] Updated weights for policy 0, policy_version 314150 (0.0031) [2024-06-22 23:38:21,058][15401] Updated weights for policy 0, policy_version 314160 (0.0034) [2024-06-22 23:38:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 5147262976. Throughput: 0: 42349.2. Samples: 5147337260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 23:38:26,195][15401] Updated weights for policy 0, policy_version 314170 (0.0028) [2024-06-22 23:38:26,196][15349] Signal inference workers to stop experience collection... (76150 times) [2024-06-22 23:38:26,196][15349] Signal inference workers to resume experience collection... (76150 times) [2024-06-22 23:38:26,218][15401] InferenceWorker_p0-w0: stopping experience collection (76150 times) [2024-06-22 23:38:26,219][15401] InferenceWorker_p0-w0: resuming experience collection (76150 times) [2024-06-22 23:38:28,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5147508736. Throughput: 0: 42642.7. Samples: 5147601320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 23:38:28,632][15401] Updated weights for policy 0, policy_version 314180 (0.0024) [2024-06-22 23:38:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 5147656192. Throughput: 0: 42595.9. Samples: 5147858420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-22 23:38:33,757][15401] Updated weights for policy 0, policy_version 314190 (0.0036) [2024-06-22 23:38:36,486][15401] Updated weights for policy 0, policy_version 314200 (0.0036) [2024-06-22 23:38:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5147918336. Throughput: 0: 42669.8. Samples: 5147978880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:38,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-22 23:38:41,383][15401] Updated weights for policy 0, policy_version 314210 (0.0033) [2024-06-22 23:38:43,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42600.2, 300 sec: 42820.5). Total num frames: 5148131328. Throughput: 0: 42801.5. Samples: 5148241700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-22 23:38:44,145][15401] Updated weights for policy 0, policy_version 314220 (0.0039) [2024-06-22 23:38:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5148311552. Throughput: 0: 42621.4. Samples: 5148500140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 23:38:49,050][15401] Updated weights for policy 0, policy_version 314230 (0.0034) [2024-06-22 23:38:51,930][15401] Updated weights for policy 0, policy_version 314240 (0.0052) [2024-06-22 23:38:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5148557312. Throughput: 0: 42720.4. Samples: 5148621900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:38:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 23:38:56,577][15401] Updated weights for policy 0, policy_version 314250 (0.0031) [2024-06-22 23:38:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5148770304. Throughput: 0: 42878.3. Samples: 5148884880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:38:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-22 23:38:59,530][15401] Updated weights for policy 0, policy_version 314260 (0.0032) [2024-06-22 23:39:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5148950528. Throughput: 0: 42756.1. Samples: 5149146240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:03,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-22 23:39:04,345][15401] Updated weights for policy 0, policy_version 314270 (0.0048) [2024-06-22 23:39:07,185][15401] Updated weights for policy 0, policy_version 314280 (0.0035) [2024-06-22 23:39:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5149212672. Throughput: 0: 42804.5. Samples: 5149263460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:08,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 23:39:11,815][15401] Updated weights for policy 0, policy_version 314290 (0.0039) [2024-06-22 23:39:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5149425664. Throughput: 0: 42875.1. Samples: 5149530700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-22 23:39:14,675][15401] Updated weights for policy 0, policy_version 314300 (0.0044) [2024-06-22 23:39:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5149605888. Throughput: 0: 42980.4. Samples: 5149792540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 23:39:19,416][15401] Updated weights for policy 0, policy_version 314310 (0.0035) [2024-06-22 23:39:22,350][15401] Updated weights for policy 0, policy_version 314320 (0.0030) [2024-06-22 23:39:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5149835264. Throughput: 0: 42995.6. Samples: 5149913680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 23:39:26,869][15401] Updated weights for policy 0, policy_version 314330 (0.0025) [2024-06-22 23:39:28,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5150081024. Throughput: 0: 43060.4. Samples: 5150179420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-22 23:39:29,762][15401] Updated weights for policy 0, policy_version 314340 (0.0031) [2024-06-22 23:39:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5150244864. Throughput: 0: 43141.8. Samples: 5150441520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 23:39:34,314][15401] Updated weights for policy 0, policy_version 314350 (0.0030) [2024-06-22 23:39:37,189][15401] Updated weights for policy 0, policy_version 314360 (0.0039) [2024-06-22 23:39:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5150490624. Throughput: 0: 43160.4. Samples: 5150564120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:38,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 23:39:42,104][15401] Updated weights for policy 0, policy_version 314370 (0.0030) [2024-06-22 23:39:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5150703616. Throughput: 0: 43175.5. Samples: 5150827780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 23:39:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314375_5150720000.pth... [2024-06-22 23:39:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000313747_5140430848.pth [2024-06-22 23:39:44,569][15401] Updated weights for policy 0, policy_version 314380 (0.0039) [2024-06-22 23:39:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5150900224. Throughput: 0: 43018.5. Samples: 5151082080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 23:39:49,633][15401] Updated weights for policy 0, policy_version 314390 (0.0040) [2024-06-22 23:39:50,644][15349] Signal inference workers to stop experience collection... (76200 times) [2024-06-22 23:39:50,701][15401] InferenceWorker_p0-w0: stopping experience collection (76200 times) [2024-06-22 23:39:50,761][15349] Signal inference workers to resume experience collection... (76200 times) [2024-06-22 23:39:50,761][15401] InferenceWorker_p0-w0: resuming experience collection (76200 times) [2024-06-22 23:39:52,714][15401] Updated weights for policy 0, policy_version 314400 (0.0034) [2024-06-22 23:39:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5151129600. Throughput: 0: 43156.8. Samples: 5151205520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 23:39:57,132][15401] Updated weights for policy 0, policy_version 314410 (0.0038) [2024-06-22 23:39:58,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 5151309824. Throughput: 0: 42862.8. Samples: 5151459520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:39:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 23:40:00,342][15401] Updated weights for policy 0, policy_version 314420 (0.0032) [2024-06-22 23:40:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 5151539200. Throughput: 0: 42708.1. Samples: 5151714400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:40:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-22 23:40:04,751][15401] Updated weights for policy 0, policy_version 314430 (0.0024) [2024-06-22 23:40:08,010][15401] Updated weights for policy 0, policy_version 314440 (0.0035) [2024-06-22 23:40:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 5151784960. Throughput: 0: 42985.4. Samples: 5151848020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-22 23:40:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-22 23:40:12,530][15401] Updated weights for policy 0, policy_version 314450 (0.0027) [2024-06-22 23:40:13,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42320.8, 300 sec: 42542.3). Total num frames: 5151965184. Throughput: 0: 42622.3. Samples: 5152097700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:13,397][15132] Avg episode reward: [(0, '0.695')] [2024-06-22 23:40:16,115][15401] Updated weights for policy 0, policy_version 314460 (0.0029) [2024-06-22 23:40:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5152178176. Throughput: 0: 42399.9. Samples: 5152349520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-22 23:40:20,083][15401] Updated weights for policy 0, policy_version 314470 (0.0032) [2024-06-22 23:40:23,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5152391168. Throughput: 0: 42476.0. Samples: 5152475540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-22 23:40:24,041][15401] Updated weights for policy 0, policy_version 314480 (0.0035) [2024-06-22 23:40:27,649][15401] Updated weights for policy 0, policy_version 314490 (0.0035) [2024-06-22 23:40:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5152604160. Throughput: 0: 42293.4. Samples: 5152730980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 23:40:31,670][15401] Updated weights for policy 0, policy_version 314500 (0.0027) [2024-06-22 23:40:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5152817152. Throughput: 0: 42377.9. Samples: 5152989080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-22 23:40:35,603][15401] Updated weights for policy 0, policy_version 314510 (0.0038) [2024-06-22 23:40:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 5153030144. Throughput: 0: 42372.2. Samples: 5153112260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-22 23:40:39,484][15401] Updated weights for policy 0, policy_version 314520 (0.0037) [2024-06-22 23:40:43,078][15401] Updated weights for policy 0, policy_version 314530 (0.0031) [2024-06-22 23:40:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5153259520. Throughput: 0: 42461.2. Samples: 5153370280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-22 23:40:47,422][15401] Updated weights for policy 0, policy_version 314540 (0.0048) [2024-06-22 23:40:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 5153456128. Throughput: 0: 42555.7. Samples: 5153629400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-22 23:40:50,671][15401] Updated weights for policy 0, policy_version 314550 (0.0044) [2024-06-22 23:40:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5153669120. Throughput: 0: 42253.7. Samples: 5153749440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 23:40:55,070][15401] Updated weights for policy 0, policy_version 314560 (0.0026) [2024-06-22 23:40:58,206][15401] Updated weights for policy 0, policy_version 314570 (0.0047) [2024-06-22 23:40:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5153914880. Throughput: 0: 42585.3. Samples: 5154013760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:40:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-22 23:41:02,666][15401] Updated weights for policy 0, policy_version 314580 (0.0054) [2024-06-22 23:41:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5154095104. Throughput: 0: 42759.1. Samples: 5154273680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:41:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 23:41:04,345][15349] Signal inference workers to stop experience collection... (76250 times) [2024-06-22 23:41:04,346][15349] Signal inference workers to resume experience collection... (76250 times) [2024-06-22 23:41:04,359][15401] InferenceWorker_p0-w0: stopping experience collection (76250 times) [2024-06-22 23:41:04,360][15401] InferenceWorker_p0-w0: resuming experience collection (76250 times) [2024-06-22 23:41:05,774][15401] Updated weights for policy 0, policy_version 314590 (0.0023) [2024-06-22 23:41:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5154324480. Throughput: 0: 42685.8. Samples: 5154396400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:41:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-22 23:41:10,377][15401] Updated weights for policy 0, policy_version 314600 (0.0034) [2024-06-22 23:41:13,279][15401] Updated weights for policy 0, policy_version 314610 (0.0032) [2024-06-22 23:41:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43422.2, 300 sec: 42876.4). Total num frames: 5154570240. Throughput: 0: 42867.8. Samples: 5154660040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:41:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 23:41:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5154717696. Throughput: 0: 42824.8. Samples: 5154916200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:41:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-22 23:41:18,562][15401] Updated weights for policy 0, policy_version 314620 (0.0028) [2024-06-22 23:41:21,197][15401] Updated weights for policy 0, policy_version 314630 (0.0046) [2024-06-22 23:41:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5154979840. Throughput: 0: 42668.8. Samples: 5155032360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-22 23:41:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-22 23:41:26,094][15401] Updated weights for policy 0, policy_version 314640 (0.0046) [2024-06-22 23:41:28,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5155192832. Throughput: 0: 42810.8. Samples: 5155296760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-22 23:41:28,891][15401] Updated weights for policy 0, policy_version 314650 (0.0036) [2024-06-22 23:41:33,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5155356672. Throughput: 0: 42724.8. Samples: 5155552020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 23:41:33,709][15401] Updated weights for policy 0, policy_version 314660 (0.0041) [2024-06-22 23:41:36,504][15401] Updated weights for policy 0, policy_version 314670 (0.0028) [2024-06-22 23:41:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5155602432. Throughput: 0: 42721.9. Samples: 5155671920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-22 23:41:41,359][15401] Updated weights for policy 0, policy_version 314680 (0.0036) [2024-06-22 23:41:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5155831808. Throughput: 0: 42789.3. Samples: 5155939280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-22 23:41:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314687_5155831808.pth... [2024-06-22 23:41:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314062_5145591808.pth [2024-06-22 23:41:44,044][15401] Updated weights for policy 0, policy_version 314690 (0.0045) [2024-06-22 23:41:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5156012032. Throughput: 0: 42639.1. Samples: 5156192440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 23:41:49,050][15401] Updated weights for policy 0, policy_version 314700 (0.0038) [2024-06-22 23:41:51,805][15401] Updated weights for policy 0, policy_version 314710 (0.0036) [2024-06-22 23:41:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5156257792. Throughput: 0: 42630.7. Samples: 5156314780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 23:41:56,827][15401] Updated weights for policy 0, policy_version 314720 (0.0044) [2024-06-22 23:41:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5156454400. Throughput: 0: 42713.1. Samples: 5156582120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:41:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-22 23:41:59,467][15401] Updated weights for policy 0, policy_version 314730 (0.0044) [2024-06-22 23:42:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5156634624. Throughput: 0: 42697.5. Samples: 5156837580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-22 23:42:04,473][15401] Updated weights for policy 0, policy_version 314740 (0.0040) [2024-06-22 23:42:07,225][15401] Updated weights for policy 0, policy_version 314750 (0.0022) [2024-06-22 23:42:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5156896768. Throughput: 0: 42737.4. Samples: 5156955540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 23:42:11,941][15401] Updated weights for policy 0, policy_version 314760 (0.0024) [2024-06-22 23:42:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5157109760. Throughput: 0: 42897.2. Samples: 5157227140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-22 23:42:13,721][15349] Signal inference workers to stop experience collection... (76300 times) [2024-06-22 23:42:13,722][15349] Signal inference workers to resume experience collection... (76300 times) [2024-06-22 23:42:13,762][15401] InferenceWorker_p0-w0: stopping experience collection (76300 times) [2024-06-22 23:42:13,762][15401] InferenceWorker_p0-w0: resuming experience collection (76300 times) [2024-06-22 23:42:14,753][15401] Updated weights for policy 0, policy_version 314770 (0.0029) [2024-06-22 23:42:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5157273600. Throughput: 0: 42910.7. Samples: 5157483000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 23:42:19,629][15401] Updated weights for policy 0, policy_version 314780 (0.0031) [2024-06-22 23:42:22,486][15401] Updated weights for policy 0, policy_version 314790 (0.0028) [2024-06-22 23:42:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5157552128. Throughput: 0: 42907.1. Samples: 5157602740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 23:42:27,145][15401] Updated weights for policy 0, policy_version 314800 (0.0043) [2024-06-22 23:42:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5157748736. Throughput: 0: 42792.9. Samples: 5157864960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-22 23:42:30,088][15401] Updated weights for policy 0, policy_version 314810 (0.0036) [2024-06-22 23:42:33,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5157928960. Throughput: 0: 42975.7. Samples: 5158126340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 23:42:34,745][15401] Updated weights for policy 0, policy_version 314820 (0.0038) [2024-06-22 23:42:37,852][15401] Updated weights for policy 0, policy_version 314830 (0.0030) [2024-06-22 23:42:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5158191104. Throughput: 0: 42865.5. Samples: 5158243720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-22 23:42:38,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-22 23:42:42,410][15401] Updated weights for policy 0, policy_version 314840 (0.0033) [2024-06-22 23:42:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5158387712. Throughput: 0: 42848.8. Samples: 5158510320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:42:43,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-22 23:42:45,275][15401] Updated weights for policy 0, policy_version 314850 (0.0032) [2024-06-22 23:42:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5158584320. Throughput: 0: 42847.2. Samples: 5158765700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:42:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-22 23:42:50,029][15401] Updated weights for policy 0, policy_version 314860 (0.0041) [2024-06-22 23:42:52,977][15401] Updated weights for policy 0, policy_version 314870 (0.0037) [2024-06-22 23:42:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5158830080. Throughput: 0: 42875.8. Samples: 5158884960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:42:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-22 23:42:57,784][15401] Updated weights for policy 0, policy_version 314880 (0.0038) [2024-06-22 23:42:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5159010304. Throughput: 0: 42759.7. Samples: 5159151320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:42:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-22 23:43:00,563][15401] Updated weights for policy 0, policy_version 314890 (0.0032) [2024-06-22 23:43:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5159223296. Throughput: 0: 42706.1. Samples: 5159404780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 23:43:05,364][15401] Updated weights for policy 0, policy_version 314900 (0.0029) [2024-06-22 23:43:08,102][15401] Updated weights for policy 0, policy_version 314910 (0.0041) [2024-06-22 23:43:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 5159485440. Throughput: 0: 42911.0. Samples: 5159533740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-22 23:43:12,910][15401] Updated weights for policy 0, policy_version 314920 (0.0028) [2024-06-22 23:43:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5159665664. Throughput: 0: 43043.9. Samples: 5159801940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-22 23:43:15,634][15401] Updated weights for policy 0, policy_version 314930 (0.0030) [2024-06-22 23:43:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5159878656. Throughput: 0: 42933.6. Samples: 5160058360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-22 23:43:20,395][15401] Updated weights for policy 0, policy_version 314940 (0.0027) [2024-06-22 23:43:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5160124416. Throughput: 0: 43063.4. Samples: 5160181580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-22 23:43:24,030][15401] Updated weights for policy 0, policy_version 314950 (0.0030) [2024-06-22 23:43:27,600][15349] Signal inference workers to stop experience collection... (76350 times) [2024-06-22 23:43:27,600][15349] Signal inference workers to resume experience collection... (76350 times) [2024-06-22 23:43:27,651][15401] InferenceWorker_p0-w0: stopping experience collection (76350 times) [2024-06-22 23:43:27,651][15401] InferenceWorker_p0-w0: resuming experience collection (76350 times) [2024-06-22 23:43:27,989][15401] Updated weights for policy 0, policy_version 314960 (0.0037) [2024-06-22 23:43:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5160337408. Throughput: 0: 43169.8. Samples: 5160452960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 23:43:31,700][15401] Updated weights for policy 0, policy_version 314970 (0.0035) [2024-06-22 23:43:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5160517632. Throughput: 0: 43035.5. Samples: 5160702300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-22 23:43:35,662][15401] Updated weights for policy 0, policy_version 314980 (0.0034) [2024-06-22 23:43:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5160763392. Throughput: 0: 43081.6. Samples: 5160823620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-22 23:43:39,548][15401] Updated weights for policy 0, policy_version 314990 (0.0027) [2024-06-22 23:43:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5160943616. Throughput: 0: 43037.7. Samples: 5161088020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-22 23:43:43,472][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315000_5160960000.pth... [2024-06-22 23:43:43,479][15401] Updated weights for policy 0, policy_version 315000 (0.0038) [2024-06-22 23:43:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314375_5150720000.pth [2024-06-22 23:43:47,306][15401] Updated weights for policy 0, policy_version 315010 (0.0026) [2024-06-22 23:43:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5161172992. Throughput: 0: 42993.3. Samples: 5161339480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:48,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-22 23:43:51,038][15401] Updated weights for policy 0, policy_version 315020 (0.0033) [2024-06-22 23:43:53,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5161418752. Throughput: 0: 42930.1. Samples: 5161465600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-22 23:43:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-22 23:43:55,019][15401] Updated weights for policy 0, policy_version 315030 (0.0023) [2024-06-22 23:43:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5161598976. Throughput: 0: 42897.9. Samples: 5161732340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:43:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-22 23:43:58,407][15401] Updated weights for policy 0, policy_version 315040 (0.0031) [2024-06-22 23:44:02,437][15401] Updated weights for policy 0, policy_version 315050 (0.0037) [2024-06-22 23:44:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5161795584. Throughput: 0: 42745.9. Samples: 5161981920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:44:06,195][15401] Updated weights for policy 0, policy_version 315060 (0.0037) [2024-06-22 23:44:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5162057728. Throughput: 0: 42754.7. Samples: 5162105540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-22 23:44:09,914][15401] Updated weights for policy 0, policy_version 315070 (0.0033) [2024-06-22 23:44:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5162237952. Throughput: 0: 42548.4. Samples: 5162367640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-22 23:44:13,788][15401] Updated weights for policy 0, policy_version 315080 (0.0033) [2024-06-22 23:44:17,476][15401] Updated weights for policy 0, policy_version 315090 (0.0037) [2024-06-22 23:44:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 5162434560. Throughput: 0: 42655.6. Samples: 5162621800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-22 23:44:21,423][15349] Signal inference workers to stop experience collection... (76400 times) [2024-06-22 23:44:21,423][15349] Signal inference workers to resume experience collection... (76400 times) [2024-06-22 23:44:21,444][15401] Updated weights for policy 0, policy_version 315100 (0.0030) [2024-06-22 23:44:21,471][15401] InferenceWorker_p0-w0: stopping experience collection (76400 times) [2024-06-22 23:44:21,471][15401] InferenceWorker_p0-w0: resuming experience collection (76400 times) [2024-06-22 23:44:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5162696704. Throughput: 0: 42871.4. Samples: 5162752840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 23:44:25,075][15401] Updated weights for policy 0, policy_version 315110 (0.0033) [2024-06-22 23:44:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5162860544. Throughput: 0: 42792.0. Samples: 5163013660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:28,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-22 23:44:28,935][15401] Updated weights for policy 0, policy_version 315120 (0.0040) [2024-06-22 23:44:32,638][15401] Updated weights for policy 0, policy_version 315130 (0.0037) [2024-06-22 23:44:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5163089920. Throughput: 0: 42867.2. Samples: 5163268500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-22 23:44:36,594][15401] Updated weights for policy 0, policy_version 315140 (0.0022) [2024-06-22 23:44:38,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5163352064. Throughput: 0: 43017.9. Samples: 5163401400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-22 23:44:40,357][15401] Updated weights for policy 0, policy_version 315150 (0.0042) [2024-06-22 23:44:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5163515904. Throughput: 0: 42933.2. Samples: 5163664340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-22 23:44:44,241][15401] Updated weights for policy 0, policy_version 315160 (0.0038) [2024-06-22 23:44:48,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5163728896. Throughput: 0: 42960.8. Samples: 5163915160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-22 23:44:48,561][15401] Updated weights for policy 0, policy_version 315170 (0.0038) [2024-06-22 23:44:51,864][15401] Updated weights for policy 0, policy_version 315180 (0.0034) [2024-06-22 23:44:53,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5164007424. Throughput: 0: 43026.7. Samples: 5164041740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 23:44:56,111][15401] Updated weights for policy 0, policy_version 315190 (0.0039) [2024-06-22 23:44:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5164154880. Throughput: 0: 43134.7. Samples: 5164308700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:44:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-22 23:44:59,373][15401] Updated weights for policy 0, policy_version 315200 (0.0026) [2024-06-22 23:45:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5164384256. Throughput: 0: 42958.0. Samples: 5164554920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:45:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 23:45:03,737][15401] Updated weights for policy 0, policy_version 315210 (0.0037) [2024-06-22 23:45:06,886][15401] Updated weights for policy 0, policy_version 315220 (0.0038) [2024-06-22 23:45:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42932.6). Total num frames: 5164630016. Throughput: 0: 43062.8. Samples: 5164690660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-22 23:45:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-22 23:45:11,305][15401] Updated weights for policy 0, policy_version 315230 (0.0046) [2024-06-22 23:45:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5164810240. Throughput: 0: 43091.9. Samples: 5164952800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-22 23:45:14,665][15401] Updated weights for policy 0, policy_version 315240 (0.0032) [2024-06-22 23:45:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5165023232. Throughput: 0: 42952.5. Samples: 5165201360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-22 23:45:18,952][15401] Updated weights for policy 0, policy_version 315250 (0.0039) [2024-06-22 23:45:22,217][15401] Updated weights for policy 0, policy_version 315260 (0.0032) [2024-06-22 23:45:23,390][15132] Fps is (10 sec: 45872.6, 60 sec: 42871.1, 300 sec: 42931.5). Total num frames: 5165268992. Throughput: 0: 42983.4. Samples: 5165335680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:23,398][15132] Avg episode reward: [(0, '0.409')] [2024-06-22 23:45:26,765][15401] Updated weights for policy 0, policy_version 315270 (0.0031) [2024-06-22 23:45:28,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43412.9, 300 sec: 42875.2). Total num frames: 5165465600. Throughput: 0: 42922.0. Samples: 5165596100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:28,397][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 23:45:29,660][15401] Updated weights for policy 0, policy_version 315280 (0.0033) [2024-06-22 23:45:29,668][15349] Signal inference workers to stop experience collection... (76450 times) [2024-06-22 23:45:29,669][15349] Signal inference workers to resume experience collection... (76450 times) [2024-06-22 23:45:29,710][15401] InferenceWorker_p0-w0: stopping experience collection (76450 times) [2024-06-22 23:45:29,710][15401] InferenceWorker_p0-w0: resuming experience collection (76450 times) [2024-06-22 23:45:33,392][15132] Fps is (10 sec: 40952.7, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5165678592. Throughput: 0: 42914.3. Samples: 5165846400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:33,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 23:45:34,203][15401] Updated weights for policy 0, policy_version 315290 (0.0028) [2024-06-22 23:45:37,190][15401] Updated weights for policy 0, policy_version 315300 (0.0020) [2024-06-22 23:45:38,390][15132] Fps is (10 sec: 45904.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5165924352. Throughput: 0: 43190.6. Samples: 5165985320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-22 23:45:41,594][15401] Updated weights for policy 0, policy_version 315310 (0.0023) [2024-06-22 23:45:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5166088192. Throughput: 0: 42914.1. Samples: 5166239840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-22 23:45:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315314_5166104576.pth... [2024-06-22 23:45:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000314687_5155831808.pth [2024-06-22 23:45:44,718][15401] Updated weights for policy 0, policy_version 315320 (0.0039) [2024-06-22 23:45:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5166317568. Throughput: 0: 43071.6. Samples: 5166493140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 23:45:49,203][15401] Updated weights for policy 0, policy_version 315330 (0.0030) [2024-06-22 23:45:52,308][15401] Updated weights for policy 0, policy_version 315340 (0.0034) [2024-06-22 23:45:53,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5166563328. Throughput: 0: 43038.2. Samples: 5166627380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-22 23:45:57,268][15401] Updated weights for policy 0, policy_version 315350 (0.0033) [2024-06-22 23:45:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5166743552. Throughput: 0: 42946.7. Samples: 5166885400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:45:58,390][15132] Avg episode reward: [(0, '0.036')] [2024-06-22 23:46:00,210][15401] Updated weights for policy 0, policy_version 315360 (0.0030) [2024-06-22 23:46:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5166956544. Throughput: 0: 42909.7. Samples: 5167132300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:46:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-22 23:46:04,855][15401] Updated weights for policy 0, policy_version 315370 (0.0031) [2024-06-22 23:46:07,866][15401] Updated weights for policy 0, policy_version 315380 (0.0034) [2024-06-22 23:46:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 5167202304. Throughput: 0: 42806.6. Samples: 5167261960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:46:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-22 23:46:12,518][15401] Updated weights for policy 0, policy_version 315390 (0.0037) [2024-06-22 23:46:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 5167382528. Throughput: 0: 42695.4. Samples: 5167517120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:46:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-22 23:46:15,421][15401] Updated weights for policy 0, policy_version 315400 (0.0027) [2024-06-22 23:46:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5167611904. Throughput: 0: 42789.0. Samples: 5167771800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:46:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-22 23:46:20,071][15401] Updated weights for policy 0, policy_version 315410 (0.0032) [2024-06-22 23:46:23,010][15401] Updated weights for policy 0, policy_version 315420 (0.0034) [2024-06-22 23:46:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.9, 300 sec: 42876.1). Total num frames: 5167841280. Throughput: 0: 42659.2. Samples: 5167904980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:46:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 23:46:27,687][15401] Updated weights for policy 0, policy_version 315430 (0.0024) [2024-06-22 23:46:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42876.0, 300 sec: 42987.2). Total num frames: 5168037888. Throughput: 0: 42838.7. Samples: 5168167580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-22 23:46:30,612][15401] Updated weights for policy 0, policy_version 315440 (0.0029) [2024-06-22 23:46:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 5168250880. Throughput: 0: 42749.9. Samples: 5168416880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 23:46:35,354][15401] Updated weights for policy 0, policy_version 315450 (0.0031) [2024-06-22 23:46:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 5168463872. Throughput: 0: 42709.7. Samples: 5168549320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:38,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-22 23:46:38,657][15401] Updated weights for policy 0, policy_version 315460 (0.0034) [2024-06-22 23:46:43,293][15401] Updated weights for policy 0, policy_version 315470 (0.0048) [2024-06-22 23:46:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5168660480. Throughput: 0: 42522.7. Samples: 5168798920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:43,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-22 23:46:46,388][15401] Updated weights for policy 0, policy_version 315480 (0.0032) [2024-06-22 23:46:47,408][15349] Signal inference workers to stop experience collection... (76500 times) [2024-06-22 23:46:47,466][15401] InferenceWorker_p0-w0: stopping experience collection (76500 times) [2024-06-22 23:46:47,469][15349] Signal inference workers to resume experience collection... (76500 times) [2024-06-22 23:46:47,482][15401] InferenceWorker_p0-w0: resuming experience collection (76500 times) [2024-06-22 23:46:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5168906240. Throughput: 0: 42536.9. Samples: 5169046460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:48,391][15132] Avg episode reward: [(0, '0.187')] [2024-06-22 23:46:51,038][15401] Updated weights for policy 0, policy_version 315490 (0.0039) [2024-06-22 23:46:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 5169070080. Throughput: 0: 42635.1. Samples: 5169180540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-22 23:46:54,075][15401] Updated weights for policy 0, policy_version 315500 (0.0022) [2024-06-22 23:46:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5169283072. Throughput: 0: 42720.0. Samples: 5169439520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:46:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-22 23:46:58,633][15401] Updated weights for policy 0, policy_version 315510 (0.0031) [2024-06-22 23:47:01,859][15401] Updated weights for policy 0, policy_version 315520 (0.0049) [2024-06-22 23:47:03,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5169545216. Throughput: 0: 42506.7. Samples: 5169684600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-22 23:47:06,401][15401] Updated weights for policy 0, policy_version 315530 (0.0041) [2024-06-22 23:47:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 5169725440. Throughput: 0: 42736.5. Samples: 5169828120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-22 23:47:09,535][15401] Updated weights for policy 0, policy_version 315540 (0.0029) [2024-06-22 23:47:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5169938432. Throughput: 0: 42328.0. Samples: 5170072340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-22 23:47:14,012][15401] Updated weights for policy 0, policy_version 315550 (0.0042) [2024-06-22 23:47:17,242][15401] Updated weights for policy 0, policy_version 315560 (0.0036) [2024-06-22 23:47:18,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5170200576. Throughput: 0: 42360.2. Samples: 5170323100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 23:47:21,541][15401] Updated weights for policy 0, policy_version 315570 (0.0039) [2024-06-22 23:47:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 5170380800. Throughput: 0: 42673.4. Samples: 5170469620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-22 23:47:24,709][15401] Updated weights for policy 0, policy_version 315580 (0.0039) [2024-06-22 23:47:28,392][15132] Fps is (10 sec: 36036.5, 60 sec: 42050.7, 300 sec: 42820.2). Total num frames: 5170561024. Throughput: 0: 42589.3. Samples: 5170715540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:28,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-22 23:47:29,243][15401] Updated weights for policy 0, policy_version 315590 (0.0034) [2024-06-22 23:47:32,247][15401] Updated weights for policy 0, policy_version 315600 (0.0039) [2024-06-22 23:47:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5170839552. Throughput: 0: 42610.6. Samples: 5170963940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:33,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-22 23:47:37,056][15401] Updated weights for policy 0, policy_version 315610 (0.0032) [2024-06-22 23:47:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5171003392. Throughput: 0: 42795.7. Samples: 5171106340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-22 23:47:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-22 23:47:39,845][15401] Updated weights for policy 0, policy_version 315620 (0.0038) [2024-06-22 23:47:40,497][15349] Signal inference workers to stop experience collection... (76550 times) [2024-06-22 23:47:40,498][15349] Signal inference workers to resume experience collection... (76550 times) [2024-06-22 23:47:40,545][15401] InferenceWorker_p0-w0: stopping experience collection (76550 times) [2024-06-22 23:47:40,545][15401] InferenceWorker_p0-w0: resuming experience collection (76550 times) [2024-06-22 23:47:43,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5171216384. Throughput: 0: 42490.9. Samples: 5171351620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:47:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 23:47:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315626_5171216384.pth... [2024-06-22 23:47:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315000_5160960000.pth [2024-06-22 23:47:44,567][15401] Updated weights for policy 0, policy_version 315630 (0.0031) [2024-06-22 23:47:47,848][15401] Updated weights for policy 0, policy_version 315640 (0.0036) [2024-06-22 23:47:48,390][15132] Fps is (10 sec: 49151.4, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 5171494912. Throughput: 0: 42599.1. Samples: 5171601560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:47:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-22 23:47:52,265][15401] Updated weights for policy 0, policy_version 315650 (0.0043) [2024-06-22 23:47:53,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 5171658752. Throughput: 0: 42483.4. Samples: 5171739980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:47:53,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-22 23:47:55,353][15401] Updated weights for policy 0, policy_version 315660 (0.0033) [2024-06-22 23:47:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5171871744. Throughput: 0: 42778.8. Samples: 5171997380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:47:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-22 23:47:59,707][15401] Updated weights for policy 0, policy_version 315670 (0.0031) [2024-06-22 23:48:03,178][15401] Updated weights for policy 0, policy_version 315680 (0.0032) [2024-06-22 23:48:03,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5172101120. Throughput: 0: 42821.8. Samples: 5172250080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-22 23:48:07,354][15401] Updated weights for policy 0, policy_version 315690 (0.0035) [2024-06-22 23:48:08,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5172297728. Throughput: 0: 42568.0. Samples: 5172385280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:08,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 23:48:10,815][15401] Updated weights for policy 0, policy_version 315700 (0.0024) [2024-06-22 23:48:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5172510720. Throughput: 0: 42625.8. Samples: 5172633600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 23:48:15,256][15401] Updated weights for policy 0, policy_version 315710 (0.0030) [2024-06-22 23:48:18,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 5172740096. Throughput: 0: 42728.1. Samples: 5172886700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-22 23:48:18,427][15401] Updated weights for policy 0, policy_version 315720 (0.0034) [2024-06-22 23:48:22,954][15401] Updated weights for policy 0, policy_version 315730 (0.0028) [2024-06-22 23:48:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5172936704. Throughput: 0: 42565.3. Samples: 5173021780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-22 23:48:26,374][15401] Updated weights for policy 0, policy_version 315740 (0.0030) [2024-06-22 23:48:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 5173149696. Throughput: 0: 42673.9. Samples: 5173271940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:28,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-22 23:48:30,464][15401] Updated weights for policy 0, policy_version 315750 (0.0035) [2024-06-22 23:48:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5173379072. Throughput: 0: 42905.3. Samples: 5173532300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 23:48:33,918][15401] Updated weights for policy 0, policy_version 315760 (0.0032) [2024-06-22 23:48:38,223][15401] Updated weights for policy 0, policy_version 315770 (0.0035) [2024-06-22 23:48:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5173575680. Throughput: 0: 42919.1. Samples: 5173671240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:38,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 23:48:41,294][15401] Updated weights for policy 0, policy_version 315780 (0.0053) [2024-06-22 23:48:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5173805056. Throughput: 0: 42755.9. Samples: 5173921400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:43,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-22 23:48:45,793][15401] Updated weights for policy 0, policy_version 315790 (0.0022) [2024-06-22 23:48:46,644][15349] Signal inference workers to stop experience collection... (76600 times) [2024-06-22 23:48:46,645][15349] Signal inference workers to resume experience collection... (76600 times) [2024-06-22 23:48:46,657][15401] InferenceWorker_p0-w0: stopping experience collection (76600 times) [2024-06-22 23:48:46,657][15401] InferenceWorker_p0-w0: resuming experience collection (76600 times) [2024-06-22 23:48:48,390][15132] Fps is (10 sec: 45872.7, 60 sec: 42324.9, 300 sec: 42764.9). Total num frames: 5174034432. Throughput: 0: 42989.7. Samples: 5174184640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:48,391][15132] Avg episode reward: [(0, '0.360')] [2024-06-22 23:48:48,799][15401] Updated weights for policy 0, policy_version 315800 (0.0035) [2024-06-22 23:48:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 5174198272. Throughput: 0: 42808.0. Samples: 5174311540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-22 23:48:53,398][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 23:48:53,692][15401] Updated weights for policy 0, policy_version 315810 (0.0041) [2024-06-22 23:48:56,592][15401] Updated weights for policy 0, policy_version 315820 (0.0039) [2024-06-22 23:48:58,389][15132] Fps is (10 sec: 42601.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5174460416. Throughput: 0: 42953.8. Samples: 5174566520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:48:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-22 23:49:01,160][15401] Updated weights for policy 0, policy_version 315830 (0.0037) [2024-06-22 23:49:03,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5174673408. Throughput: 0: 43178.7. Samples: 5174829740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-22 23:49:04,115][15401] Updated weights for policy 0, policy_version 315840 (0.0033) [2024-06-22 23:49:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 5174853632. Throughput: 0: 42930.1. Samples: 5174953640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-22 23:49:08,672][15401] Updated weights for policy 0, policy_version 315850 (0.0042) [2024-06-22 23:49:11,847][15401] Updated weights for policy 0, policy_version 315860 (0.0029) [2024-06-22 23:49:13,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 5175115776. Throughput: 0: 42992.7. Samples: 5175206620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 23:49:16,643][15401] Updated weights for policy 0, policy_version 315870 (0.0029) [2024-06-22 23:49:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5175312384. Throughput: 0: 43087.1. Samples: 5175471220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:18,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-22 23:49:19,446][15401] Updated weights for policy 0, policy_version 315880 (0.0035) [2024-06-22 23:49:23,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5175508992. Throughput: 0: 42716.1. Samples: 5175593460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:23,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-22 23:49:24,234][15401] Updated weights for policy 0, policy_version 315890 (0.0032) [2024-06-22 23:49:26,966][15401] Updated weights for policy 0, policy_version 315900 (0.0031) [2024-06-22 23:49:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 5175754752. Throughput: 0: 42855.0. Samples: 5175849980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:28,393][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 23:49:31,623][15401] Updated weights for policy 0, policy_version 315910 (0.0031) [2024-06-22 23:49:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5175934976. Throughput: 0: 42884.2. Samples: 5176114400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-22 23:49:35,073][15401] Updated weights for policy 0, policy_version 315920 (0.0038) [2024-06-22 23:49:38,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5176147968. Throughput: 0: 42651.2. Samples: 5176230840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-22 23:49:39,265][15401] Updated weights for policy 0, policy_version 315930 (0.0034) [2024-06-22 23:49:42,681][15401] Updated weights for policy 0, policy_version 315940 (0.0034) [2024-06-22 23:49:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5176410112. Throughput: 0: 42688.0. Samples: 5176487480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-22 23:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315943_5176410112.pth... [2024-06-22 23:49:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315314_5166104576.pth [2024-06-22 23:49:47,121][15401] Updated weights for policy 0, policy_version 315950 (0.0039) [2024-06-22 23:49:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.8, 300 sec: 42598.4). Total num frames: 5176573952. Throughput: 0: 42727.5. Samples: 5176752480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 23:49:50,115][15349] Signal inference workers to stop experience collection... (76650 times) [2024-06-22 23:49:50,160][15401] InferenceWorker_p0-w0: stopping experience collection (76650 times) [2024-06-22 23:49:50,233][15349] Signal inference workers to resume experience collection... (76650 times) [2024-06-22 23:49:50,234][15401] InferenceWorker_p0-w0: resuming experience collection (76650 times) [2024-06-22 23:49:50,235][15401] Updated weights for policy 0, policy_version 315960 (0.0029) [2024-06-22 23:49:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5176786944. Throughput: 0: 42576.9. Samples: 5176869600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 23:49:54,735][15401] Updated weights for policy 0, policy_version 315970 (0.0027) [2024-06-22 23:49:57,893][15401] Updated weights for policy 0, policy_version 315980 (0.0046) [2024-06-22 23:49:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5177032704. Throughput: 0: 42753.6. Samples: 5177130520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:49:58,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-22 23:50:02,762][15401] Updated weights for policy 0, policy_version 315990 (0.0024) [2024-06-22 23:50:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 5177196544. Throughput: 0: 42499.9. Samples: 5177383720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:50:03,391][15132] Avg episode reward: [(0, '0.739')] [2024-06-22 23:50:05,598][15401] Updated weights for policy 0, policy_version 316000 (0.0028) [2024-06-22 23:50:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5177425920. Throughput: 0: 42501.8. Samples: 5177506040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-22 23:50:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-22 23:50:10,411][15401] Updated weights for policy 0, policy_version 316010 (0.0029) [2024-06-22 23:50:13,114][15401] Updated weights for policy 0, policy_version 316020 (0.0032) [2024-06-22 23:50:13,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5177671680. Throughput: 0: 42571.2. Samples: 5177765580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 23:50:18,042][15401] Updated weights for policy 0, policy_version 316030 (0.0036) [2024-06-22 23:50:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42598.5). Total num frames: 5177835520. Throughput: 0: 42481.6. Samples: 5178026080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-22 23:50:21,106][15401] Updated weights for policy 0, policy_version 316040 (0.0030) [2024-06-22 23:50:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42765.9). Total num frames: 5178081280. Throughput: 0: 42562.9. Samples: 5178146180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-22 23:50:25,691][15401] Updated weights for policy 0, policy_version 316050 (0.0027) [2024-06-22 23:50:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42054.0, 300 sec: 42709.8). Total num frames: 5178277888. Throughput: 0: 42655.1. Samples: 5178406960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-22 23:50:28,807][15401] Updated weights for policy 0, policy_version 316060 (0.0034) [2024-06-22 23:50:33,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5178474496. Throughput: 0: 42547.5. Samples: 5178667120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-22 23:50:33,563][15401] Updated weights for policy 0, policy_version 316070 (0.0037) [2024-06-22 23:50:36,447][15401] Updated weights for policy 0, policy_version 316080 (0.0045) [2024-06-22 23:50:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5178720256. Throughput: 0: 42620.4. Samples: 5178787520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-22 23:50:41,277][15401] Updated weights for policy 0, policy_version 316090 (0.0043) [2024-06-22 23:50:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 5178916864. Throughput: 0: 42699.9. Samples: 5179052020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-22 23:50:44,206][15401] Updated weights for policy 0, policy_version 316100 (0.0034) [2024-06-22 23:50:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5179129856. Throughput: 0: 42727.2. Samples: 5179306440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-22 23:50:48,884][15401] Updated weights for policy 0, policy_version 316110 (0.0035) [2024-06-22 23:50:51,752][15401] Updated weights for policy 0, policy_version 316120 (0.0030) [2024-06-22 23:50:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5179359232. Throughput: 0: 42838.1. Samples: 5179433760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:53,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-22 23:50:53,399][15349] Signal inference workers to stop experience collection... (76700 times) [2024-06-22 23:50:53,399][15349] Signal inference workers to resume experience collection... (76700 times) [2024-06-22 23:50:53,414][15401] InferenceWorker_p0-w0: stopping experience collection (76700 times) [2024-06-22 23:50:53,414][15401] InferenceWorker_p0-w0: resuming experience collection (76700 times) [2024-06-22 23:50:56,341][15401] Updated weights for policy 0, policy_version 316130 (0.0027) [2024-06-22 23:50:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 5179555840. Throughput: 0: 42953.7. Samples: 5179698500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:50:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-22 23:50:59,351][15401] Updated weights for policy 0, policy_version 316140 (0.0050) [2024-06-22 23:51:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5179768832. Throughput: 0: 42721.9. Samples: 5179948560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:51:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-22 23:51:04,329][15401] Updated weights for policy 0, policy_version 316150 (0.0038) [2024-06-22 23:51:06,851][15401] Updated weights for policy 0, policy_version 316160 (0.0038) [2024-06-22 23:51:08,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5180030976. Throughput: 0: 42968.6. Samples: 5180079760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:51:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-22 23:51:11,972][15401] Updated weights for policy 0, policy_version 316170 (0.0056) [2024-06-22 23:51:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 5180178432. Throughput: 0: 42832.4. Samples: 5180334420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:51:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-22 23:51:14,732][15401] Updated weights for policy 0, policy_version 316180 (0.0042) [2024-06-22 23:51:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5180407808. Throughput: 0: 42678.3. Samples: 5180587640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:51:18,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-22 23:51:19,540][15401] Updated weights for policy 0, policy_version 316190 (0.0042) [2024-06-22 23:51:22,654][15401] Updated weights for policy 0, policy_version 316200 (0.0037) [2024-06-22 23:51:23,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5180653568. Throughput: 0: 42835.6. Samples: 5180715120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-22 23:51:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-22 23:51:27,188][15401] Updated weights for policy 0, policy_version 316210 (0.0041) [2024-06-22 23:51:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5180833792. Throughput: 0: 42767.6. Samples: 5180976560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-22 23:51:30,225][15401] Updated weights for policy 0, policy_version 316220 (0.0028) [2024-06-22 23:51:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5181046784. Throughput: 0: 42706.7. Samples: 5181228240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:33,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-22 23:51:34,722][15401] Updated weights for policy 0, policy_version 316230 (0.0028) [2024-06-22 23:51:37,714][15401] Updated weights for policy 0, policy_version 316240 (0.0029) [2024-06-22 23:51:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5181292544. Throughput: 0: 42825.9. Samples: 5181360920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-22 23:51:42,564][15401] Updated weights for policy 0, policy_version 316250 (0.0044) [2024-06-22 23:51:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5181472768. Throughput: 0: 42688.1. Samples: 5181619460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-22 23:51:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316252_5181472768.pth... [2024-06-22 23:51:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315626_5171216384.pth [2024-06-22 23:51:45,517][15401] Updated weights for policy 0, policy_version 316260 (0.0036) [2024-06-22 23:51:48,394][15132] Fps is (10 sec: 40942.0, 60 sec: 42868.4, 300 sec: 42819.9). Total num frames: 5181702144. Throughput: 0: 42706.1. Samples: 5181870520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:48,394][15132] Avg episode reward: [(0, '0.450')] [2024-06-22 23:51:50,170][15401] Updated weights for policy 0, policy_version 316270 (0.0029) [2024-06-22 23:51:53,158][15401] Updated weights for policy 0, policy_version 316280 (0.0033) [2024-06-22 23:51:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5181931520. Throughput: 0: 42852.0. Samples: 5182008100. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-22 23:51:57,575][15401] Updated weights for policy 0, policy_version 316290 (0.0023) [2024-06-22 23:51:58,390][15132] Fps is (10 sec: 40977.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5182111744. Throughput: 0: 42931.4. Samples: 5182266340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:51:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-22 23:52:00,865][15401] Updated weights for policy 0, policy_version 316300 (0.0021) [2024-06-22 23:52:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5182357504. Throughput: 0: 42784.8. Samples: 5182512960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 23:52:05,261][15401] Updated weights for policy 0, policy_version 316310 (0.0038) [2024-06-22 23:52:08,283][15401] Updated weights for policy 0, policy_version 316320 (0.0033) [2024-06-22 23:52:08,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5182586880. Throughput: 0: 43023.5. Samples: 5182651180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:08,392][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 23:52:13,047][15401] Updated weights for policy 0, policy_version 316330 (0.0047) [2024-06-22 23:52:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5182750720. Throughput: 0: 42886.2. Samples: 5182906440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-22 23:52:15,447][15349] Signal inference workers to stop experience collection... (76750 times) [2024-06-22 23:52:15,452][15349] Signal inference workers to resume experience collection... (76750 times) [2024-06-22 23:52:15,482][15401] InferenceWorker_p0-w0: stopping experience collection (76750 times) [2024-06-22 23:52:15,514][15401] InferenceWorker_p0-w0: resuming experience collection (76750 times) [2024-06-22 23:52:15,770][15401] Updated weights for policy 0, policy_version 316340 (0.0033) [2024-06-22 23:52:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5183012864. Throughput: 0: 42750.2. Samples: 5183152000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-22 23:52:20,703][15401] Updated weights for policy 0, policy_version 316350 (0.0040) [2024-06-22 23:52:23,309][15401] Updated weights for policy 0, policy_version 316360 (0.0041) [2024-06-22 23:52:23,390][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 5183242240. Throughput: 0: 42999.0. Samples: 5183295880. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 23:52:28,294][15401] Updated weights for policy 0, policy_version 316370 (0.0041) [2024-06-22 23:52:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5183406080. Throughput: 0: 42830.7. Samples: 5183546840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-22 23:52:30,923][15401] Updated weights for policy 0, policy_version 316380 (0.0040) [2024-06-22 23:52:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 5183668224. Throughput: 0: 42958.3. Samples: 5183803460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-22 23:52:35,765][15401] Updated weights for policy 0, policy_version 316390 (0.0027) [2024-06-22 23:52:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5183864832. Throughput: 0: 42957.0. Samples: 5183941160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-22 23:52:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-22 23:52:38,614][15401] Updated weights for policy 0, policy_version 316400 (0.0029) [2024-06-22 23:52:43,275][15401] Updated weights for policy 0, policy_version 316410 (0.0032) [2024-06-22 23:52:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5184061440. Throughput: 0: 42869.0. Samples: 5184195440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:52:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-22 23:52:46,381][15401] Updated weights for policy 0, policy_version 316420 (0.0036) [2024-06-22 23:52:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43147.6, 300 sec: 42820.9). Total num frames: 5184290816. Throughput: 0: 43074.7. Samples: 5184451320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:52:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-22 23:52:50,776][15401] Updated weights for policy 0, policy_version 316430 (0.0026) [2024-06-22 23:52:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5184503808. Throughput: 0: 42881.8. Samples: 5184580860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:52:53,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-22 23:52:53,911][15401] Updated weights for policy 0, policy_version 316440 (0.0034) [2024-06-22 23:52:58,224][15401] Updated weights for policy 0, policy_version 316450 (0.0046) [2024-06-22 23:52:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5184716800. Throughput: 0: 42970.2. Samples: 5184840100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:52:58,396][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 23:53:01,525][15401] Updated weights for policy 0, policy_version 316460 (0.0035) [2024-06-22 23:53:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 5184946176. Throughput: 0: 43131.9. Samples: 5185092940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-22 23:53:05,794][15401] Updated weights for policy 0, policy_version 316470 (0.0030) [2024-06-22 23:53:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5185126400. Throughput: 0: 42905.9. Samples: 5185226640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-22 23:53:09,500][15401] Updated weights for policy 0, policy_version 316480 (0.0030) [2024-06-22 23:53:13,334][15401] Updated weights for policy 0, policy_version 316490 (0.0030) [2024-06-22 23:53:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 5185372160. Throughput: 0: 42982.6. Samples: 5185481060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 23:53:17,362][15401] Updated weights for policy 0, policy_version 316500 (0.0037) [2024-06-22 23:53:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5185585152. Throughput: 0: 42866.8. Samples: 5185732460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-22 23:53:21,137][15401] Updated weights for policy 0, policy_version 316510 (0.0040) [2024-06-22 23:53:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5185765376. Throughput: 0: 42731.5. Samples: 5185864080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-22 23:53:24,968][15401] Updated weights for policy 0, policy_version 316520 (0.0042) [2024-06-22 23:53:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5185994752. Throughput: 0: 42720.4. Samples: 5186117860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:28,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-22 23:53:28,976][15401] Updated weights for policy 0, policy_version 316530 (0.0029) [2024-06-22 23:53:32,754][15401] Updated weights for policy 0, policy_version 316540 (0.0027) [2024-06-22 23:53:33,391][15132] Fps is (10 sec: 44228.3, 60 sec: 42324.1, 300 sec: 42820.3). Total num frames: 5186207744. Throughput: 0: 42722.2. Samples: 5186373900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:33,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-22 23:53:37,167][15401] Updated weights for policy 0, policy_version 316550 (0.0037) [2024-06-22 23:53:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5186404352. Throughput: 0: 42632.0. Samples: 5186499300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-22 23:53:40,591][15401] Updated weights for policy 0, policy_version 316560 (0.0027) [2024-06-22 23:53:42,580][15349] Signal inference workers to stop experience collection... (76800 times) [2024-06-22 23:53:42,581][15349] Signal inference workers to resume experience collection... (76800 times) [2024-06-22 23:53:42,600][15401] InferenceWorker_p0-w0: stopping experience collection (76800 times) [2024-06-22 23:53:42,600][15401] InferenceWorker_p0-w0: resuming experience collection (76800 times) [2024-06-22 23:53:43,389][15132] Fps is (10 sec: 44245.3, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 5186650112. Throughput: 0: 42563.7. Samples: 5186755460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-22 23:53:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316569_5186666496.pth... [2024-06-22 23:53:43,598][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000315943_5176410112.pth [2024-06-22 23:53:44,601][15401] Updated weights for policy 0, policy_version 316570 (0.0042) [2024-06-22 23:53:48,152][15401] Updated weights for policy 0, policy_version 316580 (0.0030) [2024-06-22 23:53:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5186846720. Throughput: 0: 42708.4. Samples: 5187014820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-22 23:53:48,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 23:53:52,071][15401] Updated weights for policy 0, policy_version 316590 (0.0034) [2024-06-22 23:53:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5187059712. Throughput: 0: 42487.0. Samples: 5187138560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:53:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-22 23:53:56,072][15401] Updated weights for policy 0, policy_version 316600 (0.0030) [2024-06-22 23:53:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5187305472. Throughput: 0: 42574.7. Samples: 5187396920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:53:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-22 23:53:59,603][15401] Updated weights for policy 0, policy_version 316610 (0.0040) [2024-06-22 23:54:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5187469312. Throughput: 0: 42905.7. Samples: 5187663220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-22 23:54:03,610][15401] Updated weights for policy 0, policy_version 316620 (0.0028) [2024-06-22 23:54:06,925][15401] Updated weights for policy 0, policy_version 316630 (0.0035) [2024-06-22 23:54:08,393][15132] Fps is (10 sec: 39306.1, 60 sec: 42868.6, 300 sec: 42653.4). Total num frames: 5187698688. Throughput: 0: 42641.9. Samples: 5187783140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:08,394][15132] Avg episode reward: [(0, '0.443')] [2024-06-22 23:54:11,222][15401] Updated weights for policy 0, policy_version 316640 (0.0043) [2024-06-22 23:54:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5187944448. Throughput: 0: 42706.6. Samples: 5188039660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-22 23:54:14,427][15401] Updated weights for policy 0, policy_version 316650 (0.0022) [2024-06-22 23:54:18,389][15132] Fps is (10 sec: 42615.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5188124672. Throughput: 0: 43032.9. Samples: 5188310300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-22 23:54:18,814][15401] Updated weights for policy 0, policy_version 316660 (0.0046) [2024-06-22 23:54:21,957][15401] Updated weights for policy 0, policy_version 316670 (0.0044) [2024-06-22 23:54:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 5188354048. Throughput: 0: 42791.0. Samples: 5188424900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:23,399][15132] Avg episode reward: [(0, '0.825')] [2024-06-22 23:54:26,454][15401] Updated weights for policy 0, policy_version 316680 (0.0029) [2024-06-22 23:54:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5188583424. Throughput: 0: 42914.7. Samples: 5188686620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-22 23:54:29,672][15401] Updated weights for policy 0, policy_version 316690 (0.0030) [2024-06-22 23:54:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42053.5, 300 sec: 42653.9). Total num frames: 5188730880. Throughput: 0: 43036.4. Samples: 5188951460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:33,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-22 23:54:34,047][15401] Updated weights for policy 0, policy_version 316700 (0.0035) [2024-06-22 23:54:37,159][15401] Updated weights for policy 0, policy_version 316710 (0.0024) [2024-06-22 23:54:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 5189009408. Throughput: 0: 42710.2. Samples: 5189060520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-22 23:54:41,695][15401] Updated weights for policy 0, policy_version 316720 (0.0039) [2024-06-22 23:54:41,719][15349] Signal inference workers to stop experience collection... (76850 times) [2024-06-22 23:54:41,724][15349] Signal inference workers to resume experience collection... (76850 times) [2024-06-22 23:54:41,765][15401] InferenceWorker_p0-w0: stopping experience collection (76850 times) [2024-06-22 23:54:41,765][15401] InferenceWorker_p0-w0: resuming experience collection (76850 times) [2024-06-22 23:54:43,389][15132] Fps is (10 sec: 49152.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5189222400. Throughput: 0: 42902.3. Samples: 5189327520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 23:54:44,648][15401] Updated weights for policy 0, policy_version 316730 (0.0041) [2024-06-22 23:54:48,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5189386240. Throughput: 0: 43017.0. Samples: 5189598980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-22 23:54:49,388][15401] Updated weights for policy 0, policy_version 316740 (0.0041) [2024-06-22 23:54:52,293][15401] Updated weights for policy 0, policy_version 316750 (0.0042) [2024-06-22 23:54:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5189648384. Throughput: 0: 42866.4. Samples: 5189711960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-22 23:54:57,109][15401] Updated weights for policy 0, policy_version 316760 (0.0033) [2024-06-22 23:54:58,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5189877760. Throughput: 0: 43035.7. Samples: 5189976260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:54:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-22 23:55:00,080][15401] Updated weights for policy 0, policy_version 316770 (0.0030) [2024-06-22 23:55:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5190041600. Throughput: 0: 42831.5. Samples: 5190237720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-22 23:55:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-22 23:55:05,018][15401] Updated weights for policy 0, policy_version 316780 (0.0043) [2024-06-22 23:55:07,646][15401] Updated weights for policy 0, policy_version 316790 (0.0039) [2024-06-22 23:55:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43420.5, 300 sec: 42820.6). Total num frames: 5190303744. Throughput: 0: 42796.5. Samples: 5190350740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:08,399][15132] Avg episode reward: [(0, '0.610')] [2024-06-22 23:55:12,524][15401] Updated weights for policy 0, policy_version 316800 (0.0039) [2024-06-22 23:55:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5190500352. Throughput: 0: 42876.7. Samples: 5190616080. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:13,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-22 23:55:15,331][15401] Updated weights for policy 0, policy_version 316810 (0.0034) [2024-06-22 23:55:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5190696960. Throughput: 0: 42685.9. Samples: 5190872320. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-22 23:55:20,176][15401] Updated weights for policy 0, policy_version 316820 (0.0033) [2024-06-22 23:55:22,896][15401] Updated weights for policy 0, policy_version 316830 (0.0033) [2024-06-22 23:55:23,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 5190942720. Throughput: 0: 43116.9. Samples: 5191000880. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:23,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-22 23:55:27,609][15401] Updated weights for policy 0, policy_version 316840 (0.0037) [2024-06-22 23:55:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5191122944. Throughput: 0: 43013.8. Samples: 5191263140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-22 23:55:30,639][15401] Updated weights for policy 0, policy_version 316850 (0.0032) [2024-06-22 23:55:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 5191352320. Throughput: 0: 42650.1. Samples: 5191518240. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-22 23:55:35,798][15401] Updated weights for policy 0, policy_version 316860 (0.0039) [2024-06-22 23:55:38,240][15401] Updated weights for policy 0, policy_version 316870 (0.0032) [2024-06-22 23:55:38,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5191598080. Throughput: 0: 42973.4. Samples: 5191645760. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-22 23:55:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5191745536. Throughput: 0: 42656.9. Samples: 5191895820. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-22 23:55:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316880_5191761920.pth... [2024-06-22 23:55:43,490][15401] Updated weights for policy 0, policy_version 316880 (0.0041) [2024-06-22 23:55:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316252_5181472768.pth [2024-06-22 23:55:45,843][15401] Updated weights for policy 0, policy_version 316890 (0.0038) [2024-06-22 23:55:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 5191991296. Throughput: 0: 42564.1. Samples: 5192153100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 23:55:50,826][15349] Signal inference workers to stop experience collection... (76900 times) [2024-06-22 23:55:50,832][15349] Signal inference workers to resume experience collection... (76900 times) [2024-06-22 23:55:50,868][15401] InferenceWorker_p0-w0: stopping experience collection (76900 times) [2024-06-22 23:55:50,868][15401] InferenceWorker_p0-w0: resuming experience collection (76900 times) [2024-06-22 23:55:50,981][15401] Updated weights for policy 0, policy_version 316900 (0.0030) [2024-06-22 23:55:53,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5192237056. Throughput: 0: 43055.1. Samples: 5192288220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-22 23:55:53,559][15401] Updated weights for policy 0, policy_version 316910 (0.0042) [2024-06-22 23:55:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 5192400896. Throughput: 0: 42892.1. Samples: 5192546220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:55:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-22 23:55:58,605][15401] Updated weights for policy 0, policy_version 316920 (0.0027) [2024-06-22 23:56:01,247][15401] Updated weights for policy 0, policy_version 316930 (0.0055) [2024-06-22 23:56:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5192630272. Throughput: 0: 42689.8. Samples: 5192793360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:56:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-22 23:56:06,171][15401] Updated weights for policy 0, policy_version 316940 (0.0026) [2024-06-22 23:56:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 5192843264. Throughput: 0: 42876.6. Samples: 5192930220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:56:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 23:56:09,303][15401] Updated weights for policy 0, policy_version 316950 (0.0020) [2024-06-22 23:56:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5193039872. Throughput: 0: 42708.9. Samples: 5193185040. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:56:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-22 23:56:13,725][15401] Updated weights for policy 0, policy_version 316960 (0.0029) [2024-06-22 23:56:17,112][15401] Updated weights for policy 0, policy_version 316970 (0.0038) [2024-06-22 23:56:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5193285632. Throughput: 0: 42657.8. Samples: 5193437840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-22 23:56:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-22 23:56:21,333][15401] Updated weights for policy 0, policy_version 316980 (0.0038) [2024-06-22 23:56:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 5193515008. Throughput: 0: 42786.2. Samples: 5193571140. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-22 23:56:24,540][15401] Updated weights for policy 0, policy_version 316990 (0.0027) [2024-06-22 23:56:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5193695232. Throughput: 0: 42896.5. Samples: 5193826160. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-22 23:56:28,847][15401] Updated weights for policy 0, policy_version 317000 (0.0053) [2024-06-22 23:56:32,129][15401] Updated weights for policy 0, policy_version 317010 (0.0028) [2024-06-22 23:56:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5193908224. Throughput: 0: 42853.4. Samples: 5194081500. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-22 23:56:36,966][15401] Updated weights for policy 0, policy_version 317020 (0.0043) [2024-06-22 23:56:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 5194137600. Throughput: 0: 42826.7. Samples: 5194215420. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-22 23:56:39,669][15401] Updated weights for policy 0, policy_version 317030 (0.0030) [2024-06-22 23:56:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42821.2). Total num frames: 5194334208. Throughput: 0: 42653.2. Samples: 5194465620. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-22 23:56:44,457][15401] Updated weights for policy 0, policy_version 317040 (0.0034) [2024-06-22 23:56:47,762][15401] Updated weights for policy 0, policy_version 317050 (0.0026) [2024-06-22 23:56:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5194579968. Throughput: 0: 42801.3. Samples: 5194719520. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:48,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 23:56:51,874][15401] Updated weights for policy 0, policy_version 317060 (0.0027) [2024-06-22 23:56:53,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5194792960. Throughput: 0: 42774.6. Samples: 5194855080. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-22 23:56:55,339][15401] Updated weights for policy 0, policy_version 317070 (0.0027) [2024-06-22 23:56:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5194989568. Throughput: 0: 42936.8. Samples: 5195117200. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:56:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-22 23:56:59,800][15401] Updated weights for policy 0, policy_version 317080 (0.0036) [2024-06-22 23:57:03,191][15401] Updated weights for policy 0, policy_version 317090 (0.0035) [2024-06-22 23:57:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5195218944. Throughput: 0: 42904.1. Samples: 5195368520. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-22 23:57:07,397][15401] Updated weights for policy 0, policy_version 317100 (0.0041) [2024-06-22 23:57:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5195415552. Throughput: 0: 42698.3. Samples: 5195492560. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-22 23:57:10,829][15401] Updated weights for policy 0, policy_version 317110 (0.0033) [2024-06-22 23:57:13,093][15349] Signal inference workers to stop experience collection... (76950 times) [2024-06-22 23:57:13,135][15401] InferenceWorker_p0-w0: stopping experience collection (76950 times) [2024-06-22 23:57:13,149][15349] Signal inference workers to resume experience collection... (76950 times) [2024-06-22 23:57:13,151][15401] InferenceWorker_p0-w0: resuming experience collection (76950 times) [2024-06-22 23:57:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5195644928. Throughput: 0: 42817.7. Samples: 5195752960. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-22 23:57:14,828][15401] Updated weights for policy 0, policy_version 317120 (0.0046) [2024-06-22 23:57:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5195841536. Throughput: 0: 42864.0. Samples: 5196010380. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-22 23:57:18,470][15401] Updated weights for policy 0, policy_version 317130 (0.0042) [2024-06-22 23:57:22,566][15401] Updated weights for policy 0, policy_version 317140 (0.0033) [2024-06-22 23:57:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 5196038144. Throughput: 0: 42655.6. Samples: 5196134920. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:23,391][15132] Avg episode reward: [(0, '0.582')] [2024-06-22 23:57:26,230][15401] Updated weights for policy 0, policy_version 317150 (0.0028) [2024-06-22 23:57:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5196267520. Throughput: 0: 42738.4. Samples: 5196388840. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-22 23:57:30,174][15401] Updated weights for policy 0, policy_version 317160 (0.0046) [2024-06-22 23:57:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5196480512. Throughput: 0: 42976.5. Samples: 5196653360. Policy #0 lag: (min: 2.0, avg: 10.2, max: 22.0) [2024-06-22 23:57:33,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-22 23:57:33,793][15401] Updated weights for policy 0, policy_version 317170 (0.0036) [2024-06-22 23:57:37,746][15401] Updated weights for policy 0, policy_version 317180 (0.0031) [2024-06-22 23:57:38,391][15132] Fps is (10 sec: 42592.9, 60 sec: 42597.6, 300 sec: 42820.4). Total num frames: 5196693504. Throughput: 0: 42713.9. Samples: 5196777260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:57:38,391][15132] Avg episode reward: [(0, '0.306')] [2024-06-22 23:57:41,540][15401] Updated weights for policy 0, policy_version 317190 (0.0028) [2024-06-22 23:57:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5196906496. Throughput: 0: 42539.6. Samples: 5197031480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:57:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-22 23:57:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317195_5196922880.pth... [2024-06-22 23:57:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316569_5186666496.pth [2024-06-22 23:57:45,463][15401] Updated weights for policy 0, policy_version 317200 (0.0034) [2024-06-22 23:57:48,389][15132] Fps is (10 sec: 42603.7, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 5197119488. Throughput: 0: 42722.2. Samples: 5197291020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:57:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-22 23:57:49,375][15401] Updated weights for policy 0, policy_version 317210 (0.0029) [2024-06-22 23:57:53,370][15401] Updated weights for policy 0, policy_version 317220 (0.0043) [2024-06-22 23:57:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5197332480. Throughput: 0: 42790.6. Samples: 5197418140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:57:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 23:57:57,011][15401] Updated weights for policy 0, policy_version 317230 (0.0037) [2024-06-22 23:57:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5197545472. Throughput: 0: 42637.8. Samples: 5197671660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:57:58,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 23:58:01,151][15401] Updated weights for policy 0, policy_version 317240 (0.0033) [2024-06-22 23:58:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5197791232. Throughput: 0: 42661.7. Samples: 5197930160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:03,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-22 23:58:04,547][15401] Updated weights for policy 0, policy_version 317250 (0.0032) [2024-06-22 23:58:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5197955072. Throughput: 0: 42784.9. Samples: 5198060240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-22 23:58:08,562][15401] Updated weights for policy 0, policy_version 317260 (0.0033) [2024-06-22 23:58:12,027][15401] Updated weights for policy 0, policy_version 317270 (0.0042) [2024-06-22 23:58:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5198200832. Throughput: 0: 42774.5. Samples: 5198313700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-22 23:58:15,968][15401] Updated weights for policy 0, policy_version 317280 (0.0037) [2024-06-22 23:58:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5198413824. Throughput: 0: 42626.3. Samples: 5198571540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:18,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-22 23:58:19,961][15401] Updated weights for policy 0, policy_version 317290 (0.0042) [2024-06-22 23:58:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5198610432. Throughput: 0: 42679.8. Samples: 5198697800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 23:58:23,593][15401] Updated weights for policy 0, policy_version 317300 (0.0036) [2024-06-22 23:58:24,794][15349] Signal inference workers to stop experience collection... (77000 times) [2024-06-22 23:58:24,796][15349] Signal inference workers to resume experience collection... (77000 times) [2024-06-22 23:58:24,804][15401] InferenceWorker_p0-w0: stopping experience collection (77000 times) [2024-06-22 23:58:24,831][15401] InferenceWorker_p0-w0: resuming experience collection (77000 times) [2024-06-22 23:58:27,585][15401] Updated weights for policy 0, policy_version 317310 (0.0037) [2024-06-22 23:58:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 5198839808. Throughput: 0: 42763.1. Samples: 5198955820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-22 23:58:31,104][15401] Updated weights for policy 0, policy_version 317320 (0.0035) [2024-06-22 23:58:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5199036416. Throughput: 0: 42699.5. Samples: 5199212500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-22 23:58:35,230][15401] Updated weights for policy 0, policy_version 317330 (0.0031) [2024-06-22 23:58:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.3, 300 sec: 42709.5). Total num frames: 5199249408. Throughput: 0: 42701.0. Samples: 5199339680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:38,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-22 23:58:39,234][15401] Updated weights for policy 0, policy_version 317340 (0.0040) [2024-06-22 23:58:42,775][15401] Updated weights for policy 0, policy_version 317350 (0.0024) [2024-06-22 23:58:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5199478784. Throughput: 0: 42794.6. Samples: 5199597520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:43,401][15132] Avg episode reward: [(0, '0.430')] [2024-06-22 23:58:46,718][15401] Updated weights for policy 0, policy_version 317360 (0.0034) [2024-06-22 23:58:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5199691776. Throughput: 0: 42845.8. Samples: 5199858220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-22 23:58:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-22 23:58:50,389][15401] Updated weights for policy 0, policy_version 317370 (0.0042) [2024-06-22 23:58:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5199888384. Throughput: 0: 42848.1. Samples: 5199988400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:58:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-22 23:58:54,187][15401] Updated weights for policy 0, policy_version 317380 (0.0032) [2024-06-22 23:58:57,851][15401] Updated weights for policy 0, policy_version 317390 (0.0037) [2024-06-22 23:58:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5200117760. Throughput: 0: 42882.4. Samples: 5200243400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:58:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-22 23:59:01,907][15401] Updated weights for policy 0, policy_version 317400 (0.0023) [2024-06-22 23:59:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42821.1). Total num frames: 5200330752. Throughput: 0: 42873.8. Samples: 5200500860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:03,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-22 23:59:05,463][15401] Updated weights for policy 0, policy_version 317410 (0.0035) [2024-06-22 23:59:08,390][15132] Fps is (10 sec: 42594.3, 60 sec: 43143.9, 300 sec: 42709.4). Total num frames: 5200543744. Throughput: 0: 42990.2. Samples: 5200632400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:08,391][15132] Avg episode reward: [(0, '0.420')] [2024-06-22 23:59:09,379][15401] Updated weights for policy 0, policy_version 317420 (0.0025) [2024-06-22 23:59:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5200756736. Throughput: 0: 42899.9. Samples: 5200886320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-22 23:59:13,537][15401] Updated weights for policy 0, policy_version 317430 (0.0036) [2024-06-22 23:59:17,398][15401] Updated weights for policy 0, policy_version 317440 (0.0038) [2024-06-22 23:59:18,390][15132] Fps is (10 sec: 44240.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5200986112. Throughput: 0: 42816.4. Samples: 5201139240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-22 23:59:21,126][15401] Updated weights for policy 0, policy_version 317450 (0.0038) [2024-06-22 23:59:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5201199104. Throughput: 0: 42916.0. Samples: 5201270900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-22 23:59:25,203][15401] Updated weights for policy 0, policy_version 317460 (0.0028) [2024-06-22 23:59:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5201395712. Throughput: 0: 42976.5. Samples: 5201531360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-22 23:59:28,848][15401] Updated weights for policy 0, policy_version 317470 (0.0030) [2024-06-22 23:59:32,637][15401] Updated weights for policy 0, policy_version 317480 (0.0037) [2024-06-22 23:59:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5201625088. Throughput: 0: 42839.2. Samples: 5201785980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-22 23:59:36,328][15401] Updated weights for policy 0, policy_version 317490 (0.0033) [2024-06-22 23:59:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5201854464. Throughput: 0: 42880.8. Samples: 5201918040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-22 23:59:40,339][15401] Updated weights for policy 0, policy_version 317500 (0.0036) [2024-06-22 23:59:43,390][15132] Fps is (10 sec: 42595.8, 60 sec: 42872.8, 300 sec: 42931.5). Total num frames: 5202051072. Throughput: 0: 42880.7. Samples: 5202173060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-22 23:59:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317508_5202051072.pth... [2024-06-22 23:59:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000316880_5191761920.pth [2024-06-22 23:59:43,667][15349] Signal inference workers to stop experience collection... (77050 times) [2024-06-22 23:59:43,667][15349] Signal inference workers to resume experience collection... (77050 times) [2024-06-22 23:59:43,705][15401] InferenceWorker_p0-w0: stopping experience collection (77050 times) [2024-06-22 23:59:43,706][15401] InferenceWorker_p0-w0: resuming experience collection (77050 times) [2024-06-22 23:59:43,823][15401] Updated weights for policy 0, policy_version 317510 (0.0032) [2024-06-22 23:59:47,861][15401] Updated weights for policy 0, policy_version 317520 (0.0035) [2024-06-22 23:59:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5202264064. Throughput: 0: 42873.7. Samples: 5202430180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-22 23:59:51,601][15401] Updated weights for policy 0, policy_version 317530 (0.0034) [2024-06-22 23:59:53,396][15132] Fps is (10 sec: 42573.7, 60 sec: 43139.9, 300 sec: 42708.5). Total num frames: 5202477056. Throughput: 0: 42709.0. Samples: 5202554540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:53,396][15132] Avg episode reward: [(0, '0.580')] [2024-06-22 23:59:55,662][15401] Updated weights for policy 0, policy_version 317540 (0.0037) [2024-06-22 23:59:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5202690048. Throughput: 0: 42782.8. Samples: 5202811540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-22 23:59:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-22 23:59:59,183][15401] Updated weights for policy 0, policy_version 317550 (0.0038) [2024-06-23 00:00:03,187][15401] Updated weights for policy 0, policy_version 317560 (0.0033) [2024-06-23 00:00:03,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5202903040. Throughput: 0: 42878.1. Samples: 5203068760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 00:00:03,396][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 00:00:06,801][15401] Updated weights for policy 0, policy_version 317570 (0.0028) [2024-06-23 00:00:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42872.1, 300 sec: 42765.0). Total num frames: 5203116032. Throughput: 0: 42699.9. Samples: 5203192400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 00:00:10,907][15401] Updated weights for policy 0, policy_version 317580 (0.0037) [2024-06-23 00:00:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5203329024. Throughput: 0: 42650.3. Samples: 5203450620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 00:00:14,368][15401] Updated weights for policy 0, policy_version 317590 (0.0039) [2024-06-23 00:00:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 5203525632. Throughput: 0: 42683.4. Samples: 5203706740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 00:00:18,671][15401] Updated weights for policy 0, policy_version 317600 (0.0033) [2024-06-23 00:00:22,465][15401] Updated weights for policy 0, policy_version 317610 (0.0026) [2024-06-23 00:00:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5203755008. Throughput: 0: 42558.8. Samples: 5203833180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 00:00:26,326][15401] Updated weights for policy 0, policy_version 317620 (0.0041) [2024-06-23 00:00:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5203968000. Throughput: 0: 42703.2. Samples: 5204094680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 00:00:30,078][15401] Updated weights for policy 0, policy_version 317630 (0.0041) [2024-06-23 00:00:33,395][15132] Fps is (10 sec: 42574.8, 60 sec: 42594.5, 300 sec: 42653.1). Total num frames: 5204180992. Throughput: 0: 42595.3. Samples: 5204347200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:33,395][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 00:00:34,041][15401] Updated weights for policy 0, policy_version 317640 (0.0034) [2024-06-23 00:00:37,710][15401] Updated weights for policy 0, policy_version 317650 (0.0046) [2024-06-23 00:00:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 5204393984. Throughput: 0: 42589.1. Samples: 5204470880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:38,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 00:00:41,610][15401] Updated weights for policy 0, policy_version 317660 (0.0040) [2024-06-23 00:00:43,389][15132] Fps is (10 sec: 42621.7, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 5204606976. Throughput: 0: 42670.6. Samples: 5204731720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 00:00:45,416][15401] Updated weights for policy 0, policy_version 317670 (0.0030) [2024-06-23 00:00:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5204803584. Throughput: 0: 42430.0. Samples: 5204978100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:48,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 00:00:49,900][15401] Updated weights for policy 0, policy_version 317680 (0.0036) [2024-06-23 00:00:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 5205016576. Throughput: 0: 42517.9. Samples: 5205105700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 00:00:53,620][15401] Updated weights for policy 0, policy_version 317690 (0.0039) [2024-06-23 00:00:57,331][15401] Updated weights for policy 0, policy_version 317700 (0.0033) [2024-06-23 00:00:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5205245952. Throughput: 0: 42594.7. Samples: 5205367380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:00:58,390][15132] Avg episode reward: [(0, '0.175')] [2024-06-23 00:01:01,183][15401] Updated weights for policy 0, policy_version 317710 (0.0038) [2024-06-23 00:01:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5205475328. Throughput: 0: 42491.6. Samples: 5205618860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:01:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 00:01:05,101][15401] Updated weights for policy 0, policy_version 317720 (0.0030) [2024-06-23 00:01:05,706][15349] Signal inference workers to stop experience collection... (77100 times) [2024-06-23 00:01:05,755][15401] InferenceWorker_p0-w0: stopping experience collection (77100 times) [2024-06-23 00:01:05,764][15349] Signal inference workers to resume experience collection... (77100 times) [2024-06-23 00:01:05,774][15401] InferenceWorker_p0-w0: resuming experience collection (77100 times) [2024-06-23 00:01:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5205655552. Throughput: 0: 42560.8. Samples: 5205748420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:01:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 00:01:08,818][15401] Updated weights for policy 0, policy_version 317730 (0.0035) [2024-06-23 00:01:12,992][15401] Updated weights for policy 0, policy_version 317740 (0.0030) [2024-06-23 00:01:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5205884928. Throughput: 0: 42594.2. Samples: 5206011420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:01:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 00:01:16,599][15401] Updated weights for policy 0, policy_version 317750 (0.0031) [2024-06-23 00:01:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5206097920. Throughput: 0: 42569.3. Samples: 5206262580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:01:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 00:01:20,503][15401] Updated weights for policy 0, policy_version 317760 (0.0031) [2024-06-23 00:01:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5206310912. Throughput: 0: 42687.6. Samples: 5206391720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:23,398][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 00:01:24,110][15401] Updated weights for policy 0, policy_version 317770 (0.0040) [2024-06-23 00:01:27,956][15401] Updated weights for policy 0, policy_version 317780 (0.0041) [2024-06-23 00:01:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5206523904. Throughput: 0: 42582.3. Samples: 5206647920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 00:01:31,794][15401] Updated weights for policy 0, policy_version 317790 (0.0033) [2024-06-23 00:01:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42602.3, 300 sec: 42709.5). Total num frames: 5206736896. Throughput: 0: 42813.7. Samples: 5206904720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:33,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 00:01:35,510][15401] Updated weights for policy 0, policy_version 317800 (0.0052) [2024-06-23 00:01:38,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 5206966272. Throughput: 0: 42930.2. Samples: 5207037560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 00:01:39,527][15401] Updated weights for policy 0, policy_version 317810 (0.0027) [2024-06-23 00:01:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 5207146496. Throughput: 0: 42707.6. Samples: 5207289220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 00:01:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317819_5207146496.pth... [2024-06-23 00:01:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317195_5196922880.pth [2024-06-23 00:01:43,706][15401] Updated weights for policy 0, policy_version 317820 (0.0039) [2024-06-23 00:01:47,200][15401] Updated weights for policy 0, policy_version 317830 (0.0035) [2024-06-23 00:01:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5207392256. Throughput: 0: 42707.2. Samples: 5207540680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 00:01:51,245][15401] Updated weights for policy 0, policy_version 317840 (0.0030) [2024-06-23 00:01:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5207588864. Throughput: 0: 42750.2. Samples: 5207672180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 00:01:55,138][15401] Updated weights for policy 0, policy_version 317850 (0.0042) [2024-06-23 00:01:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5207801856. Throughput: 0: 42518.3. Samples: 5207924740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:01:58,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 00:01:58,677][15401] Updated weights for policy 0, policy_version 317860 (0.0038) [2024-06-23 00:02:02,629][15401] Updated weights for policy 0, policy_version 317870 (0.0029) [2024-06-23 00:02:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5207998464. Throughput: 0: 42688.7. Samples: 5208183580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 00:02:06,373][15401] Updated weights for policy 0, policy_version 317880 (0.0038) [2024-06-23 00:02:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5208244224. Throughput: 0: 42660.4. Samples: 5208311440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 00:02:10,246][15401] Updated weights for policy 0, policy_version 317890 (0.0041) [2024-06-23 00:02:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5208440832. Throughput: 0: 42620.7. Samples: 5208565860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 00:02:13,886][15401] Updated weights for policy 0, policy_version 317900 (0.0033) [2024-06-23 00:02:17,983][15401] Updated weights for policy 0, policy_version 317910 (0.0030) [2024-06-23 00:02:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5208637440. Throughput: 0: 42756.8. Samples: 5208828780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:18,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 00:02:21,486][15401] Updated weights for policy 0, policy_version 317920 (0.0028) [2024-06-23 00:02:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5208883200. Throughput: 0: 42690.7. Samples: 5208958640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 00:02:25,391][15401] Updated weights for policy 0, policy_version 317930 (0.0028) [2024-06-23 00:02:28,396][15132] Fps is (10 sec: 45846.2, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 5209096192. Throughput: 0: 42707.3. Samples: 5209211320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:28,396][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 00:02:29,096][15401] Updated weights for policy 0, policy_version 317940 (0.0035) [2024-06-23 00:02:32,736][15401] Updated weights for policy 0, policy_version 317950 (0.0031) [2024-06-23 00:02:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 5209292800. Throughput: 0: 42904.3. Samples: 5209471380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:02:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 00:02:36,621][15401] Updated weights for policy 0, policy_version 317960 (0.0031) [2024-06-23 00:02:38,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5209522176. Throughput: 0: 42807.1. Samples: 5209598500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:02:38,390][15132] Avg episode reward: [(0, '0.121')] [2024-06-23 00:02:40,273][15401] Updated weights for policy 0, policy_version 317970 (0.0040) [2024-06-23 00:02:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5209735168. Throughput: 0: 42931.6. Samples: 5209856660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:02:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 00:02:44,346][15349] Signal inference workers to stop experience collection... (77150 times) [2024-06-23 00:02:44,399][15401] InferenceWorker_p0-w0: stopping experience collection (77150 times) [2024-06-23 00:02:44,406][15349] Signal inference workers to resume experience collection... (77150 times) [2024-06-23 00:02:44,418][15401] InferenceWorker_p0-w0: resuming experience collection (77150 times) [2024-06-23 00:02:44,538][15401] Updated weights for policy 0, policy_version 317980 (0.0026) [2024-06-23 00:02:48,063][15401] Updated weights for policy 0, policy_version 317990 (0.0042) [2024-06-23 00:02:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5209948160. Throughput: 0: 42784.9. Samples: 5210108900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:02:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 00:02:52,151][15401] Updated weights for policy 0, policy_version 318000 (0.0032) [2024-06-23 00:02:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5210161152. Throughput: 0: 42992.0. Samples: 5210246080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:02:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 00:02:55,595][15401] Updated weights for policy 0, policy_version 318010 (0.0033) [2024-06-23 00:02:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5210374144. Throughput: 0: 42940.1. Samples: 5210498160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:02:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 00:02:59,756][15401] Updated weights for policy 0, policy_version 318020 (0.0033) [2024-06-23 00:03:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5210587136. Throughput: 0: 42768.2. Samples: 5210753340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 00:03:03,539][15401] Updated weights for policy 0, policy_version 318030 (0.0029) [2024-06-23 00:03:07,448][15401] Updated weights for policy 0, policy_version 318040 (0.0042) [2024-06-23 00:03:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5210800128. Throughput: 0: 42787.5. Samples: 5210884080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 00:03:11,143][15401] Updated weights for policy 0, policy_version 318050 (0.0034) [2024-06-23 00:03:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5211013120. Throughput: 0: 42906.5. Samples: 5211141840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 00:03:15,068][15401] Updated weights for policy 0, policy_version 318060 (0.0041) [2024-06-23 00:03:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5211242496. Throughput: 0: 42647.6. Samples: 5211390520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 00:03:18,709][15401] Updated weights for policy 0, policy_version 318070 (0.0033) [2024-06-23 00:03:22,838][15401] Updated weights for policy 0, policy_version 318080 (0.0034) [2024-06-23 00:03:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5211422720. Throughput: 0: 42780.5. Samples: 5211523620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 00:03:26,226][15401] Updated weights for policy 0, policy_version 318090 (0.0023) [2024-06-23 00:03:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 5211652096. Throughput: 0: 42746.2. Samples: 5211780240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 00:03:30,517][15401] Updated weights for policy 0, policy_version 318100 (0.0043) [2024-06-23 00:03:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5211881472. Throughput: 0: 42819.6. Samples: 5212035780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:03:34,075][15401] Updated weights for policy 0, policy_version 318110 (0.0036) [2024-06-23 00:03:38,083][15401] Updated weights for policy 0, policy_version 318120 (0.0023) [2024-06-23 00:03:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5212094464. Throughput: 0: 42861.4. Samples: 5212174840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 00:03:41,435][15401] Updated weights for policy 0, policy_version 318130 (0.0026) [2024-06-23 00:03:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5212291072. Throughput: 0: 42777.3. Samples: 5212423140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 00:03:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318133_5212291072.pth... [2024-06-23 00:03:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317508_5202051072.pth [2024-06-23 00:03:45,505][15401] Updated weights for policy 0, policy_version 318140 (0.0023) [2024-06-23 00:03:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5212536832. Throughput: 0: 42794.1. Samples: 5212679080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 00:03:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 00:03:49,218][15401] Updated weights for policy 0, policy_version 318150 (0.0029) [2024-06-23 00:03:53,171][15401] Updated weights for policy 0, policy_version 318160 (0.0022) [2024-06-23 00:03:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5212733440. Throughput: 0: 42893.7. Samples: 5212814300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:03:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 00:03:56,927][15401] Updated weights for policy 0, policy_version 318170 (0.0029) [2024-06-23 00:03:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5212913664. Throughput: 0: 42799.7. Samples: 5213067820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:03:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 00:04:01,052][15401] Updated weights for policy 0, policy_version 318180 (0.0041) [2024-06-23 00:04:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 5213175808. Throughput: 0: 42987.5. Samples: 5213324960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 00:04:04,435][15401] Updated weights for policy 0, policy_version 318190 (0.0039) [2024-06-23 00:04:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5213356032. Throughput: 0: 43019.2. Samples: 5213459480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 00:04:08,577][15401] Updated weights for policy 0, policy_version 318200 (0.0061) [2024-06-23 00:04:11,848][15401] Updated weights for policy 0, policy_version 318210 (0.0039) [2024-06-23 00:04:13,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 5213569024. Throughput: 0: 42906.6. Samples: 5213711140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:13,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:04:16,135][15401] Updated weights for policy 0, policy_version 318220 (0.0039) [2024-06-23 00:04:18,385][15349] Signal inference workers to stop experience collection... (77200 times) [2024-06-23 00:04:18,387][15349] Signal inference workers to resume experience collection... (77200 times) [2024-06-23 00:04:18,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5213831168. Throughput: 0: 42949.6. Samples: 5213968500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 00:04:18,424][15401] InferenceWorker_p0-w0: stopping experience collection (77200 times) [2024-06-23 00:04:18,425][15401] InferenceWorker_p0-w0: resuming experience collection (77200 times) [2024-06-23 00:04:19,546][15401] Updated weights for policy 0, policy_version 318230 (0.0024) [2024-06-23 00:04:23,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5214011392. Throughput: 0: 42954.6. Samples: 5214107800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 00:04:23,898][15401] Updated weights for policy 0, policy_version 318240 (0.0040) [2024-06-23 00:04:27,290][15401] Updated weights for policy 0, policy_version 318250 (0.0028) [2024-06-23 00:04:28,389][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5214240768. Throughput: 0: 42965.3. Samples: 5214356580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:28,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 00:04:31,740][15401] Updated weights for policy 0, policy_version 318260 (0.0028) [2024-06-23 00:04:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5214470144. Throughput: 0: 42888.0. Samples: 5214609040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:33,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 00:04:34,800][15401] Updated weights for policy 0, policy_version 318270 (0.0038) [2024-06-23 00:04:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42654.0). Total num frames: 5214633984. Throughput: 0: 42831.1. Samples: 5214741700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 00:04:39,313][15401] Updated weights for policy 0, policy_version 318280 (0.0032) [2024-06-23 00:04:42,252][15401] Updated weights for policy 0, policy_version 318290 (0.0039) [2024-06-23 00:04:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 5214896128. Throughput: 0: 42819.4. Samples: 5214994800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:43,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 00:04:46,881][15401] Updated weights for policy 0, policy_version 318300 (0.0040) [2024-06-23 00:04:48,396][15132] Fps is (10 sec: 45846.5, 60 sec: 42593.9, 300 sec: 42765.0). Total num frames: 5215092736. Throughput: 0: 42925.1. Samples: 5215256860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:48,396][15132] Avg episode reward: [(0, '0.257')] [2024-06-23 00:04:50,174][15401] Updated weights for policy 0, policy_version 318310 (0.0033) [2024-06-23 00:04:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5215305728. Throughput: 0: 42773.2. Samples: 5215384280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:53,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-23 00:04:54,481][15401] Updated weights for policy 0, policy_version 318320 (0.0027) [2024-06-23 00:04:57,660][15401] Updated weights for policy 0, policy_version 318330 (0.0035) [2024-06-23 00:04:58,390][15132] Fps is (10 sec: 44264.6, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 5215535104. Throughput: 0: 42920.0. Samples: 5215642440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:04:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 00:05:02,282][15401] Updated weights for policy 0, policy_version 318340 (0.0030) [2024-06-23 00:05:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5215731712. Throughput: 0: 43053.2. Samples: 5215905900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 00:05:03,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 00:05:05,238][15401] Updated weights for policy 0, policy_version 318350 (0.0039) [2024-06-23 00:05:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5215944704. Throughput: 0: 42645.4. Samples: 5216026840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 00:05:09,801][15401] Updated weights for policy 0, policy_version 318360 (0.0038) [2024-06-23 00:05:12,805][15401] Updated weights for policy 0, policy_version 318370 (0.0034) [2024-06-23 00:05:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 5216174080. Throughput: 0: 42916.8. Samples: 5216287840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 00:05:17,326][15401] Updated weights for policy 0, policy_version 318380 (0.0034) [2024-06-23 00:05:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 5216387072. Throughput: 0: 43037.8. Samples: 5216545740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 00:05:20,735][15401] Updated weights for policy 0, policy_version 318390 (0.0029) [2024-06-23 00:05:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5216583680. Throughput: 0: 42761.0. Samples: 5216665940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 00:05:25,449][15401] Updated weights for policy 0, policy_version 318400 (0.0033) [2024-06-23 00:05:28,271][15401] Updated weights for policy 0, policy_version 318410 (0.0039) [2024-06-23 00:05:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42876.5). Total num frames: 5216829440. Throughput: 0: 42988.0. Samples: 5216929260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:28,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:05:33,142][15401] Updated weights for policy 0, policy_version 318420 (0.0021) [2024-06-23 00:05:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 5216993280. Throughput: 0: 43123.0. Samples: 5217197120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 00:05:36,041][15401] Updated weights for policy 0, policy_version 318430 (0.0032) [2024-06-23 00:05:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 5217255424. Throughput: 0: 42949.3. Samples: 5217317000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:38,394][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 00:05:40,559][15401] Updated weights for policy 0, policy_version 318440 (0.0037) [2024-06-23 00:05:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 5217452032. Throughput: 0: 43111.7. Samples: 5217582460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 00:05:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318449_5217468416.pth... [2024-06-23 00:05:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000317819_5207146496.pth [2024-06-23 00:05:43,635][15401] Updated weights for policy 0, policy_version 318450 (0.0029) [2024-06-23 00:05:48,010][15401] Updated weights for policy 0, policy_version 318460 (0.0030) [2024-06-23 00:05:48,390][15132] Fps is (10 sec: 40957.7, 60 sec: 42875.6, 300 sec: 42876.0). Total num frames: 5217665024. Throughput: 0: 43088.7. Samples: 5217844920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:48,391][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 00:05:48,400][15349] Signal inference workers to stop experience collection... (77250 times) [2024-06-23 00:05:48,451][15401] InferenceWorker_p0-w0: stopping experience collection (77250 times) [2024-06-23 00:05:48,519][15349] Signal inference workers to resume experience collection... (77250 times) [2024-06-23 00:05:48,520][15401] InferenceWorker_p0-w0: resuming experience collection (77250 times) [2024-06-23 00:05:51,187][15401] Updated weights for policy 0, policy_version 318470 (0.0034) [2024-06-23 00:05:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5217910784. Throughput: 0: 43108.4. Samples: 5217966720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 00:05:55,559][15401] Updated weights for policy 0, policy_version 318480 (0.0037) [2024-06-23 00:05:58,390][15132] Fps is (10 sec: 44239.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5218107392. Throughput: 0: 43153.8. Samples: 5218229760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:05:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 00:05:58,957][15401] Updated weights for policy 0, policy_version 318490 (0.0028) [2024-06-23 00:06:03,088][15401] Updated weights for policy 0, policy_version 318500 (0.0029) [2024-06-23 00:06:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5218304000. Throughput: 0: 43037.0. Samples: 5218482400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:06:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 00:06:06,589][15401] Updated weights for policy 0, policy_version 318510 (0.0030) [2024-06-23 00:06:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 5218566144. Throughput: 0: 43116.9. Samples: 5218606200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:06:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 00:06:10,780][15401] Updated weights for policy 0, policy_version 318520 (0.0029) [2024-06-23 00:06:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5218729984. Throughput: 0: 43119.6. Samples: 5218869540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:06:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 00:06:14,161][15401] Updated weights for policy 0, policy_version 318530 (0.0047) [2024-06-23 00:06:18,199][15401] Updated weights for policy 0, policy_version 318540 (0.0046) [2024-06-23 00:06:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5218959360. Throughput: 0: 42824.8. Samples: 5219124240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:06:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 00:06:21,681][15401] Updated weights for policy 0, policy_version 318550 (0.0032) [2024-06-23 00:06:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 5219188736. Throughput: 0: 43142.6. Samples: 5219258420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 00:06:25,617][15401] Updated weights for policy 0, policy_version 318560 (0.0022) [2024-06-23 00:06:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 5219368960. Throughput: 0: 42986.7. Samples: 5219516860. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 00:06:29,364][15401] Updated weights for policy 0, policy_version 318570 (0.0028) [2024-06-23 00:06:33,329][15401] Updated weights for policy 0, policy_version 318580 (0.0037) [2024-06-23 00:06:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 5219614720. Throughput: 0: 42877.1. Samples: 5219774360. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:33,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-23 00:06:36,869][15401] Updated weights for policy 0, policy_version 318590 (0.0046) [2024-06-23 00:06:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5219844096. Throughput: 0: 43028.9. Samples: 5219903020. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 00:06:40,890][15401] Updated weights for policy 0, policy_version 318600 (0.0027) [2024-06-23 00:06:43,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 5220024320. Throughput: 0: 43001.8. Samples: 5220165120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:43,397][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 00:06:44,410][15401] Updated weights for policy 0, policy_version 318610 (0.0021) [2024-06-23 00:06:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.8, 300 sec: 42876.1). Total num frames: 5220237312. Throughput: 0: 43043.0. Samples: 5220419340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 00:06:48,532][15401] Updated weights for policy 0, policy_version 318620 (0.0028) [2024-06-23 00:06:52,071][15401] Updated weights for policy 0, policy_version 318630 (0.0037) [2024-06-23 00:06:53,390][15132] Fps is (10 sec: 47544.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5220499456. Throughput: 0: 43218.2. Samples: 5220551020. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 00:06:56,012][15401] Updated weights for policy 0, policy_version 318640 (0.0031) [2024-06-23 00:06:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5220663296. Throughput: 0: 43059.9. Samples: 5220807240. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:06:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 00:07:00,122][15401] Updated weights for policy 0, policy_version 318650 (0.0034) [2024-06-23 00:07:00,665][15349] Signal inference workers to stop experience collection... (77300 times) [2024-06-23 00:07:00,666][15349] Signal inference workers to resume experience collection... (77300 times) [2024-06-23 00:07:00,710][15401] InferenceWorker_p0-w0: stopping experience collection (77300 times) [2024-06-23 00:07:00,710][15401] InferenceWorker_p0-w0: resuming experience collection (77300 times) [2024-06-23 00:07:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 43412.9, 300 sec: 42930.7). Total num frames: 5220909056. Throughput: 0: 42989.9. Samples: 5221059060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:03,397][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 00:07:03,484][15401] Updated weights for policy 0, policy_version 318660 (0.0033) [2024-06-23 00:07:07,671][15401] Updated weights for policy 0, policy_version 318670 (0.0033) [2024-06-23 00:07:08,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5221122048. Throughput: 0: 43017.1. Samples: 5221194180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 00:07:11,443][15401] Updated weights for policy 0, policy_version 318680 (0.0030) [2024-06-23 00:07:13,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 5221302272. Throughput: 0: 43023.1. Samples: 5221452900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 00:07:15,257][15401] Updated weights for policy 0, policy_version 318690 (0.0035) [2024-06-23 00:07:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 5221564416. Throughput: 0: 42897.6. Samples: 5221704760. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 00:07:18,882][15401] Updated weights for policy 0, policy_version 318700 (0.0029) [2024-06-23 00:07:22,761][15401] Updated weights for policy 0, policy_version 318710 (0.0040) [2024-06-23 00:07:23,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.7, 300 sec: 42988.1). Total num frames: 5221777408. Throughput: 0: 43128.9. Samples: 5221843820. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 00:07:26,483][15401] Updated weights for policy 0, policy_version 318720 (0.0044) [2024-06-23 00:07:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5221941248. Throughput: 0: 42783.3. Samples: 5222090100. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 00:07:30,327][15401] Updated weights for policy 0, policy_version 318730 (0.0030) [2024-06-23 00:07:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5222187008. Throughput: 0: 42846.2. Samples: 5222347420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-23 00:07:33,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 00:07:34,473][15401] Updated weights for policy 0, policy_version 318740 (0.0041) [2024-06-23 00:07:38,001][15401] Updated weights for policy 0, policy_version 318750 (0.0042) [2024-06-23 00:07:38,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5222416384. Throughput: 0: 42859.0. Samples: 5222479680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:07:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 00:07:42,108][15401] Updated weights for policy 0, policy_version 318760 (0.0043) [2024-06-23 00:07:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 5222580224. Throughput: 0: 42731.2. Samples: 5222730140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:07:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-23 00:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318762_5222596608.pth... [2024-06-23 00:07:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318133_5212291072.pth [2024-06-23 00:07:45,732][15401] Updated weights for policy 0, policy_version 318770 (0.0033) [2024-06-23 00:07:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5222825984. Throughput: 0: 42763.3. Samples: 5222983140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:07:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 00:07:49,840][15401] Updated weights for policy 0, policy_version 318780 (0.0040) [2024-06-23 00:07:53,394][15132] Fps is (10 sec: 44214.9, 60 sec: 42048.8, 300 sec: 42875.4). Total num frames: 5223022592. Throughput: 0: 42746.7. Samples: 5223118000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:07:53,395][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 00:07:53,580][15401] Updated weights for policy 0, policy_version 318790 (0.0029) [2024-06-23 00:07:57,458][15401] Updated weights for policy 0, policy_version 318800 (0.0037) [2024-06-23 00:07:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5223235584. Throughput: 0: 42559.4. Samples: 5223368080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:07:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 00:08:01,302][15401] Updated weights for policy 0, policy_version 318810 (0.0036) [2024-06-23 00:08:03,389][15132] Fps is (10 sec: 44259.0, 60 sec: 42603.0, 300 sec: 42931.7). Total num frames: 5223464960. Throughput: 0: 42739.7. Samples: 5223628040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 00:08:05,143][15401] Updated weights for policy 0, policy_version 318820 (0.0049) [2024-06-23 00:08:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5223677952. Throughput: 0: 42487.5. Samples: 5223755760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 00:08:08,954][15401] Updated weights for policy 0, policy_version 318830 (0.0030) [2024-06-23 00:08:12,764][15401] Updated weights for policy 0, policy_version 318840 (0.0040) [2024-06-23 00:08:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 5223907328. Throughput: 0: 42772.6. Samples: 5224014860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 00:08:16,487][15401] Updated weights for policy 0, policy_version 318850 (0.0028) [2024-06-23 00:08:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42986.8). Total num frames: 5224103936. Throughput: 0: 42827.5. Samples: 5224274760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:18,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 00:08:20,299][15401] Updated weights for policy 0, policy_version 318860 (0.0036) [2024-06-23 00:08:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5224333312. Throughput: 0: 42693.4. Samples: 5224400880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:08:24,187][15401] Updated weights for policy 0, policy_version 318870 (0.0025) [2024-06-23 00:08:28,009][15401] Updated weights for policy 0, policy_version 318880 (0.0031) [2024-06-23 00:08:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5224529920. Throughput: 0: 42812.4. Samples: 5224656700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 00:08:28,770][15349] Signal inference workers to stop experience collection... (77350 times) [2024-06-23 00:08:28,823][15401] InferenceWorker_p0-w0: stopping experience collection (77350 times) [2024-06-23 00:08:28,831][15349] Signal inference workers to resume experience collection... (77350 times) [2024-06-23 00:08:28,846][15401] InferenceWorker_p0-w0: resuming experience collection (77350 times) [2024-06-23 00:08:31,708][15401] Updated weights for policy 0, policy_version 318890 (0.0038) [2024-06-23 00:08:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5224742912. Throughput: 0: 42917.5. Samples: 5224914420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 00:08:35,553][15401] Updated weights for policy 0, policy_version 318900 (0.0034) [2024-06-23 00:08:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 5224988672. Throughput: 0: 42755.4. Samples: 5225041780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 00:08:39,303][15401] Updated weights for policy 0, policy_version 318910 (0.0025) [2024-06-23 00:08:43,171][15401] Updated weights for policy 0, policy_version 318920 (0.0033) [2024-06-23 00:08:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5225185280. Throughput: 0: 42899.6. Samples: 5225298560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 00:08:47,198][15401] Updated weights for policy 0, policy_version 318930 (0.0030) [2024-06-23 00:08:48,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5225365504. Throughput: 0: 42869.6. Samples: 5225557180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 00:08:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 00:08:50,970][15401] Updated weights for policy 0, policy_version 318940 (0.0038) [2024-06-23 00:08:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43148.0, 300 sec: 43042.7). Total num frames: 5225611264. Throughput: 0: 42782.6. Samples: 5225680980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:08:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 00:08:54,937][15401] Updated weights for policy 0, policy_version 318950 (0.0039) [2024-06-23 00:08:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 5225824256. Throughput: 0: 42768.5. Samples: 5225939440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:08:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 00:08:58,445][15401] Updated weights for policy 0, policy_version 318960 (0.0042) [2024-06-23 00:09:02,577][15401] Updated weights for policy 0, policy_version 318970 (0.0039) [2024-06-23 00:09:03,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42593.8, 300 sec: 42930.7). Total num frames: 5226020864. Throughput: 0: 42810.9. Samples: 5226201420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:03,397][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 00:09:06,339][15401] Updated weights for policy 0, policy_version 318980 (0.0031) [2024-06-23 00:09:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 5226266624. Throughput: 0: 42820.4. Samples: 5226327800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 00:09:10,042][15401] Updated weights for policy 0, policy_version 318990 (0.0031) [2024-06-23 00:09:13,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5226463232. Throughput: 0: 42872.4. Samples: 5226585960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 00:09:13,802][15401] Updated weights for policy 0, policy_version 319000 (0.0032) [2024-06-23 00:09:17,546][15401] Updated weights for policy 0, policy_version 319010 (0.0035) [2024-06-23 00:09:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 5226676224. Throughput: 0: 42854.2. Samples: 5226842860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 00:09:21,600][15401] Updated weights for policy 0, policy_version 319020 (0.0037) [2024-06-23 00:09:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5226889216. Throughput: 0: 42843.5. Samples: 5226969740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 00:09:25,652][15401] Updated weights for policy 0, policy_version 319030 (0.0059) [2024-06-23 00:09:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5227102208. Throughput: 0: 42866.2. Samples: 5227227540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:09:29,175][15401] Updated weights for policy 0, policy_version 319040 (0.0031) [2024-06-23 00:09:33,283][15401] Updated weights for policy 0, policy_version 319050 (0.0031) [2024-06-23 00:09:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5227315200. Throughput: 0: 43013.0. Samples: 5227492760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 00:09:36,701][15401] Updated weights for policy 0, policy_version 319060 (0.0029) [2024-06-23 00:09:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 5227544576. Throughput: 0: 43093.3. Samples: 5227620180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:38,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 00:09:40,854][15401] Updated weights for policy 0, policy_version 319070 (0.0029) [2024-06-23 00:09:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42877.0). Total num frames: 5227741184. Throughput: 0: 43048.7. Samples: 5227876640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:43,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 00:09:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319077_5227757568.pth... [2024-06-23 00:09:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318449_5217468416.pth [2024-06-23 00:09:44,435][15401] Updated weights for policy 0, policy_version 319080 (0.0034) [2024-06-23 00:09:46,162][15349] Signal inference workers to stop experience collection... (77400 times) [2024-06-23 00:09:46,162][15349] Signal inference workers to resume experience collection... (77400 times) [2024-06-23 00:09:46,205][15401] InferenceWorker_p0-w0: stopping experience collection (77400 times) [2024-06-23 00:09:46,205][15401] InferenceWorker_p0-w0: resuming experience collection (77400 times) [2024-06-23 00:09:48,239][15401] Updated weights for policy 0, policy_version 319090 (0.0036) [2024-06-23 00:09:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5227970560. Throughput: 0: 42919.8. Samples: 5228132540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 00:09:51,885][15401] Updated weights for policy 0, policy_version 319100 (0.0027) [2024-06-23 00:09:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5228167168. Throughput: 0: 42991.5. Samples: 5228262420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 00:09:55,729][15401] Updated weights for policy 0, policy_version 319110 (0.0043) [2024-06-23 00:09:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5228396544. Throughput: 0: 43001.0. Samples: 5228521000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:09:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 00:09:59,702][15401] Updated weights for policy 0, policy_version 319120 (0.0035) [2024-06-23 00:10:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 5228593152. Throughput: 0: 43034.5. Samples: 5228779420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 00:10:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 00:10:03,754][15401] Updated weights for policy 0, policy_version 319130 (0.0030) [2024-06-23 00:10:07,420][15401] Updated weights for policy 0, policy_version 319140 (0.0052) [2024-06-23 00:10:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5228822528. Throughput: 0: 43046.2. Samples: 5228906820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 00:10:11,227][15401] Updated weights for policy 0, policy_version 319150 (0.0031) [2024-06-23 00:10:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5229051904. Throughput: 0: 43071.0. Samples: 5229165740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 00:10:14,737][15401] Updated weights for policy 0, policy_version 319160 (0.0036) [2024-06-23 00:10:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5229248512. Throughput: 0: 43005.3. Samples: 5229428000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 00:10:18,837][15401] Updated weights for policy 0, policy_version 319170 (0.0037) [2024-06-23 00:10:22,295][15401] Updated weights for policy 0, policy_version 319180 (0.0043) [2024-06-23 00:10:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 5229477888. Throughput: 0: 42963.5. Samples: 5229553540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:23,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 00:10:26,464][15401] Updated weights for policy 0, policy_version 319190 (0.0026) [2024-06-23 00:10:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5229690880. Throughput: 0: 42997.4. Samples: 5229811520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 00:10:29,972][15401] Updated weights for policy 0, policy_version 319200 (0.0037) [2024-06-23 00:10:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5229903872. Throughput: 0: 43024.5. Samples: 5230068640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 00:10:34,058][15401] Updated weights for policy 0, policy_version 319210 (0.0040) [2024-06-23 00:10:37,671][15401] Updated weights for policy 0, policy_version 319220 (0.0033) [2024-06-23 00:10:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5230116864. Throughput: 0: 42973.8. Samples: 5230196240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:38,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 00:10:41,665][15401] Updated weights for policy 0, policy_version 319230 (0.0038) [2024-06-23 00:10:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 5230329856. Throughput: 0: 42943.5. Samples: 5230453460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 00:10:45,212][15401] Updated weights for policy 0, policy_version 319240 (0.0031) [2024-06-23 00:10:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5230542848. Throughput: 0: 42861.4. Samples: 5230708180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 00:10:49,494][15401] Updated weights for policy 0, policy_version 319250 (0.0032) [2024-06-23 00:10:53,126][15401] Updated weights for policy 0, policy_version 319260 (0.0033) [2024-06-23 00:10:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5230755840. Throughput: 0: 42848.5. Samples: 5230835000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 00:10:57,059][15401] Updated weights for policy 0, policy_version 319270 (0.0032) [2024-06-23 00:10:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5230985216. Throughput: 0: 42910.5. Samples: 5231096700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:10:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 00:11:00,802][15401] Updated weights for policy 0, policy_version 319280 (0.0031) [2024-06-23 00:11:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5231181824. Throughput: 0: 42785.4. Samples: 5231353340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:11:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 00:11:04,657][15401] Updated weights for policy 0, policy_version 319290 (0.0047) [2024-06-23 00:11:08,330][15401] Updated weights for policy 0, policy_version 319300 (0.0034) [2024-06-23 00:11:08,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5231411200. Throughput: 0: 42840.4. Samples: 5231481360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:11:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 00:11:12,255][15401] Updated weights for policy 0, policy_version 319310 (0.0039) [2024-06-23 00:11:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5231624192. Throughput: 0: 42945.0. Samples: 5231744040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:11:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 00:11:15,879][15401] Updated weights for policy 0, policy_version 319320 (0.0021) [2024-06-23 00:11:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5231837184. Throughput: 0: 42876.9. Samples: 5231998100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 00:11:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 00:11:20,065][15401] Updated weights for policy 0, policy_version 319330 (0.0024) [2024-06-23 00:11:21,049][15349] Signal inference workers to stop experience collection... (77450 times) [2024-06-23 00:11:21,088][15401] InferenceWorker_p0-w0: stopping experience collection (77450 times) [2024-06-23 00:11:21,113][15349] Signal inference workers to resume experience collection... (77450 times) [2024-06-23 00:11:21,116][15401] InferenceWorker_p0-w0: resuming experience collection (77450 times) [2024-06-23 00:11:23,374][15401] Updated weights for policy 0, policy_version 319340 (0.0028) [2024-06-23 00:11:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5232066560. Throughput: 0: 43106.4. Samples: 5232136040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 00:11:27,523][15401] Updated weights for policy 0, policy_version 319350 (0.0037) [2024-06-23 00:11:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5232246784. Throughput: 0: 43101.2. Samples: 5232393020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:28,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 00:11:31,011][15401] Updated weights for policy 0, policy_version 319360 (0.0036) [2024-06-23 00:11:33,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5232492544. Throughput: 0: 42968.3. Samples: 5232641860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:33,393][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 00:11:35,208][15401] Updated weights for policy 0, policy_version 319370 (0.0039) [2024-06-23 00:11:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42932.6). Total num frames: 5232689152. Throughput: 0: 43172.4. Samples: 5232777760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 00:11:38,726][15401] Updated weights for policy 0, policy_version 319380 (0.0040) [2024-06-23 00:11:43,004][15401] Updated weights for policy 0, policy_version 319390 (0.0031) [2024-06-23 00:11:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 5232902144. Throughput: 0: 42990.0. Samples: 5233031360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:43,393][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 00:11:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319391_5232902144.pth... [2024-06-23 00:11:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000318762_5222596608.pth [2024-06-23 00:11:46,297][15401] Updated weights for policy 0, policy_version 319400 (0.0037) [2024-06-23 00:11:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5233131520. Throughput: 0: 42759.0. Samples: 5233277500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 00:11:50,776][15401] Updated weights for policy 0, policy_version 319410 (0.0033) [2024-06-23 00:11:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5233311744. Throughput: 0: 42968.5. Samples: 5233414940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 00:11:54,055][15401] Updated weights for policy 0, policy_version 319420 (0.0032) [2024-06-23 00:11:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 5233524736. Throughput: 0: 42783.9. Samples: 5233669320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:11:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 00:11:58,457][15401] Updated weights for policy 0, policy_version 319430 (0.0034) [2024-06-23 00:12:01,698][15401] Updated weights for policy 0, policy_version 319440 (0.0025) [2024-06-23 00:12:03,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5233786880. Throughput: 0: 42539.5. Samples: 5233912380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 00:12:06,090][15401] Updated weights for policy 0, policy_version 319450 (0.0036) [2024-06-23 00:12:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.6, 300 sec: 42931.6). Total num frames: 5233967104. Throughput: 0: 42611.4. Samples: 5234053540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 00:12:09,192][15401] Updated weights for policy 0, policy_version 319460 (0.0030) [2024-06-23 00:12:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5234163712. Throughput: 0: 42517.4. Samples: 5234306300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 00:12:13,703][15401] Updated weights for policy 0, policy_version 319470 (0.0033) [2024-06-23 00:12:17,261][15401] Updated weights for policy 0, policy_version 319480 (0.0033) [2024-06-23 00:12:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5234425856. Throughput: 0: 42582.8. Samples: 5234557980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 00:12:21,441][15401] Updated weights for policy 0, policy_version 319490 (0.0036) [2024-06-23 00:12:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 5234606080. Throughput: 0: 42475.1. Samples: 5234689140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 00:12:24,903][15401] Updated weights for policy 0, policy_version 319500 (0.0034) [2024-06-23 00:12:28,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5234802688. Throughput: 0: 42449.0. Samples: 5234941460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 00:12:29,372][15401] Updated weights for policy 0, policy_version 319510 (0.0032) [2024-06-23 00:12:32,554][15401] Updated weights for policy 0, policy_version 319520 (0.0042) [2024-06-23 00:12:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 5235048448. Throughput: 0: 42481.3. Samples: 5235189160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 00:12:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 00:12:37,032][15401] Updated weights for policy 0, policy_version 319530 (0.0048) [2024-06-23 00:12:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5235228672. Throughput: 0: 42601.4. Samples: 5235332000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:12:38,404][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 00:12:40,218][15401] Updated weights for policy 0, policy_version 319540 (0.0044) [2024-06-23 00:12:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 5235441664. Throughput: 0: 42401.2. Samples: 5235577380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:12:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:12:44,594][15401] Updated weights for policy 0, policy_version 319550 (0.0035) [2024-06-23 00:12:47,512][15349] Signal inference workers to stop experience collection... (77500 times) [2024-06-23 00:12:47,556][15401] InferenceWorker_p0-w0: stopping experience collection (77500 times) [2024-06-23 00:12:47,563][15349] Signal inference workers to resume experience collection... (77500 times) [2024-06-23 00:12:47,566][15401] InferenceWorker_p0-w0: resuming experience collection (77500 times) [2024-06-23 00:12:47,728][15401] Updated weights for policy 0, policy_version 319560 (0.0032) [2024-06-23 00:12:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42932.4). Total num frames: 5235687424. Throughput: 0: 42735.6. Samples: 5235835480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:12:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 00:12:52,256][15401] Updated weights for policy 0, policy_version 319570 (0.0036) [2024-06-23 00:12:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5235884032. Throughput: 0: 42713.7. Samples: 5235975660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:12:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 00:12:55,361][15401] Updated weights for policy 0, policy_version 319580 (0.0043) [2024-06-23 00:12:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5236097024. Throughput: 0: 42546.6. Samples: 5236220900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:12:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 00:13:00,323][15401] Updated weights for policy 0, policy_version 319590 (0.0034) [2024-06-23 00:13:03,214][15401] Updated weights for policy 0, policy_version 319600 (0.0033) [2024-06-23 00:13:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42875.7). Total num frames: 5236326400. Throughput: 0: 42456.7. Samples: 5236468640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:03,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 00:13:07,924][15401] Updated weights for policy 0, policy_version 319610 (0.0026) [2024-06-23 00:13:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5236490240. Throughput: 0: 42446.3. Samples: 5236599220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 00:13:10,942][15401] Updated weights for policy 0, policy_version 319620 (0.0031) [2024-06-23 00:13:13,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 5236736000. Throughput: 0: 42297.8. Samples: 5236844860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 00:13:15,682][15401] Updated weights for policy 0, policy_version 319630 (0.0036) [2024-06-23 00:13:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 5236948992. Throughput: 0: 42589.4. Samples: 5237105680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 00:13:18,716][15401] Updated weights for policy 0, policy_version 319640 (0.0053) [2024-06-23 00:13:23,310][15401] Updated weights for policy 0, policy_version 319650 (0.0033) [2024-06-23 00:13:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5237145600. Throughput: 0: 42289.8. Samples: 5237235040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:23,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-23 00:13:26,284][15401] Updated weights for policy 0, policy_version 319660 (0.0033) [2024-06-23 00:13:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5237391360. Throughput: 0: 42498.7. Samples: 5237489820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 00:13:30,786][15401] Updated weights for policy 0, policy_version 319670 (0.0030) [2024-06-23 00:13:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5237587968. Throughput: 0: 42585.3. Samples: 5237751820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 00:13:34,132][15401] Updated weights for policy 0, policy_version 319680 (0.0039) [2024-06-23 00:13:38,274][15401] Updated weights for policy 0, policy_version 319690 (0.0028) [2024-06-23 00:13:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5237800960. Throughput: 0: 42168.0. Samples: 5237873220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:38,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 00:13:41,912][15401] Updated weights for policy 0, policy_version 319700 (0.0030) [2024-06-23 00:13:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 5238030336. Throughput: 0: 42438.3. Samples: 5238130620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:43,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 00:13:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319705_5238046720.pth... [2024-06-23 00:13:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319077_5227757568.pth [2024-06-23 00:13:46,159][15401] Updated weights for policy 0, policy_version 319710 (0.0029) [2024-06-23 00:13:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5238226944. Throughput: 0: 42596.0. Samples: 5238385360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 00:13:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 00:13:49,626][15401] Updated weights for policy 0, policy_version 319720 (0.0043) [2024-06-23 00:13:53,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5238407168. Throughput: 0: 42391.9. Samples: 5238506860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:13:53,390][15132] Avg episode reward: [(0, '0.920')] [2024-06-23 00:13:53,922][15401] Updated weights for policy 0, policy_version 319730 (0.0037) [2024-06-23 00:13:57,736][15401] Updated weights for policy 0, policy_version 319740 (0.0037) [2024-06-23 00:13:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 5238652928. Throughput: 0: 42645.8. Samples: 5238763920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:13:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 00:14:01,982][15401] Updated weights for policy 0, policy_version 319750 (0.0032) [2024-06-23 00:14:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 5238865920. Throughput: 0: 42479.5. Samples: 5239017260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 00:14:05,316][15401] Updated weights for policy 0, policy_version 319760 (0.0038) [2024-06-23 00:14:08,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 5239046144. Throughput: 0: 42375.0. Samples: 5239142020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:08,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 00:14:09,636][15401] Updated weights for policy 0, policy_version 319770 (0.0027) [2024-06-23 00:14:10,214][15349] Signal inference workers to stop experience collection... (77550 times) [2024-06-23 00:14:10,263][15401] InferenceWorker_p0-w0: stopping experience collection (77550 times) [2024-06-23 00:14:10,326][15349] Signal inference workers to resume experience collection... (77550 times) [2024-06-23 00:14:10,326][15401] InferenceWorker_p0-w0: resuming experience collection (77550 times) [2024-06-23 00:14:13,010][15401] Updated weights for policy 0, policy_version 319780 (0.0027) [2024-06-23 00:14:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5239291904. Throughput: 0: 42547.5. Samples: 5239404460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 00:14:17,199][15401] Updated weights for policy 0, policy_version 319790 (0.0024) [2024-06-23 00:14:18,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5239488512. Throughput: 0: 42237.6. Samples: 5239652520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 00:14:20,649][15401] Updated weights for policy 0, policy_version 319800 (0.0030) [2024-06-23 00:14:23,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 5239701504. Throughput: 0: 42309.1. Samples: 5239777180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:23,391][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 00:14:24,725][15401] Updated weights for policy 0, policy_version 319810 (0.0035) [2024-06-23 00:14:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 5239898112. Throughput: 0: 42431.5. Samples: 5240040040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 00:14:28,541][15401] Updated weights for policy 0, policy_version 319820 (0.0035) [2024-06-23 00:14:32,627][15401] Updated weights for policy 0, policy_version 319830 (0.0030) [2024-06-23 00:14:33,389][15132] Fps is (10 sec: 42603.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5240127488. Throughput: 0: 42417.0. Samples: 5240294120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 00:14:36,104][15401] Updated weights for policy 0, policy_version 319840 (0.0038) [2024-06-23 00:14:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5240356864. Throughput: 0: 42458.2. Samples: 5240417480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:38,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 00:14:40,349][15401] Updated weights for policy 0, policy_version 319850 (0.0024) [2024-06-23 00:14:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 5240553472. Throughput: 0: 42544.3. Samples: 5240678420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:43,391][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 00:14:43,704][15401] Updated weights for policy 0, policy_version 319860 (0.0029) [2024-06-23 00:14:47,951][15401] Updated weights for policy 0, policy_version 319870 (0.0037) [2024-06-23 00:14:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5240766464. Throughput: 0: 42794.2. Samples: 5240943000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 00:14:51,326][15401] Updated weights for policy 0, policy_version 319880 (0.0027) [2024-06-23 00:14:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5241012224. Throughput: 0: 42825.3. Samples: 5241069060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:53,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 00:14:55,458][15401] Updated weights for policy 0, policy_version 319890 (0.0023) [2024-06-23 00:14:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5241208832. Throughput: 0: 42769.4. Samples: 5241329080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:14:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 00:14:58,978][15401] Updated weights for policy 0, policy_version 319900 (0.0030) [2024-06-23 00:15:02,929][15401] Updated weights for policy 0, policy_version 319910 (0.0030) [2024-06-23 00:15:03,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5241405440. Throughput: 0: 42992.1. Samples: 5241587160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 00:15:03,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 00:15:06,712][15401] Updated weights for policy 0, policy_version 319920 (0.0026) [2024-06-23 00:15:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43419.3, 300 sec: 42709.5). Total num frames: 5241651200. Throughput: 0: 43047.3. Samples: 5241714260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 00:15:10,474][15401] Updated weights for policy 0, policy_version 319930 (0.0044) [2024-06-23 00:15:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5241847808. Throughput: 0: 43065.8. Samples: 5241978000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:13,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 00:15:14,272][15401] Updated weights for policy 0, policy_version 319940 (0.0040) [2024-06-23 00:15:18,070][15401] Updated weights for policy 0, policy_version 319950 (0.0035) [2024-06-23 00:15:18,392][15132] Fps is (10 sec: 40951.9, 60 sec: 42870.1, 300 sec: 42653.7). Total num frames: 5242060800. Throughput: 0: 42927.8. Samples: 5242225960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:18,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 00:15:22,029][15401] Updated weights for policy 0, policy_version 319960 (0.0035) [2024-06-23 00:15:23,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43418.4, 300 sec: 42765.0). Total num frames: 5242306560. Throughput: 0: 43092.7. Samples: 5242356660. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:15:25,622][15401] Updated weights for policy 0, policy_version 319970 (0.0043) [2024-06-23 00:15:28,390][15132] Fps is (10 sec: 40968.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5242470400. Throughput: 0: 43112.4. Samples: 5242618480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 00:15:29,514][15401] Updated weights for policy 0, policy_version 319980 (0.0029) [2024-06-23 00:15:33,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5242699776. Throughput: 0: 42815.5. Samples: 5242869700. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 00:15:33,561][15401] Updated weights for policy 0, policy_version 319990 (0.0046) [2024-06-23 00:15:37,112][15401] Updated weights for policy 0, policy_version 320000 (0.0037) [2024-06-23 00:15:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5242929152. Throughput: 0: 42818.7. Samples: 5242995900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 00:15:41,258][15401] Updated weights for policy 0, policy_version 320010 (0.0037) [2024-06-23 00:15:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5243125760. Throughput: 0: 42761.7. Samples: 5243253360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 00:15:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320015_5243125760.pth... [2024-06-23 00:15:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319391_5232902144.pth [2024-06-23 00:15:44,018][15349] Signal inference workers to stop experience collection... (77600 times) [2024-06-23 00:15:44,018][15349] Signal inference workers to resume experience collection... (77600 times) [2024-06-23 00:15:44,068][15401] InferenceWorker_p0-w0: stopping experience collection (77600 times) [2024-06-23 00:15:44,068][15401] InferenceWorker_p0-w0: resuming experience collection (77600 times) [2024-06-23 00:15:44,618][15401] Updated weights for policy 0, policy_version 320020 (0.0029) [2024-06-23 00:15:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5243338752. Throughput: 0: 42678.2. Samples: 5243507680. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 00:15:48,852][15401] Updated weights for policy 0, policy_version 320030 (0.0042) [2024-06-23 00:15:52,489][15401] Updated weights for policy 0, policy_version 320040 (0.0034) [2024-06-23 00:15:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5243568128. Throughput: 0: 42685.5. Samples: 5243635100. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 00:15:56,403][15401] Updated weights for policy 0, policy_version 320050 (0.0036) [2024-06-23 00:15:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5243748352. Throughput: 0: 42487.4. Samples: 5243889940. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:15:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 00:16:00,611][15401] Updated weights for policy 0, policy_version 320060 (0.0046) [2024-06-23 00:16:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5243977728. Throughput: 0: 42488.2. Samples: 5244137840. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:16:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 00:16:04,130][15401] Updated weights for policy 0, policy_version 320070 (0.0042) [2024-06-23 00:16:08,347][15401] Updated weights for policy 0, policy_version 320080 (0.0032) [2024-06-23 00:16:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5244190720. Throughput: 0: 42569.5. Samples: 5244272280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:16:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 00:16:11,681][15401] Updated weights for policy 0, policy_version 320090 (0.0047) [2024-06-23 00:16:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5244387328. Throughput: 0: 42421.0. Samples: 5244527420. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:16:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:16:15,803][15401] Updated weights for policy 0, policy_version 320100 (0.0041) [2024-06-23 00:16:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 5244633088. Throughput: 0: 42471.2. Samples: 5244780900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 24.0) [2024-06-23 00:16:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 00:16:19,578][15401] Updated weights for policy 0, policy_version 320110 (0.0031) [2024-06-23 00:16:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 5244829696. Throughput: 0: 42587.2. Samples: 5244912320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 00:16:23,513][15401] Updated weights for policy 0, policy_version 320120 (0.0038) [2024-06-23 00:16:27,215][15401] Updated weights for policy 0, policy_version 320130 (0.0028) [2024-06-23 00:16:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 5245042688. Throughput: 0: 42545.8. Samples: 5245167920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:28,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 00:16:31,195][15401] Updated weights for policy 0, policy_version 320140 (0.0030) [2024-06-23 00:16:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5245255680. Throughput: 0: 42600.9. Samples: 5245424720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 00:16:34,732][15401] Updated weights for policy 0, policy_version 320150 (0.0028) [2024-06-23 00:16:38,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42320.9, 300 sec: 42597.8). Total num frames: 5245468672. Throughput: 0: 42653.4. Samples: 5245554780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:38,396][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 00:16:38,877][15401] Updated weights for policy 0, policy_version 320160 (0.0044) [2024-06-23 00:16:42,319][15401] Updated weights for policy 0, policy_version 320170 (0.0036) [2024-06-23 00:16:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5245698048. Throughput: 0: 42558.7. Samples: 5245805080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 00:16:46,735][15401] Updated weights for policy 0, policy_version 320180 (0.0035) [2024-06-23 00:16:48,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5245911040. Throughput: 0: 42842.6. Samples: 5246065760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 00:16:50,097][15401] Updated weights for policy 0, policy_version 320190 (0.0029) [2024-06-23 00:16:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5246107648. Throughput: 0: 42662.7. Samples: 5246192100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 00:16:54,229][15401] Updated weights for policy 0, policy_version 320200 (0.0028) [2024-06-23 00:16:57,706][15401] Updated weights for policy 0, policy_version 320210 (0.0037) [2024-06-23 00:16:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 5246337024. Throughput: 0: 42599.7. Samples: 5246444400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:16:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 00:17:01,874][15401] Updated weights for policy 0, policy_version 320220 (0.0035) [2024-06-23 00:17:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5246566400. Throughput: 0: 42729.1. Samples: 5246703720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 00:17:05,345][15401] Updated weights for policy 0, policy_version 320230 (0.0034) [2024-06-23 00:17:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5246746624. Throughput: 0: 42661.8. Samples: 5246832100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 00:17:09,944][15401] Updated weights for policy 0, policy_version 320240 (0.0039) [2024-06-23 00:17:13,014][15401] Updated weights for policy 0, policy_version 320250 (0.0032) [2024-06-23 00:17:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 5246992384. Throughput: 0: 42562.2. Samples: 5247083220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 00:17:16,657][15349] Signal inference workers to stop experience collection... (77650 times) [2024-06-23 00:17:16,659][15349] Signal inference workers to resume experience collection... (77650 times) [2024-06-23 00:17:16,687][15401] InferenceWorker_p0-w0: stopping experience collection (77650 times) [2024-06-23 00:17:16,687][15401] InferenceWorker_p0-w0: resuming experience collection (77650 times) [2024-06-23 00:17:17,452][15401] Updated weights for policy 0, policy_version 320260 (0.0033) [2024-06-23 00:17:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5247172608. Throughput: 0: 42683.1. Samples: 5247345460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 00:17:20,742][15401] Updated weights for policy 0, policy_version 320270 (0.0033) [2024-06-23 00:17:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5247385600. Throughput: 0: 42398.5. Samples: 5247462440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 00:17:25,065][15401] Updated weights for policy 0, policy_version 320280 (0.0037) [2024-06-23 00:17:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5247598592. Throughput: 0: 42529.0. Samples: 5247718880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 00:17:28,761][15401] Updated weights for policy 0, policy_version 320290 (0.0051) [2024-06-23 00:17:32,726][15401] Updated weights for policy 0, policy_version 320300 (0.0042) [2024-06-23 00:17:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5247811584. Throughput: 0: 42382.7. Samples: 5247972980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 00:17:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 00:17:36,369][15401] Updated weights for policy 0, policy_version 320310 (0.0040) [2024-06-23 00:17:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 5248024576. Throughput: 0: 42512.8. Samples: 5248105180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:17:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 00:17:40,387][15401] Updated weights for policy 0, policy_version 320320 (0.0042) [2024-06-23 00:17:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5248237568. Throughput: 0: 42647.0. Samples: 5248363520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:17:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 00:17:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320327_5248237568.pth... [2024-06-23 00:17:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000319705_5238046720.pth [2024-06-23 00:17:44,047][15401] Updated weights for policy 0, policy_version 320330 (0.0027) [2024-06-23 00:17:47,880][15401] Updated weights for policy 0, policy_version 320340 (0.0029) [2024-06-23 00:17:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5248450560. Throughput: 0: 42482.2. Samples: 5248615420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:17:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 00:17:51,512][15401] Updated weights for policy 0, policy_version 320350 (0.0044) [2024-06-23 00:17:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5248663552. Throughput: 0: 42465.3. Samples: 5248743040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:17:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 00:17:55,676][15401] Updated weights for policy 0, policy_version 320360 (0.0025) [2024-06-23 00:17:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 5248876544. Throughput: 0: 42720.1. Samples: 5249005620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:17:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 00:17:59,191][15401] Updated weights for policy 0, policy_version 320370 (0.0037) [2024-06-23 00:18:03,233][15401] Updated weights for policy 0, policy_version 320380 (0.0047) [2024-06-23 00:18:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5249105920. Throughput: 0: 42494.2. Samples: 5249257700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 00:18:06,883][15401] Updated weights for policy 0, policy_version 320390 (0.0039) [2024-06-23 00:18:08,391][15132] Fps is (10 sec: 44231.9, 60 sec: 42870.7, 300 sec: 42653.8). Total num frames: 5249318912. Throughput: 0: 42697.3. Samples: 5249383860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:08,391][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 00:18:11,066][15401] Updated weights for policy 0, policy_version 320400 (0.0032) [2024-06-23 00:18:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5249515520. Throughput: 0: 42882.2. Samples: 5249648580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 00:18:14,611][15401] Updated weights for policy 0, policy_version 320410 (0.0037) [2024-06-23 00:18:18,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5249728512. Throughput: 0: 42845.8. Samples: 5249901040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 00:18:18,722][15401] Updated weights for policy 0, policy_version 320420 (0.0039) [2024-06-23 00:18:22,323][15401] Updated weights for policy 0, policy_version 320430 (0.0040) [2024-06-23 00:18:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5249957888. Throughput: 0: 42767.5. Samples: 5250029720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 00:18:26,462][15401] Updated weights for policy 0, policy_version 320440 (0.0033) [2024-06-23 00:18:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5250154496. Throughput: 0: 42751.1. Samples: 5250287320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 00:18:30,040][15401] Updated weights for policy 0, policy_version 320450 (0.0034) [2024-06-23 00:18:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 5250367488. Throughput: 0: 42899.6. Samples: 5250546000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:33,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:18:34,401][15401] Updated weights for policy 0, policy_version 320460 (0.0039) [2024-06-23 00:18:37,613][15401] Updated weights for policy 0, policy_version 320470 (0.0036) [2024-06-23 00:18:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5250613248. Throughput: 0: 42837.7. Samples: 5250670740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:38,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 00:18:41,148][15349] Signal inference workers to stop experience collection... (77700 times) [2024-06-23 00:18:41,161][15401] InferenceWorker_p0-w0: stopping experience collection (77700 times) [2024-06-23 00:18:41,259][15349] Signal inference workers to resume experience collection... (77700 times) [2024-06-23 00:18:41,259][15401] InferenceWorker_p0-w0: resuming experience collection (77700 times) [2024-06-23 00:18:42,076][15401] Updated weights for policy 0, policy_version 320480 (0.0025) [2024-06-23 00:18:43,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5250793472. Throughput: 0: 42622.4. Samples: 5250923640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 00:18:45,230][15401] Updated weights for policy 0, policy_version 320490 (0.0029) [2024-06-23 00:18:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5251006464. Throughput: 0: 42635.6. Samples: 5251176300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 00:18:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 00:18:50,044][15401] Updated weights for policy 0, policy_version 320500 (0.0038) [2024-06-23 00:18:53,236][15401] Updated weights for policy 0, policy_version 320510 (0.0025) [2024-06-23 00:18:53,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5251235840. Throughput: 0: 42656.6. Samples: 5251303360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:18:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 00:18:57,895][15401] Updated weights for policy 0, policy_version 320520 (0.0035) [2024-06-23 00:18:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5251432448. Throughput: 0: 42533.8. Samples: 5251562600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:18:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 00:19:00,670][15401] Updated weights for policy 0, policy_version 320530 (0.0025) [2024-06-23 00:19:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5251661824. Throughput: 0: 42536.0. Samples: 5251815160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 00:19:05,454][15401] Updated weights for policy 0, policy_version 320540 (0.0026) [2024-06-23 00:19:08,365][15401] Updated weights for policy 0, policy_version 320550 (0.0038) [2024-06-23 00:19:08,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42870.5, 300 sec: 42709.2). Total num frames: 5251891200. Throughput: 0: 42572.1. Samples: 5251945560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:08,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 00:19:12,998][15401] Updated weights for policy 0, policy_version 320560 (0.0038) [2024-06-23 00:19:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5252055040. Throughput: 0: 42554.2. Samples: 5252202260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:13,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 00:19:16,303][15401] Updated weights for policy 0, policy_version 320570 (0.0035) [2024-06-23 00:19:18,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.6, 300 sec: 42709.7). Total num frames: 5252300800. Throughput: 0: 42335.6. Samples: 5252451000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 00:19:20,669][15401] Updated weights for policy 0, policy_version 320580 (0.0032) [2024-06-23 00:19:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5252497408. Throughput: 0: 42531.5. Samples: 5252584660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 00:19:24,257][15401] Updated weights for policy 0, policy_version 320590 (0.0041) [2024-06-23 00:19:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5252694016. Throughput: 0: 42484.2. Samples: 5252835420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 00:19:28,726][15401] Updated weights for policy 0, policy_version 320600 (0.0034) [2024-06-23 00:19:31,919][15401] Updated weights for policy 0, policy_version 320610 (0.0035) [2024-06-23 00:19:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5252939776. Throughput: 0: 42432.9. Samples: 5253085780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 00:19:36,283][15401] Updated weights for policy 0, policy_version 320620 (0.0030) [2024-06-23 00:19:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5253136384. Throughput: 0: 42797.3. Samples: 5253229240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 00:19:39,458][15401] Updated weights for policy 0, policy_version 320630 (0.0035) [2024-06-23 00:19:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5253349376. Throughput: 0: 42459.0. Samples: 5253473260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:19:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320639_5253349376.pth... [2024-06-23 00:19:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320015_5243125760.pth [2024-06-23 00:19:43,954][15401] Updated weights for policy 0, policy_version 320640 (0.0027) [2024-06-23 00:19:47,014][15401] Updated weights for policy 0, policy_version 320650 (0.0031) [2024-06-23 00:19:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5253595136. Throughput: 0: 42597.1. Samples: 5253732020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 00:19:51,410][15401] Updated weights for policy 0, policy_version 320660 (0.0039) [2024-06-23 00:19:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5253775360. Throughput: 0: 42750.3. Samples: 5253869220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 00:19:54,579][15401] Updated weights for policy 0, policy_version 320670 (0.0037) [2024-06-23 00:19:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5254004736. Throughput: 0: 42788.5. Samples: 5254127740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:19:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 00:19:58,799][15401] Updated weights for policy 0, policy_version 320680 (0.0041) [2024-06-23 00:20:02,095][15401] Updated weights for policy 0, policy_version 320690 (0.0039) [2024-06-23 00:20:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5254234112. Throughput: 0: 42937.7. Samples: 5254383200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 00:20:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 00:20:06,257][15401] Updated weights for policy 0, policy_version 320700 (0.0028) [2024-06-23 00:20:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 5254430720. Throughput: 0: 43016.7. Samples: 5254520400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 00:20:09,639][15401] Updated weights for policy 0, policy_version 320710 (0.0043) [2024-06-23 00:20:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42654.2). Total num frames: 5254643712. Throughput: 0: 42964.0. Samples: 5254768800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 00:20:13,829][15401] Updated weights for policy 0, policy_version 320720 (0.0037) [2024-06-23 00:20:17,465][15401] Updated weights for policy 0, policy_version 320730 (0.0050) [2024-06-23 00:20:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5254856704. Throughput: 0: 43183.1. Samples: 5255029020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 00:20:21,589][15401] Updated weights for policy 0, policy_version 320740 (0.0029) [2024-06-23 00:20:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 5255069696. Throughput: 0: 42895.4. Samples: 5255159640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:23,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 00:20:25,279][15401] Updated weights for policy 0, policy_version 320750 (0.0024) [2024-06-23 00:20:26,659][15349] Signal inference workers to stop experience collection... (77750 times) [2024-06-23 00:20:26,660][15349] Signal inference workers to resume experience collection... (77750 times) [2024-06-23 00:20:26,716][15401] InferenceWorker_p0-w0: stopping experience collection (77750 times) [2024-06-23 00:20:26,716][15401] InferenceWorker_p0-w0: resuming experience collection (77750 times) [2024-06-23 00:20:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5255282688. Throughput: 0: 42949.9. Samples: 5255406000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 00:20:29,285][15401] Updated weights for policy 0, policy_version 320760 (0.0042) [2024-06-23 00:20:32,971][15401] Updated weights for policy 0, policy_version 320770 (0.0042) [2024-06-23 00:20:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5255512064. Throughput: 0: 43044.3. Samples: 5255669020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 00:20:36,845][15401] Updated weights for policy 0, policy_version 320780 (0.0032) [2024-06-23 00:20:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5255708672. Throughput: 0: 42862.1. Samples: 5255798020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 00:20:40,474][15401] Updated weights for policy 0, policy_version 320790 (0.0037) [2024-06-23 00:20:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5255938048. Throughput: 0: 42820.1. Samples: 5256054640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 00:20:44,601][15401] Updated weights for policy 0, policy_version 320800 (0.0029) [2024-06-23 00:20:47,960][15401] Updated weights for policy 0, policy_version 320810 (0.0034) [2024-06-23 00:20:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5256151040. Throughput: 0: 42872.0. Samples: 5256312440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 00:20:52,252][15401] Updated weights for policy 0, policy_version 320820 (0.0034) [2024-06-23 00:20:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5256347648. Throughput: 0: 42779.9. Samples: 5256445500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:53,392][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 00:20:55,482][15401] Updated weights for policy 0, policy_version 320830 (0.0030) [2024-06-23 00:20:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5256577024. Throughput: 0: 42886.7. Samples: 5256698700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:20:58,394][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 00:21:00,102][15401] Updated weights for policy 0, policy_version 320840 (0.0041) [2024-06-23 00:21:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5256790016. Throughput: 0: 42816.4. Samples: 5256955760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:21:03,399][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 00:21:03,537][15401] Updated weights for policy 0, policy_version 320850 (0.0043) [2024-06-23 00:21:07,775][15401] Updated weights for policy 0, policy_version 320860 (0.0036) [2024-06-23 00:21:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5256986624. Throughput: 0: 42857.4. Samples: 5257088120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:21:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 00:21:11,203][15401] Updated weights for policy 0, policy_version 320870 (0.0024) [2024-06-23 00:21:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5257232384. Throughput: 0: 43023.5. Samples: 5257342060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:21:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 00:21:15,504][15401] Updated weights for policy 0, policy_version 320880 (0.0037) [2024-06-23 00:21:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5257445376. Throughput: 0: 42868.5. Samples: 5257598100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 00:21:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 00:21:18,626][15401] Updated weights for policy 0, policy_version 320890 (0.0035) [2024-06-23 00:21:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5257625600. Throughput: 0: 42901.9. Samples: 5257728600. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 00:21:23,391][15401] Updated weights for policy 0, policy_version 320900 (0.0034) [2024-06-23 00:21:26,221][15401] Updated weights for policy 0, policy_version 320910 (0.0060) [2024-06-23 00:21:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5257887744. Throughput: 0: 42969.3. Samples: 5257988260. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 00:21:30,791][15401] Updated weights for policy 0, policy_version 320920 (0.0044) [2024-06-23 00:21:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 5258084352. Throughput: 0: 42838.2. Samples: 5258240160. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 00:21:34,167][15401] Updated weights for policy 0, policy_version 320930 (0.0029) [2024-06-23 00:21:38,392][15401] Updated weights for policy 0, policy_version 320940 (0.0040) [2024-06-23 00:21:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 5258280960. Throughput: 0: 42641.7. Samples: 5258364480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:38,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 00:21:41,743][15401] Updated weights for policy 0, policy_version 320950 (0.0026) [2024-06-23 00:21:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5258510336. Throughput: 0: 42813.3. Samples: 5258625300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 00:21:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320955_5258526720.pth... [2024-06-23 00:21:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320327_5248237568.pth [2024-06-23 00:21:45,948][15401] Updated weights for policy 0, policy_version 320960 (0.0024) [2024-06-23 00:21:47,316][15349] Signal inference workers to stop experience collection... (77800 times) [2024-06-23 00:21:47,316][15349] Signal inference workers to resume experience collection... (77800 times) [2024-06-23 00:21:47,335][15401] InferenceWorker_p0-w0: stopping experience collection (77800 times) [2024-06-23 00:21:47,335][15401] InferenceWorker_p0-w0: resuming experience collection (77800 times) [2024-06-23 00:21:48,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5258723328. Throughput: 0: 42910.9. Samples: 5258886740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 00:21:49,246][15401] Updated weights for policy 0, policy_version 320970 (0.0037) [2024-06-23 00:21:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 5258919936. Throughput: 0: 42675.5. Samples: 5259008620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:53,392][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 00:21:53,482][15401] Updated weights for policy 0, policy_version 320980 (0.0036) [2024-06-23 00:21:56,911][15401] Updated weights for policy 0, policy_version 320990 (0.0035) [2024-06-23 00:21:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5259182080. Throughput: 0: 42885.4. Samples: 5259271900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:21:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 00:22:00,958][15401] Updated weights for policy 0, policy_version 321000 (0.0031) [2024-06-23 00:22:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5259362304. Throughput: 0: 42961.2. Samples: 5259531360. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 00:22:04,342][15401] Updated weights for policy 0, policy_version 321010 (0.0040) [2024-06-23 00:22:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5259575296. Throughput: 0: 42766.7. Samples: 5259653100. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 00:22:08,428][15401] Updated weights for policy 0, policy_version 321020 (0.0040) [2024-06-23 00:22:12,391][15401] Updated weights for policy 0, policy_version 321030 (0.0036) [2024-06-23 00:22:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5259804672. Throughput: 0: 42899.5. Samples: 5259918740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 00:22:15,875][15401] Updated weights for policy 0, policy_version 321040 (0.0042) [2024-06-23 00:22:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5260017664. Throughput: 0: 43151.5. Samples: 5260181980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 00:22:19,688][15401] Updated weights for policy 0, policy_version 321050 (0.0035) [2024-06-23 00:22:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5260214272. Throughput: 0: 43194.9. Samples: 5260308140. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 00:22:23,796][15401] Updated weights for policy 0, policy_version 321060 (0.0025) [2024-06-23 00:22:27,054][15401] Updated weights for policy 0, policy_version 321070 (0.0059) [2024-06-23 00:22:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5260443648. Throughput: 0: 43119.7. Samples: 5260565680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 00:22:31,338][15401] Updated weights for policy 0, policy_version 321080 (0.0034) [2024-06-23 00:22:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5260673024. Throughput: 0: 43238.1. Samples: 5260832460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 00:22:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 00:22:34,372][15401] Updated weights for policy 0, policy_version 321090 (0.0031) [2024-06-23 00:22:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 5260869632. Throughput: 0: 43349.5. Samples: 5260959240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:22:38,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 00:22:38,646][15401] Updated weights for policy 0, policy_version 321100 (0.0044) [2024-06-23 00:22:41,893][15401] Updated weights for policy 0, policy_version 321110 (0.0038) [2024-06-23 00:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 5261115392. Throughput: 0: 43213.3. Samples: 5261216500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:22:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 00:22:46,395][15401] Updated weights for policy 0, policy_version 321120 (0.0039) [2024-06-23 00:22:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5261312000. Throughput: 0: 43279.1. Samples: 5261478920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:22:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 00:22:49,841][15401] Updated weights for policy 0, policy_version 321130 (0.0033) [2024-06-23 00:22:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 5261508608. Throughput: 0: 43222.9. Samples: 5261598140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:22:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 00:22:54,087][15401] Updated weights for policy 0, policy_version 321140 (0.0040) [2024-06-23 00:22:57,441][15401] Updated weights for policy 0, policy_version 321150 (0.0028) [2024-06-23 00:22:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5261754368. Throughput: 0: 43122.2. Samples: 5261859240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:22:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 00:22:58,404][15349] Signal inference workers to stop experience collection... (77850 times) [2024-06-23 00:22:58,404][15349] Signal inference workers to resume experience collection... (77850 times) [2024-06-23 00:22:58,458][15401] InferenceWorker_p0-w0: stopping experience collection (77850 times) [2024-06-23 00:22:58,458][15401] InferenceWorker_p0-w0: resuming experience collection (77850 times) [2024-06-23 00:23:01,715][15401] Updated weights for policy 0, policy_version 321160 (0.0029) [2024-06-23 00:23:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.7). Total num frames: 5261950976. Throughput: 0: 43101.4. Samples: 5262121540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 00:23:04,920][15401] Updated weights for policy 0, policy_version 321170 (0.0043) [2024-06-23 00:23:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5262147584. Throughput: 0: 43037.6. Samples: 5262244840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 00:23:09,211][15401] Updated weights for policy 0, policy_version 321180 (0.0034) [2024-06-23 00:23:12,450][15401] Updated weights for policy 0, policy_version 321190 (0.0033) [2024-06-23 00:23:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 5262393344. Throughput: 0: 43090.1. Samples: 5262504740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:23:16,993][15401] Updated weights for policy 0, policy_version 321200 (0.0041) [2024-06-23 00:23:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5262589952. Throughput: 0: 42997.3. Samples: 5262767340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:18,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-23 00:23:20,055][15401] Updated weights for policy 0, policy_version 321210 (0.0036) [2024-06-23 00:23:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5262802944. Throughput: 0: 42835.4. Samples: 5262886840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:23,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 00:23:24,507][15401] Updated weights for policy 0, policy_version 321220 (0.0046) [2024-06-23 00:23:28,082][15401] Updated weights for policy 0, policy_version 321230 (0.0027) [2024-06-23 00:23:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 5263032320. Throughput: 0: 42860.6. Samples: 5263145220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 00:23:32,185][15401] Updated weights for policy 0, policy_version 321240 (0.0037) [2024-06-23 00:23:33,391][15132] Fps is (10 sec: 42591.7, 60 sec: 42597.3, 300 sec: 42764.8). Total num frames: 5263228928. Throughput: 0: 42777.2. Samples: 5263403960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:33,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 00:23:35,803][15401] Updated weights for policy 0, policy_version 321250 (0.0033) [2024-06-23 00:23:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 5263458304. Throughput: 0: 42931.6. Samples: 5263530060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 00:23:39,819][15401] Updated weights for policy 0, policy_version 321260 (0.0032) [2024-06-23 00:23:43,301][15401] Updated weights for policy 0, policy_version 321270 (0.0027) [2024-06-23 00:23:43,389][15132] Fps is (10 sec: 45882.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5263687680. Throughput: 0: 42836.6. Samples: 5263786880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 00:23:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321270_5263687680.pth... [2024-06-23 00:23:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320639_5253349376.pth [2024-06-23 00:23:47,308][15401] Updated weights for policy 0, policy_version 321280 (0.0040) [2024-06-23 00:23:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5263867904. Throughput: 0: 42719.1. Samples: 5264043900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 00:23:48,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 00:23:51,327][15401] Updated weights for policy 0, policy_version 321290 (0.0029) [2024-06-23 00:23:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5264097280. Throughput: 0: 42774.7. Samples: 5264169700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:23:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 00:23:54,924][15401] Updated weights for policy 0, policy_version 321300 (0.0041) [2024-06-23 00:23:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5264310272. Throughput: 0: 42674.6. Samples: 5264425100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:23:58,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-23 00:23:59,117][15401] Updated weights for policy 0, policy_version 321310 (0.0037) [2024-06-23 00:24:02,464][15401] Updated weights for policy 0, policy_version 321320 (0.0038) [2024-06-23 00:24:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5264506880. Throughput: 0: 42538.3. Samples: 5264681560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 00:24:06,767][15401] Updated weights for policy 0, policy_version 321330 (0.0042) [2024-06-23 00:24:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5264736256. Throughput: 0: 42729.8. Samples: 5264809680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:08,395][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 00:24:10,200][15401] Updated weights for policy 0, policy_version 321340 (0.0040) [2024-06-23 00:24:13,071][15349] Signal inference workers to stop experience collection... (77900 times) [2024-06-23 00:24:13,096][15401] InferenceWorker_p0-w0: stopping experience collection (77900 times) [2024-06-23 00:24:13,130][15349] Signal inference workers to resume experience collection... (77900 times) [2024-06-23 00:24:13,132][15401] InferenceWorker_p0-w0: resuming experience collection (77900 times) [2024-06-23 00:24:13,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 5264965632. Throughput: 0: 42735.0. Samples: 5265068400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:13,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 00:24:14,457][15401] Updated weights for policy 0, policy_version 321350 (0.0046) [2024-06-23 00:24:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5265145856. Throughput: 0: 42572.3. Samples: 5265319640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:18,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-23 00:24:18,500][15401] Updated weights for policy 0, policy_version 321360 (0.0037) [2024-06-23 00:24:22,056][15401] Updated weights for policy 0, policy_version 321370 (0.0042) [2024-06-23 00:24:23,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 5265375232. Throughput: 0: 42569.7. Samples: 5265445800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:23,393][15132] Avg episode reward: [(0, '0.884')] [2024-06-23 00:24:26,138][15401] Updated weights for policy 0, policy_version 321380 (0.0040) [2024-06-23 00:24:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5265588224. Throughput: 0: 42645.4. Samples: 5265705920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 00:24:30,060][15401] Updated weights for policy 0, policy_version 321390 (0.0033) [2024-06-23 00:24:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42599.6, 300 sec: 42876.1). Total num frames: 5265784832. Throughput: 0: 42652.0. Samples: 5265963240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 00:24:33,646][15401] Updated weights for policy 0, policy_version 321400 (0.0027) [2024-06-23 00:24:37,714][15401] Updated weights for policy 0, policy_version 321410 (0.0033) [2024-06-23 00:24:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5266014208. Throughput: 0: 42655.6. Samples: 5266089200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 00:24:41,230][15401] Updated weights for policy 0, policy_version 321420 (0.0040) [2024-06-23 00:24:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5266243584. Throughput: 0: 42851.2. Samples: 5266353400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 00:24:45,331][15401] Updated weights for policy 0, policy_version 321430 (0.0039) [2024-06-23 00:24:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5266423808. Throughput: 0: 42817.3. Samples: 5266608340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 00:24:48,724][15401] Updated weights for policy 0, policy_version 321440 (0.0034) [2024-06-23 00:24:52,831][15401] Updated weights for policy 0, policy_version 321450 (0.0047) [2024-06-23 00:24:53,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 5266669568. Throughput: 0: 42770.2. Samples: 5266734440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:53,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 00:24:56,399][15401] Updated weights for policy 0, policy_version 321460 (0.0039) [2024-06-23 00:24:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5266866176. Throughput: 0: 42793.8. Samples: 5266994020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:24:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 00:25:00,486][15401] Updated weights for policy 0, policy_version 321470 (0.0033) [2024-06-23 00:25:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5267095552. Throughput: 0: 42963.5. Samples: 5267253000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 00:25:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 00:25:03,962][15401] Updated weights for policy 0, policy_version 321480 (0.0038) [2024-06-23 00:25:08,209][15401] Updated weights for policy 0, policy_version 321490 (0.0037) [2024-06-23 00:25:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5267292160. Throughput: 0: 43005.0. Samples: 5267380920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 00:25:11,443][15401] Updated weights for policy 0, policy_version 321500 (0.0031) [2024-06-23 00:25:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42054.1, 300 sec: 42820.6). Total num frames: 5267488768. Throughput: 0: 42793.8. Samples: 5267631640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:13,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 00:25:15,966][15401] Updated weights for policy 0, policy_version 321510 (0.0035) [2024-06-23 00:25:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 5267734528. Throughput: 0: 42802.7. Samples: 5267889360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:18,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-23 00:25:19,293][15401] Updated weights for policy 0, policy_version 321520 (0.0035) [2024-06-23 00:25:23,032][15349] Signal inference workers to stop experience collection... (77950 times) [2024-06-23 00:25:23,090][15349] Signal inference workers to resume experience collection... (77950 times) [2024-06-23 00:25:23,091][15401] InferenceWorker_p0-w0: stopping experience collection (77950 times) [2024-06-23 00:25:23,116][15401] InferenceWorker_p0-w0: resuming experience collection (77950 times) [2024-06-23 00:25:23,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 5267914752. Throughput: 0: 42899.2. Samples: 5268019660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 00:25:23,574][15401] Updated weights for policy 0, policy_version 321530 (0.0027) [2024-06-23 00:25:26,926][15401] Updated weights for policy 0, policy_version 321540 (0.0038) [2024-06-23 00:25:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5268127744. Throughput: 0: 42504.8. Samples: 5268266120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 00:25:31,190][15401] Updated weights for policy 0, policy_version 321550 (0.0027) [2024-06-23 00:25:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 5268373504. Throughput: 0: 42612.9. Samples: 5268525920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 00:25:34,489][15401] Updated weights for policy 0, policy_version 321560 (0.0032) [2024-06-23 00:25:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5268570112. Throughput: 0: 42823.7. Samples: 5268661400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 00:25:38,697][15401] Updated weights for policy 0, policy_version 321570 (0.0039) [2024-06-23 00:25:42,136][15401] Updated weights for policy 0, policy_version 321580 (0.0024) [2024-06-23 00:25:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5268783104. Throughput: 0: 42591.2. Samples: 5268910620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 00:25:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321581_5268783104.pth... [2024-06-23 00:25:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000320955_5258526720.pth [2024-06-23 00:25:46,400][15401] Updated weights for policy 0, policy_version 321590 (0.0031) [2024-06-23 00:25:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5269012480. Throughput: 0: 42519.0. Samples: 5269166360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 00:25:49,939][15401] Updated weights for policy 0, policy_version 321600 (0.0037) [2024-06-23 00:25:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 5269225472. Throughput: 0: 42650.1. Samples: 5269300180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:53,396][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 00:25:54,274][15401] Updated weights for policy 0, policy_version 321610 (0.0037) [2024-06-23 00:25:57,883][15401] Updated weights for policy 0, policy_version 321620 (0.0035) [2024-06-23 00:25:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5269438464. Throughput: 0: 42664.7. Samples: 5269551560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:25:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 00:26:01,843][15401] Updated weights for policy 0, policy_version 321630 (0.0042) [2024-06-23 00:26:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5269651456. Throughput: 0: 42571.0. Samples: 5269805060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:26:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 00:26:05,611][15401] Updated weights for policy 0, policy_version 321640 (0.0037) [2024-06-23 00:26:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 5269848064. Throughput: 0: 42524.7. Samples: 5269933280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:26:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 00:26:09,461][15401] Updated weights for policy 0, policy_version 321650 (0.0030) [2024-06-23 00:26:13,119][15401] Updated weights for policy 0, policy_version 321660 (0.0028) [2024-06-23 00:26:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5270077440. Throughput: 0: 42790.6. Samples: 5270191700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:26:13,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 00:26:16,989][15401] Updated weights for policy 0, policy_version 321670 (0.0030) [2024-06-23 00:26:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5270290432. Throughput: 0: 42792.4. Samples: 5270451580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 00:26:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 00:26:21,224][15401] Updated weights for policy 0, policy_version 321680 (0.0031) [2024-06-23 00:26:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5270470656. Throughput: 0: 42580.0. Samples: 5270577500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 00:26:24,682][15401] Updated weights for policy 0, policy_version 321690 (0.0029) [2024-06-23 00:26:26,382][15349] Signal inference workers to stop experience collection... (78000 times) [2024-06-23 00:26:26,382][15349] Signal inference workers to resume experience collection... (78000 times) [2024-06-23 00:26:26,396][15401] InferenceWorker_p0-w0: stopping experience collection (78000 times) [2024-06-23 00:26:26,396][15401] InferenceWorker_p0-w0: resuming experience collection (78000 times) [2024-06-23 00:26:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5270700032. Throughput: 0: 42637.6. Samples: 5270829320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:28,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 00:26:28,949][15401] Updated weights for policy 0, policy_version 321700 (0.0038) [2024-06-23 00:26:32,428][15401] Updated weights for policy 0, policy_version 321710 (0.0032) [2024-06-23 00:26:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 5270913024. Throughput: 0: 42865.5. Samples: 5271095300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 00:26:36,352][15401] Updated weights for policy 0, policy_version 321720 (0.0044) [2024-06-23 00:26:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5271142400. Throughput: 0: 42776.9. Samples: 5271225140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 00:26:40,065][15401] Updated weights for policy 0, policy_version 321730 (0.0031) [2024-06-23 00:26:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5271355392. Throughput: 0: 42772.5. Samples: 5271476320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 00:26:44,067][15401] Updated weights for policy 0, policy_version 321740 (0.0042) [2024-06-23 00:26:47,671][15401] Updated weights for policy 0, policy_version 321750 (0.0031) [2024-06-23 00:26:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 5271552000. Throughput: 0: 42872.4. Samples: 5271734320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:48,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 00:26:51,808][15401] Updated weights for policy 0, policy_version 321760 (0.0036) [2024-06-23 00:26:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5271764992. Throughput: 0: 42809.5. Samples: 5271859700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 00:26:55,534][15401] Updated weights for policy 0, policy_version 321770 (0.0028) [2024-06-23 00:26:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5271994368. Throughput: 0: 42549.4. Samples: 5272106420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:26:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:26:59,851][15401] Updated weights for policy 0, policy_version 321780 (0.0035) [2024-06-23 00:27:03,332][15401] Updated weights for policy 0, policy_version 321790 (0.0032) [2024-06-23 00:27:03,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 5272207360. Throughput: 0: 42545.9. Samples: 5272366420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:03,396][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 00:27:07,307][15401] Updated weights for policy 0, policy_version 321800 (0.0041) [2024-06-23 00:27:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5272420352. Throughput: 0: 42570.9. Samples: 5272493200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 00:27:11,014][15401] Updated weights for policy 0, policy_version 321810 (0.0027) [2024-06-23 00:27:13,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5272633344. Throughput: 0: 42726.3. Samples: 5272752000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 00:27:14,767][15401] Updated weights for policy 0, policy_version 321820 (0.0034) [2024-06-23 00:27:18,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5272829952. Throughput: 0: 42568.0. Samples: 5273010860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 00:27:18,789][15401] Updated weights for policy 0, policy_version 321830 (0.0040) [2024-06-23 00:27:22,602][15401] Updated weights for policy 0, policy_version 321840 (0.0037) [2024-06-23 00:27:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5273075712. Throughput: 0: 42435.0. Samples: 5273134720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:23,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-23 00:27:26,345][15401] Updated weights for policy 0, policy_version 321850 (0.0034) [2024-06-23 00:27:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5273272320. Throughput: 0: 42511.5. Samples: 5273389340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:28,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 00:27:30,149][15401] Updated weights for policy 0, policy_version 321860 (0.0027) [2024-06-23 00:27:33,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5273452544. Throughput: 0: 42594.2. Samples: 5273651060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 00:27:34,210][15401] Updated weights for policy 0, policy_version 321870 (0.0029) [2024-06-23 00:27:37,937][15401] Updated weights for policy 0, policy_version 321880 (0.0033) [2024-06-23 00:27:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5273698304. Throughput: 0: 42565.7. Samples: 5273775160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-23 00:27:38,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-23 00:27:42,079][15401] Updated weights for policy 0, policy_version 321890 (0.0030) [2024-06-23 00:27:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5273911296. Throughput: 0: 42814.6. Samples: 5274033080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:27:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 00:27:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321894_5273911296.pth... [2024-06-23 00:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321270_5263687680.pth [2024-06-23 00:27:45,510][15401] Updated weights for policy 0, policy_version 321900 (0.0037) [2024-06-23 00:27:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5274107904. Throughput: 0: 42663.0. Samples: 5274285980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:27:48,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 00:27:49,516][15401] Updated weights for policy 0, policy_version 321910 (0.0028) [2024-06-23 00:27:53,145][15401] Updated weights for policy 0, policy_version 321920 (0.0041) [2024-06-23 00:27:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5274337280. Throughput: 0: 42649.9. Samples: 5274412440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:27:53,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 00:27:54,689][15349] Signal inference workers to stop experience collection... (78050 times) [2024-06-23 00:27:54,691][15349] Signal inference workers to resume experience collection... (78050 times) [2024-06-23 00:27:54,740][15401] InferenceWorker_p0-w0: stopping experience collection (78050 times) [2024-06-23 00:27:54,740][15401] InferenceWorker_p0-w0: resuming experience collection (78050 times) [2024-06-23 00:27:57,357][15401] Updated weights for policy 0, policy_version 321930 (0.0041) [2024-06-23 00:27:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5274550272. Throughput: 0: 42660.0. Samples: 5274671700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:27:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 00:28:00,814][15401] Updated weights for policy 0, policy_version 321940 (0.0037) [2024-06-23 00:28:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 5274746880. Throughput: 0: 42481.3. Samples: 5274922520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:03,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-23 00:28:04,913][15401] Updated weights for policy 0, policy_version 321950 (0.0034) [2024-06-23 00:28:08,359][15401] Updated weights for policy 0, policy_version 321960 (0.0029) [2024-06-23 00:28:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5274992640. Throughput: 0: 42593.0. Samples: 5275051400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:08,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 00:28:12,564][15401] Updated weights for policy 0, policy_version 321970 (0.0037) [2024-06-23 00:28:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 5275205632. Throughput: 0: 42570.6. Samples: 5275305120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:13,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 00:28:16,440][15401] Updated weights for policy 0, policy_version 321980 (0.0042) [2024-06-23 00:28:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5275385856. Throughput: 0: 42493.8. Samples: 5275563280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 00:28:20,253][15401] Updated weights for policy 0, policy_version 321990 (0.0033) [2024-06-23 00:28:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5275615232. Throughput: 0: 42610.3. Samples: 5275692620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:23,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 00:28:23,893][15401] Updated weights for policy 0, policy_version 322000 (0.0037) [2024-06-23 00:28:27,794][15401] Updated weights for policy 0, policy_version 322010 (0.0040) [2024-06-23 00:28:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 5275828224. Throughput: 0: 42575.2. Samples: 5275948960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:28,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 00:28:31,555][15401] Updated weights for policy 0, policy_version 322020 (0.0035) [2024-06-23 00:28:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5276024832. Throughput: 0: 42619.1. Samples: 5276203840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 00:28:35,555][15401] Updated weights for policy 0, policy_version 322030 (0.0029) [2024-06-23 00:28:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5276237824. Throughput: 0: 42539.4. Samples: 5276326720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 00:28:39,133][15401] Updated weights for policy 0, policy_version 322040 (0.0032) [2024-06-23 00:28:43,146][15401] Updated weights for policy 0, policy_version 322050 (0.0039) [2024-06-23 00:28:43,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.9, 300 sec: 42708.5). Total num frames: 5276467200. Throughput: 0: 42547.7. Samples: 5276586620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:43,396][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 00:28:46,898][15401] Updated weights for policy 0, policy_version 322060 (0.0025) [2024-06-23 00:28:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5276680192. Throughput: 0: 42515.1. Samples: 5276835700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 00:28:51,392][15401] Updated weights for policy 0, policy_version 322070 (0.0037) [2024-06-23 00:28:53,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5276876800. Throughput: 0: 42520.8. Samples: 5276964840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 00:28:53,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 00:28:54,590][15401] Updated weights for policy 0, policy_version 322080 (0.0039) [2024-06-23 00:28:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5277089792. Throughput: 0: 42557.9. Samples: 5277220120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:28:58,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 00:28:58,868][15401] Updated weights for policy 0, policy_version 322090 (0.0030) [2024-06-23 00:29:02,230][15401] Updated weights for policy 0, policy_version 322100 (0.0038) [2024-06-23 00:29:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5277335552. Throughput: 0: 42440.5. Samples: 5277473100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 00:29:06,650][15401] Updated weights for policy 0, policy_version 322110 (0.0030) [2024-06-23 00:29:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 5277515776. Throughput: 0: 42528.0. Samples: 5277606380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 00:29:09,894][15401] Updated weights for policy 0, policy_version 322120 (0.0038) [2024-06-23 00:29:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 5277745152. Throughput: 0: 42544.8. Samples: 5277863480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 00:29:14,301][15401] Updated weights for policy 0, policy_version 322130 (0.0033) [2024-06-23 00:29:17,553][15401] Updated weights for policy 0, policy_version 322140 (0.0035) [2024-06-23 00:29:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 5277958144. Throughput: 0: 42433.8. Samples: 5278113360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 00:29:21,967][15401] Updated weights for policy 0, policy_version 322150 (0.0048) [2024-06-23 00:29:23,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 5278171136. Throughput: 0: 42586.0. Samples: 5278243360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:23,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 00:29:25,334][15401] Updated weights for policy 0, policy_version 322160 (0.0037) [2024-06-23 00:29:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5278384128. Throughput: 0: 42651.8. Samples: 5278505680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 00:29:29,432][15401] Updated weights for policy 0, policy_version 322170 (0.0025) [2024-06-23 00:29:32,778][15401] Updated weights for policy 0, policy_version 322180 (0.0034) [2024-06-23 00:29:33,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5278597120. Throughput: 0: 42709.0. Samples: 5278757600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 00:29:37,157][15401] Updated weights for policy 0, policy_version 322190 (0.0034) [2024-06-23 00:29:38,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 5278793728. Throughput: 0: 42837.1. Samples: 5278892500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:29:38,459][15349] Signal inference workers to stop experience collection... (78100 times) [2024-06-23 00:29:38,488][15401] InferenceWorker_p0-w0: stopping experience collection (78100 times) [2024-06-23 00:29:38,512][15349] Signal inference workers to resume experience collection... (78100 times) [2024-06-23 00:29:38,512][15401] InferenceWorker_p0-w0: resuming experience collection (78100 times) [2024-06-23 00:29:40,275][15401] Updated weights for policy 0, policy_version 322200 (0.0031) [2024-06-23 00:29:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 5279039488. Throughput: 0: 42899.9. Samples: 5279150620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:43,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 00:29:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322207_5279039488.pth... [2024-06-23 00:29:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321581_5268783104.pth [2024-06-23 00:29:44,671][15401] Updated weights for policy 0, policy_version 322210 (0.0028) [2024-06-23 00:29:47,899][15401] Updated weights for policy 0, policy_version 322220 (0.0027) [2024-06-23 00:29:48,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 5279252480. Throughput: 0: 42866.5. Samples: 5279402100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 00:29:52,195][15401] Updated weights for policy 0, policy_version 322230 (0.0032) [2024-06-23 00:29:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5279449088. Throughput: 0: 42953.3. Samples: 5279539280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:29:55,507][15401] Updated weights for policy 0, policy_version 322240 (0.0037) [2024-06-23 00:29:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5279678464. Throughput: 0: 43085.0. Samples: 5279802300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:29:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 00:29:59,477][15401] Updated weights for policy 0, policy_version 322250 (0.0038) [2024-06-23 00:30:03,113][15401] Updated weights for policy 0, policy_version 322260 (0.0040) [2024-06-23 00:30:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5279907840. Throughput: 0: 43151.1. Samples: 5280055160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:30:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 00:30:07,289][15401] Updated weights for policy 0, policy_version 322270 (0.0033) [2024-06-23 00:30:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5280104448. Throughput: 0: 43108.8. Samples: 5280182980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 00:30:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 00:30:11,838][15401] Updated weights for policy 0, policy_version 322280 (0.0043) [2024-06-23 00:30:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5280317440. Throughput: 0: 43098.3. Samples: 5280445100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 00:30:14,777][15401] Updated weights for policy 0, policy_version 322290 (0.0032) [2024-06-23 00:30:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5280530432. Throughput: 0: 43109.3. Samples: 5280697520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 00:30:19,342][15401] Updated weights for policy 0, policy_version 322300 (0.0031) [2024-06-23 00:30:22,602][15401] Updated weights for policy 0, policy_version 322310 (0.0031) [2024-06-23 00:30:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 5280743424. Throughput: 0: 43085.6. Samples: 5280831360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 00:30:26,920][15401] Updated weights for policy 0, policy_version 322320 (0.0041) [2024-06-23 00:30:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5280956416. Throughput: 0: 43055.1. Samples: 5281088100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 00:30:30,470][15401] Updated weights for policy 0, policy_version 322330 (0.0035) [2024-06-23 00:30:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5281185792. Throughput: 0: 42985.0. Samples: 5281336420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 00:30:34,583][15401] Updated weights for policy 0, policy_version 322340 (0.0031) [2024-06-23 00:30:37,922][15401] Updated weights for policy 0, policy_version 322350 (0.0041) [2024-06-23 00:30:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5281398784. Throughput: 0: 42951.1. Samples: 5281472080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 00:30:42,062][15401] Updated weights for policy 0, policy_version 322360 (0.0042) [2024-06-23 00:30:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5281595392. Throughput: 0: 42869.8. Samples: 5281731440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 00:30:45,452][15401] Updated weights for policy 0, policy_version 322370 (0.0042) [2024-06-23 00:30:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5281841152. Throughput: 0: 42871.1. Samples: 5281984360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 00:30:49,567][15401] Updated weights for policy 0, policy_version 322380 (0.0041) [2024-06-23 00:30:53,211][15401] Updated weights for policy 0, policy_version 322390 (0.0031) [2024-06-23 00:30:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5282037760. Throughput: 0: 43039.5. Samples: 5282119760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 00:30:57,081][15401] Updated weights for policy 0, policy_version 322400 (0.0049) [2024-06-23 00:30:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5282250752. Throughput: 0: 42851.1. Samples: 5282373400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:30:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:31:00,778][15401] Updated weights for policy 0, policy_version 322410 (0.0022) [2024-06-23 00:31:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 5282463744. Throughput: 0: 42883.9. Samples: 5282627400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:31:03,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 00:31:04,042][15349] Signal inference workers to stop experience collection... (78150 times) [2024-06-23 00:31:04,043][15349] Signal inference workers to resume experience collection... (78150 times) [2024-06-23 00:31:04,087][15401] InferenceWorker_p0-w0: stopping experience collection (78150 times) [2024-06-23 00:31:04,087][15401] InferenceWorker_p0-w0: resuming experience collection (78150 times) [2024-06-23 00:31:04,556][15401] Updated weights for policy 0, policy_version 322420 (0.0036) [2024-06-23 00:31:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5282676736. Throughput: 0: 42844.0. Samples: 5282759340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:31:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 00:31:08,830][15401] Updated weights for policy 0, policy_version 322430 (0.0032) [2024-06-23 00:31:12,053][15401] Updated weights for policy 0, policy_version 322440 (0.0038) [2024-06-23 00:31:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5282873344. Throughput: 0: 42680.9. Samples: 5283008740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:31:13,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 00:31:16,495][15401] Updated weights for policy 0, policy_version 322450 (0.0029) [2024-06-23 00:31:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5283119104. Throughput: 0: 42801.4. Samples: 5283262480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:31:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 00:31:19,999][15401] Updated weights for policy 0, policy_version 322460 (0.0034) [2024-06-23 00:31:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5283282944. Throughput: 0: 42745.8. Samples: 5283395640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-23 00:31:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 00:31:24,202][15401] Updated weights for policy 0, policy_version 322470 (0.0034) [2024-06-23 00:31:27,593][15401] Updated weights for policy 0, policy_version 322480 (0.0029) [2024-06-23 00:31:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5283512320. Throughput: 0: 42615.9. Samples: 5283649160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 00:31:31,849][15401] Updated weights for policy 0, policy_version 322490 (0.0047) [2024-06-23 00:31:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5283758080. Throughput: 0: 42557.3. Samples: 5283899440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 00:31:35,649][15401] Updated weights for policy 0, policy_version 322500 (0.0042) [2024-06-23 00:31:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5283938304. Throughput: 0: 42491.5. Samples: 5284031880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:38,391][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 00:31:39,715][15401] Updated weights for policy 0, policy_version 322510 (0.0032) [2024-06-23 00:31:43,220][15401] Updated weights for policy 0, policy_version 322520 (0.0032) [2024-06-23 00:31:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5284167680. Throughput: 0: 42476.4. Samples: 5284284840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 00:31:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322520_5284167680.pth... [2024-06-23 00:31:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000321894_5273911296.pth [2024-06-23 00:31:47,447][15401] Updated weights for policy 0, policy_version 322530 (0.0032) [2024-06-23 00:31:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5284397056. Throughput: 0: 42498.4. Samples: 5284539720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 00:31:51,111][15401] Updated weights for policy 0, policy_version 322540 (0.0033) [2024-06-23 00:31:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5284593664. Throughput: 0: 42517.0. Samples: 5284672600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:53,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 00:31:54,942][15401] Updated weights for policy 0, policy_version 322550 (0.0026) [2024-06-23 00:31:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42710.1). Total num frames: 5284806656. Throughput: 0: 42707.9. Samples: 5284930700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:31:58,393][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 00:31:58,615][15401] Updated weights for policy 0, policy_version 322560 (0.0028) [2024-06-23 00:32:02,519][15401] Updated weights for policy 0, policy_version 322570 (0.0044) [2024-06-23 00:32:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 5285052416. Throughput: 0: 42623.5. Samples: 5285180540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:03,390][15132] Avg episode reward: [(0, '0.090')] [2024-06-23 00:32:06,363][15401] Updated weights for policy 0, policy_version 322580 (0.0046) [2024-06-23 00:32:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5285232640. Throughput: 0: 42661.7. Samples: 5285315420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:08,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-23 00:32:10,394][15401] Updated weights for policy 0, policy_version 322590 (0.0027) [2024-06-23 00:32:13,391][15132] Fps is (10 sec: 39317.1, 60 sec: 42870.7, 300 sec: 42764.9). Total num frames: 5285445632. Throughput: 0: 42633.7. Samples: 5285567720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:13,391][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:32:14,175][15401] Updated weights for policy 0, policy_version 322600 (0.0036) [2024-06-23 00:32:18,017][15401] Updated weights for policy 0, policy_version 322610 (0.0034) [2024-06-23 00:32:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5285658624. Throughput: 0: 42877.4. Samples: 5285828920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 00:32:21,668][15401] Updated weights for policy 0, policy_version 322620 (0.0029) [2024-06-23 00:32:23,390][15132] Fps is (10 sec: 44241.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5285888000. Throughput: 0: 42812.5. Samples: 5285958440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 00:32:25,479][15401] Updated weights for policy 0, policy_version 322630 (0.0026) [2024-06-23 00:32:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5286084608. Throughput: 0: 42779.2. Samples: 5286209900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 00:32:29,388][15401] Updated weights for policy 0, policy_version 322640 (0.0043) [2024-06-23 00:32:33,324][15401] Updated weights for policy 0, policy_version 322650 (0.0040) [2024-06-23 00:32:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5286297600. Throughput: 0: 42988.3. Samples: 5286474200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 00:32:37,207][15401] Updated weights for policy 0, policy_version 322660 (0.0047) [2024-06-23 00:32:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5286526976. Throughput: 0: 42738.6. Samples: 5286595840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 00:32:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 00:32:40,727][15401] Updated weights for policy 0, policy_version 322670 (0.0037) [2024-06-23 00:32:41,541][15349] Signal inference workers to stop experience collection... (78200 times) [2024-06-23 00:32:41,586][15401] InferenceWorker_p0-w0: stopping experience collection (78200 times) [2024-06-23 00:32:41,596][15349] Signal inference workers to resume experience collection... (78200 times) [2024-06-23 00:32:41,605][15401] InferenceWorker_p0-w0: resuming experience collection (78200 times) [2024-06-23 00:32:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5286739968. Throughput: 0: 42672.4. Samples: 5286850860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:32:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 00:32:44,751][15401] Updated weights for policy 0, policy_version 322680 (0.0027) [2024-06-23 00:32:48,192][15401] Updated weights for policy 0, policy_version 322690 (0.0036) [2024-06-23 00:32:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5286952960. Throughput: 0: 42833.7. Samples: 5287108060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:32:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 00:32:52,421][15401] Updated weights for policy 0, policy_version 322700 (0.0032) [2024-06-23 00:32:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5287149568. Throughput: 0: 42674.7. Samples: 5287235780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:32:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 00:32:55,783][15401] Updated weights for policy 0, policy_version 322710 (0.0024) [2024-06-23 00:32:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 5287378944. Throughput: 0: 42673.1. Samples: 5287487960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:32:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 00:33:00,157][15401] Updated weights for policy 0, policy_version 322720 (0.0038) [2024-06-23 00:33:03,312][15401] Updated weights for policy 0, policy_version 322730 (0.0037) [2024-06-23 00:33:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5287608320. Throughput: 0: 42590.3. Samples: 5287745480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 00:33:07,762][15401] Updated weights for policy 0, policy_version 322740 (0.0032) [2024-06-23 00:33:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5287788544. Throughput: 0: 42619.6. Samples: 5287876320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 00:33:10,793][15401] Updated weights for policy 0, policy_version 322750 (0.0038) [2024-06-23 00:33:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42872.2, 300 sec: 42820.5). Total num frames: 5288017920. Throughput: 0: 42696.8. Samples: 5288131260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 00:33:15,347][15401] Updated weights for policy 0, policy_version 322760 (0.0039) [2024-06-23 00:33:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5288230912. Throughput: 0: 42525.9. Samples: 5288387860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:18,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 00:33:18,847][15401] Updated weights for policy 0, policy_version 322770 (0.0028) [2024-06-23 00:33:22,953][15401] Updated weights for policy 0, policy_version 322780 (0.0043) [2024-06-23 00:33:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5288443904. Throughput: 0: 42676.5. Samples: 5288516280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:23,394][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 00:33:26,566][15401] Updated weights for policy 0, policy_version 322790 (0.0039) [2024-06-23 00:33:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5288656896. Throughput: 0: 42572.1. Samples: 5288766600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 00:33:30,586][15401] Updated weights for policy 0, policy_version 322800 (0.0031) [2024-06-23 00:33:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5288837120. Throughput: 0: 42654.2. Samples: 5289027500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 00:33:34,162][15401] Updated weights for policy 0, policy_version 322810 (0.0031) [2024-06-23 00:33:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42654.9). Total num frames: 5289050112. Throughput: 0: 42571.1. Samples: 5289151480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 00:33:38,733][15401] Updated weights for policy 0, policy_version 322820 (0.0036) [2024-06-23 00:33:41,631][15401] Updated weights for policy 0, policy_version 322830 (0.0033) [2024-06-23 00:33:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5289295872. Throughput: 0: 42721.3. Samples: 5289410420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 00:33:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322833_5289295872.pth... [2024-06-23 00:33:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322207_5279039488.pth [2024-06-23 00:33:46,181][15401] Updated weights for policy 0, policy_version 322840 (0.0043) [2024-06-23 00:33:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5289492480. Throughput: 0: 42736.3. Samples: 5289668620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:48,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 00:33:49,652][15401] Updated weights for policy 0, policy_version 322850 (0.0035) [2024-06-23 00:33:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5289705472. Throughput: 0: 42633.2. Samples: 5289794820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:33:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 00:33:53,729][15401] Updated weights for policy 0, policy_version 322860 (0.0044) [2024-06-23 00:33:57,192][15401] Updated weights for policy 0, policy_version 322870 (0.0032) [2024-06-23 00:33:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5289934848. Throughput: 0: 42660.8. Samples: 5290051000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:33:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 00:34:01,322][15401] Updated weights for policy 0, policy_version 322880 (0.0042) [2024-06-23 00:34:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5290147840. Throughput: 0: 42791.6. Samples: 5290313480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 00:34:04,750][15401] Updated weights for policy 0, policy_version 322890 (0.0026) [2024-06-23 00:34:08,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5290344448. Throughput: 0: 42750.4. Samples: 5290440040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 00:34:08,558][15349] Signal inference workers to stop experience collection... (78250 times) [2024-06-23 00:34:08,558][15349] Signal inference workers to resume experience collection... (78250 times) [2024-06-23 00:34:08,588][15401] InferenceWorker_p0-w0: stopping experience collection (78250 times) [2024-06-23 00:34:08,589][15401] InferenceWorker_p0-w0: resuming experience collection (78250 times) [2024-06-23 00:34:08,931][15401] Updated weights for policy 0, policy_version 322900 (0.0028) [2024-06-23 00:34:12,359][15401] Updated weights for policy 0, policy_version 322910 (0.0045) [2024-06-23 00:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5290573824. Throughput: 0: 42867.0. Samples: 5290695620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 00:34:16,863][15401] Updated weights for policy 0, policy_version 322920 (0.0038) [2024-06-23 00:34:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 5290770432. Throughput: 0: 42776.5. Samples: 5290952440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 00:34:20,110][15401] Updated weights for policy 0, policy_version 322930 (0.0045) [2024-06-23 00:34:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5290983424. Throughput: 0: 42842.2. Samples: 5291079380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 00:34:24,601][15401] Updated weights for policy 0, policy_version 322940 (0.0040) [2024-06-23 00:34:27,551][15401] Updated weights for policy 0, policy_version 322950 (0.0027) [2024-06-23 00:34:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5291212800. Throughput: 0: 42734.8. Samples: 5291333480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 00:34:32,164][15401] Updated weights for policy 0, policy_version 322960 (0.0042) [2024-06-23 00:34:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5291409408. Throughput: 0: 42851.2. Samples: 5291596920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 00:34:35,458][15401] Updated weights for policy 0, policy_version 322970 (0.0041) [2024-06-23 00:34:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5291638784. Throughput: 0: 42854.2. Samples: 5291723260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 00:34:39,621][15401] Updated weights for policy 0, policy_version 322980 (0.0023) [2024-06-23 00:34:43,157][15401] Updated weights for policy 0, policy_version 322990 (0.0038) [2024-06-23 00:34:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5291868160. Throughput: 0: 42821.9. Samples: 5291977980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 00:34:47,270][15401] Updated weights for policy 0, policy_version 323000 (0.0042) [2024-06-23 00:34:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5292048384. Throughput: 0: 42748.3. Samples: 5292237160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 00:34:50,822][15401] Updated weights for policy 0, policy_version 323010 (0.0031) [2024-06-23 00:34:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5292277760. Throughput: 0: 42688.2. Samples: 5292361020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 00:34:54,884][15401] Updated weights for policy 0, policy_version 323020 (0.0032) [2024-06-23 00:34:58,393][15132] Fps is (10 sec: 45857.5, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 5292507136. Throughput: 0: 42653.6. Samples: 5292615200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:34:58,394][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 00:34:58,719][15401] Updated weights for policy 0, policy_version 323030 (0.0045) [2024-06-23 00:35:02,585][15401] Updated weights for policy 0, policy_version 323040 (0.0040) [2024-06-23 00:35:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5292703744. Throughput: 0: 42546.1. Samples: 5292867020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:35:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 00:35:06,352][15401] Updated weights for policy 0, policy_version 323050 (0.0047) [2024-06-23 00:35:08,389][15132] Fps is (10 sec: 42615.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5292933120. Throughput: 0: 42590.2. Samples: 5292995940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:35:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 00:35:10,439][15401] Updated weights for policy 0, policy_version 323060 (0.0036) [2024-06-23 00:35:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5293146112. Throughput: 0: 42640.8. Samples: 5293252320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 00:35:14,133][15401] Updated weights for policy 0, policy_version 323070 (0.0028) [2024-06-23 00:35:18,059][15401] Updated weights for policy 0, policy_version 323080 (0.0039) [2024-06-23 00:35:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5293342720. Throughput: 0: 42492.0. Samples: 5293509060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 00:35:21,667][15401] Updated weights for policy 0, policy_version 323090 (0.0032) [2024-06-23 00:35:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5293572096. Throughput: 0: 42526.0. Samples: 5293636920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 00:35:25,683][15401] Updated weights for policy 0, policy_version 323100 (0.0038) [2024-06-23 00:35:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5293768704. Throughput: 0: 42591.7. Samples: 5293894600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 00:35:29,726][15401] Updated weights for policy 0, policy_version 323110 (0.0038) [2024-06-23 00:35:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5293981696. Throughput: 0: 42595.5. Samples: 5294153960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 00:35:33,467][15401] Updated weights for policy 0, policy_version 323120 (0.0029) [2024-06-23 00:35:37,656][15401] Updated weights for policy 0, policy_version 323130 (0.0024) [2024-06-23 00:35:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5294194688. Throughput: 0: 42680.1. Samples: 5294281620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 00:35:40,857][15401] Updated weights for policy 0, policy_version 323140 (0.0030) [2024-06-23 00:35:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5294424064. Throughput: 0: 42809.3. Samples: 5294541460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:43,399][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 00:35:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323146_5294424064.pth... [2024-06-23 00:35:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322520_5284167680.pth [2024-06-23 00:35:45,101][15401] Updated weights for policy 0, policy_version 323150 (0.0035) [2024-06-23 00:35:45,850][15349] Signal inference workers to stop experience collection... (78300 times) [2024-06-23 00:35:45,896][15401] InferenceWorker_p0-w0: stopping experience collection (78300 times) [2024-06-23 00:35:45,905][15349] Signal inference workers to resume experience collection... (78300 times) [2024-06-23 00:35:45,912][15401] InferenceWorker_p0-w0: resuming experience collection (78300 times) [2024-06-23 00:35:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5294653440. Throughput: 0: 42898.8. Samples: 5294797460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 00:35:48,392][15401] Updated weights for policy 0, policy_version 323160 (0.0039) [2024-06-23 00:35:52,642][15401] Updated weights for policy 0, policy_version 323170 (0.0022) [2024-06-23 00:35:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5294833664. Throughput: 0: 42895.9. Samples: 5294926260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 00:35:56,445][15401] Updated weights for policy 0, policy_version 323180 (0.0037) [2024-06-23 00:35:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42601.2, 300 sec: 42709.8). Total num frames: 5295063040. Throughput: 0: 42740.0. Samples: 5295175620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:35:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 00:36:00,199][15401] Updated weights for policy 0, policy_version 323190 (0.0037) [2024-06-23 00:36:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5295276032. Throughput: 0: 42682.8. Samples: 5295429780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:36:03,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 00:36:04,390][15401] Updated weights for policy 0, policy_version 323200 (0.0040) [2024-06-23 00:36:08,108][15401] Updated weights for policy 0, policy_version 323210 (0.0034) [2024-06-23 00:36:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5295472640. Throughput: 0: 42636.8. Samples: 5295555580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:36:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 00:36:12,092][15401] Updated weights for policy 0, policy_version 323220 (0.0027) [2024-06-23 00:36:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5295702016. Throughput: 0: 42729.7. Samples: 5295817440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:36:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 00:36:15,678][15401] Updated weights for policy 0, policy_version 323230 (0.0040) [2024-06-23 00:36:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5295915008. Throughput: 0: 42542.3. Samples: 5296068360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:36:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 00:36:19,809][15401] Updated weights for policy 0, policy_version 323240 (0.0048) [2024-06-23 00:36:23,293][15401] Updated weights for policy 0, policy_version 323250 (0.0031) [2024-06-23 00:36:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5296128000. Throughput: 0: 42541.3. Samples: 5296195980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 00:36:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 00:36:27,359][15401] Updated weights for policy 0, policy_version 323260 (0.0033) [2024-06-23 00:36:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 5296324608. Throughput: 0: 42460.6. Samples: 5296452280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:28,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 00:36:30,971][15401] Updated weights for policy 0, policy_version 323270 (0.0033) [2024-06-23 00:36:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5296537600. Throughput: 0: 42496.0. Samples: 5296709780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 00:36:35,235][15401] Updated weights for policy 0, policy_version 323280 (0.0034) [2024-06-23 00:36:38,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5296766976. Throughput: 0: 42349.3. Samples: 5296831980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 00:36:38,517][15401] Updated weights for policy 0, policy_version 323290 (0.0040) [2024-06-23 00:36:42,664][15401] Updated weights for policy 0, policy_version 323300 (0.0031) [2024-06-23 00:36:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5296963584. Throughput: 0: 42615.1. Samples: 5297093300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 00:36:46,442][15401] Updated weights for policy 0, policy_version 323310 (0.0037) [2024-06-23 00:36:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5297176576. Throughput: 0: 42563.1. Samples: 5297345120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 00:36:50,363][15401] Updated weights for policy 0, policy_version 323320 (0.0033) [2024-06-23 00:36:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5297405952. Throughput: 0: 42600.3. Samples: 5297472600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 00:36:54,364][15401] Updated weights for policy 0, policy_version 323330 (0.0043) [2024-06-23 00:36:58,246][15401] Updated weights for policy 0, policy_version 323340 (0.0026) [2024-06-23 00:36:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5297602560. Throughput: 0: 42378.7. Samples: 5297724480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:36:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 00:37:02,233][15401] Updated weights for policy 0, policy_version 323350 (0.0043) [2024-06-23 00:37:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5297831936. Throughput: 0: 42430.2. Samples: 5297977720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:03,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-23 00:37:06,036][15401] Updated weights for policy 0, policy_version 323360 (0.0037) [2024-06-23 00:37:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42653.8). Total num frames: 5298028544. Throughput: 0: 42425.3. Samples: 5298105220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:08,400][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 00:37:09,792][15401] Updated weights for policy 0, policy_version 323370 (0.0035) [2024-06-23 00:37:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5298225152. Throughput: 0: 42368.8. Samples: 5298358780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:13,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 00:37:13,729][15401] Updated weights for policy 0, policy_version 323380 (0.0050) [2024-06-23 00:37:17,525][15401] Updated weights for policy 0, policy_version 323390 (0.0041) [2024-06-23 00:37:18,391][15132] Fps is (10 sec: 40961.8, 60 sec: 42050.9, 300 sec: 42542.6). Total num frames: 5298438144. Throughput: 0: 42205.7. Samples: 5298609120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:18,392][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 00:37:21,610][15401] Updated weights for policy 0, policy_version 323400 (0.0023) [2024-06-23 00:37:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5298667520. Throughput: 0: 42455.1. Samples: 5298742460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 00:37:25,431][15401] Updated weights for policy 0, policy_version 323410 (0.0029) [2024-06-23 00:37:28,390][15132] Fps is (10 sec: 42606.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5298864128. Throughput: 0: 42076.4. Samples: 5298986740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 00:37:29,471][15401] Updated weights for policy 0, policy_version 323420 (0.0049) [2024-06-23 00:37:32,986][15401] Updated weights for policy 0, policy_version 323430 (0.0028) [2024-06-23 00:37:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5299077120. Throughput: 0: 42260.4. Samples: 5299246840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 00:37:34,263][15349] Signal inference workers to stop experience collection... (78350 times) [2024-06-23 00:37:34,315][15349] Signal inference workers to resume experience collection... (78350 times) [2024-06-23 00:37:34,328][15401] InferenceWorker_p0-w0: stopping experience collection (78350 times) [2024-06-23 00:37:34,328][15401] InferenceWorker_p0-w0: resuming experience collection (78350 times) [2024-06-23 00:37:37,181][15401] Updated weights for policy 0, policy_version 323440 (0.0030) [2024-06-23 00:37:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5299290112. Throughput: 0: 42231.7. Samples: 5299373020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 00:37:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 00:37:40,525][15401] Updated weights for policy 0, policy_version 323450 (0.0044) [2024-06-23 00:37:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5299503104. Throughput: 0: 42316.8. Samples: 5299628740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:37:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 00:37:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323457_5299519488.pth... [2024-06-23 00:37:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000322833_5289295872.pth [2024-06-23 00:37:44,623][15401] Updated weights for policy 0, policy_version 323460 (0.0036) [2024-06-23 00:37:48,173][15401] Updated weights for policy 0, policy_version 323470 (0.0035) [2024-06-23 00:37:48,394][15132] Fps is (10 sec: 44216.0, 60 sec: 42595.0, 300 sec: 42653.3). Total num frames: 5299732480. Throughput: 0: 42370.7. Samples: 5299884600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:37:48,395][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 00:37:52,100][15401] Updated weights for policy 0, policy_version 323480 (0.0047) [2024-06-23 00:37:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 5299929088. Throughput: 0: 42463.1. Samples: 5300016060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:37:53,393][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 00:37:55,777][15401] Updated weights for policy 0, policy_version 323490 (0.0040) [2024-06-23 00:37:58,389][15132] Fps is (10 sec: 40979.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5300142080. Throughput: 0: 42415.2. Samples: 5300267460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:37:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 00:37:59,655][15401] Updated weights for policy 0, policy_version 323500 (0.0040) [2024-06-23 00:38:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5300355072. Throughput: 0: 42547.9. Samples: 5300523700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 00:38:03,798][15401] Updated weights for policy 0, policy_version 323510 (0.0030) [2024-06-23 00:38:07,335][15401] Updated weights for policy 0, policy_version 323520 (0.0037) [2024-06-23 00:38:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 5300568064. Throughput: 0: 42416.6. Samples: 5300651200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 00:38:11,487][15401] Updated weights for policy 0, policy_version 323530 (0.0029) [2024-06-23 00:38:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5300797440. Throughput: 0: 42628.0. Samples: 5300905000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:13,393][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 00:38:14,922][15401] Updated weights for policy 0, policy_version 323540 (0.0033) [2024-06-23 00:38:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.8, 300 sec: 42542.9). Total num frames: 5300994048. Throughput: 0: 42587.6. Samples: 5301163280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 00:38:19,117][15401] Updated weights for policy 0, policy_version 323550 (0.0032) [2024-06-23 00:38:22,538][15401] Updated weights for policy 0, policy_version 323560 (0.0035) [2024-06-23 00:38:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 5301207040. Throughput: 0: 42647.1. Samples: 5301292140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 00:38:26,662][15401] Updated weights for policy 0, policy_version 323570 (0.0033) [2024-06-23 00:38:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5301420032. Throughput: 0: 42644.9. Samples: 5301547760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 00:38:30,244][15401] Updated weights for policy 0, policy_version 323580 (0.0028) [2024-06-23 00:38:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5301633024. Throughput: 0: 42816.0. Samples: 5301811120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 00:38:34,186][15401] Updated weights for policy 0, policy_version 323590 (0.0030) [2024-06-23 00:38:38,037][15401] Updated weights for policy 0, policy_version 323600 (0.0035) [2024-06-23 00:38:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5301862400. Throughput: 0: 42556.0. Samples: 5301930980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 00:38:41,825][15401] Updated weights for policy 0, policy_version 323610 (0.0045) [2024-06-23 00:38:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5302075392. Throughput: 0: 42608.4. Samples: 5302184840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 00:38:45,839][15401] Updated weights for policy 0, policy_version 323620 (0.0036) [2024-06-23 00:38:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42328.6, 300 sec: 42598.4). Total num frames: 5302272000. Throughput: 0: 42725.0. Samples: 5302446320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 00:38:49,849][15401] Updated weights for policy 0, policy_version 323630 (0.0037) [2024-06-23 00:38:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 5302501376. Throughput: 0: 42694.7. Samples: 5302572460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 00:38:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:38:53,454][15401] Updated weights for policy 0, policy_version 323640 (0.0048) [2024-06-23 00:38:57,553][15401] Updated weights for policy 0, policy_version 323650 (0.0034) [2024-06-23 00:38:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5302697984. Throughput: 0: 42813.9. Samples: 5302831620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:38:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:39:01,919][15401] Updated weights for policy 0, policy_version 323660 (0.0044) [2024-06-23 00:39:03,167][15349] Signal inference workers to stop experience collection... (78400 times) [2024-06-23 00:39:03,168][15349] Signal inference workers to resume experience collection... (78400 times) [2024-06-23 00:39:03,189][15401] InferenceWorker_p0-w0: stopping experience collection (78400 times) [2024-06-23 00:39:03,189][15401] InferenceWorker_p0-w0: resuming experience collection (78400 times) [2024-06-23 00:39:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5302927360. Throughput: 0: 42780.8. Samples: 5303088420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 00:39:05,134][15401] Updated weights for policy 0, policy_version 323670 (0.0042) [2024-06-23 00:39:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5303156736. Throughput: 0: 42793.8. Samples: 5303217860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:08,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 00:39:09,428][15401] Updated weights for policy 0, policy_version 323680 (0.0028) [2024-06-23 00:39:12,651][15401] Updated weights for policy 0, policy_version 323690 (0.0026) [2024-06-23 00:39:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5303336960. Throughput: 0: 42761.8. Samples: 5303472040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 00:39:16,985][15401] Updated weights for policy 0, policy_version 323700 (0.0032) [2024-06-23 00:39:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5303549952. Throughput: 0: 42676.1. Samples: 5303731540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:39:20,213][15401] Updated weights for policy 0, policy_version 323710 (0.0037) [2024-06-23 00:39:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5303795712. Throughput: 0: 42907.1. Samples: 5303861800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 00:39:24,393][15401] Updated weights for policy 0, policy_version 323720 (0.0049) [2024-06-23 00:39:27,943][15401] Updated weights for policy 0, policy_version 323730 (0.0028) [2024-06-23 00:39:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5303992320. Throughput: 0: 42925.4. Samples: 5304116480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:28,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 00:39:32,412][15401] Updated weights for policy 0, policy_version 323740 (0.0034) [2024-06-23 00:39:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5304205312. Throughput: 0: 42771.1. Samples: 5304371020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 00:39:35,552][15401] Updated weights for policy 0, policy_version 323750 (0.0025) [2024-06-23 00:39:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5304434688. Throughput: 0: 42736.0. Samples: 5304495580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:38,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 00:39:40,082][15401] Updated weights for policy 0, policy_version 323760 (0.0023) [2024-06-23 00:39:43,300][15401] Updated weights for policy 0, policy_version 323770 (0.0040) [2024-06-23 00:39:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5304647680. Throughput: 0: 42759.2. Samples: 5304755780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:43,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 00:39:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323770_5304647680.pth... [2024-06-23 00:39:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323146_5294424064.pth [2024-06-23 00:39:47,575][15401] Updated weights for policy 0, policy_version 323780 (0.0037) [2024-06-23 00:39:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5304827904. Throughput: 0: 42810.9. Samples: 5305014900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 00:39:51,259][15401] Updated weights for policy 0, policy_version 323790 (0.0033) [2024-06-23 00:39:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42543.4). Total num frames: 5305057280. Throughput: 0: 42648.9. Samples: 5305137060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:53,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 00:39:55,081][15401] Updated weights for policy 0, policy_version 323800 (0.0033) [2024-06-23 00:39:58,394][15132] Fps is (10 sec: 44215.7, 60 sec: 42868.1, 300 sec: 42597.7). Total num frames: 5305270272. Throughput: 0: 42594.8. Samples: 5305389000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:39:58,395][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:39:59,019][15401] Updated weights for policy 0, policy_version 323810 (0.0041) [2024-06-23 00:40:03,240][15401] Updated weights for policy 0, policy_version 323820 (0.0035) [2024-06-23 00:40:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5305466880. Throughput: 0: 42629.5. Samples: 5305649880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:40:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 00:40:06,893][15401] Updated weights for policy 0, policy_version 323830 (0.0031) [2024-06-23 00:40:08,389][15132] Fps is (10 sec: 42618.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5305696256. Throughput: 0: 42603.6. Samples: 5305778960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 00:40:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 00:40:10,826][15401] Updated weights for policy 0, policy_version 323840 (0.0038) [2024-06-23 00:40:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5305892864. Throughput: 0: 42437.7. Samples: 5306026180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 00:40:14,565][15401] Updated weights for policy 0, policy_version 323850 (0.0038) [2024-06-23 00:40:18,295][15401] Updated weights for policy 0, policy_version 323860 (0.0037) [2024-06-23 00:40:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5306122240. Throughput: 0: 42722.8. Samples: 5306293540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 00:40:22,110][15349] Signal inference workers to stop experience collection... (78450 times) [2024-06-23 00:40:22,156][15401] InferenceWorker_p0-w0: stopping experience collection (78450 times) [2024-06-23 00:40:22,170][15349] Signal inference workers to resume experience collection... (78450 times) [2024-06-23 00:40:22,176][15401] InferenceWorker_p0-w0: resuming experience collection (78450 times) [2024-06-23 00:40:22,179][15401] Updated weights for policy 0, policy_version 323870 (0.0026) [2024-06-23 00:40:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5306351616. Throughput: 0: 42783.9. Samples: 5306420860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:40:25,933][15401] Updated weights for policy 0, policy_version 323880 (0.0028) [2024-06-23 00:40:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5306548224. Throughput: 0: 42670.7. Samples: 5306675960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:40:29,700][15401] Updated weights for policy 0, policy_version 323890 (0.0027) [2024-06-23 00:40:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5306761216. Throughput: 0: 42624.8. Samples: 5306933020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 00:40:33,521][15401] Updated weights for policy 0, policy_version 323900 (0.0032) [2024-06-23 00:40:37,196][15401] Updated weights for policy 0, policy_version 323910 (0.0038) [2024-06-23 00:40:38,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 5307006976. Throughput: 0: 42805.3. Samples: 5307063400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:38,393][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 00:40:41,089][15401] Updated weights for policy 0, policy_version 323920 (0.0024) [2024-06-23 00:40:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5307203584. Throughput: 0: 42806.2. Samples: 5307315080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 00:40:44,925][15401] Updated weights for policy 0, policy_version 323930 (0.0027) [2024-06-23 00:40:48,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5307383808. Throughput: 0: 42762.0. Samples: 5307574160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 00:40:48,982][15401] Updated weights for policy 0, policy_version 323940 (0.0030) [2024-06-23 00:40:52,610][15401] Updated weights for policy 0, policy_version 323950 (0.0035) [2024-06-23 00:40:53,392][15132] Fps is (10 sec: 44223.9, 60 sec: 43142.5, 300 sec: 42653.5). Total num frames: 5307645952. Throughput: 0: 42711.0. Samples: 5307701080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:53,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 00:40:56,635][15401] Updated weights for policy 0, policy_version 323960 (0.0023) [2024-06-23 00:40:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42874.8, 300 sec: 42598.4). Total num frames: 5307842560. Throughput: 0: 42892.0. Samples: 5307956320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:40:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 00:41:00,067][15401] Updated weights for policy 0, policy_version 323970 (0.0023) [2024-06-23 00:41:03,390][15132] Fps is (10 sec: 39332.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5308039168. Throughput: 0: 42907.0. Samples: 5308224360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:41:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 00:41:04,022][15401] Updated weights for policy 0, policy_version 323980 (0.0043) [2024-06-23 00:41:07,543][15401] Updated weights for policy 0, policy_version 323990 (0.0031) [2024-06-23 00:41:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 5308301312. Throughput: 0: 42873.8. Samples: 5308350180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:41:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 00:41:11,638][15401] Updated weights for policy 0, policy_version 324000 (0.0037) [2024-06-23 00:41:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5308465152. Throughput: 0: 42917.3. Samples: 5308607240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:41:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 00:41:15,155][15401] Updated weights for policy 0, policy_version 324010 (0.0033) [2024-06-23 00:41:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5308694528. Throughput: 0: 42987.8. Samples: 5308867480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:41:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 00:41:19,155][15401] Updated weights for policy 0, policy_version 324020 (0.0036) [2024-06-23 00:41:22,673][15401] Updated weights for policy 0, policy_version 324030 (0.0036) [2024-06-23 00:41:23,390][15132] Fps is (10 sec: 45871.5, 60 sec: 42871.0, 300 sec: 42709.7). Total num frames: 5308923904. Throughput: 0: 42855.0. Samples: 5308991800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 00:41:23,391][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 00:41:27,116][15401] Updated weights for policy 0, policy_version 324040 (0.0044) [2024-06-23 00:41:27,575][15349] Signal inference workers to stop experience collection... (78500 times) [2024-06-23 00:41:27,581][15349] Signal inference workers to resume experience collection... (78500 times) [2024-06-23 00:41:27,594][15401] InferenceWorker_p0-w0: stopping experience collection (78500 times) [2024-06-23 00:41:27,594][15401] InferenceWorker_p0-w0: resuming experience collection (78500 times) [2024-06-23 00:41:28,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 5309136896. Throughput: 0: 43109.2. Samples: 5309255100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:28,393][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 00:41:30,433][15401] Updated weights for policy 0, policy_version 324050 (0.0041) [2024-06-23 00:41:33,390][15132] Fps is (10 sec: 40963.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5309333504. Throughput: 0: 42986.6. Samples: 5309508560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 00:41:34,677][15401] Updated weights for policy 0, policy_version 324060 (0.0030) [2024-06-23 00:41:38,284][15401] Updated weights for policy 0, policy_version 324070 (0.0029) [2024-06-23 00:41:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5309562880. Throughput: 0: 42810.2. Samples: 5309627420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 00:41:42,379][15401] Updated weights for policy 0, policy_version 324080 (0.0030) [2024-06-23 00:41:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5309775872. Throughput: 0: 43028.9. Samples: 5309892620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:41:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324084_5309792256.pth... [2024-06-23 00:41:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323457_5299519488.pth [2024-06-23 00:41:45,807][15401] Updated weights for policy 0, policy_version 324090 (0.0039) [2024-06-23 00:41:48,392][15132] Fps is (10 sec: 40950.7, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 5309972480. Throughput: 0: 42724.5. Samples: 5310147060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:48,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 00:41:50,082][15401] Updated weights for policy 0, policy_version 324100 (0.0029) [2024-06-23 00:41:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.5, 300 sec: 42709.5). Total num frames: 5310201856. Throughput: 0: 42681.0. Samples: 5310270820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 00:41:53,559][15401] Updated weights for policy 0, policy_version 324110 (0.0036) [2024-06-23 00:41:57,638][15401] Updated weights for policy 0, policy_version 324120 (0.0035) [2024-06-23 00:41:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5310398464. Throughput: 0: 42774.6. Samples: 5310532100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:41:58,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-23 00:42:01,610][15401] Updated weights for policy 0, policy_version 324130 (0.0027) [2024-06-23 00:42:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 5310611456. Throughput: 0: 42538.8. Samples: 5310781720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:03,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 00:42:05,340][15401] Updated weights for policy 0, policy_version 324140 (0.0038) [2024-06-23 00:42:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5310840832. Throughput: 0: 42623.3. Samples: 5310909820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 00:42:09,276][15401] Updated weights for policy 0, policy_version 324150 (0.0035) [2024-06-23 00:42:13,359][15401] Updated weights for policy 0, policy_version 324160 (0.0041) [2024-06-23 00:42:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5311037440. Throughput: 0: 42509.4. Samples: 5311167920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:13,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 00:42:16,987][15401] Updated weights for policy 0, policy_version 324170 (0.0025) [2024-06-23 00:42:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5311250432. Throughput: 0: 42489.8. Samples: 5311420600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 00:42:20,807][15401] Updated weights for policy 0, policy_version 324180 (0.0031) [2024-06-23 00:42:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.9, 300 sec: 42709.5). Total num frames: 5311463424. Throughput: 0: 42765.5. Samples: 5311551860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 00:42:24,445][15401] Updated weights for policy 0, policy_version 324190 (0.0033) [2024-06-23 00:42:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 5311660032. Throughput: 0: 42499.6. Samples: 5311805100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:28,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-23 00:42:28,572][15401] Updated weights for policy 0, policy_version 324200 (0.0031) [2024-06-23 00:42:32,006][15401] Updated weights for policy 0, policy_version 324210 (0.0047) [2024-06-23 00:42:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5311889408. Throughput: 0: 42410.2. Samples: 5312055420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 00:42:36,285][15401] Updated weights for policy 0, policy_version 324220 (0.0040) [2024-06-23 00:42:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5312102400. Throughput: 0: 42497.7. Samples: 5312183220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 00:42:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 00:42:39,582][15401] Updated weights for policy 0, policy_version 324230 (0.0034) [2024-06-23 00:42:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42599.1). Total num frames: 5312299008. Throughput: 0: 42375.1. Samples: 5312438980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:42:43,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-23 00:42:44,129][15401] Updated weights for policy 0, policy_version 324240 (0.0037) [2024-06-23 00:42:46,983][15401] Updated weights for policy 0, policy_version 324250 (0.0036) [2024-06-23 00:42:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 5312544768. Throughput: 0: 42391.6. Samples: 5312689340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:42:48,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 00:42:51,763][15401] Updated weights for policy 0, policy_version 324260 (0.0028) [2024-06-23 00:42:52,755][15349] Signal inference workers to stop experience collection... (78550 times) [2024-06-23 00:42:52,808][15401] InferenceWorker_p0-w0: stopping experience collection (78550 times) [2024-06-23 00:42:52,812][15349] Signal inference workers to resume experience collection... (78550 times) [2024-06-23 00:42:52,821][15401] InferenceWorker_p0-w0: resuming experience collection (78550 times) [2024-06-23 00:42:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5312757760. Throughput: 0: 42670.4. Samples: 5312829980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:42:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 00:42:54,922][15401] Updated weights for policy 0, policy_version 324270 (0.0036) [2024-06-23 00:42:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5312954368. Throughput: 0: 42547.0. Samples: 5313082640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:42:58,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 00:42:59,562][15401] Updated weights for policy 0, policy_version 324280 (0.0027) [2024-06-23 00:43:02,574][15401] Updated weights for policy 0, policy_version 324290 (0.0041) [2024-06-23 00:43:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5313183744. Throughput: 0: 42556.7. Samples: 5313335660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 00:43:07,170][15401] Updated weights for policy 0, policy_version 324300 (0.0037) [2024-06-23 00:43:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5313380352. Throughput: 0: 42659.0. Samples: 5313471520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:08,394][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 00:43:10,471][15401] Updated weights for policy 0, policy_version 324310 (0.0035) [2024-06-23 00:43:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5313593344. Throughput: 0: 42552.0. Samples: 5313719940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 00:43:14,894][15401] Updated weights for policy 0, policy_version 324320 (0.0030) [2024-06-23 00:43:18,304][15401] Updated weights for policy 0, policy_version 324330 (0.0037) [2024-06-23 00:43:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5313822720. Throughput: 0: 42649.4. Samples: 5313974640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 00:43:22,509][15401] Updated weights for policy 0, policy_version 324340 (0.0033) [2024-06-23 00:43:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5314019328. Throughput: 0: 42756.8. Samples: 5314107280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:23,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 00:43:26,015][15401] Updated weights for policy 0, policy_version 324350 (0.0032) [2024-06-23 00:43:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5314232320. Throughput: 0: 42558.2. Samples: 5314354100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 00:43:30,239][15401] Updated weights for policy 0, policy_version 324360 (0.0034) [2024-06-23 00:43:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5314445312. Throughput: 0: 42747.0. Samples: 5314612960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:33,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 00:43:33,789][15401] Updated weights for policy 0, policy_version 324370 (0.0041) [2024-06-23 00:43:37,896][15401] Updated weights for policy 0, policy_version 324380 (0.0033) [2024-06-23 00:43:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5314674688. Throughput: 0: 42532.8. Samples: 5314743960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 00:43:41,263][15401] Updated weights for policy 0, policy_version 324390 (0.0030) [2024-06-23 00:43:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5314887680. Throughput: 0: 42578.2. Samples: 5314998560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 00:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324395_5314887680.pth... [2024-06-23 00:43:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000323770_5304647680.pth [2024-06-23 00:43:45,756][15401] Updated weights for policy 0, policy_version 324400 (0.0033) [2024-06-23 00:43:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5315084288. Throughput: 0: 42533.9. Samples: 5315249680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 00:43:48,941][15401] Updated weights for policy 0, policy_version 324410 (0.0032) [2024-06-23 00:43:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 5315280896. Throughput: 0: 42383.7. Samples: 5315378780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 00:43:53,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-23 00:43:53,470][15401] Updated weights for policy 0, policy_version 324420 (0.0037) [2024-06-23 00:43:56,676][15401] Updated weights for policy 0, policy_version 324430 (0.0037) [2024-06-23 00:43:58,391][15132] Fps is (10 sec: 44230.2, 60 sec: 42872.1, 300 sec: 42709.3). Total num frames: 5315526656. Throughput: 0: 42539.0. Samples: 5315634260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:43:58,392][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 00:44:01,344][15401] Updated weights for policy 0, policy_version 324440 (0.0025) [2024-06-23 00:44:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5315723264. Throughput: 0: 42514.2. Samples: 5315887780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 00:44:04,945][15401] Updated weights for policy 0, policy_version 324450 (0.0026) [2024-06-23 00:44:08,390][15132] Fps is (10 sec: 39327.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5315919872. Throughput: 0: 42464.9. Samples: 5316018200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 00:44:08,902][15401] Updated weights for policy 0, policy_version 324460 (0.0034) [2024-06-23 00:44:12,572][15401] Updated weights for policy 0, policy_version 324470 (0.0026) [2024-06-23 00:44:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5316149248. Throughput: 0: 42621.8. Samples: 5316272080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 00:44:16,478][15401] Updated weights for policy 0, policy_version 324480 (0.0026) [2024-06-23 00:44:17,676][15349] Signal inference workers to stop experience collection... (78600 times) [2024-06-23 00:44:17,676][15349] Signal inference workers to resume experience collection... (78600 times) [2024-06-23 00:44:17,725][15401] InferenceWorker_p0-w0: stopping experience collection (78600 times) [2024-06-23 00:44:17,725][15401] InferenceWorker_p0-w0: resuming experience collection (78600 times) [2024-06-23 00:44:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5316378624. Throughput: 0: 42453.0. Samples: 5316523340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 00:44:20,156][15401] Updated weights for policy 0, policy_version 324490 (0.0043) [2024-06-23 00:44:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5316575232. Throughput: 0: 42621.2. Samples: 5316661920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 00:44:23,990][15401] Updated weights for policy 0, policy_version 324500 (0.0038) [2024-06-23 00:44:27,665][15401] Updated weights for policy 0, policy_version 324510 (0.0038) [2024-06-23 00:44:28,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 5316788224. Throughput: 0: 42602.9. Samples: 5316915960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:28,396][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 00:44:31,768][15401] Updated weights for policy 0, policy_version 324520 (0.0026) [2024-06-23 00:44:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5317033984. Throughput: 0: 42670.2. Samples: 5317169840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 00:44:35,413][15401] Updated weights for policy 0, policy_version 324530 (0.0047) [2024-06-23 00:44:38,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5317197824. Throughput: 0: 42797.8. Samples: 5317304680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 00:44:39,255][15401] Updated weights for policy 0, policy_version 324540 (0.0040) [2024-06-23 00:44:43,381][15401] Updated weights for policy 0, policy_version 324550 (0.0034) [2024-06-23 00:44:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 5317427200. Throughput: 0: 42616.8. Samples: 5317551960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 00:44:46,886][15401] Updated weights for policy 0, policy_version 324560 (0.0041) [2024-06-23 00:44:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5317672960. Throughput: 0: 42565.8. Samples: 5317803240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 00:44:51,002][15401] Updated weights for policy 0, policy_version 324570 (0.0035) [2024-06-23 00:44:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42599.1). Total num frames: 5317836800. Throughput: 0: 42603.2. Samples: 5317935340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 00:44:54,614][15401] Updated weights for policy 0, policy_version 324580 (0.0043) [2024-06-23 00:44:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42326.3, 300 sec: 42709.5). Total num frames: 5318066176. Throughput: 0: 42634.5. Samples: 5318190640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:44:58,395][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 00:44:58,559][15401] Updated weights for policy 0, policy_version 324590 (0.0030) [2024-06-23 00:45:02,295][15401] Updated weights for policy 0, policy_version 324600 (0.0039) [2024-06-23 00:45:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5318311936. Throughput: 0: 42686.2. Samples: 5318444220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:45:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 00:45:05,963][15401] Updated weights for policy 0, policy_version 324610 (0.0036) [2024-06-23 00:45:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5318475776. Throughput: 0: 42564.5. Samples: 5318577320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 00:45:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 00:45:10,085][15401] Updated weights for policy 0, policy_version 324620 (0.0042) [2024-06-23 00:45:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5318721536. Throughput: 0: 42525.2. Samples: 5318829320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 00:45:13,485][15401] Updated weights for policy 0, policy_version 324630 (0.0033) [2024-06-23 00:45:17,641][15401] Updated weights for policy 0, policy_version 324640 (0.0045) [2024-06-23 00:45:18,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5318950912. Throughput: 0: 42689.9. Samples: 5319090880. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 00:45:21,200][15401] Updated weights for policy 0, policy_version 324650 (0.0027) [2024-06-23 00:45:23,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5319114752. Throughput: 0: 42562.9. Samples: 5319220020. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:23,391][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 00:45:25,436][15401] Updated weights for policy 0, policy_version 324660 (0.0043) [2024-06-23 00:45:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 5319376896. Throughput: 0: 42649.0. Samples: 5319471160. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:28,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-23 00:45:28,885][15401] Updated weights for policy 0, policy_version 324670 (0.0044) [2024-06-23 00:45:32,095][15349] Signal inference workers to stop experience collection... (78650 times) [2024-06-23 00:45:32,136][15401] InferenceWorker_p0-w0: stopping experience collection (78650 times) [2024-06-23 00:45:32,159][15349] Signal inference workers to resume experience collection... (78650 times) [2024-06-23 00:45:32,168][15401] InferenceWorker_p0-w0: resuming experience collection (78650 times) [2024-06-23 00:45:32,958][15401] Updated weights for policy 0, policy_version 324680 (0.0032) [2024-06-23 00:45:33,389][15132] Fps is (10 sec: 47514.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5319589888. Throughput: 0: 42914.8. Samples: 5319734400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 00:45:36,505][15401] Updated weights for policy 0, policy_version 324690 (0.0049) [2024-06-23 00:45:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5319753728. Throughput: 0: 42774.2. Samples: 5319860180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 00:45:40,566][15401] Updated weights for policy 0, policy_version 324700 (0.0024) [2024-06-23 00:45:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5320015872. Throughput: 0: 42797.0. Samples: 5320116500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 00:45:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324708_5320015872.pth... [2024-06-23 00:45:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324084_5309792256.pth [2024-06-23 00:45:44,252][15401] Updated weights for policy 0, policy_version 324710 (0.0040) [2024-06-23 00:45:48,366][15401] Updated weights for policy 0, policy_version 324720 (0.0042) [2024-06-23 00:45:48,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42323.7, 300 sec: 42598.5). Total num frames: 5320212480. Throughput: 0: 43012.8. Samples: 5320379900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:48,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 00:45:52,143][15401] Updated weights for policy 0, policy_version 324730 (0.0024) [2024-06-23 00:45:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5320425472. Throughput: 0: 42722.7. Samples: 5320499840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 00:45:55,969][15401] Updated weights for policy 0, policy_version 324740 (0.0036) [2024-06-23 00:45:58,390][15132] Fps is (10 sec: 45885.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5320671232. Throughput: 0: 42942.0. Samples: 5320761720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:45:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 00:45:59,812][15401] Updated weights for policy 0, policy_version 324750 (0.0037) [2024-06-23 00:46:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5320851456. Throughput: 0: 42909.2. Samples: 5321021800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:46:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 00:46:03,480][15401] Updated weights for policy 0, policy_version 324760 (0.0034) [2024-06-23 00:46:07,229][15401] Updated weights for policy 0, policy_version 324770 (0.0044) [2024-06-23 00:46:08,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5321048064. Throughput: 0: 42801.5. Samples: 5321146080. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:46:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 00:46:11,187][15401] Updated weights for policy 0, policy_version 324780 (0.0028) [2024-06-23 00:46:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5321293824. Throughput: 0: 42798.2. Samples: 5321397080. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:46:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 00:46:14,688][15401] Updated weights for policy 0, policy_version 324790 (0.0027) [2024-06-23 00:46:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 5321490432. Throughput: 0: 42849.2. Samples: 5321662620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:46:18,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 00:46:18,782][15401] Updated weights for policy 0, policy_version 324800 (0.0038) [2024-06-23 00:46:22,687][15401] Updated weights for policy 0, policy_version 324810 (0.0042) [2024-06-23 00:46:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42654.3). Total num frames: 5321719808. Throughput: 0: 42681.3. Samples: 5321780840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-23 00:46:23,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-23 00:46:26,682][15401] Updated weights for policy 0, policy_version 324820 (0.0026) [2024-06-23 00:46:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5321932800. Throughput: 0: 42840.3. Samples: 5322044420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:28,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 00:46:30,059][15401] Updated weights for policy 0, policy_version 324830 (0.0027) [2024-06-23 00:46:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5322129408. Throughput: 0: 42817.8. Samples: 5322306600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:33,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-23 00:46:34,284][15401] Updated weights for policy 0, policy_version 324840 (0.0033) [2024-06-23 00:46:37,497][15401] Updated weights for policy 0, policy_version 324850 (0.0043) [2024-06-23 00:46:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 5322358784. Throughput: 0: 42909.8. Samples: 5322430780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:38,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-23 00:46:41,813][15401] Updated weights for policy 0, policy_version 324860 (0.0033) [2024-06-23 00:46:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5322588160. Throughput: 0: 42847.7. Samples: 5322689860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:43,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-23 00:46:45,358][15401] Updated weights for policy 0, policy_version 324870 (0.0041) [2024-06-23 00:46:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 5322768384. Throughput: 0: 42782.7. Samples: 5322947020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:48,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 00:46:49,472][15401] Updated weights for policy 0, policy_version 324880 (0.0033) [2024-06-23 00:46:53,014][15401] Updated weights for policy 0, policy_version 324890 (0.0037) [2024-06-23 00:46:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5323014144. Throughput: 0: 42722.5. Samples: 5323068600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 00:46:57,364][15401] Updated weights for policy 0, policy_version 324900 (0.0039) [2024-06-23 00:46:58,018][15349] Signal inference workers to stop experience collection... (78700 times) [2024-06-23 00:46:58,018][15349] Signal inference workers to resume experience collection... (78700 times) [2024-06-23 00:46:58,040][15401] InferenceWorker_p0-w0: stopping experience collection (78700 times) [2024-06-23 00:46:58,040][15401] InferenceWorker_p0-w0: resuming experience collection (78700 times) [2024-06-23 00:46:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5323227136. Throughput: 0: 42997.4. Samples: 5323331960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:46:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 00:47:00,587][15401] Updated weights for policy 0, policy_version 324910 (0.0039) [2024-06-23 00:47:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5323407360. Throughput: 0: 42721.4. Samples: 5323585080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 00:47:04,939][15401] Updated weights for policy 0, policy_version 324920 (0.0030) [2024-06-23 00:47:08,194][15401] Updated weights for policy 0, policy_version 324930 (0.0040) [2024-06-23 00:47:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5323653120. Throughput: 0: 42923.2. Samples: 5323712380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:08,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 00:47:12,330][15401] Updated weights for policy 0, policy_version 324940 (0.0032) [2024-06-23 00:47:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5323849728. Throughput: 0: 42697.4. Samples: 5323965700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:13,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 00:47:15,768][15401] Updated weights for policy 0, policy_version 324950 (0.0037) [2024-06-23 00:47:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5324029952. Throughput: 0: 42642.8. Samples: 5324225520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 00:47:19,922][15401] Updated weights for policy 0, policy_version 324960 (0.0035) [2024-06-23 00:47:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5324292096. Throughput: 0: 42684.4. Samples: 5324351580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 00:47:23,406][15401] Updated weights for policy 0, policy_version 324970 (0.0038) [2024-06-23 00:47:27,812][15401] Updated weights for policy 0, policy_version 324980 (0.0029) [2024-06-23 00:47:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 5324472320. Throughput: 0: 42540.4. Samples: 5324604180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 00:47:31,254][15401] Updated weights for policy 0, policy_version 324990 (0.0033) [2024-06-23 00:47:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5324668928. Throughput: 0: 42712.9. Samples: 5324869100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 00:47:35,324][15401] Updated weights for policy 0, policy_version 325000 (0.0032) [2024-06-23 00:47:38,395][15132] Fps is (10 sec: 45848.8, 60 sec: 42867.4, 300 sec: 42819.7). Total num frames: 5324931072. Throughput: 0: 42845.2. Samples: 5324996880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 00:47:38,396][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 00:47:38,785][15401] Updated weights for policy 0, policy_version 325010 (0.0040) [2024-06-23 00:47:42,917][15401] Updated weights for policy 0, policy_version 325020 (0.0028) [2024-06-23 00:47:43,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 5325127680. Throughput: 0: 42636.8. Samples: 5325250720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:47:43,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:47:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325020_5325127680.pth... [2024-06-23 00:47:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324395_5314887680.pth [2024-06-23 00:47:46,708][15401] Updated weights for policy 0, policy_version 325030 (0.0043) [2024-06-23 00:47:48,389][15132] Fps is (10 sec: 39344.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5325324288. Throughput: 0: 42716.4. Samples: 5325507320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:47:48,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 00:47:50,438][15401] Updated weights for policy 0, policy_version 325040 (0.0037) [2024-06-23 00:47:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 5325553664. Throughput: 0: 42575.0. Samples: 5325628260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:47:53,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 00:47:54,479][15401] Updated weights for policy 0, policy_version 325050 (0.0038) [2024-06-23 00:47:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5325766656. Throughput: 0: 42549.8. Samples: 5325880440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:47:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 00:47:58,647][15401] Updated weights for policy 0, policy_version 325060 (0.0036) [2024-06-23 00:48:02,076][15401] Updated weights for policy 0, policy_version 325070 (0.0037) [2024-06-23 00:48:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5325963264. Throughput: 0: 42406.1. Samples: 5326133800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:48:05,024][15349] Signal inference workers to stop experience collection... (78750 times) [2024-06-23 00:48:05,077][15349] Signal inference workers to resume experience collection... (78750 times) [2024-06-23 00:48:05,077][15401] InferenceWorker_p0-w0: stopping experience collection (78750 times) [2024-06-23 00:48:05,090][15401] InferenceWorker_p0-w0: resuming experience collection (78750 times) [2024-06-23 00:48:06,310][15401] Updated weights for policy 0, policy_version 325080 (0.0028) [2024-06-23 00:48:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42050.5, 300 sec: 42653.6). Total num frames: 5326176256. Throughput: 0: 42473.0. Samples: 5326262960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:08,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 00:48:09,825][15401] Updated weights for policy 0, policy_version 325090 (0.0035) [2024-06-23 00:48:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5326405632. Throughput: 0: 42602.2. Samples: 5326521280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 00:48:14,059][15401] Updated weights for policy 0, policy_version 325100 (0.0030) [2024-06-23 00:48:17,466][15401] Updated weights for policy 0, policy_version 325110 (0.0044) [2024-06-23 00:48:18,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5326618624. Throughput: 0: 42128.8. Samples: 5326764900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 00:48:22,429][15401] Updated weights for policy 0, policy_version 325120 (0.0037) [2024-06-23 00:48:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41779.4, 300 sec: 42598.4). Total num frames: 5326798848. Throughput: 0: 42342.5. Samples: 5326902040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 00:48:25,456][15401] Updated weights for policy 0, policy_version 325130 (0.0033) [2024-06-23 00:48:28,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 5327028224. Throughput: 0: 42234.5. Samples: 5327151440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:28,397][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 00:48:30,027][15401] Updated weights for policy 0, policy_version 325140 (0.0042) [2024-06-23 00:48:33,028][15401] Updated weights for policy 0, policy_version 325150 (0.0033) [2024-06-23 00:48:33,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 5327257600. Throughput: 0: 42110.7. Samples: 5327402400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:33,392][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 00:48:37,628][15401] Updated weights for policy 0, policy_version 325160 (0.0036) [2024-06-23 00:48:38,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42056.3, 300 sec: 42598.4). Total num frames: 5327454208. Throughput: 0: 42414.6. Samples: 5327536920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 00:48:40,935][15401] Updated weights for policy 0, policy_version 325170 (0.0044) [2024-06-23 00:48:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 5327667200. Throughput: 0: 42569.7. Samples: 5327796080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 00:48:45,193][15401] Updated weights for policy 0, policy_version 325180 (0.0038) [2024-06-23 00:48:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5327896576. Throughput: 0: 42521.0. Samples: 5328047240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 00:48:48,486][15401] Updated weights for policy 0, policy_version 325190 (0.0046) [2024-06-23 00:48:53,068][15401] Updated weights for policy 0, policy_version 325200 (0.0029) [2024-06-23 00:48:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42543.1). Total num frames: 5328076800. Throughput: 0: 42598.7. Samples: 5328179800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 00:48:56,194][15401] Updated weights for policy 0, policy_version 325210 (0.0033) [2024-06-23 00:48:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5328306176. Throughput: 0: 42464.5. Samples: 5328432180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 00:48:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 00:49:00,932][15401] Updated weights for policy 0, policy_version 325220 (0.0029) [2024-06-23 00:49:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5328519168. Throughput: 0: 42677.0. Samples: 5328685360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 00:49:04,566][15401] Updated weights for policy 0, policy_version 325230 (0.0048) [2024-06-23 00:49:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 5328699392. Throughput: 0: 42439.0. Samples: 5328811800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 00:49:08,610][15401] Updated weights for policy 0, policy_version 325240 (0.0036) [2024-06-23 00:49:09,110][15349] Signal inference workers to stop experience collection... (78800 times) [2024-06-23 00:49:09,161][15349] Signal inference workers to resume experience collection... (78800 times) [2024-06-23 00:49:09,165][15401] InferenceWorker_p0-w0: stopping experience collection (78800 times) [2024-06-23 00:49:09,178][15401] InferenceWorker_p0-w0: resuming experience collection (78800 times) [2024-06-23 00:49:12,323][15401] Updated weights for policy 0, policy_version 325250 (0.0034) [2024-06-23 00:49:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5328945152. Throughput: 0: 42577.6. Samples: 5329067160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:13,394][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 00:49:16,301][15401] Updated weights for policy 0, policy_version 325260 (0.0039) [2024-06-23 00:49:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5329158144. Throughput: 0: 42520.8. Samples: 5329315740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 00:49:19,988][15401] Updated weights for policy 0, policy_version 325270 (0.0044) [2024-06-23 00:49:23,390][15132] Fps is (10 sec: 40956.4, 60 sec: 42597.7, 300 sec: 42599.2). Total num frames: 5329354752. Throughput: 0: 42517.4. Samples: 5329450240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:23,391][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 00:49:24,060][15401] Updated weights for policy 0, policy_version 325280 (0.0029) [2024-06-23 00:49:27,576][15401] Updated weights for policy 0, policy_version 325290 (0.0033) [2024-06-23 00:49:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42876.1, 300 sec: 42598.4). Total num frames: 5329600512. Throughput: 0: 42589.0. Samples: 5329712580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:28,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 00:49:31,602][15401] Updated weights for policy 0, policy_version 325300 (0.0036) [2024-06-23 00:49:33,390][15132] Fps is (10 sec: 45878.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 5329813504. Throughput: 0: 42588.3. Samples: 5329963720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:33,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 00:49:35,108][15401] Updated weights for policy 0, policy_version 325310 (0.0046) [2024-06-23 00:49:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5330010112. Throughput: 0: 42440.6. Samples: 5330089620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 00:49:39,293][15401] Updated weights for policy 0, policy_version 325320 (0.0030) [2024-06-23 00:49:42,979][15401] Updated weights for policy 0, policy_version 325330 (0.0037) [2024-06-23 00:49:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 5330206720. Throughput: 0: 42581.5. Samples: 5330348340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 00:49:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325331_5330223104.pth... [2024-06-23 00:49:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000324708_5320015872.pth [2024-06-23 00:49:47,064][15401] Updated weights for policy 0, policy_version 325340 (0.0029) [2024-06-23 00:49:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5330436096. Throughput: 0: 42578.6. Samples: 5330601400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 00:49:50,675][15401] Updated weights for policy 0, policy_version 325350 (0.0036) [2024-06-23 00:49:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5330665472. Throughput: 0: 42645.8. Samples: 5330730860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:53,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 00:49:54,768][15401] Updated weights for policy 0, policy_version 325360 (0.0043) [2024-06-23 00:49:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5330845696. Throughput: 0: 42458.4. Samples: 5330977780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:49:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 00:49:58,485][15401] Updated weights for policy 0, policy_version 325370 (0.0028) [2024-06-23 00:50:02,491][15401] Updated weights for policy 0, policy_version 325380 (0.0034) [2024-06-23 00:50:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5331058688. Throughput: 0: 42623.1. Samples: 5331233780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:50:03,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-23 00:50:05,961][15401] Updated weights for policy 0, policy_version 325390 (0.0035) [2024-06-23 00:50:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 5331304448. Throughput: 0: 42445.3. Samples: 5331360240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:50:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 00:50:09,930][15401] Updated weights for policy 0, policy_version 325400 (0.0037) [2024-06-23 00:50:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5331501056. Throughput: 0: 42357.4. Samples: 5331618660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:50:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 00:50:13,595][15401] Updated weights for policy 0, policy_version 325410 (0.0036) [2024-06-23 00:50:15,398][15349] Signal inference workers to stop experience collection... (78850 times) [2024-06-23 00:50:15,398][15349] Signal inference workers to resume experience collection... (78850 times) [2024-06-23 00:50:15,424][15401] InferenceWorker_p0-w0: stopping experience collection (78850 times) [2024-06-23 00:50:15,425][15401] InferenceWorker_p0-w0: resuming experience collection (78850 times) [2024-06-23 00:50:17,482][15401] Updated weights for policy 0, policy_version 325420 (0.0036) [2024-06-23 00:50:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5331697664. Throughput: 0: 42529.5. Samples: 5331877540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:18,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 00:50:21,305][15401] Updated weights for policy 0, policy_version 325430 (0.0036) [2024-06-23 00:50:23,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42871.1, 300 sec: 42542.7). Total num frames: 5331927040. Throughput: 0: 42536.3. Samples: 5332003820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:23,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 00:50:24,955][15401] Updated weights for policy 0, policy_version 325440 (0.0033) [2024-06-23 00:50:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5332123648. Throughput: 0: 42397.7. Samples: 5332256240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 00:50:28,938][15401] Updated weights for policy 0, policy_version 325450 (0.0031) [2024-06-23 00:50:32,471][15401] Updated weights for policy 0, policy_version 325460 (0.0027) [2024-06-23 00:50:33,390][15132] Fps is (10 sec: 40965.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5332336640. Throughput: 0: 42505.7. Samples: 5332514160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 00:50:36,518][15401] Updated weights for policy 0, policy_version 325470 (0.0024) [2024-06-23 00:50:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5332566016. Throughput: 0: 42440.9. Samples: 5332640700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 00:50:39,953][15401] Updated weights for policy 0, policy_version 325480 (0.0026) [2024-06-23 00:50:43,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42866.8, 300 sec: 42597.8). Total num frames: 5332779008. Throughput: 0: 42790.3. Samples: 5332903620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:43,396][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 00:50:44,237][15401] Updated weights for policy 0, policy_version 325490 (0.0043) [2024-06-23 00:50:47,991][15401] Updated weights for policy 0, policy_version 325500 (0.0024) [2024-06-23 00:50:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5332992000. Throughput: 0: 42751.2. Samples: 5333157580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 00:50:51,766][15401] Updated weights for policy 0, policy_version 325510 (0.0027) [2024-06-23 00:50:53,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5333221376. Throughput: 0: 42854.2. Samples: 5333288680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:53,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 00:50:55,569][15401] Updated weights for policy 0, policy_version 325520 (0.0034) [2024-06-23 00:50:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5333417984. Throughput: 0: 42884.8. Samples: 5333548480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:50:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 00:50:59,345][15401] Updated weights for policy 0, policy_version 325530 (0.0048) [2024-06-23 00:51:03,147][15401] Updated weights for policy 0, policy_version 325540 (0.0035) [2024-06-23 00:51:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5333647360. Throughput: 0: 42693.2. Samples: 5333798740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 00:51:07,343][15401] Updated weights for policy 0, policy_version 325550 (0.0034) [2024-06-23 00:51:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5333843968. Throughput: 0: 42762.8. Samples: 5333928080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 00:51:10,659][15401] Updated weights for policy 0, policy_version 325560 (0.0029) [2024-06-23 00:51:13,391][15132] Fps is (10 sec: 40952.6, 60 sec: 42597.1, 300 sec: 42598.1). Total num frames: 5334056960. Throughput: 0: 42879.7. Samples: 5334185900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:13,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 00:51:15,004][15401] Updated weights for policy 0, policy_version 325570 (0.0029) [2024-06-23 00:51:18,333][15401] Updated weights for policy 0, policy_version 325580 (0.0029) [2024-06-23 00:51:18,390][15132] Fps is (10 sec: 45873.9, 60 sec: 43417.4, 300 sec: 42653.9). Total num frames: 5334302720. Throughput: 0: 42692.8. Samples: 5334435340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 00:51:22,671][15401] Updated weights for policy 0, policy_version 325590 (0.0032) [2024-06-23 00:51:23,389][15132] Fps is (10 sec: 42606.3, 60 sec: 42599.5, 300 sec: 42543.2). Total num frames: 5334482944. Throughput: 0: 42748.0. Samples: 5334564360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 00:51:26,090][15401] Updated weights for policy 0, policy_version 325600 (0.0034) [2024-06-23 00:51:28,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5334695936. Throughput: 0: 42669.7. Samples: 5334823480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 00:51:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 00:51:30,640][15401] Updated weights for policy 0, policy_version 325610 (0.0033) [2024-06-23 00:51:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 5334925312. Throughput: 0: 42645.6. Samples: 5335076740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:33,393][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 00:51:33,852][15401] Updated weights for policy 0, policy_version 325620 (0.0037) [2024-06-23 00:51:35,809][15349] Signal inference workers to stop experience collection... (78900 times) [2024-06-23 00:51:35,836][15401] InferenceWorker_p0-w0: stopping experience collection (78900 times) [2024-06-23 00:51:35,862][15349] Signal inference workers to resume experience collection... (78900 times) [2024-06-23 00:51:35,863][15401] InferenceWorker_p0-w0: resuming experience collection (78900 times) [2024-06-23 00:51:38,188][15401] Updated weights for policy 0, policy_version 325630 (0.0035) [2024-06-23 00:51:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5335121920. Throughput: 0: 42472.0. Samples: 5335199920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 00:51:41,515][15401] Updated weights for policy 0, policy_version 325640 (0.0032) [2024-06-23 00:51:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 5335351296. Throughput: 0: 42506.7. Samples: 5335461280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 00:51:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325645_5335367680.pth... [2024-06-23 00:51:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325020_5325127680.pth [2024-06-23 00:51:45,591][15401] Updated weights for policy 0, policy_version 325650 (0.0032) [2024-06-23 00:51:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5335547904. Throughput: 0: 42789.2. Samples: 5335724260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 00:51:49,209][15401] Updated weights for policy 0, policy_version 325660 (0.0027) [2024-06-23 00:51:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5335777280. Throughput: 0: 42599.4. Samples: 5335845060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 00:51:53,387][15401] Updated weights for policy 0, policy_version 325670 (0.0045) [2024-06-23 00:51:57,214][15401] Updated weights for policy 0, policy_version 325680 (0.0032) [2024-06-23 00:51:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5335990272. Throughput: 0: 42567.5. Samples: 5336101360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:51:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 00:52:01,109][15401] Updated weights for policy 0, policy_version 325690 (0.0035) [2024-06-23 00:52:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 5336186880. Throughput: 0: 42735.2. Samples: 5336358520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:03,392][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 00:52:04,808][15401] Updated weights for policy 0, policy_version 325700 (0.0037) [2024-06-23 00:52:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5336399872. Throughput: 0: 42719.0. Samples: 5336486720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 00:52:08,924][15401] Updated weights for policy 0, policy_version 325710 (0.0035) [2024-06-23 00:52:12,767][15401] Updated weights for policy 0, policy_version 325720 (0.0029) [2024-06-23 00:52:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42599.7, 300 sec: 42653.9). Total num frames: 5336612864. Throughput: 0: 42612.8. Samples: 5336741060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 00:52:16,807][15401] Updated weights for policy 0, policy_version 325730 (0.0032) [2024-06-23 00:52:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5336825856. Throughput: 0: 42630.3. Samples: 5336995000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 00:52:20,340][15401] Updated weights for policy 0, policy_version 325740 (0.0041) [2024-06-23 00:52:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5337038848. Throughput: 0: 42667.1. Samples: 5337119940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:52:24,384][15401] Updated weights for policy 0, policy_version 325750 (0.0036) [2024-06-23 00:52:28,081][15401] Updated weights for policy 0, policy_version 325760 (0.0034) [2024-06-23 00:52:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5337251840. Throughput: 0: 42632.1. Samples: 5337379720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 00:52:31,939][15401] Updated weights for policy 0, policy_version 325770 (0.0029) [2024-06-23 00:52:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42543.7). Total num frames: 5337481216. Throughput: 0: 42469.9. Samples: 5337635400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 00:52:35,846][15401] Updated weights for policy 0, policy_version 325780 (0.0040) [2024-06-23 00:52:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 5337677824. Throughput: 0: 42698.3. Samples: 5337766480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 00:52:39,478][15401] Updated weights for policy 0, policy_version 325790 (0.0030) [2024-06-23 00:52:43,305][15401] Updated weights for policy 0, policy_version 325800 (0.0034) [2024-06-23 00:52:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5337907200. Throughput: 0: 42799.5. Samples: 5338027340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 00:52:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 00:52:47,212][15401] Updated weights for policy 0, policy_version 325810 (0.0025) [2024-06-23 00:52:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5338120192. Throughput: 0: 42675.1. Samples: 5338278800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:52:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 00:52:51,038][15401] Updated weights for policy 0, policy_version 325820 (0.0031) [2024-06-23 00:52:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5338333184. Throughput: 0: 42695.7. Samples: 5338408020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:52:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 00:52:54,827][15401] Updated weights for policy 0, policy_version 325830 (0.0022) [2024-06-23 00:52:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5338529792. Throughput: 0: 42855.2. Samples: 5338669540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:52:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:52:58,644][15401] Updated weights for policy 0, policy_version 325840 (0.0042) [2024-06-23 00:53:01,843][15349] Signal inference workers to stop experience collection... (78950 times) [2024-06-23 00:53:01,847][15349] Signal inference workers to resume experience collection... (78950 times) [2024-06-23 00:53:01,868][15401] InferenceWorker_p0-w0: stopping experience collection (78950 times) [2024-06-23 00:53:01,868][15401] InferenceWorker_p0-w0: resuming experience collection (78950 times) [2024-06-23 00:53:02,503][15401] Updated weights for policy 0, policy_version 325850 (0.0037) [2024-06-23 00:53:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 5338775552. Throughput: 0: 42899.0. Samples: 5338925460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 00:53:06,206][15401] Updated weights for policy 0, policy_version 325860 (0.0040) [2024-06-23 00:53:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5338988544. Throughput: 0: 43020.3. Samples: 5339055860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 00:53:10,113][15401] Updated weights for policy 0, policy_version 325870 (0.0027) [2024-06-23 00:53:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5339185152. Throughput: 0: 42951.8. Samples: 5339312560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 00:53:13,697][15401] Updated weights for policy 0, policy_version 325880 (0.0035) [2024-06-23 00:53:17,675][15401] Updated weights for policy 0, policy_version 325890 (0.0040) [2024-06-23 00:53:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5339398144. Throughput: 0: 43014.7. Samples: 5339571060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 00:53:21,312][15401] Updated weights for policy 0, policy_version 325900 (0.0028) [2024-06-23 00:53:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 5339627520. Throughput: 0: 42934.6. Samples: 5339698540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 00:53:25,723][15401] Updated weights for policy 0, policy_version 325910 (0.0021) [2024-06-23 00:53:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 5339807744. Throughput: 0: 42632.9. Samples: 5339945820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:28,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 00:53:29,664][15401] Updated weights for policy 0, policy_version 325920 (0.0027) [2024-06-23 00:53:33,215][15401] Updated weights for policy 0, policy_version 325930 (0.0030) [2024-06-23 00:53:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5340037120. Throughput: 0: 42818.8. Samples: 5340205640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 00:53:37,183][15401] Updated weights for policy 0, policy_version 325940 (0.0032) [2024-06-23 00:53:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5340250112. Throughput: 0: 42807.0. Samples: 5340334340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 00:53:40,886][15401] Updated weights for policy 0, policy_version 325950 (0.0036) [2024-06-23 00:53:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5340463104. Throughput: 0: 42607.5. Samples: 5340586880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 00:53:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325956_5340463104.pth... [2024-06-23 00:53:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325331_5330223104.pth [2024-06-23 00:53:44,833][15401] Updated weights for policy 0, policy_version 325960 (0.0035) [2024-06-23 00:53:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5340676096. Throughput: 0: 42604.1. Samples: 5340842640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 00:53:48,577][15401] Updated weights for policy 0, policy_version 325970 (0.0028) [2024-06-23 00:53:52,430][15401] Updated weights for policy 0, policy_version 325980 (0.0028) [2024-06-23 00:53:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5340905472. Throughput: 0: 42467.5. Samples: 5340966900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 00:53:56,156][15401] Updated weights for policy 0, policy_version 325990 (0.0042) [2024-06-23 00:53:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5341102080. Throughput: 0: 42548.4. Samples: 5341227240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-23 00:53:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 00:54:00,003][15401] Updated weights for policy 0, policy_version 326000 (0.0045) [2024-06-23 00:54:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 5341331456. Throughput: 0: 42465.2. Samples: 5341482100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:03,392][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 00:54:03,694][15401] Updated weights for policy 0, policy_version 326010 (0.0035) [2024-06-23 00:54:07,791][15401] Updated weights for policy 0, policy_version 326020 (0.0034) [2024-06-23 00:54:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5341544448. Throughput: 0: 42503.6. Samples: 5341611200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 00:54:11,157][15401] Updated weights for policy 0, policy_version 326030 (0.0041) [2024-06-23 00:54:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5341757440. Throughput: 0: 42839.2. Samples: 5341873580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:13,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 00:54:15,402][15401] Updated weights for policy 0, policy_version 326040 (0.0034) [2024-06-23 00:54:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 5341954048. Throughput: 0: 42821.3. Samples: 5342132600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 00:54:19,090][15401] Updated weights for policy 0, policy_version 326050 (0.0030) [2024-06-23 00:54:22,947][15401] Updated weights for policy 0, policy_version 326060 (0.0025) [2024-06-23 00:54:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5342183424. Throughput: 0: 42708.0. Samples: 5342256200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 00:54:26,814][15401] Updated weights for policy 0, policy_version 326070 (0.0045) [2024-06-23 00:54:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5342412800. Throughput: 0: 42891.9. Samples: 5342517020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 00:54:30,496][15401] Updated weights for policy 0, policy_version 326080 (0.0041) [2024-06-23 00:54:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5342593024. Throughput: 0: 43015.5. Samples: 5342778340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 00:54:33,949][15349] Signal inference workers to stop experience collection... (79000 times) [2024-06-23 00:54:33,949][15349] Signal inference workers to resume experience collection... (79000 times) [2024-06-23 00:54:33,981][15401] InferenceWorker_p0-w0: stopping experience collection (79000 times) [2024-06-23 00:54:33,981][15401] InferenceWorker_p0-w0: resuming experience collection (79000 times) [2024-06-23 00:54:34,451][15401] Updated weights for policy 0, policy_version 326090 (0.0040) [2024-06-23 00:54:37,861][15401] Updated weights for policy 0, policy_version 326100 (0.0036) [2024-06-23 00:54:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5342822400. Throughput: 0: 42968.9. Samples: 5342900500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 00:54:42,018][15401] Updated weights for policy 0, policy_version 326110 (0.0027) [2024-06-23 00:54:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5343035392. Throughput: 0: 42949.1. Samples: 5343159940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 00:54:45,426][15401] Updated weights for policy 0, policy_version 326120 (0.0028) [2024-06-23 00:54:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5343248384. Throughput: 0: 43075.2. Samples: 5343420380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:48,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 00:54:49,642][15401] Updated weights for policy 0, policy_version 326130 (0.0030) [2024-06-23 00:54:52,948][15401] Updated weights for policy 0, policy_version 326140 (0.0035) [2024-06-23 00:54:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5343477760. Throughput: 0: 42936.8. Samples: 5343543360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 00:54:57,659][15401] Updated weights for policy 0, policy_version 326150 (0.0028) [2024-06-23 00:54:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5343690752. Throughput: 0: 42876.4. Samples: 5343803020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:54:58,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 00:55:01,186][15401] Updated weights for policy 0, policy_version 326160 (0.0040) [2024-06-23 00:55:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5343887360. Throughput: 0: 42727.1. Samples: 5344055320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:55:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 00:55:05,459][15401] Updated weights for policy 0, policy_version 326170 (0.0024) [2024-06-23 00:55:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5344100352. Throughput: 0: 42793.4. Samples: 5344181900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:55:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:55:08,739][15401] Updated weights for policy 0, policy_version 326180 (0.0036) [2024-06-23 00:55:12,973][15401] Updated weights for policy 0, policy_version 326190 (0.0029) [2024-06-23 00:55:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5344329728. Throughput: 0: 42859.0. Samples: 5344445780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 00:55:13,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 00:55:16,346][15401] Updated weights for policy 0, policy_version 326200 (0.0039) [2024-06-23 00:55:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 5344526336. Throughput: 0: 42709.0. Samples: 5344700240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 00:55:20,496][15401] Updated weights for policy 0, policy_version 326210 (0.0032) [2024-06-23 00:55:23,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5344755712. Throughput: 0: 42895.6. Samples: 5344830800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 00:55:24,040][15401] Updated weights for policy 0, policy_version 326220 (0.0028) [2024-06-23 00:55:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5344935936. Throughput: 0: 42735.8. Samples: 5345083060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 00:55:28,538][15401] Updated weights for policy 0, policy_version 326230 (0.0036) [2024-06-23 00:55:31,635][15401] Updated weights for policy 0, policy_version 326240 (0.0023) [2024-06-23 00:55:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5345165312. Throughput: 0: 42591.2. Samples: 5345336980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 00:55:36,135][15401] Updated weights for policy 0, policy_version 326250 (0.0030) [2024-06-23 00:55:38,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43142.8, 300 sec: 42821.1). Total num frames: 5345411072. Throughput: 0: 42792.4. Samples: 5345469120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:38,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 00:55:39,468][15401] Updated weights for policy 0, policy_version 326260 (0.0031) [2024-06-23 00:55:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5345591296. Throughput: 0: 42705.3. Samples: 5345724760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 00:55:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326269_5345591296.pth... [2024-06-23 00:55:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325645_5335367680.pth [2024-06-23 00:55:43,674][15401] Updated weights for policy 0, policy_version 326270 (0.0029) [2024-06-23 00:55:45,473][15349] Signal inference workers to stop experience collection... (79050 times) [2024-06-23 00:55:45,513][15401] InferenceWorker_p0-w0: stopping experience collection (79050 times) [2024-06-23 00:55:45,522][15349] Signal inference workers to resume experience collection... (79050 times) [2024-06-23 00:55:45,532][15401] InferenceWorker_p0-w0: resuming experience collection (79050 times) [2024-06-23 00:55:46,960][15401] Updated weights for policy 0, policy_version 326280 (0.0028) [2024-06-23 00:55:48,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5345804288. Throughput: 0: 42748.9. Samples: 5345979020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 00:55:51,335][15401] Updated weights for policy 0, policy_version 326290 (0.0029) [2024-06-23 00:55:53,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5346066432. Throughput: 0: 42999.9. Samples: 5346116900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 00:55:54,389][15401] Updated weights for policy 0, policy_version 326300 (0.0034) [2024-06-23 00:55:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5346230272. Throughput: 0: 42741.5. Samples: 5346369040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:55:58,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-23 00:55:58,883][15401] Updated weights for policy 0, policy_version 326310 (0.0022) [2024-06-23 00:56:01,925][15401] Updated weights for policy 0, policy_version 326320 (0.0027) [2024-06-23 00:56:03,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5346443264. Throughput: 0: 42679.9. Samples: 5346620840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 00:56:06,583][15401] Updated weights for policy 0, policy_version 326330 (0.0041) [2024-06-23 00:56:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 5346689024. Throughput: 0: 42723.7. Samples: 5346753360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 00:56:10,067][15401] Updated weights for policy 0, policy_version 326340 (0.0027) [2024-06-23 00:56:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 5346885632. Throughput: 0: 42655.6. Samples: 5347002560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 00:56:14,130][15401] Updated weights for policy 0, policy_version 326350 (0.0031) [2024-06-23 00:56:17,673][15401] Updated weights for policy 0, policy_version 326360 (0.0040) [2024-06-23 00:56:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5347098624. Throughput: 0: 42755.5. Samples: 5347260980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 00:56:21,752][15401] Updated weights for policy 0, policy_version 326370 (0.0039) [2024-06-23 00:56:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5347328000. Throughput: 0: 42728.6. Samples: 5347391800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 00:56:25,437][15401] Updated weights for policy 0, policy_version 326380 (0.0046) [2024-06-23 00:56:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 5347524608. Throughput: 0: 42715.1. Samples: 5347646940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 00:56:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 00:56:29,525][15401] Updated weights for policy 0, policy_version 326390 (0.0040) [2024-06-23 00:56:32,996][15401] Updated weights for policy 0, policy_version 326400 (0.0026) [2024-06-23 00:56:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5347737600. Throughput: 0: 42561.7. Samples: 5347894300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 00:56:37,291][15401] Updated weights for policy 0, policy_version 326410 (0.0038) [2024-06-23 00:56:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 5347950592. Throughput: 0: 42579.5. Samples: 5348032980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 00:56:40,700][15401] Updated weights for policy 0, policy_version 326420 (0.0033) [2024-06-23 00:56:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5348163584. Throughput: 0: 42575.9. Samples: 5348284960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:43,400][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 00:56:44,850][15401] Updated weights for policy 0, policy_version 326430 (0.0042) [2024-06-23 00:56:48,376][15401] Updated weights for policy 0, policy_version 326440 (0.0027) [2024-06-23 00:56:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5348392960. Throughput: 0: 42613.3. Samples: 5348538440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 00:56:52,248][15401] Updated weights for policy 0, policy_version 326450 (0.0028) [2024-06-23 00:56:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5348589568. Throughput: 0: 42643.0. Samples: 5348672300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 00:56:53,837][15349] Signal inference workers to stop experience collection... (79100 times) [2024-06-23 00:56:53,838][15349] Signal inference workers to resume experience collection... (79100 times) [2024-06-23 00:56:53,857][15401] InferenceWorker_p0-w0: stopping experience collection (79100 times) [2024-06-23 00:56:53,857][15401] InferenceWorker_p0-w0: resuming experience collection (79100 times) [2024-06-23 00:56:56,142][15401] Updated weights for policy 0, policy_version 326460 (0.0028) [2024-06-23 00:56:58,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42866.9, 300 sec: 42764.4). Total num frames: 5348802560. Throughput: 0: 42815.2. Samples: 5348929520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:56:58,397][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 00:57:00,171][15401] Updated weights for policy 0, policy_version 326470 (0.0022) [2024-06-23 00:57:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 5349031936. Throughput: 0: 42719.0. Samples: 5349183340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 00:57:03,618][15401] Updated weights for policy 0, policy_version 326480 (0.0028) [2024-06-23 00:57:07,693][15401] Updated weights for policy 0, policy_version 326490 (0.0033) [2024-06-23 00:57:08,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5349228544. Throughput: 0: 42691.0. Samples: 5349312900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 00:57:11,407][15401] Updated weights for policy 0, policy_version 326500 (0.0033) [2024-06-23 00:57:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5349457920. Throughput: 0: 42671.6. Samples: 5349567160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:57:15,595][15401] Updated weights for policy 0, policy_version 326510 (0.0040) [2024-06-23 00:57:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5349670912. Throughput: 0: 42871.9. Samples: 5349823640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:18,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 00:57:19,094][15401] Updated weights for policy 0, policy_version 326520 (0.0031) [2024-06-23 00:57:23,233][15401] Updated weights for policy 0, policy_version 326530 (0.0037) [2024-06-23 00:57:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5349867520. Throughput: 0: 42644.4. Samples: 5349951980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 00:57:26,473][15401] Updated weights for policy 0, policy_version 326540 (0.0023) [2024-06-23 00:57:28,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5350080512. Throughput: 0: 42745.5. Samples: 5350208500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:57:30,735][15401] Updated weights for policy 0, policy_version 326550 (0.0033) [2024-06-23 00:57:33,391][15132] Fps is (10 sec: 42591.1, 60 sec: 42597.1, 300 sec: 42764.8). Total num frames: 5350293504. Throughput: 0: 42852.1. Samples: 5350466860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:33,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 00:57:34,111][15401] Updated weights for policy 0, policy_version 326560 (0.0035) [2024-06-23 00:57:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5350506496. Throughput: 0: 42774.8. Samples: 5350597160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 00:57:38,502][15401] Updated weights for policy 0, policy_version 326570 (0.0039) [2024-06-23 00:57:42,198][15401] Updated weights for policy 0, policy_version 326580 (0.0048) [2024-06-23 00:57:43,389][15132] Fps is (10 sec: 42606.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5350719488. Throughput: 0: 42636.3. Samples: 5350847880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 00:57:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 00:57:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326582_5350719488.pth... [2024-06-23 00:57:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000325956_5340463104.pth [2024-06-23 00:57:46,174][15401] Updated weights for policy 0, policy_version 326590 (0.0052) [2024-06-23 00:57:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5350916096. Throughput: 0: 42729.0. Samples: 5351106140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:57:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 00:57:49,752][15401] Updated weights for policy 0, policy_version 326600 (0.0049) [2024-06-23 00:57:53,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 5351145472. Throughput: 0: 42662.8. Samples: 5351233000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:57:53,397][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 00:57:53,630][15401] Updated weights for policy 0, policy_version 326610 (0.0039) [2024-06-23 00:57:57,494][15401] Updated weights for policy 0, policy_version 326620 (0.0044) [2024-06-23 00:57:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 5351374848. Throughput: 0: 42590.7. Samples: 5351483740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:57:58,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 00:58:01,851][15401] Updated weights for policy 0, policy_version 326630 (0.0029) [2024-06-23 00:58:03,392][15132] Fps is (10 sec: 40976.6, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 5351555072. Throughput: 0: 42608.5. Samples: 5351741020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:03,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 00:58:03,981][15349] Signal inference workers to stop experience collection... (79150 times) [2024-06-23 00:58:03,982][15349] Signal inference workers to resume experience collection... (79150 times) [2024-06-23 00:58:03,995][15401] InferenceWorker_p0-w0: stopping experience collection (79150 times) [2024-06-23 00:58:03,996][15401] InferenceWorker_p0-w0: resuming experience collection (79150 times) [2024-06-23 00:58:05,083][15401] Updated weights for policy 0, policy_version 326640 (0.0038) [2024-06-23 00:58:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5351768064. Throughput: 0: 42494.6. Samples: 5351864240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 00:58:09,478][15401] Updated weights for policy 0, policy_version 326650 (0.0029) [2024-06-23 00:58:12,761][15401] Updated weights for policy 0, policy_version 326660 (0.0032) [2024-06-23 00:58:13,390][15132] Fps is (10 sec: 45885.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5352013824. Throughput: 0: 42436.2. Samples: 5352118140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 00:58:17,038][15401] Updated weights for policy 0, policy_version 326670 (0.0032) [2024-06-23 00:58:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 5352194048. Throughput: 0: 42484.8. Samples: 5352378600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 00:58:20,353][15401] Updated weights for policy 0, policy_version 326680 (0.0041) [2024-06-23 00:58:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5352423424. Throughput: 0: 42302.9. Samples: 5352500800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 00:58:24,732][15401] Updated weights for policy 0, policy_version 326690 (0.0031) [2024-06-23 00:58:28,095][15401] Updated weights for policy 0, policy_version 326700 (0.0031) [2024-06-23 00:58:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5352652800. Throughput: 0: 42518.5. Samples: 5352761220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 00:58:32,401][15401] Updated weights for policy 0, policy_version 326710 (0.0020) [2024-06-23 00:58:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42326.6, 300 sec: 42653.9). Total num frames: 5352833024. Throughput: 0: 42413.3. Samples: 5353014740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:33,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 00:58:35,642][15401] Updated weights for policy 0, policy_version 326720 (0.0037) [2024-06-23 00:58:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5353062400. Throughput: 0: 42270.9. Samples: 5353134920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 00:58:40,379][15401] Updated weights for policy 0, policy_version 326730 (0.0035) [2024-06-23 00:58:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5353275392. Throughput: 0: 42575.4. Samples: 5353399640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 00:58:43,552][15401] Updated weights for policy 0, policy_version 326740 (0.0041) [2024-06-23 00:58:47,979][15401] Updated weights for policy 0, policy_version 326750 (0.0035) [2024-06-23 00:58:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5353488384. Throughput: 0: 42493.9. Samples: 5353653140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:48,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 00:58:51,313][15401] Updated weights for policy 0, policy_version 326760 (0.0035) [2024-06-23 00:58:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 5353684992. Throughput: 0: 42517.4. Samples: 5353777520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 00:58:55,557][15401] Updated weights for policy 0, policy_version 326770 (0.0042) [2024-06-23 00:58:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5353930752. Throughput: 0: 42709.0. Samples: 5354040040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 00:58:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 00:58:58,990][15401] Updated weights for policy 0, policy_version 326780 (0.0042) [2024-06-23 00:59:03,194][15401] Updated weights for policy 0, policy_version 326790 (0.0030) [2024-06-23 00:59:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5354127360. Throughput: 0: 42519.0. Samples: 5354291960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 00:59:06,835][15401] Updated weights for policy 0, policy_version 326800 (0.0026) [2024-06-23 00:59:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5354340352. Throughput: 0: 42585.8. Samples: 5354417160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:08,391][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 00:59:10,773][15401] Updated weights for policy 0, policy_version 326810 (0.0034) [2024-06-23 00:59:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5354569728. Throughput: 0: 42686.3. Samples: 5354682100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 00:59:14,511][15401] Updated weights for policy 0, policy_version 326820 (0.0040) [2024-06-23 00:59:18,367][15401] Updated weights for policy 0, policy_version 326830 (0.0032) [2024-06-23 00:59:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5354782720. Throughput: 0: 42621.4. Samples: 5354932700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 00:59:22,138][15401] Updated weights for policy 0, policy_version 326840 (0.0038) [2024-06-23 00:59:23,286][15349] Signal inference workers to stop experience collection... (79200 times) [2024-06-23 00:59:23,286][15349] Signal inference workers to resume experience collection... (79200 times) [2024-06-23 00:59:23,336][15401] InferenceWorker_p0-w0: stopping experience collection (79200 times) [2024-06-23 00:59:23,336][15401] InferenceWorker_p0-w0: resuming experience collection (79200 times) [2024-06-23 00:59:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5354979328. Throughput: 0: 42714.7. Samples: 5355057080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 00:59:25,858][15401] Updated weights for policy 0, policy_version 326850 (0.0028) [2024-06-23 00:59:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5355208704. Throughput: 0: 42793.8. Samples: 5355325360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:28,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 00:59:29,768][15401] Updated weights for policy 0, policy_version 326860 (0.0045) [2024-06-23 00:59:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5355421696. Throughput: 0: 42647.0. Samples: 5355572260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 00:59:33,649][15401] Updated weights for policy 0, policy_version 326870 (0.0040) [2024-06-23 00:59:37,281][15401] Updated weights for policy 0, policy_version 326880 (0.0020) [2024-06-23 00:59:38,390][15132] Fps is (10 sec: 40957.9, 60 sec: 42598.0, 300 sec: 42653.8). Total num frames: 5355618304. Throughput: 0: 42764.4. Samples: 5355701940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:38,391][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 00:59:41,513][15401] Updated weights for policy 0, policy_version 326890 (0.0027) [2024-06-23 00:59:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5355814912. Throughput: 0: 42651.6. Samples: 5355959360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 00:59:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326894_5355831296.pth... [2024-06-23 00:59:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326269_5345591296.pth [2024-06-23 00:59:45,151][15401] Updated weights for policy 0, policy_version 326900 (0.0022) [2024-06-23 00:59:48,389][15132] Fps is (10 sec: 44239.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5356060672. Throughput: 0: 42647.2. Samples: 5356211080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 00:59:49,331][15401] Updated weights for policy 0, policy_version 326910 (0.0036) [2024-06-23 00:59:52,700][15401] Updated weights for policy 0, policy_version 326920 (0.0026) [2024-06-23 00:59:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5356273664. Throughput: 0: 42813.4. Samples: 5356343760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 00:59:56,930][15401] Updated weights for policy 0, policy_version 326930 (0.0023) [2024-06-23 00:59:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5356453888. Throughput: 0: 42581.8. Samples: 5356598280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 00:59:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 01:00:00,394][15401] Updated weights for policy 0, policy_version 326940 (0.0047) [2024-06-23 01:00:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 5356699648. Throughput: 0: 42668.3. Samples: 5356852780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 01:00:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 01:00:04,462][15401] Updated weights for policy 0, policy_version 326950 (0.0024) [2024-06-23 01:00:08,219][15401] Updated weights for policy 0, policy_version 326960 (0.0032) [2024-06-23 01:00:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 5356912640. Throughput: 0: 42892.5. Samples: 5356987240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 01:00:08,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 01:00:11,909][15401] Updated weights for policy 0, policy_version 326970 (0.0037) [2024-06-23 01:00:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5357109248. Throughput: 0: 42669.0. Samples: 5357245460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 01:00:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 01:00:15,949][15401] Updated weights for policy 0, policy_version 326980 (0.0044) [2024-06-23 01:00:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5357371392. Throughput: 0: 42783.2. Samples: 5357497500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 01:00:19,492][15401] Updated weights for policy 0, policy_version 326990 (0.0038) [2024-06-23 01:00:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5357535232. Throughput: 0: 42904.6. Samples: 5357632620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 01:00:23,589][15401] Updated weights for policy 0, policy_version 327000 (0.0030) [2024-06-23 01:00:27,070][15401] Updated weights for policy 0, policy_version 327010 (0.0041) [2024-06-23 01:00:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5357748224. Throughput: 0: 42658.7. Samples: 5357879000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 01:00:29,840][15349] Signal inference workers to stop experience collection... (79250 times) [2024-06-23 01:00:29,888][15401] InferenceWorker_p0-w0: stopping experience collection (79250 times) [2024-06-23 01:00:29,898][15349] Signal inference workers to resume experience collection... (79250 times) [2024-06-23 01:00:29,903][15401] InferenceWorker_p0-w0: resuming experience collection (79250 times) [2024-06-23 01:00:31,260][15401] Updated weights for policy 0, policy_version 327020 (0.0026) [2024-06-23 01:00:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 5357993984. Throughput: 0: 42757.2. Samples: 5358135160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 01:00:34,956][15401] Updated weights for policy 0, policy_version 327030 (0.0037) [2024-06-23 01:00:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.7, 300 sec: 42598.4). Total num frames: 5358157824. Throughput: 0: 42812.5. Samples: 5358270320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 01:00:39,019][15401] Updated weights for policy 0, policy_version 327040 (0.0030) [2024-06-23 01:00:42,596][15401] Updated weights for policy 0, policy_version 327050 (0.0033) [2024-06-23 01:00:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5358403584. Throughput: 0: 42770.6. Samples: 5358522960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 01:00:46,928][15401] Updated weights for policy 0, policy_version 327060 (0.0032) [2024-06-23 01:00:48,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5358649344. Throughput: 0: 42581.9. Samples: 5358768960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 01:00:50,322][15401] Updated weights for policy 0, policy_version 327070 (0.0033) [2024-06-23 01:00:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5358796800. Throughput: 0: 42620.4. Samples: 5358905160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 01:00:54,520][15401] Updated weights for policy 0, policy_version 327080 (0.0034) [2024-06-23 01:00:58,361][15401] Updated weights for policy 0, policy_version 327090 (0.0026) [2024-06-23 01:00:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5359042560. Throughput: 0: 42572.0. Samples: 5359161200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:00:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 01:01:02,039][15401] Updated weights for policy 0, policy_version 327100 (0.0042) [2024-06-23 01:01:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 5359271936. Throughput: 0: 42606.3. Samples: 5359414780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 01:01:05,795][15401] Updated weights for policy 0, policy_version 327110 (0.0026) [2024-06-23 01:01:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5359435776. Throughput: 0: 42503.9. Samples: 5359545300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 01:01:09,692][15401] Updated weights for policy 0, policy_version 327120 (0.0034) [2024-06-23 01:01:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5359681536. Throughput: 0: 42675.1. Samples: 5359799380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:13,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 01:01:13,431][15401] Updated weights for policy 0, policy_version 327130 (0.0030) [2024-06-23 01:01:17,547][15401] Updated weights for policy 0, policy_version 327140 (0.0049) [2024-06-23 01:01:18,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5359910912. Throughput: 0: 42719.6. Samples: 5360057540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 01:01:21,023][15401] Updated weights for policy 0, policy_version 327150 (0.0034) [2024-06-23 01:01:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5360091136. Throughput: 0: 42433.2. Samples: 5360179820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:23,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-23 01:01:25,372][15401] Updated weights for policy 0, policy_version 327160 (0.0024) [2024-06-23 01:01:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5360320512. Throughput: 0: 42481.3. Samples: 5360434620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 01:01:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 01:01:28,607][15401] Updated weights for policy 0, policy_version 327170 (0.0032) [2024-06-23 01:01:32,890][15401] Updated weights for policy 0, policy_version 327180 (0.0033) [2024-06-23 01:01:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5360533504. Throughput: 0: 42824.0. Samples: 5360696040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:33,394][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 01:01:36,122][15401] Updated weights for policy 0, policy_version 327190 (0.0021) [2024-06-23 01:01:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5360713728. Throughput: 0: 42463.1. Samples: 5360816000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 01:01:40,448][15401] Updated weights for policy 0, policy_version 327200 (0.0036) [2024-06-23 01:01:41,184][15349] Signal inference workers to stop experience collection... (79300 times) [2024-06-23 01:01:41,184][15349] Signal inference workers to resume experience collection... (79300 times) [2024-06-23 01:01:41,214][15401] InferenceWorker_p0-w0: stopping experience collection (79300 times) [2024-06-23 01:01:41,214][15401] InferenceWorker_p0-w0: resuming experience collection (79300 times) [2024-06-23 01:01:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5360975872. Throughput: 0: 42543.4. Samples: 5361075660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:43,394][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 01:01:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327208_5360975872.pth... [2024-06-23 01:01:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326582_5350719488.pth [2024-06-23 01:01:44,043][15401] Updated weights for policy 0, policy_version 327210 (0.0025) [2024-06-23 01:01:48,150][15401] Updated weights for policy 0, policy_version 327220 (0.0024) [2024-06-23 01:01:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5361172480. Throughput: 0: 42569.8. Samples: 5361330420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 01:01:51,579][15401] Updated weights for policy 0, policy_version 327230 (0.0032) [2024-06-23 01:01:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 5361369088. Throughput: 0: 42492.4. Samples: 5361457460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 01:01:55,646][15401] Updated weights for policy 0, policy_version 327240 (0.0029) [2024-06-23 01:01:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5361614848. Throughput: 0: 42679.1. Samples: 5361719940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:01:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 01:01:59,226][15401] Updated weights for policy 0, policy_version 327250 (0.0032) [2024-06-23 01:02:03,273][15401] Updated weights for policy 0, policy_version 327260 (0.0040) [2024-06-23 01:02:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5361827840. Throughput: 0: 42710.8. Samples: 5361979520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 01:02:06,706][15401] Updated weights for policy 0, policy_version 327270 (0.0034) [2024-06-23 01:02:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5362008064. Throughput: 0: 42848.2. Samples: 5362107980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 01:02:10,809][15401] Updated weights for policy 0, policy_version 327280 (0.0034) [2024-06-23 01:02:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 5362253824. Throughput: 0: 42861.9. Samples: 5362363400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 01:02:14,445][15401] Updated weights for policy 0, policy_version 327290 (0.0032) [2024-06-23 01:02:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5362450432. Throughput: 0: 42850.2. Samples: 5362624300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 01:02:18,849][15401] Updated weights for policy 0, policy_version 327300 (0.0044) [2024-06-23 01:02:22,537][15401] Updated weights for policy 0, policy_version 327310 (0.0029) [2024-06-23 01:02:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5362647040. Throughput: 0: 42884.8. Samples: 5362745820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 01:02:26,394][15401] Updated weights for policy 0, policy_version 327320 (0.0033) [2024-06-23 01:02:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 5362892800. Throughput: 0: 42821.4. Samples: 5363002620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 01:02:30,177][15401] Updated weights for policy 0, policy_version 327330 (0.0038) [2024-06-23 01:02:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5363089408. Throughput: 0: 43121.3. Samples: 5363270880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 01:02:33,881][15401] Updated weights for policy 0, policy_version 327340 (0.0032) [2024-06-23 01:02:37,651][15401] Updated weights for policy 0, policy_version 327350 (0.0037) [2024-06-23 01:02:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5363302400. Throughput: 0: 43032.0. Samples: 5363393900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 01:02:41,421][15401] Updated weights for policy 0, policy_version 327360 (0.0042) [2024-06-23 01:02:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5363548160. Throughput: 0: 42963.5. Samples: 5363653300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 01:02:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 01:02:45,955][15401] Updated weights for policy 0, policy_version 327370 (0.0045) [2024-06-23 01:02:48,345][15349] Signal inference workers to stop experience collection... (79350 times) [2024-06-23 01:02:48,346][15349] Signal inference workers to resume experience collection... (79350 times) [2024-06-23 01:02:48,369][15401] InferenceWorker_p0-w0: stopping experience collection (79350 times) [2024-06-23 01:02:48,369][15401] InferenceWorker_p0-w0: resuming experience collection (79350 times) [2024-06-23 01:02:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 5363728384. Throughput: 0: 43041.8. Samples: 5363916400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:02:48,400][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 01:02:48,994][15401] Updated weights for policy 0, policy_version 327380 (0.0025) [2024-06-23 01:02:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5363941376. Throughput: 0: 42774.5. Samples: 5364032840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:02:53,399][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 01:02:53,544][15401] Updated weights for policy 0, policy_version 327390 (0.0042) [2024-06-23 01:02:56,746][15401] Updated weights for policy 0, policy_version 327400 (0.0038) [2024-06-23 01:02:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5364170752. Throughput: 0: 42720.9. Samples: 5364285840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:02:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 01:03:01,191][15401] Updated weights for policy 0, policy_version 327410 (0.0042) [2024-06-23 01:03:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5364367360. Throughput: 0: 42802.3. Samples: 5364550400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:03,398][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 01:03:04,419][15401] Updated weights for policy 0, policy_version 327420 (0.0030) [2024-06-23 01:03:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5364580352. Throughput: 0: 42685.3. Samples: 5364666660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:08,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-23 01:03:08,770][15401] Updated weights for policy 0, policy_version 327430 (0.0029) [2024-06-23 01:03:12,659][15401] Updated weights for policy 0, policy_version 327440 (0.0041) [2024-06-23 01:03:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5364826112. Throughput: 0: 42694.2. Samples: 5364923860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 01:03:16,388][15401] Updated weights for policy 0, policy_version 327450 (0.0046) [2024-06-23 01:03:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5364989952. Throughput: 0: 42506.3. Samples: 5365183660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 01:03:20,546][15401] Updated weights for policy 0, policy_version 327460 (0.0034) [2024-06-23 01:03:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 5365219328. Throughput: 0: 42332.3. Samples: 5365298960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:23,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 01:03:24,236][15401] Updated weights for policy 0, policy_version 327470 (0.0028) [2024-06-23 01:03:28,159][15401] Updated weights for policy 0, policy_version 327480 (0.0033) [2024-06-23 01:03:28,396][15132] Fps is (10 sec: 45845.1, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 5365448704. Throughput: 0: 42247.7. Samples: 5365554720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:28,397][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 01:03:31,914][15401] Updated weights for policy 0, policy_version 327490 (0.0038) [2024-06-23 01:03:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5365612544. Throughput: 0: 42276.0. Samples: 5365818820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 01:03:35,877][15401] Updated weights for policy 0, policy_version 327500 (0.0033) [2024-06-23 01:03:38,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5365874688. Throughput: 0: 42336.0. Samples: 5365937960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 01:03:39,632][15401] Updated weights for policy 0, policy_version 327510 (0.0033) [2024-06-23 01:03:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5366054912. Throughput: 0: 42412.9. Samples: 5366194420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 01:03:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327519_5366071296.pth... [2024-06-23 01:03:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000326894_5355831296.pth [2024-06-23 01:03:43,640][15401] Updated weights for policy 0, policy_version 327520 (0.0032) [2024-06-23 01:03:47,887][15401] Updated weights for policy 0, policy_version 327530 (0.0028) [2024-06-23 01:03:48,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5366267904. Throughput: 0: 42308.6. Samples: 5366454280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 01:03:51,473][15401] Updated weights for policy 0, policy_version 327540 (0.0048) [2024-06-23 01:03:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5366497280. Throughput: 0: 42419.1. Samples: 5366575520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 01:03:55,416][15401] Updated weights for policy 0, policy_version 327550 (0.0036) [2024-06-23 01:03:58,390][15132] Fps is (10 sec: 44235.2, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 5366710272. Throughput: 0: 42265.1. Samples: 5366825800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:03:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 01:03:59,199][15401] Updated weights for policy 0, policy_version 327560 (0.0038) [2024-06-23 01:04:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5366890496. Throughput: 0: 42418.9. Samples: 5367092520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-23 01:04:03,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 01:04:03,442][15401] Updated weights for policy 0, policy_version 327570 (0.0030) [2024-06-23 01:04:07,035][15401] Updated weights for policy 0, policy_version 327580 (0.0032) [2024-06-23 01:04:08,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5367152640. Throughput: 0: 42530.3. Samples: 5367212720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 01:04:10,992][15349] Signal inference workers to stop experience collection... (79400 times) [2024-06-23 01:04:10,997][15349] Signal inference workers to resume experience collection... (79400 times) [2024-06-23 01:04:11,011][15401] InferenceWorker_p0-w0: stopping experience collection (79400 times) [2024-06-23 01:04:11,011][15401] InferenceWorker_p0-w0: resuming experience collection (79400 times) [2024-06-23 01:04:11,146][15401] Updated weights for policy 0, policy_version 327590 (0.0032) [2024-06-23 01:04:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5367349248. Throughput: 0: 42465.2. Samples: 5367465380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 01:04:14,862][15401] Updated weights for policy 0, policy_version 327600 (0.0036) [2024-06-23 01:04:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5367529472. Throughput: 0: 42488.5. Samples: 5367730800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:18,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 01:04:18,550][15401] Updated weights for policy 0, policy_version 327610 (0.0032) [2024-06-23 01:04:22,470][15401] Updated weights for policy 0, policy_version 327620 (0.0032) [2024-06-23 01:04:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 5367775232. Throughput: 0: 42626.4. Samples: 5367856140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:23,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 01:04:26,048][15401] Updated weights for policy 0, policy_version 327630 (0.0037) [2024-06-23 01:04:28,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 5368004608. Throughput: 0: 42604.8. Samples: 5368111640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:28,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 01:04:29,978][15401] Updated weights for policy 0, policy_version 327640 (0.0040) [2024-06-23 01:04:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5368201216. Throughput: 0: 42668.8. Samples: 5368374380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 01:04:33,843][15401] Updated weights for policy 0, policy_version 327650 (0.0029) [2024-06-23 01:04:37,721][15401] Updated weights for policy 0, policy_version 327660 (0.0035) [2024-06-23 01:04:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5368397824. Throughput: 0: 42780.3. Samples: 5368500640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 01:04:41,382][15401] Updated weights for policy 0, policy_version 327670 (0.0037) [2024-06-23 01:04:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5368643584. Throughput: 0: 42978.0. Samples: 5368759800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 01:04:45,499][15401] Updated weights for policy 0, policy_version 327680 (0.0036) [2024-06-23 01:04:48,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5368840192. Throughput: 0: 42593.1. Samples: 5369009200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 01:04:48,955][15401] Updated weights for policy 0, policy_version 327690 (0.0032) [2024-06-23 01:04:53,035][15401] Updated weights for policy 0, policy_version 327700 (0.0042) [2024-06-23 01:04:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5369053184. Throughput: 0: 42777.8. Samples: 5369137720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 01:04:56,470][15401] Updated weights for policy 0, policy_version 327710 (0.0023) [2024-06-23 01:04:58,390][15132] Fps is (10 sec: 45873.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5369298944. Throughput: 0: 42924.7. Samples: 5369397000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:04:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 01:05:00,642][15401] Updated weights for policy 0, policy_version 327720 (0.0044) [2024-06-23 01:05:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 5369495552. Throughput: 0: 42639.0. Samples: 5369649560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:05:03,399][15132] Avg episode reward: [(0, '0.192')] [2024-06-23 01:05:04,140][15401] Updated weights for policy 0, policy_version 327730 (0.0038) [2024-06-23 01:05:08,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5369675776. Throughput: 0: 42670.7. Samples: 5369776320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:05:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 01:05:08,613][15401] Updated weights for policy 0, policy_version 327740 (0.0040) [2024-06-23 01:05:11,644][15401] Updated weights for policy 0, policy_version 327750 (0.0036) [2024-06-23 01:05:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 5369937920. Throughput: 0: 42782.1. Samples: 5370036840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:05:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 01:05:14,320][15349] Signal inference workers to stop experience collection... (79450 times) [2024-06-23 01:05:14,373][15401] InferenceWorker_p0-w0: stopping experience collection (79450 times) [2024-06-23 01:05:14,382][15349] Signal inference workers to resume experience collection... (79450 times) [2024-06-23 01:05:14,387][15401] InferenceWorker_p0-w0: resuming experience collection (79450 times) [2024-06-23 01:05:16,185][15401] Updated weights for policy 0, policy_version 327760 (0.0057) [2024-06-23 01:05:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5370118144. Throughput: 0: 42554.6. Samples: 5370289340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:05:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 01:05:19,186][15401] Updated weights for policy 0, policy_version 327770 (0.0042) [2024-06-23 01:05:23,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5370314752. Throughput: 0: 42564.7. Samples: 5370416040. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 01:05:23,779][15401] Updated weights for policy 0, policy_version 327780 (0.0031) [2024-06-23 01:05:26,888][15401] Updated weights for policy 0, policy_version 327790 (0.0028) [2024-06-23 01:05:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5370560512. Throughput: 0: 42517.8. Samples: 5370673100. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:28,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 01:05:31,369][15401] Updated weights for policy 0, policy_version 327800 (0.0025) [2024-06-23 01:05:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5370757120. Throughput: 0: 42793.1. Samples: 5370935000. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:33,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 01:05:34,471][15401] Updated weights for policy 0, policy_version 327810 (0.0032) [2024-06-23 01:05:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5370970112. Throughput: 0: 42742.1. Samples: 5371061120. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 01:05:38,817][15401] Updated weights for policy 0, policy_version 327820 (0.0033) [2024-06-23 01:05:42,154][15401] Updated weights for policy 0, policy_version 327830 (0.0032) [2024-06-23 01:05:43,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 5371199488. Throughput: 0: 42501.0. Samples: 5371309640. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:43,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 01:05:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327832_5371199488.pth... [2024-06-23 01:05:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327208_5360975872.pth [2024-06-23 01:05:46,470][15401] Updated weights for policy 0, policy_version 327840 (0.0028) [2024-06-23 01:05:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5371379712. Throughput: 0: 42739.2. Samples: 5371572820. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 01:05:50,117][15401] Updated weights for policy 0, policy_version 327850 (0.0038) [2024-06-23 01:05:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5371625472. Throughput: 0: 42587.9. Samples: 5371692780. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:53,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 01:05:54,197][15401] Updated weights for policy 0, policy_version 327860 (0.0032) [2024-06-23 01:05:57,711][15401] Updated weights for policy 0, policy_version 327870 (0.0038) [2024-06-23 01:05:58,396][15132] Fps is (10 sec: 47483.2, 60 sec: 42594.0, 300 sec: 42653.0). Total num frames: 5371854848. Throughput: 0: 42503.0. Samples: 5371949740. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:05:58,396][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 01:06:01,951][15401] Updated weights for policy 0, policy_version 327880 (0.0026) [2024-06-23 01:06:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5372002304. Throughput: 0: 42705.7. Samples: 5372211100. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 01:06:05,637][15401] Updated weights for policy 0, policy_version 327890 (0.0028) [2024-06-23 01:06:08,389][15132] Fps is (10 sec: 40986.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5372264448. Throughput: 0: 42524.0. Samples: 5372329620. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 01:06:09,807][15401] Updated weights for policy 0, policy_version 327900 (0.0039) [2024-06-23 01:06:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 5372444672. Throughput: 0: 42523.1. Samples: 5372586640. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 01:06:13,546][15401] Updated weights for policy 0, policy_version 327910 (0.0027) [2024-06-23 01:06:17,297][15401] Updated weights for policy 0, policy_version 327920 (0.0030) [2024-06-23 01:06:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5372657664. Throughput: 0: 42433.9. Samples: 5372844420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:18,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 01:06:21,110][15401] Updated weights for policy 0, policy_version 327930 (0.0031) [2024-06-23 01:06:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5372903424. Throughput: 0: 42423.7. Samples: 5372970180. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:23,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 01:06:25,025][15401] Updated weights for policy 0, policy_version 327940 (0.0043) [2024-06-23 01:06:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5373100032. Throughput: 0: 42688.4. Samples: 5373230520. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:28,404][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 01:06:28,919][15401] Updated weights for policy 0, policy_version 327950 (0.0023) [2024-06-23 01:06:30,205][15349] Signal inference workers to stop experience collection... (79500 times) [2024-06-23 01:06:30,205][15349] Signal inference workers to resume experience collection... (79500 times) [2024-06-23 01:06:30,229][15401] InferenceWorker_p0-w0: stopping experience collection (79500 times) [2024-06-23 01:06:30,229][15401] InferenceWorker_p0-w0: resuming experience collection (79500 times) [2024-06-23 01:06:32,668][15401] Updated weights for policy 0, policy_version 327960 (0.0033) [2024-06-23 01:06:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 5373313024. Throughput: 0: 42504.8. Samples: 5373485540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-23 01:06:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 01:06:36,557][15401] Updated weights for policy 0, policy_version 327970 (0.0027) [2024-06-23 01:06:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5373542400. Throughput: 0: 42666.7. Samples: 5373612780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:06:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 01:06:40,251][15401] Updated weights for policy 0, policy_version 327980 (0.0030) [2024-06-23 01:06:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 5373739008. Throughput: 0: 42662.5. Samples: 5373869280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:06:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 01:06:44,253][15401] Updated weights for policy 0, policy_version 327990 (0.0028) [2024-06-23 01:06:48,258][15401] Updated weights for policy 0, policy_version 328000 (0.0047) [2024-06-23 01:06:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5373952000. Throughput: 0: 42513.3. Samples: 5374124200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:06:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 01:06:51,926][15401] Updated weights for policy 0, policy_version 328010 (0.0023) [2024-06-23 01:06:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 5374164992. Throughput: 0: 42769.7. Samples: 5374254260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:06:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 01:06:55,795][15401] Updated weights for policy 0, policy_version 328020 (0.0035) [2024-06-23 01:06:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42056.8, 300 sec: 42542.9). Total num frames: 5374377984. Throughput: 0: 42718.2. Samples: 5374508960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:06:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 01:06:59,544][15401] Updated weights for policy 0, policy_version 328030 (0.0042) [2024-06-23 01:07:03,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 5374590976. Throughput: 0: 42656.9. Samples: 5374764260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:03,397][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 01:07:03,565][15401] Updated weights for policy 0, policy_version 328040 (0.0041) [2024-06-23 01:07:07,216][15401] Updated weights for policy 0, policy_version 328050 (0.0025) [2024-06-23 01:07:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5374820352. Throughput: 0: 42744.4. Samples: 5374893680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 01:07:11,238][15401] Updated weights for policy 0, policy_version 328060 (0.0026) [2024-06-23 01:07:13,396][15132] Fps is (10 sec: 44237.2, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 5375033344. Throughput: 0: 42508.3. Samples: 5375143660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:13,396][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 01:07:15,037][15401] Updated weights for policy 0, policy_version 328070 (0.0031) [2024-06-23 01:07:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5375229952. Throughput: 0: 42635.7. Samples: 5375404140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 01:07:18,845][15401] Updated weights for policy 0, policy_version 328080 (0.0028) [2024-06-23 01:07:22,582][15401] Updated weights for policy 0, policy_version 328090 (0.0028) [2024-06-23 01:07:23,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5375459328. Throughput: 0: 42531.5. Samples: 5375526700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 01:07:26,976][15401] Updated weights for policy 0, policy_version 328100 (0.0034) [2024-06-23 01:07:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5375672320. Throughput: 0: 42516.3. Samples: 5375782520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 01:07:30,229][15401] Updated weights for policy 0, policy_version 328110 (0.0026) [2024-06-23 01:07:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5375868928. Throughput: 0: 42605.3. Samples: 5376041440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 01:07:34,411][15401] Updated weights for policy 0, policy_version 328120 (0.0036) [2024-06-23 01:07:38,154][15401] Updated weights for policy 0, policy_version 328130 (0.0041) [2024-06-23 01:07:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5376081920. Throughput: 0: 42507.1. Samples: 5376167080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 01:07:41,957][15401] Updated weights for policy 0, policy_version 328140 (0.0044) [2024-06-23 01:07:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5376311296. Throughput: 0: 42573.2. Samples: 5376424760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 01:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328144_5376311296.pth... [2024-06-23 01:07:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327519_5366071296.pth [2024-06-23 01:07:45,912][15401] Updated weights for policy 0, policy_version 328150 (0.0038) [2024-06-23 01:07:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5376507904. Throughput: 0: 42714.1. Samples: 5376686120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:07:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 01:07:49,625][15401] Updated weights for policy 0, policy_version 328160 (0.0038) [2024-06-23 01:07:53,302][15349] Signal inference workers to stop experience collection... (79550 times) [2024-06-23 01:07:53,308][15349] Signal inference workers to resume experience collection... (79550 times) [2024-06-23 01:07:53,313][15401] InferenceWorker_p0-w0: stopping experience collection (79550 times) [2024-06-23 01:07:53,344][15401] InferenceWorker_p0-w0: resuming experience collection (79550 times) [2024-06-23 01:07:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5376720896. Throughput: 0: 42575.1. Samples: 5376809560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:07:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 01:07:53,458][15401] Updated weights for policy 0, policy_version 328170 (0.0041) [2024-06-23 01:07:57,244][15401] Updated weights for policy 0, policy_version 328180 (0.0028) [2024-06-23 01:07:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5376950272. Throughput: 0: 42922.5. Samples: 5377074900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:07:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 01:08:01,059][15401] Updated weights for policy 0, policy_version 328190 (0.0023) [2024-06-23 01:08:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.1, 300 sec: 42653.9). Total num frames: 5377163264. Throughput: 0: 42793.3. Samples: 5377329840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 01:08:04,874][15401] Updated weights for policy 0, policy_version 328200 (0.0039) [2024-06-23 01:08:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5377359872. Throughput: 0: 42932.1. Samples: 5377458640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 01:08:08,631][15401] Updated weights for policy 0, policy_version 328210 (0.0037) [2024-06-23 01:08:12,394][15401] Updated weights for policy 0, policy_version 328220 (0.0030) [2024-06-23 01:08:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 5377589248. Throughput: 0: 42964.1. Samples: 5377715900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:13,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-23 01:08:16,466][15401] Updated weights for policy 0, policy_version 328230 (0.0036) [2024-06-23 01:08:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 5377802240. Throughput: 0: 42808.9. Samples: 5377967840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:18,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 01:08:19,822][15401] Updated weights for policy 0, policy_version 328240 (0.0037) [2024-06-23 01:08:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 5378015232. Throughput: 0: 42947.7. Samples: 5378099720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 01:08:23,988][15401] Updated weights for policy 0, policy_version 328250 (0.0030) [2024-06-23 01:08:27,316][15401] Updated weights for policy 0, policy_version 328260 (0.0028) [2024-06-23 01:08:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5378228224. Throughput: 0: 42966.3. Samples: 5378358240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:28,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-23 01:08:31,452][15401] Updated weights for policy 0, policy_version 328270 (0.0039) [2024-06-23 01:08:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5378457600. Throughput: 0: 42849.9. Samples: 5378614360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:33,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-23 01:08:35,097][15401] Updated weights for policy 0, policy_version 328280 (0.0036) [2024-06-23 01:08:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5378654208. Throughput: 0: 43104.9. Samples: 5378749280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 01:08:38,994][15401] Updated weights for policy 0, policy_version 328290 (0.0045) [2024-06-23 01:08:42,515][15401] Updated weights for policy 0, policy_version 328300 (0.0036) [2024-06-23 01:08:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5378883584. Throughput: 0: 42761.9. Samples: 5378999180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 01:08:46,447][15401] Updated weights for policy 0, policy_version 328310 (0.0031) [2024-06-23 01:08:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5379096576. Throughput: 0: 42924.8. Samples: 5379261460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 01:08:49,970][15401] Updated weights for policy 0, policy_version 328320 (0.0027) [2024-06-23 01:08:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5379293184. Throughput: 0: 42974.3. Samples: 5379392480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:53,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 01:08:54,235][15401] Updated weights for policy 0, policy_version 328330 (0.0040) [2024-06-23 01:08:57,623][15401] Updated weights for policy 0, policy_version 328340 (0.0034) [2024-06-23 01:08:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5379538944. Throughput: 0: 43006.6. Samples: 5379651200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:08:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 01:09:01,840][15401] Updated weights for policy 0, policy_version 328350 (0.0039) [2024-06-23 01:09:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5379735552. Throughput: 0: 43169.9. Samples: 5379910480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 01:09:03,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 01:09:03,766][15349] Signal inference workers to stop experience collection... (79600 times) [2024-06-23 01:09:03,772][15349] Signal inference workers to resume experience collection... (79600 times) [2024-06-23 01:09:03,803][15401] InferenceWorker_p0-w0: stopping experience collection (79600 times) [2024-06-23 01:09:03,804][15401] InferenceWorker_p0-w0: resuming experience collection (79600 times) [2024-06-23 01:09:05,424][15401] Updated weights for policy 0, policy_version 328360 (0.0036) [2024-06-23 01:09:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5379948544. Throughput: 0: 42984.4. Samples: 5380034020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 01:09:09,523][15401] Updated weights for policy 0, policy_version 328370 (0.0029) [2024-06-23 01:09:13,073][15401] Updated weights for policy 0, policy_version 328380 (0.0043) [2024-06-23 01:09:13,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5380177920. Throughput: 0: 42913.3. Samples: 5380289440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:13,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 01:09:17,161][15401] Updated weights for policy 0, policy_version 328390 (0.0029) [2024-06-23 01:09:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5380390912. Throughput: 0: 42922.3. Samples: 5380545860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:18,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 01:09:20,929][15401] Updated weights for policy 0, policy_version 328400 (0.0034) [2024-06-23 01:09:23,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5380587520. Throughput: 0: 42651.9. Samples: 5380668620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 01:09:24,902][15401] Updated weights for policy 0, policy_version 328410 (0.0029) [2024-06-23 01:09:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5380800512. Throughput: 0: 42882.8. Samples: 5380928900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:09:28,586][15401] Updated weights for policy 0, policy_version 328420 (0.0038) [2024-06-23 01:09:32,956][15401] Updated weights for policy 0, policy_version 328430 (0.0039) [2024-06-23 01:09:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5381029888. Throughput: 0: 42721.8. Samples: 5381183940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 01:09:36,310][15401] Updated weights for policy 0, policy_version 328440 (0.0037) [2024-06-23 01:09:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5381210112. Throughput: 0: 42589.3. Samples: 5381309000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 01:09:40,504][15401] Updated weights for policy 0, policy_version 328450 (0.0042) [2024-06-23 01:09:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5381439488. Throughput: 0: 42547.5. Samples: 5381565840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 01:09:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328458_5381455872.pth... [2024-06-23 01:09:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000327832_5371199488.pth [2024-06-23 01:09:43,950][15401] Updated weights for policy 0, policy_version 328460 (0.0031) [2024-06-23 01:09:48,144][15401] Updated weights for policy 0, policy_version 328470 (0.0036) [2024-06-23 01:09:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5381652480. Throughput: 0: 42497.0. Samples: 5381822840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 01:09:51,724][15401] Updated weights for policy 0, policy_version 328480 (0.0034) [2024-06-23 01:09:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5381865472. Throughput: 0: 42623.5. Samples: 5381952080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 01:09:55,833][15401] Updated weights for policy 0, policy_version 328490 (0.0026) [2024-06-23 01:09:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5382078464. Throughput: 0: 42546.3. Samples: 5382203920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:09:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 01:09:59,792][15401] Updated weights for policy 0, policy_version 328500 (0.0028) [2024-06-23 01:10:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5382291456. Throughput: 0: 42630.7. Samples: 5382464240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:10:03,397][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:10:03,498][15401] Updated weights for policy 0, policy_version 328510 (0.0026) [2024-06-23 01:10:07,591][15401] Updated weights for policy 0, policy_version 328520 (0.0036) [2024-06-23 01:10:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5382520832. Throughput: 0: 42781.8. Samples: 5382593800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:10:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:10:11,254][15401] Updated weights for policy 0, policy_version 328530 (0.0036) [2024-06-23 01:10:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 5382733824. Throughput: 0: 42658.9. Samples: 5382848560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:10:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 01:10:15,061][15401] Updated weights for policy 0, policy_version 328540 (0.0038) [2024-06-23 01:10:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 5382930432. Throughput: 0: 42684.0. Samples: 5383104820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 01:10:18,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 01:10:18,944][15401] Updated weights for policy 0, policy_version 328550 (0.0037) [2024-06-23 01:10:21,534][15349] Signal inference workers to stop experience collection... (79650 times) [2024-06-23 01:10:21,539][15349] Signal inference workers to resume experience collection... (79650 times) [2024-06-23 01:10:21,583][15401] InferenceWorker_p0-w0: stopping experience collection (79650 times) [2024-06-23 01:10:21,584][15401] InferenceWorker_p0-w0: resuming experience collection (79650 times) [2024-06-23 01:10:22,710][15401] Updated weights for policy 0, policy_version 328560 (0.0033) [2024-06-23 01:10:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5383159808. Throughput: 0: 42730.6. Samples: 5383231880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 01:10:26,675][15401] Updated weights for policy 0, policy_version 328570 (0.0032) [2024-06-23 01:10:28,389][15132] Fps is (10 sec: 45887.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 5383389184. Throughput: 0: 42749.0. Samples: 5383489540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 01:10:30,305][15401] Updated weights for policy 0, policy_version 328580 (0.0046) [2024-06-23 01:10:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5383585792. Throughput: 0: 42751.1. Samples: 5383746640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 01:10:34,391][15401] Updated weights for policy 0, policy_version 328590 (0.0051) [2024-06-23 01:10:38,037][15401] Updated weights for policy 0, policy_version 328600 (0.0033) [2024-06-23 01:10:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42709.9). Total num frames: 5383798784. Throughput: 0: 42640.6. Samples: 5383870900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 01:10:41,962][15401] Updated weights for policy 0, policy_version 328610 (0.0032) [2024-06-23 01:10:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 5384028160. Throughput: 0: 42906.2. Samples: 5384134700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 01:10:45,425][15401] Updated weights for policy 0, policy_version 328620 (0.0036) [2024-06-23 01:10:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5384224768. Throughput: 0: 42812.0. Samples: 5384390780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 01:10:49,453][15401] Updated weights for policy 0, policy_version 328630 (0.0040) [2024-06-23 01:10:52,884][15401] Updated weights for policy 0, policy_version 328640 (0.0044) [2024-06-23 01:10:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42654.9). Total num frames: 5384437760. Throughput: 0: 42717.0. Samples: 5384516060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 01:10:57,129][15401] Updated weights for policy 0, policy_version 328650 (0.0046) [2024-06-23 01:10:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5384650752. Throughput: 0: 42958.9. Samples: 5384781700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:10:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 01:11:00,687][15401] Updated weights for policy 0, policy_version 328660 (0.0036) [2024-06-23 01:11:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5384880128. Throughput: 0: 42954.0. Samples: 5385037640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 01:11:04,509][15401] Updated weights for policy 0, policy_version 328670 (0.0034) [2024-06-23 01:11:08,137][15401] Updated weights for policy 0, policy_version 328680 (0.0027) [2024-06-23 01:11:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5385093120. Throughput: 0: 43019.6. Samples: 5385167760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 01:11:12,114][15401] Updated weights for policy 0, policy_version 328690 (0.0028) [2024-06-23 01:11:13,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5385306112. Throughput: 0: 42965.7. Samples: 5385423000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 01:11:15,751][15401] Updated weights for policy 0, policy_version 328700 (0.0039) [2024-06-23 01:11:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 5385519104. Throughput: 0: 42952.4. Samples: 5385679500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 01:11:19,952][15401] Updated weights for policy 0, policy_version 328710 (0.0030) [2024-06-23 01:11:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 5385715712. Throughput: 0: 43061.4. Samples: 5385808660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 01:11:23,560][15401] Updated weights for policy 0, policy_version 328720 (0.0026) [2024-06-23 01:11:27,582][15401] Updated weights for policy 0, policy_version 328730 (0.0036) [2024-06-23 01:11:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5385945088. Throughput: 0: 42880.0. Samples: 5386064300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 01:11:31,643][15401] Updated weights for policy 0, policy_version 328740 (0.0036) [2024-06-23 01:11:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5386174464. Throughput: 0: 42813.9. Samples: 5386317400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:11:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 01:11:35,084][15401] Updated weights for policy 0, policy_version 328750 (0.0036) [2024-06-23 01:11:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5386354688. Throughput: 0: 42877.4. Samples: 5386445540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:11:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 01:11:39,114][15401] Updated weights for policy 0, policy_version 328760 (0.0030) [2024-06-23 01:11:42,591][15401] Updated weights for policy 0, policy_version 328770 (0.0045) [2024-06-23 01:11:43,393][15132] Fps is (10 sec: 40944.7, 60 sec: 42595.8, 300 sec: 42820.1). Total num frames: 5386584064. Throughput: 0: 42663.6. Samples: 5386701720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:11:43,394][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 01:11:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328771_5386584064.pth... [2024-06-23 01:11:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328144_5376311296.pth [2024-06-23 01:11:47,039][15401] Updated weights for policy 0, policy_version 328780 (0.0027) [2024-06-23 01:11:48,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 5386813440. Throughput: 0: 42625.8. Samples: 5386955900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:11:48,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 01:11:50,239][15401] Updated weights for policy 0, policy_version 328790 (0.0031) [2024-06-23 01:11:53,389][15132] Fps is (10 sec: 39336.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5386977280. Throughput: 0: 42584.5. Samples: 5387084060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:11:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 01:11:54,692][15401] Updated weights for policy 0, policy_version 328800 (0.0031) [2024-06-23 01:11:57,940][15401] Updated weights for policy 0, policy_version 328810 (0.0033) [2024-06-23 01:11:58,392][15132] Fps is (10 sec: 42597.7, 60 sec: 43142.8, 300 sec: 42876.7). Total num frames: 5387239424. Throughput: 0: 42463.5. Samples: 5387333960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:11:58,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 01:12:02,556][15401] Updated weights for policy 0, policy_version 328820 (0.0031) [2024-06-23 01:12:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5387436032. Throughput: 0: 42597.5. Samples: 5387596380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 01:12:03,422][15349] Signal inference workers to stop experience collection... (79700 times) [2024-06-23 01:12:03,423][15349] Signal inference workers to resume experience collection... (79700 times) [2024-06-23 01:12:03,460][15401] InferenceWorker_p0-w0: stopping experience collection (79700 times) [2024-06-23 01:12:03,460][15401] InferenceWorker_p0-w0: resuming experience collection (79700 times) [2024-06-23 01:12:05,651][15401] Updated weights for policy 0, policy_version 328830 (0.0033) [2024-06-23 01:12:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 5387632640. Throughput: 0: 42583.9. Samples: 5387724940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 01:12:10,029][15401] Updated weights for policy 0, policy_version 328840 (0.0042) [2024-06-23 01:12:13,101][15401] Updated weights for policy 0, policy_version 328850 (0.0023) [2024-06-23 01:12:13,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 5387878400. Throughput: 0: 42590.0. Samples: 5387980960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:13,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 01:12:17,732][15401] Updated weights for policy 0, policy_version 328860 (0.0043) [2024-06-23 01:12:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5388058624. Throughput: 0: 42724.0. Samples: 5388239980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:18,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 01:12:20,645][15401] Updated weights for policy 0, policy_version 328870 (0.0032) [2024-06-23 01:12:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5388288000. Throughput: 0: 42657.7. Samples: 5388365140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 01:12:25,365][15401] Updated weights for policy 0, policy_version 328880 (0.0036) [2024-06-23 01:12:28,149][15401] Updated weights for policy 0, policy_version 328890 (0.0033) [2024-06-23 01:12:28,392][15132] Fps is (10 sec: 47501.2, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 5388533760. Throughput: 0: 42801.1. Samples: 5388627720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:28,393][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 01:12:32,991][15401] Updated weights for policy 0, policy_version 328900 (0.0024) [2024-06-23 01:12:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5388730368. Throughput: 0: 42967.4. Samples: 5388889340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 01:12:35,813][15401] Updated weights for policy 0, policy_version 328910 (0.0044) [2024-06-23 01:12:38,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5388926976. Throughput: 0: 42883.0. Samples: 5389013800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 01:12:40,512][15401] Updated weights for policy 0, policy_version 328920 (0.0038) [2024-06-23 01:12:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43145.4, 300 sec: 42931.3). Total num frames: 5389172736. Throughput: 0: 43120.0. Samples: 5389274360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:43,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 01:12:43,699][15401] Updated weights for policy 0, policy_version 328930 (0.0028) [2024-06-23 01:12:48,011][15401] Updated weights for policy 0, policy_version 328940 (0.0034) [2024-06-23 01:12:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 5389369344. Throughput: 0: 43129.3. Samples: 5389537200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 01:12:51,104][15401] Updated weights for policy 0, policy_version 328950 (0.0047) [2024-06-23 01:12:53,389][15132] Fps is (10 sec: 39331.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5389565952. Throughput: 0: 42970.6. Samples: 5389658620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:12:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 01:12:55,576][15401] Updated weights for policy 0, policy_version 328960 (0.0027) [2024-06-23 01:12:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 5389811712. Throughput: 0: 43106.4. Samples: 5389920640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:12:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 01:12:58,548][15401] Updated weights for policy 0, policy_version 328970 (0.0028) [2024-06-23 01:13:03,277][15401] Updated weights for policy 0, policy_version 328980 (0.0028) [2024-06-23 01:13:03,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 5390008320. Throughput: 0: 43212.8. Samples: 5390184660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:03,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 01:13:06,166][15401] Updated weights for policy 0, policy_version 328990 (0.0042) [2024-06-23 01:13:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5390221312. Throughput: 0: 43032.0. Samples: 5390301580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 01:13:10,986][15401] Updated weights for policy 0, policy_version 329000 (0.0035) [2024-06-23 01:13:13,392][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42931.3). Total num frames: 5390467072. Throughput: 0: 43053.4. Samples: 5390565120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:13,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 01:13:13,687][15401] Updated weights for policy 0, policy_version 329010 (0.0040) [2024-06-23 01:13:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 5390647296. Throughput: 0: 43046.1. Samples: 5390826520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:18,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 01:13:18,567][15401] Updated weights for policy 0, policy_version 329020 (0.0031) [2024-06-23 01:13:21,733][15401] Updated weights for policy 0, policy_version 329030 (0.0040) [2024-06-23 01:13:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5390860288. Throughput: 0: 42985.8. Samples: 5390948160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 01:13:25,957][15401] Updated weights for policy 0, policy_version 329040 (0.0034) [2024-06-23 01:13:28,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5391089664. Throughput: 0: 42969.9. Samples: 5391207900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 01:13:29,335][15401] Updated weights for policy 0, policy_version 329050 (0.0037) [2024-06-23 01:13:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5391302656. Throughput: 0: 42879.1. Samples: 5391466760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 01:13:33,473][15401] Updated weights for policy 0, policy_version 329060 (0.0037) [2024-06-23 01:13:36,926][15401] Updated weights for policy 0, policy_version 329070 (0.0026) [2024-06-23 01:13:38,391][15132] Fps is (10 sec: 40954.1, 60 sec: 42870.5, 300 sec: 42764.8). Total num frames: 5391499264. Throughput: 0: 42976.9. Samples: 5391592640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:38,392][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 01:13:41,071][15401] Updated weights for policy 0, policy_version 329080 (0.0038) [2024-06-23 01:13:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5391728640. Throughput: 0: 42908.5. Samples: 5391851520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 01:13:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329086_5391745024.pth... [2024-06-23 01:13:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328458_5381455872.pth [2024-06-23 01:13:44,424][15401] Updated weights for policy 0, policy_version 329090 (0.0033) [2024-06-23 01:13:48,389][15132] Fps is (10 sec: 44243.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5391941632. Throughput: 0: 42743.6. Samples: 5392108020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 01:13:48,770][15401] Updated weights for policy 0, policy_version 329100 (0.0037) [2024-06-23 01:13:52,090][15401] Updated weights for policy 0, policy_version 329110 (0.0040) [2024-06-23 01:13:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5392154624. Throughput: 0: 42904.9. Samples: 5392232300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 01:13:56,357][15401] Updated weights for policy 0, policy_version 329120 (0.0028) [2024-06-23 01:13:58,054][15349] Signal inference workers to stop experience collection... (79750 times) [2024-06-23 01:13:58,054][15349] Signal inference workers to resume experience collection... (79750 times) [2024-06-23 01:13:58,091][15401] InferenceWorker_p0-w0: stopping experience collection (79750 times) [2024-06-23 01:13:58,092][15401] InferenceWorker_p0-w0: resuming experience collection (79750 times) [2024-06-23 01:13:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5392384000. Throughput: 0: 42924.5. Samples: 5392496620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:13:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 01:13:59,509][15401] Updated weights for policy 0, policy_version 329130 (0.0027) [2024-06-23 01:14:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 5392580608. Throughput: 0: 42919.3. Samples: 5392757780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:14:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 01:14:03,900][15401] Updated weights for policy 0, policy_version 329140 (0.0035) [2024-06-23 01:14:07,087][15401] Updated weights for policy 0, policy_version 329150 (0.0026) [2024-06-23 01:14:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.7, 300 sec: 42820.5). Total num frames: 5392809984. Throughput: 0: 42957.6. Samples: 5392881360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 01:14:08,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 01:14:12,043][15401] Updated weights for policy 0, policy_version 329160 (0.0025) [2024-06-23 01:14:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 5393022976. Throughput: 0: 42903.5. Samples: 5393138560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 01:14:15,107][15401] Updated weights for policy 0, policy_version 329170 (0.0031) [2024-06-23 01:14:18,389][15132] Fps is (10 sec: 39331.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 5393203200. Throughput: 0: 42913.0. Samples: 5393397840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 01:14:19,718][15401] Updated weights for policy 0, policy_version 329180 (0.0028) [2024-06-23 01:14:22,694][15401] Updated weights for policy 0, policy_version 329190 (0.0022) [2024-06-23 01:14:23,391][15132] Fps is (10 sec: 42592.8, 60 sec: 43143.6, 300 sec: 42875.9). Total num frames: 5393448960. Throughput: 0: 42687.2. Samples: 5393513560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:23,391][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 01:14:27,258][15401] Updated weights for policy 0, policy_version 329200 (0.0037) [2024-06-23 01:14:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5393661952. Throughput: 0: 42925.3. Samples: 5393783160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 01:14:30,353][15401] Updated weights for policy 0, policy_version 329210 (0.0026) [2024-06-23 01:14:33,389][15132] Fps is (10 sec: 40965.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5393858560. Throughput: 0: 42905.7. Samples: 5394038780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:33,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 01:14:35,243][15401] Updated weights for policy 0, policy_version 329220 (0.0032) [2024-06-23 01:14:38,058][15401] Updated weights for policy 0, policy_version 329230 (0.0038) [2024-06-23 01:14:38,396][15132] Fps is (10 sec: 44207.7, 60 sec: 43413.9, 300 sec: 42930.7). Total num frames: 5394104320. Throughput: 0: 42844.1. Samples: 5394160560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:38,396][15132] Avg episode reward: [(0, '0.258')] [2024-06-23 01:14:42,752][15401] Updated weights for policy 0, policy_version 329240 (0.0028) [2024-06-23 01:14:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5394317312. Throughput: 0: 42874.1. Samples: 5394425960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 01:14:45,812][15401] Updated weights for policy 0, policy_version 329250 (0.0045) [2024-06-23 01:14:48,389][15132] Fps is (10 sec: 39347.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5394497536. Throughput: 0: 42851.6. Samples: 5394686100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 01:14:50,099][15401] Updated weights for policy 0, policy_version 329260 (0.0035) [2024-06-23 01:14:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5394743296. Throughput: 0: 42817.9. Samples: 5394808060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:53,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-23 01:14:53,582][15401] Updated weights for policy 0, policy_version 329270 (0.0030) [2024-06-23 01:14:57,714][15401] Updated weights for policy 0, policy_version 329280 (0.0027) [2024-06-23 01:14:58,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5394972672. Throughput: 0: 42981.8. Samples: 5395072740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:14:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 01:15:01,266][15401] Updated weights for policy 0, policy_version 329290 (0.0033) [2024-06-23 01:15:03,391][15132] Fps is (10 sec: 40954.8, 60 sec: 42870.5, 300 sec: 42820.4). Total num frames: 5395152896. Throughput: 0: 42908.0. Samples: 5395328760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:15:03,391][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 01:15:04,994][15349] Signal inference workers to stop experience collection... (79800 times) [2024-06-23 01:15:04,994][15349] Signal inference workers to resume experience collection... (79800 times) [2024-06-23 01:15:05,017][15401] InferenceWorker_p0-w0: stopping experience collection (79800 times) [2024-06-23 01:15:05,017][15401] InferenceWorker_p0-w0: resuming experience collection (79800 times) [2024-06-23 01:15:05,312][15401] Updated weights for policy 0, policy_version 329300 (0.0024) [2024-06-23 01:15:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 5395382272. Throughput: 0: 43054.2. Samples: 5395450940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:15:08,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 01:15:08,908][15401] Updated weights for policy 0, policy_version 329310 (0.0046) [2024-06-23 01:15:12,999][15401] Updated weights for policy 0, policy_version 329320 (0.0032) [2024-06-23 01:15:13,396][15132] Fps is (10 sec: 44213.9, 60 sec: 42866.8, 300 sec: 42931.1). Total num frames: 5395595264. Throughput: 0: 42928.8. Samples: 5395715240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:15:13,396][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 01:15:16,884][15401] Updated weights for policy 0, policy_version 329330 (0.0033) [2024-06-23 01:15:18,396][15132] Fps is (10 sec: 42570.6, 60 sec: 43412.8, 300 sec: 42875.2). Total num frames: 5395808256. Throughput: 0: 42679.6. Samples: 5395959640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:15:18,397][15132] Avg episode reward: [(0, '0.214')] [2024-06-23 01:15:20,603][15401] Updated weights for policy 0, policy_version 329340 (0.0033) [2024-06-23 01:15:23,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42872.5, 300 sec: 42820.6). Total num frames: 5396021248. Throughput: 0: 42925.8. Samples: 5396091940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 01:15:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 01:15:24,418][15401] Updated weights for policy 0, policy_version 329350 (0.0035) [2024-06-23 01:15:28,167][15401] Updated weights for policy 0, policy_version 329360 (0.0029) [2024-06-23 01:15:28,389][15132] Fps is (10 sec: 42626.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5396234240. Throughput: 0: 42980.6. Samples: 5396360080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 01:15:31,967][15401] Updated weights for policy 0, policy_version 329370 (0.0038) [2024-06-23 01:15:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5396447232. Throughput: 0: 42827.4. Samples: 5396613340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 01:15:35,845][15401] Updated weights for policy 0, policy_version 329380 (0.0034) [2024-06-23 01:15:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 5396660224. Throughput: 0: 42898.3. Samples: 5396738480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 01:15:39,537][15401] Updated weights for policy 0, policy_version 329390 (0.0041) [2024-06-23 01:15:43,292][15401] Updated weights for policy 0, policy_version 329400 (0.0051) [2024-06-23 01:15:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5396889600. Throughput: 0: 42929.8. Samples: 5397004580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 01:15:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329400_5396889600.pth... [2024-06-23 01:15:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000328771_5386584064.pth [2024-06-23 01:15:47,493][15401] Updated weights for policy 0, policy_version 329410 (0.0037) [2024-06-23 01:15:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5397086208. Throughput: 0: 42703.0. Samples: 5397250340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 01:15:51,015][15401] Updated weights for policy 0, policy_version 329420 (0.0044) [2024-06-23 01:15:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5397315584. Throughput: 0: 42756.8. Samples: 5397375000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 01:15:55,219][15401] Updated weights for policy 0, policy_version 329430 (0.0038) [2024-06-23 01:15:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5397528576. Throughput: 0: 42698.7. Samples: 5397636400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:15:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 01:15:58,771][15401] Updated weights for policy 0, policy_version 329440 (0.0028) [2024-06-23 01:16:02,883][15401] Updated weights for policy 0, policy_version 329450 (0.0028) [2024-06-23 01:16:03,390][15132] Fps is (10 sec: 39319.8, 60 sec: 42598.9, 300 sec: 42764.9). Total num frames: 5397708800. Throughput: 0: 42977.7. Samples: 5397893380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:03,391][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 01:16:06,279][15401] Updated weights for policy 0, policy_version 329460 (0.0033) [2024-06-23 01:16:08,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5397954560. Throughput: 0: 42767.0. Samples: 5398016460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:08,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 01:16:10,564][15401] Updated weights for policy 0, policy_version 329470 (0.0045) [2024-06-23 01:16:13,389][15132] Fps is (10 sec: 45877.6, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 5398167552. Throughput: 0: 42647.0. Samples: 5398279200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:13,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 01:16:13,967][15401] Updated weights for policy 0, policy_version 329480 (0.0036) [2024-06-23 01:16:18,360][15401] Updated weights for policy 0, policy_version 329490 (0.0044) [2024-06-23 01:16:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 5398364160. Throughput: 0: 42839.6. Samples: 5398541120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 01:16:21,470][15401] Updated weights for policy 0, policy_version 329500 (0.0028) [2024-06-23 01:16:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 5398609920. Throughput: 0: 42736.3. Samples: 5398661720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:23,392][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 01:16:26,076][15401] Updated weights for policy 0, policy_version 329510 (0.0033) [2024-06-23 01:16:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5398790144. Throughput: 0: 42514.9. Samples: 5398917760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 01:16:29,397][15401] Updated weights for policy 0, policy_version 329520 (0.0050) [2024-06-23 01:16:31,239][15349] Signal inference workers to stop experience collection... (79850 times) [2024-06-23 01:16:31,288][15401] InferenceWorker_p0-w0: stopping experience collection (79850 times) [2024-06-23 01:16:31,297][15349] Signal inference workers to resume experience collection... (79850 times) [2024-06-23 01:16:31,304][15401] InferenceWorker_p0-w0: resuming experience collection (79850 times) [2024-06-23 01:16:33,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 5398986752. Throughput: 0: 42767.1. Samples: 5399174860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 01:16:33,819][15401] Updated weights for policy 0, policy_version 329530 (0.0032) [2024-06-23 01:16:36,887][15401] Updated weights for policy 0, policy_version 329540 (0.0037) [2024-06-23 01:16:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.6, 300 sec: 42876.3). Total num frames: 5399232512. Throughput: 0: 42848.4. Samples: 5399303280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 01:16:38,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 01:16:41,448][15401] Updated weights for policy 0, policy_version 329550 (0.0042) [2024-06-23 01:16:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5399445504. Throughput: 0: 42938.2. Samples: 5399568620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:16:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 01:16:44,242][15401] Updated weights for policy 0, policy_version 329560 (0.0028) [2024-06-23 01:16:48,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5399642112. Throughput: 0: 42794.4. Samples: 5399819100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:16:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 01:16:48,821][15401] Updated weights for policy 0, policy_version 329570 (0.0023) [2024-06-23 01:16:51,808][15401] Updated weights for policy 0, policy_version 329580 (0.0040) [2024-06-23 01:16:53,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42820.5). Total num frames: 5399871488. Throughput: 0: 42918.6. Samples: 5399947900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:16:53,392][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 01:16:56,358][15401] Updated weights for policy 0, policy_version 329590 (0.0039) [2024-06-23 01:16:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5400084480. Throughput: 0: 42870.8. Samples: 5400208380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:16:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 01:16:59,622][15401] Updated weights for policy 0, policy_version 329600 (0.0030) [2024-06-23 01:17:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.8, 300 sec: 42876.1). Total num frames: 5400281088. Throughput: 0: 42798.6. Samples: 5400467060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 01:17:04,487][15401] Updated weights for policy 0, policy_version 329610 (0.0030) [2024-06-23 01:17:07,083][15401] Updated weights for policy 0, policy_version 329620 (0.0046) [2024-06-23 01:17:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 5400526848. Throughput: 0: 42895.6. Samples: 5400591920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 01:17:12,062][15401] Updated weights for policy 0, policy_version 329630 (0.0028) [2024-06-23 01:17:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 5400723456. Throughput: 0: 43057.5. Samples: 5400855340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:13,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 01:17:14,730][15401] Updated weights for policy 0, policy_version 329640 (0.0025) [2024-06-23 01:17:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5400936448. Throughput: 0: 42964.9. Samples: 5401108280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 01:17:19,625][15401] Updated weights for policy 0, policy_version 329650 (0.0036) [2024-06-23 01:17:22,288][15401] Updated weights for policy 0, policy_version 329660 (0.0041) [2024-06-23 01:17:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.2, 300 sec: 42820.9). Total num frames: 5401165824. Throughput: 0: 42935.3. Samples: 5401235260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 01:17:27,166][15401] Updated weights for policy 0, policy_version 329670 (0.0039) [2024-06-23 01:17:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5401362432. Throughput: 0: 42942.2. Samples: 5401501020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 01:17:29,822][15401] Updated weights for policy 0, policy_version 329680 (0.0043) [2024-06-23 01:17:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5401575424. Throughput: 0: 43055.5. Samples: 5401756600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 01:17:34,612][15401] Updated weights for policy 0, policy_version 329690 (0.0036) [2024-06-23 01:17:37,730][15401] Updated weights for policy 0, policy_version 329700 (0.0027) [2024-06-23 01:17:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.3, 300 sec: 42876.4). Total num frames: 5401821184. Throughput: 0: 43026.8. Samples: 5401884000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 01:17:42,354][15401] Updated weights for policy 0, policy_version 329710 (0.0026) [2024-06-23 01:17:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5402017792. Throughput: 0: 43140.0. Samples: 5402149680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 01:17:43,484][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329714_5402034176.pth... [2024-06-23 01:17:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329086_5391745024.pth [2024-06-23 01:17:45,105][15401] Updated weights for policy 0, policy_version 329720 (0.0023) [2024-06-23 01:17:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5402230784. Throughput: 0: 43099.1. Samples: 5402406520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:48,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 01:17:50,014][15401] Updated weights for policy 0, policy_version 329730 (0.0042) [2024-06-23 01:17:52,516][15349] Signal inference workers to stop experience collection... (79900 times) [2024-06-23 01:17:52,517][15349] Signal inference workers to resume experience collection... (79900 times) [2024-06-23 01:17:52,565][15401] InferenceWorker_p0-w0: stopping experience collection (79900 times) [2024-06-23 01:17:52,566][15401] InferenceWorker_p0-w0: resuming experience collection (79900 times) [2024-06-23 01:17:52,659][15401] Updated weights for policy 0, policy_version 329740 (0.0054) [2024-06-23 01:17:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.4, 300 sec: 42876.1). Total num frames: 5402460160. Throughput: 0: 43207.7. Samples: 5402536260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 01:17:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 01:17:57,731][15401] Updated weights for policy 0, policy_version 329750 (0.0030) [2024-06-23 01:17:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 5402673152. Throughput: 0: 43265.3. Samples: 5402802280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:17:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 01:18:00,450][15401] Updated weights for policy 0, policy_version 329760 (0.0033) [2024-06-23 01:18:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 5402886144. Throughput: 0: 43150.7. Samples: 5403050060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 01:18:05,124][15401] Updated weights for policy 0, policy_version 329770 (0.0033) [2024-06-23 01:18:07,939][15401] Updated weights for policy 0, policy_version 329780 (0.0039) [2024-06-23 01:18:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 5403115520. Throughput: 0: 43354.6. Samples: 5403186220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 01:18:12,626][15401] Updated weights for policy 0, policy_version 329790 (0.0042) [2024-06-23 01:18:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 5403295744. Throughput: 0: 43210.1. Samples: 5403445480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 01:18:15,510][15401] Updated weights for policy 0, policy_version 329800 (0.0032) [2024-06-23 01:18:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5403541504. Throughput: 0: 43081.3. Samples: 5403695260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 01:18:20,068][15401] Updated weights for policy 0, policy_version 329810 (0.0041) [2024-06-23 01:18:23,063][15401] Updated weights for policy 0, policy_version 329820 (0.0034) [2024-06-23 01:18:23,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5403770880. Throughput: 0: 43318.3. Samples: 5403833320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 01:18:27,504][15401] Updated weights for policy 0, policy_version 329830 (0.0030) [2024-06-23 01:18:28,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 5403951104. Throughput: 0: 43064.7. Samples: 5404087700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:28,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 01:18:30,894][15401] Updated weights for policy 0, policy_version 329840 (0.0029) [2024-06-23 01:18:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42987.4). Total num frames: 5404180480. Throughput: 0: 42915.7. Samples: 5404337720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 01:18:34,872][15401] Updated weights for policy 0, policy_version 329850 (0.0030) [2024-06-23 01:18:38,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5404409856. Throughput: 0: 43110.6. Samples: 5404476240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 01:18:38,536][15401] Updated weights for policy 0, policy_version 329860 (0.0042) [2024-06-23 01:18:42,476][15401] Updated weights for policy 0, policy_version 329870 (0.0040) [2024-06-23 01:18:43,392][15132] Fps is (10 sec: 42587.4, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 5404606464. Throughput: 0: 42806.0. Samples: 5404728660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:43,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 01:18:46,638][15401] Updated weights for policy 0, policy_version 329880 (0.0034) [2024-06-23 01:18:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 5404835840. Throughput: 0: 42911.6. Samples: 5404981080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:48,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 01:18:50,227][15401] Updated weights for policy 0, policy_version 329890 (0.0025) [2024-06-23 01:18:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5405032448. Throughput: 0: 42900.4. Samples: 5405116740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 01:18:54,226][15401] Updated weights for policy 0, policy_version 329900 (0.0053) [2024-06-23 01:18:57,657][15349] Signal inference workers to stop experience collection... (79950 times) [2024-06-23 01:18:57,712][15401] InferenceWorker_p0-w0: stopping experience collection (79950 times) [2024-06-23 01:18:57,713][15349] Signal inference workers to resume experience collection... (79950 times) [2024-06-23 01:18:57,726][15401] InferenceWorker_p0-w0: resuming experience collection (79950 times) [2024-06-23 01:18:57,867][15401] Updated weights for policy 0, policy_version 329910 (0.0031) [2024-06-23 01:18:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5405261824. Throughput: 0: 42792.0. Samples: 5405371120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:18:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 01:19:01,758][15401] Updated weights for policy 0, policy_version 329920 (0.0042) [2024-06-23 01:19:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 5405491200. Throughput: 0: 42967.6. Samples: 5405628800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:19:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 01:19:05,569][15401] Updated weights for policy 0, policy_version 329930 (0.0028) [2024-06-23 01:19:08,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 5405655040. Throughput: 0: 42812.2. Samples: 5405759980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 01:19:08,393][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 01:19:09,418][15401] Updated weights for policy 0, policy_version 329940 (0.0031) [2024-06-23 01:19:13,096][15401] Updated weights for policy 0, policy_version 329950 (0.0033) [2024-06-23 01:19:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5405900800. Throughput: 0: 42765.9. Samples: 5406012060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 01:19:17,344][15401] Updated weights for policy 0, policy_version 329960 (0.0041) [2024-06-23 01:19:18,389][15132] Fps is (10 sec: 47525.3, 60 sec: 43144.5, 300 sec: 42987.4). Total num frames: 5406130176. Throughput: 0: 42903.4. Samples: 5406268380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 01:19:20,523][15401] Updated weights for policy 0, policy_version 329970 (0.0032) [2024-06-23 01:19:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 5406294016. Throughput: 0: 42810.7. Samples: 5406402720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:23,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 01:19:24,643][15401] Updated weights for policy 0, policy_version 329980 (0.0037) [2024-06-23 01:19:28,031][15401] Updated weights for policy 0, policy_version 329990 (0.0041) [2024-06-23 01:19:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43419.4, 300 sec: 43042.7). Total num frames: 5406556160. Throughput: 0: 42886.4. Samples: 5406658440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 01:19:32,266][15401] Updated weights for policy 0, policy_version 330000 (0.0031) [2024-06-23 01:19:33,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42877.0). Total num frames: 5406752768. Throughput: 0: 43073.2. Samples: 5406919380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 01:19:35,691][15401] Updated weights for policy 0, policy_version 330010 (0.0033) [2024-06-23 01:19:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5406949376. Throughput: 0: 42833.7. Samples: 5407044260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 01:19:39,933][15401] Updated weights for policy 0, policy_version 330020 (0.0044) [2024-06-23 01:19:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 5407195136. Throughput: 0: 42842.7. Samples: 5407299040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 01:19:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330030_5407211520.pth... [2024-06-23 01:19:43,528][15401] Updated weights for policy 0, policy_version 330030 (0.0048) [2024-06-23 01:19:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329400_5396889600.pth [2024-06-23 01:19:48,020][15401] Updated weights for policy 0, policy_version 330040 (0.0045) [2024-06-23 01:19:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5407391744. Throughput: 0: 42916.4. Samples: 5407560040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 01:19:51,100][15401] Updated weights for policy 0, policy_version 330050 (0.0040) [2024-06-23 01:19:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5407588352. Throughput: 0: 42734.8. Samples: 5407682940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 01:19:55,565][15401] Updated weights for policy 0, policy_version 330060 (0.0035) [2024-06-23 01:19:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.4). Total num frames: 5407834112. Throughput: 0: 43044.0. Samples: 5407949040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:19:58,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 01:19:58,838][15401] Updated weights for policy 0, policy_version 330070 (0.0035) [2024-06-23 01:20:03,121][15401] Updated weights for policy 0, policy_version 330080 (0.0032) [2024-06-23 01:20:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5408030720. Throughput: 0: 43198.2. Samples: 5408212300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 01:20:06,309][15401] Updated weights for policy 0, policy_version 330090 (0.0047) [2024-06-23 01:20:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.3, 300 sec: 42877.0). Total num frames: 5408243712. Throughput: 0: 42950.1. Samples: 5408335480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 01:20:10,571][15401] Updated weights for policy 0, policy_version 330100 (0.0038) [2024-06-23 01:20:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42988.1). Total num frames: 5408489472. Throughput: 0: 42997.7. Samples: 5408593340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 01:20:14,465][15401] Updated weights for policy 0, policy_version 330110 (0.0033) [2024-06-23 01:20:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5408669696. Throughput: 0: 42884.6. Samples: 5408849180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 01:20:18,535][15401] Updated weights for policy 0, policy_version 330120 (0.0034) [2024-06-23 01:20:21,943][15401] Updated weights for policy 0, policy_version 330130 (0.0033) [2024-06-23 01:20:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 5408899072. Throughput: 0: 42940.0. Samples: 5408976660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:23,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 01:20:26,121][15401] Updated weights for policy 0, policy_version 330140 (0.0033) [2024-06-23 01:20:27,215][15349] Signal inference workers to stop experience collection... (80000 times) [2024-06-23 01:20:27,215][15349] Signal inference workers to resume experience collection... (80000 times) [2024-06-23 01:20:27,229][15401] InferenceWorker_p0-w0: stopping experience collection (80000 times) [2024-06-23 01:20:27,229][15401] InferenceWorker_p0-w0: resuming experience collection (80000 times) [2024-06-23 01:20:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5409112064. Throughput: 0: 43021.4. Samples: 5409235000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 01:20:29,524][15401] Updated weights for policy 0, policy_version 330150 (0.0027) [2024-06-23 01:20:33,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5409308672. Throughput: 0: 42904.9. Samples: 5409490760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 01:20:33,831][15401] Updated weights for policy 0, policy_version 330160 (0.0030) [2024-06-23 01:20:37,043][15401] Updated weights for policy 0, policy_version 330170 (0.0030) [2024-06-23 01:20:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5409521664. Throughput: 0: 42932.3. Samples: 5409614900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 01:20:41,588][15401] Updated weights for policy 0, policy_version 330180 (0.0038) [2024-06-23 01:20:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5409767424. Throughput: 0: 42660.4. Samples: 5409868760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 01:20:45,241][15401] Updated weights for policy 0, policy_version 330190 (0.0031) [2024-06-23 01:20:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5409947648. Throughput: 0: 42493.8. Samples: 5410124520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 01:20:49,324][15401] Updated weights for policy 0, policy_version 330200 (0.0038) [2024-06-23 01:20:52,892][15401] Updated weights for policy 0, policy_version 330210 (0.0030) [2024-06-23 01:20:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5410160640. Throughput: 0: 42440.5. Samples: 5410245300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 01:20:57,097][15401] Updated weights for policy 0, policy_version 330220 (0.0034) [2024-06-23 01:20:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42931.7). Total num frames: 5410373632. Throughput: 0: 42399.1. Samples: 5410501300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:20:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 01:21:00,417][15401] Updated weights for policy 0, policy_version 330230 (0.0029) [2024-06-23 01:21:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5410570240. Throughput: 0: 42437.2. Samples: 5410758860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:03,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 01:21:04,862][15401] Updated weights for policy 0, policy_version 330240 (0.0025) [2024-06-23 01:21:08,296][15401] Updated weights for policy 0, policy_version 330250 (0.0041) [2024-06-23 01:21:08,391][15132] Fps is (10 sec: 44232.0, 60 sec: 42870.7, 300 sec: 42875.9). Total num frames: 5410816000. Throughput: 0: 42308.8. Samples: 5410880500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:08,391][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 01:21:12,417][15401] Updated weights for policy 0, policy_version 330260 (0.0025) [2024-06-23 01:21:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42820.6). Total num frames: 5410996224. Throughput: 0: 42339.6. Samples: 5411140280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 01:21:16,391][15401] Updated weights for policy 0, policy_version 330270 (0.0034) [2024-06-23 01:21:18,389][15132] Fps is (10 sec: 40964.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5411225600. Throughput: 0: 42341.9. Samples: 5411396140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 01:21:20,043][15401] Updated weights for policy 0, policy_version 330280 (0.0037) [2024-06-23 01:21:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 5411438592. Throughput: 0: 42358.2. Samples: 5411521020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 01:21:23,945][15401] Updated weights for policy 0, policy_version 330290 (0.0032) [2024-06-23 01:21:27,736][15401] Updated weights for policy 0, policy_version 330300 (0.0040) [2024-06-23 01:21:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5411651584. Throughput: 0: 42509.3. Samples: 5411781680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 01:21:31,651][15401] Updated weights for policy 0, policy_version 330310 (0.0034) [2024-06-23 01:21:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5411864576. Throughput: 0: 42372.8. Samples: 5412031300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 01:21:35,460][15401] Updated weights for policy 0, policy_version 330320 (0.0038) [2024-06-23 01:21:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5412093952. Throughput: 0: 42507.4. Samples: 5412158140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 01:21:39,127][15349] Signal inference workers to stop experience collection... (80050 times) [2024-06-23 01:21:39,179][15401] InferenceWorker_p0-w0: stopping experience collection (80050 times) [2024-06-23 01:21:39,185][15349] Signal inference workers to resume experience collection... (80050 times) [2024-06-23 01:21:39,201][15401] InferenceWorker_p0-w0: resuming experience collection (80050 times) [2024-06-23 01:21:39,327][15401] Updated weights for policy 0, policy_version 330330 (0.0029) [2024-06-23 01:21:43,267][15401] Updated weights for policy 0, policy_version 330340 (0.0046) [2024-06-23 01:21:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 5412290560. Throughput: 0: 42697.8. Samples: 5412422700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 01:21:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 01:21:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330340_5412290560.pth... [2024-06-23 01:21:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000329714_5402034176.pth [2024-06-23 01:21:46,852][15401] Updated weights for policy 0, policy_version 330350 (0.0036) [2024-06-23 01:21:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 5412503552. Throughput: 0: 42448.4. Samples: 5412669040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:21:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 01:21:51,049][15401] Updated weights for policy 0, policy_version 330360 (0.0036) [2024-06-23 01:21:53,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 5412732928. Throughput: 0: 42649.2. Samples: 5412799940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:21:53,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 01:21:54,658][15401] Updated weights for policy 0, policy_version 330370 (0.0022) [2024-06-23 01:21:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 5412913152. Throughput: 0: 42709.7. Samples: 5413062220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:21:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 01:21:58,932][15401] Updated weights for policy 0, policy_version 330380 (0.0032) [2024-06-23 01:22:02,170][15401] Updated weights for policy 0, policy_version 330390 (0.0033) [2024-06-23 01:22:03,390][15132] Fps is (10 sec: 42624.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5413158912. Throughput: 0: 42487.8. Samples: 5413308100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:03,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 01:22:06,601][15401] Updated weights for policy 0, policy_version 330400 (0.0033) [2024-06-23 01:22:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42872.2, 300 sec: 42931.6). Total num frames: 5413388288. Throughput: 0: 42802.7. Samples: 5413447140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:08,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 01:22:09,700][15401] Updated weights for policy 0, policy_version 330410 (0.0023) [2024-06-23 01:22:13,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 5413552128. Throughput: 0: 42515.0. Samples: 5413694960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:13,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 01:22:14,200][15401] Updated weights for policy 0, policy_version 330420 (0.0032) [2024-06-23 01:22:17,626][15401] Updated weights for policy 0, policy_version 330430 (0.0035) [2024-06-23 01:22:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5413797888. Throughput: 0: 42534.3. Samples: 5413945340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 01:22:21,939][15401] Updated weights for policy 0, policy_version 330440 (0.0033) [2024-06-23 01:22:23,392][15132] Fps is (10 sec: 45875.4, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 5414010880. Throughput: 0: 42752.5. Samples: 5414082100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:23,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 01:22:24,993][15401] Updated weights for policy 0, policy_version 330450 (0.0033) [2024-06-23 01:22:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5414191104. Throughput: 0: 42584.0. Samples: 5414338980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 01:22:29,490][15401] Updated weights for policy 0, policy_version 330460 (0.0040) [2024-06-23 01:22:32,310][15401] Updated weights for policy 0, policy_version 330470 (0.0036) [2024-06-23 01:22:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5414436864. Throughput: 0: 42821.0. Samples: 5414595980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 01:22:36,930][15401] Updated weights for policy 0, policy_version 330480 (0.0040) [2024-06-23 01:22:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5414666240. Throughput: 0: 42996.3. Samples: 5414734500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 01:22:39,690][15401] Updated weights for policy 0, policy_version 330490 (0.0035) [2024-06-23 01:22:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5414846464. Throughput: 0: 42811.6. Samples: 5414988740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 01:22:44,418][15401] Updated weights for policy 0, policy_version 330500 (0.0024) [2024-06-23 01:22:47,483][15401] Updated weights for policy 0, policy_version 330510 (0.0040) [2024-06-23 01:22:48,391][15132] Fps is (10 sec: 42590.0, 60 sec: 43143.2, 300 sec: 42820.3). Total num frames: 5415092224. Throughput: 0: 42968.5. Samples: 5415241760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:48,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 01:22:51,934][15401] Updated weights for policy 0, policy_version 330520 (0.0033) [2024-06-23 01:22:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 5415288832. Throughput: 0: 42975.6. Samples: 5415381040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:53,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 01:22:55,105][15401] Updated weights for policy 0, policy_version 330530 (0.0035) [2024-06-23 01:22:58,389][15132] Fps is (10 sec: 40968.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5415501824. Throughput: 0: 43175.3. Samples: 5415637740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 01:22:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 01:22:58,672][15349] Signal inference workers to stop experience collection... (80100 times) [2024-06-23 01:22:58,695][15401] InferenceWorker_p0-w0: stopping experience collection (80100 times) [2024-06-23 01:22:58,733][15349] Signal inference workers to resume experience collection... (80100 times) [2024-06-23 01:22:58,733][15401] InferenceWorker_p0-w0: resuming experience collection (80100 times) [2024-06-23 01:22:59,453][15401] Updated weights for policy 0, policy_version 330540 (0.0032) [2024-06-23 01:23:02,636][15401] Updated weights for policy 0, policy_version 330550 (0.0030) [2024-06-23 01:23:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5415731200. Throughput: 0: 43324.0. Samples: 5415894920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 01:23:07,123][15401] Updated weights for policy 0, policy_version 330560 (0.0031) [2024-06-23 01:23:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5415944192. Throughput: 0: 43279.7. Samples: 5416029580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 01:23:10,125][15401] Updated weights for policy 0, policy_version 330570 (0.0034) [2024-06-23 01:23:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 5416157184. Throughput: 0: 43348.9. Samples: 5416289680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:13,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 01:23:14,845][15401] Updated weights for policy 0, policy_version 330580 (0.0031) [2024-06-23 01:23:17,771][15401] Updated weights for policy 0, policy_version 330590 (0.0046) [2024-06-23 01:23:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5416386560. Throughput: 0: 42992.5. Samples: 5416530640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:18,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 01:23:22,352][15401] Updated weights for policy 0, policy_version 330600 (0.0034) [2024-06-23 01:23:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 5416566784. Throughput: 0: 42925.7. Samples: 5416666160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 01:23:25,405][15401] Updated weights for policy 0, policy_version 330610 (0.0032) [2024-06-23 01:23:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5416796160. Throughput: 0: 43080.1. Samples: 5416927340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 01:23:29,944][15401] Updated weights for policy 0, policy_version 330620 (0.0045) [2024-06-23 01:23:32,803][15401] Updated weights for policy 0, policy_version 330630 (0.0049) [2024-06-23 01:23:33,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 5417058304. Throughput: 0: 43001.5. Samples: 5417176740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 01:23:37,943][15401] Updated weights for policy 0, policy_version 330640 (0.0026) [2024-06-23 01:23:38,390][15132] Fps is (10 sec: 42594.5, 60 sec: 42597.8, 300 sec: 42765.2). Total num frames: 5417222144. Throughput: 0: 43062.3. Samples: 5417318880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:38,391][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 01:23:40,755][15401] Updated weights for policy 0, policy_version 330650 (0.0038) [2024-06-23 01:23:43,389][15132] Fps is (10 sec: 37682.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5417435136. Throughput: 0: 42927.1. Samples: 5417569460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 01:23:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330655_5417451520.pth... [2024-06-23 01:23:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330030_5407211520.pth [2024-06-23 01:23:45,879][15401] Updated weights for policy 0, policy_version 330660 (0.0027) [2024-06-23 01:23:48,321][15401] Updated weights for policy 0, policy_version 330670 (0.0028) [2024-06-23 01:23:48,396][15132] Fps is (10 sec: 47487.4, 60 sec: 43414.4, 300 sec: 42930.7). Total num frames: 5417697280. Throughput: 0: 42765.1. Samples: 5417819620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:48,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 01:23:52,080][15349] Signal inference workers to stop experience collection... (80150 times) [2024-06-23 01:23:52,117][15401] InferenceWorker_p0-w0: stopping experience collection (80150 times) [2024-06-23 01:23:52,194][15349] Signal inference workers to resume experience collection... (80150 times) [2024-06-23 01:23:52,194][15401] InferenceWorker_p0-w0: resuming experience collection (80150 times) [2024-06-23 01:23:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5417844736. Throughput: 0: 42737.3. Samples: 5417952760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 01:23:53,522][15401] Updated weights for policy 0, policy_version 330680 (0.0030) [2024-06-23 01:23:55,899][15401] Updated weights for policy 0, policy_version 330690 (0.0030) [2024-06-23 01:23:58,390][15132] Fps is (10 sec: 39346.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5418090496. Throughput: 0: 42715.0. Samples: 5418211860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:23:58,401][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 01:24:00,902][15401] Updated weights for policy 0, policy_version 330700 (0.0043) [2024-06-23 01:24:03,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 5418336256. Throughput: 0: 43101.2. Samples: 5418470200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:24:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 01:24:03,593][15401] Updated weights for policy 0, policy_version 330710 (0.0036) [2024-06-23 01:24:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5418500096. Throughput: 0: 42985.7. Samples: 5418600520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:24:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 01:24:08,607][15401] Updated weights for policy 0, policy_version 330720 (0.0039) [2024-06-23 01:24:11,251][15401] Updated weights for policy 0, policy_version 330730 (0.0037) [2024-06-23 01:24:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5418729472. Throughput: 0: 42901.8. Samples: 5418857920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 01:24:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 01:24:16,025][15401] Updated weights for policy 0, policy_version 330740 (0.0039) [2024-06-23 01:24:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5418942464. Throughput: 0: 43036.4. Samples: 5419113380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 01:24:19,339][15401] Updated weights for policy 0, policy_version 330750 (0.0044) [2024-06-23 01:24:23,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 5419155456. Throughput: 0: 42687.8. Samples: 5419239900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:23,392][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 01:24:23,710][15401] Updated weights for policy 0, policy_version 330760 (0.0027) [2024-06-23 01:24:27,125][15401] Updated weights for policy 0, policy_version 330770 (0.0036) [2024-06-23 01:24:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5419384832. Throughput: 0: 42761.3. Samples: 5419493720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 01:24:31,270][15401] Updated weights for policy 0, policy_version 330780 (0.0034) [2024-06-23 01:24:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 5419581440. Throughput: 0: 42895.8. Samples: 5419749660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:33,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 01:24:34,804][15401] Updated weights for policy 0, policy_version 330790 (0.0036) [2024-06-23 01:24:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42599.0, 300 sec: 42653.9). Total num frames: 5419778048. Throughput: 0: 42850.6. Samples: 5419881040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 01:24:39,034][15401] Updated weights for policy 0, policy_version 330800 (0.0033) [2024-06-23 01:24:42,556][15401] Updated weights for policy 0, policy_version 330810 (0.0034) [2024-06-23 01:24:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5420023808. Throughput: 0: 42771.5. Samples: 5420136580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 01:24:46,553][15401] Updated weights for policy 0, policy_version 330820 (0.0036) [2024-06-23 01:24:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 5420236800. Throughput: 0: 42736.1. Samples: 5420393320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 01:24:50,147][15401] Updated weights for policy 0, policy_version 330830 (0.0041) [2024-06-23 01:24:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5420433408. Throughput: 0: 42673.5. Samples: 5420520820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:53,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 01:24:54,507][15401] Updated weights for policy 0, policy_version 330840 (0.0046) [2024-06-23 01:24:57,671][15401] Updated weights for policy 0, policy_version 330850 (0.0047) [2024-06-23 01:24:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5420679168. Throughput: 0: 42702.5. Samples: 5420779540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:24:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 01:25:01,986][15401] Updated weights for policy 0, policy_version 330860 (0.0027) [2024-06-23 01:25:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5420892160. Throughput: 0: 42671.5. Samples: 5421033600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 01:25:05,389][15401] Updated weights for policy 0, policy_version 330870 (0.0032) [2024-06-23 01:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5421088768. Throughput: 0: 42793.4. Samples: 5421165500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 01:25:09,385][15401] Updated weights for policy 0, policy_version 330880 (0.0028) [2024-06-23 01:25:12,643][15349] Signal inference workers to stop experience collection... (80200 times) [2024-06-23 01:25:12,644][15349] Signal inference workers to resume experience collection... (80200 times) [2024-06-23 01:25:12,692][15401] InferenceWorker_p0-w0: stopping experience collection (80200 times) [2024-06-23 01:25:12,692][15401] InferenceWorker_p0-w0: resuming experience collection (80200 times) [2024-06-23 01:25:12,943][15401] Updated weights for policy 0, policy_version 330890 (0.0026) [2024-06-23 01:25:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5421318144. Throughput: 0: 43145.4. Samples: 5421435260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 01:25:17,287][15401] Updated weights for policy 0, policy_version 330900 (0.0030) [2024-06-23 01:25:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 5421531136. Throughput: 0: 43034.2. Samples: 5421686200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 01:25:20,835][15401] Updated weights for policy 0, policy_version 330910 (0.0035) [2024-06-23 01:25:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 5421727744. Throughput: 0: 42917.3. Samples: 5421812420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:23,393][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 01:25:24,907][15401] Updated weights for policy 0, policy_version 330920 (0.0034) [2024-06-23 01:25:28,359][15401] Updated weights for policy 0, policy_version 330930 (0.0037) [2024-06-23 01:25:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5421957120. Throughput: 0: 42963.1. Samples: 5422069920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 01:25:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 01:25:32,411][15401] Updated weights for policy 0, policy_version 330940 (0.0026) [2024-06-23 01:25:33,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5422153728. Throughput: 0: 42992.0. Samples: 5422327960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 01:25:36,279][15401] Updated weights for policy 0, policy_version 330950 (0.0031) [2024-06-23 01:25:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5422383104. Throughput: 0: 42944.8. Samples: 5422453340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 01:25:40,072][15401] Updated weights for policy 0, policy_version 330960 (0.0047) [2024-06-23 01:25:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5422579712. Throughput: 0: 42836.0. Samples: 5422707160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 01:25:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330968_5422579712.pth... [2024-06-23 01:25:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330340_5412290560.pth [2024-06-23 01:25:44,036][15401] Updated weights for policy 0, policy_version 330970 (0.0038) [2024-06-23 01:25:47,634][15401] Updated weights for policy 0, policy_version 330980 (0.0043) [2024-06-23 01:25:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5422792704. Throughput: 0: 42958.3. Samples: 5422966720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 01:25:51,846][15401] Updated weights for policy 0, policy_version 330990 (0.0033) [2024-06-23 01:25:53,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 5423022080. Throughput: 0: 42836.0. Samples: 5423093220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:53,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 01:25:55,169][15401] Updated weights for policy 0, policy_version 331000 (0.0036) [2024-06-23 01:25:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5423218688. Throughput: 0: 42446.7. Samples: 5423345360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:25:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 01:25:59,367][15401] Updated weights for policy 0, policy_version 331010 (0.0027) [2024-06-23 01:26:02,888][15401] Updated weights for policy 0, policy_version 331020 (0.0036) [2024-06-23 01:26:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42765.2). Total num frames: 5423431680. Throughput: 0: 42612.9. Samples: 5423603780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 01:26:06,932][15401] Updated weights for policy 0, policy_version 331030 (0.0029) [2024-06-23 01:26:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 5423661056. Throughput: 0: 42672.0. Samples: 5423732660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:08,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 01:26:10,541][15401] Updated weights for policy 0, policy_version 331040 (0.0041) [2024-06-23 01:26:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5423857664. Throughput: 0: 42596.3. Samples: 5423986740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 01:26:14,626][15401] Updated weights for policy 0, policy_version 331050 (0.0041) [2024-06-23 01:26:18,066][15401] Updated weights for policy 0, policy_version 331060 (0.0028) [2024-06-23 01:26:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5424087040. Throughput: 0: 42475.5. Samples: 5424239360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 01:26:22,292][15401] Updated weights for policy 0, policy_version 331070 (0.0036) [2024-06-23 01:26:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 5424300032. Throughput: 0: 42650.6. Samples: 5424372620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 01:26:25,889][15401] Updated weights for policy 0, policy_version 331080 (0.0041) [2024-06-23 01:26:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 5424529408. Throughput: 0: 42857.3. Samples: 5424635840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:28,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 01:26:30,119][15401] Updated weights for policy 0, policy_version 331090 (0.0035) [2024-06-23 01:26:33,270][15401] Updated weights for policy 0, policy_version 331100 (0.0039) [2024-06-23 01:26:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5424742400. Throughput: 0: 42705.3. Samples: 5424888460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:33,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 01:26:37,652][15401] Updated weights for policy 0, policy_version 331110 (0.0044) [2024-06-23 01:26:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5424922624. Throughput: 0: 42799.3. Samples: 5425019080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 01:26:40,903][15401] Updated weights for policy 0, policy_version 331120 (0.0041) [2024-06-23 01:26:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5425152000. Throughput: 0: 42814.2. Samples: 5425272000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 01:26:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 01:26:45,370][15401] Updated weights for policy 0, policy_version 331130 (0.0034) [2024-06-23 01:26:46,173][15349] Signal inference workers to stop experience collection... (80250 times) [2024-06-23 01:26:46,218][15401] InferenceWorker_p0-w0: stopping experience collection (80250 times) [2024-06-23 01:26:46,227][15349] Signal inference workers to resume experience collection... (80250 times) [2024-06-23 01:26:46,233][15401] InferenceWorker_p0-w0: resuming experience collection (80250 times) [2024-06-23 01:26:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 5425364992. Throughput: 0: 42788.5. Samples: 5425529260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:26:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 01:26:48,960][15401] Updated weights for policy 0, policy_version 331140 (0.0039) [2024-06-23 01:26:53,000][15401] Updated weights for policy 0, policy_version 331150 (0.0027) [2024-06-23 01:26:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 5425561600. Throughput: 0: 42744.7. Samples: 5425656060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:26:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 01:26:56,873][15401] Updated weights for policy 0, policy_version 331160 (0.0035) [2024-06-23 01:26:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5425807360. Throughput: 0: 42891.5. Samples: 5425916860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:26:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 01:27:00,429][15401] Updated weights for policy 0, policy_version 331170 (0.0034) [2024-06-23 01:27:03,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5426020352. Throughput: 0: 43059.0. Samples: 5426177020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 01:27:04,485][15401] Updated weights for policy 0, policy_version 331180 (0.0050) [2024-06-23 01:27:07,970][15401] Updated weights for policy 0, policy_version 331190 (0.0027) [2024-06-23 01:27:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42932.0). Total num frames: 5426216960. Throughput: 0: 42772.9. Samples: 5426297400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 01:27:12,101][15401] Updated weights for policy 0, policy_version 331200 (0.0038) [2024-06-23 01:27:13,392][15132] Fps is (10 sec: 42589.3, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 5426446336. Throughput: 0: 42862.4. Samples: 5426564640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:13,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 01:27:15,664][15401] Updated weights for policy 0, policy_version 331210 (0.0029) [2024-06-23 01:27:18,390][15132] Fps is (10 sec: 44233.8, 60 sec: 42870.9, 300 sec: 42876.3). Total num frames: 5426659328. Throughput: 0: 42787.3. Samples: 5426813920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:18,391][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 01:27:19,666][15401] Updated weights for policy 0, policy_version 331220 (0.0032) [2024-06-23 01:27:23,390][15132] Fps is (10 sec: 40968.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5426855936. Throughput: 0: 42774.0. Samples: 5426943920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:23,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 01:27:23,845][15401] Updated weights for policy 0, policy_version 331230 (0.0032) [2024-06-23 01:27:27,430][15401] Updated weights for policy 0, policy_version 331240 (0.0032) [2024-06-23 01:27:28,389][15132] Fps is (10 sec: 40963.4, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 5427068928. Throughput: 0: 42773.4. Samples: 5427196800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 01:27:31,449][15401] Updated weights for policy 0, policy_version 331250 (0.0026) [2024-06-23 01:27:33,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 5427314688. Throughput: 0: 42681.7. Samples: 5427450040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:33,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 01:27:34,926][15401] Updated weights for policy 0, policy_version 331260 (0.0044) [2024-06-23 01:27:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5427494912. Throughput: 0: 42919.8. Samples: 5427587460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 01:27:38,880][15401] Updated weights for policy 0, policy_version 331270 (0.0041) [2024-06-23 01:27:42,523][15401] Updated weights for policy 0, policy_version 331280 (0.0033) [2024-06-23 01:27:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 5427707904. Throughput: 0: 42803.5. Samples: 5427843020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 01:27:43,520][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331282_5427724288.pth... [2024-06-23 01:27:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330655_5417451520.pth [2024-06-23 01:27:46,678][15401] Updated weights for policy 0, policy_version 331290 (0.0036) [2024-06-23 01:27:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5427953664. Throughput: 0: 42619.7. Samples: 5428094900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 01:27:50,672][15401] Updated weights for policy 0, policy_version 331300 (0.0035) [2024-06-23 01:27:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5428150272. Throughput: 0: 42909.7. Samples: 5428228340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 01:27:54,236][15401] Updated weights for policy 0, policy_version 331310 (0.0034) [2024-06-23 01:27:58,258][15401] Updated weights for policy 0, policy_version 331320 (0.0054) [2024-06-23 01:27:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5428346880. Throughput: 0: 42517.6. Samples: 5428477840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 01:27:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 01:28:01,926][15401] Updated weights for policy 0, policy_version 331330 (0.0039) [2024-06-23 01:28:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 5428609024. Throughput: 0: 42594.9. Samples: 5428730760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:03,393][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 01:28:05,703][15401] Updated weights for policy 0, policy_version 331340 (0.0035) [2024-06-23 01:28:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5428772864. Throughput: 0: 42821.0. Samples: 5428870860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 01:28:09,474][15401] Updated weights for policy 0, policy_version 331350 (0.0034) [2024-06-23 01:28:13,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 5428985856. Throughput: 0: 42803.1. Samples: 5429122940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 01:28:13,443][15401] Updated weights for policy 0, policy_version 331360 (0.0043) [2024-06-23 01:28:16,287][15349] Signal inference workers to stop experience collection... (80300 times) [2024-06-23 01:28:16,288][15349] Signal inference workers to resume experience collection... (80300 times) [2024-06-23 01:28:16,327][15401] InferenceWorker_p0-w0: stopping experience collection (80300 times) [2024-06-23 01:28:16,327][15401] InferenceWorker_p0-w0: resuming experience collection (80300 times) [2024-06-23 01:28:17,231][15401] Updated weights for policy 0, policy_version 331370 (0.0030) [2024-06-23 01:28:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42872.1, 300 sec: 42931.6). Total num frames: 5429231616. Throughput: 0: 42850.8. Samples: 5429378220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 01:28:21,033][15401] Updated weights for policy 0, policy_version 331380 (0.0047) [2024-06-23 01:28:23,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.7, 300 sec: 42764.6). Total num frames: 5429411840. Throughput: 0: 42612.4. Samples: 5429505120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:23,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 01:28:24,994][15401] Updated weights for policy 0, policy_version 331390 (0.0037) [2024-06-23 01:28:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5429624832. Throughput: 0: 42454.3. Samples: 5429753460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 01:28:28,910][15401] Updated weights for policy 0, policy_version 331400 (0.0028) [2024-06-23 01:28:32,569][15401] Updated weights for policy 0, policy_version 331410 (0.0038) [2024-06-23 01:28:33,389][15132] Fps is (10 sec: 45886.9, 60 sec: 42600.2, 300 sec: 42876.2). Total num frames: 5429870592. Throughput: 0: 42521.8. Samples: 5430008380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 01:28:36,494][15401] Updated weights for policy 0, policy_version 331420 (0.0029) [2024-06-23 01:28:38,390][15132] Fps is (10 sec: 44234.4, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 5430067200. Throughput: 0: 42525.0. Samples: 5430141980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 01:28:40,102][15401] Updated weights for policy 0, policy_version 331430 (0.0033) [2024-06-23 01:28:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 5430280192. Throughput: 0: 42559.6. Samples: 5430393020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:43,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 01:28:44,029][15401] Updated weights for policy 0, policy_version 331440 (0.0023) [2024-06-23 01:28:47,639][15401] Updated weights for policy 0, policy_version 331450 (0.0040) [2024-06-23 01:28:48,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42324.1, 300 sec: 42875.8). Total num frames: 5430493184. Throughput: 0: 42747.8. Samples: 5430654380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:48,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 01:28:51,552][15401] Updated weights for policy 0, policy_version 331460 (0.0044) [2024-06-23 01:28:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5430689792. Throughput: 0: 42541.7. Samples: 5430785240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 01:28:55,642][15401] Updated weights for policy 0, policy_version 331470 (0.0030) [2024-06-23 01:28:58,389][15132] Fps is (10 sec: 44244.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5430935552. Throughput: 0: 42497.3. Samples: 5431035320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:28:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 01:28:59,278][15401] Updated weights for policy 0, policy_version 331480 (0.0027) [2024-06-23 01:29:03,305][15401] Updated weights for policy 0, policy_version 331490 (0.0032) [2024-06-23 01:29:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 5431132160. Throughput: 0: 42699.0. Samples: 5431299680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:29:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 01:29:06,909][15401] Updated weights for policy 0, policy_version 331500 (0.0054) [2024-06-23 01:29:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5431312384. Throughput: 0: 42613.8. Samples: 5431422640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:29:08,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 01:29:11,072][15401] Updated weights for policy 0, policy_version 331510 (0.0028) [2024-06-23 01:29:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5431590912. Throughput: 0: 42716.3. Samples: 5431675700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:29:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 01:29:14,533][15401] Updated weights for policy 0, policy_version 331520 (0.0033) [2024-06-23 01:29:18,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42323.6, 300 sec: 42765.0). Total num frames: 5431771136. Throughput: 0: 42984.7. Samples: 5431942800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 01:29:18,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 01:29:18,623][15349] Signal inference workers to stop experience collection... (80350 times) [2024-06-23 01:29:18,652][15401] InferenceWorker_p0-w0: stopping experience collection (80350 times) [2024-06-23 01:29:18,678][15349] Signal inference workers to resume experience collection... (80350 times) [2024-06-23 01:29:18,680][15401] InferenceWorker_p0-w0: resuming experience collection (80350 times) [2024-06-23 01:29:18,683][15401] Updated weights for policy 0, policy_version 331530 (0.0035) [2024-06-23 01:29:22,165][15401] Updated weights for policy 0, policy_version 331540 (0.0036) [2024-06-23 01:29:23,392][15132] Fps is (10 sec: 36035.9, 60 sec: 42325.3, 300 sec: 42598.0). Total num frames: 5431951360. Throughput: 0: 42673.2. Samples: 5432062360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:23,393][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 01:29:26,197][15401] Updated weights for policy 0, policy_version 331550 (0.0041) [2024-06-23 01:29:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5432213504. Throughput: 0: 42822.8. Samples: 5432320040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 01:29:29,876][15401] Updated weights for policy 0, policy_version 331560 (0.0032) [2024-06-23 01:29:33,389][15132] Fps is (10 sec: 45886.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5432410112. Throughput: 0: 42794.5. Samples: 5432580060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 01:29:33,742][15401] Updated weights for policy 0, policy_version 331570 (0.0029) [2024-06-23 01:29:37,945][15401] Updated weights for policy 0, policy_version 331580 (0.0034) [2024-06-23 01:29:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.6, 300 sec: 42653.9). Total num frames: 5432606720. Throughput: 0: 42577.7. Samples: 5432701240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 01:29:41,403][15401] Updated weights for policy 0, policy_version 331590 (0.0035) [2024-06-23 01:29:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5432852480. Throughput: 0: 42720.0. Samples: 5432957720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 01:29:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331596_5432868864.pth... [2024-06-23 01:29:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000330968_5422579712.pth [2024-06-23 01:29:46,141][15401] Updated weights for policy 0, policy_version 331600 (0.0040) [2024-06-23 01:29:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42599.6, 300 sec: 42765.0). Total num frames: 5433049088. Throughput: 0: 42644.9. Samples: 5433218700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 01:29:49,201][15401] Updated weights for policy 0, policy_version 331610 (0.0029) [2024-06-23 01:29:53,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 5433245696. Throughput: 0: 42670.2. Samples: 5433342900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:53,393][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 01:29:53,655][15401] Updated weights for policy 0, policy_version 331620 (0.0028) [2024-06-23 01:29:56,696][15401] Updated weights for policy 0, policy_version 331630 (0.0021) [2024-06-23 01:29:58,391][15132] Fps is (10 sec: 45867.6, 60 sec: 42870.3, 300 sec: 42764.8). Total num frames: 5433507840. Throughput: 0: 42735.8. Samples: 5433598880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:29:58,392][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 01:30:01,182][15401] Updated weights for policy 0, policy_version 331640 (0.0040) [2024-06-23 01:30:03,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5433704448. Throughput: 0: 42667.1. Samples: 5433862720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 01:30:04,409][15401] Updated weights for policy 0, policy_version 331650 (0.0032) [2024-06-23 01:30:08,389][15132] Fps is (10 sec: 39328.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5433901056. Throughput: 0: 42774.4. Samples: 5433987100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:08,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 01:30:08,667][15401] Updated weights for policy 0, policy_version 331660 (0.0030) [2024-06-23 01:30:11,878][15401] Updated weights for policy 0, policy_version 331670 (0.0028) [2024-06-23 01:30:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5434163200. Throughput: 0: 42786.5. Samples: 5434245540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:13,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:30:16,599][15401] Updated weights for policy 0, policy_version 331680 (0.0024) [2024-06-23 01:30:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 5434343424. Throughput: 0: 42822.2. Samples: 5434507060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:18,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 01:30:19,530][15401] Updated weights for policy 0, policy_version 331690 (0.0034) [2024-06-23 01:30:23,389][15132] Fps is (10 sec: 37692.7, 60 sec: 43146.4, 300 sec: 42654.0). Total num frames: 5434540032. Throughput: 0: 42715.7. Samples: 5434623440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 01:30:24,516][15401] Updated weights for policy 0, policy_version 331700 (0.0043) [2024-06-23 01:30:27,227][15401] Updated weights for policy 0, policy_version 331710 (0.0038) [2024-06-23 01:30:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5434785792. Throughput: 0: 42743.1. Samples: 5434881160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 01:30:32,255][15401] Updated weights for policy 0, policy_version 331720 (0.0035) [2024-06-23 01:30:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5434966016. Throughput: 0: 42717.8. Samples: 5435141000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:30:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 01:30:34,155][15349] Signal inference workers to stop experience collection... (80400 times) [2024-06-23 01:30:34,156][15349] Signal inference workers to resume experience collection... (80400 times) [2024-06-23 01:30:34,177][15401] InferenceWorker_p0-w0: stopping experience collection (80400 times) [2024-06-23 01:30:34,177][15401] InferenceWorker_p0-w0: resuming experience collection (80400 times) [2024-06-23 01:30:35,184][15401] Updated weights for policy 0, policy_version 331730 (0.0046) [2024-06-23 01:30:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5435179008. Throughput: 0: 42574.8. Samples: 5435258660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:30:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 01:30:39,737][15401] Updated weights for policy 0, policy_version 331740 (0.0029) [2024-06-23 01:30:42,671][15401] Updated weights for policy 0, policy_version 331750 (0.0023) [2024-06-23 01:30:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5435424768. Throughput: 0: 42631.8. Samples: 5435517240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:30:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 01:30:47,290][15401] Updated weights for policy 0, policy_version 331760 (0.0033) [2024-06-23 01:30:48,395][15132] Fps is (10 sec: 40936.8, 60 sec: 42321.3, 300 sec: 42597.9). Total num frames: 5435588608. Throughput: 0: 42660.9. Samples: 5435782700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:30:48,396][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 01:30:50,276][15401] Updated weights for policy 0, policy_version 331770 (0.0032) [2024-06-23 01:30:53,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5435817984. Throughput: 0: 42493.3. Samples: 5435899300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:30:53,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 01:30:54,883][15401] Updated weights for policy 0, policy_version 331780 (0.0041) [2024-06-23 01:30:58,010][15401] Updated weights for policy 0, policy_version 331790 (0.0038) [2024-06-23 01:30:58,389][15132] Fps is (10 sec: 45901.2, 60 sec: 42326.5, 300 sec: 42765.0). Total num frames: 5436047360. Throughput: 0: 42573.0. Samples: 5436161220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:30:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 01:31:02,727][15401] Updated weights for policy 0, policy_version 331800 (0.0025) [2024-06-23 01:31:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 5436243968. Throughput: 0: 42491.5. Samples: 5436419180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 01:31:05,514][15401] Updated weights for policy 0, policy_version 331810 (0.0023) [2024-06-23 01:31:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5436473344. Throughput: 0: 42608.9. Samples: 5436540840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 01:31:10,316][15401] Updated weights for policy 0, policy_version 331820 (0.0041) [2024-06-23 01:31:13,160][15401] Updated weights for policy 0, policy_version 331830 (0.0029) [2024-06-23 01:31:13,391][15132] Fps is (10 sec: 45867.3, 60 sec: 42325.8, 300 sec: 42764.8). Total num frames: 5436702720. Throughput: 0: 42771.6. Samples: 5436805960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:13,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 01:31:17,814][15401] Updated weights for policy 0, policy_version 331840 (0.0041) [2024-06-23 01:31:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5436882944. Throughput: 0: 42801.7. Samples: 5437067080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 01:31:20,891][15401] Updated weights for policy 0, policy_version 331850 (0.0036) [2024-06-23 01:31:23,390][15132] Fps is (10 sec: 40966.9, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 5437112320. Throughput: 0: 42841.7. Samples: 5437186540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 01:31:25,699][15401] Updated weights for policy 0, policy_version 331860 (0.0034) [2024-06-23 01:31:28,288][15401] Updated weights for policy 0, policy_version 331870 (0.0038) [2024-06-23 01:31:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5437358080. Throughput: 0: 43047.8. Samples: 5437454400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 01:31:33,243][15401] Updated weights for policy 0, policy_version 331880 (0.0039) [2024-06-23 01:31:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5437521920. Throughput: 0: 42864.1. Samples: 5437711340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 01:31:35,866][15401] Updated weights for policy 0, policy_version 331890 (0.0037) [2024-06-23 01:31:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5437767680. Throughput: 0: 42879.2. Samples: 5437828860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 01:31:40,827][15401] Updated weights for policy 0, policy_version 331900 (0.0038) [2024-06-23 01:31:43,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5437997056. Throughput: 0: 42891.5. Samples: 5438091340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:43,399][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 01:31:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331909_5437997056.pth... [2024-06-23 01:31:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331282_5427724288.pth [2024-06-23 01:31:43,882][15401] Updated weights for policy 0, policy_version 331910 (0.0036) [2024-06-23 01:31:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42875.6, 300 sec: 42709.5). Total num frames: 5438160896. Throughput: 0: 42879.2. Samples: 5438348740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 26.0) [2024-06-23 01:31:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 01:31:48,397][15401] Updated weights for policy 0, policy_version 331920 (0.0038) [2024-06-23 01:31:51,421][15401] Updated weights for policy 0, policy_version 331930 (0.0040) [2024-06-23 01:31:53,393][15132] Fps is (10 sec: 42584.5, 60 sec: 43415.2, 300 sec: 42764.5). Total num frames: 5438423040. Throughput: 0: 42835.5. Samples: 5438468580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:31:53,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:31:56,238][15401] Updated weights for policy 0, policy_version 331940 (0.0042) [2024-06-23 01:31:57,714][15349] Signal inference workers to stop experience collection... (80450 times) [2024-06-23 01:31:57,763][15401] InferenceWorker_p0-w0: stopping experience collection (80450 times) [2024-06-23 01:31:57,829][15349] Signal inference workers to resume experience collection... (80450 times) [2024-06-23 01:31:57,829][15401] InferenceWorker_p0-w0: resuming experience collection (80450 times) [2024-06-23 01:31:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5438603264. Throughput: 0: 42830.6. Samples: 5438733260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:31:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 01:31:59,339][15401] Updated weights for policy 0, policy_version 331950 (0.0035) [2024-06-23 01:32:03,390][15132] Fps is (10 sec: 37695.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5438799872. Throughput: 0: 42644.4. Samples: 5438986080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 01:32:03,790][15401] Updated weights for policy 0, policy_version 331960 (0.0041) [2024-06-23 01:32:06,875][15401] Updated weights for policy 0, policy_version 331970 (0.0038) [2024-06-23 01:32:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5439045632. Throughput: 0: 42758.3. Samples: 5439110660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 01:32:11,625][15401] Updated weights for policy 0, policy_version 331980 (0.0023) [2024-06-23 01:32:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42324.8, 300 sec: 42653.7). Total num frames: 5439242240. Throughput: 0: 42679.6. Samples: 5439375080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:13,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 01:32:14,416][15401] Updated weights for policy 0, policy_version 331990 (0.0044) [2024-06-23 01:32:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5439438848. Throughput: 0: 42592.0. Samples: 5439627980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 01:32:19,231][15401] Updated weights for policy 0, policy_version 332000 (0.0034) [2024-06-23 01:32:22,088][15401] Updated weights for policy 0, policy_version 332010 (0.0034) [2024-06-23 01:32:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5439684608. Throughput: 0: 42812.4. Samples: 5439755420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 01:32:26,751][15401] Updated weights for policy 0, policy_version 332020 (0.0031) [2024-06-23 01:32:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 5439897600. Throughput: 0: 42609.4. Samples: 5440008760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 01:32:29,816][15401] Updated weights for policy 0, policy_version 332030 (0.0027) [2024-06-23 01:32:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5440077824. Throughput: 0: 42572.4. Samples: 5440264500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 01:32:34,423][15401] Updated weights for policy 0, policy_version 332040 (0.0037) [2024-06-23 01:32:37,810][15401] Updated weights for policy 0, policy_version 332050 (0.0039) [2024-06-23 01:32:38,391][15132] Fps is (10 sec: 42591.2, 60 sec: 42597.1, 300 sec: 42764.8). Total num frames: 5440323584. Throughput: 0: 42714.4. Samples: 5440390660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:38,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 01:32:42,022][15401] Updated weights for policy 0, policy_version 332060 (0.0040) [2024-06-23 01:32:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 5440487424. Throughput: 0: 42508.8. Samples: 5440646160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 01:32:45,319][15401] Updated weights for policy 0, policy_version 332070 (0.0025) [2024-06-23 01:32:48,389][15132] Fps is (10 sec: 40967.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5440733184. Throughput: 0: 42614.4. Samples: 5440903720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 01:32:49,644][15401] Updated weights for policy 0, policy_version 332080 (0.0034) [2024-06-23 01:32:53,184][15401] Updated weights for policy 0, policy_version 332090 (0.0028) [2024-06-23 01:32:53,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42326.0, 300 sec: 42764.7). Total num frames: 5440962560. Throughput: 0: 42660.3. Samples: 5441030480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:53,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 01:32:57,250][15401] Updated weights for policy 0, policy_version 332100 (0.0031) [2024-06-23 01:32:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 5441142784. Throughput: 0: 42408.6. Samples: 5441283360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:32:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 01:33:00,860][15401] Updated weights for policy 0, policy_version 332110 (0.0030) [2024-06-23 01:33:03,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5441372160. Throughput: 0: 42533.2. Samples: 5441541980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 01:33:03,396][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 01:33:04,939][15401] Updated weights for policy 0, policy_version 332120 (0.0034) [2024-06-23 01:33:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5441601536. Throughput: 0: 42523.6. Samples: 5441668980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 01:33:08,534][15401] Updated weights for policy 0, policy_version 332130 (0.0047) [2024-06-23 01:33:12,630][15401] Updated weights for policy 0, policy_version 332140 (0.0050) [2024-06-23 01:33:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 5441798144. Throughput: 0: 42632.4. Samples: 5441927220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:13,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 01:33:16,658][15401] Updated weights for policy 0, policy_version 332150 (0.0022) [2024-06-23 01:33:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5442011136. Throughput: 0: 42590.6. Samples: 5442181080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 01:33:20,276][15401] Updated weights for policy 0, policy_version 332160 (0.0033) [2024-06-23 01:33:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5442224128. Throughput: 0: 42703.5. Samples: 5442312240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 01:33:24,010][15401] Updated weights for policy 0, policy_version 332170 (0.0031) [2024-06-23 01:33:27,814][15349] Signal inference workers to stop experience collection... (80500 times) [2024-06-23 01:33:27,868][15401] InferenceWorker_p0-w0: stopping experience collection (80500 times) [2024-06-23 01:33:27,877][15349] Signal inference workers to resume experience collection... (80500 times) [2024-06-23 01:33:27,883][15401] InferenceWorker_p0-w0: resuming experience collection (80500 times) [2024-06-23 01:33:28,018][15401] Updated weights for policy 0, policy_version 332180 (0.0032) [2024-06-23 01:33:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5442437120. Throughput: 0: 42630.7. Samples: 5442564540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 01:33:31,675][15401] Updated weights for policy 0, policy_version 332190 (0.0035) [2024-06-23 01:33:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5442650112. Throughput: 0: 42612.0. Samples: 5442821260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 01:33:35,864][15401] Updated weights for policy 0, policy_version 332200 (0.0036) [2024-06-23 01:33:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42326.6, 300 sec: 42654.0). Total num frames: 5442863104. Throughput: 0: 42711.7. Samples: 5442952400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 01:33:39,273][15401] Updated weights for policy 0, policy_version 332210 (0.0046) [2024-06-23 01:33:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 5443076096. Throughput: 0: 42790.4. Samples: 5443208940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:43,391][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 01:33:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332220_5443092480.pth... [2024-06-23 01:33:43,482][15401] Updated weights for policy 0, policy_version 332220 (0.0034) [2024-06-23 01:33:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331596_5432868864.pth [2024-06-23 01:33:47,068][15401] Updated weights for policy 0, policy_version 332230 (0.0032) [2024-06-23 01:33:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5443305472. Throughput: 0: 42586.0. Samples: 5443458340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 01:33:51,050][15401] Updated weights for policy 0, policy_version 332240 (0.0044) [2024-06-23 01:33:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 5443502080. Throughput: 0: 42549.8. Samples: 5443583720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 01:33:54,806][15401] Updated weights for policy 0, policy_version 332250 (0.0051) [2024-06-23 01:33:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5443715072. Throughput: 0: 42641.1. Samples: 5443846060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:33:58,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 01:33:58,889][15401] Updated weights for policy 0, policy_version 332260 (0.0034) [2024-06-23 01:34:02,750][15401] Updated weights for policy 0, policy_version 332270 (0.0039) [2024-06-23 01:34:03,393][15132] Fps is (10 sec: 45860.6, 60 sec: 43142.4, 300 sec: 42875.7). Total num frames: 5443960832. Throughput: 0: 42506.0. Samples: 5444093980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:34:03,393][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 01:34:06,430][15401] Updated weights for policy 0, policy_version 332280 (0.0028) [2024-06-23 01:34:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5444157440. Throughput: 0: 42579.9. Samples: 5444228340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:34:08,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 01:34:10,283][15401] Updated weights for policy 0, policy_version 332290 (0.0051) [2024-06-23 01:34:13,389][15132] Fps is (10 sec: 37695.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 5444337664. Throughput: 0: 42676.5. Samples: 5444484980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:34:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 01:34:14,426][15401] Updated weights for policy 0, policy_version 332300 (0.0037) [2024-06-23 01:34:17,842][15401] Updated weights for policy 0, policy_version 332310 (0.0040) [2024-06-23 01:34:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.6). Total num frames: 5444583424. Throughput: 0: 42462.1. Samples: 5444732160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:34:18,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 01:34:22,017][15401] Updated weights for policy 0, policy_version 332320 (0.0039) [2024-06-23 01:34:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5444780032. Throughput: 0: 42591.1. Samples: 5444869000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 01:34:25,475][15401] Updated weights for policy 0, policy_version 332330 (0.0027) [2024-06-23 01:34:28,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5444993024. Throughput: 0: 42378.8. Samples: 5445115980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 01:34:29,786][15401] Updated weights for policy 0, policy_version 332340 (0.0040) [2024-06-23 01:34:33,317][15401] Updated weights for policy 0, policy_version 332350 (0.0027) [2024-06-23 01:34:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5445222400. Throughput: 0: 42437.3. Samples: 5445368020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 01:34:37,519][15401] Updated weights for policy 0, policy_version 332360 (0.0034) [2024-06-23 01:34:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5445419008. Throughput: 0: 42618.6. Samples: 5445501560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 01:34:41,121][15401] Updated weights for policy 0, policy_version 332370 (0.0034) [2024-06-23 01:34:43,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42594.0, 300 sec: 42653.0). Total num frames: 5445632000. Throughput: 0: 42490.8. Samples: 5445758420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:43,397][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 01:34:45,402][15401] Updated weights for policy 0, policy_version 332380 (0.0030) [2024-06-23 01:34:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 5445828608. Throughput: 0: 42622.5. Samples: 5446011860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 01:34:48,842][15401] Updated weights for policy 0, policy_version 332390 (0.0028) [2024-06-23 01:34:52,870][15401] Updated weights for policy 0, policy_version 332400 (0.0047) [2024-06-23 01:34:53,392][15132] Fps is (10 sec: 42615.1, 60 sec: 42596.6, 300 sec: 42542.7). Total num frames: 5446057984. Throughput: 0: 42399.9. Samples: 5446136440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:53,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 01:34:56,155][15349] Signal inference workers to stop experience collection... (80550 times) [2024-06-23 01:34:56,155][15349] Signal inference workers to resume experience collection... (80550 times) [2024-06-23 01:34:56,199][15401] InferenceWorker_p0-w0: stopping experience collection (80550 times) [2024-06-23 01:34:56,199][15401] InferenceWorker_p0-w0: resuming experience collection (80550 times) [2024-06-23 01:34:56,299][15401] Updated weights for policy 0, policy_version 332410 (0.0038) [2024-06-23 01:34:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5446270976. Throughput: 0: 42471.0. Samples: 5446396180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:34:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 01:35:00,302][15401] Updated weights for policy 0, policy_version 332420 (0.0034) [2024-06-23 01:35:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42054.5, 300 sec: 42653.9). Total num frames: 5446483968. Throughput: 0: 42766.7. Samples: 5446656560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:03,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 01:35:04,010][15401] Updated weights for policy 0, policy_version 332430 (0.0034) [2024-06-23 01:35:07,767][15401] Updated weights for policy 0, policy_version 332440 (0.0031) [2024-06-23 01:35:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 5446713344. Throughput: 0: 42547.0. Samples: 5446783620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 01:35:11,931][15401] Updated weights for policy 0, policy_version 332450 (0.0038) [2024-06-23 01:35:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5446909952. Throughput: 0: 42760.0. Samples: 5447040180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 01:35:15,381][15401] Updated weights for policy 0, policy_version 332460 (0.0033) [2024-06-23 01:35:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 5447122944. Throughput: 0: 42896.4. Samples: 5447298360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 01:35:19,841][15401] Updated weights for policy 0, policy_version 332470 (0.0033) [2024-06-23 01:35:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5447335936. Throughput: 0: 42603.2. Samples: 5447418700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:35:23,453][15401] Updated weights for policy 0, policy_version 332480 (0.0037) [2024-06-23 01:35:27,564][15401] Updated weights for policy 0, policy_version 332490 (0.0042) [2024-06-23 01:35:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5447565312. Throughput: 0: 42779.8. Samples: 5447683240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:35:31,317][15401] Updated weights for policy 0, policy_version 332500 (0.0039) [2024-06-23 01:35:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5447778304. Throughput: 0: 42763.6. Samples: 5447936220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:35:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 01:35:35,133][15401] Updated weights for policy 0, policy_version 332510 (0.0037) [2024-06-23 01:35:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5447991296. Throughput: 0: 42860.0. Samples: 5448065040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:35:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 01:35:38,941][15401] Updated weights for policy 0, policy_version 332520 (0.0032) [2024-06-23 01:35:42,805][15401] Updated weights for policy 0, policy_version 332530 (0.0033) [2024-06-23 01:35:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.9, 300 sec: 42710.3). Total num frames: 5448187904. Throughput: 0: 42820.1. Samples: 5448323080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:35:43,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 01:35:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332532_5448204288.pth... [2024-06-23 01:35:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000331909_5437997056.pth [2024-06-23 01:35:46,611][15401] Updated weights for policy 0, policy_version 332540 (0.0035) [2024-06-23 01:35:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5448400896. Throughput: 0: 42692.5. Samples: 5448577720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:35:48,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 01:35:50,418][15401] Updated weights for policy 0, policy_version 332550 (0.0028) [2024-06-23 01:35:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5448630272. Throughput: 0: 42684.9. Samples: 5448704440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:35:53,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 01:35:54,087][15401] Updated weights for policy 0, policy_version 332560 (0.0031) [2024-06-23 01:35:57,942][15401] Updated weights for policy 0, policy_version 332570 (0.0036) [2024-06-23 01:35:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 5448843264. Throughput: 0: 42630.1. Samples: 5448958640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:35:58,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 01:36:01,683][15401] Updated weights for policy 0, policy_version 332580 (0.0036) [2024-06-23 01:36:03,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 5449056256. Throughput: 0: 42700.6. Samples: 5449219900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:03,391][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 01:36:05,471][15401] Updated weights for policy 0, policy_version 332590 (0.0034) [2024-06-23 01:36:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 5449269248. Throughput: 0: 42849.6. Samples: 5449346940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 01:36:09,356][15401] Updated weights for policy 0, policy_version 332600 (0.0031) [2024-06-23 01:36:13,285][15401] Updated weights for policy 0, policy_version 332610 (0.0030) [2024-06-23 01:36:13,392][15132] Fps is (10 sec: 42589.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 5449482240. Throughput: 0: 42605.7. Samples: 5449600600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:13,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 01:36:17,209][15401] Updated weights for policy 0, policy_version 332620 (0.0040) [2024-06-23 01:36:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5449678848. Throughput: 0: 42762.2. Samples: 5449860520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:18,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 01:36:20,898][15401] Updated weights for policy 0, policy_version 332630 (0.0023) [2024-06-23 01:36:23,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 5449908224. Throughput: 0: 42554.1. Samples: 5449979980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 01:36:24,923][15401] Updated weights for policy 0, policy_version 332640 (0.0044) [2024-06-23 01:36:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 5450104832. Throughput: 0: 42578.2. Samples: 5450239200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:28,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 01:36:28,604][15401] Updated weights for policy 0, policy_version 332650 (0.0038) [2024-06-23 01:36:29,956][15349] Signal inference workers to stop experience collection... (80600 times) [2024-06-23 01:36:29,994][15401] InferenceWorker_p0-w0: stopping experience collection (80600 times) [2024-06-23 01:36:30,006][15349] Signal inference workers to resume experience collection... (80600 times) [2024-06-23 01:36:30,020][15401] InferenceWorker_p0-w0: resuming experience collection (80600 times) [2024-06-23 01:36:32,726][15401] Updated weights for policy 0, policy_version 332660 (0.0033) [2024-06-23 01:36:33,391][15132] Fps is (10 sec: 42592.9, 60 sec: 42597.3, 300 sec: 42598.2). Total num frames: 5450334208. Throughput: 0: 42691.0. Samples: 5450498880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:33,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 01:36:36,138][15401] Updated weights for policy 0, policy_version 332670 (0.0037) [2024-06-23 01:36:38,392][15132] Fps is (10 sec: 45875.9, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 5450563584. Throughput: 0: 42624.1. Samples: 5450622620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:38,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 01:36:40,271][15401] Updated weights for policy 0, policy_version 332680 (0.0030) [2024-06-23 01:36:43,389][15132] Fps is (10 sec: 42605.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5450760192. Throughput: 0: 42757.5. Samples: 5450882620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 01:36:43,655][15401] Updated weights for policy 0, policy_version 332690 (0.0034) [2024-06-23 01:36:47,842][15401] Updated weights for policy 0, policy_version 332700 (0.0025) [2024-06-23 01:36:48,389][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.4, 300 sec: 42487.8). Total num frames: 5450956800. Throughput: 0: 42552.0. Samples: 5451134720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 01:36:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 01:36:51,343][15401] Updated weights for policy 0, policy_version 332710 (0.0041) [2024-06-23 01:36:53,390][15132] Fps is (10 sec: 44233.1, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 5451202560. Throughput: 0: 42457.5. Samples: 5451257560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:36:53,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 01:36:55,666][15401] Updated weights for policy 0, policy_version 332720 (0.0040) [2024-06-23 01:36:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 5451415552. Throughput: 0: 42641.9. Samples: 5451519380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:36:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 01:36:58,929][15401] Updated weights for policy 0, policy_version 332730 (0.0036) [2024-06-23 01:37:03,346][15401] Updated weights for policy 0, policy_version 332740 (0.0035) [2024-06-23 01:37:03,392][15132] Fps is (10 sec: 40953.4, 60 sec: 42596.9, 300 sec: 42598.0). Total num frames: 5451612160. Throughput: 0: 42526.2. Samples: 5451774300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:03,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 01:37:06,725][15401] Updated weights for policy 0, policy_version 332750 (0.0038) [2024-06-23 01:37:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5451841536. Throughput: 0: 42653.6. Samples: 5451899380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 01:37:11,624][15401] Updated weights for policy 0, policy_version 332760 (0.0031) [2024-06-23 01:37:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5452038144. Throughput: 0: 42723.6. Samples: 5452161660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:13,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 01:37:14,282][15401] Updated weights for policy 0, policy_version 332770 (0.0032) [2024-06-23 01:37:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5452234752. Throughput: 0: 42563.3. Samples: 5452414160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 01:37:19,160][15401] Updated weights for policy 0, policy_version 332780 (0.0034) [2024-06-23 01:37:22,054][15401] Updated weights for policy 0, policy_version 332790 (0.0043) [2024-06-23 01:37:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5452480512. Throughput: 0: 42515.9. Samples: 5452535740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 01:37:26,714][15401] Updated weights for policy 0, policy_version 332800 (0.0035) [2024-06-23 01:37:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5452677120. Throughput: 0: 42563.5. Samples: 5452797980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 01:37:30,173][15401] Updated weights for policy 0, policy_version 332810 (0.0049) [2024-06-23 01:37:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42326.3, 300 sec: 42543.1). Total num frames: 5452873728. Throughput: 0: 42604.7. Samples: 5453051940. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 01:37:34,296][15401] Updated weights for policy 0, policy_version 332820 (0.0037) [2024-06-23 01:37:37,912][15401] Updated weights for policy 0, policy_version 332830 (0.0038) [2024-06-23 01:37:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 5453135872. Throughput: 0: 42612.3. Samples: 5453175080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 01:37:42,022][15401] Updated weights for policy 0, policy_version 332840 (0.0028) [2024-06-23 01:37:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5453299712. Throughput: 0: 42503.1. Samples: 5453432020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 01:37:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332844_5453316096.pth... [2024-06-23 01:37:43,615][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332220_5443092480.pth [2024-06-23 01:37:45,518][15401] Updated weights for policy 0, policy_version 332850 (0.0037) [2024-06-23 01:37:46,064][15349] Signal inference workers to stop experience collection... (80650 times) [2024-06-23 01:37:46,065][15349] Signal inference workers to resume experience collection... (80650 times) [2024-06-23 01:37:46,084][15401] InferenceWorker_p0-w0: stopping experience collection (80650 times) [2024-06-23 01:37:46,084][15401] InferenceWorker_p0-w0: resuming experience collection (80650 times) [2024-06-23 01:37:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 5453512704. Throughput: 0: 42459.7. Samples: 5453684880. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 01:37:50,046][15401] Updated weights for policy 0, policy_version 332860 (0.0029) [2024-06-23 01:37:53,107][15401] Updated weights for policy 0, policy_version 332870 (0.0021) [2024-06-23 01:37:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42599.0, 300 sec: 42765.0). Total num frames: 5453758464. Throughput: 0: 42522.2. Samples: 5453812880. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 01:37:57,465][15401] Updated weights for policy 0, policy_version 332880 (0.0040) [2024-06-23 01:37:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5453955072. Throughput: 0: 42546.6. Samples: 5454076260. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:37:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 01:38:00,609][15401] Updated weights for policy 0, policy_version 332890 (0.0034) [2024-06-23 01:38:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 5454151680. Throughput: 0: 42551.4. Samples: 5454328980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:38:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 01:38:04,996][15401] Updated weights for policy 0, policy_version 332900 (0.0023) [2024-06-23 01:38:08,171][15401] Updated weights for policy 0, policy_version 332910 (0.0038) [2024-06-23 01:38:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5454397440. Throughput: 0: 42672.5. Samples: 5454456000. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 01:38:08,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 01:38:13,065][15401] Updated weights for policy 0, policy_version 332920 (0.0041) [2024-06-23 01:38:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5454577664. Throughput: 0: 42614.3. Samples: 5454715620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 01:38:15,890][15401] Updated weights for policy 0, policy_version 332930 (0.0043) [2024-06-23 01:38:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5454790656. Throughput: 0: 42525.6. Samples: 5454965580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 01:38:20,575][15401] Updated weights for policy 0, policy_version 332940 (0.0032) [2024-06-23 01:38:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5455020032. Throughput: 0: 42697.7. Samples: 5455096480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 01:38:23,695][15401] Updated weights for policy 0, policy_version 332950 (0.0038) [2024-06-23 01:38:28,237][15401] Updated weights for policy 0, policy_version 332960 (0.0043) [2024-06-23 01:38:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5455216640. Throughput: 0: 42749.3. Samples: 5455355740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:28,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 01:38:31,260][15401] Updated weights for policy 0, policy_version 332970 (0.0052) [2024-06-23 01:38:33,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 5455446016. Throughput: 0: 42835.8. Samples: 5455612600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:33,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 01:38:35,673][15401] Updated weights for policy 0, policy_version 332980 (0.0031) [2024-06-23 01:38:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5455659008. Throughput: 0: 42825.4. Samples: 5455740020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:38,390][15132] Avg episode reward: [(0, '0.153')] [2024-06-23 01:38:39,078][15401] Updated weights for policy 0, policy_version 332990 (0.0034) [2024-06-23 01:38:43,336][15401] Updated weights for policy 0, policy_version 333000 (0.0035) [2024-06-23 01:38:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5455872000. Throughput: 0: 42740.9. Samples: 5455999600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:43,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 01:38:46,842][15401] Updated weights for policy 0, policy_version 333010 (0.0028) [2024-06-23 01:38:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5456084992. Throughput: 0: 42718.1. Samples: 5456251300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 01:38:51,154][15401] Updated weights for policy 0, policy_version 333020 (0.0041) [2024-06-23 01:38:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5456297984. Throughput: 0: 42794.5. Samples: 5456381760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 01:38:54,471][15401] Updated weights for policy 0, policy_version 333030 (0.0039) [2024-06-23 01:38:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.8). Total num frames: 5456494592. Throughput: 0: 42474.6. Samples: 5456626980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:38:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 01:38:58,838][15401] Updated weights for policy 0, policy_version 333040 (0.0029) [2024-06-23 01:39:02,375][15401] Updated weights for policy 0, policy_version 333050 (0.0041) [2024-06-23 01:39:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5456707584. Throughput: 0: 42575.6. Samples: 5456881480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:39:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 01:39:06,427][15401] Updated weights for policy 0, policy_version 333060 (0.0043) [2024-06-23 01:39:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5456920576. Throughput: 0: 42590.3. Samples: 5457013040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:39:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 01:39:10,291][15401] Updated weights for policy 0, policy_version 333070 (0.0038) [2024-06-23 01:39:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 5457133568. Throughput: 0: 42472.9. Samples: 5457267020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:39:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 01:39:13,892][15401] Updated weights for policy 0, policy_version 333080 (0.0039) [2024-06-23 01:39:17,928][15401] Updated weights for policy 0, policy_version 333090 (0.0028) [2024-06-23 01:39:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5457346560. Throughput: 0: 42562.8. Samples: 5457527820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:39:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 01:39:21,711][15401] Updated weights for policy 0, policy_version 333100 (0.0042) [2024-06-23 01:39:23,332][15349] Signal inference workers to stop experience collection... (80700 times) [2024-06-23 01:39:23,332][15349] Signal inference workers to resume experience collection... (80700 times) [2024-06-23 01:39:23,349][15401] InferenceWorker_p0-w0: stopping experience collection (80700 times) [2024-06-23 01:39:23,349][15401] InferenceWorker_p0-w0: resuming experience collection (80700 times) [2024-06-23 01:39:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5457575936. Throughput: 0: 42570.7. Samples: 5457655700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 01:39:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 01:39:25,901][15401] Updated weights for policy 0, policy_version 333110 (0.0040) [2024-06-23 01:39:28,391][15132] Fps is (10 sec: 42591.2, 60 sec: 42597.2, 300 sec: 42542.6). Total num frames: 5457772544. Throughput: 0: 42465.1. Samples: 5457910600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:28,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 01:39:29,135][15401] Updated weights for policy 0, policy_version 333120 (0.0034) [2024-06-23 01:39:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5457985536. Throughput: 0: 42817.8. Samples: 5458178100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 01:39:33,555][15401] Updated weights for policy 0, policy_version 333130 (0.0041) [2024-06-23 01:39:36,638][15401] Updated weights for policy 0, policy_version 333140 (0.0023) [2024-06-23 01:39:38,389][15132] Fps is (10 sec: 45883.2, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 5458231296. Throughput: 0: 42733.9. Samples: 5458304780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:38,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 01:39:41,148][15401] Updated weights for policy 0, policy_version 333150 (0.0037) [2024-06-23 01:39:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5458427904. Throughput: 0: 42925.3. Samples: 5458558620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 01:39:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333156_5458427904.pth... [2024-06-23 01:39:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332532_5448204288.pth [2024-06-23 01:39:44,256][15401] Updated weights for policy 0, policy_version 333160 (0.0041) [2024-06-23 01:39:48,391][15132] Fps is (10 sec: 39314.1, 60 sec: 42324.1, 300 sec: 42598.5). Total num frames: 5458624512. Throughput: 0: 42989.7. Samples: 5458816100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:48,392][15132] Avg episode reward: [(0, '0.873')] [2024-06-23 01:39:48,751][15401] Updated weights for policy 0, policy_version 333170 (0.0029) [2024-06-23 01:39:51,730][15401] Updated weights for policy 0, policy_version 333180 (0.0033) [2024-06-23 01:39:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5458886656. Throughput: 0: 42867.1. Samples: 5458942060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:53,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-23 01:39:56,461][15401] Updated weights for policy 0, policy_version 333190 (0.0029) [2024-06-23 01:39:58,389][15132] Fps is (10 sec: 45884.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5459083264. Throughput: 0: 43009.0. Samples: 5459202420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:39:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 01:39:59,147][15401] Updated weights for policy 0, policy_version 333200 (0.0039) [2024-06-23 01:40:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5459263488. Throughput: 0: 42925.3. Samples: 5459459460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 01:40:04,097][15401] Updated weights for policy 0, policy_version 333210 (0.0032) [2024-06-23 01:40:06,849][15401] Updated weights for policy 0, policy_version 333220 (0.0030) [2024-06-23 01:40:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5459509248. Throughput: 0: 42823.5. Samples: 5459582760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 01:40:11,595][15401] Updated weights for policy 0, policy_version 333230 (0.0029) [2024-06-23 01:40:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5459738624. Throughput: 0: 43111.9. Samples: 5459850560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 01:40:14,219][15401] Updated weights for policy 0, policy_version 333240 (0.0032) [2024-06-23 01:40:18,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5459902464. Throughput: 0: 43013.7. Samples: 5460113720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 01:40:19,211][15401] Updated weights for policy 0, policy_version 333250 (0.0036) [2024-06-23 01:40:21,878][15401] Updated weights for policy 0, policy_version 333260 (0.0045) [2024-06-23 01:40:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5460148224. Throughput: 0: 42773.3. Samples: 5460229580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 01:40:26,893][15401] Updated weights for policy 0, policy_version 333270 (0.0025) [2024-06-23 01:40:28,290][15349] Signal inference workers to stop experience collection... (80750 times) [2024-06-23 01:40:28,330][15401] InferenceWorker_p0-w0: stopping experience collection (80750 times) [2024-06-23 01:40:28,338][15349] Signal inference workers to resume experience collection... (80750 times) [2024-06-23 01:40:28,353][15401] InferenceWorker_p0-w0: resuming experience collection (80750 times) [2024-06-23 01:40:28,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43145.7, 300 sec: 42653.9). Total num frames: 5460361216. Throughput: 0: 42930.6. Samples: 5460490500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:28,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 01:40:29,410][15401] Updated weights for policy 0, policy_version 333280 (0.0035) [2024-06-23 01:40:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5460557824. Throughput: 0: 43005.3. Samples: 5460751260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 01:40:34,449][15401] Updated weights for policy 0, policy_version 333290 (0.0033) [2024-06-23 01:40:37,019][15401] Updated weights for policy 0, policy_version 333300 (0.0028) [2024-06-23 01:40:38,391][15132] Fps is (10 sec: 44229.3, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 5460803584. Throughput: 0: 42951.3. Samples: 5460874940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 01:40:38,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 01:40:42,336][15401] Updated weights for policy 0, policy_version 333310 (0.0033) [2024-06-23 01:40:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5461000192. Throughput: 0: 42949.7. Samples: 5461135160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:40:43,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 01:40:44,669][15401] Updated weights for policy 0, policy_version 333320 (0.0033) [2024-06-23 01:40:48,389][15132] Fps is (10 sec: 39328.7, 60 sec: 42872.8, 300 sec: 42598.4). Total num frames: 5461196800. Throughput: 0: 42887.2. Samples: 5461389380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:40:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 01:40:50,082][15401] Updated weights for policy 0, policy_version 333330 (0.0040) [2024-06-23 01:40:52,422][15401] Updated weights for policy 0, policy_version 333340 (0.0033) [2024-06-23 01:40:53,392][15132] Fps is (10 sec: 45863.3, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 5461458944. Throughput: 0: 42997.6. Samples: 5461517760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:40:53,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 01:40:57,718][15401] Updated weights for policy 0, policy_version 333350 (0.0040) [2024-06-23 01:40:58,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.6, 300 sec: 42709.2). Total num frames: 5461655552. Throughput: 0: 42867.9. Samples: 5461779720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:40:58,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 01:40:59,950][15401] Updated weights for policy 0, policy_version 333360 (0.0028) [2024-06-23 01:41:03,389][15132] Fps is (10 sec: 39331.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5461852160. Throughput: 0: 42590.9. Samples: 5462030300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 01:41:05,405][15401] Updated weights for policy 0, policy_version 333370 (0.0040) [2024-06-23 01:41:07,934][15401] Updated weights for policy 0, policy_version 333380 (0.0031) [2024-06-23 01:41:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5462097920. Throughput: 0: 42975.9. Samples: 5462163500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 01:41:12,854][15401] Updated weights for policy 0, policy_version 333390 (0.0045) [2024-06-23 01:41:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5462278144. Throughput: 0: 42930.7. Samples: 5462422380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 01:41:15,449][15401] Updated weights for policy 0, policy_version 333400 (0.0023) [2024-06-23 01:41:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5462507520. Throughput: 0: 42710.3. Samples: 5462673220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 01:41:20,731][15401] Updated weights for policy 0, policy_version 333410 (0.0028) [2024-06-23 01:41:23,124][15401] Updated weights for policy 0, policy_version 333420 (0.0025) [2024-06-23 01:41:23,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43415.8, 300 sec: 42876.1). Total num frames: 5462753280. Throughput: 0: 42875.7. Samples: 5462804380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:23,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 01:41:28,246][15401] Updated weights for policy 0, policy_version 333430 (0.0031) [2024-06-23 01:41:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 5462917120. Throughput: 0: 42835.8. Samples: 5463062780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:28,390][15132] Avg episode reward: [(0, '0.099')] [2024-06-23 01:41:28,715][15349] Signal inference workers to stop experience collection... (80800 times) [2024-06-23 01:41:28,720][15349] Signal inference workers to resume experience collection... (80800 times) [2024-06-23 01:41:28,752][15401] InferenceWorker_p0-w0: stopping experience collection (80800 times) [2024-06-23 01:41:28,752][15401] InferenceWorker_p0-w0: resuming experience collection (80800 times) [2024-06-23 01:41:30,537][15401] Updated weights for policy 0, policy_version 333440 (0.0031) [2024-06-23 01:41:33,390][15132] Fps is (10 sec: 37692.5, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 5463130112. Throughput: 0: 42929.2. Samples: 5463321200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 01:41:35,698][15401] Updated weights for policy 0, policy_version 333450 (0.0039) [2024-06-23 01:41:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43145.8, 300 sec: 42820.5). Total num frames: 5463392256. Throughput: 0: 42954.8. Samples: 5463450620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 01:41:38,565][15401] Updated weights for policy 0, policy_version 333460 (0.0027) [2024-06-23 01:41:43,122][15401] Updated weights for policy 0, policy_version 333470 (0.0050) [2024-06-23 01:41:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5463588864. Throughput: 0: 43003.6. Samples: 5463714780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 01:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333471_5463588864.pth... [2024-06-23 01:41:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000332844_5453316096.pth [2024-06-23 01:41:46,131][15401] Updated weights for policy 0, policy_version 333480 (0.0033) [2024-06-23 01:41:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 43142.8, 300 sec: 42653.7). Total num frames: 5463785472. Throughput: 0: 42977.2. Samples: 5463964380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:48,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 01:41:50,847][15401] Updated weights for policy 0, policy_version 333490 (0.0043) [2024-06-23 01:41:53,391][15132] Fps is (10 sec: 44229.6, 60 sec: 42872.0, 300 sec: 42764.8). Total num frames: 5464031232. Throughput: 0: 42810.0. Samples: 5464090020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 01:41:53,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:41:53,601][15401] Updated weights for policy 0, policy_version 333500 (0.0028) [2024-06-23 01:41:58,334][15401] Updated weights for policy 0, policy_version 333510 (0.0031) [2024-06-23 01:41:58,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42873.1, 300 sec: 42765.4). Total num frames: 5464227840. Throughput: 0: 42957.7. Samples: 5464355480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:41:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 01:42:01,227][15401] Updated weights for policy 0, policy_version 333520 (0.0031) [2024-06-23 01:42:03,389][15132] Fps is (10 sec: 40967.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5464440832. Throughput: 0: 43061.4. Samples: 5464610980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 01:42:05,815][15401] Updated weights for policy 0, policy_version 333530 (0.0035) [2024-06-23 01:42:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5464686592. Throughput: 0: 43208.2. Samples: 5464748640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 01:42:08,631][15401] Updated weights for policy 0, policy_version 333540 (0.0029) [2024-06-23 01:42:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5464866816. Throughput: 0: 43231.6. Samples: 5465008200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:13,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 01:42:13,422][15401] Updated weights for policy 0, policy_version 333550 (0.0029) [2024-06-23 01:42:16,827][15401] Updated weights for policy 0, policy_version 333560 (0.0026) [2024-06-23 01:42:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 5465096192. Throughput: 0: 43013.7. Samples: 5465256920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:18,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 01:42:21,201][15401] Updated weights for policy 0, policy_version 333570 (0.0043) [2024-06-23 01:42:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 5465325568. Throughput: 0: 43200.4. Samples: 5465394640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 01:42:24,523][15401] Updated weights for policy 0, policy_version 333580 (0.0033) [2024-06-23 01:42:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5465522176. Throughput: 0: 43147.7. Samples: 5465656420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 01:42:28,558][15401] Updated weights for policy 0, policy_version 333590 (0.0025) [2024-06-23 01:42:31,962][15401] Updated weights for policy 0, policy_version 333600 (0.0035) [2024-06-23 01:42:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 5465751552. Throughput: 0: 43253.7. Samples: 5465910700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 01:42:36,218][15401] Updated weights for policy 0, policy_version 333610 (0.0035) [2024-06-23 01:42:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5465980928. Throughput: 0: 43493.2. Samples: 5466047140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 01:42:39,386][15401] Updated weights for policy 0, policy_version 333620 (0.0029) [2024-06-23 01:42:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5466161152. Throughput: 0: 43339.6. Samples: 5466305760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 01:42:43,824][15401] Updated weights for policy 0, policy_version 333630 (0.0041) [2024-06-23 01:42:46,970][15401] Updated weights for policy 0, policy_version 333640 (0.0045) [2024-06-23 01:42:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 5466390528. Throughput: 0: 43266.1. Samples: 5466557960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 01:42:51,740][15401] Updated weights for policy 0, policy_version 333650 (0.0037) [2024-06-23 01:42:53,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43144.0, 300 sec: 42931.3). Total num frames: 5466619904. Throughput: 0: 43207.4. Samples: 5466693080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:53,393][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 01:42:53,592][15349] Signal inference workers to stop experience collection... (80850 times) [2024-06-23 01:42:53,593][15349] Signal inference workers to resume experience collection... (80850 times) [2024-06-23 01:42:53,626][15401] InferenceWorker_p0-w0: stopping experience collection (80850 times) [2024-06-23 01:42:53,626][15401] InferenceWorker_p0-w0: resuming experience collection (80850 times) [2024-06-23 01:42:54,510][15401] Updated weights for policy 0, policy_version 333660 (0.0043) [2024-06-23 01:42:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5466816512. Throughput: 0: 43140.4. Samples: 5466949520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:42:58,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 01:42:59,137][15401] Updated weights for policy 0, policy_version 333670 (0.0030) [2024-06-23 01:43:02,047][15401] Updated weights for policy 0, policy_version 333680 (0.0047) [2024-06-23 01:43:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5467029504. Throughput: 0: 43366.7. Samples: 5467208320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:43:03,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-23 01:43:06,738][15401] Updated weights for policy 0, policy_version 333690 (0.0033) [2024-06-23 01:43:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5467258880. Throughput: 0: 43316.0. Samples: 5467343860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:43:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 01:43:09,697][15401] Updated weights for policy 0, policy_version 333700 (0.0038) [2024-06-23 01:43:13,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 5467439104. Throughput: 0: 43066.9. Samples: 5467594540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 01:43:13,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 01:43:14,295][15401] Updated weights for policy 0, policy_version 333710 (0.0030) [2024-06-23 01:43:17,346][15401] Updated weights for policy 0, policy_version 333720 (0.0040) [2024-06-23 01:43:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 5467684864. Throughput: 0: 43090.4. Samples: 5467849760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 01:43:21,852][15401] Updated weights for policy 0, policy_version 333730 (0.0024) [2024-06-23 01:43:23,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5467897856. Throughput: 0: 43147.0. Samples: 5467988760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 01:43:24,846][15401] Updated weights for policy 0, policy_version 333740 (0.0034) [2024-06-23 01:43:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 5468094464. Throughput: 0: 43021.4. Samples: 5468241720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 01:43:29,262][15401] Updated weights for policy 0, policy_version 333750 (0.0029) [2024-06-23 01:43:32,303][15401] Updated weights for policy 0, policy_version 333760 (0.0038) [2024-06-23 01:43:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 5468340224. Throughput: 0: 43075.1. Samples: 5468496440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:33,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 01:43:37,077][15401] Updated weights for policy 0, policy_version 333770 (0.0034) [2024-06-23 01:43:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5468553216. Throughput: 0: 43182.7. Samples: 5468636200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 01:43:39,961][15401] Updated weights for policy 0, policy_version 333780 (0.0042) [2024-06-23 01:43:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5468749824. Throughput: 0: 43057.8. Samples: 5468887120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 01:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333786_5468749824.pth... [2024-06-23 01:43:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333156_5458427904.pth [2024-06-23 01:43:44,488][15401] Updated weights for policy 0, policy_version 333790 (0.0030) [2024-06-23 01:43:47,597][15401] Updated weights for policy 0, policy_version 333800 (0.0039) [2024-06-23 01:43:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 5468995584. Throughput: 0: 42839.0. Samples: 5469136180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:48,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 01:43:52,332][15401] Updated weights for policy 0, policy_version 333810 (0.0025) [2024-06-23 01:43:53,390][15132] Fps is (10 sec: 42595.5, 60 sec: 42599.6, 300 sec: 42987.1). Total num frames: 5469175808. Throughput: 0: 42846.0. Samples: 5469271960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:53,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 01:43:55,399][15401] Updated weights for policy 0, policy_version 333820 (0.0030) [2024-06-23 01:43:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5469405184. Throughput: 0: 42986.7. Samples: 5469528840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:43:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 01:44:00,118][15401] Updated weights for policy 0, policy_version 333830 (0.0050) [2024-06-23 01:44:03,244][15401] Updated weights for policy 0, policy_version 333840 (0.0045) [2024-06-23 01:44:03,390][15132] Fps is (10 sec: 45878.2, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 5469634560. Throughput: 0: 42945.2. Samples: 5469782300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 01:44:07,603][15401] Updated weights for policy 0, policy_version 333850 (0.0024) [2024-06-23 01:44:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5469814784. Throughput: 0: 42821.8. Samples: 5469915740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 01:44:10,801][15401] Updated weights for policy 0, policy_version 333860 (0.0034) [2024-06-23 01:44:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43419.4, 300 sec: 43042.7). Total num frames: 5470044160. Throughput: 0: 42752.5. Samples: 5470165580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 01:44:15,410][15401] Updated weights for policy 0, policy_version 333870 (0.0036) [2024-06-23 01:44:17,880][15349] Signal inference workers to stop experience collection... (80900 times) [2024-06-23 01:44:17,930][15401] InferenceWorker_p0-w0: stopping experience collection (80900 times) [2024-06-23 01:44:17,987][15349] Signal inference workers to resume experience collection... (80900 times) [2024-06-23 01:44:17,987][15401] InferenceWorker_p0-w0: resuming experience collection (80900 times) [2024-06-23 01:44:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5470273536. Throughput: 0: 42853.4. Samples: 5470424740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 01:44:18,654][15401] Updated weights for policy 0, policy_version 333880 (0.0031) [2024-06-23 01:44:22,882][15401] Updated weights for policy 0, policy_version 333890 (0.0038) [2024-06-23 01:44:23,392][15132] Fps is (10 sec: 42587.4, 60 sec: 42869.8, 300 sec: 43042.6). Total num frames: 5470470144. Throughput: 0: 42538.6. Samples: 5470550540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:23,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 01:44:26,237][15401] Updated weights for policy 0, policy_version 333900 (0.0034) [2024-06-23 01:44:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5470683136. Throughput: 0: 42621.9. Samples: 5470805100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 01:44:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 01:44:30,484][15401] Updated weights for policy 0, policy_version 333910 (0.0040) [2024-06-23 01:44:33,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42873.3, 300 sec: 42987.2). Total num frames: 5470912512. Throughput: 0: 42958.4. Samples: 5471069200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 01:44:33,965][15401] Updated weights for policy 0, policy_version 333920 (0.0042) [2024-06-23 01:44:38,075][15401] Updated weights for policy 0, policy_version 333930 (0.0035) [2024-06-23 01:44:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5471109120. Throughput: 0: 42663.3. Samples: 5471191780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 01:44:41,741][15401] Updated weights for policy 0, policy_version 333940 (0.0043) [2024-06-23 01:44:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 43098.5). Total num frames: 5471338496. Throughput: 0: 42589.4. Samples: 5471445360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:43,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-23 01:44:46,130][15401] Updated weights for policy 0, policy_version 333950 (0.0027) [2024-06-23 01:44:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 5471535104. Throughput: 0: 42606.7. Samples: 5471699600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 01:44:49,683][15401] Updated weights for policy 0, policy_version 333960 (0.0043) [2024-06-23 01:44:53,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42597.2, 300 sec: 42875.7). Total num frames: 5471731712. Throughput: 0: 42446.7. Samples: 5471825940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:53,392][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 01:44:53,818][15401] Updated weights for policy 0, policy_version 333970 (0.0046) [2024-06-23 01:44:57,312][15401] Updated weights for policy 0, policy_version 333980 (0.0041) [2024-06-23 01:44:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 5471977472. Throughput: 0: 42559.9. Samples: 5472080780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:44:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 01:45:01,505][15401] Updated weights for policy 0, policy_version 333990 (0.0024) [2024-06-23 01:45:03,396][15132] Fps is (10 sec: 44219.0, 60 sec: 42320.9, 300 sec: 42930.7). Total num frames: 5472174080. Throughput: 0: 42497.9. Samples: 5472337420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:03,396][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 01:45:05,094][15401] Updated weights for policy 0, policy_version 334000 (0.0046) [2024-06-23 01:45:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5472370688. Throughput: 0: 42449.1. Samples: 5472460640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 01:45:08,999][15401] Updated weights for policy 0, policy_version 334010 (0.0037) [2024-06-23 01:45:12,583][15401] Updated weights for policy 0, policy_version 334020 (0.0031) [2024-06-23 01:45:13,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 5472600064. Throughput: 0: 42523.9. Samples: 5472718680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:13,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 01:45:17,191][15401] Updated weights for policy 0, policy_version 334030 (0.0045) [2024-06-23 01:45:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5472813056. Throughput: 0: 42425.3. Samples: 5472978340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 01:45:20,254][15401] Updated weights for policy 0, policy_version 334040 (0.0037) [2024-06-23 01:45:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 5473026048. Throughput: 0: 42456.6. Samples: 5473102320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 01:45:24,712][15401] Updated weights for policy 0, policy_version 334050 (0.0019) [2024-06-23 01:45:28,259][15401] Updated weights for policy 0, policy_version 334060 (0.0041) [2024-06-23 01:45:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 5473255424. Throughput: 0: 42523.9. Samples: 5473359040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:28,393][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 01:45:32,200][15401] Updated weights for policy 0, policy_version 334070 (0.0031) [2024-06-23 01:45:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.4). Total num frames: 5473452032. Throughput: 0: 42676.2. Samples: 5473620020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 01:45:35,915][15401] Updated weights for policy 0, policy_version 334080 (0.0039) [2024-06-23 01:45:38,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5473648640. Throughput: 0: 42498.3. Samples: 5473738260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 01:45:39,731][15401] Updated weights for policy 0, policy_version 334090 (0.0037) [2024-06-23 01:45:41,524][15349] Signal inference workers to stop experience collection... (80950 times) [2024-06-23 01:45:41,587][15401] InferenceWorker_p0-w0: stopping experience collection (80950 times) [2024-06-23 01:45:41,592][15349] Signal inference workers to resume experience collection... (80950 times) [2024-06-23 01:45:41,611][15401] InferenceWorker_p0-w0: resuming experience collection (80950 times) [2024-06-23 01:45:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42931.6). Total num frames: 5473861632. Throughput: 0: 42525.2. Samples: 5473994420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 01:45:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 01:45:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334099_5473878016.pth... [2024-06-23 01:45:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333471_5463588864.pth [2024-06-23 01:45:43,645][15401] Updated weights for policy 0, policy_version 334100 (0.0036) [2024-06-23 01:45:47,549][15401] Updated weights for policy 0, policy_version 334110 (0.0030) [2024-06-23 01:45:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 5474091008. Throughput: 0: 42544.8. Samples: 5474251660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:45:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 01:45:51,243][15401] Updated weights for policy 0, policy_version 334120 (0.0036) [2024-06-23 01:45:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42876.4). Total num frames: 5474304000. Throughput: 0: 42678.9. Samples: 5474381200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:45:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 01:45:55,355][15401] Updated weights for policy 0, policy_version 334130 (0.0038) [2024-06-23 01:45:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5474516992. Throughput: 0: 42479.1. Samples: 5474630240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:45:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 01:45:59,473][15401] Updated weights for policy 0, policy_version 334140 (0.0046) [2024-06-23 01:46:03,008][15401] Updated weights for policy 0, policy_version 334150 (0.0033) [2024-06-23 01:46:03,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 5474729984. Throughput: 0: 42408.4. Samples: 5474886720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 01:46:07,333][15401] Updated weights for policy 0, policy_version 334160 (0.0035) [2024-06-23 01:46:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5474942976. Throughput: 0: 42415.5. Samples: 5475011020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 01:46:10,798][15401] Updated weights for policy 0, policy_version 334170 (0.0043) [2024-06-23 01:46:13,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42597.9, 300 sec: 42876.0). Total num frames: 5475155968. Throughput: 0: 42378.0. Samples: 5475265980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:13,391][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 01:46:14,883][15401] Updated weights for policy 0, policy_version 334180 (0.0036) [2024-06-23 01:46:18,240][15401] Updated weights for policy 0, policy_version 334190 (0.0036) [2024-06-23 01:46:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 5475385344. Throughput: 0: 42369.2. Samples: 5475526640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 01:46:22,530][15401] Updated weights for policy 0, policy_version 334200 (0.0032) [2024-06-23 01:46:23,389][15132] Fps is (10 sec: 42601.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 5475581952. Throughput: 0: 42503.1. Samples: 5475650900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 01:46:25,729][15401] Updated weights for policy 0, policy_version 334210 (0.0024) [2024-06-23 01:46:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42931.7). Total num frames: 5475794944. Throughput: 0: 42651.8. Samples: 5475913740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:28,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 01:46:30,083][15401] Updated weights for policy 0, policy_version 334220 (0.0028) [2024-06-23 01:46:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5475991552. Throughput: 0: 42619.5. Samples: 5476169540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 01:46:33,632][15401] Updated weights for policy 0, policy_version 334230 (0.0039) [2024-06-23 01:46:37,705][15401] Updated weights for policy 0, policy_version 334240 (0.0034) [2024-06-23 01:46:38,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 5476204544. Throughput: 0: 42482.4. Samples: 5476293000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:38,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 01:46:41,488][15401] Updated weights for policy 0, policy_version 334250 (0.0026) [2024-06-23 01:46:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 5476433920. Throughput: 0: 42673.7. Samples: 5476550560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 01:46:45,280][15401] Updated weights for policy 0, policy_version 334260 (0.0037) [2024-06-23 01:46:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.2, 300 sec: 42709.7). Total num frames: 5476630528. Throughput: 0: 42702.2. Samples: 5476808320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 01:46:49,166][15401] Updated weights for policy 0, policy_version 334270 (0.0029) [2024-06-23 01:46:52,838][15401] Updated weights for policy 0, policy_version 334280 (0.0030) [2024-06-23 01:46:53,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42594.0, 300 sec: 42819.6). Total num frames: 5476859904. Throughput: 0: 42568.6. Samples: 5476926880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:53,396][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 01:46:54,874][15349] Signal inference workers to stop experience collection... (81000 times) [2024-06-23 01:46:54,882][15349] Signal inference workers to resume experience collection... (81000 times) [2024-06-23 01:46:54,897][15401] InferenceWorker_p0-w0: stopping experience collection (81000 times) [2024-06-23 01:46:54,897][15401] InferenceWorker_p0-w0: resuming experience collection (81000 times) [2024-06-23 01:46:56,888][15401] Updated weights for policy 0, policy_version 334290 (0.0035) [2024-06-23 01:46:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5477072896. Throughput: 0: 42750.5. Samples: 5477189720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 01:46:58,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 01:47:00,239][15401] Updated weights for policy 0, policy_version 334300 (0.0034) [2024-06-23 01:47:03,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5477269504. Throughput: 0: 42848.5. Samples: 5477454820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 01:47:04,438][15401] Updated weights for policy 0, policy_version 334310 (0.0031) [2024-06-23 01:47:07,658][15401] Updated weights for policy 0, policy_version 334320 (0.0031) [2024-06-23 01:47:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5477515264. Throughput: 0: 42792.9. Samples: 5477576580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 01:47:12,064][15401] Updated weights for policy 0, policy_version 334330 (0.0029) [2024-06-23 01:47:13,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.9, 300 sec: 42820.9). Total num frames: 5477728256. Throughput: 0: 42861.5. Samples: 5477842520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 01:47:15,282][15401] Updated weights for policy 0, policy_version 334340 (0.0027) [2024-06-23 01:47:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5477924864. Throughput: 0: 42816.9. Samples: 5478096300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 01:47:19,522][15401] Updated weights for policy 0, policy_version 334350 (0.0033) [2024-06-23 01:47:22,981][15401] Updated weights for policy 0, policy_version 334360 (0.0039) [2024-06-23 01:47:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5478154240. Throughput: 0: 42967.7. Samples: 5478226440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 01:47:27,210][15401] Updated weights for policy 0, policy_version 334370 (0.0025) [2024-06-23 01:47:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5478334464. Throughput: 0: 42874.5. Samples: 5478479900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 01:47:30,682][15401] Updated weights for policy 0, policy_version 334380 (0.0032) [2024-06-23 01:47:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5478580224. Throughput: 0: 42856.6. Samples: 5478736860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 01:47:35,029][15401] Updated weights for policy 0, policy_version 334390 (0.0027) [2024-06-23 01:47:38,309][15401] Updated weights for policy 0, policy_version 334400 (0.0042) [2024-06-23 01:47:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 5478809600. Throughput: 0: 43167.5. Samples: 5478869140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 01:47:43,035][15401] Updated weights for policy 0, policy_version 334410 (0.0031) [2024-06-23 01:47:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 5478989824. Throughput: 0: 43055.0. Samples: 5479127300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:43,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 01:47:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334412_5479006208.pth... [2024-06-23 01:47:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000333786_5468749824.pth [2024-06-23 01:47:46,036][15401] Updated weights for policy 0, policy_version 334420 (0.0038) [2024-06-23 01:47:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 5479219200. Throughput: 0: 42871.4. Samples: 5479384040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 01:47:50,678][15401] Updated weights for policy 0, policy_version 334430 (0.0028) [2024-06-23 01:47:53,339][15401] Updated weights for policy 0, policy_version 334440 (0.0034) [2024-06-23 01:47:53,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43422.3, 300 sec: 42876.1). Total num frames: 5479464960. Throughput: 0: 43147.9. Samples: 5479518240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:53,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 01:47:58,142][15401] Updated weights for policy 0, policy_version 334450 (0.0038) [2024-06-23 01:47:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5479645184. Throughput: 0: 42834.2. Samples: 5479770060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:47:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 01:48:00,987][15401] Updated weights for policy 0, policy_version 334460 (0.0023) [2024-06-23 01:48:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 5479874560. Throughput: 0: 43013.2. Samples: 5480032000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:48:03,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:48:05,586][15401] Updated weights for policy 0, policy_version 334470 (0.0036) [2024-06-23 01:48:08,389][15132] Fps is (10 sec: 45876.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 5480103936. Throughput: 0: 43045.0. Samples: 5480163460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:48:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 01:48:08,599][15401] Updated weights for policy 0, policy_version 334480 (0.0032) [2024-06-23 01:48:12,935][15349] Signal inference workers to stop experience collection... (81050 times) [2024-06-23 01:48:12,982][15401] InferenceWorker_p0-w0: stopping experience collection (81050 times) [2024-06-23 01:48:12,989][15349] Signal inference workers to resume experience collection... (81050 times) [2024-06-23 01:48:12,996][15401] InferenceWorker_p0-w0: resuming experience collection (81050 times) [2024-06-23 01:48:12,998][15401] Updated weights for policy 0, policy_version 334490 (0.0031) [2024-06-23 01:48:13,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42869.8, 300 sec: 42764.6). Total num frames: 5480300544. Throughput: 0: 43252.6. Samples: 5480426380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:48:13,393][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 01:48:16,112][15401] Updated weights for policy 0, policy_version 334500 (0.0033) [2024-06-23 01:48:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5480513536. Throughput: 0: 43255.5. Samples: 5480683360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 01:48:20,716][15401] Updated weights for policy 0, policy_version 334510 (0.0030) [2024-06-23 01:48:23,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5480742912. Throughput: 0: 43186.6. Samples: 5480812540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:23,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 01:48:23,787][15401] Updated weights for policy 0, policy_version 334520 (0.0026) [2024-06-23 01:48:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42598.8). Total num frames: 5480906752. Throughput: 0: 43131.6. Samples: 5481068120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 01:48:28,740][15401] Updated weights for policy 0, policy_version 334530 (0.0028) [2024-06-23 01:48:31,530][15401] Updated weights for policy 0, policy_version 334540 (0.0031) [2024-06-23 01:48:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5481152512. Throughput: 0: 42996.6. Samples: 5481318880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 01:48:36,256][15401] Updated weights for policy 0, policy_version 334550 (0.0037) [2024-06-23 01:48:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5481381888. Throughput: 0: 43075.2. Samples: 5481456620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 01:48:39,333][15401] Updated weights for policy 0, policy_version 334560 (0.0031) [2024-06-23 01:48:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42598.8). Total num frames: 5481562112. Throughput: 0: 43054.0. Samples: 5481707480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:43,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 01:48:43,780][15401] Updated weights for policy 0, policy_version 334570 (0.0032) [2024-06-23 01:48:46,785][15401] Updated weights for policy 0, policy_version 334580 (0.0032) [2024-06-23 01:48:48,389][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42820.7). Total num frames: 5481807872. Throughput: 0: 43060.5. Samples: 5481969620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 01:48:51,237][15401] Updated weights for policy 0, policy_version 334590 (0.0029) [2024-06-23 01:48:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5482037248. Throughput: 0: 43266.5. Samples: 5482110460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 01:48:54,261][15401] Updated weights for policy 0, policy_version 334600 (0.0041) [2024-06-23 01:48:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5482217472. Throughput: 0: 43026.0. Samples: 5482362440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:48:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 01:48:58,899][15401] Updated weights for policy 0, policy_version 334610 (0.0029) [2024-06-23 01:49:02,179][15401] Updated weights for policy 0, policy_version 334620 (0.0036) [2024-06-23 01:49:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 5482463232. Throughput: 0: 42857.2. Samples: 5482611940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:03,391][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 01:49:06,501][15401] Updated weights for policy 0, policy_version 334630 (0.0031) [2024-06-23 01:49:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5482676224. Throughput: 0: 42933.0. Samples: 5482744520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 01:49:09,896][15401] Updated weights for policy 0, policy_version 334640 (0.0047) [2024-06-23 01:49:13,390][15132] Fps is (10 sec: 39318.4, 60 sec: 42599.5, 300 sec: 42653.8). Total num frames: 5482856448. Throughput: 0: 42776.9. Samples: 5482993120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:13,391][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 01:49:14,245][15401] Updated weights for policy 0, policy_version 334650 (0.0050) [2024-06-23 01:49:17,594][15401] Updated weights for policy 0, policy_version 334660 (0.0035) [2024-06-23 01:49:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 5483085824. Throughput: 0: 42969.7. Samples: 5483252520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 01:49:21,902][15401] Updated weights for policy 0, policy_version 334670 (0.0046) [2024-06-23 01:49:21,911][15349] Signal inference workers to stop experience collection... (81100 times) [2024-06-23 01:49:21,912][15349] Signal inference workers to resume experience collection... (81100 times) [2024-06-23 01:49:21,938][15401] InferenceWorker_p0-w0: stopping experience collection (81100 times) [2024-06-23 01:49:21,938][15401] InferenceWorker_p0-w0: resuming experience collection (81100 times) [2024-06-23 01:49:23,389][15132] Fps is (10 sec: 47518.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5483331584. Throughput: 0: 42953.6. Samples: 5483389540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 01:49:25,499][15401] Updated weights for policy 0, policy_version 334680 (0.0027) [2024-06-23 01:49:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5483495424. Throughput: 0: 43052.4. Samples: 5483644840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 01:49:29,391][15401] Updated weights for policy 0, policy_version 334690 (0.0026) [2024-06-23 01:49:33,079][15401] Updated weights for policy 0, policy_version 334700 (0.0033) [2024-06-23 01:49:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.8, 300 sec: 42875.7). Total num frames: 5483757568. Throughput: 0: 42927.5. Samples: 5483901460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 01:49:33,393][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 01:49:36,844][15401] Updated weights for policy 0, policy_version 334710 (0.0042) [2024-06-23 01:49:38,396][15132] Fps is (10 sec: 49120.7, 60 sec: 43412.9, 300 sec: 42875.2). Total num frames: 5483986944. Throughput: 0: 42846.4. Samples: 5484038820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:49:38,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 01:49:40,410][15401] Updated weights for policy 0, policy_version 334720 (0.0023) [2024-06-23 01:49:43,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5484134400. Throughput: 0: 42885.7. Samples: 5484292300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:49:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 01:49:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334725_5484134400.pth... [2024-06-23 01:49:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334099_5473878016.pth [2024-06-23 01:49:44,623][15401] Updated weights for policy 0, policy_version 334730 (0.0041) [2024-06-23 01:49:47,858][15401] Updated weights for policy 0, policy_version 334740 (0.0037) [2024-06-23 01:49:48,390][15132] Fps is (10 sec: 40986.0, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 5484396544. Throughput: 0: 42995.2. Samples: 5484546720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:49:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 01:49:52,183][15401] Updated weights for policy 0, policy_version 334750 (0.0041) [2024-06-23 01:49:53,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5484625920. Throughput: 0: 43071.4. Samples: 5484682740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:49:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 01:49:55,539][15401] Updated weights for policy 0, policy_version 334760 (0.0026) [2024-06-23 01:49:58,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.7, 300 sec: 42765.6). Total num frames: 5484789760. Throughput: 0: 43312.8. Samples: 5484942260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:49:58,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 01:49:59,645][15401] Updated weights for policy 0, policy_version 334770 (0.0050) [2024-06-23 01:50:02,867][15401] Updated weights for policy 0, policy_version 334780 (0.0024) [2024-06-23 01:50:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5485035520. Throughput: 0: 43131.7. Samples: 5485193440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 01:50:07,461][15401] Updated weights for policy 0, policy_version 334790 (0.0043) [2024-06-23 01:50:08,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5485264896. Throughput: 0: 43150.2. Samples: 5485331300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 01:50:10,365][15401] Updated weights for policy 0, policy_version 334800 (0.0028) [2024-06-23 01:50:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42872.2, 300 sec: 42765.0). Total num frames: 5485428736. Throughput: 0: 43130.3. Samples: 5485585700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 01:50:15,023][15401] Updated weights for policy 0, policy_version 334810 (0.0024) [2024-06-23 01:50:18,009][15401] Updated weights for policy 0, policy_version 334820 (0.0032) [2024-06-23 01:50:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5485690880. Throughput: 0: 43131.7. Samples: 5485842280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 01:50:19,300][15349] Signal inference workers to stop experience collection... (81150 times) [2024-06-23 01:50:19,301][15349] Signal inference workers to resume experience collection... (81150 times) [2024-06-23 01:50:19,347][15401] InferenceWorker_p0-w0: stopping experience collection (81150 times) [2024-06-23 01:50:19,348][15401] InferenceWorker_p0-w0: resuming experience collection (81150 times) [2024-06-23 01:50:22,429][15401] Updated weights for policy 0, policy_version 334830 (0.0028) [2024-06-23 01:50:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 5485903872. Throughput: 0: 43248.9. Samples: 5485984740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 01:50:25,379][15401] Updated weights for policy 0, policy_version 334840 (0.0038) [2024-06-23 01:50:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5486084096. Throughput: 0: 43147.6. Samples: 5486233940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 01:50:29,907][15401] Updated weights for policy 0, policy_version 334850 (0.0035) [2024-06-23 01:50:33,370][15401] Updated weights for policy 0, policy_version 334860 (0.0025) [2024-06-23 01:50:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 43042.7). Total num frames: 5486346240. Throughput: 0: 43337.3. Samples: 5486496900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 01:50:37,434][15401] Updated weights for policy 0, policy_version 334870 (0.0036) [2024-06-23 01:50:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42876.0, 300 sec: 43042.7). Total num frames: 5486559232. Throughput: 0: 43273.0. Samples: 5486630020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 01:50:40,848][15401] Updated weights for policy 0, policy_version 334880 (0.0031) [2024-06-23 01:50:43,391][15132] Fps is (10 sec: 39316.9, 60 sec: 43416.7, 300 sec: 42875.9). Total num frames: 5486739456. Throughput: 0: 43202.9. Samples: 5486886340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:43,391][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 01:50:45,085][15401] Updated weights for policy 0, policy_version 334890 (0.0037) [2024-06-23 01:50:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 5486968832. Throughput: 0: 43156.4. Samples: 5487135480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 01:50:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 01:50:48,552][15401] Updated weights for policy 0, policy_version 334900 (0.0043) [2024-06-23 01:50:52,822][15401] Updated weights for policy 0, policy_version 334910 (0.0030) [2024-06-23 01:50:53,389][15132] Fps is (10 sec: 44242.7, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 5487181824. Throughput: 0: 43061.9. Samples: 5487269080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:50:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 01:50:56,070][15401] Updated weights for policy 0, policy_version 334920 (0.0024) [2024-06-23 01:50:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 5487394816. Throughput: 0: 43111.5. Samples: 5487525720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:50:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 01:51:00,584][15401] Updated weights for policy 0, policy_version 334930 (0.0039) [2024-06-23 01:51:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 5487640576. Throughput: 0: 43078.6. Samples: 5487780820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 01:51:03,726][15401] Updated weights for policy 0, policy_version 334940 (0.0040) [2024-06-23 01:51:08,250][15401] Updated weights for policy 0, policy_version 334950 (0.0028) [2024-06-23 01:51:08,390][15132] Fps is (10 sec: 42595.0, 60 sec: 42597.8, 300 sec: 42931.6). Total num frames: 5487820800. Throughput: 0: 42826.2. Samples: 5487911960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:08,391][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 01:51:11,348][15401] Updated weights for policy 0, policy_version 334960 (0.0041) [2024-06-23 01:51:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5488033792. Throughput: 0: 42902.1. Samples: 5488164540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 01:51:15,968][15401] Updated weights for policy 0, policy_version 334970 (0.0033) [2024-06-23 01:51:18,369][15349] Signal inference workers to stop experience collection... (81200 times) [2024-06-23 01:51:18,369][15349] Signal inference workers to resume experience collection... (81200 times) [2024-06-23 01:51:18,380][15401] InferenceWorker_p0-w0: stopping experience collection (81200 times) [2024-06-23 01:51:18,389][15132] Fps is (10 sec: 45879.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5488279552. Throughput: 0: 42743.3. Samples: 5488420340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:51:18,403][15401] InferenceWorker_p0-w0: resuming experience collection (81200 times) [2024-06-23 01:51:18,935][15401] Updated weights for policy 0, policy_version 334980 (0.0039) [2024-06-23 01:51:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5488443392. Throughput: 0: 42732.4. Samples: 5488552980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 01:51:23,645][15401] Updated weights for policy 0, policy_version 334990 (0.0031) [2024-06-23 01:51:26,735][15401] Updated weights for policy 0, policy_version 335000 (0.0036) [2024-06-23 01:51:28,389][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5488672768. Throughput: 0: 42584.3. Samples: 5488802580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 01:51:31,363][15401] Updated weights for policy 0, policy_version 335010 (0.0035) [2024-06-23 01:51:33,396][15132] Fps is (10 sec: 47483.2, 60 sec: 42866.9, 300 sec: 43097.7). Total num frames: 5488918528. Throughput: 0: 42673.0. Samples: 5489056040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:33,396][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 01:51:34,492][15401] Updated weights for policy 0, policy_version 335020 (0.0045) [2024-06-23 01:51:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42931.3). Total num frames: 5489098752. Throughput: 0: 42601.7. Samples: 5489186260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:38,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 01:51:38,997][15401] Updated weights for policy 0, policy_version 335030 (0.0038) [2024-06-23 01:51:42,178][15401] Updated weights for policy 0, policy_version 335040 (0.0035) [2024-06-23 01:51:43,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43145.4, 300 sec: 43042.7). Total num frames: 5489328128. Throughput: 0: 42608.5. Samples: 5489443100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 01:51:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335042_5489328128.pth... [2024-06-23 01:51:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334412_5479006208.pth [2024-06-23 01:51:46,590][15401] Updated weights for policy 0, policy_version 335050 (0.0037) [2024-06-23 01:51:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.3, 300 sec: 42988.1). Total num frames: 5489541120. Throughput: 0: 42594.6. Samples: 5489697580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:48,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 01:51:50,119][15401] Updated weights for policy 0, policy_version 335060 (0.0035) [2024-06-23 01:51:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5489721344. Throughput: 0: 42413.3. Samples: 5489820520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 01:51:54,508][15401] Updated weights for policy 0, policy_version 335070 (0.0035) [2024-06-23 01:51:57,758][15401] Updated weights for policy 0, policy_version 335080 (0.0047) [2024-06-23 01:51:58,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 5489983488. Throughput: 0: 42438.0. Samples: 5490074240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:51:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 01:52:02,173][15401] Updated weights for policy 0, policy_version 335090 (0.0030) [2024-06-23 01:52:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 5490180096. Throughput: 0: 42477.2. Samples: 5490331820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 01:52:03,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 01:52:05,314][15401] Updated weights for policy 0, policy_version 335100 (0.0027) [2024-06-23 01:52:08,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42326.0, 300 sec: 42820.6). Total num frames: 5490360320. Throughput: 0: 42359.2. Samples: 5490459140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 01:52:09,702][15401] Updated weights for policy 0, policy_version 335110 (0.0039) [2024-06-23 01:52:13,069][15401] Updated weights for policy 0, policy_version 335120 (0.0035) [2024-06-23 01:52:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5490606080. Throughput: 0: 42563.1. Samples: 5490717920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 01:52:17,227][15401] Updated weights for policy 0, policy_version 335130 (0.0032) [2024-06-23 01:52:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 5490802688. Throughput: 0: 42513.6. Samples: 5490968880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 01:52:21,128][15401] Updated weights for policy 0, policy_version 335140 (0.0029) [2024-06-23 01:52:23,394][15132] Fps is (10 sec: 40940.0, 60 sec: 42868.0, 300 sec: 42986.4). Total num frames: 5491015680. Throughput: 0: 42477.6. Samples: 5491097860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:23,395][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 01:52:25,004][15401] Updated weights for policy 0, policy_version 335150 (0.0029) [2024-06-23 01:52:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5491228672. Throughput: 0: 42492.0. Samples: 5491355240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 01:52:28,798][15401] Updated weights for policy 0, policy_version 335160 (0.0032) [2024-06-23 01:52:32,720][15401] Updated weights for policy 0, policy_version 335170 (0.0038) [2024-06-23 01:52:33,391][15132] Fps is (10 sec: 42614.1, 60 sec: 42055.9, 300 sec: 42820.4). Total num frames: 5491441664. Throughput: 0: 42502.1. Samples: 5491610220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:33,391][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 01:52:36,647][15401] Updated weights for policy 0, policy_version 335180 (0.0028) [2024-06-23 01:52:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42987.5). Total num frames: 5491671040. Throughput: 0: 42585.7. Samples: 5491736880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:38,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 01:52:40,422][15401] Updated weights for policy 0, policy_version 335190 (0.0039) [2024-06-23 01:52:43,389][15132] Fps is (10 sec: 40965.3, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 5491851264. Throughput: 0: 42603.1. Samples: 5491991380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 01:52:44,477][15401] Updated weights for policy 0, policy_version 335200 (0.0028) [2024-06-23 01:52:47,825][15401] Updated weights for policy 0, policy_version 335210 (0.0040) [2024-06-23 01:52:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5492097024. Throughput: 0: 42582.2. Samples: 5492248020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 01:52:52,423][15401] Updated weights for policy 0, policy_version 335220 (0.0029) [2024-06-23 01:52:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5492293632. Throughput: 0: 42752.5. Samples: 5492383000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:53,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 01:52:55,394][15401] Updated weights for policy 0, policy_version 335230 (0.0029) [2024-06-23 01:52:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42820.9). Total num frames: 5492506624. Throughput: 0: 42612.9. Samples: 5492635500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:52:58,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-23 01:52:59,918][15401] Updated weights for policy 0, policy_version 335240 (0.0030) [2024-06-23 01:53:00,986][15349] Signal inference workers to stop experience collection... (81250 times) [2024-06-23 01:53:00,986][15349] Signal inference workers to resume experience collection... (81250 times) [2024-06-23 01:53:01,031][15401] InferenceWorker_p0-w0: stopping experience collection (81250 times) [2024-06-23 01:53:01,036][15401] InferenceWorker_p0-w0: resuming experience collection (81250 times) [2024-06-23 01:53:02,843][15401] Updated weights for policy 0, policy_version 335250 (0.0029) [2024-06-23 01:53:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5492736000. Throughput: 0: 42669.7. Samples: 5492889020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:53:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 01:53:07,527][15401] Updated weights for policy 0, policy_version 335260 (0.0036) [2024-06-23 01:53:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.7, 300 sec: 42876.1). Total num frames: 5492948992. Throughput: 0: 42849.5. Samples: 5493025980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:53:08,392][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 01:53:10,474][15401] Updated weights for policy 0, policy_version 335270 (0.0033) [2024-06-23 01:53:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5493161984. Throughput: 0: 42859.1. Samples: 5493283900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:53:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 01:53:15,036][15401] Updated weights for policy 0, policy_version 335280 (0.0039) [2024-06-23 01:53:17,902][15401] Updated weights for policy 0, policy_version 335290 (0.0038) [2024-06-23 01:53:18,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5493391360. Throughput: 0: 42712.8. Samples: 5493532240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-23 01:53:18,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 01:53:22,814][15401] Updated weights for policy 0, policy_version 335300 (0.0038) [2024-06-23 01:53:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42328.7, 300 sec: 42876.1). Total num frames: 5493555200. Throughput: 0: 42852.9. Samples: 5493665260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 01:53:25,685][15401] Updated weights for policy 0, policy_version 335310 (0.0037) [2024-06-23 01:53:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5493800960. Throughput: 0: 42857.7. Samples: 5493919980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:53:30,470][15401] Updated weights for policy 0, policy_version 335320 (0.0037) [2024-06-23 01:53:33,249][15401] Updated weights for policy 0, policy_version 335330 (0.0035) [2024-06-23 01:53:33,392][15132] Fps is (10 sec: 49140.6, 60 sec: 43416.7, 300 sec: 42931.3). Total num frames: 5494046720. Throughput: 0: 42675.0. Samples: 5494168500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:33,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:53:38,244][15401] Updated weights for policy 0, policy_version 335340 (0.0035) [2024-06-23 01:53:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5494210560. Throughput: 0: 42740.3. Samples: 5494306320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:38,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 01:53:41,018][15401] Updated weights for policy 0, policy_version 335350 (0.0035) [2024-06-23 01:53:43,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5494423552. Throughput: 0: 42918.3. Samples: 5494566820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 01:53:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335354_5494439936.pth... [2024-06-23 01:53:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000334725_5484134400.pth [2024-06-23 01:53:45,763][15401] Updated weights for policy 0, policy_version 335360 (0.0031) [2024-06-23 01:53:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5494685696. Throughput: 0: 42894.7. Samples: 5494819280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 01:53:48,499][15401] Updated weights for policy 0, policy_version 335370 (0.0049) [2024-06-23 01:53:53,214][15401] Updated weights for policy 0, policy_version 335380 (0.0027) [2024-06-23 01:53:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5494865920. Throughput: 0: 42927.7. Samples: 5494957620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 01:53:56,071][15401] Updated weights for policy 0, policy_version 335390 (0.0035) [2024-06-23 01:53:58,393][15132] Fps is (10 sec: 39308.4, 60 sec: 42869.0, 300 sec: 42764.5). Total num frames: 5495078912. Throughput: 0: 42768.8. Samples: 5495208640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:53:58,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 01:54:00,807][15401] Updated weights for policy 0, policy_version 335400 (0.0034) [2024-06-23 01:54:02,414][15349] Signal inference workers to stop experience collection... (81300 times) [2024-06-23 01:54:02,420][15349] Signal inference workers to resume experience collection... (81300 times) [2024-06-23 01:54:02,464][15401] InferenceWorker_p0-w0: stopping experience collection (81300 times) [2024-06-23 01:54:02,464][15401] InferenceWorker_p0-w0: resuming experience collection (81300 times) [2024-06-23 01:54:03,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5495341056. Throughput: 0: 42858.5. Samples: 5495460880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:03,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 01:54:03,748][15401] Updated weights for policy 0, policy_version 335410 (0.0038) [2024-06-23 01:54:08,390][15132] Fps is (10 sec: 42612.7, 60 sec: 42600.1, 300 sec: 42876.2). Total num frames: 5495504896. Throughput: 0: 43055.5. Samples: 5495602760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 01:54:08,651][15401] Updated weights for policy 0, policy_version 335420 (0.0044) [2024-06-23 01:54:11,569][15401] Updated weights for policy 0, policy_version 335430 (0.0028) [2024-06-23 01:54:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5495717888. Throughput: 0: 43026.2. Samples: 5495856160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 01:54:15,941][15401] Updated weights for policy 0, policy_version 335440 (0.0041) [2024-06-23 01:54:18,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5495980032. Throughput: 0: 43027.7. Samples: 5496104640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 01:54:19,114][15401] Updated weights for policy 0, policy_version 335450 (0.0036) [2024-06-23 01:54:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 5496160256. Throughput: 0: 43091.6. Samples: 5496245440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 01:54:23,404][15401] Updated weights for policy 0, policy_version 335460 (0.0027) [2024-06-23 01:54:26,611][15401] Updated weights for policy 0, policy_version 335470 (0.0033) [2024-06-23 01:54:28,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5496356864. Throughput: 0: 42887.4. Samples: 5496496760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:28,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 01:54:31,465][15401] Updated weights for policy 0, policy_version 335480 (0.0038) [2024-06-23 01:54:33,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42871.6, 300 sec: 42821.2). Total num frames: 5496619008. Throughput: 0: 42867.7. Samples: 5496748420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 01:54:33,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 01:54:34,022][15401] Updated weights for policy 0, policy_version 335490 (0.0031) [2024-06-23 01:54:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 5496782848. Throughput: 0: 43034.1. Samples: 5496894260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:54:38,401][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 01:54:39,035][15401] Updated weights for policy 0, policy_version 335500 (0.0030) [2024-06-23 01:54:41,954][15401] Updated weights for policy 0, policy_version 335510 (0.0034) [2024-06-23 01:54:43,396][15132] Fps is (10 sec: 39305.1, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 5497012224. Throughput: 0: 42833.1. Samples: 5497136260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:54:43,397][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 01:54:46,849][15401] Updated weights for policy 0, policy_version 335520 (0.0028) [2024-06-23 01:54:48,392][15132] Fps is (10 sec: 49151.7, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5497274368. Throughput: 0: 42815.1. Samples: 5497387660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:54:48,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 01:54:49,533][15401] Updated weights for policy 0, policy_version 335530 (0.0029) [2024-06-23 01:54:53,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5497421824. Throughput: 0: 42862.8. Samples: 5497531580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:54:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 01:54:54,230][15401] Updated weights for policy 0, policy_version 335540 (0.0042) [2024-06-23 01:54:57,019][15401] Updated weights for policy 0, policy_version 335550 (0.0026) [2024-06-23 01:54:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 43147.0, 300 sec: 42820.5). Total num frames: 5497667584. Throughput: 0: 42747.7. Samples: 5497779800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:54:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 01:55:01,854][15401] Updated weights for policy 0, policy_version 335560 (0.0041) [2024-06-23 01:55:02,148][15349] Signal inference workers to stop experience collection... (81350 times) [2024-06-23 01:55:02,149][15349] Signal inference workers to resume experience collection... (81350 times) [2024-06-23 01:55:02,171][15401] InferenceWorker_p0-w0: stopping experience collection (81350 times) [2024-06-23 01:55:02,208][15401] InferenceWorker_p0-w0: resuming experience collection (81350 times) [2024-06-23 01:55:03,390][15132] Fps is (10 sec: 49151.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5497913344. Throughput: 0: 42978.1. Samples: 5498038660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 01:55:04,949][15401] Updated weights for policy 0, policy_version 335570 (0.0041) [2024-06-23 01:55:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5498077184. Throughput: 0: 43188.8. Samples: 5498188940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 01:55:09,197][15401] Updated weights for policy 0, policy_version 335580 (0.0029) [2024-06-23 01:55:12,298][15401] Updated weights for policy 0, policy_version 335590 (0.0027) [2024-06-23 01:55:13,392][15132] Fps is (10 sec: 39312.4, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 5498306560. Throughput: 0: 43122.6. Samples: 5498437380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:13,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 01:55:16,759][15401] Updated weights for policy 0, policy_version 335600 (0.0028) [2024-06-23 01:55:18,389][15132] Fps is (10 sec: 49152.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5498568704. Throughput: 0: 43240.1. Samples: 5498694120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 01:55:19,802][15401] Updated weights for policy 0, policy_version 335610 (0.0047) [2024-06-23 01:55:23,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 5498732544. Throughput: 0: 43142.7. Samples: 5498835680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:23,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 01:55:24,374][15401] Updated weights for policy 0, policy_version 335620 (0.0039) [2024-06-23 01:55:27,235][15401] Updated weights for policy 0, policy_version 335630 (0.0030) [2024-06-23 01:55:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5498961920. Throughput: 0: 43319.1. Samples: 5499085340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 01:55:31,878][15401] Updated weights for policy 0, policy_version 335640 (0.0033) [2024-06-23 01:55:33,389][15132] Fps is (10 sec: 49164.1, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 5499224064. Throughput: 0: 43483.3. Samples: 5499344300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 01:55:34,734][15401] Updated weights for policy 0, policy_version 335650 (0.0036) [2024-06-23 01:55:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43419.3, 300 sec: 42876.3). Total num frames: 5499387904. Throughput: 0: 43340.4. Samples: 5499481900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 01:55:39,369][15401] Updated weights for policy 0, policy_version 335660 (0.0035) [2024-06-23 01:55:42,263][15401] Updated weights for policy 0, policy_version 335670 (0.0042) [2024-06-23 01:55:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 5499617280. Throughput: 0: 43287.9. Samples: 5499727760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 01:55:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335670_5499617280.pth... [2024-06-23 01:55:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335042_5489328128.pth [2024-06-23 01:55:47,011][15401] Updated weights for policy 0, policy_version 335680 (0.0033) [2024-06-23 01:55:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.3, 300 sec: 42931.6). Total num frames: 5499846656. Throughput: 0: 43370.8. Samples: 5499990340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 01:55:49,824][15401] Updated weights for policy 0, policy_version 335690 (0.0037) [2024-06-23 01:55:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5500026880. Throughput: 0: 42869.8. Samples: 5500118080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-23 01:55:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 01:55:54,528][15401] Updated weights for policy 0, policy_version 335700 (0.0028) [2024-06-23 01:55:58,300][15401] Updated weights for policy 0, policy_version 335710 (0.0029) [2024-06-23 01:55:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5500272640. Throughput: 0: 42981.0. Samples: 5500371420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:55:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 01:56:02,603][15401] Updated weights for policy 0, policy_version 335720 (0.0031) [2024-06-23 01:56:03,391][15132] Fps is (10 sec: 45870.1, 60 sec: 42870.7, 300 sec: 42931.6). Total num frames: 5500485632. Throughput: 0: 43037.4. Samples: 5500630860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:03,391][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 01:56:06,252][15401] Updated weights for policy 0, policy_version 335730 (0.0037) [2024-06-23 01:56:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5500649472. Throughput: 0: 42787.1. Samples: 5500761000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 01:56:10,278][15401] Updated weights for policy 0, policy_version 335740 (0.0034) [2024-06-23 01:56:10,773][15349] Signal inference workers to stop experience collection... (81400 times) [2024-06-23 01:56:10,812][15401] InferenceWorker_p0-w0: stopping experience collection (81400 times) [2024-06-23 01:56:10,827][15349] Signal inference workers to resume experience collection... (81400 times) [2024-06-23 01:56:10,834][15401] InferenceWorker_p0-w0: resuming experience collection (81400 times) [2024-06-23 01:56:13,390][15132] Fps is (10 sec: 40964.3, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 5500895232. Throughput: 0: 42756.0. Samples: 5501009360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 01:56:14,022][15401] Updated weights for policy 0, policy_version 335750 (0.0035) [2024-06-23 01:56:17,815][15401] Updated weights for policy 0, policy_version 335760 (0.0028) [2024-06-23 01:56:18,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 5501108224. Throughput: 0: 42835.5. Samples: 5501271900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:18,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 01:56:21,871][15401] Updated weights for policy 0, policy_version 335770 (0.0026) [2024-06-23 01:56:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 5501304832. Throughput: 0: 42648.1. Samples: 5501401060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:23,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 01:56:25,237][15401] Updated weights for policy 0, policy_version 335780 (0.0029) [2024-06-23 01:56:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42765.6). Total num frames: 5501534208. Throughput: 0: 42752.0. Samples: 5501651700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:28,393][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 01:56:29,469][15401] Updated weights for policy 0, policy_version 335790 (0.0038) [2024-06-23 01:56:33,288][15401] Updated weights for policy 0, policy_version 335800 (0.0041) [2024-06-23 01:56:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42876.4). Total num frames: 5501747200. Throughput: 0: 42721.2. Samples: 5501912800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 01:56:37,264][15401] Updated weights for policy 0, policy_version 335810 (0.0036) [2024-06-23 01:56:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5501960192. Throughput: 0: 42723.5. Samples: 5502040640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 01:56:40,657][15401] Updated weights for policy 0, policy_version 335820 (0.0048) [2024-06-23 01:56:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5502173184. Throughput: 0: 42703.9. Samples: 5502293200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:43,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 01:56:44,979][15401] Updated weights for policy 0, policy_version 335830 (0.0030) [2024-06-23 01:56:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5502386176. Throughput: 0: 42762.4. Samples: 5502555120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 01:56:48,459][15401] Updated weights for policy 0, policy_version 335840 (0.0034) [2024-06-23 01:56:52,494][15401] Updated weights for policy 0, policy_version 335850 (0.0045) [2024-06-23 01:56:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 5502582784. Throughput: 0: 42734.7. Samples: 5502684060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 01:56:56,030][15401] Updated weights for policy 0, policy_version 335860 (0.0028) [2024-06-23 01:56:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5502828544. Throughput: 0: 42821.9. Samples: 5502936340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:56:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 01:56:59,956][15401] Updated weights for policy 0, policy_version 335870 (0.0031) [2024-06-23 01:57:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42326.1, 300 sec: 42931.6). Total num frames: 5503025152. Throughput: 0: 42840.9. Samples: 5503199740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:57:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 01:57:03,648][15401] Updated weights for policy 0, policy_version 335880 (0.0032) [2024-06-23 01:57:07,980][15401] Updated weights for policy 0, policy_version 335890 (0.0032) [2024-06-23 01:57:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5503221760. Throughput: 0: 42767.0. Samples: 5503325580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-23 01:57:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 01:57:11,187][15401] Updated weights for policy 0, policy_version 335900 (0.0026) [2024-06-23 01:57:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5503467520. Throughput: 0: 42862.7. Samples: 5503580420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 01:57:15,471][15401] Updated weights for policy 0, policy_version 335910 (0.0030) [2024-06-23 01:57:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42932.3). Total num frames: 5503680512. Throughput: 0: 42896.0. Samples: 5503843120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 01:57:19,022][15401] Updated weights for policy 0, policy_version 335920 (0.0035) [2024-06-23 01:57:23,017][15401] Updated weights for policy 0, policy_version 335930 (0.0033) [2024-06-23 01:57:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5503877120. Throughput: 0: 42849.7. Samples: 5503968880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:23,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 01:57:23,988][15349] Signal inference workers to stop experience collection... (81450 times) [2024-06-23 01:57:23,988][15349] Signal inference workers to resume experience collection... (81450 times) [2024-06-23 01:57:24,025][15401] InferenceWorker_p0-w0: stopping experience collection (81450 times) [2024-06-23 01:57:24,025][15401] InferenceWorker_p0-w0: resuming experience collection (81450 times) [2024-06-23 01:57:26,583][15401] Updated weights for policy 0, policy_version 335940 (0.0031) [2024-06-23 01:57:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43146.3, 300 sec: 42987.4). Total num frames: 5504122880. Throughput: 0: 42920.2. Samples: 5504224500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 01:57:31,113][15401] Updated weights for policy 0, policy_version 335950 (0.0038) [2024-06-23 01:57:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5504319488. Throughput: 0: 42842.6. Samples: 5504483040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 01:57:34,214][15401] Updated weights for policy 0, policy_version 335960 (0.0040) [2024-06-23 01:57:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 5504516096. Throughput: 0: 42722.3. Samples: 5504606560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 01:57:38,523][15401] Updated weights for policy 0, policy_version 335970 (0.0032) [2024-06-23 01:57:42,064][15401] Updated weights for policy 0, policy_version 335980 (0.0035) [2024-06-23 01:57:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 5504761856. Throughput: 0: 42928.1. Samples: 5504868100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 01:57:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335985_5504778240.pth... [2024-06-23 01:57:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335354_5494439936.pth [2024-06-23 01:57:45,989][15401] Updated weights for policy 0, policy_version 335990 (0.0054) [2024-06-23 01:57:48,391][15132] Fps is (10 sec: 44231.7, 60 sec: 42870.7, 300 sec: 42931.5). Total num frames: 5504958464. Throughput: 0: 42857.6. Samples: 5505128380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:48,391][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 01:57:49,531][15401] Updated weights for policy 0, policy_version 336000 (0.0028) [2024-06-23 01:57:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5505155072. Throughput: 0: 42740.6. Samples: 5505248900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 01:57:53,646][15401] Updated weights for policy 0, policy_version 336010 (0.0044) [2024-06-23 01:57:57,070][15401] Updated weights for policy 0, policy_version 336020 (0.0036) [2024-06-23 01:57:58,390][15132] Fps is (10 sec: 44241.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5505400832. Throughput: 0: 42829.3. Samples: 5505507740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:57:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 01:58:01,300][15401] Updated weights for policy 0, policy_version 336030 (0.0040) [2024-06-23 01:58:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 5505597440. Throughput: 0: 42826.3. Samples: 5505770300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:58:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 01:58:04,676][15401] Updated weights for policy 0, policy_version 336040 (0.0043) [2024-06-23 01:58:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5505810432. Throughput: 0: 42768.6. Samples: 5505893460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:58:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 01:58:09,082][15401] Updated weights for policy 0, policy_version 336050 (0.0035) [2024-06-23 01:58:12,260][15401] Updated weights for policy 0, policy_version 336060 (0.0032) [2024-06-23 01:58:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5506039808. Throughput: 0: 42881.7. Samples: 5506154180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:58:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 01:58:16,828][15401] Updated weights for policy 0, policy_version 336070 (0.0049) [2024-06-23 01:58:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5506236416. Throughput: 0: 42859.2. Samples: 5506411700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:58:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 01:58:19,950][15401] Updated weights for policy 0, policy_version 336080 (0.0036) [2024-06-23 01:58:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5506449408. Throughput: 0: 42969.8. Samples: 5506540200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 01:58:23,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-23 01:58:24,139][15401] Updated weights for policy 0, policy_version 336090 (0.0031) [2024-06-23 01:58:27,457][15401] Updated weights for policy 0, policy_version 336100 (0.0034) [2024-06-23 01:58:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.5). Total num frames: 5506695168. Throughput: 0: 42922.6. Samples: 5506799620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 01:58:31,549][15401] Updated weights for policy 0, policy_version 336110 (0.0040) [2024-06-23 01:58:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5506891776. Throughput: 0: 43000.6. Samples: 5507063360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:33,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 01:58:34,878][15401] Updated weights for policy 0, policy_version 336120 (0.0037) [2024-06-23 01:58:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5507104768. Throughput: 0: 43179.1. Samples: 5507191960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 01:58:39,173][15401] Updated weights for policy 0, policy_version 336130 (0.0036) [2024-06-23 01:58:43,070][15401] Updated weights for policy 0, policy_version 336140 (0.0029) [2024-06-23 01:58:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5507334144. Throughput: 0: 43164.0. Samples: 5507450120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 01:58:46,860][15401] Updated weights for policy 0, policy_version 336150 (0.0030) [2024-06-23 01:58:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42872.3, 300 sec: 42931.6). Total num frames: 5507530752. Throughput: 0: 43008.8. Samples: 5507705700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 01:58:50,565][15401] Updated weights for policy 0, policy_version 336160 (0.0041) [2024-06-23 01:58:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42932.1). Total num frames: 5507743744. Throughput: 0: 43121.2. Samples: 5507833920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 01:58:54,255][15401] Updated weights for policy 0, policy_version 336170 (0.0030) [2024-06-23 01:58:58,066][15401] Updated weights for policy 0, policy_version 336180 (0.0033) [2024-06-23 01:58:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5507973120. Throughput: 0: 43184.0. Samples: 5508097460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:58:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 01:59:01,941][15401] Updated weights for policy 0, policy_version 336190 (0.0034) [2024-06-23 01:59:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 5508186112. Throughput: 0: 43081.6. Samples: 5508350480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:03,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 01:59:05,805][15401] Updated weights for policy 0, policy_version 336200 (0.0037) [2024-06-23 01:59:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 5508382720. Throughput: 0: 43078.6. Samples: 5508478740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 01:59:09,405][15401] Updated weights for policy 0, policy_version 336210 (0.0025) [2024-06-23 01:59:12,827][15349] Signal inference workers to stop experience collection... (81500 times) [2024-06-23 01:59:12,827][15349] Signal inference workers to resume experience collection... (81500 times) [2024-06-23 01:59:12,847][15401] InferenceWorker_p0-w0: stopping experience collection (81500 times) [2024-06-23 01:59:12,847][15401] InferenceWorker_p0-w0: resuming experience collection (81500 times) [2024-06-23 01:59:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5508612096. Throughput: 0: 43096.8. Samples: 5508738980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 01:59:13,398][15401] Updated weights for policy 0, policy_version 336220 (0.0040) [2024-06-23 01:59:16,914][15401] Updated weights for policy 0, policy_version 336230 (0.0031) [2024-06-23 01:59:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 5508841472. Throughput: 0: 42994.6. Samples: 5508998120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:18,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 01:59:20,996][15401] Updated weights for policy 0, policy_version 336240 (0.0031) [2024-06-23 01:59:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5509021696. Throughput: 0: 42916.0. Samples: 5509123180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 01:59:24,454][15401] Updated weights for policy 0, policy_version 336250 (0.0040) [2024-06-23 01:59:28,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.6, 300 sec: 42820.5). Total num frames: 5509251072. Throughput: 0: 42857.3. Samples: 5509378800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:28,392][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 01:59:29,224][15401] Updated weights for policy 0, policy_version 336260 (0.0038) [2024-06-23 01:59:32,201][15401] Updated weights for policy 0, policy_version 336270 (0.0037) [2024-06-23 01:59:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 5509480448. Throughput: 0: 42967.5. Samples: 5509639240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:33,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 01:59:36,765][15401] Updated weights for policy 0, policy_version 336280 (0.0033) [2024-06-23 01:59:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42932.6). Total num frames: 5509677056. Throughput: 0: 43090.7. Samples: 5509773000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 01:59:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 01:59:39,619][15401] Updated weights for policy 0, policy_version 336290 (0.0032) [2024-06-23 01:59:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 5509906432. Throughput: 0: 42797.6. Samples: 5510023360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 01:59:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 01:59:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336298_5509906432.pth... [2024-06-23 01:59:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335670_5499617280.pth [2024-06-23 01:59:44,239][15401] Updated weights for policy 0, policy_version 336300 (0.0040) [2024-06-23 01:59:47,728][15401] Updated weights for policy 0, policy_version 336310 (0.0031) [2024-06-23 01:59:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 5510135808. Throughput: 0: 42973.0. Samples: 5510284160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 01:59:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 01:59:51,682][15401] Updated weights for policy 0, policy_version 336320 (0.0034) [2024-06-23 01:59:53,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 5510316032. Throughput: 0: 43118.1. Samples: 5510419160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 01:59:53,393][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 01:59:55,103][15401] Updated weights for policy 0, policy_version 336330 (0.0034) [2024-06-23 01:59:58,390][15132] Fps is (10 sec: 42596.8, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 5510561792. Throughput: 0: 43015.7. Samples: 5510674700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 01:59:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 01:59:59,092][15401] Updated weights for policy 0, policy_version 336340 (0.0044) [2024-06-23 02:00:02,734][15401] Updated weights for policy 0, policy_version 336350 (0.0030) [2024-06-23 02:00:03,389][15132] Fps is (10 sec: 45886.9, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 5510774784. Throughput: 0: 43138.0. Samples: 5510939320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 02:00:06,810][15401] Updated weights for policy 0, policy_version 336360 (0.0028) [2024-06-23 02:00:08,389][15132] Fps is (10 sec: 40961.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 5510971392. Throughput: 0: 43288.5. Samples: 5511071160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 02:00:10,236][15401] Updated weights for policy 0, policy_version 336370 (0.0037) [2024-06-23 02:00:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5511200768. Throughput: 0: 43201.1. Samples: 5511322740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:13,400][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 02:00:14,374][15401] Updated weights for policy 0, policy_version 336380 (0.0028) [2024-06-23 02:00:17,953][15401] Updated weights for policy 0, policy_version 336390 (0.0043) [2024-06-23 02:00:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 43043.1). Total num frames: 5511430144. Throughput: 0: 43304.1. Samples: 5511587920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 02:00:22,517][15401] Updated weights for policy 0, policy_version 336400 (0.0030) [2024-06-23 02:00:23,116][15349] Signal inference workers to stop experience collection... (81550 times) [2024-06-23 02:00:23,118][15349] Signal inference workers to resume experience collection... (81550 times) [2024-06-23 02:00:23,138][15401] InferenceWorker_p0-w0: stopping experience collection (81550 times) [2024-06-23 02:00:23,138][15401] InferenceWorker_p0-w0: resuming experience collection (81550 times) [2024-06-23 02:00:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 5511643136. Throughput: 0: 43281.8. Samples: 5511720680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:23,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 02:00:25,255][15401] Updated weights for policy 0, policy_version 336410 (0.0024) [2024-06-23 02:00:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 5511839744. Throughput: 0: 43346.4. Samples: 5511973940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:28,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 02:00:29,964][15401] Updated weights for policy 0, policy_version 336420 (0.0031) [2024-06-23 02:00:32,845][15401] Updated weights for policy 0, policy_version 336430 (0.0035) [2024-06-23 02:00:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5512069120. Throughput: 0: 43341.4. Samples: 5512234520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 02:00:37,394][15401] Updated weights for policy 0, policy_version 336440 (0.0029) [2024-06-23 02:00:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43688.9, 300 sec: 42986.8). Total num frames: 5512298496. Throughput: 0: 43324.9. Samples: 5512368780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:38,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 02:00:40,767][15401] Updated weights for policy 0, policy_version 336450 (0.0028) [2024-06-23 02:00:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5512478720. Throughput: 0: 43157.5. Samples: 5512616780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:43,399][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 02:00:45,196][15401] Updated weights for policy 0, policy_version 336460 (0.0033) [2024-06-23 02:00:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5512708096. Throughput: 0: 42952.8. Samples: 5512872200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 02:00:48,411][15401] Updated weights for policy 0, policy_version 336470 (0.0043) [2024-06-23 02:00:52,937][15401] Updated weights for policy 0, policy_version 336480 (0.0029) [2024-06-23 02:00:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 5512904704. Throughput: 0: 42949.4. Samples: 5513003880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 02:00:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 02:00:55,915][15401] Updated weights for policy 0, policy_version 336490 (0.0035) [2024-06-23 02:00:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42870.0, 300 sec: 42875.9). Total num frames: 5513134080. Throughput: 0: 42784.3. Samples: 5513248140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:00:58,401][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 02:01:00,515][15401] Updated weights for policy 0, policy_version 336500 (0.0039) [2024-06-23 02:01:03,386][15401] Updated weights for policy 0, policy_version 336510 (0.0040) [2024-06-23 02:01:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 5513379840. Throughput: 0: 42789.3. Samples: 5513513440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 02:01:08,267][15401] Updated weights for policy 0, policy_version 336520 (0.0030) [2024-06-23 02:01:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5513560064. Throughput: 0: 42730.1. Samples: 5513643540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 02:01:10,869][15401] Updated weights for policy 0, policy_version 336530 (0.0045) [2024-06-23 02:01:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5513789440. Throughput: 0: 42641.3. Samples: 5513892800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 02:01:15,693][15401] Updated weights for policy 0, policy_version 336540 (0.0037) [2024-06-23 02:01:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 5514018816. Throughput: 0: 42629.8. Samples: 5514152860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 02:01:18,483][15401] Updated weights for policy 0, policy_version 336550 (0.0022) [2024-06-23 02:01:23,171][15349] Signal inference workers to stop experience collection... (81600 times) [2024-06-23 02:01:23,172][15349] Signal inference workers to resume experience collection... (81600 times) [2024-06-23 02:01:23,209][15401] InferenceWorker_p0-w0: stopping experience collection (81600 times) [2024-06-23 02:01:23,209][15401] InferenceWorker_p0-w0: resuming experience collection (81600 times) [2024-06-23 02:01:23,314][15401] Updated weights for policy 0, policy_version 336560 (0.0034) [2024-06-23 02:01:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42932.0). Total num frames: 5514199040. Throughput: 0: 42568.1. Samples: 5514284240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 02:01:26,328][15401] Updated weights for policy 0, policy_version 336570 (0.0027) [2024-06-23 02:01:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 5514428416. Throughput: 0: 42787.2. Samples: 5514542300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:28,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 02:01:30,874][15401] Updated weights for policy 0, policy_version 336580 (0.0031) [2024-06-23 02:01:33,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 5514657792. Throughput: 0: 42867.5. Samples: 5514801340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:33,393][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 02:01:33,846][15401] Updated weights for policy 0, policy_version 336590 (0.0031) [2024-06-23 02:01:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42327.0, 300 sec: 42932.0). Total num frames: 5514838016. Throughput: 0: 42844.8. Samples: 5514931900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 02:01:38,416][15401] Updated weights for policy 0, policy_version 336600 (0.0038) [2024-06-23 02:01:41,670][15401] Updated weights for policy 0, policy_version 336610 (0.0031) [2024-06-23 02:01:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5515083776. Throughput: 0: 43028.6. Samples: 5515184320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 02:01:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336614_5515083776.pth... [2024-06-23 02:01:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000335985_5504778240.pth [2024-06-23 02:01:46,148][15401] Updated weights for policy 0, policy_version 336620 (0.0028) [2024-06-23 02:01:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 5515296768. Throughput: 0: 42950.3. Samples: 5515446200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 02:01:49,284][15401] Updated weights for policy 0, policy_version 336630 (0.0030) [2024-06-23 02:01:53,389][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5515444224. Throughput: 0: 42841.8. Samples: 5515571420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 02:01:53,868][15401] Updated weights for policy 0, policy_version 336640 (0.0053) [2024-06-23 02:01:56,896][15401] Updated weights for policy 0, policy_version 336650 (0.0033) [2024-06-23 02:01:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43419.4, 300 sec: 43098.3). Total num frames: 5515739136. Throughput: 0: 42973.0. Samples: 5515826580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:01:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 02:02:01,428][15401] Updated weights for policy 0, policy_version 336660 (0.0028) [2024-06-23 02:02:03,390][15132] Fps is (10 sec: 50790.0, 60 sec: 42871.4, 300 sec: 43153.8). Total num frames: 5515952128. Throughput: 0: 43109.6. Samples: 5516092800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:02:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 02:02:04,532][15401] Updated weights for policy 0, policy_version 336670 (0.0031) [2024-06-23 02:02:08,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5516099584. Throughput: 0: 43012.5. Samples: 5516219800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:02:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 02:02:09,168][15401] Updated weights for policy 0, policy_version 336680 (0.0046) [2024-06-23 02:02:12,093][15401] Updated weights for policy 0, policy_version 336690 (0.0039) [2024-06-23 02:02:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5516378112. Throughput: 0: 42915.6. Samples: 5516473400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 02:02:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 02:02:17,048][15401] Updated weights for policy 0, policy_version 336700 (0.0033) [2024-06-23 02:02:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 5516574720. Throughput: 0: 42994.7. Samples: 5516736000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 02:02:19,714][15401] Updated weights for policy 0, policy_version 336710 (0.0021) [2024-06-23 02:02:23,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5516754944. Throughput: 0: 42813.3. Samples: 5516858600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:23,392][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 02:02:24,658][15401] Updated weights for policy 0, policy_version 336720 (0.0033) [2024-06-23 02:02:27,308][15401] Updated weights for policy 0, policy_version 336730 (0.0028) [2024-06-23 02:02:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43419.4, 300 sec: 43098.3). Total num frames: 5517033472. Throughput: 0: 42912.5. Samples: 5517115380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 02:02:32,629][15401] Updated weights for policy 0, policy_version 336740 (0.0024) [2024-06-23 02:02:33,158][15349] Signal inference workers to stop experience collection... (81650 times) [2024-06-23 02:02:33,211][15401] InferenceWorker_p0-w0: stopping experience collection (81650 times) [2024-06-23 02:02:33,220][15349] Signal inference workers to resume experience collection... (81650 times) [2024-06-23 02:02:33,226][15401] InferenceWorker_p0-w0: resuming experience collection (81650 times) [2024-06-23 02:02:33,392][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 43042.3). Total num frames: 5517213696. Throughput: 0: 42971.7. Samples: 5517380040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:33,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 02:02:34,990][15401] Updated weights for policy 0, policy_version 336750 (0.0037) [2024-06-23 02:02:38,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5517410304. Throughput: 0: 42794.5. Samples: 5517497180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 02:02:40,076][15401] Updated weights for policy 0, policy_version 336760 (0.0026) [2024-06-23 02:02:42,480][15401] Updated weights for policy 0, policy_version 336770 (0.0032) [2024-06-23 02:02:43,389][15132] Fps is (10 sec: 45887.0, 60 sec: 43144.6, 300 sec: 43098.4). Total num frames: 5517672448. Throughput: 0: 42844.0. Samples: 5517754560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:43,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 02:02:47,570][15401] Updated weights for policy 0, policy_version 336780 (0.0041) [2024-06-23 02:02:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.2, 300 sec: 43042.7). Total num frames: 5517852672. Throughput: 0: 42940.4. Samples: 5518025120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:48,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 02:02:50,066][15401] Updated weights for policy 0, policy_version 336790 (0.0036) [2024-06-23 02:02:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 5518065664. Throughput: 0: 42720.5. Samples: 5518142220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 02:02:55,218][15401] Updated weights for policy 0, policy_version 336800 (0.0029) [2024-06-23 02:02:57,656][15401] Updated weights for policy 0, policy_version 336810 (0.0027) [2024-06-23 02:02:58,395][15132] Fps is (10 sec: 47488.0, 60 sec: 43140.6, 300 sec: 43153.0). Total num frames: 5518327808. Throughput: 0: 43007.7. Samples: 5518408980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:02:58,396][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 02:03:03,065][15401] Updated weights for policy 0, policy_version 336820 (0.0039) [2024-06-23 02:03:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42931.6). Total num frames: 5518475264. Throughput: 0: 43137.4. Samples: 5518677180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 02:03:05,192][15401] Updated weights for policy 0, policy_version 336830 (0.0030) [2024-06-23 02:03:08,389][15132] Fps is (10 sec: 39343.6, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 5518721024. Throughput: 0: 42962.8. Samples: 5518791820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:08,391][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 02:03:10,691][15401] Updated weights for policy 0, policy_version 336840 (0.0036) [2024-06-23 02:03:12,734][15401] Updated weights for policy 0, policy_version 336850 (0.0034) [2024-06-23 02:03:13,392][15132] Fps is (10 sec: 49139.8, 60 sec: 43142.8, 300 sec: 43153.4). Total num frames: 5518966784. Throughput: 0: 43114.5. Samples: 5519055640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:13,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 02:03:18,293][15401] Updated weights for policy 0, policy_version 336860 (0.0038) [2024-06-23 02:03:18,394][15132] Fps is (10 sec: 39303.8, 60 sec: 42322.2, 300 sec: 42931.0). Total num frames: 5519114240. Throughput: 0: 43154.6. Samples: 5519322080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:18,395][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 02:03:20,245][15401] Updated weights for policy 0, policy_version 336870 (0.0032) [2024-06-23 02:03:23,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43692.4, 300 sec: 42987.2). Total num frames: 5519376384. Throughput: 0: 43076.9. Samples: 5519435640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:23,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-23 02:03:25,911][15401] Updated weights for policy 0, policy_version 336880 (0.0038) [2024-06-23 02:03:26,421][15349] Signal inference workers to stop experience collection... (81700 times) [2024-06-23 02:03:26,422][15349] Signal inference workers to resume experience collection... (81700 times) [2024-06-23 02:03:26,468][15401] InferenceWorker_p0-w0: stopping experience collection (81700 times) [2024-06-23 02:03:26,468][15401] InferenceWorker_p0-w0: resuming experience collection (81700 times) [2024-06-23 02:03:27,978][15401] Updated weights for policy 0, policy_version 336890 (0.0039) [2024-06-23 02:03:28,392][15132] Fps is (10 sec: 49162.0, 60 sec: 42869.7, 300 sec: 43097.9). Total num frames: 5519605760. Throughput: 0: 43244.3. Samples: 5519700660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 02:03:28,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 02:03:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 5519753216. Throughput: 0: 42990.3. Samples: 5519959680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 02:03:33,490][15401] Updated weights for policy 0, policy_version 336900 (0.0039) [2024-06-23 02:03:35,774][15401] Updated weights for policy 0, policy_version 336910 (0.0033) [2024-06-23 02:03:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 5520015360. Throughput: 0: 42852.9. Samples: 5520070600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 02:03:41,042][15401] Updated weights for policy 0, policy_version 336920 (0.0033) [2024-06-23 02:03:43,392][15132] Fps is (10 sec: 49140.3, 60 sec: 42869.7, 300 sec: 43097.9). Total num frames: 5520244736. Throughput: 0: 42869.6. Samples: 5520337980. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:43,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 02:03:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336929_5520244736.pth... [2024-06-23 02:03:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336298_5509906432.pth [2024-06-23 02:03:44,001][15401] Updated weights for policy 0, policy_version 336930 (0.0038) [2024-06-23 02:03:48,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5520392192. Throughput: 0: 42805.3. Samples: 5520603420. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:48,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 02:03:48,637][15401] Updated weights for policy 0, policy_version 336940 (0.0027) [2024-06-23 02:03:51,600][15401] Updated weights for policy 0, policy_version 336950 (0.0026) [2024-06-23 02:03:53,392][15132] Fps is (10 sec: 42598.1, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 5520670720. Throughput: 0: 42791.4. Samples: 5520717540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:53,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:03:56,114][15401] Updated weights for policy 0, policy_version 336960 (0.0038) [2024-06-23 02:03:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42056.0, 300 sec: 42932.0). Total num frames: 5520850944. Throughput: 0: 42812.9. Samples: 5520982120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:03:58,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 02:03:59,360][15401] Updated weights for policy 0, policy_version 336970 (0.0037) [2024-06-23 02:04:03,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5521047552. Throughput: 0: 42769.1. Samples: 5521246500. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 02:04:03,659][15401] Updated weights for policy 0, policy_version 336980 (0.0028) [2024-06-23 02:04:07,317][15401] Updated weights for policy 0, policy_version 336990 (0.0038) [2024-06-23 02:04:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5521293312. Throughput: 0: 42832.5. Samples: 5521363100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 02:04:11,363][15349] Signal inference workers to stop experience collection... (81750 times) [2024-06-23 02:04:11,416][15401] InferenceWorker_p0-w0: stopping experience collection (81750 times) [2024-06-23 02:04:11,423][15349] Signal inference workers to resume experience collection... (81750 times) [2024-06-23 02:04:11,437][15401] InferenceWorker_p0-w0: resuming experience collection (81750 times) [2024-06-23 02:04:11,559][15401] Updated weights for policy 0, policy_version 337000 (0.0028) [2024-06-23 02:04:13,395][15132] Fps is (10 sec: 45848.1, 60 sec: 42322.9, 300 sec: 42930.8). Total num frames: 5521506304. Throughput: 0: 42652.2. Samples: 5521620160. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:13,396][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 02:04:14,803][15401] Updated weights for policy 0, policy_version 337010 (0.0038) [2024-06-23 02:04:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42874.6, 300 sec: 42931.6). Total num frames: 5521686528. Throughput: 0: 42655.1. Samples: 5521879160. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 02:04:19,140][15401] Updated weights for policy 0, policy_version 337020 (0.0036) [2024-06-23 02:04:22,418][15401] Updated weights for policy 0, policy_version 337030 (0.0041) [2024-06-23 02:04:23,389][15132] Fps is (10 sec: 44263.1, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 5521948672. Throughput: 0: 42873.7. Samples: 5521999920. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 02:04:26,698][15401] Updated weights for policy 0, policy_version 337040 (0.0034) [2024-06-23 02:04:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42052.3, 300 sec: 42875.8). Total num frames: 5522128896. Throughput: 0: 42673.7. Samples: 5522258300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:28,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 02:04:29,949][15401] Updated weights for policy 0, policy_version 337050 (0.0039) [2024-06-23 02:04:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5522325504. Throughput: 0: 42572.5. Samples: 5522519180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 02:04:34,423][15401] Updated weights for policy 0, policy_version 337060 (0.0040) [2024-06-23 02:04:37,934][15401] Updated weights for policy 0, policy_version 337070 (0.0022) [2024-06-23 02:04:38,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5522587648. Throughput: 0: 42715.7. Samples: 5522639640. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 02:04:41,822][15401] Updated weights for policy 0, policy_version 337080 (0.0026) [2024-06-23 02:04:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 5522800640. Throughput: 0: 42579.7. Samples: 5522898200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 02:04:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 02:04:45,528][15401] Updated weights for policy 0, policy_version 337090 (0.0054) [2024-06-23 02:04:48,396][15132] Fps is (10 sec: 39295.8, 60 sec: 43139.9, 300 sec: 42931.0). Total num frames: 5522980864. Throughput: 0: 42588.1. Samples: 5523163240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:04:48,396][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 02:04:49,614][15401] Updated weights for policy 0, policy_version 337100 (0.0042) [2024-06-23 02:04:53,071][15401] Updated weights for policy 0, policy_version 337110 (0.0028) [2024-06-23 02:04:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 5523226624. Throughput: 0: 42714.7. Samples: 5523285260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:04:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 02:04:57,248][15401] Updated weights for policy 0, policy_version 337120 (0.0037) [2024-06-23 02:04:58,390][15132] Fps is (10 sec: 44265.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5523423232. Throughput: 0: 42699.8. Samples: 5523541400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:04:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:05:00,734][15401] Updated weights for policy 0, policy_version 337130 (0.0038) [2024-06-23 02:05:01,635][15349] Signal inference workers to stop experience collection... (81800 times) [2024-06-23 02:05:01,635][15349] Signal inference workers to resume experience collection... (81800 times) [2024-06-23 02:05:01,685][15401] InferenceWorker_p0-w0: stopping experience collection (81800 times) [2024-06-23 02:05:01,685][15401] InferenceWorker_p0-w0: resuming experience collection (81800 times) [2024-06-23 02:05:03,389][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5523587072. Throughput: 0: 42722.3. Samples: 5523801660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 02:05:04,909][15401] Updated weights for policy 0, policy_version 337140 (0.0035) [2024-06-23 02:05:08,280][15401] Updated weights for policy 0, policy_version 337150 (0.0025) [2024-06-23 02:05:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 5523865600. Throughput: 0: 42735.5. Samples: 5523923120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:08,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 02:05:12,418][15401] Updated weights for policy 0, policy_version 337160 (0.0036) [2024-06-23 02:05:13,392][15132] Fps is (10 sec: 49140.1, 60 sec: 42874.0, 300 sec: 42875.7). Total num frames: 5524078592. Throughput: 0: 42895.5. Samples: 5524188600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:13,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 02:05:15,939][15401] Updated weights for policy 0, policy_version 337170 (0.0023) [2024-06-23 02:05:18,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5524242432. Throughput: 0: 42998.7. Samples: 5524454120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 02:05:19,797][15401] Updated weights for policy 0, policy_version 337180 (0.0028) [2024-06-23 02:05:23,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 5524504576. Throughput: 0: 43045.6. Samples: 5524576800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:23,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 02:05:23,640][15401] Updated weights for policy 0, policy_version 337190 (0.0039) [2024-06-23 02:05:27,574][15401] Updated weights for policy 0, policy_version 337200 (0.0035) [2024-06-23 02:05:28,390][15132] Fps is (10 sec: 49151.4, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 5524733952. Throughput: 0: 43144.3. Samples: 5524839700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 02:05:31,367][15401] Updated weights for policy 0, policy_version 337210 (0.0029) [2024-06-23 02:05:33,390][15132] Fps is (10 sec: 39329.8, 60 sec: 42871.2, 300 sec: 42709.8). Total num frames: 5524897792. Throughput: 0: 43042.3. Samples: 5525099880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 02:05:34,974][15401] Updated weights for policy 0, policy_version 337220 (0.0032) [2024-06-23 02:05:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 5525143552. Throughput: 0: 42995.1. Samples: 5525220040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:38,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 02:05:39,252][15401] Updated weights for policy 0, policy_version 337230 (0.0038) [2024-06-23 02:05:42,612][15401] Updated weights for policy 0, policy_version 337240 (0.0040) [2024-06-23 02:05:43,389][15132] Fps is (10 sec: 47515.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5525372928. Throughput: 0: 43097.8. Samples: 5525480800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 02:05:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337242_5525372928.pth... [2024-06-23 02:05:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336614_5515083776.pth [2024-06-23 02:05:47,034][15401] Updated weights for policy 0, policy_version 337250 (0.0043) [2024-06-23 02:05:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42603.0, 300 sec: 42820.5). Total num frames: 5525536768. Throughput: 0: 43077.8. Samples: 5525740160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:48,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 02:05:50,035][15401] Updated weights for policy 0, policy_version 337260 (0.0043) [2024-06-23 02:05:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 5525798912. Throughput: 0: 42978.3. Samples: 5525857040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 02:05:54,586][15401] Updated weights for policy 0, policy_version 337270 (0.0045) [2024-06-23 02:05:57,595][15401] Updated weights for policy 0, policy_version 337280 (0.0031) [2024-06-23 02:05:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5526011904. Throughput: 0: 42937.0. Samples: 5526120660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:05:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 02:06:02,154][15401] Updated weights for policy 0, policy_version 337290 (0.0036) [2024-06-23 02:06:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5526175744. Throughput: 0: 42843.1. Samples: 5526382060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 02:06:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 02:06:05,663][15401] Updated weights for policy 0, policy_version 337300 (0.0036) [2024-06-23 02:06:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5526421504. Throughput: 0: 42837.1. Samples: 5526504360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 02:06:09,639][15401] Updated weights for policy 0, policy_version 337310 (0.0036) [2024-06-23 02:06:11,167][15349] Signal inference workers to stop experience collection... (81850 times) [2024-06-23 02:06:11,208][15401] InferenceWorker_p0-w0: stopping experience collection (81850 times) [2024-06-23 02:06:11,217][15349] Signal inference workers to resume experience collection... (81850 times) [2024-06-23 02:06:11,219][15401] InferenceWorker_p0-w0: resuming experience collection (81850 times) [2024-06-23 02:06:13,119][15401] Updated weights for policy 0, policy_version 337320 (0.0031) [2024-06-23 02:06:13,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 5526667264. Throughput: 0: 42791.5. Samples: 5526765320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 02:06:17,478][15401] Updated weights for policy 0, policy_version 337330 (0.0024) [2024-06-23 02:06:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 5526831104. Throughput: 0: 42685.2. Samples: 5527020800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:18,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 02:06:20,665][15401] Updated weights for policy 0, policy_version 337340 (0.0032) [2024-06-23 02:06:23,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42599.8, 300 sec: 42820.8). Total num frames: 5527060480. Throughput: 0: 42705.7. Samples: 5527141820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:23,391][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 02:06:25,185][15401] Updated weights for policy 0, policy_version 337350 (0.0043) [2024-06-23 02:06:28,257][15401] Updated weights for policy 0, policy_version 337360 (0.0036) [2024-06-23 02:06:28,392][15132] Fps is (10 sec: 47513.1, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 5527306240. Throughput: 0: 42823.4. Samples: 5527407960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:28,393][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 02:06:32,931][15401] Updated weights for policy 0, policy_version 337370 (0.0033) [2024-06-23 02:06:33,389][15132] Fps is (10 sec: 40962.2, 60 sec: 42871.8, 300 sec: 42820.6). Total num frames: 5527470080. Throughput: 0: 42714.3. Samples: 5527662300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 02:06:35,975][15401] Updated weights for policy 0, policy_version 337380 (0.0026) [2024-06-23 02:06:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5527715840. Throughput: 0: 42791.1. Samples: 5527782640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:38,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 02:06:40,542][15401] Updated weights for policy 0, policy_version 337390 (0.0042) [2024-06-23 02:06:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5527928832. Throughput: 0: 42788.1. Samples: 5528046120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:43,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 02:06:43,721][15401] Updated weights for policy 0, policy_version 337400 (0.0027) [2024-06-23 02:06:48,226][15401] Updated weights for policy 0, policy_version 337410 (0.0039) [2024-06-23 02:06:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5528125440. Throughput: 0: 42556.9. Samples: 5528297120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 02:06:51,366][15401] Updated weights for policy 0, policy_version 337420 (0.0041) [2024-06-23 02:06:53,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 5528354816. Throughput: 0: 42530.3. Samples: 5528418500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:53,396][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 02:06:56,530][15401] Updated weights for policy 0, policy_version 337430 (0.0041) [2024-06-23 02:06:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5528551424. Throughput: 0: 42511.4. Samples: 5528678320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:06:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 02:06:59,273][15401] Updated weights for policy 0, policy_version 337440 (0.0030) [2024-06-23 02:07:03,390][15132] Fps is (10 sec: 39346.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5528748032. Throughput: 0: 42348.9. Samples: 5528926400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:07:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 02:07:04,099][15401] Updated weights for policy 0, policy_version 337450 (0.0045) [2024-06-23 02:07:07,162][15401] Updated weights for policy 0, policy_version 337460 (0.0041) [2024-06-23 02:07:08,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 5528993792. Throughput: 0: 42461.7. Samples: 5529052680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:07:08,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 02:07:11,677][15401] Updated weights for policy 0, policy_version 337470 (0.0031) [2024-06-23 02:07:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42050.7, 300 sec: 42764.7). Total num frames: 5529190400. Throughput: 0: 42375.6. Samples: 5529314860. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:07:13,393][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 02:07:14,893][15401] Updated weights for policy 0, policy_version 337480 (0.0033) [2024-06-23 02:07:18,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42600.1, 300 sec: 42820.9). Total num frames: 5529387008. Throughput: 0: 42274.6. Samples: 5529564660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 02:07:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 02:07:19,250][15401] Updated weights for policy 0, policy_version 337490 (0.0024) [2024-06-23 02:07:22,493][15401] Updated weights for policy 0, policy_version 337500 (0.0035) [2024-06-23 02:07:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 5529632768. Throughput: 0: 42452.5. Samples: 5529693000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 02:07:26,762][15401] Updated weights for policy 0, policy_version 337510 (0.0036) [2024-06-23 02:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41781.0, 300 sec: 42709.9). Total num frames: 5529812992. Throughput: 0: 42294.7. Samples: 5529949380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 02:07:29,095][15349] Signal inference workers to stop experience collection... (81900 times) [2024-06-23 02:07:29,096][15349] Signal inference workers to resume experience collection... (81900 times) [2024-06-23 02:07:29,137][15401] InferenceWorker_p0-w0: stopping experience collection (81900 times) [2024-06-23 02:07:29,137][15401] InferenceWorker_p0-w0: resuming experience collection (81900 times) [2024-06-23 02:07:30,145][15401] Updated weights for policy 0, policy_version 337520 (0.0028) [2024-06-23 02:07:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5530025984. Throughput: 0: 42430.2. Samples: 5530206480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 02:07:34,387][15401] Updated weights for policy 0, policy_version 337530 (0.0031) [2024-06-23 02:07:38,050][15401] Updated weights for policy 0, policy_version 337540 (0.0036) [2024-06-23 02:07:38,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5530255360. Throughput: 0: 42507.8. Samples: 5530331080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 02:07:42,177][15401] Updated weights for policy 0, policy_version 337550 (0.0031) [2024-06-23 02:07:43,390][15132] Fps is (10 sec: 44232.8, 60 sec: 42324.6, 300 sec: 42764.9). Total num frames: 5530468352. Throughput: 0: 42504.4. Samples: 5530591060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:43,391][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 02:07:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337553_5530468352.pth... [2024-06-23 02:07:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000336929_5520244736.pth [2024-06-23 02:07:45,703][15401] Updated weights for policy 0, policy_version 337560 (0.0029) [2024-06-23 02:07:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5530697728. Throughput: 0: 42665.7. Samples: 5530846360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:48,399][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 02:07:49,645][15401] Updated weights for policy 0, policy_version 337570 (0.0038) [2024-06-23 02:07:53,266][15401] Updated weights for policy 0, policy_version 337580 (0.0033) [2024-06-23 02:07:53,390][15132] Fps is (10 sec: 44240.6, 60 sec: 42602.9, 300 sec: 42654.7). Total num frames: 5530910720. Throughput: 0: 42780.9. Samples: 5530977720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 02:07:57,035][15401] Updated weights for policy 0, policy_version 337590 (0.0037) [2024-06-23 02:07:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5531090944. Throughput: 0: 42547.7. Samples: 5531229400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:07:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 02:08:00,743][15401] Updated weights for policy 0, policy_version 337600 (0.0032) [2024-06-23 02:08:03,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 5531320320. Throughput: 0: 42782.3. Samples: 5531490140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:03,396][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 02:08:04,907][15401] Updated weights for policy 0, policy_version 337610 (0.0041) [2024-06-23 02:08:08,304][15401] Updated weights for policy 0, policy_version 337620 (0.0038) [2024-06-23 02:08:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 5531566080. Throughput: 0: 42938.7. Samples: 5531625240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 02:08:12,282][15401] Updated weights for policy 0, policy_version 337630 (0.0022) [2024-06-23 02:08:13,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42600.1, 300 sec: 42821.2). Total num frames: 5531746304. Throughput: 0: 42975.9. Samples: 5531883300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:13,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 02:08:15,809][15401] Updated weights for policy 0, policy_version 337640 (0.0033) [2024-06-23 02:08:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5531975680. Throughput: 0: 42949.3. Samples: 5532139200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:18,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 02:08:20,131][15401] Updated weights for policy 0, policy_version 337650 (0.0040) [2024-06-23 02:08:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5532205056. Throughput: 0: 43061.7. Samples: 5532268860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 02:08:23,471][15401] Updated weights for policy 0, policy_version 337660 (0.0036) [2024-06-23 02:08:27,875][15401] Updated weights for policy 0, policy_version 337670 (0.0035) [2024-06-23 02:08:28,392][15132] Fps is (10 sec: 42589.2, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 5532401664. Throughput: 0: 42896.5. Samples: 5532521460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:28,392][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 02:08:31,190][15401] Updated weights for policy 0, policy_version 337680 (0.0032) [2024-06-23 02:08:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5532598272. Throughput: 0: 43056.5. Samples: 5532783900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 02:08:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 02:08:35,510][15401] Updated weights for policy 0, policy_version 337690 (0.0032) [2024-06-23 02:08:38,392][15132] Fps is (10 sec: 45874.3, 60 sec: 43415.9, 300 sec: 42765.0). Total num frames: 5532860416. Throughput: 0: 43021.7. Samples: 5532913800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:08:38,393][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 02:08:38,859][15401] Updated weights for policy 0, policy_version 337700 (0.0033) [2024-06-23 02:08:43,136][15401] Updated weights for policy 0, policy_version 337710 (0.0027) [2024-06-23 02:08:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42872.1, 300 sec: 42876.1). Total num frames: 5533040640. Throughput: 0: 43161.7. Samples: 5533171680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:08:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 02:08:46,390][15401] Updated weights for policy 0, policy_version 337720 (0.0030) [2024-06-23 02:08:48,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 5533237248. Throughput: 0: 43140.3. Samples: 5533431180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:08:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 02:08:50,659][15401] Updated weights for policy 0, policy_version 337730 (0.0029) [2024-06-23 02:08:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5533499392. Throughput: 0: 42947.5. Samples: 5533557880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:08:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 02:08:53,918][15401] Updated weights for policy 0, policy_version 337740 (0.0038) [2024-06-23 02:08:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5533663232. Throughput: 0: 43015.6. Samples: 5533819000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:08:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 02:08:58,659][15401] Updated weights for policy 0, policy_version 337750 (0.0045) [2024-06-23 02:08:58,996][15349] Signal inference workers to stop experience collection... (81950 times) [2024-06-23 02:08:58,997][15349] Signal inference workers to resume experience collection... (81950 times) [2024-06-23 02:08:59,022][15401] InferenceWorker_p0-w0: stopping experience collection (81950 times) [2024-06-23 02:08:59,022][15401] InferenceWorker_p0-w0: resuming experience collection (81950 times) [2024-06-23 02:09:01,845][15401] Updated weights for policy 0, policy_version 337760 (0.0028) [2024-06-23 02:09:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42876.2, 300 sec: 42709.5). Total num frames: 5533892608. Throughput: 0: 42960.2. Samples: 5534072400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 02:09:06,093][15401] Updated weights for policy 0, policy_version 337770 (0.0039) [2024-06-23 02:09:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 5534121984. Throughput: 0: 42853.8. Samples: 5534197280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 02:09:09,860][15401] Updated weights for policy 0, policy_version 337780 (0.0037) [2024-06-23 02:09:13,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5534318592. Throughput: 0: 42972.8. Samples: 5534455140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:13,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 02:09:13,570][15401] Updated weights for policy 0, policy_version 337790 (0.0032) [2024-06-23 02:09:17,383][15401] Updated weights for policy 0, policy_version 337800 (0.0032) [2024-06-23 02:09:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5534531584. Throughput: 0: 42679.6. Samples: 5534704480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 02:09:21,519][15401] Updated weights for policy 0, policy_version 337810 (0.0022) [2024-06-23 02:09:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 5534777344. Throughput: 0: 42650.3. Samples: 5534832960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 02:09:25,114][15401] Updated weights for policy 0, policy_version 337820 (0.0032) [2024-06-23 02:09:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42599.9, 300 sec: 42820.5). Total num frames: 5534957568. Throughput: 0: 42739.8. Samples: 5535094980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 02:09:29,118][15401] Updated weights for policy 0, policy_version 337830 (0.0023) [2024-06-23 02:09:32,765][15401] Updated weights for policy 0, policy_version 337840 (0.0031) [2024-06-23 02:09:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 5535186944. Throughput: 0: 42560.9. Samples: 5535346520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:33,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 02:09:36,585][15401] Updated weights for policy 0, policy_version 337850 (0.0033) [2024-06-23 02:09:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5535416320. Throughput: 0: 42706.2. Samples: 5535479660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 02:09:40,515][15401] Updated weights for policy 0, policy_version 337860 (0.0032) [2024-06-23 02:09:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 5535596544. Throughput: 0: 42575.1. Samples: 5535734880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 02:09:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337866_5535596544.pth... [2024-06-23 02:09:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337242_5525372928.pth [2024-06-23 02:09:44,632][15401] Updated weights for policy 0, policy_version 337870 (0.0024) [2024-06-23 02:09:48,171][15401] Updated weights for policy 0, policy_version 337880 (0.0036) [2024-06-23 02:09:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5535825920. Throughput: 0: 42636.0. Samples: 5535991020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 02:09:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 02:09:52,211][15401] Updated weights for policy 0, policy_version 337890 (0.0041) [2024-06-23 02:09:53,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5536055296. Throughput: 0: 42772.4. Samples: 5536122140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:09:53,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 02:09:55,704][15401] Updated weights for policy 0, policy_version 337900 (0.0041) [2024-06-23 02:09:58,395][15132] Fps is (10 sec: 40937.6, 60 sec: 42867.6, 300 sec: 42875.3). Total num frames: 5536235520. Throughput: 0: 42666.0. Samples: 5536375340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:09:58,395][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 02:09:59,667][15401] Updated weights for policy 0, policy_version 337910 (0.0039) [2024-06-23 02:10:03,222][15401] Updated weights for policy 0, policy_version 337920 (0.0025) [2024-06-23 02:10:03,392][15132] Fps is (10 sec: 42598.3, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 5536481280. Throughput: 0: 42827.4. Samples: 5536631820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:03,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 02:10:07,191][15401] Updated weights for policy 0, policy_version 337930 (0.0027) [2024-06-23 02:10:08,389][15132] Fps is (10 sec: 44261.2, 60 sec: 42598.5, 300 sec: 42709.9). Total num frames: 5536677888. Throughput: 0: 42998.8. Samples: 5536767900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 02:10:10,858][15401] Updated weights for policy 0, policy_version 337940 (0.0027) [2024-06-23 02:10:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5536890880. Throughput: 0: 42735.2. Samples: 5537018060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 02:10:14,869][15401] Updated weights for policy 0, policy_version 337950 (0.0033) [2024-06-23 02:10:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5537120256. Throughput: 0: 42877.0. Samples: 5537275880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 02:10:18,628][15401] Updated weights for policy 0, policy_version 337960 (0.0043) [2024-06-23 02:10:22,537][15401] Updated weights for policy 0, policy_version 337970 (0.0033) [2024-06-23 02:10:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5537316864. Throughput: 0: 42776.5. Samples: 5537404600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 02:10:26,593][15401] Updated weights for policy 0, policy_version 337980 (0.0035) [2024-06-23 02:10:28,390][15132] Fps is (10 sec: 40957.7, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 5537529856. Throughput: 0: 42878.5. Samples: 5537664440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:28,391][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 02:10:30,125][15401] Updated weights for policy 0, policy_version 337990 (0.0037) [2024-06-23 02:10:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 5537759232. Throughput: 0: 42716.3. Samples: 5537913260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 02:10:34,305][15401] Updated weights for policy 0, policy_version 338000 (0.0033) [2024-06-23 02:10:37,608][15401] Updated weights for policy 0, policy_version 338010 (0.0026) [2024-06-23 02:10:38,390][15132] Fps is (10 sec: 44239.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5537972224. Throughput: 0: 42750.6. Samples: 5538045820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 02:10:41,869][15401] Updated weights for policy 0, policy_version 338020 (0.0035) [2024-06-23 02:10:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5538168832. Throughput: 0: 42963.4. Samples: 5538308460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:43,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 02:10:45,621][15349] Signal inference workers to stop experience collection... (82000 times) [2024-06-23 02:10:45,673][15349] Signal inference workers to resume experience collection... (82000 times) [2024-06-23 02:10:45,674][15401] InferenceWorker_p0-w0: stopping experience collection (82000 times) [2024-06-23 02:10:45,677][15401] Updated weights for policy 0, policy_version 338030 (0.0034) [2024-06-23 02:10:45,686][15401] InferenceWorker_p0-w0: resuming experience collection (82000 times) [2024-06-23 02:10:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5538381824. Throughput: 0: 42787.2. Samples: 5538557140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 02:10:49,474][15401] Updated weights for policy 0, policy_version 338040 (0.0033) [2024-06-23 02:10:53,094][15401] Updated weights for policy 0, policy_version 338050 (0.0031) [2024-06-23 02:10:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 5538627584. Throughput: 0: 42709.1. Samples: 5538689820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 02:10:56,900][15401] Updated weights for policy 0, policy_version 338060 (0.0027) [2024-06-23 02:10:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42875.4, 300 sec: 42820.6). Total num frames: 5538807808. Throughput: 0: 42798.4. Samples: 5538943980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:10:58,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 02:11:00,704][15401] Updated weights for policy 0, policy_version 338070 (0.0049) [2024-06-23 02:11:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 5539020800. Throughput: 0: 42853.9. Samples: 5539204300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 02:11:03,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-23 02:11:04,467][15401] Updated weights for policy 0, policy_version 338080 (0.0030) [2024-06-23 02:11:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5539266560. Throughput: 0: 42841.7. Samples: 5539332480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 02:11:08,390][15401] Updated weights for policy 0, policy_version 338090 (0.0033) [2024-06-23 02:11:12,181][15401] Updated weights for policy 0, policy_version 338100 (0.0029) [2024-06-23 02:11:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 5539463168. Throughput: 0: 42606.4. Samples: 5539581700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 02:11:16,066][15401] Updated weights for policy 0, policy_version 338110 (0.0034) [2024-06-23 02:11:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 5539676160. Throughput: 0: 42832.4. Samples: 5539840720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:18,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-23 02:11:19,946][15401] Updated weights for policy 0, policy_version 338120 (0.0028) [2024-06-23 02:11:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 5539889152. Throughput: 0: 42653.3. Samples: 5539965220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 02:11:23,812][15401] Updated weights for policy 0, policy_version 338130 (0.0036) [2024-06-23 02:11:27,467][15401] Updated weights for policy 0, policy_version 338140 (0.0033) [2024-06-23 02:11:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42870.2, 300 sec: 42820.2). Total num frames: 5540102144. Throughput: 0: 42514.2. Samples: 5540221700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:28,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 02:11:31,700][15401] Updated weights for policy 0, policy_version 338150 (0.0048) [2024-06-23 02:11:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5540315136. Throughput: 0: 42607.6. Samples: 5540474480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 02:11:35,300][15401] Updated weights for policy 0, policy_version 338160 (0.0034) [2024-06-23 02:11:38,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5540511744. Throughput: 0: 42543.2. Samples: 5540604260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 02:11:39,466][15401] Updated weights for policy 0, policy_version 338170 (0.0035) [2024-06-23 02:11:43,371][15401] Updated weights for policy 0, policy_version 338180 (0.0031) [2024-06-23 02:11:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5540741120. Throughput: 0: 42495.0. Samples: 5540856260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:43,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 02:11:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338181_5540757504.pth... [2024-06-23 02:11:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337553_5530468352.pth [2024-06-23 02:11:47,299][15401] Updated weights for policy 0, policy_version 338190 (0.0035) [2024-06-23 02:11:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 5540970496. Throughput: 0: 42338.6. Samples: 5541109540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:48,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 02:11:50,935][15401] Updated weights for policy 0, policy_version 338200 (0.0028) [2024-06-23 02:11:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5541150720. Throughput: 0: 42323.1. Samples: 5541237020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 02:11:54,787][15401] Updated weights for policy 0, policy_version 338210 (0.0032) [2024-06-23 02:11:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5541380096. Throughput: 0: 42457.8. Samples: 5541492300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:11:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 02:11:58,485][15401] Updated weights for policy 0, policy_version 338220 (0.0028) [2024-06-23 02:12:02,416][15401] Updated weights for policy 0, policy_version 338230 (0.0034) [2024-06-23 02:12:03,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 5541609472. Throughput: 0: 42375.5. Samples: 5541747720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:12:03,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 02:12:06,018][15401] Updated weights for policy 0, policy_version 338240 (0.0038) [2024-06-23 02:12:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 5541773312. Throughput: 0: 42487.4. Samples: 5541877140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:12:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 02:12:09,931][15349] Signal inference workers to stop experience collection... (82050 times) [2024-06-23 02:12:09,932][15349] Signal inference workers to resume experience collection... (82050 times) [2024-06-23 02:12:09,949][15401] InferenceWorker_p0-w0: stopping experience collection (82050 times) [2024-06-23 02:12:09,979][15401] InferenceWorker_p0-w0: resuming experience collection (82050 times) [2024-06-23 02:12:10,074][15401] Updated weights for policy 0, policy_version 338250 (0.0023) [2024-06-23 02:12:13,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5542019072. Throughput: 0: 42527.5. Samples: 5542135340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:12:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 02:12:14,382][15401] Updated weights for policy 0, policy_version 338260 (0.0049) [2024-06-23 02:12:17,995][15401] Updated weights for policy 0, policy_version 338270 (0.0030) [2024-06-23 02:12:18,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5542232064. Throughput: 0: 42602.3. Samples: 5542391580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:12:18,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 02:12:22,047][15401] Updated weights for policy 0, policy_version 338280 (0.0040) [2024-06-23 02:12:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5542412288. Throughput: 0: 42520.0. Samples: 5542517660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 02:12:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 02:12:25,635][15401] Updated weights for policy 0, policy_version 338290 (0.0025) [2024-06-23 02:12:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 5542674432. Throughput: 0: 42519.9. Samples: 5542769660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 02:12:29,716][15401] Updated weights for policy 0, policy_version 338300 (0.0039) [2024-06-23 02:12:33,309][15401] Updated weights for policy 0, policy_version 338310 (0.0028) [2024-06-23 02:12:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5542871040. Throughput: 0: 42710.0. Samples: 5543031500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 02:12:37,181][15401] Updated weights for policy 0, policy_version 338320 (0.0056) [2024-06-23 02:12:38,395][15132] Fps is (10 sec: 39302.0, 60 sec: 42594.8, 300 sec: 42708.9). Total num frames: 5543067648. Throughput: 0: 42629.4. Samples: 5543155560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:38,395][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 02:12:41,010][15401] Updated weights for policy 0, policy_version 338330 (0.0033) [2024-06-23 02:12:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5543329792. Throughput: 0: 42711.4. Samples: 5543414320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 02:12:44,644][15401] Updated weights for policy 0, policy_version 338340 (0.0040) [2024-06-23 02:12:48,390][15132] Fps is (10 sec: 44258.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5543510016. Throughput: 0: 42938.7. Samples: 5543679860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:48,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 02:12:48,535][15401] Updated weights for policy 0, policy_version 338350 (0.0038) [2024-06-23 02:12:52,326][15401] Updated weights for policy 0, policy_version 338360 (0.0031) [2024-06-23 02:12:53,390][15132] Fps is (10 sec: 37681.2, 60 sec: 42598.0, 300 sec: 42764.9). Total num frames: 5543706624. Throughput: 0: 42733.6. Samples: 5543800180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:53,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 02:12:56,067][15401] Updated weights for policy 0, policy_version 338370 (0.0025) [2024-06-23 02:12:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 5543952384. Throughput: 0: 42704.7. Samples: 5544057040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:12:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 02:13:00,305][15401] Updated weights for policy 0, policy_version 338380 (0.0039) [2024-06-23 02:13:03,390][15132] Fps is (10 sec: 42600.3, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 5544132608. Throughput: 0: 42904.8. Samples: 5544322300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 02:13:03,755][15401] Updated weights for policy 0, policy_version 338390 (0.0037) [2024-06-23 02:13:08,153][15401] Updated weights for policy 0, policy_version 338400 (0.0036) [2024-06-23 02:13:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5544345600. Throughput: 0: 42762.1. Samples: 5544441960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 02:13:11,586][15401] Updated weights for policy 0, policy_version 338410 (0.0037) [2024-06-23 02:13:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5544607744. Throughput: 0: 42909.5. Samples: 5544700580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 02:13:15,946][15401] Updated weights for policy 0, policy_version 338420 (0.0040) [2024-06-23 02:13:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5544771584. Throughput: 0: 42905.1. Samples: 5544962220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 02:13:19,555][15401] Updated weights for policy 0, policy_version 338430 (0.0039) [2024-06-23 02:13:23,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 5544984576. Throughput: 0: 42787.9. Samples: 5545080800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 02:13:23,510][15401] Updated weights for policy 0, policy_version 338440 (0.0038) [2024-06-23 02:13:26,234][15349] Signal inference workers to stop experience collection... (82100 times) [2024-06-23 02:13:26,285][15401] InferenceWorker_p0-w0: stopping experience collection (82100 times) [2024-06-23 02:13:26,350][15349] Signal inference workers to resume experience collection... (82100 times) [2024-06-23 02:13:26,350][15401] InferenceWorker_p0-w0: resuming experience collection (82100 times) [2024-06-23 02:13:27,007][15401] Updated weights for policy 0, policy_version 338450 (0.0042) [2024-06-23 02:13:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5545246720. Throughput: 0: 42761.4. Samples: 5545338580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 02:13:31,075][15401] Updated weights for policy 0, policy_version 338460 (0.0032) [2024-06-23 02:13:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42543.2). Total num frames: 5545410560. Throughput: 0: 42509.0. Samples: 5545592760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 02:13:34,629][15401] Updated weights for policy 0, policy_version 338470 (0.0037) [2024-06-23 02:13:38,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42602.0, 300 sec: 42653.9). Total num frames: 5545623552. Throughput: 0: 42522.3. Samples: 5545713660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 02:13:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 02:13:38,612][15401] Updated weights for policy 0, policy_version 338480 (0.0031) [2024-06-23 02:13:42,226][15401] Updated weights for policy 0, policy_version 338490 (0.0032) [2024-06-23 02:13:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5545869312. Throughput: 0: 42684.0. Samples: 5545977820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:13:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 02:13:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338494_5545885696.pth... [2024-06-23 02:13:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000337866_5535596544.pth [2024-06-23 02:13:46,170][15401] Updated weights for policy 0, policy_version 338500 (0.0031) [2024-06-23 02:13:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5546065920. Throughput: 0: 42575.1. Samples: 5546238180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:13:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 02:13:49,792][15401] Updated weights for policy 0, policy_version 338510 (0.0038) [2024-06-23 02:13:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 5546278912. Throughput: 0: 42637.4. Samples: 5546360640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:13:53,393][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 02:13:53,799][15401] Updated weights for policy 0, policy_version 338520 (0.0041) [2024-06-23 02:13:57,601][15401] Updated weights for policy 0, policy_version 338530 (0.0032) [2024-06-23 02:13:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5546491904. Throughput: 0: 42633.0. Samples: 5546619060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:13:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 02:14:01,417][15401] Updated weights for policy 0, policy_version 338540 (0.0037) [2024-06-23 02:14:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5546688512. Throughput: 0: 42466.7. Samples: 5546873220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 02:14:05,384][15401] Updated weights for policy 0, policy_version 338550 (0.0044) [2024-06-23 02:14:08,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 5546917888. Throughput: 0: 42423.9. Samples: 5546990140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:08,396][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 02:14:09,531][15401] Updated weights for policy 0, policy_version 338560 (0.0049) [2024-06-23 02:14:13,136][15401] Updated weights for policy 0, policy_version 338570 (0.0036) [2024-06-23 02:14:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5547147264. Throughput: 0: 42654.6. Samples: 5547258040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 02:14:17,220][15401] Updated weights for policy 0, policy_version 338580 (0.0048) [2024-06-23 02:14:18,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5547327488. Throughput: 0: 42678.7. Samples: 5547513300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 02:14:20,750][15401] Updated weights for policy 0, policy_version 338590 (0.0049) [2024-06-23 02:14:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5547556864. Throughput: 0: 42686.2. Samples: 5547634540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:23,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 02:14:24,963][15401] Updated weights for policy 0, policy_version 338600 (0.0027) [2024-06-23 02:14:28,366][15401] Updated weights for policy 0, policy_version 338610 (0.0040) [2024-06-23 02:14:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 5547786240. Throughput: 0: 42654.2. Samples: 5547897260. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 02:14:28,851][15349] Signal inference workers to stop experience collection... (82150 times) [2024-06-23 02:14:28,855][15349] Signal inference workers to resume experience collection... (82150 times) [2024-06-23 02:14:28,896][15401] InferenceWorker_p0-w0: stopping experience collection (82150 times) [2024-06-23 02:14:28,896][15401] InferenceWorker_p0-w0: resuming experience collection (82150 times) [2024-06-23 02:14:32,710][15401] Updated weights for policy 0, policy_version 338620 (0.0029) [2024-06-23 02:14:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5547982848. Throughput: 0: 42589.0. Samples: 5548154680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 02:14:36,042][15401] Updated weights for policy 0, policy_version 338630 (0.0026) [2024-06-23 02:14:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 5548195840. Throughput: 0: 42592.9. Samples: 5548277420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:38,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 02:14:40,373][15401] Updated weights for policy 0, policy_version 338640 (0.0027) [2024-06-23 02:14:43,390][15132] Fps is (10 sec: 42596.3, 60 sec: 42324.9, 300 sec: 42653.9). Total num frames: 5548408832. Throughput: 0: 42369.2. Samples: 5548525700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 02:14:43,885][15401] Updated weights for policy 0, policy_version 338650 (0.0032) [2024-06-23 02:14:47,971][15401] Updated weights for policy 0, policy_version 338660 (0.0036) [2024-06-23 02:14:48,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 5548621824. Throughput: 0: 42478.1. Samples: 5548784740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 02:14:51,419][15401] Updated weights for policy 0, policy_version 338670 (0.0028) [2024-06-23 02:14:53,390][15132] Fps is (10 sec: 42600.2, 60 sec: 42598.3, 300 sec: 42710.2). Total num frames: 5548834816. Throughput: 0: 42720.7. Samples: 5548912300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 02:14:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 02:14:55,476][15401] Updated weights for policy 0, policy_version 338680 (0.0038) [2024-06-23 02:14:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 5549047808. Throughput: 0: 42553.0. Samples: 5549172920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:14:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 02:14:59,112][15401] Updated weights for policy 0, policy_version 338690 (0.0030) [2024-06-23 02:15:02,945][15401] Updated weights for policy 0, policy_version 338700 (0.0035) [2024-06-23 02:15:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5549260800. Throughput: 0: 42610.7. Samples: 5549430780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 02:15:06,597][15401] Updated weights for policy 0, policy_version 338710 (0.0027) [2024-06-23 02:15:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 5549490176. Throughput: 0: 42848.1. Samples: 5549562700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 02:15:11,054][15401] Updated weights for policy 0, policy_version 338720 (0.0029) [2024-06-23 02:15:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5549686784. Throughput: 0: 42695.0. Samples: 5549818540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 02:15:14,317][15401] Updated weights for policy 0, policy_version 338730 (0.0046) [2024-06-23 02:15:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5549899776. Throughput: 0: 42665.3. Samples: 5550074620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 02:15:18,645][15401] Updated weights for policy 0, policy_version 338740 (0.0048) [2024-06-23 02:15:21,930][15401] Updated weights for policy 0, policy_version 338750 (0.0043) [2024-06-23 02:15:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 5550096384. Throughput: 0: 42750.3. Samples: 5550201080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 02:15:26,243][15401] Updated weights for policy 0, policy_version 338760 (0.0026) [2024-06-23 02:15:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5550342144. Throughput: 0: 42836.1. Samples: 5550453300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 02:15:29,466][15401] Updated weights for policy 0, policy_version 338770 (0.0032) [2024-06-23 02:15:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5550538752. Throughput: 0: 43012.1. Samples: 5550720280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 02:15:33,787][15401] Updated weights for policy 0, policy_version 338780 (0.0027) [2024-06-23 02:15:37,489][15401] Updated weights for policy 0, policy_version 338790 (0.0042) [2024-06-23 02:15:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5550735360. Throughput: 0: 42950.8. Samples: 5550845080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 02:15:41,190][15401] Updated weights for policy 0, policy_version 338800 (0.0030) [2024-06-23 02:15:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.9, 300 sec: 42765.0). Total num frames: 5550997504. Throughput: 0: 42831.4. Samples: 5551100340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:43,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 02:15:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338806_5550997504.pth... [2024-06-23 02:15:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338181_5540757504.pth [2024-06-23 02:15:43,687][15349] Signal inference workers to stop experience collection... (82200 times) [2024-06-23 02:15:43,689][15349] Signal inference workers to resume experience collection... (82200 times) [2024-06-23 02:15:43,724][15401] InferenceWorker_p0-w0: stopping experience collection (82200 times) [2024-06-23 02:15:43,724][15401] InferenceWorker_p0-w0: resuming experience collection (82200 times) [2024-06-23 02:15:44,815][15401] Updated weights for policy 0, policy_version 338810 (0.0026) [2024-06-23 02:15:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5551194112. Throughput: 0: 43088.8. Samples: 5551369780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:48,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 02:15:49,063][15401] Updated weights for policy 0, policy_version 338820 (0.0035) [2024-06-23 02:15:52,342][15401] Updated weights for policy 0, policy_version 338830 (0.0029) [2024-06-23 02:15:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5551407104. Throughput: 0: 42827.9. Samples: 5551489960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 02:15:56,804][15401] Updated weights for policy 0, policy_version 338840 (0.0026) [2024-06-23 02:15:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5551636480. Throughput: 0: 42859.7. Samples: 5551747220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:15:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 02:15:59,915][15401] Updated weights for policy 0, policy_version 338850 (0.0039) [2024-06-23 02:16:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5551816704. Throughput: 0: 42900.2. Samples: 5552005120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:16:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 02:16:04,496][15401] Updated weights for policy 0, policy_version 338860 (0.0036) [2024-06-23 02:16:08,076][15401] Updated weights for policy 0, policy_version 338870 (0.0042) [2024-06-23 02:16:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5552046080. Throughput: 0: 42724.3. Samples: 5552123680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 02:16:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 02:16:12,325][15401] Updated weights for policy 0, policy_version 338880 (0.0028) [2024-06-23 02:16:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5552275456. Throughput: 0: 42913.9. Samples: 5552384420. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 02:16:15,682][15401] Updated weights for policy 0, policy_version 338890 (0.0037) [2024-06-23 02:16:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5552455680. Throughput: 0: 42671.2. Samples: 5552640480. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 02:16:19,912][15401] Updated weights for policy 0, policy_version 338900 (0.0045) [2024-06-23 02:16:23,228][15401] Updated weights for policy 0, policy_version 338910 (0.0030) [2024-06-23 02:16:23,391][15132] Fps is (10 sec: 42590.9, 60 sec: 43416.4, 300 sec: 42709.6). Total num frames: 5552701440. Throughput: 0: 42614.9. Samples: 5552762820. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:23,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 02:16:27,441][15401] Updated weights for policy 0, policy_version 338920 (0.0042) [2024-06-23 02:16:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5552914432. Throughput: 0: 42811.5. Samples: 5553026860. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 02:16:30,711][15401] Updated weights for policy 0, policy_version 338930 (0.0039) [2024-06-23 02:16:33,389][15132] Fps is (10 sec: 40966.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5553111040. Throughput: 0: 42635.2. Samples: 5553288360. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:33,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 02:16:34,940][15401] Updated weights for policy 0, policy_version 338940 (0.0039) [2024-06-23 02:16:38,230][15401] Updated weights for policy 0, policy_version 338950 (0.0037) [2024-06-23 02:16:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 5553356800. Throughput: 0: 42733.4. Samples: 5553412960. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:38,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 02:16:42,522][15401] Updated weights for policy 0, policy_version 338960 (0.0034) [2024-06-23 02:16:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5553537024. Throughput: 0: 42875.6. Samples: 5553676620. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:43,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 02:16:45,785][15401] Updated weights for policy 0, policy_version 338970 (0.0031) [2024-06-23 02:16:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5553766400. Throughput: 0: 42878.9. Samples: 5553934680. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:48,396][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 02:16:50,392][15401] Updated weights for policy 0, policy_version 338980 (0.0033) [2024-06-23 02:16:53,317][15401] Updated weights for policy 0, policy_version 338990 (0.0043) [2024-06-23 02:16:53,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5554012160. Throughput: 0: 42988.4. Samples: 5554058160. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 02:16:56,302][15349] Signal inference workers to stop experience collection... (82250 times) [2024-06-23 02:16:56,303][15349] Signal inference workers to resume experience collection... (82250 times) [2024-06-23 02:16:56,317][15401] InferenceWorker_p0-w0: stopping experience collection (82250 times) [2024-06-23 02:16:56,342][15401] InferenceWorker_p0-w0: resuming experience collection (82250 times) [2024-06-23 02:16:58,103][15401] Updated weights for policy 0, policy_version 339000 (0.0036) [2024-06-23 02:16:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 5554192384. Throughput: 0: 42921.1. Samples: 5554315880. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:16:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 02:17:01,051][15401] Updated weights for policy 0, policy_version 339010 (0.0038) [2024-06-23 02:17:03,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5554388992. Throughput: 0: 42955.0. Samples: 5554573460. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 02:17:05,557][15401] Updated weights for policy 0, policy_version 339020 (0.0036) [2024-06-23 02:17:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5554634752. Throughput: 0: 43062.0. Samples: 5554700540. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 02:17:09,153][15401] Updated weights for policy 0, policy_version 339030 (0.0030) [2024-06-23 02:17:13,181][15401] Updated weights for policy 0, policy_version 339040 (0.0037) [2024-06-23 02:17:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5554831360. Throughput: 0: 42973.9. Samples: 5554960680. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 02:17:16,784][15401] Updated weights for policy 0, policy_version 339050 (0.0040) [2024-06-23 02:17:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5555044352. Throughput: 0: 42757.2. Samples: 5555212440. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 02:17:20,936][15401] Updated weights for policy 0, policy_version 339060 (0.0034) [2024-06-23 02:17:23,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43143.9, 300 sec: 42764.7). Total num frames: 5555290112. Throughput: 0: 42806.5. Samples: 5555339360. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:23,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 02:17:24,254][15401] Updated weights for policy 0, policy_version 339070 (0.0038) [2024-06-23 02:17:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 5555453952. Throughput: 0: 42707.9. Samples: 5555598580. Policy #0 lag: (min: 1.0, avg: 11.9, max: 27.0) [2024-06-23 02:17:28,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 02:17:28,770][15401] Updated weights for policy 0, policy_version 339080 (0.0041) [2024-06-23 02:17:31,814][15401] Updated weights for policy 0, policy_version 339090 (0.0038) [2024-06-23 02:17:33,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42710.2). Total num frames: 5555666944. Throughput: 0: 42576.1. Samples: 5555850600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 02:17:36,280][15401] Updated weights for policy 0, policy_version 339100 (0.0032) [2024-06-23 02:17:38,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5555912704. Throughput: 0: 42751.3. Samples: 5555981960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 02:17:39,238][15401] Updated weights for policy 0, policy_version 339110 (0.0029) [2024-06-23 02:17:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5556060160. Throughput: 0: 42684.5. Samples: 5556236680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 02:17:43,553][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339117_5556092928.pth... [2024-06-23 02:17:43,633][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338494_5545885696.pth [2024-06-23 02:17:44,034][15401] Updated weights for policy 0, policy_version 339120 (0.0031) [2024-06-23 02:17:46,867][15401] Updated weights for policy 0, policy_version 339130 (0.0040) [2024-06-23 02:17:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5556305920. Throughput: 0: 42572.0. Samples: 5556489200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 02:17:51,684][15401] Updated weights for policy 0, policy_version 339140 (0.0026) [2024-06-23 02:17:53,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5556551680. Throughput: 0: 42676.9. Samples: 5556621000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 02:17:54,751][15401] Updated weights for policy 0, policy_version 339150 (0.0028) [2024-06-23 02:17:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5556731904. Throughput: 0: 42439.9. Samples: 5556870480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:17:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 02:17:59,010][15349] Signal inference workers to stop experience collection... (82300 times) [2024-06-23 02:17:59,030][15401] InferenceWorker_p0-w0: stopping experience collection (82300 times) [2024-06-23 02:17:59,120][15349] Signal inference workers to resume experience collection... (82300 times) [2024-06-23 02:17:59,120][15401] InferenceWorker_p0-w0: resuming experience collection (82300 times) [2024-06-23 02:17:59,257][15401] Updated weights for policy 0, policy_version 339160 (0.0032) [2024-06-23 02:18:03,085][15401] Updated weights for policy 0, policy_version 339170 (0.0051) [2024-06-23 02:18:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5556961280. Throughput: 0: 42535.6. Samples: 5557126540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 02:18:06,882][15401] Updated weights for policy 0, policy_version 339180 (0.0034) [2024-06-23 02:18:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5557190656. Throughput: 0: 42736.2. Samples: 5557262380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 02:18:10,511][15401] Updated weights for policy 0, policy_version 339190 (0.0023) [2024-06-23 02:18:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5557387264. Throughput: 0: 42609.9. Samples: 5557515920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 02:18:14,528][15401] Updated weights for policy 0, policy_version 339200 (0.0038) [2024-06-23 02:18:18,280][15401] Updated weights for policy 0, policy_version 339210 (0.0043) [2024-06-23 02:18:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5557616640. Throughput: 0: 42750.2. Samples: 5557774360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:18,399][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 02:18:22,075][15401] Updated weights for policy 0, policy_version 339220 (0.0033) [2024-06-23 02:18:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42325.4, 300 sec: 42653.6). Total num frames: 5557829632. Throughput: 0: 42811.9. Samples: 5557908600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:23,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 02:18:25,955][15401] Updated weights for policy 0, policy_version 339230 (0.0030) [2024-06-23 02:18:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 5558026240. Throughput: 0: 42736.9. Samples: 5558159840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 02:18:29,695][15401] Updated weights for policy 0, policy_version 339240 (0.0024) [2024-06-23 02:18:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5558255616. Throughput: 0: 42925.4. Samples: 5558420840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 02:18:33,881][15401] Updated weights for policy 0, policy_version 339250 (0.0040) [2024-06-23 02:18:37,469][15401] Updated weights for policy 0, policy_version 339260 (0.0032) [2024-06-23 02:18:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5558468608. Throughput: 0: 42838.3. Samples: 5558548720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 02:18:41,317][15401] Updated weights for policy 0, policy_version 339270 (0.0039) [2024-06-23 02:18:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 5558681600. Throughput: 0: 43074.6. Samples: 5558808840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 02:18:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 02:18:44,975][15401] Updated weights for policy 0, policy_version 339280 (0.0043) [2024-06-23 02:18:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5558910976. Throughput: 0: 43052.8. Samples: 5559063920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:18:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 02:18:48,806][15401] Updated weights for policy 0, policy_version 339290 (0.0028) [2024-06-23 02:18:53,003][15401] Updated weights for policy 0, policy_version 339300 (0.0034) [2024-06-23 02:18:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5559107584. Throughput: 0: 42925.2. Samples: 5559194020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:18:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 02:18:56,559][15401] Updated weights for policy 0, policy_version 339310 (0.0037) [2024-06-23 02:18:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5559320576. Throughput: 0: 43060.4. Samples: 5559453640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:18:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 02:19:00,445][15401] Updated weights for policy 0, policy_version 339320 (0.0038) [2024-06-23 02:19:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42877.0). Total num frames: 5559566336. Throughput: 0: 42975.6. Samples: 5559708260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:03,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-23 02:19:03,955][15401] Updated weights for policy 0, policy_version 339330 (0.0028) [2024-06-23 02:19:07,876][15401] Updated weights for policy 0, policy_version 339340 (0.0031) [2024-06-23 02:19:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5559746560. Throughput: 0: 42973.1. Samples: 5559842280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:08,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-23 02:19:11,354][15401] Updated weights for policy 0, policy_version 339350 (0.0033) [2024-06-23 02:19:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 5559992320. Throughput: 0: 43151.3. Samples: 5560101660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 02:19:15,287][15401] Updated weights for policy 0, policy_version 339360 (0.0028) [2024-06-23 02:19:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5560188928. Throughput: 0: 43063.6. Samples: 5560358700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 02:19:19,009][15401] Updated weights for policy 0, policy_version 339370 (0.0036) [2024-06-23 02:19:22,747][15401] Updated weights for policy 0, policy_version 339380 (0.0024) [2024-06-23 02:19:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 5560418304. Throughput: 0: 43143.0. Samples: 5560490160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 02:19:26,475][15401] Updated weights for policy 0, policy_version 339390 (0.0041) [2024-06-23 02:19:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5560631296. Throughput: 0: 43192.2. Samples: 5560752480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 02:19:30,274][15401] Updated weights for policy 0, policy_version 339400 (0.0028) [2024-06-23 02:19:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 5560860672. Throughput: 0: 43217.8. Samples: 5561008720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 02:19:34,312][15401] Updated weights for policy 0, policy_version 339410 (0.0037) [2024-06-23 02:19:34,771][15349] Signal inference workers to stop experience collection... (82350 times) [2024-06-23 02:19:34,771][15349] Signal inference workers to resume experience collection... (82350 times) [2024-06-23 02:19:34,786][15401] InferenceWorker_p0-w0: stopping experience collection (82350 times) [2024-06-23 02:19:34,787][15401] InferenceWorker_p0-w0: resuming experience collection (82350 times) [2024-06-23 02:19:38,071][15401] Updated weights for policy 0, policy_version 339420 (0.0033) [2024-06-23 02:19:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42876.2). Total num frames: 5561057280. Throughput: 0: 43189.3. Samples: 5561137540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 02:19:41,874][15401] Updated weights for policy 0, policy_version 339430 (0.0038) [2024-06-23 02:19:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 5561286656. Throughput: 0: 43148.4. Samples: 5561395320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 02:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339434_5561286656.pth... [2024-06-23 02:19:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000338806_5550997504.pth [2024-06-23 02:19:45,750][15401] Updated weights for policy 0, policy_version 339440 (0.0038) [2024-06-23 02:19:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5561483264. Throughput: 0: 43224.3. Samples: 5561653360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 02:19:49,747][15401] Updated weights for policy 0, policy_version 339450 (0.0031) [2024-06-23 02:19:53,343][15401] Updated weights for policy 0, policy_version 339460 (0.0030) [2024-06-23 02:19:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5561712640. Throughput: 0: 42960.7. Samples: 5561775520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 02:19:57,310][15401] Updated weights for policy 0, policy_version 339470 (0.0033) [2024-06-23 02:19:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5561909248. Throughput: 0: 42969.5. Samples: 5562035280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 02:19:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 02:20:01,496][15401] Updated weights for policy 0, policy_version 339480 (0.0038) [2024-06-23 02:20:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5562122240. Throughput: 0: 42925.7. Samples: 5562290460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:03,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 02:20:04,814][15401] Updated weights for policy 0, policy_version 339490 (0.0027) [2024-06-23 02:20:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5562318848. Throughput: 0: 42868.5. Samples: 5562419240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 02:20:08,939][15401] Updated weights for policy 0, policy_version 339500 (0.0030) [2024-06-23 02:20:12,213][15401] Updated weights for policy 0, policy_version 339510 (0.0037) [2024-06-23 02:20:13,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 5562580992. Throughput: 0: 42874.2. Samples: 5562681820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 02:20:16,445][15401] Updated weights for policy 0, policy_version 339520 (0.0032) [2024-06-23 02:20:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5562777600. Throughput: 0: 43012.0. Samples: 5562944260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 02:20:19,811][15401] Updated weights for policy 0, policy_version 339530 (0.0045) [2024-06-23 02:20:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5562974208. Throughput: 0: 42851.7. Samples: 5563065860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 02:20:24,443][15401] Updated weights for policy 0, policy_version 339540 (0.0027) [2024-06-23 02:20:27,712][15401] Updated weights for policy 0, policy_version 339550 (0.0035) [2024-06-23 02:20:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5563219968. Throughput: 0: 43032.0. Samples: 5563331760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 02:20:32,221][15401] Updated weights for policy 0, policy_version 339560 (0.0044) [2024-06-23 02:20:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 5563400192. Throughput: 0: 42991.3. Samples: 5563587960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:20:35,254][15401] Updated weights for policy 0, policy_version 339570 (0.0025) [2024-06-23 02:20:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5563629568. Throughput: 0: 42994.7. Samples: 5563710280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 02:20:39,841][15401] Updated weights for policy 0, policy_version 339580 (0.0032) [2024-06-23 02:20:42,745][15401] Updated weights for policy 0, policy_version 339590 (0.0030) [2024-06-23 02:20:43,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5563858944. Throughput: 0: 43174.1. Samples: 5563978120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:43,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 02:20:47,592][15401] Updated weights for policy 0, policy_version 339600 (0.0045) [2024-06-23 02:20:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5564071936. Throughput: 0: 43133.9. Samples: 5564231380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 02:20:50,166][15349] Signal inference workers to stop experience collection... (82400 times) [2024-06-23 02:20:50,205][15401] InferenceWorker_p0-w0: stopping experience collection (82400 times) [2024-06-23 02:20:50,224][15349] Signal inference workers to resume experience collection... (82400 times) [2024-06-23 02:20:50,227][15401] InferenceWorker_p0-w0: resuming experience collection (82400 times) [2024-06-23 02:20:50,364][15401] Updated weights for policy 0, policy_version 339610 (0.0040) [2024-06-23 02:20:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5564284928. Throughput: 0: 43097.8. Samples: 5564358640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 02:20:55,147][15401] Updated weights for policy 0, policy_version 339620 (0.0026) [2024-06-23 02:20:58,041][15401] Updated weights for policy 0, policy_version 339630 (0.0040) [2024-06-23 02:20:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5564497920. Throughput: 0: 43078.6. Samples: 5564620360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:20:58,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 02:21:02,639][15401] Updated weights for policy 0, policy_version 339640 (0.0040) [2024-06-23 02:21:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5564678144. Throughput: 0: 42934.8. Samples: 5564876320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:21:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 02:21:05,609][15401] Updated weights for policy 0, policy_version 339650 (0.0034) [2024-06-23 02:21:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5564923904. Throughput: 0: 42882.0. Samples: 5564995560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:21:08,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 02:21:10,433][15401] Updated weights for policy 0, policy_version 339660 (0.0028) [2024-06-23 02:21:13,186][15401] Updated weights for policy 0, policy_version 339670 (0.0024) [2024-06-23 02:21:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5565153280. Throughput: 0: 42904.0. Samples: 5565262440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 02:21:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 02:21:17,914][15401] Updated weights for policy 0, policy_version 339680 (0.0028) [2024-06-23 02:21:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 5565333504. Throughput: 0: 42967.9. Samples: 5565521520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 02:21:21,037][15401] Updated weights for policy 0, policy_version 339690 (0.0033) [2024-06-23 02:21:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 5565579264. Throughput: 0: 42898.7. Samples: 5565640720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 02:21:25,777][15401] Updated weights for policy 0, policy_version 339700 (0.0036) [2024-06-23 02:21:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 5565775872. Throughput: 0: 42771.6. Samples: 5565902940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:28,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 02:21:28,788][15401] Updated weights for policy 0, policy_version 339710 (0.0029) [2024-06-23 02:21:33,254][15401] Updated weights for policy 0, policy_version 339720 (0.0029) [2024-06-23 02:21:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5565972480. Throughput: 0: 42872.8. Samples: 5566160660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 02:21:36,372][15401] Updated weights for policy 0, policy_version 339730 (0.0034) [2024-06-23 02:21:38,389][15132] Fps is (10 sec: 45886.7, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5566234624. Throughput: 0: 42792.5. Samples: 5566284300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 02:21:40,889][15401] Updated weights for policy 0, policy_version 339740 (0.0046) [2024-06-23 02:21:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5566398464. Throughput: 0: 42773.8. Samples: 5566545180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:43,400][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 02:21:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339746_5566398464.pth... [2024-06-23 02:21:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339117_5556092928.pth [2024-06-23 02:21:44,111][15401] Updated weights for policy 0, policy_version 339750 (0.0040) [2024-06-23 02:21:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5566611456. Throughput: 0: 42805.6. Samples: 5566802580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 02:21:48,531][15401] Updated weights for policy 0, policy_version 339760 (0.0052) [2024-06-23 02:21:51,737][15401] Updated weights for policy 0, policy_version 339770 (0.0037) [2024-06-23 02:21:53,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 5566873600. Throughput: 0: 42987.6. Samples: 5566930100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:53,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 02:21:56,345][15401] Updated weights for policy 0, policy_version 339780 (0.0036) [2024-06-23 02:21:57,943][15349] Signal inference workers to stop experience collection... (82450 times) [2024-06-23 02:21:57,943][15349] Signal inference workers to resume experience collection... (82450 times) [2024-06-23 02:21:57,974][15401] InferenceWorker_p0-w0: stopping experience collection (82450 times) [2024-06-23 02:21:57,975][15401] InferenceWorker_p0-w0: resuming experience collection (82450 times) [2024-06-23 02:21:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5567037440. Throughput: 0: 42761.3. Samples: 5567186700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:21:58,394][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 02:21:59,353][15401] Updated weights for policy 0, policy_version 339790 (0.0028) [2024-06-23 02:22:03,396][15132] Fps is (10 sec: 37668.1, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 5567250432. Throughput: 0: 42843.2. Samples: 5567449740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:03,397][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 02:22:03,830][15401] Updated weights for policy 0, policy_version 339800 (0.0030) [2024-06-23 02:22:07,151][15401] Updated weights for policy 0, policy_version 339810 (0.0039) [2024-06-23 02:22:08,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5567528960. Throughput: 0: 42943.2. Samples: 5567573160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 02:22:11,413][15401] Updated weights for policy 0, policy_version 339820 (0.0036) [2024-06-23 02:22:13,392][15132] Fps is (10 sec: 42615.6, 60 sec: 42050.6, 300 sec: 42820.2). Total num frames: 5567676416. Throughput: 0: 42837.8. Samples: 5567830640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:13,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 02:22:14,550][15401] Updated weights for policy 0, policy_version 339830 (0.0035) [2024-06-23 02:22:18,392][15132] Fps is (10 sec: 37675.2, 60 sec: 42870.0, 300 sec: 42765.1). Total num frames: 5567905792. Throughput: 0: 42894.1. Samples: 5568090980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:18,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 02:22:19,323][15401] Updated weights for policy 0, policy_version 339840 (0.0028) [2024-06-23 02:22:22,019][15401] Updated weights for policy 0, policy_version 339850 (0.0025) [2024-06-23 02:22:23,389][15132] Fps is (10 sec: 50802.8, 60 sec: 43417.6, 300 sec: 43154.1). Total num frames: 5568184320. Throughput: 0: 42933.7. Samples: 5568216320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:23,393][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 02:22:26,816][15401] Updated weights for policy 0, policy_version 339860 (0.0048) [2024-06-23 02:22:28,389][15132] Fps is (10 sec: 42607.9, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 5568331776. Throughput: 0: 43057.9. Samples: 5568482780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 02:22:29,806][15401] Updated weights for policy 0, policy_version 339870 (0.0040) [2024-06-23 02:22:33,389][15132] Fps is (10 sec: 36044.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5568544768. Throughput: 0: 42983.2. Samples: 5568736820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 02:22:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 02:22:34,236][15401] Updated weights for policy 0, policy_version 339880 (0.0037) [2024-06-23 02:22:37,286][15401] Updated weights for policy 0, policy_version 339890 (0.0040) [2024-06-23 02:22:38,392][15132] Fps is (10 sec: 45863.4, 60 sec: 42596.6, 300 sec: 43153.4). Total num frames: 5568790528. Throughput: 0: 43080.9. Samples: 5568868740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:22:38,392][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 02:22:41,710][15401] Updated weights for policy 0, policy_version 339900 (0.0024) [2024-06-23 02:22:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5568987136. Throughput: 0: 43135.2. Samples: 5569127780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:22:43,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 02:22:45,087][15401] Updated weights for policy 0, policy_version 339910 (0.0035) [2024-06-23 02:22:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5569200128. Throughput: 0: 42854.6. Samples: 5569377920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:22:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 02:22:49,575][15401] Updated weights for policy 0, policy_version 339920 (0.0027) [2024-06-23 02:22:52,592][15401] Updated weights for policy 0, policy_version 339930 (0.0028) [2024-06-23 02:22:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 43042.7). Total num frames: 5569429504. Throughput: 0: 43014.6. Samples: 5569508820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:22:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 02:22:56,988][15401] Updated weights for policy 0, policy_version 339940 (0.0028) [2024-06-23 02:22:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 5569642496. Throughput: 0: 43097.5. Samples: 5569769920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:22:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 02:23:00,056][15401] Updated weights for policy 0, policy_version 339950 (0.0028) [2024-06-23 02:23:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43422.3, 300 sec: 42931.6). Total num frames: 5569855488. Throughput: 0: 42981.2. Samples: 5570025040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 02:23:04,362][15401] Updated weights for policy 0, policy_version 339960 (0.0026) [2024-06-23 02:23:07,572][15401] Updated weights for policy 0, policy_version 339970 (0.0026) [2024-06-23 02:23:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5570084864. Throughput: 0: 43109.4. Samples: 5570156240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 02:23:12,130][15401] Updated weights for policy 0, policy_version 339980 (0.0033) [2024-06-23 02:23:13,394][15132] Fps is (10 sec: 42579.3, 60 sec: 43416.2, 300 sec: 42931.0). Total num frames: 5570281472. Throughput: 0: 43008.5. Samples: 5570418360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:13,394][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 02:23:14,362][15349] Signal inference workers to stop experience collection... (82500 times) [2024-06-23 02:23:14,363][15349] Signal inference workers to resume experience collection... (82500 times) [2024-06-23 02:23:14,412][15401] InferenceWorker_p0-w0: stopping experience collection (82500 times) [2024-06-23 02:23:14,412][15401] InferenceWorker_p0-w0: resuming experience collection (82500 times) [2024-06-23 02:23:14,933][15401] Updated weights for policy 0, policy_version 339990 (0.0033) [2024-06-23 02:23:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43144.3, 300 sec: 42931.6). Total num frames: 5570494464. Throughput: 0: 43077.7. Samples: 5570675420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:18,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 02:23:19,689][15401] Updated weights for policy 0, policy_version 340000 (0.0054) [2024-06-23 02:23:22,519][15401] Updated weights for policy 0, policy_version 340010 (0.0030) [2024-06-23 02:23:23,390][15132] Fps is (10 sec: 45895.2, 60 sec: 42598.4, 300 sec: 43098.2). Total num frames: 5570740224. Throughput: 0: 43064.0. Samples: 5570806520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 02:23:27,278][15401] Updated weights for policy 0, policy_version 340020 (0.0046) [2024-06-23 02:23:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5570904064. Throughput: 0: 43082.7. Samples: 5571066500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 02:23:30,095][15401] Updated weights for policy 0, policy_version 340030 (0.0033) [2024-06-23 02:23:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5571149824. Throughput: 0: 43202.2. Samples: 5571322020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 02:23:35,134][15401] Updated weights for policy 0, policy_version 340040 (0.0032) [2024-06-23 02:23:37,756][15401] Updated weights for policy 0, policy_version 340050 (0.0027) [2024-06-23 02:23:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 5571379200. Throughput: 0: 43239.6. Samples: 5571454600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 02:23:42,578][15401] Updated weights for policy 0, policy_version 340060 (0.0025) [2024-06-23 02:23:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5571592192. Throughput: 0: 43257.7. Samples: 5571716520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 02:23:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340063_5571592192.pth... [2024-06-23 02:23:43,441][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339434_5561286656.pth [2024-06-23 02:23:45,531][15401] Updated weights for policy 0, policy_version 340070 (0.0036) [2024-06-23 02:23:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5571788800. Throughput: 0: 43198.1. Samples: 5571968960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:23:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 02:23:50,081][15401] Updated weights for policy 0, policy_version 340080 (0.0040) [2024-06-23 02:23:53,359][15401] Updated weights for policy 0, policy_version 340090 (0.0043) [2024-06-23 02:23:53,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43415.8, 300 sec: 43097.9). Total num frames: 5572034560. Throughput: 0: 43174.5. Samples: 5572099200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:23:53,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 02:23:57,477][15401] Updated weights for policy 0, policy_version 340100 (0.0039) [2024-06-23 02:23:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.4, 300 sec: 42987.1). Total num frames: 5572247552. Throughput: 0: 43188.9. Samples: 5572361680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:23:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 02:24:00,985][15401] Updated weights for policy 0, policy_version 340110 (0.0038) [2024-06-23 02:24:03,389][15132] Fps is (10 sec: 40970.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5572444160. Throughput: 0: 43153.1. Samples: 5572617200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:03,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-23 02:24:05,002][15401] Updated weights for policy 0, policy_version 340120 (0.0024) [2024-06-23 02:24:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 5572657152. Throughput: 0: 43108.0. Samples: 5572746380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 02:24:08,722][15401] Updated weights for policy 0, policy_version 340130 (0.0033) [2024-06-23 02:24:12,488][15401] Updated weights for policy 0, policy_version 340140 (0.0047) [2024-06-23 02:24:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43420.9, 300 sec: 43042.7). Total num frames: 5572886528. Throughput: 0: 43137.0. Samples: 5573007660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 02:24:16,627][15401] Updated weights for policy 0, policy_version 340150 (0.0032) [2024-06-23 02:24:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43419.4, 300 sec: 42987.2). Total num frames: 5573099520. Throughput: 0: 42991.6. Samples: 5573256640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 02:24:20,177][15401] Updated weights for policy 0, policy_version 340160 (0.0043) [2024-06-23 02:24:23,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 5573312512. Throughput: 0: 42939.4. Samples: 5573386880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 02:24:24,316][15401] Updated weights for policy 0, policy_version 340170 (0.0036) [2024-06-23 02:24:27,898][15401] Updated weights for policy 0, policy_version 340180 (0.0033) [2024-06-23 02:24:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 5573525504. Throughput: 0: 43006.7. Samples: 5573651820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 02:24:31,695][15401] Updated weights for policy 0, policy_version 340190 (0.0022) [2024-06-23 02:24:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5573738496. Throughput: 0: 43037.4. Samples: 5573905640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 02:24:35,473][15401] Updated weights for policy 0, policy_version 340200 (0.0031) [2024-06-23 02:24:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5573951488. Throughput: 0: 43012.1. Samples: 5574034640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 02:24:39,043][15401] Updated weights for policy 0, policy_version 340210 (0.0033) [2024-06-23 02:24:42,952][15401] Updated weights for policy 0, policy_version 340220 (0.0022) [2024-06-23 02:24:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5574180864. Throughput: 0: 42955.2. Samples: 5574294660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:43,400][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 02:24:46,399][15401] Updated weights for policy 0, policy_version 340230 (0.0026) [2024-06-23 02:24:48,307][15349] Signal inference workers to stop experience collection... (82550 times) [2024-06-23 02:24:48,307][15349] Signal inference workers to resume experience collection... (82550 times) [2024-06-23 02:24:48,340][15401] InferenceWorker_p0-w0: stopping experience collection (82550 times) [2024-06-23 02:24:48,341][15401] InferenceWorker_p0-w0: resuming experience collection (82550 times) [2024-06-23 02:24:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 5574393856. Throughput: 0: 43105.4. Samples: 5574556940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 02:24:50,474][15401] Updated weights for policy 0, policy_version 340240 (0.0042) [2024-06-23 02:24:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 43042.7). Total num frames: 5574606848. Throughput: 0: 43185.9. Samples: 5574689740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 02:24:53,831][15401] Updated weights for policy 0, policy_version 340250 (0.0031) [2024-06-23 02:24:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.7, 300 sec: 43043.1). Total num frames: 5574819840. Throughput: 0: 43019.6. Samples: 5574943540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:24:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 02:24:58,401][15401] Updated weights for policy 0, policy_version 340260 (0.0032) [2024-06-23 02:25:01,678][15401] Updated weights for policy 0, policy_version 340270 (0.0032) [2024-06-23 02:25:03,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43685.9, 300 sec: 43208.4). Total num frames: 5575065600. Throughput: 0: 43214.3. Samples: 5575201560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 24.0) [2024-06-23 02:25:03,396][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 02:25:05,873][15401] Updated weights for policy 0, policy_version 340280 (0.0038) [2024-06-23 02:25:08,389][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5575245824. Throughput: 0: 43282.8. Samples: 5575334600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 02:25:09,164][15401] Updated weights for policy 0, policy_version 340290 (0.0031) [2024-06-23 02:25:13,301][15401] Updated weights for policy 0, policy_version 340300 (0.0040) [2024-06-23 02:25:13,390][15132] Fps is (10 sec: 40986.2, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 5575475200. Throughput: 0: 43057.3. Samples: 5575589400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 02:25:16,609][15401] Updated weights for policy 0, policy_version 340310 (0.0029) [2024-06-23 02:25:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 5575704576. Throughput: 0: 43163.5. Samples: 5575848000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:18,396][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 02:25:20,759][15401] Updated weights for policy 0, policy_version 340320 (0.0024) [2024-06-23 02:25:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5575917568. Throughput: 0: 43383.9. Samples: 5575986920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:23,395][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 02:25:24,450][15401] Updated weights for policy 0, policy_version 340330 (0.0038) [2024-06-23 02:25:28,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.8, 300 sec: 43097.9). Total num frames: 5576114176. Throughput: 0: 43359.1. Samples: 5576245920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:28,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 02:25:28,576][15401] Updated weights for policy 0, policy_version 340340 (0.0033) [2024-06-23 02:25:32,051][15401] Updated weights for policy 0, policy_version 340350 (0.0025) [2024-06-23 02:25:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 5576343552. Throughput: 0: 43252.2. Samples: 5576503300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 02:25:36,259][15401] Updated weights for policy 0, policy_version 340360 (0.0043) [2024-06-23 02:25:38,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 5576556544. Throughput: 0: 43329.1. Samples: 5576639660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:38,393][15132] Avg episode reward: [(0, '0.261')] [2024-06-23 02:25:39,893][15401] Updated weights for policy 0, policy_version 340370 (0.0037) [2024-06-23 02:25:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5576753152. Throughput: 0: 43280.3. Samples: 5576891160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 02:25:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340379_5576769536.pth... [2024-06-23 02:25:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000339746_5566398464.pth [2024-06-23 02:25:43,715][15401] Updated weights for policy 0, policy_version 340380 (0.0031) [2024-06-23 02:25:47,379][15401] Updated weights for policy 0, policy_version 340390 (0.0037) [2024-06-23 02:25:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 5576998912. Throughput: 0: 43355.6. Samples: 5577152280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:48,396][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 02:25:51,146][15401] Updated weights for policy 0, policy_version 340400 (0.0043) [2024-06-23 02:25:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 5577211904. Throughput: 0: 43254.1. Samples: 5577281040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 02:25:54,955][15401] Updated weights for policy 0, policy_version 340410 (0.0040) [2024-06-23 02:25:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 43209.3). Total num frames: 5577424896. Throughput: 0: 43227.6. Samples: 5577534640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:25:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 02:25:59,192][15401] Updated weights for policy 0, policy_version 340420 (0.0029) [2024-06-23 02:26:02,779][15401] Updated weights for policy 0, policy_version 340430 (0.0034) [2024-06-23 02:26:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42603.0, 300 sec: 43042.7). Total num frames: 5577621504. Throughput: 0: 43255.2. Samples: 5577794480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:26:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 02:26:06,750][15401] Updated weights for policy 0, policy_version 340440 (0.0039) [2024-06-23 02:26:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 5577850880. Throughput: 0: 43046.4. Samples: 5577924000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:26:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 02:26:10,140][15401] Updated weights for policy 0, policy_version 340450 (0.0031) [2024-06-23 02:26:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 5578080256. Throughput: 0: 43055.6. Samples: 5578183320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:26:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 02:26:14,231][15401] Updated weights for policy 0, policy_version 340460 (0.0037) [2024-06-23 02:26:17,475][15349] Signal inference workers to stop experience collection... (82600 times) [2024-06-23 02:26:17,516][15401] InferenceWorker_p0-w0: stopping experience collection (82600 times) [2024-06-23 02:26:17,597][15349] Signal inference workers to resume experience collection... (82600 times) [2024-06-23 02:26:17,598][15401] InferenceWorker_p0-w0: resuming experience collection (82600 times) [2024-06-23 02:26:17,731][15401] Updated weights for policy 0, policy_version 340470 (0.0026) [2024-06-23 02:26:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5578276864. Throughput: 0: 42890.2. Samples: 5578433360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:26:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 02:26:21,899][15401] Updated weights for policy 0, policy_version 340480 (0.0032) [2024-06-23 02:26:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 43043.1). Total num frames: 5578473472. Throughput: 0: 42772.2. Samples: 5578564300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 02:26:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 02:26:25,228][15401] Updated weights for policy 0, policy_version 340490 (0.0031) [2024-06-23 02:26:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43419.4, 300 sec: 43209.4). Total num frames: 5578719232. Throughput: 0: 42876.1. Samples: 5578820580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 02:26:29,376][15401] Updated weights for policy 0, policy_version 340500 (0.0034) [2024-06-23 02:26:33,080][15401] Updated weights for policy 0, policy_version 340510 (0.0035) [2024-06-23 02:26:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5578932224. Throughput: 0: 42792.4. Samples: 5579077940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:33,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 02:26:37,225][15401] Updated weights for policy 0, policy_version 340520 (0.0028) [2024-06-23 02:26:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 43153.8). Total num frames: 5579128832. Throughput: 0: 42863.2. Samples: 5579209880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:38,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 02:26:40,505][15401] Updated weights for policy 0, policy_version 340530 (0.0032) [2024-06-23 02:26:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 5579325440. Throughput: 0: 42884.9. Samples: 5579464460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:26:44,837][15401] Updated weights for policy 0, policy_version 340540 (0.0035) [2024-06-23 02:26:48,082][15401] Updated weights for policy 0, policy_version 340550 (0.0042) [2024-06-23 02:26:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 5579571200. Throughput: 0: 42843.5. Samples: 5579722440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 02:26:52,240][15401] Updated weights for policy 0, policy_version 340560 (0.0039) [2024-06-23 02:26:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 5579784192. Throughput: 0: 43007.9. Samples: 5579859360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 02:26:55,776][15401] Updated weights for policy 0, policy_version 340570 (0.0026) [2024-06-23 02:26:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 43154.4). Total num frames: 5579980800. Throughput: 0: 42841.8. Samples: 5580111300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:26:58,392][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 02:26:59,854][15401] Updated weights for policy 0, policy_version 340580 (0.0029) [2024-06-23 02:27:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5580210176. Throughput: 0: 43037.1. Samples: 5580370020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:03,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 02:27:03,498][15401] Updated weights for policy 0, policy_version 340590 (0.0043) [2024-06-23 02:27:07,757][15401] Updated weights for policy 0, policy_version 340600 (0.0036) [2024-06-23 02:27:08,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 43209.7). Total num frames: 5580423168. Throughput: 0: 43070.2. Samples: 5580502460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 02:27:10,846][15401] Updated weights for policy 0, policy_version 340610 (0.0036) [2024-06-23 02:27:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 43154.1). Total num frames: 5580636160. Throughput: 0: 43135.8. Samples: 5580761700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:13,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 02:27:15,253][15401] Updated weights for policy 0, policy_version 340620 (0.0048) [2024-06-23 02:27:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5580849152. Throughput: 0: 43144.8. Samples: 5581019460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 02:27:18,714][15401] Updated weights for policy 0, policy_version 340630 (0.0037) [2024-06-23 02:27:22,808][15401] Updated weights for policy 0, policy_version 340640 (0.0027) [2024-06-23 02:27:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 43209.3). Total num frames: 5581078528. Throughput: 0: 43115.5. Samples: 5581150080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:23,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 02:27:26,361][15401] Updated weights for policy 0, policy_version 340650 (0.0036) [2024-06-23 02:27:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 43209.3). Total num frames: 5581291520. Throughput: 0: 43103.4. Samples: 5581404120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:28,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 02:27:30,392][15401] Updated weights for policy 0, policy_version 340660 (0.0034) [2024-06-23 02:27:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 43043.1). Total num frames: 5581488128. Throughput: 0: 43234.6. Samples: 5581668000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 02:27:34,020][15401] Updated weights for policy 0, policy_version 340670 (0.0041) [2024-06-23 02:27:37,826][15401] Updated weights for policy 0, policy_version 340680 (0.0047) [2024-06-23 02:27:38,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 5581717504. Throughput: 0: 43009.5. Samples: 5581794780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 02:27:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 02:27:39,383][15349] Signal inference workers to stop experience collection... (82650 times) [2024-06-23 02:27:39,383][15349] Signal inference workers to resume experience collection... (82650 times) [2024-06-23 02:27:39,425][15401] InferenceWorker_p0-w0: stopping experience collection (82650 times) [2024-06-23 02:27:39,426][15401] InferenceWorker_p0-w0: resuming experience collection (82650 times) [2024-06-23 02:27:41,699][15401] Updated weights for policy 0, policy_version 340690 (0.0036) [2024-06-23 02:27:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 5581946880. Throughput: 0: 43084.1. Samples: 5582049980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:27:43,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 02:27:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340695_5581946880.pth... [2024-06-23 02:27:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340063_5571592192.pth [2024-06-23 02:27:45,448][15401] Updated weights for policy 0, policy_version 340700 (0.0032) [2024-06-23 02:27:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5582127104. Throughput: 0: 43107.0. Samples: 5582309840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:27:48,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 02:27:49,480][15401] Updated weights for policy 0, policy_version 340710 (0.0023) [2024-06-23 02:27:53,185][15401] Updated weights for policy 0, policy_version 340720 (0.0036) [2024-06-23 02:27:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 5582356480. Throughput: 0: 42967.8. Samples: 5582436020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:27:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 02:27:56,956][15401] Updated weights for policy 0, policy_version 340730 (0.0029) [2024-06-23 02:27:58,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43690.7, 300 sec: 43209.0). Total num frames: 5582602240. Throughput: 0: 43034.3. Samples: 5582698340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:27:58,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 02:28:00,794][15401] Updated weights for policy 0, policy_version 340740 (0.0037) [2024-06-23 02:28:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5582782464. Throughput: 0: 43056.4. Samples: 5582957000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 02:28:04,412][15401] Updated weights for policy 0, policy_version 340750 (0.0033) [2024-06-23 02:28:08,333][15401] Updated weights for policy 0, policy_version 340760 (0.0039) [2024-06-23 02:28:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 43154.4). Total num frames: 5583011840. Throughput: 0: 42973.9. Samples: 5583083900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 02:28:11,869][15401] Updated weights for policy 0, policy_version 340770 (0.0034) [2024-06-23 02:28:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 43209.7). Total num frames: 5583241216. Throughput: 0: 43150.8. Samples: 5583345900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:13,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 02:28:16,393][15401] Updated weights for policy 0, policy_version 340780 (0.0037) [2024-06-23 02:28:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5583437824. Throughput: 0: 42940.5. Samples: 5583600320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 02:28:19,833][15401] Updated weights for policy 0, policy_version 340790 (0.0040) [2024-06-23 02:28:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 43153.8). Total num frames: 5583634432. Throughput: 0: 42966.0. Samples: 5583728260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 02:28:23,854][15401] Updated weights for policy 0, policy_version 340800 (0.0045) [2024-06-23 02:28:27,421][15401] Updated weights for policy 0, policy_version 340810 (0.0038) [2024-06-23 02:28:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 43209.3). Total num frames: 5583896576. Throughput: 0: 43225.8. Samples: 5583995140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 02:28:31,369][15401] Updated weights for policy 0, policy_version 340820 (0.0042) [2024-06-23 02:28:33,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5584076800. Throughput: 0: 43031.5. Samples: 5584246260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 02:28:35,236][15401] Updated weights for policy 0, policy_version 340830 (0.0033) [2024-06-23 02:28:35,527][15349] Signal inference workers to stop experience collection... (82700 times) [2024-06-23 02:28:35,536][15349] Signal inference workers to resume experience collection... (82700 times) [2024-06-23 02:28:35,547][15401] InferenceWorker_p0-w0: stopping experience collection (82700 times) [2024-06-23 02:28:35,588][15401] InferenceWorker_p0-w0: resuming experience collection (82700 times) [2024-06-23 02:28:38,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 5584273408. Throughput: 0: 42981.9. Samples: 5584370200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 02:28:38,855][15401] Updated weights for policy 0, policy_version 340840 (0.0032) [2024-06-23 02:28:42,819][15401] Updated weights for policy 0, policy_version 340850 (0.0032) [2024-06-23 02:28:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 5584535552. Throughput: 0: 43046.6. Samples: 5584635340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 02:28:46,350][15401] Updated weights for policy 0, policy_version 340860 (0.0032) [2024-06-23 02:28:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 5584715776. Throughput: 0: 42959.7. Samples: 5584890180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 02:28:50,182][15401] Updated weights for policy 0, policy_version 340870 (0.0025) [2024-06-23 02:28:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5584945152. Throughput: 0: 42921.7. Samples: 5585015380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 02:28:53,399][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 02:28:53,989][15401] Updated weights for policy 0, policy_version 340880 (0.0034) [2024-06-23 02:28:57,670][15401] Updated weights for policy 0, policy_version 340890 (0.0028) [2024-06-23 02:28:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.3, 300 sec: 43153.8). Total num frames: 5585174528. Throughput: 0: 42960.2. Samples: 5585279100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:28:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 02:29:01,751][15401] Updated weights for policy 0, policy_version 340900 (0.0023) [2024-06-23 02:29:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 5585387520. Throughput: 0: 43027.8. Samples: 5585536580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 02:29:05,178][15401] Updated weights for policy 0, policy_version 340910 (0.0037) [2024-06-23 02:29:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5585584128. Throughput: 0: 42994.8. Samples: 5585663020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:08,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 02:29:09,646][15401] Updated weights for policy 0, policy_version 340920 (0.0036) [2024-06-23 02:29:12,942][15401] Updated weights for policy 0, policy_version 340930 (0.0036) [2024-06-23 02:29:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5585797120. Throughput: 0: 42861.7. Samples: 5585923920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 02:29:17,159][15401] Updated weights for policy 0, policy_version 340940 (0.0027) [2024-06-23 02:29:18,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43139.8, 300 sec: 43097.3). Total num frames: 5586026496. Throughput: 0: 42950.7. Samples: 5586179320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:18,397][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 02:29:20,586][15401] Updated weights for policy 0, policy_version 340950 (0.0032) [2024-06-23 02:29:23,391][15132] Fps is (10 sec: 44229.2, 60 sec: 43416.4, 300 sec: 43098.0). Total num frames: 5586239488. Throughput: 0: 43090.3. Samples: 5586309340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:23,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 02:29:24,559][15401] Updated weights for policy 0, policy_version 340960 (0.0033) [2024-06-23 02:29:28,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 5586436096. Throughput: 0: 43063.3. Samples: 5586573180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 02:29:28,483][15401] Updated weights for policy 0, policy_version 340970 (0.0027) [2024-06-23 02:29:32,258][15401] Updated weights for policy 0, policy_version 340980 (0.0035) [2024-06-23 02:29:33,389][15132] Fps is (10 sec: 42606.6, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 5586665472. Throughput: 0: 43136.0. Samples: 5586831300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 02:29:35,786][15401] Updated weights for policy 0, policy_version 340990 (0.0032) [2024-06-23 02:29:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5586878464. Throughput: 0: 43333.5. Samples: 5586965380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 02:29:39,764][15401] Updated weights for policy 0, policy_version 341000 (0.0042) [2024-06-23 02:29:43,285][15401] Updated weights for policy 0, policy_version 341010 (0.0031) [2024-06-23 02:29:43,390][15132] Fps is (10 sec: 44234.3, 60 sec: 42871.2, 300 sec: 43098.2). Total num frames: 5587107840. Throughput: 0: 43236.8. Samples: 5587224780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 02:29:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341010_5587107840.pth... [2024-06-23 02:29:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340379_5576769536.pth [2024-06-23 02:29:47,260][15401] Updated weights for policy 0, policy_version 341020 (0.0037) [2024-06-23 02:29:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 5587320832. Throughput: 0: 43200.1. Samples: 5587480580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 02:29:51,120][15401] Updated weights for policy 0, policy_version 341030 (0.0036) [2024-06-23 02:29:53,389][15132] Fps is (10 sec: 42600.3, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 5587533824. Throughput: 0: 43324.4. Samples: 5587612620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 02:29:54,852][15401] Updated weights for policy 0, policy_version 341040 (0.0039) [2024-06-23 02:29:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42932.6). Total num frames: 5587730432. Throughput: 0: 43163.2. Samples: 5587866260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:29:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 02:29:58,735][15401] Updated weights for policy 0, policy_version 341050 (0.0034) [2024-06-23 02:30:02,463][15401] Updated weights for policy 0, policy_version 341060 (0.0037) [2024-06-23 02:30:02,629][15349] Signal inference workers to stop experience collection... (82750 times) [2024-06-23 02:30:02,683][15349] Signal inference workers to resume experience collection... (82750 times) [2024-06-23 02:30:02,683][15401] InferenceWorker_p0-w0: stopping experience collection (82750 times) [2024-06-23 02:30:02,701][15401] InferenceWorker_p0-w0: resuming experience collection (82750 times) [2024-06-23 02:30:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 5587976192. Throughput: 0: 43152.0. Samples: 5588120880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:30:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 02:30:06,225][15401] Updated weights for policy 0, policy_version 341070 (0.0030) [2024-06-23 02:30:08,390][15132] Fps is (10 sec: 45871.5, 60 sec: 43417.0, 300 sec: 43098.1). Total num frames: 5588189184. Throughput: 0: 43337.4. Samples: 5588259480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 02:30:08,391][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 02:30:09,897][15401] Updated weights for policy 0, policy_version 341080 (0.0027) [2024-06-23 02:30:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 5588402176. Throughput: 0: 43311.5. Samples: 5588522200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 02:30:13,569][15401] Updated weights for policy 0, policy_version 341090 (0.0040) [2024-06-23 02:30:17,295][15401] Updated weights for policy 0, policy_version 341100 (0.0033) [2024-06-23 02:30:18,390][15132] Fps is (10 sec: 44240.1, 60 sec: 43422.3, 300 sec: 43098.3). Total num frames: 5588631552. Throughput: 0: 43281.2. Samples: 5588778960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 02:30:21,136][15401] Updated weights for policy 0, policy_version 341110 (0.0032) [2024-06-23 02:30:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43418.9, 300 sec: 43154.2). Total num frames: 5588844544. Throughput: 0: 43333.7. Samples: 5588915400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 02:30:24,713][15401] Updated weights for policy 0, policy_version 341120 (0.0035) [2024-06-23 02:30:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 5589041152. Throughput: 0: 43236.9. Samples: 5589170420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 02:30:28,605][15401] Updated weights for policy 0, policy_version 341130 (0.0032) [2024-06-23 02:30:32,191][15401] Updated weights for policy 0, policy_version 341140 (0.0033) [2024-06-23 02:30:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43415.8, 300 sec: 43098.3). Total num frames: 5589270528. Throughput: 0: 43238.1. Samples: 5589426400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:33,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 02:30:36,506][15401] Updated weights for policy 0, policy_version 341150 (0.0035) [2024-06-23 02:30:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 5589483520. Throughput: 0: 43230.6. Samples: 5589558000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 02:30:39,796][15401] Updated weights for policy 0, policy_version 341160 (0.0037) [2024-06-23 02:30:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.8, 300 sec: 43042.7). Total num frames: 5589696512. Throughput: 0: 43207.0. Samples: 5589810580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 02:30:44,071][15401] Updated weights for policy 0, policy_version 341170 (0.0034) [2024-06-23 02:30:47,807][15401] Updated weights for policy 0, policy_version 341180 (0.0036) [2024-06-23 02:30:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5589909504. Throughput: 0: 43185.4. Samples: 5590064220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 02:30:51,537][15401] Updated weights for policy 0, policy_version 341190 (0.0039) [2024-06-23 02:30:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5590122496. Throughput: 0: 43074.5. Samples: 5590197800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 02:30:55,297][15401] Updated weights for policy 0, policy_version 341200 (0.0033) [2024-06-23 02:30:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43415.8, 300 sec: 43097.9). Total num frames: 5590335488. Throughput: 0: 42976.8. Samples: 5590456260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:30:58,393][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 02:30:59,338][15401] Updated weights for policy 0, policy_version 341210 (0.0033) [2024-06-23 02:31:02,889][15401] Updated weights for policy 0, policy_version 341220 (0.0028) [2024-06-23 02:31:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5590548480. Throughput: 0: 43004.9. Samples: 5590714180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 02:31:04,090][15349] Signal inference workers to stop experience collection... (82800 times) [2024-06-23 02:31:04,139][15401] InferenceWorker_p0-w0: stopping experience collection (82800 times) [2024-06-23 02:31:04,147][15349] Signal inference workers to resume experience collection... (82800 times) [2024-06-23 02:31:04,153][15401] InferenceWorker_p0-w0: resuming experience collection (82800 times) [2024-06-23 02:31:06,843][15401] Updated weights for policy 0, policy_version 341230 (0.0041) [2024-06-23 02:31:08,394][15132] Fps is (10 sec: 45864.7, 60 sec: 43414.8, 300 sec: 43097.6). Total num frames: 5590794240. Throughput: 0: 42945.2. Samples: 5590848140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:08,395][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 02:31:10,281][15401] Updated weights for policy 0, policy_version 341240 (0.0025) [2024-06-23 02:31:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5590974464. Throughput: 0: 43094.2. Samples: 5591109660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:13,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 02:31:14,185][15401] Updated weights for policy 0, policy_version 341250 (0.0051) [2024-06-23 02:31:17,979][15401] Updated weights for policy 0, policy_version 341260 (0.0043) [2024-06-23 02:31:18,389][15132] Fps is (10 sec: 40979.4, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 5591203840. Throughput: 0: 43032.1. Samples: 5591362740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:18,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-23 02:31:22,157][15401] Updated weights for policy 0, policy_version 341270 (0.0033) [2024-06-23 02:31:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5591416832. Throughput: 0: 42929.0. Samples: 5591489800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 02:31:25,667][15401] Updated weights for policy 0, policy_version 341280 (0.0031) [2024-06-23 02:31:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5591629824. Throughput: 0: 43276.5. Samples: 5591758020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 02:31:28,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 02:31:29,631][15401] Updated weights for policy 0, policy_version 341290 (0.0030) [2024-06-23 02:31:33,073][15401] Updated weights for policy 0, policy_version 341300 (0.0031) [2024-06-23 02:31:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 43153.8). Total num frames: 5591859200. Throughput: 0: 43033.2. Samples: 5592000720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 02:31:37,173][15401] Updated weights for policy 0, policy_version 341310 (0.0036) [2024-06-23 02:31:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 43098.3). Total num frames: 5592039424. Throughput: 0: 43111.1. Samples: 5592137800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 02:31:40,701][15401] Updated weights for policy 0, policy_version 341320 (0.0043) [2024-06-23 02:31:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42987.1). Total num frames: 5592252416. Throughput: 0: 43045.3. Samples: 5592393200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 02:31:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341324_5592252416.pth... [2024-06-23 02:31:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000340695_5581946880.pth [2024-06-23 02:31:44,731][15401] Updated weights for policy 0, policy_version 341330 (0.0029) [2024-06-23 02:31:48,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 43097.9). Total num frames: 5592498176. Throughput: 0: 42847.5. Samples: 5592642420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:48,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 02:31:48,686][15401] Updated weights for policy 0, policy_version 341340 (0.0040) [2024-06-23 02:31:52,325][15401] Updated weights for policy 0, policy_version 341350 (0.0036) [2024-06-23 02:31:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 43098.6). Total num frames: 5592694784. Throughput: 0: 42957.3. Samples: 5592781020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 02:31:56,168][15401] Updated weights for policy 0, policy_version 341360 (0.0032) [2024-06-23 02:31:58,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 5592891392. Throughput: 0: 42695.2. Samples: 5593030940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:31:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 02:32:00,190][15401] Updated weights for policy 0, policy_version 341370 (0.0040) [2024-06-23 02:32:03,165][15349] Signal inference workers to stop experience collection... (82850 times) [2024-06-23 02:32:03,208][15401] InferenceWorker_p0-w0: stopping experience collection (82850 times) [2024-06-23 02:32:03,223][15349] Signal inference workers to resume experience collection... (82850 times) [2024-06-23 02:32:03,227][15401] InferenceWorker_p0-w0: resuming experience collection (82850 times) [2024-06-23 02:32:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 5593137152. Throughput: 0: 42779.4. Samples: 5593287820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 02:32:03,692][15401] Updated weights for policy 0, policy_version 341380 (0.0032) [2024-06-23 02:32:07,820][15401] Updated weights for policy 0, policy_version 341390 (0.0038) [2024-06-23 02:32:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42328.6, 300 sec: 43042.7). Total num frames: 5593333760. Throughput: 0: 42837.7. Samples: 5593417500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 02:32:11,347][15401] Updated weights for policy 0, policy_version 341400 (0.0056) [2024-06-23 02:32:13,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5593530368. Throughput: 0: 42466.8. Samples: 5593669020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 02:32:15,398][15401] Updated weights for policy 0, policy_version 341410 (0.0041) [2024-06-23 02:32:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 5593759744. Throughput: 0: 42654.8. Samples: 5593920180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 02:32:19,330][15401] Updated weights for policy 0, policy_version 341420 (0.0034) [2024-06-23 02:32:23,054][15401] Updated weights for policy 0, policy_version 341430 (0.0031) [2024-06-23 02:32:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 5593989120. Throughput: 0: 42591.5. Samples: 5594054420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 02:32:26,950][15401] Updated weights for policy 0, policy_version 341440 (0.0034) [2024-06-23 02:32:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 5594169344. Throughput: 0: 42458.4. Samples: 5594303820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 02:32:30,909][15401] Updated weights for policy 0, policy_version 341450 (0.0036) [2024-06-23 02:32:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 5594398720. Throughput: 0: 42657.8. Samples: 5594561920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 02:32:34,539][15401] Updated weights for policy 0, policy_version 341460 (0.0030) [2024-06-23 02:32:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5594628096. Throughput: 0: 42553.4. Samples: 5594695920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 02:32:38,571][15401] Updated weights for policy 0, policy_version 341470 (0.0032) [2024-06-23 02:32:42,093][15401] Updated weights for policy 0, policy_version 341480 (0.0038) [2024-06-23 02:32:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5594824704. Throughput: 0: 42551.0. Samples: 5594945740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:32:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 02:32:45,953][15401] Updated weights for policy 0, policy_version 341490 (0.0024) [2024-06-23 02:32:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42987.2). Total num frames: 5595037696. Throughput: 0: 42738.8. Samples: 5595211060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:32:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 02:32:50,069][15401] Updated weights for policy 0, policy_version 341500 (0.0043) [2024-06-23 02:32:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 5595267072. Throughput: 0: 42677.8. Samples: 5595338000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:32:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 02:32:54,469][15401] Updated weights for policy 0, policy_version 341510 (0.0036) [2024-06-23 02:32:57,964][15401] Updated weights for policy 0, policy_version 341520 (0.0035) [2024-06-23 02:32:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 5595480064. Throughput: 0: 42722.9. Samples: 5595591660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:32:58,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 02:33:02,044][15401] Updated weights for policy 0, policy_version 341530 (0.0027) [2024-06-23 02:33:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5595693056. Throughput: 0: 42969.3. Samples: 5595853800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 02:33:05,546][15401] Updated weights for policy 0, policy_version 341540 (0.0030) [2024-06-23 02:33:08,098][15349] Signal inference workers to stop experience collection... (82900 times) [2024-06-23 02:33:08,104][15349] Signal inference workers to resume experience collection... (82900 times) [2024-06-23 02:33:08,148][15401] InferenceWorker_p0-w0: stopping experience collection (82900 times) [2024-06-23 02:33:08,148][15401] InferenceWorker_p0-w0: resuming experience collection (82900 times) [2024-06-23 02:33:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5595906048. Throughput: 0: 42816.5. Samples: 5595981160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 02:33:09,797][15401] Updated weights for policy 0, policy_version 341550 (0.0034) [2024-06-23 02:33:13,234][15401] Updated weights for policy 0, policy_version 341560 (0.0031) [2024-06-23 02:33:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5596119040. Throughput: 0: 42960.8. Samples: 5596237060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 02:33:17,370][15401] Updated weights for policy 0, policy_version 341570 (0.0043) [2024-06-23 02:33:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5596332032. Throughput: 0: 42947.1. Samples: 5596494540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 02:33:20,967][15401] Updated weights for policy 0, policy_version 341580 (0.0035) [2024-06-23 02:33:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5596545024. Throughput: 0: 42693.3. Samples: 5596617120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 02:33:24,803][15401] Updated weights for policy 0, policy_version 341590 (0.0042) [2024-06-23 02:33:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5596741632. Throughput: 0: 42755.2. Samples: 5596869720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 02:33:28,625][15401] Updated weights for policy 0, policy_version 341600 (0.0033) [2024-06-23 02:33:32,327][15401] Updated weights for policy 0, policy_version 341610 (0.0036) [2024-06-23 02:33:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 5596938240. Throughput: 0: 42628.5. Samples: 5597129340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:33,390][15132] Avg episode reward: [(0, '0.125')] [2024-06-23 02:33:36,503][15401] Updated weights for policy 0, policy_version 341620 (0.0029) [2024-06-23 02:33:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5597167616. Throughput: 0: 42560.9. Samples: 5597253240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 02:33:40,703][15401] Updated weights for policy 0, policy_version 341630 (0.0046) [2024-06-23 02:33:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5597364224. Throughput: 0: 42555.1. Samples: 5597506540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:43,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 02:33:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341636_5597364224.pth... [2024-06-23 02:33:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341010_5587107840.pth [2024-06-23 02:33:44,162][15401] Updated weights for policy 0, policy_version 341640 (0.0027) [2024-06-23 02:33:48,384][15401] Updated weights for policy 0, policy_version 341650 (0.0039) [2024-06-23 02:33:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5597593600. Throughput: 0: 42469.4. Samples: 5597764920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 02:33:51,859][15401] Updated weights for policy 0, policy_version 341660 (0.0033) [2024-06-23 02:33:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5597822976. Throughput: 0: 42467.6. Samples: 5597892200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:53,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 02:33:56,165][15401] Updated weights for policy 0, policy_version 341670 (0.0035) [2024-06-23 02:33:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 5598019584. Throughput: 0: 42376.0. Samples: 5598143980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 02:33:58,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 02:33:59,619][15401] Updated weights for policy 0, policy_version 341680 (0.0027) [2024-06-23 02:34:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5598232576. Throughput: 0: 42399.9. Samples: 5598402540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 02:34:03,576][15401] Updated weights for policy 0, policy_version 341690 (0.0030) [2024-06-23 02:34:07,231][15401] Updated weights for policy 0, policy_version 341700 (0.0035) [2024-06-23 02:34:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 5598461952. Throughput: 0: 42663.2. Samples: 5598536960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 02:34:11,046][15401] Updated weights for policy 0, policy_version 341710 (0.0027) [2024-06-23 02:34:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42821.5). Total num frames: 5598658560. Throughput: 0: 42592.8. Samples: 5598786400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:13,394][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 02:34:15,127][15401] Updated weights for policy 0, policy_version 341720 (0.0033) [2024-06-23 02:34:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.8). Total num frames: 5598871552. Throughput: 0: 42512.9. Samples: 5599042420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 02:34:18,548][15401] Updated weights for policy 0, policy_version 341730 (0.0024) [2024-06-23 02:34:22,809][15401] Updated weights for policy 0, policy_version 341740 (0.0036) [2024-06-23 02:34:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 5599100928. Throughput: 0: 42658.7. Samples: 5599172880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 02:34:26,474][15401] Updated weights for policy 0, policy_version 341750 (0.0029) [2024-06-23 02:34:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5599297536. Throughput: 0: 42705.0. Samples: 5599428260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 02:34:30,486][15401] Updated weights for policy 0, policy_version 341760 (0.0033) [2024-06-23 02:34:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5599510528. Throughput: 0: 42686.2. Samples: 5599685800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 02:34:34,007][15401] Updated weights for policy 0, policy_version 341770 (0.0033) [2024-06-23 02:34:38,145][15401] Updated weights for policy 0, policy_version 341780 (0.0042) [2024-06-23 02:34:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5599739904. Throughput: 0: 42698.2. Samples: 5599813620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 02:34:38,962][15349] Signal inference workers to stop experience collection... (82950 times) [2024-06-23 02:34:38,964][15349] Signal inference workers to resume experience collection... (82950 times) [2024-06-23 02:34:38,976][15401] InferenceWorker_p0-w0: stopping experience collection (82950 times) [2024-06-23 02:34:38,995][15401] InferenceWorker_p0-w0: resuming experience collection (82950 times) [2024-06-23 02:34:41,593][15401] Updated weights for policy 0, policy_version 341790 (0.0039) [2024-06-23 02:34:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 5599936512. Throughput: 0: 42679.6. Samples: 5600064660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:43,401][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 02:34:45,720][15401] Updated weights for policy 0, policy_version 341800 (0.0034) [2024-06-23 02:34:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5600149504. Throughput: 0: 42705.8. Samples: 5600324300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 02:34:49,197][15401] Updated weights for policy 0, policy_version 341810 (0.0031) [2024-06-23 02:34:53,187][15401] Updated weights for policy 0, policy_version 341820 (0.0030) [2024-06-23 02:34:53,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 5600378880. Throughput: 0: 42619.4. Samples: 5600454940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:53,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 02:34:56,961][15401] Updated weights for policy 0, policy_version 341830 (0.0037) [2024-06-23 02:34:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5600575488. Throughput: 0: 42715.1. Samples: 5600708580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:34:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 02:35:00,856][15401] Updated weights for policy 0, policy_version 341840 (0.0041) [2024-06-23 02:35:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 5600804864. Throughput: 0: 42781.2. Samples: 5600967580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:35:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 02:35:04,643][15401] Updated weights for policy 0, policy_version 341850 (0.0032) [2024-06-23 02:35:08,289][15401] Updated weights for policy 0, policy_version 341860 (0.0040) [2024-06-23 02:35:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5601034240. Throughput: 0: 42870.7. Samples: 5601102060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:35:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 02:35:12,107][15401] Updated weights for policy 0, policy_version 341870 (0.0030) [2024-06-23 02:35:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5601230848. Throughput: 0: 42810.5. Samples: 5601354740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:35:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 02:35:15,749][15401] Updated weights for policy 0, policy_version 341880 (0.0046) [2024-06-23 02:35:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5601443840. Throughput: 0: 42809.0. Samples: 5601612200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 02:35:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 02:35:20,148][15401] Updated weights for policy 0, policy_version 341890 (0.0032) [2024-06-23 02:35:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5601673216. Throughput: 0: 42913.9. Samples: 5601744740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 02:35:23,603][15401] Updated weights for policy 0, policy_version 341900 (0.0035) [2024-06-23 02:35:27,740][15401] Updated weights for policy 0, policy_version 341910 (0.0026) [2024-06-23 02:35:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5601869824. Throughput: 0: 42938.7. Samples: 5601996800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 02:35:31,233][15401] Updated weights for policy 0, policy_version 341920 (0.0033) [2024-06-23 02:35:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5602099200. Throughput: 0: 42910.3. Samples: 5602255260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 02:35:35,302][15401] Updated weights for policy 0, policy_version 341930 (0.0031) [2024-06-23 02:35:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5602295808. Throughput: 0: 42857.1. Samples: 5602383400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:38,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-23 02:35:38,879][15401] Updated weights for policy 0, policy_version 341940 (0.0038) [2024-06-23 02:35:43,059][15401] Updated weights for policy 0, policy_version 341950 (0.0038) [2024-06-23 02:35:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 5602525184. Throughput: 0: 43020.3. Samples: 5602644500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:43,399][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 02:35:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341951_5602525184.pth... [2024-06-23 02:35:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341324_5592252416.pth [2024-06-23 02:35:46,521][15401] Updated weights for policy 0, policy_version 341960 (0.0039) [2024-06-23 02:35:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 5602754560. Throughput: 0: 42770.3. Samples: 5602892340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:48,392][15132] Avg episode reward: [(0, '0.261')] [2024-06-23 02:35:50,990][15401] Updated weights for policy 0, policy_version 341970 (0.0043) [2024-06-23 02:35:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 5602951168. Throughput: 0: 42735.9. Samples: 5603025180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:53,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 02:35:54,260][15401] Updated weights for policy 0, policy_version 341980 (0.0042) [2024-06-23 02:35:58,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5603147776. Throughput: 0: 42823.1. Samples: 5603281780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:35:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 02:35:58,649][15401] Updated weights for policy 0, policy_version 341990 (0.0030) [2024-06-23 02:36:01,844][15401] Updated weights for policy 0, policy_version 342000 (0.0031) [2024-06-23 02:36:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42710.1). Total num frames: 5603393536. Throughput: 0: 42758.5. Samples: 5603536340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 02:36:06,186][15401] Updated weights for policy 0, policy_version 342010 (0.0038) [2024-06-23 02:36:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5603590144. Throughput: 0: 42787.5. Samples: 5603670180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:08,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 02:36:09,783][15401] Updated weights for policy 0, policy_version 342020 (0.0034) [2024-06-23 02:36:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5603786752. Throughput: 0: 42652.0. Samples: 5603916140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 02:36:13,766][15401] Updated weights for policy 0, policy_version 342030 (0.0031) [2024-06-23 02:36:16,636][15349] Signal inference workers to stop experience collection... (83000 times) [2024-06-23 02:36:16,636][15349] Signal inference workers to resume experience collection... (83000 times) [2024-06-23 02:36:16,666][15401] InferenceWorker_p0-w0: stopping experience collection (83000 times) [2024-06-23 02:36:16,666][15401] InferenceWorker_p0-w0: resuming experience collection (83000 times) [2024-06-23 02:36:17,363][15401] Updated weights for policy 0, policy_version 342040 (0.0036) [2024-06-23 02:36:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5604032512. Throughput: 0: 42629.3. Samples: 5604173580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 02:36:21,466][15401] Updated weights for policy 0, policy_version 342050 (0.0032) [2024-06-23 02:36:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5604212736. Throughput: 0: 42672.7. Samples: 5604303680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 02:36:25,168][15401] Updated weights for policy 0, policy_version 342060 (0.0044) [2024-06-23 02:36:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5604425728. Throughput: 0: 42443.4. Samples: 5604554440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 02:36:29,283][15401] Updated weights for policy 0, policy_version 342070 (0.0042) [2024-06-23 02:36:32,731][15401] Updated weights for policy 0, policy_version 342080 (0.0022) [2024-06-23 02:36:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5604671488. Throughput: 0: 42650.7. Samples: 5604811520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 02:36:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 02:36:37,037][15401] Updated weights for policy 0, policy_version 342090 (0.0026) [2024-06-23 02:36:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5604851712. Throughput: 0: 42688.1. Samples: 5604946140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:36:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 02:36:40,312][15401] Updated weights for policy 0, policy_version 342100 (0.0032) [2024-06-23 02:36:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5605081088. Throughput: 0: 42536.0. Samples: 5605195900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:36:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 02:36:45,042][15401] Updated weights for policy 0, policy_version 342110 (0.0044) [2024-06-23 02:36:48,135][15401] Updated weights for policy 0, policy_version 342120 (0.0034) [2024-06-23 02:36:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 5605294080. Throughput: 0: 42536.2. Samples: 5605450460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:36:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 02:36:52,522][15401] Updated weights for policy 0, policy_version 342130 (0.0038) [2024-06-23 02:36:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5605474304. Throughput: 0: 42457.3. Samples: 5605580760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:36:53,394][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 02:36:55,867][15401] Updated weights for policy 0, policy_version 342140 (0.0048) [2024-06-23 02:36:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5605720064. Throughput: 0: 42696.9. Samples: 5605837500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:36:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 02:36:59,954][15401] Updated weights for policy 0, policy_version 342150 (0.0039) [2024-06-23 02:37:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5605933056. Throughput: 0: 42603.9. Samples: 5606090760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 02:37:03,475][15401] Updated weights for policy 0, policy_version 342160 (0.0031) [2024-06-23 02:37:07,441][15401] Updated weights for policy 0, policy_version 342170 (0.0032) [2024-06-23 02:37:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5606113280. Throughput: 0: 42636.6. Samples: 5606222320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 02:37:11,194][15401] Updated weights for policy 0, policy_version 342180 (0.0034) [2024-06-23 02:37:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5606359040. Throughput: 0: 42616.3. Samples: 5606472180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:13,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 02:37:15,396][15401] Updated weights for policy 0, policy_version 342190 (0.0043) [2024-06-23 02:37:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42050.6, 300 sec: 42598.1). Total num frames: 5606555648. Throughput: 0: 42588.8. Samples: 5606728120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:18,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 02:37:18,989][15401] Updated weights for policy 0, policy_version 342200 (0.0051) [2024-06-23 02:37:23,267][15401] Updated weights for policy 0, policy_version 342210 (0.0030) [2024-06-23 02:37:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5606768640. Throughput: 0: 42335.9. Samples: 5606851260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 02:37:26,689][15401] Updated weights for policy 0, policy_version 342220 (0.0053) [2024-06-23 02:37:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5606998016. Throughput: 0: 42385.8. Samples: 5607103260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 02:37:31,151][15401] Updated weights for policy 0, policy_version 342230 (0.0031) [2024-06-23 02:37:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5607194624. Throughput: 0: 42543.9. Samples: 5607364940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 02:37:34,427][15401] Updated weights for policy 0, policy_version 342240 (0.0036) [2024-06-23 02:37:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5607391232. Throughput: 0: 42294.2. Samples: 5607484000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 02:37:38,915][15401] Updated weights for policy 0, policy_version 342250 (0.0033) [2024-06-23 02:37:42,300][15401] Updated weights for policy 0, policy_version 342260 (0.0041) [2024-06-23 02:37:43,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 5607636992. Throughput: 0: 42389.5. Samples: 5607745040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 02:37:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342263_5607636992.pth... [2024-06-23 02:37:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341636_5597364224.pth [2024-06-23 02:37:46,727][15401] Updated weights for policy 0, policy_version 342270 (0.0032) [2024-06-23 02:37:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5607833600. Throughput: 0: 42388.5. Samples: 5607998240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 02:37:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 02:37:49,886][15401] Updated weights for policy 0, policy_version 342280 (0.0030) [2024-06-23 02:37:53,389][15132] Fps is (10 sec: 39323.1, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 5608030208. Throughput: 0: 42251.6. Samples: 5608123640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:37:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 02:37:54,290][15401] Updated weights for policy 0, policy_version 342290 (0.0033) [2024-06-23 02:37:58,047][15401] Updated weights for policy 0, policy_version 342300 (0.0038) [2024-06-23 02:37:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5608259584. Throughput: 0: 42380.0. Samples: 5608379280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:37:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 02:38:02,043][15401] Updated weights for policy 0, policy_version 342310 (0.0040) [2024-06-23 02:38:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5608472576. Throughput: 0: 42160.1. Samples: 5608625220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 02:38:05,547][15401] Updated weights for policy 0, policy_version 342320 (0.0038) [2024-06-23 02:38:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5608669184. Throughput: 0: 42155.0. Samples: 5608748240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:08,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 02:38:09,091][15349] Signal inference workers to stop experience collection... (83050 times) [2024-06-23 02:38:09,091][15349] Signal inference workers to resume experience collection... (83050 times) [2024-06-23 02:38:09,111][15401] InferenceWorker_p0-w0: stopping experience collection (83050 times) [2024-06-23 02:38:09,111][15401] InferenceWorker_p0-w0: resuming experience collection (83050 times) [2024-06-23 02:38:10,033][15401] Updated weights for policy 0, policy_version 342330 (0.0037) [2024-06-23 02:38:13,266][15401] Updated weights for policy 0, policy_version 342340 (0.0048) [2024-06-23 02:38:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 5608898560. Throughput: 0: 42291.0. Samples: 5609006460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:13,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 02:38:17,677][15401] Updated weights for policy 0, policy_version 342350 (0.0034) [2024-06-23 02:38:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 5609078784. Throughput: 0: 42249.7. Samples: 5609266180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 02:38:20,976][15401] Updated weights for policy 0, policy_version 342360 (0.0022) [2024-06-23 02:38:23,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5609308160. Throughput: 0: 42262.7. Samples: 5609385820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 02:38:25,330][15401] Updated weights for policy 0, policy_version 342370 (0.0038) [2024-06-23 02:38:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5609521152. Throughput: 0: 42143.0. Samples: 5609641460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:28,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 02:38:28,790][15401] Updated weights for policy 0, policy_version 342380 (0.0035) [2024-06-23 02:38:32,986][15401] Updated weights for policy 0, policy_version 342390 (0.0033) [2024-06-23 02:38:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5609734144. Throughput: 0: 42314.2. Samples: 5609902380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 02:38:36,728][15401] Updated weights for policy 0, policy_version 342400 (0.0046) [2024-06-23 02:38:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5609947136. Throughput: 0: 42301.7. Samples: 5610027220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 02:38:40,683][15401] Updated weights for policy 0, policy_version 342410 (0.0034) [2024-06-23 02:38:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 5610176512. Throughput: 0: 42315.5. Samples: 5610283480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 02:38:44,598][15401] Updated weights for policy 0, policy_version 342420 (0.0039) [2024-06-23 02:38:48,377][15401] Updated weights for policy 0, policy_version 342430 (0.0044) [2024-06-23 02:38:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5610373120. Throughput: 0: 42328.0. Samples: 5610529980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 02:38:52,082][15401] Updated weights for policy 0, policy_version 342440 (0.0041) [2024-06-23 02:38:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5610569728. Throughput: 0: 42362.4. Samples: 5610654540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 02:38:56,318][15401] Updated weights for policy 0, policy_version 342450 (0.0027) [2024-06-23 02:38:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5610799104. Throughput: 0: 42435.5. Samples: 5610915960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:38:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 02:38:59,628][15401] Updated weights for policy 0, policy_version 342460 (0.0042) [2024-06-23 02:39:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5610995712. Throughput: 0: 42291.0. Samples: 5611169280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:39:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 02:39:03,862][15401] Updated weights for policy 0, policy_version 342470 (0.0045) [2024-06-23 02:39:07,097][15401] Updated weights for policy 0, policy_version 342480 (0.0041) [2024-06-23 02:39:08,391][15132] Fps is (10 sec: 39315.5, 60 sec: 42051.2, 300 sec: 42487.1). Total num frames: 5611192320. Throughput: 0: 42430.9. Samples: 5611295280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 02:39:08,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 02:39:11,379][15401] Updated weights for policy 0, policy_version 342490 (0.0029) [2024-06-23 02:39:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 5611438080. Throughput: 0: 42614.7. Samples: 5611559120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 02:39:14,522][15401] Updated weights for policy 0, policy_version 342500 (0.0028) [2024-06-23 02:39:18,389][15132] Fps is (10 sec: 44244.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5611634688. Throughput: 0: 42480.7. Samples: 5611814000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 02:39:19,091][15401] Updated weights for policy 0, policy_version 342510 (0.0028) [2024-06-23 02:39:22,418][15401] Updated weights for policy 0, policy_version 342520 (0.0032) [2024-06-23 02:39:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 5611847680. Throughput: 0: 42411.1. Samples: 5611935820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:23,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 02:39:24,340][15349] Signal inference workers to stop experience collection... (83100 times) [2024-06-23 02:39:24,373][15401] InferenceWorker_p0-w0: stopping experience collection (83100 times) [2024-06-23 02:39:24,395][15349] Signal inference workers to resume experience collection... (83100 times) [2024-06-23 02:39:24,396][15401] InferenceWorker_p0-w0: resuming experience collection (83100 times) [2024-06-23 02:39:26,725][15401] Updated weights for policy 0, policy_version 342530 (0.0030) [2024-06-23 02:39:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5612060672. Throughput: 0: 42360.1. Samples: 5612189680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 02:39:30,100][15401] Updated weights for policy 0, policy_version 342540 (0.0029) [2024-06-23 02:39:33,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5612273664. Throughput: 0: 42743.0. Samples: 5612453420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:33,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 02:39:34,278][15401] Updated weights for policy 0, policy_version 342550 (0.0026) [2024-06-23 02:39:38,070][15401] Updated weights for policy 0, policy_version 342560 (0.0028) [2024-06-23 02:39:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 5612503040. Throughput: 0: 42764.4. Samples: 5612578940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 02:39:41,778][15401] Updated weights for policy 0, policy_version 342570 (0.0021) [2024-06-23 02:39:43,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5612699648. Throughput: 0: 42592.1. Samples: 5612832600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 02:39:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342573_5612716032.pth... [2024-06-23 02:39:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000341951_5602525184.pth [2024-06-23 02:39:45,785][15401] Updated weights for policy 0, policy_version 342580 (0.0031) [2024-06-23 02:39:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 5612912640. Throughput: 0: 42714.7. Samples: 5613091440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 02:39:49,327][15401] Updated weights for policy 0, policy_version 342590 (0.0029) [2024-06-23 02:39:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5613142016. Throughput: 0: 42641.6. Samples: 5613214080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 02:39:53,734][15401] Updated weights for policy 0, policy_version 342600 (0.0037) [2024-06-23 02:39:57,064][15401] Updated weights for policy 0, policy_version 342610 (0.0036) [2024-06-23 02:39:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5613355008. Throughput: 0: 42383.5. Samples: 5613466380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:39:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 02:40:01,769][15401] Updated weights for policy 0, policy_version 342620 (0.0043) [2024-06-23 02:40:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5613551616. Throughput: 0: 42601.7. Samples: 5613731080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:40:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 02:40:04,926][15401] Updated weights for policy 0, policy_version 342630 (0.0032) [2024-06-23 02:40:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43418.8, 300 sec: 42598.4). Total num frames: 5613797376. Throughput: 0: 42612.9. Samples: 5613853300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:40:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 02:40:09,201][15401] Updated weights for policy 0, policy_version 342640 (0.0037) [2024-06-23 02:40:12,653][15401] Updated weights for policy 0, policy_version 342650 (0.0035) [2024-06-23 02:40:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5613993984. Throughput: 0: 42849.2. Samples: 5614117900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:40:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 02:40:16,719][15401] Updated weights for policy 0, policy_version 342660 (0.0048) [2024-06-23 02:40:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5614190592. Throughput: 0: 42776.6. Samples: 5614378360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:40:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 02:40:20,775][15401] Updated weights for policy 0, policy_version 342670 (0.0037) [2024-06-23 02:40:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 5614436352. Throughput: 0: 42582.2. Samples: 5614495140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 02:40:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 02:40:24,159][15401] Updated weights for policy 0, policy_version 342680 (0.0038) [2024-06-23 02:40:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5614616576. Throughput: 0: 42762.3. Samples: 5614756900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 02:40:28,417][15401] Updated weights for policy 0, policy_version 342690 (0.0033) [2024-06-23 02:40:31,634][15401] Updated weights for policy 0, policy_version 342700 (0.0036) [2024-06-23 02:40:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5614829568. Throughput: 0: 42794.2. Samples: 5615017180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:33,395][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 02:40:36,024][15401] Updated weights for policy 0, policy_version 342710 (0.0043) [2024-06-23 02:40:38,259][15349] Signal inference workers to stop experience collection... (83150 times) [2024-06-23 02:40:38,311][15401] InferenceWorker_p0-w0: stopping experience collection (83150 times) [2024-06-23 02:40:38,318][15349] Signal inference workers to resume experience collection... (83150 times) [2024-06-23 02:40:38,333][15401] InferenceWorker_p0-w0: resuming experience collection (83150 times) [2024-06-23 02:40:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5615075328. Throughput: 0: 42752.1. Samples: 5615137920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 02:40:39,001][15401] Updated weights for policy 0, policy_version 342720 (0.0035) [2024-06-23 02:40:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42376.2). Total num frames: 5615255552. Throughput: 0: 42882.1. Samples: 5615396180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:43,393][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 02:40:43,750][15401] Updated weights for policy 0, policy_version 342730 (0.0034) [2024-06-23 02:40:47,207][15401] Updated weights for policy 0, policy_version 342740 (0.0034) [2024-06-23 02:40:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5615468544. Throughput: 0: 42676.5. Samples: 5615651520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 02:40:51,334][15401] Updated weights for policy 0, policy_version 342750 (0.0033) [2024-06-23 02:40:53,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5615714304. Throughput: 0: 42793.4. Samples: 5615779000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 02:40:54,677][15401] Updated weights for policy 0, policy_version 342760 (0.0035) [2024-06-23 02:40:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 5615894528. Throughput: 0: 42626.7. Samples: 5616036100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:40:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 02:40:58,894][15401] Updated weights for policy 0, policy_version 342770 (0.0027) [2024-06-23 02:41:02,229][15401] Updated weights for policy 0, policy_version 342780 (0.0029) [2024-06-23 02:41:03,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 5616107520. Throughput: 0: 42530.6. Samples: 5616292340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:03,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 02:41:06,573][15401] Updated weights for policy 0, policy_version 342790 (0.0028) [2024-06-23 02:41:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5616353280. Throughput: 0: 42851.1. Samples: 5616423440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:08,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 02:41:09,854][15401] Updated weights for policy 0, policy_version 342800 (0.0024) [2024-06-23 02:41:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 5616533504. Throughput: 0: 42596.8. Samples: 5616673760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 02:41:14,197][15401] Updated weights for policy 0, policy_version 342810 (0.0039) [2024-06-23 02:41:17,318][15401] Updated weights for policy 0, policy_version 342820 (0.0025) [2024-06-23 02:41:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5616762880. Throughput: 0: 42516.4. Samples: 5616930420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 02:41:22,155][15401] Updated weights for policy 0, policy_version 342830 (0.0046) [2024-06-23 02:41:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5616975872. Throughput: 0: 42845.3. Samples: 5617065960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 02:41:24,832][15401] Updated weights for policy 0, policy_version 342840 (0.0038) [2024-06-23 02:41:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 5617172480. Throughput: 0: 42656.1. Samples: 5617315600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 02:41:29,778][15401] Updated weights for policy 0, policy_version 342850 (0.0027) [2024-06-23 02:41:32,498][15401] Updated weights for policy 0, policy_version 342860 (0.0041) [2024-06-23 02:41:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5617418240. Throughput: 0: 42637.7. Samples: 5617570220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 02:41:37,259][15401] Updated weights for policy 0, policy_version 342870 (0.0030) [2024-06-23 02:41:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 5617598464. Throughput: 0: 42833.1. Samples: 5617706500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 02:41:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 02:41:40,258][15401] Updated weights for policy 0, policy_version 342880 (0.0034) [2024-06-23 02:41:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 5617827840. Throughput: 0: 42803.5. Samples: 5617962260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:41:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 02:41:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342885_5617827840.pth... [2024-06-23 02:41:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342263_5607636992.pth [2024-06-23 02:41:45,260][15401] Updated weights for policy 0, policy_version 342890 (0.0042) [2024-06-23 02:41:46,582][15349] Signal inference workers to stop experience collection... (83200 times) [2024-06-23 02:41:46,582][15349] Signal inference workers to resume experience collection... (83200 times) [2024-06-23 02:41:46,604][15401] InferenceWorker_p0-w0: stopping experience collection (83200 times) [2024-06-23 02:41:46,604][15401] InferenceWorker_p0-w0: resuming experience collection (83200 times) [2024-06-23 02:41:47,943][15401] Updated weights for policy 0, policy_version 342900 (0.0041) [2024-06-23 02:41:48,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5618073600. Throughput: 0: 42658.8. Samples: 5618211880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:41:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 02:41:52,745][15401] Updated weights for policy 0, policy_version 342910 (0.0036) [2024-06-23 02:41:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5618253824. Throughput: 0: 42795.0. Samples: 5618349220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:41:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 02:41:55,639][15401] Updated weights for policy 0, policy_version 342920 (0.0034) [2024-06-23 02:41:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5618483200. Throughput: 0: 42861.7. Samples: 5618602540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:41:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 02:42:00,577][15401] Updated weights for policy 0, policy_version 342930 (0.0043) [2024-06-23 02:42:03,389][15132] Fps is (10 sec: 45876.4, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 5618712576. Throughput: 0: 42797.1. Samples: 5618856280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 02:42:03,458][15401] Updated weights for policy 0, policy_version 342940 (0.0030) [2024-06-23 02:42:08,096][15401] Updated weights for policy 0, policy_version 342950 (0.0037) [2024-06-23 02:42:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5618892800. Throughput: 0: 42685.8. Samples: 5618986820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 02:42:11,155][15401] Updated weights for policy 0, policy_version 342960 (0.0030) [2024-06-23 02:42:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42654.3). Total num frames: 5619138560. Throughput: 0: 42805.4. Samples: 5619241840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 02:42:15,698][15401] Updated weights for policy 0, policy_version 342970 (0.0027) [2024-06-23 02:42:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5619335168. Throughput: 0: 42818.3. Samples: 5619497040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 02:42:18,877][15401] Updated weights for policy 0, policy_version 342980 (0.0033) [2024-06-23 02:42:23,245][15401] Updated weights for policy 0, policy_version 342990 (0.0038) [2024-06-23 02:42:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5619548160. Throughput: 0: 42658.8. Samples: 5619626140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 02:42:26,701][15401] Updated weights for policy 0, policy_version 343000 (0.0031) [2024-06-23 02:42:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5619761152. Throughput: 0: 42547.1. Samples: 5619876880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:28,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 02:42:31,316][15401] Updated weights for policy 0, policy_version 343010 (0.0029) [2024-06-23 02:42:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5619957760. Throughput: 0: 42660.0. Samples: 5620131580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 02:42:34,637][15401] Updated weights for policy 0, policy_version 343020 (0.0045) [2024-06-23 02:42:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 5620170752. Throughput: 0: 42450.3. Samples: 5620259480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:38,396][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 02:42:38,807][15401] Updated weights for policy 0, policy_version 343030 (0.0030) [2024-06-23 02:42:42,334][15401] Updated weights for policy 0, policy_version 343040 (0.0040) [2024-06-23 02:42:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 5620416512. Throughput: 0: 42647.6. Samples: 5620521780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:43,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 02:42:46,143][15401] Updated weights for policy 0, policy_version 343050 (0.0029) [2024-06-23 02:42:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5620596736. Throughput: 0: 42716.9. Samples: 5620778540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 02:42:50,087][15401] Updated weights for policy 0, policy_version 343060 (0.0047) [2024-06-23 02:42:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5620826112. Throughput: 0: 42523.1. Samples: 5620900360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:53,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 02:42:53,902][15401] Updated weights for policy 0, policy_version 343070 (0.0054) [2024-06-23 02:42:57,910][15401] Updated weights for policy 0, policy_version 343080 (0.0026) [2024-06-23 02:42:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5621039104. Throughput: 0: 42770.6. Samples: 5621166520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 02:42:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 02:43:01,315][15401] Updated weights for policy 0, policy_version 343090 (0.0025) [2024-06-23 02:43:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 5621252096. Throughput: 0: 42727.9. Samples: 5621419900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:03,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 02:43:05,564][15401] Updated weights for policy 0, policy_version 343100 (0.0038) [2024-06-23 02:43:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 5621481472. Throughput: 0: 42726.3. Samples: 5621548820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:08,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 02:43:09,255][15401] Updated weights for policy 0, policy_version 343110 (0.0031) [2024-06-23 02:43:13,038][15401] Updated weights for policy 0, policy_version 343120 (0.0040) [2024-06-23 02:43:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5621678080. Throughput: 0: 42928.9. Samples: 5621808680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 02:43:14,747][15349] Signal inference workers to stop experience collection... (83250 times) [2024-06-23 02:43:14,771][15401] InferenceWorker_p0-w0: stopping experience collection (83250 times) [2024-06-23 02:43:14,809][15349] Signal inference workers to resume experience collection... (83250 times) [2024-06-23 02:43:14,809][15401] InferenceWorker_p0-w0: resuming experience collection (83250 times) [2024-06-23 02:43:16,763][15401] Updated weights for policy 0, policy_version 343130 (0.0023) [2024-06-23 02:43:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5621907456. Throughput: 0: 42967.6. Samples: 5622065120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 02:43:20,726][15401] Updated weights for policy 0, policy_version 343140 (0.0029) [2024-06-23 02:43:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5622136832. Throughput: 0: 42929.0. Samples: 5622191280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 02:43:24,247][15401] Updated weights for policy 0, policy_version 343150 (0.0041) [2024-06-23 02:43:28,300][15401] Updated weights for policy 0, policy_version 343160 (0.0026) [2024-06-23 02:43:28,395][15132] Fps is (10 sec: 42576.5, 60 sec: 42867.9, 300 sec: 42708.8). Total num frames: 5622333440. Throughput: 0: 42963.7. Samples: 5622455260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:28,395][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 02:43:31,645][15401] Updated weights for policy 0, policy_version 343170 (0.0032) [2024-06-23 02:43:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5622546432. Throughput: 0: 42968.4. Samples: 5622712120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 02:43:35,951][15401] Updated weights for policy 0, policy_version 343180 (0.0033) [2024-06-23 02:43:38,389][15132] Fps is (10 sec: 44259.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5622775808. Throughput: 0: 43168.9. Samples: 5622842960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 02:43:39,173][15401] Updated weights for policy 0, policy_version 343190 (0.0040) [2024-06-23 02:43:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5622972416. Throughput: 0: 43087.2. Samples: 5623105440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:43,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 02:43:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343200_5622988800.pth... [2024-06-23 02:43:43,530][15401] Updated weights for policy 0, policy_version 343200 (0.0026) [2024-06-23 02:43:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342573_5612716032.pth [2024-06-23 02:43:46,825][15401] Updated weights for policy 0, policy_version 343210 (0.0041) [2024-06-23 02:43:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5623201792. Throughput: 0: 43012.5. Samples: 5623355360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 02:43:51,186][15401] Updated weights for policy 0, policy_version 343220 (0.0033) [2024-06-23 02:43:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5623398400. Throughput: 0: 43055.5. Samples: 5623486320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 02:43:54,636][15401] Updated weights for policy 0, policy_version 343230 (0.0035) [2024-06-23 02:43:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5623611392. Throughput: 0: 42963.0. Samples: 5623742020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:43:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 02:43:59,020][15401] Updated weights for policy 0, policy_version 343240 (0.0041) [2024-06-23 02:44:02,244][15401] Updated weights for policy 0, policy_version 343250 (0.0036) [2024-06-23 02:44:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42876.3). Total num frames: 5623840768. Throughput: 0: 42860.4. Samples: 5623993840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:44:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 02:44:06,659][15401] Updated weights for policy 0, policy_version 343260 (0.0037) [2024-06-23 02:44:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5624053760. Throughput: 0: 42971.9. Samples: 5624125020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:44:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:44:10,306][15401] Updated weights for policy 0, policy_version 343270 (0.0037) [2024-06-23 02:44:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42764.6). Total num frames: 5624250368. Throughput: 0: 42770.1. Samples: 5624379800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-23 02:44:13,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 02:44:14,052][15401] Updated weights for policy 0, policy_version 343280 (0.0048) [2024-06-23 02:44:17,820][15401] Updated weights for policy 0, policy_version 343290 (0.0031) [2024-06-23 02:44:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 5624479744. Throughput: 0: 42695.1. Samples: 5624633400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 02:44:21,847][15401] Updated weights for policy 0, policy_version 343300 (0.0040) [2024-06-23 02:44:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5624676352. Throughput: 0: 42627.9. Samples: 5624761220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 02:44:25,215][15401] Updated weights for policy 0, policy_version 343310 (0.0028) [2024-06-23 02:44:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42601.9, 300 sec: 42765.0). Total num frames: 5624889344. Throughput: 0: 42622.1. Samples: 5625023440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:28,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 02:44:29,162][15349] Signal inference workers to stop experience collection... (83300 times) [2024-06-23 02:44:29,164][15349] Signal inference workers to resume experience collection... (83300 times) [2024-06-23 02:44:29,199][15401] InferenceWorker_p0-w0: stopping experience collection (83300 times) [2024-06-23 02:44:29,200][15401] InferenceWorker_p0-w0: resuming experience collection (83300 times) [2024-06-23 02:44:29,314][15401] Updated weights for policy 0, policy_version 343320 (0.0041) [2024-06-23 02:44:32,662][15401] Updated weights for policy 0, policy_version 343330 (0.0046) [2024-06-23 02:44:33,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 5625118720. Throughput: 0: 42720.0. Samples: 5625277860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:33,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 02:44:37,052][15401] Updated weights for policy 0, policy_version 343340 (0.0030) [2024-06-23 02:44:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5625331712. Throughput: 0: 42716.5. Samples: 5625408560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 02:44:40,586][15401] Updated weights for policy 0, policy_version 343350 (0.0039) [2024-06-23 02:44:43,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5625511936. Throughput: 0: 42667.2. Samples: 5625662040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 02:44:45,075][15401] Updated weights for policy 0, policy_version 343360 (0.0039) [2024-06-23 02:44:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5625757696. Throughput: 0: 42681.4. Samples: 5625914500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 02:44:48,543][15401] Updated weights for policy 0, policy_version 343370 (0.0026) [2024-06-23 02:44:52,719][15401] Updated weights for policy 0, policy_version 343380 (0.0038) [2024-06-23 02:44:53,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 5625987072. Throughput: 0: 42778.7. Samples: 5626050160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:53,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 02:44:56,215][15401] Updated weights for policy 0, policy_version 343390 (0.0030) [2024-06-23 02:44:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5626150912. Throughput: 0: 42639.2. Samples: 5626298460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:44:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 02:45:00,256][15401] Updated weights for policy 0, policy_version 343400 (0.0023) [2024-06-23 02:45:03,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5626396672. Throughput: 0: 42761.3. Samples: 5626557660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 02:45:03,654][15401] Updated weights for policy 0, policy_version 343410 (0.0038) [2024-06-23 02:45:07,946][15401] Updated weights for policy 0, policy_version 343420 (0.0040) [2024-06-23 02:45:08,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 5626609664. Throughput: 0: 43007.6. Samples: 5626696660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:08,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:45:11,302][15401] Updated weights for policy 0, policy_version 343430 (0.0028) [2024-06-23 02:45:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5626806272. Throughput: 0: 42629.4. Samples: 5626941760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:45:15,627][15401] Updated weights for policy 0, policy_version 343440 (0.0046) [2024-06-23 02:45:18,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5627035648. Throughput: 0: 42828.7. Samples: 5627205040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 02:45:19,273][15401] Updated weights for policy 0, policy_version 343450 (0.0029) [2024-06-23 02:45:23,306][15401] Updated weights for policy 0, policy_version 343460 (0.0027) [2024-06-23 02:45:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5627248640. Throughput: 0: 42824.5. Samples: 5627335660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 02:45:26,812][15401] Updated weights for policy 0, policy_version 343470 (0.0040) [2024-06-23 02:45:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5627461632. Throughput: 0: 42761.8. Samples: 5627586320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 02:45:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 02:45:30,903][15401] Updated weights for policy 0, policy_version 343480 (0.0028) [2024-06-23 02:45:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5627674624. Throughput: 0: 42834.2. Samples: 5627842040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 02:45:34,359][15401] Updated weights for policy 0, policy_version 343490 (0.0042) [2024-06-23 02:45:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5627887616. Throughput: 0: 42668.1. Samples: 5627970120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 02:45:38,495][15401] Updated weights for policy 0, policy_version 343500 (0.0033) [2024-06-23 02:45:42,511][15401] Updated weights for policy 0, policy_version 343510 (0.0023) [2024-06-23 02:45:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5628116992. Throughput: 0: 42878.5. Samples: 5628228000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 02:45:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343513_5628116992.pth... [2024-06-23 02:45:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000342885_5617827840.pth [2024-06-23 02:45:44,324][15349] Signal inference workers to stop experience collection... (83350 times) [2024-06-23 02:45:44,326][15349] Signal inference workers to resume experience collection... (83350 times) [2024-06-23 02:45:44,352][15401] InferenceWorker_p0-w0: stopping experience collection (83350 times) [2024-06-23 02:45:44,352][15401] InferenceWorker_p0-w0: resuming experience collection (83350 times) [2024-06-23 02:45:46,173][15401] Updated weights for policy 0, policy_version 343520 (0.0039) [2024-06-23 02:45:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5628313600. Throughput: 0: 42740.1. Samples: 5628480960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 02:45:49,993][15401] Updated weights for policy 0, policy_version 343530 (0.0032) [2024-06-23 02:45:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 5628542976. Throughput: 0: 42437.4. Samples: 5628606240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 02:45:53,709][15401] Updated weights for policy 0, policy_version 343540 (0.0043) [2024-06-23 02:45:57,411][15401] Updated weights for policy 0, policy_version 343550 (0.0052) [2024-06-23 02:45:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 5628739584. Throughput: 0: 42741.4. Samples: 5628865120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:45:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 02:46:01,217][15401] Updated weights for policy 0, policy_version 343560 (0.0032) [2024-06-23 02:46:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5628952576. Throughput: 0: 42661.7. Samples: 5629124820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 02:46:04,872][15401] Updated weights for policy 0, policy_version 343570 (0.0040) [2024-06-23 02:46:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 5629165568. Throughput: 0: 42570.7. Samples: 5629251340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:08,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 02:46:08,826][15401] Updated weights for policy 0, policy_version 343580 (0.0029) [2024-06-23 02:46:12,577][15401] Updated weights for policy 0, policy_version 343590 (0.0032) [2024-06-23 02:46:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5629394944. Throughput: 0: 42768.4. Samples: 5629510900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:13,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 02:46:16,450][15401] Updated weights for policy 0, policy_version 343600 (0.0030) [2024-06-23 02:46:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 5629591552. Throughput: 0: 42799.5. Samples: 5629768020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:18,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-23 02:46:20,332][15401] Updated weights for policy 0, policy_version 343610 (0.0040) [2024-06-23 02:46:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5629804544. Throughput: 0: 42838.2. Samples: 5629897840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 02:46:24,284][15401] Updated weights for policy 0, policy_version 343620 (0.0034) [2024-06-23 02:46:28,020][15401] Updated weights for policy 0, policy_version 343630 (0.0043) [2024-06-23 02:46:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5630033920. Throughput: 0: 42799.3. Samples: 5630153960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 02:46:31,754][15401] Updated weights for policy 0, policy_version 343640 (0.0026) [2024-06-23 02:46:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5630246912. Throughput: 0: 42930.6. Samples: 5630412840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:33,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:46:35,813][15401] Updated weights for policy 0, policy_version 343650 (0.0033) [2024-06-23 02:46:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5630459904. Throughput: 0: 42947.1. Samples: 5630538860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 02:46:39,216][15401] Updated weights for policy 0, policy_version 343660 (0.0038) [2024-06-23 02:46:43,351][15401] Updated weights for policy 0, policy_version 343670 (0.0045) [2024-06-23 02:46:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5630689280. Throughput: 0: 42986.2. Samples: 5630799500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 02:46:47,123][15401] Updated weights for policy 0, policy_version 343680 (0.0037) [2024-06-23 02:46:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5630869504. Throughput: 0: 43031.4. Samples: 5631061240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 02:46:48,391][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 02:46:51,162][15401] Updated weights for policy 0, policy_version 343690 (0.0033) [2024-06-23 02:46:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5631115264. Throughput: 0: 42969.6. Samples: 5631184980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:46:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 02:46:55,029][15401] Updated weights for policy 0, policy_version 343700 (0.0040) [2024-06-23 02:46:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5631311872. Throughput: 0: 42933.5. Samples: 5631442900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:46:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 02:46:59,152][15401] Updated weights for policy 0, policy_version 343710 (0.0037) [2024-06-23 02:47:02,599][15401] Updated weights for policy 0, policy_version 343720 (0.0033) [2024-06-23 02:47:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5631524864. Throughput: 0: 42904.2. Samples: 5631698700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 02:47:06,720][15401] Updated weights for policy 0, policy_version 343730 (0.0048) [2024-06-23 02:47:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5631770624. Throughput: 0: 42999.5. Samples: 5631832820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 02:47:10,045][15401] Updated weights for policy 0, policy_version 343740 (0.0030) [2024-06-23 02:47:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 5631967232. Throughput: 0: 42953.6. Samples: 5632086980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:13,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 02:47:14,617][15401] Updated weights for policy 0, policy_version 343750 (0.0038) [2024-06-23 02:47:15,439][15349] Signal inference workers to stop experience collection... (83400 times) [2024-06-23 02:47:15,448][15349] Signal inference workers to resume experience collection... (83400 times) [2024-06-23 02:47:15,454][15401] InferenceWorker_p0-w0: stopping experience collection (83400 times) [2024-06-23 02:47:15,475][15401] InferenceWorker_p0-w0: resuming experience collection (83400 times) [2024-06-23 02:47:17,629][15401] Updated weights for policy 0, policy_version 343760 (0.0034) [2024-06-23 02:47:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5632180224. Throughput: 0: 42902.6. Samples: 5632343460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 02:47:22,193][15401] Updated weights for policy 0, policy_version 343770 (0.0034) [2024-06-23 02:47:23,390][15132] Fps is (10 sec: 44246.6, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 5632409600. Throughput: 0: 43053.2. Samples: 5632476260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 02:47:25,191][15401] Updated weights for policy 0, policy_version 343780 (0.0023) [2024-06-23 02:47:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5632606208. Throughput: 0: 42960.6. Samples: 5632732720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 02:47:29,735][15401] Updated weights for policy 0, policy_version 343790 (0.0026) [2024-06-23 02:47:33,136][15401] Updated weights for policy 0, policy_version 343800 (0.0049) [2024-06-23 02:47:33,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5632819200. Throughput: 0: 42769.8. Samples: 5632985880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:47:37,322][15401] Updated weights for policy 0, policy_version 343810 (0.0042) [2024-06-23 02:47:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5633048576. Throughput: 0: 42979.7. Samples: 5633119060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 02:47:40,742][15401] Updated weights for policy 0, policy_version 343820 (0.0043) [2024-06-23 02:47:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 5633228800. Throughput: 0: 42774.9. Samples: 5633367780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 02:47:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343826_5633245184.pth... [2024-06-23 02:47:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343200_5622988800.pth [2024-06-23 02:47:44,787][15401] Updated weights for policy 0, policy_version 343830 (0.0037) [2024-06-23 02:47:48,314][15401] Updated weights for policy 0, policy_version 343840 (0.0040) [2024-06-23 02:47:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5633474560. Throughput: 0: 42833.2. Samples: 5633626200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 02:47:52,423][15401] Updated weights for policy 0, policy_version 343850 (0.0031) [2024-06-23 02:47:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5633671168. Throughput: 0: 42741.8. Samples: 5633756200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 02:47:56,151][15401] Updated weights for policy 0, policy_version 343860 (0.0028) [2024-06-23 02:47:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5633867776. Throughput: 0: 42608.1. Samples: 5634004240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:47:58,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 02:48:00,099][15401] Updated weights for policy 0, policy_version 343870 (0.0033) [2024-06-23 02:48:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5634113536. Throughput: 0: 42686.7. Samples: 5634264360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 02:48:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 02:48:03,651][15401] Updated weights for policy 0, policy_version 343880 (0.0040) [2024-06-23 02:48:07,723][15401] Updated weights for policy 0, policy_version 343890 (0.0038) [2024-06-23 02:48:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 5634326528. Throughput: 0: 42725.9. Samples: 5634399020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:08,393][15132] Avg episode reward: [(0, '0.836')] [2024-06-23 02:48:11,400][15401] Updated weights for policy 0, policy_version 343900 (0.0041) [2024-06-23 02:48:13,394][15132] Fps is (10 sec: 39303.2, 60 sec: 42323.7, 300 sec: 42708.8). Total num frames: 5634506752. Throughput: 0: 42535.4. Samples: 5634647020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:13,395][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 02:48:15,452][15401] Updated weights for policy 0, policy_version 343910 (0.0039) [2024-06-23 02:48:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5634752512. Throughput: 0: 42582.2. Samples: 5634902080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:48:19,089][15401] Updated weights for policy 0, policy_version 343920 (0.0040) [2024-06-23 02:48:19,830][15349] Signal inference workers to stop experience collection... (83450 times) [2024-06-23 02:48:19,831][15349] Signal inference workers to resume experience collection... (83450 times) [2024-06-23 02:48:19,849][15401] InferenceWorker_p0-w0: stopping experience collection (83450 times) [2024-06-23 02:48:19,849][15401] InferenceWorker_p0-w0: resuming experience collection (83450 times) [2024-06-23 02:48:23,035][15401] Updated weights for policy 0, policy_version 343930 (0.0060) [2024-06-23 02:48:23,390][15132] Fps is (10 sec: 45896.6, 60 sec: 42598.5, 300 sec: 42821.3). Total num frames: 5634965504. Throughput: 0: 42616.7. Samples: 5635036820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 02:48:26,581][15401] Updated weights for policy 0, policy_version 343940 (0.0031) [2024-06-23 02:48:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 5635162112. Throughput: 0: 42685.8. Samples: 5635288640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 02:48:30,768][15401] Updated weights for policy 0, policy_version 343950 (0.0032) [2024-06-23 02:48:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5635407872. Throughput: 0: 42667.4. Samples: 5635546240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 02:48:34,054][15401] Updated weights for policy 0, policy_version 343960 (0.0032) [2024-06-23 02:48:38,327][15401] Updated weights for policy 0, policy_version 343970 (0.0036) [2024-06-23 02:48:38,393][15132] Fps is (10 sec: 44222.0, 60 sec: 42595.8, 300 sec: 42820.0). Total num frames: 5635604480. Throughput: 0: 42747.8. Samples: 5635680000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:38,393][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 02:48:41,603][15401] Updated weights for policy 0, policy_version 343980 (0.0033) [2024-06-23 02:48:43,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5635801088. Throughput: 0: 42772.9. Samples: 5635929020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 02:48:46,292][15401] Updated weights for policy 0, policy_version 343990 (0.0043) [2024-06-23 02:48:48,390][15132] Fps is (10 sec: 42613.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5636030464. Throughput: 0: 42783.6. Samples: 5636189620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 02:48:49,251][15401] Updated weights for policy 0, policy_version 344000 (0.0039) [2024-06-23 02:48:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5636210688. Throughput: 0: 42753.0. Samples: 5636322800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:53,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 02:48:53,835][15401] Updated weights for policy 0, policy_version 344010 (0.0032) [2024-06-23 02:48:57,153][15401] Updated weights for policy 0, policy_version 344020 (0.0037) [2024-06-23 02:48:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5636456448. Throughput: 0: 42814.3. Samples: 5636573460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:48:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 02:49:01,381][15401] Updated weights for policy 0, policy_version 344030 (0.0043) [2024-06-23 02:49:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5636669440. Throughput: 0: 42908.0. Samples: 5636832940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:49:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 02:49:04,651][15401] Updated weights for policy 0, policy_version 344040 (0.0032) [2024-06-23 02:49:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42765.4). Total num frames: 5636866048. Throughput: 0: 42792.2. Samples: 5636962460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:49:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 02:49:08,910][15401] Updated weights for policy 0, policy_version 344050 (0.0036) [2024-06-23 02:49:12,221][15401] Updated weights for policy 0, policy_version 344060 (0.0030) [2024-06-23 02:49:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43421.0, 300 sec: 42820.5). Total num frames: 5637111808. Throughput: 0: 42871.3. Samples: 5637217840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:49:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 02:49:16,738][15401] Updated weights for policy 0, policy_version 344070 (0.0043) [2024-06-23 02:49:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5637324800. Throughput: 0: 42891.2. Samples: 5637476340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 02:49:18,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-23 02:49:19,963][15401] Updated weights for policy 0, policy_version 344080 (0.0041) [2024-06-23 02:49:23,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5637488640. Throughput: 0: 42794.0. Samples: 5637605580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 02:49:24,395][15401] Updated weights for policy 0, policy_version 344090 (0.0030) [2024-06-23 02:49:26,656][15349] Signal inference workers to stop experience collection... (83500 times) [2024-06-23 02:49:26,660][15349] Signal inference workers to resume experience collection... (83500 times) [2024-06-23 02:49:26,687][15401] InferenceWorker_p0-w0: stopping experience collection (83500 times) [2024-06-23 02:49:26,687][15401] InferenceWorker_p0-w0: resuming experience collection (83500 times) [2024-06-23 02:49:27,906][15401] Updated weights for policy 0, policy_version 344100 (0.0030) [2024-06-23 02:49:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5637750784. Throughput: 0: 42877.8. Samples: 5637858520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 02:49:31,970][15401] Updated weights for policy 0, policy_version 344110 (0.0035) [2024-06-23 02:49:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5637963776. Throughput: 0: 42703.2. Samples: 5638111260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 02:49:35,895][15401] Updated weights for policy 0, policy_version 344120 (0.0039) [2024-06-23 02:49:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42054.7, 300 sec: 42765.0). Total num frames: 5638127616. Throughput: 0: 42490.6. Samples: 5638234880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:38,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 02:49:39,815][15401] Updated weights for policy 0, policy_version 344130 (0.0037) [2024-06-23 02:49:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5638373376. Throughput: 0: 42530.7. Samples: 5638487340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 02:49:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344139_5638373376.pth... [2024-06-23 02:49:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343513_5628116992.pth [2024-06-23 02:49:43,674][15401] Updated weights for policy 0, policy_version 344140 (0.0033) [2024-06-23 02:49:47,723][15401] Updated weights for policy 0, policy_version 344150 (0.0032) [2024-06-23 02:49:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5638586368. Throughput: 0: 42518.7. Samples: 5638746280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 02:49:51,244][15401] Updated weights for policy 0, policy_version 344160 (0.0026) [2024-06-23 02:49:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5638766592. Throughput: 0: 42496.3. Samples: 5638874800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 02:49:55,233][15401] Updated weights for policy 0, policy_version 344170 (0.0038) [2024-06-23 02:49:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5639012352. Throughput: 0: 42389.8. Samples: 5639125380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:49:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 02:49:59,007][15401] Updated weights for policy 0, policy_version 344180 (0.0033) [2024-06-23 02:50:02,795][15401] Updated weights for policy 0, policy_version 344190 (0.0030) [2024-06-23 02:50:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 5639225344. Throughput: 0: 42475.6. Samples: 5639387840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:03,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 02:50:06,736][15401] Updated weights for policy 0, policy_version 344200 (0.0030) [2024-06-23 02:50:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5639421952. Throughput: 0: 42429.3. Samples: 5639514900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 02:50:10,548][15401] Updated weights for policy 0, policy_version 344210 (0.0032) [2024-06-23 02:50:13,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5639667712. Throughput: 0: 42370.7. Samples: 5639765300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:13,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 02:50:14,643][15401] Updated weights for policy 0, policy_version 344220 (0.0032) [2024-06-23 02:50:18,091][15401] Updated weights for policy 0, policy_version 344230 (0.0035) [2024-06-23 02:50:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5639880704. Throughput: 0: 42622.1. Samples: 5640029260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 02:50:22,119][15401] Updated weights for policy 0, policy_version 344240 (0.0030) [2024-06-23 02:50:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5640060928. Throughput: 0: 42673.9. Samples: 5640155200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 02:50:25,822][15401] Updated weights for policy 0, policy_version 344250 (0.0031) [2024-06-23 02:50:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5640290304. Throughput: 0: 42676.9. Samples: 5640407800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 02:50:30,063][15401] Updated weights for policy 0, policy_version 344260 (0.0023) [2024-06-23 02:50:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5640486912. Throughput: 0: 42828.4. Samples: 5640673560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 02:50:33,615][15401] Updated weights for policy 0, policy_version 344270 (0.0037) [2024-06-23 02:50:37,735][15401] Updated weights for policy 0, policy_version 344280 (0.0042) [2024-06-23 02:50:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5640699904. Throughput: 0: 42773.4. Samples: 5640799600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 02:50:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 02:50:41,200][15401] Updated weights for policy 0, policy_version 344290 (0.0036) [2024-06-23 02:50:41,706][15349] Signal inference workers to stop experience collection... (83550 times) [2024-06-23 02:50:41,738][15401] InferenceWorker_p0-w0: stopping experience collection (83550 times) [2024-06-23 02:50:41,765][15349] Signal inference workers to resume experience collection... (83550 times) [2024-06-23 02:50:41,772][15401] InferenceWorker_p0-w0: resuming experience collection (83550 times) [2024-06-23 02:50:43,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5640945664. Throughput: 0: 42858.1. Samples: 5641054100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:50:43,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 02:50:45,175][15401] Updated weights for policy 0, policy_version 344300 (0.0031) [2024-06-23 02:50:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5641125888. Throughput: 0: 42921.5. Samples: 5641319200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:50:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 02:50:48,827][15401] Updated weights for policy 0, policy_version 344310 (0.0036) [2024-06-23 02:50:52,741][15401] Updated weights for policy 0, policy_version 344320 (0.0037) [2024-06-23 02:50:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5641355264. Throughput: 0: 42787.5. Samples: 5641440340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:50:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 02:50:56,538][15401] Updated weights for policy 0, policy_version 344330 (0.0033) [2024-06-23 02:50:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5641584640. Throughput: 0: 43024.0. Samples: 5641701280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:50:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 02:51:00,173][15401] Updated weights for policy 0, policy_version 344340 (0.0034) [2024-06-23 02:51:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5641781248. Throughput: 0: 42997.0. Samples: 5641964120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 02:51:04,125][15401] Updated weights for policy 0, policy_version 344350 (0.0033) [2024-06-23 02:51:07,602][15401] Updated weights for policy 0, policy_version 344360 (0.0033) [2024-06-23 02:51:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 5641994240. Throughput: 0: 42873.6. Samples: 5642084620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:08,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 02:51:11,665][15401] Updated weights for policy 0, policy_version 344370 (0.0034) [2024-06-23 02:51:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42871.4, 300 sec: 42875.8). Total num frames: 5642240000. Throughput: 0: 43050.1. Samples: 5642345160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:13,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 02:51:15,464][15401] Updated weights for policy 0, policy_version 344380 (0.0025) [2024-06-23 02:51:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5642420224. Throughput: 0: 42797.4. Samples: 5642599440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:18,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 02:51:19,656][15401] Updated weights for policy 0, policy_version 344390 (0.0040) [2024-06-23 02:51:23,226][15401] Updated weights for policy 0, policy_version 344400 (0.0036) [2024-06-23 02:51:23,392][15132] Fps is (10 sec: 40960.2, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 5642649600. Throughput: 0: 42774.2. Samples: 5642724540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:23,392][15132] Avg episode reward: [(0, '0.271')] [2024-06-23 02:51:27,096][15401] Updated weights for policy 0, policy_version 344410 (0.0025) [2024-06-23 02:51:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5642862592. Throughput: 0: 42788.5. Samples: 5642979480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:28,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 02:51:30,839][15401] Updated weights for policy 0, policy_version 344420 (0.0027) [2024-06-23 02:51:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5643042816. Throughput: 0: 42747.1. Samples: 5643242820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:33,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 02:51:34,671][15401] Updated weights for policy 0, policy_version 344430 (0.0038) [2024-06-23 02:51:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5643272192. Throughput: 0: 42787.6. Samples: 5643365780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 02:51:38,661][15401] Updated weights for policy 0, policy_version 344440 (0.0033) [2024-06-23 02:51:42,333][15401] Updated weights for policy 0, policy_version 344450 (0.0038) [2024-06-23 02:51:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 5643485184. Throughput: 0: 42723.2. Samples: 5643623820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 02:51:43,476][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344452_5643501568.pth... [2024-06-23 02:51:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000343826_5633245184.pth [2024-06-23 02:51:46,389][15401] Updated weights for policy 0, policy_version 344460 (0.0048) [2024-06-23 02:51:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5643698176. Throughput: 0: 42619.6. Samples: 5643882000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 02:51:49,986][15401] Updated weights for policy 0, policy_version 344470 (0.0024) [2024-06-23 02:51:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5643927552. Throughput: 0: 42657.3. Samples: 5644004100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 02:51:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 02:51:53,910][15401] Updated weights for policy 0, policy_version 344480 (0.0042) [2024-06-23 02:51:58,000][15401] Updated weights for policy 0, policy_version 344490 (0.0052) [2024-06-23 02:51:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5644124160. Throughput: 0: 42622.7. Samples: 5644263080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:51:58,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 02:52:01,641][15401] Updated weights for policy 0, policy_version 344500 (0.0036) [2024-06-23 02:52:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5644353536. Throughput: 0: 42639.9. Samples: 5644518240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 02:52:05,624][15401] Updated weights for policy 0, policy_version 344510 (0.0029) [2024-06-23 02:52:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 5644582912. Throughput: 0: 42789.4. Samples: 5644649960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 02:52:09,343][15401] Updated weights for policy 0, policy_version 344520 (0.0038) [2024-06-23 02:52:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 5644779520. Throughput: 0: 42772.0. Samples: 5644904220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 02:52:13,395][15401] Updated weights for policy 0, policy_version 344530 (0.0033) [2024-06-23 02:52:16,912][15401] Updated weights for policy 0, policy_version 344540 (0.0036) [2024-06-23 02:52:17,832][15349] Signal inference workers to stop experience collection... (83600 times) [2024-06-23 02:52:17,832][15349] Signal inference workers to resume experience collection... (83600 times) [2024-06-23 02:52:17,879][15401] InferenceWorker_p0-w0: stopping experience collection (83600 times) [2024-06-23 02:52:17,879][15401] InferenceWorker_p0-w0: resuming experience collection (83600 times) [2024-06-23 02:52:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5645008896. Throughput: 0: 42432.4. Samples: 5645152280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 02:52:21,096][15401] Updated weights for policy 0, policy_version 344550 (0.0050) [2024-06-23 02:52:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42598.3, 300 sec: 42709.1). Total num frames: 5645205504. Throughput: 0: 42646.1. Samples: 5645284960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:23,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 02:52:24,574][15401] Updated weights for policy 0, policy_version 344560 (0.0032) [2024-06-23 02:52:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5645418496. Throughput: 0: 42652.9. Samples: 5645543200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 02:52:28,579][15401] Updated weights for policy 0, policy_version 344570 (0.0034) [2024-06-23 02:52:32,177][15401] Updated weights for policy 0, policy_version 344580 (0.0031) [2024-06-23 02:52:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 5645647872. Throughput: 0: 42524.3. Samples: 5645795600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 02:52:36,286][15401] Updated weights for policy 0, policy_version 344590 (0.0035) [2024-06-23 02:52:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 5645828096. Throughput: 0: 42673.4. Samples: 5645924500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:38,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 02:52:40,061][15401] Updated weights for policy 0, policy_version 344600 (0.0034) [2024-06-23 02:52:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5646057472. Throughput: 0: 42628.8. Samples: 5646181380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 02:52:43,941][15401] Updated weights for policy 0, policy_version 344610 (0.0030) [2024-06-23 02:52:47,659][15401] Updated weights for policy 0, policy_version 344620 (0.0035) [2024-06-23 02:52:48,392][15132] Fps is (10 sec: 45876.2, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 5646286848. Throughput: 0: 42598.5. Samples: 5646435260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:48,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 02:52:51,637][15401] Updated weights for policy 0, policy_version 344630 (0.0032) [2024-06-23 02:52:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5646467072. Throughput: 0: 42527.5. Samples: 5646563700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 02:52:55,506][15401] Updated weights for policy 0, policy_version 344640 (0.0040) [2024-06-23 02:52:58,389][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5646696448. Throughput: 0: 42488.1. Samples: 5646816180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:52:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 02:52:59,517][15401] Updated weights for policy 0, policy_version 344650 (0.0025) [2024-06-23 02:53:03,103][15401] Updated weights for policy 0, policy_version 344660 (0.0033) [2024-06-23 02:53:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 5646925824. Throughput: 0: 42621.7. Samples: 5647070360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:53:03,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 02:53:07,089][15401] Updated weights for policy 0, policy_version 344670 (0.0029) [2024-06-23 02:53:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42710.2). Total num frames: 5647106048. Throughput: 0: 42548.2. Samples: 5647199520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 02:53:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 02:53:10,717][15401] Updated weights for policy 0, policy_version 344680 (0.0045) [2024-06-23 02:53:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5647335424. Throughput: 0: 42423.5. Samples: 5647452260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 02:53:14,600][15401] Updated weights for policy 0, policy_version 344690 (0.0045) [2024-06-23 02:53:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5647532032. Throughput: 0: 42565.0. Samples: 5647711020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 02:53:18,550][15401] Updated weights for policy 0, policy_version 344700 (0.0028) [2024-06-23 02:53:22,146][15401] Updated weights for policy 0, policy_version 344710 (0.0034) [2024-06-23 02:53:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5647761408. Throughput: 0: 42559.1. Samples: 5647839560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:23,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 02:53:26,166][15401] Updated weights for policy 0, policy_version 344720 (0.0023) [2024-06-23 02:53:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5647990784. Throughput: 0: 42574.4. Samples: 5648097220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 02:53:29,586][15401] Updated weights for policy 0, policy_version 344730 (0.0033) [2024-06-23 02:53:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.9). Total num frames: 5648171008. Throughput: 0: 42669.2. Samples: 5648355280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:33,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 02:53:33,993][15401] Updated weights for policy 0, policy_version 344740 (0.0036) [2024-06-23 02:53:37,098][15401] Updated weights for policy 0, policy_version 344750 (0.0039) [2024-06-23 02:53:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 5648400384. Throughput: 0: 42588.4. Samples: 5648480180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 02:53:41,446][15349] Signal inference workers to stop experience collection... (83650 times) [2024-06-23 02:53:41,498][15401] InferenceWorker_p0-w0: stopping experience collection (83650 times) [2024-06-23 02:53:41,504][15349] Signal inference workers to resume experience collection... (83650 times) [2024-06-23 02:53:41,512][15401] InferenceWorker_p0-w0: resuming experience collection (83650 times) [2024-06-23 02:53:41,647][15401] Updated weights for policy 0, policy_version 344760 (0.0041) [2024-06-23 02:53:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5648629760. Throughput: 0: 42646.5. Samples: 5648735280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 02:53:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344765_5648629760.pth... [2024-06-23 02:53:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344139_5638373376.pth [2024-06-23 02:53:44,631][15401] Updated weights for policy 0, policy_version 344770 (0.0033) [2024-06-23 02:53:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42053.8, 300 sec: 42709.5). Total num frames: 5648809984. Throughput: 0: 42885.9. Samples: 5649000120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 02:53:49,234][15401] Updated weights for policy 0, policy_version 344780 (0.0024) [2024-06-23 02:53:52,116][15401] Updated weights for policy 0, policy_version 344790 (0.0048) [2024-06-23 02:53:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5649039360. Throughput: 0: 42789.2. Samples: 5649125040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 02:53:57,117][15401] Updated weights for policy 0, policy_version 344800 (0.0037) [2024-06-23 02:53:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5649268736. Throughput: 0: 42890.2. Samples: 5649382320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:53:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 02:54:00,116][15401] Updated weights for policy 0, policy_version 344810 (0.0040) [2024-06-23 02:54:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 5649448960. Throughput: 0: 42826.0. Samples: 5649638200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 02:54:04,763][15401] Updated weights for policy 0, policy_version 344820 (0.0037) [2024-06-23 02:54:07,836][15401] Updated weights for policy 0, policy_version 344830 (0.0027) [2024-06-23 02:54:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5649694720. Throughput: 0: 42714.3. Samples: 5649761700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 02:54:12,245][15401] Updated weights for policy 0, policy_version 344840 (0.0033) [2024-06-23 02:54:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5649907712. Throughput: 0: 42860.5. Samples: 5650025940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 02:54:15,204][15401] Updated weights for policy 0, policy_version 344850 (0.0037) [2024-06-23 02:54:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5650087936. Throughput: 0: 43065.8. Samples: 5650293240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 02:54:19,789][15401] Updated weights for policy 0, policy_version 344860 (0.0037) [2024-06-23 02:54:22,882][15401] Updated weights for policy 0, policy_version 344870 (0.0036) [2024-06-23 02:54:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5650350080. Throughput: 0: 42898.2. Samples: 5650410600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 02:54:27,633][15401] Updated weights for policy 0, policy_version 344880 (0.0025) [2024-06-23 02:54:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5650546688. Throughput: 0: 43050.2. Samples: 5650672540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 02:54:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 02:54:30,461][15401] Updated weights for policy 0, policy_version 344890 (0.0042) [2024-06-23 02:54:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5650743296. Throughput: 0: 42912.9. Samples: 5650931200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 02:54:35,213][15401] Updated weights for policy 0, policy_version 344900 (0.0033) [2024-06-23 02:54:37,963][15401] Updated weights for policy 0, policy_version 344910 (0.0039) [2024-06-23 02:54:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 5651005440. Throughput: 0: 43045.3. Samples: 5651062180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:38,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 02:54:42,710][15401] Updated weights for policy 0, policy_version 344920 (0.0039) [2024-06-23 02:54:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5651202048. Throughput: 0: 43234.8. Samples: 5651327880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 02:54:45,532][15401] Updated weights for policy 0, policy_version 344930 (0.0046) [2024-06-23 02:54:48,390][15132] Fps is (10 sec: 37689.9, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 5651382272. Throughput: 0: 43165.4. Samples: 5651580660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:48,391][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 02:54:50,799][15401] Updated weights for policy 0, policy_version 344940 (0.0032) [2024-06-23 02:54:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5651644416. Throughput: 0: 43291.2. Samples: 5651709800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 02:54:53,407][15401] Updated weights for policy 0, policy_version 344950 (0.0037) [2024-06-23 02:54:58,389][15132] Fps is (10 sec: 42601.2, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 5651808256. Throughput: 0: 43195.1. Samples: 5651969720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:54:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 02:54:58,487][15401] Updated weights for policy 0, policy_version 344960 (0.0043) [2024-06-23 02:54:58,828][15349] Signal inference workers to stop experience collection... (83700 times) [2024-06-23 02:54:58,831][15349] Signal inference workers to resume experience collection... (83700 times) [2024-06-23 02:54:58,851][15401] InferenceWorker_p0-w0: stopping experience collection (83700 times) [2024-06-23 02:54:58,851][15401] InferenceWorker_p0-w0: resuming experience collection (83700 times) [2024-06-23 02:55:00,926][15401] Updated weights for policy 0, policy_version 344970 (0.0034) [2024-06-23 02:55:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5652037632. Throughput: 0: 42944.3. Samples: 5652225740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 02:55:06,005][15401] Updated weights for policy 0, policy_version 344980 (0.0032) [2024-06-23 02:55:08,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 5652299776. Throughput: 0: 43240.1. Samples: 5652356400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 02:55:08,487][15401] Updated weights for policy 0, policy_version 344990 (0.0038) [2024-06-23 02:55:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5652463616. Throughput: 0: 43145.2. Samples: 5652614080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 02:55:13,542][15401] Updated weights for policy 0, policy_version 345000 (0.0036) [2024-06-23 02:55:15,985][15401] Updated weights for policy 0, policy_version 345010 (0.0035) [2024-06-23 02:55:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 5652692992. Throughput: 0: 43050.1. Samples: 5652868560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:18,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 02:55:21,132][15401] Updated weights for policy 0, policy_version 345020 (0.0028) [2024-06-23 02:55:23,390][15132] Fps is (10 sec: 49152.3, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 5652955136. Throughput: 0: 43091.5. Samples: 5653001200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 02:55:23,755][15401] Updated weights for policy 0, policy_version 345030 (0.0030) [2024-06-23 02:55:28,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5653086208. Throughput: 0: 42919.1. Samples: 5653259240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 02:55:28,804][15401] Updated weights for policy 0, policy_version 345040 (0.0049) [2024-06-23 02:55:31,323][15401] Updated weights for policy 0, policy_version 345050 (0.0030) [2024-06-23 02:55:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5653348352. Throughput: 0: 42848.5. Samples: 5653508820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:33,403][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 02:55:36,307][15401] Updated weights for policy 0, policy_version 345060 (0.0042) [2024-06-23 02:55:38,392][15132] Fps is (10 sec: 50780.0, 60 sec: 43144.8, 300 sec: 42876.2). Total num frames: 5653594112. Throughput: 0: 42968.7. Samples: 5653643480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:38,392][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 02:55:38,939][15401] Updated weights for policy 0, policy_version 345070 (0.0028) [2024-06-23 02:55:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5653757952. Throughput: 0: 42928.8. Samples: 5653901520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 02:55:43,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 02:55:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345078_5653757952.pth... [2024-06-23 02:55:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344452_5643501568.pth [2024-06-23 02:55:43,758][15401] Updated weights for policy 0, policy_version 345080 (0.0023) [2024-06-23 02:55:46,318][15401] Updated weights for policy 0, policy_version 345090 (0.0032) [2024-06-23 02:55:48,390][15132] Fps is (10 sec: 37690.7, 60 sec: 43145.0, 300 sec: 42765.0). Total num frames: 5653970944. Throughput: 0: 42918.3. Samples: 5654157060. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:55:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 02:55:51,106][15349] Signal inference workers to stop experience collection... (83750 times) [2024-06-23 02:55:51,108][15349] Signal inference workers to resume experience collection... (83750 times) [2024-06-23 02:55:51,151][15401] InferenceWorker_p0-w0: stopping experience collection (83750 times) [2024-06-23 02:55:51,152][15401] InferenceWorker_p0-w0: resuming experience collection (83750 times) [2024-06-23 02:55:51,255][15401] Updated weights for policy 0, policy_version 345100 (0.0037) [2024-06-23 02:55:53,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5654216704. Throughput: 0: 42786.6. Samples: 5654281900. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:55:53,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 02:55:54,346][15401] Updated weights for policy 0, policy_version 345110 (0.0033) [2024-06-23 02:55:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5654396928. Throughput: 0: 42752.1. Samples: 5654537920. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:55:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 02:55:59,243][15401] Updated weights for policy 0, policy_version 345120 (0.0031) [2024-06-23 02:56:02,089][15401] Updated weights for policy 0, policy_version 345130 (0.0037) [2024-06-23 02:56:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5654626304. Throughput: 0: 42666.3. Samples: 5654788440. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 02:56:06,597][15401] Updated weights for policy 0, policy_version 345140 (0.0038) [2024-06-23 02:56:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 5654822912. Throughput: 0: 42697.0. Samples: 5654922560. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 02:56:09,797][15401] Updated weights for policy 0, policy_version 345150 (0.0029) [2024-06-23 02:56:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 5655035904. Throughput: 0: 42685.0. Samples: 5655180060. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 02:56:14,254][15401] Updated weights for policy 0, policy_version 345160 (0.0033) [2024-06-23 02:56:17,423][15401] Updated weights for policy 0, policy_version 345170 (0.0024) [2024-06-23 02:56:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43146.3, 300 sec: 42820.9). Total num frames: 5655281664. Throughput: 0: 42738.2. Samples: 5655432040. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 02:56:22,038][15401] Updated weights for policy 0, policy_version 345180 (0.0038) [2024-06-23 02:56:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5655478272. Throughput: 0: 42776.1. Samples: 5655568320. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 02:56:25,083][15401] Updated weights for policy 0, policy_version 345190 (0.0030) [2024-06-23 02:56:28,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43412.9, 300 sec: 42875.1). Total num frames: 5655691264. Throughput: 0: 42781.0. Samples: 5655826940. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:28,397][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 02:56:29,827][15401] Updated weights for policy 0, policy_version 345200 (0.0043) [2024-06-23 02:56:32,637][15401] Updated weights for policy 0, policy_version 345210 (0.0040) [2024-06-23 02:56:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5655937024. Throughput: 0: 42571.4. Samples: 5656072780. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 02:56:37,482][15401] Updated weights for policy 0, policy_version 345220 (0.0038) [2024-06-23 02:56:38,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42053.7, 300 sec: 42820.6). Total num frames: 5656117248. Throughput: 0: 42825.9. Samples: 5656208960. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 02:56:40,351][15401] Updated weights for policy 0, policy_version 345230 (0.0033) [2024-06-23 02:56:43,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5656346624. Throughput: 0: 42775.7. Samples: 5656462820. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:43,396][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 02:56:44,943][15401] Updated weights for policy 0, policy_version 345240 (0.0042) [2024-06-23 02:56:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5656543232. Throughput: 0: 42836.9. Samples: 5656716100. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 02:56:48,588][15401] Updated weights for policy 0, policy_version 345250 (0.0038) [2024-06-23 02:56:52,478][15401] Updated weights for policy 0, policy_version 345260 (0.0033) [2024-06-23 02:56:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 5656739840. Throughput: 0: 42734.2. Samples: 5656845600. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 02:56:56,195][15401] Updated weights for policy 0, policy_version 345270 (0.0040) [2024-06-23 02:56:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5657001984. Throughput: 0: 42808.7. Samples: 5657106460. Policy #0 lag: (min: 2.0, avg: 10.5, max: 24.0) [2024-06-23 02:56:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 02:57:00,050][15401] Updated weights for policy 0, policy_version 345280 (0.0039) [2024-06-23 02:57:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5657198592. Throughput: 0: 42791.2. Samples: 5657357640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:03,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 02:57:03,979][15401] Updated weights for policy 0, policy_version 345290 (0.0034) [2024-06-23 02:57:07,784][15401] Updated weights for policy 0, policy_version 345300 (0.0036) [2024-06-23 02:57:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5657395200. Throughput: 0: 42663.7. Samples: 5657488180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:08,390][15132] Avg episode reward: [(0, '0.120')] [2024-06-23 02:57:11,001][15349] Signal inference workers to stop experience collection... (83800 times) [2024-06-23 02:57:11,055][15349] Signal inference workers to resume experience collection... (83800 times) [2024-06-23 02:57:11,055][15401] InferenceWorker_p0-w0: stopping experience collection (83800 times) [2024-06-23 02:57:11,068][15401] InferenceWorker_p0-w0: resuming experience collection (83800 times) [2024-06-23 02:57:11,379][15401] Updated weights for policy 0, policy_version 345310 (0.0036) [2024-06-23 02:57:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.7, 300 sec: 42820.2). Total num frames: 5657640960. Throughput: 0: 42699.4. Samples: 5657748240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:13,393][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 02:57:15,285][15401] Updated weights for policy 0, policy_version 345320 (0.0042) [2024-06-23 02:57:18,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42050.6, 300 sec: 42709.5). Total num frames: 5657804800. Throughput: 0: 42898.7. Samples: 5658003320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:18,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 02:57:19,078][15401] Updated weights for policy 0, policy_version 345330 (0.0033) [2024-06-23 02:57:23,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5658034176. Throughput: 0: 42535.9. Samples: 5658123080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 02:57:23,527][15401] Updated weights for policy 0, policy_version 345340 (0.0024) [2024-06-23 02:57:26,872][15401] Updated weights for policy 0, policy_version 345350 (0.0031) [2024-06-23 02:57:28,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 5658263552. Throughput: 0: 42645.7. Samples: 5658381880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 02:57:31,231][15401] Updated weights for policy 0, policy_version 345360 (0.0024) [2024-06-23 02:57:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.4). Total num frames: 5658476544. Throughput: 0: 42934.2. Samples: 5658648140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 02:57:34,456][15401] Updated weights for policy 0, policy_version 345370 (0.0032) [2024-06-23 02:57:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5658689536. Throughput: 0: 42824.4. Samples: 5658772700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 02:57:38,689][15401] Updated weights for policy 0, policy_version 345380 (0.0042) [2024-06-23 02:57:41,887][15401] Updated weights for policy 0, policy_version 345390 (0.0027) [2024-06-23 02:57:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 5658918912. Throughput: 0: 42905.4. Samples: 5659037200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 02:57:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345393_5658918912.pth... [2024-06-23 02:57:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000344765_5648629760.pth [2024-06-23 02:57:46,041][15401] Updated weights for policy 0, policy_version 345400 (0.0045) [2024-06-23 02:57:48,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 5659115520. Throughput: 0: 43067.2. Samples: 5659295940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:48,396][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 02:57:49,584][15401] Updated weights for policy 0, policy_version 345410 (0.0030) [2024-06-23 02:57:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5659344896. Throughput: 0: 42993.2. Samples: 5659422880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 02:57:53,593][15401] Updated weights for policy 0, policy_version 345420 (0.0040) [2024-06-23 02:57:57,556][15401] Updated weights for policy 0, policy_version 345430 (0.0026) [2024-06-23 02:57:58,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 5659557888. Throughput: 0: 42914.0. Samples: 5659679260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:57:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 02:58:01,145][15401] Updated weights for policy 0, policy_version 345440 (0.0041) [2024-06-23 02:58:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5659770880. Throughput: 0: 42984.9. Samples: 5659937540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:58:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 02:58:05,038][15401] Updated weights for policy 0, policy_version 345450 (0.0031) [2024-06-23 02:58:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5660000256. Throughput: 0: 43120.6. Samples: 5660063500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:58:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 02:58:08,587][15401] Updated weights for policy 0, policy_version 345460 (0.0034) [2024-06-23 02:58:12,529][15401] Updated weights for policy 0, policy_version 345470 (0.0023) [2024-06-23 02:58:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 5660180480. Throughput: 0: 43180.5. Samples: 5660325000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:58:13,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 02:58:16,264][15401] Updated weights for policy 0, policy_version 345480 (0.0036) [2024-06-23 02:58:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 5660409856. Throughput: 0: 43003.9. Samples: 5660583320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 02:58:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 02:58:20,219][15401] Updated weights for policy 0, policy_version 345490 (0.0028) [2024-06-23 02:58:23,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43688.9, 300 sec: 42931.3). Total num frames: 5660655616. Throughput: 0: 43019.1. Samples: 5660708660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:23,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 02:58:23,853][15401] Updated weights for policy 0, policy_version 345500 (0.0041) [2024-06-23 02:58:27,756][15401] Updated weights for policy 0, policy_version 345510 (0.0035) [2024-06-23 02:58:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5660835840. Throughput: 0: 42991.1. Samples: 5660971800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 02:58:29,370][15349] Signal inference workers to stop experience collection... (83850 times) [2024-06-23 02:58:29,408][15401] InferenceWorker_p0-w0: stopping experience collection (83850 times) [2024-06-23 02:58:29,429][15349] Signal inference workers to resume experience collection... (83850 times) [2024-06-23 02:58:29,436][15401] InferenceWorker_p0-w0: resuming experience collection (83850 times) [2024-06-23 02:58:31,262][15401] Updated weights for policy 0, policy_version 345520 (0.0030) [2024-06-23 02:58:33,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5661032448. Throughput: 0: 43078.1. Samples: 5661234180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:33,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 02:58:35,430][15401] Updated weights for policy 0, policy_version 345530 (0.0026) [2024-06-23 02:58:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5661294592. Throughput: 0: 42899.1. Samples: 5661353340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:38,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 02:58:38,958][15401] Updated weights for policy 0, policy_version 345540 (0.0027) [2024-06-23 02:58:43,249][15401] Updated weights for policy 0, policy_version 345550 (0.0038) [2024-06-23 02:58:43,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 5661491200. Throughput: 0: 43118.9. Samples: 5661619720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:43,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 02:58:46,236][15401] Updated weights for policy 0, policy_version 345560 (0.0020) [2024-06-23 02:58:48,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 5661671424. Throughput: 0: 43229.9. Samples: 5661882880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 02:58:50,975][15401] Updated weights for policy 0, policy_version 345570 (0.0028) [2024-06-23 02:58:53,392][15132] Fps is (10 sec: 45875.1, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 5661949952. Throughput: 0: 43198.1. Samples: 5662007520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:53,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 02:58:54,094][15401] Updated weights for policy 0, policy_version 345580 (0.0029) [2024-06-23 02:58:58,382][15401] Updated weights for policy 0, policy_version 345590 (0.0028) [2024-06-23 02:58:58,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 43042.8). Total num frames: 5662146560. Throughput: 0: 43185.0. Samples: 5662268320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:58:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 02:59:01,435][15401] Updated weights for policy 0, policy_version 345600 (0.0035) [2024-06-23 02:59:03,390][15132] Fps is (10 sec: 37692.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5662326784. Throughput: 0: 43289.4. Samples: 5662531340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 02:59:05,890][15401] Updated weights for policy 0, policy_version 345610 (0.0027) [2024-06-23 02:59:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5662588928. Throughput: 0: 43205.4. Samples: 5662652800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 02:59:09,070][15401] Updated weights for policy 0, policy_version 345620 (0.0035) [2024-06-23 02:59:13,342][15401] Updated weights for policy 0, policy_version 345630 (0.0031) [2024-06-23 02:59:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 5662801920. Throughput: 0: 43257.7. Samples: 5662918400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 02:59:17,143][15401] Updated weights for policy 0, policy_version 345640 (0.0034) [2024-06-23 02:59:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5662982144. Throughput: 0: 43120.5. Samples: 5663174600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 02:59:21,023][15401] Updated weights for policy 0, policy_version 345650 (0.0039) [2024-06-23 02:59:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 43042.7). Total num frames: 5663244288. Throughput: 0: 43149.3. Samples: 5663295060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 02:59:24,627][15401] Updated weights for policy 0, policy_version 345660 (0.0036) [2024-06-23 02:59:28,394][15132] Fps is (10 sec: 45853.5, 60 sec: 43414.2, 300 sec: 43042.0). Total num frames: 5663440896. Throughput: 0: 43115.6. Samples: 5663560020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:28,395][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 02:59:28,542][15401] Updated weights for policy 0, policy_version 345670 (0.0028) [2024-06-23 02:59:32,143][15401] Updated weights for policy 0, policy_version 345680 (0.0034) [2024-06-23 02:59:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 5663637504. Throughput: 0: 42931.0. Samples: 5663814780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 02:59:33,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-23 02:59:35,707][15349] Signal inference workers to stop experience collection... (83900 times) [2024-06-23 02:59:35,720][15349] Signal inference workers to resume experience collection... (83900 times) [2024-06-23 02:59:35,726][15401] InferenceWorker_p0-w0: stopping experience collection (83900 times) [2024-06-23 02:59:35,750][15401] InferenceWorker_p0-w0: resuming experience collection (83900 times) [2024-06-23 02:59:36,207][15401] Updated weights for policy 0, policy_version 345690 (0.0037) [2024-06-23 02:59:38,389][15132] Fps is (10 sec: 44257.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5663883264. Throughput: 0: 42972.1. Samples: 5663941160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:59:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 02:59:40,143][15401] Updated weights for policy 0, policy_version 345700 (0.0030) [2024-06-23 02:59:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42871.4, 300 sec: 42986.9). Total num frames: 5664063488. Throughput: 0: 43001.1. Samples: 5664203480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:59:43,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 02:59:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345708_5664079872.pth... [2024-06-23 02:59:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345078_5653757952.pth [2024-06-23 02:59:43,877][15401] Updated weights for policy 0, policy_version 345710 (0.0028) [2024-06-23 02:59:48,006][15401] Updated weights for policy 0, policy_version 345720 (0.0032) [2024-06-23 02:59:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5664276480. Throughput: 0: 42726.3. Samples: 5664454020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:59:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 02:59:51,611][15401] Updated weights for policy 0, policy_version 345730 (0.0030) [2024-06-23 02:59:53,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42873.1, 300 sec: 43098.2). Total num frames: 5664522240. Throughput: 0: 42905.8. Samples: 5664583560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:59:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 02:59:55,778][15401] Updated weights for policy 0, policy_version 345740 (0.0028) [2024-06-23 02:59:58,394][15132] Fps is (10 sec: 44218.2, 60 sec: 42868.4, 300 sec: 42986.6). Total num frames: 5664718848. Throughput: 0: 42758.8. Samples: 5664842720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 02:59:58,394][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 02:59:59,268][15401] Updated weights for policy 0, policy_version 345750 (0.0025) [2024-06-23 03:00:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5664915456. Throughput: 0: 42780.0. Samples: 5665099700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 03:00:03,483][15401] Updated weights for policy 0, policy_version 345760 (0.0040) [2024-06-23 03:00:06,867][15401] Updated weights for policy 0, policy_version 345770 (0.0042) [2024-06-23 03:00:08,390][15132] Fps is (10 sec: 44254.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5665161216. Throughput: 0: 42934.6. Samples: 5665227120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 03:00:10,919][15401] Updated weights for policy 0, policy_version 345780 (0.0031) [2024-06-23 03:00:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42932.0). Total num frames: 5665357824. Throughput: 0: 42798.3. Samples: 5665485740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 03:00:14,445][15401] Updated weights for policy 0, policy_version 345790 (0.0033) [2024-06-23 03:00:18,391][15132] Fps is (10 sec: 40954.9, 60 sec: 43143.6, 300 sec: 42764.8). Total num frames: 5665570816. Throughput: 0: 42952.9. Samples: 5665747720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:18,391][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 03:00:18,601][15401] Updated weights for policy 0, policy_version 345800 (0.0035) [2024-06-23 03:00:22,146][15401] Updated weights for policy 0, policy_version 345810 (0.0027) [2024-06-23 03:00:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 43097.9). Total num frames: 5665800192. Throughput: 0: 42879.9. Samples: 5665870860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:23,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 03:00:26,241][15401] Updated weights for policy 0, policy_version 345820 (0.0025) [2024-06-23 03:00:28,392][15132] Fps is (10 sec: 42593.7, 60 sec: 42600.0, 300 sec: 42875.8). Total num frames: 5665996800. Throughput: 0: 42888.5. Samples: 5666133460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:28,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 03:00:29,680][15401] Updated weights for policy 0, policy_version 345830 (0.0044) [2024-06-23 03:00:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 42820.8). Total num frames: 5666226176. Throughput: 0: 42934.9. Samples: 5666386100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 03:00:33,585][15401] Updated weights for policy 0, policy_version 345840 (0.0028) [2024-06-23 03:00:37,610][15401] Updated weights for policy 0, policy_version 345850 (0.0040) [2024-06-23 03:00:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 5666422784. Throughput: 0: 42948.1. Samples: 5666516220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:38,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 03:00:40,970][15401] Updated weights for policy 0, policy_version 345860 (0.0023) [2024-06-23 03:00:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 5666635776. Throughput: 0: 42871.3. Samples: 5666771760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:43,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-23 03:00:45,170][15401] Updated weights for policy 0, policy_version 345870 (0.0023) [2024-06-23 03:00:47,252][15349] Signal inference workers to stop experience collection... (83950 times) [2024-06-23 03:00:47,252][15349] Signal inference workers to resume experience collection... (83950 times) [2024-06-23 03:00:47,288][15401] InferenceWorker_p0-w0: stopping experience collection (83950 times) [2024-06-23 03:00:47,288][15401] InferenceWorker_p0-w0: resuming experience collection (83950 times) [2024-06-23 03:00:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42932.0). Total num frames: 5666881536. Throughput: 0: 42798.6. Samples: 5667025640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 03:00:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 03:00:48,457][15401] Updated weights for policy 0, policy_version 345880 (0.0037) [2024-06-23 03:00:52,791][15401] Updated weights for policy 0, policy_version 345890 (0.0031) [2024-06-23 03:00:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 5667078144. Throughput: 0: 43041.9. Samples: 5667164000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:00:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 03:00:55,921][15401] Updated weights for policy 0, policy_version 345900 (0.0043) [2024-06-23 03:00:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42601.3, 300 sec: 42876.1). Total num frames: 5667274752. Throughput: 0: 42968.4. Samples: 5667419320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:00:58,395][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 03:01:00,331][15401] Updated weights for policy 0, policy_version 345910 (0.0029) [2024-06-23 03:01:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 5667536896. Throughput: 0: 42673.6. Samples: 5667667980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 03:01:04,187][15401] Updated weights for policy 0, policy_version 345920 (0.0033) [2024-06-23 03:01:08,102][15401] Updated weights for policy 0, policy_version 345930 (0.0032) [2024-06-23 03:01:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42987.1). Total num frames: 5667717120. Throughput: 0: 42995.6. Samples: 5667805560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 03:01:11,632][15401] Updated weights for policy 0, policy_version 345940 (0.0043) [2024-06-23 03:01:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5667930112. Throughput: 0: 42811.6. Samples: 5668059880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 03:01:15,493][15401] Updated weights for policy 0, policy_version 345950 (0.0042) [2024-06-23 03:01:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43145.5, 300 sec: 42987.2). Total num frames: 5668159488. Throughput: 0: 42832.6. Samples: 5668313560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 03:01:19,403][15401] Updated weights for policy 0, policy_version 345960 (0.0024) [2024-06-23 03:01:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.3, 300 sec: 42988.1). Total num frames: 5668372480. Throughput: 0: 42850.3. Samples: 5668444480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:23,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 03:01:23,390][15401] Updated weights for policy 0, policy_version 345970 (0.0027) [2024-06-23 03:01:26,942][15401] Updated weights for policy 0, policy_version 345980 (0.0032) [2024-06-23 03:01:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 5668569088. Throughput: 0: 42812.9. Samples: 5668698340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 03:01:30,886][15401] Updated weights for policy 0, policy_version 345990 (0.0040) [2024-06-23 03:01:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 5668798464. Throughput: 0: 43060.1. Samples: 5668963340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 03:01:34,781][15401] Updated weights for policy 0, policy_version 346000 (0.0033) [2024-06-23 03:01:38,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 5669011456. Throughput: 0: 42895.9. Samples: 5669094420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:38,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 03:01:38,431][15401] Updated weights for policy 0, policy_version 346010 (0.0029) [2024-06-23 03:01:42,271][15401] Updated weights for policy 0, policy_version 346020 (0.0034) [2024-06-23 03:01:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5669224448. Throughput: 0: 42909.6. Samples: 5669350260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 03:01:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346022_5669224448.pth... [2024-06-23 03:01:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345393_5658918912.pth [2024-06-23 03:01:46,085][15401] Updated weights for policy 0, policy_version 346030 (0.0034) [2024-06-23 03:01:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5669437440. Throughput: 0: 43063.2. Samples: 5669605820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:48,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 03:01:50,183][15401] Updated weights for policy 0, policy_version 346040 (0.0041) [2024-06-23 03:01:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5669666816. Throughput: 0: 42865.4. Samples: 5669734500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 03:01:53,512][15401] Updated weights for policy 0, policy_version 346050 (0.0032) [2024-06-23 03:01:57,764][15401] Updated weights for policy 0, policy_version 346060 (0.0048) [2024-06-23 03:01:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5669863424. Throughput: 0: 43034.3. Samples: 5669996420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:01:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 03:02:01,406][15401] Updated weights for policy 0, policy_version 346070 (0.0030) [2024-06-23 03:02:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 5670092800. Throughput: 0: 43037.3. Samples: 5670250240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:02:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 03:02:05,598][15401] Updated weights for policy 0, policy_version 346080 (0.0030) [2024-06-23 03:02:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 5670305792. Throughput: 0: 42955.0. Samples: 5670377460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 03:02:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 03:02:09,161][15401] Updated weights for policy 0, policy_version 346090 (0.0038) [2024-06-23 03:02:13,198][15401] Updated weights for policy 0, policy_version 346100 (0.0043) [2024-06-23 03:02:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 5670502400. Throughput: 0: 43062.8. Samples: 5670636160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 03:02:16,830][15401] Updated weights for policy 0, policy_version 346110 (0.0044) [2024-06-23 03:02:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 5670731776. Throughput: 0: 42793.4. Samples: 5670889040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 03:02:20,825][15401] Updated weights for policy 0, policy_version 346120 (0.0035) [2024-06-23 03:02:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5670944768. Throughput: 0: 42738.8. Samples: 5671017560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 03:02:24,291][15401] Updated weights for policy 0, policy_version 346130 (0.0032) [2024-06-23 03:02:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5671141376. Throughput: 0: 42803.7. Samples: 5671276420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 03:02:28,820][15401] Updated weights for policy 0, policy_version 346140 (0.0031) [2024-06-23 03:02:29,077][15349] Signal inference workers to stop experience collection... (84000 times) [2024-06-23 03:02:29,077][15349] Signal inference workers to resume experience collection... (84000 times) [2024-06-23 03:02:29,097][15401] InferenceWorker_p0-w0: stopping experience collection (84000 times) [2024-06-23 03:02:29,098][15401] InferenceWorker_p0-w0: resuming experience collection (84000 times) [2024-06-23 03:02:31,895][15401] Updated weights for policy 0, policy_version 346150 (0.0041) [2024-06-23 03:02:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 5671387136. Throughput: 0: 42675.9. Samples: 5671526240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 03:02:36,378][15401] Updated weights for policy 0, policy_version 346160 (0.0044) [2024-06-23 03:02:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.3, 300 sec: 42931.6). Total num frames: 5671583744. Throughput: 0: 42825.4. Samples: 5671661640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 03:02:39,541][15401] Updated weights for policy 0, policy_version 346170 (0.0049) [2024-06-23 03:02:43,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.8, 300 sec: 42932.2). Total num frames: 5671780352. Throughput: 0: 42599.0. Samples: 5671913480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:43,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 03:02:43,980][15401] Updated weights for policy 0, policy_version 346180 (0.0024) [2024-06-23 03:02:47,066][15401] Updated weights for policy 0, policy_version 346190 (0.0037) [2024-06-23 03:02:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5672026112. Throughput: 0: 42765.8. Samples: 5672174700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 03:02:51,392][15401] Updated weights for policy 0, policy_version 346200 (0.0033) [2024-06-23 03:02:53,390][15132] Fps is (10 sec: 45885.5, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 5672239104. Throughput: 0: 42954.9. Samples: 5672310440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 03:02:54,918][15401] Updated weights for policy 0, policy_version 346210 (0.0031) [2024-06-23 03:02:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5672419328. Throughput: 0: 42843.7. Samples: 5672564120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:02:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 03:02:59,443][15401] Updated weights for policy 0, policy_version 346220 (0.0031) [2024-06-23 03:03:02,527][15401] Updated weights for policy 0, policy_version 346230 (0.0034) [2024-06-23 03:03:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5672665088. Throughput: 0: 42828.8. Samples: 5672816340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:03:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 03:03:07,109][15401] Updated weights for policy 0, policy_version 346240 (0.0030) [2024-06-23 03:03:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5672878080. Throughput: 0: 43071.2. Samples: 5672955760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:03:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 03:03:10,430][15401] Updated weights for policy 0, policy_version 346250 (0.0055) [2024-06-23 03:03:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5673058304. Throughput: 0: 42694.2. Samples: 5673197660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:03:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 03:03:14,627][15401] Updated weights for policy 0, policy_version 346260 (0.0026) [2024-06-23 03:03:17,966][15401] Updated weights for policy 0, policy_version 346270 (0.0032) [2024-06-23 03:03:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 5673304064. Throughput: 0: 42956.0. Samples: 5673459260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:03:18,393][15132] Avg episode reward: [(0, '0.223')] [2024-06-23 03:03:22,249][15401] Updated weights for policy 0, policy_version 346280 (0.0032) [2024-06-23 03:03:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 5673500672. Throughput: 0: 42938.7. Samples: 5673593880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 03:03:23,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 03:03:25,675][15401] Updated weights for policy 0, policy_version 346290 (0.0034) [2024-06-23 03:03:28,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42986.2). Total num frames: 5673713664. Throughput: 0: 42782.0. Samples: 5673838840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:28,396][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 03:03:30,004][15401] Updated weights for policy 0, policy_version 346300 (0.0024) [2024-06-23 03:03:33,326][15401] Updated weights for policy 0, policy_version 346310 (0.0047) [2024-06-23 03:03:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5673943040. Throughput: 0: 42873.3. Samples: 5674104000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 03:03:37,576][15349] Signal inference workers to stop experience collection... (84050 times) [2024-06-23 03:03:37,579][15349] Signal inference workers to resume experience collection... (84050 times) [2024-06-23 03:03:37,588][15401] Updated weights for policy 0, policy_version 346320 (0.0041) [2024-06-23 03:03:37,620][15401] InferenceWorker_p0-w0: stopping experience collection (84050 times) [2024-06-23 03:03:37,620][15401] InferenceWorker_p0-w0: resuming experience collection (84050 times) [2024-06-23 03:03:38,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42871.3, 300 sec: 42932.0). Total num frames: 5674156032. Throughput: 0: 42713.4. Samples: 5674232540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 03:03:40,930][15401] Updated weights for policy 0, policy_version 346330 (0.0040) [2024-06-23 03:03:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42871.5, 300 sec: 42986.8). Total num frames: 5674352640. Throughput: 0: 42574.5. Samples: 5674480080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:43,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 03:03:43,455][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346336_5674369024.pth... [2024-06-23 03:03:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000345708_5664079872.pth [2024-06-23 03:03:44,975][15401] Updated weights for policy 0, policy_version 346340 (0.0026) [2024-06-23 03:03:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 5674565632. Throughput: 0: 42800.1. Samples: 5674742340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 03:03:48,523][15401] Updated weights for policy 0, policy_version 346350 (0.0030) [2024-06-23 03:03:52,858][15401] Updated weights for policy 0, policy_version 346360 (0.0036) [2024-06-23 03:03:53,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 5674778624. Throughput: 0: 42575.4. Samples: 5674871660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 03:03:56,135][15401] Updated weights for policy 0, policy_version 346370 (0.0026) [2024-06-23 03:03:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5675008000. Throughput: 0: 42740.9. Samples: 5675121000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:03:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 03:04:00,486][15401] Updated weights for policy 0, policy_version 346380 (0.0029) [2024-06-23 03:04:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5675220992. Throughput: 0: 42660.1. Samples: 5675378960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:03,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-23 03:04:03,686][15401] Updated weights for policy 0, policy_version 346390 (0.0044) [2024-06-23 03:04:08,043][15401] Updated weights for policy 0, policy_version 346400 (0.0025) [2024-06-23 03:04:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5675417600. Throughput: 0: 42669.6. Samples: 5675514020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:08,394][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 03:04:11,105][15401] Updated weights for policy 0, policy_version 346410 (0.0029) [2024-06-23 03:04:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5675646976. Throughput: 0: 42943.5. Samples: 5675771020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 03:04:15,624][15401] Updated weights for policy 0, policy_version 346420 (0.0043) [2024-06-23 03:04:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5675876352. Throughput: 0: 42849.7. Samples: 5676032240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 03:04:18,668][15401] Updated weights for policy 0, policy_version 346430 (0.0041) [2024-06-23 03:04:23,015][15401] Updated weights for policy 0, policy_version 346440 (0.0028) [2024-06-23 03:04:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42821.3). Total num frames: 5676072960. Throughput: 0: 42825.5. Samples: 5676159680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 03:04:26,553][15401] Updated weights for policy 0, policy_version 346450 (0.0038) [2024-06-23 03:04:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43149.0, 300 sec: 42931.6). Total num frames: 5676302336. Throughput: 0: 42951.1. Samples: 5676412780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 03:04:30,813][15401] Updated weights for policy 0, policy_version 346460 (0.0032) [2024-06-23 03:04:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5676498944. Throughput: 0: 43027.5. Samples: 5676678580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 03:04:34,370][15401] Updated weights for policy 0, policy_version 346470 (0.0045) [2024-06-23 03:04:38,299][15401] Updated weights for policy 0, policy_version 346480 (0.0037) [2024-06-23 03:04:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 5676728320. Throughput: 0: 42979.5. Samples: 5676805740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 03:04:42,118][15401] Updated weights for policy 0, policy_version 346490 (0.0023) [2024-06-23 03:04:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 5676941312. Throughput: 0: 43038.2. Samples: 5677057720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 03:04:43,399][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 03:04:45,712][15401] Updated weights for policy 0, policy_version 346500 (0.0020) [2024-06-23 03:04:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5677137920. Throughput: 0: 43185.7. Samples: 5677322320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:04:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 03:04:49,651][15401] Updated weights for policy 0, policy_version 346510 (0.0034) [2024-06-23 03:04:53,375][15401] Updated weights for policy 0, policy_version 346520 (0.0035) [2024-06-23 03:04:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42932.2). Total num frames: 5677383680. Throughput: 0: 42929.4. Samples: 5677445840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:04:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 03:04:57,334][15401] Updated weights for policy 0, policy_version 346530 (0.0034) [2024-06-23 03:04:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5677563904. Throughput: 0: 42892.0. Samples: 5677701160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:04:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 03:05:00,164][15349] Signal inference workers to stop experience collection... (84100 times) [2024-06-23 03:05:00,168][15349] Signal inference workers to resume experience collection... (84100 times) [2024-06-23 03:05:00,195][15401] InferenceWorker_p0-w0: stopping experience collection (84100 times) [2024-06-23 03:05:00,196][15401] InferenceWorker_p0-w0: resuming experience collection (84100 times) [2024-06-23 03:05:00,874][15401] Updated weights for policy 0, policy_version 346540 (0.0027) [2024-06-23 03:05:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5677776896. Throughput: 0: 42911.1. Samples: 5677963240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 03:05:04,879][15401] Updated weights for policy 0, policy_version 346550 (0.0034) [2024-06-23 03:05:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 5678022656. Throughput: 0: 42947.5. Samples: 5678092320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 03:05:08,404][15401] Updated weights for policy 0, policy_version 346560 (0.0033) [2024-06-23 03:05:12,552][15401] Updated weights for policy 0, policy_version 346570 (0.0023) [2024-06-23 03:05:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.8). Total num frames: 5678202880. Throughput: 0: 42908.7. Samples: 5678343660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 03:05:16,298][15401] Updated weights for policy 0, policy_version 346580 (0.0028) [2024-06-23 03:05:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 5678415872. Throughput: 0: 42737.8. Samples: 5678601780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:18,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-23 03:05:20,146][15401] Updated weights for policy 0, policy_version 346590 (0.0027) [2024-06-23 03:05:23,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43142.7, 300 sec: 42931.6). Total num frames: 5678661632. Throughput: 0: 42740.0. Samples: 5678729140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:23,393][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 03:05:23,804][15401] Updated weights for policy 0, policy_version 346600 (0.0033) [2024-06-23 03:05:27,879][15401] Updated weights for policy 0, policy_version 346610 (0.0032) [2024-06-23 03:05:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5678858240. Throughput: 0: 42884.5. Samples: 5678987520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 03:05:31,643][15401] Updated weights for policy 0, policy_version 346620 (0.0029) [2024-06-23 03:05:33,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5679071232. Throughput: 0: 42731.1. Samples: 5679245220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 03:05:35,477][15401] Updated weights for policy 0, policy_version 346630 (0.0041) [2024-06-23 03:05:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 5679300608. Throughput: 0: 42848.8. Samples: 5679374040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 03:05:39,395][15401] Updated weights for policy 0, policy_version 346640 (0.0031) [2024-06-23 03:05:43,292][15401] Updated weights for policy 0, policy_version 346650 (0.0032) [2024-06-23 03:05:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5679513600. Throughput: 0: 42876.9. Samples: 5679630620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 03:05:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346650_5679513600.pth... [2024-06-23 03:05:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346022_5669224448.pth [2024-06-23 03:05:47,158][15401] Updated weights for policy 0, policy_version 346660 (0.0028) [2024-06-23 03:05:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5679710208. Throughput: 0: 42732.4. Samples: 5679886200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 03:05:50,949][15401] Updated weights for policy 0, policy_version 346670 (0.0041) [2024-06-23 03:05:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5679939584. Throughput: 0: 42624.3. Samples: 5680010420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 03:05:54,880][15401] Updated weights for policy 0, policy_version 346680 (0.0037) [2024-06-23 03:05:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5680152576. Throughput: 0: 42985.6. Samples: 5680278020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:05:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 03:05:58,543][15401] Updated weights for policy 0, policy_version 346690 (0.0038) [2024-06-23 03:06:02,443][15401] Updated weights for policy 0, policy_version 346700 (0.0038) [2024-06-23 03:06:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 5680349184. Throughput: 0: 42755.2. Samples: 5680526040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:03,397][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 03:06:06,376][15401] Updated weights for policy 0, policy_version 346710 (0.0036) [2024-06-23 03:06:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5680594944. Throughput: 0: 42848.5. Samples: 5680657220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 03:06:10,360][15401] Updated weights for policy 0, policy_version 346720 (0.0032) [2024-06-23 03:06:13,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5680775168. Throughput: 0: 42820.0. Samples: 5680914420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 03:06:14,007][15401] Updated weights for policy 0, policy_version 346730 (0.0030) [2024-06-23 03:06:17,937][15401] Updated weights for policy 0, policy_version 346740 (0.0035) [2024-06-23 03:06:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 5681004544. Throughput: 0: 42661.7. Samples: 5681165100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:18,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 03:06:21,545][15401] Updated weights for policy 0, policy_version 346750 (0.0037) [2024-06-23 03:06:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 5681233920. Throughput: 0: 42745.8. Samples: 5681297600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 03:06:25,789][15401] Updated weights for policy 0, policy_version 346760 (0.0031) [2024-06-23 03:06:28,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5681397760. Throughput: 0: 42714.2. Samples: 5681552760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:28,394][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 03:06:28,592][15349] Signal inference workers to stop experience collection... (84150 times) [2024-06-23 03:06:28,631][15401] InferenceWorker_p0-w0: stopping experience collection (84150 times) [2024-06-23 03:06:28,650][15349] Signal inference workers to resume experience collection... (84150 times) [2024-06-23 03:06:28,652][15401] InferenceWorker_p0-w0: resuming experience collection (84150 times) [2024-06-23 03:06:29,329][15401] Updated weights for policy 0, policy_version 346770 (0.0037) [2024-06-23 03:06:33,286][15401] Updated weights for policy 0, policy_version 346780 (0.0025) [2024-06-23 03:06:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 5681643520. Throughput: 0: 42609.6. Samples: 5681803640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:33,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 03:06:37,253][15401] Updated weights for policy 0, policy_version 346790 (0.0036) [2024-06-23 03:06:38,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5681872896. Throughput: 0: 42794.4. Samples: 5681936160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 03:06:40,955][15401] Updated weights for policy 0, policy_version 346800 (0.0033) [2024-06-23 03:06:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5682036736. Throughput: 0: 42459.5. Samples: 5682188700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 03:06:45,056][15401] Updated weights for policy 0, policy_version 346810 (0.0030) [2024-06-23 03:06:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5682266112. Throughput: 0: 42383.3. Samples: 5682433020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:48,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-23 03:06:48,770][15401] Updated weights for policy 0, policy_version 346820 (0.0050) [2024-06-23 03:06:52,854][15401] Updated weights for policy 0, policy_version 346830 (0.0032) [2024-06-23 03:06:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5682479104. Throughput: 0: 42381.2. Samples: 5682564380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 03:06:56,315][15401] Updated weights for policy 0, policy_version 346840 (0.0046) [2024-06-23 03:06:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5682675712. Throughput: 0: 42219.6. Samples: 5682814300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:06:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 03:07:00,547][15401] Updated weights for policy 0, policy_version 346850 (0.0029) [2024-06-23 03:07:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 5682905088. Throughput: 0: 42403.1. Samples: 5683073140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:07:03,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 03:07:03,879][15401] Updated weights for policy 0, policy_version 346860 (0.0028) [2024-06-23 03:07:08,187][15401] Updated weights for policy 0, policy_version 346870 (0.0043) [2024-06-23 03:07:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5683118080. Throughput: 0: 42419.5. Samples: 5683206480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:07:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 03:07:11,471][15401] Updated weights for policy 0, policy_version 346880 (0.0042) [2024-06-23 03:07:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5683331072. Throughput: 0: 42265.5. Samples: 5683454700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 03:07:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 03:07:15,823][15401] Updated weights for policy 0, policy_version 346890 (0.0026) [2024-06-23 03:07:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5683560448. Throughput: 0: 42479.3. Samples: 5683715200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:18,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 03:07:19,508][15401] Updated weights for policy 0, policy_version 346900 (0.0041) [2024-06-23 03:07:23,347][15401] Updated weights for policy 0, policy_version 346910 (0.0028) [2024-06-23 03:07:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5683773440. Throughput: 0: 42451.5. Samples: 5683846480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 03:07:27,076][15401] Updated weights for policy 0, policy_version 346920 (0.0031) [2024-06-23 03:07:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5683970048. Throughput: 0: 42345.4. Samples: 5684094240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 03:07:30,948][15401] Updated weights for policy 0, policy_version 346930 (0.0032) [2024-06-23 03:07:32,692][15349] Signal inference workers to stop experience collection... (84200 times) [2024-06-23 03:07:32,692][15349] Signal inference workers to resume experience collection... (84200 times) [2024-06-23 03:07:32,748][15401] InferenceWorker_p0-w0: stopping experience collection (84200 times) [2024-06-23 03:07:32,748][15401] InferenceWorker_p0-w0: resuming experience collection (84200 times) [2024-06-23 03:07:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5684199424. Throughput: 0: 42557.4. Samples: 5684348100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:33,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 03:07:34,619][15401] Updated weights for policy 0, policy_version 346940 (0.0030) [2024-06-23 03:07:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42765.4). Total num frames: 5684396032. Throughput: 0: 42675.7. Samples: 5684484780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 03:07:38,941][15401] Updated weights for policy 0, policy_version 346950 (0.0032) [2024-06-23 03:07:42,208][15401] Updated weights for policy 0, policy_version 346960 (0.0034) [2024-06-23 03:07:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5684625408. Throughput: 0: 42639.9. Samples: 5684733100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 03:07:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346962_5684625408.pth... [2024-06-23 03:07:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346336_5674369024.pth [2024-06-23 03:07:46,695][15401] Updated weights for policy 0, policy_version 346970 (0.0022) [2024-06-23 03:07:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5684838400. Throughput: 0: 42658.6. Samples: 5684992780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 03:07:50,298][15401] Updated weights for policy 0, policy_version 346980 (0.0037) [2024-06-23 03:07:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5685018624. Throughput: 0: 42564.9. Samples: 5685121900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 03:07:54,208][15401] Updated weights for policy 0, policy_version 346990 (0.0040) [2024-06-23 03:07:58,080][15401] Updated weights for policy 0, policy_version 347000 (0.0036) [2024-06-23 03:07:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5685264384. Throughput: 0: 42641.3. Samples: 5685373560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:07:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 03:08:01,770][15401] Updated weights for policy 0, policy_version 347010 (0.0040) [2024-06-23 03:08:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5685477376. Throughput: 0: 42582.7. Samples: 5685631420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 03:08:05,656][15401] Updated weights for policy 0, policy_version 347020 (0.0036) [2024-06-23 03:08:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5685657600. Throughput: 0: 42550.2. Samples: 5685761240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:08:09,605][15401] Updated weights for policy 0, policy_version 347030 (0.0038) [2024-06-23 03:08:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5685886976. Throughput: 0: 42619.9. Samples: 5686012140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 03:08:13,527][15401] Updated weights for policy 0, policy_version 347040 (0.0036) [2024-06-23 03:08:17,302][15401] Updated weights for policy 0, policy_version 347050 (0.0040) [2024-06-23 03:08:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5686099968. Throughput: 0: 42615.9. Samples: 5686265820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 03:08:21,122][15401] Updated weights for policy 0, policy_version 347060 (0.0031) [2024-06-23 03:08:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 5686312960. Throughput: 0: 42327.7. Samples: 5686389520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 03:08:25,207][15401] Updated weights for policy 0, policy_version 347070 (0.0050) [2024-06-23 03:08:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5686525952. Throughput: 0: 42497.5. Samples: 5686645480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 03:08:28,905][15401] Updated weights for policy 0, policy_version 347080 (0.0026) [2024-06-23 03:08:33,262][15401] Updated weights for policy 0, policy_version 347090 (0.0024) [2024-06-23 03:08:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5686738944. Throughput: 0: 42487.2. Samples: 5686904700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 03:08:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 03:08:36,533][15401] Updated weights for policy 0, policy_version 347100 (0.0032) [2024-06-23 03:08:37,836][15349] Signal inference workers to stop experience collection... (84250 times) [2024-06-23 03:08:37,868][15401] InferenceWorker_p0-w0: stopping experience collection (84250 times) [2024-06-23 03:08:37,894][15349] Signal inference workers to resume experience collection... (84250 times) [2024-06-23 03:08:37,895][15401] InferenceWorker_p0-w0: resuming experience collection (84250 times) [2024-06-23 03:08:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 5686951936. Throughput: 0: 42363.2. Samples: 5687028240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:08:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 03:08:40,817][15401] Updated weights for policy 0, policy_version 347110 (0.0049) [2024-06-23 03:08:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5687164928. Throughput: 0: 42522.2. Samples: 5687287060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:08:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 03:08:44,291][15401] Updated weights for policy 0, policy_version 347120 (0.0033) [2024-06-23 03:08:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 5687361536. Throughput: 0: 42570.1. Samples: 5687547180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:08:48,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 03:08:48,472][15401] Updated weights for policy 0, policy_version 347130 (0.0040) [2024-06-23 03:08:51,988][15401] Updated weights for policy 0, policy_version 347140 (0.0036) [2024-06-23 03:08:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5687590912. Throughput: 0: 42478.6. Samples: 5687672780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:08:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 03:08:55,817][15401] Updated weights for policy 0, policy_version 347150 (0.0031) [2024-06-23 03:08:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5687803904. Throughput: 0: 42516.6. Samples: 5687925380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:08:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 03:08:59,683][15401] Updated weights for policy 0, policy_version 347160 (0.0027) [2024-06-23 03:09:03,377][15401] Updated weights for policy 0, policy_version 347170 (0.0043) [2024-06-23 03:09:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5688033280. Throughput: 0: 42623.1. Samples: 5688183860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 03:09:07,316][15401] Updated weights for policy 0, policy_version 347180 (0.0037) [2024-06-23 03:09:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5688229888. Throughput: 0: 42691.5. Samples: 5688310640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 03:09:10,962][15401] Updated weights for policy 0, policy_version 347190 (0.0039) [2024-06-23 03:09:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5688442880. Throughput: 0: 42612.3. Samples: 5688563040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 03:09:15,306][15401] Updated weights for policy 0, policy_version 347200 (0.0033) [2024-06-23 03:09:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5688672256. Throughput: 0: 42497.4. Samples: 5688817080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 03:09:18,858][15401] Updated weights for policy 0, policy_version 347210 (0.0037) [2024-06-23 03:09:23,118][15401] Updated weights for policy 0, policy_version 347220 (0.0027) [2024-06-23 03:09:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5688868864. Throughput: 0: 42645.8. Samples: 5688947300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 03:09:26,756][15401] Updated weights for policy 0, policy_version 347230 (0.0034) [2024-06-23 03:09:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5689081856. Throughput: 0: 42615.0. Samples: 5689204740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 03:09:30,651][15401] Updated weights for policy 0, policy_version 347240 (0.0051) [2024-06-23 03:09:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5689311232. Throughput: 0: 42409.4. Samples: 5689455500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 03:09:34,381][15401] Updated weights for policy 0, policy_version 347250 (0.0038) [2024-06-23 03:09:38,264][15401] Updated weights for policy 0, policy_version 347260 (0.0034) [2024-06-23 03:09:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5689507840. Throughput: 0: 42585.8. Samples: 5689589140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 03:09:42,225][15401] Updated weights for policy 0, policy_version 347270 (0.0031) [2024-06-23 03:09:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5689704448. Throughput: 0: 42667.2. Samples: 5689845400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 03:09:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347273_5689720832.pth... [2024-06-23 03:09:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346650_5679513600.pth [2024-06-23 03:09:45,871][15401] Updated weights for policy 0, policy_version 347280 (0.0028) [2024-06-23 03:09:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 5689950208. Throughput: 0: 42444.1. Samples: 5690093840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 03:09:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 03:09:50,040][15401] Updated weights for policy 0, policy_version 347290 (0.0044) [2024-06-23 03:09:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5690146816. Throughput: 0: 42588.5. Samples: 5690227120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:09:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 03:09:53,816][15401] Updated weights for policy 0, policy_version 347300 (0.0036) [2024-06-23 03:09:58,115][15401] Updated weights for policy 0, policy_version 347310 (0.0038) [2024-06-23 03:09:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5690343424. Throughput: 0: 42494.6. Samples: 5690475300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:09:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 03:10:01,458][15401] Updated weights for policy 0, policy_version 347320 (0.0023) [2024-06-23 03:10:02,997][15349] Signal inference workers to stop experience collection... (84300 times) [2024-06-23 03:10:02,997][15349] Signal inference workers to resume experience collection... (84300 times) [2024-06-23 03:10:03,038][15401] InferenceWorker_p0-w0: stopping experience collection (84300 times) [2024-06-23 03:10:03,038][15401] InferenceWorker_p0-w0: resuming experience collection (84300 times) [2024-06-23 03:10:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5690589184. Throughput: 0: 42460.4. Samples: 5690727800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 03:10:05,671][15401] Updated weights for policy 0, policy_version 347330 (0.0025) [2024-06-23 03:10:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5690753024. Throughput: 0: 42484.3. Samples: 5690859100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 03:10:09,095][15401] Updated weights for policy 0, policy_version 347340 (0.0028) [2024-06-23 03:10:13,290][15401] Updated weights for policy 0, policy_version 347350 (0.0032) [2024-06-23 03:10:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5690982400. Throughput: 0: 42322.7. Samples: 5691109260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 03:10:16,779][15401] Updated weights for policy 0, policy_version 347360 (0.0034) [2024-06-23 03:10:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 5691211776. Throughput: 0: 42494.3. Samples: 5691367740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:10:20,970][15401] Updated weights for policy 0, policy_version 347370 (0.0036) [2024-06-23 03:10:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5691408384. Throughput: 0: 42438.6. Samples: 5691498880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:23,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 03:10:24,276][15401] Updated weights for policy 0, policy_version 347380 (0.0037) [2024-06-23 03:10:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 5691621376. Throughput: 0: 42320.5. Samples: 5691749820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 03:10:28,487][15401] Updated weights for policy 0, policy_version 347390 (0.0045) [2024-06-23 03:10:32,417][15401] Updated weights for policy 0, policy_version 347400 (0.0047) [2024-06-23 03:10:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5691834368. Throughput: 0: 42502.7. Samples: 5692006460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 03:10:36,038][15401] Updated weights for policy 0, policy_version 347410 (0.0029) [2024-06-23 03:10:38,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5692047360. Throughput: 0: 42289.7. Samples: 5692130160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 03:10:39,904][15401] Updated weights for policy 0, policy_version 347420 (0.0035) [2024-06-23 03:10:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5692260352. Throughput: 0: 42485.0. Samples: 5692387120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 03:10:43,911][15401] Updated weights for policy 0, policy_version 347430 (0.0041) [2024-06-23 03:10:47,607][15401] Updated weights for policy 0, policy_version 347440 (0.0039) [2024-06-23 03:10:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5692489728. Throughput: 0: 42550.4. Samples: 5692642560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 03:10:51,844][15401] Updated weights for policy 0, policy_version 347450 (0.0028) [2024-06-23 03:10:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5692702720. Throughput: 0: 42552.9. Samples: 5692773980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 03:10:55,413][15401] Updated weights for policy 0, policy_version 347460 (0.0034) [2024-06-23 03:10:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 5692899328. Throughput: 0: 42459.2. Samples: 5693019920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:10:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 03:10:59,626][15401] Updated weights for policy 0, policy_version 347470 (0.0039) [2024-06-23 03:11:02,979][15401] Updated weights for policy 0, policy_version 347480 (0.0032) [2024-06-23 03:11:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 5693112320. Throughput: 0: 42536.5. Samples: 5693281880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:11:03,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 03:11:04,001][15349] Signal inference workers to stop experience collection... (84350 times) [2024-06-23 03:11:04,056][15401] InferenceWorker_p0-w0: stopping experience collection (84350 times) [2024-06-23 03:11:04,059][15349] Signal inference workers to resume experience collection... (84350 times) [2024-06-23 03:11:04,068][15401] InferenceWorker_p0-w0: resuming experience collection (84350 times) [2024-06-23 03:11:07,317][15401] Updated weights for policy 0, policy_version 347490 (0.0035) [2024-06-23 03:11:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5693325312. Throughput: 0: 42556.1. Samples: 5693413900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 03:11:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 03:11:10,463][15401] Updated weights for policy 0, policy_version 347500 (0.0031) [2024-06-23 03:11:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 5693538304. Throughput: 0: 42431.0. Samples: 5693659220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 03:11:14,982][15401] Updated weights for policy 0, policy_version 347510 (0.0042) [2024-06-23 03:11:17,970][15401] Updated weights for policy 0, policy_version 347520 (0.0028) [2024-06-23 03:11:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5693767680. Throughput: 0: 42533.6. Samples: 5693920480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 03:11:22,837][15401] Updated weights for policy 0, policy_version 347530 (0.0028) [2024-06-23 03:11:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5693947904. Throughput: 0: 42694.5. Samples: 5694051420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 03:11:26,068][15401] Updated weights for policy 0, policy_version 347540 (0.0035) [2024-06-23 03:11:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5694193664. Throughput: 0: 42565.3. Samples: 5694302560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:28,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 03:11:30,533][15401] Updated weights for policy 0, policy_version 347550 (0.0036) [2024-06-23 03:11:33,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5694406656. Throughput: 0: 42690.1. Samples: 5694563620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:33,392][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 03:11:33,586][15401] Updated weights for policy 0, policy_version 347560 (0.0049) [2024-06-23 03:11:38,272][15401] Updated weights for policy 0, policy_version 347570 (0.0045) [2024-06-23 03:11:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5694586880. Throughput: 0: 42598.2. Samples: 5694690900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 03:11:41,119][15401] Updated weights for policy 0, policy_version 347580 (0.0032) [2024-06-23 03:11:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5694849024. Throughput: 0: 42845.7. Samples: 5694947980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 03:11:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347587_5694865408.pth... [2024-06-23 03:11:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000346962_5684625408.pth [2024-06-23 03:11:45,801][15401] Updated weights for policy 0, policy_version 347590 (0.0032) [2024-06-23 03:11:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5695045632. Throughput: 0: 42836.2. Samples: 5695209520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 03:11:48,748][15401] Updated weights for policy 0, policy_version 347600 (0.0039) [2024-06-23 03:11:53,379][15401] Updated weights for policy 0, policy_version 347610 (0.0045) [2024-06-23 03:11:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5695242240. Throughput: 0: 42541.2. Samples: 5695328260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:53,395][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:11:56,522][15401] Updated weights for policy 0, policy_version 347620 (0.0042) [2024-06-23 03:11:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5695488000. Throughput: 0: 42695.2. Samples: 5695580500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:11:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:12:00,978][15401] Updated weights for policy 0, policy_version 347630 (0.0029) [2024-06-23 03:12:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5695651840. Throughput: 0: 42902.8. Samples: 5695851100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:12:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 03:12:04,169][15401] Updated weights for policy 0, policy_version 347640 (0.0031) [2024-06-23 03:12:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5695881216. Throughput: 0: 42627.4. Samples: 5695969640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:12:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 03:12:08,461][15401] Updated weights for policy 0, policy_version 347650 (0.0031) [2024-06-23 03:12:11,674][15349] Signal inference workers to stop experience collection... (84400 times) [2024-06-23 03:12:11,676][15349] Signal inference workers to resume experience collection... (84400 times) [2024-06-23 03:12:11,718][15401] InferenceWorker_p0-w0: stopping experience collection (84400 times) [2024-06-23 03:12:11,718][15401] InferenceWorker_p0-w0: resuming experience collection (84400 times) [2024-06-23 03:12:11,869][15401] Updated weights for policy 0, policy_version 347660 (0.0030) [2024-06-23 03:12:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5696126976. Throughput: 0: 42655.1. Samples: 5696222040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:12:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 03:12:16,045][15401] Updated weights for policy 0, policy_version 347670 (0.0036) [2024-06-23 03:12:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 5696290816. Throughput: 0: 42802.8. Samples: 5696489740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:12:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 03:12:19,457][15401] Updated weights for policy 0, policy_version 347680 (0.0038) [2024-06-23 03:12:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5696520192. Throughput: 0: 42547.7. Samples: 5696605540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-23 03:12:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 03:12:23,717][15401] Updated weights for policy 0, policy_version 347690 (0.0032) [2024-06-23 03:12:27,215][15401] Updated weights for policy 0, policy_version 347700 (0.0043) [2024-06-23 03:12:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5696765952. Throughput: 0: 42557.8. Samples: 5696863080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:28,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 03:12:31,296][15401] Updated weights for policy 0, policy_version 347710 (0.0034) [2024-06-23 03:12:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 5696929792. Throughput: 0: 42558.8. Samples: 5697124760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:33,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 03:12:34,826][15401] Updated weights for policy 0, policy_version 347720 (0.0041) [2024-06-23 03:12:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5697159168. Throughput: 0: 42553.0. Samples: 5697243140. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:38,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 03:12:39,762][15401] Updated weights for policy 0, policy_version 347730 (0.0045) [2024-06-23 03:12:42,574][15401] Updated weights for policy 0, policy_version 347740 (0.0031) [2024-06-23 03:12:43,392][15132] Fps is (10 sec: 47513.1, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 5697404928. Throughput: 0: 42764.2. Samples: 5697505000. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:43,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 03:12:47,288][15401] Updated weights for policy 0, policy_version 347750 (0.0041) [2024-06-23 03:12:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5697568768. Throughput: 0: 42564.9. Samples: 5697766520. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 03:12:50,321][15401] Updated weights for policy 0, policy_version 347760 (0.0036) [2024-06-23 03:12:53,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5697798144. Throughput: 0: 42613.6. Samples: 5697887260. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 03:12:54,857][15401] Updated weights for policy 0, policy_version 347770 (0.0041) [2024-06-23 03:12:57,883][15401] Updated weights for policy 0, policy_version 347780 (0.0026) [2024-06-23 03:12:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5698027520. Throughput: 0: 42815.0. Samples: 5698148720. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:12:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 03:13:02,403][15401] Updated weights for policy 0, policy_version 347790 (0.0028) [2024-06-23 03:13:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5698224128. Throughput: 0: 42631.4. Samples: 5698408160. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:03,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 03:13:05,727][15401] Updated weights for policy 0, policy_version 347800 (0.0024) [2024-06-23 03:13:08,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 5698453504. Throughput: 0: 42749.2. Samples: 5698529360. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:08,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 03:13:10,017][15401] Updated weights for policy 0, policy_version 347810 (0.0038) [2024-06-23 03:13:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5698666496. Throughput: 0: 42700.9. Samples: 5698784620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 03:13:13,551][15401] Updated weights for policy 0, policy_version 347820 (0.0034) [2024-06-23 03:13:17,908][15401] Updated weights for policy 0, policy_version 347830 (0.0037) [2024-06-23 03:13:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 5698863104. Throughput: 0: 42751.1. Samples: 5699048460. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:13:20,175][15349] Signal inference workers to stop experience collection... (84450 times) [2024-06-23 03:13:20,200][15401] InferenceWorker_p0-w0: stopping experience collection (84450 times) [2024-06-23 03:13:20,234][15349] Signal inference workers to resume experience collection... (84450 times) [2024-06-23 03:13:20,234][15401] InferenceWorker_p0-w0: resuming experience collection (84450 times) [2024-06-23 03:13:21,134][15401] Updated weights for policy 0, policy_version 347840 (0.0044) [2024-06-23 03:13:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5699108864. Throughput: 0: 42847.9. Samples: 5699171300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 03:13:25,337][15401] Updated weights for policy 0, policy_version 347850 (0.0040) [2024-06-23 03:13:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5699305472. Throughput: 0: 42851.7. Samples: 5699433220. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 03:13:28,685][15401] Updated weights for policy 0, policy_version 347860 (0.0032) [2024-06-23 03:13:32,817][15401] Updated weights for policy 0, policy_version 347870 (0.0031) [2024-06-23 03:13:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 5699502080. Throughput: 0: 42737.3. Samples: 5699689700. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 03:13:36,243][15401] Updated weights for policy 0, policy_version 347880 (0.0024) [2024-06-23 03:13:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5699747840. Throughput: 0: 42871.2. Samples: 5699816460. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 03:13:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 03:13:40,262][15401] Updated weights for policy 0, policy_version 347890 (0.0028) [2024-06-23 03:13:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.2, 300 sec: 42709.8). Total num frames: 5699960832. Throughput: 0: 42926.3. Samples: 5700080400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:13:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 03:13:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347898_5699960832.pth... [2024-06-23 03:13:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347273_5689720832.pth [2024-06-23 03:13:43,764][15401] Updated weights for policy 0, policy_version 347900 (0.0033) [2024-06-23 03:13:47,820][15401] Updated weights for policy 0, policy_version 347910 (0.0032) [2024-06-23 03:13:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5700157440. Throughput: 0: 42858.8. Samples: 5700336800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:13:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 03:13:51,635][15401] Updated weights for policy 0, policy_version 347920 (0.0027) [2024-06-23 03:13:53,396][15132] Fps is (10 sec: 44208.4, 60 sec: 43413.0, 300 sec: 42708.5). Total num frames: 5700403200. Throughput: 0: 42995.7. Samples: 5700464340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:13:53,396][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 03:13:55,425][15401] Updated weights for policy 0, policy_version 347930 (0.0034) [2024-06-23 03:13:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5700599808. Throughput: 0: 43034.2. Samples: 5700721160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:13:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 03:13:59,542][15401] Updated weights for policy 0, policy_version 347940 (0.0038) [2024-06-23 03:14:03,151][15401] Updated weights for policy 0, policy_version 347950 (0.0032) [2024-06-23 03:14:03,390][15132] Fps is (10 sec: 40986.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5700812800. Throughput: 0: 42746.7. Samples: 5700972060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:03,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 03:14:07,382][15401] Updated weights for policy 0, policy_version 347960 (0.0035) [2024-06-23 03:14:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5701025792. Throughput: 0: 42902.7. Samples: 5701101920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 03:14:10,769][15401] Updated weights for policy 0, policy_version 347970 (0.0028) [2024-06-23 03:14:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5701238784. Throughput: 0: 42812.9. Samples: 5701359800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 03:14:14,866][15401] Updated weights for policy 0, policy_version 347980 (0.0037) [2024-06-23 03:14:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5701451776. Throughput: 0: 42856.4. Samples: 5701618240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 03:14:18,410][15401] Updated weights for policy 0, policy_version 347990 (0.0043) [2024-06-23 03:14:22,405][15401] Updated weights for policy 0, policy_version 348000 (0.0041) [2024-06-23 03:14:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5701681152. Throughput: 0: 42934.2. Samples: 5701748500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 03:14:26,155][15401] Updated weights for policy 0, policy_version 348010 (0.0029) [2024-06-23 03:14:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5701894144. Throughput: 0: 42816.4. Samples: 5702007140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:28,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 03:14:29,931][15401] Updated weights for policy 0, policy_version 348020 (0.0033) [2024-06-23 03:14:30,828][15349] Signal inference workers to stop experience collection... (84500 times) [2024-06-23 03:14:30,828][15349] Signal inference workers to resume experience collection... (84500 times) [2024-06-23 03:14:30,868][15401] InferenceWorker_p0-w0: stopping experience collection (84500 times) [2024-06-23 03:14:30,868][15401] InferenceWorker_p0-w0: resuming experience collection (84500 times) [2024-06-23 03:14:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5702090752. Throughput: 0: 42726.0. Samples: 5702259480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 03:14:34,057][15401] Updated weights for policy 0, policy_version 348030 (0.0028) [2024-06-23 03:14:37,560][15401] Updated weights for policy 0, policy_version 348040 (0.0034) [2024-06-23 03:14:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5702303744. Throughput: 0: 42764.4. Samples: 5702388460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 03:14:41,553][15401] Updated weights for policy 0, policy_version 348050 (0.0034) [2024-06-23 03:14:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5702533120. Throughput: 0: 42825.5. Samples: 5702648300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 03:14:44,967][15401] Updated weights for policy 0, policy_version 348060 (0.0046) [2024-06-23 03:14:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5702746112. Throughput: 0: 42850.3. Samples: 5702900320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 03:14:49,096][15401] Updated weights for policy 0, policy_version 348070 (0.0034) [2024-06-23 03:14:52,998][15401] Updated weights for policy 0, policy_version 348080 (0.0043) [2024-06-23 03:14:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 5702942720. Throughput: 0: 42799.2. Samples: 5703027880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 03:14:56,695][15401] Updated weights for policy 0, policy_version 348090 (0.0036) [2024-06-23 03:14:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5703172096. Throughput: 0: 42712.0. Samples: 5703281840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 03:14:58,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 03:15:00,634][15401] Updated weights for policy 0, policy_version 348100 (0.0035) [2024-06-23 03:15:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5703385088. Throughput: 0: 42560.4. Samples: 5703533460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 03:15:04,847][15401] Updated weights for policy 0, policy_version 348110 (0.0035) [2024-06-23 03:15:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5703581696. Throughput: 0: 42618.7. Samples: 5703666340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 03:15:08,411][15401] Updated weights for policy 0, policy_version 348120 (0.0028) [2024-06-23 03:15:12,437][15401] Updated weights for policy 0, policy_version 348130 (0.0025) [2024-06-23 03:15:13,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 5703794688. Throughput: 0: 42539.8. Samples: 5703921700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:13,397][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 03:15:15,952][15401] Updated weights for policy 0, policy_version 348140 (0.0033) [2024-06-23 03:15:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5704024064. Throughput: 0: 42707.5. Samples: 5704181320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 03:15:20,122][15401] Updated weights for policy 0, policy_version 348150 (0.0038) [2024-06-23 03:15:23,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5704220672. Throughput: 0: 42557.7. Samples: 5704303560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 03:15:23,905][15401] Updated weights for policy 0, policy_version 348160 (0.0036) [2024-06-23 03:15:27,579][15401] Updated weights for policy 0, policy_version 348170 (0.0040) [2024-06-23 03:15:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5704433664. Throughput: 0: 42623.9. Samples: 5704566380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 03:15:31,472][15401] Updated weights for policy 0, policy_version 348180 (0.0034) [2024-06-23 03:15:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5704663040. Throughput: 0: 42766.1. Samples: 5704824800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 03:15:35,073][15401] Updated weights for policy 0, policy_version 348190 (0.0048) [2024-06-23 03:15:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5704876032. Throughput: 0: 42700.8. Samples: 5704949420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:38,391][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 03:15:39,062][15401] Updated weights for policy 0, policy_version 348200 (0.0036) [2024-06-23 03:15:43,022][15401] Updated weights for policy 0, policy_version 348210 (0.0045) [2024-06-23 03:15:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5705089024. Throughput: 0: 42828.1. Samples: 5705209100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 03:15:43,485][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348212_5705105408.pth... [2024-06-23 03:15:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347587_5694865408.pth [2024-06-23 03:15:46,543][15401] Updated weights for policy 0, policy_version 348220 (0.0021) [2024-06-23 03:15:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5705285632. Throughput: 0: 43011.7. Samples: 5705468980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:48,397][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 03:15:50,492][15401] Updated weights for policy 0, policy_version 348230 (0.0030) [2024-06-23 03:15:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5705531392. Throughput: 0: 42756.4. Samples: 5705590380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 03:15:54,001][15401] Updated weights for policy 0, policy_version 348240 (0.0042) [2024-06-23 03:15:55,731][15349] Signal inference workers to stop experience collection... (84550 times) [2024-06-23 03:15:55,784][15401] InferenceWorker_p0-w0: stopping experience collection (84550 times) [2024-06-23 03:15:55,785][15349] Signal inference workers to resume experience collection... (84550 times) [2024-06-23 03:15:55,801][15401] InferenceWorker_p0-w0: resuming experience collection (84550 times) [2024-06-23 03:15:57,932][15401] Updated weights for policy 0, policy_version 348250 (0.0040) [2024-06-23 03:15:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5705744384. Throughput: 0: 42929.3. Samples: 5705853240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:15:58,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 03:16:01,862][15401] Updated weights for policy 0, policy_version 348260 (0.0028) [2024-06-23 03:16:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5705940992. Throughput: 0: 43033.4. Samples: 5706117820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:16:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 03:16:05,611][15401] Updated weights for policy 0, policy_version 348270 (0.0032) [2024-06-23 03:16:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5706186752. Throughput: 0: 43112.3. Samples: 5706243620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:16:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 03:16:09,260][15401] Updated weights for policy 0, policy_version 348280 (0.0041) [2024-06-23 03:16:13,140][15401] Updated weights for policy 0, policy_version 348290 (0.0040) [2024-06-23 03:16:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 5706383360. Throughput: 0: 42988.0. Samples: 5706500840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 03:16:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 03:16:17,270][15401] Updated weights for policy 0, policy_version 348300 (0.0042) [2024-06-23 03:16:18,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5706579968. Throughput: 0: 43011.7. Samples: 5706760320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 03:16:20,694][15401] Updated weights for policy 0, policy_version 348310 (0.0048) [2024-06-23 03:16:23,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43412.9, 300 sec: 42819.6). Total num frames: 5706825728. Throughput: 0: 43022.3. Samples: 5706885700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:23,397][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 03:16:24,860][15401] Updated weights for policy 0, policy_version 348320 (0.0037) [2024-06-23 03:16:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5707022336. Throughput: 0: 43011.0. Samples: 5707144600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:28,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 03:16:28,401][15401] Updated weights for policy 0, policy_version 348330 (0.0027) [2024-06-23 03:16:32,402][15401] Updated weights for policy 0, policy_version 348340 (0.0034) [2024-06-23 03:16:33,389][15132] Fps is (10 sec: 37707.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5707202560. Throughput: 0: 42945.3. Samples: 5707401520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 03:16:36,192][15401] Updated weights for policy 0, policy_version 348350 (0.0037) [2024-06-23 03:16:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5707464704. Throughput: 0: 42924.8. Samples: 5707522000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 03:16:40,154][15401] Updated weights for policy 0, policy_version 348360 (0.0029) [2024-06-23 03:16:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5707644928. Throughput: 0: 42870.6. Samples: 5707782420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 03:16:43,777][15401] Updated weights for policy 0, policy_version 348370 (0.0033) [2024-06-23 03:16:47,885][15401] Updated weights for policy 0, policy_version 348380 (0.0028) [2024-06-23 03:16:48,395][15132] Fps is (10 sec: 39298.7, 60 sec: 42867.2, 300 sec: 42764.2). Total num frames: 5707857920. Throughput: 0: 42486.0. Samples: 5708029940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:48,396][15132] Avg episode reward: [(0, '0.221')] [2024-06-23 03:16:51,451][15401] Updated weights for policy 0, policy_version 348390 (0.0033) [2024-06-23 03:16:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5708070912. Throughput: 0: 42496.5. Samples: 5708155960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:53,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-23 03:16:55,725][15401] Updated weights for policy 0, policy_version 348400 (0.0029) [2024-06-23 03:16:58,389][15132] Fps is (10 sec: 40984.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5708267520. Throughput: 0: 42595.6. Samples: 5708417640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:16:58,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 03:16:59,491][15401] Updated weights for policy 0, policy_version 348410 (0.0036) [2024-06-23 03:17:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5708480512. Throughput: 0: 42315.8. Samples: 5708664540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:03,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 03:17:03,925][15401] Updated weights for policy 0, policy_version 348420 (0.0054) [2024-06-23 03:17:07,116][15401] Updated weights for policy 0, policy_version 348430 (0.0036) [2024-06-23 03:17:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5708726272. Throughput: 0: 42330.9. Samples: 5708790320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 03:17:11,926][15401] Updated weights for policy 0, policy_version 348440 (0.0026) [2024-06-23 03:17:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 5708906496. Throughput: 0: 42248.9. Samples: 5709045800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 03:17:14,781][15401] Updated weights for policy 0, policy_version 348450 (0.0035) [2024-06-23 03:17:15,703][15349] Signal inference workers to stop experience collection... (84600 times) [2024-06-23 03:17:15,704][15349] Signal inference workers to resume experience collection... (84600 times) [2024-06-23 03:17:15,720][15401] InferenceWorker_p0-w0: stopping experience collection (84600 times) [2024-06-23 03:17:15,721][15401] InferenceWorker_p0-w0: resuming experience collection (84600 times) [2024-06-23 03:17:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5709135872. Throughput: 0: 42076.4. Samples: 5709294960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 03:17:19,669][15401] Updated weights for policy 0, policy_version 348460 (0.0036) [2024-06-23 03:17:22,389][15401] Updated weights for policy 0, policy_version 348470 (0.0038) [2024-06-23 03:17:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 5709365248. Throughput: 0: 42353.0. Samples: 5709427880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 03:17:27,265][15401] Updated weights for policy 0, policy_version 348480 (0.0027) [2024-06-23 03:17:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 5709561856. Throughput: 0: 42430.7. Samples: 5709691800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 03:17:29,905][15401] Updated weights for policy 0, policy_version 348490 (0.0030) [2024-06-23 03:17:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5709774848. Throughput: 0: 42522.3. Samples: 5709943200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 03:17:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 03:17:34,926][15401] Updated weights for policy 0, policy_version 348500 (0.0034) [2024-06-23 03:17:37,768][15401] Updated weights for policy 0, policy_version 348510 (0.0031) [2024-06-23 03:17:38,392][15132] Fps is (10 sec: 44223.7, 60 sec: 42323.3, 300 sec: 42709.4). Total num frames: 5710004224. Throughput: 0: 42545.3. Samples: 5710070620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:17:38,393][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 03:17:42,632][15401] Updated weights for policy 0, policy_version 348520 (0.0041) [2024-06-23 03:17:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5710200832. Throughput: 0: 42598.6. Samples: 5710334580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:17:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 03:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348523_5710200832.pth... [2024-06-23 03:17:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000347898_5699960832.pth [2024-06-23 03:17:45,467][15401] Updated weights for policy 0, policy_version 348530 (0.0038) [2024-06-23 03:17:48,390][15132] Fps is (10 sec: 40972.0, 60 sec: 42602.6, 300 sec: 42765.0). Total num frames: 5710413824. Throughput: 0: 42520.5. Samples: 5710577960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:17:48,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 03:17:50,174][15401] Updated weights for policy 0, policy_version 348540 (0.0043) [2024-06-23 03:17:53,042][15401] Updated weights for policy 0, policy_version 348550 (0.0034) [2024-06-23 03:17:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5710643200. Throughput: 0: 42636.9. Samples: 5710708980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:17:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 03:17:57,779][15401] Updated weights for policy 0, policy_version 348560 (0.0031) [2024-06-23 03:17:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5710839808. Throughput: 0: 42738.3. Samples: 5710969020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:17:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 03:18:00,910][15401] Updated weights for policy 0, policy_version 348570 (0.0036) [2024-06-23 03:18:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 5711052800. Throughput: 0: 42533.4. Samples: 5711208960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:03,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 03:18:05,333][15401] Updated weights for policy 0, policy_version 348580 (0.0031) [2024-06-23 03:18:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5711282176. Throughput: 0: 42563.2. Samples: 5711343220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 03:18:08,435][15401] Updated weights for policy 0, policy_version 348590 (0.0044) [2024-06-23 03:18:12,990][15401] Updated weights for policy 0, policy_version 348600 (0.0031) [2024-06-23 03:18:13,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5711462400. Throughput: 0: 42404.8. Samples: 5711600120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:13,392][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 03:18:16,271][15401] Updated weights for policy 0, policy_version 348610 (0.0043) [2024-06-23 03:18:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5711691776. Throughput: 0: 42465.4. Samples: 5711854140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 03:18:20,658][15401] Updated weights for policy 0, policy_version 348620 (0.0029) [2024-06-23 03:18:23,396][15132] Fps is (10 sec: 45857.1, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 5711921152. Throughput: 0: 42613.6. Samples: 5711988380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:23,396][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 03:18:23,851][15401] Updated weights for policy 0, policy_version 348630 (0.0032) [2024-06-23 03:18:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5712101376. Throughput: 0: 42433.0. Samples: 5712244060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 03:18:28,471][15401] Updated weights for policy 0, policy_version 348640 (0.0030) [2024-06-23 03:18:31,341][15401] Updated weights for policy 0, policy_version 348650 (0.0039) [2024-06-23 03:18:33,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 5712330752. Throughput: 0: 42670.4. Samples: 5712498120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:18:35,574][15349] Signal inference workers to stop experience collection... (84650 times) [2024-06-23 03:18:35,582][15349] Signal inference workers to resume experience collection... (84650 times) [2024-06-23 03:18:35,597][15401] InferenceWorker_p0-w0: stopping experience collection (84650 times) [2024-06-23 03:18:35,597][15401] InferenceWorker_p0-w0: resuming experience collection (84650 times) [2024-06-23 03:18:35,944][15401] Updated weights for policy 0, policy_version 348660 (0.0034) [2024-06-23 03:18:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.5, 300 sec: 42653.9). Total num frames: 5712543744. Throughput: 0: 42642.8. Samples: 5712627900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 03:18:39,400][15401] Updated weights for policy 0, policy_version 348670 (0.0029) [2024-06-23 03:18:43,379][15401] Updated weights for policy 0, policy_version 348680 (0.0039) [2024-06-23 03:18:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5712773120. Throughput: 0: 42587.0. Samples: 5712885440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 03:18:47,130][15401] Updated weights for policy 0, policy_version 348690 (0.0039) [2024-06-23 03:18:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 5712969728. Throughput: 0: 42838.3. Samples: 5713136680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 03:18:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 03:18:50,966][15401] Updated weights for policy 0, policy_version 348700 (0.0029) [2024-06-23 03:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5713182720. Throughput: 0: 42711.2. Samples: 5713265220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:18:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 03:18:54,917][15401] Updated weights for policy 0, policy_version 348710 (0.0030) [2024-06-23 03:18:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5713395712. Throughput: 0: 42670.7. Samples: 5713520200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:18:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 03:18:58,605][15401] Updated weights for policy 0, policy_version 348720 (0.0026) [2024-06-23 03:19:02,392][15401] Updated weights for policy 0, policy_version 348730 (0.0041) [2024-06-23 03:19:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5713625088. Throughput: 0: 42830.7. Samples: 5713781520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:03,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 03:19:06,228][15401] Updated weights for policy 0, policy_version 348740 (0.0032) [2024-06-23 03:19:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5713838080. Throughput: 0: 42729.5. Samples: 5713910940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 03:19:10,171][15401] Updated weights for policy 0, policy_version 348750 (0.0047) [2024-06-23 03:19:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42709.4). Total num frames: 5714051072. Throughput: 0: 42820.2. Samples: 5714170980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 03:19:14,026][15401] Updated weights for policy 0, policy_version 348760 (0.0029) [2024-06-23 03:19:17,824][15401] Updated weights for policy 0, policy_version 348770 (0.0049) [2024-06-23 03:19:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5714264064. Throughput: 0: 42756.3. Samples: 5714422160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 03:19:22,015][15401] Updated weights for policy 0, policy_version 348780 (0.0042) [2024-06-23 03:19:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 5714460672. Throughput: 0: 42775.4. Samples: 5714552800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 03:19:25,413][15401] Updated weights for policy 0, policy_version 348790 (0.0041) [2024-06-23 03:19:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5714673664. Throughput: 0: 42781.8. Samples: 5714810620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 03:19:29,471][15401] Updated weights for policy 0, policy_version 348800 (0.0032) [2024-06-23 03:19:32,918][15401] Updated weights for policy 0, policy_version 348810 (0.0032) [2024-06-23 03:19:33,390][15132] Fps is (10 sec: 45873.2, 60 sec: 43144.0, 300 sec: 42764.9). Total num frames: 5714919424. Throughput: 0: 42826.9. Samples: 5715063920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:33,391][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 03:19:37,322][15401] Updated weights for policy 0, policy_version 348820 (0.0043) [2024-06-23 03:19:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 5715116032. Throughput: 0: 42972.3. Samples: 5715199080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:38,392][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 03:19:40,602][15401] Updated weights for policy 0, policy_version 348830 (0.0033) [2024-06-23 03:19:43,389][15132] Fps is (10 sec: 39324.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5715312640. Throughput: 0: 42885.9. Samples: 5715450060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 03:19:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348836_5715329024.pth... [2024-06-23 03:19:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348212_5705105408.pth [2024-06-23 03:19:44,954][15401] Updated weights for policy 0, policy_version 348840 (0.0038) [2024-06-23 03:19:48,237][15401] Updated weights for policy 0, policy_version 348850 (0.0045) [2024-06-23 03:19:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5715558400. Throughput: 0: 42663.1. Samples: 5715701360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 03:19:52,842][15401] Updated weights for policy 0, policy_version 348860 (0.0033) [2024-06-23 03:19:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5715738624. Throughput: 0: 42662.9. Samples: 5715830760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:53,396][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 03:19:53,796][15349] Signal inference workers to stop experience collection... (84700 times) [2024-06-23 03:19:53,797][15349] Signal inference workers to resume experience collection... (84700 times) [2024-06-23 03:19:53,840][15401] InferenceWorker_p0-w0: stopping experience collection (84700 times) [2024-06-23 03:19:53,840][15401] InferenceWorker_p0-w0: resuming experience collection (84700 times) [2024-06-23 03:19:55,850][15401] Updated weights for policy 0, policy_version 348870 (0.0040) [2024-06-23 03:19:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5715968000. Throughput: 0: 42526.5. Samples: 5716084660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:19:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 03:20:00,562][15401] Updated weights for policy 0, policy_version 348880 (0.0041) [2024-06-23 03:20:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5716180992. Throughput: 0: 42731.1. Samples: 5716345060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 03:20:03,547][15401] Updated weights for policy 0, policy_version 348890 (0.0042) [2024-06-23 03:20:08,098][15401] Updated weights for policy 0, policy_version 348900 (0.0028) [2024-06-23 03:20:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42654.9). Total num frames: 5716377600. Throughput: 0: 42534.9. Samples: 5716466860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:08,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 03:20:11,408][15401] Updated weights for policy 0, policy_version 348910 (0.0035) [2024-06-23 03:20:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5716623360. Throughput: 0: 42533.8. Samples: 5716724640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 03:20:15,622][15401] Updated weights for policy 0, policy_version 348920 (0.0026) [2024-06-23 03:20:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5716819968. Throughput: 0: 42714.8. Samples: 5716986060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 03:20:19,135][15401] Updated weights for policy 0, policy_version 348930 (0.0028) [2024-06-23 03:20:23,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5717000192. Throughput: 0: 42451.9. Samples: 5717109320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 03:20:23,599][15401] Updated weights for policy 0, policy_version 348940 (0.0032) [2024-06-23 03:20:26,861][15401] Updated weights for policy 0, policy_version 348950 (0.0043) [2024-06-23 03:20:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5717262336. Throughput: 0: 42628.5. Samples: 5717368340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:28,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 03:20:31,143][15401] Updated weights for policy 0, policy_version 348960 (0.0039) [2024-06-23 03:20:33,391][15132] Fps is (10 sec: 45869.5, 60 sec: 42324.8, 300 sec: 42653.7). Total num frames: 5717458944. Throughput: 0: 42741.8. Samples: 5717624800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:33,392][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 03:20:34,749][15401] Updated weights for policy 0, policy_version 348970 (0.0030) [2024-06-23 03:20:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5717655552. Throughput: 0: 42597.2. Samples: 5717747640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 03:20:38,678][15401] Updated weights for policy 0, policy_version 348980 (0.0028) [2024-06-23 03:20:42,637][15401] Updated weights for policy 0, policy_version 348990 (0.0048) [2024-06-23 03:20:43,392][15132] Fps is (10 sec: 44232.0, 60 sec: 43142.7, 300 sec: 42764.6). Total num frames: 5717901312. Throughput: 0: 42713.1. Samples: 5718006860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:43,393][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 03:20:46,843][15401] Updated weights for policy 0, policy_version 349000 (0.0023) [2024-06-23 03:20:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5718097920. Throughput: 0: 42619.5. Samples: 5718262940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 03:20:50,080][15401] Updated weights for policy 0, policy_version 349010 (0.0039) [2024-06-23 03:20:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5718310912. Throughput: 0: 42775.0. Samples: 5718391740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 03:20:54,345][15401] Updated weights for policy 0, policy_version 349020 (0.0034) [2024-06-23 03:20:57,828][15401] Updated weights for policy 0, policy_version 349030 (0.0032) [2024-06-23 03:20:58,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 5718523904. Throughput: 0: 42812.1. Samples: 5718651460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:20:58,397][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 03:21:01,800][15401] Updated weights for policy 0, policy_version 349040 (0.0038) [2024-06-23 03:21:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5718753280. Throughput: 0: 42585.3. Samples: 5718902400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:21:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 03:21:05,299][15401] Updated weights for policy 0, policy_version 349050 (0.0023) [2024-06-23 03:21:08,390][15132] Fps is (10 sec: 44265.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5718966272. Throughput: 0: 42753.4. Samples: 5719033220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:21:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 03:21:09,486][15401] Updated weights for policy 0, policy_version 349060 (0.0037) [2024-06-23 03:21:13,027][15401] Updated weights for policy 0, policy_version 349070 (0.0029) [2024-06-23 03:21:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5719162880. Throughput: 0: 42778.2. Samples: 5719293360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:21:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 03:21:17,163][15401] Updated weights for policy 0, policy_version 349080 (0.0026) [2024-06-23 03:21:17,606][15349] Signal inference workers to stop experience collection... (84750 times) [2024-06-23 03:21:17,656][15401] InferenceWorker_p0-w0: stopping experience collection (84750 times) [2024-06-23 03:21:17,723][15349] Signal inference workers to resume experience collection... (84750 times) [2024-06-23 03:21:17,723][15401] InferenceWorker_p0-w0: resuming experience collection (84750 times) [2024-06-23 03:21:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 5719375872. Throughput: 0: 42749.8. Samples: 5719548480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:21:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 03:21:20,558][15401] Updated weights for policy 0, policy_version 349090 (0.0027) [2024-06-23 03:21:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 5719588864. Throughput: 0: 42869.9. Samples: 5719676780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 03:21:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 03:21:24,730][15401] Updated weights for policy 0, policy_version 349100 (0.0035) [2024-06-23 03:21:28,222][15401] Updated weights for policy 0, policy_version 349110 (0.0038) [2024-06-23 03:21:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5719818240. Throughput: 0: 42809.4. Samples: 5719933180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 03:21:32,474][15401] Updated weights for policy 0, policy_version 349120 (0.0036) [2024-06-23 03:21:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42872.4, 300 sec: 42598.4). Total num frames: 5720031232. Throughput: 0: 42729.7. Samples: 5720185780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 03:21:36,184][15401] Updated weights for policy 0, policy_version 349130 (0.0038) [2024-06-23 03:21:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5720227840. Throughput: 0: 42797.8. Samples: 5720317640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:38,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 03:21:40,035][15401] Updated weights for policy 0, policy_version 349140 (0.0022) [2024-06-23 03:21:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42654.8). Total num frames: 5720440832. Throughput: 0: 42784.4. Samples: 5720576480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 03:21:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349149_5720457216.pth... [2024-06-23 03:21:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348523_5710200832.pth [2024-06-23 03:21:43,889][15401] Updated weights for policy 0, policy_version 349150 (0.0033) [2024-06-23 03:21:47,536][15401] Updated weights for policy 0, policy_version 349160 (0.0040) [2024-06-23 03:21:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5720686592. Throughput: 0: 42825.7. Samples: 5720829560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:48,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 03:21:51,441][15401] Updated weights for policy 0, policy_version 349170 (0.0024) [2024-06-23 03:21:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5720883200. Throughput: 0: 42874.7. Samples: 5720962580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 03:21:55,116][15401] Updated weights for policy 0, policy_version 349180 (0.0035) [2024-06-23 03:21:58,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42871.5, 300 sec: 42764.1). Total num frames: 5721096192. Throughput: 0: 42640.1. Samples: 5721212440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:21:58,396][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 03:21:58,934][15401] Updated weights for policy 0, policy_version 349190 (0.0038) [2024-06-23 03:22:03,046][15401] Updated weights for policy 0, policy_version 349200 (0.0041) [2024-06-23 03:22:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 5721309184. Throughput: 0: 42718.5. Samples: 5721470920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:03,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 03:22:06,460][15401] Updated weights for policy 0, policy_version 349210 (0.0032) [2024-06-23 03:22:08,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5721505792. Throughput: 0: 42716.9. Samples: 5721599040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 03:22:10,525][15401] Updated weights for policy 0, policy_version 349220 (0.0040) [2024-06-23 03:22:13,390][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5721718784. Throughput: 0: 42527.1. Samples: 5721846900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 03:22:14,079][15401] Updated weights for policy 0, policy_version 349230 (0.0024) [2024-06-23 03:22:18,145][15401] Updated weights for policy 0, policy_version 349240 (0.0027) [2024-06-23 03:22:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5721948160. Throughput: 0: 42799.2. Samples: 5722111740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 03:22:21,628][15401] Updated weights for policy 0, policy_version 349250 (0.0042) [2024-06-23 03:22:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5722161152. Throughput: 0: 42696.1. Samples: 5722238960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 03:22:25,929][15401] Updated weights for policy 0, policy_version 349260 (0.0041) [2024-06-23 03:22:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5722374144. Throughput: 0: 42658.1. Samples: 5722496100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 03:22:29,703][15401] Updated weights for policy 0, policy_version 349270 (0.0035) [2024-06-23 03:22:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.4). Total num frames: 5722587136. Throughput: 0: 42854.2. Samples: 5722758000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:33,394][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 03:22:33,631][15401] Updated weights for policy 0, policy_version 349280 (0.0031) [2024-06-23 03:22:34,980][15349] Signal inference workers to stop experience collection... (84800 times) [2024-06-23 03:22:34,980][15349] Signal inference workers to resume experience collection... (84800 times) [2024-06-23 03:22:34,998][15401] InferenceWorker_p0-w0: stopping experience collection (84800 times) [2024-06-23 03:22:34,998][15401] InferenceWorker_p0-w0: resuming experience collection (84800 times) [2024-06-23 03:22:37,195][15401] Updated weights for policy 0, policy_version 349290 (0.0032) [2024-06-23 03:22:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5722800128. Throughput: 0: 42835.1. Samples: 5722890160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 03:22:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 03:22:41,207][15401] Updated weights for policy 0, policy_version 349300 (0.0046) [2024-06-23 03:22:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5723013120. Throughput: 0: 42967.5. Samples: 5723145700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:22:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 03:22:44,805][15401] Updated weights for policy 0, policy_version 349310 (0.0038) [2024-06-23 03:22:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5723226112. Throughput: 0: 42783.3. Samples: 5723396060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:22:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 03:22:49,006][15401] Updated weights for policy 0, policy_version 349320 (0.0038) [2024-06-23 03:22:52,299][15401] Updated weights for policy 0, policy_version 349330 (0.0032) [2024-06-23 03:22:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5723439104. Throughput: 0: 42820.3. Samples: 5723525960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:22:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 03:22:56,526][15401] Updated weights for policy 0, policy_version 349340 (0.0033) [2024-06-23 03:22:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 5723652096. Throughput: 0: 43065.2. Samples: 5723784840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:22:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 03:23:00,199][15401] Updated weights for policy 0, policy_version 349350 (0.0045) [2024-06-23 03:23:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5723881472. Throughput: 0: 42817.7. Samples: 5724038540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 03:23:04,242][15401] Updated weights for policy 0, policy_version 349360 (0.0034) [2024-06-23 03:23:07,790][15401] Updated weights for policy 0, policy_version 349370 (0.0031) [2024-06-23 03:23:08,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.7, 300 sec: 42820.6). Total num frames: 5724094464. Throughput: 0: 42899.9. Samples: 5724169560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:08,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 03:23:11,914][15401] Updated weights for policy 0, policy_version 349380 (0.0037) [2024-06-23 03:23:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5724291072. Throughput: 0: 42882.7. Samples: 5724425820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 03:23:16,010][15401] Updated weights for policy 0, policy_version 349390 (0.0039) [2024-06-23 03:23:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 5724520448. Throughput: 0: 42618.3. Samples: 5724675820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 03:23:19,525][15401] Updated weights for policy 0, policy_version 349400 (0.0030) [2024-06-23 03:23:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5724717056. Throughput: 0: 42534.7. Samples: 5724804220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 03:23:23,436][15401] Updated weights for policy 0, policy_version 349410 (0.0030) [2024-06-23 03:23:27,137][15401] Updated weights for policy 0, policy_version 349420 (0.0025) [2024-06-23 03:23:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5724930048. Throughput: 0: 42631.2. Samples: 5725064100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 03:23:31,073][15401] Updated weights for policy 0, policy_version 349430 (0.0041) [2024-06-23 03:23:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5725143040. Throughput: 0: 42786.1. Samples: 5725321540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:33,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 03:23:34,951][15401] Updated weights for policy 0, policy_version 349440 (0.0024) [2024-06-23 03:23:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5725356032. Throughput: 0: 42831.7. Samples: 5725453380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 03:23:38,567][15401] Updated weights for policy 0, policy_version 349450 (0.0033) [2024-06-23 03:23:42,618][15401] Updated weights for policy 0, policy_version 349460 (0.0025) [2024-06-23 03:23:43,393][15132] Fps is (10 sec: 44232.6, 60 sec: 42869.1, 300 sec: 42764.5). Total num frames: 5725585408. Throughput: 0: 42786.7. Samples: 5725710380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:43,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 03:23:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349462_5725585408.pth... [2024-06-23 03:23:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000348836_5715329024.pth [2024-06-23 03:23:46,198][15401] Updated weights for policy 0, policy_version 349470 (0.0038) [2024-06-23 03:23:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5725798400. Throughput: 0: 42671.1. Samples: 5725958740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:48,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 03:23:50,365][15401] Updated weights for policy 0, policy_version 349480 (0.0035) [2024-06-23 03:23:53,389][15132] Fps is (10 sec: 40974.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5725995008. Throughput: 0: 42608.6. Samples: 5726086840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 03:23:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 03:23:54,104][15401] Updated weights for policy 0, policy_version 349490 (0.0032) [2024-06-23 03:23:57,931][15401] Updated weights for policy 0, policy_version 349500 (0.0033) [2024-06-23 03:23:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5726224384. Throughput: 0: 42714.7. Samples: 5726347980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:23:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 03:24:01,802][15401] Updated weights for policy 0, policy_version 349510 (0.0033) [2024-06-23 03:24:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5726437376. Throughput: 0: 42679.5. Samples: 5726596400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 03:24:05,397][15401] Updated weights for policy 0, policy_version 349520 (0.0034) [2024-06-23 03:24:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 5726633984. Throughput: 0: 42756.8. Samples: 5726728280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:08,392][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 03:24:09,395][15401] Updated weights for policy 0, policy_version 349530 (0.0033) [2024-06-23 03:24:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5726846976. Throughput: 0: 42602.2. Samples: 5726981200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 03:24:13,617][15401] Updated weights for policy 0, policy_version 349540 (0.0043) [2024-06-23 03:24:16,686][15349] Signal inference workers to stop experience collection... (84850 times) [2024-06-23 03:24:16,686][15349] Signal inference workers to resume experience collection... (84850 times) [2024-06-23 03:24:16,729][15401] InferenceWorker_p0-w0: stopping experience collection (84850 times) [2024-06-23 03:24:16,729][15401] InferenceWorker_p0-w0: resuming experience collection (84850 times) [2024-06-23 03:24:17,015][15401] Updated weights for policy 0, policy_version 349550 (0.0039) [2024-06-23 03:24:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5727076352. Throughput: 0: 42510.3. Samples: 5727234400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 03:24:21,190][15401] Updated weights for policy 0, policy_version 349560 (0.0037) [2024-06-23 03:24:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5727272960. Throughput: 0: 42410.4. Samples: 5727361860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 03:24:25,027][15401] Updated weights for policy 0, policy_version 349570 (0.0037) [2024-06-23 03:24:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.5). Total num frames: 5727485952. Throughput: 0: 42367.1. Samples: 5727616760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 03:24:28,867][15401] Updated weights for policy 0, policy_version 349580 (0.0033) [2024-06-23 03:24:32,528][15401] Updated weights for policy 0, policy_version 349590 (0.0031) [2024-06-23 03:24:33,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43146.2, 300 sec: 42765.4). Total num frames: 5727731712. Throughput: 0: 42654.2. Samples: 5727878180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 03:24:36,527][15401] Updated weights for policy 0, policy_version 349600 (0.0037) [2024-06-23 03:24:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5727928320. Throughput: 0: 42756.8. Samples: 5728010900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:38,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 03:24:40,539][15401] Updated weights for policy 0, policy_version 349610 (0.0027) [2024-06-23 03:24:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.7, 300 sec: 42598.4). Total num frames: 5728124928. Throughput: 0: 42522.2. Samples: 5728261480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 03:24:44,072][15401] Updated weights for policy 0, policy_version 349620 (0.0033) [2024-06-23 03:24:48,147][15401] Updated weights for policy 0, policy_version 349630 (0.0034) [2024-06-23 03:24:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5728354304. Throughput: 0: 42745.7. Samples: 5728519960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 03:24:51,579][15401] Updated weights for policy 0, policy_version 349640 (0.0029) [2024-06-23 03:24:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5728567296. Throughput: 0: 42680.0. Samples: 5728648880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 03:24:55,692][15401] Updated weights for policy 0, policy_version 349650 (0.0036) [2024-06-23 03:24:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5728780288. Throughput: 0: 42748.4. Samples: 5728904880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:24:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 03:24:59,232][15401] Updated weights for policy 0, policy_version 349660 (0.0038) [2024-06-23 03:25:03,315][15401] Updated weights for policy 0, policy_version 349670 (0.0028) [2024-06-23 03:25:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5728993280. Throughput: 0: 43057.4. Samples: 5729171980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:25:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 03:25:06,683][15401] Updated weights for policy 0, policy_version 349680 (0.0037) [2024-06-23 03:25:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5729222656. Throughput: 0: 42984.5. Samples: 5729296160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:25:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 03:25:10,877][15401] Updated weights for policy 0, policy_version 349690 (0.0040) [2024-06-23 03:25:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5729435648. Throughput: 0: 43078.3. Samples: 5729555280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:25:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 03:25:14,202][15401] Updated weights for policy 0, policy_version 349700 (0.0032) [2024-06-23 03:25:15,242][15349] Signal inference workers to stop experience collection... (84900 times) [2024-06-23 03:25:15,242][15349] Signal inference workers to resume experience collection... (84900 times) [2024-06-23 03:25:15,258][15401] InferenceWorker_p0-w0: stopping experience collection (84900 times) [2024-06-23 03:25:15,259][15401] InferenceWorker_p0-w0: resuming experience collection (84900 times) [2024-06-23 03:25:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5729632256. Throughput: 0: 42961.8. Samples: 5729811460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:18,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 03:25:18,483][15401] Updated weights for policy 0, policy_version 349710 (0.0037) [2024-06-23 03:25:21,973][15401] Updated weights for policy 0, policy_version 349720 (0.0042) [2024-06-23 03:25:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5729878016. Throughput: 0: 42686.6. Samples: 5729931800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 03:25:26,418][15401] Updated weights for policy 0, policy_version 349730 (0.0027) [2024-06-23 03:25:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 5730074624. Throughput: 0: 42916.0. Samples: 5730192700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 03:25:29,563][15401] Updated weights for policy 0, policy_version 349740 (0.0027) [2024-06-23 03:25:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5730271232. Throughput: 0: 42972.6. Samples: 5730453720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 03:25:34,135][15401] Updated weights for policy 0, policy_version 349750 (0.0035) [2024-06-23 03:25:37,260][15401] Updated weights for policy 0, policy_version 349760 (0.0031) [2024-06-23 03:25:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 5730516992. Throughput: 0: 43011.7. Samples: 5730584400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:25:41,617][15401] Updated weights for policy 0, policy_version 349770 (0.0027) [2024-06-23 03:25:43,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5730729984. Throughput: 0: 43145.2. Samples: 5730846420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 03:25:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349776_5730729984.pth... [2024-06-23 03:25:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349149_5720457216.pth [2024-06-23 03:25:44,850][15401] Updated weights for policy 0, policy_version 349780 (0.0045) [2024-06-23 03:25:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5730926592. Throughput: 0: 42882.9. Samples: 5731101720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 03:25:49,186][15401] Updated weights for policy 0, policy_version 349790 (0.0030) [2024-06-23 03:25:52,705][15401] Updated weights for policy 0, policy_version 349800 (0.0038) [2024-06-23 03:25:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42877.0). Total num frames: 5731172352. Throughput: 0: 42853.4. Samples: 5731224560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 03:25:56,781][15401] Updated weights for policy 0, policy_version 349810 (0.0037) [2024-06-23 03:25:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5731368960. Throughput: 0: 42983.5. Samples: 5731489540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:25:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 03:26:00,180][15401] Updated weights for policy 0, policy_version 349820 (0.0028) [2024-06-23 03:26:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5731565568. Throughput: 0: 42964.0. Samples: 5731744840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:03,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 03:26:04,614][15401] Updated weights for policy 0, policy_version 349830 (0.0034) [2024-06-23 03:26:07,746][15401] Updated weights for policy 0, policy_version 349840 (0.0029) [2024-06-23 03:26:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5731794944. Throughput: 0: 42973.0. Samples: 5731865580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 03:26:12,275][15401] Updated weights for policy 0, policy_version 349850 (0.0042) [2024-06-23 03:26:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5732007936. Throughput: 0: 42991.1. Samples: 5732127300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 03:26:15,262][15401] Updated weights for policy 0, policy_version 349860 (0.0042) [2024-06-23 03:26:15,981][15349] Signal inference workers to stop experience collection... (84950 times) [2024-06-23 03:26:15,982][15349] Signal inference workers to resume experience collection... (84950 times) [2024-06-23 03:26:15,995][15401] InferenceWorker_p0-w0: stopping experience collection (84950 times) [2024-06-23 03:26:15,996][15401] InferenceWorker_p0-w0: resuming experience collection (84950 times) [2024-06-23 03:26:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5732204544. Throughput: 0: 42852.3. Samples: 5732382080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:18,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 03:26:19,829][15401] Updated weights for policy 0, policy_version 349870 (0.0033) [2024-06-23 03:26:22,742][15401] Updated weights for policy 0, policy_version 349880 (0.0034) [2024-06-23 03:26:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5732450304. Throughput: 0: 42856.4. Samples: 5732512940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:23,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 03:26:27,363][15401] Updated weights for policy 0, policy_version 349890 (0.0032) [2024-06-23 03:26:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5732646912. Throughput: 0: 42923.6. Samples: 5732777980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 03:26:28,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 03:26:30,347][15401] Updated weights for policy 0, policy_version 349900 (0.0032) [2024-06-23 03:26:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5732843520. Throughput: 0: 42991.6. Samples: 5733036340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:33,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 03:26:34,901][15401] Updated weights for policy 0, policy_version 349910 (0.0034) [2024-06-23 03:26:37,859][15401] Updated weights for policy 0, policy_version 349920 (0.0022) [2024-06-23 03:26:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5733122048. Throughput: 0: 43072.4. Samples: 5733162820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:38,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 03:26:42,277][15401] Updated weights for policy 0, policy_version 349930 (0.0037) [2024-06-23 03:26:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5733302272. Throughput: 0: 43146.3. Samples: 5733431120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:43,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 03:26:45,415][15401] Updated weights for policy 0, policy_version 349940 (0.0030) [2024-06-23 03:26:48,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5733498880. Throughput: 0: 43265.8. Samples: 5733691800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 03:26:49,901][15401] Updated weights for policy 0, policy_version 349950 (0.0028) [2024-06-23 03:26:52,944][15401] Updated weights for policy 0, policy_version 349960 (0.0043) [2024-06-23 03:26:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 5733761024. Throughput: 0: 43238.6. Samples: 5733811320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 03:26:57,427][15401] Updated weights for policy 0, policy_version 349970 (0.0038) [2024-06-23 03:26:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 5733957632. Throughput: 0: 43304.5. Samples: 5734076000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:26:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 03:27:00,548][15401] Updated weights for policy 0, policy_version 349980 (0.0035) [2024-06-23 03:27:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5734154240. Throughput: 0: 43414.2. Samples: 5734335720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 03:27:04,871][15401] Updated weights for policy 0, policy_version 349990 (0.0036) [2024-06-23 03:27:08,255][15401] Updated weights for policy 0, policy_version 350000 (0.0026) [2024-06-23 03:27:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 5734400000. Throughput: 0: 43284.8. Samples: 5734460760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 03:27:12,419][15401] Updated weights for policy 0, policy_version 350010 (0.0029) [2024-06-23 03:27:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5734596608. Throughput: 0: 43106.8. Samples: 5734717780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 03:27:15,756][15401] Updated weights for policy 0, policy_version 350020 (0.0041) [2024-06-23 03:27:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5734793216. Throughput: 0: 43086.3. Samples: 5734975220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 03:27:20,409][15401] Updated weights for policy 0, policy_version 350030 (0.0027) [2024-06-23 03:27:23,222][15401] Updated weights for policy 0, policy_version 350040 (0.0026) [2024-06-23 03:27:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 5735055360. Throughput: 0: 43124.4. Samples: 5735103420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 03:27:27,930][15401] Updated weights for policy 0, policy_version 350050 (0.0033) [2024-06-23 03:27:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5735235584. Throughput: 0: 43015.0. Samples: 5735366800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:28,396][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 03:27:30,781][15349] Signal inference workers to stop experience collection... (85000 times) [2024-06-23 03:27:30,781][15349] Signal inference workers to resume experience collection... (85000 times) [2024-06-23 03:27:30,807][15401] InferenceWorker_p0-w0: stopping experience collection (85000 times) [2024-06-23 03:27:30,808][15401] InferenceWorker_p0-w0: resuming experience collection (85000 times) [2024-06-23 03:27:30,917][15401] Updated weights for policy 0, policy_version 350060 (0.0028) [2024-06-23 03:27:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5735448576. Throughput: 0: 42690.3. Samples: 5735612860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:33,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 03:27:35,535][15401] Updated weights for policy 0, policy_version 350070 (0.0026) [2024-06-23 03:27:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5735677952. Throughput: 0: 42895.2. Samples: 5735741600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:27:38,698][15401] Updated weights for policy 0, policy_version 350080 (0.0034) [2024-06-23 03:27:43,152][15401] Updated weights for policy 0, policy_version 350090 (0.0025) [2024-06-23 03:27:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5735874560. Throughput: 0: 42911.5. Samples: 5736007020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 03:27:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350091_5735890944.pth... [2024-06-23 03:27:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349462_5725585408.pth [2024-06-23 03:27:46,560][15401] Updated weights for policy 0, policy_version 350100 (0.0042) [2024-06-23 03:27:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5736087552. Throughput: 0: 42650.2. Samples: 5736254980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:27:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 03:27:50,729][15401] Updated weights for policy 0, policy_version 350110 (0.0035) [2024-06-23 03:27:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 5736300544. Throughput: 0: 42757.3. Samples: 5736384940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:27:53,393][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 03:27:53,987][15401] Updated weights for policy 0, policy_version 350120 (0.0037) [2024-06-23 03:27:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5736497152. Throughput: 0: 42775.9. Samples: 5736642700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:27:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 03:27:58,812][15401] Updated weights for policy 0, policy_version 350130 (0.0039) [2024-06-23 03:28:01,667][15401] Updated weights for policy 0, policy_version 350140 (0.0035) [2024-06-23 03:28:03,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 5736742912. Throughput: 0: 42646.1. Samples: 5736894300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 03:28:06,530][15401] Updated weights for policy 0, policy_version 350150 (0.0039) [2024-06-23 03:28:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5736955904. Throughput: 0: 42857.8. Samples: 5737032020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 03:28:09,241][15401] Updated weights for policy 0, policy_version 350160 (0.0041) [2024-06-23 03:28:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5737136128. Throughput: 0: 42549.6. Samples: 5737281540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 03:28:14,154][15401] Updated weights for policy 0, policy_version 350170 (0.0031) [2024-06-23 03:28:17,025][15401] Updated weights for policy 0, policy_version 350180 (0.0038) [2024-06-23 03:28:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5737381888. Throughput: 0: 42743.4. Samples: 5737536320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 03:28:21,819][15401] Updated weights for policy 0, policy_version 350190 (0.0034) [2024-06-23 03:28:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42820.5). Total num frames: 5737562112. Throughput: 0: 42818.6. Samples: 5737668440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 03:28:24,910][15401] Updated weights for policy 0, policy_version 350200 (0.0037) [2024-06-23 03:28:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42876.1). Total num frames: 5737791488. Throughput: 0: 42311.1. Samples: 5737911120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:28,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 03:28:29,518][15401] Updated weights for policy 0, policy_version 350210 (0.0043) [2024-06-23 03:28:32,422][15401] Updated weights for policy 0, policy_version 350220 (0.0035) [2024-06-23 03:28:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5738020864. Throughput: 0: 42461.8. Samples: 5738165760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 03:28:36,969][15401] Updated weights for policy 0, policy_version 350230 (0.0028) [2024-06-23 03:28:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.3, 300 sec: 42765.5). Total num frames: 5738201088. Throughput: 0: 42506.3. Samples: 5738297620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 03:28:40,042][15401] Updated weights for policy 0, policy_version 350240 (0.0027) [2024-06-23 03:28:41,979][15349] Signal inference workers to stop experience collection... (85050 times) [2024-06-23 03:28:42,031][15401] InferenceWorker_p0-w0: stopping experience collection (85050 times) [2024-06-23 03:28:42,039][15349] Signal inference workers to resume experience collection... (85050 times) [2024-06-23 03:28:42,048][15401] InferenceWorker_p0-w0: resuming experience collection (85050 times) [2024-06-23 03:28:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5738430464. Throughput: 0: 42676.9. Samples: 5738563160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 03:28:44,444][15401] Updated weights for policy 0, policy_version 350250 (0.0045) [2024-06-23 03:28:47,607][15401] Updated weights for policy 0, policy_version 350260 (0.0031) [2024-06-23 03:28:48,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5738676224. Throughput: 0: 42493.4. Samples: 5738806500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 03:28:52,495][15401] Updated weights for policy 0, policy_version 350270 (0.0028) [2024-06-23 03:28:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 5738840064. Throughput: 0: 42510.8. Samples: 5738945000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 03:28:55,303][15401] Updated weights for policy 0, policy_version 350280 (0.0032) [2024-06-23 03:28:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5739053056. Throughput: 0: 42781.5. Samples: 5739206700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:28:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 03:29:00,035][15401] Updated weights for policy 0, policy_version 350290 (0.0030) [2024-06-23 03:29:03,194][15401] Updated weights for policy 0, policy_version 350300 (0.0032) [2024-06-23 03:29:03,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5739315200. Throughput: 0: 42604.0. Samples: 5739453500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-23 03:29:03,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 03:29:07,929][15401] Updated weights for policy 0, policy_version 350310 (0.0036) [2024-06-23 03:29:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5739495424. Throughput: 0: 42640.9. Samples: 5739587280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 03:29:10,593][15401] Updated weights for policy 0, policy_version 350320 (0.0037) [2024-06-23 03:29:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5739708416. Throughput: 0: 42947.2. Samples: 5739843640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 03:29:15,752][15401] Updated weights for policy 0, policy_version 350330 (0.0033) [2024-06-23 03:29:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 5739954176. Throughput: 0: 42709.4. Samples: 5740087680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 03:29:18,427][15401] Updated weights for policy 0, policy_version 350340 (0.0026) [2024-06-23 03:29:23,183][15401] Updated weights for policy 0, policy_version 350350 (0.0035) [2024-06-23 03:29:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5740134400. Throughput: 0: 42807.6. Samples: 5740223960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:23,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 03:29:26,650][15401] Updated weights for policy 0, policy_version 350360 (0.0027) [2024-06-23 03:29:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5740347392. Throughput: 0: 42484.0. Samples: 5740474940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 03:29:30,852][15401] Updated weights for policy 0, policy_version 350370 (0.0037) [2024-06-23 03:29:33,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 5740593152. Throughput: 0: 42693.3. Samples: 5740727800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:33,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 03:29:34,493][15401] Updated weights for policy 0, policy_version 350380 (0.0037) [2024-06-23 03:29:38,283][15401] Updated weights for policy 0, policy_version 350390 (0.0032) [2024-06-23 03:29:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5740789760. Throughput: 0: 42618.1. Samples: 5740862820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:38,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 03:29:42,303][15401] Updated weights for policy 0, policy_version 350400 (0.0040) [2024-06-23 03:29:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5740986368. Throughput: 0: 42449.8. Samples: 5741116940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:43,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 03:29:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350402_5740986368.pth... [2024-06-23 03:29:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000349776_5730729984.pth [2024-06-23 03:29:45,890][15401] Updated weights for policy 0, policy_version 350410 (0.0042) [2024-06-23 03:29:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 5741215744. Throughput: 0: 42467.1. Samples: 5741364620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:48,392][15132] Avg episode reward: [(0, '0.019')] [2024-06-23 03:29:50,006][15401] Updated weights for policy 0, policy_version 350420 (0.0038) [2024-06-23 03:29:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5741412352. Throughput: 0: 42461.7. Samples: 5741498060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:53,390][15132] Avg episode reward: [(0, '0.019')] [2024-06-23 03:29:53,863][15401] Updated weights for policy 0, policy_version 350430 (0.0042) [2024-06-23 03:29:55,399][15349] Signal inference workers to stop experience collection... (85100 times) [2024-06-23 03:29:55,440][15401] InferenceWorker_p0-w0: stopping experience collection (85100 times) [2024-06-23 03:29:55,522][15349] Signal inference workers to resume experience collection... (85100 times) [2024-06-23 03:29:55,523][15401] InferenceWorker_p0-w0: resuming experience collection (85100 times) [2024-06-23 03:29:57,755][15401] Updated weights for policy 0, policy_version 350440 (0.0028) [2024-06-23 03:29:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5741625344. Throughput: 0: 42204.4. Samples: 5741742840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:29:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 03:30:01,508][15401] Updated weights for policy 0, policy_version 350450 (0.0039) [2024-06-23 03:30:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5741854720. Throughput: 0: 42582.6. Samples: 5742003900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:30:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 03:30:05,353][15401] Updated weights for policy 0, policy_version 350460 (0.0028) [2024-06-23 03:30:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5742034944. Throughput: 0: 42377.7. Samples: 5742130960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:30:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 03:30:09,255][15401] Updated weights for policy 0, policy_version 350470 (0.0043) [2024-06-23 03:30:12,986][15401] Updated weights for policy 0, policy_version 350480 (0.0030) [2024-06-23 03:30:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5742264320. Throughput: 0: 42340.9. Samples: 5742380280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:30:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 03:30:17,124][15401] Updated weights for policy 0, policy_version 350490 (0.0038) [2024-06-23 03:30:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5742493696. Throughput: 0: 42325.3. Samples: 5742632340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:30:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 03:30:20,599][15401] Updated weights for policy 0, policy_version 350500 (0.0035) [2024-06-23 03:30:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5742673920. Throughput: 0: 42164.8. Samples: 5742760240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-23 03:30:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 03:30:24,682][15401] Updated weights for policy 0, policy_version 350510 (0.0042) [2024-06-23 03:30:28,331][15401] Updated weights for policy 0, policy_version 350520 (0.0039) [2024-06-23 03:30:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5742919680. Throughput: 0: 42266.6. Samples: 5743018940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 03:30:32,512][15401] Updated weights for policy 0, policy_version 350530 (0.0044) [2024-06-23 03:30:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 5743132672. Throughput: 0: 42376.4. Samples: 5743271460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 03:30:35,973][15401] Updated weights for policy 0, policy_version 350540 (0.0029) [2024-06-23 03:30:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5743296512. Throughput: 0: 42271.6. Samples: 5743400280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 03:30:40,153][15401] Updated weights for policy 0, policy_version 350550 (0.0048) [2024-06-23 03:30:43,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 5743542272. Throughput: 0: 42478.6. Samples: 5743654480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:43,393][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 03:30:44,318][15401] Updated weights for policy 0, policy_version 350560 (0.0028) [2024-06-23 03:30:47,883][15401] Updated weights for policy 0, policy_version 350570 (0.0032) [2024-06-23 03:30:48,389][15132] Fps is (10 sec: 49152.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 5743788032. Throughput: 0: 42238.7. Samples: 5743904640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 03:30:52,037][15401] Updated weights for policy 0, policy_version 350580 (0.0041) [2024-06-23 03:30:53,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5743935488. Throughput: 0: 42376.4. Samples: 5744037900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 03:30:55,555][15349] Signal inference workers to stop experience collection... (85150 times) [2024-06-23 03:30:55,556][15349] Signal inference workers to resume experience collection... (85150 times) [2024-06-23 03:30:55,557][15401] Updated weights for policy 0, policy_version 350590 (0.0037) [2024-06-23 03:30:55,599][15401] InferenceWorker_p0-w0: stopping experience collection (85150 times) [2024-06-23 03:30:55,599][15401] InferenceWorker_p0-w0: resuming experience collection (85150 times) [2024-06-23 03:30:58,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5744164864. Throughput: 0: 42472.3. Samples: 5744291540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:30:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 03:30:59,750][15401] Updated weights for policy 0, policy_version 350600 (0.0044) [2024-06-23 03:31:03,312][15401] Updated weights for policy 0, policy_version 350610 (0.0039) [2024-06-23 03:31:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5744394240. Throughput: 0: 42405.8. Samples: 5744540600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 03:31:07,398][15401] Updated weights for policy 0, policy_version 350620 (0.0043) [2024-06-23 03:31:08,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5744558080. Throughput: 0: 42399.6. Samples: 5744668220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 03:31:11,328][15401] Updated weights for policy 0, policy_version 350630 (0.0047) [2024-06-23 03:31:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5744820224. Throughput: 0: 42230.3. Samples: 5744919300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 03:31:15,233][15401] Updated weights for policy 0, policy_version 350640 (0.0032) [2024-06-23 03:31:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5745016832. Throughput: 0: 42438.3. Samples: 5745181180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 03:31:18,833][15401] Updated weights for policy 0, policy_version 350650 (0.0039) [2024-06-23 03:31:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5745197056. Throughput: 0: 42283.6. Samples: 5745303040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 03:31:23,496][15401] Updated weights for policy 0, policy_version 350660 (0.0034) [2024-06-23 03:31:26,489][15401] Updated weights for policy 0, policy_version 350670 (0.0029) [2024-06-23 03:31:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5745459200. Throughput: 0: 42199.6. Samples: 5745553360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 03:31:31,066][15401] Updated weights for policy 0, policy_version 350680 (0.0035) [2024-06-23 03:31:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 5745639424. Throughput: 0: 42583.1. Samples: 5745820880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 03:31:34,070][15401] Updated weights for policy 0, policy_version 350690 (0.0034) [2024-06-23 03:31:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5745836032. Throughput: 0: 42245.0. Samples: 5745938920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 03:31:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 03:31:38,735][15401] Updated weights for policy 0, policy_version 350700 (0.0029) [2024-06-23 03:31:41,638][15401] Updated weights for policy 0, policy_version 350710 (0.0030) [2024-06-23 03:31:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 5746098176. Throughput: 0: 42222.7. Samples: 5746191560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:31:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 03:31:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350715_5746114560.pth... [2024-06-23 03:31:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350091_5735890944.pth [2024-06-23 03:31:46,850][15401] Updated weights for policy 0, policy_version 350720 (0.0032) [2024-06-23 03:31:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 42431.8). Total num frames: 5746278400. Throughput: 0: 42490.7. Samples: 5746452680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:31:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 03:31:49,420][15401] Updated weights for policy 0, policy_version 350730 (0.0037) [2024-06-23 03:31:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5746491392. Throughput: 0: 42508.0. Samples: 5746581080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:31:53,394][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 03:31:54,593][15401] Updated weights for policy 0, policy_version 350740 (0.0033) [2024-06-23 03:31:56,943][15401] Updated weights for policy 0, policy_version 350750 (0.0033) [2024-06-23 03:31:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5746737152. Throughput: 0: 42473.3. Samples: 5746830600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:31:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 03:32:02,151][15401] Updated weights for policy 0, policy_version 350760 (0.0026) [2024-06-23 03:32:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5746933760. Throughput: 0: 42516.3. Samples: 5747094420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 03:32:04,666][15401] Updated weights for policy 0, policy_version 350770 (0.0034) [2024-06-23 03:32:06,601][15349] Signal inference workers to stop experience collection... (85200 times) [2024-06-23 03:32:06,638][15401] InferenceWorker_p0-w0: stopping experience collection (85200 times) [2024-06-23 03:32:06,648][15349] Signal inference workers to resume experience collection... (85200 times) [2024-06-23 03:32:06,658][15401] InferenceWorker_p0-w0: resuming experience collection (85200 times) [2024-06-23 03:32:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5747130368. Throughput: 0: 42610.2. Samples: 5747220500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:08,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 03:32:09,804][15401] Updated weights for policy 0, policy_version 350780 (0.0032) [2024-06-23 03:32:12,443][15401] Updated weights for policy 0, policy_version 350790 (0.0035) [2024-06-23 03:32:13,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5747392512. Throughput: 0: 42802.2. Samples: 5747479460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:13,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-23 03:32:17,428][15401] Updated weights for policy 0, policy_version 350800 (0.0035) [2024-06-23 03:32:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5747589120. Throughput: 0: 42552.5. Samples: 5747735740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 03:32:20,064][15401] Updated weights for policy 0, policy_version 350810 (0.0033) [2024-06-23 03:32:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5747785728. Throughput: 0: 42671.9. Samples: 5747859160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 03:32:24,923][15401] Updated weights for policy 0, policy_version 350820 (0.0039) [2024-06-23 03:32:27,787][15401] Updated weights for policy 0, policy_version 350830 (0.0036) [2024-06-23 03:32:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5748015104. Throughput: 0: 42901.4. Samples: 5748122120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:28,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 03:32:32,683][15401] Updated weights for policy 0, policy_version 350840 (0.0036) [2024-06-23 03:32:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5748211712. Throughput: 0: 42764.3. Samples: 5748377080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 03:32:35,446][15401] Updated weights for policy 0, policy_version 350850 (0.0027) [2024-06-23 03:32:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5748424704. Throughput: 0: 42603.6. Samples: 5748498240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 03:32:40,192][15401] Updated weights for policy 0, policy_version 350860 (0.0038) [2024-06-23 03:32:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5748637696. Throughput: 0: 43039.2. Samples: 5748767360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 03:32:43,418][15401] Updated weights for policy 0, policy_version 350870 (0.0029) [2024-06-23 03:32:47,811][15401] Updated weights for policy 0, policy_version 350880 (0.0032) [2024-06-23 03:32:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 5748834304. Throughput: 0: 42892.1. Samples: 5749024560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 03:32:50,994][15401] Updated weights for policy 0, policy_version 350890 (0.0033) [2024-06-23 03:32:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5749063680. Throughput: 0: 42784.0. Samples: 5749145780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-23 03:32:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 03:32:55,189][15401] Updated weights for policy 0, policy_version 350900 (0.0045) [2024-06-23 03:32:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5749293056. Throughput: 0: 42814.8. Samples: 5749406120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:32:58,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 03:32:58,411][15401] Updated weights for policy 0, policy_version 350910 (0.0026) [2024-06-23 03:32:59,730][15349] Signal inference workers to stop experience collection... (85250 times) [2024-06-23 03:32:59,736][15349] Signal inference workers to resume experience collection... (85250 times) [2024-06-23 03:32:59,764][15401] InferenceWorker_p0-w0: stopping experience collection (85250 times) [2024-06-23 03:32:59,764][15401] InferenceWorker_p0-w0: resuming experience collection (85250 times) [2024-06-23 03:33:03,029][15401] Updated weights for policy 0, policy_version 350920 (0.0032) [2024-06-23 03:33:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5749489664. Throughput: 0: 42872.9. Samples: 5749665020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 03:33:05,866][15401] Updated weights for policy 0, policy_version 350930 (0.0036) [2024-06-23 03:33:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5749702656. Throughput: 0: 42847.3. Samples: 5749787280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 03:33:10,457][15401] Updated weights for policy 0, policy_version 350940 (0.0030) [2024-06-23 03:33:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5749948416. Throughput: 0: 42822.2. Samples: 5750049120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 03:33:13,613][15401] Updated weights for policy 0, policy_version 350950 (0.0040) [2024-06-23 03:33:18,385][15401] Updated weights for policy 0, policy_version 350960 (0.0033) [2024-06-23 03:33:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5750128640. Throughput: 0: 42938.3. Samples: 5750309300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:18,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 03:33:21,749][15401] Updated weights for policy 0, policy_version 350970 (0.0033) [2024-06-23 03:33:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 5750341632. Throughput: 0: 42975.1. Samples: 5750432120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 03:33:25,916][15401] Updated weights for policy 0, policy_version 350980 (0.0028) [2024-06-23 03:33:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5750587392. Throughput: 0: 42586.2. Samples: 5750683740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 03:33:29,302][15401] Updated weights for policy 0, policy_version 350990 (0.0024) [2024-06-23 03:33:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5750767616. Throughput: 0: 42783.6. Samples: 5750949820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 03:33:33,500][15401] Updated weights for policy 0, policy_version 351000 (0.0036) [2024-06-23 03:33:36,858][15401] Updated weights for policy 0, policy_version 351010 (0.0028) [2024-06-23 03:33:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5750980608. Throughput: 0: 42691.6. Samples: 5751066900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 03:33:40,962][15401] Updated weights for policy 0, policy_version 351020 (0.0033) [2024-06-23 03:33:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 5751209984. Throughput: 0: 42583.3. Samples: 5751322380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 03:33:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351027_5751226368.pth... [2024-06-23 03:33:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350402_5740986368.pth [2024-06-23 03:33:44,348][15401] Updated weights for policy 0, policy_version 351030 (0.0036) [2024-06-23 03:33:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5751422976. Throughput: 0: 42659.0. Samples: 5751584680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 03:33:48,698][15401] Updated weights for policy 0, policy_version 351040 (0.0034) [2024-06-23 03:33:52,066][15401] Updated weights for policy 0, policy_version 351050 (0.0037) [2024-06-23 03:33:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5751619584. Throughput: 0: 42745.9. Samples: 5751710860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 03:33:56,329][15401] Updated weights for policy 0, policy_version 351060 (0.0032) [2024-06-23 03:33:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5751848960. Throughput: 0: 42358.3. Samples: 5751955240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:33:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 03:34:00,477][15401] Updated weights for policy 0, policy_version 351070 (0.0035) [2024-06-23 03:34:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5752061952. Throughput: 0: 42475.1. Samples: 5752220680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:34:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 03:34:03,952][15401] Updated weights for policy 0, policy_version 351080 (0.0028) [2024-06-23 03:34:08,360][15401] Updated weights for policy 0, policy_version 351090 (0.0030) [2024-06-23 03:34:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5752258560. Throughput: 0: 42520.8. Samples: 5752345560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:34:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 03:34:11,567][15401] Updated weights for policy 0, policy_version 351100 (0.0033) [2024-06-23 03:34:12,160][15349] Signal inference workers to stop experience collection... (85300 times) [2024-06-23 03:34:12,160][15349] Signal inference workers to resume experience collection... (85300 times) [2024-06-23 03:34:12,208][15401] InferenceWorker_p0-w0: stopping experience collection (85300 times) [2024-06-23 03:34:12,212][15401] InferenceWorker_p0-w0: resuming experience collection (85300 times) [2024-06-23 03:34:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5752487936. Throughput: 0: 42595.2. Samples: 5752600520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 03:34:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 03:34:16,154][15401] Updated weights for policy 0, policy_version 351110 (0.0034) [2024-06-23 03:34:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5752684544. Throughput: 0: 42433.3. Samples: 5752859320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 03:34:19,101][15401] Updated weights for policy 0, policy_version 351120 (0.0040) [2024-06-23 03:34:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5752897536. Throughput: 0: 42570.6. Samples: 5752982580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 03:34:24,043][15401] Updated weights for policy 0, policy_version 351130 (0.0028) [2024-06-23 03:34:27,092][15401] Updated weights for policy 0, policy_version 351140 (0.0034) [2024-06-23 03:34:28,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42596.7, 300 sec: 42542.9). Total num frames: 5753143296. Throughput: 0: 42681.0. Samples: 5753243120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:28,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 03:34:31,708][15401] Updated weights for policy 0, policy_version 351150 (0.0035) [2024-06-23 03:34:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5753323520. Throughput: 0: 42674.3. Samples: 5753505020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 03:34:34,669][15401] Updated weights for policy 0, policy_version 351160 (0.0034) [2024-06-23 03:34:38,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5753552896. Throughput: 0: 42598.7. Samples: 5753627800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 03:34:39,255][15401] Updated weights for policy 0, policy_version 351170 (0.0031) [2024-06-23 03:34:42,340][15401] Updated weights for policy 0, policy_version 351180 (0.0029) [2024-06-23 03:34:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 5753782272. Throughput: 0: 42933.6. Samples: 5753887260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:43,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-23 03:34:46,815][15401] Updated weights for policy 0, policy_version 351190 (0.0040) [2024-06-23 03:34:48,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 5753978880. Throughput: 0: 42930.4. Samples: 5754152540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:48,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 03:34:49,961][15401] Updated weights for policy 0, policy_version 351200 (0.0026) [2024-06-23 03:34:53,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43140.0, 300 sec: 42653.0). Total num frames: 5754208256. Throughput: 0: 42935.3. Samples: 5754277920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:53,396][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 03:34:54,344][15401] Updated weights for policy 0, policy_version 351210 (0.0036) [2024-06-23 03:34:57,508][15401] Updated weights for policy 0, policy_version 351220 (0.0034) [2024-06-23 03:34:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5754404864. Throughput: 0: 42936.9. Samples: 5754532680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:34:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:35:01,836][15401] Updated weights for policy 0, policy_version 351230 (0.0028) [2024-06-23 03:35:03,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5754601472. Throughput: 0: 43044.2. Samples: 5754796300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 03:35:05,413][15401] Updated weights for policy 0, policy_version 351240 (0.0026) [2024-06-23 03:35:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5754863616. Throughput: 0: 43072.9. Samples: 5754920860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 03:35:09,375][15401] Updated weights for policy 0, policy_version 351250 (0.0050) [2024-06-23 03:35:13,014][15401] Updated weights for policy 0, policy_version 351260 (0.0024) [2024-06-23 03:35:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5755060224. Throughput: 0: 42958.3. Samples: 5755176140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 03:35:17,309][15401] Updated weights for policy 0, policy_version 351270 (0.0035) [2024-06-23 03:35:18,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 5755240448. Throughput: 0: 42964.6. Samples: 5755438420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:18,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-23 03:35:20,566][15401] Updated weights for policy 0, policy_version 351280 (0.0030) [2024-06-23 03:35:21,380][15349] Signal inference workers to stop experience collection... (85350 times) [2024-06-23 03:35:21,405][15401] InferenceWorker_p0-w0: stopping experience collection (85350 times) [2024-06-23 03:35:21,439][15349] Signal inference workers to resume experience collection... (85350 times) [2024-06-23 03:35:21,441][15401] InferenceWorker_p0-w0: resuming experience collection (85350 times) [2024-06-23 03:35:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 5755502592. Throughput: 0: 42973.9. Samples: 5755561620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 03:35:24,756][15401] Updated weights for policy 0, policy_version 351290 (0.0032) [2024-06-23 03:35:28,357][15401] Updated weights for policy 0, policy_version 351300 (0.0035) [2024-06-23 03:35:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 5755699200. Throughput: 0: 42916.5. Samples: 5755818500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:35:28,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 03:35:32,326][15401] Updated weights for policy 0, policy_version 351310 (0.0046) [2024-06-23 03:35:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5755879424. Throughput: 0: 42663.3. Samples: 5756072400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 03:35:36,108][15401] Updated weights for policy 0, policy_version 351320 (0.0031) [2024-06-23 03:35:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 5756125184. Throughput: 0: 42657.2. Samples: 5756197220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 03:35:39,782][15401] Updated weights for policy 0, policy_version 351330 (0.0035) [2024-06-23 03:35:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5756305408. Throughput: 0: 42572.8. Samples: 5756448460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 03:35:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351338_5756321792.pth... [2024-06-23 03:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000350715_5746114560.pth [2024-06-23 03:35:44,008][15401] Updated weights for policy 0, policy_version 351340 (0.0023) [2024-06-23 03:35:47,390][15401] Updated weights for policy 0, policy_version 351350 (0.0031) [2024-06-23 03:35:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5756534784. Throughput: 0: 42372.0. Samples: 5756703040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 03:35:51,597][15401] Updated weights for policy 0, policy_version 351360 (0.0033) [2024-06-23 03:35:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 5756764160. Throughput: 0: 42481.4. Samples: 5756832520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 03:35:55,198][15401] Updated weights for policy 0, policy_version 351370 (0.0033) [2024-06-23 03:35:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5756944384. Throughput: 0: 42497.7. Samples: 5757088540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:35:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 03:35:59,392][15401] Updated weights for policy 0, policy_version 351380 (0.0029) [2024-06-23 03:36:02,744][15401] Updated weights for policy 0, policy_version 351390 (0.0039) [2024-06-23 03:36:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5757173760. Throughput: 0: 42248.6. Samples: 5757339620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 03:36:07,074][15401] Updated weights for policy 0, policy_version 351400 (0.0036) [2024-06-23 03:36:08,396][15132] Fps is (10 sec: 45846.6, 60 sec: 42320.9, 300 sec: 42653.0). Total num frames: 5757403136. Throughput: 0: 42434.1. Samples: 5757471420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:08,396][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 03:36:10,819][15401] Updated weights for policy 0, policy_version 351410 (0.0025) [2024-06-23 03:36:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5757583360. Throughput: 0: 42409.8. Samples: 5757726940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 03:36:15,221][15401] Updated weights for policy 0, policy_version 351420 (0.0035) [2024-06-23 03:36:18,268][15401] Updated weights for policy 0, policy_version 351430 (0.0028) [2024-06-23 03:36:18,389][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5757829120. Throughput: 0: 42335.3. Samples: 5757977480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 03:36:22,635][15401] Updated weights for policy 0, policy_version 351440 (0.0046) [2024-06-23 03:36:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5758025728. Throughput: 0: 42510.2. Samples: 5758110180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 03:36:25,663][15401] Updated weights for policy 0, policy_version 351450 (0.0031) [2024-06-23 03:36:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5758255104. Throughput: 0: 42628.6. Samples: 5758366740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 03:36:30,153][15401] Updated weights for policy 0, policy_version 351460 (0.0035) [2024-06-23 03:36:33,112][15401] Updated weights for policy 0, policy_version 351470 (0.0028) [2024-06-23 03:36:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5758484480. Throughput: 0: 42626.6. Samples: 5758621240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:33,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-23 03:36:37,745][15401] Updated weights for policy 0, policy_version 351480 (0.0039) [2024-06-23 03:36:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5758664704. Throughput: 0: 42825.2. Samples: 5758759660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:38,390][15132] Avg episode reward: [(0, '0.101')] [2024-06-23 03:36:38,514][15349] Signal inference workers to stop experience collection... (85400 times) [2024-06-23 03:36:38,514][15349] Signal inference workers to resume experience collection... (85400 times) [2024-06-23 03:36:38,538][15401] InferenceWorker_p0-w0: stopping experience collection (85400 times) [2024-06-23 03:36:38,538][15401] InferenceWorker_p0-w0: resuming experience collection (85400 times) [2024-06-23 03:36:41,048][15401] Updated weights for policy 0, policy_version 351490 (0.0051) [2024-06-23 03:36:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 5758894080. Throughput: 0: 42821.8. Samples: 5759015620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:43,392][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 03:36:45,568][15401] Updated weights for policy 0, policy_version 351500 (0.0027) [2024-06-23 03:36:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5759107072. Throughput: 0: 42733.1. Samples: 5759262600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 03:36:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 03:36:48,678][15401] Updated weights for policy 0, policy_version 351510 (0.0039) [2024-06-23 03:36:53,204][15401] Updated weights for policy 0, policy_version 351520 (0.0046) [2024-06-23 03:36:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5759303680. Throughput: 0: 42790.5. Samples: 5759396720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:36:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 03:36:56,242][15401] Updated weights for policy 0, policy_version 351530 (0.0044) [2024-06-23 03:36:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5759516672. Throughput: 0: 42773.7. Samples: 5759651760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:36:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 03:37:00,802][15401] Updated weights for policy 0, policy_version 351540 (0.0036) [2024-06-23 03:37:03,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 5759762432. Throughput: 0: 43023.1. Samples: 5759913800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:03,397][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 03:37:03,770][15401] Updated weights for policy 0, policy_version 351550 (0.0028) [2024-06-23 03:37:08,284][15401] Updated weights for policy 0, policy_version 351560 (0.0030) [2024-06-23 03:37:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42601.2, 300 sec: 42598.1). Total num frames: 5759959040. Throughput: 0: 43015.5. Samples: 5760045980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:08,392][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 03:37:11,687][15401] Updated weights for policy 0, policy_version 351570 (0.0025) [2024-06-23 03:37:13,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5760172032. Throughput: 0: 42880.7. Samples: 5760296380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:13,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 03:37:15,893][15401] Updated weights for policy 0, policy_version 351580 (0.0046) [2024-06-23 03:37:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5760368640. Throughput: 0: 42934.8. Samples: 5760553300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 03:37:19,646][15401] Updated weights for policy 0, policy_version 351590 (0.0033) [2024-06-23 03:37:23,326][15401] Updated weights for policy 0, policy_version 351600 (0.0039) [2024-06-23 03:37:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5760614400. Throughput: 0: 42667.2. Samples: 5760679680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 03:37:27,467][15401] Updated weights for policy 0, policy_version 351610 (0.0050) [2024-06-23 03:37:28,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 5760827392. Throughput: 0: 42801.8. Samples: 5760941700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:28,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 03:37:30,874][15401] Updated weights for policy 0, policy_version 351620 (0.0042) [2024-06-23 03:37:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5761024000. Throughput: 0: 42847.5. Samples: 5761190740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:33,392][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 03:37:35,279][15401] Updated weights for policy 0, policy_version 351630 (0.0035) [2024-06-23 03:37:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5761253376. Throughput: 0: 42680.9. Samples: 5761317360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 03:37:38,448][15401] Updated weights for policy 0, policy_version 351640 (0.0029) [2024-06-23 03:37:42,925][15401] Updated weights for policy 0, policy_version 351650 (0.0035) [2024-06-23 03:37:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 5761466368. Throughput: 0: 42973.3. Samples: 5761585560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 03:37:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351652_5761466368.pth... [2024-06-23 03:37:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351027_5751226368.pth [2024-06-23 03:37:46,151][15401] Updated weights for policy 0, policy_version 351660 (0.0028) [2024-06-23 03:37:48,392][15132] Fps is (10 sec: 42586.5, 60 sec: 42869.5, 300 sec: 42764.6). Total num frames: 5761679360. Throughput: 0: 42831.1. Samples: 5761841040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:48,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 03:37:50,428][15401] Updated weights for policy 0, policy_version 351670 (0.0033) [2024-06-23 03:37:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5761908736. Throughput: 0: 42843.2. Samples: 5761973820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 03:37:54,018][15401] Updated weights for policy 0, policy_version 351680 (0.0041) [2024-06-23 03:37:57,782][15349] Signal inference workers to stop experience collection... (85450 times) [2024-06-23 03:37:57,785][15349] Signal inference workers to resume experience collection... (85450 times) [2024-06-23 03:37:57,812][15401] InferenceWorker_p0-w0: stopping experience collection (85450 times) [2024-06-23 03:37:57,812][15401] InferenceWorker_p0-w0: resuming experience collection (85450 times) [2024-06-23 03:37:57,922][15401] Updated weights for policy 0, policy_version 351690 (0.0041) [2024-06-23 03:37:58,389][15132] Fps is (10 sec: 42610.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5762105344. Throughput: 0: 42978.8. Samples: 5762230420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:37:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 03:38:01,527][15401] Updated weights for policy 0, policy_version 351700 (0.0029) [2024-06-23 03:38:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 5762318336. Throughput: 0: 42977.8. Samples: 5762487300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 03:38:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 03:38:05,460][15401] Updated weights for policy 0, policy_version 351710 (0.0031) [2024-06-23 03:38:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 5762547712. Throughput: 0: 42994.8. Samples: 5762614440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 03:38:08,913][15401] Updated weights for policy 0, policy_version 351720 (0.0042) [2024-06-23 03:38:12,952][15401] Updated weights for policy 0, policy_version 351730 (0.0044) [2024-06-23 03:38:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5762744320. Throughput: 0: 43033.8. Samples: 5762878120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 03:38:16,452][15401] Updated weights for policy 0, policy_version 351740 (0.0050) [2024-06-23 03:38:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5762957312. Throughput: 0: 43109.7. Samples: 5763130680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 03:38:20,907][15401] Updated weights for policy 0, policy_version 351750 (0.0033) [2024-06-23 03:38:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5763186688. Throughput: 0: 43073.3. Samples: 5763255660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 03:38:24,317][15401] Updated weights for policy 0, policy_version 351760 (0.0043) [2024-06-23 03:38:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5763383296. Throughput: 0: 42828.5. Samples: 5763512840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:28,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:38:28,428][15401] Updated weights for policy 0, policy_version 351770 (0.0037) [2024-06-23 03:38:32,134][15401] Updated weights for policy 0, policy_version 351780 (0.0030) [2024-06-23 03:38:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5763596288. Throughput: 0: 42803.8. Samples: 5763767100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 03:38:35,966][15401] Updated weights for policy 0, policy_version 351790 (0.0037) [2024-06-23 03:38:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5763825664. Throughput: 0: 42544.8. Samples: 5763888340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 03:38:39,842][15401] Updated weights for policy 0, policy_version 351800 (0.0034) [2024-06-23 03:38:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5764038656. Throughput: 0: 42645.2. Samples: 5764149460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 03:38:43,604][15401] Updated weights for policy 0, policy_version 351810 (0.0035) [2024-06-23 03:38:47,651][15401] Updated weights for policy 0, policy_version 351820 (0.0031) [2024-06-23 03:38:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.3, 300 sec: 42765.0). Total num frames: 5764235264. Throughput: 0: 42558.1. Samples: 5764402420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 03:38:51,687][15401] Updated weights for policy 0, policy_version 351830 (0.0040) [2024-06-23 03:38:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5764448256. Throughput: 0: 42493.3. Samples: 5764526640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 03:38:55,342][15401] Updated weights for policy 0, policy_version 351840 (0.0038) [2024-06-23 03:38:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5764644864. Throughput: 0: 42302.7. Samples: 5764781740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:38:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 03:38:59,534][15401] Updated weights for policy 0, policy_version 351850 (0.0029) [2024-06-23 03:39:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5764857856. Throughput: 0: 42313.4. Samples: 5765034780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:39:03,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 03:39:03,405][15401] Updated weights for policy 0, policy_version 351860 (0.0027) [2024-06-23 03:39:07,205][15401] Updated weights for policy 0, policy_version 351870 (0.0044) [2024-06-23 03:39:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5765070848. Throughput: 0: 42387.5. Samples: 5765163100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:39:08,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-23 03:39:11,118][15401] Updated weights for policy 0, policy_version 351880 (0.0027) [2024-06-23 03:39:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5765316608. Throughput: 0: 42345.0. Samples: 5765418360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:39:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 03:39:14,775][15401] Updated weights for policy 0, policy_version 351890 (0.0033) [2024-06-23 03:39:15,972][15349] Signal inference workers to stop experience collection... (85500 times) [2024-06-23 03:39:15,972][15349] Signal inference workers to resume experience collection... (85500 times) [2024-06-23 03:39:16,008][15401] InferenceWorker_p0-w0: stopping experience collection (85500 times) [2024-06-23 03:39:16,008][15401] InferenceWorker_p0-w0: resuming experience collection (85500 times) [2024-06-23 03:39:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5765496832. Throughput: 0: 42264.1. Samples: 5765668980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:39:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 03:39:18,829][15401] Updated weights for policy 0, policy_version 351900 (0.0036) [2024-06-23 03:39:22,446][15401] Updated weights for policy 0, policy_version 351910 (0.0029) [2024-06-23 03:39:23,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 42543.2). Total num frames: 5765693440. Throughput: 0: 42424.5. Samples: 5765797440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 03:39:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 03:39:26,415][15401] Updated weights for policy 0, policy_version 351920 (0.0024) [2024-06-23 03:39:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5765939200. Throughput: 0: 42364.6. Samples: 5766055860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 03:39:30,309][15401] Updated weights for policy 0, policy_version 351930 (0.0042) [2024-06-23 03:39:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 5766135808. Throughput: 0: 42340.1. Samples: 5766307720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:33,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 03:39:34,089][15401] Updated weights for policy 0, policy_version 351940 (0.0028) [2024-06-23 03:39:37,930][15401] Updated weights for policy 0, policy_version 351950 (0.0037) [2024-06-23 03:39:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5766348800. Throughput: 0: 42276.0. Samples: 5766429060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 03:39:41,859][15401] Updated weights for policy 0, policy_version 351960 (0.0033) [2024-06-23 03:39:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.4, 300 sec: 42598.4). Total num frames: 5766545408. Throughput: 0: 42266.3. Samples: 5766683720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 03:39:43,552][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351964_5766578176.pth... [2024-06-23 03:39:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351338_5756321792.pth [2024-06-23 03:39:46,020][15401] Updated weights for policy 0, policy_version 351970 (0.0027) [2024-06-23 03:39:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42050.6, 300 sec: 42543.4). Total num frames: 5766758400. Throughput: 0: 42328.9. Samples: 5766939680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:48,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 03:39:49,764][15401] Updated weights for policy 0, policy_version 351980 (0.0047) [2024-06-23 03:39:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5766987776. Throughput: 0: 42298.8. Samples: 5767066540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 03:39:53,751][15401] Updated weights for policy 0, policy_version 351990 (0.0034) [2024-06-23 03:39:57,221][15401] Updated weights for policy 0, policy_version 352000 (0.0032) [2024-06-23 03:39:58,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5767184384. Throughput: 0: 42304.8. Samples: 5767322080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:39:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 03:40:01,262][15401] Updated weights for policy 0, policy_version 352010 (0.0026) [2024-06-23 03:40:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5767413760. Throughput: 0: 42361.3. Samples: 5767575240. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 03:40:04,757][15401] Updated weights for policy 0, policy_version 352020 (0.0033) [2024-06-23 03:40:08,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 5767626752. Throughput: 0: 42415.3. Samples: 5767706400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:08,396][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 03:40:09,249][15401] Updated weights for policy 0, policy_version 352030 (0.0035) [2024-06-23 03:40:12,861][15401] Updated weights for policy 0, policy_version 352040 (0.0030) [2024-06-23 03:40:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 5767823360. Throughput: 0: 42159.9. Samples: 5767953060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 03:40:16,885][15401] Updated weights for policy 0, policy_version 352050 (0.0022) [2024-06-23 03:40:18,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5768052736. Throughput: 0: 42226.2. Samples: 5768207900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 03:40:21,031][15401] Updated weights for policy 0, policy_version 352060 (0.0025) [2024-06-23 03:40:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5768265728. Throughput: 0: 42364.4. Samples: 5768335460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:23,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 03:40:24,650][15401] Updated weights for policy 0, policy_version 352070 (0.0037) [2024-06-23 03:40:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 5768462336. Throughput: 0: 42418.6. Samples: 5768592560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:28,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 03:40:28,530][15401] Updated weights for policy 0, policy_version 352080 (0.0029) [2024-06-23 03:40:32,159][15401] Updated weights for policy 0, policy_version 352090 (0.0036) [2024-06-23 03:40:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5768675328. Throughput: 0: 42408.8. Samples: 5768847980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 03:40:33,566][15349] Signal inference workers to stop experience collection... (85550 times) [2024-06-23 03:40:33,619][15401] InferenceWorker_p0-w0: stopping experience collection (85550 times) [2024-06-23 03:40:33,627][15349] Signal inference workers to resume experience collection... (85550 times) [2024-06-23 03:40:33,637][15401] InferenceWorker_p0-w0: resuming experience collection (85550 times) [2024-06-23 03:40:36,016][15401] Updated weights for policy 0, policy_version 352100 (0.0051) [2024-06-23 03:40:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5768888320. Throughput: 0: 42360.8. Samples: 5768972780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-23 03:40:38,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 03:40:39,802][15401] Updated weights for policy 0, policy_version 352110 (0.0044) [2024-06-23 03:40:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5769117696. Throughput: 0: 42455.5. Samples: 5769232580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:40:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 03:40:43,627][15401] Updated weights for policy 0, policy_version 352120 (0.0038) [2024-06-23 03:40:47,825][15401] Updated weights for policy 0, policy_version 352130 (0.0034) [2024-06-23 03:40:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42598.4, 300 sec: 42542.5). Total num frames: 5769314304. Throughput: 0: 42334.2. Samples: 5769480380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:40:48,393][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 03:40:51,123][15401] Updated weights for policy 0, policy_version 352140 (0.0047) [2024-06-23 03:40:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5769527296. Throughput: 0: 42222.4. Samples: 5769606140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:40:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 03:40:55,438][15401] Updated weights for policy 0, policy_version 352150 (0.0040) [2024-06-23 03:40:58,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5769740288. Throughput: 0: 42542.0. Samples: 5769867440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:40:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 03:40:58,771][15401] Updated weights for policy 0, policy_version 352160 (0.0033) [2024-06-23 03:41:03,158][15401] Updated weights for policy 0, policy_version 352170 (0.0050) [2024-06-23 03:41:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 5769953280. Throughput: 0: 42659.4. Samples: 5770127580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 03:41:06,486][15401] Updated weights for policy 0, policy_version 352180 (0.0038) [2024-06-23 03:41:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 5770182656. Throughput: 0: 42647.1. Samples: 5770254580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 03:41:10,773][15401] Updated weights for policy 0, policy_version 352190 (0.0034) [2024-06-23 03:41:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5770395648. Throughput: 0: 42727.1. Samples: 5770515280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 03:41:14,126][15401] Updated weights for policy 0, policy_version 352200 (0.0039) [2024-06-23 03:41:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5770592256. Throughput: 0: 42788.5. Samples: 5770773460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 03:41:18,614][15401] Updated weights for policy 0, policy_version 352210 (0.0045) [2024-06-23 03:41:21,763][15401] Updated weights for policy 0, policy_version 352220 (0.0025) [2024-06-23 03:41:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5770838016. Throughput: 0: 42937.8. Samples: 5770904980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 03:41:26,056][15401] Updated weights for policy 0, policy_version 352230 (0.0043) [2024-06-23 03:41:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 5771034624. Throughput: 0: 42803.5. Samples: 5771158840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:28,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:41:29,383][15401] Updated weights for policy 0, policy_version 352240 (0.0033) [2024-06-23 03:41:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 5771247616. Throughput: 0: 42855.6. Samples: 5771408880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:33,393][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 03:41:34,135][15401] Updated weights for policy 0, policy_version 352250 (0.0044) [2024-06-23 03:41:37,303][15401] Updated weights for policy 0, policy_version 352260 (0.0044) [2024-06-23 03:41:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 5771476992. Throughput: 0: 42919.4. Samples: 5771537520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 03:41:41,754][15401] Updated weights for policy 0, policy_version 352270 (0.0038) [2024-06-23 03:41:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5771673600. Throughput: 0: 42879.0. Samples: 5771797000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 03:41:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352276_5771689984.pth... [2024-06-23 03:41:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351652_5761466368.pth [2024-06-23 03:41:44,989][15401] Updated weights for policy 0, policy_version 352280 (0.0042) [2024-06-23 03:41:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5771886592. Throughput: 0: 42683.7. Samples: 5772048340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 03:41:49,398][15401] Updated weights for policy 0, policy_version 352290 (0.0030) [2024-06-23 03:41:52,645][15401] Updated weights for policy 0, policy_version 352300 (0.0046) [2024-06-23 03:41:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5772099584. Throughput: 0: 42748.4. Samples: 5772178260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:53,391][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 03:41:53,459][15349] Signal inference workers to stop experience collection... (85600 times) [2024-06-23 03:41:53,511][15401] InferenceWorker_p0-w0: stopping experience collection (85600 times) [2024-06-23 03:41:53,519][15349] Signal inference workers to resume experience collection... (85600 times) [2024-06-23 03:41:53,531][15401] InferenceWorker_p0-w0: resuming experience collection (85600 times) [2024-06-23 03:41:56,991][15401] Updated weights for policy 0, policy_version 352310 (0.0037) [2024-06-23 03:41:58,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.6, 300 sec: 42432.4). Total num frames: 5772279808. Throughput: 0: 42570.2. Samples: 5772431040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 03:41:58,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 03:42:00,282][15401] Updated weights for policy 0, policy_version 352320 (0.0034) [2024-06-23 03:42:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 5772525568. Throughput: 0: 42378.7. Samples: 5772680500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 03:42:04,670][15401] Updated weights for policy 0, policy_version 352330 (0.0032) [2024-06-23 03:42:08,033][15401] Updated weights for policy 0, policy_version 352340 (0.0036) [2024-06-23 03:42:08,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5772754944. Throughput: 0: 42535.1. Samples: 5772819060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 03:42:12,170][15401] Updated weights for policy 0, policy_version 352350 (0.0029) [2024-06-23 03:42:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5772935168. Throughput: 0: 42537.0. Samples: 5773072900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 03:42:15,769][15401] Updated weights for policy 0, policy_version 352360 (0.0043) [2024-06-23 03:42:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5773164544. Throughput: 0: 42476.5. Samples: 5773320220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 03:42:20,363][15401] Updated weights for policy 0, policy_version 352370 (0.0036) [2024-06-23 03:42:23,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42320.8, 300 sec: 42542.3). Total num frames: 5773377536. Throughput: 0: 42619.4. Samples: 5773455660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:23,397][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 03:42:23,662][15401] Updated weights for policy 0, policy_version 352380 (0.0029) [2024-06-23 03:42:27,776][15401] Updated weights for policy 0, policy_version 352390 (0.0040) [2024-06-23 03:42:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 5773574144. Throughput: 0: 42513.8. Samples: 5773710120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 03:42:31,189][15401] Updated weights for policy 0, policy_version 352400 (0.0043) [2024-06-23 03:42:33,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 5773803520. Throughput: 0: 42490.8. Samples: 5773960420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:33,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 03:42:35,498][15401] Updated weights for policy 0, policy_version 352410 (0.0041) [2024-06-23 03:42:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.4, 300 sec: 42431.8). Total num frames: 5773983744. Throughput: 0: 42544.6. Samples: 5774092760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 03:42:38,986][15401] Updated weights for policy 0, policy_version 352420 (0.0024) [2024-06-23 03:42:43,202][15401] Updated weights for policy 0, policy_version 352430 (0.0029) [2024-06-23 03:42:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 5774213120. Throughput: 0: 42483.2. Samples: 5774342680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 03:42:46,825][15401] Updated weights for policy 0, policy_version 352440 (0.0032) [2024-06-23 03:42:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5774458880. Throughput: 0: 42573.8. Samples: 5774596320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 03:42:50,804][15401] Updated weights for policy 0, policy_version 352450 (0.0048) [2024-06-23 03:42:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5774639104. Throughput: 0: 42439.5. Samples: 5774728840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 03:42:54,744][15401] Updated weights for policy 0, policy_version 352460 (0.0029) [2024-06-23 03:42:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 5774852096. Throughput: 0: 42248.8. Samples: 5774974100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:42:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 03:42:58,486][15401] Updated weights for policy 0, policy_version 352470 (0.0040) [2024-06-23 03:43:02,442][15401] Updated weights for policy 0, policy_version 352480 (0.0030) [2024-06-23 03:43:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5775081472. Throughput: 0: 42440.0. Samples: 5775230020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:43:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 03:43:06,479][15401] Updated weights for policy 0, policy_version 352490 (0.0038) [2024-06-23 03:43:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5775278080. Throughput: 0: 42368.3. Samples: 5775361960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:43:08,391][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 03:43:09,999][15401] Updated weights for policy 0, policy_version 352500 (0.0037) [2024-06-23 03:43:13,391][15132] Fps is (10 sec: 39316.6, 60 sec: 42324.3, 300 sec: 42431.6). Total num frames: 5775474688. Throughput: 0: 42276.1. Samples: 5775612600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 03:43:13,391][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 03:43:14,262][15401] Updated weights for policy 0, policy_version 352510 (0.0030) [2024-06-23 03:43:17,336][15349] Signal inference workers to stop experience collection... (85650 times) [2024-06-23 03:43:17,340][15349] Signal inference workers to resume experience collection... (85650 times) [2024-06-23 03:43:17,363][15401] InferenceWorker_p0-w0: stopping experience collection (85650 times) [2024-06-23 03:43:17,363][15401] InferenceWorker_p0-w0: resuming experience collection (85650 times) [2024-06-23 03:43:17,721][15401] Updated weights for policy 0, policy_version 352520 (0.0032) [2024-06-23 03:43:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5775704064. Throughput: 0: 42374.4. Samples: 5775867280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 03:43:21,812][15401] Updated weights for policy 0, policy_version 352530 (0.0037) [2024-06-23 03:43:23,390][15132] Fps is (10 sec: 44242.2, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 5775917056. Throughput: 0: 42358.0. Samples: 5775998880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 03:43:25,300][15401] Updated weights for policy 0, policy_version 352540 (0.0030) [2024-06-23 03:43:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5776130048. Throughput: 0: 42421.2. Samples: 5776251640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:28,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 03:43:29,848][15401] Updated weights for policy 0, policy_version 352550 (0.0052) [2024-06-23 03:43:33,195][15401] Updated weights for policy 0, policy_version 352560 (0.0027) [2024-06-23 03:43:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5776359424. Throughput: 0: 42602.7. Samples: 5776513440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 03:43:37,273][15401] Updated weights for policy 0, policy_version 352570 (0.0033) [2024-06-23 03:43:38,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.7, 300 sec: 42487.0). Total num frames: 5776572416. Throughput: 0: 42470.2. Samples: 5776640100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:38,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 03:43:40,736][15401] Updated weights for policy 0, policy_version 352580 (0.0036) [2024-06-23 03:43:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5776769024. Throughput: 0: 42680.9. Samples: 5776894740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:43,393][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 03:43:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352587_5776785408.pth... [2024-06-23 03:43:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000351964_5766578176.pth [2024-06-23 03:43:44,689][15401] Updated weights for policy 0, policy_version 352590 (0.0037) [2024-06-23 03:43:48,097][15401] Updated weights for policy 0, policy_version 352600 (0.0040) [2024-06-23 03:43:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5776998400. Throughput: 0: 42820.0. Samples: 5777156920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:43:52,299][15401] Updated weights for policy 0, policy_version 352610 (0.0041) [2024-06-23 03:43:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5777195008. Throughput: 0: 42823.2. Samples: 5777289000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 03:43:55,643][15401] Updated weights for policy 0, policy_version 352620 (0.0036) [2024-06-23 03:43:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5777424384. Throughput: 0: 42817.6. Samples: 5777539340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:43:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 03:44:00,156][15401] Updated weights for policy 0, policy_version 352630 (0.0044) [2024-06-23 03:44:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5777620992. Throughput: 0: 42862.3. Samples: 5777796080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 03:44:03,819][15401] Updated weights for policy 0, policy_version 352640 (0.0032) [2024-06-23 03:44:07,907][15401] Updated weights for policy 0, policy_version 352650 (0.0037) [2024-06-23 03:44:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5777850368. Throughput: 0: 42819.2. Samples: 5777925740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 03:44:11,265][15401] Updated weights for policy 0, policy_version 352660 (0.0034) [2024-06-23 03:44:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43145.4, 300 sec: 42598.4). Total num frames: 5778063360. Throughput: 0: 42816.5. Samples: 5778178380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:13,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 03:44:15,471][15401] Updated weights for policy 0, policy_version 352670 (0.0042) [2024-06-23 03:44:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5778276352. Throughput: 0: 42915.9. Samples: 5778444660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 03:44:18,744][15401] Updated weights for policy 0, policy_version 352680 (0.0033) [2024-06-23 03:44:23,041][15401] Updated weights for policy 0, policy_version 352690 (0.0031) [2024-06-23 03:44:23,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42870.0, 300 sec: 42542.5). Total num frames: 5778489344. Throughput: 0: 42929.5. Samples: 5778571920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:23,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 03:44:26,925][15401] Updated weights for policy 0, policy_version 352700 (0.0044) [2024-06-23 03:44:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5778718720. Throughput: 0: 42751.0. Samples: 5778818540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 03:44:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:44:30,632][15401] Updated weights for policy 0, policy_version 352710 (0.0031) [2024-06-23 03:44:33,392][15132] Fps is (10 sec: 40959.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 5778898944. Throughput: 0: 42786.6. Samples: 5779082420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:33,393][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 03:44:34,466][15401] Updated weights for policy 0, policy_version 352720 (0.0037) [2024-06-23 03:44:35,057][15349] Signal inference workers to stop experience collection... (85700 times) [2024-06-23 03:44:35,104][15401] InferenceWorker_p0-w0: stopping experience collection (85700 times) [2024-06-23 03:44:35,115][15349] Signal inference workers to resume experience collection... (85700 times) [2024-06-23 03:44:35,120][15401] InferenceWorker_p0-w0: resuming experience collection (85700 times) [2024-06-23 03:44:38,280][15401] Updated weights for policy 0, policy_version 352730 (0.0038) [2024-06-23 03:44:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5779128320. Throughput: 0: 42545.7. Samples: 5779203560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 03:44:42,136][15401] Updated weights for policy 0, policy_version 352740 (0.0050) [2024-06-23 03:44:43,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 5779357696. Throughput: 0: 42842.4. Samples: 5779467240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:43,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-23 03:44:45,801][15401] Updated weights for policy 0, policy_version 352750 (0.0034) [2024-06-23 03:44:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5779537920. Throughput: 0: 42727.2. Samples: 5779718800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:44:49,972][15401] Updated weights for policy 0, policy_version 352760 (0.0038) [2024-06-23 03:44:53,378][15401] Updated weights for policy 0, policy_version 352770 (0.0042) [2024-06-23 03:44:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5779783680. Throughput: 0: 42564.4. Samples: 5779841140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:53,403][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 03:44:57,413][15401] Updated weights for policy 0, policy_version 352780 (0.0050) [2024-06-23 03:44:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5779980288. Throughput: 0: 42801.9. Samples: 5780104460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:44:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 03:45:01,531][15401] Updated weights for policy 0, policy_version 352790 (0.0029) [2024-06-23 03:45:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42599.3). Total num frames: 5780193280. Throughput: 0: 42386.8. Samples: 5780352060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 03:45:05,363][15401] Updated weights for policy 0, policy_version 352800 (0.0036) [2024-06-23 03:45:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5780422656. Throughput: 0: 42462.1. Samples: 5780482620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 03:45:09,333][15401] Updated weights for policy 0, policy_version 352810 (0.0034) [2024-06-23 03:45:13,021][15401] Updated weights for policy 0, policy_version 352820 (0.0042) [2024-06-23 03:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5780619264. Throughput: 0: 42649.5. Samples: 5780737760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 03:45:16,933][15401] Updated weights for policy 0, policy_version 352830 (0.0044) [2024-06-23 03:45:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5780799488. Throughput: 0: 42380.9. Samples: 5780989460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 03:45:20,815][15401] Updated weights for policy 0, policy_version 352840 (0.0027) [2024-06-23 03:45:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 5781061632. Throughput: 0: 42485.8. Samples: 5781115420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 03:45:24,435][15401] Updated weights for policy 0, policy_version 352850 (0.0025) [2024-06-23 03:45:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 5781241856. Throughput: 0: 42448.0. Samples: 5781377400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 03:45:28,417][15401] Updated weights for policy 0, policy_version 352860 (0.0038) [2024-06-23 03:45:31,985][15401] Updated weights for policy 0, policy_version 352870 (0.0040) [2024-06-23 03:45:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 5781454848. Throughput: 0: 42550.6. Samples: 5781633580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 03:45:35,986][15401] Updated weights for policy 0, policy_version 352880 (0.0041) [2024-06-23 03:45:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5781684224. Throughput: 0: 42562.7. Samples: 5781756460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:38,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-23 03:45:39,523][15401] Updated weights for policy 0, policy_version 352890 (0.0044) [2024-06-23 03:45:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42598.8). Total num frames: 5781880832. Throughput: 0: 42434.2. Samples: 5782014000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 03:45:43,573][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352900_5781913600.pth... [2024-06-23 03:45:43,571][15401] Updated weights for policy 0, policy_version 352900 (0.0033) [2024-06-23 03:45:43,619][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352276_5771689984.pth [2024-06-23 03:45:47,228][15401] Updated weights for policy 0, policy_version 352910 (0.0028) [2024-06-23 03:45:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5782093824. Throughput: 0: 42535.9. Samples: 5782266180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 03:45:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 03:45:51,703][15401] Updated weights for policy 0, policy_version 352920 (0.0028) [2024-06-23 03:45:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5782323200. Throughput: 0: 42546.3. Samples: 5782397200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:45:53,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 03:45:54,850][15401] Updated weights for policy 0, policy_version 352930 (0.0025) [2024-06-23 03:45:58,393][15132] Fps is (10 sec: 42583.6, 60 sec: 42322.8, 300 sec: 42597.9). Total num frames: 5782519808. Throughput: 0: 42572.2. Samples: 5782653660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:45:58,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:45:59,403][15401] Updated weights for policy 0, policy_version 352940 (0.0032) [2024-06-23 03:45:59,417][15349] Signal inference workers to stop experience collection... (85750 times) [2024-06-23 03:45:59,417][15349] Signal inference workers to resume experience collection... (85750 times) [2024-06-23 03:45:59,438][15401] InferenceWorker_p0-w0: stopping experience collection (85750 times) [2024-06-23 03:45:59,438][15401] InferenceWorker_p0-w0: resuming experience collection (85750 times) [2024-06-23 03:46:02,466][15401] Updated weights for policy 0, policy_version 352950 (0.0029) [2024-06-23 03:46:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5782749184. Throughput: 0: 42615.1. Samples: 5782907140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:46:06,841][15401] Updated weights for policy 0, policy_version 352960 (0.0024) [2024-06-23 03:46:08,390][15132] Fps is (10 sec: 45891.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5782978560. Throughput: 0: 42771.5. Samples: 5783040140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 03:46:09,999][15401] Updated weights for policy 0, policy_version 352970 (0.0039) [2024-06-23 03:46:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5783175168. Throughput: 0: 42671.4. Samples: 5783297620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 03:46:14,547][15401] Updated weights for policy 0, policy_version 352980 (0.0040) [2024-06-23 03:46:18,040][15401] Updated weights for policy 0, policy_version 352990 (0.0042) [2024-06-23 03:46:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 5783404544. Throughput: 0: 42536.3. Samples: 5783547720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 03:46:22,338][15401] Updated weights for policy 0, policy_version 353000 (0.0041) [2024-06-23 03:46:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 5783601152. Throughput: 0: 42652.1. Samples: 5783675800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 03:46:25,501][15401] Updated weights for policy 0, policy_version 353010 (0.0040) [2024-06-23 03:46:28,393][15132] Fps is (10 sec: 40945.2, 60 sec: 42868.8, 300 sec: 42598.2). Total num frames: 5783814144. Throughput: 0: 42761.8. Samples: 5783938440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:28,394][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 03:46:29,999][15401] Updated weights for policy 0, policy_version 353020 (0.0025) [2024-06-23 03:46:33,159][15401] Updated weights for policy 0, policy_version 353030 (0.0032) [2024-06-23 03:46:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5784043520. Throughput: 0: 42745.8. Samples: 5784189740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 03:46:37,686][15401] Updated weights for policy 0, policy_version 353040 (0.0040) [2024-06-23 03:46:38,390][15132] Fps is (10 sec: 42614.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5784240128. Throughput: 0: 42694.6. Samples: 5784318460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 03:46:41,060][15401] Updated weights for policy 0, policy_version 353050 (0.0034) [2024-06-23 03:46:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5784453120. Throughput: 0: 42725.7. Samples: 5784576160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 03:46:45,271][15401] Updated weights for policy 0, policy_version 353060 (0.0032) [2024-06-23 03:46:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 5784682496. Throughput: 0: 42781.3. Samples: 5784832400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:48,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 03:46:48,589][15401] Updated weights for policy 0, policy_version 353070 (0.0041) [2024-06-23 03:46:52,825][15401] Updated weights for policy 0, policy_version 353080 (0.0037) [2024-06-23 03:46:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 5784895488. Throughput: 0: 42811.5. Samples: 5784966660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 03:46:56,177][15401] Updated weights for policy 0, policy_version 353090 (0.0029) [2024-06-23 03:46:58,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 5785075712. Throughput: 0: 42773.1. Samples: 5785222400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:46:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:47:00,599][15401] Updated weights for policy 0, policy_version 353100 (0.0035) [2024-06-23 03:47:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5785305088. Throughput: 0: 42856.9. Samples: 5785476280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 03:47:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 03:47:03,967][15401] Updated weights for policy 0, policy_version 353110 (0.0047) [2024-06-23 03:47:08,115][15401] Updated weights for policy 0, policy_version 353120 (0.0040) [2024-06-23 03:47:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5785534464. Throughput: 0: 42959.9. Samples: 5785609000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:08,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 03:47:11,810][15401] Updated weights for policy 0, policy_version 353130 (0.0029) [2024-06-23 03:47:13,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 5785731072. Throughput: 0: 42610.1. Samples: 5785855840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:13,392][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 03:47:15,643][15401] Updated weights for policy 0, policy_version 353140 (0.0040) [2024-06-23 03:47:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 5785944064. Throughput: 0: 42728.0. Samples: 5786112500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 03:47:19,490][15401] Updated weights for policy 0, policy_version 353150 (0.0039) [2024-06-23 03:47:22,691][15349] Signal inference workers to stop experience collection... (85800 times) [2024-06-23 03:47:22,692][15349] Signal inference workers to resume experience collection... (85800 times) [2024-06-23 03:47:22,748][15401] InferenceWorker_p0-w0: stopping experience collection (85800 times) [2024-06-23 03:47:22,748][15401] InferenceWorker_p0-w0: resuming experience collection (85800 times) [2024-06-23 03:47:23,346][15401] Updated weights for policy 0, policy_version 353160 (0.0038) [2024-06-23 03:47:23,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5786173440. Throughput: 0: 42836.4. Samples: 5786246100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 03:47:27,242][15401] Updated weights for policy 0, policy_version 353170 (0.0034) [2024-06-23 03:47:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5786370048. Throughput: 0: 42710.5. Samples: 5786498140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 03:47:31,330][15401] Updated weights for policy 0, policy_version 353180 (0.0036) [2024-06-23 03:47:33,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42764.6). Total num frames: 5786599424. Throughput: 0: 42630.1. Samples: 5786750760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:33,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 03:47:34,757][15401] Updated weights for policy 0, policy_version 353190 (0.0029) [2024-06-23 03:47:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5786796032. Throughput: 0: 42617.7. Samples: 5786884460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:38,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-23 03:47:38,814][15401] Updated weights for policy 0, policy_version 353200 (0.0032) [2024-06-23 03:47:42,303][15401] Updated weights for policy 0, policy_version 353210 (0.0037) [2024-06-23 03:47:43,389][15132] Fps is (10 sec: 42609.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5787025408. Throughput: 0: 42648.8. Samples: 5787141600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 03:47:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353213_5787041792.pth... [2024-06-23 03:47:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352587_5776785408.pth [2024-06-23 03:47:46,630][15401] Updated weights for policy 0, policy_version 353220 (0.0034) [2024-06-23 03:47:48,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 5787254784. Throughput: 0: 42422.7. Samples: 5787385300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 03:47:49,955][15401] Updated weights for policy 0, policy_version 353230 (0.0033) [2024-06-23 03:47:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5787418624. Throughput: 0: 42408.1. Samples: 5787517360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 03:47:54,552][15401] Updated weights for policy 0, policy_version 353240 (0.0031) [2024-06-23 03:47:58,008][15401] Updated weights for policy 0, policy_version 353250 (0.0038) [2024-06-23 03:47:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5787664384. Throughput: 0: 42638.3. Samples: 5787774460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:47:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 03:48:02,270][15401] Updated weights for policy 0, policy_version 353260 (0.0033) [2024-06-23 03:48:03,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5787893760. Throughput: 0: 42563.6. Samples: 5788027860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:48:03,392][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 03:48:05,651][15401] Updated weights for policy 0, policy_version 353270 (0.0038) [2024-06-23 03:48:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42654.1). Total num frames: 5788057600. Throughput: 0: 42442.8. Samples: 5788156020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:48:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 03:48:09,795][15401] Updated weights for policy 0, policy_version 353280 (0.0033) [2024-06-23 03:48:13,344][15401] Updated weights for policy 0, policy_version 353290 (0.0037) [2024-06-23 03:48:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 5788303360. Throughput: 0: 42558.4. Samples: 5788413260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:48:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 03:48:17,382][15401] Updated weights for policy 0, policy_version 353300 (0.0034) [2024-06-23 03:48:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5788499968. Throughput: 0: 42697.6. Samples: 5788672040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:48:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 03:48:20,942][15401] Updated weights for policy 0, policy_version 353310 (0.0037) [2024-06-23 03:48:23,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5788696576. Throughput: 0: 42415.1. Samples: 5788793140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 03:48:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 03:48:25,079][15401] Updated weights for policy 0, policy_version 353320 (0.0027) [2024-06-23 03:48:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5788942336. Throughput: 0: 42465.8. Samples: 5789052560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 03:48:28,493][15401] Updated weights for policy 0, policy_version 353330 (0.0030) [2024-06-23 03:48:32,669][15401] Updated weights for policy 0, policy_version 353340 (0.0046) [2024-06-23 03:48:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.1, 300 sec: 42598.7). Total num frames: 5789138944. Throughput: 0: 42802.7. Samples: 5789311420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 03:48:36,463][15401] Updated weights for policy 0, policy_version 353350 (0.0034) [2024-06-23 03:48:38,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 5789335552. Throughput: 0: 42641.7. Samples: 5789436340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:38,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 03:48:40,300][15401] Updated weights for policy 0, policy_version 353360 (0.0037) [2024-06-23 03:48:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 5789581312. Throughput: 0: 42611.0. Samples: 5789692060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:43,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:48:43,841][15401] Updated weights for policy 0, policy_version 353370 (0.0039) [2024-06-23 03:48:48,095][15401] Updated weights for policy 0, policy_version 353380 (0.0031) [2024-06-23 03:48:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5789777920. Throughput: 0: 42709.4. Samples: 5789949780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 03:48:51,320][15401] Updated weights for policy 0, policy_version 353390 (0.0037) [2024-06-23 03:48:53,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5789974528. Throughput: 0: 42682.1. Samples: 5790076720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:53,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 03:48:54,444][15349] Signal inference workers to stop experience collection... (85850 times) [2024-06-23 03:48:54,444][15349] Signal inference workers to resume experience collection... (85850 times) [2024-06-23 03:48:54,466][15401] InferenceWorker_p0-w0: stopping experience collection (85850 times) [2024-06-23 03:48:54,467][15401] InferenceWorker_p0-w0: resuming experience collection (85850 times) [2024-06-23 03:48:55,701][15401] Updated weights for policy 0, policy_version 353400 (0.0029) [2024-06-23 03:48:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5790203904. Throughput: 0: 42587.0. Samples: 5790329680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:48:58,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-23 03:48:59,096][15401] Updated weights for policy 0, policy_version 353410 (0.0038) [2024-06-23 03:49:03,283][15401] Updated weights for policy 0, policy_version 353420 (0.0038) [2024-06-23 03:49:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5790433280. Throughput: 0: 42653.6. Samples: 5790591460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 03:49:07,176][15401] Updated weights for policy 0, policy_version 353430 (0.0038) [2024-06-23 03:49:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5790613504. Throughput: 0: 42726.2. Samples: 5790715820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 03:49:10,863][15401] Updated weights for policy 0, policy_version 353440 (0.0046) [2024-06-23 03:49:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5790859264. Throughput: 0: 42713.6. Samples: 5790974680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 03:49:14,668][15401] Updated weights for policy 0, policy_version 353450 (0.0037) [2024-06-23 03:49:18,315][15401] Updated weights for policy 0, policy_version 353460 (0.0035) [2024-06-23 03:49:18,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 5791088640. Throughput: 0: 42768.5. Samples: 5791236000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 03:49:22,310][15401] Updated weights for policy 0, policy_version 353470 (0.0029) [2024-06-23 03:49:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5791268864. Throughput: 0: 42830.8. Samples: 5791363620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 03:49:26,110][15401] Updated weights for policy 0, policy_version 353480 (0.0024) [2024-06-23 03:49:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 5791498240. Throughput: 0: 42722.3. Samples: 5791614460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 03:49:30,122][15401] Updated weights for policy 0, policy_version 353490 (0.0038) [2024-06-23 03:49:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5791711232. Throughput: 0: 42656.0. Samples: 5791869300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:33,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 03:49:33,794][15401] Updated weights for policy 0, policy_version 353500 (0.0026) [2024-06-23 03:49:37,653][15401] Updated weights for policy 0, policy_version 353510 (0.0037) [2024-06-23 03:49:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 5791907840. Throughput: 0: 42761.3. Samples: 5792000980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 03:49:38,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 03:49:41,280][15401] Updated weights for policy 0, policy_version 353520 (0.0030) [2024-06-23 03:49:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 5792137216. Throughput: 0: 42745.7. Samples: 5792253240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:49:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 03:49:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353524_5792137216.pth... [2024-06-23 03:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000352900_5781913600.pth [2024-06-23 03:49:45,278][15401] Updated weights for policy 0, policy_version 353530 (0.0041) [2024-06-23 03:49:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5792366592. Throughput: 0: 42770.0. Samples: 5792516100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:49:48,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 03:49:48,725][15401] Updated weights for policy 0, policy_version 353540 (0.0032) [2024-06-23 03:49:53,324][15401] Updated weights for policy 0, policy_version 353550 (0.0034) [2024-06-23 03:49:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5792563200. Throughput: 0: 42847.1. Samples: 5792643940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:49:53,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 03:49:56,824][15401] Updated weights for policy 0, policy_version 353560 (0.0032) [2024-06-23 03:49:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5792792576. Throughput: 0: 42735.1. Samples: 5792897760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:49:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 03:50:01,013][15401] Updated weights for policy 0, policy_version 353570 (0.0044) [2024-06-23 03:50:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5792989184. Throughput: 0: 42648.4. Samples: 5793155180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 03:50:04,399][15401] Updated weights for policy 0, policy_version 353580 (0.0024) [2024-06-23 03:50:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 5793202176. Throughput: 0: 42600.8. Samples: 5793280760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:08,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 03:50:08,606][15401] Updated weights for policy 0, policy_version 353590 (0.0044) [2024-06-23 03:50:12,094][15349] Signal inference workers to stop experience collection... (85900 times) [2024-06-23 03:50:12,101][15349] Signal inference workers to resume experience collection... (85900 times) [2024-06-23 03:50:12,119][15401] Updated weights for policy 0, policy_version 353600 (0.0031) [2024-06-23 03:50:12,145][15401] InferenceWorker_p0-w0: stopping experience collection (85900 times) [2024-06-23 03:50:12,145][15401] InferenceWorker_p0-w0: resuming experience collection (85900 times) [2024-06-23 03:50:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5793431552. Throughput: 0: 42686.6. Samples: 5793535360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 03:50:16,203][15401] Updated weights for policy 0, policy_version 353610 (0.0032) [2024-06-23 03:50:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5793611776. Throughput: 0: 42913.7. Samples: 5793800420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 03:50:19,599][15401] Updated weights for policy 0, policy_version 353620 (0.0033) [2024-06-23 03:50:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5793841152. Throughput: 0: 42561.8. Samples: 5793916260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 03:50:23,829][15401] Updated weights for policy 0, policy_version 353630 (0.0036) [2024-06-23 03:50:27,291][15401] Updated weights for policy 0, policy_version 353640 (0.0036) [2024-06-23 03:50:28,396][15132] Fps is (10 sec: 47483.5, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 5794086912. Throughput: 0: 42778.9. Samples: 5794178560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:28,396][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 03:50:31,466][15401] Updated weights for policy 0, policy_version 353650 (0.0031) [2024-06-23 03:50:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5794250752. Throughput: 0: 42791.5. Samples: 5794441720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 03:50:34,923][15401] Updated weights for policy 0, policy_version 353660 (0.0035) [2024-06-23 03:50:38,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5794496512. Throughput: 0: 42554.8. Samples: 5794558900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:38,393][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 03:50:39,076][15401] Updated weights for policy 0, policy_version 353670 (0.0034) [2024-06-23 03:50:42,625][15401] Updated weights for policy 0, policy_version 353680 (0.0038) [2024-06-23 03:50:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5794725888. Throughput: 0: 42709.4. Samples: 5794819680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 03:50:46,880][15401] Updated weights for policy 0, policy_version 353690 (0.0033) [2024-06-23 03:50:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5794889728. Throughput: 0: 42795.2. Samples: 5795080960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:50:50,349][15401] Updated weights for policy 0, policy_version 353700 (0.0033) [2024-06-23 03:50:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42710.0). Total num frames: 5795119104. Throughput: 0: 42678.3. Samples: 5795201180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:50:54,472][15401] Updated weights for policy 0, policy_version 353710 (0.0026) [2024-06-23 03:50:57,836][15401] Updated weights for policy 0, policy_version 353720 (0.0042) [2024-06-23 03:50:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5795348480. Throughput: 0: 42888.9. Samples: 5795465360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 03:50:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 03:51:02,158][15401] Updated weights for policy 0, policy_version 353730 (0.0027) [2024-06-23 03:51:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5795528704. Throughput: 0: 42717.9. Samples: 5795722720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 03:51:05,903][15401] Updated weights for policy 0, policy_version 353740 (0.0042) [2024-06-23 03:51:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 5795758080. Throughput: 0: 42829.4. Samples: 5795843580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 03:51:10,273][15401] Updated weights for policy 0, policy_version 353750 (0.0055) [2024-06-23 03:51:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5795971072. Throughput: 0: 42679.8. Samples: 5796098880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 03:51:13,628][15401] Updated weights for policy 0, policy_version 353760 (0.0037) [2024-06-23 03:51:17,992][15401] Updated weights for policy 0, policy_version 353770 (0.0029) [2024-06-23 03:51:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5796167680. Throughput: 0: 42395.1. Samples: 5796349500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 03:51:21,570][15401] Updated weights for policy 0, policy_version 353780 (0.0031) [2024-06-23 03:51:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5796397056. Throughput: 0: 42600.0. Samples: 5796475900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:23,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 03:51:25,671][15401] Updated weights for policy 0, policy_version 353790 (0.0029) [2024-06-23 03:51:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41783.6, 300 sec: 42542.9). Total num frames: 5796593664. Throughput: 0: 42412.9. Samples: 5796728260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:28,399][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 03:51:29,134][15401] Updated weights for policy 0, policy_version 353800 (0.0035) [2024-06-23 03:51:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5796806656. Throughput: 0: 42250.9. Samples: 5796982260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:33,399][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 03:51:33,522][15401] Updated weights for policy 0, policy_version 353810 (0.0041) [2024-06-23 03:51:37,111][15401] Updated weights for policy 0, policy_version 353820 (0.0035) [2024-06-23 03:51:37,689][15349] Signal inference workers to stop experience collection... (85950 times) [2024-06-23 03:51:37,715][15401] InferenceWorker_p0-w0: stopping experience collection (85950 times) [2024-06-23 03:51:37,753][15349] Signal inference workers to resume experience collection... (85950 times) [2024-06-23 03:51:37,754][15401] InferenceWorker_p0-w0: resuming experience collection (85950 times) [2024-06-23 03:51:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5797036032. Throughput: 0: 42527.5. Samples: 5797114920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 03:51:41,318][15401] Updated weights for policy 0, policy_version 353830 (0.0034) [2024-06-23 03:51:43,391][15132] Fps is (10 sec: 44228.7, 60 sec: 42050.9, 300 sec: 42598.5). Total num frames: 5797249024. Throughput: 0: 42349.8. Samples: 5797371180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:43,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 03:51:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353836_5797249024.pth... [2024-06-23 03:51:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353213_5787041792.pth [2024-06-23 03:51:44,906][15401] Updated weights for policy 0, policy_version 353840 (0.0032) [2024-06-23 03:51:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5797462016. Throughput: 0: 42122.0. Samples: 5797618220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 03:51:49,074][15401] Updated weights for policy 0, policy_version 353850 (0.0041) [2024-06-23 03:51:52,434][15401] Updated weights for policy 0, policy_version 353860 (0.0037) [2024-06-23 03:51:53,389][15132] Fps is (10 sec: 42607.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5797675008. Throughput: 0: 42214.3. Samples: 5797743220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 03:51:56,718][15401] Updated weights for policy 0, policy_version 353870 (0.0037) [2024-06-23 03:51:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 5797888000. Throughput: 0: 42242.8. Samples: 5797999800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:51:58,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-23 03:52:00,118][15401] Updated weights for policy 0, policy_version 353880 (0.0031) [2024-06-23 03:52:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5798068224. Throughput: 0: 42421.9. Samples: 5798258480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:52:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 03:52:04,238][15401] Updated weights for policy 0, policy_version 353890 (0.0029) [2024-06-23 03:52:07,836][15401] Updated weights for policy 0, policy_version 353900 (0.0043) [2024-06-23 03:52:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 5798313984. Throughput: 0: 42342.3. Samples: 5798381400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:52:08,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 03:52:12,011][15401] Updated weights for policy 0, policy_version 353910 (0.0025) [2024-06-23 03:52:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5798526976. Throughput: 0: 42500.5. Samples: 5798640780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 03:52:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 03:52:15,507][15401] Updated weights for policy 0, policy_version 353920 (0.0027) [2024-06-23 03:52:18,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5798723584. Throughput: 0: 42621.3. Samples: 5798900220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 03:52:19,708][15401] Updated weights for policy 0, policy_version 353930 (0.0027) [2024-06-23 03:52:23,106][15401] Updated weights for policy 0, policy_version 353940 (0.0030) [2024-06-23 03:52:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5798952960. Throughput: 0: 42450.2. Samples: 5799025180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:23,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 03:52:27,625][15401] Updated weights for policy 0, policy_version 353950 (0.0040) [2024-06-23 03:52:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 5799149568. Throughput: 0: 42547.1. Samples: 5799285720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 03:52:30,698][15401] Updated weights for policy 0, policy_version 353960 (0.0037) [2024-06-23 03:52:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5799362560. Throughput: 0: 42676.0. Samples: 5799538640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:33,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-23 03:52:35,525][15401] Updated weights for policy 0, policy_version 353970 (0.0037) [2024-06-23 03:52:38,342][15401] Updated weights for policy 0, policy_version 353980 (0.0029) [2024-06-23 03:52:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5799608320. Throughput: 0: 42710.0. Samples: 5799665180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:38,395][15132] Avg episode reward: [(0, '0.172')] [2024-06-23 03:52:43,133][15401] Updated weights for policy 0, policy_version 353990 (0.0035) [2024-06-23 03:52:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42053.6, 300 sec: 42431.8). Total num frames: 5799772160. Throughput: 0: 42735.5. Samples: 5799922900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 03:52:45,996][15401] Updated weights for policy 0, policy_version 354000 (0.0036) [2024-06-23 03:52:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5800017920. Throughput: 0: 42545.7. Samples: 5800173040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 03:52:50,567][15401] Updated weights for policy 0, policy_version 354010 (0.0031) [2024-06-23 03:52:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5800247296. Throughput: 0: 42830.4. Samples: 5800308660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 03:52:53,761][15401] Updated weights for policy 0, policy_version 354020 (0.0032) [2024-06-23 03:52:57,890][15349] Signal inference workers to stop experience collection... (86000 times) [2024-06-23 03:52:57,896][15349] Signal inference workers to resume experience collection... (86000 times) [2024-06-23 03:52:57,904][15401] InferenceWorker_p0-w0: stopping experience collection (86000 times) [2024-06-23 03:52:57,938][15401] InferenceWorker_p0-w0: resuming experience collection (86000 times) [2024-06-23 03:52:58,026][15401] Updated weights for policy 0, policy_version 354030 (0.0041) [2024-06-23 03:52:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5800427520. Throughput: 0: 42829.4. Samples: 5800568100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:52:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 03:53:01,649][15401] Updated weights for policy 0, policy_version 354040 (0.0033) [2024-06-23 03:53:03,390][15132] Fps is (10 sec: 40958.8, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5800656896. Throughput: 0: 42623.1. Samples: 5800818260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 03:53:06,316][15401] Updated weights for policy 0, policy_version 354050 (0.0035) [2024-06-23 03:53:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 5800886272. Throughput: 0: 42791.6. Samples: 5800950800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:08,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 03:53:09,150][15401] Updated weights for policy 0, policy_version 354060 (0.0035) [2024-06-23 03:53:13,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5801050112. Throughput: 0: 42639.1. Samples: 5801204480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 03:53:13,841][15401] Updated weights for policy 0, policy_version 354070 (0.0033) [2024-06-23 03:53:17,093][15401] Updated weights for policy 0, policy_version 354080 (0.0037) [2024-06-23 03:53:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5801295872. Throughput: 0: 42640.4. Samples: 5801457460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 03:53:21,394][15401] Updated weights for policy 0, policy_version 354090 (0.0023) [2024-06-23 03:53:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5801508864. Throughput: 0: 42795.5. Samples: 5801590980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 03:53:24,649][15401] Updated weights for policy 0, policy_version 354100 (0.0023) [2024-06-23 03:53:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5801705472. Throughput: 0: 42599.1. Samples: 5801839860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 03:53:28,989][15401] Updated weights for policy 0, policy_version 354110 (0.0034) [2024-06-23 03:53:32,236][15401] Updated weights for policy 0, policy_version 354120 (0.0036) [2024-06-23 03:53:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5801951232. Throughput: 0: 42755.5. Samples: 5802097040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 03:53:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 03:53:36,487][15401] Updated weights for policy 0, policy_version 354130 (0.0045) [2024-06-23 03:53:38,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 5802164224. Throughput: 0: 42703.3. Samples: 5802230420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:53:38,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 03:53:40,440][15401] Updated weights for policy 0, policy_version 354140 (0.0047) [2024-06-23 03:53:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 5802360832. Throughput: 0: 42580.4. Samples: 5802484320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:53:43,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 03:53:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354148_5802360832.pth... [2024-06-23 03:53:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353524_5792137216.pth [2024-06-23 03:53:44,074][15401] Updated weights for policy 0, policy_version 354150 (0.0027) [2024-06-23 03:53:47,948][15401] Updated weights for policy 0, policy_version 354160 (0.0027) [2024-06-23 03:53:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5802590208. Throughput: 0: 42613.1. Samples: 5802735840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:53:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 03:53:51,558][15401] Updated weights for policy 0, policy_version 354170 (0.0037) [2024-06-23 03:53:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5802803200. Throughput: 0: 42608.1. Samples: 5802868160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:53:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 03:53:55,484][15401] Updated weights for policy 0, policy_version 354180 (0.0033) [2024-06-23 03:53:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5802999808. Throughput: 0: 42687.6. Samples: 5803125420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:53:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 03:53:59,192][15401] Updated weights for policy 0, policy_version 354190 (0.0040) [2024-06-23 03:54:03,144][15401] Updated weights for policy 0, policy_version 354200 (0.0027) [2024-06-23 03:54:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5803229184. Throughput: 0: 42795.1. Samples: 5803383240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 03:54:06,692][15401] Updated weights for policy 0, policy_version 354210 (0.0035) [2024-06-23 03:54:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5803458560. Throughput: 0: 42706.2. Samples: 5803512760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 03:54:10,781][15401] Updated weights for policy 0, policy_version 354220 (0.0028) [2024-06-23 03:54:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 5803638784. Throughput: 0: 42934.8. Samples: 5803771920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 03:54:14,234][15401] Updated weights for policy 0, policy_version 354230 (0.0032) [2024-06-23 03:54:17,552][15349] Signal inference workers to stop experience collection... (86050 times) [2024-06-23 03:54:17,552][15349] Signal inference workers to resume experience collection... (86050 times) [2024-06-23 03:54:17,577][15401] InferenceWorker_p0-w0: stopping experience collection (86050 times) [2024-06-23 03:54:17,577][15401] InferenceWorker_p0-w0: resuming experience collection (86050 times) [2024-06-23 03:54:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5803851776. Throughput: 0: 42924.6. Samples: 5804028640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 03:54:18,433][15401] Updated weights for policy 0, policy_version 354240 (0.0032) [2024-06-23 03:54:22,239][15401] Updated weights for policy 0, policy_version 354250 (0.0035) [2024-06-23 03:54:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5804097536. Throughput: 0: 42727.6. Samples: 5804153060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 03:54:26,217][15401] Updated weights for policy 0, policy_version 354260 (0.0033) [2024-06-23 03:54:28,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 5804277760. Throughput: 0: 42735.1. Samples: 5804407400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:28,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 03:54:29,725][15401] Updated weights for policy 0, policy_version 354270 (0.0027) [2024-06-23 03:54:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5804490752. Throughput: 0: 42904.5. Samples: 5804666540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:33,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 03:54:33,683][15401] Updated weights for policy 0, policy_version 354280 (0.0031) [2024-06-23 03:54:37,293][15401] Updated weights for policy 0, policy_version 354290 (0.0034) [2024-06-23 03:54:38,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5804736512. Throughput: 0: 42884.9. Samples: 5804797980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 03:54:41,685][15401] Updated weights for policy 0, policy_version 354300 (0.0043) [2024-06-23 03:54:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 5804933120. Throughput: 0: 42882.4. Samples: 5805055120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 03:54:44,710][15401] Updated weights for policy 0, policy_version 354310 (0.0029) [2024-06-23 03:54:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5805129728. Throughput: 0: 42981.9. Samples: 5805317420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 03:54:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 03:54:49,380][15401] Updated weights for policy 0, policy_version 354320 (0.0022) [2024-06-23 03:54:52,437][15401] Updated weights for policy 0, policy_version 354330 (0.0032) [2024-06-23 03:54:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5805391872. Throughput: 0: 42866.7. Samples: 5805441760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:54:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 03:54:57,186][15401] Updated weights for policy 0, policy_version 354340 (0.0038) [2024-06-23 03:54:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5805572096. Throughput: 0: 42911.0. Samples: 5805702920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:54:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 03:55:00,044][15401] Updated weights for policy 0, policy_version 354350 (0.0029) [2024-06-23 03:55:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5805801472. Throughput: 0: 42768.2. Samples: 5805953220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 03:55:04,883][15401] Updated weights for policy 0, policy_version 354360 (0.0030) [2024-06-23 03:55:07,630][15401] Updated weights for policy 0, policy_version 354370 (0.0032) [2024-06-23 03:55:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5806014464. Throughput: 0: 42771.0. Samples: 5806077760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:08,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 03:55:12,583][15401] Updated weights for policy 0, policy_version 354380 (0.0038) [2024-06-23 03:55:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5806211072. Throughput: 0: 42971.1. Samples: 5806341000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 03:55:15,498][15401] Updated weights for policy 0, policy_version 354390 (0.0043) [2024-06-23 03:55:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5806424064. Throughput: 0: 42841.2. Samples: 5806594400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 03:55:20,162][15401] Updated weights for policy 0, policy_version 354400 (0.0039) [2024-06-23 03:55:23,155][15401] Updated weights for policy 0, policy_version 354410 (0.0027) [2024-06-23 03:55:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 5806653440. Throughput: 0: 42774.5. Samples: 5806722840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 03:55:27,717][15401] Updated weights for policy 0, policy_version 354420 (0.0030) [2024-06-23 03:55:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5806850048. Throughput: 0: 42927.4. Samples: 5806986860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 03:55:30,862][15401] Updated weights for policy 0, policy_version 354430 (0.0040) [2024-06-23 03:55:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 5807079424. Throughput: 0: 42605.3. Samples: 5807234760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:33,393][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 03:55:35,330][15401] Updated weights for policy 0, policy_version 354440 (0.0033) [2024-06-23 03:55:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5807276032. Throughput: 0: 42654.4. Samples: 5807361200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 03:55:38,580][15401] Updated weights for policy 0, policy_version 354450 (0.0040) [2024-06-23 03:55:42,870][15401] Updated weights for policy 0, policy_version 354460 (0.0023) [2024-06-23 03:55:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5807489024. Throughput: 0: 42748.8. Samples: 5807626620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 03:55:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354461_5807489024.pth... [2024-06-23 03:55:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000353836_5797249024.pth [2024-06-23 03:55:46,012][15349] Signal inference workers to stop experience collection... (86100 times) [2024-06-23 03:55:46,012][15349] Signal inference workers to resume experience collection... (86100 times) [2024-06-23 03:55:46,026][15401] InferenceWorker_p0-w0: stopping experience collection (86100 times) [2024-06-23 03:55:46,026][15401] InferenceWorker_p0-w0: resuming experience collection (86100 times) [2024-06-23 03:55:46,154][15401] Updated weights for policy 0, policy_version 354470 (0.0037) [2024-06-23 03:55:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5807718400. Throughput: 0: 42620.5. Samples: 5807871140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 03:55:50,570][15401] Updated weights for policy 0, policy_version 354480 (0.0034) [2024-06-23 03:55:53,391][15132] Fps is (10 sec: 44232.7, 60 sec: 42324.7, 300 sec: 42653.8). Total num frames: 5807931392. Throughput: 0: 42743.6. Samples: 5808001260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:53,391][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 03:55:53,767][15401] Updated weights for policy 0, policy_version 354490 (0.0038) [2024-06-23 03:55:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5808111616. Throughput: 0: 42611.7. Samples: 5808258520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:55:58,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 03:55:58,403][15401] Updated weights for policy 0, policy_version 354500 (0.0037) [2024-06-23 03:56:01,565][15401] Updated weights for policy 0, policy_version 354510 (0.0027) [2024-06-23 03:56:03,390][15132] Fps is (10 sec: 42602.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5808357376. Throughput: 0: 42562.2. Samples: 5808509700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:56:03,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 03:56:06,227][15401] Updated weights for policy 0, policy_version 354520 (0.0029) [2024-06-23 03:56:08,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5808553984. Throughput: 0: 42707.5. Samples: 5808644680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 03:56:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 03:56:09,094][15401] Updated weights for policy 0, policy_version 354530 (0.0040) [2024-06-23 03:56:13,390][15132] Fps is (10 sec: 36044.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 5808717824. Throughput: 0: 42370.2. Samples: 5808893520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:13,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 03:56:14,044][15401] Updated weights for policy 0, policy_version 354540 (0.0028) [2024-06-23 03:56:16,846][15401] Updated weights for policy 0, policy_version 354550 (0.0037) [2024-06-23 03:56:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5808996352. Throughput: 0: 42360.0. Samples: 5809140860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 03:56:21,839][15401] Updated weights for policy 0, policy_version 354560 (0.0031) [2024-06-23 03:56:23,390][15132] Fps is (10 sec: 49152.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5809209344. Throughput: 0: 42627.4. Samples: 5809279440. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:23,393][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 03:56:24,545][15401] Updated weights for policy 0, policy_version 354570 (0.0029) [2024-06-23 03:56:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5809373184. Throughput: 0: 42202.7. Samples: 5809525740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 03:56:29,409][15401] Updated weights for policy 0, policy_version 354580 (0.0033) [2024-06-23 03:56:32,141][15401] Updated weights for policy 0, policy_version 354590 (0.0030) [2024-06-23 03:56:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42598.4, 300 sec: 42709.1). Total num frames: 5809635328. Throughput: 0: 42462.2. Samples: 5809782040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:33,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 03:56:36,953][15401] Updated weights for policy 0, policy_version 354600 (0.0046) [2024-06-23 03:56:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 5809831936. Throughput: 0: 42572.5. Samples: 5809916980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 03:56:39,769][15401] Updated weights for policy 0, policy_version 354610 (0.0030) [2024-06-23 03:56:43,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5810028544. Throughput: 0: 42546.7. Samples: 5810173120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 03:56:44,464][15401] Updated weights for policy 0, policy_version 354620 (0.0036) [2024-06-23 03:56:47,399][15401] Updated weights for policy 0, policy_version 354630 (0.0029) [2024-06-23 03:56:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5810274304. Throughput: 0: 42486.2. Samples: 5810421580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 03:56:51,998][15401] Updated weights for policy 0, policy_version 354640 (0.0031) [2024-06-23 03:56:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42326.1, 300 sec: 42653.9). Total num frames: 5810470912. Throughput: 0: 42607.3. Samples: 5810562000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 03:56:55,245][15401] Updated weights for policy 0, policy_version 354650 (0.0039) [2024-06-23 03:56:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5810667520. Throughput: 0: 42697.8. Samples: 5810814920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:56:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 03:56:59,833][15401] Updated weights for policy 0, policy_version 354660 (0.0049) [2024-06-23 03:57:02,931][15401] Updated weights for policy 0, policy_version 354670 (0.0050) [2024-06-23 03:57:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5810929664. Throughput: 0: 42661.0. Samples: 5811060600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:57:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 03:57:07,399][15401] Updated weights for policy 0, policy_version 354680 (0.0027) [2024-06-23 03:57:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5811109888. Throughput: 0: 42570.3. Samples: 5811195100. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:57:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 03:57:10,633][15401] Updated weights for policy 0, policy_version 354690 (0.0038) [2024-06-23 03:57:13,390][15132] Fps is (10 sec: 37682.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5811306496. Throughput: 0: 42730.7. Samples: 5811448620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:57:13,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 03:57:15,204][15401] Updated weights for policy 0, policy_version 354700 (0.0038) [2024-06-23 03:57:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5811552256. Throughput: 0: 42525.0. Samples: 5811695560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:57:18,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 03:57:18,547][15401] Updated weights for policy 0, policy_version 354710 (0.0036) [2024-06-23 03:57:22,665][15401] Updated weights for policy 0, policy_version 354720 (0.0035) [2024-06-23 03:57:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5811748864. Throughput: 0: 42640.1. Samples: 5811835780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-23 03:57:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 03:57:26,015][15401] Updated weights for policy 0, policy_version 354730 (0.0039) [2024-06-23 03:57:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5811961856. Throughput: 0: 42676.9. Samples: 5812093580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 03:57:29,954][15349] Signal inference workers to stop experience collection... (86150 times) [2024-06-23 03:57:30,010][15401] InferenceWorker_p0-w0: stopping experience collection (86150 times) [2024-06-23 03:57:30,070][15349] Signal inference workers to resume experience collection... (86150 times) [2024-06-23 03:57:30,071][15401] InferenceWorker_p0-w0: resuming experience collection (86150 times) [2024-06-23 03:57:30,212][15401] Updated weights for policy 0, policy_version 354740 (0.0024) [2024-06-23 03:57:33,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 5812207616. Throughput: 0: 42803.5. Samples: 5812347740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 03:57:33,450][15401] Updated weights for policy 0, policy_version 354750 (0.0024) [2024-06-23 03:57:37,766][15401] Updated weights for policy 0, policy_version 354760 (0.0029) [2024-06-23 03:57:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5812404224. Throughput: 0: 42683.9. Samples: 5812482780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 03:57:41,656][15401] Updated weights for policy 0, policy_version 354770 (0.0049) [2024-06-23 03:57:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5812600832. Throughput: 0: 42627.5. Samples: 5812733160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 03:57:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354774_5812617216.pth... [2024-06-23 03:57:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354148_5802360832.pth [2024-06-23 03:57:45,306][15401] Updated weights for policy 0, policy_version 354780 (0.0043) [2024-06-23 03:57:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5812813824. Throughput: 0: 42845.2. Samples: 5812988640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 03:57:49,232][15401] Updated weights for policy 0, policy_version 354790 (0.0027) [2024-06-23 03:57:52,958][15401] Updated weights for policy 0, policy_version 354800 (0.0032) [2024-06-23 03:57:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5813043200. Throughput: 0: 42641.8. Samples: 5813113980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 03:57:56,852][15401] Updated weights for policy 0, policy_version 354810 (0.0031) [2024-06-23 03:57:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5813256192. Throughput: 0: 42661.4. Samples: 5813368380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:57:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 03:58:01,112][15401] Updated weights for policy 0, policy_version 354820 (0.0047) [2024-06-23 03:58:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5813469184. Throughput: 0: 42979.9. Samples: 5813629660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 03:58:04,982][15401] Updated weights for policy 0, policy_version 354830 (0.0029) [2024-06-23 03:58:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5813682176. Throughput: 0: 42699.9. Samples: 5813757280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 03:58:08,573][15401] Updated weights for policy 0, policy_version 354840 (0.0038) [2024-06-23 03:58:12,602][15401] Updated weights for policy 0, policy_version 354850 (0.0035) [2024-06-23 03:58:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 5813895168. Throughput: 0: 42690.9. Samples: 5814014780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:13,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 03:58:16,286][15401] Updated weights for policy 0, policy_version 354860 (0.0033) [2024-06-23 03:58:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 5814124544. Throughput: 0: 42590.7. Samples: 5814264420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:18,392][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 03:58:20,249][15401] Updated weights for policy 0, policy_version 354870 (0.0040) [2024-06-23 03:58:23,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5814304768. Throughput: 0: 42497.9. Samples: 5814395180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 03:58:23,985][15401] Updated weights for policy 0, policy_version 354880 (0.0037) [2024-06-23 03:58:27,865][15401] Updated weights for policy 0, policy_version 354890 (0.0034) [2024-06-23 03:58:28,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5814534144. Throughput: 0: 42736.1. Samples: 5814656280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:28,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 03:58:31,568][15401] Updated weights for policy 0, policy_version 354900 (0.0048) [2024-06-23 03:58:33,394][15132] Fps is (10 sec: 45853.7, 60 sec: 42595.2, 300 sec: 42709.2). Total num frames: 5814763520. Throughput: 0: 42580.7. Samples: 5814904960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:33,395][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 03:58:35,570][15401] Updated weights for policy 0, policy_version 354910 (0.0029) [2024-06-23 03:58:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 5814960128. Throughput: 0: 42761.5. Samples: 5815038260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 03:58:38,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 03:58:39,318][15401] Updated weights for policy 0, policy_version 354920 (0.0037) [2024-06-23 03:58:43,153][15401] Updated weights for policy 0, policy_version 354930 (0.0023) [2024-06-23 03:58:43,390][15132] Fps is (10 sec: 42617.3, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 5815189504. Throughput: 0: 42888.3. Samples: 5815298360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:58:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 03:58:46,886][15401] Updated weights for policy 0, policy_version 354940 (0.0029) [2024-06-23 03:58:48,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5815386112. Throughput: 0: 42646.8. Samples: 5815548760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:58:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 03:58:51,134][15401] Updated weights for policy 0, policy_version 354950 (0.0036) [2024-06-23 03:58:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5815599104. Throughput: 0: 42800.8. Samples: 5815683320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:58:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 03:58:54,417][15401] Updated weights for policy 0, policy_version 354960 (0.0041) [2024-06-23 03:58:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5815812096. Throughput: 0: 42838.4. Samples: 5815942400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:58:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 03:58:58,511][15401] Updated weights for policy 0, policy_version 354970 (0.0028) [2024-06-23 03:59:02,129][15401] Updated weights for policy 0, policy_version 354980 (0.0036) [2024-06-23 03:59:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5816041472. Throughput: 0: 42864.9. Samples: 5816193240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 03:59:05,657][15349] Signal inference workers to stop experience collection... (86200 times) [2024-06-23 03:59:05,704][15401] InferenceWorker_p0-w0: stopping experience collection (86200 times) [2024-06-23 03:59:05,767][15349] Signal inference workers to resume experience collection... (86200 times) [2024-06-23 03:59:05,768][15401] InferenceWorker_p0-w0: resuming experience collection (86200 times) [2024-06-23 03:59:05,902][15401] Updated weights for policy 0, policy_version 354990 (0.0037) [2024-06-23 03:59:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5816238080. Throughput: 0: 42978.1. Samples: 5816329200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 03:59:09,633][15401] Updated weights for policy 0, policy_version 355000 (0.0030) [2024-06-23 03:59:13,365][15401] Updated weights for policy 0, policy_version 355010 (0.0034) [2024-06-23 03:59:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 5816483840. Throughput: 0: 43020.8. Samples: 5816592220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 03:59:17,172][15401] Updated weights for policy 0, policy_version 355020 (0.0032) [2024-06-23 03:59:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 5816696832. Throughput: 0: 43067.4. Samples: 5816842800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 03:59:21,258][15401] Updated weights for policy 0, policy_version 355030 (0.0038) [2024-06-23 03:59:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.3, 300 sec: 42765.3). Total num frames: 5816893440. Throughput: 0: 43097.0. Samples: 5816977620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:23,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 03:59:24,747][15401] Updated weights for policy 0, policy_version 355040 (0.0038) [2024-06-23 03:59:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5817106432. Throughput: 0: 43102.4. Samples: 5817237960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 03:59:28,823][15401] Updated weights for policy 0, policy_version 355050 (0.0030) [2024-06-23 03:59:32,418][15401] Updated weights for policy 0, policy_version 355060 (0.0028) [2024-06-23 03:59:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43147.8, 300 sec: 42765.0). Total num frames: 5817352192. Throughput: 0: 43008.0. Samples: 5817484120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 03:59:36,705][15401] Updated weights for policy 0, policy_version 355070 (0.0033) [2024-06-23 03:59:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 5817548800. Throughput: 0: 43088.0. Samples: 5817622380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:38,392][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 03:59:40,096][15401] Updated weights for policy 0, policy_version 355080 (0.0057) [2024-06-23 03:59:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5817761792. Throughput: 0: 43177.2. Samples: 5817885380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:43,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 03:59:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355088_5817761792.pth... [2024-06-23 03:59:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354461_5807489024.pth [2024-06-23 03:59:44,179][15401] Updated weights for policy 0, policy_version 355090 (0.0037) [2024-06-23 03:59:47,583][15401] Updated weights for policy 0, policy_version 355100 (0.0036) [2024-06-23 03:59:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5817991168. Throughput: 0: 43163.2. Samples: 5818135580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 03:59:51,670][15401] Updated weights for policy 0, policy_version 355110 (0.0032) [2024-06-23 03:59:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5818187776. Throughput: 0: 43104.6. Samples: 5818268900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 03:59:55,244][15401] Updated weights for policy 0, policy_version 355120 (0.0030) [2024-06-23 03:59:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5818384384. Throughput: 0: 42858.7. Samples: 5818520860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 03:59:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 03:59:59,314][15401] Updated weights for policy 0, policy_version 355130 (0.0027) [2024-06-23 04:00:02,939][15401] Updated weights for policy 0, policy_version 355140 (0.0023) [2024-06-23 04:00:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5818613760. Throughput: 0: 42794.4. Samples: 5818768540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 04:00:07,468][15401] Updated weights for policy 0, policy_version 355150 (0.0032) [2024-06-23 04:00:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5818810368. Throughput: 0: 42782.8. Samples: 5818902840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 04:00:10,486][15401] Updated weights for policy 0, policy_version 355160 (0.0039) [2024-06-23 04:00:13,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 5819023360. Throughput: 0: 42524.8. Samples: 5819151680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:13,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 04:00:15,307][15401] Updated weights for policy 0, policy_version 355170 (0.0042) [2024-06-23 04:00:18,251][15401] Updated weights for policy 0, policy_version 355180 (0.0044) [2024-06-23 04:00:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5819269120. Throughput: 0: 42616.5. Samples: 5819401860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 04:00:23,101][15401] Updated weights for policy 0, policy_version 355190 (0.0033) [2024-06-23 04:00:23,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5819449344. Throughput: 0: 42516.0. Samples: 5819535500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 04:00:25,867][15401] Updated weights for policy 0, policy_version 355200 (0.0034) [2024-06-23 04:00:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 5819678720. Throughput: 0: 42365.4. Samples: 5819791820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 04:00:30,797][15401] Updated weights for policy 0, policy_version 355210 (0.0030) [2024-06-23 04:00:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5819908096. Throughput: 0: 42374.2. Samples: 5820042420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 04:00:33,479][15401] Updated weights for policy 0, policy_version 355220 (0.0023) [2024-06-23 04:00:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 5820055552. Throughput: 0: 42366.2. Samples: 5820175380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 04:00:38,461][15349] Signal inference workers to stop experience collection... (86250 times) [2024-06-23 04:00:38,467][15349] Signal inference workers to resume experience collection... (86250 times) [2024-06-23 04:00:38,507][15401] InferenceWorker_p0-w0: stopping experience collection (86250 times) [2024-06-23 04:00:38,512][15401] InferenceWorker_p0-w0: resuming experience collection (86250 times) [2024-06-23 04:00:38,598][15401] Updated weights for policy 0, policy_version 355230 (0.0026) [2024-06-23 04:00:41,179][15401] Updated weights for policy 0, policy_version 355240 (0.0047) [2024-06-23 04:00:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5820317696. Throughput: 0: 42298.8. Samples: 5820424300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:43,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-23 04:00:46,251][15401] Updated weights for policy 0, policy_version 355250 (0.0031) [2024-06-23 04:00:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42709.6). Total num frames: 5820530688. Throughput: 0: 42446.2. Samples: 5820678620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:48,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 04:00:48,898][15401] Updated weights for policy 0, policy_version 355260 (0.0042) [2024-06-23 04:00:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 5820694528. Throughput: 0: 42381.7. Samples: 5820810020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 04:00:53,795][15401] Updated weights for policy 0, policy_version 355270 (0.0038) [2024-06-23 04:00:56,970][15401] Updated weights for policy 0, policy_version 355280 (0.0031) [2024-06-23 04:00:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5820956672. Throughput: 0: 42531.7. Samples: 5821065500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:00:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 04:01:01,566][15401] Updated weights for policy 0, policy_version 355290 (0.0035) [2024-06-23 04:01:03,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 5821169664. Throughput: 0: 42606.7. Samples: 5821319160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:01:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 04:01:04,479][15401] Updated weights for policy 0, policy_version 355300 (0.0034) [2024-06-23 04:01:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5821349888. Throughput: 0: 42534.7. Samples: 5821449560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:01:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 04:01:09,131][15401] Updated weights for policy 0, policy_version 355310 (0.0041) [2024-06-23 04:01:12,436][15401] Updated weights for policy 0, policy_version 355320 (0.0028) [2024-06-23 04:01:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 5821579264. Throughput: 0: 42463.7. Samples: 5821702680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 04:01:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 04:01:16,944][15401] Updated weights for policy 0, policy_version 355330 (0.0028) [2024-06-23 04:01:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5821808640. Throughput: 0: 42524.4. Samples: 5821956020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 04:01:20,218][15401] Updated weights for policy 0, policy_version 355340 (0.0028) [2024-06-23 04:01:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5821972480. Throughput: 0: 42462.1. Samples: 5822086180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 04:01:24,600][15401] Updated weights for policy 0, policy_version 355350 (0.0044) [2024-06-23 04:01:27,867][15401] Updated weights for policy 0, policy_version 355360 (0.0043) [2024-06-23 04:01:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 5822218240. Throughput: 0: 42516.5. Samples: 5822337540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 04:01:32,296][15401] Updated weights for policy 0, policy_version 355370 (0.0036) [2024-06-23 04:01:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5822447616. Throughput: 0: 42596.0. Samples: 5822595440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 04:01:35,527][15401] Updated weights for policy 0, policy_version 355380 (0.0025) [2024-06-23 04:01:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5822611456. Throughput: 0: 42471.3. Samples: 5822721220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:01:39,964][15401] Updated weights for policy 0, policy_version 355390 (0.0039) [2024-06-23 04:01:41,093][15349] Signal inference workers to stop experience collection... (86300 times) [2024-06-23 04:01:41,094][15349] Signal inference workers to resume experience collection... (86300 times) [2024-06-23 04:01:41,124][15401] InferenceWorker_p0-w0: stopping experience collection (86300 times) [2024-06-23 04:01:41,124][15401] InferenceWorker_p0-w0: resuming experience collection (86300 times) [2024-06-23 04:01:43,239][15401] Updated weights for policy 0, policy_version 355400 (0.0021) [2024-06-23 04:01:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5822873600. Throughput: 0: 42449.2. Samples: 5822975720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 04:01:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355400_5822873600.pth... [2024-06-23 04:01:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000354774_5812617216.pth [2024-06-23 04:01:47,514][15401] Updated weights for policy 0, policy_version 355410 (0.0034) [2024-06-23 04:01:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5823070208. Throughput: 0: 42714.7. Samples: 5823241320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 04:01:50,901][15401] Updated weights for policy 0, policy_version 355420 (0.0040) [2024-06-23 04:01:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5823283200. Throughput: 0: 42595.1. Samples: 5823366340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 04:01:55,130][15401] Updated weights for policy 0, policy_version 355430 (0.0025) [2024-06-23 04:01:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5823512576. Throughput: 0: 42692.3. Samples: 5823623840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:01:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 04:01:58,589][15401] Updated weights for policy 0, policy_version 355440 (0.0038) [2024-06-23 04:02:02,634][15401] Updated weights for policy 0, policy_version 355450 (0.0028) [2024-06-23 04:02:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5823741952. Throughput: 0: 42920.9. Samples: 5823887460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 04:02:06,120][15401] Updated weights for policy 0, policy_version 355460 (0.0036) [2024-06-23 04:02:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5823922176. Throughput: 0: 42794.8. Samples: 5824011940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 04:02:10,438][15401] Updated weights for policy 0, policy_version 355470 (0.0034) [2024-06-23 04:02:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5824167936. Throughput: 0: 42895.5. Samples: 5824267840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 04:02:13,526][15401] Updated weights for policy 0, policy_version 355480 (0.0022) [2024-06-23 04:02:17,901][15401] Updated weights for policy 0, policy_version 355490 (0.0033) [2024-06-23 04:02:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5824380928. Throughput: 0: 43045.3. Samples: 5824532480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:18,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 04:02:20,968][15401] Updated weights for policy 0, policy_version 355500 (0.0032) [2024-06-23 04:02:23,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5824544768. Throughput: 0: 42895.0. Samples: 5824651500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:23,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 04:02:25,822][15401] Updated weights for policy 0, policy_version 355510 (0.0041) [2024-06-23 04:02:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5824823296. Throughput: 0: 42943.1. Samples: 5824908160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:28,394][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 04:02:29,410][15401] Updated weights for policy 0, policy_version 355520 (0.0027) [2024-06-23 04:02:33,261][15401] Updated weights for policy 0, policy_version 355530 (0.0040) [2024-06-23 04:02:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5825003520. Throughput: 0: 42898.2. Samples: 5825171740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 04:02:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:02:36,934][15401] Updated weights for policy 0, policy_version 355540 (0.0036) [2024-06-23 04:02:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5825200128. Throughput: 0: 42787.9. Samples: 5825291800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:02:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 04:02:40,924][15401] Updated weights for policy 0, policy_version 355550 (0.0043) [2024-06-23 04:02:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5825462272. Throughput: 0: 42809.5. Samples: 5825550260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:02:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 04:02:44,460][15401] Updated weights for policy 0, policy_version 355560 (0.0030) [2024-06-23 04:02:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5825642496. Throughput: 0: 42913.7. Samples: 5825818580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:02:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 04:02:48,605][15401] Updated weights for policy 0, policy_version 355570 (0.0029) [2024-06-23 04:02:51,954][15401] Updated weights for policy 0, policy_version 355580 (0.0031) [2024-06-23 04:02:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5825839104. Throughput: 0: 42711.0. Samples: 5825933940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:02:53,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 04:02:55,142][15349] Signal inference workers to stop experience collection... (86350 times) [2024-06-23 04:02:55,150][15349] Signal inference workers to resume experience collection... (86350 times) [2024-06-23 04:02:55,152][15401] InferenceWorker_p0-w0: stopping experience collection (86350 times) [2024-06-23 04:02:55,167][15401] InferenceWorker_p0-w0: resuming experience collection (86350 times) [2024-06-23 04:02:55,997][15401] Updated weights for policy 0, policy_version 355590 (0.0048) [2024-06-23 04:02:58,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5826117632. Throughput: 0: 42961.3. Samples: 5826201100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:02:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 04:02:59,416][15401] Updated weights for policy 0, policy_version 355600 (0.0030) [2024-06-23 04:03:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5826297856. Throughput: 0: 42932.9. Samples: 5826464460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 04:03:03,418][15401] Updated weights for policy 0, policy_version 355610 (0.0033) [2024-06-23 04:03:06,944][15401] Updated weights for policy 0, policy_version 355620 (0.0033) [2024-06-23 04:03:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5826510848. Throughput: 0: 43058.2. Samples: 5826589120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 04:03:10,942][15401] Updated weights for policy 0, policy_version 355630 (0.0033) [2024-06-23 04:03:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5826740224. Throughput: 0: 43195.1. Samples: 5826851940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 04:03:14,941][15401] Updated weights for policy 0, policy_version 355640 (0.0032) [2024-06-23 04:03:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5826936832. Throughput: 0: 43080.3. Samples: 5827110360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 04:03:18,583][15401] Updated weights for policy 0, policy_version 355650 (0.0028) [2024-06-23 04:03:22,442][15401] Updated weights for policy 0, policy_version 355660 (0.0032) [2024-06-23 04:03:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 5827166208. Throughput: 0: 43142.1. Samples: 5827233300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:23,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 04:03:26,230][15401] Updated weights for policy 0, policy_version 355670 (0.0028) [2024-06-23 04:03:28,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42765.3). Total num frames: 5827379200. Throughput: 0: 43274.0. Samples: 5827497700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:28,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 04:03:29,902][15401] Updated weights for policy 0, policy_version 355680 (0.0030) [2024-06-23 04:03:33,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 5827592192. Throughput: 0: 43125.8. Samples: 5827759240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 04:03:33,774][15401] Updated weights for policy 0, policy_version 355690 (0.0029) [2024-06-23 04:03:37,791][15401] Updated weights for policy 0, policy_version 355700 (0.0038) [2024-06-23 04:03:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 5827821568. Throughput: 0: 43358.7. Samples: 5827885080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 04:03:41,454][15401] Updated weights for policy 0, policy_version 355710 (0.0041) [2024-06-23 04:03:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5828034560. Throughput: 0: 43164.4. Samples: 5828143500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:43,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 04:03:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355715_5828034560.pth... [2024-06-23 04:03:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355088_5817761792.pth [2024-06-23 04:03:45,263][15401] Updated weights for policy 0, policy_version 355720 (0.0031) [2024-06-23 04:03:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5828214784. Throughput: 0: 43115.0. Samples: 5828404640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 04:03:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 04:03:49,059][15401] Updated weights for policy 0, policy_version 355730 (0.0022) [2024-06-23 04:03:52,955][15401] Updated weights for policy 0, policy_version 355740 (0.0030) [2024-06-23 04:03:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5828444160. Throughput: 0: 42931.6. Samples: 5828521040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:03:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 04:03:56,783][15401] Updated weights for policy 0, policy_version 355750 (0.0030) [2024-06-23 04:03:58,396][15132] Fps is (10 sec: 47483.2, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 5828689920. Throughput: 0: 42838.3. Samples: 5828779940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:03:58,396][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 04:04:00,475][15401] Updated weights for policy 0, policy_version 355760 (0.0042) [2024-06-23 04:04:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5828853760. Throughput: 0: 43003.2. Samples: 5829045500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:03,394][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 04:04:04,627][15401] Updated weights for policy 0, policy_version 355770 (0.0036) [2024-06-23 04:04:08,057][15401] Updated weights for policy 0, policy_version 355780 (0.0050) [2024-06-23 04:04:08,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5829099520. Throughput: 0: 43000.5. Samples: 5829168220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:08,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 04:04:12,011][15349] Signal inference workers to stop experience collection... (86400 times) [2024-06-23 04:04:12,012][15349] Signal inference workers to resume experience collection... (86400 times) [2024-06-23 04:04:12,032][15401] InferenceWorker_p0-w0: stopping experience collection (86400 times) [2024-06-23 04:04:12,032][15401] InferenceWorker_p0-w0: resuming experience collection (86400 times) [2024-06-23 04:04:12,159][15401] Updated weights for policy 0, policy_version 355790 (0.0034) [2024-06-23 04:04:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5829312512. Throughput: 0: 42732.6. Samples: 5829420560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 04:04:16,188][15401] Updated weights for policy 0, policy_version 355800 (0.0034) [2024-06-23 04:04:18,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 5829492736. Throughput: 0: 42667.6. Samples: 5829679380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:18,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 04:04:19,929][15401] Updated weights for policy 0, policy_version 355810 (0.0055) [2024-06-23 04:04:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 5829722112. Throughput: 0: 42514.1. Samples: 5829798220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 04:04:23,817][15401] Updated weights for policy 0, policy_version 355820 (0.0040) [2024-06-23 04:04:27,766][15401] Updated weights for policy 0, policy_version 355830 (0.0029) [2024-06-23 04:04:28,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 5829951488. Throughput: 0: 42651.2. Samples: 5830062800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:04:31,463][15401] Updated weights for policy 0, policy_version 355840 (0.0046) [2024-06-23 04:04:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 5830131712. Throughput: 0: 42430.7. Samples: 5830314020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 04:04:35,727][15401] Updated weights for policy 0, policy_version 355850 (0.0038) [2024-06-23 04:04:38,391][15132] Fps is (10 sec: 40954.8, 60 sec: 42324.4, 300 sec: 42709.3). Total num frames: 5830361088. Throughput: 0: 42613.5. Samples: 5830438700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:38,391][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 04:04:39,347][15401] Updated weights for policy 0, policy_version 355860 (0.0037) [2024-06-23 04:04:43,370][15401] Updated weights for policy 0, policy_version 355870 (0.0035) [2024-06-23 04:04:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5830574080. Throughput: 0: 42754.6. Samples: 5830703620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 04:04:46,980][15401] Updated weights for policy 0, policy_version 355880 (0.0045) [2024-06-23 04:04:48,389][15132] Fps is (10 sec: 42603.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5830787072. Throughput: 0: 42413.0. Samples: 5830954080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 04:04:50,936][15401] Updated weights for policy 0, policy_version 355890 (0.0036) [2024-06-23 04:04:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5831000064. Throughput: 0: 42615.1. Samples: 5831085900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 04:04:54,706][15401] Updated weights for policy 0, policy_version 355900 (0.0045) [2024-06-23 04:04:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42056.8, 300 sec: 42709.5). Total num frames: 5831213056. Throughput: 0: 42820.3. Samples: 5831347480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:04:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 04:04:58,740][15401] Updated weights for policy 0, policy_version 355910 (0.0034) [2024-06-23 04:05:02,255][15401] Updated weights for policy 0, policy_version 355920 (0.0036) [2024-06-23 04:05:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5831442432. Throughput: 0: 42493.9. Samples: 5831591500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:05:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 04:05:06,074][15401] Updated weights for policy 0, policy_version 355930 (0.0038) [2024-06-23 04:05:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5831655424. Throughput: 0: 42865.8. Samples: 5831727180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 04:05:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:05:09,649][15401] Updated weights for policy 0, policy_version 355940 (0.0043) [2024-06-23 04:05:13,278][15349] Signal inference workers to stop experience collection... (86450 times) [2024-06-23 04:05:13,316][15401] InferenceWorker_p0-w0: stopping experience collection (86450 times) [2024-06-23 04:05:13,334][15349] Signal inference workers to resume experience collection... (86450 times) [2024-06-23 04:05:13,335][15401] InferenceWorker_p0-w0: resuming experience collection (86450 times) [2024-06-23 04:05:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5831868416. Throughput: 0: 42739.8. Samples: 5831986100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:13,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-23 04:05:13,477][15401] Updated weights for policy 0, policy_version 355950 (0.0029) [2024-06-23 04:05:17,053][15401] Updated weights for policy 0, policy_version 355960 (0.0034) [2024-06-23 04:05:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 5832097792. Throughput: 0: 42896.9. Samples: 5832244380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 04:05:21,691][15401] Updated weights for policy 0, policy_version 355970 (0.0032) [2024-06-23 04:05:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5832278016. Throughput: 0: 43056.8. Samples: 5832376200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:05:24,922][15401] Updated weights for policy 0, policy_version 355980 (0.0048) [2024-06-23 04:05:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5832507392. Throughput: 0: 42829.2. Samples: 5832630940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:28,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 04:05:29,216][15401] Updated weights for policy 0, policy_version 355990 (0.0043) [2024-06-23 04:05:32,423][15401] Updated weights for policy 0, policy_version 356000 (0.0026) [2024-06-23 04:05:33,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 5832753152. Throughput: 0: 42964.3. Samples: 5832887480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 04:05:36,854][15401] Updated weights for policy 0, policy_version 356010 (0.0036) [2024-06-23 04:05:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42872.4, 300 sec: 42765.0). Total num frames: 5832933376. Throughput: 0: 43117.4. Samples: 5833026180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:38,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 04:05:40,096][15401] Updated weights for policy 0, policy_version 356020 (0.0041) [2024-06-23 04:05:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5833146368. Throughput: 0: 42906.8. Samples: 5833278280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 04:05:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356028_5833162752.pth... [2024-06-23 04:05:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355400_5822873600.pth [2024-06-23 04:05:44,281][15401] Updated weights for policy 0, policy_version 356030 (0.0034) [2024-06-23 04:05:47,741][15401] Updated weights for policy 0, policy_version 356040 (0.0028) [2024-06-23 04:05:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5833375744. Throughput: 0: 43117.9. Samples: 5833531800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 04:05:52,203][15401] Updated weights for policy 0, policy_version 356050 (0.0028) [2024-06-23 04:05:53,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5833588736. Throughput: 0: 43104.9. Samples: 5833666900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 04:05:55,639][15401] Updated weights for policy 0, policy_version 356060 (0.0041) [2024-06-23 04:05:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5833785344. Throughput: 0: 42813.3. Samples: 5833912700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:05:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 04:05:59,671][15401] Updated weights for policy 0, policy_version 356070 (0.0028) [2024-06-23 04:06:03,138][15401] Updated weights for policy 0, policy_version 356080 (0.0025) [2024-06-23 04:06:03,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 5834031104. Throughput: 0: 42881.6. Samples: 5834174160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:06:03,393][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 04:06:07,294][15401] Updated weights for policy 0, policy_version 356090 (0.0028) [2024-06-23 04:06:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5834211328. Throughput: 0: 42954.6. Samples: 5834309160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:06:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 04:06:10,697][15401] Updated weights for policy 0, policy_version 356100 (0.0030) [2024-06-23 04:06:13,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 5834440704. Throughput: 0: 42769.3. Samples: 5834555660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:06:13,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 04:06:14,808][15401] Updated weights for policy 0, policy_version 356110 (0.0039) [2024-06-23 04:06:18,225][15401] Updated weights for policy 0, policy_version 356120 (0.0037) [2024-06-23 04:06:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 5834686464. Throughput: 0: 42899.6. Samples: 5834817960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:06:18,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 04:06:22,366][15401] Updated weights for policy 0, policy_version 356130 (0.0033) [2024-06-23 04:06:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5834850304. Throughput: 0: 42709.3. Samples: 5834948100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 04:06:23,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-23 04:06:25,634][15401] Updated weights for policy 0, policy_version 356140 (0.0028) [2024-06-23 04:06:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5835079680. Throughput: 0: 42909.3. Samples: 5835209200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 04:06:29,948][15401] Updated weights for policy 0, policy_version 356150 (0.0044) [2024-06-23 04:06:33,134][15401] Updated weights for policy 0, policy_version 356160 (0.0036) [2024-06-23 04:06:33,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 5835341824. Throughput: 0: 42846.9. Samples: 5835459920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 04:06:38,227][15401] Updated weights for policy 0, policy_version 356170 (0.0034) [2024-06-23 04:06:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5835489280. Throughput: 0: 42751.8. Samples: 5835590720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 04:06:40,372][15349] Signal inference workers to stop experience collection... (86500 times) [2024-06-23 04:06:40,373][15349] Signal inference workers to resume experience collection... (86500 times) [2024-06-23 04:06:40,418][15401] InferenceWorker_p0-w0: stopping experience collection (86500 times) [2024-06-23 04:06:40,418][15401] InferenceWorker_p0-w0: resuming experience collection (86500 times) [2024-06-23 04:06:41,019][15401] Updated weights for policy 0, policy_version 356180 (0.0040) [2024-06-23 04:06:43,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5835718656. Throughput: 0: 42891.7. Samples: 5835842820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 04:06:45,729][15401] Updated weights for policy 0, policy_version 356190 (0.0033) [2024-06-23 04:06:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5835931648. Throughput: 0: 42768.6. Samples: 5836098640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 04:06:48,786][15401] Updated weights for policy 0, policy_version 356200 (0.0024) [2024-06-23 04:06:53,153][15401] Updated weights for policy 0, policy_version 356210 (0.0029) [2024-06-23 04:06:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5836144640. Throughput: 0: 42621.6. Samples: 5836227140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 04:06:56,838][15401] Updated weights for policy 0, policy_version 356220 (0.0038) [2024-06-23 04:06:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5836357632. Throughput: 0: 42780.1. Samples: 5836480660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:06:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 04:07:00,763][15401] Updated weights for policy 0, policy_version 356230 (0.0036) [2024-06-23 04:07:03,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 5836587008. Throughput: 0: 42816.9. Samples: 5836744720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 04:07:04,314][15401] Updated weights for policy 0, policy_version 356240 (0.0032) [2024-06-23 04:07:08,224][15401] Updated weights for policy 0, policy_version 356250 (0.0034) [2024-06-23 04:07:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5836800000. Throughput: 0: 42875.0. Samples: 5836877480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 04:07:11,731][15401] Updated weights for policy 0, policy_version 356260 (0.0042) [2024-06-23 04:07:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 5837012992. Throughput: 0: 42585.7. Samples: 5837125560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 04:07:16,034][15401] Updated weights for policy 0, policy_version 356270 (0.0033) [2024-06-23 04:07:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42931.6). Total num frames: 5837209600. Throughput: 0: 42829.8. Samples: 5837387260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 04:07:19,480][15401] Updated weights for policy 0, policy_version 356280 (0.0036) [2024-06-23 04:07:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5837438976. Throughput: 0: 42732.3. Samples: 5837513680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:23,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 04:07:23,625][15401] Updated weights for policy 0, policy_version 356290 (0.0036) [2024-06-23 04:07:27,143][15401] Updated weights for policy 0, policy_version 356300 (0.0035) [2024-06-23 04:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5837635584. Throughput: 0: 42805.3. Samples: 5837769060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:28,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 04:07:31,298][15401] Updated weights for policy 0, policy_version 356310 (0.0031) [2024-06-23 04:07:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42931.6). Total num frames: 5837864960. Throughput: 0: 42901.8. Samples: 5838029220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 04:07:35,043][15401] Updated weights for policy 0, policy_version 356320 (0.0035) [2024-06-23 04:07:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5838077952. Throughput: 0: 42883.1. Samples: 5838156880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:38,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-23 04:07:38,889][15401] Updated weights for policy 0, policy_version 356330 (0.0036) [2024-06-23 04:07:42,539][15401] Updated weights for policy 0, policy_version 356340 (0.0032) [2024-06-23 04:07:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5838290944. Throughput: 0: 42907.6. Samples: 5838411500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:07:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 04:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356341_5838290944.pth... [2024-06-23 04:07:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000355715_5828034560.pth [2024-06-23 04:07:46,441][15401] Updated weights for policy 0, policy_version 356350 (0.0037) [2024-06-23 04:07:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5838503936. Throughput: 0: 42811.0. Samples: 5838671220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:07:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 04:07:50,272][15401] Updated weights for policy 0, policy_version 356360 (0.0034) [2024-06-23 04:07:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 5838700544. Throughput: 0: 42708.6. Samples: 5838799360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:07:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 04:07:54,331][15401] Updated weights for policy 0, policy_version 356370 (0.0041) [2024-06-23 04:07:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5838913536. Throughput: 0: 42789.8. Samples: 5839051100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:07:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 04:07:58,509][15401] Updated weights for policy 0, policy_version 356380 (0.0032) [2024-06-23 04:08:02,012][15401] Updated weights for policy 0, policy_version 356390 (0.0031) [2024-06-23 04:08:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5839159296. Throughput: 0: 42580.4. Samples: 5839303380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 04:08:05,997][15401] Updated weights for policy 0, policy_version 356400 (0.0037) [2024-06-23 04:08:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5839339520. Throughput: 0: 42664.1. Samples: 5839433560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:08:09,889][15401] Updated weights for policy 0, policy_version 356410 (0.0036) [2024-06-23 04:08:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5839568896. Throughput: 0: 42616.4. Samples: 5839686900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:13,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 04:08:13,541][15401] Updated weights for policy 0, policy_version 356420 (0.0027) [2024-06-23 04:08:17,526][15401] Updated weights for policy 0, policy_version 356430 (0.0039) [2024-06-23 04:08:18,130][15349] Signal inference workers to stop experience collection... (86550 times) [2024-06-23 04:08:18,164][15401] InferenceWorker_p0-w0: stopping experience collection (86550 times) [2024-06-23 04:08:18,185][15349] Signal inference workers to resume experience collection... (86550 times) [2024-06-23 04:08:18,185][15401] InferenceWorker_p0-w0: resuming experience collection (86550 times) [2024-06-23 04:08:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5839798272. Throughput: 0: 42430.6. Samples: 5839938600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 04:08:21,271][15401] Updated weights for policy 0, policy_version 356440 (0.0039) [2024-06-23 04:08:23,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 5839978496. Throughput: 0: 42494.7. Samples: 5840069140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 04:08:25,341][15401] Updated weights for policy 0, policy_version 356450 (0.0025) [2024-06-23 04:08:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5840207872. Throughput: 0: 42454.1. Samples: 5840321940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 04:08:28,698][15401] Updated weights for policy 0, policy_version 356460 (0.0038) [2024-06-23 04:08:33,100][15401] Updated weights for policy 0, policy_version 356470 (0.0033) [2024-06-23 04:08:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5840420864. Throughput: 0: 42382.2. Samples: 5840578420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 04:08:36,711][15401] Updated weights for policy 0, policy_version 356480 (0.0027) [2024-06-23 04:08:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5840617472. Throughput: 0: 42334.6. Samples: 5840704420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 04:08:40,635][15401] Updated weights for policy 0, policy_version 356490 (0.0030) [2024-06-23 04:08:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5840846848. Throughput: 0: 42324.1. Samples: 5840955680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 04:08:44,922][15401] Updated weights for policy 0, policy_version 356500 (0.0044) [2024-06-23 04:08:48,268][15401] Updated weights for policy 0, policy_version 356510 (0.0038) [2024-06-23 04:08:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5841059840. Throughput: 0: 42591.2. Samples: 5841219980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 04:08:52,445][15401] Updated weights for policy 0, policy_version 356520 (0.0037) [2024-06-23 04:08:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 5841272832. Throughput: 0: 42467.4. Samples: 5841344600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 04:08:55,967][15401] Updated weights for policy 0, policy_version 356530 (0.0037) [2024-06-23 04:08:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5841485824. Throughput: 0: 42365.4. Samples: 5841593240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 04:08:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 04:09:00,086][15401] Updated weights for policy 0, policy_version 356540 (0.0060) [2024-06-23 04:09:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5841682432. Throughput: 0: 42743.2. Samples: 5841862040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 04:09:03,709][15401] Updated weights for policy 0, policy_version 356550 (0.0041) [2024-06-23 04:09:07,613][15401] Updated weights for policy 0, policy_version 356560 (0.0036) [2024-06-23 04:09:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5841895424. Throughput: 0: 42577.4. Samples: 5841985120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 04:09:11,264][15401] Updated weights for policy 0, policy_version 356570 (0.0027) [2024-06-23 04:09:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 5842141184. Throughput: 0: 42628.0. Samples: 5842240200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 04:09:15,212][15401] Updated weights for policy 0, policy_version 356580 (0.0037) [2024-06-23 04:09:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41777.5, 300 sec: 42653.6). Total num frames: 5842305024. Throughput: 0: 42907.9. Samples: 5842509380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:18,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 04:09:19,072][15401] Updated weights for policy 0, policy_version 356590 (0.0036) [2024-06-23 04:09:23,123][15401] Updated weights for policy 0, policy_version 356600 (0.0045) [2024-06-23 04:09:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5842534400. Throughput: 0: 42620.5. Samples: 5842622340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 04:09:26,527][15401] Updated weights for policy 0, policy_version 356610 (0.0023) [2024-06-23 04:09:27,863][15349] Signal inference workers to stop experience collection... (86600 times) [2024-06-23 04:09:27,890][15401] InferenceWorker_p0-w0: stopping experience collection (86600 times) [2024-06-23 04:09:27,911][15349] Signal inference workers to resume experience collection... (86600 times) [2024-06-23 04:09:27,912][15401] InferenceWorker_p0-w0: resuming experience collection (86600 times) [2024-06-23 04:09:28,392][15132] Fps is (10 sec: 47513.5, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 5842780160. Throughput: 0: 42748.3. Samples: 5842879460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:28,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 04:09:30,572][15401] Updated weights for policy 0, policy_version 356620 (0.0030) [2024-06-23 04:09:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.7). Total num frames: 5842960384. Throughput: 0: 42883.2. Samples: 5843149720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 04:09:34,104][15401] Updated weights for policy 0, policy_version 356630 (0.0045) [2024-06-23 04:09:38,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5843173376. Throughput: 0: 42749.5. Samples: 5843268320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 04:09:38,510][15401] Updated weights for policy 0, policy_version 356640 (0.0031) [2024-06-23 04:09:41,629][15401] Updated weights for policy 0, policy_version 356650 (0.0030) [2024-06-23 04:09:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5843419136. Throughput: 0: 43036.1. Samples: 5843529860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 04:09:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356655_5843435520.pth... [2024-06-23 04:09:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356028_5833162752.pth [2024-06-23 04:09:45,964][15401] Updated weights for policy 0, policy_version 356660 (0.0049) [2024-06-23 04:09:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5843632128. Throughput: 0: 42818.2. Samples: 5843788860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:09:49,484][15401] Updated weights for policy 0, policy_version 356670 (0.0041) [2024-06-23 04:09:53,336][15401] Updated weights for policy 0, policy_version 356680 (0.0027) [2024-06-23 04:09:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5843845120. Throughput: 0: 42920.1. Samples: 5843916520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 04:09:56,968][15401] Updated weights for policy 0, policy_version 356690 (0.0036) [2024-06-23 04:09:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5844058112. Throughput: 0: 43073.3. Samples: 5844178500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:09:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 04:10:00,866][15401] Updated weights for policy 0, policy_version 356700 (0.0031) [2024-06-23 04:10:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 5844287488. Throughput: 0: 42844.4. Samples: 5844437380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:10:03,393][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 04:10:04,514][15401] Updated weights for policy 0, policy_version 356710 (0.0034) [2024-06-23 04:10:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5844484096. Throughput: 0: 43264.9. Samples: 5844569260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:10:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 04:10:08,539][15401] Updated weights for policy 0, policy_version 356720 (0.0035) [2024-06-23 04:10:12,219][15401] Updated weights for policy 0, policy_version 356730 (0.0029) [2024-06-23 04:10:13,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5844697088. Throughput: 0: 43228.4. Samples: 5844824640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:10:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 04:10:16,144][15401] Updated weights for policy 0, policy_version 356740 (0.0038) [2024-06-23 04:10:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 5844910080. Throughput: 0: 42962.3. Samples: 5845083020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 04:10:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 04:10:19,868][15401] Updated weights for policy 0, policy_version 356750 (0.0037) [2024-06-23 04:10:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 5845139456. Throughput: 0: 43191.4. Samples: 5845211940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 04:10:23,783][15401] Updated weights for policy 0, policy_version 356760 (0.0033) [2024-06-23 04:10:27,318][15401] Updated weights for policy 0, policy_version 356770 (0.0046) [2024-06-23 04:10:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 5845352448. Throughput: 0: 43124.8. Samples: 5845470480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 04:10:31,537][15401] Updated weights for policy 0, policy_version 356780 (0.0029) [2024-06-23 04:10:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5845565440. Throughput: 0: 43120.7. Samples: 5845729300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:33,391][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 04:10:34,832][15401] Updated weights for policy 0, policy_version 356790 (0.0034) [2024-06-23 04:10:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 5845762048. Throughput: 0: 43230.1. Samples: 5845861980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:38,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 04:10:39,089][15401] Updated weights for policy 0, policy_version 356800 (0.0031) [2024-06-23 04:10:42,609][15401] Updated weights for policy 0, policy_version 356810 (0.0036) [2024-06-23 04:10:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5845991424. Throughput: 0: 43121.0. Samples: 5846118940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 04:10:46,849][15401] Updated weights for policy 0, policy_version 356820 (0.0034) [2024-06-23 04:10:48,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 5846188032. Throughput: 0: 43157.3. Samples: 5846379460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:48,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 04:10:50,274][15401] Updated weights for policy 0, policy_version 356830 (0.0034) [2024-06-23 04:10:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5846401024. Throughput: 0: 42987.6. Samples: 5846503700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 04:10:53,928][15349] Signal inference workers to stop experience collection... (86650 times) [2024-06-23 04:10:53,929][15349] Signal inference workers to resume experience collection... (86650 times) [2024-06-23 04:10:53,942][15401] InferenceWorker_p0-w0: stopping experience collection (86650 times) [2024-06-23 04:10:53,945][15401] InferenceWorker_p0-w0: resuming experience collection (86650 times) [2024-06-23 04:10:54,259][15401] Updated weights for policy 0, policy_version 356840 (0.0026) [2024-06-23 04:10:57,895][15401] Updated weights for policy 0, policy_version 356850 (0.0024) [2024-06-23 04:10:58,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 5846646784. Throughput: 0: 43061.5. Samples: 5846762400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:10:58,396][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 04:11:01,806][15401] Updated weights for policy 0, policy_version 356860 (0.0042) [2024-06-23 04:11:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 5846843392. Throughput: 0: 43064.8. Samples: 5847020940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 04:11:05,447][15401] Updated weights for policy 0, policy_version 356870 (0.0038) [2024-06-23 04:11:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 5847056384. Throughput: 0: 43015.6. Samples: 5847147640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 04:11:09,248][15401] Updated weights for policy 0, policy_version 356880 (0.0042) [2024-06-23 04:11:13,147][15401] Updated weights for policy 0, policy_version 356890 (0.0034) [2024-06-23 04:11:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5847285760. Throughput: 0: 43034.3. Samples: 5847407020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:13,396][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 04:11:16,788][15401] Updated weights for policy 0, policy_version 356900 (0.0032) [2024-06-23 04:11:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5847498752. Throughput: 0: 43041.3. Samples: 5847666160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 04:11:20,664][15401] Updated weights for policy 0, policy_version 356910 (0.0051) [2024-06-23 04:11:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5847695360. Throughput: 0: 42872.0. Samples: 5847791120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 04:11:24,407][15401] Updated weights for policy 0, policy_version 356920 (0.0035) [2024-06-23 04:11:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5847924736. Throughput: 0: 42925.7. Samples: 5848050600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 04:11:28,478][15401] Updated weights for policy 0, policy_version 356930 (0.0050) [2024-06-23 04:11:32,103][15401] Updated weights for policy 0, policy_version 356940 (0.0033) [2024-06-23 04:11:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5848137728. Throughput: 0: 42820.9. Samples: 5848306300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:11:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 04:11:36,059][15401] Updated weights for policy 0, policy_version 356950 (0.0027) [2024-06-23 04:11:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 5848334336. Throughput: 0: 42836.3. Samples: 5848431340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:11:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 04:11:39,872][15401] Updated weights for policy 0, policy_version 356960 (0.0036) [2024-06-23 04:11:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5848580096. Throughput: 0: 42835.5. Samples: 5848690000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:11:43,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 04:11:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356969_5848580096.pth... [2024-06-23 04:11:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356341_5838290944.pth [2024-06-23 04:11:43,788][15401] Updated weights for policy 0, policy_version 356970 (0.0039) [2024-06-23 04:11:47,720][15401] Updated weights for policy 0, policy_version 356980 (0.0032) [2024-06-23 04:11:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 5848776704. Throughput: 0: 42780.7. Samples: 5848946080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:11:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 04:11:51,362][15401] Updated weights for policy 0, policy_version 356990 (0.0040) [2024-06-23 04:11:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5848973312. Throughput: 0: 42804.1. Samples: 5849073820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:11:53,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 04:11:55,283][15401] Updated weights for policy 0, policy_version 357000 (0.0025) [2024-06-23 04:11:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5849219072. Throughput: 0: 42766.6. Samples: 5849331520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:11:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 04:11:58,859][15401] Updated weights for policy 0, policy_version 357010 (0.0025) [2024-06-23 04:12:02,981][15401] Updated weights for policy 0, policy_version 357020 (0.0033) [2024-06-23 04:12:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5849432064. Throughput: 0: 42872.1. Samples: 5849595400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 04:12:06,472][15401] Updated weights for policy 0, policy_version 357030 (0.0033) [2024-06-23 04:12:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 5849628672. Throughput: 0: 42993.3. Samples: 5849725920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:08,393][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 04:12:10,350][15349] Signal inference workers to stop experience collection... (86700 times) [2024-06-23 04:12:10,404][15401] InferenceWorker_p0-w0: stopping experience collection (86700 times) [2024-06-23 04:12:10,407][15349] Signal inference workers to resume experience collection... (86700 times) [2024-06-23 04:12:10,417][15401] InferenceWorker_p0-w0: resuming experience collection (86700 times) [2024-06-23 04:12:10,545][15401] Updated weights for policy 0, policy_version 357040 (0.0026) [2024-06-23 04:12:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 5849874432. Throughput: 0: 42768.5. Samples: 5849975180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:13,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 04:12:14,067][15401] Updated weights for policy 0, policy_version 357050 (0.0032) [2024-06-23 04:12:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5850054656. Throughput: 0: 42895.2. Samples: 5850236580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 04:12:18,512][15401] Updated weights for policy 0, policy_version 357060 (0.0037) [2024-06-23 04:12:21,671][15401] Updated weights for policy 0, policy_version 357070 (0.0033) [2024-06-23 04:12:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5850284032. Throughput: 0: 42911.2. Samples: 5850362340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 04:12:26,059][15401] Updated weights for policy 0, policy_version 357080 (0.0045) [2024-06-23 04:12:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5850513408. Throughput: 0: 42886.7. Samples: 5850619900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 04:12:29,382][15401] Updated weights for policy 0, policy_version 357090 (0.0037) [2024-06-23 04:12:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5850710016. Throughput: 0: 42915.2. Samples: 5850877260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 04:12:33,613][15401] Updated weights for policy 0, policy_version 357100 (0.0021) [2024-06-23 04:12:36,959][15401] Updated weights for policy 0, policy_version 357110 (0.0023) [2024-06-23 04:12:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 5850923008. Throughput: 0: 42939.0. Samples: 5851006180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:38,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 04:12:41,211][15401] Updated weights for policy 0, policy_version 357120 (0.0036) [2024-06-23 04:12:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5851152384. Throughput: 0: 42978.6. Samples: 5851265560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 04:12:44,506][15401] Updated weights for policy 0, policy_version 357130 (0.0029) [2024-06-23 04:12:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5851348992. Throughput: 0: 42708.8. Samples: 5851517300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 04:12:49,383][15401] Updated weights for policy 0, policy_version 357140 (0.0033) [2024-06-23 04:12:52,674][15401] Updated weights for policy 0, policy_version 357150 (0.0030) [2024-06-23 04:12:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5851561984. Throughput: 0: 42721.9. Samples: 5851648300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:12:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 04:12:57,025][15401] Updated weights for policy 0, policy_version 357160 (0.0039) [2024-06-23 04:12:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5851791360. Throughput: 0: 42876.8. Samples: 5851904640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:12:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 04:13:00,463][15401] Updated weights for policy 0, policy_version 357170 (0.0026) [2024-06-23 04:13:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5851987968. Throughput: 0: 42675.0. Samples: 5852156960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 04:13:04,559][15401] Updated weights for policy 0, policy_version 357180 (0.0038) [2024-06-23 04:13:08,126][15401] Updated weights for policy 0, policy_version 357190 (0.0036) [2024-06-23 04:13:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42820.9). Total num frames: 5852200960. Throughput: 0: 42687.1. Samples: 5852283260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 04:13:12,129][15401] Updated weights for policy 0, policy_version 357200 (0.0035) [2024-06-23 04:13:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5852446720. Throughput: 0: 42840.9. Samples: 5852547740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 04:13:15,732][15401] Updated weights for policy 0, policy_version 357210 (0.0024) [2024-06-23 04:13:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5852626944. Throughput: 0: 42876.6. Samples: 5852806700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 04:13:19,633][15401] Updated weights for policy 0, policy_version 357220 (0.0037) [2024-06-23 04:13:23,279][15401] Updated weights for policy 0, policy_version 357230 (0.0023) [2024-06-23 04:13:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5852856320. Throughput: 0: 42705.8. Samples: 5852927840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 04:13:27,252][15401] Updated weights for policy 0, policy_version 357240 (0.0033) [2024-06-23 04:13:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5853085696. Throughput: 0: 42844.6. Samples: 5853193560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 04:13:30,844][15401] Updated weights for policy 0, policy_version 357250 (0.0038) [2024-06-23 04:13:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5853265920. Throughput: 0: 42906.4. Samples: 5853448080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:13:34,933][15401] Updated weights for policy 0, policy_version 357260 (0.0030) [2024-06-23 04:13:38,377][15401] Updated weights for policy 0, policy_version 357270 (0.0037) [2024-06-23 04:13:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 5853511680. Throughput: 0: 42766.6. Samples: 5853572800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 04:13:40,510][15349] Signal inference workers to stop experience collection... (86750 times) [2024-06-23 04:13:40,542][15401] InferenceWorker_p0-w0: stopping experience collection (86750 times) [2024-06-23 04:13:40,619][15349] Signal inference workers to resume experience collection... (86750 times) [2024-06-23 04:13:40,619][15401] InferenceWorker_p0-w0: resuming experience collection (86750 times) [2024-06-23 04:13:42,574][15401] Updated weights for policy 0, policy_version 357280 (0.0040) [2024-06-23 04:13:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5853708288. Throughput: 0: 42688.9. Samples: 5853825640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 04:13:43,543][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357283_5853724672.pth... [2024-06-23 04:13:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356655_5843435520.pth [2024-06-23 04:13:46,022][15401] Updated weights for policy 0, policy_version 357290 (0.0038) [2024-06-23 04:13:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5853921280. Throughput: 0: 42826.4. Samples: 5854084140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 04:13:50,262][15401] Updated weights for policy 0, policy_version 357300 (0.0035) [2024-06-23 04:13:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5854150656. Throughput: 0: 42793.4. Samples: 5854208960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 04:13:53,874][15401] Updated weights for policy 0, policy_version 357310 (0.0025) [2024-06-23 04:13:57,802][15401] Updated weights for policy 0, policy_version 357320 (0.0030) [2024-06-23 04:13:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 5854347264. Throughput: 0: 42800.1. Samples: 5854473740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:13:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 04:14:01,500][15401] Updated weights for policy 0, policy_version 357330 (0.0033) [2024-06-23 04:14:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5854576640. Throughput: 0: 42698.1. Samples: 5854728120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:14:03,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 04:14:05,499][15401] Updated weights for policy 0, policy_version 357340 (0.0025) [2024-06-23 04:14:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5854789632. Throughput: 0: 42916.0. Samples: 5854859060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 04:14:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 04:14:09,001][15401] Updated weights for policy 0, policy_version 357350 (0.0036) [2024-06-23 04:14:12,892][15401] Updated weights for policy 0, policy_version 357360 (0.0035) [2024-06-23 04:14:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 43043.1). Total num frames: 5855002624. Throughput: 0: 42815.0. Samples: 5855120240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 04:14:16,870][15401] Updated weights for policy 0, policy_version 357370 (0.0032) [2024-06-23 04:14:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 5855199232. Throughput: 0: 42827.8. Samples: 5855375340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 04:14:20,421][15401] Updated weights for policy 0, policy_version 357380 (0.0030) [2024-06-23 04:14:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 5855412224. Throughput: 0: 42896.1. Samples: 5855503120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:23,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 04:14:24,571][15401] Updated weights for policy 0, policy_version 357390 (0.0038) [2024-06-23 04:14:28,177][15401] Updated weights for policy 0, policy_version 357400 (0.0027) [2024-06-23 04:14:28,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42596.7, 300 sec: 42986.8). Total num frames: 5855641600. Throughput: 0: 42806.7. Samples: 5855752040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:28,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 04:14:32,015][15401] Updated weights for policy 0, policy_version 357410 (0.0033) [2024-06-23 04:14:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5855854592. Throughput: 0: 42980.4. Samples: 5856018260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:33,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 04:14:35,885][15401] Updated weights for policy 0, policy_version 357420 (0.0034) [2024-06-23 04:14:38,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5856067584. Throughput: 0: 43034.6. Samples: 5856145520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:38,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 04:14:39,678][15401] Updated weights for policy 0, policy_version 357430 (0.0036) [2024-06-23 04:14:43,377][15401] Updated weights for policy 0, policy_version 357440 (0.0037) [2024-06-23 04:14:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 5856296960. Throughput: 0: 42857.6. Samples: 5856402440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:43,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 04:14:47,184][15401] Updated weights for policy 0, policy_version 357450 (0.0040) [2024-06-23 04:14:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5856509952. Throughput: 0: 42971.5. Samples: 5856661840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 04:14:51,234][15401] Updated weights for policy 0, policy_version 357460 (0.0040) [2024-06-23 04:14:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5856722944. Throughput: 0: 42944.0. Samples: 5856791540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 04:14:54,926][15401] Updated weights for policy 0, policy_version 357470 (0.0025) [2024-06-23 04:14:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 5856935936. Throughput: 0: 42903.1. Samples: 5857050880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:14:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 04:14:59,114][15401] Updated weights for policy 0, policy_version 357480 (0.0037) [2024-06-23 04:15:02,606][15401] Updated weights for policy 0, policy_version 357490 (0.0040) [2024-06-23 04:15:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5857148928. Throughput: 0: 42917.5. Samples: 5857306620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:15:06,563][15401] Updated weights for policy 0, policy_version 357500 (0.0043) [2024-06-23 04:15:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 5857361920. Throughput: 0: 43014.6. Samples: 5857438780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 04:15:10,297][15401] Updated weights for policy 0, policy_version 357510 (0.0032) [2024-06-23 04:15:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5857574912. Throughput: 0: 43038.2. Samples: 5857688660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 04:15:14,214][15401] Updated weights for policy 0, policy_version 357520 (0.0039) [2024-06-23 04:15:17,861][15401] Updated weights for policy 0, policy_version 357530 (0.0032) [2024-06-23 04:15:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5857787904. Throughput: 0: 42892.4. Samples: 5857948420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:18,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 04:15:18,812][15349] Signal inference workers to stop experience collection... (86800 times) [2024-06-23 04:15:18,822][15349] Signal inference workers to resume experience collection... (86800 times) [2024-06-23 04:15:18,827][15401] InferenceWorker_p0-w0: stopping experience collection (86800 times) [2024-06-23 04:15:18,863][15401] InferenceWorker_p0-w0: resuming experience collection (86800 times) [2024-06-23 04:15:21,700][15401] Updated weights for policy 0, policy_version 357540 (0.0033) [2024-06-23 04:15:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5857984512. Throughput: 0: 42920.3. Samples: 5858076940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 04:15:25,348][15401] Updated weights for policy 0, policy_version 357550 (0.0040) [2024-06-23 04:15:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 5858230272. Throughput: 0: 42874.7. Samples: 5858331700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 04:15:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 04:15:29,210][15401] Updated weights for policy 0, policy_version 357560 (0.0030) [2024-06-23 04:15:32,901][15401] Updated weights for policy 0, policy_version 357570 (0.0033) [2024-06-23 04:15:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 5858426880. Throughput: 0: 42867.1. Samples: 5858590860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 04:15:36,717][15401] Updated weights for policy 0, policy_version 357580 (0.0028) [2024-06-23 04:15:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5858623488. Throughput: 0: 42885.4. Samples: 5858721380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 04:15:40,779][15401] Updated weights for policy 0, policy_version 357590 (0.0027) [2024-06-23 04:15:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.3, 300 sec: 42987.5). Total num frames: 5858869248. Throughput: 0: 42849.0. Samples: 5858979080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 04:15:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357598_5858885632.pth... [2024-06-23 04:15:43,561][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000356969_5848580096.pth [2024-06-23 04:15:44,206][15401] Updated weights for policy 0, policy_version 357600 (0.0036) [2024-06-23 04:15:48,290][15401] Updated weights for policy 0, policy_version 357610 (0.0033) [2024-06-23 04:15:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5859082240. Throughput: 0: 42904.4. Samples: 5859237320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 04:15:51,738][15401] Updated weights for policy 0, policy_version 357620 (0.0031) [2024-06-23 04:15:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5859262464. Throughput: 0: 42677.2. Samples: 5859359260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 04:15:56,102][15401] Updated weights for policy 0, policy_version 357630 (0.0039) [2024-06-23 04:15:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 5859524608. Throughput: 0: 42916.6. Samples: 5859619900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:15:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 04:15:59,349][15401] Updated weights for policy 0, policy_version 357640 (0.0040) [2024-06-23 04:16:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5859704832. Throughput: 0: 42930.7. Samples: 5859880300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 04:16:03,638][15401] Updated weights for policy 0, policy_version 357650 (0.0027) [2024-06-23 04:16:06,866][15401] Updated weights for policy 0, policy_version 357660 (0.0025) [2024-06-23 04:16:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5859917824. Throughput: 0: 42865.9. Samples: 5860005900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 04:16:11,026][15401] Updated weights for policy 0, policy_version 357670 (0.0044) [2024-06-23 04:16:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5860147200. Throughput: 0: 42904.6. Samples: 5860262400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 04:16:14,471][15401] Updated weights for policy 0, policy_version 357680 (0.0031) [2024-06-23 04:16:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5860360192. Throughput: 0: 43029.0. Samples: 5860527160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 04:16:18,700][15401] Updated weights for policy 0, policy_version 357690 (0.0040) [2024-06-23 04:16:22,240][15401] Updated weights for policy 0, policy_version 357700 (0.0038) [2024-06-23 04:16:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 5860573184. Throughput: 0: 42913.4. Samples: 5860652480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 04:16:25,939][15349] Signal inference workers to stop experience collection... (86850 times) [2024-06-23 04:16:25,995][15401] InferenceWorker_p0-w0: stopping experience collection (86850 times) [2024-06-23 04:16:26,003][15349] Signal inference workers to resume experience collection... (86850 times) [2024-06-23 04:16:26,006][15401] InferenceWorker_p0-w0: resuming experience collection (86850 times) [2024-06-23 04:16:26,293][15401] Updated weights for policy 0, policy_version 357710 (0.0043) [2024-06-23 04:16:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5860802560. Throughput: 0: 42772.7. Samples: 5860903860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:28,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 04:16:30,300][15401] Updated weights for policy 0, policy_version 357720 (0.0027) [2024-06-23 04:16:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5860982784. Throughput: 0: 42907.1. Samples: 5861168140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 04:16:34,280][15401] Updated weights for policy 0, policy_version 357730 (0.0035) [2024-06-23 04:16:37,943][15401] Updated weights for policy 0, policy_version 357740 (0.0028) [2024-06-23 04:16:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5861212160. Throughput: 0: 42869.5. Samples: 5861288380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 04:16:41,863][15401] Updated weights for policy 0, policy_version 357750 (0.0034) [2024-06-23 04:16:43,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 5861457920. Throughput: 0: 42778.1. Samples: 5861545020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:43,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 04:16:45,658][15401] Updated weights for policy 0, policy_version 357760 (0.0038) [2024-06-23 04:16:48,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 5861588992. Throughput: 0: 42905.8. Samples: 5861811060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 04:16:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 04:16:49,405][15401] Updated weights for policy 0, policy_version 357770 (0.0024) [2024-06-23 04:16:53,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5861851136. Throughput: 0: 42665.7. Samples: 5861925860. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:16:53,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 04:16:53,518][15401] Updated weights for policy 0, policy_version 357780 (0.0024) [2024-06-23 04:16:57,339][15401] Updated weights for policy 0, policy_version 357790 (0.0030) [2024-06-23 04:16:58,389][15132] Fps is (10 sec: 50790.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5862096896. Throughput: 0: 42738.6. Samples: 5862185640. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:16:58,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 04:17:01,501][15401] Updated weights for policy 0, policy_version 357800 (0.0041) [2024-06-23 04:17:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 5862244352. Throughput: 0: 42739.1. Samples: 5862450420. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 04:17:04,810][15401] Updated weights for policy 0, policy_version 357810 (0.0035) [2024-06-23 04:17:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5862490112. Throughput: 0: 42435.0. Samples: 5862562060. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 04:17:09,162][15401] Updated weights for policy 0, policy_version 357820 (0.0031) [2024-06-23 04:17:12,441][15401] Updated weights for policy 0, policy_version 357830 (0.0034) [2024-06-23 04:17:13,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5862735872. Throughput: 0: 42805.0. Samples: 5862830080. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 04:17:16,898][15401] Updated weights for policy 0, policy_version 357840 (0.0045) [2024-06-23 04:17:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5862883328. Throughput: 0: 42860.5. Samples: 5863096860. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 04:17:20,001][15401] Updated weights for policy 0, policy_version 357850 (0.0048) [2024-06-23 04:17:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5863145472. Throughput: 0: 42713.7. Samples: 5863210500. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 04:17:24,667][15401] Updated weights for policy 0, policy_version 357860 (0.0026) [2024-06-23 04:17:27,428][15349] Signal inference workers to stop experience collection... (86900 times) [2024-06-23 04:17:27,433][15349] Signal inference workers to resume experience collection... (86900 times) [2024-06-23 04:17:27,454][15401] InferenceWorker_p0-w0: stopping experience collection (86900 times) [2024-06-23 04:17:27,455][15401] InferenceWorker_p0-w0: resuming experience collection (86900 times) [2024-06-23 04:17:27,609][15401] Updated weights for policy 0, policy_version 357870 (0.0041) [2024-06-23 04:17:28,389][15132] Fps is (10 sec: 50790.6, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 5863391232. Throughput: 0: 42985.5. Samples: 5863479260. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 04:17:32,306][15401] Updated weights for policy 0, policy_version 357880 (0.0022) [2024-06-23 04:17:33,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 5863522304. Throughput: 0: 42858.1. Samples: 5863739680. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 04:17:35,150][15401] Updated weights for policy 0, policy_version 357890 (0.0033) [2024-06-23 04:17:38,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5863784448. Throughput: 0: 42823.1. Samples: 5863852900. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 04:17:39,846][15401] Updated weights for policy 0, policy_version 357900 (0.0035) [2024-06-23 04:17:42,823][15401] Updated weights for policy 0, policy_version 357910 (0.0026) [2024-06-23 04:17:43,389][15132] Fps is (10 sec: 50790.8, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 5864030208. Throughput: 0: 43044.0. Samples: 5864122620. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 04:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357912_5864030208.pth... [2024-06-23 04:17:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357283_5853724672.pth [2024-06-23 04:17:47,637][15401] Updated weights for policy 0, policy_version 357920 (0.0032) [2024-06-23 04:17:48,393][15132] Fps is (10 sec: 37671.0, 60 sec: 42869.0, 300 sec: 42709.0). Total num frames: 5864161280. Throughput: 0: 42988.4. Samples: 5864385040. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:48,393][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 04:17:50,796][15401] Updated weights for policy 0, policy_version 357930 (0.0031) [2024-06-23 04:17:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5864439808. Throughput: 0: 42984.4. Samples: 5864496360. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 04:17:55,183][15401] Updated weights for policy 0, policy_version 357940 (0.0034) [2024-06-23 04:17:58,392][15132] Fps is (10 sec: 47518.2, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 5864636416. Throughput: 0: 43018.5. Samples: 5864766020. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:17:58,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 04:17:58,439][15401] Updated weights for policy 0, policy_version 357950 (0.0026) [2024-06-23 04:18:02,621][15401] Updated weights for policy 0, policy_version 357960 (0.0038) [2024-06-23 04:18:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5864816640. Throughput: 0: 42823.9. Samples: 5865023940. Policy #0 lag: (min: 0.0, avg: 6.9, max: 22.0) [2024-06-23 04:18:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 04:18:05,998][15401] Updated weights for policy 0, policy_version 357970 (0.0037) [2024-06-23 04:18:08,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5865095168. Throughput: 0: 42947.1. Samples: 5865143120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 04:18:10,094][15401] Updated weights for policy 0, policy_version 357980 (0.0025) [2024-06-23 04:18:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5865275392. Throughput: 0: 42864.4. Samples: 5865408160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 04:18:13,611][15401] Updated weights for policy 0, policy_version 357990 (0.0032) [2024-06-23 04:18:18,392][15132] Fps is (10 sec: 36036.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 5865455616. Throughput: 0: 42763.6. Samples: 5865664140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:18,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 04:18:18,473][15401] Updated weights for policy 0, policy_version 358000 (0.0036) [2024-06-23 04:18:19,680][15349] Signal inference workers to stop experience collection... (86950 times) [2024-06-23 04:18:19,681][15349] Signal inference workers to resume experience collection... (86950 times) [2024-06-23 04:18:19,694][15401] InferenceWorker_p0-w0: stopping experience collection (86950 times) [2024-06-23 04:18:19,694][15401] InferenceWorker_p0-w0: resuming experience collection (86950 times) [2024-06-23 04:18:21,506][15401] Updated weights for policy 0, policy_version 358010 (0.0040) [2024-06-23 04:18:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5865734144. Throughput: 0: 42926.2. Samples: 5865784580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 04:18:25,946][15401] Updated weights for policy 0, policy_version 358020 (0.0042) [2024-06-23 04:18:28,390][15132] Fps is (10 sec: 44247.2, 60 sec: 41779.1, 300 sec: 42820.5). Total num frames: 5865897984. Throughput: 0: 42712.0. Samples: 5866044660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:28,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 04:18:29,300][15401] Updated weights for policy 0, policy_version 358030 (0.0031) [2024-06-23 04:18:33,389][15132] Fps is (10 sec: 37684.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5866110976. Throughput: 0: 42500.1. Samples: 5866297400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 04:18:33,451][15401] Updated weights for policy 0, policy_version 358040 (0.0034) [2024-06-23 04:18:36,967][15401] Updated weights for policy 0, policy_version 358050 (0.0029) [2024-06-23 04:18:38,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5866373120. Throughput: 0: 42826.2. Samples: 5866423540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 04:18:41,170][15401] Updated weights for policy 0, policy_version 358060 (0.0032) [2024-06-23 04:18:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 5866536960. Throughput: 0: 42577.8. Samples: 5866681920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:43,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 04:18:44,592][15401] Updated weights for policy 0, policy_version 358070 (0.0041) [2024-06-23 04:18:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43420.0, 300 sec: 42765.0). Total num frames: 5866766336. Throughput: 0: 42483.5. Samples: 5866935700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 04:18:48,584][15401] Updated weights for policy 0, policy_version 358080 (0.0037) [2024-06-23 04:18:52,269][15401] Updated weights for policy 0, policy_version 358090 (0.0029) [2024-06-23 04:18:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5866995712. Throughput: 0: 42768.6. Samples: 5867067700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 04:18:56,284][15401] Updated weights for policy 0, policy_version 358100 (0.0025) [2024-06-23 04:18:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5867192320. Throughput: 0: 42595.1. Samples: 5867324940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:18:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 04:18:59,988][15401] Updated weights for policy 0, policy_version 358110 (0.0045) [2024-06-23 04:19:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5867405312. Throughput: 0: 42445.9. Samples: 5867574100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:19:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 04:19:03,693][15401] Updated weights for policy 0, policy_version 358120 (0.0030) [2024-06-23 04:19:07,722][15401] Updated weights for policy 0, policy_version 358130 (0.0035) [2024-06-23 04:19:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5867634688. Throughput: 0: 42766.4. Samples: 5867709060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:19:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 04:19:11,110][15401] Updated weights for policy 0, policy_version 358140 (0.0033) [2024-06-23 04:19:13,393][15132] Fps is (10 sec: 40943.5, 60 sec: 42322.5, 300 sec: 42764.5). Total num frames: 5867814912. Throughput: 0: 42664.3. Samples: 5867964720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:19:13,394][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 04:19:15,488][15401] Updated weights for policy 0, policy_version 358150 (0.0047) [2024-06-23 04:19:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 5868060672. Throughput: 0: 42553.7. Samples: 5868212320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:19:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 04:19:19,221][15401] Updated weights for policy 0, policy_version 358160 (0.0037) [2024-06-23 04:19:23,068][15401] Updated weights for policy 0, policy_version 358170 (0.0050) [2024-06-23 04:19:23,390][15132] Fps is (10 sec: 45893.0, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 5868273664. Throughput: 0: 42932.0. Samples: 5868355480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 04:19:26,719][15401] Updated weights for policy 0, policy_version 358180 (0.0030) [2024-06-23 04:19:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5868453888. Throughput: 0: 42724.0. Samples: 5868604500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 04:19:30,645][15401] Updated weights for policy 0, policy_version 358190 (0.0037) [2024-06-23 04:19:32,432][15349] Signal inference workers to stop experience collection... (87000 times) [2024-06-23 04:19:32,432][15349] Signal inference workers to resume experience collection... (87000 times) [2024-06-23 04:19:32,457][15401] InferenceWorker_p0-w0: stopping experience collection (87000 times) [2024-06-23 04:19:32,489][15401] InferenceWorker_p0-w0: resuming experience collection (87000 times) [2024-06-23 04:19:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5868716032. Throughput: 0: 42719.3. Samples: 5868858060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:33,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 04:19:34,200][15401] Updated weights for policy 0, policy_version 358200 (0.0035) [2024-06-23 04:19:38,118][15401] Updated weights for policy 0, policy_version 358210 (0.0041) [2024-06-23 04:19:38,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42323.6, 300 sec: 42765.0). Total num frames: 5868912640. Throughput: 0: 42899.3. Samples: 5868998280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:38,393][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 04:19:41,645][15401] Updated weights for policy 0, policy_version 358220 (0.0029) [2024-06-23 04:19:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5869109248. Throughput: 0: 42703.5. Samples: 5869246600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:43,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-23 04:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358223_5869125632.pth... [2024-06-23 04:19:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357598_5858885632.pth [2024-06-23 04:19:45,901][15401] Updated weights for policy 0, policy_version 358230 (0.0034) [2024-06-23 04:19:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5869355008. Throughput: 0: 42794.1. Samples: 5869499840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 04:19:49,246][15401] Updated weights for policy 0, policy_version 358240 (0.0029) [2024-06-23 04:19:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5869551616. Throughput: 0: 42818.2. Samples: 5869635880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 04:19:53,446][15401] Updated weights for policy 0, policy_version 358250 (0.0034) [2024-06-23 04:19:57,372][15401] Updated weights for policy 0, policy_version 358260 (0.0033) [2024-06-23 04:19:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5869764608. Throughput: 0: 42728.1. Samples: 5869887320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:19:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 04:20:01,104][15401] Updated weights for policy 0, policy_version 358270 (0.0032) [2024-06-23 04:20:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5869993984. Throughput: 0: 42897.0. Samples: 5870142680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 04:20:05,158][15401] Updated weights for policy 0, policy_version 358280 (0.0022) [2024-06-23 04:20:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5870190592. Throughput: 0: 42620.5. Samples: 5870273400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 04:20:08,806][15401] Updated weights for policy 0, policy_version 358290 (0.0024) [2024-06-23 04:20:12,695][15401] Updated weights for policy 0, policy_version 358300 (0.0031) [2024-06-23 04:20:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43147.4, 300 sec: 42765.0). Total num frames: 5870403584. Throughput: 0: 42663.6. Samples: 5870524360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 04:20:16,687][15401] Updated weights for policy 0, policy_version 358310 (0.0047) [2024-06-23 04:20:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5870616576. Throughput: 0: 42715.2. Samples: 5870780240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 04:20:20,328][15401] Updated weights for policy 0, policy_version 358320 (0.0036) [2024-06-23 04:20:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5870813184. Throughput: 0: 42394.7. Samples: 5870905940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:23,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-23 04:20:24,377][15401] Updated weights for policy 0, policy_version 358330 (0.0039) [2024-06-23 04:20:28,017][15401] Updated weights for policy 0, policy_version 358340 (0.0041) [2024-06-23 04:20:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 5871058944. Throughput: 0: 42505.3. Samples: 5871159340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:28,395][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 04:20:32,212][15401] Updated weights for policy 0, policy_version 358350 (0.0033) [2024-06-23 04:20:33,390][15132] Fps is (10 sec: 44234.5, 60 sec: 42324.9, 300 sec: 42820.5). Total num frames: 5871255552. Throughput: 0: 42523.9. Samples: 5871413440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:33,391][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 04:20:35,730][15401] Updated weights for policy 0, policy_version 358360 (0.0029) [2024-06-23 04:20:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 5871452160. Throughput: 0: 42467.6. Samples: 5871546920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 04:20:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 04:20:39,852][15401] Updated weights for policy 0, policy_version 358370 (0.0032) [2024-06-23 04:20:43,385][15401] Updated weights for policy 0, policy_version 358380 (0.0047) [2024-06-23 04:20:43,389][15132] Fps is (10 sec: 44239.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5871697920. Throughput: 0: 42545.0. Samples: 5871801840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:20:43,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 04:20:47,509][15401] Updated weights for policy 0, policy_version 358390 (0.0030) [2024-06-23 04:20:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5871894528. Throughput: 0: 42533.2. Samples: 5872056680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:20:48,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 04:20:51,162][15401] Updated weights for policy 0, policy_version 358400 (0.0042) [2024-06-23 04:20:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5872091136. Throughput: 0: 42436.9. Samples: 5872183060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:20:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 04:20:55,355][15401] Updated weights for policy 0, policy_version 358410 (0.0037) [2024-06-23 04:20:56,614][15349] Signal inference workers to stop experience collection... (87050 times) [2024-06-23 04:20:56,647][15401] InferenceWorker_p0-w0: stopping experience collection (87050 times) [2024-06-23 04:20:56,677][15349] Signal inference workers to resume experience collection... (87050 times) [2024-06-23 04:20:56,678][15401] InferenceWorker_p0-w0: resuming experience collection (87050 times) [2024-06-23 04:20:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5872320512. Throughput: 0: 42456.9. Samples: 5872434920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:20:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 04:20:58,953][15401] Updated weights for policy 0, policy_version 358420 (0.0039) [2024-06-23 04:21:03,068][15401] Updated weights for policy 0, policy_version 358430 (0.0036) [2024-06-23 04:21:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5872533504. Throughput: 0: 42490.6. Samples: 5872692320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 04:21:06,779][15401] Updated weights for policy 0, policy_version 358440 (0.0041) [2024-06-23 04:21:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5872730112. Throughput: 0: 42552.0. Samples: 5872820780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 04:21:10,780][15401] Updated weights for policy 0, policy_version 358450 (0.0035) [2024-06-23 04:21:13,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5872975872. Throughput: 0: 42486.1. Samples: 5873071220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 04:21:14,337][15401] Updated weights for policy 0, policy_version 358460 (0.0031) [2024-06-23 04:21:18,335][15401] Updated weights for policy 0, policy_version 358470 (0.0031) [2024-06-23 04:21:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5873172480. Throughput: 0: 42796.9. Samples: 5873339280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 04:21:21,955][15401] Updated weights for policy 0, policy_version 358480 (0.0028) [2024-06-23 04:21:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5873385472. Throughput: 0: 42481.3. Samples: 5873458580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 04:21:26,102][15401] Updated weights for policy 0, policy_version 358490 (0.0042) [2024-06-23 04:21:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5873614848. Throughput: 0: 42527.4. Samples: 5873715580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 04:21:29,657][15401] Updated weights for policy 0, policy_version 358500 (0.0036) [2024-06-23 04:21:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.7, 300 sec: 42653.9). Total num frames: 5873795072. Throughput: 0: 42708.5. Samples: 5873978560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 04:21:33,597][15401] Updated weights for policy 0, policy_version 358510 (0.0048) [2024-06-23 04:21:37,189][15401] Updated weights for policy 0, policy_version 358520 (0.0027) [2024-06-23 04:21:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 5874008064. Throughput: 0: 42543.7. Samples: 5874097520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 04:21:41,218][15401] Updated weights for policy 0, policy_version 358530 (0.0034) [2024-06-23 04:21:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 5874270208. Throughput: 0: 42623.5. Samples: 5874352980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 04:21:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358537_5874270208.pth... [2024-06-23 04:21:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000357912_5864030208.pth [2024-06-23 04:21:45,357][15401] Updated weights for policy 0, policy_version 358540 (0.0041) [2024-06-23 04:21:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 5874434048. Throughput: 0: 42749.9. Samples: 5874616060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 04:21:48,953][15401] Updated weights for policy 0, policy_version 358550 (0.0032) [2024-06-23 04:21:52,873][15401] Updated weights for policy 0, policy_version 358560 (0.0047) [2024-06-23 04:21:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5874663424. Throughput: 0: 42518.7. Samples: 5874734120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 04:21:56,703][15401] Updated weights for policy 0, policy_version 358570 (0.0032) [2024-06-23 04:21:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 5874909184. Throughput: 0: 42697.6. Samples: 5874992600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 04:21:58,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 04:22:00,718][15401] Updated weights for policy 0, policy_version 358580 (0.0033) [2024-06-23 04:22:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5875056640. Throughput: 0: 42629.5. Samples: 5875257600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 04:22:04,287][15401] Updated weights for policy 0, policy_version 358590 (0.0025) [2024-06-23 04:22:08,219][15401] Updated weights for policy 0, policy_version 358600 (0.0032) [2024-06-23 04:22:08,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5875302400. Throughput: 0: 42545.3. Samples: 5875373120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 04:22:09,411][15349] Signal inference workers to stop experience collection... (87100 times) [2024-06-23 04:22:09,466][15401] InferenceWorker_p0-w0: stopping experience collection (87100 times) [2024-06-23 04:22:09,472][15349] Signal inference workers to resume experience collection... (87100 times) [2024-06-23 04:22:09,496][15401] InferenceWorker_p0-w0: resuming experience collection (87100 times) [2024-06-23 04:22:12,282][15401] Updated weights for policy 0, policy_version 358610 (0.0030) [2024-06-23 04:22:13,390][15132] Fps is (10 sec: 49151.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5875548160. Throughput: 0: 42773.4. Samples: 5875640380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 04:22:16,137][15401] Updated weights for policy 0, policy_version 358620 (0.0028) [2024-06-23 04:22:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5875712000. Throughput: 0: 42619.0. Samples: 5875896420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 04:22:19,779][15401] Updated weights for policy 0, policy_version 358630 (0.0031) [2024-06-23 04:22:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5875941376. Throughput: 0: 42669.2. Samples: 5876017640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 04:22:23,583][15401] Updated weights for policy 0, policy_version 358640 (0.0032) [2024-06-23 04:22:27,288][15401] Updated weights for policy 0, policy_version 358650 (0.0031) [2024-06-23 04:22:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5876187136. Throughput: 0: 42868.9. Samples: 5876282080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:28,396][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 04:22:31,415][15401] Updated weights for policy 0, policy_version 358660 (0.0040) [2024-06-23 04:22:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 5876350976. Throughput: 0: 42865.6. Samples: 5876545120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:33,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 04:22:34,958][15401] Updated weights for policy 0, policy_version 358670 (0.0033) [2024-06-23 04:22:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 5876596736. Throughput: 0: 42877.2. Samples: 5876663600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 04:22:38,905][15401] Updated weights for policy 0, policy_version 358680 (0.0032) [2024-06-23 04:22:42,751][15401] Updated weights for policy 0, policy_version 358690 (0.0035) [2024-06-23 04:22:43,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42325.3, 300 sec: 42876.6). Total num frames: 5876809728. Throughput: 0: 42910.9. Samples: 5876923600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 04:22:46,635][15401] Updated weights for policy 0, policy_version 358700 (0.0042) [2024-06-23 04:22:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5876989952. Throughput: 0: 42790.3. Samples: 5877183160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 04:22:50,389][15401] Updated weights for policy 0, policy_version 358710 (0.0033) [2024-06-23 04:22:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5877235712. Throughput: 0: 42928.9. Samples: 5877304920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 04:22:54,538][15401] Updated weights for policy 0, policy_version 358720 (0.0045) [2024-06-23 04:22:58,029][15401] Updated weights for policy 0, policy_version 358730 (0.0033) [2024-06-23 04:22:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 5877432320. Throughput: 0: 42788.1. Samples: 5877565840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:22:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 04:23:02,120][15401] Updated weights for policy 0, policy_version 358740 (0.0035) [2024-06-23 04:23:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5877628928. Throughput: 0: 42656.6. Samples: 5877815960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:23:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 04:23:06,097][15401] Updated weights for policy 0, policy_version 358750 (0.0044) [2024-06-23 04:23:08,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 5877858304. Throughput: 0: 42701.5. Samples: 5877939300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:23:08,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 04:23:09,750][15401] Updated weights for policy 0, policy_version 358760 (0.0041) [2024-06-23 04:23:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 5878054912. Throughput: 0: 42471.6. Samples: 5878193300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-23 04:23:13,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 04:23:13,725][15401] Updated weights for policy 0, policy_version 358770 (0.0032) [2024-06-23 04:23:17,253][15401] Updated weights for policy 0, policy_version 358780 (0.0024) [2024-06-23 04:23:18,390][15132] Fps is (10 sec: 40968.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5878267904. Throughput: 0: 42407.5. Samples: 5878453360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:18,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 04:23:21,519][15401] Updated weights for policy 0, policy_version 358790 (0.0041) [2024-06-23 04:23:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5878513664. Throughput: 0: 42532.5. Samples: 5878577560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 04:23:24,870][15401] Updated weights for policy 0, policy_version 358800 (0.0035) [2024-06-23 04:23:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 5878693888. Throughput: 0: 42421.0. Samples: 5878832540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 04:23:29,067][15401] Updated weights for policy 0, policy_version 358810 (0.0034) [2024-06-23 04:23:32,423][15401] Updated weights for policy 0, policy_version 358820 (0.0030) [2024-06-23 04:23:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 5878923264. Throughput: 0: 42387.5. Samples: 5879090600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 04:23:36,763][15401] Updated weights for policy 0, policy_version 358830 (0.0037) [2024-06-23 04:23:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5879152640. Throughput: 0: 42546.6. Samples: 5879219520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:38,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 04:23:40,188][15401] Updated weights for policy 0, policy_version 358840 (0.0028) [2024-06-23 04:23:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5879332864. Throughput: 0: 42284.0. Samples: 5879468620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 04:23:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358847_5879349248.pth... [2024-06-23 04:23:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358223_5869125632.pth [2024-06-23 04:23:44,582][15401] Updated weights for policy 0, policy_version 358850 (0.0033) [2024-06-23 04:23:48,310][15401] Updated weights for policy 0, policy_version 358860 (0.0028) [2024-06-23 04:23:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5879562240. Throughput: 0: 42227.0. Samples: 5879716180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 04:23:49,704][15349] Signal inference workers to stop experience collection... (87150 times) [2024-06-23 04:23:49,704][15349] Signal inference workers to resume experience collection... (87150 times) [2024-06-23 04:23:49,748][15401] InferenceWorker_p0-w0: stopping experience collection (87150 times) [2024-06-23 04:23:49,748][15401] InferenceWorker_p0-w0: resuming experience collection (87150 times) [2024-06-23 04:23:52,324][15401] Updated weights for policy 0, policy_version 358870 (0.0035) [2024-06-23 04:23:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5879758848. Throughput: 0: 42365.2. Samples: 5879845640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 04:23:55,927][15401] Updated weights for policy 0, policy_version 358880 (0.0046) [2024-06-23 04:23:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5879971840. Throughput: 0: 42451.1. Samples: 5880103600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:23:58,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 04:24:00,001][15401] Updated weights for policy 0, policy_version 358890 (0.0042) [2024-06-23 04:24:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5880184832. Throughput: 0: 42325.9. Samples: 5880358020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 04:24:03,628][15401] Updated weights for policy 0, policy_version 358900 (0.0040) [2024-06-23 04:24:07,654][15401] Updated weights for policy 0, policy_version 358910 (0.0034) [2024-06-23 04:24:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42654.5). Total num frames: 5880397824. Throughput: 0: 42424.3. Samples: 5880486660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:08,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 04:24:11,524][15401] Updated weights for policy 0, policy_version 358920 (0.0036) [2024-06-23 04:24:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5880643584. Throughput: 0: 42621.9. Samples: 5880750520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 04:24:15,221][15401] Updated weights for policy 0, policy_version 358930 (0.0036) [2024-06-23 04:24:18,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 5880840192. Throughput: 0: 42631.1. Samples: 5881009100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:18,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 04:24:19,035][15401] Updated weights for policy 0, policy_version 358940 (0.0039) [2024-06-23 04:24:22,743][15401] Updated weights for policy 0, policy_version 358950 (0.0026) [2024-06-23 04:24:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5881053184. Throughput: 0: 42569.3. Samples: 5881135140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 04:24:26,677][15401] Updated weights for policy 0, policy_version 358960 (0.0050) [2024-06-23 04:24:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5881282560. Throughput: 0: 42840.0. Samples: 5881396420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 04:24:30,258][15401] Updated weights for policy 0, policy_version 358970 (0.0021) [2024-06-23 04:24:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 5881479168. Throughput: 0: 43061.7. Samples: 5881653960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 04:24:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 04:24:34,316][15401] Updated weights for policy 0, policy_version 358980 (0.0028) [2024-06-23 04:24:37,802][15401] Updated weights for policy 0, policy_version 358990 (0.0042) [2024-06-23 04:24:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5881692160. Throughput: 0: 42868.4. Samples: 5881774720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:24:38,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 04:24:41,791][15401] Updated weights for policy 0, policy_version 359000 (0.0041) [2024-06-23 04:24:43,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 5881937920. Throughput: 0: 43021.1. Samples: 5882039540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:24:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 04:24:45,668][15401] Updated weights for policy 0, policy_version 359010 (0.0035) [2024-06-23 04:24:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5882134528. Throughput: 0: 43055.1. Samples: 5882295500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:24:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 04:24:49,348][15401] Updated weights for policy 0, policy_version 359020 (0.0051) [2024-06-23 04:24:53,307][15401] Updated weights for policy 0, policy_version 359030 (0.0028) [2024-06-23 04:24:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5882347520. Throughput: 0: 42823.1. Samples: 5882413700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:24:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 04:24:57,307][15401] Updated weights for policy 0, policy_version 359040 (0.0028) [2024-06-23 04:24:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 5882576896. Throughput: 0: 42759.1. Samples: 5882674680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:24:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 04:25:01,088][15349] Signal inference workers to stop experience collection... (87200 times) [2024-06-23 04:25:01,090][15349] Signal inference workers to resume experience collection... (87200 times) [2024-06-23 04:25:01,114][15401] InferenceWorker_p0-w0: stopping experience collection (87200 times) [2024-06-23 04:25:01,114][15401] InferenceWorker_p0-w0: resuming experience collection (87200 times) [2024-06-23 04:25:01,225][15401] Updated weights for policy 0, policy_version 359050 (0.0030) [2024-06-23 04:25:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5882757120. Throughput: 0: 42620.4. Samples: 5882926920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:03,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 04:25:05,285][15401] Updated weights for policy 0, policy_version 359060 (0.0032) [2024-06-23 04:25:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5882970112. Throughput: 0: 42455.6. Samples: 5883045640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 04:25:08,826][15401] Updated weights for policy 0, policy_version 359070 (0.0042) [2024-06-23 04:25:12,818][15401] Updated weights for policy 0, policy_version 359080 (0.0053) [2024-06-23 04:25:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5883199488. Throughput: 0: 42515.1. Samples: 5883309600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 04:25:16,493][15401] Updated weights for policy 0, policy_version 359090 (0.0039) [2024-06-23 04:25:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5883379712. Throughput: 0: 42414.8. Samples: 5883562620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 04:25:20,522][15401] Updated weights for policy 0, policy_version 359100 (0.0046) [2024-06-23 04:25:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5883609088. Throughput: 0: 42341.0. Samples: 5883680060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 04:25:24,253][15401] Updated weights for policy 0, policy_version 359110 (0.0028) [2024-06-23 04:25:28,377][15401] Updated weights for policy 0, policy_version 359120 (0.0033) [2024-06-23 04:25:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 5883822080. Throughput: 0: 42288.7. Samples: 5883942540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:28,394][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 04:25:31,892][15401] Updated weights for policy 0, policy_version 359130 (0.0032) [2024-06-23 04:25:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 5884002304. Throughput: 0: 42231.4. Samples: 5884195920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:33,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-23 04:25:36,194][15401] Updated weights for policy 0, policy_version 359140 (0.0037) [2024-06-23 04:25:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 5884248064. Throughput: 0: 42361.7. Samples: 5884320080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:38,393][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 04:25:40,186][15401] Updated weights for policy 0, policy_version 359150 (0.0034) [2024-06-23 04:25:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 5884444672. Throughput: 0: 42300.7. Samples: 5884578220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 04:25:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359158_5884444672.pth... [2024-06-23 04:25:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358537_5874270208.pth [2024-06-23 04:25:43,910][15401] Updated weights for policy 0, policy_version 359160 (0.0026) [2024-06-23 04:25:47,827][15401] Updated weights for policy 0, policy_version 359170 (0.0036) [2024-06-23 04:25:48,389][15132] Fps is (10 sec: 39331.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 5884641280. Throughput: 0: 42267.7. Samples: 5884828960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 04:25:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 04:25:51,504][15401] Updated weights for policy 0, policy_version 359180 (0.0042) [2024-06-23 04:25:53,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 5884887040. Throughput: 0: 42430.2. Samples: 5884955100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:25:53,393][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 04:25:55,394][15401] Updated weights for policy 0, policy_version 359190 (0.0029) [2024-06-23 04:25:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 5885083648. Throughput: 0: 42298.2. Samples: 5885213020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:25:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 04:25:59,233][15401] Updated weights for policy 0, policy_version 359200 (0.0031) [2024-06-23 04:26:03,143][15401] Updated weights for policy 0, policy_version 359210 (0.0029) [2024-06-23 04:26:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5885296640. Throughput: 0: 42210.2. Samples: 5885462080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:03,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 04:26:06,786][15401] Updated weights for policy 0, policy_version 359220 (0.0035) [2024-06-23 04:26:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5885526016. Throughput: 0: 42462.6. Samples: 5885590880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 04:26:10,809][15401] Updated weights for policy 0, policy_version 359230 (0.0028) [2024-06-23 04:26:12,846][15349] Signal inference workers to stop experience collection... (87250 times) [2024-06-23 04:26:12,898][15349] Signal inference workers to resume experience collection... (87250 times) [2024-06-23 04:26:12,899][15401] InferenceWorker_p0-w0: stopping experience collection (87250 times) [2024-06-23 04:26:12,915][15401] InferenceWorker_p0-w0: resuming experience collection (87250 times) [2024-06-23 04:26:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5885722624. Throughput: 0: 42408.5. Samples: 5885850920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 04:26:14,443][15401] Updated weights for policy 0, policy_version 359240 (0.0043) [2024-06-23 04:26:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5885935616. Throughput: 0: 42306.4. Samples: 5886099700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 04:26:18,494][15401] Updated weights for policy 0, policy_version 359250 (0.0027) [2024-06-23 04:26:22,262][15401] Updated weights for policy 0, policy_version 359260 (0.0030) [2024-06-23 04:26:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5886164992. Throughput: 0: 42491.7. Samples: 5886232100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 04:26:26,311][15401] Updated weights for policy 0, policy_version 359270 (0.0035) [2024-06-23 04:26:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5886345216. Throughput: 0: 42425.9. Samples: 5886487380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:28,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-23 04:26:29,718][15401] Updated weights for policy 0, policy_version 359280 (0.0037) [2024-06-23 04:26:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5886574592. Throughput: 0: 42560.4. Samples: 5886744180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:33,394][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 04:26:33,880][15401] Updated weights for policy 0, policy_version 359290 (0.0039) [2024-06-23 04:26:37,199][15401] Updated weights for policy 0, policy_version 359300 (0.0032) [2024-06-23 04:26:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 5886803968. Throughput: 0: 42656.4. Samples: 5886874540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:38,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 04:26:41,521][15401] Updated weights for policy 0, policy_version 359310 (0.0039) [2024-06-23 04:26:43,393][15132] Fps is (10 sec: 40946.7, 60 sec: 42323.1, 300 sec: 42542.4). Total num frames: 5886984192. Throughput: 0: 42586.3. Samples: 5887129540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:43,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 04:26:45,047][15401] Updated weights for policy 0, policy_version 359320 (0.0025) [2024-06-23 04:26:48,392][15132] Fps is (10 sec: 42589.8, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 5887229952. Throughput: 0: 42692.7. Samples: 5887383340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:48,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 04:26:49,065][15401] Updated weights for policy 0, policy_version 359330 (0.0028) [2024-06-23 04:26:53,006][15401] Updated weights for policy 0, policy_version 359340 (0.0047) [2024-06-23 04:26:53,389][15132] Fps is (10 sec: 45890.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 5887442944. Throughput: 0: 42774.2. Samples: 5887515720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 04:26:57,282][15401] Updated weights for policy 0, policy_version 359350 (0.0041) [2024-06-23 04:26:58,389][15132] Fps is (10 sec: 39330.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5887623168. Throughput: 0: 42606.7. Samples: 5887768220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:26:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 04:27:00,539][15401] Updated weights for policy 0, policy_version 359360 (0.0029) [2024-06-23 04:27:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5887885312. Throughput: 0: 42570.0. Samples: 5888015360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:27:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 04:27:04,794][15401] Updated weights for policy 0, policy_version 359370 (0.0037) [2024-06-23 04:27:08,223][15401] Updated weights for policy 0, policy_version 359380 (0.0022) [2024-06-23 04:27:08,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 5888081920. Throughput: 0: 42717.0. Samples: 5888154460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 04:27:08,392][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 04:27:12,476][15401] Updated weights for policy 0, policy_version 359390 (0.0047) [2024-06-23 04:27:13,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42325.1, 300 sec: 42542.8). Total num frames: 5888262144. Throughput: 0: 42637.4. Samples: 5888406080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 04:27:16,089][15401] Updated weights for policy 0, policy_version 359400 (0.0031) [2024-06-23 04:27:18,389][15132] Fps is (10 sec: 44246.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5888524288. Throughput: 0: 42364.5. Samples: 5888650580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 04:27:20,098][15401] Updated weights for policy 0, policy_version 359410 (0.0035) [2024-06-23 04:27:23,390][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 5888704512. Throughput: 0: 42560.9. Samples: 5888789780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 04:27:23,950][15401] Updated weights for policy 0, policy_version 359420 (0.0032) [2024-06-23 04:27:25,491][15349] Signal inference workers to stop experience collection... (87300 times) [2024-06-23 04:27:25,491][15349] Signal inference workers to resume experience collection... (87300 times) [2024-06-23 04:27:25,503][15401] InferenceWorker_p0-w0: stopping experience collection (87300 times) [2024-06-23 04:27:25,503][15401] InferenceWorker_p0-w0: resuming experience collection (87300 times) [2024-06-23 04:27:27,747][15401] Updated weights for policy 0, policy_version 359430 (0.0026) [2024-06-23 04:27:28,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 5888901120. Throughput: 0: 42450.1. Samples: 5889039660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 04:27:31,921][15401] Updated weights for policy 0, policy_version 359440 (0.0038) [2024-06-23 04:27:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5889146880. Throughput: 0: 42367.4. Samples: 5889289780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 04:27:35,643][15401] Updated weights for policy 0, policy_version 359450 (0.0030) [2024-06-23 04:27:38,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 5889343488. Throughput: 0: 42475.2. Samples: 5889427100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 04:27:39,493][15401] Updated weights for policy 0, policy_version 359460 (0.0029) [2024-06-23 04:27:43,039][15401] Updated weights for policy 0, policy_version 359470 (0.0028) [2024-06-23 04:27:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.8, 300 sec: 42598.4). Total num frames: 5889556480. Throughput: 0: 42506.1. Samples: 5889681000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 04:27:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359470_5889556480.pth... [2024-06-23 04:27:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000358847_5879349248.pth [2024-06-23 04:27:47,009][15401] Updated weights for policy 0, policy_version 359480 (0.0033) [2024-06-23 04:27:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42599.9, 300 sec: 42542.8). Total num frames: 5889785856. Throughput: 0: 42687.2. Samples: 5889936280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 04:27:50,583][15401] Updated weights for policy 0, policy_version 359490 (0.0030) [2024-06-23 04:27:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5889982464. Throughput: 0: 42578.5. Samples: 5890070400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 04:27:54,753][15401] Updated weights for policy 0, policy_version 359500 (0.0036) [2024-06-23 04:27:58,238][15401] Updated weights for policy 0, policy_version 359510 (0.0035) [2024-06-23 04:27:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5890211840. Throughput: 0: 42533.6. Samples: 5890320080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:27:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 04:28:02,596][15401] Updated weights for policy 0, policy_version 359520 (0.0035) [2024-06-23 04:28:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 5890424832. Throughput: 0: 42739.9. Samples: 5890573880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:28:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 04:28:05,913][15401] Updated weights for policy 0, policy_version 359530 (0.0035) [2024-06-23 04:28:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 5890621440. Throughput: 0: 42536.5. Samples: 5890703920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:28:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 04:28:10,135][15401] Updated weights for policy 0, policy_version 359540 (0.0040) [2024-06-23 04:28:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 5890850816. Throughput: 0: 42559.7. Samples: 5890954840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:28:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 04:28:13,540][15401] Updated weights for policy 0, policy_version 359550 (0.0042) [2024-06-23 04:28:17,925][15401] Updated weights for policy 0, policy_version 359560 (0.0031) [2024-06-23 04:28:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 5891047424. Throughput: 0: 42893.1. Samples: 5891219980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:28:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 04:28:21,411][15401] Updated weights for policy 0, policy_version 359570 (0.0029) [2024-06-23 04:28:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5891244032. Throughput: 0: 42583.9. Samples: 5891343380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:28:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 04:28:25,551][15401] Updated weights for policy 0, policy_version 359580 (0.0024) [2024-06-23 04:28:28,389][15132] Fps is (10 sec: 44238.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 5891489792. Throughput: 0: 42667.7. Samples: 5891601040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 04:28:28,854][15401] Updated weights for policy 0, policy_version 359590 (0.0036) [2024-06-23 04:28:33,085][15401] Updated weights for policy 0, policy_version 359600 (0.0034) [2024-06-23 04:28:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5891686400. Throughput: 0: 42711.0. Samples: 5891858280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 04:28:36,981][15401] Updated weights for policy 0, policy_version 359610 (0.0040) [2024-06-23 04:28:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5891899392. Throughput: 0: 42583.5. Samples: 5891986660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 04:28:41,235][15401] Updated weights for policy 0, policy_version 359620 (0.0030) [2024-06-23 04:28:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5892128768. Throughput: 0: 42746.8. Samples: 5892243680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 04:28:44,634][15401] Updated weights for policy 0, policy_version 359630 (0.0038) [2024-06-23 04:28:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5892308992. Throughput: 0: 42839.2. Samples: 5892501640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 04:28:48,898][15401] Updated weights for policy 0, policy_version 359640 (0.0040) [2024-06-23 04:28:52,079][15401] Updated weights for policy 0, policy_version 359650 (0.0032) [2024-06-23 04:28:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5892554752. Throughput: 0: 42737.4. Samples: 5892627100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 04:28:56,499][15401] Updated weights for policy 0, policy_version 359660 (0.0027) [2024-06-23 04:28:57,113][15349] Signal inference workers to stop experience collection... (87350 times) [2024-06-23 04:28:57,116][15349] Signal inference workers to resume experience collection... (87350 times) [2024-06-23 04:28:57,133][15401] InferenceWorker_p0-w0: stopping experience collection (87350 times) [2024-06-23 04:28:57,134][15401] InferenceWorker_p0-w0: resuming experience collection (87350 times) [2024-06-23 04:28:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5892767744. Throughput: 0: 42870.6. Samples: 5892884020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:28:58,395][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 04:28:59,557][15401] Updated weights for policy 0, policy_version 359670 (0.0041) [2024-06-23 04:29:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5892964352. Throughput: 0: 42726.8. Samples: 5893142680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:03,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 04:29:03,965][15401] Updated weights for policy 0, policy_version 359680 (0.0033) [2024-06-23 04:29:07,235][15401] Updated weights for policy 0, policy_version 359690 (0.0036) [2024-06-23 04:29:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 5893193728. Throughput: 0: 42810.2. Samples: 5893269840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 04:29:11,567][15401] Updated weights for policy 0, policy_version 359700 (0.0036) [2024-06-23 04:29:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 5893406720. Throughput: 0: 42777.7. Samples: 5893526040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 04:29:14,923][15401] Updated weights for policy 0, policy_version 359710 (0.0032) [2024-06-23 04:29:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5893603328. Throughput: 0: 42660.1. Samples: 5893777980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 04:29:19,501][15401] Updated weights for policy 0, policy_version 359720 (0.0040) [2024-06-23 04:29:22,771][15401] Updated weights for policy 0, policy_version 359730 (0.0026) [2024-06-23 04:29:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 5893832704. Throughput: 0: 42577.3. Samples: 5893902640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 04:29:27,244][15401] Updated weights for policy 0, policy_version 359740 (0.0039) [2024-06-23 04:29:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5894045696. Throughput: 0: 42585.5. Samples: 5894160040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:28,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 04:29:30,376][15401] Updated weights for policy 0, policy_version 359750 (0.0039) [2024-06-23 04:29:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5894225920. Throughput: 0: 42524.8. Samples: 5894415260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 04:29:34,892][15401] Updated weights for policy 0, policy_version 359760 (0.0035) [2024-06-23 04:29:38,054][15401] Updated weights for policy 0, policy_version 359770 (0.0040) [2024-06-23 04:29:38,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 5894471680. Throughput: 0: 42466.2. Samples: 5894538180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:38,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 04:29:42,613][15401] Updated weights for policy 0, policy_version 359780 (0.0030) [2024-06-23 04:29:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5894668288. Throughput: 0: 42411.6. Samples: 5894792540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 04:29:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 04:29:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359783_5894684672.pth... [2024-06-23 04:29:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359158_5884444672.pth [2024-06-23 04:29:45,851][15401] Updated weights for policy 0, policy_version 359790 (0.0040) [2024-06-23 04:29:48,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5894881280. Throughput: 0: 42315.2. Samples: 5895046860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:29:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 04:29:50,388][15401] Updated weights for policy 0, policy_version 359800 (0.0034) [2024-06-23 04:29:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5895110656. Throughput: 0: 42310.8. Samples: 5895173820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:29:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 04:29:53,639][15401] Updated weights for policy 0, policy_version 359810 (0.0027) [2024-06-23 04:29:57,886][15401] Updated weights for policy 0, policy_version 359820 (0.0039) [2024-06-23 04:29:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5895307264. Throughput: 0: 42368.4. Samples: 5895432620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:29:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 04:30:01,456][15401] Updated weights for policy 0, policy_version 359830 (0.0039) [2024-06-23 04:30:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5895520256. Throughput: 0: 42215.0. Samples: 5895677660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 04:30:05,716][15401] Updated weights for policy 0, policy_version 359840 (0.0036) [2024-06-23 04:30:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5895716864. Throughput: 0: 42304.5. Samples: 5895806340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 04:30:09,123][15401] Updated weights for policy 0, policy_version 359850 (0.0031) [2024-06-23 04:30:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5895929856. Throughput: 0: 42210.9. Samples: 5896059520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 04:30:13,427][15401] Updated weights for policy 0, policy_version 359860 (0.0032) [2024-06-23 04:30:16,575][15349] Signal inference workers to stop experience collection... (87400 times) [2024-06-23 04:30:16,576][15349] Signal inference workers to resume experience collection... (87400 times) [2024-06-23 04:30:16,604][15401] InferenceWorker_p0-w0: stopping experience collection (87400 times) [2024-06-23 04:30:16,604][15401] InferenceWorker_p0-w0: resuming experience collection (87400 times) [2024-06-23 04:30:16,908][15401] Updated weights for policy 0, policy_version 359870 (0.0039) [2024-06-23 04:30:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5896142848. Throughput: 0: 42274.0. Samples: 5896317580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 04:30:21,018][15401] Updated weights for policy 0, policy_version 359880 (0.0026) [2024-06-23 04:30:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 5896372224. Throughput: 0: 42293.2. Samples: 5896441280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 04:30:24,476][15401] Updated weights for policy 0, policy_version 359890 (0.0049) [2024-06-23 04:30:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.5, 300 sec: 42598.4). Total num frames: 5896568832. Throughput: 0: 42383.7. Samples: 5896699800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 04:30:28,613][15401] Updated weights for policy 0, policy_version 359900 (0.0035) [2024-06-23 04:30:32,321][15401] Updated weights for policy 0, policy_version 359910 (0.0036) [2024-06-23 04:30:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 5896781824. Throughput: 0: 42381.9. Samples: 5896954060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 04:30:36,864][15401] Updated weights for policy 0, policy_version 359920 (0.0033) [2024-06-23 04:30:38,392][15132] Fps is (10 sec: 40949.7, 60 sec: 41779.2, 300 sec: 42487.0). Total num frames: 5896978432. Throughput: 0: 42344.8. Samples: 5897079440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:38,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 04:30:40,317][15401] Updated weights for policy 0, policy_version 359930 (0.0030) [2024-06-23 04:30:43,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5897207808. Throughput: 0: 42190.3. Samples: 5897331180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 04:30:44,419][15401] Updated weights for policy 0, policy_version 359940 (0.0038) [2024-06-23 04:30:48,339][15401] Updated weights for policy 0, policy_version 359950 (0.0037) [2024-06-23 04:30:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 5897420800. Throughput: 0: 42431.6. Samples: 5897587080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 04:30:52,076][15401] Updated weights for policy 0, policy_version 359960 (0.0031) [2024-06-23 04:30:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5897633792. Throughput: 0: 42302.6. Samples: 5897709960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 04:30:56,315][15401] Updated weights for policy 0, policy_version 359970 (0.0032) [2024-06-23 04:30:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5897846784. Throughput: 0: 42314.6. Samples: 5897963680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 04:30:58,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 04:30:59,927][15401] Updated weights for policy 0, policy_version 359980 (0.0042) [2024-06-23 04:31:03,394][15132] Fps is (10 sec: 40942.3, 60 sec: 42049.3, 300 sec: 42431.2). Total num frames: 5898043392. Throughput: 0: 42298.1. Samples: 5898221180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:03,394][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 04:31:03,787][15401] Updated weights for policy 0, policy_version 359990 (0.0028) [2024-06-23 04:31:07,386][15401] Updated weights for policy 0, policy_version 360000 (0.0047) [2024-06-23 04:31:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5898256384. Throughput: 0: 42341.9. Samples: 5898346660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:08,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 04:31:11,533][15401] Updated weights for policy 0, policy_version 360010 (0.0032) [2024-06-23 04:31:13,389][15132] Fps is (10 sec: 42617.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5898469376. Throughput: 0: 42206.6. Samples: 5898599100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 04:31:15,422][15401] Updated weights for policy 0, policy_version 360020 (0.0037) [2024-06-23 04:31:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5898682368. Throughput: 0: 42244.7. Samples: 5898855060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 04:31:19,207][15401] Updated weights for policy 0, policy_version 360030 (0.0032) [2024-06-23 04:31:23,048][15401] Updated weights for policy 0, policy_version 360040 (0.0037) [2024-06-23 04:31:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5898911744. Throughput: 0: 42340.8. Samples: 5898984680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 04:31:26,821][15401] Updated weights for policy 0, policy_version 360050 (0.0031) [2024-06-23 04:31:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5899108352. Throughput: 0: 42384.9. Samples: 5899238500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 04:31:30,802][15401] Updated weights for policy 0, policy_version 360060 (0.0037) [2024-06-23 04:31:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.6, 300 sec: 42431.8). Total num frames: 5899321344. Throughput: 0: 42365.8. Samples: 5899493540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 04:31:33,426][15349] Signal inference workers to stop experience collection... (87450 times) [2024-06-23 04:31:33,468][15401] InferenceWorker_p0-w0: stopping experience collection (87450 times) [2024-06-23 04:31:33,477][15349] Signal inference workers to resume experience collection... (87450 times) [2024-06-23 04:31:33,478][15401] InferenceWorker_p0-w0: resuming experience collection (87450 times) [2024-06-23 04:31:34,338][15401] Updated weights for policy 0, policy_version 360070 (0.0027) [2024-06-23 04:31:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42543.3). Total num frames: 5899534336. Throughput: 0: 42571.6. Samples: 5899625680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 04:31:38,440][15401] Updated weights for policy 0, policy_version 360080 (0.0027) [2024-06-23 04:31:41,919][15401] Updated weights for policy 0, policy_version 360090 (0.0046) [2024-06-23 04:31:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 5899730944. Throughput: 0: 42569.0. Samples: 5899879280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 04:31:43,587][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360093_5899763712.pth... [2024-06-23 04:31:43,632][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359470_5889556480.pth [2024-06-23 04:31:45,963][15401] Updated weights for policy 0, policy_version 360100 (0.0024) [2024-06-23 04:31:48,391][15132] Fps is (10 sec: 42591.2, 60 sec: 42324.2, 300 sec: 42431.5). Total num frames: 5899960320. Throughput: 0: 42485.6. Samples: 5900132920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:48,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 04:31:49,683][15401] Updated weights for policy 0, policy_version 360110 (0.0024) [2024-06-23 04:31:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5900156928. Throughput: 0: 42590.7. Samples: 5900263240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:53,390][15132] Avg episode reward: [(0, '0.139')] [2024-06-23 04:31:53,742][15401] Updated weights for policy 0, policy_version 360120 (0.0032) [2024-06-23 04:31:57,643][15401] Updated weights for policy 0, policy_version 360130 (0.0023) [2024-06-23 04:31:58,390][15132] Fps is (10 sec: 42605.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 5900386304. Throughput: 0: 42677.2. Samples: 5900519580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:31:58,390][15132] Avg episode reward: [(0, '0.154')] [2024-06-23 04:32:01,423][15401] Updated weights for policy 0, policy_version 360140 (0.0031) [2024-06-23 04:32:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42601.5, 300 sec: 42432.1). Total num frames: 5900599296. Throughput: 0: 42585.4. Samples: 5900771400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:32:03,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 04:32:05,266][15401] Updated weights for policy 0, policy_version 360150 (0.0037) [2024-06-23 04:32:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5900812288. Throughput: 0: 42535.9. Samples: 5900898800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:32:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 04:32:09,122][15401] Updated weights for policy 0, policy_version 360160 (0.0033) [2024-06-23 04:32:12,962][15401] Updated weights for policy 0, policy_version 360170 (0.0037) [2024-06-23 04:32:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 5901041664. Throughput: 0: 42611.5. Samples: 5901156020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:32:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:32:16,831][15401] Updated weights for policy 0, policy_version 360180 (0.0028) [2024-06-23 04:32:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5901238272. Throughput: 0: 42567.0. Samples: 5901409060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 04:32:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 04:32:20,522][15401] Updated weights for policy 0, policy_version 360190 (0.0044) [2024-06-23 04:32:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5901451264. Throughput: 0: 42426.6. Samples: 5901534880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 04:32:24,756][15401] Updated weights for policy 0, policy_version 360200 (0.0040) [2024-06-23 04:32:28,089][15401] Updated weights for policy 0, policy_version 360210 (0.0032) [2024-06-23 04:32:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5901680640. Throughput: 0: 42505.7. Samples: 5901792040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:32:32,518][15401] Updated weights for policy 0, policy_version 360220 (0.0027) [2024-06-23 04:32:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5901893632. Throughput: 0: 42515.7. Samples: 5902046060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 04:32:35,690][15401] Updated weights for policy 0, policy_version 360230 (0.0033) [2024-06-23 04:32:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5902090240. Throughput: 0: 42462.1. Samples: 5902174040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 04:32:40,046][15401] Updated weights for policy 0, policy_version 360240 (0.0031) [2024-06-23 04:32:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 5902319616. Throughput: 0: 42573.9. Samples: 5902435400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 04:32:43,458][15401] Updated weights for policy 0, policy_version 360250 (0.0027) [2024-06-23 04:32:47,675][15401] Updated weights for policy 0, policy_version 360260 (0.0028) [2024-06-23 04:32:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.6, 300 sec: 42487.3). Total num frames: 5902516224. Throughput: 0: 42641.8. Samples: 5902690280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 04:32:49,044][15349] Signal inference workers to stop experience collection... (87500 times) [2024-06-23 04:32:49,092][15401] InferenceWorker_p0-w0: stopping experience collection (87500 times) [2024-06-23 04:32:49,101][15349] Signal inference workers to resume experience collection... (87500 times) [2024-06-23 04:32:49,111][15401] InferenceWorker_p0-w0: resuming experience collection (87500 times) [2024-06-23 04:32:51,299][15401] Updated weights for policy 0, policy_version 360270 (0.0041) [2024-06-23 04:32:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 5902745600. Throughput: 0: 42535.7. Samples: 5902812900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 04:32:55,528][15401] Updated weights for policy 0, policy_version 360280 (0.0048) [2024-06-23 04:32:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5902958592. Throughput: 0: 42430.3. Samples: 5903065380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:32:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 04:32:58,875][15401] Updated weights for policy 0, policy_version 360290 (0.0025) [2024-06-23 04:33:03,190][15401] Updated weights for policy 0, policy_version 360300 (0.0051) [2024-06-23 04:33:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5903155200. Throughput: 0: 42644.4. Samples: 5903328060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:03,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 04:33:06,493][15401] Updated weights for policy 0, policy_version 360310 (0.0033) [2024-06-23 04:33:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42431.8). Total num frames: 5903368192. Throughput: 0: 42611.3. Samples: 5903452380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:08,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 04:33:10,950][15401] Updated weights for policy 0, policy_version 360320 (0.0028) [2024-06-23 04:33:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5903581184. Throughput: 0: 42728.4. Samples: 5903714820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 04:33:14,150][15401] Updated weights for policy 0, policy_version 360330 (0.0042) [2024-06-23 04:33:18,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 5903794176. Throughput: 0: 42719.5. Samples: 5903968540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:18,393][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 04:33:18,584][15401] Updated weights for policy 0, policy_version 360340 (0.0046) [2024-06-23 04:33:22,028][15401] Updated weights for policy 0, policy_version 360350 (0.0035) [2024-06-23 04:33:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5904007168. Throughput: 0: 42736.6. Samples: 5904097180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 04:33:26,299][15401] Updated weights for policy 0, policy_version 360360 (0.0045) [2024-06-23 04:33:28,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5904236544. Throughput: 0: 42586.1. Samples: 5904351780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:28,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-23 04:33:29,618][15401] Updated weights for policy 0, policy_version 360370 (0.0043) [2024-06-23 04:33:33,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5904433152. Throughput: 0: 42593.2. Samples: 5904606980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 04:33:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 04:33:34,111][15401] Updated weights for policy 0, policy_version 360380 (0.0039) [2024-06-23 04:33:37,253][15401] Updated weights for policy 0, policy_version 360390 (0.0036) [2024-06-23 04:33:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5904646144. Throughput: 0: 42676.4. Samples: 5904733340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:33:38,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 04:33:41,766][15401] Updated weights for policy 0, policy_version 360400 (0.0032) [2024-06-23 04:33:43,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5904875520. Throughput: 0: 42783.6. Samples: 5904990640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:33:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 04:33:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360405_5904875520.pth... [2024-06-23 04:33:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000359783_5894684672.pth [2024-06-23 04:33:44,925][15401] Updated weights for policy 0, policy_version 360410 (0.0036) [2024-06-23 04:33:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5905072128. Throughput: 0: 42632.5. Samples: 5905246520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:33:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 04:33:49,373][15401] Updated weights for policy 0, policy_version 360420 (0.0045) [2024-06-23 04:33:52,795][15401] Updated weights for policy 0, policy_version 360430 (0.0037) [2024-06-23 04:33:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5905301504. Throughput: 0: 42718.6. Samples: 5905374720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:33:53,394][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 04:33:56,998][15401] Updated weights for policy 0, policy_version 360440 (0.0025) [2024-06-23 04:33:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 5905514496. Throughput: 0: 42629.8. Samples: 5905633260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:33:58,393][15132] Avg episode reward: [(0, '0.175')] [2024-06-23 04:34:00,493][15401] Updated weights for policy 0, policy_version 360450 (0.0044) [2024-06-23 04:34:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5905727488. Throughput: 0: 42655.2. Samples: 5905887920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:03,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 04:34:04,746][15401] Updated weights for policy 0, policy_version 360460 (0.0040) [2024-06-23 04:34:08,020][15401] Updated weights for policy 0, policy_version 360470 (0.0040) [2024-06-23 04:34:08,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5905940480. Throughput: 0: 42611.9. Samples: 5906014720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:08,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 04:34:12,473][15401] Updated weights for policy 0, policy_version 360480 (0.0042) [2024-06-23 04:34:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5906153472. Throughput: 0: 42610.3. Samples: 5906269240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:13,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 04:34:15,956][15401] Updated weights for policy 0, policy_version 360490 (0.0041) [2024-06-23 04:34:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 5906366464. Throughput: 0: 42614.9. Samples: 5906524640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 04:34:19,946][15349] Signal inference workers to stop experience collection... (87550 times) [2024-06-23 04:34:19,985][15401] InferenceWorker_p0-w0: stopping experience collection (87550 times) [2024-06-23 04:34:20,065][15349] Signal inference workers to resume experience collection... (87550 times) [2024-06-23 04:34:20,065][15401] InferenceWorker_p0-w0: resuming experience collection (87550 times) [2024-06-23 04:34:20,197][15401] Updated weights for policy 0, policy_version 360500 (0.0035) [2024-06-23 04:34:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 5906579456. Throughput: 0: 42421.8. Samples: 5906642320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 04:34:23,708][15401] Updated weights for policy 0, policy_version 360510 (0.0035) [2024-06-23 04:34:27,959][15401] Updated weights for policy 0, policy_version 360520 (0.0041) [2024-06-23 04:34:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5906792448. Throughput: 0: 42546.1. Samples: 5906905220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 04:34:31,260][15401] Updated weights for policy 0, policy_version 360530 (0.0026) [2024-06-23 04:34:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42376.6). Total num frames: 5906972672. Throughput: 0: 42611.6. Samples: 5907164040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:33,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 04:34:35,611][15401] Updated weights for policy 0, policy_version 360540 (0.0032) [2024-06-23 04:34:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5907234816. Throughput: 0: 42435.5. Samples: 5907284320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 04:34:38,816][15401] Updated weights for policy 0, policy_version 360550 (0.0043) [2024-06-23 04:34:43,361][15401] Updated weights for policy 0, policy_version 360560 (0.0036) [2024-06-23 04:34:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5907415040. Throughput: 0: 42513.9. Samples: 5907546280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 04:34:46,345][15401] Updated weights for policy 0, policy_version 360570 (0.0029) [2024-06-23 04:34:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5907628032. Throughput: 0: 42470.5. Samples: 5907799100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:48,391][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 04:34:51,215][15401] Updated weights for policy 0, policy_version 360580 (0.0038) [2024-06-23 04:34:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5907873792. Throughput: 0: 42429.8. Samples: 5907924060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 04:34:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 04:34:53,892][15401] Updated weights for policy 0, policy_version 360590 (0.0044) [2024-06-23 04:34:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 5908037632. Throughput: 0: 42506.2. Samples: 5908182020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:34:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:34:59,063][15401] Updated weights for policy 0, policy_version 360600 (0.0034) [2024-06-23 04:35:01,387][15401] Updated weights for policy 0, policy_version 360610 (0.0031) [2024-06-23 04:35:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5908250624. Throughput: 0: 42592.4. Samples: 5908441300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 04:35:06,556][15401] Updated weights for policy 0, policy_version 360620 (0.0029) [2024-06-23 04:35:08,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5908529152. Throughput: 0: 42905.4. Samples: 5908573060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 04:35:08,858][15401] Updated weights for policy 0, policy_version 360630 (0.0042) [2024-06-23 04:35:13,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42047.7, 300 sec: 42486.4). Total num frames: 5908676608. Throughput: 0: 42531.6. Samples: 5908819420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:13,397][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 04:35:14,472][15401] Updated weights for policy 0, policy_version 360640 (0.0033) [2024-06-23 04:35:16,453][15401] Updated weights for policy 0, policy_version 360650 (0.0041) [2024-06-23 04:35:18,395][15132] Fps is (10 sec: 37663.4, 60 sec: 42321.6, 300 sec: 42486.6). Total num frames: 5908905984. Throughput: 0: 42575.4. Samples: 5909080160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:18,395][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 04:35:21,925][15401] Updated weights for policy 0, policy_version 360660 (0.0041) [2024-06-23 04:35:23,389][15132] Fps is (10 sec: 47545.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5909151744. Throughput: 0: 43041.9. Samples: 5909221200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 04:35:24,602][15401] Updated weights for policy 0, policy_version 360670 (0.0033) [2024-06-23 04:35:28,389][15132] Fps is (10 sec: 40982.0, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 5909315584. Throughput: 0: 42828.9. Samples: 5909473580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 04:35:28,918][15349] Signal inference workers to stop experience collection... (87600 times) [2024-06-23 04:35:28,919][15349] Signal inference workers to resume experience collection... (87600 times) [2024-06-23 04:35:28,942][15401] InferenceWorker_p0-w0: stopping experience collection (87600 times) [2024-06-23 04:35:28,942][15401] InferenceWorker_p0-w0: resuming experience collection (87600 times) [2024-06-23 04:35:29,213][15401] Updated weights for policy 0, policy_version 360680 (0.0042) [2024-06-23 04:35:32,234][15401] Updated weights for policy 0, policy_version 360690 (0.0028) [2024-06-23 04:35:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43690.6, 300 sec: 42765.4). Total num frames: 5909594112. Throughput: 0: 42972.0. Samples: 5909732840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 04:35:36,594][15401] Updated weights for policy 0, policy_version 360700 (0.0023) [2024-06-23 04:35:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5909790720. Throughput: 0: 43291.2. Samples: 5909872160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:35:39,942][15401] Updated weights for policy 0, policy_version 360710 (0.0032) [2024-06-23 04:35:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5909970944. Throughput: 0: 43080.1. Samples: 5910120620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:43,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 04:35:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360717_5909987328.pth... [2024-06-23 04:35:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360093_5899763712.pth [2024-06-23 04:35:44,068][15401] Updated weights for policy 0, policy_version 360720 (0.0035) [2024-06-23 04:35:47,346][15401] Updated weights for policy 0, policy_version 360730 (0.0034) [2024-06-23 04:35:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 5910216704. Throughput: 0: 43063.2. Samples: 5910379140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 04:35:51,589][15401] Updated weights for policy 0, policy_version 360740 (0.0034) [2024-06-23 04:35:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5910429696. Throughput: 0: 43127.1. Samples: 5910513780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 04:35:54,833][15401] Updated weights for policy 0, policy_version 360750 (0.0029) [2024-06-23 04:35:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42654.5). Total num frames: 5910626304. Throughput: 0: 43263.9. Samples: 5910766020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:35:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 04:35:59,412][15401] Updated weights for policy 0, policy_version 360760 (0.0037) [2024-06-23 04:36:02,473][15401] Updated weights for policy 0, policy_version 360770 (0.0030) [2024-06-23 04:36:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5910855680. Throughput: 0: 43349.1. Samples: 5911030640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:36:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 04:36:06,918][15401] Updated weights for policy 0, policy_version 360780 (0.0037) [2024-06-23 04:36:08,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5911068672. Throughput: 0: 43111.5. Samples: 5911161220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:36:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 04:36:10,335][15401] Updated weights for policy 0, policy_version 360790 (0.0036) [2024-06-23 04:36:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43422.3, 300 sec: 42709.5). Total num frames: 5911281664. Throughput: 0: 43158.5. Samples: 5911415720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:36:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 04:36:14,490][15401] Updated weights for policy 0, policy_version 360800 (0.0040) [2024-06-23 04:36:18,109][15401] Updated weights for policy 0, policy_version 360810 (0.0036) [2024-06-23 04:36:18,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43416.8, 300 sec: 42708.6). Total num frames: 5911511040. Throughput: 0: 42910.4. Samples: 5911664080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:18,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 04:36:22,438][15401] Updated weights for policy 0, policy_version 360820 (0.0041) [2024-06-23 04:36:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5911707648. Throughput: 0: 42819.1. Samples: 5911799020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 04:36:26,346][15401] Updated weights for policy 0, policy_version 360830 (0.0037) [2024-06-23 04:36:28,389][15132] Fps is (10 sec: 40986.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 5911920640. Throughput: 0: 42994.2. Samples: 5912055360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 04:36:29,901][15401] Updated weights for policy 0, policy_version 360840 (0.0047) [2024-06-23 04:36:33,396][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5912150016. Throughput: 0: 42951.9. Samples: 5912311980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:33,396][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 04:36:33,771][15401] Updated weights for policy 0, policy_version 360850 (0.0031) [2024-06-23 04:36:37,524][15401] Updated weights for policy 0, policy_version 360860 (0.0041) [2024-06-23 04:36:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5912363008. Throughput: 0: 42862.2. Samples: 5912442580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:38,396][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 04:36:38,774][15349] Signal inference workers to stop experience collection... (87650 times) [2024-06-23 04:36:38,833][15401] InferenceWorker_p0-w0: stopping experience collection (87650 times) [2024-06-23 04:36:38,840][15349] Signal inference workers to resume experience collection... (87650 times) [2024-06-23 04:36:38,849][15401] InferenceWorker_p0-w0: resuming experience collection (87650 times) [2024-06-23 04:36:41,342][15401] Updated weights for policy 0, policy_version 360870 (0.0030) [2024-06-23 04:36:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.3). Total num frames: 5912576000. Throughput: 0: 43117.5. Samples: 5912706300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 04:36:45,035][15401] Updated weights for policy 0, policy_version 360880 (0.0026) [2024-06-23 04:36:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5912805376. Throughput: 0: 42750.6. Samples: 5912954420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 04:36:48,821][15401] Updated weights for policy 0, policy_version 360890 (0.0027) [2024-06-23 04:36:52,752][15401] Updated weights for policy 0, policy_version 360900 (0.0043) [2024-06-23 04:36:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5913001984. Throughput: 0: 42787.5. Samples: 5913086660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:53,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-23 04:36:56,434][15401] Updated weights for policy 0, policy_version 360910 (0.0026) [2024-06-23 04:36:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5913231360. Throughput: 0: 42945.8. Samples: 5913348280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:36:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 04:37:00,303][15401] Updated weights for policy 0, policy_version 360920 (0.0038) [2024-06-23 04:37:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5913444352. Throughput: 0: 42998.5. Samples: 5913598740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 04:37:04,314][15401] Updated weights for policy 0, policy_version 360930 (0.0030) [2024-06-23 04:37:07,926][15401] Updated weights for policy 0, policy_version 360940 (0.0030) [2024-06-23 04:37:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5913657344. Throughput: 0: 42808.4. Samples: 5913725400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:08,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-23 04:37:12,203][15401] Updated weights for policy 0, policy_version 360950 (0.0034) [2024-06-23 04:37:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5913853952. Throughput: 0: 42936.9. Samples: 5913987520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 04:37:15,645][15401] Updated weights for policy 0, policy_version 360960 (0.0022) [2024-06-23 04:37:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 5914083328. Throughput: 0: 42799.6. Samples: 5914237960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 04:37:19,772][15401] Updated weights for policy 0, policy_version 360970 (0.0033) [2024-06-23 04:37:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5914296320. Throughput: 0: 42717.8. Samples: 5914364880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:23,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-23 04:37:23,392][15401] Updated weights for policy 0, policy_version 360980 (0.0038) [2024-06-23 04:37:27,814][15401] Updated weights for policy 0, policy_version 360990 (0.0036) [2024-06-23 04:37:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5914492928. Throughput: 0: 42570.6. Samples: 5914621980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-23 04:37:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 04:37:31,369][15401] Updated weights for policy 0, policy_version 361000 (0.0029) [2024-06-23 04:37:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 5914722304. Throughput: 0: 42736.8. Samples: 5914877680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:33,393][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 04:37:35,197][15401] Updated weights for policy 0, policy_version 361010 (0.0034) [2024-06-23 04:37:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5914918912. Throughput: 0: 42696.0. Samples: 5915007980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 04:37:38,883][15401] Updated weights for policy 0, policy_version 361020 (0.0037) [2024-06-23 04:37:43,079][15401] Updated weights for policy 0, policy_version 361030 (0.0034) [2024-06-23 04:37:43,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5915115520. Throughput: 0: 42585.4. Samples: 5915264620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 04:37:43,554][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361032_5915148288.pth... [2024-06-23 04:37:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360405_5904875520.pth [2024-06-23 04:37:46,735][15401] Updated weights for policy 0, policy_version 361040 (0.0035) [2024-06-23 04:37:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5915361280. Throughput: 0: 42667.2. Samples: 5915518760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:48,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 04:37:50,557][15401] Updated weights for policy 0, policy_version 361050 (0.0028) [2024-06-23 04:37:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5915557888. Throughput: 0: 42645.8. Samples: 5915644460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 04:37:54,413][15401] Updated weights for policy 0, policy_version 361060 (0.0030) [2024-06-23 04:37:58,055][15401] Updated weights for policy 0, policy_version 361070 (0.0039) [2024-06-23 04:37:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5915787264. Throughput: 0: 42539.9. Samples: 5915901820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:37:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 04:38:02,273][15401] Updated weights for policy 0, policy_version 361080 (0.0029) [2024-06-23 04:38:03,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 5916000256. Throughput: 0: 42636.8. Samples: 5916156720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:03,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 04:38:05,397][15401] Updated weights for policy 0, policy_version 361090 (0.0030) [2024-06-23 04:38:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5916213248. Throughput: 0: 42693.4. Samples: 5916286080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 04:38:09,886][15401] Updated weights for policy 0, policy_version 361100 (0.0025) [2024-06-23 04:38:10,536][15349] Signal inference workers to stop experience collection... (87700 times) [2024-06-23 04:38:10,545][15401] InferenceWorker_p0-w0: stopping experience collection (87700 times) [2024-06-23 04:38:10,599][15349] Signal inference workers to resume experience collection... (87700 times) [2024-06-23 04:38:10,604][15401] InferenceWorker_p0-w0: resuming experience collection (87700 times) [2024-06-23 04:38:12,739][15401] Updated weights for policy 0, policy_version 361110 (0.0030) [2024-06-23 04:38:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 5916426240. Throughput: 0: 42818.7. Samples: 5916548820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 04:38:17,579][15401] Updated weights for policy 0, policy_version 361120 (0.0028) [2024-06-23 04:38:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5916639232. Throughput: 0: 42827.1. Samples: 5916804800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 04:38:20,406][15401] Updated weights for policy 0, policy_version 361130 (0.0033) [2024-06-23 04:38:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5916868608. Throughput: 0: 42801.4. Samples: 5916934040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 04:38:25,334][15401] Updated weights for policy 0, policy_version 361140 (0.0036) [2024-06-23 04:38:28,198][15401] Updated weights for policy 0, policy_version 361150 (0.0040) [2024-06-23 04:38:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5917081600. Throughput: 0: 42809.3. Samples: 5917191040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 04:38:32,898][15401] Updated weights for policy 0, policy_version 361160 (0.0024) [2024-06-23 04:38:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 5917261824. Throughput: 0: 42840.7. Samples: 5917446600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:33,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 04:38:35,748][15401] Updated weights for policy 0, policy_version 361170 (0.0032) [2024-06-23 04:38:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5917507584. Throughput: 0: 42816.5. Samples: 5917571200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 04:38:40,456][15401] Updated weights for policy 0, policy_version 361180 (0.0022) [2024-06-23 04:38:43,235][15401] Updated weights for policy 0, policy_version 361190 (0.0022) [2024-06-23 04:38:43,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 5917736960. Throughput: 0: 42949.4. Samples: 5917834540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 04:38:48,198][15401] Updated weights for policy 0, policy_version 361200 (0.0029) [2024-06-23 04:38:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5917900800. Throughput: 0: 43109.3. Samples: 5918096540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 04:38:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 04:38:50,890][15401] Updated weights for policy 0, policy_version 361210 (0.0035) [2024-06-23 04:38:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 5918162944. Throughput: 0: 42807.4. Samples: 5918212420. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:38:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 04:38:55,773][15401] Updated weights for policy 0, policy_version 361220 (0.0038) [2024-06-23 04:38:58,392][15132] Fps is (10 sec: 47502.9, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 5918375936. Throughput: 0: 42914.7. Samples: 5918480080. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:38:58,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 04:38:58,777][15401] Updated weights for policy 0, policy_version 361230 (0.0022) [2024-06-23 04:39:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 5918539776. Throughput: 0: 43011.7. Samples: 5918740320. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 04:39:03,422][15401] Updated weights for policy 0, policy_version 361240 (0.0030) [2024-06-23 04:39:06,218][15401] Updated weights for policy 0, policy_version 361250 (0.0026) [2024-06-23 04:39:08,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5918801920. Throughput: 0: 42743.1. Samples: 5918857480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 04:39:10,950][15401] Updated weights for policy 0, policy_version 361260 (0.0039) [2024-06-23 04:39:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5918998528. Throughput: 0: 43093.4. Samples: 5919130240. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 04:39:14,038][15401] Updated weights for policy 0, policy_version 361270 (0.0033) [2024-06-23 04:39:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5919195136. Throughput: 0: 43003.1. Samples: 5919381740. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 04:39:18,735][15401] Updated weights for policy 0, policy_version 361280 (0.0035) [2024-06-23 04:39:21,773][15401] Updated weights for policy 0, policy_version 361290 (0.0033) [2024-06-23 04:39:22,498][15349] Signal inference workers to stop experience collection... (87750 times) [2024-06-23 04:39:22,498][15349] Signal inference workers to resume experience collection... (87750 times) [2024-06-23 04:39:22,528][15401] InferenceWorker_p0-w0: stopping experience collection (87750 times) [2024-06-23 04:39:22,528][15401] InferenceWorker_p0-w0: resuming experience collection (87750 times) [2024-06-23 04:39:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5919440896. Throughput: 0: 42937.8. Samples: 5919503400. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 04:39:26,483][15401] Updated weights for policy 0, policy_version 361300 (0.0031) [2024-06-23 04:39:28,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5919653888. Throughput: 0: 43035.1. Samples: 5919771120. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 04:39:29,528][15401] Updated weights for policy 0, policy_version 361310 (0.0040) [2024-06-23 04:39:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5919850496. Throughput: 0: 42801.3. Samples: 5920022600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 04:39:34,137][15401] Updated weights for policy 0, policy_version 361320 (0.0029) [2024-06-23 04:39:37,122][15401] Updated weights for policy 0, policy_version 361330 (0.0039) [2024-06-23 04:39:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5920079872. Throughput: 0: 43042.8. Samples: 5920149340. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:39:41,901][15401] Updated weights for policy 0, policy_version 361340 (0.0028) [2024-06-23 04:39:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5920276480. Throughput: 0: 42953.4. Samples: 5920412880. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 04:39:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361346_5920292864.pth... [2024-06-23 04:39:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000360717_5909987328.pth [2024-06-23 04:39:44,703][15401] Updated weights for policy 0, policy_version 361350 (0.0037) [2024-06-23 04:39:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5920489472. Throughput: 0: 42538.5. Samples: 5920654560. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 04:39:49,552][15401] Updated weights for policy 0, policy_version 361360 (0.0045) [2024-06-23 04:39:52,436][15401] Updated weights for policy 0, policy_version 361370 (0.0029) [2024-06-23 04:39:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 5920735232. Throughput: 0: 42797.3. Samples: 5920783360. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 04:39:57,598][15401] Updated weights for policy 0, policy_version 361380 (0.0042) [2024-06-23 04:39:58,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42049.4, 300 sec: 42875.2). Total num frames: 5920899072. Throughput: 0: 42488.6. Samples: 5921042500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:39:58,396][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 04:40:00,162][15401] Updated weights for policy 0, policy_version 361390 (0.0032) [2024-06-23 04:40:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5921128448. Throughput: 0: 42463.6. Samples: 5921292600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 23.0) [2024-06-23 04:40:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 04:40:05,003][15401] Updated weights for policy 0, policy_version 361400 (0.0033) [2024-06-23 04:40:08,039][15401] Updated weights for policy 0, policy_version 361410 (0.0028) [2024-06-23 04:40:08,389][15132] Fps is (10 sec: 45904.9, 60 sec: 42598.5, 300 sec: 42988.1). Total num frames: 5921357824. Throughput: 0: 42750.3. Samples: 5921427160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 04:40:12,656][15401] Updated weights for policy 0, policy_version 361420 (0.0030) [2024-06-23 04:40:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42765.8). Total num frames: 5921521664. Throughput: 0: 42542.7. Samples: 5921685540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:13,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 04:40:15,551][15401] Updated weights for policy 0, policy_version 361430 (0.0040) [2024-06-23 04:40:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5921783808. Throughput: 0: 42360.0. Samples: 5921928800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:18,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 04:40:20,166][15401] Updated weights for policy 0, policy_version 361440 (0.0045) [2024-06-23 04:40:23,133][15401] Updated weights for policy 0, policy_version 361450 (0.0045) [2024-06-23 04:40:23,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42987.1). Total num frames: 5921996800. Throughput: 0: 42733.7. Samples: 5922072360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 04:40:27,598][15401] Updated weights for policy 0, policy_version 361460 (0.0029) [2024-06-23 04:40:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5922177024. Throughput: 0: 42583.0. Samples: 5922329120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 04:40:30,646][15401] Updated weights for policy 0, policy_version 361470 (0.0030) [2024-06-23 04:40:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5922422784. Throughput: 0: 42791.6. Samples: 5922580180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 04:40:35,122][15401] Updated weights for policy 0, policy_version 361480 (0.0031) [2024-06-23 04:40:37,704][15349] Signal inference workers to stop experience collection... (87800 times) [2024-06-23 04:40:37,704][15349] Signal inference workers to resume experience collection... (87800 times) [2024-06-23 04:40:37,733][15401] InferenceWorker_p0-w0: stopping experience collection (87800 times) [2024-06-23 04:40:37,734][15401] InferenceWorker_p0-w0: resuming experience collection (87800 times) [2024-06-23 04:40:38,366][15401] Updated weights for policy 0, policy_version 361490 (0.0040) [2024-06-23 04:40:38,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5922652160. Throughput: 0: 43029.1. Samples: 5922719660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 04:40:42,739][15401] Updated weights for policy 0, policy_version 361500 (0.0034) [2024-06-23 04:40:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5922816000. Throughput: 0: 42773.6. Samples: 5922967040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 04:40:46,154][15401] Updated weights for policy 0, policy_version 361510 (0.0033) [2024-06-23 04:40:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 5923078144. Throughput: 0: 42887.1. Samples: 5923222620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:48,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 04:40:50,150][15401] Updated weights for policy 0, policy_version 361520 (0.0026) [2024-06-23 04:40:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5923274752. Throughput: 0: 42923.9. Samples: 5923358740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 04:40:54,147][15401] Updated weights for policy 0, policy_version 361530 (0.0028) [2024-06-23 04:40:57,696][15401] Updated weights for policy 0, policy_version 361540 (0.0043) [2024-06-23 04:40:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 5923487744. Throughput: 0: 42823.1. Samples: 5923612580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:40:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 04:41:01,529][15401] Updated weights for policy 0, policy_version 361550 (0.0023) [2024-06-23 04:41:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5923733504. Throughput: 0: 43137.3. Samples: 5923869980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:41:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 04:41:05,264][15401] Updated weights for policy 0, policy_version 361560 (0.0032) [2024-06-23 04:41:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5923897344. Throughput: 0: 42907.7. Samples: 5924003200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:41:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 04:41:09,089][15401] Updated weights for policy 0, policy_version 361570 (0.0035) [2024-06-23 04:41:12,805][15401] Updated weights for policy 0, policy_version 361580 (0.0035) [2024-06-23 04:41:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.5, 300 sec: 42765.9). Total num frames: 5924126720. Throughput: 0: 42800.5. Samples: 5924255140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:41:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 04:41:17,150][15401] Updated weights for policy 0, policy_version 361590 (0.0037) [2024-06-23 04:41:18,389][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5924372480. Throughput: 0: 42862.7. Samples: 5924509000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:41:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 04:41:20,718][15401] Updated weights for policy 0, policy_version 361600 (0.0032) [2024-06-23 04:41:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5924552704. Throughput: 0: 42721.7. Samples: 5924642140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 19.0) [2024-06-23 04:41:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 04:41:24,860][15401] Updated weights for policy 0, policy_version 361610 (0.0033) [2024-06-23 04:41:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5924765696. Throughput: 0: 42694.2. Samples: 5924888280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 04:41:28,417][15401] Updated weights for policy 0, policy_version 361620 (0.0053) [2024-06-23 04:41:32,657][15401] Updated weights for policy 0, policy_version 361630 (0.0029) [2024-06-23 04:41:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5924995072. Throughput: 0: 42793.4. Samples: 5925148220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 04:41:36,326][15401] Updated weights for policy 0, policy_version 361640 (0.0030) [2024-06-23 04:41:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 5925175296. Throughput: 0: 42661.4. Samples: 5925278500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 04:41:40,114][15401] Updated weights for policy 0, policy_version 361650 (0.0034) [2024-06-23 04:41:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5925421056. Throughput: 0: 42683.8. Samples: 5925533360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 04:41:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361659_5925421056.pth... [2024-06-23 04:41:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361032_5915148288.pth [2024-06-23 04:41:43,699][15401] Updated weights for policy 0, policy_version 361660 (0.0033) [2024-06-23 04:41:47,652][15401] Updated weights for policy 0, policy_version 361670 (0.0028) [2024-06-23 04:41:47,687][15349] Signal inference workers to stop experience collection... (87850 times) [2024-06-23 04:41:47,687][15349] Signal inference workers to resume experience collection... (87850 times) [2024-06-23 04:41:47,706][15401] InferenceWorker_p0-w0: stopping experience collection (87850 times) [2024-06-23 04:41:47,706][15401] InferenceWorker_p0-w0: resuming experience collection (87850 times) [2024-06-23 04:41:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 5925617664. Throughput: 0: 42740.0. Samples: 5925793280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 04:41:51,234][15401] Updated weights for policy 0, policy_version 361680 (0.0027) [2024-06-23 04:41:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5925830656. Throughput: 0: 42678.9. Samples: 5925923860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:53,392][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 04:41:55,180][15401] Updated weights for policy 0, policy_version 361690 (0.0039) [2024-06-23 04:41:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5926060032. Throughput: 0: 42749.4. Samples: 5926178860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:41:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 04:41:58,733][15401] Updated weights for policy 0, policy_version 361700 (0.0041) [2024-06-23 04:42:02,919][15401] Updated weights for policy 0, policy_version 361710 (0.0042) [2024-06-23 04:42:03,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5926289408. Throughput: 0: 42807.0. Samples: 5926435320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 04:42:06,295][15401] Updated weights for policy 0, policy_version 361720 (0.0030) [2024-06-23 04:42:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5926469632. Throughput: 0: 42630.2. Samples: 5926560500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 04:42:10,598][15401] Updated weights for policy 0, policy_version 361730 (0.0023) [2024-06-23 04:42:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5926715392. Throughput: 0: 42779.1. Samples: 5926813340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 04:42:14,250][15401] Updated weights for policy 0, policy_version 361740 (0.0032) [2024-06-23 04:42:18,315][15401] Updated weights for policy 0, policy_version 361750 (0.0037) [2024-06-23 04:42:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5926912000. Throughput: 0: 42792.9. Samples: 5927073900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:18,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-23 04:42:21,691][15401] Updated weights for policy 0, policy_version 361760 (0.0035) [2024-06-23 04:42:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5927124992. Throughput: 0: 42575.9. Samples: 5927194420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:23,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 04:42:26,116][15401] Updated weights for policy 0, policy_version 361770 (0.0047) [2024-06-23 04:42:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 5927354368. Throughput: 0: 42758.4. Samples: 5927457480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:28,390][15132] Avg episode reward: [(0, '0.143')] [2024-06-23 04:42:29,133][15401] Updated weights for policy 0, policy_version 361780 (0.0041) [2024-06-23 04:42:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5927534592. Throughput: 0: 42814.4. Samples: 5927719920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 04:42:33,640][15401] Updated weights for policy 0, policy_version 361790 (0.0041) [2024-06-23 04:42:37,066][15401] Updated weights for policy 0, policy_version 361800 (0.0028) [2024-06-23 04:42:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5927763968. Throughput: 0: 42578.7. Samples: 5927839800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 04:42:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 04:42:41,205][15401] Updated weights for policy 0, policy_version 361810 (0.0039) [2024-06-23 04:42:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5927993344. Throughput: 0: 42680.8. Samples: 5928099500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:42:43,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 04:42:44,772][15401] Updated weights for policy 0, policy_version 361820 (0.0039) [2024-06-23 04:42:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5928189952. Throughput: 0: 42659.2. Samples: 5928354980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:42:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 04:42:48,821][15401] Updated weights for policy 0, policy_version 361830 (0.0029) [2024-06-23 04:42:52,855][15401] Updated weights for policy 0, policy_version 361840 (0.0041) [2024-06-23 04:42:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 5928402944. Throughput: 0: 42760.3. Samples: 5928484720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:42:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 04:42:56,396][15401] Updated weights for policy 0, policy_version 361850 (0.0041) [2024-06-23 04:42:58,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42820.0). Total num frames: 5928632320. Throughput: 0: 42812.1. Samples: 5928740160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:42:58,397][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 04:43:00,394][15401] Updated weights for policy 0, policy_version 361860 (0.0038) [2024-06-23 04:43:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5928828928. Throughput: 0: 42798.2. Samples: 5928999820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:03,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 04:43:04,009][15401] Updated weights for policy 0, policy_version 361870 (0.0048) [2024-06-23 04:43:06,690][15349] Signal inference workers to stop experience collection... (87900 times) [2024-06-23 04:43:06,737][15401] InferenceWorker_p0-w0: stopping experience collection (87900 times) [2024-06-23 04:43:06,750][15349] Signal inference workers to resume experience collection... (87900 times) [2024-06-23 04:43:06,751][15401] InferenceWorker_p0-w0: resuming experience collection (87900 times) [2024-06-23 04:43:08,322][15401] Updated weights for policy 0, policy_version 361880 (0.0027) [2024-06-23 04:43:08,392][15132] Fps is (10 sec: 40976.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 5929041920. Throughput: 0: 42875.4. Samples: 5929123920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:08,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 04:43:11,842][15401] Updated weights for policy 0, policy_version 361890 (0.0031) [2024-06-23 04:43:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5929254912. Throughput: 0: 42631.9. Samples: 5929375920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:13,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 04:43:15,835][15401] Updated weights for policy 0, policy_version 361900 (0.0030) [2024-06-23 04:43:18,390][15132] Fps is (10 sec: 42607.2, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 5929467904. Throughput: 0: 42589.8. Samples: 5929636480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 04:43:19,323][15401] Updated weights for policy 0, policy_version 361910 (0.0049) [2024-06-23 04:43:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5929680896. Throughput: 0: 42820.5. Samples: 5929766720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 04:43:23,502][15401] Updated weights for policy 0, policy_version 361920 (0.0037) [2024-06-23 04:43:27,316][15401] Updated weights for policy 0, policy_version 361930 (0.0025) [2024-06-23 04:43:28,390][15132] Fps is (10 sec: 42599.7, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 5929893888. Throughput: 0: 42632.4. Samples: 5930017960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 04:43:31,427][15401] Updated weights for policy 0, policy_version 361940 (0.0038) [2024-06-23 04:43:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5930106880. Throughput: 0: 42630.1. Samples: 5930273340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 04:43:34,937][15401] Updated weights for policy 0, policy_version 361950 (0.0041) [2024-06-23 04:43:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5930303488. Throughput: 0: 42607.3. Samples: 5930402040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 04:43:39,090][15401] Updated weights for policy 0, policy_version 361960 (0.0046) [2024-06-23 04:43:42,743][15401] Updated weights for policy 0, policy_version 361970 (0.0030) [2024-06-23 04:43:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5930532864. Throughput: 0: 42483.5. Samples: 5930651640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 04:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361971_5930532864.pth... [2024-06-23 04:43:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361346_5920292864.pth [2024-06-23 04:43:46,851][15401] Updated weights for policy 0, policy_version 361980 (0.0042) [2024-06-23 04:43:48,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 5930762240. Throughput: 0: 42348.4. Samples: 5930905600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:48,393][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 04:43:50,400][15401] Updated weights for policy 0, policy_version 361990 (0.0030) [2024-06-23 04:43:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5930958848. Throughput: 0: 42525.9. Samples: 5931037480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 04:43:54,250][15401] Updated weights for policy 0, policy_version 362000 (0.0035) [2024-06-23 04:43:58,290][15401] Updated weights for policy 0, policy_version 362010 (0.0037) [2024-06-23 04:43:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 5931171840. Throughput: 0: 42638.3. Samples: 5931294640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-23 04:43:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 04:44:02,058][15401] Updated weights for policy 0, policy_version 362020 (0.0033) [2024-06-23 04:44:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5931401216. Throughput: 0: 42530.5. Samples: 5931550340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 04:44:05,905][15401] Updated weights for policy 0, policy_version 362030 (0.0038) [2024-06-23 04:44:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 5931581440. Throughput: 0: 42529.3. Samples: 5931680540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 04:44:09,673][15401] Updated weights for policy 0, policy_version 362040 (0.0025) [2024-06-23 04:44:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5931810816. Throughput: 0: 42587.7. Samples: 5931934400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 04:44:13,425][15401] Updated weights for policy 0, policy_version 362050 (0.0047) [2024-06-23 04:44:17,494][15401] Updated weights for policy 0, policy_version 362060 (0.0034) [2024-06-23 04:44:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 5932040192. Throughput: 0: 42547.1. Samples: 5932187960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 04:44:21,410][15401] Updated weights for policy 0, policy_version 362070 (0.0036) [2024-06-23 04:44:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5932236800. Throughput: 0: 42615.6. Samples: 5932319740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 04:44:24,994][15401] Updated weights for policy 0, policy_version 362080 (0.0036) [2024-06-23 04:44:27,487][15349] Signal inference workers to stop experience collection... (87950 times) [2024-06-23 04:44:27,489][15349] Signal inference workers to resume experience collection... (87950 times) [2024-06-23 04:44:27,528][15401] InferenceWorker_p0-w0: stopping experience collection (87950 times) [2024-06-23 04:44:27,528][15401] InferenceWorker_p0-w0: resuming experience collection (87950 times) [2024-06-23 04:44:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5932466176. Throughput: 0: 42798.2. Samples: 5932577560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 04:44:28,926][15401] Updated weights for policy 0, policy_version 362090 (0.0040) [2024-06-23 04:44:32,584][15401] Updated weights for policy 0, policy_version 362100 (0.0040) [2024-06-23 04:44:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5932695552. Throughput: 0: 42858.3. Samples: 5932834120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 04:44:36,431][15401] Updated weights for policy 0, policy_version 362110 (0.0029) [2024-06-23 04:44:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5932892160. Throughput: 0: 42831.5. Samples: 5932964900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:38,390][15132] Avg episode reward: [(0, '0.104')] [2024-06-23 04:44:40,424][15401] Updated weights for policy 0, policy_version 362120 (0.0038) [2024-06-23 04:44:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5933105152. Throughput: 0: 42791.5. Samples: 5933220260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 04:44:43,921][15401] Updated weights for policy 0, policy_version 362130 (0.0040) [2024-06-23 04:44:48,071][15401] Updated weights for policy 0, policy_version 362140 (0.0032) [2024-06-23 04:44:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 5933318144. Throughput: 0: 42837.4. Samples: 5933478020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:48,390][15132] Avg episode reward: [(0, '0.097')] [2024-06-23 04:44:51,492][15401] Updated weights for policy 0, policy_version 362150 (0.0029) [2024-06-23 04:44:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 5933514752. Throughput: 0: 42824.0. Samples: 5933607620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 04:44:55,834][15401] Updated weights for policy 0, policy_version 362160 (0.0030) [2024-06-23 04:44:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5933727744. Throughput: 0: 42746.2. Samples: 5933857980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:44:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 04:44:59,231][15401] Updated weights for policy 0, policy_version 362170 (0.0038) [2024-06-23 04:45:03,351][15401] Updated weights for policy 0, policy_version 362180 (0.0031) [2024-06-23 04:45:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 5933957120. Throughput: 0: 42940.4. Samples: 5934120280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:45:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 04:45:07,264][15401] Updated weights for policy 0, policy_version 362190 (0.0033) [2024-06-23 04:45:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5934170112. Throughput: 0: 42803.4. Samples: 5934245900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:45:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 04:45:11,147][15401] Updated weights for policy 0, policy_version 362200 (0.0031) [2024-06-23 04:45:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5934383104. Throughput: 0: 42642.1. Samples: 5934496460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 04:45:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 04:45:14,892][15401] Updated weights for policy 0, policy_version 362210 (0.0035) [2024-06-23 04:45:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5934579712. Throughput: 0: 42714.6. Samples: 5934756280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 04:45:18,950][15401] Updated weights for policy 0, policy_version 362220 (0.0033) [2024-06-23 04:45:22,653][15401] Updated weights for policy 0, policy_version 362230 (0.0038) [2024-06-23 04:45:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5934792704. Throughput: 0: 42639.1. Samples: 5934883660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 04:45:26,558][15401] Updated weights for policy 0, policy_version 362240 (0.0032) [2024-06-23 04:45:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5935022080. Throughput: 0: 42587.1. Samples: 5935136680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 04:45:30,432][15401] Updated weights for policy 0, policy_version 362250 (0.0024) [2024-06-23 04:45:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5935218688. Throughput: 0: 42658.6. Samples: 5935397660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 04:45:34,062][15401] Updated weights for policy 0, policy_version 362260 (0.0041) [2024-06-23 04:45:38,030][15401] Updated weights for policy 0, policy_version 362270 (0.0026) [2024-06-23 04:45:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 5935431680. Throughput: 0: 42470.8. Samples: 5935518800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:38,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 04:45:41,807][15401] Updated weights for policy 0, policy_version 362280 (0.0031) [2024-06-23 04:45:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 5935644672. Throughput: 0: 42521.7. Samples: 5935771460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:43,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 04:45:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362285_5935677440.pth... [2024-06-23 04:45:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361659_5925421056.pth [2024-06-23 04:45:45,673][15401] Updated weights for policy 0, policy_version 362290 (0.0038) [2024-06-23 04:45:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5935841280. Throughput: 0: 42405.1. Samples: 5936028500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 04:45:48,791][15349] Signal inference workers to stop experience collection... (88000 times) [2024-06-23 04:45:48,791][15349] Signal inference workers to resume experience collection... (88000 times) [2024-06-23 04:45:48,810][15401] InferenceWorker_p0-w0: stopping experience collection (88000 times) [2024-06-23 04:45:48,810][15401] InferenceWorker_p0-w0: resuming experience collection (88000 times) [2024-06-23 04:45:49,741][15401] Updated weights for policy 0, policy_version 362300 (0.0034) [2024-06-23 04:45:53,269][15401] Updated weights for policy 0, policy_version 362310 (0.0034) [2024-06-23 04:45:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5936087040. Throughput: 0: 42308.1. Samples: 5936149760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 04:45:57,324][15401] Updated weights for policy 0, policy_version 362320 (0.0040) [2024-06-23 04:45:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5936283648. Throughput: 0: 42511.6. Samples: 5936409480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:45:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 04:46:00,955][15401] Updated weights for policy 0, policy_version 362330 (0.0036) [2024-06-23 04:46:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.4, 300 sec: 42709.4). Total num frames: 5936496640. Throughput: 0: 42451.5. Samples: 5936666600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 04:46:05,097][15401] Updated weights for policy 0, policy_version 362340 (0.0038) [2024-06-23 04:46:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5936709632. Throughput: 0: 42374.9. Samples: 5936790520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:08,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-23 04:46:08,725][15401] Updated weights for policy 0, policy_version 362350 (0.0047) [2024-06-23 04:46:12,538][15401] Updated weights for policy 0, policy_version 362360 (0.0037) [2024-06-23 04:46:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5936922624. Throughput: 0: 42399.6. Samples: 5937044660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:13,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 04:46:16,231][15401] Updated weights for policy 0, policy_version 362370 (0.0035) [2024-06-23 04:46:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5937119232. Throughput: 0: 42264.4. Samples: 5937299560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 04:46:20,456][15401] Updated weights for policy 0, policy_version 362380 (0.0032) [2024-06-23 04:46:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5937348608. Throughput: 0: 42390.2. Samples: 5937426360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 04:46:23,888][15401] Updated weights for policy 0, policy_version 362390 (0.0036) [2024-06-23 04:46:28,100][15401] Updated weights for policy 0, policy_version 362400 (0.0040) [2024-06-23 04:46:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5937561600. Throughput: 0: 42610.6. Samples: 5937688940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 04:46:31,430][15401] Updated weights for policy 0, policy_version 362410 (0.0038) [2024-06-23 04:46:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 5937774592. Throughput: 0: 42491.0. Samples: 5937940700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 04:46:33,393][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 04:46:35,631][15401] Updated weights for policy 0, policy_version 362420 (0.0039) [2024-06-23 04:46:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5937987584. Throughput: 0: 42524.4. Samples: 5938063360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:46:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 04:46:39,099][15401] Updated weights for policy 0, policy_version 362430 (0.0033) [2024-06-23 04:46:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5938200576. Throughput: 0: 42582.7. Samples: 5938325700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:46:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 04:46:43,444][15401] Updated weights for policy 0, policy_version 362440 (0.0029) [2024-06-23 04:46:46,893][15401] Updated weights for policy 0, policy_version 362450 (0.0038) [2024-06-23 04:46:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 5938413568. Throughput: 0: 42508.6. Samples: 5938579480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:46:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 04:46:51,114][15401] Updated weights for policy 0, policy_version 362460 (0.0038) [2024-06-23 04:46:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5938626560. Throughput: 0: 42662.7. Samples: 5938710340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:46:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 04:46:54,503][15401] Updated weights for policy 0, policy_version 362470 (0.0030) [2024-06-23 04:46:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5938839552. Throughput: 0: 42750.2. Samples: 5938968420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:46:58,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 04:46:58,979][15401] Updated weights for policy 0, policy_version 362480 (0.0025) [2024-06-23 04:47:02,099][15401] Updated weights for policy 0, policy_version 362490 (0.0025) [2024-06-23 04:47:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5939068928. Throughput: 0: 42759.6. Samples: 5939223740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 04:47:06,908][15401] Updated weights for policy 0, policy_version 362500 (0.0036) [2024-06-23 04:47:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5939281920. Throughput: 0: 42901.1. Samples: 5939356920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 04:47:08,849][15349] Signal inference workers to stop experience collection... (88050 times) [2024-06-23 04:47:08,849][15349] Signal inference workers to resume experience collection... (88050 times) [2024-06-23 04:47:08,869][15401] InferenceWorker_p0-w0: stopping experience collection (88050 times) [2024-06-23 04:47:08,869][15401] InferenceWorker_p0-w0: resuming experience collection (88050 times) [2024-06-23 04:47:09,692][15401] Updated weights for policy 0, policy_version 362510 (0.0041) [2024-06-23 04:47:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5939478528. Throughput: 0: 42750.6. Samples: 5939612720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:13,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-23 04:47:14,434][15401] Updated weights for policy 0, policy_version 362520 (0.0040) [2024-06-23 04:47:17,277][15401] Updated weights for policy 0, policy_version 362530 (0.0038) [2024-06-23 04:47:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5939707904. Throughput: 0: 42798.7. Samples: 5939866540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 04:47:22,207][15401] Updated weights for policy 0, policy_version 362540 (0.0041) [2024-06-23 04:47:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5939920896. Throughput: 0: 42983.6. Samples: 5939997620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 04:47:24,704][15401] Updated weights for policy 0, policy_version 362550 (0.0033) [2024-06-23 04:47:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5940117504. Throughput: 0: 42911.9. Samples: 5940256740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 04:47:29,736][15401] Updated weights for policy 0, policy_version 362560 (0.0037) [2024-06-23 04:47:32,565][15401] Updated weights for policy 0, policy_version 362570 (0.0041) [2024-06-23 04:47:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 5940346880. Throughput: 0: 42930.0. Samples: 5940511340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:33,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-23 04:47:37,294][15401] Updated weights for policy 0, policy_version 362580 (0.0030) [2024-06-23 04:47:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5940559872. Throughput: 0: 43077.7. Samples: 5940648840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:38,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 04:47:39,976][15401] Updated weights for policy 0, policy_version 362590 (0.0033) [2024-06-23 04:47:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5940756480. Throughput: 0: 42956.0. Samples: 5940901440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 04:47:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362596_5940772864.pth... [2024-06-23 04:47:43,561][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000361971_5930532864.pth [2024-06-23 04:47:44,769][15401] Updated weights for policy 0, policy_version 362600 (0.0046) [2024-06-23 04:47:47,539][15401] Updated weights for policy 0, policy_version 362610 (0.0042) [2024-06-23 04:47:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5941018624. Throughput: 0: 42885.8. Samples: 5941153600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 04:47:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 04:47:52,503][15401] Updated weights for policy 0, policy_version 362620 (0.0025) [2024-06-23 04:47:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 5941198848. Throughput: 0: 42965.5. Samples: 5941290360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:47:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 04:47:55,504][15401] Updated weights for policy 0, policy_version 362630 (0.0028) [2024-06-23 04:47:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5941411840. Throughput: 0: 42829.5. Samples: 5941540040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:47:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 04:48:00,064][15401] Updated weights for policy 0, policy_version 362640 (0.0034) [2024-06-23 04:48:02,980][15401] Updated weights for policy 0, policy_version 362650 (0.0045) [2024-06-23 04:48:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 5941657600. Throughput: 0: 42762.2. Samples: 5941790840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 04:48:07,639][15401] Updated weights for policy 0, policy_version 362660 (0.0025) [2024-06-23 04:48:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5941837824. Throughput: 0: 42933.6. Samples: 5941929640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:08,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 04:48:10,930][15401] Updated weights for policy 0, policy_version 362670 (0.0023) [2024-06-23 04:48:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5942067200. Throughput: 0: 42816.0. Samples: 5942183460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 04:48:14,459][15349] Signal inference workers to stop experience collection... (88100 times) [2024-06-23 04:48:14,459][15349] Signal inference workers to resume experience collection... (88100 times) [2024-06-23 04:48:14,472][15401] InferenceWorker_p0-w0: stopping experience collection (88100 times) [2024-06-23 04:48:14,472][15401] InferenceWorker_p0-w0: resuming experience collection (88100 times) [2024-06-23 04:48:15,115][15401] Updated weights for policy 0, policy_version 362680 (0.0024) [2024-06-23 04:48:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5942296576. Throughput: 0: 42883.1. Samples: 5942441080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:18,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 04:48:18,529][15401] Updated weights for policy 0, policy_version 362690 (0.0028) [2024-06-23 04:48:23,157][15401] Updated weights for policy 0, policy_version 362700 (0.0039) [2024-06-23 04:48:23,394][15132] Fps is (10 sec: 40939.5, 60 sec: 42594.8, 300 sec: 42653.2). Total num frames: 5942476800. Throughput: 0: 42793.9. Samples: 5942574780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:23,395][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 04:48:26,337][15401] Updated weights for policy 0, policy_version 362710 (0.0033) [2024-06-23 04:48:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5942722560. Throughput: 0: 42808.8. Samples: 5942827840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 04:48:30,918][15401] Updated weights for policy 0, policy_version 362720 (0.0033) [2024-06-23 04:48:33,389][15132] Fps is (10 sec: 45898.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 5942935552. Throughput: 0: 43054.8. Samples: 5943091060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 04:48:33,927][15401] Updated weights for policy 0, policy_version 362730 (0.0030) [2024-06-23 04:48:38,364][15401] Updated weights for policy 0, policy_version 362740 (0.0028) [2024-06-23 04:48:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5943132160. Throughput: 0: 42821.7. Samples: 5943217340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 04:48:41,490][15401] Updated weights for policy 0, policy_version 362750 (0.0026) [2024-06-23 04:48:43,390][15132] Fps is (10 sec: 42596.9, 60 sec: 43417.5, 300 sec: 42709.8). Total num frames: 5943361536. Throughput: 0: 42929.1. Samples: 5943471860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 04:48:46,170][15401] Updated weights for policy 0, policy_version 362760 (0.0035) [2024-06-23 04:48:48,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 5943558144. Throughput: 0: 43010.4. Samples: 5943726580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:48,397][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 04:48:49,125][15401] Updated weights for policy 0, policy_version 362770 (0.0037) [2024-06-23 04:48:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5943754752. Throughput: 0: 42790.6. Samples: 5943855220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 04:48:53,830][15401] Updated weights for policy 0, policy_version 362780 (0.0028) [2024-06-23 04:48:56,667][15401] Updated weights for policy 0, policy_version 362790 (0.0031) [2024-06-23 04:48:58,396][15132] Fps is (10 sec: 44236.5, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 5944000512. Throughput: 0: 42818.7. Samples: 5944110580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:48:58,397][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 04:49:01,546][15401] Updated weights for policy 0, policy_version 362800 (0.0043) [2024-06-23 04:49:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5944197120. Throughput: 0: 42852.5. Samples: 5944369440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:49:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 04:49:04,468][15401] Updated weights for policy 0, policy_version 362810 (0.0022) [2024-06-23 04:49:08,390][15132] Fps is (10 sec: 39346.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5944393728. Throughput: 0: 42649.9. Samples: 5944493820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 04:49:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 04:49:09,434][15401] Updated weights for policy 0, policy_version 362820 (0.0029) [2024-06-23 04:49:12,552][15401] Updated weights for policy 0, policy_version 362830 (0.0048) [2024-06-23 04:49:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5944639488. Throughput: 0: 42698.4. Samples: 5944749260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 04:49:16,947][15401] Updated weights for policy 0, policy_version 362840 (0.0031) [2024-06-23 04:49:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5944852480. Throughput: 0: 42597.2. Samples: 5945007940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 04:49:20,050][15401] Updated weights for policy 0, policy_version 362850 (0.0026) [2024-06-23 04:49:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42874.9, 300 sec: 42653.9). Total num frames: 5945049088. Throughput: 0: 42582.1. Samples: 5945133540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 04:49:24,657][15401] Updated weights for policy 0, policy_version 362860 (0.0052) [2024-06-23 04:49:27,831][15401] Updated weights for policy 0, policy_version 362870 (0.0042) [2024-06-23 04:49:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5945262080. Throughput: 0: 42540.8. Samples: 5945386180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 04:49:32,128][15401] Updated weights for policy 0, policy_version 362880 (0.0042) [2024-06-23 04:49:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 5945475072. Throughput: 0: 42694.4. Samples: 5945647560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 04:49:35,375][15401] Updated weights for policy 0, policy_version 362890 (0.0042) [2024-06-23 04:49:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5945688064. Throughput: 0: 42602.3. Samples: 5945772320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 04:49:39,783][15401] Updated weights for policy 0, policy_version 362900 (0.0038) [2024-06-23 04:49:42,186][15349] Signal inference workers to stop experience collection... (88150 times) [2024-06-23 04:49:42,186][15349] Signal inference workers to resume experience collection... (88150 times) [2024-06-23 04:49:42,241][15401] InferenceWorker_p0-w0: stopping experience collection (88150 times) [2024-06-23 04:49:42,242][15401] InferenceWorker_p0-w0: resuming experience collection (88150 times) [2024-06-23 04:49:43,105][15401] Updated weights for policy 0, policy_version 362910 (0.0027) [2024-06-23 04:49:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5945917440. Throughput: 0: 42702.9. Samples: 5946031940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:43,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 04:49:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362911_5945933824.pth... [2024-06-23 04:49:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362285_5935677440.pth [2024-06-23 04:49:47,414][15401] Updated weights for policy 0, policy_version 362920 (0.0038) [2024-06-23 04:49:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42602.8, 300 sec: 42709.5). Total num frames: 5946114048. Throughput: 0: 42627.5. Samples: 5946287680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 04:49:50,811][15401] Updated weights for policy 0, policy_version 362930 (0.0031) [2024-06-23 04:49:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5946343424. Throughput: 0: 42604.0. Samples: 5946411000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 04:49:54,880][15401] Updated weights for policy 0, policy_version 362940 (0.0031) [2024-06-23 04:49:58,221][15401] Updated weights for policy 0, policy_version 362950 (0.0022) [2024-06-23 04:49:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 5946572800. Throughput: 0: 42837.2. Samples: 5946676940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:49:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 04:50:02,762][15401] Updated weights for policy 0, policy_version 362960 (0.0037) [2024-06-23 04:50:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5946753024. Throughput: 0: 42796.8. Samples: 5946933800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:03,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 04:50:05,743][15401] Updated weights for policy 0, policy_version 362970 (0.0041) [2024-06-23 04:50:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5946998784. Throughput: 0: 42769.8. Samples: 5947058180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:08,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 04:50:10,172][15401] Updated weights for policy 0, policy_version 362980 (0.0044) [2024-06-23 04:50:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 5947195392. Throughput: 0: 43004.6. Samples: 5947321500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:13,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 04:50:13,682][15401] Updated weights for policy 0, policy_version 362990 (0.0046) [2024-06-23 04:50:17,884][15401] Updated weights for policy 0, policy_version 363000 (0.0032) [2024-06-23 04:50:18,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 5947408384. Throughput: 0: 42639.0. Samples: 5947566580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:18,397][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 04:50:21,448][15401] Updated weights for policy 0, policy_version 363010 (0.0038) [2024-06-23 04:50:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5947637760. Throughput: 0: 42707.7. Samples: 5947694160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 04:50:25,799][15401] Updated weights for policy 0, policy_version 363020 (0.0031) [2024-06-23 04:50:28,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 5947817984. Throughput: 0: 42707.1. Samples: 5947953760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 04:50:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 04:50:29,325][15401] Updated weights for policy 0, policy_version 363030 (0.0041) [2024-06-23 04:50:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5948030976. Throughput: 0: 42674.8. Samples: 5948208040. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 04:50:33,446][15401] Updated weights for policy 0, policy_version 363040 (0.0036) [2024-06-23 04:50:36,810][15401] Updated weights for policy 0, policy_version 363050 (0.0037) [2024-06-23 04:50:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5948276736. Throughput: 0: 42732.9. Samples: 5948333980. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:38,394][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 04:50:40,854][15401] Updated weights for policy 0, policy_version 363060 (0.0028) [2024-06-23 04:50:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5948456960. Throughput: 0: 42582.6. Samples: 5948593160. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 04:50:44,592][15401] Updated weights for policy 0, policy_version 363070 (0.0043) [2024-06-23 04:50:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5948686336. Throughput: 0: 42604.1. Samples: 5948850980. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 04:50:48,666][15401] Updated weights for policy 0, policy_version 363080 (0.0027) [2024-06-23 04:50:52,339][15401] Updated weights for policy 0, policy_version 363090 (0.0038) [2024-06-23 04:50:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5948915712. Throughput: 0: 42688.1. Samples: 5948979140. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:53,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 04:50:56,530][15401] Updated weights for policy 0, policy_version 363100 (0.0033) [2024-06-23 04:50:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5949095936. Throughput: 0: 42477.1. Samples: 5949232860. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:50:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 04:51:00,074][15401] Updated weights for policy 0, policy_version 363110 (0.0033) [2024-06-23 04:51:00,292][15349] Signal inference workers to stop experience collection... (88200 times) [2024-06-23 04:51:00,294][15349] Signal inference workers to resume experience collection... (88200 times) [2024-06-23 04:51:00,316][15401] InferenceWorker_p0-w0: stopping experience collection (88200 times) [2024-06-23 04:51:00,316][15401] InferenceWorker_p0-w0: resuming experience collection (88200 times) [2024-06-23 04:51:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5949308928. Throughput: 0: 42695.8. Samples: 5949487620. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 04:51:03,998][15401] Updated weights for policy 0, policy_version 363120 (0.0038) [2024-06-23 04:51:07,818][15401] Updated weights for policy 0, policy_version 363130 (0.0041) [2024-06-23 04:51:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5949554688. Throughput: 0: 42694.6. Samples: 5949615420. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 04:51:11,886][15401] Updated weights for policy 0, policy_version 363140 (0.0036) [2024-06-23 04:51:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42325.4, 300 sec: 42764.7). Total num frames: 5949734912. Throughput: 0: 42564.5. Samples: 5949869260. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:13,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 04:51:15,377][15401] Updated weights for policy 0, policy_version 363150 (0.0035) [2024-06-23 04:51:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 5949947904. Throughput: 0: 42661.8. Samples: 5950127820. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 04:51:19,350][15401] Updated weights for policy 0, policy_version 363160 (0.0028) [2024-06-23 04:51:22,965][15401] Updated weights for policy 0, policy_version 363170 (0.0036) [2024-06-23 04:51:23,394][15132] Fps is (10 sec: 47505.9, 60 sec: 42868.6, 300 sec: 42875.5). Total num frames: 5950210048. Throughput: 0: 42876.3. Samples: 5950263580. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:23,394][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 04:51:26,856][15401] Updated weights for policy 0, policy_version 363180 (0.0030) [2024-06-23 04:51:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 5950390272. Throughput: 0: 42714.9. Samples: 5950515320. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 04:51:30,454][15401] Updated weights for policy 0, policy_version 363190 (0.0039) [2024-06-23 04:51:33,390][15132] Fps is (10 sec: 39337.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5950603264. Throughput: 0: 42820.4. Samples: 5950777900. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 04:51:34,508][15401] Updated weights for policy 0, policy_version 363200 (0.0028) [2024-06-23 04:51:38,086][15401] Updated weights for policy 0, policy_version 363210 (0.0039) [2024-06-23 04:51:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5950849024. Throughput: 0: 42858.7. Samples: 5950907780. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 04:51:41,912][15401] Updated weights for policy 0, policy_version 363220 (0.0032) [2024-06-23 04:51:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5951029248. Throughput: 0: 42851.4. Samples: 5951161180. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-23 04:51:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 04:51:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363222_5951029248.pth... [2024-06-23 04:51:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362596_5940772864.pth [2024-06-23 04:51:45,629][15401] Updated weights for policy 0, policy_version 363230 (0.0044) [2024-06-23 04:51:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5951242240. Throughput: 0: 42858.8. Samples: 5951416260. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:51:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 04:51:49,459][15401] Updated weights for policy 0, policy_version 363240 (0.0039) [2024-06-23 04:51:53,226][15401] Updated weights for policy 0, policy_version 363250 (0.0023) [2024-06-23 04:51:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5951488000. Throughput: 0: 43004.1. Samples: 5951550600. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:51:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 04:51:57,491][15401] Updated weights for policy 0, policy_version 363260 (0.0033) [2024-06-23 04:51:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5951668224. Throughput: 0: 42935.9. Samples: 5951801280. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:51:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 04:52:00,889][15401] Updated weights for policy 0, policy_version 363270 (0.0047) [2024-06-23 04:52:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5951897600. Throughput: 0: 42852.5. Samples: 5952056180. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 04:52:05,014][15401] Updated weights for policy 0, policy_version 363280 (0.0028) [2024-06-23 04:52:08,389][15132] Fps is (10 sec: 45876.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5952126976. Throughput: 0: 42839.5. Samples: 5952191180. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 04:52:08,405][15401] Updated weights for policy 0, policy_version 363290 (0.0031) [2024-06-23 04:52:12,517][15401] Updated weights for policy 0, policy_version 363300 (0.0036) [2024-06-23 04:52:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 5952307200. Throughput: 0: 42907.9. Samples: 5952446180. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 04:52:16,197][15349] Signal inference workers to stop experience collection... (88250 times) [2024-06-23 04:52:16,197][15349] Signal inference workers to resume experience collection... (88250 times) [2024-06-23 04:52:16,239][15401] InferenceWorker_p0-w0: stopping experience collection (88250 times) [2024-06-23 04:52:16,239][15401] InferenceWorker_p0-w0: resuming experience collection (88250 times) [2024-06-23 04:52:16,355][15401] Updated weights for policy 0, policy_version 363310 (0.0033) [2024-06-23 04:52:18,390][15132] Fps is (10 sec: 42597.1, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5952552960. Throughput: 0: 42712.3. Samples: 5952699960. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 04:52:20,383][15401] Updated weights for policy 0, policy_version 363320 (0.0027) [2024-06-23 04:52:23,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42326.5, 300 sec: 42820.2). Total num frames: 5952749568. Throughput: 0: 42887.5. Samples: 5952837820. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:23,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 04:52:23,978][15401] Updated weights for policy 0, policy_version 363330 (0.0031) [2024-06-23 04:52:28,254][15401] Updated weights for policy 0, policy_version 363340 (0.0032) [2024-06-23 04:52:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5952962560. Throughput: 0: 42865.0. Samples: 5953090100. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 04:52:31,572][15401] Updated weights for policy 0, policy_version 363350 (0.0031) [2024-06-23 04:52:33,390][15132] Fps is (10 sec: 45885.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5953208320. Throughput: 0: 42757.6. Samples: 5953340360. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 04:52:35,870][15401] Updated weights for policy 0, policy_version 363360 (0.0024) [2024-06-23 04:52:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 5953388544. Throughput: 0: 42802.6. Samples: 5953476820. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:38,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 04:52:39,535][15401] Updated weights for policy 0, policy_version 363370 (0.0036) [2024-06-23 04:52:43,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5953601536. Throughput: 0: 42866.1. Samples: 5953730240. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 04:52:43,461][15401] Updated weights for policy 0, policy_version 363380 (0.0034) [2024-06-23 04:52:47,149][15401] Updated weights for policy 0, policy_version 363390 (0.0037) [2024-06-23 04:52:48,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5953830912. Throughput: 0: 42769.3. Samples: 5953980800. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 04:52:51,248][15401] Updated weights for policy 0, policy_version 363400 (0.0025) [2024-06-23 04:52:53,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5954027520. Throughput: 0: 42702.0. Samples: 5954112780. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:52:54,878][15401] Updated weights for policy 0, policy_version 363410 (0.0035) [2024-06-23 04:52:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5954240512. Throughput: 0: 42538.2. Samples: 5954360400. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:52:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 04:52:58,904][15401] Updated weights for policy 0, policy_version 363420 (0.0035) [2024-06-23 04:53:02,988][15401] Updated weights for policy 0, policy_version 363430 (0.0035) [2024-06-23 04:53:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 5954437120. Throughput: 0: 42538.7. Samples: 5954614300. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 04:53:03,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 04:53:06,636][15401] Updated weights for policy 0, policy_version 363440 (0.0047) [2024-06-23 04:53:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 5954650112. Throughput: 0: 42302.6. Samples: 5954741340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 04:53:10,685][15401] Updated weights for policy 0, policy_version 363450 (0.0038) [2024-06-23 04:53:13,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5954879488. Throughput: 0: 42383.1. Samples: 5954997340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 04:53:14,196][15401] Updated weights for policy 0, policy_version 363460 (0.0032) [2024-06-23 04:53:18,358][15401] Updated weights for policy 0, policy_version 363470 (0.0033) [2024-06-23 04:53:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.7, 300 sec: 42765.4). Total num frames: 5955092480. Throughput: 0: 42439.6. Samples: 5955250240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:18,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 04:53:21,907][15401] Updated weights for policy 0, policy_version 363480 (0.0034) [2024-06-23 04:53:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 5955272704. Throughput: 0: 42166.7. Samples: 5955374220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 04:53:25,963][15401] Updated weights for policy 0, policy_version 363490 (0.0027) [2024-06-23 04:53:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5955518464. Throughput: 0: 42474.6. Samples: 5955641600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:28,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 04:53:29,395][15401] Updated weights for policy 0, policy_version 363500 (0.0030) [2024-06-23 04:53:33,396][15132] Fps is (10 sec: 44208.2, 60 sec: 41774.8, 300 sec: 42653.0). Total num frames: 5955715072. Throughput: 0: 42483.8. Samples: 5955892840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:33,396][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 04:53:33,959][15401] Updated weights for policy 0, policy_version 363510 (0.0030) [2024-06-23 04:53:37,380][15401] Updated weights for policy 0, policy_version 363520 (0.0032) [2024-06-23 04:53:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 5955928064. Throughput: 0: 42364.7. Samples: 5956019180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 04:53:41,447][15401] Updated weights for policy 0, policy_version 363530 (0.0038) [2024-06-23 04:53:43,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 5956157440. Throughput: 0: 42642.4. Samples: 5956279300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 04:53:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363535_5956157440.pth... [2024-06-23 04:53:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000362911_5945933824.pth [2024-06-23 04:53:44,884][15401] Updated weights for policy 0, policy_version 363540 (0.0037) [2024-06-23 04:53:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5956370432. Throughput: 0: 42732.0. Samples: 5956537140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 04:53:48,916][15401] Updated weights for policy 0, policy_version 363550 (0.0046) [2024-06-23 04:53:51,084][15349] Signal inference workers to stop experience collection... (88300 times) [2024-06-23 04:53:51,084][15349] Signal inference workers to resume experience collection... (88300 times) [2024-06-23 04:53:51,098][15401] InferenceWorker_p0-w0: stopping experience collection (88300 times) [2024-06-23 04:53:51,098][15401] InferenceWorker_p0-w0: resuming experience collection (88300 times) [2024-06-23 04:53:52,460][15401] Updated weights for policy 0, policy_version 363560 (0.0031) [2024-06-23 04:53:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 5956599808. Throughput: 0: 42789.8. Samples: 5956666880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 04:53:56,933][15401] Updated weights for policy 0, policy_version 363570 (0.0035) [2024-06-23 04:53:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5956780032. Throughput: 0: 42708.0. Samples: 5956919200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:53:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 04:54:00,259][15401] Updated weights for policy 0, policy_version 363580 (0.0036) [2024-06-23 04:54:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 5956993024. Throughput: 0: 42837.4. Samples: 5957177820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:54:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 04:54:04,431][15401] Updated weights for policy 0, policy_version 363590 (0.0032) [2024-06-23 04:54:08,140][15401] Updated weights for policy 0, policy_version 363600 (0.0026) [2024-06-23 04:54:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5957238784. Throughput: 0: 42950.2. Samples: 5957306980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:54:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 04:54:11,927][15401] Updated weights for policy 0, policy_version 363610 (0.0035) [2024-06-23 04:54:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5957435392. Throughput: 0: 42706.7. Samples: 5957563400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:54:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 04:54:15,649][15401] Updated weights for policy 0, policy_version 363620 (0.0026) [2024-06-23 04:54:18,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 5957632000. Throughput: 0: 42841.1. Samples: 5957820520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 04:54:18,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 04:54:19,693][15401] Updated weights for policy 0, policy_version 363630 (0.0031) [2024-06-23 04:54:23,224][15401] Updated weights for policy 0, policy_version 363640 (0.0037) [2024-06-23 04:54:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5957877760. Throughput: 0: 42786.9. Samples: 5957944600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:23,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 04:54:27,187][15401] Updated weights for policy 0, policy_version 363650 (0.0045) [2024-06-23 04:54:28,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5958090752. Throughput: 0: 42666.1. Samples: 5958199280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 04:54:30,729][15401] Updated weights for policy 0, policy_version 363660 (0.0035) [2024-06-23 04:54:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 5958270976. Throughput: 0: 42748.0. Samples: 5958460800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 04:54:34,875][15401] Updated weights for policy 0, policy_version 363670 (0.0032) [2024-06-23 04:54:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5958516736. Throughput: 0: 42618.7. Samples: 5958584720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 04:54:38,766][15401] Updated weights for policy 0, policy_version 363680 (0.0034) [2024-06-23 04:54:42,641][15401] Updated weights for policy 0, policy_version 363690 (0.0022) [2024-06-23 04:54:43,392][15132] Fps is (10 sec: 47502.3, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 5958746112. Throughput: 0: 42771.0. Samples: 5958844000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:43,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 04:54:46,638][15401] Updated weights for policy 0, policy_version 363700 (0.0027) [2024-06-23 04:54:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5958909952. Throughput: 0: 42713.7. Samples: 5959099940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 04:54:50,136][15401] Updated weights for policy 0, policy_version 363710 (0.0048) [2024-06-23 04:54:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5959155712. Throughput: 0: 42508.3. Samples: 5959219860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 04:54:54,013][15401] Updated weights for policy 0, policy_version 363720 (0.0034) [2024-06-23 04:54:56,321][15349] Signal inference workers to stop experience collection... (88350 times) [2024-06-23 04:54:56,356][15401] InferenceWorker_p0-w0: stopping experience collection (88350 times) [2024-06-23 04:54:56,377][15349] Signal inference workers to resume experience collection... (88350 times) [2024-06-23 04:54:56,378][15401] InferenceWorker_p0-w0: resuming experience collection (88350 times) [2024-06-23 04:54:57,683][15401] Updated weights for policy 0, policy_version 363730 (0.0027) [2024-06-23 04:54:58,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 5959368704. Throughput: 0: 42732.6. Samples: 5959486640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:54:58,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 04:55:01,658][15401] Updated weights for policy 0, policy_version 363740 (0.0031) [2024-06-23 04:55:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5959565312. Throughput: 0: 42598.1. Samples: 5959737340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 04:55:05,603][15401] Updated weights for policy 0, policy_version 363750 (0.0026) [2024-06-23 04:55:08,392][15132] Fps is (10 sec: 40976.3, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 5959778304. Throughput: 0: 42549.8. Samples: 5959859440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:08,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 04:55:09,217][15401] Updated weights for policy 0, policy_version 363760 (0.0042) [2024-06-23 04:55:13,181][15401] Updated weights for policy 0, policy_version 363770 (0.0026) [2024-06-23 04:55:13,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 5960007680. Throughput: 0: 42808.9. Samples: 5960125680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 04:55:16,863][15401] Updated weights for policy 0, policy_version 363780 (0.0028) [2024-06-23 04:55:18,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 5960220672. Throughput: 0: 42574.7. Samples: 5960376660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 04:55:20,934][15401] Updated weights for policy 0, policy_version 363790 (0.0045) [2024-06-23 04:55:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5960433664. Throughput: 0: 42747.7. Samples: 5960508360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 04:55:24,605][15401] Updated weights for policy 0, policy_version 363800 (0.0036) [2024-06-23 04:55:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5960630272. Throughput: 0: 42682.4. Samples: 5960764600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 04:55:28,717][15401] Updated weights for policy 0, policy_version 363810 (0.0032) [2024-06-23 04:55:32,326][15401] Updated weights for policy 0, policy_version 363820 (0.0029) [2024-06-23 04:55:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 5960876032. Throughput: 0: 42601.8. Samples: 5961017120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:33,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 04:55:36,139][15401] Updated weights for policy 0, policy_version 363830 (0.0029) [2024-06-23 04:55:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5961072640. Throughput: 0: 42825.5. Samples: 5961147000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 04:55:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 04:55:39,985][15401] Updated weights for policy 0, policy_version 363840 (0.0038) [2024-06-23 04:55:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 5961269248. Throughput: 0: 42720.6. Samples: 5961408800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:55:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 04:55:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363848_5961285632.pth... [2024-06-23 04:55:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363222_5951029248.pth [2024-06-23 04:55:43,969][15401] Updated weights for policy 0, policy_version 363850 (0.0040) [2024-06-23 04:55:47,577][15401] Updated weights for policy 0, policy_version 363860 (0.0034) [2024-06-23 04:55:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5961515008. Throughput: 0: 42780.6. Samples: 5961662460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:55:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 04:55:51,598][15401] Updated weights for policy 0, policy_version 363870 (0.0034) [2024-06-23 04:55:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5961711616. Throughput: 0: 43035.4. Samples: 5961795940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:55:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 04:55:55,842][15401] Updated weights for policy 0, policy_version 363880 (0.0028) [2024-06-23 04:55:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 5961908224. Throughput: 0: 42522.2. Samples: 5962039180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:55:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 04:55:59,242][15401] Updated weights for policy 0, policy_version 363890 (0.0044) [2024-06-23 04:56:03,295][15401] Updated weights for policy 0, policy_version 363900 (0.0033) [2024-06-23 04:56:03,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 5962137600. Throughput: 0: 42717.3. Samples: 5962299040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:03,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 04:56:06,992][15401] Updated weights for policy 0, policy_version 363910 (0.0037) [2024-06-23 04:56:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42765.3). Total num frames: 5962350592. Throughput: 0: 42769.2. Samples: 5962432980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 04:56:11,155][15349] Signal inference workers to stop experience collection... (88400 times) [2024-06-23 04:56:11,157][15349] Signal inference workers to resume experience collection... (88400 times) [2024-06-23 04:56:11,168][15401] InferenceWorker_p0-w0: stopping experience collection (88400 times) [2024-06-23 04:56:11,171][15401] Updated weights for policy 0, policy_version 363920 (0.0039) [2024-06-23 04:56:11,192][15401] InferenceWorker_p0-w0: resuming experience collection (88400 times) [2024-06-23 04:56:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5962547200. Throughput: 0: 42577.8. Samples: 5962680600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:13,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 04:56:14,653][15401] Updated weights for policy 0, policy_version 363930 (0.0035) [2024-06-23 04:56:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 5962760192. Throughput: 0: 42832.0. Samples: 5962944460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:18,396][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 04:56:18,624][15401] Updated weights for policy 0, policy_version 363940 (0.0046) [2024-06-23 04:56:22,382][15401] Updated weights for policy 0, policy_version 363950 (0.0039) [2024-06-23 04:56:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 5962989568. Throughput: 0: 42801.6. Samples: 5963073180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:23,393][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 04:56:26,253][15401] Updated weights for policy 0, policy_version 363960 (0.0044) [2024-06-23 04:56:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5963202560. Throughput: 0: 42626.7. Samples: 5963327000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 04:56:30,126][15401] Updated weights for policy 0, policy_version 363970 (0.0043) [2024-06-23 04:56:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5963415552. Throughput: 0: 42727.1. Samples: 5963585180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 04:56:34,066][15401] Updated weights for policy 0, policy_version 363980 (0.0029) [2024-06-23 04:56:37,937][15401] Updated weights for policy 0, policy_version 363990 (0.0043) [2024-06-23 04:56:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5963628544. Throughput: 0: 42538.9. Samples: 5963710180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:38,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 04:56:41,633][15401] Updated weights for policy 0, policy_version 364000 (0.0029) [2024-06-23 04:56:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5963857920. Throughput: 0: 42816.9. Samples: 5963965940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:43,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 04:56:45,708][15401] Updated weights for policy 0, policy_version 364010 (0.0041) [2024-06-23 04:56:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5964038144. Throughput: 0: 42956.6. Samples: 5964231980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 04:56:49,091][15401] Updated weights for policy 0, policy_version 364020 (0.0024) [2024-06-23 04:56:53,282][15401] Updated weights for policy 0, policy_version 364030 (0.0033) [2024-06-23 04:56:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5964267520. Throughput: 0: 42672.1. Samples: 5964353220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 04:56:53,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 04:56:57,221][15401] Updated weights for policy 0, policy_version 364040 (0.0035) [2024-06-23 04:56:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5964513280. Throughput: 0: 42963.0. Samples: 5964613940. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:56:58,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 04:57:00,981][15401] Updated weights for policy 0, policy_version 364050 (0.0031) [2024-06-23 04:57:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42542.8). Total num frames: 5964677120. Throughput: 0: 42739.2. Samples: 5964867720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 04:57:04,864][15401] Updated weights for policy 0, policy_version 364060 (0.0047) [2024-06-23 04:57:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5964906496. Throughput: 0: 42558.3. Samples: 5964988200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 04:57:08,750][15401] Updated weights for policy 0, policy_version 364070 (0.0031) [2024-06-23 04:57:12,463][15401] Updated weights for policy 0, policy_version 364080 (0.0031) [2024-06-23 04:57:13,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5965152256. Throughput: 0: 42896.5. Samples: 5965257340. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 04:57:16,285][15401] Updated weights for policy 0, policy_version 364090 (0.0046) [2024-06-23 04:57:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 5965332480. Throughput: 0: 42682.3. Samples: 5965505880. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 04:57:19,936][15349] Signal inference workers to stop experience collection... (88450 times) [2024-06-23 04:57:19,936][15349] Signal inference workers to resume experience collection... (88450 times) [2024-06-23 04:57:19,972][15401] InferenceWorker_p0-w0: stopping experience collection (88450 times) [2024-06-23 04:57:19,972][15401] InferenceWorker_p0-w0: resuming experience collection (88450 times) [2024-06-23 04:57:20,090][15401] Updated weights for policy 0, policy_version 364100 (0.0040) [2024-06-23 04:57:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5965545472. Throughput: 0: 42699.5. Samples: 5965631660. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 04:57:23,832][15401] Updated weights for policy 0, policy_version 364110 (0.0032) [2024-06-23 04:57:27,725][15401] Updated weights for policy 0, policy_version 364120 (0.0043) [2024-06-23 04:57:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5965758464. Throughput: 0: 42779.9. Samples: 5965891040. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 04:57:31,417][15401] Updated weights for policy 0, policy_version 364130 (0.0036) [2024-06-23 04:57:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 5965987840. Throughput: 0: 42455.9. Samples: 5966142500. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 04:57:35,406][15401] Updated weights for policy 0, policy_version 364140 (0.0041) [2024-06-23 04:57:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5966200832. Throughput: 0: 42594.7. Samples: 5966269980. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:38,391][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 04:57:39,331][15401] Updated weights for policy 0, policy_version 364150 (0.0041) [2024-06-23 04:57:42,933][15401] Updated weights for policy 0, policy_version 364160 (0.0040) [2024-06-23 04:57:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5966397440. Throughput: 0: 42466.3. Samples: 5966524920. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:43,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 04:57:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364160_5966397440.pth... [2024-06-23 04:57:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363535_5956157440.pth [2024-06-23 04:57:46,887][15401] Updated weights for policy 0, policy_version 364170 (0.0036) [2024-06-23 04:57:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5966610432. Throughput: 0: 42596.3. Samples: 5966784560. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 04:57:50,597][15401] Updated weights for policy 0, policy_version 364180 (0.0042) [2024-06-23 04:57:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5966856192. Throughput: 0: 42852.7. Samples: 5966916580. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 04:57:54,293][15401] Updated weights for policy 0, policy_version 364190 (0.0044) [2024-06-23 04:57:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42709.8). Total num frames: 5967036416. Throughput: 0: 42523.6. Samples: 5967170900. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:57:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 04:57:58,477][15401] Updated weights for policy 0, policy_version 364200 (0.0036) [2024-06-23 04:58:01,761][15401] Updated weights for policy 0, policy_version 364210 (0.0035) [2024-06-23 04:58:03,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5967249408. Throughput: 0: 42683.1. Samples: 5967426620. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:58:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 04:58:06,338][15401] Updated weights for policy 0, policy_version 364220 (0.0033) [2024-06-23 04:58:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5967478784. Throughput: 0: 42893.8. Samples: 5967561880. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:58:08,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 04:58:09,834][15401] Updated weights for policy 0, policy_version 364230 (0.0026) [2024-06-23 04:58:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 5967675392. Throughput: 0: 42716.1. Samples: 5967813260. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-23 04:58:13,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 04:58:14,027][15401] Updated weights for policy 0, policy_version 364240 (0.0037) [2024-06-23 04:58:17,334][15401] Updated weights for policy 0, policy_version 364250 (0.0033) [2024-06-23 04:58:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5967904768. Throughput: 0: 42786.2. Samples: 5968067880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 04:58:21,510][15401] Updated weights for policy 0, policy_version 364260 (0.0036) [2024-06-23 04:58:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5968117760. Throughput: 0: 42893.7. Samples: 5968200200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 04:58:24,839][15401] Updated weights for policy 0, policy_version 364270 (0.0032) [2024-06-23 04:58:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 5968330752. Throughput: 0: 42793.3. Samples: 5968450620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 04:58:28,986][15401] Updated weights for policy 0, policy_version 364280 (0.0031) [2024-06-23 04:58:32,434][15401] Updated weights for policy 0, policy_version 364290 (0.0042) [2024-06-23 04:58:33,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 5968527360. Throughput: 0: 42721.0. Samples: 5968707100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:33,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 04:58:36,388][15401] Updated weights for policy 0, policy_version 364300 (0.0038) [2024-06-23 04:58:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5968756736. Throughput: 0: 42576.6. Samples: 5968832520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 04:58:40,598][15401] Updated weights for policy 0, policy_version 364310 (0.0033) [2024-06-23 04:58:43,396][15132] Fps is (10 sec: 45856.9, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 5968986112. Throughput: 0: 42683.3. Samples: 5969091920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:43,396][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 04:58:43,962][15401] Updated weights for policy 0, policy_version 364320 (0.0044) [2024-06-23 04:58:46,116][15349] Signal inference workers to stop experience collection... (88500 times) [2024-06-23 04:58:46,159][15401] InferenceWorker_p0-w0: stopping experience collection (88500 times) [2024-06-23 04:58:46,167][15349] Signal inference workers to resume experience collection... (88500 times) [2024-06-23 04:58:46,172][15401] InferenceWorker_p0-w0: resuming experience collection (88500 times) [2024-06-23 04:58:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5969166336. Throughput: 0: 42704.0. Samples: 5969348300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 04:58:48,562][15401] Updated weights for policy 0, policy_version 364330 (0.0035) [2024-06-23 04:58:51,754][15401] Updated weights for policy 0, policy_version 364340 (0.0029) [2024-06-23 04:58:53,391][15132] Fps is (10 sec: 40980.6, 60 sec: 42324.5, 300 sec: 42764.8). Total num frames: 5969395712. Throughput: 0: 42537.0. Samples: 5969476100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:53,391][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 04:58:56,302][15401] Updated weights for policy 0, policy_version 364350 (0.0031) [2024-06-23 04:58:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5969625088. Throughput: 0: 42644.5. Samples: 5969732260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:58:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 04:58:59,330][15401] Updated weights for policy 0, policy_version 364360 (0.0028) [2024-06-23 04:59:03,392][15132] Fps is (10 sec: 42593.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 5969821696. Throughput: 0: 42714.3. Samples: 5969990120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:03,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 04:59:03,707][15401] Updated weights for policy 0, policy_version 364370 (0.0024) [2024-06-23 04:59:06,843][15401] Updated weights for policy 0, policy_version 364380 (0.0032) [2024-06-23 04:59:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5970018304. Throughput: 0: 42707.3. Samples: 5970122020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:08,396][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 04:59:11,329][15401] Updated weights for policy 0, policy_version 364390 (0.0030) [2024-06-23 04:59:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5970247680. Throughput: 0: 42742.4. Samples: 5970374020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:13,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-23 04:59:14,894][15401] Updated weights for policy 0, policy_version 364400 (0.0030) [2024-06-23 04:59:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5970460672. Throughput: 0: 42728.9. Samples: 5970629800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 04:59:18,970][15401] Updated weights for policy 0, policy_version 364410 (0.0037) [2024-06-23 04:59:22,622][15401] Updated weights for policy 0, policy_version 364420 (0.0033) [2024-06-23 04:59:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5970673664. Throughput: 0: 42845.2. Samples: 5970760560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 04:59:26,658][15401] Updated weights for policy 0, policy_version 364430 (0.0035) [2024-06-23 04:59:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5970903040. Throughput: 0: 42699.8. Samples: 5971013140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 04:59:30,616][15401] Updated weights for policy 0, policy_version 364440 (0.0044) [2024-06-23 04:59:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 5971083264. Throughput: 0: 42802.2. Samples: 5971274400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 04:59:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 04:59:34,315][15401] Updated weights for policy 0, policy_version 364450 (0.0038) [2024-06-23 04:59:38,109][15401] Updated weights for policy 0, policy_version 364460 (0.0029) [2024-06-23 04:59:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 5971312640. Throughput: 0: 42706.7. Samples: 5971397840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 04:59:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 04:59:42,058][15401] Updated weights for policy 0, policy_version 364470 (0.0034) [2024-06-23 04:59:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 5971558400. Throughput: 0: 42658.1. Samples: 5971651880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 04:59:43,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 04:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364475_5971558400.pth... [2024-06-23 04:59:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000363848_5961285632.pth [2024-06-23 04:59:45,955][15401] Updated weights for policy 0, policy_version 364480 (0.0031) [2024-06-23 04:59:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5971738624. Throughput: 0: 42593.9. Samples: 5971906740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 04:59:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 04:59:49,845][15401] Updated weights for policy 0, policy_version 364490 (0.0029) [2024-06-23 04:59:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42599.3, 300 sec: 42654.9). Total num frames: 5971951616. Throughput: 0: 42259.5. Samples: 5972023700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 04:59:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 04:59:53,623][15401] Updated weights for policy 0, policy_version 364500 (0.0039) [2024-06-23 04:59:57,509][15401] Updated weights for policy 0, policy_version 364510 (0.0031) [2024-06-23 04:59:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5972213760. Throughput: 0: 42681.1. Samples: 5972294680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 04:59:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 05:00:01,377][15401] Updated weights for policy 0, policy_version 364520 (0.0031) [2024-06-23 05:00:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42709.8). Total num frames: 5972377600. Throughput: 0: 42643.9. Samples: 5972548780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:03,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 05:00:05,143][15401] Updated weights for policy 0, policy_version 364530 (0.0039) [2024-06-23 05:00:05,252][15349] Signal inference workers to stop experience collection... (88550 times) [2024-06-23 05:00:05,277][15401] InferenceWorker_p0-w0: stopping experience collection (88550 times) [2024-06-23 05:00:05,316][15349] Signal inference workers to resume experience collection... (88550 times) [2024-06-23 05:00:05,316][15401] InferenceWorker_p0-w0: resuming experience collection (88550 times) [2024-06-23 05:00:08,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5972590592. Throughput: 0: 42431.7. Samples: 5972669980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 05:00:08,997][15401] Updated weights for policy 0, policy_version 364540 (0.0034) [2024-06-23 05:00:12,746][15401] Updated weights for policy 0, policy_version 364550 (0.0028) [2024-06-23 05:00:13,389][15132] Fps is (10 sec: 45876.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5972836352. Throughput: 0: 42840.2. Samples: 5972940940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 05:00:16,678][15401] Updated weights for policy 0, policy_version 364560 (0.0039) [2024-06-23 05:00:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5973032960. Throughput: 0: 42694.6. Samples: 5973195660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 05:00:20,339][15401] Updated weights for policy 0, policy_version 364570 (0.0036) [2024-06-23 05:00:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5973229568. Throughput: 0: 42585.2. Samples: 5973314180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 05:00:24,460][15401] Updated weights for policy 0, policy_version 364580 (0.0030) [2024-06-23 05:00:27,989][15401] Updated weights for policy 0, policy_version 364590 (0.0026) [2024-06-23 05:00:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 5973458944. Throughput: 0: 42926.4. Samples: 5973583560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:28,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 05:00:32,009][15401] Updated weights for policy 0, policy_version 364600 (0.0031) [2024-06-23 05:00:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5973671936. Throughput: 0: 42837.3. Samples: 5973834420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:33,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 05:00:35,515][15401] Updated weights for policy 0, policy_version 364610 (0.0025) [2024-06-23 05:00:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5973884928. Throughput: 0: 42908.4. Samples: 5973954580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 05:00:39,557][15401] Updated weights for policy 0, policy_version 364620 (0.0034) [2024-06-23 05:00:42,969][15401] Updated weights for policy 0, policy_version 364630 (0.0034) [2024-06-23 05:00:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5974130688. Throughput: 0: 42755.7. Samples: 5974218680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 05:00:47,109][15401] Updated weights for policy 0, policy_version 364640 (0.0036) [2024-06-23 05:00:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5974310912. Throughput: 0: 42870.8. Samples: 5974477960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 05:00:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 05:00:50,659][15401] Updated weights for policy 0, policy_version 364650 (0.0033) [2024-06-23 05:00:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5974523904. Throughput: 0: 42925.7. Samples: 5974601640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:00:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 05:00:54,742][15401] Updated weights for policy 0, policy_version 364660 (0.0029) [2024-06-23 05:00:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 5974736896. Throughput: 0: 42803.4. Samples: 5974867100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:00:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 05:00:58,409][15401] Updated weights for policy 0, policy_version 364670 (0.0040) [2024-06-23 05:01:02,441][15401] Updated weights for policy 0, policy_version 364680 (0.0037) [2024-06-23 05:01:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5974966272. Throughput: 0: 42673.7. Samples: 5975115980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 05:01:06,006][15401] Updated weights for policy 0, policy_version 364690 (0.0036) [2024-06-23 05:01:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5975146496. Throughput: 0: 42737.3. Samples: 5975237360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:08,391][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 05:01:10,003][15401] Updated weights for policy 0, policy_version 364700 (0.0028) [2024-06-23 05:01:13,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5975392256. Throughput: 0: 42549.3. Samples: 5975498280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 05:01:13,953][15401] Updated weights for policy 0, policy_version 364710 (0.0034) [2024-06-23 05:01:17,686][15401] Updated weights for policy 0, policy_version 364720 (0.0029) [2024-06-23 05:01:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 5975605248. Throughput: 0: 42725.2. Samples: 5975757060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 05:01:21,475][15401] Updated weights for policy 0, policy_version 364730 (0.0033) [2024-06-23 05:01:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5975801856. Throughput: 0: 42953.3. Samples: 5975887480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 05:01:25,259][15401] Updated weights for policy 0, policy_version 364740 (0.0032) [2024-06-23 05:01:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5976031232. Throughput: 0: 42873.7. Samples: 5976148000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:28,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 05:01:29,079][15401] Updated weights for policy 0, policy_version 364750 (0.0028) [2024-06-23 05:01:32,513][15349] Signal inference workers to stop experience collection... (88600 times) [2024-06-23 05:01:32,513][15349] Signal inference workers to resume experience collection... (88600 times) [2024-06-23 05:01:32,525][15401] InferenceWorker_p0-w0: stopping experience collection (88600 times) [2024-06-23 05:01:32,525][15401] InferenceWorker_p0-w0: resuming experience collection (88600 times) [2024-06-23 05:01:32,662][15401] Updated weights for policy 0, policy_version 364760 (0.0036) [2024-06-23 05:01:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5976244224. Throughput: 0: 42813.2. Samples: 5976404560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 05:01:36,637][15401] Updated weights for policy 0, policy_version 364770 (0.0032) [2024-06-23 05:01:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 5976457216. Throughput: 0: 42936.7. Samples: 5976533800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 05:01:40,566][15401] Updated weights for policy 0, policy_version 364780 (0.0037) [2024-06-23 05:01:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 5976653824. Throughput: 0: 42694.1. Samples: 5976788340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 05:01:43,453][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364787_5976670208.pth... [2024-06-23 05:01:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364160_5966397440.pth [2024-06-23 05:01:44,224][15401] Updated weights for policy 0, policy_version 364790 (0.0033) [2024-06-23 05:01:48,362][15401] Updated weights for policy 0, policy_version 364800 (0.0035) [2024-06-23 05:01:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5976883200. Throughput: 0: 42853.9. Samples: 5977044400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:01:51,810][15401] Updated weights for policy 0, policy_version 364810 (0.0033) [2024-06-23 05:01:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5977063424. Throughput: 0: 42950.3. Samples: 5977170120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:53,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-23 05:01:55,844][15401] Updated weights for policy 0, policy_version 364820 (0.0031) [2024-06-23 05:01:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5977292800. Throughput: 0: 42780.9. Samples: 5977423420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:01:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 05:01:59,510][15401] Updated weights for policy 0, policy_version 364830 (0.0049) [2024-06-23 05:02:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5977522176. Throughput: 0: 42645.8. Samples: 5977676120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:02:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 05:02:03,662][15401] Updated weights for policy 0, policy_version 364840 (0.0038) [2024-06-23 05:02:07,343][15401] Updated weights for policy 0, policy_version 364850 (0.0043) [2024-06-23 05:02:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5977718784. Throughput: 0: 42713.9. Samples: 5977809600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:02:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 05:02:11,070][15401] Updated weights for policy 0, policy_version 364860 (0.0032) [2024-06-23 05:02:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5977948160. Throughput: 0: 42631.9. Samples: 5978066440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 05:02:14,964][15401] Updated weights for policy 0, policy_version 364870 (0.0039) [2024-06-23 05:02:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5978161152. Throughput: 0: 42506.9. Samples: 5978317360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 05:02:19,042][15401] Updated weights for policy 0, policy_version 364880 (0.0030) [2024-06-23 05:02:22,895][15401] Updated weights for policy 0, policy_version 364890 (0.0038) [2024-06-23 05:02:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 5978374144. Throughput: 0: 42574.9. Samples: 5978449660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 05:02:27,025][15401] Updated weights for policy 0, policy_version 364900 (0.0039) [2024-06-23 05:02:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5978570752. Throughput: 0: 42625.5. Samples: 5978706480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 05:02:30,531][15401] Updated weights for policy 0, policy_version 364910 (0.0041) [2024-06-23 05:02:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5978800128. Throughput: 0: 42567.2. Samples: 5978959920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 05:02:34,765][15401] Updated weights for policy 0, policy_version 364920 (0.0035) [2024-06-23 05:02:38,028][15401] Updated weights for policy 0, policy_version 364930 (0.0024) [2024-06-23 05:02:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5979013120. Throughput: 0: 42661.8. Samples: 5979089900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 05:02:42,331][15401] Updated weights for policy 0, policy_version 364940 (0.0042) [2024-06-23 05:02:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5979209728. Throughput: 0: 42842.9. Samples: 5979351360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 05:02:45,637][15401] Updated weights for policy 0, policy_version 364950 (0.0036) [2024-06-23 05:02:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5979455488. Throughput: 0: 42771.1. Samples: 5979600820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:48,395][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 05:02:49,944][15401] Updated weights for policy 0, policy_version 364960 (0.0028) [2024-06-23 05:02:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5979652096. Throughput: 0: 42711.6. Samples: 5979731620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 05:02:53,427][15401] Updated weights for policy 0, policy_version 364970 (0.0032) [2024-06-23 05:02:57,809][15401] Updated weights for policy 0, policy_version 364980 (0.0048) [2024-06-23 05:02:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5979865088. Throughput: 0: 42779.2. Samples: 5979991500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:02:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 05:02:59,048][15349] Signal inference workers to stop experience collection... (88650 times) [2024-06-23 05:02:59,048][15349] Signal inference workers to resume experience collection... (88650 times) [2024-06-23 05:02:59,081][15401] InferenceWorker_p0-w0: stopping experience collection (88650 times) [2024-06-23 05:02:59,116][15401] InferenceWorker_p0-w0: resuming experience collection (88650 times) [2024-06-23 05:03:00,991][15401] Updated weights for policy 0, policy_version 364990 (0.0023) [2024-06-23 05:03:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5980094464. Throughput: 0: 42832.4. Samples: 5980244820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:03:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 05:03:05,437][15401] Updated weights for policy 0, policy_version 365000 (0.0035) [2024-06-23 05:03:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5980307456. Throughput: 0: 42751.5. Samples: 5980373480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:03:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 05:03:08,482][15401] Updated weights for policy 0, policy_version 365010 (0.0048) [2024-06-23 05:03:12,867][15401] Updated weights for policy 0, policy_version 365020 (0.0041) [2024-06-23 05:03:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5980504064. Throughput: 0: 42722.7. Samples: 5980629000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:03:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 05:03:16,533][15401] Updated weights for policy 0, policy_version 365030 (0.0035) [2024-06-23 05:03:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5980700672. Throughput: 0: 42693.0. Samples: 5980881100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:03:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 05:03:20,423][15401] Updated weights for policy 0, policy_version 365040 (0.0049) [2024-06-23 05:03:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5980946432. Throughput: 0: 42685.3. Samples: 5981010740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:03:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 05:03:24,015][15401] Updated weights for policy 0, policy_version 365050 (0.0029) [2024-06-23 05:03:27,963][15401] Updated weights for policy 0, policy_version 365060 (0.0036) [2024-06-23 05:03:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 5981143040. Throughput: 0: 42623.1. Samples: 5981269400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 05:03:31,925][15401] Updated weights for policy 0, policy_version 365070 (0.0041) [2024-06-23 05:03:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5981339648. Throughput: 0: 42784.1. Samples: 5981526100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 05:03:36,064][15401] Updated weights for policy 0, policy_version 365080 (0.0032) [2024-06-23 05:03:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 5981569024. Throughput: 0: 42826.7. Samples: 5981658820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 05:03:39,402][15401] Updated weights for policy 0, policy_version 365090 (0.0025) [2024-06-23 05:03:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 5981782016. Throughput: 0: 42746.7. Samples: 5981915200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:43,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 05:03:43,472][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365100_5981798400.pth... [2024-06-23 05:03:43,473][15401] Updated weights for policy 0, policy_version 365100 (0.0029) [2024-06-23 05:03:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364475_5971558400.pth [2024-06-23 05:03:46,947][15401] Updated weights for policy 0, policy_version 365110 (0.0038) [2024-06-23 05:03:48,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42709.6). Total num frames: 5981995008. Throughput: 0: 42772.7. Samples: 5982169600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 05:03:51,065][15401] Updated weights for policy 0, policy_version 365120 (0.0026) [2024-06-23 05:03:53,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5982224384. Throughput: 0: 42779.0. Samples: 5982298540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 05:03:54,430][15401] Updated weights for policy 0, policy_version 365130 (0.0033) [2024-06-23 05:03:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 5982420992. Throughput: 0: 42761.6. Samples: 5982553280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:03:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 05:03:58,632][15401] Updated weights for policy 0, policy_version 365140 (0.0045) [2024-06-23 05:04:02,366][15401] Updated weights for policy 0, policy_version 365150 (0.0025) [2024-06-23 05:04:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5982633984. Throughput: 0: 42889.7. Samples: 5982811140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 05:04:06,333][15401] Updated weights for policy 0, policy_version 365160 (0.0047) [2024-06-23 05:04:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5982846976. Throughput: 0: 42939.2. Samples: 5982943000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 05:04:10,003][15401] Updated weights for policy 0, policy_version 365170 (0.0031) [2024-06-23 05:04:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5983043584. Throughput: 0: 42804.4. Samples: 5983195600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 05:04:14,215][15401] Updated weights for policy 0, policy_version 365180 (0.0032) [2024-06-23 05:04:17,824][15401] Updated weights for policy 0, policy_version 365190 (0.0041) [2024-06-23 05:04:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5983289344. Throughput: 0: 42820.3. Samples: 5983453020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:18,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 05:04:21,688][15401] Updated weights for policy 0, policy_version 365200 (0.0037) [2024-06-23 05:04:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5983485952. Throughput: 0: 42783.9. Samples: 5983584100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:23,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 05:04:24,615][15349] Signal inference workers to stop experience collection... (88700 times) [2024-06-23 05:04:24,663][15401] InferenceWorker_p0-w0: stopping experience collection (88700 times) [2024-06-23 05:04:24,671][15349] Signal inference workers to resume experience collection... (88700 times) [2024-06-23 05:04:24,678][15401] InferenceWorker_p0-w0: resuming experience collection (88700 times) [2024-06-23 05:04:25,514][15401] Updated weights for policy 0, policy_version 365210 (0.0031) [2024-06-23 05:04:28,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 5983698944. Throughput: 0: 42626.2. Samples: 5983833380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:28,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 05:04:29,230][15401] Updated weights for policy 0, policy_version 365220 (0.0027) [2024-06-23 05:04:33,203][15401] Updated weights for policy 0, policy_version 365230 (0.0027) [2024-06-23 05:04:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 5983944704. Throughput: 0: 42833.9. Samples: 5984097120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 05:04:36,711][15401] Updated weights for policy 0, policy_version 365240 (0.0035) [2024-06-23 05:04:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5984124928. Throughput: 0: 42741.8. Samples: 5984221920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 05:04:40,867][15401] Updated weights for policy 0, policy_version 365250 (0.0032) [2024-06-23 05:04:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 5984354304. Throughput: 0: 42657.8. Samples: 5984472880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 05:04:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 05:04:44,315][15401] Updated weights for policy 0, policy_version 365260 (0.0034) [2024-06-23 05:04:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5984550912. Throughput: 0: 42728.9. Samples: 5984733940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:04:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 05:04:48,869][15401] Updated weights for policy 0, policy_version 365270 (0.0034) [2024-06-23 05:04:51,928][15401] Updated weights for policy 0, policy_version 365280 (0.0033) [2024-06-23 05:04:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5984763904. Throughput: 0: 42573.3. Samples: 5984858800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:04:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 05:04:56,551][15401] Updated weights for policy 0, policy_version 365290 (0.0048) [2024-06-23 05:04:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5985009664. Throughput: 0: 42524.1. Samples: 5985109180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:04:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 05:04:59,983][15401] Updated weights for policy 0, policy_version 365300 (0.0026) [2024-06-23 05:05:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 5985173504. Throughput: 0: 42682.7. Samples: 5985373840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:03,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 05:05:04,125][15401] Updated weights for policy 0, policy_version 365310 (0.0041) [2024-06-23 05:05:07,565][15401] Updated weights for policy 0, policy_version 365320 (0.0030) [2024-06-23 05:05:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5985419264. Throughput: 0: 42442.3. Samples: 5985494000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 05:05:11,790][15401] Updated weights for policy 0, policy_version 365330 (0.0036) [2024-06-23 05:05:13,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5985648640. Throughput: 0: 42744.9. Samples: 5985756800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 05:05:15,232][15401] Updated weights for policy 0, policy_version 365340 (0.0042) [2024-06-23 05:05:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5985828864. Throughput: 0: 42554.1. Samples: 5986012060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:18,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 05:05:19,314][15401] Updated weights for policy 0, policy_version 365350 (0.0030) [2024-06-23 05:05:23,304][15401] Updated weights for policy 0, policy_version 365360 (0.0041) [2024-06-23 05:05:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5986058240. Throughput: 0: 42466.2. Samples: 5986132900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 05:05:26,821][15401] Updated weights for policy 0, policy_version 365370 (0.0026) [2024-06-23 05:05:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 5986254848. Throughput: 0: 42529.4. Samples: 5986386700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 05:05:30,823][15401] Updated weights for policy 0, policy_version 365380 (0.0035) [2024-06-23 05:05:33,392][15132] Fps is (10 sec: 40951.1, 60 sec: 42050.8, 300 sec: 42653.6). Total num frames: 5986467840. Throughput: 0: 42494.0. Samples: 5986646260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:33,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 05:05:34,483][15401] Updated weights for policy 0, policy_version 365390 (0.0031) [2024-06-23 05:05:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5986697216. Throughput: 0: 42457.5. Samples: 5986769380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 05:05:38,417][15401] Updated weights for policy 0, policy_version 365400 (0.0029) [2024-06-23 05:05:42,212][15401] Updated weights for policy 0, policy_version 365410 (0.0034) [2024-06-23 05:05:43,389][15132] Fps is (10 sec: 42607.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5986893824. Throughput: 0: 42469.4. Samples: 5987020300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:43,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 05:05:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365412_5986910208.pth... [2024-06-23 05:05:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000364787_5976670208.pth [2024-06-23 05:05:45,952][15401] Updated weights for policy 0, policy_version 365420 (0.0033) [2024-06-23 05:05:48,310][15349] Signal inference workers to stop experience collection... (88750 times) [2024-06-23 05:05:48,310][15349] Signal inference workers to resume experience collection... (88750 times) [2024-06-23 05:05:48,350][15401] InferenceWorker_p0-w0: stopping experience collection (88750 times) [2024-06-23 05:05:48,351][15401] InferenceWorker_p0-w0: resuming experience collection (88750 times) [2024-06-23 05:05:48,389][15132] Fps is (10 sec: 37682.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5987074048. Throughput: 0: 42533.4. Samples: 5987287740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 05:05:49,823][15401] Updated weights for policy 0, policy_version 365430 (0.0034) [2024-06-23 05:05:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5987352576. Throughput: 0: 42497.4. Samples: 5987406380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 05:05:53,517][15401] Updated weights for policy 0, policy_version 365440 (0.0040) [2024-06-23 05:05:57,635][15401] Updated weights for policy 0, policy_version 365450 (0.0039) [2024-06-23 05:05:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5987549184. Throughput: 0: 42311.6. Samples: 5987660820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:05:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 05:06:01,305][15401] Updated weights for policy 0, policy_version 365460 (0.0049) [2024-06-23 05:06:03,390][15132] Fps is (10 sec: 36044.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 5987713024. Throughput: 0: 42431.2. Samples: 5987921460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:06:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 05:06:05,243][15401] Updated weights for policy 0, policy_version 365470 (0.0026) [2024-06-23 05:06:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5987958784. Throughput: 0: 42368.0. Samples: 5988039460. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:08,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-23 05:06:09,209][15401] Updated weights for policy 0, policy_version 365480 (0.0033) [2024-06-23 05:06:13,071][15401] Updated weights for policy 0, policy_version 365490 (0.0048) [2024-06-23 05:06:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5988188160. Throughput: 0: 42487.5. Samples: 5988298640. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 05:06:16,829][15401] Updated weights for policy 0, policy_version 365500 (0.0035) [2024-06-23 05:06:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5988368384. Throughput: 0: 42442.8. Samples: 5988556100. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 05:06:20,806][15401] Updated weights for policy 0, policy_version 365510 (0.0034) [2024-06-23 05:06:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5988597760. Throughput: 0: 42344.3. Samples: 5988674880. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 05:06:24,350][15401] Updated weights for policy 0, policy_version 365520 (0.0030) [2024-06-23 05:06:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5988810752. Throughput: 0: 42602.6. Samples: 5988937420. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:28,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 05:06:28,880][15401] Updated weights for policy 0, policy_version 365530 (0.0036) [2024-06-23 05:06:32,525][15401] Updated weights for policy 0, policy_version 365540 (0.0036) [2024-06-23 05:06:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.8, 300 sec: 42542.9). Total num frames: 5989007360. Throughput: 0: 42152.8. Samples: 5989184620. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 05:06:36,442][15401] Updated weights for policy 0, policy_version 365550 (0.0021) [2024-06-23 05:06:37,437][15349] Signal inference workers to stop experience collection... (88800 times) [2024-06-23 05:06:37,446][15349] Signal inference workers to resume experience collection... (88800 times) [2024-06-23 05:06:37,474][15401] InferenceWorker_p0-w0: stopping experience collection (88800 times) [2024-06-23 05:06:37,474][15401] InferenceWorker_p0-w0: resuming experience collection (88800 times) [2024-06-23 05:06:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 5989220352. Throughput: 0: 42412.7. Samples: 5989314960. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 05:06:40,205][15401] Updated weights for policy 0, policy_version 365560 (0.0037) [2024-06-23 05:06:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5989433344. Throughput: 0: 42496.4. Samples: 5989573160. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:43,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 05:06:43,974][15401] Updated weights for policy 0, policy_version 365570 (0.0025) [2024-06-23 05:06:47,847][15401] Updated weights for policy 0, policy_version 365580 (0.0022) [2024-06-23 05:06:48,392][15132] Fps is (10 sec: 44227.3, 60 sec: 43142.9, 300 sec: 42709.2). Total num frames: 5989662720. Throughput: 0: 42253.5. Samples: 5989822960. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:48,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 05:06:51,729][15401] Updated weights for policy 0, policy_version 365590 (0.0034) [2024-06-23 05:06:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5989875712. Throughput: 0: 42496.9. Samples: 5989951820. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 05:06:55,749][15401] Updated weights for policy 0, policy_version 365600 (0.0036) [2024-06-23 05:06:58,390][15132] Fps is (10 sec: 40968.8, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 5990072320. Throughput: 0: 42596.4. Samples: 5990215480. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:06:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 05:06:59,446][15401] Updated weights for policy 0, policy_version 365610 (0.0028) [2024-06-23 05:07:03,286][15401] Updated weights for policy 0, policy_version 365620 (0.0037) [2024-06-23 05:07:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 5990318080. Throughput: 0: 42555.0. Samples: 5990471080. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:07:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 05:07:07,083][15401] Updated weights for policy 0, policy_version 365630 (0.0043) [2024-06-23 05:07:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5990531072. Throughput: 0: 42754.2. Samples: 5990598820. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:07:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 05:07:10,954][15401] Updated weights for policy 0, policy_version 365640 (0.0032) [2024-06-23 05:07:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5990727680. Throughput: 0: 42813.2. Samples: 5990864020. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:07:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 05:07:14,737][15401] Updated weights for policy 0, policy_version 365650 (0.0038) [2024-06-23 05:07:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5990940672. Throughput: 0: 42921.4. Samples: 5991116080. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-23 05:07:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 05:07:18,785][15401] Updated weights for policy 0, policy_version 365660 (0.0027) [2024-06-23 05:07:22,212][15401] Updated weights for policy 0, policy_version 365670 (0.0037) [2024-06-23 05:07:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 5991170048. Throughput: 0: 42918.2. Samples: 5991246280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:23,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-23 05:07:26,357][15401] Updated weights for policy 0, policy_version 365680 (0.0031) [2024-06-23 05:07:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5991366656. Throughput: 0: 42933.7. Samples: 5991505180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:28,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 05:07:29,717][15401] Updated weights for policy 0, policy_version 365690 (0.0039) [2024-06-23 05:07:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5991579648. Throughput: 0: 43219.0. Samples: 5991767720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 05:07:34,214][15401] Updated weights for policy 0, policy_version 365700 (0.0037) [2024-06-23 05:07:37,362][15401] Updated weights for policy 0, policy_version 365710 (0.0026) [2024-06-23 05:07:38,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 5991825408. Throughput: 0: 43137.0. Samples: 5991892980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 05:07:39,289][15349] Signal inference workers to stop experience collection... (88850 times) [2024-06-23 05:07:39,290][15349] Signal inference workers to resume experience collection... (88850 times) [2024-06-23 05:07:39,309][15401] InferenceWorker_p0-w0: stopping experience collection (88850 times) [2024-06-23 05:07:39,309][15401] InferenceWorker_p0-w0: resuming experience collection (88850 times) [2024-06-23 05:07:41,776][15401] Updated weights for policy 0, policy_version 365720 (0.0031) [2024-06-23 05:07:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5992022016. Throughput: 0: 42963.2. Samples: 5992148820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 05:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365724_5992022016.pth... [2024-06-23 05:07:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365100_5981798400.pth [2024-06-23 05:07:44,983][15401] Updated weights for policy 0, policy_version 365730 (0.0044) [2024-06-23 05:07:48,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 5992218624. Throughput: 0: 43091.7. Samples: 5992410200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 05:07:49,466][15401] Updated weights for policy 0, policy_version 365740 (0.0042) [2024-06-23 05:07:52,687][15401] Updated weights for policy 0, policy_version 365750 (0.0035) [2024-06-23 05:07:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5992464384. Throughput: 0: 43096.9. Samples: 5992538180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 05:07:57,046][15401] Updated weights for policy 0, policy_version 365760 (0.0027) [2024-06-23 05:07:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5992660992. Throughput: 0: 42932.1. Samples: 5992795960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:07:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 05:08:00,748][15401] Updated weights for policy 0, policy_version 365770 (0.0026) [2024-06-23 05:08:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5992873984. Throughput: 0: 43051.1. Samples: 5993053380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:03,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 05:08:04,549][15401] Updated weights for policy 0, policy_version 365780 (0.0032) [2024-06-23 05:08:08,309][15401] Updated weights for policy 0, policy_version 365790 (0.0035) [2024-06-23 05:08:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5993103360. Throughput: 0: 43088.3. Samples: 5993185240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 05:08:11,914][15401] Updated weights for policy 0, policy_version 365800 (0.0031) [2024-06-23 05:08:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5993316352. Throughput: 0: 43019.2. Samples: 5993441040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 05:08:16,063][15401] Updated weights for policy 0, policy_version 365810 (0.0044) [2024-06-23 05:08:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5993529344. Throughput: 0: 42930.3. Samples: 5993699580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 05:08:19,602][15401] Updated weights for policy 0, policy_version 365820 (0.0037) [2024-06-23 05:08:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5993725952. Throughput: 0: 43011.8. Samples: 5993828520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 05:08:23,721][15401] Updated weights for policy 0, policy_version 365830 (0.0041) [2024-06-23 05:08:27,109][15401] Updated weights for policy 0, policy_version 365840 (0.0027) [2024-06-23 05:08:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5993971712. Throughput: 0: 43017.4. Samples: 5994084600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 05:08:31,220][15401] Updated weights for policy 0, policy_version 365850 (0.0034) [2024-06-23 05:08:33,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5994168320. Throughput: 0: 43026.8. Samples: 5994346400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 05:08:34,994][15401] Updated weights for policy 0, policy_version 365860 (0.0027) [2024-06-23 05:08:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42709.8). Total num frames: 5994381312. Throughput: 0: 42983.9. Samples: 5994472460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:08:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 05:08:38,785][15401] Updated weights for policy 0, policy_version 365870 (0.0038) [2024-06-23 05:08:42,495][15401] Updated weights for policy 0, policy_version 365880 (0.0036) [2024-06-23 05:08:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5994627072. Throughput: 0: 43141.9. Samples: 5994737340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:08:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 05:08:46,553][15401] Updated weights for policy 0, policy_version 365890 (0.0041) [2024-06-23 05:08:48,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43690.8, 300 sec: 42765.0). Total num frames: 5994840064. Throughput: 0: 43038.4. Samples: 5994990100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:08:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 05:08:50,326][15401] Updated weights for policy 0, policy_version 365900 (0.0028) [2024-06-23 05:08:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5995020288. Throughput: 0: 42956.9. Samples: 5995118300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:08:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 05:08:54,102][15401] Updated weights for policy 0, policy_version 365910 (0.0040) [2024-06-23 05:08:57,863][15401] Updated weights for policy 0, policy_version 365920 (0.0035) [2024-06-23 05:08:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5995249664. Throughput: 0: 43128.9. Samples: 5995381840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:08:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 05:09:01,594][15401] Updated weights for policy 0, policy_version 365930 (0.0039) [2024-06-23 05:09:02,696][15349] Signal inference workers to stop experience collection... (88900 times) [2024-06-23 05:09:02,726][15401] InferenceWorker_p0-w0: stopping experience collection (88900 times) [2024-06-23 05:09:02,754][15349] Signal inference workers to resume experience collection... (88900 times) [2024-06-23 05:09:02,755][15401] InferenceWorker_p0-w0: resuming experience collection (88900 times) [2024-06-23 05:09:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 5995479040. Throughput: 0: 42871.9. Samples: 5995628820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 05:09:05,530][15401] Updated weights for policy 0, policy_version 365940 (0.0043) [2024-06-23 05:09:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5995659264. Throughput: 0: 43009.9. Samples: 5995763960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 05:09:09,194][15401] Updated weights for policy 0, policy_version 365950 (0.0033) [2024-06-23 05:09:13,082][15401] Updated weights for policy 0, policy_version 365960 (0.0030) [2024-06-23 05:09:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 5995888640. Throughput: 0: 43091.9. Samples: 5996023840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:13,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 05:09:16,845][15401] Updated weights for policy 0, policy_version 365970 (0.0039) [2024-06-23 05:09:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5996101632. Throughput: 0: 42812.8. Samples: 5996272980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 05:09:21,039][15401] Updated weights for policy 0, policy_version 365980 (0.0032) [2024-06-23 05:09:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 5996314624. Throughput: 0: 42986.8. Samples: 5996406860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 05:09:24,414][15401] Updated weights for policy 0, policy_version 365990 (0.0028) [2024-06-23 05:09:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5996527616. Throughput: 0: 42802.6. Samples: 5996663460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 05:09:28,478][15401] Updated weights for policy 0, policy_version 366000 (0.0032) [2024-06-23 05:09:31,968][15401] Updated weights for policy 0, policy_version 366010 (0.0029) [2024-06-23 05:09:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5996756992. Throughput: 0: 42810.6. Samples: 5996916580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:33,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 05:09:35,956][15401] Updated weights for policy 0, policy_version 366020 (0.0030) [2024-06-23 05:09:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 5996969984. Throughput: 0: 42779.9. Samples: 5997043500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:38,392][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 05:09:39,736][15401] Updated weights for policy 0, policy_version 366030 (0.0042) [2024-06-23 05:09:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5997166592. Throughput: 0: 42661.8. Samples: 5997301620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 05:09:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366038_5997166592.pth... [2024-06-23 05:09:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365412_5986910208.pth [2024-06-23 05:09:44,078][15401] Updated weights for policy 0, policy_version 366040 (0.0048) [2024-06-23 05:09:47,858][15401] Updated weights for policy 0, policy_version 366050 (0.0038) [2024-06-23 05:09:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5997379584. Throughput: 0: 42665.4. Samples: 5997548760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 05:09:51,626][15401] Updated weights for policy 0, policy_version 366060 (0.0042) [2024-06-23 05:09:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5997608960. Throughput: 0: 42596.9. Samples: 5997680820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 05:09:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 05:09:55,378][15401] Updated weights for policy 0, policy_version 366070 (0.0028) [2024-06-23 05:09:58,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42866.8, 300 sec: 42875.5). Total num frames: 5997821952. Throughput: 0: 42507.7. Samples: 5997936860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:09:58,397][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 05:09:59,123][15401] Updated weights for policy 0, policy_version 366080 (0.0023) [2024-06-23 05:10:03,060][15401] Updated weights for policy 0, policy_version 366090 (0.0043) [2024-06-23 05:10:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5998018560. Throughput: 0: 42475.1. Samples: 5998184360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 05:10:06,464][15349] Signal inference workers to stop experience collection... (88950 times) [2024-06-23 05:10:06,498][15401] InferenceWorker_p0-w0: stopping experience collection (88950 times) [2024-06-23 05:10:06,536][15349] Signal inference workers to resume experience collection... (88950 times) [2024-06-23 05:10:06,538][15401] InferenceWorker_p0-w0: resuming experience collection (88950 times) [2024-06-23 05:10:06,675][15401] Updated weights for policy 0, policy_version 366100 (0.0041) [2024-06-23 05:10:08,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5998231552. Throughput: 0: 42332.0. Samples: 5998311800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 05:10:10,907][15401] Updated weights for policy 0, policy_version 366110 (0.0045) [2024-06-23 05:10:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 5998444544. Throughput: 0: 42459.5. Samples: 5998574140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 05:10:14,362][15401] Updated weights for policy 0, policy_version 366120 (0.0033) [2024-06-23 05:10:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5998657536. Throughput: 0: 42441.3. Samples: 5998826440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 05:10:18,397][15401] Updated weights for policy 0, policy_version 366130 (0.0026) [2024-06-23 05:10:22,165][15401] Updated weights for policy 0, policy_version 366140 (0.0028) [2024-06-23 05:10:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5998886912. Throughput: 0: 42415.5. Samples: 5998952100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 05:10:25,951][15401] Updated weights for policy 0, policy_version 366150 (0.0032) [2024-06-23 05:10:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 5999083520. Throughput: 0: 42500.1. Samples: 5999214120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 05:10:29,829][15401] Updated weights for policy 0, policy_version 366160 (0.0038) [2024-06-23 05:10:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5999296512. Throughput: 0: 42650.2. Samples: 5999468020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 05:10:34,234][15401] Updated weights for policy 0, policy_version 366170 (0.0028) [2024-06-23 05:10:37,525][15401] Updated weights for policy 0, policy_version 366180 (0.0049) [2024-06-23 05:10:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 5999509504. Throughput: 0: 42362.7. Samples: 5999587140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:10:41,738][15401] Updated weights for policy 0, policy_version 366190 (0.0039) [2024-06-23 05:10:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5999706112. Throughput: 0: 42444.4. Samples: 5999846580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 05:10:45,150][15401] Updated weights for policy 0, policy_version 366200 (0.0029) [2024-06-23 05:10:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5999919104. Throughput: 0: 42531.8. Samples: 6000098300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 05:10:49,270][15401] Updated weights for policy 0, policy_version 366210 (0.0035) [2024-06-23 05:10:52,859][15401] Updated weights for policy 0, policy_version 366220 (0.0036) [2024-06-23 05:10:53,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6000164864. Throughput: 0: 42501.6. Samples: 6000224380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 05:10:56,927][15401] Updated weights for policy 0, policy_version 366230 (0.0041) [2024-06-23 05:10:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 6000361472. Throughput: 0: 42436.4. Samples: 6000483780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:10:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 05:11:00,450][15401] Updated weights for policy 0, policy_version 366240 (0.0023) [2024-06-23 05:11:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6000574464. Throughput: 0: 42594.7. Samples: 6000743200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:11:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 05:11:04,480][15401] Updated weights for policy 0, policy_version 366250 (0.0038) [2024-06-23 05:11:07,937][15401] Updated weights for policy 0, policy_version 366260 (0.0037) [2024-06-23 05:11:08,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6000803840. Throughput: 0: 42733.0. Samples: 6000875180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:11:08,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 05:11:12,570][15401] Updated weights for policy 0, policy_version 366270 (0.0044) [2024-06-23 05:11:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6001000448. Throughput: 0: 42576.9. Samples: 6001130080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:11:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 05:11:15,724][15401] Updated weights for policy 0, policy_version 366280 (0.0039) [2024-06-23 05:11:18,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6001213440. Throughput: 0: 42553.9. Samples: 6001382940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:11:20,319][15401] Updated weights for policy 0, policy_version 366290 (0.0045) [2024-06-23 05:11:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6001426432. Throughput: 0: 42763.9. Samples: 6001511520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 05:11:23,677][15401] Updated weights for policy 0, policy_version 366300 (0.0031) [2024-06-23 05:11:28,121][15401] Updated weights for policy 0, policy_version 366310 (0.0034) [2024-06-23 05:11:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6001623040. Throughput: 0: 42693.2. Samples: 6001767780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 05:11:31,262][15401] Updated weights for policy 0, policy_version 366320 (0.0031) [2024-06-23 05:11:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6001868800. Throughput: 0: 42819.2. Samples: 6002025160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 05:11:35,913][15401] Updated weights for policy 0, policy_version 366330 (0.0036) [2024-06-23 05:11:36,613][15349] Signal inference workers to stop experience collection... (89000 times) [2024-06-23 05:11:36,613][15349] Signal inference workers to resume experience collection... (89000 times) [2024-06-23 05:11:36,655][15401] InferenceWorker_p0-w0: stopping experience collection (89000 times) [2024-06-23 05:11:36,660][15401] InferenceWorker_p0-w0: resuming experience collection (89000 times) [2024-06-23 05:11:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6002081792. Throughput: 0: 42983.6. Samples: 6002158640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 05:11:38,721][15401] Updated weights for policy 0, policy_version 366340 (0.0038) [2024-06-23 05:11:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6002262016. Throughput: 0: 42906.4. Samples: 6002414560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 05:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366349_6002262016.pth... [2024-06-23 05:11:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000365724_5992022016.pth [2024-06-23 05:11:43,640][15401] Updated weights for policy 0, policy_version 366350 (0.0035) [2024-06-23 05:11:46,222][15401] Updated weights for policy 0, policy_version 366360 (0.0033) [2024-06-23 05:11:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 6002507776. Throughput: 0: 42783.6. Samples: 6002668460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 05:11:51,064][15401] Updated weights for policy 0, policy_version 366370 (0.0039) [2024-06-23 05:11:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6002737152. Throughput: 0: 42783.1. Samples: 6002800320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 05:11:53,680][15401] Updated weights for policy 0, policy_version 366380 (0.0031) [2024-06-23 05:11:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6002917376. Throughput: 0: 42807.4. Samples: 6003056420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:11:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 05:11:58,554][15401] Updated weights for policy 0, policy_version 366390 (0.0023) [2024-06-23 05:12:01,826][15401] Updated weights for policy 0, policy_version 366400 (0.0039) [2024-06-23 05:12:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6003146752. Throughput: 0: 42753.6. Samples: 6003306860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 05:12:06,166][15401] Updated weights for policy 0, policy_version 366410 (0.0036) [2024-06-23 05:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42599.9, 300 sec: 42820.6). Total num frames: 6003359744. Throughput: 0: 42825.8. Samples: 6003438680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 05:12:09,402][15401] Updated weights for policy 0, policy_version 366420 (0.0036) [2024-06-23 05:12:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6003556352. Throughput: 0: 42784.0. Samples: 6003693060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 05:12:13,774][15401] Updated weights for policy 0, policy_version 366430 (0.0029) [2024-06-23 05:12:17,214][15401] Updated weights for policy 0, policy_version 366440 (0.0041) [2024-06-23 05:12:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 6003785728. Throughput: 0: 42652.1. Samples: 6003944500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:18,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 05:12:21,663][15401] Updated weights for policy 0, policy_version 366450 (0.0036) [2024-06-23 05:12:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6004015104. Throughput: 0: 42563.7. Samples: 6004074000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 05:12:24,869][15401] Updated weights for policy 0, policy_version 366460 (0.0032) [2024-06-23 05:12:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6004178944. Throughput: 0: 42612.7. Samples: 6004332140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:28,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 05:12:29,223][15401] Updated weights for policy 0, policy_version 366470 (0.0040) [2024-06-23 05:12:32,816][15401] Updated weights for policy 0, policy_version 366480 (0.0031) [2024-06-23 05:12:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 6004424704. Throughput: 0: 42510.0. Samples: 6004581420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 05:12:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 05:12:36,783][15401] Updated weights for policy 0, policy_version 366490 (0.0044) [2024-06-23 05:12:38,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6004654080. Throughput: 0: 42653.7. Samples: 6004719740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:12:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 05:12:40,401][15401] Updated weights for policy 0, policy_version 366500 (0.0027) [2024-06-23 05:12:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6004834304. Throughput: 0: 42495.6. Samples: 6004968820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:12:43,392][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 05:12:44,548][15401] Updated weights for policy 0, policy_version 366510 (0.0031) [2024-06-23 05:12:48,258][15401] Updated weights for policy 0, policy_version 366520 (0.0047) [2024-06-23 05:12:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6005063680. Throughput: 0: 42701.7. Samples: 6005228440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:12:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:12:52,060][15401] Updated weights for policy 0, policy_version 366530 (0.0036) [2024-06-23 05:12:53,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6005293056. Throughput: 0: 42821.0. Samples: 6005365620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:12:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:12:55,814][15401] Updated weights for policy 0, policy_version 366540 (0.0034) [2024-06-23 05:12:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6005473280. Throughput: 0: 42672.5. Samples: 6005613320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:12:58,391][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 05:12:59,155][15349] Signal inference workers to stop experience collection... (89050 times) [2024-06-23 05:12:59,212][15401] InferenceWorker_p0-w0: stopping experience collection (89050 times) [2024-06-23 05:12:59,213][15349] Signal inference workers to resume experience collection... (89050 times) [2024-06-23 05:12:59,223][15401] InferenceWorker_p0-w0: resuming experience collection (89050 times) [2024-06-23 05:12:59,702][15401] Updated weights for policy 0, policy_version 366550 (0.0028) [2024-06-23 05:13:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6005702656. Throughput: 0: 42901.3. Samples: 6005875060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 05:13:03,496][15401] Updated weights for policy 0, policy_version 366560 (0.0030) [2024-06-23 05:13:07,494][15401] Updated weights for policy 0, policy_version 366570 (0.0049) [2024-06-23 05:13:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6005948416. Throughput: 0: 42871.8. Samples: 6006003240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 05:13:11,143][15401] Updated weights for policy 0, policy_version 366580 (0.0027) [2024-06-23 05:13:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6006128640. Throughput: 0: 42652.5. Samples: 6006251500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:13,400][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 05:13:15,197][15401] Updated weights for policy 0, policy_version 366590 (0.0029) [2024-06-23 05:13:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6006341632. Throughput: 0: 42881.8. Samples: 6006511100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 05:13:19,071][15401] Updated weights for policy 0, policy_version 366600 (0.0040) [2024-06-23 05:13:22,852][15401] Updated weights for policy 0, policy_version 366610 (0.0043) [2024-06-23 05:13:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6006554624. Throughput: 0: 42652.5. Samples: 6006639100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 05:13:26,659][15401] Updated weights for policy 0, policy_version 366620 (0.0025) [2024-06-23 05:13:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6006784000. Throughput: 0: 42826.2. Samples: 6006895900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 05:13:30,495][15401] Updated weights for policy 0, policy_version 366630 (0.0046) [2024-06-23 05:13:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6006996992. Throughput: 0: 42712.2. Samples: 6007150480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 05:13:34,096][15401] Updated weights for policy 0, policy_version 366640 (0.0033) [2024-06-23 05:13:38,016][15401] Updated weights for policy 0, policy_version 366650 (0.0022) [2024-06-23 05:13:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6007193600. Throughput: 0: 42563.2. Samples: 6007280960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 05:13:41,483][15401] Updated weights for policy 0, policy_version 366660 (0.0030) [2024-06-23 05:13:43,394][15132] Fps is (10 sec: 40941.9, 60 sec: 42870.1, 300 sec: 42597.8). Total num frames: 6007406592. Throughput: 0: 42690.6. Samples: 6007534580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:43,394][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 05:13:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366663_6007406592.pth... [2024-06-23 05:13:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366038_5997166592.pth [2024-06-23 05:13:45,586][15401] Updated weights for policy 0, policy_version 366670 (0.0042) [2024-06-23 05:13:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6007635968. Throughput: 0: 42624.9. Samples: 6007793180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 05:13:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 05:13:49,011][15401] Updated weights for policy 0, policy_version 366680 (0.0027) [2024-06-23 05:13:53,337][15401] Updated weights for policy 0, policy_version 366690 (0.0040) [2024-06-23 05:13:53,389][15132] Fps is (10 sec: 44256.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6007848960. Throughput: 0: 42703.3. Samples: 6007924880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:13:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 05:13:56,874][15401] Updated weights for policy 0, policy_version 366700 (0.0036) [2024-06-23 05:13:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 6008045568. Throughput: 0: 42839.1. Samples: 6008179360. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:13:58,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 05:14:00,871][15401] Updated weights for policy 0, policy_version 366710 (0.0023) [2024-06-23 05:14:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6008274944. Throughput: 0: 42716.9. Samples: 6008433360. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 05:14:04,672][15401] Updated weights for policy 0, policy_version 366720 (0.0034) [2024-06-23 05:14:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 6008471552. Throughput: 0: 42796.4. Samples: 6008564940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 05:14:08,659][15401] Updated weights for policy 0, policy_version 366730 (0.0050) [2024-06-23 05:14:12,239][15401] Updated weights for policy 0, policy_version 366740 (0.0028) [2024-06-23 05:14:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6008700928. Throughput: 0: 42778.6. Samples: 6008820940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 05:14:14,353][15349] Signal inference workers to stop experience collection... (89100 times) [2024-06-23 05:14:14,357][15349] Signal inference workers to resume experience collection... (89100 times) [2024-06-23 05:14:14,378][15401] InferenceWorker_p0-w0: stopping experience collection (89100 times) [2024-06-23 05:14:14,378][15401] InferenceWorker_p0-w0: resuming experience collection (89100 times) [2024-06-23 05:14:16,332][15401] Updated weights for policy 0, policy_version 366750 (0.0028) [2024-06-23 05:14:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6008930304. Throughput: 0: 42864.4. Samples: 6009079380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 05:14:19,738][15401] Updated weights for policy 0, policy_version 366760 (0.0021) [2024-06-23 05:14:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6009126912. Throughput: 0: 42963.9. Samples: 6009214340. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 05:14:24,041][15401] Updated weights for policy 0, policy_version 366770 (0.0029) [2024-06-23 05:14:27,343][15401] Updated weights for policy 0, policy_version 366780 (0.0037) [2024-06-23 05:14:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6009339904. Throughput: 0: 42908.5. Samples: 6009465280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 05:14:31,649][15401] Updated weights for policy 0, policy_version 366790 (0.0036) [2024-06-23 05:14:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6009569280. Throughput: 0: 42878.1. Samples: 6009722700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:33,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 05:14:34,907][15401] Updated weights for policy 0, policy_version 366800 (0.0037) [2024-06-23 05:14:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6009749504. Throughput: 0: 42863.8. Samples: 6009853760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:38,394][15132] Avg episode reward: [(0, '0.223')] [2024-06-23 05:14:39,216][15401] Updated weights for policy 0, policy_version 366810 (0.0024) [2024-06-23 05:14:42,667][15401] Updated weights for policy 0, policy_version 366820 (0.0036) [2024-06-23 05:14:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43147.6, 300 sec: 42765.0). Total num frames: 6009995264. Throughput: 0: 42874.3. Samples: 6010108600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 05:14:46,895][15401] Updated weights for policy 0, policy_version 366830 (0.0028) [2024-06-23 05:14:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6010208256. Throughput: 0: 42851.5. Samples: 6010361680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:48,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 05:14:50,366][15401] Updated weights for policy 0, policy_version 366840 (0.0030) [2024-06-23 05:14:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42654.8). Total num frames: 6010404864. Throughput: 0: 42894.9. Samples: 6010495220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 05:14:54,489][15401] Updated weights for policy 0, policy_version 366850 (0.0039) [2024-06-23 05:14:58,098][15401] Updated weights for policy 0, policy_version 366860 (0.0027) [2024-06-23 05:14:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43144.5, 300 sec: 42764.7). Total num frames: 6010634240. Throughput: 0: 42914.2. Samples: 6010752180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:14:58,393][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 05:15:02,247][15401] Updated weights for policy 0, policy_version 366870 (0.0032) [2024-06-23 05:15:03,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6010830848. Throughput: 0: 42810.1. Samples: 6011005940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:15:03,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 05:15:05,823][15401] Updated weights for policy 0, policy_version 366880 (0.0026) [2024-06-23 05:15:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6011043840. Throughput: 0: 42591.2. Samples: 6011130940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-23 05:15:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 05:15:10,244][15401] Updated weights for policy 0, policy_version 366890 (0.0034) [2024-06-23 05:15:13,358][15401] Updated weights for policy 0, policy_version 366900 (0.0029) [2024-06-23 05:15:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6011289600. Throughput: 0: 42770.3. Samples: 6011389940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 05:15:17,753][15401] Updated weights for policy 0, policy_version 366910 (0.0032) [2024-06-23 05:15:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6011469824. Throughput: 0: 42776.4. Samples: 6011647640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 05:15:20,994][15401] Updated weights for policy 0, policy_version 366920 (0.0044) [2024-06-23 05:15:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6011699200. Throughput: 0: 42608.0. Samples: 6011771120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 05:15:25,362][15401] Updated weights for policy 0, policy_version 366930 (0.0034) [2024-06-23 05:15:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6011928576. Throughput: 0: 42804.1. Samples: 6012034780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 05:15:28,417][15401] Updated weights for policy 0, policy_version 366940 (0.0033) [2024-06-23 05:15:33,083][15401] Updated weights for policy 0, policy_version 366950 (0.0025) [2024-06-23 05:15:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6012108800. Throughput: 0: 42863.2. Samples: 6012290520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 05:15:35,412][15349] Signal inference workers to stop experience collection... (89150 times) [2024-06-23 05:15:35,412][15349] Signal inference workers to resume experience collection... (89150 times) [2024-06-23 05:15:35,429][15401] InferenceWorker_p0-w0: stopping experience collection (89150 times) [2024-06-23 05:15:35,429][15401] InferenceWorker_p0-w0: resuming experience collection (89150 times) [2024-06-23 05:15:35,997][15401] Updated weights for policy 0, policy_version 366960 (0.0038) [2024-06-23 05:15:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6012354560. Throughput: 0: 42546.0. Samples: 6012409780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 05:15:41,026][15401] Updated weights for policy 0, policy_version 366970 (0.0034) [2024-06-23 05:15:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6012551168. Throughput: 0: 42639.9. Samples: 6012670880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 05:15:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366978_6012567552.pth... [2024-06-23 05:15:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366349_6002262016.pth [2024-06-23 05:15:44,452][15401] Updated weights for policy 0, policy_version 366980 (0.0043) [2024-06-23 05:15:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6012747776. Throughput: 0: 42628.4. Samples: 6012924120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 05:15:48,742][15401] Updated weights for policy 0, policy_version 366990 (0.0044) [2024-06-23 05:15:52,184][15401] Updated weights for policy 0, policy_version 367000 (0.0031) [2024-06-23 05:15:53,389][15132] Fps is (10 sec: 45876.4, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 6013009920. Throughput: 0: 42645.0. Samples: 6013049960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 05:15:56,303][15401] Updated weights for policy 0, policy_version 367010 (0.0024) [2024-06-23 05:15:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 6013190144. Throughput: 0: 42655.3. Samples: 6013309420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:15:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 05:15:59,859][15401] Updated weights for policy 0, policy_version 367020 (0.0048) [2024-06-23 05:16:03,390][15132] Fps is (10 sec: 37682.2, 60 sec: 42600.0, 300 sec: 42654.2). Total num frames: 6013386752. Throughput: 0: 42719.9. Samples: 6013570040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:16:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 05:16:03,979][15401] Updated weights for policy 0, policy_version 367030 (0.0025) [2024-06-23 05:16:07,464][15401] Updated weights for policy 0, policy_version 367040 (0.0031) [2024-06-23 05:16:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6013632512. Throughput: 0: 42707.6. Samples: 6013692960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:16:08,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 05:16:11,579][15401] Updated weights for policy 0, policy_version 367050 (0.0046) [2024-06-23 05:16:13,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6013829120. Throughput: 0: 42679.5. Samples: 6013955360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:16:13,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 05:16:15,054][15401] Updated weights for policy 0, policy_version 367060 (0.0031) [2024-06-23 05:16:18,392][15132] Fps is (10 sec: 39313.7, 60 sec: 42597.0, 300 sec: 42709.2). Total num frames: 6014025728. Throughput: 0: 42669.6. Samples: 6014210740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:16:18,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 05:16:19,083][15401] Updated weights for policy 0, policy_version 367070 (0.0029) [2024-06-23 05:16:22,756][15401] Updated weights for policy 0, policy_version 367080 (0.0024) [2024-06-23 05:16:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6014271488. Throughput: 0: 42823.9. Samples: 6014336860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-23 05:16:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 05:16:26,764][15401] Updated weights for policy 0, policy_version 367090 (0.0024) [2024-06-23 05:16:28,389][15132] Fps is (10 sec: 45884.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6014484480. Throughput: 0: 42760.2. Samples: 6014595080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:28,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 05:16:30,598][15401] Updated weights for policy 0, policy_version 367100 (0.0035) [2024-06-23 05:16:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6014681088. Throughput: 0: 42814.8. Samples: 6014850780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 05:16:34,530][15401] Updated weights for policy 0, policy_version 367110 (0.0030) [2024-06-23 05:16:38,184][15401] Updated weights for policy 0, policy_version 367120 (0.0038) [2024-06-23 05:16:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6014910464. Throughput: 0: 42845.2. Samples: 6014978000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 05:16:42,053][15401] Updated weights for policy 0, policy_version 367130 (0.0041) [2024-06-23 05:16:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6015107072. Throughput: 0: 42762.5. Samples: 6015233740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 05:16:45,582][15401] Updated weights for policy 0, policy_version 367140 (0.0029) [2024-06-23 05:16:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6015336448. Throughput: 0: 42665.5. Samples: 6015489980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:48,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 05:16:49,617][15401] Updated weights for policy 0, policy_version 367150 (0.0036) [2024-06-23 05:16:53,220][15401] Updated weights for policy 0, policy_version 367160 (0.0041) [2024-06-23 05:16:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6015549440. Throughput: 0: 42848.1. Samples: 6015621120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 05:16:57,357][15401] Updated weights for policy 0, policy_version 367170 (0.0027) [2024-06-23 05:16:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6015762432. Throughput: 0: 42678.3. Samples: 6015875880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:16:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 05:17:00,939][15401] Updated weights for policy 0, policy_version 367180 (0.0022) [2024-06-23 05:17:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6015959040. Throughput: 0: 42653.4. Samples: 6016130060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 05:17:04,969][15401] Updated weights for policy 0, policy_version 367190 (0.0028) [2024-06-23 05:17:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6016188416. Throughput: 0: 42698.5. Samples: 6016258300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 05:17:08,708][15401] Updated weights for policy 0, policy_version 367200 (0.0039) [2024-06-23 05:17:12,324][15401] Updated weights for policy 0, policy_version 367210 (0.0031) [2024-06-23 05:17:13,029][15349] Signal inference workers to stop experience collection... (89200 times) [2024-06-23 05:17:13,030][15349] Signal inference workers to resume experience collection... (89200 times) [2024-06-23 05:17:13,051][15401] InferenceWorker_p0-w0: stopping experience collection (89200 times) [2024-06-23 05:17:13,051][15401] InferenceWorker_p0-w0: resuming experience collection (89200 times) [2024-06-23 05:17:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6016417792. Throughput: 0: 42653.2. Samples: 6016514480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 05:17:16,615][15401] Updated weights for policy 0, policy_version 367220 (0.0041) [2024-06-23 05:17:18,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.0, 300 sec: 42709.5). Total num frames: 6016614400. Throughput: 0: 42717.3. Samples: 6016773060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 05:17:20,016][15401] Updated weights for policy 0, policy_version 367230 (0.0030) [2024-06-23 05:17:23,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 6016811008. Throughput: 0: 42687.9. Samples: 6016899060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:23,393][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 05:17:24,064][15401] Updated weights for policy 0, policy_version 367240 (0.0025) [2024-06-23 05:17:28,018][15401] Updated weights for policy 0, policy_version 367250 (0.0038) [2024-06-23 05:17:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6017040384. Throughput: 0: 42711.1. Samples: 6017155740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:28,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 05:17:31,765][15401] Updated weights for policy 0, policy_version 367260 (0.0032) [2024-06-23 05:17:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6017236992. Throughput: 0: 42719.0. Samples: 6017412340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 05:17:35,406][15401] Updated weights for policy 0, policy_version 367270 (0.0043) [2024-06-23 05:17:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 6017449984. Throughput: 0: 42657.2. Samples: 6017540700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 05:17:39,274][15401] Updated weights for policy 0, policy_version 367280 (0.0033) [2024-06-23 05:17:43,058][15401] Updated weights for policy 0, policy_version 367290 (0.0028) [2024-06-23 05:17:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6017679360. Throughput: 0: 42729.3. Samples: 6017798700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 05:17:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 05:17:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367290_6017679360.pth... [2024-06-23 05:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366663_6007406592.pth [2024-06-23 05:17:47,033][15401] Updated weights for policy 0, policy_version 367300 (0.0038) [2024-06-23 05:17:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6017892352. Throughput: 0: 42567.6. Samples: 6018045600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:17:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 05:17:51,210][15401] Updated weights for policy 0, policy_version 367310 (0.0034) [2024-06-23 05:17:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6018088960. Throughput: 0: 42695.7. Samples: 6018179600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:17:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 05:17:54,859][15401] Updated weights for policy 0, policy_version 367320 (0.0024) [2024-06-23 05:17:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6018301952. Throughput: 0: 42616.1. Samples: 6018432200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:17:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 05:17:58,671][15401] Updated weights for policy 0, policy_version 367330 (0.0032) [2024-06-23 05:18:02,494][15401] Updated weights for policy 0, policy_version 367340 (0.0036) [2024-06-23 05:18:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6018531328. Throughput: 0: 42552.4. Samples: 6018687920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 05:18:06,535][15401] Updated weights for policy 0, policy_version 367350 (0.0032) [2024-06-23 05:18:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6018744320. Throughput: 0: 42698.3. Samples: 6018820380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 05:18:10,000][15401] Updated weights for policy 0, policy_version 367360 (0.0031) [2024-06-23 05:18:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6018940928. Throughput: 0: 42672.3. Samples: 6019076000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 05:18:14,217][15401] Updated weights for policy 0, policy_version 367370 (0.0037) [2024-06-23 05:18:17,667][15401] Updated weights for policy 0, policy_version 367380 (0.0029) [2024-06-23 05:18:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6019186688. Throughput: 0: 42539.2. Samples: 6019326600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 05:18:22,070][15401] Updated weights for policy 0, policy_version 367390 (0.0030) [2024-06-23 05:18:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 6019366912. Throughput: 0: 42695.6. Samples: 6019462000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 05:18:25,310][15401] Updated weights for policy 0, policy_version 367400 (0.0049) [2024-06-23 05:18:28,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6019563520. Throughput: 0: 42468.4. Samples: 6019709780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:28,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 05:18:29,625][15401] Updated weights for policy 0, policy_version 367410 (0.0036) [2024-06-23 05:18:32,348][15349] Signal inference workers to stop experience collection... (89250 times) [2024-06-23 05:18:32,349][15349] Signal inference workers to resume experience collection... (89250 times) [2024-06-23 05:18:32,377][15401] InferenceWorker_p0-w0: stopping experience collection (89250 times) [2024-06-23 05:18:32,377][15401] InferenceWorker_p0-w0: resuming experience collection (89250 times) [2024-06-23 05:18:32,898][15401] Updated weights for policy 0, policy_version 367420 (0.0038) [2024-06-23 05:18:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6019825664. Throughput: 0: 42563.2. Samples: 6019960940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 05:18:37,226][15401] Updated weights for policy 0, policy_version 367430 (0.0023) [2024-06-23 05:18:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.6). Total num frames: 6019989504. Throughput: 0: 42523.7. Samples: 6020093160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 05:18:40,527][15401] Updated weights for policy 0, policy_version 367440 (0.0038) [2024-06-23 05:18:43,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6020218880. Throughput: 0: 42566.5. Samples: 6020347700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 05:18:44,834][15401] Updated weights for policy 0, policy_version 367450 (0.0034) [2024-06-23 05:18:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6020448256. Throughput: 0: 42521.4. Samples: 6020601380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:48,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 05:18:48,461][15401] Updated weights for policy 0, policy_version 367460 (0.0032) [2024-06-23 05:18:52,830][15401] Updated weights for policy 0, policy_version 367470 (0.0042) [2024-06-23 05:18:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 6020628480. Throughput: 0: 42434.2. Samples: 6020729920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 05:18:56,077][15401] Updated weights for policy 0, policy_version 367480 (0.0046) [2024-06-23 05:18:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6020874240. Throughput: 0: 42315.2. Samples: 6020980180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:18:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 05:19:00,411][15401] Updated weights for policy 0, policy_version 367490 (0.0033) [2024-06-23 05:19:03,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6021087232. Throughput: 0: 42522.1. Samples: 6021240200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 05:19:03,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 05:19:04,081][15401] Updated weights for policy 0, policy_version 367500 (0.0038) [2024-06-23 05:19:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6021267456. Throughput: 0: 42267.6. Samples: 6021364040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 05:19:08,514][15401] Updated weights for policy 0, policy_version 367510 (0.0036) [2024-06-23 05:19:11,651][15401] Updated weights for policy 0, policy_version 367520 (0.0038) [2024-06-23 05:19:13,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6021496832. Throughput: 0: 42362.7. Samples: 6021616100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 05:19:16,217][15401] Updated weights for policy 0, policy_version 367530 (0.0040) [2024-06-23 05:19:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 6021709824. Throughput: 0: 42487.9. Samples: 6021872900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 05:19:19,372][15401] Updated weights for policy 0, policy_version 367540 (0.0039) [2024-06-23 05:19:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6021906432. Throughput: 0: 42302.1. Samples: 6021996760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 05:19:23,851][15401] Updated weights for policy 0, policy_version 367550 (0.0023) [2024-06-23 05:19:27,215][15401] Updated weights for policy 0, policy_version 367560 (0.0027) [2024-06-23 05:19:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6022152192. Throughput: 0: 42291.2. Samples: 6022250800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 05:19:32,082][15401] Updated weights for policy 0, policy_version 367570 (0.0031) [2024-06-23 05:19:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6022348800. Throughput: 0: 42372.0. Samples: 6022508120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:33,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 05:19:34,950][15401] Updated weights for policy 0, policy_version 367580 (0.0035) [2024-06-23 05:19:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6022561792. Throughput: 0: 42243.9. Samples: 6022630900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 05:19:39,641][15401] Updated weights for policy 0, policy_version 367590 (0.0038) [2024-06-23 05:19:42,827][15401] Updated weights for policy 0, policy_version 367600 (0.0041) [2024-06-23 05:19:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6022774784. Throughput: 0: 42410.2. Samples: 6022888640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:19:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367602_6022791168.pth... [2024-06-23 05:19:43,525][15349] Signal inference workers to stop experience collection... (89300 times) [2024-06-23 05:19:43,525][15349] Signal inference workers to resume experience collection... (89300 times) [2024-06-23 05:19:43,539][15401] InferenceWorker_p0-w0: stopping experience collection (89300 times) [2024-06-23 05:19:43,539][15401] InferenceWorker_p0-w0: resuming experience collection (89300 times) [2024-06-23 05:19:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000366978_6012567552.pth [2024-06-23 05:19:47,177][15401] Updated weights for policy 0, policy_version 367610 (0.0033) [2024-06-23 05:19:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6022971392. Throughput: 0: 42438.9. Samples: 6023149840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:48,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 05:19:50,314][15401] Updated weights for policy 0, policy_version 367620 (0.0040) [2024-06-23 05:19:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 6023200768. Throughput: 0: 42453.8. Samples: 6023274460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:53,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 05:19:54,833][15401] Updated weights for policy 0, policy_version 367630 (0.0032) [2024-06-23 05:19:57,905][15401] Updated weights for policy 0, policy_version 367640 (0.0044) [2024-06-23 05:19:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 6023413760. Throughput: 0: 42582.7. Samples: 6023532320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:19:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 05:20:02,254][15401] Updated weights for policy 0, policy_version 367650 (0.0043) [2024-06-23 05:20:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41781.0, 300 sec: 42542.9). Total num frames: 6023593984. Throughput: 0: 42762.8. Samples: 6023797220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:20:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 05:20:05,769][15401] Updated weights for policy 0, policy_version 367660 (0.0031) [2024-06-23 05:20:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6023839744. Throughput: 0: 42658.4. Samples: 6023916380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:20:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 05:20:10,002][15401] Updated weights for policy 0, policy_version 367670 (0.0026) [2024-06-23 05:20:13,187][15401] Updated weights for policy 0, policy_version 367680 (0.0040) [2024-06-23 05:20:13,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6024069120. Throughput: 0: 42642.7. Samples: 6024169720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:20:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 05:20:17,414][15401] Updated weights for policy 0, policy_version 367690 (0.0023) [2024-06-23 05:20:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6024249344. Throughput: 0: 42739.6. Samples: 6024431400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-23 05:20:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 05:20:20,984][15401] Updated weights for policy 0, policy_version 367700 (0.0041) [2024-06-23 05:20:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6024478720. Throughput: 0: 42681.7. Samples: 6024551580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 05:20:24,757][15401] Updated weights for policy 0, policy_version 367710 (0.0032) [2024-06-23 05:20:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 6024691712. Throughput: 0: 42762.8. Samples: 6024812960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 05:20:28,646][15401] Updated weights for policy 0, policy_version 367720 (0.0037) [2024-06-23 05:20:32,755][15401] Updated weights for policy 0, policy_version 367730 (0.0036) [2024-06-23 05:20:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6024888320. Throughput: 0: 42721.3. Samples: 6025072300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 05:20:36,310][15401] Updated weights for policy 0, policy_version 367740 (0.0043) [2024-06-23 05:20:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6025117696. Throughput: 0: 42726.2. Samples: 6025197140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 05:20:40,271][15401] Updated weights for policy 0, policy_version 367750 (0.0042) [2024-06-23 05:20:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6025347072. Throughput: 0: 42942.5. Samples: 6025464740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 05:20:43,840][15401] Updated weights for policy 0, policy_version 367760 (0.0030) [2024-06-23 05:20:47,956][15401] Updated weights for policy 0, policy_version 367770 (0.0036) [2024-06-23 05:20:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 6025543680. Throughput: 0: 42692.4. Samples: 6025718380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 05:20:51,717][15401] Updated weights for policy 0, policy_version 367780 (0.0026) [2024-06-23 05:20:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6025756672. Throughput: 0: 42824.3. Samples: 6025843480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 05:20:55,336][15401] Updated weights for policy 0, policy_version 367790 (0.0034) [2024-06-23 05:20:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6025969664. Throughput: 0: 43000.4. Samples: 6026104740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:20:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 05:20:59,353][15401] Updated weights for policy 0, policy_version 367800 (0.0022) [2024-06-23 05:21:03,079][15401] Updated weights for policy 0, policy_version 367810 (0.0043) [2024-06-23 05:21:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 6026215424. Throughput: 0: 42718.1. Samples: 6026353720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 05:21:06,886][15401] Updated weights for policy 0, policy_version 367820 (0.0039) [2024-06-23 05:21:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6026395648. Throughput: 0: 43010.0. Samples: 6026487020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 05:21:10,682][15401] Updated weights for policy 0, policy_version 367830 (0.0033) [2024-06-23 05:21:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42654.2). Total num frames: 6026608640. Throughput: 0: 43073.3. Samples: 6026751260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:13,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-23 05:21:13,604][15349] Signal inference workers to stop experience collection... (89350 times) [2024-06-23 05:21:13,659][15401] InferenceWorker_p0-w0: stopping experience collection (89350 times) [2024-06-23 05:21:13,714][15349] Signal inference workers to resume experience collection... (89350 times) [2024-06-23 05:21:13,714][15401] InferenceWorker_p0-w0: resuming experience collection (89350 times) [2024-06-23 05:21:14,310][15401] Updated weights for policy 0, policy_version 367840 (0.0031) [2024-06-23 05:21:18,171][15401] Updated weights for policy 0, policy_version 367850 (0.0036) [2024-06-23 05:21:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 6026854400. Throughput: 0: 42811.1. Samples: 6026998800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 05:21:22,111][15401] Updated weights for policy 0, policy_version 367860 (0.0026) [2024-06-23 05:21:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 6027051008. Throughput: 0: 43062.3. Samples: 6027134940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 05:21:26,175][15401] Updated weights for policy 0, policy_version 367870 (0.0048) [2024-06-23 05:21:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6027247616. Throughput: 0: 42764.6. Samples: 6027389140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 05:21:30,054][15401] Updated weights for policy 0, policy_version 367880 (0.0022) [2024-06-23 05:21:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 6027493376. Throughput: 0: 42765.7. Samples: 6027642840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 05:21:33,507][15401] Updated weights for policy 0, policy_version 367890 (0.0037) [2024-06-23 05:21:37,556][15401] Updated weights for policy 0, policy_version 367900 (0.0036) [2024-06-23 05:21:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6027689984. Throughput: 0: 43000.0. Samples: 6027778580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 05:21:38,392][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 05:21:41,349][15401] Updated weights for policy 0, policy_version 367910 (0.0041) [2024-06-23 05:21:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 6027886592. Throughput: 0: 42872.4. Samples: 6028034000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:21:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 05:21:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367914_6027902976.pth... [2024-06-23 05:21:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367290_6017679360.pth [2024-06-23 05:21:45,049][15401] Updated weights for policy 0, policy_version 367920 (0.0037) [2024-06-23 05:21:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6028132352. Throughput: 0: 42940.4. Samples: 6028286040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:21:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 05:21:48,885][15401] Updated weights for policy 0, policy_version 367930 (0.0036) [2024-06-23 05:21:52,551][15401] Updated weights for policy 0, policy_version 367940 (0.0025) [2024-06-23 05:21:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6028345344. Throughput: 0: 43074.0. Samples: 6028425360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:21:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 05:21:56,527][15401] Updated weights for policy 0, policy_version 367950 (0.0027) [2024-06-23 05:21:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6028541952. Throughput: 0: 42907.0. Samples: 6028682080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:21:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 05:22:00,077][15401] Updated weights for policy 0, policy_version 367960 (0.0032) [2024-06-23 05:22:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6028771328. Throughput: 0: 42979.1. Samples: 6028932860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:03,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 05:22:04,257][15401] Updated weights for policy 0, policy_version 367970 (0.0038) [2024-06-23 05:22:08,165][15401] Updated weights for policy 0, policy_version 367980 (0.0041) [2024-06-23 05:22:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 6029000704. Throughput: 0: 42820.4. Samples: 6029061860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 05:22:12,138][15401] Updated weights for policy 0, policy_version 367990 (0.0035) [2024-06-23 05:22:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6029180928. Throughput: 0: 42726.2. Samples: 6029311820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 05:22:15,769][15401] Updated weights for policy 0, policy_version 368000 (0.0026) [2024-06-23 05:22:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6029410304. Throughput: 0: 42841.3. Samples: 6029570700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 05:22:20,065][15401] Updated weights for policy 0, policy_version 368010 (0.0026) [2024-06-23 05:22:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6029623296. Throughput: 0: 42653.5. Samples: 6029697880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 05:22:23,433][15401] Updated weights for policy 0, policy_version 368020 (0.0037) [2024-06-23 05:22:27,643][15401] Updated weights for policy 0, policy_version 368030 (0.0041) [2024-06-23 05:22:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6029852672. Throughput: 0: 42747.0. Samples: 6029957620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:28,399][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 05:22:30,907][15401] Updated weights for policy 0, policy_version 368040 (0.0029) [2024-06-23 05:22:32,832][15349] Signal inference workers to stop experience collection... (89400 times) [2024-06-23 05:22:32,887][15401] InferenceWorker_p0-w0: stopping experience collection (89400 times) [2024-06-23 05:22:32,949][15349] Signal inference workers to resume experience collection... (89400 times) [2024-06-23 05:22:32,950][15401] InferenceWorker_p0-w0: resuming experience collection (89400 times) [2024-06-23 05:22:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6030065664. Throughput: 0: 42841.1. Samples: 6030213880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:33,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 05:22:35,084][15401] Updated weights for policy 0, policy_version 368050 (0.0040) [2024-06-23 05:22:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 6030278656. Throughput: 0: 42617.1. Samples: 6030343120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:38,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 05:22:38,529][15401] Updated weights for policy 0, policy_version 368060 (0.0050) [2024-06-23 05:22:42,695][15401] Updated weights for policy 0, policy_version 368070 (0.0037) [2024-06-23 05:22:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6030475264. Throughput: 0: 42684.9. Samples: 6030602900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 05:22:46,173][15401] Updated weights for policy 0, policy_version 368080 (0.0033) [2024-06-23 05:22:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6030688256. Throughput: 0: 42939.7. Samples: 6030865140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 05:22:50,202][15401] Updated weights for policy 0, policy_version 368090 (0.0034) [2024-06-23 05:22:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 6030901248. Throughput: 0: 42725.9. Samples: 6030984520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:53,390][15132] Avg episode reward: [(0, '0.133')] [2024-06-23 05:22:54,010][15401] Updated weights for policy 0, policy_version 368100 (0.0036) [2024-06-23 05:22:58,113][15401] Updated weights for policy 0, policy_version 368110 (0.0031) [2024-06-23 05:22:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6031130624. Throughput: 0: 42972.9. Samples: 6031245600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-23 05:22:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 05:23:01,495][15401] Updated weights for policy 0, policy_version 368120 (0.0034) [2024-06-23 05:23:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6031343616. Throughput: 0: 42986.3. Samples: 6031505080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 05:23:05,713][15401] Updated weights for policy 0, policy_version 368130 (0.0032) [2024-06-23 05:23:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6031540224. Throughput: 0: 42927.3. Samples: 6031629620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:08,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 05:23:09,141][15401] Updated weights for policy 0, policy_version 368140 (0.0042) [2024-06-23 05:23:13,291][15401] Updated weights for policy 0, policy_version 368150 (0.0029) [2024-06-23 05:23:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6031769600. Throughput: 0: 42982.3. Samples: 6031891820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:13,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 05:23:16,901][15401] Updated weights for policy 0, policy_version 368160 (0.0033) [2024-06-23 05:23:18,396][15132] Fps is (10 sec: 44209.3, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 6031982592. Throughput: 0: 42981.3. Samples: 6032148320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:18,396][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 05:23:20,926][15401] Updated weights for policy 0, policy_version 368170 (0.0042) [2024-06-23 05:23:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6032179200. Throughput: 0: 42860.8. Samples: 6032271860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 05:23:24,582][15401] Updated weights for policy 0, policy_version 368180 (0.0037) [2024-06-23 05:23:28,333][15401] Updated weights for policy 0, policy_version 368190 (0.0038) [2024-06-23 05:23:28,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6032424960. Throughput: 0: 42882.7. Samples: 6032532620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 05:23:32,395][15401] Updated weights for policy 0, policy_version 368200 (0.0035) [2024-06-23 05:23:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6032621568. Throughput: 0: 42884.0. Samples: 6032794920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:33,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-23 05:23:35,765][15401] Updated weights for policy 0, policy_version 368210 (0.0031) [2024-06-23 05:23:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6032834560. Throughput: 0: 42965.6. Samples: 6032917980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:38,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-23 05:23:39,921][15401] Updated weights for policy 0, policy_version 368220 (0.0031) [2024-06-23 05:23:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6033063936. Throughput: 0: 42995.9. Samples: 6033180420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 05:23:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368230_6033080320.pth... [2024-06-23 05:23:43,526][15401] Updated weights for policy 0, policy_version 368230 (0.0024) [2024-06-23 05:23:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367602_6022791168.pth [2024-06-23 05:23:47,577][15401] Updated weights for policy 0, policy_version 368240 (0.0022) [2024-06-23 05:23:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6033260544. Throughput: 0: 42799.1. Samples: 6033431040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 05:23:51,297][15401] Updated weights for policy 0, policy_version 368250 (0.0035) [2024-06-23 05:23:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6033457152. Throughput: 0: 42845.0. Samples: 6033557640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 05:23:55,219][15401] Updated weights for policy 0, policy_version 368260 (0.0023) [2024-06-23 05:23:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 6033702912. Throughput: 0: 42643.9. Samples: 6033810900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:23:58,393][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 05:23:58,944][15401] Updated weights for policy 0, policy_version 368270 (0.0030) [2024-06-23 05:24:00,080][15349] Signal inference workers to stop experience collection... (89450 times) [2024-06-23 05:24:00,122][15401] InferenceWorker_p0-w0: stopping experience collection (89450 times) [2024-06-23 05:24:00,196][15349] Signal inference workers to resume experience collection... (89450 times) [2024-06-23 05:24:00,197][15401] InferenceWorker_p0-w0: resuming experience collection (89450 times) [2024-06-23 05:24:02,795][15401] Updated weights for policy 0, policy_version 368280 (0.0036) [2024-06-23 05:24:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6033899520. Throughput: 0: 42706.5. Samples: 6034069840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:24:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 05:24:06,460][15401] Updated weights for policy 0, policy_version 368290 (0.0037) [2024-06-23 05:24:08,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6034096128. Throughput: 0: 42806.6. Samples: 6034198160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:24:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 05:24:10,342][15401] Updated weights for policy 0, policy_version 368300 (0.0041) [2024-06-23 05:24:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6034341888. Throughput: 0: 42755.6. Samples: 6034456620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:24:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 05:24:13,911][15401] Updated weights for policy 0, policy_version 368310 (0.0032) [2024-06-23 05:24:17,952][15401] Updated weights for policy 0, policy_version 368320 (0.0043) [2024-06-23 05:24:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43149.2, 300 sec: 42931.6). Total num frames: 6034571264. Throughput: 0: 42648.9. Samples: 6034714120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 05:24:22,044][15401] Updated weights for policy 0, policy_version 368330 (0.0031) [2024-06-23 05:24:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6034735104. Throughput: 0: 42670.7. Samples: 6034838160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:23,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-23 05:24:25,672][15401] Updated weights for policy 0, policy_version 368340 (0.0044) [2024-06-23 05:24:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6034980864. Throughput: 0: 42625.8. Samples: 6035098580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 05:24:29,659][15401] Updated weights for policy 0, policy_version 368350 (0.0046) [2024-06-23 05:24:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6035193856. Throughput: 0: 42638.2. Samples: 6035349760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:33,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 05:24:33,460][15401] Updated weights for policy 0, policy_version 368360 (0.0033) [2024-06-23 05:24:37,484][15401] Updated weights for policy 0, policy_version 368370 (0.0031) [2024-06-23 05:24:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6035390464. Throughput: 0: 42756.5. Samples: 6035481680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 05:24:41,352][15401] Updated weights for policy 0, policy_version 368380 (0.0028) [2024-06-23 05:24:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6035619840. Throughput: 0: 42718.4. Samples: 6035733120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 05:24:45,067][15401] Updated weights for policy 0, policy_version 368390 (0.0030) [2024-06-23 05:24:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6035800064. Throughput: 0: 42807.2. Samples: 6035996160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 05:24:49,220][15401] Updated weights for policy 0, policy_version 368400 (0.0032) [2024-06-23 05:24:53,159][15401] Updated weights for policy 0, policy_version 368410 (0.0035) [2024-06-23 05:24:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6036045824. Throughput: 0: 42599.2. Samples: 6036115120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:53,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 05:24:56,978][15401] Updated weights for policy 0, policy_version 368420 (0.0038) [2024-06-23 05:24:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 6036258816. Throughput: 0: 42557.3. Samples: 6036371700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:24:58,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-23 05:25:00,579][15401] Updated weights for policy 0, policy_version 368430 (0.0037) [2024-06-23 05:25:03,394][15132] Fps is (10 sec: 39303.4, 60 sec: 42322.2, 300 sec: 42708.8). Total num frames: 6036439040. Throughput: 0: 42469.5. Samples: 6036625440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:03,395][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 05:25:04,753][15401] Updated weights for policy 0, policy_version 368440 (0.0043) [2024-06-23 05:25:08,140][15401] Updated weights for policy 0, policy_version 368450 (0.0037) [2024-06-23 05:25:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6036684800. Throughput: 0: 42450.6. Samples: 6036748440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:08,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 05:25:12,284][15401] Updated weights for policy 0, policy_version 368460 (0.0031) [2024-06-23 05:25:13,390][15132] Fps is (10 sec: 45895.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6036897792. Throughput: 0: 42446.2. Samples: 6037008660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:13,396][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 05:25:15,825][15401] Updated weights for policy 0, policy_version 368470 (0.0034) [2024-06-23 05:25:18,390][15132] Fps is (10 sec: 40956.8, 60 sec: 42051.7, 300 sec: 42764.9). Total num frames: 6037094400. Throughput: 0: 42587.7. Samples: 6037266240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:18,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 05:25:19,791][15401] Updated weights for policy 0, policy_version 368480 (0.0023) [2024-06-23 05:25:20,711][15349] Signal inference workers to stop experience collection... (89500 times) [2024-06-23 05:25:20,712][15349] Signal inference workers to resume experience collection... (89500 times) [2024-06-23 05:25:20,751][15401] InferenceWorker_p0-w0: stopping experience collection (89500 times) [2024-06-23 05:25:20,751][15401] InferenceWorker_p0-w0: resuming experience collection (89500 times) [2024-06-23 05:25:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6037307392. Throughput: 0: 42428.9. Samples: 6037390980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 05:25:23,603][15401] Updated weights for policy 0, policy_version 368490 (0.0024) [2024-06-23 05:25:27,622][15401] Updated weights for policy 0, policy_version 368500 (0.0029) [2024-06-23 05:25:28,390][15132] Fps is (10 sec: 44240.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6037536768. Throughput: 0: 42640.3. Samples: 6037651940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 05:25:31,291][15401] Updated weights for policy 0, policy_version 368510 (0.0031) [2024-06-23 05:25:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6037733376. Throughput: 0: 42524.7. Samples: 6037909780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 05:25:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 05:25:35,079][15401] Updated weights for policy 0, policy_version 368520 (0.0041) [2024-06-23 05:25:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6037979136. Throughput: 0: 42727.0. Samples: 6038037840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:25:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:25:39,034][15401] Updated weights for policy 0, policy_version 368530 (0.0035) [2024-06-23 05:25:42,633][15401] Updated weights for policy 0, policy_version 368540 (0.0026) [2024-06-23 05:25:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6038159360. Throughput: 0: 42906.2. Samples: 6038302480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:25:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 05:25:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368541_6038175744.pth... [2024-06-23 05:25:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000367914_6027902976.pth [2024-06-23 05:25:46,632][15401] Updated weights for policy 0, policy_version 368550 (0.0037) [2024-06-23 05:25:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6038405120. Throughput: 0: 42998.7. Samples: 6038560180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:25:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 05:25:50,272][15401] Updated weights for policy 0, policy_version 368560 (0.0030) [2024-06-23 05:25:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6038618112. Throughput: 0: 43020.5. Samples: 6038684360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:25:53,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 05:25:54,079][15401] Updated weights for policy 0, policy_version 368570 (0.0034) [2024-06-23 05:25:57,787][15401] Updated weights for policy 0, policy_version 368580 (0.0030) [2024-06-23 05:25:58,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6038831104. Throughput: 0: 43080.3. Samples: 6038947280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:25:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 05:26:01,982][15401] Updated weights for policy 0, policy_version 368590 (0.0025) [2024-06-23 05:26:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43147.7, 300 sec: 42820.5). Total num frames: 6039027712. Throughput: 0: 43090.9. Samples: 6039205300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:03,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 05:26:05,663][15401] Updated weights for policy 0, policy_version 368600 (0.0031) [2024-06-23 05:26:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6039273472. Throughput: 0: 43039.0. Samples: 6039327740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 05:26:09,499][15401] Updated weights for policy 0, policy_version 368610 (0.0022) [2024-06-23 05:26:13,287][15401] Updated weights for policy 0, policy_version 368620 (0.0036) [2024-06-23 05:26:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6039470080. Throughput: 0: 43052.4. Samples: 6039589400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:13,393][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 05:26:16,904][15401] Updated weights for policy 0, policy_version 368630 (0.0031) [2024-06-23 05:26:18,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42597.3, 300 sec: 42709.1). Total num frames: 6039650304. Throughput: 0: 43142.7. Samples: 6039851300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:18,393][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 05:26:20,853][15401] Updated weights for policy 0, policy_version 368640 (0.0030) [2024-06-23 05:26:23,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6039896064. Throughput: 0: 43040.9. Samples: 6039974680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:23,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 05:26:24,617][15401] Updated weights for policy 0, policy_version 368650 (0.0043) [2024-06-23 05:26:28,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6040109056. Throughput: 0: 42771.2. Samples: 6040227180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 05:26:28,732][15401] Updated weights for policy 0, policy_version 368660 (0.0038) [2024-06-23 05:26:32,316][15401] Updated weights for policy 0, policy_version 368670 (0.0045) [2024-06-23 05:26:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 6040305664. Throughput: 0: 42742.9. Samples: 6040483620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 05:26:36,195][15401] Updated weights for policy 0, policy_version 368680 (0.0029) [2024-06-23 05:26:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6040551424. Throughput: 0: 42844.9. Samples: 6040612380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 05:26:39,870][15401] Updated weights for policy 0, policy_version 368690 (0.0032) [2024-06-23 05:26:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6040748032. Throughput: 0: 42751.2. Samples: 6040871080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 05:26:44,098][15401] Updated weights for policy 0, policy_version 368700 (0.0030) [2024-06-23 05:26:48,041][15401] Updated weights for policy 0, policy_version 368710 (0.0047) [2024-06-23 05:26:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6040944640. Throughput: 0: 42644.9. Samples: 6041124320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 05:26:51,623][15401] Updated weights for policy 0, policy_version 368720 (0.0033) [2024-06-23 05:26:52,923][15349] Signal inference workers to stop experience collection... (89550 times) [2024-06-23 05:26:52,924][15349] Signal inference workers to resume experience collection... (89550 times) [2024-06-23 05:26:52,945][15401] InferenceWorker_p0-w0: stopping experience collection (89550 times) [2024-06-23 05:26:52,946][15401] InferenceWorker_p0-w0: resuming experience collection (89550 times) [2024-06-23 05:26:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6041190400. Throughput: 0: 42754.4. Samples: 6041251680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 05:26:53,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 05:26:55,745][15401] Updated weights for policy 0, policy_version 368730 (0.0034) [2024-06-23 05:26:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6041370624. Throughput: 0: 42805.9. Samples: 6041515560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:26:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 05:26:59,161][15401] Updated weights for policy 0, policy_version 368740 (0.0032) [2024-06-23 05:27:03,249][15401] Updated weights for policy 0, policy_version 368750 (0.0030) [2024-06-23 05:27:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6041600000. Throughput: 0: 42685.0. Samples: 6041772020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 05:27:06,889][15401] Updated weights for policy 0, policy_version 368760 (0.0037) [2024-06-23 05:27:08,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 6041845760. Throughput: 0: 42793.7. Samples: 6041900500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:08,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 05:27:10,807][15401] Updated weights for policy 0, policy_version 368770 (0.0035) [2024-06-23 05:27:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 6042025984. Throughput: 0: 42851.4. Samples: 6042155600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:13,392][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 05:27:14,566][15401] Updated weights for policy 0, policy_version 368780 (0.0046) [2024-06-23 05:27:18,272][15401] Updated weights for policy 0, policy_version 368790 (0.0049) [2024-06-23 05:27:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 6042255360. Throughput: 0: 42836.5. Samples: 6042411260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 05:27:22,210][15401] Updated weights for policy 0, policy_version 368800 (0.0033) [2024-06-23 05:27:23,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6042468352. Throughput: 0: 42954.2. Samples: 6042545320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 05:27:25,965][15401] Updated weights for policy 0, policy_version 368810 (0.0035) [2024-06-23 05:27:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6042664960. Throughput: 0: 42840.8. Samples: 6042798920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 05:27:29,836][15401] Updated weights for policy 0, policy_version 368820 (0.0030) [2024-06-23 05:27:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6042894336. Throughput: 0: 42912.0. Samples: 6043055360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:27:33,597][15401] Updated weights for policy 0, policy_version 368830 (0.0037) [2024-06-23 05:27:37,491][15401] Updated weights for policy 0, policy_version 368840 (0.0029) [2024-06-23 05:27:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6043107328. Throughput: 0: 42966.1. Samples: 6043185160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 05:27:41,074][15401] Updated weights for policy 0, policy_version 368850 (0.0028) [2024-06-23 05:27:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6043320320. Throughput: 0: 42746.1. Samples: 6043439140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 05:27:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368855_6043320320.pth... [2024-06-23 05:27:43,443][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368230_6033080320.pth [2024-06-23 05:27:45,031][15401] Updated weights for policy 0, policy_version 368860 (0.0031) [2024-06-23 05:27:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6043533312. Throughput: 0: 42682.1. Samples: 6043692720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 05:27:48,872][15401] Updated weights for policy 0, policy_version 368870 (0.0029) [2024-06-23 05:27:52,751][15401] Updated weights for policy 0, policy_version 368880 (0.0028) [2024-06-23 05:27:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6043762688. Throughput: 0: 42728.9. Samples: 6043823200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 05:27:56,414][15401] Updated weights for policy 0, policy_version 368890 (0.0035) [2024-06-23 05:27:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6043975680. Throughput: 0: 42845.0. Samples: 6044083520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:27:58,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 05:28:00,307][15401] Updated weights for policy 0, policy_version 368900 (0.0026) [2024-06-23 05:28:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6044188672. Throughput: 0: 42904.1. Samples: 6044341940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:28:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 05:28:03,932][15401] Updated weights for policy 0, policy_version 368910 (0.0043) [2024-06-23 05:28:07,962][15401] Updated weights for policy 0, policy_version 368920 (0.0022) [2024-06-23 05:28:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 6044385280. Throughput: 0: 42849.0. Samples: 6044473520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:28:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 05:28:11,381][15401] Updated weights for policy 0, policy_version 368930 (0.0030) [2024-06-23 05:28:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42821.5). Total num frames: 6044614656. Throughput: 0: 43032.0. Samples: 6044735360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 05:28:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 05:28:15,718][15401] Updated weights for policy 0, policy_version 368940 (0.0042) [2024-06-23 05:28:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6044844032. Throughput: 0: 42956.5. Samples: 6044988400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:18,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 05:28:18,838][15401] Updated weights for policy 0, policy_version 368950 (0.0026) [2024-06-23 05:28:23,259][15401] Updated weights for policy 0, policy_version 368960 (0.0037) [2024-06-23 05:28:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6045040640. Throughput: 0: 43062.3. Samples: 6045122960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 05:28:27,080][15401] Updated weights for policy 0, policy_version 368970 (0.0040) [2024-06-23 05:28:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 6045253632. Throughput: 0: 43100.2. Samples: 6045378640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:28,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 05:28:30,843][15401] Updated weights for policy 0, policy_version 368980 (0.0038) [2024-06-23 05:28:32,514][15349] Signal inference workers to stop experience collection... (89600 times) [2024-06-23 05:28:32,515][15349] Signal inference workers to resume experience collection... (89600 times) [2024-06-23 05:28:32,537][15401] InferenceWorker_p0-w0: stopping experience collection (89600 times) [2024-06-23 05:28:32,537][15401] InferenceWorker_p0-w0: resuming experience collection (89600 times) [2024-06-23 05:28:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6045483008. Throughput: 0: 43135.7. Samples: 6045633820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:33,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 05:28:34,536][15401] Updated weights for policy 0, policy_version 368990 (0.0042) [2024-06-23 05:28:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6045663232. Throughput: 0: 43155.3. Samples: 6045765180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 05:28:38,746][15401] Updated weights for policy 0, policy_version 369000 (0.0027) [2024-06-23 05:28:42,146][15401] Updated weights for policy 0, policy_version 369010 (0.0029) [2024-06-23 05:28:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6045908992. Throughput: 0: 43019.0. Samples: 6046019380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 05:28:46,231][15401] Updated weights for policy 0, policy_version 369020 (0.0031) [2024-06-23 05:28:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6046105600. Throughput: 0: 43109.3. Samples: 6046281860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 05:28:49,582][15401] Updated weights for policy 0, policy_version 369030 (0.0027) [2024-06-23 05:28:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6046318592. Throughput: 0: 42987.0. Samples: 6046407940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:53,391][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 05:28:53,734][15401] Updated weights for policy 0, policy_version 369040 (0.0035) [2024-06-23 05:28:57,233][15401] Updated weights for policy 0, policy_version 369050 (0.0038) [2024-06-23 05:28:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6046547968. Throughput: 0: 42915.2. Samples: 6046666540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:28:58,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 05:29:01,263][15401] Updated weights for policy 0, policy_version 369060 (0.0031) [2024-06-23 05:29:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6046744576. Throughput: 0: 43214.2. Samples: 6046933040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 05:29:04,997][15401] Updated weights for policy 0, policy_version 369070 (0.0042) [2024-06-23 05:29:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6046973952. Throughput: 0: 42834.6. Samples: 6047050520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:08,399][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 05:29:08,863][15401] Updated weights for policy 0, policy_version 369080 (0.0027) [2024-06-23 05:29:12,761][15401] Updated weights for policy 0, policy_version 369090 (0.0050) [2024-06-23 05:29:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6047186944. Throughput: 0: 42993.6. Samples: 6047313360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 05:29:16,790][15401] Updated weights for policy 0, policy_version 369100 (0.0043) [2024-06-23 05:29:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6047383552. Throughput: 0: 43072.5. Samples: 6047572080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 05:29:20,585][15401] Updated weights for policy 0, policy_version 369110 (0.0031) [2024-06-23 05:29:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6047612928. Throughput: 0: 42971.9. Samples: 6047698920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:23,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 05:29:24,262][15401] Updated weights for policy 0, policy_version 369120 (0.0023) [2024-06-23 05:29:28,165][15401] Updated weights for policy 0, policy_version 369130 (0.0037) [2024-06-23 05:29:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 6047825920. Throughput: 0: 43046.2. Samples: 6047956460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 05:29:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 05:29:31,934][15401] Updated weights for policy 0, policy_version 369140 (0.0028) [2024-06-23 05:29:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6048038912. Throughput: 0: 42883.1. Samples: 6048211600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 05:29:35,640][15401] Updated weights for policy 0, policy_version 369150 (0.0041) [2024-06-23 05:29:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6048251904. Throughput: 0: 42805.4. Samples: 6048334180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 05:29:39,921][15401] Updated weights for policy 0, policy_version 369160 (0.0042) [2024-06-23 05:29:43,049][15401] Updated weights for policy 0, policy_version 369170 (0.0060) [2024-06-23 05:29:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 6048481280. Throughput: 0: 42912.4. Samples: 6048597600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 05:29:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369171_6048497664.pth... [2024-06-23 05:29:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368541_6038175744.pth [2024-06-23 05:29:47,331][15401] Updated weights for policy 0, policy_version 369180 (0.0037) [2024-06-23 05:29:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6048677888. Throughput: 0: 42645.4. Samples: 6048852080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 05:29:50,952][15401] Updated weights for policy 0, policy_version 369190 (0.0056) [2024-06-23 05:29:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6048890880. Throughput: 0: 42873.4. Samples: 6048979820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 05:29:55,014][15401] Updated weights for policy 0, policy_version 369200 (0.0026) [2024-06-23 05:29:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42987.8). Total num frames: 6049120256. Throughput: 0: 42951.1. Samples: 6049246160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:29:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 05:29:58,479][15401] Updated weights for policy 0, policy_version 369210 (0.0043) [2024-06-23 05:30:02,421][15401] Updated weights for policy 0, policy_version 369220 (0.0032) [2024-06-23 05:30:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6049333248. Throughput: 0: 42917.7. Samples: 6049503380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:03,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 05:30:05,968][15349] Signal inference workers to stop experience collection... (89650 times) [2024-06-23 05:30:05,968][15349] Signal inference workers to resume experience collection... (89650 times) [2024-06-23 05:30:06,013][15401] InferenceWorker_p0-w0: stopping experience collection (89650 times) [2024-06-23 05:30:06,013][15401] InferenceWorker_p0-w0: resuming experience collection (89650 times) [2024-06-23 05:30:06,110][15401] Updated weights for policy 0, policy_version 369230 (0.0025) [2024-06-23 05:30:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6049546240. Throughput: 0: 42935.4. Samples: 6049631020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 05:30:09,890][15401] Updated weights for policy 0, policy_version 369240 (0.0036) [2024-06-23 05:30:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42987.3). Total num frames: 6049775616. Throughput: 0: 43107.9. Samples: 6049896320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 05:30:13,711][15401] Updated weights for policy 0, policy_version 369250 (0.0028) [2024-06-23 05:30:17,391][15401] Updated weights for policy 0, policy_version 369260 (0.0023) [2024-06-23 05:30:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 6049988608. Throughput: 0: 43226.2. Samples: 6050156780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 05:30:21,166][15401] Updated weights for policy 0, policy_version 369270 (0.0037) [2024-06-23 05:30:23,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 6050201600. Throughput: 0: 43376.1. Samples: 6050286100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 05:30:24,935][15401] Updated weights for policy 0, policy_version 369280 (0.0050) [2024-06-23 05:30:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6050414592. Throughput: 0: 43299.5. Samples: 6050546080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 05:30:28,702][15401] Updated weights for policy 0, policy_version 369290 (0.0026) [2024-06-23 05:30:32,574][15401] Updated weights for policy 0, policy_version 369300 (0.0042) [2024-06-23 05:30:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6050627584. Throughput: 0: 43305.4. Samples: 6050800820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 05:30:36,363][15401] Updated weights for policy 0, policy_version 369310 (0.0037) [2024-06-23 05:30:38,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 6050856960. Throughput: 0: 43306.5. Samples: 6050928720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:38,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 05:30:40,099][15401] Updated weights for policy 0, policy_version 369320 (0.0025) [2024-06-23 05:30:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6051069952. Throughput: 0: 43151.0. Samples: 6051187960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:43,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 05:30:43,814][15401] Updated weights for policy 0, policy_version 369330 (0.0027) [2024-06-23 05:30:47,599][15401] Updated weights for policy 0, policy_version 369340 (0.0047) [2024-06-23 05:30:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6051282944. Throughput: 0: 43178.4. Samples: 6051446400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 05:30:48,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 05:30:51,650][15401] Updated weights for policy 0, policy_version 369350 (0.0033) [2024-06-23 05:30:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 6051512320. Throughput: 0: 43197.1. Samples: 6051574880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:30:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 05:30:55,565][15401] Updated weights for policy 0, policy_version 369360 (0.0031) [2024-06-23 05:30:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6051708928. Throughput: 0: 43003.7. Samples: 6051831480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:30:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 05:30:59,315][15401] Updated weights for policy 0, policy_version 369370 (0.0034) [2024-06-23 05:31:03,346][15401] Updated weights for policy 0, policy_version 369380 (0.0035) [2024-06-23 05:31:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6051921920. Throughput: 0: 42985.0. Samples: 6052091100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 05:31:06,855][15401] Updated weights for policy 0, policy_version 369390 (0.0027) [2024-06-23 05:31:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 6052134912. Throughput: 0: 42864.3. Samples: 6052215000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:08,394][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 05:31:10,971][15401] Updated weights for policy 0, policy_version 369400 (0.0039) [2024-06-23 05:31:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 43098.6). Total num frames: 6052364288. Throughput: 0: 42873.4. Samples: 6052475380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:13,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-23 05:31:14,412][15401] Updated weights for policy 0, policy_version 369410 (0.0039) [2024-06-23 05:31:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 6052560896. Throughput: 0: 42912.7. Samples: 6052732000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:18,392][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 05:31:18,597][15401] Updated weights for policy 0, policy_version 369420 (0.0028) [2024-06-23 05:31:22,120][15401] Updated weights for policy 0, policy_version 369430 (0.0038) [2024-06-23 05:31:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6052773888. Throughput: 0: 42888.5. Samples: 6052858600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 05:31:26,098][15401] Updated weights for policy 0, policy_version 369440 (0.0037) [2024-06-23 05:31:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 6052986880. Throughput: 0: 42958.9. Samples: 6053121100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 05:31:29,667][15401] Updated weights for policy 0, policy_version 369450 (0.0033) [2024-06-23 05:31:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6053216256. Throughput: 0: 42811.5. Samples: 6053372920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 05:31:34,000][15401] Updated weights for policy 0, policy_version 369460 (0.0028) [2024-06-23 05:31:36,533][15349] Signal inference workers to stop experience collection... (89700 times) [2024-06-23 05:31:36,564][15401] InferenceWorker_p0-w0: stopping experience collection (89700 times) [2024-06-23 05:31:36,598][15349] Signal inference workers to resume experience collection... (89700 times) [2024-06-23 05:31:36,598][15401] InferenceWorker_p0-w0: resuming experience collection (89700 times) [2024-06-23 05:31:37,185][15401] Updated weights for policy 0, policy_version 369470 (0.0030) [2024-06-23 05:31:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 6053429248. Throughput: 0: 42789.8. Samples: 6053500420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 05:31:41,451][15401] Updated weights for policy 0, policy_version 369480 (0.0033) [2024-06-23 05:31:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 6053642240. Throughput: 0: 42976.5. Samples: 6053765420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 05:31:43,480][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369486_6053658624.pth... [2024-06-23 05:31:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000368855_6043320320.pth [2024-06-23 05:31:44,858][15401] Updated weights for policy 0, policy_version 369490 (0.0035) [2024-06-23 05:31:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6053855232. Throughput: 0: 42788.4. Samples: 6054016580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 05:31:48,998][15401] Updated weights for policy 0, policy_version 369500 (0.0026) [2024-06-23 05:31:52,694][15401] Updated weights for policy 0, policy_version 369510 (0.0044) [2024-06-23 05:31:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 6054068224. Throughput: 0: 42738.7. Samples: 6054138240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 05:31:57,030][15401] Updated weights for policy 0, policy_version 369520 (0.0049) [2024-06-23 05:31:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6054264832. Throughput: 0: 42605.3. Samples: 6054392620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:31:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 05:32:00,425][15401] Updated weights for policy 0, policy_version 369530 (0.0027) [2024-06-23 05:32:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 6054477824. Throughput: 0: 42762.0. Samples: 6054656180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:32:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 05:32:04,521][15401] Updated weights for policy 0, policy_version 369540 (0.0050) [2024-06-23 05:32:08,096][15401] Updated weights for policy 0, policy_version 369550 (0.0033) [2024-06-23 05:32:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 6054707200. Throughput: 0: 42673.7. Samples: 6054778920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 05:32:08,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 05:32:12,262][15401] Updated weights for policy 0, policy_version 369560 (0.0039) [2024-06-23 05:32:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 6054887424. Throughput: 0: 42396.4. Samples: 6055028940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:13,390][15132] Avg episode reward: [(0, '0.142')] [2024-06-23 05:32:16,087][15401] Updated weights for policy 0, policy_version 369570 (0.0045) [2024-06-23 05:32:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 6055116800. Throughput: 0: 42611.1. Samples: 6055290420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:18,392][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 05:32:19,764][15401] Updated weights for policy 0, policy_version 369580 (0.0035) [2024-06-23 05:32:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6055329792. Throughput: 0: 42647.9. Samples: 6055419580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:32:23,787][15401] Updated weights for policy 0, policy_version 369590 (0.0038) [2024-06-23 05:32:27,688][15401] Updated weights for policy 0, policy_version 369600 (0.0032) [2024-06-23 05:32:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6055542784. Throughput: 0: 42343.0. Samples: 6055670860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:28,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 05:32:31,437][15401] Updated weights for policy 0, policy_version 369610 (0.0046) [2024-06-23 05:32:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6055772160. Throughput: 0: 42489.7. Samples: 6055928620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 05:32:35,168][15401] Updated weights for policy 0, policy_version 369620 (0.0029) [2024-06-23 05:32:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 6055985152. Throughput: 0: 42731.4. Samples: 6056061160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 05:32:39,019][15401] Updated weights for policy 0, policy_version 369630 (0.0043) [2024-06-23 05:32:43,145][15401] Updated weights for policy 0, policy_version 369640 (0.0036) [2024-06-23 05:32:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6056181760. Throughput: 0: 42740.0. Samples: 6056315920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 05:32:46,892][15401] Updated weights for policy 0, policy_version 369650 (0.0037) [2024-06-23 05:32:48,390][15132] Fps is (10 sec: 44233.6, 60 sec: 42870.9, 300 sec: 42931.5). Total num frames: 6056427520. Throughput: 0: 42371.5. Samples: 6056562940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:48,391][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 05:32:50,855][15401] Updated weights for policy 0, policy_version 369660 (0.0039) [2024-06-23 05:32:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6056607744. Throughput: 0: 42681.4. Samples: 6056699580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 05:32:54,379][15401] Updated weights for policy 0, policy_version 369670 (0.0023) [2024-06-23 05:32:58,389][15132] Fps is (10 sec: 39324.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6056820736. Throughput: 0: 42779.1. Samples: 6056954000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:32:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 05:32:58,934][15401] Updated weights for policy 0, policy_version 369680 (0.0036) [2024-06-23 05:33:01,512][15349] Signal inference workers to stop experience collection... (89750 times) [2024-06-23 05:33:01,568][15401] InferenceWorker_p0-w0: stopping experience collection (89750 times) [2024-06-23 05:33:01,626][15349] Signal inference workers to resume experience collection... (89750 times) [2024-06-23 05:33:01,627][15401] InferenceWorker_p0-w0: resuming experience collection (89750 times) [2024-06-23 05:33:01,923][15401] Updated weights for policy 0, policy_version 369690 (0.0038) [2024-06-23 05:33:03,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 6057082880. Throughput: 0: 42447.7. Samples: 6057200560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 05:33:06,636][15401] Updated weights for policy 0, policy_version 369700 (0.0038) [2024-06-23 05:33:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6057246720. Throughput: 0: 42750.3. Samples: 6057343340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 05:33:09,436][15401] Updated weights for policy 0, policy_version 369710 (0.0036) [2024-06-23 05:33:13,389][15132] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6057443328. Throughput: 0: 42704.9. Samples: 6057592580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 05:33:14,205][15401] Updated weights for policy 0, policy_version 369720 (0.0034) [2024-06-23 05:33:17,146][15401] Updated weights for policy 0, policy_version 369730 (0.0025) [2024-06-23 05:33:18,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 6057705472. Throughput: 0: 42553.3. Samples: 6057843620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:18,393][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 05:33:21,626][15401] Updated weights for policy 0, policy_version 369740 (0.0030) [2024-06-23 05:33:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6057902080. Throughput: 0: 42762.4. Samples: 6057985460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 05:33:24,698][15401] Updated weights for policy 0, policy_version 369750 (0.0031) [2024-06-23 05:33:28,389][15132] Fps is (10 sec: 36053.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6058065920. Throughput: 0: 42530.7. Samples: 6058229800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 05:33:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 05:33:29,485][15401] Updated weights for policy 0, policy_version 369760 (0.0040) [2024-06-23 05:33:32,677][15401] Updated weights for policy 0, policy_version 369770 (0.0035) [2024-06-23 05:33:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 6058360832. Throughput: 0: 42560.3. Samples: 6058478120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 05:33:37,148][15401] Updated weights for policy 0, policy_version 369780 (0.0030) [2024-06-23 05:33:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6058541056. Throughput: 0: 42702.7. Samples: 6058621200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 05:33:40,144][15401] Updated weights for policy 0, policy_version 369790 (0.0029) [2024-06-23 05:33:43,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6058737664. Throughput: 0: 42547.1. Samples: 6058868620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 05:33:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369796_6058737664.pth... [2024-06-23 05:33:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369171_6048497664.pth [2024-06-23 05:33:44,885][15401] Updated weights for policy 0, policy_version 369800 (0.0051) [2024-06-23 05:33:47,802][15401] Updated weights for policy 0, policy_version 369810 (0.0037) [2024-06-23 05:33:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42872.0, 300 sec: 42987.2). Total num frames: 6058999808. Throughput: 0: 42665.2. Samples: 6059120500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 05:33:52,530][15401] Updated weights for policy 0, policy_version 369820 (0.0024) [2024-06-23 05:33:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6059147264. Throughput: 0: 42607.9. Samples: 6059260700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 05:33:55,267][15401] Updated weights for policy 0, policy_version 369830 (0.0040) [2024-06-23 05:33:58,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6059376640. Throughput: 0: 42579.5. Samples: 6059508660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:33:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 05:34:00,148][15401] Updated weights for policy 0, policy_version 369840 (0.0035) [2024-06-23 05:34:02,963][15401] Updated weights for policy 0, policy_version 369850 (0.0038) [2024-06-23 05:34:03,392][15132] Fps is (10 sec: 49140.2, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 6059638784. Throughput: 0: 42573.3. Samples: 6059759420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:03,393][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 05:34:07,864][15401] Updated weights for policy 0, policy_version 369860 (0.0031) [2024-06-23 05:34:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6059786240. Throughput: 0: 42535.6. Samples: 6059899560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 05:34:08,772][15349] Signal inference workers to stop experience collection... (89800 times) [2024-06-23 05:34:08,774][15349] Signal inference workers to resume experience collection... (89800 times) [2024-06-23 05:34:08,797][15401] InferenceWorker_p0-w0: stopping experience collection (89800 times) [2024-06-23 05:34:08,797][15401] InferenceWorker_p0-w0: resuming experience collection (89800 times) [2024-06-23 05:34:10,582][15401] Updated weights for policy 0, policy_version 369870 (0.0025) [2024-06-23 05:34:13,392][15132] Fps is (10 sec: 39321.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6060032000. Throughput: 0: 42567.0. Samples: 6060145420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:13,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 05:34:15,461][15401] Updated weights for policy 0, policy_version 369880 (0.0041) [2024-06-23 05:34:18,328][15401] Updated weights for policy 0, policy_version 369890 (0.0028) [2024-06-23 05:34:18,389][15132] Fps is (10 sec: 49151.6, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 6060277760. Throughput: 0: 42679.2. Samples: 6060398680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 05:34:23,306][15401] Updated weights for policy 0, policy_version 369900 (0.0031) [2024-06-23 05:34:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6060441600. Throughput: 0: 42454.6. Samples: 6060531660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 05:34:26,028][15401] Updated weights for policy 0, policy_version 369910 (0.0038) [2024-06-23 05:34:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 6060670976. Throughput: 0: 42573.8. Samples: 6060784440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 05:34:30,895][15401] Updated weights for policy 0, policy_version 369920 (0.0036) [2024-06-23 05:34:33,393][15132] Fps is (10 sec: 44222.9, 60 sec: 42050.0, 300 sec: 42820.1). Total num frames: 6060883968. Throughput: 0: 42663.3. Samples: 6061040480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:33,394][15132] Avg episode reward: [(0, '0.850')] [2024-06-23 05:34:33,839][15401] Updated weights for policy 0, policy_version 369930 (0.0032) [2024-06-23 05:34:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6061080576. Throughput: 0: 42422.3. Samples: 6061169700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 05:34:38,502][15401] Updated weights for policy 0, policy_version 369940 (0.0033) [2024-06-23 05:34:41,488][15401] Updated weights for policy 0, policy_version 369950 (0.0034) [2024-06-23 05:34:43,389][15132] Fps is (10 sec: 42612.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6061309952. Throughput: 0: 42537.8. Samples: 6061422860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:34:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 05:34:46,391][15401] Updated weights for policy 0, policy_version 369960 (0.0042) [2024-06-23 05:34:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 6061522944. Throughput: 0: 42842.8. Samples: 6061687240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:34:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 05:34:49,191][15401] Updated weights for policy 0, policy_version 369970 (0.0038) [2024-06-23 05:34:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6061719552. Throughput: 0: 42590.6. Samples: 6061816140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:34:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 05:34:54,066][15401] Updated weights for policy 0, policy_version 369980 (0.0042) [2024-06-23 05:34:56,813][15401] Updated weights for policy 0, policy_version 369990 (0.0045) [2024-06-23 05:34:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6061965312. Throughput: 0: 42629.8. Samples: 6062063660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:34:58,392][15132] Avg episode reward: [(0, '0.276')] [2024-06-23 05:35:01,692][15401] Updated weights for policy 0, policy_version 370000 (0.0030) [2024-06-23 05:35:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41781.0, 300 sec: 42709.5). Total num frames: 6062145536. Throughput: 0: 42983.2. Samples: 6062332920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 05:35:04,611][15401] Updated weights for policy 0, policy_version 370010 (0.0031) [2024-06-23 05:35:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6062358528. Throughput: 0: 42737.5. Samples: 6062454840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:08,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-23 05:35:09,177][15401] Updated weights for policy 0, policy_version 370020 (0.0025) [2024-06-23 05:35:12,271][15401] Updated weights for policy 0, policy_version 370030 (0.0033) [2024-06-23 05:35:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6062604288. Throughput: 0: 42704.0. Samples: 6062706120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 05:35:16,835][15401] Updated weights for policy 0, policy_version 370040 (0.0037) [2024-06-23 05:35:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 6062784512. Throughput: 0: 42944.9. Samples: 6062972860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 05:35:19,756][15401] Updated weights for policy 0, policy_version 370050 (0.0040) [2024-06-23 05:35:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6063013888. Throughput: 0: 42778.2. Samples: 6063094720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 05:35:24,823][15401] Updated weights for policy 0, policy_version 370060 (0.0028) [2024-06-23 05:35:27,380][15401] Updated weights for policy 0, policy_version 370070 (0.0032) [2024-06-23 05:35:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6063243264. Throughput: 0: 42800.0. Samples: 6063348860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 05:35:32,357][15401] Updated weights for policy 0, policy_version 370080 (0.0038) [2024-06-23 05:35:33,388][15349] Signal inference workers to stop experience collection... (89850 times) [2024-06-23 05:35:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42054.5, 300 sec: 42543.2). Total num frames: 6063407104. Throughput: 0: 42960.7. Samples: 6063620480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 05:35:33,439][15349] Signal inference workers to resume experience collection... (89850 times) [2024-06-23 05:35:33,444][15401] InferenceWorker_p0-w0: stopping experience collection (89850 times) [2024-06-23 05:35:33,463][15401] InferenceWorker_p0-w0: resuming experience collection (89850 times) [2024-06-23 05:35:35,138][15401] Updated weights for policy 0, policy_version 370090 (0.0035) [2024-06-23 05:35:38,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6063652864. Throughput: 0: 42708.7. Samples: 6063738040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 05:35:39,779][15401] Updated weights for policy 0, policy_version 370100 (0.0040) [2024-06-23 05:35:42,666][15401] Updated weights for policy 0, policy_version 370110 (0.0030) [2024-06-23 05:35:43,392][15132] Fps is (10 sec: 49140.3, 60 sec: 43142.7, 300 sec: 42764.6). Total num frames: 6063898624. Throughput: 0: 43000.3. Samples: 6063998780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:43,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 05:35:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370111_6063898624.pth... [2024-06-23 05:35:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369486_6053658624.pth [2024-06-23 05:35:47,298][15401] Updated weights for policy 0, policy_version 370120 (0.0033) [2024-06-23 05:35:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 6064062464. Throughput: 0: 43027.8. Samples: 6064269180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 05:35:50,408][15401] Updated weights for policy 0, policy_version 370130 (0.0042) [2024-06-23 05:35:53,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6064308224. Throughput: 0: 42897.3. Samples: 6064385220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 05:35:54,931][15401] Updated weights for policy 0, policy_version 370140 (0.0037) [2024-06-23 05:35:58,144][15401] Updated weights for policy 0, policy_version 370150 (0.0041) [2024-06-23 05:35:58,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6064537600. Throughput: 0: 43097.4. Samples: 6064645500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:35:58,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 05:36:02,417][15401] Updated weights for policy 0, policy_version 370160 (0.0038) [2024-06-23 05:36:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6064701440. Throughput: 0: 43026.6. Samples: 6064909060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 05:36:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:36:05,796][15401] Updated weights for policy 0, policy_version 370170 (0.0051) [2024-06-23 05:36:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6064947200. Throughput: 0: 42907.5. Samples: 6065025560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:08,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 05:36:10,434][15401] Updated weights for policy 0, policy_version 370180 (0.0027) [2024-06-23 05:36:13,353][15401] Updated weights for policy 0, policy_version 370190 (0.0040) [2024-06-23 05:36:13,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 6065192960. Throughput: 0: 43137.7. Samples: 6065290060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 05:36:17,906][15401] Updated weights for policy 0, policy_version 370200 (0.0032) [2024-06-23 05:36:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6065373184. Throughput: 0: 43019.7. Samples: 6065556360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 05:36:20,931][15401] Updated weights for policy 0, policy_version 370210 (0.0028) [2024-06-23 05:36:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6065602560. Throughput: 0: 43033.3. Samples: 6065674540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 05:36:25,451][15401] Updated weights for policy 0, policy_version 370220 (0.0027) [2024-06-23 05:36:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6065831936. Throughput: 0: 43012.6. Samples: 6065934240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 05:36:28,592][15401] Updated weights for policy 0, policy_version 370230 (0.0033) [2024-06-23 05:36:32,986][15401] Updated weights for policy 0, policy_version 370240 (0.0028) [2024-06-23 05:36:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 6066012160. Throughput: 0: 42729.0. Samples: 6066191980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 05:36:36,202][15401] Updated weights for policy 0, policy_version 370250 (0.0030) [2024-06-23 05:36:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 6066241536. Throughput: 0: 42874.6. Samples: 6066314580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:38,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 05:36:40,534][15401] Updated weights for policy 0, policy_version 370260 (0.0022) [2024-06-23 05:36:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 6066454528. Throughput: 0: 43039.0. Samples: 6066582260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 05:36:43,761][15401] Updated weights for policy 0, policy_version 370270 (0.0028) [2024-06-23 05:36:45,162][15349] Signal inference workers to stop experience collection... (89900 times) [2024-06-23 05:36:45,162][15349] Signal inference workers to resume experience collection... (89900 times) [2024-06-23 05:36:45,198][15401] InferenceWorker_p0-w0: stopping experience collection (89900 times) [2024-06-23 05:36:45,199][15401] InferenceWorker_p0-w0: resuming experience collection (89900 times) [2024-06-23 05:36:48,020][15401] Updated weights for policy 0, policy_version 370280 (0.0027) [2024-06-23 05:36:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6066667520. Throughput: 0: 42754.1. Samples: 6066833000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 05:36:51,329][15401] Updated weights for policy 0, policy_version 370290 (0.0042) [2024-06-23 05:36:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6066896896. Throughput: 0: 43044.9. Samples: 6066962680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:53,393][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 05:36:55,520][15401] Updated weights for policy 0, policy_version 370300 (0.0035) [2024-06-23 05:36:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6067093504. Throughput: 0: 42964.9. Samples: 6067223480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:36:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 05:36:59,085][15401] Updated weights for policy 0, policy_version 370310 (0.0042) [2024-06-23 05:37:03,097][15401] Updated weights for policy 0, policy_version 370320 (0.0038) [2024-06-23 05:37:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 6067322880. Throughput: 0: 42705.7. Samples: 6067478120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:37:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 05:37:06,548][15401] Updated weights for policy 0, policy_version 370330 (0.0029) [2024-06-23 05:37:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6067535872. Throughput: 0: 42957.6. Samples: 6067607620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:37:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 05:37:10,520][15401] Updated weights for policy 0, policy_version 370340 (0.0031) [2024-06-23 05:37:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6067732480. Throughput: 0: 43131.6. Samples: 6067875160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:37:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 05:37:14,780][15401] Updated weights for policy 0, policy_version 370350 (0.0032) [2024-06-23 05:37:18,320][15401] Updated weights for policy 0, policy_version 370360 (0.0048) [2024-06-23 05:37:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6067978240. Throughput: 0: 42929.6. Samples: 6068123820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 05:37:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 05:37:22,325][15401] Updated weights for policy 0, policy_version 370370 (0.0022) [2024-06-23 05:37:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6068191232. Throughput: 0: 43052.5. Samples: 6068251940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 05:37:26,044][15401] Updated weights for policy 0, policy_version 370380 (0.0042) [2024-06-23 05:37:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6068387840. Throughput: 0: 42953.4. Samples: 6068515160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 05:37:30,036][15401] Updated weights for policy 0, policy_version 370390 (0.0045) [2024-06-23 05:37:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6068600832. Throughput: 0: 42980.4. Samples: 6068767120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:33,395][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 05:37:33,693][15401] Updated weights for policy 0, policy_version 370400 (0.0022) [2024-06-23 05:37:37,629][15401] Updated weights for policy 0, policy_version 370410 (0.0033) [2024-06-23 05:37:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6068830208. Throughput: 0: 42929.9. Samples: 6068894420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:38,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 05:37:41,242][15401] Updated weights for policy 0, policy_version 370420 (0.0027) [2024-06-23 05:37:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 6069010432. Throughput: 0: 42813.3. Samples: 6069150080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 05:37:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370424_6069026816.pth... [2024-06-23 05:37:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000369796_6058737664.pth [2024-06-23 05:37:45,302][15401] Updated weights for policy 0, policy_version 370430 (0.0038) [2024-06-23 05:37:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6069239808. Throughput: 0: 42813.8. Samples: 6069404740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 05:37:49,281][15401] Updated weights for policy 0, policy_version 370440 (0.0039) [2024-06-23 05:37:52,998][15401] Updated weights for policy 0, policy_version 370450 (0.0033) [2024-06-23 05:37:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 6069469184. Throughput: 0: 42800.7. Samples: 6069533660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 05:37:56,916][15401] Updated weights for policy 0, policy_version 370460 (0.0031) [2024-06-23 05:37:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6069665792. Throughput: 0: 42447.0. Samples: 6069785280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:37:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 05:38:00,495][15349] Signal inference workers to stop experience collection... (89950 times) [2024-06-23 05:38:00,524][15401] InferenceWorker_p0-w0: stopping experience collection (89950 times) [2024-06-23 05:38:00,550][15349] Signal inference workers to resume experience collection... (89950 times) [2024-06-23 05:38:00,556][15401] InferenceWorker_p0-w0: resuming experience collection (89950 times) [2024-06-23 05:38:00,696][15401] Updated weights for policy 0, policy_version 370470 (0.0036) [2024-06-23 05:38:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 6069878784. Throughput: 0: 42553.9. Samples: 6070038740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 05:38:04,899][15401] Updated weights for policy 0, policy_version 370480 (0.0044) [2024-06-23 05:38:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6070091776. Throughput: 0: 42580.5. Samples: 6070168060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 05:38:08,429][15401] Updated weights for policy 0, policy_version 370490 (0.0039) [2024-06-23 05:38:12,410][15401] Updated weights for policy 0, policy_version 370500 (0.0031) [2024-06-23 05:38:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 6070288384. Throughput: 0: 42476.0. Samples: 6070426580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 05:38:16,321][15401] Updated weights for policy 0, policy_version 370510 (0.0029) [2024-06-23 05:38:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 6070517760. Throughput: 0: 42404.1. Samples: 6070675400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:18,392][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 05:38:19,978][15401] Updated weights for policy 0, policy_version 370520 (0.0030) [2024-06-23 05:38:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 6070714368. Throughput: 0: 42510.6. Samples: 6070807400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 05:38:24,046][15401] Updated weights for policy 0, policy_version 370530 (0.0037) [2024-06-23 05:38:27,403][15401] Updated weights for policy 0, policy_version 370540 (0.0034) [2024-06-23 05:38:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6070943744. Throughput: 0: 42480.1. Samples: 6071061680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 05:38:31,629][15401] Updated weights for policy 0, policy_version 370550 (0.0034) [2024-06-23 05:38:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6071173120. Throughput: 0: 42598.2. Samples: 6071321660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 05:38:35,374][15401] Updated weights for policy 0, policy_version 370560 (0.0043) [2024-06-23 05:38:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 6071369728. Throughput: 0: 42675.6. Samples: 6071454160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 05:38:38,392][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 05:38:39,259][15401] Updated weights for policy 0, policy_version 370570 (0.0029) [2024-06-23 05:38:42,751][15401] Updated weights for policy 0, policy_version 370580 (0.0033) [2024-06-23 05:38:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6071599104. Throughput: 0: 42816.1. Samples: 6071712000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:38:43,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-23 05:38:46,947][15401] Updated weights for policy 0, policy_version 370590 (0.0030) [2024-06-23 05:38:48,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6071812096. Throughput: 0: 42846.6. Samples: 6071966840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:38:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 05:38:50,387][15401] Updated weights for policy 0, policy_version 370600 (0.0022) [2024-06-23 05:38:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6072008704. Throughput: 0: 42815.0. Samples: 6072094740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:38:53,392][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 05:38:54,694][15401] Updated weights for policy 0, policy_version 370610 (0.0041) [2024-06-23 05:38:57,857][15401] Updated weights for policy 0, policy_version 370620 (0.0032) [2024-06-23 05:38:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 6072254464. Throughput: 0: 42717.7. Samples: 6072348880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:38:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 05:39:02,284][15401] Updated weights for policy 0, policy_version 370630 (0.0038) [2024-06-23 05:39:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6072434688. Throughput: 0: 43091.2. Samples: 6072614400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 05:39:05,504][15401] Updated weights for policy 0, policy_version 370640 (0.0034) [2024-06-23 05:39:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6072647680. Throughput: 0: 42815.2. Samples: 6072734080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 05:39:10,033][15401] Updated weights for policy 0, policy_version 370650 (0.0025) [2024-06-23 05:39:13,080][15401] Updated weights for policy 0, policy_version 370660 (0.0031) [2024-06-23 05:39:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 6072909824. Throughput: 0: 43022.1. Samples: 6072997680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 05:39:17,866][15401] Updated weights for policy 0, policy_version 370670 (0.0034) [2024-06-23 05:39:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 6073073664. Throughput: 0: 43062.6. Samples: 6073259480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 05:39:20,558][15401] Updated weights for policy 0, policy_version 370680 (0.0027) [2024-06-23 05:39:23,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6073270272. Throughput: 0: 42703.1. Samples: 6073375700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 05:39:25,193][15349] Signal inference workers to stop experience collection... (90000 times) [2024-06-23 05:39:25,223][15401] InferenceWorker_p0-w0: stopping experience collection (90000 times) [2024-06-23 05:39:25,308][15349] Signal inference workers to resume experience collection... (90000 times) [2024-06-23 05:39:25,308][15401] InferenceWorker_p0-w0: resuming experience collection (90000 times) [2024-06-23 05:39:25,446][15401] Updated weights for policy 0, policy_version 370690 (0.0037) [2024-06-23 05:39:28,097][15401] Updated weights for policy 0, policy_version 370700 (0.0026) [2024-06-23 05:39:28,389][15132] Fps is (10 sec: 47514.5, 60 sec: 43417.6, 300 sec: 42932.1). Total num frames: 6073548800. Throughput: 0: 42785.4. Samples: 6073637340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 05:39:32,980][15401] Updated weights for policy 0, policy_version 370710 (0.0022) [2024-06-23 05:39:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6073729024. Throughput: 0: 43004.8. Samples: 6073902060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 05:39:35,606][15401] Updated weights for policy 0, policy_version 370720 (0.0037) [2024-06-23 05:39:38,392][15132] Fps is (10 sec: 36036.0, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 6073909248. Throughput: 0: 42804.4. Samples: 6074021040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:38,393][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 05:39:40,535][15401] Updated weights for policy 0, policy_version 370730 (0.0028) [2024-06-23 05:39:43,082][15401] Updated weights for policy 0, policy_version 370740 (0.0033) [2024-06-23 05:39:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6074204160. Throughput: 0: 42980.0. Samples: 6074282980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 05:39:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370741_6074220544.pth... [2024-06-23 05:39:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370111_6063898624.pth [2024-06-23 05:39:48,043][15401] Updated weights for policy 0, policy_version 370750 (0.0035) [2024-06-23 05:39:48,389][15132] Fps is (10 sec: 45886.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6074368000. Throughput: 0: 42921.3. Samples: 6074545860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:48,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 05:39:50,851][15401] Updated weights for policy 0, policy_version 370760 (0.0036) [2024-06-23 05:39:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6074580992. Throughput: 0: 42929.2. Samples: 6074665900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 05:39:55,798][15401] Updated weights for policy 0, policy_version 370770 (0.0037) [2024-06-23 05:39:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 6074843136. Throughput: 0: 42877.0. Samples: 6074927140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:39:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 05:39:58,417][15401] Updated weights for policy 0, policy_version 370780 (0.0037) [2024-06-23 05:40:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6075006976. Throughput: 0: 43077.1. Samples: 6075197940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 05:40:03,415][15401] Updated weights for policy 0, policy_version 370790 (0.0033) [2024-06-23 05:40:05,895][15401] Updated weights for policy 0, policy_version 370800 (0.0031) [2024-06-23 05:40:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6075219968. Throughput: 0: 43007.1. Samples: 6075311020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 05:40:11,042][15401] Updated weights for policy 0, policy_version 370810 (0.0039) [2024-06-23 05:40:13,392][15132] Fps is (10 sec: 49140.4, 60 sec: 43142.9, 300 sec: 43097.9). Total num frames: 6075498496. Throughput: 0: 43020.4. Samples: 6075573360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:13,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 05:40:13,591][15401] Updated weights for policy 0, policy_version 370820 (0.0021) [2024-06-23 05:40:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6075645952. Throughput: 0: 43058.3. Samples: 6075839680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 05:40:18,586][15401] Updated weights for policy 0, policy_version 370830 (0.0042) [2024-06-23 05:40:21,567][15401] Updated weights for policy 0, policy_version 370840 (0.0034) [2024-06-23 05:40:23,390][15132] Fps is (10 sec: 37691.8, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 6075875328. Throughput: 0: 43014.3. Samples: 6075956580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 05:40:26,109][15401] Updated weights for policy 0, policy_version 370850 (0.0035) [2024-06-23 05:40:28,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 6076137472. Throughput: 0: 43076.1. Samples: 6076221400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 05:40:29,089][15401] Updated weights for policy 0, policy_version 370860 (0.0048) [2024-06-23 05:40:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6076301312. Throughput: 0: 43199.5. Samples: 6076489840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 05:40:34,103][15401] Updated weights for policy 0, policy_version 370870 (0.0042) [2024-06-23 05:40:35,177][15349] Signal inference workers to stop experience collection... (90050 times) [2024-06-23 05:40:35,177][15349] Signal inference workers to resume experience collection... (90050 times) [2024-06-23 05:40:35,203][15401] InferenceWorker_p0-w0: stopping experience collection (90050 times) [2024-06-23 05:40:35,236][15401] InferenceWorker_p0-w0: resuming experience collection (90050 times) [2024-06-23 05:40:36,520][15401] Updated weights for policy 0, policy_version 370880 (0.0030) [2024-06-23 05:40:38,390][15132] Fps is (10 sec: 37682.6, 60 sec: 43419.3, 300 sec: 42765.4). Total num frames: 6076514304. Throughput: 0: 42960.1. Samples: 6076599100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 05:40:41,595][15401] Updated weights for policy 0, policy_version 370890 (0.0032) [2024-06-23 05:40:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 6076776448. Throughput: 0: 43060.1. Samples: 6076864840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 05:40:44,236][15401] Updated weights for policy 0, policy_version 370900 (0.0039) [2024-06-23 05:40:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6076940288. Throughput: 0: 42944.0. Samples: 6077130420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 05:40:48,951][15401] Updated weights for policy 0, policy_version 370910 (0.0032) [2024-06-23 05:40:51,967][15401] Updated weights for policy 0, policy_version 370920 (0.0031) [2024-06-23 05:40:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6077169664. Throughput: 0: 43002.3. Samples: 6077246120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 05:40:56,798][15401] Updated weights for policy 0, policy_version 370930 (0.0034) [2024-06-23 05:40:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 6077399040. Throughput: 0: 42926.5. Samples: 6077504960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:40:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 05:41:00,304][15401] Updated weights for policy 0, policy_version 370940 (0.0034) [2024-06-23 05:41:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6077579264. Throughput: 0: 42806.3. Samples: 6077765960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:41:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 05:41:04,425][15401] Updated weights for policy 0, policy_version 370950 (0.0043) [2024-06-23 05:41:07,860][15401] Updated weights for policy 0, policy_version 370960 (0.0040) [2024-06-23 05:41:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6077808640. Throughput: 0: 42842.6. Samples: 6077884500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:41:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 05:41:12,095][15401] Updated weights for policy 0, policy_version 370970 (0.0038) [2024-06-23 05:41:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 6078038016. Throughput: 0: 42765.7. Samples: 6078145860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:41:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 05:41:16,133][15401] Updated weights for policy 0, policy_version 370980 (0.0033) [2024-06-23 05:41:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6078201856. Throughput: 0: 42629.3. Samples: 6078408160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 05:41:18,390][15132] Avg episode reward: [(0, '0.144')] [2024-06-23 05:41:19,713][15401] Updated weights for policy 0, policy_version 370990 (0.0036) [2024-06-23 05:41:23,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6078447616. Throughput: 0: 42758.6. Samples: 6078523340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:23,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 05:41:23,665][15401] Updated weights for policy 0, policy_version 371000 (0.0043) [2024-06-23 05:41:27,305][15401] Updated weights for policy 0, policy_version 371010 (0.0035) [2024-06-23 05:41:28,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 6078676992. Throughput: 0: 42627.9. Samples: 6078783100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 05:41:31,425][15401] Updated weights for policy 0, policy_version 371020 (0.0046) [2024-06-23 05:41:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6078857216. Throughput: 0: 42587.1. Samples: 6079046840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 05:41:35,026][15401] Updated weights for policy 0, policy_version 371030 (0.0041) [2024-06-23 05:41:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6079086592. Throughput: 0: 42637.3. Samples: 6079164800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 05:41:39,015][15401] Updated weights for policy 0, policy_version 371040 (0.0032) [2024-06-23 05:41:42,678][15401] Updated weights for policy 0, policy_version 371050 (0.0033) [2024-06-23 05:41:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 6079299584. Throughput: 0: 42671.3. Samples: 6079425160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:43,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 05:41:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371052_6079315968.pth... [2024-06-23 05:41:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370424_6069026816.pth [2024-06-23 05:41:46,532][15401] Updated weights for policy 0, policy_version 371060 (0.0030) [2024-06-23 05:41:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42709.8). Total num frames: 6079496192. Throughput: 0: 42687.7. Samples: 6079686920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:48,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 05:41:48,606][15349] Signal inference workers to stop experience collection... (90100 times) [2024-06-23 05:41:48,606][15349] Signal inference workers to resume experience collection... (90100 times) [2024-06-23 05:41:48,619][15401] InferenceWorker_p0-w0: stopping experience collection (90100 times) [2024-06-23 05:41:48,619][15401] InferenceWorker_p0-w0: resuming experience collection (90100 times) [2024-06-23 05:41:50,442][15401] Updated weights for policy 0, policy_version 371070 (0.0024) [2024-06-23 05:41:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6079741952. Throughput: 0: 42765.8. Samples: 6079808960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 05:41:53,987][15401] Updated weights for policy 0, policy_version 371080 (0.0036) [2024-06-23 05:41:58,223][15401] Updated weights for policy 0, policy_version 371090 (0.0041) [2024-06-23 05:41:58,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 6079938560. Throughput: 0: 42682.6. Samples: 6080066680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:41:58,393][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 05:42:01,624][15401] Updated weights for policy 0, policy_version 371100 (0.0028) [2024-06-23 05:42:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 6080151552. Throughput: 0: 42606.6. Samples: 6080325460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:03,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 05:42:05,787][15401] Updated weights for policy 0, policy_version 371110 (0.0034) [2024-06-23 05:42:08,393][15132] Fps is (10 sec: 44230.1, 60 sec: 42868.7, 300 sec: 42875.5). Total num frames: 6080380928. Throughput: 0: 42742.1. Samples: 6080446800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:08,394][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 05:42:09,186][15401] Updated weights for policy 0, policy_version 371120 (0.0039) [2024-06-23 05:42:13,378][15401] Updated weights for policy 0, policy_version 371130 (0.0040) [2024-06-23 05:42:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6080593920. Throughput: 0: 42788.5. Samples: 6080708580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 05:42:16,725][15401] Updated weights for policy 0, policy_version 371140 (0.0033) [2024-06-23 05:42:18,390][15132] Fps is (10 sec: 39336.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6080774144. Throughput: 0: 42681.7. Samples: 6080967520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 05:42:20,922][15401] Updated weights for policy 0, policy_version 371150 (0.0033) [2024-06-23 05:42:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 6081036288. Throughput: 0: 42715.1. Samples: 6081087080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:23,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 05:42:24,157][15401] Updated weights for policy 0, policy_version 371160 (0.0043) [2024-06-23 05:42:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6081232896. Throughput: 0: 42906.3. Samples: 6081355940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:42:28,454][15401] Updated weights for policy 0, policy_version 371170 (0.0022) [2024-06-23 05:42:31,702][15401] Updated weights for policy 0, policy_version 371180 (0.0036) [2024-06-23 05:42:33,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6081413120. Throughput: 0: 42800.2. Samples: 6081612920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 05:42:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 05:42:36,090][15401] Updated weights for policy 0, policy_version 371190 (0.0035) [2024-06-23 05:42:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6081691648. Throughput: 0: 42805.3. Samples: 6081735200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:42:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 05:42:39,238][15401] Updated weights for policy 0, policy_version 371200 (0.0047) [2024-06-23 05:42:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6081871872. Throughput: 0: 43119.7. Samples: 6082006960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:42:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 05:42:43,677][15401] Updated weights for policy 0, policy_version 371210 (0.0031) [2024-06-23 05:42:46,829][15401] Updated weights for policy 0, policy_version 371220 (0.0026) [2024-06-23 05:42:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6082068480. Throughput: 0: 42951.8. Samples: 6082258280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:42:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 05:42:51,313][15401] Updated weights for policy 0, policy_version 371230 (0.0031) [2024-06-23 05:42:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6082314240. Throughput: 0: 43100.5. Samples: 6082386160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:42:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 05:42:54,899][15401] Updated weights for policy 0, policy_version 371240 (0.0036) [2024-06-23 05:42:58,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42868.6, 300 sec: 42819.6). Total num frames: 6082510848. Throughput: 0: 43053.0. Samples: 6082646240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:42:58,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 05:42:59,140][15401] Updated weights for policy 0, policy_version 371250 (0.0033) [2024-06-23 05:43:00,071][15349] Signal inference workers to stop experience collection... (90150 times) [2024-06-23 05:43:00,127][15401] InferenceWorker_p0-w0: stopping experience collection (90150 times) [2024-06-23 05:43:00,134][15349] Signal inference workers to resume experience collection... (90150 times) [2024-06-23 05:43:00,141][15401] InferenceWorker_p0-w0: resuming experience collection (90150 times) [2024-06-23 05:43:02,475][15401] Updated weights for policy 0, policy_version 371260 (0.0036) [2024-06-23 05:43:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 6082723840. Throughput: 0: 42745.9. Samples: 6082891080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 05:43:06,907][15401] Updated weights for policy 0, policy_version 371270 (0.0024) [2024-06-23 05:43:08,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42601.1, 300 sec: 42876.1). Total num frames: 6082936832. Throughput: 0: 43088.4. Samples: 6083025960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:08,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:43:10,122][15401] Updated weights for policy 0, policy_version 371280 (0.0028) [2024-06-23 05:43:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 6083133440. Throughput: 0: 42907.1. Samples: 6083286760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 05:43:14,533][15401] Updated weights for policy 0, policy_version 371290 (0.0030) [2024-06-23 05:43:17,973][15401] Updated weights for policy 0, policy_version 371300 (0.0031) [2024-06-23 05:43:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6083379200. Throughput: 0: 42568.9. Samples: 6083528520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 05:43:22,283][15401] Updated weights for policy 0, policy_version 371310 (0.0043) [2024-06-23 05:43:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 6083575808. Throughput: 0: 42804.1. Samples: 6083661380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 05:43:25,940][15401] Updated weights for policy 0, policy_version 371320 (0.0029) [2024-06-23 05:43:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6083788800. Throughput: 0: 42440.7. Samples: 6083916800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:28,399][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 05:43:30,431][15401] Updated weights for policy 0, policy_version 371330 (0.0030) [2024-06-23 05:43:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.5). Total num frames: 6084018176. Throughput: 0: 42449.0. Samples: 6084168480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 05:43:33,494][15401] Updated weights for policy 0, policy_version 371340 (0.0029) [2024-06-23 05:43:37,862][15401] Updated weights for policy 0, policy_version 371350 (0.0031) [2024-06-23 05:43:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6084231168. Throughput: 0: 42547.2. Samples: 6084300780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 05:43:41,345][15401] Updated weights for policy 0, policy_version 371360 (0.0035) [2024-06-23 05:43:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6084444160. Throughput: 0: 42479.0. Samples: 6084557520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 05:43:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371365_6084444160.pth... [2024-06-23 05:43:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000370741_6074220544.pth [2024-06-23 05:43:45,411][15401] Updated weights for policy 0, policy_version 371370 (0.0029) [2024-06-23 05:43:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6084657152. Throughput: 0: 42747.9. Samples: 6084814740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 05:43:49,099][15401] Updated weights for policy 0, policy_version 371380 (0.0037) [2024-06-23 05:43:53,015][15401] Updated weights for policy 0, policy_version 371390 (0.0030) [2024-06-23 05:43:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6084870144. Throughput: 0: 42651.2. Samples: 6084945260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:43:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 05:43:56,827][15401] Updated weights for policy 0, policy_version 371400 (0.0033) [2024-06-23 05:43:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 6085066752. Throughput: 0: 42450.2. Samples: 6085197020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:43:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 05:44:00,582][15401] Updated weights for policy 0, policy_version 371410 (0.0039) [2024-06-23 05:44:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6085279744. Throughput: 0: 42909.4. Samples: 6085459440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 05:44:04,410][15401] Updated weights for policy 0, policy_version 371420 (0.0032) [2024-06-23 05:44:08,380][15401] Updated weights for policy 0, policy_version 371430 (0.0039) [2024-06-23 05:44:08,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 6085509120. Throughput: 0: 42757.9. Samples: 6085585760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:08,396][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 05:44:11,898][15401] Updated weights for policy 0, policy_version 371440 (0.0035) [2024-06-23 05:44:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6085722112. Throughput: 0: 42874.7. Samples: 6085846160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 05:44:16,073][15401] Updated weights for policy 0, policy_version 371450 (0.0039) [2024-06-23 05:44:18,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6085935104. Throughput: 0: 43047.0. Samples: 6086105600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 05:44:18,573][15349] Signal inference workers to stop experience collection... (90200 times) [2024-06-23 05:44:18,584][15401] InferenceWorker_p0-w0: stopping experience collection (90200 times) [2024-06-23 05:44:18,636][15349] Signal inference workers to resume experience collection... (90200 times) [2024-06-23 05:44:18,636][15401] InferenceWorker_p0-w0: resuming experience collection (90200 times) [2024-06-23 05:44:19,289][15401] Updated weights for policy 0, policy_version 371460 (0.0050) [2024-06-23 05:44:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6086131712. Throughput: 0: 43015.1. Samples: 6086236460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 05:44:23,586][15401] Updated weights for policy 0, policy_version 371470 (0.0035) [2024-06-23 05:44:27,199][15401] Updated weights for policy 0, policy_version 371480 (0.0033) [2024-06-23 05:44:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 6086377472. Throughput: 0: 42894.6. Samples: 6086487880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:28,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 05:44:31,119][15401] Updated weights for policy 0, policy_version 371490 (0.0036) [2024-06-23 05:44:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42932.0). Total num frames: 6086574080. Throughput: 0: 42974.7. Samples: 6086748600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 05:44:34,723][15401] Updated weights for policy 0, policy_version 371500 (0.0028) [2024-06-23 05:44:38,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6086787072. Throughput: 0: 42839.1. Samples: 6086873020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 05:44:38,642][15401] Updated weights for policy 0, policy_version 371510 (0.0041) [2024-06-23 05:44:42,241][15401] Updated weights for policy 0, policy_version 371520 (0.0054) [2024-06-23 05:44:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6087000064. Throughput: 0: 42999.0. Samples: 6087131980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 05:44:46,183][15401] Updated weights for policy 0, policy_version 371530 (0.0042) [2024-06-23 05:44:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6087213056. Throughput: 0: 42886.6. Samples: 6087389340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 05:44:50,239][15401] Updated weights for policy 0, policy_version 371540 (0.0039) [2024-06-23 05:44:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6087426048. Throughput: 0: 42897.1. Samples: 6087515860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 05:44:54,043][15401] Updated weights for policy 0, policy_version 371550 (0.0027) [2024-06-23 05:44:57,719][15401] Updated weights for policy 0, policy_version 371560 (0.0029) [2024-06-23 05:44:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6087655424. Throughput: 0: 42867.6. Samples: 6087775200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:44:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 05:45:01,718][15401] Updated weights for policy 0, policy_version 371570 (0.0031) [2024-06-23 05:45:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6087852032. Throughput: 0: 42730.3. Samples: 6088028460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:45:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 05:45:05,283][15401] Updated weights for policy 0, policy_version 371580 (0.0039) [2024-06-23 05:45:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42603.0, 300 sec: 42598.7). Total num frames: 6088065024. Throughput: 0: 42715.6. Samples: 6088158660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:45:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 05:45:09,568][15401] Updated weights for policy 0, policy_version 371590 (0.0044) [2024-06-23 05:45:12,932][15401] Updated weights for policy 0, policy_version 371600 (0.0029) [2024-06-23 05:45:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6088294400. Throughput: 0: 42762.6. Samples: 6088412100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 05:45:13,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 05:45:17,328][15401] Updated weights for policy 0, policy_version 371610 (0.0026) [2024-06-23 05:45:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6088507392. Throughput: 0: 42605.3. Samples: 6088665840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:18,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-23 05:45:20,589][15401] Updated weights for policy 0, policy_version 371620 (0.0037) [2024-06-23 05:45:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6088720384. Throughput: 0: 42590.2. Samples: 6088789580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 05:45:25,102][15401] Updated weights for policy 0, policy_version 371630 (0.0036) [2024-06-23 05:45:28,311][15401] Updated weights for policy 0, policy_version 371640 (0.0035) [2024-06-23 05:45:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 6088949760. Throughput: 0: 42656.0. Samples: 6089051500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 05:45:30,115][15349] Signal inference workers to stop experience collection... (90250 times) [2024-06-23 05:45:30,115][15349] Signal inference workers to resume experience collection... (90250 times) [2024-06-23 05:45:30,150][15401] InferenceWorker_p0-w0: stopping experience collection (90250 times) [2024-06-23 05:45:30,150][15401] InferenceWorker_p0-w0: resuming experience collection (90250 times) [2024-06-23 05:45:32,748][15401] Updated weights for policy 0, policy_version 371650 (0.0047) [2024-06-23 05:45:33,394][15132] Fps is (10 sec: 40940.6, 60 sec: 42595.0, 300 sec: 42764.3). Total num frames: 6089129984. Throughput: 0: 42581.3. Samples: 6089305700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:33,395][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 05:45:36,194][15401] Updated weights for policy 0, policy_version 371660 (0.0028) [2024-06-23 05:45:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6089359360. Throughput: 0: 42595.2. Samples: 6089432640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 05:45:40,326][15401] Updated weights for policy 0, policy_version 371670 (0.0038) [2024-06-23 05:45:43,390][15132] Fps is (10 sec: 44257.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6089572352. Throughput: 0: 42503.4. Samples: 6089687860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 05:45:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371678_6089572352.pth... [2024-06-23 05:45:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371052_6079315968.pth [2024-06-23 05:45:43,829][15401] Updated weights for policy 0, policy_version 371680 (0.0033) [2024-06-23 05:45:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6089752576. Throughput: 0: 42715.2. Samples: 6089950640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 05:45:48,773][15401] Updated weights for policy 0, policy_version 371690 (0.0040) [2024-06-23 05:45:51,293][15401] Updated weights for policy 0, policy_version 371700 (0.0033) [2024-06-23 05:45:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6090014720. Throughput: 0: 42507.9. Samples: 6090071520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 05:45:56,358][15401] Updated weights for policy 0, policy_version 371710 (0.0028) [2024-06-23 05:45:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6090194944. Throughput: 0: 42638.3. Samples: 6090330820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:45:58,396][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 05:45:59,089][15401] Updated weights for policy 0, policy_version 371720 (0.0035) [2024-06-23 05:46:03,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6090407936. Throughput: 0: 42647.5. Samples: 6090585080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:03,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 05:46:03,960][15401] Updated weights for policy 0, policy_version 371730 (0.0026) [2024-06-23 05:46:06,629][15401] Updated weights for policy 0, policy_version 371740 (0.0033) [2024-06-23 05:46:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6090653696. Throughput: 0: 42800.5. Samples: 6090715600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:08,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 05:46:11,371][15401] Updated weights for policy 0, policy_version 371750 (0.0034) [2024-06-23 05:46:13,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6090850304. Throughput: 0: 42911.1. Samples: 6090982500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 05:46:14,262][15401] Updated weights for policy 0, policy_version 371760 (0.0026) [2024-06-23 05:46:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 6091063296. Throughput: 0: 42718.6. Samples: 6091227840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:18,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 05:46:18,868][15401] Updated weights for policy 0, policy_version 371770 (0.0034) [2024-06-23 05:46:22,093][15401] Updated weights for policy 0, policy_version 371780 (0.0048) [2024-06-23 05:46:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6091276288. Throughput: 0: 42856.5. Samples: 6091361180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:23,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 05:46:26,333][15401] Updated weights for policy 0, policy_version 371790 (0.0028) [2024-06-23 05:46:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6091489280. Throughput: 0: 43070.4. Samples: 6091626020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 05:46:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 05:46:29,535][15401] Updated weights for policy 0, policy_version 371800 (0.0021) [2024-06-23 05:46:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43148.0, 300 sec: 42820.6). Total num frames: 6091718656. Throughput: 0: 42808.0. Samples: 6091877000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 05:46:33,815][15401] Updated weights for policy 0, policy_version 371810 (0.0028) [2024-06-23 05:46:37,159][15401] Updated weights for policy 0, policy_version 371820 (0.0029) [2024-06-23 05:46:38,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6091948032. Throughput: 0: 42988.0. Samples: 6092006080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:38,393][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 05:46:41,351][15401] Updated weights for policy 0, policy_version 371830 (0.0037) [2024-06-23 05:46:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6092128256. Throughput: 0: 43043.9. Samples: 6092267800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 05:46:44,760][15401] Updated weights for policy 0, policy_version 371840 (0.0024) [2024-06-23 05:46:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6092357632. Throughput: 0: 42962.7. Samples: 6092518300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:48,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 05:46:48,797][15401] Updated weights for policy 0, policy_version 371850 (0.0037) [2024-06-23 05:46:51,888][15349] Signal inference workers to stop experience collection... (90300 times) [2024-06-23 05:46:51,888][15349] Signal inference workers to resume experience collection... (90300 times) [2024-06-23 05:46:51,901][15401] InferenceWorker_p0-w0: stopping experience collection (90300 times) [2024-06-23 05:46:51,931][15401] InferenceWorker_p0-w0: resuming experience collection (90300 times) [2024-06-23 05:46:52,793][15401] Updated weights for policy 0, policy_version 371860 (0.0031) [2024-06-23 05:46:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 6092570624. Throughput: 0: 43020.0. Samples: 6092651500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 05:46:56,469][15401] Updated weights for policy 0, policy_version 371870 (0.0022) [2024-06-23 05:46:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6092767232. Throughput: 0: 42856.0. Samples: 6092911020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:46:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 05:47:00,435][15401] Updated weights for policy 0, policy_version 371880 (0.0037) [2024-06-23 05:47:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43419.4, 300 sec: 42821.1). Total num frames: 6093012992. Throughput: 0: 42880.1. Samples: 6093157440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 05:47:04,060][15401] Updated weights for policy 0, policy_version 371890 (0.0028) [2024-06-23 05:47:07,939][15401] Updated weights for policy 0, policy_version 371900 (0.0036) [2024-06-23 05:47:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6093242368. Throughput: 0: 42979.4. Samples: 6093295260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 05:47:11,720][15401] Updated weights for policy 0, policy_version 371910 (0.0033) [2024-06-23 05:47:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6093422592. Throughput: 0: 42968.0. Samples: 6093559580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 05:47:15,424][15401] Updated weights for policy 0, policy_version 371920 (0.0042) [2024-06-23 05:47:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 6093668352. Throughput: 0: 42880.4. Samples: 6093806620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 05:47:19,467][15401] Updated weights for policy 0, policy_version 371930 (0.0038) [2024-06-23 05:47:23,134][15401] Updated weights for policy 0, policy_version 371940 (0.0032) [2024-06-23 05:47:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6093864960. Throughput: 0: 43089.4. Samples: 6093945000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 05:47:27,251][15401] Updated weights for policy 0, policy_version 371950 (0.0032) [2024-06-23 05:47:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6094077952. Throughput: 0: 42997.8. Samples: 6094202700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 05:47:30,747][15401] Updated weights for policy 0, policy_version 371960 (0.0028) [2024-06-23 05:47:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 6094323712. Throughput: 0: 43044.4. Samples: 6094455300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 05:47:34,776][15401] Updated weights for policy 0, policy_version 371970 (0.0037) [2024-06-23 05:47:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 6094503936. Throughput: 0: 42976.0. Samples: 6094585420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 05:47:38,673][15401] Updated weights for policy 0, policy_version 371980 (0.0036) [2024-06-23 05:47:42,084][15401] Updated weights for policy 0, policy_version 371990 (0.0024) [2024-06-23 05:47:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 6094733312. Throughput: 0: 42937.0. Samples: 6094843180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 05:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371993_6094733312.pth... [2024-06-23 05:47:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371365_6084444160.pth [2024-06-23 05:47:46,189][15401] Updated weights for policy 0, policy_version 372000 (0.0033) [2024-06-23 05:47:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6094929920. Throughput: 0: 43039.2. Samples: 6095094200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 05:47:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 05:47:50,043][15401] Updated weights for policy 0, policy_version 372010 (0.0032) [2024-06-23 05:47:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 6095142912. Throughput: 0: 42915.5. Samples: 6095226460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:47:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 05:47:53,574][15401] Updated weights for policy 0, policy_version 372020 (0.0028) [2024-06-23 05:47:57,551][15401] Updated weights for policy 0, policy_version 372030 (0.0037) [2024-06-23 05:47:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6095355904. Throughput: 0: 42795.9. Samples: 6095485400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:47:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 05:48:01,267][15401] Updated weights for policy 0, policy_version 372040 (0.0040) [2024-06-23 05:48:02,233][15349] Signal inference workers to stop experience collection... (90350 times) [2024-06-23 05:48:02,268][15401] InferenceWorker_p0-w0: stopping experience collection (90350 times) [2024-06-23 05:48:02,294][15349] Signal inference workers to resume experience collection... (90350 times) [2024-06-23 05:48:02,295][15401] InferenceWorker_p0-w0: resuming experience collection (90350 times) [2024-06-23 05:48:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6095585280. Throughput: 0: 43019.5. Samples: 6095742500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 05:48:05,134][15401] Updated weights for policy 0, policy_version 372050 (0.0049) [2024-06-23 05:48:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6095781888. Throughput: 0: 42791.5. Samples: 6095870620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 05:48:08,889][15401] Updated weights for policy 0, policy_version 372060 (0.0033) [2024-06-23 05:48:12,751][15401] Updated weights for policy 0, policy_version 372070 (0.0024) [2024-06-23 05:48:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6096011264. Throughput: 0: 42828.8. Samples: 6096130000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:13,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 05:48:16,981][15401] Updated weights for policy 0, policy_version 372080 (0.0029) [2024-06-23 05:48:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6096207872. Throughput: 0: 42748.1. Samples: 6096378960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 05:48:20,825][15401] Updated weights for policy 0, policy_version 372090 (0.0032) [2024-06-23 05:48:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6096420864. Throughput: 0: 42619.0. Samples: 6096503280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 05:48:24,786][15401] Updated weights for policy 0, policy_version 372100 (0.0039) [2024-06-23 05:48:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6096633856. Throughput: 0: 42631.1. Samples: 6096761580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 05:48:28,518][15401] Updated weights for policy 0, policy_version 372110 (0.0030) [2024-06-23 05:48:32,361][15401] Updated weights for policy 0, policy_version 372120 (0.0030) [2024-06-23 05:48:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6096846848. Throughput: 0: 42675.8. Samples: 6097014620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 05:48:36,145][15401] Updated weights for policy 0, policy_version 372130 (0.0025) [2024-06-23 05:48:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6097059840. Throughput: 0: 42547.3. Samples: 6097141080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 05:48:40,129][15401] Updated weights for policy 0, policy_version 372140 (0.0032) [2024-06-23 05:48:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6097256448. Throughput: 0: 42438.3. Samples: 6097395120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:48:43,852][15401] Updated weights for policy 0, policy_version 372150 (0.0045) [2024-06-23 05:48:48,181][15401] Updated weights for policy 0, policy_version 372160 (0.0031) [2024-06-23 05:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6097485824. Throughput: 0: 42467.3. Samples: 6097653520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:48,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 05:48:51,590][15401] Updated weights for policy 0, policy_version 372170 (0.0033) [2024-06-23 05:48:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 6097698816. Throughput: 0: 42352.5. Samples: 6097776480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 05:48:55,691][15401] Updated weights for policy 0, policy_version 372180 (0.0038) [2024-06-23 05:48:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6097911808. Throughput: 0: 42140.7. Samples: 6098026320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:48:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 05:48:59,201][15401] Updated weights for policy 0, policy_version 372190 (0.0028) [2024-06-23 05:49:03,181][15401] Updated weights for policy 0, policy_version 372200 (0.0043) [2024-06-23 05:49:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 6098124800. Throughput: 0: 42462.7. Samples: 6098289780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:49:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 05:49:07,232][15401] Updated weights for policy 0, policy_version 372210 (0.0044) [2024-06-23 05:49:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6098337792. Throughput: 0: 42685.5. Samples: 6098424120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 05:49:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 05:49:10,814][15401] Updated weights for policy 0, policy_version 372220 (0.0030) [2024-06-23 05:49:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6098550784. Throughput: 0: 42447.5. Samples: 6098671720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 05:49:14,707][15401] Updated weights for policy 0, policy_version 372230 (0.0031) [2024-06-23 05:49:18,347][15401] Updated weights for policy 0, policy_version 372240 (0.0034) [2024-06-23 05:49:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6098780160. Throughput: 0: 42701.3. Samples: 6098936180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 05:49:21,585][15349] Signal inference workers to stop experience collection... (90400 times) [2024-06-23 05:49:21,587][15349] Signal inference workers to resume experience collection... (90400 times) [2024-06-23 05:49:21,612][15401] InferenceWorker_p0-w0: stopping experience collection (90400 times) [2024-06-23 05:49:21,642][15401] InferenceWorker_p0-w0: resuming experience collection (90400 times) [2024-06-23 05:49:22,202][15401] Updated weights for policy 0, policy_version 372250 (0.0040) [2024-06-23 05:49:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 6098993152. Throughput: 0: 42751.0. Samples: 6099064880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 05:49:26,101][15401] Updated weights for policy 0, policy_version 372260 (0.0033) [2024-06-23 05:49:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6099206144. Throughput: 0: 42723.5. Samples: 6099317680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 05:49:29,765][15401] Updated weights for policy 0, policy_version 372270 (0.0052) [2024-06-23 05:49:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6099386368. Throughput: 0: 42738.5. Samples: 6099576760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:33,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 05:49:34,007][15401] Updated weights for policy 0, policy_version 372280 (0.0041) [2024-06-23 05:49:37,821][15401] Updated weights for policy 0, policy_version 372290 (0.0035) [2024-06-23 05:49:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6099632128. Throughput: 0: 42727.1. Samples: 6099699200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 05:49:41,460][15401] Updated weights for policy 0, policy_version 372300 (0.0034) [2024-06-23 05:49:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6099861504. Throughput: 0: 42981.2. Samples: 6099960480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 05:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372306_6099861504.pth... [2024-06-23 05:49:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371678_6089572352.pth [2024-06-23 05:49:45,211][15401] Updated weights for policy 0, policy_version 372310 (0.0041) [2024-06-23 05:49:48,396][15132] Fps is (10 sec: 39296.6, 60 sec: 42320.7, 300 sec: 42708.6). Total num frames: 6100025344. Throughput: 0: 42998.4. Samples: 6100224980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:48,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:49:49,175][15401] Updated weights for policy 0, policy_version 372320 (0.0025) [2024-06-23 05:49:52,651][15401] Updated weights for policy 0, policy_version 372330 (0.0033) [2024-06-23 05:49:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6100271104. Throughput: 0: 42740.4. Samples: 6100347440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 05:49:56,958][15401] Updated weights for policy 0, policy_version 372340 (0.0045) [2024-06-23 05:49:58,389][15132] Fps is (10 sec: 47544.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6100500480. Throughput: 0: 43007.6. Samples: 6100607060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:49:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 05:50:00,184][15401] Updated weights for policy 0, policy_version 372350 (0.0042) [2024-06-23 05:50:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6100680704. Throughput: 0: 42995.2. Samples: 6100870960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 05:50:04,585][15401] Updated weights for policy 0, policy_version 372360 (0.0041) [2024-06-23 05:50:07,653][15401] Updated weights for policy 0, policy_version 372370 (0.0042) [2024-06-23 05:50:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 6100926464. Throughput: 0: 42904.9. Samples: 6100995600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 05:50:12,083][15401] Updated weights for policy 0, policy_version 372380 (0.0031) [2024-06-23 05:50:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6101155840. Throughput: 0: 43102.7. Samples: 6101257300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:13,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 05:50:15,544][15401] Updated weights for policy 0, policy_version 372390 (0.0031) [2024-06-23 05:50:18,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 6101319680. Throughput: 0: 43071.1. Samples: 6101515060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:18,392][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 05:50:19,788][15401] Updated weights for policy 0, policy_version 372400 (0.0028) [2024-06-23 05:50:23,080][15401] Updated weights for policy 0, policy_version 372410 (0.0036) [2024-06-23 05:50:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6101565440. Throughput: 0: 43014.1. Samples: 6101634840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 05:50:27,384][15401] Updated weights for policy 0, policy_version 372420 (0.0040) [2024-06-23 05:50:28,390][15132] Fps is (10 sec: 42607.5, 60 sec: 42325.2, 300 sec: 42765.7). Total num frames: 6101745664. Throughput: 0: 42969.2. Samples: 6101894100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 05:50:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 05:50:30,722][15401] Updated weights for policy 0, policy_version 372430 (0.0044) [2024-06-23 05:50:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6101975040. Throughput: 0: 42757.1. Samples: 6102148780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 05:50:35,190][15401] Updated weights for policy 0, policy_version 372440 (0.0029) [2024-06-23 05:50:37,346][15349] Signal inference workers to stop experience collection... (90450 times) [2024-06-23 05:50:37,400][15401] InferenceWorker_p0-w0: stopping experience collection (90450 times) [2024-06-23 05:50:37,403][15349] Signal inference workers to resume experience collection... (90450 times) [2024-06-23 05:50:37,421][15401] InferenceWorker_p0-w0: resuming experience collection (90450 times) [2024-06-23 05:50:38,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6102204416. Throughput: 0: 42896.3. Samples: 6102277780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 05:50:38,471][15401] Updated weights for policy 0, policy_version 372450 (0.0038) [2024-06-23 05:50:42,752][15401] Updated weights for policy 0, policy_version 372460 (0.0042) [2024-06-23 05:50:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6102417408. Throughput: 0: 42891.9. Samples: 6102537200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 05:50:46,004][15401] Updated weights for policy 0, policy_version 372470 (0.0044) [2024-06-23 05:50:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43422.3, 300 sec: 42765.0). Total num frames: 6102630400. Throughput: 0: 42704.5. Samples: 6102792660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 05:50:50,475][15401] Updated weights for policy 0, policy_version 372480 (0.0030) [2024-06-23 05:50:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6102859776. Throughput: 0: 43004.6. Samples: 6102930800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 05:50:53,452][15401] Updated weights for policy 0, policy_version 372490 (0.0054) [2024-06-23 05:50:57,929][15401] Updated weights for policy 0, policy_version 372500 (0.0034) [2024-06-23 05:50:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 6103056384. Throughput: 0: 42905.8. Samples: 6103188060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:50:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 05:51:01,306][15401] Updated weights for policy 0, policy_version 372510 (0.0034) [2024-06-23 05:51:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6103269376. Throughput: 0: 42750.1. Samples: 6103438720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 05:51:05,513][15401] Updated weights for policy 0, policy_version 372520 (0.0026) [2024-06-23 05:51:08,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 6103498752. Throughput: 0: 43046.0. Samples: 6103572180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:08,396][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 05:51:08,941][15401] Updated weights for policy 0, policy_version 372530 (0.0039) [2024-06-23 05:51:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 6103678976. Throughput: 0: 42981.5. Samples: 6103828260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 05:51:13,546][15401] Updated weights for policy 0, policy_version 372540 (0.0037) [2024-06-23 05:51:16,419][15401] Updated weights for policy 0, policy_version 372550 (0.0022) [2024-06-23 05:51:18,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 6103924736. Throughput: 0: 42901.9. Samples: 6104079360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 05:51:21,049][15401] Updated weights for policy 0, policy_version 372560 (0.0038) [2024-06-23 05:51:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6104137728. Throughput: 0: 43059.3. Samples: 6104215440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:23,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 05:51:23,967][15401] Updated weights for policy 0, policy_version 372570 (0.0031) [2024-06-23 05:51:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 6104317952. Throughput: 0: 42908.1. Samples: 6104468060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 05:51:29,095][15401] Updated weights for policy 0, policy_version 372580 (0.0035) [2024-06-23 05:51:31,864][15401] Updated weights for policy 0, policy_version 372590 (0.0033) [2024-06-23 05:51:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 6104563712. Throughput: 0: 42843.5. Samples: 6104720620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 05:51:36,572][15401] Updated weights for policy 0, policy_version 372600 (0.0034) [2024-06-23 05:51:38,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 6104776704. Throughput: 0: 42853.7. Samples: 6104859320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:38,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 05:51:39,559][15401] Updated weights for policy 0, policy_version 372610 (0.0032) [2024-06-23 05:51:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6104973312. Throughput: 0: 42608.5. Samples: 6105105440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 05:51:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 05:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372618_6104973312.pth... [2024-06-23 05:51:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000371993_6094733312.pth [2024-06-23 05:51:44,234][15401] Updated weights for policy 0, policy_version 372620 (0.0037) [2024-06-23 05:51:47,230][15401] Updated weights for policy 0, policy_version 372630 (0.0026) [2024-06-23 05:51:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6105219072. Throughput: 0: 42646.7. Samples: 6105357820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:51:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 05:51:51,728][15401] Updated weights for policy 0, policy_version 372640 (0.0021) [2024-06-23 05:51:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6105399296. Throughput: 0: 42836.4. Samples: 6105499540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:51:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 05:51:54,752][15349] Signal inference workers to stop experience collection... (90500 times) [2024-06-23 05:51:54,777][15401] InferenceWorker_p0-w0: stopping experience collection (90500 times) [2024-06-23 05:51:54,811][15349] Signal inference workers to resume experience collection... (90500 times) [2024-06-23 05:51:54,812][15401] InferenceWorker_p0-w0: resuming experience collection (90500 times) [2024-06-23 05:51:54,954][15401] Updated weights for policy 0, policy_version 372650 (0.0026) [2024-06-23 05:51:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6105628672. Throughput: 0: 42710.7. Samples: 6105750240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:51:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 05:51:59,304][15401] Updated weights for policy 0, policy_version 372660 (0.0028) [2024-06-23 05:52:02,745][15401] Updated weights for policy 0, policy_version 372670 (0.0039) [2024-06-23 05:52:03,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6105874432. Throughput: 0: 42823.2. Samples: 6106006400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 05:52:06,848][15401] Updated weights for policy 0, policy_version 372680 (0.0032) [2024-06-23 05:52:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 6106054656. Throughput: 0: 42646.2. Samples: 6106134520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:08,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 05:52:10,505][15401] Updated weights for policy 0, policy_version 372690 (0.0036) [2024-06-23 05:52:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 6106284032. Throughput: 0: 42607.1. Samples: 6106385380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 05:52:14,351][15401] Updated weights for policy 0, policy_version 372700 (0.0036) [2024-06-23 05:52:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6106464256. Throughput: 0: 42889.0. Samples: 6106650620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 05:52:18,429][15401] Updated weights for policy 0, policy_version 372710 (0.0035) [2024-06-23 05:52:21,929][15401] Updated weights for policy 0, policy_version 372720 (0.0034) [2024-06-23 05:52:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6106693632. Throughput: 0: 42597.4. Samples: 6106776100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:23,391][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 05:52:26,051][15401] Updated weights for policy 0, policy_version 372730 (0.0040) [2024-06-23 05:52:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 6106939392. Throughput: 0: 42796.8. Samples: 6107031300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 05:52:29,728][15401] Updated weights for policy 0, policy_version 372740 (0.0035) [2024-06-23 05:52:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6107119616. Throughput: 0: 43005.9. Samples: 6107293080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 05:52:33,691][15401] Updated weights for policy 0, policy_version 372750 (0.0030) [2024-06-23 05:52:37,434][15401] Updated weights for policy 0, policy_version 372760 (0.0035) [2024-06-23 05:52:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 6107316224. Throughput: 0: 42582.2. Samples: 6107415740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 05:52:41,164][15401] Updated weights for policy 0, policy_version 372770 (0.0032) [2024-06-23 05:52:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6107578368. Throughput: 0: 42868.2. Samples: 6107679300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 05:52:44,939][15401] Updated weights for policy 0, policy_version 372780 (0.0041) [2024-06-23 05:52:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6107758592. Throughput: 0: 42913.8. Samples: 6107937520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 05:52:48,678][15401] Updated weights for policy 0, policy_version 372790 (0.0028) [2024-06-23 05:52:52,630][15401] Updated weights for policy 0, policy_version 372800 (0.0038) [2024-06-23 05:52:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6107971584. Throughput: 0: 42759.1. Samples: 6108058680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:53,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 05:52:56,143][15401] Updated weights for policy 0, policy_version 372810 (0.0037) [2024-06-23 05:52:58,390][15132] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 6108250112. Throughput: 0: 43220.8. Samples: 6108330320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:52:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 05:53:00,222][15401] Updated weights for policy 0, policy_version 372820 (0.0034) [2024-06-23 05:53:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6108413952. Throughput: 0: 42896.3. Samples: 6108580960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-23 05:53:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 05:53:03,810][15401] Updated weights for policy 0, policy_version 372830 (0.0029) [2024-06-23 05:53:07,729][15401] Updated weights for policy 0, policy_version 372840 (0.0026) [2024-06-23 05:53:08,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6108610560. Throughput: 0: 42823.4. Samples: 6108703160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 05:53:11,429][15401] Updated weights for policy 0, policy_version 372850 (0.0028) [2024-06-23 05:53:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 6108856320. Throughput: 0: 43072.9. Samples: 6108969680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:13,393][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 05:53:15,474][15401] Updated weights for policy 0, policy_version 372860 (0.0024) [2024-06-23 05:53:17,352][15349] Signal inference workers to stop experience collection... (90550 times) [2024-06-23 05:53:17,406][15401] InferenceWorker_p0-w0: stopping experience collection (90550 times) [2024-06-23 05:53:17,414][15349] Signal inference workers to resume experience collection... (90550 times) [2024-06-23 05:53:17,419][15401] InferenceWorker_p0-w0: resuming experience collection (90550 times) [2024-06-23 05:53:18,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6109052928. Throughput: 0: 42862.1. Samples: 6109221980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:18,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 05:53:19,406][15401] Updated weights for policy 0, policy_version 372870 (0.0033) [2024-06-23 05:53:23,294][15401] Updated weights for policy 0, policy_version 372880 (0.0028) [2024-06-23 05:53:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6109265920. Throughput: 0: 42910.5. Samples: 6109346720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 05:53:26,999][15401] Updated weights for policy 0, policy_version 372890 (0.0033) [2024-06-23 05:53:28,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 6109511680. Throughput: 0: 42899.1. Samples: 6109609760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 05:53:30,774][15401] Updated weights for policy 0, policy_version 372900 (0.0038) [2024-06-23 05:53:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6109708288. Throughput: 0: 42811.6. Samples: 6109864040. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 05:53:34,741][15401] Updated weights for policy 0, policy_version 372910 (0.0049) [2024-06-23 05:53:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6109904896. Throughput: 0: 42725.3. Samples: 6109981420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:38,392][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 05:53:38,653][15401] Updated weights for policy 0, policy_version 372920 (0.0039) [2024-06-23 05:53:42,507][15401] Updated weights for policy 0, policy_version 372930 (0.0028) [2024-06-23 05:53:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6110134272. Throughput: 0: 42603.1. Samples: 6110247460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 05:53:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372933_6110134272.pth... [2024-06-23 05:53:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372306_6099861504.pth [2024-06-23 05:53:46,666][15401] Updated weights for policy 0, policy_version 372940 (0.0032) [2024-06-23 05:53:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6110330880. Throughput: 0: 42495.2. Samples: 6110493240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 05:53:50,064][15401] Updated weights for policy 0, policy_version 372950 (0.0042) [2024-06-23 05:53:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6110543872. Throughput: 0: 42579.2. Samples: 6110619220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 05:53:54,500][15401] Updated weights for policy 0, policy_version 372960 (0.0033) [2024-06-23 05:53:57,642][15401] Updated weights for policy 0, policy_version 372970 (0.0034) [2024-06-23 05:53:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 42765.0). Total num frames: 6110740480. Throughput: 0: 42290.2. Samples: 6110872640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:53:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 05:54:02,078][15401] Updated weights for policy 0, policy_version 372980 (0.0034) [2024-06-23 05:54:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6110953472. Throughput: 0: 42476.9. Samples: 6111133340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:54:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 05:54:05,740][15401] Updated weights for policy 0, policy_version 372990 (0.0055) [2024-06-23 05:54:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6111199232. Throughput: 0: 42416.2. Samples: 6111255440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:54:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 05:54:09,549][15401] Updated weights for policy 0, policy_version 373000 (0.0029) [2024-06-23 05:54:13,273][15401] Updated weights for policy 0, policy_version 373010 (0.0033) [2024-06-23 05:54:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 6111395840. Throughput: 0: 42356.0. Samples: 6111515780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:54:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 05:54:17,169][15401] Updated weights for policy 0, policy_version 373020 (0.0033) [2024-06-23 05:54:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 6111576064. Throughput: 0: 42464.9. Samples: 6111774960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:54:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 05:54:20,848][15401] Updated weights for policy 0, policy_version 373030 (0.0047) [2024-06-23 05:54:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6111821824. Throughput: 0: 42504.5. Samples: 6111894020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-23 05:54:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 05:54:24,727][15401] Updated weights for policy 0, policy_version 373040 (0.0038) [2024-06-23 05:54:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 6112034816. Throughput: 0: 42410.6. Samples: 6112155940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 05:54:28,654][15401] Updated weights for policy 0, policy_version 373050 (0.0033) [2024-06-23 05:54:32,259][15401] Updated weights for policy 0, policy_version 373060 (0.0028) [2024-06-23 05:54:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 6112215040. Throughput: 0: 42594.6. Samples: 6112410000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 05:54:36,615][15401] Updated weights for policy 0, policy_version 373070 (0.0038) [2024-06-23 05:54:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 6112444416. Throughput: 0: 42556.5. Samples: 6112534260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 05:54:40,093][15401] Updated weights for policy 0, policy_version 373080 (0.0036) [2024-06-23 05:54:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42877.0). Total num frames: 6112673792. Throughput: 0: 42662.6. Samples: 6112792460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 05:54:44,312][15401] Updated weights for policy 0, policy_version 373090 (0.0023) [2024-06-23 05:54:46,080][15349] Signal inference workers to stop experience collection... (90600 times) [2024-06-23 05:54:46,080][15349] Signal inference workers to resume experience collection... (90600 times) [2024-06-23 05:54:46,112][15401] InferenceWorker_p0-w0: stopping experience collection (90600 times) [2024-06-23 05:54:46,112][15401] InferenceWorker_p0-w0: resuming experience collection (90600 times) [2024-06-23 05:54:47,609][15401] Updated weights for policy 0, policy_version 373100 (0.0039) [2024-06-23 05:54:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6112870400. Throughput: 0: 42469.7. Samples: 6113044480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:48,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 05:54:51,838][15401] Updated weights for policy 0, policy_version 373110 (0.0033) [2024-06-23 05:54:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6113083392. Throughput: 0: 42546.9. Samples: 6113170060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 05:54:55,575][15401] Updated weights for policy 0, policy_version 373120 (0.0031) [2024-06-23 05:54:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6113296384. Throughput: 0: 42484.4. Samples: 6113427580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:54:58,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 05:54:59,359][15401] Updated weights for policy 0, policy_version 373130 (0.0031) [2024-06-23 05:55:03,229][15401] Updated weights for policy 0, policy_version 373140 (0.0048) [2024-06-23 05:55:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6113525760. Throughput: 0: 42329.6. Samples: 6113679800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 05:55:07,196][15401] Updated weights for policy 0, policy_version 373150 (0.0040) [2024-06-23 05:55:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6113738752. Throughput: 0: 42641.4. Samples: 6113812880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 05:55:10,666][15401] Updated weights for policy 0, policy_version 373160 (0.0039) [2024-06-23 05:55:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.3). Total num frames: 6113935360. Throughput: 0: 42432.0. Samples: 6114065380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 05:55:14,780][15401] Updated weights for policy 0, policy_version 373170 (0.0036) [2024-06-23 05:55:18,199][15401] Updated weights for policy 0, policy_version 373180 (0.0040) [2024-06-23 05:55:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6114181120. Throughput: 0: 42344.9. Samples: 6114315520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:18,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-23 05:55:22,396][15401] Updated weights for policy 0, policy_version 373190 (0.0026) [2024-06-23 05:55:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6114361344. Throughput: 0: 42580.4. Samples: 6114450380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:23,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 05:55:25,769][15401] Updated weights for policy 0, policy_version 373200 (0.0041) [2024-06-23 05:55:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6114574336. Throughput: 0: 42461.3. Samples: 6114703220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 05:55:30,278][15401] Updated weights for policy 0, policy_version 373210 (0.0038) [2024-06-23 05:55:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6114820096. Throughput: 0: 42403.7. Samples: 6114952640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:33,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-23 05:55:33,669][15401] Updated weights for policy 0, policy_version 373220 (0.0038) [2024-06-23 05:55:37,939][15401] Updated weights for policy 0, policy_version 373230 (0.0034) [2024-06-23 05:55:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6115000320. Throughput: 0: 42536.7. Samples: 6115084200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 05:55:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 05:55:41,528][15401] Updated weights for policy 0, policy_version 373240 (0.0035) [2024-06-23 05:55:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6115196928. Throughput: 0: 42326.6. Samples: 6115332280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:55:43,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 05:55:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373242_6115196928.pth... [2024-06-23 05:55:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372618_6104973312.pth [2024-06-23 05:55:45,634][15401] Updated weights for policy 0, policy_version 373250 (0.0029) [2024-06-23 05:55:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6115426304. Throughput: 0: 42478.7. Samples: 6115591340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:55:48,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 05:55:49,204][15401] Updated weights for policy 0, policy_version 373260 (0.0034) [2024-06-23 05:55:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 6115639296. Throughput: 0: 42491.6. Samples: 6115725000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:55:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 05:55:53,752][15401] Updated weights for policy 0, policy_version 373270 (0.0031) [2024-06-23 05:55:57,102][15401] Updated weights for policy 0, policy_version 373280 (0.0029) [2024-06-23 05:55:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6115852288. Throughput: 0: 42465.4. Samples: 6115976320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:55:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 05:56:01,170][15401] Updated weights for policy 0, policy_version 373290 (0.0021) [2024-06-23 05:56:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 6116081664. Throughput: 0: 42734.7. Samples: 6116238580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 05:56:03,811][15349] Signal inference workers to stop experience collection... (90650 times) [2024-06-23 05:56:03,852][15401] InferenceWorker_p0-w0: stopping experience collection (90650 times) [2024-06-23 05:56:03,859][15349] Signal inference workers to resume experience collection... (90650 times) [2024-06-23 05:56:03,869][15401] InferenceWorker_p0-w0: resuming experience collection (90650 times) [2024-06-23 05:56:04,509][15401] Updated weights for policy 0, policy_version 373300 (0.0041) [2024-06-23 05:56:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6116294656. Throughput: 0: 42726.2. Samples: 6116373060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:08,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 05:56:08,589][15401] Updated weights for policy 0, policy_version 373310 (0.0040) [2024-06-23 05:56:12,086][15401] Updated weights for policy 0, policy_version 373320 (0.0030) [2024-06-23 05:56:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6116507648. Throughput: 0: 42752.0. Samples: 6116627060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 05:56:16,356][15401] Updated weights for policy 0, policy_version 373330 (0.0037) [2024-06-23 05:56:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6116737024. Throughput: 0: 42949.7. Samples: 6116885380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 05:56:19,816][15401] Updated weights for policy 0, policy_version 373340 (0.0026) [2024-06-23 05:56:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6116933632. Throughput: 0: 43060.9. Samples: 6117021940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 05:56:23,934][15401] Updated weights for policy 0, policy_version 373350 (0.0045) [2024-06-23 05:56:27,322][15401] Updated weights for policy 0, policy_version 373360 (0.0029) [2024-06-23 05:56:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6117146624. Throughput: 0: 43045.8. Samples: 6117269340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 05:56:31,426][15401] Updated weights for policy 0, policy_version 373370 (0.0028) [2024-06-23 05:56:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6117376000. Throughput: 0: 43087.2. Samples: 6117530260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 05:56:35,019][15401] Updated weights for policy 0, policy_version 373380 (0.0037) [2024-06-23 05:56:38,391][15132] Fps is (10 sec: 44230.2, 60 sec: 43143.4, 300 sec: 42764.8). Total num frames: 6117588992. Throughput: 0: 43016.2. Samples: 6117660800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:38,391][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:56:39,022][15401] Updated weights for policy 0, policy_version 373390 (0.0041) [2024-06-23 05:56:42,660][15401] Updated weights for policy 0, policy_version 373400 (0.0025) [2024-06-23 05:56:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 6117801984. Throughput: 0: 43089.7. Samples: 6117915360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 05:56:46,479][15401] Updated weights for policy 0, policy_version 373410 (0.0053) [2024-06-23 05:56:48,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 6117998592. Throughput: 0: 42893.1. Samples: 6118168780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 05:56:50,553][15401] Updated weights for policy 0, policy_version 373420 (0.0032) [2024-06-23 05:56:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6118211584. Throughput: 0: 42770.9. Samples: 6118297740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 05:56:54,571][15401] Updated weights for policy 0, policy_version 373430 (0.0037) [2024-06-23 05:56:58,174][15401] Updated weights for policy 0, policy_version 373440 (0.0033) [2024-06-23 05:56:58,390][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6118440960. Throughput: 0: 42928.0. Samples: 6118558820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 05:56:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 05:57:02,062][15401] Updated weights for policy 0, policy_version 373450 (0.0038) [2024-06-23 05:57:03,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 6118653952. Throughput: 0: 42883.1. Samples: 6118815220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:03,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 05:57:05,871][15401] Updated weights for policy 0, policy_version 373460 (0.0037) [2024-06-23 05:57:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6118850560. Throughput: 0: 42693.7. Samples: 6118943160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 05:57:09,568][15401] Updated weights for policy 0, policy_version 373470 (0.0027) [2024-06-23 05:57:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6119079936. Throughput: 0: 42922.2. Samples: 6119200840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 05:57:13,619][15401] Updated weights for policy 0, policy_version 373480 (0.0046) [2024-06-23 05:57:17,143][15401] Updated weights for policy 0, policy_version 373490 (0.0041) [2024-06-23 05:57:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6119309312. Throughput: 0: 42935.2. Samples: 6119462340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 05:57:21,167][15401] Updated weights for policy 0, policy_version 373500 (0.0022) [2024-06-23 05:57:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 6119522304. Throughput: 0: 42900.9. Samples: 6119591380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:23,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 05:57:24,851][15401] Updated weights for policy 0, policy_version 373510 (0.0033) [2024-06-23 05:57:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 6119718912. Throughput: 0: 42920.1. Samples: 6119846860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:28,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 05:57:28,725][15401] Updated weights for policy 0, policy_version 373520 (0.0041) [2024-06-23 05:57:32,660][15401] Updated weights for policy 0, policy_version 373530 (0.0033) [2024-06-23 05:57:33,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6119964672. Throughput: 0: 43083.8. Samples: 6120107540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 05:57:36,479][15401] Updated weights for policy 0, policy_version 373540 (0.0027) [2024-06-23 05:57:38,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42872.6, 300 sec: 42653.9). Total num frames: 6120161280. Throughput: 0: 43151.9. Samples: 6120239580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 05:57:40,181][15401] Updated weights for policy 0, policy_version 373550 (0.0035) [2024-06-23 05:57:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6120374272. Throughput: 0: 42895.1. Samples: 6120489100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 05:57:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373559_6120390656.pth... [2024-06-23 05:57:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000372933_6110134272.pth [2024-06-23 05:57:44,102][15401] Updated weights for policy 0, policy_version 373560 (0.0023) [2024-06-23 05:57:45,636][15349] Signal inference workers to stop experience collection... (90700 times) [2024-06-23 05:57:45,636][15349] Signal inference workers to resume experience collection... (90700 times) [2024-06-23 05:57:45,664][15401] InferenceWorker_p0-w0: stopping experience collection (90700 times) [2024-06-23 05:57:45,664][15401] InferenceWorker_p0-w0: resuming experience collection (90700 times) [2024-06-23 05:57:47,771][15401] Updated weights for policy 0, policy_version 373570 (0.0026) [2024-06-23 05:57:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 6120587264. Throughput: 0: 42931.6. Samples: 6120747040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 05:57:51,425][15401] Updated weights for policy 0, policy_version 373580 (0.0029) [2024-06-23 05:57:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 6120800256. Throughput: 0: 42972.0. Samples: 6120876900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 05:57:55,488][15401] Updated weights for policy 0, policy_version 373590 (0.0036) [2024-06-23 05:57:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6121029632. Throughput: 0: 42938.7. Samples: 6121133080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:57:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 05:57:59,772][15401] Updated weights for policy 0, policy_version 373600 (0.0038) [2024-06-23 05:58:03,229][15401] Updated weights for policy 0, policy_version 373610 (0.0045) [2024-06-23 05:58:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6121226240. Throughput: 0: 42800.7. Samples: 6121388380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:58:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 05:58:07,299][15401] Updated weights for policy 0, policy_version 373620 (0.0034) [2024-06-23 05:58:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 6121439232. Throughput: 0: 42563.6. Samples: 6121506640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:58:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 05:58:10,919][15401] Updated weights for policy 0, policy_version 373630 (0.0037) [2024-06-23 05:58:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 6121652224. Throughput: 0: 42782.8. Samples: 6121771980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:58:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 05:58:14,704][15401] Updated weights for policy 0, policy_version 373640 (0.0022) [2024-06-23 05:58:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6121865216. Throughput: 0: 42759.2. Samples: 6122031700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 05:58:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 05:58:18,458][15401] Updated weights for policy 0, policy_version 373650 (0.0038) [2024-06-23 05:58:22,106][15401] Updated weights for policy 0, policy_version 373660 (0.0035) [2024-06-23 05:58:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 6122061824. Throughput: 0: 42591.1. Samples: 6122156180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:23,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 05:58:26,212][15401] Updated weights for policy 0, policy_version 373670 (0.0032) [2024-06-23 05:58:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 6122307584. Throughput: 0: 42801.1. Samples: 6122415140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 05:58:29,534][15401] Updated weights for policy 0, policy_version 373680 (0.0030) [2024-06-23 05:58:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6122504192. Throughput: 0: 42779.0. Samples: 6122672100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 05:58:33,732][15401] Updated weights for policy 0, policy_version 373690 (0.0033) [2024-06-23 05:58:37,327][15401] Updated weights for policy 0, policy_version 373700 (0.0035) [2024-06-23 05:58:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6122717184. Throughput: 0: 42622.6. Samples: 6122794920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:38,395][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 05:58:41,414][15401] Updated weights for policy 0, policy_version 373710 (0.0028) [2024-06-23 05:58:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6122946560. Throughput: 0: 42698.3. Samples: 6123054500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 05:58:44,929][15401] Updated weights for policy 0, policy_version 373720 (0.0027) [2024-06-23 05:58:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6123126784. Throughput: 0: 42863.6. Samples: 6123317240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 05:58:48,982][15401] Updated weights for policy 0, policy_version 373730 (0.0032) [2024-06-23 05:58:52,587][15401] Updated weights for policy 0, policy_version 373740 (0.0027) [2024-06-23 05:58:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6123356160. Throughput: 0: 43020.9. Samples: 6123442580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 05:58:56,998][15401] Updated weights for policy 0, policy_version 373750 (0.0039) [2024-06-23 05:58:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6123585536. Throughput: 0: 42785.7. Samples: 6123697340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:58:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 05:59:00,308][15401] Updated weights for policy 0, policy_version 373760 (0.0029) [2024-06-23 05:59:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6123765760. Throughput: 0: 42829.8. Samples: 6123959040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 05:59:04,530][15401] Updated weights for policy 0, policy_version 373770 (0.0050) [2024-06-23 05:59:07,991][15401] Updated weights for policy 0, policy_version 373780 (0.0026) [2024-06-23 05:59:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6124011520. Throughput: 0: 42790.8. Samples: 6124081760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 05:59:10,528][15349] Signal inference workers to stop experience collection... (90750 times) [2024-06-23 05:59:10,528][15349] Signal inference workers to resume experience collection... (90750 times) [2024-06-23 05:59:10,545][15401] InferenceWorker_p0-w0: stopping experience collection (90750 times) [2024-06-23 05:59:10,545][15401] InferenceWorker_p0-w0: resuming experience collection (90750 times) [2024-06-23 05:59:12,122][15401] Updated weights for policy 0, policy_version 373790 (0.0034) [2024-06-23 05:59:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6124224512. Throughput: 0: 42846.5. Samples: 6124343240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 05:59:15,654][15401] Updated weights for policy 0, policy_version 373800 (0.0049) [2024-06-23 05:59:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6124421120. Throughput: 0: 42890.4. Samples: 6124602160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 05:59:19,816][15401] Updated weights for policy 0, policy_version 373810 (0.0031) [2024-06-23 05:59:23,338][15401] Updated weights for policy 0, policy_version 373820 (0.0036) [2024-06-23 05:59:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6124666880. Throughput: 0: 42863.2. Samples: 6124723760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 05:59:27,359][15401] Updated weights for policy 0, policy_version 373830 (0.0034) [2024-06-23 05:59:28,391][15132] Fps is (10 sec: 44231.3, 60 sec: 42597.5, 300 sec: 42875.9). Total num frames: 6124863488. Throughput: 0: 42930.8. Samples: 6124986440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:28,391][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 05:59:30,871][15401] Updated weights for policy 0, policy_version 373840 (0.0044) [2024-06-23 05:59:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 6125060096. Throughput: 0: 42830.3. Samples: 6125244600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 05:59:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 05:59:35,211][15401] Updated weights for policy 0, policy_version 373850 (0.0039) [2024-06-23 05:59:38,390][15132] Fps is (10 sec: 44242.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6125305856. Throughput: 0: 42828.9. Samples: 6125369880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 05:59:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 05:59:38,481][15401] Updated weights for policy 0, policy_version 373860 (0.0032) [2024-06-23 05:59:43,006][15401] Updated weights for policy 0, policy_version 373870 (0.0033) [2024-06-23 05:59:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6125502464. Throughput: 0: 42988.9. Samples: 6125631840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 05:59:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 05:59:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373871_6125502464.pth... [2024-06-23 05:59:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373242_6115196928.pth [2024-06-23 05:59:46,027][15401] Updated weights for policy 0, policy_version 373880 (0.0044) [2024-06-23 05:59:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6125699072. Throughput: 0: 42760.9. Samples: 6125883280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 05:59:48,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-23 05:59:50,737][15401] Updated weights for policy 0, policy_version 373890 (0.0033) [2024-06-23 05:59:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6125944832. Throughput: 0: 42913.6. Samples: 6126012880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 05:59:53,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-23 05:59:53,744][15401] Updated weights for policy 0, policy_version 373900 (0.0035) [2024-06-23 05:59:58,166][15401] Updated weights for policy 0, policy_version 373910 (0.0039) [2024-06-23 05:59:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6126141440. Throughput: 0: 42965.8. Samples: 6126276700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 05:59:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 06:00:01,425][15401] Updated weights for policy 0, policy_version 373920 (0.0037) [2024-06-23 06:00:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6126354432. Throughput: 0: 42876.4. Samples: 6126531600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 06:00:05,956][15401] Updated weights for policy 0, policy_version 373930 (0.0029) [2024-06-23 06:00:08,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 6126583808. Throughput: 0: 43018.3. Samples: 6126659860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:08,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 06:00:09,428][15401] Updated weights for policy 0, policy_version 373940 (0.0038) [2024-06-23 06:00:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6126780416. Throughput: 0: 42860.4. Samples: 6126915100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 06:00:13,418][15401] Updated weights for policy 0, policy_version 373950 (0.0034) [2024-06-23 06:00:16,971][15401] Updated weights for policy 0, policy_version 373960 (0.0031) [2024-06-23 06:00:18,392][15132] Fps is (10 sec: 42615.4, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 6127009792. Throughput: 0: 42851.4. Samples: 6127173020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:18,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 06:00:20,961][15401] Updated weights for policy 0, policy_version 373970 (0.0044) [2024-06-23 06:00:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6127222784. Throughput: 0: 42977.9. Samples: 6127303880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 06:00:24,717][15401] Updated weights for policy 0, policy_version 373980 (0.0038) [2024-06-23 06:00:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42599.3, 300 sec: 42709.5). Total num frames: 6127419392. Throughput: 0: 42860.0. Samples: 6127560540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 06:00:28,917][15401] Updated weights for policy 0, policy_version 373990 (0.0034) [2024-06-23 06:00:32,400][15401] Updated weights for policy 0, policy_version 374000 (0.0038) [2024-06-23 06:00:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6127648768. Throughput: 0: 42935.6. Samples: 6127815380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 06:00:36,743][15401] Updated weights for policy 0, policy_version 374010 (0.0033) [2024-06-23 06:00:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6127861760. Throughput: 0: 42949.5. Samples: 6127945600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 06:00:39,991][15401] Updated weights for policy 0, policy_version 374020 (0.0027) [2024-06-23 06:00:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6128074752. Throughput: 0: 42684.5. Samples: 6128197500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:43,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 06:00:44,306][15401] Updated weights for policy 0, policy_version 374030 (0.0024) [2024-06-23 06:00:47,752][15401] Updated weights for policy 0, policy_version 374040 (0.0039) [2024-06-23 06:00:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6128287744. Throughput: 0: 42723.1. Samples: 6128454140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 06:00:49,399][15349] Signal inference workers to stop experience collection... (90800 times) [2024-06-23 06:00:49,449][15401] InferenceWorker_p0-w0: stopping experience collection (90800 times) [2024-06-23 06:00:49,458][15349] Signal inference workers to resume experience collection... (90800 times) [2024-06-23 06:00:49,467][15401] InferenceWorker_p0-w0: resuming experience collection (90800 times) [2024-06-23 06:00:51,770][15401] Updated weights for policy 0, policy_version 374050 (0.0037) [2024-06-23 06:00:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6128500736. Throughput: 0: 42698.0. Samples: 6128581000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 06:00:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 06:00:55,564][15401] Updated weights for policy 0, policy_version 374060 (0.0035) [2024-06-23 06:00:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6128713728. Throughput: 0: 42669.3. Samples: 6128835220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:00:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 06:00:59,553][15401] Updated weights for policy 0, policy_version 374070 (0.0033) [2024-06-23 06:01:03,208][15401] Updated weights for policy 0, policy_version 374080 (0.0032) [2024-06-23 06:01:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6128926720. Throughput: 0: 42774.2. Samples: 6129097760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 06:01:07,126][15401] Updated weights for policy 0, policy_version 374090 (0.0024) [2024-06-23 06:01:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 6129139712. Throughput: 0: 42575.0. Samples: 6129219760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:08,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 06:01:11,226][15401] Updated weights for policy 0, policy_version 374100 (0.0038) [2024-06-23 06:01:13,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 6129385472. Throughput: 0: 42637.3. Samples: 6129479320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:13,393][15132] Avg episode reward: [(0, '0.845')] [2024-06-23 06:01:14,686][15401] Updated weights for policy 0, policy_version 374110 (0.0023) [2024-06-23 06:01:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 6129565696. Throughput: 0: 42707.9. Samples: 6129737240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 06:01:18,731][15401] Updated weights for policy 0, policy_version 374120 (0.0035) [2024-06-23 06:01:22,377][15401] Updated weights for policy 0, policy_version 374130 (0.0031) [2024-06-23 06:01:23,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6129778688. Throughput: 0: 42481.8. Samples: 6129857280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 06:01:26,130][15401] Updated weights for policy 0, policy_version 374140 (0.0043) [2024-06-23 06:01:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 6130040832. Throughput: 0: 42752.0. Samples: 6130121340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 06:01:29,982][15401] Updated weights for policy 0, policy_version 374150 (0.0031) [2024-06-23 06:01:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.2). Total num frames: 6130204672. Throughput: 0: 42786.2. Samples: 6130379520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 06:01:33,702][15401] Updated weights for policy 0, policy_version 374160 (0.0036) [2024-06-23 06:01:37,542][15401] Updated weights for policy 0, policy_version 374170 (0.0046) [2024-06-23 06:01:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6130417664. Throughput: 0: 42688.0. Samples: 6130501960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 06:01:41,809][15401] Updated weights for policy 0, policy_version 374180 (0.0034) [2024-06-23 06:01:43,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6130679808. Throughput: 0: 42875.5. Samples: 6130764620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 06:01:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374187_6130679808.pth... [2024-06-23 06:01:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373559_6120390656.pth [2024-06-23 06:01:45,002][15401] Updated weights for policy 0, policy_version 374190 (0.0032) [2024-06-23 06:01:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6130827264. Throughput: 0: 42733.3. Samples: 6131020760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 06:01:49,642][15401] Updated weights for policy 0, policy_version 374200 (0.0033) [2024-06-23 06:01:52,666][15401] Updated weights for policy 0, policy_version 374210 (0.0035) [2024-06-23 06:01:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6131073024. Throughput: 0: 42599.0. Samples: 6131136720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 06:01:57,102][15401] Updated weights for policy 0, policy_version 374220 (0.0038) [2024-06-23 06:01:57,330][15349] Signal inference workers to stop experience collection... (90850 times) [2024-06-23 06:01:57,330][15349] Signal inference workers to resume experience collection... (90850 times) [2024-06-23 06:01:57,347][15401] InferenceWorker_p0-w0: stopping experience collection (90850 times) [2024-06-23 06:01:57,347][15401] InferenceWorker_p0-w0: resuming experience collection (90850 times) [2024-06-23 06:01:58,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 6131302400. Throughput: 0: 42753.3. Samples: 6131403120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:01:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 06:02:00,167][15401] Updated weights for policy 0, policy_version 374230 (0.0035) [2024-06-23 06:02:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6131466240. Throughput: 0: 42781.9. Samples: 6131662420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:02:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 06:02:04,810][15401] Updated weights for policy 0, policy_version 374240 (0.0047) [2024-06-23 06:02:07,903][15401] Updated weights for policy 0, policy_version 374250 (0.0033) [2024-06-23 06:02:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6131728384. Throughput: 0: 42645.8. Samples: 6131776340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:02:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 06:02:12,523][15401] Updated weights for policy 0, policy_version 374260 (0.0031) [2024-06-23 06:02:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 6131941376. Throughput: 0: 42709.7. Samples: 6132043280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 06:02:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 06:02:15,649][15401] Updated weights for policy 0, policy_version 374270 (0.0043) [2024-06-23 06:02:18,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 6132105216. Throughput: 0: 42691.1. Samples: 6132300620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 06:02:20,242][15401] Updated weights for policy 0, policy_version 374280 (0.0037) [2024-06-23 06:02:23,090][15401] Updated weights for policy 0, policy_version 374290 (0.0040) [2024-06-23 06:02:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 6132367360. Throughput: 0: 42621.0. Samples: 6132419900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 06:02:27,815][15401] Updated weights for policy 0, policy_version 374300 (0.0031) [2024-06-23 06:02:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 6132547584. Throughput: 0: 42813.9. Samples: 6132691240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 06:02:30,635][15401] Updated weights for policy 0, policy_version 374310 (0.0039) [2024-06-23 06:02:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6132760576. Throughput: 0: 42640.1. Samples: 6132939560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 06:02:35,604][15401] Updated weights for policy 0, policy_version 374320 (0.0033) [2024-06-23 06:02:38,344][15401] Updated weights for policy 0, policy_version 374330 (0.0039) [2024-06-23 06:02:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6133022720. Throughput: 0: 42876.1. Samples: 6133066140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 06:02:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42654.0). Total num frames: 6133170176. Throughput: 0: 42732.6. Samples: 6133326080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 06:02:43,548][15401] Updated weights for policy 0, policy_version 374340 (0.0038) [2024-06-23 06:02:46,610][15401] Updated weights for policy 0, policy_version 374350 (0.0035) [2024-06-23 06:02:48,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 6133399552. Throughput: 0: 42495.4. Samples: 6133574720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:48,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-23 06:02:51,042][15401] Updated weights for policy 0, policy_version 374360 (0.0033) [2024-06-23 06:02:53,390][15132] Fps is (10 sec: 47512.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6133645312. Throughput: 0: 42955.8. Samples: 6133709360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 06:02:53,986][15401] Updated weights for policy 0, policy_version 374370 (0.0031) [2024-06-23 06:02:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6133825536. Throughput: 0: 42666.8. Samples: 6133963280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:02:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 06:02:58,522][15401] Updated weights for policy 0, policy_version 374380 (0.0029) [2024-06-23 06:03:01,532][15401] Updated weights for policy 0, policy_version 374390 (0.0031) [2024-06-23 06:03:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6134038528. Throughput: 0: 42659.6. Samples: 6134220300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 06:03:06,025][15401] Updated weights for policy 0, policy_version 374400 (0.0026) [2024-06-23 06:03:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6134267904. Throughput: 0: 42730.3. Samples: 6134342760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 06:03:09,134][15401] Updated weights for policy 0, policy_version 374410 (0.0040) [2024-06-23 06:03:13,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42047.8, 300 sec: 42708.5). Total num frames: 6134464512. Throughput: 0: 42330.3. Samples: 6134596380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:13,397][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:03:13,984][15401] Updated weights for policy 0, policy_version 374420 (0.0043) [2024-06-23 06:03:16,779][15401] Updated weights for policy 0, policy_version 374430 (0.0035) [2024-06-23 06:03:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6134677504. Throughput: 0: 42416.4. Samples: 6134848300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:18,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 06:03:21,518][15349] Signal inference workers to stop experience collection... (90900 times) [2024-06-23 06:03:21,519][15349] Signal inference workers to resume experience collection... (90900 times) [2024-06-23 06:03:21,543][15401] InferenceWorker_p0-w0: stopping experience collection (90900 times) [2024-06-23 06:03:21,543][15401] InferenceWorker_p0-w0: resuming experience collection (90900 times) [2024-06-23 06:03:21,683][15401] Updated weights for policy 0, policy_version 374440 (0.0029) [2024-06-23 06:03:23,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6134906880. Throughput: 0: 42535.5. Samples: 6134980240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 06:03:25,010][15401] Updated weights for policy 0, policy_version 374450 (0.0025) [2024-06-23 06:03:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6135103488. Throughput: 0: 42393.3. Samples: 6135233780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 06:03:29,226][15401] Updated weights for policy 0, policy_version 374460 (0.0035) [2024-06-23 06:03:32,673][15401] Updated weights for policy 0, policy_version 374470 (0.0051) [2024-06-23 06:03:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6135332864. Throughput: 0: 42581.0. Samples: 6135490860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 06:03:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 06:03:36,840][15401] Updated weights for policy 0, policy_version 374480 (0.0035) [2024-06-23 06:03:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 6135529472. Throughput: 0: 42604.5. Samples: 6135626560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:03:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 06:03:40,391][15401] Updated weights for policy 0, policy_version 374490 (0.0030) [2024-06-23 06:03:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 6135758848. Throughput: 0: 42534.9. Samples: 6135877360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:03:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 06:03:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374497_6135758848.pth... [2024-06-23 06:03:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000373871_6125502464.pth [2024-06-23 06:03:44,611][15401] Updated weights for policy 0, policy_version 374500 (0.0026) [2024-06-23 06:03:48,080][15401] Updated weights for policy 0, policy_version 374510 (0.0036) [2024-06-23 06:03:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6135988224. Throughput: 0: 42400.0. Samples: 6136128300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:03:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 06:03:52,195][15401] Updated weights for policy 0, policy_version 374520 (0.0042) [2024-06-23 06:03:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6136168448. Throughput: 0: 42515.0. Samples: 6136255940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:03:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 06:03:55,915][15401] Updated weights for policy 0, policy_version 374530 (0.0030) [2024-06-23 06:03:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6136397824. Throughput: 0: 42494.9. Samples: 6136508380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:03:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 06:04:00,262][15401] Updated weights for policy 0, policy_version 374540 (0.0037) [2024-06-23 06:04:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6136594432. Throughput: 0: 42600.5. Samples: 6136765320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 06:04:03,625][15401] Updated weights for policy 0, policy_version 374550 (0.0038) [2024-06-23 06:04:08,083][15401] Updated weights for policy 0, policy_version 374560 (0.0040) [2024-06-23 06:04:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6136807424. Throughput: 0: 42359.1. Samples: 6136886400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:08,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 06:04:11,309][15401] Updated weights for policy 0, policy_version 374570 (0.0038) [2024-06-23 06:04:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 6137020416. Throughput: 0: 42473.7. Samples: 6137145100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 06:04:15,737][15401] Updated weights for policy 0, policy_version 374580 (0.0027) [2024-06-23 06:04:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6137217024. Throughput: 0: 42452.5. Samples: 6137401220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 06:04:19,075][15401] Updated weights for policy 0, policy_version 374590 (0.0031) [2024-06-23 06:04:23,381][15401] Updated weights for policy 0, policy_version 374600 (0.0030) [2024-06-23 06:04:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 6137446400. Throughput: 0: 42204.5. Samples: 6137525760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 06:04:24,684][15349] Signal inference workers to stop experience collection... (90950 times) [2024-06-23 06:04:24,685][15349] Signal inference workers to resume experience collection... (90950 times) [2024-06-23 06:04:24,725][15401] InferenceWorker_p0-w0: stopping experience collection (90950 times) [2024-06-23 06:04:24,725][15401] InferenceWorker_p0-w0: resuming experience collection (90950 times) [2024-06-23 06:04:26,683][15401] Updated weights for policy 0, policy_version 374610 (0.0031) [2024-06-23 06:04:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6137659392. Throughput: 0: 42231.8. Samples: 6137777780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 06:04:31,303][15401] Updated weights for policy 0, policy_version 374620 (0.0034) [2024-06-23 06:04:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 6137856000. Throughput: 0: 42451.2. Samples: 6138038600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 06:04:34,876][15401] Updated weights for policy 0, policy_version 374630 (0.0031) [2024-06-23 06:04:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6138085376. Throughput: 0: 42376.9. Samples: 6138162900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 06:04:39,271][15401] Updated weights for policy 0, policy_version 374640 (0.0032) [2024-06-23 06:04:42,583][15401] Updated weights for policy 0, policy_version 374650 (0.0026) [2024-06-23 06:04:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6138298368. Throughput: 0: 42475.2. Samples: 6138419760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 06:04:46,784][15401] Updated weights for policy 0, policy_version 374660 (0.0033) [2024-06-23 06:04:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6138511360. Throughput: 0: 42346.1. Samples: 6138670900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:04:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 06:04:50,411][15401] Updated weights for policy 0, policy_version 374670 (0.0036) [2024-06-23 06:04:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6138724352. Throughput: 0: 42483.9. Samples: 6138798180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:04:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 06:04:54,096][15401] Updated weights for policy 0, policy_version 374680 (0.0031) [2024-06-23 06:04:57,899][15401] Updated weights for policy 0, policy_version 374690 (0.0031) [2024-06-23 06:04:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6138953728. Throughput: 0: 42652.9. Samples: 6139064480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:04:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 06:05:01,503][15401] Updated weights for policy 0, policy_version 374700 (0.0032) [2024-06-23 06:05:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 6139150336. Throughput: 0: 42643.9. Samples: 6139320200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 06:05:05,385][15401] Updated weights for policy 0, policy_version 374710 (0.0035) [2024-06-23 06:05:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6139379712. Throughput: 0: 42638.6. Samples: 6139444500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:05:09,238][15401] Updated weights for policy 0, policy_version 374720 (0.0028) [2024-06-23 06:05:12,972][15401] Updated weights for policy 0, policy_version 374730 (0.0039) [2024-06-23 06:05:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 6139592704. Throughput: 0: 42859.9. Samples: 6139706480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 06:05:16,816][15401] Updated weights for policy 0, policy_version 374740 (0.0048) [2024-06-23 06:05:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6139789312. Throughput: 0: 42781.8. Samples: 6139963780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 06:05:21,019][15401] Updated weights for policy 0, policy_version 374750 (0.0029) [2024-06-23 06:05:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6140018688. Throughput: 0: 42838.3. Samples: 6140090620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 06:05:24,374][15401] Updated weights for policy 0, policy_version 374760 (0.0033) [2024-06-23 06:05:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6140215296. Throughput: 0: 42829.3. Samples: 6140347080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 06:05:28,456][15401] Updated weights for policy 0, policy_version 374770 (0.0030) [2024-06-23 06:05:31,686][15349] Signal inference workers to stop experience collection... (91000 times) [2024-06-23 06:05:31,686][15349] Signal inference workers to resume experience collection... (91000 times) [2024-06-23 06:05:31,702][15401] InferenceWorker_p0-w0: stopping experience collection (91000 times) [2024-06-23 06:05:31,702][15401] InferenceWorker_p0-w0: resuming experience collection (91000 times) [2024-06-23 06:05:31,846][15401] Updated weights for policy 0, policy_version 374780 (0.0036) [2024-06-23 06:05:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6140428288. Throughput: 0: 43031.6. Samples: 6140607320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 06:05:36,178][15401] Updated weights for policy 0, policy_version 374790 (0.0040) [2024-06-23 06:05:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6140657664. Throughput: 0: 43075.2. Samples: 6140736560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:38,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 06:05:39,683][15401] Updated weights for policy 0, policy_version 374800 (0.0034) [2024-06-23 06:05:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6140854272. Throughput: 0: 42688.9. Samples: 6140985480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:05:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374809_6140870656.pth... [2024-06-23 06:05:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374187_6130679808.pth [2024-06-23 06:05:43,747][15401] Updated weights for policy 0, policy_version 374810 (0.0040) [2024-06-23 06:05:47,310][15401] Updated weights for policy 0, policy_version 374820 (0.0043) [2024-06-23 06:05:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6141067264. Throughput: 0: 42816.6. Samples: 6141246940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:48,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 06:05:51,334][15401] Updated weights for policy 0, policy_version 374830 (0.0031) [2024-06-23 06:05:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6141296640. Throughput: 0: 43031.2. Samples: 6141380900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 06:05:54,928][15401] Updated weights for policy 0, policy_version 374840 (0.0048) [2024-06-23 06:05:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6141493248. Throughput: 0: 42802.3. Samples: 6141632580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:05:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 06:05:59,037][15401] Updated weights for policy 0, policy_version 374850 (0.0026) [2024-06-23 06:06:02,518][15401] Updated weights for policy 0, policy_version 374860 (0.0036) [2024-06-23 06:06:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6141722624. Throughput: 0: 42988.7. Samples: 6141898280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:06:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 06:06:06,561][15401] Updated weights for policy 0, policy_version 374870 (0.0024) [2024-06-23 06:06:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 6141935616. Throughput: 0: 43029.8. Samples: 6142026960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 06:06:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 06:06:10,092][15401] Updated weights for policy 0, policy_version 374880 (0.0029) [2024-06-23 06:06:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6142164992. Throughput: 0: 42956.9. Samples: 6142280140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 06:06:14,375][15401] Updated weights for policy 0, policy_version 374890 (0.0043) [2024-06-23 06:06:17,850][15401] Updated weights for policy 0, policy_version 374900 (0.0028) [2024-06-23 06:06:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6142377984. Throughput: 0: 42982.2. Samples: 6142541520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 06:06:22,524][15401] Updated weights for policy 0, policy_version 374910 (0.0033) [2024-06-23 06:06:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 6142574592. Throughput: 0: 42845.9. Samples: 6142664620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 06:06:25,572][15401] Updated weights for policy 0, policy_version 374920 (0.0040) [2024-06-23 06:06:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6142820352. Throughput: 0: 43035.1. Samples: 6142922060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 06:06:29,913][15401] Updated weights for policy 0, policy_version 374930 (0.0046) [2024-06-23 06:06:33,323][15401] Updated weights for policy 0, policy_version 374940 (0.0034) [2024-06-23 06:06:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6143016960. Throughput: 0: 43002.1. Samples: 6143182040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 06:06:37,506][15401] Updated weights for policy 0, policy_version 374950 (0.0037) [2024-06-23 06:06:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 6143213568. Throughput: 0: 42836.0. Samples: 6143308520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:38,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 06:06:40,810][15401] Updated weights for policy 0, policy_version 374960 (0.0030) [2024-06-23 06:06:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 6143475712. Throughput: 0: 43082.6. Samples: 6143571300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 06:06:45,252][15401] Updated weights for policy 0, policy_version 374970 (0.0039) [2024-06-23 06:06:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6143655936. Throughput: 0: 42847.7. Samples: 6143826420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 06:06:48,964][15401] Updated weights for policy 0, policy_version 374980 (0.0044) [2024-06-23 06:06:50,397][15349] Signal inference workers to stop experience collection... (91050 times) [2024-06-23 06:06:50,397][15349] Signal inference workers to resume experience collection... (91050 times) [2024-06-23 06:06:50,415][15401] InferenceWorker_p0-w0: stopping experience collection (91050 times) [2024-06-23 06:06:50,416][15401] InferenceWorker_p0-w0: resuming experience collection (91050 times) [2024-06-23 06:06:52,855][15401] Updated weights for policy 0, policy_version 374990 (0.0032) [2024-06-23 06:06:53,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 6143852544. Throughput: 0: 42761.5. Samples: 6143951240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 06:06:56,402][15401] Updated weights for policy 0, policy_version 375000 (0.0031) [2024-06-23 06:06:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 6144114688. Throughput: 0: 43054.1. Samples: 6144217580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:06:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 06:07:00,165][15401] Updated weights for policy 0, policy_version 375010 (0.0034) [2024-06-23 06:07:03,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6144294912. Throughput: 0: 42817.3. Samples: 6144468300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 06:07:03,837][15401] Updated weights for policy 0, policy_version 375020 (0.0041) [2024-06-23 06:07:07,654][15401] Updated weights for policy 0, policy_version 375030 (0.0029) [2024-06-23 06:07:08,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6144507904. Throughput: 0: 42813.8. Samples: 6144591240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 06:07:11,374][15401] Updated weights for policy 0, policy_version 375040 (0.0030) [2024-06-23 06:07:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6144753664. Throughput: 0: 43056.1. Samples: 6144859580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 06:07:14,983][15401] Updated weights for policy 0, policy_version 375050 (0.0026) [2024-06-23 06:07:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6144950272. Throughput: 0: 42924.9. Samples: 6145113660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 06:07:19,092][15401] Updated weights for policy 0, policy_version 375060 (0.0027) [2024-06-23 06:07:22,299][15401] Updated weights for policy 0, policy_version 375070 (0.0030) [2024-06-23 06:07:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 6145163264. Throughput: 0: 42985.6. Samples: 6145242980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:23,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 06:07:26,582][15401] Updated weights for policy 0, policy_version 375080 (0.0041) [2024-06-23 06:07:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6145392640. Throughput: 0: 43040.5. Samples: 6145508120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 06:07:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 06:07:29,925][15401] Updated weights for policy 0, policy_version 375090 (0.0030) [2024-06-23 06:07:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6145605632. Throughput: 0: 43119.9. Samples: 6145766820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 06:07:34,046][15401] Updated weights for policy 0, policy_version 375100 (0.0038) [2024-06-23 06:07:37,951][15401] Updated weights for policy 0, policy_version 375110 (0.0046) [2024-06-23 06:07:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6145802240. Throughput: 0: 43137.0. Samples: 6145892400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:38,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 06:07:41,786][15401] Updated weights for policy 0, policy_version 375120 (0.0038) [2024-06-23 06:07:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6146048000. Throughput: 0: 42905.5. Samples: 6146148320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 06:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375125_6146048000.pth... [2024-06-23 06:07:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374497_6135758848.pth [2024-06-23 06:07:45,906][15401] Updated weights for policy 0, policy_version 375130 (0.0030) [2024-06-23 06:07:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6146228224. Throughput: 0: 43168.1. Samples: 6146410860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 06:07:49,263][15401] Updated weights for policy 0, policy_version 375140 (0.0028) [2024-06-23 06:07:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 6146441216. Throughput: 0: 43160.9. Samples: 6146533480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 06:07:53,514][15401] Updated weights for policy 0, policy_version 375150 (0.0043) [2024-06-23 06:07:56,899][15401] Updated weights for policy 0, policy_version 375160 (0.0025) [2024-06-23 06:07:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6146670592. Throughput: 0: 42890.1. Samples: 6146789640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:07:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 06:08:01,063][15401] Updated weights for policy 0, policy_version 375170 (0.0032) [2024-06-23 06:08:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6146883584. Throughput: 0: 43110.6. Samples: 6147053640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:03,391][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 06:08:04,472][15401] Updated weights for policy 0, policy_version 375180 (0.0029) [2024-06-23 06:08:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 6147096576. Throughput: 0: 43042.4. Samples: 6147179780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:08,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-23 06:08:08,510][15401] Updated weights for policy 0, policy_version 375190 (0.0037) [2024-06-23 06:08:12,330][15401] Updated weights for policy 0, policy_version 375200 (0.0040) [2024-06-23 06:08:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6147309568. Throughput: 0: 42822.6. Samples: 6147435140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 06:08:16,194][15401] Updated weights for policy 0, policy_version 375210 (0.0028) [2024-06-23 06:08:16,681][15349] Signal inference workers to stop experience collection... (91100 times) [2024-06-23 06:08:16,682][15349] Signal inference workers to resume experience collection... (91100 times) [2024-06-23 06:08:16,698][15401] InferenceWorker_p0-w0: stopping experience collection (91100 times) [2024-06-23 06:08:16,728][15401] InferenceWorker_p0-w0: resuming experience collection (91100 times) [2024-06-23 06:08:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6147506176. Throughput: 0: 42745.0. Samples: 6147690340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 06:08:19,975][15401] Updated weights for policy 0, policy_version 375220 (0.0032) [2024-06-23 06:08:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 6147735552. Throughput: 0: 42708.9. Samples: 6147814300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 06:08:23,992][15401] Updated weights for policy 0, policy_version 375230 (0.0036) [2024-06-23 06:08:27,694][15401] Updated weights for policy 0, policy_version 375240 (0.0046) [2024-06-23 06:08:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6147948544. Throughput: 0: 42731.0. Samples: 6148071220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 06:08:31,475][15401] Updated weights for policy 0, policy_version 375250 (0.0027) [2024-06-23 06:08:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6148145152. Throughput: 0: 42637.3. Samples: 6148329540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:33,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 06:08:35,268][15401] Updated weights for policy 0, policy_version 375260 (0.0036) [2024-06-23 06:08:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6148358144. Throughput: 0: 42670.8. Samples: 6148453680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 06:08:39,148][15401] Updated weights for policy 0, policy_version 375270 (0.0042) [2024-06-23 06:08:43,322][15401] Updated weights for policy 0, policy_version 375280 (0.0039) [2024-06-23 06:08:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6148587520. Throughput: 0: 42630.4. Samples: 6148708000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 06:08:47,016][15401] Updated weights for policy 0, policy_version 375290 (0.0032) [2024-06-23 06:08:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6148800512. Throughput: 0: 42466.7. Samples: 6148964640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 06:08:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 06:08:50,821][15401] Updated weights for policy 0, policy_version 375300 (0.0036) [2024-06-23 06:08:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6148980736. Throughput: 0: 42456.4. Samples: 6149090320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:08:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 06:08:54,692][15401] Updated weights for policy 0, policy_version 375310 (0.0040) [2024-06-23 06:08:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 6149226496. Throughput: 0: 42264.8. Samples: 6149337160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:08:58,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 06:08:58,757][15401] Updated weights for policy 0, policy_version 375320 (0.0045) [2024-06-23 06:09:02,581][15401] Updated weights for policy 0, policy_version 375330 (0.0042) [2024-06-23 06:09:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6149439488. Throughput: 0: 42446.6. Samples: 6149600440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 06:09:06,526][15401] Updated weights for policy 0, policy_version 375340 (0.0041) [2024-06-23 06:09:08,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6149636096. Throughput: 0: 42485.2. Samples: 6149726140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 06:09:10,355][15401] Updated weights for policy 0, policy_version 375350 (0.0026) [2024-06-23 06:09:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6149865472. Throughput: 0: 42391.1. Samples: 6149978820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 06:09:14,193][15401] Updated weights for policy 0, policy_version 375360 (0.0032) [2024-06-23 06:09:18,174][15401] Updated weights for policy 0, policy_version 375370 (0.0030) [2024-06-23 06:09:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6150078464. Throughput: 0: 42428.5. Samples: 6150238820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 06:09:22,150][15401] Updated weights for policy 0, policy_version 375380 (0.0038) [2024-06-23 06:09:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6150291456. Throughput: 0: 42431.4. Samples: 6150363080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 06:09:25,585][15401] Updated weights for policy 0, policy_version 375390 (0.0041) [2024-06-23 06:09:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 6150520832. Throughput: 0: 42649.7. Samples: 6150627340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:28,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 06:09:29,673][15401] Updated weights for policy 0, policy_version 375400 (0.0041) [2024-06-23 06:09:33,227][15401] Updated weights for policy 0, policy_version 375410 (0.0041) [2024-06-23 06:09:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6150717440. Throughput: 0: 42766.7. Samples: 6150889140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 06:09:34,610][15349] Signal inference workers to stop experience collection... (91150 times) [2024-06-23 06:09:34,611][15349] Signal inference workers to resume experience collection... (91150 times) [2024-06-23 06:09:34,621][15401] InferenceWorker_p0-w0: stopping experience collection (91150 times) [2024-06-23 06:09:34,622][15401] InferenceWorker_p0-w0: resuming experience collection (91150 times) [2024-06-23 06:09:37,194][15401] Updated weights for policy 0, policy_version 375420 (0.0038) [2024-06-23 06:09:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6150946816. Throughput: 0: 42692.5. Samples: 6151011480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:38,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 06:09:40,682][15401] Updated weights for policy 0, policy_version 375430 (0.0028) [2024-06-23 06:09:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6151176192. Throughput: 0: 43068.4. Samples: 6151275140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 06:09:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375438_6151176192.pth... [2024-06-23 06:09:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000374809_6140870656.pth [2024-06-23 06:09:44,728][15401] Updated weights for policy 0, policy_version 375440 (0.0037) [2024-06-23 06:09:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6151340032. Throughput: 0: 42922.7. Samples: 6151531960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 06:09:48,575][15401] Updated weights for policy 0, policy_version 375450 (0.0037) [2024-06-23 06:09:52,255][15401] Updated weights for policy 0, policy_version 375460 (0.0036) [2024-06-23 06:09:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6151569408. Throughput: 0: 42762.7. Samples: 6151650460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 06:09:56,565][15401] Updated weights for policy 0, policy_version 375470 (0.0033) [2024-06-23 06:09:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 6151815168. Throughput: 0: 43021.4. Samples: 6151914780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:09:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 06:09:59,682][15401] Updated weights for policy 0, policy_version 375480 (0.0033) [2024-06-23 06:10:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6151995392. Throughput: 0: 43132.5. Samples: 6152179780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:10:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 06:10:03,941][15401] Updated weights for policy 0, policy_version 375490 (0.0037) [2024-06-23 06:10:07,015][15401] Updated weights for policy 0, policy_version 375500 (0.0027) [2024-06-23 06:10:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6152208384. Throughput: 0: 43031.4. Samples: 6152299500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 06:10:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 06:10:11,209][15401] Updated weights for policy 0, policy_version 375510 (0.0031) [2024-06-23 06:10:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6152454144. Throughput: 0: 43121.9. Samples: 6152567720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 06:10:14,299][15401] Updated weights for policy 0, policy_version 375520 (0.0025) [2024-06-23 06:10:18,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6152667136. Throughput: 0: 42937.4. Samples: 6152821320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 06:10:19,070][15401] Updated weights for policy 0, policy_version 375530 (0.0029) [2024-06-23 06:10:21,834][15401] Updated weights for policy 0, policy_version 375540 (0.0033) [2024-06-23 06:10:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6152863744. Throughput: 0: 42990.6. Samples: 6152946060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 06:10:26,554][15401] Updated weights for policy 0, policy_version 375550 (0.0033) [2024-06-23 06:10:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 6153076736. Throughput: 0: 43112.1. Samples: 6153215180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:28,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-23 06:10:29,854][15401] Updated weights for policy 0, policy_version 375560 (0.0039) [2024-06-23 06:10:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6153289728. Throughput: 0: 42985.7. Samples: 6153466320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 06:10:34,140][15401] Updated weights for policy 0, policy_version 375570 (0.0032) [2024-06-23 06:10:37,283][15401] Updated weights for policy 0, policy_version 375580 (0.0034) [2024-06-23 06:10:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6153519104. Throughput: 0: 43118.3. Samples: 6153590780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 06:10:42,101][15401] Updated weights for policy 0, policy_version 375590 (0.0049) [2024-06-23 06:10:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6153732096. Throughput: 0: 43118.6. Samples: 6153855120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 06:10:44,900][15401] Updated weights for policy 0, policy_version 375600 (0.0039) [2024-06-23 06:10:46,919][15349] Signal inference workers to stop experience collection... (91200 times) [2024-06-23 06:10:46,971][15401] InferenceWorker_p0-w0: stopping experience collection (91200 times) [2024-06-23 06:10:46,981][15349] Signal inference workers to resume experience collection... (91200 times) [2024-06-23 06:10:46,990][15401] InferenceWorker_p0-w0: resuming experience collection (91200 times) [2024-06-23 06:10:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6153945088. Throughput: 0: 42865.3. Samples: 6154108720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 06:10:49,790][15401] Updated weights for policy 0, policy_version 375610 (0.0041) [2024-06-23 06:10:52,427][15401] Updated weights for policy 0, policy_version 375620 (0.0041) [2024-06-23 06:10:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6154158080. Throughput: 0: 42961.4. Samples: 6154232760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:53,394][15132] Avg episode reward: [(0, '0.861')] [2024-06-23 06:10:57,382][15401] Updated weights for policy 0, policy_version 375630 (0.0031) [2024-06-23 06:10:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6154371072. Throughput: 0: 42861.3. Samples: 6154496480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:10:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 06:11:00,011][15401] Updated weights for policy 0, policy_version 375640 (0.0036) [2024-06-23 06:11:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6154584064. Throughput: 0: 42875.4. Samples: 6154750720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 06:11:05,047][15401] Updated weights for policy 0, policy_version 375650 (0.0033) [2024-06-23 06:11:07,972][15401] Updated weights for policy 0, policy_version 375660 (0.0042) [2024-06-23 06:11:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6154813440. Throughput: 0: 42939.1. Samples: 6154878320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:08,393][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 06:11:12,611][15401] Updated weights for policy 0, policy_version 375670 (0.0035) [2024-06-23 06:11:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6154993664. Throughput: 0: 42716.5. Samples: 6155137420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 06:11:15,636][15401] Updated weights for policy 0, policy_version 375680 (0.0022) [2024-06-23 06:11:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6155223040. Throughput: 0: 42784.0. Samples: 6155391600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 06:11:20,284][15401] Updated weights for policy 0, policy_version 375690 (0.0034) [2024-06-23 06:11:23,263][15401] Updated weights for policy 0, policy_version 375700 (0.0025) [2024-06-23 06:11:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6155468800. Throughput: 0: 42820.8. Samples: 6155517720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:23,400][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 06:11:27,825][15401] Updated weights for policy 0, policy_version 375710 (0.0029) [2024-06-23 06:11:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6155649024. Throughput: 0: 42712.5. Samples: 6155777180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 06:11:30,860][15401] Updated weights for policy 0, policy_version 375720 (0.0029) [2024-06-23 06:11:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6155862016. Throughput: 0: 42736.7. Samples: 6156031880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 06:11:35,488][15401] Updated weights for policy 0, policy_version 375730 (0.0026) [2024-06-23 06:11:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6156091392. Throughput: 0: 42840.8. Samples: 6156160600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 06:11:39,331][15401] Updated weights for policy 0, policy_version 375740 (0.0032) [2024-06-23 06:11:43,204][15401] Updated weights for policy 0, policy_version 375750 (0.0033) [2024-06-23 06:11:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 6156288000. Throughput: 0: 42617.8. Samples: 6156414380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:43,393][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 06:11:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375750_6156288000.pth... [2024-06-23 06:11:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375125_6146048000.pth [2024-06-23 06:11:46,883][15401] Updated weights for policy 0, policy_version 375760 (0.0034) [2024-06-23 06:11:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6156500992. Throughput: 0: 42682.6. Samples: 6156671440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 06:11:51,129][15401] Updated weights for policy 0, policy_version 375770 (0.0044) [2024-06-23 06:11:53,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6156746752. Throughput: 0: 42683.6. Samples: 6156799080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 06:11:54,412][15401] Updated weights for policy 0, policy_version 375780 (0.0033) [2024-06-23 06:11:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6156926976. Throughput: 0: 42623.1. Samples: 6157055460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:11:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 06:11:58,530][15401] Updated weights for policy 0, policy_version 375790 (0.0036) [2024-06-23 06:12:02,237][15401] Updated weights for policy 0, policy_version 375800 (0.0038) [2024-06-23 06:12:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6157139968. Throughput: 0: 42712.4. Samples: 6157313660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:12:05,984][15401] Updated weights for policy 0, policy_version 375810 (0.0033) [2024-06-23 06:12:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6157369344. Throughput: 0: 42708.1. Samples: 6157439580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 06:12:09,935][15401] Updated weights for policy 0, policy_version 375820 (0.0027) [2024-06-23 06:12:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6157582336. Throughput: 0: 42704.8. Samples: 6157698900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 06:12:13,675][15401] Updated weights for policy 0, policy_version 375830 (0.0031) [2024-06-23 06:12:15,643][15349] Signal inference workers to stop experience collection... (91250 times) [2024-06-23 06:12:15,699][15401] InferenceWorker_p0-w0: stopping experience collection (91250 times) [2024-06-23 06:12:15,699][15349] Signal inference workers to resume experience collection... (91250 times) [2024-06-23 06:12:15,725][15401] InferenceWorker_p0-w0: resuming experience collection (91250 times) [2024-06-23 06:12:17,602][15401] Updated weights for policy 0, policy_version 375840 (0.0035) [2024-06-23 06:12:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6157778944. Throughput: 0: 42659.2. Samples: 6157951540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 06:12:21,457][15401] Updated weights for policy 0, policy_version 375850 (0.0044) [2024-06-23 06:12:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6158008320. Throughput: 0: 42780.0. Samples: 6158085700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 06:12:25,104][15401] Updated weights for policy 0, policy_version 375860 (0.0043) [2024-06-23 06:12:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6158221312. Throughput: 0: 42685.8. Samples: 6158335140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 06:12:28,958][15401] Updated weights for policy 0, policy_version 375870 (0.0039) [2024-06-23 06:12:32,609][15401] Updated weights for policy 0, policy_version 375880 (0.0032) [2024-06-23 06:12:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6158434304. Throughput: 0: 42751.1. Samples: 6158595240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 06:12:36,523][15401] Updated weights for policy 0, policy_version 375890 (0.0028) [2024-06-23 06:12:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6158630912. Throughput: 0: 42684.8. Samples: 6158719900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 06:12:40,469][15401] Updated weights for policy 0, policy_version 375900 (0.0027) [2024-06-23 06:12:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 6158860288. Throughput: 0: 42730.3. Samples: 6158978320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 06:12:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 06:12:44,090][15401] Updated weights for policy 0, policy_version 375910 (0.0029) [2024-06-23 06:12:48,166][15401] Updated weights for policy 0, policy_version 375920 (0.0040) [2024-06-23 06:12:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6159073280. Throughput: 0: 42740.0. Samples: 6159236960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:12:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 06:12:52,123][15401] Updated weights for policy 0, policy_version 375930 (0.0032) [2024-06-23 06:12:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6159269888. Throughput: 0: 42736.4. Samples: 6159362720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:12:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 06:12:55,747][15401] Updated weights for policy 0, policy_version 375940 (0.0043) [2024-06-23 06:12:58,396][15132] Fps is (10 sec: 44209.0, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 6159515648. Throughput: 0: 42807.8. Samples: 6159625520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:12:58,396][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 06:12:59,738][15401] Updated weights for policy 0, policy_version 375950 (0.0033) [2024-06-23 06:13:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6159712256. Throughput: 0: 42874.2. Samples: 6159880880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:03,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 06:13:03,656][15401] Updated weights for policy 0, policy_version 375960 (0.0046) [2024-06-23 06:13:07,256][15401] Updated weights for policy 0, policy_version 375970 (0.0040) [2024-06-23 06:13:08,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6159925248. Throughput: 0: 42608.5. Samples: 6160003080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 06:13:11,303][15401] Updated weights for policy 0, policy_version 375980 (0.0026) [2024-06-23 06:13:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6160154624. Throughput: 0: 42908.0. Samples: 6160266000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 06:13:14,882][15401] Updated weights for policy 0, policy_version 375990 (0.0032) [2024-06-23 06:13:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6160351232. Throughput: 0: 42763.6. Samples: 6160519600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 06:13:19,309][15401] Updated weights for policy 0, policy_version 376000 (0.0038) [2024-06-23 06:13:22,549][15401] Updated weights for policy 0, policy_version 376010 (0.0027) [2024-06-23 06:13:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6160564224. Throughput: 0: 42828.9. Samples: 6160647200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 06:13:26,679][15401] Updated weights for policy 0, policy_version 376020 (0.0028) [2024-06-23 06:13:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6160793600. Throughput: 0: 42931.4. Samples: 6160910240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 06:13:29,878][15401] Updated weights for policy 0, policy_version 376030 (0.0028) [2024-06-23 06:13:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6160990208. Throughput: 0: 42939.6. Samples: 6161169240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 06:13:34,258][15401] Updated weights for policy 0, policy_version 376040 (0.0031) [2024-06-23 06:13:37,471][15401] Updated weights for policy 0, policy_version 376050 (0.0026) [2024-06-23 06:13:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6161203200. Throughput: 0: 42892.8. Samples: 6161293000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:38,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 06:13:42,039][15401] Updated weights for policy 0, policy_version 376060 (0.0041) [2024-06-23 06:13:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6161432576. Throughput: 0: 42824.9. Samples: 6161552360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 06:13:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376065_6161448960.pth... [2024-06-23 06:13:43,560][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375438_6151176192.pth [2024-06-23 06:13:45,066][15401] Updated weights for policy 0, policy_version 376070 (0.0030) [2024-06-23 06:13:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6161645568. Throughput: 0: 42890.7. Samples: 6161810960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 06:13:49,641][15401] Updated weights for policy 0, policy_version 376080 (0.0027) [2024-06-23 06:13:50,403][15349] Signal inference workers to stop experience collection... (91300 times) [2024-06-23 06:13:50,404][15349] Signal inference workers to resume experience collection... (91300 times) [2024-06-23 06:13:50,417][15401] InferenceWorker_p0-w0: stopping experience collection (91300 times) [2024-06-23 06:13:50,417][15401] InferenceWorker_p0-w0: resuming experience collection (91300 times) [2024-06-23 06:13:52,995][15401] Updated weights for policy 0, policy_version 376090 (0.0029) [2024-06-23 06:13:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42820.6). Total num frames: 6161858560. Throughput: 0: 42897.9. Samples: 6161933580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:53,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 06:13:57,105][15401] Updated weights for policy 0, policy_version 376100 (0.0030) [2024-06-23 06:13:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 6162071552. Throughput: 0: 42878.3. Samples: 6162195520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:13:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 06:14:00,554][15401] Updated weights for policy 0, policy_version 376110 (0.0042) [2024-06-23 06:14:03,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6162284544. Throughput: 0: 42827.6. Samples: 6162446840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 06:14:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 06:14:04,679][15401] Updated weights for policy 0, policy_version 376120 (0.0029) [2024-06-23 06:14:08,362][15401] Updated weights for policy 0, policy_version 376130 (0.0038) [2024-06-23 06:14:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6162513920. Throughput: 0: 42937.8. Samples: 6162579400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 06:14:12,482][15401] Updated weights for policy 0, policy_version 376140 (0.0037) [2024-06-23 06:14:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6162694144. Throughput: 0: 42813.3. Samples: 6162836840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 06:14:15,897][15401] Updated weights for policy 0, policy_version 376150 (0.0045) [2024-06-23 06:14:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6162923520. Throughput: 0: 42646.8. Samples: 6163088340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:18,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-23 06:14:20,085][15401] Updated weights for policy 0, policy_version 376160 (0.0035) [2024-06-23 06:14:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 6163136512. Throughput: 0: 42864.1. Samples: 6163221780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:14:23,662][15401] Updated weights for policy 0, policy_version 376170 (0.0023) [2024-06-23 06:14:28,106][15401] Updated weights for policy 0, policy_version 376180 (0.0039) [2024-06-23 06:14:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6163333120. Throughput: 0: 42702.4. Samples: 6163473980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 06:14:31,769][15401] Updated weights for policy 0, policy_version 376190 (0.0027) [2024-06-23 06:14:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6163578880. Throughput: 0: 42720.0. Samples: 6163733360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:33,390][15132] Avg episode reward: [(0, '0.004')] [2024-06-23 06:14:35,622][15401] Updated weights for policy 0, policy_version 376200 (0.0027) [2024-06-23 06:14:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 6163791872. Throughput: 0: 42919.5. Samples: 6163864860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:38,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 06:14:39,314][15401] Updated weights for policy 0, policy_version 376210 (0.0027) [2024-06-23 06:14:43,101][15401] Updated weights for policy 0, policy_version 376220 (0.0027) [2024-06-23 06:14:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6163988480. Throughput: 0: 42688.0. Samples: 6164116480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 06:14:46,906][15401] Updated weights for policy 0, policy_version 376230 (0.0034) [2024-06-23 06:14:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6164217856. Throughput: 0: 42897.4. Samples: 6164377220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 06:14:50,731][15401] Updated weights for policy 0, policy_version 376240 (0.0033) [2024-06-23 06:14:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 6164414464. Throughput: 0: 42788.0. Samples: 6164504860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:53,398][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 06:14:54,410][15401] Updated weights for policy 0, policy_version 376250 (0.0024) [2024-06-23 06:14:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6164611072. Throughput: 0: 42727.8. Samples: 6164759580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:14:58,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-23 06:14:58,821][15401] Updated weights for policy 0, policy_version 376260 (0.0042) [2024-06-23 06:15:02,006][15401] Updated weights for policy 0, policy_version 376270 (0.0036) [2024-06-23 06:15:02,008][15349] Signal inference workers to stop experience collection... (91350 times) [2024-06-23 06:15:02,008][15349] Signal inference workers to resume experience collection... (91350 times) [2024-06-23 06:15:02,061][15401] InferenceWorker_p0-w0: stopping experience collection (91350 times) [2024-06-23 06:15:02,061][15401] InferenceWorker_p0-w0: resuming experience collection (91350 times) [2024-06-23 06:15:03,394][15132] Fps is (10 sec: 44216.8, 60 sec: 42868.2, 300 sec: 42875.5). Total num frames: 6164856832. Throughput: 0: 42680.5. Samples: 6165009160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:15:03,394][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 06:15:06,416][15401] Updated weights for policy 0, policy_version 376280 (0.0033) [2024-06-23 06:15:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6165069824. Throughput: 0: 42674.6. Samples: 6165142140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:15:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 06:15:09,708][15401] Updated weights for policy 0, policy_version 376290 (0.0031) [2024-06-23 06:15:13,392][15132] Fps is (10 sec: 40969.0, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 6165266432. Throughput: 0: 42740.5. Samples: 6165397400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:15:13,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 06:15:13,890][15401] Updated weights for policy 0, policy_version 376300 (0.0043) [2024-06-23 06:15:17,662][15401] Updated weights for policy 0, policy_version 376310 (0.0034) [2024-06-23 06:15:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6165479424. Throughput: 0: 42689.8. Samples: 6165654400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:15:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 06:15:21,461][15401] Updated weights for policy 0, policy_version 376320 (0.0029) [2024-06-23 06:15:23,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6165708800. Throughput: 0: 42581.1. Samples: 6165781000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 06:15:23,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 06:15:25,434][15401] Updated weights for policy 0, policy_version 376330 (0.0027) [2024-06-23 06:15:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6165905408. Throughput: 0: 42683.4. Samples: 6166037240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 06:15:29,041][15401] Updated weights for policy 0, policy_version 376340 (0.0044) [2024-06-23 06:15:32,953][15401] Updated weights for policy 0, policy_version 376350 (0.0034) [2024-06-23 06:15:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6166134784. Throughput: 0: 42655.9. Samples: 6166296740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 06:15:36,538][15401] Updated weights for policy 0, policy_version 376360 (0.0031) [2024-06-23 06:15:38,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6166364160. Throughput: 0: 42711.2. Samples: 6166426860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 06:15:40,483][15401] Updated weights for policy 0, policy_version 376370 (0.0032) [2024-06-23 06:15:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6166560768. Throughput: 0: 42680.2. Samples: 6166680300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:43,393][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 06:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376377_6166560768.pth... [2024-06-23 06:15:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000375750_6156288000.pth [2024-06-23 06:15:44,442][15401] Updated weights for policy 0, policy_version 376380 (0.0047) [2024-06-23 06:15:48,216][15401] Updated weights for policy 0, policy_version 376390 (0.0022) [2024-06-23 06:15:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 6166790144. Throughput: 0: 42987.4. Samples: 6166943400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 06:15:52,474][15401] Updated weights for policy 0, policy_version 376400 (0.0024) [2024-06-23 06:15:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6167003136. Throughput: 0: 42923.1. Samples: 6167073680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 06:15:55,801][15401] Updated weights for policy 0, policy_version 376410 (0.0035) [2024-06-23 06:15:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 6167216128. Throughput: 0: 42824.5. Samples: 6167324400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:15:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 06:16:00,004][15401] Updated weights for policy 0, policy_version 376420 (0.0034) [2024-06-23 06:16:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42601.6, 300 sec: 42709.5). Total num frames: 6167412736. Throughput: 0: 42970.6. Samples: 6167588080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 06:16:03,451][15401] Updated weights for policy 0, policy_version 376430 (0.0024) [2024-06-23 06:16:07,663][15401] Updated weights for policy 0, policy_version 376440 (0.0036) [2024-06-23 06:16:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6167625728. Throughput: 0: 42944.8. Samples: 6167713520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 06:16:11,062][15401] Updated weights for policy 0, policy_version 376450 (0.0043) [2024-06-23 06:16:13,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 6167855104. Throughput: 0: 42924.9. Samples: 6167968960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:13,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 06:16:15,319][15401] Updated weights for policy 0, policy_version 376460 (0.0033) [2024-06-23 06:16:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6168068096. Throughput: 0: 43048.5. Samples: 6168233920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 06:16:18,501][15401] Updated weights for policy 0, policy_version 376470 (0.0042) [2024-06-23 06:16:22,860][15401] Updated weights for policy 0, policy_version 376480 (0.0038) [2024-06-23 06:16:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6168264704. Throughput: 0: 42958.7. Samples: 6168360000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 06:16:26,105][15401] Updated weights for policy 0, policy_version 376490 (0.0039) [2024-06-23 06:16:28,391][15132] Fps is (10 sec: 44229.0, 60 sec: 43416.5, 300 sec: 42875.9). Total num frames: 6168510464. Throughput: 0: 42995.9. Samples: 6168615080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:28,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 06:16:30,336][15401] Updated weights for policy 0, policy_version 376500 (0.0033) [2024-06-23 06:16:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6168707072. Throughput: 0: 42967.2. Samples: 6168876920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 06:16:33,936][15401] Updated weights for policy 0, policy_version 376510 (0.0029) [2024-06-23 06:16:34,256][15349] Signal inference workers to stop experience collection... (91400 times) [2024-06-23 06:16:34,295][15401] InferenceWorker_p0-w0: stopping experience collection (91400 times) [2024-06-23 06:16:34,306][15349] Signal inference workers to resume experience collection... (91400 times) [2024-06-23 06:16:34,313][15401] InferenceWorker_p0-w0: resuming experience collection (91400 times) [2024-06-23 06:16:37,825][15401] Updated weights for policy 0, policy_version 376520 (0.0036) [2024-06-23 06:16:38,392][15132] Fps is (10 sec: 40956.9, 60 sec: 42596.6, 300 sec: 42820.6). Total num frames: 6168920064. Throughput: 0: 42859.5. Samples: 6169002460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 06:16:38,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 06:16:41,661][15401] Updated weights for policy 0, policy_version 376530 (0.0041) [2024-06-23 06:16:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 6169149440. Throughput: 0: 43000.5. Samples: 6169259420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:16:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:16:45,769][15401] Updated weights for policy 0, policy_version 376540 (0.0044) [2024-06-23 06:16:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6169346048. Throughput: 0: 42942.8. Samples: 6169520500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:16:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 06:16:49,359][15401] Updated weights for policy 0, policy_version 376550 (0.0030) [2024-06-23 06:16:53,296][15401] Updated weights for policy 0, policy_version 376560 (0.0038) [2024-06-23 06:16:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6169559040. Throughput: 0: 42909.7. Samples: 6169644460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:16:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:16:57,042][15401] Updated weights for policy 0, policy_version 376570 (0.0025) [2024-06-23 06:16:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6169788416. Throughput: 0: 42998.7. Samples: 6169903800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:16:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 06:17:00,848][15401] Updated weights for policy 0, policy_version 376580 (0.0031) [2024-06-23 06:17:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6170001408. Throughput: 0: 42846.0. Samples: 6170162000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 06:17:04,679][15401] Updated weights for policy 0, policy_version 376590 (0.0033) [2024-06-23 06:17:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6170198016. Throughput: 0: 42847.1. Samples: 6170288120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 06:17:08,490][15401] Updated weights for policy 0, policy_version 376600 (0.0037) [2024-06-23 06:17:12,135][15401] Updated weights for policy 0, policy_version 376610 (0.0032) [2024-06-23 06:17:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 6170427392. Throughput: 0: 43001.2. Samples: 6170550060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:17:16,477][15401] Updated weights for policy 0, policy_version 376620 (0.0050) [2024-06-23 06:17:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 6170640384. Throughput: 0: 42820.3. Samples: 6170803840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 06:17:19,603][15401] Updated weights for policy 0, policy_version 376630 (0.0035) [2024-06-23 06:17:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6170836992. Throughput: 0: 42901.9. Samples: 6170932940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 06:17:24,049][15401] Updated weights for policy 0, policy_version 376640 (0.0030) [2024-06-23 06:17:27,226][15401] Updated weights for policy 0, policy_version 376650 (0.0033) [2024-06-23 06:17:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.5, 300 sec: 42820.6). Total num frames: 6171066368. Throughput: 0: 42870.5. Samples: 6171188600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 06:17:31,651][15401] Updated weights for policy 0, policy_version 376660 (0.0034) [2024-06-23 06:17:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6171295744. Throughput: 0: 42768.3. Samples: 6171445080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 06:17:34,700][15401] Updated weights for policy 0, policy_version 376670 (0.0033) [2024-06-23 06:17:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 6171492352. Throughput: 0: 42911.9. Samples: 6171575500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 06:17:39,065][15401] Updated weights for policy 0, policy_version 376680 (0.0034) [2024-06-23 06:17:42,438][15401] Updated weights for policy 0, policy_version 376690 (0.0031) [2024-06-23 06:17:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6171705344. Throughput: 0: 42931.2. Samples: 6171835700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 06:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376691_6171705344.pth... [2024-06-23 06:17:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376065_6161448960.pth [2024-06-23 06:17:46,603][15401] Updated weights for policy 0, policy_version 376700 (0.0039) [2024-06-23 06:17:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6171934720. Throughput: 0: 42800.6. Samples: 6172088020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 06:17:50,074][15401] Updated weights for policy 0, policy_version 376710 (0.0036) [2024-06-23 06:17:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 6172114944. Throughput: 0: 43006.5. Samples: 6172223420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:53,395][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 06:17:53,552][15349] Signal inference workers to stop experience collection... (91450 times) [2024-06-23 06:17:53,593][15401] InferenceWorker_p0-w0: stopping experience collection (91450 times) [2024-06-23 06:17:53,602][15349] Signal inference workers to resume experience collection... (91450 times) [2024-06-23 06:17:53,607][15401] InferenceWorker_p0-w0: resuming experience collection (91450 times) [2024-06-23 06:17:54,207][15401] Updated weights for policy 0, policy_version 376720 (0.0037) [2024-06-23 06:17:57,756][15401] Updated weights for policy 0, policy_version 376730 (0.0033) [2024-06-23 06:17:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6172344320. Throughput: 0: 42675.9. Samples: 6172470480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 06:17:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 06:18:02,044][15401] Updated weights for policy 0, policy_version 376740 (0.0039) [2024-06-23 06:18:03,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6172590080. Throughput: 0: 42860.5. Samples: 6172732560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 06:18:05,125][15401] Updated weights for policy 0, policy_version 376750 (0.0035) [2024-06-23 06:18:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6172770304. Throughput: 0: 42877.3. Samples: 6172862420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 06:18:09,689][15401] Updated weights for policy 0, policy_version 376760 (0.0039) [2024-06-23 06:18:12,476][15401] Updated weights for policy 0, policy_version 376770 (0.0031) [2024-06-23 06:18:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 6172999680. Throughput: 0: 42944.4. Samples: 6173121100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 06:18:17,194][15401] Updated weights for policy 0, policy_version 376780 (0.0031) [2024-06-23 06:18:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6173245440. Throughput: 0: 43016.0. Samples: 6173380800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:18,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 06:18:19,903][15401] Updated weights for policy 0, policy_version 376790 (0.0039) [2024-06-23 06:18:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6173409280. Throughput: 0: 43064.6. Samples: 6173513400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 06:18:24,923][15401] Updated weights for policy 0, policy_version 376800 (0.0040) [2024-06-23 06:18:27,506][15401] Updated weights for policy 0, policy_version 376810 (0.0035) [2024-06-23 06:18:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6173655040. Throughput: 0: 42746.5. Samples: 6173759300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 06:18:32,674][15401] Updated weights for policy 0, policy_version 376820 (0.0028) [2024-06-23 06:18:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 6173851648. Throughput: 0: 43060.8. Samples: 6174025760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 06:18:35,353][15401] Updated weights for policy 0, policy_version 376830 (0.0035) [2024-06-23 06:18:38,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6174048256. Throughput: 0: 42808.1. Samples: 6174149780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 06:18:40,285][15401] Updated weights for policy 0, policy_version 376840 (0.0036) [2024-06-23 06:18:43,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6174294016. Throughput: 0: 42947.1. Samples: 6174403200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:43,392][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 06:18:43,669][15401] Updated weights for policy 0, policy_version 376850 (0.0036) [2024-06-23 06:18:48,186][15401] Updated weights for policy 0, policy_version 376860 (0.0042) [2024-06-23 06:18:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 6174490624. Throughput: 0: 43070.6. Samples: 6174670740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 06:18:51,103][15401] Updated weights for policy 0, policy_version 376870 (0.0042) [2024-06-23 06:18:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6174703616. Throughput: 0: 42821.8. Samples: 6174789400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 06:18:55,939][15401] Updated weights for policy 0, policy_version 376880 (0.0023) [2024-06-23 06:18:57,430][15349] Signal inference workers to stop experience collection... (91500 times) [2024-06-23 06:18:57,430][15349] Signal inference workers to resume experience collection... (91500 times) [2024-06-23 06:18:57,453][15401] InferenceWorker_p0-w0: stopping experience collection (91500 times) [2024-06-23 06:18:57,453][15401] InferenceWorker_p0-w0: resuming experience collection (91500 times) [2024-06-23 06:18:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 6174949376. Throughput: 0: 42700.6. Samples: 6175042620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:18:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 06:18:58,593][15401] Updated weights for policy 0, policy_version 376890 (0.0033) [2024-06-23 06:19:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6175113216. Throughput: 0: 43069.8. Samples: 6175318940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:19:03,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 06:19:03,444][15401] Updated weights for policy 0, policy_version 376900 (0.0033) [2024-06-23 06:19:05,958][15401] Updated weights for policy 0, policy_version 376910 (0.0028) [2024-06-23 06:19:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6175342592. Throughput: 0: 42641.8. Samples: 6175432280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:19:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 06:19:10,813][15401] Updated weights for policy 0, policy_version 376920 (0.0028) [2024-06-23 06:19:13,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43417.8, 300 sec: 42987.2). Total num frames: 6175604736. Throughput: 0: 43034.4. Samples: 6175695840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:19:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 06:19:13,485][15401] Updated weights for policy 0, policy_version 376930 (0.0034) [2024-06-23 06:19:18,390][15401] Updated weights for policy 0, policy_version 376940 (0.0043) [2024-06-23 06:19:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6175784960. Throughput: 0: 43107.6. Samples: 6175965600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 06:19:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 06:19:21,095][15401] Updated weights for policy 0, policy_version 376950 (0.0041) [2024-06-23 06:19:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6175981568. Throughput: 0: 42907.2. Samples: 6176080600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 06:19:26,019][15401] Updated weights for policy 0, policy_version 376960 (0.0032) [2024-06-23 06:19:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6176243712. Throughput: 0: 42905.8. Samples: 6176333860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:28,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 06:19:28,983][15401] Updated weights for policy 0, policy_version 376970 (0.0032) [2024-06-23 06:19:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6176423936. Throughput: 0: 42931.1. Samples: 6176602640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 06:19:33,529][15401] Updated weights for policy 0, policy_version 376980 (0.0028) [2024-06-23 06:19:36,776][15401] Updated weights for policy 0, policy_version 376990 (0.0035) [2024-06-23 06:19:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6176636928. Throughput: 0: 42944.0. Samples: 6176721880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 06:19:41,314][15401] Updated weights for policy 0, policy_version 377000 (0.0047) [2024-06-23 06:19:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 6176882688. Throughput: 0: 42932.3. Samples: 6176974580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 06:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377007_6176882688.pth... [2024-06-23 06:19:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376377_6166560768.pth [2024-06-23 06:19:44,468][15401] Updated weights for policy 0, policy_version 377010 (0.0033) [2024-06-23 06:19:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6177046528. Throughput: 0: 42672.1. Samples: 6177239180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 06:19:49,183][15401] Updated weights for policy 0, policy_version 377020 (0.0034) [2024-06-23 06:19:52,058][15401] Updated weights for policy 0, policy_version 377030 (0.0037) [2024-06-23 06:19:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6177275904. Throughput: 0: 42739.6. Samples: 6177355560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 06:19:56,735][15401] Updated weights for policy 0, policy_version 377040 (0.0043) [2024-06-23 06:19:58,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42869.8, 300 sec: 42932.0). Total num frames: 6177521664. Throughput: 0: 42710.6. Samples: 6177617920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:19:58,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 06:19:59,801][15401] Updated weights for policy 0, policy_version 377050 (0.0040) [2024-06-23 06:20:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6177685504. Throughput: 0: 42570.7. Samples: 6177881280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 06:20:04,052][15349] Signal inference workers to stop experience collection... (91550 times) [2024-06-23 06:20:04,052][15349] Signal inference workers to resume experience collection... (91550 times) [2024-06-23 06:20:04,084][15401] InferenceWorker_p0-w0: stopping experience collection (91550 times) [2024-06-23 06:20:04,085][15401] InferenceWorker_p0-w0: resuming experience collection (91550 times) [2024-06-23 06:20:04,201][15401] Updated weights for policy 0, policy_version 377060 (0.0039) [2024-06-23 06:20:07,438][15401] Updated weights for policy 0, policy_version 377070 (0.0038) [2024-06-23 06:20:08,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 6177931264. Throughput: 0: 42669.2. Samples: 6178000720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 06:20:12,057][15401] Updated weights for policy 0, policy_version 377080 (0.0031) [2024-06-23 06:20:13,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42596.6, 300 sec: 42986.8). Total num frames: 6178160640. Throughput: 0: 42907.1. Samples: 6178264780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:13,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 06:20:15,100][15401] Updated weights for policy 0, policy_version 377090 (0.0023) [2024-06-23 06:20:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6178340864. Throughput: 0: 42789.4. Samples: 6178528160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:18,396][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 06:20:19,584][15401] Updated weights for policy 0, policy_version 377100 (0.0045) [2024-06-23 06:20:22,722][15401] Updated weights for policy 0, policy_version 377110 (0.0023) [2024-06-23 06:20:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6178586624. Throughput: 0: 42792.6. Samples: 6178647540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 06:20:27,298][15401] Updated weights for policy 0, policy_version 377120 (0.0035) [2024-06-23 06:20:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6178799616. Throughput: 0: 43072.2. Samples: 6178912820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 06:20:30,309][15401] Updated weights for policy 0, policy_version 377130 (0.0032) [2024-06-23 06:20:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6178979840. Throughput: 0: 42833.2. Samples: 6179166680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 06:20:34,880][15401] Updated weights for policy 0, policy_version 377140 (0.0039) [2024-06-23 06:20:37,866][15401] Updated weights for policy 0, policy_version 377150 (0.0042) [2024-06-23 06:20:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.9, 300 sec: 42931.6). Total num frames: 6179225600. Throughput: 0: 43079.4. Samples: 6179294240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 06:20:38,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 06:20:42,238][15401] Updated weights for policy 0, policy_version 377160 (0.0025) [2024-06-23 06:20:43,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 6179438592. Throughput: 0: 43213.9. Samples: 6179562440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:20:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 06:20:45,543][15401] Updated weights for policy 0, policy_version 377170 (0.0026) [2024-06-23 06:20:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6179635200. Throughput: 0: 43050.3. Samples: 6179818540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:20:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 06:20:50,018][15401] Updated weights for policy 0, policy_version 377180 (0.0026) [2024-06-23 06:20:53,120][15401] Updated weights for policy 0, policy_version 377190 (0.0033) [2024-06-23 06:20:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6179880960. Throughput: 0: 43076.8. Samples: 6179939180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:20:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 06:20:57,557][15401] Updated weights for policy 0, policy_version 377200 (0.0031) [2024-06-23 06:20:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 6180077568. Throughput: 0: 43043.2. Samples: 6180201620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:20:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 06:21:00,610][15401] Updated weights for policy 0, policy_version 377210 (0.0035) [2024-06-23 06:21:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6180290560. Throughput: 0: 42979.0. Samples: 6180462220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:03,395][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 06:21:05,341][15401] Updated weights for policy 0, policy_version 377220 (0.0041) [2024-06-23 06:21:08,260][15401] Updated weights for policy 0, policy_version 377230 (0.0034) [2024-06-23 06:21:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 6180536320. Throughput: 0: 43041.7. Samples: 6180584420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 06:21:12,982][15401] Updated weights for policy 0, policy_version 377240 (0.0031) [2024-06-23 06:21:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 6180716544. Throughput: 0: 42963.5. Samples: 6180846180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 06:21:15,841][15401] Updated weights for policy 0, policy_version 377250 (0.0039) [2024-06-23 06:21:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6180929536. Throughput: 0: 43123.1. Samples: 6181107220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 06:21:20,400][15401] Updated weights for policy 0, policy_version 377260 (0.0037) [2024-06-23 06:21:21,786][15349] Signal inference workers to stop experience collection... (91600 times) [2024-06-23 06:21:21,816][15401] InferenceWorker_p0-w0: stopping experience collection (91600 times) [2024-06-23 06:21:21,844][15349] Signal inference workers to resume experience collection... (91600 times) [2024-06-23 06:21:21,845][15401] InferenceWorker_p0-w0: resuming experience collection (91600 times) [2024-06-23 06:21:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.9). Total num frames: 6181175296. Throughput: 0: 43076.1. Samples: 6181232560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 06:21:23,422][15401] Updated weights for policy 0, policy_version 377270 (0.0035) [2024-06-23 06:21:27,943][15401] Updated weights for policy 0, policy_version 377280 (0.0036) [2024-06-23 06:21:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6181355520. Throughput: 0: 42876.4. Samples: 6181491880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 06:21:31,415][15401] Updated weights for policy 0, policy_version 377290 (0.0027) [2024-06-23 06:21:33,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 6181568512. Throughput: 0: 42839.8. Samples: 6181746440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:33,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 06:21:35,503][15401] Updated weights for policy 0, policy_version 377300 (0.0037) [2024-06-23 06:21:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 6181814272. Throughput: 0: 43028.0. Samples: 6181875440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 06:21:38,967][15401] Updated weights for policy 0, policy_version 377310 (0.0036) [2024-06-23 06:21:42,918][15401] Updated weights for policy 0, policy_version 377320 (0.0030) [2024-06-23 06:21:43,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6182010880. Throughput: 0: 43104.0. Samples: 6182141300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 06:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377320_6182010880.pth... [2024-06-23 06:21:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000376691_6171705344.pth [2024-06-23 06:21:46,657][15401] Updated weights for policy 0, policy_version 377330 (0.0040) [2024-06-23 06:21:48,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 6182207488. Throughput: 0: 42967.1. Samples: 6182395840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:48,401][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 06:21:50,346][15401] Updated weights for policy 0, policy_version 377340 (0.0039) [2024-06-23 06:21:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6182453248. Throughput: 0: 43140.0. Samples: 6182525720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 06:21:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 06:21:54,139][15401] Updated weights for policy 0, policy_version 377350 (0.0037) [2024-06-23 06:21:58,319][15401] Updated weights for policy 0, policy_version 377360 (0.0030) [2024-06-23 06:21:58,390][15132] Fps is (10 sec: 45885.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6182666240. Throughput: 0: 43065.6. Samples: 6182784140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:21:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 06:22:01,765][15401] Updated weights for policy 0, policy_version 377370 (0.0040) [2024-06-23 06:22:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6182879232. Throughput: 0: 42878.7. Samples: 6183036760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 06:22:06,143][15401] Updated weights for policy 0, policy_version 377380 (0.0029) [2024-06-23 06:22:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6183075840. Throughput: 0: 42955.1. Samples: 6183165540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 06:22:09,365][15401] Updated weights for policy 0, policy_version 377390 (0.0028) [2024-06-23 06:22:13,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 6183288832. Throughput: 0: 42980.8. Samples: 6183426120. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:13,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 06:22:13,903][15401] Updated weights for policy 0, policy_version 377400 (0.0040) [2024-06-23 06:22:17,275][15401] Updated weights for policy 0, policy_version 377410 (0.0047) [2024-06-23 06:22:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6183501824. Throughput: 0: 42926.2. Samples: 6183678020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 06:22:21,539][15401] Updated weights for policy 0, policy_version 377420 (0.0031) [2024-06-23 06:22:23,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6183731200. Throughput: 0: 42959.6. Samples: 6183808620. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 06:22:24,870][15401] Updated weights for policy 0, policy_version 377430 (0.0051) [2024-06-23 06:22:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6183944192. Throughput: 0: 42773.8. Samples: 6184066120. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 06:22:29,129][15401] Updated weights for policy 0, policy_version 377440 (0.0031) [2024-06-23 06:22:32,399][15401] Updated weights for policy 0, policy_version 377450 (0.0035) [2024-06-23 06:22:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 6184157184. Throughput: 0: 42759.6. Samples: 6184319920. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 06:22:36,733][15401] Updated weights for policy 0, policy_version 377460 (0.0042) [2024-06-23 06:22:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6184370176. Throughput: 0: 42776.4. Samples: 6184450660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 06:22:39,275][15349] Signal inference workers to stop experience collection... (91650 times) [2024-06-23 06:22:39,305][15401] InferenceWorker_p0-w0: stopping experience collection (91650 times) [2024-06-23 06:22:39,387][15349] Signal inference workers to resume experience collection... (91650 times) [2024-06-23 06:22:39,387][15401] InferenceWorker_p0-w0: resuming experience collection (91650 times) [2024-06-23 06:22:40,045][15401] Updated weights for policy 0, policy_version 377470 (0.0039) [2024-06-23 06:22:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6184583168. Throughput: 0: 42698.4. Samples: 6184705560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 06:22:44,293][15401] Updated weights for policy 0, policy_version 377480 (0.0038) [2024-06-23 06:22:47,667][15401] Updated weights for policy 0, policy_version 377490 (0.0034) [2024-06-23 06:22:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 6184812544. Throughput: 0: 42701.3. Samples: 6184958320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 06:22:51,790][15401] Updated weights for policy 0, policy_version 377500 (0.0038) [2024-06-23 06:22:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6184992768. Throughput: 0: 42750.7. Samples: 6185089320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 06:22:55,624][15401] Updated weights for policy 0, policy_version 377510 (0.0032) [2024-06-23 06:22:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 6185205760. Throughput: 0: 42586.4. Samples: 6185342400. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:22:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 06:22:59,692][15401] Updated weights for policy 0, policy_version 377520 (0.0041) [2024-06-23 06:23:03,172][15401] Updated weights for policy 0, policy_version 377530 (0.0047) [2024-06-23 06:23:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 6185451520. Throughput: 0: 42648.0. Samples: 6185597180. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:23:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 06:23:07,190][15401] Updated weights for policy 0, policy_version 377540 (0.0024) [2024-06-23 06:23:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6185631744. Throughput: 0: 42714.1. Samples: 6185730760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:23:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 06:23:10,731][15401] Updated weights for policy 0, policy_version 377550 (0.0028) [2024-06-23 06:23:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6185861120. Throughput: 0: 42646.3. Samples: 6185985200. Policy #0 lag: (min: 1.0, avg: 11.5, max: 23.0) [2024-06-23 06:23:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 06:23:14,671][15401] Updated weights for policy 0, policy_version 377560 (0.0038) [2024-06-23 06:23:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6186074112. Throughput: 0: 42760.3. Samples: 6186244140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 06:23:18,718][15401] Updated weights for policy 0, policy_version 377570 (0.0033) [2024-06-23 06:23:22,283][15401] Updated weights for policy 0, policy_version 377580 (0.0042) [2024-06-23 06:23:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 6186270720. Throughput: 0: 42644.9. Samples: 6186369780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:23,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 06:23:26,453][15401] Updated weights for policy 0, policy_version 377590 (0.0040) [2024-06-23 06:23:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6186500096. Throughput: 0: 42654.5. Samples: 6186625020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 06:23:29,804][15401] Updated weights for policy 0, policy_version 377600 (0.0032) [2024-06-23 06:23:33,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 6186729472. Throughput: 0: 42957.3. Samples: 6186891400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 06:23:34,091][15401] Updated weights for policy 0, policy_version 377610 (0.0029) [2024-06-23 06:23:37,890][15401] Updated weights for policy 0, policy_version 377620 (0.0042) [2024-06-23 06:23:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 6186926080. Throughput: 0: 42783.5. Samples: 6187014580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 06:23:41,726][15401] Updated weights for policy 0, policy_version 377630 (0.0040) [2024-06-23 06:23:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6187139072. Throughput: 0: 42757.8. Samples: 6187266500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:43,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 06:23:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377634_6187155456.pth... [2024-06-23 06:23:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377007_6176882688.pth [2024-06-23 06:23:45,455][15401] Updated weights for policy 0, policy_version 377640 (0.0032) [2024-06-23 06:23:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6187352064. Throughput: 0: 42899.6. Samples: 6187527660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 06:23:49,533][15401] Updated weights for policy 0, policy_version 377650 (0.0047) [2024-06-23 06:23:52,979][15401] Updated weights for policy 0, policy_version 377660 (0.0030) [2024-06-23 06:23:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6187581440. Throughput: 0: 42753.1. Samples: 6187654640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 06:23:57,115][15401] Updated weights for policy 0, policy_version 377670 (0.0030) [2024-06-23 06:23:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6187794432. Throughput: 0: 42817.8. Samples: 6187912000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:23:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 06:24:01,155][15401] Updated weights for policy 0, policy_version 377680 (0.0035) [2024-06-23 06:24:03,393][15132] Fps is (10 sec: 42581.9, 60 sec: 42595.7, 300 sec: 42931.1). Total num frames: 6188007424. Throughput: 0: 42682.2. Samples: 6188165000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:03,394][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 06:24:04,768][15401] Updated weights for policy 0, policy_version 377690 (0.0025) [2024-06-23 06:24:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6188204032. Throughput: 0: 42769.8. Samples: 6188294320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:08,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-23 06:24:08,755][15401] Updated weights for policy 0, policy_version 377700 (0.0034) [2024-06-23 06:24:10,009][15349] Signal inference workers to stop experience collection... (91700 times) [2024-06-23 06:24:10,009][15349] Signal inference workers to resume experience collection... (91700 times) [2024-06-23 06:24:10,036][15401] InferenceWorker_p0-w0: stopping experience collection (91700 times) [2024-06-23 06:24:10,036][15401] InferenceWorker_p0-w0: resuming experience collection (91700 times) [2024-06-23 06:24:12,431][15401] Updated weights for policy 0, policy_version 377710 (0.0040) [2024-06-23 06:24:13,389][15132] Fps is (10 sec: 40975.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6188417024. Throughput: 0: 42796.1. Samples: 6188550840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 06:24:16,682][15401] Updated weights for policy 0, policy_version 377720 (0.0035) [2024-06-23 06:24:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6188646400. Throughput: 0: 42431.1. Samples: 6188800800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:18,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 06:24:20,211][15401] Updated weights for policy 0, policy_version 377730 (0.0030) [2024-06-23 06:24:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 6188843008. Throughput: 0: 42536.8. Samples: 6188928740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 06:24:24,112][15401] Updated weights for policy 0, policy_version 377740 (0.0024) [2024-06-23 06:24:27,832][15401] Updated weights for policy 0, policy_version 377750 (0.0034) [2024-06-23 06:24:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6189072384. Throughput: 0: 42776.4. Samples: 6189191440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 06:24:31,594][15401] Updated weights for policy 0, policy_version 377760 (0.0034) [2024-06-23 06:24:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6189285376. Throughput: 0: 42627.6. Samples: 6189445900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 06:24:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 06:24:35,390][15401] Updated weights for policy 0, policy_version 377770 (0.0038) [2024-06-23 06:24:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6189481984. Throughput: 0: 42685.2. Samples: 6189575480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:24:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 06:24:39,325][15401] Updated weights for policy 0, policy_version 377780 (0.0032) [2024-06-23 06:24:43,130][15401] Updated weights for policy 0, policy_version 377790 (0.0028) [2024-06-23 06:24:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6189711360. Throughput: 0: 42619.0. Samples: 6189829860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:24:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 06:24:46,881][15401] Updated weights for policy 0, policy_version 377800 (0.0035) [2024-06-23 06:24:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6189924352. Throughput: 0: 42685.8. Samples: 6190085700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:24:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 06:24:51,303][15401] Updated weights for policy 0, policy_version 377810 (0.0029) [2024-06-23 06:24:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 6190120960. Throughput: 0: 42699.7. Samples: 6190215800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:24:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 06:24:54,818][15401] Updated weights for policy 0, policy_version 377820 (0.0037) [2024-06-23 06:24:58,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.8, 300 sec: 42930.7). Total num frames: 6190350336. Throughput: 0: 42649.9. Samples: 6190470360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:24:58,397][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 06:24:58,693][15401] Updated weights for policy 0, policy_version 377830 (0.0032) [2024-06-23 06:25:02,317][15401] Updated weights for policy 0, policy_version 377840 (0.0039) [2024-06-23 06:25:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42874.2, 300 sec: 42876.1). Total num frames: 6190579712. Throughput: 0: 42744.5. Samples: 6190724300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 06:25:06,751][15401] Updated weights for policy 0, policy_version 377850 (0.0038) [2024-06-23 06:25:08,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 6190776320. Throughput: 0: 42694.4. Samples: 6190849980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 06:25:10,144][15401] Updated weights for policy 0, policy_version 377860 (0.0028) [2024-06-23 06:25:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6190989312. Throughput: 0: 42615.6. Samples: 6191109140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:13,391][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 06:25:14,277][15401] Updated weights for policy 0, policy_version 377870 (0.0027) [2024-06-23 06:25:18,030][15401] Updated weights for policy 0, policy_version 377880 (0.0028) [2024-06-23 06:25:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6191218688. Throughput: 0: 42743.0. Samples: 6191369340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 06:25:21,846][15401] Updated weights for policy 0, policy_version 377890 (0.0024) [2024-06-23 06:25:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6191431680. Throughput: 0: 42678.7. Samples: 6191496020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 06:25:25,896][15401] Updated weights for policy 0, policy_version 377900 (0.0041) [2024-06-23 06:25:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6191628288. Throughput: 0: 42681.8. Samples: 6191750540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 06:25:29,389][15401] Updated weights for policy 0, policy_version 377910 (0.0033) [2024-06-23 06:25:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 6191824896. Throughput: 0: 42831.9. Samples: 6192013140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 06:25:33,526][15401] Updated weights for policy 0, policy_version 377920 (0.0022) [2024-06-23 06:25:35,817][15349] Signal inference workers to stop experience collection... (91750 times) [2024-06-23 06:25:35,867][15401] InferenceWorker_p0-w0: stopping experience collection (91750 times) [2024-06-23 06:25:35,871][15349] Signal inference workers to resume experience collection... (91750 times) [2024-06-23 06:25:35,879][15401] InferenceWorker_p0-w0: resuming experience collection (91750 times) [2024-06-23 06:25:37,159][15401] Updated weights for policy 0, policy_version 377930 (0.0025) [2024-06-23 06:25:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6192054272. Throughput: 0: 42607.0. Samples: 6192133120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 06:25:41,081][15401] Updated weights for policy 0, policy_version 377940 (0.0029) [2024-06-23 06:25:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6192267264. Throughput: 0: 42725.1. Samples: 6192392720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 06:25:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377947_6192283648.pth... [2024-06-23 06:25:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377320_6182010880.pth [2024-06-23 06:25:44,847][15401] Updated weights for policy 0, policy_version 377950 (0.0025) [2024-06-23 06:25:48,391][15132] Fps is (10 sec: 42590.7, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 6192480256. Throughput: 0: 42695.6. Samples: 6192645680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:48,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 06:25:48,751][15401] Updated weights for policy 0, policy_version 377960 (0.0037) [2024-06-23 06:25:52,683][15401] Updated weights for policy 0, policy_version 377970 (0.0038) [2024-06-23 06:25:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6192693248. Throughput: 0: 42663.5. Samples: 6192769840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 06:25:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 06:25:56,676][15401] Updated weights for policy 0, policy_version 377980 (0.0026) [2024-06-23 06:25:58,390][15132] Fps is (10 sec: 44244.9, 60 sec: 42876.0, 300 sec: 42820.6). Total num frames: 6192922624. Throughput: 0: 42848.0. Samples: 6193037300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:25:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 06:26:00,324][15401] Updated weights for policy 0, policy_version 377990 (0.0033) [2024-06-23 06:26:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6193119232. Throughput: 0: 42654.7. Samples: 6193288800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 06:26:04,528][15401] Updated weights for policy 0, policy_version 378000 (0.0046) [2024-06-23 06:26:08,134][15401] Updated weights for policy 0, policy_version 378010 (0.0032) [2024-06-23 06:26:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6193315840. Throughput: 0: 42696.0. Samples: 6193417340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 06:26:12,092][15401] Updated weights for policy 0, policy_version 378020 (0.0035) [2024-06-23 06:26:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6193577984. Throughput: 0: 42849.3. Samples: 6193678760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 06:26:15,611][15401] Updated weights for policy 0, policy_version 378030 (0.0039) [2024-06-23 06:26:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6193774592. Throughput: 0: 42522.8. Samples: 6193926660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 06:26:19,649][15401] Updated weights for policy 0, policy_version 378040 (0.0029) [2024-06-23 06:26:23,322][15401] Updated weights for policy 0, policy_version 378050 (0.0029) [2024-06-23 06:26:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6193971200. Throughput: 0: 42714.3. Samples: 6194055260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 06:26:27,363][15401] Updated weights for policy 0, policy_version 378060 (0.0040) [2024-06-23 06:26:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 6194200576. Throughput: 0: 42821.8. Samples: 6194319700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 06:26:30,859][15401] Updated weights for policy 0, policy_version 378070 (0.0035) [2024-06-23 06:26:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 6194413568. Throughput: 0: 42741.8. Samples: 6194568980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 06:26:34,777][15401] Updated weights for policy 0, policy_version 378080 (0.0037) [2024-06-23 06:26:38,353][15401] Updated weights for policy 0, policy_version 378090 (0.0027) [2024-06-23 06:26:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6194626560. Throughput: 0: 42892.5. Samples: 6194700000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 06:26:42,267][15401] Updated weights for policy 0, policy_version 378100 (0.0036) [2024-06-23 06:26:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 6194839552. Throughput: 0: 42783.6. Samples: 6194962560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:43,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 06:26:46,002][15401] Updated weights for policy 0, policy_version 378110 (0.0027) [2024-06-23 06:26:47,522][15349] Signal inference workers to stop experience collection... (91800 times) [2024-06-23 06:26:47,522][15349] Signal inference workers to resume experience collection... (91800 times) [2024-06-23 06:26:47,571][15401] InferenceWorker_p0-w0: stopping experience collection (91800 times) [2024-06-23 06:26:47,571][15401] InferenceWorker_p0-w0: resuming experience collection (91800 times) [2024-06-23 06:26:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42872.7, 300 sec: 42709.5). Total num frames: 6195052544. Throughput: 0: 42706.2. Samples: 6195210580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 06:26:49,797][15401] Updated weights for policy 0, policy_version 378120 (0.0039) [2024-06-23 06:26:53,390][15132] Fps is (10 sec: 40957.4, 60 sec: 42598.0, 300 sec: 42653.9). Total num frames: 6195249152. Throughput: 0: 42779.3. Samples: 6195342440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:53,391][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 06:26:53,679][15401] Updated weights for policy 0, policy_version 378130 (0.0034) [2024-06-23 06:26:57,246][15401] Updated weights for policy 0, policy_version 378140 (0.0036) [2024-06-23 06:26:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6195478528. Throughput: 0: 42824.2. Samples: 6195605840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:26:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 06:27:01,179][15401] Updated weights for policy 0, policy_version 378150 (0.0039) [2024-06-23 06:27:03,390][15132] Fps is (10 sec: 44239.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6195691520. Throughput: 0: 43004.0. Samples: 6195861840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:27:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 06:27:05,093][15401] Updated weights for policy 0, policy_version 378160 (0.0038) [2024-06-23 06:27:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 6195904512. Throughput: 0: 42986.4. Samples: 6195989640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:27:08,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 06:27:08,796][15401] Updated weights for policy 0, policy_version 378170 (0.0044) [2024-06-23 06:27:12,643][15401] Updated weights for policy 0, policy_version 378180 (0.0035) [2024-06-23 06:27:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6196117504. Throughput: 0: 42945.8. Samples: 6196252260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 06:27:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:27:16,370][15401] Updated weights for policy 0, policy_version 378190 (0.0034) [2024-06-23 06:27:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6196346880. Throughput: 0: 43120.4. Samples: 6196509400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:27:20,198][15401] Updated weights for policy 0, policy_version 378200 (0.0026) [2024-06-23 06:27:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6196559872. Throughput: 0: 42962.5. Samples: 6196633320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 06:27:24,058][15401] Updated weights for policy 0, policy_version 378210 (0.0041) [2024-06-23 06:27:28,023][15401] Updated weights for policy 0, policy_version 378220 (0.0028) [2024-06-23 06:27:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6196772864. Throughput: 0: 42946.7. Samples: 6196895160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 06:27:31,615][15401] Updated weights for policy 0, policy_version 378230 (0.0026) [2024-06-23 06:27:33,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6196969472. Throughput: 0: 43262.4. Samples: 6197157380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 06:27:35,437][15401] Updated weights for policy 0, policy_version 378240 (0.0037) [2024-06-23 06:27:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6197198848. Throughput: 0: 43177.5. Samples: 6197285400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 06:27:39,091][15401] Updated weights for policy 0, policy_version 378250 (0.0030) [2024-06-23 06:27:43,256][15401] Updated weights for policy 0, policy_version 378260 (0.0040) [2024-06-23 06:27:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6197411840. Throughput: 0: 43065.1. Samples: 6197543780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 06:27:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378261_6197428224.pth... [2024-06-23 06:27:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377634_6187155456.pth [2024-06-23 06:27:46,606][15401] Updated weights for policy 0, policy_version 378270 (0.0028) [2024-06-23 06:27:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6197624832. Throughput: 0: 43073.3. Samples: 6197800140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 06:27:50,751][15401] Updated weights for policy 0, policy_version 378280 (0.0044) [2024-06-23 06:27:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43145.0, 300 sec: 42820.5). Total num frames: 6197837824. Throughput: 0: 43019.5. Samples: 6197925520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 06:27:54,131][15401] Updated weights for policy 0, policy_version 378290 (0.0029) [2024-06-23 06:27:58,345][15401] Updated weights for policy 0, policy_version 378300 (0.0029) [2024-06-23 06:27:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6198067200. Throughput: 0: 42992.9. Samples: 6198186940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:27:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 06:28:01,595][15401] Updated weights for policy 0, policy_version 378310 (0.0031) [2024-06-23 06:28:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6198280192. Throughput: 0: 42953.2. Samples: 6198442300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:03,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 06:28:05,896][15401] Updated weights for policy 0, policy_version 378320 (0.0033) [2024-06-23 06:28:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6198493184. Throughput: 0: 43089.1. Samples: 6198572320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:08,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-23 06:28:08,416][15349] Signal inference workers to stop experience collection... (91850 times) [2024-06-23 06:28:08,417][15349] Signal inference workers to resume experience collection... (91850 times) [2024-06-23 06:28:08,450][15401] InferenceWorker_p0-w0: stopping experience collection (91850 times) [2024-06-23 06:28:08,450][15401] InferenceWorker_p0-w0: resuming experience collection (91850 times) [2024-06-23 06:28:09,258][15401] Updated weights for policy 0, policy_version 378330 (0.0039) [2024-06-23 06:28:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6198706176. Throughput: 0: 43114.5. Samples: 6198835320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 06:28:13,497][15401] Updated weights for policy 0, policy_version 378340 (0.0032) [2024-06-23 06:28:16,876][15401] Updated weights for policy 0, policy_version 378350 (0.0038) [2024-06-23 06:28:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 6198919168. Throughput: 0: 42822.2. Samples: 6199084380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 06:28:21,157][15401] Updated weights for policy 0, policy_version 378360 (0.0048) [2024-06-23 06:28:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 6199132160. Throughput: 0: 42882.8. Samples: 6199215120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 06:28:24,563][15401] Updated weights for policy 0, policy_version 378370 (0.0028) [2024-06-23 06:28:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6199345152. Throughput: 0: 43020.6. Samples: 6199479700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 06:28:28,861][15401] Updated weights for policy 0, policy_version 378380 (0.0047) [2024-06-23 06:28:32,210][15401] Updated weights for policy 0, policy_version 378390 (0.0025) [2024-06-23 06:28:33,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6199574528. Throughput: 0: 42764.3. Samples: 6199724540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 06:28:33,391][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 06:28:36,585][15401] Updated weights for policy 0, policy_version 378400 (0.0038) [2024-06-23 06:28:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6199771136. Throughput: 0: 43044.0. Samples: 6199862500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:28:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 06:28:39,872][15401] Updated weights for policy 0, policy_version 378410 (0.0041) [2024-06-23 06:28:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6199984128. Throughput: 0: 42958.6. Samples: 6200120080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:28:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 06:28:44,114][15401] Updated weights for policy 0, policy_version 378420 (0.0026) [2024-06-23 06:28:47,916][15401] Updated weights for policy 0, policy_version 378430 (0.0032) [2024-06-23 06:28:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6200213504. Throughput: 0: 42757.0. Samples: 6200366360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:28:48,399][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 06:28:51,918][15401] Updated weights for policy 0, policy_version 378440 (0.0029) [2024-06-23 06:28:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6200426496. Throughput: 0: 42822.2. Samples: 6200499320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:28:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 06:28:55,649][15401] Updated weights for policy 0, policy_version 378450 (0.0042) [2024-06-23 06:28:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.6). Total num frames: 6200623104. Throughput: 0: 42691.1. Samples: 6200756420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:28:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 06:28:59,457][15401] Updated weights for policy 0, policy_version 378460 (0.0033) [2024-06-23 06:29:03,311][15401] Updated weights for policy 0, policy_version 378470 (0.0047) [2024-06-23 06:29:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6200852480. Throughput: 0: 42785.2. Samples: 6201009720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 06:29:07,050][15401] Updated weights for policy 0, policy_version 378480 (0.0031) [2024-06-23 06:29:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6201081856. Throughput: 0: 42854.9. Samples: 6201143600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 06:29:10,874][15401] Updated weights for policy 0, policy_version 378490 (0.0029) [2024-06-23 06:29:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6201278464. Throughput: 0: 42736.4. Samples: 6201402840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 06:29:14,688][15401] Updated weights for policy 0, policy_version 378500 (0.0030) [2024-06-23 06:29:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6201475072. Throughput: 0: 43012.6. Samples: 6201660100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 06:29:19,006][15401] Updated weights for policy 0, policy_version 378510 (0.0025) [2024-06-23 06:29:22,287][15401] Updated weights for policy 0, policy_version 378520 (0.0051) [2024-06-23 06:29:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6201720832. Throughput: 0: 42810.7. Samples: 6201788980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 06:29:26,621][15401] Updated weights for policy 0, policy_version 378530 (0.0044) [2024-06-23 06:29:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6201933824. Throughput: 0: 42784.4. Samples: 6202045380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 06:29:30,064][15401] Updated weights for policy 0, policy_version 378540 (0.0037) [2024-06-23 06:29:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6202130432. Throughput: 0: 42865.7. Samples: 6202295320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:33,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-23 06:29:34,212][15401] Updated weights for policy 0, policy_version 378550 (0.0036) [2024-06-23 06:29:37,623][15401] Updated weights for policy 0, policy_version 378560 (0.0038) [2024-06-23 06:29:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6202343424. Throughput: 0: 42739.2. Samples: 6202422580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:38,390][15132] Avg episode reward: [(0, '0.885')] [2024-06-23 06:29:40,450][15349] Signal inference workers to stop experience collection... (91900 times) [2024-06-23 06:29:40,484][15401] InferenceWorker_p0-w0: stopping experience collection (91900 times) [2024-06-23 06:29:40,521][15349] Signal inference workers to resume experience collection... (91900 times) [2024-06-23 06:29:40,522][15401] InferenceWorker_p0-w0: resuming experience collection (91900 times) [2024-06-23 06:29:41,803][15401] Updated weights for policy 0, policy_version 378570 (0.0036) [2024-06-23 06:29:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6202572800. Throughput: 0: 42852.0. Samples: 6202684760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 06:29:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378575_6202572800.pth... [2024-06-23 06:29:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000377947_6192283648.pth [2024-06-23 06:29:45,526][15401] Updated weights for policy 0, policy_version 378580 (0.0034) [2024-06-23 06:29:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6202769408. Throughput: 0: 42847.9. Samples: 6202937880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 06:29:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 06:29:49,512][15401] Updated weights for policy 0, policy_version 378590 (0.0042) [2024-06-23 06:29:53,105][15401] Updated weights for policy 0, policy_version 378600 (0.0037) [2024-06-23 06:29:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 6202998784. Throughput: 0: 42697.0. Samples: 6203064960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:29:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 06:29:57,083][15401] Updated weights for policy 0, policy_version 378610 (0.0036) [2024-06-23 06:29:58,390][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6203228160. Throughput: 0: 42724.9. Samples: 6203325460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:29:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 06:30:00,964][15401] Updated weights for policy 0, policy_version 378620 (0.0044) [2024-06-23 06:30:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6203424768. Throughput: 0: 42516.3. Samples: 6203573340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:03,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 06:30:04,600][15401] Updated weights for policy 0, policy_version 378630 (0.0028) [2024-06-23 06:30:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6203621376. Throughput: 0: 42555.1. Samples: 6203703960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 06:30:08,502][15401] Updated weights for policy 0, policy_version 378640 (0.0028) [2024-06-23 06:30:12,591][15401] Updated weights for policy 0, policy_version 378650 (0.0031) [2024-06-23 06:30:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6203850752. Throughput: 0: 42844.1. Samples: 6203973360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 06:30:16,009][15401] Updated weights for policy 0, policy_version 378660 (0.0029) [2024-06-23 06:30:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6204080128. Throughput: 0: 42753.9. Samples: 6204219240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 06:30:20,067][15401] Updated weights for policy 0, policy_version 378670 (0.0028) [2024-06-23 06:30:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6204260352. Throughput: 0: 42858.7. Samples: 6204351220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 06:30:23,793][15401] Updated weights for policy 0, policy_version 378680 (0.0027) [2024-06-23 06:30:27,681][15401] Updated weights for policy 0, policy_version 378690 (0.0049) [2024-06-23 06:30:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6204473344. Throughput: 0: 42722.8. Samples: 6204607280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 06:30:31,365][15401] Updated weights for policy 0, policy_version 378700 (0.0027) [2024-06-23 06:30:33,396][15132] Fps is (10 sec: 45845.4, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 6204719104. Throughput: 0: 42678.5. Samples: 6204858680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:33,397][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 06:30:35,250][15401] Updated weights for policy 0, policy_version 378710 (0.0045) [2024-06-23 06:30:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6204899328. Throughput: 0: 42885.3. Samples: 6204994800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 06:30:38,954][15401] Updated weights for policy 0, policy_version 378720 (0.0026) [2024-06-23 06:30:43,169][15401] Updated weights for policy 0, policy_version 378730 (0.0043) [2024-06-23 06:30:43,390][15132] Fps is (10 sec: 39346.8, 60 sec: 42325.4, 300 sec: 42820.8). Total num frames: 6205112320. Throughput: 0: 42603.6. Samples: 6205242620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 06:30:46,574][15401] Updated weights for policy 0, policy_version 378740 (0.0036) [2024-06-23 06:30:47,586][15349] Signal inference workers to stop experience collection... (91950 times) [2024-06-23 06:30:47,595][15349] Signal inference workers to resume experience collection... (91950 times) [2024-06-23 06:30:47,599][15401] InferenceWorker_p0-w0: stopping experience collection (91950 times) [2024-06-23 06:30:47,618][15401] InferenceWorker_p0-w0: resuming experience collection (91950 times) [2024-06-23 06:30:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 6205358080. Throughput: 0: 42688.9. Samples: 6205494340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 06:30:50,778][15401] Updated weights for policy 0, policy_version 378750 (0.0029) [2024-06-23 06:30:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6205538304. Throughput: 0: 42776.4. Samples: 6205628900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 06:30:54,738][15401] Updated weights for policy 0, policy_version 378760 (0.0032) [2024-06-23 06:30:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 6205751296. Throughput: 0: 42420.0. Samples: 6205882260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:30:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 06:30:58,489][15401] Updated weights for policy 0, policy_version 378770 (0.0034) [2024-06-23 06:31:02,333][15401] Updated weights for policy 0, policy_version 378780 (0.0042) [2024-06-23 06:31:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6205980672. Throughput: 0: 42512.9. Samples: 6206132320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:31:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 06:31:05,946][15401] Updated weights for policy 0, policy_version 378790 (0.0034) [2024-06-23 06:31:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6206193664. Throughput: 0: 42587.1. Samples: 6206267640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 06:31:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 06:31:09,853][15401] Updated weights for policy 0, policy_version 378800 (0.0038) [2024-06-23 06:31:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6206406656. Throughput: 0: 42673.8. Samples: 6206527600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 06:31:13,475][15401] Updated weights for policy 0, policy_version 378810 (0.0025) [2024-06-23 06:31:17,460][15401] Updated weights for policy 0, policy_version 378820 (0.0036) [2024-06-23 06:31:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6206636032. Throughput: 0: 42771.5. Samples: 6206783120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 06:31:21,514][15401] Updated weights for policy 0, policy_version 378830 (0.0038) [2024-06-23 06:31:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6206849024. Throughput: 0: 42734.2. Samples: 6206917840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 06:31:25,338][15401] Updated weights for policy 0, policy_version 378840 (0.0036) [2024-06-23 06:31:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6207029248. Throughput: 0: 42777.8. Samples: 6207167620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:28,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 06:31:29,115][15401] Updated weights for policy 0, policy_version 378850 (0.0038) [2024-06-23 06:31:32,930][15401] Updated weights for policy 0, policy_version 378860 (0.0033) [2024-06-23 06:31:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 6207258624. Throughput: 0: 42793.8. Samples: 6207420060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 06:31:36,806][15401] Updated weights for policy 0, policy_version 378870 (0.0038) [2024-06-23 06:31:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6207471616. Throughput: 0: 42862.7. Samples: 6207557720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 06:31:40,466][15401] Updated weights for policy 0, policy_version 378880 (0.0034) [2024-06-23 06:31:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6207668224. Throughput: 0: 42820.3. Samples: 6207809180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 06:31:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378886_6207668224.pth... [2024-06-23 06:31:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378261_6197428224.pth [2024-06-23 06:31:44,455][15401] Updated weights for policy 0, policy_version 378890 (0.0027) [2024-06-23 06:31:48,034][15401] Updated weights for policy 0, policy_version 378900 (0.0034) [2024-06-23 06:31:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 6207913984. Throughput: 0: 42927.2. Samples: 6208064040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 06:31:52,076][15401] Updated weights for policy 0, policy_version 378910 (0.0022) [2024-06-23 06:31:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6208094208. Throughput: 0: 42862.1. Samples: 6208196440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 06:31:55,485][15401] Updated weights for policy 0, policy_version 378920 (0.0042) [2024-06-23 06:31:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6208307200. Throughput: 0: 42568.5. Samples: 6208443180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:31:58,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 06:31:59,876][15401] Updated weights for policy 0, policy_version 378930 (0.0058) [2024-06-23 06:32:03,183][15401] Updated weights for policy 0, policy_version 378940 (0.0034) [2024-06-23 06:32:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6208552960. Throughput: 0: 42572.8. Samples: 6208698900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 06:32:07,786][15401] Updated weights for policy 0, policy_version 378950 (0.0045) [2024-06-23 06:32:08,294][15349] Signal inference workers to stop experience collection... (92000 times) [2024-06-23 06:32:08,294][15349] Signal inference workers to resume experience collection... (92000 times) [2024-06-23 06:32:08,311][15401] InferenceWorker_p0-w0: stopping experience collection (92000 times) [2024-06-23 06:32:08,311][15401] InferenceWorker_p0-w0: resuming experience collection (92000 times) [2024-06-23 06:32:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6208733184. Throughput: 0: 42424.6. Samples: 6208826940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 06:32:11,043][15401] Updated weights for policy 0, policy_version 378960 (0.0034) [2024-06-23 06:32:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6208946176. Throughput: 0: 42601.8. Samples: 6209084700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 06:32:15,642][15401] Updated weights for policy 0, policy_version 378970 (0.0048) [2024-06-23 06:32:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6209175552. Throughput: 0: 42649.6. Samples: 6209339300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 06:32:18,716][15401] Updated weights for policy 0, policy_version 378980 (0.0044) [2024-06-23 06:32:23,158][15401] Updated weights for policy 0, policy_version 378990 (0.0036) [2024-06-23 06:32:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6209372160. Throughput: 0: 42398.1. Samples: 6209465640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 06:32:26,352][15401] Updated weights for policy 0, policy_version 379000 (0.0033) [2024-06-23 06:32:28,391][15132] Fps is (10 sec: 44229.8, 60 sec: 43143.3, 300 sec: 42875.8). Total num frames: 6209617920. Throughput: 0: 42441.5. Samples: 6209719120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 06:32:28,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 06:32:30,623][15401] Updated weights for policy 0, policy_version 379010 (0.0033) [2024-06-23 06:32:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6209814528. Throughput: 0: 42626.2. Samples: 6209982220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 06:32:34,218][15401] Updated weights for policy 0, policy_version 379020 (0.0028) [2024-06-23 06:32:38,390][15132] Fps is (10 sec: 39328.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6210011136. Throughput: 0: 42470.7. Samples: 6210107620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:38,391][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 06:32:38,542][15401] Updated weights for policy 0, policy_version 379030 (0.0036) [2024-06-23 06:32:41,860][15401] Updated weights for policy 0, policy_version 379040 (0.0035) [2024-06-23 06:32:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6210240512. Throughput: 0: 42653.7. Samples: 6210362600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 06:32:46,070][15401] Updated weights for policy 0, policy_version 379050 (0.0036) [2024-06-23 06:32:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6210453504. Throughput: 0: 42818.7. Samples: 6210625740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 06:32:49,462][15401] Updated weights for policy 0, policy_version 379060 (0.0032) [2024-06-23 06:32:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6210650112. Throughput: 0: 42922.2. Samples: 6210758440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 06:32:53,663][15401] Updated weights for policy 0, policy_version 379070 (0.0028) [2024-06-23 06:32:57,070][15401] Updated weights for policy 0, policy_version 379080 (0.0031) [2024-06-23 06:32:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6210895872. Throughput: 0: 42823.9. Samples: 6211011780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:32:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 06:33:01,205][15401] Updated weights for policy 0, policy_version 379090 (0.0037) [2024-06-23 06:33:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6211108864. Throughput: 0: 42913.9. Samples: 6211270420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 06:33:04,651][15401] Updated weights for policy 0, policy_version 379100 (0.0031) [2024-06-23 06:33:08,394][15132] Fps is (10 sec: 40941.7, 60 sec: 42868.2, 300 sec: 42708.8). Total num frames: 6211305472. Throughput: 0: 42997.5. Samples: 6211400720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:08,395][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 06:33:08,715][15401] Updated weights for policy 0, policy_version 379110 (0.0049) [2024-06-23 06:33:12,111][15401] Updated weights for policy 0, policy_version 379120 (0.0029) [2024-06-23 06:33:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 6211534848. Throughput: 0: 43058.8. Samples: 6211656700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 06:33:16,161][15401] Updated weights for policy 0, policy_version 379130 (0.0039) [2024-06-23 06:33:18,390][15132] Fps is (10 sec: 45895.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6211764224. Throughput: 0: 43042.1. Samples: 6211919120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 06:33:19,932][15401] Updated weights for policy 0, policy_version 379140 (0.0031) [2024-06-23 06:33:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6211977216. Throughput: 0: 43292.9. Samples: 6212055800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 06:33:23,483][15401] Updated weights for policy 0, policy_version 379150 (0.0023) [2024-06-23 06:33:27,544][15401] Updated weights for policy 0, policy_version 379160 (0.0031) [2024-06-23 06:33:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42326.6, 300 sec: 42654.0). Total num frames: 6212157440. Throughput: 0: 43174.6. Samples: 6212305460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 06:33:31,109][15401] Updated weights for policy 0, policy_version 379170 (0.0035) [2024-06-23 06:33:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6212403200. Throughput: 0: 43075.1. Samples: 6212564120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 06:33:34,866][15349] Signal inference workers to stop experience collection... (92050 times) [2024-06-23 06:33:34,906][15401] InferenceWorker_p0-w0: stopping experience collection (92050 times) [2024-06-23 06:33:34,985][15349] Signal inference workers to resume experience collection... (92050 times) [2024-06-23 06:33:34,985][15401] InferenceWorker_p0-w0: resuming experience collection (92050 times) [2024-06-23 06:33:35,133][15401] Updated weights for policy 0, policy_version 379180 (0.0032) [2024-06-23 06:33:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6212616192. Throughput: 0: 43045.0. Samples: 6212695460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 06:33:38,896][15401] Updated weights for policy 0, policy_version 379190 (0.0037) [2024-06-23 06:33:42,948][15401] Updated weights for policy 0, policy_version 379200 (0.0041) [2024-06-23 06:33:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6212812800. Throughput: 0: 43141.8. Samples: 6212953160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:43,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 06:33:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379200_6212812800.pth... [2024-06-23 06:33:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378575_6202572800.pth [2024-06-23 06:33:46,679][15401] Updated weights for policy 0, policy_version 379210 (0.0037) [2024-06-23 06:33:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6213058560. Throughput: 0: 42818.3. Samples: 6213197240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 06:33:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 06:33:50,625][15401] Updated weights for policy 0, policy_version 379220 (0.0035) [2024-06-23 06:33:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6213238784. Throughput: 0: 43022.1. Samples: 6213336520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:33:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 06:33:54,140][15401] Updated weights for policy 0, policy_version 379230 (0.0041) [2024-06-23 06:33:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6213451776. Throughput: 0: 43000.1. Samples: 6213591700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:33:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 06:33:58,556][15401] Updated weights for policy 0, policy_version 379240 (0.0036) [2024-06-23 06:34:01,986][15401] Updated weights for policy 0, policy_version 379250 (0.0036) [2024-06-23 06:34:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 6213713920. Throughput: 0: 42664.0. Samples: 6213839000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 06:34:06,390][15401] Updated weights for policy 0, policy_version 379260 (0.0034) [2024-06-23 06:34:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.6, 300 sec: 42709.5). Total num frames: 6213877760. Throughput: 0: 42740.4. Samples: 6213979120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 06:34:09,589][15401] Updated weights for policy 0, policy_version 379270 (0.0028) [2024-06-23 06:34:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6214090752. Throughput: 0: 42764.3. Samples: 6214229860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 06:34:14,055][15401] Updated weights for policy 0, policy_version 379280 (0.0038) [2024-06-23 06:34:17,114][15401] Updated weights for policy 0, policy_version 379290 (0.0034) [2024-06-23 06:34:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6214336512. Throughput: 0: 42656.1. Samples: 6214483640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 06:34:21,609][15401] Updated weights for policy 0, policy_version 379300 (0.0035) [2024-06-23 06:34:23,391][15132] Fps is (10 sec: 42591.6, 60 sec: 42324.1, 300 sec: 42653.7). Total num frames: 6214516736. Throughput: 0: 42762.3. Samples: 6214619840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:23,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 06:34:25,234][15401] Updated weights for policy 0, policy_version 379310 (0.0034) [2024-06-23 06:34:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6214746112. Throughput: 0: 42540.0. Samples: 6214867460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 06:34:29,183][15401] Updated weights for policy 0, policy_version 379320 (0.0040) [2024-06-23 06:34:32,794][15401] Updated weights for policy 0, policy_version 379330 (0.0027) [2024-06-23 06:34:33,390][15132] Fps is (10 sec: 45882.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6214975488. Throughput: 0: 42943.8. Samples: 6215129720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 06:34:36,708][15401] Updated weights for policy 0, policy_version 379340 (0.0027) [2024-06-23 06:34:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6215155712. Throughput: 0: 42674.2. Samples: 6215256860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:38,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 06:34:40,527][15401] Updated weights for policy 0, policy_version 379350 (0.0026) [2024-06-23 06:34:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6215401472. Throughput: 0: 42493.6. Samples: 6215503920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 06:34:44,167][15401] Updated weights for policy 0, policy_version 379360 (0.0043) [2024-06-23 06:34:46,015][15349] Signal inference workers to stop experience collection... (92100 times) [2024-06-23 06:34:46,060][15401] InferenceWorker_p0-w0: stopping experience collection (92100 times) [2024-06-23 06:34:46,130][15349] Signal inference workers to resume experience collection... (92100 times) [2024-06-23 06:34:46,131][15401] InferenceWorker_p0-w0: resuming experience collection (92100 times) [2024-06-23 06:34:48,353][15401] Updated weights for policy 0, policy_version 379370 (0.0036) [2024-06-23 06:34:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6215598080. Throughput: 0: 42850.7. Samples: 6215767280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 06:34:51,689][15401] Updated weights for policy 0, policy_version 379380 (0.0040) [2024-06-23 06:34:53,389][15132] Fps is (10 sec: 37684.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6215778304. Throughput: 0: 42419.7. Samples: 6215888000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:53,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 06:34:55,896][15401] Updated weights for policy 0, policy_version 379390 (0.0027) [2024-06-23 06:34:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6216040448. Throughput: 0: 42469.0. Samples: 6216140960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:34:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 06:34:59,541][15401] Updated weights for policy 0, policy_version 379400 (0.0026) [2024-06-23 06:35:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6216237056. Throughput: 0: 42788.0. Samples: 6216409100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:35:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 06:35:03,459][15401] Updated weights for policy 0, policy_version 379410 (0.0040) [2024-06-23 06:35:07,083][15401] Updated weights for policy 0, policy_version 379420 (0.0031) [2024-06-23 06:35:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6216433664. Throughput: 0: 42480.6. Samples: 6216531400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:35:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 06:35:11,178][15401] Updated weights for policy 0, policy_version 379430 (0.0029) [2024-06-23 06:35:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6216679424. Throughput: 0: 42551.0. Samples: 6216782260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 06:35:14,559][15401] Updated weights for policy 0, policy_version 379440 (0.0040) [2024-06-23 06:35:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6216859648. Throughput: 0: 42547.6. Samples: 6217044360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 06:35:19,116][15401] Updated weights for policy 0, policy_version 379450 (0.0041) [2024-06-23 06:35:22,654][15401] Updated weights for policy 0, policy_version 379460 (0.0026) [2024-06-23 06:35:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42872.6, 300 sec: 42765.0). Total num frames: 6217089024. Throughput: 0: 42451.9. Samples: 6217167200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 06:35:26,615][15401] Updated weights for policy 0, policy_version 379470 (0.0036) [2024-06-23 06:35:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 6217302016. Throughput: 0: 42710.1. Samples: 6217425860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 06:35:30,058][15401] Updated weights for policy 0, policy_version 379480 (0.0026) [2024-06-23 06:35:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6217515008. Throughput: 0: 42550.2. Samples: 6217682040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 06:35:34,109][15401] Updated weights for policy 0, policy_version 379490 (0.0040) [2024-06-23 06:35:37,530][15401] Updated weights for policy 0, policy_version 379500 (0.0022) [2024-06-23 06:35:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6217744384. Throughput: 0: 42818.2. Samples: 6217814820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 06:35:41,764][15401] Updated weights for policy 0, policy_version 379510 (0.0033) [2024-06-23 06:35:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 6217940992. Throughput: 0: 42758.2. Samples: 6218065080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 06:35:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379513_6217940992.pth... [2024-06-23 06:35:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000378886_6207668224.pth [2024-06-23 06:35:44,994][15401] Updated weights for policy 0, policy_version 379520 (0.0030) [2024-06-23 06:35:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6218153984. Throughput: 0: 42674.7. Samples: 6218329460. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 06:35:49,559][15401] Updated weights for policy 0, policy_version 379530 (0.0033) [2024-06-23 06:35:52,341][15401] Updated weights for policy 0, policy_version 379540 (0.0029) [2024-06-23 06:35:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6218383360. Throughput: 0: 42812.6. Samples: 6218457960. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 06:35:57,241][15401] Updated weights for policy 0, policy_version 379550 (0.0036) [2024-06-23 06:35:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6218579968. Throughput: 0: 42991.2. Samples: 6218716860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:35:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 06:35:58,877][15349] Signal inference workers to stop experience collection... (92150 times) [2024-06-23 06:35:58,877][15349] Signal inference workers to resume experience collection... (92150 times) [2024-06-23 06:35:58,920][15401] InferenceWorker_p0-w0: stopping experience collection (92150 times) [2024-06-23 06:35:58,920][15401] InferenceWorker_p0-w0: resuming experience collection (92150 times) [2024-06-23 06:36:00,525][15401] Updated weights for policy 0, policy_version 379560 (0.0039) [2024-06-23 06:36:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6218792960. Throughput: 0: 42892.3. Samples: 6218974520. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:36:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 06:36:04,917][15401] Updated weights for policy 0, policy_version 379570 (0.0035) [2024-06-23 06:36:08,063][15401] Updated weights for policy 0, policy_version 379580 (0.0051) [2024-06-23 06:36:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 6219038720. Throughput: 0: 43083.2. Samples: 6219105940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:36:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 06:36:12,431][15401] Updated weights for policy 0, policy_version 379590 (0.0043) [2024-06-23 06:36:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6219218944. Throughput: 0: 42956.3. Samples: 6219358900. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:36:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 06:36:15,673][15401] Updated weights for policy 0, policy_version 379600 (0.0022) [2024-06-23 06:36:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6219448320. Throughput: 0: 43012.1. Samples: 6219617580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:36:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 06:36:20,123][15401] Updated weights for policy 0, policy_version 379610 (0.0023) [2024-06-23 06:36:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6219677696. Throughput: 0: 42956.3. Samples: 6219747860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 06:36:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 06:36:23,508][15401] Updated weights for policy 0, policy_version 379620 (0.0029) [2024-06-23 06:36:27,680][15401] Updated weights for policy 0, policy_version 379630 (0.0034) [2024-06-23 06:36:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6219874304. Throughput: 0: 43134.7. Samples: 6220006140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 06:36:31,253][15401] Updated weights for policy 0, policy_version 379640 (0.0023) [2024-06-23 06:36:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6220103680. Throughput: 0: 42831.9. Samples: 6220256900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:33,394][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 06:36:35,197][15401] Updated weights for policy 0, policy_version 379650 (0.0026) [2024-06-23 06:36:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6220316672. Throughput: 0: 43000.5. Samples: 6220392980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 06:36:38,816][15401] Updated weights for policy 0, policy_version 379660 (0.0048) [2024-06-23 06:36:42,689][15401] Updated weights for policy 0, policy_version 379670 (0.0038) [2024-06-23 06:36:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 6220513280. Throughput: 0: 42992.8. Samples: 6220651540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 06:36:46,512][15401] Updated weights for policy 0, policy_version 379680 (0.0027) [2024-06-23 06:36:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6220759040. Throughput: 0: 42712.0. Samples: 6220896560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:48,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 06:36:50,387][15401] Updated weights for policy 0, policy_version 379690 (0.0028) [2024-06-23 06:36:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6220955648. Throughput: 0: 42902.3. Samples: 6221036540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 06:36:54,122][15401] Updated weights for policy 0, policy_version 379700 (0.0030) [2024-06-23 06:36:58,217][15401] Updated weights for policy 0, policy_version 379710 (0.0027) [2024-06-23 06:36:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6221168640. Throughput: 0: 42892.9. Samples: 6221289080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:36:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 06:37:01,654][15401] Updated weights for policy 0, policy_version 379720 (0.0039) [2024-06-23 06:37:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6221381632. Throughput: 0: 42904.4. Samples: 6221548280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 06:37:05,792][15401] Updated weights for policy 0, policy_version 379730 (0.0028) [2024-06-23 06:37:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6221594624. Throughput: 0: 42827.7. Samples: 6221675100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 06:37:09,234][15401] Updated weights for policy 0, policy_version 379740 (0.0043) [2024-06-23 06:37:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6221807616. Throughput: 0: 42755.1. Samples: 6221930120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 06:37:13,434][15401] Updated weights for policy 0, policy_version 379750 (0.0037) [2024-06-23 06:37:16,987][15401] Updated weights for policy 0, policy_version 379760 (0.0032) [2024-06-23 06:37:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6222004224. Throughput: 0: 42869.4. Samples: 6222186020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 06:37:21,066][15401] Updated weights for policy 0, policy_version 379770 (0.0038) [2024-06-23 06:37:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.7). Total num frames: 6222217216. Throughput: 0: 42579.5. Samples: 6222309060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 06:37:24,126][15349] Signal inference workers to stop experience collection... (92200 times) [2024-06-23 06:37:24,126][15349] Signal inference workers to resume experience collection... (92200 times) [2024-06-23 06:37:24,142][15401] InferenceWorker_p0-w0: stopping experience collection (92200 times) [2024-06-23 06:37:24,143][15401] InferenceWorker_p0-w0: resuming experience collection (92200 times) [2024-06-23 06:37:24,906][15401] Updated weights for policy 0, policy_version 379780 (0.0030) [2024-06-23 06:37:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6222446592. Throughput: 0: 42573.0. Samples: 6222567320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 06:37:28,730][15401] Updated weights for policy 0, policy_version 379790 (0.0038) [2024-06-23 06:37:32,514][15401] Updated weights for policy 0, policy_version 379800 (0.0034) [2024-06-23 06:37:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6222659584. Throughput: 0: 42826.7. Samples: 6222823760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 06:37:36,815][15401] Updated weights for policy 0, policy_version 379810 (0.0028) [2024-06-23 06:37:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6222856192. Throughput: 0: 42526.2. Samples: 6222950220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 06:37:40,461][15401] Updated weights for policy 0, policy_version 379820 (0.0031) [2024-06-23 06:37:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6223069184. Throughput: 0: 42562.2. Samples: 6223204380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:37:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 06:37:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379826_6223069184.pth... [2024-06-23 06:37:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379200_6212812800.pth [2024-06-23 06:37:44,470][15401] Updated weights for policy 0, policy_version 379830 (0.0028) [2024-06-23 06:37:47,991][15401] Updated weights for policy 0, policy_version 379840 (0.0033) [2024-06-23 06:37:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6223298560. Throughput: 0: 42400.5. Samples: 6223456300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:37:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 06:37:52,088][15401] Updated weights for policy 0, policy_version 379850 (0.0031) [2024-06-23 06:37:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6223511552. Throughput: 0: 42631.6. Samples: 6223593520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:37:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 06:37:55,490][15401] Updated weights for policy 0, policy_version 379860 (0.0026) [2024-06-23 06:37:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6223708160. Throughput: 0: 42574.5. Samples: 6223845980. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:37:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 06:37:59,730][15401] Updated weights for policy 0, policy_version 379870 (0.0028) [2024-06-23 06:38:03,037][15401] Updated weights for policy 0, policy_version 379880 (0.0037) [2024-06-23 06:38:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.7). Total num frames: 6223953920. Throughput: 0: 42309.7. Samples: 6224089960. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 06:38:07,528][15401] Updated weights for policy 0, policy_version 379890 (0.0032) [2024-06-23 06:38:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6224134144. Throughput: 0: 42641.3. Samples: 6224227920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 06:38:10,923][15401] Updated weights for policy 0, policy_version 379900 (0.0040) [2024-06-23 06:38:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6224347136. Throughput: 0: 42603.5. Samples: 6224484480. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:13,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 06:38:15,181][15401] Updated weights for policy 0, policy_version 379910 (0.0044) [2024-06-23 06:38:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6224592896. Throughput: 0: 42379.0. Samples: 6224730820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 06:38:18,585][15401] Updated weights for policy 0, policy_version 379920 (0.0029) [2024-06-23 06:38:22,810][15401] Updated weights for policy 0, policy_version 379930 (0.0039) [2024-06-23 06:38:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6224789504. Throughput: 0: 42648.0. Samples: 6224869380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 06:38:26,246][15401] Updated weights for policy 0, policy_version 379940 (0.0042) [2024-06-23 06:38:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6224986112. Throughput: 0: 42568.0. Samples: 6225119940. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 06:38:30,803][15401] Updated weights for policy 0, policy_version 379950 (0.0036) [2024-06-23 06:38:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6225231872. Throughput: 0: 42504.7. Samples: 6225369020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 06:38:34,438][15401] Updated weights for policy 0, policy_version 379960 (0.0035) [2024-06-23 06:38:37,197][15349] Signal inference workers to stop experience collection... (92250 times) [2024-06-23 06:38:37,198][15349] Signal inference workers to resume experience collection... (92250 times) [2024-06-23 06:38:37,215][15401] InferenceWorker_p0-w0: stopping experience collection (92250 times) [2024-06-23 06:38:37,215][15401] InferenceWorker_p0-w0: resuming experience collection (92250 times) [2024-06-23 06:38:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6225412096. Throughput: 0: 42494.7. Samples: 6225505780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 06:38:38,501][15401] Updated weights for policy 0, policy_version 379970 (0.0028) [2024-06-23 06:38:41,913][15401] Updated weights for policy 0, policy_version 379980 (0.0029) [2024-06-23 06:38:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6225641472. Throughput: 0: 42468.0. Samples: 6225757040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 06:38:46,139][15401] Updated weights for policy 0, policy_version 379990 (0.0021) [2024-06-23 06:38:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6225870848. Throughput: 0: 42884.5. Samples: 6226019760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 06:38:49,376][15401] Updated weights for policy 0, policy_version 380000 (0.0030) [2024-06-23 06:38:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6226067456. Throughput: 0: 42846.7. Samples: 6226156020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 06:38:53,629][15401] Updated weights for policy 0, policy_version 380010 (0.0031) [2024-06-23 06:38:56,918][15401] Updated weights for policy 0, policy_version 380020 (0.0033) [2024-06-23 06:38:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 6226296832. Throughput: 0: 42841.4. Samples: 6226412340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:38:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 06:39:01,047][15401] Updated weights for policy 0, policy_version 380030 (0.0036) [2024-06-23 06:39:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6226509824. Throughput: 0: 43129.5. Samples: 6226671640. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-23 06:39:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 06:39:04,680][15401] Updated weights for policy 0, policy_version 380040 (0.0036) [2024-06-23 06:39:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6226706432. Throughput: 0: 43003.5. Samples: 6226804540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 06:39:08,550][15401] Updated weights for policy 0, policy_version 380050 (0.0034) [2024-06-23 06:39:12,151][15401] Updated weights for policy 0, policy_version 380060 (0.0030) [2024-06-23 06:39:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6226952192. Throughput: 0: 43177.3. Samples: 6227062920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 06:39:16,123][15401] Updated weights for policy 0, policy_version 380070 (0.0033) [2024-06-23 06:39:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.3). Total num frames: 6227165184. Throughput: 0: 43294.6. Samples: 6227317280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 06:39:19,726][15401] Updated weights for policy 0, policy_version 380080 (0.0029) [2024-06-23 06:39:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6227361792. Throughput: 0: 42996.8. Samples: 6227440640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 06:39:24,052][15401] Updated weights for policy 0, policy_version 380090 (0.0041) [2024-06-23 06:39:27,304][15401] Updated weights for policy 0, policy_version 380100 (0.0023) [2024-06-23 06:39:28,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43689.0, 300 sec: 42820.2). Total num frames: 6227607552. Throughput: 0: 43056.0. Samples: 6227694660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:28,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 06:39:31,544][15401] Updated weights for policy 0, policy_version 380110 (0.0034) [2024-06-23 06:39:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6227771392. Throughput: 0: 43164.4. Samples: 6227962160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 06:39:34,909][15401] Updated weights for policy 0, policy_version 380120 (0.0029) [2024-06-23 06:39:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6228000768. Throughput: 0: 42830.2. Samples: 6228083380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 06:39:39,386][15401] Updated weights for policy 0, policy_version 380130 (0.0035) [2024-06-23 06:39:42,415][15401] Updated weights for policy 0, policy_version 380140 (0.0034) [2024-06-23 06:39:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6228230144. Throughput: 0: 42779.9. Samples: 6228337440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 06:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380142_6228246528.pth... [2024-06-23 06:39:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379513_6217940992.pth [2024-06-23 06:39:47,364][15401] Updated weights for policy 0, policy_version 380150 (0.0037) [2024-06-23 06:39:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 6228393984. Throughput: 0: 42929.8. Samples: 6228603480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 06:39:50,123][15401] Updated weights for policy 0, policy_version 380160 (0.0028) [2024-06-23 06:39:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6228639744. Throughput: 0: 42587.2. Samples: 6228720960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:53,396][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 06:39:55,018][15401] Updated weights for policy 0, policy_version 380170 (0.0022) [2024-06-23 06:39:58,206][15401] Updated weights for policy 0, policy_version 380180 (0.0036) [2024-06-23 06:39:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6228869120. Throughput: 0: 42600.2. Samples: 6228979920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:39:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 06:40:01,031][15349] Signal inference workers to stop experience collection... (92300 times) [2024-06-23 06:40:01,032][15349] Signal inference workers to resume experience collection... (92300 times) [2024-06-23 06:40:01,050][15401] InferenceWorker_p0-w0: stopping experience collection (92300 times) [2024-06-23 06:40:01,050][15401] InferenceWorker_p0-w0: resuming experience collection (92300 times) [2024-06-23 06:40:02,616][15401] Updated weights for policy 0, policy_version 380190 (0.0030) [2024-06-23 06:40:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6229049344. Throughput: 0: 42837.5. Samples: 6229244960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:40:03,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 06:40:05,919][15401] Updated weights for policy 0, policy_version 380200 (0.0030) [2024-06-23 06:40:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6229278720. Throughput: 0: 42738.8. Samples: 6229363880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:40:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 06:40:10,232][15401] Updated weights for policy 0, policy_version 380210 (0.0049) [2024-06-23 06:40:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6229508096. Throughput: 0: 42952.6. Samples: 6229627420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:40:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 06:40:13,507][15401] Updated weights for policy 0, policy_version 380220 (0.0053) [2024-06-23 06:40:17,649][15401] Updated weights for policy 0, policy_version 380230 (0.0039) [2024-06-23 06:40:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 6229704704. Throughput: 0: 42754.3. Samples: 6229886100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:40:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 06:40:21,010][15401] Updated weights for policy 0, policy_version 380240 (0.0032) [2024-06-23 06:40:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6229950464. Throughput: 0: 42816.8. Samples: 6230010140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 06:40:23,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 06:40:25,042][15401] Updated weights for policy 0, policy_version 380250 (0.0034) [2024-06-23 06:40:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 6230163456. Throughput: 0: 43090.3. Samples: 6230276500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 06:40:28,479][15401] Updated weights for policy 0, policy_version 380260 (0.0041) [2024-06-23 06:40:32,985][15401] Updated weights for policy 0, policy_version 380270 (0.0035) [2024-06-23 06:40:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6230360064. Throughput: 0: 43005.2. Samples: 6230538720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 06:40:36,016][15401] Updated weights for policy 0, policy_version 380280 (0.0028) [2024-06-23 06:40:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6230589440. Throughput: 0: 43043.2. Samples: 6230657900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 06:40:40,700][15401] Updated weights for policy 0, policy_version 380290 (0.0024) [2024-06-23 06:40:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6230818816. Throughput: 0: 43294.2. Samples: 6230928160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:43,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 06:40:43,749][15401] Updated weights for policy 0, policy_version 380300 (0.0023) [2024-06-23 06:40:48,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 6230999040. Throughput: 0: 43120.3. Samples: 6231185480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:48,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 06:40:48,394][15401] Updated weights for policy 0, policy_version 380310 (0.0026) [2024-06-23 06:40:51,330][15401] Updated weights for policy 0, policy_version 380320 (0.0044) [2024-06-23 06:40:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6231244800. Throughput: 0: 43252.1. Samples: 6231310220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 06:40:55,870][15401] Updated weights for policy 0, policy_version 380330 (0.0027) [2024-06-23 06:40:58,389][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6231457792. Throughput: 0: 43202.2. Samples: 6231571520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:40:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 06:40:58,892][15401] Updated weights for policy 0, policy_version 380340 (0.0042) [2024-06-23 06:41:03,349][15401] Updated weights for policy 0, policy_version 380350 (0.0039) [2024-06-23 06:41:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6231654400. Throughput: 0: 43230.6. Samples: 6231831480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 06:41:06,890][15401] Updated weights for policy 0, policy_version 380360 (0.0029) [2024-06-23 06:41:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 6231900160. Throughput: 0: 43206.2. Samples: 6231954420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 06:41:11,216][15401] Updated weights for policy 0, policy_version 380370 (0.0034) [2024-06-23 06:41:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6232096768. Throughput: 0: 42984.9. Samples: 6232210820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 06:41:14,490][15401] Updated weights for policy 0, policy_version 380380 (0.0031) [2024-06-23 06:41:16,908][15349] Signal inference workers to stop experience collection... (92350 times) [2024-06-23 06:41:16,908][15349] Signal inference workers to resume experience collection... (92350 times) [2024-06-23 06:41:16,928][15401] InferenceWorker_p0-w0: stopping experience collection (92350 times) [2024-06-23 06:41:16,928][15401] InferenceWorker_p0-w0: resuming experience collection (92350 times) [2024-06-23 06:41:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6232293376. Throughput: 0: 42892.5. Samples: 6232468880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:18,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 06:41:18,685][15401] Updated weights for policy 0, policy_version 380390 (0.0034) [2024-06-23 06:41:22,360][15401] Updated weights for policy 0, policy_version 380400 (0.0034) [2024-06-23 06:41:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6232506368. Throughput: 0: 42924.4. Samples: 6232589500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 06:41:26,118][15401] Updated weights for policy 0, policy_version 380410 (0.0035) [2024-06-23 06:41:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6232752128. Throughput: 0: 42727.0. Samples: 6232850880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 06:41:30,085][15401] Updated weights for policy 0, policy_version 380420 (0.0029) [2024-06-23 06:41:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6232932352. Throughput: 0: 42696.0. Samples: 6233106700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 06:41:33,737][15401] Updated weights for policy 0, policy_version 380430 (0.0039) [2024-06-23 06:41:37,691][15401] Updated weights for policy 0, policy_version 380440 (0.0038) [2024-06-23 06:41:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6233145344. Throughput: 0: 42719.0. Samples: 6233232580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 06:41:41,237][15401] Updated weights for policy 0, policy_version 380450 (0.0037) [2024-06-23 06:41:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6233391104. Throughput: 0: 42816.8. Samples: 6233498280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 06:41:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 06:41:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380456_6233391104.pth... [2024-06-23 06:41:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000379826_6223069184.pth [2024-06-23 06:41:45,311][15401] Updated weights for policy 0, policy_version 380460 (0.0041) [2024-06-23 06:41:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 6233587712. Throughput: 0: 42626.3. Samples: 6233749660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:41:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 06:41:48,907][15401] Updated weights for policy 0, policy_version 380470 (0.0040) [2024-06-23 06:41:52,945][15401] Updated weights for policy 0, policy_version 380480 (0.0035) [2024-06-23 06:41:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6233800704. Throughput: 0: 42725.8. Samples: 6233877080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:41:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 06:41:56,523][15401] Updated weights for policy 0, policy_version 380490 (0.0035) [2024-06-23 06:41:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 6234013696. Throughput: 0: 42770.6. Samples: 6234135600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:41:58,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 06:42:00,530][15401] Updated weights for policy 0, policy_version 380500 (0.0034) [2024-06-23 06:42:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6234226688. Throughput: 0: 42889.3. Samples: 6234398900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 06:42:04,464][15401] Updated weights for policy 0, policy_version 380510 (0.0030) [2024-06-23 06:42:08,066][15401] Updated weights for policy 0, policy_version 380520 (0.0038) [2024-06-23 06:42:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6234439680. Throughput: 0: 42912.5. Samples: 6234520560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 06:42:12,017][15401] Updated weights for policy 0, policy_version 380530 (0.0031) [2024-06-23 06:42:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6234669056. Throughput: 0: 42824.9. Samples: 6234778000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:13,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 06:42:15,668][15401] Updated weights for policy 0, policy_version 380540 (0.0036) [2024-06-23 06:42:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6234865664. Throughput: 0: 43002.7. Samples: 6235041820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 06:42:19,766][15401] Updated weights for policy 0, policy_version 380550 (0.0031) [2024-06-23 06:42:23,232][15401] Updated weights for policy 0, policy_version 380560 (0.0031) [2024-06-23 06:42:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6235111424. Throughput: 0: 42885.9. Samples: 6235162440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 06:42:27,306][15401] Updated weights for policy 0, policy_version 380570 (0.0034) [2024-06-23 06:42:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6235324416. Throughput: 0: 42841.9. Samples: 6235426160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:28,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-23 06:42:30,788][15401] Updated weights for policy 0, policy_version 380580 (0.0031) [2024-06-23 06:42:33,391][15132] Fps is (10 sec: 39316.3, 60 sec: 42870.6, 300 sec: 42875.9). Total num frames: 6235504640. Throughput: 0: 42861.8. Samples: 6235678500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:33,391][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 06:42:34,997][15401] Updated weights for policy 0, policy_version 380590 (0.0026) [2024-06-23 06:42:38,327][15349] Signal inference workers to stop experience collection... (92400 times) [2024-06-23 06:42:38,327][15349] Signal inference workers to resume experience collection... (92400 times) [2024-06-23 06:42:38,343][15401] InferenceWorker_p0-w0: stopping experience collection (92400 times) [2024-06-23 06:42:38,343][15401] InferenceWorker_p0-w0: resuming experience collection (92400 times) [2024-06-23 06:42:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6235734016. Throughput: 0: 42786.3. Samples: 6235802460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:38,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 06:42:38,471][15401] Updated weights for policy 0, policy_version 380600 (0.0025) [2024-06-23 06:42:42,833][15401] Updated weights for policy 0, policy_version 380610 (0.0039) [2024-06-23 06:42:43,389][15132] Fps is (10 sec: 44242.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6235947008. Throughput: 0: 42902.3. Samples: 6236066100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 06:42:46,547][15401] Updated weights for policy 0, policy_version 380620 (0.0037) [2024-06-23 06:42:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 6236160000. Throughput: 0: 42633.7. Samples: 6236317420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 06:42:50,559][15401] Updated weights for policy 0, policy_version 380630 (0.0045) [2024-06-23 06:42:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 6236372992. Throughput: 0: 42708.8. Samples: 6236442460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 06:42:54,171][15401] Updated weights for policy 0, policy_version 380640 (0.0026) [2024-06-23 06:42:58,118][15401] Updated weights for policy 0, policy_version 380650 (0.0028) [2024-06-23 06:42:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 6236585984. Throughput: 0: 42923.9. Samples: 6236709580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:42:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 06:43:01,801][15401] Updated weights for policy 0, policy_version 380660 (0.0029) [2024-06-23 06:43:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6236815360. Throughput: 0: 42486.7. Samples: 6236953720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 06:43:03,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 06:43:05,718][15401] Updated weights for policy 0, policy_version 380670 (0.0042) [2024-06-23 06:43:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 6237011968. Throughput: 0: 42695.4. Samples: 6237083840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:08,393][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 06:43:09,561][15401] Updated weights for policy 0, policy_version 380680 (0.0042) [2024-06-23 06:43:13,197][15401] Updated weights for policy 0, policy_version 380690 (0.0032) [2024-06-23 06:43:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6237224960. Throughput: 0: 42745.3. Samples: 6237349700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 06:43:17,176][15401] Updated weights for policy 0, policy_version 380700 (0.0028) [2024-06-23 06:43:18,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6237454336. Throughput: 0: 42580.7. Samples: 6237594580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 06:43:21,076][15401] Updated weights for policy 0, policy_version 380710 (0.0038) [2024-06-23 06:43:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.1, 300 sec: 42931.6). Total num frames: 6237650944. Throughput: 0: 42840.2. Samples: 6237730280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 06:43:24,759][15401] Updated weights for policy 0, policy_version 380720 (0.0032) [2024-06-23 06:43:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 6237831168. Throughput: 0: 42663.5. Samples: 6237985960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 06:43:28,827][15401] Updated weights for policy 0, policy_version 380730 (0.0024) [2024-06-23 06:43:32,572][15401] Updated weights for policy 0, policy_version 380740 (0.0027) [2024-06-23 06:43:33,394][15132] Fps is (10 sec: 45854.6, 60 sec: 43415.1, 300 sec: 43042.0). Total num frames: 6238109696. Throughput: 0: 42685.4. Samples: 6238238460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:33,395][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 06:43:36,486][15401] Updated weights for policy 0, policy_version 380750 (0.0031) [2024-06-23 06:43:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6238289920. Throughput: 0: 43002.1. Samples: 6238377560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 06:43:39,986][15349] Signal inference workers to stop experience collection... (92450 times) [2024-06-23 06:43:39,987][15349] Signal inference workers to resume experience collection... (92450 times) [2024-06-23 06:43:40,019][15401] InferenceWorker_p0-w0: stopping experience collection (92450 times) [2024-06-23 06:43:40,019][15401] InferenceWorker_p0-w0: resuming experience collection (92450 times) [2024-06-23 06:43:40,148][15401] Updated weights for policy 0, policy_version 380760 (0.0040) [2024-06-23 06:43:43,390][15132] Fps is (10 sec: 37700.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6238486528. Throughput: 0: 42741.9. Samples: 6238632960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 06:43:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380767_6238486528.pth... [2024-06-23 06:43:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380142_6228246528.pth [2024-06-23 06:43:44,111][15401] Updated weights for policy 0, policy_version 380770 (0.0031) [2024-06-23 06:43:47,596][15401] Updated weights for policy 0, policy_version 380780 (0.0029) [2024-06-23 06:43:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6238748672. Throughput: 0: 42908.4. Samples: 6238884600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 06:43:51,663][15401] Updated weights for policy 0, policy_version 380790 (0.0027) [2024-06-23 06:43:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6238928896. Throughput: 0: 43058.3. Samples: 6239021360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 06:43:55,165][15401] Updated weights for policy 0, policy_version 380800 (0.0042) [2024-06-23 06:43:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6239141888. Throughput: 0: 42759.2. Samples: 6239273860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:43:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 06:43:59,314][15401] Updated weights for policy 0, policy_version 380810 (0.0039) [2024-06-23 06:44:02,965][15401] Updated weights for policy 0, policy_version 380820 (0.0039) [2024-06-23 06:44:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 6239371264. Throughput: 0: 42992.1. Samples: 6239529220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:44:03,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 06:44:06,940][15401] Updated weights for policy 0, policy_version 380830 (0.0034) [2024-06-23 06:44:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 6239567872. Throughput: 0: 42929.1. Samples: 6239662080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:44:08,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 06:44:10,625][15401] Updated weights for policy 0, policy_version 380840 (0.0035) [2024-06-23 06:44:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 6239797248. Throughput: 0: 42710.6. Samples: 6239908040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:44:13,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 06:44:14,587][15401] Updated weights for policy 0, policy_version 380850 (0.0029) [2024-06-23 06:44:18,133][15401] Updated weights for policy 0, policy_version 380860 (0.0026) [2024-06-23 06:44:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 6240026624. Throughput: 0: 42977.9. Samples: 6240172260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:44:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 06:44:21,993][15401] Updated weights for policy 0, policy_version 380870 (0.0039) [2024-06-23 06:44:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 6240223232. Throughput: 0: 42755.1. Samples: 6240301540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 06:44:23,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 06:44:25,629][15401] Updated weights for policy 0, policy_version 380880 (0.0028) [2024-06-23 06:44:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 6240452608. Throughput: 0: 42847.1. Samples: 6240561080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 06:44:29,532][15401] Updated weights for policy 0, policy_version 380890 (0.0043) [2024-06-23 06:44:33,216][15401] Updated weights for policy 0, policy_version 380900 (0.0049) [2024-06-23 06:44:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42601.7, 300 sec: 42931.6). Total num frames: 6240665600. Throughput: 0: 42989.3. Samples: 6240819120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:33,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 06:44:37,221][15401] Updated weights for policy 0, policy_version 380910 (0.0042) [2024-06-23 06:44:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6240862208. Throughput: 0: 42783.1. Samples: 6240946600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 06:44:40,918][15401] Updated weights for policy 0, policy_version 380920 (0.0026) [2024-06-23 06:44:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 6241107968. Throughput: 0: 42880.4. Samples: 6241203480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 06:44:45,270][15401] Updated weights for policy 0, policy_version 380930 (0.0036) [2024-06-23 06:44:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6241288192. Throughput: 0: 42986.2. Samples: 6241463600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 06:44:48,636][15401] Updated weights for policy 0, policy_version 380940 (0.0037) [2024-06-23 06:44:52,842][15401] Updated weights for policy 0, policy_version 380950 (0.0030) [2024-06-23 06:44:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6241484800. Throughput: 0: 42570.1. Samples: 6241577740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 06:44:56,535][15401] Updated weights for policy 0, policy_version 380960 (0.0033) [2024-06-23 06:44:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 6241746944. Throughput: 0: 42873.4. Samples: 6241837240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:44:58,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 06:45:00,586][15401] Updated weights for policy 0, policy_version 380970 (0.0034) [2024-06-23 06:45:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6241910784. Throughput: 0: 42713.3. Samples: 6242094360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 06:45:04,309][15401] Updated weights for policy 0, policy_version 380980 (0.0032) [2024-06-23 06:45:05,176][15349] Signal inference workers to stop experience collection... (92500 times) [2024-06-23 06:45:05,176][15349] Signal inference workers to resume experience collection... (92500 times) [2024-06-23 06:45:05,215][15401] InferenceWorker_p0-w0: stopping experience collection (92500 times) [2024-06-23 06:45:05,215][15401] InferenceWorker_p0-w0: resuming experience collection (92500 times) [2024-06-23 06:45:08,064][15401] Updated weights for policy 0, policy_version 380990 (0.0035) [2024-06-23 06:45:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6242140160. Throughput: 0: 42475.5. Samples: 6242212940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 06:45:11,882][15401] Updated weights for policy 0, policy_version 381000 (0.0026) [2024-06-23 06:45:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 6242385920. Throughput: 0: 42538.8. Samples: 6242475320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 06:45:15,857][15401] Updated weights for policy 0, policy_version 381010 (0.0035) [2024-06-23 06:45:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6242549760. Throughput: 0: 42577.8. Samples: 6242735120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 06:45:19,987][15401] Updated weights for policy 0, policy_version 381020 (0.0041) [2024-06-23 06:45:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6242779136. Throughput: 0: 42313.9. Samples: 6242850720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 06:45:23,426][15401] Updated weights for policy 0, policy_version 381030 (0.0039) [2024-06-23 06:45:27,524][15401] Updated weights for policy 0, policy_version 381040 (0.0033) [2024-06-23 06:45:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6243008512. Throughput: 0: 42538.8. Samples: 6243117720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 06:45:31,259][15401] Updated weights for policy 0, policy_version 381050 (0.0035) [2024-06-23 06:45:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6243205120. Throughput: 0: 42464.7. Samples: 6243374520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 06:45:35,329][15401] Updated weights for policy 0, policy_version 381060 (0.0044) [2024-06-23 06:45:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6243418112. Throughput: 0: 42695.8. Samples: 6243499040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 06:45:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 06:45:38,653][15401] Updated weights for policy 0, policy_version 381070 (0.0038) [2024-06-23 06:45:42,949][15401] Updated weights for policy 0, policy_version 381080 (0.0046) [2024-06-23 06:45:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 6243647488. Throughput: 0: 42725.2. Samples: 6243759880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:45:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 06:45:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381083_6243663872.pth... [2024-06-23 06:45:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380456_6233391104.pth [2024-06-23 06:45:46,245][15401] Updated weights for policy 0, policy_version 381090 (0.0035) [2024-06-23 06:45:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6243844096. Throughput: 0: 42769.9. Samples: 6244019000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:45:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 06:45:50,559][15401] Updated weights for policy 0, policy_version 381100 (0.0036) [2024-06-23 06:45:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 6244073472. Throughput: 0: 42871.3. Samples: 6244142140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:45:53,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 06:45:54,238][15401] Updated weights for policy 0, policy_version 381110 (0.0038) [2024-06-23 06:45:58,105][15401] Updated weights for policy 0, policy_version 381120 (0.0045) [2024-06-23 06:45:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6244270080. Throughput: 0: 42752.4. Samples: 6244399180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:45:58,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-23 06:46:02,007][15401] Updated weights for policy 0, policy_version 381130 (0.0036) [2024-06-23 06:46:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6244483072. Throughput: 0: 42855.5. Samples: 6244663620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:03,392][15132] Avg episode reward: [(0, '0.945')] [2024-06-23 06:46:03,417][15349] Saving new best policy, reward=0.945! [2024-06-23 06:46:05,718][15401] Updated weights for policy 0, policy_version 381140 (0.0046) [2024-06-23 06:46:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6244728832. Throughput: 0: 43008.7. Samples: 6244786120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 06:46:09,603][15401] Updated weights for policy 0, policy_version 381150 (0.0030) [2024-06-23 06:46:11,613][15349] Signal inference workers to stop experience collection... (92550 times) [2024-06-23 06:46:11,620][15349] Signal inference workers to resume experience collection... (92550 times) [2024-06-23 06:46:11,663][15401] InferenceWorker_p0-w0: stopping experience collection (92550 times) [2024-06-23 06:46:11,664][15401] InferenceWorker_p0-w0: resuming experience collection (92550 times) [2024-06-23 06:46:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.5, 300 sec: 42764.7). Total num frames: 6244909056. Throughput: 0: 42599.4. Samples: 6245034800. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:13,393][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 06:46:13,667][15401] Updated weights for policy 0, policy_version 381160 (0.0034) [2024-06-23 06:46:17,198][15401] Updated weights for policy 0, policy_version 381170 (0.0027) [2024-06-23 06:46:18,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6245105664. Throughput: 0: 42691.1. Samples: 6245295720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:18,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 06:46:21,243][15401] Updated weights for policy 0, policy_version 381180 (0.0030) [2024-06-23 06:46:23,392][15132] Fps is (10 sec: 45875.0, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 6245367808. Throughput: 0: 42802.0. Samples: 6245425240. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:23,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 06:46:24,612][15401] Updated weights for policy 0, policy_version 381190 (0.0045) [2024-06-23 06:46:28,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6245548032. Throughput: 0: 42649.9. Samples: 6245679120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 06:46:28,782][15401] Updated weights for policy 0, policy_version 381200 (0.0025) [2024-06-23 06:46:32,349][15401] Updated weights for policy 0, policy_version 381210 (0.0035) [2024-06-23 06:46:33,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6245761024. Throughput: 0: 42634.6. Samples: 6245937560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 06:46:36,566][15401] Updated weights for policy 0, policy_version 381220 (0.0029) [2024-06-23 06:46:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6246006784. Throughput: 0: 42740.3. Samples: 6246065460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 06:46:39,941][15401] Updated weights for policy 0, policy_version 381230 (0.0028) [2024-06-23 06:46:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6246203392. Throughput: 0: 42851.9. Samples: 6246327520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 06:46:44,179][15401] Updated weights for policy 0, policy_version 381240 (0.0030) [2024-06-23 06:46:47,499][15401] Updated weights for policy 0, policy_version 381250 (0.0040) [2024-06-23 06:46:48,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6246400000. Throughput: 0: 42550.2. Samples: 6246578380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 06:46:51,830][15401] Updated weights for policy 0, policy_version 381260 (0.0028) [2024-06-23 06:46:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6246629376. Throughput: 0: 42594.4. Samples: 6246702860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:53,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 06:46:55,042][15401] Updated weights for policy 0, policy_version 381270 (0.0035) [2024-06-23 06:46:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6246825984. Throughput: 0: 42914.4. Samples: 6246965840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 06:46:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 06:46:59,551][15401] Updated weights for policy 0, policy_version 381280 (0.0038) [2024-06-23 06:47:02,707][15401] Updated weights for policy 0, policy_version 381290 (0.0027) [2024-06-23 06:47:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6247055360. Throughput: 0: 42672.0. Samples: 6247215860. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 06:47:07,162][15401] Updated weights for policy 0, policy_version 381300 (0.0036) [2024-06-23 06:47:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6247284736. Throughput: 0: 42722.2. Samples: 6247347640. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 06:47:10,607][15401] Updated weights for policy 0, policy_version 381310 (0.0037) [2024-06-23 06:47:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6247481344. Throughput: 0: 42873.4. Samples: 6247608420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 06:47:14,763][15401] Updated weights for policy 0, policy_version 381320 (0.0031) [2024-06-23 06:47:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 6247694336. Throughput: 0: 42756.0. Samples: 6247861580. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 06:47:18,463][15401] Updated weights for policy 0, policy_version 381330 (0.0026) [2024-06-23 06:47:22,364][15401] Updated weights for policy 0, policy_version 381340 (0.0038) [2024-06-23 06:47:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.2, 300 sec: 42654.0). Total num frames: 6247907328. Throughput: 0: 42787.3. Samples: 6247990880. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 06:47:23,496][15349] Signal inference workers to stop experience collection... (92600 times) [2024-06-23 06:47:23,497][15349] Signal inference workers to resume experience collection... (92600 times) [2024-06-23 06:47:23,530][15401] InferenceWorker_p0-w0: stopping experience collection (92600 times) [2024-06-23 06:47:23,530][15401] InferenceWorker_p0-w0: resuming experience collection (92600 times) [2024-06-23 06:47:26,129][15401] Updated weights for policy 0, policy_version 381350 (0.0028) [2024-06-23 06:47:28,396][15132] Fps is (10 sec: 44208.9, 60 sec: 43139.9, 300 sec: 42819.8). Total num frames: 6248136704. Throughput: 0: 42777.9. Samples: 6248252800. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:28,396][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 06:47:30,036][15401] Updated weights for policy 0, policy_version 381360 (0.0030) [2024-06-23 06:47:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6248333312. Throughput: 0: 42856.0. Samples: 6248506900. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 06:47:33,982][15401] Updated weights for policy 0, policy_version 381370 (0.0039) [2024-06-23 06:47:37,716][15401] Updated weights for policy 0, policy_version 381380 (0.0044) [2024-06-23 06:47:38,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6248562688. Throughput: 0: 42905.4. Samples: 6248633600. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:47:41,516][15401] Updated weights for policy 0, policy_version 381390 (0.0038) [2024-06-23 06:47:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6248759296. Throughput: 0: 42823.1. Samples: 6248892880. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 06:47:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381395_6248775680.pth... [2024-06-23 06:47:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000380767_6238486528.pth [2024-06-23 06:47:45,240][15401] Updated weights for policy 0, policy_version 381400 (0.0032) [2024-06-23 06:47:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6248988672. Throughput: 0: 42738.7. Samples: 6249139100. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:48,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 06:47:49,166][15401] Updated weights for policy 0, policy_version 381410 (0.0033) [2024-06-23 06:47:53,209][15401] Updated weights for policy 0, policy_version 381420 (0.0041) [2024-06-23 06:47:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6249185280. Throughput: 0: 42853.4. Samples: 6249276140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:53,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 06:47:56,820][15401] Updated weights for policy 0, policy_version 381430 (0.0028) [2024-06-23 06:47:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6249398272. Throughput: 0: 42637.3. Samples: 6249527100. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:47:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 06:48:00,691][15401] Updated weights for policy 0, policy_version 381440 (0.0034) [2024-06-23 06:48:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 6249611264. Throughput: 0: 42691.2. Samples: 6249782680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:48:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 06:48:04,483][15401] Updated weights for policy 0, policy_version 381450 (0.0039) [2024-06-23 06:48:08,213][15401] Updated weights for policy 0, policy_version 381460 (0.0042) [2024-06-23 06:48:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6249840640. Throughput: 0: 42666.5. Samples: 6249910880. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:48:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 06:48:12,210][15401] Updated weights for policy 0, policy_version 381470 (0.0028) [2024-06-23 06:48:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6250053632. Throughput: 0: 42536.6. Samples: 6250166680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:48:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 06:48:15,828][15401] Updated weights for policy 0, policy_version 381480 (0.0031) [2024-06-23 06:48:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 6250266624. Throughput: 0: 42570.7. Samples: 6250422580. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-06-23 06:48:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 06:48:19,864][15401] Updated weights for policy 0, policy_version 381490 (0.0040) [2024-06-23 06:48:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6250496000. Throughput: 0: 42659.6. Samples: 6250553280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 06:48:23,401][15401] Updated weights for policy 0, policy_version 381500 (0.0043) [2024-06-23 06:48:27,493][15401] Updated weights for policy 0, policy_version 381510 (0.0038) [2024-06-23 06:48:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42603.0, 300 sec: 42654.6). Total num frames: 6250692608. Throughput: 0: 42660.9. Samples: 6250812620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 06:48:31,005][15401] Updated weights for policy 0, policy_version 381520 (0.0029) [2024-06-23 06:48:33,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 6250889216. Throughput: 0: 42786.8. Samples: 6251064600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:33,392][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 06:48:35,362][15401] Updated weights for policy 0, policy_version 381530 (0.0050) [2024-06-23 06:48:38,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 6251118592. Throughput: 0: 42624.7. Samples: 6251194420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:38,397][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 06:48:38,661][15401] Updated weights for policy 0, policy_version 381540 (0.0035) [2024-06-23 06:48:43,229][15401] Updated weights for policy 0, policy_version 381550 (0.0036) [2024-06-23 06:48:43,389][15132] Fps is (10 sec: 44246.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6251331584. Throughput: 0: 42697.3. Samples: 6251448480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 06:48:46,302][15401] Updated weights for policy 0, policy_version 381560 (0.0032) [2024-06-23 06:48:48,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6251544576. Throughput: 0: 42598.3. Samples: 6251699600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 06:48:50,800][15401] Updated weights for policy 0, policy_version 381570 (0.0038) [2024-06-23 06:48:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6251741184. Throughput: 0: 42620.1. Samples: 6251828780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 06:48:53,952][15401] Updated weights for policy 0, policy_version 381580 (0.0039) [2024-06-23 06:48:57,782][15349] Signal inference workers to stop experience collection... (92650 times) [2024-06-23 06:48:57,782][15349] Signal inference workers to resume experience collection... (92650 times) [2024-06-23 06:48:57,816][15401] InferenceWorker_p0-w0: stopping experience collection (92650 times) [2024-06-23 06:48:57,817][15401] InferenceWorker_p0-w0: resuming experience collection (92650 times) [2024-06-23 06:48:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6251954176. Throughput: 0: 42581.9. Samples: 6252082860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:48:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 06:48:58,600][15401] Updated weights for policy 0, policy_version 381590 (0.0036) [2024-06-23 06:49:01,879][15401] Updated weights for policy 0, policy_version 381600 (0.0033) [2024-06-23 06:49:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6252183552. Throughput: 0: 42689.6. Samples: 6252343620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 06:49:06,504][15401] Updated weights for policy 0, policy_version 381610 (0.0025) [2024-06-23 06:49:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 6252396544. Throughput: 0: 42575.1. Samples: 6252469160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 06:49:09,627][15401] Updated weights for policy 0, policy_version 381620 (0.0032) [2024-06-23 06:49:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6252593152. Throughput: 0: 42449.7. Samples: 6252722860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 06:49:14,035][15401] Updated weights for policy 0, policy_version 381630 (0.0034) [2024-06-23 06:49:17,481][15401] Updated weights for policy 0, policy_version 381640 (0.0043) [2024-06-23 06:49:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6252806144. Throughput: 0: 42653.8. Samples: 6252983920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 06:49:21,603][15401] Updated weights for policy 0, policy_version 381650 (0.0030) [2024-06-23 06:49:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6253035520. Throughput: 0: 42669.7. Samples: 6253114280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 06:49:25,003][15401] Updated weights for policy 0, policy_version 381660 (0.0036) [2024-06-23 06:49:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6253248512. Throughput: 0: 42698.5. Samples: 6253369920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 06:49:28,957][15401] Updated weights for policy 0, policy_version 381670 (0.0035) [2024-06-23 06:49:33,060][15401] Updated weights for policy 0, policy_version 381680 (0.0034) [2024-06-23 06:49:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 6253461504. Throughput: 0: 42760.0. Samples: 6253623800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 06:49:36,480][15401] Updated weights for policy 0, policy_version 381690 (0.0039) [2024-06-23 06:49:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 6253674496. Throughput: 0: 42730.3. Samples: 6253751640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 06:49:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 06:49:40,704][15401] Updated weights for policy 0, policy_version 381700 (0.0032) [2024-06-23 06:49:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6253903872. Throughput: 0: 42749.2. Samples: 6254006580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:49:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 06:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381708_6253903872.pth... [2024-06-23 06:49:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381083_6243663872.pth [2024-06-23 06:49:44,655][15401] Updated weights for policy 0, policy_version 381710 (0.0030) [2024-06-23 06:49:48,293][15401] Updated weights for policy 0, policy_version 381720 (0.0037) [2024-06-23 06:49:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6254100480. Throughput: 0: 42704.2. Samples: 6254265300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:49:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 06:49:52,208][15401] Updated weights for policy 0, policy_version 381730 (0.0043) [2024-06-23 06:49:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6254329856. Throughput: 0: 42707.9. Samples: 6254391020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:49:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 06:49:55,756][15401] Updated weights for policy 0, policy_version 381740 (0.0031) [2024-06-23 06:49:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6254526464. Throughput: 0: 42791.1. Samples: 6254648460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:49:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 06:49:59,847][15401] Updated weights for policy 0, policy_version 381750 (0.0033) [2024-06-23 06:50:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 6254739456. Throughput: 0: 42651.6. Samples: 6254903240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 06:50:03,416][15401] Updated weights for policy 0, policy_version 381760 (0.0030) [2024-06-23 06:50:07,607][15401] Updated weights for policy 0, policy_version 381770 (0.0040) [2024-06-23 06:50:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6254968832. Throughput: 0: 42614.1. Samples: 6255031920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 06:50:11,157][15401] Updated weights for policy 0, policy_version 381780 (0.0039) [2024-06-23 06:50:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6255165440. Throughput: 0: 42641.5. Samples: 6255288780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 06:50:15,143][15401] Updated weights for policy 0, policy_version 381790 (0.0032) [2024-06-23 06:50:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 6255378432. Throughput: 0: 42757.5. Samples: 6255547900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 06:50:18,490][15349] Signal inference workers to stop experience collection... (92700 times) [2024-06-23 06:50:18,517][15401] InferenceWorker_p0-w0: stopping experience collection (92700 times) [2024-06-23 06:50:18,552][15349] Signal inference workers to resume experience collection... (92700 times) [2024-06-23 06:50:18,555][15401] InferenceWorker_p0-w0: resuming experience collection (92700 times) [2024-06-23 06:50:18,695][15401] Updated weights for policy 0, policy_version 381800 (0.0035) [2024-06-23 06:50:22,854][15401] Updated weights for policy 0, policy_version 381810 (0.0038) [2024-06-23 06:50:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6255591424. Throughput: 0: 42779.5. Samples: 6255676720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 06:50:26,390][15401] Updated weights for policy 0, policy_version 381820 (0.0034) [2024-06-23 06:50:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6255804416. Throughput: 0: 42713.8. Samples: 6255928700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 06:50:30,447][15401] Updated weights for policy 0, policy_version 381830 (0.0042) [2024-06-23 06:50:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6256017408. Throughput: 0: 42663.5. Samples: 6256185160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 06:50:34,179][15401] Updated weights for policy 0, policy_version 381840 (0.0027) [2024-06-23 06:50:38,041][15401] Updated weights for policy 0, policy_version 381850 (0.0035) [2024-06-23 06:50:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6256230400. Throughput: 0: 42714.1. Samples: 6256313160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 06:50:41,768][15401] Updated weights for policy 0, policy_version 381860 (0.0033) [2024-06-23 06:50:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6256443392. Throughput: 0: 42682.7. Samples: 6256569180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 06:50:45,498][15401] Updated weights for policy 0, policy_version 381870 (0.0027) [2024-06-23 06:50:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6256672768. Throughput: 0: 42831.0. Samples: 6256830640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 06:50:49,200][15401] Updated weights for policy 0, policy_version 381880 (0.0028) [2024-06-23 06:50:52,938][15401] Updated weights for policy 0, policy_version 381890 (0.0040) [2024-06-23 06:50:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6256885760. Throughput: 0: 42917.4. Samples: 6256963200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 06:50:56,698][15401] Updated weights for policy 0, policy_version 381900 (0.0036) [2024-06-23 06:50:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6257082368. Throughput: 0: 42810.2. Samples: 6257215240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 06:50:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 06:51:01,028][15401] Updated weights for policy 0, policy_version 381910 (0.0030) [2024-06-23 06:51:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6257311744. Throughput: 0: 42876.6. Samples: 6257477340. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 06:51:04,267][15401] Updated weights for policy 0, policy_version 381920 (0.0030) [2024-06-23 06:51:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6257524736. Throughput: 0: 42941.6. Samples: 6257609100. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 06:51:08,519][15401] Updated weights for policy 0, policy_version 381930 (0.0039) [2024-06-23 06:51:11,978][15401] Updated weights for policy 0, policy_version 381940 (0.0032) [2024-06-23 06:51:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 6257737728. Throughput: 0: 42829.4. Samples: 6257856020. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 06:51:15,950][15401] Updated weights for policy 0, policy_version 381950 (0.0048) [2024-06-23 06:51:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 6257950720. Throughput: 0: 43011.4. Samples: 6258120680. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 06:51:19,855][15401] Updated weights for policy 0, policy_version 381960 (0.0026) [2024-06-23 06:51:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6258180096. Throughput: 0: 43024.6. Samples: 6258249260. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 06:51:23,482][15401] Updated weights for policy 0, policy_version 381970 (0.0032) [2024-06-23 06:51:27,556][15401] Updated weights for policy 0, policy_version 381980 (0.0036) [2024-06-23 06:51:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6258376704. Throughput: 0: 42862.6. Samples: 6258498100. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:28,392][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 06:51:31,593][15401] Updated weights for policy 0, policy_version 381990 (0.0042) [2024-06-23 06:51:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6258589696. Throughput: 0: 42800.0. Samples: 6258756640. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 06:51:35,264][15401] Updated weights for policy 0, policy_version 382000 (0.0032) [2024-06-23 06:51:37,665][15349] Signal inference workers to stop experience collection... (92750 times) [2024-06-23 06:51:37,715][15401] InferenceWorker_p0-w0: stopping experience collection (92750 times) [2024-06-23 06:51:37,715][15349] Signal inference workers to resume experience collection... (92750 times) [2024-06-23 06:51:37,729][15401] InferenceWorker_p0-w0: resuming experience collection (92750 times) [2024-06-23 06:51:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6258802688. Throughput: 0: 42724.5. Samples: 6258885800. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:38,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 06:51:39,099][15401] Updated weights for policy 0, policy_version 382010 (0.0028) [2024-06-23 06:51:43,026][15401] Updated weights for policy 0, policy_version 382020 (0.0034) [2024-06-23 06:51:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6259015680. Throughput: 0: 42813.3. Samples: 6259141840. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 06:51:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382021_6259032064.pth... [2024-06-23 06:51:43,561][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381395_6248775680.pth [2024-06-23 06:51:46,733][15401] Updated weights for policy 0, policy_version 382030 (0.0038) [2024-06-23 06:51:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6259245056. Throughput: 0: 42807.0. Samples: 6259403660. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 06:51:50,895][15401] Updated weights for policy 0, policy_version 382040 (0.0037) [2024-06-23 06:51:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6259458048. Throughput: 0: 42739.3. Samples: 6259532360. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 06:51:54,342][15401] Updated weights for policy 0, policy_version 382050 (0.0038) [2024-06-23 06:51:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6259654656. Throughput: 0: 42899.1. Samples: 6259786480. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:51:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 06:51:58,728][15401] Updated weights for policy 0, policy_version 382060 (0.0030) [2024-06-23 06:52:02,243][15401] Updated weights for policy 0, policy_version 382070 (0.0039) [2024-06-23 06:52:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 6259884032. Throughput: 0: 42687.1. Samples: 6260041700. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:52:03,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 06:52:06,264][15401] Updated weights for policy 0, policy_version 382080 (0.0036) [2024-06-23 06:52:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6260097024. Throughput: 0: 42836.8. Samples: 6260176920. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:52:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 06:52:09,767][15401] Updated weights for policy 0, policy_version 382090 (0.0043) [2024-06-23 06:52:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6260310016. Throughput: 0: 42937.8. Samples: 6260430200. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:52:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 06:52:13,716][15401] Updated weights for policy 0, policy_version 382100 (0.0028) [2024-06-23 06:52:17,760][15401] Updated weights for policy 0, policy_version 382110 (0.0030) [2024-06-23 06:52:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6260523008. Throughput: 0: 42808.0. Samples: 6260683000. Policy #0 lag: (min: 1.0, avg: 7.8, max: 19.0) [2024-06-23 06:52:18,392][15132] Avg episode reward: [(0, '0.213')] [2024-06-23 06:52:21,245][15401] Updated weights for policy 0, policy_version 382120 (0.0031) [2024-06-23 06:52:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 6260752384. Throughput: 0: 42802.6. Samples: 6260811920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 06:52:25,358][15401] Updated weights for policy 0, policy_version 382130 (0.0031) [2024-06-23 06:52:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6260948992. Throughput: 0: 42799.6. Samples: 6261067820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 06:52:28,982][15401] Updated weights for policy 0, policy_version 382140 (0.0044) [2024-06-23 06:52:32,879][15401] Updated weights for policy 0, policy_version 382150 (0.0036) [2024-06-23 06:52:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6261161984. Throughput: 0: 42694.7. Samples: 6261324920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 06:52:36,678][15401] Updated weights for policy 0, policy_version 382160 (0.0028) [2024-06-23 06:52:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6261374976. Throughput: 0: 42631.4. Samples: 6261450780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 06:52:40,442][15401] Updated weights for policy 0, policy_version 382170 (0.0028) [2024-06-23 06:52:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6261604352. Throughput: 0: 42903.2. Samples: 6261717120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 06:52:44,286][15401] Updated weights for policy 0, policy_version 382180 (0.0035) [2024-06-23 06:52:48,039][15401] Updated weights for policy 0, policy_version 382190 (0.0037) [2024-06-23 06:52:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6261800960. Throughput: 0: 42800.5. Samples: 6261967620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 06:52:51,916][15401] Updated weights for policy 0, policy_version 382200 (0.0044) [2024-06-23 06:52:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6262013952. Throughput: 0: 42659.6. Samples: 6262096600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:53,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 06:52:55,471][15401] Updated weights for policy 0, policy_version 382210 (0.0043) [2024-06-23 06:52:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6262226944. Throughput: 0: 42836.9. Samples: 6262357860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:52:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 06:52:59,560][15401] Updated weights for policy 0, policy_version 382220 (0.0046) [2024-06-23 06:53:03,181][15401] Updated weights for policy 0, policy_version 382230 (0.0030) [2024-06-23 06:53:03,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 6262456320. Throughput: 0: 42775.8. Samples: 6262607920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 06:53:07,226][15401] Updated weights for policy 0, policy_version 382240 (0.0041) [2024-06-23 06:53:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6262652928. Throughput: 0: 42816.0. Samples: 6262738640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 06:53:10,583][15349] Signal inference workers to stop experience collection... (92800 times) [2024-06-23 06:53:10,596][15349] Signal inference workers to resume experience collection... (92800 times) [2024-06-23 06:53:10,628][15401] InferenceWorker_p0-w0: stopping experience collection (92800 times) [2024-06-23 06:53:10,628][15401] InferenceWorker_p0-w0: resuming experience collection (92800 times) [2024-06-23 06:53:10,738][15401] Updated weights for policy 0, policy_version 382250 (0.0035) [2024-06-23 06:53:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6262849536. Throughput: 0: 42787.6. Samples: 6262993260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 06:53:14,828][15401] Updated weights for policy 0, policy_version 382260 (0.0033) [2024-06-23 06:53:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 6263095296. Throughput: 0: 42704.8. Samples: 6263246640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 06:53:18,438][15401] Updated weights for policy 0, policy_version 382270 (0.0031) [2024-06-23 06:53:22,544][15401] Updated weights for policy 0, policy_version 382280 (0.0035) [2024-06-23 06:53:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6263291904. Throughput: 0: 43034.7. Samples: 6263387340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 06:53:25,905][15401] Updated weights for policy 0, policy_version 382290 (0.0034) [2024-06-23 06:53:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6263488512. Throughput: 0: 42682.1. Samples: 6263637820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:28,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 06:53:30,034][15401] Updated weights for policy 0, policy_version 382300 (0.0032) [2024-06-23 06:53:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 6263750656. Throughput: 0: 42688.1. Samples: 6263888580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-06-23 06:53:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 06:53:33,446][15401] Updated weights for policy 0, policy_version 382310 (0.0038) [2024-06-23 06:53:37,943][15401] Updated weights for policy 0, policy_version 382320 (0.0027) [2024-06-23 06:53:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6263947264. Throughput: 0: 42980.0. Samples: 6264030700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:53:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 06:53:41,171][15401] Updated weights for policy 0, policy_version 382330 (0.0041) [2024-06-23 06:53:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6264143872. Throughput: 0: 42785.8. Samples: 6264283220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:53:43,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 06:53:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382333_6264143872.pth... [2024-06-23 06:53:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000381708_6253903872.pth [2024-06-23 06:53:45,384][15401] Updated weights for policy 0, policy_version 382340 (0.0033) [2024-06-23 06:53:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6264389632. Throughput: 0: 42830.8. Samples: 6264535300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:53:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 06:53:48,692][15401] Updated weights for policy 0, policy_version 382350 (0.0032) [2024-06-23 06:53:52,961][15401] Updated weights for policy 0, policy_version 382360 (0.0040) [2024-06-23 06:53:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6264586240. Throughput: 0: 43018.7. Samples: 6264674480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:53:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 06:53:56,341][15401] Updated weights for policy 0, policy_version 382370 (0.0055) [2024-06-23 06:53:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 6264782848. Throughput: 0: 42881.5. Samples: 6264922940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:53:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 06:54:00,700][15401] Updated weights for policy 0, policy_version 382380 (0.0032) [2024-06-23 06:54:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6265044992. Throughput: 0: 42878.4. Samples: 6265176160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 06:54:04,027][15401] Updated weights for policy 0, policy_version 382390 (0.0042) [2024-06-23 06:54:08,335][15401] Updated weights for policy 0, policy_version 382400 (0.0036) [2024-06-23 06:54:08,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6265241600. Throughput: 0: 42982.2. Samples: 6265321540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 06:54:11,423][15401] Updated weights for policy 0, policy_version 382410 (0.0023) [2024-06-23 06:54:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6265438208. Throughput: 0: 42924.0. Samples: 6265569400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 06:54:16,114][15401] Updated weights for policy 0, policy_version 382420 (0.0034) [2024-06-23 06:54:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6265683968. Throughput: 0: 43083.4. Samples: 6265827340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 06:54:18,953][15401] Updated weights for policy 0, policy_version 382430 (0.0035) [2024-06-23 06:54:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6265864192. Throughput: 0: 43038.6. Samples: 6265967440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:23,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-23 06:54:23,707][15401] Updated weights for policy 0, policy_version 382440 (0.0034) [2024-06-23 06:54:26,392][15401] Updated weights for policy 0, policy_version 382450 (0.0049) [2024-06-23 06:54:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6266093568. Throughput: 0: 42871.9. Samples: 6266212460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 06:54:31,287][15401] Updated weights for policy 0, policy_version 382460 (0.0039) [2024-06-23 06:54:33,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43417.4, 300 sec: 42987.1). Total num frames: 6266355712. Throughput: 0: 43061.2. Samples: 6266473060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 06:54:34,711][15401] Updated weights for policy 0, policy_version 382470 (0.0043) [2024-06-23 06:54:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 6266503168. Throughput: 0: 43031.5. Samples: 6266611000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:38,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 06:54:38,799][15401] Updated weights for policy 0, policy_version 382480 (0.0035) [2024-06-23 06:54:42,262][15401] Updated weights for policy 0, policy_version 382490 (0.0030) [2024-06-23 06:54:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6266748928. Throughput: 0: 43183.2. Samples: 6266866180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 06:54:45,770][15349] Signal inference workers to stop experience collection... (92850 times) [2024-06-23 06:54:45,770][15349] Signal inference workers to resume experience collection... (92850 times) [2024-06-23 06:54:45,782][15401] InferenceWorker_p0-w0: stopping experience collection (92850 times) [2024-06-23 06:54:45,793][15401] InferenceWorker_p0-w0: resuming experience collection (92850 times) [2024-06-23 06:54:46,298][15401] Updated weights for policy 0, policy_version 382500 (0.0029) [2024-06-23 06:54:48,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6266961920. Throughput: 0: 43265.6. Samples: 6267123120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 06:54:49,651][15401] Updated weights for policy 0, policy_version 382510 (0.0027) [2024-06-23 06:54:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6267142144. Throughput: 0: 42925.2. Samples: 6267253180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 06:54:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 06:54:53,983][15401] Updated weights for policy 0, policy_version 382520 (0.0035) [2024-06-23 06:54:57,161][15401] Updated weights for policy 0, policy_version 382530 (0.0033) [2024-06-23 06:54:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6267387904. Throughput: 0: 43075.0. Samples: 6267507780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:54:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 06:55:01,688][15401] Updated weights for policy 0, policy_version 382540 (0.0030) [2024-06-23 06:55:03,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6267617280. Throughput: 0: 43182.3. Samples: 6267770540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 06:55:04,775][15401] Updated weights for policy 0, policy_version 382550 (0.0044) [2024-06-23 06:55:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6267781120. Throughput: 0: 42932.8. Samples: 6267899420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 06:55:09,311][15401] Updated weights for policy 0, policy_version 382560 (0.0031) [2024-06-23 06:55:12,229][15401] Updated weights for policy 0, policy_version 382570 (0.0042) [2024-06-23 06:55:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6268026880. Throughput: 0: 43056.8. Samples: 6268150020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:13,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 06:55:17,098][15401] Updated weights for policy 0, policy_version 382580 (0.0032) [2024-06-23 06:55:18,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6268256256. Throughput: 0: 43137.1. Samples: 6268414220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 06:55:19,771][15401] Updated weights for policy 0, policy_version 382590 (0.0044) [2024-06-23 06:55:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6268436480. Throughput: 0: 42987.6. Samples: 6268545340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 06:55:24,865][15401] Updated weights for policy 0, policy_version 382600 (0.0040) [2024-06-23 06:55:27,742][15401] Updated weights for policy 0, policy_version 382610 (0.0033) [2024-06-23 06:55:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6268682240. Throughput: 0: 42730.3. Samples: 6268789040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 06:55:32,439][15401] Updated weights for policy 0, policy_version 382620 (0.0035) [2024-06-23 06:55:33,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 6268911616. Throughput: 0: 42856.1. Samples: 6269051640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 06:55:35,247][15401] Updated weights for policy 0, policy_version 382630 (0.0028) [2024-06-23 06:55:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 6269075456. Throughput: 0: 42892.2. Samples: 6269183320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:38,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 06:55:39,912][15349] Signal inference workers to stop experience collection... (92900 times) [2024-06-23 06:55:39,968][15401] InferenceWorker_p0-w0: stopping experience collection (92900 times) [2024-06-23 06:55:39,968][15349] Signal inference workers to resume experience collection... (92900 times) [2024-06-23 06:55:39,970][15401] Updated weights for policy 0, policy_version 382640 (0.0027) [2024-06-23 06:55:39,992][15401] InferenceWorker_p0-w0: resuming experience collection (92900 times) [2024-06-23 06:55:43,260][15401] Updated weights for policy 0, policy_version 382650 (0.0032) [2024-06-23 06:55:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6269337600. Throughput: 0: 42708.9. Samples: 6269429680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 06:55:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382650_6269337600.pth... [2024-06-23 06:55:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382021_6259032064.pth [2024-06-23 06:55:47,683][15401] Updated weights for policy 0, policy_version 382660 (0.0035) [2024-06-23 06:55:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6269517824. Throughput: 0: 42722.3. Samples: 6269693040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:48,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-23 06:55:50,755][15401] Updated weights for policy 0, policy_version 382670 (0.0036) [2024-06-23 06:55:53,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6269714432. Throughput: 0: 42656.5. Samples: 6269818960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:53,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 06:55:55,518][15401] Updated weights for policy 0, policy_version 382680 (0.0037) [2024-06-23 06:55:58,243][15401] Updated weights for policy 0, policy_version 382690 (0.0031) [2024-06-23 06:55:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 6269992960. Throughput: 0: 42691.3. Samples: 6270071120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:55:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 06:56:03,132][15401] Updated weights for policy 0, policy_version 382700 (0.0041) [2024-06-23 06:56:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6270156800. Throughput: 0: 42623.0. Samples: 6270332260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:56:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 06:56:05,798][15401] Updated weights for policy 0, policy_version 382710 (0.0033) [2024-06-23 06:56:08,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6270353408. Throughput: 0: 42330.7. Samples: 6270450220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:56:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 06:56:10,716][15401] Updated weights for policy 0, policy_version 382720 (0.0042) [2024-06-23 06:56:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 6270631936. Throughput: 0: 42786.7. Samples: 6270714440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-23 06:56:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 06:56:13,583][15401] Updated weights for policy 0, policy_version 382730 (0.0031) [2024-06-23 06:56:18,244][15401] Updated weights for policy 0, policy_version 382740 (0.0027) [2024-06-23 06:56:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6270812160. Throughput: 0: 42830.2. Samples: 6270979000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 06:56:21,341][15401] Updated weights for policy 0, policy_version 382750 (0.0029) [2024-06-23 06:56:23,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 6271008768. Throughput: 0: 42649.3. Samples: 6271102540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 06:56:25,809][15401] Updated weights for policy 0, policy_version 382760 (0.0040) [2024-06-23 06:56:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6271270912. Throughput: 0: 43003.3. Samples: 6271364820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 06:56:28,911][15401] Updated weights for policy 0, policy_version 382770 (0.0035) [2024-06-23 06:56:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6271451136. Throughput: 0: 42891.5. Samples: 6271623160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 06:56:33,420][15401] Updated weights for policy 0, policy_version 382780 (0.0034) [2024-06-23 06:56:36,903][15401] Updated weights for policy 0, policy_version 382790 (0.0033) [2024-06-23 06:56:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6271647744. Throughput: 0: 42773.4. Samples: 6271743760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 06:56:40,857][15401] Updated weights for policy 0, policy_version 382800 (0.0025) [2024-06-23 06:56:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6271909888. Throughput: 0: 43049.7. Samples: 6272008360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 06:56:44,321][15401] Updated weights for policy 0, policy_version 382810 (0.0037) [2024-06-23 06:56:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6272106496. Throughput: 0: 43052.9. Samples: 6272269640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 06:56:48,546][15401] Updated weights for policy 0, policy_version 382820 (0.0030) [2024-06-23 06:56:52,405][15401] Updated weights for policy 0, policy_version 382830 (0.0036) [2024-06-23 06:56:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6272303104. Throughput: 0: 43130.7. Samples: 6272391100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:53,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 06:56:56,281][15401] Updated weights for policy 0, policy_version 382840 (0.0033) [2024-06-23 06:56:57,749][15349] Signal inference workers to stop experience collection... (92950 times) [2024-06-23 06:56:57,749][15349] Signal inference workers to resume experience collection... (92950 times) [2024-06-23 06:56:57,797][15401] InferenceWorker_p0-w0: stopping experience collection (92950 times) [2024-06-23 06:56:57,797][15401] InferenceWorker_p0-w0: resuming experience collection (92950 times) [2024-06-23 06:56:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 6272548864. Throughput: 0: 43101.9. Samples: 6272654020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:56:58,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 06:56:59,770][15401] Updated weights for policy 0, policy_version 382850 (0.0038) [2024-06-23 06:57:03,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 6272761856. Throughput: 0: 43045.2. Samples: 6272916140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:03,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 06:57:03,754][15401] Updated weights for policy 0, policy_version 382860 (0.0042) [2024-06-23 06:57:07,235][15401] Updated weights for policy 0, policy_version 382870 (0.0038) [2024-06-23 06:57:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6272958464. Throughput: 0: 43111.1. Samples: 6273042540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 06:57:11,563][15401] Updated weights for policy 0, policy_version 382880 (0.0047) [2024-06-23 06:57:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6273187840. Throughput: 0: 43009.7. Samples: 6273300260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 06:57:14,831][15401] Updated weights for policy 0, policy_version 382890 (0.0037) [2024-06-23 06:57:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6273400832. Throughput: 0: 43029.3. Samples: 6273559480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 06:57:19,205][15401] Updated weights for policy 0, policy_version 382900 (0.0030) [2024-06-23 06:57:22,623][15401] Updated weights for policy 0, policy_version 382910 (0.0045) [2024-06-23 06:57:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6273613824. Throughput: 0: 43097.7. Samples: 6273683160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 06:57:26,754][15401] Updated weights for policy 0, policy_version 382920 (0.0028) [2024-06-23 06:57:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6273826816. Throughput: 0: 42954.7. Samples: 6273941320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 06:57:30,365][15401] Updated weights for policy 0, policy_version 382930 (0.0043) [2024-06-23 06:57:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6274023424. Throughput: 0: 42791.0. Samples: 6274195240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-23 06:57:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 06:57:34,828][15401] Updated weights for policy 0, policy_version 382940 (0.0043) [2024-06-23 06:57:38,170][15401] Updated weights for policy 0, policy_version 382950 (0.0037) [2024-06-23 06:57:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 6274269184. Throughput: 0: 42818.6. Samples: 6274317940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:57:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 06:57:42,423][15401] Updated weights for policy 0, policy_version 382960 (0.0031) [2024-06-23 06:57:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6274465792. Throughput: 0: 43026.2. Samples: 6274590200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:57:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 06:57:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382964_6274482176.pth... [2024-06-23 06:57:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382333_6264143872.pth [2024-06-23 06:57:45,744][15401] Updated weights for policy 0, policy_version 382970 (0.0032) [2024-06-23 06:57:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6274678784. Throughput: 0: 42756.9. Samples: 6274840100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:57:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 06:57:49,946][15401] Updated weights for policy 0, policy_version 382980 (0.0030) [2024-06-23 06:57:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6274891776. Throughput: 0: 42805.0. Samples: 6274968760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:57:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 06:57:53,406][15401] Updated weights for policy 0, policy_version 382990 (0.0031) [2024-06-23 06:57:57,496][15401] Updated weights for policy 0, policy_version 383000 (0.0034) [2024-06-23 06:57:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6275104768. Throughput: 0: 42902.6. Samples: 6275230880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:57:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 06:58:01,162][15401] Updated weights for policy 0, policy_version 383010 (0.0028) [2024-06-23 06:58:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 6275301376. Throughput: 0: 42770.6. Samples: 6275484160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:03,395][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 06:58:04,995][15401] Updated weights for policy 0, policy_version 383020 (0.0046) [2024-06-23 06:58:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 6275530752. Throughput: 0: 42853.0. Samples: 6275611540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 06:58:08,670][15401] Updated weights for policy 0, policy_version 383030 (0.0036) [2024-06-23 06:58:12,649][15401] Updated weights for policy 0, policy_version 383040 (0.0025) [2024-06-23 06:58:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6275743744. Throughput: 0: 42806.2. Samples: 6275867600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 06:58:16,264][15401] Updated weights for policy 0, policy_version 383050 (0.0036) [2024-06-23 06:58:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 6275940352. Throughput: 0: 42807.5. Samples: 6276121580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 06:58:20,489][15401] Updated weights for policy 0, policy_version 383060 (0.0040) [2024-06-23 06:58:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 6276186112. Throughput: 0: 42881.8. Samples: 6276247620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 06:58:23,856][15401] Updated weights for policy 0, policy_version 383070 (0.0028) [2024-06-23 06:58:28,002][15401] Updated weights for policy 0, policy_version 383080 (0.0035) [2024-06-23 06:58:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6276382720. Throughput: 0: 42576.3. Samples: 6276506140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 06:58:30,745][15349] Signal inference workers to stop experience collection... (93000 times) [2024-06-23 06:58:30,745][15349] Signal inference workers to resume experience collection... (93000 times) [2024-06-23 06:58:30,790][15401] InferenceWorker_p0-w0: stopping experience collection (93000 times) [2024-06-23 06:58:30,790][15401] InferenceWorker_p0-w0: resuming experience collection (93000 times) [2024-06-23 06:58:31,857][15401] Updated weights for policy 0, policy_version 383090 (0.0040) [2024-06-23 06:58:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 6276579328. Throughput: 0: 42747.6. Samples: 6276763740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 06:58:35,831][15401] Updated weights for policy 0, policy_version 383100 (0.0043) [2024-06-23 06:58:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 6276808704. Throughput: 0: 42515.1. Samples: 6276881940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 06:58:40,010][15401] Updated weights for policy 0, policy_version 383110 (0.0024) [2024-06-23 06:58:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6277005312. Throughput: 0: 42393.0. Samples: 6277138560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 06:58:43,535][15401] Updated weights for policy 0, policy_version 383120 (0.0024) [2024-06-23 06:58:47,786][15401] Updated weights for policy 0, policy_version 383130 (0.0025) [2024-06-23 06:58:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6277218304. Throughput: 0: 42518.4. Samples: 6277397480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 06:58:51,070][15401] Updated weights for policy 0, policy_version 383140 (0.0034) [2024-06-23 06:58:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6277431296. Throughput: 0: 42353.3. Samples: 6277517440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 06:58:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 06:58:55,211][15401] Updated weights for policy 0, policy_version 383150 (0.0039) [2024-06-23 06:58:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6277644288. Throughput: 0: 42444.5. Samples: 6277777600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:58:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 06:58:59,181][15401] Updated weights for policy 0, policy_version 383160 (0.0039) [2024-06-23 06:59:02,695][15401] Updated weights for policy 0, policy_version 383170 (0.0029) [2024-06-23 06:59:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6277873664. Throughput: 0: 42608.1. Samples: 6278038940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 06:59:06,758][15401] Updated weights for policy 0, policy_version 383180 (0.0026) [2024-06-23 06:59:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6278086656. Throughput: 0: 42722.7. Samples: 6278170140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 06:59:10,387][15401] Updated weights for policy 0, policy_version 383190 (0.0042) [2024-06-23 06:59:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6278283264. Throughput: 0: 42559.1. Samples: 6278421300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 06:59:14,327][15401] Updated weights for policy 0, policy_version 383200 (0.0027) [2024-06-23 06:59:18,126][15401] Updated weights for policy 0, policy_version 383210 (0.0026) [2024-06-23 06:59:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6278512640. Throughput: 0: 42398.7. Samples: 6278671680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 06:59:22,018][15401] Updated weights for policy 0, policy_version 383220 (0.0041) [2024-06-23 06:59:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6278725632. Throughput: 0: 42782.2. Samples: 6278807140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 06:59:25,938][15401] Updated weights for policy 0, policy_version 383230 (0.0032) [2024-06-23 06:59:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6278938624. Throughput: 0: 42703.0. Samples: 6279060200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 06:59:29,735][15401] Updated weights for policy 0, policy_version 383240 (0.0036) [2024-06-23 06:59:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 6279151616. Throughput: 0: 42765.2. Samples: 6279321920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 06:59:33,440][15401] Updated weights for policy 0, policy_version 383250 (0.0029) [2024-06-23 06:59:37,813][15401] Updated weights for policy 0, policy_version 383260 (0.0039) [2024-06-23 06:59:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6279364608. Throughput: 0: 42944.9. Samples: 6279449960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:38,395][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 06:59:41,298][15401] Updated weights for policy 0, policy_version 383270 (0.0023) [2024-06-23 06:59:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6279577600. Throughput: 0: 42775.5. Samples: 6279702500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:43,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 06:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383275_6279577600.pth... [2024-06-23 06:59:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382650_6269337600.pth [2024-06-23 06:59:45,349][15401] Updated weights for policy 0, policy_version 383280 (0.0037) [2024-06-23 06:59:48,391][15132] Fps is (10 sec: 42590.6, 60 sec: 42870.1, 300 sec: 42875.9). Total num frames: 6279790592. Throughput: 0: 42729.8. Samples: 6279961860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:48,392][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 06:59:48,920][15401] Updated weights for policy 0, policy_version 383290 (0.0048) [2024-06-23 06:59:52,827][15401] Updated weights for policy 0, policy_version 383300 (0.0035) [2024-06-23 06:59:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6280003584. Throughput: 0: 42589.3. Samples: 6280086660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 06:59:56,464][15401] Updated weights for policy 0, policy_version 383310 (0.0032) [2024-06-23 06:59:57,893][15349] Signal inference workers to stop experience collection... (93050 times) [2024-06-23 06:59:57,895][15349] Signal inference workers to resume experience collection... (93050 times) [2024-06-23 06:59:57,907][15401] InferenceWorker_p0-w0: stopping experience collection (93050 times) [2024-06-23 06:59:57,936][15401] InferenceWorker_p0-w0: resuming experience collection (93050 times) [2024-06-23 06:59:58,389][15132] Fps is (10 sec: 44245.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6280232960. Throughput: 0: 42752.1. Samples: 6280345140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 06:59:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 07:00:00,721][15401] Updated weights for policy 0, policy_version 383320 (0.0037) [2024-06-23 07:00:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6280429568. Throughput: 0: 42798.2. Samples: 6280597600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:00:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 07:00:04,171][15401] Updated weights for policy 0, policy_version 383330 (0.0043) [2024-06-23 07:00:08,280][15401] Updated weights for policy 0, policy_version 383340 (0.0030) [2024-06-23 07:00:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6280642560. Throughput: 0: 42630.2. Samples: 6280725500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:00:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 07:00:11,871][15401] Updated weights for policy 0, policy_version 383350 (0.0026) [2024-06-23 07:00:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6280871936. Throughput: 0: 42808.4. Samples: 6280986580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 07:00:16,037][15401] Updated weights for policy 0, policy_version 383360 (0.0029) [2024-06-23 07:00:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6281084928. Throughput: 0: 42583.5. Samples: 6281238180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 07:00:19,554][15401] Updated weights for policy 0, policy_version 383370 (0.0032) [2024-06-23 07:00:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6281281536. Throughput: 0: 42535.0. Samples: 6281364040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 07:00:23,766][15401] Updated weights for policy 0, policy_version 383380 (0.0036) [2024-06-23 07:00:27,008][15401] Updated weights for policy 0, policy_version 383390 (0.0043) [2024-06-23 07:00:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6281494528. Throughput: 0: 42576.5. Samples: 6281618440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 07:00:31,405][15401] Updated weights for policy 0, policy_version 383400 (0.0038) [2024-06-23 07:00:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6281707520. Throughput: 0: 42690.2. Samples: 6281882840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 07:00:34,827][15401] Updated weights for policy 0, policy_version 383410 (0.0033) [2024-06-23 07:00:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6281920512. Throughput: 0: 42665.3. Samples: 6282006600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 07:00:38,977][15401] Updated weights for policy 0, policy_version 383420 (0.0037) [2024-06-23 07:00:42,364][15401] Updated weights for policy 0, policy_version 383430 (0.0036) [2024-06-23 07:00:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6282149888. Throughput: 0: 42676.0. Samples: 6282265560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:43,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 07:00:46,559][15401] Updated weights for policy 0, policy_version 383440 (0.0028) [2024-06-23 07:00:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42599.6, 300 sec: 42820.5). Total num frames: 6282346496. Throughput: 0: 42908.3. Samples: 6282528480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 07:00:49,871][15401] Updated weights for policy 0, policy_version 383450 (0.0036) [2024-06-23 07:00:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6282575872. Throughput: 0: 42785.3. Samples: 6282650840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:53,392][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 07:00:54,024][15401] Updated weights for policy 0, policy_version 383460 (0.0038) [2024-06-23 07:00:57,507][15401] Updated weights for policy 0, policy_version 383470 (0.0032) [2024-06-23 07:00:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6282788864. Throughput: 0: 42637.0. Samples: 6282905240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:00:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 07:01:01,784][15401] Updated weights for policy 0, policy_version 383480 (0.0038) [2024-06-23 07:01:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6282985472. Throughput: 0: 42826.2. Samples: 6283165360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 07:01:05,355][15401] Updated weights for policy 0, policy_version 383490 (0.0041) [2024-06-23 07:01:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6283214848. Throughput: 0: 42747.8. Samples: 6283287680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 07:01:09,329][15401] Updated weights for policy 0, policy_version 383500 (0.0028) [2024-06-23 07:01:13,017][15401] Updated weights for policy 0, policy_version 383510 (0.0042) [2024-06-23 07:01:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6283427840. Throughput: 0: 42892.0. Samples: 6283548580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 07:01:16,910][15401] Updated weights for policy 0, policy_version 383520 (0.0032) [2024-06-23 07:01:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6283624448. Throughput: 0: 42699.1. Samples: 6283804300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:18,395][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 07:01:20,834][15401] Updated weights for policy 0, policy_version 383530 (0.0030) [2024-06-23 07:01:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6283837440. Throughput: 0: 42586.2. Samples: 6283922980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 07:01:24,193][15349] Signal inference workers to stop experience collection... (93100 times) [2024-06-23 07:01:24,193][15349] Signal inference workers to resume experience collection... (93100 times) [2024-06-23 07:01:24,215][15401] InferenceWorker_p0-w0: stopping experience collection (93100 times) [2024-06-23 07:01:24,215][15401] InferenceWorker_p0-w0: resuming experience collection (93100 times) [2024-06-23 07:01:24,744][15401] Updated weights for policy 0, policy_version 383540 (0.0036) [2024-06-23 07:01:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6284066816. Throughput: 0: 42706.7. Samples: 6284187360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 07:01:28,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 07:01:28,525][15401] Updated weights for policy 0, policy_version 383550 (0.0032) [2024-06-23 07:01:32,446][15401] Updated weights for policy 0, policy_version 383560 (0.0041) [2024-06-23 07:01:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6284247040. Throughput: 0: 42555.8. Samples: 6284443480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 07:01:36,032][15401] Updated weights for policy 0, policy_version 383570 (0.0024) [2024-06-23 07:01:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6284492800. Throughput: 0: 42547.7. Samples: 6284565480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 07:01:40,083][15401] Updated weights for policy 0, policy_version 383580 (0.0030) [2024-06-23 07:01:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6284705792. Throughput: 0: 42675.6. Samples: 6284825640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 07:01:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383589_6284722176.pth... [2024-06-23 07:01:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000382964_6274482176.pth [2024-06-23 07:01:43,853][15401] Updated weights for policy 0, policy_version 383590 (0.0027) [2024-06-23 07:01:47,698][15401] Updated weights for policy 0, policy_version 383600 (0.0039) [2024-06-23 07:01:48,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6284902400. Throughput: 0: 42553.4. Samples: 6285080260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 07:01:51,543][15401] Updated weights for policy 0, policy_version 383610 (0.0030) [2024-06-23 07:01:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6285131776. Throughput: 0: 42598.5. Samples: 6285204620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 07:01:55,515][15401] Updated weights for policy 0, policy_version 383620 (0.0029) [2024-06-23 07:01:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6285361152. Throughput: 0: 42674.6. Samples: 6285468940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:01:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 07:01:59,176][15401] Updated weights for policy 0, policy_version 383630 (0.0049) [2024-06-23 07:02:03,140][15401] Updated weights for policy 0, policy_version 383640 (0.0029) [2024-06-23 07:02:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6285557760. Throughput: 0: 42618.8. Samples: 6285722140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 07:02:06,795][15401] Updated weights for policy 0, policy_version 383650 (0.0024) [2024-06-23 07:02:08,394][15132] Fps is (10 sec: 42580.6, 60 sec: 42868.3, 300 sec: 42708.9). Total num frames: 6285787136. Throughput: 0: 42810.6. Samples: 6285849640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:08,394][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 07:02:10,593][15401] Updated weights for policy 0, policy_version 383660 (0.0037) [2024-06-23 07:02:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6286000128. Throughput: 0: 42848.5. Samples: 6286115540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 07:02:14,356][15401] Updated weights for policy 0, policy_version 383670 (0.0029) [2024-06-23 07:02:18,190][15401] Updated weights for policy 0, policy_version 383680 (0.0034) [2024-06-23 07:02:18,396][15132] Fps is (10 sec: 42589.4, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 6286213120. Throughput: 0: 42732.5. Samples: 6286366720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:18,396][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 07:02:22,047][15401] Updated weights for policy 0, policy_version 383690 (0.0026) [2024-06-23 07:02:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6286442496. Throughput: 0: 42893.2. Samples: 6286495680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:23,391][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 07:02:26,427][15401] Updated weights for policy 0, policy_version 383700 (0.0033) [2024-06-23 07:02:28,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6286639104. Throughput: 0: 42961.2. Samples: 6286758900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 07:02:29,911][15401] Updated weights for policy 0, policy_version 383710 (0.0026) [2024-06-23 07:02:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6286835712. Throughput: 0: 42901.8. Samples: 6287010840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 07:02:34,009][15401] Updated weights for policy 0, policy_version 383720 (0.0041) [2024-06-23 07:02:34,034][15349] Signal inference workers to stop experience collection... (93150 times) [2024-06-23 07:02:34,035][15349] Signal inference workers to resume experience collection... (93150 times) [2024-06-23 07:02:34,063][15401] InferenceWorker_p0-w0: stopping experience collection (93150 times) [2024-06-23 07:02:34,063][15401] InferenceWorker_p0-w0: resuming experience collection (93150 times) [2024-06-23 07:02:37,561][15401] Updated weights for policy 0, policy_version 383730 (0.0039) [2024-06-23 07:02:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6287065088. Throughput: 0: 42930.8. Samples: 6287136500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 07:02:41,566][15401] Updated weights for policy 0, policy_version 383740 (0.0023) [2024-06-23 07:02:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6287278080. Throughput: 0: 42721.7. Samples: 6287391420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 07:02:45,082][15401] Updated weights for policy 0, policy_version 383750 (0.0028) [2024-06-23 07:02:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6287474688. Throughput: 0: 42874.9. Samples: 6287651520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 07:02:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 07:02:49,173][15401] Updated weights for policy 0, policy_version 383760 (0.0037) [2024-06-23 07:02:52,779][15401] Updated weights for policy 0, policy_version 383770 (0.0035) [2024-06-23 07:02:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6287704064. Throughput: 0: 42743.2. Samples: 6287772900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:02:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 07:02:56,994][15401] Updated weights for policy 0, policy_version 383780 (0.0039) [2024-06-23 07:02:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6287900672. Throughput: 0: 42545.8. Samples: 6288030100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:02:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 07:03:00,409][15401] Updated weights for policy 0, policy_version 383790 (0.0047) [2024-06-23 07:03:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6288113664. Throughput: 0: 42549.2. Samples: 6288281160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 07:03:04,784][15401] Updated weights for policy 0, policy_version 383800 (0.0041) [2024-06-23 07:03:08,031][15401] Updated weights for policy 0, policy_version 383810 (0.0026) [2024-06-23 07:03:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42601.4, 300 sec: 42709.5). Total num frames: 6288343040. Throughput: 0: 42574.7. Samples: 6288411540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:08,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 07:03:12,398][15401] Updated weights for policy 0, policy_version 383820 (0.0032) [2024-06-23 07:03:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6288539648. Throughput: 0: 42458.7. Samples: 6288669540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 07:03:15,629][15401] Updated weights for policy 0, policy_version 383830 (0.0033) [2024-06-23 07:03:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42603.0, 300 sec: 42654.0). Total num frames: 6288769024. Throughput: 0: 42519.6. Samples: 6288924220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 07:03:20,105][15401] Updated weights for policy 0, policy_version 383840 (0.0032) [2024-06-23 07:03:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6288965632. Throughput: 0: 42575.9. Samples: 6289052420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 07:03:23,875][15401] Updated weights for policy 0, policy_version 383850 (0.0038) [2024-06-23 07:03:27,843][15401] Updated weights for policy 0, policy_version 383860 (0.0036) [2024-06-23 07:03:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6289178624. Throughput: 0: 42550.4. Samples: 6289306180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 07:03:31,503][15401] Updated weights for policy 0, policy_version 383870 (0.0024) [2024-06-23 07:03:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6289424384. Throughput: 0: 42397.3. Samples: 6289559400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 07:03:35,481][15401] Updated weights for policy 0, policy_version 383880 (0.0051) [2024-06-23 07:03:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6289604608. Throughput: 0: 42528.9. Samples: 6289686700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 07:03:39,363][15401] Updated weights for policy 0, policy_version 383890 (0.0038) [2024-06-23 07:03:43,355][15401] Updated weights for policy 0, policy_version 383900 (0.0029) [2024-06-23 07:03:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6289817600. Throughput: 0: 42406.6. Samples: 6289938400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 07:03:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383900_6289817600.pth... [2024-06-23 07:03:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383275_6279577600.pth [2024-06-23 07:03:47,055][15401] Updated weights for policy 0, policy_version 383910 (0.0034) [2024-06-23 07:03:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6290030592. Throughput: 0: 42449.0. Samples: 6290191360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 07:03:50,881][15401] Updated weights for policy 0, policy_version 383920 (0.0040) [2024-06-23 07:03:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6290243584. Throughput: 0: 42349.0. Samples: 6290317240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 07:03:54,678][15401] Updated weights for policy 0, policy_version 383930 (0.0036) [2024-06-23 07:03:58,385][15401] Updated weights for policy 0, policy_version 383940 (0.0039) [2024-06-23 07:03:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6290472960. Throughput: 0: 42279.2. Samples: 6290572100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:03:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 07:04:00,204][15349] Signal inference workers to stop experience collection... (93200 times) [2024-06-23 07:04:00,253][15349] Signal inference workers to resume experience collection... (93200 times) [2024-06-23 07:04:00,254][15401] InferenceWorker_p0-w0: stopping experience collection (93200 times) [2024-06-23 07:04:00,266][15401] InferenceWorker_p0-w0: resuming experience collection (93200 times) [2024-06-23 07:04:02,656][15401] Updated weights for policy 0, policy_version 383950 (0.0039) [2024-06-23 07:04:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6290685952. Throughput: 0: 42286.1. Samples: 6290827100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:04:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 07:04:06,411][15401] Updated weights for policy 0, policy_version 383960 (0.0033) [2024-06-23 07:04:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 6290866176. Throughput: 0: 42277.9. Samples: 6290954920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:04:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 07:04:10,346][15401] Updated weights for policy 0, policy_version 383970 (0.0034) [2024-06-23 07:04:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6291111936. Throughput: 0: 42392.8. Samples: 6291213860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 07:04:14,182][15401] Updated weights for policy 0, policy_version 383980 (0.0034) [2024-06-23 07:04:17,918][15401] Updated weights for policy 0, policy_version 383990 (0.0036) [2024-06-23 07:04:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6291308544. Throughput: 0: 42526.0. Samples: 6291473060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 07:04:21,967][15401] Updated weights for policy 0, policy_version 384000 (0.0047) [2024-06-23 07:04:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 6291521536. Throughput: 0: 42557.7. Samples: 6291601900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:23,393][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 07:04:25,379][15401] Updated weights for policy 0, policy_version 384010 (0.0029) [2024-06-23 07:04:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6291750912. Throughput: 0: 42654.1. Samples: 6291857840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:28,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 07:04:29,579][15401] Updated weights for policy 0, policy_version 384020 (0.0035) [2024-06-23 07:04:32,890][15401] Updated weights for policy 0, policy_version 384030 (0.0038) [2024-06-23 07:04:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 6291947520. Throughput: 0: 42731.9. Samples: 6292114300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 07:04:37,010][15401] Updated weights for policy 0, policy_version 384040 (0.0035) [2024-06-23 07:04:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6292160512. Throughput: 0: 42775.4. Samples: 6292242140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 07:04:40,946][15401] Updated weights for policy 0, policy_version 384050 (0.0036) [2024-06-23 07:04:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 6292373504. Throughput: 0: 42804.0. Samples: 6292498280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 07:04:44,577][15401] Updated weights for policy 0, policy_version 384060 (0.0032) [2024-06-23 07:04:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6292586496. Throughput: 0: 42875.2. Samples: 6292756480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:48,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 07:04:48,542][15401] Updated weights for policy 0, policy_version 384070 (0.0031) [2024-06-23 07:04:52,746][15401] Updated weights for policy 0, policy_version 384080 (0.0031) [2024-06-23 07:04:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6292783104. Throughput: 0: 42806.7. Samples: 6292881220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 07:04:56,061][15401] Updated weights for policy 0, policy_version 384090 (0.0037) [2024-06-23 07:04:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6293028864. Throughput: 0: 42718.3. Samples: 6293136180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:04:58,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 07:05:00,256][15401] Updated weights for policy 0, policy_version 384100 (0.0023) [2024-06-23 07:05:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6293241856. Throughput: 0: 42799.1. Samples: 6293399020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 07:05:03,463][15401] Updated weights for policy 0, policy_version 384110 (0.0034) [2024-06-23 07:05:07,927][15401] Updated weights for policy 0, policy_version 384120 (0.0024) [2024-06-23 07:05:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 6293438464. Throughput: 0: 42907.1. Samples: 6293532620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:08,390][15132] Avg episode reward: [(0, '0.143')] [2024-06-23 07:05:10,978][15401] Updated weights for policy 0, policy_version 384130 (0.0049) [2024-06-23 07:05:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6293667840. Throughput: 0: 42794.4. Samples: 6293783580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:13,390][15132] Avg episode reward: [(0, '0.116')] [2024-06-23 07:05:15,501][15401] Updated weights for policy 0, policy_version 384140 (0.0038) [2024-06-23 07:05:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6293897216. Throughput: 0: 42803.1. Samples: 6294040440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 07:05:18,619][15401] Updated weights for policy 0, policy_version 384150 (0.0040) [2024-06-23 07:05:23,110][15401] Updated weights for policy 0, policy_version 384160 (0.0027) [2024-06-23 07:05:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 6294077440. Throughput: 0: 42905.1. Samples: 6294172860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 07:05:26,083][15401] Updated weights for policy 0, policy_version 384170 (0.0036) [2024-06-23 07:05:26,955][15349] Signal inference workers to stop experience collection... (93250 times) [2024-06-23 07:05:26,955][15349] Signal inference workers to resume experience collection... (93250 times) [2024-06-23 07:05:26,972][15401] InferenceWorker_p0-w0: stopping experience collection (93250 times) [2024-06-23 07:05:26,976][15401] InferenceWorker_p0-w0: resuming experience collection (93250 times) [2024-06-23 07:05:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6294323200. Throughput: 0: 42794.6. Samples: 6294424140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 07:05:28,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 07:05:30,920][15401] Updated weights for policy 0, policy_version 384180 (0.0026) [2024-06-23 07:05:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6294536192. Throughput: 0: 42832.4. Samples: 6294683940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:33,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 07:05:33,649][15401] Updated weights for policy 0, policy_version 384190 (0.0038) [2024-06-23 07:05:38,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6294700032. Throughput: 0: 42826.6. Samples: 6294808420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 07:05:38,698][15401] Updated weights for policy 0, policy_version 384200 (0.0034) [2024-06-23 07:05:41,464][15401] Updated weights for policy 0, policy_version 384210 (0.0037) [2024-06-23 07:05:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6294962176. Throughput: 0: 42797.4. Samples: 6295062060. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:43,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 07:05:43,496][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384215_6294978560.pth... [2024-06-23 07:05:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383589_6284722176.pth [2024-06-23 07:05:46,655][15401] Updated weights for policy 0, policy_version 384220 (0.0030) [2024-06-23 07:05:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6295158784. Throughput: 0: 42617.2. Samples: 6295316800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 07:05:49,624][15401] Updated weights for policy 0, policy_version 384230 (0.0035) [2024-06-23 07:05:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6295339008. Throughput: 0: 42485.1. Samples: 6295444440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 07:05:54,079][15401] Updated weights for policy 0, policy_version 384240 (0.0043) [2024-06-23 07:05:57,193][15401] Updated weights for policy 0, policy_version 384250 (0.0035) [2024-06-23 07:05:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6295568384. Throughput: 0: 42575.0. Samples: 6295699460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:05:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 07:06:01,583][15401] Updated weights for policy 0, policy_version 384260 (0.0029) [2024-06-23 07:06:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6295797760. Throughput: 0: 42609.9. Samples: 6295957880. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 07:06:04,658][15401] Updated weights for policy 0, policy_version 384270 (0.0045) [2024-06-23 07:06:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 6295994368. Throughput: 0: 42512.9. Samples: 6296085940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 07:06:09,105][15401] Updated weights for policy 0, policy_version 384280 (0.0035) [2024-06-23 07:06:12,775][15401] Updated weights for policy 0, policy_version 384290 (0.0028) [2024-06-23 07:06:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6296223744. Throughput: 0: 42527.6. Samples: 6296337780. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:13,392][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 07:06:17,001][15401] Updated weights for policy 0, policy_version 384300 (0.0041) [2024-06-23 07:06:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 6296403968. Throughput: 0: 42499.7. Samples: 6296596420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 07:06:20,343][15401] Updated weights for policy 0, policy_version 384310 (0.0030) [2024-06-23 07:06:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6296633344. Throughput: 0: 42488.8. Samples: 6296720420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 07:06:24,592][15401] Updated weights for policy 0, policy_version 384320 (0.0036) [2024-06-23 07:06:28,196][15401] Updated weights for policy 0, policy_version 384330 (0.0041) [2024-06-23 07:06:28,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 6296862720. Throughput: 0: 42555.3. Samples: 6296977060. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:28,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 07:06:32,166][15401] Updated weights for policy 0, policy_version 384340 (0.0036) [2024-06-23 07:06:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 6297042944. Throughput: 0: 42595.7. Samples: 6297233600. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 07:06:35,732][15401] Updated weights for policy 0, policy_version 384350 (0.0040) [2024-06-23 07:06:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6297288704. Throughput: 0: 42479.0. Samples: 6297356000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 07:06:40,429][15401] Updated weights for policy 0, policy_version 384360 (0.0037) [2024-06-23 07:06:43,371][15401] Updated weights for policy 0, policy_version 384370 (0.0038) [2024-06-23 07:06:43,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6297518080. Throughput: 0: 42616.4. Samples: 6297617200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 07:06:47,961][15401] Updated weights for policy 0, policy_version 384380 (0.0043) [2024-06-23 07:06:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6297698304. Throughput: 0: 42551.9. Samples: 6297872720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 07:06:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 07:06:50,867][15401] Updated weights for policy 0, policy_version 384390 (0.0039) [2024-06-23 07:06:53,394][15132] Fps is (10 sec: 40942.9, 60 sec: 43141.4, 300 sec: 42597.8). Total num frames: 6297927680. Throughput: 0: 42400.3. Samples: 6297994140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:06:53,394][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 07:06:55,617][15401] Updated weights for policy 0, policy_version 384400 (0.0029) [2024-06-23 07:06:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6298140672. Throughput: 0: 42641.9. Samples: 6298256660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:06:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 07:06:58,566][15401] Updated weights for policy 0, policy_version 384410 (0.0030) [2024-06-23 07:07:03,390][15132] Fps is (10 sec: 39337.7, 60 sec: 42052.1, 300 sec: 42487.9). Total num frames: 6298320896. Throughput: 0: 42517.5. Samples: 6298509720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 07:07:03,542][15401] Updated weights for policy 0, policy_version 384420 (0.0038) [2024-06-23 07:07:06,370][15401] Updated weights for policy 0, policy_version 384430 (0.0038) [2024-06-23 07:07:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6298566656. Throughput: 0: 42515.7. Samples: 6298633620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 07:07:11,227][15401] Updated weights for policy 0, policy_version 384440 (0.0029) [2024-06-23 07:07:11,762][15349] Signal inference workers to stop experience collection... (93300 times) [2024-06-23 07:07:11,763][15349] Signal inference workers to resume experience collection... (93300 times) [2024-06-23 07:07:11,803][15401] InferenceWorker_p0-w0: stopping experience collection (93300 times) [2024-06-23 07:07:11,803][15401] InferenceWorker_p0-w0: resuming experience collection (93300 times) [2024-06-23 07:07:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 6298763264. Throughput: 0: 42617.9. Samples: 6298894860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:07:14,297][15401] Updated weights for policy 0, policy_version 384450 (0.0038) [2024-06-23 07:07:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 6298959872. Throughput: 0: 42607.0. Samples: 6299150920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 07:07:18,932][15401] Updated weights for policy 0, policy_version 384460 (0.0032) [2024-06-23 07:07:21,906][15401] Updated weights for policy 0, policy_version 384470 (0.0034) [2024-06-23 07:07:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6299222016. Throughput: 0: 42649.1. Samples: 6299275220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:23,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 07:07:26,561][15401] Updated weights for policy 0, policy_version 384480 (0.0028) [2024-06-23 07:07:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6299418624. Throughput: 0: 42560.4. Samples: 6299532420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 07:07:29,610][15401] Updated weights for policy 0, policy_version 384490 (0.0027) [2024-06-23 07:07:33,390][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6299615232. Throughput: 0: 42751.5. Samples: 6299796540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:33,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 07:07:34,059][15401] Updated weights for policy 0, policy_version 384500 (0.0034) [2024-06-23 07:07:37,308][15401] Updated weights for policy 0, policy_version 384510 (0.0037) [2024-06-23 07:07:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6299828224. Throughput: 0: 42836.6. Samples: 6299921600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 07:07:41,655][15401] Updated weights for policy 0, policy_version 384520 (0.0035) [2024-06-23 07:07:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6300057600. Throughput: 0: 42710.9. Samples: 6300178660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 07:07:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384525_6300057600.pth... [2024-06-23 07:07:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000383900_6289817600.pth [2024-06-23 07:07:45,226][15401] Updated weights for policy 0, policy_version 384530 (0.0033) [2024-06-23 07:07:48,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42871.2, 300 sec: 42598.3). Total num frames: 6300270592. Throughput: 0: 42688.2. Samples: 6300430700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:48,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 07:07:49,229][15401] Updated weights for policy 0, policy_version 384540 (0.0037) [2024-06-23 07:07:52,866][15401] Updated weights for policy 0, policy_version 384550 (0.0038) [2024-06-23 07:07:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42601.4, 300 sec: 42653.9). Total num frames: 6300483584. Throughput: 0: 42896.3. Samples: 6300563960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 07:07:56,822][15401] Updated weights for policy 0, policy_version 384560 (0.0043) [2024-06-23 07:07:58,390][15132] Fps is (10 sec: 42599.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6300696576. Throughput: 0: 42866.2. Samples: 6300823840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:07:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 07:08:00,328][15401] Updated weights for policy 0, policy_version 384570 (0.0032) [2024-06-23 07:08:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 6300909568. Throughput: 0: 42745.8. Samples: 6301074480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:08:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 07:08:04,373][15401] Updated weights for policy 0, policy_version 384580 (0.0039) [2024-06-23 07:08:07,806][15401] Updated weights for policy 0, policy_version 384590 (0.0043) [2024-06-23 07:08:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 6301138944. Throughput: 0: 42994.8. Samples: 6301210080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 07:08:08,393][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 07:08:12,269][15401] Updated weights for policy 0, policy_version 384600 (0.0038) [2024-06-23 07:08:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6301335552. Throughput: 0: 43066.7. Samples: 6301470420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 07:08:15,232][15401] Updated weights for policy 0, policy_version 384610 (0.0036) [2024-06-23 07:08:18,389][15132] Fps is (10 sec: 40970.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6301548544. Throughput: 0: 42836.1. Samples: 6301724160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 07:08:19,609][15401] Updated weights for policy 0, policy_version 384620 (0.0032) [2024-06-23 07:08:23,069][15401] Updated weights for policy 0, policy_version 384630 (0.0041) [2024-06-23 07:08:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 6301777920. Throughput: 0: 42982.6. Samples: 6301855820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 07:08:27,068][15401] Updated weights for policy 0, policy_version 384640 (0.0035) [2024-06-23 07:08:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 6301958144. Throughput: 0: 42881.0. Samples: 6302108300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 07:08:28,532][15349] Signal inference workers to stop experience collection... (93350 times) [2024-06-23 07:08:28,533][15349] Signal inference workers to resume experience collection... (93350 times) [2024-06-23 07:08:28,553][15401] InferenceWorker_p0-w0: stopping experience collection (93350 times) [2024-06-23 07:08:28,553][15401] InferenceWorker_p0-w0: resuming experience collection (93350 times) [2024-06-23 07:08:30,621][15401] Updated weights for policy 0, policy_version 384650 (0.0039) [2024-06-23 07:08:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6302203904. Throughput: 0: 42992.4. Samples: 6302365340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 07:08:34,573][15401] Updated weights for policy 0, policy_version 384660 (0.0045) [2024-06-23 07:08:38,063][15401] Updated weights for policy 0, policy_version 384670 (0.0035) [2024-06-23 07:08:38,394][15132] Fps is (10 sec: 47491.0, 60 sec: 43414.2, 300 sec: 42764.3). Total num frames: 6302433280. Throughput: 0: 43052.0. Samples: 6302501500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:38,395][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 07:08:42,491][15401] Updated weights for policy 0, policy_version 384680 (0.0032) [2024-06-23 07:08:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6302613504. Throughput: 0: 43022.2. Samples: 6302759840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:43,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 07:08:45,491][15401] Updated weights for policy 0, policy_version 384690 (0.0039) [2024-06-23 07:08:48,390][15132] Fps is (10 sec: 42618.3, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 6302859264. Throughput: 0: 43085.7. Samples: 6303013340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 07:08:50,095][15401] Updated weights for policy 0, policy_version 384700 (0.0047) [2024-06-23 07:08:53,246][15401] Updated weights for policy 0, policy_version 384710 (0.0031) [2024-06-23 07:08:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6303088640. Throughput: 0: 43063.7. Samples: 6303147840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 07:08:57,667][15401] Updated weights for policy 0, policy_version 384720 (0.0042) [2024-06-23 07:08:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6303268864. Throughput: 0: 42998.6. Samples: 6303405360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:08:58,396][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:09:00,984][15401] Updated weights for policy 0, policy_version 384730 (0.0038) [2024-06-23 07:09:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6303498240. Throughput: 0: 42885.1. Samples: 6303654000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 07:09:05,864][15401] Updated weights for policy 0, policy_version 384740 (0.0047) [2024-06-23 07:09:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6303711232. Throughput: 0: 42894.6. Samples: 6303786080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 07:09:08,647][15401] Updated weights for policy 0, policy_version 384750 (0.0031) [2024-06-23 07:09:13,289][15401] Updated weights for policy 0, policy_version 384760 (0.0030) [2024-06-23 07:09:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6303907840. Throughput: 0: 42936.3. Samples: 6304040440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 07:09:16,142][15401] Updated weights for policy 0, policy_version 384770 (0.0032) [2024-06-23 07:09:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.3, 300 sec: 42765.4). Total num frames: 6304137216. Throughput: 0: 43017.2. Samples: 6304301120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 07:09:21,075][15401] Updated weights for policy 0, policy_version 384780 (0.0026) [2024-06-23 07:09:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6304350208. Throughput: 0: 42868.5. Samples: 6304430380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 07:09:23,763][15401] Updated weights for policy 0, policy_version 384790 (0.0032) [2024-06-23 07:09:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6304530432. Throughput: 0: 42755.6. Samples: 6304683840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 07:09:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 07:09:28,778][15401] Updated weights for policy 0, policy_version 384800 (0.0034) [2024-06-23 07:09:31,446][15401] Updated weights for policy 0, policy_version 384810 (0.0038) [2024-06-23 07:09:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 6304776192. Throughput: 0: 42705.9. Samples: 6304935100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 07:09:36,314][15401] Updated weights for policy 0, policy_version 384820 (0.0037) [2024-06-23 07:09:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42328.7, 300 sec: 42709.5). Total num frames: 6304972800. Throughput: 0: 42700.5. Samples: 6305069360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 07:09:38,663][15349] Signal inference workers to stop experience collection... (93400 times) [2024-06-23 07:09:38,663][15349] Signal inference workers to resume experience collection... (93400 times) [2024-06-23 07:09:38,718][15401] InferenceWorker_p0-w0: stopping experience collection (93400 times) [2024-06-23 07:09:38,718][15401] InferenceWorker_p0-w0: resuming experience collection (93400 times) [2024-06-23 07:09:39,249][15401] Updated weights for policy 0, policy_version 384830 (0.0036) [2024-06-23 07:09:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6305185792. Throughput: 0: 42665.3. Samples: 6305325300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 07:09:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384838_6305185792.pth... [2024-06-23 07:09:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384215_6294978560.pth [2024-06-23 07:09:43,926][15401] Updated weights for policy 0, policy_version 384840 (0.0035) [2024-06-23 07:09:46,988][15401] Updated weights for policy 0, policy_version 384850 (0.0033) [2024-06-23 07:09:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6305415168. Throughput: 0: 42617.9. Samples: 6305571800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 07:09:51,859][15401] Updated weights for policy 0, policy_version 384860 (0.0031) [2024-06-23 07:09:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6305611776. Throughput: 0: 42581.4. Samples: 6305702240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 07:09:54,642][15401] Updated weights for policy 0, policy_version 384870 (0.0024) [2024-06-23 07:09:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6305824768. Throughput: 0: 42691.4. Samples: 6305961560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:09:58,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 07:09:59,286][15401] Updated weights for policy 0, policy_version 384880 (0.0026) [2024-06-23 07:10:02,336][15401] Updated weights for policy 0, policy_version 384890 (0.0041) [2024-06-23 07:10:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6306054144. Throughput: 0: 42570.4. Samples: 6306216780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 07:10:06,751][15401] Updated weights for policy 0, policy_version 384900 (0.0029) [2024-06-23 07:10:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6306250752. Throughput: 0: 42655.2. Samples: 6306349860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:08,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 07:10:10,174][15401] Updated weights for policy 0, policy_version 384910 (0.0033) [2024-06-23 07:10:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6306463744. Throughput: 0: 42604.8. Samples: 6306601060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:13,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 07:10:14,536][15401] Updated weights for policy 0, policy_version 384920 (0.0042) [2024-06-23 07:10:17,756][15401] Updated weights for policy 0, policy_version 384930 (0.0038) [2024-06-23 07:10:18,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 6306693120. Throughput: 0: 42741.6. Samples: 6306858580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:18,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 07:10:22,088][15401] Updated weights for policy 0, policy_version 384940 (0.0037) [2024-06-23 07:10:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 6306889728. Throughput: 0: 42788.0. Samples: 6306994820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 07:10:25,302][15401] Updated weights for policy 0, policy_version 384950 (0.0033) [2024-06-23 07:10:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6307119104. Throughput: 0: 42716.1. Samples: 6307247520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 07:10:29,675][15401] Updated weights for policy 0, policy_version 384960 (0.0035) [2024-06-23 07:10:33,178][15401] Updated weights for policy 0, policy_version 384970 (0.0027) [2024-06-23 07:10:33,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 6307348480. Throughput: 0: 42836.3. Samples: 6307499540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:33,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 07:10:37,197][15401] Updated weights for policy 0, policy_version 384980 (0.0023) [2024-06-23 07:10:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6307512320. Throughput: 0: 42875.9. Samples: 6307631660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 07:10:40,662][15401] Updated weights for policy 0, policy_version 384990 (0.0036) [2024-06-23 07:10:43,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6307758080. Throughput: 0: 42714.3. Samples: 6307883700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-23 07:10:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 07:10:45,013][15401] Updated weights for policy 0, policy_version 385000 (0.0031) [2024-06-23 07:10:48,293][15401] Updated weights for policy 0, policy_version 385010 (0.0029) [2024-06-23 07:10:48,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6308003840. Throughput: 0: 42715.3. Samples: 6308138980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:10:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:10:52,546][15401] Updated weights for policy 0, policy_version 385020 (0.0030) [2024-06-23 07:10:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6308167680. Throughput: 0: 42763.9. Samples: 6308274240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:10:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 07:10:54,886][15349] Signal inference workers to stop experience collection... (93450 times) [2024-06-23 07:10:54,887][15349] Signal inference workers to resume experience collection... (93450 times) [2024-06-23 07:10:54,919][15401] InferenceWorker_p0-w0: stopping experience collection (93450 times) [2024-06-23 07:10:54,919][15401] InferenceWorker_p0-w0: resuming experience collection (93450 times) [2024-06-23 07:10:56,057][15401] Updated weights for policy 0, policy_version 385030 (0.0037) [2024-06-23 07:10:58,390][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6308413440. Throughput: 0: 42818.7. Samples: 6308527900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:10:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 07:11:00,060][15401] Updated weights for policy 0, policy_version 385040 (0.0036) [2024-06-23 07:11:03,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6308642816. Throughput: 0: 42982.8. Samples: 6308792700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 07:11:03,466][15401] Updated weights for policy 0, policy_version 385050 (0.0028) [2024-06-23 07:11:07,705][15401] Updated weights for policy 0, policy_version 385060 (0.0045) [2024-06-23 07:11:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6308839424. Throughput: 0: 42957.7. Samples: 6308927920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 07:11:10,989][15401] Updated weights for policy 0, policy_version 385070 (0.0045) [2024-06-23 07:11:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 6309052416. Throughput: 0: 42939.1. Samples: 6309179880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:13,392][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 07:11:15,259][15401] Updated weights for policy 0, policy_version 385080 (0.0034) [2024-06-23 07:11:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43419.4, 300 sec: 42931.7). Total num frames: 6309298176. Throughput: 0: 43070.0. Samples: 6309437580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 07:11:18,438][15401] Updated weights for policy 0, policy_version 385090 (0.0030) [2024-06-23 07:11:22,962][15401] Updated weights for policy 0, policy_version 385100 (0.0036) [2024-06-23 07:11:23,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6309478400. Throughput: 0: 43070.7. Samples: 6309569840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 07:11:25,886][15401] Updated weights for policy 0, policy_version 385110 (0.0035) [2024-06-23 07:11:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6309707776. Throughput: 0: 43283.1. Samples: 6309831440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 07:11:30,495][15401] Updated weights for policy 0, policy_version 385120 (0.0038) [2024-06-23 07:11:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 6309937152. Throughput: 0: 43231.3. Samples: 6310084380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 07:11:33,678][15401] Updated weights for policy 0, policy_version 385130 (0.0027) [2024-06-23 07:11:38,132][15401] Updated weights for policy 0, policy_version 385140 (0.0034) [2024-06-23 07:11:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 6310133760. Throughput: 0: 43238.7. Samples: 6310219980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:38,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 07:11:41,314][15401] Updated weights for policy 0, policy_version 385150 (0.0026) [2024-06-23 07:11:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6310346752. Throughput: 0: 43128.4. Samples: 6310468680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 07:11:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385153_6310346752.pth... [2024-06-23 07:11:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384525_6300057600.pth [2024-06-23 07:11:46,027][15401] Updated weights for policy 0, policy_version 385160 (0.0045) [2024-06-23 07:11:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42821.2). Total num frames: 6310559744. Throughput: 0: 43147.8. Samples: 6310734360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 07:11:48,975][15401] Updated weights for policy 0, policy_version 385170 (0.0028) [2024-06-23 07:11:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6310772736. Throughput: 0: 43024.5. Samples: 6310864020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 07:11:53,420][15401] Updated weights for policy 0, policy_version 385180 (0.0033) [2024-06-23 07:11:56,487][15401] Updated weights for policy 0, policy_version 385190 (0.0031) [2024-06-23 07:11:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6311002112. Throughput: 0: 43036.0. Samples: 6311116400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:11:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 07:12:00,983][15401] Updated weights for policy 0, policy_version 385200 (0.0027) [2024-06-23 07:12:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6311198720. Throughput: 0: 43098.2. Samples: 6311377000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 07:12:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 07:12:04,509][15401] Updated weights for policy 0, policy_version 385210 (0.0032) [2024-06-23 07:12:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6311411712. Throughput: 0: 43043.2. Samples: 6311506780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:08,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 07:12:08,689][15401] Updated weights for policy 0, policy_version 385220 (0.0039) [2024-06-23 07:12:12,246][15401] Updated weights for policy 0, policy_version 385230 (0.0048) [2024-06-23 07:12:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 6311657472. Throughput: 0: 42838.7. Samples: 6311759180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 07:12:16,254][15401] Updated weights for policy 0, policy_version 385240 (0.0042) [2024-06-23 07:12:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 6311837696. Throughput: 0: 42928.9. Samples: 6312016180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 07:12:19,938][15401] Updated weights for policy 0, policy_version 385250 (0.0039) [2024-06-23 07:12:23,004][15349] Signal inference workers to stop experience collection... (93500 times) [2024-06-23 07:12:23,004][15349] Signal inference workers to resume experience collection... (93500 times) [2024-06-23 07:12:23,047][15401] InferenceWorker_p0-w0: stopping experience collection (93500 times) [2024-06-23 07:12:23,047][15401] InferenceWorker_p0-w0: resuming experience collection (93500 times) [2024-06-23 07:12:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6312067072. Throughput: 0: 42747.1. Samples: 6312143600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 07:12:23,893][15401] Updated weights for policy 0, policy_version 385260 (0.0034) [2024-06-23 07:12:27,628][15401] Updated weights for policy 0, policy_version 385270 (0.0034) [2024-06-23 07:12:28,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 6312312832. Throughput: 0: 43065.8. Samples: 6312406640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 07:12:31,472][15401] Updated weights for policy 0, policy_version 385280 (0.0039) [2024-06-23 07:12:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6312493056. Throughput: 0: 42841.4. Samples: 6312662220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 07:12:35,179][15401] Updated weights for policy 0, policy_version 385290 (0.0036) [2024-06-23 07:12:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6312722432. Throughput: 0: 42765.8. Samples: 6312788480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 07:12:39,140][15401] Updated weights for policy 0, policy_version 385300 (0.0047) [2024-06-23 07:12:43,114][15401] Updated weights for policy 0, policy_version 385310 (0.0038) [2024-06-23 07:12:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6312919040. Throughput: 0: 42913.3. Samples: 6313047500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 07:12:46,801][15401] Updated weights for policy 0, policy_version 385320 (0.0034) [2024-06-23 07:12:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 6313148416. Throughput: 0: 42851.6. Samples: 6313305320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 07:12:50,841][15401] Updated weights for policy 0, policy_version 385330 (0.0033) [2024-06-23 07:12:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6313377792. Throughput: 0: 42877.6. Samples: 6313436280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 07:12:54,533][15401] Updated weights for policy 0, policy_version 385340 (0.0032) [2024-06-23 07:12:58,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 6313541632. Throughput: 0: 42814.1. Samples: 6313685920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:12:58,393][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 07:12:58,863][15401] Updated weights for policy 0, policy_version 385350 (0.0037) [2024-06-23 07:13:02,384][15401] Updated weights for policy 0, policy_version 385360 (0.0038) [2024-06-23 07:13:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 6313787392. Throughput: 0: 42816.0. Samples: 6313942900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:13:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 07:13:06,468][15401] Updated weights for policy 0, policy_version 385370 (0.0034) [2024-06-23 07:13:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6313984000. Throughput: 0: 42937.4. Samples: 6314075780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:13:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 07:13:10,037][15401] Updated weights for policy 0, policy_version 385380 (0.0030) [2024-06-23 07:13:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6314196992. Throughput: 0: 42629.3. Samples: 6314324960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:13:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 07:13:13,971][15401] Updated weights for policy 0, policy_version 385390 (0.0028) [2024-06-23 07:13:17,672][15401] Updated weights for policy 0, policy_version 385400 (0.0040) [2024-06-23 07:13:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6314426368. Throughput: 0: 42714.8. Samples: 6314584380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:13:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 07:13:21,541][15401] Updated weights for policy 0, policy_version 385410 (0.0041) [2024-06-23 07:13:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 6314639360. Throughput: 0: 42785.3. Samples: 6314713920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 07:13:23,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 07:13:25,269][15401] Updated weights for policy 0, policy_version 385420 (0.0044) [2024-06-23 07:13:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 6314835968. Throughput: 0: 42660.6. Samples: 6314967220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 07:13:29,369][15401] Updated weights for policy 0, policy_version 385430 (0.0028) [2024-06-23 07:13:33,027][15401] Updated weights for policy 0, policy_version 385440 (0.0037) [2024-06-23 07:13:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 6315065344. Throughput: 0: 42669.7. Samples: 6315225460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 07:13:36,905][15401] Updated weights for policy 0, policy_version 385450 (0.0050) [2024-06-23 07:13:37,285][15349] Signal inference workers to stop experience collection... (93550 times) [2024-06-23 07:13:37,285][15349] Signal inference workers to resume experience collection... (93550 times) [2024-06-23 07:13:37,309][15401] InferenceWorker_p0-w0: stopping experience collection (93550 times) [2024-06-23 07:13:37,309][15401] InferenceWorker_p0-w0: resuming experience collection (93550 times) [2024-06-23 07:13:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6315278336. Throughput: 0: 42577.0. Samples: 6315352240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 07:13:40,575][15401] Updated weights for policy 0, policy_version 385460 (0.0028) [2024-06-23 07:13:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6315474944. Throughput: 0: 42711.3. Samples: 6315607820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:43,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 07:13:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385467_6315491328.pth... [2024-06-23 07:13:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000384838_6305185792.pth [2024-06-23 07:13:44,549][15401] Updated weights for policy 0, policy_version 385470 (0.0044) [2024-06-23 07:13:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6315704320. Throughput: 0: 42770.2. Samples: 6315867560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:48,390][15401] Updated weights for policy 0, policy_version 385480 (0.0029) [2024-06-23 07:13:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 07:13:52,150][15401] Updated weights for policy 0, policy_version 385490 (0.0026) [2024-06-23 07:13:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6315933696. Throughput: 0: 42618.7. Samples: 6315993620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 07:13:56,040][15401] Updated weights for policy 0, policy_version 385500 (0.0030) [2024-06-23 07:13:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6316113920. Throughput: 0: 42666.3. Samples: 6316244940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:13:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 07:13:59,754][15401] Updated weights for policy 0, policy_version 385510 (0.0038) [2024-06-23 07:14:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6316326912. Throughput: 0: 42814.9. Samples: 6316511060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 07:14:03,626][15401] Updated weights for policy 0, policy_version 385520 (0.0039) [2024-06-23 07:14:07,212][15401] Updated weights for policy 0, policy_version 385530 (0.0035) [2024-06-23 07:14:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6316572672. Throughput: 0: 42656.5. Samples: 6316633360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 07:14:11,330][15401] Updated weights for policy 0, policy_version 385540 (0.0026) [2024-06-23 07:14:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6316769280. Throughput: 0: 42696.7. Samples: 6316888580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 07:14:14,713][15401] Updated weights for policy 0, policy_version 385550 (0.0030) [2024-06-23 07:14:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 6316982272. Throughput: 0: 42827.5. Samples: 6317152700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:14:18,844][15401] Updated weights for policy 0, policy_version 385560 (0.0039) [2024-06-23 07:14:22,370][15401] Updated weights for policy 0, policy_version 385570 (0.0025) [2024-06-23 07:14:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 6317211648. Throughput: 0: 42771.9. Samples: 6317276980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 07:14:26,634][15401] Updated weights for policy 0, policy_version 385580 (0.0034) [2024-06-23 07:14:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6317424640. Throughput: 0: 42989.7. Samples: 6317542360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 07:14:29,888][15401] Updated weights for policy 0, policy_version 385590 (0.0040) [2024-06-23 07:14:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6317621248. Throughput: 0: 42786.3. Samples: 6317792940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 07:14:34,257][15401] Updated weights for policy 0, policy_version 385600 (0.0043) [2024-06-23 07:14:37,541][15401] Updated weights for policy 0, policy_version 385610 (0.0028) [2024-06-23 07:14:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6317850624. Throughput: 0: 42841.8. Samples: 6317921500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 07:14:41,941][15401] Updated weights for policy 0, policy_version 385620 (0.0028) [2024-06-23 07:14:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6318030848. Throughput: 0: 42932.8. Samples: 6318176920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 07:14:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 07:14:45,271][15401] Updated weights for policy 0, policy_version 385630 (0.0034) [2024-06-23 07:14:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6318276608. Throughput: 0: 42734.3. Samples: 6318434100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:14:48,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 07:14:49,612][15401] Updated weights for policy 0, policy_version 385640 (0.0030) [2024-06-23 07:14:52,746][15401] Updated weights for policy 0, policy_version 385650 (0.0031) [2024-06-23 07:14:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 6318489600. Throughput: 0: 43058.3. Samples: 6318570980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:14:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 07:14:57,276][15401] Updated weights for policy 0, policy_version 385660 (0.0039) [2024-06-23 07:14:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6318669824. Throughput: 0: 42981.3. Samples: 6318822740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:14:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 07:15:00,554][15401] Updated weights for policy 0, policy_version 385670 (0.0031) [2024-06-23 07:15:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 6318915584. Throughput: 0: 42726.2. Samples: 6319075480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:03,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:15:04,842][15401] Updated weights for policy 0, policy_version 385680 (0.0033) [2024-06-23 07:15:08,279][15401] Updated weights for policy 0, policy_version 385690 (0.0022) [2024-06-23 07:15:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 6319144960. Throughput: 0: 43005.9. Samples: 6319212240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 07:15:10,268][15349] Signal inference workers to stop experience collection... (93600 times) [2024-06-23 07:15:10,269][15349] Signal inference workers to resume experience collection... (93600 times) [2024-06-23 07:15:10,278][15401] InferenceWorker_p0-w0: stopping experience collection (93600 times) [2024-06-23 07:15:10,291][15401] InferenceWorker_p0-w0: resuming experience collection (93600 times) [2024-06-23 07:15:12,396][15401] Updated weights for policy 0, policy_version 385700 (0.0031) [2024-06-23 07:15:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 6319325184. Throughput: 0: 42803.5. Samples: 6319468520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 07:15:15,825][15401] Updated weights for policy 0, policy_version 385710 (0.0043) [2024-06-23 07:15:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 6319570944. Throughput: 0: 42689.5. Samples: 6319713980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 07:15:19,911][15401] Updated weights for policy 0, policy_version 385720 (0.0030) [2024-06-23 07:15:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6319767552. Throughput: 0: 42876.5. Samples: 6319850940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 07:15:23,891][15401] Updated weights for policy 0, policy_version 385730 (0.0031) [2024-06-23 07:15:27,442][15401] Updated weights for policy 0, policy_version 385740 (0.0037) [2024-06-23 07:15:28,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 6319964160. Throughput: 0: 42779.6. Samples: 6320102000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 07:15:31,438][15401] Updated weights for policy 0, policy_version 385750 (0.0043) [2024-06-23 07:15:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 6320209920. Throughput: 0: 42728.4. Samples: 6320356880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 07:15:34,950][15401] Updated weights for policy 0, policy_version 385760 (0.0041) [2024-06-23 07:15:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6320406528. Throughput: 0: 42764.5. Samples: 6320495380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 07:15:39,354][15401] Updated weights for policy 0, policy_version 385770 (0.0033) [2024-06-23 07:15:42,503][15401] Updated weights for policy 0, policy_version 385780 (0.0029) [2024-06-23 07:15:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6320619520. Throughput: 0: 42632.0. Samples: 6320741180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 07:15:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385780_6320619520.pth... [2024-06-23 07:15:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385153_6310346752.pth [2024-06-23 07:15:46,902][15401] Updated weights for policy 0, policy_version 385790 (0.0039) [2024-06-23 07:15:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 6320848896. Throughput: 0: 42784.1. Samples: 6321000660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 07:15:50,162][15401] Updated weights for policy 0, policy_version 385800 (0.0027) [2024-06-23 07:15:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6321045504. Throughput: 0: 42663.0. Samples: 6321132080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 07:15:54,815][15401] Updated weights for policy 0, policy_version 385810 (0.0023) [2024-06-23 07:15:57,808][15401] Updated weights for policy 0, policy_version 385820 (0.0030) [2024-06-23 07:15:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 6321274880. Throughput: 0: 42564.9. Samples: 6321383940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:15:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 07:16:02,474][15401] Updated weights for policy 0, policy_version 385830 (0.0033) [2024-06-23 07:16:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 6321487872. Throughput: 0: 42775.8. Samples: 6321638880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 07:16:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 07:16:06,062][15401] Updated weights for policy 0, policy_version 385840 (0.0028) [2024-06-23 07:16:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 6321684480. Throughput: 0: 42512.0. Samples: 6321763980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 07:16:10,351][15401] Updated weights for policy 0, policy_version 385850 (0.0030) [2024-06-23 07:16:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6321897472. Throughput: 0: 42644.1. Samples: 6322020980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 07:16:13,652][15401] Updated weights for policy 0, policy_version 385860 (0.0027) [2024-06-23 07:16:18,042][15401] Updated weights for policy 0, policy_version 385870 (0.0021) [2024-06-23 07:16:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 6322110464. Throughput: 0: 42690.7. Samples: 6322277960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 07:16:19,975][15349] Signal inference workers to stop experience collection... (93650 times) [2024-06-23 07:16:20,004][15401] InferenceWorker_p0-w0: stopping experience collection (93650 times) [2024-06-23 07:16:20,037][15349] Signal inference workers to resume experience collection... (93650 times) [2024-06-23 07:16:20,038][15401] InferenceWorker_p0-w0: resuming experience collection (93650 times) [2024-06-23 07:16:21,270][15401] Updated weights for policy 0, policy_version 385880 (0.0027) [2024-06-23 07:16:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6322307072. Throughput: 0: 42426.7. Samples: 6322404580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 07:16:25,690][15401] Updated weights for policy 0, policy_version 385890 (0.0040) [2024-06-23 07:16:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6322536448. Throughput: 0: 42552.9. Samples: 6322656060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 07:16:28,861][15401] Updated weights for policy 0, policy_version 385900 (0.0031) [2024-06-23 07:16:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6322733056. Throughput: 0: 42675.1. Samples: 6322921040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 07:16:33,524][15401] Updated weights for policy 0, policy_version 385910 (0.0028) [2024-06-23 07:16:36,466][15401] Updated weights for policy 0, policy_version 385920 (0.0039) [2024-06-23 07:16:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6322946048. Throughput: 0: 42455.7. Samples: 6323042580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 07:16:41,383][15401] Updated weights for policy 0, policy_version 385930 (0.0028) [2024-06-23 07:16:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6323191808. Throughput: 0: 42489.9. Samples: 6323295980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 07:16:44,158][15401] Updated weights for policy 0, policy_version 385940 (0.0037) [2024-06-23 07:16:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 6323355648. Throughput: 0: 42674.2. Samples: 6323559220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 07:16:49,142][15401] Updated weights for policy 0, policy_version 385950 (0.0027) [2024-06-23 07:16:51,866][15401] Updated weights for policy 0, policy_version 385960 (0.0022) [2024-06-23 07:16:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6323585024. Throughput: 0: 42565.8. Samples: 6323679440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:16:56,706][15401] Updated weights for policy 0, policy_version 385970 (0.0025) [2024-06-23 07:16:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6323830784. Throughput: 0: 42644.3. Samples: 6323939980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:16:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:16:59,638][15401] Updated weights for policy 0, policy_version 385980 (0.0029) [2024-06-23 07:17:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 6324011008. Throughput: 0: 42701.3. Samples: 6324199620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:17:03,392][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 07:17:04,305][15401] Updated weights for policy 0, policy_version 385990 (0.0035) [2024-06-23 07:17:07,718][15401] Updated weights for policy 0, policy_version 386000 (0.0033) [2024-06-23 07:17:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 6324240384. Throughput: 0: 42528.8. Samples: 6324318480. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:17:08,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 07:17:11,857][15401] Updated weights for policy 0, policy_version 386010 (0.0041) [2024-06-23 07:17:13,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6324469760. Throughput: 0: 42694.2. Samples: 6324577300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:17:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 07:17:15,419][15401] Updated weights for policy 0, policy_version 386020 (0.0031) [2024-06-23 07:17:18,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6324633600. Throughput: 0: 42606.7. Samples: 6324838340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:17:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 07:17:19,350][15401] Updated weights for policy 0, policy_version 386030 (0.0032) [2024-06-23 07:17:22,973][15401] Updated weights for policy 0, policy_version 386040 (0.0036) [2024-06-23 07:17:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6324879360. Throughput: 0: 42569.3. Samples: 6324958200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 19.0) [2024-06-23 07:17:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 07:17:27,384][15401] Updated weights for policy 0, policy_version 386050 (0.0043) [2024-06-23 07:17:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6325092352. Throughput: 0: 42768.2. Samples: 6325220560. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 07:17:28,583][15349] Signal inference workers to stop experience collection... (93700 times) [2024-06-23 07:17:28,583][15349] Signal inference workers to resume experience collection... (93700 times) [2024-06-23 07:17:28,631][15401] InferenceWorker_p0-w0: stopping experience collection (93700 times) [2024-06-23 07:17:28,631][15401] InferenceWorker_p0-w0: resuming experience collection (93700 times) [2024-06-23 07:17:30,597][15401] Updated weights for policy 0, policy_version 386060 (0.0032) [2024-06-23 07:17:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6325288960. Throughput: 0: 42643.9. Samples: 6325478200. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 07:17:35,047][15401] Updated weights for policy 0, policy_version 386070 (0.0033) [2024-06-23 07:17:38,104][15401] Updated weights for policy 0, policy_version 386080 (0.0044) [2024-06-23 07:17:38,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6325534720. Throughput: 0: 42864.0. Samples: 6325608320. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 07:17:42,494][15401] Updated weights for policy 0, policy_version 386090 (0.0039) [2024-06-23 07:17:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 6325731328. Throughput: 0: 42787.5. Samples: 6325865520. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:43,401][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 07:17:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386093_6325747712.pth... [2024-06-23 07:17:43,585][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385467_6315491328.pth [2024-06-23 07:17:46,267][15401] Updated weights for policy 0, policy_version 386100 (0.0036) [2024-06-23 07:17:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6325944320. Throughput: 0: 42738.7. Samples: 6326122760. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 07:17:49,970][15401] Updated weights for policy 0, policy_version 386110 (0.0030) [2024-06-23 07:17:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 6326157312. Throughput: 0: 43025.4. Samples: 6326254520. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 07:17:53,799][15401] Updated weights for policy 0, policy_version 386120 (0.0033) [2024-06-23 07:17:57,471][15401] Updated weights for policy 0, policy_version 386130 (0.0036) [2024-06-23 07:17:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6326386688. Throughput: 0: 43043.2. Samples: 6326514240. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:17:58,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 07:18:01,407][15401] Updated weights for policy 0, policy_version 386140 (0.0030) [2024-06-23 07:18:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 6326599680. Throughput: 0: 42940.9. Samples: 6326770680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:03,392][15132] Avg episode reward: [(0, '0.875')] [2024-06-23 07:18:04,970][15401] Updated weights for policy 0, policy_version 386150 (0.0034) [2024-06-23 07:18:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6326812672. Throughput: 0: 43082.2. Samples: 6326896900. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:08,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-23 07:18:08,993][15401] Updated weights for policy 0, policy_version 386160 (0.0038) [2024-06-23 07:18:12,453][15401] Updated weights for policy 0, policy_version 386170 (0.0023) [2024-06-23 07:18:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6327025664. Throughput: 0: 42832.2. Samples: 6327148000. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:13,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 07:18:16,788][15401] Updated weights for policy 0, policy_version 386180 (0.0033) [2024-06-23 07:18:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 6327238656. Throughput: 0: 42983.2. Samples: 6327412440. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 07:18:19,871][15401] Updated weights for policy 0, policy_version 386190 (0.0047) [2024-06-23 07:18:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 6327468032. Throughput: 0: 42861.9. Samples: 6327537120. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:23,391][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 07:18:25,089][15401] Updated weights for policy 0, policy_version 386200 (0.0033) [2024-06-23 07:18:27,716][15401] Updated weights for policy 0, policy_version 386210 (0.0036) [2024-06-23 07:18:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6327681024. Throughput: 0: 42784.9. Samples: 6327790740. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 07:18:30,536][15349] Signal inference workers to stop experience collection... (93750 times) [2024-06-23 07:18:30,536][15349] Signal inference workers to resume experience collection... (93750 times) [2024-06-23 07:18:30,550][15401] InferenceWorker_p0-w0: stopping experience collection (93750 times) [2024-06-23 07:18:30,551][15401] InferenceWorker_p0-w0: resuming experience collection (93750 times) [2024-06-23 07:18:32,626][15401] Updated weights for policy 0, policy_version 386220 (0.0026) [2024-06-23 07:18:33,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6327861248. Throughput: 0: 42864.9. Samples: 6328051680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 07:18:35,531][15401] Updated weights for policy 0, policy_version 386230 (0.0037) [2024-06-23 07:18:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6328123392. Throughput: 0: 42566.5. Samples: 6328170020. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 07:18:40,160][15401] Updated weights for policy 0, policy_version 386240 (0.0046) [2024-06-23 07:18:43,056][15401] Updated weights for policy 0, policy_version 386250 (0.0029) [2024-06-23 07:18:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43419.5, 300 sec: 42820.6). Total num frames: 6328336384. Throughput: 0: 42605.4. Samples: 6328431480. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-23 07:18:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 07:18:47,745][15401] Updated weights for policy 0, policy_version 386260 (0.0036) [2024-06-23 07:18:48,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6328500224. Throughput: 0: 42682.7. Samples: 6328691400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:18:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 07:18:51,020][15401] Updated weights for policy 0, policy_version 386270 (0.0032) [2024-06-23 07:18:53,390][15132] Fps is (10 sec: 42596.8, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 6328762368. Throughput: 0: 42527.3. Samples: 6328810640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:18:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 07:18:55,333][15401] Updated weights for policy 0, policy_version 386280 (0.0050) [2024-06-23 07:18:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6328958976. Throughput: 0: 42756.4. Samples: 6329072040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:18:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 07:18:58,986][15401] Updated weights for policy 0, policy_version 386290 (0.0041) [2024-06-23 07:19:02,896][15401] Updated weights for policy 0, policy_version 386300 (0.0034) [2024-06-23 07:19:03,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6329155584. Throughput: 0: 42655.6. Samples: 6329331940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 07:19:06,580][15401] Updated weights for policy 0, policy_version 386310 (0.0043) [2024-06-23 07:19:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6329401344. Throughput: 0: 42666.0. Samples: 6329457080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 07:19:10,518][15401] Updated weights for policy 0, policy_version 386320 (0.0029) [2024-06-23 07:19:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6329581568. Throughput: 0: 42682.8. Samples: 6329711460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 07:19:14,337][15401] Updated weights for policy 0, policy_version 386330 (0.0029) [2024-06-23 07:19:18,188][15401] Updated weights for policy 0, policy_version 386340 (0.0030) [2024-06-23 07:19:18,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 6329794560. Throughput: 0: 42626.1. Samples: 6329969960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:18,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 07:19:21,944][15401] Updated weights for policy 0, policy_version 386350 (0.0044) [2024-06-23 07:19:23,397][15132] Fps is (10 sec: 44205.1, 60 sec: 42593.5, 300 sec: 42708.4). Total num frames: 6330023936. Throughput: 0: 42795.1. Samples: 6330096100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:23,397][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 07:19:26,140][15401] Updated weights for policy 0, policy_version 386360 (0.0035) [2024-06-23 07:19:28,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6330220544. Throughput: 0: 42655.0. Samples: 6330350960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 07:19:29,686][15401] Updated weights for policy 0, policy_version 386370 (0.0040) [2024-06-23 07:19:33,390][15132] Fps is (10 sec: 40988.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6330433536. Throughput: 0: 42462.1. Samples: 6330602200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 07:19:33,623][15401] Updated weights for policy 0, policy_version 386380 (0.0028) [2024-06-23 07:19:37,623][15401] Updated weights for policy 0, policy_version 386390 (0.0046) [2024-06-23 07:19:38,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42320.9, 300 sec: 42819.7). Total num frames: 6330662912. Throughput: 0: 42639.6. Samples: 6330729680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:38,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 07:19:41,110][15401] Updated weights for policy 0, policy_version 386400 (0.0043) [2024-06-23 07:19:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 6330859520. Throughput: 0: 42555.0. Samples: 6330987020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 07:19:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386405_6330859520.pth... [2024-06-23 07:19:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000385780_6320619520.pth [2024-06-23 07:19:45,106][15401] Updated weights for policy 0, policy_version 386410 (0.0032) [2024-06-23 07:19:48,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6331072512. Throughput: 0: 42423.9. Samples: 6331241020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 07:19:49,151][15401] Updated weights for policy 0, policy_version 386420 (0.0032) [2024-06-23 07:19:52,671][15401] Updated weights for policy 0, policy_version 386430 (0.0041) [2024-06-23 07:19:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 6331285504. Throughput: 0: 42576.4. Samples: 6331373020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 07:19:55,382][15349] Signal inference workers to stop experience collection... (93800 times) [2024-06-23 07:19:55,383][15349] Signal inference workers to resume experience collection... (93800 times) [2024-06-23 07:19:55,395][15401] InferenceWorker_p0-w0: stopping experience collection (93800 times) [2024-06-23 07:19:55,424][15401] InferenceWorker_p0-w0: resuming experience collection (93800 times) [2024-06-23 07:19:56,709][15401] Updated weights for policy 0, policy_version 386440 (0.0032) [2024-06-23 07:19:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6331514880. Throughput: 0: 42555.5. Samples: 6331626460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:19:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 07:20:00,681][15401] Updated weights for policy 0, policy_version 386450 (0.0033) [2024-06-23 07:20:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6331727872. Throughput: 0: 42349.4. Samples: 6331875580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:20:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 07:20:04,278][15401] Updated weights for policy 0, policy_version 386460 (0.0035) [2024-06-23 07:20:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 6331908096. Throughput: 0: 42402.8. Samples: 6332003920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 07:20:08,493][15401] Updated weights for policy 0, policy_version 386470 (0.0039) [2024-06-23 07:20:11,765][15401] Updated weights for policy 0, policy_version 386480 (0.0038) [2024-06-23 07:20:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6332153856. Throughput: 0: 42405.6. Samples: 6332259220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 07:20:16,062][15401] Updated weights for policy 0, policy_version 386490 (0.0033) [2024-06-23 07:20:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6332366848. Throughput: 0: 42606.8. Samples: 6332519500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 07:20:19,464][15401] Updated weights for policy 0, policy_version 386500 (0.0033) [2024-06-23 07:20:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42330.3, 300 sec: 42709.5). Total num frames: 6332563456. Throughput: 0: 42605.5. Samples: 6332646660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:23,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 07:20:23,669][15401] Updated weights for policy 0, policy_version 386510 (0.0051) [2024-06-23 07:20:27,016][15401] Updated weights for policy 0, policy_version 386520 (0.0029) [2024-06-23 07:20:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6332792832. Throughput: 0: 42429.4. Samples: 6332896340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:28,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 07:20:31,291][15401] Updated weights for policy 0, policy_version 386530 (0.0034) [2024-06-23 07:20:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6332989440. Throughput: 0: 42673.3. Samples: 6333161320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:33,398][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:20:34,669][15401] Updated weights for policy 0, policy_version 386540 (0.0045) [2024-06-23 07:20:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 6333202432. Throughput: 0: 42533.0. Samples: 6333287000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:20:38,916][15401] Updated weights for policy 0, policy_version 386550 (0.0038) [2024-06-23 07:20:42,218][15401] Updated weights for policy 0, policy_version 386560 (0.0034) [2024-06-23 07:20:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6333431808. Throughput: 0: 42580.3. Samples: 6333542580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 07:20:46,636][15401] Updated weights for policy 0, policy_version 386570 (0.0032) [2024-06-23 07:20:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6333628416. Throughput: 0: 42796.9. Samples: 6333801440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 07:20:50,214][15401] Updated weights for policy 0, policy_version 386580 (0.0041) [2024-06-23 07:20:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6333857792. Throughput: 0: 42633.2. Samples: 6333922420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 07:20:54,670][15401] Updated weights for policy 0, policy_version 386590 (0.0039) [2024-06-23 07:20:57,772][15401] Updated weights for policy 0, policy_version 386600 (0.0036) [2024-06-23 07:20:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6334087168. Throughput: 0: 42757.0. Samples: 6334183280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:20:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:21:02,185][15401] Updated weights for policy 0, policy_version 386610 (0.0036) [2024-06-23 07:21:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6334267392. Throughput: 0: 42710.6. Samples: 6334441480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:21:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 07:21:05,441][15401] Updated weights for policy 0, policy_version 386620 (0.0021) [2024-06-23 07:21:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6334480384. Throughput: 0: 42438.8. Samples: 6334556400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:21:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:21:09,686][15401] Updated weights for policy 0, policy_version 386630 (0.0022) [2024-06-23 07:21:13,065][15401] Updated weights for policy 0, policy_version 386640 (0.0038) [2024-06-23 07:21:13,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 6334726144. Throughput: 0: 42940.4. Samples: 6334828760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:21:13,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 07:21:17,211][15401] Updated weights for policy 0, policy_version 386650 (0.0033) [2024-06-23 07:21:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6334889984. Throughput: 0: 42829.0. Samples: 6335088620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 07:21:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 07:21:20,215][15349] Signal inference workers to stop experience collection... (93850 times) [2024-06-23 07:21:20,215][15349] Signal inference workers to resume experience collection... (93850 times) [2024-06-23 07:21:20,256][15401] InferenceWorker_p0-w0: stopping experience collection (93850 times) [2024-06-23 07:21:20,256][15401] InferenceWorker_p0-w0: resuming experience collection (93850 times) [2024-06-23 07:21:20,812][15401] Updated weights for policy 0, policy_version 386660 (0.0038) [2024-06-23 07:21:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6335135744. Throughput: 0: 42568.0. Samples: 6335202560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 07:21:25,396][15401] Updated weights for policy 0, policy_version 386670 (0.0044) [2024-06-23 07:21:28,338][15401] Updated weights for policy 0, policy_version 386680 (0.0026) [2024-06-23 07:21:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6335365120. Throughput: 0: 42709.0. Samples: 6335464480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 07:21:32,880][15401] Updated weights for policy 0, policy_version 386690 (0.0035) [2024-06-23 07:21:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6335528960. Throughput: 0: 42625.8. Samples: 6335719600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 07:21:35,874][15401] Updated weights for policy 0, policy_version 386700 (0.0026) [2024-06-23 07:21:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6335774720. Throughput: 0: 42596.3. Samples: 6335839260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 07:21:40,460][15401] Updated weights for policy 0, policy_version 386710 (0.0029) [2024-06-23 07:21:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6335954944. Throughput: 0: 42605.9. Samples: 6336100540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 07:21:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386717_6335971328.pth... [2024-06-23 07:21:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386093_6325747712.pth [2024-06-23 07:21:44,115][15401] Updated weights for policy 0, policy_version 386720 (0.0030) [2024-06-23 07:21:48,295][15401] Updated weights for policy 0, policy_version 386730 (0.0029) [2024-06-23 07:21:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6336184320. Throughput: 0: 42496.4. Samples: 6336353820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 07:21:51,658][15401] Updated weights for policy 0, policy_version 386740 (0.0035) [2024-06-23 07:21:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6336413696. Throughput: 0: 42859.5. Samples: 6336485080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 07:21:55,897][15401] Updated weights for policy 0, policy_version 386750 (0.0043) [2024-06-23 07:21:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 6336593920. Throughput: 0: 42455.2. Samples: 6336739140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:21:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 07:21:59,333][15401] Updated weights for policy 0, policy_version 386760 (0.0038) [2024-06-23 07:22:03,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.6, 300 sec: 42653.9). Total num frames: 6336823296. Throughput: 0: 42252.7. Samples: 6336990100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:03,393][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 07:22:03,458][15401] Updated weights for policy 0, policy_version 386770 (0.0049) [2024-06-23 07:22:07,094][15401] Updated weights for policy 0, policy_version 386780 (0.0029) [2024-06-23 07:22:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6337036288. Throughput: 0: 42656.8. Samples: 6337122120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 07:22:11,167][15401] Updated weights for policy 0, policy_version 386790 (0.0032) [2024-06-23 07:22:13,389][15132] Fps is (10 sec: 40970.5, 60 sec: 41780.9, 300 sec: 42709.5). Total num frames: 6337232896. Throughput: 0: 42531.2. Samples: 6337378380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 07:22:14,840][15401] Updated weights for policy 0, policy_version 386800 (0.0045) [2024-06-23 07:22:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6337462272. Throughput: 0: 42502.7. Samples: 6337632220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 07:22:18,701][15401] Updated weights for policy 0, policy_version 386810 (0.0044) [2024-06-23 07:22:22,806][15401] Updated weights for policy 0, policy_version 386820 (0.0026) [2024-06-23 07:22:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6337691648. Throughput: 0: 42782.8. Samples: 6337764480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 07:22:26,251][15401] Updated weights for policy 0, policy_version 386830 (0.0052) [2024-06-23 07:22:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6337888256. Throughput: 0: 42683.1. Samples: 6338021280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:28,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 07:22:30,345][15401] Updated weights for policy 0, policy_version 386840 (0.0042) [2024-06-23 07:22:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6338117632. Throughput: 0: 42619.5. Samples: 6338271700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:33,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 07:22:33,909][15401] Updated weights for policy 0, policy_version 386850 (0.0037) [2024-06-23 07:22:37,887][15401] Updated weights for policy 0, policy_version 386860 (0.0027) [2024-06-23 07:22:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6338330624. Throughput: 0: 42593.3. Samples: 6338401780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 07:22:38,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-23 07:22:41,887][15401] Updated weights for policy 0, policy_version 386870 (0.0030) [2024-06-23 07:22:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6338527232. Throughput: 0: 42638.2. Samples: 6338657860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:22:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 07:22:45,531][15401] Updated weights for policy 0, policy_version 386880 (0.0028) [2024-06-23 07:22:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6338740224. Throughput: 0: 42834.8. Samples: 6338917560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:22:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 07:22:49,865][15401] Updated weights for policy 0, policy_version 386890 (0.0036) [2024-06-23 07:22:50,488][15349] Signal inference workers to stop experience collection... (93900 times) [2024-06-23 07:22:50,488][15349] Signal inference workers to resume experience collection... (93900 times) [2024-06-23 07:22:50,536][15401] InferenceWorker_p0-w0: stopping experience collection (93900 times) [2024-06-23 07:22:50,536][15401] InferenceWorker_p0-w0: resuming experience collection (93900 times) [2024-06-23 07:22:53,191][15401] Updated weights for policy 0, policy_version 386900 (0.0034) [2024-06-23 07:22:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6338985984. Throughput: 0: 42640.6. Samples: 6339040940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:22:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 07:22:57,418][15401] Updated weights for policy 0, policy_version 386910 (0.0034) [2024-06-23 07:22:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 6339166208. Throughput: 0: 42919.8. Samples: 6339309880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:22:58,393][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 07:23:00,614][15401] Updated weights for policy 0, policy_version 386920 (0.0039) [2024-06-23 07:23:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 6339395584. Throughput: 0: 42988.0. Samples: 6339566680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 07:23:05,167][15401] Updated weights for policy 0, policy_version 386930 (0.0031) [2024-06-23 07:23:08,115][15401] Updated weights for policy 0, policy_version 386940 (0.0034) [2024-06-23 07:23:08,390][15132] Fps is (10 sec: 47524.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6339641344. Throughput: 0: 42807.0. Samples: 6339690800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 07:23:12,818][15401] Updated weights for policy 0, policy_version 386950 (0.0044) [2024-06-23 07:23:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6339821568. Throughput: 0: 43099.9. Samples: 6339960780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 07:23:15,845][15401] Updated weights for policy 0, policy_version 386960 (0.0043) [2024-06-23 07:23:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6340034560. Throughput: 0: 42886.3. Samples: 6340201580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 07:23:20,481][15401] Updated weights for policy 0, policy_version 386970 (0.0032) [2024-06-23 07:23:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6340263936. Throughput: 0: 42886.7. Samples: 6340331680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 07:23:23,711][15401] Updated weights for policy 0, policy_version 386980 (0.0043) [2024-06-23 07:23:28,046][15401] Updated weights for policy 0, policy_version 386990 (0.0033) [2024-06-23 07:23:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6340444160. Throughput: 0: 43035.2. Samples: 6340594440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 07:23:31,312][15401] Updated weights for policy 0, policy_version 387000 (0.0033) [2024-06-23 07:23:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6340689920. Throughput: 0: 42756.5. Samples: 6340841600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:33,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 07:23:35,613][15401] Updated weights for policy 0, policy_version 387010 (0.0038) [2024-06-23 07:23:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6340902912. Throughput: 0: 43096.8. Samples: 6340980300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 07:23:38,886][15401] Updated weights for policy 0, policy_version 387020 (0.0038) [2024-06-23 07:23:43,252][15401] Updated weights for policy 0, policy_version 387030 (0.0027) [2024-06-23 07:23:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6341115904. Throughput: 0: 42876.5. Samples: 6341239220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 07:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387031_6341115904.pth... [2024-06-23 07:23:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386405_6330859520.pth [2024-06-23 07:23:46,460][15401] Updated weights for policy 0, policy_version 387040 (0.0037) [2024-06-23 07:23:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6341328896. Throughput: 0: 42694.6. Samples: 6341487940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 07:23:50,866][15401] Updated weights for policy 0, policy_version 387050 (0.0034) [2024-06-23 07:23:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6341558272. Throughput: 0: 42902.2. Samples: 6341621400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 07:23:54,012][15401] Updated weights for policy 0, policy_version 387060 (0.0050) [2024-06-23 07:23:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 6341738496. Throughput: 0: 42516.2. Samples: 6341874000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 07:23:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 07:23:58,428][15401] Updated weights for policy 0, policy_version 387070 (0.0048) [2024-06-23 07:24:00,542][15349] Signal inference workers to stop experience collection... (93950 times) [2024-06-23 07:24:00,580][15401] InferenceWorker_p0-w0: stopping experience collection (93950 times) [2024-06-23 07:24:00,658][15349] Signal inference workers to resume experience collection... (93950 times) [2024-06-23 07:24:00,658][15401] InferenceWorker_p0-w0: resuming experience collection (93950 times) [2024-06-23 07:24:02,199][15401] Updated weights for policy 0, policy_version 387080 (0.0033) [2024-06-23 07:24:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6341967872. Throughput: 0: 42661.8. Samples: 6342121360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 07:24:06,186][15401] Updated weights for policy 0, policy_version 387090 (0.0036) [2024-06-23 07:24:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6342213632. Throughput: 0: 42784.9. Samples: 6342257000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 07:24:09,671][15401] Updated weights for policy 0, policy_version 387100 (0.0035) [2024-06-23 07:24:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 6342377472. Throughput: 0: 42634.6. Samples: 6342513000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 07:24:13,685][15401] Updated weights for policy 0, policy_version 387110 (0.0040) [2024-06-23 07:24:17,587][15401] Updated weights for policy 0, policy_version 387120 (0.0037) [2024-06-23 07:24:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42710.5). Total num frames: 6342623232. Throughput: 0: 42718.3. Samples: 6342763920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:18,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 07:24:21,544][15401] Updated weights for policy 0, policy_version 387130 (0.0037) [2024-06-23 07:24:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6342836224. Throughput: 0: 42578.6. Samples: 6342896340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:23,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-23 07:24:25,243][15401] Updated weights for policy 0, policy_version 387140 (0.0022) [2024-06-23 07:24:28,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6343016448. Throughput: 0: 42406.1. Samples: 6343147500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 07:24:29,264][15401] Updated weights for policy 0, policy_version 387150 (0.0031) [2024-06-23 07:24:33,167][15401] Updated weights for policy 0, policy_version 387160 (0.0031) [2024-06-23 07:24:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 6343229440. Throughput: 0: 42614.8. Samples: 6343405600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 07:24:36,852][15401] Updated weights for policy 0, policy_version 387170 (0.0040) [2024-06-23 07:24:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6343475200. Throughput: 0: 42516.9. Samples: 6343534660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:38,396][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 07:24:40,789][15401] Updated weights for policy 0, policy_version 387180 (0.0032) [2024-06-23 07:24:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 6343639040. Throughput: 0: 42620.9. Samples: 6343791940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 07:24:44,489][15401] Updated weights for policy 0, policy_version 387190 (0.0033) [2024-06-23 07:24:48,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6343884800. Throughput: 0: 42827.9. Samples: 6344048720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:48,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 07:24:48,393][15401] Updated weights for policy 0, policy_version 387200 (0.0054) [2024-06-23 07:24:52,255][15401] Updated weights for policy 0, policy_version 387210 (0.0043) [2024-06-23 07:24:53,390][15132] Fps is (10 sec: 49150.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6344130560. Throughput: 0: 42808.3. Samples: 6344183380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 07:24:55,891][15401] Updated weights for policy 0, policy_version 387220 (0.0034) [2024-06-23 07:24:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6344310784. Throughput: 0: 42863.6. Samples: 6344441860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:24:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 07:24:59,832][15401] Updated weights for policy 0, policy_version 387230 (0.0028) [2024-06-23 07:25:03,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6344523776. Throughput: 0: 43009.6. Samples: 6344699460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:25:03,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 07:25:03,506][15401] Updated weights for policy 0, policy_version 387240 (0.0031) [2024-06-23 07:25:07,301][15401] Updated weights for policy 0, policy_version 387250 (0.0025) [2024-06-23 07:25:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6344753152. Throughput: 0: 42912.4. Samples: 6344827400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:25:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 07:25:10,939][15401] Updated weights for policy 0, policy_version 387260 (0.0031) [2024-06-23 07:25:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6344933376. Throughput: 0: 43080.2. Samples: 6345086100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:25:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 07:25:14,925][15401] Updated weights for policy 0, policy_version 387270 (0.0040) [2024-06-23 07:25:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6345179136. Throughput: 0: 43007.4. Samples: 6345340940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-23 07:25:18,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 07:25:18,813][15401] Updated weights for policy 0, policy_version 387280 (0.0030) [2024-06-23 07:25:22,437][15401] Updated weights for policy 0, policy_version 387290 (0.0025) [2024-06-23 07:25:23,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6345408512. Throughput: 0: 43115.2. Samples: 6345474840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 07:25:26,268][15401] Updated weights for policy 0, policy_version 387300 (0.0028) [2024-06-23 07:25:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6345588736. Throughput: 0: 43210.1. Samples: 6345736400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 07:25:29,943][15401] Updated weights for policy 0, policy_version 387310 (0.0021) [2024-06-23 07:25:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6345818112. Throughput: 0: 43163.7. Samples: 6345990980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 07:25:33,758][15401] Updated weights for policy 0, policy_version 387320 (0.0043) [2024-06-23 07:25:37,331][15349] Signal inference workers to stop experience collection... (94000 times) [2024-06-23 07:25:37,332][15349] Signal inference workers to resume experience collection... (94000 times) [2024-06-23 07:25:37,356][15401] InferenceWorker_p0-w0: stopping experience collection (94000 times) [2024-06-23 07:25:37,356][15401] InferenceWorker_p0-w0: resuming experience collection (94000 times) [2024-06-23 07:25:37,479][15401] Updated weights for policy 0, policy_version 387330 (0.0036) [2024-06-23 07:25:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6346047488. Throughput: 0: 43128.0. Samples: 6346124140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 07:25:41,295][15401] Updated weights for policy 0, policy_version 387340 (0.0043) [2024-06-23 07:25:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6346244096. Throughput: 0: 43035.1. Samples: 6346378440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 07:25:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387345_6346260480.pth... [2024-06-23 07:25:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000386717_6335971328.pth [2024-06-23 07:25:45,331][15401] Updated weights for policy 0, policy_version 387350 (0.0033) [2024-06-23 07:25:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 6346473472. Throughput: 0: 42803.5. Samples: 6346625520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 07:25:48,773][15401] Updated weights for policy 0, policy_version 387360 (0.0029) [2024-06-23 07:25:52,881][15401] Updated weights for policy 0, policy_version 387370 (0.0043) [2024-06-23 07:25:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6346702848. Throughput: 0: 42982.7. Samples: 6346761620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:53,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 07:25:56,295][15401] Updated weights for policy 0, policy_version 387380 (0.0026) [2024-06-23 07:25:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6346899456. Throughput: 0: 43002.2. Samples: 6347021200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:25:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 07:26:00,399][15401] Updated weights for policy 0, policy_version 387390 (0.0029) [2024-06-23 07:26:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 6347128832. Throughput: 0: 43076.4. Samples: 6347279380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:03,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 07:26:03,822][15401] Updated weights for policy 0, policy_version 387400 (0.0046) [2024-06-23 07:26:07,892][15401] Updated weights for policy 0, policy_version 387410 (0.0038) [2024-06-23 07:26:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6347325440. Throughput: 0: 42943.6. Samples: 6347407300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 07:26:11,612][15401] Updated weights for policy 0, policy_version 387420 (0.0033) [2024-06-23 07:26:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6347538432. Throughput: 0: 42931.6. Samples: 6347668320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 07:26:15,800][15401] Updated weights for policy 0, policy_version 387430 (0.0030) [2024-06-23 07:26:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6347767808. Throughput: 0: 42888.2. Samples: 6347920960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 07:26:19,470][15401] Updated weights for policy 0, policy_version 387440 (0.0034) [2024-06-23 07:26:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6347964416. Throughput: 0: 42849.9. Samples: 6348052380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 07:26:23,415][15401] Updated weights for policy 0, policy_version 387450 (0.0032) [2024-06-23 07:26:27,151][15401] Updated weights for policy 0, policy_version 387460 (0.0042) [2024-06-23 07:26:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6348193792. Throughput: 0: 42769.7. Samples: 6348303080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 07:26:31,168][15401] Updated weights for policy 0, policy_version 387470 (0.0049) [2024-06-23 07:26:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6348390400. Throughput: 0: 42990.7. Samples: 6348560100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 07:26:34,628][15401] Updated weights for policy 0, policy_version 387480 (0.0042) [2024-06-23 07:26:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6348603392. Throughput: 0: 42902.6. Samples: 6348692240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 07:26:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 07:26:38,720][15401] Updated weights for policy 0, policy_version 387490 (0.0029) [2024-06-23 07:26:42,365][15401] Updated weights for policy 0, policy_version 387500 (0.0034) [2024-06-23 07:26:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6348816384. Throughput: 0: 42733.8. Samples: 6348944220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:26:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:26:46,353][15401] Updated weights for policy 0, policy_version 387510 (0.0040) [2024-06-23 07:26:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6349045760. Throughput: 0: 42659.1. Samples: 6349199040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:26:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 07:26:50,421][15401] Updated weights for policy 0, policy_version 387520 (0.0038) [2024-06-23 07:26:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6349242368. Throughput: 0: 42828.4. Samples: 6349334580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:26:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 07:26:54,032][15401] Updated weights for policy 0, policy_version 387530 (0.0044) [2024-06-23 07:26:58,109][15401] Updated weights for policy 0, policy_version 387540 (0.0033) [2024-06-23 07:26:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 6349455360. Throughput: 0: 42685.6. Samples: 6349589180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:26:58,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-23 07:27:01,539][15401] Updated weights for policy 0, policy_version 387550 (0.0033) [2024-06-23 07:27:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6349684736. Throughput: 0: 42672.1. Samples: 6349841200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:03,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 07:27:04,536][15349] Signal inference workers to stop experience collection... (94050 times) [2024-06-23 07:27:04,542][15349] Signal inference workers to resume experience collection... (94050 times) [2024-06-23 07:27:04,583][15401] InferenceWorker_p0-w0: stopping experience collection (94050 times) [2024-06-23 07:27:04,583][15401] InferenceWorker_p0-w0: resuming experience collection (94050 times) [2024-06-23 07:27:05,815][15401] Updated weights for policy 0, policy_version 387560 (0.0041) [2024-06-23 07:27:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6349881344. Throughput: 0: 42756.0. Samples: 6349976400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 07:27:09,174][15401] Updated weights for policy 0, policy_version 387570 (0.0047) [2024-06-23 07:27:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6350094336. Throughput: 0: 42680.1. Samples: 6350223680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 07:27:13,429][15401] Updated weights for policy 0, policy_version 387580 (0.0032) [2024-06-23 07:27:16,965][15401] Updated weights for policy 0, policy_version 387590 (0.0040) [2024-06-23 07:27:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 6350323712. Throughput: 0: 42666.3. Samples: 6350480180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:18,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 07:27:21,461][15401] Updated weights for policy 0, policy_version 387600 (0.0031) [2024-06-23 07:27:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6350536704. Throughput: 0: 42749.4. Samples: 6350615960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 07:27:24,511][15401] Updated weights for policy 0, policy_version 387610 (0.0038) [2024-06-23 07:27:28,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6350733312. Throughput: 0: 42793.3. Samples: 6350869920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 07:27:29,081][15401] Updated weights for policy 0, policy_version 387620 (0.0032) [2024-06-23 07:27:32,068][15401] Updated weights for policy 0, policy_version 387630 (0.0045) [2024-06-23 07:27:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6350962688. Throughput: 0: 42823.1. Samples: 6351126080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 07:27:36,622][15401] Updated weights for policy 0, policy_version 387640 (0.0034) [2024-06-23 07:27:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6351192064. Throughput: 0: 42714.7. Samples: 6351256740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 07:27:39,921][15401] Updated weights for policy 0, policy_version 387650 (0.0024) [2024-06-23 07:27:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6351355904. Throughput: 0: 42666.7. Samples: 6351509180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 07:27:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387657_6351372288.pth... [2024-06-23 07:27:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387031_6341115904.pth [2024-06-23 07:27:44,326][15401] Updated weights for policy 0, policy_version 387660 (0.0047) [2024-06-23 07:27:47,577][15401] Updated weights for policy 0, policy_version 387670 (0.0031) [2024-06-23 07:27:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6351601664. Throughput: 0: 42706.8. Samples: 6351763000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 07:27:51,899][15401] Updated weights for policy 0, policy_version 387680 (0.0043) [2024-06-23 07:27:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 6351831040. Throughput: 0: 42763.0. Samples: 6351900740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 07:27:55,257][15401] Updated weights for policy 0, policy_version 387690 (0.0030) [2024-06-23 07:27:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6352011264. Throughput: 0: 42777.1. Samples: 6352148660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 07:27:58,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 07:27:59,560][15401] Updated weights for policy 0, policy_version 387700 (0.0023) [2024-06-23 07:28:02,853][15401] Updated weights for policy 0, policy_version 387710 (0.0041) [2024-06-23 07:28:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6352240640. Throughput: 0: 42667.5. Samples: 6352400120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:03,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 07:28:07,216][15401] Updated weights for policy 0, policy_version 387720 (0.0032) [2024-06-23 07:28:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6352453632. Throughput: 0: 42694.7. Samples: 6352537220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 07:28:10,414][15401] Updated weights for policy 0, policy_version 387730 (0.0033) [2024-06-23 07:28:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6352650240. Throughput: 0: 42757.0. Samples: 6352793980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 07:28:14,939][15401] Updated weights for policy 0, policy_version 387740 (0.0024) [2024-06-23 07:28:18,030][15401] Updated weights for policy 0, policy_version 387750 (0.0039) [2024-06-23 07:28:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 6352896000. Throughput: 0: 42562.7. Samples: 6353041400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:18,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 07:28:22,530][15401] Updated weights for policy 0, policy_version 387760 (0.0033) [2024-06-23 07:28:23,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 6353108992. Throughput: 0: 42691.2. Samples: 6353177940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:23,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 07:28:26,003][15401] Updated weights for policy 0, policy_version 387770 (0.0045) [2024-06-23 07:28:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6353305600. Throughput: 0: 42589.8. Samples: 6353425720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 07:28:29,754][15349] Signal inference workers to stop experience collection... (94100 times) [2024-06-23 07:28:29,801][15401] InferenceWorker_p0-w0: stopping experience collection (94100 times) [2024-06-23 07:28:29,808][15349] Signal inference workers to resume experience collection... (94100 times) [2024-06-23 07:28:29,822][15401] InferenceWorker_p0-w0: resuming experience collection (94100 times) [2024-06-23 07:28:30,103][15401] Updated weights for policy 0, policy_version 387780 (0.0036) [2024-06-23 07:28:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6353518592. Throughput: 0: 42649.8. Samples: 6353682240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 07:28:33,696][15401] Updated weights for policy 0, policy_version 387790 (0.0033) [2024-06-23 07:28:38,062][15401] Updated weights for policy 0, policy_version 387800 (0.0028) [2024-06-23 07:28:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6353715200. Throughput: 0: 42362.4. Samples: 6353807040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 07:28:41,259][15401] Updated weights for policy 0, policy_version 387810 (0.0028) [2024-06-23 07:28:43,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 6353928192. Throughput: 0: 42459.4. Samples: 6354059600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:43,396][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 07:28:45,716][15401] Updated weights for policy 0, policy_version 387820 (0.0045) [2024-06-23 07:28:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6354157568. Throughput: 0: 42686.7. Samples: 6354321120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:48,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 07:28:48,843][15401] Updated weights for policy 0, policy_version 387830 (0.0032) [2024-06-23 07:28:53,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6354354176. Throughput: 0: 42542.6. Samples: 6354451640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 07:28:53,410][15401] Updated weights for policy 0, policy_version 387840 (0.0039) [2024-06-23 07:28:56,643][15401] Updated weights for policy 0, policy_version 387850 (0.0040) [2024-06-23 07:28:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6354583552. Throughput: 0: 42184.4. Samples: 6354692280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:28:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 07:29:01,698][15401] Updated weights for policy 0, policy_version 387860 (0.0041) [2024-06-23 07:29:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6354780160. Throughput: 0: 42524.0. Samples: 6354954980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:29:03,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 07:29:04,231][15401] Updated weights for policy 0, policy_version 387870 (0.0036) [2024-06-23 07:29:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6354993152. Throughput: 0: 42221.4. Samples: 6355077800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:29:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 07:29:09,314][15401] Updated weights for policy 0, policy_version 387880 (0.0038) [2024-06-23 07:29:11,927][15401] Updated weights for policy 0, policy_version 387890 (0.0027) [2024-06-23 07:29:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6355238912. Throughput: 0: 42248.4. Samples: 6355326900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:29:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:29:16,922][15401] Updated weights for policy 0, policy_version 387900 (0.0028) [2024-06-23 07:29:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 6355419136. Throughput: 0: 42511.7. Samples: 6355595260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 07:29:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 07:29:19,645][15401] Updated weights for policy 0, policy_version 387910 (0.0025) [2024-06-23 07:29:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41780.9, 300 sec: 42709.5). Total num frames: 6355615744. Throughput: 0: 42416.4. Samples: 6355715780. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 07:29:24,682][15401] Updated weights for policy 0, policy_version 387920 (0.0035) [2024-06-23 07:29:27,239][15401] Updated weights for policy 0, policy_version 387930 (0.0027) [2024-06-23 07:29:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6355894272. Throughput: 0: 42494.5. Samples: 6355971580. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:28,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 07:29:32,703][15401] Updated weights for policy 0, policy_version 387940 (0.0025) [2024-06-23 07:29:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6356058112. Throughput: 0: 42632.1. Samples: 6356239460. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 07:29:34,873][15401] Updated weights for policy 0, policy_version 387950 (0.0037) [2024-06-23 07:29:38,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6356254720. Throughput: 0: 42160.6. Samples: 6356348860. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 07:29:40,418][15349] Signal inference workers to stop experience collection... (94150 times) [2024-06-23 07:29:40,419][15349] Signal inference workers to resume experience collection... (94150 times) [2024-06-23 07:29:40,423][15401] Updated weights for policy 0, policy_version 387960 (0.0037) [2024-06-23 07:29:40,434][15401] InferenceWorker_p0-w0: stopping experience collection (94150 times) [2024-06-23 07:29:40,434][15401] InferenceWorker_p0-w0: resuming experience collection (94150 times) [2024-06-23 07:29:42,726][15401] Updated weights for policy 0, policy_version 387970 (0.0038) [2024-06-23 07:29:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43422.2, 300 sec: 42876.4). Total num frames: 6356533248. Throughput: 0: 42558.6. Samples: 6356607420. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 07:29:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387973_6356549632.pth... [2024-06-23 07:29:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387345_6346260480.pth [2024-06-23 07:29:48,097][15401] Updated weights for policy 0, policy_version 387980 (0.0043) [2024-06-23 07:29:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 6356680704. Throughput: 0: 42582.3. Samples: 6356871180. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 07:29:50,528][15401] Updated weights for policy 0, policy_version 387990 (0.0038) [2024-06-23 07:29:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6356910080. Throughput: 0: 42467.9. Samples: 6356988860. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 07:29:55,530][15401] Updated weights for policy 0, policy_version 388000 (0.0036) [2024-06-23 07:29:58,125][15401] Updated weights for policy 0, policy_version 388010 (0.0019) [2024-06-23 07:29:58,390][15132] Fps is (10 sec: 49151.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 6357172224. Throughput: 0: 42790.2. Samples: 6357252460. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:29:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 07:30:03,079][15401] Updated weights for policy 0, policy_version 388020 (0.0031) [2024-06-23 07:30:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 6357336064. Throughput: 0: 42670.0. Samples: 6357515520. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:03,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 07:30:05,881][15401] Updated weights for policy 0, policy_version 388030 (0.0036) [2024-06-23 07:30:08,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6357549056. Throughput: 0: 42618.6. Samples: 6357633620. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 07:30:10,817][15401] Updated weights for policy 0, policy_version 388040 (0.0046) [2024-06-23 07:30:13,354][15401] Updated weights for policy 0, policy_version 388050 (0.0041) [2024-06-23 07:30:13,390][15132] Fps is (10 sec: 47525.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6357811200. Throughput: 0: 42838.2. Samples: 6357899300. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:13,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 07:30:18,385][15401] Updated weights for policy 0, policy_version 388060 (0.0038) [2024-06-23 07:30:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6357975040. Throughput: 0: 42762.7. Samples: 6358163780. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 07:30:20,853][15401] Updated weights for policy 0, policy_version 388070 (0.0035) [2024-06-23 07:30:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6358204416. Throughput: 0: 42888.7. Samples: 6358278860. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 07:30:25,895][15401] Updated weights for policy 0, policy_version 388080 (0.0030) [2024-06-23 07:30:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6358450176. Throughput: 0: 43136.0. Samples: 6358548540. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 07:30:28,431][15401] Updated weights for policy 0, policy_version 388090 (0.0028) [2024-06-23 07:30:33,290][15401] Updated weights for policy 0, policy_version 388100 (0.0038) [2024-06-23 07:30:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6358630400. Throughput: 0: 43078.9. Samples: 6358809740. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 07:30:36,265][15401] Updated weights for policy 0, policy_version 388110 (0.0039) [2024-06-23 07:30:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6358859776. Throughput: 0: 43021.8. Samples: 6358924840. Policy #0 lag: (min: 0.0, avg: 13.7, max: 24.0) [2024-06-23 07:30:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 07:30:40,769][15401] Updated weights for policy 0, policy_version 388120 (0.0036) [2024-06-23 07:30:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6359089152. Throughput: 0: 43222.7. Samples: 6359197480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:30:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 07:30:43,820][15401] Updated weights for policy 0, policy_version 388130 (0.0041) [2024-06-23 07:30:48,192][15401] Updated weights for policy 0, policy_version 388140 (0.0031) [2024-06-23 07:30:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 6359285760. Throughput: 0: 43043.7. Samples: 6359452380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:30:48,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-23 07:30:51,465][15401] Updated weights for policy 0, policy_version 388150 (0.0041) [2024-06-23 07:30:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6359498752. Throughput: 0: 43228.0. Samples: 6359578880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:30:53,395][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 07:30:55,806][15401] Updated weights for policy 0, policy_version 388160 (0.0034) [2024-06-23 07:30:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6359728128. Throughput: 0: 43184.6. Samples: 6359842600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:30:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 07:30:59,227][15401] Updated weights for policy 0, policy_version 388170 (0.0030) [2024-06-23 07:31:03,140][15401] Updated weights for policy 0, policy_version 388180 (0.0030) [2024-06-23 07:31:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 6359941120. Throughput: 0: 43003.9. Samples: 6360098960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:03,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 07:31:04,856][15349] Signal inference workers to stop experience collection... (94200 times) [2024-06-23 07:31:04,905][15349] Signal inference workers to resume experience collection... (94200 times) [2024-06-23 07:31:04,906][15401] InferenceWorker_p0-w0: stopping experience collection (94200 times) [2024-06-23 07:31:04,927][15401] InferenceWorker_p0-w0: resuming experience collection (94200 times) [2024-06-23 07:31:06,836][15401] Updated weights for policy 0, policy_version 388190 (0.0031) [2024-06-23 07:31:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6360137728. Throughput: 0: 43216.2. Samples: 6360223580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:08,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 07:31:10,701][15401] Updated weights for policy 0, policy_version 388200 (0.0039) [2024-06-23 07:31:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6360399872. Throughput: 0: 43199.5. Samples: 6360492520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 07:31:14,349][15401] Updated weights for policy 0, policy_version 388210 (0.0028) [2024-06-23 07:31:18,240][15401] Updated weights for policy 0, policy_version 388220 (0.0034) [2024-06-23 07:31:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 6360596480. Throughput: 0: 42941.1. Samples: 6360742080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 07:31:22,291][15401] Updated weights for policy 0, policy_version 388230 (0.0024) [2024-06-23 07:31:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6360793088. Throughput: 0: 43193.7. Samples: 6360868560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 07:31:26,149][15401] Updated weights for policy 0, policy_version 388240 (0.0036) [2024-06-23 07:31:28,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 6361022464. Throughput: 0: 42980.5. Samples: 6361131880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:28,397][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 07:31:29,854][15401] Updated weights for policy 0, policy_version 388250 (0.0036) [2024-06-23 07:31:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6361219072. Throughput: 0: 43084.7. Samples: 6361391200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 07:31:33,798][15401] Updated weights for policy 0, policy_version 388260 (0.0026) [2024-06-23 07:31:37,375][15401] Updated weights for policy 0, policy_version 388270 (0.0028) [2024-06-23 07:31:38,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6361432064. Throughput: 0: 42923.1. Samples: 6361510420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 07:31:41,390][15401] Updated weights for policy 0, policy_version 388280 (0.0037) [2024-06-23 07:31:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6361661440. Throughput: 0: 42997.2. Samples: 6361777480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 07:31:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388285_6361661440.pth... [2024-06-23 07:31:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387657_6351372288.pth [2024-06-23 07:31:44,836][15401] Updated weights for policy 0, policy_version 388290 (0.0038) [2024-06-23 07:31:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6361858048. Throughput: 0: 43165.5. Samples: 6362041400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 07:31:49,089][15401] Updated weights for policy 0, policy_version 388300 (0.0036) [2024-06-23 07:31:52,415][15401] Updated weights for policy 0, policy_version 388310 (0.0028) [2024-06-23 07:31:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6362087424. Throughput: 0: 43110.7. Samples: 6362163560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 07:31:56,663][15401] Updated weights for policy 0, policy_version 388320 (0.0041) [2024-06-23 07:31:58,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6362316800. Throughput: 0: 42914.6. Samples: 6362423780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 07:31:58,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 07:32:00,037][15401] Updated weights for policy 0, policy_version 388330 (0.0028) [2024-06-23 07:32:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6362497024. Throughput: 0: 43276.8. Samples: 6362689540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 07:32:04,429][15401] Updated weights for policy 0, policy_version 388340 (0.0032) [2024-06-23 07:32:07,570][15401] Updated weights for policy 0, policy_version 388350 (0.0034) [2024-06-23 07:32:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6362726400. Throughput: 0: 43028.6. Samples: 6362804840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 07:32:12,006][15401] Updated weights for policy 0, policy_version 388360 (0.0042) [2024-06-23 07:32:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 6362972160. Throughput: 0: 43007.9. Samples: 6363066960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 07:32:15,123][15401] Updated weights for policy 0, policy_version 388370 (0.0036) [2024-06-23 07:32:18,018][15349] Signal inference workers to stop experience collection... (94250 times) [2024-06-23 07:32:18,062][15401] InferenceWorker_p0-w0: stopping experience collection (94250 times) [2024-06-23 07:32:18,135][15349] Signal inference workers to resume experience collection... (94250 times) [2024-06-23 07:32:18,136][15401] InferenceWorker_p0-w0: resuming experience collection (94250 times) [2024-06-23 07:32:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6363136000. Throughput: 0: 43233.9. Samples: 6363336720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 07:32:19,675][15401] Updated weights for policy 0, policy_version 388380 (0.0022) [2024-06-23 07:32:22,678][15401] Updated weights for policy 0, policy_version 388390 (0.0026) [2024-06-23 07:32:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6363381760. Throughput: 0: 43128.1. Samples: 6363451180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 07:32:27,575][15401] Updated weights for policy 0, policy_version 388400 (0.0033) [2024-06-23 07:32:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 6363594752. Throughput: 0: 43082.7. Samples: 6363716200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:28,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 07:32:30,439][15401] Updated weights for policy 0, policy_version 388410 (0.0036) [2024-06-23 07:32:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 6363774976. Throughput: 0: 42862.2. Samples: 6363970200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 07:32:35,185][15401] Updated weights for policy 0, policy_version 388420 (0.0037) [2024-06-23 07:32:38,185][15401] Updated weights for policy 0, policy_version 388430 (0.0047) [2024-06-23 07:32:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 6364037120. Throughput: 0: 42776.8. Samples: 6364088520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 07:32:42,739][15401] Updated weights for policy 0, policy_version 388440 (0.0031) [2024-06-23 07:32:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6364233728. Throughput: 0: 42899.2. Samples: 6364354140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 07:32:45,834][15401] Updated weights for policy 0, policy_version 388450 (0.0032) [2024-06-23 07:32:48,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6364413952. Throughput: 0: 42639.1. Samples: 6364608300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 07:32:50,374][15401] Updated weights for policy 0, policy_version 388460 (0.0029) [2024-06-23 07:32:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6364659712. Throughput: 0: 42793.4. Samples: 6364730540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 07:32:53,621][15401] Updated weights for policy 0, policy_version 388470 (0.0028) [2024-06-23 07:32:57,886][15401] Updated weights for policy 0, policy_version 388480 (0.0033) [2024-06-23 07:32:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 6364872704. Throughput: 0: 42766.8. Samples: 6364991460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:32:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 07:33:01,682][15401] Updated weights for policy 0, policy_version 388490 (0.0043) [2024-06-23 07:33:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6365069312. Throughput: 0: 42373.3. Samples: 6365243520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:33:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 07:33:05,378][15401] Updated weights for policy 0, policy_version 388500 (0.0034) [2024-06-23 07:33:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6365298688. Throughput: 0: 42765.8. Samples: 6365375640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:33:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 07:33:09,070][15401] Updated weights for policy 0, policy_version 388510 (0.0036) [2024-06-23 07:33:12,152][15349] Signal inference workers to stop experience collection... (94300 times) [2024-06-23 07:33:12,153][15349] Signal inference workers to resume experience collection... (94300 times) [2024-06-23 07:33:12,166][15401] InferenceWorker_p0-w0: stopping experience collection (94300 times) [2024-06-23 07:33:12,166][15401] InferenceWorker_p0-w0: resuming experience collection (94300 times) [2024-06-23 07:33:12,813][15401] Updated weights for policy 0, policy_version 388520 (0.0046) [2024-06-23 07:33:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6365511680. Throughput: 0: 42641.6. Samples: 6365635080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:33:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 07:33:16,995][15401] Updated weights for policy 0, policy_version 388530 (0.0043) [2024-06-23 07:33:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 6365724672. Throughput: 0: 42547.5. Samples: 6365884840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 07:33:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 07:33:20,647][15401] Updated weights for policy 0, policy_version 388540 (0.0023) [2024-06-23 07:33:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6365937664. Throughput: 0: 42723.6. Samples: 6366011080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 07:33:24,783][15401] Updated weights for policy 0, policy_version 388550 (0.0037) [2024-06-23 07:33:28,069][15401] Updated weights for policy 0, policy_version 388560 (0.0043) [2024-06-23 07:33:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 6366167040. Throughput: 0: 42662.9. Samples: 6366273980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 07:33:32,658][15401] Updated weights for policy 0, policy_version 388570 (0.0044) [2024-06-23 07:33:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6366347264. Throughput: 0: 42637.8. Samples: 6366527000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:33,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 07:33:35,989][15401] Updated weights for policy 0, policy_version 388580 (0.0038) [2024-06-23 07:33:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42877.0). Total num frames: 6366576640. Throughput: 0: 42663.0. Samples: 6366650380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 07:33:40,255][15401] Updated weights for policy 0, policy_version 388590 (0.0034) [2024-06-23 07:33:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 6366789632. Throughput: 0: 42712.4. Samples: 6366913520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 07:33:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388599_6366806016.pth... [2024-06-23 07:33:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000387973_6356549632.pth [2024-06-23 07:33:43,682][15401] Updated weights for policy 0, policy_version 388600 (0.0046) [2024-06-23 07:33:47,846][15401] Updated weights for policy 0, policy_version 388610 (0.0033) [2024-06-23 07:33:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6367002624. Throughput: 0: 42653.3. Samples: 6367162920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 07:33:51,744][15401] Updated weights for policy 0, policy_version 388620 (0.0034) [2024-06-23 07:33:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6367232000. Throughput: 0: 42578.7. Samples: 6367291680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 07:33:55,570][15401] Updated weights for policy 0, policy_version 388630 (0.0032) [2024-06-23 07:33:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6367412224. Throughput: 0: 42566.0. Samples: 6367550540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:33:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 07:33:59,443][15401] Updated weights for policy 0, policy_version 388640 (0.0037) [2024-06-23 07:34:03,174][15401] Updated weights for policy 0, policy_version 388650 (0.0040) [2024-06-23 07:34:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6367657984. Throughput: 0: 42672.4. Samples: 6367805100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 07:34:07,263][15401] Updated weights for policy 0, policy_version 388660 (0.0035) [2024-06-23 07:34:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6367854592. Throughput: 0: 42677.7. Samples: 6367931580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 07:34:11,022][15401] Updated weights for policy 0, policy_version 388670 (0.0030) [2024-06-23 07:34:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6368067584. Throughput: 0: 42523.7. Samples: 6368187540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 07:34:14,784][15401] Updated weights for policy 0, policy_version 388680 (0.0046) [2024-06-23 07:34:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6368280576. Throughput: 0: 42712.5. Samples: 6368449060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 07:34:18,653][15401] Updated weights for policy 0, policy_version 388690 (0.0044) [2024-06-23 07:34:22,463][15401] Updated weights for policy 0, policy_version 388700 (0.0033) [2024-06-23 07:34:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 6368477184. Throughput: 0: 42782.7. Samples: 6368575700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:23,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 07:34:26,167][15401] Updated weights for policy 0, policy_version 388710 (0.0035) [2024-06-23 07:34:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 6368706560. Throughput: 0: 42718.2. Samples: 6368835840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 07:34:30,103][15401] Updated weights for policy 0, policy_version 388720 (0.0033) [2024-06-23 07:34:33,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6368935936. Throughput: 0: 42996.7. Samples: 6369097760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 07:34:33,697][15401] Updated weights for policy 0, policy_version 388730 (0.0035) [2024-06-23 07:34:37,515][15401] Updated weights for policy 0, policy_version 388740 (0.0045) [2024-06-23 07:34:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6369132544. Throughput: 0: 42953.3. Samples: 6369224580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-23 07:34:38,396][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 07:34:38,647][15349] Signal inference workers to stop experience collection... (94350 times) [2024-06-23 07:34:38,680][15401] InferenceWorker_p0-w0: stopping experience collection (94350 times) [2024-06-23 07:34:38,705][15349] Signal inference workers to resume experience collection... (94350 times) [2024-06-23 07:34:38,705][15401] InferenceWorker_p0-w0: resuming experience collection (94350 times) [2024-06-23 07:34:41,408][15401] Updated weights for policy 0, policy_version 388750 (0.0028) [2024-06-23 07:34:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 6369361920. Throughput: 0: 42927.2. Samples: 6369482260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:34:43,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 07:34:45,302][15401] Updated weights for policy 0, policy_version 388760 (0.0031) [2024-06-23 07:34:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 6369574912. Throughput: 0: 42967.7. Samples: 6369738640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:34:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 07:34:49,043][15401] Updated weights for policy 0, policy_version 388770 (0.0043) [2024-06-23 07:34:52,620][15401] Updated weights for policy 0, policy_version 388780 (0.0028) [2024-06-23 07:34:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6369787904. Throughput: 0: 43067.7. Samples: 6369869620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:34:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 07:34:56,634][15401] Updated weights for policy 0, policy_version 388790 (0.0033) [2024-06-23 07:34:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 6370017280. Throughput: 0: 43279.6. Samples: 6370135120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:34:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 07:35:00,224][15401] Updated weights for policy 0, policy_version 388800 (0.0032) [2024-06-23 07:35:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 6370230272. Throughput: 0: 43148.8. Samples: 6370390760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 07:35:04,185][15401] Updated weights for policy 0, policy_version 388810 (0.0038) [2024-06-23 07:35:07,821][15401] Updated weights for policy 0, policy_version 388820 (0.0039) [2024-06-23 07:35:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6370426880. Throughput: 0: 43217.1. Samples: 6370520360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:08,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 07:35:11,862][15401] Updated weights for policy 0, policy_version 388830 (0.0043) [2024-06-23 07:35:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6370656256. Throughput: 0: 42985.2. Samples: 6370770180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 07:35:15,863][15401] Updated weights for policy 0, policy_version 388840 (0.0024) [2024-06-23 07:35:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6370852864. Throughput: 0: 42866.2. Samples: 6371026740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 07:35:19,361][15401] Updated weights for policy 0, policy_version 388850 (0.0043) [2024-06-23 07:35:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 6371065856. Throughput: 0: 42905.3. Samples: 6371155320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:23,399][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 07:35:23,457][15401] Updated weights for policy 0, policy_version 388860 (0.0042) [2024-06-23 07:35:26,998][15401] Updated weights for policy 0, policy_version 388870 (0.0036) [2024-06-23 07:35:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 6371295232. Throughput: 0: 42890.2. Samples: 6371412320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 07:35:30,927][15401] Updated weights for policy 0, policy_version 388880 (0.0029) [2024-06-23 07:35:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6371491840. Throughput: 0: 42882.6. Samples: 6371668360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 07:35:35,022][15401] Updated weights for policy 0, policy_version 388890 (0.0038) [2024-06-23 07:35:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6371721216. Throughput: 0: 42699.5. Samples: 6371791100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:38,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 07:35:38,680][15401] Updated weights for policy 0, policy_version 388900 (0.0040) [2024-06-23 07:35:42,687][15401] Updated weights for policy 0, policy_version 388910 (0.0030) [2024-06-23 07:35:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6371917824. Throughput: 0: 42633.3. Samples: 6372053620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:35:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388911_6371917824.pth... [2024-06-23 07:35:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388285_6361661440.pth [2024-06-23 07:35:46,515][15401] Updated weights for policy 0, policy_version 388920 (0.0033) [2024-06-23 07:35:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6372147200. Throughput: 0: 42513.0. Samples: 6372303840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 07:35:50,361][15401] Updated weights for policy 0, policy_version 388930 (0.0031) [2024-06-23 07:35:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 6372343808. Throughput: 0: 42654.9. Samples: 6372439840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 07:35:54,114][15401] Updated weights for policy 0, policy_version 388940 (0.0031) [2024-06-23 07:35:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6372540416. Throughput: 0: 42821.9. Samples: 6372697160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 07:35:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 07:35:58,479][15401] Updated weights for policy 0, policy_version 388950 (0.0023) [2024-06-23 07:35:58,895][15349] Signal inference workers to stop experience collection... (94400 times) [2024-06-23 07:35:58,937][15401] InferenceWorker_p0-w0: stopping experience collection (94400 times) [2024-06-23 07:35:58,964][15349] Signal inference workers to resume experience collection... (94400 times) [2024-06-23 07:35:58,968][15401] InferenceWorker_p0-w0: resuming experience collection (94400 times) [2024-06-23 07:36:01,655][15401] Updated weights for policy 0, policy_version 388960 (0.0029) [2024-06-23 07:36:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6372786176. Throughput: 0: 42593.8. Samples: 6372943460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:03,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 07:36:05,991][15401] Updated weights for policy 0, policy_version 388970 (0.0037) [2024-06-23 07:36:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6372982784. Throughput: 0: 42752.4. Samples: 6373079180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 07:36:09,638][15401] Updated weights for policy 0, policy_version 388980 (0.0033) [2024-06-23 07:36:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6373195776. Throughput: 0: 42715.8. Samples: 6373334540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:13,396][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 07:36:13,463][15401] Updated weights for policy 0, policy_version 388990 (0.0030) [2024-06-23 07:36:17,302][15401] Updated weights for policy 0, policy_version 389000 (0.0027) [2024-06-23 07:36:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6373425152. Throughput: 0: 42761.7. Samples: 6373592640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 07:36:20,955][15401] Updated weights for policy 0, policy_version 389010 (0.0037) [2024-06-23 07:36:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 6373638144. Throughput: 0: 42775.1. Samples: 6373715980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 07:36:24,914][15401] Updated weights for policy 0, policy_version 389020 (0.0029) [2024-06-23 07:36:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6373851136. Throughput: 0: 42585.9. Samples: 6373969980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 07:36:28,746][15401] Updated weights for policy 0, policy_version 389030 (0.0034) [2024-06-23 07:36:32,755][15401] Updated weights for policy 0, policy_version 389040 (0.0033) [2024-06-23 07:36:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6374047744. Throughput: 0: 42783.4. Samples: 6374229100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:33,390][15132] Avg episode reward: [(0, '0.124')] [2024-06-23 07:36:36,391][15401] Updated weights for policy 0, policy_version 389050 (0.0040) [2024-06-23 07:36:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 6374293504. Throughput: 0: 42541.0. Samples: 6374354280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:38,392][15132] Avg episode reward: [(0, '0.150')] [2024-06-23 07:36:40,377][15401] Updated weights for policy 0, policy_version 389060 (0.0039) [2024-06-23 07:36:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6374490112. Throughput: 0: 42481.2. Samples: 6374608820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 07:36:43,910][15401] Updated weights for policy 0, policy_version 389070 (0.0029) [2024-06-23 07:36:48,063][15401] Updated weights for policy 0, policy_version 389080 (0.0025) [2024-06-23 07:36:48,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6374703104. Throughput: 0: 42798.2. Samples: 6374869380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 07:36:51,502][15401] Updated weights for policy 0, policy_version 389090 (0.0039) [2024-06-23 07:36:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 6374916096. Throughput: 0: 42625.4. Samples: 6374997320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 07:36:55,691][15401] Updated weights for policy 0, policy_version 389100 (0.0030) [2024-06-23 07:36:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6375129088. Throughput: 0: 42578.7. Samples: 6375250580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:36:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 07:36:58,955][15401] Updated weights for policy 0, policy_version 389110 (0.0028) [2024-06-23 07:37:03,181][15401] Updated weights for policy 0, policy_version 389120 (0.0039) [2024-06-23 07:37:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6375342080. Throughput: 0: 42705.5. Samples: 6375514380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 07:37:06,750][15401] Updated weights for policy 0, policy_version 389130 (0.0029) [2024-06-23 07:37:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6375571456. Throughput: 0: 42749.9. Samples: 6375639720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 07:37:10,774][15401] Updated weights for policy 0, policy_version 389140 (0.0042) [2024-06-23 07:37:13,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6375784448. Throughput: 0: 42880.8. Samples: 6375899620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 07:37:14,309][15401] Updated weights for policy 0, policy_version 389150 (0.0036) [2024-06-23 07:37:15,872][15349] Signal inference workers to stop experience collection... (94450 times) [2024-06-23 07:37:15,906][15401] InferenceWorker_p0-w0: stopping experience collection (94450 times) [2024-06-23 07:37:15,928][15349] Signal inference workers to resume experience collection... (94450 times) [2024-06-23 07:37:15,928][15401] InferenceWorker_p0-w0: resuming experience collection (94450 times) [2024-06-23 07:37:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6375964672. Throughput: 0: 43034.3. Samples: 6376165640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:18,392][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 07:37:18,723][15401] Updated weights for policy 0, policy_version 389160 (0.0027) [2024-06-23 07:37:21,670][15401] Updated weights for policy 0, policy_version 389170 (0.0028) [2024-06-23 07:37:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6376210432. Throughput: 0: 43045.9. Samples: 6376291240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 07:37:26,320][15401] Updated weights for policy 0, policy_version 389180 (0.0030) [2024-06-23 07:37:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 6376423424. Throughput: 0: 43075.1. Samples: 6376547200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 07:37:29,454][15401] Updated weights for policy 0, policy_version 389190 (0.0027) [2024-06-23 07:37:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6376620032. Throughput: 0: 43112.9. Samples: 6376809460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 07:37:33,818][15401] Updated weights for policy 0, policy_version 389200 (0.0045) [2024-06-23 07:37:36,856][15401] Updated weights for policy 0, policy_version 389210 (0.0028) [2024-06-23 07:37:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 6376865792. Throughput: 0: 43072.4. Samples: 6376935580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:38,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 07:37:41,411][15401] Updated weights for policy 0, policy_version 389220 (0.0029) [2024-06-23 07:37:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6377062400. Throughput: 0: 43148.5. Samples: 6377192260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 07:37:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389226_6377078784.pth... [2024-06-23 07:37:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388599_6366806016.pth [2024-06-23 07:37:44,536][15401] Updated weights for policy 0, policy_version 389230 (0.0037) [2024-06-23 07:37:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6377275392. Throughput: 0: 43051.4. Samples: 6377451700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:48,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 07:37:48,922][15401] Updated weights for policy 0, policy_version 389240 (0.0035) [2024-06-23 07:37:52,284][15401] Updated weights for policy 0, policy_version 389250 (0.0025) [2024-06-23 07:37:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6377521152. Throughput: 0: 43108.8. Samples: 6377579620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:53,391][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 07:37:56,519][15401] Updated weights for policy 0, policy_version 389260 (0.0025) [2024-06-23 07:37:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6377701376. Throughput: 0: 43109.8. Samples: 6377839560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:37:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 07:37:59,962][15401] Updated weights for policy 0, policy_version 389270 (0.0037) [2024-06-23 07:38:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6377930752. Throughput: 0: 42972.5. Samples: 6378099400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:38:03,982][15401] Updated weights for policy 0, policy_version 389280 (0.0038) [2024-06-23 07:38:07,693][15401] Updated weights for policy 0, policy_version 389290 (0.0040) [2024-06-23 07:38:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6378160128. Throughput: 0: 43070.5. Samples: 6378229420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 07:38:11,515][15401] Updated weights for policy 0, policy_version 389300 (0.0043) [2024-06-23 07:38:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6378356736. Throughput: 0: 43045.7. Samples: 6378484260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 07:38:15,246][15401] Updated weights for policy 0, policy_version 389310 (0.0034) [2024-06-23 07:38:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6378569728. Throughput: 0: 42797.4. Samples: 6378735340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 07:38:19,256][15401] Updated weights for policy 0, policy_version 389320 (0.0049) [2024-06-23 07:38:23,080][15401] Updated weights for policy 0, policy_version 389330 (0.0037) [2024-06-23 07:38:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6378782720. Throughput: 0: 42836.8. Samples: 6378863240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:23,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-23 07:38:27,105][15401] Updated weights for policy 0, policy_version 389340 (0.0036) [2024-06-23 07:38:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 6378995712. Throughput: 0: 42784.3. Samples: 6379117660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:28,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 07:38:30,763][15401] Updated weights for policy 0, policy_version 389350 (0.0037) [2024-06-23 07:38:33,395][15132] Fps is (10 sec: 42577.3, 60 sec: 43141.0, 300 sec: 42819.8). Total num frames: 6379208704. Throughput: 0: 42782.4. Samples: 6379377120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 07:38:33,395][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 07:38:34,780][15401] Updated weights for policy 0, policy_version 389360 (0.0035) [2024-06-23 07:38:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6379421696. Throughput: 0: 42859.6. Samples: 6379508300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:38:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 07:38:38,485][15401] Updated weights for policy 0, policy_version 389370 (0.0032) [2024-06-23 07:38:40,047][15349] Signal inference workers to stop experience collection... (94500 times) [2024-06-23 07:38:40,051][15349] Signal inference workers to resume experience collection... (94500 times) [2024-06-23 07:38:40,073][15401] InferenceWorker_p0-w0: stopping experience collection (94500 times) [2024-06-23 07:38:40,073][15401] InferenceWorker_p0-w0: resuming experience collection (94500 times) [2024-06-23 07:38:42,360][15401] Updated weights for policy 0, policy_version 389380 (0.0032) [2024-06-23 07:38:43,389][15132] Fps is (10 sec: 44259.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6379651072. Throughput: 0: 42782.2. Samples: 6379764760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:38:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 07:38:46,257][15401] Updated weights for policy 0, policy_version 389390 (0.0025) [2024-06-23 07:38:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6379864064. Throughput: 0: 42560.0. Samples: 6380014600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:38:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 07:38:49,860][15401] Updated weights for policy 0, policy_version 389400 (0.0045) [2024-06-23 07:38:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6380060672. Throughput: 0: 42552.9. Samples: 6380144300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:38:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 07:38:53,923][15401] Updated weights for policy 0, policy_version 389410 (0.0029) [2024-06-23 07:38:57,458][15401] Updated weights for policy 0, policy_version 389420 (0.0026) [2024-06-23 07:38:58,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 6380290048. Throughput: 0: 42732.3. Samples: 6380407480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:38:58,396][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 07:39:01,663][15401] Updated weights for policy 0, policy_version 389430 (0.0036) [2024-06-23 07:39:03,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 6380503040. Throughput: 0: 42733.0. Samples: 6380658600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:03,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 07:39:04,972][15401] Updated weights for policy 0, policy_version 389440 (0.0037) [2024-06-23 07:39:08,396][15132] Fps is (10 sec: 42598.5, 60 sec: 42594.0, 300 sec: 42875.2). Total num frames: 6380716032. Throughput: 0: 42948.7. Samples: 6380796200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:08,396][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 07:39:09,200][15401] Updated weights for policy 0, policy_version 389450 (0.0039) [2024-06-23 07:39:13,030][15401] Updated weights for policy 0, policy_version 389460 (0.0022) [2024-06-23 07:39:13,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6380929024. Throughput: 0: 42955.1. Samples: 6381050540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 07:39:16,754][15401] Updated weights for policy 0, policy_version 389470 (0.0026) [2024-06-23 07:39:18,389][15132] Fps is (10 sec: 44265.2, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 6381158400. Throughput: 0: 42709.7. Samples: 6381298840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 07:39:20,791][15401] Updated weights for policy 0, policy_version 389480 (0.0034) [2024-06-23 07:39:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6381371392. Throughput: 0: 42792.8. Samples: 6381433980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 07:39:24,340][15401] Updated weights for policy 0, policy_version 389490 (0.0043) [2024-06-23 07:39:28,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 6381551616. Throughput: 0: 42833.2. Samples: 6381692260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 07:39:28,569][15401] Updated weights for policy 0, policy_version 389500 (0.0032) [2024-06-23 07:39:32,254][15401] Updated weights for policy 0, policy_version 389510 (0.0026) [2024-06-23 07:39:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43148.1, 300 sec: 42931.6). Total num frames: 6381797376. Throughput: 0: 42890.5. Samples: 6381944680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:33,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 07:39:36,119][15401] Updated weights for policy 0, policy_version 389520 (0.0034) [2024-06-23 07:39:38,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6381977600. Throughput: 0: 42859.3. Samples: 6382072960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 07:39:39,824][15401] Updated weights for policy 0, policy_version 389530 (0.0026) [2024-06-23 07:39:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6382206976. Throughput: 0: 42667.3. Samples: 6382327240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:43,398][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 07:39:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389540_6382223360.pth... [2024-06-23 07:39:43,517][15401] Updated weights for policy 0, policy_version 389540 (0.0024) [2024-06-23 07:39:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000388911_6371917824.pth [2024-06-23 07:39:47,867][15401] Updated weights for policy 0, policy_version 389550 (0.0029) [2024-06-23 07:39:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6382419968. Throughput: 0: 42818.5. Samples: 6382585160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:48,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 07:39:51,269][15401] Updated weights for policy 0, policy_version 389560 (0.0033) [2024-06-23 07:39:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6382632960. Throughput: 0: 42617.1. Samples: 6382713700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 07:39:53,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 07:39:55,425][15401] Updated weights for policy 0, policy_version 389570 (0.0037) [2024-06-23 07:39:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 6382845952. Throughput: 0: 42657.0. Samples: 6382970100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:39:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 07:39:58,831][15401] Updated weights for policy 0, policy_version 389580 (0.0036) [2024-06-23 07:39:59,058][15349] Signal inference workers to stop experience collection... (94550 times) [2024-06-23 07:39:59,068][15401] InferenceWorker_p0-w0: stopping experience collection (94550 times) [2024-06-23 07:39:59,169][15349] Signal inference workers to resume experience collection... (94550 times) [2024-06-23 07:39:59,170][15401] InferenceWorker_p0-w0: resuming experience collection (94550 times) [2024-06-23 07:40:03,024][15401] Updated weights for policy 0, policy_version 389590 (0.0028) [2024-06-23 07:40:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.9, 300 sec: 42820.5). Total num frames: 6383058944. Throughput: 0: 42849.6. Samples: 6383227080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 07:40:06,684][15401] Updated weights for policy 0, policy_version 389600 (0.0041) [2024-06-23 07:40:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 6383271936. Throughput: 0: 42763.5. Samples: 6383358340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:08,392][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 07:40:10,793][15401] Updated weights for policy 0, policy_version 389610 (0.0053) [2024-06-23 07:40:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6383484928. Throughput: 0: 42645.5. Samples: 6383611300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:13,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 07:40:14,229][15401] Updated weights for policy 0, policy_version 389620 (0.0037) [2024-06-23 07:40:18,341][15401] Updated weights for policy 0, policy_version 389630 (0.0041) [2024-06-23 07:40:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6383697920. Throughput: 0: 42845.8. Samples: 6383872740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 07:40:21,887][15401] Updated weights for policy 0, policy_version 389640 (0.0039) [2024-06-23 07:40:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6383927296. Throughput: 0: 42699.4. Samples: 6383994440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 07:40:26,181][15401] Updated weights for policy 0, policy_version 389650 (0.0034) [2024-06-23 07:40:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6384140288. Throughput: 0: 42816.2. Samples: 6384253960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 07:40:29,562][15401] Updated weights for policy 0, policy_version 389660 (0.0027) [2024-06-23 07:40:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6384320512. Throughput: 0: 42925.9. Samples: 6384516820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 07:40:33,811][15401] Updated weights for policy 0, policy_version 389670 (0.0037) [2024-06-23 07:40:37,461][15401] Updated weights for policy 0, policy_version 389680 (0.0033) [2024-06-23 07:40:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6384566272. Throughput: 0: 42626.3. Samples: 6384631880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 07:40:41,288][15401] Updated weights for policy 0, policy_version 389690 (0.0034) [2024-06-23 07:40:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6384779264. Throughput: 0: 42766.2. Samples: 6384894580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 07:40:44,916][15401] Updated weights for policy 0, policy_version 389700 (0.0052) [2024-06-23 07:40:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6384959488. Throughput: 0: 42863.6. Samples: 6385155940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 07:40:48,840][15401] Updated weights for policy 0, policy_version 389710 (0.0038) [2024-06-23 07:40:52,321][15401] Updated weights for policy 0, policy_version 389720 (0.0030) [2024-06-23 07:40:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6385205248. Throughput: 0: 42592.9. Samples: 6385275020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 07:40:56,546][15401] Updated weights for policy 0, policy_version 389730 (0.0041) [2024-06-23 07:40:58,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6385434624. Throughput: 0: 42956.7. Samples: 6385544460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:40:58,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 07:40:59,756][15401] Updated weights for policy 0, policy_version 389740 (0.0027) [2024-06-23 07:41:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6385598464. Throughput: 0: 42890.3. Samples: 6385802800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 07:41:04,514][15401] Updated weights for policy 0, policy_version 389750 (0.0041) [2024-06-23 07:41:07,364][15401] Updated weights for policy 0, policy_version 389760 (0.0032) [2024-06-23 07:41:08,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6385844224. Throughput: 0: 42806.2. Samples: 6385920720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 07:41:11,974][15401] Updated weights for policy 0, policy_version 389770 (0.0023) [2024-06-23 07:41:13,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6386089984. Throughput: 0: 42909.3. Samples: 6386184880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 07:41:15,156][15401] Updated weights for policy 0, policy_version 389780 (0.0026) [2024-06-23 07:41:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6386237440. Throughput: 0: 42886.7. Samples: 6386446720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 07:41:19,076][15349] Signal inference workers to stop experience collection... (94600 times) [2024-06-23 07:41:19,131][15401] InferenceWorker_p0-w0: stopping experience collection (94600 times) [2024-06-23 07:41:19,187][15349] Signal inference workers to resume experience collection... (94600 times) [2024-06-23 07:41:19,187][15401] InferenceWorker_p0-w0: resuming experience collection (94600 times) [2024-06-23 07:41:19,647][15401] Updated weights for policy 0, policy_version 389790 (0.0032) [2024-06-23 07:41:22,735][15401] Updated weights for policy 0, policy_version 389800 (0.0033) [2024-06-23 07:41:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6386483200. Throughput: 0: 42959.1. Samples: 6386565040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 07:41:27,022][15401] Updated weights for policy 0, policy_version 389810 (0.0034) [2024-06-23 07:41:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6386696192. Throughput: 0: 42979.7. Samples: 6386828660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 07:41:30,220][15401] Updated weights for policy 0, policy_version 389820 (0.0041) [2024-06-23 07:41:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6386892800. Throughput: 0: 43024.9. Samples: 6387092060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 07:41:34,662][15401] Updated weights for policy 0, policy_version 389830 (0.0042) [2024-06-23 07:41:37,910][15401] Updated weights for policy 0, policy_version 389840 (0.0041) [2024-06-23 07:41:38,390][15132] Fps is (10 sec: 45873.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6387154944. Throughput: 0: 43111.8. Samples: 6387215060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 07:41:42,518][15401] Updated weights for policy 0, policy_version 389850 (0.0034) [2024-06-23 07:41:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6387351552. Throughput: 0: 43012.5. Samples: 6387479920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 07:41:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389854_6387367936.pth... [2024-06-23 07:41:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389226_6377078784.pth [2024-06-23 07:41:45,539][15401] Updated weights for policy 0, policy_version 389860 (0.0027) [2024-06-23 07:41:48,390][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6387548160. Throughput: 0: 42868.4. Samples: 6387731880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 07:41:50,066][15401] Updated weights for policy 0, policy_version 389870 (0.0041) [2024-06-23 07:41:53,191][15401] Updated weights for policy 0, policy_version 389880 (0.0041) [2024-06-23 07:41:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6387793920. Throughput: 0: 43040.1. Samples: 6387857520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 07:41:57,686][15401] Updated weights for policy 0, policy_version 389890 (0.0033) [2024-06-23 07:41:58,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 6388006912. Throughput: 0: 43269.6. Samples: 6388132120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:41:58,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 07:42:00,601][15401] Updated weights for policy 0, policy_version 389900 (0.0031) [2024-06-23 07:42:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6388203520. Throughput: 0: 43072.0. Samples: 6388384960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:03,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 07:42:05,211][15401] Updated weights for policy 0, policy_version 389910 (0.0028) [2024-06-23 07:42:08,056][15401] Updated weights for policy 0, policy_version 389920 (0.0039) [2024-06-23 07:42:08,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 6388449280. Throughput: 0: 43219.4. Samples: 6388510020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:08,393][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 07:42:12,974][15401] Updated weights for policy 0, policy_version 389930 (0.0032) [2024-06-23 07:42:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 6388629504. Throughput: 0: 43308.8. Samples: 6388777560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 07:42:15,587][15401] Updated weights for policy 0, policy_version 389940 (0.0049) [2024-06-23 07:42:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6388842496. Throughput: 0: 43023.5. Samples: 6389028120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:18,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 07:42:20,540][15401] Updated weights for policy 0, policy_version 389950 (0.0025) [2024-06-23 07:42:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6389088256. Throughput: 0: 43156.2. Samples: 6389157080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 07:42:23,443][15401] Updated weights for policy 0, policy_version 389960 (0.0032) [2024-06-23 07:42:28,175][15401] Updated weights for policy 0, policy_version 389970 (0.0045) [2024-06-23 07:42:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6389268480. Throughput: 0: 43064.9. Samples: 6389417840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 07:42:31,149][15401] Updated weights for policy 0, policy_version 389980 (0.0036) [2024-06-23 07:42:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6389481472. Throughput: 0: 42997.3. Samples: 6389666760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 07:42:33,391][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 07:42:35,633][15401] Updated weights for policy 0, policy_version 389990 (0.0049) [2024-06-23 07:42:36,002][15349] Signal inference workers to stop experience collection... (94650 times) [2024-06-23 07:42:36,002][15349] Signal inference workers to resume experience collection... (94650 times) [2024-06-23 07:42:36,042][15401] InferenceWorker_p0-w0: stopping experience collection (94650 times) [2024-06-23 07:42:36,042][15401] InferenceWorker_p0-w0: resuming experience collection (94650 times) [2024-06-23 07:42:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 6389710848. Throughput: 0: 43008.4. Samples: 6389792900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:42:38,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 07:42:38,951][15401] Updated weights for policy 0, policy_version 390000 (0.0029) [2024-06-23 07:42:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6389907456. Throughput: 0: 42610.2. Samples: 6390049480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:42:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 07:42:43,522][15401] Updated weights for policy 0, policy_version 390010 (0.0023) [2024-06-23 07:42:46,844][15401] Updated weights for policy 0, policy_version 390020 (0.0036) [2024-06-23 07:42:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6390136832. Throughput: 0: 42483.9. Samples: 6390296740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:42:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 07:42:51,159][15401] Updated weights for policy 0, policy_version 390030 (0.0036) [2024-06-23 07:42:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 6390349824. Throughput: 0: 42723.5. Samples: 6390432480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:42:53,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 07:42:54,378][15401] Updated weights for policy 0, policy_version 390040 (0.0024) [2024-06-23 07:42:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 6390546432. Throughput: 0: 42338.5. Samples: 6390682800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:42:58,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-23 07:42:58,974][15401] Updated weights for policy 0, policy_version 390050 (0.0036) [2024-06-23 07:43:01,995][15401] Updated weights for policy 0, policy_version 390060 (0.0043) [2024-06-23 07:43:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6390792192. Throughput: 0: 42386.1. Samples: 6390935500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:43:06,543][15401] Updated weights for policy 0, policy_version 390070 (0.0042) [2024-06-23 07:43:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42326.9, 300 sec: 42820.6). Total num frames: 6390988800. Throughput: 0: 42577.2. Samples: 6391073060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:43:09,640][15401] Updated weights for policy 0, policy_version 390080 (0.0036) [2024-06-23 07:43:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6391185408. Throughput: 0: 42292.8. Samples: 6391321020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:13,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 07:43:14,211][15401] Updated weights for policy 0, policy_version 390090 (0.0035) [2024-06-23 07:43:17,828][15401] Updated weights for policy 0, policy_version 390100 (0.0033) [2024-06-23 07:43:18,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6391414784. Throughput: 0: 42379.2. Samples: 6391573820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:18,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 07:43:21,856][15401] Updated weights for policy 0, policy_version 390110 (0.0028) [2024-06-23 07:43:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42765.4). Total num frames: 6391611392. Throughput: 0: 42493.6. Samples: 6391705120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 07:43:25,634][15401] Updated weights for policy 0, policy_version 390120 (0.0036) [2024-06-23 07:43:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42765.7). Total num frames: 6391824384. Throughput: 0: 42331.4. Samples: 6391954400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 07:43:29,849][15401] Updated weights for policy 0, policy_version 390130 (0.0023) [2024-06-23 07:43:33,233][15401] Updated weights for policy 0, policy_version 390140 (0.0031) [2024-06-23 07:43:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6392053760. Throughput: 0: 42595.1. Samples: 6392213520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:43:37,391][15401] Updated weights for policy 0, policy_version 390150 (0.0039) [2024-06-23 07:43:38,391][15132] Fps is (10 sec: 44229.7, 60 sec: 42597.1, 300 sec: 42764.8). Total num frames: 6392266752. Throughput: 0: 42436.2. Samples: 6392342180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:38,392][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 07:43:40,939][15401] Updated weights for policy 0, policy_version 390160 (0.0039) [2024-06-23 07:43:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6392479744. Throughput: 0: 42664.1. Samples: 6392602680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 07:43:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390167_6392496128.pth... [2024-06-23 07:43:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389540_6382223360.pth [2024-06-23 07:43:44,998][15401] Updated weights for policy 0, policy_version 390170 (0.0035) [2024-06-23 07:43:48,390][15132] Fps is (10 sec: 42605.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6392692736. Throughput: 0: 42599.5. Samples: 6392852480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 07:43:48,560][15401] Updated weights for policy 0, policy_version 390180 (0.0038) [2024-06-23 07:43:53,066][15401] Updated weights for policy 0, policy_version 390190 (0.0038) [2024-06-23 07:43:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.9). Total num frames: 6392905728. Throughput: 0: 42329.5. Samples: 6392977880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 18.0) [2024-06-23 07:43:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 07:43:56,138][15401] Updated weights for policy 0, policy_version 390200 (0.0038) [2024-06-23 07:43:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42766.0). Total num frames: 6393118720. Throughput: 0: 42593.0. Samples: 6393237700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:43:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 07:44:00,568][15401] Updated weights for policy 0, policy_version 390210 (0.0028) [2024-06-23 07:44:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.9). Total num frames: 6393331712. Throughput: 0: 42776.9. Samples: 6393498780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:03,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-23 07:44:03,690][15401] Updated weights for policy 0, policy_version 390220 (0.0041) [2024-06-23 07:44:05,665][15349] Signal inference workers to stop experience collection... (94700 times) [2024-06-23 07:44:05,693][15401] InferenceWorker_p0-w0: stopping experience collection (94700 times) [2024-06-23 07:44:05,724][15349] Signal inference workers to resume experience collection... (94700 times) [2024-06-23 07:44:05,727][15401] InferenceWorker_p0-w0: resuming experience collection (94700 times) [2024-06-23 07:44:07,887][15401] Updated weights for policy 0, policy_version 390230 (0.0030) [2024-06-23 07:44:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6393561088. Throughput: 0: 42659.7. Samples: 6393624800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 07:44:11,233][15401] Updated weights for policy 0, policy_version 390240 (0.0041) [2024-06-23 07:44:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6393757696. Throughput: 0: 42915.8. Samples: 6393885600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 07:44:15,252][15401] Updated weights for policy 0, policy_version 390250 (0.0041) [2024-06-23 07:44:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6393987072. Throughput: 0: 42979.6. Samples: 6394147600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 07:44:18,781][15401] Updated weights for policy 0, policy_version 390260 (0.0047) [2024-06-23 07:44:22,854][15401] Updated weights for policy 0, policy_version 390270 (0.0038) [2024-06-23 07:44:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6394200064. Throughput: 0: 42899.1. Samples: 6394272560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 07:44:26,646][15401] Updated weights for policy 0, policy_version 390280 (0.0032) [2024-06-23 07:44:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6394396672. Throughput: 0: 42772.5. Samples: 6394527440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 07:44:30,416][15401] Updated weights for policy 0, policy_version 390290 (0.0041) [2024-06-23 07:44:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6394626048. Throughput: 0: 43003.2. Samples: 6394787620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 07:44:34,545][15401] Updated weights for policy 0, policy_version 390300 (0.0031) [2024-06-23 07:44:38,173][15401] Updated weights for policy 0, policy_version 390310 (0.0028) [2024-06-23 07:44:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 6394839040. Throughput: 0: 43168.1. Samples: 6394920440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:38,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 07:44:42,104][15401] Updated weights for policy 0, policy_version 390320 (0.0029) [2024-06-23 07:44:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6395052032. Throughput: 0: 43064.4. Samples: 6395175600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 07:44:45,932][15401] Updated weights for policy 0, policy_version 390330 (0.0032) [2024-06-23 07:44:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6395281408. Throughput: 0: 42916.0. Samples: 6395430000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 07:44:49,675][15401] Updated weights for policy 0, policy_version 390340 (0.0024) [2024-06-23 07:44:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6395478016. Throughput: 0: 43049.6. Samples: 6395562040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 07:44:53,678][15401] Updated weights for policy 0, policy_version 390350 (0.0026) [2024-06-23 07:44:57,077][15401] Updated weights for policy 0, policy_version 390360 (0.0036) [2024-06-23 07:44:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6395691008. Throughput: 0: 42889.3. Samples: 6395815620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:44:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 07:45:01,135][15401] Updated weights for policy 0, policy_version 390370 (0.0038) [2024-06-23 07:45:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6395920384. Throughput: 0: 42871.1. Samples: 6396076800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:45:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 07:45:04,995][15401] Updated weights for policy 0, policy_version 390380 (0.0031) [2024-06-23 07:45:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6396116992. Throughput: 0: 42945.3. Samples: 6396205100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:45:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 07:45:08,876][15401] Updated weights for policy 0, policy_version 390390 (0.0032) [2024-06-23 07:45:12,463][15401] Updated weights for policy 0, policy_version 390400 (0.0033) [2024-06-23 07:45:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6396346368. Throughput: 0: 43059.4. Samples: 6396465120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 07:45:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 07:45:16,395][15401] Updated weights for policy 0, policy_version 390410 (0.0031) [2024-06-23 07:45:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6396575744. Throughput: 0: 42999.1. Samples: 6396722580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 07:45:20,149][15401] Updated weights for policy 0, policy_version 390420 (0.0030) [2024-06-23 07:45:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6396772352. Throughput: 0: 43011.5. Samples: 6396855960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 07:45:23,889][15401] Updated weights for policy 0, policy_version 390430 (0.0041) [2024-06-23 07:45:27,609][15401] Updated weights for policy 0, policy_version 390440 (0.0047) [2024-06-23 07:45:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6397001728. Throughput: 0: 43195.6. Samples: 6397119400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:28,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-23 07:45:31,547][15401] Updated weights for policy 0, policy_version 390450 (0.0031) [2024-06-23 07:45:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 6397231104. Throughput: 0: 43191.7. Samples: 6397373620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 07:45:35,173][15401] Updated weights for policy 0, policy_version 390460 (0.0036) [2024-06-23 07:45:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6397411328. Throughput: 0: 43145.1. Samples: 6397503560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:38,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 07:45:38,946][15401] Updated weights for policy 0, policy_version 390470 (0.0028) [2024-06-23 07:45:42,725][15401] Updated weights for policy 0, policy_version 390480 (0.0033) [2024-06-23 07:45:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6397640704. Throughput: 0: 43148.5. Samples: 6397757300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 07:45:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390481_6397640704.pth... [2024-06-23 07:45:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000389854_6387367936.pth [2024-06-23 07:45:46,509][15401] Updated weights for policy 0, policy_version 390490 (0.0041) [2024-06-23 07:45:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6397870080. Throughput: 0: 43130.7. Samples: 6398017680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 07:45:50,224][15401] Updated weights for policy 0, policy_version 390500 (0.0040) [2024-06-23 07:45:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 6398050304. Throughput: 0: 43272.8. Samples: 6398152380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 07:45:54,207][15401] Updated weights for policy 0, policy_version 390510 (0.0033) [2024-06-23 07:45:57,693][15401] Updated weights for policy 0, policy_version 390520 (0.0033) [2024-06-23 07:45:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6398279680. Throughput: 0: 43050.3. Samples: 6398402380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:45:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 07:46:01,934][15401] Updated weights for policy 0, policy_version 390530 (0.0031) [2024-06-23 07:46:03,392][15132] Fps is (10 sec: 45865.0, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 6398509056. Throughput: 0: 43141.3. Samples: 6398664040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:03,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 07:46:05,246][15401] Updated weights for policy 0, policy_version 390540 (0.0038) [2024-06-23 07:46:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6398705664. Throughput: 0: 43141.7. Samples: 6398797340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 07:46:09,482][15401] Updated weights for policy 0, policy_version 390550 (0.0036) [2024-06-23 07:46:10,284][15349] Signal inference workers to stop experience collection... (94750 times) [2024-06-23 07:46:10,284][15349] Signal inference workers to resume experience collection... (94750 times) [2024-06-23 07:46:10,293][15401] InferenceWorker_p0-w0: stopping experience collection (94750 times) [2024-06-23 07:46:10,293][15401] InferenceWorker_p0-w0: resuming experience collection (94750 times) [2024-06-23 07:46:13,334][15401] Updated weights for policy 0, policy_version 390560 (0.0044) [2024-06-23 07:46:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 6398935040. Throughput: 0: 42823.6. Samples: 6399046460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 07:46:16,979][15401] Updated weights for policy 0, policy_version 390570 (0.0041) [2024-06-23 07:46:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6399131648. Throughput: 0: 42884.4. Samples: 6399303420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 07:46:21,036][15401] Updated weights for policy 0, policy_version 390580 (0.0023) [2024-06-23 07:46:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6399344640. Throughput: 0: 42760.3. Samples: 6399427780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 07:46:24,881][15401] Updated weights for policy 0, policy_version 390590 (0.0038) [2024-06-23 07:46:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 6399574016. Throughput: 0: 42956.8. Samples: 6399690360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 07:46:28,651][15401] Updated weights for policy 0, policy_version 390600 (0.0033) [2024-06-23 07:46:32,473][15401] Updated weights for policy 0, policy_version 390610 (0.0053) [2024-06-23 07:46:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 6399787008. Throughput: 0: 42941.7. Samples: 6399950060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:33,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 07:46:36,359][15401] Updated weights for policy 0, policy_version 390620 (0.0057) [2024-06-23 07:46:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6399983616. Throughput: 0: 42826.9. Samples: 6400079580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 07:46:40,031][15401] Updated weights for policy 0, policy_version 390630 (0.0028) [2024-06-23 07:46:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6400212992. Throughput: 0: 42914.3. Samples: 6400333520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 07:46:43,939][15401] Updated weights for policy 0, policy_version 390640 (0.0033) [2024-06-23 07:46:47,671][15401] Updated weights for policy 0, policy_version 390650 (0.0038) [2024-06-23 07:46:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6400425984. Throughput: 0: 42762.7. Samples: 6400588260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 07:46:51,454][15401] Updated weights for policy 0, policy_version 390660 (0.0029) [2024-06-23 07:46:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 6400638976. Throughput: 0: 42676.5. Samples: 6400717780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:53,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 07:46:55,177][15401] Updated weights for policy 0, policy_version 390670 (0.0038) [2024-06-23 07:46:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6400851968. Throughput: 0: 42922.6. Samples: 6400977980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:46:58,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 07:46:59,167][15401] Updated weights for policy 0, policy_version 390680 (0.0030) [2024-06-23 07:47:03,141][15401] Updated weights for policy 0, policy_version 390690 (0.0034) [2024-06-23 07:47:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42600.0, 300 sec: 42765.3). Total num frames: 6401064960. Throughput: 0: 42880.7. Samples: 6401233060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 07:47:06,991][15401] Updated weights for policy 0, policy_version 390700 (0.0034) [2024-06-23 07:47:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6401277952. Throughput: 0: 42948.1. Samples: 6401360440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 07:47:10,743][15401] Updated weights for policy 0, policy_version 390710 (0.0039) [2024-06-23 07:47:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 6401490944. Throughput: 0: 42721.7. Samples: 6401612840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 07:47:14,542][15401] Updated weights for policy 0, policy_version 390720 (0.0041) [2024-06-23 07:47:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6401703936. Throughput: 0: 42616.6. Samples: 6401867800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 07:47:18,423][15401] Updated weights for policy 0, policy_version 390730 (0.0038) [2024-06-23 07:47:22,235][15401] Updated weights for policy 0, policy_version 390740 (0.0039) [2024-06-23 07:47:23,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6401916928. Throughput: 0: 42590.7. Samples: 6401996160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 07:47:25,925][15401] Updated weights for policy 0, policy_version 390750 (0.0038) [2024-06-23 07:47:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6402129920. Throughput: 0: 42575.9. Samples: 6402249440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 07:47:29,967][15401] Updated weights for policy 0, policy_version 390760 (0.0025) [2024-06-23 07:47:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6402359296. Throughput: 0: 42649.3. Samples: 6402507480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 07:47:33,486][15401] Updated weights for policy 0, policy_version 390770 (0.0022) [2024-06-23 07:47:37,385][15349] Signal inference workers to stop experience collection... (94800 times) [2024-06-23 07:47:37,405][15401] InferenceWorker_p0-w0: stopping experience collection (94800 times) [2024-06-23 07:47:37,443][15349] Signal inference workers to resume experience collection... (94800 times) [2024-06-23 07:47:37,443][15401] InferenceWorker_p0-w0: resuming experience collection (94800 times) [2024-06-23 07:47:37,578][15401] Updated weights for policy 0, policy_version 390780 (0.0026) [2024-06-23 07:47:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6402555904. Throughput: 0: 42602.9. Samples: 6402634920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 07:47:41,690][15401] Updated weights for policy 0, policy_version 390790 (0.0031) [2024-06-23 07:47:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6402752512. Throughput: 0: 42477.9. Samples: 6402889480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 07:47:43,436][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390794_6402768896.pth... [2024-06-23 07:47:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390167_6392496128.pth [2024-06-23 07:47:45,635][15401] Updated weights for policy 0, policy_version 390800 (0.0034) [2024-06-23 07:47:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6402981888. Throughput: 0: 42357.9. Samples: 6403139160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 07:47:49,316][15401] Updated weights for policy 0, policy_version 390810 (0.0037) [2024-06-23 07:47:53,252][15401] Updated weights for policy 0, policy_version 390820 (0.0031) [2024-06-23 07:47:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6403194880. Throughput: 0: 42516.8. Samples: 6403273700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:53,391][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 07:47:57,117][15401] Updated weights for policy 0, policy_version 390830 (0.0038) [2024-06-23 07:47:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6403407872. Throughput: 0: 42466.0. Samples: 6403523800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:47:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 07:48:00,806][15401] Updated weights for policy 0, policy_version 390840 (0.0023) [2024-06-23 07:48:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6403620864. Throughput: 0: 42472.0. Samples: 6403779040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 07:48:04,532][15401] Updated weights for policy 0, policy_version 390850 (0.0032) [2024-06-23 07:48:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 6403801088. Throughput: 0: 42485.3. Samples: 6403908000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 07:48:08,882][15401] Updated weights for policy 0, policy_version 390860 (0.0036) [2024-06-23 07:48:12,231][15401] Updated weights for policy 0, policy_version 390870 (0.0032) [2024-06-23 07:48:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 6404046848. Throughput: 0: 42406.2. Samples: 6404157720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 07:48:16,854][15401] Updated weights for policy 0, policy_version 390880 (0.0038) [2024-06-23 07:48:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6404259840. Throughput: 0: 42453.0. Samples: 6404417860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 07:48:19,749][15401] Updated weights for policy 0, policy_version 390890 (0.0029) [2024-06-23 07:48:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 6404440064. Throughput: 0: 42465.3. Samples: 6404545860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 07:48:24,461][15401] Updated weights for policy 0, policy_version 390900 (0.0047) [2024-06-23 07:48:27,410][15401] Updated weights for policy 0, policy_version 390910 (0.0032) [2024-06-23 07:48:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6404685824. Throughput: 0: 42503.5. Samples: 6404802140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 07:48:32,241][15401] Updated weights for policy 0, policy_version 390920 (0.0043) [2024-06-23 07:48:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42820.8). Total num frames: 6404898816. Throughput: 0: 42650.6. Samples: 6405058440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:33,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 07:48:35,042][15401] Updated weights for policy 0, policy_version 390930 (0.0029) [2024-06-23 07:48:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6405095424. Throughput: 0: 42542.7. Samples: 6405188120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 07:48:39,849][15401] Updated weights for policy 0, policy_version 390940 (0.0030) [2024-06-23 07:48:42,730][15401] Updated weights for policy 0, policy_version 390950 (0.0034) [2024-06-23 07:48:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6405341184. Throughput: 0: 42561.0. Samples: 6405439040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 07:48:47,407][15401] Updated weights for policy 0, policy_version 390960 (0.0040) [2024-06-23 07:48:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6405537792. Throughput: 0: 42718.6. Samples: 6405701380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 07:48:50,303][15401] Updated weights for policy 0, policy_version 390970 (0.0023) [2024-06-23 07:48:53,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6405734400. Throughput: 0: 42607.4. Samples: 6405825340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 07:48:55,408][15401] Updated weights for policy 0, policy_version 390980 (0.0038) [2024-06-23 07:48:57,982][15401] Updated weights for policy 0, policy_version 390990 (0.0037) [2024-06-23 07:48:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6405980160. Throughput: 0: 42704.5. Samples: 6406079420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:48:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 07:49:02,884][15401] Updated weights for policy 0, policy_version 391000 (0.0040) [2024-06-23 07:49:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6406176768. Throughput: 0: 42797.2. Samples: 6406343740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:49:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 07:49:06,036][15401] Updated weights for policy 0, policy_version 391010 (0.0037) [2024-06-23 07:49:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6406373376. Throughput: 0: 42729.4. Samples: 6406468680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:49:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 07:49:10,483][15401] Updated weights for policy 0, policy_version 391020 (0.0035) [2024-06-23 07:49:11,805][15349] Signal inference workers to stop experience collection... (94850 times) [2024-06-23 07:49:11,809][15349] Signal inference workers to resume experience collection... (94850 times) [2024-06-23 07:49:11,832][15401] InferenceWorker_p0-w0: stopping experience collection (94850 times) [2024-06-23 07:49:11,832][15401] InferenceWorker_p0-w0: resuming experience collection (94850 times) [2024-06-23 07:49:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6406619136. Throughput: 0: 42667.4. Samples: 6406722180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 07:49:13,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 07:49:13,575][15401] Updated weights for policy 0, policy_version 391030 (0.0029) [2024-06-23 07:49:18,118][15401] Updated weights for policy 0, policy_version 391040 (0.0024) [2024-06-23 07:49:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6406815744. Throughput: 0: 42854.2. Samples: 6406986980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:18,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 07:49:21,074][15401] Updated weights for policy 0, policy_version 391050 (0.0034) [2024-06-23 07:49:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6407012352. Throughput: 0: 42551.6. Samples: 6407102940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 07:49:25,754][15401] Updated weights for policy 0, policy_version 391060 (0.0034) [2024-06-23 07:49:28,394][15132] Fps is (10 sec: 45865.3, 60 sec: 43141.3, 300 sec: 42875.4). Total num frames: 6407274496. Throughput: 0: 42628.6. Samples: 6407357520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:28,395][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 07:49:28,550][15401] Updated weights for policy 0, policy_version 391070 (0.0031) [2024-06-23 07:49:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6407438336. Throughput: 0: 42771.2. Samples: 6407626080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 07:49:33,442][15401] Updated weights for policy 0, policy_version 391080 (0.0032) [2024-06-23 07:49:36,137][15401] Updated weights for policy 0, policy_version 391090 (0.0038) [2024-06-23 07:49:38,389][15132] Fps is (10 sec: 37700.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6407651328. Throughput: 0: 42551.3. Samples: 6407740140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 07:49:41,137][15401] Updated weights for policy 0, policy_version 391100 (0.0033) [2024-06-23 07:49:43,392][15132] Fps is (10 sec: 47501.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 6407913472. Throughput: 0: 42632.4. Samples: 6407997980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:43,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 07:49:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391109_6407929856.pth... [2024-06-23 07:49:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390481_6397640704.pth [2024-06-23 07:49:43,758][15401] Updated weights for policy 0, policy_version 391110 (0.0031) [2024-06-23 07:49:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 6408060928. Throughput: 0: 42761.4. Samples: 6408268000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 07:49:48,862][15401] Updated weights for policy 0, policy_version 391120 (0.0031) [2024-06-23 07:49:51,788][15401] Updated weights for policy 0, policy_version 391130 (0.0038) [2024-06-23 07:49:53,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6408290304. Throughput: 0: 42452.9. Samples: 6408379060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:53,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 07:49:56,456][15401] Updated weights for policy 0, policy_version 391140 (0.0027) [2024-06-23 07:49:58,390][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6408552448. Throughput: 0: 42634.7. Samples: 6408640740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:49:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 07:49:59,318][15401] Updated weights for policy 0, policy_version 391150 (0.0038) [2024-06-23 07:50:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6408699904. Throughput: 0: 42557.8. Samples: 6408901980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 07:50:03,998][15401] Updated weights for policy 0, policy_version 391160 (0.0037) [2024-06-23 07:50:06,640][15349] Signal inference workers to stop experience collection... (94900 times) [2024-06-23 07:50:06,644][15349] Signal inference workers to resume experience collection... (94900 times) [2024-06-23 07:50:06,668][15401] InferenceWorker_p0-w0: stopping experience collection (94900 times) [2024-06-23 07:50:06,668][15401] InferenceWorker_p0-w0: resuming experience collection (94900 times) [2024-06-23 07:50:07,782][15401] Updated weights for policy 0, policy_version 391170 (0.0032) [2024-06-23 07:50:08,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6408945664. Throughput: 0: 42585.7. Samples: 6409019300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 07:50:11,648][15401] Updated weights for policy 0, policy_version 391180 (0.0036) [2024-06-23 07:50:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 6409175040. Throughput: 0: 42695.9. Samples: 6409278640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 07:50:15,348][15401] Updated weights for policy 0, policy_version 391190 (0.0033) [2024-06-23 07:50:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42053.8, 300 sec: 42598.4). Total num frames: 6409338880. Throughput: 0: 42510.9. Samples: 6409539080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 07:50:19,442][15401] Updated weights for policy 0, policy_version 391200 (0.0027) [2024-06-23 07:50:22,913][15401] Updated weights for policy 0, policy_version 391210 (0.0037) [2024-06-23 07:50:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6409584640. Throughput: 0: 42530.2. Samples: 6409654000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:23,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 07:50:27,137][15401] Updated weights for policy 0, policy_version 391220 (0.0039) [2024-06-23 07:50:28,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42055.5, 300 sec: 42598.4). Total num frames: 6409797632. Throughput: 0: 42583.7. Samples: 6409914140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 07:50:31,099][15401] Updated weights for policy 0, policy_version 391230 (0.0045) [2024-06-23 07:50:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6409977856. Throughput: 0: 42398.7. Samples: 6410175940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 07:50:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 07:50:34,988][15401] Updated weights for policy 0, policy_version 391240 (0.0035) [2024-06-23 07:50:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6410207232. Throughput: 0: 42520.1. Samples: 6410292460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:50:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 07:50:38,820][15401] Updated weights for policy 0, policy_version 391250 (0.0043) [2024-06-23 07:50:42,542][15401] Updated weights for policy 0, policy_version 391260 (0.0032) [2024-06-23 07:50:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 6410436608. Throughput: 0: 42528.0. Samples: 6410554500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:50:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 07:50:46,312][15401] Updated weights for policy 0, policy_version 391270 (0.0029) [2024-06-23 07:50:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6410616832. Throughput: 0: 42478.6. Samples: 6410813520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:50:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 07:50:50,320][15401] Updated weights for policy 0, policy_version 391280 (0.0045) [2024-06-23 07:50:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6410862592. Throughput: 0: 42509.8. Samples: 6410932240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:50:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 07:50:54,279][15401] Updated weights for policy 0, policy_version 391290 (0.0028) [2024-06-23 07:50:57,750][15401] Updated weights for policy 0, policy_version 391300 (0.0032) [2024-06-23 07:50:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 6411075584. Throughput: 0: 42640.3. Samples: 6411197460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:50:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 07:51:01,663][15401] Updated weights for policy 0, policy_version 391310 (0.0037) [2024-06-23 07:51:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6411255808. Throughput: 0: 42540.1. Samples: 6411453380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 07:51:05,681][15401] Updated weights for policy 0, policy_version 391320 (0.0043) [2024-06-23 07:51:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6411517952. Throughput: 0: 42721.2. Samples: 6411576460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 07:51:09,499][15401] Updated weights for policy 0, policy_version 391330 (0.0038) [2024-06-23 07:51:13,274][15401] Updated weights for policy 0, policy_version 391340 (0.0031) [2024-06-23 07:51:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6411714560. Throughput: 0: 42785.6. Samples: 6411839500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 07:51:17,453][15401] Updated weights for policy 0, policy_version 391350 (0.0030) [2024-06-23 07:51:18,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 6411894784. Throughput: 0: 42586.2. Samples: 6412092320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 07:51:21,410][15401] Updated weights for policy 0, policy_version 391360 (0.0035) [2024-06-23 07:51:23,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 6412173312. Throughput: 0: 42746.7. Samples: 6412216340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:23,397][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 07:51:24,935][15401] Updated weights for policy 0, policy_version 391370 (0.0030) [2024-06-23 07:51:27,086][15349] Signal inference workers to stop experience collection... (94950 times) [2024-06-23 07:51:27,134][15401] InferenceWorker_p0-w0: stopping experience collection (94950 times) [2024-06-23 07:51:27,138][15349] Signal inference workers to resume experience collection... (94950 times) [2024-06-23 07:51:27,150][15401] InferenceWorker_p0-w0: resuming experience collection (94950 times) [2024-06-23 07:51:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6412353536. Throughput: 0: 42816.0. Samples: 6412481220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 07:51:28,914][15401] Updated weights for policy 0, policy_version 391380 (0.0037) [2024-06-23 07:51:32,461][15401] Updated weights for policy 0, policy_version 391390 (0.0035) [2024-06-23 07:51:33,389][15132] Fps is (10 sec: 37707.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6412550144. Throughput: 0: 42666.4. Samples: 6412733500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 07:51:36,364][15401] Updated weights for policy 0, policy_version 391400 (0.0039) [2024-06-23 07:51:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 6412812288. Throughput: 0: 42895.6. Samples: 6412862540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 07:51:39,937][15401] Updated weights for policy 0, policy_version 391410 (0.0035) [2024-06-23 07:51:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6412992512. Throughput: 0: 42717.4. Samples: 6413119740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 07:51:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391419_6413008896.pth... [2024-06-23 07:51:43,520][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000390794_6402768896.pth [2024-06-23 07:51:44,077][15401] Updated weights for policy 0, policy_version 391420 (0.0034) [2024-06-23 07:51:47,747][15401] Updated weights for policy 0, policy_version 391430 (0.0024) [2024-06-23 07:51:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 6413205504. Throughput: 0: 42616.1. Samples: 6413371100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 07:51:51,780][15401] Updated weights for policy 0, policy_version 391440 (0.0026) [2024-06-23 07:51:53,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6413434880. Throughput: 0: 42891.4. Samples: 6413506580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-23 07:51:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 07:51:55,176][15401] Updated weights for policy 0, policy_version 391450 (0.0037) [2024-06-23 07:51:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6413615104. Throughput: 0: 42805.0. Samples: 6413765720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:51:58,395][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 07:51:59,413][15401] Updated weights for policy 0, policy_version 391460 (0.0038) [2024-06-23 07:52:02,707][15401] Updated weights for policy 0, policy_version 391470 (0.0035) [2024-06-23 07:52:03,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 6413860864. Throughput: 0: 42818.7. Samples: 6414019160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 07:52:06,767][15401] Updated weights for policy 0, policy_version 391480 (0.0031) [2024-06-23 07:52:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6414057472. Throughput: 0: 43021.7. Samples: 6414152040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 07:52:10,445][15401] Updated weights for policy 0, policy_version 391490 (0.0029) [2024-06-23 07:52:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6414270464. Throughput: 0: 42790.3. Samples: 6414406780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 07:52:14,267][15401] Updated weights for policy 0, policy_version 391500 (0.0030) [2024-06-23 07:52:17,897][15401] Updated weights for policy 0, policy_version 391510 (0.0032) [2024-06-23 07:52:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 6414516224. Throughput: 0: 43039.2. Samples: 6414670260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 07:52:21,899][15401] Updated weights for policy 0, policy_version 391520 (0.0035) [2024-06-23 07:52:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 6414712832. Throughput: 0: 43227.6. Samples: 6414807780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 07:52:25,326][15401] Updated weights for policy 0, policy_version 391530 (0.0032) [2024-06-23 07:52:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 6414925824. Throughput: 0: 43089.8. Samples: 6415058780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 07:52:29,503][15401] Updated weights for policy 0, policy_version 391540 (0.0028) [2024-06-23 07:52:33,147][15401] Updated weights for policy 0, policy_version 391550 (0.0043) [2024-06-23 07:52:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 6415171584. Throughput: 0: 43251.0. Samples: 6415317400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 07:52:37,215][15401] Updated weights for policy 0, policy_version 391560 (0.0031) [2024-06-23 07:52:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 6415368192. Throughput: 0: 43150.4. Samples: 6415448440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:38,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 07:52:40,692][15401] Updated weights for policy 0, policy_version 391570 (0.0037) [2024-06-23 07:52:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6415581184. Throughput: 0: 42994.7. Samples: 6415700480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 07:52:44,962][15401] Updated weights for policy 0, policy_version 391580 (0.0043) [2024-06-23 07:52:46,661][15349] Signal inference workers to stop experience collection... (95000 times) [2024-06-23 07:52:46,661][15349] Signal inference workers to resume experience collection... (95000 times) [2024-06-23 07:52:46,680][15401] InferenceWorker_p0-w0: stopping experience collection (95000 times) [2024-06-23 07:52:46,681][15401] InferenceWorker_p0-w0: resuming experience collection (95000 times) [2024-06-23 07:52:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6415794176. Throughput: 0: 43000.9. Samples: 6415954200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 07:52:48,471][15401] Updated weights for policy 0, policy_version 391590 (0.0029) [2024-06-23 07:52:52,765][15401] Updated weights for policy 0, policy_version 391600 (0.0028) [2024-06-23 07:52:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 6415990784. Throughput: 0: 42912.4. Samples: 6416083100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 07:52:56,006][15401] Updated weights for policy 0, policy_version 391610 (0.0027) [2024-06-23 07:52:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 6416236544. Throughput: 0: 42950.7. Samples: 6416339560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:52:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 07:53:00,258][15401] Updated weights for policy 0, policy_version 391620 (0.0037) [2024-06-23 07:53:03,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 6416433152. Throughput: 0: 42777.3. Samples: 6416595520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:53:03,397][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 07:53:03,534][15401] Updated weights for policy 0, policy_version 391630 (0.0045) [2024-06-23 07:53:07,780][15401] Updated weights for policy 0, policy_version 391640 (0.0027) [2024-06-23 07:53:08,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 6416629760. Throughput: 0: 42438.2. Samples: 6416717600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:53:08,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 07:53:11,180][15401] Updated weights for policy 0, policy_version 391650 (0.0038) [2024-06-23 07:53:13,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6416875520. Throughput: 0: 42658.6. Samples: 6416978420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 07:53:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 07:53:15,468][15401] Updated weights for policy 0, policy_version 391660 (0.0022) [2024-06-23 07:53:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6417072128. Throughput: 0: 42611.2. Samples: 6417234900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 07:53:18,922][15401] Updated weights for policy 0, policy_version 391670 (0.0030) [2024-06-23 07:53:23,104][15401] Updated weights for policy 0, policy_version 391680 (0.0039) [2024-06-23 07:53:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6417285120. Throughput: 0: 42469.3. Samples: 6417359460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 07:53:26,697][15401] Updated weights for policy 0, policy_version 391690 (0.0036) [2024-06-23 07:53:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6417514496. Throughput: 0: 42627.0. Samples: 6417618700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:28,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 07:53:30,637][15401] Updated weights for policy 0, policy_version 391700 (0.0030) [2024-06-23 07:53:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6417711104. Throughput: 0: 42659.0. Samples: 6417873860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 07:53:34,641][15401] Updated weights for policy 0, policy_version 391710 (0.0037) [2024-06-23 07:53:38,136][15401] Updated weights for policy 0, policy_version 391720 (0.0038) [2024-06-23 07:53:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42871.4, 300 sec: 42709.1). Total num frames: 6417940480. Throughput: 0: 42643.9. Samples: 6418002180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:38,393][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 07:53:42,760][15401] Updated weights for policy 0, policy_version 391730 (0.0030) [2024-06-23 07:53:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6418153472. Throughput: 0: 42647.4. Samples: 6418258700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:43,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 07:53:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391733_6418153472.pth... [2024-06-23 07:53:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391109_6407929856.pth [2024-06-23 07:53:45,682][15401] Updated weights for policy 0, policy_version 391740 (0.0034) [2024-06-23 07:53:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6418350080. Throughput: 0: 42659.0. Samples: 6418514900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 07:53:50,280][15401] Updated weights for policy 0, policy_version 391750 (0.0030) [2024-06-23 07:53:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6418579456. Throughput: 0: 42674.7. Samples: 6418637860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 07:53:53,666][15401] Updated weights for policy 0, policy_version 391760 (0.0046) [2024-06-23 07:53:55,528][15349] Signal inference workers to stop experience collection... (95050 times) [2024-06-23 07:53:55,530][15349] Signal inference workers to resume experience collection... (95050 times) [2024-06-23 07:53:55,575][15401] InferenceWorker_p0-w0: stopping experience collection (95050 times) [2024-06-23 07:53:55,575][15401] InferenceWorker_p0-w0: resuming experience collection (95050 times) [2024-06-23 07:53:57,927][15401] Updated weights for policy 0, policy_version 391770 (0.0031) [2024-06-23 07:53:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6418792448. Throughput: 0: 42634.7. Samples: 6418896980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:53:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 07:54:01,270][15401] Updated weights for policy 0, policy_version 391780 (0.0041) [2024-06-23 07:54:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 6419005440. Throughput: 0: 42788.3. Samples: 6419160380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 07:54:05,345][15401] Updated weights for policy 0, policy_version 391790 (0.0030) [2024-06-23 07:54:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 6419202048. Throughput: 0: 42774.7. Samples: 6419284420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:08,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 07:54:09,052][15401] Updated weights for policy 0, policy_version 391800 (0.0038) [2024-06-23 07:54:12,970][15401] Updated weights for policy 0, policy_version 391810 (0.0033) [2024-06-23 07:54:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6419431424. Throughput: 0: 42769.4. Samples: 6419543320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:54:16,789][15401] Updated weights for policy 0, policy_version 391820 (0.0043) [2024-06-23 07:54:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6419611648. Throughput: 0: 42969.9. Samples: 6419807500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:54:20,602][15401] Updated weights for policy 0, policy_version 391830 (0.0026) [2024-06-23 07:54:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42599.1). Total num frames: 6419841024. Throughput: 0: 42763.3. Samples: 6419926420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 07:54:24,369][15401] Updated weights for policy 0, policy_version 391840 (0.0035) [2024-06-23 07:54:28,245][15401] Updated weights for policy 0, policy_version 391850 (0.0038) [2024-06-23 07:54:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6420070400. Throughput: 0: 42795.3. Samples: 6420184480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 07:54:32,070][15401] Updated weights for policy 0, policy_version 391860 (0.0033) [2024-06-23 07:54:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6420250624. Throughput: 0: 42962.5. Samples: 6420448220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-23 07:54:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 07:54:35,896][15401] Updated weights for policy 0, policy_version 391870 (0.0029) [2024-06-23 07:54:38,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42595.6, 300 sec: 42653.4). Total num frames: 6420496384. Throughput: 0: 42914.9. Samples: 6420569300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:54:38,396][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 07:54:39,719][15401] Updated weights for policy 0, policy_version 391880 (0.0041) [2024-06-23 07:54:43,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 6420692992. Throughput: 0: 42785.7. Samples: 6420822440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:54:43,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 07:54:43,712][15401] Updated weights for policy 0, policy_version 391890 (0.0028) [2024-06-23 07:54:48,141][15401] Updated weights for policy 0, policy_version 391900 (0.0028) [2024-06-23 07:54:48,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6420905984. Throughput: 0: 42718.3. Samples: 6421082700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:54:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 07:54:51,282][15401] Updated weights for policy 0, policy_version 391910 (0.0027) [2024-06-23 07:54:53,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6421135360. Throughput: 0: 42740.0. Samples: 6421207620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:54:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 07:54:55,609][15401] Updated weights for policy 0, policy_version 391920 (0.0030) [2024-06-23 07:54:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6421348352. Throughput: 0: 42628.4. Samples: 6421461600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:54:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 07:54:58,897][15401] Updated weights for policy 0, policy_version 391930 (0.0037) [2024-06-23 07:55:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6421528576. Throughput: 0: 42544.4. Samples: 6421722000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:03,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-23 07:55:03,417][15401] Updated weights for policy 0, policy_version 391940 (0.0032) [2024-06-23 07:55:06,449][15401] Updated weights for policy 0, policy_version 391950 (0.0029) [2024-06-23 07:55:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 6421757952. Throughput: 0: 42670.5. Samples: 6421846600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 07:55:10,027][15349] Signal inference workers to stop experience collection... (95100 times) [2024-06-23 07:55:10,028][15349] Signal inference workers to resume experience collection... (95100 times) [2024-06-23 07:55:10,052][15401] InferenceWorker_p0-w0: stopping experience collection (95100 times) [2024-06-23 07:55:10,052][15401] InferenceWorker_p0-w0: resuming experience collection (95100 times) [2024-06-23 07:55:11,303][15401] Updated weights for policy 0, policy_version 391960 (0.0041) [2024-06-23 07:55:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6421987328. Throughput: 0: 42589.7. Samples: 6422101020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 07:55:14,170][15401] Updated weights for policy 0, policy_version 391970 (0.0044) [2024-06-23 07:55:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6422167552. Throughput: 0: 42490.0. Samples: 6422360260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 07:55:18,811][15401] Updated weights for policy 0, policy_version 391980 (0.0035) [2024-06-23 07:55:21,671][15401] Updated weights for policy 0, policy_version 391990 (0.0031) [2024-06-23 07:55:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6422413312. Throughput: 0: 42554.5. Samples: 6422483980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 07:55:26,250][15401] Updated weights for policy 0, policy_version 392000 (0.0037) [2024-06-23 07:55:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6422626304. Throughput: 0: 42759.7. Samples: 6422746520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 07:55:29,214][15401] Updated weights for policy 0, policy_version 392010 (0.0032) [2024-06-23 07:55:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6422822912. Throughput: 0: 42776.4. Samples: 6423007640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 07:55:33,700][15401] Updated weights for policy 0, policy_version 392020 (0.0033) [2024-06-23 07:55:36,987][15401] Updated weights for policy 0, policy_version 392030 (0.0040) [2024-06-23 07:55:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 6423052288. Throughput: 0: 42722.2. Samples: 6423130120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 07:55:41,515][15401] Updated weights for policy 0, policy_version 392040 (0.0037) [2024-06-23 07:55:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 6423265280. Throughput: 0: 42766.8. Samples: 6423386100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 07:55:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392046_6423281664.pth... [2024-06-23 07:55:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391419_6413008896.pth [2024-06-23 07:55:44,650][15401] Updated weights for policy 0, policy_version 392050 (0.0045) [2024-06-23 07:55:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6423445504. Throughput: 0: 42854.2. Samples: 6423650440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 07:55:49,168][15401] Updated weights for policy 0, policy_version 392060 (0.0037) [2024-06-23 07:55:52,383][15401] Updated weights for policy 0, policy_version 392070 (0.0032) [2024-06-23 07:55:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6423691264. Throughput: 0: 42708.4. Samples: 6423768480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 07:55:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 07:55:56,737][15401] Updated weights for policy 0, policy_version 392080 (0.0039) [2024-06-23 07:55:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6423920640. Throughput: 0: 42760.4. Samples: 6424025240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:55:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 07:56:00,513][15401] Updated weights for policy 0, policy_version 392090 (0.0028) [2024-06-23 07:56:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6424100864. Throughput: 0: 42751.0. Samples: 6424284060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 07:56:04,229][15401] Updated weights for policy 0, policy_version 392100 (0.0031) [2024-06-23 07:56:08,173][15401] Updated weights for policy 0, policy_version 392110 (0.0035) [2024-06-23 07:56:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6424330240. Throughput: 0: 42775.1. Samples: 6424408860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 07:56:11,852][15401] Updated weights for policy 0, policy_version 392120 (0.0043) [2024-06-23 07:56:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6424543232. Throughput: 0: 42462.2. Samples: 6424657320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 07:56:16,166][15401] Updated weights for policy 0, policy_version 392130 (0.0041) [2024-06-23 07:56:18,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42870.6, 300 sec: 42599.2). Total num frames: 6424739840. Throughput: 0: 42407.4. Samples: 6424916020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:18,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 07:56:19,705][15401] Updated weights for policy 0, policy_version 392140 (0.0045) [2024-06-23 07:56:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6424952832. Throughput: 0: 42540.0. Samples: 6425044420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 07:56:23,756][15401] Updated weights for policy 0, policy_version 392150 (0.0028) [2024-06-23 07:56:27,423][15401] Updated weights for policy 0, policy_version 392160 (0.0040) [2024-06-23 07:56:28,389][15132] Fps is (10 sec: 45880.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6425198592. Throughput: 0: 42600.4. Samples: 6425303120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 07:56:31,414][15401] Updated weights for policy 0, policy_version 392170 (0.0028) [2024-06-23 07:56:31,989][15349] Signal inference workers to stop experience collection... (95150 times) [2024-06-23 07:56:32,016][15401] InferenceWorker_p0-w0: stopping experience collection (95150 times) [2024-06-23 07:56:32,058][15349] Signal inference workers to resume experience collection... (95150 times) [2024-06-23 07:56:32,058][15401] InferenceWorker_p0-w0: resuming experience collection (95150 times) [2024-06-23 07:56:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6425378816. Throughput: 0: 42418.3. Samples: 6425559260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 07:56:34,967][15401] Updated weights for policy 0, policy_version 392180 (0.0024) [2024-06-23 07:56:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6425608192. Throughput: 0: 42499.2. Samples: 6425680940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 07:56:39,034][15401] Updated weights for policy 0, policy_version 392190 (0.0036) [2024-06-23 07:56:42,550][15401] Updated weights for policy 0, policy_version 392200 (0.0034) [2024-06-23 07:56:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6425837568. Throughput: 0: 42579.5. Samples: 6425941320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 07:56:46,908][15401] Updated weights for policy 0, policy_version 392210 (0.0036) [2024-06-23 07:56:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6426034176. Throughput: 0: 42545.4. Samples: 6426198600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 07:56:50,502][15401] Updated weights for policy 0, policy_version 392220 (0.0033) [2024-06-23 07:56:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6426247168. Throughput: 0: 42625.8. Samples: 6426327020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:56:54,467][15401] Updated weights for policy 0, policy_version 392230 (0.0034) [2024-06-23 07:56:58,055][15401] Updated weights for policy 0, policy_version 392240 (0.0037) [2024-06-23 07:56:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6426460160. Throughput: 0: 42825.4. Samples: 6426584460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:56:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 07:57:02,411][15401] Updated weights for policy 0, policy_version 392250 (0.0038) [2024-06-23 07:57:03,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6426673152. Throughput: 0: 42875.5. Samples: 6426845380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:57:03,392][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 07:57:05,631][15401] Updated weights for policy 0, policy_version 392260 (0.0034) [2024-06-23 07:57:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6426886144. Throughput: 0: 42816.9. Samples: 6426971180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:57:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 07:57:10,006][15401] Updated weights for policy 0, policy_version 392270 (0.0040) [2024-06-23 07:57:13,384][15401] Updated weights for policy 0, policy_version 392280 (0.0032) [2024-06-23 07:57:13,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6427115520. Throughput: 0: 42734.7. Samples: 6427226180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-23 07:57:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 07:57:17,783][15401] Updated weights for policy 0, policy_version 392290 (0.0030) [2024-06-23 07:57:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.2, 300 sec: 42653.9). Total num frames: 6427295744. Throughput: 0: 42697.8. Samples: 6427480660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 07:57:21,035][15401] Updated weights for policy 0, policy_version 392300 (0.0026) [2024-06-23 07:57:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6427525120. Throughput: 0: 42641.4. Samples: 6427599800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 07:57:25,667][15401] Updated weights for policy 0, policy_version 392310 (0.0036) [2024-06-23 07:57:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6427754496. Throughput: 0: 42499.1. Samples: 6427853780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 07:57:28,656][15401] Updated weights for policy 0, policy_version 392320 (0.0036) [2024-06-23 07:57:33,268][15401] Updated weights for policy 0, policy_version 392330 (0.0036) [2024-06-23 07:57:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 6427934720. Throughput: 0: 42556.7. Samples: 6428113660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 07:57:36,326][15401] Updated weights for policy 0, policy_version 392340 (0.0045) [2024-06-23 07:57:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6428147712. Throughput: 0: 42373.2. Samples: 6428233820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 07:57:40,755][15401] Updated weights for policy 0, policy_version 392350 (0.0040) [2024-06-23 07:57:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6428377088. Throughput: 0: 42391.9. Samples: 6428492100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 07:57:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392358_6428393472.pth... [2024-06-23 07:57:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000391733_6418153472.pth [2024-06-23 07:57:43,922][15401] Updated weights for policy 0, policy_version 392360 (0.0022) [2024-06-23 07:57:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6428573696. Throughput: 0: 42387.1. Samples: 6428752800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 07:57:48,864][15401] Updated weights for policy 0, policy_version 392370 (0.0034) [2024-06-23 07:57:51,522][15401] Updated weights for policy 0, policy_version 392380 (0.0036) [2024-06-23 07:57:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6428803072. Throughput: 0: 42268.5. Samples: 6428873260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 07:57:56,367][15401] Updated weights for policy 0, policy_version 392390 (0.0040) [2024-06-23 07:57:57,206][15349] Signal inference workers to stop experience collection... (95200 times) [2024-06-23 07:57:57,213][15349] Signal inference workers to resume experience collection... (95200 times) [2024-06-23 07:57:57,248][15401] InferenceWorker_p0-w0: stopping experience collection (95200 times) [2024-06-23 07:57:57,248][15401] InferenceWorker_p0-w0: resuming experience collection (95200 times) [2024-06-23 07:57:58,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 6429032448. Throughput: 0: 42536.9. Samples: 6429140340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:57:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 07:57:59,031][15401] Updated weights for policy 0, policy_version 392400 (0.0034) [2024-06-23 07:58:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 6429212672. Throughput: 0: 42604.9. Samples: 6429397880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:03,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 07:58:03,894][15401] Updated weights for policy 0, policy_version 392410 (0.0043) [2024-06-23 07:58:07,147][15401] Updated weights for policy 0, policy_version 392420 (0.0037) [2024-06-23 07:58:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6429458432. Throughput: 0: 42683.9. Samples: 6429520580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 07:58:11,218][15401] Updated weights for policy 0, policy_version 392430 (0.0033) [2024-06-23 07:58:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6429671424. Throughput: 0: 42856.5. Samples: 6429782320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 07:58:14,942][15401] Updated weights for policy 0, policy_version 392440 (0.0029) [2024-06-23 07:58:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6429868032. Throughput: 0: 42923.6. Samples: 6430045220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 07:58:18,829][15401] Updated weights for policy 0, policy_version 392450 (0.0032) [2024-06-23 07:58:22,308][15401] Updated weights for policy 0, policy_version 392460 (0.0042) [2024-06-23 07:58:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6430113792. Throughput: 0: 42965.7. Samples: 6430167280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 07:58:26,274][15401] Updated weights for policy 0, policy_version 392470 (0.0034) [2024-06-23 07:58:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6430310400. Throughput: 0: 43192.5. Samples: 6430435760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 07:58:29,732][15401] Updated weights for policy 0, policy_version 392480 (0.0035) [2024-06-23 07:58:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42709.8). Total num frames: 6430539776. Throughput: 0: 43075.2. Samples: 6430691180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 07:58:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 07:58:33,680][15401] Updated weights for policy 0, policy_version 392490 (0.0025) [2024-06-23 07:58:37,419][15401] Updated weights for policy 0, policy_version 392500 (0.0038) [2024-06-23 07:58:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6430736384. Throughput: 0: 43211.6. Samples: 6430817780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:58:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 07:58:41,553][15401] Updated weights for policy 0, policy_version 392510 (0.0043) [2024-06-23 07:58:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6430949376. Throughput: 0: 42919.5. Samples: 6431071720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:58:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 07:58:45,068][15401] Updated weights for policy 0, policy_version 392520 (0.0026) [2024-06-23 07:58:48,395][15132] Fps is (10 sec: 40936.6, 60 sec: 42867.6, 300 sec: 42597.6). Total num frames: 6431145984. Throughput: 0: 43107.9. Samples: 6431337980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:58:48,396][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 07:58:49,064][15401] Updated weights for policy 0, policy_version 392530 (0.0041) [2024-06-23 07:58:52,805][15401] Updated weights for policy 0, policy_version 392540 (0.0038) [2024-06-23 07:58:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6431391744. Throughput: 0: 43004.4. Samples: 6431455780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:58:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 07:58:56,803][15401] Updated weights for policy 0, policy_version 392550 (0.0036) [2024-06-23 07:58:58,390][15132] Fps is (10 sec: 44261.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6431588352. Throughput: 0: 42836.9. Samples: 6431709980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:58:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 07:59:00,469][15401] Updated weights for policy 0, policy_version 392560 (0.0037) [2024-06-23 07:59:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 6431784960. Throughput: 0: 42906.3. Samples: 6431976000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 07:59:04,394][15401] Updated weights for policy 0, policy_version 392570 (0.0027) [2024-06-23 07:59:08,026][15401] Updated weights for policy 0, policy_version 392580 (0.0035) [2024-06-23 07:59:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6432030720. Throughput: 0: 43089.9. Samples: 6432106320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 07:59:12,089][15401] Updated weights for policy 0, policy_version 392590 (0.0037) [2024-06-23 07:59:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6432243712. Throughput: 0: 42769.7. Samples: 6432360400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 07:59:15,981][15401] Updated weights for policy 0, policy_version 392600 (0.0047) [2024-06-23 07:59:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6432440320. Throughput: 0: 42856.5. Samples: 6432619720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 07:59:19,884][15401] Updated weights for policy 0, policy_version 392610 (0.0024) [2024-06-23 07:59:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6432669696. Throughput: 0: 42833.7. Samples: 6432745300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 07:59:23,562][15401] Updated weights for policy 0, policy_version 392620 (0.0040) [2024-06-23 07:59:27,363][15349] Signal inference workers to stop experience collection... (95250 times) [2024-06-23 07:59:27,409][15401] InferenceWorker_p0-w0: stopping experience collection (95250 times) [2024-06-23 07:59:27,418][15349] Signal inference workers to resume experience collection... (95250 times) [2024-06-23 07:59:27,428][15401] InferenceWorker_p0-w0: resuming experience collection (95250 times) [2024-06-23 07:59:27,431][15401] Updated weights for policy 0, policy_version 392630 (0.0027) [2024-06-23 07:59:28,395][15132] Fps is (10 sec: 45849.2, 60 sec: 43140.5, 300 sec: 42875.3). Total num frames: 6432899072. Throughput: 0: 42995.5. Samples: 6433006760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:28,396][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 07:59:31,239][15401] Updated weights for policy 0, policy_version 392640 (0.0037) [2024-06-23 07:59:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 6433095680. Throughput: 0: 42837.0. Samples: 6433265400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 07:59:35,098][15401] Updated weights for policy 0, policy_version 392650 (0.0029) [2024-06-23 07:59:38,389][15132] Fps is (10 sec: 40983.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 6433308672. Throughput: 0: 43038.8. Samples: 6433392520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 07:59:38,868][15401] Updated weights for policy 0, policy_version 392660 (0.0037) [2024-06-23 07:59:42,755][15401] Updated weights for policy 0, policy_version 392670 (0.0041) [2024-06-23 07:59:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6433521664. Throughput: 0: 42922.2. Samples: 6433641480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 07:59:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392671_6433521664.pth... [2024-06-23 07:59:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392046_6423281664.pth [2024-06-23 07:59:46,629][15401] Updated weights for policy 0, policy_version 392680 (0.0040) [2024-06-23 07:59:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42875.5, 300 sec: 42654.0). Total num frames: 6433718272. Throughput: 0: 42727.6. Samples: 6433898740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:48,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 07:59:50,318][15401] Updated weights for policy 0, policy_version 392690 (0.0029) [2024-06-23 07:59:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6433947648. Throughput: 0: 42706.1. Samples: 6434028100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 07:59:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 07:59:54,378][15401] Updated weights for policy 0, policy_version 392700 (0.0023) [2024-06-23 07:59:57,904][15401] Updated weights for policy 0, policy_version 392710 (0.0036) [2024-06-23 07:59:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6434177024. Throughput: 0: 42780.4. Samples: 6434285520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 07:59:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 08:00:02,279][15401] Updated weights for policy 0, policy_version 392720 (0.0034) [2024-06-23 08:00:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6434357248. Throughput: 0: 42702.2. Samples: 6434541320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 08:00:05,659][15401] Updated weights for policy 0, policy_version 392730 (0.0028) [2024-06-23 08:00:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 6434586624. Throughput: 0: 42689.7. Samples: 6434666440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:08,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 08:00:09,838][15401] Updated weights for policy 0, policy_version 392740 (0.0033) [2024-06-23 08:00:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6434799616. Throughput: 0: 42658.3. Samples: 6434926140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 08:00:13,489][15401] Updated weights for policy 0, policy_version 392750 (0.0037) [2024-06-23 08:00:17,298][15401] Updated weights for policy 0, policy_version 392760 (0.0032) [2024-06-23 08:00:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6435012608. Throughput: 0: 42664.4. Samples: 6435185300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 08:00:21,219][15401] Updated weights for policy 0, policy_version 392770 (0.0043) [2024-06-23 08:00:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6435241984. Throughput: 0: 42673.6. Samples: 6435312840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 08:00:24,854][15401] Updated weights for policy 0, policy_version 392780 (0.0037) [2024-06-23 08:00:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42602.4, 300 sec: 42820.6). Total num frames: 6435454976. Throughput: 0: 43106.7. Samples: 6435581280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 08:00:28,628][15401] Updated weights for policy 0, policy_version 392790 (0.0036) [2024-06-23 08:00:32,335][15401] Updated weights for policy 0, policy_version 392800 (0.0041) [2024-06-23 08:00:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6435651584. Throughput: 0: 42900.0. Samples: 6435829240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 08:00:36,096][15401] Updated weights for policy 0, policy_version 392810 (0.0040) [2024-06-23 08:00:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6435880960. Throughput: 0: 42801.3. Samples: 6435954160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 08:00:39,860][15401] Updated weights for policy 0, policy_version 392820 (0.0031) [2024-06-23 08:00:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6436093952. Throughput: 0: 43015.2. Samples: 6436221200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 08:00:43,776][15401] Updated weights for policy 0, policy_version 392830 (0.0046) [2024-06-23 08:00:47,308][15401] Updated weights for policy 0, policy_version 392840 (0.0031) [2024-06-23 08:00:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6436306944. Throughput: 0: 43043.0. Samples: 6436478260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 08:00:51,323][15401] Updated weights for policy 0, policy_version 392850 (0.0045) [2024-06-23 08:00:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6436519936. Throughput: 0: 43081.1. Samples: 6436604980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 08:00:54,869][15401] Updated weights for policy 0, policy_version 392860 (0.0038) [2024-06-23 08:00:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6436732928. Throughput: 0: 43094.0. Samples: 6436865380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:00:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 08:00:59,141][15401] Updated weights for policy 0, policy_version 392870 (0.0028) [2024-06-23 08:00:59,574][15349] Signal inference workers to stop experience collection... (95300 times) [2024-06-23 08:00:59,575][15349] Signal inference workers to resume experience collection... (95300 times) [2024-06-23 08:00:59,600][15401] InferenceWorker_p0-w0: stopping experience collection (95300 times) [2024-06-23 08:00:59,600][15401] InferenceWorker_p0-w0: resuming experience collection (95300 times) [2024-06-23 08:01:02,414][15401] Updated weights for policy 0, policy_version 392880 (0.0035) [2024-06-23 08:01:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6436945920. Throughput: 0: 43024.9. Samples: 6437121420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:01:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 08:01:06,643][15401] Updated weights for policy 0, policy_version 392890 (0.0032) [2024-06-23 08:01:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 6437175296. Throughput: 0: 43092.9. Samples: 6437252020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:01:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 08:01:10,355][15401] Updated weights for policy 0, policy_version 392900 (0.0035) [2024-06-23 08:01:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 6437371904. Throughput: 0: 42850.8. Samples: 6437509560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 08:01:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 08:01:14,101][15401] Updated weights for policy 0, policy_version 392910 (0.0034) [2024-06-23 08:01:18,321][15401] Updated weights for policy 0, policy_version 392920 (0.0032) [2024-06-23 08:01:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6437601280. Throughput: 0: 43067.0. Samples: 6437767260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:18,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 08:01:21,768][15401] Updated weights for policy 0, policy_version 392930 (0.0022) [2024-06-23 08:01:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6437814272. Throughput: 0: 43223.2. Samples: 6437899200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 08:01:25,799][15401] Updated weights for policy 0, policy_version 392940 (0.0051) [2024-06-23 08:01:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6438010880. Throughput: 0: 42882.6. Samples: 6438150920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 08:01:29,271][15401] Updated weights for policy 0, policy_version 392950 (0.0034) [2024-06-23 08:01:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 6438240256. Throughput: 0: 42773.3. Samples: 6438403160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:33,392][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 08:01:33,600][15401] Updated weights for policy 0, policy_version 392960 (0.0031) [2024-06-23 08:01:37,127][15401] Updated weights for policy 0, policy_version 392970 (0.0047) [2024-06-23 08:01:38,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 6438469632. Throughput: 0: 42928.3. Samples: 6438536860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:38,392][15132] Avg episode reward: [(0, '0.185')] [2024-06-23 08:01:41,214][15401] Updated weights for policy 0, policy_version 392980 (0.0047) [2024-06-23 08:01:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6438649856. Throughput: 0: 42882.3. Samples: 6438795080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:43,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 08:01:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392985_6438666240.pth... [2024-06-23 08:01:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392358_6428393472.pth [2024-06-23 08:01:44,710][15401] Updated weights for policy 0, policy_version 392990 (0.0038) [2024-06-23 08:01:48,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6438879232. Throughput: 0: 42777.1. Samples: 6439046400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 08:01:48,951][15401] Updated weights for policy 0, policy_version 393000 (0.0044) [2024-06-23 08:01:52,590][15401] Updated weights for policy 0, policy_version 393010 (0.0032) [2024-06-23 08:01:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6439108608. Throughput: 0: 42685.0. Samples: 6439172840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 08:01:56,642][15401] Updated weights for policy 0, policy_version 393020 (0.0028) [2024-06-23 08:01:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6439305216. Throughput: 0: 42600.7. Samples: 6439426600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:01:58,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 08:02:00,370][15401] Updated weights for policy 0, policy_version 393030 (0.0037) [2024-06-23 08:02:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6439518208. Throughput: 0: 42632.4. Samples: 6439685720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 08:02:04,305][15401] Updated weights for policy 0, policy_version 393040 (0.0046) [2024-06-23 08:02:07,990][15401] Updated weights for policy 0, policy_version 393050 (0.0032) [2024-06-23 08:02:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6439747584. Throughput: 0: 42600.4. Samples: 6439816220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 08:02:12,116][15401] Updated weights for policy 0, policy_version 393060 (0.0032) [2024-06-23 08:02:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6439944192. Throughput: 0: 42660.9. Samples: 6440070660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:13,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 08:02:15,587][15401] Updated weights for policy 0, policy_version 393070 (0.0033) [2024-06-23 08:02:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6440157184. Throughput: 0: 42623.9. Samples: 6440321140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:18,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 08:02:19,797][15401] Updated weights for policy 0, policy_version 393080 (0.0045) [2024-06-23 08:02:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6440370176. Throughput: 0: 42496.0. Samples: 6440449080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 08:02:23,533][15401] Updated weights for policy 0, policy_version 393090 (0.0030) [2024-06-23 08:02:27,670][15401] Updated weights for policy 0, policy_version 393100 (0.0048) [2024-06-23 08:02:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6440566784. Throughput: 0: 42344.9. Samples: 6440700600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 08:02:31,363][15401] Updated weights for policy 0, policy_version 393110 (0.0030) [2024-06-23 08:02:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 6440796160. Throughput: 0: 42235.6. Samples: 6440947000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 08:02:33,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 08:02:35,350][15401] Updated weights for policy 0, policy_version 393120 (0.0036) [2024-06-23 08:02:35,982][15349] Signal inference workers to stop experience collection... (95350 times) [2024-06-23 08:02:35,982][15349] Signal inference workers to resume experience collection... (95350 times) [2024-06-23 08:02:36,003][15401] InferenceWorker_p0-w0: stopping experience collection (95350 times) [2024-06-23 08:02:36,003][15401] InferenceWorker_p0-w0: resuming experience collection (95350 times) [2024-06-23 08:02:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 6440992768. Throughput: 0: 42441.4. Samples: 6441082700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:02:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 08:02:38,941][15401] Updated weights for policy 0, policy_version 393130 (0.0023) [2024-06-23 08:02:43,305][15401] Updated weights for policy 0, policy_version 393140 (0.0029) [2024-06-23 08:02:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6441205760. Throughput: 0: 42467.2. Samples: 6441337620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:02:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 08:02:46,559][15401] Updated weights for policy 0, policy_version 393150 (0.0027) [2024-06-23 08:02:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6441451520. Throughput: 0: 42293.8. Samples: 6441588940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:02:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 08:02:50,925][15401] Updated weights for policy 0, policy_version 393160 (0.0040) [2024-06-23 08:02:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6441631744. Throughput: 0: 42387.5. Samples: 6441723660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:02:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 08:02:54,542][15401] Updated weights for policy 0, policy_version 393170 (0.0029) [2024-06-23 08:02:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 6441844736. Throughput: 0: 42405.4. Samples: 6441978900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:02:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 08:02:58,429][15401] Updated weights for policy 0, policy_version 393180 (0.0035) [2024-06-23 08:03:02,110][15401] Updated weights for policy 0, policy_version 393190 (0.0046) [2024-06-23 08:03:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6442090496. Throughput: 0: 42451.6. Samples: 6442231460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 08:03:06,034][15401] Updated weights for policy 0, policy_version 393200 (0.0039) [2024-06-23 08:03:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6442287104. Throughput: 0: 42464.9. Samples: 6442360000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:08,399][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 08:03:09,841][15401] Updated weights for policy 0, policy_version 393210 (0.0039) [2024-06-23 08:03:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6442483712. Throughput: 0: 42543.7. Samples: 6442615060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 08:03:13,676][15401] Updated weights for policy 0, policy_version 393220 (0.0026) [2024-06-23 08:03:17,697][15401] Updated weights for policy 0, policy_version 393230 (0.0040) [2024-06-23 08:03:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6442713088. Throughput: 0: 42771.1. Samples: 6442871700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 08:03:21,310][15401] Updated weights for policy 0, policy_version 393240 (0.0022) [2024-06-23 08:03:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6442926080. Throughput: 0: 42576.8. Samples: 6442998660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:23,400][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 08:03:25,214][15401] Updated weights for policy 0, policy_version 393250 (0.0027) [2024-06-23 08:03:28,390][15132] Fps is (10 sec: 42596.5, 60 sec: 42871.1, 300 sec: 42709.4). Total num frames: 6443139072. Throughput: 0: 42676.8. Samples: 6443258100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:28,391][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 08:03:29,032][15401] Updated weights for policy 0, policy_version 393260 (0.0049) [2024-06-23 08:03:33,055][15401] Updated weights for policy 0, policy_version 393270 (0.0049) [2024-06-23 08:03:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6443352064. Throughput: 0: 42713.0. Samples: 6443511020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 08:03:36,855][15401] Updated weights for policy 0, policy_version 393280 (0.0038) [2024-06-23 08:03:38,392][15132] Fps is (10 sec: 40952.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 6443548672. Throughput: 0: 42620.9. Samples: 6443641700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:38,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 08:03:40,571][15401] Updated weights for policy 0, policy_version 393290 (0.0033) [2024-06-23 08:03:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 6443761664. Throughput: 0: 42564.4. Samples: 6443894300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 08:03:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393297_6443778048.pth... [2024-06-23 08:03:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392671_6433521664.pth [2024-06-23 08:03:44,368][15401] Updated weights for policy 0, policy_version 393300 (0.0036) [2024-06-23 08:03:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 6443974656. Throughput: 0: 42704.1. Samples: 6444153140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 08:03:48,430][15401] Updated weights for policy 0, policy_version 393310 (0.0044) [2024-06-23 08:03:51,818][15401] Updated weights for policy 0, policy_version 393320 (0.0033) [2024-06-23 08:03:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6444187648. Throughput: 0: 42668.9. Samples: 6444280100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 08:03:53,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 08:03:56,107][15401] Updated weights for policy 0, policy_version 393330 (0.0036) [2024-06-23 08:03:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6444417024. Throughput: 0: 42734.6. Samples: 6444538120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:03:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 08:03:58,824][15349] Signal inference workers to stop experience collection... (95400 times) [2024-06-23 08:03:58,876][15401] InferenceWorker_p0-w0: stopping experience collection (95400 times) [2024-06-23 08:03:58,884][15349] Signal inference workers to resume experience collection... (95400 times) [2024-06-23 08:03:58,896][15401] InferenceWorker_p0-w0: resuming experience collection (95400 times) [2024-06-23 08:03:59,531][15401] Updated weights for policy 0, policy_version 393340 (0.0038) [2024-06-23 08:04:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 6444613632. Throughput: 0: 42710.8. Samples: 6444793680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 08:04:03,736][15401] Updated weights for policy 0, policy_version 393350 (0.0036) [2024-06-23 08:04:07,058][15401] Updated weights for policy 0, policy_version 393360 (0.0033) [2024-06-23 08:04:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6444843008. Throughput: 0: 42749.1. Samples: 6444922360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 08:04:11,101][15401] Updated weights for policy 0, policy_version 393370 (0.0034) [2024-06-23 08:04:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6445072384. Throughput: 0: 42776.4. Samples: 6445183020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 08:04:14,971][15401] Updated weights for policy 0, policy_version 393380 (0.0043) [2024-06-23 08:04:18,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6445268992. Throughput: 0: 42810.2. Samples: 6445437480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 08:04:18,598][15401] Updated weights for policy 0, policy_version 393390 (0.0031) [2024-06-23 08:04:22,541][15401] Updated weights for policy 0, policy_version 393400 (0.0042) [2024-06-23 08:04:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.7). Total num frames: 6445481984. Throughput: 0: 42765.3. Samples: 6445566040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 08:04:26,623][15401] Updated weights for policy 0, policy_version 393410 (0.0031) [2024-06-23 08:04:28,391][15132] Fps is (10 sec: 44230.3, 60 sec: 42870.8, 300 sec: 42764.8). Total num frames: 6445711360. Throughput: 0: 42875.1. Samples: 6445823740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:28,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 08:04:30,031][15401] Updated weights for policy 0, policy_version 393420 (0.0030) [2024-06-23 08:04:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6445924352. Throughput: 0: 42945.7. Samples: 6446085700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:33,392][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 08:04:33,960][15401] Updated weights for policy 0, policy_version 393430 (0.0037) [2024-06-23 08:04:37,465][15401] Updated weights for policy 0, policy_version 393440 (0.0020) [2024-06-23 08:04:38,390][15132] Fps is (10 sec: 40965.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 6446120960. Throughput: 0: 42954.6. Samples: 6446213060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 08:04:41,886][15401] Updated weights for policy 0, policy_version 393450 (0.0046) [2024-06-23 08:04:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6446366720. Throughput: 0: 42913.9. Samples: 6446469240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 08:04:45,272][15401] Updated weights for policy 0, policy_version 393460 (0.0041) [2024-06-23 08:04:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6446546944. Throughput: 0: 42980.4. Samples: 6446727800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 08:04:49,402][15401] Updated weights for policy 0, policy_version 393470 (0.0039) [2024-06-23 08:04:52,942][15401] Updated weights for policy 0, policy_version 393480 (0.0026) [2024-06-23 08:04:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6446776320. Throughput: 0: 42871.3. Samples: 6446851580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:53,393][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 08:04:57,196][15401] Updated weights for policy 0, policy_version 393490 (0.0043) [2024-06-23 08:04:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6446989312. Throughput: 0: 42877.0. Samples: 6447112480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:04:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 08:05:00,567][15401] Updated weights for policy 0, policy_version 393500 (0.0032) [2024-06-23 08:05:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 6447202304. Throughput: 0: 42897.8. Samples: 6447367880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:05:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 08:05:04,672][15401] Updated weights for policy 0, policy_version 393510 (0.0036) [2024-06-23 08:05:08,270][15401] Updated weights for policy 0, policy_version 393520 (0.0027) [2024-06-23 08:05:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 6447431680. Throughput: 0: 42897.7. Samples: 6447496440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:05:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 08:05:12,167][15401] Updated weights for policy 0, policy_version 393530 (0.0035) [2024-06-23 08:05:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6447644672. Throughput: 0: 43014.7. Samples: 6447759340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 08:05:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 08:05:15,854][15401] Updated weights for policy 0, policy_version 393540 (0.0029) [2024-06-23 08:05:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6447841280. Throughput: 0: 42940.4. Samples: 6448018020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 08:05:19,618][15401] Updated weights for policy 0, policy_version 393550 (0.0038) [2024-06-23 08:05:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6448070656. Throughput: 0: 43023.5. Samples: 6448149120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 08:05:23,428][15401] Updated weights for policy 0, policy_version 393560 (0.0041) [2024-06-23 08:05:27,099][15401] Updated weights for policy 0, policy_version 393570 (0.0032) [2024-06-23 08:05:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43145.6, 300 sec: 42876.1). Total num frames: 6448300032. Throughput: 0: 43004.0. Samples: 6448404420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 08:05:31,330][15401] Updated weights for policy 0, policy_version 393580 (0.0040) [2024-06-23 08:05:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6448496640. Throughput: 0: 42889.7. Samples: 6448657840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 08:05:34,751][15401] Updated weights for policy 0, policy_version 393590 (0.0029) [2024-06-23 08:05:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6448693248. Throughput: 0: 42996.1. Samples: 6448786400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:38,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 08:05:39,139][15401] Updated weights for policy 0, policy_version 393600 (0.0021) [2024-06-23 08:05:42,723][15401] Updated weights for policy 0, policy_version 393610 (0.0026) [2024-06-23 08:05:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 6448939008. Throughput: 0: 42899.8. Samples: 6449043080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:43,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 08:05:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393612_6448939008.pth... [2024-06-23 08:05:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000392985_6438666240.pth [2024-06-23 08:05:45,987][15349] Signal inference workers to stop experience collection... (95450 times) [2024-06-23 08:05:46,033][15401] InferenceWorker_p0-w0: stopping experience collection (95450 times) [2024-06-23 08:05:46,035][15349] Signal inference workers to resume experience collection... (95450 times) [2024-06-23 08:05:46,044][15401] InferenceWorker_p0-w0: resuming experience collection (95450 times) [2024-06-23 08:05:46,858][15401] Updated weights for policy 0, policy_version 393620 (0.0043) [2024-06-23 08:05:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6449135616. Throughput: 0: 42940.1. Samples: 6449300180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:48,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 08:05:50,218][15401] Updated weights for policy 0, policy_version 393630 (0.0032) [2024-06-23 08:05:53,390][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6449315840. Throughput: 0: 42945.0. Samples: 6449428960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 08:05:54,388][15401] Updated weights for policy 0, policy_version 393640 (0.0031) [2024-06-23 08:05:57,646][15401] Updated weights for policy 0, policy_version 393650 (0.0028) [2024-06-23 08:05:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6449577984. Throughput: 0: 42864.5. Samples: 6449688240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:05:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 08:06:02,089][15401] Updated weights for policy 0, policy_version 393660 (0.0035) [2024-06-23 08:06:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6449774592. Throughput: 0: 42820.1. Samples: 6449944920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 08:06:05,444][15401] Updated weights for policy 0, policy_version 393670 (0.0035) [2024-06-23 08:06:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6449971200. Throughput: 0: 42759.7. Samples: 6450073300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:08,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 08:06:09,541][15401] Updated weights for policy 0, policy_version 393680 (0.0030) [2024-06-23 08:06:12,901][15401] Updated weights for policy 0, policy_version 393690 (0.0031) [2024-06-23 08:06:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6450233344. Throughput: 0: 42963.9. Samples: 6450337800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 08:06:17,278][15401] Updated weights for policy 0, policy_version 393700 (0.0026) [2024-06-23 08:06:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6450429952. Throughput: 0: 42984.1. Samples: 6450592120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 08:06:20,574][15401] Updated weights for policy 0, policy_version 393710 (0.0038) [2024-06-23 08:06:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6450626560. Throughput: 0: 43008.0. Samples: 6450721760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 08:06:24,806][15401] Updated weights for policy 0, policy_version 393720 (0.0037) [2024-06-23 08:06:27,939][15401] Updated weights for policy 0, policy_version 393730 (0.0031) [2024-06-23 08:06:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 6450888704. Throughput: 0: 43089.9. Samples: 6450982020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:28,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 08:06:32,501][15401] Updated weights for policy 0, policy_version 393740 (0.0034) [2024-06-23 08:06:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 6451052544. Throughput: 0: 43083.8. Samples: 6451238960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 08:06:33,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 08:06:35,602][15401] Updated weights for policy 0, policy_version 393750 (0.0034) [2024-06-23 08:06:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6451281920. Throughput: 0: 42981.4. Samples: 6451363120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:06:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 08:06:39,961][15401] Updated weights for policy 0, policy_version 393760 (0.0028) [2024-06-23 08:06:43,341][15401] Updated weights for policy 0, policy_version 393770 (0.0036) [2024-06-23 08:06:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 6451527680. Throughput: 0: 42942.7. Samples: 6451620660. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:06:43,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 08:06:47,519][15401] Updated weights for policy 0, policy_version 393780 (0.0036) [2024-06-23 08:06:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6451707904. Throughput: 0: 43164.3. Samples: 6451887320. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:06:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 08:06:51,071][15401] Updated weights for policy 0, policy_version 393790 (0.0042) [2024-06-23 08:06:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 6451937280. Throughput: 0: 43085.6. Samples: 6452012160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:06:53,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 08:06:54,794][15349] Signal inference workers to stop experience collection... (95500 times) [2024-06-23 08:06:54,841][15401] InferenceWorker_p0-w0: stopping experience collection (95500 times) [2024-06-23 08:06:54,843][15349] Signal inference workers to resume experience collection... (95500 times) [2024-06-23 08:06:54,850][15401] InferenceWorker_p0-w0: resuming experience collection (95500 times) [2024-06-23 08:06:55,238][15401] Updated weights for policy 0, policy_version 393800 (0.0038) [2024-06-23 08:06:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6452150272. Throughput: 0: 42987.2. Samples: 6452272220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:06:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 08:06:58,568][15401] Updated weights for policy 0, policy_version 393810 (0.0028) [2024-06-23 08:07:02,745][15401] Updated weights for policy 0, policy_version 393820 (0.0038) [2024-06-23 08:07:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6452363264. Throughput: 0: 43076.0. Samples: 6452530540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 08:07:06,343][15401] Updated weights for policy 0, policy_version 393830 (0.0044) [2024-06-23 08:07:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6452576256. Throughput: 0: 43074.7. Samples: 6452660120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 08:07:10,534][15401] Updated weights for policy 0, policy_version 393840 (0.0031) [2024-06-23 08:07:13,391][15132] Fps is (10 sec: 44228.1, 60 sec: 42870.1, 300 sec: 42875.8). Total num frames: 6452805632. Throughput: 0: 43072.0. Samples: 6452920340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:13,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 08:07:13,730][15401] Updated weights for policy 0, policy_version 393850 (0.0035) [2024-06-23 08:07:18,095][15401] Updated weights for policy 0, policy_version 393860 (0.0029) [2024-06-23 08:07:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6453018624. Throughput: 0: 43137.9. Samples: 6453180160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 08:07:21,779][15401] Updated weights for policy 0, policy_version 393870 (0.0028) [2024-06-23 08:07:23,390][15132] Fps is (10 sec: 42606.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6453231616. Throughput: 0: 43119.9. Samples: 6453303520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 08:07:25,890][15401] Updated weights for policy 0, policy_version 393880 (0.0042) [2024-06-23 08:07:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6453460992. Throughput: 0: 43114.7. Samples: 6453560820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:07:29,217][15401] Updated weights for policy 0, policy_version 393890 (0.0036) [2024-06-23 08:07:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6453641216. Throughput: 0: 43232.0. Samples: 6453832760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 08:07:33,466][15401] Updated weights for policy 0, policy_version 393900 (0.0034) [2024-06-23 08:07:36,588][15401] Updated weights for policy 0, policy_version 393910 (0.0037) [2024-06-23 08:07:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6453854208. Throughput: 0: 43025.5. Samples: 6453948300. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 08:07:41,350][15401] Updated weights for policy 0, policy_version 393920 (0.0035) [2024-06-23 08:07:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6454116352. Throughput: 0: 42964.7. Samples: 6454205640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 08:07:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393928_6454116352.pth... [2024-06-23 08:07:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393297_6443778048.pth [2024-06-23 08:07:44,196][15401] Updated weights for policy 0, policy_version 393930 (0.0034) [2024-06-23 08:07:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6454280192. Throughput: 0: 43068.8. Samples: 6454468640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 08:07:49,014][15401] Updated weights for policy 0, policy_version 393940 (0.0033) [2024-06-23 08:07:51,851][15401] Updated weights for policy 0, policy_version 393950 (0.0036) [2024-06-23 08:07:53,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6454493184. Throughput: 0: 42802.6. Samples: 6454586240. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-23 08:07:53,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 08:07:56,619][15401] Updated weights for policy 0, policy_version 393960 (0.0033) [2024-06-23 08:07:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6454738944. Throughput: 0: 42836.5. Samples: 6454847900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:07:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 08:07:59,563][15401] Updated weights for policy 0, policy_version 393970 (0.0025) [2024-06-23 08:08:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6454919168. Throughput: 0: 42810.1. Samples: 6455106620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 08:08:04,167][15401] Updated weights for policy 0, policy_version 393980 (0.0031) [2024-06-23 08:08:07,016][15401] Updated weights for policy 0, policy_version 393990 (0.0031) [2024-06-23 08:08:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6455148544. Throughput: 0: 42762.2. Samples: 6455227820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 08:08:11,708][15401] Updated weights for policy 0, policy_version 394000 (0.0043) [2024-06-23 08:08:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42872.7, 300 sec: 42931.6). Total num frames: 6455377920. Throughput: 0: 42977.6. Samples: 6455494820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 08:08:14,562][15401] Updated weights for policy 0, policy_version 394010 (0.0036) [2024-06-23 08:08:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6455558144. Throughput: 0: 42593.7. Samples: 6455749480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 08:08:19,227][15401] Updated weights for policy 0, policy_version 394020 (0.0028) [2024-06-23 08:08:21,790][15349] Signal inference workers to stop experience collection... (95550 times) [2024-06-23 08:08:21,843][15401] InferenceWorker_p0-w0: stopping experience collection (95550 times) [2024-06-23 08:08:21,851][15349] Signal inference workers to resume experience collection... (95550 times) [2024-06-23 08:08:21,861][15401] InferenceWorker_p0-w0: resuming experience collection (95550 times) [2024-06-23 08:08:22,169][15401] Updated weights for policy 0, policy_version 394030 (0.0037) [2024-06-23 08:08:23,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 6455787520. Throughput: 0: 42771.9. Samples: 6455873140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:23,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 08:08:27,031][15401] Updated weights for policy 0, policy_version 394040 (0.0041) [2024-06-23 08:08:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6456000512. Throughput: 0: 42833.5. Samples: 6456133140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 08:08:30,146][15401] Updated weights for policy 0, policy_version 394050 (0.0036) [2024-06-23 08:08:33,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 6456213504. Throughput: 0: 42675.6. Samples: 6456389040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:33,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 08:08:34,907][15401] Updated weights for policy 0, policy_version 394060 (0.0035) [2024-06-23 08:08:37,650][15401] Updated weights for policy 0, policy_version 394070 (0.0030) [2024-06-23 08:08:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 6456459264. Throughput: 0: 42882.7. Samples: 6456515960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 08:08:42,569][15401] Updated weights for policy 0, policy_version 394080 (0.0037) [2024-06-23 08:08:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 6456655872. Throughput: 0: 42930.9. Samples: 6456779800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 08:08:45,200][15401] Updated weights for policy 0, policy_version 394090 (0.0029) [2024-06-23 08:08:48,391][15132] Fps is (10 sec: 39316.2, 60 sec: 42870.5, 300 sec: 42931.4). Total num frames: 6456852480. Throughput: 0: 42642.3. Samples: 6457025580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:48,391][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 08:08:50,329][15401] Updated weights for policy 0, policy_version 394100 (0.0036) [2024-06-23 08:08:53,303][15401] Updated weights for policy 0, policy_version 394110 (0.0033) [2024-06-23 08:08:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6457098240. Throughput: 0: 42792.9. Samples: 6457153500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:53,391][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 08:08:57,798][15401] Updated weights for policy 0, policy_version 394120 (0.0029) [2024-06-23 08:08:58,389][15132] Fps is (10 sec: 40965.9, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 6457262080. Throughput: 0: 42613.1. Samples: 6457412400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:08:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 08:09:00,986][15401] Updated weights for policy 0, policy_version 394130 (0.0026) [2024-06-23 08:09:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6457475072. Throughput: 0: 42584.9. Samples: 6457665800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:09:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 08:09:05,541][15401] Updated weights for policy 0, policy_version 394140 (0.0032) [2024-06-23 08:09:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6457737216. Throughput: 0: 42600.9. Samples: 6457790080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:09:08,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 08:09:08,511][15401] Updated weights for policy 0, policy_version 394150 (0.0036) [2024-06-23 08:09:13,227][15401] Updated weights for policy 0, policy_version 394160 (0.0030) [2024-06-23 08:09:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6457917440. Throughput: 0: 42611.0. Samples: 6458050640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 08:09:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 08:09:16,682][15401] Updated weights for policy 0, policy_version 394170 (0.0043) [2024-06-23 08:09:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6458130432. Throughput: 0: 42360.9. Samples: 6458295280. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:18,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-23 08:09:20,971][15401] Updated weights for policy 0, policy_version 394180 (0.0024) [2024-06-23 08:09:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42876.3). Total num frames: 6458359808. Throughput: 0: 42439.6. Samples: 6458425740. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:23,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 08:09:24,143][15401] Updated weights for policy 0, policy_version 394190 (0.0026) [2024-06-23 08:09:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6458556416. Throughput: 0: 42482.4. Samples: 6458691500. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 08:09:28,530][15401] Updated weights for policy 0, policy_version 394200 (0.0051) [2024-06-23 08:09:31,323][15349] Signal inference workers to stop experience collection... (95600 times) [2024-06-23 08:09:31,326][15349] Signal inference workers to resume experience collection... (95600 times) [2024-06-23 08:09:31,337][15401] InferenceWorker_p0-w0: stopping experience collection (95600 times) [2024-06-23 08:09:31,353][15401] InferenceWorker_p0-w0: resuming experience collection (95600 times) [2024-06-23 08:09:31,635][15401] Updated weights for policy 0, policy_version 394210 (0.0028) [2024-06-23 08:09:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 6458785792. Throughput: 0: 42526.7. Samples: 6458939220. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 08:09:36,311][15401] Updated weights for policy 0, policy_version 394220 (0.0043) [2024-06-23 08:09:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6458998784. Throughput: 0: 42708.9. Samples: 6459075400. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 08:09:39,260][15401] Updated weights for policy 0, policy_version 394230 (0.0024) [2024-06-23 08:09:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 6459179008. Throughput: 0: 42555.1. Samples: 6459327380. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 08:09:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394238_6459195392.pth... [2024-06-23 08:09:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393612_6448939008.pth [2024-06-23 08:09:43,980][15401] Updated weights for policy 0, policy_version 394240 (0.0051) [2024-06-23 08:09:47,075][15401] Updated weights for policy 0, policy_version 394250 (0.0033) [2024-06-23 08:09:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43143.8, 300 sec: 42931.3). Total num frames: 6459441152. Throughput: 0: 42378.7. Samples: 6459572940. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:48,392][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 08:09:51,630][15401] Updated weights for policy 0, policy_version 394260 (0.0040) [2024-06-23 08:09:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6459637760. Throughput: 0: 42770.6. Samples: 6459714760. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 08:09:54,910][15401] Updated weights for policy 0, policy_version 394270 (0.0044) [2024-06-23 08:09:58,390][15132] Fps is (10 sec: 36053.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6459801600. Throughput: 0: 42614.2. Samples: 6459968280. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:09:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 08:09:59,208][15401] Updated weights for policy 0, policy_version 394280 (0.0035) [2024-06-23 08:10:02,489][15401] Updated weights for policy 0, policy_version 394290 (0.0034) [2024-06-23 08:10:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6460080128. Throughput: 0: 42692.5. Samples: 6460216440. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 08:10:06,799][15401] Updated weights for policy 0, policy_version 394300 (0.0027) [2024-06-23 08:10:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6460260352. Throughput: 0: 42926.3. Samples: 6460357420. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 08:10:10,231][15401] Updated weights for policy 0, policy_version 394310 (0.0038) [2024-06-23 08:10:13,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6460456960. Throughput: 0: 42612.5. Samples: 6460609060. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 08:10:14,568][15401] Updated weights for policy 0, policy_version 394320 (0.0026) [2024-06-23 08:10:17,839][15401] Updated weights for policy 0, policy_version 394330 (0.0025) [2024-06-23 08:10:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6460702720. Throughput: 0: 42686.6. Samples: 6460860120. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 08:10:22,214][15401] Updated weights for policy 0, policy_version 394340 (0.0025) [2024-06-23 08:10:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6460915712. Throughput: 0: 42670.2. Samples: 6460995560. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 08:10:24,123][15349] Signal inference workers to stop experience collection... (95650 times) [2024-06-23 08:10:24,128][15349] Signal inference workers to resume experience collection... (95650 times) [2024-06-23 08:10:24,170][15401] InferenceWorker_p0-w0: stopping experience collection (95650 times) [2024-06-23 08:10:24,170][15401] InferenceWorker_p0-w0: resuming experience collection (95650 times) [2024-06-23 08:10:25,432][15401] Updated weights for policy 0, policy_version 394350 (0.0031) [2024-06-23 08:10:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6461112320. Throughput: 0: 42644.4. Samples: 6461246480. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:28,392][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 08:10:30,211][15401] Updated weights for policy 0, policy_version 394360 (0.0028) [2024-06-23 08:10:33,162][15401] Updated weights for policy 0, policy_version 394370 (0.0038) [2024-06-23 08:10:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 6461358080. Throughput: 0: 42702.7. Samples: 6461494560. Policy #0 lag: (min: 3.0, avg: 11.2, max: 24.0) [2024-06-23 08:10:33,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 08:10:37,716][15401] Updated weights for policy 0, policy_version 394380 (0.0048) [2024-06-23 08:10:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6461538304. Throughput: 0: 42625.8. Samples: 6461632920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:10:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 08:10:40,786][15401] Updated weights for policy 0, policy_version 394390 (0.0027) [2024-06-23 08:10:43,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6461751296. Throughput: 0: 42670.3. Samples: 6461888440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:10:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 08:10:45,406][15401] Updated weights for policy 0, policy_version 394400 (0.0034) [2024-06-23 08:10:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 6461997056. Throughput: 0: 42560.3. Samples: 6462131660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:10:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 08:10:48,586][15401] Updated weights for policy 0, policy_version 394410 (0.0036) [2024-06-23 08:10:53,001][15401] Updated weights for policy 0, policy_version 394420 (0.0022) [2024-06-23 08:10:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6462177280. Throughput: 0: 42474.6. Samples: 6462268780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:10:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 08:10:56,193][15401] Updated weights for policy 0, policy_version 394430 (0.0028) [2024-06-23 08:10:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6462390272. Throughput: 0: 42549.8. Samples: 6462523800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:10:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 08:11:00,867][15401] Updated weights for policy 0, policy_version 394440 (0.0050) [2024-06-23 08:11:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 6462636032. Throughput: 0: 42538.2. Samples: 6462774340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:03,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 08:11:03,765][15401] Updated weights for policy 0, policy_version 394450 (0.0033) [2024-06-23 08:11:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6462816256. Throughput: 0: 42556.9. Samples: 6462910620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:08,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-23 08:11:08,452][15401] Updated weights for policy 0, policy_version 394460 (0.0025) [2024-06-23 08:11:11,347][15401] Updated weights for policy 0, policy_version 394470 (0.0039) [2024-06-23 08:11:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6463045632. Throughput: 0: 42581.7. Samples: 6463162560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 08:11:16,045][15401] Updated weights for policy 0, policy_version 394480 (0.0049) [2024-06-23 08:11:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6463242240. Throughput: 0: 42787.2. Samples: 6463419880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 08:11:19,043][15401] Updated weights for policy 0, policy_version 394490 (0.0029) [2024-06-23 08:11:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6463455232. Throughput: 0: 42581.7. Samples: 6463549100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 08:11:23,665][15401] Updated weights for policy 0, policy_version 394500 (0.0036) [2024-06-23 08:11:27,179][15401] Updated weights for policy 0, policy_version 394510 (0.0043) [2024-06-23 08:11:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 6463700992. Throughput: 0: 42559.5. Samples: 6463803620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 08:11:31,151][15401] Updated weights for policy 0, policy_version 394520 (0.0038) [2024-06-23 08:11:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 6463881216. Throughput: 0: 42913.4. Samples: 6464062760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 08:11:34,917][15401] Updated weights for policy 0, policy_version 394530 (0.0041) [2024-06-23 08:11:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6464110592. Throughput: 0: 42598.7. Samples: 6464185720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 08:11:38,900][15401] Updated weights for policy 0, policy_version 394540 (0.0034) [2024-06-23 08:11:42,400][15349] Signal inference workers to stop experience collection... (95700 times) [2024-06-23 08:11:42,442][15401] InferenceWorker_p0-w0: stopping experience collection (95700 times) [2024-06-23 08:11:42,453][15349] Signal inference workers to resume experience collection... (95700 times) [2024-06-23 08:11:42,466][15401] InferenceWorker_p0-w0: resuming experience collection (95700 times) [2024-06-23 08:11:42,596][15401] Updated weights for policy 0, policy_version 394550 (0.0049) [2024-06-23 08:11:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6464323584. Throughput: 0: 42735.3. Samples: 6464446900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 08:11:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394551_6464323584.pth... [2024-06-23 08:11:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000393928_6454116352.pth [2024-06-23 08:11:46,991][15401] Updated weights for policy 0, policy_version 394560 (0.0030) [2024-06-23 08:11:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6464536576. Throughput: 0: 42868.1. Samples: 6464703400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 08:11:50,226][15401] Updated weights for policy 0, policy_version 394570 (0.0034) [2024-06-23 08:11:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6464749568. Throughput: 0: 42686.2. Samples: 6464831500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 08:11:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 08:11:54,411][15401] Updated weights for policy 0, policy_version 394580 (0.0040) [2024-06-23 08:11:57,899][15401] Updated weights for policy 0, policy_version 394590 (0.0039) [2024-06-23 08:11:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6464962560. Throughput: 0: 42891.7. Samples: 6465092680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:11:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 08:12:01,997][15401] Updated weights for policy 0, policy_version 394600 (0.0033) [2024-06-23 08:12:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6465191936. Throughput: 0: 42768.1. Samples: 6465344440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 08:12:05,541][15401] Updated weights for policy 0, policy_version 394610 (0.0027) [2024-06-23 08:12:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 6465404928. Throughput: 0: 42784.2. Samples: 6465474380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 08:12:09,520][15401] Updated weights for policy 0, policy_version 394620 (0.0035) [2024-06-23 08:12:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6465601536. Throughput: 0: 42985.8. Samples: 6465737980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:13,398][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 08:12:13,449][15401] Updated weights for policy 0, policy_version 394630 (0.0031) [2024-06-23 08:12:16,937][15401] Updated weights for policy 0, policy_version 394640 (0.0040) [2024-06-23 08:12:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6465847296. Throughput: 0: 42851.0. Samples: 6465991060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:18,399][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 08:12:20,982][15401] Updated weights for policy 0, policy_version 394650 (0.0039) [2024-06-23 08:12:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6466043904. Throughput: 0: 43144.8. Samples: 6466127240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 08:12:24,518][15401] Updated weights for policy 0, policy_version 394660 (0.0029) [2024-06-23 08:12:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6466256896. Throughput: 0: 43094.9. Samples: 6466386160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 08:12:28,430][15401] Updated weights for policy 0, policy_version 394670 (0.0029) [2024-06-23 08:12:32,143][15401] Updated weights for policy 0, policy_version 394680 (0.0032) [2024-06-23 08:12:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 6466486272. Throughput: 0: 42985.2. Samples: 6466637740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:33,392][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 08:12:36,364][15401] Updated weights for policy 0, policy_version 394690 (0.0034) [2024-06-23 08:12:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6466682880. Throughput: 0: 43013.4. Samples: 6466767100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 08:12:39,759][15401] Updated weights for policy 0, policy_version 394700 (0.0035) [2024-06-23 08:12:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6466895872. Throughput: 0: 42838.2. Samples: 6467020400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 08:12:43,873][15401] Updated weights for policy 0, policy_version 394710 (0.0038) [2024-06-23 08:12:47,567][15401] Updated weights for policy 0, policy_version 394720 (0.0035) [2024-06-23 08:12:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6467125248. Throughput: 0: 42855.1. Samples: 6467272920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 08:12:51,817][15401] Updated weights for policy 0, policy_version 394730 (0.0043) [2024-06-23 08:12:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6467338240. Throughput: 0: 42958.5. Samples: 6467407520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 08:12:55,250][15401] Updated weights for policy 0, policy_version 394740 (0.0028) [2024-06-23 08:12:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6467518464. Throughput: 0: 42716.0. Samples: 6467660200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:12:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 08:12:59,459][15401] Updated weights for policy 0, policy_version 394750 (0.0028) [2024-06-23 08:13:02,971][15401] Updated weights for policy 0, policy_version 394760 (0.0030) [2024-06-23 08:13:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6467764224. Throughput: 0: 42753.8. Samples: 6467914980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:13:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 08:13:07,055][15401] Updated weights for policy 0, policy_version 394770 (0.0033) [2024-06-23 08:13:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 6467960832. Throughput: 0: 42738.2. Samples: 6468050560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:13:08,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 08:13:10,456][15401] Updated weights for policy 0, policy_version 394780 (0.0034) [2024-06-23 08:13:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6468173824. Throughput: 0: 42465.0. Samples: 6468297080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 08:13:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 08:13:14,637][15401] Updated weights for policy 0, policy_version 394790 (0.0037) [2024-06-23 08:13:18,226][15401] Updated weights for policy 0, policy_version 394800 (0.0027) [2024-06-23 08:13:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 6468403200. Throughput: 0: 42622.7. Samples: 6468555760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 08:13:22,304][15401] Updated weights for policy 0, policy_version 394810 (0.0033) [2024-06-23 08:13:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6468583424. Throughput: 0: 42751.7. Samples: 6468690920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:13:25,680][15349] Signal inference workers to stop experience collection... (95750 times) [2024-06-23 08:13:25,680][15349] Signal inference workers to resume experience collection... (95750 times) [2024-06-23 08:13:25,736][15401] InferenceWorker_p0-w0: stopping experience collection (95750 times) [2024-06-23 08:13:25,736][15401] InferenceWorker_p0-w0: resuming experience collection (95750 times) [2024-06-23 08:13:25,827][15401] Updated weights for policy 0, policy_version 394820 (0.0032) [2024-06-23 08:13:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6468829184. Throughput: 0: 42683.7. Samples: 6468941160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 08:13:30,045][15401] Updated weights for policy 0, policy_version 394830 (0.0035) [2024-06-23 08:13:33,337][15401] Updated weights for policy 0, policy_version 394840 (0.0035) [2024-06-23 08:13:33,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6469058560. Throughput: 0: 42949.3. Samples: 6469205640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 08:13:37,645][15401] Updated weights for policy 0, policy_version 394850 (0.0030) [2024-06-23 08:13:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6469238784. Throughput: 0: 42735.6. Samples: 6469330620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 08:13:40,871][15401] Updated weights for policy 0, policy_version 394860 (0.0032) [2024-06-23 08:13:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 6469468160. Throughput: 0: 42852.4. Samples: 6469588560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 08:13:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394865_6469468160.pth... [2024-06-23 08:13:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394238_6459195392.pth [2024-06-23 08:13:45,258][15401] Updated weights for policy 0, policy_version 394870 (0.0033) [2024-06-23 08:13:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6469697536. Throughput: 0: 42868.0. Samples: 6469844040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:48,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 08:13:48,471][15401] Updated weights for policy 0, policy_version 394880 (0.0036) [2024-06-23 08:13:52,842][15401] Updated weights for policy 0, policy_version 394890 (0.0029) [2024-06-23 08:13:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6469877760. Throughput: 0: 42760.8. Samples: 6469974700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 08:13:56,206][15401] Updated weights for policy 0, policy_version 394900 (0.0043) [2024-06-23 08:13:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6470107136. Throughput: 0: 42955.8. Samples: 6470230100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:13:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 08:14:00,679][15401] Updated weights for policy 0, policy_version 394910 (0.0031) [2024-06-23 08:14:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6470336512. Throughput: 0: 42878.6. Samples: 6470485300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 08:14:04,170][15401] Updated weights for policy 0, policy_version 394920 (0.0036) [2024-06-23 08:14:08,331][15401] Updated weights for policy 0, policy_version 394930 (0.0031) [2024-06-23 08:14:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 6470533120. Throughput: 0: 42695.8. Samples: 6470612240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:08,394][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 08:14:11,711][15401] Updated weights for policy 0, policy_version 394940 (0.0039) [2024-06-23 08:14:13,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 6470746112. Throughput: 0: 42865.2. Samples: 6470870200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:13,393][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 08:14:16,092][15401] Updated weights for policy 0, policy_version 394950 (0.0041) [2024-06-23 08:14:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6470975488. Throughput: 0: 42717.7. Samples: 6471127940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 08:14:19,257][15401] Updated weights for policy 0, policy_version 394960 (0.0024) [2024-06-23 08:14:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6471172096. Throughput: 0: 42770.2. Samples: 6471255280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:23,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 08:14:23,790][15401] Updated weights for policy 0, policy_version 394970 (0.0022) [2024-06-23 08:14:27,271][15401] Updated weights for policy 0, policy_version 394980 (0.0031) [2024-06-23 08:14:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6471401472. Throughput: 0: 42708.4. Samples: 6471510440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 08:14:31,346][15401] Updated weights for policy 0, policy_version 394990 (0.0040) [2024-06-23 08:14:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6471598080. Throughput: 0: 42716.0. Samples: 6471766260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 08:14:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 08:14:34,727][15401] Updated weights for policy 0, policy_version 395000 (0.0022) [2024-06-23 08:14:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6471811072. Throughput: 0: 42645.4. Samples: 6471893740. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:14:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 08:14:38,870][15401] Updated weights for policy 0, policy_version 395010 (0.0036) [2024-06-23 08:14:42,167][15401] Updated weights for policy 0, policy_version 395020 (0.0031) [2024-06-23 08:14:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6472040448. Throughput: 0: 42713.8. Samples: 6472152220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:14:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 08:14:46,338][15401] Updated weights for policy 0, policy_version 395030 (0.0026) [2024-06-23 08:14:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6472253440. Throughput: 0: 42992.9. Samples: 6472419980. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:14:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 08:14:49,075][15349] Signal inference workers to stop experience collection... (95800 times) [2024-06-23 08:14:49,075][15349] Signal inference workers to resume experience collection... (95800 times) [2024-06-23 08:14:49,094][15401] InferenceWorker_p0-w0: stopping experience collection (95800 times) [2024-06-23 08:14:49,095][15401] InferenceWorker_p0-w0: resuming experience collection (95800 times) [2024-06-23 08:14:49,715][15401] Updated weights for policy 0, policy_version 395040 (0.0030) [2024-06-23 08:14:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6472466432. Throughput: 0: 42884.9. Samples: 6472542060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:14:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 08:14:53,978][15401] Updated weights for policy 0, policy_version 395050 (0.0029) [2024-06-23 08:14:57,474][15401] Updated weights for policy 0, policy_version 395060 (0.0032) [2024-06-23 08:14:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6472695808. Throughput: 0: 42870.4. Samples: 6472799260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:14:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 08:15:02,283][15401] Updated weights for policy 0, policy_version 395070 (0.0037) [2024-06-23 08:15:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6472876032. Throughput: 0: 43044.8. Samples: 6473064960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 08:15:05,022][15401] Updated weights for policy 0, policy_version 395080 (0.0034) [2024-06-23 08:15:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6473121792. Throughput: 0: 42801.0. Samples: 6473181320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 08:15:09,895][15401] Updated weights for policy 0, policy_version 395090 (0.0023) [2024-06-23 08:15:12,628][15401] Updated weights for policy 0, policy_version 395100 (0.0045) [2024-06-23 08:15:13,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 6473351168. Throughput: 0: 42836.6. Samples: 6473438080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 08:15:17,522][15401] Updated weights for policy 0, policy_version 395110 (0.0026) [2024-06-23 08:15:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6473515008. Throughput: 0: 43089.0. Samples: 6473705260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 08:15:20,300][15401] Updated weights for policy 0, policy_version 395120 (0.0034) [2024-06-23 08:15:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 6473760768. Throughput: 0: 42848.1. Samples: 6473821900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 08:15:25,112][15401] Updated weights for policy 0, policy_version 395130 (0.0046) [2024-06-23 08:15:28,002][15401] Updated weights for policy 0, policy_version 395140 (0.0043) [2024-06-23 08:15:28,390][15132] Fps is (10 sec: 47512.3, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 6473990144. Throughput: 0: 42871.0. Samples: 6474081420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 08:15:32,734][15401] Updated weights for policy 0, policy_version 395150 (0.0038) [2024-06-23 08:15:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6474170368. Throughput: 0: 42798.6. Samples: 6474345920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 08:15:35,678][15401] Updated weights for policy 0, policy_version 395160 (0.0042) [2024-06-23 08:15:38,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6474416128. Throughput: 0: 42738.2. Samples: 6474465280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 08:15:40,404][15401] Updated weights for policy 0, policy_version 395170 (0.0027) [2024-06-23 08:15:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6474612736. Throughput: 0: 42769.3. Samples: 6474723880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 08:15:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395180_6474629120.pth... [2024-06-23 08:15:43,462][15401] Updated weights for policy 0, policy_version 395180 (0.0041) [2024-06-23 08:15:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394551_6464323584.pth [2024-06-23 08:15:47,910][15401] Updated weights for policy 0, policy_version 395190 (0.0038) [2024-06-23 08:15:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6474792960. Throughput: 0: 42687.7. Samples: 6474985900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 08:15:50,971][15401] Updated weights for policy 0, policy_version 395200 (0.0029) [2024-06-23 08:15:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6475055104. Throughput: 0: 42771.1. Samples: 6475106020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-23 08:15:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 08:15:55,656][15401] Updated weights for policy 0, policy_version 395210 (0.0024) [2024-06-23 08:15:57,889][15349] Signal inference workers to stop experience collection... (95850 times) [2024-06-23 08:15:57,890][15349] Signal inference workers to resume experience collection... (95850 times) [2024-06-23 08:15:57,901][15401] InferenceWorker_p0-w0: stopping experience collection (95850 times) [2024-06-23 08:15:57,901][15401] InferenceWorker_p0-w0: resuming experience collection (95850 times) [2024-06-23 08:15:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 6475251712. Throughput: 0: 42902.2. Samples: 6475368780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:15:58,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 08:15:58,622][15401] Updated weights for policy 0, policy_version 395220 (0.0035) [2024-06-23 08:16:03,362][15401] Updated weights for policy 0, policy_version 395230 (0.0041) [2024-06-23 08:16:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6475448320. Throughput: 0: 42725.2. Samples: 6475627900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 08:16:06,229][15401] Updated weights for policy 0, policy_version 395240 (0.0032) [2024-06-23 08:16:08,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6475694080. Throughput: 0: 42770.1. Samples: 6475746560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 08:16:11,265][15401] Updated weights for policy 0, policy_version 395250 (0.0041) [2024-06-23 08:16:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6475890688. Throughput: 0: 42890.8. Samples: 6476011500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 08:16:14,250][15401] Updated weights for policy 0, policy_version 395260 (0.0036) [2024-06-23 08:16:18,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 6476070912. Throughput: 0: 42728.4. Samples: 6476268800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:18,393][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 08:16:18,928][15401] Updated weights for policy 0, policy_version 395270 (0.0041) [2024-06-23 08:16:21,911][15401] Updated weights for policy 0, policy_version 395280 (0.0036) [2024-06-23 08:16:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6476333056. Throughput: 0: 42799.6. Samples: 6476391260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 08:16:26,811][15401] Updated weights for policy 0, policy_version 395290 (0.0044) [2024-06-23 08:16:28,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6476546048. Throughput: 0: 43059.9. Samples: 6476661580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 08:16:29,637][15401] Updated weights for policy 0, policy_version 395300 (0.0031) [2024-06-23 08:16:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6476726272. Throughput: 0: 42803.1. Samples: 6476912040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 08:16:34,209][15401] Updated weights for policy 0, policy_version 395310 (0.0045) [2024-06-23 08:16:36,996][15401] Updated weights for policy 0, policy_version 395320 (0.0035) [2024-06-23 08:16:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6476972032. Throughput: 0: 42906.6. Samples: 6477036820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 08:16:41,589][15401] Updated weights for policy 0, policy_version 395330 (0.0038) [2024-06-23 08:16:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6477168640. Throughput: 0: 42978.3. Samples: 6477302700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 08:16:44,988][15401] Updated weights for policy 0, policy_version 395340 (0.0036) [2024-06-23 08:16:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6477365248. Throughput: 0: 42824.5. Samples: 6477555000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 08:16:49,032][15401] Updated weights for policy 0, policy_version 395350 (0.0040) [2024-06-23 08:16:52,576][15401] Updated weights for policy 0, policy_version 395360 (0.0046) [2024-06-23 08:16:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6477627392. Throughput: 0: 42978.3. Samples: 6477680580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 08:16:56,627][15401] Updated weights for policy 0, policy_version 395370 (0.0029) [2024-06-23 08:16:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 6477791232. Throughput: 0: 42740.1. Samples: 6477934800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:16:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 08:17:00,263][15401] Updated weights for policy 0, policy_version 395380 (0.0042) [2024-06-23 08:17:03,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6478020608. Throughput: 0: 42613.2. Samples: 6478186300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:17:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 08:17:04,502][15401] Updated weights for policy 0, policy_version 395390 (0.0028) [2024-06-23 08:17:07,780][15401] Updated weights for policy 0, policy_version 395400 (0.0044) [2024-06-23 08:17:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6478266368. Throughput: 0: 42791.4. Samples: 6478316880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:17:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 08:17:12,112][15401] Updated weights for policy 0, policy_version 395410 (0.0030) [2024-06-23 08:17:13,159][15349] Signal inference workers to stop experience collection... (95900 times) [2024-06-23 08:17:13,159][15349] Signal inference workers to resume experience collection... (95900 times) [2024-06-23 08:17:13,176][15401] InferenceWorker_p0-w0: stopping experience collection (95900 times) [2024-06-23 08:17:13,176][15401] InferenceWorker_p0-w0: resuming experience collection (95900 times) [2024-06-23 08:17:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6478446592. Throughput: 0: 42585.4. Samples: 6478577920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:17:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:17:15,263][15401] Updated weights for policy 0, policy_version 395420 (0.0027) [2024-06-23 08:17:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 6478675968. Throughput: 0: 42693.7. Samples: 6478833260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 08:17:19,858][15401] Updated weights for policy 0, policy_version 395430 (0.0038) [2024-06-23 08:17:22,967][15401] Updated weights for policy 0, policy_version 395440 (0.0034) [2024-06-23 08:17:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6478905344. Throughput: 0: 42803.5. Samples: 6478962980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 08:17:27,339][15401] Updated weights for policy 0, policy_version 395450 (0.0034) [2024-06-23 08:17:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6479085568. Throughput: 0: 42693.0. Samples: 6479223880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 08:17:30,723][15401] Updated weights for policy 0, policy_version 395460 (0.0031) [2024-06-23 08:17:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6479331328. Throughput: 0: 42713.7. Samples: 6479477120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 08:17:34,932][15401] Updated weights for policy 0, policy_version 395470 (0.0033) [2024-06-23 08:17:38,182][15401] Updated weights for policy 0, policy_version 395480 (0.0040) [2024-06-23 08:17:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6479544320. Throughput: 0: 42856.4. Samples: 6479609120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 08:17:42,528][15401] Updated weights for policy 0, policy_version 395490 (0.0037) [2024-06-23 08:17:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6479724544. Throughput: 0: 42981.7. Samples: 6479868980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 08:17:43,585][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395493_6479757312.pth... [2024-06-23 08:17:43,641][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000394865_6469468160.pth [2024-06-23 08:17:45,881][15401] Updated weights for policy 0, policy_version 395500 (0.0027) [2024-06-23 08:17:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6479953920. Throughput: 0: 43036.1. Samples: 6480122920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 08:17:50,181][15401] Updated weights for policy 0, policy_version 395510 (0.0036) [2024-06-23 08:17:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6480183296. Throughput: 0: 42905.0. Samples: 6480247600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:17:53,529][15401] Updated weights for policy 0, policy_version 395520 (0.0047) [2024-06-23 08:17:57,657][15401] Updated weights for policy 0, policy_version 395530 (0.0032) [2024-06-23 08:17:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6480379904. Throughput: 0: 42925.4. Samples: 6480509560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:17:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 08:18:00,897][15401] Updated weights for policy 0, policy_version 395540 (0.0031) [2024-06-23 08:18:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 6480592896. Throughput: 0: 43123.1. Samples: 6480773800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 08:18:05,164][15401] Updated weights for policy 0, policy_version 395550 (0.0036) [2024-06-23 08:18:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6480838656. Throughput: 0: 42962.8. Samples: 6480896300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 08:18:08,491][15401] Updated weights for policy 0, policy_version 395560 (0.0033) [2024-06-23 08:18:12,715][15401] Updated weights for policy 0, policy_version 395570 (0.0023) [2024-06-23 08:18:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6481035264. Throughput: 0: 42948.8. Samples: 6481156580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 08:18:16,071][15401] Updated weights for policy 0, policy_version 395580 (0.0042) [2024-06-23 08:18:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6481231872. Throughput: 0: 43103.2. Samples: 6481416760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:18,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 08:18:20,171][15401] Updated weights for policy 0, policy_version 395590 (0.0031) [2024-06-23 08:18:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6481461248. Throughput: 0: 42872.9. Samples: 6481538400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 08:18:23,885][15401] Updated weights for policy 0, policy_version 395600 (0.0044) [2024-06-23 08:18:28,025][15401] Updated weights for policy 0, policy_version 395610 (0.0040) [2024-06-23 08:18:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 6481690624. Throughput: 0: 42868.5. Samples: 6481798060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 08:18:29,920][15349] Signal inference workers to stop experience collection... (95950 times) [2024-06-23 08:18:29,920][15349] Signal inference workers to resume experience collection... (95950 times) [2024-06-23 08:18:29,952][15401] InferenceWorker_p0-w0: stopping experience collection (95950 times) [2024-06-23 08:18:29,953][15401] InferenceWorker_p0-w0: resuming experience collection (95950 times) [2024-06-23 08:18:31,617][15401] Updated weights for policy 0, policy_version 395620 (0.0044) [2024-06-23 08:18:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6481854464. Throughput: 0: 42914.2. Samples: 6482054060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-23 08:18:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 08:18:35,692][15401] Updated weights for policy 0, policy_version 395630 (0.0037) [2024-06-23 08:18:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6482100224. Throughput: 0: 42764.9. Samples: 6482172020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:18:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 08:18:39,244][15401] Updated weights for policy 0, policy_version 395640 (0.0028) [2024-06-23 08:18:43,372][15401] Updated weights for policy 0, policy_version 395650 (0.0038) [2024-06-23 08:18:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 6482329600. Throughput: 0: 42892.3. Samples: 6482439720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:18:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 08:18:47,008][15401] Updated weights for policy 0, policy_version 395660 (0.0032) [2024-06-23 08:18:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6482509824. Throughput: 0: 42606.2. Samples: 6482691080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:18:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 08:18:51,252][15401] Updated weights for policy 0, policy_version 395670 (0.0036) [2024-06-23 08:18:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6482755584. Throughput: 0: 42767.5. Samples: 6482820840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:18:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 08:18:54,573][15401] Updated weights for policy 0, policy_version 395680 (0.0029) [2024-06-23 08:18:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6482952192. Throughput: 0: 42791.1. Samples: 6483082180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:18:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 08:18:59,124][15401] Updated weights for policy 0, policy_version 395690 (0.0041) [2024-06-23 08:19:02,434][15401] Updated weights for policy 0, policy_version 395700 (0.0030) [2024-06-23 08:19:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6483165184. Throughput: 0: 42564.3. Samples: 6483332160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:03,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 08:19:06,715][15401] Updated weights for policy 0, policy_version 395710 (0.0031) [2024-06-23 08:19:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 6483378176. Throughput: 0: 42801.0. Samples: 6483464440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 08:19:10,009][15401] Updated weights for policy 0, policy_version 395720 (0.0030) [2024-06-23 08:19:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6483574784. Throughput: 0: 42627.5. Samples: 6483716300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 08:19:14,266][15401] Updated weights for policy 0, policy_version 395730 (0.0035) [2024-06-23 08:19:17,665][15401] Updated weights for policy 0, policy_version 395740 (0.0038) [2024-06-23 08:19:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6483804160. Throughput: 0: 42484.0. Samples: 6483965840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 08:19:21,894][15401] Updated weights for policy 0, policy_version 395750 (0.0043) [2024-06-23 08:19:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6484033536. Throughput: 0: 42858.2. Samples: 6484100640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 08:19:25,524][15401] Updated weights for policy 0, policy_version 395760 (0.0042) [2024-06-23 08:19:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 6484197376. Throughput: 0: 42537.9. Samples: 6484353920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 08:19:29,469][15401] Updated weights for policy 0, policy_version 395770 (0.0032) [2024-06-23 08:19:33,302][15401] Updated weights for policy 0, policy_version 395780 (0.0037) [2024-06-23 08:19:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6484459520. Throughput: 0: 42492.3. Samples: 6484603240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 08:19:37,282][15401] Updated weights for policy 0, policy_version 395790 (0.0032) [2024-06-23 08:19:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6484672512. Throughput: 0: 42628.4. Samples: 6484739120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 08:19:40,932][15401] Updated weights for policy 0, policy_version 395800 (0.0035) [2024-06-23 08:19:43,389][15132] Fps is (10 sec: 37683.9, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 6484836352. Throughput: 0: 42395.2. Samples: 6484989960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 08:19:43,557][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395804_6484852736.pth... [2024-06-23 08:19:43,624][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395180_6474629120.pth [2024-06-23 08:19:44,959][15401] Updated weights for policy 0, policy_version 395810 (0.0041) [2024-06-23 08:19:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6485098496. Throughput: 0: 42369.0. Samples: 6485238760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 08:19:48,959][15401] Updated weights for policy 0, policy_version 395820 (0.0029) [2024-06-23 08:19:51,970][15349] Signal inference workers to stop experience collection... (96000 times) [2024-06-23 08:19:51,971][15349] Signal inference workers to resume experience collection... (96000 times) [2024-06-23 08:19:52,021][15401] InferenceWorker_p0-w0: stopping experience collection (96000 times) [2024-06-23 08:19:52,021][15401] InferenceWorker_p0-w0: resuming experience collection (96000 times) [2024-06-23 08:19:52,674][15401] Updated weights for policy 0, policy_version 395830 (0.0037) [2024-06-23 08:19:53,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6485311488. Throughput: 0: 42486.9. Samples: 6485376360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 08:19:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 08:19:56,459][15401] Updated weights for policy 0, policy_version 395840 (0.0035) [2024-06-23 08:19:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6485491712. Throughput: 0: 42554.6. Samples: 6485631260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:19:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 08:20:00,484][15401] Updated weights for policy 0, policy_version 395850 (0.0030) [2024-06-23 08:20:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6485721088. Throughput: 0: 42651.6. Samples: 6485885160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 08:20:04,283][15401] Updated weights for policy 0, policy_version 395860 (0.0036) [2024-06-23 08:20:08,052][15401] Updated weights for policy 0, policy_version 395870 (0.0039) [2024-06-23 08:20:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6485950464. Throughput: 0: 42549.3. Samples: 6486015360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 08:20:12,342][15401] Updated weights for policy 0, policy_version 395880 (0.0037) [2024-06-23 08:20:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6486147072. Throughput: 0: 42689.2. Samples: 6486274940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 08:20:15,674][15401] Updated weights for policy 0, policy_version 395890 (0.0029) [2024-06-23 08:20:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6486360064. Throughput: 0: 42626.4. Samples: 6486521420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 08:20:20,021][15401] Updated weights for policy 0, policy_version 395900 (0.0032) [2024-06-23 08:20:23,360][15401] Updated weights for policy 0, policy_version 395910 (0.0036) [2024-06-23 08:20:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6486589440. Throughput: 0: 42587.7. Samples: 6486655560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 08:20:27,651][15401] Updated weights for policy 0, policy_version 395920 (0.0046) [2024-06-23 08:20:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6486769664. Throughput: 0: 42604.3. Samples: 6486907160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 08:20:30,973][15401] Updated weights for policy 0, policy_version 395930 (0.0028) [2024-06-23 08:20:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6487015424. Throughput: 0: 42683.5. Samples: 6487159520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 08:20:35,303][15401] Updated weights for policy 0, policy_version 395940 (0.0041) [2024-06-23 08:20:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6487212032. Throughput: 0: 42468.6. Samples: 6487287440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 08:20:39,075][15401] Updated weights for policy 0, policy_version 395950 (0.0035) [2024-06-23 08:20:42,981][15401] Updated weights for policy 0, policy_version 395960 (0.0044) [2024-06-23 08:20:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6487425024. Throughput: 0: 42601.8. Samples: 6487548340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 08:20:46,768][15401] Updated weights for policy 0, policy_version 395970 (0.0035) [2024-06-23 08:20:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6487654400. Throughput: 0: 42324.9. Samples: 6487789780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 08:20:50,783][15401] Updated weights for policy 0, policy_version 395980 (0.0038) [2024-06-23 08:20:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 6487834624. Throughput: 0: 42368.8. Samples: 6487921960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 08:20:54,451][15401] Updated weights for policy 0, policy_version 395990 (0.0033) [2024-06-23 08:20:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6488047616. Throughput: 0: 42265.4. Samples: 6488176880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:20:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 08:20:58,415][15401] Updated weights for policy 0, policy_version 396000 (0.0043) [2024-06-23 08:21:02,089][15401] Updated weights for policy 0, policy_version 396010 (0.0034) [2024-06-23 08:21:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6488276992. Throughput: 0: 42380.3. Samples: 6488428540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:21:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 08:21:05,996][15401] Updated weights for policy 0, policy_version 396020 (0.0032) [2024-06-23 08:21:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 6488457216. Throughput: 0: 42239.4. Samples: 6488556340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:21:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 08:21:09,054][15349] Signal inference workers to stop experience collection... (96050 times) [2024-06-23 08:21:09,091][15401] InferenceWorker_p0-w0: stopping experience collection (96050 times) [2024-06-23 08:21:09,102][15349] Signal inference workers to resume experience collection... (96050 times) [2024-06-23 08:21:09,112][15401] InferenceWorker_p0-w0: resuming experience collection (96050 times) [2024-06-23 08:21:09,917][15401] Updated weights for policy 0, policy_version 396030 (0.0040) [2024-06-23 08:21:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 6488686592. Throughput: 0: 42352.9. Samples: 6488813040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 08:21:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 08:21:13,658][15401] Updated weights for policy 0, policy_version 396040 (0.0028) [2024-06-23 08:21:17,683][15401] Updated weights for policy 0, policy_version 396050 (0.0023) [2024-06-23 08:21:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6488915968. Throughput: 0: 42304.1. Samples: 6489063200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 08:21:21,716][15401] Updated weights for policy 0, policy_version 396060 (0.0030) [2024-06-23 08:21:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 6489096192. Throughput: 0: 42372.0. Samples: 6489194180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 08:21:25,264][15401] Updated weights for policy 0, policy_version 396070 (0.0030) [2024-06-23 08:21:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6489325568. Throughput: 0: 42185.8. Samples: 6489446700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 08:21:29,230][15401] Updated weights for policy 0, policy_version 396080 (0.0030) [2024-06-23 08:21:33,024][15401] Updated weights for policy 0, policy_version 396090 (0.0039) [2024-06-23 08:21:33,391][15132] Fps is (10 sec: 45868.9, 60 sec: 42324.5, 300 sec: 42653.8). Total num frames: 6489554944. Throughput: 0: 42550.3. Samples: 6489704600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:33,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 08:21:36,860][15401] Updated weights for policy 0, policy_version 396100 (0.0034) [2024-06-23 08:21:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6489751552. Throughput: 0: 42520.5. Samples: 6489835380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:38,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 08:21:40,855][15401] Updated weights for policy 0, policy_version 396110 (0.0032) [2024-06-23 08:21:43,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6489964544. Throughput: 0: 42473.3. Samples: 6490088180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 08:21:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396116_6489964544.pth... [2024-06-23 08:21:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395493_6479757312.pth [2024-06-23 08:21:44,818][15401] Updated weights for policy 0, policy_version 396120 (0.0037) [2024-06-23 08:21:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 6490161152. Throughput: 0: 42505.4. Samples: 6490341280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 08:21:48,661][15401] Updated weights for policy 0, policy_version 396130 (0.0038) [2024-06-23 08:21:52,441][15401] Updated weights for policy 0, policy_version 396140 (0.0036) [2024-06-23 08:21:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6490390528. Throughput: 0: 42485.3. Samples: 6490468180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 08:21:56,474][15401] Updated weights for policy 0, policy_version 396150 (0.0042) [2024-06-23 08:21:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6490603520. Throughput: 0: 42422.7. Samples: 6490722060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:21:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 08:22:00,152][15401] Updated weights for policy 0, policy_version 396160 (0.0036) [2024-06-23 08:22:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 6490800128. Throughput: 0: 42596.0. Samples: 6490980020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 08:22:04,046][15401] Updated weights for policy 0, policy_version 396170 (0.0035) [2024-06-23 08:22:07,879][15401] Updated weights for policy 0, policy_version 396180 (0.0027) [2024-06-23 08:22:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 6491029504. Throughput: 0: 42464.5. Samples: 6491105080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:08,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 08:22:11,668][15401] Updated weights for policy 0, policy_version 396190 (0.0026) [2024-06-23 08:22:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6491226112. Throughput: 0: 42618.2. Samples: 6491364520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:13,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-23 08:22:15,328][15401] Updated weights for policy 0, policy_version 396200 (0.0031) [2024-06-23 08:22:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6491439104. Throughput: 0: 42720.0. Samples: 6491626940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 08:22:19,083][15401] Updated weights for policy 0, policy_version 396210 (0.0027) [2024-06-23 08:22:22,153][15349] Signal inference workers to stop experience collection... (96100 times) [2024-06-23 08:22:22,188][15401] InferenceWorker_p0-w0: stopping experience collection (96100 times) [2024-06-23 08:22:22,214][15349] Signal inference workers to resume experience collection... (96100 times) [2024-06-23 08:22:22,220][15401] InferenceWorker_p0-w0: resuming experience collection (96100 times) [2024-06-23 08:22:23,091][15401] Updated weights for policy 0, policy_version 396220 (0.0033) [2024-06-23 08:22:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6491668480. Throughput: 0: 42618.3. Samples: 6491753200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 08:22:26,736][15401] Updated weights for policy 0, policy_version 396230 (0.0034) [2024-06-23 08:22:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6491897856. Throughput: 0: 42738.2. Samples: 6492011400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 08:22:30,658][15401] Updated weights for policy 0, policy_version 396240 (0.0036) [2024-06-23 08:22:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.3, 300 sec: 42542.9). Total num frames: 6492094464. Throughput: 0: 42856.1. Samples: 6492269800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-23 08:22:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 08:22:34,721][15401] Updated weights for policy 0, policy_version 396250 (0.0028) [2024-06-23 08:22:38,161][15401] Updated weights for policy 0, policy_version 396260 (0.0028) [2024-06-23 08:22:38,394][15132] Fps is (10 sec: 42580.7, 60 sec: 42868.5, 300 sec: 42708.9). Total num frames: 6492323840. Throughput: 0: 42861.4. Samples: 6492397120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:22:38,394][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 08:22:42,167][15401] Updated weights for policy 0, policy_version 396270 (0.0042) [2024-06-23 08:22:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 6492536832. Throughput: 0: 42888.5. Samples: 6492652040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:22:43,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 08:22:46,050][15401] Updated weights for policy 0, policy_version 396280 (0.0039) [2024-06-23 08:22:48,389][15132] Fps is (10 sec: 40977.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6492733440. Throughput: 0: 42880.4. Samples: 6492909640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:22:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 08:22:49,777][15401] Updated weights for policy 0, policy_version 396290 (0.0032) [2024-06-23 08:22:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6492946432. Throughput: 0: 42879.9. Samples: 6493034680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:22:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 08:22:53,955][15401] Updated weights for policy 0, policy_version 396300 (0.0037) [2024-06-23 08:22:57,538][15401] Updated weights for policy 0, policy_version 396310 (0.0028) [2024-06-23 08:22:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6493192192. Throughput: 0: 42837.8. Samples: 6493292220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:22:58,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 08:23:01,626][15401] Updated weights for policy 0, policy_version 396320 (0.0043) [2024-06-23 08:23:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 6493388800. Throughput: 0: 42661.3. Samples: 6493546700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 08:23:05,291][15401] Updated weights for policy 0, policy_version 396330 (0.0035) [2024-06-23 08:23:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6493585408. Throughput: 0: 42734.6. Samples: 6493676260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 08:23:09,301][15401] Updated weights for policy 0, policy_version 396340 (0.0033) [2024-06-23 08:23:12,817][15401] Updated weights for policy 0, policy_version 396350 (0.0030) [2024-06-23 08:23:13,390][15132] Fps is (10 sec: 44232.3, 60 sec: 43417.0, 300 sec: 42709.3). Total num frames: 6493831168. Throughput: 0: 42689.4. Samples: 6493932460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:13,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 08:23:17,142][15401] Updated weights for policy 0, policy_version 396360 (0.0037) [2024-06-23 08:23:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 6494027776. Throughput: 0: 42565.4. Samples: 6494185240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 08:23:20,385][15401] Updated weights for policy 0, policy_version 396370 (0.0037) [2024-06-23 08:23:23,390][15132] Fps is (10 sec: 39325.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 6494224384. Throughput: 0: 42562.6. Samples: 6494312260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 08:23:24,712][15401] Updated weights for policy 0, policy_version 396380 (0.0033) [2024-06-23 08:23:27,893][15401] Updated weights for policy 0, policy_version 396390 (0.0028) [2024-06-23 08:23:28,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6494470144. Throughput: 0: 42649.7. Samples: 6494571280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 08:23:32,191][15401] Updated weights for policy 0, policy_version 396400 (0.0047) [2024-06-23 08:23:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6494683136. Throughput: 0: 42639.6. Samples: 6494828420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:33,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 08:23:35,508][15401] Updated weights for policy 0, policy_version 396410 (0.0031) [2024-06-23 08:23:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42601.3, 300 sec: 42542.9). Total num frames: 6494879744. Throughput: 0: 42724.8. Samples: 6494957300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 08:23:39,566][15349] Signal inference workers to stop experience collection... (96150 times) [2024-06-23 08:23:39,592][15401] InferenceWorker_p0-w0: stopping experience collection (96150 times) [2024-06-23 08:23:39,622][15349] Signal inference workers to resume experience collection... (96150 times) [2024-06-23 08:23:39,628][15401] InferenceWorker_p0-w0: resuming experience collection (96150 times) [2024-06-23 08:23:39,770][15401] Updated weights for policy 0, policy_version 396420 (0.0031) [2024-06-23 08:23:43,358][15401] Updated weights for policy 0, policy_version 396430 (0.0034) [2024-06-23 08:23:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6495109120. Throughput: 0: 42700.5. Samples: 6495213740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 08:23:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396430_6495109120.pth... [2024-06-23 08:23:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000395804_6484852736.pth [2024-06-23 08:23:47,378][15401] Updated weights for policy 0, policy_version 396440 (0.0025) [2024-06-23 08:23:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6495305728. Throughput: 0: 42771.5. Samples: 6495471420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 08:23:50,877][15401] Updated weights for policy 0, policy_version 396450 (0.0036) [2024-06-23 08:23:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6495518720. Throughput: 0: 42640.9. Samples: 6495595100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:23:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 08:23:55,338][15401] Updated weights for policy 0, policy_version 396460 (0.0036) [2024-06-23 08:23:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6495748096. Throughput: 0: 42735.7. Samples: 6495855520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:23:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 08:23:58,648][15401] Updated weights for policy 0, policy_version 396470 (0.0021) [2024-06-23 08:24:02,735][15401] Updated weights for policy 0, policy_version 396480 (0.0033) [2024-06-23 08:24:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 6495944704. Throughput: 0: 42814.9. Samples: 6496112020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:03,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 08:24:06,437][15401] Updated weights for policy 0, policy_version 396490 (0.0035) [2024-06-23 08:24:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6496141312. Throughput: 0: 42755.6. Samples: 6496236260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 08:24:10,601][15401] Updated weights for policy 0, policy_version 396500 (0.0036) [2024-06-23 08:24:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42599.1, 300 sec: 42653.9). Total num frames: 6496387072. Throughput: 0: 42859.1. Samples: 6496499940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 08:24:14,056][15401] Updated weights for policy 0, policy_version 396510 (0.0033) [2024-06-23 08:24:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.5, 300 sec: 42487.0). Total num frames: 6496567296. Throughput: 0: 42763.0. Samples: 6496752860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:18,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 08:24:18,571][15401] Updated weights for policy 0, policy_version 396520 (0.0028) [2024-06-23 08:24:21,782][15401] Updated weights for policy 0, policy_version 396530 (0.0028) [2024-06-23 08:24:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6496796672. Throughput: 0: 42592.9. Samples: 6496873980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 08:24:26,177][15401] Updated weights for policy 0, policy_version 396540 (0.0041) [2024-06-23 08:24:28,390][15132] Fps is (10 sec: 44245.3, 60 sec: 42325.0, 300 sec: 42542.8). Total num frames: 6497009664. Throughput: 0: 42782.6. Samples: 6497138980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:28,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 08:24:29,697][15401] Updated weights for policy 0, policy_version 396550 (0.0041) [2024-06-23 08:24:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6497222656. Throughput: 0: 42651.1. Samples: 6497390720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:33,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 08:24:33,661][15401] Updated weights for policy 0, policy_version 396560 (0.0028) [2024-06-23 08:24:37,216][15401] Updated weights for policy 0, policy_version 396570 (0.0037) [2024-06-23 08:24:38,389][15132] Fps is (10 sec: 44239.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6497452032. Throughput: 0: 42825.9. Samples: 6497522260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 08:24:41,171][15401] Updated weights for policy 0, policy_version 396580 (0.0025) [2024-06-23 08:24:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6497665024. Throughput: 0: 42837.7. Samples: 6497783220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:43,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 08:24:44,750][15401] Updated weights for policy 0, policy_version 396590 (0.0030) [2024-06-23 08:24:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6497878016. Throughput: 0: 42705.3. Samples: 6498033660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 08:24:48,675][15401] Updated weights for policy 0, policy_version 396600 (0.0032) [2024-06-23 08:24:52,383][15401] Updated weights for policy 0, policy_version 396610 (0.0028) [2024-06-23 08:24:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6498091008. Throughput: 0: 42943.4. Samples: 6498168720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 08:24:56,188][15401] Updated weights for policy 0, policy_version 396620 (0.0031) [2024-06-23 08:24:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6498304000. Throughput: 0: 42764.1. Samples: 6498424320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:24:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 08:25:00,047][15401] Updated weights for policy 0, policy_version 396630 (0.0036) [2024-06-23 08:25:01,600][15349] Signal inference workers to stop experience collection... (96200 times) [2024-06-23 08:25:01,600][15349] Signal inference workers to resume experience collection... (96200 times) [2024-06-23 08:25:01,647][15401] InferenceWorker_p0-w0: stopping experience collection (96200 times) [2024-06-23 08:25:01,647][15401] InferenceWorker_p0-w0: resuming experience collection (96200 times) [2024-06-23 08:25:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 6498516992. Throughput: 0: 42676.6. Samples: 6498673200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:25:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 08:25:03,686][15401] Updated weights for policy 0, policy_version 396640 (0.0037) [2024-06-23 08:25:07,802][15401] Updated weights for policy 0, policy_version 396650 (0.0032) [2024-06-23 08:25:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6498729984. Throughput: 0: 42849.4. Samples: 6498802200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:25:08,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 08:25:11,337][15401] Updated weights for policy 0, policy_version 396660 (0.0033) [2024-06-23 08:25:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6498959360. Throughput: 0: 42754.3. Samples: 6499062900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 08:25:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 08:25:15,515][15401] Updated weights for policy 0, policy_version 396670 (0.0024) [2024-06-23 08:25:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 6499155968. Throughput: 0: 42770.2. Samples: 6499315380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 08:25:18,799][15401] Updated weights for policy 0, policy_version 396680 (0.0030) [2024-06-23 08:25:23,048][15401] Updated weights for policy 0, policy_version 396690 (0.0028) [2024-06-23 08:25:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6499368960. Throughput: 0: 42709.2. Samples: 6499444180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 08:25:26,665][15401] Updated weights for policy 0, policy_version 396700 (0.0036) [2024-06-23 08:25:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.7, 300 sec: 42487.3). Total num frames: 6499549184. Throughput: 0: 42506.2. Samples: 6499696000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 08:25:30,690][15401] Updated weights for policy 0, policy_version 396710 (0.0035) [2024-06-23 08:25:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6499794944. Throughput: 0: 42714.8. Samples: 6499955820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 08:25:34,294][15401] Updated weights for policy 0, policy_version 396720 (0.0032) [2024-06-23 08:25:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6499991552. Throughput: 0: 42604.9. Samples: 6500085940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 08:25:38,908][15401] Updated weights for policy 0, policy_version 396730 (0.0024) [2024-06-23 08:25:42,069][15401] Updated weights for policy 0, policy_version 396740 (0.0024) [2024-06-23 08:25:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 6500204544. Throughput: 0: 42446.6. Samples: 6500334520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:43,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 08:25:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396742_6500220928.pth... [2024-06-23 08:25:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396116_6489964544.pth [2024-06-23 08:25:46,407][15401] Updated weights for policy 0, policy_version 396750 (0.0031) [2024-06-23 08:25:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6500433920. Throughput: 0: 42697.8. Samples: 6500594600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:48,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 08:25:49,897][15401] Updated weights for policy 0, policy_version 396760 (0.0022) [2024-06-23 08:25:53,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6500630528. Throughput: 0: 42715.4. Samples: 6500724400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 08:25:53,971][15401] Updated weights for policy 0, policy_version 396770 (0.0037) [2024-06-23 08:25:57,722][15401] Updated weights for policy 0, policy_version 396780 (0.0035) [2024-06-23 08:25:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6500859904. Throughput: 0: 42560.8. Samples: 6500978140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:25:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 08:26:01,805][15401] Updated weights for policy 0, policy_version 396790 (0.0031) [2024-06-23 08:26:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6501072896. Throughput: 0: 42717.5. Samples: 6501237660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:03,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 08:26:05,421][15401] Updated weights for policy 0, policy_version 396800 (0.0027) [2024-06-23 08:26:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 6501269504. Throughput: 0: 42724.0. Samples: 6501366860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:08,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 08:26:09,290][15401] Updated weights for policy 0, policy_version 396810 (0.0042) [2024-06-23 08:26:13,326][15401] Updated weights for policy 0, policy_version 396820 (0.0054) [2024-06-23 08:26:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6501498880. Throughput: 0: 42742.7. Samples: 6501619420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 08:26:16,935][15401] Updated weights for policy 0, policy_version 396830 (0.0033) [2024-06-23 08:26:18,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6501711872. Throughput: 0: 42728.4. Samples: 6501878600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 08:26:21,190][15401] Updated weights for policy 0, policy_version 396840 (0.0036) [2024-06-23 08:26:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6501908480. Throughput: 0: 42867.4. Samples: 6502014980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:23,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-23 08:26:24,725][15401] Updated weights for policy 0, policy_version 396850 (0.0034) [2024-06-23 08:26:27,119][15349] Signal inference workers to stop experience collection... (96250 times) [2024-06-23 08:26:27,119][15349] Signal inference workers to resume experience collection... (96250 times) [2024-06-23 08:26:27,135][15401] InferenceWorker_p0-w0: stopping experience collection (96250 times) [2024-06-23 08:26:27,135][15401] InferenceWorker_p0-w0: resuming experience collection (96250 times) [2024-06-23 08:26:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.6). Total num frames: 6502121472. Throughput: 0: 42824.1. Samples: 6502261500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 08:26:28,753][15401] Updated weights for policy 0, policy_version 396860 (0.0042) [2024-06-23 08:26:32,502][15401] Updated weights for policy 0, policy_version 396870 (0.0033) [2024-06-23 08:26:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6502367232. Throughput: 0: 42682.0. Samples: 6502515300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 08:26:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 08:26:36,287][15401] Updated weights for policy 0, policy_version 396880 (0.0026) [2024-06-23 08:26:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6502563840. Throughput: 0: 42746.8. Samples: 6502648000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:26:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 08:26:40,246][15401] Updated weights for policy 0, policy_version 396890 (0.0033) [2024-06-23 08:26:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6502776832. Throughput: 0: 42600.9. Samples: 6502895180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:26:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 08:26:43,699][15401] Updated weights for policy 0, policy_version 396900 (0.0028) [2024-06-23 08:26:47,863][15401] Updated weights for policy 0, policy_version 396910 (0.0025) [2024-06-23 08:26:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6502989824. Throughput: 0: 42562.7. Samples: 6503152980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:26:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 08:26:51,203][15401] Updated weights for policy 0, policy_version 396920 (0.0034) [2024-06-23 08:26:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6503202816. Throughput: 0: 42489.7. Samples: 6503278800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:26:53,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 08:26:55,513][15401] Updated weights for policy 0, policy_version 396930 (0.0029) [2024-06-23 08:26:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6503415808. Throughput: 0: 42539.9. Samples: 6503533720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:26:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 08:26:59,260][15401] Updated weights for policy 0, policy_version 396940 (0.0034) [2024-06-23 08:27:02,992][15401] Updated weights for policy 0, policy_version 396950 (0.0044) [2024-06-23 08:27:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 6503628800. Throughput: 0: 42671.9. Samples: 6503798940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:03,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 08:27:06,642][15401] Updated weights for policy 0, policy_version 396960 (0.0037) [2024-06-23 08:27:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 6503841792. Throughput: 0: 42494.3. Samples: 6503927220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 08:27:10,639][15401] Updated weights for policy 0, policy_version 396970 (0.0039) [2024-06-23 08:27:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6504087552. Throughput: 0: 42698.2. Samples: 6504182920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:13,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 08:27:13,988][15401] Updated weights for policy 0, policy_version 396980 (0.0043) [2024-06-23 08:27:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6504267776. Throughput: 0: 42975.4. Samples: 6504449180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 08:27:18,487][15401] Updated weights for policy 0, policy_version 396990 (0.0035) [2024-06-23 08:27:21,440][15401] Updated weights for policy 0, policy_version 397000 (0.0036) [2024-06-23 08:27:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6504480768. Throughput: 0: 42712.9. Samples: 6504570080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:23,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 08:27:26,029][15401] Updated weights for policy 0, policy_version 397010 (0.0044) [2024-06-23 08:27:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6504710144. Throughput: 0: 42914.2. Samples: 6504826320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:28,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 08:27:29,700][15401] Updated weights for policy 0, policy_version 397020 (0.0042) [2024-06-23 08:27:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42654.5). Total num frames: 6504906752. Throughput: 0: 43066.5. Samples: 6505090980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 08:27:33,598][15401] Updated weights for policy 0, policy_version 397030 (0.0028) [2024-06-23 08:27:37,341][15401] Updated weights for policy 0, policy_version 397040 (0.0030) [2024-06-23 08:27:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6505136128. Throughput: 0: 42874.7. Samples: 6505208160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 08:27:40,787][15349] Signal inference workers to stop experience collection... (96300 times) [2024-06-23 08:27:40,788][15349] Signal inference workers to resume experience collection... (96300 times) [2024-06-23 08:27:40,829][15401] InferenceWorker_p0-w0: stopping experience collection (96300 times) [2024-06-23 08:27:40,829][15401] InferenceWorker_p0-w0: resuming experience collection (96300 times) [2024-06-23 08:27:41,118][15401] Updated weights for policy 0, policy_version 397050 (0.0031) [2024-06-23 08:27:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6505365504. Throughput: 0: 43080.0. Samples: 6505472320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:43,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 08:27:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397056_6505365504.pth... [2024-06-23 08:27:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396430_6495109120.pth [2024-06-23 08:27:44,811][15401] Updated weights for policy 0, policy_version 397060 (0.0041) [2024-06-23 08:27:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 6505562112. Throughput: 0: 42951.5. Samples: 6505731760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:48,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 08:27:48,708][15401] Updated weights for policy 0, policy_version 397070 (0.0031) [2024-06-23 08:27:52,480][15401] Updated weights for policy 0, policy_version 397080 (0.0027) [2024-06-23 08:27:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6505775104. Throughput: 0: 42877.3. Samples: 6505856700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 08:27:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 08:27:56,290][15401] Updated weights for policy 0, policy_version 397090 (0.0033) [2024-06-23 08:27:58,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6506004480. Throughput: 0: 42892.5. Samples: 6506113080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:27:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 08:28:00,163][15401] Updated weights for policy 0, policy_version 397100 (0.0039) [2024-06-23 08:28:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6506184704. Throughput: 0: 42639.0. Samples: 6506367940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 08:28:04,287][15401] Updated weights for policy 0, policy_version 397110 (0.0037) [2024-06-23 08:28:08,372][15401] Updated weights for policy 0, policy_version 397120 (0.0050) [2024-06-23 08:28:08,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42654.0). Total num frames: 6506414080. Throughput: 0: 42734.9. Samples: 6506493160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 08:28:11,905][15401] Updated weights for policy 0, policy_version 397130 (0.0038) [2024-06-23 08:28:13,393][15132] Fps is (10 sec: 45858.4, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 6506643456. Throughput: 0: 42770.7. Samples: 6506751160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:13,394][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 08:28:15,999][15401] Updated weights for policy 0, policy_version 397140 (0.0024) [2024-06-23 08:28:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6506840064. Throughput: 0: 42637.3. Samples: 6507009660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 08:28:19,386][15401] Updated weights for policy 0, policy_version 397150 (0.0033) [2024-06-23 08:28:23,389][15132] Fps is (10 sec: 40975.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6507053056. Throughput: 0: 42850.3. Samples: 6507136420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 08:28:23,573][15401] Updated weights for policy 0, policy_version 397160 (0.0042) [2024-06-23 08:28:26,946][15401] Updated weights for policy 0, policy_version 397170 (0.0021) [2024-06-23 08:28:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6507282432. Throughput: 0: 42514.2. Samples: 6507385460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 08:28:31,698][15401] Updated weights for policy 0, policy_version 397180 (0.0026) [2024-06-23 08:28:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6507479040. Throughput: 0: 42578.2. Samples: 6507647680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 08:28:34,576][15401] Updated weights for policy 0, policy_version 397190 (0.0040) [2024-06-23 08:28:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6507692032. Throughput: 0: 42478.6. Samples: 6507768240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 08:28:39,636][15401] Updated weights for policy 0, policy_version 397200 (0.0044) [2024-06-23 08:28:42,606][15401] Updated weights for policy 0, policy_version 397210 (0.0034) [2024-06-23 08:28:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6507921408. Throughput: 0: 42523.4. Samples: 6508026640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 08:28:47,296][15401] Updated weights for policy 0, policy_version 397220 (0.0043) [2024-06-23 08:28:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6508118016. Throughput: 0: 42719.5. Samples: 6508290320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 08:28:50,390][15401] Updated weights for policy 0, policy_version 397230 (0.0037) [2024-06-23 08:28:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6508331008. Throughput: 0: 42539.3. Samples: 6508407420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:28:55,079][15401] Updated weights for policy 0, policy_version 397240 (0.0039) [2024-06-23 08:28:58,215][15401] Updated weights for policy 0, policy_version 397250 (0.0035) [2024-06-23 08:28:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6508544000. Throughput: 0: 42531.9. Samples: 6508664940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:28:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 08:29:02,574][15401] Updated weights for policy 0, policy_version 397260 (0.0036) [2024-06-23 08:29:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6508740608. Throughput: 0: 42680.6. Samples: 6508930280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:29:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 08:29:03,530][15349] Signal inference workers to stop experience collection... (96350 times) [2024-06-23 08:29:03,531][15349] Signal inference workers to resume experience collection... (96350 times) [2024-06-23 08:29:03,550][15401] InferenceWorker_p0-w0: stopping experience collection (96350 times) [2024-06-23 08:29:03,551][15401] InferenceWorker_p0-w0: resuming experience collection (96350 times) [2024-06-23 08:29:05,706][15401] Updated weights for policy 0, policy_version 397270 (0.0025) [2024-06-23 08:29:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6508986368. Throughput: 0: 42605.7. Samples: 6509053680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:29:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 08:29:10,283][15401] Updated weights for policy 0, policy_version 397280 (0.0032) [2024-06-23 08:29:13,361][15401] Updated weights for policy 0, policy_version 397290 (0.0034) [2024-06-23 08:29:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42601.0, 300 sec: 42820.9). Total num frames: 6509199360. Throughput: 0: 42852.5. Samples: 6509313820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 08:29:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 08:29:17,857][15401] Updated weights for policy 0, policy_version 397300 (0.0034) [2024-06-23 08:29:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 6509379584. Throughput: 0: 42630.8. Samples: 6509566060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 08:29:21,629][15401] Updated weights for policy 0, policy_version 397310 (0.0031) [2024-06-23 08:29:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 6509625344. Throughput: 0: 42797.9. Samples: 6509694140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 08:29:25,306][15401] Updated weights for policy 0, policy_version 397320 (0.0035) [2024-06-23 08:29:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6509821952. Throughput: 0: 42626.2. Samples: 6509944820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 08:29:29,179][15401] Updated weights for policy 0, policy_version 397330 (0.0033) [2024-06-23 08:29:32,735][15401] Updated weights for policy 0, policy_version 397340 (0.0030) [2024-06-23 08:29:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6510034944. Throughput: 0: 42445.4. Samples: 6510200360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 08:29:37,011][15401] Updated weights for policy 0, policy_version 397350 (0.0028) [2024-06-23 08:29:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6510264320. Throughput: 0: 42573.4. Samples: 6510323220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:38,391][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 08:29:40,491][15401] Updated weights for policy 0, policy_version 397360 (0.0036) [2024-06-23 08:29:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6510444544. Throughput: 0: 42596.3. Samples: 6510581780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 08:29:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397366_6510444544.pth... [2024-06-23 08:29:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000396742_6500220928.pth [2024-06-23 08:29:44,714][15401] Updated weights for policy 0, policy_version 397370 (0.0034) [2024-06-23 08:29:48,203][15401] Updated weights for policy 0, policy_version 397380 (0.0023) [2024-06-23 08:29:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6510673920. Throughput: 0: 42229.3. Samples: 6510830600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 08:29:52,470][15401] Updated weights for policy 0, policy_version 397390 (0.0033) [2024-06-23 08:29:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6510886912. Throughput: 0: 42351.1. Samples: 6510959480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 08:29:56,099][15401] Updated weights for policy 0, policy_version 397400 (0.0035) [2024-06-23 08:29:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6511083520. Throughput: 0: 42402.3. Samples: 6511221920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:29:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 08:30:00,114][15401] Updated weights for policy 0, policy_version 397410 (0.0038) [2024-06-23 08:30:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6511312896. Throughput: 0: 42267.8. Samples: 6511468120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 08:30:03,689][15401] Updated weights for policy 0, policy_version 397420 (0.0029) [2024-06-23 08:30:07,761][15401] Updated weights for policy 0, policy_version 397430 (0.0042) [2024-06-23 08:30:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6511525888. Throughput: 0: 42477.2. Samples: 6511605620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 08:30:11,214][15401] Updated weights for policy 0, policy_version 397440 (0.0032) [2024-06-23 08:30:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6511738880. Throughput: 0: 42594.7. Samples: 6511861580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 08:30:15,417][15401] Updated weights for policy 0, policy_version 397450 (0.0044) [2024-06-23 08:30:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6511968256. Throughput: 0: 42493.4. Samples: 6512112560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:18,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 08:30:18,755][15401] Updated weights for policy 0, policy_version 397460 (0.0025) [2024-06-23 08:30:22,952][15401] Updated weights for policy 0, policy_version 397470 (0.0033) [2024-06-23 08:30:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6512181248. Throughput: 0: 42668.0. Samples: 6512243280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:23,393][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 08:30:25,000][15349] Signal inference workers to stop experience collection... (96400 times) [2024-06-23 08:30:25,001][15349] Signal inference workers to resume experience collection... (96400 times) [2024-06-23 08:30:25,050][15401] InferenceWorker_p0-w0: stopping experience collection (96400 times) [2024-06-23 08:30:25,050][15401] InferenceWorker_p0-w0: resuming experience collection (96400 times) [2024-06-23 08:30:26,989][15401] Updated weights for policy 0, policy_version 397480 (0.0029) [2024-06-23 08:30:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6512377856. Throughput: 0: 42672.2. Samples: 6512502020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:28,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 08:30:30,524][15401] Updated weights for policy 0, policy_version 397490 (0.0028) [2024-06-23 08:30:33,393][15132] Fps is (10 sec: 42581.6, 60 sec: 42868.6, 300 sec: 42764.4). Total num frames: 6512607232. Throughput: 0: 42740.6. Samples: 6512754100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 08:30:33,394][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 08:30:34,706][15401] Updated weights for policy 0, policy_version 397500 (0.0028) [2024-06-23 08:30:38,099][15401] Updated weights for policy 0, policy_version 397510 (0.0038) [2024-06-23 08:30:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6512803840. Throughput: 0: 42787.2. Samples: 6512884900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:30:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 08:30:42,396][15401] Updated weights for policy 0, policy_version 397520 (0.0031) [2024-06-23 08:30:43,389][15132] Fps is (10 sec: 40976.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6513016832. Throughput: 0: 42756.3. Samples: 6513145960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:30:43,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-23 08:30:46,074][15401] Updated weights for policy 0, policy_version 397530 (0.0039) [2024-06-23 08:30:48,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.1, 300 sec: 42709.5). Total num frames: 6513229824. Throughput: 0: 42622.0. Samples: 6513386120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:30:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 08:30:49,978][15401] Updated weights for policy 0, policy_version 397540 (0.0038) [2024-06-23 08:30:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 6513426432. Throughput: 0: 42521.1. Samples: 6513519060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:30:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 08:30:53,708][15401] Updated weights for policy 0, policy_version 397550 (0.0028) [2024-06-23 08:30:57,883][15401] Updated weights for policy 0, policy_version 397560 (0.0035) [2024-06-23 08:30:58,391][15132] Fps is (10 sec: 39315.6, 60 sec: 42324.0, 300 sec: 42542.6). Total num frames: 6513623040. Throughput: 0: 42547.6. Samples: 6513776300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:30:58,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 08:31:01,223][15401] Updated weights for policy 0, policy_version 397570 (0.0032) [2024-06-23 08:31:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 6513868800. Throughput: 0: 42633.7. Samples: 6514031080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 08:31:05,521][15401] Updated weights for policy 0, policy_version 397580 (0.0038) [2024-06-23 08:31:08,389][15132] Fps is (10 sec: 45883.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6514081792. Throughput: 0: 42584.5. Samples: 6514159580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 08:31:08,997][15401] Updated weights for policy 0, policy_version 397590 (0.0048) [2024-06-23 08:31:12,953][15401] Updated weights for policy 0, policy_version 397600 (0.0041) [2024-06-23 08:31:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6514278400. Throughput: 0: 42528.5. Samples: 6514415800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 08:31:16,648][15401] Updated weights for policy 0, policy_version 397610 (0.0035) [2024-06-23 08:31:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6514507776. Throughput: 0: 42688.6. Samples: 6514674920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 08:31:20,491][15401] Updated weights for policy 0, policy_version 397620 (0.0027) [2024-06-23 08:31:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6514720768. Throughput: 0: 42623.2. Samples: 6514802940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 08:31:24,118][15401] Updated weights for policy 0, policy_version 397630 (0.0028) [2024-06-23 08:31:28,344][15401] Updated weights for policy 0, policy_version 397640 (0.0033) [2024-06-23 08:31:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6514933760. Throughput: 0: 42580.4. Samples: 6515062080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 08:31:31,621][15401] Updated weights for policy 0, policy_version 397650 (0.0024) [2024-06-23 08:31:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42328.2, 300 sec: 42653.9). Total num frames: 6515146752. Throughput: 0: 42770.5. Samples: 6515310780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:33,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 08:31:36,107][15401] Updated weights for policy 0, policy_version 397660 (0.0029) [2024-06-23 08:31:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6515359744. Throughput: 0: 42739.0. Samples: 6515442320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 08:31:39,494][15401] Updated weights for policy 0, policy_version 397670 (0.0037) [2024-06-23 08:31:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6515572736. Throughput: 0: 42729.2. Samples: 6515699040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 08:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397679_6515572736.pth... [2024-06-23 08:31:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397056_6505365504.pth [2024-06-23 08:31:43,732][15401] Updated weights for policy 0, policy_version 397680 (0.0048) [2024-06-23 08:31:47,303][15401] Updated weights for policy 0, policy_version 397690 (0.0034) [2024-06-23 08:31:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.7, 300 sec: 42654.0). Total num frames: 6515785728. Throughput: 0: 42778.8. Samples: 6515956120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 08:31:51,355][15401] Updated weights for policy 0, policy_version 397700 (0.0030) [2024-06-23 08:31:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6516015104. Throughput: 0: 42954.2. Samples: 6516092520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 08:31:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 08:31:54,897][15401] Updated weights for policy 0, policy_version 397710 (0.0034) [2024-06-23 08:31:58,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43144.1, 300 sec: 42653.9). Total num frames: 6516211712. Throughput: 0: 42870.9. Samples: 6516345100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:31:58,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 08:31:58,837][15401] Updated weights for policy 0, policy_version 397720 (0.0024) [2024-06-23 08:31:59,433][15349] Signal inference workers to stop experience collection... (96450 times) [2024-06-23 08:31:59,433][15349] Signal inference workers to resume experience collection... (96450 times) [2024-06-23 08:31:59,467][15401] InferenceWorker_p0-w0: stopping experience collection (96450 times) [2024-06-23 08:31:59,467][15401] InferenceWorker_p0-w0: resuming experience collection (96450 times) [2024-06-23 08:32:02,689][15401] Updated weights for policy 0, policy_version 397730 (0.0039) [2024-06-23 08:32:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6516424704. Throughput: 0: 42742.8. Samples: 6516598340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 08:32:06,679][15401] Updated weights for policy 0, policy_version 397740 (0.0030) [2024-06-23 08:32:08,392][15132] Fps is (10 sec: 44237.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 6516654080. Throughput: 0: 42717.2. Samples: 6516725320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:08,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 08:32:10,270][15401] Updated weights for policy 0, policy_version 397750 (0.0027) [2024-06-23 08:32:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6516867072. Throughput: 0: 42640.5. Samples: 6516980900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:13,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-23 08:32:14,099][15401] Updated weights for policy 0, policy_version 397760 (0.0029) [2024-06-23 08:32:17,828][15401] Updated weights for policy 0, policy_version 397770 (0.0038) [2024-06-23 08:32:18,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 6517080064. Throughput: 0: 42820.4. Samples: 6517237800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:18,392][15132] Avg episode reward: [(0, '0.152')] [2024-06-23 08:32:21,622][15401] Updated weights for policy 0, policy_version 397780 (0.0031) [2024-06-23 08:32:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6517276672. Throughput: 0: 42789.4. Samples: 6517367840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 08:32:25,422][15401] Updated weights for policy 0, policy_version 397790 (0.0058) [2024-06-23 08:32:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6517506048. Throughput: 0: 42866.7. Samples: 6517628040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 08:32:29,219][15401] Updated weights for policy 0, policy_version 397800 (0.0029) [2024-06-23 08:32:32,961][15401] Updated weights for policy 0, policy_version 397810 (0.0032) [2024-06-23 08:32:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6517735424. Throughput: 0: 42744.3. Samples: 6517879620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:33,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 08:32:36,738][15401] Updated weights for policy 0, policy_version 397820 (0.0035) [2024-06-23 08:32:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 6517932032. Throughput: 0: 42568.5. Samples: 6518008100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 08:32:40,689][15401] Updated weights for policy 0, policy_version 397830 (0.0026) [2024-06-23 08:32:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 6518128640. Throughput: 0: 42660.1. Samples: 6518264700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 08:32:44,492][15401] Updated weights for policy 0, policy_version 397840 (0.0041) [2024-06-23 08:32:48,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 6518341632. Throughput: 0: 42510.0. Samples: 6518511300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 08:32:48,522][15401] Updated weights for policy 0, policy_version 397850 (0.0035) [2024-06-23 08:32:52,174][15401] Updated weights for policy 0, policy_version 397860 (0.0034) [2024-06-23 08:32:53,395][15132] Fps is (10 sec: 40937.2, 60 sec: 42048.4, 300 sec: 42486.5). Total num frames: 6518538240. Throughput: 0: 42558.4. Samples: 6518640580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:53,395][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 08:32:56,294][15401] Updated weights for policy 0, policy_version 397870 (0.0037) [2024-06-23 08:32:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 6518751232. Throughput: 0: 42459.2. Samples: 6518891560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:32:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 08:32:59,792][15401] Updated weights for policy 0, policy_version 397880 (0.0037) [2024-06-23 08:33:03,389][15132] Fps is (10 sec: 42622.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6518964224. Throughput: 0: 42409.0. Samples: 6519146100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:33:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 08:33:04,163][15401] Updated weights for policy 0, policy_version 397890 (0.0031) [2024-06-23 08:33:07,796][15401] Updated weights for policy 0, policy_version 397900 (0.0036) [2024-06-23 08:33:08,394][15132] Fps is (10 sec: 44217.2, 60 sec: 42323.9, 300 sec: 42542.8). Total num frames: 6519193600. Throughput: 0: 42351.3. Samples: 6519273840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:33:08,394][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 08:33:11,842][15401] Updated weights for policy 0, policy_version 397910 (0.0040) [2024-06-23 08:33:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6519406592. Throughput: 0: 42250.7. Samples: 6519529320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:33:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 08:33:15,788][15401] Updated weights for policy 0, policy_version 397920 (0.0037) [2024-06-23 08:33:18,390][15132] Fps is (10 sec: 42617.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6519619584. Throughput: 0: 42400.5. Samples: 6519787640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 08:33:19,821][15401] Updated weights for policy 0, policy_version 397930 (0.0025) [2024-06-23 08:33:23,374][15401] Updated weights for policy 0, policy_version 397940 (0.0045) [2024-06-23 08:33:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6519848960. Throughput: 0: 42382.1. Samples: 6519915300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 08:33:27,385][15401] Updated weights for policy 0, policy_version 397950 (0.0035) [2024-06-23 08:33:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 6520061952. Throughput: 0: 42400.4. Samples: 6520172820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:28,392][15132] Avg episode reward: [(0, '0.187')] [2024-06-23 08:33:29,915][15349] Signal inference workers to stop experience collection... (96500 times) [2024-06-23 08:33:29,916][15349] Signal inference workers to resume experience collection... (96500 times) [2024-06-23 08:33:29,952][15401] InferenceWorker_p0-w0: stopping experience collection (96500 times) [2024-06-23 08:33:29,953][15401] InferenceWorker_p0-w0: resuming experience collection (96500 times) [2024-06-23 08:33:30,892][15401] Updated weights for policy 0, policy_version 397960 (0.0035) [2024-06-23 08:33:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6520258560. Throughput: 0: 42587.3. Samples: 6520427720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 08:33:34,908][15401] Updated weights for policy 0, policy_version 397970 (0.0039) [2024-06-23 08:33:38,396][15132] Fps is (10 sec: 42581.0, 60 sec: 42593.7, 300 sec: 42597.5). Total num frames: 6520487936. Throughput: 0: 42511.6. Samples: 6520553640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:38,397][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 08:33:38,814][15401] Updated weights for policy 0, policy_version 397980 (0.0032) [2024-06-23 08:33:42,909][15401] Updated weights for policy 0, policy_version 397990 (0.0033) [2024-06-23 08:33:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6520684544. Throughput: 0: 42627.5. Samples: 6520809800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:43,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 08:33:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397992_6520700928.pth... [2024-06-23 08:33:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397366_6510444544.pth [2024-06-23 08:33:46,418][15401] Updated weights for policy 0, policy_version 398000 (0.0032) [2024-06-23 08:33:48,392][15132] Fps is (10 sec: 40976.7, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 6520897536. Throughput: 0: 42623.9. Samples: 6521064280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:48,393][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 08:33:50,513][15401] Updated weights for policy 0, policy_version 398010 (0.0037) [2024-06-23 08:33:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42875.5, 300 sec: 42598.4). Total num frames: 6521110528. Throughput: 0: 42654.5. Samples: 6521193100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 08:33:54,199][15401] Updated weights for policy 0, policy_version 398020 (0.0041) [2024-06-23 08:33:58,010][15401] Updated weights for policy 0, policy_version 398030 (0.0041) [2024-06-23 08:33:58,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6521339904. Throughput: 0: 42672.4. Samples: 6521449580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:33:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 08:34:01,859][15401] Updated weights for policy 0, policy_version 398040 (0.0048) [2024-06-23 08:34:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6521536512. Throughput: 0: 42482.8. Samples: 6521699360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 08:34:05,574][15401] Updated weights for policy 0, policy_version 398050 (0.0024) [2024-06-23 08:34:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42601.6, 300 sec: 42542.9). Total num frames: 6521749504. Throughput: 0: 42513.0. Samples: 6521828380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 08:34:09,407][15401] Updated weights for policy 0, policy_version 398060 (0.0031) [2024-06-23 08:34:13,186][15401] Updated weights for policy 0, policy_version 398070 (0.0031) [2024-06-23 08:34:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6521978880. Throughput: 0: 42574.6. Samples: 6522088580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 08:34:16,999][15401] Updated weights for policy 0, policy_version 398080 (0.0025) [2024-06-23 08:34:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6522191872. Throughput: 0: 42398.6. Samples: 6522335660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 08:34:21,151][15401] Updated weights for policy 0, policy_version 398090 (0.0031) [2024-06-23 08:34:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6522372096. Throughput: 0: 42446.6. Samples: 6522463460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 08:34:24,604][15401] Updated weights for policy 0, policy_version 398100 (0.0027) [2024-06-23 08:34:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 6522601472. Throughput: 0: 42628.5. Samples: 6522728080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 08:34:28,626][15401] Updated weights for policy 0, policy_version 398110 (0.0030) [2024-06-23 08:34:32,278][15401] Updated weights for policy 0, policy_version 398120 (0.0036) [2024-06-23 08:34:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6522814464. Throughput: 0: 42654.3. Samples: 6522983620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 08:34:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:34:36,156][15401] Updated weights for policy 0, policy_version 398130 (0.0047) [2024-06-23 08:34:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 6523027456. Throughput: 0: 42606.2. Samples: 6523110380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:34:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 08:34:40,184][15401] Updated weights for policy 0, policy_version 398140 (0.0028) [2024-06-23 08:34:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6523256832. Throughput: 0: 42669.8. Samples: 6523369720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:34:43,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 08:34:43,946][15401] Updated weights for policy 0, policy_version 398150 (0.0027) [2024-06-23 08:34:47,786][15401] Updated weights for policy 0, policy_version 398160 (0.0039) [2024-06-23 08:34:48,391][15132] Fps is (10 sec: 42591.6, 60 sec: 42599.0, 300 sec: 42598.2). Total num frames: 6523453440. Throughput: 0: 42854.0. Samples: 6523627860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:34:48,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 08:34:51,386][15401] Updated weights for policy 0, policy_version 398170 (0.0037) [2024-06-23 08:34:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6523682816. Throughput: 0: 42740.0. Samples: 6523751680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:34:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 08:34:55,499][15401] Updated weights for policy 0, policy_version 398180 (0.0039) [2024-06-23 08:34:58,390][15132] Fps is (10 sec: 44243.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6523895808. Throughput: 0: 42846.3. Samples: 6524016660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:34:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 08:34:59,083][15401] Updated weights for policy 0, policy_version 398190 (0.0027) [2024-06-23 08:35:01,086][15349] Signal inference workers to stop experience collection... (96550 times) [2024-06-23 08:35:01,087][15349] Signal inference workers to resume experience collection... (96550 times) [2024-06-23 08:35:01,105][15401] InferenceWorker_p0-w0: stopping experience collection (96550 times) [2024-06-23 08:35:01,105][15401] InferenceWorker_p0-w0: resuming experience collection (96550 times) [2024-06-23 08:35:03,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42593.8, 300 sec: 42597.5). Total num frames: 6524092416. Throughput: 0: 42923.7. Samples: 6524267500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:03,396][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 08:35:03,684][15401] Updated weights for policy 0, policy_version 398200 (0.0029) [2024-06-23 08:35:07,085][15401] Updated weights for policy 0, policy_version 398210 (0.0036) [2024-06-23 08:35:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6524305408. Throughput: 0: 42907.0. Samples: 6524394280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 08:35:11,145][15401] Updated weights for policy 0, policy_version 398220 (0.0039) [2024-06-23 08:35:13,392][15132] Fps is (10 sec: 42615.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 6524518400. Throughput: 0: 42642.0. Samples: 6524647080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:13,393][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 08:35:14,583][15401] Updated weights for policy 0, policy_version 398230 (0.0030) [2024-06-23 08:35:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6524747776. Throughput: 0: 42647.9. Samples: 6524902780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 08:35:18,542][15401] Updated weights for policy 0, policy_version 398240 (0.0025) [2024-06-23 08:35:22,117][15401] Updated weights for policy 0, policy_version 398250 (0.0038) [2024-06-23 08:35:23,390][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6524944384. Throughput: 0: 42713.3. Samples: 6525032480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 08:35:26,295][15401] Updated weights for policy 0, policy_version 398260 (0.0029) [2024-06-23 08:35:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42543.4). Total num frames: 6525157376. Throughput: 0: 42747.9. Samples: 6525293380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:28,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 08:35:29,715][15401] Updated weights for policy 0, policy_version 398270 (0.0031) [2024-06-23 08:35:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6525370368. Throughput: 0: 42772.2. Samples: 6525552540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 08:35:33,898][15401] Updated weights for policy 0, policy_version 398280 (0.0032) [2024-06-23 08:35:37,287][15401] Updated weights for policy 0, policy_version 398290 (0.0026) [2024-06-23 08:35:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6525616128. Throughput: 0: 42813.2. Samples: 6525678280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 08:35:41,920][15401] Updated weights for policy 0, policy_version 398300 (0.0047) [2024-06-23 08:35:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6525812736. Throughput: 0: 42675.0. Samples: 6525937040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:43,396][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 08:35:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398304_6525812736.pth... [2024-06-23 08:35:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397679_6515572736.pth [2024-06-23 08:35:45,279][15401] Updated weights for policy 0, policy_version 398310 (0.0030) [2024-06-23 08:35:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42872.5, 300 sec: 42709.4). Total num frames: 6526025728. Throughput: 0: 42843.3. Samples: 6526195180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:48,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 08:35:49,721][15401] Updated weights for policy 0, policy_version 398320 (0.0037) [2024-06-23 08:35:52,830][15401] Updated weights for policy 0, policy_version 398330 (0.0031) [2024-06-23 08:35:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.8). Total num frames: 6526255104. Throughput: 0: 42886.3. Samples: 6526324160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:35:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 08:35:57,418][15401] Updated weights for policy 0, policy_version 398340 (0.0028) [2024-06-23 08:35:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6526435328. Throughput: 0: 42980.6. Samples: 6526581100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:35:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 08:36:00,485][15401] Updated weights for policy 0, policy_version 398350 (0.0033) [2024-06-23 08:36:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 6526664704. Throughput: 0: 42908.9. Samples: 6526833680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 08:36:04,912][15401] Updated weights for policy 0, policy_version 398360 (0.0032) [2024-06-23 08:36:08,254][15401] Updated weights for policy 0, policy_version 398370 (0.0034) [2024-06-23 08:36:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6526894080. Throughput: 0: 42940.1. Samples: 6526964780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 08:36:12,569][15401] Updated weights for policy 0, policy_version 398380 (0.0043) [2024-06-23 08:36:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 6527074304. Throughput: 0: 42870.7. Samples: 6527222560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 08:36:15,904][15401] Updated weights for policy 0, policy_version 398390 (0.0048) [2024-06-23 08:36:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6527303680. Throughput: 0: 42552.4. Samples: 6527467400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:18,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-23 08:36:20,496][15401] Updated weights for policy 0, policy_version 398400 (0.0034) [2024-06-23 08:36:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6527516672. Throughput: 0: 42676.4. Samples: 6527598720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 08:36:23,775][15401] Updated weights for policy 0, policy_version 398410 (0.0038) [2024-06-23 08:36:27,996][15401] Updated weights for policy 0, policy_version 398420 (0.0044) [2024-06-23 08:36:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6527713280. Throughput: 0: 42669.5. Samples: 6527857160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 08:36:31,425][15401] Updated weights for policy 0, policy_version 398430 (0.0033) [2024-06-23 08:36:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6527959040. Throughput: 0: 42505.8. Samples: 6528107940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:33,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 08:36:35,566][15401] Updated weights for policy 0, policy_version 398440 (0.0032) [2024-06-23 08:36:36,953][15349] Signal inference workers to stop experience collection... (96600 times) [2024-06-23 08:36:37,000][15401] InferenceWorker_p0-w0: stopping experience collection (96600 times) [2024-06-23 08:36:37,007][15349] Signal inference workers to resume experience collection... (96600 times) [2024-06-23 08:36:37,013][15401] InferenceWorker_p0-w0: resuming experience collection (96600 times) [2024-06-23 08:36:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6528172032. Throughput: 0: 42636.1. Samples: 6528242780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 08:36:38,958][15401] Updated weights for policy 0, policy_version 398450 (0.0025) [2024-06-23 08:36:43,183][15401] Updated weights for policy 0, policy_version 398460 (0.0041) [2024-06-23 08:36:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6528368640. Throughput: 0: 42722.1. Samples: 6528503600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 08:36:46,645][15401] Updated weights for policy 0, policy_version 398470 (0.0027) [2024-06-23 08:36:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 6528614400. Throughput: 0: 42650.2. Samples: 6528753040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:48,392][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 08:36:50,936][15401] Updated weights for policy 0, policy_version 398480 (0.0037) [2024-06-23 08:36:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 6528794624. Throughput: 0: 42602.6. Samples: 6528881900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 08:36:54,052][15401] Updated weights for policy 0, policy_version 398490 (0.0040) [2024-06-23 08:36:58,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6529007616. Throughput: 0: 42571.5. Samples: 6529138280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:36:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 08:36:58,482][15401] Updated weights for policy 0, policy_version 398500 (0.0036) [2024-06-23 08:37:02,081][15401] Updated weights for policy 0, policy_version 398510 (0.0036) [2024-06-23 08:37:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 6529236992. Throughput: 0: 42561.4. Samples: 6529382660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:37:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 08:37:07,093][15401] Updated weights for policy 0, policy_version 398520 (0.0035) [2024-06-23 08:37:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6529433600. Throughput: 0: 42600.9. Samples: 6529515760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:37:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 08:37:09,796][15401] Updated weights for policy 0, policy_version 398530 (0.0036) [2024-06-23 08:37:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 6529646592. Throughput: 0: 42450.1. Samples: 6529767420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 08:37:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 08:37:14,641][15401] Updated weights for policy 0, policy_version 398540 (0.0025) [2024-06-23 08:37:17,657][15401] Updated weights for policy 0, policy_version 398550 (0.0037) [2024-06-23 08:37:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6529892352. Throughput: 0: 42521.8. Samples: 6530021420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 08:37:22,183][15401] Updated weights for policy 0, policy_version 398560 (0.0028) [2024-06-23 08:37:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6530056192. Throughput: 0: 42402.2. Samples: 6530150880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 08:37:25,506][15401] Updated weights for policy 0, policy_version 398570 (0.0036) [2024-06-23 08:37:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6530301952. Throughput: 0: 42257.8. Samples: 6530405200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 08:37:29,670][15401] Updated weights for policy 0, policy_version 398580 (0.0036) [2024-06-23 08:37:33,033][15401] Updated weights for policy 0, policy_version 398590 (0.0037) [2024-06-23 08:37:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6530514944. Throughput: 0: 42581.4. Samples: 6530669100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 08:37:37,081][15401] Updated weights for policy 0, policy_version 398600 (0.0030) [2024-06-23 08:37:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6530695168. Throughput: 0: 42566.7. Samples: 6530797400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 08:37:40,533][15401] Updated weights for policy 0, policy_version 398610 (0.0033) [2024-06-23 08:37:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6530924544. Throughput: 0: 42363.1. Samples: 6531044620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:43,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 08:37:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398616_6530924544.pth... [2024-06-23 08:37:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000397992_6520700928.pth [2024-06-23 08:37:44,674][15401] Updated weights for policy 0, policy_version 398620 (0.0035) [2024-06-23 08:37:47,890][15349] Signal inference workers to stop experience collection... (96650 times) [2024-06-23 08:37:47,912][15401] InferenceWorker_p0-w0: stopping experience collection (96650 times) [2024-06-23 08:37:47,947][15349] Signal inference workers to resume experience collection... (96650 times) [2024-06-23 08:37:47,948][15401] InferenceWorker_p0-w0: resuming experience collection (96650 times) [2024-06-23 08:37:48,283][15401] Updated weights for policy 0, policy_version 398630 (0.0034) [2024-06-23 08:37:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.0, 300 sec: 42765.8). Total num frames: 6531153920. Throughput: 0: 42681.7. Samples: 6531303340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 08:37:52,204][15401] Updated weights for policy 0, policy_version 398640 (0.0024) [2024-06-23 08:37:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6531334144. Throughput: 0: 42560.4. Samples: 6531430980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 08:37:55,795][15401] Updated weights for policy 0, policy_version 398650 (0.0030) [2024-06-23 08:37:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6531579904. Throughput: 0: 42700.9. Samples: 6531688960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:37:58,396][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 08:38:00,323][15401] Updated weights for policy 0, policy_version 398660 (0.0025) [2024-06-23 08:38:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 6531792896. Throughput: 0: 42760.5. Samples: 6531945640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 08:38:03,700][15401] Updated weights for policy 0, policy_version 398670 (0.0033) [2024-06-23 08:38:07,793][15401] Updated weights for policy 0, policy_version 398680 (0.0038) [2024-06-23 08:38:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6531989504. Throughput: 0: 42676.4. Samples: 6532071320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 08:38:11,456][15401] Updated weights for policy 0, policy_version 398690 (0.0028) [2024-06-23 08:38:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6532218880. Throughput: 0: 42757.9. Samples: 6532329300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 08:38:15,229][15401] Updated weights for policy 0, policy_version 398700 (0.0030) [2024-06-23 08:38:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6532415488. Throughput: 0: 42765.3. Samples: 6532593540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 08:38:19,062][15401] Updated weights for policy 0, policy_version 398710 (0.0050) [2024-06-23 08:38:22,572][15401] Updated weights for policy 0, policy_version 398720 (0.0039) [2024-06-23 08:38:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 6532628480. Throughput: 0: 42733.3. Samples: 6532720400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:23,394][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 08:38:26,643][15401] Updated weights for policy 0, policy_version 398730 (0.0032) [2024-06-23 08:38:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6532857856. Throughput: 0: 42789.3. Samples: 6532970140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:28,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 08:38:30,613][15401] Updated weights for policy 0, policy_version 398740 (0.0041) [2024-06-23 08:38:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42599.3). Total num frames: 6533054464. Throughput: 0: 42811.5. Samples: 6533229860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 08:38:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 08:38:34,435][15401] Updated weights for policy 0, policy_version 398750 (0.0047) [2024-06-23 08:38:38,150][15401] Updated weights for policy 0, policy_version 398760 (0.0034) [2024-06-23 08:38:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 6533300224. Throughput: 0: 42832.5. Samples: 6533358540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:38:38,392][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 08:38:41,991][15401] Updated weights for policy 0, policy_version 398770 (0.0037) [2024-06-23 08:38:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6533496832. Throughput: 0: 42870.8. Samples: 6533618140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:38:43,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 08:38:45,596][15401] Updated weights for policy 0, policy_version 398780 (0.0026) [2024-06-23 08:38:48,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6533693440. Throughput: 0: 42965.9. Samples: 6533879100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:38:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 08:38:50,070][15401] Updated weights for policy 0, policy_version 398790 (0.0033) [2024-06-23 08:38:53,079][15401] Updated weights for policy 0, policy_version 398800 (0.0026) [2024-06-23 08:38:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 6533955584. Throughput: 0: 42990.6. Samples: 6534005900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:38:53,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-23 08:38:54,088][15349] Signal inference workers to stop experience collection... (96700 times) [2024-06-23 08:38:54,091][15349] Signal inference workers to resume experience collection... (96700 times) [2024-06-23 08:38:54,103][15401] InferenceWorker_p0-w0: stopping experience collection (96700 times) [2024-06-23 08:38:54,104][15401] InferenceWorker_p0-w0: resuming experience collection (96700 times) [2024-06-23 08:38:57,586][15401] Updated weights for policy 0, policy_version 398810 (0.0032) [2024-06-23 08:38:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6534135808. Throughput: 0: 43211.1. Samples: 6534273800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:38:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 08:39:00,566][15401] Updated weights for policy 0, policy_version 398820 (0.0030) [2024-06-23 08:39:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6534348800. Throughput: 0: 43013.8. Samples: 6534529160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:03,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 08:39:05,221][15401] Updated weights for policy 0, policy_version 398830 (0.0033) [2024-06-23 08:39:07,920][15401] Updated weights for policy 0, policy_version 398840 (0.0039) [2024-06-23 08:39:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 6534610944. Throughput: 0: 43097.0. Samples: 6534659760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 08:39:12,872][15401] Updated weights for policy 0, policy_version 398850 (0.0043) [2024-06-23 08:39:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6534791168. Throughput: 0: 43446.7. Samples: 6534925240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 08:39:15,858][15401] Updated weights for policy 0, policy_version 398860 (0.0029) [2024-06-23 08:39:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6535004160. Throughput: 0: 43115.7. Samples: 6535170060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 08:39:20,399][15401] Updated weights for policy 0, policy_version 398870 (0.0045) [2024-06-23 08:39:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6535233536. Throughput: 0: 43045.0. Samples: 6535295460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 08:39:23,422][15401] Updated weights for policy 0, policy_version 398880 (0.0046) [2024-06-23 08:39:28,065][15401] Updated weights for policy 0, policy_version 398890 (0.0036) [2024-06-23 08:39:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6535430144. Throughput: 0: 43216.0. Samples: 6535562860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 08:39:31,219][15401] Updated weights for policy 0, policy_version 398900 (0.0040) [2024-06-23 08:39:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6535643136. Throughput: 0: 42851.4. Samples: 6535807420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:33,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 08:39:35,844][15401] Updated weights for policy 0, policy_version 398910 (0.0035) [2024-06-23 08:39:38,392][15132] Fps is (10 sec: 44225.3, 60 sec: 42871.4, 300 sec: 42764.6). Total num frames: 6535872512. Throughput: 0: 43039.4. Samples: 6535942780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:38,392][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 08:39:38,974][15401] Updated weights for policy 0, policy_version 398920 (0.0036) [2024-06-23 08:39:43,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.6, 300 sec: 42653.8). Total num frames: 6536036352. Throughput: 0: 42700.7. Samples: 6536195440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:43,393][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 08:39:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398929_6536052736.pth... [2024-06-23 08:39:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398304_6525812736.pth [2024-06-23 08:39:43,782][15401] Updated weights for policy 0, policy_version 398930 (0.0035) [2024-06-23 08:39:46,826][15401] Updated weights for policy 0, policy_version 398940 (0.0033) [2024-06-23 08:39:48,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 6536298496. Throughput: 0: 42518.9. Samples: 6536442520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 08:39:51,366][15401] Updated weights for policy 0, policy_version 398950 (0.0035) [2024-06-23 08:39:53,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 6536478720. Throughput: 0: 42541.3. Samples: 6536574120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 08:39:54,354][15401] Updated weights for policy 0, policy_version 398960 (0.0032) [2024-06-23 08:39:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 6536691712. Throughput: 0: 42393.8. Samples: 6536832960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 08:39:58,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 08:39:58,844][15401] Updated weights for policy 0, policy_version 398970 (0.0037) [2024-06-23 08:40:01,923][15401] Updated weights for policy 0, policy_version 398980 (0.0035) [2024-06-23 08:40:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6536937472. Throughput: 0: 42402.0. Samples: 6537078160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:03,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 08:40:06,424][15401] Updated weights for policy 0, policy_version 398990 (0.0032) [2024-06-23 08:40:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42765.4). Total num frames: 6537134080. Throughput: 0: 42679.9. Samples: 6537216060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:08,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 08:40:09,902][15401] Updated weights for policy 0, policy_version 399000 (0.0029) [2024-06-23 08:40:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6537347072. Throughput: 0: 42404.4. Samples: 6537471060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 08:40:14,024][15401] Updated weights for policy 0, policy_version 399010 (0.0050) [2024-06-23 08:40:16,650][15349] Signal inference workers to stop experience collection... (96750 times) [2024-06-23 08:40:16,659][15401] InferenceWorker_p0-w0: stopping experience collection (96750 times) [2024-06-23 08:40:16,705][15349] Signal inference workers to resume experience collection... (96750 times) [2024-06-23 08:40:16,706][15401] InferenceWorker_p0-w0: resuming experience collection (96750 times) [2024-06-23 08:40:17,516][15401] Updated weights for policy 0, policy_version 399020 (0.0031) [2024-06-23 08:40:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6537560064. Throughput: 0: 42632.6. Samples: 6537725880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:18,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 08:40:21,610][15401] Updated weights for policy 0, policy_version 399030 (0.0033) [2024-06-23 08:40:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 6537789440. Throughput: 0: 42597.8. Samples: 6537859580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 08:40:25,073][15401] Updated weights for policy 0, policy_version 399040 (0.0023) [2024-06-23 08:40:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6537986048. Throughput: 0: 42622.8. Samples: 6538113360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:28,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 08:40:29,167][15401] Updated weights for policy 0, policy_version 399050 (0.0029) [2024-06-23 08:40:32,875][15401] Updated weights for policy 0, policy_version 399060 (0.0036) [2024-06-23 08:40:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6538215424. Throughput: 0: 42831.5. Samples: 6538369940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 08:40:36,925][15401] Updated weights for policy 0, policy_version 399070 (0.0026) [2024-06-23 08:40:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 6538428416. Throughput: 0: 42826.5. Samples: 6538501320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:38,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 08:40:40,507][15401] Updated weights for policy 0, policy_version 399080 (0.0031) [2024-06-23 08:40:43,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43146.4, 300 sec: 42709.5). Total num frames: 6538625024. Throughput: 0: 42580.6. Samples: 6538749080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 08:40:44,473][15401] Updated weights for policy 0, policy_version 399090 (0.0037) [2024-06-23 08:40:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 6538838016. Throughput: 0: 42802.0. Samples: 6539004240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 08:40:48,395][15401] Updated weights for policy 0, policy_version 399100 (0.0031) [2024-06-23 08:40:52,618][15401] Updated weights for policy 0, policy_version 399110 (0.0038) [2024-06-23 08:40:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6539034624. Throughput: 0: 42638.3. Samples: 6539134780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 08:40:56,099][15401] Updated weights for policy 0, policy_version 399120 (0.0030) [2024-06-23 08:40:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6539280384. Throughput: 0: 42570.6. Samples: 6539386740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:40:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 08:41:00,289][15401] Updated weights for policy 0, policy_version 399130 (0.0028) [2024-06-23 08:41:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 6539476992. Throughput: 0: 42540.4. Samples: 6539640200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:41:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 08:41:03,967][15401] Updated weights for policy 0, policy_version 399140 (0.0028) [2024-06-23 08:41:07,943][15401] Updated weights for policy 0, policy_version 399150 (0.0035) [2024-06-23 08:41:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6539673600. Throughput: 0: 42438.0. Samples: 6539769280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:41:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 08:41:11,677][15401] Updated weights for policy 0, policy_version 399160 (0.0035) [2024-06-23 08:41:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6539902976. Throughput: 0: 42528.5. Samples: 6540027140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:41:13,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 08:41:15,672][15401] Updated weights for policy 0, policy_version 399170 (0.0042) [2024-06-23 08:41:18,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6540132352. Throughput: 0: 42396.5. Samples: 6540277880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 08:41:18,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 08:41:19,454][15401] Updated weights for policy 0, policy_version 399180 (0.0047) [2024-06-23 08:41:23,268][15401] Updated weights for policy 0, policy_version 399190 (0.0029) [2024-06-23 08:41:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 6540328960. Throughput: 0: 42376.1. Samples: 6540408240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 08:41:27,103][15401] Updated weights for policy 0, policy_version 399200 (0.0035) [2024-06-23 08:41:28,391][15132] Fps is (10 sec: 40961.9, 60 sec: 42597.0, 300 sec: 42653.7). Total num frames: 6540541952. Throughput: 0: 42584.7. Samples: 6540665480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:28,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 08:41:28,753][15349] Signal inference workers to stop experience collection... (96800 times) [2024-06-23 08:41:28,753][15349] Signal inference workers to resume experience collection... (96800 times) [2024-06-23 08:41:28,796][15401] InferenceWorker_p0-w0: stopping experience collection (96800 times) [2024-06-23 08:41:28,796][15401] InferenceWorker_p0-w0: resuming experience collection (96800 times) [2024-06-23 08:41:30,925][15401] Updated weights for policy 0, policy_version 399210 (0.0023) [2024-06-23 08:41:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6540754944. Throughput: 0: 42565.1. Samples: 6540919680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:33,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 08:41:34,766][15401] Updated weights for policy 0, policy_version 399220 (0.0035) [2024-06-23 08:41:38,389][15132] Fps is (10 sec: 42606.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6540967936. Throughput: 0: 42592.4. Samples: 6541051440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 08:41:38,548][15401] Updated weights for policy 0, policy_version 399230 (0.0032) [2024-06-23 08:41:42,642][15401] Updated weights for policy 0, policy_version 399240 (0.0035) [2024-06-23 08:41:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 6541164544. Throughput: 0: 42800.9. Samples: 6541312780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 08:41:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399242_6541180928.pth... [2024-06-23 08:41:43,570][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398616_6530924544.pth [2024-06-23 08:41:46,283][15401] Updated weights for policy 0, policy_version 399250 (0.0030) [2024-06-23 08:41:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6541410304. Throughput: 0: 42749.8. Samples: 6541563940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:48,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 08:41:50,304][15401] Updated weights for policy 0, policy_version 399260 (0.0032) [2024-06-23 08:41:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6541606912. Throughput: 0: 42920.8. Samples: 6541700720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 08:41:53,869][15401] Updated weights for policy 0, policy_version 399270 (0.0032) [2024-06-23 08:41:57,868][15401] Updated weights for policy 0, policy_version 399280 (0.0041) [2024-06-23 08:41:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6541836288. Throughput: 0: 42953.7. Samples: 6541960060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:41:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 08:42:01,506][15401] Updated weights for policy 0, policy_version 399290 (0.0025) [2024-06-23 08:42:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6542049280. Throughput: 0: 42950.3. Samples: 6542210540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 08:42:05,384][15401] Updated weights for policy 0, policy_version 399300 (0.0027) [2024-06-23 08:42:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6542262272. Throughput: 0: 43083.8. Samples: 6542347020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 08:42:09,034][15401] Updated weights for policy 0, policy_version 399310 (0.0031) [2024-06-23 08:42:12,829][15401] Updated weights for policy 0, policy_version 399320 (0.0033) [2024-06-23 08:42:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 6542458880. Throughput: 0: 43055.5. Samples: 6542602900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 08:42:16,714][15401] Updated weights for policy 0, policy_version 399330 (0.0033) [2024-06-23 08:42:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 6542688256. Throughput: 0: 42991.7. Samples: 6542854300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 08:42:20,683][15401] Updated weights for policy 0, policy_version 399340 (0.0031) [2024-06-23 08:42:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6542917632. Throughput: 0: 42989.2. Samples: 6542985960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 08:42:24,173][15401] Updated weights for policy 0, policy_version 399350 (0.0046) [2024-06-23 08:42:28,238][15401] Updated weights for policy 0, policy_version 399360 (0.0035) [2024-06-23 08:42:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.9, 300 sec: 42709.5). Total num frames: 6543114240. Throughput: 0: 42975.2. Samples: 6543246660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 08:42:31,710][15401] Updated weights for policy 0, policy_version 399370 (0.0033) [2024-06-23 08:42:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6543327232. Throughput: 0: 43070.2. Samples: 6543502100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:42:35,720][15401] Updated weights for policy 0, policy_version 399380 (0.0031) [2024-06-23 08:42:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6543572992. Throughput: 0: 42826.7. Samples: 6543627920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 08:42:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 08:42:39,351][15401] Updated weights for policy 0, policy_version 399390 (0.0030) [2024-06-23 08:42:43,127][15349] Signal inference workers to stop experience collection... (96850 times) [2024-06-23 08:42:43,179][15401] InferenceWorker_p0-w0: stopping experience collection (96850 times) [2024-06-23 08:42:43,180][15349] Signal inference workers to resume experience collection... (96850 times) [2024-06-23 08:42:43,195][15401] InferenceWorker_p0-w0: resuming experience collection (96850 times) [2024-06-23 08:42:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6543753216. Throughput: 0: 42915.1. Samples: 6543891240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:42:43,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 08:42:43,525][15401] Updated weights for policy 0, policy_version 399400 (0.0035) [2024-06-23 08:42:47,117][15401] Updated weights for policy 0, policy_version 399410 (0.0025) [2024-06-23 08:42:48,393][15132] Fps is (10 sec: 40946.7, 60 sec: 42869.1, 300 sec: 42875.6). Total num frames: 6543982592. Throughput: 0: 42952.9. Samples: 6544143560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:42:48,393][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:42:51,002][15401] Updated weights for policy 0, policy_version 399420 (0.0026) [2024-06-23 08:42:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6544211968. Throughput: 0: 42734.4. Samples: 6544270060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:42:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 08:42:54,679][15401] Updated weights for policy 0, policy_version 399430 (0.0028) [2024-06-23 08:42:58,390][15132] Fps is (10 sec: 40973.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6544392192. Throughput: 0: 42730.8. Samples: 6544525780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:42:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 08:42:58,897][15401] Updated weights for policy 0, policy_version 399440 (0.0039) [2024-06-23 08:43:02,146][15401] Updated weights for policy 0, policy_version 399450 (0.0026) [2024-06-23 08:43:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6544621568. Throughput: 0: 42991.1. Samples: 6544788900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 08:43:06,725][15401] Updated weights for policy 0, policy_version 399460 (0.0028) [2024-06-23 08:43:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 6544850944. Throughput: 0: 42961.0. Samples: 6544919200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:08,400][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 08:43:09,678][15401] Updated weights for policy 0, policy_version 399470 (0.0045) [2024-06-23 08:43:13,393][15132] Fps is (10 sec: 42582.1, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 6545047552. Throughput: 0: 42814.6. Samples: 6545173480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:13,394][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 08:43:14,197][15401] Updated weights for policy 0, policy_version 399480 (0.0030) [2024-06-23 08:43:17,257][15401] Updated weights for policy 0, policy_version 399490 (0.0046) [2024-06-23 08:43:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6545260544. Throughput: 0: 42844.0. Samples: 6545430080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:43:21,679][15401] Updated weights for policy 0, policy_version 399500 (0.0035) [2024-06-23 08:43:23,390][15132] Fps is (10 sec: 45892.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6545506304. Throughput: 0: 42941.7. Samples: 6545560300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 08:43:25,017][15401] Updated weights for policy 0, policy_version 399510 (0.0029) [2024-06-23 08:43:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 6545702912. Throughput: 0: 42693.7. Samples: 6545812560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:28,393][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 08:43:29,413][15401] Updated weights for policy 0, policy_version 399520 (0.0030) [2024-06-23 08:43:32,613][15401] Updated weights for policy 0, policy_version 399530 (0.0038) [2024-06-23 08:43:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6545899520. Throughput: 0: 42842.6. Samples: 6546071340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 08:43:37,062][15401] Updated weights for policy 0, policy_version 399540 (0.0041) [2024-06-23 08:43:38,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6546128896. Throughput: 0: 42907.0. Samples: 6546200880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 08:43:40,643][15401] Updated weights for policy 0, policy_version 399550 (0.0033) [2024-06-23 08:43:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6546325504. Throughput: 0: 42771.6. Samples: 6546450500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 08:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399556_6546325504.pth... [2024-06-23 08:43:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000398929_6536052736.pth [2024-06-23 08:43:44,618][15401] Updated weights for policy 0, policy_version 399560 (0.0032) [2024-06-23 08:43:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.8, 300 sec: 42653.9). Total num frames: 6546538496. Throughput: 0: 42841.4. Samples: 6546716760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 08:43:48,431][15401] Updated weights for policy 0, policy_version 399570 (0.0023) [2024-06-23 08:43:52,382][15401] Updated weights for policy 0, policy_version 399580 (0.0036) [2024-06-23 08:43:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6546767872. Throughput: 0: 42760.9. Samples: 6546843440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 08:43:56,046][15401] Updated weights for policy 0, policy_version 399590 (0.0030) [2024-06-23 08:43:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6546980864. Throughput: 0: 42691.1. Samples: 6547094420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:43:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 08:44:00,130][15401] Updated weights for policy 0, policy_version 399600 (0.0050) [2024-06-23 08:44:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6547177472. Throughput: 0: 42847.0. Samples: 6547358200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 08:44:03,879][15401] Updated weights for policy 0, policy_version 399610 (0.0027) [2024-06-23 08:44:05,116][15349] Signal inference workers to stop experience collection... (96900 times) [2024-06-23 08:44:05,117][15349] Signal inference workers to resume experience collection... (96900 times) [2024-06-23 08:44:05,133][15401] InferenceWorker_p0-w0: stopping experience collection (96900 times) [2024-06-23 08:44:05,140][15401] InferenceWorker_p0-w0: resuming experience collection (96900 times) [2024-06-23 08:44:07,495][15401] Updated weights for policy 0, policy_version 399620 (0.0042) [2024-06-23 08:44:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6547406848. Throughput: 0: 42735.6. Samples: 6547483400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 08:44:11,743][15401] Updated weights for policy 0, policy_version 399630 (0.0025) [2024-06-23 08:44:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42874.3, 300 sec: 42765.0). Total num frames: 6547619840. Throughput: 0: 42865.0. Samples: 6547741380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 08:44:15,326][15401] Updated weights for policy 0, policy_version 399640 (0.0021) [2024-06-23 08:44:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6547832832. Throughput: 0: 42681.5. Samples: 6547992000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 08:44:19,295][15401] Updated weights for policy 0, policy_version 399650 (0.0030) [2024-06-23 08:44:23,142][15401] Updated weights for policy 0, policy_version 399660 (0.0041) [2024-06-23 08:44:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6548029440. Throughput: 0: 42635.5. Samples: 6548119480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 08:44:27,083][15401] Updated weights for policy 0, policy_version 399670 (0.0033) [2024-06-23 08:44:28,391][15132] Fps is (10 sec: 42593.9, 60 sec: 42599.4, 300 sec: 42764.9). Total num frames: 6548258816. Throughput: 0: 42932.3. Samples: 6548382500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:28,391][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 08:44:30,735][15401] Updated weights for policy 0, policy_version 399680 (0.0028) [2024-06-23 08:44:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6548471808. Throughput: 0: 42589.3. Samples: 6548633280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:33,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-23 08:44:34,849][15401] Updated weights for policy 0, policy_version 399690 (0.0036) [2024-06-23 08:44:38,343][15401] Updated weights for policy 0, policy_version 399700 (0.0036) [2024-06-23 08:44:38,390][15132] Fps is (10 sec: 42602.7, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 6548684800. Throughput: 0: 42598.2. Samples: 6548760360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 08:44:42,353][15401] Updated weights for policy 0, policy_version 399710 (0.0031) [2024-06-23 08:44:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6548897792. Throughput: 0: 42861.4. Samples: 6549023180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 08:44:46,051][15401] Updated weights for policy 0, policy_version 399720 (0.0026) [2024-06-23 08:44:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6549127168. Throughput: 0: 42525.0. Samples: 6549271820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 08:44:49,933][15401] Updated weights for policy 0, policy_version 399730 (0.0038) [2024-06-23 08:44:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6549307392. Throughput: 0: 42609.9. Samples: 6549400840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 08:44:53,643][15401] Updated weights for policy 0, policy_version 399740 (0.0033) [2024-06-23 08:44:57,631][15401] Updated weights for policy 0, policy_version 399750 (0.0046) [2024-06-23 08:44:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6549520384. Throughput: 0: 42611.4. Samples: 6549658900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:44:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 08:45:01,122][15401] Updated weights for policy 0, policy_version 399760 (0.0034) [2024-06-23 08:45:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6549749760. Throughput: 0: 42553.3. Samples: 6549907000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:45:03,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 08:45:05,107][15401] Updated weights for policy 0, policy_version 399770 (0.0032) [2024-06-23 08:45:08,385][15349] Signal inference workers to stop experience collection... (96950 times) [2024-06-23 08:45:08,386][15349] Signal inference workers to resume experience collection... (96950 times) [2024-06-23 08:45:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6549946368. Throughput: 0: 42694.3. Samples: 6550040720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:45:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 08:45:08,425][15401] InferenceWorker_p0-w0: stopping experience collection (96950 times) [2024-06-23 08:45:08,426][15401] InferenceWorker_p0-w0: resuming experience collection (96950 times) [2024-06-23 08:45:08,994][15401] Updated weights for policy 0, policy_version 399780 (0.0036) [2024-06-23 08:45:12,692][15401] Updated weights for policy 0, policy_version 399790 (0.0032) [2024-06-23 08:45:13,394][15132] Fps is (10 sec: 40951.8, 60 sec: 42322.2, 300 sec: 42708.8). Total num frames: 6550159360. Throughput: 0: 42502.2. Samples: 6550295240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:45:13,394][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 08:45:16,859][15401] Updated weights for policy 0, policy_version 399800 (0.0036) [2024-06-23 08:45:18,396][15132] Fps is (10 sec: 45845.5, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 6550405120. Throughput: 0: 42357.1. Samples: 6550539620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 08:45:18,397][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 08:45:20,831][15401] Updated weights for policy 0, policy_version 399810 (0.0035) [2024-06-23 08:45:23,389][15132] Fps is (10 sec: 40978.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6550568960. Throughput: 0: 42573.0. Samples: 6550676140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 08:45:24,382][15401] Updated weights for policy 0, policy_version 399820 (0.0033) [2024-06-23 08:45:28,324][15401] Updated weights for policy 0, policy_version 399830 (0.0036) [2024-06-23 08:45:28,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42599.2, 300 sec: 42709.5). Total num frames: 6550814720. Throughput: 0: 42460.5. Samples: 6550933900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 08:45:31,856][15401] Updated weights for policy 0, policy_version 399840 (0.0034) [2024-06-23 08:45:33,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6551044096. Throughput: 0: 42561.1. Samples: 6551187080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 08:45:35,879][15401] Updated weights for policy 0, policy_version 399850 (0.0031) [2024-06-23 08:45:38,392][15132] Fps is (10 sec: 40951.3, 60 sec: 42323.9, 300 sec: 42709.2). Total num frames: 6551224320. Throughput: 0: 42766.9. Samples: 6551325440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:38,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 08:45:39,363][15401] Updated weights for policy 0, policy_version 399860 (0.0038) [2024-06-23 08:45:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6551453696. Throughput: 0: 42722.7. Samples: 6551581420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 08:45:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399870_6551470080.pth... [2024-06-23 08:45:43,466][15401] Updated weights for policy 0, policy_version 399870 (0.0035) [2024-06-23 08:45:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399242_6541180928.pth [2024-06-23 08:45:46,988][15401] Updated weights for policy 0, policy_version 399880 (0.0031) [2024-06-23 08:45:48,389][15132] Fps is (10 sec: 47524.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6551699456. Throughput: 0: 42911.2. Samples: 6551837900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 08:45:51,026][15401] Updated weights for policy 0, policy_version 399890 (0.0022) [2024-06-23 08:45:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6551863296. Throughput: 0: 42780.4. Samples: 6551965840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 08:45:54,656][15401] Updated weights for policy 0, policy_version 399900 (0.0045) [2024-06-23 08:45:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6552092672. Throughput: 0: 42753.9. Samples: 6552218980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:45:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 08:45:58,663][15401] Updated weights for policy 0, policy_version 399910 (0.0025) [2024-06-23 08:46:02,597][15401] Updated weights for policy 0, policy_version 399920 (0.0037) [2024-06-23 08:46:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 6552322048. Throughput: 0: 43088.3. Samples: 6552478320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 08:46:06,142][15401] Updated weights for policy 0, policy_version 399930 (0.0038) [2024-06-23 08:46:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6552502272. Throughput: 0: 42880.9. Samples: 6552605780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 08:46:09,998][15401] Updated weights for policy 0, policy_version 399940 (0.0045) [2024-06-23 08:46:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43147.7, 300 sec: 42765.4). Total num frames: 6552748032. Throughput: 0: 42812.4. Samples: 6552860460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 08:46:13,679][15401] Updated weights for policy 0, policy_version 399950 (0.0030) [2024-06-23 08:46:16,060][15349] Signal inference workers to stop experience collection... (97000 times) [2024-06-23 08:46:16,099][15401] InferenceWorker_p0-w0: stopping experience collection (97000 times) [2024-06-23 08:46:16,117][15349] Signal inference workers to resume experience collection... (97000 times) [2024-06-23 08:46:16,118][15401] InferenceWorker_p0-w0: resuming experience collection (97000 times) [2024-06-23 08:46:18,111][15401] Updated weights for policy 0, policy_version 399960 (0.0035) [2024-06-23 08:46:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 6552944640. Throughput: 0: 43085.0. Samples: 6553125900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:18,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 08:46:21,667][15401] Updated weights for policy 0, policy_version 399970 (0.0033) [2024-06-23 08:46:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 6553157632. Throughput: 0: 42830.3. Samples: 6553252720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 08:46:25,759][15401] Updated weights for policy 0, policy_version 399980 (0.0030) [2024-06-23 08:46:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6553403392. Throughput: 0: 42859.1. Samples: 6553510080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:28,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 08:46:29,193][15401] Updated weights for policy 0, policy_version 399990 (0.0029) [2024-06-23 08:46:33,345][15401] Updated weights for policy 0, policy_version 400000 (0.0028) [2024-06-23 08:46:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6553600000. Throughput: 0: 42943.9. Samples: 6553770380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 08:46:36,677][15401] Updated weights for policy 0, policy_version 400010 (0.0032) [2024-06-23 08:46:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 6553796608. Throughput: 0: 42832.4. Samples: 6553893300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 08:46:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 08:46:40,913][15401] Updated weights for policy 0, policy_version 400020 (0.0034) [2024-06-23 08:46:43,390][15132] Fps is (10 sec: 44233.8, 60 sec: 43144.0, 300 sec: 42820.4). Total num frames: 6554042368. Throughput: 0: 42879.0. Samples: 6554148560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:46:43,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 08:46:44,124][15401] Updated weights for policy 0, policy_version 400030 (0.0032) [2024-06-23 08:46:48,323][15401] Updated weights for policy 0, policy_version 400040 (0.0038) [2024-06-23 08:46:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6554255360. Throughput: 0: 42896.0. Samples: 6554408640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:46:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 08:46:52,190][15401] Updated weights for policy 0, policy_version 400050 (0.0025) [2024-06-23 08:46:53,389][15132] Fps is (10 sec: 40962.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6554451968. Throughput: 0: 42886.2. Samples: 6554535660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:46:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 08:46:55,794][15401] Updated weights for policy 0, policy_version 400060 (0.0027) [2024-06-23 08:46:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 6554681344. Throughput: 0: 43059.6. Samples: 6554798140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:46:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 08:46:59,899][15401] Updated weights for policy 0, policy_version 400070 (0.0027) [2024-06-23 08:47:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6554894336. Throughput: 0: 42809.7. Samples: 6555052340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:03,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 08:47:03,755][15401] Updated weights for policy 0, policy_version 400080 (0.0042) [2024-06-23 08:47:07,518][15401] Updated weights for policy 0, policy_version 400090 (0.0037) [2024-06-23 08:47:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 6555090944. Throughput: 0: 42724.1. Samples: 6555175300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 08:47:11,335][15401] Updated weights for policy 0, policy_version 400100 (0.0035) [2024-06-23 08:47:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6555320320. Throughput: 0: 42758.6. Samples: 6555434220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:13,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 08:47:15,371][15401] Updated weights for policy 0, policy_version 400110 (0.0028) [2024-06-23 08:47:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6555533312. Throughput: 0: 42666.2. Samples: 6555690360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:18,399][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 08:47:19,030][15401] Updated weights for policy 0, policy_version 400120 (0.0029) [2024-06-23 08:47:22,972][15401] Updated weights for policy 0, policy_version 400130 (0.0041) [2024-06-23 08:47:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6555729920. Throughput: 0: 42708.9. Samples: 6555815200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 08:47:26,783][15401] Updated weights for policy 0, policy_version 400140 (0.0041) [2024-06-23 08:47:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6555975680. Throughput: 0: 42761.5. Samples: 6556072800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 08:47:30,423][15401] Updated weights for policy 0, policy_version 400150 (0.0029) [2024-06-23 08:47:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6556155904. Throughput: 0: 42910.4. Samples: 6556339600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:33,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 08:47:34,426][15401] Updated weights for policy 0, policy_version 400160 (0.0028) [2024-06-23 08:47:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6556368896. Throughput: 0: 42729.5. Samples: 6556458480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 08:47:38,397][15401] Updated weights for policy 0, policy_version 400170 (0.0029) [2024-06-23 08:47:42,098][15401] Updated weights for policy 0, policy_version 400180 (0.0036) [2024-06-23 08:47:43,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.9, 300 sec: 42821.0). Total num frames: 6556614656. Throughput: 0: 42553.6. Samples: 6556713060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 08:47:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400184_6556614656.pth... [2024-06-23 08:47:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399556_6546325504.pth [2024-06-23 08:47:46,125][15401] Updated weights for policy 0, policy_version 400190 (0.0033) [2024-06-23 08:47:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6556794880. Throughput: 0: 42568.2. Samples: 6556967900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 08:47:49,886][15401] Updated weights for policy 0, policy_version 400200 (0.0045) [2024-06-23 08:47:52,421][15349] Signal inference workers to stop experience collection... (97050 times) [2024-06-23 08:47:52,428][15349] Signal inference workers to resume experience collection... (97050 times) [2024-06-23 08:47:52,456][15401] InferenceWorker_p0-w0: stopping experience collection (97050 times) [2024-06-23 08:47:52,456][15401] InferenceWorker_p0-w0: resuming experience collection (97050 times) [2024-06-23 08:47:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6557024256. Throughput: 0: 42481.8. Samples: 6557086980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 08:47:53,730][15401] Updated weights for policy 0, policy_version 400210 (0.0035) [2024-06-23 08:47:57,722][15401] Updated weights for policy 0, policy_version 400220 (0.0032) [2024-06-23 08:47:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6557220864. Throughput: 0: 42597.9. Samples: 6557351120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:47:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 08:48:01,429][15401] Updated weights for policy 0, policy_version 400230 (0.0041) [2024-06-23 08:48:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6557450240. Throughput: 0: 42491.2. Samples: 6557602460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 08:48:05,432][15401] Updated weights for policy 0, policy_version 400240 (0.0021) [2024-06-23 08:48:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42710.0). Total num frames: 6557646848. Throughput: 0: 42565.8. Samples: 6557730660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:08,399][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 08:48:09,020][15401] Updated weights for policy 0, policy_version 400250 (0.0037) [2024-06-23 08:48:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6557843456. Throughput: 0: 42535.2. Samples: 6557986880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 08:48:13,461][15401] Updated weights for policy 0, policy_version 400260 (0.0043) [2024-06-23 08:48:16,866][15401] Updated weights for policy 0, policy_version 400270 (0.0042) [2024-06-23 08:48:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6558072832. Throughput: 0: 42138.6. Samples: 6558235840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 08:48:21,144][15401] Updated weights for policy 0, policy_version 400280 (0.0034) [2024-06-23 08:48:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 6558285824. Throughput: 0: 42365.3. Samples: 6558364920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 08:48:24,616][15401] Updated weights for policy 0, policy_version 400290 (0.0041) [2024-06-23 08:48:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42598.4). Total num frames: 6558466048. Throughput: 0: 42344.5. Samples: 6558618560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 08:48:28,855][15401] Updated weights for policy 0, policy_version 400300 (0.0041) [2024-06-23 08:48:32,432][15401] Updated weights for policy 0, policy_version 400310 (0.0032) [2024-06-23 08:48:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 6558711808. Throughput: 0: 42237.6. Samples: 6558868600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 08:48:36,778][15401] Updated weights for policy 0, policy_version 400320 (0.0031) [2024-06-23 08:48:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6558924800. Throughput: 0: 42624.6. Samples: 6559005080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 08:48:39,923][15401] Updated weights for policy 0, policy_version 400330 (0.0021) [2024-06-23 08:48:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 6559105024. Throughput: 0: 42259.5. Samples: 6559252800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 08:48:44,488][15401] Updated weights for policy 0, policy_version 400340 (0.0040) [2024-06-23 08:48:47,496][15401] Updated weights for policy 0, policy_version 400350 (0.0035) [2024-06-23 08:48:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6559350784. Throughput: 0: 42267.1. Samples: 6559504480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 08:48:52,256][15401] Updated weights for policy 0, policy_version 400360 (0.0027) [2024-06-23 08:48:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6559563776. Throughput: 0: 42499.6. Samples: 6559643140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 08:48:55,110][15401] Updated weights for policy 0, policy_version 400370 (0.0042) [2024-06-23 08:48:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6559744000. Throughput: 0: 42152.3. Samples: 6559883740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:48:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 08:49:00,317][15401] Updated weights for policy 0, policy_version 400380 (0.0042) [2024-06-23 08:49:00,737][15349] Signal inference workers to stop experience collection... (97100 times) [2024-06-23 08:49:00,773][15401] InferenceWorker_p0-w0: stopping experience collection (97100 times) [2024-06-23 08:49:00,803][15349] Signal inference workers to resume experience collection... (97100 times) [2024-06-23 08:49:00,804][15401] InferenceWorker_p0-w0: resuming experience collection (97100 times) [2024-06-23 08:49:02,931][15401] Updated weights for policy 0, policy_version 400390 (0.0037) [2024-06-23 08:49:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6559989760. Throughput: 0: 42190.7. Samples: 6560134420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:49:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 08:49:07,979][15401] Updated weights for policy 0, policy_version 400400 (0.0035) [2024-06-23 08:49:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6560186368. Throughput: 0: 42380.4. Samples: 6560272040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:49:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 08:49:10,658][15401] Updated weights for policy 0, policy_version 400410 (0.0028) [2024-06-23 08:49:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6560399360. Throughput: 0: 42297.8. Samples: 6560521960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:49:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 08:49:15,468][15401] Updated weights for policy 0, policy_version 400420 (0.0037) [2024-06-23 08:49:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6560612352. Throughput: 0: 42401.4. Samples: 6560776660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 08:49:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 08:49:18,594][15401] Updated weights for policy 0, policy_version 400430 (0.0030) [2024-06-23 08:49:23,319][15401] Updated weights for policy 0, policy_version 400440 (0.0044) [2024-06-23 08:49:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42543.0). Total num frames: 6560808960. Throughput: 0: 42238.6. Samples: 6560905820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 08:49:26,131][15401] Updated weights for policy 0, policy_version 400450 (0.0038) [2024-06-23 08:49:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6561054720. Throughput: 0: 42312.0. Samples: 6561156840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 08:49:30,884][15401] Updated weights for policy 0, policy_version 400460 (0.0029) [2024-06-23 08:49:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6561267712. Throughput: 0: 42443.8. Samples: 6561414460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:49:33,983][15401] Updated weights for policy 0, policy_version 400470 (0.0047) [2024-06-23 08:49:38,339][15401] Updated weights for policy 0, policy_version 400480 (0.0038) [2024-06-23 08:49:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6561464320. Throughput: 0: 42284.4. Samples: 6561545940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:38,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 08:49:41,409][15401] Updated weights for policy 0, policy_version 400490 (0.0044) [2024-06-23 08:49:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6561677312. Throughput: 0: 42651.2. Samples: 6561803040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 08:49:43,435][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400494_6561693696.pth... [2024-06-23 08:49:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000399870_6551470080.pth [2024-06-23 08:49:45,783][15401] Updated weights for policy 0, policy_version 400500 (0.0026) [2024-06-23 08:49:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6561906688. Throughput: 0: 42774.6. Samples: 6562059280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 08:49:49,120][15401] Updated weights for policy 0, policy_version 400510 (0.0040) [2024-06-23 08:49:53,339][15401] Updated weights for policy 0, policy_version 400520 (0.0030) [2024-06-23 08:49:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6562119680. Throughput: 0: 42738.2. Samples: 6562195260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:53,392][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 08:49:56,771][15401] Updated weights for policy 0, policy_version 400530 (0.0030) [2024-06-23 08:49:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 6562316288. Throughput: 0: 42681.8. Samples: 6562442640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:49:58,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 08:50:01,435][15401] Updated weights for policy 0, policy_version 400540 (0.0030) [2024-06-23 08:50:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6562529280. Throughput: 0: 42758.6. Samples: 6562700800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 08:50:04,383][15401] Updated weights for policy 0, policy_version 400550 (0.0032) [2024-06-23 08:50:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.6). Total num frames: 6562742272. Throughput: 0: 42646.7. Samples: 6562824920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 08:50:09,119][15401] Updated weights for policy 0, policy_version 400560 (0.0021) [2024-06-23 08:50:12,065][15401] Updated weights for policy 0, policy_version 400570 (0.0028) [2024-06-23 08:50:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 6562971648. Throughput: 0: 42685.4. Samples: 6563077680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 08:50:16,756][15401] Updated weights for policy 0, policy_version 400580 (0.0037) [2024-06-23 08:50:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6563168256. Throughput: 0: 42875.6. Samples: 6563343860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:18,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 08:50:19,866][15401] Updated weights for policy 0, policy_version 400590 (0.0030) [2024-06-23 08:50:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6563381248. Throughput: 0: 42573.3. Samples: 6563461740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:23,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 08:50:24,710][15401] Updated weights for policy 0, policy_version 400600 (0.0031) [2024-06-23 08:50:25,413][15349] Signal inference workers to stop experience collection... (97150 times) [2024-06-23 08:50:25,414][15349] Signal inference workers to resume experience collection... (97150 times) [2024-06-23 08:50:25,429][15401] InferenceWorker_p0-w0: stopping experience collection (97150 times) [2024-06-23 08:50:25,429][15401] InferenceWorker_p0-w0: resuming experience collection (97150 times) [2024-06-23 08:50:27,879][15401] Updated weights for policy 0, policy_version 400610 (0.0047) [2024-06-23 08:50:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 6563610624. Throughput: 0: 42575.4. Samples: 6563719040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:28,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 08:50:32,058][15401] Updated weights for policy 0, policy_version 400620 (0.0034) [2024-06-23 08:50:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 6563807232. Throughput: 0: 42765.0. Samples: 6563983700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 08:50:35,400][15401] Updated weights for policy 0, policy_version 400630 (0.0034) [2024-06-23 08:50:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6564020224. Throughput: 0: 42613.9. Samples: 6564112880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 08:50:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 08:50:39,422][15401] Updated weights for policy 0, policy_version 400640 (0.0030) [2024-06-23 08:50:42,902][15401] Updated weights for policy 0, policy_version 400650 (0.0029) [2024-06-23 08:50:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 6564265984. Throughput: 0: 42862.1. Samples: 6564371440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:50:43,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 08:50:47,213][15401] Updated weights for policy 0, policy_version 400660 (0.0036) [2024-06-23 08:50:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6564462592. Throughput: 0: 42898.7. Samples: 6564631240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:50:48,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 08:50:50,477][15401] Updated weights for policy 0, policy_version 400670 (0.0039) [2024-06-23 08:50:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6564675584. Throughput: 0: 42953.2. Samples: 6564757820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:50:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 08:50:54,832][15401] Updated weights for policy 0, policy_version 400680 (0.0027) [2024-06-23 08:50:58,289][15401] Updated weights for policy 0, policy_version 400690 (0.0043) [2024-06-23 08:50:58,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 6564904960. Throughput: 0: 43123.0. Samples: 6565018320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:50:58,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 08:51:02,590][15401] Updated weights for policy 0, policy_version 400700 (0.0049) [2024-06-23 08:51:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 6565117952. Throughput: 0: 42973.0. Samples: 6565277640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 08:51:05,894][15401] Updated weights for policy 0, policy_version 400710 (0.0021) [2024-06-23 08:51:08,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6565330944. Throughput: 0: 43185.7. Samples: 6565405100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 08:51:10,155][15401] Updated weights for policy 0, policy_version 400720 (0.0019) [2024-06-23 08:51:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6565543936. Throughput: 0: 43267.7. Samples: 6565665980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 08:51:13,612][15401] Updated weights for policy 0, policy_version 400730 (0.0043) [2024-06-23 08:51:17,646][15401] Updated weights for policy 0, policy_version 400740 (0.0034) [2024-06-23 08:51:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6565756928. Throughput: 0: 43116.9. Samples: 6565923960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 08:51:21,290][15401] Updated weights for policy 0, policy_version 400750 (0.0034) [2024-06-23 08:51:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6565969920. Throughput: 0: 43209.7. Samples: 6566057320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 08:51:25,052][15401] Updated weights for policy 0, policy_version 400760 (0.0034) [2024-06-23 08:51:28,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42868.6, 300 sec: 42653.0). Total num frames: 6566182912. Throughput: 0: 43001.1. Samples: 6566306760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:28,396][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 08:51:28,772][15401] Updated weights for policy 0, policy_version 400770 (0.0026) [2024-06-23 08:51:32,512][15401] Updated weights for policy 0, policy_version 400780 (0.0034) [2024-06-23 08:51:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6566395904. Throughput: 0: 43007.7. Samples: 6566566580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 08:51:36,590][15401] Updated weights for policy 0, policy_version 400790 (0.0029) [2024-06-23 08:51:38,389][15132] Fps is (10 sec: 42626.0, 60 sec: 43144.6, 300 sec: 42598.5). Total num frames: 6566608896. Throughput: 0: 43038.4. Samples: 6566694540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 08:51:40,364][15401] Updated weights for policy 0, policy_version 400800 (0.0031) [2024-06-23 08:51:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6566838272. Throughput: 0: 43024.9. Samples: 6566954340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:43,391][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 08:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400808_6566838272.pth... [2024-06-23 08:51:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400184_6556614656.pth [2024-06-23 08:51:44,133][15401] Updated weights for policy 0, policy_version 400810 (0.0034) [2024-06-23 08:51:47,806][15401] Updated weights for policy 0, policy_version 400820 (0.0037) [2024-06-23 08:51:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6567034880. Throughput: 0: 42844.0. Samples: 6567205620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:48,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 08:51:51,741][15401] Updated weights for policy 0, policy_version 400830 (0.0026) [2024-06-23 08:51:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6567247872. Throughput: 0: 42877.4. Samples: 6567334580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 08:51:55,567][15401] Updated weights for policy 0, policy_version 400840 (0.0029) [2024-06-23 08:51:56,451][15349] Signal inference workers to stop experience collection... (97200 times) [2024-06-23 08:51:56,454][15349] Signal inference workers to resume experience collection... (97200 times) [2024-06-23 08:51:56,475][15401] InferenceWorker_p0-w0: stopping experience collection (97200 times) [2024-06-23 08:51:56,475][15401] InferenceWorker_p0-w0: resuming experience collection (97200 times) [2024-06-23 08:51:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 6567477248. Throughput: 0: 42822.7. Samples: 6567593000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 08:51:58,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 08:51:59,373][15401] Updated weights for policy 0, policy_version 400850 (0.0037) [2024-06-23 08:52:03,038][15401] Updated weights for policy 0, policy_version 400860 (0.0047) [2024-06-23 08:52:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 6567690240. Throughput: 0: 42626.5. Samples: 6567842260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:03,393][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 08:52:07,336][15401] Updated weights for policy 0, policy_version 400870 (0.0033) [2024-06-23 08:52:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6567886848. Throughput: 0: 42601.8. Samples: 6567974400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:08,390][15132] Avg episode reward: [(0, '0.022')] [2024-06-23 08:52:10,555][15401] Updated weights for policy 0, policy_version 400880 (0.0025) [2024-06-23 08:52:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6568116224. Throughput: 0: 42709.2. Samples: 6568228400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 08:52:15,130][15401] Updated weights for policy 0, policy_version 400890 (0.0031) [2024-06-23 08:52:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6568329216. Throughput: 0: 42486.2. Samples: 6568478460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 08:52:18,547][15401] Updated weights for policy 0, policy_version 400900 (0.0031) [2024-06-23 08:52:23,095][15401] Updated weights for policy 0, policy_version 400910 (0.0042) [2024-06-23 08:52:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6568525824. Throughput: 0: 42539.5. Samples: 6568608820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 08:52:25,962][15401] Updated weights for policy 0, policy_version 400920 (0.0030) [2024-06-23 08:52:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 6568771584. Throughput: 0: 42464.0. Samples: 6568865220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 08:52:30,897][15401] Updated weights for policy 0, policy_version 400930 (0.0035) [2024-06-23 08:52:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 6568968192. Throughput: 0: 42638.0. Samples: 6569124340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:33,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 08:52:33,766][15401] Updated weights for policy 0, policy_version 400940 (0.0033) [2024-06-23 08:52:38,369][15401] Updated weights for policy 0, policy_version 400950 (0.0028) [2024-06-23 08:52:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6569164800. Throughput: 0: 42531.1. Samples: 6569248480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:38,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 08:52:41,396][15401] Updated weights for policy 0, policy_version 400960 (0.0034) [2024-06-23 08:52:43,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6569394176. Throughput: 0: 42487.0. Samples: 6569505020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:43,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 08:52:46,147][15401] Updated weights for policy 0, policy_version 400970 (0.0039) [2024-06-23 08:52:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6569590784. Throughput: 0: 42684.9. Samples: 6569762980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 08:52:48,995][15401] Updated weights for policy 0, policy_version 400980 (0.0022) [2024-06-23 08:52:53,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6569803776. Throughput: 0: 42398.1. Samples: 6569882320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 08:52:54,320][15401] Updated weights for policy 0, policy_version 400990 (0.0029) [2024-06-23 08:52:56,538][15401] Updated weights for policy 0, policy_version 401000 (0.0046) [2024-06-23 08:52:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6570033152. Throughput: 0: 42586.1. Samples: 6570144780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:52:58,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 08:53:02,046][15401] Updated weights for policy 0, policy_version 401010 (0.0034) [2024-06-23 08:53:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 6570229760. Throughput: 0: 42835.0. Samples: 6570406140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:53:03,392][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 08:53:04,345][15401] Updated weights for policy 0, policy_version 401020 (0.0027) [2024-06-23 08:53:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6570442752. Throughput: 0: 42668.4. Samples: 6570528900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:53:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 08:53:09,621][15401] Updated weights for policy 0, policy_version 401030 (0.0037) [2024-06-23 08:53:10,886][15349] Signal inference workers to stop experience collection... (97250 times) [2024-06-23 08:53:10,930][15401] InferenceWorker_p0-w0: stopping experience collection (97250 times) [2024-06-23 08:53:10,940][15349] Signal inference workers to resume experience collection... (97250 times) [2024-06-23 08:53:10,947][15401] InferenceWorker_p0-w0: resuming experience collection (97250 times) [2024-06-23 08:53:12,179][15401] Updated weights for policy 0, policy_version 401040 (0.0034) [2024-06-23 08:53:13,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6570688512. Throughput: 0: 42682.7. Samples: 6570785940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:53:13,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 08:53:17,178][15401] Updated weights for policy 0, policy_version 401050 (0.0034) [2024-06-23 08:53:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6570852352. Throughput: 0: 42874.8. Samples: 6571053700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:53:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 08:53:19,764][15401] Updated weights for policy 0, policy_version 401060 (0.0031) [2024-06-23 08:53:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6571098112. Throughput: 0: 42762.3. Samples: 6571172780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 08:53:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 08:53:24,661][15401] Updated weights for policy 0, policy_version 401070 (0.0031) [2024-06-23 08:53:27,306][15401] Updated weights for policy 0, policy_version 401080 (0.0026) [2024-06-23 08:53:28,390][15132] Fps is (10 sec: 49152.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6571343872. Throughput: 0: 42861.4. Samples: 6571433680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 08:53:32,280][15401] Updated weights for policy 0, policy_version 401090 (0.0030) [2024-06-23 08:53:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 6571491328. Throughput: 0: 42931.2. Samples: 6571694880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 08:53:34,931][15401] Updated weights for policy 0, policy_version 401100 (0.0041) [2024-06-23 08:53:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6571753472. Throughput: 0: 42936.6. Samples: 6571814460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 08:53:39,654][15401] Updated weights for policy 0, policy_version 401110 (0.0034) [2024-06-23 08:53:42,459][15401] Updated weights for policy 0, policy_version 401120 (0.0028) [2024-06-23 08:53:43,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6571966464. Throughput: 0: 42985.9. Samples: 6572079140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 08:53:43,485][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401122_6571982848.pth... [2024-06-23 08:53:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400494_6561693696.pth [2024-06-23 08:53:47,565][15401] Updated weights for policy 0, policy_version 401130 (0.0039) [2024-06-23 08:53:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6572130304. Throughput: 0: 43032.2. Samples: 6572342480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 08:53:50,197][15401] Updated weights for policy 0, policy_version 401140 (0.0040) [2024-06-23 08:53:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6572392448. Throughput: 0: 42872.5. Samples: 6572458160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 08:53:55,130][15401] Updated weights for policy 0, policy_version 401150 (0.0033) [2024-06-23 08:53:57,903][15401] Updated weights for policy 0, policy_version 401160 (0.0026) [2024-06-23 08:53:58,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6572621824. Throughput: 0: 42931.6. Samples: 6572717860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:53:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 08:54:02,591][15401] Updated weights for policy 0, policy_version 401170 (0.0028) [2024-06-23 08:54:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6572785664. Throughput: 0: 42796.0. Samples: 6572979520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 08:54:05,624][15401] Updated weights for policy 0, policy_version 401180 (0.0038) [2024-06-23 08:54:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6573031424. Throughput: 0: 42785.6. Samples: 6573098140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 08:54:10,069][15401] Updated weights for policy 0, policy_version 401190 (0.0040) [2024-06-23 08:54:13,309][15401] Updated weights for policy 0, policy_version 401200 (0.0031) [2024-06-23 08:54:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6573260800. Throughput: 0: 42828.0. Samples: 6573360940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:13,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 08:54:17,689][15401] Updated weights for policy 0, policy_version 401210 (0.0034) [2024-06-23 08:54:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6573424640. Throughput: 0: 42736.0. Samples: 6573618000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 08:54:18,533][15349] Signal inference workers to stop experience collection... (97300 times) [2024-06-23 08:54:18,534][15349] Signal inference workers to resume experience collection... (97300 times) [2024-06-23 08:54:18,553][15401] InferenceWorker_p0-w0: stopping experience collection (97300 times) [2024-06-23 08:54:18,583][15401] InferenceWorker_p0-w0: resuming experience collection (97300 times) [2024-06-23 08:54:20,943][15401] Updated weights for policy 0, policy_version 401220 (0.0035) [2024-06-23 08:54:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6573654016. Throughput: 0: 42611.5. Samples: 6573731980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:23,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 08:54:25,603][15401] Updated weights for policy 0, policy_version 401230 (0.0030) [2024-06-23 08:54:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6573883392. Throughput: 0: 42654.7. Samples: 6573998600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 08:54:28,606][15401] Updated weights for policy 0, policy_version 401240 (0.0048) [2024-06-23 08:54:33,363][15401] Updated weights for policy 0, policy_version 401250 (0.0028) [2024-06-23 08:54:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6574080000. Throughput: 0: 42600.8. Samples: 6574259520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 08:54:36,331][15401] Updated weights for policy 0, policy_version 401260 (0.0031) [2024-06-23 08:54:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6574309376. Throughput: 0: 42675.4. Samples: 6574378560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 08:54:40,976][15401] Updated weights for policy 0, policy_version 401270 (0.0034) [2024-06-23 08:54:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6574522368. Throughput: 0: 42733.7. Samples: 6574640880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-23 08:54:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 08:54:43,977][15401] Updated weights for policy 0, policy_version 401280 (0.0045) [2024-06-23 08:54:48,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6574702592. Throughput: 0: 42560.9. Samples: 6574894760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:54:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 08:54:48,615][15401] Updated weights for policy 0, policy_version 401290 (0.0033) [2024-06-23 08:54:52,019][15401] Updated weights for policy 0, policy_version 401300 (0.0031) [2024-06-23 08:54:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6574964736. Throughput: 0: 42590.0. Samples: 6575014680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:54:53,396][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 08:54:56,231][15401] Updated weights for policy 0, policy_version 401310 (0.0033) [2024-06-23 08:54:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 6575161344. Throughput: 0: 42562.7. Samples: 6575276360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:54:58,392][15132] Avg episode reward: [(0, '0.289')] [2024-06-23 08:54:59,662][15401] Updated weights for policy 0, policy_version 401320 (0.0028) [2024-06-23 08:55:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6575357952. Throughput: 0: 42655.0. Samples: 6575537480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:03,396][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 08:55:04,219][15401] Updated weights for policy 0, policy_version 401330 (0.0037) [2024-06-23 08:55:07,296][15401] Updated weights for policy 0, policy_version 401340 (0.0029) [2024-06-23 08:55:08,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 6575603712. Throughput: 0: 42763.1. Samples: 6575656420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:08,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 08:55:11,688][15401] Updated weights for policy 0, policy_version 401350 (0.0034) [2024-06-23 08:55:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6575783936. Throughput: 0: 42503.5. Samples: 6575911260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 08:55:15,122][15401] Updated weights for policy 0, policy_version 401360 (0.0038) [2024-06-23 08:55:18,389][15132] Fps is (10 sec: 36053.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6575964160. Throughput: 0: 42469.4. Samples: 6576170640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 08:55:19,301][15401] Updated weights for policy 0, policy_version 401370 (0.0027) [2024-06-23 08:55:22,986][15401] Updated weights for policy 0, policy_version 401380 (0.0037) [2024-06-23 08:55:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 6576226304. Throughput: 0: 42479.6. Samples: 6576290140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 08:55:23,581][15349] Signal inference workers to stop experience collection... (97350 times) [2024-06-23 08:55:23,633][15349] Signal inference workers to resume experience collection... (97350 times) [2024-06-23 08:55:23,634][15401] InferenceWorker_p0-w0: stopping experience collection (97350 times) [2024-06-23 08:55:23,646][15401] InferenceWorker_p0-w0: resuming experience collection (97350 times) [2024-06-23 08:55:26,888][15401] Updated weights for policy 0, policy_version 401390 (0.0042) [2024-06-23 08:55:28,392][15132] Fps is (10 sec: 47502.1, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 6576439296. Throughput: 0: 42417.3. Samples: 6576549760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:28,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 08:55:30,649][15401] Updated weights for policy 0, policy_version 401400 (0.0032) [2024-06-23 08:55:33,391][15132] Fps is (10 sec: 39317.0, 60 sec: 42324.5, 300 sec: 42709.3). Total num frames: 6576619520. Throughput: 0: 42532.2. Samples: 6576808760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:33,391][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 08:55:34,849][15401] Updated weights for policy 0, policy_version 401410 (0.0046) [2024-06-23 08:55:38,322][15401] Updated weights for policy 0, policy_version 401420 (0.0029) [2024-06-23 08:55:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6576865280. Throughput: 0: 42616.4. Samples: 6576932420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 08:55:42,485][15401] Updated weights for policy 0, policy_version 401430 (0.0028) [2024-06-23 08:55:43,389][15132] Fps is (10 sec: 45880.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6577078272. Throughput: 0: 42591.2. Samples: 6577192860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 08:55:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401433_6577078272.pth... [2024-06-23 08:55:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000400808_6566838272.pth [2024-06-23 08:55:45,951][15401] Updated weights for policy 0, policy_version 401440 (0.0037) [2024-06-23 08:55:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6577274880. Throughput: 0: 42439.2. Samples: 6577447240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:48,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 08:55:50,097][15401] Updated weights for policy 0, policy_version 401450 (0.0039) [2024-06-23 08:55:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6577504256. Throughput: 0: 42640.9. Samples: 6577575160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 08:55:53,546][15401] Updated weights for policy 0, policy_version 401460 (0.0032) [2024-06-23 08:55:57,628][15401] Updated weights for policy 0, policy_version 401470 (0.0027) [2024-06-23 08:55:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 6577700864. Throughput: 0: 42733.4. Samples: 6577834260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:55:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 08:56:01,315][15401] Updated weights for policy 0, policy_version 401480 (0.0039) [2024-06-23 08:56:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6577930240. Throughput: 0: 42480.3. Samples: 6578082260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-23 08:56:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 08:56:05,492][15401] Updated weights for policy 0, policy_version 401490 (0.0040) [2024-06-23 08:56:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 6578126848. Throughput: 0: 42783.2. Samples: 6578215380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 08:56:09,144][15401] Updated weights for policy 0, policy_version 401500 (0.0026) [2024-06-23 08:56:13,328][15401] Updated weights for policy 0, policy_version 401510 (0.0042) [2024-06-23 08:56:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6578339840. Throughput: 0: 42656.8. Samples: 6578469220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 08:56:16,835][15401] Updated weights for policy 0, policy_version 401520 (0.0040) [2024-06-23 08:56:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6578569216. Throughput: 0: 42395.9. Samples: 6578716520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 08:56:20,803][15401] Updated weights for policy 0, policy_version 401530 (0.0036) [2024-06-23 08:56:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42599.3). Total num frames: 6578749440. Throughput: 0: 42619.7. Samples: 6578850300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 08:56:24,646][15401] Updated weights for policy 0, policy_version 401540 (0.0033) [2024-06-23 08:56:28,343][15401] Updated weights for policy 0, policy_version 401550 (0.0028) [2024-06-23 08:56:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 6578995200. Throughput: 0: 42492.1. Samples: 6579105000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:28,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 08:56:32,274][15401] Updated weights for policy 0, policy_version 401560 (0.0033) [2024-06-23 08:56:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43145.4, 300 sec: 42709.5). Total num frames: 6579208192. Throughput: 0: 42457.3. Samples: 6579357820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:33,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-23 08:56:36,271][15401] Updated weights for policy 0, policy_version 401570 (0.0047) [2024-06-23 08:56:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6579404800. Throughput: 0: 42564.8. Samples: 6579490580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:38,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 08:56:39,943][15401] Updated weights for policy 0, policy_version 401580 (0.0031) [2024-06-23 08:56:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6579634176. Throughput: 0: 42495.5. Samples: 6579746560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 08:56:43,717][15401] Updated weights for policy 0, policy_version 401590 (0.0044) [2024-06-23 08:56:47,328][15349] Signal inference workers to stop experience collection... (97400 times) [2024-06-23 08:56:47,328][15349] Signal inference workers to resume experience collection... (97400 times) [2024-06-23 08:56:47,372][15401] InferenceWorker_p0-w0: stopping experience collection (97400 times) [2024-06-23 08:56:47,372][15401] InferenceWorker_p0-w0: resuming experience collection (97400 times) [2024-06-23 08:56:47,625][15401] Updated weights for policy 0, policy_version 401600 (0.0044) [2024-06-23 08:56:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6579830784. Throughput: 0: 42744.5. Samples: 6580005760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 08:56:51,238][15401] Updated weights for policy 0, policy_version 401610 (0.0030) [2024-06-23 08:56:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6580060160. Throughput: 0: 42567.9. Samples: 6580130940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:53,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-23 08:56:55,918][15401] Updated weights for policy 0, policy_version 401620 (0.0029) [2024-06-23 08:56:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 6580273152. Throughput: 0: 42476.6. Samples: 6580380660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:56:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 08:56:58,716][15401] Updated weights for policy 0, policy_version 401630 (0.0032) [2024-06-23 08:57:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6580453376. Throughput: 0: 43039.5. Samples: 6580653300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:57:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 08:57:03,431][15401] Updated weights for policy 0, policy_version 401640 (0.0029) [2024-06-23 08:57:06,221][15401] Updated weights for policy 0, policy_version 401650 (0.0034) [2024-06-23 08:57:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6580682752. Throughput: 0: 42732.8. Samples: 6580773280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:57:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 08:57:10,970][15401] Updated weights for policy 0, policy_version 401660 (0.0035) [2024-06-23 08:57:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6580928512. Throughput: 0: 42625.7. Samples: 6581023160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:57:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 08:57:13,883][15401] Updated weights for policy 0, policy_version 401670 (0.0028) [2024-06-23 08:57:18,396][15132] Fps is (10 sec: 39296.3, 60 sec: 41774.7, 300 sec: 42541.9). Total num frames: 6581075968. Throughput: 0: 43087.2. Samples: 6581297020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:57:18,396][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 08:57:18,707][15401] Updated weights for policy 0, policy_version 401680 (0.0032) [2024-06-23 08:57:21,882][15401] Updated weights for policy 0, policy_version 401690 (0.0027) [2024-06-23 08:57:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6581305344. Throughput: 0: 42561.0. Samples: 6581405820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 08:57:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 08:57:26,490][15401] Updated weights for policy 0, policy_version 401700 (0.0038) [2024-06-23 08:57:28,390][15132] Fps is (10 sec: 50822.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6581583872. Throughput: 0: 42516.9. Samples: 6581659820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 08:57:29,735][15401] Updated weights for policy 0, policy_version 401710 (0.0028) [2024-06-23 08:57:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 41777.5, 300 sec: 42542.5). Total num frames: 6581714944. Throughput: 0: 42639.9. Samples: 6581924660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:33,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 08:57:34,165][15401] Updated weights for policy 0, policy_version 401720 (0.0039) [2024-06-23 08:57:37,420][15401] Updated weights for policy 0, policy_version 401730 (0.0041) [2024-06-23 08:57:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 6581960704. Throughput: 0: 42295.1. Samples: 6582034220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 08:57:41,931][15401] Updated weights for policy 0, policy_version 401740 (0.0039) [2024-06-23 08:57:42,280][15349] Signal inference workers to stop experience collection... (97450 times) [2024-06-23 08:57:42,312][15401] InferenceWorker_p0-w0: stopping experience collection (97450 times) [2024-06-23 08:57:42,395][15349] Signal inference workers to resume experience collection... (97450 times) [2024-06-23 08:57:42,395][15401] InferenceWorker_p0-w0: resuming experience collection (97450 times) [2024-06-23 08:57:43,390][15132] Fps is (10 sec: 49163.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6582206464. Throughput: 0: 42739.9. Samples: 6582303960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 08:57:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401747_6582222848.pth... [2024-06-23 08:57:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401122_6571982848.pth [2024-06-23 08:57:44,898][15401] Updated weights for policy 0, policy_version 401750 (0.0036) [2024-06-23 08:57:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6582353920. Throughput: 0: 42358.7. Samples: 6582559440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 08:57:49,586][15401] Updated weights for policy 0, policy_version 401760 (0.0035) [2024-06-23 08:57:52,424][15401] Updated weights for policy 0, policy_version 401770 (0.0037) [2024-06-23 08:57:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6582616064. Throughput: 0: 42276.9. Samples: 6582675740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:53,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 08:57:57,029][15401] Updated weights for policy 0, policy_version 401780 (0.0040) [2024-06-23 08:57:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 6582812672. Throughput: 0: 42589.5. Samples: 6582939680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:57:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 08:58:00,515][15401] Updated weights for policy 0, policy_version 401790 (0.0036) [2024-06-23 08:58:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6583009280. Throughput: 0: 42391.4. Samples: 6583204360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 08:58:04,575][15401] Updated weights for policy 0, policy_version 401800 (0.0037) [2024-06-23 08:58:08,126][15401] Updated weights for policy 0, policy_version 401810 (0.0035) [2024-06-23 08:58:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6583255040. Throughput: 0: 42565.8. Samples: 6583321280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 08:58:12,214][15401] Updated weights for policy 0, policy_version 401820 (0.0027) [2024-06-23 08:58:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6583468032. Throughput: 0: 42756.1. Samples: 6583583840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 08:58:15,649][15401] Updated weights for policy 0, policy_version 401830 (0.0032) [2024-06-23 08:58:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42876.1, 300 sec: 42542.9). Total num frames: 6583648256. Throughput: 0: 42825.5. Samples: 6583851700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 08:58:19,638][15401] Updated weights for policy 0, policy_version 401840 (0.0030) [2024-06-23 08:58:23,141][15401] Updated weights for policy 0, policy_version 401850 (0.0036) [2024-06-23 08:58:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 6583910400. Throughput: 0: 43029.7. Samples: 6583970560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 08:58:27,478][15401] Updated weights for policy 0, policy_version 401860 (0.0042) [2024-06-23 08:58:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6584123392. Throughput: 0: 42859.7. Samples: 6584232640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 08:58:30,751][15401] Updated weights for policy 0, policy_version 401870 (0.0034) [2024-06-23 08:58:33,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 6584287232. Throughput: 0: 42957.2. Samples: 6584492520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 08:58:35,105][15401] Updated weights for policy 0, policy_version 401880 (0.0038) [2024-06-23 08:58:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6584532992. Throughput: 0: 42979.1. Samples: 6584609800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 08:58:38,634][15401] Updated weights for policy 0, policy_version 401890 (0.0034) [2024-06-23 08:58:42,868][15401] Updated weights for policy 0, policy_version 401900 (0.0036) [2024-06-23 08:58:43,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6584762368. Throughput: 0: 42820.0. Samples: 6584866580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 08:58:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 08:58:46,440][15401] Updated weights for policy 0, policy_version 401910 (0.0034) [2024-06-23 08:58:48,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.7, 300 sec: 42542.5). Total num frames: 6584942592. Throughput: 0: 42693.3. Samples: 6585125660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:58:48,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 08:58:50,469][15401] Updated weights for policy 0, policy_version 401920 (0.0034) [2024-06-23 08:58:53,390][15132] Fps is (10 sec: 39320.1, 60 sec: 42325.1, 300 sec: 42487.3). Total num frames: 6585155584. Throughput: 0: 42864.5. Samples: 6585250200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:58:53,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 08:58:53,956][15401] Updated weights for policy 0, policy_version 401930 (0.0032) [2024-06-23 08:58:58,110][15401] Updated weights for policy 0, policy_version 401940 (0.0039) [2024-06-23 08:58:58,389][15132] Fps is (10 sec: 45886.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6585401344. Throughput: 0: 42838.7. Samples: 6585511580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:58:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 08:58:58,617][15349] Signal inference workers to stop experience collection... (97500 times) [2024-06-23 08:58:58,617][15349] Signal inference workers to resume experience collection... (97500 times) [2024-06-23 08:58:58,629][15401] InferenceWorker_p0-w0: stopping experience collection (97500 times) [2024-06-23 08:58:58,629][15401] InferenceWorker_p0-w0: resuming experience collection (97500 times) [2024-06-23 08:59:01,673][15401] Updated weights for policy 0, policy_version 401950 (0.0024) [2024-06-23 08:59:03,390][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6585597952. Throughput: 0: 42546.5. Samples: 6585766300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 08:59:05,630][15401] Updated weights for policy 0, policy_version 401960 (0.0035) [2024-06-23 08:59:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6585810944. Throughput: 0: 42702.3. Samples: 6585892160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:08,391][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 08:59:09,440][15401] Updated weights for policy 0, policy_version 401970 (0.0035) [2024-06-23 08:59:13,070][15401] Updated weights for policy 0, policy_version 401980 (0.0031) [2024-06-23 08:59:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6586056704. Throughput: 0: 42739.4. Samples: 6586155920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 08:59:17,381][15401] Updated weights for policy 0, policy_version 401990 (0.0042) [2024-06-23 08:59:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6586220544. Throughput: 0: 42598.3. Samples: 6586409440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:18,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 08:59:20,732][15401] Updated weights for policy 0, policy_version 402000 (0.0026) [2024-06-23 08:59:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 6586449920. Throughput: 0: 42744.5. Samples: 6586533300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 08:59:25,034][15401] Updated weights for policy 0, policy_version 402010 (0.0045) [2024-06-23 08:59:28,396][15132] Fps is (10 sec: 45845.7, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 6586679296. Throughput: 0: 42864.1. Samples: 6586795740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:28,396][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 08:59:28,593][15401] Updated weights for policy 0, policy_version 402020 (0.0037) [2024-06-23 08:59:32,883][15401] Updated weights for policy 0, policy_version 402030 (0.0029) [2024-06-23 08:59:33,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6586859520. Throughput: 0: 42705.4. Samples: 6587047300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 08:59:36,096][15401] Updated weights for policy 0, policy_version 402040 (0.0029) [2024-06-23 08:59:38,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6587105280. Throughput: 0: 42619.9. Samples: 6587168080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 08:59:40,542][15401] Updated weights for policy 0, policy_version 402050 (0.0037) [2024-06-23 08:59:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6587301888. Throughput: 0: 42600.0. Samples: 6587428580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 08:59:43,584][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402059_6587334656.pth... [2024-06-23 08:59:43,636][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401433_6577078272.pth [2024-06-23 08:59:43,781][15401] Updated weights for policy 0, policy_version 402060 (0.0034) [2024-06-23 08:59:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 6587498496. Throughput: 0: 42498.2. Samples: 6587678720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 08:59:48,586][15401] Updated weights for policy 0, policy_version 402070 (0.0042) [2024-06-23 08:59:51,792][15401] Updated weights for policy 0, policy_version 402080 (0.0028) [2024-06-23 08:59:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.7, 300 sec: 42598.8). Total num frames: 6587727872. Throughput: 0: 42565.9. Samples: 6587807620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:53,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 08:59:56,154][15401] Updated weights for policy 0, policy_version 402090 (0.0028) [2024-06-23 08:59:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6587924480. Throughput: 0: 42492.9. Samples: 6588068100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 08:59:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 08:59:58,786][15349] Signal inference workers to stop experience collection... (97550 times) [2024-06-23 08:59:58,833][15401] InferenceWorker_p0-w0: stopping experience collection (97550 times) [2024-06-23 08:59:58,846][15349] Signal inference workers to resume experience collection... (97550 times) [2024-06-23 08:59:58,851][15401] InferenceWorker_p0-w0: resuming experience collection (97550 times) [2024-06-23 08:59:59,301][15401] Updated weights for policy 0, policy_version 402100 (0.0033) [2024-06-23 09:00:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42487.7). Total num frames: 6588137472. Throughput: 0: 42559.6. Samples: 6588324620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 09:00:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 09:00:03,777][15401] Updated weights for policy 0, policy_version 402110 (0.0026) [2024-06-23 09:00:06,957][15401] Updated weights for policy 0, policy_version 402120 (0.0041) [2024-06-23 09:00:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6588366848. Throughput: 0: 42690.5. Samples: 6588454380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 09:00:11,355][15401] Updated weights for policy 0, policy_version 402130 (0.0032) [2024-06-23 09:00:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 6588563456. Throughput: 0: 42559.3. Samples: 6588710640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 09:00:14,533][15401] Updated weights for policy 0, policy_version 402140 (0.0029) [2024-06-23 09:00:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6588792832. Throughput: 0: 42635.0. Samples: 6588965880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 09:00:18,765][15401] Updated weights for policy 0, policy_version 402150 (0.0034) [2024-06-23 09:00:22,068][15401] Updated weights for policy 0, policy_version 402160 (0.0037) [2024-06-23 09:00:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 6589022208. Throughput: 0: 42869.3. Samples: 6589097200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:23,400][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 09:00:26,468][15401] Updated weights for policy 0, policy_version 402170 (0.0047) [2024-06-23 09:00:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42056.7, 300 sec: 42654.1). Total num frames: 6589202432. Throughput: 0: 42743.5. Samples: 6589352040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 09:00:30,049][15401] Updated weights for policy 0, policy_version 402180 (0.0032) [2024-06-23 09:00:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6589448192. Throughput: 0: 42757.0. Samples: 6589602780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 09:00:34,022][15401] Updated weights for policy 0, policy_version 402190 (0.0036) [2024-06-23 09:00:37,616][15401] Updated weights for policy 0, policy_version 402200 (0.0052) [2024-06-23 09:00:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6589661184. Throughput: 0: 42772.9. Samples: 6589732400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 09:00:41,936][15401] Updated weights for policy 0, policy_version 402210 (0.0036) [2024-06-23 09:00:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6589857792. Throughput: 0: 42707.6. Samples: 6589989940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 09:00:45,282][15401] Updated weights for policy 0, policy_version 402220 (0.0030) [2024-06-23 09:00:48,393][15132] Fps is (10 sec: 42581.3, 60 sec: 43141.8, 300 sec: 42653.4). Total num frames: 6590087168. Throughput: 0: 42868.2. Samples: 6590253860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:48,394][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 09:00:49,405][15401] Updated weights for policy 0, policy_version 402230 (0.0033) [2024-06-23 09:00:52,919][15401] Updated weights for policy 0, policy_version 402240 (0.0041) [2024-06-23 09:00:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6590316544. Throughput: 0: 42718.2. Samples: 6590376700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:53,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 09:00:56,938][15401] Updated weights for policy 0, policy_version 402250 (0.0030) [2024-06-23 09:00:58,389][15132] Fps is (10 sec: 42615.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6590513152. Throughput: 0: 42794.2. Samples: 6590636380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:00:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 09:01:00,570][15401] Updated weights for policy 0, policy_version 402260 (0.0025) [2024-06-23 09:01:02,561][15349] Signal inference workers to stop experience collection... (97600 times) [2024-06-23 09:01:02,561][15349] Signal inference workers to resume experience collection... (97600 times) [2024-06-23 09:01:02,602][15401] InferenceWorker_p0-w0: stopping experience collection (97600 times) [2024-06-23 09:01:02,602][15401] InferenceWorker_p0-w0: resuming experience collection (97600 times) [2024-06-23 09:01:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6590726144. Throughput: 0: 42907.6. Samples: 6590896720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:01:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 09:01:04,495][15401] Updated weights for policy 0, policy_version 402270 (0.0048) [2024-06-23 09:01:08,082][15401] Updated weights for policy 0, policy_version 402280 (0.0035) [2024-06-23 09:01:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6590955520. Throughput: 0: 42892.6. Samples: 6591027360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:01:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 09:01:12,623][15401] Updated weights for policy 0, policy_version 402290 (0.0043) [2024-06-23 09:01:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6591152128. Throughput: 0: 42952.5. Samples: 6591284900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:01:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 09:01:15,566][15401] Updated weights for policy 0, policy_version 402300 (0.0030) [2024-06-23 09:01:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6591381504. Throughput: 0: 42974.7. Samples: 6591536640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:01:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 09:01:20,322][15401] Updated weights for policy 0, policy_version 402310 (0.0035) [2024-06-23 09:01:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6591594496. Throughput: 0: 42992.9. Samples: 6591667080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 09:01:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 09:01:23,433][15401] Updated weights for policy 0, policy_version 402320 (0.0027) [2024-06-23 09:01:27,841][15401] Updated weights for policy 0, policy_version 402330 (0.0039) [2024-06-23 09:01:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6591791104. Throughput: 0: 43038.2. Samples: 6591926660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 09:01:31,037][15401] Updated weights for policy 0, policy_version 402340 (0.0047) [2024-06-23 09:01:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6592020480. Throughput: 0: 42875.8. Samples: 6592183100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 09:01:35,432][15401] Updated weights for policy 0, policy_version 402350 (0.0029) [2024-06-23 09:01:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6592233472. Throughput: 0: 42989.8. Samples: 6592311240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 09:01:38,778][15401] Updated weights for policy 0, policy_version 402360 (0.0046) [2024-06-23 09:01:42,978][15401] Updated weights for policy 0, policy_version 402370 (0.0028) [2024-06-23 09:01:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6592430080. Throughput: 0: 43062.2. Samples: 6592574180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 09:01:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402371_6592446464.pth... [2024-06-23 09:01:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000401747_6582222848.pth [2024-06-23 09:01:46,319][15401] Updated weights for policy 0, policy_version 402380 (0.0027) [2024-06-23 09:01:48,391][15132] Fps is (10 sec: 40952.2, 60 sec: 42599.9, 300 sec: 42653.7). Total num frames: 6592643072. Throughput: 0: 42795.6. Samples: 6592822600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:48,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 09:01:50,820][15401] Updated weights for policy 0, policy_version 402390 (0.0042) [2024-06-23 09:01:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6592872448. Throughput: 0: 42688.9. Samples: 6592948360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 09:01:54,415][15401] Updated weights for policy 0, policy_version 402400 (0.0032) [2024-06-23 09:01:58,390][15132] Fps is (10 sec: 42606.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6593069056. Throughput: 0: 42648.8. Samples: 6593204100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:01:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 09:01:58,448][15401] Updated weights for policy 0, policy_version 402410 (0.0040) [2024-06-23 09:02:01,932][15401] Updated weights for policy 0, policy_version 402420 (0.0027) [2024-06-23 09:02:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6593282048. Throughput: 0: 42741.7. Samples: 6593460020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 09:02:05,994][15401] Updated weights for policy 0, policy_version 402430 (0.0035) [2024-06-23 09:02:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6593527808. Throughput: 0: 42670.2. Samples: 6593587240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 09:02:09,454][15401] Updated weights for policy 0, policy_version 402440 (0.0033) [2024-06-23 09:02:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42877.0). Total num frames: 6593724416. Throughput: 0: 42689.2. Samples: 6593847680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 09:02:13,461][15401] Updated weights for policy 0, policy_version 402450 (0.0033) [2024-06-23 09:02:14,676][15349] Signal inference workers to stop experience collection... (97650 times) [2024-06-23 09:02:14,723][15401] InferenceWorker_p0-w0: stopping experience collection (97650 times) [2024-06-23 09:02:14,732][15349] Signal inference workers to resume experience collection... (97650 times) [2024-06-23 09:02:14,737][15401] InferenceWorker_p0-w0: resuming experience collection (97650 times) [2024-06-23 09:02:16,887][15401] Updated weights for policy 0, policy_version 402460 (0.0032) [2024-06-23 09:02:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6593921024. Throughput: 0: 42688.9. Samples: 6594104100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 09:02:21,044][15401] Updated weights for policy 0, policy_version 402470 (0.0043) [2024-06-23 09:02:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6594150400. Throughput: 0: 42632.5. Samples: 6594229700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 09:02:24,439][15401] Updated weights for policy 0, policy_version 402480 (0.0034) [2024-06-23 09:02:28,394][15132] Fps is (10 sec: 45856.3, 60 sec: 43141.6, 300 sec: 42931.4). Total num frames: 6594379776. Throughput: 0: 42651.8. Samples: 6594493680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:28,394][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 09:02:29,228][15401] Updated weights for policy 0, policy_version 402490 (0.0031) [2024-06-23 09:02:31,965][15401] Updated weights for policy 0, policy_version 402500 (0.0037) [2024-06-23 09:02:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6594592768. Throughput: 0: 42905.3. Samples: 6594753260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 09:02:36,782][15401] Updated weights for policy 0, policy_version 402510 (0.0034) [2024-06-23 09:02:38,390][15132] Fps is (10 sec: 42615.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6594805760. Throughput: 0: 43022.5. Samples: 6594884380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 09:02:39,794][15401] Updated weights for policy 0, policy_version 402520 (0.0032) [2024-06-23 09:02:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6595002368. Throughput: 0: 42873.0. Samples: 6595133380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:02:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 09:02:44,348][15401] Updated weights for policy 0, policy_version 402530 (0.0039) [2024-06-23 09:02:47,407][15401] Updated weights for policy 0, policy_version 402540 (0.0026) [2024-06-23 09:02:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43145.9, 300 sec: 42765.0). Total num frames: 6595231744. Throughput: 0: 42852.0. Samples: 6595388360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:02:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 09:02:52,309][15401] Updated weights for policy 0, policy_version 402550 (0.0043) [2024-06-23 09:02:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6595444736. Throughput: 0: 43020.8. Samples: 6595523180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:02:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 09:02:55,183][15401] Updated weights for policy 0, policy_version 402560 (0.0037) [2024-06-23 09:02:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6595641344. Throughput: 0: 42773.9. Samples: 6595772500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:02:58,396][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 09:02:59,889][15401] Updated weights for policy 0, policy_version 402570 (0.0029) [2024-06-23 09:03:02,708][15401] Updated weights for policy 0, policy_version 402580 (0.0039) [2024-06-23 09:03:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 6595887104. Throughput: 0: 42777.4. Samples: 6596029080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 09:03:07,405][15401] Updated weights for policy 0, policy_version 402590 (0.0045) [2024-06-23 09:03:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 6596050944. Throughput: 0: 42995.0. Samples: 6596164480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 09:03:10,355][15401] Updated weights for policy 0, policy_version 402600 (0.0032) [2024-06-23 09:03:13,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 6596280320. Throughput: 0: 42638.0. Samples: 6596412320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:13,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 09:03:15,085][15401] Updated weights for policy 0, policy_version 402610 (0.0029) [2024-06-23 09:03:18,235][15401] Updated weights for policy 0, policy_version 402620 (0.0035) [2024-06-23 09:03:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6596526080. Throughput: 0: 42561.0. Samples: 6596668500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 09:03:22,744][15401] Updated weights for policy 0, policy_version 402630 (0.0041) [2024-06-23 09:03:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6596706304. Throughput: 0: 42695.6. Samples: 6596805680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 09:03:25,917][15401] Updated weights for policy 0, policy_version 402640 (0.0042) [2024-06-23 09:03:28,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42599.6, 300 sec: 42875.8). Total num frames: 6596935680. Throughput: 0: 42617.7. Samples: 6597051280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:28,401][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 09:03:30,274][15401] Updated weights for policy 0, policy_version 402650 (0.0042) [2024-06-23 09:03:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6597165056. Throughput: 0: 42689.9. Samples: 6597309400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:33,390][15132] Avg episode reward: [(0, '0.004')] [2024-06-23 09:03:33,465][15401] Updated weights for policy 0, policy_version 402660 (0.0025) [2024-06-23 09:03:38,156][15401] Updated weights for policy 0, policy_version 402670 (0.0042) [2024-06-23 09:03:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6597361664. Throughput: 0: 42587.6. Samples: 6597439620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 09:03:41,073][15349] Signal inference workers to stop experience collection... (97700 times) [2024-06-23 09:03:41,073][15349] Signal inference workers to resume experience collection... (97700 times) [2024-06-23 09:03:41,086][15401] InferenceWorker_p0-w0: stopping experience collection (97700 times) [2024-06-23 09:03:41,086][15401] InferenceWorker_p0-w0: resuming experience collection (97700 times) [2024-06-23 09:03:41,242][15401] Updated weights for policy 0, policy_version 402680 (0.0048) [2024-06-23 09:03:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 6597574656. Throughput: 0: 42590.3. Samples: 6597689060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 09:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402685_6597591040.pth... [2024-06-23 09:03:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402059_6587334656.pth [2024-06-23 09:03:45,722][15401] Updated weights for policy 0, policy_version 402690 (0.0023) [2024-06-23 09:03:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6597787648. Throughput: 0: 42654.4. Samples: 6597948540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:48,394][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 09:03:49,093][15401] Updated weights for policy 0, policy_version 402700 (0.0036) [2024-06-23 09:03:53,223][15401] Updated weights for policy 0, policy_version 402710 (0.0044) [2024-06-23 09:03:53,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42597.6, 300 sec: 42709.3). Total num frames: 6598000640. Throughput: 0: 42457.2. Samples: 6598075100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:53,391][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 09:03:56,796][15401] Updated weights for policy 0, policy_version 402720 (0.0042) [2024-06-23 09:03:58,391][15132] Fps is (10 sec: 42590.7, 60 sec: 42870.1, 300 sec: 42764.8). Total num frames: 6598213632. Throughput: 0: 42573.8. Samples: 6598328120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:03:58,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 09:04:00,914][15401] Updated weights for policy 0, policy_version 402730 (0.0027) [2024-06-23 09:04:03,389][15132] Fps is (10 sec: 42603.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6598426624. Throughput: 0: 42634.7. Samples: 6598587060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 09:04:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 09:04:04,579][15401] Updated weights for policy 0, policy_version 402740 (0.0039) [2024-06-23 09:04:08,389][15132] Fps is (10 sec: 40968.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6598623232. Throughput: 0: 42487.6. Samples: 6598717620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 09:04:08,606][15401] Updated weights for policy 0, policy_version 402750 (0.0030) [2024-06-23 09:04:12,161][15401] Updated weights for policy 0, policy_version 402760 (0.0033) [2024-06-23 09:04:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 6598868992. Throughput: 0: 42763.9. Samples: 6598975560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 09:04:16,236][15401] Updated weights for policy 0, policy_version 402770 (0.0033) [2024-06-23 09:04:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6599065600. Throughput: 0: 42689.2. Samples: 6599230420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 09:04:19,854][15401] Updated weights for policy 0, policy_version 402780 (0.0037) [2024-06-23 09:04:23,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 6599245824. Throughput: 0: 42604.0. Samples: 6599356800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 09:04:23,931][15401] Updated weights for policy 0, policy_version 402790 (0.0044) [2024-06-23 09:04:27,410][15401] Updated weights for policy 0, policy_version 402800 (0.0025) [2024-06-23 09:04:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 6599524352. Throughput: 0: 42819.9. Samples: 6599615960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:04:31,636][15401] Updated weights for policy 0, policy_version 402810 (0.0042) [2024-06-23 09:04:33,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42052.0, 300 sec: 42653.9). Total num frames: 6599688192. Throughput: 0: 42833.2. Samples: 6599876040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 09:04:35,042][15401] Updated weights for policy 0, policy_version 402820 (0.0026) [2024-06-23 09:04:38,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6599901184. Throughput: 0: 42634.9. Samples: 6599993620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 09:04:39,285][15401] Updated weights for policy 0, policy_version 402830 (0.0027) [2024-06-23 09:04:42,654][15401] Updated weights for policy 0, policy_version 402840 (0.0031) [2024-06-23 09:04:43,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6600146944. Throughput: 0: 42883.1. Samples: 6600257780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 09:04:46,753][15401] Updated weights for policy 0, policy_version 402850 (0.0047) [2024-06-23 09:04:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6600343552. Throughput: 0: 42869.8. Samples: 6600516200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 09:04:50,210][15349] Signal inference workers to stop experience collection... (97750 times) [2024-06-23 09:04:50,213][15349] Signal inference workers to resume experience collection... (97750 times) [2024-06-23 09:04:50,232][15401] InferenceWorker_p0-w0: stopping experience collection (97750 times) [2024-06-23 09:04:50,232][15401] InferenceWorker_p0-w0: resuming experience collection (97750 times) [2024-06-23 09:04:50,345][15401] Updated weights for policy 0, policy_version 402860 (0.0034) [2024-06-23 09:04:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42599.2, 300 sec: 42820.6). Total num frames: 6600556544. Throughput: 0: 42587.9. Samples: 6600634080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 09:04:54,206][15401] Updated weights for policy 0, policy_version 402870 (0.0025) [2024-06-23 09:04:57,978][15401] Updated weights for policy 0, policy_version 402880 (0.0046) [2024-06-23 09:04:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43145.8, 300 sec: 42931.6). Total num frames: 6600802304. Throughput: 0: 42724.9. Samples: 6600898180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:04:58,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 09:05:02,037][15401] Updated weights for policy 0, policy_version 402890 (0.0027) [2024-06-23 09:05:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6600966144. Throughput: 0: 42815.6. Samples: 6601157120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:05:03,398][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 09:05:05,742][15401] Updated weights for policy 0, policy_version 402900 (0.0032) [2024-06-23 09:05:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 6601195520. Throughput: 0: 42738.9. Samples: 6601280060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:05:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 09:05:09,853][15401] Updated weights for policy 0, policy_version 402910 (0.0028) [2024-06-23 09:05:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 6601408512. Throughput: 0: 42744.0. Samples: 6601539440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:05:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 09:05:13,581][15401] Updated weights for policy 0, policy_version 402920 (0.0034) [2024-06-23 09:05:17,566][15401] Updated weights for policy 0, policy_version 402930 (0.0052) [2024-06-23 09:05:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6601621504. Throughput: 0: 42671.2. Samples: 6601796240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:05:18,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 09:05:21,087][15401] Updated weights for policy 0, policy_version 402940 (0.0035) [2024-06-23 09:05:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6601834496. Throughput: 0: 42695.2. Samples: 6601914900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 09:05:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 09:05:25,104][15401] Updated weights for policy 0, policy_version 402950 (0.0041) [2024-06-23 09:05:28,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 6602063872. Throughput: 0: 42670.2. Samples: 6602178040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:28,393][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 09:05:29,181][15401] Updated weights for policy 0, policy_version 402960 (0.0027) [2024-06-23 09:05:33,196][15401] Updated weights for policy 0, policy_version 402970 (0.0036) [2024-06-23 09:05:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 6602276864. Throughput: 0: 42577.0. Samples: 6602432160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 09:05:36,856][15401] Updated weights for policy 0, policy_version 402980 (0.0032) [2024-06-23 09:05:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6602473472. Throughput: 0: 42724.9. Samples: 6602556700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 09:05:40,659][15401] Updated weights for policy 0, policy_version 402990 (0.0033) [2024-06-23 09:05:43,396][15132] Fps is (10 sec: 39295.9, 60 sec: 42047.8, 300 sec: 42653.6). Total num frames: 6602670080. Throughput: 0: 42442.9. Samples: 6602808380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:43,397][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 09:05:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402996_6602686464.pth... [2024-06-23 09:05:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402371_6592446464.pth [2024-06-23 09:05:44,523][15401] Updated weights for policy 0, policy_version 403000 (0.0029) [2024-06-23 09:05:48,390][15401] Updated weights for policy 0, policy_version 403010 (0.0029) [2024-06-23 09:05:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6602915840. Throughput: 0: 42391.1. Samples: 6603064720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 09:05:52,291][15401] Updated weights for policy 0, policy_version 403020 (0.0048) [2024-06-23 09:05:53,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6603096064. Throughput: 0: 42561.0. Samples: 6603195300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 09:05:56,082][15401] Updated weights for policy 0, policy_version 403030 (0.0028) [2024-06-23 09:05:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 6603325440. Throughput: 0: 42482.1. Samples: 6603451240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:05:58,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 09:05:59,941][15401] Updated weights for policy 0, policy_version 403040 (0.0034) [2024-06-23 09:06:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6603538432. Throughput: 0: 42413.8. Samples: 6603704860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 09:06:03,674][15401] Updated weights for policy 0, policy_version 403050 (0.0032) [2024-06-23 09:06:07,702][15401] Updated weights for policy 0, policy_version 403060 (0.0027) [2024-06-23 09:06:08,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6603767808. Throughput: 0: 42662.7. Samples: 6603834720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 09:06:11,603][15401] Updated weights for policy 0, policy_version 403070 (0.0028) [2024-06-23 09:06:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 6603964416. Throughput: 0: 42579.8. Samples: 6604094040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:06:13,533][15349] Signal inference workers to stop experience collection... (97800 times) [2024-06-23 09:06:13,585][15401] InferenceWorker_p0-w0: stopping experience collection (97800 times) [2024-06-23 09:06:13,588][15349] Signal inference workers to resume experience collection... (97800 times) [2024-06-23 09:06:13,608][15401] InferenceWorker_p0-w0: resuming experience collection (97800 times) [2024-06-23 09:06:15,520][15401] Updated weights for policy 0, policy_version 403080 (0.0036) [2024-06-23 09:06:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6604177408. Throughput: 0: 42497.2. Samples: 6604344540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:18,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-23 09:06:19,398][15401] Updated weights for policy 0, policy_version 403090 (0.0036) [2024-06-23 09:06:23,331][15401] Updated weights for policy 0, policy_version 403100 (0.0044) [2024-06-23 09:06:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6604390400. Throughput: 0: 42588.3. Samples: 6604473180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 09:06:27,149][15401] Updated weights for policy 0, policy_version 403110 (0.0031) [2024-06-23 09:06:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 6604603392. Throughput: 0: 42699.0. Samples: 6604729560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:28,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 09:06:31,071][15401] Updated weights for policy 0, policy_version 403120 (0.0036) [2024-06-23 09:06:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6604816384. Throughput: 0: 42551.4. Samples: 6604979540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 09:06:34,647][15401] Updated weights for policy 0, policy_version 403130 (0.0033) [2024-06-23 09:06:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6605029376. Throughput: 0: 42477.8. Samples: 6605106800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 09:06:38,937][15401] Updated weights for policy 0, policy_version 403140 (0.0033) [2024-06-23 09:06:42,169][15401] Updated weights for policy 0, policy_version 403150 (0.0034) [2024-06-23 09:06:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43149.1, 300 sec: 42765.3). Total num frames: 6605258752. Throughput: 0: 42497.8. Samples: 6605363540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 09:06:46,694][15401] Updated weights for policy 0, policy_version 403160 (0.0039) [2024-06-23 09:06:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6605455360. Throughput: 0: 42460.6. Samples: 6605615580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 09:06:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 09:06:50,291][15401] Updated weights for policy 0, policy_version 403170 (0.0036) [2024-06-23 09:06:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6605651968. Throughput: 0: 42436.0. Samples: 6605744340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:06:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 09:06:54,463][15401] Updated weights for policy 0, policy_version 403180 (0.0035) [2024-06-23 09:06:57,651][15401] Updated weights for policy 0, policy_version 403190 (0.0029) [2024-06-23 09:06:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 6605897728. Throughput: 0: 42449.8. Samples: 6606004280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:06:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 09:07:01,886][15401] Updated weights for policy 0, policy_version 403200 (0.0036) [2024-06-23 09:07:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6606110720. Throughput: 0: 42673.3. Samples: 6606264840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:03,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 09:07:05,065][15401] Updated weights for policy 0, policy_version 403210 (0.0038) [2024-06-23 09:07:08,392][15132] Fps is (10 sec: 39310.8, 60 sec: 42050.2, 300 sec: 42598.0). Total num frames: 6606290944. Throughput: 0: 42658.2. Samples: 6606392920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:08,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 09:07:09,323][15401] Updated weights for policy 0, policy_version 403220 (0.0032) [2024-06-23 09:07:12,585][15401] Updated weights for policy 0, policy_version 403230 (0.0042) [2024-06-23 09:07:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6606536704. Throughput: 0: 42636.9. Samples: 6606648220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:07:17,036][15401] Updated weights for policy 0, policy_version 403240 (0.0041) [2024-06-23 09:07:18,389][15132] Fps is (10 sec: 44250.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6606733312. Throughput: 0: 42895.3. Samples: 6606909820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 09:07:20,237][15401] Updated weights for policy 0, policy_version 403250 (0.0028) [2024-06-23 09:07:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 6606929920. Throughput: 0: 42761.3. Samples: 6607031060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 09:07:24,924][15401] Updated weights for policy 0, policy_version 403260 (0.0035) [2024-06-23 09:07:28,299][15401] Updated weights for policy 0, policy_version 403270 (0.0035) [2024-06-23 09:07:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6607175680. Throughput: 0: 42767.5. Samples: 6607288080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 09:07:32,222][15401] Updated weights for policy 0, policy_version 403280 (0.0022) [2024-06-23 09:07:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 6607372288. Throughput: 0: 43033.8. Samples: 6607552100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 09:07:33,524][15349] Signal inference workers to stop experience collection... (97850 times) [2024-06-23 09:07:33,524][15349] Signal inference workers to resume experience collection... (97850 times) [2024-06-23 09:07:33,547][15401] InferenceWorker_p0-w0: stopping experience collection (97850 times) [2024-06-23 09:07:33,547][15401] InferenceWorker_p0-w0: resuming experience collection (97850 times) [2024-06-23 09:07:35,695][15401] Updated weights for policy 0, policy_version 403290 (0.0032) [2024-06-23 09:07:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6607585280. Throughput: 0: 42893.7. Samples: 6607674560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 09:07:40,327][15401] Updated weights for policy 0, policy_version 403300 (0.0044) [2024-06-23 09:07:43,177][15401] Updated weights for policy 0, policy_version 403310 (0.0033) [2024-06-23 09:07:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6607831040. Throughput: 0: 42678.3. Samples: 6607924800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 09:07:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403310_6607831040.pth... [2024-06-23 09:07:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402685_6597591040.pth [2024-06-23 09:07:47,859][15401] Updated weights for policy 0, policy_version 403320 (0.0030) [2024-06-23 09:07:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6608011264. Throughput: 0: 42721.3. Samples: 6608187300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:48,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 09:07:50,688][15401] Updated weights for policy 0, policy_version 403330 (0.0038) [2024-06-23 09:07:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6608224256. Throughput: 0: 42628.0. Samples: 6608311060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:53,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-23 09:07:55,369][15401] Updated weights for policy 0, policy_version 403340 (0.0040) [2024-06-23 09:07:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6608470016. Throughput: 0: 42732.5. Samples: 6608571180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:07:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 09:07:58,796][15401] Updated weights for policy 0, policy_version 403350 (0.0024) [2024-06-23 09:08:02,962][15401] Updated weights for policy 0, policy_version 403360 (0.0031) [2024-06-23 09:08:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6608666624. Throughput: 0: 42779.9. Samples: 6608834920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:08:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 09:08:06,424][15401] Updated weights for policy 0, policy_version 403370 (0.0037) [2024-06-23 09:08:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.7, 300 sec: 42709.8). Total num frames: 6608879616. Throughput: 0: 42873.4. Samples: 6608960360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-23 09:08:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 09:08:10,544][15401] Updated weights for policy 0, policy_version 403380 (0.0029) [2024-06-23 09:08:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6609108992. Throughput: 0: 42904.0. Samples: 6609218760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 09:08:14,537][15401] Updated weights for policy 0, policy_version 403390 (0.0037) [2024-06-23 09:08:18,258][15401] Updated weights for policy 0, policy_version 403400 (0.0040) [2024-06-23 09:08:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6609305600. Throughput: 0: 42819.9. Samples: 6609479000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 09:08:22,106][15401] Updated weights for policy 0, policy_version 403410 (0.0044) [2024-06-23 09:08:23,397][15132] Fps is (10 sec: 40929.8, 60 sec: 43139.2, 300 sec: 42653.2). Total num frames: 6609518592. Throughput: 0: 42776.1. Samples: 6609599800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:23,397][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 09:08:25,775][15401] Updated weights for policy 0, policy_version 403420 (0.0032) [2024-06-23 09:08:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6609731584. Throughput: 0: 42917.8. Samples: 6609856100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:28,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 09:08:29,772][15401] Updated weights for policy 0, policy_version 403430 (0.0027) [2024-06-23 09:08:33,390][15132] Fps is (10 sec: 42629.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6609944576. Throughput: 0: 42896.4. Samples: 6610117640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 09:08:33,441][15401] Updated weights for policy 0, policy_version 403440 (0.0050) [2024-06-23 09:08:37,238][15401] Updated weights for policy 0, policy_version 403450 (0.0032) [2024-06-23 09:08:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6610173952. Throughput: 0: 42913.5. Samples: 6610242160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 09:08:41,114][15401] Updated weights for policy 0, policy_version 403460 (0.0029) [2024-06-23 09:08:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6610370560. Throughput: 0: 42826.6. Samples: 6610498380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 09:08:44,942][15401] Updated weights for policy 0, policy_version 403470 (0.0038) [2024-06-23 09:08:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 6610583552. Throughput: 0: 42715.6. Samples: 6610757120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 09:08:48,810][15401] Updated weights for policy 0, policy_version 403480 (0.0030) [2024-06-23 09:08:52,293][15349] Signal inference workers to stop experience collection... (97900 times) [2024-06-23 09:08:52,325][15401] InferenceWorker_p0-w0: stopping experience collection (97900 times) [2024-06-23 09:08:52,362][15349] Signal inference workers to resume experience collection... (97900 times) [2024-06-23 09:08:52,362][15401] InferenceWorker_p0-w0: resuming experience collection (97900 times) [2024-06-23 09:08:52,506][15401] Updated weights for policy 0, policy_version 403490 (0.0020) [2024-06-23 09:08:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 6610796544. Throughput: 0: 42790.6. Samples: 6610885940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 09:08:56,448][15401] Updated weights for policy 0, policy_version 403500 (0.0028) [2024-06-23 09:08:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6611025920. Throughput: 0: 42616.9. Samples: 6611136520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:08:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 09:09:00,123][15401] Updated weights for policy 0, policy_version 403510 (0.0035) [2024-06-23 09:09:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6611206144. Throughput: 0: 42582.6. Samples: 6611395220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 09:09:04,066][15401] Updated weights for policy 0, policy_version 403520 (0.0030) [2024-06-23 09:09:07,782][15401] Updated weights for policy 0, policy_version 403530 (0.0032) [2024-06-23 09:09:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6611435520. Throughput: 0: 42738.2. Samples: 6611522700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 09:09:11,824][15401] Updated weights for policy 0, policy_version 403540 (0.0038) [2024-06-23 09:09:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 6611648512. Throughput: 0: 42622.7. Samples: 6611774220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:13,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 09:09:15,684][15401] Updated weights for policy 0, policy_version 403550 (0.0039) [2024-06-23 09:09:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6611877888. Throughput: 0: 42537.1. Samples: 6612031800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 09:09:19,460][15401] Updated weights for policy 0, policy_version 403560 (0.0044) [2024-06-23 09:09:23,240][15401] Updated weights for policy 0, policy_version 403570 (0.0041) [2024-06-23 09:09:23,391][15132] Fps is (10 sec: 44241.0, 60 sec: 42875.7, 300 sec: 42598.2). Total num frames: 6612090880. Throughput: 0: 42561.6. Samples: 6612157500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:23,391][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 09:09:27,308][15401] Updated weights for policy 0, policy_version 403580 (0.0034) [2024-06-23 09:09:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6612287488. Throughput: 0: 42614.3. Samples: 6612416020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 09:09:28,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 09:09:30,914][15401] Updated weights for policy 0, policy_version 403590 (0.0029) [2024-06-23 09:09:33,389][15132] Fps is (10 sec: 42605.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6612516864. Throughput: 0: 42512.6. Samples: 6612670180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:33,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 09:09:35,122][15401] Updated weights for policy 0, policy_version 403600 (0.0029) [2024-06-23 09:09:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6612729856. Throughput: 0: 42528.0. Samples: 6612799700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 09:09:38,469][15401] Updated weights for policy 0, policy_version 403610 (0.0035) [2024-06-23 09:09:42,650][15401] Updated weights for policy 0, policy_version 403620 (0.0030) [2024-06-23 09:09:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6612926464. Throughput: 0: 42775.1. Samples: 6613061400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 09:09:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403622_6612942848.pth... [2024-06-23 09:09:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000402996_6602686464.pth [2024-06-23 09:09:46,259][15401] Updated weights for policy 0, policy_version 403630 (0.0031) [2024-06-23 09:09:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6613155840. Throughput: 0: 42620.0. Samples: 6613313120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 09:09:50,383][15401] Updated weights for policy 0, policy_version 403640 (0.0034) [2024-06-23 09:09:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6613352448. Throughput: 0: 42571.5. Samples: 6613438420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 09:09:54,035][15401] Updated weights for policy 0, policy_version 403650 (0.0036) [2024-06-23 09:09:58,181][15401] Updated weights for policy 0, policy_version 403660 (0.0030) [2024-06-23 09:09:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6613565440. Throughput: 0: 42757.8. Samples: 6613698220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:09:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 09:10:01,658][15401] Updated weights for policy 0, policy_version 403670 (0.0037) [2024-06-23 09:10:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6613811200. Throughput: 0: 42643.0. Samples: 6613950740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 09:10:05,865][15401] Updated weights for policy 0, policy_version 403680 (0.0038) [2024-06-23 09:10:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6614007808. Throughput: 0: 42916.0. Samples: 6614088660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 09:10:09,214][15401] Updated weights for policy 0, policy_version 403690 (0.0039) [2024-06-23 09:10:13,375][15401] Updated weights for policy 0, policy_version 403700 (0.0034) [2024-06-23 09:10:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6614220800. Throughput: 0: 42916.4. Samples: 6614347260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:13,391][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 09:10:16,932][15401] Updated weights for policy 0, policy_version 403710 (0.0031) [2024-06-23 09:10:17,301][15349] Signal inference workers to stop experience collection... (97950 times) [2024-06-23 09:10:17,301][15349] Signal inference workers to resume experience collection... (97950 times) [2024-06-23 09:10:17,343][15401] InferenceWorker_p0-w0: stopping experience collection (97950 times) [2024-06-23 09:10:17,344][15401] InferenceWorker_p0-w0: resuming experience collection (97950 times) [2024-06-23 09:10:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6614466560. Throughput: 0: 42813.7. Samples: 6614596800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:18,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 09:10:21,063][15401] Updated weights for policy 0, policy_version 403720 (0.0034) [2024-06-23 09:10:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42599.4, 300 sec: 42654.3). Total num frames: 6614646784. Throughput: 0: 43024.0. Samples: 6614735780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:23,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-23 09:10:24,264][15401] Updated weights for policy 0, policy_version 403730 (0.0032) [2024-06-23 09:10:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6614843392. Throughput: 0: 42917.4. Samples: 6614992680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:28,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 09:10:28,688][15401] Updated weights for policy 0, policy_version 403740 (0.0045) [2024-06-23 09:10:31,879][15401] Updated weights for policy 0, policy_version 403750 (0.0029) [2024-06-23 09:10:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6615121920. Throughput: 0: 42916.1. Samples: 6615244340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 09:10:36,240][15401] Updated weights for policy 0, policy_version 403760 (0.0030) [2024-06-23 09:10:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 6615285760. Throughput: 0: 43271.4. Samples: 6615385640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 09:10:39,528][15401] Updated weights for policy 0, policy_version 403770 (0.0038) [2024-06-23 09:10:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6615498752. Throughput: 0: 43040.4. Samples: 6615635040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 09:10:43,793][15401] Updated weights for policy 0, policy_version 403780 (0.0033) [2024-06-23 09:10:47,231][15401] Updated weights for policy 0, policy_version 403790 (0.0029) [2024-06-23 09:10:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6615744512. Throughput: 0: 42922.2. Samples: 6615882240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 09:10:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 09:10:51,607][15401] Updated weights for policy 0, policy_version 403800 (0.0042) [2024-06-23 09:10:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6615924736. Throughput: 0: 42979.5. Samples: 6616022740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:10:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 09:10:54,814][15401] Updated weights for policy 0, policy_version 403810 (0.0029) [2024-06-23 09:10:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6616137728. Throughput: 0: 42947.1. Samples: 6616279880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:10:58,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 09:10:59,342][15401] Updated weights for policy 0, policy_version 403820 (0.0028) [2024-06-23 09:11:02,502][15401] Updated weights for policy 0, policy_version 403830 (0.0030) [2024-06-23 09:11:03,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6616399872. Throughput: 0: 42926.6. Samples: 6616528500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 09:11:06,848][15401] Updated weights for policy 0, policy_version 403840 (0.0036) [2024-06-23 09:11:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6616596480. Throughput: 0: 43008.4. Samples: 6616671160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 09:11:10,162][15401] Updated weights for policy 0, policy_version 403850 (0.0025) [2024-06-23 09:11:13,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6616776704. Throughput: 0: 42789.3. Samples: 6616918200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 09:11:14,858][15401] Updated weights for policy 0, policy_version 403860 (0.0035) [2024-06-23 09:11:17,607][15401] Updated weights for policy 0, policy_version 403870 (0.0044) [2024-06-23 09:11:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6617038848. Throughput: 0: 42768.0. Samples: 6617168900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:11:22,353][15401] Updated weights for policy 0, policy_version 403880 (0.0042) [2024-06-23 09:11:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6617219072. Throughput: 0: 42853.9. Samples: 6617314060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:23,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 09:11:24,971][15349] Signal inference workers to stop experience collection... (98000 times) [2024-06-23 09:11:24,976][15349] Signal inference workers to resume experience collection... (98000 times) [2024-06-23 09:11:24,990][15401] InferenceWorker_p0-w0: stopping experience collection (98000 times) [2024-06-23 09:11:25,023][15401] InferenceWorker_p0-w0: resuming experience collection (98000 times) [2024-06-23 09:11:25,141][15401] Updated weights for policy 0, policy_version 403890 (0.0037) [2024-06-23 09:11:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6617432064. Throughput: 0: 42778.8. Samples: 6617560080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 09:11:29,905][15401] Updated weights for policy 0, policy_version 403900 (0.0037) [2024-06-23 09:11:32,733][15401] Updated weights for policy 0, policy_version 403910 (0.0033) [2024-06-23 09:11:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6617677824. Throughput: 0: 43068.1. Samples: 6617820300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 09:11:37,707][15401] Updated weights for policy 0, policy_version 403920 (0.0036) [2024-06-23 09:11:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6617858048. Throughput: 0: 42966.9. Samples: 6617956240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 09:11:40,346][15401] Updated weights for policy 0, policy_version 403930 (0.0027) [2024-06-23 09:11:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6618071040. Throughput: 0: 42792.3. Samples: 6618205540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 09:11:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403935_6618071040.pth... [2024-06-23 09:11:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403310_6607831040.pth [2024-06-23 09:11:45,247][15401] Updated weights for policy 0, policy_version 403940 (0.0042) [2024-06-23 09:11:48,214][15401] Updated weights for policy 0, policy_version 403950 (0.0027) [2024-06-23 09:11:48,394][15132] Fps is (10 sec: 45855.0, 60 sec: 42868.5, 300 sec: 42931.0). Total num frames: 6618316800. Throughput: 0: 42927.5. Samples: 6618460420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:48,394][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 09:11:52,790][15401] Updated weights for policy 0, policy_version 403960 (0.0033) [2024-06-23 09:11:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6618480640. Throughput: 0: 42675.9. Samples: 6618591580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 09:11:55,698][15401] Updated weights for policy 0, policy_version 403970 (0.0032) [2024-06-23 09:11:58,389][15132] Fps is (10 sec: 40977.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6618726400. Throughput: 0: 42887.1. Samples: 6618848120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:11:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 09:12:00,273][15401] Updated weights for policy 0, policy_version 403980 (0.0035) [2024-06-23 09:12:03,246][15401] Updated weights for policy 0, policy_version 403990 (0.0027) [2024-06-23 09:12:03,392][15132] Fps is (10 sec: 49140.9, 60 sec: 42869.8, 300 sec: 42987.3). Total num frames: 6618972160. Throughput: 0: 42735.5. Samples: 6619092100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:12:03,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 09:12:07,877][15401] Updated weights for policy 0, policy_version 404000 (0.0029) [2024-06-23 09:12:08,392][15132] Fps is (10 sec: 40951.3, 60 sec: 42323.9, 300 sec: 42709.2). Total num frames: 6619136000. Throughput: 0: 42580.2. Samples: 6619230260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:12:08,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 09:12:10,952][15401] Updated weights for policy 0, policy_version 404010 (0.0036) [2024-06-23 09:12:13,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6619348992. Throughput: 0: 42796.1. Samples: 6619485900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 09:12:15,526][15401] Updated weights for policy 0, policy_version 404020 (0.0030) [2024-06-23 09:12:18,389][15132] Fps is (10 sec: 45884.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6619594752. Throughput: 0: 42478.2. Samples: 6619731820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:12:19,368][15401] Updated weights for policy 0, policy_version 404030 (0.0026) [2024-06-23 09:12:23,287][15401] Updated weights for policy 0, policy_version 404040 (0.0033) [2024-06-23 09:12:23,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42870.6, 300 sec: 42764.9). Total num frames: 6619791360. Throughput: 0: 42535.2. Samples: 6619870380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:23,391][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 09:12:25,341][15349] Signal inference workers to stop experience collection... (98050 times) [2024-06-23 09:12:25,395][15401] InferenceWorker_p0-w0: stopping experience collection (98050 times) [2024-06-23 09:12:25,398][15349] Signal inference workers to resume experience collection... (98050 times) [2024-06-23 09:12:25,413][15401] InferenceWorker_p0-w0: resuming experience collection (98050 times) [2024-06-23 09:12:26,882][15401] Updated weights for policy 0, policy_version 404050 (0.0034) [2024-06-23 09:12:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6620004352. Throughput: 0: 42527.2. Samples: 6620119260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:28,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 09:12:30,965][15401] Updated weights for policy 0, policy_version 404060 (0.0026) [2024-06-23 09:12:33,392][15132] Fps is (10 sec: 44231.4, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 6620233728. Throughput: 0: 42566.7. Samples: 6620375840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:33,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 09:12:34,383][15401] Updated weights for policy 0, policy_version 404070 (0.0045) [2024-06-23 09:12:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42709.2). Total num frames: 6620430336. Throughput: 0: 42543.7. Samples: 6620506140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:38,393][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 09:12:38,559][15401] Updated weights for policy 0, policy_version 404080 (0.0033) [2024-06-23 09:12:42,636][15401] Updated weights for policy 0, policy_version 404090 (0.0031) [2024-06-23 09:12:43,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6620643328. Throughput: 0: 42369.2. Samples: 6620754740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 09:12:46,740][15401] Updated weights for policy 0, policy_version 404100 (0.0049) [2024-06-23 09:12:48,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42328.4, 300 sec: 42820.6). Total num frames: 6620856320. Throughput: 0: 42560.5. Samples: 6621007220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 09:12:50,263][15401] Updated weights for policy 0, policy_version 404110 (0.0038) [2024-06-23 09:12:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6621052928. Throughput: 0: 42460.7. Samples: 6621140900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:12:54,144][15401] Updated weights for policy 0, policy_version 404120 (0.0021) [2024-06-23 09:12:57,878][15401] Updated weights for policy 0, policy_version 404130 (0.0025) [2024-06-23 09:12:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6621282304. Throughput: 0: 42471.9. Samples: 6621397140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:12:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:13:01,751][15401] Updated weights for policy 0, policy_version 404140 (0.0036) [2024-06-23 09:13:03,392][15132] Fps is (10 sec: 45865.0, 60 sec: 42325.5, 300 sec: 42820.2). Total num frames: 6621511680. Throughput: 0: 42801.5. Samples: 6621657980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:03,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:13:05,357][15401] Updated weights for policy 0, policy_version 404150 (0.0032) [2024-06-23 09:13:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 6621691904. Throughput: 0: 42699.4. Samples: 6621791800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 09:13:09,538][15401] Updated weights for policy 0, policy_version 404160 (0.0043) [2024-06-23 09:13:12,967][15401] Updated weights for policy 0, policy_version 404170 (0.0033) [2024-06-23 09:13:13,390][15132] Fps is (10 sec: 40968.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6621921280. Throughput: 0: 42689.3. Samples: 6622040280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 09:13:17,468][15401] Updated weights for policy 0, policy_version 404180 (0.0041) [2024-06-23 09:13:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42766.1). Total num frames: 6622134272. Throughput: 0: 42710.7. Samples: 6622297720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 09:13:20,565][15401] Updated weights for policy 0, policy_version 404190 (0.0031) [2024-06-23 09:13:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42326.1, 300 sec: 42709.5). Total num frames: 6622330880. Throughput: 0: 42600.4. Samples: 6622423060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:23,402][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 09:13:24,893][15401] Updated weights for policy 0, policy_version 404200 (0.0032) [2024-06-23 09:13:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6622560256. Throughput: 0: 42610.7. Samples: 6622672320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 09:13:28,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 09:13:28,503][15401] Updated weights for policy 0, policy_version 404210 (0.0031) [2024-06-23 09:13:31,839][15349] Signal inference workers to stop experience collection... (98100 times) [2024-06-23 09:13:31,839][15349] Signal inference workers to resume experience collection... (98100 times) [2024-06-23 09:13:31,858][15401] InferenceWorker_p0-w0: stopping experience collection (98100 times) [2024-06-23 09:13:31,858][15401] InferenceWorker_p0-w0: resuming experience collection (98100 times) [2024-06-23 09:13:32,450][15401] Updated weights for policy 0, policy_version 404220 (0.0038) [2024-06-23 09:13:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 6622773248. Throughput: 0: 42869.7. Samples: 6622936360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 09:13:36,146][15401] Updated weights for policy 0, policy_version 404230 (0.0039) [2024-06-23 09:13:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 6622986240. Throughput: 0: 42719.1. Samples: 6623063260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 09:13:40,241][15401] Updated weights for policy 0, policy_version 404240 (0.0033) [2024-06-23 09:13:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6623199232. Throughput: 0: 42583.2. Samples: 6623313380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 09:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404248_6623199232.pth... [2024-06-23 09:13:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403622_6612942848.pth [2024-06-23 09:13:44,381][15401] Updated weights for policy 0, policy_version 404250 (0.0047) [2024-06-23 09:13:47,703][15401] Updated weights for policy 0, policy_version 404260 (0.0036) [2024-06-23 09:13:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6623412224. Throughput: 0: 42461.1. Samples: 6623568640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 09:13:51,897][15401] Updated weights for policy 0, policy_version 404270 (0.0025) [2024-06-23 09:13:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6623608832. Throughput: 0: 42397.0. Samples: 6623699660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:53,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 09:13:55,486][15401] Updated weights for policy 0, policy_version 404280 (0.0043) [2024-06-23 09:13:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6623838208. Throughput: 0: 42595.6. Samples: 6623957080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:13:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 09:13:59,590][15401] Updated weights for policy 0, policy_version 404290 (0.0030) [2024-06-23 09:14:03,009][15401] Updated weights for policy 0, policy_version 404300 (0.0044) [2024-06-23 09:14:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42599.9, 300 sec: 42820.5). Total num frames: 6624067584. Throughput: 0: 42519.1. Samples: 6624211080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 09:14:07,246][15401] Updated weights for policy 0, policy_version 404310 (0.0036) [2024-06-23 09:14:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 6624264192. Throughput: 0: 42657.7. Samples: 6624342660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:08,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 09:14:10,737][15401] Updated weights for policy 0, policy_version 404320 (0.0032) [2024-06-23 09:14:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6624477184. Throughput: 0: 42800.5. Samples: 6624598240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 09:14:14,872][15401] Updated weights for policy 0, policy_version 404330 (0.0046) [2024-06-23 09:14:18,262][15401] Updated weights for policy 0, policy_version 404340 (0.0026) [2024-06-23 09:14:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 6624706560. Throughput: 0: 42501.4. Samples: 6624848920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 09:14:22,507][15401] Updated weights for policy 0, policy_version 404350 (0.0039) [2024-06-23 09:14:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6624903168. Throughput: 0: 42654.6. Samples: 6624982720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 09:14:26,215][15401] Updated weights for policy 0, policy_version 404360 (0.0030) [2024-06-23 09:14:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6625132544. Throughput: 0: 42840.0. Samples: 6625241180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:28,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 09:14:30,266][15401] Updated weights for policy 0, policy_version 404370 (0.0033) [2024-06-23 09:14:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6625329152. Throughput: 0: 42813.8. Samples: 6625495260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:14:33,707][15401] Updated weights for policy 0, policy_version 404380 (0.0035) [2024-06-23 09:14:37,841][15401] Updated weights for policy 0, policy_version 404390 (0.0044) [2024-06-23 09:14:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6625558528. Throughput: 0: 42797.3. Samples: 6625625540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 09:14:41,328][15401] Updated weights for policy 0, policy_version 404400 (0.0027) [2024-06-23 09:14:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6625771520. Throughput: 0: 42697.3. Samples: 6625878460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 09:14:45,626][15401] Updated weights for policy 0, policy_version 404410 (0.0029) [2024-06-23 09:14:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6625984512. Throughput: 0: 42733.4. Samples: 6626134080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 09:14:48,697][15349] Signal inference workers to stop experience collection... (98150 times) [2024-06-23 09:14:48,752][15401] InferenceWorker_p0-w0: stopping experience collection (98150 times) [2024-06-23 09:14:48,810][15349] Signal inference workers to resume experience collection... (98150 times) [2024-06-23 09:14:48,810][15401] InferenceWorker_p0-w0: resuming experience collection (98150 times) [2024-06-23 09:14:48,945][15401] Updated weights for policy 0, policy_version 404420 (0.0041) [2024-06-23 09:14:53,269][15401] Updated weights for policy 0, policy_version 404430 (0.0027) [2024-06-23 09:14:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6626181120. Throughput: 0: 42697.1. Samples: 6626264020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 09:14:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 09:14:56,371][15401] Updated weights for policy 0, policy_version 404440 (0.0029) [2024-06-23 09:14:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6626394112. Throughput: 0: 42665.0. Samples: 6626518160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:14:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 09:15:00,963][15401] Updated weights for policy 0, policy_version 404450 (0.0030) [2024-06-23 09:15:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6626623488. Throughput: 0: 42880.0. Samples: 6626778520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 09:15:04,279][15401] Updated weights for policy 0, policy_version 404460 (0.0047) [2024-06-23 09:15:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6626820096. Throughput: 0: 42849.8. Samples: 6626910960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 09:15:08,757][15401] Updated weights for policy 0, policy_version 404470 (0.0027) [2024-06-23 09:15:11,914][15401] Updated weights for policy 0, policy_version 404480 (0.0029) [2024-06-23 09:15:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6627016704. Throughput: 0: 42440.9. Samples: 6627151020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 09:15:16,342][15401] Updated weights for policy 0, policy_version 404490 (0.0026) [2024-06-23 09:15:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6627246080. Throughput: 0: 42737.7. Samples: 6627418460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 09:15:19,536][15401] Updated weights for policy 0, policy_version 404500 (0.0032) [2024-06-23 09:15:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6627475456. Throughput: 0: 42725.6. Samples: 6627548200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 09:15:23,820][15401] Updated weights for policy 0, policy_version 404510 (0.0026) [2024-06-23 09:15:27,657][15401] Updated weights for policy 0, policy_version 404520 (0.0036) [2024-06-23 09:15:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6627672064. Throughput: 0: 42607.5. Samples: 6627795800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 09:15:31,262][15401] Updated weights for policy 0, policy_version 404530 (0.0039) [2024-06-23 09:15:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6627885056. Throughput: 0: 42852.8. Samples: 6628062460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 09:15:35,240][15401] Updated weights for policy 0, policy_version 404540 (0.0038) [2024-06-23 09:15:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6628098048. Throughput: 0: 42824.4. Samples: 6628191120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 09:15:38,973][15401] Updated weights for policy 0, policy_version 404550 (0.0038) [2024-06-23 09:15:42,666][15401] Updated weights for policy 0, policy_version 404560 (0.0039) [2024-06-23 09:15:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6628311040. Throughput: 0: 42748.0. Samples: 6628441820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 09:15:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404561_6628327424.pth... [2024-06-23 09:15:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000403935_6618071040.pth [2024-06-23 09:15:46,506][15401] Updated weights for policy 0, policy_version 404570 (0.0042) [2024-06-23 09:15:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6628540416. Throughput: 0: 42784.0. Samples: 6628703800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 09:15:50,615][15401] Updated weights for policy 0, policy_version 404580 (0.0036) [2024-06-23 09:15:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6628737024. Throughput: 0: 42733.2. Samples: 6628833960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 09:15:54,294][15401] Updated weights for policy 0, policy_version 404590 (0.0036) [2024-06-23 09:15:58,220][15401] Updated weights for policy 0, policy_version 404600 (0.0027) [2024-06-23 09:15:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6628966400. Throughput: 0: 43072.8. Samples: 6629089300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:15:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 09:16:02,062][15401] Updated weights for policy 0, policy_version 404610 (0.0036) [2024-06-23 09:16:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6629179392. Throughput: 0: 42803.5. Samples: 6629344620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:16:03,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 09:16:03,666][15349] Signal inference workers to stop experience collection... (98200 times) [2024-06-23 09:16:03,666][15349] Signal inference workers to resume experience collection... (98200 times) [2024-06-23 09:16:03,716][15401] InferenceWorker_p0-w0: stopping experience collection (98200 times) [2024-06-23 09:16:03,716][15401] InferenceWorker_p0-w0: resuming experience collection (98200 times) [2024-06-23 09:16:06,024][15401] Updated weights for policy 0, policy_version 404620 (0.0037) [2024-06-23 09:16:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6629376000. Throughput: 0: 42672.5. Samples: 6629468460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:16:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:16:09,927][15401] Updated weights for policy 0, policy_version 404630 (0.0027) [2024-06-23 09:16:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 6629605376. Throughput: 0: 42838.3. Samples: 6629723520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-23 09:16:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 09:16:13,554][15401] Updated weights for policy 0, policy_version 404640 (0.0035) [2024-06-23 09:16:17,677][15401] Updated weights for policy 0, policy_version 404650 (0.0029) [2024-06-23 09:16:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6629818368. Throughput: 0: 42600.1. Samples: 6629979460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 09:16:21,146][15401] Updated weights for policy 0, policy_version 404660 (0.0030) [2024-06-23 09:16:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6630031360. Throughput: 0: 42564.9. Samples: 6630106540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 09:16:25,501][15401] Updated weights for policy 0, policy_version 404670 (0.0036) [2024-06-23 09:16:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6630244352. Throughput: 0: 42715.0. Samples: 6630364000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:28,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 09:16:29,054][15401] Updated weights for policy 0, policy_version 404680 (0.0055) [2024-06-23 09:16:33,205][15401] Updated weights for policy 0, policy_version 404690 (0.0044) [2024-06-23 09:16:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6630440960. Throughput: 0: 42731.1. Samples: 6630626700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 09:16:36,543][15401] Updated weights for policy 0, policy_version 404700 (0.0027) [2024-06-23 09:16:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6630686720. Throughput: 0: 42526.8. Samples: 6630747660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 09:16:40,988][15401] Updated weights for policy 0, policy_version 404710 (0.0035) [2024-06-23 09:16:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42599.0). Total num frames: 6630883328. Throughput: 0: 42628.1. Samples: 6631007560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 09:16:44,036][15401] Updated weights for policy 0, policy_version 404720 (0.0029) [2024-06-23 09:16:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6631079936. Throughput: 0: 42751.6. Samples: 6631268440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 09:16:48,522][15401] Updated weights for policy 0, policy_version 404730 (0.0033) [2024-06-23 09:16:51,663][15401] Updated weights for policy 0, policy_version 404740 (0.0039) [2024-06-23 09:16:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6631309312. Throughput: 0: 42811.4. Samples: 6631394980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:53,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 09:16:56,377][15401] Updated weights for policy 0, policy_version 404750 (0.0028) [2024-06-23 09:16:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 6631522304. Throughput: 0: 42723.6. Samples: 6631646080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:16:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 09:16:59,459][15401] Updated weights for policy 0, policy_version 404760 (0.0033) [2024-06-23 09:17:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.2). Total num frames: 6631718912. Throughput: 0: 42768.9. Samples: 6631904060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:03,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 09:17:03,949][15401] Updated weights for policy 0, policy_version 404770 (0.0032) [2024-06-23 09:17:07,350][15401] Updated weights for policy 0, policy_version 404780 (0.0032) [2024-06-23 09:17:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6631964672. Throughput: 0: 42648.8. Samples: 6632025740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:17:11,877][15401] Updated weights for policy 0, policy_version 404790 (0.0034) [2024-06-23 09:17:13,396][15132] Fps is (10 sec: 45846.0, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 6632177664. Throughput: 0: 42695.0. Samples: 6632285540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:13,396][15132] Avg episode reward: [(0, '0.886')] [2024-06-23 09:17:14,963][15401] Updated weights for policy 0, policy_version 404800 (0.0038) [2024-06-23 09:17:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 6632374272. Throughput: 0: 42464.0. Samples: 6632537580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 09:17:19,515][15401] Updated weights for policy 0, policy_version 404810 (0.0034) [2024-06-23 09:17:22,985][15349] Signal inference workers to stop experience collection... (98250 times) [2024-06-23 09:17:22,985][15349] Signal inference workers to resume experience collection... (98250 times) [2024-06-23 09:17:22,993][15401] Updated weights for policy 0, policy_version 404820 (0.0034) [2024-06-23 09:17:23,000][15401] InferenceWorker_p0-w0: stopping experience collection (98250 times) [2024-06-23 09:17:23,001][15401] InferenceWorker_p0-w0: resuming experience collection (98250 times) [2024-06-23 09:17:23,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6632587264. Throughput: 0: 42511.0. Samples: 6632660660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 09:17:27,140][15401] Updated weights for policy 0, policy_version 404830 (0.0035) [2024-06-23 09:17:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42598.8). Total num frames: 6632800256. Throughput: 0: 42624.9. Samples: 6632925680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 09:17:30,427][15401] Updated weights for policy 0, policy_version 404840 (0.0027) [2024-06-23 09:17:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 6632996864. Throughput: 0: 42280.6. Samples: 6633171060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 09:17:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 09:17:34,754][15401] Updated weights for policy 0, policy_version 404850 (0.0038) [2024-06-23 09:17:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6633209856. Throughput: 0: 42290.9. Samples: 6633298060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:17:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 09:17:38,411][15401] Updated weights for policy 0, policy_version 404860 (0.0036) [2024-06-23 09:17:42,334][15401] Updated weights for policy 0, policy_version 404870 (0.0022) [2024-06-23 09:17:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6633422848. Throughput: 0: 42471.0. Samples: 6633557280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:17:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 09:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404872_6633422848.pth... [2024-06-23 09:17:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404248_6623199232.pth [2024-06-23 09:17:46,126][15401] Updated weights for policy 0, policy_version 404880 (0.0033) [2024-06-23 09:17:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6633652224. Throughput: 0: 42315.9. Samples: 6633808280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:17:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 09:17:49,966][15401] Updated weights for policy 0, policy_version 404890 (0.0032) [2024-06-23 09:17:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6633848832. Throughput: 0: 42513.9. Samples: 6633938860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:17:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 09:17:53,768][15401] Updated weights for policy 0, policy_version 404900 (0.0041) [2024-06-23 09:17:57,700][15401] Updated weights for policy 0, policy_version 404910 (0.0047) [2024-06-23 09:17:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 6634061824. Throughput: 0: 42331.8. Samples: 6634190200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:17:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 09:18:01,293][15401] Updated weights for policy 0, policy_version 404920 (0.0045) [2024-06-23 09:18:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6634291200. Throughput: 0: 42316.5. Samples: 6634441820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:03,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 09:18:05,599][15401] Updated weights for policy 0, policy_version 404930 (0.0035) [2024-06-23 09:18:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 6634504192. Throughput: 0: 42534.9. Samples: 6634574720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 09:18:09,353][15401] Updated weights for policy 0, policy_version 404940 (0.0048) [2024-06-23 09:18:13,111][15401] Updated weights for policy 0, policy_version 404950 (0.0036) [2024-06-23 09:18:13,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42056.6, 300 sec: 42598.4). Total num frames: 6634700800. Throughput: 0: 42160.7. Samples: 6634822920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 09:18:16,885][15401] Updated weights for policy 0, policy_version 404960 (0.0027) [2024-06-23 09:18:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6634930176. Throughput: 0: 42435.0. Samples: 6635080740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:18,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 09:18:20,683][15401] Updated weights for policy 0, policy_version 404970 (0.0028) [2024-06-23 09:18:23,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 6635143168. Throughput: 0: 42556.0. Samples: 6635213080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 09:18:24,407][15401] Updated weights for policy 0, policy_version 404980 (0.0041) [2024-06-23 09:18:28,199][15401] Updated weights for policy 0, policy_version 404990 (0.0034) [2024-06-23 09:18:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6635356160. Throughput: 0: 42420.9. Samples: 6635466220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 09:18:32,203][15401] Updated weights for policy 0, policy_version 405000 (0.0032) [2024-06-23 09:18:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6635569152. Throughput: 0: 42460.6. Samples: 6635719000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 09:18:36,082][15401] Updated weights for policy 0, policy_version 405010 (0.0025) [2024-06-23 09:18:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6635749376. Throughput: 0: 42440.9. Samples: 6635848700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 09:18:39,945][15401] Updated weights for policy 0, policy_version 405020 (0.0027) [2024-06-23 09:18:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6635978752. Throughput: 0: 42563.2. Samples: 6636105540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 09:18:43,600][15401] Updated weights for policy 0, policy_version 405030 (0.0039) [2024-06-23 09:18:46,650][15349] Signal inference workers to stop experience collection... (98300 times) [2024-06-23 09:18:46,683][15401] InferenceWorker_p0-w0: stopping experience collection (98300 times) [2024-06-23 09:18:46,702][15349] Signal inference workers to resume experience collection... (98300 times) [2024-06-23 09:18:46,702][15401] InferenceWorker_p0-w0: resuming experience collection (98300 times) [2024-06-23 09:18:47,474][15401] Updated weights for policy 0, policy_version 405040 (0.0041) [2024-06-23 09:18:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6636224512. Throughput: 0: 42588.8. Samples: 6636358320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 09:18:51,180][15401] Updated weights for policy 0, policy_version 405050 (0.0031) [2024-06-23 09:18:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6636388352. Throughput: 0: 42489.8. Samples: 6636486760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-23 09:18:53,398][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 09:18:55,498][15401] Updated weights for policy 0, policy_version 405060 (0.0030) [2024-06-23 09:18:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6636634112. Throughput: 0: 42675.6. Samples: 6636743320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:18:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 09:18:58,892][15401] Updated weights for policy 0, policy_version 405070 (0.0021) [2024-06-23 09:19:03,068][15401] Updated weights for policy 0, policy_version 405080 (0.0034) [2024-06-23 09:19:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6636847104. Throughput: 0: 42696.6. Samples: 6637001980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:03,398][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 09:19:06,430][15401] Updated weights for policy 0, policy_version 405090 (0.0028) [2024-06-23 09:19:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6637043712. Throughput: 0: 42619.9. Samples: 6637130980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 09:19:10,616][15401] Updated weights for policy 0, policy_version 405100 (0.0051) [2024-06-23 09:19:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6637289472. Throughput: 0: 42796.9. Samples: 6637392080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:13,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 09:19:14,018][15401] Updated weights for policy 0, policy_version 405110 (0.0029) [2024-06-23 09:19:18,203][15401] Updated weights for policy 0, policy_version 405120 (0.0034) [2024-06-23 09:19:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 6637486080. Throughput: 0: 42855.1. Samples: 6637647480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 09:19:21,897][15401] Updated weights for policy 0, policy_version 405130 (0.0035) [2024-06-23 09:19:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6637699072. Throughput: 0: 42810.2. Samples: 6637775160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 09:19:25,986][15401] Updated weights for policy 0, policy_version 405140 (0.0041) [2024-06-23 09:19:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6637912064. Throughput: 0: 42794.1. Samples: 6638031280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 09:19:29,623][15401] Updated weights for policy 0, policy_version 405150 (0.0040) [2024-06-23 09:19:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6638108672. Throughput: 0: 42995.7. Samples: 6638293120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 09:19:33,654][15401] Updated weights for policy 0, policy_version 405160 (0.0039) [2024-06-23 09:19:37,374][15401] Updated weights for policy 0, policy_version 405170 (0.0033) [2024-06-23 09:19:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 6638354432. Throughput: 0: 42938.1. Samples: 6638418980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 09:19:41,325][15401] Updated weights for policy 0, policy_version 405180 (0.0040) [2024-06-23 09:19:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6638551040. Throughput: 0: 42965.0. Samples: 6638676740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 09:19:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405185_6638551040.pth... [2024-06-23 09:19:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404561_6628327424.pth [2024-06-23 09:19:44,721][15401] Updated weights for policy 0, policy_version 405190 (0.0034) [2024-06-23 09:19:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6638747648. Throughput: 0: 43048.9. Samples: 6638939180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 09:19:49,003][15401] Updated weights for policy 0, policy_version 405200 (0.0036) [2024-06-23 09:19:52,349][15401] Updated weights for policy 0, policy_version 405210 (0.0026) [2024-06-23 09:19:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 6638993408. Throughput: 0: 42987.1. Samples: 6639065400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 09:19:56,761][15401] Updated weights for policy 0, policy_version 405220 (0.0035) [2024-06-23 09:19:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 6639206400. Throughput: 0: 42896.5. Samples: 6639322420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:19:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 09:20:00,009][15401] Updated weights for policy 0, policy_version 405230 (0.0029) [2024-06-23 09:20:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6639403008. Throughput: 0: 42989.2. Samples: 6639582000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:20:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 09:20:04,421][15401] Updated weights for policy 0, policy_version 405240 (0.0032) [2024-06-23 09:20:07,302][15349] Signal inference workers to stop experience collection... (98350 times) [2024-06-23 09:20:07,349][15401] InferenceWorker_p0-w0: stopping experience collection (98350 times) [2024-06-23 09:20:07,414][15349] Signal inference workers to resume experience collection... (98350 times) [2024-06-23 09:20:07,414][15401] InferenceWorker_p0-w0: resuming experience collection (98350 times) [2024-06-23 09:20:07,547][15401] Updated weights for policy 0, policy_version 405250 (0.0039) [2024-06-23 09:20:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6639632384. Throughput: 0: 42923.0. Samples: 6639706700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:20:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:20:12,111][15401] Updated weights for policy 0, policy_version 405260 (0.0042) [2024-06-23 09:20:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6639861760. Throughput: 0: 43081.8. Samples: 6639969960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 24.0) [2024-06-23 09:20:13,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 09:20:15,183][15401] Updated weights for policy 0, policy_version 405270 (0.0037) [2024-06-23 09:20:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6640058368. Throughput: 0: 42971.9. Samples: 6640226960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:18,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 09:20:19,737][15401] Updated weights for policy 0, policy_version 405280 (0.0032) [2024-06-23 09:20:22,669][15401] Updated weights for policy 0, policy_version 405290 (0.0036) [2024-06-23 09:20:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 6640287744. Throughput: 0: 42938.6. Samples: 6640351320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:23,393][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 09:20:27,645][15401] Updated weights for policy 0, policy_version 405300 (0.0041) [2024-06-23 09:20:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6640484352. Throughput: 0: 43153.4. Samples: 6640618640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 09:20:30,358][15401] Updated weights for policy 0, policy_version 405310 (0.0034) [2024-06-23 09:20:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6640697344. Throughput: 0: 42849.1. Samples: 6640867400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 09:20:35,242][15401] Updated weights for policy 0, policy_version 405320 (0.0030) [2024-06-23 09:20:38,077][15401] Updated weights for policy 0, policy_version 405330 (0.0040) [2024-06-23 09:20:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6640943104. Throughput: 0: 42925.4. Samples: 6640997040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 09:20:42,819][15401] Updated weights for policy 0, policy_version 405340 (0.0027) [2024-06-23 09:20:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6641123328. Throughput: 0: 43035.0. Samples: 6641259000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:43,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 09:20:45,707][15401] Updated weights for policy 0, policy_version 405350 (0.0036) [2024-06-23 09:20:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 6641352704. Throughput: 0: 42686.7. Samples: 6641502900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 09:20:50,497][15401] Updated weights for policy 0, policy_version 405360 (0.0042) [2024-06-23 09:20:53,388][15401] Updated weights for policy 0, policy_version 405370 (0.0035) [2024-06-23 09:20:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6641582080. Throughput: 0: 42865.8. Samples: 6641635660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 09:20:58,087][15401] Updated weights for policy 0, policy_version 405380 (0.0049) [2024-06-23 09:20:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6641745920. Throughput: 0: 42749.5. Samples: 6641893680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:20:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 09:21:01,278][15401] Updated weights for policy 0, policy_version 405390 (0.0033) [2024-06-23 09:21:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6641991680. Throughput: 0: 42573.4. Samples: 6642142660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 09:21:05,858][15401] Updated weights for policy 0, policy_version 405400 (0.0029) [2024-06-23 09:21:08,103][15349] Signal inference workers to stop experience collection... (98400 times) [2024-06-23 09:21:08,107][15349] Signal inference workers to resume experience collection... (98400 times) [2024-06-23 09:21:08,120][15401] InferenceWorker_p0-w0: stopping experience collection (98400 times) [2024-06-23 09:21:08,145][15401] InferenceWorker_p0-w0: resuming experience collection (98400 times) [2024-06-23 09:21:08,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 6642204672. Throughput: 0: 42877.8. Samples: 6642280820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:08,392][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 09:21:08,831][15401] Updated weights for policy 0, policy_version 405410 (0.0039) [2024-06-23 09:21:13,387][15401] Updated weights for policy 0, policy_version 405420 (0.0031) [2024-06-23 09:21:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6642401280. Throughput: 0: 42670.6. Samples: 6642538820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:13,395][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 09:21:16,802][15401] Updated weights for policy 0, policy_version 405430 (0.0026) [2024-06-23 09:21:18,390][15132] Fps is (10 sec: 44246.7, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 6642647040. Throughput: 0: 42580.0. Samples: 6642783500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 09:21:20,934][15401] Updated weights for policy 0, policy_version 405440 (0.0042) [2024-06-23 09:21:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 6642827264. Throughput: 0: 42719.5. Samples: 6642919420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:23,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 09:21:24,484][15401] Updated weights for policy 0, policy_version 405450 (0.0043) [2024-06-23 09:21:28,394][15132] Fps is (10 sec: 39305.3, 60 sec: 42595.3, 300 sec: 42708.9). Total num frames: 6643040256. Throughput: 0: 42621.3. Samples: 6643177140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:28,394][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 09:21:28,953][15401] Updated weights for policy 0, policy_version 405460 (0.0043) [2024-06-23 09:21:31,936][15401] Updated weights for policy 0, policy_version 405470 (0.0023) [2024-06-23 09:21:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 6643302400. Throughput: 0: 42713.3. Samples: 6643425000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:21:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 09:21:36,416][15401] Updated weights for policy 0, policy_version 405480 (0.0033) [2024-06-23 09:21:38,390][15132] Fps is (10 sec: 44255.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6643482624. Throughput: 0: 42880.9. Samples: 6643565300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:21:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 09:21:39,555][15401] Updated weights for policy 0, policy_version 405490 (0.0035) [2024-06-23 09:21:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6643695616. Throughput: 0: 42856.3. Samples: 6643822220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:21:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 09:21:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405499_6643695616.pth... [2024-06-23 09:21:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000404872_6633422848.pth [2024-06-23 09:21:43,802][15401] Updated weights for policy 0, policy_version 405500 (0.0025) [2024-06-23 09:21:47,112][15401] Updated weights for policy 0, policy_version 405510 (0.0034) [2024-06-23 09:21:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6643924992. Throughput: 0: 42973.3. Samples: 6644076460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:21:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 09:21:51,250][15401] Updated weights for policy 0, policy_version 405520 (0.0034) [2024-06-23 09:21:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6644137984. Throughput: 0: 43027.2. Samples: 6644216940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:21:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 09:21:54,681][15401] Updated weights for policy 0, policy_version 405530 (0.0032) [2024-06-23 09:21:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6644334592. Throughput: 0: 42883.5. Samples: 6644468580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:21:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 09:21:58,706][15401] Updated weights for policy 0, policy_version 405540 (0.0034) [2024-06-23 09:22:02,142][15401] Updated weights for policy 0, policy_version 405550 (0.0032) [2024-06-23 09:22:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6644563968. Throughput: 0: 43170.9. Samples: 6644726180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 09:22:06,274][15401] Updated weights for policy 0, policy_version 405560 (0.0036) [2024-06-23 09:22:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42710.4). Total num frames: 6644776960. Throughput: 0: 43138.2. Samples: 6644860640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 09:22:10,000][15401] Updated weights for policy 0, policy_version 405570 (0.0037) [2024-06-23 09:22:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6644973568. Throughput: 0: 42988.2. Samples: 6645111420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 09:22:13,444][15349] Signal inference workers to stop experience collection... (98450 times) [2024-06-23 09:22:13,445][15349] Signal inference workers to resume experience collection... (98450 times) [2024-06-23 09:22:13,463][15401] InferenceWorker_p0-w0: stopping experience collection (98450 times) [2024-06-23 09:22:13,464][15401] InferenceWorker_p0-w0: resuming experience collection (98450 times) [2024-06-23 09:22:13,729][15401] Updated weights for policy 0, policy_version 405580 (0.0038) [2024-06-23 09:22:17,577][15401] Updated weights for policy 0, policy_version 405590 (0.0047) [2024-06-23 09:22:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6645219328. Throughput: 0: 43236.0. Samples: 6645370620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 09:22:21,214][15401] Updated weights for policy 0, policy_version 405600 (0.0035) [2024-06-23 09:22:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6645415936. Throughput: 0: 43017.7. Samples: 6645501100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 09:22:25,068][15401] Updated weights for policy 0, policy_version 405610 (0.0051) [2024-06-23 09:22:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43420.7, 300 sec: 42876.1). Total num frames: 6645645312. Throughput: 0: 43010.3. Samples: 6645757680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 09:22:28,971][15401] Updated weights for policy 0, policy_version 405620 (0.0041) [2024-06-23 09:22:32,809][15401] Updated weights for policy 0, policy_version 405630 (0.0035) [2024-06-23 09:22:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6645858304. Throughput: 0: 43145.9. Samples: 6646018020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 09:22:36,555][15401] Updated weights for policy 0, policy_version 405640 (0.0033) [2024-06-23 09:22:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6646054912. Throughput: 0: 42772.5. Samples: 6646141700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 09:22:40,782][15401] Updated weights for policy 0, policy_version 405650 (0.0045) [2024-06-23 09:22:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6646300672. Throughput: 0: 42911.1. Samples: 6646399580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:43,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 09:22:44,226][15401] Updated weights for policy 0, policy_version 405660 (0.0024) [2024-06-23 09:22:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6646480896. Throughput: 0: 42858.1. Samples: 6646654800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:48,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 09:22:48,412][15401] Updated weights for policy 0, policy_version 405670 (0.0033) [2024-06-23 09:22:51,815][15401] Updated weights for policy 0, policy_version 405680 (0.0033) [2024-06-23 09:22:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6646710272. Throughput: 0: 42772.9. Samples: 6646785420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 09:22:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 09:22:56,001][15401] Updated weights for policy 0, policy_version 405690 (0.0042) [2024-06-23 09:22:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6646923264. Throughput: 0: 42838.5. Samples: 6647039260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:22:58,393][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 09:22:59,603][15401] Updated weights for policy 0, policy_version 405700 (0.0039) [2024-06-23 09:23:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6647136256. Throughput: 0: 42841.4. Samples: 6647298480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 09:23:03,893][15401] Updated weights for policy 0, policy_version 405710 (0.0034) [2024-06-23 09:23:07,091][15401] Updated weights for policy 0, policy_version 405720 (0.0045) [2024-06-23 09:23:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6647349248. Throughput: 0: 42786.8. Samples: 6647426500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 09:23:11,416][15401] Updated weights for policy 0, policy_version 405730 (0.0030) [2024-06-23 09:23:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 6647562240. Throughput: 0: 42754.2. Samples: 6647681620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 09:23:14,608][15401] Updated weights for policy 0, policy_version 405740 (0.0031) [2024-06-23 09:23:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6647758848. Throughput: 0: 42759.1. Samples: 6647942180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 09:23:18,944][15401] Updated weights for policy 0, policy_version 405750 (0.0033) [2024-06-23 09:23:22,497][15401] Updated weights for policy 0, policy_version 405760 (0.0028) [2024-06-23 09:23:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 6648004608. Throughput: 0: 42854.5. Samples: 6648070260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:23,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 09:23:26,623][15401] Updated weights for policy 0, policy_version 405770 (0.0034) [2024-06-23 09:23:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6648201216. Throughput: 0: 42779.6. Samples: 6648324660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:28,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-23 09:23:30,297][15401] Updated weights for policy 0, policy_version 405780 (0.0043) [2024-06-23 09:23:33,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 6648397824. Throughput: 0: 42852.8. Samples: 6648583180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 09:23:34,241][15401] Updated weights for policy 0, policy_version 405790 (0.0032) [2024-06-23 09:23:37,028][15349] Signal inference workers to stop experience collection... (98500 times) [2024-06-23 09:23:37,029][15349] Signal inference workers to resume experience collection... (98500 times) [2024-06-23 09:23:37,083][15401] InferenceWorker_p0-w0: stopping experience collection (98500 times) [2024-06-23 09:23:37,083][15401] InferenceWorker_p0-w0: resuming experience collection (98500 times) [2024-06-23 09:23:38,003][15401] Updated weights for policy 0, policy_version 405800 (0.0038) [2024-06-23 09:23:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 6648627200. Throughput: 0: 42817.7. Samples: 6648712220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:38,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 09:23:41,809][15401] Updated weights for policy 0, policy_version 405810 (0.0031) [2024-06-23 09:23:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6648840192. Throughput: 0: 42800.6. Samples: 6648965180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:43,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 09:23:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405814_6648856576.pth... [2024-06-23 09:23:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405185_6638551040.pth [2024-06-23 09:23:45,649][15401] Updated weights for policy 0, policy_version 405820 (0.0041) [2024-06-23 09:23:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6649036800. Throughput: 0: 42828.3. Samples: 6649225760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:48,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 09:23:49,490][15401] Updated weights for policy 0, policy_version 405830 (0.0037) [2024-06-23 09:23:53,191][15401] Updated weights for policy 0, policy_version 405840 (0.0036) [2024-06-23 09:23:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6649282560. Throughput: 0: 42750.1. Samples: 6649350260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 09:23:57,394][15401] Updated weights for policy 0, policy_version 405850 (0.0038) [2024-06-23 09:23:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 6649479168. Throughput: 0: 42815.6. Samples: 6649608320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:23:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 09:24:01,086][15401] Updated weights for policy 0, policy_version 405860 (0.0032) [2024-06-23 09:24:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6649692160. Throughput: 0: 42799.5. Samples: 6649868160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:24:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 09:24:05,029][15401] Updated weights for policy 0, policy_version 405870 (0.0047) [2024-06-23 09:24:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6649921536. Throughput: 0: 42769.3. Samples: 6649994780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:24:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 09:24:08,583][15401] Updated weights for policy 0, policy_version 405880 (0.0033) [2024-06-23 09:24:12,753][15401] Updated weights for policy 0, policy_version 405890 (0.0037) [2024-06-23 09:24:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6650134528. Throughput: 0: 42989.3. Samples: 6650259180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:24:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 09:24:16,245][15401] Updated weights for policy 0, policy_version 405900 (0.0031) [2024-06-23 09:24:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6650347520. Throughput: 0: 42814.7. Samples: 6650509840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 09:24:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:24:20,256][15401] Updated weights for policy 0, policy_version 405910 (0.0030) [2024-06-23 09:24:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 6650560512. Throughput: 0: 42850.3. Samples: 6650640480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 09:24:23,931][15401] Updated weights for policy 0, policy_version 405920 (0.0031) [2024-06-23 09:24:27,766][15401] Updated weights for policy 0, policy_version 405930 (0.0046) [2024-06-23 09:24:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6650773504. Throughput: 0: 43020.4. Samples: 6650901100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 09:24:31,562][15401] Updated weights for policy 0, policy_version 405940 (0.0037) [2024-06-23 09:24:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6651002880. Throughput: 0: 42848.0. Samples: 6651153920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 09:24:35,525][15401] Updated weights for policy 0, policy_version 405950 (0.0028) [2024-06-23 09:24:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6651183104. Throughput: 0: 43069.7. Samples: 6651288400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 09:24:39,208][15401] Updated weights for policy 0, policy_version 405960 (0.0028) [2024-06-23 09:24:43,168][15401] Updated weights for policy 0, policy_version 405970 (0.0039) [2024-06-23 09:24:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6651412480. Throughput: 0: 43010.2. Samples: 6651543780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:43,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-23 09:24:46,832][15401] Updated weights for policy 0, policy_version 405980 (0.0044) [2024-06-23 09:24:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6651641856. Throughput: 0: 42807.2. Samples: 6651794480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:48,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 09:24:50,740][15401] Updated weights for policy 0, policy_version 405990 (0.0043) [2024-06-23 09:24:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6651838464. Throughput: 0: 43029.9. Samples: 6651931120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 09:24:54,381][15401] Updated weights for policy 0, policy_version 406000 (0.0033) [2024-06-23 09:24:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6652051456. Throughput: 0: 42793.3. Samples: 6652184880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:24:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 09:24:58,541][15401] Updated weights for policy 0, policy_version 406010 (0.0043) [2024-06-23 09:24:59,784][15349] Signal inference workers to stop experience collection... (98550 times) [2024-06-23 09:24:59,832][15401] InferenceWorker_p0-w0: stopping experience collection (98550 times) [2024-06-23 09:24:59,900][15349] Signal inference workers to resume experience collection... (98550 times) [2024-06-23 09:24:59,900][15401] InferenceWorker_p0-w0: resuming experience collection (98550 times) [2024-06-23 09:25:01,946][15401] Updated weights for policy 0, policy_version 406020 (0.0030) [2024-06-23 09:25:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 6652297216. Throughput: 0: 42805.0. Samples: 6652436060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 09:25:06,464][15401] Updated weights for policy 0, policy_version 406030 (0.0039) [2024-06-23 09:25:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6652477440. Throughput: 0: 42864.7. Samples: 6652569400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 09:25:09,743][15401] Updated weights for policy 0, policy_version 406040 (0.0025) [2024-06-23 09:25:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 6652690432. Throughput: 0: 42674.5. Samples: 6652821460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 09:25:13,960][15401] Updated weights for policy 0, policy_version 406050 (0.0043) [2024-06-23 09:25:17,270][15401] Updated weights for policy 0, policy_version 406060 (0.0025) [2024-06-23 09:25:18,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 42876.1). Total num frames: 6652936192. Throughput: 0: 42744.9. Samples: 6653077540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:18,392][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 09:25:21,652][15401] Updated weights for policy 0, policy_version 406070 (0.0045) [2024-06-23 09:25:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6653100032. Throughput: 0: 42848.0. Samples: 6653216560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:23,392][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 09:25:24,847][15401] Updated weights for policy 0, policy_version 406080 (0.0036) [2024-06-23 09:25:28,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6653329408. Throughput: 0: 42620.5. Samples: 6653461700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 09:25:29,287][15401] Updated weights for policy 0, policy_version 406090 (0.0027) [2024-06-23 09:25:32,416][15401] Updated weights for policy 0, policy_version 406100 (0.0030) [2024-06-23 09:25:33,392][15132] Fps is (10 sec: 49140.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6653591552. Throughput: 0: 42760.8. Samples: 6653718820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:33,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 09:25:36,919][15401] Updated weights for policy 0, policy_version 406110 (0.0038) [2024-06-23 09:25:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 6653739008. Throughput: 0: 42779.9. Samples: 6653856320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-23 09:25:38,393][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 09:25:39,937][15401] Updated weights for policy 0, policy_version 406120 (0.0035) [2024-06-23 09:25:43,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6653984768. Throughput: 0: 42706.7. Samples: 6654106680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:25:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 09:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406127_6653984768.pth... [2024-06-23 09:25:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405499_6643695616.pth [2024-06-23 09:25:44,357][15401] Updated weights for policy 0, policy_version 406130 (0.0031) [2024-06-23 09:25:47,761][15401] Updated weights for policy 0, policy_version 406140 (0.0032) [2024-06-23 09:25:48,390][15132] Fps is (10 sec: 50802.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6654246912. Throughput: 0: 42845.3. Samples: 6654364100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:25:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 09:25:52,224][15401] Updated weights for policy 0, policy_version 406150 (0.0028) [2024-06-23 09:25:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6654377984. Throughput: 0: 42909.0. Samples: 6654500300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:25:53,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 09:25:55,406][15401] Updated weights for policy 0, policy_version 406160 (0.0031) [2024-06-23 09:25:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6654623744. Throughput: 0: 42906.7. Samples: 6654752260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:25:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 09:25:59,622][15401] Updated weights for policy 0, policy_version 406170 (0.0032) [2024-06-23 09:26:01,641][15349] Signal inference workers to stop experience collection... (98600 times) [2024-06-23 09:26:01,644][15349] Signal inference workers to resume experience collection... (98600 times) [2024-06-23 09:26:01,676][15401] InferenceWorker_p0-w0: stopping experience collection (98600 times) [2024-06-23 09:26:01,676][15401] InferenceWorker_p0-w0: resuming experience collection (98600 times) [2024-06-23 09:26:03,021][15401] Updated weights for policy 0, policy_version 406180 (0.0029) [2024-06-23 09:26:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 6654853120. Throughput: 0: 42859.7. Samples: 6655006120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 09:26:07,129][15401] Updated weights for policy 0, policy_version 406190 (0.0029) [2024-06-23 09:26:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6655033344. Throughput: 0: 42756.0. Samples: 6655140580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 09:26:10,629][15401] Updated weights for policy 0, policy_version 406200 (0.0049) [2024-06-23 09:26:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6655279104. Throughput: 0: 42958.6. Samples: 6655394840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 09:26:14,534][15401] Updated weights for policy 0, policy_version 406210 (0.0037) [2024-06-23 09:26:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 6655492096. Throughput: 0: 43027.1. Samples: 6655654940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 09:26:18,556][15401] Updated weights for policy 0, policy_version 406220 (0.0039) [2024-06-23 09:26:21,947][15401] Updated weights for policy 0, policy_version 406230 (0.0040) [2024-06-23 09:26:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 6655672320. Throughput: 0: 42772.5. Samples: 6655780980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 09:26:26,005][15401] Updated weights for policy 0, policy_version 406240 (0.0031) [2024-06-23 09:26:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6655934464. Throughput: 0: 42913.4. Samples: 6656037780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 09:26:29,468][15401] Updated weights for policy 0, policy_version 406250 (0.0027) [2024-06-23 09:26:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 6656131072. Throughput: 0: 42980.9. Samples: 6656298240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 09:26:33,699][15401] Updated weights for policy 0, policy_version 406260 (0.0038) [2024-06-23 09:26:37,121][15401] Updated weights for policy 0, policy_version 406270 (0.0034) [2024-06-23 09:26:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 6656327680. Throughput: 0: 42806.6. Samples: 6656426600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 09:26:41,270][15401] Updated weights for policy 0, policy_version 406280 (0.0031) [2024-06-23 09:26:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6656589824. Throughput: 0: 42984.0. Samples: 6656686540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 09:26:44,866][15401] Updated weights for policy 0, policy_version 406290 (0.0045) [2024-06-23 09:26:48,394][15132] Fps is (10 sec: 44215.3, 60 sec: 42048.8, 300 sec: 42819.8). Total num frames: 6656770048. Throughput: 0: 43107.2. Samples: 6656946160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:48,395][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 09:26:48,956][15401] Updated weights for policy 0, policy_version 406300 (0.0037) [2024-06-23 09:26:53,033][15401] Updated weights for policy 0, policy_version 406310 (0.0038) [2024-06-23 09:26:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6656983040. Throughput: 0: 42817.4. Samples: 6657067360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:26:56,707][15401] Updated weights for policy 0, policy_version 406320 (0.0036) [2024-06-23 09:26:58,389][15132] Fps is (10 sec: 45897.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 6657228800. Throughput: 0: 42964.9. Samples: 6657328260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-23 09:26:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 09:27:00,629][15401] Updated weights for policy 0, policy_version 406330 (0.0038) [2024-06-23 09:27:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6657409024. Throughput: 0: 43010.7. Samples: 6657590420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:03,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 09:27:04,260][15401] Updated weights for policy 0, policy_version 406340 (0.0036) [2024-06-23 09:27:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6657622016. Throughput: 0: 42835.2. Samples: 6657708560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 09:27:08,437][15401] Updated weights for policy 0, policy_version 406350 (0.0028) [2024-06-23 09:27:10,813][15349] Signal inference workers to stop experience collection... (98650 times) [2024-06-23 09:27:10,813][15349] Signal inference workers to resume experience collection... (98650 times) [2024-06-23 09:27:10,841][15401] InferenceWorker_p0-w0: stopping experience collection (98650 times) [2024-06-23 09:27:10,841][15401] InferenceWorker_p0-w0: resuming experience collection (98650 times) [2024-06-23 09:27:11,747][15401] Updated weights for policy 0, policy_version 406360 (0.0028) [2024-06-23 09:27:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6657884160. Throughput: 0: 42910.6. Samples: 6657968760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:13,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 09:27:15,941][15401] Updated weights for policy 0, policy_version 406370 (0.0037) [2024-06-23 09:27:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6658048000. Throughput: 0: 43062.1. Samples: 6658236040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 09:27:19,401][15401] Updated weights for policy 0, policy_version 406380 (0.0026) [2024-06-23 09:27:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6658260992. Throughput: 0: 42841.4. Samples: 6658354460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 09:27:23,788][15401] Updated weights for policy 0, policy_version 406390 (0.0039) [2024-06-23 09:27:27,137][15401] Updated weights for policy 0, policy_version 406400 (0.0039) [2024-06-23 09:27:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6658506752. Throughput: 0: 42804.6. Samples: 6658612740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:27:31,616][15401] Updated weights for policy 0, policy_version 406410 (0.0039) [2024-06-23 09:27:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6658703360. Throughput: 0: 42896.7. Samples: 6658876300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 09:27:35,045][15401] Updated weights for policy 0, policy_version 406420 (0.0033) [2024-06-23 09:27:38,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6658899968. Throughput: 0: 42764.0. Samples: 6658991740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 09:27:39,107][15401] Updated weights for policy 0, policy_version 406430 (0.0032) [2024-06-23 09:27:43,140][15401] Updated weights for policy 0, policy_version 406440 (0.0032) [2024-06-23 09:27:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 6659129344. Throughput: 0: 42817.4. Samples: 6659255040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 09:27:43,569][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406443_6659162112.pth... [2024-06-23 09:27:43,654][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000405814_6648856576.pth [2024-06-23 09:27:46,829][15401] Updated weights for policy 0, policy_version 406450 (0.0033) [2024-06-23 09:27:48,390][15132] Fps is (10 sec: 40958.0, 60 sec: 42328.4, 300 sec: 42709.4). Total num frames: 6659309568. Throughput: 0: 42664.0. Samples: 6659510320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 09:27:50,689][15401] Updated weights for policy 0, policy_version 406460 (0.0037) [2024-06-23 09:27:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 6659538944. Throughput: 0: 42750.3. Samples: 6659632320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:53,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 09:27:54,271][15401] Updated weights for policy 0, policy_version 406470 (0.0025) [2024-06-23 09:27:58,263][15401] Updated weights for policy 0, policy_version 406480 (0.0036) [2024-06-23 09:27:58,390][15132] Fps is (10 sec: 45877.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 6659768320. Throughput: 0: 42796.4. Samples: 6659894600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:27:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 09:28:01,842][15401] Updated weights for policy 0, policy_version 406490 (0.0037) [2024-06-23 09:28:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6659964928. Throughput: 0: 42612.5. Samples: 6660153600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:28:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 09:28:05,818][15401] Updated weights for policy 0, policy_version 406500 (0.0040) [2024-06-23 09:28:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6660194304. Throughput: 0: 42782.1. Samples: 6660279660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:28:08,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-23 09:28:09,431][15401] Updated weights for policy 0, policy_version 406510 (0.0035) [2024-06-23 09:28:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 41777.6, 300 sec: 42820.2). Total num frames: 6660390912. Throughput: 0: 42771.0. Samples: 6660537540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:28:13,393][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 09:28:13,808][15401] Updated weights for policy 0, policy_version 406520 (0.0036) [2024-06-23 09:28:17,079][15401] Updated weights for policy 0, policy_version 406530 (0.0028) [2024-06-23 09:28:18,393][15132] Fps is (10 sec: 42582.8, 60 sec: 42868.9, 300 sec: 42764.8). Total num frames: 6660620288. Throughput: 0: 42507.5. Samples: 6660789300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-23 09:28:18,394][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 09:28:21,273][15401] Updated weights for policy 0, policy_version 406540 (0.0040) [2024-06-23 09:28:23,390][15132] Fps is (10 sec: 45885.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6660849664. Throughput: 0: 42926.1. Samples: 6660923420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 09:28:24,767][15401] Updated weights for policy 0, policy_version 406550 (0.0040) [2024-06-23 09:28:28,390][15132] Fps is (10 sec: 40975.0, 60 sec: 42052.1, 300 sec: 42820.6). Total num frames: 6661029888. Throughput: 0: 42765.6. Samples: 6661179500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:28,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 09:28:28,494][15349] Signal inference workers to stop experience collection... (98700 times) [2024-06-23 09:28:28,494][15349] Signal inference workers to resume experience collection... (98700 times) [2024-06-23 09:28:28,530][15401] InferenceWorker_p0-w0: stopping experience collection (98700 times) [2024-06-23 09:28:28,530][15401] InferenceWorker_p0-w0: resuming experience collection (98700 times) [2024-06-23 09:28:28,834][15401] Updated weights for policy 0, policy_version 406560 (0.0039) [2024-06-23 09:28:32,469][15401] Updated weights for policy 0, policy_version 406570 (0.0024) [2024-06-23 09:28:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6661259264. Throughput: 0: 42810.1. Samples: 6661436760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 09:28:36,384][15401] Updated weights for policy 0, policy_version 406580 (0.0043) [2024-06-23 09:28:38,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6661505024. Throughput: 0: 42979.5. Samples: 6661566400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 09:28:39,968][15401] Updated weights for policy 0, policy_version 406590 (0.0046) [2024-06-23 09:28:43,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6661685248. Throughput: 0: 42859.3. Samples: 6661823260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 09:28:44,026][15401] Updated weights for policy 0, policy_version 406600 (0.0032) [2024-06-23 09:28:47,902][15401] Updated weights for policy 0, policy_version 406610 (0.0035) [2024-06-23 09:28:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.9, 300 sec: 42765.0). Total num frames: 6661898240. Throughput: 0: 42792.0. Samples: 6662079240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 09:28:51,672][15401] Updated weights for policy 0, policy_version 406620 (0.0036) [2024-06-23 09:28:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6662144000. Throughput: 0: 42910.3. Samples: 6662210620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:53,399][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 09:28:55,290][15401] Updated weights for policy 0, policy_version 406630 (0.0043) [2024-06-23 09:28:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6662340608. Throughput: 0: 42796.4. Samples: 6662463280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:28:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 09:28:59,316][15401] Updated weights for policy 0, policy_version 406640 (0.0046) [2024-06-23 09:29:02,895][15401] Updated weights for policy 0, policy_version 406650 (0.0044) [2024-06-23 09:29:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6662553600. Throughput: 0: 42815.0. Samples: 6662715920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:03,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 09:29:07,215][15401] Updated weights for policy 0, policy_version 406660 (0.0040) [2024-06-23 09:29:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6662782976. Throughput: 0: 42769.3. Samples: 6662848040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 09:29:10,584][15401] Updated weights for policy 0, policy_version 406670 (0.0042) [2024-06-23 09:29:13,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6662963200. Throughput: 0: 42716.2. Samples: 6663101720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 09:29:14,879][15401] Updated weights for policy 0, policy_version 406680 (0.0023) [2024-06-23 09:29:18,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42874.2, 300 sec: 42820.6). Total num frames: 6663192576. Throughput: 0: 42624.7. Samples: 6663354860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:18,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 09:29:18,461][15401] Updated weights for policy 0, policy_version 406690 (0.0037) [2024-06-23 09:29:22,483][15401] Updated weights for policy 0, policy_version 406700 (0.0037) [2024-06-23 09:29:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6663405568. Throughput: 0: 42753.8. Samples: 6663490320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 09:29:25,930][15401] Updated weights for policy 0, policy_version 406710 (0.0028) [2024-06-23 09:29:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6663602176. Throughput: 0: 42702.0. Samples: 6663744860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 09:29:30,169][15401] Updated weights for policy 0, policy_version 406720 (0.0036) [2024-06-23 09:29:33,317][15401] Updated weights for policy 0, policy_version 406730 (0.0041) [2024-06-23 09:29:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 6663864320. Throughput: 0: 42604.4. Samples: 6663996440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 09:29:37,893][15401] Updated weights for policy 0, policy_version 406740 (0.0032) [2024-06-23 09:29:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 6664028160. Throughput: 0: 42712.0. Samples: 6664132660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 24.0) [2024-06-23 09:29:38,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 09:29:40,983][15401] Updated weights for policy 0, policy_version 406750 (0.0031) [2024-06-23 09:29:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6664241152. Throughput: 0: 42690.7. Samples: 6664384360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:29:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 09:29:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406753_6664241152.pth... [2024-06-23 09:29:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406127_6653984768.pth [2024-06-23 09:29:45,504][15401] Updated weights for policy 0, policy_version 406760 (0.0037) [2024-06-23 09:29:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6664486912. Throughput: 0: 42748.1. Samples: 6664639480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:29:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 09:29:48,399][15349] Signal inference workers to stop experience collection... (98750 times) [2024-06-23 09:29:48,404][15349] Signal inference workers to resume experience collection... (98750 times) [2024-06-23 09:29:48,423][15401] InferenceWorker_p0-w0: stopping experience collection (98750 times) [2024-06-23 09:29:48,460][15401] InferenceWorker_p0-w0: resuming experience collection (98750 times) [2024-06-23 09:29:48,535][15401] Updated weights for policy 0, policy_version 406770 (0.0039) [2024-06-23 09:29:53,109][15401] Updated weights for policy 0, policy_version 406780 (0.0041) [2024-06-23 09:29:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 6664683520. Throughput: 0: 42747.2. Samples: 6664771760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:29:53,393][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 09:29:56,291][15401] Updated weights for policy 0, policy_version 406790 (0.0043) [2024-06-23 09:29:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6664896512. Throughput: 0: 42586.6. Samples: 6665018120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:29:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 09:30:00,657][15401] Updated weights for policy 0, policy_version 406800 (0.0036) [2024-06-23 09:30:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 6665109504. Throughput: 0: 42850.1. Samples: 6665283120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 09:30:03,857][15401] Updated weights for policy 0, policy_version 406810 (0.0042) [2024-06-23 09:30:08,271][15401] Updated weights for policy 0, policy_version 406820 (0.0034) [2024-06-23 09:30:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6665338880. Throughput: 0: 42672.4. Samples: 6665410580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 09:30:11,621][15401] Updated weights for policy 0, policy_version 406830 (0.0038) [2024-06-23 09:30:13,391][15132] Fps is (10 sec: 44230.3, 60 sec: 43143.4, 300 sec: 42765.2). Total num frames: 6665551872. Throughput: 0: 42635.2. Samples: 6665663500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:13,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 09:30:15,920][15401] Updated weights for policy 0, policy_version 406840 (0.0032) [2024-06-23 09:30:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6665764864. Throughput: 0: 42912.9. Samples: 6665927520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 09:30:19,462][15401] Updated weights for policy 0, policy_version 406850 (0.0031) [2024-06-23 09:30:23,343][15401] Updated weights for policy 0, policy_version 406860 (0.0042) [2024-06-23 09:30:23,390][15132] Fps is (10 sec: 44243.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6665994240. Throughput: 0: 42719.6. Samples: 6666055040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 09:30:27,177][15401] Updated weights for policy 0, policy_version 406870 (0.0020) [2024-06-23 09:30:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 6666174464. Throughput: 0: 42841.3. Samples: 6666312220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 09:30:30,990][15401] Updated weights for policy 0, policy_version 406880 (0.0027) [2024-06-23 09:30:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42876.4). Total num frames: 6666387456. Throughput: 0: 42920.4. Samples: 6666570900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 09:30:34,815][15401] Updated weights for policy 0, policy_version 406890 (0.0041) [2024-06-23 09:30:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6666633216. Throughput: 0: 42743.7. Samples: 6666695120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:38,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-23 09:30:38,662][15401] Updated weights for policy 0, policy_version 406900 (0.0025) [2024-06-23 09:30:42,496][15401] Updated weights for policy 0, policy_version 406910 (0.0048) [2024-06-23 09:30:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6666846208. Throughput: 0: 43080.4. Samples: 6666956740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:43,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-23 09:30:46,098][15401] Updated weights for policy 0, policy_version 406920 (0.0039) [2024-06-23 09:30:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6667042816. Throughput: 0: 42921.4. Samples: 6667214580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 09:30:50,160][15401] Updated weights for policy 0, policy_version 406930 (0.0034) [2024-06-23 09:30:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 6667255808. Throughput: 0: 42791.6. Samples: 6667336200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 09:30:54,295][15401] Updated weights for policy 0, policy_version 406940 (0.0026) [2024-06-23 09:30:57,846][15401] Updated weights for policy 0, policy_version 406950 (0.0042) [2024-06-23 09:30:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6667485184. Throughput: 0: 42898.7. Samples: 6667593880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 09:30:58,392][15132] Avg episode reward: [(0, '0.022')] [2024-06-23 09:31:02,101][15401] Updated weights for policy 0, policy_version 406960 (0.0036) [2024-06-23 09:31:02,286][15349] Signal inference workers to stop experience collection... (98800 times) [2024-06-23 09:31:02,287][15349] Signal inference workers to resume experience collection... (98800 times) [2024-06-23 09:31:02,305][15401] InferenceWorker_p0-w0: stopping experience collection (98800 times) [2024-06-23 09:31:02,306][15401] InferenceWorker_p0-w0: resuming experience collection (98800 times) [2024-06-23 09:31:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6667698176. Throughput: 0: 42734.3. Samples: 6667850560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:03,396][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 09:31:05,496][15401] Updated weights for policy 0, policy_version 406970 (0.0032) [2024-06-23 09:31:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6667878400. Throughput: 0: 42752.1. Samples: 6667978880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:08,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-23 09:31:09,749][15401] Updated weights for policy 0, policy_version 406980 (0.0049) [2024-06-23 09:31:13,148][15401] Updated weights for policy 0, policy_version 406990 (0.0025) [2024-06-23 09:31:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42872.6, 300 sec: 42820.6). Total num frames: 6668124160. Throughput: 0: 42741.8. Samples: 6668235600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 09:31:17,334][15401] Updated weights for policy 0, policy_version 407000 (0.0025) [2024-06-23 09:31:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 6668337152. Throughput: 0: 42633.9. Samples: 6668489420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 09:31:20,653][15401] Updated weights for policy 0, policy_version 407010 (0.0029) [2024-06-23 09:31:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6668517376. Throughput: 0: 42675.0. Samples: 6668615500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 09:31:25,104][15401] Updated weights for policy 0, policy_version 407020 (0.0029) [2024-06-23 09:31:28,270][15401] Updated weights for policy 0, policy_version 407030 (0.0041) [2024-06-23 09:31:28,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6668779520. Throughput: 0: 42764.7. Samples: 6668881160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 09:31:32,691][15401] Updated weights for policy 0, policy_version 407040 (0.0037) [2024-06-23 09:31:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6668976128. Throughput: 0: 42627.0. Samples: 6669132800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 09:31:36,023][15401] Updated weights for policy 0, policy_version 407050 (0.0023) [2024-06-23 09:31:38,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6669156352. Throughput: 0: 42688.0. Samples: 6669257160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 09:31:40,208][15401] Updated weights for policy 0, policy_version 407060 (0.0032) [2024-06-23 09:31:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.7). Total num frames: 6669385728. Throughput: 0: 42643.7. Samples: 6669512840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 09:31:43,559][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407069_6669418496.pth... [2024-06-23 09:31:43,622][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406443_6659162112.pth [2024-06-23 09:31:43,772][15401] Updated weights for policy 0, policy_version 407070 (0.0031) [2024-06-23 09:31:47,867][15401] Updated weights for policy 0, policy_version 407080 (0.0029) [2024-06-23 09:31:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6669598720. Throughput: 0: 42574.3. Samples: 6669766400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 09:31:51,755][15401] Updated weights for policy 0, policy_version 407090 (0.0038) [2024-06-23 09:31:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6669811712. Throughput: 0: 42544.8. Samples: 6669893400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 09:31:55,793][15401] Updated weights for policy 0, policy_version 407100 (0.0028) [2024-06-23 09:31:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6670024704. Throughput: 0: 42438.2. Samples: 6670145320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:31:58,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 09:31:59,562][15401] Updated weights for policy 0, policy_version 407110 (0.0044) [2024-06-23 09:32:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6670237696. Throughput: 0: 42508.7. Samples: 6670402320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:32:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 09:32:03,757][15401] Updated weights for policy 0, policy_version 407120 (0.0031) [2024-06-23 09:32:07,117][15401] Updated weights for policy 0, policy_version 407130 (0.0022) [2024-06-23 09:32:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6670467072. Throughput: 0: 42567.1. Samples: 6670531020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:32:08,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-23 09:32:11,360][15401] Updated weights for policy 0, policy_version 407140 (0.0031) [2024-06-23 09:32:11,368][15349] Signal inference workers to stop experience collection... (98850 times) [2024-06-23 09:32:11,369][15349] Signal inference workers to resume experience collection... (98850 times) [2024-06-23 09:32:11,380][15401] InferenceWorker_p0-w0: stopping experience collection (98850 times) [2024-06-23 09:32:11,380][15401] InferenceWorker_p0-w0: resuming experience collection (98850 times) [2024-06-23 09:32:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6670680064. Throughput: 0: 42314.2. Samples: 6670785300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:32:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 09:32:14,961][15401] Updated weights for policy 0, policy_version 407150 (0.0044) [2024-06-23 09:32:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6670876672. Throughput: 0: 42535.5. Samples: 6671046900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:32:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 09:32:18,974][15401] Updated weights for policy 0, policy_version 407160 (0.0037) [2024-06-23 09:32:22,619][15401] Updated weights for policy 0, policy_version 407170 (0.0036) [2024-06-23 09:32:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6671073280. Throughput: 0: 42385.2. Samples: 6671164500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 09:32:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 09:32:26,551][15401] Updated weights for policy 0, policy_version 407180 (0.0051) [2024-06-23 09:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6671302656. Throughput: 0: 42345.3. Samples: 6671418380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:28,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 09:32:30,258][15401] Updated weights for policy 0, policy_version 407190 (0.0042) [2024-06-23 09:32:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 6671499264. Throughput: 0: 42540.7. Samples: 6671680740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 09:32:34,063][15401] Updated weights for policy 0, policy_version 407200 (0.0048) [2024-06-23 09:32:37,876][15401] Updated weights for policy 0, policy_version 407210 (0.0028) [2024-06-23 09:32:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6671728640. Throughput: 0: 42478.8. Samples: 6671804940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 09:32:42,087][15401] Updated weights for policy 0, policy_version 407220 (0.0046) [2024-06-23 09:32:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6671941632. Throughput: 0: 42524.8. Samples: 6672058940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 09:32:46,229][15401] Updated weights for policy 0, policy_version 407230 (0.0043) [2024-06-23 09:32:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6672138240. Throughput: 0: 42432.6. Samples: 6672311780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 09:32:49,892][15401] Updated weights for policy 0, policy_version 407240 (0.0024) [2024-06-23 09:32:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 6672351232. Throughput: 0: 42441.3. Samples: 6672440980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:53,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:32:53,777][15401] Updated weights for policy 0, policy_version 407250 (0.0039) [2024-06-23 09:32:57,655][15401] Updated weights for policy 0, policy_version 407260 (0.0035) [2024-06-23 09:32:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6672596992. Throughput: 0: 42549.4. Samples: 6672700020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:32:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:33:01,476][15401] Updated weights for policy 0, policy_version 407270 (0.0034) [2024-06-23 09:33:03,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6672809984. Throughput: 0: 42367.1. Samples: 6672953420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 09:33:05,306][15401] Updated weights for policy 0, policy_version 407280 (0.0033) [2024-06-23 09:33:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42765.0). Total num frames: 6673006592. Throughput: 0: 42572.4. Samples: 6673080360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:08,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 09:33:09,141][15401] Updated weights for policy 0, policy_version 407290 (0.0041) [2024-06-23 09:33:13,112][15401] Updated weights for policy 0, policy_version 407300 (0.0041) [2024-06-23 09:33:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42654.5). Total num frames: 6673203200. Throughput: 0: 42772.0. Samples: 6673343120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:13,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 09:33:16,854][15401] Updated weights for policy 0, policy_version 407310 (0.0051) [2024-06-23 09:33:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6673432576. Throughput: 0: 42419.6. Samples: 6673589620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:18,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 09:33:20,944][15401] Updated weights for policy 0, policy_version 407320 (0.0031) [2024-06-23 09:33:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6673645568. Throughput: 0: 42599.4. Samples: 6673721920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 09:33:24,409][15401] Updated weights for policy 0, policy_version 407330 (0.0036) [2024-06-23 09:33:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6673825792. Throughput: 0: 42569.5. Samples: 6673974560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 09:33:28,588][15401] Updated weights for policy 0, policy_version 407340 (0.0026) [2024-06-23 09:33:31,443][15349] Signal inference workers to stop experience collection... (98900 times) [2024-06-23 09:33:31,443][15349] Signal inference workers to resume experience collection... (98900 times) [2024-06-23 09:33:31,460][15401] InferenceWorker_p0-w0: stopping experience collection (98900 times) [2024-06-23 09:33:31,482][15401] InferenceWorker_p0-w0: resuming experience collection (98900 times) [2024-06-23 09:33:31,912][15401] Updated weights for policy 0, policy_version 407350 (0.0033) [2024-06-23 09:33:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6674087936. Throughput: 0: 42597.7. Samples: 6674228680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 09:33:36,117][15401] Updated weights for policy 0, policy_version 407360 (0.0043) [2024-06-23 09:33:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6674268160. Throughput: 0: 42675.1. Samples: 6674361260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:33:39,599][15401] Updated weights for policy 0, policy_version 407370 (0.0029) [2024-06-23 09:33:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6674481152. Throughput: 0: 42551.7. Samples: 6674614840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 09:33:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 09:33:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407379_6674497536.pth... [2024-06-23 09:33:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000406753_6664241152.pth [2024-06-23 09:33:43,714][15401] Updated weights for policy 0, policy_version 407380 (0.0043) [2024-06-23 09:33:47,477][15401] Updated weights for policy 0, policy_version 407390 (0.0033) [2024-06-23 09:33:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6674726912. Throughput: 0: 42558.7. Samples: 6674868560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:33:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 09:33:51,127][15401] Updated weights for policy 0, policy_version 407400 (0.0027) [2024-06-23 09:33:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 6674907136. Throughput: 0: 42677.3. Samples: 6675000740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:33:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 09:33:55,039][15401] Updated weights for policy 0, policy_version 407410 (0.0031) [2024-06-23 09:33:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 6675136512. Throughput: 0: 42585.7. Samples: 6675259480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:33:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:33:58,641][15401] Updated weights for policy 0, policy_version 407420 (0.0030) [2024-06-23 09:34:02,646][15401] Updated weights for policy 0, policy_version 407430 (0.0037) [2024-06-23 09:34:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6675365888. Throughput: 0: 42870.5. Samples: 6675518800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:34:06,215][15401] Updated weights for policy 0, policy_version 407440 (0.0043) [2024-06-23 09:34:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 6675529728. Throughput: 0: 42787.6. Samples: 6675647360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 09:34:10,189][15401] Updated weights for policy 0, policy_version 407450 (0.0034) [2024-06-23 09:34:13,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6675791872. Throughput: 0: 42763.9. Samples: 6675898940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 09:34:14,143][15401] Updated weights for policy 0, policy_version 407460 (0.0031) [2024-06-23 09:34:17,875][15401] Updated weights for policy 0, policy_version 407470 (0.0051) [2024-06-23 09:34:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6676004864. Throughput: 0: 42762.3. Samples: 6676152980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 09:34:21,714][15401] Updated weights for policy 0, policy_version 407480 (0.0038) [2024-06-23 09:34:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 6676168704. Throughput: 0: 42683.3. Samples: 6676282000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 09:34:25,529][15401] Updated weights for policy 0, policy_version 407490 (0.0030) [2024-06-23 09:34:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43415.8, 300 sec: 42598.1). Total num frames: 6676430848. Throughput: 0: 42838.1. Samples: 6676542660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:28,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 09:34:29,284][15401] Updated weights for policy 0, policy_version 407500 (0.0028) [2024-06-23 09:34:33,080][15401] Updated weights for policy 0, policy_version 407510 (0.0037) [2024-06-23 09:34:33,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6676643840. Throughput: 0: 42793.7. Samples: 6676794280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:33,391][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 09:34:36,899][15401] Updated weights for policy 0, policy_version 407520 (0.0035) [2024-06-23 09:34:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6676824064. Throughput: 0: 42770.8. Samples: 6676925420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 09:34:39,285][15349] Signal inference workers to stop experience collection... (98950 times) [2024-06-23 09:34:39,336][15401] InferenceWorker_p0-w0: stopping experience collection (98950 times) [2024-06-23 09:34:39,343][15349] Signal inference workers to resume experience collection... (98950 times) [2024-06-23 09:34:39,348][15401] InferenceWorker_p0-w0: resuming experience collection (98950 times) [2024-06-23 09:34:40,882][15401] Updated weights for policy 0, policy_version 407530 (0.0040) [2024-06-23 09:34:43,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.6, 300 sec: 42598.0). Total num frames: 6677053440. Throughput: 0: 42601.1. Samples: 6677176640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:43,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 09:34:44,635][15401] Updated weights for policy 0, policy_version 407540 (0.0059) [2024-06-23 09:34:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6677282816. Throughput: 0: 42490.0. Samples: 6677430840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:48,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 09:34:48,458][15401] Updated weights for policy 0, policy_version 407550 (0.0034) [2024-06-23 09:34:52,987][15401] Updated weights for policy 0, policy_version 407560 (0.0038) [2024-06-23 09:34:53,389][15132] Fps is (10 sec: 40970.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6677463040. Throughput: 0: 42514.3. Samples: 6677560500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 09:34:56,228][15401] Updated weights for policy 0, policy_version 407570 (0.0031) [2024-06-23 09:34:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6677708800. Throughput: 0: 42467.9. Samples: 6677810000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:34:58,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 09:35:00,593][15401] Updated weights for policy 0, policy_version 407580 (0.0039) [2024-06-23 09:35:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 6677889024. Throughput: 0: 42674.6. Samples: 6678073340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 09:35:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 09:35:04,272][15401] Updated weights for policy 0, policy_version 407590 (0.0040) [2024-06-23 09:35:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42543.1). Total num frames: 6678102016. Throughput: 0: 42429.3. Samples: 6678191320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 09:35:08,473][15401] Updated weights for policy 0, policy_version 407600 (0.0039) [2024-06-23 09:35:11,921][15401] Updated weights for policy 0, policy_version 407610 (0.0042) [2024-06-23 09:35:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6678347776. Throughput: 0: 42387.1. Samples: 6678449980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 09:35:16,281][15401] Updated weights for policy 0, policy_version 407620 (0.0037) [2024-06-23 09:35:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 6678528000. Throughput: 0: 42676.9. Samples: 6678714740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 09:35:19,555][15401] Updated weights for policy 0, policy_version 407630 (0.0034) [2024-06-23 09:35:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 6678757376. Throughput: 0: 42359.0. Samples: 6678831680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:23,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:35:23,861][15401] Updated weights for policy 0, policy_version 407640 (0.0024) [2024-06-23 09:35:27,158][15401] Updated weights for policy 0, policy_version 407650 (0.0038) [2024-06-23 09:35:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6678986752. Throughput: 0: 42610.0. Samples: 6679093980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 09:35:31,464][15401] Updated weights for policy 0, policy_version 407660 (0.0041) [2024-06-23 09:35:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6679183360. Throughput: 0: 42761.8. Samples: 6679355120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:33,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-23 09:35:34,848][15401] Updated weights for policy 0, policy_version 407670 (0.0035) [2024-06-23 09:35:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 6679379968. Throughput: 0: 42504.4. Samples: 6679473200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:35:39,326][15401] Updated weights for policy 0, policy_version 407680 (0.0029) [2024-06-23 09:35:42,434][15401] Updated weights for policy 0, policy_version 407690 (0.0030) [2024-06-23 09:35:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42871.6, 300 sec: 42653.6). Total num frames: 6679625728. Throughput: 0: 42702.3. Samples: 6679731700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:43,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 09:35:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407692_6679625728.pth... [2024-06-23 09:35:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407069_6669418496.pth [2024-06-23 09:35:47,046][15349] Signal inference workers to stop experience collection... (99000 times) [2024-06-23 09:35:47,056][15349] Signal inference workers to resume experience collection... (99000 times) [2024-06-23 09:35:47,056][15401] Updated weights for policy 0, policy_version 407700 (0.0036) [2024-06-23 09:35:47,066][15401] InferenceWorker_p0-w0: stopping experience collection (99000 times) [2024-06-23 09:35:47,092][15401] InferenceWorker_p0-w0: resuming experience collection (99000 times) [2024-06-23 09:35:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6679822336. Throughput: 0: 42644.1. Samples: 6679992320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 09:35:50,001][15401] Updated weights for policy 0, policy_version 407710 (0.0027) [2024-06-23 09:35:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6680018944. Throughput: 0: 42853.8. Samples: 6680119740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 09:35:54,574][15401] Updated weights for policy 0, policy_version 407720 (0.0032) [2024-06-23 09:35:57,984][15401] Updated weights for policy 0, policy_version 407730 (0.0040) [2024-06-23 09:35:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6680264704. Throughput: 0: 42781.8. Samples: 6680375160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:35:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 09:36:02,246][15401] Updated weights for policy 0, policy_version 407740 (0.0029) [2024-06-23 09:36:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6680477696. Throughput: 0: 42589.9. Samples: 6680631280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:36:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 09:36:06,007][15401] Updated weights for policy 0, policy_version 407750 (0.0042) [2024-06-23 09:36:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6680674304. Throughput: 0: 42933.4. Samples: 6680763580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:36:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 09:36:10,019][15401] Updated weights for policy 0, policy_version 407760 (0.0035) [2024-06-23 09:36:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6680887296. Throughput: 0: 42685.8. Samples: 6681014840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:36:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 09:36:13,530][15401] Updated weights for policy 0, policy_version 407770 (0.0034) [2024-06-23 09:36:17,385][15401] Updated weights for policy 0, policy_version 407780 (0.0037) [2024-06-23 09:36:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6681100288. Throughput: 0: 42561.7. Samples: 6681270400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:36:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 09:36:21,118][15401] Updated weights for policy 0, policy_version 407790 (0.0034) [2024-06-23 09:36:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 6681329664. Throughput: 0: 42695.9. Samples: 6681394520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 09:36:23,396][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 09:36:25,602][15401] Updated weights for policy 0, policy_version 407800 (0.0024) [2024-06-23 09:36:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6681526272. Throughput: 0: 42602.2. Samples: 6681648700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 09:36:29,006][15401] Updated weights for policy 0, policy_version 407810 (0.0032) [2024-06-23 09:36:33,011][15401] Updated weights for policy 0, policy_version 407820 (0.0044) [2024-06-23 09:36:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6681739264. Throughput: 0: 42612.8. Samples: 6681909900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 09:36:36,503][15401] Updated weights for policy 0, policy_version 407830 (0.0048) [2024-06-23 09:36:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6681985024. Throughput: 0: 42624.3. Samples: 6682037840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 09:36:40,504][15401] Updated weights for policy 0, policy_version 407840 (0.0034) [2024-06-23 09:36:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 6682148864. Throughput: 0: 42668.5. Samples: 6682295240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 09:36:44,463][15401] Updated weights for policy 0, policy_version 407850 (0.0032) [2024-06-23 09:36:48,139][15401] Updated weights for policy 0, policy_version 407860 (0.0039) [2024-06-23 09:36:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6682378240. Throughput: 0: 42623.9. Samples: 6682549360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 09:36:51,948][15401] Updated weights for policy 0, policy_version 407870 (0.0024) [2024-06-23 09:36:53,391][15132] Fps is (10 sec: 47505.4, 60 sec: 43416.4, 300 sec: 42709.2). Total num frames: 6682624000. Throughput: 0: 42554.5. Samples: 6682678600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:53,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 09:36:55,923][15401] Updated weights for policy 0, policy_version 407880 (0.0023) [2024-06-23 09:36:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6682804224. Throughput: 0: 42786.6. Samples: 6682940240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:36:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 09:36:59,394][15401] Updated weights for policy 0, policy_version 407890 (0.0037) [2024-06-23 09:37:03,389][15132] Fps is (10 sec: 39328.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6683017216. Throughput: 0: 42811.2. Samples: 6683196900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 09:37:03,409][15401] Updated weights for policy 0, policy_version 407900 (0.0026) [2024-06-23 09:37:04,518][15349] Signal inference workers to stop experience collection... (99050 times) [2024-06-23 09:37:04,549][15401] InferenceWorker_p0-w0: stopping experience collection (99050 times) [2024-06-23 09:37:04,580][15349] Signal inference workers to resume experience collection... (99050 times) [2024-06-23 09:37:04,581][15401] InferenceWorker_p0-w0: resuming experience collection (99050 times) [2024-06-23 09:37:06,930][15401] Updated weights for policy 0, policy_version 407910 (0.0024) [2024-06-23 09:37:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6683262976. Throughput: 0: 42930.3. Samples: 6683326380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 09:37:10,919][15401] Updated weights for policy 0, policy_version 407920 (0.0040) [2024-06-23 09:37:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6683443200. Throughput: 0: 42983.1. Samples: 6683582940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 09:37:14,759][15401] Updated weights for policy 0, policy_version 407930 (0.0027) [2024-06-23 09:37:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6683672576. Throughput: 0: 42668.2. Samples: 6683829960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 09:37:18,450][15401] Updated weights for policy 0, policy_version 407940 (0.0036) [2024-06-23 09:37:22,359][15401] Updated weights for policy 0, policy_version 407950 (0.0032) [2024-06-23 09:37:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6683885568. Throughput: 0: 42725.3. Samples: 6683960480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 09:37:26,286][15401] Updated weights for policy 0, policy_version 407960 (0.0039) [2024-06-23 09:37:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6684098560. Throughput: 0: 42723.0. Samples: 6684217780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 09:37:30,188][15401] Updated weights for policy 0, policy_version 407970 (0.0029) [2024-06-23 09:37:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6684311552. Throughput: 0: 42756.0. Samples: 6684473380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:33,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 09:37:34,269][15401] Updated weights for policy 0, policy_version 407980 (0.0036) [2024-06-23 09:37:38,045][15401] Updated weights for policy 0, policy_version 407990 (0.0042) [2024-06-23 09:37:38,391][15132] Fps is (10 sec: 42591.0, 60 sec: 42324.1, 300 sec: 42653.7). Total num frames: 6684524544. Throughput: 0: 42618.5. Samples: 6684596440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:38,392][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 09:37:41,840][15401] Updated weights for policy 0, policy_version 408000 (0.0034) [2024-06-23 09:37:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 6684737536. Throughput: 0: 42644.3. Samples: 6684859240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 09:37:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 09:37:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408005_6684753920.pth... [2024-06-23 09:37:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407379_6674497536.pth [2024-06-23 09:37:45,582][15401] Updated weights for policy 0, policy_version 408010 (0.0037) [2024-06-23 09:37:48,389][15132] Fps is (10 sec: 44245.0, 60 sec: 43144.7, 300 sec: 42765.4). Total num frames: 6684966912. Throughput: 0: 42684.9. Samples: 6685117720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:37:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 09:37:49,531][15401] Updated weights for policy 0, policy_version 408020 (0.0039) [2024-06-23 09:37:52,980][15401] Updated weights for policy 0, policy_version 408030 (0.0046) [2024-06-23 09:37:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.5, 300 sec: 42653.9). Total num frames: 6685179904. Throughput: 0: 42585.2. Samples: 6685242720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:37:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 09:37:56,945][15401] Updated weights for policy 0, policy_version 408040 (0.0034) [2024-06-23 09:37:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6685376512. Throughput: 0: 42568.0. Samples: 6685498500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:37:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 09:38:00,744][15401] Updated weights for policy 0, policy_version 408050 (0.0043) [2024-06-23 09:38:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 6685589504. Throughput: 0: 42704.0. Samples: 6685751640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 09:38:04,802][15401] Updated weights for policy 0, policy_version 408060 (0.0038) [2024-06-23 09:38:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6685802496. Throughput: 0: 42739.2. Samples: 6685883740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 09:38:08,426][15401] Updated weights for policy 0, policy_version 408070 (0.0034) [2024-06-23 09:38:12,235][15401] Updated weights for policy 0, policy_version 408080 (0.0035) [2024-06-23 09:38:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6686015488. Throughput: 0: 42765.0. Samples: 6686142200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 09:38:16,172][15401] Updated weights for policy 0, policy_version 408090 (0.0034) [2024-06-23 09:38:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6686228480. Throughput: 0: 42731.6. Samples: 6686396300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 09:38:20,104][15401] Updated weights for policy 0, policy_version 408100 (0.0032) [2024-06-23 09:38:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6686441472. Throughput: 0: 42771.5. Samples: 6686521080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 09:38:23,814][15401] Updated weights for policy 0, policy_version 408110 (0.0023) [2024-06-23 09:38:27,755][15401] Updated weights for policy 0, policy_version 408120 (0.0043) [2024-06-23 09:38:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6686654464. Throughput: 0: 42838.9. Samples: 6686786980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 09:38:31,405][15401] Updated weights for policy 0, policy_version 408130 (0.0041) [2024-06-23 09:38:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6686867456. Throughput: 0: 42684.3. Samples: 6687038520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 09:38:34,854][15349] Signal inference workers to stop experience collection... (99100 times) [2024-06-23 09:38:34,861][15349] Signal inference workers to resume experience collection... (99100 times) [2024-06-23 09:38:34,870][15401] InferenceWorker_p0-w0: stopping experience collection (99100 times) [2024-06-23 09:38:34,888][15401] InferenceWorker_p0-w0: resuming experience collection (99100 times) [2024-06-23 09:38:35,316][15401] Updated weights for policy 0, policy_version 408140 (0.0033) [2024-06-23 09:38:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 6687080448. Throughput: 0: 42775.2. Samples: 6687167600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 09:38:38,946][15401] Updated weights for policy 0, policy_version 408150 (0.0041) [2024-06-23 09:38:43,361][15401] Updated weights for policy 0, policy_version 408160 (0.0026) [2024-06-23 09:38:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6687293440. Throughput: 0: 42866.6. Samples: 6687427500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 09:38:46,937][15401] Updated weights for policy 0, policy_version 408170 (0.0030) [2024-06-23 09:38:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6687506432. Throughput: 0: 42794.5. Samples: 6687677400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 09:38:51,022][15401] Updated weights for policy 0, policy_version 408180 (0.0033) [2024-06-23 09:38:53,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 6687735808. Throughput: 0: 42766.6. Samples: 6687808260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 09:38:54,597][15401] Updated weights for policy 0, policy_version 408190 (0.0035) [2024-06-23 09:38:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6687932416. Throughput: 0: 42787.9. Samples: 6688067660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:38:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 09:38:58,535][15401] Updated weights for policy 0, policy_version 408200 (0.0031) [2024-06-23 09:39:02,098][15401] Updated weights for policy 0, policy_version 408210 (0.0030) [2024-06-23 09:39:03,390][15132] Fps is (10 sec: 44239.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6688178176. Throughput: 0: 42849.8. Samples: 6688324540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:39:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 09:39:06,115][15401] Updated weights for policy 0, policy_version 408220 (0.0040) [2024-06-23 09:39:08,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6688374784. Throughput: 0: 43027.4. Samples: 6688457420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 09:39:08,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 09:39:09,820][15401] Updated weights for policy 0, policy_version 408230 (0.0039) [2024-06-23 09:39:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6688587776. Throughput: 0: 42678.1. Samples: 6688707500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 09:39:13,768][15401] Updated weights for policy 0, policy_version 408240 (0.0029) [2024-06-23 09:39:17,422][15401] Updated weights for policy 0, policy_version 408250 (0.0040) [2024-06-23 09:39:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6688817152. Throughput: 0: 42780.0. Samples: 6688963620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 09:39:21,327][15401] Updated weights for policy 0, policy_version 408260 (0.0039) [2024-06-23 09:39:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 6689013760. Throughput: 0: 42776.8. Samples: 6689092560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 09:39:25,093][15401] Updated weights for policy 0, policy_version 408270 (0.0040) [2024-06-23 09:39:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6689226752. Throughput: 0: 42763.2. Samples: 6689351840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 09:39:28,978][15401] Updated weights for policy 0, policy_version 408280 (0.0035) [2024-06-23 09:39:32,646][15401] Updated weights for policy 0, policy_version 408290 (0.0026) [2024-06-23 09:39:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6689456128. Throughput: 0: 42912.4. Samples: 6689608460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 09:39:36,594][15401] Updated weights for policy 0, policy_version 408300 (0.0035) [2024-06-23 09:39:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 6689652736. Throughput: 0: 42792.9. Samples: 6689733920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 09:39:40,476][15401] Updated weights for policy 0, policy_version 408310 (0.0024) [2024-06-23 09:39:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6689865728. Throughput: 0: 42672.1. Samples: 6689987900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 09:39:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408318_6689882112.pth... [2024-06-23 09:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000407692_6679625728.pth [2024-06-23 09:39:44,416][15401] Updated weights for policy 0, policy_version 408320 (0.0033) [2024-06-23 09:39:48,202][15401] Updated weights for policy 0, policy_version 408330 (0.0036) [2024-06-23 09:39:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6690078720. Throughput: 0: 42674.6. Samples: 6690244900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 09:39:52,040][15401] Updated weights for policy 0, policy_version 408340 (0.0042) [2024-06-23 09:39:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.8, 300 sec: 42654.0). Total num frames: 6690291712. Throughput: 0: 42642.7. Samples: 6690376240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:39:55,830][15401] Updated weights for policy 0, policy_version 408350 (0.0034) [2024-06-23 09:39:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6690504704. Throughput: 0: 42797.3. Samples: 6690633380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:39:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 09:39:59,843][15401] Updated weights for policy 0, policy_version 408360 (0.0031) [2024-06-23 09:40:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6690717696. Throughput: 0: 42811.6. Samples: 6690890140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 09:40:03,445][15401] Updated weights for policy 0, policy_version 408370 (0.0043) [2024-06-23 09:40:07,365][15349] Signal inference workers to stop experience collection... (99150 times) [2024-06-23 09:40:07,365][15349] Signal inference workers to resume experience collection... (99150 times) [2024-06-23 09:40:07,409][15401] InferenceWorker_p0-w0: stopping experience collection (99150 times) [2024-06-23 09:40:07,409][15401] InferenceWorker_p0-w0: resuming experience collection (99150 times) [2024-06-23 09:40:07,537][15401] Updated weights for policy 0, policy_version 408380 (0.0037) [2024-06-23 09:40:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 6690930688. Throughput: 0: 42811.6. Samples: 6691019080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:08,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 09:40:11,060][15401] Updated weights for policy 0, policy_version 408390 (0.0028) [2024-06-23 09:40:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6691143680. Throughput: 0: 42712.4. Samples: 6691273900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:40:15,007][15401] Updated weights for policy 0, policy_version 408400 (0.0035) [2024-06-23 09:40:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 6691356672. Throughput: 0: 42744.1. Samples: 6691531940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:18,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 09:40:18,672][15401] Updated weights for policy 0, policy_version 408410 (0.0053) [2024-06-23 09:40:22,559][15401] Updated weights for policy 0, policy_version 408420 (0.0035) [2024-06-23 09:40:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6691569664. Throughput: 0: 42721.5. Samples: 6691656380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 09:40:26,786][15401] Updated weights for policy 0, policy_version 408430 (0.0044) [2024-06-23 09:40:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6691799040. Throughput: 0: 42738.6. Samples: 6691911140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 09:40:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 09:40:30,137][15401] Updated weights for policy 0, policy_version 408440 (0.0033) [2024-06-23 09:40:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 6691995648. Throughput: 0: 42948.6. Samples: 6692177580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 09:40:34,270][15401] Updated weights for policy 0, policy_version 408450 (0.0032) [2024-06-23 09:40:37,678][15401] Updated weights for policy 0, policy_version 408460 (0.0027) [2024-06-23 09:40:38,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 6692225024. Throughput: 0: 42703.3. Samples: 6692297900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 09:40:41,881][15401] Updated weights for policy 0, policy_version 408470 (0.0041) [2024-06-23 09:40:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6692454400. Throughput: 0: 42805.8. Samples: 6692559640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 09:40:45,149][15401] Updated weights for policy 0, policy_version 408480 (0.0043) [2024-06-23 09:40:48,390][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6692651008. Throughput: 0: 42800.9. Samples: 6692816180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 09:40:49,511][15401] Updated weights for policy 0, policy_version 408490 (0.0030) [2024-06-23 09:40:52,847][15401] Updated weights for policy 0, policy_version 408500 (0.0026) [2024-06-23 09:40:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6692864000. Throughput: 0: 42724.0. Samples: 6692941660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 09:40:57,284][15401] Updated weights for policy 0, policy_version 408510 (0.0040) [2024-06-23 09:40:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6693060608. Throughput: 0: 42927.1. Samples: 6693205620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:40:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 09:41:00,561][15401] Updated weights for policy 0, policy_version 408520 (0.0023) [2024-06-23 09:41:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6693289984. Throughput: 0: 42911.1. Samples: 6693462940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 09:41:04,999][15401] Updated weights for policy 0, policy_version 408530 (0.0041) [2024-06-23 09:41:08,368][15401] Updated weights for policy 0, policy_version 408540 (0.0032) [2024-06-23 09:41:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6693519360. Throughput: 0: 42982.1. Samples: 6693590580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 09:41:12,604][15401] Updated weights for policy 0, policy_version 408550 (0.0043) [2024-06-23 09:41:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6693715968. Throughput: 0: 42982.2. Samples: 6693845340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 09:41:15,852][15401] Updated weights for policy 0, policy_version 408560 (0.0043) [2024-06-23 09:41:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6693928960. Throughput: 0: 42766.6. Samples: 6694102080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 09:41:20,015][15401] Updated weights for policy 0, policy_version 408570 (0.0026) [2024-06-23 09:41:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 6694158336. Throughput: 0: 42917.5. Samples: 6694229180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 09:41:23,403][15401] Updated weights for policy 0, policy_version 408580 (0.0027) [2024-06-23 09:41:27,998][15401] Updated weights for policy 0, policy_version 408590 (0.0031) [2024-06-23 09:41:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6694354944. Throughput: 0: 42800.5. Samples: 6694485660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 09:41:31,263][15401] Updated weights for policy 0, policy_version 408600 (0.0032) [2024-06-23 09:41:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6694551552. Throughput: 0: 42884.0. Samples: 6694745960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 09:41:35,253][15349] Signal inference workers to stop experience collection... (99200 times) [2024-06-23 09:41:35,290][15401] InferenceWorker_p0-w0: stopping experience collection (99200 times) [2024-06-23 09:41:35,299][15349] Signal inference workers to resume experience collection... (99200 times) [2024-06-23 09:41:35,309][15401] InferenceWorker_p0-w0: resuming experience collection (99200 times) [2024-06-23 09:41:35,437][15401] Updated weights for policy 0, policy_version 408610 (0.0038) [2024-06-23 09:41:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 6694797312. Throughput: 0: 42805.4. Samples: 6694867900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 09:41:38,782][15401] Updated weights for policy 0, policy_version 408620 (0.0022) [2024-06-23 09:41:43,045][15401] Updated weights for policy 0, policy_version 408630 (0.0038) [2024-06-23 09:41:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6694993920. Throughput: 0: 42769.5. Samples: 6695130260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 09:41:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408631_6695010304.pth... [2024-06-23 09:41:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408005_6684753920.pth [2024-06-23 09:41:46,390][15401] Updated weights for policy 0, policy_version 408640 (0.0041) [2024-06-23 09:41:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.2). Total num frames: 6695206912. Throughput: 0: 42737.9. Samples: 6695386140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 09:41:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 09:41:50,658][15401] Updated weights for policy 0, policy_version 408650 (0.0031) [2024-06-23 09:41:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6695419904. Throughput: 0: 42683.5. Samples: 6695511340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:41:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 09:41:54,181][15401] Updated weights for policy 0, policy_version 408660 (0.0043) [2024-06-23 09:41:58,340][15401] Updated weights for policy 0, policy_version 408670 (0.0045) [2024-06-23 09:41:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6695649280. Throughput: 0: 42794.7. Samples: 6695771100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:41:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 09:42:01,961][15401] Updated weights for policy 0, policy_version 408680 (0.0040) [2024-06-23 09:42:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6695862272. Throughput: 0: 42747.0. Samples: 6696025700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 09:42:06,080][15401] Updated weights for policy 0, policy_version 408690 (0.0027) [2024-06-23 09:42:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6696075264. Throughput: 0: 42750.1. Samples: 6696152940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 09:42:09,426][15401] Updated weights for policy 0, policy_version 408700 (0.0037) [2024-06-23 09:42:13,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6696271872. Throughput: 0: 42711.6. Samples: 6696407680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 09:42:13,692][15401] Updated weights for policy 0, policy_version 408710 (0.0032) [2024-06-23 09:42:17,046][15401] Updated weights for policy 0, policy_version 408720 (0.0035) [2024-06-23 09:42:18,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6696501248. Throughput: 0: 42618.2. Samples: 6696663880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:18,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 09:42:21,455][15401] Updated weights for policy 0, policy_version 408730 (0.0029) [2024-06-23 09:42:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6696714240. Throughput: 0: 42830.1. Samples: 6696795260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:23,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 09:42:24,632][15401] Updated weights for policy 0, policy_version 408740 (0.0049) [2024-06-23 09:42:28,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6696910848. Throughput: 0: 42642.3. Samples: 6697049160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 09:42:28,984][15401] Updated weights for policy 0, policy_version 408750 (0.0020) [2024-06-23 09:42:32,848][15401] Updated weights for policy 0, policy_version 408760 (0.0024) [2024-06-23 09:42:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 6697140224. Throughput: 0: 42599.3. Samples: 6697303120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 09:42:36,595][15401] Updated weights for policy 0, policy_version 408770 (0.0034) [2024-06-23 09:42:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6697369600. Throughput: 0: 42721.9. Samples: 6697433820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 09:42:40,651][15401] Updated weights for policy 0, policy_version 408780 (0.0032) [2024-06-23 09:42:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 6697549824. Throughput: 0: 42602.3. Samples: 6697688200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 09:42:44,319][15401] Updated weights for policy 0, policy_version 408790 (0.0027) [2024-06-23 09:42:48,349][15401] Updated weights for policy 0, policy_version 408800 (0.0030) [2024-06-23 09:42:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6697779200. Throughput: 0: 42590.8. Samples: 6697942280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 09:42:52,051][15401] Updated weights for policy 0, policy_version 408810 (0.0032) [2024-06-23 09:42:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6697992192. Throughput: 0: 42528.2. Samples: 6698066700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 09:42:55,929][15401] Updated weights for policy 0, policy_version 408820 (0.0029) [2024-06-23 09:42:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6698188800. Throughput: 0: 42733.4. Samples: 6698330680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:42:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 09:42:58,702][15349] Signal inference workers to stop experience collection... (99250 times) [2024-06-23 09:42:58,702][15349] Signal inference workers to resume experience collection... (99250 times) [2024-06-23 09:42:58,718][15401] InferenceWorker_p0-w0: stopping experience collection (99250 times) [2024-06-23 09:42:58,718][15401] InferenceWorker_p0-w0: resuming experience collection (99250 times) [2024-06-23 09:42:59,658][15401] Updated weights for policy 0, policy_version 408830 (0.0030) [2024-06-23 09:43:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6698418176. Throughput: 0: 42652.1. Samples: 6698583120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:43:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 09:43:03,537][15401] Updated weights for policy 0, policy_version 408840 (0.0034) [2024-06-23 09:43:07,313][15401] Updated weights for policy 0, policy_version 408850 (0.0021) [2024-06-23 09:43:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 6698647552. Throughput: 0: 42618.6. Samples: 6698713100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 09:43:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:43:11,213][15401] Updated weights for policy 0, policy_version 408860 (0.0042) [2024-06-23 09:43:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6698811392. Throughput: 0: 42571.9. Samples: 6698964900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:13,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 09:43:15,275][15401] Updated weights for policy 0, policy_version 408870 (0.0043) [2024-06-23 09:43:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 6699057152. Throughput: 0: 42560.1. Samples: 6699218320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 09:43:19,285][15401] Updated weights for policy 0, policy_version 408880 (0.0021) [2024-06-23 09:43:22,914][15401] Updated weights for policy 0, policy_version 408890 (0.0021) [2024-06-23 09:43:23,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6699270144. Throughput: 0: 42597.4. Samples: 6699350700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 09:43:26,909][15401] Updated weights for policy 0, policy_version 408900 (0.0037) [2024-06-23 09:43:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6699466752. Throughput: 0: 42464.8. Samples: 6699599120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 09:43:30,413][15401] Updated weights for policy 0, policy_version 408910 (0.0027) [2024-06-23 09:43:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6699679744. Throughput: 0: 42577.7. Samples: 6699858280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 09:43:34,503][15401] Updated weights for policy 0, policy_version 408920 (0.0039) [2024-06-23 09:43:38,275][15401] Updated weights for policy 0, policy_version 408930 (0.0035) [2024-06-23 09:43:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 6699909120. Throughput: 0: 42634.6. Samples: 6699985360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:38,393][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 09:43:42,129][15401] Updated weights for policy 0, policy_version 408940 (0.0031) [2024-06-23 09:43:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6700122112. Throughput: 0: 42400.3. Samples: 6700238700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 09:43:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408944_6700138496.pth... [2024-06-23 09:43:43,561][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408318_6689882112.pth [2024-06-23 09:43:45,953][15401] Updated weights for policy 0, policy_version 408950 (0.0038) [2024-06-23 09:43:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 6700335104. Throughput: 0: 42592.9. Samples: 6700499800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 09:43:50,013][15401] Updated weights for policy 0, policy_version 408960 (0.0034) [2024-06-23 09:43:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6700531712. Throughput: 0: 42461.1. Samples: 6700623840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 09:43:53,678][15401] Updated weights for policy 0, policy_version 408970 (0.0037) [2024-06-23 09:43:57,784][15401] Updated weights for policy 0, policy_version 408980 (0.0044) [2024-06-23 09:43:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6700761088. Throughput: 0: 42602.4. Samples: 6700882000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:43:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 09:44:01,282][15401] Updated weights for policy 0, policy_version 408990 (0.0026) [2024-06-23 09:44:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 6700974080. Throughput: 0: 42575.1. Samples: 6701134200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 09:44:05,511][15401] Updated weights for policy 0, policy_version 409000 (0.0036) [2024-06-23 09:44:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6701187072. Throughput: 0: 42517.8. Samples: 6701264000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 09:44:09,154][15401] Updated weights for policy 0, policy_version 409010 (0.0038) [2024-06-23 09:44:13,037][15401] Updated weights for policy 0, policy_version 409020 (0.0034) [2024-06-23 09:44:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6701400064. Throughput: 0: 42727.5. Samples: 6701521860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:13,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 09:44:16,910][15401] Updated weights for policy 0, policy_version 409030 (0.0035) [2024-06-23 09:44:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6701629440. Throughput: 0: 42612.6. Samples: 6701775840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:18,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 09:44:20,665][15401] Updated weights for policy 0, policy_version 409040 (0.0033) [2024-06-23 09:44:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6701826048. Throughput: 0: 42668.5. Samples: 6701905340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 09:44:24,781][15401] Updated weights for policy 0, policy_version 409050 (0.0035) [2024-06-23 09:44:28,222][15401] Updated weights for policy 0, policy_version 409060 (0.0035) [2024-06-23 09:44:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6702039040. Throughput: 0: 42604.9. Samples: 6702155920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 09:44:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 09:44:32,313][15401] Updated weights for policy 0, policy_version 409070 (0.0053) [2024-06-23 09:44:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 6702252032. Throughput: 0: 42615.5. Samples: 6702417600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:33,392][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 09:44:36,215][15401] Updated weights for policy 0, policy_version 409080 (0.0037) [2024-06-23 09:44:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 6702465024. Throughput: 0: 42775.0. Samples: 6702548720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:38,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-23 09:44:39,899][15401] Updated weights for policy 0, policy_version 409090 (0.0035) [2024-06-23 09:44:41,787][15349] Signal inference workers to stop experience collection... (99300 times) [2024-06-23 09:44:41,788][15349] Signal inference workers to resume experience collection... (99300 times) [2024-06-23 09:44:41,830][15401] InferenceWorker_p0-w0: stopping experience collection (99300 times) [2024-06-23 09:44:41,830][15401] InferenceWorker_p0-w0: resuming experience collection (99300 times) [2024-06-23 09:44:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6702661632. Throughput: 0: 42545.7. Samples: 6702796560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 09:44:43,930][15401] Updated weights for policy 0, policy_version 409100 (0.0032) [2024-06-23 09:44:47,430][15401] Updated weights for policy 0, policy_version 409110 (0.0026) [2024-06-23 09:44:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6702891008. Throughput: 0: 42583.2. Samples: 6703050540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:48,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 09:44:51,531][15401] Updated weights for policy 0, policy_version 409120 (0.0027) [2024-06-23 09:44:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6703104000. Throughput: 0: 42732.3. Samples: 6703186960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:53,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 09:44:54,976][15401] Updated weights for policy 0, policy_version 409130 (0.0033) [2024-06-23 09:44:58,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6703316992. Throughput: 0: 42578.6. Samples: 6703437900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:44:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 09:44:59,144][15401] Updated weights for policy 0, policy_version 409140 (0.0053) [2024-06-23 09:45:02,571][15401] Updated weights for policy 0, policy_version 409150 (0.0043) [2024-06-23 09:45:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6703529984. Throughput: 0: 42591.1. Samples: 6703692440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 09:45:07,084][15401] Updated weights for policy 0, policy_version 409160 (0.0032) [2024-06-23 09:45:08,390][15132] Fps is (10 sec: 42595.2, 60 sec: 42597.7, 300 sec: 42709.3). Total num frames: 6703742976. Throughput: 0: 42561.9. Samples: 6703820660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:08,391][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 09:45:10,542][15401] Updated weights for policy 0, policy_version 409170 (0.0036) [2024-06-23 09:45:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6703939584. Throughput: 0: 42573.4. Samples: 6704071720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 09:45:14,882][15401] Updated weights for policy 0, policy_version 409180 (0.0045) [2024-06-23 09:45:18,389][15132] Fps is (10 sec: 40963.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6704152576. Throughput: 0: 42436.1. Samples: 6704327120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 09:45:18,419][15401] Updated weights for policy 0, policy_version 409190 (0.0043) [2024-06-23 09:45:22,579][15401] Updated weights for policy 0, policy_version 409200 (0.0035) [2024-06-23 09:45:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 6704381952. Throughput: 0: 42319.1. Samples: 6704453180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:23,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 09:45:26,143][15401] Updated weights for policy 0, policy_version 409210 (0.0030) [2024-06-23 09:45:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6704578560. Throughput: 0: 42478.7. Samples: 6704708100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 09:45:30,343][15401] Updated weights for policy 0, policy_version 409220 (0.0033) [2024-06-23 09:45:33,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6704791552. Throughput: 0: 42386.1. Samples: 6704957820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 09:45:33,872][15401] Updated weights for policy 0, policy_version 409230 (0.0040) [2024-06-23 09:45:37,843][15401] Updated weights for policy 0, policy_version 409240 (0.0028) [2024-06-23 09:45:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6705004544. Throughput: 0: 42231.2. Samples: 6705087360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 09:45:41,554][15401] Updated weights for policy 0, policy_version 409250 (0.0028) [2024-06-23 09:45:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6705201152. Throughput: 0: 42286.7. Samples: 6705340800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 09:45:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409254_6705217536.pth... [2024-06-23 09:45:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408631_6695010304.pth [2024-06-23 09:45:45,582][15401] Updated weights for policy 0, policy_version 409260 (0.0037) [2024-06-23 09:45:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6705430528. Throughput: 0: 42331.1. Samples: 6705597340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 09:45:49,265][15401] Updated weights for policy 0, policy_version 409270 (0.0027) [2024-06-23 09:45:53,228][15401] Updated weights for policy 0, policy_version 409280 (0.0037) [2024-06-23 09:45:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6705643520. Throughput: 0: 42316.8. Samples: 6705724880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:45:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 09:45:56,948][15401] Updated weights for policy 0, policy_version 409290 (0.0032) [2024-06-23 09:45:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6705840128. Throughput: 0: 42370.7. Samples: 6705978400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:45:58,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 09:46:00,835][15401] Updated weights for policy 0, policy_version 409300 (0.0032) [2024-06-23 09:46:02,175][15349] Signal inference workers to stop experience collection... (99350 times) [2024-06-23 09:46:02,225][15401] InferenceWorker_p0-w0: stopping experience collection (99350 times) [2024-06-23 09:46:02,232][15349] Signal inference workers to resume experience collection... (99350 times) [2024-06-23 09:46:02,253][15401] InferenceWorker_p0-w0: resuming experience collection (99350 times) [2024-06-23 09:46:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6706069504. Throughput: 0: 42396.4. Samples: 6706234960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 09:46:04,442][15401] Updated weights for policy 0, policy_version 409310 (0.0031) [2024-06-23 09:46:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42326.0, 300 sec: 42598.4). Total num frames: 6706282496. Throughput: 0: 42537.4. Samples: 6706367260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 09:46:08,495][15401] Updated weights for policy 0, policy_version 409320 (0.0032) [2024-06-23 09:46:12,564][15401] Updated weights for policy 0, policy_version 409330 (0.0050) [2024-06-23 09:46:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6706479104. Throughput: 0: 42382.2. Samples: 6706615300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 09:46:16,255][15401] Updated weights for policy 0, policy_version 409340 (0.0026) [2024-06-23 09:46:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6706708480. Throughput: 0: 42380.5. Samples: 6706864940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:18,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 09:46:20,243][15401] Updated weights for policy 0, policy_version 409350 (0.0027) [2024-06-23 09:46:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 6706905088. Throughput: 0: 42448.0. Samples: 6706997520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 09:46:24,023][15401] Updated weights for policy 0, policy_version 409360 (0.0040) [2024-06-23 09:46:27,762][15401] Updated weights for policy 0, policy_version 409370 (0.0038) [2024-06-23 09:46:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6707134464. Throughput: 0: 42500.1. Samples: 6707253300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 09:46:31,611][15401] Updated weights for policy 0, policy_version 409380 (0.0032) [2024-06-23 09:46:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6707347456. Throughput: 0: 42409.3. Samples: 6707505760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 09:46:35,553][15401] Updated weights for policy 0, policy_version 409390 (0.0033) [2024-06-23 09:46:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6707544064. Throughput: 0: 42427.9. Samples: 6707634140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 09:46:39,407][15401] Updated weights for policy 0, policy_version 409400 (0.0040) [2024-06-23 09:46:43,267][15401] Updated weights for policy 0, policy_version 409410 (0.0026) [2024-06-23 09:46:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6707773440. Throughput: 0: 42499.5. Samples: 6707890880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 09:46:47,107][15401] Updated weights for policy 0, policy_version 409420 (0.0036) [2024-06-23 09:46:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6707986432. Throughput: 0: 42327.5. Samples: 6708139700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 09:46:50,904][15401] Updated weights for policy 0, policy_version 409430 (0.0034) [2024-06-23 09:46:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 6708166656. Throughput: 0: 42207.9. Samples: 6708266620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 09:46:55,313][15401] Updated weights for policy 0, policy_version 409440 (0.0033) [2024-06-23 09:46:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6708412416. Throughput: 0: 42312.0. Samples: 6708519340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:46:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 09:46:58,447][15401] Updated weights for policy 0, policy_version 409450 (0.0031) [2024-06-23 09:47:02,889][15401] Updated weights for policy 0, policy_version 409460 (0.0039) [2024-06-23 09:47:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 6708609024. Throughput: 0: 42488.4. Samples: 6708776920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:47:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:47:06,202][15401] Updated weights for policy 0, policy_version 409470 (0.0040) [2024-06-23 09:47:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6708805632. Throughput: 0: 42294.2. Samples: 6708900760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:47:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 09:47:10,805][15401] Updated weights for policy 0, policy_version 409480 (0.0053) [2024-06-23 09:47:13,369][15349] Signal inference workers to stop experience collection... (99400 times) [2024-06-23 09:47:13,376][15349] Signal inference workers to resume experience collection... (99400 times) [2024-06-23 09:47:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 6709035008. Throughput: 0: 42232.4. Samples: 6709153760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 09:47:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 09:47:13,391][15401] InferenceWorker_p0-w0: stopping experience collection (99400 times) [2024-06-23 09:47:13,391][15401] InferenceWorker_p0-w0: resuming experience collection (99400 times) [2024-06-23 09:47:14,197][15401] Updated weights for policy 0, policy_version 409490 (0.0028) [2024-06-23 09:47:18,387][15401] Updated weights for policy 0, policy_version 409500 (0.0035) [2024-06-23 09:47:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6709248000. Throughput: 0: 42367.9. Samples: 6709412320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 09:47:21,938][15401] Updated weights for policy 0, policy_version 409510 (0.0046) [2024-06-23 09:47:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6709460992. Throughput: 0: 42200.0. Samples: 6709533140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 09:47:26,005][15401] Updated weights for policy 0, policy_version 409520 (0.0040) [2024-06-23 09:47:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6709673984. Throughput: 0: 42160.9. Samples: 6709788120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 09:47:29,709][15401] Updated weights for policy 0, policy_version 409530 (0.0038) [2024-06-23 09:47:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 6709870592. Throughput: 0: 42305.8. Samples: 6710043460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 09:47:33,559][15401] Updated weights for policy 0, policy_version 409540 (0.0033) [2024-06-23 09:47:37,881][15401] Updated weights for policy 0, policy_version 409550 (0.0038) [2024-06-23 09:47:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6710083584. Throughput: 0: 42255.5. Samples: 6710168120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 09:47:41,341][15401] Updated weights for policy 0, policy_version 409560 (0.0044) [2024-06-23 09:47:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 6710296576. Throughput: 0: 42316.5. Samples: 6710423580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 09:47:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409565_6710312960.pth... [2024-06-23 09:47:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000408944_6700138496.pth [2024-06-23 09:47:45,523][15401] Updated weights for policy 0, policy_version 409570 (0.0031) [2024-06-23 09:47:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 6710509568. Throughput: 0: 42164.9. Samples: 6710674340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 09:47:49,176][15401] Updated weights for policy 0, policy_version 409580 (0.0033) [2024-06-23 09:47:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6710706176. Throughput: 0: 42169.2. Samples: 6710798380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:53,392][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 09:47:53,447][15401] Updated weights for policy 0, policy_version 409590 (0.0033) [2024-06-23 09:47:56,767][15401] Updated weights for policy 0, policy_version 409600 (0.0044) [2024-06-23 09:47:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 6710935552. Throughput: 0: 42242.6. Samples: 6711054680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:47:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-23 09:48:01,083][15401] Updated weights for policy 0, policy_version 409610 (0.0042) [2024-06-23 09:48:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 6711148544. Throughput: 0: 42204.4. Samples: 6711311520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 09:48:04,634][15401] Updated weights for policy 0, policy_version 409620 (0.0030) [2024-06-23 09:48:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6711361536. Throughput: 0: 42339.6. Samples: 6711438420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 09:48:08,631][15401] Updated weights for policy 0, policy_version 409630 (0.0031) [2024-06-23 09:48:12,376][15401] Updated weights for policy 0, policy_version 409640 (0.0031) [2024-06-23 09:48:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 6711558144. Throughput: 0: 42339.7. Samples: 6711693400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 09:48:16,390][15401] Updated weights for policy 0, policy_version 409650 (0.0024) [2024-06-23 09:48:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 6711754752. Throughput: 0: 42211.1. Samples: 6711942960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 09:48:19,986][15401] Updated weights for policy 0, policy_version 409660 (0.0038) [2024-06-23 09:48:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6712000512. Throughput: 0: 42211.2. Samples: 6712067620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:23,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 09:48:24,040][15401] Updated weights for policy 0, policy_version 409670 (0.0033) [2024-06-23 09:48:28,111][15401] Updated weights for policy 0, policy_version 409680 (0.0043) [2024-06-23 09:48:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 6712197120. Throughput: 0: 42163.1. Samples: 6712320920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 09:48:31,644][15401] Updated weights for policy 0, policy_version 409690 (0.0038) [2024-06-23 09:48:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42376.6). Total num frames: 6712410112. Throughput: 0: 42166.6. Samples: 6712571840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 09:48:33,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 09:48:35,782][15401] Updated weights for policy 0, policy_version 409700 (0.0032) [2024-06-23 09:48:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 6712639488. Throughput: 0: 42306.2. Samples: 6712702160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:48:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 09:48:39,376][15401] Updated weights for policy 0, policy_version 409710 (0.0034) [2024-06-23 09:48:43,294][15401] Updated weights for policy 0, policy_version 409720 (0.0033) [2024-06-23 09:48:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 6712852480. Throughput: 0: 42437.8. Samples: 6712964380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:48:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 09:48:44,721][15349] Signal inference workers to stop experience collection... (99450 times) [2024-06-23 09:48:44,724][15349] Signal inference workers to resume experience collection... (99450 times) [2024-06-23 09:48:44,745][15401] InferenceWorker_p0-w0: stopping experience collection (99450 times) [2024-06-23 09:48:44,745][15401] InferenceWorker_p0-w0: resuming experience collection (99450 times) [2024-06-23 09:48:47,169][15401] Updated weights for policy 0, policy_version 409730 (0.0033) [2024-06-23 09:48:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 6713049088. Throughput: 0: 42332.1. Samples: 6713216460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:48:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 09:48:51,343][15401] Updated weights for policy 0, policy_version 409740 (0.0028) [2024-06-23 09:48:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 6713278464. Throughput: 0: 42400.8. Samples: 6713346460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:48:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 09:48:54,617][15401] Updated weights for policy 0, policy_version 409750 (0.0033) [2024-06-23 09:48:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 6713475072. Throughput: 0: 42435.5. Samples: 6713603000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:48:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 09:48:58,843][15401] Updated weights for policy 0, policy_version 409760 (0.0039) [2024-06-23 09:49:02,744][15401] Updated weights for policy 0, policy_version 409770 (0.0028) [2024-06-23 09:49:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42431.7). Total num frames: 6713704448. Throughput: 0: 42379.2. Samples: 6713850040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:03,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 09:49:06,931][15401] Updated weights for policy 0, policy_version 409780 (0.0035) [2024-06-23 09:49:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 6713901056. Throughput: 0: 42540.0. Samples: 6713981920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 09:49:10,329][15401] Updated weights for policy 0, policy_version 409790 (0.0034) [2024-06-23 09:49:13,390][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 6714114048. Throughput: 0: 42660.0. Samples: 6714240620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:13,392][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 09:49:14,654][15401] Updated weights for policy 0, policy_version 409800 (0.0037) [2024-06-23 09:49:17,928][15401] Updated weights for policy 0, policy_version 409810 (0.0033) [2024-06-23 09:49:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.7, 300 sec: 42431.4). Total num frames: 6714343424. Throughput: 0: 42492.9. Samples: 6714484120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:18,393][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 09:49:22,295][15401] Updated weights for policy 0, policy_version 409820 (0.0023) [2024-06-23 09:49:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 6714540032. Throughput: 0: 42708.5. Samples: 6714624040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 09:49:25,535][15401] Updated weights for policy 0, policy_version 409830 (0.0030) [2024-06-23 09:49:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 6714736640. Throughput: 0: 42387.1. Samples: 6714871800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 09:49:30,034][15401] Updated weights for policy 0, policy_version 409840 (0.0039) [2024-06-23 09:49:33,094][15401] Updated weights for policy 0, policy_version 409850 (0.0031) [2024-06-23 09:49:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 6714982400. Throughput: 0: 42329.7. Samples: 6715121300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 09:49:37,514][15401] Updated weights for policy 0, policy_version 409860 (0.0031) [2024-06-23 09:49:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 6715179008. Throughput: 0: 42609.4. Samples: 6715263880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:38,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 09:49:40,893][15401] Updated weights for policy 0, policy_version 409870 (0.0041) [2024-06-23 09:49:43,394][15132] Fps is (10 sec: 40943.1, 60 sec: 42322.4, 300 sec: 42376.0). Total num frames: 6715392000. Throughput: 0: 42438.2. Samples: 6715512900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:43,395][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 09:49:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409875_6715392000.pth... [2024-06-23 09:49:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409254_6705217536.pth [2024-06-23 09:49:45,177][15401] Updated weights for policy 0, policy_version 409880 (0.0032) [2024-06-23 09:49:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 6715621376. Throughput: 0: 42503.8. Samples: 6715762700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 09:49:48,573][15401] Updated weights for policy 0, policy_version 409890 (0.0029) [2024-06-23 09:49:53,026][15401] Updated weights for policy 0, policy_version 409900 (0.0041) [2024-06-23 09:49:53,390][15132] Fps is (10 sec: 42616.2, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 6715817984. Throughput: 0: 42476.9. Samples: 6715893380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 09:49:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 09:49:56,208][15401] Updated weights for policy 0, policy_version 409910 (0.0035) [2024-06-23 09:49:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 6716014592. Throughput: 0: 42390.7. Samples: 6716148200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:49:58,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-23 09:50:00,734][15401] Updated weights for policy 0, policy_version 409920 (0.0026) [2024-06-23 09:50:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.6, 300 sec: 42376.4). Total num frames: 6716243968. Throughput: 0: 42678.4. Samples: 6716404540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 09:50:03,697][15401] Updated weights for policy 0, policy_version 409930 (0.0030) [2024-06-23 09:50:08,338][15401] Updated weights for policy 0, policy_version 409940 (0.0043) [2024-06-23 09:50:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 6716456960. Throughput: 0: 42454.6. Samples: 6716534500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 09:50:11,648][15401] Updated weights for policy 0, policy_version 409950 (0.0022) [2024-06-23 09:50:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 6716686336. Throughput: 0: 42618.7. Samples: 6716789640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 09:50:15,777][15401] Updated weights for policy 0, policy_version 409960 (0.0037) [2024-06-23 09:50:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42054.0, 300 sec: 42321.1). Total num frames: 6716866560. Throughput: 0: 42839.7. Samples: 6717049080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 09:50:19,471][15401] Updated weights for policy 0, policy_version 409970 (0.0032) [2024-06-23 09:50:23,264][15401] Updated weights for policy 0, policy_version 409980 (0.0034) [2024-06-23 09:50:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 6717112320. Throughput: 0: 42470.2. Samples: 6717175140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:23,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 09:50:24,870][15349] Signal inference workers to stop experience collection... (99500 times) [2024-06-23 09:50:24,924][15401] InferenceWorker_p0-w0: stopping experience collection (99500 times) [2024-06-23 09:50:24,985][15349] Signal inference workers to resume experience collection... (99500 times) [2024-06-23 09:50:24,985][15401] InferenceWorker_p0-w0: resuming experience collection (99500 times) [2024-06-23 09:50:27,070][15401] Updated weights for policy 0, policy_version 409990 (0.0025) [2024-06-23 09:50:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 6717325312. Throughput: 0: 42641.3. Samples: 6717431580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 09:50:31,024][15401] Updated weights for policy 0, policy_version 410000 (0.0029) [2024-06-23 09:50:33,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6717521920. Throughput: 0: 42930.6. Samples: 6717694580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:33,399][15132] Avg episode reward: [(0, '0.874')] [2024-06-23 09:50:34,674][15401] Updated weights for policy 0, policy_version 410010 (0.0027) [2024-06-23 09:50:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 6717751296. Throughput: 0: 42906.6. Samples: 6717824280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:38,393][15132] Avg episode reward: [(0, '0.874')] [2024-06-23 09:50:38,722][15401] Updated weights for policy 0, policy_version 410020 (0.0031) [2024-06-23 09:50:42,217][15401] Updated weights for policy 0, policy_version 410030 (0.0028) [2024-06-23 09:50:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43147.5, 300 sec: 42542.8). Total num frames: 6717980672. Throughput: 0: 42941.3. Samples: 6718080560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:43,392][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 09:50:46,571][15401] Updated weights for policy 0, policy_version 410040 (0.0022) [2024-06-23 09:50:48,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 6718177280. Throughput: 0: 43019.0. Samples: 6718340500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:48,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 09:50:49,869][15401] Updated weights for policy 0, policy_version 410050 (0.0029) [2024-06-23 09:50:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 6718390272. Throughput: 0: 42858.6. Samples: 6718463240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:53,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 09:50:54,563][15401] Updated weights for policy 0, policy_version 410060 (0.0032) [2024-06-23 09:50:57,338][15401] Updated weights for policy 0, policy_version 410070 (0.0043) [2024-06-23 09:50:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 6718619648. Throughput: 0: 42954.2. Samples: 6718722580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:50:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 09:51:02,015][15401] Updated weights for policy 0, policy_version 410080 (0.0035) [2024-06-23 09:51:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 6718816256. Throughput: 0: 42942.0. Samples: 6718981480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:51:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 09:51:04,962][15401] Updated weights for policy 0, policy_version 410090 (0.0031) [2024-06-23 09:51:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6719029248. Throughput: 0: 42938.3. Samples: 6719107260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:51:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 09:51:09,508][15401] Updated weights for policy 0, policy_version 410100 (0.0030) [2024-06-23 09:51:12,527][15401] Updated weights for policy 0, policy_version 410110 (0.0029) [2024-06-23 09:51:13,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6719275008. Throughput: 0: 43077.7. Samples: 6719370080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:51:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 09:51:17,063][15401] Updated weights for policy 0, policy_version 410120 (0.0037) [2024-06-23 09:51:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 6719455232. Throughput: 0: 42994.3. Samples: 6719629420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 09:51:18,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 09:51:20,457][15401] Updated weights for policy 0, policy_version 410130 (0.0026) [2024-06-23 09:51:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 6719668224. Throughput: 0: 42742.7. Samples: 6719747600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:23,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 09:51:24,703][15401] Updated weights for policy 0, policy_version 410140 (0.0026) [2024-06-23 09:51:28,030][15401] Updated weights for policy 0, policy_version 410150 (0.0033) [2024-06-23 09:51:28,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6719913984. Throughput: 0: 42917.8. Samples: 6720011860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 09:51:32,340][15401] Updated weights for policy 0, policy_version 410160 (0.0044) [2024-06-23 09:51:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 6720077824. Throughput: 0: 42842.3. Samples: 6720268300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:33,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 09:51:35,613][15401] Updated weights for policy 0, policy_version 410170 (0.0044) [2024-06-23 09:51:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 6720323584. Throughput: 0: 42757.5. Samples: 6720387220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-23 09:51:40,279][15401] Updated weights for policy 0, policy_version 410180 (0.0034) [2024-06-23 09:51:42,285][15349] Signal inference workers to stop experience collection... (99550 times) [2024-06-23 09:51:42,327][15401] InferenceWorker_p0-w0: stopping experience collection (99550 times) [2024-06-23 09:51:42,401][15349] Signal inference workers to resume experience collection... (99550 times) [2024-06-23 09:51:42,401][15401] InferenceWorker_p0-w0: resuming experience collection (99550 times) [2024-06-23 09:51:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6720520192. Throughput: 0: 42927.6. Samples: 6720654320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 09:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410189_6720536576.pth... [2024-06-23 09:51:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409565_6710312960.pth [2024-06-23 09:51:43,627][15401] Updated weights for policy 0, policy_version 410190 (0.0031) [2024-06-23 09:51:48,177][15401] Updated weights for policy 0, policy_version 410200 (0.0035) [2024-06-23 09:51:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 6720716800. Throughput: 0: 42817.5. Samples: 6720908260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 09:51:51,404][15401] Updated weights for policy 0, policy_version 410210 (0.0036) [2024-06-23 09:51:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 6720962560. Throughput: 0: 42771.9. Samples: 6721032000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:53,395][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 09:51:55,875][15401] Updated weights for policy 0, policy_version 410220 (0.0034) [2024-06-23 09:51:58,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 6721175552. Throughput: 0: 42712.9. Samples: 6721292260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:51:58,393][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 09:51:59,061][15401] Updated weights for policy 0, policy_version 410230 (0.0042) [2024-06-23 09:52:03,324][15401] Updated weights for policy 0, policy_version 410240 (0.0029) [2024-06-23 09:52:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6721372160. Throughput: 0: 42620.3. Samples: 6721547240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:03,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-23 09:52:06,679][15401] Updated weights for policy 0, policy_version 410250 (0.0028) [2024-06-23 09:52:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6721617920. Throughput: 0: 42664.9. Samples: 6721667520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:08,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 09:52:11,065][15401] Updated weights for policy 0, policy_version 410260 (0.0038) [2024-06-23 09:52:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6721814528. Throughput: 0: 42680.4. Samples: 6721932480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 09:52:14,737][15401] Updated weights for policy 0, policy_version 410270 (0.0032) [2024-06-23 09:52:18,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 6721994752. Throughput: 0: 42563.1. Samples: 6722183640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 09:52:18,942][15401] Updated weights for policy 0, policy_version 410280 (0.0026) [2024-06-23 09:52:22,218][15401] Updated weights for policy 0, policy_version 410290 (0.0044) [2024-06-23 09:52:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6722273280. Throughput: 0: 42722.6. Samples: 6722309740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 09:52:26,449][15401] Updated weights for policy 0, policy_version 410300 (0.0041) [2024-06-23 09:52:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6722437120. Throughput: 0: 42671.4. Samples: 6722574540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 09:52:29,939][15401] Updated weights for policy 0, policy_version 410310 (0.0029) [2024-06-23 09:52:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6722650112. Throughput: 0: 42493.4. Samples: 6722820460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 09:52:34,407][15401] Updated weights for policy 0, policy_version 410320 (0.0029) [2024-06-23 09:52:37,563][15401] Updated weights for policy 0, policy_version 410330 (0.0044) [2024-06-23 09:52:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6722895872. Throughput: 0: 42591.1. Samples: 6722948600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 09:52:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 09:52:42,156][15401] Updated weights for policy 0, policy_version 410340 (0.0043) [2024-06-23 09:52:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6723076096. Throughput: 0: 42589.7. Samples: 6723208700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:52:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 09:52:45,424][15401] Updated weights for policy 0, policy_version 410350 (0.0027) [2024-06-23 09:52:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6723289088. Throughput: 0: 42398.8. Samples: 6723455280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:52:48,392][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 09:52:49,721][15401] Updated weights for policy 0, policy_version 410360 (0.0045) [2024-06-23 09:52:53,108][15401] Updated weights for policy 0, policy_version 410370 (0.0026) [2024-06-23 09:52:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6723518464. Throughput: 0: 42587.5. Samples: 6723583960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:52:53,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 09:52:57,316][15401] Updated weights for policy 0, policy_version 410380 (0.0033) [2024-06-23 09:52:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6723715072. Throughput: 0: 42533.9. Samples: 6723846500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:52:58,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 09:53:00,454][15349] Signal inference workers to stop experience collection... (99600 times) [2024-06-23 09:53:00,490][15401] InferenceWorker_p0-w0: stopping experience collection (99600 times) [2024-06-23 09:53:00,510][15349] Signal inference workers to resume experience collection... (99600 times) [2024-06-23 09:53:00,511][15401] InferenceWorker_p0-w0: resuming experience collection (99600 times) [2024-06-23 09:53:00,643][15401] Updated weights for policy 0, policy_version 410390 (0.0033) [2024-06-23 09:53:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 6723928064. Throughput: 0: 42445.8. Samples: 6724093700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 09:53:04,769][15401] Updated weights for policy 0, policy_version 410400 (0.0034) [2024-06-23 09:53:08,343][15401] Updated weights for policy 0, policy_version 410410 (0.0039) [2024-06-23 09:53:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6724157440. Throughput: 0: 42569.8. Samples: 6724225380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 09:53:12,331][15401] Updated weights for policy 0, policy_version 410420 (0.0030) [2024-06-23 09:53:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 6724337664. Throughput: 0: 42361.5. Samples: 6724480800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 09:53:16,018][15401] Updated weights for policy 0, policy_version 410430 (0.0037) [2024-06-23 09:53:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6724583424. Throughput: 0: 42467.9. Samples: 6724731520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 09:53:20,293][15401] Updated weights for policy 0, policy_version 410440 (0.0027) [2024-06-23 09:53:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 6724780032. Throughput: 0: 42609.2. Samples: 6724866020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 09:53:23,603][15401] Updated weights for policy 0, policy_version 410450 (0.0047) [2024-06-23 09:53:27,793][15401] Updated weights for policy 0, policy_version 410460 (0.0041) [2024-06-23 09:53:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6724976640. Throughput: 0: 42501.0. Samples: 6725121240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 09:53:31,380][15401] Updated weights for policy 0, policy_version 410470 (0.0039) [2024-06-23 09:53:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6725222400. Throughput: 0: 42601.3. Samples: 6725372240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 09:53:35,615][15401] Updated weights for policy 0, policy_version 410480 (0.0036) [2024-06-23 09:53:38,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 6725435392. Throughput: 0: 42763.5. Samples: 6725508420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:38,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 09:53:38,942][15401] Updated weights for policy 0, policy_version 410490 (0.0034) [2024-06-23 09:53:43,230][15401] Updated weights for policy 0, policy_version 410500 (0.0029) [2024-06-23 09:53:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6725632000. Throughput: 0: 42588.4. Samples: 6725762980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 09:53:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410500_6725632000.pth... [2024-06-23 09:53:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000409875_6715392000.pth [2024-06-23 09:53:46,381][15401] Updated weights for policy 0, policy_version 410510 (0.0036) [2024-06-23 09:53:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 6725861376. Throughput: 0: 42768.8. Samples: 6726018300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 09:53:50,881][15401] Updated weights for policy 0, policy_version 410520 (0.0035) [2024-06-23 09:53:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6726074368. Throughput: 0: 42827.1. Samples: 6726152600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 09:53:53,867][15401] Updated weights for policy 0, policy_version 410530 (0.0038) [2024-06-23 09:53:58,387][15401] Updated weights for policy 0, policy_version 410540 (0.0030) [2024-06-23 09:53:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 6726287360. Throughput: 0: 42936.5. Samples: 6726412940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 09:53:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 09:54:01,378][15401] Updated weights for policy 0, policy_version 410550 (0.0034) [2024-06-23 09:54:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6726516736. Throughput: 0: 43025.4. Samples: 6726667660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:03,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-23 09:54:05,860][15401] Updated weights for policy 0, policy_version 410560 (0.0044) [2024-06-23 09:54:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6726729728. Throughput: 0: 42987.6. Samples: 6726800460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 09:54:08,866][15401] Updated weights for policy 0, policy_version 410570 (0.0033) [2024-06-23 09:54:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 6726926336. Throughput: 0: 43082.8. Samples: 6727059960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 09:54:13,528][15401] Updated weights for policy 0, policy_version 410580 (0.0027) [2024-06-23 09:54:17,086][15401] Updated weights for policy 0, policy_version 410590 (0.0034) [2024-06-23 09:54:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6727172096. Throughput: 0: 43016.5. Samples: 6727307980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:18,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 09:54:21,244][15401] Updated weights for policy 0, policy_version 410600 (0.0043) [2024-06-23 09:54:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6727352320. Throughput: 0: 43081.0. Samples: 6727446960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 09:54:24,493][15401] Updated weights for policy 0, policy_version 410610 (0.0033) [2024-06-23 09:54:24,522][15349] Signal inference workers to stop experience collection... (99650 times) [2024-06-23 09:54:24,522][15349] Signal inference workers to resume experience collection... (99650 times) [2024-06-23 09:54:24,535][15401] InferenceWorker_p0-w0: stopping experience collection (99650 times) [2024-06-23 09:54:24,536][15401] InferenceWorker_p0-w0: resuming experience collection (99650 times) [2024-06-23 09:54:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6727565312. Throughput: 0: 43094.2. Samples: 6727702220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:28,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 09:54:28,766][15401] Updated weights for policy 0, policy_version 410620 (0.0032) [2024-06-23 09:54:32,194][15401] Updated weights for policy 0, policy_version 410630 (0.0028) [2024-06-23 09:54:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6727811072. Throughput: 0: 42988.4. Samples: 6727952780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 09:54:36,705][15401] Updated weights for policy 0, policy_version 410640 (0.0032) [2024-06-23 09:54:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42765.6). Total num frames: 6728007680. Throughput: 0: 43056.5. Samples: 6728090140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 09:54:39,856][15401] Updated weights for policy 0, policy_version 410650 (0.0030) [2024-06-23 09:54:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6728220672. Throughput: 0: 42753.2. Samples: 6728336840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 09:54:44,320][15401] Updated weights for policy 0, policy_version 410660 (0.0033) [2024-06-23 09:54:47,603][15401] Updated weights for policy 0, policy_version 410670 (0.0032) [2024-06-23 09:54:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 6728450048. Throughput: 0: 42830.1. Samples: 6728595120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:48,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 09:54:51,902][15401] Updated weights for policy 0, policy_version 410680 (0.0033) [2024-06-23 09:54:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6728646656. Throughput: 0: 42732.1. Samples: 6728723400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:53,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 09:54:55,182][15401] Updated weights for policy 0, policy_version 410690 (0.0029) [2024-06-23 09:54:58,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6728859648. Throughput: 0: 42587.4. Samples: 6728976400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:54:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 09:54:59,969][15401] Updated weights for policy 0, policy_version 410700 (0.0035) [2024-06-23 09:55:03,147][15401] Updated weights for policy 0, policy_version 410710 (0.0034) [2024-06-23 09:55:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6729072640. Throughput: 0: 42692.5. Samples: 6729229140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:55:03,396][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 09:55:07,455][15401] Updated weights for policy 0, policy_version 410720 (0.0035) [2024-06-23 09:55:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 6729269248. Throughput: 0: 42510.1. Samples: 6729360020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:55:08,401][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 09:55:10,774][15401] Updated weights for policy 0, policy_version 410730 (0.0045) [2024-06-23 09:55:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6729498624. Throughput: 0: 42608.1. Samples: 6729619580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:55:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 09:55:15,154][15401] Updated weights for policy 0, policy_version 410740 (0.0032) [2024-06-23 09:55:18,272][15401] Updated weights for policy 0, policy_version 410750 (0.0033) [2024-06-23 09:55:18,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 6729728000. Throughput: 0: 42649.3. Samples: 6729872000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 09:55:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 09:55:22,868][15401] Updated weights for policy 0, policy_version 410760 (0.0028) [2024-06-23 09:55:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 6729908224. Throughput: 0: 42497.7. Samples: 6730002640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:23,392][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 09:55:25,794][15401] Updated weights for policy 0, policy_version 410770 (0.0030) [2024-06-23 09:55:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6730153984. Throughput: 0: 42716.8. Samples: 6730259100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:28,396][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 09:55:30,495][15401] Updated weights for policy 0, policy_version 410780 (0.0032) [2024-06-23 09:55:33,252][15401] Updated weights for policy 0, policy_version 410790 (0.0029) [2024-06-23 09:55:33,390][15132] Fps is (10 sec: 47524.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 6730383360. Throughput: 0: 42699.5. Samples: 6730516500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 09:55:37,909][15349] Signal inference workers to stop experience collection... (99700 times) [2024-06-23 09:55:37,911][15349] Signal inference workers to resume experience collection... (99700 times) [2024-06-23 09:55:37,931][15401] InferenceWorker_p0-w0: stopping experience collection (99700 times) [2024-06-23 09:55:37,931][15401] InferenceWorker_p0-w0: resuming experience collection (99700 times) [2024-06-23 09:55:38,070][15401] Updated weights for policy 0, policy_version 410800 (0.0034) [2024-06-23 09:55:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6730547200. Throughput: 0: 42813.7. Samples: 6730650020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 09:55:40,866][15401] Updated weights for policy 0, policy_version 410810 (0.0034) [2024-06-23 09:55:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 6730792960. Throughput: 0: 42942.8. Samples: 6730908820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 09:55:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410816_6730809344.pth... [2024-06-23 09:55:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410189_6720536576.pth [2024-06-23 09:55:45,430][15401] Updated weights for policy 0, policy_version 410820 (0.0028) [2024-06-23 09:55:48,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 6731022336. Throughput: 0: 42914.2. Samples: 6731160280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 09:55:48,521][15401] Updated weights for policy 0, policy_version 410830 (0.0031) [2024-06-23 09:55:52,941][15401] Updated weights for policy 0, policy_version 410840 (0.0027) [2024-06-23 09:55:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6731202560. Throughput: 0: 42920.5. Samples: 6731291340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:53,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 09:55:56,165][15401] Updated weights for policy 0, policy_version 410850 (0.0031) [2024-06-23 09:55:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6731431936. Throughput: 0: 42884.0. Samples: 6731549360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:55:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 09:56:00,799][15401] Updated weights for policy 0, policy_version 410860 (0.0033) [2024-06-23 09:56:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6731628544. Throughput: 0: 43033.4. Samples: 6731808500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 09:56:03,927][15401] Updated weights for policy 0, policy_version 410870 (0.0037) [2024-06-23 09:56:08,218][15401] Updated weights for policy 0, policy_version 410880 (0.0031) [2024-06-23 09:56:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 6731857920. Throughput: 0: 42866.6. Samples: 6731931540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 09:56:11,554][15401] Updated weights for policy 0, policy_version 410890 (0.0033) [2024-06-23 09:56:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 6732103680. Throughput: 0: 42905.4. Samples: 6732189840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 09:56:15,784][15401] Updated weights for policy 0, policy_version 410900 (0.0026) [2024-06-23 09:56:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6732283904. Throughput: 0: 43038.0. Samples: 6732453200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:18,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 09:56:19,178][15401] Updated weights for policy 0, policy_version 410910 (0.0025) [2024-06-23 09:56:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 6732496896. Throughput: 0: 42838.9. Samples: 6732577780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:23,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 09:56:23,550][15401] Updated weights for policy 0, policy_version 410920 (0.0029) [2024-06-23 09:56:27,354][15401] Updated weights for policy 0, policy_version 410930 (0.0029) [2024-06-23 09:56:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6732726272. Throughput: 0: 42832.5. Samples: 6732836280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 09:56:31,286][15401] Updated weights for policy 0, policy_version 410940 (0.0041) [2024-06-23 09:56:33,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6732922880. Throughput: 0: 42861.8. Samples: 6733089060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 09:56:35,085][15401] Updated weights for policy 0, policy_version 410950 (0.0032) [2024-06-23 09:56:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6733135872. Throughput: 0: 42677.0. Samples: 6733211800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 09:56:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 09:56:38,830][15401] Updated weights for policy 0, policy_version 410960 (0.0027) [2024-06-23 09:56:42,897][15401] Updated weights for policy 0, policy_version 410970 (0.0040) [2024-06-23 09:56:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6733332480. Throughput: 0: 42681.8. Samples: 6733470040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:56:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 09:56:46,460][15401] Updated weights for policy 0, policy_version 410980 (0.0032) [2024-06-23 09:56:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6733578240. Throughput: 0: 42668.4. Samples: 6733728580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:56:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 09:56:50,558][15401] Updated weights for policy 0, policy_version 410990 (0.0039) [2024-06-23 09:56:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 6733791232. Throughput: 0: 42726.3. Samples: 6733854220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:56:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 09:56:54,446][15349] Signal inference workers to stop experience collection... (99750 times) [2024-06-23 09:56:54,480][15401] InferenceWorker_p0-w0: stopping experience collection (99750 times) [2024-06-23 09:56:54,502][15349] Signal inference workers to resume experience collection... (99750 times) [2024-06-23 09:56:54,508][15401] InferenceWorker_p0-w0: resuming experience collection (99750 times) [2024-06-23 09:56:54,511][15401] Updated weights for policy 0, policy_version 411000 (0.0027) [2024-06-23 09:56:57,975][15401] Updated weights for policy 0, policy_version 411010 (0.0024) [2024-06-23 09:56:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6734004224. Throughput: 0: 43003.6. Samples: 6734125000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:56:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 09:57:01,969][15401] Updated weights for policy 0, policy_version 411020 (0.0033) [2024-06-23 09:57:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6734200832. Throughput: 0: 42848.0. Samples: 6734381360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:03,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 09:57:05,684][15401] Updated weights for policy 0, policy_version 411030 (0.0023) [2024-06-23 09:57:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6734430208. Throughput: 0: 42816.1. Samples: 6734504500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 09:57:09,479][15401] Updated weights for policy 0, policy_version 411040 (0.0030) [2024-06-23 09:57:13,268][15401] Updated weights for policy 0, policy_version 411050 (0.0026) [2024-06-23 09:57:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6734643200. Throughput: 0: 42932.4. Samples: 6734768240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 09:57:17,081][15401] Updated weights for policy 0, policy_version 411060 (0.0037) [2024-06-23 09:57:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6734872576. Throughput: 0: 42997.2. Samples: 6735023940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:18,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-23 09:57:20,967][15401] Updated weights for policy 0, policy_version 411070 (0.0045) [2024-06-23 09:57:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 6735069184. Throughput: 0: 43114.2. Samples: 6735151940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 09:57:24,587][15401] Updated weights for policy 0, policy_version 411080 (0.0032) [2024-06-23 09:57:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6735282176. Throughput: 0: 43105.8. Samples: 6735409800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 09:57:28,497][15401] Updated weights for policy 0, policy_version 411090 (0.0040) [2024-06-23 09:57:32,101][15401] Updated weights for policy 0, policy_version 411100 (0.0033) [2024-06-23 09:57:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6735511552. Throughput: 0: 43076.5. Samples: 6735667020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:33,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 09:57:36,208][15401] Updated weights for policy 0, policy_version 411110 (0.0037) [2024-06-23 09:57:38,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 6735724544. Throughput: 0: 43213.5. Samples: 6735799100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:38,397][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 09:57:40,060][15401] Updated weights for policy 0, policy_version 411120 (0.0028) [2024-06-23 09:57:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 6735921152. Throughput: 0: 42907.2. Samples: 6736055820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:43,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-23 09:57:43,499][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411129_6735937536.pth... [2024-06-23 09:57:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410500_6725632000.pth [2024-06-23 09:57:43,725][15401] Updated weights for policy 0, policy_version 411130 (0.0025) [2024-06-23 09:57:47,569][15401] Updated weights for policy 0, policy_version 411140 (0.0028) [2024-06-23 09:57:48,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6736134144. Throughput: 0: 42877.7. Samples: 6736310860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 09:57:51,243][15401] Updated weights for policy 0, policy_version 411150 (0.0040) [2024-06-23 09:57:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6736379904. Throughput: 0: 43024.0. Samples: 6736440580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 09:57:55,042][15401] Updated weights for policy 0, policy_version 411160 (0.0038) [2024-06-23 09:57:56,229][15349] Signal inference workers to stop experience collection... (99800 times) [2024-06-23 09:57:56,259][15401] InferenceWorker_p0-w0: stopping experience collection (99800 times) [2024-06-23 09:57:56,281][15349] Signal inference workers to resume experience collection... (99800 times) [2024-06-23 09:57:56,287][15401] InferenceWorker_p0-w0: resuming experience collection (99800 times) [2024-06-23 09:57:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6736560128. Throughput: 0: 42807.9. Samples: 6736694600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:57:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 09:57:59,030][15401] Updated weights for policy 0, policy_version 411170 (0.0034) [2024-06-23 09:58:02,464][15401] Updated weights for policy 0, policy_version 411180 (0.0023) [2024-06-23 09:58:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6736773120. Throughput: 0: 42909.0. Samples: 6736954840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 09:58:03,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 09:58:06,443][15401] Updated weights for policy 0, policy_version 411190 (0.0028) [2024-06-23 09:58:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6737002496. Throughput: 0: 42998.3. Samples: 6737086860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:08,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-23 09:58:09,918][15401] Updated weights for policy 0, policy_version 411200 (0.0034) [2024-06-23 09:58:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6737215488. Throughput: 0: 43137.3. Samples: 6737350980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:13,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 09:58:14,290][15401] Updated weights for policy 0, policy_version 411210 (0.0026) [2024-06-23 09:58:17,581][15401] Updated weights for policy 0, policy_version 411220 (0.0024) [2024-06-23 09:58:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6737444864. Throughput: 0: 43020.4. Samples: 6737602940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 09:58:21,687][15401] Updated weights for policy 0, policy_version 411230 (0.0049) [2024-06-23 09:58:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6737641472. Throughput: 0: 43097.6. Samples: 6737738220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 09:58:25,164][15401] Updated weights for policy 0, policy_version 411240 (0.0035) [2024-06-23 09:58:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6737854464. Throughput: 0: 43023.4. Samples: 6737991880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:28,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 09:58:29,319][15401] Updated weights for policy 0, policy_version 411250 (0.0033) [2024-06-23 09:58:32,827][15401] Updated weights for policy 0, policy_version 411260 (0.0025) [2024-06-23 09:58:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 6738083840. Throughput: 0: 43017.7. Samples: 6738246660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 09:58:37,030][15401] Updated weights for policy 0, policy_version 411270 (0.0051) [2024-06-23 09:58:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 6738296832. Throughput: 0: 43108.8. Samples: 6738380480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 09:58:40,436][15401] Updated weights for policy 0, policy_version 411280 (0.0035) [2024-06-23 09:58:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6738509824. Throughput: 0: 43107.1. Samples: 6738634420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 09:58:44,678][15401] Updated weights for policy 0, policy_version 411290 (0.0037) [2024-06-23 09:58:47,979][15401] Updated weights for policy 0, policy_version 411300 (0.0037) [2024-06-23 09:58:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6738739200. Throughput: 0: 42852.8. Samples: 6738883220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 09:58:52,508][15401] Updated weights for policy 0, policy_version 411310 (0.0030) [2024-06-23 09:58:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 6738919424. Throughput: 0: 42869.3. Samples: 6739015980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 09:58:55,612][15401] Updated weights for policy 0, policy_version 411320 (0.0028) [2024-06-23 09:58:58,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6739132416. Throughput: 0: 42701.7. Samples: 6739272660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:58:58,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 09:59:00,120][15401] Updated weights for policy 0, policy_version 411330 (0.0027) [2024-06-23 09:59:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 6739378176. Throughput: 0: 42680.5. Samples: 6739523560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:59:03,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 09:59:03,622][15401] Updated weights for policy 0, policy_version 411340 (0.0034) [2024-06-23 09:59:07,586][15401] Updated weights for policy 0, policy_version 411350 (0.0034) [2024-06-23 09:59:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6739558400. Throughput: 0: 42751.6. Samples: 6739662040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:59:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 09:59:11,187][15401] Updated weights for policy 0, policy_version 411360 (0.0032) [2024-06-23 09:59:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6739771392. Throughput: 0: 42688.0. Samples: 6739912840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:59:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 09:59:15,303][15401] Updated weights for policy 0, policy_version 411370 (0.0034) [2024-06-23 09:59:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6740017152. Throughput: 0: 42550.7. Samples: 6740161440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:59:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 09:59:18,897][15401] Updated weights for policy 0, policy_version 411380 (0.0043) [2024-06-23 09:59:23,139][15401] Updated weights for policy 0, policy_version 411390 (0.0041) [2024-06-23 09:59:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6740213760. Throughput: 0: 42605.4. Samples: 6740297720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 09:59:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 09:59:26,498][15401] Updated weights for policy 0, policy_version 411400 (0.0029) [2024-06-23 09:59:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6740426752. Throughput: 0: 42560.0. Samples: 6740549620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:28,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-23 09:59:30,717][15401] Updated weights for policy 0, policy_version 411410 (0.0031) [2024-06-23 09:59:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6740639744. Throughput: 0: 42890.7. Samples: 6740813300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 09:59:34,073][15401] Updated weights for policy 0, policy_version 411420 (0.0031) [2024-06-23 09:59:35,691][15349] Signal inference workers to stop experience collection... (99850 times) [2024-06-23 09:59:35,692][15349] Signal inference workers to resume experience collection... (99850 times) [2024-06-23 09:59:35,718][15401] InferenceWorker_p0-w0: stopping experience collection (99850 times) [2024-06-23 09:59:35,718][15401] InferenceWorker_p0-w0: resuming experience collection (99850 times) [2024-06-23 09:59:38,236][15401] Updated weights for policy 0, policy_version 411430 (0.0029) [2024-06-23 09:59:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 6740869120. Throughput: 0: 42816.8. Samples: 6740942740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 09:59:41,736][15401] Updated weights for policy 0, policy_version 411440 (0.0035) [2024-06-23 09:59:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 6741049344. Throughput: 0: 42676.6. Samples: 6741193000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 09:59:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411442_6741065728.pth... [2024-06-23 09:59:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000410816_6730809344.pth [2024-06-23 09:59:45,920][15401] Updated weights for policy 0, policy_version 411450 (0.0028) [2024-06-23 09:59:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6741295104. Throughput: 0: 42903.2. Samples: 6741454200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 09:59:49,401][15401] Updated weights for policy 0, policy_version 411460 (0.0039) [2024-06-23 09:59:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6741491712. Throughput: 0: 42764.9. Samples: 6741586460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 09:59:53,793][15401] Updated weights for policy 0, policy_version 411470 (0.0043) [2024-06-23 09:59:57,163][15401] Updated weights for policy 0, policy_version 411480 (0.0029) [2024-06-23 09:59:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 6741721088. Throughput: 0: 42820.1. Samples: 6741839740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 09:59:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 10:00:01,306][15401] Updated weights for policy 0, policy_version 411490 (0.0039) [2024-06-23 10:00:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 6741950464. Throughput: 0: 43009.3. Samples: 6742096860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 10:00:04,732][15401] Updated weights for policy 0, policy_version 411500 (0.0036) [2024-06-23 10:00:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6742147072. Throughput: 0: 42888.9. Samples: 6742227720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 10:00:08,757][15401] Updated weights for policy 0, policy_version 411510 (0.0032) [2024-06-23 10:00:12,462][15401] Updated weights for policy 0, policy_version 411520 (0.0023) [2024-06-23 10:00:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6742360064. Throughput: 0: 43036.1. Samples: 6742486240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 10:00:16,408][15401] Updated weights for policy 0, policy_version 411530 (0.0041) [2024-06-23 10:00:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 6742589440. Throughput: 0: 42800.9. Samples: 6742739340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:00:20,210][15401] Updated weights for policy 0, policy_version 411540 (0.0032) [2024-06-23 10:00:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6742786048. Throughput: 0: 42829.3. Samples: 6742870060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:00:24,125][15401] Updated weights for policy 0, policy_version 411550 (0.0023) [2024-06-23 10:00:27,716][15401] Updated weights for policy 0, policy_version 411560 (0.0029) [2024-06-23 10:00:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6743015424. Throughput: 0: 43065.7. Samples: 6743130960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:28,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 10:00:31,911][15401] Updated weights for policy 0, policy_version 411570 (0.0021) [2024-06-23 10:00:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 6743244800. Throughput: 0: 42957.3. Samples: 6743387280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 10:00:35,262][15401] Updated weights for policy 0, policy_version 411580 (0.0026) [2024-06-23 10:00:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6743425024. Throughput: 0: 42987.5. Samples: 6743520900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 10:00:38,899][15349] Signal inference workers to stop experience collection... (99900 times) [2024-06-23 10:00:38,927][15401] InferenceWorker_p0-w0: stopping experience collection (99900 times) [2024-06-23 10:00:38,962][15349] Signal inference workers to resume experience collection... (99900 times) [2024-06-23 10:00:38,971][15401] InferenceWorker_p0-w0: resuming experience collection (99900 times) [2024-06-23 10:00:39,678][15401] Updated weights for policy 0, policy_version 411590 (0.0033) [2024-06-23 10:00:42,758][15401] Updated weights for policy 0, policy_version 411600 (0.0038) [2024-06-23 10:00:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6743654400. Throughput: 0: 43054.2. Samples: 6743777180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 10:00:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 10:00:47,153][15401] Updated weights for policy 0, policy_version 411610 (0.0039) [2024-06-23 10:00:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 6743867392. Throughput: 0: 43001.3. Samples: 6744031920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:00:48,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-23 10:00:50,724][15401] Updated weights for policy 0, policy_version 411620 (0.0036) [2024-06-23 10:00:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6744064000. Throughput: 0: 42881.0. Samples: 6744157360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:00:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 10:00:54,897][15401] Updated weights for policy 0, policy_version 411630 (0.0045) [2024-06-23 10:00:58,250][15401] Updated weights for policy 0, policy_version 411640 (0.0033) [2024-06-23 10:00:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6744309760. Throughput: 0: 42843.0. Samples: 6744414180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:00:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 10:01:02,507][15401] Updated weights for policy 0, policy_version 411650 (0.0046) [2024-06-23 10:01:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6744522752. Throughput: 0: 42878.5. Samples: 6744668880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:03,394][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 10:01:05,987][15401] Updated weights for policy 0, policy_version 411660 (0.0036) [2024-06-23 10:01:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6744719360. Throughput: 0: 42817.0. Samples: 6744796820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:01:10,011][15401] Updated weights for policy 0, policy_version 411670 (0.0031) [2024-06-23 10:01:13,390][15132] Fps is (10 sec: 42595.1, 60 sec: 43143.9, 300 sec: 42931.5). Total num frames: 6744948736. Throughput: 0: 42797.9. Samples: 6745056900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:13,391][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 10:01:14,131][15401] Updated weights for policy 0, policy_version 411680 (0.0027) [2024-06-23 10:01:17,735][15401] Updated weights for policy 0, policy_version 411690 (0.0025) [2024-06-23 10:01:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 6745161728. Throughput: 0: 42852.7. Samples: 6745315660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 10:01:21,537][15401] Updated weights for policy 0, policy_version 411700 (0.0028) [2024-06-23 10:01:23,392][15132] Fps is (10 sec: 42591.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 6745374720. Throughput: 0: 42843.1. Samples: 6745448940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:23,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 10:01:25,302][15401] Updated weights for policy 0, policy_version 411710 (0.0041) [2024-06-23 10:01:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6745587712. Throughput: 0: 42852.4. Samples: 6745705540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:28,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 10:01:28,966][15401] Updated weights for policy 0, policy_version 411720 (0.0025) [2024-06-23 10:01:32,797][15401] Updated weights for policy 0, policy_version 411730 (0.0047) [2024-06-23 10:01:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 6745800704. Throughput: 0: 42838.7. Samples: 6745959660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 10:01:36,414][15401] Updated weights for policy 0, policy_version 411740 (0.0035) [2024-06-23 10:01:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6746013696. Throughput: 0: 42972.4. Samples: 6746091120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:38,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 10:01:40,417][15401] Updated weights for policy 0, policy_version 411750 (0.0034) [2024-06-23 10:01:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6746226688. Throughput: 0: 42915.0. Samples: 6746345360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 10:01:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411757_6746226688.pth... [2024-06-23 10:01:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411129_6735937536.pth [2024-06-23 10:01:44,358][15401] Updated weights for policy 0, policy_version 411760 (0.0033) [2024-06-23 10:01:48,225][15401] Updated weights for policy 0, policy_version 411770 (0.0039) [2024-06-23 10:01:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6746439680. Throughput: 0: 43029.8. Samples: 6746605220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 10:01:51,753][15401] Updated weights for policy 0, policy_version 411780 (0.0041) [2024-06-23 10:01:53,391][15132] Fps is (10 sec: 42594.0, 60 sec: 43143.7, 300 sec: 42875.9). Total num frames: 6746652672. Throughput: 0: 43071.7. Samples: 6746735100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:53,391][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 10:01:55,873][15401] Updated weights for policy 0, policy_version 411790 (0.0032) [2024-06-23 10:01:58,394][15132] Fps is (10 sec: 44215.2, 60 sec: 42867.9, 300 sec: 42986.4). Total num frames: 6746882048. Throughput: 0: 43061.0. Samples: 6746994820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:01:58,395][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 10:01:59,107][15349] Signal inference workers to stop experience collection... (99950 times) [2024-06-23 10:01:59,107][15349] Signal inference workers to resume experience collection... (99950 times) [2024-06-23 10:01:59,128][15401] InferenceWorker_p0-w0: stopping experience collection (99950 times) [2024-06-23 10:01:59,129][15401] InferenceWorker_p0-w0: resuming experience collection (99950 times) [2024-06-23 10:01:59,257][15401] Updated weights for policy 0, policy_version 411800 (0.0036) [2024-06-23 10:02:03,390][15132] Fps is (10 sec: 42603.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6747078656. Throughput: 0: 43025.4. Samples: 6747251800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-23 10:02:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 10:02:03,624][15401] Updated weights for policy 0, policy_version 411810 (0.0039) [2024-06-23 10:02:06,741][15401] Updated weights for policy 0, policy_version 411820 (0.0040) [2024-06-23 10:02:08,389][15132] Fps is (10 sec: 40980.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6747291648. Throughput: 0: 42954.9. Samples: 6747381800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 10:02:11,064][15401] Updated weights for policy 0, policy_version 411830 (0.0037) [2024-06-23 10:02:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42872.1, 300 sec: 42876.1). Total num frames: 6747521024. Throughput: 0: 42992.9. Samples: 6747640220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 10:02:14,268][15401] Updated weights for policy 0, policy_version 411840 (0.0044) [2024-06-23 10:02:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6747734016. Throughput: 0: 43062.3. Samples: 6747897460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 10:02:18,752][15401] Updated weights for policy 0, policy_version 411850 (0.0046) [2024-06-23 10:02:21,978][15401] Updated weights for policy 0, policy_version 411860 (0.0046) [2024-06-23 10:02:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 6747947008. Throughput: 0: 42897.3. Samples: 6748021500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 10:02:26,372][15401] Updated weights for policy 0, policy_version 411870 (0.0038) [2024-06-23 10:02:28,395][15132] Fps is (10 sec: 40936.4, 60 sec: 42594.3, 300 sec: 42819.7). Total num frames: 6748143616. Throughput: 0: 43006.2. Samples: 6748280880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:28,396][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 10:02:29,492][15401] Updated weights for policy 0, policy_version 411880 (0.0040) [2024-06-23 10:02:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 6748356608. Throughput: 0: 43055.6. Samples: 6748542720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 10:02:34,135][15401] Updated weights for policy 0, policy_version 411890 (0.0035) [2024-06-23 10:02:37,080][15401] Updated weights for policy 0, policy_version 411900 (0.0040) [2024-06-23 10:02:38,389][15132] Fps is (10 sec: 45902.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 6748602368. Throughput: 0: 42960.8. Samples: 6748668280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 10:02:41,903][15401] Updated weights for policy 0, policy_version 411910 (0.0034) [2024-06-23 10:02:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 6748798976. Throughput: 0: 42957.2. Samples: 6748927680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 10:02:45,019][15401] Updated weights for policy 0, policy_version 411920 (0.0032) [2024-06-23 10:02:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6748995584. Throughput: 0: 43033.9. Samples: 6749188320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 10:02:49,392][15401] Updated weights for policy 0, policy_version 411930 (0.0030) [2024-06-23 10:02:52,491][15401] Updated weights for policy 0, policy_version 411940 (0.0036) [2024-06-23 10:02:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43418.5, 300 sec: 43042.7). Total num frames: 6749257728. Throughput: 0: 42915.9. Samples: 6749313020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 10:02:56,851][15401] Updated weights for policy 0, policy_version 411950 (0.0054) [2024-06-23 10:02:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42875.0, 300 sec: 42987.2). Total num frames: 6749454336. Throughput: 0: 42931.0. Samples: 6749572120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:02:58,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 10:03:00,009][15401] Updated weights for policy 0, policy_version 411960 (0.0045) [2024-06-23 10:03:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 6749634560. Throughput: 0: 43106.7. Samples: 6749837260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:03:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 10:03:04,602][15401] Updated weights for policy 0, policy_version 411970 (0.0025) [2024-06-23 10:03:07,485][15401] Updated weights for policy 0, policy_version 411980 (0.0031) [2024-06-23 10:03:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43688.8, 300 sec: 43042.4). Total num frames: 6749913088. Throughput: 0: 43235.6. Samples: 6749967200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:03:08,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 10:03:12,034][15401] Updated weights for policy 0, policy_version 411990 (0.0033) [2024-06-23 10:03:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6750076928. Throughput: 0: 43254.3. Samples: 6750227080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:03:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 10:03:15,110][15401] Updated weights for policy 0, policy_version 412000 (0.0028) [2024-06-23 10:03:18,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6750289920. Throughput: 0: 43164.0. Samples: 6750485100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:03:18,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 10:03:19,661][15401] Updated weights for policy 0, policy_version 412010 (0.0032) [2024-06-23 10:03:22,606][15401] Updated weights for policy 0, policy_version 412020 (0.0036) [2024-06-23 10:03:23,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 6750552064. Throughput: 0: 43187.6. Samples: 6750611720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 10:03:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 10:03:27,262][15401] Updated weights for policy 0, policy_version 412030 (0.0036) [2024-06-23 10:03:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43148.6, 300 sec: 42876.1). Total num frames: 6750732288. Throughput: 0: 43146.9. Samples: 6750869300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 10:03:30,594][15401] Updated weights for policy 0, policy_version 412040 (0.0023) [2024-06-23 10:03:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6750945280. Throughput: 0: 43048.4. Samples: 6751125500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 10:03:34,934][15401] Updated weights for policy 0, policy_version 412050 (0.0033) [2024-06-23 10:03:37,171][15349] Signal inference workers to stop experience collection... (100000 times) [2024-06-23 10:03:37,223][15401] InferenceWorker_p0-w0: stopping experience collection (100000 times) [2024-06-23 10:03:37,231][15349] Signal inference workers to resume experience collection... (100000 times) [2024-06-23 10:03:37,241][15401] InferenceWorker_p0-w0: resuming experience collection (100000 times) [2024-06-23 10:03:38,110][15401] Updated weights for policy 0, policy_version 412060 (0.0029) [2024-06-23 10:03:38,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 6751191040. Throughput: 0: 43102.2. Samples: 6751252620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 10:03:42,392][15401] Updated weights for policy 0, policy_version 412070 (0.0037) [2024-06-23 10:03:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6751371264. Throughput: 0: 43162.2. Samples: 6751514420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 10:03:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412073_6751404032.pth... [2024-06-23 10:03:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411442_6741065728.pth [2024-06-23 10:03:45,627][15401] Updated weights for policy 0, policy_version 412080 (0.0034) [2024-06-23 10:03:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 6751600640. Throughput: 0: 42995.9. Samples: 6751772080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 10:03:50,003][15401] Updated weights for policy 0, policy_version 412090 (0.0036) [2024-06-23 10:03:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 6751830016. Throughput: 0: 42998.7. Samples: 6751902040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 10:03:53,918][15401] Updated weights for policy 0, policy_version 412100 (0.0038) [2024-06-23 10:03:57,600][15401] Updated weights for policy 0, policy_version 412110 (0.0027) [2024-06-23 10:03:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6752026624. Throughput: 0: 42790.8. Samples: 6752152660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:03:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 10:04:01,485][15401] Updated weights for policy 0, policy_version 412120 (0.0035) [2024-06-23 10:04:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6752239616. Throughput: 0: 42796.6. Samples: 6752410940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 10:04:05,012][15401] Updated weights for policy 0, policy_version 412130 (0.0027) [2024-06-23 10:04:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 43042.7). Total num frames: 6752468992. Throughput: 0: 42839.9. Samples: 6752539520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 10:04:09,069][15401] Updated weights for policy 0, policy_version 412140 (0.0030) [2024-06-23 10:04:12,689][15401] Updated weights for policy 0, policy_version 412150 (0.0032) [2024-06-23 10:04:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6752681984. Throughput: 0: 42755.5. Samples: 6752793300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 10:04:16,681][15401] Updated weights for policy 0, policy_version 412160 (0.0031) [2024-06-23 10:04:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6752878592. Throughput: 0: 42773.3. Samples: 6753050300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:18,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 10:04:20,338][15401] Updated weights for policy 0, policy_version 412170 (0.0029) [2024-06-23 10:04:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 6753091584. Throughput: 0: 42853.4. Samples: 6753181020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:04:24,183][15401] Updated weights for policy 0, policy_version 412180 (0.0033) [2024-06-23 10:04:27,802][15401] Updated weights for policy 0, policy_version 412190 (0.0043) [2024-06-23 10:04:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 6753337344. Throughput: 0: 42750.2. Samples: 6753438180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 10:04:32,224][15401] Updated weights for policy 0, policy_version 412200 (0.0032) [2024-06-23 10:04:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6753533952. Throughput: 0: 42675.3. Samples: 6753692460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 10:04:35,357][15401] Updated weights for policy 0, policy_version 412210 (0.0042) [2024-06-23 10:04:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 6753730560. Throughput: 0: 42561.4. Samples: 6753817300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:38,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 10:04:39,733][15349] Signal inference workers to stop experience collection... (100050 times) [2024-06-23 10:04:39,733][15349] Signal inference workers to resume experience collection... (100050 times) [2024-06-23 10:04:39,775][15401] InferenceWorker_p0-w0: stopping experience collection (100050 times) [2024-06-23 10:04:39,775][15401] InferenceWorker_p0-w0: resuming experience collection (100050 times) [2024-06-23 10:04:39,867][15401] Updated weights for policy 0, policy_version 412220 (0.0046) [2024-06-23 10:04:43,362][15401] Updated weights for policy 0, policy_version 412230 (0.0029) [2024-06-23 10:04:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6753976320. Throughput: 0: 42755.8. Samples: 6754076680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 10:04:47,478][15401] Updated weights for policy 0, policy_version 412240 (0.0043) [2024-06-23 10:04:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 6754156544. Throughput: 0: 42858.6. Samples: 6754339580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:04:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 10:04:50,906][15401] Updated weights for policy 0, policy_version 412250 (0.0035) [2024-06-23 10:04:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6754369536. Throughput: 0: 42668.4. Samples: 6754459600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:04:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 10:04:55,235][15401] Updated weights for policy 0, policy_version 412260 (0.0037) [2024-06-23 10:04:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6754598912. Throughput: 0: 42687.2. Samples: 6754714220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:04:58,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 10:04:58,677][15401] Updated weights for policy 0, policy_version 412270 (0.0032) [2024-06-23 10:05:02,860][15401] Updated weights for policy 0, policy_version 412280 (0.0042) [2024-06-23 10:05:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6754795520. Throughput: 0: 42817.0. Samples: 6754977060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 10:05:06,294][15401] Updated weights for policy 0, policy_version 412290 (0.0041) [2024-06-23 10:05:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6755008512. Throughput: 0: 42707.4. Samples: 6755102860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:08,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 10:05:10,474][15401] Updated weights for policy 0, policy_version 412300 (0.0035) [2024-06-23 10:05:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6755254272. Throughput: 0: 42632.9. Samples: 6755356660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 10:05:13,980][15401] Updated weights for policy 0, policy_version 412310 (0.0036) [2024-06-23 10:05:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6755434496. Throughput: 0: 42855.6. Samples: 6755620960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 10:05:18,483][15401] Updated weights for policy 0, policy_version 412320 (0.0039) [2024-06-23 10:05:21,655][15401] Updated weights for policy 0, policy_version 412330 (0.0037) [2024-06-23 10:05:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 6755663872. Throughput: 0: 42799.0. Samples: 6755743360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:23,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 10:05:26,301][15401] Updated weights for policy 0, policy_version 412340 (0.0033) [2024-06-23 10:05:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6755893248. Throughput: 0: 42707.6. Samples: 6755998520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 10:05:29,279][15401] Updated weights for policy 0, policy_version 412350 (0.0028) [2024-06-23 10:05:33,396][15132] Fps is (10 sec: 37668.2, 60 sec: 41774.7, 300 sec: 42764.1). Total num frames: 6756040704. Throughput: 0: 42790.8. Samples: 6756265440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:33,397][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 10:05:33,931][15401] Updated weights for policy 0, policy_version 412360 (0.0033) [2024-06-23 10:05:36,913][15401] Updated weights for policy 0, policy_version 412370 (0.0032) [2024-06-23 10:05:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 6756319232. Throughput: 0: 42698.6. Samples: 6756381140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:38,392][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 10:05:41,623][15401] Updated weights for policy 0, policy_version 412380 (0.0035) [2024-06-23 10:05:43,389][15132] Fps is (10 sec: 49183.7, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 6756532224. Throughput: 0: 42726.7. Samples: 6756636920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 10:05:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412386_6756532224.pth... [2024-06-23 10:05:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000411757_6746226688.pth [2024-06-23 10:05:44,646][15401] Updated weights for policy 0, policy_version 412390 (0.0037) [2024-06-23 10:05:48,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6756696064. Throughput: 0: 42867.2. Samples: 6756906080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 10:05:49,196][15349] Signal inference workers to stop experience collection... (100100 times) [2024-06-23 10:05:49,197][15349] Signal inference workers to resume experience collection... (100100 times) [2024-06-23 10:05:49,232][15401] InferenceWorker_p0-w0: stopping experience collection (100100 times) [2024-06-23 10:05:49,232][15401] InferenceWorker_p0-w0: resuming experience collection (100100 times) [2024-06-23 10:05:49,344][15401] Updated weights for policy 0, policy_version 412400 (0.0038) [2024-06-23 10:05:52,235][15401] Updated weights for policy 0, policy_version 412410 (0.0034) [2024-06-23 10:05:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6756974592. Throughput: 0: 42576.9. Samples: 6757018820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 10:05:57,116][15401] Updated weights for policy 0, policy_version 412420 (0.0037) [2024-06-23 10:05:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6757154816. Throughput: 0: 42691.3. Samples: 6757277760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:05:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 10:05:59,798][15401] Updated weights for policy 0, policy_version 412430 (0.0032) [2024-06-23 10:06:03,393][15132] Fps is (10 sec: 34393.8, 60 sec: 42049.7, 300 sec: 42708.9). Total num frames: 6757318656. Throughput: 0: 42671.9. Samples: 6757541360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:06:03,394][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 10:06:04,593][15401] Updated weights for policy 0, policy_version 412440 (0.0031) [2024-06-23 10:06:07,818][15401] Updated weights for policy 0, policy_version 412450 (0.0033) [2024-06-23 10:06:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.2). Total num frames: 6757597184. Throughput: 0: 42542.3. Samples: 6757657660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 10:06:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 10:06:12,159][15401] Updated weights for policy 0, policy_version 412460 (0.0040) [2024-06-23 10:06:13,396][15132] Fps is (10 sec: 49138.8, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 6757810176. Throughput: 0: 42716.2. Samples: 6757921020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:13,396][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 10:06:15,630][15401] Updated weights for policy 0, policy_version 412470 (0.0037) [2024-06-23 10:06:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 6757974016. Throughput: 0: 42663.9. Samples: 6758185040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 10:06:19,930][15401] Updated weights for policy 0, policy_version 412480 (0.0032) [2024-06-23 10:06:23,207][15401] Updated weights for policy 0, policy_version 412490 (0.0029) [2024-06-23 10:06:23,390][15132] Fps is (10 sec: 42624.9, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 6758236160. Throughput: 0: 42759.9. Samples: 6758305240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 10:06:27,495][15401] Updated weights for policy 0, policy_version 412500 (0.0025) [2024-06-23 10:06:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6758449152. Throughput: 0: 43032.4. Samples: 6758573380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 10:06:30,711][15401] Updated weights for policy 0, policy_version 412510 (0.0028) [2024-06-23 10:06:33,392][15132] Fps is (10 sec: 40950.8, 60 sec: 43420.5, 300 sec: 42820.2). Total num frames: 6758645760. Throughput: 0: 42753.2. Samples: 6758830080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:33,393][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 10:06:35,144][15401] Updated weights for policy 0, policy_version 412520 (0.0036) [2024-06-23 10:06:38,300][15401] Updated weights for policy 0, policy_version 412530 (0.0032) [2024-06-23 10:06:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 6758891520. Throughput: 0: 42833.0. Samples: 6758946300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:38,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 10:06:42,818][15401] Updated weights for policy 0, policy_version 412540 (0.0038) [2024-06-23 10:06:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6759088128. Throughput: 0: 43087.5. Samples: 6759216700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 10:06:45,829][15401] Updated weights for policy 0, policy_version 412550 (0.0036) [2024-06-23 10:06:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42876.3). Total num frames: 6759301120. Throughput: 0: 42880.8. Samples: 6759470840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 10:06:50,545][15401] Updated weights for policy 0, policy_version 412560 (0.0043) [2024-06-23 10:06:53,347][15401] Updated weights for policy 0, policy_version 412570 (0.0038) [2024-06-23 10:06:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42932.4). Total num frames: 6759546880. Throughput: 0: 43049.3. Samples: 6759594880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 10:06:57,999][15401] Updated weights for policy 0, policy_version 412580 (0.0027) [2024-06-23 10:06:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6759727104. Throughput: 0: 43067.0. Samples: 6759858760. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:06:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 10:07:01,118][15401] Updated weights for policy 0, policy_version 412590 (0.0033) [2024-06-23 10:07:03,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43420.3, 300 sec: 42820.5). Total num frames: 6759923712. Throughput: 0: 42928.5. Samples: 6760116820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 10:07:05,629][15401] Updated weights for policy 0, policy_version 412600 (0.0029) [2024-06-23 10:07:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6760169472. Throughput: 0: 43062.5. Samples: 6760243040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:07:08,663][15401] Updated weights for policy 0, policy_version 412610 (0.0040) [2024-06-23 10:07:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42329.7, 300 sec: 42765.0). Total num frames: 6760349696. Throughput: 0: 42775.8. Samples: 6760498300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:13,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-23 10:07:13,576][15401] Updated weights for policy 0, policy_version 412620 (0.0038) [2024-06-23 10:07:15,970][15349] Signal inference workers to stop experience collection... (100150 times) [2024-06-23 10:07:15,970][15349] Signal inference workers to resume experience collection... (100150 times) [2024-06-23 10:07:15,978][15401] InferenceWorker_p0-w0: stopping experience collection (100150 times) [2024-06-23 10:07:15,990][15401] InferenceWorker_p0-w0: resuming experience collection (100150 times) [2024-06-23 10:07:16,330][15401] Updated weights for policy 0, policy_version 412630 (0.0043) [2024-06-23 10:07:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 6760579072. Throughput: 0: 42647.6. Samples: 6760749120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 10:07:21,200][15401] Updated weights for policy 0, policy_version 412640 (0.0040) [2024-06-23 10:07:23,392][15132] Fps is (10 sec: 45865.0, 60 sec: 42869.9, 300 sec: 42932.1). Total num frames: 6760808448. Throughput: 0: 42941.2. Samples: 6760878760. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:23,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 10:07:23,925][15401] Updated weights for policy 0, policy_version 412650 (0.0029) [2024-06-23 10:07:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6760988672. Throughput: 0: 42651.1. Samples: 6761136000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-23 10:07:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 10:07:28,650][15401] Updated weights for policy 0, policy_version 412660 (0.0026) [2024-06-23 10:07:31,893][15401] Updated weights for policy 0, policy_version 412670 (0.0029) [2024-06-23 10:07:33,391][15132] Fps is (10 sec: 40963.2, 60 sec: 42872.0, 300 sec: 42764.8). Total num frames: 6761218048. Throughput: 0: 42713.6. Samples: 6761393020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:33,392][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 10:07:36,824][15401] Updated weights for policy 0, policy_version 412680 (0.0025) [2024-06-23 10:07:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6761447424. Throughput: 0: 42850.7. Samples: 6761523160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 10:07:39,450][15401] Updated weights for policy 0, policy_version 412690 (0.0035) [2024-06-23 10:07:43,389][15132] Fps is (10 sec: 42605.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6761644032. Throughput: 0: 42672.0. Samples: 6761779000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 10:07:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412699_6761660416.pth... [2024-06-23 10:07:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412073_6751404032.pth [2024-06-23 10:07:44,302][15401] Updated weights for policy 0, policy_version 412700 (0.0033) [2024-06-23 10:07:46,993][15401] Updated weights for policy 0, policy_version 412710 (0.0043) [2024-06-23 10:07:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6761873408. Throughput: 0: 42775.1. Samples: 6762041700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 10:07:51,976][15401] Updated weights for policy 0, policy_version 412720 (0.0033) [2024-06-23 10:07:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6762102784. Throughput: 0: 42762.9. Samples: 6762167380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 10:07:54,546][15401] Updated weights for policy 0, policy_version 412730 (0.0033) [2024-06-23 10:07:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6762299392. Throughput: 0: 42789.1. Samples: 6762423800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:07:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 10:07:59,476][15401] Updated weights for policy 0, policy_version 412740 (0.0028) [2024-06-23 10:08:02,213][15401] Updated weights for policy 0, policy_version 412750 (0.0045) [2024-06-23 10:08:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 6762512384. Throughput: 0: 42942.2. Samples: 6762681520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:03,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 10:08:06,935][15401] Updated weights for policy 0, policy_version 412760 (0.0026) [2024-06-23 10:08:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6762708992. Throughput: 0: 42920.1. Samples: 6762810060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:08:10,135][15401] Updated weights for policy 0, policy_version 412770 (0.0031) [2024-06-23 10:08:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 6762954752. Throughput: 0: 42986.0. Samples: 6763070380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 10:08:14,573][15401] Updated weights for policy 0, policy_version 412780 (0.0033) [2024-06-23 10:08:17,649][15401] Updated weights for policy 0, policy_version 412790 (0.0045) [2024-06-23 10:08:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6763167744. Throughput: 0: 42890.8. Samples: 6763323040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 10:08:22,051][15401] Updated weights for policy 0, policy_version 412800 (0.0028) [2024-06-23 10:08:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 6763364352. Throughput: 0: 42983.1. Samples: 6763457400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 10:08:25,154][15401] Updated weights for policy 0, policy_version 412810 (0.0035) [2024-06-23 10:08:26,734][15349] Signal inference workers to stop experience collection... (100200 times) [2024-06-23 10:08:26,735][15349] Signal inference workers to resume experience collection... (100200 times) [2024-06-23 10:08:26,779][15401] InferenceWorker_p0-w0: stopping experience collection (100200 times) [2024-06-23 10:08:26,779][15401] InferenceWorker_p0-w0: resuming experience collection (100200 times) [2024-06-23 10:08:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 6763610112. Throughput: 0: 43014.7. Samples: 6763714660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 10:08:29,473][15401] Updated weights for policy 0, policy_version 412820 (0.0043) [2024-06-23 10:08:32,669][15401] Updated weights for policy 0, policy_version 412830 (0.0041) [2024-06-23 10:08:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43418.7, 300 sec: 42820.6). Total num frames: 6763823104. Throughput: 0: 43092.8. Samples: 6763980880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 10:08:37,512][15401] Updated weights for policy 0, policy_version 412840 (0.0028) [2024-06-23 10:08:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6764019712. Throughput: 0: 43200.6. Samples: 6764111400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 10:08:40,134][15401] Updated weights for policy 0, policy_version 412850 (0.0031) [2024-06-23 10:08:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 6764265472. Throughput: 0: 43210.1. Samples: 6764368260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 10:08:44,937][15401] Updated weights for policy 0, policy_version 412860 (0.0025) [2024-06-23 10:08:47,997][15401] Updated weights for policy 0, policy_version 412870 (0.0037) [2024-06-23 10:08:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 6764494848. Throughput: 0: 43361.3. Samples: 6764632780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 10:08:52,381][15401] Updated weights for policy 0, policy_version 412880 (0.0041) [2024-06-23 10:08:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6764658688. Throughput: 0: 43332.3. Samples: 6764760020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-23 10:08:53,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 10:08:55,572][15401] Updated weights for policy 0, policy_version 412890 (0.0031) [2024-06-23 10:08:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6764888064. Throughput: 0: 43150.8. Samples: 6765012160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:08:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 10:08:59,834][15401] Updated weights for policy 0, policy_version 412900 (0.0038) [2024-06-23 10:09:03,311][15401] Updated weights for policy 0, policy_version 412910 (0.0025) [2024-06-23 10:09:03,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 6765117440. Throughput: 0: 43283.7. Samples: 6765270800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 10:09:07,297][15401] Updated weights for policy 0, policy_version 412920 (0.0040) [2024-06-23 10:09:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6765297664. Throughput: 0: 43150.6. Samples: 6765399180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 10:09:10,918][15401] Updated weights for policy 0, policy_version 412930 (0.0038) [2024-06-23 10:09:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 6765543424. Throughput: 0: 43168.4. Samples: 6765657240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 10:09:14,815][15401] Updated weights for policy 0, policy_version 412940 (0.0034) [2024-06-23 10:09:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6765740032. Throughput: 0: 43126.6. Samples: 6765921580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:18,399][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 10:09:18,906][15401] Updated weights for policy 0, policy_version 412950 (0.0028) [2024-06-23 10:09:22,422][15401] Updated weights for policy 0, policy_version 412960 (0.0031) [2024-06-23 10:09:23,393][15132] Fps is (10 sec: 40946.6, 60 sec: 43142.2, 300 sec: 42764.6). Total num frames: 6765953024. Throughput: 0: 42868.1. Samples: 6766040600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:23,393][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 10:09:26,495][15401] Updated weights for policy 0, policy_version 412970 (0.0027) [2024-06-23 10:09:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 6766198784. Throughput: 0: 42829.3. Samples: 6766295580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 10:09:29,911][15401] Updated weights for policy 0, policy_version 412980 (0.0031) [2024-06-23 10:09:33,389][15132] Fps is (10 sec: 40973.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6766362624. Throughput: 0: 42904.6. Samples: 6766563480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 10:09:34,116][15401] Updated weights for policy 0, policy_version 412990 (0.0024) [2024-06-23 10:09:34,226][15349] Signal inference workers to stop experience collection... (100250 times) [2024-06-23 10:09:34,253][15401] InferenceWorker_p0-w0: stopping experience collection (100250 times) [2024-06-23 10:09:34,287][15349] Signal inference workers to resume experience collection... (100250 times) [2024-06-23 10:09:34,288][15401] InferenceWorker_p0-w0: resuming experience collection (100250 times) [2024-06-23 10:09:37,619][15401] Updated weights for policy 0, policy_version 413000 (0.0033) [2024-06-23 10:09:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6766608384. Throughput: 0: 42695.3. Samples: 6766681300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 10:09:41,735][15401] Updated weights for policy 0, policy_version 413010 (0.0037) [2024-06-23 10:09:43,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 6766854144. Throughput: 0: 42794.1. Samples: 6766937900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 10:09:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413016_6766854144.pth... [2024-06-23 10:09:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412386_6756532224.pth [2024-06-23 10:09:45,298][15401] Updated weights for policy 0, policy_version 413020 (0.0037) [2024-06-23 10:09:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 6767017984. Throughput: 0: 42953.3. Samples: 6767203700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 10:09:49,239][15401] Updated weights for policy 0, policy_version 413030 (0.0042) [2024-06-23 10:09:52,726][15401] Updated weights for policy 0, policy_version 413040 (0.0035) [2024-06-23 10:09:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6767263744. Throughput: 0: 42742.9. Samples: 6767322620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:53,391][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 10:09:56,824][15401] Updated weights for policy 0, policy_version 413050 (0.0031) [2024-06-23 10:09:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 6767493120. Throughput: 0: 42911.5. Samples: 6767588260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:09:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:10:00,224][15401] Updated weights for policy 0, policy_version 413060 (0.0029) [2024-06-23 10:10:03,392][15132] Fps is (10 sec: 39313.0, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 6767656960. Throughput: 0: 42954.7. Samples: 6767854640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:10:03,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:10:04,443][15401] Updated weights for policy 0, policy_version 413070 (0.0046) [2024-06-23 10:10:07,902][15401] Updated weights for policy 0, policy_version 413080 (0.0035) [2024-06-23 10:10:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43688.9, 300 sec: 42931.3). Total num frames: 6767919104. Throughput: 0: 42923.0. Samples: 6767972100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:10:08,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 10:10:12,206][15401] Updated weights for policy 0, policy_version 413090 (0.0031) [2024-06-23 10:10:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 6768115712. Throughput: 0: 42965.0. Samples: 6768229000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 10:10:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 10:10:15,454][15401] Updated weights for policy 0, policy_version 413100 (0.0043) [2024-06-23 10:10:18,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 6768295936. Throughput: 0: 42769.6. Samples: 6768488120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:18,394][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 10:10:19,924][15401] Updated weights for policy 0, policy_version 413110 (0.0033) [2024-06-23 10:10:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.9, 300 sec: 42876.1). Total num frames: 6768541696. Throughput: 0: 42768.8. Samples: 6768605900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 10:10:23,418][15401] Updated weights for policy 0, policy_version 413120 (0.0047) [2024-06-23 10:10:27,752][15401] Updated weights for policy 0, policy_version 413130 (0.0038) [2024-06-23 10:10:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 43099.2). Total num frames: 6768754688. Throughput: 0: 42901.9. Samples: 6768868480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 10:10:31,049][15401] Updated weights for policy 0, policy_version 413140 (0.0045) [2024-06-23 10:10:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 6768934912. Throughput: 0: 42565.3. Samples: 6769119140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 10:10:35,406][15401] Updated weights for policy 0, policy_version 413150 (0.0044) [2024-06-23 10:10:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6769180672. Throughput: 0: 42714.9. Samples: 6769244780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 10:10:38,701][15401] Updated weights for policy 0, policy_version 413160 (0.0033) [2024-06-23 10:10:42,921][15401] Updated weights for policy 0, policy_version 413170 (0.0038) [2024-06-23 10:10:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 6769393664. Throughput: 0: 42742.7. Samples: 6769511680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:43,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 10:10:46,641][15401] Updated weights for policy 0, policy_version 413180 (0.0038) [2024-06-23 10:10:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6769590272. Throughput: 0: 42304.4. Samples: 6769758240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 10:10:50,719][15401] Updated weights for policy 0, policy_version 413190 (0.0042) [2024-06-23 10:10:53,100][15349] Signal inference workers to stop experience collection... (100300 times) [2024-06-23 10:10:53,154][15401] InferenceWorker_p0-w0: stopping experience collection (100300 times) [2024-06-23 10:10:53,159][15349] Signal inference workers to resume experience collection... (100300 times) [2024-06-23 10:10:53,166][15401] InferenceWorker_p0-w0: resuming experience collection (100300 times) [2024-06-23 10:10:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42987.1). Total num frames: 6769836032. Throughput: 0: 42463.0. Samples: 6769882840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 10:10:54,250][15401] Updated weights for policy 0, policy_version 413200 (0.0029) [2024-06-23 10:10:58,271][15401] Updated weights for policy 0, policy_version 413210 (0.0037) [2024-06-23 10:10:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 43098.8). Total num frames: 6770032640. Throughput: 0: 42556.4. Samples: 6770144040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:10:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 10:11:01,971][15401] Updated weights for policy 0, policy_version 413220 (0.0031) [2024-06-23 10:11:03,392][15132] Fps is (10 sec: 40950.9, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 6770245632. Throughput: 0: 42379.6. Samples: 6770395300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:03,392][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 10:11:06,156][15401] Updated weights for policy 0, policy_version 413230 (0.0035) [2024-06-23 10:11:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42877.0). Total num frames: 6770458624. Throughput: 0: 42655.5. Samples: 6770525400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 10:11:10,092][15401] Updated weights for policy 0, policy_version 413240 (0.0028) [2024-06-23 10:11:13,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 6770655232. Throughput: 0: 42612.4. Samples: 6770786040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 10:11:13,807][15401] Updated weights for policy 0, policy_version 413250 (0.0040) [2024-06-23 10:11:17,639][15401] Updated weights for policy 0, policy_version 413260 (0.0035) [2024-06-23 10:11:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 6770884608. Throughput: 0: 42584.9. Samples: 6771035460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 10:11:21,360][15401] Updated weights for policy 0, policy_version 413270 (0.0037) [2024-06-23 10:11:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6771097600. Throughput: 0: 42783.5. Samples: 6771170040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 10:11:25,119][15401] Updated weights for policy 0, policy_version 413280 (0.0039) [2024-06-23 10:11:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42820.9). Total num frames: 6771277824. Throughput: 0: 42439.0. Samples: 6771421440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:28,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 10:11:29,355][15401] Updated weights for policy 0, policy_version 413290 (0.0038) [2024-06-23 10:11:32,690][15401] Updated weights for policy 0, policy_version 413300 (0.0030) [2024-06-23 10:11:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6771523584. Throughput: 0: 42490.7. Samples: 6771670320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-23 10:11:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 10:11:37,113][15401] Updated weights for policy 0, policy_version 413310 (0.0029) [2024-06-23 10:11:38,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 6771720192. Throughput: 0: 42618.1. Samples: 6771800920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:11:38,396][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 10:11:40,680][15401] Updated weights for policy 0, policy_version 413320 (0.0036) [2024-06-23 10:11:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 6771933184. Throughput: 0: 42444.6. Samples: 6772054040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:11:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 10:11:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413327_6771949568.pth... [2024-06-23 10:11:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000412699_6761660416.pth [2024-06-23 10:11:44,818][15401] Updated weights for policy 0, policy_version 413330 (0.0031) [2024-06-23 10:11:48,225][15401] Updated weights for policy 0, policy_version 413340 (0.0037) [2024-06-23 10:11:48,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6772162560. Throughput: 0: 42464.0. Samples: 6772306080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:11:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 10:11:52,351][15401] Updated weights for policy 0, policy_version 413350 (0.0032) [2024-06-23 10:11:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6772375552. Throughput: 0: 42567.6. Samples: 6772440940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:11:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 10:11:55,738][15401] Updated weights for policy 0, policy_version 413360 (0.0040) [2024-06-23 10:11:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 6772572160. Throughput: 0: 42454.3. Samples: 6772696580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:11:58,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 10:11:59,947][15401] Updated weights for policy 0, policy_version 413370 (0.0024) [2024-06-23 10:12:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 6772785152. Throughput: 0: 42545.7. Samples: 6772950020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 10:12:03,709][15401] Updated weights for policy 0, policy_version 413380 (0.0032) [2024-06-23 10:12:07,573][15401] Updated weights for policy 0, policy_version 413390 (0.0044) [2024-06-23 10:12:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6772998144. Throughput: 0: 42347.2. Samples: 6773075660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 10:12:11,643][15401] Updated weights for policy 0, policy_version 413400 (0.0043) [2024-06-23 10:12:12,961][15349] Signal inference workers to stop experience collection... (100350 times) [2024-06-23 10:12:12,996][15401] InferenceWorker_p0-w0: stopping experience collection (100350 times) [2024-06-23 10:12:13,015][15349] Signal inference workers to resume experience collection... (100350 times) [2024-06-23 10:12:13,016][15401] InferenceWorker_p0-w0: resuming experience collection (100350 times) [2024-06-23 10:12:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6773211136. Throughput: 0: 42404.1. Samples: 6773329620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 10:12:15,271][15401] Updated weights for policy 0, policy_version 413410 (0.0034) [2024-06-23 10:12:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 6773424128. Throughput: 0: 42624.0. Samples: 6773588400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:18,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 10:12:19,267][15401] Updated weights for policy 0, policy_version 413420 (0.0023) [2024-06-23 10:12:23,174][15401] Updated weights for policy 0, policy_version 413430 (0.0035) [2024-06-23 10:12:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 6773637120. Throughput: 0: 42566.5. Samples: 6773716140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 10:12:26,997][15401] Updated weights for policy 0, policy_version 413440 (0.0034) [2024-06-23 10:12:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42820.8). Total num frames: 6773850112. Throughput: 0: 42560.4. Samples: 6773969260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 10:12:30,952][15401] Updated weights for policy 0, policy_version 413450 (0.0028) [2024-06-23 10:12:33,392][15132] Fps is (10 sec: 42589.7, 60 sec: 42323.9, 300 sec: 42764.7). Total num frames: 6774063104. Throughput: 0: 42727.0. Samples: 6774228880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:33,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 10:12:34,670][15401] Updated weights for policy 0, policy_version 413460 (0.0032) [2024-06-23 10:12:38,393][15132] Fps is (10 sec: 40945.6, 60 sec: 42327.4, 300 sec: 42764.5). Total num frames: 6774259712. Throughput: 0: 42528.7. Samples: 6774354880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:38,393][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 10:12:38,987][15401] Updated weights for policy 0, policy_version 413470 (0.0036) [2024-06-23 10:12:42,379][15401] Updated weights for policy 0, policy_version 413480 (0.0026) [2024-06-23 10:12:43,390][15132] Fps is (10 sec: 44245.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 6774505472. Throughput: 0: 42559.0. Samples: 6774611640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:43,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 10:12:46,747][15401] Updated weights for policy 0, policy_version 413490 (0.0042) [2024-06-23 10:12:48,390][15132] Fps is (10 sec: 44251.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6774702080. Throughput: 0: 42600.4. Samples: 6774867040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 10:12:50,002][15401] Updated weights for policy 0, policy_version 413500 (0.0032) [2024-06-23 10:12:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42764.6). Total num frames: 6774915072. Throughput: 0: 42598.0. Samples: 6774992680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:12:53,393][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 10:12:54,122][15401] Updated weights for policy 0, policy_version 413510 (0.0040) [2024-06-23 10:12:58,247][15401] Updated weights for policy 0, policy_version 413520 (0.0039) [2024-06-23 10:12:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 6775111680. Throughput: 0: 42640.1. Samples: 6775248420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:12:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 10:13:02,032][15401] Updated weights for policy 0, policy_version 413530 (0.0032) [2024-06-23 10:13:03,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6775341056. Throughput: 0: 42610.4. Samples: 6775505860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 10:13:05,607][15401] Updated weights for policy 0, policy_version 413540 (0.0043) [2024-06-23 10:13:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6775554048. Throughput: 0: 42619.5. Samples: 6775634020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:08,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 10:13:09,571][15401] Updated weights for policy 0, policy_version 413550 (0.0029) [2024-06-23 10:13:13,196][15401] Updated weights for policy 0, policy_version 413560 (0.0037) [2024-06-23 10:13:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6775767040. Throughput: 0: 42780.8. Samples: 6775894400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 10:13:17,157][15401] Updated weights for policy 0, policy_version 413570 (0.0036) [2024-06-23 10:13:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6775963648. Throughput: 0: 42670.8. Samples: 6776148980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 10:13:20,852][15401] Updated weights for policy 0, policy_version 413580 (0.0034) [2024-06-23 10:13:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6776209408. Throughput: 0: 42634.3. Samples: 6776273280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:23,395][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 10:13:25,038][15401] Updated weights for policy 0, policy_version 413590 (0.0040) [2024-06-23 10:13:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6776406016. Throughput: 0: 42617.4. Samples: 6776529420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 10:13:28,599][15401] Updated weights for policy 0, policy_version 413600 (0.0042) [2024-06-23 10:13:29,470][15349] Signal inference workers to stop experience collection... (100400 times) [2024-06-23 10:13:29,476][15349] Signal inference workers to resume experience collection... (100400 times) [2024-06-23 10:13:29,519][15401] InferenceWorker_p0-w0: stopping experience collection (100400 times) [2024-06-23 10:13:29,519][15401] InferenceWorker_p0-w0: resuming experience collection (100400 times) [2024-06-23 10:13:32,664][15401] Updated weights for policy 0, policy_version 413610 (0.0043) [2024-06-23 10:13:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42326.8, 300 sec: 42653.9). Total num frames: 6776602624. Throughput: 0: 42577.0. Samples: 6776783000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 10:13:36,290][15401] Updated weights for policy 0, policy_version 413620 (0.0029) [2024-06-23 10:13:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43147.0, 300 sec: 42653.9). Total num frames: 6776848384. Throughput: 0: 42682.8. Samples: 6776913300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 10:13:40,228][15401] Updated weights for policy 0, policy_version 413630 (0.0042) [2024-06-23 10:13:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6777044992. Throughput: 0: 42648.4. Samples: 6777167600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 10:13:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413638_6777044992.pth... [2024-06-23 10:13:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413016_6766854144.pth [2024-06-23 10:13:43,915][15401] Updated weights for policy 0, policy_version 413640 (0.0033) [2024-06-23 10:13:47,861][15401] Updated weights for policy 0, policy_version 413650 (0.0034) [2024-06-23 10:13:48,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6777257984. Throughput: 0: 42591.4. Samples: 6777422580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:48,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 10:13:51,611][15401] Updated weights for policy 0, policy_version 413660 (0.0029) [2024-06-23 10:13:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 6777470976. Throughput: 0: 42517.4. Samples: 6777547300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 10:13:55,519][15401] Updated weights for policy 0, policy_version 413670 (0.0035) [2024-06-23 10:13:58,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6777683968. Throughput: 0: 42440.1. Samples: 6777804200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:13:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 10:13:59,479][15401] Updated weights for policy 0, policy_version 413680 (0.0025) [2024-06-23 10:14:03,096][15401] Updated weights for policy 0, policy_version 413690 (0.0039) [2024-06-23 10:14:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6777896960. Throughput: 0: 42475.9. Samples: 6778060400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:14:03,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 10:14:07,073][15401] Updated weights for policy 0, policy_version 413700 (0.0044) [2024-06-23 10:14:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6778126336. Throughput: 0: 42556.1. Samples: 6778188300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:14:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 10:14:10,870][15401] Updated weights for policy 0, policy_version 413710 (0.0036) [2024-06-23 10:14:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6778322944. Throughput: 0: 42767.5. Samples: 6778453960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:14:13,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 10:14:14,777][15401] Updated weights for policy 0, policy_version 413720 (0.0031) [2024-06-23 10:14:18,303][15401] Updated weights for policy 0, policy_version 413730 (0.0045) [2024-06-23 10:14:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.9). Total num frames: 6778552320. Throughput: 0: 42692.8. Samples: 6778704180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 10:14:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 10:14:22,341][15401] Updated weights for policy 0, policy_version 413740 (0.0038) [2024-06-23 10:14:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6778765312. Throughput: 0: 42723.0. Samples: 6778835840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 10:14:26,005][15401] Updated weights for policy 0, policy_version 413750 (0.0041) [2024-06-23 10:14:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6778978304. Throughput: 0: 42814.2. Samples: 6779094240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 10:14:30,059][15401] Updated weights for policy 0, policy_version 413760 (0.0029) [2024-06-23 10:14:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6779191296. Throughput: 0: 42735.1. Samples: 6779345560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 10:14:33,544][15401] Updated weights for policy 0, policy_version 413770 (0.0049) [2024-06-23 10:14:37,809][15401] Updated weights for policy 0, policy_version 413780 (0.0038) [2024-06-23 10:14:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6779404288. Throughput: 0: 42775.1. Samples: 6779472180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:38,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 10:14:41,439][15401] Updated weights for policy 0, policy_version 413790 (0.0049) [2024-06-23 10:14:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6779617280. Throughput: 0: 42795.5. Samples: 6779730000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 10:14:45,386][15401] Updated weights for policy 0, policy_version 413800 (0.0027) [2024-06-23 10:14:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 6779830272. Throughput: 0: 42753.1. Samples: 6779984280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 10:14:48,976][15349] Signal inference workers to stop experience collection... (100450 times) [2024-06-23 10:14:48,976][15349] Signal inference workers to resume experience collection... (100450 times) [2024-06-23 10:14:49,016][15401] InferenceWorker_p0-w0: stopping experience collection (100450 times) [2024-06-23 10:14:49,016][15401] InferenceWorker_p0-w0: resuming experience collection (100450 times) [2024-06-23 10:14:49,114][15401] Updated weights for policy 0, policy_version 413810 (0.0028) [2024-06-23 10:14:52,995][15401] Updated weights for policy 0, policy_version 413820 (0.0023) [2024-06-23 10:14:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 6780026880. Throughput: 0: 42650.6. Samples: 6780107580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 10:14:57,028][15401] Updated weights for policy 0, policy_version 413830 (0.0027) [2024-06-23 10:14:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 6780272640. Throughput: 0: 42577.0. Samples: 6780369920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:14:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 10:15:00,761][15401] Updated weights for policy 0, policy_version 413840 (0.0031) [2024-06-23 10:15:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 6780452864. Throughput: 0: 42738.2. Samples: 6780627400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 10:15:04,618][15401] Updated weights for policy 0, policy_version 413850 (0.0024) [2024-06-23 10:15:08,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 6780665856. Throughput: 0: 42549.8. Samples: 6780750680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:08,393][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 10:15:08,704][15401] Updated weights for policy 0, policy_version 413860 (0.0052) [2024-06-23 10:15:12,181][15401] Updated weights for policy 0, policy_version 413870 (0.0033) [2024-06-23 10:15:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6780911616. Throughput: 0: 42665.9. Samples: 6781014200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 10:15:16,236][15401] Updated weights for policy 0, policy_version 413880 (0.0047) [2024-06-23 10:15:18,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6781108224. Throughput: 0: 42703.2. Samples: 6781267200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 10:15:19,854][15401] Updated weights for policy 0, policy_version 413890 (0.0035) [2024-06-23 10:15:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6781321216. Throughput: 0: 42701.3. Samples: 6781393740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 10:15:23,827][15401] Updated weights for policy 0, policy_version 413900 (0.0033) [2024-06-23 10:15:27,527][15401] Updated weights for policy 0, policy_version 413910 (0.0028) [2024-06-23 10:15:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6781534208. Throughput: 0: 42792.0. Samples: 6781655640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:28,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 10:15:31,248][15401] Updated weights for policy 0, policy_version 413920 (0.0030) [2024-06-23 10:15:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 6781730816. Throughput: 0: 42837.8. Samples: 6781911980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 10:15:35,029][15401] Updated weights for policy 0, policy_version 413930 (0.0033) [2024-06-23 10:15:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6781960192. Throughput: 0: 42872.5. Samples: 6782036840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 10:15:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 10:15:38,727][15401] Updated weights for policy 0, policy_version 413940 (0.0041) [2024-06-23 10:15:42,551][15401] Updated weights for policy 0, policy_version 413950 (0.0037) [2024-06-23 10:15:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6782173184. Throughput: 0: 42828.3. Samples: 6782297200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:15:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 10:15:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413952_6782189568.pth... [2024-06-23 10:15:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413327_6771949568.pth [2024-06-23 10:15:46,330][15401] Updated weights for policy 0, policy_version 413960 (0.0040) [2024-06-23 10:15:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6782386176. Throughput: 0: 42739.1. Samples: 6782550660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:15:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 10:15:50,224][15401] Updated weights for policy 0, policy_version 413970 (0.0033) [2024-06-23 10:15:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6782599168. Throughput: 0: 42802.2. Samples: 6782676680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:15:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 10:15:54,425][15401] Updated weights for policy 0, policy_version 413980 (0.0027) [2024-06-23 10:15:57,953][15401] Updated weights for policy 0, policy_version 413990 (0.0026) [2024-06-23 10:15:58,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42593.8, 300 sec: 42653.4). Total num frames: 6782828544. Throughput: 0: 42510.3. Samples: 6782927440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:15:58,396][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 10:16:02,278][15401] Updated weights for policy 0, policy_version 414000 (0.0029) [2024-06-23 10:16:03,394][15132] Fps is (10 sec: 40944.1, 60 sec: 42595.6, 300 sec: 42542.3). Total num frames: 6783008768. Throughput: 0: 42730.8. Samples: 6783190260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:03,394][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 10:16:05,626][15401] Updated weights for policy 0, policy_version 414010 (0.0029) [2024-06-23 10:16:07,481][15349] Signal inference workers to stop experience collection... (100500 times) [2024-06-23 10:16:07,520][15401] InferenceWorker_p0-w0: stopping experience collection (100500 times) [2024-06-23 10:16:07,549][15349] Signal inference workers to resume experience collection... (100500 times) [2024-06-23 10:16:07,550][15401] InferenceWorker_p0-w0: resuming experience collection (100500 times) [2024-06-23 10:16:08,390][15132] Fps is (10 sec: 42625.4, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 6783254528. Throughput: 0: 42709.3. Samples: 6783315660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 10:16:09,913][15401] Updated weights for policy 0, policy_version 414020 (0.0041) [2024-06-23 10:16:13,242][15401] Updated weights for policy 0, policy_version 414030 (0.0041) [2024-06-23 10:16:13,390][15132] Fps is (10 sec: 45893.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6783467520. Throughput: 0: 42657.2. Samples: 6783575220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 10:16:17,821][15401] Updated weights for policy 0, policy_version 414040 (0.0033) [2024-06-23 10:16:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6783647744. Throughput: 0: 42680.4. Samples: 6783832600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:18,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 10:16:20,787][15401] Updated weights for policy 0, policy_version 414050 (0.0038) [2024-06-23 10:16:23,394][15132] Fps is (10 sec: 42578.9, 60 sec: 42868.1, 300 sec: 42764.3). Total num frames: 6783893504. Throughput: 0: 42695.1. Samples: 6783958320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:23,395][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 10:16:25,433][15401] Updated weights for policy 0, policy_version 414060 (0.0035) [2024-06-23 10:16:28,378][15401] Updated weights for policy 0, policy_version 414070 (0.0028) [2024-06-23 10:16:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6784122880. Throughput: 0: 42669.8. Samples: 6784217340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 10:16:33,037][15401] Updated weights for policy 0, policy_version 414080 (0.0038) [2024-06-23 10:16:33,390][15132] Fps is (10 sec: 39339.8, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 6784286720. Throughput: 0: 42899.6. Samples: 6784481140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 10:16:36,079][15401] Updated weights for policy 0, policy_version 414090 (0.0026) [2024-06-23 10:16:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6784516096. Throughput: 0: 42779.8. Samples: 6784601760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 10:16:40,685][15401] Updated weights for policy 0, policy_version 414100 (0.0035) [2024-06-23 10:16:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6784745472. Throughput: 0: 42922.1. Samples: 6784858660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 10:16:43,657][15401] Updated weights for policy 0, policy_version 414110 (0.0044) [2024-06-23 10:16:48,366][15401] Updated weights for policy 0, policy_version 414120 (0.0028) [2024-06-23 10:16:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6784942080. Throughput: 0: 42938.1. Samples: 6785122300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 10:16:51,181][15401] Updated weights for policy 0, policy_version 414130 (0.0031) [2024-06-23 10:16:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 6785155072. Throughput: 0: 42803.7. Samples: 6785241820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 10:16:55,951][15401] Updated weights for policy 0, policy_version 414140 (0.0033) [2024-06-23 10:16:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 6785400832. Throughput: 0: 42905.5. Samples: 6785505960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 10:16:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 10:16:59,241][15401] Updated weights for policy 0, policy_version 414150 (0.0024) [2024-06-23 10:17:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42874.4, 300 sec: 42653.9). Total num frames: 6785581056. Throughput: 0: 42991.1. Samples: 6785767200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:03,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 10:17:03,495][15401] Updated weights for policy 0, policy_version 414160 (0.0028) [2024-06-23 10:17:06,713][15401] Updated weights for policy 0, policy_version 414170 (0.0029) [2024-06-23 10:17:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6785810432. Throughput: 0: 42800.4. Samples: 6785884140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 10:17:11,118][15401] Updated weights for policy 0, policy_version 414180 (0.0032) [2024-06-23 10:17:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6786056192. Throughput: 0: 43029.8. Samples: 6786153680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 10:17:14,186][15401] Updated weights for policy 0, policy_version 414190 (0.0022) [2024-06-23 10:17:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6786220032. Throughput: 0: 42859.6. Samples: 6786409820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 10:17:18,809][15401] Updated weights for policy 0, policy_version 414200 (0.0039) [2024-06-23 10:17:20,690][15349] Signal inference workers to stop experience collection... (100550 times) [2024-06-23 10:17:20,718][15401] InferenceWorker_p0-w0: stopping experience collection (100550 times) [2024-06-23 10:17:20,801][15349] Signal inference workers to resume experience collection... (100550 times) [2024-06-23 10:17:20,802][15401] InferenceWorker_p0-w0: resuming experience collection (100550 times) [2024-06-23 10:17:21,711][15401] Updated weights for policy 0, policy_version 414210 (0.0022) [2024-06-23 10:17:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42601.7, 300 sec: 42709.5). Total num frames: 6786449408. Throughput: 0: 42785.7. Samples: 6786527120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 10:17:26,637][15401] Updated weights for policy 0, policy_version 414220 (0.0042) [2024-06-23 10:17:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42654.2). Total num frames: 6786646016. Throughput: 0: 42829.4. Samples: 6786785980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:28,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 10:17:29,685][15401] Updated weights for policy 0, policy_version 414230 (0.0039) [2024-06-23 10:17:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 6786859008. Throughput: 0: 42729.2. Samples: 6787045120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:17:34,384][15401] Updated weights for policy 0, policy_version 414240 (0.0034) [2024-06-23 10:17:37,118][15401] Updated weights for policy 0, policy_version 414250 (0.0041) [2024-06-23 10:17:38,396][15132] Fps is (10 sec: 45845.8, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 6787104768. Throughput: 0: 42856.1. Samples: 6787170620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:38,396][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 10:17:42,219][15401] Updated weights for policy 0, policy_version 414260 (0.0047) [2024-06-23 10:17:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6787301376. Throughput: 0: 42698.6. Samples: 6787427400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:17:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414264_6787301376.pth... [2024-06-23 10:17:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413638_6777044992.pth [2024-06-23 10:17:45,117][15401] Updated weights for policy 0, policy_version 414270 (0.0037) [2024-06-23 10:17:48,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6787514368. Throughput: 0: 42455.1. Samples: 6787677680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 10:17:49,923][15401] Updated weights for policy 0, policy_version 414280 (0.0041) [2024-06-23 10:17:52,987][15401] Updated weights for policy 0, policy_version 414290 (0.0032) [2024-06-23 10:17:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 6787743744. Throughput: 0: 42680.6. Samples: 6787804760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:53,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 10:17:57,530][15401] Updated weights for policy 0, policy_version 414300 (0.0037) [2024-06-23 10:17:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6787940352. Throughput: 0: 42468.3. Samples: 6788064760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:17:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 10:18:00,853][15401] Updated weights for policy 0, policy_version 414310 (0.0036) [2024-06-23 10:18:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6788136960. Throughput: 0: 42255.7. Samples: 6788311320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:18:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 10:18:05,183][15401] Updated weights for policy 0, policy_version 414320 (0.0037) [2024-06-23 10:18:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6788366336. Throughput: 0: 42487.6. Samples: 6788439060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:18:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 10:18:08,461][15401] Updated weights for policy 0, policy_version 414330 (0.0043) [2024-06-23 10:18:12,777][15401] Updated weights for policy 0, policy_version 414340 (0.0038) [2024-06-23 10:18:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 6788562944. Throughput: 0: 42531.1. Samples: 6788699880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:18:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 10:18:16,043][15401] Updated weights for policy 0, policy_version 414350 (0.0034) [2024-06-23 10:18:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6788792320. Throughput: 0: 42461.8. Samples: 6788955900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 10:18:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 10:18:20,456][15401] Updated weights for policy 0, policy_version 414360 (0.0032) [2024-06-23 10:18:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6789005312. Throughput: 0: 42640.1. Samples: 6789089160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 10:18:23,554][15401] Updated weights for policy 0, policy_version 414370 (0.0035) [2024-06-23 10:18:28,013][15401] Updated weights for policy 0, policy_version 414380 (0.0027) [2024-06-23 10:18:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6789201920. Throughput: 0: 42684.9. Samples: 6789348220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 10:18:31,520][15401] Updated weights for policy 0, policy_version 414390 (0.0032) [2024-06-23 10:18:33,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6789447680. Throughput: 0: 42577.4. Samples: 6789593660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 10:18:35,445][15401] Updated weights for policy 0, policy_version 414400 (0.0039) [2024-06-23 10:18:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42328.1, 300 sec: 42709.1). Total num frames: 6789644288. Throughput: 0: 42795.4. Samples: 6789730660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:38,393][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 10:18:39,100][15401] Updated weights for policy 0, policy_version 414410 (0.0048) [2024-06-23 10:18:43,302][15401] Updated weights for policy 0, policy_version 414420 (0.0027) [2024-06-23 10:18:43,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42593.9, 300 sec: 42708.9). Total num frames: 6789857280. Throughput: 0: 42693.1. Samples: 6789986220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:43,396][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 10:18:46,830][15401] Updated weights for policy 0, policy_version 414430 (0.0031) [2024-06-23 10:18:48,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6790086656. Throughput: 0: 42747.1. Samples: 6790234940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 10:18:50,673][15349] Signal inference workers to stop experience collection... (100600 times) [2024-06-23 10:18:50,675][15349] Signal inference workers to resume experience collection... (100600 times) [2024-06-23 10:18:50,724][15401] InferenceWorker_p0-w0: stopping experience collection (100600 times) [2024-06-23 10:18:50,724][15401] InferenceWorker_p0-w0: resuming experience collection (100600 times) [2024-06-23 10:18:50,810][15401] Updated weights for policy 0, policy_version 414440 (0.0033) [2024-06-23 10:18:53,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6790283264. Throughput: 0: 42860.4. Samples: 6790367780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 10:18:54,626][15401] Updated weights for policy 0, policy_version 414450 (0.0028) [2024-06-23 10:18:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6790496256. Throughput: 0: 42751.1. Samples: 6790623680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:18:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 10:18:58,611][15401] Updated weights for policy 0, policy_version 414460 (0.0026) [2024-06-23 10:19:02,195][15401] Updated weights for policy 0, policy_version 414470 (0.0022) [2024-06-23 10:19:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6790742016. Throughput: 0: 42620.9. Samples: 6790873840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 10:19:06,399][15401] Updated weights for policy 0, policy_version 414480 (0.0025) [2024-06-23 10:19:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6790922240. Throughput: 0: 42730.4. Samples: 6791012020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 10:19:09,683][15401] Updated weights for policy 0, policy_version 414490 (0.0039) [2024-06-23 10:19:13,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6791118848. Throughput: 0: 42584.5. Samples: 6791264520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 10:19:13,958][15401] Updated weights for policy 0, policy_version 414500 (0.0032) [2024-06-23 10:19:17,481][15401] Updated weights for policy 0, policy_version 414510 (0.0041) [2024-06-23 10:19:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6791380992. Throughput: 0: 42692.9. Samples: 6791514840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 10:19:21,791][15401] Updated weights for policy 0, policy_version 414520 (0.0035) [2024-06-23 10:19:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6791561216. Throughput: 0: 42684.8. Samples: 6791651380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 10:19:25,126][15401] Updated weights for policy 0, policy_version 414530 (0.0042) [2024-06-23 10:19:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6791757824. Throughput: 0: 42523.3. Samples: 6791899500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 10:19:29,428][15401] Updated weights for policy 0, policy_version 414540 (0.0043) [2024-06-23 10:19:33,048][15401] Updated weights for policy 0, policy_version 414550 (0.0035) [2024-06-23 10:19:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6792003584. Throughput: 0: 42663.6. Samples: 6792154800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 10:19:37,025][15401] Updated weights for policy 0, policy_version 414560 (0.0042) [2024-06-23 10:19:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 6792183808. Throughput: 0: 42681.5. Samples: 6792288440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 10:19:40,682][15401] Updated weights for policy 0, policy_version 414570 (0.0033) [2024-06-23 10:19:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 6792413184. Throughput: 0: 42506.6. Samples: 6792536480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 10:19:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 10:19:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414576_6792413184.pth... [2024-06-23 10:19:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000413952_6782189568.pth [2024-06-23 10:19:44,579][15401] Updated weights for policy 0, policy_version 414580 (0.0035) [2024-06-23 10:19:48,287][15401] Updated weights for policy 0, policy_version 414590 (0.0028) [2024-06-23 10:19:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6792642560. Throughput: 0: 42765.8. Samples: 6792798300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:19:48,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 10:19:52,105][15401] Updated weights for policy 0, policy_version 414600 (0.0037) [2024-06-23 10:19:53,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 6792839168. Throughput: 0: 42645.0. Samples: 6792931320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:19:53,396][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 10:19:55,966][15401] Updated weights for policy 0, policy_version 414610 (0.0044) [2024-06-23 10:19:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6793052160. Throughput: 0: 42524.8. Samples: 6793178140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:19:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 10:19:59,824][15401] Updated weights for policy 0, policy_version 414620 (0.0045) [2024-06-23 10:20:03,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 6793265152. Throughput: 0: 42722.2. Samples: 6793437340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 10:20:03,641][15401] Updated weights for policy 0, policy_version 414630 (0.0025) [2024-06-23 10:20:07,803][15401] Updated weights for policy 0, policy_version 414640 (0.0024) [2024-06-23 10:20:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6793494528. Throughput: 0: 42505.5. Samples: 6793564120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 10:20:11,214][15349] Signal inference workers to stop experience collection... (100650 times) [2024-06-23 10:20:11,215][15349] Signal inference workers to resume experience collection... (100650 times) [2024-06-23 10:20:11,262][15401] InferenceWorker_p0-w0: stopping experience collection (100650 times) [2024-06-23 10:20:11,263][15401] InferenceWorker_p0-w0: resuming experience collection (100650 times) [2024-06-23 10:20:11,349][15401] Updated weights for policy 0, policy_version 414650 (0.0040) [2024-06-23 10:20:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6793691136. Throughput: 0: 42477.9. Samples: 6793811000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 10:20:15,511][15401] Updated weights for policy 0, policy_version 414660 (0.0039) [2024-06-23 10:20:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 6793887744. Throughput: 0: 42565.8. Samples: 6794070260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:18,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-23 10:20:19,161][15401] Updated weights for policy 0, policy_version 414670 (0.0043) [2024-06-23 10:20:23,334][15401] Updated weights for policy 0, policy_version 414680 (0.0035) [2024-06-23 10:20:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 6794117120. Throughput: 0: 42426.6. Samples: 6794197740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:23,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 10:20:26,811][15401] Updated weights for policy 0, policy_version 414690 (0.0038) [2024-06-23 10:20:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6794346496. Throughput: 0: 42487.0. Samples: 6794448400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 10:20:31,249][15401] Updated weights for policy 0, policy_version 414700 (0.0034) [2024-06-23 10:20:33,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42050.6, 300 sec: 42598.1). Total num frames: 6794526720. Throughput: 0: 42479.1. Samples: 6794709960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:33,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 10:20:34,617][15401] Updated weights for policy 0, policy_version 414710 (0.0047) [2024-06-23 10:20:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6794756096. Throughput: 0: 42167.4. Samples: 6794828580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:38,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 10:20:38,873][15401] Updated weights for policy 0, policy_version 414720 (0.0045) [2024-06-23 10:20:42,326][15401] Updated weights for policy 0, policy_version 414730 (0.0031) [2024-06-23 10:20:43,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6794985472. Throughput: 0: 42401.3. Samples: 6795086200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 10:20:46,516][15401] Updated weights for policy 0, policy_version 414740 (0.0029) [2024-06-23 10:20:48,395][15132] Fps is (10 sec: 42576.4, 60 sec: 42321.7, 300 sec: 42653.2). Total num frames: 6795182080. Throughput: 0: 42422.7. Samples: 6795346580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:48,395][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 10:20:49,952][15401] Updated weights for policy 0, policy_version 414750 (0.0036) [2024-06-23 10:20:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42056.8, 300 sec: 42488.2). Total num frames: 6795362304. Throughput: 0: 42275.1. Samples: 6795466500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 10:20:54,177][15401] Updated weights for policy 0, policy_version 414760 (0.0033) [2024-06-23 10:20:57,586][15401] Updated weights for policy 0, policy_version 414770 (0.0043) [2024-06-23 10:20:58,390][15132] Fps is (10 sec: 42620.0, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 6795608064. Throughput: 0: 42527.9. Samples: 6795724760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:20:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 10:21:01,731][15401] Updated weights for policy 0, policy_version 414780 (0.0049) [2024-06-23 10:21:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6795788288. Throughput: 0: 42494.3. Samples: 6795982500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:21:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 10:21:05,628][15401] Updated weights for policy 0, policy_version 414790 (0.0025) [2024-06-23 10:21:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42487.4). Total num frames: 6796001280. Throughput: 0: 42329.9. Samples: 6796102480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 10:21:09,281][15401] Updated weights for policy 0, policy_version 414800 (0.0027) [2024-06-23 10:21:13,356][15401] Updated weights for policy 0, policy_version 414810 (0.0037) [2024-06-23 10:21:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6796247040. Throughput: 0: 42453.8. Samples: 6796358920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:13,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 10:21:16,887][15401] Updated weights for policy 0, policy_version 414820 (0.0024) [2024-06-23 10:21:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42488.0). Total num frames: 6796427264. Throughput: 0: 42372.1. Samples: 6796616600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:18,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 10:21:20,944][15401] Updated weights for policy 0, policy_version 414830 (0.0054) [2024-06-23 10:21:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 6796640256. Throughput: 0: 42472.0. Samples: 6796739820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 10:21:24,200][15349] Signal inference workers to stop experience collection... (100700 times) [2024-06-23 10:21:24,246][15401] InferenceWorker_p0-w0: stopping experience collection (100700 times) [2024-06-23 10:21:24,258][15349] Signal inference workers to resume experience collection... (100700 times) [2024-06-23 10:21:24,262][15401] InferenceWorker_p0-w0: resuming experience collection (100700 times) [2024-06-23 10:21:24,400][15401] Updated weights for policy 0, policy_version 414840 (0.0033) [2024-06-23 10:21:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 6796869632. Throughput: 0: 42350.4. Samples: 6796991960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 10:21:29,172][15401] Updated weights for policy 0, policy_version 414850 (0.0046) [2024-06-23 10:21:32,207][15401] Updated weights for policy 0, policy_version 414860 (0.0024) [2024-06-23 10:21:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 6797082624. Throughput: 0: 42211.5. Samples: 6797245880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 10:21:36,887][15401] Updated weights for policy 0, policy_version 414870 (0.0034) [2024-06-23 10:21:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6797279232. Throughput: 0: 42417.8. Samples: 6797375300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 10:21:39,753][15401] Updated weights for policy 0, policy_version 414880 (0.0025) [2024-06-23 10:21:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 6797508608. Throughput: 0: 42385.9. Samples: 6797632120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:43,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 10:21:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414888_6797524992.pth... [2024-06-23 10:21:43,539][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414264_6787301376.pth [2024-06-23 10:21:44,495][15401] Updated weights for policy 0, policy_version 414890 (0.0024) [2024-06-23 10:21:47,528][15401] Updated weights for policy 0, policy_version 414900 (0.0027) [2024-06-23 10:21:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42329.0, 300 sec: 42598.4). Total num frames: 6797721600. Throughput: 0: 42296.4. Samples: 6797885840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 10:21:52,215][15401] Updated weights for policy 0, policy_version 414910 (0.0035) [2024-06-23 10:21:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 6797934592. Throughput: 0: 42480.3. Samples: 6798014100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:53,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 10:21:55,544][15401] Updated weights for policy 0, policy_version 414920 (0.0037) [2024-06-23 10:21:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6798163968. Throughput: 0: 42460.1. Samples: 6798269520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:21:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 10:21:59,815][15401] Updated weights for policy 0, policy_version 414930 (0.0029) [2024-06-23 10:22:03,270][15401] Updated weights for policy 0, policy_version 414940 (0.0029) [2024-06-23 10:22:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6798376960. Throughput: 0: 42352.4. Samples: 6798522460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:22:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 10:22:07,392][15401] Updated weights for policy 0, policy_version 414950 (0.0039) [2024-06-23 10:22:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42431.4). Total num frames: 6798573568. Throughput: 0: 42471.5. Samples: 6798651140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:22:08,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 10:22:11,089][15401] Updated weights for policy 0, policy_version 414960 (0.0059) [2024-06-23 10:22:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42325.3, 300 sec: 42598.1). Total num frames: 6798786560. Throughput: 0: 42544.8. Samples: 6798906580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:22:13,392][15132] Avg episode reward: [(0, '0.255')] [2024-06-23 10:22:15,310][15401] Updated weights for policy 0, policy_version 414970 (0.0032) [2024-06-23 10:22:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6798999552. Throughput: 0: 42556.4. Samples: 6799160920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:22:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 10:22:18,775][15401] Updated weights for policy 0, policy_version 414980 (0.0030) [2024-06-23 10:22:22,849][15401] Updated weights for policy 0, policy_version 414990 (0.0042) [2024-06-23 10:22:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6799212544. Throughput: 0: 42534.2. Samples: 6799289340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 10:22:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 10:22:26,508][15401] Updated weights for policy 0, policy_version 415000 (0.0042) [2024-06-23 10:22:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6799441920. Throughput: 0: 42702.6. Samples: 6799553740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 10:22:30,392][15401] Updated weights for policy 0, policy_version 415010 (0.0036) [2024-06-23 10:22:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 6799654912. Throughput: 0: 42647.9. Samples: 6799805000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:33,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 10:22:34,107][15401] Updated weights for policy 0, policy_version 415020 (0.0039) [2024-06-23 10:22:36,852][15349] Signal inference workers to stop experience collection... (100750 times) [2024-06-23 10:22:36,879][15401] InferenceWorker_p0-w0: stopping experience collection (100750 times) [2024-06-23 10:22:36,965][15349] Signal inference workers to resume experience collection... (100750 times) [2024-06-23 10:22:36,965][15401] InferenceWorker_p0-w0: resuming experience collection (100750 times) [2024-06-23 10:22:37,754][15401] Updated weights for policy 0, policy_version 415030 (0.0030) [2024-06-23 10:22:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6799851520. Throughput: 0: 42623.3. Samples: 6799932140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 10:22:42,225][15401] Updated weights for policy 0, policy_version 415040 (0.0041) [2024-06-23 10:22:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 6800048128. Throughput: 0: 42775.4. Samples: 6800194420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:43,396][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 10:22:45,646][15401] Updated weights for policy 0, policy_version 415050 (0.0034) [2024-06-23 10:22:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6800293888. Throughput: 0: 42758.6. Samples: 6800446600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 10:22:49,778][15401] Updated weights for policy 0, policy_version 415060 (0.0034) [2024-06-23 10:22:53,184][15401] Updated weights for policy 0, policy_version 415070 (0.0035) [2024-06-23 10:22:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6800506880. Throughput: 0: 42825.8. Samples: 6800578200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 10:22:57,517][15401] Updated weights for policy 0, policy_version 415080 (0.0048) [2024-06-23 10:22:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6800687104. Throughput: 0: 42886.3. Samples: 6800836360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:22:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 10:23:00,851][15401] Updated weights for policy 0, policy_version 415090 (0.0028) [2024-06-23 10:23:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6800932864. Throughput: 0: 42836.7. Samples: 6801088580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 10:23:05,098][15401] Updated weights for policy 0, policy_version 415100 (0.0033) [2024-06-23 10:23:08,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42871.4, 300 sec: 42653.6). Total num frames: 6801145856. Throughput: 0: 42912.3. Samples: 6801220500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:08,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 10:23:08,441][15401] Updated weights for policy 0, policy_version 415110 (0.0042) [2024-06-23 10:23:13,003][15401] Updated weights for policy 0, policy_version 415120 (0.0026) [2024-06-23 10:23:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 6801342464. Throughput: 0: 42661.7. Samples: 6801473520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 10:23:16,140][15401] Updated weights for policy 0, policy_version 415130 (0.0028) [2024-06-23 10:23:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6801571840. Throughput: 0: 42808.8. Samples: 6801731400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:18,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 10:23:20,616][15401] Updated weights for policy 0, policy_version 415140 (0.0031) [2024-06-23 10:23:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 6801784832. Throughput: 0: 42859.3. Samples: 6801860920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:23,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 10:23:23,884][15401] Updated weights for policy 0, policy_version 415150 (0.0030) [2024-06-23 10:23:28,248][15401] Updated weights for policy 0, policy_version 415160 (0.0034) [2024-06-23 10:23:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 6801997824. Throughput: 0: 42733.9. Samples: 6802117440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 10:23:31,519][15401] Updated weights for policy 0, policy_version 415170 (0.0028) [2024-06-23 10:23:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 6802210816. Throughput: 0: 42799.2. Samples: 6802372560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 10:23:35,820][15401] Updated weights for policy 0, policy_version 415180 (0.0038) [2024-06-23 10:23:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42654.9). Total num frames: 6802440192. Throughput: 0: 42791.5. Samples: 6802503820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:38,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 10:23:39,402][15401] Updated weights for policy 0, policy_version 415190 (0.0044) [2024-06-23 10:23:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 6802620416. Throughput: 0: 42607.0. Samples: 6802753680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:23:43,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 10:23:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415199_6802620416.pth... [2024-06-23 10:23:43,427][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414576_6792413184.pth [2024-06-23 10:23:43,627][15401] Updated weights for policy 0, policy_version 415200 (0.0036) [2024-06-23 10:23:47,047][15401] Updated weights for policy 0, policy_version 415210 (0.0040) [2024-06-23 10:23:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6802866176. Throughput: 0: 42703.3. Samples: 6803010220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:23:48,390][15132] Avg episode reward: [(0, '0.126')] [2024-06-23 10:23:51,322][15401] Updated weights for policy 0, policy_version 415220 (0.0031) [2024-06-23 10:23:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6803079168. Throughput: 0: 42887.7. Samples: 6803150340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:23:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 10:23:54,848][15401] Updated weights for policy 0, policy_version 415230 (0.0031) [2024-06-23 10:23:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 6803259392. Throughput: 0: 42711.5. Samples: 6803395540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:23:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 10:23:59,167][15401] Updated weights for policy 0, policy_version 415240 (0.0034) [2024-06-23 10:24:02,479][15401] Updated weights for policy 0, policy_version 415250 (0.0027) [2024-06-23 10:24:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6803488768. Throughput: 0: 42714.2. Samples: 6803653540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:03,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 10:24:06,826][15401] Updated weights for policy 0, policy_version 415260 (0.0036) [2024-06-23 10:24:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 6803701760. Throughput: 0: 42715.5. Samples: 6803783020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 10:24:09,052][15349] Signal inference workers to stop experience collection... (100800 times) [2024-06-23 10:24:09,053][15349] Signal inference workers to resume experience collection... (100800 times) [2024-06-23 10:24:09,096][15401] InferenceWorker_p0-w0: stopping experience collection (100800 times) [2024-06-23 10:24:09,096][15401] InferenceWorker_p0-w0: resuming experience collection (100800 times) [2024-06-23 10:24:10,388][15401] Updated weights for policy 0, policy_version 415270 (0.0037) [2024-06-23 10:24:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 6803914752. Throughput: 0: 42621.7. Samples: 6804035420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 10:24:14,329][15401] Updated weights for policy 0, policy_version 415280 (0.0033) [2024-06-23 10:24:17,884][15401] Updated weights for policy 0, policy_version 415290 (0.0027) [2024-06-23 10:24:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6804127744. Throughput: 0: 42862.7. Samples: 6804301380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 10:24:21,830][15401] Updated weights for policy 0, policy_version 415300 (0.0036) [2024-06-23 10:24:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6804357120. Throughput: 0: 42891.6. Samples: 6804433940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 10:24:25,406][15401] Updated weights for policy 0, policy_version 415310 (0.0029) [2024-06-23 10:24:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6804570112. Throughput: 0: 42973.8. Samples: 6804687500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 10:24:29,319][15401] Updated weights for policy 0, policy_version 415320 (0.0026) [2024-06-23 10:24:32,920][15401] Updated weights for policy 0, policy_version 415330 (0.0028) [2024-06-23 10:24:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 6804783104. Throughput: 0: 43026.1. Samples: 6804946400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 10:24:36,958][15401] Updated weights for policy 0, policy_version 415340 (0.0036) [2024-06-23 10:24:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6805012480. Throughput: 0: 42670.1. Samples: 6805070500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 10:24:40,551][15401] Updated weights for policy 0, policy_version 415350 (0.0024) [2024-06-23 10:24:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6805209088. Throughput: 0: 42947.5. Samples: 6805328180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 10:24:44,536][15401] Updated weights for policy 0, policy_version 415360 (0.0031) [2024-06-23 10:24:48,093][15401] Updated weights for policy 0, policy_version 415370 (0.0041) [2024-06-23 10:24:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 6805422080. Throughput: 0: 43051.7. Samples: 6805590860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 10:24:51,894][15401] Updated weights for policy 0, policy_version 415380 (0.0039) [2024-06-23 10:24:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6805651456. Throughput: 0: 43113.7. Samples: 6805723140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:53,399][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 10:24:55,598][15401] Updated weights for policy 0, policy_version 415390 (0.0034) [2024-06-23 10:24:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6805848064. Throughput: 0: 43161.7. Samples: 6805977700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:24:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 10:24:59,593][15401] Updated weights for policy 0, policy_version 415400 (0.0036) [2024-06-23 10:25:03,269][15401] Updated weights for policy 0, policy_version 415410 (0.0038) [2024-06-23 10:25:03,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6806077440. Throughput: 0: 43070.5. Samples: 6806239560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:25:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 10:25:07,045][15401] Updated weights for policy 0, policy_version 415420 (0.0027) [2024-06-23 10:25:08,392][15132] Fps is (10 sec: 45864.8, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 6806306816. Throughput: 0: 43139.9. Samples: 6806375340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 10:25:08,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 10:25:10,886][15401] Updated weights for policy 0, policy_version 415430 (0.0029) [2024-06-23 10:25:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6806503424. Throughput: 0: 43208.5. Samples: 6806631880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 10:25:14,621][15401] Updated weights for policy 0, policy_version 415440 (0.0043) [2024-06-23 10:25:18,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 6806700032. Throughput: 0: 43292.2. Samples: 6806894540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 10:25:18,596][15401] Updated weights for policy 0, policy_version 415450 (0.0026) [2024-06-23 10:25:22,194][15401] Updated weights for policy 0, policy_version 415460 (0.0020) [2024-06-23 10:25:23,396][15132] Fps is (10 sec: 44208.1, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 6806945792. Throughput: 0: 43426.3. Samples: 6807024960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:23,397][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 10:25:26,254][15401] Updated weights for policy 0, policy_version 415470 (0.0035) [2024-06-23 10:25:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42820.6). Total num frames: 6807158784. Throughput: 0: 43318.7. Samples: 6807277620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:28,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 10:25:29,900][15401] Updated weights for policy 0, policy_version 415480 (0.0039) [2024-06-23 10:25:33,392][15132] Fps is (10 sec: 40976.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 6807355392. Throughput: 0: 43241.6. Samples: 6807536840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:33,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 10:25:33,838][15401] Updated weights for policy 0, policy_version 415490 (0.0033) [2024-06-23 10:25:37,556][15401] Updated weights for policy 0, policy_version 415500 (0.0046) [2024-06-23 10:25:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6807584768. Throughput: 0: 43174.4. Samples: 6807665980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 10:25:41,655][15401] Updated weights for policy 0, policy_version 415510 (0.0026) [2024-06-23 10:25:43,396][15132] Fps is (10 sec: 44219.1, 60 sec: 43140.0, 300 sec: 42764.8). Total num frames: 6807797760. Throughput: 0: 43231.3. Samples: 6807923380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:43,396][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 10:25:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415515_6807797760.pth... [2024-06-23 10:25:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000414888_6797524992.pth [2024-06-23 10:25:43,718][15349] Signal inference workers to stop experience collection... (100850 times) [2024-06-23 10:25:43,753][15401] InferenceWorker_p0-w0: stopping experience collection (100850 times) [2024-06-23 10:25:43,788][15349] Signal inference workers to resume experience collection... (100850 times) [2024-06-23 10:25:43,790][15401] InferenceWorker_p0-w0: resuming experience collection (100850 times) [2024-06-23 10:25:45,095][15401] Updated weights for policy 0, policy_version 415520 (0.0037) [2024-06-23 10:25:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 6808010752. Throughput: 0: 42915.6. Samples: 6808170760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 10:25:49,393][15401] Updated weights for policy 0, policy_version 415530 (0.0036) [2024-06-23 10:25:52,659][15401] Updated weights for policy 0, policy_version 415540 (0.0024) [2024-06-23 10:25:53,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6808223744. Throughput: 0: 42855.2. Samples: 6808303720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 10:25:56,858][15401] Updated weights for policy 0, policy_version 415550 (0.0030) [2024-06-23 10:25:58,391][15132] Fps is (10 sec: 42592.5, 60 sec: 43143.6, 300 sec: 42875.9). Total num frames: 6808436736. Throughput: 0: 42839.1. Samples: 6808559700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:25:58,391][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 10:26:00,258][15401] Updated weights for policy 0, policy_version 415560 (0.0033) [2024-06-23 10:26:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 6808666112. Throughput: 0: 42614.0. Samples: 6808812180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 10:26:04,408][15401] Updated weights for policy 0, policy_version 415570 (0.0042) [2024-06-23 10:26:07,722][15401] Updated weights for policy 0, policy_version 415580 (0.0044) [2024-06-23 10:26:08,390][15132] Fps is (10 sec: 42604.4, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 6808862720. Throughput: 0: 42649.7. Samples: 6808943920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:08,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 10:26:12,274][15401] Updated weights for policy 0, policy_version 415590 (0.0038) [2024-06-23 10:26:13,392][15132] Fps is (10 sec: 40951.0, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 6809075712. Throughput: 0: 42878.2. Samples: 6809207140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:13,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 10:26:15,246][15401] Updated weights for policy 0, policy_version 415600 (0.0038) [2024-06-23 10:26:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 6809305088. Throughput: 0: 42780.9. Samples: 6809461880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 10:26:19,682][15401] Updated weights for policy 0, policy_version 415610 (0.0037) [2024-06-23 10:26:23,240][15401] Updated weights for policy 0, policy_version 415620 (0.0044) [2024-06-23 10:26:23,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 6809518080. Throughput: 0: 42813.8. Samples: 6809592600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 10:26:27,300][15401] Updated weights for policy 0, policy_version 415630 (0.0037) [2024-06-23 10:26:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 6809698304. Throughput: 0: 42824.4. Samples: 6809850200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-23 10:26:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 10:26:30,759][15401] Updated weights for policy 0, policy_version 415640 (0.0029) [2024-06-23 10:26:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 6809927680. Throughput: 0: 42818.7. Samples: 6810097600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 10:26:34,866][15401] Updated weights for policy 0, policy_version 415650 (0.0049) [2024-06-23 10:26:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6810157056. Throughput: 0: 42928.0. Samples: 6810235480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 10:26:38,479][15401] Updated weights for policy 0, policy_version 415660 (0.0029) [2024-06-23 10:26:42,235][15401] Updated weights for policy 0, policy_version 415670 (0.0033) [2024-06-23 10:26:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 6810337280. Throughput: 0: 42752.5. Samples: 6810483500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 10:26:46,252][15401] Updated weights for policy 0, policy_version 415680 (0.0044) [2024-06-23 10:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6810583040. Throughput: 0: 42722.0. Samples: 6810734660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 10:26:50,504][15401] Updated weights for policy 0, policy_version 415690 (0.0039) [2024-06-23 10:26:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6810779648. Throughput: 0: 42747.2. Samples: 6810867540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 10:26:53,853][15401] Updated weights for policy 0, policy_version 415700 (0.0029) [2024-06-23 10:26:57,074][15349] Signal inference workers to stop experience collection... (100900 times) [2024-06-23 10:26:57,131][15401] InferenceWorker_p0-w0: stopping experience collection (100900 times) [2024-06-23 10:26:57,197][15349] Signal inference workers to resume experience collection... (100900 times) [2024-06-23 10:26:57,198][15401] InferenceWorker_p0-w0: resuming experience collection (100900 times) [2024-06-23 10:26:58,130][15401] Updated weights for policy 0, policy_version 415710 (0.0026) [2024-06-23 10:26:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 6810992640. Throughput: 0: 42574.6. Samples: 6811122900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:26:58,396][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 10:27:01,448][15401] Updated weights for policy 0, policy_version 415720 (0.0038) [2024-06-23 10:27:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 6811222016. Throughput: 0: 42668.5. Samples: 6811381960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 10:27:05,602][15401] Updated weights for policy 0, policy_version 415730 (0.0029) [2024-06-23 10:27:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 6811435008. Throughput: 0: 42765.7. Samples: 6811517060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 10:27:08,879][15401] Updated weights for policy 0, policy_version 415740 (0.0031) [2024-06-23 10:27:13,269][15401] Updated weights for policy 0, policy_version 415750 (0.0034) [2024-06-23 10:27:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 6811648000. Throughput: 0: 42696.9. Samples: 6811771560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 10:27:16,864][15401] Updated weights for policy 0, policy_version 415760 (0.0039) [2024-06-23 10:27:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6811860992. Throughput: 0: 42732.4. Samples: 6812020560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:18,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 10:27:21,351][15401] Updated weights for policy 0, policy_version 415770 (0.0035) [2024-06-23 10:27:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6812073984. Throughput: 0: 42571.0. Samples: 6812151180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 10:27:24,473][15401] Updated weights for policy 0, policy_version 415780 (0.0031) [2024-06-23 10:27:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6812286976. Throughput: 0: 42682.6. Samples: 6812404220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 10:27:28,883][15401] Updated weights for policy 0, policy_version 415790 (0.0046) [2024-06-23 10:27:32,094][15401] Updated weights for policy 0, policy_version 415800 (0.0033) [2024-06-23 10:27:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6812499968. Throughput: 0: 42785.7. Samples: 6812660020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 10:27:36,436][15401] Updated weights for policy 0, policy_version 415810 (0.0041) [2024-06-23 10:27:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 6812696576. Throughput: 0: 42681.2. Samples: 6812788200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 10:27:39,796][15401] Updated weights for policy 0, policy_version 415820 (0.0032) [2024-06-23 10:27:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6812925952. Throughput: 0: 42656.1. Samples: 6813042420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 10:27:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415828_6812925952.pth... [2024-06-23 10:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415199_6802620416.pth [2024-06-23 10:27:44,395][15401] Updated weights for policy 0, policy_version 415830 (0.0027) [2024-06-23 10:27:47,806][15401] Updated weights for policy 0, policy_version 415840 (0.0036) [2024-06-23 10:27:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6813138944. Throughput: 0: 42581.7. Samples: 6813298140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 10:27:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 10:27:52,096][15401] Updated weights for policy 0, policy_version 415850 (0.0033) [2024-06-23 10:27:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6813335552. Throughput: 0: 42453.8. Samples: 6813427480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:27:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 10:27:55,473][15401] Updated weights for policy 0, policy_version 415860 (0.0032) [2024-06-23 10:27:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6813564928. Throughput: 0: 42331.8. Samples: 6813676500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:27:58,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 10:27:59,856][15401] Updated weights for policy 0, policy_version 415870 (0.0040) [2024-06-23 10:28:03,065][15401] Updated weights for policy 0, policy_version 415880 (0.0036) [2024-06-23 10:28:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 6813777920. Throughput: 0: 42546.7. Samples: 6813935160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 10:28:07,455][15401] Updated weights for policy 0, policy_version 415890 (0.0023) [2024-06-23 10:28:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 6813941760. Throughput: 0: 42508.1. Samples: 6814064040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 10:28:10,828][15401] Updated weights for policy 0, policy_version 415900 (0.0047) [2024-06-23 10:28:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 6814220288. Throughput: 0: 42523.6. Samples: 6814317780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 10:28:15,586][15401] Updated weights for policy 0, policy_version 415910 (0.0024) [2024-06-23 10:28:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 6814416896. Throughput: 0: 42348.4. Samples: 6814565700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 10:28:18,913][15401] Updated weights for policy 0, policy_version 415920 (0.0032) [2024-06-23 10:28:23,191][15401] Updated weights for policy 0, policy_version 415930 (0.0040) [2024-06-23 10:28:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 6814597120. Throughput: 0: 42276.9. Samples: 6814690660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 10:28:26,458][15401] Updated weights for policy 0, policy_version 415940 (0.0040) [2024-06-23 10:28:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6814842880. Throughput: 0: 42356.5. Samples: 6814948460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 10:28:30,871][15401] Updated weights for policy 0, policy_version 415950 (0.0034) [2024-06-23 10:28:31,976][15349] Signal inference workers to stop experience collection... (100950 times) [2024-06-23 10:28:32,017][15401] InferenceWorker_p0-w0: stopping experience collection (100950 times) [2024-06-23 10:28:32,092][15349] Signal inference workers to resume experience collection... (100950 times) [2024-06-23 10:28:32,092][15401] InferenceWorker_p0-w0: resuming experience collection (100950 times) [2024-06-23 10:28:33,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 6815039488. Throughput: 0: 42359.4. Samples: 6815204580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:33,397][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 10:28:34,092][15401] Updated weights for policy 0, policy_version 415960 (0.0039) [2024-06-23 10:28:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6815236096. Throughput: 0: 42256.1. Samples: 6815329000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 10:28:38,467][15401] Updated weights for policy 0, policy_version 415970 (0.0036) [2024-06-23 10:28:41,802][15401] Updated weights for policy 0, policy_version 415980 (0.0022) [2024-06-23 10:28:43,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6815481856. Throughput: 0: 42513.0. Samples: 6815589580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:43,396][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 10:28:46,101][15401] Updated weights for policy 0, policy_version 415990 (0.0041) [2024-06-23 10:28:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6815678464. Throughput: 0: 42460.6. Samples: 6815845880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 10:28:49,929][15401] Updated weights for policy 0, policy_version 416000 (0.0029) [2024-06-23 10:28:53,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 6815891456. Throughput: 0: 42341.5. Samples: 6815969680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:53,396][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 10:28:53,588][15401] Updated weights for policy 0, policy_version 416010 (0.0041) [2024-06-23 10:28:57,418][15401] Updated weights for policy 0, policy_version 416020 (0.0025) [2024-06-23 10:28:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6816088064. Throughput: 0: 42367.2. Samples: 6816224300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:28:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 10:29:01,532][15401] Updated weights for policy 0, policy_version 416030 (0.0032) [2024-06-23 10:29:03,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6816317440. Throughput: 0: 42517.2. Samples: 6816478980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:29:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 10:29:04,878][15401] Updated weights for policy 0, policy_version 416040 (0.0048) [2024-06-23 10:29:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6816514048. Throughput: 0: 42538.7. Samples: 6816604900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 10:29:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 10:29:09,223][15401] Updated weights for policy 0, policy_version 416050 (0.0032) [2024-06-23 10:29:12,477][15401] Updated weights for policy 0, policy_version 416060 (0.0033) [2024-06-23 10:29:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 6816743424. Throughput: 0: 42404.4. Samples: 6816856660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 10:29:16,965][15401] Updated weights for policy 0, policy_version 416070 (0.0049) [2024-06-23 10:29:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 6816940032. Throughput: 0: 42490.0. Samples: 6817116360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:18,390][15132] Avg episode reward: [(0, '0.886')] [2024-06-23 10:29:20,193][15401] Updated weights for policy 0, policy_version 416080 (0.0035) [2024-06-23 10:29:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6817169408. Throughput: 0: 42499.1. Samples: 6817241460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 10:29:24,452][15401] Updated weights for policy 0, policy_version 416090 (0.0031) [2024-06-23 10:29:27,937][15401] Updated weights for policy 0, policy_version 416100 (0.0036) [2024-06-23 10:29:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6817382400. Throughput: 0: 42396.7. Samples: 6817497440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 10:29:32,184][15401] Updated weights for policy 0, policy_version 416110 (0.0031) [2024-06-23 10:29:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 6817579008. Throughput: 0: 42432.6. Samples: 6817755360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 10:29:35,576][15401] Updated weights for policy 0, policy_version 416120 (0.0044) [2024-06-23 10:29:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6817792000. Throughput: 0: 42425.5. Samples: 6817878560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 10:29:39,834][15401] Updated weights for policy 0, policy_version 416130 (0.0034) [2024-06-23 10:29:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6818021376. Throughput: 0: 42356.3. Samples: 6818130340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 10:29:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416140_6818037760.pth... [2024-06-23 10:29:43,549][15401] Updated weights for policy 0, policy_version 416140 (0.0031) [2024-06-23 10:29:43,603][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415515_6807797760.pth [2024-06-23 10:29:46,760][15349] Signal inference workers to stop experience collection... (101000 times) [2024-06-23 10:29:46,761][15349] Signal inference workers to resume experience collection... (101000 times) [2024-06-23 10:29:46,784][15401] InferenceWorker_p0-w0: stopping experience collection (101000 times) [2024-06-23 10:29:46,784][15401] InferenceWorker_p0-w0: resuming experience collection (101000 times) [2024-06-23 10:29:47,587][15401] Updated weights for policy 0, policy_version 416150 (0.0031) [2024-06-23 10:29:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6818217984. Throughput: 0: 42470.3. Samples: 6818390140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 10:29:51,048][15401] Updated weights for policy 0, policy_version 416160 (0.0041) [2024-06-23 10:29:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.8, 300 sec: 42654.0). Total num frames: 6818430976. Throughput: 0: 42374.2. Samples: 6818511740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 10:29:55,278][15401] Updated weights for policy 0, policy_version 416170 (0.0042) [2024-06-23 10:29:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 6818660352. Throughput: 0: 42601.3. Samples: 6818773720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:29:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 10:29:58,627][15401] Updated weights for policy 0, policy_version 416180 (0.0027) [2024-06-23 10:30:02,745][15401] Updated weights for policy 0, policy_version 416190 (0.0031) [2024-06-23 10:30:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 6818856960. Throughput: 0: 42579.6. Samples: 6819032440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 10:30:06,265][15401] Updated weights for policy 0, policy_version 416200 (0.0050) [2024-06-23 10:30:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6819069952. Throughput: 0: 42559.6. Samples: 6819156640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 10:30:10,777][15401] Updated weights for policy 0, policy_version 416210 (0.0033) [2024-06-23 10:30:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6819299328. Throughput: 0: 42632.1. Samples: 6819415880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 10:30:13,823][15401] Updated weights for policy 0, policy_version 416220 (0.0034) [2024-06-23 10:30:18,307][15401] Updated weights for policy 0, policy_version 416230 (0.0037) [2024-06-23 10:30:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42599.3). Total num frames: 6819512320. Throughput: 0: 42606.9. Samples: 6819672660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 10:30:22,098][15401] Updated weights for policy 0, policy_version 416240 (0.0029) [2024-06-23 10:30:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 6819708928. Throughput: 0: 42632.0. Samples: 6819797000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:23,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 10:30:25,932][15401] Updated weights for policy 0, policy_version 416250 (0.0027) [2024-06-23 10:30:28,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42320.9, 300 sec: 42597.8). Total num frames: 6819921920. Throughput: 0: 42795.3. Samples: 6820056400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:28,405][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 10:30:29,957][15401] Updated weights for policy 0, policy_version 416260 (0.0036) [2024-06-23 10:30:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6820134912. Throughput: 0: 42629.5. Samples: 6820308460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 10:30:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 10:30:33,788][15401] Updated weights for policy 0, policy_version 416270 (0.0029) [2024-06-23 10:30:37,505][15401] Updated weights for policy 0, policy_version 416280 (0.0035) [2024-06-23 10:30:38,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 6820347904. Throughput: 0: 42752.5. Samples: 6820435600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:30:38,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 10:30:41,449][15401] Updated weights for policy 0, policy_version 416290 (0.0039) [2024-06-23 10:30:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6820577280. Throughput: 0: 42660.7. Samples: 6820693460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:30:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 10:30:45,026][15401] Updated weights for policy 0, policy_version 416300 (0.0046) [2024-06-23 10:30:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6820757504. Throughput: 0: 42662.2. Samples: 6820952240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:30:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 10:30:49,142][15401] Updated weights for policy 0, policy_version 416310 (0.0040) [2024-06-23 10:30:52,492][15401] Updated weights for policy 0, policy_version 416320 (0.0032) [2024-06-23 10:30:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42598.6). Total num frames: 6821003264. Throughput: 0: 42614.3. Samples: 6821074280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:30:53,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 10:30:56,366][15349] Signal inference workers to stop experience collection... (101050 times) [2024-06-23 10:30:56,402][15401] InferenceWorker_p0-w0: stopping experience collection (101050 times) [2024-06-23 10:30:56,415][15349] Signal inference workers to resume experience collection... (101050 times) [2024-06-23 10:30:56,420][15401] InferenceWorker_p0-w0: resuming experience collection (101050 times) [2024-06-23 10:30:56,697][15401] Updated weights for policy 0, policy_version 416330 (0.0036) [2024-06-23 10:30:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 6821199872. Throughput: 0: 42643.6. Samples: 6821334840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:30:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 10:31:00,249][15401] Updated weights for policy 0, policy_version 416340 (0.0039) [2024-06-23 10:31:03,391][15132] Fps is (10 sec: 40952.3, 60 sec: 42597.1, 300 sec: 42542.6). Total num frames: 6821412864. Throughput: 0: 42596.0. Samples: 6821589560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:03,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 10:31:04,276][15401] Updated weights for policy 0, policy_version 416350 (0.0031) [2024-06-23 10:31:07,989][15401] Updated weights for policy 0, policy_version 416360 (0.0034) [2024-06-23 10:31:08,395][15132] Fps is (10 sec: 45849.0, 60 sec: 43140.5, 300 sec: 42653.5). Total num frames: 6821658624. Throughput: 0: 42738.6. Samples: 6821720480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:08,396][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 10:31:11,902][15401] Updated weights for policy 0, policy_version 416370 (0.0024) [2024-06-23 10:31:13,390][15132] Fps is (10 sec: 44243.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 6821855232. Throughput: 0: 42649.9. Samples: 6821975380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 10:31:15,562][15401] Updated weights for policy 0, policy_version 416380 (0.0032) [2024-06-23 10:31:18,389][15132] Fps is (10 sec: 40983.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6822068224. Throughput: 0: 42851.5. Samples: 6822236780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 10:31:19,785][15401] Updated weights for policy 0, policy_version 416390 (0.0037) [2024-06-23 10:31:23,053][15401] Updated weights for policy 0, policy_version 416400 (0.0026) [2024-06-23 10:31:23,392][15132] Fps is (10 sec: 44227.0, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 6822297600. Throughput: 0: 42827.9. Samples: 6822362960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:23,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 10:31:27,623][15401] Updated weights for policy 0, policy_version 416410 (0.0033) [2024-06-23 10:31:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42876.1, 300 sec: 42598.4). Total num frames: 6822494208. Throughput: 0: 42853.1. Samples: 6822621840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 10:31:30,761][15401] Updated weights for policy 0, policy_version 416420 (0.0042) [2024-06-23 10:31:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6822707200. Throughput: 0: 42813.7. Samples: 6822878860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 10:31:35,247][15401] Updated weights for policy 0, policy_version 416430 (0.0047) [2024-06-23 10:31:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6822920192. Throughput: 0: 42835.9. Samples: 6823001900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:38,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 10:31:38,748][15401] Updated weights for policy 0, policy_version 416440 (0.0027) [2024-06-23 10:31:42,876][15401] Updated weights for policy 0, policy_version 416450 (0.0035) [2024-06-23 10:31:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6823116800. Throughput: 0: 42743.4. Samples: 6823258300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 10:31:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416451_6823133184.pth... [2024-06-23 10:31:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000415828_6812925952.pth [2024-06-23 10:31:46,446][15401] Updated weights for policy 0, policy_version 416460 (0.0042) [2024-06-23 10:31:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6823346176. Throughput: 0: 42656.3. Samples: 6823509020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 10:31:50,423][15401] Updated weights for policy 0, policy_version 416470 (0.0042) [2024-06-23 10:31:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6823542784. Throughput: 0: 42711.7. Samples: 6823642260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 10:31:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 10:31:54,495][15401] Updated weights for policy 0, policy_version 416480 (0.0022) [2024-06-23 10:31:58,035][15401] Updated weights for policy 0, policy_version 416490 (0.0031) [2024-06-23 10:31:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6823772160. Throughput: 0: 42595.7. Samples: 6823892180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:31:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 10:32:02,311][15401] Updated weights for policy 0, policy_version 416500 (0.0031) [2024-06-23 10:32:03,165][15349] Signal inference workers to stop experience collection... (101100 times) [2024-06-23 10:32:03,216][15401] InferenceWorker_p0-w0: stopping experience collection (101100 times) [2024-06-23 10:32:03,225][15349] Signal inference workers to resume experience collection... (101100 times) [2024-06-23 10:32:03,232][15401] InferenceWorker_p0-w0: resuming experience collection (101100 times) [2024-06-23 10:32:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42872.8, 300 sec: 42542.9). Total num frames: 6823985152. Throughput: 0: 42527.2. Samples: 6824150500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 10:32:06,112][15401] Updated weights for policy 0, policy_version 416510 (0.0034) [2024-06-23 10:32:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42329.3, 300 sec: 42542.8). Total num frames: 6824198144. Throughput: 0: 42590.7. Samples: 6824279440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 10:32:10,014][15401] Updated weights for policy 0, policy_version 416520 (0.0037) [2024-06-23 10:32:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 6824411136. Throughput: 0: 42498.5. Samples: 6824534380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:13,393][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 10:32:13,646][15401] Updated weights for policy 0, policy_version 416530 (0.0027) [2024-06-23 10:32:17,816][15401] Updated weights for policy 0, policy_version 416540 (0.0032) [2024-06-23 10:32:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6824624128. Throughput: 0: 42423.6. Samples: 6824787920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 10:32:21,496][15401] Updated weights for policy 0, policy_version 416550 (0.0051) [2024-06-23 10:32:23,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42054.1, 300 sec: 42487.3). Total num frames: 6824820736. Throughput: 0: 42498.4. Samples: 6824914320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 10:32:25,557][15401] Updated weights for policy 0, policy_version 416560 (0.0032) [2024-06-23 10:32:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 6825033728. Throughput: 0: 42360.5. Samples: 6825164520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 10:32:29,084][15401] Updated weights for policy 0, policy_version 416570 (0.0040) [2024-06-23 10:32:33,184][15401] Updated weights for policy 0, policy_version 416580 (0.0040) [2024-06-23 10:32:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6825246720. Throughput: 0: 42584.5. Samples: 6825425320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 10:32:36,571][15401] Updated weights for policy 0, policy_version 416590 (0.0031) [2024-06-23 10:32:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6825476096. Throughput: 0: 42424.3. Samples: 6825551360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 10:32:40,705][15401] Updated weights for policy 0, policy_version 416600 (0.0036) [2024-06-23 10:32:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 6825672704. Throughput: 0: 42623.6. Samples: 6825810240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 10:32:44,180][15401] Updated weights for policy 0, policy_version 416610 (0.0033) [2024-06-23 10:32:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6825885696. Throughput: 0: 42556.8. Samples: 6826065560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 10:32:48,559][15401] Updated weights for policy 0, policy_version 416620 (0.0039) [2024-06-23 10:32:51,922][15401] Updated weights for policy 0, policy_version 416630 (0.0031) [2024-06-23 10:32:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6826115072. Throughput: 0: 42536.9. Samples: 6826193600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 10:32:56,041][15401] Updated weights for policy 0, policy_version 416640 (0.0046) [2024-06-23 10:32:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6826311680. Throughput: 0: 42511.6. Samples: 6826447300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:32:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 10:32:59,799][15401] Updated weights for policy 0, policy_version 416650 (0.0037) [2024-06-23 10:33:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6826524672. Throughput: 0: 42588.8. Samples: 6826704420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:33:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 10:33:04,095][15401] Updated weights for policy 0, policy_version 416660 (0.0030) [2024-06-23 10:33:07,413][15401] Updated weights for policy 0, policy_version 416670 (0.0037) [2024-06-23 10:33:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6826770432. Throughput: 0: 42532.0. Samples: 6826828260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:33:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 10:33:11,610][15401] Updated weights for policy 0, policy_version 416680 (0.0040) [2024-06-23 10:33:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 6826967040. Throughput: 0: 42639.2. Samples: 6827083280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 10:33:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 10:33:15,229][15401] Updated weights for policy 0, policy_version 416690 (0.0022) [2024-06-23 10:33:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6827163648. Throughput: 0: 42576.9. Samples: 6827341280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:18,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 10:33:19,441][15401] Updated weights for policy 0, policy_version 416700 (0.0043) [2024-06-23 10:33:19,444][15349] Signal inference workers to stop experience collection... (101150 times) [2024-06-23 10:33:19,444][15349] Signal inference workers to resume experience collection... (101150 times) [2024-06-23 10:33:19,463][15401] InferenceWorker_p0-w0: stopping experience collection (101150 times) [2024-06-23 10:33:19,463][15401] InferenceWorker_p0-w0: resuming experience collection (101150 times) [2024-06-23 10:33:22,765][15401] Updated weights for policy 0, policy_version 416710 (0.0040) [2024-06-23 10:33:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 6827409408. Throughput: 0: 42525.8. Samples: 6827465020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 10:33:26,975][15401] Updated weights for policy 0, policy_version 416720 (0.0035) [2024-06-23 10:33:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 6827606016. Throughput: 0: 42559.6. Samples: 6827725420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 10:33:30,618][15401] Updated weights for policy 0, policy_version 416730 (0.0030) [2024-06-23 10:33:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6827786240. Throughput: 0: 42516.0. Samples: 6827978780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 10:33:34,683][15401] Updated weights for policy 0, policy_version 416740 (0.0030) [2024-06-23 10:33:38,148][15401] Updated weights for policy 0, policy_version 416750 (0.0034) [2024-06-23 10:33:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 6828032000. Throughput: 0: 42413.8. Samples: 6828102320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:38,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 10:33:42,302][15401] Updated weights for policy 0, policy_version 416760 (0.0036) [2024-06-23 10:33:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6828244992. Throughput: 0: 42518.7. Samples: 6828360640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 10:33:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416763_6828244992.pth... [2024-06-23 10:33:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416140_6818037760.pth [2024-06-23 10:33:45,897][15401] Updated weights for policy 0, policy_version 416770 (0.0032) [2024-06-23 10:33:48,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 6828441600. Throughput: 0: 42472.0. Samples: 6828615660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 10:33:49,991][15401] Updated weights for policy 0, policy_version 416780 (0.0023) [2024-06-23 10:33:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6828670976. Throughput: 0: 42569.6. Samples: 6828743900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 10:33:53,505][15401] Updated weights for policy 0, policy_version 416790 (0.0043) [2024-06-23 10:33:57,676][15401] Updated weights for policy 0, policy_version 416800 (0.0025) [2024-06-23 10:33:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6828900352. Throughput: 0: 42795.0. Samples: 6829009060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:33:58,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 10:34:01,033][15401] Updated weights for policy 0, policy_version 416810 (0.0032) [2024-06-23 10:34:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6829096960. Throughput: 0: 42555.5. Samples: 6829256280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 10:34:05,238][15401] Updated weights for policy 0, policy_version 416820 (0.0030) [2024-06-23 10:34:08,391][15132] Fps is (10 sec: 40952.6, 60 sec: 42323.9, 300 sec: 42598.1). Total num frames: 6829309952. Throughput: 0: 42650.2. Samples: 6829384360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:08,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 10:34:09,066][15401] Updated weights for policy 0, policy_version 416830 (0.0045) [2024-06-23 10:34:12,965][15401] Updated weights for policy 0, policy_version 416840 (0.0041) [2024-06-23 10:34:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6829506560. Throughput: 0: 42719.1. Samples: 6829647780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 10:34:16,807][15401] Updated weights for policy 0, policy_version 416850 (0.0040) [2024-06-23 10:34:18,390][15132] Fps is (10 sec: 44244.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6829752320. Throughput: 0: 42601.7. Samples: 6829895860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 10:34:20,836][15401] Updated weights for policy 0, policy_version 416860 (0.0029) [2024-06-23 10:34:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6829948928. Throughput: 0: 42907.1. Samples: 6830033040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 10:34:24,435][15401] Updated weights for policy 0, policy_version 416870 (0.0042) [2024-06-23 10:34:28,319][15401] Updated weights for policy 0, policy_version 416880 (0.0031) [2024-06-23 10:34:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6830161920. Throughput: 0: 42795.0. Samples: 6830286420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 10:34:31,979][15401] Updated weights for policy 0, policy_version 416890 (0.0034) [2024-06-23 10:34:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6830374912. Throughput: 0: 42705.4. Samples: 6830537400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 10:34:36,089][15401] Updated weights for policy 0, policy_version 416900 (0.0030) [2024-06-23 10:34:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 6830587904. Throughput: 0: 42815.6. Samples: 6830670600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 10:34:38,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 10:34:39,595][15401] Updated weights for policy 0, policy_version 416910 (0.0033) [2024-06-23 10:34:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6830784512. Throughput: 0: 42573.0. Samples: 6830924840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:34:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 10:34:43,717][15401] Updated weights for policy 0, policy_version 416920 (0.0033) [2024-06-23 10:34:47,075][15401] Updated weights for policy 0, policy_version 416930 (0.0026) [2024-06-23 10:34:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6831013888. Throughput: 0: 42780.5. Samples: 6831181400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:34:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 10:34:51,708][15401] Updated weights for policy 0, policy_version 416940 (0.0032) [2024-06-23 10:34:51,715][15349] Signal inference workers to stop experience collection... (101200 times) [2024-06-23 10:34:51,715][15349] Signal inference workers to resume experience collection... (101200 times) [2024-06-23 10:34:51,752][15401] InferenceWorker_p0-w0: stopping experience collection (101200 times) [2024-06-23 10:34:51,752][15401] InferenceWorker_p0-w0: resuming experience collection (101200 times) [2024-06-23 10:34:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6831243264. Throughput: 0: 42817.2. Samples: 6831311060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:34:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 10:34:54,777][15401] Updated weights for policy 0, policy_version 416950 (0.0042) [2024-06-23 10:34:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6831423488. Throughput: 0: 42485.7. Samples: 6831559640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:34:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 10:34:59,394][15401] Updated weights for policy 0, policy_version 416960 (0.0038) [2024-06-23 10:35:02,314][15401] Updated weights for policy 0, policy_version 416970 (0.0036) [2024-06-23 10:35:03,391][15132] Fps is (10 sec: 39315.0, 60 sec: 42324.1, 300 sec: 42598.1). Total num frames: 6831636480. Throughput: 0: 42698.3. Samples: 6831817360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:03,392][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 10:35:07,065][15401] Updated weights for policy 0, policy_version 416980 (0.0035) [2024-06-23 10:35:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42872.8, 300 sec: 42654.0). Total num frames: 6831882240. Throughput: 0: 42594.7. Samples: 6831949800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:08,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 10:35:09,897][15401] Updated weights for policy 0, policy_version 416990 (0.0039) [2024-06-23 10:35:13,389][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6832062464. Throughput: 0: 42546.7. Samples: 6832201020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 10:35:14,675][15401] Updated weights for policy 0, policy_version 417000 (0.0036) [2024-06-23 10:35:17,847][15401] Updated weights for policy 0, policy_version 417010 (0.0023) [2024-06-23 10:35:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6832291840. Throughput: 0: 42595.4. Samples: 6832454200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 10:35:22,301][15401] Updated weights for policy 0, policy_version 417020 (0.0047) [2024-06-23 10:35:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42654.5). Total num frames: 6832504832. Throughput: 0: 42629.3. Samples: 6832589020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:23,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 10:35:25,593][15401] Updated weights for policy 0, policy_version 417030 (0.0040) [2024-06-23 10:35:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6832717824. Throughput: 0: 42460.0. Samples: 6832835540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 10:35:29,995][15401] Updated weights for policy 0, policy_version 417040 (0.0031) [2024-06-23 10:35:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6832930816. Throughput: 0: 42524.8. Samples: 6833095020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 10:35:33,707][15401] Updated weights for policy 0, policy_version 417050 (0.0028) [2024-06-23 10:35:37,474][15401] Updated weights for policy 0, policy_version 417060 (0.0030) [2024-06-23 10:35:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6833127424. Throughput: 0: 42637.9. Samples: 6833229760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 10:35:41,278][15401] Updated weights for policy 0, policy_version 417070 (0.0028) [2024-06-23 10:35:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6833373184. Throughput: 0: 42814.1. Samples: 6833486280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 10:35:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417076_6833373184.pth... [2024-06-23 10:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416451_6823133184.pth [2024-06-23 10:35:44,993][15401] Updated weights for policy 0, policy_version 417080 (0.0045) [2024-06-23 10:35:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6833569792. Throughput: 0: 42875.6. Samples: 6833746680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 10:35:48,814][15401] Updated weights for policy 0, policy_version 417090 (0.0030) [2024-06-23 10:35:52,548][15401] Updated weights for policy 0, policy_version 417100 (0.0033) [2024-06-23 10:35:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6833782784. Throughput: 0: 42791.5. Samples: 6833875420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 10:35:56,261][15401] Updated weights for policy 0, policy_version 417110 (0.0033) [2024-06-23 10:35:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42765.3). Total num frames: 6834028544. Throughput: 0: 42949.3. Samples: 6834133740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 10:35:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 10:36:00,048][15401] Updated weights for policy 0, policy_version 417120 (0.0033) [2024-06-23 10:36:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43144.1, 300 sec: 42598.9). Total num frames: 6834225152. Throughput: 0: 42916.5. Samples: 6834385540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:03,392][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 10:36:04,218][15401] Updated weights for policy 0, policy_version 417130 (0.0029) [2024-06-23 10:36:07,802][15401] Updated weights for policy 0, policy_version 417140 (0.0043) [2024-06-23 10:36:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6834438144. Throughput: 0: 42798.2. Samples: 6834514840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:08,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 10:36:09,447][15349] Signal inference workers to stop experience collection... (101250 times) [2024-06-23 10:36:09,452][15349] Signal inference workers to resume experience collection... (101250 times) [2024-06-23 10:36:09,461][15401] InferenceWorker_p0-w0: stopping experience collection (101250 times) [2024-06-23 10:36:09,475][15401] InferenceWorker_p0-w0: resuming experience collection (101250 times) [2024-06-23 10:36:11,648][15401] Updated weights for policy 0, policy_version 417150 (0.0038) [2024-06-23 10:36:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6834651136. Throughput: 0: 43132.5. Samples: 6834776500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 10:36:15,803][15401] Updated weights for policy 0, policy_version 417160 (0.0037) [2024-06-23 10:36:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 6834880512. Throughput: 0: 42865.9. Samples: 6835023980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 10:36:19,293][15401] Updated weights for policy 0, policy_version 417170 (0.0037) [2024-06-23 10:36:23,347][15401] Updated weights for policy 0, policy_version 417180 (0.0043) [2024-06-23 10:36:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 6835077120. Throughput: 0: 42779.1. Samples: 6835154820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 10:36:26,958][15401] Updated weights for policy 0, policy_version 417190 (0.0040) [2024-06-23 10:36:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6835290112. Throughput: 0: 42691.2. Samples: 6835407480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:28,393][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 10:36:30,855][15401] Updated weights for policy 0, policy_version 417200 (0.0039) [2024-06-23 10:36:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6835503104. Throughput: 0: 42562.6. Samples: 6835662000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 10:36:34,758][15401] Updated weights for policy 0, policy_version 417210 (0.0042) [2024-06-23 10:36:38,378][15401] Updated weights for policy 0, policy_version 417220 (0.0031) [2024-06-23 10:36:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6835732480. Throughput: 0: 42627.2. Samples: 6835793640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:38,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-23 10:36:42,424][15401] Updated weights for policy 0, policy_version 417230 (0.0033) [2024-06-23 10:36:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 6835912704. Throughput: 0: 42483.7. Samples: 6836045500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:36:46,271][15401] Updated weights for policy 0, policy_version 417240 (0.0032) [2024-06-23 10:36:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6836142080. Throughput: 0: 42430.2. Samples: 6836294800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:48,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-23 10:36:50,105][15401] Updated weights for policy 0, policy_version 417250 (0.0028) [2024-06-23 10:36:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6836355072. Throughput: 0: 42520.5. Samples: 6836428260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 10:36:53,894][15401] Updated weights for policy 0, policy_version 417260 (0.0032) [2024-06-23 10:36:57,816][15401] Updated weights for policy 0, policy_version 417270 (0.0044) [2024-06-23 10:36:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 6836551680. Throughput: 0: 42419.0. Samples: 6836685460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:36:58,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 10:37:01,968][15401] Updated weights for policy 0, policy_version 417280 (0.0038) [2024-06-23 10:37:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 6836781056. Throughput: 0: 42457.3. Samples: 6836934560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:37:03,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 10:37:05,902][15401] Updated weights for policy 0, policy_version 417290 (0.0033) [2024-06-23 10:37:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 6836977664. Throughput: 0: 42428.0. Samples: 6837064080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:37:08,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 10:37:09,564][15401] Updated weights for policy 0, policy_version 417300 (0.0029) [2024-06-23 10:37:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6837190656. Throughput: 0: 42405.8. Samples: 6837315640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:37:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 10:37:13,594][15401] Updated weights for policy 0, policy_version 417310 (0.0031) [2024-06-23 10:37:17,104][15401] Updated weights for policy 0, policy_version 417320 (0.0034) [2024-06-23 10:37:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 6837420032. Throughput: 0: 42415.9. Samples: 6837570720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:37:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 10:37:21,399][15401] Updated weights for policy 0, policy_version 417330 (0.0035) [2024-06-23 10:37:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 6837616640. Throughput: 0: 42452.7. Samples: 6837704120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:23,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 10:37:24,771][15401] Updated weights for policy 0, policy_version 417340 (0.0038) [2024-06-23 10:37:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 6837829632. Throughput: 0: 42411.4. Samples: 6837954020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 10:37:29,038][15401] Updated weights for policy 0, policy_version 417350 (0.0028) [2024-06-23 10:37:32,285][15401] Updated weights for policy 0, policy_version 417360 (0.0028) [2024-06-23 10:37:33,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6838075392. Throughput: 0: 42642.6. Samples: 6838213720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 10:37:34,183][15349] Signal inference workers to stop experience collection... (101300 times) [2024-06-23 10:37:34,183][15349] Signal inference workers to resume experience collection... (101300 times) [2024-06-23 10:37:34,224][15401] InferenceWorker_p0-w0: stopping experience collection (101300 times) [2024-06-23 10:37:34,224][15401] InferenceWorker_p0-w0: resuming experience collection (101300 times) [2024-06-23 10:37:36,625][15401] Updated weights for policy 0, policy_version 417370 (0.0043) [2024-06-23 10:37:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6838272000. Throughput: 0: 42745.4. Samples: 6838351800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 10:37:40,059][15401] Updated weights for policy 0, policy_version 417380 (0.0028) [2024-06-23 10:37:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6838468608. Throughput: 0: 42563.2. Samples: 6838600700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 10:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417387_6838468608.pth... [2024-06-23 10:37:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000416763_6828244992.pth [2024-06-23 10:37:44,578][15401] Updated weights for policy 0, policy_version 417390 (0.0034) [2024-06-23 10:37:47,718][15401] Updated weights for policy 0, policy_version 417400 (0.0029) [2024-06-23 10:37:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6838714368. Throughput: 0: 42678.2. Samples: 6838855080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 10:37:52,095][15401] Updated weights for policy 0, policy_version 417410 (0.0027) [2024-06-23 10:37:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6838927360. Throughput: 0: 42856.0. Samples: 6838992600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:53,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 10:37:55,224][15401] Updated weights for policy 0, policy_version 417420 (0.0033) [2024-06-23 10:37:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 6839091200. Throughput: 0: 42873.9. Samples: 6839244960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:37:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 10:37:59,719][15401] Updated weights for policy 0, policy_version 417430 (0.0032) [2024-06-23 10:38:02,821][15401] Updated weights for policy 0, policy_version 417440 (0.0034) [2024-06-23 10:38:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6839353344. Throughput: 0: 42783.2. Samples: 6839495960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 10:38:07,309][15401] Updated weights for policy 0, policy_version 417450 (0.0038) [2024-06-23 10:38:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6839549952. Throughput: 0: 42872.1. Samples: 6839633260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 10:38:10,447][15401] Updated weights for policy 0, policy_version 417460 (0.0034) [2024-06-23 10:38:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 6839746560. Throughput: 0: 42899.7. Samples: 6839884500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 10:38:14,847][15401] Updated weights for policy 0, policy_version 417470 (0.0038) [2024-06-23 10:38:18,219][15401] Updated weights for policy 0, policy_version 417480 (0.0031) [2024-06-23 10:38:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 6839992320. Throughput: 0: 42829.9. Samples: 6840141060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 10:38:22,692][15401] Updated weights for policy 0, policy_version 417490 (0.0031) [2024-06-23 10:38:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 6840172544. Throughput: 0: 42675.1. Samples: 6840272180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 10:38:25,918][15401] Updated weights for policy 0, policy_version 417500 (0.0026) [2024-06-23 10:38:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6840401920. Throughput: 0: 42718.6. Samples: 6840523040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 10:38:30,322][15401] Updated weights for policy 0, policy_version 417510 (0.0036) [2024-06-23 10:38:33,358][15401] Updated weights for policy 0, policy_version 417520 (0.0035) [2024-06-23 10:38:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 6840647680. Throughput: 0: 42873.8. Samples: 6840784400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 10:38:37,775][15401] Updated weights for policy 0, policy_version 417530 (0.0027) [2024-06-23 10:38:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6840827904. Throughput: 0: 42761.8. Samples: 6840916880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 10:38:41,061][15401] Updated weights for policy 0, policy_version 417540 (0.0027) [2024-06-23 10:38:41,923][15349] Signal inference workers to stop experience collection... (101350 times) [2024-06-23 10:38:41,924][15349] Signal inference workers to resume experience collection... (101350 times) [2024-06-23 10:38:41,963][15401] InferenceWorker_p0-w0: stopping experience collection (101350 times) [2024-06-23 10:38:41,963][15401] InferenceWorker_p0-w0: resuming experience collection (101350 times) [2024-06-23 10:38:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6841040896. Throughput: 0: 42699.5. Samples: 6841166440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:38:43,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 10:38:45,298][15401] Updated weights for policy 0, policy_version 417550 (0.0028) [2024-06-23 10:38:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6841270272. Throughput: 0: 42995.9. Samples: 6841430880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:38:48,401][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 10:38:48,775][15401] Updated weights for policy 0, policy_version 417560 (0.0043) [2024-06-23 10:38:52,894][15401] Updated weights for policy 0, policy_version 417570 (0.0035) [2024-06-23 10:38:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6841466880. Throughput: 0: 42756.0. Samples: 6841557280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:38:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 10:38:56,216][15401] Updated weights for policy 0, policy_version 417580 (0.0034) [2024-06-23 10:38:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 6841696256. Throughput: 0: 42787.0. Samples: 6841809920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:38:58,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 10:39:00,629][15401] Updated weights for policy 0, policy_version 417590 (0.0030) [2024-06-23 10:39:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 6841909248. Throughput: 0: 42818.6. Samples: 6842067900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 10:39:04,130][15401] Updated weights for policy 0, policy_version 417600 (0.0035) [2024-06-23 10:39:08,366][15401] Updated weights for policy 0, policy_version 417610 (0.0043) [2024-06-23 10:39:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6842122240. Throughput: 0: 42789.6. Samples: 6842197720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 10:39:11,699][15401] Updated weights for policy 0, policy_version 417620 (0.0035) [2024-06-23 10:39:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6842335232. Throughput: 0: 42747.5. Samples: 6842446680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 10:39:16,231][15401] Updated weights for policy 0, policy_version 417630 (0.0028) [2024-06-23 10:39:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6842548224. Throughput: 0: 42691.9. Samples: 6842705540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 10:39:19,270][15401] Updated weights for policy 0, policy_version 417640 (0.0035) [2024-06-23 10:39:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 6842744832. Throughput: 0: 42662.1. Samples: 6842836680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 10:39:23,948][15401] Updated weights for policy 0, policy_version 417650 (0.0043) [2024-06-23 10:39:27,054][15401] Updated weights for policy 0, policy_version 417660 (0.0029) [2024-06-23 10:39:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6842974208. Throughput: 0: 42669.6. Samples: 6843086580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 10:39:31,635][15401] Updated weights for policy 0, policy_version 417670 (0.0027) [2024-06-23 10:39:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6843187200. Throughput: 0: 42560.1. Samples: 6843345980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 10:39:34,714][15401] Updated weights for policy 0, policy_version 417680 (0.0045) [2024-06-23 10:39:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6843383808. Throughput: 0: 42606.3. Samples: 6843474560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 10:39:39,340][15401] Updated weights for policy 0, policy_version 417690 (0.0040) [2024-06-23 10:39:42,392][15401] Updated weights for policy 0, policy_version 417700 (0.0035) [2024-06-23 10:39:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 6843629568. Throughput: 0: 42649.2. Samples: 6843729140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 10:39:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417702_6843629568.pth... [2024-06-23 10:39:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417076_6833373184.pth [2024-06-23 10:39:46,974][15401] Updated weights for policy 0, policy_version 417710 (0.0023) [2024-06-23 10:39:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6843842560. Throughput: 0: 42666.7. Samples: 6843987900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 10:39:50,060][15401] Updated weights for policy 0, policy_version 417720 (0.0034) [2024-06-23 10:39:53,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6844022784. Throughput: 0: 42628.5. Samples: 6844116000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 10:39:54,495][15401] Updated weights for policy 0, policy_version 417730 (0.0043) [2024-06-23 10:39:57,719][15401] Updated weights for policy 0, policy_version 417740 (0.0023) [2024-06-23 10:39:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 6844284928. Throughput: 0: 42878.1. Samples: 6844376200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:39:58,393][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 10:40:02,174][15401] Updated weights for policy 0, policy_version 417750 (0.0043) [2024-06-23 10:40:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6844481536. Throughput: 0: 42813.4. Samples: 6844632140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 10:40:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 10:40:05,224][15401] Updated weights for policy 0, policy_version 417760 (0.0037) [2024-06-23 10:40:08,392][15132] Fps is (10 sec: 37674.7, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 6844661760. Throughput: 0: 42813.9. Samples: 6844763400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:08,392][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 10:40:09,745][15401] Updated weights for policy 0, policy_version 417770 (0.0036) [2024-06-23 10:40:12,948][15401] Updated weights for policy 0, policy_version 417780 (0.0033) [2024-06-23 10:40:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 6844923904. Throughput: 0: 42985.1. Samples: 6845020900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:13,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-23 10:40:17,467][15401] Updated weights for policy 0, policy_version 417790 (0.0036) [2024-06-23 10:40:18,390][15132] Fps is (10 sec: 47524.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 6845136896. Throughput: 0: 42834.1. Samples: 6845273520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 10:40:20,513][15401] Updated weights for policy 0, policy_version 417800 (0.0031) [2024-06-23 10:40:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6845317120. Throughput: 0: 42897.7. Samples: 6845404960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:23,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-23 10:40:25,139][15401] Updated weights for policy 0, policy_version 417810 (0.0024) [2024-06-23 10:40:28,139][15401] Updated weights for policy 0, policy_version 417820 (0.0029) [2024-06-23 10:40:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 6845562880. Throughput: 0: 42897.1. Samples: 6845659500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 10:40:32,763][15401] Updated weights for policy 0, policy_version 417830 (0.0031) [2024-06-23 10:40:33,393][15132] Fps is (10 sec: 45858.2, 60 sec: 43141.8, 300 sec: 42875.5). Total num frames: 6845775872. Throughput: 0: 42871.5. Samples: 6845917280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:33,402][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 10:40:35,085][15349] Signal inference workers to stop experience collection... (101400 times) [2024-06-23 10:40:35,092][15349] Signal inference workers to resume experience collection... (101400 times) [2024-06-23 10:40:35,131][15401] InferenceWorker_p0-w0: stopping experience collection (101400 times) [2024-06-23 10:40:35,131][15401] InferenceWorker_p0-w0: resuming experience collection (101400 times) [2024-06-23 10:40:35,695][15401] Updated weights for policy 0, policy_version 417840 (0.0026) [2024-06-23 10:40:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6845939712. Throughput: 0: 42838.7. Samples: 6846043740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 10:40:40,404][15401] Updated weights for policy 0, policy_version 417850 (0.0030) [2024-06-23 10:40:43,185][15401] Updated weights for policy 0, policy_version 417860 (0.0030) [2024-06-23 10:40:43,389][15132] Fps is (10 sec: 44253.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 6846218240. Throughput: 0: 42782.4. Samples: 6846301400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 10:40:47,979][15401] Updated weights for policy 0, policy_version 417870 (0.0036) [2024-06-23 10:40:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6846398464. Throughput: 0: 42831.9. Samples: 6846559580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 10:40:51,216][15401] Updated weights for policy 0, policy_version 417880 (0.0030) [2024-06-23 10:40:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6846611456. Throughput: 0: 42718.3. Samples: 6846685620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 10:40:55,694][15401] Updated weights for policy 0, policy_version 417890 (0.0038) [2024-06-23 10:40:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 6846840832. Throughput: 0: 42734.6. Samples: 6846943960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:40:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 10:40:58,718][15401] Updated weights for policy 0, policy_version 417900 (0.0035) [2024-06-23 10:41:03,207][15401] Updated weights for policy 0, policy_version 417910 (0.0027) [2024-06-23 10:41:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6847037440. Throughput: 0: 42823.2. Samples: 6847200560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:41:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 10:41:06,470][15401] Updated weights for policy 0, policy_version 417920 (0.0034) [2024-06-23 10:41:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 6847234048. Throughput: 0: 42610.8. Samples: 6847322440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:41:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 10:41:11,021][15401] Updated weights for policy 0, policy_version 417930 (0.0035) [2024-06-23 10:41:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6847463424. Throughput: 0: 42867.5. Samples: 6847588540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:41:13,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 10:41:13,954][15401] Updated weights for policy 0, policy_version 417940 (0.0038) [2024-06-23 10:41:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 6847660032. Throughput: 0: 42802.6. Samples: 6847843340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:41:18,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 10:41:18,891][15401] Updated weights for policy 0, policy_version 417950 (0.0027) [2024-06-23 10:41:21,526][15401] Updated weights for policy 0, policy_version 417960 (0.0033) [2024-06-23 10:41:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 6847873024. Throughput: 0: 42775.1. Samples: 6847968620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 10:41:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 10:41:26,561][15401] Updated weights for policy 0, policy_version 417970 (0.0033) [2024-06-23 10:41:28,390][15132] Fps is (10 sec: 45883.6, 60 sec: 42598.0, 300 sec: 42764.9). Total num frames: 6848118784. Throughput: 0: 42827.5. Samples: 6848228660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:28,391][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 10:41:29,507][15401] Updated weights for policy 0, policy_version 417980 (0.0045) [2024-06-23 10:41:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42328.0, 300 sec: 42653.9). Total num frames: 6848315392. Throughput: 0: 42788.1. Samples: 6848485040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 10:41:34,368][15401] Updated weights for policy 0, policy_version 417990 (0.0042) [2024-06-23 10:41:37,075][15401] Updated weights for policy 0, policy_version 418000 (0.0031) [2024-06-23 10:41:38,389][15132] Fps is (10 sec: 40962.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6848528384. Throughput: 0: 42739.6. Samples: 6848608900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 10:41:41,961][15401] Updated weights for policy 0, policy_version 418010 (0.0043) [2024-06-23 10:41:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 6848757760. Throughput: 0: 42886.1. Samples: 6848873840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:43,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 10:41:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418015_6848757760.pth... [2024-06-23 10:41:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417387_6838468608.pth [2024-06-23 10:41:44,367][15349] Signal inference workers to stop experience collection... (101450 times) [2024-06-23 10:41:44,368][15349] Signal inference workers to resume experience collection... (101450 times) [2024-06-23 10:41:44,407][15401] InferenceWorker_p0-w0: stopping experience collection (101450 times) [2024-06-23 10:41:44,407][15401] InferenceWorker_p0-w0: resuming experience collection (101450 times) [2024-06-23 10:41:45,060][15401] Updated weights for policy 0, policy_version 418020 (0.0044) [2024-06-23 10:41:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6848954368. Throughput: 0: 42704.9. Samples: 6849122280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 10:41:49,696][15401] Updated weights for policy 0, policy_version 418030 (0.0045) [2024-06-23 10:41:52,747][15401] Updated weights for policy 0, policy_version 418040 (0.0029) [2024-06-23 10:41:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 6849167360. Throughput: 0: 42817.2. Samples: 6849249220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 10:41:57,376][15401] Updated weights for policy 0, policy_version 418050 (0.0036) [2024-06-23 10:41:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 6849380352. Throughput: 0: 42673.6. Samples: 6849508860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:41:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 10:42:00,578][15401] Updated weights for policy 0, policy_version 418060 (0.0028) [2024-06-23 10:42:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6849593344. Throughput: 0: 42546.6. Samples: 6849757840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 10:42:04,946][15401] Updated weights for policy 0, policy_version 418070 (0.0025) [2024-06-23 10:42:08,341][15401] Updated weights for policy 0, policy_version 418080 (0.0040) [2024-06-23 10:42:08,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 6849822720. Throughput: 0: 42572.0. Samples: 6849884460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:08,392][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:42:13,118][15401] Updated weights for policy 0, policy_version 418090 (0.0033) [2024-06-23 10:42:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 6850002944. Throughput: 0: 42578.8. Samples: 6850144680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 10:42:16,171][15401] Updated weights for policy 0, policy_version 418100 (0.0033) [2024-06-23 10:42:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42873.1, 300 sec: 42765.4). Total num frames: 6850232320. Throughput: 0: 42481.7. Samples: 6850396720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 10:42:20,675][15401] Updated weights for policy 0, policy_version 418110 (0.0036) [2024-06-23 10:42:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6850445312. Throughput: 0: 42472.3. Samples: 6850520160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 10:42:23,790][15401] Updated weights for policy 0, policy_version 418120 (0.0028) [2024-06-23 10:42:28,161][15401] Updated weights for policy 0, policy_version 418130 (0.0032) [2024-06-23 10:42:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.6, 300 sec: 42598.4). Total num frames: 6850641920. Throughput: 0: 42383.6. Samples: 6850781100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 10:42:31,614][15401] Updated weights for policy 0, policy_version 418140 (0.0043) [2024-06-23 10:42:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 6850871296. Throughput: 0: 42489.7. Samples: 6851034320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 10:42:35,693][15401] Updated weights for policy 0, policy_version 418150 (0.0033) [2024-06-23 10:42:38,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 6851084288. Throughput: 0: 42493.1. Samples: 6851161680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:38,396][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 10:42:39,134][15401] Updated weights for policy 0, policy_version 418160 (0.0023) [2024-06-23 10:42:43,288][15401] Updated weights for policy 0, policy_version 418170 (0.0036) [2024-06-23 10:42:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 6851297280. Throughput: 0: 42637.6. Samples: 6851427540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 10:42:47,101][15401] Updated weights for policy 0, policy_version 418180 (0.0028) [2024-06-23 10:42:48,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6851526656. Throughput: 0: 42693.9. Samples: 6851679060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 10:42:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 10:42:50,886][15401] Updated weights for policy 0, policy_version 418190 (0.0031) [2024-06-23 10:42:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 6851723264. Throughput: 0: 42663.5. Samples: 6851804320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:42:53,393][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 10:42:54,790][15401] Updated weights for policy 0, policy_version 418200 (0.0044) [2024-06-23 10:42:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6851952640. Throughput: 0: 42672.0. Samples: 6852064920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:42:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 10:42:58,391][15401] Updated weights for policy 0, policy_version 418210 (0.0028) [2024-06-23 10:43:02,287][15401] Updated weights for policy 0, policy_version 418220 (0.0032) [2024-06-23 10:43:03,286][15349] Signal inference workers to stop experience collection... (101500 times) [2024-06-23 10:43:03,286][15349] Signal inference workers to resume experience collection... (101500 times) [2024-06-23 10:43:03,316][15401] InferenceWorker_p0-w0: stopping experience collection (101500 times) [2024-06-23 10:43:03,316][15401] InferenceWorker_p0-w0: resuming experience collection (101500 times) [2024-06-23 10:43:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6852149248. Throughput: 0: 42861.0. Samples: 6852325460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 10:43:05,984][15401] Updated weights for policy 0, policy_version 418230 (0.0024) [2024-06-23 10:43:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 6852378624. Throughput: 0: 42893.0. Samples: 6852450340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 10:43:09,687][15401] Updated weights for policy 0, policy_version 418240 (0.0032) [2024-06-23 10:43:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6852591616. Throughput: 0: 42800.5. Samples: 6852707120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 10:43:13,780][15401] Updated weights for policy 0, policy_version 418250 (0.0034) [2024-06-23 10:43:17,373][15401] Updated weights for policy 0, policy_version 418260 (0.0033) [2024-06-23 10:43:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6852788224. Throughput: 0: 42796.5. Samples: 6852960160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 10:43:21,300][15401] Updated weights for policy 0, policy_version 418270 (0.0039) [2024-06-23 10:43:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6853017600. Throughput: 0: 42856.7. Samples: 6853089960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 10:43:25,000][15401] Updated weights for policy 0, policy_version 418280 (0.0027) [2024-06-23 10:43:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 6853230592. Throughput: 0: 42719.4. Samples: 6853350020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:28,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 10:43:28,911][15401] Updated weights for policy 0, policy_version 418290 (0.0024) [2024-06-23 10:43:32,581][15401] Updated weights for policy 0, policy_version 418300 (0.0033) [2024-06-23 10:43:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6853443584. Throughput: 0: 42800.7. Samples: 6853605100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 10:43:36,674][15401] Updated weights for policy 0, policy_version 418310 (0.0030) [2024-06-23 10:43:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 6853656576. Throughput: 0: 42850.7. Samples: 6853732500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 10:43:40,349][15401] Updated weights for policy 0, policy_version 418320 (0.0034) [2024-06-23 10:43:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 6853853184. Throughput: 0: 42818.6. Samples: 6853991760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 10:43:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418328_6853885952.pth... [2024-06-23 10:43:43,595][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000417702_6843629568.pth [2024-06-23 10:43:44,253][15401] Updated weights for policy 0, policy_version 418330 (0.0022) [2024-06-23 10:43:47,983][15401] Updated weights for policy 0, policy_version 418340 (0.0027) [2024-06-23 10:43:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6854082560. Throughput: 0: 42666.1. Samples: 6854245440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 10:43:51,717][15401] Updated weights for policy 0, policy_version 418350 (0.0035) [2024-06-23 10:43:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6854295552. Throughput: 0: 42823.2. Samples: 6854377380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 10:43:55,918][15401] Updated weights for policy 0, policy_version 418360 (0.0042) [2024-06-23 10:43:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 6854492160. Throughput: 0: 42681.3. Samples: 6854627780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:43:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 10:43:59,477][15401] Updated weights for policy 0, policy_version 418370 (0.0026) [2024-06-23 10:44:03,394][15132] Fps is (10 sec: 40941.6, 60 sec: 42595.2, 300 sec: 42653.3). Total num frames: 6854705152. Throughput: 0: 42805.2. Samples: 6854886580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:44:03,394][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 10:44:04,007][15401] Updated weights for policy 0, policy_version 418380 (0.0029) [2024-06-23 10:44:07,205][15401] Updated weights for policy 0, policy_version 418390 (0.0038) [2024-06-23 10:44:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6854950912. Throughput: 0: 42783.2. Samples: 6855015200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 10:44:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 10:44:11,701][15401] Updated weights for policy 0, policy_version 418400 (0.0032) [2024-06-23 10:44:13,390][15132] Fps is (10 sec: 44256.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6855147520. Throughput: 0: 42602.7. Samples: 6855267040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 10:44:14,922][15401] Updated weights for policy 0, policy_version 418410 (0.0034) [2024-06-23 10:44:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6855360512. Throughput: 0: 42505.8. Samples: 6855517860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:18,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 10:44:19,362][15401] Updated weights for policy 0, policy_version 418420 (0.0035) [2024-06-23 10:44:21,901][15349] Signal inference workers to stop experience collection... (101550 times) [2024-06-23 10:44:21,951][15401] InferenceWorker_p0-w0: stopping experience collection (101550 times) [2024-06-23 10:44:21,954][15349] Signal inference workers to resume experience collection... (101550 times) [2024-06-23 10:44:21,960][15401] InferenceWorker_p0-w0: resuming experience collection (101550 times) [2024-06-23 10:44:22,640][15401] Updated weights for policy 0, policy_version 418430 (0.0033) [2024-06-23 10:44:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6855589888. Throughput: 0: 42500.9. Samples: 6855645040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 10:44:27,217][15401] Updated weights for policy 0, policy_version 418440 (0.0038) [2024-06-23 10:44:28,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 6855770112. Throughput: 0: 42579.7. Samples: 6855907840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 10:44:30,439][15401] Updated weights for policy 0, policy_version 418450 (0.0040) [2024-06-23 10:44:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6855999488. Throughput: 0: 42516.0. Samples: 6856158660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 10:44:34,739][15401] Updated weights for policy 0, policy_version 418460 (0.0034) [2024-06-23 10:44:38,137][15401] Updated weights for policy 0, policy_version 418470 (0.0032) [2024-06-23 10:44:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6856228864. Throughput: 0: 42487.6. Samples: 6856289320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 10:44:42,186][15401] Updated weights for policy 0, policy_version 418480 (0.0040) [2024-06-23 10:44:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6856409088. Throughput: 0: 42711.7. Samples: 6856549800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 10:44:45,608][15401] Updated weights for policy 0, policy_version 418490 (0.0030) [2024-06-23 10:44:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6856622080. Throughput: 0: 42562.8. Samples: 6856801720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 10:44:49,789][15401] Updated weights for policy 0, policy_version 418500 (0.0035) [2024-06-23 10:44:53,227][15401] Updated weights for policy 0, policy_version 418510 (0.0043) [2024-06-23 10:44:53,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 6856867840. Throughput: 0: 42532.8. Samples: 6856929280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:53,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 10:44:57,626][15401] Updated weights for policy 0, policy_version 418520 (0.0036) [2024-06-23 10:44:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6857048064. Throughput: 0: 42656.8. Samples: 6857186600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:44:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 10:45:00,914][15401] Updated weights for policy 0, policy_version 418530 (0.0039) [2024-06-23 10:45:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42874.6, 300 sec: 42765.4). Total num frames: 6857277440. Throughput: 0: 42723.6. Samples: 6857440420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 10:45:05,207][15401] Updated weights for policy 0, policy_version 418540 (0.0037) [2024-06-23 10:45:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6857490432. Throughput: 0: 42809.4. Samples: 6857571460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:08,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 10:45:08,830][15401] Updated weights for policy 0, policy_version 418550 (0.0036) [2024-06-23 10:45:12,803][15401] Updated weights for policy 0, policy_version 418560 (0.0040) [2024-06-23 10:45:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6857687040. Throughput: 0: 42597.2. Samples: 6857824720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 10:45:16,740][15401] Updated weights for policy 0, policy_version 418570 (0.0041) [2024-06-23 10:45:18,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 6857900032. Throughput: 0: 42582.3. Samples: 6858074960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:18,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 10:45:20,448][15401] Updated weights for policy 0, policy_version 418580 (0.0029) [2024-06-23 10:45:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6858129408. Throughput: 0: 42571.5. Samples: 6858205040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 10:45:24,194][15401] Updated weights for policy 0, policy_version 418590 (0.0048) [2024-06-23 10:45:28,327][15401] Updated weights for policy 0, policy_version 418600 (0.0022) [2024-06-23 10:45:28,394][15132] Fps is (10 sec: 44225.2, 60 sec: 42867.8, 300 sec: 42598.2). Total num frames: 6858342400. Throughput: 0: 42525.9. Samples: 6858463680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 10:45:28,395][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 10:45:31,670][15401] Updated weights for policy 0, policy_version 418610 (0.0028) [2024-06-23 10:45:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6858555392. Throughput: 0: 42579.2. Samples: 6858717780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 10:45:35,756][15401] Updated weights for policy 0, policy_version 418620 (0.0044) [2024-06-23 10:45:37,769][15349] Signal inference workers to stop experience collection... (101600 times) [2024-06-23 10:45:37,769][15349] Signal inference workers to resume experience collection... (101600 times) [2024-06-23 10:45:37,795][15401] InferenceWorker_p0-w0: stopping experience collection (101600 times) [2024-06-23 10:45:37,796][15401] InferenceWorker_p0-w0: resuming experience collection (101600 times) [2024-06-23 10:45:38,390][15132] Fps is (10 sec: 42619.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 6858768384. Throughput: 0: 42675.1. Samples: 6858849560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 10:45:39,354][15401] Updated weights for policy 0, policy_version 418630 (0.0043) [2024-06-23 10:45:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6858981376. Throughput: 0: 42625.0. Samples: 6859104720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 10:45:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418640_6858997760.pth... [2024-06-23 10:45:43,486][15401] Updated weights for policy 0, policy_version 418640 (0.0032) [2024-06-23 10:45:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418015_6848757760.pth [2024-06-23 10:45:47,010][15401] Updated weights for policy 0, policy_version 418650 (0.0027) [2024-06-23 10:45:48,390][15132] Fps is (10 sec: 42596.9, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 6859194368. Throughput: 0: 42615.6. Samples: 6859358140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:48,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 10:45:51,175][15401] Updated weights for policy 0, policy_version 418660 (0.0029) [2024-06-23 10:45:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6859407360. Throughput: 0: 42545.7. Samples: 6859486020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:53,393][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 10:45:54,678][15401] Updated weights for policy 0, policy_version 418670 (0.0033) [2024-06-23 10:45:58,390][15132] Fps is (10 sec: 42600.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6859620352. Throughput: 0: 42656.5. Samples: 6859744260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:45:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 10:45:59,074][15401] Updated weights for policy 0, policy_version 418680 (0.0032) [2024-06-23 10:46:02,349][15401] Updated weights for policy 0, policy_version 418690 (0.0037) [2024-06-23 10:46:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6859833344. Throughput: 0: 42525.4. Samples: 6859988500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 10:46:06,704][15401] Updated weights for policy 0, policy_version 418700 (0.0037) [2024-06-23 10:46:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6860029952. Throughput: 0: 42566.2. Samples: 6860120520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 10:46:09,969][15401] Updated weights for policy 0, policy_version 418710 (0.0041) [2024-06-23 10:46:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 6860259328. Throughput: 0: 42624.0. Samples: 6860381540. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 10:46:14,383][15401] Updated weights for policy 0, policy_version 418720 (0.0031) [2024-06-23 10:46:17,717][15401] Updated weights for policy 0, policy_version 418730 (0.0033) [2024-06-23 10:46:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 6860488704. Throughput: 0: 42480.9. Samples: 6860629420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 10:46:22,010][15401] Updated weights for policy 0, policy_version 418740 (0.0022) [2024-06-23 10:46:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 6860652544. Throughput: 0: 42452.6. Samples: 6860759920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 10:46:25,278][15401] Updated weights for policy 0, policy_version 418750 (0.0047) [2024-06-23 10:46:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42328.9, 300 sec: 42598.4). Total num frames: 6860881920. Throughput: 0: 42409.3. Samples: 6861013140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:28,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 10:46:29,659][15401] Updated weights for policy 0, policy_version 418760 (0.0040) [2024-06-23 10:46:33,076][15401] Updated weights for policy 0, policy_version 418770 (0.0028) [2024-06-23 10:46:33,394][15132] Fps is (10 sec: 49129.6, 60 sec: 43141.3, 300 sec: 42764.4). Total num frames: 6861144064. Throughput: 0: 42300.2. Samples: 6861261820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:33,395][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 10:46:37,952][15401] Updated weights for policy 0, policy_version 418780 (0.0029) [2024-06-23 10:46:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6861307904. Throughput: 0: 42570.7. Samples: 6861401700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 10:46:40,722][15401] Updated weights for policy 0, policy_version 418790 (0.0030) [2024-06-23 10:46:43,390][15132] Fps is (10 sec: 37700.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6861520896. Throughput: 0: 42475.1. Samples: 6861655640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 10:46:45,444][15401] Updated weights for policy 0, policy_version 418800 (0.0032) [2024-06-23 10:46:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 6861766656. Throughput: 0: 42566.2. Samples: 6861903980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 10:46:48,423][15401] Updated weights for policy 0, policy_version 418810 (0.0029) [2024-06-23 10:46:52,178][15349] Signal inference workers to stop experience collection... (101650 times) [2024-06-23 10:46:52,179][15349] Signal inference workers to resume experience collection... (101650 times) [2024-06-23 10:46:52,203][15401] InferenceWorker_p0-w0: stopping experience collection (101650 times) [2024-06-23 10:46:52,203][15401] InferenceWorker_p0-w0: resuming experience collection (101650 times) [2024-06-23 10:46:53,015][15401] Updated weights for policy 0, policy_version 418820 (0.0037) [2024-06-23 10:46:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6861946880. Throughput: 0: 42619.2. Samples: 6862038380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-23 10:46:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 10:46:56,023][15401] Updated weights for policy 0, policy_version 418830 (0.0043) [2024-06-23 10:46:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6862159872. Throughput: 0: 42397.7. Samples: 6862289440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:46:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 10:47:00,621][15401] Updated weights for policy 0, policy_version 418840 (0.0031) [2024-06-23 10:47:03,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42653.9). Total num frames: 6862405632. Throughput: 0: 42507.9. Samples: 6862542380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:03,392][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 10:47:03,789][15401] Updated weights for policy 0, policy_version 418850 (0.0046) [2024-06-23 10:47:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6862585856. Throughput: 0: 42609.2. Samples: 6862677340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:47:08,501][15401] Updated weights for policy 0, policy_version 418860 (0.0029) [2024-06-23 10:47:11,370][15401] Updated weights for policy 0, policy_version 418870 (0.0028) [2024-06-23 10:47:13,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 6862815232. Throughput: 0: 42478.2. Samples: 6862924760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:13,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 10:47:16,388][15401] Updated weights for policy 0, policy_version 418880 (0.0039) [2024-06-23 10:47:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6863028224. Throughput: 0: 42668.7. Samples: 6863181720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:47:19,417][15401] Updated weights for policy 0, policy_version 418890 (0.0042) [2024-06-23 10:47:23,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6863208448. Throughput: 0: 42310.1. Samples: 6863305660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:47:24,299][15401] Updated weights for policy 0, policy_version 418900 (0.0038) [2024-06-23 10:47:26,967][15401] Updated weights for policy 0, policy_version 418910 (0.0028) [2024-06-23 10:47:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6863437824. Throughput: 0: 42191.5. Samples: 6863554260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:28,392][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 10:47:32,045][15401] Updated weights for policy 0, policy_version 418920 (0.0035) [2024-06-23 10:47:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41782.3, 300 sec: 42599.3). Total num frames: 6863650816. Throughput: 0: 42566.6. Samples: 6863819480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:33,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 10:47:34,746][15401] Updated weights for policy 0, policy_version 418930 (0.0041) [2024-06-23 10:47:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6863847424. Throughput: 0: 42325.3. Samples: 6863943020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:38,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 10:47:39,782][15401] Updated weights for policy 0, policy_version 418940 (0.0028) [2024-06-23 10:47:42,782][15401] Updated weights for policy 0, policy_version 418950 (0.0032) [2024-06-23 10:47:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 6864076800. Throughput: 0: 42320.8. Samples: 6864193880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:43,394][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 10:47:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418950_6864076800.pth... [2024-06-23 10:47:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418328_6853885952.pth [2024-06-23 10:47:47,855][15401] Updated weights for policy 0, policy_version 418960 (0.0036) [2024-06-23 10:47:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 41504.5, 300 sec: 42487.3). Total num frames: 6864257024. Throughput: 0: 42383.1. Samples: 6864449620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:48,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 10:47:50,545][15401] Updated weights for policy 0, policy_version 418970 (0.0030) [2024-06-23 10:47:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6864486400. Throughput: 0: 42120.0. Samples: 6864572740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:53,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 10:47:55,583][15401] Updated weights for policy 0, policy_version 418980 (0.0030) [2024-06-23 10:47:58,329][15401] Updated weights for policy 0, policy_version 418990 (0.0025) [2024-06-23 10:47:58,390][15132] Fps is (10 sec: 47524.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6864732160. Throughput: 0: 42220.8. Samples: 6864824600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:47:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 10:48:03,239][15401] Updated weights for policy 0, policy_version 419000 (0.0031) [2024-06-23 10:48:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41507.8, 300 sec: 42431.8). Total num frames: 6864896000. Throughput: 0: 42290.6. Samples: 6865084800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:48:03,399][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 10:48:06,309][15401] Updated weights for policy 0, policy_version 419010 (0.0029) [2024-06-23 10:48:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6865125376. Throughput: 0: 42177.4. Samples: 6865203640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:48:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 10:48:10,792][15401] Updated weights for policy 0, policy_version 419020 (0.0048) [2024-06-23 10:48:13,391][15132] Fps is (10 sec: 45870.5, 60 sec: 42326.3, 300 sec: 42598.3). Total num frames: 6865354752. Throughput: 0: 42437.8. Samples: 6865464000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 10:48:13,391][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 10:48:14,140][15401] Updated weights for policy 0, policy_version 419030 (0.0037) [2024-06-23 10:48:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 6865534976. Throughput: 0: 42092.7. Samples: 6865713640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 10:48:18,447][15401] Updated weights for policy 0, policy_version 419040 (0.0030) [2024-06-23 10:48:21,791][15401] Updated weights for policy 0, policy_version 419050 (0.0025) [2024-06-23 10:48:23,390][15132] Fps is (10 sec: 42602.5, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 6865780736. Throughput: 0: 42211.9. Samples: 6865842560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:23,391][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 10:48:25,821][15401] Updated weights for policy 0, policy_version 419060 (0.0037) [2024-06-23 10:48:27,221][15349] Signal inference workers to stop experience collection... (101700 times) [2024-06-23 10:48:27,249][15401] InferenceWorker_p0-w0: stopping experience collection (101700 times) [2024-06-23 10:48:27,275][15349] Signal inference workers to resume experience collection... (101700 times) [2024-06-23 10:48:27,280][15401] InferenceWorker_p0-w0: resuming experience collection (101700 times) [2024-06-23 10:48:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6865977344. Throughput: 0: 42558.7. Samples: 6866109020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 10:48:29,336][15401] Updated weights for policy 0, policy_version 419070 (0.0044) [2024-06-23 10:48:33,299][15401] Updated weights for policy 0, policy_version 419080 (0.0034) [2024-06-23 10:48:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6866206720. Throughput: 0: 42405.4. Samples: 6866357760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 10:48:36,981][15401] Updated weights for policy 0, policy_version 419090 (0.0030) [2024-06-23 10:48:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6866419712. Throughput: 0: 42577.8. Samples: 6866488740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:38,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 10:48:41,264][15401] Updated weights for policy 0, policy_version 419100 (0.0034) [2024-06-23 10:48:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6866616320. Throughput: 0: 42652.0. Samples: 6866743940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:43,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 10:48:45,176][15401] Updated weights for policy 0, policy_version 419110 (0.0031) [2024-06-23 10:48:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 6866829312. Throughput: 0: 42293.4. Samples: 6866988000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 10:48:49,166][15401] Updated weights for policy 0, policy_version 419120 (0.0031) [2024-06-23 10:48:52,943][15401] Updated weights for policy 0, policy_version 419130 (0.0034) [2024-06-23 10:48:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6867042304. Throughput: 0: 42522.6. Samples: 6867117160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 10:48:57,066][15401] Updated weights for policy 0, policy_version 419140 (0.0039) [2024-06-23 10:48:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42543.5). Total num frames: 6867255296. Throughput: 0: 42474.2. Samples: 6867375300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:48:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 10:49:00,438][15401] Updated weights for policy 0, policy_version 419150 (0.0033) [2024-06-23 10:49:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 6867451904. Throughput: 0: 42571.4. Samples: 6867629360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 10:49:04,504][15401] Updated weights for policy 0, policy_version 419160 (0.0046) [2024-06-23 10:49:08,261][15401] Updated weights for policy 0, policy_version 419170 (0.0030) [2024-06-23 10:49:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6867681280. Throughput: 0: 42576.1. Samples: 6867758480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 10:49:12,243][15401] Updated weights for policy 0, policy_version 419180 (0.0028) [2024-06-23 10:49:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41780.0, 300 sec: 42376.3). Total num frames: 6867861504. Throughput: 0: 42233.9. Samples: 6868009540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 10:49:15,830][15401] Updated weights for policy 0, policy_version 419190 (0.0028) [2024-06-23 10:49:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 6868107264. Throughput: 0: 42334.6. Samples: 6868262820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 10:49:19,722][15401] Updated weights for policy 0, policy_version 419200 (0.0027) [2024-06-23 10:49:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6868320256. Throughput: 0: 42333.9. Samples: 6868393760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:23,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 10:49:23,475][15401] Updated weights for policy 0, policy_version 419210 (0.0030) [2024-06-23 10:49:27,148][15401] Updated weights for policy 0, policy_version 419220 (0.0046) [2024-06-23 10:49:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 6868516864. Throughput: 0: 42241.5. Samples: 6868644800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 10:49:31,264][15401] Updated weights for policy 0, policy_version 419230 (0.0030) [2024-06-23 10:49:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6868746240. Throughput: 0: 42493.3. Samples: 6868900200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 10:49:33,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 10:49:34,923][15401] Updated weights for policy 0, policy_version 419240 (0.0027) [2024-06-23 10:49:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6868959232. Throughput: 0: 42646.3. Samples: 6869036240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:49:38,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-23 10:49:39,083][15401] Updated weights for policy 0, policy_version 419250 (0.0037) [2024-06-23 10:49:42,592][15401] Updated weights for policy 0, policy_version 419260 (0.0044) [2024-06-23 10:49:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6869172224. Throughput: 0: 42484.0. Samples: 6869287080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:49:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 10:49:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419261_6869172224.pth... [2024-06-23 10:49:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418640_6858997760.pth [2024-06-23 10:49:46,865][15401] Updated weights for policy 0, policy_version 419270 (0.0024) [2024-06-23 10:49:47,426][15349] Signal inference workers to stop experience collection... (101750 times) [2024-06-23 10:49:47,470][15401] InferenceWorker_p0-w0: stopping experience collection (101750 times) [2024-06-23 10:49:47,547][15349] Signal inference workers to resume experience collection... (101750 times) [2024-06-23 10:49:47,547][15401] InferenceWorker_p0-w0: resuming experience collection (101750 times) [2024-06-23 10:49:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 6869385216. Throughput: 0: 42522.3. Samples: 6869542860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:49:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 10:49:50,448][15401] Updated weights for policy 0, policy_version 419280 (0.0043) [2024-06-23 10:49:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6869598208. Throughput: 0: 42494.0. Samples: 6869670720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:49:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 10:49:54,403][15401] Updated weights for policy 0, policy_version 419290 (0.0033) [2024-06-23 10:49:58,028][15401] Updated weights for policy 0, policy_version 419300 (0.0037) [2024-06-23 10:49:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6869811200. Throughput: 0: 42681.1. Samples: 6869930200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:49:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 10:50:01,915][15401] Updated weights for policy 0, policy_version 419310 (0.0035) [2024-06-23 10:50:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 6870024192. Throughput: 0: 42584.0. Samples: 6870179100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 10:50:05,557][15401] Updated weights for policy 0, policy_version 419320 (0.0021) [2024-06-23 10:50:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 6870220800. Throughput: 0: 42570.1. Samples: 6870309420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 10:50:09,809][15401] Updated weights for policy 0, policy_version 419330 (0.0027) [2024-06-23 10:50:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 6870450176. Throughput: 0: 42777.4. Samples: 6870569780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 10:50:13,474][15401] Updated weights for policy 0, policy_version 419340 (0.0047) [2024-06-23 10:50:17,295][15401] Updated weights for policy 0, policy_version 419350 (0.0036) [2024-06-23 10:50:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6870663168. Throughput: 0: 42692.0. Samples: 6870821340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 10:50:21,020][15401] Updated weights for policy 0, policy_version 419360 (0.0041) [2024-06-23 10:50:23,391][15132] Fps is (10 sec: 40951.7, 60 sec: 42323.9, 300 sec: 42432.2). Total num frames: 6870859776. Throughput: 0: 42515.5. Samples: 6870949520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:23,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 10:50:25,112][15401] Updated weights for policy 0, policy_version 419370 (0.0039) [2024-06-23 10:50:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 6871105536. Throughput: 0: 42823.3. Samples: 6871214120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 10:50:28,486][15401] Updated weights for policy 0, policy_version 419380 (0.0031) [2024-06-23 10:50:32,881][15401] Updated weights for policy 0, policy_version 419390 (0.0031) [2024-06-23 10:50:33,390][15132] Fps is (10 sec: 45883.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6871318528. Throughput: 0: 42573.7. Samples: 6871458680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 10:50:36,150][15401] Updated weights for policy 0, policy_version 419400 (0.0036) [2024-06-23 10:50:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6871498752. Throughput: 0: 42549.8. Samples: 6871585460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 10:50:40,548][15401] Updated weights for policy 0, policy_version 419410 (0.0031) [2024-06-23 10:50:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 6871728128. Throughput: 0: 42596.6. Samples: 6871847040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:43,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 10:50:44,274][15401] Updated weights for policy 0, policy_version 419420 (0.0035) [2024-06-23 10:50:48,113][15401] Updated weights for policy 0, policy_version 419430 (0.0037) [2024-06-23 10:50:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6871957504. Throughput: 0: 42679.0. Samples: 6872099660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 10:50:51,953][15401] Updated weights for policy 0, policy_version 419440 (0.0033) [2024-06-23 10:50:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.7, 300 sec: 42431.4). Total num frames: 6872137728. Throughput: 0: 42662.2. Samples: 6872229320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 10:50:53,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 10:50:55,340][15349] Signal inference workers to stop experience collection... (101800 times) [2024-06-23 10:50:55,391][15401] InferenceWorker_p0-w0: stopping experience collection (101800 times) [2024-06-23 10:50:55,395][15349] Signal inference workers to resume experience collection... (101800 times) [2024-06-23 10:50:55,405][15401] InferenceWorker_p0-w0: resuming experience collection (101800 times) [2024-06-23 10:50:55,538][15401] Updated weights for policy 0, policy_version 419450 (0.0039) [2024-06-23 10:50:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6872383488. Throughput: 0: 42639.9. Samples: 6872488580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:50:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 10:50:59,643][15401] Updated weights for policy 0, policy_version 419460 (0.0034) [2024-06-23 10:51:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6872580096. Throughput: 0: 42707.6. Samples: 6872743180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 10:51:03,671][15401] Updated weights for policy 0, policy_version 419470 (0.0031) [2024-06-23 10:51:07,443][15401] Updated weights for policy 0, policy_version 419480 (0.0036) [2024-06-23 10:51:08,394][15132] Fps is (10 sec: 39303.1, 60 sec: 42595.1, 300 sec: 42431.1). Total num frames: 6872776704. Throughput: 0: 42599.1. Samples: 6872866600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:08,395][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 10:51:11,234][15401] Updated weights for policy 0, policy_version 419490 (0.0042) [2024-06-23 10:51:13,394][15132] Fps is (10 sec: 44215.5, 60 sec: 42868.0, 300 sec: 42486.6). Total num frames: 6873022464. Throughput: 0: 42356.8. Samples: 6873120380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:13,403][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 10:51:15,354][15401] Updated weights for policy 0, policy_version 419500 (0.0028) [2024-06-23 10:51:18,390][15132] Fps is (10 sec: 42618.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 6873202688. Throughput: 0: 42736.4. Samples: 6873381820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 10:51:18,865][15401] Updated weights for policy 0, policy_version 419510 (0.0037) [2024-06-23 10:51:22,952][15401] Updated weights for policy 0, policy_version 419520 (0.0039) [2024-06-23 10:51:23,390][15132] Fps is (10 sec: 39340.2, 60 sec: 42599.8, 300 sec: 42487.3). Total num frames: 6873415680. Throughput: 0: 42664.0. Samples: 6873505340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 10:51:26,615][15401] Updated weights for policy 0, policy_version 419530 (0.0022) [2024-06-23 10:51:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42376.9). Total num frames: 6873645056. Throughput: 0: 42584.5. Samples: 6873763340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:28,390][15132] Avg episode reward: [(0, '0.880')] [2024-06-23 10:51:30,589][15401] Updated weights for policy 0, policy_version 419540 (0.0037) [2024-06-23 10:51:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 6873858048. Throughput: 0: 42695.2. Samples: 6874020940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 10:51:34,649][15401] Updated weights for policy 0, policy_version 419550 (0.0032) [2024-06-23 10:51:38,301][15401] Updated weights for policy 0, policy_version 419560 (0.0033) [2024-06-23 10:51:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6874071040. Throughput: 0: 42605.5. Samples: 6874146460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 10:51:42,270][15401] Updated weights for policy 0, policy_version 419570 (0.0029) [2024-06-23 10:51:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 6874284032. Throughput: 0: 42713.4. Samples: 6874410680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 10:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419574_6874300416.pth... [2024-06-23 10:51:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000418950_6864076800.pth [2024-06-23 10:51:45,767][15401] Updated weights for policy 0, policy_version 419580 (0.0033) [2024-06-23 10:51:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 6874497024. Throughput: 0: 42692.8. Samples: 6874664460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:48,392][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 10:51:49,914][15401] Updated weights for policy 0, policy_version 419590 (0.0036) [2024-06-23 10:51:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 6874710016. Throughput: 0: 42721.8. Samples: 6874788880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 10:51:53,756][15401] Updated weights for policy 0, policy_version 419600 (0.0034) [2024-06-23 10:51:57,559][15401] Updated weights for policy 0, policy_version 419610 (0.0026) [2024-06-23 10:51:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 6874923008. Throughput: 0: 42740.6. Samples: 6875043500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:51:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 10:52:01,241][15401] Updated weights for policy 0, policy_version 419620 (0.0029) [2024-06-23 10:52:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 6875119616. Throughput: 0: 42628.0. Samples: 6875300180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:52:03,393][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 10:52:05,172][15401] Updated weights for policy 0, policy_version 419630 (0.0025) [2024-06-23 10:52:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.9, 300 sec: 42487.7). Total num frames: 6875348992. Throughput: 0: 42521.5. Samples: 6875418800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:52:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 10:52:08,806][15401] Updated weights for policy 0, policy_version 419640 (0.0042) [2024-06-23 10:52:12,623][15349] Signal inference workers to stop experience collection... (101850 times) [2024-06-23 10:52:12,625][15349] Signal inference workers to resume experience collection... (101850 times) [2024-06-23 10:52:12,662][15401] InferenceWorker_p0-w0: stopping experience collection (101850 times) [2024-06-23 10:52:12,662][15401] InferenceWorker_p0-w0: resuming experience collection (101850 times) [2024-06-23 10:52:12,763][15401] Updated weights for policy 0, policy_version 419650 (0.0032) [2024-06-23 10:52:13,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42328.6, 300 sec: 42487.3). Total num frames: 6875561984. Throughput: 0: 42601.2. Samples: 6875680400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:52:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 10:52:16,797][15401] Updated weights for policy 0, policy_version 419660 (0.0024) [2024-06-23 10:52:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6875758592. Throughput: 0: 42602.8. Samples: 6875938060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 10:52:18,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-23 10:52:20,398][15401] Updated weights for policy 0, policy_version 419670 (0.0031) [2024-06-23 10:52:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6875987968. Throughput: 0: 42524.4. Samples: 6876060060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:23,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 10:52:24,423][15401] Updated weights for policy 0, policy_version 419680 (0.0030) [2024-06-23 10:52:28,148][15401] Updated weights for policy 0, policy_version 419690 (0.0031) [2024-06-23 10:52:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6876200960. Throughput: 0: 42419.5. Samples: 6876319560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 10:52:32,157][15401] Updated weights for policy 0, policy_version 419700 (0.0034) [2024-06-23 10:52:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6876397568. Throughput: 0: 42403.7. Samples: 6876572520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 10:52:35,889][15401] Updated weights for policy 0, policy_version 419710 (0.0032) [2024-06-23 10:52:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6876626944. Throughput: 0: 42393.0. Samples: 6876696560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 10:52:39,994][15401] Updated weights for policy 0, policy_version 419720 (0.0034) [2024-06-23 10:52:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 6876823552. Throughput: 0: 42471.0. Samples: 6876954700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 10:52:43,890][15401] Updated weights for policy 0, policy_version 419730 (0.0023) [2024-06-23 10:52:47,583][15401] Updated weights for policy 0, policy_version 419740 (0.0040) [2024-06-23 10:52:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 6877036544. Throughput: 0: 42276.9. Samples: 6877202540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 10:52:51,577][15401] Updated weights for policy 0, policy_version 419750 (0.0039) [2024-06-23 10:52:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6877265920. Throughput: 0: 42606.6. Samples: 6877336100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:53,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 10:52:55,385][15401] Updated weights for policy 0, policy_version 419760 (0.0030) [2024-06-23 10:52:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 6877446144. Throughput: 0: 42466.3. Samples: 6877591380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:52:58,390][15132] Avg episode reward: [(0, '0.905')] [2024-06-23 10:52:59,319][15401] Updated weights for policy 0, policy_version 419770 (0.0033) [2024-06-23 10:53:02,998][15401] Updated weights for policy 0, policy_version 419780 (0.0036) [2024-06-23 10:53:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 6877691904. Throughput: 0: 42491.0. Samples: 6877850160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:03,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-23 10:53:06,758][15401] Updated weights for policy 0, policy_version 419790 (0.0036) [2024-06-23 10:53:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.3, 300 sec: 42598.5). Total num frames: 6877921280. Throughput: 0: 42818.1. Samples: 6877986880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 10:53:10,616][15401] Updated weights for policy 0, policy_version 419800 (0.0039) [2024-06-23 10:53:13,390][15132] Fps is (10 sec: 40956.1, 60 sec: 42324.7, 300 sec: 42598.2). Total num frames: 6878101504. Throughput: 0: 42568.0. Samples: 6878235160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:13,391][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 10:53:14,526][15401] Updated weights for policy 0, policy_version 419810 (0.0028) [2024-06-23 10:53:18,306][15401] Updated weights for policy 0, policy_version 419820 (0.0040) [2024-06-23 10:53:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6878330880. Throughput: 0: 42713.7. Samples: 6878494640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 10:53:22,122][15401] Updated weights for policy 0, policy_version 419830 (0.0031) [2024-06-23 10:53:23,389][15132] Fps is (10 sec: 45879.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6878560256. Throughput: 0: 42807.5. Samples: 6878622900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 10:53:25,991][15401] Updated weights for policy 0, policy_version 419840 (0.0039) [2024-06-23 10:53:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6878740480. Throughput: 0: 42733.8. Samples: 6878877720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 10:53:29,687][15401] Updated weights for policy 0, policy_version 419850 (0.0034) [2024-06-23 10:53:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 6878953472. Throughput: 0: 42881.8. Samples: 6879132220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 10:53:33,849][15349] Signal inference workers to stop experience collection... (101900 times) [2024-06-23 10:53:33,885][15401] InferenceWorker_p0-w0: stopping experience collection (101900 times) [2024-06-23 10:53:33,911][15349] Signal inference workers to resume experience collection... (101900 times) [2024-06-23 10:53:33,912][15401] InferenceWorker_p0-w0: resuming experience collection (101900 times) [2024-06-23 10:53:34,071][15401] Updated weights for policy 0, policy_version 419860 (0.0035) [2024-06-23 10:53:37,349][15401] Updated weights for policy 0, policy_version 419870 (0.0043) [2024-06-23 10:53:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6879182848. Throughput: 0: 42656.9. Samples: 6879255660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 10:53:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 10:53:41,704][15401] Updated weights for policy 0, policy_version 419880 (0.0039) [2024-06-23 10:53:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 6879379456. Throughput: 0: 42614.2. Samples: 6879509020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:53:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 10:53:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419885_6879395840.pth... [2024-06-23 10:53:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419261_6869172224.pth [2024-06-23 10:53:44,979][15401] Updated weights for policy 0, policy_version 419890 (0.0039) [2024-06-23 10:53:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6879608832. Throughput: 0: 42477.0. Samples: 6879761620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:53:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 10:53:49,738][15401] Updated weights for policy 0, policy_version 419900 (0.0040) [2024-06-23 10:53:52,864][15401] Updated weights for policy 0, policy_version 419910 (0.0041) [2024-06-23 10:53:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6879821824. Throughput: 0: 42343.6. Samples: 6879892340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:53:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 10:53:57,346][15401] Updated weights for policy 0, policy_version 419920 (0.0041) [2024-06-23 10:53:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6880018432. Throughput: 0: 42782.8. Samples: 6880160340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:53:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 10:54:00,411][15401] Updated weights for policy 0, policy_version 419930 (0.0029) [2024-06-23 10:54:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6880247808. Throughput: 0: 42520.0. Samples: 6880408040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 10:54:04,850][15401] Updated weights for policy 0, policy_version 419940 (0.0034) [2024-06-23 10:54:07,948][15401] Updated weights for policy 0, policy_version 419950 (0.0030) [2024-06-23 10:54:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6880477184. Throughput: 0: 42741.3. Samples: 6880546260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:08,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 10:54:12,397][15401] Updated weights for policy 0, policy_version 419960 (0.0039) [2024-06-23 10:54:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42326.1, 300 sec: 42487.3). Total num frames: 6880641024. Throughput: 0: 42843.6. Samples: 6880805680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 10:54:15,606][15401] Updated weights for policy 0, policy_version 419970 (0.0051) [2024-06-23 10:54:18,394][15132] Fps is (10 sec: 42578.7, 60 sec: 42868.1, 300 sec: 42653.3). Total num frames: 6880903168. Throughput: 0: 42543.6. Samples: 6881046880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:18,395][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 10:54:20,049][15401] Updated weights for policy 0, policy_version 419980 (0.0043) [2024-06-23 10:54:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 6881083392. Throughput: 0: 42972.7. Samples: 6881189440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 10:54:23,736][15401] Updated weights for policy 0, policy_version 419990 (0.0035) [2024-06-23 10:54:27,582][15401] Updated weights for policy 0, policy_version 420000 (0.0033) [2024-06-23 10:54:28,392][15132] Fps is (10 sec: 39330.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 6881296384. Throughput: 0: 42845.3. Samples: 6881437160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:28,401][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 10:54:31,310][15401] Updated weights for policy 0, policy_version 420010 (0.0025) [2024-06-23 10:54:32,248][15349] Signal inference workers to stop experience collection... (101950 times) [2024-06-23 10:54:32,286][15401] InferenceWorker_p0-w0: stopping experience collection (101950 times) [2024-06-23 10:54:32,314][15349] Signal inference workers to resume experience collection... (101950 times) [2024-06-23 10:54:32,315][15401] InferenceWorker_p0-w0: resuming experience collection (101950 times) [2024-06-23 10:54:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6881542144. Throughput: 0: 42733.3. Samples: 6881684620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 10:54:35,100][15401] Updated weights for policy 0, policy_version 420020 (0.0023) [2024-06-23 10:54:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6881705984. Throughput: 0: 42944.2. Samples: 6881824820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 10:54:38,943][15401] Updated weights for policy 0, policy_version 420030 (0.0024) [2024-06-23 10:54:42,705][15401] Updated weights for policy 0, policy_version 420040 (0.0043) [2024-06-23 10:54:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6881951744. Throughput: 0: 42600.8. Samples: 6882077380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 10:54:46,546][15401] Updated weights for policy 0, policy_version 420050 (0.0028) [2024-06-23 10:54:48,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6882181120. Throughput: 0: 42680.1. Samples: 6882328640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 10:54:50,188][15401] Updated weights for policy 0, policy_version 420060 (0.0044) [2024-06-23 10:54:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6882361344. Throughput: 0: 42490.6. Samples: 6882458340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 10:54:54,166][15401] Updated weights for policy 0, policy_version 420070 (0.0034) [2024-06-23 10:54:58,351][15401] Updated weights for policy 0, policy_version 420080 (0.0049) [2024-06-23 10:54:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6882590720. Throughput: 0: 42395.9. Samples: 6882713500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:54:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 10:55:01,791][15401] Updated weights for policy 0, policy_version 420090 (0.0042) [2024-06-23 10:55:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6882820096. Throughput: 0: 42558.7. Samples: 6882961820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 10:55:03,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 10:55:05,932][15401] Updated weights for policy 0, policy_version 420100 (0.0030) [2024-06-23 10:55:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 6883000320. Throughput: 0: 42381.8. Samples: 6883096620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:08,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-23 10:55:09,369][15401] Updated weights for policy 0, policy_version 420110 (0.0029) [2024-06-23 10:55:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6883229696. Throughput: 0: 42594.7. Samples: 6883353820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:13,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-23 10:55:13,706][15401] Updated weights for policy 0, policy_version 420120 (0.0026) [2024-06-23 10:55:17,078][15401] Updated weights for policy 0, policy_version 420130 (0.0037) [2024-06-23 10:55:18,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42600.0, 300 sec: 42709.4). Total num frames: 6883459072. Throughput: 0: 42659.1. Samples: 6883604380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:18,392][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 10:55:21,612][15401] Updated weights for policy 0, policy_version 420140 (0.0036) [2024-06-23 10:55:23,394][15132] Fps is (10 sec: 42580.4, 60 sec: 42868.5, 300 sec: 42542.2). Total num frames: 6883655680. Throughput: 0: 42679.9. Samples: 6883745600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:23,395][15132] Avg episode reward: [(0, '0.227')] [2024-06-23 10:55:24,630][15401] Updated weights for policy 0, policy_version 420150 (0.0032) [2024-06-23 10:55:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 6883868672. Throughput: 0: 42686.7. Samples: 6883998280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:28,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-23 10:55:29,138][15401] Updated weights for policy 0, policy_version 420160 (0.0031) [2024-06-23 10:55:32,189][15401] Updated weights for policy 0, policy_version 420170 (0.0025) [2024-06-23 10:55:33,389][15132] Fps is (10 sec: 45895.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6884114432. Throughput: 0: 42740.9. Samples: 6884251980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 10:55:36,563][15401] Updated weights for policy 0, policy_version 420180 (0.0025) [2024-06-23 10:55:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 6884278272. Throughput: 0: 42928.4. Samples: 6884390120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 10:55:39,677][15349] Signal inference workers to stop experience collection... (102000 times) [2024-06-23 10:55:39,724][15401] InferenceWorker_p0-w0: stopping experience collection (102000 times) [2024-06-23 10:55:39,736][15349] Signal inference workers to resume experience collection... (102000 times) [2024-06-23 10:55:39,737][15401] InferenceWorker_p0-w0: resuming experience collection (102000 times) [2024-06-23 10:55:39,902][15401] Updated weights for policy 0, policy_version 420190 (0.0025) [2024-06-23 10:55:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6884524032. Throughput: 0: 42872.5. Samples: 6884642760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 10:55:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420198_6884524032.pth... [2024-06-23 10:55:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419574_6874300416.pth [2024-06-23 10:55:44,002][15401] Updated weights for policy 0, policy_version 420200 (0.0026) [2024-06-23 10:55:47,574][15401] Updated weights for policy 0, policy_version 420210 (0.0034) [2024-06-23 10:55:48,392][15132] Fps is (10 sec: 47504.4, 60 sec: 42869.9, 300 sec: 42765.1). Total num frames: 6884753408. Throughput: 0: 42827.3. Samples: 6884889140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:48,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 10:55:51,749][15401] Updated weights for policy 0, policy_version 420220 (0.0036) [2024-06-23 10:55:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6884933632. Throughput: 0: 42808.9. Samples: 6885023020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 10:55:55,369][15401] Updated weights for policy 0, policy_version 420230 (0.0026) [2024-06-23 10:55:58,390][15132] Fps is (10 sec: 40968.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6885163008. Throughput: 0: 42808.9. Samples: 6885280220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:55:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 10:55:59,322][15401] Updated weights for policy 0, policy_version 420240 (0.0039) [2024-06-23 10:56:02,946][15401] Updated weights for policy 0, policy_version 420250 (0.0032) [2024-06-23 10:56:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 6885392384. Throughput: 0: 42928.5. Samples: 6885536060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:56:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 10:56:06,759][15401] Updated weights for policy 0, policy_version 420260 (0.0029) [2024-06-23 10:56:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42543.2). Total num frames: 6885572608. Throughput: 0: 42668.0. Samples: 6885665580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:56:08,393][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 10:56:10,588][15401] Updated weights for policy 0, policy_version 420270 (0.0029) [2024-06-23 10:56:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6885801984. Throughput: 0: 42750.2. Samples: 6885922040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:56:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 10:56:14,637][15401] Updated weights for policy 0, policy_version 420280 (0.0031) [2024-06-23 10:56:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 6886014976. Throughput: 0: 43050.7. Samples: 6886189260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:56:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 10:56:18,398][15401] Updated weights for policy 0, policy_version 420290 (0.0032) [2024-06-23 10:56:22,283][15401] Updated weights for policy 0, policy_version 420300 (0.0033) [2024-06-23 10:56:23,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42869.9, 300 sec: 42653.0). Total num frames: 6886227968. Throughput: 0: 42690.9. Samples: 6886311480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 10:56:23,397][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 10:56:26,005][15401] Updated weights for policy 0, policy_version 420310 (0.0034) [2024-06-23 10:56:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6886440960. Throughput: 0: 42800.5. Samples: 6886568780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 10:56:29,842][15401] Updated weights for policy 0, policy_version 420320 (0.0035) [2024-06-23 10:56:33,392][15132] Fps is (10 sec: 42615.8, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 6886653952. Throughput: 0: 43108.2. Samples: 6886829020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:33,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 10:56:33,678][15401] Updated weights for policy 0, policy_version 420330 (0.0037) [2024-06-23 10:56:37,352][15401] Updated weights for policy 0, policy_version 420340 (0.0042) [2024-06-23 10:56:38,391][15132] Fps is (10 sec: 42590.9, 60 sec: 43143.4, 300 sec: 42653.7). Total num frames: 6886866944. Throughput: 0: 42883.3. Samples: 6886952840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:38,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 10:56:41,498][15401] Updated weights for policy 0, policy_version 420350 (0.0029) [2024-06-23 10:56:43,392][15132] Fps is (10 sec: 44236.2, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 6887096320. Throughput: 0: 42932.8. Samples: 6887212300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:43,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 10:56:44,775][15349] Signal inference workers to stop experience collection... (102050 times) [2024-06-23 10:56:44,775][15349] Signal inference workers to resume experience collection... (102050 times) [2024-06-23 10:56:44,802][15401] InferenceWorker_p0-w0: stopping experience collection (102050 times) [2024-06-23 10:56:44,802][15401] InferenceWorker_p0-w0: resuming experience collection (102050 times) [2024-06-23 10:56:44,915][15401] Updated weights for policy 0, policy_version 420360 (0.0031) [2024-06-23 10:56:48,389][15132] Fps is (10 sec: 40967.1, 60 sec: 42053.7, 300 sec: 42598.4). Total num frames: 6887276544. Throughput: 0: 42948.0. Samples: 6887468720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:48,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 10:56:49,250][15401] Updated weights for policy 0, policy_version 420370 (0.0028) [2024-06-23 10:56:53,053][15401] Updated weights for policy 0, policy_version 420380 (0.0038) [2024-06-23 10:56:53,390][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6887522304. Throughput: 0: 42808.9. Samples: 6887591880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 10:56:56,881][15401] Updated weights for policy 0, policy_version 420390 (0.0044) [2024-06-23 10:56:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 6887735296. Throughput: 0: 42861.3. Samples: 6887850900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:56:58,392][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 10:57:00,763][15401] Updated weights for policy 0, policy_version 420400 (0.0037) [2024-06-23 10:57:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6887931904. Throughput: 0: 42646.5. Samples: 6888108360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 10:57:04,409][15401] Updated weights for policy 0, policy_version 420410 (0.0027) [2024-06-23 10:57:08,255][15401] Updated weights for policy 0, policy_version 420420 (0.0034) [2024-06-23 10:57:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 6888161280. Throughput: 0: 42770.1. Samples: 6888235860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 10:57:12,180][15401] Updated weights for policy 0, policy_version 420430 (0.0034) [2024-06-23 10:57:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 6888374272. Throughput: 0: 42851.0. Samples: 6888497180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:13,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 10:57:16,099][15401] Updated weights for policy 0, policy_version 420440 (0.0026) [2024-06-23 10:57:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6888587264. Throughput: 0: 42607.5. Samples: 6888746260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:18,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 10:57:20,021][15401] Updated weights for policy 0, policy_version 420450 (0.0033) [2024-06-23 10:57:23,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 6888783872. Throughput: 0: 42817.6. Samples: 6888879560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 10:57:23,825][15401] Updated weights for policy 0, policy_version 420460 (0.0037) [2024-06-23 10:57:27,629][15401] Updated weights for policy 0, policy_version 420470 (0.0030) [2024-06-23 10:57:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6888996864. Throughput: 0: 42808.7. Samples: 6889138580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 10:57:31,282][15401] Updated weights for policy 0, policy_version 420480 (0.0039) [2024-06-23 10:57:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 6889242624. Throughput: 0: 42709.3. Samples: 6889390640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 10:57:35,169][15401] Updated weights for policy 0, policy_version 420490 (0.0032) [2024-06-23 10:57:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42872.7, 300 sec: 42765.0). Total num frames: 6889439232. Throughput: 0: 42903.2. Samples: 6889522520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 10:57:38,717][15401] Updated weights for policy 0, policy_version 420500 (0.0032) [2024-06-23 10:57:42,827][15401] Updated weights for policy 0, policy_version 420510 (0.0042) [2024-06-23 10:57:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 6889635840. Throughput: 0: 42791.6. Samples: 6889776420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-23 10:57:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 10:57:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420511_6889652224.pth... [2024-06-23 10:57:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000419885_6879395840.pth [2024-06-23 10:57:46,371][15401] Updated weights for policy 0, policy_version 420520 (0.0031) [2024-06-23 10:57:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6889881600. Throughput: 0: 42691.1. Samples: 6890029460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:57:48,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 10:57:50,599][15401] Updated weights for policy 0, policy_version 420530 (0.0025) [2024-06-23 10:57:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 6890078208. Throughput: 0: 42705.2. Samples: 6890157600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:57:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 10:57:54,005][15401] Updated weights for policy 0, policy_version 420540 (0.0036) [2024-06-23 10:57:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 6890274816. Throughput: 0: 42459.5. Samples: 6890407760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:57:58,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 10:57:58,671][15401] Updated weights for policy 0, policy_version 420550 (0.0032) [2024-06-23 10:58:01,764][15401] Updated weights for policy 0, policy_version 420560 (0.0030) [2024-06-23 10:58:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6890520576. Throughput: 0: 42628.1. Samples: 6890664520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 10:58:06,377][15401] Updated weights for policy 0, policy_version 420570 (0.0033) [2024-06-23 10:58:08,395][15132] Fps is (10 sec: 44214.7, 60 sec: 42594.8, 300 sec: 42764.4). Total num frames: 6890717184. Throughput: 0: 42581.9. Samples: 6890795960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:08,395][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 10:58:09,531][15401] Updated weights for policy 0, policy_version 420580 (0.0036) [2024-06-23 10:58:12,170][15349] Signal inference workers to stop experience collection... (102100 times) [2024-06-23 10:58:12,171][15349] Signal inference workers to resume experience collection... (102100 times) [2024-06-23 10:58:12,193][15401] InferenceWorker_p0-w0: stopping experience collection (102100 times) [2024-06-23 10:58:12,193][15401] InferenceWorker_p0-w0: resuming experience collection (102100 times) [2024-06-23 10:58:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 6890913792. Throughput: 0: 42416.0. Samples: 6891047300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 10:58:13,899][15401] Updated weights for policy 0, policy_version 420590 (0.0034) [2024-06-23 10:58:17,302][15401] Updated weights for policy 0, policy_version 420600 (0.0039) [2024-06-23 10:58:18,390][15132] Fps is (10 sec: 44258.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6891159552. Throughput: 0: 42533.3. Samples: 6891304640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 10:58:21,378][15401] Updated weights for policy 0, policy_version 420610 (0.0025) [2024-06-23 10:58:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6891356160. Throughput: 0: 42543.1. Samples: 6891436960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 10:58:24,896][15401] Updated weights for policy 0, policy_version 420620 (0.0025) [2024-06-23 10:58:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 6891569152. Throughput: 0: 42507.9. Samples: 6891689280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 10:58:28,987][15401] Updated weights for policy 0, policy_version 420630 (0.0049) [2024-06-23 10:58:32,469][15401] Updated weights for policy 0, policy_version 420640 (0.0031) [2024-06-23 10:58:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6891798528. Throughput: 0: 42610.6. Samples: 6891946940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 10:58:36,438][15401] Updated weights for policy 0, policy_version 420650 (0.0034) [2024-06-23 10:58:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6891995136. Throughput: 0: 42745.5. Samples: 6892081140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 10:58:40,169][15401] Updated weights for policy 0, policy_version 420660 (0.0037) [2024-06-23 10:58:43,396][15132] Fps is (10 sec: 42571.6, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 6892224512. Throughput: 0: 42732.7. Samples: 6892331000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:43,396][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 10:58:44,109][15401] Updated weights for policy 0, policy_version 420670 (0.0035) [2024-06-23 10:58:47,990][15401] Updated weights for policy 0, policy_version 420680 (0.0041) [2024-06-23 10:58:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6892437504. Throughput: 0: 42752.0. Samples: 6892588360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 10:58:51,798][15401] Updated weights for policy 0, policy_version 420690 (0.0038) [2024-06-23 10:58:53,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6892634112. Throughput: 0: 42689.2. Samples: 6892716760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 10:58:55,690][15401] Updated weights for policy 0, policy_version 420700 (0.0039) [2024-06-23 10:58:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6892847104. Throughput: 0: 42685.7. Samples: 6892968160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:58:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 10:58:59,694][15401] Updated weights for policy 0, policy_version 420710 (0.0035) [2024-06-23 10:59:03,268][15401] Updated weights for policy 0, policy_version 420720 (0.0030) [2024-06-23 10:59:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6893076480. Throughput: 0: 42756.5. Samples: 6893228680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:59:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 10:59:07,199][15401] Updated weights for policy 0, policy_version 420730 (0.0030) [2024-06-23 10:59:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42327.2, 300 sec: 42764.7). Total num frames: 6893256704. Throughput: 0: 42667.9. Samples: 6893357120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-23 10:59:08,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 10:59:11,238][15401] Updated weights for policy 0, policy_version 420740 (0.0031) [2024-06-23 10:59:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 6893486080. Throughput: 0: 42643.6. Samples: 6893608240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 10:59:15,095][15401] Updated weights for policy 0, policy_version 420750 (0.0040) [2024-06-23 10:59:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 6893699072. Throughput: 0: 42538.0. Samples: 6893861140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 10:59:18,823][15401] Updated weights for policy 0, policy_version 420760 (0.0028) [2024-06-23 10:59:22,814][15401] Updated weights for policy 0, policy_version 420770 (0.0029) [2024-06-23 10:59:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 6893912064. Throughput: 0: 42485.8. Samples: 6893993000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 10:59:26,507][15401] Updated weights for policy 0, policy_version 420780 (0.0040) [2024-06-23 10:59:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6894125056. Throughput: 0: 42531.4. Samples: 6894244640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 10:59:30,842][15401] Updated weights for policy 0, policy_version 420790 (0.0028) [2024-06-23 10:59:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 6894354432. Throughput: 0: 42456.9. Samples: 6894498920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 10:59:34,166][15401] Updated weights for policy 0, policy_version 420800 (0.0023) [2024-06-23 10:59:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6894534656. Throughput: 0: 42481.8. Samples: 6894628440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 10:59:38,481][15401] Updated weights for policy 0, policy_version 420810 (0.0030) [2024-06-23 10:59:42,194][15401] Updated weights for policy 0, policy_version 420820 (0.0043) [2024-06-23 10:59:42,781][15349] Signal inference workers to stop experience collection... (102150 times) [2024-06-23 10:59:42,812][15401] InferenceWorker_p0-w0: stopping experience collection (102150 times) [2024-06-23 10:59:42,844][15349] Signal inference workers to resume experience collection... (102150 times) [2024-06-23 10:59:42,846][15401] InferenceWorker_p0-w0: resuming experience collection (102150 times) [2024-06-23 10:59:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42329.7, 300 sec: 42653.9). Total num frames: 6894764032. Throughput: 0: 42543.4. Samples: 6894882620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 10:59:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420823_6894764032.pth... [2024-06-23 10:59:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420198_6884524032.pth [2024-06-23 10:59:45,921][15401] Updated weights for policy 0, policy_version 420830 (0.0030) [2024-06-23 10:59:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6894977024. Throughput: 0: 42370.7. Samples: 6895135360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:48,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 10:59:50,052][15401] Updated weights for policy 0, policy_version 420840 (0.0034) [2024-06-23 10:59:53,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6895190016. Throughput: 0: 42327.6. Samples: 6895261760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 10:59:53,450][15401] Updated weights for policy 0, policy_version 420850 (0.0028) [2024-06-23 10:59:57,660][15401] Updated weights for policy 0, policy_version 420860 (0.0034) [2024-06-23 10:59:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6895403008. Throughput: 0: 42600.1. Samples: 6895525240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 10:59:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 11:00:01,188][15401] Updated weights for policy 0, policy_version 420870 (0.0030) [2024-06-23 11:00:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6895616000. Throughput: 0: 42676.2. Samples: 6895781580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 11:00:05,448][15401] Updated weights for policy 0, policy_version 420880 (0.0027) [2024-06-23 11:00:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6895828992. Throughput: 0: 42536.0. Samples: 6895907120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 11:00:08,719][15401] Updated weights for policy 0, policy_version 420890 (0.0035) [2024-06-23 11:00:12,899][15401] Updated weights for policy 0, policy_version 420900 (0.0026) [2024-06-23 11:00:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6896058368. Throughput: 0: 42811.2. Samples: 6896171140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 11:00:16,388][15401] Updated weights for policy 0, policy_version 420910 (0.0040) [2024-06-23 11:00:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 6896254976. Throughput: 0: 42871.6. Samples: 6896428140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 11:00:20,338][15401] Updated weights for policy 0, policy_version 420920 (0.0033) [2024-06-23 11:00:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6896467968. Throughput: 0: 42756.5. Samples: 6896552480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 11:00:24,075][15401] Updated weights for policy 0, policy_version 420930 (0.0032) [2024-06-23 11:00:27,829][15401] Updated weights for policy 0, policy_version 420940 (0.0024) [2024-06-23 11:00:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6896697344. Throughput: 0: 42794.4. Samples: 6896808360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:00:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 11:00:31,556][15401] Updated weights for policy 0, policy_version 420950 (0.0029) [2024-06-23 11:00:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6896893952. Throughput: 0: 42875.6. Samples: 6897064760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:33,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 11:00:35,461][15401] Updated weights for policy 0, policy_version 420960 (0.0028) [2024-06-23 11:00:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6897106944. Throughput: 0: 42887.1. Samples: 6897191680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:38,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-23 11:00:39,095][15401] Updated weights for policy 0, policy_version 420970 (0.0032) [2024-06-23 11:00:43,364][15401] Updated weights for policy 0, policy_version 420980 (0.0049) [2024-06-23 11:00:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 6897336320. Throughput: 0: 42875.4. Samples: 6897454640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:43,390][15132] Avg episode reward: [(0, '0.165')] [2024-06-23 11:00:47,191][15401] Updated weights for policy 0, policy_version 420990 (0.0035) [2024-06-23 11:00:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 6897549312. Throughput: 0: 42781.6. Samples: 6897706740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:00:50,977][15401] Updated weights for policy 0, policy_version 421000 (0.0026) [2024-06-23 11:00:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6897762304. Throughput: 0: 42746.2. Samples: 6897830700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:00:54,827][15401] Updated weights for policy 0, policy_version 421010 (0.0029) [2024-06-23 11:00:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6897958912. Throughput: 0: 42697.3. Samples: 6898092520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:00:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 11:00:58,622][15401] Updated weights for policy 0, policy_version 421020 (0.0038) [2024-06-23 11:01:02,719][15401] Updated weights for policy 0, policy_version 421030 (0.0038) [2024-06-23 11:01:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 6898171904. Throughput: 0: 42586.2. Samples: 6898344520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 11:01:06,210][15401] Updated weights for policy 0, policy_version 421040 (0.0024) [2024-06-23 11:01:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6898384896. Throughput: 0: 42626.6. Samples: 6898470680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 11:01:10,350][15401] Updated weights for policy 0, policy_version 421050 (0.0032) [2024-06-23 11:01:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6898614272. Throughput: 0: 42775.7. Samples: 6898733260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:13,390][15132] Avg episode reward: [(0, '0.906')] [2024-06-23 11:01:13,942][15401] Updated weights for policy 0, policy_version 421060 (0.0038) [2024-06-23 11:01:14,708][15349] Signal inference workers to stop experience collection... (102200 times) [2024-06-23 11:01:14,746][15401] InferenceWorker_p0-w0: stopping experience collection (102200 times) [2024-06-23 11:01:14,768][15349] Signal inference workers to resume experience collection... (102200 times) [2024-06-23 11:01:14,768][15401] InferenceWorker_p0-w0: resuming experience collection (102200 times) [2024-06-23 11:01:18,055][15401] Updated weights for policy 0, policy_version 421070 (0.0029) [2024-06-23 11:01:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 6898827264. Throughput: 0: 42569.3. Samples: 6898980380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 11:01:21,645][15401] Updated weights for policy 0, policy_version 421080 (0.0028) [2024-06-23 11:01:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6899023872. Throughput: 0: 42467.6. Samples: 6899102720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:23,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 11:01:25,618][15401] Updated weights for policy 0, policy_version 421090 (0.0038) [2024-06-23 11:01:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 6899220480. Throughput: 0: 42434.3. Samples: 6899364180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:28,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 11:01:29,279][15401] Updated weights for policy 0, policy_version 421100 (0.0033) [2024-06-23 11:01:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 6899433472. Throughput: 0: 42300.8. Samples: 6899610280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:33,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 11:01:33,615][15401] Updated weights for policy 0, policy_version 421110 (0.0043) [2024-06-23 11:01:37,274][15401] Updated weights for policy 0, policy_version 421120 (0.0030) [2024-06-23 11:01:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 6899646464. Throughput: 0: 42444.1. Samples: 6899740680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 11:01:41,180][15401] Updated weights for policy 0, policy_version 421130 (0.0028) [2024-06-23 11:01:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 6899843072. Throughput: 0: 42162.1. Samples: 6899989820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:43,399][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 11:01:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421133_6899843072.pth... [2024-06-23 11:01:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420511_6889652224.pth [2024-06-23 11:01:45,209][15401] Updated weights for policy 0, policy_version 421140 (0.0037) [2024-06-23 11:01:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6900088832. Throughput: 0: 42165.8. Samples: 6900241980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:01:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 11:01:49,283][15401] Updated weights for policy 0, policy_version 421150 (0.0026) [2024-06-23 11:01:52,723][15401] Updated weights for policy 0, policy_version 421160 (0.0036) [2024-06-23 11:01:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 6900301824. Throughput: 0: 42411.9. Samples: 6900379220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:01:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 11:01:56,863][15401] Updated weights for policy 0, policy_version 421170 (0.0033) [2024-06-23 11:01:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 6900498432. Throughput: 0: 42103.0. Samples: 6900628000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:01:58,401][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 11:02:00,445][15401] Updated weights for policy 0, policy_version 421180 (0.0035) [2024-06-23 11:02:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6900711424. Throughput: 0: 42240.6. Samples: 6900881200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 11:02:04,404][15401] Updated weights for policy 0, policy_version 421190 (0.0031) [2024-06-23 11:02:08,385][15401] Updated weights for policy 0, policy_version 421200 (0.0034) [2024-06-23 11:02:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 6900940800. Throughput: 0: 42494.2. Samples: 6901014960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:02:12,142][15401] Updated weights for policy 0, policy_version 421210 (0.0035) [2024-06-23 11:02:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.0, 300 sec: 42487.3). Total num frames: 6901121024. Throughput: 0: 42158.6. Samples: 6901261320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 11:02:15,974][15401] Updated weights for policy 0, policy_version 421220 (0.0041) [2024-06-23 11:02:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 6901350400. Throughput: 0: 42238.1. Samples: 6901511100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:18,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 11:02:19,951][15401] Updated weights for policy 0, policy_version 421230 (0.0028) [2024-06-23 11:02:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6901563392. Throughput: 0: 42356.9. Samples: 6901646740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 11:02:23,607][15401] Updated weights for policy 0, policy_version 421240 (0.0026) [2024-06-23 11:02:27,362][15401] Updated weights for policy 0, policy_version 421250 (0.0025) [2024-06-23 11:02:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6901776384. Throughput: 0: 42432.0. Samples: 6901899260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:28,394][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 11:02:31,009][15349] Signal inference workers to stop experience collection... (102250 times) [2024-06-23 11:02:31,009][15349] Signal inference workers to resume experience collection... (102250 times) [2024-06-23 11:02:31,033][15401] InferenceWorker_p0-w0: stopping experience collection (102250 times) [2024-06-23 11:02:31,033][15401] InferenceWorker_p0-w0: resuming experience collection (102250 times) [2024-06-23 11:02:31,161][15401] Updated weights for policy 0, policy_version 421260 (0.0028) [2024-06-23 11:02:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6901989376. Throughput: 0: 42620.5. Samples: 6902159900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 11:02:35,549][15401] Updated weights for policy 0, policy_version 421270 (0.0045) [2024-06-23 11:02:38,390][15132] Fps is (10 sec: 44234.9, 60 sec: 42871.1, 300 sec: 42653.9). Total num frames: 6902218752. Throughput: 0: 42355.6. Samples: 6902285240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:38,391][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 11:02:38,823][15401] Updated weights for policy 0, policy_version 421280 (0.0035) [2024-06-23 11:02:43,074][15401] Updated weights for policy 0, policy_version 421290 (0.0040) [2024-06-23 11:02:43,391][15132] Fps is (10 sec: 42590.2, 60 sec: 42870.3, 300 sec: 42487.1). Total num frames: 6902415360. Throughput: 0: 42468.6. Samples: 6902539060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:43,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 11:02:46,795][15401] Updated weights for policy 0, policy_version 421300 (0.0033) [2024-06-23 11:02:48,390][15132] Fps is (10 sec: 40961.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6902628352. Throughput: 0: 42660.3. Samples: 6902800920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:48,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 11:02:50,567][15401] Updated weights for policy 0, policy_version 421310 (0.0041) [2024-06-23 11:02:53,390][15132] Fps is (10 sec: 42605.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6902841344. Throughput: 0: 42438.1. Samples: 6902924680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 11:02:54,273][15401] Updated weights for policy 0, policy_version 421320 (0.0023) [2024-06-23 11:02:58,191][15401] Updated weights for policy 0, policy_version 421330 (0.0050) [2024-06-23 11:02:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 6903070720. Throughput: 0: 42816.2. Samples: 6903188040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:02:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 11:03:02,093][15401] Updated weights for policy 0, policy_version 421340 (0.0032) [2024-06-23 11:03:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42599.1). Total num frames: 6903283712. Throughput: 0: 42851.6. Samples: 6903439320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:03:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 11:03:05,986][15401] Updated weights for policy 0, policy_version 421350 (0.0036) [2024-06-23 11:03:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 6903480320. Throughput: 0: 42587.9. Samples: 6903563300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:03:08,393][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 11:03:09,675][15401] Updated weights for policy 0, policy_version 421360 (0.0035) [2024-06-23 11:03:13,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43143.0, 300 sec: 42542.5). Total num frames: 6903709696. Throughput: 0: 42630.3. Samples: 6903817720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:03:13,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 11:03:14,063][15401] Updated weights for policy 0, policy_version 421370 (0.0043) [2024-06-23 11:03:17,231][15401] Updated weights for policy 0, policy_version 421380 (0.0042) [2024-06-23 11:03:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 6903906304. Throughput: 0: 42624.8. Samples: 6904078020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 11:03:21,751][15401] Updated weights for policy 0, policy_version 421390 (0.0034) [2024-06-23 11:03:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6904135680. Throughput: 0: 42631.1. Samples: 6904203620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 11:03:25,079][15401] Updated weights for policy 0, policy_version 421400 (0.0033) [2024-06-23 11:03:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6904348672. Throughput: 0: 42673.6. Samples: 6904459300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 11:03:29,440][15401] Updated weights for policy 0, policy_version 421410 (0.0039) [2024-06-23 11:03:32,395][15401] Updated weights for policy 0, policy_version 421420 (0.0034) [2024-06-23 11:03:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6904561664. Throughput: 0: 42698.8. Samples: 6904722360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 11:03:36,982][15401] Updated weights for policy 0, policy_version 421430 (0.0031) [2024-06-23 11:03:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.7, 300 sec: 42488.2). Total num frames: 6904758272. Throughput: 0: 42897.8. Samples: 6904855080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 11:03:39,947][15401] Updated weights for policy 0, policy_version 421440 (0.0039) [2024-06-23 11:03:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42872.7, 300 sec: 42542.8). Total num frames: 6904987648. Throughput: 0: 42760.7. Samples: 6905112280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 11:03:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421447_6904987648.pth... [2024-06-23 11:03:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000420823_6894764032.pth [2024-06-23 11:03:44,574][15401] Updated weights for policy 0, policy_version 421450 (0.0035) [2024-06-23 11:03:47,551][15401] Updated weights for policy 0, policy_version 421460 (0.0023) [2024-06-23 11:03:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6905217024. Throughput: 0: 42853.7. Samples: 6905367740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 11:03:52,136][15401] Updated weights for policy 0, policy_version 421470 (0.0029) [2024-06-23 11:03:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 6905397248. Throughput: 0: 43007.1. Samples: 6905498520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 11:03:55,548][15401] Updated weights for policy 0, policy_version 421480 (0.0043) [2024-06-23 11:03:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 6905626624. Throughput: 0: 43043.4. Samples: 6905754580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:03:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 11:04:00,257][15401] Updated weights for policy 0, policy_version 421490 (0.0025) [2024-06-23 11:04:03,071][15401] Updated weights for policy 0, policy_version 421500 (0.0028) [2024-06-23 11:04:03,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 6905856000. Throughput: 0: 42758.1. Samples: 6906002240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:03,392][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 11:04:07,893][15401] Updated weights for policy 0, policy_version 421510 (0.0032) [2024-06-23 11:04:07,990][15349] Signal inference workers to stop experience collection... (102300 times) [2024-06-23 11:04:08,041][15401] InferenceWorker_p0-w0: stopping experience collection (102300 times) [2024-06-23 11:04:08,044][15349] Signal inference workers to resume experience collection... (102300 times) [2024-06-23 11:04:08,056][15401] InferenceWorker_p0-w0: resuming experience collection (102300 times) [2024-06-23 11:04:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 6906068992. Throughput: 0: 43026.7. Samples: 6906139820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 11:04:10,825][15401] Updated weights for policy 0, policy_version 421520 (0.0044) [2024-06-23 11:04:13,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 6906281984. Throughput: 0: 43117.4. Samples: 6906399580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 11:04:15,436][15401] Updated weights for policy 0, policy_version 421530 (0.0033) [2024-06-23 11:04:18,352][15401] Updated weights for policy 0, policy_version 421540 (0.0043) [2024-06-23 11:04:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 6906511360. Throughput: 0: 42928.8. Samples: 6906654160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 11:04:22,978][15401] Updated weights for policy 0, policy_version 421550 (0.0032) [2024-06-23 11:04:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6906691584. Throughput: 0: 42823.6. Samples: 6906782140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 11:04:25,987][15401] Updated weights for policy 0, policy_version 421560 (0.0023) [2024-06-23 11:04:28,393][15132] Fps is (10 sec: 42581.5, 60 sec: 43141.8, 300 sec: 42653.4). Total num frames: 6906937344. Throughput: 0: 42907.0. Samples: 6907043260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:28,394][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 11:04:30,431][15401] Updated weights for policy 0, policy_version 421570 (0.0025) [2024-06-23 11:04:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6907133952. Throughput: 0: 42863.2. Samples: 6907296580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-23 11:04:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 11:04:33,606][15401] Updated weights for policy 0, policy_version 421580 (0.0034) [2024-06-23 11:04:38,375][15401] Updated weights for policy 0, policy_version 421590 (0.0040) [2024-06-23 11:04:38,390][15132] Fps is (10 sec: 39337.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6907330560. Throughput: 0: 42715.6. Samples: 6907420720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:04:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 11:04:41,224][15401] Updated weights for policy 0, policy_version 421600 (0.0034) [2024-06-23 11:04:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6907576320. Throughput: 0: 42841.5. Samples: 6907682440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:04:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 11:04:45,877][15401] Updated weights for policy 0, policy_version 421610 (0.0036) [2024-06-23 11:04:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6907756544. Throughput: 0: 43079.5. Samples: 6907940720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:04:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 11:04:48,935][15401] Updated weights for policy 0, policy_version 421620 (0.0032) [2024-06-23 11:04:53,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6907969536. Throughput: 0: 42651.8. Samples: 6908059160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:04:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 11:04:53,555][15401] Updated weights for policy 0, policy_version 421630 (0.0042) [2024-06-23 11:04:56,412][15401] Updated weights for policy 0, policy_version 421640 (0.0034) [2024-06-23 11:04:58,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 6908215296. Throughput: 0: 42742.3. Samples: 6908322980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:04:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 11:05:01,074][15401] Updated weights for policy 0, policy_version 421650 (0.0042) [2024-06-23 11:05:03,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 6908411904. Throughput: 0: 42849.2. Samples: 6908582480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:03,401][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 11:05:04,406][15401] Updated weights for policy 0, policy_version 421660 (0.0031) [2024-06-23 11:05:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6908624896. Throughput: 0: 42674.7. Samples: 6908702500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:08,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 11:05:08,586][15401] Updated weights for policy 0, policy_version 421670 (0.0035) [2024-06-23 11:05:12,111][15401] Updated weights for policy 0, policy_version 421680 (0.0049) [2024-06-23 11:05:13,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6908854272. Throughput: 0: 42628.6. Samples: 6908961380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 11:05:16,123][15401] Updated weights for policy 0, policy_version 421690 (0.0038) [2024-06-23 11:05:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 6909018112. Throughput: 0: 42717.4. Samples: 6909218860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:18,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 11:05:19,184][15349] Signal inference workers to stop experience collection... (102350 times) [2024-06-23 11:05:19,184][15349] Signal inference workers to resume experience collection... (102350 times) [2024-06-23 11:05:19,218][15401] InferenceWorker_p0-w0: stopping experience collection (102350 times) [2024-06-23 11:05:19,218][15401] InferenceWorker_p0-w0: resuming experience collection (102350 times) [2024-06-23 11:05:20,293][15401] Updated weights for policy 0, policy_version 421700 (0.0033) [2024-06-23 11:05:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6909263872. Throughput: 0: 42630.6. Samples: 6909339100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:23,392][15132] Avg episode reward: [(0, '0.228')] [2024-06-23 11:05:24,040][15401] Updated weights for policy 0, policy_version 421710 (0.0032) [2024-06-23 11:05:27,883][15401] Updated weights for policy 0, policy_version 421720 (0.0032) [2024-06-23 11:05:28,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42599.5, 300 sec: 42709.1). Total num frames: 6909493248. Throughput: 0: 42580.8. Samples: 6909598680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:28,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 11:05:31,417][15401] Updated weights for policy 0, policy_version 421730 (0.0028) [2024-06-23 11:05:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6909673472. Throughput: 0: 42604.5. Samples: 6909857920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 11:05:35,383][15401] Updated weights for policy 0, policy_version 421740 (0.0028) [2024-06-23 11:05:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6909919232. Throughput: 0: 42763.7. Samples: 6909983520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 11:05:38,903][15401] Updated weights for policy 0, policy_version 421750 (0.0035) [2024-06-23 11:05:43,057][15401] Updated weights for policy 0, policy_version 421760 (0.0040) [2024-06-23 11:05:43,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 6910115840. Throughput: 0: 42675.3. Samples: 6910243640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:43,396][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 11:05:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421761_6910132224.pth... [2024-06-23 11:05:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421133_6899843072.pth [2024-06-23 11:05:47,101][15401] Updated weights for policy 0, policy_version 421770 (0.0042) [2024-06-23 11:05:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6910312448. Throughput: 0: 42454.2. Samples: 6910492820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 11:05:50,783][15401] Updated weights for policy 0, policy_version 421780 (0.0040) [2024-06-23 11:05:53,389][15132] Fps is (10 sec: 44265.3, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 6910558208. Throughput: 0: 42634.7. Samples: 6910621060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 11:05:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 11:05:54,625][15401] Updated weights for policy 0, policy_version 421790 (0.0027) [2024-06-23 11:05:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6910754816. Throughput: 0: 42623.2. Samples: 6910879420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:05:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 11:05:58,453][15401] Updated weights for policy 0, policy_version 421800 (0.0029) [2024-06-23 11:06:02,183][15401] Updated weights for policy 0, policy_version 421810 (0.0041) [2024-06-23 11:06:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 6910967808. Throughput: 0: 42561.3. Samples: 6911134120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 11:06:06,227][15401] Updated weights for policy 0, policy_version 421820 (0.0027) [2024-06-23 11:06:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6911197184. Throughput: 0: 42750.3. Samples: 6911262860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 11:06:09,738][15401] Updated weights for policy 0, policy_version 421830 (0.0040) [2024-06-23 11:06:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6911393792. Throughput: 0: 42684.0. Samples: 6911519360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:13,399][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 11:06:13,746][15401] Updated weights for policy 0, policy_version 421840 (0.0027) [2024-06-23 11:06:17,489][15401] Updated weights for policy 0, policy_version 421850 (0.0039) [2024-06-23 11:06:18,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 6911606784. Throughput: 0: 42536.6. Samples: 6911772340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:18,396][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 11:06:21,298][15401] Updated weights for policy 0, policy_version 421860 (0.0026) [2024-06-23 11:06:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6911836160. Throughput: 0: 42663.6. Samples: 6911903380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:23,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 11:06:25,214][15401] Updated weights for policy 0, policy_version 421870 (0.0031) [2024-06-23 11:06:28,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 6912016384. Throughput: 0: 42624.3. Samples: 6912161460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:28,396][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 11:06:29,159][15401] Updated weights for policy 0, policy_version 421880 (0.0024) [2024-06-23 11:06:32,823][15401] Updated weights for policy 0, policy_version 421890 (0.0045) [2024-06-23 11:06:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6912262144. Throughput: 0: 42626.4. Samples: 6912411000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 11:06:36,535][15349] Signal inference workers to stop experience collection... (102400 times) [2024-06-23 11:06:36,567][15401] InferenceWorker_p0-w0: stopping experience collection (102400 times) [2024-06-23 11:06:36,591][15349] Signal inference workers to resume experience collection... (102400 times) [2024-06-23 11:06:36,599][15401] InferenceWorker_p0-w0: resuming experience collection (102400 times) [2024-06-23 11:06:36,733][15401] Updated weights for policy 0, policy_version 421900 (0.0034) [2024-06-23 11:06:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6912475136. Throughput: 0: 42798.2. Samples: 6912546980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 11:06:40,154][15401] Updated weights for policy 0, policy_version 421910 (0.0028) [2024-06-23 11:06:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 6912688128. Throughput: 0: 42884.9. Samples: 6912809240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:43,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 11:06:44,091][15401] Updated weights for policy 0, policy_version 421920 (0.0042) [2024-06-23 11:06:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 6912884736. Throughput: 0: 42899.6. Samples: 6913064600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:48,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 11:06:48,440][15401] Updated weights for policy 0, policy_version 421930 (0.0053) [2024-06-23 11:06:51,644][15401] Updated weights for policy 0, policy_version 421940 (0.0037) [2024-06-23 11:06:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 6913130496. Throughput: 0: 42981.8. Samples: 6913197040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 11:06:55,971][15401] Updated weights for policy 0, policy_version 421950 (0.0029) [2024-06-23 11:06:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6913310720. Throughput: 0: 42862.2. Samples: 6913448160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:06:58,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 11:06:59,875][15401] Updated weights for policy 0, policy_version 421960 (0.0047) [2024-06-23 11:07:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6913540096. Throughput: 0: 42855.1. Samples: 6913700540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:07:03,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 11:07:03,859][15401] Updated weights for policy 0, policy_version 421970 (0.0039) [2024-06-23 11:07:07,641][15401] Updated weights for policy 0, policy_version 421980 (0.0033) [2024-06-23 11:07:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6913753088. Throughput: 0: 42793.3. Samples: 6913829080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:07:08,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 11:07:11,406][15401] Updated weights for policy 0, policy_version 421990 (0.0032) [2024-06-23 11:07:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 6913966080. Throughput: 0: 42828.9. Samples: 6914088760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:07:13,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 11:07:15,357][15401] Updated weights for policy 0, policy_version 422000 (0.0028) [2024-06-23 11:07:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 6914179072. Throughput: 0: 42921.8. Samples: 6914342480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 11:07:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 11:07:19,028][15401] Updated weights for policy 0, policy_version 422010 (0.0033) [2024-06-23 11:07:22,998][15401] Updated weights for policy 0, policy_version 422020 (0.0028) [2024-06-23 11:07:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6914392064. Throughput: 0: 42815.5. Samples: 6914473680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:23,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 11:07:26,432][15401] Updated weights for policy 0, policy_version 422030 (0.0028) [2024-06-23 11:07:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 6914621440. Throughput: 0: 42843.9. Samples: 6914737220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 11:07:30,492][15401] Updated weights for policy 0, policy_version 422040 (0.0032) [2024-06-23 11:07:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6914834432. Throughput: 0: 42792.8. Samples: 6914990380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:33,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 11:07:34,035][15401] Updated weights for policy 0, policy_version 422050 (0.0028) [2024-06-23 11:07:38,110][15401] Updated weights for policy 0, policy_version 422060 (0.0039) [2024-06-23 11:07:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 6915047424. Throughput: 0: 42714.2. Samples: 6915119180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 11:07:41,514][15401] Updated weights for policy 0, policy_version 422070 (0.0029) [2024-06-23 11:07:43,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6915244032. Throughput: 0: 43022.3. Samples: 6915384160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 11:07:43,548][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422075_6915276800.pth... [2024-06-23 11:07:43,593][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421447_6904987648.pth [2024-06-23 11:07:45,626][15401] Updated weights for policy 0, policy_version 422080 (0.0030) [2024-06-23 11:07:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6915473408. Throughput: 0: 42898.6. Samples: 6915630980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 11:07:49,772][15401] Updated weights for policy 0, policy_version 422090 (0.0033) [2024-06-23 11:07:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 6915670016. Throughput: 0: 42889.0. Samples: 6915759080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 11:07:53,574][15401] Updated weights for policy 0, policy_version 422100 (0.0036) [2024-06-23 11:07:57,395][15401] Updated weights for policy 0, policy_version 422110 (0.0034) [2024-06-23 11:07:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 6915899392. Throughput: 0: 42947.4. Samples: 6916021400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:07:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 11:08:01,120][15401] Updated weights for policy 0, policy_version 422120 (0.0031) [2024-06-23 11:08:03,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43142.7, 300 sec: 42876.1). Total num frames: 6916128768. Throughput: 0: 42888.3. Samples: 6916272560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:03,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 11:08:05,000][15401] Updated weights for policy 0, policy_version 422130 (0.0024) [2024-06-23 11:08:05,933][15349] Signal inference workers to stop experience collection... (102450 times) [2024-06-23 11:08:05,934][15349] Signal inference workers to resume experience collection... (102450 times) [2024-06-23 11:08:05,962][15401] InferenceWorker_p0-w0: stopping experience collection (102450 times) [2024-06-23 11:08:05,989][15401] InferenceWorker_p0-w0: resuming experience collection (102450 times) [2024-06-23 11:08:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 6916308992. Throughput: 0: 42864.4. Samples: 6916402580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 11:08:08,715][15401] Updated weights for policy 0, policy_version 422140 (0.0030) [2024-06-23 11:08:12,513][15401] Updated weights for policy 0, policy_version 422150 (0.0028) [2024-06-23 11:08:13,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6916521984. Throughput: 0: 42689.4. Samples: 6916658240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 11:08:16,745][15401] Updated weights for policy 0, policy_version 422160 (0.0032) [2024-06-23 11:08:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6916734976. Throughput: 0: 42731.3. Samples: 6916913180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 11:08:20,083][15401] Updated weights for policy 0, policy_version 422170 (0.0039) [2024-06-23 11:08:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 6916947968. Throughput: 0: 42691.9. Samples: 6917040420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:23,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 11:08:24,309][15401] Updated weights for policy 0, policy_version 422180 (0.0040) [2024-06-23 11:08:27,715][15401] Updated weights for policy 0, policy_version 422190 (0.0034) [2024-06-23 11:08:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6917177344. Throughput: 0: 42455.5. Samples: 6917294660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 11:08:31,848][15401] Updated weights for policy 0, policy_version 422200 (0.0044) [2024-06-23 11:08:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 6917373952. Throughput: 0: 42615.1. Samples: 6917548660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 11:08:35,334][15401] Updated weights for policy 0, policy_version 422210 (0.0034) [2024-06-23 11:08:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6917586944. Throughput: 0: 42563.9. Samples: 6917674460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 11:08:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 11:08:39,365][15401] Updated weights for policy 0, policy_version 422220 (0.0034) [2024-06-23 11:08:42,864][15401] Updated weights for policy 0, policy_version 422230 (0.0040) [2024-06-23 11:08:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6917816320. Throughput: 0: 42573.5. Samples: 6917937200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:08:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 11:08:46,976][15401] Updated weights for policy 0, policy_version 422240 (0.0038) [2024-06-23 11:08:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6918029312. Throughput: 0: 42644.6. Samples: 6918191460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:08:48,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 11:08:50,916][15401] Updated weights for policy 0, policy_version 422250 (0.0045) [2024-06-23 11:08:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 6918225920. Throughput: 0: 42634.0. Samples: 6918321200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:08:53,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 11:08:54,834][15401] Updated weights for policy 0, policy_version 422260 (0.0047) [2024-06-23 11:08:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 6918455296. Throughput: 0: 42739.6. Samples: 6918581520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:08:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 11:08:58,470][15401] Updated weights for policy 0, policy_version 422270 (0.0034) [2024-06-23 11:09:02,224][15401] Updated weights for policy 0, policy_version 422280 (0.0028) [2024-06-23 11:09:03,390][15132] Fps is (10 sec: 45885.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 6918684672. Throughput: 0: 42737.7. Samples: 6918836380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:03,390][15132] Avg episode reward: [(0, '0.097')] [2024-06-23 11:09:06,003][15401] Updated weights for policy 0, policy_version 422290 (0.0044) [2024-06-23 11:09:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6918881280. Throughput: 0: 42803.7. Samples: 6918966480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 11:09:09,739][15349] Signal inference workers to stop experience collection... (102500 times) [2024-06-23 11:09:09,741][15349] Signal inference workers to resume experience collection... (102500 times) [2024-06-23 11:09:09,751][15401] InferenceWorker_p0-w0: stopping experience collection (102500 times) [2024-06-23 11:09:09,764][15401] InferenceWorker_p0-w0: resuming experience collection (102500 times) [2024-06-23 11:09:09,919][15401] Updated weights for policy 0, policy_version 422300 (0.0030) [2024-06-23 11:09:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 6919127040. Throughput: 0: 42972.4. Samples: 6919228420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:09:13,390][15401] Updated weights for policy 0, policy_version 422310 (0.0024) [2024-06-23 11:09:17,544][15401] Updated weights for policy 0, policy_version 422320 (0.0036) [2024-06-23 11:09:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 6919340032. Throughput: 0: 42792.1. Samples: 6919474300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 11:09:21,307][15401] Updated weights for policy 0, policy_version 422330 (0.0028) [2024-06-23 11:09:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43144.6, 300 sec: 42709.7). Total num frames: 6919536640. Throughput: 0: 43062.6. Samples: 6919612380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:23,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 11:09:25,090][15401] Updated weights for policy 0, policy_version 422340 (0.0041) [2024-06-23 11:09:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6919749632. Throughput: 0: 42963.0. Samples: 6919870540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 11:09:28,899][15401] Updated weights for policy 0, policy_version 422350 (0.0022) [2024-06-23 11:09:32,644][15401] Updated weights for policy 0, policy_version 422360 (0.0027) [2024-06-23 11:09:33,389][15132] Fps is (10 sec: 45886.1, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 6919995392. Throughput: 0: 42981.3. Samples: 6920125620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 11:09:36,320][15401] Updated weights for policy 0, policy_version 422370 (0.0046) [2024-06-23 11:09:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6920159232. Throughput: 0: 43099.9. Samples: 6920260600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 11:09:40,111][15401] Updated weights for policy 0, policy_version 422380 (0.0027) [2024-06-23 11:09:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6920388608. Throughput: 0: 43036.8. Samples: 6920518180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:43,394][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 11:09:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422388_6920404992.pth... [2024-06-23 11:09:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000421761_6910132224.pth [2024-06-23 11:09:44,019][15401] Updated weights for policy 0, policy_version 422390 (0.0030) [2024-06-23 11:09:47,682][15401] Updated weights for policy 0, policy_version 422400 (0.0032) [2024-06-23 11:09:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6920617984. Throughput: 0: 42945.4. Samples: 6920768920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 11:09:51,877][15401] Updated weights for policy 0, policy_version 422410 (0.0033) [2024-06-23 11:09:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 6920814592. Throughput: 0: 43007.6. Samples: 6920901820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:09:55,237][15401] Updated weights for policy 0, policy_version 422420 (0.0032) [2024-06-23 11:09:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 6921043968. Throughput: 0: 42922.2. Samples: 6921159920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-23 11:09:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:09:59,472][15401] Updated weights for policy 0, policy_version 422430 (0.0033) [2024-06-23 11:10:02,849][15401] Updated weights for policy 0, policy_version 422440 (0.0033) [2024-06-23 11:10:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 6921273344. Throughput: 0: 43086.5. Samples: 6921413200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 11:10:07,059][15401] Updated weights for policy 0, policy_version 422450 (0.0038) [2024-06-23 11:10:08,393][15132] Fps is (10 sec: 42582.2, 60 sec: 43141.8, 300 sec: 42764.5). Total num frames: 6921469952. Throughput: 0: 43010.2. Samples: 6921547900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:08,394][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 11:10:10,700][15401] Updated weights for policy 0, policy_version 422460 (0.0042) [2024-06-23 11:10:13,102][15349] Signal inference workers to stop experience collection... (102550 times) [2024-06-23 11:10:13,102][15349] Signal inference workers to resume experience collection... (102550 times) [2024-06-23 11:10:13,145][15401] InferenceWorker_p0-w0: stopping experience collection (102550 times) [2024-06-23 11:10:13,145][15401] InferenceWorker_p0-w0: resuming experience collection (102550 times) [2024-06-23 11:10:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6921682944. Throughput: 0: 42951.3. Samples: 6921803340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 11:10:14,678][15401] Updated weights for policy 0, policy_version 422470 (0.0044) [2024-06-23 11:10:18,390][15132] Fps is (10 sec: 42614.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 6921895936. Throughput: 0: 42987.5. Samples: 6922060060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:18,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 11:10:18,478][15401] Updated weights for policy 0, policy_version 422480 (0.0032) [2024-06-23 11:10:22,062][15401] Updated weights for policy 0, policy_version 422490 (0.0025) [2024-06-23 11:10:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 6922125312. Throughput: 0: 42995.9. Samples: 6922195520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:23,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:10:25,966][15401] Updated weights for policy 0, policy_version 422500 (0.0027) [2024-06-23 11:10:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 6922338304. Throughput: 0: 43043.7. Samples: 6922455140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 11:10:29,575][15401] Updated weights for policy 0, policy_version 422510 (0.0039) [2024-06-23 11:10:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6922551296. Throughput: 0: 43280.0. Samples: 6922716520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 11:10:33,494][15401] Updated weights for policy 0, policy_version 422520 (0.0037) [2024-06-23 11:10:37,267][15401] Updated weights for policy 0, policy_version 422530 (0.0034) [2024-06-23 11:10:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43690.6, 300 sec: 42932.6). Total num frames: 6922780672. Throughput: 0: 43160.4. Samples: 6922844040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 11:10:41,398][15401] Updated weights for policy 0, policy_version 422540 (0.0032) [2024-06-23 11:10:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 6922993664. Throughput: 0: 43266.6. Samples: 6923106920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 11:10:44,695][15401] Updated weights for policy 0, policy_version 422550 (0.0028) [2024-06-23 11:10:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6923190272. Throughput: 0: 43328.1. Samples: 6923362960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 11:10:49,167][15401] Updated weights for policy 0, policy_version 422560 (0.0037) [2024-06-23 11:10:52,261][15401] Updated weights for policy 0, policy_version 422570 (0.0036) [2024-06-23 11:10:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 6923419648. Throughput: 0: 43187.2. Samples: 6923491160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 11:10:56,787][15401] Updated weights for policy 0, policy_version 422580 (0.0023) [2024-06-23 11:10:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6923599872. Throughput: 0: 43239.1. Samples: 6923749100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:10:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 11:10:59,848][15401] Updated weights for policy 0, policy_version 422590 (0.0034) [2024-06-23 11:11:03,391][15132] Fps is (10 sec: 42590.0, 60 sec: 42870.1, 300 sec: 42875.8). Total num frames: 6923845632. Throughput: 0: 43098.6. Samples: 6923999580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:11:03,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 11:11:04,310][15401] Updated weights for policy 0, policy_version 422600 (0.0043) [2024-06-23 11:11:07,509][15401] Updated weights for policy 0, policy_version 422610 (0.0029) [2024-06-23 11:11:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43147.3, 300 sec: 42931.7). Total num frames: 6924058624. Throughput: 0: 43038.8. Samples: 6924132160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:11:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 11:11:11,866][15401] Updated weights for policy 0, policy_version 422620 (0.0031) [2024-06-23 11:11:13,390][15132] Fps is (10 sec: 40967.7, 60 sec: 42871.3, 300 sec: 42877.0). Total num frames: 6924255232. Throughput: 0: 43007.8. Samples: 6924390500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:11:13,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 11:11:15,089][15401] Updated weights for policy 0, policy_version 422630 (0.0028) [2024-06-23 11:11:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 6924468224. Throughput: 0: 42923.1. Samples: 6924648060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:11:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 11:11:19,805][15401] Updated weights for policy 0, policy_version 422640 (0.0042) [2024-06-23 11:11:20,509][15349] Signal inference workers to stop experience collection... (102600 times) [2024-06-23 11:11:20,509][15349] Signal inference workers to resume experience collection... (102600 times) [2024-06-23 11:11:20,559][15401] InferenceWorker_p0-w0: stopping experience collection (102600 times) [2024-06-23 11:11:20,559][15401] InferenceWorker_p0-w0: resuming experience collection (102600 times) [2024-06-23 11:11:22,862][15401] Updated weights for policy 0, policy_version 422650 (0.0031) [2024-06-23 11:11:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 6924713984. Throughput: 0: 42978.2. Samples: 6924778060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 11:11:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 11:11:27,419][15401] Updated weights for policy 0, policy_version 422660 (0.0036) [2024-06-23 11:11:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 6924910592. Throughput: 0: 42852.8. Samples: 6925035400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:28,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 11:11:30,654][15401] Updated weights for policy 0, policy_version 422670 (0.0026) [2024-06-23 11:11:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6925107200. Throughput: 0: 42667.1. Samples: 6925282980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 11:11:35,127][15401] Updated weights for policy 0, policy_version 422680 (0.0042) [2024-06-23 11:11:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 6925336576. Throughput: 0: 42606.2. Samples: 6925408440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 11:11:38,398][15401] Updated weights for policy 0, policy_version 422690 (0.0030) [2024-06-23 11:11:42,680][15401] Updated weights for policy 0, policy_version 422700 (0.0029) [2024-06-23 11:11:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 6925533184. Throughput: 0: 42659.1. Samples: 6925668760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 11:11:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422702_6925549568.pth... [2024-06-23 11:11:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422075_6915276800.pth [2024-06-23 11:11:46,118][15401] Updated weights for policy 0, policy_version 422710 (0.0035) [2024-06-23 11:11:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 6925762560. Throughput: 0: 42676.1. Samples: 6925919920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 11:11:50,789][15401] Updated weights for policy 0, policy_version 422720 (0.0037) [2024-06-23 11:11:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 6925975552. Throughput: 0: 42653.7. Samples: 6926051580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 11:11:53,942][15401] Updated weights for policy 0, policy_version 422730 (0.0028) [2024-06-23 11:11:58,317][15401] Updated weights for policy 0, policy_version 422740 (0.0026) [2024-06-23 11:11:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 6926172160. Throughput: 0: 42664.9. Samples: 6926310420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:11:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 11:12:01,557][15401] Updated weights for policy 0, policy_version 422750 (0.0034) [2024-06-23 11:12:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42599.9, 300 sec: 42876.1). Total num frames: 6926401536. Throughput: 0: 42531.2. Samples: 6926561960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 11:12:05,941][15401] Updated weights for policy 0, policy_version 422760 (0.0041) [2024-06-23 11:12:08,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 6926630912. Throughput: 0: 42554.3. Samples: 6926693000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 11:12:09,076][15401] Updated weights for policy 0, policy_version 422770 (0.0039) [2024-06-23 11:12:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 6926811136. Throughput: 0: 42506.2. Samples: 6926948080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 11:12:13,552][15401] Updated weights for policy 0, policy_version 422780 (0.0046) [2024-06-23 11:12:16,908][15401] Updated weights for policy 0, policy_version 422790 (0.0032) [2024-06-23 11:12:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 6927040512. Throughput: 0: 42566.3. Samples: 6927198460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 11:12:21,345][15401] Updated weights for policy 0, policy_version 422800 (0.0036) [2024-06-23 11:12:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 6927253504. Throughput: 0: 42764.0. Samples: 6927332820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 11:12:24,773][15401] Updated weights for policy 0, policy_version 422810 (0.0047) [2024-06-23 11:12:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42053.9, 300 sec: 42709.8). Total num frames: 6927433728. Throughput: 0: 42531.5. Samples: 6927582680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 11:12:28,976][15401] Updated weights for policy 0, policy_version 422820 (0.0033) [2024-06-23 11:12:32,519][15401] Updated weights for policy 0, policy_version 422830 (0.0029) [2024-06-23 11:12:33,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 6927695872. Throughput: 0: 42524.2. Samples: 6927833780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:33,396][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 11:12:36,745][15401] Updated weights for policy 0, policy_version 422840 (0.0032) [2024-06-23 11:12:37,368][15349] Signal inference workers to stop experience collection... (102650 times) [2024-06-23 11:12:37,368][15349] Signal inference workers to resume experience collection... (102650 times) [2024-06-23 11:12:37,408][15401] InferenceWorker_p0-w0: stopping experience collection (102650 times) [2024-06-23 11:12:37,408][15401] InferenceWorker_p0-w0: resuming experience collection (102650 times) [2024-06-23 11:12:38,394][15132] Fps is (10 sec: 45854.4, 60 sec: 42595.2, 300 sec: 42875.4). Total num frames: 6927892480. Throughput: 0: 42576.2. Samples: 6927967700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:38,394][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 11:12:40,342][15401] Updated weights for policy 0, policy_version 422850 (0.0035) [2024-06-23 11:12:43,390][15132] Fps is (10 sec: 37707.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6928072704. Throughput: 0: 42356.0. Samples: 6928216440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 11:12:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 11:12:44,780][15401] Updated weights for policy 0, policy_version 422860 (0.0034) [2024-06-23 11:12:48,022][15401] Updated weights for policy 0, policy_version 422870 (0.0037) [2024-06-23 11:12:48,390][15132] Fps is (10 sec: 42617.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6928318464. Throughput: 0: 42415.4. Samples: 6928470660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:12:48,395][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 11:12:52,429][15401] Updated weights for policy 0, policy_version 422880 (0.0032) [2024-06-23 11:12:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 6928515072. Throughput: 0: 42559.0. Samples: 6928608160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:12:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 11:12:55,619][15401] Updated weights for policy 0, policy_version 422890 (0.0037) [2024-06-23 11:12:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.7, 300 sec: 42653.9). Total num frames: 6928711680. Throughput: 0: 42240.9. Samples: 6928849020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:12:58,392][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 11:13:00,176][15401] Updated weights for policy 0, policy_version 422900 (0.0041) [2024-06-23 11:13:03,244][15401] Updated weights for policy 0, policy_version 422910 (0.0028) [2024-06-23 11:13:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 6928957440. Throughput: 0: 42276.4. Samples: 6929100900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 11:13:07,855][15401] Updated weights for policy 0, policy_version 422920 (0.0033) [2024-06-23 11:13:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 6929137664. Throughput: 0: 42261.9. Samples: 6929234600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 11:13:11,263][15401] Updated weights for policy 0, policy_version 422930 (0.0045) [2024-06-23 11:13:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 6929367040. Throughput: 0: 42310.3. Samples: 6929486640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 11:13:15,637][15401] Updated weights for policy 0, policy_version 422940 (0.0033) [2024-06-23 11:13:18,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42598.2, 300 sec: 42876.4). Total num frames: 6929596416. Throughput: 0: 42352.5. Samples: 6929739380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 11:13:18,838][15401] Updated weights for policy 0, policy_version 422950 (0.0030) [2024-06-23 11:13:23,344][15401] Updated weights for policy 0, policy_version 422960 (0.0025) [2024-06-23 11:13:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6929776640. Throughput: 0: 42251.9. Samples: 6929868840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 11:13:26,427][15401] Updated weights for policy 0, policy_version 422970 (0.0038) [2024-06-23 11:13:28,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6929989632. Throughput: 0: 42324.5. Samples: 6930121040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 11:13:30,830][15401] Updated weights for policy 0, policy_version 422980 (0.0037) [2024-06-23 11:13:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 6930235392. Throughput: 0: 42230.7. Samples: 6930371040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 11:13:34,238][15401] Updated weights for policy 0, policy_version 422990 (0.0033) [2024-06-23 11:13:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42055.4, 300 sec: 42709.5). Total num frames: 6930415616. Throughput: 0: 42201.3. Samples: 6930507220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:38,391][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 11:13:38,477][15401] Updated weights for policy 0, policy_version 423000 (0.0036) [2024-06-23 11:13:41,720][15349] Signal inference workers to stop experience collection... (102700 times) [2024-06-23 11:13:41,720][15349] Signal inference workers to resume experience collection... (102700 times) [2024-06-23 11:13:41,751][15401] InferenceWorker_p0-w0: stopping experience collection (102700 times) [2024-06-23 11:13:41,756][15401] InferenceWorker_p0-w0: resuming experience collection (102700 times) [2024-06-23 11:13:41,858][15401] Updated weights for policy 0, policy_version 423010 (0.0027) [2024-06-23 11:13:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6930644992. Throughput: 0: 42427.5. Samples: 6930758160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 11:13:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423013_6930644992.pth... [2024-06-23 11:13:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422388_6920404992.pth [2024-06-23 11:13:46,033][15401] Updated weights for policy 0, policy_version 423020 (0.0056) [2024-06-23 11:13:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 6930857984. Throughput: 0: 42580.0. Samples: 6931017000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 11:13:49,463][15401] Updated weights for policy 0, policy_version 423030 (0.0027) [2024-06-23 11:13:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6931054592. Throughput: 0: 42586.2. Samples: 6931150980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:13:53,553][15401] Updated weights for policy 0, policy_version 423040 (0.0034) [2024-06-23 11:13:56,955][15401] Updated weights for policy 0, policy_version 423050 (0.0034) [2024-06-23 11:13:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 6931283968. Throughput: 0: 42581.3. Samples: 6931402800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:13:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 11:14:01,754][15401] Updated weights for policy 0, policy_version 423060 (0.0031) [2024-06-23 11:14:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 6931496960. Throughput: 0: 42703.3. Samples: 6931661020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:14:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 11:14:04,810][15401] Updated weights for policy 0, policy_version 423070 (0.0041) [2024-06-23 11:14:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6931709952. Throughput: 0: 42735.5. Samples: 6931791940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 11:14:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 11:14:09,430][15401] Updated weights for policy 0, policy_version 423080 (0.0045) [2024-06-23 11:14:12,308][15401] Updated weights for policy 0, policy_version 423090 (0.0039) [2024-06-23 11:14:13,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42866.8, 300 sec: 42708.5). Total num frames: 6931939328. Throughput: 0: 42759.7. Samples: 6932045500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:13,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 11:14:17,027][15401] Updated weights for policy 0, policy_version 423100 (0.0039) [2024-06-23 11:14:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42709.8). Total num frames: 6932135936. Throughput: 0: 42879.2. Samples: 6932300600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 11:14:19,978][15401] Updated weights for policy 0, policy_version 423110 (0.0035) [2024-06-23 11:14:23,390][15132] Fps is (10 sec: 39346.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6932332544. Throughput: 0: 42587.5. Samples: 6932423660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 11:14:24,766][15401] Updated weights for policy 0, policy_version 423120 (0.0036) [2024-06-23 11:14:27,435][15401] Updated weights for policy 0, policy_version 423130 (0.0028) [2024-06-23 11:14:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6932578304. Throughput: 0: 42663.2. Samples: 6932678000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 11:14:32,306][15401] Updated weights for policy 0, policy_version 423140 (0.0042) [2024-06-23 11:14:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 6932742144. Throughput: 0: 42918.7. Samples: 6932948340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 11:14:35,327][15401] Updated weights for policy 0, policy_version 423150 (0.0036) [2024-06-23 11:14:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6932971520. Throughput: 0: 42576.8. Samples: 6933066940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:38,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 11:14:39,728][15401] Updated weights for policy 0, policy_version 423160 (0.0027) [2024-06-23 11:14:42,825][15401] Updated weights for policy 0, policy_version 423170 (0.0041) [2024-06-23 11:14:43,390][15132] Fps is (10 sec: 47509.1, 60 sec: 42870.9, 300 sec: 42709.3). Total num frames: 6933217280. Throughput: 0: 42648.0. Samples: 6933322000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:43,391][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 11:14:47,278][15401] Updated weights for policy 0, policy_version 423180 (0.0033) [2024-06-23 11:14:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6933381120. Throughput: 0: 42877.8. Samples: 6933590520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 11:14:50,503][15401] Updated weights for policy 0, policy_version 423190 (0.0032) [2024-06-23 11:14:53,390][15132] Fps is (10 sec: 40963.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6933626880. Throughput: 0: 42576.4. Samples: 6933707880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 11:14:55,261][15401] Updated weights for policy 0, policy_version 423200 (0.0031) [2024-06-23 11:14:56,588][15349] Signal inference workers to stop experience collection... (102750 times) [2024-06-23 11:14:56,588][15349] Signal inference workers to resume experience collection... (102750 times) [2024-06-23 11:14:56,607][15401] InferenceWorker_p0-w0: stopping experience collection (102750 times) [2024-06-23 11:14:56,608][15401] InferenceWorker_p0-w0: resuming experience collection (102750 times) [2024-06-23 11:14:58,279][15401] Updated weights for policy 0, policy_version 423210 (0.0037) [2024-06-23 11:14:58,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6933872640. Throughput: 0: 42556.6. Samples: 6933960280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:14:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 11:15:02,998][15401] Updated weights for policy 0, policy_version 423220 (0.0023) [2024-06-23 11:15:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42599.0). Total num frames: 6934036480. Throughput: 0: 42609.7. Samples: 6934218040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 11:15:06,167][15401] Updated weights for policy 0, policy_version 423230 (0.0024) [2024-06-23 11:15:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6934265856. Throughput: 0: 42607.6. Samples: 6934341000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 11:15:10,608][15401] Updated weights for policy 0, policy_version 423240 (0.0030) [2024-06-23 11:15:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 6934495232. Throughput: 0: 42825.8. Samples: 6934605160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 11:15:13,985][15401] Updated weights for policy 0, policy_version 423250 (0.0034) [2024-06-23 11:15:18,215][15401] Updated weights for policy 0, policy_version 423260 (0.0029) [2024-06-23 11:15:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 6934691840. Throughput: 0: 42402.6. Samples: 6934856460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 11:15:21,819][15401] Updated weights for policy 0, policy_version 423270 (0.0030) [2024-06-23 11:15:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 6934904832. Throughput: 0: 42437.4. Samples: 6934976620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 11:15:26,339][15401] Updated weights for policy 0, policy_version 423280 (0.0029) [2024-06-23 11:15:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6935117824. Throughput: 0: 42641.7. Samples: 6935240840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 11:15:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 11:15:29,672][15401] Updated weights for policy 0, policy_version 423290 (0.0037) [2024-06-23 11:15:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 6935314432. Throughput: 0: 42257.8. Samples: 6935492120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 11:15:33,907][15401] Updated weights for policy 0, policy_version 423300 (0.0044) [2024-06-23 11:15:37,468][15401] Updated weights for policy 0, policy_version 423310 (0.0046) [2024-06-23 11:15:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6935543808. Throughput: 0: 42392.9. Samples: 6935615560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 11:15:41,403][15401] Updated weights for policy 0, policy_version 423320 (0.0034) [2024-06-23 11:15:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42326.0, 300 sec: 42598.4). Total num frames: 6935756800. Throughput: 0: 42633.9. Samples: 6935878800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 11:15:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423326_6935773184.pth... [2024-06-23 11:15:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000422702_6925549568.pth [2024-06-23 11:15:45,076][15401] Updated weights for policy 0, policy_version 423330 (0.0042) [2024-06-23 11:15:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 6935969792. Throughput: 0: 42470.2. Samples: 6936129200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 11:15:49,285][15401] Updated weights for policy 0, policy_version 423340 (0.0038) [2024-06-23 11:15:52,667][15401] Updated weights for policy 0, policy_version 423350 (0.0043) [2024-06-23 11:15:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6936166400. Throughput: 0: 42631.1. Samples: 6936259400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 11:15:56,967][15401] Updated weights for policy 0, policy_version 423360 (0.0031) [2024-06-23 11:15:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42487.6). Total num frames: 6936379392. Throughput: 0: 42422.1. Samples: 6936514160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:15:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 11:16:00,227][15401] Updated weights for policy 0, policy_version 423370 (0.0035) [2024-06-23 11:16:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6936592384. Throughput: 0: 42504.9. Samples: 6936769180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 11:16:04,313][15349] Signal inference workers to stop experience collection... (102800 times) [2024-06-23 11:16:04,314][15349] Signal inference workers to resume experience collection... (102800 times) [2024-06-23 11:16:04,323][15401] InferenceWorker_p0-w0: stopping experience collection (102800 times) [2024-06-23 11:16:04,335][15401] InferenceWorker_p0-w0: resuming experience collection (102800 times) [2024-06-23 11:16:04,479][15401] Updated weights for policy 0, policy_version 423380 (0.0031) [2024-06-23 11:16:08,096][15401] Updated weights for policy 0, policy_version 423390 (0.0025) [2024-06-23 11:16:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6936821760. Throughput: 0: 42659.9. Samples: 6936896320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 11:16:12,081][15401] Updated weights for policy 0, policy_version 423400 (0.0034) [2024-06-23 11:16:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 6937018368. Throughput: 0: 42420.5. Samples: 6937149760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 11:16:16,173][15401] Updated weights for policy 0, policy_version 423410 (0.0037) [2024-06-23 11:16:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6937231360. Throughput: 0: 42469.3. Samples: 6937403240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 11:16:19,943][15401] Updated weights for policy 0, policy_version 423420 (0.0044) [2024-06-23 11:16:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.6, 300 sec: 42542.9). Total num frames: 6937460736. Throughput: 0: 42548.4. Samples: 6937530340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:23,393][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 11:16:23,810][15401] Updated weights for policy 0, policy_version 423430 (0.0030) [2024-06-23 11:16:27,630][15401] Updated weights for policy 0, policy_version 423440 (0.0040) [2024-06-23 11:16:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6937673728. Throughput: 0: 42467.1. Samples: 6937789820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:28,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 11:16:31,318][15401] Updated weights for policy 0, policy_version 423450 (0.0024) [2024-06-23 11:16:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6937886720. Throughput: 0: 42491.9. Samples: 6938041340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 11:16:35,235][15401] Updated weights for policy 0, policy_version 423460 (0.0032) [2024-06-23 11:16:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6938099712. Throughput: 0: 42480.8. Samples: 6938171040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 11:16:39,099][15401] Updated weights for policy 0, policy_version 423470 (0.0038) [2024-06-23 11:16:43,041][15401] Updated weights for policy 0, policy_version 423480 (0.0036) [2024-06-23 11:16:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 6938312704. Throughput: 0: 42697.8. Samples: 6938435560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 11:16:46,823][15401] Updated weights for policy 0, policy_version 423490 (0.0038) [2024-06-23 11:16:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6938525696. Throughput: 0: 42624.0. Samples: 6938687260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 11:16:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 11:16:50,534][15401] Updated weights for policy 0, policy_version 423500 (0.0035) [2024-06-23 11:16:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6938722304. Throughput: 0: 42633.4. Samples: 6938814820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:16:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 11:16:54,290][15401] Updated weights for policy 0, policy_version 423510 (0.0032) [2024-06-23 11:16:58,082][15401] Updated weights for policy 0, policy_version 423520 (0.0031) [2024-06-23 11:16:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 6938951680. Throughput: 0: 42730.8. Samples: 6939072640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:16:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 11:17:02,013][15401] Updated weights for policy 0, policy_version 423530 (0.0029) [2024-06-23 11:17:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 6939148288. Throughput: 0: 42814.7. Samples: 6939329900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 11:17:05,936][15401] Updated weights for policy 0, policy_version 423540 (0.0044) [2024-06-23 11:17:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 6939361280. Throughput: 0: 42734.2. Samples: 6939453380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:08,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 11:17:09,541][15401] Updated weights for policy 0, policy_version 423550 (0.0031) [2024-06-23 11:17:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 6939590656. Throughput: 0: 42719.0. Samples: 6939712180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 11:17:13,430][15401] Updated weights for policy 0, policy_version 423560 (0.0042) [2024-06-23 11:17:17,278][15401] Updated weights for policy 0, policy_version 423570 (0.0028) [2024-06-23 11:17:18,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 6939803648. Throughput: 0: 42814.7. Samples: 6939968100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:18,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 11:17:21,325][15401] Updated weights for policy 0, policy_version 423580 (0.0029) [2024-06-23 11:17:23,127][15349] Signal inference workers to stop experience collection... (102850 times) [2024-06-23 11:17:23,163][15401] InferenceWorker_p0-w0: stopping experience collection (102850 times) [2024-06-23 11:17:23,187][15349] Signal inference workers to resume experience collection... (102850 times) [2024-06-23 11:17:23,188][15401] InferenceWorker_p0-w0: resuming experience collection (102850 times) [2024-06-23 11:17:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 6940000256. Throughput: 0: 42776.1. Samples: 6940095960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 11:17:24,912][15401] Updated weights for policy 0, policy_version 423590 (0.0033) [2024-06-23 11:17:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42488.3). Total num frames: 6940229632. Throughput: 0: 42443.3. Samples: 6940345500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 11:17:28,892][15401] Updated weights for policy 0, policy_version 423600 (0.0047) [2024-06-23 11:17:32,851][15401] Updated weights for policy 0, policy_version 423610 (0.0033) [2024-06-23 11:17:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42543.2). Total num frames: 6940442624. Throughput: 0: 42659.5. Samples: 6940607040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:33,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:17:36,357][15401] Updated weights for policy 0, policy_version 423620 (0.0035) [2024-06-23 11:17:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6940622848. Throughput: 0: 42662.1. Samples: 6940734620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 11:17:40,630][15401] Updated weights for policy 0, policy_version 423630 (0.0051) [2024-06-23 11:17:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6940868608. Throughput: 0: 42527.1. Samples: 6940986360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 11:17:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423638_6940884992.pth... [2024-06-23 11:17:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423013_6930644992.pth [2024-06-23 11:17:43,871][15401] Updated weights for policy 0, policy_version 423640 (0.0028) [2024-06-23 11:17:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6941065216. Throughput: 0: 42558.3. Samples: 6941245020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 11:17:48,432][15401] Updated weights for policy 0, policy_version 423650 (0.0024) [2024-06-23 11:17:51,653][15401] Updated weights for policy 0, policy_version 423660 (0.0030) [2024-06-23 11:17:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 6941278208. Throughput: 0: 42390.3. Samples: 6941360840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 11:17:56,047][15401] Updated weights for policy 0, policy_version 423670 (0.0029) [2024-06-23 11:17:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6941523968. Throughput: 0: 42432.5. Samples: 6941621640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:17:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 11:17:59,418][15401] Updated weights for policy 0, policy_version 423680 (0.0028) [2024-06-23 11:18:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 6941687808. Throughput: 0: 42661.7. Samples: 6941887780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:18:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 11:18:03,682][15401] Updated weights for policy 0, policy_version 423690 (0.0041) [2024-06-23 11:18:06,886][15401] Updated weights for policy 0, policy_version 423700 (0.0029) [2024-06-23 11:18:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 6941917184. Throughput: 0: 42289.4. Samples: 6941998980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:18:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 11:18:11,473][15401] Updated weights for policy 0, policy_version 423710 (0.0033) [2024-06-23 11:18:13,389][15132] Fps is (10 sec: 49152.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 6942179328. Throughput: 0: 42761.6. Samples: 6942269780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-23 11:18:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 11:18:14,379][15401] Updated weights for policy 0, policy_version 423720 (0.0033) [2024-06-23 11:18:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 6942326784. Throughput: 0: 42726.3. Samples: 6942529620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 11:18:19,220][15401] Updated weights for policy 0, policy_version 423730 (0.0039) [2024-06-23 11:18:21,964][15401] Updated weights for policy 0, policy_version 423740 (0.0047) [2024-06-23 11:18:23,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 6942572544. Throughput: 0: 42392.9. Samples: 6942642400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:23,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 11:18:26,609][15401] Updated weights for policy 0, policy_version 423750 (0.0025) [2024-06-23 11:18:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6942801920. Throughput: 0: 42732.1. Samples: 6942909300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 11:18:28,410][15349] Signal inference workers to stop experience collection... (102900 times) [2024-06-23 11:18:28,438][15401] InferenceWorker_p0-w0: stopping experience collection (102900 times) [2024-06-23 11:18:28,458][15349] Signal inference workers to resume experience collection... (102900 times) [2024-06-23 11:18:28,459][15401] InferenceWorker_p0-w0: resuming experience collection (102900 times) [2024-06-23 11:18:29,665][15401] Updated weights for policy 0, policy_version 423760 (0.0029) [2024-06-23 11:18:33,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 6942965760. Throughput: 0: 42819.5. Samples: 6943171900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 11:18:34,359][15401] Updated weights for policy 0, policy_version 423770 (0.0028) [2024-06-23 11:18:37,455][15401] Updated weights for policy 0, policy_version 423780 (0.0035) [2024-06-23 11:18:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6943211520. Throughput: 0: 42808.0. Samples: 6943287200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 11:18:42,150][15401] Updated weights for policy 0, policy_version 423790 (0.0037) [2024-06-23 11:18:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6943440896. Throughput: 0: 42902.7. Samples: 6943552260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 11:18:45,533][15401] Updated weights for policy 0, policy_version 423800 (0.0041) [2024-06-23 11:18:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6943621120. Throughput: 0: 42760.6. Samples: 6943812000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 11:18:49,796][15401] Updated weights for policy 0, policy_version 423810 (0.0052) [2024-06-23 11:18:53,083][15401] Updated weights for policy 0, policy_version 423820 (0.0038) [2024-06-23 11:18:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6943866880. Throughput: 0: 42844.0. Samples: 6943926960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 11:18:57,656][15401] Updated weights for policy 0, policy_version 423830 (0.0035) [2024-06-23 11:18:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6944079872. Throughput: 0: 42639.6. Samples: 6944188560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:18:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 11:19:00,668][15401] Updated weights for policy 0, policy_version 423840 (0.0039) [2024-06-23 11:19:03,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 6944227328. Throughput: 0: 42630.2. Samples: 6944447980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 11:19:05,390][15401] Updated weights for policy 0, policy_version 423850 (0.0034) [2024-06-23 11:19:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42654.9). Total num frames: 6944522240. Throughput: 0: 42711.1. Samples: 6944564300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 11:19:08,404][15401] Updated weights for policy 0, policy_version 423860 (0.0046) [2024-06-23 11:19:13,040][15401] Updated weights for policy 0, policy_version 423870 (0.0037) [2024-06-23 11:19:13,392][15132] Fps is (10 sec: 47502.1, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 6944702464. Throughput: 0: 42714.5. Samples: 6944831560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:13,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 11:19:15,949][15401] Updated weights for policy 0, policy_version 423880 (0.0041) [2024-06-23 11:19:18,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6944882688. Throughput: 0: 42521.8. Samples: 6945085380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 11:19:20,719][15401] Updated weights for policy 0, policy_version 423890 (0.0029) [2024-06-23 11:19:23,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 6945144832. Throughput: 0: 42650.2. Samples: 6945206460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:23,394][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 11:19:23,961][15401] Updated weights for policy 0, policy_version 423900 (0.0032) [2024-06-23 11:19:28,338][15401] Updated weights for policy 0, policy_version 423910 (0.0027) [2024-06-23 11:19:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 6945341440. Throughput: 0: 42607.2. Samples: 6945469580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 11:19:31,801][15401] Updated weights for policy 0, policy_version 423920 (0.0031) [2024-06-23 11:19:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 6945521664. Throughput: 0: 42395.0. Samples: 6945719780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 11:19:33,396][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 11:19:35,990][15401] Updated weights for policy 0, policy_version 423930 (0.0033) [2024-06-23 11:19:37,664][15349] Signal inference workers to stop experience collection... (102950 times) [2024-06-23 11:19:37,664][15349] Signal inference workers to resume experience collection... (102950 times) [2024-06-23 11:19:37,692][15401] InferenceWorker_p0-w0: stopping experience collection (102950 times) [2024-06-23 11:19:37,692][15401] InferenceWorker_p0-w0: resuming experience collection (102950 times) [2024-06-23 11:19:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.5). Total num frames: 6945783808. Throughput: 0: 42574.6. Samples: 6945842820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:19:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 11:19:39,349][15401] Updated weights for policy 0, policy_version 423940 (0.0038) [2024-06-23 11:19:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 6945964032. Throughput: 0: 42628.4. Samples: 6946106840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:19:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 11:19:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423949_6945980416.pth... [2024-06-23 11:19:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423326_6935773184.pth [2024-06-23 11:19:43,808][15401] Updated weights for policy 0, policy_version 423950 (0.0023) [2024-06-23 11:19:47,498][15401] Updated weights for policy 0, policy_version 423960 (0.0032) [2024-06-23 11:19:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6946177024. Throughput: 0: 42329.8. Samples: 6946352820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:19:48,392][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 11:19:51,688][15401] Updated weights for policy 0, policy_version 423970 (0.0027) [2024-06-23 11:19:53,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 6946439168. Throughput: 0: 42586.7. Samples: 6946480800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:19:53,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 11:19:55,071][15401] Updated weights for policy 0, policy_version 423980 (0.0024) [2024-06-23 11:19:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 6946586624. Throughput: 0: 42583.7. Samples: 6946747720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:19:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 11:19:59,266][15401] Updated weights for policy 0, policy_version 423990 (0.0039) [2024-06-23 11:20:03,010][15401] Updated weights for policy 0, policy_version 424000 (0.0036) [2024-06-23 11:20:03,389][15132] Fps is (10 sec: 37692.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 6946816000. Throughput: 0: 42445.7. Samples: 6946995440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 11:20:06,854][15401] Updated weights for policy 0, policy_version 424010 (0.0045) [2024-06-23 11:20:08,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6947078144. Throughput: 0: 42622.3. Samples: 6947124460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 11:20:10,527][15401] Updated weights for policy 0, policy_version 424020 (0.0036) [2024-06-23 11:20:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 6947241984. Throughput: 0: 42539.5. Samples: 6947383860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 11:20:14,401][15401] Updated weights for policy 0, policy_version 424030 (0.0031) [2024-06-23 11:20:18,168][15401] Updated weights for policy 0, policy_version 424040 (0.0023) [2024-06-23 11:20:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6947471360. Throughput: 0: 42503.2. Samples: 6947632420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 11:20:22,079][15401] Updated weights for policy 0, policy_version 424050 (0.0031) [2024-06-23 11:20:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6947717120. Throughput: 0: 42704.0. Samples: 6947764500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:23,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 11:20:25,857][15401] Updated weights for policy 0, policy_version 424060 (0.0037) [2024-06-23 11:20:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6947880960. Throughput: 0: 42637.4. Samples: 6948025520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:28,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 11:20:29,755][15401] Updated weights for policy 0, policy_version 424070 (0.0028) [2024-06-23 11:20:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 6948110336. Throughput: 0: 42768.5. Samples: 6948277400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 11:20:33,450][15401] Updated weights for policy 0, policy_version 424080 (0.0037) [2024-06-23 11:20:37,356][15401] Updated weights for policy 0, policy_version 424090 (0.0039) [2024-06-23 11:20:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6948339712. Throughput: 0: 42864.1. Samples: 6948409580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 11:20:41,002][15401] Updated weights for policy 0, policy_version 424100 (0.0039) [2024-06-23 11:20:43,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 6948503552. Throughput: 0: 42495.9. Samples: 6948660140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:43,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 11:20:44,927][15349] Signal inference workers to stop experience collection... (103000 times) [2024-06-23 11:20:44,932][15349] Signal inference workers to resume experience collection... (103000 times) [2024-06-23 11:20:44,959][15401] InferenceWorker_p0-w0: stopping experience collection (103000 times) [2024-06-23 11:20:44,959][15401] InferenceWorker_p0-w0: resuming experience collection (103000 times) [2024-06-23 11:20:45,083][15401] Updated weights for policy 0, policy_version 424110 (0.0028) [2024-06-23 11:20:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6948749312. Throughput: 0: 42603.6. Samples: 6948912600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 11:20:49,106][15401] Updated weights for policy 0, policy_version 424120 (0.0048) [2024-06-23 11:20:53,066][15401] Updated weights for policy 0, policy_version 424130 (0.0045) [2024-06-23 11:20:53,392][15132] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 6948978688. Throughput: 0: 42615.4. Samples: 6949042260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:53,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 11:20:56,612][15401] Updated weights for policy 0, policy_version 424140 (0.0040) [2024-06-23 11:20:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 6949158912. Throughput: 0: 42597.3. Samples: 6949300840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 27.0) [2024-06-23 11:20:58,393][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 11:21:00,780][15401] Updated weights for policy 0, policy_version 424150 (0.0033) [2024-06-23 11:21:03,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6949371904. Throughput: 0: 42626.2. Samples: 6949550600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 11:21:04,414][15401] Updated weights for policy 0, policy_version 424160 (0.0032) [2024-06-23 11:21:08,300][15401] Updated weights for policy 0, policy_version 424170 (0.0028) [2024-06-23 11:21:08,392][15132] Fps is (10 sec: 44237.7, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 6949601280. Throughput: 0: 42630.8. Samples: 6949682980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:08,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 11:21:12,058][15401] Updated weights for policy 0, policy_version 424180 (0.0043) [2024-06-23 11:21:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6949797888. Throughput: 0: 42417.7. Samples: 6949934320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 11:21:15,886][15401] Updated weights for policy 0, policy_version 424190 (0.0031) [2024-06-23 11:21:18,389][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 6950010880. Throughput: 0: 42381.8. Samples: 6950184580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 11:21:19,844][15401] Updated weights for policy 0, policy_version 424200 (0.0032) [2024-06-23 11:21:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6950240256. Throughput: 0: 42420.3. Samples: 6950318500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:23,399][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 11:21:23,651][15401] Updated weights for policy 0, policy_version 424210 (0.0034) [2024-06-23 11:21:27,384][15401] Updated weights for policy 0, policy_version 424220 (0.0045) [2024-06-23 11:21:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6950453248. Throughput: 0: 42501.4. Samples: 6950572600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:28,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 11:21:31,249][15401] Updated weights for policy 0, policy_version 424230 (0.0029) [2024-06-23 11:21:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6950666240. Throughput: 0: 42477.8. Samples: 6950824100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 11:21:35,373][15401] Updated weights for policy 0, policy_version 424240 (0.0026) [2024-06-23 11:21:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 6950862848. Throughput: 0: 42517.9. Samples: 6950955460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 11:21:38,979][15401] Updated weights for policy 0, policy_version 424250 (0.0035) [2024-06-23 11:21:42,925][15401] Updated weights for policy 0, policy_version 424260 (0.0039) [2024-06-23 11:21:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 6951075840. Throughput: 0: 42528.6. Samples: 6951214520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 11:21:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424260_6951075840.pth... [2024-06-23 11:21:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423638_6940884992.pth [2024-06-23 11:21:46,791][15401] Updated weights for policy 0, policy_version 424270 (0.0027) [2024-06-23 11:21:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6951321600. Throughput: 0: 42444.4. Samples: 6951460600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 11:21:50,601][15401] Updated weights for policy 0, policy_version 424280 (0.0030) [2024-06-23 11:21:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 6951501824. Throughput: 0: 42515.4. Samples: 6951596080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 11:21:54,539][15401] Updated weights for policy 0, policy_version 424290 (0.0035) [2024-06-23 11:21:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 6951714816. Throughput: 0: 42451.5. Samples: 6951844640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:21:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 11:21:58,741][15401] Updated weights for policy 0, policy_version 424300 (0.0029) [2024-06-23 11:22:02,367][15401] Updated weights for policy 0, policy_version 424310 (0.0022) [2024-06-23 11:22:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 6951960576. Throughput: 0: 42500.8. Samples: 6952097120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:22:03,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 11:22:06,487][15401] Updated weights for policy 0, policy_version 424320 (0.0042) [2024-06-23 11:22:08,393][15132] Fps is (10 sec: 44224.0, 60 sec: 42597.8, 300 sec: 42598.0). Total num frames: 6952157184. Throughput: 0: 42556.8. Samples: 6952233680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:22:08,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:22:09,920][15401] Updated weights for policy 0, policy_version 424330 (0.0033) [2024-06-23 11:22:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 6952370176. Throughput: 0: 42695.5. Samples: 6952493900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:22:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 11:22:13,790][15401] Updated weights for policy 0, policy_version 424340 (0.0031) [2024-06-23 11:22:17,352][15401] Updated weights for policy 0, policy_version 424350 (0.0043) [2024-06-23 11:22:18,392][15132] Fps is (10 sec: 44239.5, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 6952599552. Throughput: 0: 42610.1. Samples: 6952741660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 11:22:18,393][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 11:22:21,658][15401] Updated weights for policy 0, policy_version 424360 (0.0022) [2024-06-23 11:22:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6952796160. Throughput: 0: 42642.5. Samples: 6952874380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:23,390][15132] Avg episode reward: [(0, '0.122')] [2024-06-23 11:22:25,263][15401] Updated weights for policy 0, policy_version 424370 (0.0033) [2024-06-23 11:22:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 6953009152. Throughput: 0: 42447.5. Samples: 6953124660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 11:22:29,248][15349] Signal inference workers to stop experience collection... (103050 times) [2024-06-23 11:22:29,300][15401] InferenceWorker_p0-w0: stopping experience collection (103050 times) [2024-06-23 11:22:29,305][15349] Signal inference workers to resume experience collection... (103050 times) [2024-06-23 11:22:29,315][15401] InferenceWorker_p0-w0: resuming experience collection (103050 times) [2024-06-23 11:22:29,322][15401] Updated weights for policy 0, policy_version 424380 (0.0044) [2024-06-23 11:22:32,958][15401] Updated weights for policy 0, policy_version 424390 (0.0032) [2024-06-23 11:22:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6953222144. Throughput: 0: 42608.8. Samples: 6953378000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 11:22:37,233][15401] Updated weights for policy 0, policy_version 424400 (0.0031) [2024-06-23 11:22:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6953402368. Throughput: 0: 42431.6. Samples: 6953505500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 11:22:40,573][15401] Updated weights for policy 0, policy_version 424410 (0.0041) [2024-06-23 11:22:43,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42871.1, 300 sec: 42653.9). Total num frames: 6953648128. Throughput: 0: 42560.6. Samples: 6953759880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:43,391][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 11:22:45,068][15401] Updated weights for policy 0, policy_version 424420 (0.0039) [2024-06-23 11:22:48,251][15401] Updated weights for policy 0, policy_version 424430 (0.0036) [2024-06-23 11:22:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6953861120. Throughput: 0: 42647.6. Samples: 6954016260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 11:22:52,594][15401] Updated weights for policy 0, policy_version 424440 (0.0035) [2024-06-23 11:22:53,390][15132] Fps is (10 sec: 40961.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 6954057728. Throughput: 0: 42401.4. Samples: 6954141620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:53,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 11:22:55,898][15401] Updated weights for policy 0, policy_version 424450 (0.0031) [2024-06-23 11:22:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6954303488. Throughput: 0: 42457.0. Samples: 6954404460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:22:58,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 11:23:00,240][15401] Updated weights for policy 0, policy_version 424460 (0.0037) [2024-06-23 11:23:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 6954483712. Throughput: 0: 42566.7. Samples: 6954657060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 11:23:03,651][15401] Updated weights for policy 0, policy_version 424470 (0.0031) [2024-06-23 11:23:07,973][15401] Updated weights for policy 0, policy_version 424480 (0.0045) [2024-06-23 11:23:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42054.4, 300 sec: 42376.2). Total num frames: 6954680320. Throughput: 0: 42384.6. Samples: 6954781680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 11:23:11,287][15401] Updated weights for policy 0, policy_version 424490 (0.0039) [2024-06-23 11:23:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 6954926080. Throughput: 0: 42596.8. Samples: 6955041620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:13,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 11:23:15,638][15401] Updated weights for policy 0, policy_version 424500 (0.0034) [2024-06-23 11:23:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42327.0, 300 sec: 42598.7). Total num frames: 6955139072. Throughput: 0: 42618.7. Samples: 6955295840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 11:23:18,872][15401] Updated weights for policy 0, policy_version 424510 (0.0029) [2024-06-23 11:23:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 6955319296. Throughput: 0: 42565.4. Samples: 6955420940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:23,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 11:23:23,461][15401] Updated weights for policy 0, policy_version 424520 (0.0035) [2024-06-23 11:23:26,471][15401] Updated weights for policy 0, policy_version 424530 (0.0035) [2024-06-23 11:23:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6955581440. Throughput: 0: 42600.0. Samples: 6955676860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 11:23:30,945][15401] Updated weights for policy 0, policy_version 424540 (0.0035) [2024-06-23 11:23:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 6955761664. Throughput: 0: 42593.0. Samples: 6955932940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 11:23:34,390][15401] Updated weights for policy 0, policy_version 424550 (0.0022) [2024-06-23 11:23:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 6955974656. Throughput: 0: 42581.5. Samples: 6956057780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 11:23:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 11:23:38,417][15401] Updated weights for policy 0, policy_version 424560 (0.0036) [2024-06-23 11:23:42,150][15401] Updated weights for policy 0, policy_version 424570 (0.0029) [2024-06-23 11:23:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.8, 300 sec: 42653.9). Total num frames: 6956204032. Throughput: 0: 42373.9. Samples: 6956311280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:23:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 11:23:43,443][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424574_6956220416.pth... [2024-06-23 11:23:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000423949_6945980416.pth [2024-06-23 11:23:45,961][15401] Updated weights for policy 0, policy_version 424580 (0.0034) [2024-06-23 11:23:48,023][15349] Signal inference workers to stop experience collection... (103100 times) [2024-06-23 11:23:48,024][15349] Signal inference workers to resume experience collection... (103100 times) [2024-06-23 11:23:48,062][15401] InferenceWorker_p0-w0: stopping experience collection (103100 times) [2024-06-23 11:23:48,062][15401] InferenceWorker_p0-w0: resuming experience collection (103100 times) [2024-06-23 11:23:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6956400640. Throughput: 0: 42484.4. Samples: 6956568860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:23:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 11:23:49,822][15401] Updated weights for policy 0, policy_version 424590 (0.0035) [2024-06-23 11:23:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 6956613632. Throughput: 0: 42480.1. Samples: 6956693280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:23:53,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-23 11:23:53,719][15401] Updated weights for policy 0, policy_version 424600 (0.0054) [2024-06-23 11:23:57,693][15401] Updated weights for policy 0, policy_version 424610 (0.0026) [2024-06-23 11:23:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 6956826624. Throughput: 0: 42300.1. Samples: 6956945020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:23:58,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 11:24:01,634][15401] Updated weights for policy 0, policy_version 424620 (0.0023) [2024-06-23 11:24:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 6957023232. Throughput: 0: 42311.7. Samples: 6957199860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 11:24:05,409][15401] Updated weights for policy 0, policy_version 424630 (0.0037) [2024-06-23 11:24:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 6957236224. Throughput: 0: 42306.1. Samples: 6957324720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 11:24:09,451][15401] Updated weights for policy 0, policy_version 424640 (0.0043) [2024-06-23 11:24:13,076][15401] Updated weights for policy 0, policy_version 424650 (0.0038) [2024-06-23 11:24:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 6957465600. Throughput: 0: 42263.9. Samples: 6957578740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 11:24:17,501][15401] Updated weights for policy 0, policy_version 424660 (0.0035) [2024-06-23 11:24:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 6957662208. Throughput: 0: 42371.8. Samples: 6957839680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 11:24:20,799][15401] Updated weights for policy 0, policy_version 424670 (0.0037) [2024-06-23 11:24:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 6957875200. Throughput: 0: 42349.7. Samples: 6957963520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:23,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 11:24:25,122][15401] Updated weights for policy 0, policy_version 424680 (0.0039) [2024-06-23 11:24:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 6958104576. Throughput: 0: 42352.9. Samples: 6958217160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 11:24:28,471][15401] Updated weights for policy 0, policy_version 424690 (0.0033) [2024-06-23 11:24:32,801][15401] Updated weights for policy 0, policy_version 424700 (0.0037) [2024-06-23 11:24:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6958301184. Throughput: 0: 42370.2. Samples: 6958475520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 11:24:36,170][15401] Updated weights for policy 0, policy_version 424710 (0.0037) [2024-06-23 11:24:38,396][15132] Fps is (10 sec: 40933.2, 60 sec: 42320.7, 300 sec: 42541.9). Total num frames: 6958514176. Throughput: 0: 42402.7. Samples: 6958601680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:38,397][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 11:24:40,247][15401] Updated weights for policy 0, policy_version 424720 (0.0031) [2024-06-23 11:24:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6958743552. Throughput: 0: 42478.6. Samples: 6958856560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 11:24:43,865][15401] Updated weights for policy 0, policy_version 424730 (0.0037) [2024-06-23 11:24:47,941][15401] Updated weights for policy 0, policy_version 424740 (0.0028) [2024-06-23 11:24:48,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42325.4, 300 sec: 42376.6). Total num frames: 6958940160. Throughput: 0: 42515.6. Samples: 6959113060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 11:24:51,577][15401] Updated weights for policy 0, policy_version 424750 (0.0038) [2024-06-23 11:24:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6959153152. Throughput: 0: 42617.3. Samples: 6959242500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 11:24:55,619][15401] Updated weights for policy 0, policy_version 424760 (0.0037) [2024-06-23 11:24:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 6959366144. Throughput: 0: 42623.6. Samples: 6959496800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:24:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 11:24:59,067][15401] Updated weights for policy 0, policy_version 424770 (0.0037) [2024-06-23 11:25:03,365][15401] Updated weights for policy 0, policy_version 424780 (0.0031) [2024-06-23 11:25:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 6959595520. Throughput: 0: 42550.3. Samples: 6959754440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 11:25:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 11:25:06,600][15401] Updated weights for policy 0, policy_version 424790 (0.0029) [2024-06-23 11:25:08,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42541.9). Total num frames: 6959792128. Throughput: 0: 42599.6. Samples: 6959880780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:08,397][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 11:25:10,940][15401] Updated weights for policy 0, policy_version 424800 (0.0045) [2024-06-23 11:25:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 6960005120. Throughput: 0: 42765.2. Samples: 6960141600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 11:25:14,359][15401] Updated weights for policy 0, policy_version 424810 (0.0026) [2024-06-23 11:25:18,389][15132] Fps is (10 sec: 42626.4, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 6960218112. Throughput: 0: 42657.5. Samples: 6960395100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 11:25:18,538][15349] Signal inference workers to stop experience collection... (103150 times) [2024-06-23 11:25:18,538][15349] Signal inference workers to resume experience collection... (103150 times) [2024-06-23 11:25:18,568][15401] InferenceWorker_p0-w0: stopping experience collection (103150 times) [2024-06-23 11:25:18,568][15401] InferenceWorker_p0-w0: resuming experience collection (103150 times) [2024-06-23 11:25:18,687][15401] Updated weights for policy 0, policy_version 424820 (0.0039) [2024-06-23 11:25:22,097][15401] Updated weights for policy 0, policy_version 424830 (0.0026) [2024-06-23 11:25:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6960431104. Throughput: 0: 42652.0. Samples: 6960520740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 11:25:26,322][15401] Updated weights for policy 0, policy_version 424840 (0.0046) [2024-06-23 11:25:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 6960660480. Throughput: 0: 42760.4. Samples: 6960780780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 11:25:29,654][15401] Updated weights for policy 0, policy_version 424850 (0.0041) [2024-06-23 11:25:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 6960857088. Throughput: 0: 42854.7. Samples: 6961041520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 11:25:34,004][15401] Updated weights for policy 0, policy_version 424860 (0.0038) [2024-06-23 11:25:37,275][15401] Updated weights for policy 0, policy_version 424870 (0.0034) [2024-06-23 11:25:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42876.1, 300 sec: 42654.3). Total num frames: 6961086464. Throughput: 0: 42712.6. Samples: 6961164560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 11:25:41,574][15401] Updated weights for policy 0, policy_version 424880 (0.0033) [2024-06-23 11:25:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6961299456. Throughput: 0: 42733.9. Samples: 6961419820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 11:25:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424885_6961315840.pth... [2024-06-23 11:25:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424260_6951075840.pth [2024-06-23 11:25:44,959][15401] Updated weights for policy 0, policy_version 424890 (0.0035) [2024-06-23 11:25:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42376.6). Total num frames: 6961479680. Throughput: 0: 42815.2. Samples: 6961681120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 11:25:49,391][15401] Updated weights for policy 0, policy_version 424900 (0.0033) [2024-06-23 11:25:52,489][15401] Updated weights for policy 0, policy_version 424910 (0.0030) [2024-06-23 11:25:53,392][15132] Fps is (10 sec: 44224.2, 60 sec: 43142.6, 300 sec: 42653.9). Total num frames: 6961741824. Throughput: 0: 42788.3. Samples: 6961806100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:53,393][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 11:25:56,930][15401] Updated weights for policy 0, policy_version 424920 (0.0035) [2024-06-23 11:25:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6961938432. Throughput: 0: 42606.7. Samples: 6962058900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:25:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 11:26:00,357][15401] Updated weights for policy 0, policy_version 424930 (0.0038) [2024-06-23 11:26:03,389][15132] Fps is (10 sec: 39333.1, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 6962135040. Throughput: 0: 42860.4. Samples: 6962323820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:26:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 11:26:04,825][15401] Updated weights for policy 0, policy_version 424940 (0.0026) [2024-06-23 11:26:07,873][15401] Updated weights for policy 0, policy_version 424950 (0.0037) [2024-06-23 11:26:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43149.1, 300 sec: 42653.9). Total num frames: 6962380800. Throughput: 0: 42736.7. Samples: 6962443900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:26:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 11:26:12,396][15401] Updated weights for policy 0, policy_version 424960 (0.0025) [2024-06-23 11:26:13,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 6962593792. Throughput: 0: 42769.2. Samples: 6962705400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:26:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 11:26:15,314][15401] Updated weights for policy 0, policy_version 424970 (0.0030) [2024-06-23 11:26:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 6962757632. Throughput: 0: 42832.8. Samples: 6962969000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:26:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 11:26:20,009][15401] Updated weights for policy 0, policy_version 424980 (0.0034) [2024-06-23 11:26:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6963019776. Throughput: 0: 42784.4. Samples: 6963089860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 11:26:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 11:26:23,515][15401] Updated weights for policy 0, policy_version 424990 (0.0027) [2024-06-23 11:26:27,782][15401] Updated weights for policy 0, policy_version 425000 (0.0039) [2024-06-23 11:26:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6963216384. Throughput: 0: 42888.5. Samples: 6963349800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 11:26:31,142][15401] Updated weights for policy 0, policy_version 425010 (0.0047) [2024-06-23 11:26:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 6963412992. Throughput: 0: 42811.1. Samples: 6963607620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 11:26:35,367][15401] Updated weights for policy 0, policy_version 425020 (0.0030) [2024-06-23 11:26:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6963675136. Throughput: 0: 42805.8. Samples: 6963732240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 11:26:38,757][15401] Updated weights for policy 0, policy_version 425030 (0.0038) [2024-06-23 11:26:42,249][15349] Signal inference workers to stop experience collection... (103200 times) [2024-06-23 11:26:42,250][15349] Signal inference workers to resume experience collection... (103200 times) [2024-06-23 11:26:42,288][15401] InferenceWorker_p0-w0: stopping experience collection (103200 times) [2024-06-23 11:26:42,288][15401] InferenceWorker_p0-w0: resuming experience collection (103200 times) [2024-06-23 11:26:42,959][15401] Updated weights for policy 0, policy_version 425040 (0.0035) [2024-06-23 11:26:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6963871744. Throughput: 0: 43004.4. Samples: 6963994100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 11:26:46,137][15401] Updated weights for policy 0, policy_version 425050 (0.0029) [2024-06-23 11:26:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6964068352. Throughput: 0: 42753.3. Samples: 6964247720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 11:26:50,517][15401] Updated weights for policy 0, policy_version 425060 (0.0038) [2024-06-23 11:26:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.5, 300 sec: 42709.5). Total num frames: 6964314112. Throughput: 0: 42909.1. Samples: 6964374800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 11:26:54,289][15401] Updated weights for policy 0, policy_version 425070 (0.0037) [2024-06-23 11:26:58,173][15401] Updated weights for policy 0, policy_version 425080 (0.0033) [2024-06-23 11:26:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 6964510720. Throughput: 0: 42856.6. Samples: 6964633940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:26:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 11:27:01,726][15401] Updated weights for policy 0, policy_version 425090 (0.0042) [2024-06-23 11:27:03,390][15132] Fps is (10 sec: 39318.1, 60 sec: 42870.8, 300 sec: 42543.2). Total num frames: 6964707328. Throughput: 0: 42622.3. Samples: 6964887040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:03,391][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 11:27:05,905][15401] Updated weights for policy 0, policy_version 425100 (0.0035) [2024-06-23 11:27:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6964936704. Throughput: 0: 42726.3. Samples: 6965012540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 11:27:09,724][15401] Updated weights for policy 0, policy_version 425110 (0.0039) [2024-06-23 11:27:13,389][15132] Fps is (10 sec: 44240.4, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 6965149696. Throughput: 0: 42713.2. Samples: 6965271900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 11:27:13,776][15401] Updated weights for policy 0, policy_version 425120 (0.0033) [2024-06-23 11:27:17,335][15401] Updated weights for policy 0, policy_version 425130 (0.0031) [2024-06-23 11:27:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 6965346304. Throughput: 0: 42608.3. Samples: 6965525000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 11:27:21,391][15401] Updated weights for policy 0, policy_version 425140 (0.0053) [2024-06-23 11:27:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6965592064. Throughput: 0: 42752.0. Samples: 6965656080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 11:27:25,101][15401] Updated weights for policy 0, policy_version 425150 (0.0042) [2024-06-23 11:27:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6965772288. Throughput: 0: 42680.5. Samples: 6965914720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 11:27:28,939][15401] Updated weights for policy 0, policy_version 425160 (0.0037) [2024-06-23 11:27:32,789][15401] Updated weights for policy 0, policy_version 425170 (0.0038) [2024-06-23 11:27:33,391][15132] Fps is (10 sec: 40952.3, 60 sec: 43143.1, 300 sec: 42709.2). Total num frames: 6966001664. Throughput: 0: 42527.9. Samples: 6966161560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:33,392][15132] Avg episode reward: [(0, '0.868')] [2024-06-23 11:27:36,487][15401] Updated weights for policy 0, policy_version 425180 (0.0036) [2024-06-23 11:27:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 6966214656. Throughput: 0: 42596.8. Samples: 6966291760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:38,393][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 11:27:40,409][15401] Updated weights for policy 0, policy_version 425190 (0.0027) [2024-06-23 11:27:43,389][15132] Fps is (10 sec: 40968.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6966411264. Throughput: 0: 42600.5. Samples: 6966550960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 11:27:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425197_6966427648.pth... [2024-06-23 11:27:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424574_6956220416.pth [2024-06-23 11:27:43,990][15401] Updated weights for policy 0, policy_version 425200 (0.0033) [2024-06-23 11:27:48,269][15401] Updated weights for policy 0, policy_version 425210 (0.0044) [2024-06-23 11:27:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6966640640. Throughput: 0: 42573.2. Samples: 6966802800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 11:27:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 11:27:51,889][15401] Updated weights for policy 0, policy_version 425220 (0.0036) [2024-06-23 11:27:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6966853632. Throughput: 0: 42623.1. Samples: 6966930580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:27:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 11:27:55,895][15401] Updated weights for policy 0, policy_version 425230 (0.0041) [2024-06-23 11:27:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6967050240. Throughput: 0: 42599.9. Samples: 6967188900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:27:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 11:27:59,671][15401] Updated weights for policy 0, policy_version 425240 (0.0031) [2024-06-23 11:28:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42599.1, 300 sec: 42654.0). Total num frames: 6967263232. Throughput: 0: 42641.5. Samples: 6967443860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 11:28:03,908][15401] Updated weights for policy 0, policy_version 425250 (0.0037) [2024-06-23 11:28:07,317][15401] Updated weights for policy 0, policy_version 425260 (0.0033) [2024-06-23 11:28:08,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 6967492608. Throughput: 0: 42629.1. Samples: 6967574380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 11:28:09,643][15349] Signal inference workers to stop experience collection... (103250 times) [2024-06-23 11:28:09,643][15349] Signal inference workers to resume experience collection... (103250 times) [2024-06-23 11:28:09,686][15401] InferenceWorker_p0-w0: stopping experience collection (103250 times) [2024-06-23 11:28:09,687][15401] InferenceWorker_p0-w0: resuming experience collection (103250 times) [2024-06-23 11:28:11,628][15401] Updated weights for policy 0, policy_version 425270 (0.0030) [2024-06-23 11:28:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6967705600. Throughput: 0: 42550.2. Samples: 6967829480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 11:28:14,951][15401] Updated weights for policy 0, policy_version 425280 (0.0032) [2024-06-23 11:28:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6967918592. Throughput: 0: 42736.9. Samples: 6968084640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 11:28:19,244][15401] Updated weights for policy 0, policy_version 425290 (0.0027) [2024-06-23 11:28:22,642][15401] Updated weights for policy 0, policy_version 425300 (0.0038) [2024-06-23 11:28:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6968147968. Throughput: 0: 42752.4. Samples: 6968215520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 11:28:26,812][15401] Updated weights for policy 0, policy_version 425310 (0.0027) [2024-06-23 11:28:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6968344576. Throughput: 0: 42689.7. Samples: 6968472000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 11:28:30,463][15401] Updated weights for policy 0, policy_version 425320 (0.0035) [2024-06-23 11:28:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42599.7, 300 sec: 42653.9). Total num frames: 6968557568. Throughput: 0: 42684.4. Samples: 6968723600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 11:28:34,500][15401] Updated weights for policy 0, policy_version 425330 (0.0037) [2024-06-23 11:28:38,094][15401] Updated weights for policy 0, policy_version 425340 (0.0030) [2024-06-23 11:28:38,394][15132] Fps is (10 sec: 44215.0, 60 sec: 42869.7, 300 sec: 42653.2). Total num frames: 6968786944. Throughput: 0: 42705.1. Samples: 6968852520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:38,395][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 11:28:42,067][15401] Updated weights for policy 0, policy_version 425350 (0.0040) [2024-06-23 11:28:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6968983552. Throughput: 0: 42651.7. Samples: 6969108220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:43,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 11:28:45,840][15401] Updated weights for policy 0, policy_version 425360 (0.0033) [2024-06-23 11:28:48,390][15132] Fps is (10 sec: 42619.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6969212928. Throughput: 0: 42631.4. Samples: 6969362280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:48,393][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 11:28:49,712][15401] Updated weights for policy 0, policy_version 425370 (0.0041) [2024-06-23 11:28:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6969409536. Throughput: 0: 42554.1. Samples: 6969489320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 11:28:53,708][15401] Updated weights for policy 0, policy_version 425380 (0.0034) [2024-06-23 11:28:57,194][15401] Updated weights for policy 0, policy_version 425390 (0.0037) [2024-06-23 11:28:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6969638912. Throughput: 0: 42611.1. Samples: 6969746980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:28:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 11:29:01,634][15401] Updated weights for policy 0, policy_version 425400 (0.0028) [2024-06-23 11:29:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6969835520. Throughput: 0: 42651.1. Samples: 6970003940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:29:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 11:29:04,744][15401] Updated weights for policy 0, policy_version 425410 (0.0030) [2024-06-23 11:29:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 6970048512. Throughput: 0: 42383.2. Samples: 6970122860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 11:29:08,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 11:29:09,306][15401] Updated weights for policy 0, policy_version 425420 (0.0032) [2024-06-23 11:29:12,569][15401] Updated weights for policy 0, policy_version 425430 (0.0023) [2024-06-23 11:29:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6970261504. Throughput: 0: 42361.3. Samples: 6970378260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 11:29:17,327][15401] Updated weights for policy 0, policy_version 425440 (0.0029) [2024-06-23 11:29:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6970474496. Throughput: 0: 42636.1. Samples: 6970642220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 11:29:20,079][15401] Updated weights for policy 0, policy_version 425450 (0.0026) [2024-06-23 11:29:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6970687488. Throughput: 0: 42597.6. Samples: 6970769200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 11:29:24,861][15401] Updated weights for policy 0, policy_version 425460 (0.0036) [2024-06-23 11:29:26,000][15349] Signal inference workers to stop experience collection... (103300 times) [2024-06-23 11:29:26,001][15349] Signal inference workers to resume experience collection... (103300 times) [2024-06-23 11:29:26,027][15401] InferenceWorker_p0-w0: stopping experience collection (103300 times) [2024-06-23 11:29:26,027][15401] InferenceWorker_p0-w0: resuming experience collection (103300 times) [2024-06-23 11:29:28,069][15401] Updated weights for policy 0, policy_version 425470 (0.0055) [2024-06-23 11:29:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6970916864. Throughput: 0: 42534.3. Samples: 6971022260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 11:29:32,317][15401] Updated weights for policy 0, policy_version 425480 (0.0046) [2024-06-23 11:29:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42766.0). Total num frames: 6971129856. Throughput: 0: 42761.9. Samples: 6971286560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 11:29:35,520][15401] Updated weights for policy 0, policy_version 425490 (0.0036) [2024-06-23 11:29:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42328.7, 300 sec: 42653.9). Total num frames: 6971326464. Throughput: 0: 42881.2. Samples: 6971418980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 11:29:39,796][15401] Updated weights for policy 0, policy_version 425500 (0.0032) [2024-06-23 11:29:43,043][15401] Updated weights for policy 0, policy_version 425510 (0.0036) [2024-06-23 11:29:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6971555840. Throughput: 0: 42756.0. Samples: 6971671000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:43,398][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 11:29:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425510_6971555840.pth... [2024-06-23 11:29:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000424885_6961315840.pth [2024-06-23 11:29:47,406][15401] Updated weights for policy 0, policy_version 425520 (0.0027) [2024-06-23 11:29:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 6971768832. Throughput: 0: 42818.3. Samples: 6971930760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 11:29:50,727][15401] Updated weights for policy 0, policy_version 425530 (0.0045) [2024-06-23 11:29:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 6971965440. Throughput: 0: 43004.6. Samples: 6972057960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 11:29:55,258][15401] Updated weights for policy 0, policy_version 425540 (0.0038) [2024-06-23 11:29:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6972178432. Throughput: 0: 42918.1. Samples: 6972309580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:29:58,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 11:29:58,742][15401] Updated weights for policy 0, policy_version 425550 (0.0036) [2024-06-23 11:30:02,962][15401] Updated weights for policy 0, policy_version 425560 (0.0046) [2024-06-23 11:30:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 6972391424. Throughput: 0: 42765.7. Samples: 6972566680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 11:30:06,686][15401] Updated weights for policy 0, policy_version 425570 (0.0035) [2024-06-23 11:30:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 6972620800. Throughput: 0: 42840.9. Samples: 6972697040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 11:30:10,578][15401] Updated weights for policy 0, policy_version 425580 (0.0047) [2024-06-23 11:30:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 6972833792. Throughput: 0: 42787.9. Samples: 6972947820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:13,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 11:30:14,386][15401] Updated weights for policy 0, policy_version 425590 (0.0034) [2024-06-23 11:30:18,310][15401] Updated weights for policy 0, policy_version 425600 (0.0037) [2024-06-23 11:30:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6973030400. Throughput: 0: 42751.6. Samples: 6973210380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 11:30:21,917][15401] Updated weights for policy 0, policy_version 425610 (0.0027) [2024-06-23 11:30:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6973259776. Throughput: 0: 42550.2. Samples: 6973333740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 11:30:25,899][15401] Updated weights for policy 0, policy_version 425620 (0.0033) [2024-06-23 11:30:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 6973472768. Throughput: 0: 42552.0. Samples: 6973585840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-23 11:30:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 11:30:29,571][15401] Updated weights for policy 0, policy_version 425630 (0.0039) [2024-06-23 11:30:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6973652992. Throughput: 0: 42694.6. Samples: 6973852020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 11:30:33,542][15401] Updated weights for policy 0, policy_version 425640 (0.0032) [2024-06-23 11:30:36,950][15349] Signal inference workers to stop experience collection... (103350 times) [2024-06-23 11:30:36,951][15349] Signal inference workers to resume experience collection... (103350 times) [2024-06-23 11:30:36,992][15401] InferenceWorker_p0-w0: stopping experience collection (103350 times) [2024-06-23 11:30:36,992][15401] InferenceWorker_p0-w0: resuming experience collection (103350 times) [2024-06-23 11:30:37,091][15401] Updated weights for policy 0, policy_version 425650 (0.0038) [2024-06-23 11:30:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6973898752. Throughput: 0: 42549.8. Samples: 6973972700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 11:30:41,338][15401] Updated weights for policy 0, policy_version 425660 (0.0040) [2024-06-23 11:30:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 6974111744. Throughput: 0: 42649.8. Samples: 6974228820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 11:30:44,653][15401] Updated weights for policy 0, policy_version 425670 (0.0041) [2024-06-23 11:30:48,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42320.8, 300 sec: 42597.9). Total num frames: 6974308352. Throughput: 0: 42734.0. Samples: 6974489980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:48,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 11:30:48,970][15401] Updated weights for policy 0, policy_version 425680 (0.0043) [2024-06-23 11:30:52,286][15401] Updated weights for policy 0, policy_version 425690 (0.0033) [2024-06-23 11:30:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6974537728. Throughput: 0: 42543.9. Samples: 6974611520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 11:30:56,725][15401] Updated weights for policy 0, policy_version 425700 (0.0036) [2024-06-23 11:30:58,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 6974750720. Throughput: 0: 42640.9. Samples: 6974866560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:30:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 11:30:59,917][15401] Updated weights for policy 0, policy_version 425710 (0.0037) [2024-06-23 11:31:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6974947328. Throughput: 0: 42536.9. Samples: 6975124540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 11:31:04,348][15401] Updated weights for policy 0, policy_version 425720 (0.0037) [2024-06-23 11:31:07,965][15401] Updated weights for policy 0, policy_version 425730 (0.0035) [2024-06-23 11:31:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 6975176704. Throughput: 0: 42495.6. Samples: 6975246040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 11:31:12,346][15401] Updated weights for policy 0, policy_version 425740 (0.0036) [2024-06-23 11:31:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 6975389696. Throughput: 0: 42664.1. Samples: 6975505720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:31:15,542][15401] Updated weights for policy 0, policy_version 425750 (0.0031) [2024-06-23 11:31:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 6975586304. Throughput: 0: 42478.7. Samples: 6975763560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 11:31:19,991][15401] Updated weights for policy 0, policy_version 425760 (0.0028) [2024-06-23 11:31:23,227][15401] Updated weights for policy 0, policy_version 425770 (0.0039) [2024-06-23 11:31:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 6975815680. Throughput: 0: 42424.0. Samples: 6975881780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 11:31:27,523][15401] Updated weights for policy 0, policy_version 425780 (0.0026) [2024-06-23 11:31:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 6976028672. Throughput: 0: 42551.1. Samples: 6976143620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 11:31:31,126][15401] Updated weights for policy 0, policy_version 425790 (0.0030) [2024-06-23 11:31:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 6976241664. Throughput: 0: 42256.6. Samples: 6976391260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 11:31:35,459][15401] Updated weights for policy 0, policy_version 425800 (0.0053) [2024-06-23 11:31:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 6976438272. Throughput: 0: 42368.4. Samples: 6976518100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:38,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 11:31:38,759][15401] Updated weights for policy 0, policy_version 425810 (0.0028) [2024-06-23 11:31:42,925][15401] Updated weights for policy 0, policy_version 425820 (0.0035) [2024-06-23 11:31:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6976651264. Throughput: 0: 42519.5. Samples: 6976779940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:43,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 11:31:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425821_6976651264.pth... [2024-06-23 11:31:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425197_6966427648.pth [2024-06-23 11:31:46,407][15401] Updated weights for policy 0, policy_version 425830 (0.0037) [2024-06-23 11:31:48,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42871.5, 300 sec: 42597.5). Total num frames: 6976880640. Throughput: 0: 42324.1. Samples: 6977029400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:48,396][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 11:31:50,427][15401] Updated weights for policy 0, policy_version 425840 (0.0038) [2024-06-23 11:31:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6977077248. Throughput: 0: 42651.7. Samples: 6977165360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 11:31:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 11:31:54,048][15401] Updated weights for policy 0, policy_version 425850 (0.0035) [2024-06-23 11:31:58,390][15132] Fps is (10 sec: 39346.4, 60 sec: 42052.2, 300 sec: 42598.5). Total num frames: 6977273856. Throughput: 0: 42617.7. Samples: 6977423520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:31:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 11:31:58,658][15401] Updated weights for policy 0, policy_version 425860 (0.0033) [2024-06-23 11:31:59,620][15349] Signal inference workers to stop experience collection... (103400 times) [2024-06-23 11:31:59,621][15349] Signal inference workers to resume experience collection... (103400 times) [2024-06-23 11:31:59,663][15401] InferenceWorker_p0-w0: stopping experience collection (103400 times) [2024-06-23 11:31:59,664][15401] InferenceWorker_p0-w0: resuming experience collection (103400 times) [2024-06-23 11:32:01,547][15401] Updated weights for policy 0, policy_version 425870 (0.0035) [2024-06-23 11:32:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6977519616. Throughput: 0: 42436.9. Samples: 6977673220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 11:32:06,189][15401] Updated weights for policy 0, policy_version 425880 (0.0033) [2024-06-23 11:32:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 6977732608. Throughput: 0: 42897.5. Samples: 6977812180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 11:32:09,202][15401] Updated weights for policy 0, policy_version 425890 (0.0032) [2024-06-23 11:32:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 6977912832. Throughput: 0: 42720.0. Samples: 6978066020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:13,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 11:32:13,786][15401] Updated weights for policy 0, policy_version 425900 (0.0039) [2024-06-23 11:32:16,815][15401] Updated weights for policy 0, policy_version 425910 (0.0036) [2024-06-23 11:32:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6978174976. Throughput: 0: 42768.4. Samples: 6978315840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 11:32:21,446][15401] Updated weights for policy 0, policy_version 425920 (0.0037) [2024-06-23 11:32:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6978371584. Throughput: 0: 43078.7. Samples: 6978456640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:32:24,344][15401] Updated weights for policy 0, policy_version 425930 (0.0029) [2024-06-23 11:32:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 6978568192. Throughput: 0: 42829.3. Samples: 6978707260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 11:32:28,996][15401] Updated weights for policy 0, policy_version 425940 (0.0038) [2024-06-23 11:32:32,042][15401] Updated weights for policy 0, policy_version 425950 (0.0054) [2024-06-23 11:32:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 6978813952. Throughput: 0: 42835.4. Samples: 6978956720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 11:32:36,515][15401] Updated weights for policy 0, policy_version 425960 (0.0031) [2024-06-23 11:32:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6979010560. Throughput: 0: 42962.7. Samples: 6979098680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 11:32:39,835][15401] Updated weights for policy 0, policy_version 425970 (0.0029) [2024-06-23 11:32:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6979223552. Throughput: 0: 42680.0. Samples: 6979344120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 11:32:44,269][15401] Updated weights for policy 0, policy_version 425980 (0.0033) [2024-06-23 11:32:47,696][15401] Updated weights for policy 0, policy_version 425990 (0.0028) [2024-06-23 11:32:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 6979452928. Throughput: 0: 42727.6. Samples: 6979595960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 11:32:51,885][15401] Updated weights for policy 0, policy_version 426000 (0.0028) [2024-06-23 11:32:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 6979633152. Throughput: 0: 42587.8. Samples: 6979728620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 11:32:55,190][15401] Updated weights for policy 0, policy_version 426010 (0.0041) [2024-06-23 11:32:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 6979878912. Throughput: 0: 42530.2. Samples: 6979979880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:32:58,399][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 11:32:59,684][15401] Updated weights for policy 0, policy_version 426020 (0.0034) [2024-06-23 11:33:02,814][15401] Updated weights for policy 0, policy_version 426030 (0.0034) [2024-06-23 11:33:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 6980091904. Throughput: 0: 42704.5. Samples: 6980237540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:33:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 11:33:07,317][15401] Updated weights for policy 0, policy_version 426040 (0.0034) [2024-06-23 11:33:08,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 6980272128. Throughput: 0: 42437.3. Samples: 6980366420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:33:08,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 11:33:10,597][15401] Updated weights for policy 0, policy_version 426050 (0.0033) [2024-06-23 11:33:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 6980501504. Throughput: 0: 42385.9. Samples: 6980614620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 11:33:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 11:33:14,969][15401] Updated weights for policy 0, policy_version 426060 (0.0038) [2024-06-23 11:33:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 6980698112. Throughput: 0: 42649.4. Samples: 6980875940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 11:33:18,578][15401] Updated weights for policy 0, policy_version 426070 (0.0037) [2024-06-23 11:33:22,974][15401] Updated weights for policy 0, policy_version 426080 (0.0026) [2024-06-23 11:33:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6980911104. Throughput: 0: 42236.3. Samples: 6980999320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 11:33:26,009][15349] Signal inference workers to stop experience collection... (103450 times) [2024-06-23 11:33:26,016][15349] Signal inference workers to resume experience collection... (103450 times) [2024-06-23 11:33:26,027][15401] InferenceWorker_p0-w0: stopping experience collection (103450 times) [2024-06-23 11:33:26,027][15401] InferenceWorker_p0-w0: resuming experience collection (103450 times) [2024-06-23 11:33:26,169][15401] Updated weights for policy 0, policy_version 426090 (0.0039) [2024-06-23 11:33:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 6981140480. Throughput: 0: 42290.3. Samples: 6981247280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:28,392][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 11:33:30,753][15401] Updated weights for policy 0, policy_version 426100 (0.0039) [2024-06-23 11:33:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42488.0). Total num frames: 6981320704. Throughput: 0: 42617.3. Samples: 6981513740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 11:33:33,991][15401] Updated weights for policy 0, policy_version 426110 (0.0029) [2024-06-23 11:33:38,390][15132] Fps is (10 sec: 39330.4, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 6981533696. Throughput: 0: 42339.8. Samples: 6981633920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 11:33:38,455][15401] Updated weights for policy 0, policy_version 426120 (0.0026) [2024-06-23 11:33:41,805][15401] Updated weights for policy 0, policy_version 426130 (0.0034) [2024-06-23 11:33:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6981795840. Throughput: 0: 42290.6. Samples: 6981882960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 11:33:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426135_6981795840.pth... [2024-06-23 11:33:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425510_6971555840.pth [2024-06-23 11:33:46,219][15401] Updated weights for policy 0, policy_version 426140 (0.0029) [2024-06-23 11:33:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 6981959680. Throughput: 0: 42321.0. Samples: 6982141980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 11:33:49,642][15401] Updated weights for policy 0, policy_version 426150 (0.0037) [2024-06-23 11:33:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 6982172672. Throughput: 0: 42122.1. Samples: 6982261820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:53,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 11:33:53,777][15401] Updated weights for policy 0, policy_version 426160 (0.0031) [2024-06-23 11:33:57,362][15401] Updated weights for policy 0, policy_version 426170 (0.0030) [2024-06-23 11:33:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 6982418432. Throughput: 0: 42466.6. Samples: 6982525620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:33:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 11:34:01,368][15401] Updated weights for policy 0, policy_version 426180 (0.0044) [2024-06-23 11:34:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 42598.8). Total num frames: 6982615040. Throughput: 0: 42224.9. Samples: 6982776060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 11:34:05,075][15401] Updated weights for policy 0, policy_version 426190 (0.0030) [2024-06-23 11:34:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 6982811648. Throughput: 0: 42265.8. Samples: 6982901280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 11:34:09,175][15401] Updated weights for policy 0, policy_version 426200 (0.0044) [2024-06-23 11:34:12,711][15401] Updated weights for policy 0, policy_version 426210 (0.0029) [2024-06-23 11:34:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 6983041024. Throughput: 0: 42444.0. Samples: 6983157160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 11:34:17,247][15401] Updated weights for policy 0, policy_version 426220 (0.0042) [2024-06-23 11:34:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 6983254016. Throughput: 0: 42187.5. Samples: 6983412280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:18,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 11:34:20,779][15401] Updated weights for policy 0, policy_version 426230 (0.0026) [2024-06-23 11:34:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 6983467008. Throughput: 0: 42387.2. Samples: 6983541440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:23,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 11:34:24,725][15401] Updated weights for policy 0, policy_version 426240 (0.0041) [2024-06-23 11:34:28,322][15401] Updated weights for policy 0, policy_version 426250 (0.0031) [2024-06-23 11:34:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 6983680000. Throughput: 0: 42689.8. Samples: 6983804000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 11:34:32,309][15401] Updated weights for policy 0, policy_version 426260 (0.0037) [2024-06-23 11:34:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6983909376. Throughput: 0: 42519.5. Samples: 6984055360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:34:36,073][15401] Updated weights for policy 0, policy_version 426270 (0.0031) [2024-06-23 11:34:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 6984105984. Throughput: 0: 42798.2. Samples: 6984187740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 11:34:38,391][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 11:34:39,975][15401] Updated weights for policy 0, policy_version 426280 (0.0032) [2024-06-23 11:34:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 6984302592. Throughput: 0: 42529.8. Samples: 6984439460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:34:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 11:34:43,771][15401] Updated weights for policy 0, policy_version 426290 (0.0036) [2024-06-23 11:34:47,623][15401] Updated weights for policy 0, policy_version 426300 (0.0038) [2024-06-23 11:34:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6984531968. Throughput: 0: 42579.9. Samples: 6984692160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:34:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 11:34:51,336][15401] Updated weights for policy 0, policy_version 426310 (0.0035) [2024-06-23 11:34:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 6984761344. Throughput: 0: 42772.4. Samples: 6984826040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:34:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 11:34:54,481][15349] Signal inference workers to stop experience collection... (103500 times) [2024-06-23 11:34:54,481][15349] Signal inference workers to resume experience collection... (103500 times) [2024-06-23 11:34:54,523][15401] InferenceWorker_p0-w0: stopping experience collection (103500 times) [2024-06-23 11:34:54,523][15401] InferenceWorker_p0-w0: resuming experience collection (103500 times) [2024-06-23 11:34:55,217][15401] Updated weights for policy 0, policy_version 426320 (0.0037) [2024-06-23 11:34:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6984957952. Throughput: 0: 42657.0. Samples: 6985076720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:34:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 11:34:58,872][15401] Updated weights for policy 0, policy_version 426330 (0.0032) [2024-06-23 11:35:02,981][15401] Updated weights for policy 0, policy_version 426340 (0.0025) [2024-06-23 11:35:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6985170944. Throughput: 0: 42728.9. Samples: 6985334980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 11:35:06,917][15401] Updated weights for policy 0, policy_version 426350 (0.0029) [2024-06-23 11:35:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42598.4). Total num frames: 6985400320. Throughput: 0: 42758.3. Samples: 6985465560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:08,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 11:35:10,474][15401] Updated weights for policy 0, policy_version 426360 (0.0033) [2024-06-23 11:35:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 6985596928. Throughput: 0: 42490.3. Samples: 6985716060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 11:35:14,346][15401] Updated weights for policy 0, policy_version 426370 (0.0034) [2024-06-23 11:35:18,335][15401] Updated weights for policy 0, policy_version 426380 (0.0040) [2024-06-23 11:35:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 6985809920. Throughput: 0: 42753.7. Samples: 6985979280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 11:35:21,834][15401] Updated weights for policy 0, policy_version 426390 (0.0031) [2024-06-23 11:35:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 6986022912. Throughput: 0: 42489.5. Samples: 6986099760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 11:35:26,025][15401] Updated weights for policy 0, policy_version 426400 (0.0031) [2024-06-23 11:35:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6986252288. Throughput: 0: 42740.1. Samples: 6986362760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 11:35:29,238][15401] Updated weights for policy 0, policy_version 426410 (0.0034) [2024-06-23 11:35:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6986432512. Throughput: 0: 43028.5. Samples: 6986628440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 11:35:33,592][15401] Updated weights for policy 0, policy_version 426420 (0.0030) [2024-06-23 11:35:37,232][15401] Updated weights for policy 0, policy_version 426430 (0.0037) [2024-06-23 11:35:38,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 6986661888. Throughput: 0: 42725.8. Samples: 6986748800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:38,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 11:35:41,290][15401] Updated weights for policy 0, policy_version 426440 (0.0022) [2024-06-23 11:35:43,391][15132] Fps is (10 sec: 47508.5, 60 sec: 43416.9, 300 sec: 42710.3). Total num frames: 6986907648. Throughput: 0: 42907.8. Samples: 6987007620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:43,391][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 11:35:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426447_6986907648.pth... [2024-06-23 11:35:43,443][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000425821_6976651264.pth [2024-06-23 11:35:44,741][15401] Updated weights for policy 0, policy_version 426450 (0.0043) [2024-06-23 11:35:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 6987087872. Throughput: 0: 43018.2. Samples: 6987270800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 11:35:48,911][15401] Updated weights for policy 0, policy_version 426460 (0.0030) [2024-06-23 11:35:52,322][15401] Updated weights for policy 0, policy_version 426470 (0.0027) [2024-06-23 11:35:53,390][15132] Fps is (10 sec: 39325.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 6987300864. Throughput: 0: 42812.0. Samples: 6987392000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 11:35:56,727][15401] Updated weights for policy 0, policy_version 426480 (0.0035) [2024-06-23 11:35:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6987546624. Throughput: 0: 42982.5. Samples: 6987650280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 11:35:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 11:35:59,801][15401] Updated weights for policy 0, policy_version 426490 (0.0034) [2024-06-23 11:36:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6987743232. Throughput: 0: 42854.8. Samples: 6987907740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 11:36:04,365][15401] Updated weights for policy 0, policy_version 426500 (0.0039) [2024-06-23 11:36:07,627][15401] Updated weights for policy 0, policy_version 426510 (0.0029) [2024-06-23 11:36:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 6987956224. Throughput: 0: 42919.6. Samples: 6988031140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:08,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 11:36:11,823][15401] Updated weights for policy 0, policy_version 426520 (0.0030) [2024-06-23 11:36:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 6988185600. Throughput: 0: 42800.4. Samples: 6988288780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 11:36:15,234][15401] Updated weights for policy 0, policy_version 426530 (0.0050) [2024-06-23 11:36:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6988382208. Throughput: 0: 42664.5. Samples: 6988548340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:18,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 11:36:19,495][15349] Signal inference workers to stop experience collection... (103550 times) [2024-06-23 11:36:19,552][15401] InferenceWorker_p0-w0: stopping experience collection (103550 times) [2024-06-23 11:36:19,611][15349] Signal inference workers to resume experience collection... (103550 times) [2024-06-23 11:36:19,611][15401] InferenceWorker_p0-w0: resuming experience collection (103550 times) [2024-06-23 11:36:19,758][15401] Updated weights for policy 0, policy_version 426540 (0.0027) [2024-06-23 11:36:22,797][15401] Updated weights for policy 0, policy_version 426550 (0.0048) [2024-06-23 11:36:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 6988595200. Throughput: 0: 42688.1. Samples: 6988669660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 11:36:27,262][15401] Updated weights for policy 0, policy_version 426560 (0.0034) [2024-06-23 11:36:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6988808192. Throughput: 0: 42839.7. Samples: 6988935360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 11:36:31,033][15401] Updated weights for policy 0, policy_version 426570 (0.0044) [2024-06-23 11:36:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 6989021184. Throughput: 0: 42652.1. Samples: 6989190140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 11:36:34,852][15401] Updated weights for policy 0, policy_version 426580 (0.0038) [2024-06-23 11:36:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 6989217792. Throughput: 0: 42752.4. Samples: 6989315860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 11:36:38,639][15401] Updated weights for policy 0, policy_version 426590 (0.0029) [2024-06-23 11:36:42,541][15401] Updated weights for policy 0, policy_version 426600 (0.0041) [2024-06-23 11:36:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42599.2, 300 sec: 42654.9). Total num frames: 6989463552. Throughput: 0: 42799.2. Samples: 6989576240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 11:36:46,208][15401] Updated weights for policy 0, policy_version 426610 (0.0035) [2024-06-23 11:36:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 6989660160. Throughput: 0: 42624.4. Samples: 6989825840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 11:36:50,227][15401] Updated weights for policy 0, policy_version 426620 (0.0036) [2024-06-23 11:36:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6989873152. Throughput: 0: 42597.4. Samples: 6989948020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 11:36:54,140][15401] Updated weights for policy 0, policy_version 426630 (0.0043) [2024-06-23 11:36:57,985][15401] Updated weights for policy 0, policy_version 426640 (0.0045) [2024-06-23 11:36:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6990086144. Throughput: 0: 42759.6. Samples: 6990212960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:36:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 11:37:01,580][15401] Updated weights for policy 0, policy_version 426650 (0.0035) [2024-06-23 11:37:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 6990282752. Throughput: 0: 42526.5. Samples: 6990462040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:37:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 11:37:05,713][15401] Updated weights for policy 0, policy_version 426660 (0.0040) [2024-06-23 11:37:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6990512128. Throughput: 0: 42634.1. Samples: 6990588200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:37:08,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 11:37:09,143][15401] Updated weights for policy 0, policy_version 426670 (0.0033) [2024-06-23 11:37:13,364][15401] Updated weights for policy 0, policy_version 426680 (0.0036) [2024-06-23 11:37:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 6990725120. Throughput: 0: 42349.3. Samples: 6990841080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:37:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 11:37:16,807][15401] Updated weights for policy 0, policy_version 426690 (0.0033) [2024-06-23 11:37:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6990938112. Throughput: 0: 42365.4. Samples: 6991096580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:37:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 11:37:20,930][15401] Updated weights for policy 0, policy_version 426700 (0.0037) [2024-06-23 11:37:23,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 6991151104. Throughput: 0: 42352.3. Samples: 6991221980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 11:37:23,396][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 11:37:24,575][15401] Updated weights for policy 0, policy_version 426710 (0.0044) [2024-06-23 11:37:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6991364096. Throughput: 0: 42276.5. Samples: 6991478680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:28,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 11:37:28,512][15401] Updated weights for policy 0, policy_version 426720 (0.0028) [2024-06-23 11:37:32,011][15401] Updated weights for policy 0, policy_version 426730 (0.0040) [2024-06-23 11:37:33,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6991577088. Throughput: 0: 42482.7. Samples: 6991737560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:33,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 11:37:36,451][15401] Updated weights for policy 0, policy_version 426740 (0.0027) [2024-06-23 11:37:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 6991773696. Throughput: 0: 42570.6. Samples: 6991863700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 11:37:40,000][15401] Updated weights for policy 0, policy_version 426750 (0.0046) [2024-06-23 11:37:40,723][15349] Signal inference workers to stop experience collection... (103600 times) [2024-06-23 11:37:40,726][15349] Signal inference workers to resume experience collection... (103600 times) [2024-06-23 11:37:40,767][15401] InferenceWorker_p0-w0: stopping experience collection (103600 times) [2024-06-23 11:37:40,767][15401] InferenceWorker_p0-w0: resuming experience collection (103600 times) [2024-06-23 11:37:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 6991986688. Throughput: 0: 42423.1. Samples: 6992122000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:43,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-23 11:37:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426758_6992003072.pth... [2024-06-23 11:37:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426135_6981795840.pth [2024-06-23 11:37:44,131][15401] Updated weights for policy 0, policy_version 426760 (0.0042) [2024-06-23 11:37:47,752][15401] Updated weights for policy 0, policy_version 426770 (0.0040) [2024-06-23 11:37:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 6992232448. Throughput: 0: 42349.5. Samples: 6992367760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 11:37:51,756][15401] Updated weights for policy 0, policy_version 426780 (0.0035) [2024-06-23 11:37:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6992412672. Throughput: 0: 42346.4. Samples: 6992493780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 11:37:55,590][15401] Updated weights for policy 0, policy_version 426790 (0.0034) [2024-06-23 11:37:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 6992625664. Throughput: 0: 42479.2. Samples: 6992752640. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:37:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 11:37:59,960][15401] Updated weights for policy 0, policy_version 426800 (0.0034) [2024-06-23 11:38:03,154][15401] Updated weights for policy 0, policy_version 426810 (0.0034) [2024-06-23 11:38:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 6992871424. Throughput: 0: 42435.0. Samples: 6993006160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:03,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 11:38:07,335][15401] Updated weights for policy 0, policy_version 426820 (0.0025) [2024-06-23 11:38:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 6993051648. Throughput: 0: 42659.0. Samples: 6993141360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 11:38:10,491][15401] Updated weights for policy 0, policy_version 426830 (0.0038) [2024-06-23 11:38:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 6993264640. Throughput: 0: 42630.6. Samples: 6993397060. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 11:38:15,147][15401] Updated weights for policy 0, policy_version 426840 (0.0030) [2024-06-23 11:38:18,158][15401] Updated weights for policy 0, policy_version 426850 (0.0039) [2024-06-23 11:38:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 6993510400. Throughput: 0: 42485.6. Samples: 6993649420. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 11:38:22,708][15401] Updated weights for policy 0, policy_version 426860 (0.0025) [2024-06-23 11:38:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42329.9, 300 sec: 42543.2). Total num frames: 6993690624. Throughput: 0: 42689.8. Samples: 6993784740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:23,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 11:38:25,848][15401] Updated weights for policy 0, policy_version 426870 (0.0035) [2024-06-23 11:38:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 6993920000. Throughput: 0: 42635.9. Samples: 6994040620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 11:38:30,278][15401] Updated weights for policy 0, policy_version 426880 (0.0032) [2024-06-23 11:38:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 6994149376. Throughput: 0: 42829.3. Samples: 6994295080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 11:38:33,499][15401] Updated weights for policy 0, policy_version 426890 (0.0028) [2024-06-23 11:38:37,872][15401] Updated weights for policy 0, policy_version 426900 (0.0031) [2024-06-23 11:38:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 6994345984. Throughput: 0: 42943.6. Samples: 6994426240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:38,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 11:38:41,534][15401] Updated weights for policy 0, policy_version 426910 (0.0035) [2024-06-23 11:38:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6994558976. Throughput: 0: 42791.9. Samples: 6994678280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-23 11:38:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:38:45,371][15401] Updated weights for policy 0, policy_version 426920 (0.0026) [2024-06-23 11:38:48,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 6994788352. Throughput: 0: 42835.7. Samples: 6994934040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:38:48,397][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 11:38:49,096][15401] Updated weights for policy 0, policy_version 426930 (0.0029) [2024-06-23 11:38:52,991][15401] Updated weights for policy 0, policy_version 426940 (0.0030) [2024-06-23 11:38:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 6994984960. Throughput: 0: 42721.3. Samples: 6995063820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:38:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 11:38:56,790][15401] Updated weights for policy 0, policy_version 426950 (0.0041) [2024-06-23 11:38:58,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 6995197952. Throughput: 0: 42676.3. Samples: 6995317500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:38:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 11:39:00,621][15401] Updated weights for policy 0, policy_version 426960 (0.0036) [2024-06-23 11:39:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 6995410944. Throughput: 0: 42756.2. Samples: 6995573440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 11:39:04,658][15401] Updated weights for policy 0, policy_version 426970 (0.0059) [2024-06-23 11:39:08,298][15401] Updated weights for policy 0, policy_version 426980 (0.0026) [2024-06-23 11:39:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 6995640320. Throughput: 0: 42664.8. Samples: 6995704660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:08,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-23 11:39:09,999][15349] Signal inference workers to stop experience collection... (103650 times) [2024-06-23 11:39:10,042][15401] InferenceWorker_p0-w0: stopping experience collection (103650 times) [2024-06-23 11:39:10,056][15349] Signal inference workers to resume experience collection... (103650 times) [2024-06-23 11:39:10,063][15401] InferenceWorker_p0-w0: resuming experience collection (103650 times) [2024-06-23 11:39:12,328][15401] Updated weights for policy 0, policy_version 426990 (0.0032) [2024-06-23 11:39:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 6995836928. Throughput: 0: 42715.4. Samples: 6995962820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:13,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 11:39:15,777][15401] Updated weights for policy 0, policy_version 427000 (0.0044) [2024-06-23 11:39:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 6996049920. Throughput: 0: 42763.5. Samples: 6996219440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 11:39:20,481][15401] Updated weights for policy 0, policy_version 427010 (0.0036) [2024-06-23 11:39:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 6996262912. Throughput: 0: 42647.6. Samples: 6996345380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:23,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-23 11:39:23,560][15401] Updated weights for policy 0, policy_version 427020 (0.0040) [2024-06-23 11:39:27,970][15401] Updated weights for policy 0, policy_version 427030 (0.0036) [2024-06-23 11:39:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 6996459520. Throughput: 0: 42651.1. Samples: 6996597580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 11:39:31,050][15401] Updated weights for policy 0, policy_version 427040 (0.0039) [2024-06-23 11:39:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 6996688896. Throughput: 0: 42758.1. Samples: 6996857880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 11:39:35,861][15401] Updated weights for policy 0, policy_version 427050 (0.0034) [2024-06-23 11:39:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6996918272. Throughput: 0: 42789.7. Samples: 6996989360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 11:39:38,894][15401] Updated weights for policy 0, policy_version 427060 (0.0037) [2024-06-23 11:39:43,364][15401] Updated weights for policy 0, policy_version 427070 (0.0041) [2024-06-23 11:39:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6997114880. Throughput: 0: 42748.4. Samples: 6997241180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 11:39:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427070_6997114880.pth... [2024-06-23 11:39:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426447_6986907648.pth [2024-06-23 11:39:46,865][15401] Updated weights for policy 0, policy_version 427080 (0.0037) [2024-06-23 11:39:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42603.0, 300 sec: 42654.0). Total num frames: 6997344256. Throughput: 0: 42813.7. Samples: 6997500060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 11:39:50,935][15401] Updated weights for policy 0, policy_version 427090 (0.0037) [2024-06-23 11:39:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6997557248. Throughput: 0: 42792.0. Samples: 6997630300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:39:54,306][15401] Updated weights for policy 0, policy_version 427100 (0.0048) [2024-06-23 11:39:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 6997753856. Throughput: 0: 42773.4. Samples: 6997887620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:39:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 11:39:58,508][15401] Updated weights for policy 0, policy_version 427110 (0.0031) [2024-06-23 11:40:01,963][15401] Updated weights for policy 0, policy_version 427120 (0.0043) [2024-06-23 11:40:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 6997983232. Throughput: 0: 42558.7. Samples: 6998134580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-23 11:40:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 11:40:06,167][15401] Updated weights for policy 0, policy_version 427130 (0.0028) [2024-06-23 11:40:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 6998179840. Throughput: 0: 42775.9. Samples: 6998270300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:08,399][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 11:40:09,754][15401] Updated weights for policy 0, policy_version 427140 (0.0038) [2024-06-23 11:40:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 6998392832. Throughput: 0: 42757.9. Samples: 6998521680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 11:40:14,176][15401] Updated weights for policy 0, policy_version 427150 (0.0037) [2024-06-23 11:40:17,546][15401] Updated weights for policy 0, policy_version 427160 (0.0033) [2024-06-23 11:40:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 6998622208. Throughput: 0: 42764.0. Samples: 6998782260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 11:40:21,697][15401] Updated weights for policy 0, policy_version 427170 (0.0047) [2024-06-23 11:40:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 6998818816. Throughput: 0: 42717.5. Samples: 6998911640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:23,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 11:40:25,146][15401] Updated weights for policy 0, policy_version 427180 (0.0038) [2024-06-23 11:40:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 6999048192. Throughput: 0: 42707.2. Samples: 6999163000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 11:40:29,408][15401] Updated weights for policy 0, policy_version 427190 (0.0045) [2024-06-23 11:40:32,999][15401] Updated weights for policy 0, policy_version 427200 (0.0038) [2024-06-23 11:40:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 6999277568. Throughput: 0: 42695.6. Samples: 6999421360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:33,398][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 11:40:36,951][15401] Updated weights for policy 0, policy_version 427210 (0.0029) [2024-06-23 11:40:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 6999457792. Throughput: 0: 42613.4. Samples: 6999547900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:38,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 11:40:40,581][15401] Updated weights for policy 0, policy_version 427220 (0.0044) [2024-06-23 11:40:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 6999687168. Throughput: 0: 42491.6. Samples: 6999799740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:43,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 11:40:44,741][15401] Updated weights for policy 0, policy_version 427230 (0.0035) [2024-06-23 11:40:48,154][15401] Updated weights for policy 0, policy_version 427240 (0.0042) [2024-06-23 11:40:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 6999916544. Throughput: 0: 42819.1. Samples: 7000061440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:48,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 11:40:52,450][15401] Updated weights for policy 0, policy_version 427250 (0.0037) [2024-06-23 11:40:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7000096768. Throughput: 0: 42633.8. Samples: 7000188820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 11:40:55,178][15349] Signal inference workers to stop experience collection... (103700 times) [2024-06-23 11:40:55,210][15401] InferenceWorker_p0-w0: stopping experience collection (103700 times) [2024-06-23 11:40:55,238][15349] Signal inference workers to resume experience collection... (103700 times) [2024-06-23 11:40:55,239][15401] InferenceWorker_p0-w0: resuming experience collection (103700 times) [2024-06-23 11:40:55,776][15401] Updated weights for policy 0, policy_version 427260 (0.0029) [2024-06-23 11:40:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7000326144. Throughput: 0: 42629.4. Samples: 7000440000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:40:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 11:41:00,123][15401] Updated weights for policy 0, policy_version 427270 (0.0051) [2024-06-23 11:41:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7000539136. Throughput: 0: 42537.4. Samples: 7000696440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 11:41:03,407][15401] Updated weights for policy 0, policy_version 427280 (0.0026) [2024-06-23 11:41:07,994][15401] Updated weights for policy 0, policy_version 427290 (0.0037) [2024-06-23 11:41:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7000735744. Throughput: 0: 42454.1. Samples: 7000822080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 11:41:11,140][15401] Updated weights for policy 0, policy_version 427300 (0.0033) [2024-06-23 11:41:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7000981504. Throughput: 0: 42548.4. Samples: 7001077680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 11:41:15,733][15401] Updated weights for policy 0, policy_version 427310 (0.0045) [2024-06-23 11:41:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7001178112. Throughput: 0: 42452.8. Samples: 7001331740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 11:41:19,412][15401] Updated weights for policy 0, policy_version 427320 (0.0033) [2024-06-23 11:41:23,190][15401] Updated weights for policy 0, policy_version 427330 (0.0034) [2024-06-23 11:41:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7001374720. Throughput: 0: 42388.9. Samples: 7001455400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:23,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 11:41:27,088][15401] Updated weights for policy 0, policy_version 427340 (0.0037) [2024-06-23 11:41:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7001620480. Throughput: 0: 42692.3. Samples: 7001720900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 11:41:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 11:41:30,744][15401] Updated weights for policy 0, policy_version 427350 (0.0034) [2024-06-23 11:41:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 7001800704. Throughput: 0: 42475.6. Samples: 7001972840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 11:41:34,659][15401] Updated weights for policy 0, policy_version 427360 (0.0041) [2024-06-23 11:41:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7002013696. Throughput: 0: 42437.4. Samples: 7002098500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 11:41:38,421][15401] Updated weights for policy 0, policy_version 427370 (0.0039) [2024-06-23 11:41:42,331][15401] Updated weights for policy 0, policy_version 427380 (0.0037) [2024-06-23 11:41:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7002243072. Throughput: 0: 42622.5. Samples: 7002358020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 11:41:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427383_7002243072.pth... [2024-06-23 11:41:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000426758_6992003072.pth [2024-06-23 11:41:46,232][15401] Updated weights for policy 0, policy_version 427390 (0.0042) [2024-06-23 11:41:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7002439680. Throughput: 0: 42593.7. Samples: 7002613160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 11:41:50,016][15401] Updated weights for policy 0, policy_version 427400 (0.0040) [2024-06-23 11:41:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7002669056. Throughput: 0: 42551.2. Samples: 7002736880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 11:41:53,752][15401] Updated weights for policy 0, policy_version 427410 (0.0048) [2024-06-23 11:41:57,651][15401] Updated weights for policy 0, policy_version 427420 (0.0028) [2024-06-23 11:41:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7002882048. Throughput: 0: 42719.6. Samples: 7003000060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:41:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 11:42:01,156][15401] Updated weights for policy 0, policy_version 427430 (0.0029) [2024-06-23 11:42:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7003095040. Throughput: 0: 42848.5. Samples: 7003259920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 11:42:05,166][15401] Updated weights for policy 0, policy_version 427440 (0.0029) [2024-06-23 11:42:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7003308032. Throughput: 0: 42963.9. Samples: 7003388780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 11:42:08,645][15401] Updated weights for policy 0, policy_version 427450 (0.0038) [2024-06-23 11:42:12,779][15401] Updated weights for policy 0, policy_version 427460 (0.0035) [2024-06-23 11:42:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7003521024. Throughput: 0: 42741.0. Samples: 7003644240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 11:42:16,555][15401] Updated weights for policy 0, policy_version 427470 (0.0034) [2024-06-23 11:42:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 7003734016. Throughput: 0: 42883.6. Samples: 7003902600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:18,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 11:42:18,428][15349] Signal inference workers to stop experience collection... (103750 times) [2024-06-23 11:42:18,429][15349] Signal inference workers to resume experience collection... (103750 times) [2024-06-23 11:42:18,467][15401] InferenceWorker_p0-w0: stopping experience collection (103750 times) [2024-06-23 11:42:18,467][15401] InferenceWorker_p0-w0: resuming experience collection (103750 times) [2024-06-23 11:42:20,375][15401] Updated weights for policy 0, policy_version 427480 (0.0030) [2024-06-23 11:42:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 7003963392. Throughput: 0: 42892.7. Samples: 7004028680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 11:42:24,176][15401] Updated weights for policy 0, policy_version 427490 (0.0029) [2024-06-23 11:42:27,987][15401] Updated weights for policy 0, policy_version 427500 (0.0038) [2024-06-23 11:42:28,392][15132] Fps is (10 sec: 42587.3, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 7004160000. Throughput: 0: 42713.7. Samples: 7004280240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:28,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 11:42:31,755][15401] Updated weights for policy 0, policy_version 427510 (0.0041) [2024-06-23 11:42:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7004356608. Throughput: 0: 42875.6. Samples: 7004542560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 11:42:35,583][15401] Updated weights for policy 0, policy_version 427520 (0.0048) [2024-06-23 11:42:38,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7004585984. Throughput: 0: 42979.1. Samples: 7004670940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 11:42:39,259][15401] Updated weights for policy 0, policy_version 427530 (0.0040) [2024-06-23 11:42:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7004798976. Throughput: 0: 42720.0. Samples: 7004922460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 11:42:43,424][15401] Updated weights for policy 0, policy_version 427540 (0.0032) [2024-06-23 11:42:47,243][15401] Updated weights for policy 0, policy_version 427550 (0.0035) [2024-06-23 11:42:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7004995584. Throughput: 0: 42693.6. Samples: 7005181140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:42:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 11:42:51,247][15401] Updated weights for policy 0, policy_version 427560 (0.0038) [2024-06-23 11:42:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7005224960. Throughput: 0: 42673.5. Samples: 7005309080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:42:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 11:42:54,990][15401] Updated weights for policy 0, policy_version 427570 (0.0042) [2024-06-23 11:42:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7005437952. Throughput: 0: 42559.2. Samples: 7005559400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:42:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 11:42:59,303][15401] Updated weights for policy 0, policy_version 427580 (0.0042) [2024-06-23 11:43:02,549][15401] Updated weights for policy 0, policy_version 427590 (0.0037) [2024-06-23 11:43:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7005650944. Throughput: 0: 42515.0. Samples: 7005815780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 11:43:07,153][15401] Updated weights for policy 0, policy_version 427600 (0.0033) [2024-06-23 11:43:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7005863936. Throughput: 0: 42649.9. Samples: 7005947920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 11:43:10,225][15401] Updated weights for policy 0, policy_version 427610 (0.0034) [2024-06-23 11:43:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7006093312. Throughput: 0: 42602.8. Samples: 7006197360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:13,393][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 11:43:14,873][15401] Updated weights for policy 0, policy_version 427620 (0.0052) [2024-06-23 11:43:17,945][15401] Updated weights for policy 0, policy_version 427630 (0.0041) [2024-06-23 11:43:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7006306304. Throughput: 0: 42429.3. Samples: 7006451880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 11:43:22,427][15401] Updated weights for policy 0, policy_version 427640 (0.0039) [2024-06-23 11:43:23,389][15132] Fps is (10 sec: 37692.5, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 7006470144. Throughput: 0: 42376.0. Samples: 7006577860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 11:43:25,579][15401] Updated weights for policy 0, policy_version 427650 (0.0039) [2024-06-23 11:43:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 7006732288. Throughput: 0: 42570.2. Samples: 7006838120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 11:43:30,031][15401] Updated weights for policy 0, policy_version 427660 (0.0047) [2024-06-23 11:43:33,064][15401] Updated weights for policy 0, policy_version 427670 (0.0030) [2024-06-23 11:43:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7006945280. Throughput: 0: 42498.8. Samples: 7007093580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 11:43:37,468][15401] Updated weights for policy 0, policy_version 427680 (0.0029) [2024-06-23 11:43:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 7007125504. Throughput: 0: 42541.6. Samples: 7007223560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:38,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 11:43:38,527][15349] Signal inference workers to stop experience collection... (103800 times) [2024-06-23 11:43:38,577][15401] InferenceWorker_p0-w0: stopping experience collection (103800 times) [2024-06-23 11:43:38,579][15349] Signal inference workers to resume experience collection... (103800 times) [2024-06-23 11:43:38,586][15401] InferenceWorker_p0-w0: resuming experience collection (103800 times) [2024-06-23 11:43:41,192][15401] Updated weights for policy 0, policy_version 427690 (0.0039) [2024-06-23 11:43:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 7007371264. Throughput: 0: 42765.3. Samples: 7007483840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 11:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427697_7007387648.pth... [2024-06-23 11:43:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427070_6997114880.pth [2024-06-23 11:43:44,965][15401] Updated weights for policy 0, policy_version 427700 (0.0028) [2024-06-23 11:43:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7007567872. Throughput: 0: 42731.9. Samples: 7007738720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 11:43:48,831][15401] Updated weights for policy 0, policy_version 427710 (0.0028) [2024-06-23 11:43:52,779][15401] Updated weights for policy 0, policy_version 427720 (0.0028) [2024-06-23 11:43:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7007764480. Throughput: 0: 42583.5. Samples: 7007864180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 11:43:56,240][15401] Updated weights for policy 0, policy_version 427730 (0.0038) [2024-06-23 11:43:58,390][15132] Fps is (10 sec: 45873.4, 60 sec: 43144.1, 300 sec: 42764.9). Total num frames: 7008026624. Throughput: 0: 42856.4. Samples: 7008125820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:43:58,391][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 11:44:00,933][15401] Updated weights for policy 0, policy_version 427740 (0.0046) [2024-06-23 11:44:03,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7008223232. Throughput: 0: 42763.0. Samples: 7008376320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:44:03,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 11:44:03,744][15401] Updated weights for policy 0, policy_version 427750 (0.0032) [2024-06-23 11:44:08,389][15132] Fps is (10 sec: 36046.8, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7008387072. Throughput: 0: 42752.0. Samples: 7008501700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:44:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 11:44:08,602][15401] Updated weights for policy 0, policy_version 427760 (0.0030) [2024-06-23 11:44:11,445][15401] Updated weights for policy 0, policy_version 427770 (0.0038) [2024-06-23 11:44:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7008649216. Throughput: 0: 42685.8. Samples: 7008758980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 11:44:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 11:44:16,625][15401] Updated weights for policy 0, policy_version 427780 (0.0028) [2024-06-23 11:44:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7008845824. Throughput: 0: 42686.2. Samples: 7009014460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 11:44:19,667][15401] Updated weights for policy 0, policy_version 427790 (0.0033) [2024-06-23 11:44:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7009042432. Throughput: 0: 42460.8. Samples: 7009134200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 11:44:24,133][15401] Updated weights for policy 0, policy_version 427800 (0.0030) [2024-06-23 11:44:27,231][15401] Updated weights for policy 0, policy_version 427810 (0.0035) [2024-06-23 11:44:28,395][15132] Fps is (10 sec: 44213.5, 60 sec: 42594.7, 300 sec: 42708.7). Total num frames: 7009288192. Throughput: 0: 42389.8. Samples: 7009391600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:28,395][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 11:44:31,685][15401] Updated weights for policy 0, policy_version 427820 (0.0042) [2024-06-23 11:44:33,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 7009484800. Throughput: 0: 42453.4. Samples: 7009649220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:33,392][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 11:44:34,763][15401] Updated weights for policy 0, policy_version 427830 (0.0029) [2024-06-23 11:44:38,391][15132] Fps is (10 sec: 40977.2, 60 sec: 42872.5, 300 sec: 42653.8). Total num frames: 7009697792. Throughput: 0: 42507.5. Samples: 7009777060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:38,391][15132] Avg episode reward: [(0, '0.012')] [2024-06-23 11:44:39,210][15401] Updated weights for policy 0, policy_version 427840 (0.0029) [2024-06-23 11:44:42,397][15401] Updated weights for policy 0, policy_version 427850 (0.0031) [2024-06-23 11:44:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7009927168. Throughput: 0: 42446.8. Samples: 7010035900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:43,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-23 11:44:47,062][15401] Updated weights for policy 0, policy_version 427860 (0.0036) [2024-06-23 11:44:48,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7010107392. Throughput: 0: 42722.3. Samples: 7010298720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 11:44:49,346][15349] Signal inference workers to stop experience collection... (103850 times) [2024-06-23 11:44:49,372][15401] InferenceWorker_p0-w0: stopping experience collection (103850 times) [2024-06-23 11:44:49,406][15349] Signal inference workers to resume experience collection... (103850 times) [2024-06-23 11:44:49,408][15401] InferenceWorker_p0-w0: resuming experience collection (103850 times) [2024-06-23 11:44:49,895][15401] Updated weights for policy 0, policy_version 427870 (0.0042) [2024-06-23 11:44:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7010336768. Throughput: 0: 42580.8. Samples: 7010417840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:53,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 11:44:54,459][15401] Updated weights for policy 0, policy_version 427880 (0.0032) [2024-06-23 11:44:57,460][15401] Updated weights for policy 0, policy_version 427890 (0.0042) [2024-06-23 11:44:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.6, 300 sec: 42653.9). Total num frames: 7010566144. Throughput: 0: 42645.6. Samples: 7010678040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:44:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 11:45:02,015][15401] Updated weights for policy 0, policy_version 427900 (0.0041) [2024-06-23 11:45:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 7010762752. Throughput: 0: 42775.0. Samples: 7010939340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:03,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 11:45:05,439][15401] Updated weights for policy 0, policy_version 427910 (0.0032) [2024-06-23 11:45:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7010975744. Throughput: 0: 42926.6. Samples: 7011065900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:08,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 11:45:10,068][15401] Updated weights for policy 0, policy_version 427920 (0.0041) [2024-06-23 11:45:12,891][15401] Updated weights for policy 0, policy_version 427930 (0.0042) [2024-06-23 11:45:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7011221504. Throughput: 0: 42841.3. Samples: 7011319240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:13,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 11:45:17,713][15401] Updated weights for policy 0, policy_version 427940 (0.0047) [2024-06-23 11:45:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7011385344. Throughput: 0: 43085.4. Samples: 7011587960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 11:45:20,390][15401] Updated weights for policy 0, policy_version 427950 (0.0029) [2024-06-23 11:45:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7011631104. Throughput: 0: 42801.7. Samples: 7011703100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 11:45:25,360][15401] Updated weights for policy 0, policy_version 427960 (0.0044) [2024-06-23 11:45:28,161][15401] Updated weights for policy 0, policy_version 427970 (0.0031) [2024-06-23 11:45:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42875.2, 300 sec: 42653.9). Total num frames: 7011860480. Throughput: 0: 42876.9. Samples: 7011965360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 11:45:32,871][15401] Updated weights for policy 0, policy_version 427980 (0.0028) [2024-06-23 11:45:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 7012040704. Throughput: 0: 42756.5. Samples: 7012222760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 11:45:35,762][15401] Updated weights for policy 0, policy_version 427990 (0.0024) [2024-06-23 11:45:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42599.0, 300 sec: 42598.4). Total num frames: 7012253696. Throughput: 0: 42791.5. Samples: 7012343460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-23 11:45:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 11:45:40,345][15401] Updated weights for policy 0, policy_version 428000 (0.0040) [2024-06-23 11:45:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7012499456. Throughput: 0: 42891.6. Samples: 7012608160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:45:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 11:45:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428010_7012515840.pth... [2024-06-23 11:45:43,526][15401] Updated weights for policy 0, policy_version 428010 (0.0032) [2024-06-23 11:45:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427383_7002243072.pth [2024-06-23 11:45:48,124][15401] Updated weights for policy 0, policy_version 428020 (0.0023) [2024-06-23 11:45:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7012696064. Throughput: 0: 42824.8. Samples: 7012866460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:45:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 11:45:51,073][15401] Updated weights for policy 0, policy_version 428030 (0.0028) [2024-06-23 11:45:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7012892672. Throughput: 0: 42582.9. Samples: 7012982120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:45:53,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 11:45:54,570][15349] Signal inference workers to stop experience collection... (103900 times) [2024-06-23 11:45:54,571][15349] Signal inference workers to resume experience collection... (103900 times) [2024-06-23 11:45:54,616][15401] InferenceWorker_p0-w0: stopping experience collection (103900 times) [2024-06-23 11:45:54,616][15401] InferenceWorker_p0-w0: resuming experience collection (103900 times) [2024-06-23 11:45:55,817][15401] Updated weights for policy 0, policy_version 428040 (0.0039) [2024-06-23 11:45:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7013122048. Throughput: 0: 42857.1. Samples: 7013247800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:45:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 11:45:58,701][15401] Updated weights for policy 0, policy_version 428050 (0.0029) [2024-06-23 11:46:03,383][15401] Updated weights for policy 0, policy_version 428060 (0.0023) [2024-06-23 11:46:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7013335040. Throughput: 0: 42600.8. Samples: 7013505000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 11:46:06,572][15401] Updated weights for policy 0, policy_version 428070 (0.0034) [2024-06-23 11:46:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7013548032. Throughput: 0: 42674.8. Samples: 7013623460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 11:46:10,775][15401] Updated weights for policy 0, policy_version 428080 (0.0030) [2024-06-23 11:46:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 7013777408. Throughput: 0: 42777.6. Samples: 7013890460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:13,393][15132] Avg episode reward: [(0, '0.870')] [2024-06-23 11:46:14,006][15401] Updated weights for policy 0, policy_version 428090 (0.0027) [2024-06-23 11:46:18,339][15401] Updated weights for policy 0, policy_version 428100 (0.0036) [2024-06-23 11:46:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7013990400. Throughput: 0: 42749.3. Samples: 7014146480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:18,394][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 11:46:21,914][15401] Updated weights for policy 0, policy_version 428110 (0.0033) [2024-06-23 11:46:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7014203392. Throughput: 0: 42818.3. Samples: 7014270280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:23,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 11:46:25,969][15401] Updated weights for policy 0, policy_version 428120 (0.0042) [2024-06-23 11:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7014416384. Throughput: 0: 42794.2. Samples: 7014533900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 11:46:29,564][15401] Updated weights for policy 0, policy_version 428130 (0.0027) [2024-06-23 11:46:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 7014612992. Throughput: 0: 42729.3. Samples: 7014789280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 11:46:33,750][15401] Updated weights for policy 0, policy_version 428140 (0.0046) [2024-06-23 11:46:37,533][15401] Updated weights for policy 0, policy_version 428150 (0.0042) [2024-06-23 11:46:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7014842368. Throughput: 0: 42863.0. Samples: 7014910960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 11:46:41,612][15401] Updated weights for policy 0, policy_version 428160 (0.0039) [2024-06-23 11:46:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7015038976. Throughput: 0: 42680.8. Samples: 7015168440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 11:46:45,095][15401] Updated weights for policy 0, policy_version 428170 (0.0030) [2024-06-23 11:46:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7015251968. Throughput: 0: 42666.6. Samples: 7015425000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 11:46:49,264][15401] Updated weights for policy 0, policy_version 428180 (0.0024) [2024-06-23 11:46:52,926][15401] Updated weights for policy 0, policy_version 428190 (0.0026) [2024-06-23 11:46:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7015481344. Throughput: 0: 42828.9. Samples: 7015550760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:53,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-23 11:46:56,952][15401] Updated weights for policy 0, policy_version 428200 (0.0032) [2024-06-23 11:46:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7015677952. Throughput: 0: 42586.7. Samples: 7015806760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 11:46:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 11:47:00,513][15401] Updated weights for policy 0, policy_version 428210 (0.0023) [2024-06-23 11:47:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7015890944. Throughput: 0: 42584.4. Samples: 7016062780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 11:47:04,825][15401] Updated weights for policy 0, policy_version 428220 (0.0041) [2024-06-23 11:47:08,175][15401] Updated weights for policy 0, policy_version 428230 (0.0030) [2024-06-23 11:47:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7016120320. Throughput: 0: 42716.4. Samples: 7016192520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 11:47:12,672][15401] Updated weights for policy 0, policy_version 428240 (0.0036) [2024-06-23 11:47:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7016333312. Throughput: 0: 42639.1. Samples: 7016452660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 11:47:15,851][15401] Updated weights for policy 0, policy_version 428250 (0.0021) [2024-06-23 11:47:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7016529920. Throughput: 0: 42541.5. Samples: 7016703640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 11:47:19,246][15349] Signal inference workers to stop experience collection... (103950 times) [2024-06-23 11:47:19,247][15349] Signal inference workers to resume experience collection... (103950 times) [2024-06-23 11:47:19,290][15401] InferenceWorker_p0-w0: stopping experience collection (103950 times) [2024-06-23 11:47:19,290][15401] InferenceWorker_p0-w0: resuming experience collection (103950 times) [2024-06-23 11:47:20,242][15401] Updated weights for policy 0, policy_version 428260 (0.0039) [2024-06-23 11:47:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 7016742912. Throughput: 0: 42638.8. Samples: 7016829700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 11:47:23,660][15401] Updated weights for policy 0, policy_version 428270 (0.0042) [2024-06-23 11:47:27,787][15401] Updated weights for policy 0, policy_version 428280 (0.0027) [2024-06-23 11:47:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7016972288. Throughput: 0: 42720.5. Samples: 7017090860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 11:47:31,573][15401] Updated weights for policy 0, policy_version 428290 (0.0028) [2024-06-23 11:47:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7017185280. Throughput: 0: 42702.7. Samples: 7017346620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 11:47:35,267][15401] Updated weights for policy 0, policy_version 428300 (0.0036) [2024-06-23 11:47:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7017398272. Throughput: 0: 42712.8. Samples: 7017472840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 11:47:39,318][15401] Updated weights for policy 0, policy_version 428310 (0.0042) [2024-06-23 11:47:42,762][15401] Updated weights for policy 0, policy_version 428320 (0.0022) [2024-06-23 11:47:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7017627648. Throughput: 0: 43025.8. Samples: 7017742920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 11:47:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428322_7017627648.pth... [2024-06-23 11:47:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000427697_7007387648.pth [2024-06-23 11:47:46,875][15401] Updated weights for policy 0, policy_version 428330 (0.0037) [2024-06-23 11:47:48,391][15132] Fps is (10 sec: 42593.7, 60 sec: 42870.7, 300 sec: 42709.3). Total num frames: 7017824256. Throughput: 0: 42876.3. Samples: 7017992260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:48,391][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 11:47:50,298][15401] Updated weights for policy 0, policy_version 428340 (0.0031) [2024-06-23 11:47:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7018037248. Throughput: 0: 42832.0. Samples: 7018119960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 11:47:54,311][15401] Updated weights for policy 0, policy_version 428350 (0.0041) [2024-06-23 11:47:57,761][15401] Updated weights for policy 0, policy_version 428360 (0.0041) [2024-06-23 11:47:58,389][15132] Fps is (10 sec: 44242.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7018266624. Throughput: 0: 42758.8. Samples: 7018376800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:47:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 11:48:02,172][15401] Updated weights for policy 0, policy_version 428370 (0.0032) [2024-06-23 11:48:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7018463232. Throughput: 0: 42999.9. Samples: 7018638640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:48:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 11:48:05,658][15401] Updated weights for policy 0, policy_version 428380 (0.0031) [2024-06-23 11:48:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 7018676224. Throughput: 0: 42971.1. Samples: 7018763400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:48:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 11:48:09,739][15401] Updated weights for policy 0, policy_version 428390 (0.0039) [2024-06-23 11:48:12,968][15401] Updated weights for policy 0, policy_version 428400 (0.0037) [2024-06-23 11:48:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7018921984. Throughput: 0: 42998.9. Samples: 7019025820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:48:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 11:48:17,483][15401] Updated weights for policy 0, policy_version 428410 (0.0031) [2024-06-23 11:48:18,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 7019118592. Throughput: 0: 43027.9. Samples: 7019282980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:48:18,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 11:48:20,843][15401] Updated weights for policy 0, policy_version 428420 (0.0035) [2024-06-23 11:48:23,394][15132] Fps is (10 sec: 40941.9, 60 sec: 43141.2, 300 sec: 42708.8). Total num frames: 7019331584. Throughput: 0: 42978.8. Samples: 7019407080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:23,395][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 11:48:24,922][15401] Updated weights for policy 0, policy_version 428430 (0.0034) [2024-06-23 11:48:28,257][15401] Updated weights for policy 0, policy_version 428440 (0.0039) [2024-06-23 11:48:28,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7019560960. Throughput: 0: 42777.0. Samples: 7019667880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 11:48:32,701][15401] Updated weights for policy 0, policy_version 428450 (0.0032) [2024-06-23 11:48:33,389][15132] Fps is (10 sec: 42617.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7019757568. Throughput: 0: 43042.5. Samples: 7019929120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 11:48:35,687][15401] Updated weights for policy 0, policy_version 428460 (0.0039) [2024-06-23 11:48:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7019970560. Throughput: 0: 42963.6. Samples: 7020053320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 11:48:40,466][15401] Updated weights for policy 0, policy_version 428470 (0.0031) [2024-06-23 11:48:43,392][15401] Updated weights for policy 0, policy_version 428480 (0.0041) [2024-06-23 11:48:43,393][15132] Fps is (10 sec: 45860.7, 60 sec: 43142.3, 300 sec: 42875.7). Total num frames: 7020216320. Throughput: 0: 43007.2. Samples: 7020312260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:43,393][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 11:48:48,033][15401] Updated weights for policy 0, policy_version 428490 (0.0038) [2024-06-23 11:48:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42872.2, 300 sec: 42820.5). Total num frames: 7020396544. Throughput: 0: 42980.9. Samples: 7020572780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 11:48:51,256][15401] Updated weights for policy 0, policy_version 428500 (0.0049) [2024-06-23 11:48:53,392][15132] Fps is (10 sec: 40962.7, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 7020625920. Throughput: 0: 42924.7. Samples: 7020695120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:53,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:48:55,619][15401] Updated weights for policy 0, policy_version 428510 (0.0034) [2024-06-23 11:48:58,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 7020838912. Throughput: 0: 42829.0. Samples: 7020953220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:48:58,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 11:48:58,943][15401] Updated weights for policy 0, policy_version 428520 (0.0043) [2024-06-23 11:49:03,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7021019136. Throughput: 0: 42986.3. Samples: 7021217260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 11:49:03,495][15401] Updated weights for policy 0, policy_version 428530 (0.0034) [2024-06-23 11:49:03,612][15349] Signal inference workers to stop experience collection... (104000 times) [2024-06-23 11:49:03,651][15401] InferenceWorker_p0-w0: stopping experience collection (104000 times) [2024-06-23 11:49:03,673][15349] Signal inference workers to resume experience collection... (104000 times) [2024-06-23 11:49:03,680][15401] InferenceWorker_p0-w0: resuming experience collection (104000 times) [2024-06-23 11:49:06,659][15401] Updated weights for policy 0, policy_version 428540 (0.0034) [2024-06-23 11:49:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 7021281280. Throughput: 0: 43010.9. Samples: 7021342380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 11:49:11,021][15401] Updated weights for policy 0, policy_version 428550 (0.0041) [2024-06-23 11:49:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7021494272. Throughput: 0: 42862.2. Samples: 7021596680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:13,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 11:49:14,474][15401] Updated weights for policy 0, policy_version 428560 (0.0037) [2024-06-23 11:49:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 7021674496. Throughput: 0: 42953.6. Samples: 7021862140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:18,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 11:49:18,586][15401] Updated weights for policy 0, policy_version 428570 (0.0028) [2024-06-23 11:49:21,982][15401] Updated weights for policy 0, policy_version 428580 (0.0034) [2024-06-23 11:49:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42874.6, 300 sec: 42765.8). Total num frames: 7021903872. Throughput: 0: 42871.5. Samples: 7021982540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 11:49:26,222][15401] Updated weights for policy 0, policy_version 428590 (0.0035) [2024-06-23 11:49:28,389][15132] Fps is (10 sec: 47525.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7022149632. Throughput: 0: 42890.5. Samples: 7022242200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 11:49:29,544][15401] Updated weights for policy 0, policy_version 428600 (0.0032) [2024-06-23 11:49:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 7022313472. Throughput: 0: 42826.8. Samples: 7022499980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 11:49:33,834][15401] Updated weights for policy 0, policy_version 428610 (0.0031) [2024-06-23 11:49:37,074][15401] Updated weights for policy 0, policy_version 428620 (0.0035) [2024-06-23 11:49:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7022542848. Throughput: 0: 42797.4. Samples: 7022620900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 11:49:41,394][15401] Updated weights for policy 0, policy_version 428630 (0.0020) [2024-06-23 11:49:43,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42873.6, 300 sec: 42987.2). Total num frames: 7022788608. Throughput: 0: 42930.6. Samples: 7022885000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 11:49:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 11:49:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428637_7022788608.pth... [2024-06-23 11:49:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428010_7012515840.pth [2024-06-23 11:49:44,912][15401] Updated weights for policy 0, policy_version 428640 (0.0027) [2024-06-23 11:49:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7022936064. Throughput: 0: 42878.3. Samples: 7023146780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:49:48,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 11:49:49,085][15401] Updated weights for policy 0, policy_version 428650 (0.0023) [2024-06-23 11:49:52,395][15401] Updated weights for policy 0, policy_version 428660 (0.0041) [2024-06-23 11:49:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 7023181824. Throughput: 0: 42697.8. Samples: 7023263780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:49:53,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 11:49:56,585][15401] Updated weights for policy 0, policy_version 428670 (0.0033) [2024-06-23 11:49:58,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 7023427584. Throughput: 0: 42992.4. Samples: 7023531340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:49:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 11:49:59,837][15401] Updated weights for policy 0, policy_version 428680 (0.0035) [2024-06-23 11:50:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7023591424. Throughput: 0: 42869.5. Samples: 7023791160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 11:50:04,413][15401] Updated weights for policy 0, policy_version 428690 (0.0041) [2024-06-23 11:50:07,363][15401] Updated weights for policy 0, policy_version 428700 (0.0033) [2024-06-23 11:50:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 7023837184. Throughput: 0: 42887.2. Samples: 7023912560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:08,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 11:50:11,851][15401] Updated weights for policy 0, policy_version 428710 (0.0037) [2024-06-23 11:50:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7024066560. Throughput: 0: 42904.0. Samples: 7024172880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:13,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 11:50:14,864][15401] Updated weights for policy 0, policy_version 428720 (0.0034) [2024-06-23 11:50:18,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7024246784. Throughput: 0: 43010.6. Samples: 7024435460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 11:50:19,378][15401] Updated weights for policy 0, policy_version 428730 (0.0024) [2024-06-23 11:50:20,065][15349] Signal inference workers to stop experience collection... (104050 times) [2024-06-23 11:50:20,065][15349] Signal inference workers to resume experience collection... (104050 times) [2024-06-23 11:50:20,110][15401] InferenceWorker_p0-w0: stopping experience collection (104050 times) [2024-06-23 11:50:20,110][15401] InferenceWorker_p0-w0: resuming experience collection (104050 times) [2024-06-23 11:50:22,352][15401] Updated weights for policy 0, policy_version 428740 (0.0032) [2024-06-23 11:50:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 7024492544. Throughput: 0: 43048.5. Samples: 7024558080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 11:50:26,749][15401] Updated weights for policy 0, policy_version 428750 (0.0041) [2024-06-23 11:50:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7024721920. Throughput: 0: 43087.1. Samples: 7024823920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 11:50:29,926][15401] Updated weights for policy 0, policy_version 428760 (0.0033) [2024-06-23 11:50:33,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7024885760. Throughput: 0: 43161.6. Samples: 7025089160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:33,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 11:50:34,232][15401] Updated weights for policy 0, policy_version 428770 (0.0036) [2024-06-23 11:50:37,336][15401] Updated weights for policy 0, policy_version 428780 (0.0028) [2024-06-23 11:50:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7025147904. Throughput: 0: 43205.4. Samples: 7025208020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 11:50:42,149][15401] Updated weights for policy 0, policy_version 428790 (0.0026) [2024-06-23 11:50:43,390][15132] Fps is (10 sec: 49163.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7025377280. Throughput: 0: 43170.6. Samples: 7025474020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 11:50:44,771][15401] Updated weights for policy 0, policy_version 428800 (0.0034) [2024-06-23 11:50:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7025541120. Throughput: 0: 43247.7. Samples: 7025737300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 11:50:49,815][15401] Updated weights for policy 0, policy_version 428810 (0.0046) [2024-06-23 11:50:52,308][15401] Updated weights for policy 0, policy_version 428820 (0.0029) [2024-06-23 11:50:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 7025803264. Throughput: 0: 43156.9. Samples: 7025854520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 11:50:57,249][15401] Updated weights for policy 0, policy_version 428830 (0.0042) [2024-06-23 11:50:58,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7026016256. Throughput: 0: 43414.2. Samples: 7026126520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:50:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 11:50:59,718][15401] Updated weights for policy 0, policy_version 428840 (0.0036) [2024-06-23 11:51:03,390][15132] Fps is (10 sec: 37683.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7026180096. Throughput: 0: 43414.6. Samples: 7026389120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 11:51:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 11:51:04,761][15401] Updated weights for policy 0, policy_version 428850 (0.0037) [2024-06-23 11:51:07,300][15401] Updated weights for policy 0, policy_version 428860 (0.0045) [2024-06-23 11:51:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43692.4, 300 sec: 42987.5). Total num frames: 7026458624. Throughput: 0: 43360.9. Samples: 7026509320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 11:51:12,467][15401] Updated weights for policy 0, policy_version 428870 (0.0032) [2024-06-23 11:51:12,469][15349] Signal inference workers to stop experience collection... (104100 times) [2024-06-23 11:51:12,470][15349] Signal inference workers to resume experience collection... (104100 times) [2024-06-23 11:51:12,520][15401] InferenceWorker_p0-w0: stopping experience collection (104100 times) [2024-06-23 11:51:12,520][15401] InferenceWorker_p0-w0: resuming experience collection (104100 times) [2024-06-23 11:51:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7026638848. Throughput: 0: 43290.3. Samples: 7026771980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 11:51:15,049][15401] Updated weights for policy 0, policy_version 428880 (0.0039) [2024-06-23 11:51:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7026835456. Throughput: 0: 43232.0. Samples: 7027034500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 11:51:19,995][15401] Updated weights for policy 0, policy_version 428890 (0.0028) [2024-06-23 11:51:22,715][15401] Updated weights for policy 0, policy_version 428900 (0.0028) [2024-06-23 11:51:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 7027113984. Throughput: 0: 43292.9. Samples: 7027156200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 11:51:27,463][15401] Updated weights for policy 0, policy_version 428910 (0.0035) [2024-06-23 11:51:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 7027277824. Throughput: 0: 43182.3. Samples: 7027417220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 11:51:30,207][15401] Updated weights for policy 0, policy_version 428920 (0.0044) [2024-06-23 11:51:33,390][15132] Fps is (10 sec: 36044.1, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 7027474432. Throughput: 0: 43081.1. Samples: 7027675960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 11:51:35,212][15401] Updated weights for policy 0, policy_version 428930 (0.0033) [2024-06-23 11:51:37,911][15401] Updated weights for policy 0, policy_version 428940 (0.0023) [2024-06-23 11:51:38,392][15132] Fps is (10 sec: 49140.3, 60 sec: 43688.9, 300 sec: 43153.4). Total num frames: 7027769344. Throughput: 0: 43309.3. Samples: 7027803540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:38,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 11:51:42,805][15401] Updated weights for policy 0, policy_version 428950 (0.0034) [2024-06-23 11:51:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7027933184. Throughput: 0: 43147.9. Samples: 7028068180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 11:51:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428951_7027933184.pth... [2024-06-23 11:51:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428322_7017627648.pth [2024-06-23 11:51:45,489][15401] Updated weights for policy 0, policy_version 428960 (0.0043) [2024-06-23 11:51:48,390][15132] Fps is (10 sec: 36053.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7028129792. Throughput: 0: 42984.0. Samples: 7028323400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 11:51:50,335][15401] Updated weights for policy 0, policy_version 428970 (0.0031) [2024-06-23 11:51:53,188][15401] Updated weights for policy 0, policy_version 428980 (0.0037) [2024-06-23 11:51:53,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43417.7, 300 sec: 43153.8). Total num frames: 7028408320. Throughput: 0: 43139.2. Samples: 7028450580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 11:51:58,324][15401] Updated weights for policy 0, policy_version 428990 (0.0036) [2024-06-23 11:51:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 7028572160. Throughput: 0: 42921.0. Samples: 7028703420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:51:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 11:52:00,890][15401] Updated weights for policy 0, policy_version 429000 (0.0039) [2024-06-23 11:52:03,390][15132] Fps is (10 sec: 36043.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7028768768. Throughput: 0: 42813.6. Samples: 7028961120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 11:52:06,158][15401] Updated weights for policy 0, policy_version 429010 (0.0034) [2024-06-23 11:52:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 7029047296. Throughput: 0: 42935.6. Samples: 7029088300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 11:52:08,601][15401] Updated weights for policy 0, policy_version 429020 (0.0032) [2024-06-23 11:52:12,969][15349] Signal inference workers to stop experience collection... (104150 times) [2024-06-23 11:52:12,969][15349] Signal inference workers to resume experience collection... (104150 times) [2024-06-23 11:52:13,002][15401] InferenceWorker_p0-w0: stopping experience collection (104150 times) [2024-06-23 11:52:13,002][15401] InferenceWorker_p0-w0: resuming experience collection (104150 times) [2024-06-23 11:52:13,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7029194752. Throughput: 0: 42926.2. Samples: 7029348900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 11:52:13,599][15401] Updated weights for policy 0, policy_version 429030 (0.0038) [2024-06-23 11:52:16,582][15401] Updated weights for policy 0, policy_version 429040 (0.0037) [2024-06-23 11:52:18,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7029407744. Throughput: 0: 42753.4. Samples: 7029599860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:18,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 11:52:21,071][15401] Updated weights for policy 0, policy_version 429050 (0.0027) [2024-06-23 11:52:23,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 7029669888. Throughput: 0: 42693.7. Samples: 7029724660. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 11:52:24,106][15401] Updated weights for policy 0, policy_version 429060 (0.0042) [2024-06-23 11:52:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7029850112. Throughput: 0: 42786.8. Samples: 7029993580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-23 11:52:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 11:52:28,593][15401] Updated weights for policy 0, policy_version 429070 (0.0036) [2024-06-23 11:52:31,829][15401] Updated weights for policy 0, policy_version 429080 (0.0036) [2024-06-23 11:52:33,392][15132] Fps is (10 sec: 40950.7, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 7030079488. Throughput: 0: 42656.4. Samples: 7030243040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:33,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 11:52:36,263][15401] Updated weights for policy 0, policy_version 429090 (0.0032) [2024-06-23 11:52:38,396][15132] Fps is (10 sec: 45845.2, 60 sec: 42322.4, 300 sec: 42986.2). Total num frames: 7030308864. Throughput: 0: 42692.4. Samples: 7030372020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:38,397][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 11:52:39,779][15401] Updated weights for policy 0, policy_version 429100 (0.0031) [2024-06-23 11:52:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.5, 300 sec: 42931.8). Total num frames: 7030489088. Throughput: 0: 42902.1. Samples: 7030634020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:43,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 11:52:43,976][15401] Updated weights for policy 0, policy_version 429110 (0.0031) [2024-06-23 11:52:47,559][15401] Updated weights for policy 0, policy_version 429120 (0.0023) [2024-06-23 11:52:48,390][15132] Fps is (10 sec: 40986.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7030718464. Throughput: 0: 42801.5. Samples: 7030887180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 11:52:51,451][15401] Updated weights for policy 0, policy_version 429130 (0.0025) [2024-06-23 11:52:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42987.2). Total num frames: 7030947840. Throughput: 0: 42851.0. Samples: 7031016600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 11:52:55,363][15401] Updated weights for policy 0, policy_version 429140 (0.0043) [2024-06-23 11:52:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7031144448. Throughput: 0: 42855.6. Samples: 7031277400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:52:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 11:52:58,915][15401] Updated weights for policy 0, policy_version 429150 (0.0020) [2024-06-23 11:53:02,781][15401] Updated weights for policy 0, policy_version 429160 (0.0041) [2024-06-23 11:53:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7031357440. Throughput: 0: 42869.7. Samples: 7031529000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 11:53:06,865][15401] Updated weights for policy 0, policy_version 429170 (0.0027) [2024-06-23 11:53:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42931.7). Total num frames: 7031586816. Throughput: 0: 42948.6. Samples: 7031657340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 11:53:10,150][15401] Updated weights for policy 0, policy_version 429180 (0.0033) [2024-06-23 11:53:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 7031783424. Throughput: 0: 42779.1. Samples: 7031918640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 11:53:14,213][15401] Updated weights for policy 0, policy_version 429190 (0.0027) [2024-06-23 11:53:15,061][15349] Signal inference workers to stop experience collection... (104200 times) [2024-06-23 11:53:15,062][15349] Signal inference workers to resume experience collection... (104200 times) [2024-06-23 11:53:15,097][15401] InferenceWorker_p0-w0: stopping experience collection (104200 times) [2024-06-23 11:53:15,097][15401] InferenceWorker_p0-w0: resuming experience collection (104200 times) [2024-06-23 11:53:17,992][15401] Updated weights for policy 0, policy_version 429200 (0.0035) [2024-06-23 11:53:18,396][15132] Fps is (10 sec: 42570.6, 60 sec: 43412.9, 300 sec: 42986.9). Total num frames: 7032012800. Throughput: 0: 42766.8. Samples: 7032167720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:18,397][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 11:53:22,283][15401] Updated weights for policy 0, policy_version 429210 (0.0029) [2024-06-23 11:53:23,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.9, 300 sec: 42986.8). Total num frames: 7032242176. Throughput: 0: 42885.7. Samples: 7032301700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:23,392][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 11:53:25,997][15401] Updated weights for policy 0, policy_version 429220 (0.0039) [2024-06-23 11:53:28,392][15132] Fps is (10 sec: 42615.9, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 7032438784. Throughput: 0: 42936.9. Samples: 7032566280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:28,392][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 11:53:29,789][15401] Updated weights for policy 0, policy_version 429230 (0.0034) [2024-06-23 11:53:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 7032651776. Throughput: 0: 42918.2. Samples: 7032818500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 11:53:33,713][15401] Updated weights for policy 0, policy_version 429240 (0.0048) [2024-06-23 11:53:37,266][15401] Updated weights for policy 0, policy_version 429250 (0.0028) [2024-06-23 11:53:38,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42603.0, 300 sec: 42876.5). Total num frames: 7032864768. Throughput: 0: 42820.9. Samples: 7032943540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 11:53:41,375][15401] Updated weights for policy 0, policy_version 429260 (0.0042) [2024-06-23 11:53:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7033077760. Throughput: 0: 42782.6. Samples: 7033202620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 11:53:43,435][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429266_7033094144.pth... [2024-06-23 11:53:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428637_7022788608.pth [2024-06-23 11:53:45,179][15401] Updated weights for policy 0, policy_version 429270 (0.0022) [2024-06-23 11:53:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 7033290752. Throughput: 0: 42708.1. Samples: 7033450860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 11:53:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 11:53:49,113][15401] Updated weights for policy 0, policy_version 429280 (0.0033) [2024-06-23 11:53:52,612][15401] Updated weights for policy 0, policy_version 429290 (0.0049) [2024-06-23 11:53:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 7033520128. Throughput: 0: 42851.1. Samples: 7033585640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:53:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 11:53:56,990][15401] Updated weights for policy 0, policy_version 429300 (0.0053) [2024-06-23 11:53:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 7033733120. Throughput: 0: 42813.7. Samples: 7033845260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:53:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 11:54:00,114][15401] Updated weights for policy 0, policy_version 429310 (0.0029) [2024-06-23 11:54:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7033929728. Throughput: 0: 42817.7. Samples: 7034094240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 11:54:04,694][15401] Updated weights for policy 0, policy_version 429320 (0.0028) [2024-06-23 11:54:07,882][15401] Updated weights for policy 0, policy_version 429330 (0.0046) [2024-06-23 11:54:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7034159104. Throughput: 0: 42776.9. Samples: 7034226560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 11:54:12,200][15401] Updated weights for policy 0, policy_version 429340 (0.0042) [2024-06-23 11:54:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42987.2). Total num frames: 7034355712. Throughput: 0: 42568.8. Samples: 7034481880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:13,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 11:54:15,473][15401] Updated weights for policy 0, policy_version 429350 (0.0041) [2024-06-23 11:54:18,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42598.5, 300 sec: 42930.7). Total num frames: 7034568704. Throughput: 0: 42582.4. Samples: 7034734980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:18,405][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 11:54:19,894][15401] Updated weights for policy 0, policy_version 429360 (0.0043) [2024-06-23 11:54:23,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 7034781696. Throughput: 0: 42621.0. Samples: 7034861480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 11:54:23,491][15401] Updated weights for policy 0, policy_version 429370 (0.0030) [2024-06-23 11:54:27,352][15401] Updated weights for policy 0, policy_version 429380 (0.0025) [2024-06-23 11:54:28,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 7034994688. Throughput: 0: 42546.2. Samples: 7035117200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 11:54:31,233][15401] Updated weights for policy 0, policy_version 429390 (0.0047) [2024-06-23 11:54:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7035207680. Throughput: 0: 42729.3. Samples: 7035373680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 11:54:35,421][15401] Updated weights for policy 0, policy_version 429400 (0.0028) [2024-06-23 11:54:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7035420672. Throughput: 0: 42652.9. Samples: 7035505020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 11:54:38,787][15401] Updated weights for policy 0, policy_version 429410 (0.0027) [2024-06-23 11:54:43,023][15401] Updated weights for policy 0, policy_version 429420 (0.0039) [2024-06-23 11:54:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42987.1). Total num frames: 7035617280. Throughput: 0: 42542.1. Samples: 7035759660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 11:54:46,831][15401] Updated weights for policy 0, policy_version 429430 (0.0032) [2024-06-23 11:54:47,381][15349] Signal inference workers to stop experience collection... (104250 times) [2024-06-23 11:54:47,419][15401] InferenceWorker_p0-w0: stopping experience collection (104250 times) [2024-06-23 11:54:47,429][15349] Signal inference workers to resume experience collection... (104250 times) [2024-06-23 11:54:47,436][15401] InferenceWorker_p0-w0: resuming experience collection (104250 times) [2024-06-23 11:54:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7035863040. Throughput: 0: 42589.4. Samples: 7036010760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 11:54:50,623][15401] Updated weights for policy 0, policy_version 429440 (0.0034) [2024-06-23 11:54:53,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 7036076032. Throughput: 0: 42803.1. Samples: 7036152800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:53,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 11:54:54,191][15401] Updated weights for policy 0, policy_version 429450 (0.0028) [2024-06-23 11:54:58,122][15401] Updated weights for policy 0, policy_version 429460 (0.0049) [2024-06-23 11:54:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 7036272640. Throughput: 0: 42775.6. Samples: 7036406680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:54:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 11:55:01,627][15401] Updated weights for policy 0, policy_version 429470 (0.0035) [2024-06-23 11:55:03,392][15132] Fps is (10 sec: 42600.2, 60 sec: 42870.1, 300 sec: 42931.7). Total num frames: 7036502016. Throughput: 0: 42816.6. Samples: 7036661540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:55:03,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 11:55:05,740][15401] Updated weights for policy 0, policy_version 429480 (0.0033) [2024-06-23 11:55:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7036715008. Throughput: 0: 42981.2. Samples: 7036795640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:55:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 11:55:09,057][15401] Updated weights for policy 0, policy_version 429490 (0.0040) [2024-06-23 11:55:13,269][15401] Updated weights for policy 0, policy_version 429500 (0.0037) [2024-06-23 11:55:13,390][15132] Fps is (10 sec: 42606.5, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 7036928000. Throughput: 0: 43026.6. Samples: 7037053400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 11:55:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 11:55:16,671][15401] Updated weights for policy 0, policy_version 429510 (0.0030) [2024-06-23 11:55:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 7037140992. Throughput: 0: 42985.3. Samples: 7037308020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 11:55:20,884][15401] Updated weights for policy 0, policy_version 429520 (0.0031) [2024-06-23 11:55:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7037370368. Throughput: 0: 43017.3. Samples: 7037440800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 11:55:24,200][15401] Updated weights for policy 0, policy_version 429530 (0.0042) [2024-06-23 11:55:28,339][15401] Updated weights for policy 0, policy_version 429540 (0.0026) [2024-06-23 11:55:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 7037583360. Throughput: 0: 43147.3. Samples: 7037701280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:28,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 11:55:31,703][15401] Updated weights for policy 0, policy_version 429550 (0.0030) [2024-06-23 11:55:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7037796352. Throughput: 0: 43098.6. Samples: 7037950200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:33,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 11:55:36,015][15401] Updated weights for policy 0, policy_version 429560 (0.0042) [2024-06-23 11:55:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7038009344. Throughput: 0: 42857.9. Samples: 7038081300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:38,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 11:55:39,567][15401] Updated weights for policy 0, policy_version 429570 (0.0037) [2024-06-23 11:55:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 7038205952. Throughput: 0: 43000.9. Samples: 7038341820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:43,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 11:55:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429578_7038205952.pth... [2024-06-23 11:55:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000428951_7027933184.pth [2024-06-23 11:55:43,783][15401] Updated weights for policy 0, policy_version 429580 (0.0041) [2024-06-23 11:55:47,070][15401] Updated weights for policy 0, policy_version 429590 (0.0031) [2024-06-23 11:55:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7038435328. Throughput: 0: 42971.2. Samples: 7038595160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 11:55:51,446][15401] Updated weights for policy 0, policy_version 429600 (0.0034) [2024-06-23 11:55:53,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 7038648320. Throughput: 0: 42866.3. Samples: 7038724620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 11:55:54,605][15401] Updated weights for policy 0, policy_version 429610 (0.0026) [2024-06-23 11:55:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 7038844928. Throughput: 0: 43030.4. Samples: 7038989760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:55:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 11:55:58,940][15401] Updated weights for policy 0, policy_version 429620 (0.0026) [2024-06-23 11:56:02,204][15401] Updated weights for policy 0, policy_version 429630 (0.0030) [2024-06-23 11:56:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 7039074304. Throughput: 0: 42938.6. Samples: 7039240260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 11:56:06,822][15401] Updated weights for policy 0, policy_version 429640 (0.0038) [2024-06-23 11:56:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7039303680. Throughput: 0: 42821.3. Samples: 7039367760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 11:56:09,782][15401] Updated weights for policy 0, policy_version 429650 (0.0035) [2024-06-23 11:56:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7039500288. Throughput: 0: 42797.4. Samples: 7039627160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 11:56:14,495][15401] Updated weights for policy 0, policy_version 429660 (0.0043) [2024-06-23 11:56:17,841][15401] Updated weights for policy 0, policy_version 429670 (0.0038) [2024-06-23 11:56:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7039713280. Throughput: 0: 42804.9. Samples: 7039876420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 11:56:22,408][15401] Updated weights for policy 0, policy_version 429680 (0.0034) [2024-06-23 11:56:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7039909888. Throughput: 0: 42728.0. Samples: 7040004060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 11:56:25,687][15401] Updated weights for policy 0, policy_version 429690 (0.0029) [2024-06-23 11:56:27,939][15349] Signal inference workers to stop experience collection... (104300 times) [2024-06-23 11:56:27,974][15401] InferenceWorker_p0-w0: stopping experience collection (104300 times) [2024-06-23 11:56:27,987][15349] Signal inference workers to resume experience collection... (104300 times) [2024-06-23 11:56:28,001][15401] InferenceWorker_p0-w0: resuming experience collection (104300 times) [2024-06-23 11:56:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 7040122880. Throughput: 0: 42502.2. Samples: 7040254420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:28,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 11:56:30,090][15401] Updated weights for policy 0, policy_version 429700 (0.0041) [2024-06-23 11:56:33,111][15401] Updated weights for policy 0, policy_version 429710 (0.0031) [2024-06-23 11:56:33,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42866.9, 300 sec: 42708.9). Total num frames: 7040368640. Throughput: 0: 42452.3. Samples: 7040505780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 11:56:33,397][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 11:56:37,551][15401] Updated weights for policy 0, policy_version 429720 (0.0032) [2024-06-23 11:56:38,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7040565248. Throughput: 0: 42673.8. Samples: 7040644940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:56:38,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 11:56:40,589][15401] Updated weights for policy 0, policy_version 429730 (0.0033) [2024-06-23 11:56:43,390][15132] Fps is (10 sec: 39346.2, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 7040761856. Throughput: 0: 42473.5. Samples: 7040901080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:56:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 11:56:45,203][15401] Updated weights for policy 0, policy_version 429740 (0.0027) [2024-06-23 11:56:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 7041007616. Throughput: 0: 42547.6. Samples: 7041155000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:56:48,393][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 11:56:48,782][15401] Updated weights for policy 0, policy_version 429750 (0.0045) [2024-06-23 11:56:52,768][15401] Updated weights for policy 0, policy_version 429760 (0.0028) [2024-06-23 11:56:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7041204224. Throughput: 0: 42652.1. Samples: 7041287100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:56:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 11:56:56,443][15401] Updated weights for policy 0, policy_version 429770 (0.0034) [2024-06-23 11:56:58,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 7041400832. Throughput: 0: 42540.4. Samples: 7041541480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:56:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 11:57:00,351][15401] Updated weights for policy 0, policy_version 429780 (0.0040) [2024-06-23 11:57:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7041646592. Throughput: 0: 42607.1. Samples: 7041793740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 11:57:04,101][15401] Updated weights for policy 0, policy_version 429790 (0.0027) [2024-06-23 11:57:08,334][15401] Updated weights for policy 0, policy_version 429800 (0.0030) [2024-06-23 11:57:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7041843200. Throughput: 0: 42787.5. Samples: 7041929500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 11:57:11,735][15401] Updated weights for policy 0, policy_version 429810 (0.0033) [2024-06-23 11:57:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7042056192. Throughput: 0: 42869.9. Samples: 7042183460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 11:57:16,133][15401] Updated weights for policy 0, policy_version 429820 (0.0029) [2024-06-23 11:57:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7042301952. Throughput: 0: 42786.5. Samples: 7042430900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 11:57:19,384][15401] Updated weights for policy 0, policy_version 429830 (0.0029) [2024-06-23 11:57:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7042465792. Throughput: 0: 42749.0. Samples: 7042568640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 11:57:23,604][15401] Updated weights for policy 0, policy_version 429840 (0.0036) [2024-06-23 11:57:26,817][15401] Updated weights for policy 0, policy_version 429850 (0.0033) [2024-06-23 11:57:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.3, 300 sec: 42820.9). Total num frames: 7042711552. Throughput: 0: 42865.9. Samples: 7042830040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 11:57:31,287][15401] Updated weights for policy 0, policy_version 429860 (0.0038) [2024-06-23 11:57:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42876.1, 300 sec: 42821.5). Total num frames: 7042940928. Throughput: 0: 42814.0. Samples: 7043081520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:33,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 11:57:34,480][15401] Updated weights for policy 0, policy_version 429870 (0.0044) [2024-06-23 11:57:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7043121152. Throughput: 0: 42800.3. Samples: 7043213120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 11:57:39,099][15401] Updated weights for policy 0, policy_version 429880 (0.0028) [2024-06-23 11:57:42,095][15401] Updated weights for policy 0, policy_version 429890 (0.0034) [2024-06-23 11:57:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7043350528. Throughput: 0: 42762.7. Samples: 7043465800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 11:57:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429892_7043350528.pth... [2024-06-23 11:57:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429266_7033094144.pth [2024-06-23 11:57:46,731][15401] Updated weights for policy 0, policy_version 429900 (0.0038) [2024-06-23 11:57:48,391][15132] Fps is (10 sec: 45870.2, 60 sec: 42872.4, 300 sec: 42820.4). Total num frames: 7043579904. Throughput: 0: 42874.0. Samples: 7043723120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:48,391][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 11:57:49,899][15401] Updated weights for policy 0, policy_version 429910 (0.0034) [2024-06-23 11:57:51,004][15349] Signal inference workers to stop experience collection... (104350 times) [2024-06-23 11:57:51,004][15349] Signal inference workers to resume experience collection... (104350 times) [2024-06-23 11:57:51,056][15401] InferenceWorker_p0-w0: stopping experience collection (104350 times) [2024-06-23 11:57:51,056][15401] InferenceWorker_p0-w0: resuming experience collection (104350 times) [2024-06-23 11:57:53,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 7043776512. Throughput: 0: 42728.9. Samples: 7043852400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:53,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 11:57:54,199][15401] Updated weights for policy 0, policy_version 429920 (0.0025) [2024-06-23 11:57:57,941][15401] Updated weights for policy 0, policy_version 429930 (0.0031) [2024-06-23 11:57:58,390][15132] Fps is (10 sec: 40964.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7043989504. Throughput: 0: 42757.2. Samples: 7044107540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 11:57:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 11:58:01,777][15401] Updated weights for policy 0, policy_version 429940 (0.0041) [2024-06-23 11:58:03,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7044202496. Throughput: 0: 43022.7. Samples: 7044366920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 11:58:05,513][15401] Updated weights for policy 0, policy_version 429950 (0.0034) [2024-06-23 11:58:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7044415488. Throughput: 0: 42863.9. Samples: 7044497520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:08,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 11:58:09,522][15401] Updated weights for policy 0, policy_version 429960 (0.0045) [2024-06-23 11:58:13,095][15401] Updated weights for policy 0, policy_version 429970 (0.0031) [2024-06-23 11:58:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.7, 300 sec: 42821.1). Total num frames: 7044644864. Throughput: 0: 42724.4. Samples: 7044752740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:13,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 11:58:16,984][15401] Updated weights for policy 0, policy_version 429980 (0.0029) [2024-06-23 11:58:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 7044857856. Throughput: 0: 42796.0. Samples: 7045007340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 11:58:20,800][15401] Updated weights for policy 0, policy_version 429990 (0.0028) [2024-06-23 11:58:23,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 7045054464. Throughput: 0: 42836.6. Samples: 7045140760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 11:58:24,427][15401] Updated weights for policy 0, policy_version 430000 (0.0042) [2024-06-23 11:58:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7045267456. Throughput: 0: 43031.6. Samples: 7045402220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 11:58:28,431][15401] Updated weights for policy 0, policy_version 430010 (0.0044) [2024-06-23 11:58:31,966][15401] Updated weights for policy 0, policy_version 430020 (0.0030) [2024-06-23 11:58:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7045496832. Throughput: 0: 42866.8. Samples: 7045652080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 11:58:36,177][15401] Updated weights for policy 0, policy_version 430030 (0.0027) [2024-06-23 11:58:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7045709824. Throughput: 0: 42879.1. Samples: 7045781860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:38,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 11:58:40,059][15401] Updated weights for policy 0, policy_version 430040 (0.0031) [2024-06-23 11:58:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7045906432. Throughput: 0: 42871.7. Samples: 7046036760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:43,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-23 11:58:43,942][15401] Updated weights for policy 0, policy_version 430050 (0.0030) [2024-06-23 11:58:47,616][15401] Updated weights for policy 0, policy_version 430060 (0.0043) [2024-06-23 11:58:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.2, 300 sec: 42709.5). Total num frames: 7046119424. Throughput: 0: 42721.3. Samples: 7046289380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 11:58:51,963][15401] Updated weights for policy 0, policy_version 430070 (0.0039) [2024-06-23 11:58:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 7046365184. Throughput: 0: 42826.7. Samples: 7046424720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 11:58:55,197][15401] Updated weights for policy 0, policy_version 430080 (0.0034) [2024-06-23 11:58:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 7046529024. Throughput: 0: 42692.5. Samples: 7046673900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:58:58,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 11:58:59,598][15401] Updated weights for policy 0, policy_version 430090 (0.0039) [2024-06-23 11:59:02,663][15401] Updated weights for policy 0, policy_version 430100 (0.0028) [2024-06-23 11:59:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7046774784. Throughput: 0: 42684.7. Samples: 7046928160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:59:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 11:59:07,175][15401] Updated weights for policy 0, policy_version 430110 (0.0028) [2024-06-23 11:59:08,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 7047004160. Throughput: 0: 42812.0. Samples: 7047067300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:59:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 11:59:10,518][15401] Updated weights for policy 0, policy_version 430120 (0.0022) [2024-06-23 11:59:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42766.0). Total num frames: 7047184384. Throughput: 0: 42584.1. Samples: 7047318500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:59:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 11:59:14,799][15401] Updated weights for policy 0, policy_version 430130 (0.0027) [2024-06-23 11:59:17,986][15401] Updated weights for policy 0, policy_version 430140 (0.0038) [2024-06-23 11:59:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7047430144. Throughput: 0: 42628.0. Samples: 7047570340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 11:59:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 11:59:22,122][15349] Signal inference workers to stop experience collection... (104400 times) [2024-06-23 11:59:22,174][15401] InferenceWorker_p0-w0: stopping experience collection (104400 times) [2024-06-23 11:59:22,174][15349] Signal inference workers to resume experience collection... (104400 times) [2024-06-23 11:59:22,187][15401] InferenceWorker_p0-w0: resuming experience collection (104400 times) [2024-06-23 11:59:22,333][15401] Updated weights for policy 0, policy_version 430150 (0.0026) [2024-06-23 11:59:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7047626752. Throughput: 0: 42750.2. Samples: 7047705620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:23,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-23 11:59:25,517][15401] Updated weights for policy 0, policy_version 430160 (0.0027) [2024-06-23 11:59:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7047823360. Throughput: 0: 42628.4. Samples: 7047955040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 11:59:29,914][15401] Updated weights for policy 0, policy_version 430170 (0.0033) [2024-06-23 11:59:33,157][15401] Updated weights for policy 0, policy_version 430180 (0.0028) [2024-06-23 11:59:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7048069120. Throughput: 0: 42655.6. Samples: 7048208880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 11:59:37,501][15401] Updated weights for policy 0, policy_version 430190 (0.0026) [2024-06-23 11:59:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7048249344. Throughput: 0: 42740.4. Samples: 7048348040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 11:59:40,746][15401] Updated weights for policy 0, policy_version 430200 (0.0028) [2024-06-23 11:59:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7048478720. Throughput: 0: 42705.7. Samples: 7048595660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:43,393][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 11:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430205_7048478720.pth... [2024-06-23 11:59:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429578_7038205952.pth [2024-06-23 11:59:45,473][15401] Updated weights for policy 0, policy_version 430210 (0.0039) [2024-06-23 11:59:48,354][15401] Updated weights for policy 0, policy_version 430220 (0.0035) [2024-06-23 11:59:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 7048724480. Throughput: 0: 42843.2. Samples: 7048856100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 11:59:53,070][15401] Updated weights for policy 0, policy_version 430230 (0.0041) [2024-06-23 11:59:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7048904704. Throughput: 0: 42722.1. Samples: 7048989800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:53,394][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 11:59:56,338][15401] Updated weights for policy 0, policy_version 430240 (0.0031) [2024-06-23 11:59:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43419.3, 300 sec: 42820.8). Total num frames: 7049134080. Throughput: 0: 42640.8. Samples: 7049237340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 11:59:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 12:00:00,721][15401] Updated weights for policy 0, policy_version 430250 (0.0030) [2024-06-23 12:00:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7049330688. Throughput: 0: 42897.3. Samples: 7049500720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 12:00:03,906][15401] Updated weights for policy 0, policy_version 430260 (0.0034) [2024-06-23 12:00:08,297][15401] Updated weights for policy 0, policy_version 430270 (0.0046) [2024-06-23 12:00:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7049543680. Throughput: 0: 42813.7. Samples: 7049632240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 12:00:11,503][15401] Updated weights for policy 0, policy_version 430280 (0.0033) [2024-06-23 12:00:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7049756672. Throughput: 0: 42853.8. Samples: 7049883460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 12:00:16,089][15401] Updated weights for policy 0, policy_version 430290 (0.0045) [2024-06-23 12:00:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7049986048. Throughput: 0: 42922.1. Samples: 7050140380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 12:00:19,391][15401] Updated weights for policy 0, policy_version 430300 (0.0033) [2024-06-23 12:00:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7050166272. Throughput: 0: 42682.6. Samples: 7050268760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:23,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-23 12:00:23,641][15401] Updated weights for policy 0, policy_version 430310 (0.0030) [2024-06-23 12:00:26,933][15401] Updated weights for policy 0, policy_version 430320 (0.0030) [2024-06-23 12:00:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7050395648. Throughput: 0: 42823.7. Samples: 7050522620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:28,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 12:00:29,580][15349] Signal inference workers to stop experience collection... (104450 times) [2024-06-23 12:00:29,580][15349] Signal inference workers to resume experience collection... (104450 times) [2024-06-23 12:00:29,624][15401] InferenceWorker_p0-w0: stopping experience collection (104450 times) [2024-06-23 12:00:29,624][15401] InferenceWorker_p0-w0: resuming experience collection (104450 times) [2024-06-23 12:00:31,551][15401] Updated weights for policy 0, policy_version 430330 (0.0033) [2024-06-23 12:00:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7050625024. Throughput: 0: 42800.1. Samples: 7050782100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 12:00:34,516][15401] Updated weights for policy 0, policy_version 430340 (0.0031) [2024-06-23 12:00:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 7050821632. Throughput: 0: 42876.5. Samples: 7050919240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:38,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 12:00:38,999][15401] Updated weights for policy 0, policy_version 430350 (0.0042) [2024-06-23 12:00:42,432][15401] Updated weights for policy 0, policy_version 430360 (0.0037) [2024-06-23 12:00:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 7051051008. Throughput: 0: 42978.1. Samples: 7051171360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 12:00:43,399][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 12:00:46,513][15401] Updated weights for policy 0, policy_version 430370 (0.0028) [2024-06-23 12:00:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7051280384. Throughput: 0: 42878.1. Samples: 7051430240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:00:48,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 12:00:49,886][15401] Updated weights for policy 0, policy_version 430380 (0.0045) [2024-06-23 12:00:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7051476992. Throughput: 0: 42890.9. Samples: 7051562320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:00:53,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 12:00:54,021][15401] Updated weights for policy 0, policy_version 430390 (0.0028) [2024-06-23 12:00:57,239][15401] Updated weights for policy 0, policy_version 430400 (0.0038) [2024-06-23 12:00:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7051689984. Throughput: 0: 43018.2. Samples: 7051819280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:00:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 12:01:01,459][15401] Updated weights for policy 0, policy_version 430410 (0.0036) [2024-06-23 12:01:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7051919360. Throughput: 0: 43064.4. Samples: 7052078280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:03,400][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 12:01:04,648][15401] Updated weights for policy 0, policy_version 430420 (0.0048) [2024-06-23 12:01:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7052115968. Throughput: 0: 43122.7. Samples: 7052209280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 12:01:09,127][15401] Updated weights for policy 0, policy_version 430430 (0.0026) [2024-06-23 12:01:12,554][15401] Updated weights for policy 0, policy_version 430440 (0.0037) [2024-06-23 12:01:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7052328960. Throughput: 0: 42955.5. Samples: 7052455620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 12:01:17,007][15401] Updated weights for policy 0, policy_version 430450 (0.0027) [2024-06-23 12:01:18,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42867.0, 300 sec: 42875.2). Total num frames: 7052558336. Throughput: 0: 42829.4. Samples: 7052709700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:18,396][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 12:01:20,210][15401] Updated weights for policy 0, policy_version 430460 (0.0030) [2024-06-23 12:01:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 7052754944. Throughput: 0: 42727.1. Samples: 7052841960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 12:01:25,041][15401] Updated weights for policy 0, policy_version 430470 (0.0033) [2024-06-23 12:01:28,155][15401] Updated weights for policy 0, policy_version 430480 (0.0042) [2024-06-23 12:01:28,392][15132] Fps is (10 sec: 42615.4, 60 sec: 43142.8, 300 sec: 42765.6). Total num frames: 7052984320. Throughput: 0: 42565.4. Samples: 7053086900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:28,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 12:01:32,558][15401] Updated weights for policy 0, policy_version 430490 (0.0027) [2024-06-23 12:01:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7053197312. Throughput: 0: 42715.6. Samples: 7053352440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 12:01:35,630][15401] Updated weights for policy 0, policy_version 430500 (0.0027) [2024-06-23 12:01:38,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7053393920. Throughput: 0: 42724.3. Samples: 7053484920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 12:01:39,675][15349] Signal inference workers to stop experience collection... (104500 times) [2024-06-23 12:01:39,727][15401] InferenceWorker_p0-w0: stopping experience collection (104500 times) [2024-06-23 12:01:39,791][15349] Signal inference workers to resume experience collection... (104500 times) [2024-06-23 12:01:39,791][15401] InferenceWorker_p0-w0: resuming experience collection (104500 times) [2024-06-23 12:01:39,953][15401] Updated weights for policy 0, policy_version 430510 (0.0031) [2024-06-23 12:01:43,383][15401] Updated weights for policy 0, policy_version 430520 (0.0037) [2024-06-23 12:01:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 7053639680. Throughput: 0: 42632.9. Samples: 7053737760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:01:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430520_7053639680.pth... [2024-06-23 12:01:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000429892_7043350528.pth [2024-06-23 12:01:47,930][15401] Updated weights for policy 0, policy_version 430530 (0.0024) [2024-06-23 12:01:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7053819904. Throughput: 0: 42717.3. Samples: 7054000560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:48,395][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 12:01:50,948][15401] Updated weights for policy 0, policy_version 430540 (0.0038) [2024-06-23 12:01:53,394][15132] Fps is (10 sec: 40940.0, 60 sec: 42868.0, 300 sec: 42875.4). Total num frames: 7054049280. Throughput: 0: 42453.6. Samples: 7054119900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:53,395][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 12:01:55,601][15401] Updated weights for policy 0, policy_version 430550 (0.0034) [2024-06-23 12:01:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7054262272. Throughput: 0: 42721.8. Samples: 7054378100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:01:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 12:01:58,671][15401] Updated weights for policy 0, policy_version 430560 (0.0026) [2024-06-23 12:02:03,125][15401] Updated weights for policy 0, policy_version 430570 (0.0037) [2024-06-23 12:02:03,389][15132] Fps is (10 sec: 40979.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7054458880. Throughput: 0: 42931.0. Samples: 7054641320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 12:02:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 12:02:06,356][15401] Updated weights for policy 0, policy_version 430580 (0.0037) [2024-06-23 12:02:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7054704640. Throughput: 0: 42770.7. Samples: 7054766640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 12:02:10,885][15401] Updated weights for policy 0, policy_version 430590 (0.0030) [2024-06-23 12:02:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7054884864. Throughput: 0: 43017.5. Samples: 7055022580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 12:02:14,382][15401] Updated weights for policy 0, policy_version 430600 (0.0041) [2024-06-23 12:02:18,387][15401] Updated weights for policy 0, policy_version 430610 (0.0036) [2024-06-23 12:02:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 7055114240. Throughput: 0: 42949.4. Samples: 7055285160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:18,396][15132] Avg episode reward: [(0, '0.262')] [2024-06-23 12:02:22,205][15401] Updated weights for policy 0, policy_version 430620 (0.0032) [2024-06-23 12:02:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7055343616. Throughput: 0: 42814.0. Samples: 7055411540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:23,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 12:02:26,028][15401] Updated weights for policy 0, policy_version 430630 (0.0036) [2024-06-23 12:02:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7055540224. Throughput: 0: 42739.0. Samples: 7055661020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 12:02:29,831][15401] Updated weights for policy 0, policy_version 430640 (0.0035) [2024-06-23 12:02:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7055753216. Throughput: 0: 42717.4. Samples: 7055922840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:33,399][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 12:02:33,536][15401] Updated weights for policy 0, policy_version 430650 (0.0037) [2024-06-23 12:02:37,331][15401] Updated weights for policy 0, policy_version 430660 (0.0041) [2024-06-23 12:02:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7055966208. Throughput: 0: 42878.3. Samples: 7056049220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 12:02:41,162][15401] Updated weights for policy 0, policy_version 430670 (0.0034) [2024-06-23 12:02:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 7056179200. Throughput: 0: 42624.8. Samples: 7056296220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 12:02:44,761][15349] Signal inference workers to stop experience collection... (104550 times) [2024-06-23 12:02:44,768][15349] Signal inference workers to resume experience collection... (104550 times) [2024-06-23 12:02:44,772][15401] InferenceWorker_p0-w0: stopping experience collection (104550 times) [2024-06-23 12:02:44,789][15401] InferenceWorker_p0-w0: resuming experience collection (104550 times) [2024-06-23 12:02:44,929][15401] Updated weights for policy 0, policy_version 430680 (0.0031) [2024-06-23 12:02:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 7056375808. Throughput: 0: 42612.8. Samples: 7056558900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:48,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 12:02:49,160][15401] Updated weights for policy 0, policy_version 430690 (0.0038) [2024-06-23 12:02:52,686][15401] Updated weights for policy 0, policy_version 430700 (0.0035) [2024-06-23 12:02:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42328.7, 300 sec: 42709.5). Total num frames: 7056588800. Throughput: 0: 42592.4. Samples: 7056683300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 12:02:56,739][15401] Updated weights for policy 0, policy_version 430710 (0.0027) [2024-06-23 12:02:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7056818176. Throughput: 0: 42480.0. Samples: 7056934180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:02:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 12:03:00,791][15401] Updated weights for policy 0, policy_version 430720 (0.0043) [2024-06-23 12:03:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7057014784. Throughput: 0: 42418.2. Samples: 7057193980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 12:03:04,349][15401] Updated weights for policy 0, policy_version 430730 (0.0027) [2024-06-23 12:03:08,300][15401] Updated weights for policy 0, policy_version 430740 (0.0032) [2024-06-23 12:03:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 7057244160. Throughput: 0: 42414.2. Samples: 7057320180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 12:03:11,865][15401] Updated weights for policy 0, policy_version 430750 (0.0038) [2024-06-23 12:03:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7057457152. Throughput: 0: 42557.5. Samples: 7057576100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 12:03:16,209][15401] Updated weights for policy 0, policy_version 430760 (0.0038) [2024-06-23 12:03:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7057653760. Throughput: 0: 42373.4. Samples: 7057829640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 12:03:19,524][15401] Updated weights for policy 0, policy_version 430770 (0.0044) [2024-06-23 12:03:23,391][15132] Fps is (10 sec: 40954.2, 60 sec: 42051.3, 300 sec: 42709.3). Total num frames: 7057866752. Throughput: 0: 42370.8. Samples: 7057955960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:23,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 12:03:23,694][15401] Updated weights for policy 0, policy_version 430780 (0.0027) [2024-06-23 12:03:27,182][15401] Updated weights for policy 0, policy_version 430790 (0.0034) [2024-06-23 12:03:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7058096128. Throughput: 0: 42573.8. Samples: 7058212040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:03:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 12:03:31,272][15401] Updated weights for policy 0, policy_version 430800 (0.0037) [2024-06-23 12:03:33,390][15132] Fps is (10 sec: 44242.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7058309120. Throughput: 0: 42537.7. Samples: 7058473100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 12:03:34,827][15401] Updated weights for policy 0, policy_version 430810 (0.0039) [2024-06-23 12:03:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7058505728. Throughput: 0: 42684.9. Samples: 7058604120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 12:03:38,851][15401] Updated weights for policy 0, policy_version 430820 (0.0033) [2024-06-23 12:03:42,612][15401] Updated weights for policy 0, policy_version 430830 (0.0033) [2024-06-23 12:03:43,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 7058751488. Throughput: 0: 42823.0. Samples: 7058861320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:43,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 12:03:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430833_7058767872.pth... [2024-06-23 12:03:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430205_7048478720.pth [2024-06-23 12:03:46,362][15401] Updated weights for policy 0, policy_version 430840 (0.0029) [2024-06-23 12:03:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7058948096. Throughput: 0: 42907.4. Samples: 7059124820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 12:03:50,074][15401] Updated weights for policy 0, policy_version 430850 (0.0032) [2024-06-23 12:03:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 7059161088. Throughput: 0: 42780.3. Samples: 7059245300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 12:03:54,278][15401] Updated weights for policy 0, policy_version 430860 (0.0031) [2024-06-23 12:03:56,745][15349] Signal inference workers to stop experience collection... (104600 times) [2024-06-23 12:03:56,773][15401] InferenceWorker_p0-w0: stopping experience collection (104600 times) [2024-06-23 12:03:56,856][15349] Signal inference workers to resume experience collection... (104600 times) [2024-06-23 12:03:56,857][15401] InferenceWorker_p0-w0: resuming experience collection (104600 times) [2024-06-23 12:03:57,940][15401] Updated weights for policy 0, policy_version 430870 (0.0039) [2024-06-23 12:03:58,393][15132] Fps is (10 sec: 44220.5, 60 sec: 42868.7, 300 sec: 42764.5). Total num frames: 7059390464. Throughput: 0: 42858.5. Samples: 7059504900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:03:58,394][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:04:01,965][15401] Updated weights for policy 0, policy_version 430880 (0.0030) [2024-06-23 12:04:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7059587072. Throughput: 0: 43082.2. Samples: 7059768340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:03,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 12:04:05,402][15401] Updated weights for policy 0, policy_version 430890 (0.0037) [2024-06-23 12:04:08,389][15132] Fps is (10 sec: 40975.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7059800064. Throughput: 0: 43128.5. Samples: 7059896680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:08,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 12:04:09,471][15401] Updated weights for policy 0, policy_version 430900 (0.0035) [2024-06-23 12:04:12,988][15401] Updated weights for policy 0, policy_version 430910 (0.0042) [2024-06-23 12:04:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7060029440. Throughput: 0: 43003.8. Samples: 7060147220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 12:04:17,121][15401] Updated weights for policy 0, policy_version 430920 (0.0029) [2024-06-23 12:04:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7060242432. Throughput: 0: 43095.7. Samples: 7060412400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 12:04:20,571][15401] Updated weights for policy 0, policy_version 430930 (0.0033) [2024-06-23 12:04:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43145.6, 300 sec: 42820.6). Total num frames: 7060455424. Throughput: 0: 43094.3. Samples: 7060543360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 12:04:24,725][15401] Updated weights for policy 0, policy_version 430940 (0.0037) [2024-06-23 12:04:28,058][15401] Updated weights for policy 0, policy_version 430950 (0.0042) [2024-06-23 12:04:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7060684800. Throughput: 0: 43081.4. Samples: 7060799880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:28,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 12:04:32,378][15401] Updated weights for policy 0, policy_version 430960 (0.0023) [2024-06-23 12:04:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7060881408. Throughput: 0: 42956.1. Samples: 7061057840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 12:04:35,955][15401] Updated weights for policy 0, policy_version 430970 (0.0037) [2024-06-23 12:04:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 7061094400. Throughput: 0: 42976.1. Samples: 7061179320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:38,392][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 12:04:39,941][15401] Updated weights for policy 0, policy_version 430980 (0.0034) [2024-06-23 12:04:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7061323776. Throughput: 0: 42886.8. Samples: 7061434640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 12:04:43,509][15401] Updated weights for policy 0, policy_version 430990 (0.0036) [2024-06-23 12:04:47,933][15401] Updated weights for policy 0, policy_version 431000 (0.0037) [2024-06-23 12:04:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7061520384. Throughput: 0: 42905.7. Samples: 7061699100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 12:04:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 12:04:51,177][15401] Updated weights for policy 0, policy_version 431010 (0.0036) [2024-06-23 12:04:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7061749760. Throughput: 0: 42850.7. Samples: 7061824960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:04:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 12:04:55,578][15401] Updated weights for policy 0, policy_version 431020 (0.0027) [2024-06-23 12:04:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42874.2, 300 sec: 42820.6). Total num frames: 7061962752. Throughput: 0: 42951.3. Samples: 7062080020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:04:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 12:04:59,434][15401] Updated weights for policy 0, policy_version 431030 (0.0036) [2024-06-23 12:05:03,110][15401] Updated weights for policy 0, policy_version 431040 (0.0028) [2024-06-23 12:05:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7062159360. Throughput: 0: 42800.5. Samples: 7062338420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 12:05:07,034][15401] Updated weights for policy 0, policy_version 431050 (0.0038) [2024-06-23 12:05:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7062388736. Throughput: 0: 42728.5. Samples: 7062466140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 12:05:10,881][15401] Updated weights for policy 0, policy_version 431060 (0.0046) [2024-06-23 12:05:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7062618112. Throughput: 0: 42632.0. Samples: 7062718320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 12:05:15,019][15401] Updated weights for policy 0, policy_version 431070 (0.0044) [2024-06-23 12:05:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7062798336. Throughput: 0: 42642.3. Samples: 7062976740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 12:05:18,510][15401] Updated weights for policy 0, policy_version 431080 (0.0034) [2024-06-23 12:05:22,643][15401] Updated weights for policy 0, policy_version 431090 (0.0036) [2024-06-23 12:05:23,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7062994944. Throughput: 0: 42587.9. Samples: 7063095680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 12:05:25,886][15349] Signal inference workers to stop experience collection... (104650 times) [2024-06-23 12:05:25,886][15349] Signal inference workers to resume experience collection... (104650 times) [2024-06-23 12:05:25,910][15401] InferenceWorker_p0-w0: stopping experience collection (104650 times) [2024-06-23 12:05:25,910][15401] InferenceWorker_p0-w0: resuming experience collection (104650 times) [2024-06-23 12:05:26,242][15401] Updated weights for policy 0, policy_version 431100 (0.0037) [2024-06-23 12:05:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7063257088. Throughput: 0: 42633.3. Samples: 7063353140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 12:05:30,056][15401] Updated weights for policy 0, policy_version 431110 (0.0035) [2024-06-23 12:05:33,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7063437312. Throughput: 0: 42673.3. Samples: 7063619500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:33,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 12:05:33,785][15401] Updated weights for policy 0, policy_version 431120 (0.0027) [2024-06-23 12:05:37,553][15401] Updated weights for policy 0, policy_version 431130 (0.0036) [2024-06-23 12:05:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42598.4, 300 sec: 42709.2). Total num frames: 7063650304. Throughput: 0: 42488.4. Samples: 7063737040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:38,401][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:05:41,639][15401] Updated weights for policy 0, policy_version 431140 (0.0030) [2024-06-23 12:05:43,392][15132] Fps is (10 sec: 45874.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7063896064. Throughput: 0: 42588.3. Samples: 7063996600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:43,401][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 12:05:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431146_7063896064.pth... [2024-06-23 12:05:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430520_7053639680.pth [2024-06-23 12:05:45,042][15401] Updated weights for policy 0, policy_version 431150 (0.0036) [2024-06-23 12:05:48,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7064076288. Throughput: 0: 42488.8. Samples: 7064250420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:48,396][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 12:05:49,555][15401] Updated weights for policy 0, policy_version 431160 (0.0029) [2024-06-23 12:05:53,329][15401] Updated weights for policy 0, policy_version 431170 (0.0046) [2024-06-23 12:05:53,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7064289280. Throughput: 0: 42249.7. Samples: 7064367380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 12:05:57,184][15401] Updated weights for policy 0, policy_version 431180 (0.0029) [2024-06-23 12:05:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7064535040. Throughput: 0: 42422.8. Samples: 7064627340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:05:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 12:06:01,198][15401] Updated weights for policy 0, policy_version 431190 (0.0034) [2024-06-23 12:06:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7064698880. Throughput: 0: 42424.4. Samples: 7064885840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:06:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 12:06:04,743][15401] Updated weights for policy 0, policy_version 431200 (0.0038) [2024-06-23 12:06:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7064928256. Throughput: 0: 42542.8. Samples: 7065010100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:06:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 12:06:08,761][15401] Updated weights for policy 0, policy_version 431210 (0.0037) [2024-06-23 12:06:12,325][15401] Updated weights for policy 0, policy_version 431220 (0.0027) [2024-06-23 12:06:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42654.9). Total num frames: 7065141248. Throughput: 0: 42532.4. Samples: 7065267100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 12:06:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 12:06:16,352][15401] Updated weights for policy 0, policy_version 431230 (0.0048) [2024-06-23 12:06:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7065354240. Throughput: 0: 42251.9. Samples: 7065520740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 12:06:20,225][15401] Updated weights for policy 0, policy_version 431240 (0.0040) [2024-06-23 12:06:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 7065583616. Throughput: 0: 42416.5. Samples: 7065645680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:06:23,877][15401] Updated weights for policy 0, policy_version 431250 (0.0040) [2024-06-23 12:06:28,082][15401] Updated weights for policy 0, policy_version 431260 (0.0042) [2024-06-23 12:06:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 7065763840. Throughput: 0: 42319.5. Samples: 7065900880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:28,394][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 12:06:31,499][15401] Updated weights for policy 0, policy_version 431270 (0.0044) [2024-06-23 12:06:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 7065976832. Throughput: 0: 42352.1. Samples: 7066156260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:33,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 12:06:35,522][15401] Updated weights for policy 0, policy_version 431280 (0.0034) [2024-06-23 12:06:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 7066222592. Throughput: 0: 42608.4. Samples: 7066284760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 12:06:39,225][15401] Updated weights for policy 0, policy_version 431290 (0.0030) [2024-06-23 12:06:42,231][15349] Signal inference workers to stop experience collection... (104700 times) [2024-06-23 12:06:42,287][15401] InferenceWorker_p0-w0: stopping experience collection (104700 times) [2024-06-23 12:06:42,290][15349] Signal inference workers to resume experience collection... (104700 times) [2024-06-23 12:06:42,297][15401] InferenceWorker_p0-w0: resuming experience collection (104700 times) [2024-06-23 12:06:43,353][15401] Updated weights for policy 0, policy_version 431300 (0.0036) [2024-06-23 12:06:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 7066419200. Throughput: 0: 42554.6. Samples: 7066542300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 12:06:46,769][15401] Updated weights for policy 0, policy_version 431310 (0.0034) [2024-06-23 12:06:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.6). Total num frames: 7066632192. Throughput: 0: 42485.7. Samples: 7066797700. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:48,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 12:06:51,165][15401] Updated weights for policy 0, policy_version 431320 (0.0039) [2024-06-23 12:06:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7066861568. Throughput: 0: 42687.5. Samples: 7066931040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 12:06:54,200][15401] Updated weights for policy 0, policy_version 431330 (0.0034) [2024-06-23 12:06:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 7067041792. Throughput: 0: 42557.9. Samples: 7067182200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:06:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 12:06:58,788][15401] Updated weights for policy 0, policy_version 431340 (0.0042) [2024-06-23 12:07:01,706][15401] Updated weights for policy 0, policy_version 431350 (0.0030) [2024-06-23 12:07:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7067271168. Throughput: 0: 42439.7. Samples: 7067430520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:03,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 12:07:06,419][15401] Updated weights for policy 0, policy_version 431360 (0.0040) [2024-06-23 12:07:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7067467776. Throughput: 0: 42562.6. Samples: 7067561000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 12:07:09,412][15401] Updated weights for policy 0, policy_version 431370 (0.0026) [2024-06-23 12:07:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7067697152. Throughput: 0: 42652.0. Samples: 7067820220. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 12:07:13,952][15401] Updated weights for policy 0, policy_version 431380 (0.0033) [2024-06-23 12:07:17,441][15401] Updated weights for policy 0, policy_version 431390 (0.0035) [2024-06-23 12:07:18,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7067926528. Throughput: 0: 42481.7. Samples: 7068067940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:18,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 12:07:21,563][15401] Updated weights for policy 0, policy_version 431400 (0.0034) [2024-06-23 12:07:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7068106752. Throughput: 0: 42505.8. Samples: 7068197520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 12:07:25,011][15401] Updated weights for policy 0, policy_version 431410 (0.0040) [2024-06-23 12:07:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7068336128. Throughput: 0: 42518.7. Samples: 7068455640. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 12:07:29,515][15401] Updated weights for policy 0, policy_version 431420 (0.0044) [2024-06-23 12:07:32,700][15401] Updated weights for policy 0, policy_version 431430 (0.0028) [2024-06-23 12:07:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7068565504. Throughput: 0: 42412.4. Samples: 7068706260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-23 12:07:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 12:07:37,318][15401] Updated weights for policy 0, policy_version 431440 (0.0043) [2024-06-23 12:07:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7068745728. Throughput: 0: 42385.0. Samples: 7068838360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:07:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 12:07:40,261][15401] Updated weights for policy 0, policy_version 431450 (0.0034) [2024-06-23 12:07:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7068975104. Throughput: 0: 42540.2. Samples: 7069096520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:07:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 12:07:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431456_7068975104.pth... [2024-06-23 12:07:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000430833_7058767872.pth [2024-06-23 12:07:44,903][15401] Updated weights for policy 0, policy_version 431460 (0.0029) [2024-06-23 12:07:47,859][15401] Updated weights for policy 0, policy_version 431470 (0.0041) [2024-06-23 12:07:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7069204480. Throughput: 0: 42589.3. Samples: 7069347040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:07:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 12:07:52,436][15401] Updated weights for policy 0, policy_version 431480 (0.0035) [2024-06-23 12:07:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7069384704. Throughput: 0: 42716.5. Samples: 7069483240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:07:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 12:07:55,498][15401] Updated weights for policy 0, policy_version 431490 (0.0028) [2024-06-23 12:07:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7069614080. Throughput: 0: 42621.0. Samples: 7069738160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:07:58,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 12:07:59,951][15349] Signal inference workers to stop experience collection... (104750 times) [2024-06-23 12:07:59,951][15349] Signal inference workers to resume experience collection... (104750 times) [2024-06-23 12:07:59,962][15401] InferenceWorker_p0-w0: stopping experience collection (104750 times) [2024-06-23 12:07:59,978][15401] InferenceWorker_p0-w0: resuming experience collection (104750 times) [2024-06-23 12:08:00,101][15401] Updated weights for policy 0, policy_version 431500 (0.0034) [2024-06-23 12:08:03,119][15401] Updated weights for policy 0, policy_version 431510 (0.0030) [2024-06-23 12:08:03,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7069859840. Throughput: 0: 42707.1. Samples: 7069989760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:08:07,673][15401] Updated weights for policy 0, policy_version 431520 (0.0031) [2024-06-23 12:08:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7070023680. Throughput: 0: 42864.9. Samples: 7070126440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 12:08:10,722][15401] Updated weights for policy 0, policy_version 431530 (0.0030) [2024-06-23 12:08:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7070253056. Throughput: 0: 42757.2. Samples: 7070379720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 12:08:15,286][15401] Updated weights for policy 0, policy_version 431540 (0.0039) [2024-06-23 12:08:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 7070482432. Throughput: 0: 42703.2. Samples: 7070627900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 12:08:18,634][15401] Updated weights for policy 0, policy_version 431550 (0.0026) [2024-06-23 12:08:23,162][15401] Updated weights for policy 0, policy_version 431560 (0.0029) [2024-06-23 12:08:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7070679040. Throughput: 0: 42753.4. Samples: 7070762260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:23,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 12:08:26,299][15401] Updated weights for policy 0, policy_version 431570 (0.0026) [2024-06-23 12:08:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 7070892032. Throughput: 0: 42619.2. Samples: 7071014380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 12:08:30,736][15401] Updated weights for policy 0, policy_version 431580 (0.0040) [2024-06-23 12:08:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 7071105024. Throughput: 0: 42699.3. Samples: 7071268500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 12:08:34,096][15401] Updated weights for policy 0, policy_version 431590 (0.0034) [2024-06-23 12:08:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 7071318016. Throughput: 0: 42494.3. Samples: 7071395480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:38,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 12:08:38,418][15401] Updated weights for policy 0, policy_version 431600 (0.0036) [2024-06-23 12:08:41,685][15401] Updated weights for policy 0, policy_version 431610 (0.0038) [2024-06-23 12:08:43,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 7071531008. Throughput: 0: 42310.6. Samples: 7071642240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:43,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 12:08:46,684][15401] Updated weights for policy 0, policy_version 431620 (0.0041) [2024-06-23 12:08:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7071727616. Throughput: 0: 42607.7. Samples: 7071907100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 12:08:49,314][15401] Updated weights for policy 0, policy_version 431630 (0.0034) [2024-06-23 12:08:53,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42869.7, 300 sec: 42598.6). Total num frames: 7071956992. Throughput: 0: 42310.2. Samples: 7072030500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:53,393][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 12:08:54,262][15401] Updated weights for policy 0, policy_version 431640 (0.0036) [2024-06-23 12:08:57,235][15401] Updated weights for policy 0, policy_version 431650 (0.0027) [2024-06-23 12:08:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7072169984. Throughput: 0: 42270.8. Samples: 7072281900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 12:08:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 12:09:02,328][15401] Updated weights for policy 0, policy_version 431660 (0.0031) [2024-06-23 12:09:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 7072366592. Throughput: 0: 42710.7. Samples: 7072549880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 12:09:03,676][15349] Signal inference workers to stop experience collection... (104800 times) [2024-06-23 12:09:03,701][15401] InferenceWorker_p0-w0: stopping experience collection (104800 times) [2024-06-23 12:09:03,731][15349] Signal inference workers to resume experience collection... (104800 times) [2024-06-23 12:09:03,731][15401] InferenceWorker_p0-w0: resuming experience collection (104800 times) [2024-06-23 12:09:04,777][15401] Updated weights for policy 0, policy_version 431670 (0.0034) [2024-06-23 12:09:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7072579584. Throughput: 0: 42471.1. Samples: 7072673460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 12:09:09,814][15401] Updated weights for policy 0, policy_version 431680 (0.0030) [2024-06-23 12:09:13,122][15401] Updated weights for policy 0, policy_version 431690 (0.0039) [2024-06-23 12:09:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7072808960. Throughput: 0: 42382.5. Samples: 7072921600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:13,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 12:09:17,382][15401] Updated weights for policy 0, policy_version 431700 (0.0052) [2024-06-23 12:09:18,396][15132] Fps is (10 sec: 40933.5, 60 sec: 41774.7, 300 sec: 42486.4). Total num frames: 7072989184. Throughput: 0: 42592.1. Samples: 7073185420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:18,396][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 12:09:20,740][15401] Updated weights for policy 0, policy_version 431710 (0.0025) [2024-06-23 12:09:23,391][15132] Fps is (10 sec: 40953.2, 60 sec: 42324.0, 300 sec: 42487.1). Total num frames: 7073218560. Throughput: 0: 42496.9. Samples: 7073307920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:23,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 12:09:25,037][15401] Updated weights for policy 0, policy_version 431720 (0.0030) [2024-06-23 12:09:28,322][15401] Updated weights for policy 0, policy_version 431730 (0.0028) [2024-06-23 12:09:28,390][15132] Fps is (10 sec: 47543.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7073464320. Throughput: 0: 42698.7. Samples: 7073563580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 12:09:33,010][15401] Updated weights for policy 0, policy_version 431740 (0.0023) [2024-06-23 12:09:33,389][15132] Fps is (10 sec: 40967.6, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 7073628160. Throughput: 0: 42712.8. Samples: 7073829180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 12:09:35,822][15401] Updated weights for policy 0, policy_version 431750 (0.0046) [2024-06-23 12:09:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7073873920. Throughput: 0: 42660.6. Samples: 7073950120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 12:09:40,578][15401] Updated weights for policy 0, policy_version 431760 (0.0039) [2024-06-23 12:09:43,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 7074103296. Throughput: 0: 42785.6. Samples: 7074207260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:43,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 12:09:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431769_7074103296.pth... [2024-06-23 12:09:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431146_7063896064.pth [2024-06-23 12:09:43,699][15401] Updated weights for policy 0, policy_version 431770 (0.0039) [2024-06-23 12:09:48,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 7074250752. Throughput: 0: 42617.4. Samples: 7074467660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 12:09:48,541][15401] Updated weights for policy 0, policy_version 431780 (0.0044) [2024-06-23 12:09:51,408][15401] Updated weights for policy 0, policy_version 431790 (0.0039) [2024-06-23 12:09:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 7074529280. Throughput: 0: 42493.7. Samples: 7074585680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 12:09:56,167][15401] Updated weights for policy 0, policy_version 431800 (0.0032) [2024-06-23 12:09:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7074709504. Throughput: 0: 42670.9. Samples: 7074841780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:09:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 12:09:59,210][15401] Updated weights for policy 0, policy_version 431810 (0.0037) [2024-06-23 12:10:03,392][15132] Fps is (10 sec: 37673.9, 60 sec: 42323.6, 300 sec: 42431.4). Total num frames: 7074906112. Throughput: 0: 42484.6. Samples: 7075097060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:10:03,393][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 12:10:03,830][15401] Updated weights for policy 0, policy_version 431820 (0.0038) [2024-06-23 12:10:07,126][15401] Updated weights for policy 0, policy_version 431830 (0.0039) [2024-06-23 12:10:08,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 7075151872. Throughput: 0: 42581.7. Samples: 7075224120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:10:08,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 12:10:11,448][15401] Updated weights for policy 0, policy_version 431840 (0.0037) [2024-06-23 12:10:13,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 7075348480. Throughput: 0: 42543.2. Samples: 7075478020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:10:13,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 12:10:14,813][15401] Updated weights for policy 0, policy_version 431850 (0.0037) [2024-06-23 12:10:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42876.1, 300 sec: 42598.4). Total num frames: 7075561472. Throughput: 0: 42253.8. Samples: 7075730600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 12:10:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 12:10:19,325][15401] Updated weights for policy 0, policy_version 431860 (0.0031) [2024-06-23 12:10:22,442][15401] Updated weights for policy 0, policy_version 431870 (0.0026) [2024-06-23 12:10:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.8, 300 sec: 42487.3). Total num frames: 7075790848. Throughput: 0: 42446.6. Samples: 7075860220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 12:10:23,878][15349] Signal inference workers to stop experience collection... (104850 times) [2024-06-23 12:10:23,904][15401] InferenceWorker_p0-w0: stopping experience collection (104850 times) [2024-06-23 12:10:23,944][15349] Signal inference workers to resume experience collection... (104850 times) [2024-06-23 12:10:23,945][15401] InferenceWorker_p0-w0: resuming experience collection (104850 times) [2024-06-23 12:10:26,865][15401] Updated weights for policy 0, policy_version 431880 (0.0043) [2024-06-23 12:10:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 7075987456. Throughput: 0: 42454.0. Samples: 7076117680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 12:10:30,129][15401] Updated weights for policy 0, policy_version 431890 (0.0028) [2024-06-23 12:10:33,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42593.8, 300 sec: 42486.7). Total num frames: 7076184064. Throughput: 0: 42241.0. Samples: 7076368780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:33,397][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 12:10:34,520][15401] Updated weights for policy 0, policy_version 431900 (0.0040) [2024-06-23 12:10:38,027][15401] Updated weights for policy 0, policy_version 431910 (0.0030) [2024-06-23 12:10:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42432.2). Total num frames: 7076413440. Throughput: 0: 42354.8. Samples: 7076491640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 12:10:42,155][15401] Updated weights for policy 0, policy_version 431920 (0.0037) [2024-06-23 12:10:43,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 7076626432. Throughput: 0: 42449.1. Samples: 7076752000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 12:10:46,101][15401] Updated weights for policy 0, policy_version 431930 (0.0038) [2024-06-23 12:10:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 7076839424. Throughput: 0: 42342.2. Samples: 7077002360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 12:10:49,766][15401] Updated weights for policy 0, policy_version 431940 (0.0024) [2024-06-23 12:10:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7077052416. Throughput: 0: 42343.2. Samples: 7077129460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 12:10:53,624][15401] Updated weights for policy 0, policy_version 431950 (0.0045) [2024-06-23 12:10:57,698][15401] Updated weights for policy 0, policy_version 431960 (0.0025) [2024-06-23 12:10:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7077265408. Throughput: 0: 42632.0. Samples: 7077396460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:10:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 12:11:01,337][15401] Updated weights for policy 0, policy_version 431970 (0.0029) [2024-06-23 12:11:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 7077494784. Throughput: 0: 42421.6. Samples: 7077639580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:03,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 12:11:05,390][15401] Updated weights for policy 0, policy_version 431980 (0.0036) [2024-06-23 12:11:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 7077691392. Throughput: 0: 42485.4. Samples: 7077772060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 12:11:09,122][15401] Updated weights for policy 0, policy_version 431990 (0.0046) [2024-06-23 12:11:13,220][15401] Updated weights for policy 0, policy_version 432000 (0.0045) [2024-06-23 12:11:13,392][15132] Fps is (10 sec: 40951.2, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 7077904384. Throughput: 0: 42452.4. Samples: 7078028140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:13,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 12:11:16,886][15401] Updated weights for policy 0, policy_version 432010 (0.0037) [2024-06-23 12:11:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7078133760. Throughput: 0: 42284.2. Samples: 7078271300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 12:11:20,823][15401] Updated weights for policy 0, policy_version 432020 (0.0038) [2024-06-23 12:11:23,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7078313984. Throughput: 0: 42546.9. Samples: 7078406260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 12:11:24,532][15401] Updated weights for policy 0, policy_version 432030 (0.0049) [2024-06-23 12:11:28,243][15401] Updated weights for policy 0, policy_version 432040 (0.0045) [2024-06-23 12:11:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7078543360. Throughput: 0: 42482.0. Samples: 7078663680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 12:11:32,054][15401] Updated weights for policy 0, policy_version 432050 (0.0038) [2024-06-23 12:11:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43149.2, 300 sec: 42542.9). Total num frames: 7078772736. Throughput: 0: 42510.4. Samples: 7078915320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:33,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 12:11:36,083][15401] Updated weights for policy 0, policy_version 432060 (0.0033) [2024-06-23 12:11:36,587][15349] Signal inference workers to stop experience collection... (104900 times) [2024-06-23 12:11:36,588][15349] Signal inference workers to resume experience collection... (104900 times) [2024-06-23 12:11:36,620][15401] InferenceWorker_p0-w0: stopping experience collection (104900 times) [2024-06-23 12:11:36,620][15401] InferenceWorker_p0-w0: resuming experience collection (104900 times) [2024-06-23 12:11:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7078969344. Throughput: 0: 42583.5. Samples: 7079045720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:38,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 12:11:40,082][15401] Updated weights for policy 0, policy_version 432070 (0.0035) [2024-06-23 12:11:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7079165952. Throughput: 0: 42258.6. Samples: 7079298100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 12:11:43,398][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 12:11:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432079_7079182336.pth... [2024-06-23 12:11:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431456_7068975104.pth [2024-06-23 12:11:43,702][15401] Updated weights for policy 0, policy_version 432080 (0.0027) [2024-06-23 12:11:47,712][15401] Updated weights for policy 0, policy_version 432090 (0.0042) [2024-06-23 12:11:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7079411712. Throughput: 0: 42468.9. Samples: 7079550680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:11:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 12:11:51,428][15401] Updated weights for policy 0, policy_version 432100 (0.0036) [2024-06-23 12:11:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7079608320. Throughput: 0: 42549.7. Samples: 7079686800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:11:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 12:11:55,166][15401] Updated weights for policy 0, policy_version 432110 (0.0034) [2024-06-23 12:11:58,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 7079804928. Throughput: 0: 42437.2. Samples: 7079937820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:11:58,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 12:11:59,121][15401] Updated weights for policy 0, policy_version 432120 (0.0032) [2024-06-23 12:12:02,875][15401] Updated weights for policy 0, policy_version 432130 (0.0037) [2024-06-23 12:12:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7080034304. Throughput: 0: 42756.5. Samples: 7080195340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:12:06,823][15401] Updated weights for policy 0, policy_version 432140 (0.0050) [2024-06-23 12:12:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7080247296. Throughput: 0: 42656.0. Samples: 7080325780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:08,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-23 12:12:10,470][15401] Updated weights for policy 0, policy_version 432150 (0.0031) [2024-06-23 12:12:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 7080460288. Throughput: 0: 42571.0. Samples: 7080579380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 12:12:14,619][15401] Updated weights for policy 0, policy_version 432160 (0.0027) [2024-06-23 12:12:18,036][15401] Updated weights for policy 0, policy_version 432170 (0.0029) [2024-06-23 12:12:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7080689664. Throughput: 0: 42742.5. Samples: 7080838740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 12:12:22,127][15401] Updated weights for policy 0, policy_version 432180 (0.0023) [2024-06-23 12:12:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 7080886272. Throughput: 0: 42781.8. Samples: 7080970900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 12:12:25,579][15401] Updated weights for policy 0, policy_version 432190 (0.0025) [2024-06-23 12:12:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 7081115648. Throughput: 0: 42856.4. Samples: 7081226640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 12:12:29,585][15401] Updated weights for policy 0, policy_version 432200 (0.0028) [2024-06-23 12:12:32,964][15401] Updated weights for policy 0, policy_version 432210 (0.0027) [2024-06-23 12:12:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7081328640. Throughput: 0: 42992.2. Samples: 7081485320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 12:12:37,543][15401] Updated weights for policy 0, policy_version 432220 (0.0035) [2024-06-23 12:12:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7081525248. Throughput: 0: 42880.1. Samples: 7081616400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:38,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 12:12:40,563][15401] Updated weights for policy 0, policy_version 432230 (0.0032) [2024-06-23 12:12:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 7081754624. Throughput: 0: 42948.5. Samples: 7081870400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:43,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 12:12:45,109][15401] Updated weights for policy 0, policy_version 432240 (0.0028) [2024-06-23 12:12:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7081967616. Throughput: 0: 42976.5. Samples: 7082129280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 12:12:48,559][15401] Updated weights for policy 0, policy_version 432250 (0.0043) [2024-06-23 12:12:52,744][15401] Updated weights for policy 0, policy_version 432260 (0.0031) [2024-06-23 12:12:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7082164224. Throughput: 0: 42890.2. Samples: 7082255840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 12:12:56,147][15401] Updated weights for policy 0, policy_version 432270 (0.0055) [2024-06-23 12:12:58,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43417.7, 300 sec: 42542.5). Total num frames: 7082409984. Throughput: 0: 42922.3. Samples: 7082510980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:12:58,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 12:13:00,309][15401] Updated weights for policy 0, policy_version 432280 (0.0026) [2024-06-23 12:13:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7082622976. Throughput: 0: 42871.3. Samples: 7082767940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 12:13:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 12:13:03,618][15401] Updated weights for policy 0, policy_version 432290 (0.0044) [2024-06-23 12:13:08,073][15401] Updated weights for policy 0, policy_version 432300 (0.0042) [2024-06-23 12:13:08,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7082803200. Throughput: 0: 42834.6. Samples: 7082898460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 12:13:08,896][15349] Signal inference workers to stop experience collection... (104950 times) [2024-06-23 12:13:08,897][15349] Signal inference workers to resume experience collection... (104950 times) [2024-06-23 12:13:08,919][15401] InferenceWorker_p0-w0: stopping experience collection (104950 times) [2024-06-23 12:13:08,919][15401] InferenceWorker_p0-w0: resuming experience collection (104950 times) [2024-06-23 12:13:11,312][15401] Updated weights for policy 0, policy_version 432310 (0.0024) [2024-06-23 12:13:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7083032576. Throughput: 0: 42778.7. Samples: 7083151680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 12:13:15,558][15401] Updated weights for policy 0, policy_version 432320 (0.0046) [2024-06-23 12:13:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7083261952. Throughput: 0: 42860.8. Samples: 7083414060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 12:13:19,159][15401] Updated weights for policy 0, policy_version 432330 (0.0027) [2024-06-23 12:13:23,196][15401] Updated weights for policy 0, policy_version 432340 (0.0046) [2024-06-23 12:13:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7083458560. Throughput: 0: 42864.3. Samples: 7083545300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 12:13:26,712][15401] Updated weights for policy 0, policy_version 432350 (0.0043) [2024-06-23 12:13:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7083687936. Throughput: 0: 42774.6. Samples: 7083795260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 12:13:31,036][15401] Updated weights for policy 0, policy_version 432360 (0.0028) [2024-06-23 12:13:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7083884544. Throughput: 0: 42917.9. Samples: 7084060580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 12:13:34,291][15401] Updated weights for policy 0, policy_version 432370 (0.0034) [2024-06-23 12:13:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 7084097536. Throughput: 0: 42814.2. Samples: 7084182480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:38,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 12:13:38,593][15401] Updated weights for policy 0, policy_version 432380 (0.0027) [2024-06-23 12:13:41,848][15401] Updated weights for policy 0, policy_version 432390 (0.0042) [2024-06-23 12:13:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 7084326912. Throughput: 0: 42867.5. Samples: 7084440020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:43,393][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 12:13:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432394_7084343296.pth... [2024-06-23 12:13:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000431769_7074103296.pth [2024-06-23 12:13:46,116][15401] Updated weights for policy 0, policy_version 432400 (0.0040) [2024-06-23 12:13:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7084539904. Throughput: 0: 42997.2. Samples: 7084702820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:48,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 12:13:49,717][15401] Updated weights for policy 0, policy_version 432410 (0.0033) [2024-06-23 12:13:53,392][15132] Fps is (10 sec: 42598.2, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7084752896. Throughput: 0: 42805.3. Samples: 7084824800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:53,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 12:13:53,614][15401] Updated weights for policy 0, policy_version 432420 (0.0030) [2024-06-23 12:13:57,380][15401] Updated weights for policy 0, policy_version 432430 (0.0035) [2024-06-23 12:13:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 7084982272. Throughput: 0: 42984.0. Samples: 7085085960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:13:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 12:14:01,321][15401] Updated weights for policy 0, policy_version 432440 (0.0041) [2024-06-23 12:14:03,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7085178880. Throughput: 0: 42935.9. Samples: 7085346180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 12:14:04,951][15401] Updated weights for policy 0, policy_version 432450 (0.0036) [2024-06-23 12:14:08,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7085359104. Throughput: 0: 42675.7. Samples: 7085465700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 12:14:08,924][15401] Updated weights for policy 0, policy_version 432460 (0.0031) [2024-06-23 12:14:12,661][15401] Updated weights for policy 0, policy_version 432470 (0.0037) [2024-06-23 12:14:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42821.5). Total num frames: 7085621248. Throughput: 0: 42899.8. Samples: 7085725740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 12:14:16,554][15401] Updated weights for policy 0, policy_version 432480 (0.0042) [2024-06-23 12:14:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 7085817856. Throughput: 0: 42811.0. Samples: 7085987080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 12:14:20,546][15401] Updated weights for policy 0, policy_version 432490 (0.0033) [2024-06-23 12:14:23,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 7086014464. Throughput: 0: 42843.5. Samples: 7086110540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:23,393][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 12:14:24,248][15401] Updated weights for policy 0, policy_version 432500 (0.0031) [2024-06-23 12:14:28,328][15401] Updated weights for policy 0, policy_version 432510 (0.0034) [2024-06-23 12:14:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7086243840. Throughput: 0: 42915.5. Samples: 7086371120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:14:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 12:14:32,258][15401] Updated weights for policy 0, policy_version 432520 (0.0043) [2024-06-23 12:14:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7086456832. Throughput: 0: 42650.6. Samples: 7086622100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 12:14:33,846][15349] Signal inference workers to stop experience collection... (105000 times) [2024-06-23 12:14:33,850][15349] Signal inference workers to resume experience collection... (105000 times) [2024-06-23 12:14:33,863][15401] InferenceWorker_p0-w0: stopping experience collection (105000 times) [2024-06-23 12:14:33,863][15401] InferenceWorker_p0-w0: resuming experience collection (105000 times) [2024-06-23 12:14:35,871][15401] Updated weights for policy 0, policy_version 432530 (0.0025) [2024-06-23 12:14:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7086669824. Throughput: 0: 42757.9. Samples: 7086748800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 12:14:39,813][15401] Updated weights for policy 0, policy_version 432540 (0.0047) [2024-06-23 12:14:43,349][15401] Updated weights for policy 0, policy_version 432550 (0.0044) [2024-06-23 12:14:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 7086899200. Throughput: 0: 42825.8. Samples: 7087013120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 12:14:47,341][15401] Updated weights for policy 0, policy_version 432560 (0.0048) [2024-06-23 12:14:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7087079424. Throughput: 0: 42691.7. Samples: 7087267300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 12:14:50,880][15401] Updated weights for policy 0, policy_version 432570 (0.0041) [2024-06-23 12:14:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 7087325184. Throughput: 0: 42794.6. Samples: 7087391460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 12:14:54,822][15401] Updated weights for policy 0, policy_version 432580 (0.0028) [2024-06-23 12:14:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7087538176. Throughput: 0: 42859.5. Samples: 7087654420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:14:58,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:14:58,487][15401] Updated weights for policy 0, policy_version 432590 (0.0029) [2024-06-23 12:15:02,402][15401] Updated weights for policy 0, policy_version 432600 (0.0040) [2024-06-23 12:15:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42653.9). Total num frames: 7087734784. Throughput: 0: 42672.4. Samples: 7087907440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:03,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 12:15:06,407][15401] Updated weights for policy 0, policy_version 432610 (0.0040) [2024-06-23 12:15:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 7087964160. Throughput: 0: 42861.4. Samples: 7088039300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:08,393][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 12:15:10,024][15401] Updated weights for policy 0, policy_version 432620 (0.0051) [2024-06-23 12:15:13,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7088177152. Throughput: 0: 42757.4. Samples: 7088295200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 12:15:13,985][15401] Updated weights for policy 0, policy_version 432630 (0.0032) [2024-06-23 12:15:17,939][15401] Updated weights for policy 0, policy_version 432640 (0.0033) [2024-06-23 12:15:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7088390144. Throughput: 0: 42700.9. Samples: 7088543640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 12:15:21,710][15401] Updated weights for policy 0, policy_version 432650 (0.0035) [2024-06-23 12:15:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 7088603136. Throughput: 0: 42762.6. Samples: 7088673120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 12:15:25,495][15401] Updated weights for policy 0, policy_version 432660 (0.0041) [2024-06-23 12:15:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42821.1). Total num frames: 7088816128. Throughput: 0: 42699.1. Samples: 7088934680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:28,393][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 12:15:29,419][15401] Updated weights for policy 0, policy_version 432670 (0.0038) [2024-06-23 12:15:33,042][15401] Updated weights for policy 0, policy_version 432680 (0.0033) [2024-06-23 12:15:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7089029120. Throughput: 0: 42662.5. Samples: 7089187120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 12:15:36,891][15401] Updated weights for policy 0, policy_version 432690 (0.0036) [2024-06-23 12:15:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 7089258496. Throughput: 0: 42797.7. Samples: 7089317360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 12:15:41,191][15401] Updated weights for policy 0, policy_version 432700 (0.0034) [2024-06-23 12:15:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7089438720. Throughput: 0: 42647.9. Samples: 7089573580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 12:15:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432706_7089455104.pth... [2024-06-23 12:15:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432079_7079182336.pth [2024-06-23 12:15:44,789][15401] Updated weights for policy 0, policy_version 432710 (0.0028) [2024-06-23 12:15:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7089668096. Throughput: 0: 42623.7. Samples: 7089825400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 12:15:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 12:15:48,765][15401] Updated weights for policy 0, policy_version 432720 (0.0042) [2024-06-23 12:15:52,416][15401] Updated weights for policy 0, policy_version 432730 (0.0026) [2024-06-23 12:15:53,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 7089897472. Throughput: 0: 42544.4. Samples: 7089953800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:15:53,393][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 12:15:56,662][15401] Updated weights for policy 0, policy_version 432740 (0.0030) [2024-06-23 12:15:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7090077696. Throughput: 0: 42511.3. Samples: 7090208200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:15:58,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 12:15:59,602][15349] Signal inference workers to stop experience collection... (105050 times) [2024-06-23 12:15:59,603][15349] Signal inference workers to resume experience collection... (105050 times) [2024-06-23 12:15:59,645][15401] InferenceWorker_p0-w0: stopping experience collection (105050 times) [2024-06-23 12:15:59,646][15401] InferenceWorker_p0-w0: resuming experience collection (105050 times) [2024-06-23 12:16:00,165][15401] Updated weights for policy 0, policy_version 432750 (0.0031) [2024-06-23 12:16:03,396][15132] Fps is (10 sec: 37668.1, 60 sec: 42322.5, 300 sec: 42653.0). Total num frames: 7090274304. Throughput: 0: 42670.9. Samples: 7090464100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:03,396][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 12:16:04,193][15401] Updated weights for policy 0, policy_version 432760 (0.0042) [2024-06-23 12:16:07,739][15401] Updated weights for policy 0, policy_version 432770 (0.0036) [2024-06-23 12:16:08,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 7090503680. Throughput: 0: 42642.7. Samples: 7090592040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 12:16:12,133][15401] Updated weights for policy 0, policy_version 432780 (0.0023) [2024-06-23 12:16:13,390][15132] Fps is (10 sec: 45903.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7090733056. Throughput: 0: 42588.4. Samples: 7090851060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 12:16:15,353][15401] Updated weights for policy 0, policy_version 432790 (0.0034) [2024-06-23 12:16:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7090929664. Throughput: 0: 42558.7. Samples: 7091102260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 12:16:19,801][15401] Updated weights for policy 0, policy_version 432800 (0.0038) [2024-06-23 12:16:23,075][15401] Updated weights for policy 0, policy_version 432810 (0.0035) [2024-06-23 12:16:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7091159040. Throughput: 0: 42441.9. Samples: 7091227240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 12:16:27,365][15401] Updated weights for policy 0, policy_version 432820 (0.0029) [2024-06-23 12:16:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7091372032. Throughput: 0: 42422.7. Samples: 7091482600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 12:16:30,952][15401] Updated weights for policy 0, policy_version 432830 (0.0043) [2024-06-23 12:16:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7091568640. Throughput: 0: 42612.4. Samples: 7091742960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:16:34,783][15401] Updated weights for policy 0, policy_version 432840 (0.0047) [2024-06-23 12:16:38,355][15401] Updated weights for policy 0, policy_version 432850 (0.0040) [2024-06-23 12:16:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7091814400. Throughput: 0: 42578.6. Samples: 7091869740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 12:16:42,524][15401] Updated weights for policy 0, policy_version 432860 (0.0027) [2024-06-23 12:16:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7091994624. Throughput: 0: 42661.0. Samples: 7092127960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 12:16:46,053][15401] Updated weights for policy 0, policy_version 432870 (0.0039) [2024-06-23 12:16:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7092224000. Throughput: 0: 42713.2. Samples: 7092385920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:48,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 12:16:50,074][15401] Updated weights for policy 0, policy_version 432880 (0.0035) [2024-06-23 12:16:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42820.9). Total num frames: 7092436992. Throughput: 0: 42679.0. Samples: 7092512600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 12:16:53,986][15401] Updated weights for policy 0, policy_version 432890 (0.0035) [2024-06-23 12:16:57,529][15401] Updated weights for policy 0, policy_version 432900 (0.0037) [2024-06-23 12:16:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7092649984. Throughput: 0: 42632.5. Samples: 7092769520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:16:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 12:17:01,559][15401] Updated weights for policy 0, policy_version 432910 (0.0040) [2024-06-23 12:17:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 7092862976. Throughput: 0: 42828.5. Samples: 7093029540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:17:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 12:17:05,082][15401] Updated weights for policy 0, policy_version 432920 (0.0041) [2024-06-23 12:17:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7093092352. Throughput: 0: 42951.1. Samples: 7093160040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:17:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 12:17:09,204][15401] Updated weights for policy 0, policy_version 432930 (0.0033) [2024-06-23 12:17:13,241][15401] Updated weights for policy 0, policy_version 432940 (0.0046) [2024-06-23 12:17:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7093288960. Throughput: 0: 42848.4. Samples: 7093410780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-23 12:17:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 12:17:16,899][15401] Updated weights for policy 0, policy_version 432950 (0.0027) [2024-06-23 12:17:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7093501952. Throughput: 0: 42667.2. Samples: 7093662980. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 12:17:20,832][15401] Updated weights for policy 0, policy_version 432960 (0.0024) [2024-06-23 12:17:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7093714944. Throughput: 0: 42728.6. Samples: 7093792520. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 12:17:24,330][15401] Updated weights for policy 0, policy_version 432970 (0.0030) [2024-06-23 12:17:25,523][15349] Signal inference workers to stop experience collection... (105100 times) [2024-06-23 12:17:25,523][15349] Signal inference workers to resume experience collection... (105100 times) [2024-06-23 12:17:25,564][15401] InferenceWorker_p0-w0: stopping experience collection (105100 times) [2024-06-23 12:17:25,564][15401] InferenceWorker_p0-w0: resuming experience collection (105100 times) [2024-06-23 12:17:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7093911552. Throughput: 0: 42749.1. Samples: 7094051660. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:28,398][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 12:17:28,934][15401] Updated weights for policy 0, policy_version 432980 (0.0039) [2024-06-23 12:17:31,939][15401] Updated weights for policy 0, policy_version 432990 (0.0040) [2024-06-23 12:17:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7094140928. Throughput: 0: 42640.8. Samples: 7094304760. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 12:17:36,461][15401] Updated weights for policy 0, policy_version 433000 (0.0045) [2024-06-23 12:17:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7094353920. Throughput: 0: 42712.5. Samples: 7094434660. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 12:17:39,451][15401] Updated weights for policy 0, policy_version 433010 (0.0034) [2024-06-23 12:17:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7094550528. Throughput: 0: 42660.9. Samples: 7094689260. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 12:17:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433018_7094566912.pth... [2024-06-23 12:17:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432394_7084343296.pth [2024-06-23 12:17:44,042][15401] Updated weights for policy 0, policy_version 433020 (0.0044) [2024-06-23 12:17:47,153][15401] Updated weights for policy 0, policy_version 433030 (0.0029) [2024-06-23 12:17:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7094779904. Throughput: 0: 42492.0. Samples: 7094941680. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 12:17:51,650][15401] Updated weights for policy 0, policy_version 433040 (0.0034) [2024-06-23 12:17:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 7094992896. Throughput: 0: 42458.7. Samples: 7095070680. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 12:17:54,779][15401] Updated weights for policy 0, policy_version 433050 (0.0036) [2024-06-23 12:17:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 7095222272. Throughput: 0: 42752.0. Samples: 7095334620. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:17:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:17:59,008][15401] Updated weights for policy 0, policy_version 433060 (0.0026) [2024-06-23 12:18:02,398][15401] Updated weights for policy 0, policy_version 433070 (0.0039) [2024-06-23 12:18:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7095435264. Throughput: 0: 42781.5. Samples: 7095588160. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 12:18:06,947][15401] Updated weights for policy 0, policy_version 433080 (0.0027) [2024-06-23 12:18:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7095615488. Throughput: 0: 42742.5. Samples: 7095715940. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 12:18:10,441][15401] Updated weights for policy 0, policy_version 433090 (0.0042) [2024-06-23 12:18:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7095844864. Throughput: 0: 42643.8. Samples: 7095970640. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 12:18:14,899][15401] Updated weights for policy 0, policy_version 433100 (0.0046) [2024-06-23 12:18:18,132][15401] Updated weights for policy 0, policy_version 433110 (0.0030) [2024-06-23 12:18:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7096074240. Throughput: 0: 42581.0. Samples: 7096220900. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 12:18:22,575][15401] Updated weights for policy 0, policy_version 433120 (0.0029) [2024-06-23 12:18:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7096254464. Throughput: 0: 42699.1. Samples: 7096356120. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 12:18:25,783][15401] Updated weights for policy 0, policy_version 433130 (0.0031) [2024-06-23 12:18:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7096483840. Throughput: 0: 42735.1. Samples: 7096612340. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 12:18:30,503][15401] Updated weights for policy 0, policy_version 433140 (0.0028) [2024-06-23 12:18:33,235][15401] Updated weights for policy 0, policy_version 433150 (0.0027) [2024-06-23 12:18:33,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7096729600. Throughput: 0: 42685.4. Samples: 7096862520. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 12:18:37,897][15401] Updated weights for policy 0, policy_version 433160 (0.0032) [2024-06-23 12:18:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 7096909824. Throughput: 0: 42809.7. Samples: 7096997220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:38,393][15132] Avg episode reward: [(0, '0.272')] [2024-06-23 12:18:40,109][15349] Signal inference workers to stop experience collection... (105150 times) [2024-06-23 12:18:40,160][15349] Signal inference workers to resume experience collection... (105150 times) [2024-06-23 12:18:40,160][15401] InferenceWorker_p0-w0: stopping experience collection (105150 times) [2024-06-23 12:18:40,173][15401] InferenceWorker_p0-w0: resuming experience collection (105150 times) [2024-06-23 12:18:40,770][15401] Updated weights for policy 0, policy_version 433170 (0.0033) [2024-06-23 12:18:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7097139200. Throughput: 0: 42758.0. Samples: 7097258720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 12:18:45,258][15401] Updated weights for policy 0, policy_version 433180 (0.0027) [2024-06-23 12:18:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7097352192. Throughput: 0: 42801.9. Samples: 7097514240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 12:18:49,057][15401] Updated weights for policy 0, policy_version 433190 (0.0028) [2024-06-23 12:18:52,610][15401] Updated weights for policy 0, policy_version 433200 (0.0035) [2024-06-23 12:18:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7097548800. Throughput: 0: 42838.4. Samples: 7097643660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 12:18:56,740][15401] Updated weights for policy 0, policy_version 433210 (0.0029) [2024-06-23 12:18:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7097794560. Throughput: 0: 42930.4. Samples: 7097902500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:18:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 12:19:00,637][15401] Updated weights for policy 0, policy_version 433220 (0.0030) [2024-06-23 12:19:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 7097991168. Throughput: 0: 43151.6. Samples: 7098162720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 12:19:04,391][15401] Updated weights for policy 0, policy_version 433230 (0.0047) [2024-06-23 12:19:08,304][15401] Updated weights for policy 0, policy_version 433240 (0.0035) [2024-06-23 12:19:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7098204160. Throughput: 0: 42855.6. Samples: 7098284620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 12:19:12,107][15401] Updated weights for policy 0, policy_version 433250 (0.0032) [2024-06-23 12:19:13,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7098433536. Throughput: 0: 42946.7. Samples: 7098544940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 12:19:15,923][15401] Updated weights for policy 0, policy_version 433260 (0.0036) [2024-06-23 12:19:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.1, 300 sec: 42709.8). Total num frames: 7098613760. Throughput: 0: 43058.4. Samples: 7098800160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 12:19:19,900][15401] Updated weights for policy 0, policy_version 433270 (0.0025) [2024-06-23 12:19:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7098843136. Throughput: 0: 42768.0. Samples: 7098921680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:23,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 12:19:23,545][15401] Updated weights for policy 0, policy_version 433280 (0.0040) [2024-06-23 12:19:27,702][15401] Updated weights for policy 0, policy_version 433290 (0.0044) [2024-06-23 12:19:28,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7099072512. Throughput: 0: 42823.9. Samples: 7099185800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 12:19:30,966][15401] Updated weights for policy 0, policy_version 433300 (0.0027) [2024-06-23 12:19:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7099269120. Throughput: 0: 42760.4. Samples: 7099438460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 12:19:35,372][15401] Updated weights for policy 0, policy_version 433310 (0.0033) [2024-06-23 12:19:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 7099498496. Throughput: 0: 42641.7. Samples: 7099562540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 12:19:38,644][15401] Updated weights for policy 0, policy_version 433320 (0.0031) [2024-06-23 12:19:43,155][15401] Updated weights for policy 0, policy_version 433330 (0.0031) [2024-06-23 12:19:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7099695104. Throughput: 0: 42732.9. Samples: 7099825480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 12:19:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433332_7099711488.pth... [2024-06-23 12:19:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000432706_7089455104.pth [2024-06-23 12:19:46,267][15401] Updated weights for policy 0, policy_version 433340 (0.0029) [2024-06-23 12:19:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7099908096. Throughput: 0: 42592.7. Samples: 7100079400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:48,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 12:19:50,782][15401] Updated weights for policy 0, policy_version 433350 (0.0031) [2024-06-23 12:19:53,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43415.7, 300 sec: 42764.7). Total num frames: 7100153856. Throughput: 0: 42678.2. Samples: 7100205240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:53,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 12:19:54,202][15401] Updated weights for policy 0, policy_version 433360 (0.0034) [2024-06-23 12:19:58,359][15401] Updated weights for policy 0, policy_version 433370 (0.0028) [2024-06-23 12:19:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 7100334080. Throughput: 0: 42722.2. Samples: 7100467440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:19:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 12:20:01,675][15401] Updated weights for policy 0, policy_version 433380 (0.0025) [2024-06-23 12:20:03,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 7100547072. Throughput: 0: 42760.2. Samples: 7100724360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 12:20:06,176][15349] Signal inference workers to stop experience collection... (105200 times) [2024-06-23 12:20:06,176][15349] Signal inference workers to resume experience collection... (105200 times) [2024-06-23 12:20:06,179][15401] Updated weights for policy 0, policy_version 433390 (0.0049) [2024-06-23 12:20:06,218][15401] InferenceWorker_p0-w0: stopping experience collection (105200 times) [2024-06-23 12:20:06,218][15401] InferenceWorker_p0-w0: resuming experience collection (105200 times) [2024-06-23 12:20:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7100792832. Throughput: 0: 42914.3. Samples: 7100852820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 12:20:09,191][15401] Updated weights for policy 0, policy_version 433400 (0.0033) [2024-06-23 12:20:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 7100940288. Throughput: 0: 42543.5. Samples: 7101100260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 12:20:13,726][15401] Updated weights for policy 0, policy_version 433410 (0.0037) [2024-06-23 12:20:17,018][15401] Updated weights for policy 0, policy_version 433420 (0.0043) [2024-06-23 12:20:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7101186048. Throughput: 0: 42616.5. Samples: 7101356200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:18,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 12:20:21,294][15401] Updated weights for policy 0, policy_version 433430 (0.0034) [2024-06-23 12:20:23,389][15132] Fps is (10 sec: 49152.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 7101431808. Throughput: 0: 42772.4. Samples: 7101487300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 12:20:24,539][15401] Updated weights for policy 0, policy_version 433440 (0.0029) [2024-06-23 12:20:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7101595648. Throughput: 0: 42560.7. Samples: 7101740720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 12:20:29,065][15401] Updated weights for policy 0, policy_version 433450 (0.0036) [2024-06-23 12:20:32,141][15401] Updated weights for policy 0, policy_version 433460 (0.0035) [2024-06-23 12:20:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7101825024. Throughput: 0: 42630.8. Samples: 7101997780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 12:20:36,498][15401] Updated weights for policy 0, policy_version 433470 (0.0036) [2024-06-23 12:20:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7102054400. Throughput: 0: 42854.4. Samples: 7102133580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 12:20:39,633][15401] Updated weights for policy 0, policy_version 433480 (0.0040) [2024-06-23 12:20:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7102234624. Throughput: 0: 42642.3. Samples: 7102386340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 12:20:44,003][15401] Updated weights for policy 0, policy_version 433490 (0.0042) [2024-06-23 12:20:47,379][15401] Updated weights for policy 0, policy_version 433500 (0.0042) [2024-06-23 12:20:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7102480384. Throughput: 0: 42570.7. Samples: 7102640040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 12:20:51,659][15401] Updated weights for policy 0, policy_version 433510 (0.0040) [2024-06-23 12:20:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 7102693376. Throughput: 0: 42800.3. Samples: 7102778840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 12:20:55,038][15401] Updated weights for policy 0, policy_version 433520 (0.0038) [2024-06-23 12:20:58,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 42710.0). Total num frames: 7102873600. Throughput: 0: 42850.2. Samples: 7103028620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:20:58,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 12:20:59,431][15401] Updated weights for policy 0, policy_version 433530 (0.0028) [2024-06-23 12:21:02,846][15401] Updated weights for policy 0, policy_version 433540 (0.0024) [2024-06-23 12:21:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7103135744. Throughput: 0: 42751.9. Samples: 7103280040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:21:03,391][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 12:21:06,987][15401] Updated weights for policy 0, policy_version 433550 (0.0033) [2024-06-23 12:21:08,390][15132] Fps is (10 sec: 45886.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7103332352. Throughput: 0: 42900.8. Samples: 7103417840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:21:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 12:21:10,424][15401] Updated weights for policy 0, policy_version 433560 (0.0028) [2024-06-23 12:21:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 7103528960. Throughput: 0: 42909.5. Samples: 7103671640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:21:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 12:21:14,751][15401] Updated weights for policy 0, policy_version 433570 (0.0024) [2024-06-23 12:21:18,047][15401] Updated weights for policy 0, policy_version 433580 (0.0029) [2024-06-23 12:21:18,391][15132] Fps is (10 sec: 44229.7, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 7103774720. Throughput: 0: 42822.4. Samples: 7103924860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 12:21:18,391][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 12:21:22,451][15401] Updated weights for policy 0, policy_version 433590 (0.0033) [2024-06-23 12:21:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7103987712. Throughput: 0: 42796.4. Samples: 7104059420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:23,390][15132] Avg episode reward: [(0, '0.904')] [2024-06-23 12:21:26,126][15401] Updated weights for policy 0, policy_version 433600 (0.0037) [2024-06-23 12:21:27,620][15349] Signal inference workers to stop experience collection... (105250 times) [2024-06-23 12:21:27,621][15349] Signal inference workers to resume experience collection... (105250 times) [2024-06-23 12:21:27,638][15401] InferenceWorker_p0-w0: stopping experience collection (105250 times) [2024-06-23 12:21:27,639][15401] InferenceWorker_p0-w0: resuming experience collection (105250 times) [2024-06-23 12:21:28,390][15132] Fps is (10 sec: 40966.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7104184320. Throughput: 0: 42880.3. Samples: 7104315960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:28,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 12:21:30,159][15401] Updated weights for policy 0, policy_version 433610 (0.0029) [2024-06-23 12:21:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7104413696. Throughput: 0: 42891.9. Samples: 7104570180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:33,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-23 12:21:33,550][15401] Updated weights for policy 0, policy_version 433620 (0.0031) [2024-06-23 12:21:38,067][15401] Updated weights for policy 0, policy_version 433630 (0.0031) [2024-06-23 12:21:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7104610304. Throughput: 0: 42610.3. Samples: 7104696300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:38,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 12:21:40,930][15401] Updated weights for policy 0, policy_version 433640 (0.0040) [2024-06-23 12:21:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7104839680. Throughput: 0: 42942.0. Samples: 7104960900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:43,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-23 12:21:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433646_7104856064.pth... [2024-06-23 12:21:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433018_7094566912.pth [2024-06-23 12:21:45,602][15401] Updated weights for policy 0, policy_version 433650 (0.0030) [2024-06-23 12:21:48,396][15132] Fps is (10 sec: 45845.9, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 7105069056. Throughput: 0: 42880.2. Samples: 7105209920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:48,396][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 12:21:48,426][15401] Updated weights for policy 0, policy_version 433660 (0.0045) [2024-06-23 12:21:53,082][15401] Updated weights for policy 0, policy_version 433670 (0.0042) [2024-06-23 12:21:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7105265664. Throughput: 0: 42811.1. Samples: 7105344440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:53,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:21:55,927][15401] Updated weights for policy 0, policy_version 433680 (0.0041) [2024-06-23 12:21:58,390][15132] Fps is (10 sec: 40986.0, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 7105478656. Throughput: 0: 42821.7. Samples: 7105598620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:21:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 12:22:00,679][15401] Updated weights for policy 0, policy_version 433690 (0.0035) [2024-06-23 12:22:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7105708032. Throughput: 0: 42821.2. Samples: 7105851740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 12:22:04,159][15401] Updated weights for policy 0, policy_version 433700 (0.0034) [2024-06-23 12:22:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7105888256. Throughput: 0: 42691.6. Samples: 7105980540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:22:08,465][15401] Updated weights for policy 0, policy_version 433710 (0.0030) [2024-06-23 12:22:11,760][15401] Updated weights for policy 0, policy_version 433720 (0.0034) [2024-06-23 12:22:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7106101248. Throughput: 0: 42636.1. Samples: 7106234580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 12:22:16,229][15401] Updated weights for policy 0, policy_version 433730 (0.0036) [2024-06-23 12:22:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42599.6, 300 sec: 42765.0). Total num frames: 7106330624. Throughput: 0: 42766.4. Samples: 7106494660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 12:22:19,674][15401] Updated weights for policy 0, policy_version 433740 (0.0039) [2024-06-23 12:22:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7106527232. Throughput: 0: 42905.3. Samples: 7106627040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 12:22:23,821][15401] Updated weights for policy 0, policy_version 433750 (0.0040) [2024-06-23 12:22:27,150][15401] Updated weights for policy 0, policy_version 433760 (0.0034) [2024-06-23 12:22:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7106740224. Throughput: 0: 42654.7. Samples: 7106880360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 12:22:31,289][15401] Updated weights for policy 0, policy_version 433770 (0.0036) [2024-06-23 12:22:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7106985984. Throughput: 0: 42841.6. Samples: 7107137520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 12:22:34,991][15401] Updated weights for policy 0, policy_version 433780 (0.0039) [2024-06-23 12:22:38,391][15132] Fps is (10 sec: 44231.3, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 7107182592. Throughput: 0: 42826.5. Samples: 7107271580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:38,391][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 12:22:38,917][15401] Updated weights for policy 0, policy_version 433790 (0.0032) [2024-06-23 12:22:41,077][15349] Signal inference workers to stop experience collection... (105300 times) [2024-06-23 12:22:41,079][15349] Signal inference workers to resume experience collection... (105300 times) [2024-06-23 12:22:41,100][15401] InferenceWorker_p0-w0: stopping experience collection (105300 times) [2024-06-23 12:22:41,101][15401] InferenceWorker_p0-w0: resuming experience collection (105300 times) [2024-06-23 12:22:42,715][15401] Updated weights for policy 0, policy_version 433800 (0.0033) [2024-06-23 12:22:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 7107395584. Throughput: 0: 42807.0. Samples: 7107524940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-23 12:22:43,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 12:22:46,427][15401] Updated weights for policy 0, policy_version 433810 (0.0028) [2024-06-23 12:22:48,389][15132] Fps is (10 sec: 44242.6, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 7107624960. Throughput: 0: 42886.3. Samples: 7107781620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:22:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 12:22:50,360][15401] Updated weights for policy 0, policy_version 433820 (0.0035) [2024-06-23 12:22:53,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 7107821568. Throughput: 0: 42998.3. Samples: 7107915460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:22:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 12:22:54,095][15401] Updated weights for policy 0, policy_version 433830 (0.0046) [2024-06-23 12:22:57,882][15401] Updated weights for policy 0, policy_version 433840 (0.0041) [2024-06-23 12:22:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7108050944. Throughput: 0: 42921.9. Samples: 7108166060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:22:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 12:23:01,612][15401] Updated weights for policy 0, policy_version 433850 (0.0045) [2024-06-23 12:23:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7108263936. Throughput: 0: 42939.9. Samples: 7108426960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 12:23:05,407][15401] Updated weights for policy 0, policy_version 433860 (0.0032) [2024-06-23 12:23:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7108476928. Throughput: 0: 42962.3. Samples: 7108560340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 12:23:09,148][15401] Updated weights for policy 0, policy_version 433870 (0.0048) [2024-06-23 12:23:12,933][15401] Updated weights for policy 0, policy_version 433880 (0.0041) [2024-06-23 12:23:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7108689920. Throughput: 0: 42921.7. Samples: 7108811840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 12:23:16,574][15401] Updated weights for policy 0, policy_version 433890 (0.0030) [2024-06-23 12:23:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7108919296. Throughput: 0: 42948.9. Samples: 7109070220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 12:23:21,255][15401] Updated weights for policy 0, policy_version 433900 (0.0033) [2024-06-23 12:23:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7109115904. Throughput: 0: 43081.6. Samples: 7109210200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:23,396][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 12:23:24,075][15401] Updated weights for policy 0, policy_version 433910 (0.0026) [2024-06-23 12:23:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 7109328896. Throughput: 0: 42983.9. Samples: 7109459220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 12:23:28,563][15401] Updated weights for policy 0, policy_version 433920 (0.0031) [2024-06-23 12:23:32,018][15401] Updated weights for policy 0, policy_version 433930 (0.0034) [2024-06-23 12:23:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7109574656. Throughput: 0: 43157.2. Samples: 7109723700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 12:23:36,380][15401] Updated weights for policy 0, policy_version 433940 (0.0030) [2024-06-23 12:23:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43145.3, 300 sec: 42820.5). Total num frames: 7109771264. Throughput: 0: 43322.0. Samples: 7109864960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 12:23:39,505][15401] Updated weights for policy 0, policy_version 433950 (0.0039) [2024-06-23 12:23:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7109967872. Throughput: 0: 43156.3. Samples: 7110108100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 12:23:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433958_7109967872.pth... [2024-06-23 12:23:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433332_7099711488.pth [2024-06-23 12:23:43,903][15401] Updated weights for policy 0, policy_version 433960 (0.0030) [2024-06-23 12:23:47,087][15401] Updated weights for policy 0, policy_version 433970 (0.0035) [2024-06-23 12:23:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7110213632. Throughput: 0: 43136.5. Samples: 7110368100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 12:23:51,446][15401] Updated weights for policy 0, policy_version 433980 (0.0032) [2024-06-23 12:23:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7110410240. Throughput: 0: 43218.2. Samples: 7110505160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 12:23:54,698][15401] Updated weights for policy 0, policy_version 433990 (0.0029) [2024-06-23 12:23:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7110623232. Throughput: 0: 43122.6. Samples: 7110752360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:23:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 12:23:58,985][15401] Updated weights for policy 0, policy_version 434000 (0.0033) [2024-06-23 12:24:02,606][15401] Updated weights for policy 0, policy_version 434010 (0.0036) [2024-06-23 12:24:03,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 7110852608. Throughput: 0: 43211.2. Samples: 7111015000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:24:03,396][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 12:24:03,908][15349] Signal inference workers to stop experience collection... (105350 times) [2024-06-23 12:24:03,961][15401] InferenceWorker_p0-w0: stopping experience collection (105350 times) [2024-06-23 12:24:03,962][15349] Signal inference workers to resume experience collection... (105350 times) [2024-06-23 12:24:03,987][15401] InferenceWorker_p0-w0: resuming experience collection (105350 times) [2024-06-23 12:24:06,464][15401] Updated weights for policy 0, policy_version 434020 (0.0036) [2024-06-23 12:24:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7111065600. Throughput: 0: 43151.0. Samples: 7111152000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 12:24:08,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 12:24:10,024][15401] Updated weights for policy 0, policy_version 434030 (0.0040) [2024-06-23 12:24:13,390][15132] Fps is (10 sec: 42625.4, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 7111278592. Throughput: 0: 43160.6. Samples: 7111401440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 12:24:13,990][15401] Updated weights for policy 0, policy_version 434040 (0.0040) [2024-06-23 12:24:17,561][15401] Updated weights for policy 0, policy_version 434050 (0.0026) [2024-06-23 12:24:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7111491584. Throughput: 0: 43068.5. Samples: 7111661780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 12:24:21,469][15401] Updated weights for policy 0, policy_version 434060 (0.0023) [2024-06-23 12:24:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7111704576. Throughput: 0: 42911.2. Samples: 7111795960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 12:24:25,067][15401] Updated weights for policy 0, policy_version 434070 (0.0050) [2024-06-23 12:24:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 7111933952. Throughput: 0: 43361.8. Samples: 7112059380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:28,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 12:24:28,888][15401] Updated weights for policy 0, policy_version 434080 (0.0028) [2024-06-23 12:24:32,713][15401] Updated weights for policy 0, policy_version 434090 (0.0030) [2024-06-23 12:24:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7112146944. Throughput: 0: 43243.1. Samples: 7112314040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 12:24:36,343][15401] Updated weights for policy 0, policy_version 434100 (0.0024) [2024-06-23 12:24:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7112359936. Throughput: 0: 42996.3. Samples: 7112440000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 12:24:40,333][15401] Updated weights for policy 0, policy_version 434110 (0.0033) [2024-06-23 12:24:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7112572928. Throughput: 0: 43206.8. Samples: 7112696660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 12:24:43,953][15401] Updated weights for policy 0, policy_version 434120 (0.0032) [2024-06-23 12:24:47,828][15401] Updated weights for policy 0, policy_version 434130 (0.0032) [2024-06-23 12:24:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7112785920. Throughput: 0: 43019.4. Samples: 7112950600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 12:24:52,031][15401] Updated weights for policy 0, policy_version 434140 (0.0027) [2024-06-23 12:24:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 7113015296. Throughput: 0: 42836.0. Samples: 7113079620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 12:24:55,462][15401] Updated weights for policy 0, policy_version 434150 (0.0033) [2024-06-23 12:24:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7113211904. Throughput: 0: 42997.4. Samples: 7113336320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:24:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 12:24:59,546][15401] Updated weights for policy 0, policy_version 434160 (0.0028) [2024-06-23 12:25:03,106][15401] Updated weights for policy 0, policy_version 434170 (0.0032) [2024-06-23 12:25:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 7113441280. Throughput: 0: 42703.0. Samples: 7113583420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 12:25:07,341][15401] Updated weights for policy 0, policy_version 434180 (0.0039) [2024-06-23 12:25:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 7113654272. Throughput: 0: 42727.6. Samples: 7113718700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:25:11,476][15401] Updated weights for policy 0, policy_version 434190 (0.0028) [2024-06-23 12:25:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7113850880. Throughput: 0: 42404.1. Samples: 7113967560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 12:25:15,119][15401] Updated weights for policy 0, policy_version 434200 (0.0031) [2024-06-23 12:25:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7114047488. Throughput: 0: 42380.0. Samples: 7114221140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 12:25:19,133][15349] Signal inference workers to stop experience collection... (105400 times) [2024-06-23 12:25:19,180][15401] InferenceWorker_p0-w0: stopping experience collection (105400 times) [2024-06-23 12:25:19,194][15349] Signal inference workers to resume experience collection... (105400 times) [2024-06-23 12:25:19,204][15401] InferenceWorker_p0-w0: resuming experience collection (105400 times) [2024-06-23 12:25:19,206][15401] Updated weights for policy 0, policy_version 434210 (0.0030) [2024-06-23 12:25:22,762][15401] Updated weights for policy 0, policy_version 434220 (0.0028) [2024-06-23 12:25:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7114276864. Throughput: 0: 42343.2. Samples: 7114345440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 12:25:26,649][15401] Updated weights for policy 0, policy_version 434230 (0.0052) [2024-06-23 12:25:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7114489856. Throughput: 0: 42390.7. Samples: 7114604240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 12:25:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 12:25:30,449][15401] Updated weights for policy 0, policy_version 434240 (0.0040) [2024-06-23 12:25:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 7114686464. Throughput: 0: 42562.7. Samples: 7114865920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:33,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 12:25:34,350][15401] Updated weights for policy 0, policy_version 434250 (0.0028) [2024-06-23 12:25:38,054][15401] Updated weights for policy 0, policy_version 434260 (0.0028) [2024-06-23 12:25:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 7114915840. Throughput: 0: 42418.3. Samples: 7114988440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:38,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 12:25:41,865][15401] Updated weights for policy 0, policy_version 434270 (0.0042) [2024-06-23 12:25:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7115128832. Throughput: 0: 42468.3. Samples: 7115247400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 12:25:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434273_7115128832.pth... [2024-06-23 12:25:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433646_7104856064.pth [2024-06-23 12:25:45,706][15401] Updated weights for policy 0, policy_version 434280 (0.0028) [2024-06-23 12:25:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7115325440. Throughput: 0: 42773.9. Samples: 7115508240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 12:25:49,523][15401] Updated weights for policy 0, policy_version 434290 (0.0039) [2024-06-23 12:25:53,134][15401] Updated weights for policy 0, policy_version 434300 (0.0034) [2024-06-23 12:25:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 43043.1). Total num frames: 7115571200. Throughput: 0: 42668.0. Samples: 7115638760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 12:25:57,255][15401] Updated weights for policy 0, policy_version 434310 (0.0046) [2024-06-23 12:25:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7115751424. Throughput: 0: 42875.6. Samples: 7115896960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:25:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:26:00,750][15401] Updated weights for policy 0, policy_version 434320 (0.0027) [2024-06-23 12:26:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7115980800. Throughput: 0: 42849.0. Samples: 7116149340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:26:04,877][15401] Updated weights for policy 0, policy_version 434330 (0.0046) [2024-06-23 12:26:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 7116193792. Throughput: 0: 43039.6. Samples: 7116282220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 12:26:08,615][15401] Updated weights for policy 0, policy_version 434340 (0.0041) [2024-06-23 12:26:12,450][15401] Updated weights for policy 0, policy_version 434350 (0.0036) [2024-06-23 12:26:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 7116406784. Throughput: 0: 42901.3. Samples: 7116534800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 12:26:16,129][15401] Updated weights for policy 0, policy_version 434360 (0.0052) [2024-06-23 12:26:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7116619776. Throughput: 0: 42851.2. Samples: 7116794220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 12:26:20,199][15401] Updated weights for policy 0, policy_version 434370 (0.0040) [2024-06-23 12:26:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7116849152. Throughput: 0: 42983.0. Samples: 7116922680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:23,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 12:26:23,990][15401] Updated weights for policy 0, policy_version 434380 (0.0024) [2024-06-23 12:26:28,093][15401] Updated weights for policy 0, policy_version 434390 (0.0031) [2024-06-23 12:26:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 7117045760. Throughput: 0: 42825.0. Samples: 7117174520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:28,394][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 12:26:31,667][15401] Updated weights for policy 0, policy_version 434400 (0.0029) [2024-06-23 12:26:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7117258752. Throughput: 0: 42763.0. Samples: 7117432580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 12:26:35,952][15401] Updated weights for policy 0, policy_version 434410 (0.0046) [2024-06-23 12:26:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7117471744. Throughput: 0: 42771.6. Samples: 7117563480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 12:26:39,135][15401] Updated weights for policy 0, policy_version 434420 (0.0038) [2024-06-23 12:26:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 7117668352. Throughput: 0: 42769.7. Samples: 7117821600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 12:26:43,639][15401] Updated weights for policy 0, policy_version 434430 (0.0032) [2024-06-23 12:26:47,081][15401] Updated weights for policy 0, policy_version 434440 (0.0029) [2024-06-23 12:26:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 7117914112. Throughput: 0: 42735.5. Samples: 7118072440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:48,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 12:26:51,236][15401] Updated weights for policy 0, policy_version 434450 (0.0039) [2024-06-23 12:26:52,514][15349] Signal inference workers to stop experience collection... (105450 times) [2024-06-23 12:26:52,515][15349] Signal inference workers to resume experience collection... (105450 times) [2024-06-23 12:26:52,565][15401] InferenceWorker_p0-w0: stopping experience collection (105450 times) [2024-06-23 12:26:52,565][15401] InferenceWorker_p0-w0: resuming experience collection (105450 times) [2024-06-23 12:26:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7118127104. Throughput: 0: 42868.1. Samples: 7118211280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-23 12:26:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 12:26:54,715][15401] Updated weights for policy 0, policy_version 434460 (0.0032) [2024-06-23 12:26:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7118323712. Throughput: 0: 42785.3. Samples: 7118460140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:26:58,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 12:26:58,763][15401] Updated weights for policy 0, policy_version 434470 (0.0026) [2024-06-23 12:27:02,461][15401] Updated weights for policy 0, policy_version 434480 (0.0043) [2024-06-23 12:27:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 7118569472. Throughput: 0: 42691.4. Samples: 7118715440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:03,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 12:27:07,015][15401] Updated weights for policy 0, policy_version 434490 (0.0029) [2024-06-23 12:27:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7118766080. Throughput: 0: 42680.1. Samples: 7118843280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 12:27:10,010][15401] Updated weights for policy 0, policy_version 434500 (0.0044) [2024-06-23 12:27:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7118962688. Throughput: 0: 42676.6. Samples: 7119094960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:13,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 12:27:14,530][15401] Updated weights for policy 0, policy_version 434510 (0.0029) [2024-06-23 12:27:17,822][15401] Updated weights for policy 0, policy_version 434520 (0.0034) [2024-06-23 12:27:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7119192064. Throughput: 0: 42656.0. Samples: 7119352100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:27:22,090][15401] Updated weights for policy 0, policy_version 434530 (0.0027) [2024-06-23 12:27:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7119388672. Throughput: 0: 42600.3. Samples: 7119480500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:27:25,414][15401] Updated weights for policy 0, policy_version 434540 (0.0032) [2024-06-23 12:27:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7119601664. Throughput: 0: 42451.6. Samples: 7119731920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 12:27:29,637][15401] Updated weights for policy 0, policy_version 434550 (0.0026) [2024-06-23 12:27:32,939][15401] Updated weights for policy 0, policy_version 434560 (0.0033) [2024-06-23 12:27:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 7119847424. Throughput: 0: 42713.4. Samples: 7119994540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 12:27:37,129][15401] Updated weights for policy 0, policy_version 434570 (0.0040) [2024-06-23 12:27:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7120011264. Throughput: 0: 42468.9. Samples: 7120122380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 12:27:40,807][15401] Updated weights for policy 0, policy_version 434580 (0.0023) [2024-06-23 12:27:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 7120257024. Throughput: 0: 42417.3. Samples: 7120369020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:43,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 12:27:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434586_7120257024.pth... [2024-06-23 12:27:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000433958_7109967872.pth [2024-06-23 12:27:44,716][15401] Updated weights for policy 0, policy_version 434590 (0.0030) [2024-06-23 12:27:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7120453632. Throughput: 0: 42572.8. Samples: 7120631120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 12:27:48,812][15401] Updated weights for policy 0, policy_version 434600 (0.0027) [2024-06-23 12:27:52,239][15401] Updated weights for policy 0, policy_version 434610 (0.0039) [2024-06-23 12:27:53,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7120650240. Throughput: 0: 42532.9. Samples: 7120757260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 12:27:56,414][15401] Updated weights for policy 0, policy_version 434620 (0.0037) [2024-06-23 12:27:58,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7120912384. Throughput: 0: 42603.0. Samples: 7121012100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:27:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 12:27:59,819][15401] Updated weights for policy 0, policy_version 434630 (0.0033) [2024-06-23 12:28:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 7121108992. Throughput: 0: 42697.2. Samples: 7121273480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:28:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 12:28:03,953][15401] Updated weights for policy 0, policy_version 434640 (0.0040) [2024-06-23 12:28:07,900][15401] Updated weights for policy 0, policy_version 434650 (0.0028) [2024-06-23 12:28:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7121305600. Throughput: 0: 42594.8. Samples: 7121397260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:28:08,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 12:28:11,490][15401] Updated weights for policy 0, policy_version 434660 (0.0029) [2024-06-23 12:28:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7121551360. Throughput: 0: 42786.6. Samples: 7121657320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 12:28:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 12:28:15,881][15401] Updated weights for policy 0, policy_version 434670 (0.0042) [2024-06-23 12:28:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7121764352. Throughput: 0: 42636.4. Samples: 7121913180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 12:28:19,347][15401] Updated weights for policy 0, policy_version 434680 (0.0031) [2024-06-23 12:28:19,962][15349] Signal inference workers to stop experience collection... (105500 times) [2024-06-23 12:28:20,012][15401] InferenceWorker_p0-w0: stopping experience collection (105500 times) [2024-06-23 12:28:20,076][15349] Signal inference workers to resume experience collection... (105500 times) [2024-06-23 12:28:20,077][15401] InferenceWorker_p0-w0: resuming experience collection (105500 times) [2024-06-23 12:28:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7121944576. Throughput: 0: 42558.5. Samples: 7122037520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:23,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-23 12:28:23,507][15401] Updated weights for policy 0, policy_version 434690 (0.0042) [2024-06-23 12:28:26,883][15401] Updated weights for policy 0, policy_version 434700 (0.0028) [2024-06-23 12:28:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 7122206720. Throughput: 0: 42864.5. Samples: 7122297820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:28,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 12:28:31,054][15401] Updated weights for policy 0, policy_version 434710 (0.0031) [2024-06-23 12:28:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7122386944. Throughput: 0: 42799.2. Samples: 7122557080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 12:28:34,375][15401] Updated weights for policy 0, policy_version 434720 (0.0040) [2024-06-23 12:28:38,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7122599936. Throughput: 0: 42762.1. Samples: 7122681560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 12:28:38,577][15401] Updated weights for policy 0, policy_version 434730 (0.0031) [2024-06-23 12:28:42,334][15401] Updated weights for policy 0, policy_version 434740 (0.0041) [2024-06-23 12:28:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 7122845696. Throughput: 0: 42878.2. Samples: 7122941620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 12:28:46,218][15401] Updated weights for policy 0, policy_version 434750 (0.0045) [2024-06-23 12:28:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7123025920. Throughput: 0: 42752.0. Samples: 7123197320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 12:28:49,966][15401] Updated weights for policy 0, policy_version 434760 (0.0035) [2024-06-23 12:28:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7123238912. Throughput: 0: 42653.6. Samples: 7123316680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 12:28:54,207][15401] Updated weights for policy 0, policy_version 434770 (0.0039) [2024-06-23 12:28:57,747][15401] Updated weights for policy 0, policy_version 434780 (0.0026) [2024-06-23 12:28:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 7123451904. Throughput: 0: 42605.7. Samples: 7123574580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:28:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 12:29:01,772][15401] Updated weights for policy 0, policy_version 434790 (0.0040) [2024-06-23 12:29:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7123648512. Throughput: 0: 42658.3. Samples: 7123832800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:03,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-23 12:29:05,451][15401] Updated weights for policy 0, policy_version 434800 (0.0041) [2024-06-23 12:29:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7123877888. Throughput: 0: 42646.7. Samples: 7123956620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 12:29:09,314][15401] Updated weights for policy 0, policy_version 434810 (0.0042) [2024-06-23 12:29:12,901][15401] Updated weights for policy 0, policy_version 434820 (0.0038) [2024-06-23 12:29:13,392][15132] Fps is (10 sec: 45862.8, 60 sec: 42596.5, 300 sec: 42764.6). Total num frames: 7124107264. Throughput: 0: 42807.2. Samples: 7124224260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:13,393][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 12:29:16,927][15401] Updated weights for policy 0, policy_version 434830 (0.0032) [2024-06-23 12:29:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 7124303872. Throughput: 0: 42506.2. Samples: 7124469960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:18,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 12:29:20,709][15401] Updated weights for policy 0, policy_version 434840 (0.0028) [2024-06-23 12:29:23,390][15132] Fps is (10 sec: 42609.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7124533248. Throughput: 0: 42616.5. Samples: 7124599300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:23,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-23 12:29:24,804][15401] Updated weights for policy 0, policy_version 434850 (0.0032) [2024-06-23 12:29:28,389][15132] Fps is (10 sec: 40970.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 7124713472. Throughput: 0: 42657.0. Samples: 7124861180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 12:29:28,552][15401] Updated weights for policy 0, policy_version 434860 (0.0032) [2024-06-23 12:29:32,594][15401] Updated weights for policy 0, policy_version 434870 (0.0033) [2024-06-23 12:29:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7124942848. Throughput: 0: 42600.5. Samples: 7125114340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 12:29:34,468][15349] Signal inference workers to stop experience collection... (105550 times) [2024-06-23 12:29:34,468][15349] Signal inference workers to resume experience collection... (105550 times) [2024-06-23 12:29:34,490][15401] InferenceWorker_p0-w0: stopping experience collection (105550 times) [2024-06-23 12:29:34,491][15401] InferenceWorker_p0-w0: resuming experience collection (105550 times) [2024-06-23 12:29:36,195][15401] Updated weights for policy 0, policy_version 434880 (0.0045) [2024-06-23 12:29:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7125172224. Throughput: 0: 42772.1. Samples: 7125241420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 12:29:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 12:29:40,344][15401] Updated weights for policy 0, policy_version 434890 (0.0032) [2024-06-23 12:29:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 7125352448. Throughput: 0: 42774.8. Samples: 7125499440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:29:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 12:29:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434898_7125368832.pth... [2024-06-23 12:29:43,539][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434273_7115128832.pth [2024-06-23 12:29:43,841][15401] Updated weights for policy 0, policy_version 434900 (0.0037) [2024-06-23 12:29:47,822][15401] Updated weights for policy 0, policy_version 434910 (0.0044) [2024-06-23 12:29:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7125565440. Throughput: 0: 42659.1. Samples: 7125752460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:29:48,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 12:29:51,458][15401] Updated weights for policy 0, policy_version 434920 (0.0035) [2024-06-23 12:29:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7125811200. Throughput: 0: 42828.4. Samples: 7125883900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:29:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 12:29:55,317][15401] Updated weights for policy 0, policy_version 434930 (0.0038) [2024-06-23 12:29:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7125991424. Throughput: 0: 42638.1. Samples: 7126142860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:29:58,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 12:29:59,102][15401] Updated weights for policy 0, policy_version 434940 (0.0040) [2024-06-23 12:30:03,250][15401] Updated weights for policy 0, policy_version 434950 (0.0040) [2024-06-23 12:30:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7126237184. Throughput: 0: 42819.2. Samples: 7126396720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 12:30:06,988][15401] Updated weights for policy 0, policy_version 434960 (0.0043) [2024-06-23 12:30:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7126466560. Throughput: 0: 42781.8. Samples: 7126524480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 12:30:10,679][15401] Updated weights for policy 0, policy_version 434970 (0.0030) [2024-06-23 12:30:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.2, 300 sec: 42709.5). Total num frames: 7126646784. Throughput: 0: 42780.8. Samples: 7126786320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 12:30:14,501][15401] Updated weights for policy 0, policy_version 434980 (0.0028) [2024-06-23 12:30:18,192][15401] Updated weights for policy 0, policy_version 434990 (0.0033) [2024-06-23 12:30:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7126876160. Throughput: 0: 42654.6. Samples: 7127033800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 12:30:22,083][15401] Updated weights for policy 0, policy_version 435000 (0.0027) [2024-06-23 12:30:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7127105536. Throughput: 0: 42730.2. Samples: 7127164280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 12:30:25,680][15401] Updated weights for policy 0, policy_version 435010 (0.0033) [2024-06-23 12:30:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7127285760. Throughput: 0: 42836.8. Samples: 7127427100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 12:30:29,652][15401] Updated weights for policy 0, policy_version 435020 (0.0032) [2024-06-23 12:30:33,303][15401] Updated weights for policy 0, policy_version 435030 (0.0024) [2024-06-23 12:30:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7127531520. Throughput: 0: 42702.1. Samples: 7127674060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 12:30:37,166][15401] Updated weights for policy 0, policy_version 435040 (0.0029) [2024-06-23 12:30:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7127744512. Throughput: 0: 42762.6. Samples: 7127808320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:38,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 12:30:40,846][15401] Updated weights for policy 0, policy_version 435050 (0.0040) [2024-06-23 12:30:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7127924736. Throughput: 0: 42721.4. Samples: 7128065320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 12:30:45,231][15401] Updated weights for policy 0, policy_version 435060 (0.0034) [2024-06-23 12:30:45,690][15349] Signal inference workers to stop experience collection... (105600 times) [2024-06-23 12:30:45,691][15349] Signal inference workers to resume experience collection... (105600 times) [2024-06-23 12:30:45,736][15401] InferenceWorker_p0-w0: stopping experience collection (105600 times) [2024-06-23 12:30:45,736][15401] InferenceWorker_p0-w0: resuming experience collection (105600 times) [2024-06-23 12:30:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7128154112. Throughput: 0: 42503.2. Samples: 7128309360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:48,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 12:30:48,759][15401] Updated weights for policy 0, policy_version 435070 (0.0028) [2024-06-23 12:30:52,801][15401] Updated weights for policy 0, policy_version 435080 (0.0042) [2024-06-23 12:30:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7128383488. Throughput: 0: 42705.0. Samples: 7128446200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:53,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-23 12:30:56,634][15401] Updated weights for policy 0, policy_version 435090 (0.0034) [2024-06-23 12:30:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7128563712. Throughput: 0: 42510.8. Samples: 7128699300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 12:30:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 12:31:00,407][15401] Updated weights for policy 0, policy_version 435100 (0.0029) [2024-06-23 12:31:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7128793088. Throughput: 0: 42612.0. Samples: 7128951340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 12:31:04,371][15401] Updated weights for policy 0, policy_version 435110 (0.0024) [2024-06-23 12:31:08,008][15401] Updated weights for policy 0, policy_version 435120 (0.0031) [2024-06-23 12:31:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7129022464. Throughput: 0: 42716.0. Samples: 7129086500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 12:31:12,027][15401] Updated weights for policy 0, policy_version 435130 (0.0025) [2024-06-23 12:31:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7129202688. Throughput: 0: 42478.3. Samples: 7129338620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:31:15,675][15401] Updated weights for policy 0, policy_version 435140 (0.0040) [2024-06-23 12:31:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7129432064. Throughput: 0: 42703.1. Samples: 7129595700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 12:31:19,611][15401] Updated weights for policy 0, policy_version 435150 (0.0025) [2024-06-23 12:31:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7129645056. Throughput: 0: 42615.0. Samples: 7129725900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 12:31:23,545][15401] Updated weights for policy 0, policy_version 435160 (0.0039) [2024-06-23 12:31:27,543][15401] Updated weights for policy 0, policy_version 435170 (0.0037) [2024-06-23 12:31:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7129841664. Throughput: 0: 42659.4. Samples: 7129985000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:28,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 12:31:31,206][15401] Updated weights for policy 0, policy_version 435180 (0.0028) [2024-06-23 12:31:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7130071040. Throughput: 0: 42888.9. Samples: 7130239360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 12:31:35,034][15401] Updated weights for policy 0, policy_version 435190 (0.0050) [2024-06-23 12:31:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 7130284032. Throughput: 0: 42863.4. Samples: 7130375060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:31:38,707][15401] Updated weights for policy 0, policy_version 435200 (0.0029) [2024-06-23 12:31:42,590][15401] Updated weights for policy 0, policy_version 435210 (0.0034) [2024-06-23 12:31:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7130497024. Throughput: 0: 42831.8. Samples: 7130626740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 12:31:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435211_7130497024.pth... [2024-06-23 12:31:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434586_7120257024.pth [2024-06-23 12:31:46,729][15401] Updated weights for policy 0, policy_version 435220 (0.0037) [2024-06-23 12:31:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7130726400. Throughput: 0: 42803.6. Samples: 7130877500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 12:31:50,717][15401] Updated weights for policy 0, policy_version 435230 (0.0040) [2024-06-23 12:31:53,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42320.7, 300 sec: 42708.5). Total num frames: 7130923008. Throughput: 0: 42690.2. Samples: 7131007840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:53,397][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 12:31:54,198][15401] Updated weights for policy 0, policy_version 435240 (0.0039) [2024-06-23 12:31:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 7131119616. Throughput: 0: 42807.0. Samples: 7131264940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:31:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 12:31:58,519][15401] Updated weights for policy 0, policy_version 435250 (0.0030) [2024-06-23 12:32:01,807][15401] Updated weights for policy 0, policy_version 435260 (0.0031) [2024-06-23 12:32:03,389][15132] Fps is (10 sec: 42626.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7131348992. Throughput: 0: 42632.6. Samples: 7131514160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:32:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 12:32:06,096][15401] Updated weights for policy 0, policy_version 435270 (0.0037) [2024-06-23 12:32:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7131561984. Throughput: 0: 42757.9. Samples: 7131650000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:32:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 12:32:09,587][15401] Updated weights for policy 0, policy_version 435280 (0.0044) [2024-06-23 12:32:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7131758592. Throughput: 0: 42677.4. Samples: 7131905480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:32:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 12:32:13,702][15401] Updated weights for policy 0, policy_version 435290 (0.0027) [2024-06-23 12:32:16,985][15401] Updated weights for policy 0, policy_version 435300 (0.0032) [2024-06-23 12:32:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7132004352. Throughput: 0: 42630.6. Samples: 7132157740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:32:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 12:32:21,238][15401] Updated weights for policy 0, policy_version 435310 (0.0031) [2024-06-23 12:32:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7132217344. Throughput: 0: 42630.3. Samples: 7132293420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 12:32:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 12:32:23,590][15349] Signal inference workers to stop experience collection... (105650 times) [2024-06-23 12:32:23,627][15401] InferenceWorker_p0-w0: stopping experience collection (105650 times) [2024-06-23 12:32:23,640][15349] Signal inference workers to resume experience collection... (105650 times) [2024-06-23 12:32:23,653][15401] InferenceWorker_p0-w0: resuming experience collection (105650 times) [2024-06-23 12:32:24,550][15401] Updated weights for policy 0, policy_version 435320 (0.0037) [2024-06-23 12:32:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7132413952. Throughput: 0: 42760.5. Samples: 7132550960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 12:32:28,721][15401] Updated weights for policy 0, policy_version 435330 (0.0032) [2024-06-23 12:32:32,276][15401] Updated weights for policy 0, policy_version 435340 (0.0039) [2024-06-23 12:32:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 7132643328. Throughput: 0: 42914.5. Samples: 7132808660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 12:32:36,373][15401] Updated weights for policy 0, policy_version 435350 (0.0030) [2024-06-23 12:32:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 7132872704. Throughput: 0: 42902.7. Samples: 7132938180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:38,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 12:32:40,091][15401] Updated weights for policy 0, policy_version 435360 (0.0036) [2024-06-23 12:32:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7133036544. Throughput: 0: 42913.0. Samples: 7133196020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:43,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 12:32:44,075][15401] Updated weights for policy 0, policy_version 435370 (0.0034) [2024-06-23 12:32:47,813][15401] Updated weights for policy 0, policy_version 435380 (0.0034) [2024-06-23 12:32:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7133282304. Throughput: 0: 42984.4. Samples: 7133448460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 12:32:51,705][15401] Updated weights for policy 0, policy_version 435390 (0.0038) [2024-06-23 12:32:53,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43149.3, 300 sec: 42709.5). Total num frames: 7133511680. Throughput: 0: 42858.7. Samples: 7133578640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:32:55,700][15401] Updated weights for policy 0, policy_version 435400 (0.0027) [2024-06-23 12:32:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7133691904. Throughput: 0: 42961.0. Samples: 7133838720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:32:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:32:59,110][15401] Updated weights for policy 0, policy_version 435410 (0.0029) [2024-06-23 12:33:03,229][15401] Updated weights for policy 0, policy_version 435420 (0.0044) [2024-06-23 12:33:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7133921280. Throughput: 0: 42973.4. Samples: 7134091540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 12:33:06,765][15401] Updated weights for policy 0, policy_version 435430 (0.0033) [2024-06-23 12:33:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7134150656. Throughput: 0: 42883.2. Samples: 7134223160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 12:33:10,973][15401] Updated weights for policy 0, policy_version 435440 (0.0042) [2024-06-23 12:33:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7134347264. Throughput: 0: 42780.4. Samples: 7134476080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 12:33:14,490][15401] Updated weights for policy 0, policy_version 435450 (0.0035) [2024-06-23 12:33:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7134576640. Throughput: 0: 42806.1. Samples: 7134734920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 12:33:18,397][15401] Updated weights for policy 0, policy_version 435460 (0.0044) [2024-06-23 12:33:22,255][15401] Updated weights for policy 0, policy_version 435470 (0.0042) [2024-06-23 12:33:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7134756864. Throughput: 0: 42705.3. Samples: 7134859920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:23,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-23 12:33:25,956][15401] Updated weights for policy 0, policy_version 435480 (0.0032) [2024-06-23 12:33:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7135002624. Throughput: 0: 42666.1. Samples: 7135116000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 12:33:29,718][15401] Updated weights for policy 0, policy_version 435490 (0.0024) [2024-06-23 12:33:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7135199232. Throughput: 0: 42761.4. Samples: 7135372720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 12:33:33,884][15401] Updated weights for policy 0, policy_version 435500 (0.0043) [2024-06-23 12:33:38,008][15401] Updated weights for policy 0, policy_version 435510 (0.0032) [2024-06-23 12:33:38,008][15349] Signal inference workers to stop experience collection... (105700 times) [2024-06-23 12:33:38,008][15349] Signal inference workers to resume experience collection... (105700 times) [2024-06-23 12:33:38,048][15401] InferenceWorker_p0-w0: stopping experience collection (105700 times) [2024-06-23 12:33:38,048][15401] InferenceWorker_p0-w0: resuming experience collection (105700 times) [2024-06-23 12:33:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7135412224. Throughput: 0: 42660.4. Samples: 7135498360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 12:33:41,466][15401] Updated weights for policy 0, policy_version 435520 (0.0041) [2024-06-23 12:33:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7135625216. Throughput: 0: 42647.0. Samples: 7135757840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 12:33:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 12:33:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435525_7135641600.pth... [2024-06-23 12:33:43,541][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000434898_7125368832.pth [2024-06-23 12:33:45,462][15401] Updated weights for policy 0, policy_version 435530 (0.0031) [2024-06-23 12:33:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7135854592. Throughput: 0: 42620.8. Samples: 7136009480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:33:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 12:33:48,977][15401] Updated weights for policy 0, policy_version 435540 (0.0031) [2024-06-23 12:33:53,142][15401] Updated weights for policy 0, policy_version 435550 (0.0041) [2024-06-23 12:33:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.1, 300 sec: 42709.5). Total num frames: 7136051200. Throughput: 0: 42563.7. Samples: 7136138540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:33:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 12:33:56,538][15401] Updated weights for policy 0, policy_version 435560 (0.0028) [2024-06-23 12:33:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7136264192. Throughput: 0: 42705.1. Samples: 7136397800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:33:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 12:34:00,720][15401] Updated weights for policy 0, policy_version 435570 (0.0028) [2024-06-23 12:34:03,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 7136477184. Throughput: 0: 42573.1. Samples: 7136650820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:03,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 12:34:04,373][15401] Updated weights for policy 0, policy_version 435580 (0.0041) [2024-06-23 12:34:08,360][15401] Updated weights for policy 0, policy_version 435590 (0.0037) [2024-06-23 12:34:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.9). Total num frames: 7136706560. Throughput: 0: 42722.7. Samples: 7136782440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:08,394][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 12:34:12,194][15401] Updated weights for policy 0, policy_version 435600 (0.0030) [2024-06-23 12:34:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 7136919552. Throughput: 0: 42732.5. Samples: 7137038960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 12:34:15,975][15401] Updated weights for policy 0, policy_version 435610 (0.0023) [2024-06-23 12:34:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7137132544. Throughput: 0: 42759.9. Samples: 7137296920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 12:34:19,775][15401] Updated weights for policy 0, policy_version 435620 (0.0043) [2024-06-23 12:34:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7137329152. Throughput: 0: 42773.4. Samples: 7137423160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 12:34:23,569][15401] Updated weights for policy 0, policy_version 435630 (0.0039) [2024-06-23 12:34:27,471][15401] Updated weights for policy 0, policy_version 435640 (0.0045) [2024-06-23 12:34:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7137525760. Throughput: 0: 42636.0. Samples: 7137676460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 12:34:31,209][15401] Updated weights for policy 0, policy_version 435650 (0.0036) [2024-06-23 12:34:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7137771520. Throughput: 0: 42597.3. Samples: 7137926360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 12:34:35,687][15401] Updated weights for policy 0, policy_version 435660 (0.0036) [2024-06-23 12:34:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7137984512. Throughput: 0: 42671.8. Samples: 7138058760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:38,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 12:34:39,065][15401] Updated weights for policy 0, policy_version 435670 (0.0032) [2024-06-23 12:34:43,259][15401] Updated weights for policy 0, policy_version 435680 (0.0026) [2024-06-23 12:34:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7138181120. Throughput: 0: 42493.2. Samples: 7138310100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:43,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 12:34:46,858][15401] Updated weights for policy 0, policy_version 435690 (0.0034) [2024-06-23 12:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7138410496. Throughput: 0: 42537.3. Samples: 7138564900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 12:34:50,806][15401] Updated weights for policy 0, policy_version 435700 (0.0026) [2024-06-23 12:34:53,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7138639872. Throughput: 0: 42560.4. Samples: 7138697660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 12:34:54,562][15401] Updated weights for policy 0, policy_version 435710 (0.0025) [2024-06-23 12:34:55,533][15349] Signal inference workers to stop experience collection... (105750 times) [2024-06-23 12:34:55,534][15349] Signal inference workers to resume experience collection... (105750 times) [2024-06-23 12:34:55,546][15401] InferenceWorker_p0-w0: stopping experience collection (105750 times) [2024-06-23 12:34:55,570][15401] InferenceWorker_p0-w0: resuming experience collection (105750 times) [2024-06-23 12:34:58,377][15401] Updated weights for policy 0, policy_version 435720 (0.0038) [2024-06-23 12:34:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7138836480. Throughput: 0: 42523.6. Samples: 7138952520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:34:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 12:35:02,157][15401] Updated weights for policy 0, policy_version 435730 (0.0038) [2024-06-23 12:35:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 7139049472. Throughput: 0: 42358.6. Samples: 7139203060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:35:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 12:35:06,090][15401] Updated weights for policy 0, policy_version 435740 (0.0029) [2024-06-23 12:35:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7139262464. Throughput: 0: 42520.4. Samples: 7139336580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 12:35:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 12:35:09,602][15401] Updated weights for policy 0, policy_version 435750 (0.0042) [2024-06-23 12:35:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7139459072. Throughput: 0: 42716.4. Samples: 7139598700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 12:35:13,718][15401] Updated weights for policy 0, policy_version 435760 (0.0029) [2024-06-23 12:35:17,117][15401] Updated weights for policy 0, policy_version 435770 (0.0038) [2024-06-23 12:35:18,391][15132] Fps is (10 sec: 44231.5, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 7139704832. Throughput: 0: 42624.7. Samples: 7139844520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:18,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 12:35:21,292][15401] Updated weights for policy 0, policy_version 435780 (0.0034) [2024-06-23 12:35:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7139901440. Throughput: 0: 42704.9. Samples: 7139980480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 12:35:24,895][15401] Updated weights for policy 0, policy_version 435790 (0.0031) [2024-06-23 12:35:28,390][15132] Fps is (10 sec: 40964.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7140114432. Throughput: 0: 42808.5. Samples: 7140236380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 12:35:28,842][15401] Updated weights for policy 0, policy_version 435800 (0.0033) [2024-06-23 12:35:32,611][15401] Updated weights for policy 0, policy_version 435810 (0.0030) [2024-06-23 12:35:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7140343808. Throughput: 0: 42749.3. Samples: 7140488620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:35:36,566][15401] Updated weights for policy 0, policy_version 435820 (0.0045) [2024-06-23 12:35:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7140524032. Throughput: 0: 42713.3. Samples: 7140619760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:38,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 12:35:40,234][15401] Updated weights for policy 0, policy_version 435830 (0.0040) [2024-06-23 12:35:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7140753408. Throughput: 0: 42712.8. Samples: 7140874600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 12:35:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435837_7140753408.pth... [2024-06-23 12:35:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435211_7130497024.pth [2024-06-23 12:35:44,381][15401] Updated weights for policy 0, policy_version 435840 (0.0046) [2024-06-23 12:35:47,904][15401] Updated weights for policy 0, policy_version 435850 (0.0031) [2024-06-23 12:35:48,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7140999168. Throughput: 0: 42608.1. Samples: 7141120420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 12:35:51,953][15401] Updated weights for policy 0, policy_version 435860 (0.0050) [2024-06-23 12:35:53,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41777.5, 300 sec: 42653.6). Total num frames: 7141146624. Throughput: 0: 42582.2. Samples: 7141252880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:53,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 12:35:55,703][15401] Updated weights for policy 0, policy_version 435870 (0.0042) [2024-06-23 12:35:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7141392384. Throughput: 0: 42474.7. Samples: 7141510060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:35:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 12:36:00,092][15401] Updated weights for policy 0, policy_version 435880 (0.0034) [2024-06-23 12:36:03,390][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7141605376. Throughput: 0: 42729.6. Samples: 7141767300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 12:36:03,508][15401] Updated weights for policy 0, policy_version 435890 (0.0043) [2024-06-23 12:36:07,761][15401] Updated weights for policy 0, policy_version 435900 (0.0041) [2024-06-23 12:36:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7141785600. Throughput: 0: 42466.7. Samples: 7141891480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:08,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 12:36:11,154][15401] Updated weights for policy 0, policy_version 435910 (0.0025) [2024-06-23 12:36:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7142014976. Throughput: 0: 42446.7. Samples: 7142146480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 12:36:15,992][15401] Updated weights for policy 0, policy_version 435920 (0.0039) [2024-06-23 12:36:17,021][15349] Signal inference workers to stop experience collection... (105800 times) [2024-06-23 12:36:17,068][15401] InferenceWorker_p0-w0: stopping experience collection (105800 times) [2024-06-23 12:36:17,077][15349] Signal inference workers to resume experience collection... (105800 times) [2024-06-23 12:36:17,090][15401] InferenceWorker_p0-w0: resuming experience collection (105800 times) [2024-06-23 12:36:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 7142260736. Throughput: 0: 42482.7. Samples: 7142400340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:36:18,979][15401] Updated weights for policy 0, policy_version 435930 (0.0029) [2024-06-23 12:36:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 7142424576. Throughput: 0: 42503.6. Samples: 7142532420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 12:36:23,417][15401] Updated weights for policy 0, policy_version 435940 (0.0036) [2024-06-23 12:36:26,646][15401] Updated weights for policy 0, policy_version 435950 (0.0035) [2024-06-23 12:36:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7142670336. Throughput: 0: 42532.0. Samples: 7142788540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 12:36:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 12:36:30,895][15401] Updated weights for policy 0, policy_version 435960 (0.0029) [2024-06-23 12:36:33,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7142899712. Throughput: 0: 42692.5. Samples: 7143041580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 12:36:34,321][15401] Updated weights for policy 0, policy_version 435970 (0.0035) [2024-06-23 12:36:38,396][15132] Fps is (10 sec: 39296.8, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 7143063552. Throughput: 0: 42643.4. Samples: 7143172000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:38,397][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 12:36:38,668][15401] Updated weights for policy 0, policy_version 435980 (0.0035) [2024-06-23 12:36:42,029][15401] Updated weights for policy 0, policy_version 435990 (0.0045) [2024-06-23 12:36:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7143292928. Throughput: 0: 42533.2. Samples: 7143424060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:43,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 12:36:46,495][15401] Updated weights for policy 0, policy_version 436000 (0.0033) [2024-06-23 12:36:48,389][15132] Fps is (10 sec: 47544.3, 60 sec: 42325.4, 300 sec: 42766.0). Total num frames: 7143538688. Throughput: 0: 42489.9. Samples: 7143679340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:48,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 12:36:49,734][15401] Updated weights for policy 0, policy_version 436010 (0.0027) [2024-06-23 12:36:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 7143718912. Throughput: 0: 42841.9. Samples: 7143819380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:53,391][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 12:36:54,075][15401] Updated weights for policy 0, policy_version 436020 (0.0032) [2024-06-23 12:36:57,520][15401] Updated weights for policy 0, policy_version 436030 (0.0036) [2024-06-23 12:36:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7143931904. Throughput: 0: 42869.8. Samples: 7144075620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:36:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 12:37:01,794][15401] Updated weights for policy 0, policy_version 436040 (0.0031) [2024-06-23 12:37:03,389][15132] Fps is (10 sec: 45876.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7144177664. Throughput: 0: 42625.4. Samples: 7144318480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 12:37:05,245][15401] Updated weights for policy 0, policy_version 436050 (0.0033) [2024-06-23 12:37:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7144357888. Throughput: 0: 42753.8. Samples: 7144456340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 12:37:09,328][15401] Updated weights for policy 0, policy_version 436060 (0.0039) [2024-06-23 12:37:12,890][15401] Updated weights for policy 0, policy_version 436070 (0.0036) [2024-06-23 12:37:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7144570880. Throughput: 0: 42556.9. Samples: 7144703600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 12:37:16,906][15401] Updated weights for policy 0, policy_version 436080 (0.0041) [2024-06-23 12:37:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7144816640. Throughput: 0: 42613.3. Samples: 7144959180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 12:37:20,500][15401] Updated weights for policy 0, policy_version 436090 (0.0040) [2024-06-23 12:37:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7144996864. Throughput: 0: 42662.5. Samples: 7145091540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 12:37:24,845][15401] Updated weights for policy 0, policy_version 436100 (0.0028) [2024-06-23 12:37:28,165][15401] Updated weights for policy 0, policy_version 436110 (0.0041) [2024-06-23 12:37:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7145226240. Throughput: 0: 42572.5. Samples: 7145339820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 12:37:32,435][15401] Updated weights for policy 0, policy_version 436120 (0.0044) [2024-06-23 12:37:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7145439232. Throughput: 0: 42623.9. Samples: 7145597420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 12:37:33,728][15349] Signal inference workers to stop experience collection... (105850 times) [2024-06-23 12:37:33,728][15349] Signal inference workers to resume experience collection... (105850 times) [2024-06-23 12:37:33,751][15401] InferenceWorker_p0-w0: stopping experience collection (105850 times) [2024-06-23 12:37:33,781][15401] InferenceWorker_p0-w0: resuming experience collection (105850 times) [2024-06-23 12:37:35,735][15401] Updated weights for policy 0, policy_version 436130 (0.0042) [2024-06-23 12:37:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 7145652224. Throughput: 0: 42445.5. Samples: 7145729420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:38,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 12:37:40,022][15401] Updated weights for policy 0, policy_version 436140 (0.0028) [2024-06-23 12:37:43,285][15401] Updated weights for policy 0, policy_version 436150 (0.0042) [2024-06-23 12:37:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7145881600. Throughput: 0: 42411.5. Samples: 7145984140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 12:37:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436150_7145881600.pth... [2024-06-23 12:37:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435525_7135641600.pth [2024-06-23 12:37:47,657][15401] Updated weights for policy 0, policy_version 436160 (0.0042) [2024-06-23 12:37:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7146078208. Throughput: 0: 42815.0. Samples: 7146245160. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:37:51,238][15401] Updated weights for policy 0, policy_version 436170 (0.0038) [2024-06-23 12:37:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 7146291200. Throughput: 0: 42652.1. Samples: 7146375680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-23 12:37:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:37:55,154][15401] Updated weights for policy 0, policy_version 436180 (0.0032) [2024-06-23 12:37:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7146504192. Throughput: 0: 42661.7. Samples: 7146623380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:37:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 12:37:58,827][15401] Updated weights for policy 0, policy_version 436190 (0.0028) [2024-06-23 12:38:02,933][15401] Updated weights for policy 0, policy_version 436200 (0.0041) [2024-06-23 12:38:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7146717184. Throughput: 0: 42873.0. Samples: 7146888460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 12:38:06,510][15401] Updated weights for policy 0, policy_version 436210 (0.0044) [2024-06-23 12:38:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7146946560. Throughput: 0: 42744.9. Samples: 7147015060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 12:38:10,327][15401] Updated weights for policy 0, policy_version 436220 (0.0038) [2024-06-23 12:38:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7147159552. Throughput: 0: 42917.4. Samples: 7147271100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:13,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 12:38:14,098][15401] Updated weights for policy 0, policy_version 436230 (0.0036) [2024-06-23 12:38:17,805][15401] Updated weights for policy 0, policy_version 436240 (0.0038) [2024-06-23 12:38:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7147356160. Throughput: 0: 42959.0. Samples: 7147530580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 12:38:21,511][15401] Updated weights for policy 0, policy_version 436250 (0.0034) [2024-06-23 12:38:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7147569152. Throughput: 0: 42881.7. Samples: 7147659100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:23,391][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:38:25,591][15401] Updated weights for policy 0, policy_version 436260 (0.0025) [2024-06-23 12:38:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7147798528. Throughput: 0: 42947.6. Samples: 7147916780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 12:38:28,985][15401] Updated weights for policy 0, policy_version 436270 (0.0030) [2024-06-23 12:38:33,300][15401] Updated weights for policy 0, policy_version 436280 (0.0026) [2024-06-23 12:38:33,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7148011520. Throughput: 0: 43007.6. Samples: 7148180500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 12:38:36,399][15401] Updated weights for policy 0, policy_version 436290 (0.0030) [2024-06-23 12:38:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7148224512. Throughput: 0: 42884.8. Samples: 7148305500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 12:38:40,976][15401] Updated weights for policy 0, policy_version 436300 (0.0026) [2024-06-23 12:38:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7148453888. Throughput: 0: 43258.2. Samples: 7148570000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 12:38:44,117][15401] Updated weights for policy 0, policy_version 436310 (0.0045) [2024-06-23 12:38:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7148634112. Throughput: 0: 43140.4. Samples: 7148829780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 12:38:48,602][15401] Updated weights for policy 0, policy_version 436320 (0.0036) [2024-06-23 12:38:51,902][15401] Updated weights for policy 0, policy_version 436330 (0.0030) [2024-06-23 12:38:53,393][15132] Fps is (10 sec: 42584.0, 60 sec: 43142.0, 300 sec: 42764.5). Total num frames: 7148879872. Throughput: 0: 42984.2. Samples: 7148949500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:53,393][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 12:38:56,133][15401] Updated weights for policy 0, policy_version 436340 (0.0044) [2024-06-23 12:38:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 7149092864. Throughput: 0: 43116.5. Samples: 7149211340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:38:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 12:38:59,302][15349] Signal inference workers to stop experience collection... (105900 times) [2024-06-23 12:38:59,304][15349] Signal inference workers to resume experience collection... (105900 times) [2024-06-23 12:38:59,341][15401] InferenceWorker_p0-w0: stopping experience collection (105900 times) [2024-06-23 12:38:59,342][15401] InferenceWorker_p0-w0: resuming experience collection (105900 times) [2024-06-23 12:38:59,442][15401] Updated weights for policy 0, policy_version 436350 (0.0035) [2024-06-23 12:39:03,390][15132] Fps is (10 sec: 40974.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7149289472. Throughput: 0: 43269.8. Samples: 7149477720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:39:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 12:39:03,847][15401] Updated weights for policy 0, policy_version 436360 (0.0034) [2024-06-23 12:39:07,182][15401] Updated weights for policy 0, policy_version 436370 (0.0035) [2024-06-23 12:39:08,395][15132] Fps is (10 sec: 42574.7, 60 sec: 42867.5, 300 sec: 42708.7). Total num frames: 7149518848. Throughput: 0: 43139.7. Samples: 7149600620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:39:08,396][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 12:39:11,448][15401] Updated weights for policy 0, policy_version 436380 (0.0035) [2024-06-23 12:39:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7149731840. Throughput: 0: 43107.5. Samples: 7149856620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:39:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 12:39:14,746][15401] Updated weights for policy 0, policy_version 436390 (0.0034) [2024-06-23 12:39:18,390][15132] Fps is (10 sec: 40982.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7149928448. Throughput: 0: 42969.8. Samples: 7150114140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 12:39:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 12:39:19,329][15401] Updated weights for policy 0, policy_version 436400 (0.0040) [2024-06-23 12:39:22,549][15401] Updated weights for policy 0, policy_version 436410 (0.0035) [2024-06-23 12:39:23,394][15132] Fps is (10 sec: 42580.0, 60 sec: 43141.5, 300 sec: 42819.9). Total num frames: 7150157824. Throughput: 0: 42930.1. Samples: 7150237540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:23,394][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 12:39:27,038][15401] Updated weights for policy 0, policy_version 436420 (0.0028) [2024-06-23 12:39:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7150370816. Throughput: 0: 42846.8. Samples: 7150498100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 12:39:30,511][15401] Updated weights for policy 0, policy_version 436430 (0.0038) [2024-06-23 12:39:33,390][15132] Fps is (10 sec: 40977.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7150567424. Throughput: 0: 42751.9. Samples: 7150753620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:39:34,634][15401] Updated weights for policy 0, policy_version 436440 (0.0029) [2024-06-23 12:39:37,950][15401] Updated weights for policy 0, policy_version 436450 (0.0035) [2024-06-23 12:39:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 7150813184. Throughput: 0: 42881.1. Samples: 7150879000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 12:39:42,077][15401] Updated weights for policy 0, policy_version 436460 (0.0035) [2024-06-23 12:39:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7151026176. Throughput: 0: 42963.1. Samples: 7151144680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 12:39:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436465_7151042560.pth... [2024-06-23 12:39:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000435837_7140753408.pth [2024-06-23 12:39:45,519][15401] Updated weights for policy 0, policy_version 436470 (0.0028) [2024-06-23 12:39:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7151222784. Throughput: 0: 42762.2. Samples: 7151402020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:48,392][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 12:39:49,740][15401] Updated weights for policy 0, policy_version 436480 (0.0026) [2024-06-23 12:39:53,069][15401] Updated weights for policy 0, policy_version 436490 (0.0032) [2024-06-23 12:39:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.9, 300 sec: 42765.0). Total num frames: 7151452160. Throughput: 0: 42903.9. Samples: 7151531060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 12:39:57,240][15401] Updated weights for policy 0, policy_version 436500 (0.0033) [2024-06-23 12:39:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7151665152. Throughput: 0: 43059.5. Samples: 7151794300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:39:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 12:40:00,560][15401] Updated weights for policy 0, policy_version 436510 (0.0022) [2024-06-23 12:40:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7151878144. Throughput: 0: 42919.2. Samples: 7152045500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 12:40:04,772][15401] Updated weights for policy 0, policy_version 436520 (0.0035) [2024-06-23 12:40:08,067][15401] Updated weights for policy 0, policy_version 436530 (0.0031) [2024-06-23 12:40:08,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43148.6, 300 sec: 42876.1). Total num frames: 7152107520. Throughput: 0: 43080.7. Samples: 7152175980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 12:40:12,513][15401] Updated weights for policy 0, policy_version 436540 (0.0036) [2024-06-23 12:40:13,319][15349] Signal inference workers to stop experience collection... (105950 times) [2024-06-23 12:40:13,319][15349] Signal inference workers to resume experience collection... (105950 times) [2024-06-23 12:40:13,364][15401] InferenceWorker_p0-w0: stopping experience collection (105950 times) [2024-06-23 12:40:13,364][15401] InferenceWorker_p0-w0: resuming experience collection (105950 times) [2024-06-23 12:40:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.7). Total num frames: 7152304128. Throughput: 0: 43124.1. Samples: 7152438680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 12:40:15,976][15401] Updated weights for policy 0, policy_version 436550 (0.0036) [2024-06-23 12:40:18,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7152517120. Throughput: 0: 43118.7. Samples: 7152694060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:18,393][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 12:40:20,208][15401] Updated weights for policy 0, policy_version 436560 (0.0032) [2024-06-23 12:40:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43147.6, 300 sec: 42820.5). Total num frames: 7152746496. Throughput: 0: 43131.4. Samples: 7152819920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 12:40:23,701][15401] Updated weights for policy 0, policy_version 436570 (0.0040) [2024-06-23 12:40:27,741][15401] Updated weights for policy 0, policy_version 436580 (0.0042) [2024-06-23 12:40:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7152943104. Throughput: 0: 42980.4. Samples: 7153078800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 12:40:31,221][15401] Updated weights for policy 0, policy_version 436590 (0.0028) [2024-06-23 12:40:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7153156096. Throughput: 0: 42938.2. Samples: 7153334240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:40:35,355][15401] Updated weights for policy 0, policy_version 436600 (0.0032) [2024-06-23 12:40:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7153401856. Throughput: 0: 42953.0. Samples: 7153463940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 12:40:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 12:40:38,637][15401] Updated weights for policy 0, policy_version 436610 (0.0031) [2024-06-23 12:40:42,885][15401] Updated weights for policy 0, policy_version 436620 (0.0046) [2024-06-23 12:40:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7153598464. Throughput: 0: 42846.2. Samples: 7153722380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:40:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 12:40:46,115][15401] Updated weights for policy 0, policy_version 436630 (0.0033) [2024-06-23 12:40:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 7153811456. Throughput: 0: 43061.3. Samples: 7153983260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:40:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 12:40:50,643][15401] Updated weights for policy 0, policy_version 436640 (0.0047) [2024-06-23 12:40:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7154040832. Throughput: 0: 42949.2. Samples: 7154108700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:40:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 12:40:53,903][15401] Updated weights for policy 0, policy_version 436650 (0.0049) [2024-06-23 12:40:58,337][15401] Updated weights for policy 0, policy_version 436660 (0.0033) [2024-06-23 12:40:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7154237440. Throughput: 0: 42954.6. Samples: 7154371640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:40:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 12:41:01,406][15401] Updated weights for policy 0, policy_version 436670 (0.0031) [2024-06-23 12:41:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7154434048. Throughput: 0: 42941.9. Samples: 7154626340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:03,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 12:41:05,957][15401] Updated weights for policy 0, policy_version 436680 (0.0052) [2024-06-23 12:41:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 7154696192. Throughput: 0: 42945.0. Samples: 7154752440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:08,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 12:41:09,287][15401] Updated weights for policy 0, policy_version 436690 (0.0028) [2024-06-23 12:41:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7154860032. Throughput: 0: 42978.7. Samples: 7155012840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 12:41:13,533][15401] Updated weights for policy 0, policy_version 436700 (0.0028) [2024-06-23 12:41:16,849][15401] Updated weights for policy 0, policy_version 436710 (0.0041) [2024-06-23 12:41:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 7155089408. Throughput: 0: 42908.1. Samples: 7155265100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:18,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 12:41:21,284][15401] Updated weights for policy 0, policy_version 436720 (0.0034) [2024-06-23 12:41:23,389][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7155335168. Throughput: 0: 42957.2. Samples: 7155397020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 12:41:24,682][15401] Updated weights for policy 0, policy_version 436730 (0.0035) [2024-06-23 12:41:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7155515392. Throughput: 0: 42862.0. Samples: 7155651160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 12:41:28,781][15401] Updated weights for policy 0, policy_version 436740 (0.0046) [2024-06-23 12:41:32,202][15401] Updated weights for policy 0, policy_version 436750 (0.0040) [2024-06-23 12:41:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42932.6). Total num frames: 7155728384. Throughput: 0: 42796.5. Samples: 7155909100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 12:41:36,438][15401] Updated weights for policy 0, policy_version 436760 (0.0027) [2024-06-23 12:41:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 7155957760. Throughput: 0: 42925.0. Samples: 7156040320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 12:41:39,689][15401] Updated weights for policy 0, policy_version 436770 (0.0028) [2024-06-23 12:41:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7156154368. Throughput: 0: 42725.8. Samples: 7156294300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 12:41:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436777_7156154368.pth... [2024-06-23 12:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436150_7145881600.pth [2024-06-23 12:41:43,490][15349] Signal inference workers to stop experience collection... (106000 times) [2024-06-23 12:41:43,540][15401] InferenceWorker_p0-w0: stopping experience collection (106000 times) [2024-06-23 12:41:43,608][15349] Signal inference workers to resume experience collection... (106000 times) [2024-06-23 12:41:43,608][15401] InferenceWorker_p0-w0: resuming experience collection (106000 times) [2024-06-23 12:41:44,105][15401] Updated weights for policy 0, policy_version 436780 (0.0035) [2024-06-23 12:41:47,593][15401] Updated weights for policy 0, policy_version 436790 (0.0031) [2024-06-23 12:41:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 7156383744. Throughput: 0: 42632.4. Samples: 7156544800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 12:41:51,691][15401] Updated weights for policy 0, policy_version 436800 (0.0036) [2024-06-23 12:41:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7156596736. Throughput: 0: 42722.4. Samples: 7156674940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 12:41:55,337][15401] Updated weights for policy 0, policy_version 436810 (0.0029) [2024-06-23 12:41:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7156793344. Throughput: 0: 42650.6. Samples: 7156932120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:41:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 12:41:59,285][15401] Updated weights for policy 0, policy_version 436820 (0.0034) [2024-06-23 12:42:02,890][15401] Updated weights for policy 0, policy_version 436830 (0.0040) [2024-06-23 12:42:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7157022720. Throughput: 0: 42576.8. Samples: 7157181060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-23 12:42:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 12:42:06,984][15401] Updated weights for policy 0, policy_version 436840 (0.0039) [2024-06-23 12:42:08,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.7, 300 sec: 42986.8). Total num frames: 7157252096. Throughput: 0: 42693.3. Samples: 7157318320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:08,393][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 12:42:10,393][15401] Updated weights for policy 0, policy_version 436850 (0.0036) [2024-06-23 12:42:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 7157448704. Throughput: 0: 42916.3. Samples: 7157582400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:42:14,535][15401] Updated weights for policy 0, policy_version 436860 (0.0043) [2024-06-23 12:42:18,281][15401] Updated weights for policy 0, policy_version 436870 (0.0030) [2024-06-23 12:42:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 7157678080. Throughput: 0: 42671.5. Samples: 7157829320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 12:42:22,090][15401] Updated weights for policy 0, policy_version 436880 (0.0038) [2024-06-23 12:42:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 7157891072. Throughput: 0: 42688.0. Samples: 7157961280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 12:42:25,834][15401] Updated weights for policy 0, policy_version 436890 (0.0038) [2024-06-23 12:42:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7158087680. Throughput: 0: 42863.1. Samples: 7158223140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 12:42:29,828][15401] Updated weights for policy 0, policy_version 436900 (0.0036) [2024-06-23 12:42:33,326][15401] Updated weights for policy 0, policy_version 436910 (0.0027) [2024-06-23 12:42:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.8, 300 sec: 42986.8). Total num frames: 7158333440. Throughput: 0: 42823.5. Samples: 7158471960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:33,393][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 12:42:37,384][15401] Updated weights for policy 0, policy_version 436920 (0.0037) [2024-06-23 12:42:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 7158530048. Throughput: 0: 42896.3. Samples: 7158605380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:38,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 12:42:40,854][15401] Updated weights for policy 0, policy_version 436930 (0.0026) [2024-06-23 12:42:43,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7158726656. Throughput: 0: 43022.3. Samples: 7158868120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 12:42:44,982][15401] Updated weights for policy 0, policy_version 436940 (0.0024) [2024-06-23 12:42:48,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 7158956032. Throughput: 0: 43000.8. Samples: 7159116200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:48,393][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 12:42:48,759][15401] Updated weights for policy 0, policy_version 436950 (0.0035) [2024-06-23 12:42:52,798][15401] Updated weights for policy 0, policy_version 436960 (0.0045) [2024-06-23 12:42:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 7159169024. Throughput: 0: 42892.6. Samples: 7159248380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 12:42:54,064][15349] Signal inference workers to stop experience collection... (106050 times) [2024-06-23 12:42:54,100][15401] InferenceWorker_p0-w0: stopping experience collection (106050 times) [2024-06-23 12:42:54,182][15349] Signal inference workers to resume experience collection... (106050 times) [2024-06-23 12:42:54,182][15401] InferenceWorker_p0-w0: resuming experience collection (106050 times) [2024-06-23 12:42:56,599][15401] Updated weights for policy 0, policy_version 436970 (0.0035) [2024-06-23 12:42:58,394][15132] Fps is (10 sec: 40950.1, 60 sec: 42868.0, 300 sec: 42875.4). Total num frames: 7159365632. Throughput: 0: 42716.0. Samples: 7159504820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:42:58,395][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 12:43:00,371][15401] Updated weights for policy 0, policy_version 436980 (0.0036) [2024-06-23 12:43:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7159611392. Throughput: 0: 42700.0. Samples: 7159750820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:43:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 12:43:04,284][15401] Updated weights for policy 0, policy_version 436990 (0.0032) [2024-06-23 12:43:08,122][15401] Updated weights for policy 0, policy_version 437000 (0.0025) [2024-06-23 12:43:08,389][15132] Fps is (10 sec: 45897.5, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 7159824384. Throughput: 0: 42845.8. Samples: 7159889340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:43:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 12:43:12,337][15401] Updated weights for policy 0, policy_version 437010 (0.0038) [2024-06-23 12:43:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 7160020992. Throughput: 0: 42849.8. Samples: 7160151380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:43:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 12:43:15,709][15401] Updated weights for policy 0, policy_version 437020 (0.0037) [2024-06-23 12:43:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 7160266752. Throughput: 0: 42894.3. Samples: 7160402100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:43:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 12:43:19,839][15401] Updated weights for policy 0, policy_version 437030 (0.0028) [2024-06-23 12:43:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7160446976. Throughput: 0: 42867.5. Samples: 7160534320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 12:43:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 12:43:23,519][15401] Updated weights for policy 0, policy_version 437040 (0.0026) [2024-06-23 12:43:27,409][15401] Updated weights for policy 0, policy_version 437050 (0.0030) [2024-06-23 12:43:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7160676352. Throughput: 0: 42875.5. Samples: 7160797520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 12:43:31,279][15401] Updated weights for policy 0, policy_version 437060 (0.0027) [2024-06-23 12:43:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 7160889344. Throughput: 0: 42914.2. Samples: 7161047240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 12:43:35,003][15401] Updated weights for policy 0, policy_version 437070 (0.0039) [2024-06-23 12:43:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7161085952. Throughput: 0: 42840.0. Samples: 7161176180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 12:43:38,899][15401] Updated weights for policy 0, policy_version 437080 (0.0033) [2024-06-23 12:43:42,703][15401] Updated weights for policy 0, policy_version 437090 (0.0030) [2024-06-23 12:43:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 7161331712. Throughput: 0: 42910.2. Samples: 7161435580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 12:43:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437093_7161331712.pth... [2024-06-23 12:43:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436465_7151042560.pth [2024-06-23 12:43:46,427][15401] Updated weights for policy 0, policy_version 437100 (0.0028) [2024-06-23 12:43:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 42932.1). Total num frames: 7161544704. Throughput: 0: 43021.4. Samples: 7161686780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 12:43:50,406][15401] Updated weights for policy 0, policy_version 437110 (0.0030) [2024-06-23 12:43:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 7161724928. Throughput: 0: 42827.8. Samples: 7161816600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:53,391][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 12:43:53,931][15401] Updated weights for policy 0, policy_version 437120 (0.0036) [2024-06-23 12:43:58,019][15401] Updated weights for policy 0, policy_version 437130 (0.0037) [2024-06-23 12:43:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43148.0, 300 sec: 42931.6). Total num frames: 7161954304. Throughput: 0: 42693.8. Samples: 7162072600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:43:58,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 12:44:01,895][15401] Updated weights for policy 0, policy_version 437140 (0.0036) [2024-06-23 12:44:03,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42932.4). Total num frames: 7162183680. Throughput: 0: 42680.5. Samples: 7162322720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 12:44:05,207][15349] Signal inference workers to stop experience collection... (106100 times) [2024-06-23 12:44:05,210][15349] Signal inference workers to resume experience collection... (106100 times) [2024-06-23 12:44:05,227][15401] InferenceWorker_p0-w0: stopping experience collection (106100 times) [2024-06-23 12:44:05,228][15401] InferenceWorker_p0-w0: resuming experience collection (106100 times) [2024-06-23 12:44:05,774][15401] Updated weights for policy 0, policy_version 437150 (0.0060) [2024-06-23 12:44:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7162363904. Throughput: 0: 42685.3. Samples: 7162455160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 12:44:09,504][15401] Updated weights for policy 0, policy_version 437160 (0.0041) [2024-06-23 12:44:13,138][15401] Updated weights for policy 0, policy_version 437170 (0.0028) [2024-06-23 12:44:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 7162593280. Throughput: 0: 42726.8. Samples: 7162720220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 12:44:17,127][15401] Updated weights for policy 0, policy_version 437180 (0.0036) [2024-06-23 12:44:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42932.3). Total num frames: 7162822656. Throughput: 0: 42712.4. Samples: 7162969300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:44:20,888][15401] Updated weights for policy 0, policy_version 437190 (0.0036) [2024-06-23 12:44:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7163019264. Throughput: 0: 42829.8. Samples: 7163103520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 12:44:24,679][15401] Updated weights for policy 0, policy_version 437200 (0.0024) [2024-06-23 12:44:28,283][15401] Updated weights for policy 0, policy_version 437210 (0.0037) [2024-06-23 12:44:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7163248640. Throughput: 0: 42958.3. Samples: 7163368700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 12:44:32,213][15401] Updated weights for policy 0, policy_version 437220 (0.0037) [2024-06-23 12:44:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7163478016. Throughput: 0: 43093.2. Samples: 7163625980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 12:44:35,735][15401] Updated weights for policy 0, policy_version 437230 (0.0035) [2024-06-23 12:44:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7163674624. Throughput: 0: 43104.6. Samples: 7163756300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 12:44:39,778][15401] Updated weights for policy 0, policy_version 437240 (0.0032) [2024-06-23 12:44:43,141][15401] Updated weights for policy 0, policy_version 437250 (0.0033) [2024-06-23 12:44:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7163904000. Throughput: 0: 43119.0. Samples: 7164012960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 12:44:47,286][15401] Updated weights for policy 0, policy_version 437260 (0.0027) [2024-06-23 12:44:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7164116992. Throughput: 0: 43394.8. Samples: 7164275480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-23 12:44:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 12:44:50,611][15401] Updated weights for policy 0, policy_version 437270 (0.0030) [2024-06-23 12:44:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 7164329984. Throughput: 0: 43252.9. Samples: 7164401540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:44:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:44:55,014][15401] Updated weights for policy 0, policy_version 437280 (0.0041) [2024-06-23 12:44:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7164542976. Throughput: 0: 43124.9. Samples: 7164660840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:44:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:44:58,776][15401] Updated weights for policy 0, policy_version 437290 (0.0035) [2024-06-23 12:45:02,781][15401] Updated weights for policy 0, policy_version 437300 (0.0030) [2024-06-23 12:45:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7164755968. Throughput: 0: 43327.1. Samples: 7164919020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 12:45:06,256][15401] Updated weights for policy 0, policy_version 437310 (0.0025) [2024-06-23 12:45:08,390][15132] Fps is (10 sec: 42595.4, 60 sec: 43417.2, 300 sec: 42931.5). Total num frames: 7164968960. Throughput: 0: 43200.6. Samples: 7165047580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:08,391][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 12:45:10,307][15401] Updated weights for policy 0, policy_version 437320 (0.0027) [2024-06-23 12:45:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7165181952. Throughput: 0: 42935.2. Samples: 7165300780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 12:45:13,838][15401] Updated weights for policy 0, policy_version 437330 (0.0029) [2024-06-23 12:45:18,158][15401] Updated weights for policy 0, policy_version 437340 (0.0024) [2024-06-23 12:45:18,389][15132] Fps is (10 sec: 42601.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7165394944. Throughput: 0: 43009.5. Samples: 7165561400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 12:45:21,622][15401] Updated weights for policy 0, policy_version 437350 (0.0042) [2024-06-23 12:45:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7165607936. Throughput: 0: 42923.1. Samples: 7165687840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:45:25,748][15401] Updated weights for policy 0, policy_version 437360 (0.0039) [2024-06-23 12:45:25,753][15349] Signal inference workers to stop experience collection... (106150 times) [2024-06-23 12:45:25,753][15349] Signal inference workers to resume experience collection... (106150 times) [2024-06-23 12:45:25,779][15401] InferenceWorker_p0-w0: stopping experience collection (106150 times) [2024-06-23 12:45:25,779][15401] InferenceWorker_p0-w0: resuming experience collection (106150 times) [2024-06-23 12:45:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7165820928. Throughput: 0: 42699.6. Samples: 7165934440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:28,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 12:45:29,460][15401] Updated weights for policy 0, policy_version 437370 (0.0035) [2024-06-23 12:45:33,185][15401] Updated weights for policy 0, policy_version 437380 (0.0035) [2024-06-23 12:45:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7166033920. Throughput: 0: 42705.2. Samples: 7166197220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:33,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 12:45:37,346][15401] Updated weights for policy 0, policy_version 437390 (0.0028) [2024-06-23 12:45:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7166246912. Throughput: 0: 42853.8. Samples: 7166329960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 12:45:40,592][15401] Updated weights for policy 0, policy_version 437400 (0.0040) [2024-06-23 12:45:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7166459904. Throughput: 0: 42844.4. Samples: 7166588840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:43,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 12:45:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437406_7166459904.pth... [2024-06-23 12:45:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000436777_7156154368.pth [2024-06-23 12:45:44,825][15401] Updated weights for policy 0, policy_version 437410 (0.0030) [2024-06-23 12:45:48,135][15401] Updated weights for policy 0, policy_version 437420 (0.0032) [2024-06-23 12:45:48,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 7166689280. Throughput: 0: 42618.1. Samples: 7166837100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:48,396][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 12:45:52,406][15401] Updated weights for policy 0, policy_version 437430 (0.0034) [2024-06-23 12:45:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7166902272. Throughput: 0: 42758.1. Samples: 7166971660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:53,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-23 12:45:55,608][15401] Updated weights for policy 0, policy_version 437440 (0.0032) [2024-06-23 12:45:58,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7167115264. Throughput: 0: 43013.2. Samples: 7167236380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:45:58,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-23 12:45:59,952][15401] Updated weights for policy 0, policy_version 437450 (0.0029) [2024-06-23 12:46:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7167328256. Throughput: 0: 42728.8. Samples: 7167484200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:46:03,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 12:46:03,739][15401] Updated weights for policy 0, policy_version 437460 (0.0032) [2024-06-23 12:46:07,658][15401] Updated weights for policy 0, policy_version 437470 (0.0035) [2024-06-23 12:46:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42872.0, 300 sec: 42987.2). Total num frames: 7167541248. Throughput: 0: 42749.0. Samples: 7167611540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:46:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 12:46:11,426][15401] Updated weights for policy 0, policy_version 437480 (0.0032) [2024-06-23 12:46:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 7167754240. Throughput: 0: 43007.9. Samples: 7167869800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 12:46:15,184][15401] Updated weights for policy 0, policy_version 437490 (0.0029) [2024-06-23 12:46:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7167967232. Throughput: 0: 42877.9. Samples: 7168126720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 12:46:19,060][15401] Updated weights for policy 0, policy_version 437500 (0.0031) [2024-06-23 12:46:23,023][15401] Updated weights for policy 0, policy_version 437510 (0.0044) [2024-06-23 12:46:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7168163840. Throughput: 0: 42757.8. Samples: 7168254060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:23,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 12:46:26,643][15401] Updated weights for policy 0, policy_version 437520 (0.0028) [2024-06-23 12:46:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 7168393216. Throughput: 0: 42748.4. Samples: 7168512620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:28,393][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 12:46:30,853][15401] Updated weights for policy 0, policy_version 437530 (0.0037) [2024-06-23 12:46:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7168606208. Throughput: 0: 42927.4. Samples: 7168768560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 12:46:34,224][15401] Updated weights for policy 0, policy_version 437540 (0.0031) [2024-06-23 12:46:38,336][15401] Updated weights for policy 0, policy_version 437550 (0.0027) [2024-06-23 12:46:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7168819200. Throughput: 0: 42802.7. Samples: 7168897780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 12:46:41,767][15401] Updated weights for policy 0, policy_version 437560 (0.0034) [2024-06-23 12:46:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7169032192. Throughput: 0: 42506.3. Samples: 7169149160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 12:46:45,878][15401] Updated weights for policy 0, policy_version 437570 (0.0034) [2024-06-23 12:46:47,531][15349] Signal inference workers to stop experience collection... (106200 times) [2024-06-23 12:46:47,575][15401] InferenceWorker_p0-w0: stopping experience collection (106200 times) [2024-06-23 12:46:47,585][15349] Signal inference workers to resume experience collection... (106200 times) [2024-06-23 12:46:47,589][15401] InferenceWorker_p0-w0: resuming experience collection (106200 times) [2024-06-23 12:46:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 7169245184. Throughput: 0: 42767.1. Samples: 7169408720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 12:46:49,677][15401] Updated weights for policy 0, policy_version 437580 (0.0039) [2024-06-23 12:46:53,329][15401] Updated weights for policy 0, policy_version 437590 (0.0042) [2024-06-23 12:46:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7169474560. Throughput: 0: 42820.0. Samples: 7169538440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 12:46:57,480][15401] Updated weights for policy 0, policy_version 437600 (0.0044) [2024-06-23 12:46:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7169687552. Throughput: 0: 42849.8. Samples: 7169798040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:46:58,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 12:47:00,759][15401] Updated weights for policy 0, policy_version 437610 (0.0032) [2024-06-23 12:47:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 7169900544. Throughput: 0: 42848.0. Samples: 7170054880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 12:47:04,921][15401] Updated weights for policy 0, policy_version 437620 (0.0021) [2024-06-23 12:47:08,276][15401] Updated weights for policy 0, policy_version 437630 (0.0033) [2024-06-23 12:47:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7170129920. Throughput: 0: 43014.3. Samples: 7170189700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 12:47:12,356][15401] Updated weights for policy 0, policy_version 437640 (0.0024) [2024-06-23 12:47:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7170310144. Throughput: 0: 42865.4. Samples: 7170441460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 12:47:15,912][15401] Updated weights for policy 0, policy_version 437650 (0.0042) [2024-06-23 12:47:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7170539520. Throughput: 0: 42870.7. Samples: 7170697740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:18,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 12:47:19,937][15401] Updated weights for policy 0, policy_version 437660 (0.0048) [2024-06-23 12:47:23,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 7170768896. Throughput: 0: 42900.7. Samples: 7170828420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:23,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 12:47:23,771][15401] Updated weights for policy 0, policy_version 437670 (0.0038) [2024-06-23 12:47:27,620][15401] Updated weights for policy 0, policy_version 437680 (0.0041) [2024-06-23 12:47:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 7170949120. Throughput: 0: 42954.7. Samples: 7171082120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 12:47:31,598][15401] Updated weights for policy 0, policy_version 437690 (0.0032) [2024-06-23 12:47:33,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 7171162112. Throughput: 0: 42976.9. Samples: 7171342680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 12:47:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 12:47:35,224][15401] Updated weights for policy 0, policy_version 437700 (0.0042) [2024-06-23 12:47:38,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 7171424256. Throughput: 0: 42958.6. Samples: 7171471580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:47:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 12:47:39,280][15401] Updated weights for policy 0, policy_version 437710 (0.0039) [2024-06-23 12:47:42,834][15401] Updated weights for policy 0, policy_version 437720 (0.0033) [2024-06-23 12:47:43,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42866.9, 300 sec: 42875.5). Total num frames: 7171604480. Throughput: 0: 42896.2. Samples: 7171728640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:47:43,397][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 12:47:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437721_7171620864.pth... [2024-06-23 12:47:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437093_7161331712.pth [2024-06-23 12:47:47,159][15401] Updated weights for policy 0, policy_version 437730 (0.0040) [2024-06-23 12:47:48,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7171784704. Throughput: 0: 42828.5. Samples: 7171982160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:47:48,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 12:47:50,503][15401] Updated weights for policy 0, policy_version 437740 (0.0022) [2024-06-23 12:47:53,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42598.3, 300 sec: 42932.3). Total num frames: 7172030464. Throughput: 0: 42448.8. Samples: 7172099900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:47:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 12:47:54,713][15401] Updated weights for policy 0, policy_version 437750 (0.0040) [2024-06-23 12:47:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 7172243456. Throughput: 0: 42722.2. Samples: 7172364060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:47:58,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 12:47:58,459][15401] Updated weights for policy 0, policy_version 437760 (0.0030) [2024-06-23 12:48:02,184][15401] Updated weights for policy 0, policy_version 437770 (0.0043) [2024-06-23 12:48:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7172423680. Throughput: 0: 42748.9. Samples: 7172621440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 12:48:06,085][15401] Updated weights for policy 0, policy_version 437780 (0.0035) [2024-06-23 12:48:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7172685824. Throughput: 0: 42510.2. Samples: 7172741280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 12:48:09,233][15349] Signal inference workers to stop experience collection... (106250 times) [2024-06-23 12:48:09,279][15349] Signal inference workers to resume experience collection... (106250 times) [2024-06-23 12:48:09,284][15401] InferenceWorker_p0-w0: stopping experience collection (106250 times) [2024-06-23 12:48:09,307][15401] InferenceWorker_p0-w0: resuming experience collection (106250 times) [2024-06-23 12:48:09,955][15401] Updated weights for policy 0, policy_version 437790 (0.0040) [2024-06-23 12:48:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7172882432. Throughput: 0: 42781.6. Samples: 7173007300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 12:48:13,673][15401] Updated weights for policy 0, policy_version 437800 (0.0039) [2024-06-23 12:48:17,728][15401] Updated weights for policy 0, policy_version 437810 (0.0034) [2024-06-23 12:48:18,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 7173079040. Throughput: 0: 42597.3. Samples: 7173259660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:18,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 12:48:21,622][15401] Updated weights for policy 0, policy_version 437820 (0.0034) [2024-06-23 12:48:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 7173308416. Throughput: 0: 42647.6. Samples: 7173390720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 12:48:25,456][15401] Updated weights for policy 0, policy_version 437830 (0.0046) [2024-06-23 12:48:28,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7173505024. Throughput: 0: 42610.5. Samples: 7173645840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 12:48:29,197][15401] Updated weights for policy 0, policy_version 437840 (0.0030) [2024-06-23 12:48:33,043][15401] Updated weights for policy 0, policy_version 437850 (0.0033) [2024-06-23 12:48:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 7173734400. Throughput: 0: 42475.9. Samples: 7173893680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:33,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 12:48:36,914][15401] Updated weights for policy 0, policy_version 437860 (0.0038) [2024-06-23 12:48:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7173963776. Throughput: 0: 42892.5. Samples: 7174030060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 12:48:40,578][15401] Updated weights for policy 0, policy_version 437870 (0.0043) [2024-06-23 12:48:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 7174144000. Throughput: 0: 42779.7. Samples: 7174289040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 12:48:44,431][15401] Updated weights for policy 0, policy_version 437880 (0.0028) [2024-06-23 12:48:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7174373376. Throughput: 0: 42548.8. Samples: 7174536140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:48,395][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 12:48:48,861][15401] Updated weights for policy 0, policy_version 437890 (0.0037) [2024-06-23 12:48:52,033][15401] Updated weights for policy 0, policy_version 437900 (0.0054) [2024-06-23 12:48:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7174619136. Throughput: 0: 42911.3. Samples: 7174672280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 12:48:56,528][15401] Updated weights for policy 0, policy_version 437910 (0.0038) [2024-06-23 12:48:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7174782976. Throughput: 0: 42667.3. Samples: 7174927320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 12:48:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 12:49:00,199][15401] Updated weights for policy 0, policy_version 437920 (0.0035) [2024-06-23 12:49:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7175012352. Throughput: 0: 42634.6. Samples: 7175178120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 12:49:04,218][15401] Updated weights for policy 0, policy_version 437930 (0.0046) [2024-06-23 12:49:07,815][15401] Updated weights for policy 0, policy_version 437940 (0.0043) [2024-06-23 12:49:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7175241728. Throughput: 0: 42603.5. Samples: 7175307880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 12:49:11,740][15401] Updated weights for policy 0, policy_version 437950 (0.0033) [2024-06-23 12:49:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7175438336. Throughput: 0: 42699.5. Samples: 7175567320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 12:49:15,295][15401] Updated weights for policy 0, policy_version 437960 (0.0030) [2024-06-23 12:49:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 7175651328. Throughput: 0: 42866.4. Samples: 7175822560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 12:49:19,286][15401] Updated weights for policy 0, policy_version 437970 (0.0033) [2024-06-23 12:49:22,906][15401] Updated weights for policy 0, policy_version 437980 (0.0026) [2024-06-23 12:49:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7175880704. Throughput: 0: 42619.5. Samples: 7175947940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 12:49:27,431][15401] Updated weights for policy 0, policy_version 437990 (0.0043) [2024-06-23 12:49:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7176077312. Throughput: 0: 42695.0. Samples: 7176210320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 12:49:28,617][15349] Signal inference workers to stop experience collection... (106300 times) [2024-06-23 12:49:28,653][15401] InferenceWorker_p0-w0: stopping experience collection (106300 times) [2024-06-23 12:49:28,683][15349] Signal inference workers to resume experience collection... (106300 times) [2024-06-23 12:49:28,684][15401] InferenceWorker_p0-w0: resuming experience collection (106300 times) [2024-06-23 12:49:30,399][15401] Updated weights for policy 0, policy_version 438000 (0.0029) [2024-06-23 12:49:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 7176290304. Throughput: 0: 42928.9. Samples: 7176467940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:33,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 12:49:34,730][15401] Updated weights for policy 0, policy_version 438010 (0.0033) [2024-06-23 12:49:37,748][15401] Updated weights for policy 0, policy_version 438020 (0.0030) [2024-06-23 12:49:38,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7176536064. Throughput: 0: 42804.5. Samples: 7176598480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 12:49:42,164][15401] Updated weights for policy 0, policy_version 438030 (0.0039) [2024-06-23 12:49:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7176716288. Throughput: 0: 42968.9. Samples: 7176860920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 12:49:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438033_7176732672.pth... [2024-06-23 12:49:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437406_7166459904.pth [2024-06-23 12:49:45,516][15401] Updated weights for policy 0, policy_version 438040 (0.0042) [2024-06-23 12:49:48,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7176962048. Throughput: 0: 43038.2. Samples: 7177114840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 12:49:50,047][15401] Updated weights for policy 0, policy_version 438050 (0.0034) [2024-06-23 12:49:53,259][15401] Updated weights for policy 0, policy_version 438060 (0.0032) [2024-06-23 12:49:53,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 7177175040. Throughput: 0: 43076.6. Samples: 7177246420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:53,392][15132] Avg episode reward: [(0, '0.174')] [2024-06-23 12:49:57,370][15401] Updated weights for policy 0, policy_version 438070 (0.0031) [2024-06-23 12:49:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7177371648. Throughput: 0: 43082.6. Samples: 7177506040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:49:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 12:50:00,751][15401] Updated weights for policy 0, policy_version 438080 (0.0031) [2024-06-23 12:50:03,390][15132] Fps is (10 sec: 42607.3, 60 sec: 43144.6, 300 sec: 42820.7). Total num frames: 7177601024. Throughput: 0: 43015.4. Samples: 7177758260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:50:03,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 12:50:04,714][15401] Updated weights for policy 0, policy_version 438090 (0.0027) [2024-06-23 12:50:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7177830400. Throughput: 0: 43295.1. Samples: 7177896220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:50:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 12:50:08,391][15401] Updated weights for policy 0, policy_version 438100 (0.0036) [2024-06-23 12:50:12,245][15401] Updated weights for policy 0, policy_version 438110 (0.0027) [2024-06-23 12:50:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7178027008. Throughput: 0: 43206.3. Samples: 7178154600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:50:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 12:50:16,101][15401] Updated weights for policy 0, policy_version 438120 (0.0044) [2024-06-23 12:50:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7178256384. Throughput: 0: 43240.0. Samples: 7178413740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 12:50:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 12:50:19,672][15401] Updated weights for policy 0, policy_version 438130 (0.0037) [2024-06-23 12:50:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7178469376. Throughput: 0: 43235.0. Samples: 7178544060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 12:50:23,621][15401] Updated weights for policy 0, policy_version 438140 (0.0027) [2024-06-23 12:50:27,574][15401] Updated weights for policy 0, policy_version 438150 (0.0042) [2024-06-23 12:50:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7178682368. Throughput: 0: 43149.3. Samples: 7178802640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 12:50:31,371][15401] Updated weights for policy 0, policy_version 438160 (0.0038) [2024-06-23 12:50:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 7178895360. Throughput: 0: 43199.6. Samples: 7179058820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:33,390][15132] Avg episode reward: [(0, '0.105')] [2024-06-23 12:50:35,177][15401] Updated weights for policy 0, policy_version 438170 (0.0035) [2024-06-23 12:50:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7179108352. Throughput: 0: 43210.5. Samples: 7179190800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 12:50:38,978][15401] Updated weights for policy 0, policy_version 438180 (0.0036) [2024-06-23 12:50:42,748][15401] Updated weights for policy 0, policy_version 438190 (0.0036) [2024-06-23 12:50:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43690.6, 300 sec: 42877.0). Total num frames: 7179337728. Throughput: 0: 43167.6. Samples: 7179448580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:43,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 12:50:46,695][15401] Updated weights for policy 0, policy_version 438200 (0.0036) [2024-06-23 12:50:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7179550720. Throughput: 0: 43189.9. Samples: 7179701800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 12:50:50,309][15401] Updated weights for policy 0, policy_version 438210 (0.0038) [2024-06-23 12:50:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 7179747328. Throughput: 0: 43012.9. Samples: 7179831800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:53,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 12:50:54,318][15401] Updated weights for policy 0, policy_version 438220 (0.0040) [2024-06-23 12:50:57,802][15401] Updated weights for policy 0, policy_version 438230 (0.0030) [2024-06-23 12:50:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7179976704. Throughput: 0: 42997.4. Samples: 7180089480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:50:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 12:51:01,572][15349] Signal inference workers to stop experience collection... (106350 times) [2024-06-23 12:51:01,572][15349] Signal inference workers to resume experience collection... (106350 times) [2024-06-23 12:51:01,616][15401] InferenceWorker_p0-w0: stopping experience collection (106350 times) [2024-06-23 12:51:01,616][15401] InferenceWorker_p0-w0: resuming experience collection (106350 times) [2024-06-23 12:51:01,711][15401] Updated weights for policy 0, policy_version 438240 (0.0032) [2024-06-23 12:51:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7180206080. Throughput: 0: 42964.8. Samples: 7180347160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 12:51:05,322][15401] Updated weights for policy 0, policy_version 438250 (0.0034) [2024-06-23 12:51:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7180402688. Throughput: 0: 42941.8. Samples: 7180476440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:08,403][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 12:51:09,366][15401] Updated weights for policy 0, policy_version 438260 (0.0040) [2024-06-23 12:51:12,824][15401] Updated weights for policy 0, policy_version 438270 (0.0034) [2024-06-23 12:51:13,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 7180615680. Throughput: 0: 42929.6. Samples: 7180734580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:13,393][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 12:51:17,341][15401] Updated weights for policy 0, policy_version 438280 (0.0039) [2024-06-23 12:51:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7180812288. Throughput: 0: 42968.2. Samples: 7180992380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 12:51:20,733][15401] Updated weights for policy 0, policy_version 438290 (0.0050) [2024-06-23 12:51:23,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 7181041664. Throughput: 0: 42817.6. Samples: 7181117600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 12:51:24,873][15401] Updated weights for policy 0, policy_version 438300 (0.0040) [2024-06-23 12:51:28,318][15401] Updated weights for policy 0, policy_version 438310 (0.0024) [2024-06-23 12:51:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7181271040. Throughput: 0: 42870.7. Samples: 7181377760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:28,392][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 12:51:33,017][15401] Updated weights for policy 0, policy_version 438320 (0.0035) [2024-06-23 12:51:33,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 7181451264. Throughput: 0: 42976.0. Samples: 7181635720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 12:51:35,851][15401] Updated weights for policy 0, policy_version 438330 (0.0038) [2024-06-23 12:51:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7181680640. Throughput: 0: 42726.2. Samples: 7181754480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 12:51:40,399][15401] Updated weights for policy 0, policy_version 438340 (0.0033) [2024-06-23 12:51:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7181910016. Throughput: 0: 42949.8. Samples: 7182022220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 12:51:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 12:51:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438350_7181926400.pth... [2024-06-23 12:51:43,514][15401] Updated weights for policy 0, policy_version 438350 (0.0035) [2024-06-23 12:51:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000437721_7171620864.pth [2024-06-23 12:51:48,018][15401] Updated weights for policy 0, policy_version 438360 (0.0031) [2024-06-23 12:51:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7182090240. Throughput: 0: 42794.6. Samples: 7182272920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:51:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 12:51:51,126][15401] Updated weights for policy 0, policy_version 438370 (0.0032) [2024-06-23 12:51:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7182336000. Throughput: 0: 42638.6. Samples: 7182395180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:51:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 12:51:55,577][15401] Updated weights for policy 0, policy_version 438380 (0.0037) [2024-06-23 12:51:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7182548992. Throughput: 0: 42863.7. Samples: 7182663340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:51:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 12:51:58,827][15401] Updated weights for policy 0, policy_version 438390 (0.0030) [2024-06-23 12:52:03,095][15401] Updated weights for policy 0, policy_version 438400 (0.0034) [2024-06-23 12:52:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7182745600. Throughput: 0: 42752.4. Samples: 7182916240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 12:52:06,489][15401] Updated weights for policy 0, policy_version 438410 (0.0027) [2024-06-23 12:52:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7182991360. Throughput: 0: 42848.1. Samples: 7183045760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 12:52:10,956][15349] Signal inference workers to stop experience collection... (106400 times) [2024-06-23 12:52:10,957][15349] Signal inference workers to resume experience collection... (106400 times) [2024-06-23 12:52:10,975][15401] InferenceWorker_p0-w0: stopping experience collection (106400 times) [2024-06-23 12:52:10,975][15401] InferenceWorker_p0-w0: resuming experience collection (106400 times) [2024-06-23 12:52:11,118][15401] Updated weights for policy 0, policy_version 438420 (0.0038) [2024-06-23 12:52:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7183171584. Throughput: 0: 42872.9. Samples: 7183307040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 12:52:14,105][15401] Updated weights for policy 0, policy_version 438430 (0.0027) [2024-06-23 12:52:18,390][15132] Fps is (10 sec: 37680.4, 60 sec: 42597.8, 300 sec: 42709.7). Total num frames: 7183368192. Throughput: 0: 42894.3. Samples: 7183566000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:18,391][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 12:52:18,610][15401] Updated weights for policy 0, policy_version 438440 (0.0036) [2024-06-23 12:52:21,919][15401] Updated weights for policy 0, policy_version 438450 (0.0034) [2024-06-23 12:52:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 7183630336. Throughput: 0: 43017.8. Samples: 7183690280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 12:52:26,399][15401] Updated weights for policy 0, policy_version 438460 (0.0030) [2024-06-23 12:52:28,390][15132] Fps is (10 sec: 45878.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7183826944. Throughput: 0: 42810.1. Samples: 7183948680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 12:52:29,757][15401] Updated weights for policy 0, policy_version 438470 (0.0039) [2024-06-23 12:52:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7184023552. Throughput: 0: 43058.3. Samples: 7184210540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 12:52:33,842][15401] Updated weights for policy 0, policy_version 438480 (0.0026) [2024-06-23 12:52:37,254][15401] Updated weights for policy 0, policy_version 438490 (0.0035) [2024-06-23 12:52:38,396][15132] Fps is (10 sec: 44209.2, 60 sec: 43140.0, 300 sec: 42931.6). Total num frames: 7184269312. Throughput: 0: 43020.6. Samples: 7184331380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:38,397][15132] Avg episode reward: [(0, '0.225')] [2024-06-23 12:52:41,341][15401] Updated weights for policy 0, policy_version 438500 (0.0022) [2024-06-23 12:52:43,391][15132] Fps is (10 sec: 44229.2, 60 sec: 42597.2, 300 sec: 42986.9). Total num frames: 7184465920. Throughput: 0: 42856.6. Samples: 7184591960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:43,392][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 12:52:44,677][15401] Updated weights for policy 0, policy_version 438510 (0.0027) [2024-06-23 12:52:48,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7184662528. Throughput: 0: 43001.4. Samples: 7184851300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 12:52:49,260][15401] Updated weights for policy 0, policy_version 438520 (0.0028) [2024-06-23 12:52:52,179][15401] Updated weights for policy 0, policy_version 438530 (0.0028) [2024-06-23 12:52:53,390][15132] Fps is (10 sec: 45882.6, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 7184924672. Throughput: 0: 42940.9. Samples: 7184978100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 12:52:57,071][15401] Updated weights for policy 0, policy_version 438540 (0.0033) [2024-06-23 12:52:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 7185088512. Throughput: 0: 42800.4. Samples: 7185233060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:52:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 12:52:59,728][15401] Updated weights for policy 0, policy_version 438550 (0.0031) [2024-06-23 12:53:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7185301504. Throughput: 0: 42822.4. Samples: 7185492980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 12:53:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 12:53:04,621][15401] Updated weights for policy 0, policy_version 438560 (0.0033) [2024-06-23 12:53:07,302][15401] Updated weights for policy 0, policy_version 438570 (0.0030) [2024-06-23 12:53:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7185547264. Throughput: 0: 42784.0. Samples: 7185615560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:08,391][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 12:53:12,371][15401] Updated weights for policy 0, policy_version 438580 (0.0037) [2024-06-23 12:53:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 7185743872. Throughput: 0: 42977.0. Samples: 7185882640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 12:53:14,841][15401] Updated weights for policy 0, policy_version 438590 (0.0035) [2024-06-23 12:53:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43145.1, 300 sec: 42876.1). Total num frames: 7185956864. Throughput: 0: 42716.8. Samples: 7186132800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 12:53:19,940][15401] Updated weights for policy 0, policy_version 438600 (0.0034) [2024-06-23 12:53:21,689][15349] Signal inference workers to stop experience collection... (106450 times) [2024-06-23 12:53:21,708][15401] InferenceWorker_p0-w0: stopping experience collection (106450 times) [2024-06-23 12:53:21,749][15349] Signal inference workers to resume experience collection... (106450 times) [2024-06-23 12:53:21,749][15401] InferenceWorker_p0-w0: resuming experience collection (106450 times) [2024-06-23 12:53:22,485][15401] Updated weights for policy 0, policy_version 438610 (0.0029) [2024-06-23 12:53:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 7186202624. Throughput: 0: 42865.2. Samples: 7186260040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 12:53:27,424][15401] Updated weights for policy 0, policy_version 438620 (0.0036) [2024-06-23 12:53:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 7186415616. Throughput: 0: 43080.3. Samples: 7186530500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 12:53:30,432][15401] Updated weights for policy 0, policy_version 438630 (0.0042) [2024-06-23 12:53:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7186595840. Throughput: 0: 42856.5. Samples: 7186779840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 12:53:34,934][15401] Updated weights for policy 0, policy_version 438640 (0.0034) [2024-06-23 12:53:37,898][15401] Updated weights for policy 0, policy_version 438650 (0.0026) [2024-06-23 12:53:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.1, 300 sec: 43042.7). Total num frames: 7186841600. Throughput: 0: 42860.6. Samples: 7186906820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 12:53:42,380][15401] Updated weights for policy 0, policy_version 438660 (0.0033) [2024-06-23 12:53:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42872.7, 300 sec: 42931.7). Total num frames: 7187038208. Throughput: 0: 43098.8. Samples: 7187172500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 12:53:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438663_7187054592.pth... [2024-06-23 12:53:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438033_7176732672.pth [2024-06-23 12:53:45,500][15401] Updated weights for policy 0, policy_version 438670 (0.0032) [2024-06-23 12:53:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7187251200. Throughput: 0: 42964.1. Samples: 7187426360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 12:53:50,195][15401] Updated weights for policy 0, policy_version 438680 (0.0029) [2024-06-23 12:53:53,055][15401] Updated weights for policy 0, policy_version 438690 (0.0032) [2024-06-23 12:53:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 7187496960. Throughput: 0: 43018.2. Samples: 7187551380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 12:53:57,620][15401] Updated weights for policy 0, policy_version 438700 (0.0032) [2024-06-23 12:53:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 7187693568. Throughput: 0: 42954.7. Samples: 7187815600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:53:58,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 12:54:00,935][15401] Updated weights for policy 0, policy_version 438710 (0.0045) [2024-06-23 12:54:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7187906560. Throughput: 0: 43004.0. Samples: 7188067980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:03,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 12:54:05,300][15401] Updated weights for policy 0, policy_version 438720 (0.0029) [2024-06-23 12:54:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 7188135936. Throughput: 0: 42983.1. Samples: 7188194280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 12:54:08,511][15401] Updated weights for policy 0, policy_version 438730 (0.0036) [2024-06-23 12:54:12,983][15401] Updated weights for policy 0, policy_version 438740 (0.0033) [2024-06-23 12:54:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.3, 300 sec: 42987.1). Total num frames: 7188332544. Throughput: 0: 42826.4. Samples: 7188457700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 12:54:16,224][15401] Updated weights for policy 0, policy_version 438750 (0.0038) [2024-06-23 12:54:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7188529152. Throughput: 0: 42956.9. Samples: 7188712900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 12:54:20,579][15401] Updated weights for policy 0, policy_version 438760 (0.0041) [2024-06-23 12:54:23,392][15132] Fps is (10 sec: 44227.3, 60 sec: 42869.8, 300 sec: 43042.4). Total num frames: 7188774912. Throughput: 0: 43005.6. Samples: 7188842180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:23,392][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 12:54:23,830][15401] Updated weights for policy 0, policy_version 438770 (0.0037) [2024-06-23 12:54:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 7188955136. Throughput: 0: 42859.1. Samples: 7189101160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-23 12:54:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 12:54:28,502][15401] Updated weights for policy 0, policy_version 438780 (0.0037) [2024-06-23 12:54:31,638][15401] Updated weights for policy 0, policy_version 438790 (0.0041) [2024-06-23 12:54:33,391][15132] Fps is (10 sec: 40962.0, 60 sec: 43143.1, 300 sec: 42875.8). Total num frames: 7189184512. Throughput: 0: 42728.8. Samples: 7189349240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:33,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 12:54:36,056][15401] Updated weights for policy 0, policy_version 438800 (0.0041) [2024-06-23 12:54:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 7189413888. Throughput: 0: 42905.3. Samples: 7189482120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 12:54:39,268][15401] Updated weights for policy 0, policy_version 438810 (0.0045) [2024-06-23 12:54:42,770][15349] Signal inference workers to stop experience collection... (106500 times) [2024-06-23 12:54:42,817][15401] InferenceWorker_p0-w0: stopping experience collection (106500 times) [2024-06-23 12:54:42,826][15349] Signal inference workers to resume experience collection... (106500 times) [2024-06-23 12:54:42,831][15401] InferenceWorker_p0-w0: resuming experience collection (106500 times) [2024-06-23 12:54:43,392][15132] Fps is (10 sec: 42596.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 7189610496. Throughput: 0: 42790.6. Samples: 7189741280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:43,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 12:54:43,534][15401] Updated weights for policy 0, policy_version 438820 (0.0028) [2024-06-23 12:54:46,861][15401] Updated weights for policy 0, policy_version 438830 (0.0037) [2024-06-23 12:54:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 7189823488. Throughput: 0: 42821.8. Samples: 7189994960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 12:54:51,176][15401] Updated weights for policy 0, policy_version 438840 (0.0037) [2024-06-23 12:54:53,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7190052864. Throughput: 0: 42921.3. Samples: 7190125740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 12:54:54,416][15401] Updated weights for policy 0, policy_version 438850 (0.0037) [2024-06-23 12:54:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7190249472. Throughput: 0: 42822.4. Samples: 7190384700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:54:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 12:54:58,732][15401] Updated weights for policy 0, policy_version 438860 (0.0042) [2024-06-23 12:55:02,074][15401] Updated weights for policy 0, policy_version 438870 (0.0044) [2024-06-23 12:55:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7190478848. Throughput: 0: 42751.8. Samples: 7190636740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 12:55:06,760][15401] Updated weights for policy 0, policy_version 438880 (0.0044) [2024-06-23 12:55:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7190691840. Throughput: 0: 42767.1. Samples: 7190766600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 12:55:10,084][15401] Updated weights for policy 0, policy_version 438890 (0.0026) [2024-06-23 12:55:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 7190888448. Throughput: 0: 42748.0. Samples: 7191024820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 12:55:14,488][15401] Updated weights for policy 0, policy_version 438900 (0.0029) [2024-06-23 12:55:17,630][15401] Updated weights for policy 0, policy_version 438910 (0.0029) [2024-06-23 12:55:18,391][15132] Fps is (10 sec: 42594.3, 60 sec: 43143.8, 300 sec: 42876.0). Total num frames: 7191117824. Throughput: 0: 42749.3. Samples: 7191272920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:18,391][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 12:55:22,212][15401] Updated weights for policy 0, policy_version 438920 (0.0041) [2024-06-23 12:55:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 7191314432. Throughput: 0: 42789.1. Samples: 7191407620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:23,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 12:55:25,114][15401] Updated weights for policy 0, policy_version 438930 (0.0029) [2024-06-23 12:55:28,389][15132] Fps is (10 sec: 40964.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7191527424. Throughput: 0: 42685.8. Samples: 7191662040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:28,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 12:55:29,715][15401] Updated weights for policy 0, policy_version 438940 (0.0024) [2024-06-23 12:55:32,676][15401] Updated weights for policy 0, policy_version 438950 (0.0038) [2024-06-23 12:55:33,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43145.8, 300 sec: 42931.6). Total num frames: 7191773184. Throughput: 0: 42679.9. Samples: 7191915560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:33,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 12:55:37,208][15401] Updated weights for policy 0, policy_version 438960 (0.0030) [2024-06-23 12:55:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 7191969792. Throughput: 0: 42799.9. Samples: 7192051840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:38,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 12:55:40,152][15401] Updated weights for policy 0, policy_version 438970 (0.0039) [2024-06-23 12:55:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 7192166400. Throughput: 0: 42728.3. Samples: 7192307480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 12:55:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438976_7192182784.pth... [2024-06-23 12:55:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438350_7181926400.pth [2024-06-23 12:55:44,658][15401] Updated weights for policy 0, policy_version 438980 (0.0038) [2024-06-23 12:55:47,914][15401] Updated weights for policy 0, policy_version 438990 (0.0026) [2024-06-23 12:55:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7192412160. Throughput: 0: 42682.4. Samples: 7192557440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 12:55:52,824][15401] Updated weights for policy 0, policy_version 439000 (0.0033) [2024-06-23 12:55:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7192608768. Throughput: 0: 42883.1. Samples: 7192696340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 12:55:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 12:55:55,666][15401] Updated weights for policy 0, policy_version 439010 (0.0030) [2024-06-23 12:55:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7192805376. Throughput: 0: 42657.3. Samples: 7192944400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:55:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 12:55:59,202][15349] Signal inference workers to stop experience collection... (106550 times) [2024-06-23 12:55:59,252][15401] InferenceWorker_p0-w0: stopping experience collection (106550 times) [2024-06-23 12:55:59,318][15349] Signal inference workers to resume experience collection... (106550 times) [2024-06-23 12:55:59,318][15401] InferenceWorker_p0-w0: resuming experience collection (106550 times) [2024-06-23 12:56:00,363][15401] Updated weights for policy 0, policy_version 439020 (0.0029) [2024-06-23 12:56:03,291][15401] Updated weights for policy 0, policy_version 439030 (0.0038) [2024-06-23 12:56:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 7193067520. Throughput: 0: 42805.8. Samples: 7193199240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:03,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 12:56:07,907][15401] Updated weights for policy 0, policy_version 439040 (0.0036) [2024-06-23 12:56:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7193231360. Throughput: 0: 42740.3. Samples: 7193330940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 12:56:11,223][15401] Updated weights for policy 0, policy_version 439050 (0.0026) [2024-06-23 12:56:13,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7193460736. Throughput: 0: 42618.6. Samples: 7193579880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 12:56:15,520][15401] Updated weights for policy 0, policy_version 439060 (0.0042) [2024-06-23 12:56:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.1, 300 sec: 42765.1). Total num frames: 7193657344. Throughput: 0: 42762.4. Samples: 7193839860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 12:56:18,920][15401] Updated weights for policy 0, policy_version 439070 (0.0033) [2024-06-23 12:56:23,173][15401] Updated weights for policy 0, policy_version 439080 (0.0042) [2024-06-23 12:56:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7193886720. Throughput: 0: 42558.3. Samples: 7193966860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 12:56:26,554][15401] Updated weights for policy 0, policy_version 439090 (0.0022) [2024-06-23 12:56:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7194099712. Throughput: 0: 42492.6. Samples: 7194219640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 12:56:30,830][15401] Updated weights for policy 0, policy_version 439100 (0.0028) [2024-06-23 12:56:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7194312704. Throughput: 0: 42817.2. Samples: 7194484220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 12:56:34,431][15401] Updated weights for policy 0, policy_version 439110 (0.0032) [2024-06-23 12:56:38,207][15401] Updated weights for policy 0, policy_version 439120 (0.0042) [2024-06-23 12:56:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 7194542080. Throughput: 0: 42558.7. Samples: 7194611480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 12:56:41,910][15401] Updated weights for policy 0, policy_version 439130 (0.0036) [2024-06-23 12:56:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7194755072. Throughput: 0: 42800.4. Samples: 7194870420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:43,399][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 12:56:45,651][15401] Updated weights for policy 0, policy_version 439140 (0.0035) [2024-06-23 12:56:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7194968064. Throughput: 0: 42979.5. Samples: 7195133220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:48,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 12:56:49,852][15401] Updated weights for policy 0, policy_version 439150 (0.0027) [2024-06-23 12:56:53,387][15401] Updated weights for policy 0, policy_version 439160 (0.0026) [2024-06-23 12:56:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7195197440. Throughput: 0: 42925.3. Samples: 7195262580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:53,393][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 12:56:57,414][15401] Updated weights for policy 0, policy_version 439170 (0.0023) [2024-06-23 12:56:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7195394048. Throughput: 0: 43269.0. Samples: 7195526980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:56:58,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 12:57:00,764][15401] Updated weights for policy 0, policy_version 439180 (0.0039) [2024-06-23 12:57:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7195623424. Throughput: 0: 43059.5. Samples: 7195777540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:57:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 12:57:04,702][15349] Signal inference workers to stop experience collection... (106600 times) [2024-06-23 12:57:04,736][15401] InferenceWorker_p0-w0: stopping experience collection (106600 times) [2024-06-23 12:57:04,760][15349] Signal inference workers to resume experience collection... (106600 times) [2024-06-23 12:57:04,764][15401] InferenceWorker_p0-w0: resuming experience collection (106600 times) [2024-06-23 12:57:04,897][15401] Updated weights for policy 0, policy_version 439190 (0.0047) [2024-06-23 12:57:08,316][15401] Updated weights for policy 0, policy_version 439200 (0.0037) [2024-06-23 12:57:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 7195852800. Throughput: 0: 43247.2. Samples: 7195912980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:57:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 12:57:12,521][15401] Updated weights for policy 0, policy_version 439210 (0.0035) [2024-06-23 12:57:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 7196033024. Throughput: 0: 43387.5. Samples: 7196172080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-23 12:57:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 12:57:15,887][15401] Updated weights for policy 0, policy_version 439220 (0.0023) [2024-06-23 12:57:18,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 7196278784. Throughput: 0: 43116.8. Samples: 7196424480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 12:57:20,349][15401] Updated weights for policy 0, policy_version 439230 (0.0035) [2024-06-23 12:57:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 7196491776. Throughput: 0: 43215.0. Samples: 7196556160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 12:57:23,531][15401] Updated weights for policy 0, policy_version 439240 (0.0041) [2024-06-23 12:57:28,015][15401] Updated weights for policy 0, policy_version 439250 (0.0028) [2024-06-23 12:57:28,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7196672000. Throughput: 0: 43094.2. Samples: 7196809660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:28,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 12:57:30,952][15401] Updated weights for policy 0, policy_version 439260 (0.0038) [2024-06-23 12:57:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 7196901376. Throughput: 0: 42974.4. Samples: 7197067060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 12:57:36,031][15401] Updated weights for policy 0, policy_version 439270 (0.0023) [2024-06-23 12:57:38,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42987.4). Total num frames: 7197147136. Throughput: 0: 43005.0. Samples: 7197197800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 12:57:38,695][15401] Updated weights for policy 0, policy_version 439280 (0.0035) [2024-06-23 12:57:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7197310976. Throughput: 0: 42951.4. Samples: 7197459800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:43,391][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 12:57:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439290_7197327360.pth... [2024-06-23 12:57:43,498][15401] Updated weights for policy 0, policy_version 439290 (0.0037) [2024-06-23 12:57:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438663_7187054592.pth [2024-06-23 12:57:46,028][15401] Updated weights for policy 0, policy_version 439300 (0.0045) [2024-06-23 12:57:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7197556736. Throughput: 0: 43036.5. Samples: 7197714180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 12:57:51,064][15401] Updated weights for policy 0, policy_version 439310 (0.0028) [2024-06-23 12:57:53,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 7197802496. Throughput: 0: 43009.7. Samples: 7197848420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 12:57:53,597][15401] Updated weights for policy 0, policy_version 439320 (0.0024) [2024-06-23 12:57:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7197966336. Throughput: 0: 42952.4. Samples: 7198104940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:57:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 12:57:58,429][15401] Updated weights for policy 0, policy_version 439330 (0.0035) [2024-06-23 12:58:01,403][15401] Updated weights for policy 0, policy_version 439340 (0.0027) [2024-06-23 12:58:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7198212096. Throughput: 0: 42953.5. Samples: 7198357380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 12:58:06,392][15401] Updated weights for policy 0, policy_version 439350 (0.0030) [2024-06-23 12:58:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7198425088. Throughput: 0: 43193.9. Samples: 7198499880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:58:08,897][15401] Updated weights for policy 0, policy_version 439360 (0.0039) [2024-06-23 12:58:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7198605312. Throughput: 0: 43130.8. Samples: 7198750540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 12:58:13,974][15401] Updated weights for policy 0, policy_version 439370 (0.0026) [2024-06-23 12:58:16,637][15401] Updated weights for policy 0, policy_version 439380 (0.0038) [2024-06-23 12:58:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 7198867456. Throughput: 0: 42985.3. Samples: 7199001400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 12:58:21,979][15401] Updated weights for policy 0, policy_version 439390 (0.0032) [2024-06-23 12:58:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7199031296. Throughput: 0: 43189.3. Samples: 7199141320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 12:58:24,382][15401] Updated weights for policy 0, policy_version 439400 (0.0023) [2024-06-23 12:58:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7199260672. Throughput: 0: 42879.6. Samples: 7199389380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 12:58:29,309][15349] Signal inference workers to stop experience collection... (106650 times) [2024-06-23 12:58:29,364][15401] InferenceWorker_p0-w0: stopping experience collection (106650 times) [2024-06-23 12:58:29,423][15349] Signal inference workers to resume experience collection... (106650 times) [2024-06-23 12:58:29,424][15401] InferenceWorker_p0-w0: resuming experience collection (106650 times) [2024-06-23 12:58:29,559][15401] Updated weights for policy 0, policy_version 439410 (0.0030) [2024-06-23 12:58:31,881][15401] Updated weights for policy 0, policy_version 439420 (0.0030) [2024-06-23 12:58:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 7199506432. Throughput: 0: 42943.5. Samples: 7199646640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 12:58:37,046][15401] Updated weights for policy 0, policy_version 439430 (0.0042) [2024-06-23 12:58:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7199686656. Throughput: 0: 43028.0. Samples: 7199784680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 12:58:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 12:58:39,398][15401] Updated weights for policy 0, policy_version 439440 (0.0044) [2024-06-23 12:58:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 7199916032. Throughput: 0: 42898.0. Samples: 7200035360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:58:43,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 12:58:44,485][15401] Updated weights for policy 0, policy_version 439450 (0.0043) [2024-06-23 12:58:47,279][15401] Updated weights for policy 0, policy_version 439460 (0.0031) [2024-06-23 12:58:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7200145408. Throughput: 0: 43061.4. Samples: 7200295140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:58:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 12:58:51,994][15401] Updated weights for policy 0, policy_version 439470 (0.0052) [2024-06-23 12:58:53,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7200358400. Throughput: 0: 42864.9. Samples: 7200428800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:58:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 12:58:54,831][15401] Updated weights for policy 0, policy_version 439480 (0.0030) [2024-06-23 12:58:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7200555008. Throughput: 0: 43016.9. Samples: 7200686300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:58:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 12:58:59,485][15401] Updated weights for policy 0, policy_version 439490 (0.0042) [2024-06-23 12:59:02,377][15401] Updated weights for policy 0, policy_version 439500 (0.0039) [2024-06-23 12:59:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7200784384. Throughput: 0: 42969.7. Samples: 7200935040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 12:59:06,938][15401] Updated weights for policy 0, policy_version 439510 (0.0036) [2024-06-23 12:59:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7200964608. Throughput: 0: 42829.9. Samples: 7201068660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 12:59:10,217][15401] Updated weights for policy 0, policy_version 439520 (0.0033) [2024-06-23 12:59:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7201193984. Throughput: 0: 43080.9. Samples: 7201328020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 12:59:14,684][15401] Updated weights for policy 0, policy_version 439530 (0.0039) [2024-06-23 12:59:17,865][15401] Updated weights for policy 0, policy_version 439540 (0.0027) [2024-06-23 12:59:18,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 7201439744. Throughput: 0: 43000.5. Samples: 7201581660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:18,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-23 12:59:22,210][15401] Updated weights for policy 0, policy_version 439550 (0.0033) [2024-06-23 12:59:23,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43413.0, 300 sec: 42986.2). Total num frames: 7201636352. Throughput: 0: 42921.0. Samples: 7201716400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:23,396][15132] Avg episode reward: [(0, '0.207')] [2024-06-23 12:59:25,549][15401] Updated weights for policy 0, policy_version 439560 (0.0023) [2024-06-23 12:59:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42931.9). Total num frames: 7201849344. Throughput: 0: 43146.1. Samples: 7201976920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:28,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 12:59:29,669][15401] Updated weights for policy 0, policy_version 439570 (0.0037) [2024-06-23 12:59:33,287][15401] Updated weights for policy 0, policy_version 439580 (0.0035) [2024-06-23 12:59:33,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 7202078720. Throughput: 0: 43040.8. Samples: 7202231980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 12:59:37,555][15401] Updated weights for policy 0, policy_version 439590 (0.0029) [2024-06-23 12:59:38,153][15349] Signal inference workers to stop experience collection... (106700 times) [2024-06-23 12:59:38,204][15401] InferenceWorker_p0-w0: stopping experience collection (106700 times) [2024-06-23 12:59:38,264][15349] Signal inference workers to resume experience collection... (106700 times) [2024-06-23 12:59:38,265][15401] InferenceWorker_p0-w0: resuming experience collection (106700 times) [2024-06-23 12:59:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 7202291712. Throughput: 0: 42943.3. Samples: 7202361240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 12:59:41,026][15401] Updated weights for policy 0, policy_version 439600 (0.0033) [2024-06-23 12:59:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7202488320. Throughput: 0: 42820.3. Samples: 7202613220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 12:59:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439606_7202504704.pth... [2024-06-23 12:59:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000438976_7192182784.pth [2024-06-23 12:59:45,316][15401] Updated weights for policy 0, policy_version 439610 (0.0031) [2024-06-23 12:59:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7202701312. Throughput: 0: 43087.7. Samples: 7202873980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 12:59:48,846][15401] Updated weights for policy 0, policy_version 439620 (0.0031) [2024-06-23 12:59:52,959][15401] Updated weights for policy 0, policy_version 439630 (0.0030) [2024-06-23 12:59:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7202914304. Throughput: 0: 43008.9. Samples: 7203004060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 12:59:56,485][15401] Updated weights for policy 0, policy_version 439640 (0.0030) [2024-06-23 12:59:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7203127296. Throughput: 0: 42922.7. Samples: 7203259540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-23 12:59:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 13:00:00,534][15401] Updated weights for policy 0, policy_version 439650 (0.0037) [2024-06-23 13:00:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 7203356672. Throughput: 0: 42848.1. Samples: 7203509820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 13:00:04,215][15401] Updated weights for policy 0, policy_version 439660 (0.0033) [2024-06-23 13:00:08,243][15401] Updated weights for policy 0, policy_version 439670 (0.0028) [2024-06-23 13:00:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 7203553280. Throughput: 0: 42743.8. Samples: 7203639700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:08,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 13:00:11,818][15401] Updated weights for policy 0, policy_version 439680 (0.0036) [2024-06-23 13:00:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42876.3). Total num frames: 7203766272. Throughput: 0: 42632.9. Samples: 7203895400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:13,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 13:00:15,691][15401] Updated weights for policy 0, policy_version 439690 (0.0050) [2024-06-23 13:00:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 7203995648. Throughput: 0: 42744.4. Samples: 7204155480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:18,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 13:00:19,453][15401] Updated weights for policy 0, policy_version 439700 (0.0037) [2024-06-23 13:00:23,250][15401] Updated weights for policy 0, policy_version 439710 (0.0032) [2024-06-23 13:00:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.1, 300 sec: 42987.2). Total num frames: 7204208640. Throughput: 0: 42796.4. Samples: 7204287080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 13:00:27,021][15401] Updated weights for policy 0, policy_version 439720 (0.0028) [2024-06-23 13:00:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 7204421632. Throughput: 0: 42710.2. Samples: 7204535180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 13:00:31,000][15401] Updated weights for policy 0, policy_version 439730 (0.0026) [2024-06-23 13:00:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 7204651008. Throughput: 0: 42653.2. Samples: 7204793380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 13:00:34,471][15401] Updated weights for policy 0, policy_version 439740 (0.0029) [2024-06-23 13:00:38,396][15132] Fps is (10 sec: 40934.4, 60 sec: 42320.8, 300 sec: 42930.7). Total num frames: 7204831232. Throughput: 0: 42684.6. Samples: 7204925140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:38,397][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 13:00:38,776][15401] Updated weights for policy 0, policy_version 439750 (0.0050) [2024-06-23 13:00:42,507][15401] Updated weights for policy 0, policy_version 439760 (0.0041) [2024-06-23 13:00:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7205076992. Throughput: 0: 42803.6. Samples: 7205185700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 13:00:46,243][15401] Updated weights for policy 0, policy_version 439770 (0.0030) [2024-06-23 13:00:48,389][15132] Fps is (10 sec: 45904.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7205289984. Throughput: 0: 42942.2. Samples: 7205442220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 13:00:50,263][15401] Updated weights for policy 0, policy_version 439780 (0.0032) [2024-06-23 13:00:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7205486592. Throughput: 0: 43039.1. Samples: 7205576360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 13:00:53,982][15401] Updated weights for policy 0, policy_version 439790 (0.0028) [2024-06-23 13:00:57,858][15401] Updated weights for policy 0, policy_version 439800 (0.0038) [2024-06-23 13:00:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7205699584. Throughput: 0: 43015.0. Samples: 7205831080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:00:58,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-23 13:01:01,754][15401] Updated weights for policy 0, policy_version 439810 (0.0043) [2024-06-23 13:01:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7205912576. Throughput: 0: 42910.7. Samples: 7206086460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:01:03,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 13:01:05,541][15401] Updated weights for policy 0, policy_version 439820 (0.0032) [2024-06-23 13:01:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 7206141952. Throughput: 0: 42831.5. Samples: 7206214500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:01:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 13:01:09,304][15401] Updated weights for policy 0, policy_version 439830 (0.0029) [2024-06-23 13:01:12,947][15401] Updated weights for policy 0, policy_version 439840 (0.0028) [2024-06-23 13:01:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 7206354944. Throughput: 0: 42956.0. Samples: 7206468200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:01:13,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 13:01:15,452][15349] Signal inference workers to stop experience collection... (106750 times) [2024-06-23 13:01:15,504][15349] Signal inference workers to resume experience collection... (106750 times) [2024-06-23 13:01:15,504][15401] InferenceWorker_p0-w0: stopping experience collection (106750 times) [2024-06-23 13:01:15,520][15401] InferenceWorker_p0-w0: resuming experience collection (106750 times) [2024-06-23 13:01:16,832][15401] Updated weights for policy 0, policy_version 439850 (0.0039) [2024-06-23 13:01:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7206567936. Throughput: 0: 42976.6. Samples: 7206727320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:01:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 13:01:20,794][15401] Updated weights for policy 0, policy_version 439860 (0.0033) [2024-06-23 13:01:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7206780928. Throughput: 0: 42933.5. Samples: 7206856880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 13:01:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 13:01:24,320][15401] Updated weights for policy 0, policy_version 439870 (0.0037) [2024-06-23 13:01:28,322][15401] Updated weights for policy 0, policy_version 439880 (0.0030) [2024-06-23 13:01:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 7206993920. Throughput: 0: 42796.9. Samples: 7207111560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 13:01:32,201][15401] Updated weights for policy 0, policy_version 439890 (0.0044) [2024-06-23 13:01:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7207206912. Throughput: 0: 42966.6. Samples: 7207375720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 13:01:35,942][15401] Updated weights for policy 0, policy_version 439900 (0.0040) [2024-06-23 13:01:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43149.2, 300 sec: 42931.7). Total num frames: 7207419904. Throughput: 0: 42689.0. Samples: 7207497360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 13:01:39,743][15401] Updated weights for policy 0, policy_version 439910 (0.0033) [2024-06-23 13:01:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 7207632896. Throughput: 0: 42756.4. Samples: 7207755220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:43,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 13:01:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439919_7207632896.pth... [2024-06-23 13:01:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439290_7197327360.pth [2024-06-23 13:01:43,639][15401] Updated weights for policy 0, policy_version 439920 (0.0035) [2024-06-23 13:01:47,335][15401] Updated weights for policy 0, policy_version 439930 (0.0031) [2024-06-23 13:01:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7207829504. Throughput: 0: 42891.5. Samples: 7208016580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 13:01:51,111][15401] Updated weights for policy 0, policy_version 439940 (0.0030) [2024-06-23 13:01:53,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7208075264. Throughput: 0: 42854.7. Samples: 7208142960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 13:01:55,199][15401] Updated weights for policy 0, policy_version 439950 (0.0025) [2024-06-23 13:01:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7208288256. Throughput: 0: 42986.3. Samples: 7208402580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:01:58,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 13:01:58,770][15401] Updated weights for policy 0, policy_version 439960 (0.0029) [2024-06-23 13:02:02,722][15401] Updated weights for policy 0, policy_version 439970 (0.0026) [2024-06-23 13:02:03,393][15132] Fps is (10 sec: 40943.8, 60 sec: 42868.6, 300 sec: 42820.0). Total num frames: 7208484864. Throughput: 0: 42904.6. Samples: 7208658200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:03,394][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 13:02:06,319][15401] Updated weights for policy 0, policy_version 439980 (0.0031) [2024-06-23 13:02:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7208714240. Throughput: 0: 42876.5. Samples: 7208786320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:08,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 13:02:10,228][15401] Updated weights for policy 0, policy_version 439990 (0.0025) [2024-06-23 13:02:13,389][15132] Fps is (10 sec: 45893.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7208943616. Throughput: 0: 43071.5. Samples: 7209049780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 13:02:13,967][15401] Updated weights for policy 0, policy_version 440000 (0.0030) [2024-06-23 13:02:17,823][15401] Updated weights for policy 0, policy_version 440010 (0.0033) [2024-06-23 13:02:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 7209140224. Throughput: 0: 42759.9. Samples: 7209299920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 13:02:21,667][15401] Updated weights for policy 0, policy_version 440020 (0.0036) [2024-06-23 13:02:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7209353216. Throughput: 0: 42915.5. Samples: 7209428560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 13:02:25,307][15401] Updated weights for policy 0, policy_version 440030 (0.0036) [2024-06-23 13:02:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7209582592. Throughput: 0: 43102.3. Samples: 7209694720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 13:02:29,778][15401] Updated weights for policy 0, policy_version 440040 (0.0035) [2024-06-23 13:02:32,852][15401] Updated weights for policy 0, policy_version 440050 (0.0022) [2024-06-23 13:02:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7209795584. Throughput: 0: 42837.2. Samples: 7209944260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 13:02:35,385][15349] Signal inference workers to stop experience collection... (106800 times) [2024-06-23 13:02:35,436][15401] InferenceWorker_p0-w0: stopping experience collection (106800 times) [2024-06-23 13:02:35,444][15349] Signal inference workers to resume experience collection... (106800 times) [2024-06-23 13:02:35,457][15401] InferenceWorker_p0-w0: resuming experience collection (106800 times) [2024-06-23 13:02:37,265][15401] Updated weights for policy 0, policy_version 440060 (0.0034) [2024-06-23 13:02:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 7210008576. Throughput: 0: 42822.7. Samples: 7210069980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 13:02:40,455][15401] Updated weights for policy 0, policy_version 440070 (0.0039) [2024-06-23 13:02:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7210188800. Throughput: 0: 42988.0. Samples: 7210337040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 13:02:45,109][15401] Updated weights for policy 0, policy_version 440080 (0.0040) [2024-06-23 13:02:47,981][15401] Updated weights for policy 0, policy_version 440090 (0.0033) [2024-06-23 13:02:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 7210434560. Throughput: 0: 42710.4. Samples: 7210580000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:02:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 13:02:52,674][15401] Updated weights for policy 0, policy_version 440100 (0.0041) [2024-06-23 13:02:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7210647552. Throughput: 0: 42924.8. Samples: 7210717940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:02:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 13:02:55,535][15401] Updated weights for policy 0, policy_version 440110 (0.0041) [2024-06-23 13:02:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7210827776. Throughput: 0: 42801.8. Samples: 7210975860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:02:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 13:03:00,068][15401] Updated weights for policy 0, policy_version 440120 (0.0042) [2024-06-23 13:03:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43147.4, 300 sec: 42876.1). Total num frames: 7211073536. Throughput: 0: 42756.1. Samples: 7211223940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 13:03:03,417][15401] Updated weights for policy 0, policy_version 440130 (0.0030) [2024-06-23 13:03:07,588][15401] Updated weights for policy 0, policy_version 440140 (0.0031) [2024-06-23 13:03:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7211270144. Throughput: 0: 42925.8. Samples: 7211360220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 13:03:11,128][15401] Updated weights for policy 0, policy_version 440150 (0.0030) [2024-06-23 13:03:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7211466752. Throughput: 0: 42600.0. Samples: 7211611720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:13,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 13:03:15,270][15401] Updated weights for policy 0, policy_version 440160 (0.0023) [2024-06-23 13:03:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7211696128. Throughput: 0: 42734.8. Samples: 7211867320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 13:03:18,790][15401] Updated weights for policy 0, policy_version 440170 (0.0024) [2024-06-23 13:03:22,662][15401] Updated weights for policy 0, policy_version 440180 (0.0032) [2024-06-23 13:03:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7211925504. Throughput: 0: 42937.2. Samples: 7212002160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 13:03:26,325][15401] Updated weights for policy 0, policy_version 440190 (0.0034) [2024-06-23 13:03:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7212122112. Throughput: 0: 42631.1. Samples: 7212255440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 13:03:30,441][15401] Updated weights for policy 0, policy_version 440200 (0.0039) [2024-06-23 13:03:33,388][15349] Signal inference workers to stop experience collection... (106850 times) [2024-06-23 13:03:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7212335104. Throughput: 0: 42902.6. Samples: 7212510620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:33,397][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 13:03:33,439][15349] Signal inference workers to resume experience collection... (106850 times) [2024-06-23 13:03:33,444][15401] InferenceWorker_p0-w0: stopping experience collection (106850 times) [2024-06-23 13:03:33,464][15401] InferenceWorker_p0-w0: resuming experience collection (106850 times) [2024-06-23 13:03:33,925][15401] Updated weights for policy 0, policy_version 440210 (0.0034) [2024-06-23 13:03:38,251][15401] Updated weights for policy 0, policy_version 440220 (0.0031) [2024-06-23 13:03:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 7212580864. Throughput: 0: 42776.2. Samples: 7212642860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:38,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-23 13:03:42,057][15401] Updated weights for policy 0, policy_version 440230 (0.0033) [2024-06-23 13:03:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7212761088. Throughput: 0: 42619.3. Samples: 7212893740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:43,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-23 13:03:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440233_7212777472.pth... [2024-06-23 13:03:43,607][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439606_7202504704.pth [2024-06-23 13:03:45,915][15401] Updated weights for policy 0, policy_version 440240 (0.0035) [2024-06-23 13:03:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7212990464. Throughput: 0: 42839.1. Samples: 7213151700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:48,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 13:03:49,815][15401] Updated weights for policy 0, policy_version 440250 (0.0022) [2024-06-23 13:03:53,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7213203456. Throughput: 0: 42752.4. Samples: 7213284080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:53,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 13:03:53,536][15401] Updated weights for policy 0, policy_version 440260 (0.0042) [2024-06-23 13:03:57,611][15401] Updated weights for policy 0, policy_version 440270 (0.0036) [2024-06-23 13:03:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7213400064. Throughput: 0: 42723.6. Samples: 7213534280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:03:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 13:04:01,171][15401] Updated weights for policy 0, policy_version 440280 (0.0024) [2024-06-23 13:04:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7213629440. Throughput: 0: 42402.6. Samples: 7213775440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:04:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 13:04:05,437][15401] Updated weights for policy 0, policy_version 440290 (0.0029) [2024-06-23 13:04:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7213826048. Throughput: 0: 42373.6. Samples: 7213908960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 13:04:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 13:04:08,832][15401] Updated weights for policy 0, policy_version 440300 (0.0032) [2024-06-23 13:04:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7214022656. Throughput: 0: 42381.4. Samples: 7214162600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 13:04:13,448][15401] Updated weights for policy 0, policy_version 440310 (0.0039) [2024-06-23 13:04:16,884][15401] Updated weights for policy 0, policy_version 440320 (0.0039) [2024-06-23 13:04:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 7214268416. Throughput: 0: 42338.4. Samples: 7214415840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 13:04:21,067][15401] Updated weights for policy 0, policy_version 440330 (0.0036) [2024-06-23 13:04:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7214465024. Throughput: 0: 42286.5. Samples: 7214545760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 13:04:24,508][15401] Updated weights for policy 0, policy_version 440340 (0.0033) [2024-06-23 13:04:28,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7214661632. Throughput: 0: 42377.0. Samples: 7214800700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 13:04:28,604][15401] Updated weights for policy 0, policy_version 440350 (0.0046) [2024-06-23 13:04:31,963][15401] Updated weights for policy 0, policy_version 440360 (0.0033) [2024-06-23 13:04:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 7214923776. Throughput: 0: 42313.2. Samples: 7215055800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 13:04:36,375][15401] Updated weights for policy 0, policy_version 440370 (0.0034) [2024-06-23 13:04:37,539][15349] Signal inference workers to stop experience collection... (106900 times) [2024-06-23 13:04:37,585][15401] InferenceWorker_p0-w0: stopping experience collection (106900 times) [2024-06-23 13:04:37,657][15349] Signal inference workers to resume experience collection... (106900 times) [2024-06-23 13:04:37,657][15401] InferenceWorker_p0-w0: resuming experience collection (106900 times) [2024-06-23 13:04:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 7215104000. Throughput: 0: 42291.2. Samples: 7215187180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 13:04:39,747][15401] Updated weights for policy 0, policy_version 440380 (0.0025) [2024-06-23 13:04:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 7215316992. Throughput: 0: 42408.9. Samples: 7215442680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 13:04:43,875][15401] Updated weights for policy 0, policy_version 440390 (0.0031) [2024-06-23 13:04:47,325][15401] Updated weights for policy 0, policy_version 440400 (0.0029) [2024-06-23 13:04:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7215562752. Throughput: 0: 42707.1. Samples: 7215697260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 13:04:51,729][15401] Updated weights for policy 0, policy_version 440410 (0.0037) [2024-06-23 13:04:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7215742976. Throughput: 0: 42683.0. Samples: 7215829700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:53,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 13:04:55,053][15401] Updated weights for policy 0, policy_version 440420 (0.0032) [2024-06-23 13:04:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7215972352. Throughput: 0: 42682.1. Samples: 7216083300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:04:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 13:04:59,416][15401] Updated weights for policy 0, policy_version 440430 (0.0035) [2024-06-23 13:05:02,665][15401] Updated weights for policy 0, policy_version 440440 (0.0046) [2024-06-23 13:05:03,392][15132] Fps is (10 sec: 44224.4, 60 sec: 42596.5, 300 sec: 42820.5). Total num frames: 7216185344. Throughput: 0: 42811.1. Samples: 7216342460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:03,393][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 13:05:07,305][15401] Updated weights for policy 0, policy_version 440450 (0.0034) [2024-06-23 13:05:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7216365568. Throughput: 0: 42799.9. Samples: 7216471760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:05:10,173][15401] Updated weights for policy 0, policy_version 440460 (0.0025) [2024-06-23 13:05:13,389][15132] Fps is (10 sec: 44249.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 7216627712. Throughput: 0: 42632.9. Samples: 7216719180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 13:05:15,135][15401] Updated weights for policy 0, policy_version 440470 (0.0037) [2024-06-23 13:05:18,360][15401] Updated weights for policy 0, policy_version 440480 (0.0036) [2024-06-23 13:05:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7216824320. Throughput: 0: 42712.0. Samples: 7216977840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 13:05:22,838][15401] Updated weights for policy 0, policy_version 440490 (0.0043) [2024-06-23 13:05:23,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7217004544. Throughput: 0: 42532.8. Samples: 7217101160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 13:05:25,861][15401] Updated weights for policy 0, policy_version 440500 (0.0042) [2024-06-23 13:05:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7217266688. Throughput: 0: 42594.7. Samples: 7217359440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 13:05:30,212][15401] Updated weights for policy 0, policy_version 440510 (0.0033) [2024-06-23 13:05:33,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42323.6, 300 sec: 42821.1). Total num frames: 7217463296. Throughput: 0: 42822.2. Samples: 7217624360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:05:33,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 13:05:33,430][15401] Updated weights for policy 0, policy_version 440520 (0.0036) [2024-06-23 13:05:37,634][15401] Updated weights for policy 0, policy_version 440530 (0.0032) [2024-06-23 13:05:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7217659904. Throughput: 0: 42702.2. Samples: 7217751300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:05:38,391][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 13:05:41,433][15401] Updated weights for policy 0, policy_version 440540 (0.0037) [2024-06-23 13:05:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7217905664. Throughput: 0: 42680.0. Samples: 7218003900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:05:43,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:05:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440546_7217905664.pth... [2024-06-23 13:05:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000439919_7207632896.pth [2024-06-23 13:05:45,211][15401] Updated weights for policy 0, policy_version 440550 (0.0029) [2024-06-23 13:05:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7218085888. Throughput: 0: 42717.7. Samples: 7218264640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:05:48,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 13:05:49,085][15401] Updated weights for policy 0, policy_version 440560 (0.0040) [2024-06-23 13:05:52,977][15401] Updated weights for policy 0, policy_version 440570 (0.0039) [2024-06-23 13:05:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7218315264. Throughput: 0: 42530.5. Samples: 7218385640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:05:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 13:05:56,735][15401] Updated weights for policy 0, policy_version 440580 (0.0044) [2024-06-23 13:05:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7218544640. Throughput: 0: 42887.4. Samples: 7218649120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:05:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 13:06:00,659][15401] Updated weights for policy 0, policy_version 440590 (0.0028) [2024-06-23 13:06:03,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.2, 300 sec: 42653.9). Total num frames: 7218724864. Throughput: 0: 42723.6. Samples: 7218900400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:03,394][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 13:06:04,764][15401] Updated weights for policy 0, policy_version 440600 (0.0041) [2024-06-23 13:06:05,109][15349] Signal inference workers to stop experience collection... (106950 times) [2024-06-23 13:06:05,132][15401] InferenceWorker_p0-w0: stopping experience collection (106950 times) [2024-06-23 13:06:05,171][15349] Signal inference workers to resume experience collection... (106950 times) [2024-06-23 13:06:05,171][15401] InferenceWorker_p0-w0: resuming experience collection (106950 times) [2024-06-23 13:06:08,343][15401] Updated weights for policy 0, policy_version 440610 (0.0023) [2024-06-23 13:06:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7218954240. Throughput: 0: 42692.8. Samples: 7219022340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 13:06:12,323][15401] Updated weights for policy 0, policy_version 440620 (0.0033) [2024-06-23 13:06:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7219183616. Throughput: 0: 42846.6. Samples: 7219287540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 13:06:16,292][15401] Updated weights for policy 0, policy_version 440630 (0.0039) [2024-06-23 13:06:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7219363840. Throughput: 0: 42547.7. Samples: 7219538900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 13:06:20,166][15401] Updated weights for policy 0, policy_version 440640 (0.0026) [2024-06-23 13:06:23,394][15132] Fps is (10 sec: 39305.7, 60 sec: 42868.6, 300 sec: 42653.3). Total num frames: 7219576832. Throughput: 0: 42472.6. Samples: 7219662740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:23,394][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 13:06:23,979][15401] Updated weights for policy 0, policy_version 440650 (0.0041) [2024-06-23 13:06:27,756][15401] Updated weights for policy 0, policy_version 440660 (0.0026) [2024-06-23 13:06:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7219806208. Throughput: 0: 42700.0. Samples: 7219925400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:28,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 13:06:31,598][15401] Updated weights for policy 0, policy_version 440670 (0.0030) [2024-06-23 13:06:33,390][15132] Fps is (10 sec: 44254.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7220019200. Throughput: 0: 42576.4. Samples: 7220180580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 13:06:35,291][15401] Updated weights for policy 0, policy_version 440680 (0.0028) [2024-06-23 13:06:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7220232192. Throughput: 0: 42666.0. Samples: 7220305600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 13:06:39,043][15401] Updated weights for policy 0, policy_version 440690 (0.0032) [2024-06-23 13:06:42,831][15401] Updated weights for policy 0, policy_version 440700 (0.0027) [2024-06-23 13:06:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7220445184. Throughput: 0: 42730.7. Samples: 7220572000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 13:06:46,686][15401] Updated weights for policy 0, policy_version 440710 (0.0052) [2024-06-23 13:06:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7220658176. Throughput: 0: 42725.4. Samples: 7220823040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:48,394][15132] Avg episode reward: [(0, '0.188')] [2024-06-23 13:06:50,831][15401] Updated weights for policy 0, policy_version 440720 (0.0034) [2024-06-23 13:06:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7220871168. Throughput: 0: 42831.6. Samples: 7220949760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:53,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 13:06:54,209][15401] Updated weights for policy 0, policy_version 440730 (0.0031) [2024-06-23 13:06:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.5). Total num frames: 7221067776. Throughput: 0: 42659.2. Samples: 7221207200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-23 13:06:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 13:06:58,535][15401] Updated weights for policy 0, policy_version 440740 (0.0034) [2024-06-23 13:07:01,842][15401] Updated weights for policy 0, policy_version 440750 (0.0048) [2024-06-23 13:07:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7221297152. Throughput: 0: 42814.6. Samples: 7221465560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 13:07:06,097][15401] Updated weights for policy 0, policy_version 440760 (0.0028) [2024-06-23 13:07:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7221526528. Throughput: 0: 42947.5. Samples: 7221595200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:08,390][15132] Avg episode reward: [(0, '0.071')] [2024-06-23 13:07:10,226][15401] Updated weights for policy 0, policy_version 440770 (0.0036) [2024-06-23 13:07:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7221706752. Throughput: 0: 42689.3. Samples: 7221846420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 13:07:13,704][15401] Updated weights for policy 0, policy_version 440780 (0.0038) [2024-06-23 13:07:17,788][15401] Updated weights for policy 0, policy_version 440790 (0.0035) [2024-06-23 13:07:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7221936128. Throughput: 0: 42686.8. Samples: 7222101480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 13:07:21,264][15349] Signal inference workers to stop experience collection... (107000 times) [2024-06-23 13:07:21,264][15349] Signal inference workers to resume experience collection... (107000 times) [2024-06-23 13:07:21,300][15401] InferenceWorker_p0-w0: stopping experience collection (107000 times) [2024-06-23 13:07:21,301][15401] InferenceWorker_p0-w0: resuming experience collection (107000 times) [2024-06-23 13:07:21,398][15401] Updated weights for policy 0, policy_version 440800 (0.0036) [2024-06-23 13:07:23,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43145.7, 300 sec: 42653.6). Total num frames: 7222165504. Throughput: 0: 42772.4. Samples: 7222230460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:23,392][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 13:07:25,277][15401] Updated weights for policy 0, policy_version 440810 (0.0034) [2024-06-23 13:07:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7222345728. Throughput: 0: 42557.3. Samples: 7222487080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 13:07:29,163][15401] Updated weights for policy 0, policy_version 440820 (0.0031) [2024-06-23 13:07:32,752][15401] Updated weights for policy 0, policy_version 440830 (0.0027) [2024-06-23 13:07:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7222575104. Throughput: 0: 42506.7. Samples: 7222735840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 13:07:36,926][15401] Updated weights for policy 0, policy_version 440840 (0.0033) [2024-06-23 13:07:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7222804480. Throughput: 0: 42593.3. Samples: 7222866460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 13:07:40,380][15401] Updated weights for policy 0, policy_version 440850 (0.0041) [2024-06-23 13:07:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7222984704. Throughput: 0: 42479.5. Samples: 7223118780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 13:07:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440856_7222984704.pth... [2024-06-23 13:07:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440233_7212777472.pth [2024-06-23 13:07:44,724][15401] Updated weights for policy 0, policy_version 440860 (0.0038) [2024-06-23 13:07:47,974][15401] Updated weights for policy 0, policy_version 440870 (0.0038) [2024-06-23 13:07:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7223214080. Throughput: 0: 42362.6. Samples: 7223371880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 13:07:52,202][15401] Updated weights for policy 0, policy_version 440880 (0.0038) [2024-06-23 13:07:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7223427072. Throughput: 0: 42443.9. Samples: 7223505180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 13:07:55,806][15401] Updated weights for policy 0, policy_version 440890 (0.0028) [2024-06-23 13:07:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 7223623680. Throughput: 0: 42475.5. Samples: 7223757820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:07:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 13:08:00,036][15401] Updated weights for policy 0, policy_version 440900 (0.0037) [2024-06-23 13:08:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7223853056. Throughput: 0: 42464.4. Samples: 7224012380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:08:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 13:08:03,446][15401] Updated weights for policy 0, policy_version 440910 (0.0037) [2024-06-23 13:08:07,949][15401] Updated weights for policy 0, policy_version 440920 (0.0042) [2024-06-23 13:08:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7224049664. Throughput: 0: 42515.2. Samples: 7224143540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:08:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 13:08:11,141][15401] Updated weights for policy 0, policy_version 440930 (0.0042) [2024-06-23 13:08:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7224279040. Throughput: 0: 42361.8. Samples: 7224393360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:08:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 13:08:15,804][15401] Updated weights for policy 0, policy_version 440940 (0.0035) [2024-06-23 13:08:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7224492032. Throughput: 0: 42469.7. Samples: 7224646980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:08:18,399][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 13:08:18,778][15401] Updated weights for policy 0, policy_version 440950 (0.0034) [2024-06-23 13:08:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 42542.9). Total num frames: 7224672256. Throughput: 0: 42446.7. Samples: 7224776560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 13:08:23,485][15401] Updated weights for policy 0, policy_version 440960 (0.0046) [2024-06-23 13:08:26,270][15401] Updated weights for policy 0, policy_version 440970 (0.0022) [2024-06-23 13:08:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7224918016. Throughput: 0: 42464.0. Samples: 7225029660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 13:08:31,126][15401] Updated weights for policy 0, policy_version 440980 (0.0033) [2024-06-23 13:08:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7225147392. Throughput: 0: 42573.3. Samples: 7225287680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:33,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-23 13:08:34,132][15401] Updated weights for policy 0, policy_version 440990 (0.0026) [2024-06-23 13:08:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7225327616. Throughput: 0: 42413.4. Samples: 7225413780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 13:08:39,130][15401] Updated weights for policy 0, policy_version 441000 (0.0041) [2024-06-23 13:08:41,672][15401] Updated weights for policy 0, policy_version 441010 (0.0036) [2024-06-23 13:08:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7225540608. Throughput: 0: 42287.7. Samples: 7225660760. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 13:08:46,688][15401] Updated weights for policy 0, policy_version 441020 (0.0026) [2024-06-23 13:08:47,054][15349] Signal inference workers to stop experience collection... (107050 times) [2024-06-23 13:08:47,054][15349] Signal inference workers to resume experience collection... (107050 times) [2024-06-23 13:08:47,082][15401] InferenceWorker_p0-w0: stopping experience collection (107050 times) [2024-06-23 13:08:47,082][15401] InferenceWorker_p0-w0: resuming experience collection (107050 times) [2024-06-23 13:08:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7225769984. Throughput: 0: 42515.5. Samples: 7225925580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 13:08:49,466][15401] Updated weights for policy 0, policy_version 441030 (0.0031) [2024-06-23 13:08:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7225966592. Throughput: 0: 42426.5. Samples: 7226052740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 13:08:54,149][15401] Updated weights for policy 0, policy_version 441040 (0.0047) [2024-06-23 13:08:57,022][15401] Updated weights for policy 0, policy_version 441050 (0.0028) [2024-06-23 13:08:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7226179584. Throughput: 0: 42476.8. Samples: 7226304820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:08:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 13:09:01,719][15401] Updated weights for policy 0, policy_version 441060 (0.0036) [2024-06-23 13:09:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7226408960. Throughput: 0: 42752.9. Samples: 7226570860. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:03,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-23 13:09:04,466][15401] Updated weights for policy 0, policy_version 441070 (0.0035) [2024-06-23 13:09:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7226605568. Throughput: 0: 42769.8. Samples: 7226701200. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:08,396][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 13:09:09,554][15401] Updated weights for policy 0, policy_version 441080 (0.0029) [2024-06-23 13:09:12,103][15401] Updated weights for policy 0, policy_version 441090 (0.0034) [2024-06-23 13:09:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7226834944. Throughput: 0: 42571.9. Samples: 7226945400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:13,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 13:09:17,059][15401] Updated weights for policy 0, policy_version 441100 (0.0048) [2024-06-23 13:09:18,394][15132] Fps is (10 sec: 44215.6, 60 sec: 42595.0, 300 sec: 42653.2). Total num frames: 7227047936. Throughput: 0: 42796.3. Samples: 7227213720. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:18,395][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 13:09:19,689][15401] Updated weights for policy 0, policy_version 441110 (0.0026) [2024-06-23 13:09:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7227260928. Throughput: 0: 42835.1. Samples: 7227341360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 13:09:24,792][15401] Updated weights for policy 0, policy_version 441120 (0.0030) [2024-06-23 13:09:27,230][15401] Updated weights for policy 0, policy_version 441130 (0.0032) [2024-06-23 13:09:28,389][15132] Fps is (10 sec: 44258.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7227490304. Throughput: 0: 42896.9. Samples: 7227591120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 13:09:32,419][15401] Updated weights for policy 0, policy_version 441140 (0.0051) [2024-06-23 13:09:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7227686912. Throughput: 0: 42917.8. Samples: 7227856880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 13:09:34,789][15401] Updated weights for policy 0, policy_version 441150 (0.0034) [2024-06-23 13:09:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7227899904. Throughput: 0: 42776.2. Samples: 7227977660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 13:09:39,999][15401] Updated weights for policy 0, policy_version 441160 (0.0022) [2024-06-23 13:09:43,245][15401] Updated weights for policy 0, policy_version 441170 (0.0032) [2024-06-23 13:09:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7228129280. Throughput: 0: 42879.6. Samples: 7228234400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:09:43,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 13:09:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441170_7228129280.pth... [2024-06-23 13:09:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440546_7217905664.pth [2024-06-23 13:09:47,553][15401] Updated weights for policy 0, policy_version 441180 (0.0032) [2024-06-23 13:09:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7228309504. Throughput: 0: 42803.0. Samples: 7228497000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:09:48,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 13:09:51,111][15401] Updated weights for policy 0, policy_version 441190 (0.0032) [2024-06-23 13:09:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7228538880. Throughput: 0: 42695.6. Samples: 7228622500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:09:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 13:09:55,052][15401] Updated weights for policy 0, policy_version 441200 (0.0036) [2024-06-23 13:09:55,936][15349] Signal inference workers to stop experience collection... (107100 times) [2024-06-23 13:09:55,937][15349] Signal inference workers to resume experience collection... (107100 times) [2024-06-23 13:09:55,965][15401] InferenceWorker_p0-w0: stopping experience collection (107100 times) [2024-06-23 13:09:55,965][15401] InferenceWorker_p0-w0: resuming experience collection (107100 times) [2024-06-23 13:09:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 7228768256. Throughput: 0: 42848.1. Samples: 7228873560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:09:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 13:09:58,823][15401] Updated weights for policy 0, policy_version 441210 (0.0030) [2024-06-23 13:10:03,023][15401] Updated weights for policy 0, policy_version 441220 (0.0033) [2024-06-23 13:10:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7228948480. Throughput: 0: 42677.5. Samples: 7229134000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 13:10:06,525][15401] Updated weights for policy 0, policy_version 441230 (0.0034) [2024-06-23 13:10:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7229177856. Throughput: 0: 42581.3. Samples: 7229257520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 13:10:10,471][15401] Updated weights for policy 0, policy_version 441240 (0.0022) [2024-06-23 13:10:13,392][15132] Fps is (10 sec: 47501.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 7229423616. Throughput: 0: 42782.1. Samples: 7229516420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:13,393][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 13:10:14,113][15401] Updated weights for policy 0, policy_version 441250 (0.0051) [2024-06-23 13:10:18,016][15401] Updated weights for policy 0, policy_version 441260 (0.0028) [2024-06-23 13:10:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42601.7, 300 sec: 42709.5). Total num frames: 7229603840. Throughput: 0: 42571.0. Samples: 7229772580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:18,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 13:10:21,645][15401] Updated weights for policy 0, policy_version 441270 (0.0032) [2024-06-23 13:10:23,392][15132] Fps is (10 sec: 39321.8, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 7229816832. Throughput: 0: 42642.1. Samples: 7229896660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:23,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 13:10:26,151][15401] Updated weights for policy 0, policy_version 441280 (0.0046) [2024-06-23 13:10:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7230062592. Throughput: 0: 42768.1. Samples: 7230158960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:28,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 13:10:29,231][15401] Updated weights for policy 0, policy_version 441290 (0.0022) [2024-06-23 13:10:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7230226432. Throughput: 0: 42743.5. Samples: 7230420460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 13:10:33,864][15401] Updated weights for policy 0, policy_version 441300 (0.0054) [2024-06-23 13:10:37,128][15401] Updated weights for policy 0, policy_version 441310 (0.0046) [2024-06-23 13:10:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7230455808. Throughput: 0: 42602.7. Samples: 7230539620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 13:10:41,427][15401] Updated weights for policy 0, policy_version 441320 (0.0025) [2024-06-23 13:10:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7230701568. Throughput: 0: 42855.5. Samples: 7230802060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 13:10:44,740][15401] Updated weights for policy 0, policy_version 441330 (0.0028) [2024-06-23 13:10:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7230881792. Throughput: 0: 42728.3. Samples: 7231056780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 13:10:48,971][15401] Updated weights for policy 0, policy_version 441340 (0.0035) [2024-06-23 13:10:52,336][15401] Updated weights for policy 0, policy_version 441350 (0.0031) [2024-06-23 13:10:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7231094784. Throughput: 0: 42664.8. Samples: 7231177440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 13:10:57,062][15401] Updated weights for policy 0, policy_version 441360 (0.0038) [2024-06-23 13:10:58,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 7231324160. Throughput: 0: 42583.2. Samples: 7231432660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:10:58,393][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 13:11:00,580][15401] Updated weights for policy 0, policy_version 441370 (0.0034) [2024-06-23 13:11:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7231504384. Throughput: 0: 42691.7. Samples: 7231693700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 13:11:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 13:11:04,682][15401] Updated weights for policy 0, policy_version 441380 (0.0033) [2024-06-23 13:11:04,719][15349] Signal inference workers to stop experience collection... (107150 times) [2024-06-23 13:11:04,720][15349] Signal inference workers to resume experience collection... (107150 times) [2024-06-23 13:11:04,758][15401] InferenceWorker_p0-w0: stopping experience collection (107150 times) [2024-06-23 13:11:04,758][15401] InferenceWorker_p0-w0: resuming experience collection (107150 times) [2024-06-23 13:11:08,279][15401] Updated weights for policy 0, policy_version 441390 (0.0023) [2024-06-23 13:11:08,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 7231733760. Throughput: 0: 42595.5. Samples: 7231813360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 13:11:12,246][15401] Updated weights for policy 0, policy_version 441400 (0.0034) [2024-06-23 13:11:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 7231979520. Throughput: 0: 42652.8. Samples: 7232078340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:13,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 13:11:15,951][15401] Updated weights for policy 0, policy_version 441410 (0.0035) [2024-06-23 13:11:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42599.0). Total num frames: 7232143360. Throughput: 0: 42465.4. Samples: 7232331400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 13:11:19,752][15401] Updated weights for policy 0, policy_version 441420 (0.0032) [2024-06-23 13:11:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 7232372736. Throughput: 0: 42432.4. Samples: 7232449080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 13:11:23,645][15401] Updated weights for policy 0, policy_version 441430 (0.0037) [2024-06-23 13:11:27,366][15401] Updated weights for policy 0, policy_version 441440 (0.0037) [2024-06-23 13:11:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7232602112. Throughput: 0: 42438.2. Samples: 7232711780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 13:11:31,476][15401] Updated weights for policy 0, policy_version 441450 (0.0042) [2024-06-23 13:11:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7232798720. Throughput: 0: 42468.4. Samples: 7232967860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 13:11:34,874][15401] Updated weights for policy 0, policy_version 441460 (0.0025) [2024-06-23 13:11:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7233011712. Throughput: 0: 42602.2. Samples: 7233094540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 13:11:38,867][15401] Updated weights for policy 0, policy_version 441470 (0.0028) [2024-06-23 13:11:42,472][15401] Updated weights for policy 0, policy_version 441480 (0.0032) [2024-06-23 13:11:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7233257472. Throughput: 0: 42776.8. Samples: 7233357520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 13:11:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441483_7233257472.pth... [2024-06-23 13:11:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000440856_7222984704.pth [2024-06-23 13:11:46,386][15401] Updated weights for policy 0, policy_version 441490 (0.0031) [2024-06-23 13:11:48,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 7233454080. Throughput: 0: 42772.4. Samples: 7233618560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:48,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 13:11:50,050][15401] Updated weights for policy 0, policy_version 441500 (0.0031) [2024-06-23 13:11:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7233667072. Throughput: 0: 42870.7. Samples: 7233742540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 13:11:54,007][15401] Updated weights for policy 0, policy_version 441510 (0.0036) [2024-06-23 13:11:57,651][15401] Updated weights for policy 0, policy_version 441520 (0.0036) [2024-06-23 13:11:58,390][15132] Fps is (10 sec: 42607.6, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 7233880064. Throughput: 0: 42774.6. Samples: 7234003200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:11:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 13:12:01,603][15401] Updated weights for policy 0, policy_version 441530 (0.0041) [2024-06-23 13:12:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7234093056. Throughput: 0: 42886.6. Samples: 7234261300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 13:12:05,478][15401] Updated weights for policy 0, policy_version 441540 (0.0045) [2024-06-23 13:12:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7234306048. Throughput: 0: 43003.1. Samples: 7234384220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 13:12:09,208][15401] Updated weights for policy 0, policy_version 441550 (0.0024) [2024-06-23 13:12:12,974][15401] Updated weights for policy 0, policy_version 441560 (0.0033) [2024-06-23 13:12:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7234535424. Throughput: 0: 43226.7. Samples: 7234656980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 13:12:16,721][15401] Updated weights for policy 0, policy_version 441570 (0.0039) [2024-06-23 13:12:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42598.7). Total num frames: 7234732032. Throughput: 0: 43132.4. Samples: 7234908820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:18,391][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 13:12:20,612][15401] Updated weights for policy 0, policy_version 441580 (0.0036) [2024-06-23 13:12:23,391][15132] Fps is (10 sec: 40952.1, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 7234945024. Throughput: 0: 43198.6. Samples: 7235038560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:23,396][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 13:12:24,326][15401] Updated weights for policy 0, policy_version 441590 (0.0025) [2024-06-23 13:12:28,317][15401] Updated weights for policy 0, policy_version 441600 (0.0044) [2024-06-23 13:12:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7235174400. Throughput: 0: 43154.4. Samples: 7235299460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 13:12:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 13:12:32,149][15401] Updated weights for policy 0, policy_version 441610 (0.0042) [2024-06-23 13:12:33,389][15132] Fps is (10 sec: 42606.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7235371008. Throughput: 0: 42963.1. Samples: 7235551800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 13:12:35,914][15401] Updated weights for policy 0, policy_version 441620 (0.0033) [2024-06-23 13:12:37,331][15349] Signal inference workers to stop experience collection... (107200 times) [2024-06-23 13:12:37,377][15401] InferenceWorker_p0-w0: stopping experience collection (107200 times) [2024-06-23 13:12:37,387][15349] Signal inference workers to resume experience collection... (107200 times) [2024-06-23 13:12:37,391][15401] InferenceWorker_p0-w0: resuming experience collection (107200 times) [2024-06-23 13:12:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7235600384. Throughput: 0: 42964.5. Samples: 7235675940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 13:12:39,696][15401] Updated weights for policy 0, policy_version 441630 (0.0034) [2024-06-23 13:12:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7235796992. Throughput: 0: 42947.3. Samples: 7235935820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 13:12:43,583][15401] Updated weights for policy 0, policy_version 441640 (0.0034) [2024-06-23 13:12:47,587][15401] Updated weights for policy 0, policy_version 441650 (0.0023) [2024-06-23 13:12:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 7236026368. Throughput: 0: 42682.7. Samples: 7236182020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 13:12:51,195][15401] Updated weights for policy 0, policy_version 441660 (0.0045) [2024-06-23 13:12:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7236239360. Throughput: 0: 42686.2. Samples: 7236305100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 13:12:55,042][15401] Updated weights for policy 0, policy_version 441670 (0.0034) [2024-06-23 13:12:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7236435968. Throughput: 0: 42446.1. Samples: 7236567060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:12:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 13:12:59,013][15401] Updated weights for policy 0, policy_version 441680 (0.0029) [2024-06-23 13:13:03,013][15401] Updated weights for policy 0, policy_version 441690 (0.0033) [2024-06-23 13:13:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7236648960. Throughput: 0: 42451.7. Samples: 7236819140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 13:13:06,914][15401] Updated weights for policy 0, policy_version 441700 (0.0034) [2024-06-23 13:13:08,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7236894720. Throughput: 0: 42489.0. Samples: 7236950480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 13:13:10,595][15401] Updated weights for policy 0, policy_version 441710 (0.0039) [2024-06-23 13:13:13,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 7237074944. Throughput: 0: 42376.1. Samples: 7237206400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 13:13:14,617][15401] Updated weights for policy 0, policy_version 441720 (0.0032) [2024-06-23 13:13:18,273][15401] Updated weights for policy 0, policy_version 441730 (0.0043) [2024-06-23 13:13:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 7237304320. Throughput: 0: 42367.2. Samples: 7237458320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 13:13:22,590][15401] Updated weights for policy 0, policy_version 441740 (0.0035) [2024-06-23 13:13:23,391][15132] Fps is (10 sec: 44232.4, 60 sec: 42871.9, 300 sec: 42709.3). Total num frames: 7237517312. Throughput: 0: 42576.5. Samples: 7237591940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:23,391][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 13:13:25,670][15401] Updated weights for policy 0, policy_version 441750 (0.0037) [2024-06-23 13:13:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7237713920. Throughput: 0: 42435.5. Samples: 7237845420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 13:13:30,201][15401] Updated weights for policy 0, policy_version 441760 (0.0027) [2024-06-23 13:13:33,133][15401] Updated weights for policy 0, policy_version 441770 (0.0028) [2024-06-23 13:13:33,389][15132] Fps is (10 sec: 44243.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7237959680. Throughput: 0: 42691.6. Samples: 7238103140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 13:13:37,612][15401] Updated weights for policy 0, policy_version 441780 (0.0039) [2024-06-23 13:13:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7238139904. Throughput: 0: 43064.5. Samples: 7238243000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:38,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 13:13:40,718][15401] Updated weights for policy 0, policy_version 441790 (0.0027) [2024-06-23 13:13:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7238369280. Throughput: 0: 42830.8. Samples: 7238494440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:43,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-23 13:13:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441795_7238369280.pth... [2024-06-23 13:13:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441170_7228129280.pth [2024-06-23 13:13:45,059][15401] Updated weights for policy 0, policy_version 441800 (0.0036) [2024-06-23 13:13:48,304][15401] Updated weights for policy 0, policy_version 441810 (0.0041) [2024-06-23 13:13:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7238615040. Throughput: 0: 43026.1. Samples: 7238755320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:48,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-23 13:13:52,550][15349] Signal inference workers to stop experience collection... (107250 times) [2024-06-23 13:13:52,551][15349] Signal inference workers to resume experience collection... (107250 times) [2024-06-23 13:13:52,565][15401] InferenceWorker_p0-w0: stopping experience collection (107250 times) [2024-06-23 13:13:52,598][15401] InferenceWorker_p0-w0: resuming experience collection (107250 times) [2024-06-23 13:13:52,701][15401] Updated weights for policy 0, policy_version 441820 (0.0032) [2024-06-23 13:13:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7238795264. Throughput: 0: 43063.0. Samples: 7238888320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 13:13:53,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 13:13:55,787][15401] Updated weights for policy 0, policy_version 441830 (0.0036) [2024-06-23 13:13:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7239024640. Throughput: 0: 42917.7. Samples: 7239137680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:13:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 13:14:00,511][15401] Updated weights for policy 0, policy_version 441840 (0.0033) [2024-06-23 13:14:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7239254016. Throughput: 0: 43010.5. Samples: 7239393800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:03,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 13:14:03,666][15401] Updated weights for policy 0, policy_version 441850 (0.0030) [2024-06-23 13:14:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 7239417856. Throughput: 0: 42961.7. Samples: 7239525160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 13:14:08,536][15401] Updated weights for policy 0, policy_version 441860 (0.0035) [2024-06-23 13:14:11,281][15401] Updated weights for policy 0, policy_version 441870 (0.0032) [2024-06-23 13:14:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.8, 300 sec: 42765.7). Total num frames: 7239663616. Throughput: 0: 42892.8. Samples: 7239775600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 13:14:16,059][15401] Updated weights for policy 0, policy_version 441880 (0.0033) [2024-06-23 13:14:18,391][15132] Fps is (10 sec: 45868.5, 60 sec: 42870.4, 300 sec: 42764.8). Total num frames: 7239876608. Throughput: 0: 42939.5. Samples: 7240035480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:18,391][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 13:14:19,085][15401] Updated weights for policy 0, policy_version 441890 (0.0032) [2024-06-23 13:14:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42326.2, 300 sec: 42598.4). Total num frames: 7240056832. Throughput: 0: 42485.2. Samples: 7240154840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 13:14:23,745][15401] Updated weights for policy 0, policy_version 441900 (0.0027) [2024-06-23 13:14:26,925][15401] Updated weights for policy 0, policy_version 441910 (0.0037) [2024-06-23 13:14:28,389][15132] Fps is (10 sec: 42604.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7240302592. Throughput: 0: 42580.6. Samples: 7240410560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 13:14:31,393][15401] Updated weights for policy 0, policy_version 441920 (0.0037) [2024-06-23 13:14:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7240499200. Throughput: 0: 42581.0. Samples: 7240671460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 13:14:34,486][15401] Updated weights for policy 0, policy_version 441930 (0.0030) [2024-06-23 13:14:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7240712192. Throughput: 0: 42442.7. Samples: 7240798240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 13:14:39,025][15401] Updated weights for policy 0, policy_version 441940 (0.0037) [2024-06-23 13:14:42,087][15401] Updated weights for policy 0, policy_version 441950 (0.0051) [2024-06-23 13:14:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7240941568. Throughput: 0: 42470.6. Samples: 7241048860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:43,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 13:14:46,807][15401] Updated weights for policy 0, policy_version 441960 (0.0040) [2024-06-23 13:14:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7241138176. Throughput: 0: 42661.1. Samples: 7241313540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 13:14:49,734][15401] Updated weights for policy 0, policy_version 441970 (0.0028) [2024-06-23 13:14:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7241351168. Throughput: 0: 42539.6. Samples: 7241439440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:53,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 13:14:54,636][15401] Updated weights for policy 0, policy_version 441980 (0.0033) [2024-06-23 13:14:57,896][15401] Updated weights for policy 0, policy_version 441990 (0.0040) [2024-06-23 13:14:58,390][15132] Fps is (10 sec: 42595.5, 60 sec: 42324.9, 300 sec: 42764.9). Total num frames: 7241564160. Throughput: 0: 42493.7. Samples: 7241687840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:14:58,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 13:15:02,431][15401] Updated weights for policy 0, policy_version 442000 (0.0035) [2024-06-23 13:15:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7241777152. Throughput: 0: 42510.7. Samples: 7241948400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:15:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 13:15:05,603][15401] Updated weights for policy 0, policy_version 442010 (0.0031) [2024-06-23 13:15:06,318][15349] Signal inference workers to stop experience collection... (107300 times) [2024-06-23 13:15:06,319][15349] Signal inference workers to resume experience collection... (107300 times) [2024-06-23 13:15:06,352][15401] InferenceWorker_p0-w0: stopping experience collection (107300 times) [2024-06-23 13:15:06,352][15401] InferenceWorker_p0-w0: resuming experience collection (107300 times) [2024-06-23 13:15:08,389][15132] Fps is (10 sec: 44239.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 7242006528. Throughput: 0: 42689.9. Samples: 7242075880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:15:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 13:15:09,890][15401] Updated weights for policy 0, policy_version 442020 (0.0039) [2024-06-23 13:15:13,302][15401] Updated weights for policy 0, policy_version 442030 (0.0032) [2024-06-23 13:15:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7242219520. Throughput: 0: 42646.1. Samples: 7242329640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:15:13,392][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 13:15:17,465][15401] Updated weights for policy 0, policy_version 442040 (0.0036) [2024-06-23 13:15:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42053.3, 300 sec: 42654.3). Total num frames: 7242399744. Throughput: 0: 42608.0. Samples: 7242588820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 13:15:21,261][15401] Updated weights for policy 0, policy_version 442050 (0.0036) [2024-06-23 13:15:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7242645504. Throughput: 0: 42508.8. Samples: 7242711140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:23,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 13:15:25,759][15401] Updated weights for policy 0, policy_version 442060 (0.0029) [2024-06-23 13:15:28,394][15132] Fps is (10 sec: 44216.2, 60 sec: 42322.0, 300 sec: 42764.4). Total num frames: 7242842112. Throughput: 0: 42538.8. Samples: 7242963300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:28,395][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 13:15:28,871][15401] Updated weights for policy 0, policy_version 442070 (0.0033) [2024-06-23 13:15:33,276][15401] Updated weights for policy 0, policy_version 442080 (0.0049) [2024-06-23 13:15:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7243038720. Throughput: 0: 42571.4. Samples: 7243229260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 13:15:36,472][15401] Updated weights for policy 0, policy_version 442090 (0.0043) [2024-06-23 13:15:38,392][15132] Fps is (10 sec: 44246.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7243284480. Throughput: 0: 42514.6. Samples: 7243352700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:38,392][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 13:15:41,151][15401] Updated weights for policy 0, policy_version 442100 (0.0036) [2024-06-23 13:15:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7243497472. Throughput: 0: 42729.8. Samples: 7243610660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 13:15:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442108_7243497472.pth... [2024-06-23 13:15:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441483_7233257472.pth [2024-06-23 13:15:43,938][15401] Updated weights for policy 0, policy_version 442110 (0.0024) [2024-06-23 13:15:48,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 7243661312. Throughput: 0: 42671.5. Samples: 7243868620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 13:15:49,072][15401] Updated weights for policy 0, policy_version 442120 (0.0033) [2024-06-23 13:15:51,996][15401] Updated weights for policy 0, policy_version 442130 (0.0045) [2024-06-23 13:15:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42654.3). Total num frames: 7243907072. Throughput: 0: 42504.3. Samples: 7243988580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:53,391][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 13:15:56,420][15401] Updated weights for policy 0, policy_version 442140 (0.0037) [2024-06-23 13:15:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.9, 300 sec: 42820.5). Total num frames: 7244136448. Throughput: 0: 42769.8. Samples: 7244254280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:15:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 13:15:59,413][15401] Updated weights for policy 0, policy_version 442150 (0.0040) [2024-06-23 13:16:01,062][15349] Signal inference workers to stop experience collection... (107350 times) [2024-06-23 13:16:01,107][15401] InferenceWorker_p0-w0: stopping experience collection (107350 times) [2024-06-23 13:16:01,113][15349] Signal inference workers to resume experience collection... (107350 times) [2024-06-23 13:16:01,124][15401] InferenceWorker_p0-w0: resuming experience collection (107350 times) [2024-06-23 13:16:03,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7244316672. Throughput: 0: 42573.0. Samples: 7244504600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 13:16:04,211][15401] Updated weights for policy 0, policy_version 442160 (0.0033) [2024-06-23 13:16:06,955][15401] Updated weights for policy 0, policy_version 442170 (0.0027) [2024-06-23 13:16:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7244546048. Throughput: 0: 42589.8. Samples: 7244627680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 13:16:11,761][15401] Updated weights for policy 0, policy_version 442180 (0.0039) [2024-06-23 13:16:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7244775424. Throughput: 0: 42810.6. Samples: 7244889580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 13:16:14,734][15401] Updated weights for policy 0, policy_version 442190 (0.0041) [2024-06-23 13:16:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7244972032. Throughput: 0: 42496.9. Samples: 7245141620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 13:16:19,214][15401] Updated weights for policy 0, policy_version 442200 (0.0032) [2024-06-23 13:16:22,326][15401] Updated weights for policy 0, policy_version 442210 (0.0023) [2024-06-23 13:16:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 7245185024. Throughput: 0: 42638.7. Samples: 7245271440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:23,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 13:16:26,775][15401] Updated weights for policy 0, policy_version 442220 (0.0031) [2024-06-23 13:16:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43147.9, 300 sec: 42820.6). Total num frames: 7245430784. Throughput: 0: 42853.4. Samples: 7245539060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 13:16:29,834][15401] Updated weights for policy 0, policy_version 442230 (0.0039) [2024-06-23 13:16:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7245627392. Throughput: 0: 42749.7. Samples: 7245792360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 13:16:34,263][15401] Updated weights for policy 0, policy_version 442240 (0.0036) [2024-06-23 13:16:37,446][15401] Updated weights for policy 0, policy_version 442250 (0.0031) [2024-06-23 13:16:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 7245840384. Throughput: 0: 42869.9. Samples: 7245917720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 13:16:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 13:16:41,811][15401] Updated weights for policy 0, policy_version 442260 (0.0042) [2024-06-23 13:16:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 7246069760. Throughput: 0: 42818.1. Samples: 7246181100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:16:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 13:16:45,223][15401] Updated weights for policy 0, policy_version 442270 (0.0033) [2024-06-23 13:16:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7246249984. Throughput: 0: 42921.7. Samples: 7246436080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:16:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 13:16:49,457][15401] Updated weights for policy 0, policy_version 442280 (0.0035) [2024-06-23 13:16:53,043][15401] Updated weights for policy 0, policy_version 442290 (0.0028) [2024-06-23 13:16:53,394][15132] Fps is (10 sec: 40940.4, 60 sec: 42868.1, 300 sec: 42708.8). Total num frames: 7246479360. Throughput: 0: 42897.1. Samples: 7246558260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:16:53,395][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 13:16:57,603][15401] Updated weights for policy 0, policy_version 442300 (0.0031) [2024-06-23 13:16:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7246692352. Throughput: 0: 42908.6. Samples: 7246820460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:16:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 13:17:00,871][15401] Updated weights for policy 0, policy_version 442310 (0.0027) [2024-06-23 13:17:03,389][15132] Fps is (10 sec: 42619.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7246905344. Throughput: 0: 42998.7. Samples: 7247076560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 13:17:05,071][15401] Updated weights for policy 0, policy_version 442320 (0.0030) [2024-06-23 13:17:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7247118336. Throughput: 0: 42943.0. Samples: 7247203780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 13:17:08,573][15401] Updated weights for policy 0, policy_version 442330 (0.0044) [2024-06-23 13:17:09,961][15349] Signal inference workers to stop experience collection... (107400 times) [2024-06-23 13:17:09,993][15401] InferenceWorker_p0-w0: stopping experience collection (107400 times) [2024-06-23 13:17:10,030][15349] Signal inference workers to resume experience collection... (107400 times) [2024-06-23 13:17:10,030][15401] InferenceWorker_p0-w0: resuming experience collection (107400 times) [2024-06-23 13:17:12,650][15401] Updated weights for policy 0, policy_version 442340 (0.0030) [2024-06-23 13:17:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7247347712. Throughput: 0: 42952.8. Samples: 7247471940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 13:17:15,815][15401] Updated weights for policy 0, policy_version 442350 (0.0033) [2024-06-23 13:17:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 7247560704. Throughput: 0: 42928.5. Samples: 7247724140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 13:17:20,335][15401] Updated weights for policy 0, policy_version 442360 (0.0031) [2024-06-23 13:17:23,228][15401] Updated weights for policy 0, policy_version 442370 (0.0027) [2024-06-23 13:17:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43417.6, 300 sec: 42764.7). Total num frames: 7247790080. Throughput: 0: 43062.1. Samples: 7247855620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:23,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 13:17:27,929][15401] Updated weights for policy 0, policy_version 442380 (0.0041) [2024-06-23 13:17:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7247970304. Throughput: 0: 43055.8. Samples: 7248118600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 13:17:30,887][15401] Updated weights for policy 0, policy_version 442390 (0.0031) [2024-06-23 13:17:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7248216064. Throughput: 0: 43121.3. Samples: 7248376540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:33,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 13:17:35,723][15401] Updated weights for policy 0, policy_version 442400 (0.0022) [2024-06-23 13:17:38,345][15401] Updated weights for policy 0, policy_version 442410 (0.0031) [2024-06-23 13:17:38,392][15132] Fps is (10 sec: 47501.5, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 7248445440. Throughput: 0: 43237.9. Samples: 7248503860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:38,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 13:17:43,147][15401] Updated weights for policy 0, policy_version 442420 (0.0038) [2024-06-23 13:17:43,394][15132] Fps is (10 sec: 40942.8, 60 sec: 42595.5, 300 sec: 42708.9). Total num frames: 7248625664. Throughput: 0: 43059.5. Samples: 7248758320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:43,394][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 13:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442421_7248625664.pth... [2024-06-23 13:17:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000441795_7238369280.pth [2024-06-23 13:17:45,900][15401] Updated weights for policy 0, policy_version 442430 (0.0032) [2024-06-23 13:17:48,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7248855040. Throughput: 0: 43032.1. Samples: 7249013000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:17:50,863][15401] Updated weights for policy 0, policy_version 442440 (0.0034) [2024-06-23 13:17:53,391][15132] Fps is (10 sec: 44250.6, 60 sec: 43147.3, 300 sec: 42820.4). Total num frames: 7249068032. Throughput: 0: 43084.0. Samples: 7249142600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:53,391][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 13:17:53,796][15401] Updated weights for policy 0, policy_version 442450 (0.0052) [2024-06-23 13:17:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7249248256. Throughput: 0: 42732.6. Samples: 7249394900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:17:58,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 13:17:58,396][15401] Updated weights for policy 0, policy_version 442460 (0.0036) [2024-06-23 13:18:01,538][15401] Updated weights for policy 0, policy_version 442470 (0.0032) [2024-06-23 13:18:03,390][15132] Fps is (10 sec: 40963.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7249477632. Throughput: 0: 42680.3. Samples: 7249644760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 13:18:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 13:18:05,938][15401] Updated weights for policy 0, policy_version 442480 (0.0041) [2024-06-23 13:18:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 7249690624. Throughput: 0: 42685.8. Samples: 7249776380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 13:18:09,177][15401] Updated weights for policy 0, policy_version 442490 (0.0044) [2024-06-23 13:18:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7249887232. Throughput: 0: 42585.2. Samples: 7250034940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 13:18:13,581][15401] Updated weights for policy 0, policy_version 442500 (0.0032) [2024-06-23 13:18:16,987][15401] Updated weights for policy 0, policy_version 442510 (0.0031) [2024-06-23 13:18:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 7250132992. Throughput: 0: 42511.9. Samples: 7250289580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:18,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 13:18:21,123][15401] Updated weights for policy 0, policy_version 442520 (0.0032) [2024-06-23 13:18:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 7250345984. Throughput: 0: 42541.8. Samples: 7250418140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 13:18:24,709][15401] Updated weights for policy 0, policy_version 442530 (0.0042) [2024-06-23 13:18:28,025][15349] Signal inference workers to stop experience collection... (107450 times) [2024-06-23 13:18:28,056][15401] InferenceWorker_p0-w0: stopping experience collection (107450 times) [2024-06-23 13:18:28,080][15349] Signal inference workers to resume experience collection... (107450 times) [2024-06-23 13:18:28,088][15401] InferenceWorker_p0-w0: resuming experience collection (107450 times) [2024-06-23 13:18:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7250542592. Throughput: 0: 42632.0. Samples: 7250676580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:28,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 13:18:28,570][15401] Updated weights for policy 0, policy_version 442540 (0.0044) [2024-06-23 13:18:32,554][15401] Updated weights for policy 0, policy_version 442550 (0.0037) [2024-06-23 13:18:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7250755584. Throughput: 0: 42524.7. Samples: 7250926620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 13:18:36,286][15401] Updated weights for policy 0, policy_version 442560 (0.0030) [2024-06-23 13:18:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 7250984960. Throughput: 0: 42527.6. Samples: 7251056300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:38,403][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 13:18:40,224][15401] Updated weights for policy 0, policy_version 442570 (0.0030) [2024-06-23 13:18:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42328.3, 300 sec: 42542.9). Total num frames: 7251165184. Throughput: 0: 42630.7. Samples: 7251313280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 13:18:43,891][15401] Updated weights for policy 0, policy_version 442580 (0.0028) [2024-06-23 13:18:48,058][15401] Updated weights for policy 0, policy_version 442590 (0.0039) [2024-06-23 13:18:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 7251394560. Throughput: 0: 42722.7. Samples: 7251567380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:48,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 13:18:51,778][15401] Updated weights for policy 0, policy_version 442600 (0.0043) [2024-06-23 13:18:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42872.2, 300 sec: 42765.0). Total num frames: 7251640320. Throughput: 0: 42715.6. Samples: 7251698580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 13:18:55,598][15401] Updated weights for policy 0, policy_version 442610 (0.0037) [2024-06-23 13:18:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 7251820544. Throughput: 0: 42628.4. Samples: 7251953220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:18:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 13:18:59,682][15401] Updated weights for policy 0, policy_version 442620 (0.0040) [2024-06-23 13:19:03,146][15401] Updated weights for policy 0, policy_version 442630 (0.0038) [2024-06-23 13:19:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7252049920. Throughput: 0: 42635.6. Samples: 7252208180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:19:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 13:19:07,137][15401] Updated weights for policy 0, policy_version 442640 (0.0036) [2024-06-23 13:19:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7252262912. Throughput: 0: 42677.9. Samples: 7252338640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:19:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 13:19:11,247][15401] Updated weights for policy 0, policy_version 442650 (0.0031) [2024-06-23 13:19:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.7). Total num frames: 7252475904. Throughput: 0: 42736.3. Samples: 7252599720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:19:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 13:19:14,555][15401] Updated weights for policy 0, policy_version 442660 (0.0031) [2024-06-23 13:19:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7252672512. Throughput: 0: 42816.1. Samples: 7252853340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:19:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 13:19:18,789][15401] Updated weights for policy 0, policy_version 442670 (0.0034) [2024-06-23 13:19:22,244][15401] Updated weights for policy 0, policy_version 442680 (0.0032) [2024-06-23 13:19:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7252901888. Throughput: 0: 42808.4. Samples: 7252982680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 13:19:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 13:19:26,291][15401] Updated weights for policy 0, policy_version 442690 (0.0023) [2024-06-23 13:19:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7253114880. Throughput: 0: 42832.8. Samples: 7253240760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 13:19:30,107][15401] Updated weights for policy 0, policy_version 442700 (0.0032) [2024-06-23 13:19:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7253327872. Throughput: 0: 43011.5. Samples: 7253502800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:33,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-23 13:19:33,799][15401] Updated weights for policy 0, policy_version 442710 (0.0032) [2024-06-23 13:19:37,606][15401] Updated weights for policy 0, policy_version 442720 (0.0037) [2024-06-23 13:19:38,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 7253540864. Throughput: 0: 42903.7. Samples: 7253629520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:38,396][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 13:19:38,738][15349] Signal inference workers to stop experience collection... (107500 times) [2024-06-23 13:19:38,787][15401] InferenceWorker_p0-w0: stopping experience collection (107500 times) [2024-06-23 13:19:38,790][15349] Signal inference workers to resume experience collection... (107500 times) [2024-06-23 13:19:38,802][15401] InferenceWorker_p0-w0: resuming experience collection (107500 times) [2024-06-23 13:19:41,511][15401] Updated weights for policy 0, policy_version 442730 (0.0033) [2024-06-23 13:19:43,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43412.9, 300 sec: 42819.6). Total num frames: 7253770240. Throughput: 0: 43125.9. Samples: 7253894160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:43,397][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 13:19:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442736_7253786624.pth... [2024-06-23 13:19:43,595][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442108_7243497472.pth [2024-06-23 13:19:45,177][15401] Updated weights for policy 0, policy_version 442740 (0.0032) [2024-06-23 13:19:48,390][15132] Fps is (10 sec: 44264.9, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 7253983232. Throughput: 0: 43090.3. Samples: 7254147240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:48,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 13:19:49,056][15401] Updated weights for policy 0, policy_version 442750 (0.0036) [2024-06-23 13:19:52,924][15401] Updated weights for policy 0, policy_version 442760 (0.0035) [2024-06-23 13:19:53,390][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 7254179840. Throughput: 0: 43032.8. Samples: 7254275120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 13:19:56,824][15401] Updated weights for policy 0, policy_version 442770 (0.0026) [2024-06-23 13:19:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7254409216. Throughput: 0: 43117.0. Samples: 7254539980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:19:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 13:20:00,612][15401] Updated weights for policy 0, policy_version 442780 (0.0024) [2024-06-23 13:20:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7254622208. Throughput: 0: 43134.3. Samples: 7254794380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 13:20:04,208][15401] Updated weights for policy 0, policy_version 442790 (0.0033) [2024-06-23 13:20:08,317][15401] Updated weights for policy 0, policy_version 442800 (0.0033) [2024-06-23 13:20:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7254835200. Throughput: 0: 43189.5. Samples: 7254926200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 13:20:11,653][15401] Updated weights for policy 0, policy_version 442810 (0.0037) [2024-06-23 13:20:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7255048192. Throughput: 0: 43144.5. Samples: 7255182260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 13:20:15,898][15401] Updated weights for policy 0, policy_version 442820 (0.0024) [2024-06-23 13:20:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 7255277568. Throughput: 0: 43017.0. Samples: 7255438560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 13:20:19,343][15401] Updated weights for policy 0, policy_version 442830 (0.0028) [2024-06-23 13:20:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 7255474176. Throughput: 0: 42990.0. Samples: 7255563800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 13:20:23,768][15401] Updated weights for policy 0, policy_version 442840 (0.0036) [2024-06-23 13:20:27,256][15401] Updated weights for policy 0, policy_version 442850 (0.0040) [2024-06-23 13:20:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7255703552. Throughput: 0: 42873.7. Samples: 7255823200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 13:20:31,198][15401] Updated weights for policy 0, policy_version 442860 (0.0030) [2024-06-23 13:20:32,103][15349] Signal inference workers to stop experience collection... (107550 times) [2024-06-23 13:20:32,103][15349] Signal inference workers to resume experience collection... (107550 times) [2024-06-23 13:20:32,126][15401] InferenceWorker_p0-w0: stopping experience collection (107550 times) [2024-06-23 13:20:32,127][15401] InferenceWorker_p0-w0: resuming experience collection (107550 times) [2024-06-23 13:20:33,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42820.6). Total num frames: 7255916544. Throughput: 0: 42868.9. Samples: 7256076440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:33,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 13:20:34,852][15401] Updated weights for policy 0, policy_version 442870 (0.0031) [2024-06-23 13:20:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 7256129536. Throughput: 0: 43037.4. Samples: 7256211800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 13:20:38,631][15401] Updated weights for policy 0, policy_version 442880 (0.0035) [2024-06-23 13:20:42,486][15401] Updated weights for policy 0, policy_version 442890 (0.0027) [2024-06-23 13:20:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42602.9, 300 sec: 42931.6). Total num frames: 7256326144. Throughput: 0: 42859.4. Samples: 7256468660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 13:20:46,182][15401] Updated weights for policy 0, policy_version 442900 (0.0038) [2024-06-23 13:20:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7256571904. Throughput: 0: 42744.8. Samples: 7256717900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 13:20:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 13:20:50,374][15401] Updated weights for policy 0, policy_version 442910 (0.0038) [2024-06-23 13:20:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7256768512. Throughput: 0: 42931.9. Samples: 7256858140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:20:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 13:20:53,725][15401] Updated weights for policy 0, policy_version 442920 (0.0042) [2024-06-23 13:20:57,917][15401] Updated weights for policy 0, policy_version 442930 (0.0027) [2024-06-23 13:20:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7256981504. Throughput: 0: 42828.3. Samples: 7257109540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:20:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 13:21:01,177][15401] Updated weights for policy 0, policy_version 442940 (0.0026) [2024-06-23 13:21:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7257210880. Throughput: 0: 42715.1. Samples: 7257360740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 13:21:05,454][15401] Updated weights for policy 0, policy_version 442950 (0.0029) [2024-06-23 13:21:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7257407488. Throughput: 0: 42894.7. Samples: 7257494060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 13:21:09,093][15401] Updated weights for policy 0, policy_version 442960 (0.0035) [2024-06-23 13:21:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7257604096. Throughput: 0: 42870.3. Samples: 7257752360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 13:21:13,474][15401] Updated weights for policy 0, policy_version 442970 (0.0039) [2024-06-23 13:21:16,633][15401] Updated weights for policy 0, policy_version 442980 (0.0028) [2024-06-23 13:21:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 7257849856. Throughput: 0: 42840.9. Samples: 7258004180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 13:21:21,178][15401] Updated weights for policy 0, policy_version 442990 (0.0033) [2024-06-23 13:21:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7258079232. Throughput: 0: 42901.8. Samples: 7258142380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 13:21:24,173][15401] Updated weights for policy 0, policy_version 443000 (0.0028) [2024-06-23 13:21:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7258259456. Throughput: 0: 42937.4. Samples: 7258400840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 13:21:28,614][15401] Updated weights for policy 0, policy_version 443010 (0.0033) [2024-06-23 13:21:31,870][15401] Updated weights for policy 0, policy_version 443020 (0.0034) [2024-06-23 13:21:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 7258505216. Throughput: 0: 42897.2. Samples: 7258648280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 13:21:36,470][15401] Updated weights for policy 0, policy_version 443030 (0.0023) [2024-06-23 13:21:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7258701824. Throughput: 0: 42632.1. Samples: 7258776580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 13:21:39,713][15401] Updated weights for policy 0, policy_version 443040 (0.0031) [2024-06-23 13:21:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7258898432. Throughput: 0: 42563.1. Samples: 7259024880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 13:21:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443048_7258898432.pth... [2024-06-23 13:21:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442421_7248625664.pth [2024-06-23 13:21:44,450][15401] Updated weights for policy 0, policy_version 443050 (0.0024) [2024-06-23 13:21:47,464][15401] Updated weights for policy 0, policy_version 443060 (0.0033) [2024-06-23 13:21:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.8). Total num frames: 7259127808. Throughput: 0: 42481.4. Samples: 7259272400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 13:21:51,918][15401] Updated weights for policy 0, policy_version 443070 (0.0024) [2024-06-23 13:21:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7259308032. Throughput: 0: 42535.1. Samples: 7259408140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 13:21:55,068][15401] Updated weights for policy 0, policy_version 443080 (0.0025) [2024-06-23 13:21:58,395][15132] Fps is (10 sec: 40938.8, 60 sec: 42594.8, 300 sec: 42819.8). Total num frames: 7259537408. Throughput: 0: 42428.9. Samples: 7259661880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:21:58,395][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 13:21:59,755][15401] Updated weights for policy 0, policy_version 443090 (0.0032) [2024-06-23 13:22:02,787][15401] Updated weights for policy 0, policy_version 443100 (0.0028) [2024-06-23 13:22:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7259766784. Throughput: 0: 42444.9. Samples: 7259914200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:22:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 13:22:03,520][15349] Signal inference workers to stop experience collection... (107600 times) [2024-06-23 13:22:03,523][15349] Signal inference workers to resume experience collection... (107600 times) [2024-06-23 13:22:03,536][15401] InferenceWorker_p0-w0: stopping experience collection (107600 times) [2024-06-23 13:22:03,536][15401] InferenceWorker_p0-w0: resuming experience collection (107600 times) [2024-06-23 13:22:07,174][15401] Updated weights for policy 0, policy_version 443110 (0.0028) [2024-06-23 13:22:08,389][15132] Fps is (10 sec: 39342.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 7259930624. Throughput: 0: 42297.4. Samples: 7260045760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:22:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 13:22:10,369][15401] Updated weights for policy 0, policy_version 443120 (0.0033) [2024-06-23 13:22:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7260160000. Throughput: 0: 42255.2. Samples: 7260302320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:22:13,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 13:22:14,533][15401] Updated weights for policy 0, policy_version 443130 (0.0030) [2024-06-23 13:22:17,858][15401] Updated weights for policy 0, policy_version 443140 (0.0032) [2024-06-23 13:22:18,389][15132] Fps is (10 sec: 49151.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7260422144. Throughput: 0: 42504.6. Samples: 7260560980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:18,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 13:22:22,110][15401] Updated weights for policy 0, policy_version 443150 (0.0035) [2024-06-23 13:22:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 7260602368. Throughput: 0: 42746.6. Samples: 7260700180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 13:22:25,355][15401] Updated weights for policy 0, policy_version 443160 (0.0031) [2024-06-23 13:22:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7260831744. Throughput: 0: 42941.8. Samples: 7260957260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 13:22:29,581][15401] Updated weights for policy 0, policy_version 443170 (0.0026) [2024-06-23 13:22:32,790][15401] Updated weights for policy 0, policy_version 443180 (0.0034) [2024-06-23 13:22:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 7261061120. Throughput: 0: 43041.3. Samples: 7261209260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 13:22:37,451][15401] Updated weights for policy 0, policy_version 443190 (0.0039) [2024-06-23 13:22:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42821.1). Total num frames: 7261257728. Throughput: 0: 42979.4. Samples: 7261342220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:38,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 13:22:40,916][15401] Updated weights for policy 0, policy_version 443200 (0.0051) [2024-06-23 13:22:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7261470720. Throughput: 0: 42891.9. Samples: 7261591800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:43,399][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 13:22:45,268][15401] Updated weights for policy 0, policy_version 443210 (0.0031) [2024-06-23 13:22:48,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 7261700096. Throughput: 0: 43075.2. Samples: 7261852580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:48,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 13:22:48,526][15401] Updated weights for policy 0, policy_version 443220 (0.0037) [2024-06-23 13:22:52,790][15401] Updated weights for policy 0, policy_version 443230 (0.0032) [2024-06-23 13:22:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7261896704. Throughput: 0: 42976.7. Samples: 7261979720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 13:22:56,228][15401] Updated weights for policy 0, policy_version 443240 (0.0037) [2024-06-23 13:22:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42875.2, 300 sec: 42820.6). Total num frames: 7262109696. Throughput: 0: 43003.5. Samples: 7262237480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:22:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 13:23:00,585][15401] Updated weights for policy 0, policy_version 443250 (0.0036) [2024-06-23 13:23:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7262339072. Throughput: 0: 42866.1. Samples: 7262489960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:03,390][15132] Avg episode reward: [(0, '0.094')] [2024-06-23 13:23:03,930][15401] Updated weights for policy 0, policy_version 443260 (0.0033) [2024-06-23 13:23:08,175][15401] Updated weights for policy 0, policy_version 443270 (0.0038) [2024-06-23 13:23:08,394][15132] Fps is (10 sec: 42579.5, 60 sec: 43414.4, 300 sec: 42875.5). Total num frames: 7262535680. Throughput: 0: 42691.4. Samples: 7262621480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:08,394][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 13:23:11,704][15401] Updated weights for policy 0, policy_version 443280 (0.0032) [2024-06-23 13:23:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7262748672. Throughput: 0: 42652.0. Samples: 7262876600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 13:23:16,193][15401] Updated weights for policy 0, policy_version 443290 (0.0036) [2024-06-23 13:23:16,980][15349] Signal inference workers to stop experience collection... (107650 times) [2024-06-23 13:23:16,981][15349] Signal inference workers to resume experience collection... (107650 times) [2024-06-23 13:23:16,990][15401] InferenceWorker_p0-w0: stopping experience collection (107650 times) [2024-06-23 13:23:16,990][15401] InferenceWorker_p0-w0: resuming experience collection (107650 times) [2024-06-23 13:23:18,389][15132] Fps is (10 sec: 42617.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7262961664. Throughput: 0: 42694.3. Samples: 7263130500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 13:23:19,477][15401] Updated weights for policy 0, policy_version 443300 (0.0032) [2024-06-23 13:23:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7263158272. Throughput: 0: 42676.0. Samples: 7263262640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:23,390][15132] Avg episode reward: [(0, '0.052')] [2024-06-23 13:23:23,654][15401] Updated weights for policy 0, policy_version 443310 (0.0044) [2024-06-23 13:23:27,190][15401] Updated weights for policy 0, policy_version 443320 (0.0020) [2024-06-23 13:23:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7263387648. Throughput: 0: 42806.3. Samples: 7263518080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:28,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 13:23:31,154][15401] Updated weights for policy 0, policy_version 443330 (0.0035) [2024-06-23 13:23:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7263617024. Throughput: 0: 42655.8. Samples: 7263772100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 13:23:33,399][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 13:23:35,095][15401] Updated weights for policy 0, policy_version 443340 (0.0029) [2024-06-23 13:23:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7263830016. Throughput: 0: 42743.7. Samples: 7263903180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:23:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 13:23:38,516][15401] Updated weights for policy 0, policy_version 443350 (0.0030) [2024-06-23 13:23:42,571][15401] Updated weights for policy 0, policy_version 443360 (0.0027) [2024-06-23 13:23:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 7264043008. Throughput: 0: 42902.0. Samples: 7264168080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:23:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 13:23:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443363_7264059392.pth... [2024-06-23 13:23:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000442736_7253786624.pth [2024-06-23 13:23:46,276][15401] Updated weights for policy 0, policy_version 443370 (0.0048) [2024-06-23 13:23:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7264256000. Throughput: 0: 42934.2. Samples: 7264422000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:23:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 13:23:50,416][15401] Updated weights for policy 0, policy_version 443380 (0.0030) [2024-06-23 13:23:53,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.9, 300 sec: 42875.8). Total num frames: 7264468992. Throughput: 0: 42831.2. Samples: 7264548800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:23:53,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 13:23:53,907][15401] Updated weights for policy 0, policy_version 443390 (0.0028) [2024-06-23 13:23:58,022][15401] Updated weights for policy 0, policy_version 443400 (0.0032) [2024-06-23 13:23:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7264665600. Throughput: 0: 42875.6. Samples: 7264806000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:23:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 13:24:01,518][15401] Updated weights for policy 0, policy_version 443410 (0.0030) [2024-06-23 13:24:03,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7264911360. Throughput: 0: 42746.1. Samples: 7265054080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 13:24:06,160][15401] Updated weights for policy 0, policy_version 443420 (0.0035) [2024-06-23 13:24:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43147.6, 300 sec: 42876.1). Total num frames: 7265124352. Throughput: 0: 42879.1. Samples: 7265192200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 13:24:09,059][15401] Updated weights for policy 0, policy_version 443430 (0.0046) [2024-06-23 13:24:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7265288192. Throughput: 0: 42729.0. Samples: 7265440880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 13:24:13,897][15401] Updated weights for policy 0, policy_version 443440 (0.0031) [2024-06-23 13:24:16,815][15401] Updated weights for policy 0, policy_version 443450 (0.0029) [2024-06-23 13:24:18,390][15132] Fps is (10 sec: 44235.1, 60 sec: 43417.1, 300 sec: 42931.6). Total num frames: 7265566720. Throughput: 0: 42640.5. Samples: 7265690940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:18,391][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 13:24:21,509][15401] Updated weights for policy 0, policy_version 443460 (0.0043) [2024-06-23 13:24:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7265730560. Throughput: 0: 42764.9. Samples: 7265827600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 13:24:24,421][15401] Updated weights for policy 0, policy_version 443470 (0.0039) [2024-06-23 13:24:28,390][15132] Fps is (10 sec: 36046.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7265927168. Throughput: 0: 42329.4. Samples: 7266072900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 13:24:29,053][15401] Updated weights for policy 0, policy_version 443480 (0.0022) [2024-06-23 13:24:32,059][15401] Updated weights for policy 0, policy_version 443490 (0.0039) [2024-06-23 13:24:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 7266172928. Throughput: 0: 42332.5. Samples: 7266326960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 13:24:36,841][15401] Updated weights for policy 0, policy_version 443500 (0.0044) [2024-06-23 13:24:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42710.4). Total num frames: 7266369536. Throughput: 0: 42479.9. Samples: 7266460300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 13:24:39,958][15401] Updated weights for policy 0, policy_version 443510 (0.0040) [2024-06-23 13:24:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7266582528. Throughput: 0: 42245.8. Samples: 7266707060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 13:24:44,640][15401] Updated weights for policy 0, policy_version 443520 (0.0048) [2024-06-23 13:24:47,619][15401] Updated weights for policy 0, policy_version 443530 (0.0034) [2024-06-23 13:24:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7266828288. Throughput: 0: 42384.4. Samples: 7266961380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 13:24:52,190][15401] Updated weights for policy 0, policy_version 443540 (0.0035) [2024-06-23 13:24:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 7267008512. Throughput: 0: 42393.0. Samples: 7267099880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 13:24:53,612][15349] Signal inference workers to stop experience collection... (107700 times) [2024-06-23 13:24:53,613][15349] Signal inference workers to resume experience collection... (107700 times) [2024-06-23 13:24:53,639][15401] InferenceWorker_p0-w0: stopping experience collection (107700 times) [2024-06-23 13:24:53,669][15401] InferenceWorker_p0-w0: resuming experience collection (107700 times) [2024-06-23 13:24:55,199][15401] Updated weights for policy 0, policy_version 443550 (0.0032) [2024-06-23 13:24:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7267221504. Throughput: 0: 42285.3. Samples: 7267343720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 13:24:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 13:25:00,248][15401] Updated weights for policy 0, policy_version 443560 (0.0035) [2024-06-23 13:25:03,172][15401] Updated weights for policy 0, policy_version 443570 (0.0044) [2024-06-23 13:25:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7267450880. Throughput: 0: 42426.6. Samples: 7267600120. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 13:25:07,723][15401] Updated weights for policy 0, policy_version 443580 (0.0030) [2024-06-23 13:25:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 7267631104. Throughput: 0: 42326.2. Samples: 7267732280. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 13:25:10,851][15401] Updated weights for policy 0, policy_version 443590 (0.0030) [2024-06-23 13:25:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7267844096. Throughput: 0: 42365.4. Samples: 7267979340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 13:25:15,271][15401] Updated weights for policy 0, policy_version 443600 (0.0031) [2024-06-23 13:25:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.7, 300 sec: 42765.0). Total num frames: 7268089856. Throughput: 0: 42462.3. Samples: 7268237760. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 13:25:18,499][15401] Updated weights for policy 0, policy_version 443610 (0.0035) [2024-06-23 13:25:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7268253696. Throughput: 0: 42492.6. Samples: 7268372460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 13:25:23,423][15401] Updated weights for policy 0, policy_version 443620 (0.0029) [2024-06-23 13:25:26,279][15401] Updated weights for policy 0, policy_version 443630 (0.0035) [2024-06-23 13:25:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7268499456. Throughput: 0: 42413.3. Samples: 7268615660. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 13:25:30,981][15401] Updated weights for policy 0, policy_version 443640 (0.0033) [2024-06-23 13:25:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7268712448. Throughput: 0: 42597.8. Samples: 7268878280. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 13:25:33,876][15401] Updated weights for policy 0, policy_version 443650 (0.0023) [2024-06-23 13:25:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7268909056. Throughput: 0: 42475.5. Samples: 7269011280. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 13:25:38,500][15401] Updated weights for policy 0, policy_version 443660 (0.0036) [2024-06-23 13:25:41,638][15401] Updated weights for policy 0, policy_version 443670 (0.0037) [2024-06-23 13:25:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7269154816. Throughput: 0: 42719.0. Samples: 7269266080. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 13:25:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443674_7269154816.pth... [2024-06-23 13:25:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443048_7258898432.pth [2024-06-23 13:25:46,066][15401] Updated weights for policy 0, policy_version 443680 (0.0024) [2024-06-23 13:25:48,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7269367808. Throughput: 0: 42615.2. Samples: 7269517800. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 13:25:49,146][15401] Updated weights for policy 0, policy_version 443690 (0.0038) [2024-06-23 13:25:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7269564416. Throughput: 0: 42521.3. Samples: 7269645840. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:53,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 13:25:53,460][15401] Updated weights for policy 0, policy_version 443700 (0.0029) [2024-06-23 13:25:56,935][15401] Updated weights for policy 0, policy_version 443710 (0.0035) [2024-06-23 13:25:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7269793792. Throughput: 0: 42704.4. Samples: 7269901140. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:25:58,393][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 13:26:01,620][15401] Updated weights for policy 0, policy_version 443720 (0.0041) [2024-06-23 13:26:03,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7270006784. Throughput: 0: 42673.3. Samples: 7270158060. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:26:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 13:26:04,617][15401] Updated weights for policy 0, policy_version 443730 (0.0026) [2024-06-23 13:26:08,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7270187008. Throughput: 0: 42502.5. Samples: 7270285080. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:26:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 13:26:08,548][15349] Signal inference workers to stop experience collection... (107750 times) [2024-06-23 13:26:08,599][15401] InferenceWorker_p0-w0: stopping experience collection (107750 times) [2024-06-23 13:26:08,607][15349] Signal inference workers to resume experience collection... (107750 times) [2024-06-23 13:26:08,618][15401] InferenceWorker_p0-w0: resuming experience collection (107750 times) [2024-06-23 13:26:09,257][15401] Updated weights for policy 0, policy_version 443740 (0.0037) [2024-06-23 13:26:12,126][15401] Updated weights for policy 0, policy_version 443750 (0.0031) [2024-06-23 13:26:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7270432768. Throughput: 0: 42751.7. Samples: 7270539480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:26:13,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 13:26:16,812][15401] Updated weights for policy 0, policy_version 443760 (0.0033) [2024-06-23 13:26:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7270645760. Throughput: 0: 42697.2. Samples: 7270799660. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:26:18,390][15132] Avg episode reward: [(0, '0.913')] [2024-06-23 13:26:19,753][15401] Updated weights for policy 0, policy_version 443770 (0.0031) [2024-06-23 13:26:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7270825984. Throughput: 0: 42508.5. Samples: 7270924160. Policy #0 lag: (min: 0.0, avg: 7.6, max: 19.0) [2024-06-23 13:26:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 13:26:24,515][15401] Updated weights for policy 0, policy_version 443780 (0.0045) [2024-06-23 13:26:27,163][15401] Updated weights for policy 0, policy_version 443790 (0.0027) [2024-06-23 13:26:28,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7271088128. Throughput: 0: 42504.2. Samples: 7271178760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 13:26:32,268][15401] Updated weights for policy 0, policy_version 443800 (0.0030) [2024-06-23 13:26:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7271284736. Throughput: 0: 42868.5. Samples: 7271446880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:33,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 13:26:34,651][15401] Updated weights for policy 0, policy_version 443810 (0.0023) [2024-06-23 13:26:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7271481344. Throughput: 0: 42681.8. Samples: 7271566420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:38,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 13:26:40,091][15401] Updated weights for policy 0, policy_version 443820 (0.0036) [2024-06-23 13:26:42,224][15401] Updated weights for policy 0, policy_version 443830 (0.0038) [2024-06-23 13:26:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7271743488. Throughput: 0: 42737.0. Samples: 7271824200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:43,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 13:26:47,723][15401] Updated weights for policy 0, policy_version 443840 (0.0028) [2024-06-23 13:26:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7271923712. Throughput: 0: 42871.9. Samples: 7272087300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 13:26:50,130][15401] Updated weights for policy 0, policy_version 443850 (0.0029) [2024-06-23 13:26:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42600.1, 300 sec: 42654.7). Total num frames: 7272120320. Throughput: 0: 42739.2. Samples: 7272208340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 13:26:55,287][15401] Updated weights for policy 0, policy_version 443860 (0.0035) [2024-06-23 13:26:57,678][15401] Updated weights for policy 0, policy_version 443870 (0.0032) [2024-06-23 13:26:58,396][15132] Fps is (10 sec: 45846.4, 60 sec: 43141.7, 300 sec: 42764.1). Total num frames: 7272382464. Throughput: 0: 42760.5. Samples: 7272463980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:26:58,396][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 13:27:03,133][15401] Updated weights for policy 0, policy_version 443880 (0.0033) [2024-06-23 13:27:03,207][15349] Signal inference workers to stop experience collection... (107800 times) [2024-06-23 13:27:03,254][15401] InferenceWorker_p0-w0: stopping experience collection (107800 times) [2024-06-23 13:27:03,264][15349] Signal inference workers to resume experience collection... (107800 times) [2024-06-23 13:27:03,277][15401] InferenceWorker_p0-w0: resuming experience collection (107800 times) [2024-06-23 13:27:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7272546304. Throughput: 0: 42959.2. Samples: 7272732820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 13:27:05,446][15401] Updated weights for policy 0, policy_version 443890 (0.0028) [2024-06-23 13:27:08,389][15132] Fps is (10 sec: 39346.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7272775680. Throughput: 0: 42729.4. Samples: 7272846980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:08,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 13:27:10,719][15401] Updated weights for policy 0, policy_version 443900 (0.0038) [2024-06-23 13:27:13,353][15401] Updated weights for policy 0, policy_version 443910 (0.0030) [2024-06-23 13:27:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7273021440. Throughput: 0: 42873.1. Samples: 7273108060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 13:27:18,334][15401] Updated weights for policy 0, policy_version 443920 (0.0037) [2024-06-23 13:27:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7273185280. Throughput: 0: 42776.9. Samples: 7273371840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 13:27:21,095][15401] Updated weights for policy 0, policy_version 443930 (0.0028) [2024-06-23 13:27:23,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7273398272. Throughput: 0: 42564.5. Samples: 7273481820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 13:27:26,126][15401] Updated weights for policy 0, policy_version 443940 (0.0045) [2024-06-23 13:27:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7273627648. Throughput: 0: 42606.2. Samples: 7273741480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 13:27:28,910][15401] Updated weights for policy 0, policy_version 443950 (0.0042) [2024-06-23 13:27:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7273807872. Throughput: 0: 42704.1. Samples: 7274008980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 13:27:33,616][15401] Updated weights for policy 0, policy_version 443960 (0.0022) [2024-06-23 13:27:36,766][15401] Updated weights for policy 0, policy_version 443970 (0.0021) [2024-06-23 13:27:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7274053632. Throughput: 0: 42732.1. Samples: 7274131280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 13:27:41,228][15401] Updated weights for policy 0, policy_version 443980 (0.0048) [2024-06-23 13:27:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 7274266624. Throughput: 0: 42742.8. Samples: 7274387140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-23 13:27:43,391][15132] Avg episode reward: [(0, '0.261')] [2024-06-23 13:27:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443986_7274266624.pth... [2024-06-23 13:27:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443363_7264059392.pth [2024-06-23 13:27:44,280][15401] Updated weights for policy 0, policy_version 443990 (0.0033) [2024-06-23 13:27:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7274463232. Throughput: 0: 42542.7. Samples: 7274647240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:27:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 13:27:49,022][15401] Updated weights for policy 0, policy_version 444000 (0.0032) [2024-06-23 13:27:51,967][15401] Updated weights for policy 0, policy_version 444010 (0.0023) [2024-06-23 13:27:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7274708992. Throughput: 0: 42844.0. Samples: 7274774960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:27:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 13:27:56,433][15401] Updated weights for policy 0, policy_version 444020 (0.0035) [2024-06-23 13:27:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41783.6, 300 sec: 42542.9). Total num frames: 7274889216. Throughput: 0: 42722.7. Samples: 7275030580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:27:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 13:27:59,614][15401] Updated weights for policy 0, policy_version 444030 (0.0023) [2024-06-23 13:28:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42598.7). Total num frames: 7275102208. Throughput: 0: 42708.8. Samples: 7275293840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:03,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 13:28:03,877][15401] Updated weights for policy 0, policy_version 444040 (0.0031) [2024-06-23 13:28:07,268][15401] Updated weights for policy 0, policy_version 444050 (0.0027) [2024-06-23 13:28:08,396][15132] Fps is (10 sec: 47483.7, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 7275364352. Throughput: 0: 43077.4. Samples: 7275420580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:08,397][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 13:28:11,889][15401] Updated weights for policy 0, policy_version 444060 (0.0035) [2024-06-23 13:28:13,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 7275544576. Throughput: 0: 42933.3. Samples: 7275673480. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 13:28:14,966][15401] Updated weights for policy 0, policy_version 444070 (0.0028) [2024-06-23 13:28:18,390][15132] Fps is (10 sec: 37707.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7275741184. Throughput: 0: 42766.6. Samples: 7275933480. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 13:28:19,536][15401] Updated weights for policy 0, policy_version 444080 (0.0031) [2024-06-23 13:28:22,711][15401] Updated weights for policy 0, policy_version 444090 (0.0037) [2024-06-23 13:28:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7275970560. Throughput: 0: 42871.9. Samples: 7276060520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 13:28:23,461][15349] Signal inference workers to stop experience collection... (107850 times) [2024-06-23 13:28:23,468][15349] Signal inference workers to resume experience collection... (107850 times) [2024-06-23 13:28:23,492][15401] InferenceWorker_p0-w0: stopping experience collection (107850 times) [2024-06-23 13:28:23,496][15401] InferenceWorker_p0-w0: resuming experience collection (107850 times) [2024-06-23 13:28:27,120][15401] Updated weights for policy 0, policy_version 444100 (0.0040) [2024-06-23 13:28:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7276183552. Throughput: 0: 42788.1. Samples: 7276312600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 13:28:30,948][15401] Updated weights for policy 0, policy_version 444110 (0.0032) [2024-06-23 13:28:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7276396544. Throughput: 0: 42768.0. Samples: 7276571800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 13:28:34,709][15401] Updated weights for policy 0, policy_version 444120 (0.0043) [2024-06-23 13:28:38,346][15401] Updated weights for policy 0, policy_version 444130 (0.0026) [2024-06-23 13:28:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 7276625920. Throughput: 0: 42750.2. Samples: 7276698720. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 13:28:42,255][15401] Updated weights for policy 0, policy_version 444140 (0.0032) [2024-06-23 13:28:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7276838912. Throughput: 0: 42873.1. Samples: 7276959860. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:28:45,687][15401] Updated weights for policy 0, policy_version 444150 (0.0034) [2024-06-23 13:28:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 7277035520. Throughput: 0: 42636.5. Samples: 7277212380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:48,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 13:28:49,915][15401] Updated weights for policy 0, policy_version 444160 (0.0040) [2024-06-23 13:28:53,349][15401] Updated weights for policy 0, policy_version 444170 (0.0041) [2024-06-23 13:28:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7277281280. Throughput: 0: 42433.6. Samples: 7277329820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 13:28:57,555][15401] Updated weights for policy 0, policy_version 444180 (0.0040) [2024-06-23 13:28:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 7277477888. Throughput: 0: 42620.4. Samples: 7277591400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:28:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:29:01,561][15401] Updated weights for policy 0, policy_version 444190 (0.0044) [2024-06-23 13:29:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 7277674496. Throughput: 0: 42578.7. Samples: 7277849520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:29:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 13:29:05,128][15401] Updated weights for policy 0, policy_version 444200 (0.0036) [2024-06-23 13:29:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42329.7, 300 sec: 42765.0). Total num frames: 7277903872. Throughput: 0: 42555.4. Samples: 7277975520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-23 13:29:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 13:29:09,118][15401] Updated weights for policy 0, policy_version 444210 (0.0029) [2024-06-23 13:29:13,031][15401] Updated weights for policy 0, policy_version 444220 (0.0031) [2024-06-23 13:29:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42598.5). Total num frames: 7278133248. Throughput: 0: 42698.2. Samples: 7278234020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:13,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 13:29:16,904][15401] Updated weights for policy 0, policy_version 444230 (0.0034) [2024-06-23 13:29:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7278297088. Throughput: 0: 42632.3. Samples: 7278490260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 13:29:20,502][15401] Updated weights for policy 0, policy_version 444240 (0.0037) [2024-06-23 13:29:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7278526464. Throughput: 0: 42524.1. Samples: 7278612300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 13:29:24,454][15401] Updated weights for policy 0, policy_version 444250 (0.0040) [2024-06-23 13:29:28,042][15401] Updated weights for policy 0, policy_version 444260 (0.0034) [2024-06-23 13:29:28,389][15132] Fps is (10 sec: 45876.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7278755840. Throughput: 0: 42524.0. Samples: 7278873440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 13:29:31,931][15401] Updated weights for policy 0, policy_version 444270 (0.0035) [2024-06-23 13:29:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7278952448. Throughput: 0: 42663.9. Samples: 7279132260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:33,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 13:29:35,665][15401] Updated weights for policy 0, policy_version 444280 (0.0033) [2024-06-23 13:29:37,798][15349] Signal inference workers to stop experience collection... (107900 times) [2024-06-23 13:29:37,856][15349] Signal inference workers to resume experience collection... (107900 times) [2024-06-23 13:29:37,856][15401] InferenceWorker_p0-w0: stopping experience collection (107900 times) [2024-06-23 13:29:37,868][15401] InferenceWorker_p0-w0: resuming experience collection (107900 times) [2024-06-23 13:29:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7279165440. Throughput: 0: 42776.6. Samples: 7279254760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 13:29:39,618][15401] Updated weights for policy 0, policy_version 444290 (0.0034) [2024-06-23 13:29:43,359][15401] Updated weights for policy 0, policy_version 444300 (0.0032) [2024-06-23 13:29:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7279411200. Throughput: 0: 42704.5. Samples: 7279513100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 13:29:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444300_7279411200.pth... [2024-06-23 13:29:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443674_7269154816.pth [2024-06-23 13:29:47,187][15401] Updated weights for policy 0, policy_version 444310 (0.0036) [2024-06-23 13:29:48,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7279591424. Throughput: 0: 42560.8. Samples: 7279764860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:48,392][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 13:29:51,316][15401] Updated weights for policy 0, policy_version 444320 (0.0025) [2024-06-23 13:29:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7279804416. Throughput: 0: 42619.6. Samples: 7279893400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 13:29:55,092][15401] Updated weights for policy 0, policy_version 444330 (0.0032) [2024-06-23 13:29:58,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7280017408. Throughput: 0: 42672.6. Samples: 7280154280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:29:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 13:29:58,786][15401] Updated weights for policy 0, policy_version 444340 (0.0031) [2024-06-23 13:30:02,537][15401] Updated weights for policy 0, policy_version 444350 (0.0028) [2024-06-23 13:30:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7280263168. Throughput: 0: 42692.9. Samples: 7280411440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 13:30:06,390][15401] Updated weights for policy 0, policy_version 444360 (0.0026) [2024-06-23 13:30:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7280443392. Throughput: 0: 42731.8. Samples: 7280535240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 13:30:09,953][15401] Updated weights for policy 0, policy_version 444370 (0.0030) [2024-06-23 13:30:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7280656384. Throughput: 0: 42762.2. Samples: 7280797740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 13:30:14,277][15401] Updated weights for policy 0, policy_version 444380 (0.0034) [2024-06-23 13:30:18,134][15401] Updated weights for policy 0, policy_version 444390 (0.0028) [2024-06-23 13:30:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7280902144. Throughput: 0: 42610.3. Samples: 7281049720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 13:30:21,699][15401] Updated weights for policy 0, policy_version 444400 (0.0031) [2024-06-23 13:30:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7281098752. Throughput: 0: 42811.8. Samples: 7281181300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 13:30:25,790][15401] Updated weights for policy 0, policy_version 444410 (0.0044) [2024-06-23 13:30:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 7281311744. Throughput: 0: 42732.8. Samples: 7281436180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:28,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 13:30:29,222][15401] Updated weights for policy 0, policy_version 444420 (0.0031) [2024-06-23 13:30:33,227][15401] Updated weights for policy 0, policy_version 444430 (0.0024) [2024-06-23 13:30:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7281541120. Throughput: 0: 42836.0. Samples: 7281692380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:30:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 13:30:37,361][15401] Updated weights for policy 0, policy_version 444440 (0.0037) [2024-06-23 13:30:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7281737728. Throughput: 0: 42919.7. Samples: 7281824780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:30:38,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 13:30:40,807][15401] Updated weights for policy 0, policy_version 444450 (0.0024) [2024-06-23 13:30:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7281950720. Throughput: 0: 42800.8. Samples: 7282080320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:30:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 13:30:44,874][15401] Updated weights for policy 0, policy_version 444460 (0.0042) [2024-06-23 13:30:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 7282180096. Throughput: 0: 42734.3. Samples: 7282334480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:30:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 13:30:48,427][15401] Updated weights for policy 0, policy_version 444470 (0.0041) [2024-06-23 13:30:52,550][15401] Updated weights for policy 0, policy_version 444480 (0.0032) [2024-06-23 13:30:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 7282376704. Throughput: 0: 42897.4. Samples: 7282465620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:30:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 13:30:56,390][15401] Updated weights for policy 0, policy_version 444490 (0.0045) [2024-06-23 13:30:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7282589696. Throughput: 0: 42870.6. Samples: 7282726920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:30:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 13:31:00,429][15349] Signal inference workers to stop experience collection... (107950 times) [2024-06-23 13:31:00,433][15349] Signal inference workers to resume experience collection... (107950 times) [2024-06-23 13:31:00,444][15401] Updated weights for policy 0, policy_version 444500 (0.0039) [2024-06-23 13:31:00,453][15401] InferenceWorker_p0-w0: stopping experience collection (107950 times) [2024-06-23 13:31:00,453][15401] InferenceWorker_p0-w0: resuming experience collection (107950 times) [2024-06-23 13:31:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7282819072. Throughput: 0: 42970.3. Samples: 7282983380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 13:31:03,925][15401] Updated weights for policy 0, policy_version 444510 (0.0033) [2024-06-23 13:31:07,959][15401] Updated weights for policy 0, policy_version 444520 (0.0025) [2024-06-23 13:31:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7283032064. Throughput: 0: 42892.9. Samples: 7283111480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:08,391][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 13:31:11,581][15401] Updated weights for policy 0, policy_version 444530 (0.0038) [2024-06-23 13:31:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7283261440. Throughput: 0: 42977.0. Samples: 7283370040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 13:31:15,476][15401] Updated weights for policy 0, policy_version 444540 (0.0030) [2024-06-23 13:31:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7283458048. Throughput: 0: 42896.5. Samples: 7283622720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 13:31:19,218][15401] Updated weights for policy 0, policy_version 444550 (0.0025) [2024-06-23 13:31:23,028][15401] Updated weights for policy 0, policy_version 444560 (0.0034) [2024-06-23 13:31:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 7283687424. Throughput: 0: 42749.2. Samples: 7283748500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 13:31:26,865][15401] Updated weights for policy 0, policy_version 444570 (0.0032) [2024-06-23 13:31:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7283884032. Throughput: 0: 42931.5. Samples: 7284012240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 13:31:30,645][15401] Updated weights for policy 0, policy_version 444580 (0.0032) [2024-06-23 13:31:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7284097024. Throughput: 0: 43050.2. Samples: 7284271740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 13:31:34,712][15401] Updated weights for policy 0, policy_version 444590 (0.0038) [2024-06-23 13:31:38,166][15401] Updated weights for policy 0, policy_version 444600 (0.0042) [2024-06-23 13:31:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7284326400. Throughput: 0: 42978.6. Samples: 7284399660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:38,392][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 13:31:42,383][15401] Updated weights for policy 0, policy_version 444610 (0.0031) [2024-06-23 13:31:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 7284523008. Throughput: 0: 42982.0. Samples: 7284661220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:43,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 13:31:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444614_7284555776.pth... [2024-06-23 13:31:43,612][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000443986_7274266624.pth [2024-06-23 13:31:45,809][15401] Updated weights for policy 0, policy_version 444620 (0.0035) [2024-06-23 13:31:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7284736000. Throughput: 0: 42918.5. Samples: 7284914720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 13:31:49,881][15401] Updated weights for policy 0, policy_version 444630 (0.0041) [2024-06-23 13:31:53,390][15401] Updated weights for policy 0, policy_version 444640 (0.0037) [2024-06-23 13:31:53,389][15132] Fps is (10 sec: 45887.1, 60 sec: 43417.7, 300 sec: 42710.4). Total num frames: 7284981760. Throughput: 0: 42886.3. Samples: 7285041360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 13:31:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 13:31:57,586][15401] Updated weights for policy 0, policy_version 444650 (0.0033) [2024-06-23 13:31:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7285161984. Throughput: 0: 42821.3. Samples: 7285297100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:31:58,392][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 13:32:00,704][15349] Signal inference workers to stop experience collection... (108000 times) [2024-06-23 13:32:00,705][15349] Signal inference workers to resume experience collection... (108000 times) [2024-06-23 13:32:00,719][15401] InferenceWorker_p0-w0: stopping experience collection (108000 times) [2024-06-23 13:32:00,719][15401] InferenceWorker_p0-w0: resuming experience collection (108000 times) [2024-06-23 13:32:01,016][15401] Updated weights for policy 0, policy_version 444660 (0.0031) [2024-06-23 13:32:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7285374976. Throughput: 0: 42937.0. Samples: 7285554880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 13:32:05,089][15401] Updated weights for policy 0, policy_version 444670 (0.0031) [2024-06-23 13:32:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7285604352. Throughput: 0: 43040.2. Samples: 7285685300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 13:32:08,887][15401] Updated weights for policy 0, policy_version 444680 (0.0032) [2024-06-23 13:32:13,106][15401] Updated weights for policy 0, policy_version 444690 (0.0045) [2024-06-23 13:32:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7285817344. Throughput: 0: 42835.5. Samples: 7285939840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 13:32:16,688][15401] Updated weights for policy 0, policy_version 444700 (0.0036) [2024-06-23 13:32:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7286030336. Throughput: 0: 42681.5. Samples: 7286192400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 13:32:20,712][15401] Updated weights for policy 0, policy_version 444710 (0.0032) [2024-06-23 13:32:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7286226944. Throughput: 0: 42835.5. Samples: 7286327260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 13:32:24,073][15401] Updated weights for policy 0, policy_version 444720 (0.0031) [2024-06-23 13:32:28,318][15401] Updated weights for policy 0, policy_version 444730 (0.0041) [2024-06-23 13:32:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7286456320. Throughput: 0: 42692.6. Samples: 7286582280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 13:32:31,611][15401] Updated weights for policy 0, policy_version 444740 (0.0034) [2024-06-23 13:32:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7286669312. Throughput: 0: 42801.9. Samples: 7286840800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 13:32:35,862][15401] Updated weights for policy 0, policy_version 444750 (0.0043) [2024-06-23 13:32:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7286898688. Throughput: 0: 42844.9. Samples: 7286969380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 13:32:39,215][15401] Updated weights for policy 0, policy_version 444760 (0.0022) [2024-06-23 13:32:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 7287095296. Throughput: 0: 42900.6. Samples: 7287227520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 13:32:43,450][15401] Updated weights for policy 0, policy_version 444770 (0.0024) [2024-06-23 13:32:47,128][15401] Updated weights for policy 0, policy_version 444780 (0.0037) [2024-06-23 13:32:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7287324672. Throughput: 0: 42960.9. Samples: 7287488120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 13:32:51,052][15401] Updated weights for policy 0, policy_version 444790 (0.0042) [2024-06-23 13:32:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7287537664. Throughput: 0: 42913.6. Samples: 7287616420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 13:32:54,619][15401] Updated weights for policy 0, policy_version 444800 (0.0029) [2024-06-23 13:32:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42876.4). Total num frames: 7287750656. Throughput: 0: 42984.9. Samples: 7287874160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:32:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 13:32:58,754][15401] Updated weights for policy 0, policy_version 444810 (0.0028) [2024-06-23 13:33:02,077][15401] Updated weights for policy 0, policy_version 444820 (0.0044) [2024-06-23 13:33:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42710.4). Total num frames: 7287963648. Throughput: 0: 42999.8. Samples: 7288127400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:33:03,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 13:33:06,447][15401] Updated weights for policy 0, policy_version 444830 (0.0031) [2024-06-23 13:33:08,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 7288193024. Throughput: 0: 42965.8. Samples: 7288260820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:33:08,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 13:33:09,667][15401] Updated weights for policy 0, policy_version 444840 (0.0038) [2024-06-23 13:33:13,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 7288356864. Throughput: 0: 42971.9. Samples: 7288516120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:33:13,393][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 13:33:14,006][15401] Updated weights for policy 0, policy_version 444850 (0.0047) [2024-06-23 13:33:17,693][15401] Updated weights for policy 0, policy_version 444860 (0.0036) [2024-06-23 13:33:18,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7288602624. Throughput: 0: 42999.1. Samples: 7288775760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 13:33:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 13:33:21,543][15401] Updated weights for policy 0, policy_version 444870 (0.0037) [2024-06-23 13:33:23,390][15132] Fps is (10 sec: 47524.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7288832000. Throughput: 0: 42968.8. Samples: 7288902980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 13:33:25,214][15401] Updated weights for policy 0, policy_version 444880 (0.0039) [2024-06-23 13:33:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7289012224. Throughput: 0: 42895.1. Samples: 7289157800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 13:33:29,101][15401] Updated weights for policy 0, policy_version 444890 (0.0039) [2024-06-23 13:33:32,880][15401] Updated weights for policy 0, policy_version 444900 (0.0037) [2024-06-23 13:33:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7289257984. Throughput: 0: 42598.6. Samples: 7289405060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 13:33:36,951][15401] Updated weights for policy 0, policy_version 444910 (0.0032) [2024-06-23 13:33:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7289454592. Throughput: 0: 42647.2. Samples: 7289535540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 13:33:38,854][15349] Signal inference workers to stop experience collection... (108050 times) [2024-06-23 13:33:38,855][15349] Signal inference workers to resume experience collection... (108050 times) [2024-06-23 13:33:38,888][15401] InferenceWorker_p0-w0: stopping experience collection (108050 times) [2024-06-23 13:33:38,888][15401] InferenceWorker_p0-w0: resuming experience collection (108050 times) [2024-06-23 13:33:40,451][15401] Updated weights for policy 0, policy_version 444920 (0.0027) [2024-06-23 13:33:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7289667584. Throughput: 0: 42593.9. Samples: 7289790880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 13:33:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444926_7289667584.pth... [2024-06-23 13:33:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444300_7279411200.pth [2024-06-23 13:33:44,422][15401] Updated weights for policy 0, policy_version 444930 (0.0034) [2024-06-23 13:33:48,295][15401] Updated weights for policy 0, policy_version 444940 (0.0034) [2024-06-23 13:33:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7289896960. Throughput: 0: 42547.1. Samples: 7290042020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 13:33:52,009][15401] Updated weights for policy 0, policy_version 444950 (0.0032) [2024-06-23 13:33:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7290093568. Throughput: 0: 42616.8. Samples: 7290178480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 13:33:55,921][15401] Updated weights for policy 0, policy_version 444960 (0.0036) [2024-06-23 13:33:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7290306560. Throughput: 0: 42555.9. Samples: 7290431040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:33:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 13:33:59,608][15401] Updated weights for policy 0, policy_version 444970 (0.0040) [2024-06-23 13:34:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 7290519552. Throughput: 0: 42312.5. Samples: 7290679820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:03,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-23 13:34:03,577][15401] Updated weights for policy 0, policy_version 444980 (0.0044) [2024-06-23 13:34:07,582][15401] Updated weights for policy 0, policy_version 444990 (0.0039) [2024-06-23 13:34:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 7290732544. Throughput: 0: 42540.0. Samples: 7290817280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:08,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-23 13:34:11,473][15401] Updated weights for policy 0, policy_version 445000 (0.0034) [2024-06-23 13:34:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 7290912768. Throughput: 0: 42294.1. Samples: 7291061040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 13:34:15,598][15401] Updated weights for policy 0, policy_version 445010 (0.0031) [2024-06-23 13:34:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7291174912. Throughput: 0: 42234.1. Samples: 7291305600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 13:34:19,481][15401] Updated weights for policy 0, policy_version 445020 (0.0032) [2024-06-23 13:34:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7291355136. Throughput: 0: 42442.8. Samples: 7291445460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 13:34:23,486][15401] Updated weights for policy 0, policy_version 445030 (0.0037) [2024-06-23 13:34:27,014][15401] Updated weights for policy 0, policy_version 445040 (0.0042) [2024-06-23 13:34:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7291551744. Throughput: 0: 42220.0. Samples: 7291690780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 13:34:31,122][15401] Updated weights for policy 0, policy_version 445050 (0.0035) [2024-06-23 13:34:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7291797504. Throughput: 0: 42353.8. Samples: 7291947940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 13:34:34,654][15401] Updated weights for policy 0, policy_version 445060 (0.0038) [2024-06-23 13:34:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7291994112. Throughput: 0: 42371.1. Samples: 7292085180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 13:34:38,630][15401] Updated weights for policy 0, policy_version 445070 (0.0028) [2024-06-23 13:34:42,707][15401] Updated weights for policy 0, policy_version 445080 (0.0031) [2024-06-23 13:34:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 7292190720. Throughput: 0: 42270.9. Samples: 7292333220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:34:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 13:34:46,393][15401] Updated weights for policy 0, policy_version 445090 (0.0040) [2024-06-23 13:34:48,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 7292452864. Throughput: 0: 42348.8. Samples: 7292585620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:34:48,393][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 13:34:50,453][15401] Updated weights for policy 0, policy_version 445100 (0.0035) [2024-06-23 13:34:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7292616704. Throughput: 0: 42309.4. Samples: 7292721200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:34:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 13:34:53,448][15349] Signal inference workers to stop experience collection... (108100 times) [2024-06-23 13:34:53,448][15349] Signal inference workers to resume experience collection... (108100 times) [2024-06-23 13:34:53,481][15401] InferenceWorker_p0-w0: stopping experience collection (108100 times) [2024-06-23 13:34:53,488][15401] InferenceWorker_p0-w0: resuming experience collection (108100 times) [2024-06-23 13:34:54,173][15401] Updated weights for policy 0, policy_version 445110 (0.0029) [2024-06-23 13:34:58,373][15401] Updated weights for policy 0, policy_version 445120 (0.0039) [2024-06-23 13:34:58,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7292846080. Throughput: 0: 42367.6. Samples: 7292967580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:34:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 13:35:01,818][15401] Updated weights for policy 0, policy_version 445130 (0.0031) [2024-06-23 13:35:03,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7293091840. Throughput: 0: 42524.1. Samples: 7293219180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 13:35:06,291][15401] Updated weights for policy 0, policy_version 445140 (0.0040) [2024-06-23 13:35:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7293272064. Throughput: 0: 42526.0. Samples: 7293359140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 13:35:09,473][15401] Updated weights for policy 0, policy_version 445150 (0.0043) [2024-06-23 13:35:13,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7293468672. Throughput: 0: 42596.3. Samples: 7293607620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:13,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 13:35:13,890][15401] Updated weights for policy 0, policy_version 445160 (0.0036) [2024-06-23 13:35:17,155][15401] Updated weights for policy 0, policy_version 445170 (0.0026) [2024-06-23 13:35:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7293714432. Throughput: 0: 42506.7. Samples: 7293860740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 13:35:21,583][15401] Updated weights for policy 0, policy_version 445180 (0.0027) [2024-06-23 13:35:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 7293894656. Throughput: 0: 42485.0. Samples: 7293997000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 13:35:24,762][15401] Updated weights for policy 0, policy_version 445190 (0.0039) [2024-06-23 13:35:28,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 7294107648. Throughput: 0: 42347.9. Samples: 7294238980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:28,393][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 13:35:29,485][15401] Updated weights for policy 0, policy_version 445200 (0.0036) [2024-06-23 13:35:32,511][15401] Updated weights for policy 0, policy_version 445210 (0.0030) [2024-06-23 13:35:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7294337024. Throughput: 0: 42379.2. Samples: 7294492580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 13:35:36,969][15401] Updated weights for policy 0, policy_version 445220 (0.0027) [2024-06-23 13:35:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7294517248. Throughput: 0: 42349.4. Samples: 7294626920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 13:35:40,023][15401] Updated weights for policy 0, policy_version 445230 (0.0036) [2024-06-23 13:35:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7294746624. Throughput: 0: 42438.2. Samples: 7294877300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 13:35:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445237_7294763008.pth... [2024-06-23 13:35:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444614_7284555776.pth [2024-06-23 13:35:44,602][15401] Updated weights for policy 0, policy_version 445240 (0.0032) [2024-06-23 13:35:47,632][15401] Updated weights for policy 0, policy_version 445250 (0.0040) [2024-06-23 13:35:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 7294976000. Throughput: 0: 42527.6. Samples: 7295132920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 13:35:52,309][15401] Updated weights for policy 0, policy_version 445260 (0.0036) [2024-06-23 13:35:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7295156224. Throughput: 0: 42389.9. Samples: 7295266680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 13:35:55,578][15401] Updated weights for policy 0, policy_version 445270 (0.0037) [2024-06-23 13:35:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7295401984. Throughput: 0: 42452.1. Samples: 7295517960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:35:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 13:35:59,893][15401] Updated weights for policy 0, policy_version 445280 (0.0032) [2024-06-23 13:36:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7295614976. Throughput: 0: 42458.2. Samples: 7295771360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 13:36:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 13:36:03,493][15401] Updated weights for policy 0, policy_version 445290 (0.0030) [2024-06-23 13:36:07,664][15401] Updated weights for policy 0, policy_version 445300 (0.0042) [2024-06-23 13:36:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 7295795200. Throughput: 0: 42315.5. Samples: 7295901200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:08,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 13:36:11,111][15401] Updated weights for policy 0, policy_version 445310 (0.0047) [2024-06-23 13:36:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7296040960. Throughput: 0: 42600.3. Samples: 7296155900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 13:36:15,523][15401] Updated weights for policy 0, policy_version 445320 (0.0033) [2024-06-23 13:36:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 7296221184. Throughput: 0: 42728.4. Samples: 7296415360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 13:36:18,832][15349] Signal inference workers to stop experience collection... (108150 times) [2024-06-23 13:36:18,832][15349] Signal inference workers to resume experience collection... (108150 times) [2024-06-23 13:36:18,875][15401] InferenceWorker_p0-w0: stopping experience collection (108150 times) [2024-06-23 13:36:18,875][15401] InferenceWorker_p0-w0: resuming experience collection (108150 times) [2024-06-23 13:36:18,980][15401] Updated weights for policy 0, policy_version 445330 (0.0037) [2024-06-23 13:36:23,158][15401] Updated weights for policy 0, policy_version 445340 (0.0038) [2024-06-23 13:36:23,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7296450560. Throughput: 0: 42505.4. Samples: 7296539660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 13:36:26,628][15401] Updated weights for policy 0, policy_version 445350 (0.0029) [2024-06-23 13:36:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 7296696320. Throughput: 0: 42572.0. Samples: 7296793040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 13:36:31,000][15401] Updated weights for policy 0, policy_version 445360 (0.0032) [2024-06-23 13:36:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7296876544. Throughput: 0: 42774.2. Samples: 7297057760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 13:36:34,152][15401] Updated weights for policy 0, policy_version 445370 (0.0032) [2024-06-23 13:36:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 7297073152. Throughput: 0: 42615.2. Samples: 7297184360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 13:36:38,643][15401] Updated weights for policy 0, policy_version 445380 (0.0031) [2024-06-23 13:36:41,898][15401] Updated weights for policy 0, policy_version 445390 (0.0026) [2024-06-23 13:36:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7297318912. Throughput: 0: 42635.5. Samples: 7297436560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:43,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-23 13:36:46,174][15401] Updated weights for policy 0, policy_version 445400 (0.0034) [2024-06-23 13:36:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7297531904. Throughput: 0: 42706.3. Samples: 7297693140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 13:36:49,578][15401] Updated weights for policy 0, policy_version 445410 (0.0023) [2024-06-23 13:36:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 7297728512. Throughput: 0: 42659.9. Samples: 7297820900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:53,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-23 13:36:53,542][15401] Updated weights for policy 0, policy_version 445420 (0.0023) [2024-06-23 13:36:57,095][15401] Updated weights for policy 0, policy_version 445430 (0.0030) [2024-06-23 13:36:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7297990656. Throughput: 0: 42850.0. Samples: 7298084140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:36:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 13:37:00,982][15401] Updated weights for policy 0, policy_version 445440 (0.0036) [2024-06-23 13:37:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7298187264. Throughput: 0: 42939.0. Samples: 7298347620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 13:37:04,576][15401] Updated weights for policy 0, policy_version 445450 (0.0037) [2024-06-23 13:37:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7298367488. Throughput: 0: 43029.7. Samples: 7298476000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 13:37:08,943][15401] Updated weights for policy 0, policy_version 445460 (0.0023) [2024-06-23 13:37:12,080][15401] Updated weights for policy 0, policy_version 445470 (0.0037) [2024-06-23 13:37:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 7298646016. Throughput: 0: 43063.6. Samples: 7298730900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:13,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 13:37:16,738][15401] Updated weights for policy 0, policy_version 445480 (0.0035) [2024-06-23 13:37:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7298826240. Throughput: 0: 42947.6. Samples: 7298990400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 13:37:19,704][15401] Updated weights for policy 0, policy_version 445490 (0.0047) [2024-06-23 13:37:23,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7299006464. Throughput: 0: 42894.2. Samples: 7299114600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 13:37:24,218][15401] Updated weights for policy 0, policy_version 445500 (0.0028) [2024-06-23 13:37:26,126][15349] Signal inference workers to stop experience collection... (108200 times) [2024-06-23 13:37:26,127][15349] Signal inference workers to resume experience collection... (108200 times) [2024-06-23 13:37:26,139][15401] InferenceWorker_p0-w0: stopping experience collection (108200 times) [2024-06-23 13:37:26,139][15401] InferenceWorker_p0-w0: resuming experience collection (108200 times) [2024-06-23 13:37:27,369][15401] Updated weights for policy 0, policy_version 445510 (0.0036) [2024-06-23 13:37:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7299268608. Throughput: 0: 43118.3. Samples: 7299376880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 13:37:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 13:37:31,787][15401] Updated weights for policy 0, policy_version 445520 (0.0027) [2024-06-23 13:37:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 7299481600. Throughput: 0: 43175.8. Samples: 7299636060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 13:37:35,184][15401] Updated weights for policy 0, policy_version 445530 (0.0034) [2024-06-23 13:37:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7299661824. Throughput: 0: 43166.4. Samples: 7299763380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 13:37:39,325][15401] Updated weights for policy 0, policy_version 445540 (0.0030) [2024-06-23 13:37:42,481][15401] Updated weights for policy 0, policy_version 445550 (0.0040) [2024-06-23 13:37:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 7299907584. Throughput: 0: 43170.1. Samples: 7300026900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:43,392][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 13:37:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445552_7299923968.pth... [2024-06-23 13:37:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000444926_7289667584.pth [2024-06-23 13:37:46,885][15401] Updated weights for policy 0, policy_version 445560 (0.0029) [2024-06-23 13:37:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7300120576. Throughput: 0: 43049.0. Samples: 7300284820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:48,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 13:37:50,420][15401] Updated weights for policy 0, policy_version 445570 (0.0035) [2024-06-23 13:37:53,390][15132] Fps is (10 sec: 40969.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7300317184. Throughput: 0: 43063.8. Samples: 7300413880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 13:37:54,442][15401] Updated weights for policy 0, policy_version 445580 (0.0044) [2024-06-23 13:37:58,070][15401] Updated weights for policy 0, policy_version 445590 (0.0035) [2024-06-23 13:37:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7300562944. Throughput: 0: 43127.2. Samples: 7300671620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:37:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 13:38:02,036][15401] Updated weights for policy 0, policy_version 445600 (0.0028) [2024-06-23 13:38:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 7300759552. Throughput: 0: 43177.7. Samples: 7300933400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 13:38:05,574][15401] Updated weights for policy 0, policy_version 445610 (0.0028) [2024-06-23 13:38:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 7300972544. Throughput: 0: 43199.2. Samples: 7301058560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 13:38:09,560][15401] Updated weights for policy 0, policy_version 445620 (0.0031) [2024-06-23 13:38:13,342][15401] Updated weights for policy 0, policy_version 445630 (0.0053) [2024-06-23 13:38:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7301201920. Throughput: 0: 43123.9. Samples: 7301317460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:13,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 13:38:17,114][15401] Updated weights for policy 0, policy_version 445640 (0.0034) [2024-06-23 13:38:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7301414912. Throughput: 0: 43133.8. Samples: 7301577080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 13:38:21,012][15401] Updated weights for policy 0, policy_version 445650 (0.0043) [2024-06-23 13:38:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 7301627904. Throughput: 0: 43163.5. Samples: 7301705740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 13:38:24,828][15401] Updated weights for policy 0, policy_version 445660 (0.0035) [2024-06-23 13:38:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7301824512. Throughput: 0: 42961.0. Samples: 7301960040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 13:38:28,647][15401] Updated weights for policy 0, policy_version 445670 (0.0037) [2024-06-23 13:38:32,637][15401] Updated weights for policy 0, policy_version 445680 (0.0037) [2024-06-23 13:38:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7302053888. Throughput: 0: 43032.9. Samples: 7302221300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 13:38:36,322][15401] Updated weights for policy 0, policy_version 445690 (0.0045) [2024-06-23 13:38:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7302266880. Throughput: 0: 43072.2. Samples: 7302352120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 13:38:40,094][15401] Updated weights for policy 0, policy_version 445700 (0.0039) [2024-06-23 13:38:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42871.4, 300 sec: 42653.6). Total num frames: 7302479872. Throughput: 0: 42953.7. Samples: 7302604640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:43,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 13:38:44,032][15401] Updated weights for policy 0, policy_version 445710 (0.0043) [2024-06-23 13:38:47,541][15401] Updated weights for policy 0, policy_version 445720 (0.0032) [2024-06-23 13:38:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7302692864. Throughput: 0: 42876.5. Samples: 7302862840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 13:38:50,443][15349] Signal inference workers to stop experience collection... (108250 times) [2024-06-23 13:38:50,504][15349] Signal inference workers to resume experience collection... (108250 times) [2024-06-23 13:38:50,504][15401] InferenceWorker_p0-w0: stopping experience collection (108250 times) [2024-06-23 13:38:50,514][15401] InferenceWorker_p0-w0: resuming experience collection (108250 times) [2024-06-23 13:38:51,756][15401] Updated weights for policy 0, policy_version 445730 (0.0030) [2024-06-23 13:38:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 7302905856. Throughput: 0: 42938.2. Samples: 7302990780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 13:38:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 13:38:55,107][15401] Updated weights for policy 0, policy_version 445740 (0.0029) [2024-06-23 13:38:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7303135232. Throughput: 0: 42855.7. Samples: 7303245960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:38:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 13:38:59,371][15401] Updated weights for policy 0, policy_version 445750 (0.0037) [2024-06-23 13:39:02,729][15401] Updated weights for policy 0, policy_version 445760 (0.0041) [2024-06-23 13:39:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7303348224. Throughput: 0: 42685.9. Samples: 7303497940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 13:39:06,806][15401] Updated weights for policy 0, policy_version 445770 (0.0032) [2024-06-23 13:39:08,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 7303544832. Throughput: 0: 42842.4. Samples: 7303633920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:08,396][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 13:39:10,373][15401] Updated weights for policy 0, policy_version 445780 (0.0037) [2024-06-23 13:39:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7303757824. Throughput: 0: 42689.3. Samples: 7303881060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 13:39:14,856][15401] Updated weights for policy 0, policy_version 445790 (0.0038) [2024-06-23 13:39:18,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7303970816. Throughput: 0: 42516.9. Samples: 7304134560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 13:39:18,587][15401] Updated weights for policy 0, policy_version 445800 (0.0042) [2024-06-23 13:39:22,504][15401] Updated weights for policy 0, policy_version 445810 (0.0027) [2024-06-23 13:39:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7304167424. Throughput: 0: 42559.5. Samples: 7304267300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 13:39:26,208][15401] Updated weights for policy 0, policy_version 445820 (0.0022) [2024-06-23 13:39:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7304396800. Throughput: 0: 42579.1. Samples: 7304520600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 13:39:30,020][15401] Updated weights for policy 0, policy_version 445830 (0.0053) [2024-06-23 13:39:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7304609792. Throughput: 0: 42504.8. Samples: 7304775560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 13:39:33,731][15401] Updated weights for policy 0, policy_version 445840 (0.0034) [2024-06-23 13:39:37,610][15401] Updated weights for policy 0, policy_version 445850 (0.0031) [2024-06-23 13:39:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7304822784. Throughput: 0: 42663.1. Samples: 7304910620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 13:39:41,117][15401] Updated weights for policy 0, policy_version 445860 (0.0044) [2024-06-23 13:39:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 7305052160. Throughput: 0: 42599.9. Samples: 7305162960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 13:39:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445865_7305052160.pth... [2024-06-23 13:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445237_7294763008.pth [2024-06-23 13:39:45,348][15401] Updated weights for policy 0, policy_version 445870 (0.0032) [2024-06-23 13:39:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7305265152. Throughput: 0: 42682.6. Samples: 7305418660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 13:39:48,903][15401] Updated weights for policy 0, policy_version 445880 (0.0040) [2024-06-23 13:39:52,926][15401] Updated weights for policy 0, policy_version 445890 (0.0038) [2024-06-23 13:39:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 7305461760. Throughput: 0: 42469.8. Samples: 7305544800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 13:39:56,662][15401] Updated weights for policy 0, policy_version 445900 (0.0042) [2024-06-23 13:39:58,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 7305674752. Throughput: 0: 42616.2. Samples: 7305799060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:39:58,396][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 13:40:00,703][15401] Updated weights for policy 0, policy_version 445910 (0.0041) [2024-06-23 13:40:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7305904128. Throughput: 0: 42745.8. Samples: 7306058120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:40:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 13:40:04,306][15401] Updated weights for policy 0, policy_version 445920 (0.0024) [2024-06-23 13:40:07,520][15349] Signal inference workers to stop experience collection... (108300 times) [2024-06-23 13:40:07,522][15349] Signal inference workers to resume experience collection... (108300 times) [2024-06-23 13:40:07,564][15401] InferenceWorker_p0-w0: stopping experience collection (108300 times) [2024-06-23 13:40:07,564][15401] InferenceWorker_p0-w0: resuming experience collection (108300 times) [2024-06-23 13:40:08,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 7306100736. Throughput: 0: 42688.5. Samples: 7306188280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:40:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 13:40:08,433][15401] Updated weights for policy 0, policy_version 445930 (0.0030) [2024-06-23 13:40:11,717][15401] Updated weights for policy 0, policy_version 445940 (0.0045) [2024-06-23 13:40:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7306330112. Throughput: 0: 42771.3. Samples: 7306445300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 13:40:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 13:40:16,202][15401] Updated weights for policy 0, policy_version 445950 (0.0034) [2024-06-23 13:40:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7306543104. Throughput: 0: 42792.6. Samples: 7306701220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:18,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 13:40:19,239][15401] Updated weights for policy 0, policy_version 445960 (0.0033) [2024-06-23 13:40:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7306739712. Throughput: 0: 42731.6. Samples: 7306833540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:40:23,862][15401] Updated weights for policy 0, policy_version 445970 (0.0045) [2024-06-23 13:40:27,107][15401] Updated weights for policy 0, policy_version 445980 (0.0030) [2024-06-23 13:40:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7306952704. Throughput: 0: 42614.2. Samples: 7307080600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 13:40:31,305][15401] Updated weights for policy 0, policy_version 445990 (0.0041) [2024-06-23 13:40:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7307182080. Throughput: 0: 42696.6. Samples: 7307340000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 13:40:35,063][15401] Updated weights for policy 0, policy_version 446000 (0.0029) [2024-06-23 13:40:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7307395072. Throughput: 0: 42816.7. Samples: 7307471540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 13:40:39,312][15401] Updated weights for policy 0, policy_version 446010 (0.0034) [2024-06-23 13:40:42,970][15401] Updated weights for policy 0, policy_version 446020 (0.0035) [2024-06-23 13:40:43,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7307608064. Throughput: 0: 42708.5. Samples: 7307720680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 13:40:46,762][15401] Updated weights for policy 0, policy_version 446030 (0.0029) [2024-06-23 13:40:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7307821056. Throughput: 0: 42683.1. Samples: 7307978860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 13:40:50,679][15401] Updated weights for policy 0, policy_version 446040 (0.0033) [2024-06-23 13:40:53,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 7308034048. Throughput: 0: 42691.9. Samples: 7308109440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:53,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 13:40:54,803][15401] Updated weights for policy 0, policy_version 446050 (0.0026) [2024-06-23 13:40:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 7308230656. Throughput: 0: 42443.1. Samples: 7308355240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:40:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 13:40:58,454][15401] Updated weights for policy 0, policy_version 446060 (0.0036) [2024-06-23 13:41:02,223][15401] Updated weights for policy 0, policy_version 446070 (0.0033) [2024-06-23 13:41:03,390][15132] Fps is (10 sec: 42599.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 7308460032. Throughput: 0: 42576.7. Samples: 7308617180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 13:41:05,965][15401] Updated weights for policy 0, policy_version 446080 (0.0031) [2024-06-23 13:41:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7308673024. Throughput: 0: 42423.6. Samples: 7308742600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 13:41:10,056][15401] Updated weights for policy 0, policy_version 446090 (0.0041) [2024-06-23 13:41:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 7308869632. Throughput: 0: 42633.3. Samples: 7308999100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:41:13,780][15401] Updated weights for policy 0, policy_version 446100 (0.0021) [2024-06-23 13:41:17,917][15401] Updated weights for policy 0, policy_version 446110 (0.0040) [2024-06-23 13:41:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 7309082624. Throughput: 0: 42528.3. Samples: 7309253880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:18,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 13:41:21,346][15401] Updated weights for policy 0, policy_version 446120 (0.0028) [2024-06-23 13:41:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7309279232. Throughput: 0: 42453.8. Samples: 7309381960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 13:41:25,421][15401] Updated weights for policy 0, policy_version 446130 (0.0044) [2024-06-23 13:41:28,396][15132] Fps is (10 sec: 42581.1, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 7309508608. Throughput: 0: 42464.3. Samples: 7309631840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:28,397][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 13:41:29,317][15401] Updated weights for policy 0, policy_version 446140 (0.0033) [2024-06-23 13:41:32,980][15401] Updated weights for policy 0, policy_version 446150 (0.0042) [2024-06-23 13:41:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7309721600. Throughput: 0: 42505.3. Samples: 7309891600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 13:41:36,859][15401] Updated weights for policy 0, policy_version 446160 (0.0036) [2024-06-23 13:41:38,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7309934592. Throughput: 0: 42538.2. Samples: 7310023640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 13:41:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 13:41:40,732][15401] Updated weights for policy 0, policy_version 446170 (0.0033) [2024-06-23 13:41:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7310147584. Throughput: 0: 42769.5. Samples: 7310279880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:41:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 13:41:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446176_7310147584.pth... [2024-06-23 13:41:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445552_7299923968.pth [2024-06-23 13:41:44,310][15349] Signal inference workers to stop experience collection... (108350 times) [2024-06-23 13:41:44,310][15349] Signal inference workers to resume experience collection... (108350 times) [2024-06-23 13:41:44,357][15401] InferenceWorker_p0-w0: stopping experience collection (108350 times) [2024-06-23 13:41:44,358][15401] InferenceWorker_p0-w0: resuming experience collection (108350 times) [2024-06-23 13:41:44,455][15401] Updated weights for policy 0, policy_version 446180 (0.0035) [2024-06-23 13:41:48,237][15401] Updated weights for policy 0, policy_version 446190 (0.0033) [2024-06-23 13:41:48,391][15132] Fps is (10 sec: 44232.0, 60 sec: 42597.6, 300 sec: 42875.9). Total num frames: 7310376960. Throughput: 0: 42675.0. Samples: 7310537600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:41:48,391][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 13:41:52,047][15401] Updated weights for policy 0, policy_version 446200 (0.0038) [2024-06-23 13:41:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 7310589952. Throughput: 0: 42747.5. Samples: 7310666240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:41:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 13:41:56,018][15401] Updated weights for policy 0, policy_version 446210 (0.0033) [2024-06-23 13:41:58,390][15132] Fps is (10 sec: 40964.3, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 7310786560. Throughput: 0: 42723.5. Samples: 7310921660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:41:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 13:42:00,218][15401] Updated weights for policy 0, policy_version 446220 (0.0040) [2024-06-23 13:42:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7310999552. Throughput: 0: 42834.2. Samples: 7311181320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 13:42:03,688][15401] Updated weights for policy 0, policy_version 446230 (0.0035) [2024-06-23 13:42:07,705][15401] Updated weights for policy 0, policy_version 446240 (0.0037) [2024-06-23 13:42:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7311228928. Throughput: 0: 42827.6. Samples: 7311309200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 13:42:11,170][15401] Updated weights for policy 0, policy_version 446250 (0.0037) [2024-06-23 13:42:13,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7311441920. Throughput: 0: 42893.7. Samples: 7311561880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:13,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 13:42:15,221][15401] Updated weights for policy 0, policy_version 446260 (0.0044) [2024-06-23 13:42:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42871.5, 300 sec: 42875.7). Total num frames: 7311654912. Throughput: 0: 42780.8. Samples: 7311816840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:18,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 13:42:18,823][15401] Updated weights for policy 0, policy_version 446270 (0.0034) [2024-06-23 13:42:22,910][15401] Updated weights for policy 0, policy_version 446280 (0.0041) [2024-06-23 13:42:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7311851520. Throughput: 0: 42761.8. Samples: 7311947920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 13:42:26,388][15401] Updated weights for policy 0, policy_version 446290 (0.0040) [2024-06-23 13:42:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 7312080896. Throughput: 0: 42766.8. Samples: 7312204380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:28,395][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 13:42:30,540][15401] Updated weights for policy 0, policy_version 446300 (0.0031) [2024-06-23 13:42:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7312293888. Throughput: 0: 42769.1. Samples: 7312462160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 13:42:33,853][15401] Updated weights for policy 0, policy_version 446310 (0.0026) [2024-06-23 13:42:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 7312490496. Throughput: 0: 42734.7. Samples: 7312589300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 13:42:38,513][15401] Updated weights for policy 0, policy_version 446320 (0.0046) [2024-06-23 13:42:41,560][15401] Updated weights for policy 0, policy_version 446330 (0.0035) [2024-06-23 13:42:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7312736256. Throughput: 0: 42758.3. Samples: 7312845780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 13:42:46,147][15401] Updated weights for policy 0, policy_version 446340 (0.0031) [2024-06-23 13:42:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42599.2, 300 sec: 42765.0). Total num frames: 7312932864. Throughput: 0: 42670.3. Samples: 7313101480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 13:42:49,274][15401] Updated weights for policy 0, policy_version 446350 (0.0033) [2024-06-23 13:42:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7313129472. Throughput: 0: 42538.7. Samples: 7313223440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 13:42:54,263][15401] Updated weights for policy 0, policy_version 446360 (0.0033) [2024-06-23 13:42:56,943][15401] Updated weights for policy 0, policy_version 446370 (0.0031) [2024-06-23 13:42:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7313375232. Throughput: 0: 42544.1. Samples: 7313476260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:42:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 13:43:02,026][15401] Updated weights for policy 0, policy_version 446380 (0.0037) [2024-06-23 13:43:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7313571840. Throughput: 0: 42618.3. Samples: 7313734560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 13:43:03,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 13:43:04,553][15401] Updated weights for policy 0, policy_version 446390 (0.0046) [2024-06-23 13:43:08,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7313752064. Throughput: 0: 42508.4. Samples: 7313860800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:08,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-23 13:43:09,627][15401] Updated weights for policy 0, policy_version 446400 (0.0030) [2024-06-23 13:43:12,742][15401] Updated weights for policy 0, policy_version 446410 (0.0038) [2024-06-23 13:43:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 7313997824. Throughput: 0: 42371.5. Samples: 7314111100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:13,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 13:43:14,001][15349] Signal inference workers to stop experience collection... (108400 times) [2024-06-23 13:43:14,002][15349] Signal inference workers to resume experience collection... (108400 times) [2024-06-23 13:43:14,029][15401] InferenceWorker_p0-w0: stopping experience collection (108400 times) [2024-06-23 13:43:14,029][15401] InferenceWorker_p0-w0: resuming experience collection (108400 times) [2024-06-23 13:43:17,178][15401] Updated weights for policy 0, policy_version 446420 (0.0041) [2024-06-23 13:43:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 7314210816. Throughput: 0: 42518.1. Samples: 7314375480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 13:43:20,441][15401] Updated weights for policy 0, policy_version 446430 (0.0039) [2024-06-23 13:43:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7314423808. Throughput: 0: 42490.6. Samples: 7314501380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 13:43:24,824][15401] Updated weights for policy 0, policy_version 446440 (0.0032) [2024-06-23 13:43:27,925][15401] Updated weights for policy 0, policy_version 446450 (0.0036) [2024-06-23 13:43:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7314653184. Throughput: 0: 42434.6. Samples: 7314755340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 13:43:32,465][15401] Updated weights for policy 0, policy_version 446460 (0.0034) [2024-06-23 13:43:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7314849792. Throughput: 0: 42511.1. Samples: 7315014480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 13:43:35,451][15401] Updated weights for policy 0, policy_version 446470 (0.0045) [2024-06-23 13:43:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7315062784. Throughput: 0: 42601.3. Samples: 7315140500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:38,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 13:43:39,973][15401] Updated weights for policy 0, policy_version 446480 (0.0040) [2024-06-23 13:43:42,949][15401] Updated weights for policy 0, policy_version 446490 (0.0046) [2024-06-23 13:43:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7315292160. Throughput: 0: 42681.2. Samples: 7315396920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 13:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446490_7315292160.pth... [2024-06-23 13:43:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000445865_7305052160.pth [2024-06-23 13:43:47,643][15401] Updated weights for policy 0, policy_version 446500 (0.0032) [2024-06-23 13:43:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7315488768. Throughput: 0: 42759.5. Samples: 7315658740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 13:43:50,861][15401] Updated weights for policy 0, policy_version 446510 (0.0034) [2024-06-23 13:43:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7315718144. Throughput: 0: 42619.1. Samples: 7315778760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:53,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 13:43:55,383][15401] Updated weights for policy 0, policy_version 446520 (0.0032) [2024-06-23 13:43:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7315914752. Throughput: 0: 42826.4. Samples: 7316038280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:43:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 13:43:58,818][15401] Updated weights for policy 0, policy_version 446530 (0.0046) [2024-06-23 13:44:02,955][15401] Updated weights for policy 0, policy_version 446540 (0.0051) [2024-06-23 13:44:03,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 7316127744. Throughput: 0: 42736.0. Samples: 7316298600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:44:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 13:44:06,319][15401] Updated weights for policy 0, policy_version 446550 (0.0029) [2024-06-23 13:44:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7316357120. Throughput: 0: 42704.4. Samples: 7316423080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:44:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 13:44:10,578][15401] Updated weights for policy 0, policy_version 446560 (0.0024) [2024-06-23 13:44:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7316570112. Throughput: 0: 42808.1. Samples: 7316681700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:44:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 13:44:13,908][15401] Updated weights for policy 0, policy_version 446570 (0.0035) [2024-06-23 13:44:18,371][15401] Updated weights for policy 0, policy_version 446580 (0.0024) [2024-06-23 13:44:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7316766720. Throughput: 0: 42986.6. Samples: 7316948880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:44:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 13:44:21,495][15401] Updated weights for policy 0, policy_version 446590 (0.0033) [2024-06-23 13:44:23,393][15132] Fps is (10 sec: 42583.7, 60 sec: 42869.0, 300 sec: 42709.0). Total num frames: 7316996096. Throughput: 0: 42775.0. Samples: 7317065520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 13:44:23,394][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 13:44:25,930][15401] Updated weights for policy 0, policy_version 446600 (0.0041) [2024-06-23 13:44:28,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7317225472. Throughput: 0: 42946.2. Samples: 7317329500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 13:44:29,155][15401] Updated weights for policy 0, policy_version 446610 (0.0029) [2024-06-23 13:44:33,390][15132] Fps is (10 sec: 39334.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7317389312. Throughput: 0: 42919.5. Samples: 7317590120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 13:44:33,591][15401] Updated weights for policy 0, policy_version 446620 (0.0036) [2024-06-23 13:44:36,939][15401] Updated weights for policy 0, policy_version 446630 (0.0024) [2024-06-23 13:44:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7317602304. Throughput: 0: 42930.4. Samples: 7317710520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 13:44:41,197][15401] Updated weights for policy 0, policy_version 446640 (0.0036) [2024-06-23 13:44:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7317848064. Throughput: 0: 42757.2. Samples: 7317962360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:43,390][15132] Avg episode reward: [(0, '0.883')] [2024-06-23 13:44:44,572][15401] Updated weights for policy 0, policy_version 446650 (0.0030) [2024-06-23 13:44:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7318044672. Throughput: 0: 42890.6. Samples: 7318228680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:48,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 13:44:48,784][15401] Updated weights for policy 0, policy_version 446660 (0.0034) [2024-06-23 13:44:50,548][15349] Signal inference workers to stop experience collection... (108450 times) [2024-06-23 13:44:50,548][15349] Signal inference workers to resume experience collection... (108450 times) [2024-06-23 13:44:50,592][15401] InferenceWorker_p0-w0: stopping experience collection (108450 times) [2024-06-23 13:44:50,593][15401] InferenceWorker_p0-w0: resuming experience collection (108450 times) [2024-06-23 13:44:52,266][15401] Updated weights for policy 0, policy_version 446670 (0.0036) [2024-06-23 13:44:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42654.9). Total num frames: 7318257664. Throughput: 0: 42869.8. Samples: 7318352220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 13:44:56,380][15401] Updated weights for policy 0, policy_version 446680 (0.0039) [2024-06-23 13:44:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7318503424. Throughput: 0: 42816.8. Samples: 7318608460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:44:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 13:44:59,993][15401] Updated weights for policy 0, policy_version 446690 (0.0032) [2024-06-23 13:45:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7318683648. Throughput: 0: 42545.3. Samples: 7318863420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 13:45:04,173][15401] Updated weights for policy 0, policy_version 446700 (0.0033) [2024-06-23 13:45:08,080][15401] Updated weights for policy 0, policy_version 446710 (0.0056) [2024-06-23 13:45:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 7318913024. Throughput: 0: 42667.2. Samples: 7318985500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:08,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 13:45:12,065][15401] Updated weights for policy 0, policy_version 446720 (0.0047) [2024-06-23 13:45:13,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7319158784. Throughput: 0: 42683.1. Samples: 7319250240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 13:45:15,721][15401] Updated weights for policy 0, policy_version 446730 (0.0034) [2024-06-23 13:45:18,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 7319322624. Throughput: 0: 42577.8. Samples: 7319506220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:18,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 13:45:19,525][15401] Updated weights for policy 0, policy_version 446740 (0.0045) [2024-06-23 13:45:23,267][15401] Updated weights for policy 0, policy_version 446750 (0.0036) [2024-06-23 13:45:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.9, 300 sec: 42765.0). Total num frames: 7319568384. Throughput: 0: 42707.9. Samples: 7319632380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:23,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 13:45:27,130][15401] Updated weights for policy 0, policy_version 446760 (0.0042) [2024-06-23 13:45:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7319764992. Throughput: 0: 42951.6. Samples: 7319895180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:45:30,834][15401] Updated weights for policy 0, policy_version 446770 (0.0024) [2024-06-23 13:45:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7319977984. Throughput: 0: 42720.1. Samples: 7320151080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 13:45:34,650][15401] Updated weights for policy 0, policy_version 446780 (0.0037) [2024-06-23 13:45:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7320190976. Throughput: 0: 42727.2. Samples: 7320274940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 13:45:38,475][15401] Updated weights for policy 0, policy_version 446790 (0.0045) [2024-06-23 13:45:42,232][15401] Updated weights for policy 0, policy_version 446800 (0.0040) [2024-06-23 13:45:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7320420352. Throughput: 0: 42760.0. Samples: 7320532660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:43,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-23 13:45:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446803_7320420352.pth... [2024-06-23 13:45:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446176_7310147584.pth [2024-06-23 13:45:46,170][15401] Updated weights for policy 0, policy_version 446810 (0.0024) [2024-06-23 13:45:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7320616960. Throughput: 0: 42664.6. Samples: 7320783320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-23 13:45:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 13:45:49,579][15349] Signal inference workers to stop experience collection... (108500 times) [2024-06-23 13:45:49,593][15401] InferenceWorker_p0-w0: stopping experience collection (108500 times) [2024-06-23 13:45:49,643][15349] Signal inference workers to resume experience collection... (108500 times) [2024-06-23 13:45:49,643][15401] InferenceWorker_p0-w0: resuming experience collection (108500 times) [2024-06-23 13:45:50,110][15401] Updated weights for policy 0, policy_version 446820 (0.0042) [2024-06-23 13:45:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7320829952. Throughput: 0: 42765.1. Samples: 7320909820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:45:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 13:45:53,643][15401] Updated weights for policy 0, policy_version 446830 (0.0032) [2024-06-23 13:45:57,568][15401] Updated weights for policy 0, policy_version 446840 (0.0041) [2024-06-23 13:45:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7321042944. Throughput: 0: 42891.2. Samples: 7321180340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:45:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 13:46:01,216][15401] Updated weights for policy 0, policy_version 446850 (0.0043) [2024-06-23 13:46:03,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7321272320. Throughput: 0: 42653.8. Samples: 7321425540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 13:46:05,328][15401] Updated weights for policy 0, policy_version 446860 (0.0030) [2024-06-23 13:46:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 7321468928. Throughput: 0: 42753.9. Samples: 7321556300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 13:46:08,815][15401] Updated weights for policy 0, policy_version 446870 (0.0039) [2024-06-23 13:46:12,996][15401] Updated weights for policy 0, policy_version 446880 (0.0037) [2024-06-23 13:46:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42709.8). Total num frames: 7321681920. Throughput: 0: 42679.6. Samples: 7321815760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 13:46:16,898][15401] Updated weights for policy 0, policy_version 446890 (0.0048) [2024-06-23 13:46:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 7321911296. Throughput: 0: 42512.8. Samples: 7322064160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 13:46:20,939][15401] Updated weights for policy 0, policy_version 446900 (0.0028) [2024-06-23 13:46:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 7322107904. Throughput: 0: 42597.2. Samples: 7322191820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 13:46:24,500][15401] Updated weights for policy 0, policy_version 446910 (0.0034) [2024-06-23 13:46:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7322304512. Throughput: 0: 42534.3. Samples: 7322446700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 13:46:28,658][15401] Updated weights for policy 0, policy_version 446920 (0.0032) [2024-06-23 13:46:32,095][15401] Updated weights for policy 0, policy_version 446930 (0.0023) [2024-06-23 13:46:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7322550272. Throughput: 0: 42575.0. Samples: 7322699300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:33,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 13:46:36,533][15401] Updated weights for policy 0, policy_version 446940 (0.0038) [2024-06-23 13:46:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7322730496. Throughput: 0: 42721.2. Samples: 7322832280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 13:46:39,646][15401] Updated weights for policy 0, policy_version 446950 (0.0035) [2024-06-23 13:46:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 7322959872. Throughput: 0: 42241.2. Samples: 7323081200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 13:46:44,315][15401] Updated weights for policy 0, policy_version 446960 (0.0031) [2024-06-23 13:46:47,574][15401] Updated weights for policy 0, policy_version 446970 (0.0027) [2024-06-23 13:46:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7323172864. Throughput: 0: 42520.4. Samples: 7323339060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:48,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 13:46:51,853][15401] Updated weights for policy 0, policy_version 446980 (0.0045) [2024-06-23 13:46:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7323385856. Throughput: 0: 42520.0. Samples: 7323469700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 13:46:55,266][15401] Updated weights for policy 0, policy_version 446990 (0.0031) [2024-06-23 13:46:58,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7323582464. Throughput: 0: 42342.5. Samples: 7323721180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:46:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 13:46:59,421][15401] Updated weights for policy 0, policy_version 447000 (0.0034) [2024-06-23 13:47:02,064][15349] Signal inference workers to stop experience collection... (108550 times) [2024-06-23 13:47:02,064][15349] Signal inference workers to resume experience collection... (108550 times) [2024-06-23 13:47:02,090][15401] InferenceWorker_p0-w0: stopping experience collection (108550 times) [2024-06-23 13:47:02,090][15401] InferenceWorker_p0-w0: resuming experience collection (108550 times) [2024-06-23 13:47:02,929][15401] Updated weights for policy 0, policy_version 447010 (0.0043) [2024-06-23 13:47:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7323828224. Throughput: 0: 42428.4. Samples: 7323973440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:47:03,395][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 13:47:07,130][15401] Updated weights for policy 0, policy_version 447020 (0.0034) [2024-06-23 13:47:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 7324024832. Throughput: 0: 42558.3. Samples: 7324106940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:47:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 13:47:10,690][15401] Updated weights for policy 0, policy_version 447030 (0.0032) [2024-06-23 13:47:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42653.9). Total num frames: 7324237824. Throughput: 0: 42579.9. Samples: 7324362900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 13:47:13,393][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 13:47:14,767][15401] Updated weights for policy 0, policy_version 447040 (0.0039) [2024-06-23 13:47:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7324450816. Throughput: 0: 42596.5. Samples: 7324616040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 13:47:18,407][15401] Updated weights for policy 0, policy_version 447050 (0.0027) [2024-06-23 13:47:22,524][15401] Updated weights for policy 0, policy_version 447060 (0.0037) [2024-06-23 13:47:23,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7324663808. Throughput: 0: 42471.0. Samples: 7324743580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:23,393][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 13:47:26,152][15401] Updated weights for policy 0, policy_version 447070 (0.0029) [2024-06-23 13:47:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7324860416. Throughput: 0: 42566.3. Samples: 7324996680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:28,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 13:47:30,243][15401] Updated weights for policy 0, policy_version 447080 (0.0040) [2024-06-23 13:47:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 7325089792. Throughput: 0: 42406.2. Samples: 7325247240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 13:47:33,721][15401] Updated weights for policy 0, policy_version 447090 (0.0033) [2024-06-23 13:47:38,048][15401] Updated weights for policy 0, policy_version 447100 (0.0037) [2024-06-23 13:47:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7325286400. Throughput: 0: 42356.0. Samples: 7325375720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:38,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 13:47:41,442][15401] Updated weights for policy 0, policy_version 447110 (0.0036) [2024-06-23 13:47:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7325499392. Throughput: 0: 42321.0. Samples: 7325625620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 13:47:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447113_7325499392.pth... [2024-06-23 13:47:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446490_7315292160.pth [2024-06-23 13:47:45,958][15401] Updated weights for policy 0, policy_version 447120 (0.0029) [2024-06-23 13:47:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 7325712384. Throughput: 0: 42327.7. Samples: 7325878180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 13:47:49,445][15401] Updated weights for policy 0, policy_version 447130 (0.0030) [2024-06-23 13:47:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7325925376. Throughput: 0: 42308.8. Samples: 7326010840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 13:47:53,480][15401] Updated weights for policy 0, policy_version 447140 (0.0041) [2024-06-23 13:47:57,015][15401] Updated weights for policy 0, policy_version 447150 (0.0038) [2024-06-23 13:47:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7326121984. Throughput: 0: 42223.2. Samples: 7326262840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:47:58,392][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 13:48:01,075][15401] Updated weights for policy 0, policy_version 447160 (0.0040) [2024-06-23 13:48:02,230][15349] Signal inference workers to stop experience collection... (108600 times) [2024-06-23 13:48:02,231][15349] Signal inference workers to resume experience collection... (108600 times) [2024-06-23 13:48:02,245][15401] InferenceWorker_p0-w0: stopping experience collection (108600 times) [2024-06-23 13:48:02,274][15401] InferenceWorker_p0-w0: resuming experience collection (108600 times) [2024-06-23 13:48:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7326367744. Throughput: 0: 42383.5. Samples: 7326523300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:48:04,627][15401] Updated weights for policy 0, policy_version 447170 (0.0024) [2024-06-23 13:48:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7326564352. Throughput: 0: 42419.7. Samples: 7326652360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 13:48:08,869][15401] Updated weights for policy 0, policy_version 447180 (0.0041) [2024-06-23 13:48:12,341][15401] Updated weights for policy 0, policy_version 447190 (0.0032) [2024-06-23 13:48:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 7326777344. Throughput: 0: 42409.6. Samples: 7326905120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:13,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 13:48:16,349][15401] Updated weights for policy 0, policy_version 447200 (0.0047) [2024-06-23 13:48:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7327006720. Throughput: 0: 42492.9. Samples: 7327159420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 13:48:20,328][15401] Updated weights for policy 0, policy_version 447210 (0.0023) [2024-06-23 13:48:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 7327186944. Throughput: 0: 42642.6. Samples: 7327294640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:23,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-23 13:48:23,962][15401] Updated weights for policy 0, policy_version 447220 (0.0026) [2024-06-23 13:48:27,862][15401] Updated weights for policy 0, policy_version 447230 (0.0036) [2024-06-23 13:48:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7327432704. Throughput: 0: 42693.8. Samples: 7327546840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:28,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 13:48:31,752][15401] Updated weights for policy 0, policy_version 447240 (0.0031) [2024-06-23 13:48:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7327645696. Throughput: 0: 42750.1. Samples: 7327801940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-23 13:48:33,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 13:48:35,639][15401] Updated weights for policy 0, policy_version 447250 (0.0036) [2024-06-23 13:48:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7327825920. Throughput: 0: 42646.7. Samples: 7327929940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:48:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 13:48:39,527][15401] Updated weights for policy 0, policy_version 447260 (0.0036) [2024-06-23 13:48:43,137][15401] Updated weights for policy 0, policy_version 447270 (0.0039) [2024-06-23 13:48:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7328071680. Throughput: 0: 42894.2. Samples: 7328193080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:48:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 13:48:47,388][15401] Updated weights for policy 0, policy_version 447280 (0.0031) [2024-06-23 13:48:48,390][15132] Fps is (10 sec: 47512.1, 60 sec: 43144.3, 300 sec: 42654.3). Total num frames: 7328301056. Throughput: 0: 42675.8. Samples: 7328443720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:48:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 13:48:50,762][15401] Updated weights for policy 0, policy_version 447290 (0.0040) [2024-06-23 13:48:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7328481280. Throughput: 0: 42620.8. Samples: 7328570300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:48:53,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 13:48:54,925][15401] Updated weights for policy 0, policy_version 447300 (0.0027) [2024-06-23 13:48:58,341][15401] Updated weights for policy 0, policy_version 447310 (0.0035) [2024-06-23 13:48:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7328727040. Throughput: 0: 42799.7. Samples: 7328831100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:48:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 13:49:02,564][15401] Updated weights for policy 0, policy_version 447320 (0.0027) [2024-06-23 13:49:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7328923648. Throughput: 0: 42750.7. Samples: 7329083200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:03,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 13:49:05,937][15401] Updated weights for policy 0, policy_version 447330 (0.0037) [2024-06-23 13:49:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 7329136640. Throughput: 0: 42548.4. Samples: 7329209320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 13:49:10,177][15401] Updated weights for policy 0, policy_version 447340 (0.0032) [2024-06-23 13:49:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7329366016. Throughput: 0: 42719.1. Samples: 7329469200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:13,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 13:49:13,645][15401] Updated weights for policy 0, policy_version 447350 (0.0031) [2024-06-23 13:49:17,907][15401] Updated weights for policy 0, policy_version 447360 (0.0029) [2024-06-23 13:49:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42543.3). Total num frames: 7329546240. Throughput: 0: 42724.8. Samples: 7329724560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 13:49:21,617][15401] Updated weights for policy 0, policy_version 447370 (0.0037) [2024-06-23 13:49:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 7329775616. Throughput: 0: 42684.0. Samples: 7329850720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 13:49:25,765][15401] Updated weights for policy 0, policy_version 447380 (0.0046) [2024-06-23 13:49:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7330004992. Throughput: 0: 42596.5. Samples: 7330109920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 13:49:29,089][15401] Updated weights for policy 0, policy_version 447390 (0.0027) [2024-06-23 13:49:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7330185216. Throughput: 0: 42841.6. Samples: 7330371580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 13:49:33,413][15401] Updated weights for policy 0, policy_version 447400 (0.0032) [2024-06-23 13:49:34,888][15349] Signal inference workers to stop experience collection... (108650 times) [2024-06-23 13:49:34,933][15401] InferenceWorker_p0-w0: stopping experience collection (108650 times) [2024-06-23 13:49:34,943][15349] Signal inference workers to resume experience collection... (108650 times) [2024-06-23 13:49:34,956][15401] InferenceWorker_p0-w0: resuming experience collection (108650 times) [2024-06-23 13:49:36,596][15401] Updated weights for policy 0, policy_version 447410 (0.0031) [2024-06-23 13:49:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7330414592. Throughput: 0: 42770.6. Samples: 7330494980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 13:49:40,930][15401] Updated weights for policy 0, policy_version 447420 (0.0028) [2024-06-23 13:49:43,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 7330627584. Throughput: 0: 42812.2. Samples: 7330757660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 13:49:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447426_7330627584.pth... [2024-06-23 13:49:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000446803_7320420352.pth [2024-06-23 13:49:44,171][15401] Updated weights for policy 0, policy_version 447430 (0.0038) [2024-06-23 13:49:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7330824192. Throughput: 0: 42822.6. Samples: 7331010220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 13:49:48,803][15401] Updated weights for policy 0, policy_version 447440 (0.0042) [2024-06-23 13:49:51,910][15401] Updated weights for policy 0, policy_version 447450 (0.0028) [2024-06-23 13:49:53,390][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7331069952. Throughput: 0: 42912.1. Samples: 7331140360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 13:49:56,333][15401] Updated weights for policy 0, policy_version 447460 (0.0036) [2024-06-23 13:49:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7331266560. Throughput: 0: 42804.4. Samples: 7331395400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 13:49:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 13:49:59,838][15401] Updated weights for policy 0, policy_version 447470 (0.0031) [2024-06-23 13:50:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 7331479552. Throughput: 0: 42768.1. Samples: 7331649120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 13:50:03,875][15401] Updated weights for policy 0, policy_version 447480 (0.0046) [2024-06-23 13:50:07,594][15401] Updated weights for policy 0, policy_version 447490 (0.0047) [2024-06-23 13:50:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7331692544. Throughput: 0: 42715.4. Samples: 7331772920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 13:50:12,244][15401] Updated weights for policy 0, policy_version 447500 (0.0029) [2024-06-23 13:50:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 7331921920. Throughput: 0: 42805.6. Samples: 7332036180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:13,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 13:50:15,119][15401] Updated weights for policy 0, policy_version 447510 (0.0043) [2024-06-23 13:50:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 7332118528. Throughput: 0: 42574.2. Samples: 7332287420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:18,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 13:50:19,753][15401] Updated weights for policy 0, policy_version 447520 (0.0034) [2024-06-23 13:50:22,909][15401] Updated weights for policy 0, policy_version 447530 (0.0037) [2024-06-23 13:50:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7332331520. Throughput: 0: 42554.3. Samples: 7332409920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 13:50:27,273][15401] Updated weights for policy 0, policy_version 447540 (0.0026) [2024-06-23 13:50:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7332560896. Throughput: 0: 42550.0. Samples: 7332672500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:28,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 13:50:30,767][15401] Updated weights for policy 0, policy_version 447550 (0.0041) [2024-06-23 13:50:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 7332757504. Throughput: 0: 42499.0. Samples: 7332922680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 13:50:35,018][15401] Updated weights for policy 0, policy_version 447560 (0.0044) [2024-06-23 13:50:38,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7332970496. Throughput: 0: 42536.0. Samples: 7333054480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 13:50:38,434][15401] Updated weights for policy 0, policy_version 447570 (0.0040) [2024-06-23 13:50:42,613][15401] Updated weights for policy 0, policy_version 447580 (0.0021) [2024-06-23 13:50:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7333183488. Throughput: 0: 42561.8. Samples: 7333310680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 13:50:46,111][15401] Updated weights for policy 0, policy_version 447590 (0.0039) [2024-06-23 13:50:46,128][15349] Signal inference workers to stop experience collection... (108700 times) [2024-06-23 13:50:46,128][15349] Signal inference workers to resume experience collection... (108700 times) [2024-06-23 13:50:46,145][15401] InferenceWorker_p0-w0: stopping experience collection (108700 times) [2024-06-23 13:50:46,145][15401] InferenceWorker_p0-w0: resuming experience collection (108700 times) [2024-06-23 13:50:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7333396480. Throughput: 0: 42498.3. Samples: 7333561540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 13:50:50,124][15401] Updated weights for policy 0, policy_version 447600 (0.0038) [2024-06-23 13:50:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7333609472. Throughput: 0: 42669.5. Samples: 7333693040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 13:50:53,741][15401] Updated weights for policy 0, policy_version 447610 (0.0027) [2024-06-23 13:50:57,684][15401] Updated weights for policy 0, policy_version 447620 (0.0045) [2024-06-23 13:50:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7333822464. Throughput: 0: 42552.1. Samples: 7333951020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:50:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 13:51:01,373][15401] Updated weights for policy 0, policy_version 447630 (0.0048) [2024-06-23 13:51:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7334035456. Throughput: 0: 42634.1. Samples: 7334205960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:51:03,394][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 13:51:05,511][15401] Updated weights for policy 0, policy_version 447640 (0.0032) [2024-06-23 13:51:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7334248448. Throughput: 0: 42747.5. Samples: 7334333560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:51:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 13:51:08,954][15401] Updated weights for policy 0, policy_version 447650 (0.0035) [2024-06-23 13:51:13,295][15401] Updated weights for policy 0, policy_version 447660 (0.0038) [2024-06-23 13:51:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 7334461440. Throughput: 0: 42559.9. Samples: 7334587700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:51:13,393][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 13:51:16,743][15401] Updated weights for policy 0, policy_version 447670 (0.0043) [2024-06-23 13:51:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 7334674432. Throughput: 0: 42587.2. Samples: 7334839200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:51:18,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 13:51:21,124][15401] Updated weights for policy 0, policy_version 447680 (0.0028) [2024-06-23 13:51:23,390][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7334887424. Throughput: 0: 42541.4. Samples: 7334968840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 13:51:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 13:51:24,674][15401] Updated weights for policy 0, policy_version 447690 (0.0037) [2024-06-23 13:51:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42327.0, 300 sec: 42543.2). Total num frames: 7335100416. Throughput: 0: 42494.2. Samples: 7335222920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 13:51:28,595][15401] Updated weights for policy 0, policy_version 447700 (0.0023) [2024-06-23 13:51:32,341][15401] Updated weights for policy 0, policy_version 447710 (0.0032) [2024-06-23 13:51:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7335329792. Throughput: 0: 42625.6. Samples: 7335479700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 13:51:36,170][15401] Updated weights for policy 0, policy_version 447720 (0.0038) [2024-06-23 13:51:38,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7335542784. Throughput: 0: 42537.7. Samples: 7335607340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:38,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 13:51:40,130][15401] Updated weights for policy 0, policy_version 447730 (0.0039) [2024-06-23 13:51:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 7335739392. Throughput: 0: 42458.1. Samples: 7335861640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 13:51:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447738_7335739392.pth... [2024-06-23 13:51:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447113_7325499392.pth [2024-06-23 13:51:43,882][15401] Updated weights for policy 0, policy_version 447740 (0.0032) [2024-06-23 13:51:47,804][15401] Updated weights for policy 0, policy_version 447750 (0.0038) [2024-06-23 13:51:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7335968768. Throughput: 0: 42389.8. Samples: 7336113500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 13:51:51,566][15401] Updated weights for policy 0, policy_version 447760 (0.0032) [2024-06-23 13:51:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7336148992. Throughput: 0: 42430.7. Samples: 7336242940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 13:51:55,506][15401] Updated weights for policy 0, policy_version 447770 (0.0030) [2024-06-23 13:51:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7336394752. Throughput: 0: 42372.9. Samples: 7336494380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:51:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 13:51:59,301][15401] Updated weights for policy 0, policy_version 447780 (0.0038) [2024-06-23 13:52:03,157][15401] Updated weights for policy 0, policy_version 447790 (0.0040) [2024-06-23 13:52:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7336591360. Throughput: 0: 42479.1. Samples: 7336750660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 13:52:06,893][15401] Updated weights for policy 0, policy_version 447800 (0.0043) [2024-06-23 13:52:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 7336787968. Throughput: 0: 42376.1. Samples: 7336875760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 13:52:10,418][15349] Signal inference workers to stop experience collection... (108750 times) [2024-06-23 13:52:10,458][15401] InferenceWorker_p0-w0: stopping experience collection (108750 times) [2024-06-23 13:52:10,478][15349] Signal inference workers to resume experience collection... (108750 times) [2024-06-23 13:52:10,484][15401] InferenceWorker_p0-w0: resuming experience collection (108750 times) [2024-06-23 13:52:10,767][15401] Updated weights for policy 0, policy_version 447810 (0.0030) [2024-06-23 13:52:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42598.4, 300 sec: 42598.0). Total num frames: 7337017344. Throughput: 0: 42452.0. Samples: 7337133360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:13,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 13:52:14,495][15401] Updated weights for policy 0, policy_version 447820 (0.0036) [2024-06-23 13:52:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42600.0, 300 sec: 42598.7). Total num frames: 7337230336. Throughput: 0: 42387.1. Samples: 7337387120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 13:52:18,556][15401] Updated weights for policy 0, policy_version 447830 (0.0038) [2024-06-23 13:52:22,233][15401] Updated weights for policy 0, policy_version 447840 (0.0051) [2024-06-23 13:52:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7337426944. Throughput: 0: 42298.2. Samples: 7337510660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 13:52:26,363][15401] Updated weights for policy 0, policy_version 447850 (0.0028) [2024-06-23 13:52:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7337639936. Throughput: 0: 42321.0. Samples: 7337766080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 13:52:29,823][15401] Updated weights for policy 0, policy_version 447860 (0.0034) [2024-06-23 13:52:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 7337836544. Throughput: 0: 42511.0. Samples: 7338026500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 13:52:34,103][15401] Updated weights for policy 0, policy_version 447870 (0.0029) [2024-06-23 13:52:37,482][15401] Updated weights for policy 0, policy_version 447880 (0.0043) [2024-06-23 13:52:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 7338082304. Throughput: 0: 42371.1. Samples: 7338149640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 13:52:41,895][15401] Updated weights for policy 0, policy_version 447890 (0.0040) [2024-06-23 13:52:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7338278912. Throughput: 0: 42546.2. Samples: 7338408960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 13:52:45,047][15401] Updated weights for policy 0, policy_version 447900 (0.0037) [2024-06-23 13:52:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7338491904. Throughput: 0: 42654.7. Samples: 7338670120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 13:52:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 13:52:49,986][15401] Updated weights for policy 0, policy_version 447910 (0.0040) [2024-06-23 13:52:52,549][15401] Updated weights for policy 0, policy_version 447920 (0.0038) [2024-06-23 13:52:53,396][15132] Fps is (10 sec: 45846.4, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 7338737664. Throughput: 0: 42573.9. Samples: 7338791860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:52:53,397][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 13:52:57,513][15401] Updated weights for policy 0, policy_version 447930 (0.0034) [2024-06-23 13:52:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7338917888. Throughput: 0: 42698.2. Samples: 7339054680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:52:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 13:53:00,131][15401] Updated weights for policy 0, policy_version 447940 (0.0028) [2024-06-23 13:53:03,390][15132] Fps is (10 sec: 39346.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7339130880. Throughput: 0: 42737.3. Samples: 7339310300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 13:53:05,722][15401] Updated weights for policy 0, policy_version 447950 (0.0034) [2024-06-23 13:53:07,678][15401] Updated weights for policy 0, policy_version 447960 (0.0036) [2024-06-23 13:53:08,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7339393024. Throughput: 0: 42834.3. Samples: 7339438200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 13:53:13,132][15401] Updated weights for policy 0, policy_version 447970 (0.0058) [2024-06-23 13:53:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 7339556864. Throughput: 0: 42947.4. Samples: 7339698820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:13,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 13:53:15,283][15401] Updated weights for policy 0, policy_version 447980 (0.0030) [2024-06-23 13:53:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7339786240. Throughput: 0: 42746.0. Samples: 7339950060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 13:53:20,533][15401] Updated weights for policy 0, policy_version 447990 (0.0033) [2024-06-23 13:53:22,767][15349] Signal inference workers to stop experience collection... (108800 times) [2024-06-23 13:53:22,825][15401] InferenceWorker_p0-w0: stopping experience collection (108800 times) [2024-06-23 13:53:22,880][15349] Signal inference workers to resume experience collection... (108800 times) [2024-06-23 13:53:22,881][15401] InferenceWorker_p0-w0: resuming experience collection (108800 times) [2024-06-23 13:53:23,343][15401] Updated weights for policy 0, policy_version 448000 (0.0045) [2024-06-23 13:53:23,389][15132] Fps is (10 sec: 47525.5, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 7340032000. Throughput: 0: 42860.9. Samples: 7340078380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 13:53:28,135][15401] Updated weights for policy 0, policy_version 448010 (0.0038) [2024-06-23 13:53:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7340212224. Throughput: 0: 42916.5. Samples: 7340340200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 13:53:31,037][15401] Updated weights for policy 0, policy_version 448020 (0.0040) [2024-06-23 13:53:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7340425216. Throughput: 0: 42674.2. Samples: 7340590460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 13:53:35,736][15401] Updated weights for policy 0, policy_version 448030 (0.0036) [2024-06-23 13:53:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7340654592. Throughput: 0: 42887.9. Samples: 7340721540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 13:53:38,767][15401] Updated weights for policy 0, policy_version 448040 (0.0033) [2024-06-23 13:53:43,219][15401] Updated weights for policy 0, policy_version 448050 (0.0025) [2024-06-23 13:53:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7340851200. Throughput: 0: 42903.6. Samples: 7340985340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 13:53:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448051_7340867584.pth... [2024-06-23 13:53:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447426_7330627584.pth [2024-06-23 13:53:46,406][15401] Updated weights for policy 0, policy_version 448060 (0.0029) [2024-06-23 13:53:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7341080576. Throughput: 0: 42794.8. Samples: 7341236060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:48,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 13:53:50,705][15401] Updated weights for policy 0, policy_version 448070 (0.0061) [2024-06-23 13:53:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 7341293568. Throughput: 0: 42911.5. Samples: 7341369220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 13:53:54,059][15401] Updated weights for policy 0, policy_version 448080 (0.0038) [2024-06-23 13:53:58,183][15401] Updated weights for policy 0, policy_version 448090 (0.0035) [2024-06-23 13:53:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7341506560. Throughput: 0: 42864.6. Samples: 7341627620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:53:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 13:54:02,029][15401] Updated weights for policy 0, policy_version 448100 (0.0044) [2024-06-23 13:54:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7341719552. Throughput: 0: 42912.4. Samples: 7341881120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:54:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 13:54:05,643][15401] Updated weights for policy 0, policy_version 448110 (0.0025) [2024-06-23 13:54:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7341932544. Throughput: 0: 42916.0. Samples: 7342009600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-23 13:54:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 13:54:09,927][15401] Updated weights for policy 0, policy_version 448120 (0.0022) [2024-06-23 13:54:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43146.4, 300 sec: 42709.5). Total num frames: 7342145536. Throughput: 0: 42766.8. Samples: 7342264700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 13:54:13,852][15401] Updated weights for policy 0, policy_version 448130 (0.0042) [2024-06-23 13:54:17,429][15401] Updated weights for policy 0, policy_version 448140 (0.0042) [2024-06-23 13:54:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7342358528. Throughput: 0: 43008.8. Samples: 7342525860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 13:54:21,310][15401] Updated weights for policy 0, policy_version 448150 (0.0034) [2024-06-23 13:54:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7342571520. Throughput: 0: 42950.6. Samples: 7342654320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:23,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 13:54:24,979][15401] Updated weights for policy 0, policy_version 448160 (0.0041) [2024-06-23 13:54:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7342784512. Throughput: 0: 42688.1. Samples: 7342906300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 13:54:28,807][15401] Updated weights for policy 0, policy_version 448170 (0.0023) [2024-06-23 13:54:32,883][15401] Updated weights for policy 0, policy_version 448180 (0.0037) [2024-06-23 13:54:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7342997504. Throughput: 0: 42909.4. Samples: 7343166980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 13:54:36,524][15401] Updated weights for policy 0, policy_version 448190 (0.0040) [2024-06-23 13:54:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7343210496. Throughput: 0: 42772.1. Samples: 7343293960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 13:54:40,514][15401] Updated weights for policy 0, policy_version 448200 (0.0038) [2024-06-23 13:54:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7343439872. Throughput: 0: 42574.1. Samples: 7343543460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 13:54:44,457][15401] Updated weights for policy 0, policy_version 448210 (0.0023) [2024-06-23 13:54:48,240][15401] Updated weights for policy 0, policy_version 448220 (0.0030) [2024-06-23 13:54:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7343636480. Throughput: 0: 42765.9. Samples: 7343805580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 13:54:51,936][15401] Updated weights for policy 0, policy_version 448230 (0.0052) [2024-06-23 13:54:53,392][15132] Fps is (10 sec: 40948.2, 60 sec: 42596.3, 300 sec: 42653.5). Total num frames: 7343849472. Throughput: 0: 42692.3. Samples: 7343930880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:53,393][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 13:54:55,833][15401] Updated weights for policy 0, policy_version 448240 (0.0031) [2024-06-23 13:54:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7344078848. Throughput: 0: 42720.9. Samples: 7344187140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:54:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 13:54:59,795][15349] Signal inference workers to stop experience collection... (108850 times) [2024-06-23 13:54:59,795][15349] Signal inference workers to resume experience collection... (108850 times) [2024-06-23 13:54:59,809][15401] InferenceWorker_p0-w0: stopping experience collection (108850 times) [2024-06-23 13:54:59,822][15401] InferenceWorker_p0-w0: resuming experience collection (108850 times) [2024-06-23 13:54:59,956][15401] Updated weights for policy 0, policy_version 448250 (0.0043) [2024-06-23 13:55:03,389][15132] Fps is (10 sec: 42611.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7344275456. Throughput: 0: 42685.1. Samples: 7344446680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 13:55:03,478][15401] Updated weights for policy 0, policy_version 448260 (0.0032) [2024-06-23 13:55:07,679][15401] Updated weights for policy 0, policy_version 448270 (0.0048) [2024-06-23 13:55:08,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 7344472064. Throughput: 0: 42548.9. Samples: 7344569120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:08,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 13:55:11,231][15401] Updated weights for policy 0, policy_version 448280 (0.0029) [2024-06-23 13:55:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 7344701440. Throughput: 0: 42518.5. Samples: 7344819640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 13:55:15,145][15401] Updated weights for policy 0, policy_version 448290 (0.0038) [2024-06-23 13:55:18,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7344914432. Throughput: 0: 42553.2. Samples: 7345081880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 13:55:18,964][15401] Updated weights for policy 0, policy_version 448300 (0.0027) [2024-06-23 13:55:23,093][15401] Updated weights for policy 0, policy_version 448310 (0.0034) [2024-06-23 13:55:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 7345111040. Throughput: 0: 42494.6. Samples: 7345206220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 13:55:26,833][15401] Updated weights for policy 0, policy_version 448320 (0.0034) [2024-06-23 13:55:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7345356800. Throughput: 0: 42660.8. Samples: 7345463200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 13:55:30,857][15401] Updated weights for policy 0, policy_version 448330 (0.0038) [2024-06-23 13:55:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7345553408. Throughput: 0: 42443.5. Samples: 7345715540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 13:55:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 13:55:34,412][15401] Updated weights for policy 0, policy_version 448340 (0.0048) [2024-06-23 13:55:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7345750016. Throughput: 0: 42487.1. Samples: 7345842680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:55:38,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 13:55:38,708][15401] Updated weights for policy 0, policy_version 448350 (0.0034) [2024-06-23 13:55:42,023][15401] Updated weights for policy 0, policy_version 448360 (0.0042) [2024-06-23 13:55:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7345963008. Throughput: 0: 42430.6. Samples: 7346096520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:55:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 13:55:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448363_7345979392.pth... [2024-06-23 13:55:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000447738_7335739392.pth [2024-06-23 13:55:46,261][15401] Updated weights for policy 0, policy_version 448370 (0.0032) [2024-06-23 13:55:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7346192384. Throughput: 0: 42383.6. Samples: 7346353940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:55:48,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 13:55:49,602][15401] Updated weights for policy 0, policy_version 448380 (0.0031) [2024-06-23 13:55:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.3, 300 sec: 42598.4). Total num frames: 7346388992. Throughput: 0: 42518.2. Samples: 7346482340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:55:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 13:55:53,771][15401] Updated weights for policy 0, policy_version 448390 (0.0034) [2024-06-23 13:55:57,175][15401] Updated weights for policy 0, policy_version 448400 (0.0028) [2024-06-23 13:55:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 7346601984. Throughput: 0: 42444.4. Samples: 7346729640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:55:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 13:56:01,431][15401] Updated weights for policy 0, policy_version 448410 (0.0045) [2024-06-23 13:56:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7346831360. Throughput: 0: 42491.2. Samples: 7346993980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 13:56:04,613][15401] Updated weights for policy 0, policy_version 448420 (0.0026) [2024-06-23 13:56:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.1, 300 sec: 42598.8). Total num frames: 7347027968. Throughput: 0: 42579.1. Samples: 7347122280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 13:56:09,029][15401] Updated weights for policy 0, policy_version 448430 (0.0037) [2024-06-23 13:56:12,588][15401] Updated weights for policy 0, policy_version 448440 (0.0034) [2024-06-23 13:56:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 7347240960. Throughput: 0: 42284.9. Samples: 7347366020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 13:56:16,767][15401] Updated weights for policy 0, policy_version 448450 (0.0038) [2024-06-23 13:56:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7347453952. Throughput: 0: 42464.3. Samples: 7347626440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 13:56:20,189][15401] Updated weights for policy 0, policy_version 448460 (0.0031) [2024-06-23 13:56:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7347666944. Throughput: 0: 42492.0. Samples: 7347754820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 13:56:24,549][15401] Updated weights for policy 0, policy_version 448470 (0.0038) [2024-06-23 13:56:26,890][15349] Signal inference workers to stop experience collection... (108900 times) [2024-06-23 13:56:26,892][15349] Signal inference workers to resume experience collection... (108900 times) [2024-06-23 13:56:26,914][15401] InferenceWorker_p0-w0: stopping experience collection (108900 times) [2024-06-23 13:56:26,914][15401] InferenceWorker_p0-w0: resuming experience collection (108900 times) [2024-06-23 13:56:27,763][15401] Updated weights for policy 0, policy_version 448480 (0.0024) [2024-06-23 13:56:28,393][15132] Fps is (10 sec: 44219.8, 60 sec: 42322.6, 300 sec: 42597.8). Total num frames: 7347896320. Throughput: 0: 42345.2. Samples: 7348002220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:28,394][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:56:32,463][15401] Updated weights for policy 0, policy_version 448490 (0.0034) [2024-06-23 13:56:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 7348092928. Throughput: 0: 42486.6. Samples: 7348265840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 13:56:35,820][15401] Updated weights for policy 0, policy_version 448500 (0.0028) [2024-06-23 13:56:38,390][15132] Fps is (10 sec: 40975.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7348305920. Throughput: 0: 42431.0. Samples: 7348391740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:38,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-23 13:56:40,029][15401] Updated weights for policy 0, policy_version 448510 (0.0043) [2024-06-23 13:56:43,218][15401] Updated weights for policy 0, policy_version 448520 (0.0033) [2024-06-23 13:56:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7348551680. Throughput: 0: 42614.3. Samples: 7348647280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 13:56:47,535][15401] Updated weights for policy 0, policy_version 448530 (0.0026) [2024-06-23 13:56:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7348748288. Throughput: 0: 42420.9. Samples: 7348902920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 13:56:51,144][15401] Updated weights for policy 0, policy_version 448540 (0.0042) [2024-06-23 13:56:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7348944896. Throughput: 0: 42386.7. Samples: 7349029680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 13:56:55,095][15401] Updated weights for policy 0, policy_version 448550 (0.0042) [2024-06-23 13:56:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7349174272. Throughput: 0: 42761.9. Samples: 7349290300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 13:56:58,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 13:56:58,862][15401] Updated weights for policy 0, policy_version 448560 (0.0039) [2024-06-23 13:57:02,837][15401] Updated weights for policy 0, policy_version 448570 (0.0043) [2024-06-23 13:57:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7349387264. Throughput: 0: 42532.1. Samples: 7349540380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 13:57:06,704][15401] Updated weights for policy 0, policy_version 448580 (0.0045) [2024-06-23 13:57:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 7349583872. Throughput: 0: 42536.9. Samples: 7349668980. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 13:57:10,556][15401] Updated weights for policy 0, policy_version 448590 (0.0033) [2024-06-23 13:57:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7349796864. Throughput: 0: 42694.0. Samples: 7349923280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 13:57:14,419][15401] Updated weights for policy 0, policy_version 448600 (0.0032) [2024-06-23 13:57:18,382][15401] Updated weights for policy 0, policy_version 448610 (0.0031) [2024-06-23 13:57:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7350026240. Throughput: 0: 42462.7. Samples: 7350176660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 13:57:22,192][15401] Updated weights for policy 0, policy_version 448620 (0.0035) [2024-06-23 13:57:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7350206464. Throughput: 0: 42501.5. Samples: 7350304300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 13:57:25,952][15401] Updated weights for policy 0, policy_version 448630 (0.0043) [2024-06-23 13:57:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42328.1, 300 sec: 42709.5). Total num frames: 7350435840. Throughput: 0: 42431.1. Samples: 7350556680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 13:57:29,921][15401] Updated weights for policy 0, policy_version 448640 (0.0041) [2024-06-23 13:57:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7350665216. Throughput: 0: 42456.8. Samples: 7350813480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:33,395][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 13:57:33,530][15401] Updated weights for policy 0, policy_version 448650 (0.0034) [2024-06-23 13:57:37,418][15401] Updated weights for policy 0, policy_version 448660 (0.0031) [2024-06-23 13:57:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7350861824. Throughput: 0: 42488.4. Samples: 7350941660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 13:57:41,194][15401] Updated weights for policy 0, policy_version 448670 (0.0023) [2024-06-23 13:57:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 7351058432. Throughput: 0: 42356.5. Samples: 7351196340. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 13:57:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448674_7351074816.pth... [2024-06-23 13:57:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448051_7340867584.pth [2024-06-23 13:57:45,460][15401] Updated weights for policy 0, policy_version 448680 (0.0033) [2024-06-23 13:57:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 7351287808. Throughput: 0: 42529.4. Samples: 7351454200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 13:57:49,225][15401] Updated weights for policy 0, policy_version 448690 (0.0042) [2024-06-23 13:57:53,206][15401] Updated weights for policy 0, policy_version 448700 (0.0036) [2024-06-23 13:57:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7351500800. Throughput: 0: 42419.2. Samples: 7351577840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 13:57:56,718][15349] Signal inference workers to stop experience collection... (108950 times) [2024-06-23 13:57:56,719][15349] Signal inference workers to resume experience collection... (108950 times) [2024-06-23 13:57:56,736][15401] InferenceWorker_p0-w0: stopping experience collection (108950 times) [2024-06-23 13:57:56,736][15401] InferenceWorker_p0-w0: resuming experience collection (108950 times) [2024-06-23 13:57:56,872][15401] Updated weights for policy 0, policy_version 448710 (0.0028) [2024-06-23 13:57:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7351697408. Throughput: 0: 42342.2. Samples: 7351828680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:57:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 13:58:01,062][15401] Updated weights for policy 0, policy_version 448720 (0.0039) [2024-06-23 13:58:03,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41777.5, 300 sec: 42375.9). Total num frames: 7351894016. Throughput: 0: 42389.2. Samples: 7352084280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:58:03,392][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 13:58:04,653][15401] Updated weights for policy 0, policy_version 448730 (0.0038) [2024-06-23 13:58:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 7352139776. Throughput: 0: 42269.8. Samples: 7352206440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:58:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 13:58:08,906][15401] Updated weights for policy 0, policy_version 448740 (0.0046) [2024-06-23 13:58:12,257][15401] Updated weights for policy 0, policy_version 448750 (0.0038) [2024-06-23 13:58:13,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7352352768. Throughput: 0: 42289.7. Samples: 7352459720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:58:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 13:58:16,455][15401] Updated weights for policy 0, policy_version 448760 (0.0036) [2024-06-23 13:58:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7352549376. Throughput: 0: 42402.8. Samples: 7352721600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:58:18,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 13:58:19,887][15401] Updated weights for policy 0, policy_version 448770 (0.0028) [2024-06-23 13:58:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7352762368. Throughput: 0: 42238.2. Samples: 7352842380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-23 13:58:23,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 13:58:24,367][15401] Updated weights for policy 0, policy_version 448780 (0.0032) [2024-06-23 13:58:27,550][15401] Updated weights for policy 0, policy_version 448790 (0.0032) [2024-06-23 13:58:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7352991744. Throughput: 0: 42374.6. Samples: 7353103200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:28,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 13:58:32,048][15401] Updated weights for policy 0, policy_version 448800 (0.0028) [2024-06-23 13:58:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7353188352. Throughput: 0: 42307.0. Samples: 7353358020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 13:58:35,533][15401] Updated weights for policy 0, policy_version 448810 (0.0027) [2024-06-23 13:58:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7353401344. Throughput: 0: 42319.1. Samples: 7353482200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:38,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 13:58:39,723][15401] Updated weights for policy 0, policy_version 448820 (0.0026) [2024-06-23 13:58:43,062][15401] Updated weights for policy 0, policy_version 448830 (0.0031) [2024-06-23 13:58:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7353647104. Throughput: 0: 42527.9. Samples: 7353742440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 13:58:47,359][15401] Updated weights for policy 0, policy_version 448840 (0.0041) [2024-06-23 13:58:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7353827328. Throughput: 0: 42599.3. Samples: 7354001140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:48,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 13:58:50,658][15401] Updated weights for policy 0, policy_version 448850 (0.0042) [2024-06-23 13:58:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7354040320. Throughput: 0: 42580.5. Samples: 7354122560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:53,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-23 13:58:55,007][15401] Updated weights for policy 0, policy_version 448860 (0.0026) [2024-06-23 13:58:58,278][15401] Updated weights for policy 0, policy_version 448870 (0.0042) [2024-06-23 13:58:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7354286080. Throughput: 0: 42809.0. Samples: 7354386120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:58:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 13:59:02,631][15401] Updated weights for policy 0, policy_version 448880 (0.0031) [2024-06-23 13:59:03,394][15132] Fps is (10 sec: 40940.4, 60 sec: 42596.8, 300 sec: 42431.1). Total num frames: 7354449920. Throughput: 0: 42692.3. Samples: 7354642960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:03,395][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 13:59:05,895][15401] Updated weights for policy 0, policy_version 448890 (0.0040) [2024-06-23 13:59:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7354695680. Throughput: 0: 42671.7. Samples: 7354762600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 13:59:10,542][15401] Updated weights for policy 0, policy_version 448900 (0.0032) [2024-06-23 13:59:13,390][15132] Fps is (10 sec: 47535.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7354925056. Throughput: 0: 42604.8. Samples: 7355020420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 13:59:13,619][15401] Updated weights for policy 0, policy_version 448910 (0.0033) [2024-06-23 13:59:18,003][15401] Updated weights for policy 0, policy_version 448920 (0.0039) [2024-06-23 13:59:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7355105280. Throughput: 0: 42634.7. Samples: 7355276580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 13:59:21,212][15401] Updated weights for policy 0, policy_version 448930 (0.0041) [2024-06-23 13:59:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7355334656. Throughput: 0: 42592.3. Samples: 7355398860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:23,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 13:59:25,850][15401] Updated weights for policy 0, policy_version 448940 (0.0024) [2024-06-23 13:59:26,808][15349] Signal inference workers to stop experience collection... (109000 times) [2024-06-23 13:59:26,808][15349] Signal inference workers to resume experience collection... (109000 times) [2024-06-23 13:59:26,833][15401] InferenceWorker_p0-w0: stopping experience collection (109000 times) [2024-06-23 13:59:26,833][15401] InferenceWorker_p0-w0: resuming experience collection (109000 times) [2024-06-23 13:59:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7355531264. Throughput: 0: 42556.0. Samples: 7355657460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 13:59:29,314][15401] Updated weights for policy 0, policy_version 448950 (0.0037) [2024-06-23 13:59:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7355744256. Throughput: 0: 42422.9. Samples: 7355910180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 13:59:33,530][15401] Updated weights for policy 0, policy_version 448960 (0.0033) [2024-06-23 13:59:36,839][15401] Updated weights for policy 0, policy_version 448970 (0.0051) [2024-06-23 13:59:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 7355990016. Throughput: 0: 42608.0. Samples: 7356039920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:38,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 13:59:41,208][15401] Updated weights for policy 0, policy_version 448980 (0.0024) [2024-06-23 13:59:43,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 7356170240. Throughput: 0: 42757.3. Samples: 7356310300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 13:59:43,393][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 13:59:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448986_7356186624.pth... [2024-06-23 13:59:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448363_7345979392.pth [2024-06-23 13:59:44,756][15401] Updated weights for policy 0, policy_version 448990 (0.0037) [2024-06-23 13:59:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 7356383232. Throughput: 0: 42524.8. Samples: 7356556380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 13:59:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 13:59:49,088][15401] Updated weights for policy 0, policy_version 449000 (0.0031) [2024-06-23 13:59:52,353][15401] Updated weights for policy 0, policy_version 449010 (0.0040) [2024-06-23 13:59:53,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 7356628992. Throughput: 0: 42698.2. Samples: 7356684020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 13:59:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 13:59:57,014][15401] Updated weights for policy 0, policy_version 449020 (0.0031) [2024-06-23 13:59:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7356809216. Throughput: 0: 42804.0. Samples: 7356946600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 13:59:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 13:59:59,847][15401] Updated weights for policy 0, policy_version 449030 (0.0033) [2024-06-23 14:00:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42601.7, 300 sec: 42487.7). Total num frames: 7357005824. Throughput: 0: 42721.3. Samples: 7357199040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:03,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 14:00:04,557][15401] Updated weights for policy 0, policy_version 449040 (0.0034) [2024-06-23 14:00:07,552][15401] Updated weights for policy 0, policy_version 449050 (0.0035) [2024-06-23 14:00:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7357251584. Throughput: 0: 42815.6. Samples: 7357325560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 14:00:12,122][15401] Updated weights for policy 0, policy_version 449060 (0.0032) [2024-06-23 14:00:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 7357448192. Throughput: 0: 42736.6. Samples: 7357580600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 14:00:15,105][15401] Updated weights for policy 0, policy_version 449070 (0.0039) [2024-06-23 14:00:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 7357661184. Throughput: 0: 42765.0. Samples: 7357834600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:18,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 14:00:19,998][15401] Updated weights for policy 0, policy_version 449080 (0.0035) [2024-06-23 14:00:22,963][15401] Updated weights for policy 0, policy_version 449090 (0.0048) [2024-06-23 14:00:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7357890560. Throughput: 0: 42725.2. Samples: 7357962560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 14:00:27,836][15401] Updated weights for policy 0, policy_version 449100 (0.0031) [2024-06-23 14:00:28,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 7358087168. Throughput: 0: 42467.6. Samples: 7358221340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:28,393][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 14:00:30,581][15401] Updated weights for policy 0, policy_version 449110 (0.0035) [2024-06-23 14:00:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7358316544. Throughput: 0: 42558.3. Samples: 7358471500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:33,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 14:00:35,332][15401] Updated weights for policy 0, policy_version 449120 (0.0042) [2024-06-23 14:00:38,194][15401] Updated weights for policy 0, policy_version 449130 (0.0040) [2024-06-23 14:00:38,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7358545920. Throughput: 0: 42684.8. Samples: 7358604940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:38,393][15132] Avg episode reward: [(0, '0.270')] [2024-06-23 14:00:43,014][15349] Signal inference workers to stop experience collection... (109050 times) [2024-06-23 14:00:43,048][15401] InferenceWorker_p0-w0: stopping experience collection (109050 times) [2024-06-23 14:00:43,061][15349] Signal inference workers to resume experience collection... (109050 times) [2024-06-23 14:00:43,067][15401] InferenceWorker_p0-w0: resuming experience collection (109050 times) [2024-06-23 14:00:43,219][15401] Updated weights for policy 0, policy_version 449140 (0.0031) [2024-06-23 14:00:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 7358726144. Throughput: 0: 42451.1. Samples: 7358856900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 14:00:46,183][15401] Updated weights for policy 0, policy_version 449150 (0.0031) [2024-06-23 14:00:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7358971904. Throughput: 0: 42302.6. Samples: 7359102660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 14:00:50,750][15401] Updated weights for policy 0, policy_version 449160 (0.0030) [2024-06-23 14:00:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7359152128. Throughput: 0: 42749.9. Samples: 7359249300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 14:00:53,808][15401] Updated weights for policy 0, policy_version 449170 (0.0039) [2024-06-23 14:00:58,298][15401] Updated weights for policy 0, policy_version 449180 (0.0042) [2024-06-23 14:00:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7359365120. Throughput: 0: 42617.6. Samples: 7359498400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:00:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 14:01:01,434][15401] Updated weights for policy 0, policy_version 449190 (0.0024) [2024-06-23 14:01:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 7359627264. Throughput: 0: 42501.3. Samples: 7359747160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:01:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 14:01:05,845][15401] Updated weights for policy 0, policy_version 449200 (0.0032) [2024-06-23 14:01:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7359791104. Throughput: 0: 42781.1. Samples: 7359887700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:01:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:01:09,348][15401] Updated weights for policy 0, policy_version 449210 (0.0027) [2024-06-23 14:01:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7360004096. Throughput: 0: 42600.9. Samples: 7360138280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 14:01:13,425][15401] Updated weights for policy 0, policy_version 449220 (0.0034) [2024-06-23 14:01:17,082][15401] Updated weights for policy 0, policy_version 449230 (0.0024) [2024-06-23 14:01:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7360249856. Throughput: 0: 42500.5. Samples: 7360384020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 14:01:21,460][15401] Updated weights for policy 0, policy_version 449240 (0.0039) [2024-06-23 14:01:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42487.9). Total num frames: 7360430080. Throughput: 0: 42482.2. Samples: 7360516540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 14:01:24,682][15401] Updated weights for policy 0, policy_version 449250 (0.0036) [2024-06-23 14:01:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 7360659456. Throughput: 0: 42480.0. Samples: 7360768500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 14:01:29,195][15401] Updated weights for policy 0, policy_version 449260 (0.0041) [2024-06-23 14:01:32,710][15401] Updated weights for policy 0, policy_version 449270 (0.0026) [2024-06-23 14:01:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7360888832. Throughput: 0: 42737.7. Samples: 7361025860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 14:01:36,630][15401] Updated weights for policy 0, policy_version 449280 (0.0028) [2024-06-23 14:01:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 7361085440. Throughput: 0: 42619.1. Samples: 7361167160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 14:01:40,168][15401] Updated weights for policy 0, policy_version 449290 (0.0037) [2024-06-23 14:01:43,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7361298432. Throughput: 0: 42529.9. Samples: 7361412240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:43,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 14:01:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449298_7361298432.pth... [2024-06-23 14:01:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448674_7351074816.pth [2024-06-23 14:01:44,087][15401] Updated weights for policy 0, policy_version 449300 (0.0032) [2024-06-23 14:01:48,043][15401] Updated weights for policy 0, policy_version 449310 (0.0032) [2024-06-23 14:01:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7361527808. Throughput: 0: 42877.0. Samples: 7361676620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 14:01:51,501][15401] Updated weights for policy 0, policy_version 449320 (0.0031) [2024-06-23 14:01:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7361708032. Throughput: 0: 42634.6. Samples: 7361806260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 14:01:55,566][15401] Updated weights for policy 0, policy_version 449330 (0.0030) [2024-06-23 14:01:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7361953792. Throughput: 0: 42747.0. Samples: 7362061900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:01:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 14:01:58,831][15401] Updated weights for policy 0, policy_version 449340 (0.0034) [2024-06-23 14:02:03,254][15401] Updated weights for policy 0, policy_version 449350 (0.0038) [2024-06-23 14:02:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7362150400. Throughput: 0: 43129.7. Samples: 7362324860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 14:02:06,348][15401] Updated weights for policy 0, policy_version 449360 (0.0039) [2024-06-23 14:02:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7362363392. Throughput: 0: 42872.9. Samples: 7362445820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 14:02:10,991][15401] Updated weights for policy 0, policy_version 449370 (0.0035) [2024-06-23 14:02:11,457][15349] Signal inference workers to stop experience collection... (109100 times) [2024-06-23 14:02:11,458][15349] Signal inference workers to resume experience collection... (109100 times) [2024-06-23 14:02:11,493][15401] InferenceWorker_p0-w0: stopping experience collection (109100 times) [2024-06-23 14:02:11,493][15401] InferenceWorker_p0-w0: resuming experience collection (109100 times) [2024-06-23 14:02:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 7362609152. Throughput: 0: 43134.8. Samples: 7362709560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:13,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 14:02:13,763][15401] Updated weights for policy 0, policy_version 449380 (0.0028) [2024-06-23 14:02:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.5, 300 sec: 42598.1). Total num frames: 7362772992. Throughput: 0: 43314.7. Samples: 7362975120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:18,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 14:02:18,612][15401] Updated weights for policy 0, policy_version 449390 (0.0036) [2024-06-23 14:02:21,783][15401] Updated weights for policy 0, policy_version 449400 (0.0029) [2024-06-23 14:02:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7363002368. Throughput: 0: 42634.1. Samples: 7363085700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 14:02:26,081][15401] Updated weights for policy 0, policy_version 449410 (0.0038) [2024-06-23 14:02:28,390][15132] Fps is (10 sec: 49163.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7363264512. Throughput: 0: 43165.3. Samples: 7363354680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:28,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 14:02:29,247][15401] Updated weights for policy 0, policy_version 449420 (0.0031) [2024-06-23 14:02:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7363428352. Throughput: 0: 43131.5. Samples: 7363617540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-23 14:02:33,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 14:02:33,793][15401] Updated weights for policy 0, policy_version 449430 (0.0032) [2024-06-23 14:02:36,711][15401] Updated weights for policy 0, policy_version 449440 (0.0031) [2024-06-23 14:02:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7363657728. Throughput: 0: 42753.4. Samples: 7363730160. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:02:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 14:02:41,414][15401] Updated weights for policy 0, policy_version 449450 (0.0036) [2024-06-23 14:02:43,390][15132] Fps is (10 sec: 49151.3, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 7363919872. Throughput: 0: 42987.5. Samples: 7363996340. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:02:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 14:02:44,335][15401] Updated weights for policy 0, policy_version 449460 (0.0029) [2024-06-23 14:02:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7364067328. Throughput: 0: 43166.2. Samples: 7364267340. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:02:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 14:02:48,962][15401] Updated weights for policy 0, policy_version 449470 (0.0031) [2024-06-23 14:02:52,273][15401] Updated weights for policy 0, policy_version 449480 (0.0035) [2024-06-23 14:02:53,389][15132] Fps is (10 sec: 37684.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7364296704. Throughput: 0: 43018.8. Samples: 7364381660. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:02:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 14:02:56,543][15401] Updated weights for policy 0, policy_version 449490 (0.0032) [2024-06-23 14:02:58,392][15132] Fps is (10 sec: 49140.1, 60 sec: 43415.9, 300 sec: 42931.6). Total num frames: 7364558848. Throughput: 0: 43070.0. Samples: 7364647820. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:02:58,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:02:59,890][15401] Updated weights for policy 0, policy_version 449500 (0.0036) [2024-06-23 14:03:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7364706304. Throughput: 0: 42995.1. Samples: 7364909800. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 14:03:04,088][15401] Updated weights for policy 0, policy_version 449510 (0.0026) [2024-06-23 14:03:07,652][15401] Updated weights for policy 0, policy_version 449520 (0.0028) [2024-06-23 14:03:08,389][15132] Fps is (10 sec: 39331.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7364952064. Throughput: 0: 43189.9. Samples: 7365029240. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 14:03:11,649][15401] Updated weights for policy 0, policy_version 449530 (0.0034) [2024-06-23 14:03:13,390][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7365197824. Throughput: 0: 43063.5. Samples: 7365292540. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:13,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 14:03:15,164][15401] Updated weights for policy 0, policy_version 449540 (0.0040) [2024-06-23 14:03:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 7365361664. Throughput: 0: 42959.6. Samples: 7365550720. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 14:03:19,265][15401] Updated weights for policy 0, policy_version 449550 (0.0038) [2024-06-23 14:03:20,643][15349] Signal inference workers to stop experience collection... (109150 times) [2024-06-23 14:03:20,643][15349] Signal inference workers to resume experience collection... (109150 times) [2024-06-23 14:03:20,690][15401] InferenceWorker_p0-w0: stopping experience collection (109150 times) [2024-06-23 14:03:20,690][15401] InferenceWorker_p0-w0: resuming experience collection (109150 times) [2024-06-23 14:03:22,937][15401] Updated weights for policy 0, policy_version 449560 (0.0032) [2024-06-23 14:03:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7365591040. Throughput: 0: 43142.1. Samples: 7365671560. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 14:03:26,947][15401] Updated weights for policy 0, policy_version 449570 (0.0029) [2024-06-23 14:03:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7365836800. Throughput: 0: 42905.0. Samples: 7365927060. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 14:03:30,746][15401] Updated weights for policy 0, policy_version 449580 (0.0031) [2024-06-23 14:03:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7366000640. Throughput: 0: 42806.6. Samples: 7366193640. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 14:03:34,452][15401] Updated weights for policy 0, policy_version 449590 (0.0036) [2024-06-23 14:03:38,292][15401] Updated weights for policy 0, policy_version 449600 (0.0040) [2024-06-23 14:03:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7366246400. Throughput: 0: 42828.0. Samples: 7366308920. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 14:03:41,968][15401] Updated weights for policy 0, policy_version 449610 (0.0032) [2024-06-23 14:03:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 7366459392. Throughput: 0: 42633.3. Samples: 7366566220. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 14:03:43,543][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449614_7366475776.pth... [2024-06-23 14:03:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000448986_7356186624.pth [2024-06-23 14:03:45,855][15401] Updated weights for policy 0, policy_version 449620 (0.0030) [2024-06-23 14:03:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7366639616. Throughput: 0: 42795.3. Samples: 7366835580. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 14:03:49,727][15401] Updated weights for policy 0, policy_version 449630 (0.0036) [2024-06-23 14:03:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7366885376. Throughput: 0: 42747.5. Samples: 7366952880. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 14:03:53,431][15401] Updated weights for policy 0, policy_version 449640 (0.0027) [2024-06-23 14:03:57,466][15401] Updated weights for policy 0, policy_version 449650 (0.0026) [2024-06-23 14:03:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42327.1, 300 sec: 42876.8). Total num frames: 7367098368. Throughput: 0: 42615.2. Samples: 7367210220. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-06-23 14:03:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 14:04:01,085][15401] Updated weights for policy 0, policy_version 449660 (0.0035) [2024-06-23 14:04:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7367278592. Throughput: 0: 42794.0. Samples: 7367476460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 14:04:05,142][15401] Updated weights for policy 0, policy_version 449670 (0.0033) [2024-06-23 14:04:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7367524352. Throughput: 0: 42823.6. Samples: 7367598620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 14:04:08,839][15401] Updated weights for policy 0, policy_version 449680 (0.0039) [2024-06-23 14:04:12,784][15401] Updated weights for policy 0, policy_version 449690 (0.0024) [2024-06-23 14:04:13,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7367753728. Throughput: 0: 42923.5. Samples: 7367858620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:13,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 14:04:16,914][15401] Updated weights for policy 0, policy_version 449700 (0.0041) [2024-06-23 14:04:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7367933952. Throughput: 0: 42700.5. Samples: 7368115160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 14:04:20,331][15401] Updated weights for policy 0, policy_version 449710 (0.0032) [2024-06-23 14:04:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7368163328. Throughput: 0: 42904.8. Samples: 7368239640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 14:04:24,451][15401] Updated weights for policy 0, policy_version 449720 (0.0030) [2024-06-23 14:04:27,910][15401] Updated weights for policy 0, policy_version 449730 (0.0038) [2024-06-23 14:04:28,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7368409088. Throughput: 0: 43018.3. Samples: 7368502040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 14:04:32,240][15401] Updated weights for policy 0, policy_version 449740 (0.0039) [2024-06-23 14:04:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7368572928. Throughput: 0: 42841.1. Samples: 7368763440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 14:04:35,489][15401] Updated weights for policy 0, policy_version 449750 (0.0040) [2024-06-23 14:04:37,037][15349] Signal inference workers to stop experience collection... (109200 times) [2024-06-23 14:04:37,074][15401] InferenceWorker_p0-w0: stopping experience collection (109200 times) [2024-06-23 14:04:37,084][15349] Signal inference workers to resume experience collection... (109200 times) [2024-06-23 14:04:37,089][15401] InferenceWorker_p0-w0: resuming experience collection (109200 times) [2024-06-23 14:04:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 7368802304. Throughput: 0: 42892.5. Samples: 7368883040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 14:04:39,826][15401] Updated weights for policy 0, policy_version 449760 (0.0035) [2024-06-23 14:04:42,857][15401] Updated weights for policy 0, policy_version 449770 (0.0027) [2024-06-23 14:04:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7369031680. Throughput: 0: 43107.0. Samples: 7369150040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:43,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 14:04:47,371][15401] Updated weights for policy 0, policy_version 449780 (0.0034) [2024-06-23 14:04:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7369228288. Throughput: 0: 42858.4. Samples: 7369405080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 14:04:50,764][15401] Updated weights for policy 0, policy_version 449790 (0.0028) [2024-06-23 14:04:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7369457664. Throughput: 0: 42775.2. Samples: 7369523500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 14:04:55,290][15401] Updated weights for policy 0, policy_version 449800 (0.0029) [2024-06-23 14:04:58,246][15401] Updated weights for policy 0, policy_version 449810 (0.0033) [2024-06-23 14:04:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 7369687040. Throughput: 0: 42814.7. Samples: 7369785280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:04:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 14:05:02,937][15401] Updated weights for policy 0, policy_version 449820 (0.0047) [2024-06-23 14:05:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7369867264. Throughput: 0: 42812.0. Samples: 7370041700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:05:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 14:05:06,191][15401] Updated weights for policy 0, policy_version 449830 (0.0029) [2024-06-23 14:05:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7370096640. Throughput: 0: 42848.1. Samples: 7370167800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:05:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 14:05:10,511][15401] Updated weights for policy 0, policy_version 449840 (0.0034) [2024-06-23 14:05:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7370309632. Throughput: 0: 42872.1. Samples: 7370431280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:05:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 14:05:13,820][15401] Updated weights for policy 0, policy_version 449850 (0.0032) [2024-06-23 14:05:18,044][15401] Updated weights for policy 0, policy_version 449860 (0.0032) [2024-06-23 14:05:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7370506240. Throughput: 0: 42811.2. Samples: 7370689940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:05:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 14:05:21,278][15401] Updated weights for policy 0, policy_version 449870 (0.0032) [2024-06-23 14:05:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7370752000. Throughput: 0: 42933.7. Samples: 7370815060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-23 14:05:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 14:05:25,485][15401] Updated weights for policy 0, policy_version 449880 (0.0038) [2024-06-23 14:05:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7370948608. Throughput: 0: 42858.3. Samples: 7371078660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 14:05:28,961][15401] Updated weights for policy 0, policy_version 449890 (0.0033) [2024-06-23 14:05:33,342][15401] Updated weights for policy 0, policy_version 449900 (0.0035) [2024-06-23 14:05:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 7371161600. Throughput: 0: 42776.4. Samples: 7371330020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 14:05:36,707][15401] Updated weights for policy 0, policy_version 449910 (0.0038) [2024-06-23 14:05:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7371374592. Throughput: 0: 42929.3. Samples: 7371455320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 14:05:41,211][15401] Updated weights for policy 0, policy_version 449920 (0.0031) [2024-06-23 14:05:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7371571200. Throughput: 0: 42858.2. Samples: 7371713900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:43,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 14:05:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449926_7371587584.pth... [2024-06-23 14:05:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449298_7361298432.pth [2024-06-23 14:05:44,394][15401] Updated weights for policy 0, policy_version 449930 (0.0023) [2024-06-23 14:05:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7371784192. Throughput: 0: 42776.9. Samples: 7371966660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 14:05:48,940][15401] Updated weights for policy 0, policy_version 449940 (0.0034) [2024-06-23 14:05:52,083][15401] Updated weights for policy 0, policy_version 449950 (0.0038) [2024-06-23 14:05:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7372013568. Throughput: 0: 42838.2. Samples: 7372095520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 14:05:56,445][15401] Updated weights for policy 0, policy_version 449960 (0.0032) [2024-06-23 14:05:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7372242944. Throughput: 0: 42787.9. Samples: 7372356740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:05:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 14:05:59,984][15401] Updated weights for policy 0, policy_version 449970 (0.0037) [2024-06-23 14:06:01,099][15349] Signal inference workers to stop experience collection... (109250 times) [2024-06-23 14:06:01,100][15349] Signal inference workers to resume experience collection... (109250 times) [2024-06-23 14:06:01,120][15401] InferenceWorker_p0-w0: stopping experience collection (109250 times) [2024-06-23 14:06:01,120][15401] InferenceWorker_p0-w0: resuming experience collection (109250 times) [2024-06-23 14:06:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7372439552. Throughput: 0: 42687.6. Samples: 7372610880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:03,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 14:06:03,866][15401] Updated weights for policy 0, policy_version 449980 (0.0039) [2024-06-23 14:06:07,730][15401] Updated weights for policy 0, policy_version 449990 (0.0028) [2024-06-23 14:06:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7372636160. Throughput: 0: 42748.5. Samples: 7372738740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 14:06:11,712][15401] Updated weights for policy 0, policy_version 450000 (0.0041) [2024-06-23 14:06:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7372881920. Throughput: 0: 42497.2. Samples: 7372991140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:13,393][15132] Avg episode reward: [(0, '0.146')] [2024-06-23 14:06:15,447][15401] Updated weights for policy 0, policy_version 450010 (0.0030) [2024-06-23 14:06:18,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 7373094912. Throughput: 0: 42556.5. Samples: 7373245160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:18,394][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 14:06:19,207][15401] Updated weights for policy 0, policy_version 450020 (0.0024) [2024-06-23 14:06:23,036][15401] Updated weights for policy 0, policy_version 450030 (0.0037) [2024-06-23 14:06:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7373291520. Throughput: 0: 42499.1. Samples: 7373367780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 14:06:26,685][15401] Updated weights for policy 0, policy_version 450040 (0.0038) [2024-06-23 14:06:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7373504512. Throughput: 0: 42564.1. Samples: 7373629280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 14:06:31,058][15401] Updated weights for policy 0, policy_version 450050 (0.0024) [2024-06-23 14:06:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7373733888. Throughput: 0: 42607.9. Samples: 7373884020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 14:06:34,816][15401] Updated weights for policy 0, policy_version 450060 (0.0027) [2024-06-23 14:06:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7373914112. Throughput: 0: 42531.5. Samples: 7374009440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 14:06:38,619][15401] Updated weights for policy 0, policy_version 450070 (0.0030) [2024-06-23 14:06:42,550][15401] Updated weights for policy 0, policy_version 450080 (0.0034) [2024-06-23 14:06:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7374159872. Throughput: 0: 42515.5. Samples: 7374269940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 14:06:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 14:06:46,394][15401] Updated weights for policy 0, policy_version 450090 (0.0041) [2024-06-23 14:06:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7374356480. Throughput: 0: 42530.3. Samples: 7374524740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:06:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 14:06:50,041][15401] Updated weights for policy 0, policy_version 450100 (0.0027) [2024-06-23 14:06:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7374569472. Throughput: 0: 42488.4. Samples: 7374650720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:06:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 14:06:54,016][15401] Updated weights for policy 0, policy_version 450110 (0.0027) [2024-06-23 14:06:57,446][15401] Updated weights for policy 0, policy_version 450120 (0.0044) [2024-06-23 14:06:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 7374782464. Throughput: 0: 42545.3. Samples: 7374905580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:06:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 14:07:01,665][15401] Updated weights for policy 0, policy_version 450130 (0.0037) [2024-06-23 14:07:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7374979072. Throughput: 0: 42605.4. Samples: 7375162300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 14:07:05,448][15401] Updated weights for policy 0, policy_version 450140 (0.0038) [2024-06-23 14:07:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7375192064. Throughput: 0: 42651.5. Samples: 7375287100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 14:07:09,837][15401] Updated weights for policy 0, policy_version 450150 (0.0032) [2024-06-23 14:07:12,847][15401] Updated weights for policy 0, policy_version 450160 (0.0034) [2024-06-23 14:07:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7375437824. Throughput: 0: 42482.6. Samples: 7375541100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:13,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 14:07:17,590][15401] Updated weights for policy 0, policy_version 450170 (0.0041) [2024-06-23 14:07:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 7375618048. Throughput: 0: 42738.4. Samples: 7375807240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:18,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 14:07:20,476][15401] Updated weights for policy 0, policy_version 450180 (0.0032) [2024-06-23 14:07:23,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7375847424. Throughput: 0: 42564.1. Samples: 7375924820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 14:07:25,252][15401] Updated weights for policy 0, policy_version 450190 (0.0036) [2024-06-23 14:07:27,101][15349] Signal inference workers to stop experience collection... (109300 times) [2024-06-23 14:07:27,149][15401] InferenceWorker_p0-w0: stopping experience collection (109300 times) [2024-06-23 14:07:27,158][15349] Signal inference workers to resume experience collection... (109300 times) [2024-06-23 14:07:27,170][15401] InferenceWorker_p0-w0: resuming experience collection (109300 times) [2024-06-23 14:07:28,227][15401] Updated weights for policy 0, policy_version 450200 (0.0046) [2024-06-23 14:07:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7376076800. Throughput: 0: 42488.1. Samples: 7376181900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 14:07:33,090][15401] Updated weights for policy 0, policy_version 450210 (0.0023) [2024-06-23 14:07:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7376257024. Throughput: 0: 42589.2. Samples: 7376441260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 14:07:36,034][15401] Updated weights for policy 0, policy_version 450220 (0.0035) [2024-06-23 14:07:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7376486400. Throughput: 0: 42474.3. Samples: 7376562060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:38,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 14:07:40,924][15401] Updated weights for policy 0, policy_version 450230 (0.0037) [2024-06-23 14:07:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7376699392. Throughput: 0: 42420.0. Samples: 7376814480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 14:07:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450238_7376699392.pth... [2024-06-23 14:07:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449614_7366475776.pth [2024-06-23 14:07:43,868][15401] Updated weights for policy 0, policy_version 450240 (0.0032) [2024-06-23 14:07:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7376879616. Throughput: 0: 42601.7. Samples: 7377079380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 14:07:48,556][15401] Updated weights for policy 0, policy_version 450250 (0.0034) [2024-06-23 14:07:51,408][15401] Updated weights for policy 0, policy_version 450260 (0.0036) [2024-06-23 14:07:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 7377125376. Throughput: 0: 42465.3. Samples: 7377198040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 14:07:56,235][15401] Updated weights for policy 0, policy_version 450270 (0.0024) [2024-06-23 14:07:58,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7377354752. Throughput: 0: 42594.7. Samples: 7377457760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:07:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 14:07:58,913][15401] Updated weights for policy 0, policy_version 450280 (0.0036) [2024-06-23 14:08:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7377518592. Throughput: 0: 42643.6. Samples: 7377726200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:08:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 14:08:03,791][15401] Updated weights for policy 0, policy_version 450290 (0.0042) [2024-06-23 14:08:06,509][15401] Updated weights for policy 0, policy_version 450300 (0.0036) [2024-06-23 14:08:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7377780736. Throughput: 0: 42642.4. Samples: 7377843740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:08:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 14:08:11,194][15401] Updated weights for policy 0, policy_version 450310 (0.0025) [2024-06-23 14:08:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 7377993728. Throughput: 0: 42813.8. Samples: 7378108520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 14:08:13,983][15401] Updated weights for policy 0, policy_version 450320 (0.0036) [2024-06-23 14:08:18,389][15132] Fps is (10 sec: 37684.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7378157568. Throughput: 0: 42985.8. Samples: 7378375620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 14:08:18,683][15401] Updated weights for policy 0, policy_version 450330 (0.0029) [2024-06-23 14:08:21,794][15401] Updated weights for policy 0, policy_version 450340 (0.0030) [2024-06-23 14:08:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7378419712. Throughput: 0: 42987.8. Samples: 7378496520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 14:08:26,172][15401] Updated weights for policy 0, policy_version 450350 (0.0037) [2024-06-23 14:08:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7378632704. Throughput: 0: 43063.3. Samples: 7378752320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 14:08:29,134][15401] Updated weights for policy 0, policy_version 450360 (0.0034) [2024-06-23 14:08:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7378812928. Throughput: 0: 42895.2. Samples: 7379009660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 14:08:34,137][15401] Updated weights for policy 0, policy_version 450370 (0.0044) [2024-06-23 14:08:36,756][15401] Updated weights for policy 0, policy_version 450380 (0.0031) [2024-06-23 14:08:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7379058688. Throughput: 0: 42929.5. Samples: 7379129860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 14:08:41,559][15401] Updated weights for policy 0, policy_version 450390 (0.0028) [2024-06-23 14:08:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7379288064. Throughput: 0: 43018.8. Samples: 7379393600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 14:08:44,513][15401] Updated weights for policy 0, policy_version 450400 (0.0031) [2024-06-23 14:08:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7379468288. Throughput: 0: 42788.8. Samples: 7379651700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 14:08:49,052][15401] Updated weights for policy 0, policy_version 450410 (0.0030) [2024-06-23 14:08:52,079][15401] Updated weights for policy 0, policy_version 450420 (0.0029) [2024-06-23 14:08:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7379697664. Throughput: 0: 42903.7. Samples: 7379774400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 14:08:56,649][15401] Updated weights for policy 0, policy_version 450430 (0.0037) [2024-06-23 14:08:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7379910656. Throughput: 0: 42763.6. Samples: 7380032880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:08:58,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-23 14:08:59,426][15349] Signal inference workers to stop experience collection... (109350 times) [2024-06-23 14:08:59,427][15349] Signal inference workers to resume experience collection... (109350 times) [2024-06-23 14:08:59,458][15401] InferenceWorker_p0-w0: stopping experience collection (109350 times) [2024-06-23 14:08:59,458][15401] InferenceWorker_p0-w0: resuming experience collection (109350 times) [2024-06-23 14:08:59,739][15401] Updated weights for policy 0, policy_version 450440 (0.0032) [2024-06-23 14:09:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7380090880. Throughput: 0: 42693.4. Samples: 7380296820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 14:09:04,296][15401] Updated weights for policy 0, policy_version 450450 (0.0040) [2024-06-23 14:09:07,801][15401] Updated weights for policy 0, policy_version 450460 (0.0040) [2024-06-23 14:09:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7380336640. Throughput: 0: 42788.1. Samples: 7380421980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 14:09:12,174][15401] Updated weights for policy 0, policy_version 450470 (0.0033) [2024-06-23 14:09:13,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7380549632. Throughput: 0: 42650.4. Samples: 7380671600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 14:09:15,652][15401] Updated weights for policy 0, policy_version 450480 (0.0045) [2024-06-23 14:09:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7380746240. Throughput: 0: 42509.6. Samples: 7380922600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 14:09:20,016][15401] Updated weights for policy 0, policy_version 450490 (0.0034) [2024-06-23 14:09:23,267][15401] Updated weights for policy 0, policy_version 450500 (0.0028) [2024-06-23 14:09:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7380992000. Throughput: 0: 42702.6. Samples: 7381051480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 14:09:27,703][15401] Updated weights for policy 0, policy_version 450510 (0.0037) [2024-06-23 14:09:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7381204992. Throughput: 0: 42757.2. Samples: 7381317680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 14:09:30,925][15401] Updated weights for policy 0, policy_version 450520 (0.0032) [2024-06-23 14:09:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7381401600. Throughput: 0: 42452.5. Samples: 7381562060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-23 14:09:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 14:09:35,477][15401] Updated weights for policy 0, policy_version 450530 (0.0044) [2024-06-23 14:09:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7381614592. Throughput: 0: 42504.1. Samples: 7381687080. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:09:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 14:09:38,673][15401] Updated weights for policy 0, policy_version 450540 (0.0038) [2024-06-23 14:09:42,998][15401] Updated weights for policy 0, policy_version 450550 (0.0033) [2024-06-23 14:09:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7381843968. Throughput: 0: 42775.0. Samples: 7381957760. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:09:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 14:09:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450552_7381843968.pth... [2024-06-23 14:09:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000449926_7371587584.pth [2024-06-23 14:09:46,290][15401] Updated weights for policy 0, policy_version 450560 (0.0023) [2024-06-23 14:09:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7382040576. Throughput: 0: 42369.6. Samples: 7382203560. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:09:48,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 14:09:50,789][15401] Updated weights for policy 0, policy_version 450570 (0.0034) [2024-06-23 14:09:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7382253568. Throughput: 0: 42429.8. Samples: 7382331320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:09:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 14:09:54,313][15401] Updated weights for policy 0, policy_version 450580 (0.0028) [2024-06-23 14:09:58,370][15401] Updated weights for policy 0, policy_version 450590 (0.0026) [2024-06-23 14:09:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7382466560. Throughput: 0: 42733.5. Samples: 7382594600. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:09:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 14:10:01,778][15401] Updated weights for policy 0, policy_version 450600 (0.0037) [2024-06-23 14:10:03,393][15132] Fps is (10 sec: 44219.7, 60 sec: 43414.8, 300 sec: 42708.9). Total num frames: 7382695936. Throughput: 0: 42684.0. Samples: 7382843540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:03,394][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 14:10:06,080][15401] Updated weights for policy 0, policy_version 450610 (0.0041) [2024-06-23 14:10:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7382892544. Throughput: 0: 42792.8. Samples: 7382977160. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 14:10:09,491][15401] Updated weights for policy 0, policy_version 450620 (0.0038) [2024-06-23 14:10:13,392][15132] Fps is (10 sec: 39327.2, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 7383089152. Throughput: 0: 42614.7. Samples: 7383235440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:13,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 14:10:13,637][15401] Updated weights for policy 0, policy_version 450630 (0.0047) [2024-06-23 14:10:17,122][15401] Updated weights for policy 0, policy_version 450640 (0.0027) [2024-06-23 14:10:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7383334912. Throughput: 0: 42720.8. Samples: 7383484500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 14:10:21,316][15401] Updated weights for policy 0, policy_version 450650 (0.0037) [2024-06-23 14:10:22,582][15349] Signal inference workers to stop experience collection... (109400 times) [2024-06-23 14:10:22,605][15401] InferenceWorker_p0-w0: stopping experience collection (109400 times) [2024-06-23 14:10:22,698][15349] Signal inference workers to resume experience collection... (109400 times) [2024-06-23 14:10:22,698][15401] InferenceWorker_p0-w0: resuming experience collection (109400 times) [2024-06-23 14:10:23,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 7383547904. Throughput: 0: 42911.4. Samples: 7383618100. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 14:10:24,574][15401] Updated weights for policy 0, policy_version 450660 (0.0033) [2024-06-23 14:10:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7383728128. Throughput: 0: 42520.5. Samples: 7383871180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 14:10:29,118][15401] Updated weights for policy 0, policy_version 450670 (0.0032) [2024-06-23 14:10:32,643][15401] Updated weights for policy 0, policy_version 450680 (0.0030) [2024-06-23 14:10:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7383973888. Throughput: 0: 42786.8. Samples: 7384128860. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:33,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 14:10:36,708][15401] Updated weights for policy 0, policy_version 450690 (0.0031) [2024-06-23 14:10:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7384186880. Throughput: 0: 42837.3. Samples: 7384259000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:38,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 14:10:40,362][15401] Updated weights for policy 0, policy_version 450700 (0.0027) [2024-06-23 14:10:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7384383488. Throughput: 0: 42663.0. Samples: 7384514440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 14:10:44,336][15401] Updated weights for policy 0, policy_version 450710 (0.0039) [2024-06-23 14:10:47,977][15401] Updated weights for policy 0, policy_version 450720 (0.0042) [2024-06-23 14:10:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7384612864. Throughput: 0: 42743.2. Samples: 7384766820. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 14:10:52,030][15401] Updated weights for policy 0, policy_version 450730 (0.0051) [2024-06-23 14:10:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7384825856. Throughput: 0: 42712.1. Samples: 7384899200. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 14:10:55,454][15401] Updated weights for policy 0, policy_version 450740 (0.0029) [2024-06-23 14:10:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7385038848. Throughput: 0: 42748.1. Samples: 7385159000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 14:10:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 14:10:59,641][15401] Updated weights for policy 0, policy_version 450750 (0.0031) [2024-06-23 14:11:03,225][15401] Updated weights for policy 0, policy_version 450760 (0.0039) [2024-06-23 14:11:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42601.2, 300 sec: 42765.0). Total num frames: 7385251840. Throughput: 0: 42878.8. Samples: 7385414040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 14:11:07,162][15401] Updated weights for policy 0, policy_version 450770 (0.0032) [2024-06-23 14:11:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7385464832. Throughput: 0: 42751.1. Samples: 7385541900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 14:11:10,791][15401] Updated weights for policy 0, policy_version 450780 (0.0035) [2024-06-23 14:11:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42654.3). Total num frames: 7385677824. Throughput: 0: 42728.4. Samples: 7385793960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 14:11:14,809][15401] Updated weights for policy 0, policy_version 450790 (0.0032) [2024-06-23 14:11:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7385874432. Throughput: 0: 42670.2. Samples: 7386049020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 14:11:18,754][15401] Updated weights for policy 0, policy_version 450800 (0.0039) [2024-06-23 14:11:22,533][15401] Updated weights for policy 0, policy_version 450810 (0.0035) [2024-06-23 14:11:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7386103808. Throughput: 0: 42748.5. Samples: 7386182680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 14:11:26,192][15401] Updated weights for policy 0, policy_version 450820 (0.0038) [2024-06-23 14:11:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7386316800. Throughput: 0: 42772.1. Samples: 7386439180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 14:11:30,119][15401] Updated weights for policy 0, policy_version 450830 (0.0028) [2024-06-23 14:11:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7386513408. Throughput: 0: 42903.1. Samples: 7386697460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 14:11:33,738][15401] Updated weights for policy 0, policy_version 450840 (0.0027) [2024-06-23 14:11:37,794][15401] Updated weights for policy 0, policy_version 450850 (0.0024) [2024-06-23 14:11:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7386742784. Throughput: 0: 42822.2. Samples: 7386826200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 14:11:41,450][15401] Updated weights for policy 0, policy_version 450860 (0.0026) [2024-06-23 14:11:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7386939392. Throughput: 0: 42663.6. Samples: 7387078860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 14:11:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450863_7386939392.pth... [2024-06-23 14:11:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450238_7376699392.pth [2024-06-23 14:11:45,452][15401] Updated weights for policy 0, policy_version 450870 (0.0029) [2024-06-23 14:11:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7387185152. Throughput: 0: 42693.4. Samples: 7387335240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:48,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 14:11:49,261][15401] Updated weights for policy 0, policy_version 450880 (0.0031) [2024-06-23 14:11:52,921][15401] Updated weights for policy 0, policy_version 450890 (0.0034) [2024-06-23 14:11:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7387381760. Throughput: 0: 42878.7. Samples: 7387471440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:11:56,843][15401] Updated weights for policy 0, policy_version 450900 (0.0031) [2024-06-23 14:11:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7387578368. Throughput: 0: 42817.3. Samples: 7387720740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:11:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 14:12:00,454][15401] Updated weights for policy 0, policy_version 450910 (0.0034) [2024-06-23 14:12:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7387807744. Throughput: 0: 42797.2. Samples: 7387975000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:12:03,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 14:12:04,347][15401] Updated weights for policy 0, policy_version 450920 (0.0035) [2024-06-23 14:12:08,390][15132] Fps is (10 sec: 44234.1, 60 sec: 42598.0, 300 sec: 42654.2). Total num frames: 7388020736. Throughput: 0: 42730.1. Samples: 7388105560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:12:08,391][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 14:12:08,689][15401] Updated weights for policy 0, policy_version 450930 (0.0025) [2024-06-23 14:12:08,889][15349] Signal inference workers to stop experience collection... (109450 times) [2024-06-23 14:12:08,889][15349] Signal inference workers to resume experience collection... (109450 times) [2024-06-23 14:12:08,928][15401] InferenceWorker_p0-w0: stopping experience collection (109450 times) [2024-06-23 14:12:08,929][15401] InferenceWorker_p0-w0: resuming experience collection (109450 times) [2024-06-23 14:12:12,101][15401] Updated weights for policy 0, policy_version 450940 (0.0039) [2024-06-23 14:12:13,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7388233728. Throughput: 0: 42681.7. Samples: 7388359960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:12:13,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 14:12:16,129][15401] Updated weights for policy 0, policy_version 450950 (0.0031) [2024-06-23 14:12:18,390][15132] Fps is (10 sec: 44239.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7388463104. Throughput: 0: 42770.6. Samples: 7388622140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:12:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 14:12:19,582][15401] Updated weights for policy 0, policy_version 450960 (0.0034) [2024-06-23 14:12:23,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7388676096. Throughput: 0: 42883.8. Samples: 7388755980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 14:12:23,906][15401] Updated weights for policy 0, policy_version 450970 (0.0035) [2024-06-23 14:12:27,082][15401] Updated weights for policy 0, policy_version 450980 (0.0027) [2024-06-23 14:12:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7388889088. Throughput: 0: 42852.3. Samples: 7389007320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:28,393][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 14:12:31,506][15401] Updated weights for policy 0, policy_version 450990 (0.0023) [2024-06-23 14:12:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7389102080. Throughput: 0: 42929.7. Samples: 7389267080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 14:12:35,057][15401] Updated weights for policy 0, policy_version 451000 (0.0027) [2024-06-23 14:12:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7389315072. Throughput: 0: 42945.3. Samples: 7389403980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:38,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 14:12:39,042][15401] Updated weights for policy 0, policy_version 451010 (0.0031) [2024-06-23 14:12:42,477][15401] Updated weights for policy 0, policy_version 451020 (0.0028) [2024-06-23 14:12:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7389528064. Throughput: 0: 43002.7. Samples: 7389655860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:43,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 14:12:47,077][15401] Updated weights for policy 0, policy_version 451030 (0.0031) [2024-06-23 14:12:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7389757440. Throughput: 0: 43148.1. Samples: 7389916560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 14:12:50,118][15401] Updated weights for policy 0, policy_version 451040 (0.0029) [2024-06-23 14:12:53,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7389970432. Throughput: 0: 43197.8. Samples: 7390049540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:53,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 14:12:54,589][15401] Updated weights for policy 0, policy_version 451050 (0.0037) [2024-06-23 14:12:57,556][15401] Updated weights for policy 0, policy_version 451060 (0.0036) [2024-06-23 14:12:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 7390183424. Throughput: 0: 43219.2. Samples: 7390304720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:12:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 14:13:02,142][15401] Updated weights for policy 0, policy_version 451070 (0.0039) [2024-06-23 14:13:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 7390412800. Throughput: 0: 43146.4. Samples: 7390563720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 14:13:05,384][15401] Updated weights for policy 0, policy_version 451080 (0.0029) [2024-06-23 14:13:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43145.0, 300 sec: 42765.0). Total num frames: 7390609408. Throughput: 0: 42975.6. Samples: 7390689880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 14:13:09,778][15401] Updated weights for policy 0, policy_version 451090 (0.0033) [2024-06-23 14:13:13,321][15401] Updated weights for policy 0, policy_version 451100 (0.0026) [2024-06-23 14:13:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 7390822400. Throughput: 0: 43025.9. Samples: 7390943380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 14:13:17,447][15401] Updated weights for policy 0, policy_version 451110 (0.0035) [2024-06-23 14:13:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7391035392. Throughput: 0: 42939.6. Samples: 7391199360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 14:13:20,924][15401] Updated weights for policy 0, policy_version 451120 (0.0032) [2024-06-23 14:13:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7391248384. Throughput: 0: 42794.5. Samples: 7391329740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 14:13:25,119][15401] Updated weights for policy 0, policy_version 451130 (0.0035) [2024-06-23 14:13:25,890][15349] Signal inference workers to stop experience collection... (109500 times) [2024-06-23 14:13:25,944][15401] InferenceWorker_p0-w0: stopping experience collection (109500 times) [2024-06-23 14:13:25,951][15349] Signal inference workers to resume experience collection... (109500 times) [2024-06-23 14:13:25,959][15401] InferenceWorker_p0-w0: resuming experience collection (109500 times) [2024-06-23 14:13:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 7391461376. Throughput: 0: 42826.2. Samples: 7391583040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 14:13:28,481][15401] Updated weights for policy 0, policy_version 451140 (0.0036) [2024-06-23 14:13:32,730][15401] Updated weights for policy 0, policy_version 451150 (0.0033) [2024-06-23 14:13:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7391657984. Throughput: 0: 42768.8. Samples: 7391841160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 14:13:36,011][15401] Updated weights for policy 0, policy_version 451160 (0.0033) [2024-06-23 14:13:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7391887360. Throughput: 0: 42580.5. Samples: 7391965560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 14:13:40,279][15401] Updated weights for policy 0, policy_version 451170 (0.0030) [2024-06-23 14:13:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7392116736. Throughput: 0: 42672.2. Samples: 7392224980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 14:13:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 14:13:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451179_7392116736.pth... [2024-06-23 14:13:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450552_7381843968.pth [2024-06-23 14:13:43,638][15401] Updated weights for policy 0, policy_version 451180 (0.0052) [2024-06-23 14:13:48,046][15401] Updated weights for policy 0, policy_version 451190 (0.0034) [2024-06-23 14:13:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7392313344. Throughput: 0: 42495.0. Samples: 7392476000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:13:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 14:13:51,178][15401] Updated weights for policy 0, policy_version 451200 (0.0046) [2024-06-23 14:13:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 7392542720. Throughput: 0: 42468.8. Samples: 7392600980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:13:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 14:13:55,564][15401] Updated weights for policy 0, policy_version 451210 (0.0041) [2024-06-23 14:13:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7392739328. Throughput: 0: 42676.0. Samples: 7392863800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:13:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 14:13:58,859][15401] Updated weights for policy 0, policy_version 451220 (0.0041) [2024-06-23 14:14:03,167][15401] Updated weights for policy 0, policy_version 451230 (0.0031) [2024-06-23 14:14:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7392952320. Throughput: 0: 42572.0. Samples: 7393115100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:03,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 14:14:06,828][15401] Updated weights for policy 0, policy_version 451240 (0.0038) [2024-06-23 14:14:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7393165312. Throughput: 0: 42449.0. Samples: 7393239940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:08,399][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 14:14:11,223][15401] Updated weights for policy 0, policy_version 451250 (0.0030) [2024-06-23 14:14:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7393378304. Throughput: 0: 42538.6. Samples: 7393497280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 14:14:14,579][15401] Updated weights for policy 0, policy_version 451260 (0.0043) [2024-06-23 14:14:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7393591296. Throughput: 0: 42499.0. Samples: 7393753620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 14:14:18,666][15401] Updated weights for policy 0, policy_version 451270 (0.0042) [2024-06-23 14:14:22,286][15401] Updated weights for policy 0, policy_version 451280 (0.0040) [2024-06-23 14:14:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 7393804288. Throughput: 0: 42589.8. Samples: 7393882100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 14:14:26,257][15401] Updated weights for policy 0, policy_version 451290 (0.0047) [2024-06-23 14:14:28,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 7394017280. Throughput: 0: 42585.0. Samples: 7394141400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:28,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 14:14:29,807][15401] Updated weights for policy 0, policy_version 451300 (0.0038) [2024-06-23 14:14:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7394230272. Throughput: 0: 42659.6. Samples: 7394395680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 14:14:34,146][15401] Updated weights for policy 0, policy_version 451310 (0.0038) [2024-06-23 14:14:37,005][15349] Signal inference workers to stop experience collection... (109550 times) [2024-06-23 14:14:37,047][15401] InferenceWorker_p0-w0: stopping experience collection (109550 times) [2024-06-23 14:14:37,067][15349] Signal inference workers to resume experience collection... (109550 times) [2024-06-23 14:14:37,072][15401] InferenceWorker_p0-w0: resuming experience collection (109550 times) [2024-06-23 14:14:37,535][15401] Updated weights for policy 0, policy_version 451320 (0.0041) [2024-06-23 14:14:38,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7394426880. Throughput: 0: 42709.0. Samples: 7394522880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 14:14:41,706][15401] Updated weights for policy 0, policy_version 451330 (0.0042) [2024-06-23 14:14:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7394656256. Throughput: 0: 42503.5. Samples: 7394776460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 14:14:45,629][15401] Updated weights for policy 0, policy_version 451340 (0.0050) [2024-06-23 14:14:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7394885632. Throughput: 0: 42616.5. Samples: 7395032840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:48,391][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 14:14:49,359][15401] Updated weights for policy 0, policy_version 451350 (0.0033) [2024-06-23 14:14:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7395065856. Throughput: 0: 42745.9. Samples: 7395163500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 14:14:53,470][15401] Updated weights for policy 0, policy_version 451360 (0.0047) [2024-06-23 14:14:57,094][15401] Updated weights for policy 0, policy_version 451370 (0.0035) [2024-06-23 14:14:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42710.0). Total num frames: 7395295232. Throughput: 0: 42649.7. Samples: 7395416520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:14:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 14:15:00,901][15401] Updated weights for policy 0, policy_version 451380 (0.0030) [2024-06-23 14:15:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7395508224. Throughput: 0: 42682.0. Samples: 7395674300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:15:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 14:15:04,586][15401] Updated weights for policy 0, policy_version 451390 (0.0022) [2024-06-23 14:15:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7395721216. Throughput: 0: 42713.8. Samples: 7395804220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:15:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 14:15:08,825][15401] Updated weights for policy 0, policy_version 451400 (0.0044) [2024-06-23 14:15:12,058][15401] Updated weights for policy 0, policy_version 451410 (0.0044) [2024-06-23 14:15:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7395934208. Throughput: 0: 42493.8. Samples: 7396053520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 14:15:16,412][15401] Updated weights for policy 0, policy_version 451420 (0.0031) [2024-06-23 14:15:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7396163584. Throughput: 0: 42562.1. Samples: 7396310980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 14:15:19,686][15401] Updated weights for policy 0, policy_version 451430 (0.0031) [2024-06-23 14:15:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7396360192. Throughput: 0: 42584.4. Samples: 7396439180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 14:15:24,039][15401] Updated weights for policy 0, policy_version 451440 (0.0034) [2024-06-23 14:15:27,053][15401] Updated weights for policy 0, policy_version 451450 (0.0037) [2024-06-23 14:15:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7396589568. Throughput: 0: 42634.3. Samples: 7396695000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:28,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 14:15:31,646][15401] Updated weights for policy 0, policy_version 451460 (0.0041) [2024-06-23 14:15:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7396802560. Throughput: 0: 42864.5. Samples: 7396961740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 14:15:34,528][15401] Updated weights for policy 0, policy_version 451470 (0.0039) [2024-06-23 14:15:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7396999168. Throughput: 0: 42845.2. Samples: 7397091540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 14:15:39,215][15401] Updated weights for policy 0, policy_version 451480 (0.0041) [2024-06-23 14:15:42,478][15401] Updated weights for policy 0, policy_version 451490 (0.0030) [2024-06-23 14:15:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7397228544. Throughput: 0: 42656.0. Samples: 7397336040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 14:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451491_7397228544.pth... [2024-06-23 14:15:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000450863_7386939392.pth [2024-06-23 14:15:46,903][15401] Updated weights for policy 0, policy_version 451500 (0.0027) [2024-06-23 14:15:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7397425152. Throughput: 0: 42840.0. Samples: 7397602100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 14:15:50,314][15401] Updated weights for policy 0, policy_version 451510 (0.0046) [2024-06-23 14:15:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7397638144. Throughput: 0: 42600.8. Samples: 7397721260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:53,390][15132] Avg episode reward: [(0, '0.175')] [2024-06-23 14:15:54,415][15401] Updated weights for policy 0, policy_version 451520 (0.0037) [2024-06-23 14:15:58,117][15401] Updated weights for policy 0, policy_version 451530 (0.0028) [2024-06-23 14:15:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7397867520. Throughput: 0: 42751.0. Samples: 7397977320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:15:58,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 14:16:02,424][15401] Updated weights for policy 0, policy_version 451540 (0.0030) [2024-06-23 14:16:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7398080512. Throughput: 0: 42956.0. Samples: 7398244000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 14:16:05,836][15401] Updated weights for policy 0, policy_version 451550 (0.0034) [2024-06-23 14:16:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7398293504. Throughput: 0: 42915.0. Samples: 7398370360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 14:16:10,012][15401] Updated weights for policy 0, policy_version 451560 (0.0043) [2024-06-23 14:16:11,220][15349] Signal inference workers to stop experience collection... (109600 times) [2024-06-23 14:16:11,254][15401] InferenceWorker_p0-w0: stopping experience collection (109600 times) [2024-06-23 14:16:11,278][15349] Signal inference workers to resume experience collection... (109600 times) [2024-06-23 14:16:11,279][15401] InferenceWorker_p0-w0: resuming experience collection (109600 times) [2024-06-23 14:16:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7398506496. Throughput: 0: 42806.3. Samples: 7398621280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:16:13,470][15401] Updated weights for policy 0, policy_version 451570 (0.0042) [2024-06-23 14:16:17,496][15401] Updated weights for policy 0, policy_version 451580 (0.0041) [2024-06-23 14:16:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7398719488. Throughput: 0: 42707.9. Samples: 7398883600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 14:16:21,096][15401] Updated weights for policy 0, policy_version 451590 (0.0033) [2024-06-23 14:16:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7398899712. Throughput: 0: 42588.5. Samples: 7399008020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 14:16:25,300][15401] Updated weights for policy 0, policy_version 451600 (0.0046) [2024-06-23 14:16:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7399145472. Throughput: 0: 42827.6. Samples: 7399263280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 14:16:28,733][15401] Updated weights for policy 0, policy_version 451610 (0.0031) [2024-06-23 14:16:32,894][15401] Updated weights for policy 0, policy_version 451620 (0.0044) [2024-06-23 14:16:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7399358464. Throughput: 0: 42518.1. Samples: 7399515420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 14:16:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 14:16:36,493][15401] Updated weights for policy 0, policy_version 451630 (0.0035) [2024-06-23 14:16:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7399522304. Throughput: 0: 42668.1. Samples: 7399641320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:16:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 14:16:40,676][15401] Updated weights for policy 0, policy_version 451640 (0.0041) [2024-06-23 14:16:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7399784448. Throughput: 0: 42478.7. Samples: 7399888860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:16:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 14:16:44,438][15401] Updated weights for policy 0, policy_version 451650 (0.0032) [2024-06-23 14:16:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7399981056. Throughput: 0: 42495.6. Samples: 7400156300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:16:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 14:16:48,534][15401] Updated weights for policy 0, policy_version 451660 (0.0042) [2024-06-23 14:16:52,124][15401] Updated weights for policy 0, policy_version 451670 (0.0039) [2024-06-23 14:16:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7400194048. Throughput: 0: 42381.0. Samples: 7400277500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:16:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 14:16:55,958][15401] Updated weights for policy 0, policy_version 451680 (0.0029) [2024-06-23 14:16:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 7400423424. Throughput: 0: 42511.6. Samples: 7400534300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:16:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 14:16:59,906][15401] Updated weights for policy 0, policy_version 451690 (0.0028) [2024-06-23 14:17:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 7400636416. Throughput: 0: 42554.6. Samples: 7400798560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:03,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 14:17:03,767][15401] Updated weights for policy 0, policy_version 451700 (0.0034) [2024-06-23 14:17:07,414][15401] Updated weights for policy 0, policy_version 451710 (0.0026) [2024-06-23 14:17:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 7400849408. Throughput: 0: 42699.9. Samples: 7400929520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 14:17:11,416][15401] Updated weights for policy 0, policy_version 451720 (0.0037) [2024-06-23 14:17:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7401078784. Throughput: 0: 42604.4. Samples: 7401180480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 14:17:15,518][15401] Updated weights for policy 0, policy_version 451730 (0.0030) [2024-06-23 14:17:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7401275392. Throughput: 0: 42693.2. Samples: 7401436620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 14:17:19,090][15401] Updated weights for policy 0, policy_version 451740 (0.0039) [2024-06-23 14:17:23,058][15401] Updated weights for policy 0, policy_version 451750 (0.0043) [2024-06-23 14:17:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7401472000. Throughput: 0: 42674.1. Samples: 7401561660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 14:17:26,615][15401] Updated weights for policy 0, policy_version 451760 (0.0042) [2024-06-23 14:17:28,390][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7401717760. Throughput: 0: 42961.8. Samples: 7401822140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 14:17:30,442][15401] Updated weights for policy 0, policy_version 451770 (0.0033) [2024-06-23 14:17:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7401897984. Throughput: 0: 42956.1. Samples: 7402089320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 14:17:33,549][15349] Signal inference workers to stop experience collection... (109650 times) [2024-06-23 14:17:33,585][15401] InferenceWorker_p0-w0: stopping experience collection (109650 times) [2024-06-23 14:17:33,620][15349] Signal inference workers to resume experience collection... (109650 times) [2024-06-23 14:17:33,622][15401] InferenceWorker_p0-w0: resuming experience collection (109650 times) [2024-06-23 14:17:34,060][15401] Updated weights for policy 0, policy_version 451780 (0.0031) [2024-06-23 14:17:38,197][15401] Updated weights for policy 0, policy_version 451790 (0.0036) [2024-06-23 14:17:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 7402127360. Throughput: 0: 42915.4. Samples: 7402208700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 14:17:41,721][15401] Updated weights for policy 0, policy_version 451800 (0.0035) [2024-06-23 14:17:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7402356736. Throughput: 0: 42924.9. Samples: 7402465920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 14:17:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451805_7402373120.pth... [2024-06-23 14:17:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451179_7392116736.pth [2024-06-23 14:17:45,757][15401] Updated weights for policy 0, policy_version 451810 (0.0032) [2024-06-23 14:17:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 7402536960. Throughput: 0: 42824.6. Samples: 7402725660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 14:17:49,465][15401] Updated weights for policy 0, policy_version 451820 (0.0038) [2024-06-23 14:17:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7402766336. Throughput: 0: 42688.8. Samples: 7402850520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 14:17:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 14:17:53,816][15401] Updated weights for policy 0, policy_version 451830 (0.0052) [2024-06-23 14:17:57,071][15401] Updated weights for policy 0, policy_version 451840 (0.0040) [2024-06-23 14:17:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7402979328. Throughput: 0: 42672.6. Samples: 7403100740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:17:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 14:18:01,319][15401] Updated weights for policy 0, policy_version 451850 (0.0032) [2024-06-23 14:18:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7403175936. Throughput: 0: 43001.1. Samples: 7403371660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 14:18:04,743][15401] Updated weights for policy 0, policy_version 451860 (0.0039) [2024-06-23 14:18:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7403421696. Throughput: 0: 42873.4. Samples: 7403490960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 14:18:08,812][15401] Updated weights for policy 0, policy_version 451870 (0.0034) [2024-06-23 14:18:12,314][15401] Updated weights for policy 0, policy_version 451880 (0.0030) [2024-06-23 14:18:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7403651072. Throughput: 0: 42923.1. Samples: 7403753680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 14:18:16,373][15401] Updated weights for policy 0, policy_version 451890 (0.0029) [2024-06-23 14:18:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 7403831296. Throughput: 0: 42759.9. Samples: 7404013620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:18,393][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 14:18:19,804][15401] Updated weights for policy 0, policy_version 451900 (0.0026) [2024-06-23 14:18:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7404060672. Throughput: 0: 42790.7. Samples: 7404134280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 14:18:24,161][15401] Updated weights for policy 0, policy_version 451910 (0.0039) [2024-06-23 14:18:27,365][15401] Updated weights for policy 0, policy_version 451920 (0.0036) [2024-06-23 14:18:28,390][15132] Fps is (10 sec: 47524.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7404306432. Throughput: 0: 42882.0. Samples: 7404395620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 14:18:31,888][15401] Updated weights for policy 0, policy_version 451930 (0.0034) [2024-06-23 14:18:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7404470272. Throughput: 0: 42921.2. Samples: 7404657120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 14:18:34,925][15401] Updated weights for policy 0, policy_version 451940 (0.0041) [2024-06-23 14:18:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7404699648. Throughput: 0: 42694.4. Samples: 7404771760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 14:18:39,655][15401] Updated weights for policy 0, policy_version 451950 (0.0043) [2024-06-23 14:18:42,457][15401] Updated weights for policy 0, policy_version 451960 (0.0036) [2024-06-23 14:18:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7404929024. Throughput: 0: 42977.7. Samples: 7405034740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 14:18:47,450][15401] Updated weights for policy 0, policy_version 451970 (0.0036) [2024-06-23 14:18:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7405092864. Throughput: 0: 42855.9. Samples: 7405300180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:48,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 14:18:48,907][15349] Signal inference workers to stop experience collection... (109700 times) [2024-06-23 14:18:48,943][15401] InferenceWorker_p0-w0: stopping experience collection (109700 times) [2024-06-23 14:18:48,970][15349] Signal inference workers to resume experience collection... (109700 times) [2024-06-23 14:18:48,976][15401] InferenceWorker_p0-w0: resuming experience collection (109700 times) [2024-06-23 14:18:50,061][15401] Updated weights for policy 0, policy_version 451980 (0.0036) [2024-06-23 14:18:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7405355008. Throughput: 0: 42792.0. Samples: 7405416600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 14:18:54,922][15401] Updated weights for policy 0, policy_version 451990 (0.0034) [2024-06-23 14:18:57,715][15401] Updated weights for policy 0, policy_version 452000 (0.0037) [2024-06-23 14:18:58,392][15132] Fps is (10 sec: 49140.3, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 7405584384. Throughput: 0: 42931.1. Samples: 7405685680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:18:58,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 14:19:02,403][15401] Updated weights for policy 0, policy_version 452010 (0.0035) [2024-06-23 14:19:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7405748224. Throughput: 0: 43177.3. Samples: 7405956500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:19:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 14:19:05,638][15401] Updated weights for policy 0, policy_version 452020 (0.0037) [2024-06-23 14:19:08,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7405993984. Throughput: 0: 43025.7. Samples: 7406070440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:19:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 14:19:10,089][15401] Updated weights for policy 0, policy_version 452030 (0.0030) [2024-06-23 14:19:13,254][15401] Updated weights for policy 0, policy_version 452040 (0.0043) [2024-06-23 14:19:13,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7406223360. Throughput: 0: 42897.1. Samples: 7406325980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:19:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 14:19:17,897][15401] Updated weights for policy 0, policy_version 452050 (0.0041) [2024-06-23 14:19:18,396][15132] Fps is (10 sec: 39297.0, 60 sec: 42595.6, 300 sec: 42653.0). Total num frames: 7406387200. Throughput: 0: 42959.3. Samples: 7406590560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 14:19:18,397][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 14:19:20,915][15401] Updated weights for policy 0, policy_version 452060 (0.0032) [2024-06-23 14:19:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 7406649344. Throughput: 0: 43103.0. Samples: 7406711400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:23,393][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 14:19:25,651][15401] Updated weights for policy 0, policy_version 452070 (0.0043) [2024-06-23 14:19:28,390][15132] Fps is (10 sec: 47543.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 7406862336. Throughput: 0: 43027.1. Samples: 7406970960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 14:19:28,524][15401] Updated weights for policy 0, policy_version 452080 (0.0037) [2024-06-23 14:19:33,301][15401] Updated weights for policy 0, policy_version 452090 (0.0039) [2024-06-23 14:19:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7407042560. Throughput: 0: 42999.5. Samples: 7407235160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:33,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 14:19:36,118][15401] Updated weights for policy 0, policy_version 452100 (0.0039) [2024-06-23 14:19:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7407288320. Throughput: 0: 43030.7. Samples: 7407352980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 14:19:41,059][15401] Updated weights for policy 0, policy_version 452110 (0.0042) [2024-06-23 14:19:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7407501312. Throughput: 0: 42831.6. Samples: 7407613000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:43,393][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 14:19:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452118_7407501312.pth... [2024-06-23 14:19:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451491_7397228544.pth [2024-06-23 14:19:43,784][15401] Updated weights for policy 0, policy_version 452120 (0.0050) [2024-06-23 14:19:45,191][15349] Signal inference workers to stop experience collection... (109750 times) [2024-06-23 14:19:45,213][15401] InferenceWorker_p0-w0: stopping experience collection (109750 times) [2024-06-23 14:19:45,250][15349] Signal inference workers to resume experience collection... (109750 times) [2024-06-23 14:19:45,250][15401] InferenceWorker_p0-w0: resuming experience collection (109750 times) [2024-06-23 14:19:48,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7407665152. Throughput: 0: 42634.8. Samples: 7407875060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 14:19:48,748][15401] Updated weights for policy 0, policy_version 452130 (0.0027) [2024-06-23 14:19:51,311][15401] Updated weights for policy 0, policy_version 452140 (0.0030) [2024-06-23 14:19:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7407943680. Throughput: 0: 42716.2. Samples: 7407992660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:53,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 14:19:56,291][15401] Updated weights for policy 0, policy_version 452150 (0.0036) [2024-06-23 14:19:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 7408123904. Throughput: 0: 42868.9. Samples: 7408255080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:19:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 14:19:58,909][15401] Updated weights for policy 0, policy_version 452160 (0.0031) [2024-06-23 14:20:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7408320512. Throughput: 0: 42705.2. Samples: 7408512020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:03,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 14:20:03,853][15401] Updated weights for policy 0, policy_version 452170 (0.0036) [2024-06-23 14:20:07,009][15401] Updated weights for policy 0, policy_version 452180 (0.0039) [2024-06-23 14:20:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7408566272. Throughput: 0: 42700.5. Samples: 7408632920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:08,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 14:20:11,265][15401] Updated weights for policy 0, policy_version 452190 (0.0042) [2024-06-23 14:20:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7408762880. Throughput: 0: 42996.1. Samples: 7408905780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 14:20:14,377][15401] Updated weights for policy 0, policy_version 452200 (0.0044) [2024-06-23 14:20:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 7408975872. Throughput: 0: 42588.1. Samples: 7409151620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 14:20:19,035][15401] Updated weights for policy 0, policy_version 452210 (0.0032) [2024-06-23 14:20:22,006][15401] Updated weights for policy 0, policy_version 452220 (0.0044) [2024-06-23 14:20:23,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 7409221632. Throughput: 0: 42800.4. Samples: 7409279100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:23,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 14:20:26,653][15401] Updated weights for policy 0, policy_version 452230 (0.0043) [2024-06-23 14:20:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7409401856. Throughput: 0: 42814.6. Samples: 7409539660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 14:20:29,616][15401] Updated weights for policy 0, policy_version 452240 (0.0027) [2024-06-23 14:20:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7409631232. Throughput: 0: 42510.6. Samples: 7409788040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:33,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 14:20:34,454][15401] Updated weights for policy 0, policy_version 452250 (0.0030) [2024-06-23 14:20:37,618][15401] Updated weights for policy 0, policy_version 452260 (0.0033) [2024-06-23 14:20:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7409860608. Throughput: 0: 42772.4. Samples: 7409917420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 14:20:41,990][15401] Updated weights for policy 0, policy_version 452270 (0.0042) [2024-06-23 14:20:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 7410008064. Throughput: 0: 42492.0. Samples: 7410167220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-23 14:20:43,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 14:20:45,550][15401] Updated weights for policy 0, policy_version 452280 (0.0036) [2024-06-23 14:20:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7410253824. Throughput: 0: 42399.5. Samples: 7410420000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:20:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 14:20:50,179][15401] Updated weights for policy 0, policy_version 452290 (0.0034) [2024-06-23 14:20:51,092][15349] Signal inference workers to stop experience collection... (109800 times) [2024-06-23 14:20:51,092][15349] Signal inference workers to resume experience collection... (109800 times) [2024-06-23 14:20:51,137][15401] InferenceWorker_p0-w0: stopping experience collection (109800 times) [2024-06-23 14:20:51,137][15401] InferenceWorker_p0-w0: resuming experience collection (109800 times) [2024-06-23 14:20:53,288][15401] Updated weights for policy 0, policy_version 452300 (0.0036) [2024-06-23 14:20:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7410483200. Throughput: 0: 42632.9. Samples: 7410551400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:20:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 14:20:57,723][15401] Updated weights for policy 0, policy_version 452310 (0.0023) [2024-06-23 14:20:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7410663424. Throughput: 0: 42190.1. Samples: 7410804340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:20:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 14:21:01,127][15401] Updated weights for policy 0, policy_version 452320 (0.0031) [2024-06-23 14:21:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7410909184. Throughput: 0: 42295.4. Samples: 7411054920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:03,399][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 14:21:05,275][15401] Updated weights for policy 0, policy_version 452330 (0.0045) [2024-06-23 14:21:08,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 7411122176. Throughput: 0: 42493.0. Samples: 7411191280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:08,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 14:21:08,587][15401] Updated weights for policy 0, policy_version 452340 (0.0027) [2024-06-23 14:21:13,036][15401] Updated weights for policy 0, policy_version 452350 (0.0037) [2024-06-23 14:21:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7411318784. Throughput: 0: 42450.7. Samples: 7411449940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 14:21:16,669][15401] Updated weights for policy 0, policy_version 452360 (0.0048) [2024-06-23 14:21:18,390][15132] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7411564544. Throughput: 0: 42411.5. Samples: 7411696560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 14:21:20,670][15401] Updated weights for policy 0, policy_version 452370 (0.0041) [2024-06-23 14:21:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 7411744768. Throughput: 0: 42538.2. Samples: 7411831640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:23,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 14:21:23,917][15401] Updated weights for policy 0, policy_version 452380 (0.0040) [2024-06-23 14:21:28,194][15401] Updated weights for policy 0, policy_version 452390 (0.0030) [2024-06-23 14:21:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7411957760. Throughput: 0: 42687.5. Samples: 7412088160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 14:21:31,691][15401] Updated weights for policy 0, policy_version 452400 (0.0021) [2024-06-23 14:21:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7412203520. Throughput: 0: 42658.8. Samples: 7412339640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 14:21:35,649][15401] Updated weights for policy 0, policy_version 452410 (0.0039) [2024-06-23 14:21:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7412383744. Throughput: 0: 42873.3. Samples: 7412480700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 14:21:39,283][15401] Updated weights for policy 0, policy_version 452420 (0.0037) [2024-06-23 14:21:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7412596736. Throughput: 0: 42884.9. Samples: 7412734160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 14:21:43,416][15401] Updated weights for policy 0, policy_version 452430 (0.0039) [2024-06-23 14:21:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452430_7412613120.pth... [2024-06-23 14:21:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000451805_7402373120.pth [2024-06-23 14:21:46,940][15401] Updated weights for policy 0, policy_version 452440 (0.0042) [2024-06-23 14:21:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7412842496. Throughput: 0: 42899.3. Samples: 7412985380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 14:21:51,298][15401] Updated weights for policy 0, policy_version 452450 (0.0043) [2024-06-23 14:21:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7413022720. Throughput: 0: 42812.8. Samples: 7413117760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 14:21:54,647][15401] Updated weights for policy 0, policy_version 452460 (0.0027) [2024-06-23 14:21:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7413235712. Throughput: 0: 42570.2. Samples: 7413365600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:21:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 14:21:59,087][15401] Updated weights for policy 0, policy_version 452470 (0.0037) [2024-06-23 14:22:02,460][15401] Updated weights for policy 0, policy_version 452480 (0.0022) [2024-06-23 14:22:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7413465088. Throughput: 0: 42848.1. Samples: 7413624720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:22:03,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 14:22:06,514][15401] Updated weights for policy 0, policy_version 452490 (0.0022) [2024-06-23 14:22:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 7413678080. Throughput: 0: 42772.5. Samples: 7413756400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 14:22:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 14:22:10,054][15401] Updated weights for policy 0, policy_version 452500 (0.0044) [2024-06-23 14:22:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7413891072. Throughput: 0: 42635.1. Samples: 7414006740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 14:22:13,989][15401] Updated weights for policy 0, policy_version 452510 (0.0033) [2024-06-23 14:22:17,694][15401] Updated weights for policy 0, policy_version 452520 (0.0030) [2024-06-23 14:22:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 7414087680. Throughput: 0: 42742.2. Samples: 7414263040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 14:22:21,555][15401] Updated weights for policy 0, policy_version 452530 (0.0024) [2024-06-23 14:22:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7414333440. Throughput: 0: 42471.1. Samples: 7414391900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 14:22:25,888][15401] Updated weights for policy 0, policy_version 452540 (0.0028) [2024-06-23 14:22:26,878][15349] Signal inference workers to stop experience collection... (109850 times) [2024-06-23 14:22:26,914][15401] InferenceWorker_p0-w0: stopping experience collection (109850 times) [2024-06-23 14:22:26,940][15349] Signal inference workers to resume experience collection... (109850 times) [2024-06-23 14:22:26,944][15401] InferenceWorker_p0-w0: resuming experience collection (109850 times) [2024-06-23 14:22:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7414513664. Throughput: 0: 42551.2. Samples: 7414648960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 14:22:29,126][15401] Updated weights for policy 0, policy_version 452550 (0.0038) [2024-06-23 14:22:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7414726656. Throughput: 0: 42653.7. Samples: 7414904800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:33,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-23 14:22:33,545][15401] Updated weights for policy 0, policy_version 452560 (0.0034) [2024-06-23 14:22:36,807][15401] Updated weights for policy 0, policy_version 452570 (0.0030) [2024-06-23 14:22:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7414956032. Throughput: 0: 42427.5. Samples: 7415027000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 14:22:41,115][15401] Updated weights for policy 0, policy_version 452580 (0.0034) [2024-06-23 14:22:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7415152640. Throughput: 0: 42704.4. Samples: 7415287400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:43,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 14:22:44,621][15401] Updated weights for policy 0, policy_version 452590 (0.0036) [2024-06-23 14:22:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42654.0). Total num frames: 7415349248. Throughput: 0: 42621.2. Samples: 7415542680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 14:22:48,890][15401] Updated weights for policy 0, policy_version 452600 (0.0037) [2024-06-23 14:22:52,459][15401] Updated weights for policy 0, policy_version 452610 (0.0035) [2024-06-23 14:22:53,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7415611392. Throughput: 0: 42411.1. Samples: 7415664900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 14:22:56,636][15401] Updated weights for policy 0, policy_version 452620 (0.0031) [2024-06-23 14:22:58,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 7415791616. Throughput: 0: 42504.1. Samples: 7415919700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:22:58,396][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 14:23:00,138][15401] Updated weights for policy 0, policy_version 452630 (0.0041) [2024-06-23 14:23:03,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 7416004608. Throughput: 0: 42527.2. Samples: 7416177040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:03,397][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 14:23:04,476][15401] Updated weights for policy 0, policy_version 452640 (0.0032) [2024-06-23 14:23:07,871][15401] Updated weights for policy 0, policy_version 452650 (0.0025) [2024-06-23 14:23:08,390][15132] Fps is (10 sec: 45904.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7416250368. Throughput: 0: 42416.8. Samples: 7416300660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 14:23:12,150][15401] Updated weights for policy 0, policy_version 452660 (0.0032) [2024-06-23 14:23:13,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 7416430592. Throughput: 0: 42410.1. Samples: 7416557420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 14:23:15,594][15401] Updated weights for policy 0, policy_version 452670 (0.0037) [2024-06-23 14:23:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7416643584. Throughput: 0: 42401.8. Samples: 7416812880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 14:23:19,747][15401] Updated weights for policy 0, policy_version 452680 (0.0032) [2024-06-23 14:23:23,245][15401] Updated weights for policy 0, policy_version 452690 (0.0041) [2024-06-23 14:23:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7416872960. Throughput: 0: 42469.0. Samples: 7416938100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 14:23:27,560][15401] Updated weights for policy 0, policy_version 452700 (0.0030) [2024-06-23 14:23:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7417053184. Throughput: 0: 42487.7. Samples: 7417199240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-23 14:23:28,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-23 14:23:30,854][15401] Updated weights for policy 0, policy_version 452710 (0.0032) [2024-06-23 14:23:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7417282560. Throughput: 0: 42332.6. Samples: 7417447640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 14:23:35,391][15401] Updated weights for policy 0, policy_version 452720 (0.0030) [2024-06-23 14:23:38,270][15349] Signal inference workers to stop experience collection... (109900 times) [2024-06-23 14:23:38,270][15349] Signal inference workers to resume experience collection... (109900 times) [2024-06-23 14:23:38,309][15401] InferenceWorker_p0-w0: stopping experience collection (109900 times) [2024-06-23 14:23:38,309][15401] InferenceWorker_p0-w0: resuming experience collection (109900 times) [2024-06-23 14:23:38,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7417511936. Throughput: 0: 42551.4. Samples: 7417579720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 14:23:38,415][15401] Updated weights for policy 0, policy_version 452730 (0.0040) [2024-06-23 14:23:42,926][15401] Updated weights for policy 0, policy_version 452740 (0.0043) [2024-06-23 14:23:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 7417708544. Throughput: 0: 42701.6. Samples: 7417841000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 14:23:43,585][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452743_7417741312.pth... [2024-06-23 14:23:43,638][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452118_7407501312.pth [2024-06-23 14:23:46,040][15401] Updated weights for policy 0, policy_version 452750 (0.0040) [2024-06-23 14:23:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7417937920. Throughput: 0: 42566.1. Samples: 7418092240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 14:23:50,827][15401] Updated weights for policy 0, policy_version 452760 (0.0025) [2024-06-23 14:23:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 7418150912. Throughput: 0: 42739.3. Samples: 7418223920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 14:23:53,645][15401] Updated weights for policy 0, policy_version 452770 (0.0038) [2024-06-23 14:23:58,354][15401] Updated weights for policy 0, policy_version 452780 (0.0024) [2024-06-23 14:23:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 7418347520. Throughput: 0: 42823.1. Samples: 7418484460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:23:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 14:24:01,149][15401] Updated weights for policy 0, policy_version 452790 (0.0023) [2024-06-23 14:24:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.1, 300 sec: 42654.0). Total num frames: 7418576896. Throughput: 0: 42744.5. Samples: 7418736380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:03,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 14:24:06,045][15401] Updated weights for policy 0, policy_version 452800 (0.0043) [2024-06-23 14:24:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 7418789888. Throughput: 0: 42881.3. Samples: 7418867760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 14:24:08,837][15401] Updated weights for policy 0, policy_version 452810 (0.0034) [2024-06-23 14:24:13,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42599.3). Total num frames: 7418953728. Throughput: 0: 42644.8. Samples: 7419118260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 14:24:13,926][15401] Updated weights for policy 0, policy_version 452820 (0.0036) [2024-06-23 14:24:16,775][15401] Updated weights for policy 0, policy_version 452830 (0.0043) [2024-06-23 14:24:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7419215872. Throughput: 0: 42597.7. Samples: 7419364540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 14:24:21,578][15401] Updated weights for policy 0, policy_version 452840 (0.0040) [2024-06-23 14:24:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7419428864. Throughput: 0: 42792.0. Samples: 7419505360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:23,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 14:24:24,446][15401] Updated weights for policy 0, policy_version 452850 (0.0032) [2024-06-23 14:24:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7419592704. Throughput: 0: 42481.9. Samples: 7419752680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 14:24:29,208][15401] Updated weights for policy 0, policy_version 452860 (0.0032) [2024-06-23 14:24:31,873][15401] Updated weights for policy 0, policy_version 452870 (0.0033) [2024-06-23 14:24:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7419871232. Throughput: 0: 42565.2. Samples: 7420007680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 14:24:36,840][15401] Updated weights for policy 0, policy_version 452880 (0.0039) [2024-06-23 14:24:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 7420051456. Throughput: 0: 42670.9. Samples: 7420144120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 14:24:39,409][15401] Updated weights for policy 0, policy_version 452890 (0.0032) [2024-06-23 14:24:41,995][15349] Signal inference workers to stop experience collection... (109950 times) [2024-06-23 14:24:42,029][15401] InferenceWorker_p0-w0: stopping experience collection (109950 times) [2024-06-23 14:24:42,053][15349] Signal inference workers to resume experience collection... (109950 times) [2024-06-23 14:24:42,054][15401] InferenceWorker_p0-w0: resuming experience collection (109950 times) [2024-06-23 14:24:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7420264448. Throughput: 0: 42467.6. Samples: 7420395500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 14:24:44,593][15401] Updated weights for policy 0, policy_version 452900 (0.0034) [2024-06-23 14:24:47,691][15401] Updated weights for policy 0, policy_version 452910 (0.0038) [2024-06-23 14:24:48,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 7420510208. Throughput: 0: 42437.7. Samples: 7420646180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:48,392][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 14:24:52,396][15401] Updated weights for policy 0, policy_version 452920 (0.0047) [2024-06-23 14:24:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7420690432. Throughput: 0: 42564.8. Samples: 7420783180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 14:24:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 14:24:55,139][15401] Updated weights for policy 0, policy_version 452930 (0.0035) [2024-06-23 14:24:58,392][15132] Fps is (10 sec: 37683.2, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 7420887040. Throughput: 0: 42600.0. Samples: 7421035360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:24:58,392][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 14:25:00,088][15401] Updated weights for policy 0, policy_version 452940 (0.0039) [2024-06-23 14:25:02,993][15401] Updated weights for policy 0, policy_version 452950 (0.0034) [2024-06-23 14:25:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7421149184. Throughput: 0: 42593.3. Samples: 7421281240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:03,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 14:25:07,838][15401] Updated weights for policy 0, policy_version 452960 (0.0037) [2024-06-23 14:25:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7421313024. Throughput: 0: 42535.3. Samples: 7421419440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:08,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 14:25:10,738][15401] Updated weights for policy 0, policy_version 452970 (0.0031) [2024-06-23 14:25:13,396][15132] Fps is (10 sec: 39296.5, 60 sec: 43139.9, 300 sec: 42597.5). Total num frames: 7421542400. Throughput: 0: 42540.1. Samples: 7421667260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:13,396][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 14:25:15,345][15401] Updated weights for policy 0, policy_version 452980 (0.0038) [2024-06-23 14:25:18,207][15401] Updated weights for policy 0, policy_version 452990 (0.0035) [2024-06-23 14:25:18,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 7421788160. Throughput: 0: 42636.2. Samples: 7421926300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 14:25:22,890][15401] Updated weights for policy 0, policy_version 453000 (0.0041) [2024-06-23 14:25:23,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 7421952000. Throughput: 0: 42565.5. Samples: 7422059560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 14:25:25,742][15401] Updated weights for policy 0, policy_version 453010 (0.0031) [2024-06-23 14:25:28,392][15132] Fps is (10 sec: 39311.9, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 7422181376. Throughput: 0: 42523.5. Samples: 7422309160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:28,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 14:25:30,346][15401] Updated weights for policy 0, policy_version 453020 (0.0031) [2024-06-23 14:25:33,336][15401] Updated weights for policy 0, policy_version 453030 (0.0033) [2024-06-23 14:25:33,390][15132] Fps is (10 sec: 49151.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7422443520. Throughput: 0: 42733.3. Samples: 7422569080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 14:25:37,963][15401] Updated weights for policy 0, policy_version 453040 (0.0026) [2024-06-23 14:25:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7422607360. Throughput: 0: 42685.3. Samples: 7422704020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 14:25:40,885][15401] Updated weights for policy 0, policy_version 453050 (0.0040) [2024-06-23 14:25:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7422836736. Throughput: 0: 42660.4. Samples: 7422954980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 14:25:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453054_7422836736.pth... [2024-06-23 14:25:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452430_7412613120.pth [2024-06-23 14:25:45,855][15401] Updated weights for policy 0, policy_version 453060 (0.0031) [2024-06-23 14:25:48,238][15349] Signal inference workers to stop experience collection... (110000 times) [2024-06-23 14:25:48,260][15401] InferenceWorker_p0-w0: stopping experience collection (110000 times) [2024-06-23 14:25:48,298][15349] Signal inference workers to resume experience collection... (110000 times) [2024-06-23 14:25:48,298][15401] InferenceWorker_p0-w0: resuming experience collection (110000 times) [2024-06-23 14:25:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 7423049728. Throughput: 0: 42815.6. Samples: 7423207940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 14:25:49,015][15401] Updated weights for policy 0, policy_version 453070 (0.0035) [2024-06-23 14:25:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7423246336. Throughput: 0: 42583.9. Samples: 7423335720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 14:25:53,488][15401] Updated weights for policy 0, policy_version 453080 (0.0043) [2024-06-23 14:25:56,621][15401] Updated weights for policy 0, policy_version 453090 (0.0022) [2024-06-23 14:25:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 7423475712. Throughput: 0: 42858.1. Samples: 7423595600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:25:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 14:26:01,094][15401] Updated weights for policy 0, policy_version 453100 (0.0034) [2024-06-23 14:26:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 7423688704. Throughput: 0: 42675.9. Samples: 7423846720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:26:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 14:26:04,613][15401] Updated weights for policy 0, policy_version 453110 (0.0046) [2024-06-23 14:26:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7423885312. Throughput: 0: 42673.6. Samples: 7423979880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:26:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 14:26:08,883][15401] Updated weights for policy 0, policy_version 453120 (0.0028) [2024-06-23 14:26:12,166][15401] Updated weights for policy 0, policy_version 453130 (0.0023) [2024-06-23 14:26:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43149.2, 300 sec: 42598.4). Total num frames: 7424131072. Throughput: 0: 42828.5. Samples: 7424236340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:26:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 14:26:16,422][15401] Updated weights for policy 0, policy_version 453140 (0.0032) [2024-06-23 14:26:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7424344064. Throughput: 0: 42814.3. Samples: 7424495720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-23 14:26:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 14:26:19,676][15401] Updated weights for policy 0, policy_version 453150 (0.0043) [2024-06-23 14:26:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7424524288. Throughput: 0: 42709.0. Samples: 7424625920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 14:26:23,945][15401] Updated weights for policy 0, policy_version 453160 (0.0027) [2024-06-23 14:26:27,115][15401] Updated weights for policy 0, policy_version 453170 (0.0038) [2024-06-23 14:26:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 7424753664. Throughput: 0: 42712.5. Samples: 7424877040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 14:26:31,677][15401] Updated weights for policy 0, policy_version 453180 (0.0046) [2024-06-23 14:26:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7424966656. Throughput: 0: 42990.7. Samples: 7425142520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 14:26:34,691][15401] Updated weights for policy 0, policy_version 453190 (0.0032) [2024-06-23 14:26:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7425163264. Throughput: 0: 43005.7. Samples: 7425270980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 14:26:39,445][15401] Updated weights for policy 0, policy_version 453200 (0.0033) [2024-06-23 14:26:42,329][15401] Updated weights for policy 0, policy_version 453210 (0.0037) [2024-06-23 14:26:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7425409024. Throughput: 0: 42756.8. Samples: 7425519660. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 14:26:47,165][15401] Updated weights for policy 0, policy_version 453220 (0.0038) [2024-06-23 14:26:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7425605632. Throughput: 0: 42923.2. Samples: 7425778260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 14:26:50,176][15401] Updated weights for policy 0, policy_version 453230 (0.0031) [2024-06-23 14:26:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7425818624. Throughput: 0: 42647.5. Samples: 7425899020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:26:54,784][15401] Updated weights for policy 0, policy_version 453240 (0.0041) [2024-06-23 14:26:57,719][15349] Signal inference workers to stop experience collection... (110050 times) [2024-06-23 14:26:57,720][15349] Signal inference workers to resume experience collection... (110050 times) [2024-06-23 14:26:57,734][15401] InferenceWorker_p0-w0: stopping experience collection (110050 times) [2024-06-23 14:26:57,734][15401] InferenceWorker_p0-w0: resuming experience collection (110050 times) [2024-06-23 14:26:57,863][15401] Updated weights for policy 0, policy_version 453250 (0.0030) [2024-06-23 14:26:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7426064384. Throughput: 0: 42786.6. Samples: 7426161740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:26:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 14:27:02,911][15401] Updated weights for policy 0, policy_version 453260 (0.0033) [2024-06-23 14:27:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7426228224. Throughput: 0: 42641.0. Samples: 7426414560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 14:27:05,586][15401] Updated weights for policy 0, policy_version 453270 (0.0029) [2024-06-23 14:27:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7426457600. Throughput: 0: 42315.5. Samples: 7426530120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 14:27:10,686][15401] Updated weights for policy 0, policy_version 453280 (0.0034) [2024-06-23 14:27:13,297][15401] Updated weights for policy 0, policy_version 453290 (0.0031) [2024-06-23 14:27:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7426703360. Throughput: 0: 42639.5. Samples: 7426795820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 14:27:18,378][15401] Updated weights for policy 0, policy_version 453300 (0.0030) [2024-06-23 14:27:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 7426867200. Throughput: 0: 42553.8. Samples: 7427057440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:18,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 14:27:20,895][15401] Updated weights for policy 0, policy_version 453310 (0.0032) [2024-06-23 14:27:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7427096576. Throughput: 0: 42240.1. Samples: 7427171780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 14:27:25,998][15401] Updated weights for policy 0, policy_version 453320 (0.0039) [2024-06-23 14:27:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7427325952. Throughput: 0: 42562.6. Samples: 7427434980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:28,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 14:27:29,321][15401] Updated weights for policy 0, policy_version 453330 (0.0026) [2024-06-23 14:27:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7427506176. Throughput: 0: 42600.9. Samples: 7427695300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 14:27:33,438][15401] Updated weights for policy 0, policy_version 453340 (0.0038) [2024-06-23 14:27:36,939][15401] Updated weights for policy 0, policy_version 453350 (0.0032) [2024-06-23 14:27:38,396][15132] Fps is (10 sec: 42571.4, 60 sec: 43140.0, 300 sec: 42708.9). Total num frames: 7427751936. Throughput: 0: 42586.4. Samples: 7427815680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:38,396][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 14:27:40,887][15401] Updated weights for policy 0, policy_version 453360 (0.0031) [2024-06-23 14:27:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7427964928. Throughput: 0: 42668.0. Samples: 7428081800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-23 14:27:43,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 14:27:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453367_7427964928.pth... [2024-06-23 14:27:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000452743_7417741312.pth [2024-06-23 14:27:44,415][15401] Updated weights for policy 0, policy_version 453370 (0.0038) [2024-06-23 14:27:48,315][15401] Updated weights for policy 0, policy_version 453380 (0.0029) [2024-06-23 14:27:48,396][15132] Fps is (10 sec: 42598.7, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 7428177920. Throughput: 0: 42797.0. Samples: 7428340700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:27:48,396][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 14:27:52,234][15401] Updated weights for policy 0, policy_version 453390 (0.0040) [2024-06-23 14:27:53,217][15349] Signal inference workers to stop experience collection... (110100 times) [2024-06-23 14:27:53,218][15349] Signal inference workers to resume experience collection... (110100 times) [2024-06-23 14:27:53,236][15401] InferenceWorker_p0-w0: stopping experience collection (110100 times) [2024-06-23 14:27:53,272][15401] InferenceWorker_p0-w0: resuming experience collection (110100 times) [2024-06-23 14:27:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42821.5). Total num frames: 7428423680. Throughput: 0: 43090.6. Samples: 7428469200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:27:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 14:27:55,958][15401] Updated weights for policy 0, policy_version 453400 (0.0037) [2024-06-23 14:27:58,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 7428603904. Throughput: 0: 42843.6. Samples: 7428723780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:27:58,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 14:27:59,760][15401] Updated weights for policy 0, policy_version 453410 (0.0035) [2024-06-23 14:28:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 7428816896. Throughput: 0: 42826.6. Samples: 7428984640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 14:28:03,425][15401] Updated weights for policy 0, policy_version 453420 (0.0035) [2024-06-23 14:28:07,322][15401] Updated weights for policy 0, policy_version 453430 (0.0027) [2024-06-23 14:28:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7429029888. Throughput: 0: 43122.3. Samples: 7429112280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:08,397][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 14:28:11,475][15401] Updated weights for policy 0, policy_version 453440 (0.0037) [2024-06-23 14:28:13,392][15132] Fps is (10 sec: 42586.3, 60 sec: 42323.3, 300 sec: 42709.1). Total num frames: 7429242880. Throughput: 0: 42942.2. Samples: 7429367500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:13,393][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 14:28:14,737][15401] Updated weights for policy 0, policy_version 453450 (0.0038) [2024-06-23 14:28:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7429455872. Throughput: 0: 42867.9. Samples: 7429624360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 14:28:18,932][15401] Updated weights for policy 0, policy_version 453460 (0.0028) [2024-06-23 14:28:22,695][15401] Updated weights for policy 0, policy_version 453470 (0.0032) [2024-06-23 14:28:23,394][15132] Fps is (10 sec: 44231.8, 60 sec: 43141.7, 300 sec: 42820.0). Total num frames: 7429685248. Throughput: 0: 43180.5. Samples: 7429758700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:23,394][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 14:28:26,433][15401] Updated weights for policy 0, policy_version 453480 (0.0039) [2024-06-23 14:28:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7429865472. Throughput: 0: 42971.6. Samples: 7430015520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:28,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 14:28:30,116][15401] Updated weights for policy 0, policy_version 453490 (0.0031) [2024-06-23 14:28:33,392][15132] Fps is (10 sec: 42605.3, 60 sec: 43415.8, 300 sec: 42709.2). Total num frames: 7430111232. Throughput: 0: 42885.1. Samples: 7430270360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:33,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 14:28:33,895][15401] Updated weights for policy 0, policy_version 453500 (0.0030) [2024-06-23 14:28:37,947][15401] Updated weights for policy 0, policy_version 453510 (0.0037) [2024-06-23 14:28:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 7430307840. Throughput: 0: 43041.1. Samples: 7430406040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:38,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 14:28:41,394][15401] Updated weights for policy 0, policy_version 453520 (0.0028) [2024-06-23 14:28:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7430520832. Throughput: 0: 43008.9. Samples: 7430659180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 14:28:45,516][15401] Updated weights for policy 0, policy_version 453530 (0.0041) [2024-06-23 14:28:48,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43149.0, 300 sec: 42765.0). Total num frames: 7430766592. Throughput: 0: 42895.9. Samples: 7430914960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 14:28:48,839][15401] Updated weights for policy 0, policy_version 453540 (0.0033) [2024-06-23 14:28:53,234][15401] Updated weights for policy 0, policy_version 453550 (0.0040) [2024-06-23 14:28:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7430963200. Throughput: 0: 43109.7. Samples: 7431052220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 14:28:56,896][15401] Updated weights for policy 0, policy_version 453560 (0.0031) [2024-06-23 14:28:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7431176192. Throughput: 0: 42881.8. Samples: 7431297060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:28:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 14:29:00,706][15401] Updated weights for policy 0, policy_version 453570 (0.0037) [2024-06-23 14:29:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7431405568. Throughput: 0: 43005.4. Samples: 7431559600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 14:29:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 14:29:04,295][15401] Updated weights for policy 0, policy_version 453580 (0.0036) [2024-06-23 14:29:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7431602176. Throughput: 0: 42876.3. Samples: 7431687960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 14:29:08,430][15401] Updated weights for policy 0, policy_version 453590 (0.0042) [2024-06-23 14:29:12,083][15401] Updated weights for policy 0, policy_version 453600 (0.0034) [2024-06-23 14:29:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.6, 300 sec: 42709.5). Total num frames: 7431815168. Throughput: 0: 42669.8. Samples: 7431935660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:13,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 14:29:16,382][15401] Updated weights for policy 0, policy_version 453610 (0.0031) [2024-06-23 14:29:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7432028160. Throughput: 0: 42871.2. Samples: 7432199460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 14:29:19,624][15401] Updated weights for policy 0, policy_version 453620 (0.0032) [2024-06-23 14:29:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.3, 300 sec: 42876.1). Total num frames: 7432241152. Throughput: 0: 42582.6. Samples: 7432322260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 14:29:23,926][15401] Updated weights for policy 0, policy_version 453630 (0.0040) [2024-06-23 14:29:26,618][15349] Signal inference workers to stop experience collection... (110150 times) [2024-06-23 14:29:26,621][15349] Signal inference workers to resume experience collection... (110150 times) [2024-06-23 14:29:26,666][15401] InferenceWorker_p0-w0: stopping experience collection (110150 times) [2024-06-23 14:29:26,667][15401] InferenceWorker_p0-w0: resuming experience collection (110150 times) [2024-06-23 14:29:27,259][15401] Updated weights for policy 0, policy_version 453640 (0.0029) [2024-06-23 14:29:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 7432470528. Throughput: 0: 42652.0. Samples: 7432578520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 14:29:31,857][15401] Updated weights for policy 0, policy_version 453650 (0.0046) [2024-06-23 14:29:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 7432650752. Throughput: 0: 42753.0. Samples: 7432838840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 14:29:34,816][15401] Updated weights for policy 0, policy_version 453660 (0.0031) [2024-06-23 14:29:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7432863744. Throughput: 0: 42416.5. Samples: 7432960960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 14:29:39,532][15401] Updated weights for policy 0, policy_version 453670 (0.0034) [2024-06-23 14:29:42,488][15401] Updated weights for policy 0, policy_version 453680 (0.0029) [2024-06-23 14:29:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 7433109504. Throughput: 0: 42710.2. Samples: 7433219020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 14:29:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453682_7433125888.pth... [2024-06-23 14:29:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453054_7422836736.pth [2024-06-23 14:29:47,342][15401] Updated weights for policy 0, policy_version 453690 (0.0033) [2024-06-23 14:29:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 7433306112. Throughput: 0: 42560.0. Samples: 7433474800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 14:29:50,422][15401] Updated weights for policy 0, policy_version 453700 (0.0040) [2024-06-23 14:29:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7433519104. Throughput: 0: 42439.5. Samples: 7433597740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 14:29:54,742][15401] Updated weights for policy 0, policy_version 453710 (0.0023) [2024-06-23 14:29:58,091][15401] Updated weights for policy 0, policy_version 453720 (0.0024) [2024-06-23 14:29:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7433748480. Throughput: 0: 42779.0. Samples: 7433860720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:29:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 14:30:02,308][15401] Updated weights for policy 0, policy_version 453730 (0.0025) [2024-06-23 14:30:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7433945088. Throughput: 0: 42777.3. Samples: 7434124440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 14:30:05,498][15401] Updated weights for policy 0, policy_version 453740 (0.0050) [2024-06-23 14:30:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42821.5). Total num frames: 7434174464. Throughput: 0: 42759.0. Samples: 7434246420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 14:30:09,766][15401] Updated weights for policy 0, policy_version 453750 (0.0035) [2024-06-23 14:30:13,321][15401] Updated weights for policy 0, policy_version 453760 (0.0033) [2024-06-23 14:30:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7434403840. Throughput: 0: 42877.3. Samples: 7434508000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 14:30:17,219][15401] Updated weights for policy 0, policy_version 453770 (0.0028) [2024-06-23 14:30:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7434584064. Throughput: 0: 42909.8. Samples: 7434769780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 14:30:20,898][15401] Updated weights for policy 0, policy_version 453780 (0.0037) [2024-06-23 14:30:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 7434829824. Throughput: 0: 42970.2. Samples: 7434894620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 14:30:24,825][15401] Updated weights for policy 0, policy_version 453790 (0.0029) [2024-06-23 14:30:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7435042816. Throughput: 0: 43175.6. Samples: 7435161920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:30:28,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 14:30:28,585][15401] Updated weights for policy 0, policy_version 453800 (0.0031) [2024-06-23 14:30:32,595][15401] Updated weights for policy 0, policy_version 453810 (0.0034) [2024-06-23 14:30:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7435239424. Throughput: 0: 43107.9. Samples: 7435414660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:33,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-23 14:30:36,311][15401] Updated weights for policy 0, policy_version 453820 (0.0029) [2024-06-23 14:30:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 7435485184. Throughput: 0: 43296.8. Samples: 7435546100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:30:40,091][15401] Updated weights for policy 0, policy_version 453830 (0.0029) [2024-06-23 14:30:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7435681792. Throughput: 0: 43285.0. Samples: 7435808540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:43,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-23 14:30:44,164][15401] Updated weights for policy 0, policy_version 453840 (0.0028) [2024-06-23 14:30:44,768][15349] Signal inference workers to stop experience collection... (110200 times) [2024-06-23 14:30:44,775][15349] Signal inference workers to resume experience collection... (110200 times) [2024-06-23 14:30:44,816][15401] InferenceWorker_p0-w0: stopping experience collection (110200 times) [2024-06-23 14:30:44,816][15401] InferenceWorker_p0-w0: resuming experience collection (110200 times) [2024-06-23 14:30:47,546][15401] Updated weights for policy 0, policy_version 453850 (0.0031) [2024-06-23 14:30:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7435894784. Throughput: 0: 42986.3. Samples: 7436058820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:48,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 14:30:51,698][15401] Updated weights for policy 0, policy_version 453860 (0.0026) [2024-06-23 14:30:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7436124160. Throughput: 0: 43265.5. Samples: 7436193360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 14:30:55,065][15401] Updated weights for policy 0, policy_version 453870 (0.0046) [2024-06-23 14:30:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7436337152. Throughput: 0: 43255.9. Samples: 7436454520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:30:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 14:30:59,401][15401] Updated weights for policy 0, policy_version 453880 (0.0032) [2024-06-23 14:31:03,051][15401] Updated weights for policy 0, policy_version 453890 (0.0028) [2024-06-23 14:31:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 7436550144. Throughput: 0: 43049.2. Samples: 7436707000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 14:31:06,864][15401] Updated weights for policy 0, policy_version 453900 (0.0038) [2024-06-23 14:31:08,390][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7436779520. Throughput: 0: 43265.8. Samples: 7436841580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 14:31:10,594][15401] Updated weights for policy 0, policy_version 453910 (0.0049) [2024-06-23 14:31:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7436959744. Throughput: 0: 42923.1. Samples: 7437093460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 14:31:14,868][15401] Updated weights for policy 0, policy_version 453920 (0.0035) [2024-06-23 14:31:18,074][15401] Updated weights for policy 0, policy_version 453930 (0.0035) [2024-06-23 14:31:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7437189120. Throughput: 0: 42950.4. Samples: 7437347420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 14:31:22,335][15401] Updated weights for policy 0, policy_version 453940 (0.0029) [2024-06-23 14:31:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7437418496. Throughput: 0: 42975.1. Samples: 7437479980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:23,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 14:31:25,570][15401] Updated weights for policy 0, policy_version 453950 (0.0033) [2024-06-23 14:31:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7437615104. Throughput: 0: 42896.4. Samples: 7437738880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 14:31:29,892][15401] Updated weights for policy 0, policy_version 453960 (0.0025) [2024-06-23 14:31:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7437828096. Throughput: 0: 42958.3. Samples: 7437991940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 14:31:33,515][15401] Updated weights for policy 0, policy_version 453970 (0.0028) [2024-06-23 14:31:37,403][15401] Updated weights for policy 0, policy_version 453980 (0.0035) [2024-06-23 14:31:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7438057472. Throughput: 0: 42933.7. Samples: 7438125380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 14:31:41,038][15401] Updated weights for policy 0, policy_version 453990 (0.0034) [2024-06-23 14:31:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7438254080. Throughput: 0: 42807.3. Samples: 7438380840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 14:31:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453995_7438254080.pth... [2024-06-23 14:31:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453367_7427964928.pth [2024-06-23 14:31:45,097][15401] Updated weights for policy 0, policy_version 454000 (0.0036) [2024-06-23 14:31:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7438467072. Throughput: 0: 42895.3. Samples: 7438637280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 14:31:48,619][15401] Updated weights for policy 0, policy_version 454010 (0.0037) [2024-06-23 14:31:52,631][15401] Updated weights for policy 0, policy_version 454020 (0.0042) [2024-06-23 14:31:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7438680064. Throughput: 0: 42772.9. Samples: 7438766360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 14:31:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 14:31:56,606][15401] Updated weights for policy 0, policy_version 454030 (0.0036) [2024-06-23 14:31:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 7438909440. Throughput: 0: 42899.6. Samples: 7439023940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:31:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 14:32:00,175][15401] Updated weights for policy 0, policy_version 454040 (0.0032) [2024-06-23 14:32:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7439122432. Throughput: 0: 43004.4. Samples: 7439282620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 14:32:04,068][15401] Updated weights for policy 0, policy_version 454050 (0.0034) [2024-06-23 14:32:07,760][15401] Updated weights for policy 0, policy_version 454060 (0.0034) [2024-06-23 14:32:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7439319040. Throughput: 0: 42804.9. Samples: 7439406200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:32:09,923][15349] Signal inference workers to stop experience collection... (110250 times) [2024-06-23 14:32:09,924][15349] Signal inference workers to resume experience collection... (110250 times) [2024-06-23 14:32:09,935][15401] InferenceWorker_p0-w0: stopping experience collection (110250 times) [2024-06-23 14:32:09,970][15401] InferenceWorker_p0-w0: resuming experience collection (110250 times) [2024-06-23 14:32:11,485][15401] Updated weights for policy 0, policy_version 454070 (0.0036) [2024-06-23 14:32:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 7439564800. Throughput: 0: 42806.2. Samples: 7439665160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 14:32:15,257][15401] Updated weights for policy 0, policy_version 454080 (0.0032) [2024-06-23 14:32:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7439761408. Throughput: 0: 42928.4. Samples: 7439923720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 14:32:19,014][15401] Updated weights for policy 0, policy_version 454090 (0.0026) [2024-06-23 14:32:22,994][15401] Updated weights for policy 0, policy_version 454100 (0.0021) [2024-06-23 14:32:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7439974400. Throughput: 0: 42694.2. Samples: 7440046620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 14:32:26,678][15401] Updated weights for policy 0, policy_version 454110 (0.0041) [2024-06-23 14:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 7440187392. Throughput: 0: 42700.5. Samples: 7440302360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 14:32:30,792][15401] Updated weights for policy 0, policy_version 454120 (0.0047) [2024-06-23 14:32:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 7440400384. Throughput: 0: 42923.9. Samples: 7440568860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 14:32:34,322][15401] Updated weights for policy 0, policy_version 454130 (0.0028) [2024-06-23 14:32:38,358][15401] Updated weights for policy 0, policy_version 454140 (0.0032) [2024-06-23 14:32:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7440629760. Throughput: 0: 42756.0. Samples: 7440690380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 14:32:42,038][15401] Updated weights for policy 0, policy_version 454150 (0.0033) [2024-06-23 14:32:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 7440826368. Throughput: 0: 42773.8. Samples: 7440948760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 14:32:46,409][15401] Updated weights for policy 0, policy_version 454160 (0.0029) [2024-06-23 14:32:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7441039360. Throughput: 0: 42837.0. Samples: 7441210280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 14:32:49,619][15401] Updated weights for policy 0, policy_version 454170 (0.0037) [2024-06-23 14:32:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7441252352. Throughput: 0: 42785.3. Samples: 7441331540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:53,390][15132] Avg episode reward: [(0, '0.089')] [2024-06-23 14:32:53,975][15401] Updated weights for policy 0, policy_version 454180 (0.0029) [2024-06-23 14:32:57,197][15401] Updated weights for policy 0, policy_version 454190 (0.0049) [2024-06-23 14:32:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7441465344. Throughput: 0: 42627.7. Samples: 7441583400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:32:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 14:33:01,462][15401] Updated weights for policy 0, policy_version 454200 (0.0028) [2024-06-23 14:33:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7441661952. Throughput: 0: 42994.5. Samples: 7441858480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:33:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 14:33:04,769][15401] Updated weights for policy 0, policy_version 454210 (0.0040) [2024-06-23 14:33:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7441907712. Throughput: 0: 42941.7. Samples: 7441979000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:33:08,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 14:33:09,341][15401] Updated weights for policy 0, policy_version 454220 (0.0032) [2024-06-23 14:33:12,680][15401] Updated weights for policy 0, policy_version 454230 (0.0025) [2024-06-23 14:33:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7442120704. Throughput: 0: 42952.2. Samples: 7442235220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:33:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 14:33:16,945][15401] Updated weights for policy 0, policy_version 454240 (0.0033) [2024-06-23 14:33:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42821.2). Total num frames: 7442317312. Throughput: 0: 43061.0. Samples: 7442506600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 14:33:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 14:33:18,482][15349] Signal inference workers to stop experience collection... (110300 times) [2024-06-23 14:33:18,488][15349] Signal inference workers to resume experience collection... (110300 times) [2024-06-23 14:33:18,528][15401] InferenceWorker_p0-w0: stopping experience collection (110300 times) [2024-06-23 14:33:18,528][15401] InferenceWorker_p0-w0: resuming experience collection (110300 times) [2024-06-23 14:33:20,223][15401] Updated weights for policy 0, policy_version 454250 (0.0033) [2024-06-23 14:33:23,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 7442563072. Throughput: 0: 43025.4. Samples: 7442626520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 14:33:24,391][15401] Updated weights for policy 0, policy_version 454260 (0.0027) [2024-06-23 14:33:27,933][15401] Updated weights for policy 0, policy_version 454270 (0.0035) [2024-06-23 14:33:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7442776064. Throughput: 0: 43015.2. Samples: 7442884440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 14:33:32,033][15401] Updated weights for policy 0, policy_version 454280 (0.0027) [2024-06-23 14:33:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7442972672. Throughput: 0: 42961.2. Samples: 7443143540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 14:33:35,788][15401] Updated weights for policy 0, policy_version 454290 (0.0026) [2024-06-23 14:33:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 7443185664. Throughput: 0: 43028.0. Samples: 7443267900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:38,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 14:33:39,532][15401] Updated weights for policy 0, policy_version 454300 (0.0035) [2024-06-23 14:33:43,354][15401] Updated weights for policy 0, policy_version 454310 (0.0043) [2024-06-23 14:33:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7443415040. Throughput: 0: 43181.3. Samples: 7443526560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 14:33:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454311_7443431424.pth... [2024-06-23 14:33:43,592][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453682_7433125888.pth [2024-06-23 14:33:47,134][15401] Updated weights for policy 0, policy_version 454320 (0.0039) [2024-06-23 14:33:48,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7443628032. Throughput: 0: 42839.1. Samples: 7443786240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 14:33:50,767][15401] Updated weights for policy 0, policy_version 454330 (0.0034) [2024-06-23 14:33:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7443824640. Throughput: 0: 42965.0. Samples: 7443912420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 14:33:54,805][15401] Updated weights for policy 0, policy_version 454340 (0.0041) [2024-06-23 14:33:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7444054016. Throughput: 0: 42962.5. Samples: 7444168520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:33:58,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 14:33:58,506][15401] Updated weights for policy 0, policy_version 454350 (0.0035) [2024-06-23 14:34:02,669][15401] Updated weights for policy 0, policy_version 454360 (0.0047) [2024-06-23 14:34:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 7444267008. Throughput: 0: 42555.0. Samples: 7444421580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:03,398][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 14:34:06,083][15401] Updated weights for policy 0, policy_version 454370 (0.0038) [2024-06-23 14:34:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7444463616. Throughput: 0: 42623.2. Samples: 7444544560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 14:34:10,582][15401] Updated weights for policy 0, policy_version 454380 (0.0049) [2024-06-23 14:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7444692992. Throughput: 0: 42798.6. Samples: 7444810380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 14:34:13,738][15401] Updated weights for policy 0, policy_version 454390 (0.0040) [2024-06-23 14:34:18,282][15401] Updated weights for policy 0, policy_version 454400 (0.0029) [2024-06-23 14:34:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 7444889600. Throughput: 0: 42716.0. Samples: 7445065760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 14:34:21,682][15401] Updated weights for policy 0, policy_version 454410 (0.0036) [2024-06-23 14:34:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7445102592. Throughput: 0: 42651.1. Samples: 7445187100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 14:34:26,033][15401] Updated weights for policy 0, policy_version 454420 (0.0033) [2024-06-23 14:34:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42931.7). Total num frames: 7445315584. Throughput: 0: 42662.8. Samples: 7445446380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 14:34:29,502][15401] Updated weights for policy 0, policy_version 454430 (0.0028) [2024-06-23 14:34:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7445512192. Throughput: 0: 42450.8. Samples: 7445696520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 14:34:33,594][15401] Updated weights for policy 0, policy_version 454440 (0.0043) [2024-06-23 14:34:37,017][15401] Updated weights for policy 0, policy_version 454450 (0.0042) [2024-06-23 14:34:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7445741568. Throughput: 0: 42439.9. Samples: 7445822220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:38,394][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 14:34:41,156][15401] Updated weights for policy 0, policy_version 454460 (0.0042) [2024-06-23 14:34:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7445954560. Throughput: 0: 42495.1. Samples: 7446080800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 14:34:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 14:34:44,891][15401] Updated weights for policy 0, policy_version 454470 (0.0035) [2024-06-23 14:34:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7446167552. Throughput: 0: 42497.7. Samples: 7446333980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:34:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 14:34:49,337][15401] Updated weights for policy 0, policy_version 454480 (0.0041) [2024-06-23 14:34:52,805][15401] Updated weights for policy 0, policy_version 454490 (0.0041) [2024-06-23 14:34:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7446380544. Throughput: 0: 42531.5. Samples: 7446458480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:34:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 14:34:56,715][15349] Signal inference workers to stop experience collection... (110350 times) [2024-06-23 14:34:56,741][15401] InferenceWorker_p0-w0: stopping experience collection (110350 times) [2024-06-23 14:34:56,832][15349] Signal inference workers to resume experience collection... (110350 times) [2024-06-23 14:34:56,832][15401] InferenceWorker_p0-w0: resuming experience collection (110350 times) [2024-06-23 14:34:56,964][15401] Updated weights for policy 0, policy_version 454500 (0.0032) [2024-06-23 14:34:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 7446577152. Throughput: 0: 42271.6. Samples: 7446712600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:34:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 14:35:00,441][15401] Updated weights for policy 0, policy_version 454510 (0.0042) [2024-06-23 14:35:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7446806528. Throughput: 0: 42262.7. Samples: 7446967580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 14:35:04,515][15401] Updated weights for policy 0, policy_version 454520 (0.0036) [2024-06-23 14:35:08,056][15401] Updated weights for policy 0, policy_version 454530 (0.0033) [2024-06-23 14:35:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7447019520. Throughput: 0: 42296.9. Samples: 7447090460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 14:35:12,004][15401] Updated weights for policy 0, policy_version 454540 (0.0031) [2024-06-23 14:35:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 7447216128. Throughput: 0: 42216.3. Samples: 7447346120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 14:35:15,711][15401] Updated weights for policy 0, policy_version 454550 (0.0029) [2024-06-23 14:35:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 7447429120. Throughput: 0: 42435.1. Samples: 7447606100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 14:35:19,571][15401] Updated weights for policy 0, policy_version 454560 (0.0029) [2024-06-23 14:35:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7447658496. Throughput: 0: 42621.8. Samples: 7447740200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 14:35:23,398][15401] Updated weights for policy 0, policy_version 454570 (0.0033) [2024-06-23 14:35:27,731][15401] Updated weights for policy 0, policy_version 454580 (0.0031) [2024-06-23 14:35:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7447855104. Throughput: 0: 42374.6. Samples: 7447987660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 14:35:31,105][15401] Updated weights for policy 0, policy_version 454590 (0.0032) [2024-06-23 14:35:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7448084480. Throughput: 0: 42345.3. Samples: 7448239520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:33,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 14:35:35,345][15401] Updated weights for policy 0, policy_version 454600 (0.0032) [2024-06-23 14:35:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7448281088. Throughput: 0: 42597.0. Samples: 7448375340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 14:35:38,738][15401] Updated weights for policy 0, policy_version 454610 (0.0039) [2024-06-23 14:35:43,089][15401] Updated weights for policy 0, policy_version 454620 (0.0028) [2024-06-23 14:35:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7448494080. Throughput: 0: 42549.7. Samples: 7448627340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:35:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454620_7448494080.pth... [2024-06-23 14:35:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000453995_7438254080.pth [2024-06-23 14:35:46,348][15401] Updated weights for policy 0, policy_version 454630 (0.0032) [2024-06-23 14:35:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7448707072. Throughput: 0: 42475.1. Samples: 7448878960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 14:35:50,613][15401] Updated weights for policy 0, policy_version 454640 (0.0032) [2024-06-23 14:35:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7448936448. Throughput: 0: 42629.4. Samples: 7449008780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:53,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 14:35:54,427][15401] Updated weights for policy 0, policy_version 454650 (0.0038) [2024-06-23 14:35:58,242][15401] Updated weights for policy 0, policy_version 454660 (0.0042) [2024-06-23 14:35:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7449149440. Throughput: 0: 42634.3. Samples: 7449264660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:35:58,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-23 14:36:02,019][15401] Updated weights for policy 0, policy_version 454670 (0.0035) [2024-06-23 14:36:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7449362432. Throughput: 0: 42644.4. Samples: 7449525100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-23 14:36:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 14:36:06,046][15401] Updated weights for policy 0, policy_version 454680 (0.0037) [2024-06-23 14:36:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7449575424. Throughput: 0: 42413.4. Samples: 7449648800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 14:36:09,613][15401] Updated weights for policy 0, policy_version 454690 (0.0027) [2024-06-23 14:36:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7449772032. Throughput: 0: 42648.5. Samples: 7449906840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 14:36:13,731][15401] Updated weights for policy 0, policy_version 454700 (0.0041) [2024-06-23 14:36:17,379][15401] Updated weights for policy 0, policy_version 454710 (0.0028) [2024-06-23 14:36:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7449985024. Throughput: 0: 42508.8. Samples: 7450152420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 14:36:21,425][15401] Updated weights for policy 0, policy_version 454720 (0.0048) [2024-06-23 14:36:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7450214400. Throughput: 0: 42448.4. Samples: 7450285520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 14:36:24,492][15349] Signal inference workers to stop experience collection... (110400 times) [2024-06-23 14:36:24,493][15349] Signal inference workers to resume experience collection... (110400 times) [2024-06-23 14:36:24,531][15401] InferenceWorker_p0-w0: stopping experience collection (110400 times) [2024-06-23 14:36:24,532][15401] InferenceWorker_p0-w0: resuming experience collection (110400 times) [2024-06-23 14:36:25,226][15401] Updated weights for policy 0, policy_version 454730 (0.0046) [2024-06-23 14:36:28,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7450411008. Throughput: 0: 42387.2. Samples: 7450534860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:28,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 14:36:29,839][15401] Updated weights for policy 0, policy_version 454740 (0.0032) [2024-06-23 14:36:32,783][15401] Updated weights for policy 0, policy_version 454750 (0.0040) [2024-06-23 14:36:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7450640384. Throughput: 0: 42436.5. Samples: 7450788600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 14:36:37,340][15401] Updated weights for policy 0, policy_version 454760 (0.0042) [2024-06-23 14:36:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7450836992. Throughput: 0: 42530.1. Samples: 7450922640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 14:36:40,250][15401] Updated weights for policy 0, policy_version 454770 (0.0041) [2024-06-23 14:36:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7451066368. Throughput: 0: 42579.1. Samples: 7451180720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 14:36:44,831][15401] Updated weights for policy 0, policy_version 454780 (0.0032) [2024-06-23 14:36:47,902][15401] Updated weights for policy 0, policy_version 454790 (0.0036) [2024-06-23 14:36:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7451295744. Throughput: 0: 42268.7. Samples: 7451427200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 14:36:52,639][15401] Updated weights for policy 0, policy_version 454800 (0.0038) [2024-06-23 14:36:53,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 7451443200. Throughput: 0: 42486.0. Samples: 7451560680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:53,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 14:36:55,736][15401] Updated weights for policy 0, policy_version 454810 (0.0031) [2024-06-23 14:36:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7451705344. Throughput: 0: 42457.4. Samples: 7451817420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:36:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 14:37:00,110][15401] Updated weights for policy 0, policy_version 454820 (0.0035) [2024-06-23 14:37:03,381][15401] Updated weights for policy 0, policy_version 454830 (0.0029) [2024-06-23 14:37:03,390][15132] Fps is (10 sec: 49152.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7451934720. Throughput: 0: 42630.3. Samples: 7452070780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 14:37:07,959][15401] Updated weights for policy 0, policy_version 454840 (0.0033) [2024-06-23 14:37:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 7452114944. Throughput: 0: 42531.5. Samples: 7452199540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:08,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 14:37:11,033][15401] Updated weights for policy 0, policy_version 454850 (0.0038) [2024-06-23 14:37:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7452344320. Throughput: 0: 42704.4. Samples: 7452456460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 14:37:15,441][15401] Updated weights for policy 0, policy_version 454860 (0.0029) [2024-06-23 14:37:18,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7452557312. Throughput: 0: 42758.5. Samples: 7452712740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:18,396][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 14:37:18,795][15401] Updated weights for policy 0, policy_version 454870 (0.0026) [2024-06-23 14:37:23,037][15401] Updated weights for policy 0, policy_version 454880 (0.0046) [2024-06-23 14:37:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7452753920. Throughput: 0: 42548.0. Samples: 7452837300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 14:37:26,580][15401] Updated weights for policy 0, policy_version 454890 (0.0046) [2024-06-23 14:37:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 7452999680. Throughput: 0: 42430.7. Samples: 7453090100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-23 14:37:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 14:37:30,695][15401] Updated weights for policy 0, policy_version 454900 (0.0034) [2024-06-23 14:37:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 7453163520. Throughput: 0: 42861.3. Samples: 7453355960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 14:37:34,323][15401] Updated weights for policy 0, policy_version 454910 (0.0029) [2024-06-23 14:37:38,303][15401] Updated weights for policy 0, policy_version 454920 (0.0037) [2024-06-23 14:37:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7453409280. Throughput: 0: 42458.3. Samples: 7453471300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 14:37:42,005][15401] Updated weights for policy 0, policy_version 454930 (0.0045) [2024-06-23 14:37:43,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7453638656. Throughput: 0: 42478.2. Samples: 7453728940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 14:37:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454934_7453638656.pth... [2024-06-23 14:37:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454311_7443431424.pth [2024-06-23 14:37:46,137][15401] Updated weights for policy 0, policy_version 454940 (0.0031) [2024-06-23 14:37:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 7453802496. Throughput: 0: 42717.3. Samples: 7453993060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:48,396][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 14:37:49,826][15401] Updated weights for policy 0, policy_version 454950 (0.0024) [2024-06-23 14:37:51,280][15349] Signal inference workers to stop experience collection... (110450 times) [2024-06-23 14:37:51,286][15349] Signal inference workers to resume experience collection... (110450 times) [2024-06-23 14:37:51,302][15401] InferenceWorker_p0-w0: stopping experience collection (110450 times) [2024-06-23 14:37:51,302][15401] InferenceWorker_p0-w0: resuming experience collection (110450 times) [2024-06-23 14:37:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.9, 300 sec: 42598.0). Total num frames: 7454031872. Throughput: 0: 42539.5. Samples: 7454113820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:53,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 14:37:53,811][15401] Updated weights for policy 0, policy_version 454960 (0.0041) [2024-06-23 14:37:57,807][15401] Updated weights for policy 0, policy_version 454970 (0.0037) [2024-06-23 14:37:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7454261248. Throughput: 0: 42689.4. Samples: 7454377480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:37:58,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 14:38:01,716][15401] Updated weights for policy 0, policy_version 454980 (0.0036) [2024-06-23 14:38:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 7454441472. Throughput: 0: 42556.5. Samples: 7454627780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 14:38:05,480][15401] Updated weights for policy 0, policy_version 454990 (0.0044) [2024-06-23 14:38:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 7454670848. Throughput: 0: 42501.7. Samples: 7454749880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 14:38:09,459][15401] Updated weights for policy 0, policy_version 455000 (0.0026) [2024-06-23 14:38:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 7454867456. Throughput: 0: 42734.3. Samples: 7455013140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 14:38:13,433][15401] Updated weights for policy 0, policy_version 455010 (0.0031) [2024-06-23 14:38:17,009][15401] Updated weights for policy 0, policy_version 455020 (0.0029) [2024-06-23 14:38:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 7455096832. Throughput: 0: 42362.4. Samples: 7455262260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:18,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 14:38:20,973][15401] Updated weights for policy 0, policy_version 455030 (0.0042) [2024-06-23 14:38:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 7455309824. Throughput: 0: 42616.5. Samples: 7455389040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 14:38:24,683][15401] Updated weights for policy 0, policy_version 455040 (0.0033) [2024-06-23 14:38:28,390][15132] Fps is (10 sec: 40958.5, 60 sec: 41779.0, 300 sec: 42487.3). Total num frames: 7455506432. Throughput: 0: 42590.4. Samples: 7455645520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 14:38:28,639][15401] Updated weights for policy 0, policy_version 455050 (0.0026) [2024-06-23 14:38:32,717][15401] Updated weights for policy 0, policy_version 455060 (0.0029) [2024-06-23 14:38:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 7455735808. Throughput: 0: 42283.6. Samples: 7455895820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 14:38:36,392][15401] Updated weights for policy 0, policy_version 455070 (0.0036) [2024-06-23 14:38:38,390][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7455965184. Throughput: 0: 42496.9. Samples: 7456026080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:38,395][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 14:38:40,334][15401] Updated weights for policy 0, policy_version 455080 (0.0038) [2024-06-23 14:38:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 42376.3). Total num frames: 7456129024. Throughput: 0: 42352.5. Samples: 7456283340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 14:38:44,093][15401] Updated weights for policy 0, policy_version 455090 (0.0025) [2024-06-23 14:38:47,887][15401] Updated weights for policy 0, policy_version 455100 (0.0040) [2024-06-23 14:38:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7456374784. Throughput: 0: 42335.5. Samples: 7456532880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 14:38:51,759][15401] Updated weights for policy 0, policy_version 455110 (0.0022) [2024-06-23 14:38:53,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 7456604160. Throughput: 0: 42483.2. Samples: 7456661620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 14:38:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 14:38:55,806][15401] Updated weights for policy 0, policy_version 455120 (0.0028) [2024-06-23 14:38:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7456800768. Throughput: 0: 42354.1. Samples: 7456919080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:38:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 14:38:59,325][15401] Updated weights for policy 0, policy_version 455130 (0.0027) [2024-06-23 14:39:03,315][15401] Updated weights for policy 0, policy_version 455140 (0.0028) [2024-06-23 14:39:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7457013760. Throughput: 0: 42501.3. Samples: 7457174820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 14:39:07,057][15401] Updated weights for policy 0, policy_version 455150 (0.0033) [2024-06-23 14:39:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 7457226752. Throughput: 0: 42456.0. Samples: 7457299560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 14:39:11,432][15401] Updated weights for policy 0, policy_version 455160 (0.0032) [2024-06-23 14:39:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7457423360. Throughput: 0: 42444.7. Samples: 7457555520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 14:39:14,820][15401] Updated weights for policy 0, policy_version 455170 (0.0031) [2024-06-23 14:39:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7457652736. Throughput: 0: 42594.7. Samples: 7457812580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 14:39:18,912][15401] Updated weights for policy 0, policy_version 455180 (0.0043) [2024-06-23 14:39:22,327][15401] Updated weights for policy 0, policy_version 455190 (0.0026) [2024-06-23 14:39:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7457849344. Throughput: 0: 42552.0. Samples: 7457940920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 14:39:26,514][15401] Updated weights for policy 0, policy_version 455200 (0.0032) [2024-06-23 14:39:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 7458078720. Throughput: 0: 42597.4. Samples: 7458200220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 14:39:30,250][15401] Updated weights for policy 0, policy_version 455210 (0.0035) [2024-06-23 14:39:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7458291712. Throughput: 0: 42602.7. Samples: 7458450000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 14:39:33,996][15401] Updated weights for policy 0, policy_version 455220 (0.0025) [2024-06-23 14:39:37,894][15401] Updated weights for policy 0, policy_version 455230 (0.0043) [2024-06-23 14:39:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 7458504704. Throughput: 0: 42609.8. Samples: 7458579160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:38,392][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 14:39:41,666][15401] Updated weights for policy 0, policy_version 455240 (0.0036) [2024-06-23 14:39:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 7458717696. Throughput: 0: 42678.3. Samples: 7458839600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 14:39:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455245_7458734080.pth... [2024-06-23 14:39:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454620_7448494080.pth [2024-06-23 14:39:45,459][15401] Updated weights for policy 0, policy_version 455250 (0.0033) [2024-06-23 14:39:48,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7458947072. Throughput: 0: 42791.9. Samples: 7459100460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:48,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 14:39:49,201][15401] Updated weights for policy 0, policy_version 455260 (0.0045) [2024-06-23 14:39:53,113][15401] Updated weights for policy 0, policy_version 455270 (0.0045) [2024-06-23 14:39:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7459143680. Throughput: 0: 42699.1. Samples: 7459221020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:53,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-23 14:39:57,152][15401] Updated weights for policy 0, policy_version 455280 (0.0044) [2024-06-23 14:39:58,167][15349] Signal inference workers to stop experience collection... (110500 times) [2024-06-23 14:39:58,168][15349] Signal inference workers to resume experience collection... (110500 times) [2024-06-23 14:39:58,197][15401] InferenceWorker_p0-w0: stopping experience collection (110500 times) [2024-06-23 14:39:58,198][15401] InferenceWorker_p0-w0: resuming experience collection (110500 times) [2024-06-23 14:39:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7459356672. Throughput: 0: 42807.1. Samples: 7459481840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:39:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 14:40:00,758][15401] Updated weights for policy 0, policy_version 455290 (0.0035) [2024-06-23 14:40:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7459569664. Throughput: 0: 42740.8. Samples: 7459735920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:40:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 14:40:05,160][15401] Updated weights for policy 0, policy_version 455300 (0.0040) [2024-06-23 14:40:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 7459782656. Throughput: 0: 42691.6. Samples: 7459862140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:40:08,401][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 14:40:08,464][15401] Updated weights for policy 0, policy_version 455310 (0.0032) [2024-06-23 14:40:12,805][15401] Updated weights for policy 0, policy_version 455320 (0.0040) [2024-06-23 14:40:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7459962880. Throughput: 0: 42606.6. Samples: 7460117520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:40:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 14:40:16,043][15401] Updated weights for policy 0, policy_version 455330 (0.0032) [2024-06-23 14:40:18,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 7460192256. Throughput: 0: 42774.7. Samples: 7460374960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 14:40:18,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 14:40:20,321][15401] Updated weights for policy 0, policy_version 455340 (0.0040) [2024-06-23 14:40:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7460421632. Throughput: 0: 42641.4. Samples: 7460497920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 14:40:23,783][15401] Updated weights for policy 0, policy_version 455350 (0.0030) [2024-06-23 14:40:28,154][15401] Updated weights for policy 0, policy_version 455360 (0.0029) [2024-06-23 14:40:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7460618240. Throughput: 0: 42582.3. Samples: 7460755800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 14:40:31,406][15401] Updated weights for policy 0, policy_version 455370 (0.0037) [2024-06-23 14:40:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7460847616. Throughput: 0: 42527.3. Samples: 7461014180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 14:40:35,835][15401] Updated weights for policy 0, policy_version 455380 (0.0029) [2024-06-23 14:40:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 7461076992. Throughput: 0: 42743.2. Samples: 7461144460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:38,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 14:40:39,149][15401] Updated weights for policy 0, policy_version 455390 (0.0030) [2024-06-23 14:40:43,360][15401] Updated weights for policy 0, policy_version 455400 (0.0032) [2024-06-23 14:40:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7461273600. Throughput: 0: 42842.7. Samples: 7461409760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:43,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-23 14:40:46,803][15401] Updated weights for policy 0, policy_version 455410 (0.0033) [2024-06-23 14:40:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7461486592. Throughput: 0: 42570.3. Samples: 7461651580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:48,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 14:40:50,953][15401] Updated weights for policy 0, policy_version 455420 (0.0042) [2024-06-23 14:40:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 7461715968. Throughput: 0: 42639.6. Samples: 7461780920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:53,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 14:40:54,732][15401] Updated weights for policy 0, policy_version 455430 (0.0033) [2024-06-23 14:40:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 7461879808. Throughput: 0: 42653.8. Samples: 7462036940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:40:58,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 14:40:58,999][15401] Updated weights for policy 0, policy_version 455440 (0.0029) [2024-06-23 14:41:02,182][15401] Updated weights for policy 0, policy_version 455450 (0.0028) [2024-06-23 14:41:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7462141952. Throughput: 0: 42511.7. Samples: 7462287880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:03,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 14:41:06,473][15401] Updated weights for policy 0, policy_version 455460 (0.0026) [2024-06-23 14:41:08,392][15132] Fps is (10 sec: 47501.9, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 7462354944. Throughput: 0: 42800.8. Samples: 7462424060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:08,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 14:41:09,841][15401] Updated weights for policy 0, policy_version 455470 (0.0037) [2024-06-23 14:41:13,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7462535168. Throughput: 0: 42809.6. Samples: 7462682240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 14:41:14,120][15401] Updated weights for policy 0, policy_version 455480 (0.0031) [2024-06-23 14:41:17,638][15401] Updated weights for policy 0, policy_version 455490 (0.0042) [2024-06-23 14:41:18,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42873.1, 300 sec: 42542.8). Total num frames: 7462764544. Throughput: 0: 42557.6. Samples: 7462929280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:18,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 14:41:22,093][15401] Updated weights for policy 0, policy_version 455500 (0.0039) [2024-06-23 14:41:23,122][15349] Signal inference workers to stop experience collection... (110550 times) [2024-06-23 14:41:23,125][15349] Signal inference workers to resume experience collection... (110550 times) [2024-06-23 14:41:23,138][15401] InferenceWorker_p0-w0: stopping experience collection (110550 times) [2024-06-23 14:41:23,139][15401] InferenceWorker_p0-w0: resuming experience collection (110550 times) [2024-06-23 14:41:23,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7462993920. Throughput: 0: 42627.1. Samples: 7463062680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:23,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 14:41:25,131][15401] Updated weights for policy 0, policy_version 455510 (0.0032) [2024-06-23 14:41:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7463174144. Throughput: 0: 42295.4. Samples: 7463313060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 14:41:29,560][15401] Updated weights for policy 0, policy_version 455520 (0.0039) [2024-06-23 14:41:33,174][15401] Updated weights for policy 0, policy_version 455530 (0.0025) [2024-06-23 14:41:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7463419904. Throughput: 0: 42614.7. Samples: 7463569240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 14:41:37,067][15401] Updated weights for policy 0, policy_version 455540 (0.0035) [2024-06-23 14:41:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7463616512. Throughput: 0: 42645.3. Samples: 7463699860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 14:41:40,850][15401] Updated weights for policy 0, policy_version 455550 (0.0043) [2024-06-23 14:41:43,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 7463829504. Throughput: 0: 42572.5. Samples: 7463952980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 14:41:43,397][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 14:41:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455556_7463829504.pth... [2024-06-23 14:41:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000454934_7453638656.pth [2024-06-23 14:41:44,469][15401] Updated weights for policy 0, policy_version 455560 (0.0041) [2024-06-23 14:41:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7464042496. Throughput: 0: 42868.8. Samples: 7464216980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:41:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 14:41:48,596][15401] Updated weights for policy 0, policy_version 455570 (0.0037) [2024-06-23 14:41:52,227][15401] Updated weights for policy 0, policy_version 455580 (0.0035) [2024-06-23 14:41:53,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 7464271872. Throughput: 0: 42720.8. Samples: 7464346400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:41:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 14:41:56,423][15401] Updated weights for policy 0, policy_version 455590 (0.0035) [2024-06-23 14:41:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 7464484864. Throughput: 0: 42618.9. Samples: 7464600080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:41:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 14:41:59,863][15401] Updated weights for policy 0, policy_version 455600 (0.0034) [2024-06-23 14:42:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 7464697856. Throughput: 0: 42898.8. Samples: 7464859720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 14:42:03,961][15401] Updated weights for policy 0, policy_version 455610 (0.0031) [2024-06-23 14:42:07,293][15401] Updated weights for policy 0, policy_version 455620 (0.0044) [2024-06-23 14:42:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 7464910848. Throughput: 0: 42831.1. Samples: 7464990080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 14:42:11,506][15401] Updated weights for policy 0, policy_version 455630 (0.0024) [2024-06-23 14:42:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 7465123840. Throughput: 0: 43040.9. Samples: 7465249900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 14:42:14,858][15401] Updated weights for policy 0, policy_version 455640 (0.0033) [2024-06-23 14:42:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7465336832. Throughput: 0: 43021.3. Samples: 7465505200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 14:42:19,391][15401] Updated weights for policy 0, policy_version 455650 (0.0032) [2024-06-23 14:42:22,550][15401] Updated weights for policy 0, policy_version 455660 (0.0039) [2024-06-23 14:42:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 7465549824. Throughput: 0: 42895.9. Samples: 7465630180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:23,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 14:42:26,892][15401] Updated weights for policy 0, policy_version 455670 (0.0042) [2024-06-23 14:42:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7465762816. Throughput: 0: 43048.8. Samples: 7465889900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 14:42:30,398][15401] Updated weights for policy 0, policy_version 455680 (0.0033) [2024-06-23 14:42:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7465992192. Throughput: 0: 42817.6. Samples: 7466143780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 14:42:34,336][15401] Updated weights for policy 0, policy_version 455690 (0.0045) [2024-06-23 14:42:37,884][15401] Updated weights for policy 0, policy_version 455700 (0.0026) [2024-06-23 14:42:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7466205184. Throughput: 0: 42884.1. Samples: 7466276180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 14:42:41,838][15401] Updated weights for policy 0, policy_version 455710 (0.0048) [2024-06-23 14:42:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 7466401792. Throughput: 0: 42974.6. Samples: 7466533940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 14:42:45,603][15401] Updated weights for policy 0, policy_version 455720 (0.0032) [2024-06-23 14:42:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42765.3). Total num frames: 7466647552. Throughput: 0: 42723.5. Samples: 7466782280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 14:42:49,658][15401] Updated weights for policy 0, policy_version 455730 (0.0028) [2024-06-23 14:42:53,387][15401] Updated weights for policy 0, policy_version 455740 (0.0033) [2024-06-23 14:42:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7466844160. Throughput: 0: 42841.7. Samples: 7466917960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 14:42:55,177][15349] Signal inference workers to stop experience collection... (110600 times) [2024-06-23 14:42:55,177][15349] Signal inference workers to resume experience collection... (110600 times) [2024-06-23 14:42:55,190][15401] InferenceWorker_p0-w0: stopping experience collection (110600 times) [2024-06-23 14:42:55,190][15401] InferenceWorker_p0-w0: resuming experience collection (110600 times) [2024-06-23 14:42:57,536][15401] Updated weights for policy 0, policy_version 455750 (0.0031) [2024-06-23 14:42:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7467040768. Throughput: 0: 42686.3. Samples: 7467170780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:42:58,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 14:43:00,993][15401] Updated weights for policy 0, policy_version 455760 (0.0028) [2024-06-23 14:43:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7467270144. Throughput: 0: 42591.8. Samples: 7467421940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 14:43:03,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 14:43:05,118][15401] Updated weights for policy 0, policy_version 455770 (0.0043) [2024-06-23 14:43:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 7467466752. Throughput: 0: 42697.9. Samples: 7467551680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:08,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 14:43:09,126][15401] Updated weights for policy 0, policy_version 455780 (0.0033) [2024-06-23 14:43:12,626][15401] Updated weights for policy 0, policy_version 455790 (0.0032) [2024-06-23 14:43:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7467679744. Throughput: 0: 42602.2. Samples: 7467807000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 14:43:16,662][15401] Updated weights for policy 0, policy_version 455800 (0.0035) [2024-06-23 14:43:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7467892736. Throughput: 0: 42810.3. Samples: 7468070240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 14:43:20,291][15401] Updated weights for policy 0, policy_version 455810 (0.0034) [2024-06-23 14:43:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7468089344. Throughput: 0: 42660.4. Samples: 7468195900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 14:43:24,151][15401] Updated weights for policy 0, policy_version 455820 (0.0035) [2024-06-23 14:43:28,125][15401] Updated weights for policy 0, policy_version 455830 (0.0026) [2024-06-23 14:43:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7468335104. Throughput: 0: 42547.0. Samples: 7468448560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 14:43:31,724][15401] Updated weights for policy 0, policy_version 455840 (0.0028) [2024-06-23 14:43:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7468531712. Throughput: 0: 42756.4. Samples: 7468706320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 14:43:35,805][15401] Updated weights for policy 0, policy_version 455850 (0.0027) [2024-06-23 14:43:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7468744704. Throughput: 0: 42490.2. Samples: 7468830020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 14:43:39,792][15401] Updated weights for policy 0, policy_version 455860 (0.0044) [2024-06-23 14:43:43,349][15401] Updated weights for policy 0, policy_version 455870 (0.0035) [2024-06-23 14:43:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7468974080. Throughput: 0: 42591.9. Samples: 7469087420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 14:43:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455870_7468974080.pth... [2024-06-23 14:43:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455245_7458734080.pth [2024-06-23 14:43:47,652][15401] Updated weights for policy 0, policy_version 455880 (0.0036) [2024-06-23 14:43:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7469170688. Throughput: 0: 42760.6. Samples: 7469346060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:48,398][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 14:43:51,023][15401] Updated weights for policy 0, policy_version 455890 (0.0033) [2024-06-23 14:43:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7469400064. Throughput: 0: 42610.7. Samples: 7469469060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 14:43:55,199][15401] Updated weights for policy 0, policy_version 455900 (0.0045) [2024-06-23 14:43:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7469596672. Throughput: 0: 42715.7. Samples: 7469729200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:43:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 14:43:58,732][15401] Updated weights for policy 0, policy_version 455910 (0.0027) [2024-06-23 14:44:02,747][15401] Updated weights for policy 0, policy_version 455920 (0.0026) [2024-06-23 14:44:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42325.4, 300 sec: 42653.6). Total num frames: 7469809664. Throughput: 0: 42598.2. Samples: 7469987260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:03,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 14:44:06,377][15401] Updated weights for policy 0, policy_version 455930 (0.0036) [2024-06-23 14:44:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 7470055424. Throughput: 0: 42629.0. Samples: 7470114200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:08,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 14:44:10,337][15401] Updated weights for policy 0, policy_version 455940 (0.0044) [2024-06-23 14:44:13,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7470252032. Throughput: 0: 42657.4. Samples: 7470368140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 14:44:14,493][15401] Updated weights for policy 0, policy_version 455950 (0.0038) [2024-06-23 14:44:16,649][15349] Signal inference workers to stop experience collection... (110650 times) [2024-06-23 14:44:16,695][15401] InferenceWorker_p0-w0: stopping experience collection (110650 times) [2024-06-23 14:44:16,699][15349] Signal inference workers to resume experience collection... (110650 times) [2024-06-23 14:44:16,711][15401] InferenceWorker_p0-w0: resuming experience collection (110650 times) [2024-06-23 14:44:18,022][15401] Updated weights for policy 0, policy_version 455960 (0.0035) [2024-06-23 14:44:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7470448640. Throughput: 0: 42463.1. Samples: 7470617160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 14:44:22,201][15401] Updated weights for policy 0, policy_version 455970 (0.0029) [2024-06-23 14:44:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7470678016. Throughput: 0: 42617.9. Samples: 7470747820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:23,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 14:44:25,980][15401] Updated weights for policy 0, policy_version 455980 (0.0035) [2024-06-23 14:44:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7470874624. Throughput: 0: 42516.9. Samples: 7471000680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 14:44:28,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 14:44:29,763][15401] Updated weights for policy 0, policy_version 455990 (0.0028) [2024-06-23 14:44:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.8, 300 sec: 42653.9). Total num frames: 7471087616. Throughput: 0: 42438.6. Samples: 7471255900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:33,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 14:44:33,541][15401] Updated weights for policy 0, policy_version 456000 (0.0034) [2024-06-23 14:44:37,319][15401] Updated weights for policy 0, policy_version 456010 (0.0031) [2024-06-23 14:44:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7471316992. Throughput: 0: 42576.9. Samples: 7471385020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 14:44:41,044][15401] Updated weights for policy 0, policy_version 456020 (0.0028) [2024-06-23 14:44:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7471513600. Throughput: 0: 42515.4. Samples: 7471642400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 14:44:45,148][15401] Updated weights for policy 0, policy_version 456030 (0.0028) [2024-06-23 14:44:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7471742976. Throughput: 0: 42407.6. Samples: 7471895500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 14:44:48,608][15401] Updated weights for policy 0, policy_version 456040 (0.0046) [2024-06-23 14:44:52,856][15401] Updated weights for policy 0, policy_version 456050 (0.0033) [2024-06-23 14:44:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7471939584. Throughput: 0: 42522.6. Samples: 7472027720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:53,396][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 14:44:56,433][15401] Updated weights for policy 0, policy_version 456060 (0.0029) [2024-06-23 14:44:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7472152576. Throughput: 0: 42513.0. Samples: 7472281220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:44:58,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 14:45:00,517][15401] Updated weights for policy 0, policy_version 456070 (0.0040) [2024-06-23 14:45:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7472381952. Throughput: 0: 42612.5. Samples: 7472534820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:03,392][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 14:45:04,524][15401] Updated weights for policy 0, policy_version 456080 (0.0042) [2024-06-23 14:45:08,346][15401] Updated weights for policy 0, policy_version 456090 (0.0036) [2024-06-23 14:45:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 7472578560. Throughput: 0: 42588.3. Samples: 7472664300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 14:45:12,004][15401] Updated weights for policy 0, policy_version 456100 (0.0040) [2024-06-23 14:45:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 7472791552. Throughput: 0: 42670.3. Samples: 7472920840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 14:45:16,065][15401] Updated weights for policy 0, policy_version 456110 (0.0034) [2024-06-23 14:45:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7473020928. Throughput: 0: 42673.5. Samples: 7473176100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 14:45:19,506][15401] Updated weights for policy 0, policy_version 456120 (0.0055) [2024-06-23 14:45:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7473217536. Throughput: 0: 42738.9. Samples: 7473308280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 14:45:23,591][15401] Updated weights for policy 0, policy_version 456130 (0.0026) [2024-06-23 14:45:27,205][15401] Updated weights for policy 0, policy_version 456140 (0.0038) [2024-06-23 14:45:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7473446912. Throughput: 0: 42767.7. Samples: 7473566940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 14:45:30,928][15401] Updated weights for policy 0, policy_version 456150 (0.0032) [2024-06-23 14:45:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 7473659904. Throughput: 0: 42894.2. Samples: 7473825740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 14:45:34,757][15401] Updated weights for policy 0, policy_version 456160 (0.0040) [2024-06-23 14:45:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7473872896. Throughput: 0: 42897.3. Samples: 7473958100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 14:45:38,554][15401] Updated weights for policy 0, policy_version 456170 (0.0023) [2024-06-23 14:45:42,323][15401] Updated weights for policy 0, policy_version 456180 (0.0036) [2024-06-23 14:45:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7474102272. Throughput: 0: 42996.8. Samples: 7474216080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:43,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 14:45:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456183_7474102272.pth... [2024-06-23 14:45:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455556_7463829504.pth [2024-06-23 14:45:46,161][15401] Updated weights for policy 0, policy_version 456190 (0.0027) [2024-06-23 14:45:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7474315264. Throughput: 0: 42940.5. Samples: 7474467040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 14:45:50,194][15401] Updated weights for policy 0, policy_version 456200 (0.0034) [2024-06-23 14:45:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7474511872. Throughput: 0: 42820.1. Samples: 7474591200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 14:45:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 14:45:53,759][15401] Updated weights for policy 0, policy_version 456210 (0.0036) [2024-06-23 14:45:57,902][15401] Updated weights for policy 0, policy_version 456220 (0.0034) [2024-06-23 14:45:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7474741248. Throughput: 0: 42961.3. Samples: 7474854100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:45:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 14:45:58,493][15349] Signal inference workers to stop experience collection... (110700 times) [2024-06-23 14:45:58,494][15349] Signal inference workers to resume experience collection... (110700 times) [2024-06-23 14:45:58,514][15401] InferenceWorker_p0-w0: stopping experience collection (110700 times) [2024-06-23 14:45:58,514][15401] InferenceWorker_p0-w0: resuming experience collection (110700 times) [2024-06-23 14:46:01,408][15401] Updated weights for policy 0, policy_version 456230 (0.0047) [2024-06-23 14:46:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.1, 300 sec: 42709.8). Total num frames: 7474954240. Throughput: 0: 42958.5. Samples: 7475109240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:03,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 14:46:05,499][15401] Updated weights for policy 0, policy_version 456240 (0.0027) [2024-06-23 14:46:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7475167232. Throughput: 0: 42788.5. Samples: 7475233760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:08,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 14:46:08,927][15401] Updated weights for policy 0, policy_version 456250 (0.0039) [2024-06-23 14:46:12,981][15401] Updated weights for policy 0, policy_version 456260 (0.0034) [2024-06-23 14:46:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7475380224. Throughput: 0: 42995.1. Samples: 7475501720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 14:46:16,601][15401] Updated weights for policy 0, policy_version 456270 (0.0038) [2024-06-23 14:46:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7475609600. Throughput: 0: 42839.6. Samples: 7475753620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:18,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 14:46:20,711][15401] Updated weights for policy 0, policy_version 456280 (0.0034) [2024-06-23 14:46:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 7475806208. Throughput: 0: 42837.9. Samples: 7475885800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 14:46:24,212][15401] Updated weights for policy 0, policy_version 456290 (0.0032) [2024-06-23 14:46:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7476002816. Throughput: 0: 42824.9. Samples: 7476143200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 14:46:28,541][15401] Updated weights for policy 0, policy_version 456300 (0.0040) [2024-06-23 14:46:31,787][15401] Updated weights for policy 0, policy_version 456310 (0.0032) [2024-06-23 14:46:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7476248576. Throughput: 0: 42873.3. Samples: 7476396340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 14:46:36,285][15401] Updated weights for policy 0, policy_version 456320 (0.0031) [2024-06-23 14:46:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 7476461568. Throughput: 0: 43155.0. Samples: 7476533180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:38,399][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 14:46:39,465][15401] Updated weights for policy 0, policy_version 456330 (0.0028) [2024-06-23 14:46:43,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 7476641792. Throughput: 0: 42931.0. Samples: 7476786100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:43,392][15132] Avg episode reward: [(0, '0.168')] [2024-06-23 14:46:43,857][15401] Updated weights for policy 0, policy_version 456340 (0.0027) [2024-06-23 14:46:47,128][15401] Updated weights for policy 0, policy_version 456350 (0.0029) [2024-06-23 14:46:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7476887552. Throughput: 0: 42898.7. Samples: 7477039680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 14:46:51,625][15401] Updated weights for policy 0, policy_version 456360 (0.0030) [2024-06-23 14:46:53,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7477100544. Throughput: 0: 43132.8. Samples: 7477174740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 14:46:54,888][15401] Updated weights for policy 0, policy_version 456370 (0.0047) [2024-06-23 14:46:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 7477297152. Throughput: 0: 42851.9. Samples: 7477430160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:46:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 14:46:59,558][15401] Updated weights for policy 0, policy_version 456380 (0.0040) [2024-06-23 14:47:02,288][15401] Updated weights for policy 0, policy_version 456390 (0.0048) [2024-06-23 14:47:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7477542912. Throughput: 0: 42760.8. Samples: 7477677760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:47:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 14:47:06,937][15401] Updated weights for policy 0, policy_version 456400 (0.0044) [2024-06-23 14:47:08,392][15132] Fps is (10 sec: 44237.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7477739520. Throughput: 0: 43036.8. Samples: 7477822560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:47:08,393][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 14:47:09,780][15401] Updated weights for policy 0, policy_version 456410 (0.0027) [2024-06-23 14:47:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7477936128. Throughput: 0: 42940.8. Samples: 7478075540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:47:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 14:47:14,451][15401] Updated weights for policy 0, policy_version 456420 (0.0044) [2024-06-23 14:47:17,436][15401] Updated weights for policy 0, policy_version 456430 (0.0039) [2024-06-23 14:47:18,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 7478198272. Throughput: 0: 42876.5. Samples: 7478325780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 14:47:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 14:47:22,394][15401] Updated weights for policy 0, policy_version 456440 (0.0034) [2024-06-23 14:47:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7478394880. Throughput: 0: 42922.2. Samples: 7478464680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 14:47:24,961][15401] Updated weights for policy 0, policy_version 456450 (0.0034) [2024-06-23 14:47:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7478591488. Throughput: 0: 42869.0. Samples: 7478715100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 14:47:29,923][15401] Updated weights for policy 0, policy_version 456460 (0.0034) [2024-06-23 14:47:30,070][15349] Signal inference workers to stop experience collection... (110750 times) [2024-06-23 14:47:30,090][15401] InferenceWorker_p0-w0: stopping experience collection (110750 times) [2024-06-23 14:47:30,127][15349] Signal inference workers to resume experience collection... (110750 times) [2024-06-23 14:47:30,128][15401] InferenceWorker_p0-w0: resuming experience collection (110750 times) [2024-06-23 14:47:32,499][15401] Updated weights for policy 0, policy_version 456470 (0.0034) [2024-06-23 14:47:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7478820864. Throughput: 0: 42850.7. Samples: 7478967960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 14:47:37,732][15401] Updated weights for policy 0, policy_version 456480 (0.0043) [2024-06-23 14:47:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7479017472. Throughput: 0: 42795.2. Samples: 7479100520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 14:47:40,258][15401] Updated weights for policy 0, policy_version 456490 (0.0024) [2024-06-23 14:47:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 7479214080. Throughput: 0: 42631.5. Samples: 7479348480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 14:47:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456495_7479214080.pth... [2024-06-23 14:47:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000455870_7468974080.pth [2024-06-23 14:47:45,509][15401] Updated weights for policy 0, policy_version 456500 (0.0041) [2024-06-23 14:47:48,130][15401] Updated weights for policy 0, policy_version 456510 (0.0037) [2024-06-23 14:47:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7479459840. Throughput: 0: 42694.3. Samples: 7479599100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:48,393][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 14:47:53,190][15401] Updated weights for policy 0, policy_version 456520 (0.0034) [2024-06-23 14:47:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7479640064. Throughput: 0: 42571.1. Samples: 7479738160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:53,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 14:47:55,888][15401] Updated weights for policy 0, policy_version 456530 (0.0033) [2024-06-23 14:47:58,396][15132] Fps is (10 sec: 40943.6, 60 sec: 42868.6, 300 sec: 42708.9). Total num frames: 7479869440. Throughput: 0: 42486.1. Samples: 7479987680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:47:58,396][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 14:48:00,768][15401] Updated weights for policy 0, policy_version 456540 (0.0035) [2024-06-23 14:48:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7480098816. Throughput: 0: 42580.0. Samples: 7480241880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 14:48:03,434][15401] Updated weights for policy 0, policy_version 456550 (0.0036) [2024-06-23 14:48:08,243][15401] Updated weights for policy 0, policy_version 456560 (0.0027) [2024-06-23 14:48:08,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 7480279040. Throughput: 0: 42493.0. Samples: 7480376860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 14:48:11,023][15401] Updated weights for policy 0, policy_version 456570 (0.0042) [2024-06-23 14:48:13,395][15132] Fps is (10 sec: 42576.4, 60 sec: 43141.0, 300 sec: 42819.8). Total num frames: 7480524800. Throughput: 0: 42481.4. Samples: 7480626980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:13,395][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 14:48:15,887][15401] Updated weights for policy 0, policy_version 456580 (0.0038) [2024-06-23 14:48:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 7480721408. Throughput: 0: 42555.6. Samples: 7480882960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:18,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 14:48:18,829][15401] Updated weights for policy 0, policy_version 456590 (0.0037) [2024-06-23 14:48:23,390][15132] Fps is (10 sec: 37701.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 7480901632. Throughput: 0: 42337.6. Samples: 7481005720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 14:48:23,714][15401] Updated weights for policy 0, policy_version 456600 (0.0039) [2024-06-23 14:48:26,415][15401] Updated weights for policy 0, policy_version 456610 (0.0032) [2024-06-23 14:48:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7481147392. Throughput: 0: 42513.5. Samples: 7481261580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 14:48:31,263][15401] Updated weights for policy 0, policy_version 456620 (0.0039) [2024-06-23 14:48:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7481360384. Throughput: 0: 42720.5. Samples: 7481521420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 14:48:34,263][15401] Updated weights for policy 0, policy_version 456630 (0.0029) [2024-06-23 14:48:38,390][15132] Fps is (10 sec: 39318.6, 60 sec: 42051.8, 300 sec: 42598.3). Total num frames: 7481540608. Throughput: 0: 42405.5. Samples: 7481646440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:38,391][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 14:48:38,986][15401] Updated weights for policy 0, policy_version 456640 (0.0039) [2024-06-23 14:48:41,903][15349] Signal inference workers to stop experience collection... (110800 times) [2024-06-23 14:48:41,929][15401] InferenceWorker_p0-w0: stopping experience collection (110800 times) [2024-06-23 14:48:41,966][15349] Signal inference workers to resume experience collection... (110800 times) [2024-06-23 14:48:41,967][15401] InferenceWorker_p0-w0: resuming experience collection (110800 times) [2024-06-23 14:48:42,112][15401] Updated weights for policy 0, policy_version 456650 (0.0034) [2024-06-23 14:48:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 7481802752. Throughput: 0: 42549.3. Samples: 7481902120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 14:48:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 14:48:46,747][15401] Updated weights for policy 0, policy_version 456660 (0.0035) [2024-06-23 14:48:48,389][15132] Fps is (10 sec: 45878.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7481999360. Throughput: 0: 42696.4. Samples: 7482163220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:48:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 14:48:49,855][15401] Updated weights for policy 0, policy_version 456670 (0.0032) [2024-06-23 14:48:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7482195968. Throughput: 0: 42379.5. Samples: 7482283940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:48:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 14:48:54,575][15401] Updated weights for policy 0, policy_version 456680 (0.0031) [2024-06-23 14:48:57,500][15401] Updated weights for policy 0, policy_version 456690 (0.0029) [2024-06-23 14:48:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43149.1, 300 sec: 42876.4). Total num frames: 7482458112. Throughput: 0: 42690.6. Samples: 7482547840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:48:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 14:49:02,002][15401] Updated weights for policy 0, policy_version 456700 (0.0036) [2024-06-23 14:49:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7482638336. Throughput: 0: 42795.4. Samples: 7482808760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 14:49:05,087][15401] Updated weights for policy 0, policy_version 456710 (0.0034) [2024-06-23 14:49:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7482851328. Throughput: 0: 42716.9. Samples: 7482927980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 14:49:09,476][15401] Updated weights for policy 0, policy_version 456720 (0.0036) [2024-06-23 14:49:12,711][15401] Updated weights for policy 0, policy_version 456730 (0.0037) [2024-06-23 14:49:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42875.0, 300 sec: 42876.1). Total num frames: 7483097088. Throughput: 0: 42873.5. Samples: 7483190900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 14:49:17,434][15401] Updated weights for policy 0, policy_version 456740 (0.0031) [2024-06-23 14:49:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7483277312. Throughput: 0: 43030.4. Samples: 7483457780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 14:49:20,472][15401] Updated weights for policy 0, policy_version 456750 (0.0028) [2024-06-23 14:49:23,389][15132] Fps is (10 sec: 39322.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7483490304. Throughput: 0: 42810.9. Samples: 7483572900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:23,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 14:49:24,933][15401] Updated weights for policy 0, policy_version 456760 (0.0043) [2024-06-23 14:49:28,350][15401] Updated weights for policy 0, policy_version 456770 (0.0041) [2024-06-23 14:49:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 7483719680. Throughput: 0: 42921.7. Samples: 7483833600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 14:49:32,797][15401] Updated weights for policy 0, policy_version 456780 (0.0030) [2024-06-23 14:49:33,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7483899904. Throughput: 0: 42896.2. Samples: 7484093560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 14:49:36,045][15401] Updated weights for policy 0, policy_version 456790 (0.0029) [2024-06-23 14:49:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43418.0, 300 sec: 42820.5). Total num frames: 7484145664. Throughput: 0: 42902.5. Samples: 7484214560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 14:49:40,611][15401] Updated weights for policy 0, policy_version 456800 (0.0028) [2024-06-23 14:49:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7484358656. Throughput: 0: 42816.8. Samples: 7484474600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 14:49:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456810_7484375040.pth... [2024-06-23 14:49:43,539][15401] Updated weights for policy 0, policy_version 456810 (0.0042) [2024-06-23 14:49:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456183_7474102272.pth [2024-06-23 14:49:48,186][15401] Updated weights for policy 0, policy_version 456820 (0.0029) [2024-06-23 14:49:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7484538880. Throughput: 0: 42930.2. Samples: 7484740620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:48,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 14:49:50,955][15401] Updated weights for policy 0, policy_version 456830 (0.0040) [2024-06-23 14:49:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 7484801024. Throughput: 0: 42921.3. Samples: 7484859440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 14:49:55,848][15401] Updated weights for policy 0, policy_version 456840 (0.0030) [2024-06-23 14:49:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7484997632. Throughput: 0: 43006.4. Samples: 7485126180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:49:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 14:49:58,634][15401] Updated weights for policy 0, policy_version 456850 (0.0032) [2024-06-23 14:50:03,359][15401] Updated weights for policy 0, policy_version 456860 (0.0028) [2024-06-23 14:50:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7485194240. Throughput: 0: 42812.7. Samples: 7485384360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:50:03,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 14:50:05,941][15349] Signal inference workers to stop experience collection... (110850 times) [2024-06-23 14:50:05,942][15349] Signal inference workers to resume experience collection... (110850 times) [2024-06-23 14:50:05,980][15401] InferenceWorker_p0-w0: stopping experience collection (110850 times) [2024-06-23 14:50:05,980][15401] InferenceWorker_p0-w0: resuming experience collection (110850 times) [2024-06-23 14:50:06,235][15401] Updated weights for policy 0, policy_version 456870 (0.0032) [2024-06-23 14:50:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 7485456384. Throughput: 0: 42896.0. Samples: 7485503220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 14:50:08,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 14:50:10,874][15401] Updated weights for policy 0, policy_version 456880 (0.0042) [2024-06-23 14:50:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 7485636608. Throughput: 0: 42970.3. Samples: 7485767260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 14:50:13,866][15401] Updated weights for policy 0, policy_version 456890 (0.0021) [2024-06-23 14:50:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7485833216. Throughput: 0: 42830.5. Samples: 7486020920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 14:50:18,448][15401] Updated weights for policy 0, policy_version 456900 (0.0029) [2024-06-23 14:50:21,808][15401] Updated weights for policy 0, policy_version 456910 (0.0028) [2024-06-23 14:50:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7486078976. Throughput: 0: 42988.4. Samples: 7486149040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 14:50:26,062][15401] Updated weights for policy 0, policy_version 456920 (0.0033) [2024-06-23 14:50:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7486275584. Throughput: 0: 42998.8. Samples: 7486409540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 14:50:29,310][15401] Updated weights for policy 0, policy_version 456930 (0.0032) [2024-06-23 14:50:33,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7486472192. Throughput: 0: 42690.3. Samples: 7486661680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 14:50:33,937][15401] Updated weights for policy 0, policy_version 456940 (0.0037) [2024-06-23 14:50:36,998][15401] Updated weights for policy 0, policy_version 456950 (0.0029) [2024-06-23 14:50:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 7486734336. Throughput: 0: 42972.5. Samples: 7486793200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 14:50:41,701][15401] Updated weights for policy 0, policy_version 456960 (0.0031) [2024-06-23 14:50:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7486914560. Throughput: 0: 42622.1. Samples: 7487044180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 14:50:45,144][15401] Updated weights for policy 0, policy_version 456970 (0.0025) [2024-06-23 14:50:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7487127552. Throughput: 0: 42371.2. Samples: 7487291060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:48,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 14:50:49,472][15401] Updated weights for policy 0, policy_version 456980 (0.0035) [2024-06-23 14:50:52,691][15401] Updated weights for policy 0, policy_version 456990 (0.0033) [2024-06-23 14:50:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7487356928. Throughput: 0: 42636.0. Samples: 7487421840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 14:50:57,112][15401] Updated weights for policy 0, policy_version 457000 (0.0044) [2024-06-23 14:50:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7487537152. Throughput: 0: 42470.0. Samples: 7487678420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:50:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 14:51:00,211][15401] Updated weights for policy 0, policy_version 457010 (0.0032) [2024-06-23 14:51:03,392][15132] Fps is (10 sec: 42587.6, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7487782912. Throughput: 0: 42414.9. Samples: 7487929700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:03,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 14:51:04,733][15401] Updated weights for policy 0, policy_version 457020 (0.0035) [2024-06-23 14:51:07,993][15401] Updated weights for policy 0, policy_version 457030 (0.0033) [2024-06-23 14:51:08,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7487995904. Throughput: 0: 42602.3. Samples: 7488066140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 14:51:12,261][15401] Updated weights for policy 0, policy_version 457040 (0.0032) [2024-06-23 14:51:13,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 7488176128. Throughput: 0: 42514.7. Samples: 7488322700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 14:51:15,544][15401] Updated weights for policy 0, policy_version 457050 (0.0036) [2024-06-23 14:51:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 7488438272. Throughput: 0: 42349.3. Samples: 7488567400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 14:51:20,151][15401] Updated weights for policy 0, policy_version 457060 (0.0029) [2024-06-23 14:51:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7488618496. Throughput: 0: 42424.4. Samples: 7488702300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 14:51:23,620][15401] Updated weights for policy 0, policy_version 457070 (0.0033) [2024-06-23 14:51:27,655][15401] Updated weights for policy 0, policy_version 457080 (0.0027) [2024-06-23 14:51:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7488815104. Throughput: 0: 42554.8. Samples: 7488959140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 14:51:31,226][15401] Updated weights for policy 0, policy_version 457090 (0.0044) [2024-06-23 14:51:33,047][15349] Signal inference workers to stop experience collection... (110900 times) [2024-06-23 14:51:33,091][15401] InferenceWorker_p0-w0: stopping experience collection (110900 times) [2024-06-23 14:51:33,106][15349] Signal inference workers to resume experience collection... (110900 times) [2024-06-23 14:51:33,107][15401] InferenceWorker_p0-w0: resuming experience collection (110900 times) [2024-06-23 14:51:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7489077248. Throughput: 0: 42584.0. Samples: 7489207340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 14:51:33,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 14:51:35,552][15401] Updated weights for policy 0, policy_version 457100 (0.0033) [2024-06-23 14:51:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 7489257472. Throughput: 0: 42717.8. Samples: 7489344140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:51:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 14:51:38,783][15401] Updated weights for policy 0, policy_version 457110 (0.0041) [2024-06-23 14:51:43,029][15401] Updated weights for policy 0, policy_version 457120 (0.0032) [2024-06-23 14:51:43,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7489454080. Throughput: 0: 42596.6. Samples: 7489595260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:51:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 14:51:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457120_7489454080.pth... [2024-06-23 14:51:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456495_7479214080.pth [2024-06-23 14:51:46,759][15401] Updated weights for policy 0, policy_version 457130 (0.0027) [2024-06-23 14:51:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7489716224. Throughput: 0: 42526.8. Samples: 7489843300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:51:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 14:51:51,060][15401] Updated weights for policy 0, policy_version 457140 (0.0052) [2024-06-23 14:51:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 7489880064. Throughput: 0: 42556.4. Samples: 7489981180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:51:53,396][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 14:51:54,265][15401] Updated weights for policy 0, policy_version 457150 (0.0033) [2024-06-23 14:51:58,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7490093056. Throughput: 0: 42187.8. Samples: 7490221160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:51:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 14:51:58,711][15401] Updated weights for policy 0, policy_version 457160 (0.0032) [2024-06-23 14:52:02,139][15401] Updated weights for policy 0, policy_version 457170 (0.0041) [2024-06-23 14:52:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 7490355200. Throughput: 0: 42492.5. Samples: 7490479560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 14:52:06,524][15401] Updated weights for policy 0, policy_version 457180 (0.0035) [2024-06-23 14:52:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7490519040. Throughput: 0: 42614.7. Samples: 7490619960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 14:52:09,880][15401] Updated weights for policy 0, policy_version 457190 (0.0030) [2024-06-23 14:52:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7490732032. Throughput: 0: 42237.8. Samples: 7490859840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 14:52:14,066][15401] Updated weights for policy 0, policy_version 457200 (0.0044) [2024-06-23 14:52:17,602][15401] Updated weights for policy 0, policy_version 457210 (0.0029) [2024-06-23 14:52:18,390][15132] Fps is (10 sec: 47512.1, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 7490994176. Throughput: 0: 42464.1. Samples: 7491118240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:18,391][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 14:52:21,702][15401] Updated weights for policy 0, policy_version 457220 (0.0028) [2024-06-23 14:52:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7491174400. Throughput: 0: 42547.1. Samples: 7491258760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 14:52:25,129][15401] Updated weights for policy 0, policy_version 457230 (0.0045) [2024-06-23 14:52:28,390][15132] Fps is (10 sec: 39323.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7491387392. Throughput: 0: 42376.0. Samples: 7491502180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:28,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 14:52:29,470][15401] Updated weights for policy 0, policy_version 457240 (0.0040) [2024-06-23 14:52:32,852][15401] Updated weights for policy 0, policy_version 457250 (0.0037) [2024-06-23 14:52:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7491600384. Throughput: 0: 42560.0. Samples: 7491758500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:33,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 14:52:37,149][15401] Updated weights for policy 0, policy_version 457260 (0.0056) [2024-06-23 14:52:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7491796992. Throughput: 0: 42351.2. Samples: 7491886980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 14:52:40,565][15401] Updated weights for policy 0, policy_version 457270 (0.0027) [2024-06-23 14:52:42,302][15349] Signal inference workers to stop experience collection... (110950 times) [2024-06-23 14:52:42,302][15349] Signal inference workers to resume experience collection... (110950 times) [2024-06-23 14:52:42,329][15401] InferenceWorker_p0-w0: stopping experience collection (110950 times) [2024-06-23 14:52:42,329][15401] InferenceWorker_p0-w0: resuming experience collection (110950 times) [2024-06-23 14:52:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 7492026368. Throughput: 0: 42654.7. Samples: 7492140620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 14:52:44,779][15401] Updated weights for policy 0, policy_version 457280 (0.0024) [2024-06-23 14:52:48,168][15401] Updated weights for policy 0, policy_version 457290 (0.0027) [2024-06-23 14:52:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7492239360. Throughput: 0: 42597.3. Samples: 7492396440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:52:52,401][15401] Updated weights for policy 0, policy_version 457300 (0.0030) [2024-06-23 14:52:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 7492452352. Throughput: 0: 42377.4. Samples: 7492526940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 14:52:55,649][15401] Updated weights for policy 0, policy_version 457310 (0.0037) [2024-06-23 14:52:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 7492665344. Throughput: 0: 42610.6. Samples: 7492777420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 14:52:58,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:53:00,096][15401] Updated weights for policy 0, policy_version 457320 (0.0039) [2024-06-23 14:53:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7492878336. Throughput: 0: 42675.9. Samples: 7493038640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 14:53:03,828][15401] Updated weights for policy 0, policy_version 457330 (0.0027) [2024-06-23 14:53:07,648][15401] Updated weights for policy 0, policy_version 457340 (0.0036) [2024-06-23 14:53:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42543.6). Total num frames: 7493074944. Throughput: 0: 42281.0. Samples: 7493161400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 14:53:11,327][15401] Updated weights for policy 0, policy_version 457350 (0.0027) [2024-06-23 14:53:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7493320704. Throughput: 0: 42573.7. Samples: 7493418000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 14:53:15,698][15401] Updated weights for policy 0, policy_version 457360 (0.0033) [2024-06-23 14:53:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.5, 300 sec: 42765.0). Total num frames: 7493517312. Throughput: 0: 42563.5. Samples: 7493673860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 14:53:19,283][15401] Updated weights for policy 0, policy_version 457370 (0.0038) [2024-06-23 14:53:23,170][15401] Updated weights for policy 0, policy_version 457380 (0.0034) [2024-06-23 14:53:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7493713920. Throughput: 0: 42457.3. Samples: 7493797560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 14:53:26,834][15401] Updated weights for policy 0, policy_version 457390 (0.0047) [2024-06-23 14:53:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7493943296. Throughput: 0: 42547.5. Samples: 7494055260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 14:53:30,714][15401] Updated weights for policy 0, policy_version 457400 (0.0031) [2024-06-23 14:53:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 7494156288. Throughput: 0: 42567.6. Samples: 7494311980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:33,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 14:53:34,459][15401] Updated weights for policy 0, policy_version 457410 (0.0030) [2024-06-23 14:53:38,263][15401] Updated weights for policy 0, policy_version 457420 (0.0047) [2024-06-23 14:53:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7494369280. Throughput: 0: 42485.8. Samples: 7494438800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 14:53:42,019][15401] Updated weights for policy 0, policy_version 457430 (0.0032) [2024-06-23 14:53:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7494582272. Throughput: 0: 42530.1. Samples: 7494691180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:43,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 14:53:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457433_7494582272.pth... [2024-06-23 14:53:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000456810_7484375040.pth [2024-06-23 14:53:46,934][15401] Updated weights for policy 0, policy_version 457440 (0.0026) [2024-06-23 14:53:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 7494778880. Throughput: 0: 42409.4. Samples: 7494947160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:48,393][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 14:53:49,720][15401] Updated weights for policy 0, policy_version 457450 (0.0029) [2024-06-23 14:53:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7494991872. Throughput: 0: 42342.5. Samples: 7495066820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 14:53:54,499][15401] Updated weights for policy 0, policy_version 457460 (0.0034) [2024-06-23 14:53:57,214][15401] Updated weights for policy 0, policy_version 457470 (0.0030) [2024-06-23 14:53:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 7495221248. Throughput: 0: 42298.4. Samples: 7495321420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:53:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 14:54:02,062][15401] Updated weights for policy 0, policy_version 457480 (0.0021) [2024-06-23 14:54:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7495417856. Throughput: 0: 42539.2. Samples: 7495588120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:54:03,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 14:54:04,636][15349] Signal inference workers to stop experience collection... (111000 times) [2024-06-23 14:54:04,637][15349] Signal inference workers to resume experience collection... (111000 times) [2024-06-23 14:54:04,649][15401] InferenceWorker_p0-w0: stopping experience collection (111000 times) [2024-06-23 14:54:04,672][15401] InferenceWorker_p0-w0: resuming experience collection (111000 times) [2024-06-23 14:54:04,775][15401] Updated weights for policy 0, policy_version 457490 (0.0027) [2024-06-23 14:54:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7495630848. Throughput: 0: 42405.4. Samples: 7495705800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:54:08,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 14:54:09,906][15401] Updated weights for policy 0, policy_version 457500 (0.0037) [2024-06-23 14:54:12,830][15401] Updated weights for policy 0, policy_version 457510 (0.0035) [2024-06-23 14:54:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7495860224. Throughput: 0: 42365.0. Samples: 7495961680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:54:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 14:54:17,417][15401] Updated weights for policy 0, policy_version 457520 (0.0034) [2024-06-23 14:54:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7496056832. Throughput: 0: 42529.8. Samples: 7496225820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:54:18,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 14:54:20,818][15401] Updated weights for policy 0, policy_version 457530 (0.0030) [2024-06-23 14:54:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7496269824. Throughput: 0: 42369.3. Samples: 7496345420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 14:54:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 14:54:25,096][15401] Updated weights for policy 0, policy_version 457540 (0.0038) [2024-06-23 14:54:28,381][15401] Updated weights for policy 0, policy_version 457550 (0.0037) [2024-06-23 14:54:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7496499200. Throughput: 0: 42160.1. Samples: 7496588380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 14:54:33,020][15401] Updated weights for policy 0, policy_version 457560 (0.0035) [2024-06-23 14:54:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 7496663040. Throughput: 0: 42390.8. Samples: 7496854640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 14:54:36,710][15401] Updated weights for policy 0, policy_version 457570 (0.0024) [2024-06-23 14:54:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7496908800. Throughput: 0: 42499.7. Samples: 7496979300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 14:54:40,472][15401] Updated weights for policy 0, policy_version 457580 (0.0037) [2024-06-23 14:54:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7497121792. Throughput: 0: 42528.0. Samples: 7497235180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 14:54:44,233][15401] Updated weights for policy 0, policy_version 457590 (0.0030) [2024-06-23 14:54:48,002][15401] Updated weights for policy 0, policy_version 457600 (0.0047) [2024-06-23 14:54:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 7497318400. Throughput: 0: 42348.9. Samples: 7497493820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 14:54:51,655][15401] Updated weights for policy 0, policy_version 457610 (0.0035) [2024-06-23 14:54:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 7497531392. Throughput: 0: 42601.9. Samples: 7497622880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:53,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 14:54:55,536][15401] Updated weights for policy 0, policy_version 457620 (0.0044) [2024-06-23 14:54:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7497760768. Throughput: 0: 42553.8. Samples: 7497876600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:54:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 14:54:59,314][15401] Updated weights for policy 0, policy_version 457630 (0.0031) [2024-06-23 14:55:03,263][15401] Updated weights for policy 0, policy_version 457640 (0.0026) [2024-06-23 14:55:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7497973760. Throughput: 0: 42334.3. Samples: 7498130860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 14:55:06,981][15401] Updated weights for policy 0, policy_version 457650 (0.0033) [2024-06-23 14:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7498170368. Throughput: 0: 42645.1. Samples: 7498264440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 14:55:10,855][15401] Updated weights for policy 0, policy_version 457660 (0.0035) [2024-06-23 14:55:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 7498383360. Throughput: 0: 42783.6. Samples: 7498513640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 14:55:14,655][15349] Signal inference workers to stop experience collection... (111050 times) [2024-06-23 14:55:14,655][15349] Signal inference workers to resume experience collection... (111050 times) [2024-06-23 14:55:14,669][15401] InferenceWorker_p0-w0: stopping experience collection (111050 times) [2024-06-23 14:55:14,669][15401] InferenceWorker_p0-w0: resuming experience collection (111050 times) [2024-06-23 14:55:14,795][15401] Updated weights for policy 0, policy_version 457670 (0.0025) [2024-06-23 14:55:18,331][15401] Updated weights for policy 0, policy_version 457680 (0.0044) [2024-06-23 14:55:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7498629120. Throughput: 0: 42454.2. Samples: 7498765080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 14:55:22,841][15401] Updated weights for policy 0, policy_version 457690 (0.0043) [2024-06-23 14:55:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 7498809344. Throughput: 0: 42669.0. Samples: 7498899400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 14:55:26,058][15401] Updated weights for policy 0, policy_version 457700 (0.0027) [2024-06-23 14:55:28,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 7499022336. Throughput: 0: 42557.7. Samples: 7499150380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:28,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 14:55:30,377][15401] Updated weights for policy 0, policy_version 457710 (0.0036) [2024-06-23 14:55:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 7499251712. Throughput: 0: 42567.1. Samples: 7499409340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 14:55:33,675][15401] Updated weights for policy 0, policy_version 457720 (0.0042) [2024-06-23 14:55:38,026][15401] Updated weights for policy 0, policy_version 457730 (0.0032) [2024-06-23 14:55:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7499448320. Throughput: 0: 42634.6. Samples: 7499541440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 14:55:41,212][15401] Updated weights for policy 0, policy_version 457740 (0.0033) [2024-06-23 14:55:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7499677696. Throughput: 0: 42717.4. Samples: 7499798880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-23 14:55:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 14:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457744_7499677696.pth... [2024-06-23 14:55:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457120_7489454080.pth [2024-06-23 14:55:46,083][15401] Updated weights for policy 0, policy_version 457750 (0.0050) [2024-06-23 14:55:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 7499907072. Throughput: 0: 42638.6. Samples: 7500049600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:55:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 14:55:48,858][15401] Updated weights for policy 0, policy_version 457760 (0.0042) [2024-06-23 14:55:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7500087296. Throughput: 0: 42640.7. Samples: 7500183280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:55:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 14:55:53,586][15401] Updated weights for policy 0, policy_version 457770 (0.0037) [2024-06-23 14:55:56,550][15401] Updated weights for policy 0, policy_version 457780 (0.0037) [2024-06-23 14:55:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42487.3). Total num frames: 7500316672. Throughput: 0: 42772.0. Samples: 7500438480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:55:58,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 14:56:01,145][15401] Updated weights for policy 0, policy_version 457790 (0.0035) [2024-06-23 14:56:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7500546048. Throughput: 0: 42712.4. Samples: 7500687140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 14:56:04,182][15401] Updated weights for policy 0, policy_version 457800 (0.0047) [2024-06-23 14:56:08,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 7500726272. Throughput: 0: 42793.2. Samples: 7500825100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 14:56:09,021][15401] Updated weights for policy 0, policy_version 457810 (0.0038) [2024-06-23 14:56:11,925][15401] Updated weights for policy 0, policy_version 457820 (0.0037) [2024-06-23 14:56:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 7500972032. Throughput: 0: 42735.1. Samples: 7501073360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 14:56:16,852][15401] Updated weights for policy 0, policy_version 457830 (0.0031) [2024-06-23 14:56:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7501185024. Throughput: 0: 42689.7. Samples: 7501330380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 14:56:19,782][15401] Updated weights for policy 0, policy_version 457840 (0.0028) [2024-06-23 14:56:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7501381632. Throughput: 0: 42613.8. Samples: 7501459060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:23,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 14:56:24,554][15401] Updated weights for policy 0, policy_version 457850 (0.0039) [2024-06-23 14:56:27,268][15349] Signal inference workers to stop experience collection... (111100 times) [2024-06-23 14:56:27,316][15401] InferenceWorker_p0-w0: stopping experience collection (111100 times) [2024-06-23 14:56:27,326][15349] Signal inference workers to resume experience collection... (111100 times) [2024-06-23 14:56:27,327][15401] InferenceWorker_p0-w0: resuming experience collection (111100 times) [2024-06-23 14:56:27,473][15401] Updated weights for policy 0, policy_version 457860 (0.0043) [2024-06-23 14:56:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42487.3). Total num frames: 7501611008. Throughput: 0: 42584.8. Samples: 7501715200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:28,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 14:56:32,116][15401] Updated weights for policy 0, policy_version 457870 (0.0036) [2024-06-23 14:56:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7501807616. Throughput: 0: 42718.6. Samples: 7501971940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 14:56:35,096][15401] Updated weights for policy 0, policy_version 457880 (0.0051) [2024-06-23 14:56:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7502004224. Throughput: 0: 42531.6. Samples: 7502097200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 14:56:39,714][15401] Updated weights for policy 0, policy_version 457890 (0.0036) [2024-06-23 14:56:42,780][15401] Updated weights for policy 0, policy_version 457900 (0.0029) [2024-06-23 14:56:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 7502249984. Throughput: 0: 42589.9. Samples: 7502354920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 14:56:47,353][15401] Updated weights for policy 0, policy_version 457910 (0.0040) [2024-06-23 14:56:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7502430208. Throughput: 0: 42901.3. Samples: 7502617700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 14:56:50,414][15401] Updated weights for policy 0, policy_version 457920 (0.0038) [2024-06-23 14:56:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7502659584. Throughput: 0: 42483.7. Samples: 7502736860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 14:56:54,887][15401] Updated weights for policy 0, policy_version 457930 (0.0031) [2024-06-23 14:56:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42431.8). Total num frames: 7502872576. Throughput: 0: 42727.7. Samples: 7502996100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:56:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 14:56:58,414][15401] Updated weights for policy 0, policy_version 457940 (0.0029) [2024-06-23 14:57:02,503][15401] Updated weights for policy 0, policy_version 457950 (0.0032) [2024-06-23 14:57:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 7503069184. Throughput: 0: 42781.3. Samples: 7503255640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:57:03,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 14:57:05,899][15401] Updated weights for policy 0, policy_version 457960 (0.0024) [2024-06-23 14:57:08,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7503314944. Throughput: 0: 42681.2. Samples: 7503379820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 14:57:08,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 14:57:10,081][15401] Updated weights for policy 0, policy_version 457970 (0.0035) [2024-06-23 14:57:13,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42325.5, 300 sec: 42431.9). Total num frames: 7503511552. Throughput: 0: 42768.6. Samples: 7503639780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 14:57:13,605][15401] Updated weights for policy 0, policy_version 457980 (0.0030) [2024-06-23 14:57:18,070][15401] Updated weights for policy 0, policy_version 457990 (0.0025) [2024-06-23 14:57:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 7503708160. Throughput: 0: 42820.5. Samples: 7503898860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 14:57:21,184][15401] Updated weights for policy 0, policy_version 458000 (0.0026) [2024-06-23 14:57:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7503953920. Throughput: 0: 42732.0. Samples: 7504020140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 14:57:25,617][15401] Updated weights for policy 0, policy_version 458010 (0.0038) [2024-06-23 14:57:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7504150528. Throughput: 0: 42804.0. Samples: 7504281100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 14:57:28,973][15401] Updated weights for policy 0, policy_version 458020 (0.0035) [2024-06-23 14:57:33,130][15401] Updated weights for policy 0, policy_version 458030 (0.0039) [2024-06-23 14:57:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7504363520. Throughput: 0: 42581.7. Samples: 7504533880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:33,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 14:57:36,771][15401] Updated weights for policy 0, policy_version 458040 (0.0029) [2024-06-23 14:57:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 7504609280. Throughput: 0: 42911.5. Samples: 7504667880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 14:57:41,067][15401] Updated weights for policy 0, policy_version 458050 (0.0026) [2024-06-23 14:57:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7504789504. Throughput: 0: 42783.9. Samples: 7504921380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 14:57:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458056_7504789504.pth... [2024-06-23 14:57:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457433_7494582272.pth [2024-06-23 14:57:44,418][15401] Updated weights for policy 0, policy_version 458060 (0.0024) [2024-06-23 14:57:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 7505002496. Throughput: 0: 42633.3. Samples: 7505174140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:48,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 14:57:48,525][15401] Updated weights for policy 0, policy_version 458070 (0.0037) [2024-06-23 14:57:52,380][15401] Updated weights for policy 0, policy_version 458080 (0.0029) [2024-06-23 14:57:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 7505248256. Throughput: 0: 42785.4. Samples: 7505305060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 14:57:56,250][15401] Updated weights for policy 0, policy_version 458090 (0.0038) [2024-06-23 14:57:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7505428480. Throughput: 0: 42758.0. Samples: 7505563900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:57:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 14:58:00,099][15401] Updated weights for policy 0, policy_version 458100 (0.0032) [2024-06-23 14:58:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 7505641472. Throughput: 0: 42551.0. Samples: 7505813660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 14:58:03,771][15401] Updated weights for policy 0, policy_version 458110 (0.0038) [2024-06-23 14:58:05,065][15349] Signal inference workers to stop experience collection... (111150 times) [2024-06-23 14:58:05,110][15401] InferenceWorker_p0-w0: stopping experience collection (111150 times) [2024-06-23 14:58:05,120][15349] Signal inference workers to resume experience collection... (111150 times) [2024-06-23 14:58:05,125][15401] InferenceWorker_p0-w0: resuming experience collection (111150 times) [2024-06-23 14:58:07,823][15401] Updated weights for policy 0, policy_version 458120 (0.0033) [2024-06-23 14:58:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 7505854464. Throughput: 0: 42729.3. Samples: 7505942960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 14:58:11,557][15401] Updated weights for policy 0, policy_version 458130 (0.0040) [2024-06-23 14:58:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7506067456. Throughput: 0: 42628.5. Samples: 7506199380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 14:58:15,567][15401] Updated weights for policy 0, policy_version 458140 (0.0035) [2024-06-23 14:58:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7506296832. Throughput: 0: 42541.9. Samples: 7506448260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 14:58:19,225][15401] Updated weights for policy 0, policy_version 458150 (0.0031) [2024-06-23 14:58:23,375][15401] Updated weights for policy 0, policy_version 458160 (0.0041) [2024-06-23 14:58:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 7506493440. Throughput: 0: 42460.8. Samples: 7506578720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:23,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 14:58:26,667][15401] Updated weights for policy 0, policy_version 458170 (0.0027) [2024-06-23 14:58:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7506722816. Throughput: 0: 42633.0. Samples: 7506839860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 14:58:31,130][15401] Updated weights for policy 0, policy_version 458180 (0.0035) [2024-06-23 14:58:33,389][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7506952192. Throughput: 0: 42595.2. Samples: 7507090820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 14:58:33,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 14:58:34,182][15401] Updated weights for policy 0, policy_version 458190 (0.0032) [2024-06-23 14:58:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42487.4). Total num frames: 7507116032. Throughput: 0: 42533.7. Samples: 7507219080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:58:38,392][15132] Avg episode reward: [(0, '0.195')] [2024-06-23 14:58:38,693][15401] Updated weights for policy 0, policy_version 458200 (0.0029) [2024-06-23 14:58:42,108][15401] Updated weights for policy 0, policy_version 458210 (0.0050) [2024-06-23 14:58:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 7507378176. Throughput: 0: 42653.4. Samples: 7507483300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:58:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 14:58:46,444][15401] Updated weights for policy 0, policy_version 458220 (0.0028) [2024-06-23 14:58:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 7507591168. Throughput: 0: 42773.7. Samples: 7507738480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:58:48,394][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 14:58:49,678][15401] Updated weights for policy 0, policy_version 458230 (0.0030) [2024-06-23 14:58:53,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 7507771392. Throughput: 0: 42726.1. Samples: 7507865640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:58:53,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 14:58:54,080][15401] Updated weights for policy 0, policy_version 458240 (0.0033) [2024-06-23 14:58:57,119][15401] Updated weights for policy 0, policy_version 458250 (0.0030) [2024-06-23 14:58:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7508017152. Throughput: 0: 42838.2. Samples: 7508127100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:58:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 14:59:01,540][15401] Updated weights for policy 0, policy_version 458260 (0.0035) [2024-06-23 14:59:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7508230144. Throughput: 0: 43004.5. Samples: 7508383460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:03,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 14:59:05,110][15401] Updated weights for policy 0, policy_version 458270 (0.0050) [2024-06-23 14:59:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7508410368. Throughput: 0: 42864.1. Samples: 7508507500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 14:59:09,062][15401] Updated weights for policy 0, policy_version 458280 (0.0030) [2024-06-23 14:59:12,680][15401] Updated weights for policy 0, policy_version 458290 (0.0035) [2024-06-23 14:59:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7508672512. Throughput: 0: 42974.2. Samples: 7508773700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 14:59:16,653][15401] Updated weights for policy 0, policy_version 458300 (0.0029) [2024-06-23 14:59:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7508852736. Throughput: 0: 43057.8. Samples: 7509028420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 14:59:20,189][15401] Updated weights for policy 0, policy_version 458310 (0.0038) [2024-06-23 14:59:23,392][15132] Fps is (10 sec: 37673.7, 60 sec: 42598.4, 300 sec: 42542.5). Total num frames: 7509049344. Throughput: 0: 42841.2. Samples: 7509147040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:23,393][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 14:59:24,487][15401] Updated weights for policy 0, policy_version 458320 (0.0034) [2024-06-23 14:59:27,803][15401] Updated weights for policy 0, policy_version 458330 (0.0029) [2024-06-23 14:59:28,396][15132] Fps is (10 sec: 45846.1, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 7509311488. Throughput: 0: 42889.4. Samples: 7509413600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:28,397][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 14:59:31,347][15349] Signal inference workers to stop experience collection... (111200 times) [2024-06-23 14:59:31,395][15401] InferenceWorker_p0-w0: stopping experience collection (111200 times) [2024-06-23 14:59:31,403][15349] Signal inference workers to resume experience collection... (111200 times) [2024-06-23 14:59:31,414][15401] InferenceWorker_p0-w0: resuming experience collection (111200 times) [2024-06-23 14:59:32,294][15401] Updated weights for policy 0, policy_version 458340 (0.0046) [2024-06-23 14:59:33,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7509491712. Throughput: 0: 42879.2. Samples: 7509668040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 14:59:35,533][15401] Updated weights for policy 0, policy_version 458350 (0.0030) [2024-06-23 14:59:38,392][15132] Fps is (10 sec: 39337.2, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7509704704. Throughput: 0: 42688.5. Samples: 7509786720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:38,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 14:59:40,024][15401] Updated weights for policy 0, policy_version 458360 (0.0033) [2024-06-23 14:59:42,986][15401] Updated weights for policy 0, policy_version 458370 (0.0040) [2024-06-23 14:59:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7509950464. Throughput: 0: 42783.6. Samples: 7510052360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 14:59:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458372_7509966848.pth... [2024-06-23 14:59:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000457744_7499677696.pth [2024-06-23 14:59:47,590][15401] Updated weights for policy 0, policy_version 458380 (0.0029) [2024-06-23 14:59:48,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7510114304. Throughput: 0: 42823.0. Samples: 7510310500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 14:59:50,706][15401] Updated weights for policy 0, policy_version 458390 (0.0049) [2024-06-23 14:59:53,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 7510343680. Throughput: 0: 42587.5. Samples: 7510424040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:53,392][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 14:59:55,146][15401] Updated weights for policy 0, policy_version 458400 (0.0041) [2024-06-23 14:59:58,356][15401] Updated weights for policy 0, policy_version 458410 (0.0033) [2024-06-23 14:59:58,390][15132] Fps is (10 sec: 47510.0, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 7510589440. Throughput: 0: 42454.7. Samples: 7510684200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 14:59:58,391][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 15:00:02,800][15401] Updated weights for policy 0, policy_version 458420 (0.0041) [2024-06-23 15:00:03,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7510753280. Throughput: 0: 42543.6. Samples: 7510942880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:03,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 15:00:05,971][15401] Updated weights for policy 0, policy_version 458430 (0.0036) [2024-06-23 15:00:08,390][15132] Fps is (10 sec: 40963.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7510999040. Throughput: 0: 42612.0. Samples: 7511064480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 15:00:10,347][15401] Updated weights for policy 0, policy_version 458440 (0.0034) [2024-06-23 15:00:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7511228416. Throughput: 0: 42743.4. Samples: 7511336780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 15:00:13,496][15401] Updated weights for policy 0, policy_version 458450 (0.0024) [2024-06-23 15:00:18,236][15401] Updated weights for policy 0, policy_version 458460 (0.0039) [2024-06-23 15:00:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7511425024. Throughput: 0: 42928.0. Samples: 7511599800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 15:00:21,206][15401] Updated weights for policy 0, policy_version 458470 (0.0034) [2024-06-23 15:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.3, 300 sec: 42820.9). Total num frames: 7511654400. Throughput: 0: 42958.3. Samples: 7511719740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 15:00:25,783][15401] Updated weights for policy 0, policy_version 458480 (0.0034) [2024-06-23 15:00:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 7511867392. Throughput: 0: 42905.7. Samples: 7511983120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 15:00:28,850][15401] Updated weights for policy 0, policy_version 458490 (0.0029) [2024-06-23 15:00:33,293][15401] Updated weights for policy 0, policy_version 458500 (0.0032) [2024-06-23 15:00:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7512064000. Throughput: 0: 42890.7. Samples: 7512240580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 15:00:36,343][15401] Updated weights for policy 0, policy_version 458510 (0.0031) [2024-06-23 15:00:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 7512309760. Throughput: 0: 43090.2. Samples: 7512363000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 15:00:40,734][15401] Updated weights for policy 0, policy_version 458520 (0.0036) [2024-06-23 15:00:43,147][15349] Signal inference workers to stop experience collection... (111250 times) [2024-06-23 15:00:43,148][15349] Signal inference workers to resume experience collection... (111250 times) [2024-06-23 15:00:43,200][15401] InferenceWorker_p0-w0: stopping experience collection (111250 times) [2024-06-23 15:00:43,200][15401] InferenceWorker_p0-w0: resuming experience collection (111250 times) [2024-06-23 15:00:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7512522752. Throughput: 0: 43184.4. Samples: 7512627460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 15:00:43,883][15401] Updated weights for policy 0, policy_version 458530 (0.0036) [2024-06-23 15:00:48,391][15132] Fps is (10 sec: 39314.5, 60 sec: 43143.2, 300 sec: 42764.8). Total num frames: 7512702976. Throughput: 0: 43102.2. Samples: 7512882560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:48,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 15:00:48,495][15401] Updated weights for policy 0, policy_version 458540 (0.0041) [2024-06-23 15:00:51,555][15401] Updated weights for policy 0, policy_version 458550 (0.0023) [2024-06-23 15:00:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.3, 300 sec: 42820.9). Total num frames: 7512948736. Throughput: 0: 43138.7. Samples: 7513005720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:53,391][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 15:00:56,259][15401] Updated weights for policy 0, policy_version 458560 (0.0032) [2024-06-23 15:00:58,390][15132] Fps is (10 sec: 44245.2, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 7513145344. Throughput: 0: 43069.3. Samples: 7513274900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:00:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 15:00:59,203][15401] Updated weights for policy 0, policy_version 458570 (0.0023) [2024-06-23 15:01:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7513341952. Throughput: 0: 42959.3. Samples: 7513532980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:01:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 15:01:03,713][15401] Updated weights for policy 0, policy_version 458580 (0.0029) [2024-06-23 15:01:06,963][15401] Updated weights for policy 0, policy_version 458590 (0.0027) [2024-06-23 15:01:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7513587712. Throughput: 0: 42942.8. Samples: 7513652160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:01:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 15:01:11,202][15401] Updated weights for policy 0, policy_version 458600 (0.0026) [2024-06-23 15:01:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7513800704. Throughput: 0: 42952.6. Samples: 7513915980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:01:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 15:01:14,828][15401] Updated weights for policy 0, policy_version 458610 (0.0027) [2024-06-23 15:01:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7513997312. Throughput: 0: 42992.0. Samples: 7514175220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:01:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 15:01:18,649][15401] Updated weights for policy 0, policy_version 458620 (0.0040) [2024-06-23 15:01:22,325][15401] Updated weights for policy 0, policy_version 458630 (0.0038) [2024-06-23 15:01:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7514210304. Throughput: 0: 43009.8. Samples: 7514298440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 15:01:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 15:01:26,747][15401] Updated weights for policy 0, policy_version 458640 (0.0023) [2024-06-23 15:01:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7514439680. Throughput: 0: 42856.0. Samples: 7514555980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 15:01:29,869][15401] Updated weights for policy 0, policy_version 458650 (0.0037) [2024-06-23 15:01:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7514652672. Throughput: 0: 42982.7. Samples: 7514816700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:33,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 15:01:34,168][15401] Updated weights for policy 0, policy_version 458660 (0.0044) [2024-06-23 15:01:37,434][15401] Updated weights for policy 0, policy_version 458670 (0.0038) [2024-06-23 15:01:38,390][15132] Fps is (10 sec: 42596.8, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 7514865664. Throughput: 0: 43124.6. Samples: 7514946340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 15:01:41,677][15401] Updated weights for policy 0, policy_version 458680 (0.0035) [2024-06-23 15:01:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 7515045888. Throughput: 0: 42857.0. Samples: 7515203460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 15:01:43,443][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458683_7515062272.pth... [2024-06-23 15:01:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458056_7504789504.pth [2024-06-23 15:01:45,104][15401] Updated weights for policy 0, policy_version 458690 (0.0039) [2024-06-23 15:01:47,973][15349] Signal inference workers to stop experience collection... (111300 times) [2024-06-23 15:01:48,027][15349] Signal inference workers to resume experience collection... (111300 times) [2024-06-23 15:01:48,028][15401] InferenceWorker_p0-w0: stopping experience collection (111300 times) [2024-06-23 15:01:48,043][15401] InferenceWorker_p0-w0: resuming experience collection (111300 times) [2024-06-23 15:01:48,392][15132] Fps is (10 sec: 42589.7, 60 sec: 43144.2, 300 sec: 42820.2). Total num frames: 7515291648. Throughput: 0: 42749.0. Samples: 7515456780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:48,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 15:01:49,251][15401] Updated weights for policy 0, policy_version 458700 (0.0041) [2024-06-23 15:01:52,758][15401] Updated weights for policy 0, policy_version 458710 (0.0024) [2024-06-23 15:01:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7515504640. Throughput: 0: 42988.0. Samples: 7515586620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 15:01:57,317][15401] Updated weights for policy 0, policy_version 458720 (0.0035) [2024-06-23 15:01:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7515701248. Throughput: 0: 42754.7. Samples: 7515839940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:01:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 15:02:00,308][15401] Updated weights for policy 0, policy_version 458730 (0.0027) [2024-06-23 15:02:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 7515930624. Throughput: 0: 42660.0. Samples: 7516094920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 15:02:04,886][15401] Updated weights for policy 0, policy_version 458740 (0.0035) [2024-06-23 15:02:08,254][15401] Updated weights for policy 0, policy_version 458750 (0.0032) [2024-06-23 15:02:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7516160000. Throughput: 0: 42758.3. Samples: 7516222560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 15:02:12,435][15401] Updated weights for policy 0, policy_version 458760 (0.0046) [2024-06-23 15:02:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7516340224. Throughput: 0: 42716.0. Samples: 7516478200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 15:02:16,765][15401] Updated weights for policy 0, policy_version 458770 (0.0030) [2024-06-23 15:02:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7516553216. Throughput: 0: 42533.8. Samples: 7516730720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:18,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 15:02:20,035][15401] Updated weights for policy 0, policy_version 458780 (0.0033) [2024-06-23 15:02:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7516782592. Throughput: 0: 42537.3. Samples: 7516860500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 15:02:24,145][15401] Updated weights for policy 0, policy_version 458790 (0.0024) [2024-06-23 15:02:27,691][15401] Updated weights for policy 0, policy_version 458800 (0.0037) [2024-06-23 15:02:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7517011968. Throughput: 0: 42640.8. Samples: 7517122300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 15:02:31,584][15401] Updated weights for policy 0, policy_version 458810 (0.0028) [2024-06-23 15:02:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7517192192. Throughput: 0: 42773.9. Samples: 7517381500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 15:02:35,241][15401] Updated weights for policy 0, policy_version 458820 (0.0033) [2024-06-23 15:02:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 7517437952. Throughput: 0: 42632.3. Samples: 7517505080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 15:02:39,414][15401] Updated weights for policy 0, policy_version 458830 (0.0032) [2024-06-23 15:02:43,133][15401] Updated weights for policy 0, policy_version 458840 (0.0034) [2024-06-23 15:02:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 7517650944. Throughput: 0: 42865.6. Samples: 7517768900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:43,391][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 15:02:46,857][15401] Updated weights for policy 0, policy_version 458850 (0.0037) [2024-06-23 15:02:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 7517831168. Throughput: 0: 42887.1. Samples: 7518024840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 15:02:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 15:02:50,657][15401] Updated weights for policy 0, policy_version 458860 (0.0033) [2024-06-23 15:02:52,716][15349] Signal inference workers to stop experience collection... (111350 times) [2024-06-23 15:02:52,742][15401] InferenceWorker_p0-w0: stopping experience collection (111350 times) [2024-06-23 15:02:52,831][15349] Signal inference workers to resume experience collection... (111350 times) [2024-06-23 15:02:52,831][15401] InferenceWorker_p0-w0: resuming experience collection (111350 times) [2024-06-23 15:02:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.3, 300 sec: 42931.6). Total num frames: 7518093312. Throughput: 0: 42782.5. Samples: 7518147780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:02:53,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 15:02:54,539][15401] Updated weights for policy 0, policy_version 458870 (0.0029) [2024-06-23 15:02:58,107][15401] Updated weights for policy 0, policy_version 458880 (0.0038) [2024-06-23 15:02:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7518289920. Throughput: 0: 42914.3. Samples: 7518409340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:02:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 15:03:02,522][15401] Updated weights for policy 0, policy_version 458890 (0.0037) [2024-06-23 15:03:03,390][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7518470144. Throughput: 0: 43056.9. Samples: 7518668280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 15:03:05,770][15401] Updated weights for policy 0, policy_version 458900 (0.0043) [2024-06-23 15:03:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7518732288. Throughput: 0: 42931.0. Samples: 7518792400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 15:03:09,898][15401] Updated weights for policy 0, policy_version 458910 (0.0022) [2024-06-23 15:03:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7518928896. Throughput: 0: 42966.6. Samples: 7519055800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 15:03:13,410][15401] Updated weights for policy 0, policy_version 458920 (0.0044) [2024-06-23 15:03:17,549][15401] Updated weights for policy 0, policy_version 458930 (0.0030) [2024-06-23 15:03:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 7519141888. Throughput: 0: 42907.9. Samples: 7519312360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:18,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 15:03:20,992][15401] Updated weights for policy 0, policy_version 458940 (0.0032) [2024-06-23 15:03:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7519371264. Throughput: 0: 43000.0. Samples: 7519440080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 15:03:25,012][15401] Updated weights for policy 0, policy_version 458950 (0.0046) [2024-06-23 15:03:28,373][15401] Updated weights for policy 0, policy_version 458960 (0.0039) [2024-06-23 15:03:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7519600640. Throughput: 0: 42970.8. Samples: 7519702580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 15:03:32,981][15401] Updated weights for policy 0, policy_version 458970 (0.0034) [2024-06-23 15:03:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7519780864. Throughput: 0: 43036.5. Samples: 7519961480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:33,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 15:03:36,049][15401] Updated weights for policy 0, policy_version 458980 (0.0034) [2024-06-23 15:03:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7520010240. Throughput: 0: 43031.2. Samples: 7520084180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 15:03:40,430][15401] Updated weights for policy 0, policy_version 458990 (0.0038) [2024-06-23 15:03:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7520239616. Throughput: 0: 42975.5. Samples: 7520343240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 15:03:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458999_7520239616.pth... [2024-06-23 15:03:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458372_7509966848.pth [2024-06-23 15:03:43,894][15401] Updated weights for policy 0, policy_version 459000 (0.0034) [2024-06-23 15:03:48,242][15401] Updated weights for policy 0, policy_version 459010 (0.0041) [2024-06-23 15:03:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7520419840. Throughput: 0: 43037.0. Samples: 7520604940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 15:03:51,400][15401] Updated weights for policy 0, policy_version 459020 (0.0031) [2024-06-23 15:03:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 7520649216. Throughput: 0: 43000.4. Samples: 7520727420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 15:03:55,952][15401] Updated weights for policy 0, policy_version 459030 (0.0035) [2024-06-23 15:03:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7520862208. Throughput: 0: 42842.2. Samples: 7520983800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:03:58,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 15:03:58,951][15401] Updated weights for policy 0, policy_version 459040 (0.0033) [2024-06-23 15:04:03,396][15132] Fps is (10 sec: 40934.1, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 7521058816. Throughput: 0: 42959.3. Samples: 7521245800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:04:03,396][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 15:04:03,589][15401] Updated weights for policy 0, policy_version 459050 (0.0041) [2024-06-23 15:04:06,480][15401] Updated weights for policy 0, policy_version 459060 (0.0034) [2024-06-23 15:04:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7521304576. Throughput: 0: 42800.9. Samples: 7521366120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:04:08,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 15:04:11,517][15401] Updated weights for policy 0, policy_version 459070 (0.0025) [2024-06-23 15:04:11,941][15349] Signal inference workers to stop experience collection... (111400 times) [2024-06-23 15:04:11,942][15349] Signal inference workers to resume experience collection... (111400 times) [2024-06-23 15:04:11,964][15401] InferenceWorker_p0-w0: stopping experience collection (111400 times) [2024-06-23 15:04:11,964][15401] InferenceWorker_p0-w0: resuming experience collection (111400 times) [2024-06-23 15:04:13,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7521501184. Throughput: 0: 42699.1. Samples: 7521624040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 15:04:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 15:04:14,140][15401] Updated weights for policy 0, policy_version 459080 (0.0042) [2024-06-23 15:04:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 7521697792. Throughput: 0: 42762.8. Samples: 7521885800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 15:04:18,945][15401] Updated weights for policy 0, policy_version 459090 (0.0045) [2024-06-23 15:04:21,625][15401] Updated weights for policy 0, policy_version 459100 (0.0029) [2024-06-23 15:04:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 7521943552. Throughput: 0: 42828.2. Samples: 7522011440. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 15:04:26,642][15401] Updated weights for policy 0, policy_version 459110 (0.0039) [2024-06-23 15:04:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7522156544. Throughput: 0: 42930.2. Samples: 7522275100. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 15:04:29,742][15401] Updated weights for policy 0, policy_version 459120 (0.0023) [2024-06-23 15:04:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7522336768. Throughput: 0: 42910.6. Samples: 7522535920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 15:04:34,085][15401] Updated weights for policy 0, policy_version 459130 (0.0031) [2024-06-23 15:04:37,396][15401] Updated weights for policy 0, policy_version 459140 (0.0032) [2024-06-23 15:04:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7522598912. Throughput: 0: 42768.1. Samples: 7522651980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 15:04:41,922][15401] Updated weights for policy 0, policy_version 459150 (0.0045) [2024-06-23 15:04:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7522795520. Throughput: 0: 43023.6. Samples: 7522919760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 15:04:44,765][15401] Updated weights for policy 0, policy_version 459160 (0.0037) [2024-06-23 15:04:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 7522992128. Throughput: 0: 42948.6. Samples: 7523178220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 15:04:49,461][15401] Updated weights for policy 0, policy_version 459170 (0.0036) [2024-06-23 15:04:52,158][15401] Updated weights for policy 0, policy_version 459180 (0.0049) [2024-06-23 15:04:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42820.3). Total num frames: 7523221504. Throughput: 0: 42915.0. Samples: 7523297400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:53,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 15:04:56,978][15401] Updated weights for policy 0, policy_version 459190 (0.0031) [2024-06-23 15:04:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43146.3, 300 sec: 43042.7). Total num frames: 7523450880. Throughput: 0: 43096.9. Samples: 7523563400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:04:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 15:05:00,017][15401] Updated weights for policy 0, policy_version 459200 (0.0034) [2024-06-23 15:05:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 7523631104. Throughput: 0: 43003.9. Samples: 7523820980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 15:05:04,750][15401] Updated weights for policy 0, policy_version 459210 (0.0038) [2024-06-23 15:05:07,758][15401] Updated weights for policy 0, policy_version 459220 (0.0033) [2024-06-23 15:05:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7523876864. Throughput: 0: 42895.5. Samples: 7523941740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 15:05:12,282][15401] Updated weights for policy 0, policy_version 459230 (0.0027) [2024-06-23 15:05:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7524089856. Throughput: 0: 43050.8. Samples: 7524212380. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 15:05:15,485][15401] Updated weights for policy 0, policy_version 459240 (0.0042) [2024-06-23 15:05:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7524286464. Throughput: 0: 42869.3. Samples: 7524465040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:18,391][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 15:05:19,745][15401] Updated weights for policy 0, policy_version 459250 (0.0036) [2024-06-23 15:05:22,827][15401] Updated weights for policy 0, policy_version 459260 (0.0032) [2024-06-23 15:05:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7524515840. Throughput: 0: 43073.3. Samples: 7524590280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 15:05:25,895][15349] Signal inference workers to stop experience collection... (111450 times) [2024-06-23 15:05:25,896][15349] Signal inference workers to resume experience collection... (111450 times) [2024-06-23 15:05:25,929][15401] InferenceWorker_p0-w0: stopping experience collection (111450 times) [2024-06-23 15:05:25,929][15401] InferenceWorker_p0-w0: resuming experience collection (111450 times) [2024-06-23 15:05:27,190][15401] Updated weights for policy 0, policy_version 459270 (0.0036) [2024-06-23 15:05:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 7524728832. Throughput: 0: 43009.5. Samples: 7524855180. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 15:05:30,821][15401] Updated weights for policy 0, policy_version 459280 (0.0025) [2024-06-23 15:05:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7524925440. Throughput: 0: 43013.9. Samples: 7525113840. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-23 15:05:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 15:05:34,966][15401] Updated weights for policy 0, policy_version 459290 (0.0046) [2024-06-23 15:05:38,380][15401] Updated weights for policy 0, policy_version 459300 (0.0027) [2024-06-23 15:05:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7525171200. Throughput: 0: 42959.8. Samples: 7525230480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:05:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 15:05:42,688][15401] Updated weights for policy 0, policy_version 459310 (0.0043) [2024-06-23 15:05:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42987.4). Total num frames: 7525384192. Throughput: 0: 42957.7. Samples: 7525496500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:05:43,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 15:05:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459314_7525400576.pth... [2024-06-23 15:05:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458683_7515062272.pth [2024-06-23 15:05:45,943][15401] Updated weights for policy 0, policy_version 459320 (0.0035) [2024-06-23 15:05:48,392][15132] Fps is (10 sec: 40949.5, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 7525580800. Throughput: 0: 42939.6. Samples: 7525753360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:05:48,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 15:05:50,068][15401] Updated weights for policy 0, policy_version 459330 (0.0038) [2024-06-23 15:05:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 7525810176. Throughput: 0: 42991.4. Samples: 7525876360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:05:53,391][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 15:05:53,612][15401] Updated weights for policy 0, policy_version 459340 (0.0037) [2024-06-23 15:05:57,526][15401] Updated weights for policy 0, policy_version 459350 (0.0042) [2024-06-23 15:05:58,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7526023168. Throughput: 0: 42842.1. Samples: 7526140280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:05:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 15:06:01,437][15401] Updated weights for policy 0, policy_version 459360 (0.0038) [2024-06-23 15:06:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7526203392. Throughput: 0: 42863.6. Samples: 7526393900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:03,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 15:06:05,389][15401] Updated weights for policy 0, policy_version 459370 (0.0030) [2024-06-23 15:06:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7526449152. Throughput: 0: 42860.8. Samples: 7526519020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:08,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 15:06:09,035][15401] Updated weights for policy 0, policy_version 459380 (0.0039) [2024-06-23 15:06:12,908][15401] Updated weights for policy 0, policy_version 459390 (0.0033) [2024-06-23 15:06:13,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 7526662144. Throughput: 0: 42876.7. Samples: 7526784740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:13,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 15:06:16,728][15401] Updated weights for policy 0, policy_version 459400 (0.0034) [2024-06-23 15:06:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7526858752. Throughput: 0: 42884.0. Samples: 7527043620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 15:06:20,531][15401] Updated weights for policy 0, policy_version 459410 (0.0043) [2024-06-23 15:06:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7527104512. Throughput: 0: 42959.3. Samples: 7527163660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 15:06:24,317][15401] Updated weights for policy 0, policy_version 459420 (0.0037) [2024-06-23 15:06:28,060][15401] Updated weights for policy 0, policy_version 459430 (0.0034) [2024-06-23 15:06:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 7527301120. Throughput: 0: 42872.8. Samples: 7527425780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 15:06:31,933][15401] Updated weights for policy 0, policy_version 459440 (0.0042) [2024-06-23 15:06:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7527497728. Throughput: 0: 42917.9. Samples: 7527684560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 15:06:35,873][15401] Updated weights for policy 0, policy_version 459450 (0.0033) [2024-06-23 15:06:36,242][15349] Signal inference workers to stop experience collection... (111500 times) [2024-06-23 15:06:36,243][15349] Signal inference workers to resume experience collection... (111500 times) [2024-06-23 15:06:36,260][15401] InferenceWorker_p0-w0: stopping experience collection (111500 times) [2024-06-23 15:06:36,260][15401] InferenceWorker_p0-w0: resuming experience collection (111500 times) [2024-06-23 15:06:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 7527743488. Throughput: 0: 42925.6. Samples: 7527808000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 15:06:40,010][15401] Updated weights for policy 0, policy_version 459460 (0.0038) [2024-06-23 15:06:43,327][15401] Updated weights for policy 0, policy_version 459470 (0.0035) [2024-06-23 15:06:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 7527956480. Throughput: 0: 42821.8. Samples: 7528067260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 15:06:48,154][15401] Updated weights for policy 0, policy_version 459480 (0.0034) [2024-06-23 15:06:48,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 7528136704. Throughput: 0: 42851.0. Samples: 7528322200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 15:06:51,037][15401] Updated weights for policy 0, policy_version 459490 (0.0037) [2024-06-23 15:06:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 7528382464. Throughput: 0: 42681.8. Samples: 7528439700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:53,392][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 15:06:55,741][15401] Updated weights for policy 0, policy_version 459500 (0.0042) [2024-06-23 15:06:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7528579072. Throughput: 0: 42656.9. Samples: 7528704200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 15:06:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 15:06:58,763][15401] Updated weights for policy 0, policy_version 459510 (0.0027) [2024-06-23 15:07:03,357][15401] Updated weights for policy 0, policy_version 459520 (0.0037) [2024-06-23 15:07:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7528775680. Throughput: 0: 42618.6. Samples: 7528961560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:03,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 15:07:06,374][15401] Updated weights for policy 0, policy_version 459530 (0.0031) [2024-06-23 15:07:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7529021440. Throughput: 0: 42622.2. Samples: 7529081660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:08,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 15:07:10,925][15401] Updated weights for policy 0, policy_version 459540 (0.0037) [2024-06-23 15:07:13,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 7529201664. Throughput: 0: 42567.9. Samples: 7529341340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:13,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-23 15:07:13,975][15401] Updated weights for policy 0, policy_version 459550 (0.0043) [2024-06-23 15:07:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7529414656. Throughput: 0: 42607.5. Samples: 7529601900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:18,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-23 15:07:18,602][15401] Updated weights for policy 0, policy_version 459560 (0.0030) [2024-06-23 15:07:21,814][15401] Updated weights for policy 0, policy_version 459570 (0.0027) [2024-06-23 15:07:23,390][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7529676800. Throughput: 0: 42591.0. Samples: 7529724600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 15:07:26,699][15401] Updated weights for policy 0, policy_version 459580 (0.0021) [2024-06-23 15:07:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7529840640. Throughput: 0: 42656.9. Samples: 7529986820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:28,391][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 15:07:29,338][15401] Updated weights for policy 0, policy_version 459590 (0.0038) [2024-06-23 15:07:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7530053632. Throughput: 0: 42660.1. Samples: 7530241900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 15:07:34,114][15401] Updated weights for policy 0, policy_version 459600 (0.0027) [2024-06-23 15:07:36,951][15401] Updated weights for policy 0, policy_version 459610 (0.0034) [2024-06-23 15:07:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 7530315776. Throughput: 0: 42942.7. Samples: 7530372120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 15:07:41,516][15401] Updated weights for policy 0, policy_version 459620 (0.0038) [2024-06-23 15:07:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 7530496000. Throughput: 0: 42992.0. Samples: 7530638840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 15:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459625_7530496000.pth... [2024-06-23 15:07:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000458999_7520239616.pth [2024-06-23 15:07:44,462][15401] Updated weights for policy 0, policy_version 459630 (0.0025) [2024-06-23 15:07:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7530708992. Throughput: 0: 42860.9. Samples: 7530890200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 15:07:49,125][15401] Updated weights for policy 0, policy_version 459640 (0.0041) [2024-06-23 15:07:52,215][15401] Updated weights for policy 0, policy_version 459650 (0.0028) [2024-06-23 15:07:53,391][15132] Fps is (10 sec: 45869.4, 60 sec: 42870.6, 300 sec: 42931.4). Total num frames: 7530954752. Throughput: 0: 43006.0. Samples: 7531016980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:53,394][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 15:07:53,656][15349] Signal inference workers to stop experience collection... (111550 times) [2024-06-23 15:07:53,657][15349] Signal inference workers to resume experience collection... (111550 times) [2024-06-23 15:07:53,667][15401] InferenceWorker_p0-w0: stopping experience collection (111550 times) [2024-06-23 15:07:53,667][15401] InferenceWorker_p0-w0: resuming experience collection (111550 times) [2024-06-23 15:07:56,632][15401] Updated weights for policy 0, policy_version 459660 (0.0035) [2024-06-23 15:07:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7531118592. Throughput: 0: 43004.6. Samples: 7531276540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:07:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 15:07:59,859][15401] Updated weights for policy 0, policy_version 459670 (0.0033) [2024-06-23 15:08:03,390][15132] Fps is (10 sec: 40964.6, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 7531364352. Throughput: 0: 42723.9. Samples: 7531524480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:08:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 15:08:04,172][15401] Updated weights for policy 0, policy_version 459680 (0.0032) [2024-06-23 15:08:07,654][15401] Updated weights for policy 0, policy_version 459690 (0.0038) [2024-06-23 15:08:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7531593728. Throughput: 0: 42978.7. Samples: 7531658640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:08:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 15:08:12,110][15401] Updated weights for policy 0, policy_version 459700 (0.0060) [2024-06-23 15:08:13,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7531741184. Throughput: 0: 42737.8. Samples: 7531910020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:08:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 15:08:15,194][15401] Updated weights for policy 0, policy_version 459710 (0.0029) [2024-06-23 15:08:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7532019712. Throughput: 0: 42684.0. Samples: 7532162680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:08:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 15:08:19,630][15401] Updated weights for policy 0, policy_version 459720 (0.0031) [2024-06-23 15:08:22,926][15401] Updated weights for policy 0, policy_version 459730 (0.0035) [2024-06-23 15:08:23,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7532232704. Throughput: 0: 42853.2. Samples: 7532300520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 15:08:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 15:08:27,313][15401] Updated weights for policy 0, policy_version 459740 (0.0043) [2024-06-23 15:08:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7532396544. Throughput: 0: 42505.7. Samples: 7532551600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 15:08:30,562][15401] Updated weights for policy 0, policy_version 459750 (0.0045) [2024-06-23 15:08:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7532642304. Throughput: 0: 42515.1. Samples: 7532803380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 15:08:34,878][15401] Updated weights for policy 0, policy_version 459760 (0.0029) [2024-06-23 15:08:38,241][15401] Updated weights for policy 0, policy_version 459770 (0.0032) [2024-06-23 15:08:38,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 7532871680. Throughput: 0: 42762.9. Samples: 7532941360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:38,393][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 15:08:42,484][15401] Updated weights for policy 0, policy_version 459780 (0.0038) [2024-06-23 15:08:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7533035520. Throughput: 0: 42656.4. Samples: 7533196080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 15:08:46,074][15401] Updated weights for policy 0, policy_version 459790 (0.0024) [2024-06-23 15:08:48,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7533297664. Throughput: 0: 42587.2. Samples: 7533440900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 15:08:50,329][15401] Updated weights for policy 0, policy_version 459800 (0.0038) [2024-06-23 15:08:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42326.1, 300 sec: 42820.9). Total num frames: 7533494272. Throughput: 0: 42656.3. Samples: 7533578180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 15:08:53,795][15401] Updated weights for policy 0, policy_version 459810 (0.0044) [2024-06-23 15:08:58,240][15401] Updated weights for policy 0, policy_version 459820 (0.0032) [2024-06-23 15:08:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 7533690880. Throughput: 0: 42662.4. Samples: 7533829820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:08:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 15:09:01,223][15349] Signal inference workers to stop experience collection... (111600 times) [2024-06-23 15:09:01,223][15349] Signal inference workers to resume experience collection... (111600 times) [2024-06-23 15:09:01,272][15401] InferenceWorker_p0-w0: stopping experience collection (111600 times) [2024-06-23 15:09:01,272][15401] InferenceWorker_p0-w0: resuming experience collection (111600 times) [2024-06-23 15:09:01,363][15401] Updated weights for policy 0, policy_version 459830 (0.0029) [2024-06-23 15:09:03,390][15132] Fps is (10 sec: 44233.3, 60 sec: 42870.9, 300 sec: 42820.4). Total num frames: 7533936640. Throughput: 0: 42571.2. Samples: 7534078420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:03,391][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 15:09:05,895][15401] Updated weights for policy 0, policy_version 459840 (0.0044) [2024-06-23 15:09:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 7534133248. Throughput: 0: 42659.5. Samples: 7534220200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:08,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 15:09:08,911][15401] Updated weights for policy 0, policy_version 459850 (0.0027) [2024-06-23 15:09:13,389][15132] Fps is (10 sec: 39325.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7534329856. Throughput: 0: 42803.1. Samples: 7534477740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 15:09:13,415][15401] Updated weights for policy 0, policy_version 459860 (0.0040) [2024-06-23 15:09:16,625][15401] Updated weights for policy 0, policy_version 459870 (0.0034) [2024-06-23 15:09:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7534592000. Throughput: 0: 42609.9. Samples: 7534720820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 15:09:21,038][15401] Updated weights for policy 0, policy_version 459880 (0.0027) [2024-06-23 15:09:23,395][15132] Fps is (10 sec: 44212.3, 60 sec: 42321.5, 300 sec: 42764.2). Total num frames: 7534772224. Throughput: 0: 42610.9. Samples: 7534858980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:23,396][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 15:09:24,232][15401] Updated weights for policy 0, policy_version 459890 (0.0034) [2024-06-23 15:09:28,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7534968832. Throughput: 0: 42740.4. Samples: 7535119500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:28,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 15:09:28,624][15401] Updated weights for policy 0, policy_version 459900 (0.0041) [2024-06-23 15:09:31,815][15401] Updated weights for policy 0, policy_version 459910 (0.0028) [2024-06-23 15:09:33,390][15132] Fps is (10 sec: 47539.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7535247360. Throughput: 0: 42968.0. Samples: 7535374460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 15:09:36,261][15401] Updated weights for policy 0, policy_version 459920 (0.0042) [2024-06-23 15:09:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 7535411200. Throughput: 0: 43068.1. Samples: 7535516240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 15:09:39,213][15401] Updated weights for policy 0, policy_version 459930 (0.0033) [2024-06-23 15:09:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7535624192. Throughput: 0: 43120.4. Samples: 7535770240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 15:09:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459939_7535640576.pth... [2024-06-23 15:09:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459314_7525400576.pth [2024-06-23 15:09:43,852][15401] Updated weights for policy 0, policy_version 459940 (0.0039) [2024-06-23 15:09:46,844][15401] Updated weights for policy 0, policy_version 459950 (0.0039) [2024-06-23 15:09:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7535886336. Throughput: 0: 43167.0. Samples: 7536020900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 15:09:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 15:09:51,269][15401] Updated weights for policy 0, policy_version 459960 (0.0042) [2024-06-23 15:09:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7536066560. Throughput: 0: 43092.6. Samples: 7536159360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:09:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 15:09:54,339][15401] Updated weights for policy 0, policy_version 459970 (0.0035) [2024-06-23 15:09:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7536279552. Throughput: 0: 43020.8. Samples: 7536413680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:09:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 15:09:58,536][15349] Signal inference workers to stop experience collection... (111650 times) [2024-06-23 15:09:58,583][15401] InferenceWorker_p0-w0: stopping experience collection (111650 times) [2024-06-23 15:09:58,654][15349] Signal inference workers to resume experience collection... (111650 times) [2024-06-23 15:09:58,655][15401] InferenceWorker_p0-w0: resuming experience collection (111650 times) [2024-06-23 15:09:58,791][15401] Updated weights for policy 0, policy_version 459980 (0.0044) [2024-06-23 15:10:02,033][15401] Updated weights for policy 0, policy_version 459990 (0.0054) [2024-06-23 15:10:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 7536492544. Throughput: 0: 43246.2. Samples: 7536666900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 15:10:06,806][15401] Updated weights for policy 0, policy_version 460000 (0.0038) [2024-06-23 15:10:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 7536721920. Throughput: 0: 43098.3. Samples: 7536798160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 15:10:09,719][15401] Updated weights for policy 0, policy_version 460010 (0.0032) [2024-06-23 15:10:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7536934912. Throughput: 0: 42904.0. Samples: 7537050080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 15:10:14,354][15401] Updated weights for policy 0, policy_version 460020 (0.0045) [2024-06-23 15:10:17,418][15401] Updated weights for policy 0, policy_version 460030 (0.0042) [2024-06-23 15:10:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 7537147904. Throughput: 0: 42920.0. Samples: 7537305860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 15:10:21,896][15401] Updated weights for policy 0, policy_version 460040 (0.0043) [2024-06-23 15:10:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42329.2, 300 sec: 42653.9). Total num frames: 7537311744. Throughput: 0: 42547.9. Samples: 7537430900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 15:10:25,158][15401] Updated weights for policy 0, policy_version 460050 (0.0030) [2024-06-23 15:10:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 7537573888. Throughput: 0: 42599.1. Samples: 7537687200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 15:10:29,694][15401] Updated weights for policy 0, policy_version 460060 (0.0037) [2024-06-23 15:10:32,978][15401] Updated weights for policy 0, policy_version 460070 (0.0044) [2024-06-23 15:10:33,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7537786880. Throughput: 0: 42589.8. Samples: 7537937440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 15:10:37,435][15401] Updated weights for policy 0, policy_version 460080 (0.0033) [2024-06-23 15:10:38,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7537950720. Throughput: 0: 42436.0. Samples: 7538068980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 15:10:40,826][15401] Updated weights for policy 0, policy_version 460090 (0.0042) [2024-06-23 15:10:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 7538212864. Throughput: 0: 42495.9. Samples: 7538326000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 15:10:45,197][15401] Updated weights for policy 0, policy_version 460100 (0.0033) [2024-06-23 15:10:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7538425856. Throughput: 0: 42447.9. Samples: 7538577060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:48,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-23 15:10:48,401][15401] Updated weights for policy 0, policy_version 460110 (0.0030) [2024-06-23 15:10:52,939][15401] Updated weights for policy 0, policy_version 460120 (0.0032) [2024-06-23 15:10:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7538606080. Throughput: 0: 42373.2. Samples: 7538704960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 15:10:56,359][15401] Updated weights for policy 0, policy_version 460130 (0.0033) [2024-06-23 15:10:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7538835456. Throughput: 0: 42422.7. Samples: 7538959100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:10:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 15:11:00,734][15401] Updated weights for policy 0, policy_version 460140 (0.0041) [2024-06-23 15:11:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7539048448. Throughput: 0: 42293.3. Samples: 7539209060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:11:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 15:11:04,098][15401] Updated weights for policy 0, policy_version 460150 (0.0030) [2024-06-23 15:11:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 7539245056. Throughput: 0: 42336.5. Samples: 7539336040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:11:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 15:11:08,446][15401] Updated weights for policy 0, policy_version 460160 (0.0031) [2024-06-23 15:11:10,205][15349] Signal inference workers to stop experience collection... (111700 times) [2024-06-23 15:11:10,205][15349] Signal inference workers to resume experience collection... (111700 times) [2024-06-23 15:11:10,241][15401] InferenceWorker_p0-w0: stopping experience collection (111700 times) [2024-06-23 15:11:10,241][15401] InferenceWorker_p0-w0: resuming experience collection (111700 times) [2024-06-23 15:11:11,769][15401] Updated weights for policy 0, policy_version 460170 (0.0028) [2024-06-23 15:11:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7539458048. Throughput: 0: 42166.5. Samples: 7539584700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-23 15:11:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 15:11:16,522][15401] Updated weights for policy 0, policy_version 460180 (0.0028) [2024-06-23 15:11:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 7539654656. Throughput: 0: 42404.5. Samples: 7539845640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:18,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 15:11:19,410][15401] Updated weights for policy 0, policy_version 460190 (0.0033) [2024-06-23 15:11:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7539884032. Throughput: 0: 42256.3. Samples: 7539970520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 15:11:24,095][15401] Updated weights for policy 0, policy_version 460200 (0.0028) [2024-06-23 15:11:26,907][15401] Updated weights for policy 0, policy_version 460210 (0.0041) [2024-06-23 15:11:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7540113408. Throughput: 0: 42094.2. Samples: 7540220240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 15:11:31,557][15401] Updated weights for policy 0, policy_version 460220 (0.0031) [2024-06-23 15:11:33,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 7540310016. Throughput: 0: 42427.1. Samples: 7540486380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:33,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 15:11:35,014][15401] Updated weights for policy 0, policy_version 460230 (0.0033) [2024-06-23 15:11:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7540539392. Throughput: 0: 42305.9. Samples: 7540608820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:38,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 15:11:39,305][15401] Updated weights for policy 0, policy_version 460240 (0.0038) [2024-06-23 15:11:42,546][15401] Updated weights for policy 0, policy_version 460250 (0.0035) [2024-06-23 15:11:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7540752384. Throughput: 0: 42485.8. Samples: 7540870960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 15:11:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460252_7540768768.pth... [2024-06-23 15:11:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459625_7530496000.pth [2024-06-23 15:11:47,020][15401] Updated weights for policy 0, policy_version 460260 (0.0038) [2024-06-23 15:11:48,390][15132] Fps is (10 sec: 40968.5, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 7540948992. Throughput: 0: 42730.8. Samples: 7541131960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:48,391][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 15:11:50,166][15401] Updated weights for policy 0, policy_version 460270 (0.0025) [2024-06-23 15:11:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7541194752. Throughput: 0: 42542.1. Samples: 7541250440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 15:11:54,936][15401] Updated weights for policy 0, policy_version 460280 (0.0030) [2024-06-23 15:11:57,991][15401] Updated weights for policy 0, policy_version 460290 (0.0027) [2024-06-23 15:11:58,390][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 7541391360. Throughput: 0: 42774.3. Samples: 7541509540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:11:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 15:12:02,424][15401] Updated weights for policy 0, policy_version 460300 (0.0043) [2024-06-23 15:12:03,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 7541587968. Throughput: 0: 42849.2. Samples: 7541773960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:03,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:12:05,567][15401] Updated weights for policy 0, policy_version 460310 (0.0031) [2024-06-23 15:12:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7541833728. Throughput: 0: 42774.0. Samples: 7541895340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 15:12:09,880][15401] Updated weights for policy 0, policy_version 460320 (0.0030) [2024-06-23 15:12:13,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 7542030336. Throughput: 0: 43080.9. Samples: 7542158980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:13,393][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 15:12:13,591][15401] Updated weights for policy 0, policy_version 460330 (0.0031) [2024-06-23 15:12:14,015][15349] Signal inference workers to stop experience collection... (111750 times) [2024-06-23 15:12:14,064][15401] InferenceWorker_p0-w0: stopping experience collection (111750 times) [2024-06-23 15:12:14,070][15349] Signal inference workers to resume experience collection... (111750 times) [2024-06-23 15:12:14,079][15401] InferenceWorker_p0-w0: resuming experience collection (111750 times) [2024-06-23 15:12:17,409][15401] Updated weights for policy 0, policy_version 460340 (0.0037) [2024-06-23 15:12:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7542243328. Throughput: 0: 42853.3. Samples: 7542414680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 15:12:21,176][15401] Updated weights for policy 0, policy_version 460350 (0.0038) [2024-06-23 15:12:23,391][15132] Fps is (10 sec: 44242.1, 60 sec: 43143.7, 300 sec: 42820.4). Total num frames: 7542472704. Throughput: 0: 42888.2. Samples: 7542538740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:23,391][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 15:12:25,350][15401] Updated weights for policy 0, policy_version 460360 (0.0027) [2024-06-23 15:12:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7542669312. Throughput: 0: 42881.7. Samples: 7542800640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:28,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 15:12:28,703][15401] Updated weights for policy 0, policy_version 460370 (0.0023) [2024-06-23 15:12:32,705][15401] Updated weights for policy 0, policy_version 460380 (0.0034) [2024-06-23 15:12:33,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 7542882304. Throughput: 0: 42896.7. Samples: 7543062300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:33,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 15:12:36,314][15401] Updated weights for policy 0, policy_version 460390 (0.0034) [2024-06-23 15:12:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7543111680. Throughput: 0: 43007.3. Samples: 7543185760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:12:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 15:12:40,237][15401] Updated weights for policy 0, policy_version 460400 (0.0042) [2024-06-23 15:12:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7543324672. Throughput: 0: 43011.6. Samples: 7543445060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:12:43,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 15:12:43,955][15401] Updated weights for policy 0, policy_version 460410 (0.0027) [2024-06-23 15:12:47,773][15401] Updated weights for policy 0, policy_version 460420 (0.0034) [2024-06-23 15:12:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.8, 300 sec: 42654.1). Total num frames: 7543537664. Throughput: 0: 42910.8. Samples: 7543704840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:12:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 15:12:51,590][15401] Updated weights for policy 0, policy_version 460430 (0.0036) [2024-06-23 15:12:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 7543750656. Throughput: 0: 43046.1. Samples: 7543832520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:12:53,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:12:55,593][15401] Updated weights for policy 0, policy_version 460440 (0.0037) [2024-06-23 15:12:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7543980032. Throughput: 0: 42927.1. Samples: 7544090600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:12:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 15:12:59,082][15401] Updated weights for policy 0, policy_version 460450 (0.0034) [2024-06-23 15:13:03,007][15401] Updated weights for policy 0, policy_version 460460 (0.0038) [2024-06-23 15:13:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 7544176640. Throughput: 0: 42985.8. Samples: 7544349040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 15:13:06,440][15401] Updated weights for policy 0, policy_version 460470 (0.0031) [2024-06-23 15:13:08,393][15132] Fps is (10 sec: 40944.6, 60 sec: 42595.5, 300 sec: 42875.5). Total num frames: 7544389632. Throughput: 0: 43077.0. Samples: 7544477320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:08,394][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 15:13:10,769][15401] Updated weights for policy 0, policy_version 460480 (0.0033) [2024-06-23 15:13:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 7544619008. Throughput: 0: 42985.8. Samples: 7544735000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 15:13:14,014][15401] Updated weights for policy 0, policy_version 460490 (0.0042) [2024-06-23 15:13:18,389][15132] Fps is (10 sec: 42615.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7544815616. Throughput: 0: 42931.2. Samples: 7544994200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 15:13:18,419][15401] Updated weights for policy 0, policy_version 460500 (0.0026) [2024-06-23 15:13:21,735][15401] Updated weights for policy 0, policy_version 460510 (0.0041) [2024-06-23 15:13:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42872.4, 300 sec: 42876.1). Total num frames: 7545044992. Throughput: 0: 42881.8. Samples: 7545115440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 15:13:26,071][15401] Updated weights for policy 0, policy_version 460520 (0.0040) [2024-06-23 15:13:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7545257984. Throughput: 0: 42868.0. Samples: 7545374120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:28,399][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 15:13:29,308][15401] Updated weights for policy 0, policy_version 460530 (0.0033) [2024-06-23 15:13:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7545454592. Throughput: 0: 42901.4. Samples: 7545635400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 15:13:33,661][15401] Updated weights for policy 0, policy_version 460540 (0.0035) [2024-06-23 15:13:37,017][15401] Updated weights for policy 0, policy_version 460550 (0.0045) [2024-06-23 15:13:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7545683968. Throughput: 0: 42845.8. Samples: 7545760480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 15:13:41,360][15401] Updated weights for policy 0, policy_version 460560 (0.0024) [2024-06-23 15:13:42,142][15349] Signal inference workers to stop experience collection... (111800 times) [2024-06-23 15:13:42,142][15349] Signal inference workers to resume experience collection... (111800 times) [2024-06-23 15:13:42,188][15401] InferenceWorker_p0-w0: stopping experience collection (111800 times) [2024-06-23 15:13:42,188][15401] InferenceWorker_p0-w0: resuming experience collection (111800 times) [2024-06-23 15:13:43,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7545913344. Throughput: 0: 42798.7. Samples: 7546016640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:43,393][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 15:13:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460566_7545913344.pth... [2024-06-23 15:13:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000459939_7535640576.pth [2024-06-23 15:13:44,587][15401] Updated weights for policy 0, policy_version 460570 (0.0041) [2024-06-23 15:13:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7546077184. Throughput: 0: 42882.6. Samples: 7546278760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:48,391][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 15:13:49,218][15401] Updated weights for policy 0, policy_version 460580 (0.0043) [2024-06-23 15:13:52,179][15401] Updated weights for policy 0, policy_version 460590 (0.0028) [2024-06-23 15:13:53,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 7546322944. Throughput: 0: 42666.4. Samples: 7546397140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:53,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 15:13:56,697][15401] Updated weights for policy 0, policy_version 460600 (0.0043) [2024-06-23 15:13:58,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 7546552320. Throughput: 0: 42780.0. Samples: 7546660100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:13:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 15:13:59,694][15401] Updated weights for policy 0, policy_version 460610 (0.0040) [2024-06-23 15:14:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7546732544. Throughput: 0: 42904.0. Samples: 7546924880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 15:14:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 15:14:04,318][15401] Updated weights for policy 0, policy_version 460620 (0.0034) [2024-06-23 15:14:07,212][15401] Updated weights for policy 0, policy_version 460630 (0.0047) [2024-06-23 15:14:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42872.6, 300 sec: 42820.2). Total num frames: 7546961920. Throughput: 0: 42858.1. Samples: 7547044160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:08,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 15:14:11,745][15401] Updated weights for policy 0, policy_version 460640 (0.0041) [2024-06-23 15:14:13,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7547207680. Throughput: 0: 42864.8. Samples: 7547303040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 15:14:15,154][15401] Updated weights for policy 0, policy_version 460650 (0.0029) [2024-06-23 15:14:18,389][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42654.7). Total num frames: 7547355136. Throughput: 0: 42895.5. Samples: 7547565700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:18,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 15:14:19,475][15401] Updated weights for policy 0, policy_version 460660 (0.0032) [2024-06-23 15:14:23,093][15401] Updated weights for policy 0, policy_version 460670 (0.0037) [2024-06-23 15:14:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 7547617280. Throughput: 0: 42603.9. Samples: 7547677660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 15:14:27,220][15401] Updated weights for policy 0, policy_version 460680 (0.0034) [2024-06-23 15:14:28,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7547830272. Throughput: 0: 42762.2. Samples: 7547940840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:28,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 15:14:30,601][15401] Updated weights for policy 0, policy_version 460690 (0.0051) [2024-06-23 15:14:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7547994112. Throughput: 0: 42797.0. Samples: 7548204620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 15:14:34,899][15401] Updated weights for policy 0, policy_version 460700 (0.0037) [2024-06-23 15:14:38,255][15401] Updated weights for policy 0, policy_version 460710 (0.0038) [2024-06-23 15:14:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7548272640. Throughput: 0: 42859.1. Samples: 7548325800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 15:14:42,557][15401] Updated weights for policy 0, policy_version 460720 (0.0032) [2024-06-23 15:14:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 7548469248. Throughput: 0: 42746.8. Samples: 7548583700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 15:14:44,102][15349] Signal inference workers to stop experience collection... (111850 times) [2024-06-23 15:14:44,102][15349] Signal inference workers to resume experience collection... (111850 times) [2024-06-23 15:14:44,127][15401] InferenceWorker_p0-w0: stopping experience collection (111850 times) [2024-06-23 15:14:44,127][15401] InferenceWorker_p0-w0: resuming experience collection (111850 times) [2024-06-23 15:14:45,680][15401] Updated weights for policy 0, policy_version 460730 (0.0027) [2024-06-23 15:14:48,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7548649472. Throughput: 0: 42801.2. Samples: 7548850940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:48,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 15:14:50,158][15401] Updated weights for policy 0, policy_version 460740 (0.0049) [2024-06-23 15:14:53,368][15401] Updated weights for policy 0, policy_version 460750 (0.0038) [2024-06-23 15:14:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 7548928000. Throughput: 0: 42706.7. Samples: 7548965860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 15:14:58,070][15401] Updated weights for policy 0, policy_version 460760 (0.0031) [2024-06-23 15:14:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7549108224. Throughput: 0: 42761.4. Samples: 7549227300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:14:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 15:15:00,877][15401] Updated weights for policy 0, policy_version 460770 (0.0045) [2024-06-23 15:15:03,392][15132] Fps is (10 sec: 37675.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 7549304832. Throughput: 0: 42904.7. Samples: 7549496500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:03,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 15:15:05,609][15401] Updated weights for policy 0, policy_version 460780 (0.0043) [2024-06-23 15:15:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 7549566976. Throughput: 0: 43055.7. Samples: 7549615160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 15:15:08,462][15401] Updated weights for policy 0, policy_version 460790 (0.0025) [2024-06-23 15:15:13,120][15401] Updated weights for policy 0, policy_version 460800 (0.0035) [2024-06-23 15:15:13,390][15132] Fps is (10 sec: 44245.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7549747200. Throughput: 0: 43126.7. Samples: 7549881540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 15:15:15,930][15401] Updated weights for policy 0, policy_version 460810 (0.0038) [2024-06-23 15:15:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7549960192. Throughput: 0: 43105.3. Samples: 7550144360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 15:15:20,543][15401] Updated weights for policy 0, policy_version 460820 (0.0040) [2024-06-23 15:15:23,371][15401] Updated weights for policy 0, policy_version 460830 (0.0029) [2024-06-23 15:15:23,390][15132] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 42931.6). Total num frames: 7550238720. Throughput: 0: 43240.8. Samples: 7550271640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 15:15:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7550386176. Throughput: 0: 43290.9. Samples: 7550531800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-23 15:15:28,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 15:15:28,618][15401] Updated weights for policy 0, policy_version 460840 (0.0032) [2024-06-23 15:15:31,215][15401] Updated weights for policy 0, policy_version 460850 (0.0028) [2024-06-23 15:15:33,389][15132] Fps is (10 sec: 36045.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7550599168. Throughput: 0: 42964.1. Samples: 7550784320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 15:15:36,100][15401] Updated weights for policy 0, policy_version 460860 (0.0033) [2024-06-23 15:15:38,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7550861312. Throughput: 0: 43343.1. Samples: 7550916300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 15:15:39,148][15401] Updated weights for policy 0, policy_version 460870 (0.0043) [2024-06-23 15:15:43,392][15132] Fps is (10 sec: 42589.3, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 7551025152. Throughput: 0: 43396.7. Samples: 7551180240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:43,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 15:15:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460879_7551041536.pth... [2024-06-23 15:15:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460252_7540768768.pth [2024-06-23 15:15:43,692][15401] Updated weights for policy 0, policy_version 460880 (0.0032) [2024-06-23 15:15:46,712][15401] Updated weights for policy 0, policy_version 460890 (0.0023) [2024-06-23 15:15:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7551254528. Throughput: 0: 42933.5. Samples: 7551428420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 15:15:50,935][15349] Signal inference workers to stop experience collection... (111900 times) [2024-06-23 15:15:50,982][15401] InferenceWorker_p0-w0: stopping experience collection (111900 times) [2024-06-23 15:15:50,989][15349] Signal inference workers to resume experience collection... (111900 times) [2024-06-23 15:15:51,001][15401] InferenceWorker_p0-w0: resuming experience collection (111900 times) [2024-06-23 15:15:51,125][15401] Updated weights for policy 0, policy_version 460900 (0.0025) [2024-06-23 15:15:53,389][15132] Fps is (10 sec: 49162.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7551516672. Throughput: 0: 43120.5. Samples: 7551555580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 15:15:54,075][15401] Updated weights for policy 0, policy_version 460910 (0.0024) [2024-06-23 15:15:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7551664128. Throughput: 0: 43108.1. Samples: 7551821400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:15:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 15:15:58,830][15401] Updated weights for policy 0, policy_version 460920 (0.0023) [2024-06-23 15:16:01,626][15401] Updated weights for policy 0, policy_version 460930 (0.0029) [2024-06-23 15:16:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 43146.0, 300 sec: 42876.1). Total num frames: 7551893504. Throughput: 0: 42745.0. Samples: 7552067880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:03,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-23 15:16:06,529][15401] Updated weights for policy 0, policy_version 460940 (0.0027) [2024-06-23 15:16:08,396][15132] Fps is (10 sec: 47483.4, 60 sec: 42866.9, 300 sec: 42986.3). Total num frames: 7552139264. Throughput: 0: 42751.7. Samples: 7552195740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:08,396][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 15:16:09,205][15401] Updated weights for policy 0, policy_version 460950 (0.0032) [2024-06-23 15:16:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7552303104. Throughput: 0: 42831.7. Samples: 7552459220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 15:16:14,361][15401] Updated weights for policy 0, policy_version 460960 (0.0047) [2024-06-23 15:16:17,241][15401] Updated weights for policy 0, policy_version 460970 (0.0040) [2024-06-23 15:16:18,389][15132] Fps is (10 sec: 40986.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7552548864. Throughput: 0: 42771.5. Samples: 7552709040. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 15:16:21,928][15401] Updated weights for policy 0, policy_version 460980 (0.0034) [2024-06-23 15:16:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 7552761856. Throughput: 0: 42951.2. Samples: 7552849100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 15:16:24,739][15401] Updated weights for policy 0, policy_version 460990 (0.0031) [2024-06-23 15:16:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7552942080. Throughput: 0: 42692.7. Samples: 7553101320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:28,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 15:16:29,552][15401] Updated weights for policy 0, policy_version 461000 (0.0037) [2024-06-23 15:16:32,647][15401] Updated weights for policy 0, policy_version 461010 (0.0023) [2024-06-23 15:16:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 7553187840. Throughput: 0: 42695.2. Samples: 7553349700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 15:16:37,083][15401] Updated weights for policy 0, policy_version 461020 (0.0026) [2024-06-23 15:16:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7553400832. Throughput: 0: 42943.9. Samples: 7553488060. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 15:16:40,143][15401] Updated weights for policy 0, policy_version 461030 (0.0033) [2024-06-23 15:16:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42599.8, 300 sec: 42820.6). Total num frames: 7553581056. Throughput: 0: 42579.0. Samples: 7553737460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 15:16:44,866][15401] Updated weights for policy 0, policy_version 461040 (0.0034) [2024-06-23 15:16:47,799][15401] Updated weights for policy 0, policy_version 461050 (0.0034) [2024-06-23 15:16:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7553843200. Throughput: 0: 42561.8. Samples: 7553983160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:48,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 15:16:49,668][15349] Signal inference workers to stop experience collection... (111950 times) [2024-06-23 15:16:49,669][15349] Signal inference workers to resume experience collection... (111950 times) [2024-06-23 15:16:49,683][15401] InferenceWorker_p0-w0: stopping experience collection (111950 times) [2024-06-23 15:16:49,683][15401] InferenceWorker_p0-w0: resuming experience collection (111950 times) [2024-06-23 15:16:52,440][15401] Updated weights for policy 0, policy_version 461060 (0.0033) [2024-06-23 15:16:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 41777.4, 300 sec: 42820.2). Total num frames: 7554023424. Throughput: 0: 42805.0. Samples: 7554121800. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-23 15:16:53,393][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 15:16:55,596][15401] Updated weights for policy 0, policy_version 461070 (0.0040) [2024-06-23 15:16:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 7554236416. Throughput: 0: 42572.9. Samples: 7554375000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:16:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 15:16:59,997][15401] Updated weights for policy 0, policy_version 461080 (0.0031) [2024-06-23 15:17:03,121][15401] Updated weights for policy 0, policy_version 461090 (0.0027) [2024-06-23 15:17:03,393][15132] Fps is (10 sec: 47506.9, 60 sec: 43414.7, 300 sec: 42931.0). Total num frames: 7554498560. Throughput: 0: 42515.0. Samples: 7554622380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:03,394][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 15:17:07,774][15401] Updated weights for policy 0, policy_version 461100 (0.0038) [2024-06-23 15:17:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42329.8, 300 sec: 42876.4). Total num frames: 7554678784. Throughput: 0: 42503.5. Samples: 7554761760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 15:17:10,795][15401] Updated weights for policy 0, policy_version 461110 (0.0038) [2024-06-23 15:17:13,389][15132] Fps is (10 sec: 37698.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7554875392. Throughput: 0: 42556.9. Samples: 7555016380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 15:17:15,443][15401] Updated weights for policy 0, policy_version 461120 (0.0035) [2024-06-23 15:17:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.2). Total num frames: 7555121152. Throughput: 0: 42558.4. Samples: 7555264840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 15:17:18,819][15401] Updated weights for policy 0, policy_version 461130 (0.0026) [2024-06-23 15:17:23,107][15401] Updated weights for policy 0, policy_version 461140 (0.0042) [2024-06-23 15:17:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7555317760. Throughput: 0: 42617.9. Samples: 7555405860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 15:17:26,324][15401] Updated weights for policy 0, policy_version 461150 (0.0035) [2024-06-23 15:17:28,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7555514368. Throughput: 0: 42690.8. Samples: 7555658540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:28,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 15:17:30,668][15401] Updated weights for policy 0, policy_version 461160 (0.0031) [2024-06-23 15:17:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7555776512. Throughput: 0: 42830.6. Samples: 7555910540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 15:17:33,768][15401] Updated weights for policy 0, policy_version 461170 (0.0037) [2024-06-23 15:17:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7555940352. Throughput: 0: 42893.0. Samples: 7556051880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 15:17:38,688][15401] Updated weights for policy 0, policy_version 461180 (0.0031) [2024-06-23 15:17:41,236][15401] Updated weights for policy 0, policy_version 461190 (0.0029) [2024-06-23 15:17:43,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7556153344. Throughput: 0: 42815.0. Samples: 7556301680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 15:17:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461191_7556153344.pth... [2024-06-23 15:17:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460566_7545913344.pth [2024-06-23 15:17:46,106][15401] Updated weights for policy 0, policy_version 461200 (0.0040) [2024-06-23 15:17:48,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 7556415488. Throughput: 0: 43015.3. Samples: 7556557900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 15:17:48,981][15401] Updated weights for policy 0, policy_version 461210 (0.0039) [2024-06-23 15:17:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7556595712. Throughput: 0: 42997.7. Samples: 7556696660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 15:17:53,827][15401] Updated weights for policy 0, policy_version 461220 (0.0048) [2024-06-23 15:17:55,854][15349] Signal inference workers to stop experience collection... (112000 times) [2024-06-23 15:17:55,860][15349] Signal inference workers to resume experience collection... (112000 times) [2024-06-23 15:17:55,872][15401] InferenceWorker_p0-w0: stopping experience collection (112000 times) [2024-06-23 15:17:55,903][15401] InferenceWorker_p0-w0: resuming experience collection (112000 times) [2024-06-23 15:17:56,525][15401] Updated weights for policy 0, policy_version 461230 (0.0042) [2024-06-23 15:17:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7556808704. Throughput: 0: 42709.2. Samples: 7556938400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:17:58,392][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 15:18:01,433][15401] Updated weights for policy 0, policy_version 461240 (0.0045) [2024-06-23 15:18:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42601.1, 300 sec: 42932.2). Total num frames: 7557054464. Throughput: 0: 42864.6. Samples: 7557193740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:18:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 15:18:04,098][15401] Updated weights for policy 0, policy_version 461250 (0.0027) [2024-06-23 15:18:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7557218304. Throughput: 0: 42703.6. Samples: 7557327520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:18:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 15:18:08,994][15401] Updated weights for policy 0, policy_version 461260 (0.0031) [2024-06-23 15:18:12,564][15401] Updated weights for policy 0, policy_version 461270 (0.0037) [2024-06-23 15:18:13,391][15132] Fps is (10 sec: 40953.8, 60 sec: 43143.3, 300 sec: 42875.8). Total num frames: 7557464064. Throughput: 0: 42571.8. Samples: 7557574340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:18:13,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 15:18:16,608][15401] Updated weights for policy 0, policy_version 461280 (0.0048) [2024-06-23 15:18:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 7557677056. Throughput: 0: 42778.4. Samples: 7557835560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-23 15:18:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 15:18:20,159][15401] Updated weights for policy 0, policy_version 461290 (0.0029) [2024-06-23 15:18:23,389][15132] Fps is (10 sec: 39328.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7557857280. Throughput: 0: 42549.0. Samples: 7557966580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 15:18:24,503][15401] Updated weights for policy 0, policy_version 461300 (0.0037) [2024-06-23 15:18:27,746][15401] Updated weights for policy 0, policy_version 461310 (0.0041) [2024-06-23 15:18:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7558103040. Throughput: 0: 42477.9. Samples: 7558213180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 15:18:32,383][15401] Updated weights for policy 0, policy_version 461320 (0.0037) [2024-06-23 15:18:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7558316032. Throughput: 0: 42629.4. Samples: 7558476220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 15:18:36,103][15401] Updated weights for policy 0, policy_version 461330 (0.0050) [2024-06-23 15:18:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7558512640. Throughput: 0: 42355.1. Samples: 7558602640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 15:18:40,149][15401] Updated weights for policy 0, policy_version 461340 (0.0030) [2024-06-23 15:18:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7558742016. Throughput: 0: 42607.1. Samples: 7558855620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 15:18:43,564][15401] Updated weights for policy 0, policy_version 461350 (0.0038) [2024-06-23 15:18:47,667][15401] Updated weights for policy 0, policy_version 461360 (0.0030) [2024-06-23 15:18:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7558955008. Throughput: 0: 42777.9. Samples: 7559118740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 15:18:51,000][15401] Updated weights for policy 0, policy_version 461370 (0.0036) [2024-06-23 15:18:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7559151616. Throughput: 0: 42576.8. Samples: 7559243480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 15:18:55,312][15401] Updated weights for policy 0, policy_version 461380 (0.0050) [2024-06-23 15:18:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 7559397376. Throughput: 0: 42816.3. Samples: 7559501000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:18:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 15:18:58,507][15401] Updated weights for policy 0, policy_version 461390 (0.0037) [2024-06-23 15:19:03,113][15401] Updated weights for policy 0, policy_version 461400 (0.0034) [2024-06-23 15:19:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42765.3). Total num frames: 7559577600. Throughput: 0: 42782.0. Samples: 7559760760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 15:19:06,139][15401] Updated weights for policy 0, policy_version 461410 (0.0035) [2024-06-23 15:19:08,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7559790592. Throughput: 0: 42544.3. Samples: 7559881180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:08,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 15:19:10,694][15401] Updated weights for policy 0, policy_version 461420 (0.0029) [2024-06-23 15:19:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42872.5, 300 sec: 42987.2). Total num frames: 7560036352. Throughput: 0: 42846.1. Samples: 7560141260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 15:19:13,742][15401] Updated weights for policy 0, policy_version 461430 (0.0026) [2024-06-23 15:19:17,913][15349] Signal inference workers to stop experience collection... (112050 times) [2024-06-23 15:19:17,945][15401] InferenceWorker_p0-w0: stopping experience collection (112050 times) [2024-06-23 15:19:17,972][15349] Signal inference workers to resume experience collection... (112050 times) [2024-06-23 15:19:17,973][15401] InferenceWorker_p0-w0: resuming experience collection (112050 times) [2024-06-23 15:19:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 7560200192. Throughput: 0: 42729.0. Samples: 7560399020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:18,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 15:19:18,594][15401] Updated weights for policy 0, policy_version 461440 (0.0031) [2024-06-23 15:19:21,446][15401] Updated weights for policy 0, policy_version 461450 (0.0043) [2024-06-23 15:19:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7560445952. Throughput: 0: 42568.8. Samples: 7560518240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 15:19:26,105][15401] Updated weights for policy 0, policy_version 461460 (0.0037) [2024-06-23 15:19:28,389][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7560675328. Throughput: 0: 42812.5. Samples: 7560782180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 15:19:29,322][15401] Updated weights for policy 0, policy_version 461470 (0.0033) [2024-06-23 15:19:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7560855552. Throughput: 0: 42764.9. Samples: 7561043160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 15:19:33,699][15401] Updated weights for policy 0, policy_version 461480 (0.0038) [2024-06-23 15:19:37,124][15401] Updated weights for policy 0, policy_version 461490 (0.0026) [2024-06-23 15:19:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7561101312. Throughput: 0: 42785.7. Samples: 7561168840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 15:19:41,614][15401] Updated weights for policy 0, policy_version 461500 (0.0030) [2024-06-23 15:19:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 7561314304. Throughput: 0: 42732.4. Samples: 7561423960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-23 15:19:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 15:19:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461506_7561314304.pth... [2024-06-23 15:19:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000460879_7551041536.pth [2024-06-23 15:19:44,688][15401] Updated weights for policy 0, policy_version 461510 (0.0036) [2024-06-23 15:19:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7561478144. Throughput: 0: 42793.0. Samples: 7561686440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:19:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 15:19:49,278][15401] Updated weights for policy 0, policy_version 461520 (0.0026) [2024-06-23 15:19:52,285][15401] Updated weights for policy 0, policy_version 461530 (0.0039) [2024-06-23 15:19:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7561740288. Throughput: 0: 42824.9. Samples: 7561808200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:19:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 15:19:56,878][15401] Updated weights for policy 0, policy_version 461540 (0.0032) [2024-06-23 15:19:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 7561953280. Throughput: 0: 42931.2. Samples: 7562073160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:19:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 15:19:59,950][15401] Updated weights for policy 0, policy_version 461550 (0.0044) [2024-06-23 15:20:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7562133504. Throughput: 0: 42794.3. Samples: 7562324780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 15:20:04,933][15401] Updated weights for policy 0, policy_version 461560 (0.0034) [2024-06-23 15:20:07,660][15401] Updated weights for policy 0, policy_version 461570 (0.0042) [2024-06-23 15:20:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 7562379264. Throughput: 0: 42999.3. Samples: 7562453200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:08,396][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 15:20:12,394][15401] Updated weights for policy 0, policy_version 461580 (0.0030) [2024-06-23 15:20:13,389][15132] Fps is (10 sec: 44238.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7562575872. Throughput: 0: 43033.8. Samples: 7562718700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 15:20:15,107][15401] Updated weights for policy 0, policy_version 461590 (0.0043) [2024-06-23 15:20:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 7562788864. Throughput: 0: 42764.8. Samples: 7562967580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:18,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 15:20:20,051][15401] Updated weights for policy 0, policy_version 461600 (0.0027) [2024-06-23 15:20:22,690][15401] Updated weights for policy 0, policy_version 461610 (0.0040) [2024-06-23 15:20:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7563034624. Throughput: 0: 42969.0. Samples: 7563102440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:23,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 15:20:27,538][15401] Updated weights for policy 0, policy_version 461620 (0.0029) [2024-06-23 15:20:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7563214848. Throughput: 0: 43157.7. Samples: 7563366060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 15:20:30,281][15401] Updated weights for policy 0, policy_version 461630 (0.0038) [2024-06-23 15:20:30,884][15349] Signal inference workers to stop experience collection... (112100 times) [2024-06-23 15:20:30,884][15349] Signal inference workers to resume experience collection... (112100 times) [2024-06-23 15:20:30,921][15401] InferenceWorker_p0-w0: stopping experience collection (112100 times) [2024-06-23 15:20:30,921][15401] InferenceWorker_p0-w0: resuming experience collection (112100 times) [2024-06-23 15:20:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7563444224. Throughput: 0: 42879.9. Samples: 7563616040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 15:20:35,197][15401] Updated weights for policy 0, policy_version 461640 (0.0029) [2024-06-23 15:20:37,932][15401] Updated weights for policy 0, policy_version 461650 (0.0036) [2024-06-23 15:20:38,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42931.9). Total num frames: 7563689984. Throughput: 0: 43056.5. Samples: 7563745740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 15:20:42,762][15401] Updated weights for policy 0, policy_version 461660 (0.0041) [2024-06-23 15:20:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7563853824. Throughput: 0: 42996.8. Samples: 7564008020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 15:20:45,489][15401] Updated weights for policy 0, policy_version 461670 (0.0027) [2024-06-23 15:20:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 7564083200. Throughput: 0: 42983.3. Samples: 7564259020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:48,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 15:20:50,405][15401] Updated weights for policy 0, policy_version 461680 (0.0027) [2024-06-23 15:20:53,099][15401] Updated weights for policy 0, policy_version 461690 (0.0043) [2024-06-23 15:20:53,392][15132] Fps is (10 sec: 49140.5, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 7564345344. Throughput: 0: 43108.8. Samples: 7564393200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:53,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 15:20:58,069][15401] Updated weights for policy 0, policy_version 461700 (0.0039) [2024-06-23 15:20:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7564509184. Throughput: 0: 42940.8. Samples: 7564651040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:20:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 15:21:00,720][15401] Updated weights for policy 0, policy_version 461710 (0.0036) [2024-06-23 15:21:03,389][15132] Fps is (10 sec: 39331.0, 60 sec: 43417.8, 300 sec: 42710.4). Total num frames: 7564738560. Throughput: 0: 42855.2. Samples: 7564896060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:21:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 15:21:05,794][15401] Updated weights for policy 0, policy_version 461720 (0.0035) [2024-06-23 15:21:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7564951552. Throughput: 0: 42931.1. Samples: 7565034340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:21:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 15:21:08,771][15401] Updated weights for policy 0, policy_version 461730 (0.0036) [2024-06-23 15:21:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7565131776. Throughput: 0: 42766.8. Samples: 7565290560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 15:21:13,436][15401] Updated weights for policy 0, policy_version 461740 (0.0044) [2024-06-23 15:21:16,354][15401] Updated weights for policy 0, policy_version 461750 (0.0032) [2024-06-23 15:21:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 7565393920. Throughput: 0: 42737.8. Samples: 7565539240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 15:21:20,997][15401] Updated weights for policy 0, policy_version 461760 (0.0042) [2024-06-23 15:21:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7565574144. Throughput: 0: 42813.3. Samples: 7565672340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 15:21:24,054][15401] Updated weights for policy 0, policy_version 461770 (0.0046) [2024-06-23 15:21:28,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7565770752. Throughput: 0: 42722.6. Samples: 7565930540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 15:21:28,544][15401] Updated weights for policy 0, policy_version 461780 (0.0049) [2024-06-23 15:21:31,798][15401] Updated weights for policy 0, policy_version 461790 (0.0031) [2024-06-23 15:21:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7566016512. Throughput: 0: 42522.3. Samples: 7566172520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 15:21:36,320][15401] Updated weights for policy 0, policy_version 461800 (0.0041) [2024-06-23 15:21:36,777][15349] Signal inference workers to stop experience collection... (112150 times) [2024-06-23 15:21:36,777][15349] Signal inference workers to resume experience collection... (112150 times) [2024-06-23 15:21:36,793][15401] InferenceWorker_p0-w0: stopping experience collection (112150 times) [2024-06-23 15:21:36,794][15401] InferenceWorker_p0-w0: resuming experience collection (112150 times) [2024-06-23 15:21:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7566229504. Throughput: 0: 42629.8. Samples: 7566311440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 15:21:39,393][15401] Updated weights for policy 0, policy_version 461810 (0.0039) [2024-06-23 15:21:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7566409728. Throughput: 0: 42617.8. Samples: 7566568840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 15:21:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461818_7566426112.pth... [2024-06-23 15:21:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461191_7556153344.pth [2024-06-23 15:21:43,847][15401] Updated weights for policy 0, policy_version 461820 (0.0037) [2024-06-23 15:21:46,957][15401] Updated weights for policy 0, policy_version 461830 (0.0027) [2024-06-23 15:21:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 7566671872. Throughput: 0: 42679.4. Samples: 7566816740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:48,393][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 15:21:51,564][15401] Updated weights for policy 0, policy_version 461840 (0.0027) [2024-06-23 15:21:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 7566884864. Throughput: 0: 42785.2. Samples: 7566959680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:21:54,490][15401] Updated weights for policy 0, policy_version 461850 (0.0030) [2024-06-23 15:21:58,390][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 7567048704. Throughput: 0: 42645.7. Samples: 7567209620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:21:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:21:59,142][15401] Updated weights for policy 0, policy_version 461860 (0.0038) [2024-06-23 15:22:02,067][15401] Updated weights for policy 0, policy_version 461870 (0.0039) [2024-06-23 15:22:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7567310848. Throughput: 0: 42644.9. Samples: 7567458260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 15:22:06,943][15401] Updated weights for policy 0, policy_version 461880 (0.0033) [2024-06-23 15:22:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7567523840. Throughput: 0: 42764.5. Samples: 7567596740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 15:22:09,854][15401] Updated weights for policy 0, policy_version 461890 (0.0040) [2024-06-23 15:22:13,391][15132] Fps is (10 sec: 37679.0, 60 sec: 42597.5, 300 sec: 42598.3). Total num frames: 7567687680. Throughput: 0: 42648.7. Samples: 7567849780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:13,391][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 15:22:14,650][15401] Updated weights for policy 0, policy_version 461900 (0.0031) [2024-06-23 15:22:17,494][15401] Updated weights for policy 0, policy_version 461910 (0.0030) [2024-06-23 15:22:18,394][15132] Fps is (10 sec: 42578.3, 60 sec: 42595.1, 300 sec: 42819.9). Total num frames: 7567949824. Throughput: 0: 42768.3. Samples: 7568097300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:18,395][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 15:22:22,262][15401] Updated weights for policy 0, policy_version 461920 (0.0039) [2024-06-23 15:22:23,392][15132] Fps is (10 sec: 47507.7, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 7568162816. Throughput: 0: 42673.2. Samples: 7568231840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:23,393][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 15:22:25,179][15401] Updated weights for policy 0, policy_version 461930 (0.0033) [2024-06-23 15:22:28,389][15132] Fps is (10 sec: 39340.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7568343040. Throughput: 0: 42512.5. Samples: 7568481900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 15:22:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 15:22:30,062][15401] Updated weights for policy 0, policy_version 461940 (0.0032) [2024-06-23 15:22:33,258][15401] Updated weights for policy 0, policy_version 461950 (0.0031) [2024-06-23 15:22:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7568588800. Throughput: 0: 42670.3. Samples: 7568736800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 15:22:37,608][15401] Updated weights for policy 0, policy_version 461960 (0.0040) [2024-06-23 15:22:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7568785408. Throughput: 0: 42372.6. Samples: 7568866440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 15:22:40,953][15401] Updated weights for policy 0, policy_version 461970 (0.0030) [2024-06-23 15:22:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7568998400. Throughput: 0: 42605.3. Samples: 7569126860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 15:22:45,219][15401] Updated weights for policy 0, policy_version 461980 (0.0037) [2024-06-23 15:22:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 7569211392. Throughput: 0: 42542.3. Samples: 7569372660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 15:22:48,823][15401] Updated weights for policy 0, policy_version 461990 (0.0032) [2024-06-23 15:22:52,789][15401] Updated weights for policy 0, policy_version 462000 (0.0031) [2024-06-23 15:22:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 7569424384. Throughput: 0: 42380.6. Samples: 7569503860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 15:22:56,565][15401] Updated weights for policy 0, policy_version 462010 (0.0038) [2024-06-23 15:22:58,394][15132] Fps is (10 sec: 40941.5, 60 sec: 42868.2, 300 sec: 42597.8). Total num frames: 7569620992. Throughput: 0: 42396.9. Samples: 7569757780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:22:58,395][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 15:23:00,497][15401] Updated weights for policy 0, policy_version 462020 (0.0040) [2024-06-23 15:23:03,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7569850368. Throughput: 0: 42450.1. Samples: 7570007360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 15:23:04,470][15401] Updated weights for policy 0, policy_version 462030 (0.0032) [2024-06-23 15:23:08,214][15401] Updated weights for policy 0, policy_version 462040 (0.0031) [2024-06-23 15:23:08,390][15132] Fps is (10 sec: 44255.3, 60 sec: 42325.1, 300 sec: 42709.7). Total num frames: 7570063360. Throughput: 0: 42434.4. Samples: 7570141300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 15:23:12,266][15401] Updated weights for policy 0, policy_version 462050 (0.0030) [2024-06-23 15:23:13,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42872.3, 300 sec: 42653.9). Total num frames: 7570259968. Throughput: 0: 42424.9. Samples: 7570391020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 15:23:15,722][15349] Signal inference workers to stop experience collection... (112200 times) [2024-06-23 15:23:15,750][15401] InferenceWorker_p0-w0: stopping experience collection (112200 times) [2024-06-23 15:23:15,780][15349] Signal inference workers to resume experience collection... (112200 times) [2024-06-23 15:23:15,781][15401] InferenceWorker_p0-w0: resuming experience collection (112200 times) [2024-06-23 15:23:15,933][15401] Updated weights for policy 0, policy_version 462060 (0.0025) [2024-06-23 15:23:18,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42328.6, 300 sec: 42820.5). Total num frames: 7570489344. Throughput: 0: 42252.3. Samples: 7570638160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:18,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 15:23:20,104][15401] Updated weights for policy 0, policy_version 462070 (0.0028) [2024-06-23 15:23:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 7570685952. Throughput: 0: 42323.4. Samples: 7570771000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:23,390][15132] Avg episode reward: [(0, '0.899')] [2024-06-23 15:23:23,924][15401] Updated weights for policy 0, policy_version 462080 (0.0030) [2024-06-23 15:23:27,660][15401] Updated weights for policy 0, policy_version 462090 (0.0042) [2024-06-23 15:23:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7570882560. Throughput: 0: 42347.7. Samples: 7571032500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 15:23:31,494][15401] Updated weights for policy 0, policy_version 462100 (0.0041) [2024-06-23 15:23:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7571144704. Throughput: 0: 42279.5. Samples: 7571275240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:33,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 15:23:35,619][15401] Updated weights for policy 0, policy_version 462110 (0.0043) [2024-06-23 15:23:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 7571324928. Throughput: 0: 42478.5. Samples: 7571415500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:38,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 15:23:39,300][15401] Updated weights for policy 0, policy_version 462120 (0.0032) [2024-06-23 15:23:43,275][15401] Updated weights for policy 0, policy_version 462130 (0.0032) [2024-06-23 15:23:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7571537920. Throughput: 0: 42449.1. Samples: 7571667800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 15:23:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462130_7571537920.pth... [2024-06-23 15:23:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461506_7561314304.pth [2024-06-23 15:23:46,860][15401] Updated weights for policy 0, policy_version 462140 (0.0034) [2024-06-23 15:23:48,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7571767296. Throughput: 0: 42460.1. Samples: 7571918060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 15:23:51,148][15401] Updated weights for policy 0, policy_version 462150 (0.0047) [2024-06-23 15:23:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 7571963904. Throughput: 0: 42426.9. Samples: 7572050500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 15:23:53,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 15:23:54,637][15401] Updated weights for policy 0, policy_version 462160 (0.0043) [2024-06-23 15:23:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42328.4, 300 sec: 42653.9). Total num frames: 7572160512. Throughput: 0: 42307.5. Samples: 7572294860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:23:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 15:23:58,879][15401] Updated weights for policy 0, policy_version 462170 (0.0045) [2024-06-23 15:24:02,117][15401] Updated weights for policy 0, policy_version 462180 (0.0031) [2024-06-23 15:24:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 7572406272. Throughput: 0: 42601.8. Samples: 7572555240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 15:24:06,526][15401] Updated weights for policy 0, policy_version 462190 (0.0040) [2024-06-23 15:24:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.6, 300 sec: 42598.4). Total num frames: 7572602880. Throughput: 0: 42658.3. Samples: 7572690620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:08,400][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 15:24:09,740][15401] Updated weights for policy 0, policy_version 462200 (0.0047) [2024-06-23 15:24:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 7572799488. Throughput: 0: 42327.4. Samples: 7572937240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 15:24:14,221][15401] Updated weights for policy 0, policy_version 462210 (0.0037) [2024-06-23 15:24:17,343][15401] Updated weights for policy 0, policy_version 462220 (0.0039) [2024-06-23 15:24:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7573028864. Throughput: 0: 42593.0. Samples: 7573191920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:18,398][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 15:24:21,867][15401] Updated weights for policy 0, policy_version 462230 (0.0029) [2024-06-23 15:24:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7573241856. Throughput: 0: 42511.1. Samples: 7573328400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 15:24:25,142][15401] Updated weights for policy 0, policy_version 462240 (0.0035) [2024-06-23 15:24:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7573454848. Throughput: 0: 42485.0. Samples: 7573579620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 15:24:29,445][15401] Updated weights for policy 0, policy_version 462250 (0.0030) [2024-06-23 15:24:32,814][15401] Updated weights for policy 0, policy_version 462260 (0.0034) [2024-06-23 15:24:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7573684224. Throughput: 0: 42455.3. Samples: 7573828540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 15:24:37,009][15401] Updated weights for policy 0, policy_version 462270 (0.0028) [2024-06-23 15:24:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 7573864448. Throughput: 0: 42541.4. Samples: 7573964860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 15:24:40,467][15401] Updated weights for policy 0, policy_version 462280 (0.0034) [2024-06-23 15:24:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7574110208. Throughput: 0: 42774.8. Samples: 7574219720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 15:24:44,857][15401] Updated weights for policy 0, policy_version 462290 (0.0030) [2024-06-23 15:24:47,527][15349] Signal inference workers to stop experience collection... (112250 times) [2024-06-23 15:24:47,532][15349] Signal inference workers to resume experience collection... (112250 times) [2024-06-23 15:24:47,564][15401] InferenceWorker_p0-w0: stopping experience collection (112250 times) [2024-06-23 15:24:47,564][15401] InferenceWorker_p0-w0: resuming experience collection (112250 times) [2024-06-23 15:24:48,120][15401] Updated weights for policy 0, policy_version 462300 (0.0036) [2024-06-23 15:24:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7574323200. Throughput: 0: 42628.1. Samples: 7574473500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:48,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 15:24:52,428][15401] Updated weights for policy 0, policy_version 462310 (0.0037) [2024-06-23 15:24:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7574519808. Throughput: 0: 42525.9. Samples: 7574604280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 15:24:55,944][15401] Updated weights for policy 0, policy_version 462320 (0.0025) [2024-06-23 15:24:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7574749184. Throughput: 0: 42929.7. Samples: 7574869080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:24:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 15:24:59,958][15401] Updated weights for policy 0, policy_version 462330 (0.0026) [2024-06-23 15:25:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7574962176. Throughput: 0: 42798.1. Samples: 7575117840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:25:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 15:25:03,590][15401] Updated weights for policy 0, policy_version 462340 (0.0039) [2024-06-23 15:25:07,688][15401] Updated weights for policy 0, policy_version 462350 (0.0042) [2024-06-23 15:25:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7575158784. Throughput: 0: 42545.8. Samples: 7575242960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:25:08,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 15:25:11,042][15401] Updated weights for policy 0, policy_version 462360 (0.0024) [2024-06-23 15:25:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7575371776. Throughput: 0: 42854.9. Samples: 7575508100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:25:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 15:25:15,350][15401] Updated weights for policy 0, policy_version 462370 (0.0040) [2024-06-23 15:25:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7575601152. Throughput: 0: 42958.1. Samples: 7575761660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 15:25:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 15:25:18,738][15401] Updated weights for policy 0, policy_version 462380 (0.0029) [2024-06-23 15:25:22,779][15401] Updated weights for policy 0, policy_version 462390 (0.0040) [2024-06-23 15:25:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7575814144. Throughput: 0: 42772.5. Samples: 7575889620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 15:25:26,563][15401] Updated weights for policy 0, policy_version 462400 (0.0025) [2024-06-23 15:25:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7576027136. Throughput: 0: 42907.4. Samples: 7576150560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 15:25:30,258][15401] Updated weights for policy 0, policy_version 462410 (0.0034) [2024-06-23 15:25:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7576240128. Throughput: 0: 42944.0. Samples: 7576405980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 15:25:34,520][15401] Updated weights for policy 0, policy_version 462420 (0.0033) [2024-06-23 15:25:37,676][15401] Updated weights for policy 0, policy_version 462430 (0.0029) [2024-06-23 15:25:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7576469504. Throughput: 0: 42823.4. Samples: 7576531340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 15:25:42,028][15401] Updated weights for policy 0, policy_version 462440 (0.0040) [2024-06-23 15:25:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7576666112. Throughput: 0: 42753.5. Samples: 7576792980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:43,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 15:25:43,623][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462445_7576698880.pth... [2024-06-23 15:25:43,706][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000461818_7566426112.pth [2024-06-23 15:25:45,490][15401] Updated weights for policy 0, policy_version 462450 (0.0026) [2024-06-23 15:25:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 7576879104. Throughput: 0: 42920.1. Samples: 7577049240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 15:25:50,291][15401] Updated weights for policy 0, policy_version 462460 (0.0026) [2024-06-23 15:25:53,018][15401] Updated weights for policy 0, policy_version 462470 (0.0033) [2024-06-23 15:25:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7577108480. Throughput: 0: 42963.2. Samples: 7577176300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:53,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-23 15:25:57,768][15401] Updated weights for policy 0, policy_version 462480 (0.0039) [2024-06-23 15:25:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7577305088. Throughput: 0: 42914.8. Samples: 7577439260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:25:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 15:26:00,835][15401] Updated weights for policy 0, policy_version 462490 (0.0036) [2024-06-23 15:26:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7577518080. Throughput: 0: 42673.4. Samples: 7577681960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 15:26:05,387][15401] Updated weights for policy 0, policy_version 462500 (0.0048) [2024-06-23 15:26:05,888][15349] Signal inference workers to stop experience collection... (112300 times) [2024-06-23 15:26:05,889][15349] Signal inference workers to resume experience collection... (112300 times) [2024-06-23 15:26:05,922][15401] InferenceWorker_p0-w0: stopping experience collection (112300 times) [2024-06-23 15:26:05,922][15401] InferenceWorker_p0-w0: resuming experience collection (112300 times) [2024-06-23 15:26:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7577747456. Throughput: 0: 42684.5. Samples: 7577810420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 15:26:08,520][15401] Updated weights for policy 0, policy_version 462510 (0.0035) [2024-06-23 15:26:12,823][15401] Updated weights for policy 0, policy_version 462520 (0.0037) [2024-06-23 15:26:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 7577960448. Throughput: 0: 42957.1. Samples: 7578083620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:13,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 15:26:15,968][15401] Updated weights for policy 0, policy_version 462530 (0.0028) [2024-06-23 15:26:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7578173440. Throughput: 0: 42705.3. Samples: 7578327720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:18,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 15:26:20,772][15401] Updated weights for policy 0, policy_version 462540 (0.0038) [2024-06-23 15:26:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7578386432. Throughput: 0: 42836.4. Samples: 7578458980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:23,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 15:26:23,614][15401] Updated weights for policy 0, policy_version 462550 (0.0028) [2024-06-23 15:26:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 7578566656. Throughput: 0: 42636.4. Samples: 7578711620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 15:26:28,491][15401] Updated weights for policy 0, policy_version 462560 (0.0044) [2024-06-23 15:26:31,488][15401] Updated weights for policy 0, policy_version 462570 (0.0039) [2024-06-23 15:26:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7578812416. Throughput: 0: 42517.8. Samples: 7578962540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 15:26:36,067][15401] Updated weights for policy 0, policy_version 462580 (0.0034) [2024-06-23 15:26:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7579025408. Throughput: 0: 42613.9. Samples: 7579093920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:38,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 15:26:39,415][15401] Updated weights for policy 0, policy_version 462590 (0.0031) [2024-06-23 15:26:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 7579205632. Throughput: 0: 42439.9. Samples: 7579349060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-23 15:26:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 15:26:43,730][15401] Updated weights for policy 0, policy_version 462600 (0.0040) [2024-06-23 15:26:47,369][15401] Updated weights for policy 0, policy_version 462610 (0.0039) [2024-06-23 15:26:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7579435008. Throughput: 0: 42618.7. Samples: 7579599800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:26:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 15:26:51,409][15401] Updated weights for policy 0, policy_version 462620 (0.0034) [2024-06-23 15:26:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7579664384. Throughput: 0: 42763.5. Samples: 7579734780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:26:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 15:26:54,989][15401] Updated weights for policy 0, policy_version 462630 (0.0031) [2024-06-23 15:26:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7579844608. Throughput: 0: 42231.8. Samples: 7579984060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:26:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 15:26:58,966][15401] Updated weights for policy 0, policy_version 462640 (0.0027) [2024-06-23 15:27:02,531][15401] Updated weights for policy 0, policy_version 462650 (0.0042) [2024-06-23 15:27:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 7580057600. Throughput: 0: 42407.8. Samples: 7580236060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 15:27:06,488][15401] Updated weights for policy 0, policy_version 462660 (0.0052) [2024-06-23 15:27:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42654.1). Total num frames: 7580270592. Throughput: 0: 42336.6. Samples: 7580364120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 15:27:10,832][15401] Updated weights for policy 0, policy_version 462670 (0.0039) [2024-06-23 15:27:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42488.0). Total num frames: 7580483584. Throughput: 0: 42313.5. Samples: 7580615720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 15:27:14,256][15401] Updated weights for policy 0, policy_version 462680 (0.0032) [2024-06-23 15:27:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42487.7). Total num frames: 7580696576. Throughput: 0: 42321.8. Samples: 7580867020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 15:27:18,471][15401] Updated weights for policy 0, policy_version 462690 (0.0037) [2024-06-23 15:27:22,427][15401] Updated weights for policy 0, policy_version 462700 (0.0037) [2024-06-23 15:27:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7580909568. Throughput: 0: 42308.7. Samples: 7580997820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 15:27:24,423][15349] Signal inference workers to stop experience collection... (112350 times) [2024-06-23 15:27:24,430][15349] Signal inference workers to resume experience collection... (112350 times) [2024-06-23 15:27:24,452][15401] InferenceWorker_p0-w0: stopping experience collection (112350 times) [2024-06-23 15:27:24,452][15401] InferenceWorker_p0-w0: resuming experience collection (112350 times) [2024-06-23 15:27:26,145][15401] Updated weights for policy 0, policy_version 462710 (0.0027) [2024-06-23 15:27:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7581106176. Throughput: 0: 42247.1. Samples: 7581250180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 15:27:30,022][15401] Updated weights for policy 0, policy_version 462720 (0.0022) [2024-06-23 15:27:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 7581335552. Throughput: 0: 42299.0. Samples: 7581503260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 15:27:33,671][15401] Updated weights for policy 0, policy_version 462730 (0.0027) [2024-06-23 15:27:37,775][15401] Updated weights for policy 0, policy_version 462740 (0.0031) [2024-06-23 15:27:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7581548544. Throughput: 0: 42204.4. Samples: 7581633980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 15:27:41,194][15401] Updated weights for policy 0, policy_version 462750 (0.0033) [2024-06-23 15:27:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7581745152. Throughput: 0: 42302.2. Samples: 7581887660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 15:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462753_7581745152.pth... [2024-06-23 15:27:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462130_7571537920.pth [2024-06-23 15:27:45,552][15401] Updated weights for policy 0, policy_version 462760 (0.0025) [2024-06-23 15:27:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7581990912. Throughput: 0: 42163.4. Samples: 7582133420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 15:27:49,473][15401] Updated weights for policy 0, policy_version 462770 (0.0035) [2024-06-23 15:27:53,211][15401] Updated weights for policy 0, policy_version 462780 (0.0035) [2024-06-23 15:27:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42599.1). Total num frames: 7582187520. Throughput: 0: 42413.8. Samples: 7582272740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 15:27:57,130][15401] Updated weights for policy 0, policy_version 462790 (0.0024) [2024-06-23 15:27:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 7582384128. Throughput: 0: 42271.5. Samples: 7582517940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:27:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 15:28:00,996][15401] Updated weights for policy 0, policy_version 462800 (0.0028) [2024-06-23 15:28:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42654.0). Total num frames: 7582646272. Throughput: 0: 42130.6. Samples: 7582762900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:28:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 15:28:04,821][15401] Updated weights for policy 0, policy_version 462810 (0.0038) [2024-06-23 15:28:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7582810112. Throughput: 0: 42348.1. Samples: 7582903480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 15:28:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 15:28:09,116][15401] Updated weights for policy 0, policy_version 462820 (0.0035) [2024-06-23 15:28:12,832][15401] Updated weights for policy 0, policy_version 462830 (0.0033) [2024-06-23 15:28:13,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 7583006720. Throughput: 0: 42239.5. Samples: 7583150960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 15:28:16,691][15401] Updated weights for policy 0, policy_version 462840 (0.0040) [2024-06-23 15:28:18,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 7583285248. Throughput: 0: 42124.1. Samples: 7583398940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:18,392][15132] Avg episode reward: [(0, '0.219')] [2024-06-23 15:28:20,567][15401] Updated weights for policy 0, policy_version 462850 (0.0034) [2024-06-23 15:28:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7583432704. Throughput: 0: 42253.4. Samples: 7583535380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 15:28:24,285][15401] Updated weights for policy 0, policy_version 462860 (0.0033) [2024-06-23 15:28:28,266][15401] Updated weights for policy 0, policy_version 462870 (0.0040) [2024-06-23 15:28:28,392][15132] Fps is (10 sec: 37683.0, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 7583662080. Throughput: 0: 42059.6. Samples: 7583780440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:28,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 15:28:32,143][15401] Updated weights for policy 0, policy_version 462880 (0.0033) [2024-06-23 15:28:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 7583891456. Throughput: 0: 42259.1. Samples: 7584035080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 15:28:33,448][15349] Signal inference workers to stop experience collection... (112400 times) [2024-06-23 15:28:33,477][15401] InferenceWorker_p0-w0: stopping experience collection (112400 times) [2024-06-23 15:28:33,496][15349] Signal inference workers to resume experience collection... (112400 times) [2024-06-23 15:28:33,513][15401] InferenceWorker_p0-w0: resuming experience collection (112400 times) [2024-06-23 15:28:35,925][15401] Updated weights for policy 0, policy_version 462890 (0.0038) [2024-06-23 15:28:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7584088064. Throughput: 0: 41960.4. Samples: 7584160960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 15:28:40,128][15401] Updated weights for policy 0, policy_version 462900 (0.0039) [2024-06-23 15:28:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42487.4). Total num frames: 7584301056. Throughput: 0: 42064.0. Samples: 7584410820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 15:28:43,488][15401] Updated weights for policy 0, policy_version 462910 (0.0035) [2024-06-23 15:28:47,614][15401] Updated weights for policy 0, policy_version 462920 (0.0044) [2024-06-23 15:28:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 7584514048. Throughput: 0: 42538.1. Samples: 7584677220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:48,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 15:28:51,362][15401] Updated weights for policy 0, policy_version 462930 (0.0035) [2024-06-23 15:28:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 7584694272. Throughput: 0: 42192.8. Samples: 7584802160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 15:28:55,330][15401] Updated weights for policy 0, policy_version 462940 (0.0030) [2024-06-23 15:28:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7584940032. Throughput: 0: 42185.4. Samples: 7585049300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:28:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 15:28:58,906][15401] Updated weights for policy 0, policy_version 462950 (0.0035) [2024-06-23 15:29:02,991][15401] Updated weights for policy 0, policy_version 462960 (0.0033) [2024-06-23 15:29:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 7585153024. Throughput: 0: 42474.2. Samples: 7585310180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 15:29:06,555][15401] Updated weights for policy 0, policy_version 462970 (0.0028) [2024-06-23 15:29:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7585349632. Throughput: 0: 42285.8. Samples: 7585438240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 15:29:10,738][15401] Updated weights for policy 0, policy_version 462980 (0.0034) [2024-06-23 15:29:13,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 7585579008. Throughput: 0: 42450.9. Samples: 7585690900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:13,397][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 15:29:14,267][15401] Updated weights for policy 0, policy_version 462990 (0.0036) [2024-06-23 15:29:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41507.9, 300 sec: 42487.4). Total num frames: 7585775616. Throughput: 0: 42458.8. Samples: 7585945720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 15:29:18,503][15401] Updated weights for policy 0, policy_version 463000 (0.0040) [2024-06-23 15:29:22,246][15401] Updated weights for policy 0, policy_version 463010 (0.0031) [2024-06-23 15:29:23,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7585972224. Throughput: 0: 42410.8. Samples: 7586069440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 15:29:26,114][15401] Updated weights for policy 0, policy_version 463020 (0.0032) [2024-06-23 15:29:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.2, 300 sec: 42542.8). Total num frames: 7586234368. Throughput: 0: 42500.7. Samples: 7586323360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 15:29:29,946][15401] Updated weights for policy 0, policy_version 463030 (0.0026) [2024-06-23 15:29:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7586414592. Throughput: 0: 42495.7. Samples: 7586589420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 15:29:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 15:29:34,082][15401] Updated weights for policy 0, policy_version 463040 (0.0034) [2024-06-23 15:29:37,483][15401] Updated weights for policy 0, policy_version 463050 (0.0034) [2024-06-23 15:29:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.6, 300 sec: 42431.4). Total num frames: 7586627584. Throughput: 0: 42254.6. Samples: 7586703720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:29:38,393][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 15:29:41,646][15401] Updated weights for policy 0, policy_version 463060 (0.0039) [2024-06-23 15:29:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 7586873344. Throughput: 0: 42568.4. Samples: 7586964880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:29:43,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 15:29:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463066_7586873344.pth... [2024-06-23 15:29:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462445_7576698880.pth [2024-06-23 15:29:45,170][15401] Updated weights for policy 0, policy_version 463070 (0.0048) [2024-06-23 15:29:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 7587037184. Throughput: 0: 42625.9. Samples: 7587228340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:29:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 15:29:49,167][15401] Updated weights for policy 0, policy_version 463080 (0.0028) [2024-06-23 15:29:49,818][15349] Signal inference workers to stop experience collection... (112450 times) [2024-06-23 15:29:49,862][15401] InferenceWorker_p0-w0: stopping experience collection (112450 times) [2024-06-23 15:29:49,874][15349] Signal inference workers to resume experience collection... (112450 times) [2024-06-23 15:29:49,875][15401] InferenceWorker_p0-w0: resuming experience collection (112450 times) [2024-06-23 15:29:52,738][15401] Updated weights for policy 0, policy_version 463090 (0.0030) [2024-06-23 15:29:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 7587266560. Throughput: 0: 42403.5. Samples: 7587346400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:29:53,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 15:29:56,705][15401] Updated weights for policy 0, policy_version 463100 (0.0037) [2024-06-23 15:29:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7587512320. Throughput: 0: 42588.9. Samples: 7587607120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:29:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 15:30:00,459][15401] Updated weights for policy 0, policy_version 463110 (0.0035) [2024-06-23 15:30:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7587676160. Throughput: 0: 42770.1. Samples: 7587870380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:03,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 15:30:04,297][15401] Updated weights for policy 0, policy_version 463120 (0.0030) [2024-06-23 15:30:08,223][15401] Updated weights for policy 0, policy_version 463130 (0.0026) [2024-06-23 15:30:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7587921920. Throughput: 0: 42701.3. Samples: 7587991000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 15:30:12,235][15401] Updated weights for policy 0, policy_version 463140 (0.0042) [2024-06-23 15:30:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42876.1, 300 sec: 42542.9). Total num frames: 7588151296. Throughput: 0: 42865.8. Samples: 7588252320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 15:30:15,846][15401] Updated weights for policy 0, policy_version 463150 (0.0033) [2024-06-23 15:30:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 7588347904. Throughput: 0: 42676.4. Samples: 7588509860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 15:30:19,817][15401] Updated weights for policy 0, policy_version 463160 (0.0032) [2024-06-23 15:30:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 7588544512. Throughput: 0: 42841.0. Samples: 7588631460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:23,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 15:30:23,960][15401] Updated weights for policy 0, policy_version 463170 (0.0040) [2024-06-23 15:30:27,571][15401] Updated weights for policy 0, policy_version 463180 (0.0033) [2024-06-23 15:30:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7588773888. Throughput: 0: 42871.1. Samples: 7588894080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 15:30:31,631][15401] Updated weights for policy 0, policy_version 463190 (0.0038) [2024-06-23 15:30:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 7588986880. Throughput: 0: 42746.7. Samples: 7589151940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 15:30:35,395][15401] Updated weights for policy 0, policy_version 463200 (0.0028) [2024-06-23 15:30:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 7589199872. Throughput: 0: 42804.0. Samples: 7589272580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 15:30:39,088][15401] Updated weights for policy 0, policy_version 463210 (0.0038) [2024-06-23 15:30:42,843][15401] Updated weights for policy 0, policy_version 463220 (0.0033) [2024-06-23 15:30:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7589412864. Throughput: 0: 42898.2. Samples: 7589537540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 15:30:46,698][15401] Updated weights for policy 0, policy_version 463230 (0.0050) [2024-06-23 15:30:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 7589642240. Throughput: 0: 42587.1. Samples: 7589786800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 15:30:50,641][15401] Updated weights for policy 0, policy_version 463240 (0.0034) [2024-06-23 15:30:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 7589838848. Throughput: 0: 42906.6. Samples: 7589921800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 15:30:54,179][15401] Updated weights for policy 0, policy_version 463250 (0.0042) [2024-06-23 15:30:58,184][15401] Updated weights for policy 0, policy_version 463260 (0.0036) [2024-06-23 15:30:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7590051840. Throughput: 0: 42684.0. Samples: 7590173100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 15:30:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 15:31:02,118][15401] Updated weights for policy 0, policy_version 463270 (0.0032) [2024-06-23 15:31:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42487.3). Total num frames: 7590281216. Throughput: 0: 42641.3. Samples: 7590428720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 15:31:05,928][15401] Updated weights for policy 0, policy_version 463280 (0.0033) [2024-06-23 15:31:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7590477824. Throughput: 0: 42915.2. Samples: 7590562640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:08,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 15:31:09,761][15401] Updated weights for policy 0, policy_version 463290 (0.0047) [2024-06-23 15:31:13,315][15401] Updated weights for policy 0, policy_version 463300 (0.0030) [2024-06-23 15:31:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7590707200. Throughput: 0: 42785.9. Samples: 7590819440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 15:31:17,478][15401] Updated weights for policy 0, policy_version 463310 (0.0028) [2024-06-23 15:31:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 7590920192. Throughput: 0: 42773.6. Samples: 7591076760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:18,400][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 15:31:20,837][15401] Updated weights for policy 0, policy_version 463320 (0.0030) [2024-06-23 15:31:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7591116800. Throughput: 0: 43008.4. Samples: 7591207960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 15:31:24,915][15401] Updated weights for policy 0, policy_version 463330 (0.0031) [2024-06-23 15:31:28,344][15401] Updated weights for policy 0, policy_version 463340 (0.0043) [2024-06-23 15:31:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 7591362560. Throughput: 0: 42671.9. Samples: 7591457780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:28,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 15:31:32,638][15401] Updated weights for policy 0, policy_version 463350 (0.0029) [2024-06-23 15:31:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 7591559168. Throughput: 0: 43042.2. Samples: 7591723700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 15:31:36,795][15401] Updated weights for policy 0, policy_version 463360 (0.0034) [2024-06-23 15:31:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7591755776. Throughput: 0: 42888.6. Samples: 7591851780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 15:31:39,556][15349] Signal inference workers to stop experience collection... (112500 times) [2024-06-23 15:31:39,557][15349] Signal inference workers to resume experience collection... (112500 times) [2024-06-23 15:31:39,612][15401] InferenceWorker_p0-w0: stopping experience collection (112500 times) [2024-06-23 15:31:39,612][15401] InferenceWorker_p0-w0: resuming experience collection (112500 times) [2024-06-23 15:31:40,228][15401] Updated weights for policy 0, policy_version 463370 (0.0032) [2024-06-23 15:31:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7591985152. Throughput: 0: 42739.5. Samples: 7592096380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 15:31:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463378_7591985152.pth... [2024-06-23 15:31:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000462753_7581745152.pth [2024-06-23 15:31:44,385][15401] Updated weights for policy 0, policy_version 463380 (0.0038) [2024-06-23 15:31:47,921][15401] Updated weights for policy 0, policy_version 463390 (0.0037) [2024-06-23 15:31:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7592198144. Throughput: 0: 42854.8. Samples: 7592357180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 15:31:51,871][15401] Updated weights for policy 0, policy_version 463400 (0.0035) [2024-06-23 15:31:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7592394752. Throughput: 0: 42784.8. Samples: 7592487960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 15:31:55,453][15401] Updated weights for policy 0, policy_version 463410 (0.0032) [2024-06-23 15:31:58,391][15132] Fps is (10 sec: 44230.4, 60 sec: 43143.6, 300 sec: 42653.7). Total num frames: 7592640512. Throughput: 0: 42699.5. Samples: 7592740980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:31:58,391][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 15:31:59,223][15401] Updated weights for policy 0, policy_version 463420 (0.0036) [2024-06-23 15:32:03,363][15401] Updated weights for policy 0, policy_version 463430 (0.0026) [2024-06-23 15:32:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7592837120. Throughput: 0: 42894.6. Samples: 7593007020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:32:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 15:32:06,781][15401] Updated weights for policy 0, policy_version 463440 (0.0024) [2024-06-23 15:32:08,390][15132] Fps is (10 sec: 40965.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7593050112. Throughput: 0: 42774.7. Samples: 7593132820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:32:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 15:32:11,147][15401] Updated weights for policy 0, policy_version 463450 (0.0042) [2024-06-23 15:32:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7593279488. Throughput: 0: 42782.1. Samples: 7593383080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:32:13,393][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 15:32:14,467][15401] Updated weights for policy 0, policy_version 463460 (0.0029) [2024-06-23 15:32:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7593459712. Throughput: 0: 42748.9. Samples: 7593647400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:32:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 15:32:19,043][15401] Updated weights for policy 0, policy_version 463470 (0.0041) [2024-06-23 15:32:22,040][15401] Updated weights for policy 0, policy_version 463480 (0.0034) [2024-06-23 15:32:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7593672704. Throughput: 0: 42666.2. Samples: 7593771760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 15:32:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 15:32:26,596][15401] Updated weights for policy 0, policy_version 463490 (0.0031) [2024-06-23 15:32:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7593918464. Throughput: 0: 42854.3. Samples: 7594024820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 15:32:29,735][15401] Updated weights for policy 0, policy_version 463500 (0.0027) [2024-06-23 15:32:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7594098688. Throughput: 0: 42831.1. Samples: 7594284580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 15:32:34,206][15401] Updated weights for policy 0, policy_version 463510 (0.0044) [2024-06-23 15:32:37,382][15401] Updated weights for policy 0, policy_version 463520 (0.0035) [2024-06-23 15:32:38,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 7594328064. Throughput: 0: 42576.6. Samples: 7594404180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:38,397][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 15:32:41,884][15401] Updated weights for policy 0, policy_version 463530 (0.0035) [2024-06-23 15:32:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7594557440. Throughput: 0: 42774.6. Samples: 7594665780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 15:32:45,032][15401] Updated weights for policy 0, policy_version 463540 (0.0036) [2024-06-23 15:32:48,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7594737664. Throughput: 0: 42670.0. Samples: 7594927160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:32:49,362][15401] Updated weights for policy 0, policy_version 463550 (0.0027) [2024-06-23 15:32:53,177][15401] Updated weights for policy 0, policy_version 463560 (0.0043) [2024-06-23 15:32:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7594967040. Throughput: 0: 42666.2. Samples: 7595052800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 15:32:56,934][15401] Updated weights for policy 0, policy_version 463570 (0.0041) [2024-06-23 15:32:58,365][15349] Signal inference workers to stop experience collection... (112550 times) [2024-06-23 15:32:58,365][15349] Signal inference workers to resume experience collection... (112550 times) [2024-06-23 15:32:58,381][15401] InferenceWorker_p0-w0: stopping experience collection (112550 times) [2024-06-23 15:32:58,381][15401] InferenceWorker_p0-w0: resuming experience collection (112550 times) [2024-06-23 15:32:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42599.4, 300 sec: 42542.9). Total num frames: 7595196416. Throughput: 0: 42749.0. Samples: 7595306680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:32:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 15:33:00,974][15401] Updated weights for policy 0, policy_version 463580 (0.0040) [2024-06-23 15:33:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7595393024. Throughput: 0: 42601.6. Samples: 7595564480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 15:33:04,635][15401] Updated weights for policy 0, policy_version 463590 (0.0038) [2024-06-23 15:33:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7595589632. Throughput: 0: 42528.9. Samples: 7595685560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 15:33:08,651][15401] Updated weights for policy 0, policy_version 463600 (0.0026) [2024-06-23 15:33:12,446][15401] Updated weights for policy 0, policy_version 463610 (0.0029) [2024-06-23 15:33:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42543.2). Total num frames: 7595835392. Throughput: 0: 42779.1. Samples: 7595949880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 15:33:16,378][15401] Updated weights for policy 0, policy_version 463620 (0.0034) [2024-06-23 15:33:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7596032000. Throughput: 0: 42610.5. Samples: 7596202060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 15:33:20,075][15401] Updated weights for policy 0, policy_version 463630 (0.0039) [2024-06-23 15:33:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7596244992. Throughput: 0: 42577.7. Samples: 7596319900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:23,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 15:33:24,223][15401] Updated weights for policy 0, policy_version 463640 (0.0040) [2024-06-23 15:33:27,943][15401] Updated weights for policy 0, policy_version 463650 (0.0043) [2024-06-23 15:33:28,390][15132] Fps is (10 sec: 42595.1, 60 sec: 42324.7, 300 sec: 42598.3). Total num frames: 7596457984. Throughput: 0: 42508.5. Samples: 7596578700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:28,391][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 15:33:31,835][15401] Updated weights for policy 0, policy_version 463660 (0.0046) [2024-06-23 15:33:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7596654592. Throughput: 0: 42422.5. Samples: 7596836180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 15:33:35,526][15401] Updated weights for policy 0, policy_version 463670 (0.0034) [2024-06-23 15:33:38,389][15132] Fps is (10 sec: 42602.6, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 7596883968. Throughput: 0: 42341.0. Samples: 7596958140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 15:33:39,878][15401] Updated weights for policy 0, policy_version 463680 (0.0030) [2024-06-23 15:33:43,216][15401] Updated weights for policy 0, policy_version 463690 (0.0038) [2024-06-23 15:33:43,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 7597096960. Throughput: 0: 42460.4. Samples: 7597217500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:43,393][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 15:33:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463690_7597096960.pth... [2024-06-23 15:33:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463066_7586873344.pth [2024-06-23 15:33:47,605][15401] Updated weights for policy 0, policy_version 463700 (0.0030) [2024-06-23 15:33:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7597277184. Throughput: 0: 42450.3. Samples: 7597474740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-23 15:33:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 15:33:50,838][15401] Updated weights for policy 0, policy_version 463710 (0.0028) [2024-06-23 15:33:53,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7597539328. Throughput: 0: 42504.8. Samples: 7597598280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:33:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 15:33:55,033][15401] Updated weights for policy 0, policy_version 463720 (0.0047) [2024-06-23 15:33:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7597735936. Throughput: 0: 42482.7. Samples: 7597861600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:33:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 15:33:58,540][15401] Updated weights for policy 0, policy_version 463730 (0.0037) [2024-06-23 15:34:02,562][15401] Updated weights for policy 0, policy_version 463740 (0.0028) [2024-06-23 15:34:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7597932544. Throughput: 0: 42552.5. Samples: 7598116920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 15:34:06,184][15401] Updated weights for policy 0, policy_version 463750 (0.0038) [2024-06-23 15:34:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42710.4). Total num frames: 7598178304. Throughput: 0: 42659.6. Samples: 7598239580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:08,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 15:34:10,163][15401] Updated weights for policy 0, policy_version 463760 (0.0028) [2024-06-23 15:34:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 7598374912. Throughput: 0: 42762.6. Samples: 7598503080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:13,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 15:34:13,718][15401] Updated weights for policy 0, policy_version 463770 (0.0033) [2024-06-23 15:34:17,865][15401] Updated weights for policy 0, policy_version 463780 (0.0038) [2024-06-23 15:34:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7598587904. Throughput: 0: 42580.1. Samples: 7598752280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 15:34:21,296][15401] Updated weights for policy 0, policy_version 463790 (0.0029) [2024-06-23 15:34:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7598800896. Throughput: 0: 42559.4. Samples: 7598873320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 15:34:25,613][15349] Signal inference workers to stop experience collection... (112600 times) [2024-06-23 15:34:25,644][15401] InferenceWorker_p0-w0: stopping experience collection (112600 times) [2024-06-23 15:34:25,670][15349] Signal inference workers to resume experience collection... (112600 times) [2024-06-23 15:34:25,671][15401] InferenceWorker_p0-w0: resuming experience collection (112600 times) [2024-06-23 15:34:25,813][15401] Updated weights for policy 0, policy_version 463800 (0.0032) [2024-06-23 15:34:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42326.0, 300 sec: 42653.9). Total num frames: 7598997504. Throughput: 0: 42529.9. Samples: 7599131240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 15:34:29,143][15401] Updated weights for policy 0, policy_version 463810 (0.0030) [2024-06-23 15:34:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 7599226880. Throughput: 0: 42562.3. Samples: 7599390040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 15:34:33,392][15401] Updated weights for policy 0, policy_version 463820 (0.0028) [2024-06-23 15:34:37,073][15401] Updated weights for policy 0, policy_version 463830 (0.0034) [2024-06-23 15:34:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7599456256. Throughput: 0: 42698.7. Samples: 7599519720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 15:34:41,351][15401] Updated weights for policy 0, policy_version 463840 (0.0040) [2024-06-23 15:34:43,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42327.0, 300 sec: 42709.4). Total num frames: 7599636480. Throughput: 0: 42490.1. Samples: 7599773660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 15:34:44,693][15401] Updated weights for policy 0, policy_version 463850 (0.0053) [2024-06-23 15:34:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7599849472. Throughput: 0: 42452.4. Samples: 7600027380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:48,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 15:34:49,199][15401] Updated weights for policy 0, policy_version 463860 (0.0038) [2024-06-23 15:34:52,366][15401] Updated weights for policy 0, policy_version 463870 (0.0030) [2024-06-23 15:34:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 7600078848. Throughput: 0: 42566.5. Samples: 7600155180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:53,393][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 15:34:57,083][15401] Updated weights for policy 0, policy_version 463880 (0.0029) [2024-06-23 15:34:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7600291840. Throughput: 0: 42541.4. Samples: 7600417340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:34:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 15:35:00,268][15401] Updated weights for policy 0, policy_version 463890 (0.0040) [2024-06-23 15:35:03,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7600504832. Throughput: 0: 42390.6. Samples: 7600659860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:35:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 15:35:04,825][15401] Updated weights for policy 0, policy_version 463900 (0.0027) [2024-06-23 15:35:07,949][15401] Updated weights for policy 0, policy_version 463910 (0.0042) [2024-06-23 15:35:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7600717824. Throughput: 0: 42493.3. Samples: 7600785520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:35:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 15:35:12,402][15401] Updated weights for policy 0, policy_version 463920 (0.0035) [2024-06-23 15:35:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 7600914432. Throughput: 0: 42630.2. Samples: 7601049600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 15:35:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:35:15,442][15401] Updated weights for policy 0, policy_version 463930 (0.0033) [2024-06-23 15:35:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7601127424. Throughput: 0: 42282.5. Samples: 7601292760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 15:35:20,311][15401] Updated weights for policy 0, policy_version 463940 (0.0026) [2024-06-23 15:35:22,956][15401] Updated weights for policy 0, policy_version 463950 (0.0046) [2024-06-23 15:35:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7601356800. Throughput: 0: 42359.6. Samples: 7601425900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 15:35:27,905][15401] Updated weights for policy 0, policy_version 463960 (0.0037) [2024-06-23 15:35:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7601537024. Throughput: 0: 42542.4. Samples: 7601688060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 15:35:30,374][15401] Updated weights for policy 0, policy_version 463970 (0.0031) [2024-06-23 15:35:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7601782784. Throughput: 0: 42452.5. Samples: 7601937640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 15:35:35,592][15401] Updated weights for policy 0, policy_version 463980 (0.0034) [2024-06-23 15:35:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7601995776. Throughput: 0: 42622.0. Samples: 7602073060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 15:35:38,610][15401] Updated weights for policy 0, policy_version 463990 (0.0030) [2024-06-23 15:35:43,163][15401] Updated weights for policy 0, policy_version 464000 (0.0028) [2024-06-23 15:35:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7602176000. Throughput: 0: 42415.8. Samples: 7602326060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:43,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 15:35:43,523][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464001_7602192384.pth... [2024-06-23 15:35:43,579][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463378_7591985152.pth [2024-06-23 15:35:44,203][15349] Signal inference workers to stop experience collection... (112650 times) [2024-06-23 15:35:44,252][15401] InferenceWorker_p0-w0: stopping experience collection (112650 times) [2024-06-23 15:35:44,261][15349] Signal inference workers to resume experience collection... (112650 times) [2024-06-23 15:35:44,268][15401] InferenceWorker_p0-w0: resuming experience collection (112650 times) [2024-06-23 15:35:46,152][15401] Updated weights for policy 0, policy_version 464010 (0.0029) [2024-06-23 15:35:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 7602421760. Throughput: 0: 42440.0. Samples: 7602569660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 15:35:50,932][15401] Updated weights for policy 0, policy_version 464020 (0.0029) [2024-06-23 15:35:53,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 7602634752. Throughput: 0: 42633.9. Samples: 7602704040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:53,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-23 15:35:54,102][15401] Updated weights for policy 0, policy_version 464030 (0.0036) [2024-06-23 15:35:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 7602798592. Throughput: 0: 42347.8. Samples: 7602955260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:35:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 15:35:58,701][15401] Updated weights for policy 0, policy_version 464040 (0.0035) [2024-06-23 15:36:01,638][15401] Updated weights for policy 0, policy_version 464050 (0.0029) [2024-06-23 15:36:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7603077120. Throughput: 0: 42544.5. Samples: 7603207260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 15:36:06,419][15401] Updated weights for policy 0, policy_version 464060 (0.0043) [2024-06-23 15:36:08,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7603257344. Throughput: 0: 42758.3. Samples: 7603350020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 15:36:09,207][15401] Updated weights for policy 0, policy_version 464070 (0.0038) [2024-06-23 15:36:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7603453952. Throughput: 0: 42453.3. Samples: 7603598460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 15:36:13,999][15401] Updated weights for policy 0, policy_version 464080 (0.0029) [2024-06-23 15:36:17,065][15401] Updated weights for policy 0, policy_version 464090 (0.0031) [2024-06-23 15:36:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7603716096. Throughput: 0: 42324.8. Samples: 7603842260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 15:36:21,624][15401] Updated weights for policy 0, policy_version 464100 (0.0041) [2024-06-23 15:36:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7603896320. Throughput: 0: 42525.8. Samples: 7603986720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 15:36:24,697][15401] Updated weights for policy 0, policy_version 464110 (0.0027) [2024-06-23 15:36:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7604092928. Throughput: 0: 42432.1. Samples: 7604235500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 15:36:29,269][15401] Updated weights for policy 0, policy_version 464120 (0.0047) [2024-06-23 15:36:32,390][15401] Updated weights for policy 0, policy_version 464130 (0.0041) [2024-06-23 15:36:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7604338688. Throughput: 0: 42714.2. Samples: 7604491800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 15:36:36,632][15401] Updated weights for policy 0, policy_version 464140 (0.0041) [2024-06-23 15:36:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 7604535296. Throughput: 0: 42643.8. Samples: 7604623020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-23 15:36:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 15:36:39,945][15401] Updated weights for policy 0, policy_version 464150 (0.0028) [2024-06-23 15:36:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 7604731904. Throughput: 0: 42687.6. Samples: 7604876200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:36:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 15:36:44,314][15401] Updated weights for policy 0, policy_version 464160 (0.0030) [2024-06-23 15:36:47,766][15401] Updated weights for policy 0, policy_version 464170 (0.0023) [2024-06-23 15:36:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7604977664. Throughput: 0: 42680.9. Samples: 7605127900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:36:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 15:36:52,194][15401] Updated weights for policy 0, policy_version 464180 (0.0022) [2024-06-23 15:36:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42543.1). Total num frames: 7605190656. Throughput: 0: 42520.8. Samples: 7605263460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:36:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 15:36:55,496][15401] Updated weights for policy 0, policy_version 464190 (0.0023) [2024-06-23 15:36:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 7605387264. Throughput: 0: 42617.3. Samples: 7605516340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:36:58,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 15:36:59,825][15401] Updated weights for policy 0, policy_version 464200 (0.0032) [2024-06-23 15:37:02,991][15401] Updated weights for policy 0, policy_version 464210 (0.0044) [2024-06-23 15:37:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7605616640. Throughput: 0: 42744.6. Samples: 7605765760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 15:37:07,611][15401] Updated weights for policy 0, policy_version 464220 (0.0027) [2024-06-23 15:37:07,932][15349] Signal inference workers to stop experience collection... (112700 times) [2024-06-23 15:37:07,936][15349] Signal inference workers to resume experience collection... (112700 times) [2024-06-23 15:37:07,942][15401] InferenceWorker_p0-w0: stopping experience collection (112700 times) [2024-06-23 15:37:07,960][15401] InferenceWorker_p0-w0: resuming experience collection (112700 times) [2024-06-23 15:37:08,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 7605829632. Throughput: 0: 42520.0. Samples: 7605900120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 15:37:10,667][15401] Updated weights for policy 0, policy_version 464230 (0.0028) [2024-06-23 15:37:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7606042624. Throughput: 0: 42736.4. Samples: 7606158640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 15:37:15,099][15401] Updated weights for policy 0, policy_version 464240 (0.0037) [2024-06-23 15:37:18,318][15401] Updated weights for policy 0, policy_version 464250 (0.0025) [2024-06-23 15:37:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7606272000. Throughput: 0: 42524.1. Samples: 7606405380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 15:37:23,031][15401] Updated weights for policy 0, policy_version 464260 (0.0031) [2024-06-23 15:37:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 7606452224. Throughput: 0: 42548.8. Samples: 7606537720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 15:37:25,946][15401] Updated weights for policy 0, policy_version 464270 (0.0037) [2024-06-23 15:37:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7606665216. Throughput: 0: 42520.0. Samples: 7606789600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 15:37:30,838][15401] Updated weights for policy 0, policy_version 464280 (0.0043) [2024-06-23 15:37:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 7606894592. Throughput: 0: 42493.8. Samples: 7607040120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 15:37:33,729][15401] Updated weights for policy 0, policy_version 464290 (0.0038) [2024-06-23 15:37:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 7607058432. Throughput: 0: 42361.4. Samples: 7607169720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 15:37:38,552][15401] Updated weights for policy 0, policy_version 464300 (0.0037) [2024-06-23 15:37:41,531][15401] Updated weights for policy 0, policy_version 464310 (0.0027) [2024-06-23 15:37:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7607304192. Throughput: 0: 42336.5. Samples: 7607421380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 15:37:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464313_7607304192.pth... [2024-06-23 15:37:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000463690_7597096960.pth [2024-06-23 15:37:46,327][15401] Updated weights for policy 0, policy_version 464320 (0.0038) [2024-06-23 15:37:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7607533568. Throughput: 0: 42426.6. Samples: 7607674960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 15:37:49,163][15401] Updated weights for policy 0, policy_version 464330 (0.0040) [2024-06-23 15:37:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 7607713792. Throughput: 0: 42350.5. Samples: 7607805900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:53,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 15:37:53,946][15401] Updated weights for policy 0, policy_version 464340 (0.0042) [2024-06-23 15:37:56,777][15401] Updated weights for policy 0, policy_version 464350 (0.0038) [2024-06-23 15:37:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 7607926784. Throughput: 0: 42207.1. Samples: 7608057960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:37:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 15:38:01,494][15401] Updated weights for policy 0, policy_version 464360 (0.0041) [2024-06-23 15:38:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7608172544. Throughput: 0: 42440.9. Samples: 7608315220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 15:38:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 15:38:04,731][15401] Updated weights for policy 0, policy_version 464370 (0.0026) [2024-06-23 15:38:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 7608352768. Throughput: 0: 42503.2. Samples: 7608450360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:08,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 15:38:09,036][15401] Updated weights for policy 0, policy_version 464380 (0.0032) [2024-06-23 15:38:12,402][15401] Updated weights for policy 0, policy_version 464390 (0.0028) [2024-06-23 15:38:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7608582144. Throughput: 0: 42386.1. Samples: 7608696980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 15:38:16,774][15401] Updated weights for policy 0, policy_version 464400 (0.0026) [2024-06-23 15:38:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7608811520. Throughput: 0: 42571.6. Samples: 7608955840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:18,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 15:38:20,282][15401] Updated weights for policy 0, policy_version 464410 (0.0042) [2024-06-23 15:38:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42487.4). Total num frames: 7608991744. Throughput: 0: 42677.9. Samples: 7609090240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 15:38:24,277][15401] Updated weights for policy 0, policy_version 464420 (0.0038) [2024-06-23 15:38:26,776][15349] Signal inference workers to stop experience collection... (112750 times) [2024-06-23 15:38:26,776][15349] Signal inference workers to resume experience collection... (112750 times) [2024-06-23 15:38:26,811][15401] InferenceWorker_p0-w0: stopping experience collection (112750 times) [2024-06-23 15:38:26,812][15401] InferenceWorker_p0-w0: resuming experience collection (112750 times) [2024-06-23 15:38:27,915][15401] Updated weights for policy 0, policy_version 464430 (0.0044) [2024-06-23 15:38:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7609221120. Throughput: 0: 42530.7. Samples: 7609335260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 15:38:31,878][15401] Updated weights for policy 0, policy_version 464440 (0.0035) [2024-06-23 15:38:33,389][15132] Fps is (10 sec: 44238.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7609434112. Throughput: 0: 42741.0. Samples: 7609598300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 15:38:35,778][15401] Updated weights for policy 0, policy_version 464450 (0.0036) [2024-06-23 15:38:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 7609647104. Throughput: 0: 42668.1. Samples: 7609725960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 15:38:39,502][15401] Updated weights for policy 0, policy_version 464460 (0.0038) [2024-06-23 15:38:43,364][15401] Updated weights for policy 0, policy_version 464470 (0.0046) [2024-06-23 15:38:43,392][15132] Fps is (10 sec: 44225.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7609876480. Throughput: 0: 42641.2. Samples: 7609976920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:43,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 15:38:47,305][15401] Updated weights for policy 0, policy_version 464480 (0.0041) [2024-06-23 15:38:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7610073088. Throughput: 0: 42728.9. Samples: 7610238020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 15:38:50,980][15401] Updated weights for policy 0, policy_version 464490 (0.0029) [2024-06-23 15:38:53,398][15132] Fps is (10 sec: 39296.7, 60 sec: 42592.2, 300 sec: 42486.1). Total num frames: 7610269696. Throughput: 0: 42453.5. Samples: 7610361140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:53,399][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 15:38:54,962][15401] Updated weights for policy 0, policy_version 464500 (0.0043) [2024-06-23 15:38:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7610499072. Throughput: 0: 42629.4. Samples: 7610615300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:38:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 15:38:59,065][15401] Updated weights for policy 0, policy_version 464510 (0.0033) [2024-06-23 15:39:02,805][15401] Updated weights for policy 0, policy_version 464520 (0.0033) [2024-06-23 15:39:03,390][15132] Fps is (10 sec: 44275.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7610712064. Throughput: 0: 42669.6. Samples: 7610875980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 15:39:06,929][15401] Updated weights for policy 0, policy_version 464530 (0.0033) [2024-06-23 15:39:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 7610892288. Throughput: 0: 42433.3. Samples: 7610999720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 15:39:10,340][15401] Updated weights for policy 0, policy_version 464540 (0.0035) [2024-06-23 15:39:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 7611138048. Throughput: 0: 42572.3. Samples: 7611251020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 15:39:14,756][15401] Updated weights for policy 0, policy_version 464550 (0.0044) [2024-06-23 15:39:17,994][15401] Updated weights for policy 0, policy_version 464560 (0.0029) [2024-06-23 15:39:18,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7611351040. Throughput: 0: 42431.0. Samples: 7611507700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 15:39:22,569][15401] Updated weights for policy 0, policy_version 464570 (0.0040) [2024-06-23 15:39:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.6, 300 sec: 42487.3). Total num frames: 7611531264. Throughput: 0: 42458.3. Samples: 7611636580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 15:39:25,639][15401] Updated weights for policy 0, policy_version 464580 (0.0033) [2024-06-23 15:39:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7611777024. Throughput: 0: 42438.9. Samples: 7611886560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 15:39:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 15:39:30,219][15401] Updated weights for policy 0, policy_version 464590 (0.0036) [2024-06-23 15:39:33,244][15401] Updated weights for policy 0, policy_version 464600 (0.0044) [2024-06-23 15:39:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7612006400. Throughput: 0: 42318.2. Samples: 7612142340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 15:39:38,018][15401] Updated weights for policy 0, policy_version 464610 (0.0041) [2024-06-23 15:39:38,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 7612170240. Throughput: 0: 42467.0. Samples: 7612271780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 15:39:41,075][15401] Updated weights for policy 0, policy_version 464620 (0.0043) [2024-06-23 15:39:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42654.3). Total num frames: 7612432384. Throughput: 0: 42474.1. Samples: 7612526640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:43,396][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 15:39:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464626_7612432384.pth... [2024-06-23 15:39:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464001_7602192384.pth [2024-06-23 15:39:45,701][15401] Updated weights for policy 0, policy_version 464630 (0.0041) [2024-06-23 15:39:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 7612628992. Throughput: 0: 42521.4. Samples: 7612789440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 15:39:48,628][15401] Updated weights for policy 0, policy_version 464640 (0.0049) [2024-06-23 15:39:53,207][15401] Updated weights for policy 0, policy_version 464650 (0.0038) [2024-06-23 15:39:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42604.7, 300 sec: 42487.3). Total num frames: 7612825600. Throughput: 0: 42487.1. Samples: 7612911640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 15:39:55,679][15349] Signal inference workers to stop experience collection... (112800 times) [2024-06-23 15:39:55,717][15401] InferenceWorker_p0-w0: stopping experience collection (112800 times) [2024-06-23 15:39:55,728][15349] Signal inference workers to resume experience collection... (112800 times) [2024-06-23 15:39:55,744][15401] InferenceWorker_p0-w0: resuming experience collection (112800 times) [2024-06-23 15:39:56,546][15401] Updated weights for policy 0, policy_version 464660 (0.0027) [2024-06-23 15:39:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7613071360. Throughput: 0: 42636.1. Samples: 7613169640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:39:58,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-23 15:40:00,902][15401] Updated weights for policy 0, policy_version 464670 (0.0028) [2024-06-23 15:40:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7613267968. Throughput: 0: 42674.2. Samples: 7613428040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:03,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-23 15:40:04,204][15401] Updated weights for policy 0, policy_version 464680 (0.0039) [2024-06-23 15:40:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7613464576. Throughput: 0: 42496.0. Samples: 7613548900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 15:40:08,475][15401] Updated weights for policy 0, policy_version 464690 (0.0037) [2024-06-23 15:40:12,049][15401] Updated weights for policy 0, policy_version 464700 (0.0044) [2024-06-23 15:40:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7613710336. Throughput: 0: 42676.7. Samples: 7613807020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:13,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 15:40:16,013][15401] Updated weights for policy 0, policy_version 464710 (0.0038) [2024-06-23 15:40:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7613906944. Throughput: 0: 42866.3. Samples: 7614071320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 15:40:19,527][15401] Updated weights for policy 0, policy_version 464720 (0.0029) [2024-06-23 15:40:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7614119936. Throughput: 0: 42658.7. Samples: 7614191420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 15:40:23,977][15401] Updated weights for policy 0, policy_version 464730 (0.0027) [2024-06-23 15:40:27,387][15401] Updated weights for policy 0, policy_version 464740 (0.0034) [2024-06-23 15:40:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7614332928. Throughput: 0: 42578.8. Samples: 7614442680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 15:40:31,458][15401] Updated weights for policy 0, policy_version 464750 (0.0023) [2024-06-23 15:40:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 7614513152. Throughput: 0: 42749.7. Samples: 7614713180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 15:40:34,779][15401] Updated weights for policy 0, policy_version 464760 (0.0036) [2024-06-23 15:40:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7614758912. Throughput: 0: 42636.3. Samples: 7614830280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 15:40:39,462][15401] Updated weights for policy 0, policy_version 464770 (0.0025) [2024-06-23 15:40:42,205][15401] Updated weights for policy 0, policy_version 464780 (0.0033) [2024-06-23 15:40:43,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7614988288. Throughput: 0: 42723.1. Samples: 7615092180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:43,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 15:40:46,823][15401] Updated weights for policy 0, policy_version 464790 (0.0033) [2024-06-23 15:40:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7615168512. Throughput: 0: 43083.4. Samples: 7615366800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 15:40:50,103][15401] Updated weights for policy 0, policy_version 464800 (0.0041) [2024-06-23 15:40:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7615414272. Throughput: 0: 42907.9. Samples: 7615479760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:40:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 15:40:54,242][15401] Updated weights for policy 0, policy_version 464810 (0.0039) [2024-06-23 15:40:57,734][15401] Updated weights for policy 0, policy_version 464820 (0.0029) [2024-06-23 15:40:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7615610880. Throughput: 0: 42846.7. Samples: 7615735120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:40:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 15:41:02,264][15401] Updated weights for policy 0, policy_version 464830 (0.0051) [2024-06-23 15:41:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7615807488. Throughput: 0: 42866.2. Samples: 7616000300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 15:41:05,430][15401] Updated weights for policy 0, policy_version 464840 (0.0035) [2024-06-23 15:41:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7616053248. Throughput: 0: 42859.1. Samples: 7616120080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 15:41:10,038][15401] Updated weights for policy 0, policy_version 464850 (0.0041) [2024-06-23 15:41:13,007][15401] Updated weights for policy 0, policy_version 464860 (0.0039) [2024-06-23 15:41:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7616266240. Throughput: 0: 43071.0. Samples: 7616380880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 15:41:17,658][15401] Updated weights for policy 0, policy_version 464870 (0.0037) [2024-06-23 15:41:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7616446464. Throughput: 0: 42837.1. Samples: 7616640840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 15:41:20,911][15401] Updated weights for policy 0, policy_version 464880 (0.0032) [2024-06-23 15:41:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 7616692224. Throughput: 0: 42940.3. Samples: 7616762600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 15:41:25,323][15401] Updated weights for policy 0, policy_version 464890 (0.0031) [2024-06-23 15:41:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7616905216. Throughput: 0: 42996.4. Samples: 7617027020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 15:41:28,473][15401] Updated weights for policy 0, policy_version 464900 (0.0032) [2024-06-23 15:41:32,896][15401] Updated weights for policy 0, policy_version 464910 (0.0046) [2024-06-23 15:41:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7617101824. Throughput: 0: 42587.6. Samples: 7617283240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 15:41:33,936][15349] Signal inference workers to stop experience collection... (112850 times) [2024-06-23 15:41:33,987][15401] InferenceWorker_p0-w0: stopping experience collection (112850 times) [2024-06-23 15:41:33,997][15349] Signal inference workers to resume experience collection... (112850 times) [2024-06-23 15:41:34,002][15401] InferenceWorker_p0-w0: resuming experience collection (112850 times) [2024-06-23 15:41:36,082][15401] Updated weights for policy 0, policy_version 464920 (0.0042) [2024-06-23 15:41:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7617331200. Throughput: 0: 42695.6. Samples: 7617401060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:38,403][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 15:41:40,411][15401] Updated weights for policy 0, policy_version 464930 (0.0040) [2024-06-23 15:41:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7617544192. Throughput: 0: 42875.6. Samples: 7617664520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 15:41:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464938_7617544192.pth... [2024-06-23 15:41:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464313_7607304192.pth [2024-06-23 15:41:43,949][15401] Updated weights for policy 0, policy_version 464940 (0.0035) [2024-06-23 15:41:47,947][15401] Updated weights for policy 0, policy_version 464950 (0.0029) [2024-06-23 15:41:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.7, 300 sec: 42542.9). Total num frames: 7617740800. Throughput: 0: 42527.2. Samples: 7617914020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 15:41:51,609][15401] Updated weights for policy 0, policy_version 464960 (0.0027) [2024-06-23 15:41:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7617986560. Throughput: 0: 42756.0. Samples: 7618044100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:53,394][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 15:41:55,724][15401] Updated weights for policy 0, policy_version 464970 (0.0030) [2024-06-23 15:41:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 7618166784. Throughput: 0: 42676.4. Samples: 7618301320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:41:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 15:41:59,393][15401] Updated weights for policy 0, policy_version 464980 (0.0033) [2024-06-23 15:42:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7618379776. Throughput: 0: 42577.3. Samples: 7618556820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:42:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 15:42:03,409][15401] Updated weights for policy 0, policy_version 464990 (0.0044) [2024-06-23 15:42:07,177][15401] Updated weights for policy 0, policy_version 465000 (0.0033) [2024-06-23 15:42:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7618609152. Throughput: 0: 42644.1. Samples: 7618681580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:42:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 15:42:11,056][15401] Updated weights for policy 0, policy_version 465010 (0.0049) [2024-06-23 15:42:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7618805760. Throughput: 0: 42315.6. Samples: 7618931220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:42:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 15:42:15,088][15401] Updated weights for policy 0, policy_version 465020 (0.0033) [2024-06-23 15:42:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7619018752. Throughput: 0: 42432.2. Samples: 7619192680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 15:42:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 15:42:18,871][15401] Updated weights for policy 0, policy_version 465030 (0.0037) [2024-06-23 15:42:22,627][15401] Updated weights for policy 0, policy_version 465040 (0.0030) [2024-06-23 15:42:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7619248128. Throughput: 0: 42620.9. Samples: 7619319000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 15:42:26,311][15401] Updated weights for policy 0, policy_version 465050 (0.0033) [2024-06-23 15:42:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7619461120. Throughput: 0: 42521.9. Samples: 7619578000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 15:42:30,399][15401] Updated weights for policy 0, policy_version 465060 (0.0030) [2024-06-23 15:42:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7619674112. Throughput: 0: 42797.2. Samples: 7619839900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 15:42:34,086][15401] Updated weights for policy 0, policy_version 465070 (0.0032) [2024-06-23 15:42:38,160][15401] Updated weights for policy 0, policy_version 465080 (0.0034) [2024-06-23 15:42:38,391][15132] Fps is (10 sec: 40953.1, 60 sec: 42324.2, 300 sec: 42598.2). Total num frames: 7619870720. Throughput: 0: 42616.3. Samples: 7619961900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:38,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 15:42:41,788][15401] Updated weights for policy 0, policy_version 465090 (0.0034) [2024-06-23 15:42:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7620116480. Throughput: 0: 42558.6. Samples: 7620216460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 15:42:45,763][15401] Updated weights for policy 0, policy_version 465100 (0.0044) [2024-06-23 15:42:48,392][15132] Fps is (10 sec: 42594.9, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 7620296704. Throughput: 0: 42653.2. Samples: 7620476320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:48,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:42:49,310][15401] Updated weights for policy 0, policy_version 465110 (0.0032) [2024-06-23 15:42:50,783][15349] Signal inference workers to stop experience collection... (112900 times) [2024-06-23 15:42:50,824][15401] InferenceWorker_p0-w0: stopping experience collection (112900 times) [2024-06-23 15:42:50,835][15349] Signal inference workers to resume experience collection... (112900 times) [2024-06-23 15:42:50,836][15401] InferenceWorker_p0-w0: resuming experience collection (112900 times) [2024-06-23 15:42:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7620509696. Throughput: 0: 42584.0. Samples: 7620597860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 15:42:53,403][15401] Updated weights for policy 0, policy_version 465120 (0.0036) [2024-06-23 15:42:57,091][15401] Updated weights for policy 0, policy_version 465130 (0.0031) [2024-06-23 15:42:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7620739072. Throughput: 0: 42759.9. Samples: 7620855420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:42:58,394][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 15:43:00,911][15401] Updated weights for policy 0, policy_version 465140 (0.0037) [2024-06-23 15:43:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7620952064. Throughput: 0: 42609.2. Samples: 7621110100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 15:43:04,758][15401] Updated weights for policy 0, policy_version 465150 (0.0051) [2024-06-23 15:43:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7621148672. Throughput: 0: 42834.8. Samples: 7621246560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 15:43:08,574][15401] Updated weights for policy 0, policy_version 465160 (0.0041) [2024-06-23 15:43:12,399][15401] Updated weights for policy 0, policy_version 465170 (0.0035) [2024-06-23 15:43:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 7621361664. Throughput: 0: 42737.2. Samples: 7621501280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:13,392][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 15:43:16,082][15401] Updated weights for policy 0, policy_version 465180 (0.0024) [2024-06-23 15:43:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7621591040. Throughput: 0: 42396.4. Samples: 7621747740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 15:43:20,310][15401] Updated weights for policy 0, policy_version 465190 (0.0029) [2024-06-23 15:43:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7621787648. Throughput: 0: 42742.5. Samples: 7621885240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:23,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 15:43:23,763][15401] Updated weights for policy 0, policy_version 465200 (0.0038) [2024-06-23 15:43:27,889][15401] Updated weights for policy 0, policy_version 465210 (0.0027) [2024-06-23 15:43:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 7622017024. Throughput: 0: 42805.8. Samples: 7622142820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:28,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 15:43:31,932][15401] Updated weights for policy 0, policy_version 465220 (0.0029) [2024-06-23 15:43:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7622246400. Throughput: 0: 42533.9. Samples: 7622390240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 15:43:35,818][15401] Updated weights for policy 0, policy_version 465230 (0.0026) [2024-06-23 15:43:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42872.6, 300 sec: 42598.8). Total num frames: 7622443008. Throughput: 0: 42836.1. Samples: 7622525480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 15:43:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:43:39,464][15401] Updated weights for policy 0, policy_version 465240 (0.0029) [2024-06-23 15:43:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7622639616. Throughput: 0: 42825.3. Samples: 7622782560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:43:43,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 15:43:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465250_7622656000.pth... [2024-06-23 15:43:43,443][15401] Updated weights for policy 0, policy_version 465250 (0.0045) [2024-06-23 15:43:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464626_7612432384.pth [2024-06-23 15:43:47,004][15401] Updated weights for policy 0, policy_version 465260 (0.0035) [2024-06-23 15:43:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42766.3). Total num frames: 7622885376. Throughput: 0: 42694.3. Samples: 7623031340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:43:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 15:43:51,095][15401] Updated weights for policy 0, policy_version 465270 (0.0032) [2024-06-23 15:43:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7623081984. Throughput: 0: 42643.9. Samples: 7623165540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:43:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 15:43:54,591][15401] Updated weights for policy 0, policy_version 465280 (0.0031) [2024-06-23 15:43:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7623278592. Throughput: 0: 42689.0. Samples: 7623422180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:43:58,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 15:43:58,605][15401] Updated weights for policy 0, policy_version 465290 (0.0038) [2024-06-23 15:44:02,092][15401] Updated weights for policy 0, policy_version 465300 (0.0033) [2024-06-23 15:44:03,364][15349] Signal inference workers to stop experience collection... (112950 times) [2024-06-23 15:44:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 7623507968. Throughput: 0: 42946.7. Samples: 7623680440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:03,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 15:44:03,417][15401] InferenceWorker_p0-w0: stopping experience collection (112950 times) [2024-06-23 15:44:03,423][15349] Signal inference workers to resume experience collection... (112950 times) [2024-06-23 15:44:03,438][15401] InferenceWorker_p0-w0: resuming experience collection (112950 times) [2024-06-23 15:44:06,427][15401] Updated weights for policy 0, policy_version 465310 (0.0032) [2024-06-23 15:44:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7623737344. Throughput: 0: 42855.1. Samples: 7623813720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 15:44:09,704][15401] Updated weights for policy 0, policy_version 465320 (0.0027) [2024-06-23 15:44:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 7623917568. Throughput: 0: 42701.9. Samples: 7624064300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:13,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 15:44:14,060][15401] Updated weights for policy 0, policy_version 465330 (0.0037) [2024-06-23 15:44:17,591][15401] Updated weights for policy 0, policy_version 465340 (0.0029) [2024-06-23 15:44:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 7624146944. Throughput: 0: 42766.4. Samples: 7624314720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 15:44:21,524][15401] Updated weights for policy 0, policy_version 465350 (0.0022) [2024-06-23 15:44:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 7624359936. Throughput: 0: 42722.4. Samples: 7624448000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 15:44:25,250][15401] Updated weights for policy 0, policy_version 465360 (0.0031) [2024-06-23 15:44:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 7624556544. Throughput: 0: 42785.6. Samples: 7624707900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 15:44:28,975][15401] Updated weights for policy 0, policy_version 465370 (0.0030) [2024-06-23 15:44:32,679][15401] Updated weights for policy 0, policy_version 465380 (0.0021) [2024-06-23 15:44:33,389][15132] Fps is (10 sec: 44238.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7624802304. Throughput: 0: 42867.7. Samples: 7624960380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 15:44:36,718][15401] Updated weights for policy 0, policy_version 465390 (0.0027) [2024-06-23 15:44:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 7625015296. Throughput: 0: 42940.0. Samples: 7625097840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 15:44:40,287][15401] Updated weights for policy 0, policy_version 465400 (0.0024) [2024-06-23 15:44:43,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7625211904. Throughput: 0: 42833.6. Samples: 7625349800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:43,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 15:44:44,551][15401] Updated weights for policy 0, policy_version 465410 (0.0041) [2024-06-23 15:44:48,290][15401] Updated weights for policy 0, policy_version 465420 (0.0037) [2024-06-23 15:44:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7625441280. Throughput: 0: 42643.2. Samples: 7625599280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:48,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 15:44:52,203][15401] Updated weights for policy 0, policy_version 465430 (0.0035) [2024-06-23 15:44:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7625654272. Throughput: 0: 42571.5. Samples: 7625729440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:53,392][15132] Avg episode reward: [(0, '0.226')] [2024-06-23 15:44:55,779][15401] Updated weights for policy 0, policy_version 465440 (0.0034) [2024-06-23 15:44:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7625850880. Throughput: 0: 42648.5. Samples: 7625983480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:44:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 15:44:59,783][15401] Updated weights for policy 0, policy_version 465450 (0.0028) [2024-06-23 15:45:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7626080256. Throughput: 0: 42768.8. Samples: 7626239320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 15:45:03,396][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 15:45:03,580][15401] Updated weights for policy 0, policy_version 465460 (0.0037) [2024-06-23 15:45:07,611][15401] Updated weights for policy 0, policy_version 465470 (0.0039) [2024-06-23 15:45:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7626276864. Throughput: 0: 42734.1. Samples: 7626371020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 15:45:10,936][15401] Updated weights for policy 0, policy_version 465480 (0.0036) [2024-06-23 15:45:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7626489856. Throughput: 0: 42574.6. Samples: 7626623760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 15:45:15,203][15401] Updated weights for policy 0, policy_version 465490 (0.0029) [2024-06-23 15:45:18,375][15401] Updated weights for policy 0, policy_version 465500 (0.0031) [2024-06-23 15:45:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 7626752000. Throughput: 0: 42565.7. Samples: 7626875840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 15:45:22,702][15401] Updated weights for policy 0, policy_version 465510 (0.0058) [2024-06-23 15:45:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.7, 300 sec: 42654.0). Total num frames: 7626915840. Throughput: 0: 42476.6. Samples: 7627009280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 15:45:24,891][15349] Signal inference workers to stop experience collection... (113000 times) [2024-06-23 15:45:24,891][15349] Signal inference workers to resume experience collection... (113000 times) [2024-06-23 15:45:24,907][15401] InferenceWorker_p0-w0: stopping experience collection (113000 times) [2024-06-23 15:45:24,907][15401] InferenceWorker_p0-w0: resuming experience collection (113000 times) [2024-06-23 15:45:26,428][15401] Updated weights for policy 0, policy_version 465520 (0.0034) [2024-06-23 15:45:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7627145216. Throughput: 0: 42591.7. Samples: 7627266320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 15:45:30,499][15401] Updated weights for policy 0, policy_version 465530 (0.0039) [2024-06-23 15:45:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7627374592. Throughput: 0: 42718.4. Samples: 7627521600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:33,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 15:45:33,811][15401] Updated weights for policy 0, policy_version 465540 (0.0041) [2024-06-23 15:45:37,949][15401] Updated weights for policy 0, policy_version 465550 (0.0037) [2024-06-23 15:45:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7627571200. Throughput: 0: 42881.4. Samples: 7627659100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:38,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 15:45:41,289][15401] Updated weights for policy 0, policy_version 465560 (0.0035) [2024-06-23 15:45:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.3, 300 sec: 42765.1). Total num frames: 7627784192. Throughput: 0: 42894.3. Samples: 7627913720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 15:45:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465563_7627784192.pth... [2024-06-23 15:45:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000464938_7617544192.pth [2024-06-23 15:45:45,478][15401] Updated weights for policy 0, policy_version 465570 (0.0028) [2024-06-23 15:45:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7628029952. Throughput: 0: 42889.4. Samples: 7628169340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 15:45:48,828][15401] Updated weights for policy 0, policy_version 465580 (0.0038) [2024-06-23 15:45:52,897][15401] Updated weights for policy 0, policy_version 465590 (0.0031) [2024-06-23 15:45:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7628226560. Throughput: 0: 43007.7. Samples: 7628306360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 15:45:56,525][15401] Updated weights for policy 0, policy_version 465600 (0.0029) [2024-06-23 15:45:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7628423168. Throughput: 0: 43113.4. Samples: 7628563860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:45:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 15:46:00,466][15401] Updated weights for policy 0, policy_version 465610 (0.0030) [2024-06-23 15:46:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7628668928. Throughput: 0: 43170.3. Samples: 7628818500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 15:46:04,179][15401] Updated weights for policy 0, policy_version 465620 (0.0029) [2024-06-23 15:46:07,994][15401] Updated weights for policy 0, policy_version 465630 (0.0042) [2024-06-23 15:46:08,392][15132] Fps is (10 sec: 45863.4, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 7628881920. Throughput: 0: 43186.0. Samples: 7628952760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:08,393][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 15:46:12,180][15401] Updated weights for policy 0, policy_version 465640 (0.0046) [2024-06-23 15:46:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7629062144. Throughput: 0: 43001.8. Samples: 7629201400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:13,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 15:46:16,169][15401] Updated weights for policy 0, policy_version 465650 (0.0040) [2024-06-23 15:46:18,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 7629307904. Throughput: 0: 42954.3. Samples: 7629454540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 15:46:19,751][15401] Updated weights for policy 0, policy_version 465660 (0.0031) [2024-06-23 15:46:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7629520896. Throughput: 0: 42998.7. Samples: 7629594040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 15:46:23,733][15401] Updated weights for policy 0, policy_version 465670 (0.0029) [2024-06-23 15:46:27,307][15401] Updated weights for policy 0, policy_version 465680 (0.0028) [2024-06-23 15:46:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7629701120. Throughput: 0: 42906.7. Samples: 7629844520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 15:46:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 15:46:30,048][15349] Signal inference workers to stop experience collection... (113050 times) [2024-06-23 15:46:30,088][15401] InferenceWorker_p0-w0: stopping experience collection (113050 times) [2024-06-23 15:46:30,102][15349] Signal inference workers to resume experience collection... (113050 times) [2024-06-23 15:46:30,111][15401] InferenceWorker_p0-w0: resuming experience collection (113050 times) [2024-06-23 15:46:31,151][15401] Updated weights for policy 0, policy_version 465690 (0.0030) [2024-06-23 15:46:33,394][15132] Fps is (10 sec: 42578.0, 60 sec: 42868.0, 300 sec: 42764.3). Total num frames: 7629946880. Throughput: 0: 42947.8. Samples: 7630102200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:33,395][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 15:46:34,854][15401] Updated weights for policy 0, policy_version 465700 (0.0044) [2024-06-23 15:46:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7630159872. Throughput: 0: 42924.4. Samples: 7630237960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 15:46:38,913][15401] Updated weights for policy 0, policy_version 465710 (0.0028) [2024-06-23 15:46:42,603][15401] Updated weights for policy 0, policy_version 465720 (0.0038) [2024-06-23 15:46:43,389][15132] Fps is (10 sec: 40979.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7630356480. Throughput: 0: 42683.9. Samples: 7630484640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 15:46:47,079][15401] Updated weights for policy 0, policy_version 465730 (0.0046) [2024-06-23 15:46:48,394][15132] Fps is (10 sec: 44214.9, 60 sec: 42867.9, 300 sec: 42764.3). Total num frames: 7630602240. Throughput: 0: 42632.1. Samples: 7630737160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:48,395][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 15:46:50,526][15401] Updated weights for policy 0, policy_version 465740 (0.0028) [2024-06-23 15:46:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7630798848. Throughput: 0: 42620.7. Samples: 7630870580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 15:46:54,783][15401] Updated weights for policy 0, policy_version 465750 (0.0038) [2024-06-23 15:46:58,391][15132] Fps is (10 sec: 39335.3, 60 sec: 42870.4, 300 sec: 42764.8). Total num frames: 7630995456. Throughput: 0: 42838.2. Samples: 7631129180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:46:58,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 15:46:58,724][15401] Updated weights for policy 0, policy_version 465760 (0.0034) [2024-06-23 15:47:02,190][15401] Updated weights for policy 0, policy_version 465770 (0.0031) [2024-06-23 15:47:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7631224832. Throughput: 0: 42880.7. Samples: 7631384180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 15:47:06,188][15401] Updated weights for policy 0, policy_version 465780 (0.0031) [2024-06-23 15:47:08,389][15132] Fps is (10 sec: 44243.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 7631437824. Throughput: 0: 42719.6. Samples: 7631516420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 15:47:09,789][15401] Updated weights for policy 0, policy_version 465790 (0.0037) [2024-06-23 15:47:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 7631634432. Throughput: 0: 42779.3. Samples: 7631769700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:13,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 15:47:14,136][15401] Updated weights for policy 0, policy_version 465800 (0.0042) [2024-06-23 15:47:17,398][15401] Updated weights for policy 0, policy_version 465810 (0.0032) [2024-06-23 15:47:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7631863808. Throughput: 0: 42772.1. Samples: 7632026740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 15:47:21,624][15401] Updated weights for policy 0, policy_version 465820 (0.0040) [2024-06-23 15:47:23,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7632093184. Throughput: 0: 42698.6. Samples: 7632159400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 15:47:24,959][15401] Updated weights for policy 0, policy_version 465830 (0.0023) [2024-06-23 15:47:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7632273408. Throughput: 0: 42914.7. Samples: 7632415800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 15:47:29,110][15401] Updated weights for policy 0, policy_version 465840 (0.0025) [2024-06-23 15:47:32,690][15401] Updated weights for policy 0, policy_version 465850 (0.0025) [2024-06-23 15:47:33,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42600.2, 300 sec: 42820.5). Total num frames: 7632502784. Throughput: 0: 42851.5. Samples: 7632665360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:33,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 15:47:36,929][15401] Updated weights for policy 0, policy_version 465860 (0.0037) [2024-06-23 15:47:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7632732160. Throughput: 0: 42790.6. Samples: 7632796160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 15:47:40,365][15401] Updated weights for policy 0, policy_version 465870 (0.0026) [2024-06-23 15:47:43,390][15132] Fps is (10 sec: 42607.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 7632928768. Throughput: 0: 42741.7. Samples: 7633052500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 15:47:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465878_7632945152.pth... [2024-06-23 15:47:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465250_7622656000.pth [2024-06-23 15:47:44,700][15401] Updated weights for policy 0, policy_version 465880 (0.0033) [2024-06-23 15:47:48,031][15401] Updated weights for policy 0, policy_version 465890 (0.0036) [2024-06-23 15:47:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42328.9, 300 sec: 42820.6). Total num frames: 7633141760. Throughput: 0: 42759.2. Samples: 7633308340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 15:47:52,190][15401] Updated weights for policy 0, policy_version 465900 (0.0034) [2024-06-23 15:47:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7633354752. Throughput: 0: 42620.9. Samples: 7633434360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-23 15:47:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 15:47:55,865][15401] Updated weights for policy 0, policy_version 465910 (0.0031) [2024-06-23 15:47:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42872.5, 300 sec: 42765.0). Total num frames: 7633567744. Throughput: 0: 42619.8. Samples: 7633687480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:47:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 15:47:59,796][15401] Updated weights for policy 0, policy_version 465920 (0.0033) [2024-06-23 15:48:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7633780736. Throughput: 0: 42586.7. Samples: 7633943140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 15:48:03,525][15401] Updated weights for policy 0, policy_version 465930 (0.0037) [2024-06-23 15:48:07,704][15401] Updated weights for policy 0, policy_version 465940 (0.0035) [2024-06-23 15:48:08,345][15349] Signal inference workers to stop experience collection... (113100 times) [2024-06-23 15:48:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7633977344. Throughput: 0: 42385.4. Samples: 7634066740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 15:48:08,398][15401] InferenceWorker_p0-w0: stopping experience collection (113100 times) [2024-06-23 15:48:08,399][15349] Signal inference workers to resume experience collection... (113100 times) [2024-06-23 15:48:08,410][15401] InferenceWorker_p0-w0: resuming experience collection (113100 times) [2024-06-23 15:48:11,146][15401] Updated weights for policy 0, policy_version 465950 (0.0052) [2024-06-23 15:48:13,392][15132] Fps is (10 sec: 44227.3, 60 sec: 43144.8, 300 sec: 42820.3). Total num frames: 7634223104. Throughput: 0: 42489.9. Samples: 7634327940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:13,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 15:48:15,254][15401] Updated weights for policy 0, policy_version 465960 (0.0036) [2024-06-23 15:48:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7634419712. Throughput: 0: 42452.3. Samples: 7634575620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 15:48:18,995][15401] Updated weights for policy 0, policy_version 465970 (0.0038) [2024-06-23 15:48:23,186][15401] Updated weights for policy 0, policy_version 465980 (0.0024) [2024-06-23 15:48:23,389][15132] Fps is (10 sec: 39330.3, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 7634616320. Throughput: 0: 42447.1. Samples: 7634706280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 15:48:26,597][15401] Updated weights for policy 0, policy_version 465990 (0.0037) [2024-06-23 15:48:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7634845696. Throughput: 0: 42380.5. Samples: 7634959620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 15:48:30,904][15401] Updated weights for policy 0, policy_version 466000 (0.0025) [2024-06-23 15:48:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42325.2, 300 sec: 42709.1). Total num frames: 7635042304. Throughput: 0: 42233.3. Samples: 7635208940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:33,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 15:48:34,524][15401] Updated weights for policy 0, policy_version 466010 (0.0040) [2024-06-23 15:48:38,309][15401] Updated weights for policy 0, policy_version 466020 (0.0035) [2024-06-23 15:48:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7635271680. Throughput: 0: 42308.5. Samples: 7635338240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 15:48:41,996][15401] Updated weights for policy 0, policy_version 466030 (0.0036) [2024-06-23 15:48:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7635468288. Throughput: 0: 42364.9. Samples: 7635593900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 15:48:45,833][15401] Updated weights for policy 0, policy_version 466040 (0.0032) [2024-06-23 15:48:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7635681280. Throughput: 0: 42317.8. Samples: 7635847440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 15:48:50,125][15401] Updated weights for policy 0, policy_version 466050 (0.0040) [2024-06-23 15:48:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7635910656. Throughput: 0: 42459.0. Samples: 7635977400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 15:48:53,402][15401] Updated weights for policy 0, policy_version 466060 (0.0032) [2024-06-23 15:48:57,608][15401] Updated weights for policy 0, policy_version 466070 (0.0029) [2024-06-23 15:48:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.9). Total num frames: 7636107264. Throughput: 0: 42405.2. Samples: 7636236080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:48:58,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 15:49:01,035][15401] Updated weights for policy 0, policy_version 466080 (0.0037) [2024-06-23 15:49:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7636320256. Throughput: 0: 42652.0. Samples: 7636494960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:49:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 15:49:05,636][15401] Updated weights for policy 0, policy_version 466090 (0.0031) [2024-06-23 15:49:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7636549632. Throughput: 0: 42437.3. Samples: 7636615960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:49:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 15:49:08,923][15401] Updated weights for policy 0, policy_version 466100 (0.0027) [2024-06-23 15:49:13,343][15401] Updated weights for policy 0, policy_version 466110 (0.0027) [2024-06-23 15:49:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42053.8, 300 sec: 42709.5). Total num frames: 7636746240. Throughput: 0: 42554.8. Samples: 7636874580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:49:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 15:49:16,659][15401] Updated weights for policy 0, policy_version 466120 (0.0042) [2024-06-23 15:49:18,394][15132] Fps is (10 sec: 40940.3, 60 sec: 42321.9, 300 sec: 42708.8). Total num frames: 7636959232. Throughput: 0: 42523.5. Samples: 7637122600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 15:49:18,395][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 15:49:21,010][15401] Updated weights for policy 0, policy_version 466130 (0.0048) [2024-06-23 15:49:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7637172224. Throughput: 0: 42527.5. Samples: 7637251980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 15:49:24,621][15401] Updated weights for policy 0, policy_version 466140 (0.0029) [2024-06-23 15:49:28,389][15132] Fps is (10 sec: 40979.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7637368832. Throughput: 0: 42524.9. Samples: 7637507520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 15:49:28,930][15401] Updated weights for policy 0, policy_version 466150 (0.0025) [2024-06-23 15:49:32,168][15401] Updated weights for policy 0, policy_version 466160 (0.0029) [2024-06-23 15:49:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7637614592. Throughput: 0: 42357.8. Samples: 7637753540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 15:49:36,416][15401] Updated weights for policy 0, policy_version 466170 (0.0031) [2024-06-23 15:49:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.9). Total num frames: 7637811200. Throughput: 0: 42481.0. Samples: 7637889040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 15:49:39,225][15349] Signal inference workers to stop experience collection... (113150 times) [2024-06-23 15:49:39,225][15349] Signal inference workers to resume experience collection... (113150 times) [2024-06-23 15:49:39,276][15401] InferenceWorker_p0-w0: stopping experience collection (113150 times) [2024-06-23 15:49:39,280][15401] InferenceWorker_p0-w0: resuming experience collection (113150 times) [2024-06-23 15:49:39,794][15401] Updated weights for policy 0, policy_version 466180 (0.0034) [2024-06-23 15:49:43,393][15132] Fps is (10 sec: 39305.5, 60 sec: 42322.4, 300 sec: 42597.8). Total num frames: 7638007808. Throughput: 0: 42405.4. Samples: 7638144500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:43,394][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 15:49:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466187_7638007808.pth... [2024-06-23 15:49:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465563_7627784192.pth [2024-06-23 15:49:43,996][15401] Updated weights for policy 0, policy_version 466190 (0.0042) [2024-06-23 15:49:47,556][15401] Updated weights for policy 0, policy_version 466200 (0.0030) [2024-06-23 15:49:48,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7638253568. Throughput: 0: 42215.4. Samples: 7638394760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:48,393][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 15:49:51,482][15401] Updated weights for policy 0, policy_version 466210 (0.0030) [2024-06-23 15:49:53,389][15132] Fps is (10 sec: 44254.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7638450176. Throughput: 0: 42417.3. Samples: 7638524740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:53,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 15:49:55,576][15401] Updated weights for policy 0, policy_version 466220 (0.0023) [2024-06-23 15:49:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7638663168. Throughput: 0: 42333.7. Samples: 7638779600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:49:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 15:49:59,265][15401] Updated weights for policy 0, policy_version 466230 (0.0044) [2024-06-23 15:50:03,240][15401] Updated weights for policy 0, policy_version 466240 (0.0026) [2024-06-23 15:50:03,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42593.7, 300 sec: 42708.5). Total num frames: 7638876160. Throughput: 0: 42495.3. Samples: 7639034960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:03,397][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 15:50:06,950][15401] Updated weights for policy 0, policy_version 466250 (0.0029) [2024-06-23 15:50:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 7639089152. Throughput: 0: 42515.8. Samples: 7639165300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:08,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 15:50:10,869][15401] Updated weights for policy 0, policy_version 466260 (0.0034) [2024-06-23 15:50:13,390][15132] Fps is (10 sec: 40986.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7639285760. Throughput: 0: 42398.1. Samples: 7639415440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:13,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 15:50:14,475][15401] Updated weights for policy 0, policy_version 466270 (0.0036) [2024-06-23 15:50:18,392][15132] Fps is (10 sec: 40959.6, 60 sec: 42326.9, 300 sec: 42653.6). Total num frames: 7639498752. Throughput: 0: 42608.7. Samples: 7639671040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:18,393][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 15:50:18,685][15401] Updated weights for policy 0, policy_version 466280 (0.0041) [2024-06-23 15:50:22,631][15401] Updated weights for policy 0, policy_version 466290 (0.0034) [2024-06-23 15:50:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7639728128. Throughput: 0: 42472.0. Samples: 7639800280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 15:50:26,255][15401] Updated weights for policy 0, policy_version 466300 (0.0029) [2024-06-23 15:50:28,389][15132] Fps is (10 sec: 40970.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7639908352. Throughput: 0: 42422.6. Samples: 7640053340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:28,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-23 15:50:30,174][15401] Updated weights for policy 0, policy_version 466310 (0.0047) [2024-06-23 15:50:33,394][15132] Fps is (10 sec: 40942.9, 60 sec: 42049.4, 300 sec: 42597.8). Total num frames: 7640137728. Throughput: 0: 42541.6. Samples: 7640309200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:33,394][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 15:50:34,065][15401] Updated weights for policy 0, policy_version 466320 (0.0036) [2024-06-23 15:50:37,856][15401] Updated weights for policy 0, policy_version 466330 (0.0033) [2024-06-23 15:50:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7640367104. Throughput: 0: 42537.4. Samples: 7640438920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 15:50:41,596][15401] Updated weights for policy 0, policy_version 466340 (0.0036) [2024-06-23 15:50:43,389][15132] Fps is (10 sec: 42615.6, 60 sec: 42601.3, 300 sec: 42487.3). Total num frames: 7640563712. Throughput: 0: 42561.3. Samples: 7640694860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 15:50:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 15:50:45,392][15401] Updated weights for policy 0, policy_version 466350 (0.0030) [2024-06-23 15:50:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.1, 300 sec: 42542.9). Total num frames: 7640776704. Throughput: 0: 42443.1. Samples: 7640944620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:50:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 15:50:49,606][15401] Updated weights for policy 0, policy_version 466360 (0.0029) [2024-06-23 15:50:53,184][15401] Updated weights for policy 0, policy_version 466370 (0.0039) [2024-06-23 15:50:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7641006080. Throughput: 0: 42498.8. Samples: 7641077640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:50:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 15:50:57,266][15401] Updated weights for policy 0, policy_version 466380 (0.0033) [2024-06-23 15:50:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 7641186304. Throughput: 0: 42579.6. Samples: 7641331520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:50:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 15:51:00,704][15401] Updated weights for policy 0, policy_version 466390 (0.0032) [2024-06-23 15:51:03,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42598.4, 300 sec: 42542.3). Total num frames: 7641432064. Throughput: 0: 42558.1. Samples: 7641586320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:03,396][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 15:51:04,817][15401] Updated weights for policy 0, policy_version 466400 (0.0041) [2024-06-23 15:51:08,247][15401] Updated weights for policy 0, policy_version 466410 (0.0031) [2024-06-23 15:51:08,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 7641661440. Throughput: 0: 42675.1. Samples: 7641720660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 15:51:11,397][15349] Signal inference workers to stop experience collection... (113200 times) [2024-06-23 15:51:11,405][15349] Signal inference workers to resume experience collection... (113200 times) [2024-06-23 15:51:11,409][15401] InferenceWorker_p0-w0: stopping experience collection (113200 times) [2024-06-23 15:51:11,436][15401] InferenceWorker_p0-w0: resuming experience collection (113200 times) [2024-06-23 15:51:12,797][15401] Updated weights for policy 0, policy_version 466420 (0.0031) [2024-06-23 15:51:13,392][15132] Fps is (10 sec: 40976.6, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 7641841664. Throughput: 0: 42636.3. Samples: 7641972080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:13,392][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 15:51:16,412][15401] Updated weights for policy 0, policy_version 466430 (0.0033) [2024-06-23 15:51:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 7642071040. Throughput: 0: 42597.7. Samples: 7642225920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 15:51:20,684][15401] Updated weights for policy 0, policy_version 466440 (0.0032) [2024-06-23 15:51:23,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7642267648. Throughput: 0: 42574.7. Samples: 7642354780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 15:51:24,073][15401] Updated weights for policy 0, policy_version 466450 (0.0046) [2024-06-23 15:51:28,360][15401] Updated weights for policy 0, policy_version 466460 (0.0048) [2024-06-23 15:51:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42488.0). Total num frames: 7642480640. Throughput: 0: 42556.0. Samples: 7642609880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 15:51:31,606][15401] Updated weights for policy 0, policy_version 466470 (0.0044) [2024-06-23 15:51:33,392][15132] Fps is (10 sec: 45863.5, 60 sec: 43145.7, 300 sec: 42598.0). Total num frames: 7642726400. Throughput: 0: 42811.4. Samples: 7642871240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:33,401][15132] Avg episode reward: [(0, '0.303')] [2024-06-23 15:51:35,798][15401] Updated weights for policy 0, policy_version 466480 (0.0043) [2024-06-23 15:51:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7642906624. Throughput: 0: 42832.9. Samples: 7643005120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:38,390][15132] Avg episode reward: [(0, '0.126')] [2024-06-23 15:51:39,524][15401] Updated weights for policy 0, policy_version 466490 (0.0041) [2024-06-23 15:51:43,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42432.5). Total num frames: 7643119616. Throughput: 0: 42585.4. Samples: 7643247860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 15:51:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466499_7643119616.pth... [2024-06-23 15:51:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000465878_7632945152.pth [2024-06-23 15:51:43,626][15401] Updated weights for policy 0, policy_version 466500 (0.0039) [2024-06-23 15:51:47,292][15401] Updated weights for policy 0, policy_version 466510 (0.0039) [2024-06-23 15:51:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7643348992. Throughput: 0: 42667.0. Samples: 7643506060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 15:51:51,269][15401] Updated weights for policy 0, policy_version 466520 (0.0040) [2024-06-23 15:51:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 7643561984. Throughput: 0: 42714.6. Samples: 7643642820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 15:51:54,931][15401] Updated weights for policy 0, policy_version 466530 (0.0044) [2024-06-23 15:51:58,391][15132] Fps is (10 sec: 40954.3, 60 sec: 42870.5, 300 sec: 42487.1). Total num frames: 7643758592. Throughput: 0: 42679.6. Samples: 7643892620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:51:58,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 15:51:58,751][15401] Updated weights for policy 0, policy_version 466540 (0.0023) [2024-06-23 15:52:02,482][15401] Updated weights for policy 0, policy_version 466550 (0.0031) [2024-06-23 15:52:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42330.0, 300 sec: 42487.3). Total num frames: 7643971584. Throughput: 0: 42853.4. Samples: 7644154320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 15:52:06,767][15401] Updated weights for policy 0, policy_version 466560 (0.0037) [2024-06-23 15:52:08,389][15132] Fps is (10 sec: 44243.5, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 7644200960. Throughput: 0: 42820.5. Samples: 7644281700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 15:52:10,006][15401] Updated weights for policy 0, policy_version 466570 (0.0042) [2024-06-23 15:52:13,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 7644413952. Throughput: 0: 42619.1. Samples: 7644527740. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 15:52:14,219][15401] Updated weights for policy 0, policy_version 466580 (0.0034) [2024-06-23 15:52:18,089][15401] Updated weights for policy 0, policy_version 466590 (0.0040) [2024-06-23 15:52:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 7644610560. Throughput: 0: 42642.8. Samples: 7644790060. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 15:52:22,387][15401] Updated weights for policy 0, policy_version 466600 (0.0037) [2024-06-23 15:52:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7644839936. Throughput: 0: 42500.5. Samples: 7644917640. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 15:52:25,718][15401] Updated weights for policy 0, policy_version 466610 (0.0034) [2024-06-23 15:52:27,001][15349] Signal inference workers to stop experience collection... (113250 times) [2024-06-23 15:52:27,002][15349] Signal inference workers to resume experience collection... (113250 times) [2024-06-23 15:52:27,036][15401] InferenceWorker_p0-w0: stopping experience collection (113250 times) [2024-06-23 15:52:27,036][15401] InferenceWorker_p0-w0: resuming experience collection (113250 times) [2024-06-23 15:52:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 7645052928. Throughput: 0: 42639.5. Samples: 7645166640. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 15:52:30,035][15401] Updated weights for policy 0, policy_version 466620 (0.0037) [2024-06-23 15:52:33,280][15401] Updated weights for policy 0, policy_version 466630 (0.0031) [2024-06-23 15:52:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42325.3, 300 sec: 42487.0). Total num frames: 7645265920. Throughput: 0: 42686.1. Samples: 7645427040. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:33,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:52:37,602][15401] Updated weights for policy 0, policy_version 466640 (0.0041) [2024-06-23 15:52:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7645446144. Throughput: 0: 42293.9. Samples: 7645546040. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 15:52:40,767][15401] Updated weights for policy 0, policy_version 466650 (0.0035) [2024-06-23 15:52:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7645691904. Throughput: 0: 42471.5. Samples: 7645803780. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:43,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 15:52:45,425][15401] Updated weights for policy 0, policy_version 466660 (0.0040) [2024-06-23 15:52:48,228][15401] Updated weights for policy 0, policy_version 466670 (0.0039) [2024-06-23 15:52:48,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7645921280. Throughput: 0: 42288.7. Samples: 7646057320. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 15:52:52,920][15401] Updated weights for policy 0, policy_version 466680 (0.0032) [2024-06-23 15:52:53,390][15132] Fps is (10 sec: 39317.5, 60 sec: 42051.6, 300 sec: 42431.6). Total num frames: 7646085120. Throughput: 0: 42350.5. Samples: 7646187520. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:53,391][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 15:52:55,714][15401] Updated weights for policy 0, policy_version 466690 (0.0028) [2024-06-23 15:52:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42872.5, 300 sec: 42542.9). Total num frames: 7646330880. Throughput: 0: 42603.2. Samples: 7646444880. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:52:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 15:53:00,394][15401] Updated weights for policy 0, policy_version 466700 (0.0043) [2024-06-23 15:53:03,389][15132] Fps is (10 sec: 45879.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7646543872. Throughput: 0: 42509.3. Samples: 7646702980. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 15:53:03,758][15401] Updated weights for policy 0, policy_version 466710 (0.0030) [2024-06-23 15:53:08,040][15401] Updated weights for policy 0, policy_version 466720 (0.0030) [2024-06-23 15:53:08,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42432.1). Total num frames: 7646740480. Throughput: 0: 42360.9. Samples: 7646823880. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 15:53:11,519][15401] Updated weights for policy 0, policy_version 466730 (0.0041) [2024-06-23 15:53:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7646986240. Throughput: 0: 42553.8. Samples: 7647081560. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 15:53:16,013][15401] Updated weights for policy 0, policy_version 466740 (0.0038) [2024-06-23 15:53:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7647182848. Throughput: 0: 42513.9. Samples: 7647340060. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 15:53:18,967][15401] Updated weights for policy 0, policy_version 466750 (0.0029) [2024-06-23 15:53:23,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 7647363072. Throughput: 0: 42615.9. Samples: 7647463760. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:23,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-23 15:53:23,552][15401] Updated weights for policy 0, policy_version 466760 (0.0031) [2024-06-23 15:53:26,679][15401] Updated weights for policy 0, policy_version 466770 (0.0033) [2024-06-23 15:53:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7647625216. Throughput: 0: 42588.5. Samples: 7647720260. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 15:53:31,234][15401] Updated weights for policy 0, policy_version 466780 (0.0041) [2024-06-23 15:53:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 7647805440. Throughput: 0: 42821.9. Samples: 7647984300. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-23 15:53:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 15:53:34,671][15401] Updated weights for policy 0, policy_version 466790 (0.0045) [2024-06-23 15:53:38,391][15132] Fps is (10 sec: 39314.5, 60 sec: 42870.2, 300 sec: 42542.6). Total num frames: 7648018432. Throughput: 0: 42510.0. Samples: 7648100500. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:53:38,392][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 15:53:39,202][15401] Updated weights for policy 0, policy_version 466800 (0.0038) [2024-06-23 15:53:42,271][15401] Updated weights for policy 0, policy_version 466810 (0.0034) [2024-06-23 15:53:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7648280576. Throughput: 0: 42642.6. Samples: 7648363800. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:53:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 15:53:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466814_7648280576.pth... [2024-06-23 15:53:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466187_7638007808.pth [2024-06-23 15:53:46,952][15401] Updated weights for policy 0, policy_version 466820 (0.0051) [2024-06-23 15:53:48,390][15132] Fps is (10 sec: 42605.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7648444416. Throughput: 0: 42607.0. Samples: 7648620300. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:53:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 15:53:49,924][15401] Updated weights for policy 0, policy_version 466830 (0.0039) [2024-06-23 15:53:50,662][15349] Signal inference workers to stop experience collection... (113300 times) [2024-06-23 15:53:50,662][15349] Signal inference workers to resume experience collection... (113300 times) [2024-06-23 15:53:50,691][15401] InferenceWorker_p0-w0: stopping experience collection (113300 times) [2024-06-23 15:53:50,691][15401] InferenceWorker_p0-w0: resuming experience collection (113300 times) [2024-06-23 15:53:53,396][15132] Fps is (10 sec: 37658.7, 60 sec: 42867.6, 300 sec: 42541.9). Total num frames: 7648657408. Throughput: 0: 42501.0. Samples: 7648736700. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:53:53,396][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 15:53:54,693][15401] Updated weights for policy 0, policy_version 466840 (0.0044) [2024-06-23 15:53:57,597][15401] Updated weights for policy 0, policy_version 466850 (0.0038) [2024-06-23 15:53:58,394][15132] Fps is (10 sec: 45856.2, 60 sec: 42868.3, 300 sec: 42653.3). Total num frames: 7648903168. Throughput: 0: 42536.8. Samples: 7648995900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:53:58,394][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 15:54:02,344][15401] Updated weights for policy 0, policy_version 466860 (0.0040) [2024-06-23 15:54:03,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7649067008. Throughput: 0: 42707.1. Samples: 7649261880. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:03,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-23 15:54:05,234][15401] Updated weights for policy 0, policy_version 466870 (0.0038) [2024-06-23 15:54:08,389][15132] Fps is (10 sec: 39338.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7649296384. Throughput: 0: 42587.2. Samples: 7649380180. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 15:54:09,982][15401] Updated weights for policy 0, policy_version 466880 (0.0042) [2024-06-23 15:54:13,002][15401] Updated weights for policy 0, policy_version 466890 (0.0026) [2024-06-23 15:54:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42654.6). Total num frames: 7649542144. Throughput: 0: 42614.2. Samples: 7649637900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 15:54:17,878][15401] Updated weights for policy 0, policy_version 466900 (0.0025) [2024-06-23 15:54:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 7649689600. Throughput: 0: 42635.1. Samples: 7649902880. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 15:54:20,450][15401] Updated weights for policy 0, policy_version 466910 (0.0044) [2024-06-23 15:54:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7649951744. Throughput: 0: 42642.9. Samples: 7650019360. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 15:54:25,320][15401] Updated weights for policy 0, policy_version 466920 (0.0033) [2024-06-23 15:54:28,226][15401] Updated weights for policy 0, policy_version 466930 (0.0034) [2024-06-23 15:54:28,390][15132] Fps is (10 sec: 49151.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7650181120. Throughput: 0: 42745.6. Samples: 7650287360. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 15:54:32,869][15401] Updated weights for policy 0, policy_version 466940 (0.0033) [2024-06-23 15:54:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7650344960. Throughput: 0: 42746.8. Samples: 7650543900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 15:54:35,875][15401] Updated weights for policy 0, policy_version 466950 (0.0037) [2024-06-23 15:54:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43145.7, 300 sec: 42710.0). Total num frames: 7650607104. Throughput: 0: 42872.3. Samples: 7650665680. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 15:54:40,351][15401] Updated weights for policy 0, policy_version 466960 (0.0037) [2024-06-23 15:54:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 7650820096. Throughput: 0: 43019.7. Samples: 7650931600. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 15:54:43,589][15401] Updated weights for policy 0, policy_version 466970 (0.0045) [2024-06-23 15:54:48,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 7650983936. Throughput: 0: 42673.6. Samples: 7651182300. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:48,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 15:54:48,553][15401] Updated weights for policy 0, policy_version 466980 (0.0040) [2024-06-23 15:54:51,292][15401] Updated weights for policy 0, policy_version 466990 (0.0042) [2024-06-23 15:54:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43149.2, 300 sec: 42653.9). Total num frames: 7651246080. Throughput: 0: 42712.0. Samples: 7651302220. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 15:54:56,135][15401] Updated weights for policy 0, policy_version 467000 (0.0030) [2024-06-23 15:54:58,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42055.3, 300 sec: 42543.8). Total num frames: 7651426304. Throughput: 0: 42819.5. Samples: 7651564780. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-06-23 15:54:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 15:54:59,101][15401] Updated weights for policy 0, policy_version 467010 (0.0028) [2024-06-23 15:54:59,979][15349] Signal inference workers to stop experience collection... (113350 times) [2024-06-23 15:54:59,984][15349] Signal inference workers to resume experience collection... (113350 times) [2024-06-23 15:54:59,999][15401] InferenceWorker_p0-w0: stopping experience collection (113350 times) [2024-06-23 15:55:00,037][15401] InferenceWorker_p0-w0: resuming experience collection (113350 times) [2024-06-23 15:55:03,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 7651622912. Throughput: 0: 42503.5. Samples: 7651815540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 15:55:04,133][15401] Updated weights for policy 0, policy_version 467020 (0.0039) [2024-06-23 15:55:06,827][15401] Updated weights for policy 0, policy_version 467030 (0.0031) [2024-06-23 15:55:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7651885056. Throughput: 0: 42732.1. Samples: 7651942300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 15:55:11,608][15401] Updated weights for policy 0, policy_version 467040 (0.0044) [2024-06-23 15:55:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 7652065280. Throughput: 0: 42579.7. Samples: 7652203440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 15:55:14,775][15401] Updated weights for policy 0, policy_version 467050 (0.0036) [2024-06-23 15:55:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 7652278272. Throughput: 0: 42358.6. Samples: 7652450040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 15:55:19,157][15401] Updated weights for policy 0, policy_version 467060 (0.0035) [2024-06-23 15:55:22,236][15401] Updated weights for policy 0, policy_version 467070 (0.0041) [2024-06-23 15:55:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7652524032. Throughput: 0: 42549.8. Samples: 7652580420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:23,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 15:55:26,684][15401] Updated weights for policy 0, policy_version 467080 (0.0043) [2024-06-23 15:55:28,396][15132] Fps is (10 sec: 40933.5, 60 sec: 41774.8, 300 sec: 42542.5). Total num frames: 7652687872. Throughput: 0: 42352.6. Samples: 7652837740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:28,396][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 15:55:29,909][15401] Updated weights for policy 0, policy_version 467090 (0.0032) [2024-06-23 15:55:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7652933632. Throughput: 0: 42341.0. Samples: 7653087540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 15:55:34,542][15401] Updated weights for policy 0, policy_version 467100 (0.0035) [2024-06-23 15:55:37,464][15401] Updated weights for policy 0, policy_version 467110 (0.0039) [2024-06-23 15:55:38,389][15132] Fps is (10 sec: 45904.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7653146624. Throughput: 0: 42738.6. Samples: 7653225460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 15:55:42,571][15401] Updated weights for policy 0, policy_version 467120 (0.0027) [2024-06-23 15:55:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 7653326848. Throughput: 0: 42526.6. Samples: 7653478480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 15:55:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467122_7653326848.pth... [2024-06-23 15:55:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466499_7643119616.pth [2024-06-23 15:55:45,041][15401] Updated weights for policy 0, policy_version 467130 (0.0030) [2024-06-23 15:55:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 7653572608. Throughput: 0: 42499.6. Samples: 7653728020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 15:55:50,110][15401] Updated weights for policy 0, policy_version 467140 (0.0044) [2024-06-23 15:55:52,733][15401] Updated weights for policy 0, policy_version 467150 (0.0035) [2024-06-23 15:55:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7653785600. Throughput: 0: 42731.2. Samples: 7653865200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 15:55:57,590][15401] Updated weights for policy 0, policy_version 467160 (0.0034) [2024-06-23 15:55:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42488.3). Total num frames: 7653965824. Throughput: 0: 42444.4. Samples: 7654113440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:55:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 15:55:59,121][15349] Signal inference workers to stop experience collection... (113400 times) [2024-06-23 15:55:59,121][15349] Signal inference workers to resume experience collection... (113400 times) [2024-06-23 15:55:59,158][15401] InferenceWorker_p0-w0: stopping experience collection (113400 times) [2024-06-23 15:55:59,158][15401] InferenceWorker_p0-w0: resuming experience collection (113400 times) [2024-06-23 15:56:00,395][15401] Updated weights for policy 0, policy_version 467170 (0.0028) [2024-06-23 15:56:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 7654211584. Throughput: 0: 42657.3. Samples: 7654369620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:56:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 15:56:05,067][15401] Updated weights for policy 0, policy_version 467180 (0.0029) [2024-06-23 15:56:08,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 7654424576. Throughput: 0: 42733.3. Samples: 7654503520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:56:08,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 15:56:08,603][15401] Updated weights for policy 0, policy_version 467190 (0.0034) [2024-06-23 15:56:12,718][15401] Updated weights for policy 0, policy_version 467200 (0.0036) [2024-06-23 15:56:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 7654621184. Throughput: 0: 42601.2. Samples: 7654754520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:56:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:56:16,096][15401] Updated weights for policy 0, policy_version 467210 (0.0033) [2024-06-23 15:56:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7654850560. Throughput: 0: 42725.4. Samples: 7655010180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:56:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:56:20,424][15401] Updated weights for policy 0, policy_version 467220 (0.0037) [2024-06-23 15:56:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7655079936. Throughput: 0: 42695.9. Samples: 7655146780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-23 15:56:23,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 15:56:23,542][15401] Updated weights for policy 0, policy_version 467230 (0.0041) [2024-06-23 15:56:27,948][15401] Updated weights for policy 0, policy_version 467240 (0.0033) [2024-06-23 15:56:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43147.4, 300 sec: 42542.9). Total num frames: 7655276544. Throughput: 0: 42727.9. Samples: 7655401340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:28,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 15:56:31,314][15401] Updated weights for policy 0, policy_version 467250 (0.0027) [2024-06-23 15:56:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7655505920. Throughput: 0: 42866.1. Samples: 7655657000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:33,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 15:56:35,487][15401] Updated weights for policy 0, policy_version 467260 (0.0049) [2024-06-23 15:56:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7655718912. Throughput: 0: 42685.2. Samples: 7655786040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:38,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 15:56:38,906][15401] Updated weights for policy 0, policy_version 467270 (0.0030) [2024-06-23 15:56:43,334][15401] Updated weights for policy 0, policy_version 467280 (0.0037) [2024-06-23 15:56:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42598.0). Total num frames: 7655915520. Throughput: 0: 42911.9. Samples: 7656044580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:43,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 15:56:46,640][15401] Updated weights for policy 0, policy_version 467290 (0.0025) [2024-06-23 15:56:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 7656161280. Throughput: 0: 42761.2. Samples: 7656293980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:48,393][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 15:56:50,943][15401] Updated weights for policy 0, policy_version 467300 (0.0033) [2024-06-23 15:56:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 7656341504. Throughput: 0: 42862.8. Samples: 7656432240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 15:56:54,439][15401] Updated weights for policy 0, policy_version 467310 (0.0031) [2024-06-23 15:56:58,389][15132] Fps is (10 sec: 39331.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7656554496. Throughput: 0: 43035.2. Samples: 7656691100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:56:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 15:56:58,515][15401] Updated weights for policy 0, policy_version 467320 (0.0028) [2024-06-23 15:57:01,888][15401] Updated weights for policy 0, policy_version 467330 (0.0043) [2024-06-23 15:57:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7656800256. Throughput: 0: 42952.9. Samples: 7656943060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:03,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 15:57:06,193][15401] Updated weights for policy 0, policy_version 467340 (0.0035) [2024-06-23 15:57:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 7656980480. Throughput: 0: 42943.3. Samples: 7657079220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 15:57:09,427][15401] Updated weights for policy 0, policy_version 467350 (0.0021) [2024-06-23 15:57:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7657193472. Throughput: 0: 43018.7. Samples: 7657337080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 15:57:13,884][15401] Updated weights for policy 0, policy_version 467360 (0.0029) [2024-06-23 15:57:17,364][15401] Updated weights for policy 0, policy_version 467370 (0.0037) [2024-06-23 15:57:18,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 7657455616. Throughput: 0: 42831.6. Samples: 7657584520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:18,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 15:57:21,547][15401] Updated weights for policy 0, policy_version 467380 (0.0033) [2024-06-23 15:57:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7657635840. Throughput: 0: 43058.3. Samples: 7657723660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 15:57:25,008][15401] Updated weights for policy 0, policy_version 467390 (0.0037) [2024-06-23 15:57:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42873.2, 300 sec: 42654.3). Total num frames: 7657848832. Throughput: 0: 42948.2. Samples: 7657977140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 15:57:29,009][15401] Updated weights for policy 0, policy_version 467400 (0.0040) [2024-06-23 15:57:32,611][15401] Updated weights for policy 0, policy_version 467410 (0.0040) [2024-06-23 15:57:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7658094592. Throughput: 0: 43025.5. Samples: 7658230020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 15:57:36,626][15401] Updated weights for policy 0, policy_version 467420 (0.0020) [2024-06-23 15:57:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7658274816. Throughput: 0: 42989.8. Samples: 7658366780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 15:57:40,342][15401] Updated weights for policy 0, policy_version 467430 (0.0034) [2024-06-23 15:57:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 7658504192. Throughput: 0: 42879.9. Samples: 7658620700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 15:57:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467438_7658504192.pth... [2024-06-23 15:57:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000466814_7648280576.pth [2024-06-23 15:57:44,350][15401] Updated weights for policy 0, policy_version 467440 (0.0039) [2024-06-23 15:57:46,786][15349] Signal inference workers to stop experience collection... (113450 times) [2024-06-23 15:57:46,788][15349] Signal inference workers to resume experience collection... (113450 times) [2024-06-23 15:57:46,815][15401] InferenceWorker_p0-w0: stopping experience collection (113450 times) [2024-06-23 15:57:46,816][15401] InferenceWorker_p0-w0: resuming experience collection (113450 times) [2024-06-23 15:57:47,815][15401] Updated weights for policy 0, policy_version 467450 (0.0037) [2024-06-23 15:57:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42820.7). Total num frames: 7658717184. Throughput: 0: 43143.0. Samples: 7658884500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:48,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 15:57:51,909][15401] Updated weights for policy 0, policy_version 467460 (0.0030) [2024-06-23 15:57:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7658930176. Throughput: 0: 42974.7. Samples: 7659013080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:53,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 15:57:55,465][15401] Updated weights for policy 0, policy_version 467470 (0.0029) [2024-06-23 15:57:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7659143168. Throughput: 0: 42765.7. Samples: 7659261540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:57:58,390][15132] Avg episode reward: [(0, '0.147')] [2024-06-23 15:57:59,381][15401] Updated weights for policy 0, policy_version 467480 (0.0037) [2024-06-23 15:58:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7659339776. Throughput: 0: 43046.0. Samples: 7659521480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 15:58:03,438][15401] Updated weights for policy 0, policy_version 467490 (0.0036) [2024-06-23 15:58:07,283][15401] Updated weights for policy 0, policy_version 467500 (0.0043) [2024-06-23 15:58:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7659569152. Throughput: 0: 42692.6. Samples: 7659644820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 15:58:11,261][15401] Updated weights for policy 0, policy_version 467510 (0.0038) [2024-06-23 15:58:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7659782144. Throughput: 0: 42779.9. Samples: 7659902240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 15:58:14,889][15401] Updated weights for policy 0, policy_version 467520 (0.0038) [2024-06-23 15:58:18,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 7659978752. Throughput: 0: 42979.1. Samples: 7660164080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:18,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 15:58:18,733][15401] Updated weights for policy 0, policy_version 467530 (0.0041) [2024-06-23 15:58:22,550][15401] Updated weights for policy 0, policy_version 467540 (0.0042) [2024-06-23 15:58:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7660208128. Throughput: 0: 42724.1. Samples: 7660289360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 15:58:26,164][15401] Updated weights for policy 0, policy_version 467550 (0.0023) [2024-06-23 15:58:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7660421120. Throughput: 0: 42620.9. Samples: 7660538640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 15:58:30,104][15401] Updated weights for policy 0, policy_version 467560 (0.0029) [2024-06-23 15:58:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.7). Total num frames: 7660617728. Throughput: 0: 42744.9. Samples: 7660808020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 15:58:33,902][15401] Updated weights for policy 0, policy_version 467570 (0.0027) [2024-06-23 15:58:37,675][15401] Updated weights for policy 0, policy_version 467580 (0.0022) [2024-06-23 15:58:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7660847104. Throughput: 0: 42578.7. Samples: 7660929120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 15:58:41,545][15401] Updated weights for policy 0, policy_version 467590 (0.0027) [2024-06-23 15:58:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7661076480. Throughput: 0: 42802.8. Samples: 7661187660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 15:58:45,424][15401] Updated weights for policy 0, policy_version 467600 (0.0033) [2024-06-23 15:58:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 7661256704. Throughput: 0: 42675.9. Samples: 7661441900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 15:58:49,434][15401] Updated weights for policy 0, policy_version 467610 (0.0029) [2024-06-23 15:58:53,021][15401] Updated weights for policy 0, policy_version 467620 (0.0039) [2024-06-23 15:58:53,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42709.7). Total num frames: 7661502464. Throughput: 0: 42620.2. Samples: 7661562840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:53,393][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 15:58:57,194][15401] Updated weights for policy 0, policy_version 467630 (0.0025) [2024-06-23 15:58:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7661715456. Throughput: 0: 42764.5. Samples: 7661826640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:58:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 15:59:00,574][15401] Updated weights for policy 0, policy_version 467640 (0.0022) [2024-06-23 15:59:02,452][15349] Signal inference workers to stop experience collection... (113500 times) [2024-06-23 15:59:02,452][15349] Signal inference workers to resume experience collection... (113500 times) [2024-06-23 15:59:02,465][15401] InferenceWorker_p0-w0: stopping experience collection (113500 times) [2024-06-23 15:59:02,472][15401] InferenceWorker_p0-w0: resuming experience collection (113500 times) [2024-06-23 15:59:03,392][15132] Fps is (10 sec: 40960.5, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7661912064. Throughput: 0: 42484.5. Samples: 7662075980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:59:03,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 15:59:04,867][15401] Updated weights for policy 0, policy_version 467650 (0.0030) [2024-06-23 15:59:08,310][15401] Updated weights for policy 0, policy_version 467660 (0.0040) [2024-06-23 15:59:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 7662141440. Throughput: 0: 42519.9. Samples: 7662202860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:59:08,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 15:59:12,659][15401] Updated weights for policy 0, policy_version 467670 (0.0036) [2024-06-23 15:59:13,389][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7662338048. Throughput: 0: 42734.7. Samples: 7662461700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 15:59:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 15:59:15,968][15401] Updated weights for policy 0, policy_version 467680 (0.0037) [2024-06-23 15:59:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7662551040. Throughput: 0: 42350.7. Samples: 7662713800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 15:59:20,269][15401] Updated weights for policy 0, policy_version 467690 (0.0038) [2024-06-23 15:59:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7662780416. Throughput: 0: 42478.6. Samples: 7662840660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 15:59:23,591][15401] Updated weights for policy 0, policy_version 467700 (0.0039) [2024-06-23 15:59:27,965][15401] Updated weights for policy 0, policy_version 467710 (0.0032) [2024-06-23 15:59:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 7662960640. Throughput: 0: 42596.3. Samples: 7663104600. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:28,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 15:59:31,726][15401] Updated weights for policy 0, policy_version 467720 (0.0036) [2024-06-23 15:59:33,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42870.2, 300 sec: 42653.7). Total num frames: 7663190016. Throughput: 0: 42408.5. Samples: 7663350360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:33,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 15:59:35,751][15401] Updated weights for policy 0, policy_version 467730 (0.0028) [2024-06-23 15:59:38,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7663403008. Throughput: 0: 42634.8. Samples: 7663481300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 15:59:39,369][15401] Updated weights for policy 0, policy_version 467740 (0.0052) [2024-06-23 15:59:43,389][15132] Fps is (10 sec: 39329.2, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 7663583232. Throughput: 0: 42273.5. Samples: 7663728940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 15:59:43,566][15401] Updated weights for policy 0, policy_version 467750 (0.0036) [2024-06-23 15:59:43,569][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467750_7663616000.pth... [2024-06-23 15:59:43,635][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467122_7653326848.pth [2024-06-23 15:59:47,015][15401] Updated weights for policy 0, policy_version 467760 (0.0036) [2024-06-23 15:59:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7663828992. Throughput: 0: 42414.2. Samples: 7663984520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 15:59:51,611][15401] Updated weights for policy 0, policy_version 467770 (0.0038) [2024-06-23 15:59:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 7664041984. Throughput: 0: 42501.0. Samples: 7664115300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 15:59:54,760][15401] Updated weights for policy 0, policy_version 467780 (0.0041) [2024-06-23 15:59:58,392][15132] Fps is (10 sec: 37674.2, 60 sec: 41504.5, 300 sec: 42653.6). Total num frames: 7664205824. Throughput: 0: 42266.2. Samples: 7664363780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 15:59:58,393][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 15:59:59,222][15401] Updated weights for policy 0, policy_version 467790 (0.0030) [2024-06-23 16:00:02,481][15401] Updated weights for policy 0, policy_version 467800 (0.0047) [2024-06-23 16:00:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 7664451584. Throughput: 0: 42233.8. Samples: 7664614320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:00:06,777][15401] Updated weights for policy 0, policy_version 467810 (0.0032) [2024-06-23 16:00:08,392][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42709.1). Total num frames: 7664664576. Throughput: 0: 42288.8. Samples: 7664743760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:08,393][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 16:00:10,560][15401] Updated weights for policy 0, policy_version 467820 (0.0044) [2024-06-23 16:00:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7664861184. Throughput: 0: 42072.1. Samples: 7664997740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 16:00:14,667][15401] Updated weights for policy 0, policy_version 467830 (0.0031) [2024-06-23 16:00:16,680][15349] Signal inference workers to stop experience collection... (113550 times) [2024-06-23 16:00:16,707][15401] InferenceWorker_p0-w0: stopping experience collection (113550 times) [2024-06-23 16:00:16,743][15349] Signal inference workers to resume experience collection... (113550 times) [2024-06-23 16:00:16,744][15401] InferenceWorker_p0-w0: resuming experience collection (113550 times) [2024-06-23 16:00:18,041][15401] Updated weights for policy 0, policy_version 467840 (0.0036) [2024-06-23 16:00:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7665090560. Throughput: 0: 42187.6. Samples: 7665248720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 16:00:22,294][15401] Updated weights for policy 0, policy_version 467850 (0.0037) [2024-06-23 16:00:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42765.9). Total num frames: 7665303552. Throughput: 0: 42232.3. Samples: 7665381760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 16:00:25,798][15401] Updated weights for policy 0, policy_version 467860 (0.0039) [2024-06-23 16:00:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 7665500160. Throughput: 0: 42360.0. Samples: 7665635140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:28,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 16:00:29,857][15401] Updated weights for policy 0, policy_version 467870 (0.0036) [2024-06-23 16:00:33,350][15401] Updated weights for policy 0, policy_version 467880 (0.0038) [2024-06-23 16:00:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 7665745920. Throughput: 0: 42200.0. Samples: 7665883520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:33,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 16:00:37,457][15401] Updated weights for policy 0, policy_version 467890 (0.0036) [2024-06-23 16:00:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7665926144. Throughput: 0: 42254.2. Samples: 7666016740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 16:00:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 16:00:40,891][15401] Updated weights for policy 0, policy_version 467900 (0.0040) [2024-06-23 16:00:43,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42323.5, 300 sec: 42542.5). Total num frames: 7666122752. Throughput: 0: 42360.0. Samples: 7666269980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:00:43,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:00:45,037][15401] Updated weights for policy 0, policy_version 467910 (0.0040) [2024-06-23 16:00:48,375][15401] Updated weights for policy 0, policy_version 467920 (0.0033) [2024-06-23 16:00:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7666401280. Throughput: 0: 42388.9. Samples: 7666521820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:00:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 16:00:52,713][15401] Updated weights for policy 0, policy_version 467930 (0.0044) [2024-06-23 16:00:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7666565120. Throughput: 0: 42557.0. Samples: 7666658720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:00:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 16:00:56,082][15401] Updated weights for policy 0, policy_version 467940 (0.0025) [2024-06-23 16:00:58,396][15132] Fps is (10 sec: 37658.8, 60 sec: 42868.6, 300 sec: 42597.5). Total num frames: 7666778112. Throughput: 0: 42476.5. Samples: 7666909460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:00:58,397][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 16:01:00,297][15401] Updated weights for policy 0, policy_version 467950 (0.0028) [2024-06-23 16:01:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7667023872. Throughput: 0: 42574.7. Samples: 7667164580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 16:01:03,924][15401] Updated weights for policy 0, policy_version 467960 (0.0032) [2024-06-23 16:01:08,389][15132] Fps is (10 sec: 42626.4, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 7667204096. Throughput: 0: 42581.0. Samples: 7667297900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 16:01:08,392][15401] Updated weights for policy 0, policy_version 467970 (0.0024) [2024-06-23 16:01:11,619][15401] Updated weights for policy 0, policy_version 467980 (0.0029) [2024-06-23 16:01:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7667433472. Throughput: 0: 42534.6. Samples: 7667549200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 16:01:16,025][15401] Updated weights for policy 0, policy_version 467990 (0.0030) [2024-06-23 16:01:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7667646464. Throughput: 0: 42663.7. Samples: 7667803380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 16:01:19,798][15401] Updated weights for policy 0, policy_version 468000 (0.0035) [2024-06-23 16:01:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 7667843072. Throughput: 0: 42599.1. Samples: 7667933700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 16:01:23,686][15401] Updated weights for policy 0, policy_version 468010 (0.0039) [2024-06-23 16:01:27,291][15401] Updated weights for policy 0, policy_version 468020 (0.0030) [2024-06-23 16:01:28,391][15132] Fps is (10 sec: 40954.5, 60 sec: 42597.5, 300 sec: 42542.7). Total num frames: 7668056064. Throughput: 0: 42732.3. Samples: 7668192880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:28,391][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 16:01:31,277][15401] Updated weights for policy 0, policy_version 468030 (0.0037) [2024-06-23 16:01:33,391][15132] Fps is (10 sec: 44228.6, 60 sec: 42324.1, 300 sec: 42598.1). Total num frames: 7668285440. Throughput: 0: 42767.2. Samples: 7668446420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:33,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 16:01:34,852][15401] Updated weights for policy 0, policy_version 468040 (0.0033) [2024-06-23 16:01:38,389][15132] Fps is (10 sec: 42603.8, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 7668482048. Throughput: 0: 42641.8. Samples: 7668577600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:38,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 16:01:39,036][15401] Updated weights for policy 0, policy_version 468050 (0.0037) [2024-06-23 16:01:41,557][15349] Signal inference workers to stop experience collection... (113600 times) [2024-06-23 16:01:41,606][15401] InferenceWorker_p0-w0: stopping experience collection (113600 times) [2024-06-23 16:01:41,618][15349] Signal inference workers to resume experience collection... (113600 times) [2024-06-23 16:01:41,620][15401] InferenceWorker_p0-w0: resuming experience collection (113600 times) [2024-06-23 16:01:42,816][15401] Updated weights for policy 0, policy_version 468060 (0.0033) [2024-06-23 16:01:43,389][15132] Fps is (10 sec: 40967.7, 60 sec: 42873.3, 300 sec: 42487.7). Total num frames: 7668695040. Throughput: 0: 42670.7. Samples: 7668829360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 16:01:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468061_7668711424.pth... [2024-06-23 16:01:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467438_7658504192.pth [2024-06-23 16:01:46,707][15401] Updated weights for policy 0, policy_version 468070 (0.0039) [2024-06-23 16:01:48,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42051.4, 300 sec: 42653.7). Total num frames: 7668924416. Throughput: 0: 42672.1. Samples: 7669084880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:48,391][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 16:01:50,579][15401] Updated weights for policy 0, policy_version 468080 (0.0034) [2024-06-23 16:01:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7669121024. Throughput: 0: 42725.7. Samples: 7669220560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:53,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 16:01:54,302][15401] Updated weights for policy 0, policy_version 468090 (0.0036) [2024-06-23 16:01:58,341][15401] Updated weights for policy 0, policy_version 468100 (0.0035) [2024-06-23 16:01:58,392][15132] Fps is (10 sec: 42593.4, 60 sec: 42874.3, 300 sec: 42542.5). Total num frames: 7669350400. Throughput: 0: 42679.5. Samples: 7669469880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:01:58,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:02:02,007][15401] Updated weights for policy 0, policy_version 468110 (0.0032) [2024-06-23 16:02:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7669579776. Throughput: 0: 42647.5. Samples: 7669722520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 16:02:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 16:02:06,105][15401] Updated weights for policy 0, policy_version 468120 (0.0040) [2024-06-23 16:02:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7669776384. Throughput: 0: 42795.9. Samples: 7669859520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 16:02:09,794][15401] Updated weights for policy 0, policy_version 468130 (0.0045) [2024-06-23 16:02:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 7669989376. Throughput: 0: 42777.5. Samples: 7670117820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 16:02:13,893][15401] Updated weights for policy 0, policy_version 468140 (0.0030) [2024-06-23 16:02:17,428][15401] Updated weights for policy 0, policy_version 468150 (0.0032) [2024-06-23 16:02:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 7670218752. Throughput: 0: 42707.4. Samples: 7670368280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:18,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 16:02:21,452][15401] Updated weights for policy 0, policy_version 468160 (0.0049) [2024-06-23 16:02:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7670398976. Throughput: 0: 42818.2. Samples: 7670504420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 16:02:24,826][15401] Updated weights for policy 0, policy_version 468170 (0.0042) [2024-06-23 16:02:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42872.4, 300 sec: 42487.3). Total num frames: 7670628352. Throughput: 0: 42940.8. Samples: 7670761700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 16:02:28,992][15401] Updated weights for policy 0, policy_version 468180 (0.0042) [2024-06-23 16:02:32,367][15401] Updated weights for policy 0, policy_version 468190 (0.0032) [2024-06-23 16:02:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43145.9, 300 sec: 42709.5). Total num frames: 7670874112. Throughput: 0: 42931.1. Samples: 7671016720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 16:02:36,377][15401] Updated weights for policy 0, policy_version 468200 (0.0023) [2024-06-23 16:02:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7671037952. Throughput: 0: 42858.2. Samples: 7671149180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 16:02:39,945][15401] Updated weights for policy 0, policy_version 468210 (0.0032) [2024-06-23 16:02:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7671283712. Throughput: 0: 43134.3. Samples: 7671410820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 16:02:43,822][15401] Updated weights for policy 0, policy_version 468220 (0.0032) [2024-06-23 16:02:47,492][15401] Updated weights for policy 0, policy_version 468230 (0.0029) [2024-06-23 16:02:48,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43145.5, 300 sec: 42653.9). Total num frames: 7671513088. Throughput: 0: 43183.1. Samples: 7671665760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 16:02:51,509][15401] Updated weights for policy 0, policy_version 468240 (0.0045) [2024-06-23 16:02:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7671693312. Throughput: 0: 43060.1. Samples: 7671797220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 16:02:53,477][15349] Signal inference workers to stop experience collection... (113650 times) [2024-06-23 16:02:53,479][15349] Signal inference workers to resume experience collection... (113650 times) [2024-06-23 16:02:53,508][15401] InferenceWorker_p0-w0: stopping experience collection (113650 times) [2024-06-23 16:02:53,508][15401] InferenceWorker_p0-w0: resuming experience collection (113650 times) [2024-06-23 16:02:55,016][15401] Updated weights for policy 0, policy_version 468250 (0.0045) [2024-06-23 16:02:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.3, 300 sec: 42709.4). Total num frames: 7671939072. Throughput: 0: 43024.0. Samples: 7672053900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:02:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 16:02:59,217][15401] Updated weights for policy 0, policy_version 468260 (0.0046) [2024-06-23 16:03:02,683][15401] Updated weights for policy 0, policy_version 468270 (0.0035) [2024-06-23 16:03:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7672152064. Throughput: 0: 43012.1. Samples: 7672303720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 16:03:06,725][15401] Updated weights for policy 0, policy_version 468280 (0.0037) [2024-06-23 16:03:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 7672348672. Throughput: 0: 42853.7. Samples: 7672432940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:08,392][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 16:03:10,438][15401] Updated weights for policy 0, policy_version 468290 (0.0032) [2024-06-23 16:03:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7672578048. Throughput: 0: 42948.0. Samples: 7672694360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 16:03:14,293][15401] Updated weights for policy 0, policy_version 468300 (0.0040) [2024-06-23 16:03:18,182][15401] Updated weights for policy 0, policy_version 468310 (0.0033) [2024-06-23 16:03:18,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 7672807424. Throughput: 0: 42803.1. Samples: 7672942860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:18,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 16:03:22,018][15401] Updated weights for policy 0, policy_version 468320 (0.0032) [2024-06-23 16:03:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7672987648. Throughput: 0: 42738.6. Samples: 7673072420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 16:03:25,977][15401] Updated weights for policy 0, policy_version 468330 (0.0039) [2024-06-23 16:03:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7673184256. Throughput: 0: 42609.4. Samples: 7673328240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 16:03:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 16:03:29,423][15401] Updated weights for policy 0, policy_version 468340 (0.0032) [2024-06-23 16:03:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7673413632. Throughput: 0: 42474.2. Samples: 7673577100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 16:03:33,894][15401] Updated weights for policy 0, policy_version 468350 (0.0037) [2024-06-23 16:03:37,327][15401] Updated weights for policy 0, policy_version 468360 (0.0030) [2024-06-23 16:03:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 7673626624. Throughput: 0: 42508.7. Samples: 7673710120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 16:03:41,555][15401] Updated weights for policy 0, policy_version 468370 (0.0042) [2024-06-23 16:03:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7673839616. Throughput: 0: 42438.3. Samples: 7673963620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 16:03:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468374_7673839616.pth... [2024-06-23 16:03:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000467750_7663616000.pth [2024-06-23 16:03:45,274][15401] Updated weights for policy 0, policy_version 468380 (0.0036) [2024-06-23 16:03:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42598.4). Total num frames: 7674068992. Throughput: 0: 42488.9. Samples: 7674215820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:48,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 16:03:49,290][15401] Updated weights for policy 0, policy_version 468390 (0.0038) [2024-06-23 16:03:53,116][15401] Updated weights for policy 0, policy_version 468400 (0.0044) [2024-06-23 16:03:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7674265600. Throughput: 0: 42510.8. Samples: 7674345820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 16:03:56,958][15401] Updated weights for policy 0, policy_version 468410 (0.0038) [2024-06-23 16:03:58,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42325.2, 300 sec: 42598.7). Total num frames: 7674478592. Throughput: 0: 42311.4. Samples: 7674598380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:03:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 16:04:01,029][15401] Updated weights for policy 0, policy_version 468420 (0.0037) [2024-06-23 16:04:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 7674691584. Throughput: 0: 42423.0. Samples: 7674851900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 16:04:04,566][15401] Updated weights for policy 0, policy_version 468430 (0.0040) [2024-06-23 16:04:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 7674904576. Throughput: 0: 42449.9. Samples: 7674982660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 16:04:08,845][15401] Updated weights for policy 0, policy_version 468440 (0.0038) [2024-06-23 16:04:12,342][15401] Updated weights for policy 0, policy_version 468450 (0.0022) [2024-06-23 16:04:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7675101184. Throughput: 0: 42387.9. Samples: 7675235700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 16:04:16,719][15401] Updated weights for policy 0, policy_version 468460 (0.0026) [2024-06-23 16:04:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7675330560. Throughput: 0: 42552.9. Samples: 7675491980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 16:04:20,165][15401] Updated weights for policy 0, policy_version 468470 (0.0038) [2024-06-23 16:04:20,167][15349] Signal inference workers to stop experience collection... (113700 times) [2024-06-23 16:04:20,168][15349] Signal inference workers to resume experience collection... (113700 times) [2024-06-23 16:04:20,182][15401] InferenceWorker_p0-w0: stopping experience collection (113700 times) [2024-06-23 16:04:20,182][15401] InferenceWorker_p0-w0: resuming experience collection (113700 times) [2024-06-23 16:04:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 7675543552. Throughput: 0: 42371.3. Samples: 7675616820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 16:04:24,455][15401] Updated weights for policy 0, policy_version 468480 (0.0035) [2024-06-23 16:04:27,738][15401] Updated weights for policy 0, policy_version 468490 (0.0038) [2024-06-23 16:04:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.6, 300 sec: 42598.3). Total num frames: 7675756544. Throughput: 0: 42446.6. Samples: 7675873820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:28,393][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 16:04:31,995][15401] Updated weights for policy 0, policy_version 468500 (0.0038) [2024-06-23 16:04:33,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7675953152. Throughput: 0: 42483.9. Samples: 7676127500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 16:04:35,761][15401] Updated weights for policy 0, policy_version 468510 (0.0026) [2024-06-23 16:04:38,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7676198912. Throughput: 0: 42523.0. Samples: 7676259360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 16:04:39,711][15401] Updated weights for policy 0, policy_version 468520 (0.0038) [2024-06-23 16:04:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7676379136. Throughput: 0: 42518.4. Samples: 7676511700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 16:04:43,447][15401] Updated weights for policy 0, policy_version 468530 (0.0027) [2024-06-23 16:04:47,437][15401] Updated weights for policy 0, policy_version 468540 (0.0044) [2024-06-23 16:04:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 7676608512. Throughput: 0: 42526.4. Samples: 7676765580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 16:04:50,891][15401] Updated weights for policy 0, policy_version 468550 (0.0038) [2024-06-23 16:04:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 7676821504. Throughput: 0: 42475.5. Samples: 7676894060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-23 16:04:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 16:04:54,715][15401] Updated weights for policy 0, policy_version 468560 (0.0023) [2024-06-23 16:04:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 7677034496. Throughput: 0: 42736.0. Samples: 7677158920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:04:58,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 16:04:58,795][15401] Updated weights for policy 0, policy_version 468570 (0.0032) [2024-06-23 16:05:02,386][15401] Updated weights for policy 0, policy_version 468580 (0.0037) [2024-06-23 16:05:03,391][15132] Fps is (10 sec: 42592.0, 60 sec: 42597.4, 300 sec: 42654.1). Total num frames: 7677247488. Throughput: 0: 42676.7. Samples: 7677412500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:03,391][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 16:05:06,463][15401] Updated weights for policy 0, policy_version 468590 (0.0028) [2024-06-23 16:05:08,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7677460480. Throughput: 0: 42894.5. Samples: 7677547080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 16:05:10,104][15401] Updated weights for policy 0, policy_version 468600 (0.0043) [2024-06-23 16:05:13,390][15132] Fps is (10 sec: 42604.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7677673472. Throughput: 0: 42700.5. Samples: 7677795240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:13,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-23 16:05:14,068][15401] Updated weights for policy 0, policy_version 468610 (0.0044) [2024-06-23 16:05:17,519][15401] Updated weights for policy 0, policy_version 468620 (0.0043) [2024-06-23 16:05:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7677886464. Throughput: 0: 42719.8. Samples: 7678049880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 16:05:21,525][15401] Updated weights for policy 0, policy_version 468630 (0.0032) [2024-06-23 16:05:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 7678099456. Throughput: 0: 42727.5. Samples: 7678182100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 16:05:25,256][15401] Updated weights for policy 0, policy_version 468640 (0.0035) [2024-06-23 16:05:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 7678296064. Throughput: 0: 42740.9. Samples: 7678435040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 16:05:29,507][15401] Updated weights for policy 0, policy_version 468650 (0.0031) [2024-06-23 16:05:32,804][15401] Updated weights for policy 0, policy_version 468660 (0.0047) [2024-06-23 16:05:33,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 7678541824. Throughput: 0: 42705.8. Samples: 7678687620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:33,396][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 16:05:36,988][15401] Updated weights for policy 0, policy_version 468670 (0.0042) [2024-06-23 16:05:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42765.0). Total num frames: 7678738432. Throughput: 0: 42883.1. Samples: 7678823900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:38,392][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 16:05:40,649][15401] Updated weights for policy 0, policy_version 468680 (0.0036) [2024-06-23 16:05:43,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7678935040. Throughput: 0: 42584.8. Samples: 7679075140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 16:05:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468686_7678951424.pth... [2024-06-23 16:05:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468061_7668711424.pth [2024-06-23 16:05:44,575][15401] Updated weights for policy 0, policy_version 468690 (0.0033) [2024-06-23 16:05:48,130][15401] Updated weights for policy 0, policy_version 468700 (0.0038) [2024-06-23 16:05:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7679180800. Throughput: 0: 42680.2. Samples: 7679333040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 16:05:51,531][15349] Signal inference workers to stop experience collection... (113750 times) [2024-06-23 16:05:51,535][15349] Signal inference workers to resume experience collection... (113750 times) [2024-06-23 16:05:51,580][15401] InferenceWorker_p0-w0: stopping experience collection (113750 times) [2024-06-23 16:05:51,580][15401] InferenceWorker_p0-w0: resuming experience collection (113750 times) [2024-06-23 16:05:52,209][15401] Updated weights for policy 0, policy_version 468710 (0.0042) [2024-06-23 16:05:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 7679377408. Throughput: 0: 42713.3. Samples: 7679469180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 16:05:55,717][15401] Updated weights for policy 0, policy_version 468720 (0.0033) [2024-06-23 16:05:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42598.4, 300 sec: 42598.0). Total num frames: 7679590400. Throughput: 0: 42793.8. Samples: 7679721060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:05:58,392][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 16:05:59,706][15401] Updated weights for policy 0, policy_version 468730 (0.0037) [2024-06-23 16:06:03,274][15401] Updated weights for policy 0, policy_version 468740 (0.0042) [2024-06-23 16:06:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43145.6, 300 sec: 42820.5). Total num frames: 7679836160. Throughput: 0: 42878.6. Samples: 7679979420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:06:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 16:06:07,707][15401] Updated weights for policy 0, policy_version 468750 (0.0033) [2024-06-23 16:06:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7680016384. Throughput: 0: 42888.1. Samples: 7680112060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:06:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 16:06:10,803][15401] Updated weights for policy 0, policy_version 468760 (0.0035) [2024-06-23 16:06:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7680245760. Throughput: 0: 42901.8. Samples: 7680365620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:06:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 16:06:15,518][15401] Updated weights for policy 0, policy_version 468770 (0.0027) [2024-06-23 16:06:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7680458752. Throughput: 0: 42954.7. Samples: 7680620300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 16:06:18,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 16:06:18,851][15401] Updated weights for policy 0, policy_version 468780 (0.0029) [2024-06-23 16:06:23,073][15401] Updated weights for policy 0, policy_version 468790 (0.0036) [2024-06-23 16:06:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 7680655360. Throughput: 0: 42817.9. Samples: 7680750600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 16:06:26,390][15401] Updated weights for policy 0, policy_version 468800 (0.0029) [2024-06-23 16:06:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42765.3). Total num frames: 7680901120. Throughput: 0: 42805.8. Samples: 7681001400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 16:06:30,637][15401] Updated weights for policy 0, policy_version 468810 (0.0036) [2024-06-23 16:06:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 7681081344. Throughput: 0: 42751.1. Samples: 7681256840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 16:06:34,524][15401] Updated weights for policy 0, policy_version 468820 (0.0034) [2024-06-23 16:06:38,207][15401] Updated weights for policy 0, policy_version 468830 (0.0024) [2024-06-23 16:06:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 7681310720. Throughput: 0: 42514.2. Samples: 7681382420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:38,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 16:06:41,873][15401] Updated weights for policy 0, policy_version 468840 (0.0033) [2024-06-23 16:06:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42765.2). Total num frames: 7681540096. Throughput: 0: 42747.7. Samples: 7681644600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 16:06:45,809][15401] Updated weights for policy 0, policy_version 468850 (0.0049) [2024-06-23 16:06:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7681736704. Throughput: 0: 42847.1. Samples: 7681907540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 16:06:49,563][15401] Updated weights for policy 0, policy_version 468860 (0.0026) [2024-06-23 16:06:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7681949696. Throughput: 0: 42652.5. Samples: 7682031420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 16:06:53,425][15401] Updated weights for policy 0, policy_version 468870 (0.0043) [2024-06-23 16:06:57,183][15401] Updated weights for policy 0, policy_version 468880 (0.0039) [2024-06-23 16:06:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 7682179072. Throughput: 0: 42769.4. Samples: 7682290240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:06:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 16:07:01,329][15401] Updated weights for policy 0, policy_version 468890 (0.0049) [2024-06-23 16:07:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7682392064. Throughput: 0: 42788.0. Samples: 7682545760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 16:07:04,981][15401] Updated weights for policy 0, policy_version 468900 (0.0046) [2024-06-23 16:07:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7682588672. Throughput: 0: 42596.4. Samples: 7682667440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 16:07:09,105][15401] Updated weights for policy 0, policy_version 468910 (0.0040) [2024-06-23 16:07:12,398][15401] Updated weights for policy 0, policy_version 468920 (0.0023) [2024-06-23 16:07:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7682818048. Throughput: 0: 42796.9. Samples: 7682927260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 16:07:16,855][15401] Updated weights for policy 0, policy_version 468930 (0.0045) [2024-06-23 16:07:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7683031040. Throughput: 0: 42829.4. Samples: 7683184160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 16:07:20,083][15401] Updated weights for policy 0, policy_version 468940 (0.0038) [2024-06-23 16:07:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7683227648. Throughput: 0: 42858.2. Samples: 7683310940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 16:07:24,243][15349] Signal inference workers to stop experience collection... (113800 times) [2024-06-23 16:07:24,295][15401] InferenceWorker_p0-w0: stopping experience collection (113800 times) [2024-06-23 16:07:24,361][15349] Signal inference workers to resume experience collection... (113800 times) [2024-06-23 16:07:24,361][15401] InferenceWorker_p0-w0: resuming experience collection (113800 times) [2024-06-23 16:07:24,496][15401] Updated weights for policy 0, policy_version 468950 (0.0042) [2024-06-23 16:07:27,608][15401] Updated weights for policy 0, policy_version 468960 (0.0028) [2024-06-23 16:07:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7683457024. Throughput: 0: 42788.3. Samples: 7683570180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:28,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 16:07:32,085][15401] Updated weights for policy 0, policy_version 468970 (0.0032) [2024-06-23 16:07:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7683653632. Throughput: 0: 42719.6. Samples: 7683829920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:33,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 16:07:35,255][15401] Updated weights for policy 0, policy_version 468980 (0.0040) [2024-06-23 16:07:38,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 7683866624. Throughput: 0: 42715.8. Samples: 7683953740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:38,392][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 16:07:39,703][15401] Updated weights for policy 0, policy_version 468990 (0.0023) [2024-06-23 16:07:42,756][15401] Updated weights for policy 0, policy_version 469000 (0.0038) [2024-06-23 16:07:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7684096000. Throughput: 0: 42840.4. Samples: 7684218060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-23 16:07:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 16:07:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469001_7684112384.pth... [2024-06-23 16:07:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468374_7673839616.pth [2024-06-23 16:07:47,360][15401] Updated weights for policy 0, policy_version 469010 (0.0022) [2024-06-23 16:07:48,391][15132] Fps is (10 sec: 44241.6, 60 sec: 42870.5, 300 sec: 42764.8). Total num frames: 7684308992. Throughput: 0: 42869.8. Samples: 7684474960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:07:48,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 16:07:50,588][15401] Updated weights for policy 0, policy_version 469020 (0.0037) [2024-06-23 16:07:53,392][15132] Fps is (10 sec: 42587.4, 60 sec: 42869.6, 300 sec: 42653.6). Total num frames: 7684521984. Throughput: 0: 42857.5. Samples: 7684596140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:07:53,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 16:07:55,341][15401] Updated weights for policy 0, policy_version 469030 (0.0027) [2024-06-23 16:07:58,125][15401] Updated weights for policy 0, policy_version 469040 (0.0027) [2024-06-23 16:07:58,390][15132] Fps is (10 sec: 44242.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7684751360. Throughput: 0: 42811.5. Samples: 7684853780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:07:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 16:08:02,939][15401] Updated weights for policy 0, policy_version 469050 (0.0034) [2024-06-23 16:08:03,389][15132] Fps is (10 sec: 40970.8, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 7684931584. Throughput: 0: 42920.5. Samples: 7685115580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 16:08:05,607][15401] Updated weights for policy 0, policy_version 469060 (0.0037) [2024-06-23 16:08:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7685160960. Throughput: 0: 42783.1. Samples: 7685236180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 16:08:10,562][15401] Updated weights for policy 0, policy_version 469070 (0.0042) [2024-06-23 16:08:13,175][15401] Updated weights for policy 0, policy_version 469080 (0.0033) [2024-06-23 16:08:13,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 7685406720. Throughput: 0: 42899.1. Samples: 7685500540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 16:08:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7685554176. Throughput: 0: 42916.4. Samples: 7685761160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:18,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 16:08:18,398][15401] Updated weights for policy 0, policy_version 469090 (0.0036) [2024-06-23 16:08:21,400][15401] Updated weights for policy 0, policy_version 469100 (0.0027) [2024-06-23 16:08:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7685816320. Throughput: 0: 42796.0. Samples: 7685879460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:23,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-23 16:08:25,969][15401] Updated weights for policy 0, policy_version 469110 (0.0024) [2024-06-23 16:08:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7686029312. Throughput: 0: 42586.2. Samples: 7686134440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 16:08:28,991][15401] Updated weights for policy 0, policy_version 469120 (0.0047) [2024-06-23 16:08:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7686193152. Throughput: 0: 42632.8. Samples: 7686393380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:33,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 16:08:33,561][15401] Updated weights for policy 0, policy_version 469130 (0.0037) [2024-06-23 16:08:34,847][15349] Signal inference workers to stop experience collection... (113850 times) [2024-06-23 16:08:34,892][15401] InferenceWorker_p0-w0: stopping experience collection (113850 times) [2024-06-23 16:08:34,906][15349] Signal inference workers to resume experience collection... (113850 times) [2024-06-23 16:08:34,911][15401] InferenceWorker_p0-w0: resuming experience collection (113850 times) [2024-06-23 16:08:36,666][15401] Updated weights for policy 0, policy_version 469140 (0.0042) [2024-06-23 16:08:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43144.5, 300 sec: 42764.7). Total num frames: 7686455296. Throughput: 0: 42664.1. Samples: 7686516020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:38,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 16:08:41,482][15401] Updated weights for policy 0, policy_version 469150 (0.0045) [2024-06-23 16:08:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 7686635520. Throughput: 0: 42579.1. Samples: 7686769840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 16:08:44,500][15401] Updated weights for policy 0, policy_version 469160 (0.0047) [2024-06-23 16:08:48,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42053.2, 300 sec: 42598.4). Total num frames: 7686832128. Throughput: 0: 42470.1. Samples: 7687026740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 16:08:49,354][15401] Updated weights for policy 0, policy_version 469170 (0.0041) [2024-06-23 16:08:52,075][15401] Updated weights for policy 0, policy_version 469180 (0.0048) [2024-06-23 16:08:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7687077888. Throughput: 0: 42505.8. Samples: 7687148940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 16:08:56,885][15401] Updated weights for policy 0, policy_version 469190 (0.0034) [2024-06-23 16:08:58,393][15132] Fps is (10 sec: 45858.9, 60 sec: 42322.8, 300 sec: 42709.0). Total num frames: 7687290880. Throughput: 0: 42523.8. Samples: 7687414260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:08:58,394][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 16:08:59,655][15401] Updated weights for policy 0, policy_version 469200 (0.0024) [2024-06-23 16:09:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7687487488. Throughput: 0: 42403.6. Samples: 7687669320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:09:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 16:09:04,416][15401] Updated weights for policy 0, policy_version 469210 (0.0027) [2024-06-23 16:09:07,510][15401] Updated weights for policy 0, policy_version 469220 (0.0051) [2024-06-23 16:09:08,390][15132] Fps is (10 sec: 44252.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7687733248. Throughput: 0: 42493.4. Samples: 7687791660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:09:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 16:09:12,091][15401] Updated weights for policy 0, policy_version 469230 (0.0044) [2024-06-23 16:09:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 7687913472. Throughput: 0: 42607.6. Samples: 7688051780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 16:09:15,360][15401] Updated weights for policy 0, policy_version 469240 (0.0026) [2024-06-23 16:09:18,390][15132] Fps is (10 sec: 39318.3, 60 sec: 42870.8, 300 sec: 42653.8). Total num frames: 7688126464. Throughput: 0: 42515.2. Samples: 7688306600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:18,391][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 16:09:19,791][15401] Updated weights for policy 0, policy_version 469250 (0.0033) [2024-06-23 16:09:22,910][15401] Updated weights for policy 0, policy_version 469260 (0.0023) [2024-06-23 16:09:23,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.8, 300 sec: 42765.0). Total num frames: 7688372224. Throughput: 0: 42564.0. Samples: 7688431400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:23,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 16:09:27,520][15401] Updated weights for policy 0, policy_version 469270 (0.0037) [2024-06-23 16:09:28,390][15132] Fps is (10 sec: 40963.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 7688536064. Throughput: 0: 42626.7. Samples: 7688688040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 16:09:30,527][15401] Updated weights for policy 0, policy_version 469280 (0.0042) [2024-06-23 16:09:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7688765440. Throughput: 0: 42483.7. Samples: 7688938500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 16:09:35,532][15401] Updated weights for policy 0, policy_version 469290 (0.0028) [2024-06-23 16:09:38,303][15401] Updated weights for policy 0, policy_version 469300 (0.0029) [2024-06-23 16:09:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 7689011200. Throughput: 0: 42659.7. Samples: 7689068620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 16:09:43,161][15401] Updated weights for policy 0, policy_version 469310 (0.0037) [2024-06-23 16:09:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7689191424. Throughput: 0: 42429.7. Samples: 7689323440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 16:09:43,543][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469312_7689207808.pth... [2024-06-23 16:09:43,593][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000468686_7678951424.pth [2024-06-23 16:09:45,844][15401] Updated weights for policy 0, policy_version 469320 (0.0040) [2024-06-23 16:09:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7689420800. Throughput: 0: 42224.8. Samples: 7689569440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:48,398][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 16:09:50,895][15401] Updated weights for policy 0, policy_version 469330 (0.0037) [2024-06-23 16:09:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 7689650176. Throughput: 0: 42427.2. Samples: 7689700880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 16:09:53,518][15401] Updated weights for policy 0, policy_version 469340 (0.0039) [2024-06-23 16:09:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42054.8, 300 sec: 42598.6). Total num frames: 7689814016. Throughput: 0: 42458.6. Samples: 7689962420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:09:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:09:58,557][15349] Signal inference workers to stop experience collection... (113900 times) [2024-06-23 16:09:58,558][15349] Signal inference workers to resume experience collection... (113900 times) [2024-06-23 16:09:58,558][15401] Updated weights for policy 0, policy_version 469350 (0.0051) [2024-06-23 16:09:58,608][15401] InferenceWorker_p0-w0: stopping experience collection (113900 times) [2024-06-23 16:09:58,608][15401] InferenceWorker_p0-w0: resuming experience collection (113900 times) [2024-06-23 16:10:01,233][15401] Updated weights for policy 0, policy_version 469360 (0.0022) [2024-06-23 16:10:03,391][15132] Fps is (10 sec: 40954.9, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 7690059776. Throughput: 0: 42345.5. Samples: 7690212160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:03,391][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 16:10:06,022][15401] Updated weights for policy 0, policy_version 469370 (0.0032) [2024-06-23 16:10:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7690289152. Throughput: 0: 42576.9. Samples: 7690347260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:08,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 16:10:08,884][15401] Updated weights for policy 0, policy_version 469380 (0.0035) [2024-06-23 16:10:13,389][15132] Fps is (10 sec: 39326.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7690452992. Throughput: 0: 42464.9. Samples: 7690598960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:13,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-23 16:10:13,611][15401] Updated weights for policy 0, policy_version 469390 (0.0035) [2024-06-23 16:10:16,656][15401] Updated weights for policy 0, policy_version 469400 (0.0035) [2024-06-23 16:10:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42872.1, 300 sec: 42709.5). Total num frames: 7690698752. Throughput: 0: 42551.0. Samples: 7690853300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 16:10:21,458][15401] Updated weights for policy 0, policy_version 469410 (0.0045) [2024-06-23 16:10:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 7690895360. Throughput: 0: 42529.8. Samples: 7690982460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 16:10:24,290][15401] Updated weights for policy 0, policy_version 469420 (0.0039) [2024-06-23 16:10:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 7691108352. Throughput: 0: 42515.1. Samples: 7691236620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 16:10:29,058][15401] Updated weights for policy 0, policy_version 469430 (0.0030) [2024-06-23 16:10:32,266][15401] Updated weights for policy 0, policy_version 469440 (0.0033) [2024-06-23 16:10:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 7691337728. Throughput: 0: 42722.2. Samples: 7691492040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:10:33,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 16:10:36,600][15401] Updated weights for policy 0, policy_version 469450 (0.0038) [2024-06-23 16:10:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7691550720. Throughput: 0: 42774.9. Samples: 7691625760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:10:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 16:10:39,808][15401] Updated weights for policy 0, policy_version 469460 (0.0045) [2024-06-23 16:10:43,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7691730944. Throughput: 0: 42461.3. Samples: 7691873180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:10:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 16:10:44,510][15401] Updated weights for policy 0, policy_version 469470 (0.0028) [2024-06-23 16:10:47,379][15401] Updated weights for policy 0, policy_version 469480 (0.0039) [2024-06-23 16:10:48,391][15132] Fps is (10 sec: 42590.5, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 7691976704. Throughput: 0: 42602.8. Samples: 7692129320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:10:48,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 16:10:52,143][15401] Updated weights for policy 0, policy_version 469490 (0.0042) [2024-06-23 16:10:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 7692173312. Throughput: 0: 42707.2. Samples: 7692269080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:10:53,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 16:10:55,163][15401] Updated weights for policy 0, policy_version 469500 (0.0036) [2024-06-23 16:10:58,389][15132] Fps is (10 sec: 39329.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7692369920. Throughput: 0: 42629.9. Samples: 7692517300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:10:58,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 16:10:59,798][15401] Updated weights for policy 0, policy_version 469510 (0.0025) [2024-06-23 16:11:03,203][15401] Updated weights for policy 0, policy_version 469520 (0.0033) [2024-06-23 16:11:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.2, 300 sec: 42709.5). Total num frames: 7692615680. Throughput: 0: 42657.3. Samples: 7692772880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 16:11:07,373][15401] Updated weights for policy 0, policy_version 469530 (0.0037) [2024-06-23 16:11:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7692828672. Throughput: 0: 42695.9. Samples: 7692903780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 16:11:11,343][15401] Updated weights for policy 0, policy_version 469540 (0.0037) [2024-06-23 16:11:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7693025280. Throughput: 0: 42605.3. Samples: 7693153860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 16:11:15,058][15401] Updated weights for policy 0, policy_version 469550 (0.0032) [2024-06-23 16:11:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7693238272. Throughput: 0: 42668.5. Samples: 7693412020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 16:11:18,831][15401] Updated weights for policy 0, policy_version 469560 (0.0038) [2024-06-23 16:11:22,889][15401] Updated weights for policy 0, policy_version 469570 (0.0033) [2024-06-23 16:11:23,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 7693467648. Throughput: 0: 42542.9. Samples: 7693540280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:23,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 16:11:26,422][15401] Updated weights for policy 0, policy_version 469580 (0.0040) [2024-06-23 16:11:28,391][15132] Fps is (10 sec: 44231.0, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 7693680640. Throughput: 0: 42700.5. Samples: 7693794760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:28,391][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 16:11:30,406][15401] Updated weights for policy 0, policy_version 469590 (0.0026) [2024-06-23 16:11:33,389][15132] Fps is (10 sec: 40969.4, 60 sec: 42327.1, 300 sec: 42598.8). Total num frames: 7693877248. Throughput: 0: 42757.5. Samples: 7694053320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 16:11:34,175][15401] Updated weights for policy 0, policy_version 469600 (0.0045) [2024-06-23 16:11:36,811][15349] Signal inference workers to stop experience collection... (113950 times) [2024-06-23 16:11:36,811][15349] Signal inference workers to resume experience collection... (113950 times) [2024-06-23 16:11:36,841][15401] InferenceWorker_p0-w0: stopping experience collection (113950 times) [2024-06-23 16:11:36,842][15401] InferenceWorker_p0-w0: resuming experience collection (113950 times) [2024-06-23 16:11:37,969][15401] Updated weights for policy 0, policy_version 469610 (0.0040) [2024-06-23 16:11:38,389][15132] Fps is (10 sec: 42604.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7694106624. Throughput: 0: 42498.6. Samples: 7694181520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 16:11:42,128][15401] Updated weights for policy 0, policy_version 469620 (0.0045) [2024-06-23 16:11:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7694319616. Throughput: 0: 42744.4. Samples: 7694440800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 16:11:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469625_7694336000.pth... [2024-06-23 16:11:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469001_7684112384.pth [2024-06-23 16:11:45,541][15401] Updated weights for policy 0, policy_version 469630 (0.0040) [2024-06-23 16:11:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42599.7, 300 sec: 42653.9). Total num frames: 7694532608. Throughput: 0: 42710.7. Samples: 7694694860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 16:11:49,685][15401] Updated weights for policy 0, policy_version 469640 (0.0033) [2024-06-23 16:11:53,027][15401] Updated weights for policy 0, policy_version 469650 (0.0040) [2024-06-23 16:11:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7694761984. Throughput: 0: 42776.1. Samples: 7694828700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 16:11:57,163][15401] Updated weights for policy 0, policy_version 469660 (0.0028) [2024-06-23 16:11:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7694958592. Throughput: 0: 43016.1. Samples: 7695089580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-23 16:11:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 16:12:00,525][15401] Updated weights for policy 0, policy_version 469670 (0.0037) [2024-06-23 16:12:03,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 7695187968. Throughput: 0: 42749.9. Samples: 7695336040. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:03,396][15132] Avg episode reward: [(0, '0.212')] [2024-06-23 16:12:04,980][15401] Updated weights for policy 0, policy_version 469680 (0.0046) [2024-06-23 16:12:08,311][15401] Updated weights for policy 0, policy_version 469690 (0.0034) [2024-06-23 16:12:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7695400960. Throughput: 0: 42836.7. Samples: 7695467840. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 16:12:12,695][15401] Updated weights for policy 0, policy_version 469700 (0.0031) [2024-06-23 16:12:13,390][15132] Fps is (10 sec: 40985.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 7695597568. Throughput: 0: 43071.3. Samples: 7695732920. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 16:12:15,893][15401] Updated weights for policy 0, policy_version 469710 (0.0042) [2024-06-23 16:12:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 7695826944. Throughput: 0: 42678.9. Samples: 7695973980. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:18,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 16:12:20,363][15401] Updated weights for policy 0, policy_version 469720 (0.0035) [2024-06-23 16:12:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42600.0, 300 sec: 42598.7). Total num frames: 7696023552. Throughput: 0: 42851.1. Samples: 7696109820. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 16:12:23,655][15401] Updated weights for policy 0, policy_version 469730 (0.0034) [2024-06-23 16:12:28,051][15401] Updated weights for policy 0, policy_version 469740 (0.0045) [2024-06-23 16:12:28,390][15132] Fps is (10 sec: 39328.5, 60 sec: 42325.8, 300 sec: 42598.3). Total num frames: 7696220160. Throughput: 0: 42591.8. Samples: 7696357460. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:28,391][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 16:12:31,341][15401] Updated weights for policy 0, policy_version 469750 (0.0032) [2024-06-23 16:12:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 7696465920. Throughput: 0: 42389.4. Samples: 7696602380. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 16:12:35,715][15401] Updated weights for policy 0, policy_version 469760 (0.0034) [2024-06-23 16:12:38,389][15132] Fps is (10 sec: 45878.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7696678912. Throughput: 0: 42492.4. Samples: 7696740860. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 16:12:38,817][15401] Updated weights for policy 0, policy_version 469770 (0.0038) [2024-06-23 16:12:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42543.1). Total num frames: 7696859136. Throughput: 0: 42253.9. Samples: 7696991000. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 16:12:43,436][15401] Updated weights for policy 0, policy_version 469780 (0.0036) [2024-06-23 16:12:47,051][15401] Updated weights for policy 0, policy_version 469790 (0.0038) [2024-06-23 16:12:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7697104896. Throughput: 0: 42321.6. Samples: 7697240240. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 16:12:51,300][15401] Updated weights for policy 0, policy_version 469800 (0.0032) [2024-06-23 16:12:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 7697301504. Throughput: 0: 42432.5. Samples: 7697377300. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 16:12:54,586][15401] Updated weights for policy 0, policy_version 469810 (0.0032) [2024-06-23 16:12:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7697498112. Throughput: 0: 42026.4. Samples: 7697624100. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:12:58,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 16:12:59,257][15401] Updated weights for policy 0, policy_version 469820 (0.0031) [2024-06-23 16:13:01,362][15349] Signal inference workers to stop experience collection... (114000 times) [2024-06-23 16:13:01,415][15401] InferenceWorker_p0-w0: stopping experience collection (114000 times) [2024-06-23 16:13:01,423][15349] Signal inference workers to resume experience collection... (114000 times) [2024-06-23 16:13:01,436][15401] InferenceWorker_p0-w0: resuming experience collection (114000 times) [2024-06-23 16:13:02,108][15401] Updated weights for policy 0, policy_version 469830 (0.0025) [2024-06-23 16:13:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 7697727488. Throughput: 0: 42290.8. Samples: 7697876960. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:13:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 16:13:06,873][15401] Updated weights for policy 0, policy_version 469840 (0.0038) [2024-06-23 16:13:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 7697907712. Throughput: 0: 42190.3. Samples: 7698008380. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:13:08,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 16:13:09,811][15401] Updated weights for policy 0, policy_version 469850 (0.0027) [2024-06-23 16:13:13,393][15132] Fps is (10 sec: 42584.1, 60 sec: 42596.1, 300 sec: 42709.0). Total num frames: 7698153472. Throughput: 0: 42275.7. Samples: 7698259980. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:13:13,393][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 16:13:14,403][15401] Updated weights for policy 0, policy_version 469860 (0.0038) [2024-06-23 16:13:17,491][15401] Updated weights for policy 0, policy_version 469870 (0.0038) [2024-06-23 16:13:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 7698366464. Throughput: 0: 42505.3. Samples: 7698515120. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:13:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 16:13:21,960][15401] Updated weights for policy 0, policy_version 469880 (0.0026) [2024-06-23 16:13:23,391][15132] Fps is (10 sec: 37688.4, 60 sec: 41777.8, 300 sec: 42376.0). Total num frames: 7698530304. Throughput: 0: 42232.8. Samples: 7698641420. Policy #0 lag: (min: 1.0, avg: 13.0, max: 23.0) [2024-06-23 16:13:23,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 16:13:25,539][15401] Updated weights for policy 0, policy_version 469890 (0.0024) [2024-06-23 16:13:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42599.0, 300 sec: 42653.9). Total num frames: 7698776064. Throughput: 0: 42321.3. Samples: 7698895460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 16:13:29,544][15401] Updated weights for policy 0, policy_version 469900 (0.0032) [2024-06-23 16:13:33,331][15401] Updated weights for policy 0, policy_version 469910 (0.0032) [2024-06-23 16:13:33,390][15132] Fps is (10 sec: 47522.7, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 7699005440. Throughput: 0: 42429.3. Samples: 7699149560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 16:13:37,839][15401] Updated weights for policy 0, policy_version 469920 (0.0037) [2024-06-23 16:13:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 7699185664. Throughput: 0: 42280.1. Samples: 7699279900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:38,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-23 16:13:40,926][15401] Updated weights for policy 0, policy_version 469930 (0.0033) [2024-06-23 16:13:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7699415040. Throughput: 0: 42410.7. Samples: 7699532580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 16:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469935_7699415040.pth... [2024-06-23 16:13:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469312_7689207808.pth [2024-06-23 16:13:45,408][15401] Updated weights for policy 0, policy_version 469940 (0.0033) [2024-06-23 16:13:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 7699611648. Throughput: 0: 42481.0. Samples: 7699788600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:48,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 16:13:48,852][15401] Updated weights for policy 0, policy_version 469950 (0.0039) [2024-06-23 16:13:52,935][15401] Updated weights for policy 0, policy_version 469960 (0.0032) [2024-06-23 16:13:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42487.8). Total num frames: 7699824640. Throughput: 0: 42303.8. Samples: 7699912060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 16:13:56,539][15401] Updated weights for policy 0, policy_version 469970 (0.0027) [2024-06-23 16:13:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 7700037632. Throughput: 0: 42298.7. Samples: 7700163280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:13:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 16:14:00,799][15401] Updated weights for policy 0, policy_version 469980 (0.0028) [2024-06-23 16:14:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7700250624. Throughput: 0: 42569.4. Samples: 7700430740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 16:14:04,240][15401] Updated weights for policy 0, policy_version 469990 (0.0054) [2024-06-23 16:14:06,663][15349] Signal inference workers to stop experience collection... (114050 times) [2024-06-23 16:14:06,663][15349] Signal inference workers to resume experience collection... (114050 times) [2024-06-23 16:14:06,698][15401] InferenceWorker_p0-w0: stopping experience collection (114050 times) [2024-06-23 16:14:06,698][15401] InferenceWorker_p0-w0: resuming experience collection (114050 times) [2024-06-23 16:14:08,290][15401] Updated weights for policy 0, policy_version 470000 (0.0037) [2024-06-23 16:14:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7700480000. Throughput: 0: 42430.3. Samples: 7700550700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:08,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-23 16:14:11,870][15401] Updated weights for policy 0, policy_version 470010 (0.0023) [2024-06-23 16:14:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42327.7, 300 sec: 42598.5). Total num frames: 7700692992. Throughput: 0: 42345.2. Samples: 7700801000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 16:14:16,172][15401] Updated weights for policy 0, policy_version 470020 (0.0038) [2024-06-23 16:14:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42376.6). Total num frames: 7700873216. Throughput: 0: 42601.0. Samples: 7701066600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 16:14:19,515][15401] Updated weights for policy 0, policy_version 470030 (0.0033) [2024-06-23 16:14:23,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42868.3, 300 sec: 42597.5). Total num frames: 7701102592. Throughput: 0: 42337.9. Samples: 7701185380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:23,397][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 16:14:23,692][15401] Updated weights for policy 0, policy_version 470040 (0.0046) [2024-06-23 16:14:27,269][15401] Updated weights for policy 0, policy_version 470050 (0.0042) [2024-06-23 16:14:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7701331968. Throughput: 0: 42202.2. Samples: 7701431680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 16:14:32,305][15401] Updated weights for policy 0, policy_version 470060 (0.0034) [2024-06-23 16:14:33,392][15132] Fps is (10 sec: 39337.1, 60 sec: 41504.5, 300 sec: 42320.3). Total num frames: 7701495808. Throughput: 0: 42333.6. Samples: 7701693720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:33,393][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 16:14:35,046][15401] Updated weights for policy 0, policy_version 470070 (0.0030) [2024-06-23 16:14:38,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 7701757952. Throughput: 0: 42266.2. Samples: 7701814300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:38,396][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 16:14:39,832][15401] Updated weights for policy 0, policy_version 470080 (0.0042) [2024-06-23 16:14:43,137][15401] Updated weights for policy 0, policy_version 470090 (0.0029) [2024-06-23 16:14:43,391][15132] Fps is (10 sec: 45880.8, 60 sec: 42324.5, 300 sec: 42487.2). Total num frames: 7701954560. Throughput: 0: 42437.0. Samples: 7702073000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:43,391][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 16:14:47,614][15401] Updated weights for policy 0, policy_version 470100 (0.0041) [2024-06-23 16:14:48,389][15132] Fps is (10 sec: 37707.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 7702134784. Throughput: 0: 42138.1. Samples: 7702326960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 16:14:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 16:14:50,861][15401] Updated weights for policy 0, policy_version 470110 (0.0024) [2024-06-23 16:14:53,392][15132] Fps is (10 sec: 40955.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 7702364160. Throughput: 0: 42175.5. Samples: 7702448700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:14:53,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 16:14:55,037][15401] Updated weights for policy 0, policy_version 470120 (0.0029) [2024-06-23 16:14:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42431.9). Total num frames: 7702577152. Throughput: 0: 42325.3. Samples: 7702705640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:14:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 16:14:58,805][15401] Updated weights for policy 0, policy_version 470130 (0.0036) [2024-06-23 16:15:02,604][15401] Updated weights for policy 0, policy_version 470140 (0.0036) [2024-06-23 16:15:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 7702773760. Throughput: 0: 42058.1. Samples: 7702959220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 16:15:06,667][15401] Updated weights for policy 0, policy_version 470150 (0.0040) [2024-06-23 16:15:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 7703003136. Throughput: 0: 42247.8. Samples: 7703086260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 16:15:10,424][15401] Updated weights for policy 0, policy_version 470160 (0.0033) [2024-06-23 16:15:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7703216128. Throughput: 0: 42456.1. Samples: 7703342200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 16:15:14,279][15401] Updated weights for policy 0, policy_version 470170 (0.0046) [2024-06-23 16:15:18,330][15401] Updated weights for policy 0, policy_version 470180 (0.0037) [2024-06-23 16:15:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7703429120. Throughput: 0: 42322.4. Samples: 7703598120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 16:15:21,744][15349] Signal inference workers to stop experience collection... (114100 times) [2024-06-23 16:15:21,744][15349] Signal inference workers to resume experience collection... (114100 times) [2024-06-23 16:15:21,763][15401] InferenceWorker_p0-w0: stopping experience collection (114100 times) [2024-06-23 16:15:21,791][15401] InferenceWorker_p0-w0: resuming experience collection (114100 times) [2024-06-23 16:15:21,898][15401] Updated weights for policy 0, policy_version 470190 (0.0035) [2024-06-23 16:15:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42329.9, 300 sec: 42487.3). Total num frames: 7703642112. Throughput: 0: 42569.1. Samples: 7703729640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 16:15:26,178][15401] Updated weights for policy 0, policy_version 470200 (0.0044) [2024-06-23 16:15:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 7703855104. Throughput: 0: 42378.0. Samples: 7703979960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 16:15:29,527][15401] Updated weights for policy 0, policy_version 470210 (0.0034) [2024-06-23 16:15:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42431.8). Total num frames: 7704068096. Throughput: 0: 42424.8. Samples: 7704236080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:15:33,666][15401] Updated weights for policy 0, policy_version 470220 (0.0029) [2024-06-23 16:15:37,141][15401] Updated weights for policy 0, policy_version 470230 (0.0026) [2024-06-23 16:15:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42056.7, 300 sec: 42542.9). Total num frames: 7704281088. Throughput: 0: 42755.6. Samples: 7704372600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 16:15:41,091][15401] Updated weights for policy 0, policy_version 470240 (0.0045) [2024-06-23 16:15:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.2, 300 sec: 42432.1). Total num frames: 7704494080. Throughput: 0: 42655.6. Samples: 7704625140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 16:15:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470245_7704494080.pth... [2024-06-23 16:15:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469625_7694336000.pth [2024-06-23 16:15:44,958][15401] Updated weights for policy 0, policy_version 470250 (0.0044) [2024-06-23 16:15:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 7704723456. Throughput: 0: 42656.4. Samples: 7704878760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 16:15:48,663][15401] Updated weights for policy 0, policy_version 470260 (0.0027) [2024-06-23 16:15:52,583][15401] Updated weights for policy 0, policy_version 470270 (0.0022) [2024-06-23 16:15:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42871.5, 300 sec: 42598.0). Total num frames: 7704936448. Throughput: 0: 42738.6. Samples: 7705009600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:53,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 16:15:56,106][15401] Updated weights for policy 0, policy_version 470280 (0.0047) [2024-06-23 16:15:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7705133056. Throughput: 0: 42709.3. Samples: 7705264120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:15:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 16:16:00,354][15401] Updated weights for policy 0, policy_version 470290 (0.0037) [2024-06-23 16:16:03,392][15132] Fps is (10 sec: 42598.3, 60 sec: 43142.9, 300 sec: 42487.0). Total num frames: 7705362432. Throughput: 0: 42750.1. Samples: 7705521980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:16:03,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 16:16:03,953][15401] Updated weights for policy 0, policy_version 470300 (0.0037) [2024-06-23 16:16:07,876][15401] Updated weights for policy 0, policy_version 470310 (0.0033) [2024-06-23 16:16:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7705575424. Throughput: 0: 42714.1. Samples: 7705651780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:16:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 16:16:11,823][15401] Updated weights for policy 0, policy_version 470320 (0.0033) [2024-06-23 16:16:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7705788416. Throughput: 0: 42687.1. Samples: 7705900880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 16:16:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 16:16:15,683][15401] Updated weights for policy 0, policy_version 470330 (0.0026) [2024-06-23 16:16:18,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.7, 300 sec: 42487.3). Total num frames: 7706001408. Throughput: 0: 42719.1. Samples: 7706158540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:18,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 16:16:19,307][15401] Updated weights for policy 0, policy_version 470340 (0.0036) [2024-06-23 16:16:23,136][15401] Updated weights for policy 0, policy_version 470350 (0.0035) [2024-06-23 16:16:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42487.2). Total num frames: 7706214400. Throughput: 0: 42525.3. Samples: 7706286340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:23,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 16:16:27,267][15401] Updated weights for policy 0, policy_version 470360 (0.0029) [2024-06-23 16:16:28,390][15132] Fps is (10 sec: 44246.4, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 7706443776. Throughput: 0: 42642.0. Samples: 7706544040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:28,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 16:16:30,777][15401] Updated weights for policy 0, policy_version 470370 (0.0044) [2024-06-23 16:16:33,396][15132] Fps is (10 sec: 42581.0, 60 sec: 42866.9, 300 sec: 42486.4). Total num frames: 7706640384. Throughput: 0: 42802.3. Samples: 7706805140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:33,397][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 16:16:34,627][15401] Updated weights for policy 0, policy_version 470380 (0.0033) [2024-06-23 16:16:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 7706853376. Throughput: 0: 42796.0. Samples: 7706935320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 16:16:38,471][15401] Updated weights for policy 0, policy_version 470390 (0.0031) [2024-06-23 16:16:42,017][15401] Updated weights for policy 0, policy_version 470400 (0.0028) [2024-06-23 16:16:43,389][15132] Fps is (10 sec: 44265.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 7707082752. Throughput: 0: 42827.1. Samples: 7707191340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 16:16:46,092][15401] Updated weights for policy 0, policy_version 470410 (0.0028) [2024-06-23 16:16:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 7707279360. Throughput: 0: 42925.0. Samples: 7707453500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 16:16:48,698][15349] Signal inference workers to stop experience collection... (114150 times) [2024-06-23 16:16:48,755][15401] InferenceWorker_p0-w0: stopping experience collection (114150 times) [2024-06-23 16:16:48,760][15349] Signal inference workers to resume experience collection... (114150 times) [2024-06-23 16:16:48,771][15401] InferenceWorker_p0-w0: resuming experience collection (114150 times) [2024-06-23 16:16:49,637][15401] Updated weights for policy 0, policy_version 470420 (0.0043) [2024-06-23 16:16:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 7707492352. Throughput: 0: 42773.4. Samples: 7707576580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 16:16:53,859][15401] Updated weights for policy 0, policy_version 470430 (0.0027) [2024-06-23 16:16:57,836][15401] Updated weights for policy 0, policy_version 470440 (0.0032) [2024-06-23 16:16:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42488.3). Total num frames: 7707721728. Throughput: 0: 43026.3. Samples: 7707837060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:16:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 16:17:01,453][15401] Updated weights for policy 0, policy_version 470450 (0.0034) [2024-06-23 16:17:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 7707934720. Throughput: 0: 43064.5. Samples: 7708096340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 16:17:05,407][15401] Updated weights for policy 0, policy_version 470460 (0.0033) [2024-06-23 16:17:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7708114944. Throughput: 0: 42975.2. Samples: 7708220120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 16:17:09,192][15401] Updated weights for policy 0, policy_version 470470 (0.0030) [2024-06-23 16:17:12,912][15401] Updated weights for policy 0, policy_version 470480 (0.0040) [2024-06-23 16:17:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42487.7). Total num frames: 7708360704. Throughput: 0: 42905.0. Samples: 7708474760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:13,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 16:17:17,090][15401] Updated weights for policy 0, policy_version 470490 (0.0041) [2024-06-23 16:17:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 7708557312. Throughput: 0: 42888.9. Samples: 7708734860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 16:17:20,434][15401] Updated weights for policy 0, policy_version 470500 (0.0032) [2024-06-23 16:17:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42543.0). Total num frames: 7708770304. Throughput: 0: 42658.7. Samples: 7708854960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 16:17:24,694][15401] Updated weights for policy 0, policy_version 470510 (0.0041) [2024-06-23 16:17:27,854][15401] Updated weights for policy 0, policy_version 470520 (0.0040) [2024-06-23 16:17:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 7709016064. Throughput: 0: 42843.6. Samples: 7709119300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 16:17:32,442][15401] Updated weights for policy 0, policy_version 470530 (0.0035) [2024-06-23 16:17:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42603.1, 300 sec: 42431.8). Total num frames: 7709196288. Throughput: 0: 42712.9. Samples: 7709375580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 16:17:35,711][15401] Updated weights for policy 0, policy_version 470540 (0.0023) [2024-06-23 16:17:38,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 7709409280. Throughput: 0: 42505.7. Samples: 7709489440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-23 16:17:38,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 16:17:40,398][15401] Updated weights for policy 0, policy_version 470550 (0.0029) [2024-06-23 16:17:43,236][15401] Updated weights for policy 0, policy_version 470560 (0.0031) [2024-06-23 16:17:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7709655040. Throughput: 0: 42655.9. Samples: 7709756580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:17:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 16:17:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470560_7709655040.pth... [2024-06-23 16:17:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000469935_7699415040.pth [2024-06-23 16:17:47,976][15401] Updated weights for policy 0, policy_version 470570 (0.0027) [2024-06-23 16:17:48,272][15349] Signal inference workers to stop experience collection... (114200 times) [2024-06-23 16:17:48,278][15349] Signal inference workers to resume experience collection... (114200 times) [2024-06-23 16:17:48,318][15401] InferenceWorker_p0-w0: stopping experience collection (114200 times) [2024-06-23 16:17:48,318][15401] InferenceWorker_p0-w0: resuming experience collection (114200 times) [2024-06-23 16:17:48,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7709851648. Throughput: 0: 42713.8. Samples: 7710018460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:17:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 16:17:51,166][15401] Updated weights for policy 0, policy_version 470580 (0.0050) [2024-06-23 16:17:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 7710064640. Throughput: 0: 42681.7. Samples: 7710140900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:17:53,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 16:17:55,418][15401] Updated weights for policy 0, policy_version 470590 (0.0027) [2024-06-23 16:17:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7710294016. Throughput: 0: 42871.2. Samples: 7710403960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:17:58,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 16:17:58,488][15401] Updated weights for policy 0, policy_version 470600 (0.0045) [2024-06-23 16:18:02,938][15401] Updated weights for policy 0, policy_version 470610 (0.0028) [2024-06-23 16:18:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7710507008. Throughput: 0: 42892.3. Samples: 7710665020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:03,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 16:18:05,997][15401] Updated weights for policy 0, policy_version 470620 (0.0038) [2024-06-23 16:18:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42598.9). Total num frames: 7710720000. Throughput: 0: 42996.4. Samples: 7710789800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 16:18:10,544][15401] Updated weights for policy 0, policy_version 470630 (0.0034) [2024-06-23 16:18:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7710949376. Throughput: 0: 42913.3. Samples: 7711050400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 16:18:13,747][15401] Updated weights for policy 0, policy_version 470640 (0.0022) [2024-06-23 16:18:18,067][15401] Updated weights for policy 0, policy_version 470650 (0.0042) [2024-06-23 16:18:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7711129600. Throughput: 0: 43038.2. Samples: 7711312300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 16:18:21,542][15401] Updated weights for policy 0, policy_version 470660 (0.0034) [2024-06-23 16:18:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7711375360. Throughput: 0: 43362.9. Samples: 7711440660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 16:18:25,754][15401] Updated weights for policy 0, policy_version 470670 (0.0044) [2024-06-23 16:18:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7711588352. Throughput: 0: 43107.1. Samples: 7711696400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 16:18:29,167][15401] Updated weights for policy 0, policy_version 470680 (0.0024) [2024-06-23 16:18:33,256][15401] Updated weights for policy 0, policy_version 470690 (0.0029) [2024-06-23 16:18:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7711784960. Throughput: 0: 43099.6. Samples: 7711957940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 16:18:36,701][15401] Updated weights for policy 0, policy_version 470700 (0.0038) [2024-06-23 16:18:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 7711997952. Throughput: 0: 43155.7. Samples: 7712082800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 16:18:41,141][15401] Updated weights for policy 0, policy_version 470710 (0.0031) [2024-06-23 16:18:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7712210944. Throughput: 0: 42963.0. Samples: 7712337300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 16:18:44,590][15401] Updated weights for policy 0, policy_version 470720 (0.0034) [2024-06-23 16:18:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7712423936. Throughput: 0: 42909.3. Samples: 7712596040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:48,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 16:18:48,534][15401] Updated weights for policy 0, policy_version 470730 (0.0038) [2024-06-23 16:18:52,283][15401] Updated weights for policy 0, policy_version 470740 (0.0036) [2024-06-23 16:18:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 7712653312. Throughput: 0: 43017.3. Samples: 7712725580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:53,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 16:18:56,109][15401] Updated weights for policy 0, policy_version 470750 (0.0033) [2024-06-23 16:18:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7712849920. Throughput: 0: 42832.0. Samples: 7712977840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:18:58,396][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 16:19:00,060][15401] Updated weights for policy 0, policy_version 470760 (0.0034) [2024-06-23 16:19:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7713062912. Throughput: 0: 42752.0. Samples: 7713236140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 16:19:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 16:19:03,652][15401] Updated weights for policy 0, policy_version 470770 (0.0038) [2024-06-23 16:19:07,650][15401] Updated weights for policy 0, policy_version 470780 (0.0034) [2024-06-23 16:19:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 7713259520. Throughput: 0: 42764.5. Samples: 7713365060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 16:19:11,262][15401] Updated weights for policy 0, policy_version 470790 (0.0039) [2024-06-23 16:19:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7713488896. Throughput: 0: 42615.5. Samples: 7713614100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 16:19:15,786][15401] Updated weights for policy 0, policy_version 470800 (0.0029) [2024-06-23 16:19:15,808][15349] Signal inference workers to stop experience collection... (114250 times) [2024-06-23 16:19:15,809][15349] Signal inference workers to resume experience collection... (114250 times) [2024-06-23 16:19:15,852][15401] InferenceWorker_p0-w0: stopping experience collection (114250 times) [2024-06-23 16:19:15,852][15401] InferenceWorker_p0-w0: resuming experience collection (114250 times) [2024-06-23 16:19:18,391][15132] Fps is (10 sec: 45866.7, 60 sec: 43143.3, 300 sec: 42765.7). Total num frames: 7713718272. Throughput: 0: 42610.3. Samples: 7713875480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:18,396][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 16:19:18,753][15401] Updated weights for policy 0, policy_version 470810 (0.0029) [2024-06-23 16:19:23,295][15401] Updated weights for policy 0, policy_version 470820 (0.0035) [2024-06-23 16:19:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7713914880. Throughput: 0: 42744.8. Samples: 7714006320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 16:19:26,620][15401] Updated weights for policy 0, policy_version 470830 (0.0028) [2024-06-23 16:19:28,389][15132] Fps is (10 sec: 42606.0, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 7714144256. Throughput: 0: 42624.1. Samples: 7714255380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 16:19:30,961][15401] Updated weights for policy 0, policy_version 470840 (0.0033) [2024-06-23 16:19:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42710.0). Total num frames: 7714357248. Throughput: 0: 42713.4. Samples: 7714518140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:33,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 16:19:34,244][15401] Updated weights for policy 0, policy_version 470850 (0.0025) [2024-06-23 16:19:38,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42709.3). Total num frames: 7714553856. Throughput: 0: 42567.1. Samples: 7714641200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:38,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 16:19:38,532][15401] Updated weights for policy 0, policy_version 470860 (0.0030) [2024-06-23 16:19:41,853][15401] Updated weights for policy 0, policy_version 470870 (0.0031) [2024-06-23 16:19:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7714799616. Throughput: 0: 42666.7. Samples: 7714897840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:43,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 16:19:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470874_7714799616.pth... [2024-06-23 16:19:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470245_7704494080.pth [2024-06-23 16:19:46,367][15401] Updated weights for policy 0, policy_version 470880 (0.0044) [2024-06-23 16:19:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 7714996224. Throughput: 0: 42697.2. Samples: 7715157520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:48,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 16:19:49,817][15401] Updated weights for policy 0, policy_version 470890 (0.0039) [2024-06-23 16:19:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7715176448. Throughput: 0: 42584.4. Samples: 7715281360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 16:19:54,003][15401] Updated weights for policy 0, policy_version 470900 (0.0037) [2024-06-23 16:19:57,665][15401] Updated weights for policy 0, policy_version 470910 (0.0033) [2024-06-23 16:19:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7715438592. Throughput: 0: 42783.6. Samples: 7715539360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:19:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 16:20:01,600][15401] Updated weights for policy 0, policy_version 470920 (0.0041) [2024-06-23 16:20:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7715635200. Throughput: 0: 42700.7. Samples: 7715796940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 16:20:05,270][15401] Updated weights for policy 0, policy_version 470930 (0.0034) [2024-06-23 16:20:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7715815424. Throughput: 0: 42594.7. Samples: 7715923080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 16:20:09,597][15401] Updated weights for policy 0, policy_version 470940 (0.0044) [2024-06-23 16:20:12,800][15401] Updated weights for policy 0, policy_version 470950 (0.0029) [2024-06-23 16:20:13,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42867.0, 300 sec: 42819.6). Total num frames: 7716061184. Throughput: 0: 42590.3. Samples: 7716172220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:13,396][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 16:20:17,407][15401] Updated weights for policy 0, policy_version 470960 (0.0035) [2024-06-23 16:20:18,390][15132] Fps is (10 sec: 44234.3, 60 sec: 42326.2, 300 sec: 42764.9). Total num frames: 7716257792. Throughput: 0: 42482.2. Samples: 7716429760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:18,391][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 16:20:20,381][15401] Updated weights for policy 0, policy_version 470970 (0.0031) [2024-06-23 16:20:23,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7716470784. Throughput: 0: 42540.1. Samples: 7716555400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:23,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 16:20:25,041][15401] Updated weights for policy 0, policy_version 470980 (0.0035) [2024-06-23 16:20:27,666][15349] Signal inference workers to stop experience collection... (114300 times) [2024-06-23 16:20:27,716][15401] InferenceWorker_p0-w0: stopping experience collection (114300 times) [2024-06-23 16:20:27,779][15349] Signal inference workers to resume experience collection... (114300 times) [2024-06-23 16:20:27,779][15401] InferenceWorker_p0-w0: resuming experience collection (114300 times) [2024-06-23 16:20:27,921][15401] Updated weights for policy 0, policy_version 470990 (0.0027) [2024-06-23 16:20:28,389][15132] Fps is (10 sec: 44239.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 7716700160. Throughput: 0: 42645.8. Samples: 7716816900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 16:20:28,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 16:20:32,661][15401] Updated weights for policy 0, policy_version 471000 (0.0039) [2024-06-23 16:20:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 7716896768. Throughput: 0: 42672.1. Samples: 7717077760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 16:20:35,759][15401] Updated weights for policy 0, policy_version 471010 (0.0031) [2024-06-23 16:20:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 7717126144. Throughput: 0: 42711.1. Samples: 7717203360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:38,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 16:20:40,119][15401] Updated weights for policy 0, policy_version 471020 (0.0039) [2024-06-23 16:20:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7717339136. Throughput: 0: 42593.3. Samples: 7717456060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 16:20:43,466][15401] Updated weights for policy 0, policy_version 471030 (0.0046) [2024-06-23 16:20:47,712][15401] Updated weights for policy 0, policy_version 471040 (0.0040) [2024-06-23 16:20:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 7717535744. Throughput: 0: 42587.2. Samples: 7717713360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:48,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 16:20:50,977][15401] Updated weights for policy 0, policy_version 471050 (0.0031) [2024-06-23 16:20:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7717765120. Throughput: 0: 42558.1. Samples: 7717838200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 16:20:55,302][15401] Updated weights for policy 0, policy_version 471060 (0.0026) [2024-06-23 16:20:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 7717994496. Throughput: 0: 42790.1. Samples: 7718097500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:20:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 16:20:58,444][15401] Updated weights for policy 0, policy_version 471070 (0.0029) [2024-06-23 16:21:02,977][15401] Updated weights for policy 0, policy_version 471080 (0.0042) [2024-06-23 16:21:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7718191104. Throughput: 0: 42750.2. Samples: 7718353500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 16:21:06,416][15401] Updated weights for policy 0, policy_version 471090 (0.0044) [2024-06-23 16:21:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7718387712. Throughput: 0: 42679.0. Samples: 7718475960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 16:21:10,649][15401] Updated weights for policy 0, policy_version 471100 (0.0035) [2024-06-23 16:21:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42876.0, 300 sec: 42820.9). Total num frames: 7718633472. Throughput: 0: 42657.7. Samples: 7718736500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 16:21:13,914][15401] Updated weights for policy 0, policy_version 471110 (0.0037) [2024-06-23 16:21:18,302][15401] Updated weights for policy 0, policy_version 471120 (0.0041) [2024-06-23 16:21:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.9, 300 sec: 42765.4). Total num frames: 7718830080. Throughput: 0: 42573.0. Samples: 7718993540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 16:21:21,650][15401] Updated weights for policy 0, policy_version 471130 (0.0032) [2024-06-23 16:21:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7719043072. Throughput: 0: 42568.0. Samples: 7719118920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 16:21:25,971][15401] Updated weights for policy 0, policy_version 471140 (0.0036) [2024-06-23 16:21:28,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.6, 300 sec: 42765.6). Total num frames: 7719256064. Throughput: 0: 42626.1. Samples: 7719374340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:28,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 16:21:29,461][15401] Updated weights for policy 0, policy_version 471150 (0.0035) [2024-06-23 16:21:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7719469056. Throughput: 0: 42647.5. Samples: 7719632500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 16:21:33,586][15401] Updated weights for policy 0, policy_version 471160 (0.0033) [2024-06-23 16:21:37,251][15401] Updated weights for policy 0, policy_version 471170 (0.0034) [2024-06-23 16:21:38,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7719682048. Throughput: 0: 42572.2. Samples: 7719753940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:21:42,033][15401] Updated weights for policy 0, policy_version 471180 (0.0038) [2024-06-23 16:21:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7719895040. Throughput: 0: 42605.4. Samples: 7720014740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 16:21:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471185_7719895040.pth... [2024-06-23 16:21:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470560_7709655040.pth [2024-06-23 16:21:45,043][15401] Updated weights for policy 0, policy_version 471190 (0.0030) [2024-06-23 16:21:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7720075264. Throughput: 0: 42452.2. Samples: 7720263840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 16:21:49,587][15401] Updated weights for policy 0, policy_version 471200 (0.0032) [2024-06-23 16:21:50,450][15349] Signal inference workers to stop experience collection... (114350 times) [2024-06-23 16:21:50,492][15401] InferenceWorker_p0-w0: stopping experience collection (114350 times) [2024-06-23 16:21:50,517][15349] Signal inference workers to resume experience collection... (114350 times) [2024-06-23 16:21:50,524][15401] InferenceWorker_p0-w0: resuming experience collection (114350 times) [2024-06-23 16:21:52,745][15401] Updated weights for policy 0, policy_version 471210 (0.0033) [2024-06-23 16:21:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7720304640. Throughput: 0: 42596.4. Samples: 7720392800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 16:21:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 16:21:57,195][15401] Updated weights for policy 0, policy_version 471220 (0.0028) [2024-06-23 16:21:58,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7720517632. Throughput: 0: 42578.2. Samples: 7720652520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:21:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 16:22:00,591][15401] Updated weights for policy 0, policy_version 471230 (0.0030) [2024-06-23 16:22:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 7720730624. Throughput: 0: 42332.3. Samples: 7720898600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:03,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 16:22:04,929][15401] Updated weights for policy 0, policy_version 471240 (0.0053) [2024-06-23 16:22:08,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7720943616. Throughput: 0: 42472.4. Samples: 7721030280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:08,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 16:22:08,535][15401] Updated weights for policy 0, policy_version 471250 (0.0034) [2024-06-23 16:22:12,561][15401] Updated weights for policy 0, policy_version 471260 (0.0036) [2024-06-23 16:22:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 7721140224. Throughput: 0: 42415.8. Samples: 7721282940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 16:22:16,611][15401] Updated weights for policy 0, policy_version 471270 (0.0034) [2024-06-23 16:22:18,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7721369600. Throughput: 0: 42130.5. Samples: 7721528380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 16:22:20,258][15401] Updated weights for policy 0, policy_version 471280 (0.0040) [2024-06-23 16:22:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7721582592. Throughput: 0: 42454.6. Samples: 7721664400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 16:22:24,229][15401] Updated weights for policy 0, policy_version 471290 (0.0044) [2024-06-23 16:22:27,949][15401] Updated weights for policy 0, policy_version 471300 (0.0034) [2024-06-23 16:22:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 7721779200. Throughput: 0: 42260.4. Samples: 7721916460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 16:22:32,117][15401] Updated weights for policy 0, policy_version 471310 (0.0036) [2024-06-23 16:22:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.6, 300 sec: 42765.0). Total num frames: 7722024960. Throughput: 0: 42310.1. Samples: 7722167900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:33,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 16:22:35,468][15401] Updated weights for policy 0, policy_version 471320 (0.0040) [2024-06-23 16:22:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7722221568. Throughput: 0: 42249.0. Samples: 7722294000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 16:22:39,898][15401] Updated weights for policy 0, policy_version 471330 (0.0053) [2024-06-23 16:22:43,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7722418176. Throughput: 0: 42055.2. Samples: 7722545000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 16:22:43,474][15401] Updated weights for policy 0, policy_version 471340 (0.0032) [2024-06-23 16:22:47,617][15401] Updated weights for policy 0, policy_version 471350 (0.0039) [2024-06-23 16:22:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 7722631168. Throughput: 0: 42305.0. Samples: 7722802220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 16:22:51,132][15401] Updated weights for policy 0, policy_version 471360 (0.0047) [2024-06-23 16:22:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7722860544. Throughput: 0: 42156.4. Samples: 7722927220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 16:22:55,516][15401] Updated weights for policy 0, policy_version 471370 (0.0025) [2024-06-23 16:22:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7723057152. Throughput: 0: 42048.3. Samples: 7723175120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:22:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 16:22:58,873][15401] Updated weights for policy 0, policy_version 471380 (0.0036) [2024-06-23 16:23:03,059][15401] Updated weights for policy 0, policy_version 471390 (0.0032) [2024-06-23 16:23:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 7723253760. Throughput: 0: 42330.8. Samples: 7723433260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:23:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 16:23:06,434][15401] Updated weights for policy 0, policy_version 471400 (0.0037) [2024-06-23 16:23:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 7723499520. Throughput: 0: 42147.6. Samples: 7723561040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:23:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 16:23:11,013][15401] Updated weights for policy 0, policy_version 471410 (0.0028) [2024-06-23 16:23:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7723696128. Throughput: 0: 42254.3. Samples: 7723817900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:23:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 16:23:14,073][15401] Updated weights for policy 0, policy_version 471420 (0.0036) [2024-06-23 16:23:17,298][15349] Signal inference workers to stop experience collection... (114400 times) [2024-06-23 16:23:17,299][15349] Signal inference workers to resume experience collection... (114400 times) [2024-06-23 16:23:17,335][15401] InferenceWorker_p0-w0: stopping experience collection (114400 times) [2024-06-23 16:23:17,335][15401] InferenceWorker_p0-w0: resuming experience collection (114400 times) [2024-06-23 16:23:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7723892736. Throughput: 0: 42223.0. Samples: 7724067840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:23:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 16:23:18,543][15401] Updated weights for policy 0, policy_version 471430 (0.0048) [2024-06-23 16:23:21,642][15401] Updated weights for policy 0, policy_version 471440 (0.0033) [2024-06-23 16:23:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7724122112. Throughput: 0: 42206.1. Samples: 7724193280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 16:23:26,087][15401] Updated weights for policy 0, policy_version 471450 (0.0030) [2024-06-23 16:23:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7724335104. Throughput: 0: 42536.5. Samples: 7724459140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 16:23:29,395][15401] Updated weights for policy 0, policy_version 471460 (0.0031) [2024-06-23 16:23:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42052.2, 300 sec: 42542.5). Total num frames: 7724548096. Throughput: 0: 42363.4. Samples: 7724708680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:33,393][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 16:23:33,641][15401] Updated weights for policy 0, policy_version 471470 (0.0029) [2024-06-23 16:23:37,544][15401] Updated weights for policy 0, policy_version 471480 (0.0029) [2024-06-23 16:23:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7724761088. Throughput: 0: 42399.1. Samples: 7724835180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:38,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 16:23:41,206][15401] Updated weights for policy 0, policy_version 471490 (0.0042) [2024-06-23 16:23:43,391][15132] Fps is (10 sec: 42603.0, 60 sec: 42597.4, 300 sec: 42543.0). Total num frames: 7724974080. Throughput: 0: 42531.7. Samples: 7725089100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:43,391][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 16:23:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471495_7724974080.pth... [2024-06-23 16:23:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000470874_7714799616.pth [2024-06-23 16:23:45,083][15401] Updated weights for policy 0, policy_version 471500 (0.0039) [2024-06-23 16:23:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7725187072. Throughput: 0: 42350.7. Samples: 7725339040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 16:23:48,741][15401] Updated weights for policy 0, policy_version 471510 (0.0025) [2024-06-23 16:23:52,743][15401] Updated weights for policy 0, policy_version 471520 (0.0036) [2024-06-23 16:23:53,389][15132] Fps is (10 sec: 40966.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 7725383680. Throughput: 0: 42416.1. Samples: 7725469760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:53,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-23 16:23:56,941][15401] Updated weights for policy 0, policy_version 471530 (0.0033) [2024-06-23 16:23:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 7725580288. Throughput: 0: 42390.6. Samples: 7725725480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:23:58,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-23 16:24:00,302][15401] Updated weights for policy 0, policy_version 471540 (0.0043) [2024-06-23 16:24:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7725842432. Throughput: 0: 42621.0. Samples: 7725985780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:03,390][15132] Avg episode reward: [(0, '0.100')] [2024-06-23 16:24:04,302][15401] Updated weights for policy 0, policy_version 471550 (0.0023) [2024-06-23 16:24:07,963][15401] Updated weights for policy 0, policy_version 471560 (0.0034) [2024-06-23 16:24:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7726039040. Throughput: 0: 42781.3. Samples: 7726118440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 16:24:11,803][15401] Updated weights for policy 0, policy_version 471570 (0.0026) [2024-06-23 16:24:13,390][15132] Fps is (10 sec: 39317.8, 60 sec: 42324.6, 300 sec: 42431.9). Total num frames: 7726235648. Throughput: 0: 42376.8. Samples: 7726366140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:13,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 16:24:15,705][15401] Updated weights for policy 0, policy_version 471580 (0.0045) [2024-06-23 16:24:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 7726465024. Throughput: 0: 42568.6. Samples: 7726624160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 16:24:19,559][15401] Updated weights for policy 0, policy_version 471590 (0.0036) [2024-06-23 16:24:23,389][15132] Fps is (10 sec: 42602.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 7726661632. Throughput: 0: 42674.2. Samples: 7726755520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 16:24:23,721][15401] Updated weights for policy 0, policy_version 471600 (0.0040) [2024-06-23 16:24:27,119][15401] Updated weights for policy 0, policy_version 471610 (0.0042) [2024-06-23 16:24:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 7726891008. Throughput: 0: 42572.9. Samples: 7727004820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 16:24:31,531][15401] Updated weights for policy 0, policy_version 471620 (0.0023) [2024-06-23 16:24:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42543.2). Total num frames: 7727104000. Throughput: 0: 42744.4. Samples: 7727262540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 16:24:34,917][15401] Updated weights for policy 0, policy_version 471630 (0.0038) [2024-06-23 16:24:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 7727300608. Throughput: 0: 42603.1. Samples: 7727386900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 16:24:39,041][15349] Signal inference workers to stop experience collection... (114450 times) [2024-06-23 16:24:39,042][15349] Signal inference workers to resume experience collection... (114450 times) [2024-06-23 16:24:39,073][15401] InferenceWorker_p0-w0: stopping experience collection (114450 times) [2024-06-23 16:24:39,102][15401] InferenceWorker_p0-w0: resuming experience collection (114450 times) [2024-06-23 16:24:39,177][15401] Updated weights for policy 0, policy_version 471640 (0.0038) [2024-06-23 16:24:42,922][15401] Updated weights for policy 0, policy_version 471650 (0.0034) [2024-06-23 16:24:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.3, 300 sec: 42431.8). Total num frames: 7727513600. Throughput: 0: 42595.5. Samples: 7727642280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 16:24:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 16:24:47,062][15401] Updated weights for policy 0, policy_version 471660 (0.0030) [2024-06-23 16:24:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7727726592. Throughput: 0: 42523.6. Samples: 7727899340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:24:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 16:24:50,441][15401] Updated weights for policy 0, policy_version 471670 (0.0031) [2024-06-23 16:24:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 7727939584. Throughput: 0: 42447.6. Samples: 7728028580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:24:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 16:24:54,895][15401] Updated weights for policy 0, policy_version 471680 (0.0045) [2024-06-23 16:24:58,099][15401] Updated weights for policy 0, policy_version 471690 (0.0039) [2024-06-23 16:24:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 7728168960. Throughput: 0: 42521.4. Samples: 7728279560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:24:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 16:25:02,391][15401] Updated weights for policy 0, policy_version 471700 (0.0033) [2024-06-23 16:25:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 7728381952. Throughput: 0: 42615.0. Samples: 7728541940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:03,393][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 16:25:05,626][15401] Updated weights for policy 0, policy_version 471710 (0.0033) [2024-06-23 16:25:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42377.2). Total num frames: 7728562176. Throughput: 0: 42517.4. Samples: 7728668800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 16:25:09,949][15401] Updated weights for policy 0, policy_version 471720 (0.0033) [2024-06-23 16:25:13,153][15401] Updated weights for policy 0, policy_version 471730 (0.0034) [2024-06-23 16:25:13,392][15132] Fps is (10 sec: 44236.9, 60 sec: 43143.5, 300 sec: 42598.1). Total num frames: 7728824320. Throughput: 0: 42646.5. Samples: 7728924020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:13,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 16:25:17,545][15401] Updated weights for policy 0, policy_version 471740 (0.0035) [2024-06-23 16:25:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7729004544. Throughput: 0: 42813.0. Samples: 7729189120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 16:25:20,873][15401] Updated weights for policy 0, policy_version 471750 (0.0033) [2024-06-23 16:25:23,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 7729217536. Throughput: 0: 42747.6. Samples: 7729310540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 16:25:25,308][15401] Updated weights for policy 0, policy_version 471760 (0.0029) [2024-06-23 16:25:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7729446912. Throughput: 0: 42684.4. Samples: 7729563080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 16:25:28,845][15401] Updated weights for policy 0, policy_version 471770 (0.0030) [2024-06-23 16:25:32,951][15401] Updated weights for policy 0, policy_version 471780 (0.0027) [2024-06-23 16:25:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7729643520. Throughput: 0: 42717.4. Samples: 7729821620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 16:25:36,253][15401] Updated weights for policy 0, policy_version 471790 (0.0042) [2024-06-23 16:25:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7729856512. Throughput: 0: 42611.5. Samples: 7729946100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 16:25:40,775][15401] Updated weights for policy 0, policy_version 471800 (0.0048) [2024-06-23 16:25:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7730102272. Throughput: 0: 42798.1. Samples: 7730205480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 16:25:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471809_7730118656.pth... [2024-06-23 16:25:43,634][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471185_7719895040.pth [2024-06-23 16:25:43,792][15401] Updated weights for policy 0, policy_version 471810 (0.0031) [2024-06-23 16:25:48,305][15401] Updated weights for policy 0, policy_version 471820 (0.0038) [2024-06-23 16:25:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 7730298880. Throughput: 0: 42764.6. Samples: 7730466240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 16:25:51,187][15401] Updated weights for policy 0, policy_version 471830 (0.0031) [2024-06-23 16:25:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 7730495488. Throughput: 0: 42571.9. Samples: 7730584540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 16:25:55,836][15401] Updated weights for policy 0, policy_version 471840 (0.0043) [2024-06-23 16:25:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7730741248. Throughput: 0: 42843.3. Samples: 7730851860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:25:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 16:25:58,748][15401] Updated weights for policy 0, policy_version 471850 (0.0038) [2024-06-23 16:26:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 7730937856. Throughput: 0: 42656.0. Samples: 7731108640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:26:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 16:26:03,435][15401] Updated weights for policy 0, policy_version 471860 (0.0043) [2024-06-23 16:26:07,062][15401] Updated weights for policy 0, policy_version 471870 (0.0036) [2024-06-23 16:26:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 7731134464. Throughput: 0: 42626.7. Samples: 7731228740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 16:26:08,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 16:26:11,498][15401] Updated weights for policy 0, policy_version 471880 (0.0032) [2024-06-23 16:26:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 7731380224. Throughput: 0: 42850.7. Samples: 7731491360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 16:26:14,903][15401] Updated weights for policy 0, policy_version 471890 (0.0032) [2024-06-23 16:26:18,392][15132] Fps is (10 sec: 42587.2, 60 sec: 42596.5, 300 sec: 42431.4). Total num frames: 7731560448. Throughput: 0: 42771.7. Samples: 7731746460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:18,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 16:26:19,212][15401] Updated weights for policy 0, policy_version 471900 (0.0033) [2024-06-23 16:26:19,808][15349] Signal inference workers to stop experience collection... (114500 times) [2024-06-23 16:26:19,810][15349] Signal inference workers to resume experience collection... (114500 times) [2024-06-23 16:26:19,850][15401] InferenceWorker_p0-w0: stopping experience collection (114500 times) [2024-06-23 16:26:19,850][15401] InferenceWorker_p0-w0: resuming experience collection (114500 times) [2024-06-23 16:26:22,514][15401] Updated weights for policy 0, policy_version 471910 (0.0027) [2024-06-23 16:26:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42487.3). Total num frames: 7731789824. Throughput: 0: 42715.5. Samples: 7731868400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:23,393][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 16:26:26,950][15401] Updated weights for policy 0, policy_version 471920 (0.0023) [2024-06-23 16:26:28,389][15132] Fps is (10 sec: 47525.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 7732035584. Throughput: 0: 42868.5. Samples: 7732134560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 16:26:30,186][15401] Updated weights for policy 0, policy_version 471930 (0.0028) [2024-06-23 16:26:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 7732199424. Throughput: 0: 42881.6. Samples: 7732395920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 16:26:34,445][15401] Updated weights for policy 0, policy_version 471940 (0.0031) [2024-06-23 16:26:37,922][15401] Updated weights for policy 0, policy_version 471950 (0.0032) [2024-06-23 16:26:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 7732428800. Throughput: 0: 42821.4. Samples: 7732511500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 16:26:41,988][15401] Updated weights for policy 0, policy_version 471960 (0.0045) [2024-06-23 16:26:43,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7732674560. Throughput: 0: 42679.0. Samples: 7732772420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 16:26:45,534][15401] Updated weights for policy 0, policy_version 471970 (0.0029) [2024-06-23 16:26:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7732838400. Throughput: 0: 42829.7. Samples: 7733035980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 16:26:49,632][15401] Updated weights for policy 0, policy_version 471980 (0.0039) [2024-06-23 16:26:53,090][15401] Updated weights for policy 0, policy_version 471990 (0.0040) [2024-06-23 16:26:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 7733084160. Throughput: 0: 42771.9. Samples: 7733153580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:53,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 16:26:57,126][15401] Updated weights for policy 0, policy_version 472000 (0.0021) [2024-06-23 16:26:58,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 7733329920. Throughput: 0: 42904.5. Samples: 7733422060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:26:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 16:27:00,661][15401] Updated weights for policy 0, policy_version 472010 (0.0035) [2024-06-23 16:27:03,396][15132] Fps is (10 sec: 40943.2, 60 sec: 42593.8, 300 sec: 42542.3). Total num frames: 7733493760. Throughput: 0: 42958.4. Samples: 7733679760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:03,397][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 16:27:05,045][15401] Updated weights for policy 0, policy_version 472020 (0.0025) [2024-06-23 16:27:08,396][15132] Fps is (10 sec: 39296.3, 60 sec: 43139.8, 300 sec: 42653.0). Total num frames: 7733723136. Throughput: 0: 42892.6. Samples: 7733798740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:08,396][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 16:27:08,688][15401] Updated weights for policy 0, policy_version 472030 (0.0036) [2024-06-23 16:27:12,527][15401] Updated weights for policy 0, policy_version 472040 (0.0044) [2024-06-23 16:27:13,389][15132] Fps is (10 sec: 45905.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7733952512. Throughput: 0: 42946.6. Samples: 7734067160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 16:27:16,032][15401] Updated weights for policy 0, policy_version 472050 (0.0027) [2024-06-23 16:27:18,389][15132] Fps is (10 sec: 42626.1, 60 sec: 43146.4, 300 sec: 42598.4). Total num frames: 7734149120. Throughput: 0: 42797.9. Samples: 7734321820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 16:27:20,044][15401] Updated weights for policy 0, policy_version 472060 (0.0038) [2024-06-23 16:27:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 7734362112. Throughput: 0: 43031.6. Samples: 7734447920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 16:27:23,677][15401] Updated weights for policy 0, policy_version 472070 (0.0039) [2024-06-23 16:27:27,170][15349] Signal inference workers to stop experience collection... (114550 times) [2024-06-23 16:27:27,171][15349] Signal inference workers to resume experience collection... (114550 times) [2024-06-23 16:27:27,188][15401] InferenceWorker_p0-w0: stopping experience collection (114550 times) [2024-06-23 16:27:27,188][15401] InferenceWorker_p0-w0: resuming experience collection (114550 times) [2024-06-23 16:27:27,755][15401] Updated weights for policy 0, policy_version 472080 (0.0042) [2024-06-23 16:27:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.6, 300 sec: 42542.9). Total num frames: 7734575104. Throughput: 0: 43097.7. Samples: 7734711920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:28,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 16:27:31,251][15401] Updated weights for policy 0, policy_version 472090 (0.0033) [2024-06-23 16:27:33,390][15132] Fps is (10 sec: 44234.6, 60 sec: 43417.3, 300 sec: 42653.9). Total num frames: 7734804480. Throughput: 0: 42794.2. Samples: 7734961740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 16:27:33,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 16:27:35,217][15401] Updated weights for policy 0, policy_version 472100 (0.0035) [2024-06-23 16:27:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7735017472. Throughput: 0: 43229.8. Samples: 7735098820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:27:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 16:27:38,607][15401] Updated weights for policy 0, policy_version 472110 (0.0046) [2024-06-23 16:27:43,061][15401] Updated weights for policy 0, policy_version 472120 (0.0038) [2024-06-23 16:27:43,390][15132] Fps is (10 sec: 42599.8, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 7735230464. Throughput: 0: 42861.6. Samples: 7735350840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:27:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 16:27:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472121_7735230464.pth... [2024-06-23 16:27:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471495_7724974080.pth [2024-06-23 16:27:46,317][15401] Updated weights for policy 0, policy_version 472130 (0.0036) [2024-06-23 16:27:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7735427072. Throughput: 0: 42812.8. Samples: 7735606060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:27:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 16:27:50,682][15401] Updated weights for policy 0, policy_version 472140 (0.0032) [2024-06-23 16:27:53,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 7735640064. Throughput: 0: 42984.2. Samples: 7735732860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:27:53,393][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 16:27:54,036][15401] Updated weights for policy 0, policy_version 472150 (0.0027) [2024-06-23 16:27:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 7735836672. Throughput: 0: 42748.0. Samples: 7735990820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:27:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 16:27:58,743][15401] Updated weights for policy 0, policy_version 472160 (0.0040) [2024-06-23 16:28:01,622][15401] Updated weights for policy 0, policy_version 472170 (0.0044) [2024-06-23 16:28:03,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42876.2, 300 sec: 42598.4). Total num frames: 7736066048. Throughput: 0: 42631.6. Samples: 7736240240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 16:28:06,271][15401] Updated weights for policy 0, policy_version 472180 (0.0030) [2024-06-23 16:28:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 7736279040. Throughput: 0: 42753.4. Samples: 7736371820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:08,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 16:28:09,220][15401] Updated weights for policy 0, policy_version 472190 (0.0040) [2024-06-23 16:28:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7736492032. Throughput: 0: 42670.7. Samples: 7736632000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 16:28:13,719][15401] Updated weights for policy 0, policy_version 472200 (0.0039) [2024-06-23 16:28:16,905][15401] Updated weights for policy 0, policy_version 472210 (0.0032) [2024-06-23 16:28:18,390][15132] Fps is (10 sec: 44233.4, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 7736721408. Throughput: 0: 42761.1. Samples: 7736886000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:18,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 16:28:21,388][15401] Updated weights for policy 0, policy_version 472220 (0.0037) [2024-06-23 16:28:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7736934400. Throughput: 0: 42650.0. Samples: 7737018060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 16:28:24,462][15401] Updated weights for policy 0, policy_version 472230 (0.0034) [2024-06-23 16:28:28,389][15132] Fps is (10 sec: 40962.8, 60 sec: 42600.1, 300 sec: 42654.3). Total num frames: 7737131008. Throughput: 0: 42781.9. Samples: 7737276020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 16:28:28,960][15401] Updated weights for policy 0, policy_version 472240 (0.0036) [2024-06-23 16:28:32,246][15401] Updated weights for policy 0, policy_version 472250 (0.0032) [2024-06-23 16:28:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 7737360384. Throughput: 0: 42659.6. Samples: 7737525740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 16:28:36,875][15401] Updated weights for policy 0, policy_version 472260 (0.0033) [2024-06-23 16:28:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 7737573376. Throughput: 0: 42735.6. Samples: 7737655860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 16:28:39,983][15401] Updated weights for policy 0, policy_version 472270 (0.0029) [2024-06-23 16:28:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7737769984. Throughput: 0: 42686.6. Samples: 7737911720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 16:28:44,454][15401] Updated weights for policy 0, policy_version 472280 (0.0027) [2024-06-23 16:28:44,454][15349] Signal inference workers to stop experience collection... (114600 times) [2024-06-23 16:28:44,455][15349] Signal inference workers to resume experience collection... (114600 times) [2024-06-23 16:28:44,500][15401] InferenceWorker_p0-w0: stopping experience collection (114600 times) [2024-06-23 16:28:44,500][15401] InferenceWorker_p0-w0: resuming experience collection (114600 times) [2024-06-23 16:28:47,824][15401] Updated weights for policy 0, policy_version 472290 (0.0033) [2024-06-23 16:28:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7738015744. Throughput: 0: 42503.9. Samples: 7738152920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 16:28:51,986][15401] Updated weights for policy 0, policy_version 472300 (0.0034) [2024-06-23 16:28:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 7738212352. Throughput: 0: 42582.6. Samples: 7738288040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 16:28:55,533][15401] Updated weights for policy 0, policy_version 472310 (0.0039) [2024-06-23 16:28:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7738408960. Throughput: 0: 42551.1. Samples: 7738546800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 16:28:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 16:28:59,581][15401] Updated weights for policy 0, policy_version 472320 (0.0035) [2024-06-23 16:29:03,358][15401] Updated weights for policy 0, policy_version 472330 (0.0026) [2024-06-23 16:29:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7738654720. Throughput: 0: 42362.5. Samples: 7738792280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 16:29:07,122][15401] Updated weights for policy 0, policy_version 472340 (0.0035) [2024-06-23 16:29:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.7). Total num frames: 7738867712. Throughput: 0: 42515.8. Samples: 7738931280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 16:29:10,977][15401] Updated weights for policy 0, policy_version 472350 (0.0033) [2024-06-23 16:29:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7739047936. Throughput: 0: 42537.8. Samples: 7739190220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:13,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-23 16:29:15,065][15401] Updated weights for policy 0, policy_version 472360 (0.0036) [2024-06-23 16:29:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 7739277312. Throughput: 0: 42444.9. Samples: 7739435760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 16:29:18,813][15401] Updated weights for policy 0, policy_version 472370 (0.0045) [2024-06-23 16:29:22,691][15401] Updated weights for policy 0, policy_version 472380 (0.0028) [2024-06-23 16:29:23,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 7739490304. Throughput: 0: 42491.5. Samples: 7739568080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:23,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 16:29:26,536][15401] Updated weights for policy 0, policy_version 472390 (0.0030) [2024-06-23 16:29:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7739703296. Throughput: 0: 42521.8. Samples: 7739825200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:28,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 16:29:30,455][15401] Updated weights for policy 0, policy_version 472400 (0.0035) [2024-06-23 16:29:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7739916288. Throughput: 0: 42871.6. Samples: 7740082140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 16:29:34,062][15401] Updated weights for policy 0, policy_version 472410 (0.0037) [2024-06-23 16:29:37,818][15401] Updated weights for policy 0, policy_version 472420 (0.0035) [2024-06-23 16:29:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7740145664. Throughput: 0: 42760.5. Samples: 7740212260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 16:29:41,657][15401] Updated weights for policy 0, policy_version 472430 (0.0036) [2024-06-23 16:29:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7740342272. Throughput: 0: 42568.5. Samples: 7740462380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 16:29:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472434_7740358656.pth... [2024-06-23 16:29:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000471809_7730118656.pth [2024-06-23 16:29:45,427][15401] Updated weights for policy 0, policy_version 472440 (0.0036) [2024-06-23 16:29:48,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42052.1, 300 sec: 42709.4). Total num frames: 7740538880. Throughput: 0: 42976.6. Samples: 7740726240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 16:29:49,355][15401] Updated weights for policy 0, policy_version 472450 (0.0033) [2024-06-23 16:29:53,226][15401] Updated weights for policy 0, policy_version 472460 (0.0026) [2024-06-23 16:29:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7740784640. Throughput: 0: 42668.1. Samples: 7740851340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 16:29:55,563][15349] Signal inference workers to stop experience collection... (114650 times) [2024-06-23 16:29:55,573][15401] InferenceWorker_p0-w0: stopping experience collection (114650 times) [2024-06-23 16:29:55,622][15349] Signal inference workers to resume experience collection... (114650 times) [2024-06-23 16:29:55,622][15401] InferenceWorker_p0-w0: resuming experience collection (114650 times) [2024-06-23 16:29:57,027][15401] Updated weights for policy 0, policy_version 472470 (0.0029) [2024-06-23 16:29:58,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7740981248. Throughput: 0: 42632.4. Samples: 7741108680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:29:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 16:30:00,725][15401] Updated weights for policy 0, policy_version 472480 (0.0033) [2024-06-23 16:30:03,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.5, 300 sec: 42764.7). Total num frames: 7741177856. Throughput: 0: 43037.7. Samples: 7741372560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:30:03,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 16:30:04,610][15401] Updated weights for policy 0, policy_version 472490 (0.0026) [2024-06-23 16:30:08,271][15401] Updated weights for policy 0, policy_version 472500 (0.0033) [2024-06-23 16:30:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 7741440000. Throughput: 0: 42868.6. Samples: 7741497060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:30:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 16:30:12,509][15401] Updated weights for policy 0, policy_version 472510 (0.0041) [2024-06-23 16:30:13,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7741636608. Throughput: 0: 42906.3. Samples: 7741755980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:30:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 16:30:15,930][15401] Updated weights for policy 0, policy_version 472520 (0.0039) [2024-06-23 16:30:18,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7741816832. Throughput: 0: 42901.7. Samples: 7742012720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:30:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 16:30:20,219][15401] Updated weights for policy 0, policy_version 472530 (0.0043) [2024-06-23 16:30:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 7742062592. Throughput: 0: 42728.1. Samples: 7742135020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 16:30:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 16:30:23,635][15401] Updated weights for policy 0, policy_version 472540 (0.0035) [2024-06-23 16:30:27,891][15401] Updated weights for policy 0, policy_version 472550 (0.0032) [2024-06-23 16:30:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7742275584. Throughput: 0: 42853.0. Samples: 7742390760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 16:30:31,204][15401] Updated weights for policy 0, policy_version 472560 (0.0036) [2024-06-23 16:30:33,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 7742472192. Throughput: 0: 42777.3. Samples: 7742651220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 16:30:35,729][15401] Updated weights for policy 0, policy_version 472570 (0.0039) [2024-06-23 16:30:38,392][15132] Fps is (10 sec: 42587.3, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 7742701568. Throughput: 0: 42741.1. Samples: 7742774800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:38,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:30:38,851][15401] Updated weights for policy 0, policy_version 472580 (0.0035) [2024-06-23 16:30:43,282][15401] Updated weights for policy 0, policy_version 472590 (0.0031) [2024-06-23 16:30:43,389][15132] Fps is (10 sec: 44238.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7742914560. Throughput: 0: 42734.2. Samples: 7743031720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 16:30:46,419][15401] Updated weights for policy 0, policy_version 472600 (0.0034) [2024-06-23 16:30:48,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7743111168. Throughput: 0: 42664.0. Samples: 7743292340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 16:30:50,849][15401] Updated weights for policy 0, policy_version 472610 (0.0030) [2024-06-23 16:30:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 7743340544. Throughput: 0: 42678.5. Samples: 7743417700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:53,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 16:30:54,106][15401] Updated weights for policy 0, policy_version 472620 (0.0031) [2024-06-23 16:30:58,389][15401] Updated weights for policy 0, policy_version 472630 (0.0036) [2024-06-23 16:30:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7743569920. Throughput: 0: 42836.4. Samples: 7743683620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:30:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 16:31:02,193][15401] Updated weights for policy 0, policy_version 472640 (0.0034) [2024-06-23 16:31:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 7743766528. Throughput: 0: 42706.2. Samples: 7743934500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 16:31:06,590][15401] Updated weights for policy 0, policy_version 472650 (0.0027) [2024-06-23 16:31:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7743995904. Throughput: 0: 42858.5. Samples: 7744063660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 16:31:10,007][15401] Updated weights for policy 0, policy_version 472660 (0.0026) [2024-06-23 16:31:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 7744176128. Throughput: 0: 42985.6. Samples: 7744325120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:13,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-23 16:31:14,008][15401] Updated weights for policy 0, policy_version 472670 (0.0039) [2024-06-23 16:31:17,587][15401] Updated weights for policy 0, policy_version 472680 (0.0031) [2024-06-23 16:31:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.8, 300 sec: 42820.6). Total num frames: 7744421888. Throughput: 0: 42732.6. Samples: 7744574280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:18,393][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 16:31:21,500][15401] Updated weights for policy 0, policy_version 472690 (0.0042) [2024-06-23 16:31:22,419][15349] Signal inference workers to stop experience collection... (114700 times) [2024-06-23 16:31:22,419][15349] Signal inference workers to resume experience collection... (114700 times) [2024-06-23 16:31:22,432][15401] InferenceWorker_p0-w0: stopping experience collection (114700 times) [2024-06-23 16:31:22,432][15401] InferenceWorker_p0-w0: resuming experience collection (114700 times) [2024-06-23 16:31:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7744634880. Throughput: 0: 42862.9. Samples: 7744703520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:23,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 16:31:25,237][15401] Updated weights for policy 0, policy_version 472700 (0.0027) [2024-06-23 16:31:28,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7744815104. Throughput: 0: 42962.7. Samples: 7744965040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 16:31:29,121][15401] Updated weights for policy 0, policy_version 472710 (0.0036) [2024-06-23 16:31:32,866][15401] Updated weights for policy 0, policy_version 472720 (0.0023) [2024-06-23 16:31:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7745044480. Throughput: 0: 42788.0. Samples: 7745217800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 16:31:36,648][15401] Updated weights for policy 0, policy_version 472730 (0.0041) [2024-06-23 16:31:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42871.6, 300 sec: 42709.2). Total num frames: 7745273856. Throughput: 0: 43006.4. Samples: 7745352980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:38,392][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 16:31:40,503][15401] Updated weights for policy 0, policy_version 472740 (0.0030) [2024-06-23 16:31:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7745470464. Throughput: 0: 42831.5. Samples: 7745611040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 16:31:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472746_7745470464.pth... [2024-06-23 16:31:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472121_7735230464.pth [2024-06-23 16:31:44,429][15401] Updated weights for policy 0, policy_version 472750 (0.0036) [2024-06-23 16:31:47,994][15401] Updated weights for policy 0, policy_version 472760 (0.0025) [2024-06-23 16:31:48,389][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 7745699840. Throughput: 0: 42891.2. Samples: 7745864600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 16:31:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 16:31:51,963][15401] Updated weights for policy 0, policy_version 472770 (0.0027) [2024-06-23 16:31:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 7745912832. Throughput: 0: 42962.3. Samples: 7745996960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:31:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 16:31:55,467][15401] Updated weights for policy 0, policy_version 472780 (0.0028) [2024-06-23 16:31:58,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42597.4, 300 sec: 42821.3). Total num frames: 7746125824. Throughput: 0: 42858.7. Samples: 7746253820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:31:58,391][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 16:31:59,909][15401] Updated weights for policy 0, policy_version 472790 (0.0040) [2024-06-23 16:32:03,330][15401] Updated weights for policy 0, policy_version 472800 (0.0028) [2024-06-23 16:32:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 7746355200. Throughput: 0: 42763.1. Samples: 7746498520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 16:32:07,494][15401] Updated weights for policy 0, policy_version 472810 (0.0036) [2024-06-23 16:32:08,390][15132] Fps is (10 sec: 42604.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7746551808. Throughput: 0: 42782.2. Samples: 7746628720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 16:32:10,975][15401] Updated weights for policy 0, policy_version 472820 (0.0033) [2024-06-23 16:32:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7746764800. Throughput: 0: 42677.3. Samples: 7746885520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 16:32:15,072][15401] Updated weights for policy 0, policy_version 472830 (0.0037) [2024-06-23 16:32:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 7746977792. Throughput: 0: 42763.2. Samples: 7747142140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 16:32:18,554][15401] Updated weights for policy 0, policy_version 472840 (0.0028) [2024-06-23 16:32:22,946][15401] Updated weights for policy 0, policy_version 472850 (0.0041) [2024-06-23 16:32:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 7747190784. Throughput: 0: 42574.1. Samples: 7747268720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 16:32:26,590][15401] Updated weights for policy 0, policy_version 472860 (0.0028) [2024-06-23 16:32:28,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43412.9, 300 sec: 42764.2). Total num frames: 7747420160. Throughput: 0: 42675.3. Samples: 7747531700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:28,396][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 16:32:30,626][15401] Updated weights for policy 0, policy_version 472870 (0.0038) [2024-06-23 16:32:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7747616768. Throughput: 0: 42800.4. Samples: 7747790620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:33,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 16:32:34,165][15401] Updated weights for policy 0, policy_version 472880 (0.0035) [2024-06-23 16:32:38,154][15401] Updated weights for policy 0, policy_version 472890 (0.0023) [2024-06-23 16:32:38,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7747829760. Throughput: 0: 42748.1. Samples: 7747920620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:38,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 16:32:41,930][15401] Updated weights for policy 0, policy_version 472900 (0.0033) [2024-06-23 16:32:41,937][15349] Signal inference workers to stop experience collection... (114750 times) [2024-06-23 16:32:41,938][15349] Signal inference workers to resume experience collection... (114750 times) [2024-06-23 16:32:41,964][15401] InferenceWorker_p0-w0: stopping experience collection (114750 times) [2024-06-23 16:32:41,964][15401] InferenceWorker_p0-w0: resuming experience collection (114750 times) [2024-06-23 16:32:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7748042752. Throughput: 0: 42692.0. Samples: 7748174900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 16:32:45,647][15401] Updated weights for policy 0, policy_version 472910 (0.0043) [2024-06-23 16:32:48,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 7748272128. Throughput: 0: 42922.1. Samples: 7748430020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 16:32:49,403][15401] Updated weights for policy 0, policy_version 472920 (0.0030) [2024-06-23 16:32:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7748468736. Throughput: 0: 42933.3. Samples: 7748560720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 16:32:53,672][15401] Updated weights for policy 0, policy_version 472930 (0.0043) [2024-06-23 16:32:57,034][15401] Updated weights for policy 0, policy_version 472940 (0.0045) [2024-06-23 16:32:58,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42872.5, 300 sec: 42820.5). Total num frames: 7748698112. Throughput: 0: 42784.3. Samples: 7748810820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:32:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 16:33:01,138][15401] Updated weights for policy 0, policy_version 472950 (0.0043) [2024-06-23 16:33:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7748911104. Throughput: 0: 42799.9. Samples: 7749068140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:33:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 16:33:04,591][15401] Updated weights for policy 0, policy_version 472960 (0.0031) [2024-06-23 16:33:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7749107712. Throughput: 0: 42840.1. Samples: 7749196520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:33:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 16:33:08,951][15401] Updated weights for policy 0, policy_version 472970 (0.0033) [2024-06-23 16:33:12,014][15401] Updated weights for policy 0, policy_version 472980 (0.0031) [2024-06-23 16:33:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 7749353472. Throughput: 0: 42675.9. Samples: 7749451840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 16:33:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 16:33:16,726][15401] Updated weights for policy 0, policy_version 472990 (0.0030) [2024-06-23 16:33:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7749550080. Throughput: 0: 42624.0. Samples: 7749708700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 16:33:20,241][15401] Updated weights for policy 0, policy_version 473000 (0.0025) [2024-06-23 16:33:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7749746688. Throughput: 0: 42589.3. Samples: 7749837140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 16:33:24,219][15401] Updated weights for policy 0, policy_version 473010 (0.0033) [2024-06-23 16:33:27,928][15401] Updated weights for policy 0, policy_version 473020 (0.0034) [2024-06-23 16:33:28,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42601.2, 300 sec: 42764.7). Total num frames: 7749976064. Throughput: 0: 42558.2. Samples: 7750090120. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:28,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 16:33:31,788][15401] Updated weights for policy 0, policy_version 473030 (0.0042) [2024-06-23 16:33:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7750189056. Throughput: 0: 42636.6. Samples: 7750348660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 16:33:35,576][15401] Updated weights for policy 0, policy_version 473040 (0.0032) [2024-06-23 16:33:38,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7750385664. Throughput: 0: 42527.7. Samples: 7750474460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 16:33:39,328][15401] Updated weights for policy 0, policy_version 473050 (0.0044) [2024-06-23 16:33:43,162][15401] Updated weights for policy 0, policy_version 473060 (0.0035) [2024-06-23 16:33:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7750631424. Throughput: 0: 42699.2. Samples: 7750732280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:43,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 16:33:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473061_7750631424.pth... [2024-06-23 16:33:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472434_7740358656.pth [2024-06-23 16:33:47,329][15401] Updated weights for policy 0, policy_version 473070 (0.0026) [2024-06-23 16:33:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 7750795264. Throughput: 0: 42702.7. Samples: 7750989760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 16:33:50,767][15401] Updated weights for policy 0, policy_version 473080 (0.0046) [2024-06-23 16:33:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7751041024. Throughput: 0: 42489.7. Samples: 7751108560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 16:33:55,126][15401] Updated weights for policy 0, policy_version 473090 (0.0032) [2024-06-23 16:33:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7751254016. Throughput: 0: 42607.6. Samples: 7751369180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:33:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 16:33:58,516][15401] Updated weights for policy 0, policy_version 473100 (0.0033) [2024-06-23 16:34:02,759][15401] Updated weights for policy 0, policy_version 473110 (0.0030) [2024-06-23 16:34:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7751450624. Throughput: 0: 42710.1. Samples: 7751630660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 16:34:06,469][15401] Updated weights for policy 0, policy_version 473120 (0.0041) [2024-06-23 16:34:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 7751696384. Throughput: 0: 42633.6. Samples: 7751755660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 16:34:10,397][15401] Updated weights for policy 0, policy_version 473130 (0.0029) [2024-06-23 16:34:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7751892992. Throughput: 0: 42718.8. Samples: 7752012360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:13,398][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 16:34:14,463][15401] Updated weights for policy 0, policy_version 473140 (0.0040) [2024-06-23 16:34:14,566][15349] Signal inference workers to stop experience collection... (114800 times) [2024-06-23 16:34:14,614][15401] InferenceWorker_p0-w0: stopping experience collection (114800 times) [2024-06-23 16:34:14,626][15349] Signal inference workers to resume experience collection... (114800 times) [2024-06-23 16:34:14,633][15401] InferenceWorker_p0-w0: resuming experience collection (114800 times) [2024-06-23 16:34:17,927][15401] Updated weights for policy 0, policy_version 473150 (0.0029) [2024-06-23 16:34:18,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42323.6, 300 sec: 42709.5). Total num frames: 7752089600. Throughput: 0: 42556.4. Samples: 7752263800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:18,393][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 16:34:21,992][15401] Updated weights for policy 0, policy_version 473160 (0.0032) [2024-06-23 16:34:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 7752335360. Throughput: 0: 42668.3. Samples: 7752394540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 16:34:25,480][15401] Updated weights for policy 0, policy_version 473170 (0.0035) [2024-06-23 16:34:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7752515584. Throughput: 0: 42605.4. Samples: 7752649520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 16:34:29,705][15401] Updated weights for policy 0, policy_version 473180 (0.0044) [2024-06-23 16:34:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7752728576. Throughput: 0: 42492.9. Samples: 7752901940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 16:34:33,506][15401] Updated weights for policy 0, policy_version 473190 (0.0029) [2024-06-23 16:34:37,315][15401] Updated weights for policy 0, policy_version 473200 (0.0044) [2024-06-23 16:34:38,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7752974336. Throughput: 0: 42743.5. Samples: 7753032020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-23 16:34:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 16:34:40,959][15401] Updated weights for policy 0, policy_version 473210 (0.0037) [2024-06-23 16:34:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 7753138176. Throughput: 0: 42687.9. Samples: 7753290140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:34:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 16:34:44,718][15401] Updated weights for policy 0, policy_version 473220 (0.0028) [2024-06-23 16:34:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7753383936. Throughput: 0: 42359.1. Samples: 7753536820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:34:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 16:34:48,696][15401] Updated weights for policy 0, policy_version 473230 (0.0031) [2024-06-23 16:34:52,265][15401] Updated weights for policy 0, policy_version 473240 (0.0031) [2024-06-23 16:34:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7753596928. Throughput: 0: 42592.7. Samples: 7753672320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:34:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 16:34:56,383][15401] Updated weights for policy 0, policy_version 473250 (0.0038) [2024-06-23 16:34:58,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42050.5, 300 sec: 42709.5). Total num frames: 7753777152. Throughput: 0: 42534.0. Samples: 7753926500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:34:58,393][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 16:35:00,065][15401] Updated weights for policy 0, policy_version 473260 (0.0027) [2024-06-23 16:35:03,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 7754039296. Throughput: 0: 42531.9. Samples: 7754177740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:03,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 16:35:04,043][15401] Updated weights for policy 0, policy_version 473270 (0.0042) [2024-06-23 16:35:07,960][15401] Updated weights for policy 0, policy_version 473280 (0.0033) [2024-06-23 16:35:08,389][15132] Fps is (10 sec: 45887.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 7754235904. Throughput: 0: 42689.0. Samples: 7754315540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 16:35:12,289][15401] Updated weights for policy 0, policy_version 473290 (0.0032) [2024-06-23 16:35:13,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7754432512. Throughput: 0: 42797.3. Samples: 7754575400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 16:35:15,485][15401] Updated weights for policy 0, policy_version 473300 (0.0033) [2024-06-23 16:35:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 7754678272. Throughput: 0: 42644.3. Samples: 7754820940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 16:35:19,844][15401] Updated weights for policy 0, policy_version 473310 (0.0032) [2024-06-23 16:35:23,119][15401] Updated weights for policy 0, policy_version 473320 (0.0035) [2024-06-23 16:35:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7754891264. Throughput: 0: 42839.6. Samples: 7754959800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 16:35:23,891][15349] Signal inference workers to stop experience collection... (114850 times) [2024-06-23 16:35:23,891][15349] Signal inference workers to resume experience collection... (114850 times) [2024-06-23 16:35:23,905][15401] InferenceWorker_p0-w0: stopping experience collection (114850 times) [2024-06-23 16:35:23,936][15401] InferenceWorker_p0-w0: resuming experience collection (114850 times) [2024-06-23 16:35:27,792][15401] Updated weights for policy 0, policy_version 473330 (0.0045) [2024-06-23 16:35:28,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7755071488. Throughput: 0: 42698.4. Samples: 7755211560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:28,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 16:35:30,649][15401] Updated weights for policy 0, policy_version 473340 (0.0033) [2024-06-23 16:35:33,392][15132] Fps is (10 sec: 40949.4, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 7755300864. Throughput: 0: 42866.1. Samples: 7755465900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:33,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 16:35:35,119][15401] Updated weights for policy 0, policy_version 473350 (0.0039) [2024-06-23 16:35:38,278][15401] Updated weights for policy 0, policy_version 473360 (0.0024) [2024-06-23 16:35:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7755530240. Throughput: 0: 42760.0. Samples: 7755596520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 16:35:42,660][15401] Updated weights for policy 0, policy_version 473370 (0.0030) [2024-06-23 16:35:43,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7755694080. Throughput: 0: 42665.1. Samples: 7755846320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 16:35:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473371_7755710464.pth... [2024-06-23 16:35:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000472746_7745470464.pth [2024-06-23 16:35:45,873][15401] Updated weights for policy 0, policy_version 473380 (0.0046) [2024-06-23 16:35:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 7755956224. Throughput: 0: 42654.4. Samples: 7756097080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 16:35:50,069][15401] Updated weights for policy 0, policy_version 473390 (0.0034) [2024-06-23 16:35:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7756152832. Throughput: 0: 42732.8. Samples: 7756238520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:53,396][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 16:35:53,797][15401] Updated weights for policy 0, policy_version 473400 (0.0032) [2024-06-23 16:35:57,363][15401] Updated weights for policy 0, policy_version 473410 (0.0037) [2024-06-23 16:35:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 7756349440. Throughput: 0: 42756.0. Samples: 7756499420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:35:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 16:36:01,576][15401] Updated weights for policy 0, policy_version 473420 (0.0032) [2024-06-23 16:36:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 7756611584. Throughput: 0: 42793.1. Samples: 7756746620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-23 16:36:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 16:36:05,415][15401] Updated weights for policy 0, policy_version 473430 (0.0028) [2024-06-23 16:36:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 7756808192. Throughput: 0: 42957.5. Samples: 7756892900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 16:36:08,965][15401] Updated weights for policy 0, policy_version 473440 (0.0026) [2024-06-23 16:36:12,809][15401] Updated weights for policy 0, policy_version 473450 (0.0037) [2024-06-23 16:36:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7757004800. Throughput: 0: 42930.1. Samples: 7757143420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 16:36:16,484][15401] Updated weights for policy 0, policy_version 473460 (0.0029) [2024-06-23 16:36:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7757266944. Throughput: 0: 42768.2. Samples: 7757390360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 16:36:20,611][15401] Updated weights for policy 0, policy_version 473470 (0.0045) [2024-06-23 16:36:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 7757447168. Throughput: 0: 43087.4. Samples: 7757535560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:23,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 16:36:24,005][15401] Updated weights for policy 0, policy_version 473480 (0.0051) [2024-06-23 16:36:28,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7757627392. Throughput: 0: 43177.4. Samples: 7757789300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 16:36:28,584][15401] Updated weights for policy 0, policy_version 473490 (0.0034) [2024-06-23 16:36:31,491][15401] Updated weights for policy 0, policy_version 473500 (0.0040) [2024-06-23 16:36:32,822][15349] Signal inference workers to stop experience collection... (114900 times) [2024-06-23 16:36:32,871][15401] InferenceWorker_p0-w0: stopping experience collection (114900 times) [2024-06-23 16:36:32,877][15349] Signal inference workers to resume experience collection... (114900 times) [2024-06-23 16:36:32,886][15401] InferenceWorker_p0-w0: resuming experience collection (114900 times) [2024-06-23 16:36:33,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43419.4, 300 sec: 42820.9). Total num frames: 7757905920. Throughput: 0: 43140.8. Samples: 7758038420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 16:36:36,352][15401] Updated weights for policy 0, policy_version 473510 (0.0048) [2024-06-23 16:36:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7758069760. Throughput: 0: 43156.5. Samples: 7758180560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 16:36:39,166][15401] Updated weights for policy 0, policy_version 473520 (0.0039) [2024-06-23 16:36:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7758282752. Throughput: 0: 42851.0. Samples: 7758427720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 16:36:44,025][15401] Updated weights for policy 0, policy_version 473530 (0.0039) [2024-06-23 16:36:47,088][15401] Updated weights for policy 0, policy_version 473540 (0.0036) [2024-06-23 16:36:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7758528512. Throughput: 0: 42984.8. Samples: 7758680940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 16:36:51,504][15401] Updated weights for policy 0, policy_version 473550 (0.0022) [2024-06-23 16:36:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.6). Total num frames: 7758692352. Throughput: 0: 42748.2. Samples: 7758816560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 16:36:54,739][15401] Updated weights for policy 0, policy_version 473560 (0.0035) [2024-06-23 16:36:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 7758938112. Throughput: 0: 42677.3. Samples: 7759064000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:36:58,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 16:36:59,220][15401] Updated weights for policy 0, policy_version 473570 (0.0036) [2024-06-23 16:37:02,440][15401] Updated weights for policy 0, policy_version 473580 (0.0058) [2024-06-23 16:37:03,390][15132] Fps is (10 sec: 49151.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 7759183872. Throughput: 0: 42730.1. Samples: 7759313220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 16:37:06,766][15401] Updated weights for policy 0, policy_version 473590 (0.0033) [2024-06-23 16:37:08,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7759347712. Throughput: 0: 42579.6. Samples: 7759451540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 16:37:10,106][15401] Updated weights for policy 0, policy_version 473600 (0.0033) [2024-06-23 16:37:13,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7759593472. Throughput: 0: 42515.4. Samples: 7759702600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:13,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 16:37:14,553][15401] Updated weights for policy 0, policy_version 473610 (0.0034) [2024-06-23 16:37:17,857][15401] Updated weights for policy 0, policy_version 473620 (0.0036) [2024-06-23 16:37:18,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 7759822848. Throughput: 0: 42572.9. Samples: 7759954300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:18,392][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 16:37:22,243][15401] Updated weights for policy 0, policy_version 473630 (0.0034) [2024-06-23 16:37:23,396][15132] Fps is (10 sec: 39305.8, 60 sec: 42322.5, 300 sec: 42598.4). Total num frames: 7759986688. Throughput: 0: 42372.1. Samples: 7760087580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:23,397][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 16:37:25,480][15401] Updated weights for policy 0, policy_version 473640 (0.0043) [2024-06-23 16:37:28,389][15132] Fps is (10 sec: 39331.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7760216064. Throughput: 0: 42479.4. Samples: 7760339280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:37:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 16:37:30,108][15401] Updated weights for policy 0, policy_version 473650 (0.0026) [2024-06-23 16:37:33,389][15132] Fps is (10 sec: 44265.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7760429056. Throughput: 0: 42549.0. Samples: 7760595640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 16:37:33,419][15401] Updated weights for policy 0, policy_version 473660 (0.0029) [2024-06-23 16:37:37,646][15401] Updated weights for policy 0, policy_version 473670 (0.0032) [2024-06-23 16:37:38,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7760625664. Throughput: 0: 42451.1. Samples: 7760726860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 16:37:41,241][15401] Updated weights for policy 0, policy_version 473680 (0.0048) [2024-06-23 16:37:43,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7760871424. Throughput: 0: 42367.1. Samples: 7760970420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:37:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473686_7760871424.pth... [2024-06-23 16:37:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473061_7750631424.pth [2024-06-23 16:37:45,196][15401] Updated weights for policy 0, policy_version 473690 (0.0029) [2024-06-23 16:37:48,395][15132] Fps is (10 sec: 44213.8, 60 sec: 42321.7, 300 sec: 42708.7). Total num frames: 7761068032. Throughput: 0: 42582.8. Samples: 7761229660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:48,395][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 16:37:49,023][15401] Updated weights for policy 0, policy_version 473700 (0.0034) [2024-06-23 16:37:53,177][15401] Updated weights for policy 0, policy_version 473710 (0.0035) [2024-06-23 16:37:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7761264640. Throughput: 0: 42344.0. Samples: 7761357020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:53,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 16:37:56,610][15349] Signal inference workers to stop experience collection... (114950 times) [2024-06-23 16:37:56,655][15349] Signal inference workers to resume experience collection... (114950 times) [2024-06-23 16:37:56,661][15401] InferenceWorker_p0-w0: stopping experience collection (114950 times) [2024-06-23 16:37:56,669][15401] Updated weights for policy 0, policy_version 473720 (0.0037) [2024-06-23 16:37:56,700][15401] InferenceWorker_p0-w0: resuming experience collection (114950 times) [2024-06-23 16:37:58,390][15132] Fps is (10 sec: 45898.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 7761526784. Throughput: 0: 42379.6. Samples: 7761609580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:37:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 16:38:00,747][15401] Updated weights for policy 0, policy_version 473730 (0.0039) [2024-06-23 16:38:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 7761707008. Throughput: 0: 42585.4. Samples: 7761870540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 16:38:04,158][15401] Updated weights for policy 0, policy_version 473740 (0.0028) [2024-06-23 16:38:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7761903616. Throughput: 0: 42338.2. Samples: 7761992520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 16:38:08,711][15401] Updated weights for policy 0, policy_version 473750 (0.0023) [2024-06-23 16:38:11,897][15401] Updated weights for policy 0, policy_version 473760 (0.0036) [2024-06-23 16:38:13,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42322.5, 300 sec: 42653.0). Total num frames: 7762132992. Throughput: 0: 42347.6. Samples: 7762245200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:13,396][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 16:38:16,217][15401] Updated weights for policy 0, policy_version 473770 (0.0051) [2024-06-23 16:38:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41780.9, 300 sec: 42653.9). Total num frames: 7762329600. Throughput: 0: 42420.3. Samples: 7762504560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 16:38:19,615][15401] Updated weights for policy 0, policy_version 473780 (0.0028) [2024-06-23 16:38:23,392][15132] Fps is (10 sec: 40976.4, 60 sec: 42601.2, 300 sec: 42598.4). Total num frames: 7762542592. Throughput: 0: 42284.8. Samples: 7762629780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:23,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 16:38:23,734][15401] Updated weights for policy 0, policy_version 473790 (0.0021) [2024-06-23 16:38:27,242][15401] Updated weights for policy 0, policy_version 473800 (0.0028) [2024-06-23 16:38:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 7762788352. Throughput: 0: 42630.7. Samples: 7762888800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 16:38:31,235][15401] Updated weights for policy 0, policy_version 473810 (0.0042) [2024-06-23 16:38:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7762984960. Throughput: 0: 42581.8. Samples: 7763145620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 16:38:34,880][15401] Updated weights for policy 0, policy_version 473820 (0.0031) [2024-06-23 16:38:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7763197952. Throughput: 0: 42551.7. Samples: 7763271840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 16:38:38,678][15401] Updated weights for policy 0, policy_version 473830 (0.0036) [2024-06-23 16:38:42,599][15401] Updated weights for policy 0, policy_version 473840 (0.0042) [2024-06-23 16:38:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7763410944. Throughput: 0: 42628.5. Samples: 7763527860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 16:38:46,809][15401] Updated weights for policy 0, policy_version 473850 (0.0055) [2024-06-23 16:38:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42329.1, 300 sec: 42598.4). Total num frames: 7763607552. Throughput: 0: 42597.8. Samples: 7763787440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 16:38:50,185][15401] Updated weights for policy 0, policy_version 473860 (0.0040) [2024-06-23 16:38:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7763820544. Throughput: 0: 42630.1. Samples: 7763910880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 16:38:54,436][15401] Updated weights for policy 0, policy_version 473870 (0.0033) [2024-06-23 16:38:57,679][15401] Updated weights for policy 0, policy_version 473880 (0.0037) [2024-06-23 16:38:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7764049920. Throughput: 0: 42744.8. Samples: 7764168440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 16:38:58,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 16:39:02,213][15401] Updated weights for policy 0, policy_version 473890 (0.0036) [2024-06-23 16:39:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7764230144. Throughput: 0: 42861.8. Samples: 7764433340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:03,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 16:39:05,419][15401] Updated weights for policy 0, policy_version 473900 (0.0043) [2024-06-23 16:39:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7764475904. Throughput: 0: 42793.5. Samples: 7764555380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 16:39:10,022][15401] Updated weights for policy 0, policy_version 473910 (0.0044) [2024-06-23 16:39:13,129][15401] Updated weights for policy 0, policy_version 473920 (0.0044) [2024-06-23 16:39:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42876.0, 300 sec: 42765.4). Total num frames: 7764705280. Throughput: 0: 42486.7. Samples: 7764800700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 16:39:17,643][15401] Updated weights for policy 0, policy_version 473930 (0.0033) [2024-06-23 16:39:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7764869120. Throughput: 0: 42619.1. Samples: 7765063480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 16:39:20,829][15401] Updated weights for policy 0, policy_version 473940 (0.0036) [2024-06-23 16:39:23,013][15349] Signal inference workers to stop experience collection... (115000 times) [2024-06-23 16:39:23,014][15349] Signal inference workers to resume experience collection... (115000 times) [2024-06-23 16:39:23,062][15401] InferenceWorker_p0-w0: stopping experience collection (115000 times) [2024-06-23 16:39:23,062][15401] InferenceWorker_p0-w0: resuming experience collection (115000 times) [2024-06-23 16:39:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42709.4). Total num frames: 7765114880. Throughput: 0: 42493.6. Samples: 7765184060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 16:39:25,458][15401] Updated weights for policy 0, policy_version 473950 (0.0043) [2024-06-23 16:39:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7765344256. Throughput: 0: 42565.9. Samples: 7765443320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:28,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-23 16:39:28,414][15401] Updated weights for policy 0, policy_version 473960 (0.0038) [2024-06-23 16:39:33,239][15401] Updated weights for policy 0, policy_version 473970 (0.0036) [2024-06-23 16:39:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7765524480. Throughput: 0: 42608.7. Samples: 7765704840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 16:39:35,992][15401] Updated weights for policy 0, policy_version 473980 (0.0043) [2024-06-23 16:39:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7765737472. Throughput: 0: 42446.6. Samples: 7765820980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 16:39:40,951][15401] Updated weights for policy 0, policy_version 473990 (0.0032) [2024-06-23 16:39:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7765983232. Throughput: 0: 42728.0. Samples: 7766091200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 16:39:43,531][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473999_7765999616.pth... [2024-06-23 16:39:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473371_7755710464.pth [2024-06-23 16:39:43,759][15401] Updated weights for policy 0, policy_version 474000 (0.0036) [2024-06-23 16:39:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7766163456. Throughput: 0: 42460.5. Samples: 7766344060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 16:39:48,574][15401] Updated weights for policy 0, policy_version 474010 (0.0038) [2024-06-23 16:39:51,401][15401] Updated weights for policy 0, policy_version 474020 (0.0041) [2024-06-23 16:39:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 7766392832. Throughput: 0: 42406.5. Samples: 7766463680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 16:39:56,202][15401] Updated weights for policy 0, policy_version 474030 (0.0031) [2024-06-23 16:39:58,396][15132] Fps is (10 sec: 45847.4, 60 sec: 42867.2, 300 sec: 42653.4). Total num frames: 7766622208. Throughput: 0: 42756.6. Samples: 7766725000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:39:58,396][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 16:39:59,115][15401] Updated weights for policy 0, policy_version 474040 (0.0038) [2024-06-23 16:40:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 7766802432. Throughput: 0: 42461.3. Samples: 7766974340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:40:03,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:40:03,763][15401] Updated weights for policy 0, policy_version 474050 (0.0036) [2024-06-23 16:40:06,688][15401] Updated weights for policy 0, policy_version 474060 (0.0043) [2024-06-23 16:40:08,396][15132] Fps is (10 sec: 39320.2, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 7767015424. Throughput: 0: 42557.6. Samples: 7767099420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:40:08,397][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 16:40:11,530][15401] Updated weights for policy 0, policy_version 474070 (0.0028) [2024-06-23 16:40:13,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7767244800. Throughput: 0: 42644.9. Samples: 7767362340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:40:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 16:40:14,388][15401] Updated weights for policy 0, policy_version 474080 (0.0043) [2024-06-23 16:40:18,392][15132] Fps is (10 sec: 42615.1, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 7767441408. Throughput: 0: 42535.1. Samples: 7767619020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:40:18,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 16:40:19,125][15401] Updated weights for policy 0, policy_version 474090 (0.0041) [2024-06-23 16:40:22,205][15401] Updated weights for policy 0, policy_version 474100 (0.0041) [2024-06-23 16:40:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 7767670784. Throughput: 0: 42660.5. Samples: 7767740800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 16:40:23,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 16:40:27,219][15401] Updated weights for policy 0, policy_version 474110 (0.0027) [2024-06-23 16:40:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.2, 300 sec: 42598.8). Total num frames: 7767867392. Throughput: 0: 42455.1. Samples: 7768001680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 16:40:30,665][15401] Updated weights for policy 0, policy_version 474120 (0.0040) [2024-06-23 16:40:32,048][15349] Signal inference workers to stop experience collection... (115050 times) [2024-06-23 16:40:32,094][15401] InferenceWorker_p0-w0: stopping experience collection (115050 times) [2024-06-23 16:40:32,162][15349] Signal inference workers to resume experience collection... (115050 times) [2024-06-23 16:40:32,162][15401] InferenceWorker_p0-w0: resuming experience collection (115050 times) [2024-06-23 16:40:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7768080384. Throughput: 0: 42387.2. Samples: 7768251480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 16:40:34,815][15401] Updated weights for policy 0, policy_version 474130 (0.0034) [2024-06-23 16:40:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7768309760. Throughput: 0: 42533.3. Samples: 7768377680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 16:40:38,397][15401] Updated weights for policy 0, policy_version 474140 (0.0033) [2024-06-23 16:40:42,421][15401] Updated weights for policy 0, policy_version 474150 (0.0043) [2024-06-23 16:40:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 7768506368. Throughput: 0: 42506.7. Samples: 7768637540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 16:40:46,292][15401] Updated weights for policy 0, policy_version 474160 (0.0037) [2024-06-23 16:40:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7768719360. Throughput: 0: 42484.2. Samples: 7768886020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 16:40:50,105][15401] Updated weights for policy 0, policy_version 474170 (0.0028) [2024-06-23 16:40:53,390][15132] Fps is (10 sec: 42594.9, 60 sec: 42324.9, 300 sec: 42653.8). Total num frames: 7768932352. Throughput: 0: 42554.7. Samples: 7769014140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:53,391][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 16:40:54,021][15401] Updated weights for policy 0, policy_version 474180 (0.0035) [2024-06-23 16:40:57,797][15401] Updated weights for policy 0, policy_version 474190 (0.0031) [2024-06-23 16:40:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42329.6, 300 sec: 42542.9). Total num frames: 7769161728. Throughput: 0: 42541.7. Samples: 7769276720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:40:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 16:41:01,811][15401] Updated weights for policy 0, policy_version 474200 (0.0028) [2024-06-23 16:41:03,389][15132] Fps is (10 sec: 44240.3, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 7769374720. Throughput: 0: 42370.4. Samples: 7769525580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 16:41:05,811][15401] Updated weights for policy 0, policy_version 474210 (0.0030) [2024-06-23 16:41:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 7769587712. Throughput: 0: 42548.9. Samples: 7769655400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:08,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 16:41:09,474][15401] Updated weights for policy 0, policy_version 474220 (0.0028) [2024-06-23 16:41:13,297][15401] Updated weights for policy 0, policy_version 474230 (0.0034) [2024-06-23 16:41:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 7769784320. Throughput: 0: 42386.6. Samples: 7769909080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:13,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 16:41:17,131][15401] Updated weights for policy 0, policy_version 474240 (0.0029) [2024-06-23 16:41:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42598.7). Total num frames: 7770013696. Throughput: 0: 42573.2. Samples: 7770167280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 16:41:20,808][15401] Updated weights for policy 0, policy_version 474250 (0.0040) [2024-06-23 16:41:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 7770210304. Throughput: 0: 42554.7. Samples: 7770292640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 16:41:24,960][15401] Updated weights for policy 0, policy_version 474260 (0.0034) [2024-06-23 16:41:28,326][15401] Updated weights for policy 0, policy_version 474270 (0.0039) [2024-06-23 16:41:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 7770439680. Throughput: 0: 42436.3. Samples: 7770547180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:28,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-23 16:41:32,535][15401] Updated weights for policy 0, policy_version 474280 (0.0036) [2024-06-23 16:41:33,284][15349] Signal inference workers to stop experience collection... (115100 times) [2024-06-23 16:41:33,322][15401] InferenceWorker_p0-w0: stopping experience collection (115100 times) [2024-06-23 16:41:33,331][15349] Signal inference workers to resume experience collection... (115100 times) [2024-06-23 16:41:33,339][15401] InferenceWorker_p0-w0: resuming experience collection (115100 times) [2024-06-23 16:41:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7770636288. Throughput: 0: 42712.4. Samples: 7770808080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 16:41:35,754][15401] Updated weights for policy 0, policy_version 474290 (0.0032) [2024-06-23 16:41:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 7770849280. Throughput: 0: 42582.5. Samples: 7770930320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 16:41:40,139][15401] Updated weights for policy 0, policy_version 474300 (0.0038) [2024-06-23 16:41:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 7771078656. Throughput: 0: 42514.6. Samples: 7771189880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 16:41:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474309_7771078656.pth... [2024-06-23 16:41:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473686_7760871424.pth [2024-06-23 16:41:43,681][15401] Updated weights for policy 0, policy_version 474310 (0.0032) [2024-06-23 16:41:47,838][15401] Updated weights for policy 0, policy_version 474320 (0.0034) [2024-06-23 16:41:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7771291648. Throughput: 0: 42714.5. Samples: 7771447840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 16:41:48,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 16:41:51,191][15401] Updated weights for policy 0, policy_version 474330 (0.0040) [2024-06-23 16:41:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.9, 300 sec: 42543.2). Total num frames: 7771488256. Throughput: 0: 42586.7. Samples: 7771571800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:41:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 16:41:55,434][15401] Updated weights for policy 0, policy_version 474340 (0.0040) [2024-06-23 16:41:58,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 7771734016. Throughput: 0: 42713.5. Samples: 7771831180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:41:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:41:58,655][15401] Updated weights for policy 0, policy_version 474350 (0.0035) [2024-06-23 16:42:03,102][15401] Updated weights for policy 0, policy_version 474360 (0.0038) [2024-06-23 16:42:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7771930624. Throughput: 0: 42738.1. Samples: 7772090500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 16:42:06,664][15401] Updated weights for policy 0, policy_version 474370 (0.0030) [2024-06-23 16:42:08,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42596.7, 300 sec: 42542.9). Total num frames: 7772143616. Throughput: 0: 42732.4. Samples: 7772215700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:08,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 16:42:10,741][15401] Updated weights for policy 0, policy_version 474380 (0.0041) [2024-06-23 16:42:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 7772340224. Throughput: 0: 42822.8. Samples: 7772474200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 16:42:14,255][15401] Updated weights for policy 0, policy_version 474390 (0.0036) [2024-06-23 16:42:18,286][15401] Updated weights for policy 0, policy_version 474400 (0.0036) [2024-06-23 16:42:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 7772569600. Throughput: 0: 42753.3. Samples: 7772731980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 16:42:21,863][15401] Updated weights for policy 0, policy_version 474410 (0.0030) [2024-06-23 16:42:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7772782592. Throughput: 0: 42862.7. Samples: 7772859140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 16:42:26,012][15401] Updated weights for policy 0, policy_version 474420 (0.0036) [2024-06-23 16:42:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 7772979200. Throughput: 0: 42859.7. Samples: 7773118560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:28,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 16:42:29,601][15401] Updated weights for policy 0, policy_version 474430 (0.0029) [2024-06-23 16:42:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7773192192. Throughput: 0: 43007.7. Samples: 7773383080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 16:42:33,614][15401] Updated weights for policy 0, policy_version 474440 (0.0034) [2024-06-23 16:42:37,158][15401] Updated weights for policy 0, policy_version 474450 (0.0044) [2024-06-23 16:42:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7773421568. Throughput: 0: 42901.4. Samples: 7773502360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 16:42:41,249][15401] Updated weights for policy 0, policy_version 474460 (0.0022) [2024-06-23 16:42:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42599.2). Total num frames: 7773634560. Throughput: 0: 42817.8. Samples: 7773757980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:43,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-23 16:42:44,715][15401] Updated weights for policy 0, policy_version 474470 (0.0031) [2024-06-23 16:42:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 7773847552. Throughput: 0: 42961.9. Samples: 7774023780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:48,390][15132] Avg episode reward: [(0, '0.079')] [2024-06-23 16:42:48,794][15401] Updated weights for policy 0, policy_version 474480 (0.0039) [2024-06-23 16:42:53,026][15401] Updated weights for policy 0, policy_version 474490 (0.0036) [2024-06-23 16:42:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 7774060544. Throughput: 0: 42918.7. Samples: 7774146940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 16:42:56,618][15401] Updated weights for policy 0, policy_version 474500 (0.0046) [2024-06-23 16:42:56,644][15349] Signal inference workers to stop experience collection... (115150 times) [2024-06-23 16:42:56,644][15349] Signal inference workers to resume experience collection... (115150 times) [2024-06-23 16:42:56,692][15401] InferenceWorker_p0-w0: stopping experience collection (115150 times) [2024-06-23 16:42:56,692][15401] InferenceWorker_p0-w0: resuming experience collection (115150 times) [2024-06-23 16:42:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7774289920. Throughput: 0: 42802.8. Samples: 7774400320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:42:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 16:43:00,661][15401] Updated weights for policy 0, policy_version 474510 (0.0033) [2024-06-23 16:43:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7774470144. Throughput: 0: 42855.1. Samples: 7774660460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:43:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 16:43:04,257][15401] Updated weights for policy 0, policy_version 474520 (0.0033) [2024-06-23 16:43:08,214][15401] Updated weights for policy 0, policy_version 474530 (0.0033) [2024-06-23 16:43:08,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.1, 300 sec: 42599.3). Total num frames: 7774699520. Throughput: 0: 42704.4. Samples: 7774780840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:43:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 16:43:11,763][15401] Updated weights for policy 0, policy_version 474540 (0.0033) [2024-06-23 16:43:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7774945280. Throughput: 0: 42746.1. Samples: 7775042140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 16:43:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 16:43:15,839][15401] Updated weights for policy 0, policy_version 474550 (0.0041) [2024-06-23 16:43:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 7775125504. Throughput: 0: 42542.0. Samples: 7775297480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:18,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 16:43:19,492][15401] Updated weights for policy 0, policy_version 474560 (0.0026) [2024-06-23 16:43:23,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7775322112. Throughput: 0: 42678.1. Samples: 7775422880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 16:43:23,578][15401] Updated weights for policy 0, policy_version 474570 (0.0029) [2024-06-23 16:43:26,992][15401] Updated weights for policy 0, policy_version 474580 (0.0027) [2024-06-23 16:43:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7775567872. Throughput: 0: 42686.8. Samples: 7775678900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:28,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 16:43:31,208][15401] Updated weights for policy 0, policy_version 474590 (0.0034) [2024-06-23 16:43:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7775780864. Throughput: 0: 42572.0. Samples: 7775939520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 16:43:34,539][15401] Updated weights for policy 0, policy_version 474600 (0.0036) [2024-06-23 16:43:38,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 7775993856. Throughput: 0: 42557.5. Samples: 7776062300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:38,397][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 16:43:38,759][15401] Updated weights for policy 0, policy_version 474610 (0.0030) [2024-06-23 16:43:42,325][15401] Updated weights for policy 0, policy_version 474620 (0.0052) [2024-06-23 16:43:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7776223232. Throughput: 0: 42682.0. Samples: 7776321020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 16:43:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474623_7776223232.pth... [2024-06-23 16:43:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000473999_7765999616.pth [2024-06-23 16:43:46,749][15401] Updated weights for policy 0, policy_version 474630 (0.0039) [2024-06-23 16:43:48,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7776403456. Throughput: 0: 42700.1. Samples: 7776581960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 16:43:49,855][15401] Updated weights for policy 0, policy_version 474640 (0.0034) [2024-06-23 16:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7776632832. Throughput: 0: 42780.0. Samples: 7776705940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 16:43:54,701][15401] Updated weights for policy 0, policy_version 474650 (0.0027) [2024-06-23 16:43:57,427][15401] Updated weights for policy 0, policy_version 474660 (0.0032) [2024-06-23 16:43:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7776878592. Throughput: 0: 42765.5. Samples: 7776966580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:43:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 16:44:02,385][15401] Updated weights for policy 0, policy_version 474670 (0.0029) [2024-06-23 16:44:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7777042432. Throughput: 0: 42888.6. Samples: 7777227460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 16:44:05,300][15401] Updated weights for policy 0, policy_version 474680 (0.0046) [2024-06-23 16:44:08,196][15349] Signal inference workers to stop experience collection... (115200 times) [2024-06-23 16:44:08,197][15349] Signal inference workers to resume experience collection... (115200 times) [2024-06-23 16:44:08,213][15401] InferenceWorker_p0-w0: stopping experience collection (115200 times) [2024-06-23 16:44:08,236][15401] InferenceWorker_p0-w0: resuming experience collection (115200 times) [2024-06-23 16:44:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7777288192. Throughput: 0: 42774.2. Samples: 7777347720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 16:44:09,841][15401] Updated weights for policy 0, policy_version 474690 (0.0030) [2024-06-23 16:44:12,913][15401] Updated weights for policy 0, policy_version 474700 (0.0031) [2024-06-23 16:44:13,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7777533952. Throughput: 0: 42998.8. Samples: 7777613840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 16:44:17,276][15401] Updated weights for policy 0, policy_version 474710 (0.0033) [2024-06-23 16:44:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 7777697792. Throughput: 0: 43019.9. Samples: 7777875420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 16:44:20,405][15401] Updated weights for policy 0, policy_version 474720 (0.0027) [2024-06-23 16:44:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 7777927168. Throughput: 0: 42823.8. Samples: 7777989100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 16:44:25,266][15401] Updated weights for policy 0, policy_version 474730 (0.0045) [2024-06-23 16:44:27,939][15401] Updated weights for policy 0, policy_version 474740 (0.0023) [2024-06-23 16:44:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7778156544. Throughput: 0: 42916.0. Samples: 7778252240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 16:44:32,675][15401] Updated weights for policy 0, policy_version 474750 (0.0046) [2024-06-23 16:44:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7778336768. Throughput: 0: 42967.9. Samples: 7778515520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 16:44:35,469][15401] Updated weights for policy 0, policy_version 474760 (0.0024) [2024-06-23 16:44:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 7778549760. Throughput: 0: 42842.7. Samples: 7778633860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-23 16:44:38,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 16:44:40,127][15401] Updated weights for policy 0, policy_version 474770 (0.0041) [2024-06-23 16:44:43,105][15401] Updated weights for policy 0, policy_version 474780 (0.0035) [2024-06-23 16:44:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7778795520. Throughput: 0: 42884.3. Samples: 7778896380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:44:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 16:44:47,655][15401] Updated weights for policy 0, policy_version 474790 (0.0034) [2024-06-23 16:44:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7778975744. Throughput: 0: 42834.8. Samples: 7779155120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:44:48,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 16:44:50,968][15401] Updated weights for policy 0, policy_version 474800 (0.0043) [2024-06-23 16:44:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42654.8). Total num frames: 7779205120. Throughput: 0: 42786.2. Samples: 7779273100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:44:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 16:44:55,234][15401] Updated weights for policy 0, policy_version 474810 (0.0030) [2024-06-23 16:44:58,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7779418112. Throughput: 0: 42824.1. Samples: 7779540920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:44:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 16:44:58,588][15401] Updated weights for policy 0, policy_version 474820 (0.0030) [2024-06-23 16:45:02,981][15401] Updated weights for policy 0, policy_version 474830 (0.0026) [2024-06-23 16:45:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 7779614720. Throughput: 0: 42671.6. Samples: 7779795640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 16:45:06,286][15401] Updated weights for policy 0, policy_version 474840 (0.0034) [2024-06-23 16:45:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7779827712. Throughput: 0: 42896.2. Samples: 7779919420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 16:45:10,891][15401] Updated weights for policy 0, policy_version 474850 (0.0039) [2024-06-23 16:45:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 7780073472. Throughput: 0: 42758.3. Samples: 7780176360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 16:45:14,106][15401] Updated weights for policy 0, policy_version 474860 (0.0037) [2024-06-23 16:45:18,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42597.5, 300 sec: 42654.1). Total num frames: 7780253696. Throughput: 0: 42521.3. Samples: 7780429040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:18,391][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 16:45:18,825][15401] Updated weights for policy 0, policy_version 474870 (0.0033) [2024-06-23 16:45:21,967][15401] Updated weights for policy 0, policy_version 474880 (0.0034) [2024-06-23 16:45:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7780483072. Throughput: 0: 42694.6. Samples: 7780555120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 16:45:26,355][15401] Updated weights for policy 0, policy_version 474890 (0.0028) [2024-06-23 16:45:27,290][15349] Signal inference workers to stop experience collection... (115250 times) [2024-06-23 16:45:27,332][15401] InferenceWorker_p0-w0: stopping experience collection (115250 times) [2024-06-23 16:45:27,403][15349] Signal inference workers to resume experience collection... (115250 times) [2024-06-23 16:45:27,403][15401] InferenceWorker_p0-w0: resuming experience collection (115250 times) [2024-06-23 16:45:28,389][15132] Fps is (10 sec: 44243.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 7780696064. Throughput: 0: 42695.8. Samples: 7780817680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 16:45:29,503][15401] Updated weights for policy 0, policy_version 474900 (0.0040) [2024-06-23 16:45:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7780892672. Throughput: 0: 42724.4. Samples: 7781077620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 16:45:33,886][15401] Updated weights for policy 0, policy_version 474910 (0.0028) [2024-06-23 16:45:37,060][15401] Updated weights for policy 0, policy_version 474920 (0.0029) [2024-06-23 16:45:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7781122048. Throughput: 0: 42824.0. Samples: 7781200280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:38,393][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 16:45:41,391][15401] Updated weights for policy 0, policy_version 474930 (0.0040) [2024-06-23 16:45:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 7781351424. Throughput: 0: 42641.2. Samples: 7781459780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 16:45:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474936_7781351424.pth... [2024-06-23 16:45:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474309_7771078656.pth [2024-06-23 16:45:44,701][15401] Updated weights for policy 0, policy_version 474940 (0.0033) [2024-06-23 16:45:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42600.0, 300 sec: 42709.6). Total num frames: 7781531648. Throughput: 0: 42630.8. Samples: 7781714020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 16:45:48,957][15401] Updated weights for policy 0, policy_version 474950 (0.0042) [2024-06-23 16:45:52,187][15401] Updated weights for policy 0, policy_version 474960 (0.0030) [2024-06-23 16:45:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7781744640. Throughput: 0: 42681.7. Samples: 7781840100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 16:45:56,784][15401] Updated weights for policy 0, policy_version 474970 (0.0026) [2024-06-23 16:45:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7781974016. Throughput: 0: 42794.4. Samples: 7782102100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:45:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 16:45:59,689][15401] Updated weights for policy 0, policy_version 474980 (0.0029) [2024-06-23 16:46:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7782187008. Throughput: 0: 42887.1. Samples: 7782358900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:46:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 16:46:04,262][15401] Updated weights for policy 0, policy_version 474990 (0.0041) [2024-06-23 16:46:07,937][15401] Updated weights for policy 0, policy_version 475000 (0.0044) [2024-06-23 16:46:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7782400000. Throughput: 0: 42842.6. Samples: 7782483140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:08,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 16:46:11,994][15401] Updated weights for policy 0, policy_version 475010 (0.0027) [2024-06-23 16:46:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7782612992. Throughput: 0: 42768.9. Samples: 7782742280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 16:46:15,516][15401] Updated weights for policy 0, policy_version 475020 (0.0038) [2024-06-23 16:46:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42872.5, 300 sec: 42765.0). Total num frames: 7782825984. Throughput: 0: 42565.7. Samples: 7782993080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 16:46:20,187][15401] Updated weights for policy 0, policy_version 475030 (0.0039) [2024-06-23 16:46:23,077][15401] Updated weights for policy 0, policy_version 475040 (0.0029) [2024-06-23 16:46:23,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 7783055360. Throughput: 0: 42731.7. Samples: 7783123120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 16:46:27,872][15401] Updated weights for policy 0, policy_version 475050 (0.0024) [2024-06-23 16:46:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7783235584. Throughput: 0: 42689.0. Samples: 7783380780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 16:46:30,631][15401] Updated weights for policy 0, policy_version 475060 (0.0052) [2024-06-23 16:46:33,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7783464960. Throughput: 0: 42560.4. Samples: 7783629240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 16:46:35,327][15401] Updated weights for policy 0, policy_version 475070 (0.0025) [2024-06-23 16:46:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 7783694336. Throughput: 0: 42746.3. Samples: 7783763680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 16:46:38,444][15401] Updated weights for policy 0, policy_version 475080 (0.0038) [2024-06-23 16:46:42,963][15401] Updated weights for policy 0, policy_version 475090 (0.0034) [2024-06-23 16:46:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 7783874560. Throughput: 0: 42574.2. Samples: 7784017940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:43,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 16:46:46,142][15401] Updated weights for policy 0, policy_version 475100 (0.0026) [2024-06-23 16:46:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7784103936. Throughput: 0: 42432.0. Samples: 7784268340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 16:46:50,113][15349] Signal inference workers to stop experience collection... (115300 times) [2024-06-23 16:46:50,150][15401] InferenceWorker_p0-w0: stopping experience collection (115300 times) [2024-06-23 16:46:50,161][15349] Signal inference workers to resume experience collection... (115300 times) [2024-06-23 16:46:50,165][15401] InferenceWorker_p0-w0: resuming experience collection (115300 times) [2024-06-23 16:46:50,479][15401] Updated weights for policy 0, policy_version 475110 (0.0034) [2024-06-23 16:46:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 7784333312. Throughput: 0: 42493.7. Samples: 7784395260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:53,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-23 16:46:53,769][15401] Updated weights for policy 0, policy_version 475120 (0.0027) [2024-06-23 16:46:58,389][15401] Updated weights for policy 0, policy_version 475130 (0.0044) [2024-06-23 16:46:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7784529920. Throughput: 0: 42536.8. Samples: 7784656440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:46:58,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 16:47:01,653][15401] Updated weights for policy 0, policy_version 475140 (0.0029) [2024-06-23 16:47:03,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 7784742912. Throughput: 0: 42631.2. Samples: 7784911480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 16:47:06,131][15401] Updated weights for policy 0, policy_version 475150 (0.0022) [2024-06-23 16:47:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 7784955904. Throughput: 0: 42609.4. Samples: 7785040520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 16:47:09,241][15401] Updated weights for policy 0, policy_version 475160 (0.0031) [2024-06-23 16:47:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7785152512. Throughput: 0: 42658.1. Samples: 7785300400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:13,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-23 16:47:13,832][15401] Updated weights for policy 0, policy_version 475170 (0.0046) [2024-06-23 16:47:16,966][15401] Updated weights for policy 0, policy_version 475180 (0.0032) [2024-06-23 16:47:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7785398272. Throughput: 0: 42622.6. Samples: 7785547260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 16:47:21,390][15401] Updated weights for policy 0, policy_version 475190 (0.0038) [2024-06-23 16:47:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.6, 300 sec: 42820.5). Total num frames: 7785611264. Throughput: 0: 42630.1. Samples: 7785682040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 16:47:24,486][15401] Updated weights for policy 0, policy_version 475200 (0.0032) [2024-06-23 16:47:28,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7785775104. Throughput: 0: 42616.4. Samples: 7785935680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 16:47:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 16:47:29,020][15401] Updated weights for policy 0, policy_version 475210 (0.0043) [2024-06-23 16:47:32,381][15401] Updated weights for policy 0, policy_version 475220 (0.0033) [2024-06-23 16:47:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7786020864. Throughput: 0: 42732.1. Samples: 7786191280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:33,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 16:47:36,539][15401] Updated weights for policy 0, policy_version 475230 (0.0029) [2024-06-23 16:47:38,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7786250240. Throughput: 0: 42782.4. Samples: 7786320460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 16:47:40,531][15401] Updated weights for policy 0, policy_version 475240 (0.0028) [2024-06-23 16:47:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7786430464. Throughput: 0: 42594.6. Samples: 7786573200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:43,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 16:47:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475247_7786446848.pth... [2024-06-23 16:47:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474623_7776223232.pth [2024-06-23 16:47:44,106][15401] Updated weights for policy 0, policy_version 475250 (0.0031) [2024-06-23 16:47:48,291][15401] Updated weights for policy 0, policy_version 475260 (0.0031) [2024-06-23 16:47:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7786659840. Throughput: 0: 42568.7. Samples: 7786827080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 16:47:51,671][15401] Updated weights for policy 0, policy_version 475270 (0.0035) [2024-06-23 16:47:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7786872832. Throughput: 0: 42518.5. Samples: 7786953860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:53,394][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 16:47:55,811][15401] Updated weights for policy 0, policy_version 475280 (0.0036) [2024-06-23 16:47:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7787085824. Throughput: 0: 42657.0. Samples: 7787219960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:47:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 16:47:58,910][15349] Signal inference workers to stop experience collection... (115350 times) [2024-06-23 16:47:58,911][15349] Signal inference workers to resume experience collection... (115350 times) [2024-06-23 16:47:58,943][15401] InferenceWorker_p0-w0: stopping experience collection (115350 times) [2024-06-23 16:47:58,943][15401] InferenceWorker_p0-w0: resuming experience collection (115350 times) [2024-06-23 16:47:59,294][15401] Updated weights for policy 0, policy_version 475290 (0.0037) [2024-06-23 16:48:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 7787282432. Throughput: 0: 42771.6. Samples: 7787472080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:03,401][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 16:48:03,596][15401] Updated weights for policy 0, policy_version 475300 (0.0032) [2024-06-23 16:48:06,839][15401] Updated weights for policy 0, policy_version 475310 (0.0040) [2024-06-23 16:48:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7787528192. Throughput: 0: 42673.8. Samples: 7787602360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 16:48:11,160][15401] Updated weights for policy 0, policy_version 475320 (0.0033) [2024-06-23 16:48:13,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7787708416. Throughput: 0: 42631.9. Samples: 7787854120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:13,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 16:48:14,698][15401] Updated weights for policy 0, policy_version 475330 (0.0030) [2024-06-23 16:48:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7787937792. Throughput: 0: 42595.9. Samples: 7788108100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 16:48:18,686][15401] Updated weights for policy 0, policy_version 475340 (0.0029) [2024-06-23 16:48:22,214][15401] Updated weights for policy 0, policy_version 475350 (0.0022) [2024-06-23 16:48:23,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7788183552. Throughput: 0: 42603.1. Samples: 7788237600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 16:48:26,163][15401] Updated weights for policy 0, policy_version 475360 (0.0035) [2024-06-23 16:48:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7788347392. Throughput: 0: 42772.1. Samples: 7788497940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 16:48:29,767][15401] Updated weights for policy 0, policy_version 475370 (0.0023) [2024-06-23 16:48:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 7788576768. Throughput: 0: 42743.2. Samples: 7788750520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 16:48:34,321][15401] Updated weights for policy 0, policy_version 475380 (0.0039) [2024-06-23 16:48:37,524][15401] Updated weights for policy 0, policy_version 475390 (0.0021) [2024-06-23 16:48:38,389][15132] Fps is (10 sec: 49151.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7788838912. Throughput: 0: 42890.3. Samples: 7788883920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 16:48:41,856][15401] Updated weights for policy 0, policy_version 475400 (0.0033) [2024-06-23 16:48:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7788986368. Throughput: 0: 42659.2. Samples: 7789139620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 16:48:45,130][15401] Updated weights for policy 0, policy_version 475410 (0.0039) [2024-06-23 16:48:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 7789232128. Throughput: 0: 42781.7. Samples: 7789397160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 16:48:49,337][15401] Updated weights for policy 0, policy_version 475420 (0.0036) [2024-06-23 16:48:51,280][15349] Signal inference workers to stop experience collection... (115400 times) [2024-06-23 16:48:51,324][15401] InferenceWorker_p0-w0: stopping experience collection (115400 times) [2024-06-23 16:48:51,332][15349] Signal inference workers to resume experience collection... (115400 times) [2024-06-23 16:48:51,336][15401] InferenceWorker_p0-w0: resuming experience collection (115400 times) [2024-06-23 16:48:52,700][15401] Updated weights for policy 0, policy_version 475430 (0.0035) [2024-06-23 16:48:53,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 7789477888. Throughput: 0: 42940.4. Samples: 7789534680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 16:48:53,391][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 16:48:57,269][15401] Updated weights for policy 0, policy_version 475440 (0.0022) [2024-06-23 16:48:58,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7789625344. Throughput: 0: 42941.5. Samples: 7789786480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:48:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 16:49:00,275][15401] Updated weights for policy 0, policy_version 475450 (0.0039) [2024-06-23 16:49:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 7789871104. Throughput: 0: 42814.4. Samples: 7790034740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 16:49:04,861][15401] Updated weights for policy 0, policy_version 475460 (0.0037) [2024-06-23 16:49:07,926][15401] Updated weights for policy 0, policy_version 475470 (0.0030) [2024-06-23 16:49:08,389][15132] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 7790116864. Throughput: 0: 43055.2. Samples: 7790175080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 16:49:12,472][15401] Updated weights for policy 0, policy_version 475480 (0.0031) [2024-06-23 16:49:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7790280704. Throughput: 0: 42822.0. Samples: 7790424940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 16:49:15,833][15401] Updated weights for policy 0, policy_version 475490 (0.0026) [2024-06-23 16:49:18,396][15132] Fps is (10 sec: 40934.4, 60 sec: 43140.2, 300 sec: 42708.6). Total num frames: 7790526464. Throughput: 0: 42765.6. Samples: 7790675240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:18,396][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 16:49:20,393][15401] Updated weights for policy 0, policy_version 475500 (0.0040) [2024-06-23 16:49:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7790739456. Throughput: 0: 42922.2. Samples: 7790815420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 16:49:23,459][15401] Updated weights for policy 0, policy_version 475510 (0.0035) [2024-06-23 16:49:27,997][15401] Updated weights for policy 0, policy_version 475520 (0.0052) [2024-06-23 16:49:28,390][15132] Fps is (10 sec: 39345.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7790919680. Throughput: 0: 42790.5. Samples: 7791065200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 16:49:30,965][15401] Updated weights for policy 0, policy_version 475530 (0.0037) [2024-06-23 16:49:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7791165440. Throughput: 0: 42873.9. Samples: 7791326480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:33,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 16:49:35,627][15401] Updated weights for policy 0, policy_version 475540 (0.0036) [2024-06-23 16:49:38,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7791394816. Throughput: 0: 42817.0. Samples: 7791461440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 16:49:38,569][15401] Updated weights for policy 0, policy_version 475550 (0.0031) [2024-06-23 16:49:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7791558656. Throughput: 0: 42909.3. Samples: 7791717400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:43,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 16:49:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475560_7791575040.pth... [2024-06-23 16:49:43,404][15401] Updated weights for policy 0, policy_version 475560 (0.0032) [2024-06-23 16:49:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000474936_7781351424.pth [2024-06-23 16:49:46,259][15401] Updated weights for policy 0, policy_version 475570 (0.0030) [2024-06-23 16:49:47,928][15349] Signal inference workers to stop experience collection... (115450 times) [2024-06-23 16:49:47,928][15349] Signal inference workers to resume experience collection... (115450 times) [2024-06-23 16:49:47,978][15401] InferenceWorker_p0-w0: stopping experience collection (115450 times) [2024-06-23 16:49:47,979][15401] InferenceWorker_p0-w0: resuming experience collection (115450 times) [2024-06-23 16:49:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7791820800. Throughput: 0: 43057.8. Samples: 7791972340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 16:49:50,712][15401] Updated weights for policy 0, policy_version 475580 (0.0037) [2024-06-23 16:49:53,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7792033792. Throughput: 0: 43127.1. Samples: 7792115800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 16:49:53,864][15401] Updated weights for policy 0, policy_version 475590 (0.0027) [2024-06-23 16:49:58,274][15401] Updated weights for policy 0, policy_version 475600 (0.0032) [2024-06-23 16:49:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 7792230400. Throughput: 0: 43148.9. Samples: 7792366740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:49:58,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 16:50:01,264][15401] Updated weights for policy 0, policy_version 475610 (0.0033) [2024-06-23 16:50:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 7792476160. Throughput: 0: 43467.3. Samples: 7792631000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:50:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 16:50:05,979][15401] Updated weights for policy 0, policy_version 475620 (0.0041) [2024-06-23 16:50:08,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7792689152. Throughput: 0: 43328.9. Samples: 7792765220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:50:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 16:50:08,655][15401] Updated weights for policy 0, policy_version 475630 (0.0026) [2024-06-23 16:50:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 43144.7, 300 sec: 42765.2). Total num frames: 7792869376. Throughput: 0: 43450.4. Samples: 7793020460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:50:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 16:50:13,458][15401] Updated weights for policy 0, policy_version 475640 (0.0033) [2024-06-23 16:50:16,442][15401] Updated weights for policy 0, policy_version 475650 (0.0039) [2024-06-23 16:50:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43422.0, 300 sec: 42876.1). Total num frames: 7793131520. Throughput: 0: 43501.3. Samples: 7793284040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 16:50:18,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-23 16:50:20,922][15401] Updated weights for policy 0, policy_version 475660 (0.0029) [2024-06-23 16:50:23,392][15132] Fps is (10 sec: 47501.6, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 7793344512. Throughput: 0: 43404.4. Samples: 7793414740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:23,392][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 16:50:24,079][15401] Updated weights for policy 0, policy_version 475670 (0.0032) [2024-06-23 16:50:28,189][15401] Updated weights for policy 0, policy_version 475680 (0.0036) [2024-06-23 16:50:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43689.0, 300 sec: 42875.7). Total num frames: 7793541120. Throughput: 0: 43463.0. Samples: 7793673340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:28,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 16:50:32,181][15401] Updated weights for policy 0, policy_version 475690 (0.0042) [2024-06-23 16:50:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 7793754112. Throughput: 0: 43554.2. Samples: 7793932280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:33,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-23 16:50:35,853][15401] Updated weights for policy 0, policy_version 475700 (0.0042) [2024-06-23 16:50:38,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7793999872. Throughput: 0: 43224.0. Samples: 7794060880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:38,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 16:50:39,736][15401] Updated weights for policy 0, policy_version 475710 (0.0032) [2024-06-23 16:50:43,346][15401] Updated weights for policy 0, policy_version 475720 (0.0027) [2024-06-23 16:50:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43963.6, 300 sec: 42931.6). Total num frames: 7794196480. Throughput: 0: 43335.2. Samples: 7794316720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 16:50:47,147][15401] Updated weights for policy 0, policy_version 475730 (0.0031) [2024-06-23 16:50:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 7794409472. Throughput: 0: 43311.9. Samples: 7794580040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:48,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 16:50:50,845][15401] Updated weights for policy 0, policy_version 475740 (0.0034) [2024-06-23 16:50:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 7794622464. Throughput: 0: 43119.9. Samples: 7794705720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:53,393][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 16:50:54,721][15401] Updated weights for policy 0, policy_version 475750 (0.0030) [2024-06-23 16:50:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 7794835456. Throughput: 0: 43068.4. Samples: 7794958540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:50:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 16:50:58,471][15401] Updated weights for policy 0, policy_version 475760 (0.0041) [2024-06-23 16:51:02,223][15401] Updated weights for policy 0, policy_version 475770 (0.0046) [2024-06-23 16:51:03,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 7795064832. Throughput: 0: 42928.9. Samples: 7795215840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 16:51:06,230][15401] Updated weights for policy 0, policy_version 475780 (0.0036) [2024-06-23 16:51:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7795261440. Throughput: 0: 42983.0. Samples: 7795348880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 16:51:09,892][15401] Updated weights for policy 0, policy_version 475790 (0.0051) [2024-06-23 16:51:11,191][15349] Signal inference workers to stop experience collection... (115500 times) [2024-06-23 16:51:11,192][15349] Signal inference workers to resume experience collection... (115500 times) [2024-06-23 16:51:11,216][15401] InferenceWorker_p0-w0: stopping experience collection (115500 times) [2024-06-23 16:51:11,216][15401] InferenceWorker_p0-w0: resuming experience collection (115500 times) [2024-06-23 16:51:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43415.7, 300 sec: 42875.7). Total num frames: 7795474432. Throughput: 0: 42954.2. Samples: 7795606280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:13,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 16:51:13,983][15401] Updated weights for policy 0, policy_version 475800 (0.0035) [2024-06-23 16:51:17,470][15401] Updated weights for policy 0, policy_version 475810 (0.0033) [2024-06-23 16:51:18,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7795720192. Throughput: 0: 42835.1. Samples: 7795859860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 16:51:21,816][15401] Updated weights for policy 0, policy_version 475820 (0.0025) [2024-06-23 16:51:23,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 7795900416. Throughput: 0: 42902.1. Samples: 7795991480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 16:51:25,016][15401] Updated weights for policy 0, policy_version 475830 (0.0039) [2024-06-23 16:51:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 7796129792. Throughput: 0: 42877.4. Samples: 7796246200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:28,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 16:51:29,309][15401] Updated weights for policy 0, policy_version 475840 (0.0033) [2024-06-23 16:51:32,546][15401] Updated weights for policy 0, policy_version 475850 (0.0045) [2024-06-23 16:51:33,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7796342784. Throughput: 0: 42654.4. Samples: 7796499480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 16:51:37,083][15401] Updated weights for policy 0, policy_version 475860 (0.0021) [2024-06-23 16:51:38,390][15132] Fps is (10 sec: 42594.4, 60 sec: 42597.8, 300 sec: 42987.0). Total num frames: 7796555776. Throughput: 0: 42809.0. Samples: 7796632060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:38,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 16:51:40,169][15401] Updated weights for policy 0, policy_version 475870 (0.0022) [2024-06-23 16:51:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7796768768. Throughput: 0: 42796.6. Samples: 7796884400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-23 16:51:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 16:51:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475877_7796768768.pth... [2024-06-23 16:51:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475247_7786446848.pth [2024-06-23 16:51:44,703][15401] Updated weights for policy 0, policy_version 475880 (0.0032) [2024-06-23 16:51:47,959][15401] Updated weights for policy 0, policy_version 475890 (0.0038) [2024-06-23 16:51:48,389][15132] Fps is (10 sec: 44240.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7796998144. Throughput: 0: 42626.7. Samples: 7797134040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:51:48,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 16:51:52,235][15401] Updated weights for policy 0, policy_version 475900 (0.0043) [2024-06-23 16:51:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 7797178368. Throughput: 0: 42690.4. Samples: 7797269940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:51:53,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 16:51:55,663][15401] Updated weights for policy 0, policy_version 475910 (0.0038) [2024-06-23 16:51:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7797407744. Throughput: 0: 42637.0. Samples: 7797524840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:51:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 16:51:59,774][15401] Updated weights for policy 0, policy_version 475920 (0.0034) [2024-06-23 16:52:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7797620736. Throughput: 0: 42661.7. Samples: 7797779640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 16:52:03,596][15401] Updated weights for policy 0, policy_version 475930 (0.0040) [2024-06-23 16:52:07,578][15401] Updated weights for policy 0, policy_version 475940 (0.0039) [2024-06-23 16:52:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7797817344. Throughput: 0: 42660.2. Samples: 7797911180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:08,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-23 16:52:11,279][15401] Updated weights for policy 0, policy_version 475950 (0.0033) [2024-06-23 16:52:13,391][15132] Fps is (10 sec: 42590.6, 60 sec: 42871.9, 300 sec: 42875.8). Total num frames: 7798046720. Throughput: 0: 42746.7. Samples: 7798169880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:13,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 16:52:15,419][15401] Updated weights for policy 0, policy_version 475960 (0.0033) [2024-06-23 16:52:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7798259712. Throughput: 0: 42700.4. Samples: 7798421000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:18,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-23 16:52:18,751][15401] Updated weights for policy 0, policy_version 475970 (0.0025) [2024-06-23 16:52:22,886][15401] Updated weights for policy 0, policy_version 475980 (0.0033) [2024-06-23 16:52:23,389][15132] Fps is (10 sec: 42606.3, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 7798472704. Throughput: 0: 42646.2. Samples: 7798551100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:23,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-23 16:52:26,207][15401] Updated weights for policy 0, policy_version 475990 (0.0021) [2024-06-23 16:52:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7798669312. Throughput: 0: 42798.9. Samples: 7798810340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:28,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-23 16:52:29,648][15349] Signal inference workers to stop experience collection... (115550 times) [2024-06-23 16:52:29,680][15401] InferenceWorker_p0-w0: stopping experience collection (115550 times) [2024-06-23 16:52:29,712][15349] Signal inference workers to resume experience collection... (115550 times) [2024-06-23 16:52:29,714][15401] InferenceWorker_p0-w0: resuming experience collection (115550 times) [2024-06-23 16:52:30,347][15401] Updated weights for policy 0, policy_version 476000 (0.0036) [2024-06-23 16:52:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7798915072. Throughput: 0: 42828.9. Samples: 7799061340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 16:52:33,724][15401] Updated weights for policy 0, policy_version 476010 (0.0045) [2024-06-23 16:52:38,359][15401] Updated weights for policy 0, policy_version 476020 (0.0037) [2024-06-23 16:52:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42599.0, 300 sec: 42987.2). Total num frames: 7799111680. Throughput: 0: 42824.4. Samples: 7799197040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 16:52:41,322][15401] Updated weights for policy 0, policy_version 476030 (0.0040) [2024-06-23 16:52:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 7799324672. Throughput: 0: 42756.4. Samples: 7799448880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 16:52:45,968][15401] Updated weights for policy 0, policy_version 476040 (0.0034) [2024-06-23 16:52:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7799554048. Throughput: 0: 42741.8. Samples: 7799703020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 16:52:49,374][15401] Updated weights for policy 0, policy_version 476050 (0.0042) [2024-06-23 16:52:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7799750656. Throughput: 0: 42702.5. Samples: 7799832800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 16:52:53,633][15401] Updated weights for policy 0, policy_version 476060 (0.0034) [2024-06-23 16:52:57,022][15401] Updated weights for policy 0, policy_version 476070 (0.0029) [2024-06-23 16:52:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 7799980032. Throughput: 0: 42537.7. Samples: 7800084000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:52:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 16:53:01,304][15401] Updated weights for policy 0, policy_version 476080 (0.0047) [2024-06-23 16:53:03,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 7800193024. Throughput: 0: 42627.1. Samples: 7800339320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:53:03,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 16:53:05,005][15401] Updated weights for policy 0, policy_version 476090 (0.0053) [2024-06-23 16:53:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 7800389632. Throughput: 0: 42616.8. Samples: 7800468860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 16:53:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 16:53:08,722][15401] Updated weights for policy 0, policy_version 476100 (0.0034) [2024-06-23 16:53:12,467][15401] Updated weights for policy 0, policy_version 476110 (0.0027) [2024-06-23 16:53:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42872.8, 300 sec: 42987.2). Total num frames: 7800619008. Throughput: 0: 42640.4. Samples: 7800729160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 16:53:16,188][15401] Updated weights for policy 0, policy_version 476120 (0.0027) [2024-06-23 16:53:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7800832000. Throughput: 0: 42881.7. Samples: 7800991020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 16:53:19,949][15401] Updated weights for policy 0, policy_version 476130 (0.0034) [2024-06-23 16:53:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7801028608. Throughput: 0: 42741.8. Samples: 7801120420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 16:53:24,034][15401] Updated weights for policy 0, policy_version 476140 (0.0022) [2024-06-23 16:53:27,663][15401] Updated weights for policy 0, policy_version 476150 (0.0032) [2024-06-23 16:53:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 7801257984. Throughput: 0: 42882.1. Samples: 7801378680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:28,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 16:53:31,474][15401] Updated weights for policy 0, policy_version 476160 (0.0036) [2024-06-23 16:53:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 7801470976. Throughput: 0: 42957.2. Samples: 7801636200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:33,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 16:53:34,974][15349] Signal inference workers to stop experience collection... (115600 times) [2024-06-23 16:53:35,018][15401] InferenceWorker_p0-w0: stopping experience collection (115600 times) [2024-06-23 16:53:35,089][15349] Signal inference workers to resume experience collection... (115600 times) [2024-06-23 16:53:35,089][15401] InferenceWorker_p0-w0: resuming experience collection (115600 times) [2024-06-23 16:53:35,230][15401] Updated weights for policy 0, policy_version 476170 (0.0027) [2024-06-23 16:53:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 7801683968. Throughput: 0: 42978.4. Samples: 7801766820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 16:53:39,290][15401] Updated weights for policy 0, policy_version 476180 (0.0037) [2024-06-23 16:53:43,316][15401] Updated weights for policy 0, policy_version 476190 (0.0036) [2024-06-23 16:53:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 7801896960. Throughput: 0: 43182.6. Samples: 7802027220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 16:53:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476190_7801896960.pth... [2024-06-23 16:53:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475560_7791575040.pth [2024-06-23 16:53:46,859][15401] Updated weights for policy 0, policy_version 476200 (0.0040) [2024-06-23 16:53:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7802126336. Throughput: 0: 43145.5. Samples: 7802280760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 16:53:50,964][15401] Updated weights for policy 0, policy_version 476210 (0.0026) [2024-06-23 16:53:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 7802322944. Throughput: 0: 43192.5. Samples: 7802412520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 16:53:54,332][15401] Updated weights for policy 0, policy_version 476220 (0.0037) [2024-06-23 16:53:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7802535936. Throughput: 0: 43154.7. Samples: 7802671120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:53:58,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 16:53:58,604][15401] Updated weights for policy 0, policy_version 476230 (0.0028) [2024-06-23 16:54:01,807][15401] Updated weights for policy 0, policy_version 476240 (0.0038) [2024-06-23 16:54:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 7802781696. Throughput: 0: 42911.2. Samples: 7802922020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 16:54:06,020][15401] Updated weights for policy 0, policy_version 476250 (0.0036) [2024-06-23 16:54:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7802961920. Throughput: 0: 43026.2. Samples: 7803056600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 16:54:09,523][15401] Updated weights for policy 0, policy_version 476260 (0.0035) [2024-06-23 16:54:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42932.5). Total num frames: 7803191296. Throughput: 0: 42927.2. Samples: 7803310300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 16:54:13,675][15401] Updated weights for policy 0, policy_version 476270 (0.0043) [2024-06-23 16:54:17,226][15401] Updated weights for policy 0, policy_version 476280 (0.0035) [2024-06-23 16:54:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7803420672. Throughput: 0: 42949.8. Samples: 7803568840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 16:54:21,147][15401] Updated weights for policy 0, policy_version 476290 (0.0028) [2024-06-23 16:54:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 7803617280. Throughput: 0: 42962.2. Samples: 7803700120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 16:54:24,723][15401] Updated weights for policy 0, policy_version 476300 (0.0041) [2024-06-23 16:54:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 7803830272. Throughput: 0: 42847.1. Samples: 7803955340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 16:54:28,927][15401] Updated weights for policy 0, policy_version 476310 (0.0047) [2024-06-23 16:54:32,176][15401] Updated weights for policy 0, policy_version 476320 (0.0034) [2024-06-23 16:54:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43419.3, 300 sec: 42987.2). Total num frames: 7804076032. Throughput: 0: 42939.8. Samples: 7804213060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-23 16:54:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 16:54:36,640][15401] Updated weights for policy 0, policy_version 476330 (0.0037) [2024-06-23 16:54:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 7804239872. Throughput: 0: 43012.5. Samples: 7804348080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:54:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 16:54:39,959][15401] Updated weights for policy 0, policy_version 476340 (0.0040) [2024-06-23 16:54:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 7804485632. Throughput: 0: 42940.4. Samples: 7804603540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:54:43,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 16:54:44,043][15401] Updated weights for policy 0, policy_version 476350 (0.0035) [2024-06-23 16:54:47,512][15401] Updated weights for policy 0, policy_version 476360 (0.0033) [2024-06-23 16:54:48,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 7804731392. Throughput: 0: 43020.8. Samples: 7804857960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:54:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 16:54:51,548][15401] Updated weights for policy 0, policy_version 476370 (0.0041) [2024-06-23 16:54:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 7804895232. Throughput: 0: 43074.6. Samples: 7804994960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:54:53,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-23 16:54:53,413][15349] Signal inference workers to stop experience collection... (115650 times) [2024-06-23 16:54:53,441][15401] InferenceWorker_p0-w0: stopping experience collection (115650 times) [2024-06-23 16:54:53,471][15349] Signal inference workers to resume experience collection... (115650 times) [2024-06-23 16:54:53,476][15401] InferenceWorker_p0-w0: resuming experience collection (115650 times) [2024-06-23 16:54:55,025][15401] Updated weights for policy 0, policy_version 476380 (0.0041) [2024-06-23 16:54:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7805108224. Throughput: 0: 43082.2. Samples: 7805249000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:54:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 16:54:59,494][15401] Updated weights for policy 0, policy_version 476390 (0.0033) [2024-06-23 16:55:02,741][15401] Updated weights for policy 0, policy_version 476400 (0.0041) [2024-06-23 16:55:03,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7805370368. Throughput: 0: 42936.2. Samples: 7805500960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:03,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 16:55:07,015][15401] Updated weights for policy 0, policy_version 476410 (0.0036) [2024-06-23 16:55:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7805534208. Throughput: 0: 43115.6. Samples: 7805640320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:08,398][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 16:55:10,348][15401] Updated weights for policy 0, policy_version 476420 (0.0031) [2024-06-23 16:55:13,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7805763584. Throughput: 0: 43047.0. Samples: 7805892460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:13,399][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 16:55:14,829][15401] Updated weights for policy 0, policy_version 476430 (0.0030) [2024-06-23 16:55:18,003][15401] Updated weights for policy 0, policy_version 476440 (0.0025) [2024-06-23 16:55:18,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 7806009344. Throughput: 0: 42859.7. Samples: 7806141740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 16:55:22,379][15401] Updated weights for policy 0, policy_version 476450 (0.0031) [2024-06-23 16:55:23,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 7806189568. Throughput: 0: 42977.6. Samples: 7806282180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:23,401][15132] Avg episode reward: [(0, '0.232')] [2024-06-23 16:55:25,464][15401] Updated weights for policy 0, policy_version 476460 (0.0033) [2024-06-23 16:55:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7806402560. Throughput: 0: 42869.0. Samples: 7806532540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 16:55:30,060][15401] Updated weights for policy 0, policy_version 476470 (0.0031) [2024-06-23 16:55:32,951][15401] Updated weights for policy 0, policy_version 476480 (0.0030) [2024-06-23 16:55:33,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7806648320. Throughput: 0: 42844.5. Samples: 7806785960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 16:55:37,606][15401] Updated weights for policy 0, policy_version 476490 (0.0026) [2024-06-23 16:55:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 7806844928. Throughput: 0: 42865.8. Samples: 7806924020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:38,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 16:55:40,770][15401] Updated weights for policy 0, policy_version 476500 (0.0035) [2024-06-23 16:55:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 7807057920. Throughput: 0: 42727.5. Samples: 7807171740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 16:55:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476505_7807057920.pth... [2024-06-23 16:55:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000475877_7796768768.pth [2024-06-23 16:55:45,238][15401] Updated weights for policy 0, policy_version 476510 (0.0036) [2024-06-23 16:55:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 7807287296. Throughput: 0: 43025.2. Samples: 7807437100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 16:55:48,550][15401] Updated weights for policy 0, policy_version 476520 (0.0041) [2024-06-23 16:55:53,101][15401] Updated weights for policy 0, policy_version 476530 (0.0032) [2024-06-23 16:55:53,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43140.0, 300 sec: 42875.1). Total num frames: 7807483904. Throughput: 0: 42792.0. Samples: 7807566240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:53,396][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 16:55:56,211][15401] Updated weights for policy 0, policy_version 476540 (0.0030) [2024-06-23 16:55:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.9, 300 sec: 42875.8). Total num frames: 7807713280. Throughput: 0: 42730.3. Samples: 7807815420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 16:55:58,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 16:56:00,810][15401] Updated weights for policy 0, policy_version 476550 (0.0030) [2024-06-23 16:56:03,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7807909888. Throughput: 0: 43019.1. Samples: 7808077600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:03,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 16:56:03,763][15401] Updated weights for policy 0, policy_version 476560 (0.0038) [2024-06-23 16:56:04,033][15349] Signal inference workers to stop experience collection... (115700 times) [2024-06-23 16:56:04,034][15349] Signal inference workers to resume experience collection... (115700 times) [2024-06-23 16:56:04,064][15401] InferenceWorker_p0-w0: stopping experience collection (115700 times) [2024-06-23 16:56:04,064][15401] InferenceWorker_p0-w0: resuming experience collection (115700 times) [2024-06-23 16:56:08,313][15401] Updated weights for policy 0, policy_version 476570 (0.0038) [2024-06-23 16:56:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.5, 300 sec: 42876.5). Total num frames: 7808122880. Throughput: 0: 42647.2. Samples: 7808201200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 16:56:11,350][15401] Updated weights for policy 0, policy_version 476580 (0.0045) [2024-06-23 16:56:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 7808368640. Throughput: 0: 42795.9. Samples: 7808458460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:13,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 16:56:16,039][15401] Updated weights for policy 0, policy_version 476590 (0.0030) [2024-06-23 16:56:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 7808548864. Throughput: 0: 42944.9. Samples: 7808718480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 16:56:19,091][15401] Updated weights for policy 0, policy_version 476600 (0.0032) [2024-06-23 16:56:23,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 7808745472. Throughput: 0: 42560.5. Samples: 7808839140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 16:56:24,066][15401] Updated weights for policy 0, policy_version 476610 (0.0043) [2024-06-23 16:56:26,649][15401] Updated weights for policy 0, policy_version 476620 (0.0042) [2024-06-23 16:56:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 7809024000. Throughput: 0: 42858.7. Samples: 7809100380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 16:56:31,645][15401] Updated weights for policy 0, policy_version 476630 (0.0032) [2024-06-23 16:56:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42820.7). Total num frames: 7809187840. Throughput: 0: 42890.2. Samples: 7809367160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 16:56:34,183][15401] Updated weights for policy 0, policy_version 476640 (0.0033) [2024-06-23 16:56:38,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 7809384448. Throughput: 0: 42682.6. Samples: 7809486680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 16:56:39,363][15401] Updated weights for policy 0, policy_version 476650 (0.0031) [2024-06-23 16:56:42,087][15401] Updated weights for policy 0, policy_version 476660 (0.0048) [2024-06-23 16:56:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7809646592. Throughput: 0: 42699.2. Samples: 7809736780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 16:56:47,047][15401] Updated weights for policy 0, policy_version 476670 (0.0031) [2024-06-23 16:56:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7809826816. Throughput: 0: 42767.1. Samples: 7810002120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 16:56:49,724][15401] Updated weights for policy 0, policy_version 476680 (0.0025) [2024-06-23 16:56:53,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 7810023424. Throughput: 0: 42640.3. Samples: 7810120020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 16:56:54,811][15401] Updated weights for policy 0, policy_version 476690 (0.0028) [2024-06-23 16:56:57,165][15401] Updated weights for policy 0, policy_version 476700 (0.0039) [2024-06-23 16:56:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 7810301952. Throughput: 0: 42665.8. Samples: 7810378320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:56:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 16:57:02,444][15401] Updated weights for policy 0, policy_version 476710 (0.0031) [2024-06-23 16:57:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7810482176. Throughput: 0: 42840.5. Samples: 7810646300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 16:57:05,288][15401] Updated weights for policy 0, policy_version 476720 (0.0038) [2024-06-23 16:57:06,719][15349] Signal inference workers to stop experience collection... (115750 times) [2024-06-23 16:57:06,720][15349] Signal inference workers to resume experience collection... (115750 times) [2024-06-23 16:57:06,772][15401] InferenceWorker_p0-w0: stopping experience collection (115750 times) [2024-06-23 16:57:06,772][15401] InferenceWorker_p0-w0: resuming experience collection (115750 times) [2024-06-23 16:57:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 7810678784. Throughput: 0: 42942.7. Samples: 7810771560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 16:57:09,946][15401] Updated weights for policy 0, policy_version 476730 (0.0037) [2024-06-23 16:57:12,857][15401] Updated weights for policy 0, policy_version 476740 (0.0036) [2024-06-23 16:57:13,391][15132] Fps is (10 sec: 44227.9, 60 sec: 42598.7, 300 sec: 42931.4). Total num frames: 7810924544. Throughput: 0: 42835.1. Samples: 7811028040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:13,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 16:57:17,577][15401] Updated weights for policy 0, policy_version 476750 (0.0035) [2024-06-23 16:57:18,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 7811121152. Throughput: 0: 42710.4. Samples: 7811289400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:18,396][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 16:57:20,477][15401] Updated weights for policy 0, policy_version 476760 (0.0030) [2024-06-23 16:57:23,392][15132] Fps is (10 sec: 40958.0, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 7811334144. Throughput: 0: 42826.6. Samples: 7811413980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:23,393][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 16:57:25,119][15401] Updated weights for policy 0, policy_version 476770 (0.0032) [2024-06-23 16:57:28,054][15401] Updated weights for policy 0, policy_version 476780 (0.0041) [2024-06-23 16:57:28,389][15132] Fps is (10 sec: 44265.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7811563520. Throughput: 0: 42973.8. Samples: 7811670600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 16:57:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 16:57:32,994][15401] Updated weights for policy 0, policy_version 476790 (0.0032) [2024-06-23 16:57:33,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7811743744. Throughput: 0: 42998.8. Samples: 7811937060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 16:57:35,572][15401] Updated weights for policy 0, policy_version 476800 (0.0039) [2024-06-23 16:57:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7811973120. Throughput: 0: 42989.5. Samples: 7812054540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 16:57:40,464][15401] Updated weights for policy 0, policy_version 476810 (0.0032) [2024-06-23 16:57:43,122][15401] Updated weights for policy 0, policy_version 476820 (0.0034) [2024-06-23 16:57:43,390][15132] Fps is (10 sec: 49151.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7812235264. Throughput: 0: 43110.2. Samples: 7812318280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 16:57:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476821_7812235264.pth... [2024-06-23 16:57:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476190_7801896960.pth [2024-06-23 16:57:48,212][15401] Updated weights for policy 0, policy_version 476830 (0.0046) [2024-06-23 16:57:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7812399104. Throughput: 0: 43052.4. Samples: 7812583660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 16:57:50,999][15401] Updated weights for policy 0, policy_version 476840 (0.0038) [2024-06-23 16:57:53,396][15132] Fps is (10 sec: 39296.5, 60 sec: 43413.0, 300 sec: 42875.2). Total num frames: 7812628480. Throughput: 0: 42737.4. Samples: 7812695020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:53,396][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 16:57:55,879][15401] Updated weights for policy 0, policy_version 476850 (0.0042) [2024-06-23 16:57:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42820.9). Total num frames: 7812825088. Throughput: 0: 42895.7. Samples: 7812958260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:57:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 16:57:58,826][15401] Updated weights for policy 0, policy_version 476860 (0.0033) [2024-06-23 16:58:03,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7813021696. Throughput: 0: 42911.9. Samples: 7813220160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 16:58:03,533][15401] Updated weights for policy 0, policy_version 476870 (0.0024) [2024-06-23 16:58:06,270][15401] Updated weights for policy 0, policy_version 476880 (0.0030) [2024-06-23 16:58:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7813267456. Throughput: 0: 42848.6. Samples: 7813342060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 16:58:10,914][15401] Updated weights for policy 0, policy_version 476890 (0.0037) [2024-06-23 16:58:13,334][15349] Signal inference workers to stop experience collection... (115800 times) [2024-06-23 16:58:13,383][15401] InferenceWorker_p0-w0: stopping experience collection (115800 times) [2024-06-23 16:58:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42599.8, 300 sec: 42876.1). Total num frames: 7813480448. Throughput: 0: 42916.4. Samples: 7813601840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 16:58:13,394][15349] Signal inference workers to resume experience collection... (115800 times) [2024-06-23 16:58:13,399][15401] InferenceWorker_p0-w0: resuming experience collection (115800 times) [2024-06-23 16:58:13,672][15401] Updated weights for policy 0, policy_version 476900 (0.0030) [2024-06-23 16:58:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42602.8, 300 sec: 42876.1). Total num frames: 7813677056. Throughput: 0: 42894.4. Samples: 7813867320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 16:58:18,521][15401] Updated weights for policy 0, policy_version 476910 (0.0028) [2024-06-23 16:58:21,985][15401] Updated weights for policy 0, policy_version 476920 (0.0034) [2024-06-23 16:58:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42932.0). Total num frames: 7813922816. Throughput: 0: 42845.3. Samples: 7813982580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 16:58:26,121][15401] Updated weights for policy 0, policy_version 476930 (0.0036) [2024-06-23 16:58:28,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 7814135808. Throughput: 0: 42630.3. Samples: 7814236640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:28,398][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 16:58:29,397][15401] Updated weights for policy 0, policy_version 476940 (0.0045) [2024-06-23 16:58:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 7814316032. Throughput: 0: 42851.9. Samples: 7814512000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 16:58:33,682][15401] Updated weights for policy 0, policy_version 476950 (0.0033) [2024-06-23 16:58:37,070][15401] Updated weights for policy 0, policy_version 476960 (0.0027) [2024-06-23 16:58:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7814545408. Throughput: 0: 42923.1. Samples: 7814626280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 16:58:41,175][15401] Updated weights for policy 0, policy_version 476970 (0.0030) [2024-06-23 16:58:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7814774784. Throughput: 0: 42801.8. Samples: 7814884340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 16:58:44,525][15401] Updated weights for policy 0, policy_version 476980 (0.0031) [2024-06-23 16:58:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7814955008. Throughput: 0: 42916.3. Samples: 7815151400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 16:58:48,812][15401] Updated weights for policy 0, policy_version 476990 (0.0038) [2024-06-23 16:58:52,266][15401] Updated weights for policy 0, policy_version 477000 (0.0027) [2024-06-23 16:58:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 7815184384. Throughput: 0: 42894.8. Samples: 7815272320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 16:58:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 16:58:56,393][15401] Updated weights for policy 0, policy_version 477010 (0.0040) [2024-06-23 16:58:58,389][15132] Fps is (10 sec: 47514.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 7815430144. Throughput: 0: 42923.2. Samples: 7815533380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:58:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 16:58:59,641][15401] Updated weights for policy 0, policy_version 477020 (0.0029) [2024-06-23 16:59:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7815593984. Throughput: 0: 42942.0. Samples: 7815799700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 16:59:03,884][15401] Updated weights for policy 0, policy_version 477030 (0.0034) [2024-06-23 16:59:07,074][15401] Updated weights for policy 0, policy_version 477040 (0.0034) [2024-06-23 16:59:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7815839744. Throughput: 0: 43099.5. Samples: 7815922060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 16:59:11,420][15401] Updated weights for policy 0, policy_version 477050 (0.0031) [2024-06-23 16:59:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7816069120. Throughput: 0: 43182.7. Samples: 7816179860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 16:59:14,832][15401] Updated weights for policy 0, policy_version 477060 (0.0032) [2024-06-23 16:59:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7816265728. Throughput: 0: 42966.2. Samples: 7816445480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 16:59:18,879][15349] Signal inference workers to stop experience collection... (115850 times) [2024-06-23 16:59:18,879][15349] Signal inference workers to resume experience collection... (115850 times) [2024-06-23 16:59:18,929][15401] InferenceWorker_p0-w0: stopping experience collection (115850 times) [2024-06-23 16:59:18,929][15401] InferenceWorker_p0-w0: resuming experience collection (115850 times) [2024-06-23 16:59:19,034][15401] Updated weights for policy 0, policy_version 477070 (0.0045) [2024-06-23 16:59:22,354][15401] Updated weights for policy 0, policy_version 477080 (0.0038) [2024-06-23 16:59:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7816478720. Throughput: 0: 43151.8. Samples: 7816568120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 16:59:26,575][15401] Updated weights for policy 0, policy_version 477090 (0.0035) [2024-06-23 16:59:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7816724480. Throughput: 0: 43164.4. Samples: 7816826740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 16:59:30,267][15401] Updated weights for policy 0, policy_version 477100 (0.0037) [2024-06-23 16:59:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7816904704. Throughput: 0: 43112.4. Samples: 7817091460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 16:59:34,106][15401] Updated weights for policy 0, policy_version 477110 (0.0026) [2024-06-23 16:59:37,655][15401] Updated weights for policy 0, policy_version 477120 (0.0029) [2024-06-23 16:59:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 7817134080. Throughput: 0: 43134.1. Samples: 7817213360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 16:59:41,861][15401] Updated weights for policy 0, policy_version 477130 (0.0043) [2024-06-23 16:59:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7817363456. Throughput: 0: 43072.3. Samples: 7817471640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 16:59:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477134_7817363456.pth... [2024-06-23 16:59:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476505_7807057920.pth [2024-06-23 16:59:45,151][15401] Updated weights for policy 0, policy_version 477140 (0.0029) [2024-06-23 16:59:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 7817543680. Throughput: 0: 42898.3. Samples: 7817730120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 16:59:49,618][15401] Updated weights for policy 0, policy_version 477150 (0.0028) [2024-06-23 16:59:52,938][15401] Updated weights for policy 0, policy_version 477160 (0.0027) [2024-06-23 16:59:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 7817789440. Throughput: 0: 42993.8. Samples: 7817856780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 16:59:57,110][15401] Updated weights for policy 0, policy_version 477170 (0.0039) [2024-06-23 16:59:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7818002432. Throughput: 0: 42951.0. Samples: 7818112660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 16:59:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 17:00:00,495][15401] Updated weights for policy 0, policy_version 477180 (0.0032) [2024-06-23 17:00:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 7818199040. Throughput: 0: 42801.8. Samples: 7818371660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:00:03,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 17:00:04,671][15401] Updated weights for policy 0, policy_version 477190 (0.0024) [2024-06-23 17:00:08,046][15401] Updated weights for policy 0, policy_version 477200 (0.0026) [2024-06-23 17:00:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 7818444800. Throughput: 0: 42911.2. Samples: 7818499120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:00:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 17:00:12,293][15401] Updated weights for policy 0, policy_version 477210 (0.0036) [2024-06-23 17:00:13,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7818625024. Throughput: 0: 42835.7. Samples: 7818754340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:00:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 17:00:15,929][15401] Updated weights for policy 0, policy_version 477220 (0.0038) [2024-06-23 17:00:18,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7818821632. Throughput: 0: 42838.4. Samples: 7819019180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:00:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 17:00:19,861][15401] Updated weights for policy 0, policy_version 477230 (0.0044) [2024-06-23 17:00:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 7819083776. Throughput: 0: 42815.7. Samples: 7819140060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 17:00:23,488][15401] Updated weights for policy 0, policy_version 477240 (0.0040) [2024-06-23 17:00:28,052][15401] Updated weights for policy 0, policy_version 477250 (0.0051) [2024-06-23 17:00:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7819280384. Throughput: 0: 42798.8. Samples: 7819397580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 17:00:31,539][15401] Updated weights for policy 0, policy_version 477260 (0.0030) [2024-06-23 17:00:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 7819460608. Throughput: 0: 42897.7. Samples: 7819660520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 17:00:35,598][15401] Updated weights for policy 0, policy_version 477270 (0.0031) [2024-06-23 17:00:36,124][15349] Signal inference workers to stop experience collection... (115900 times) [2024-06-23 17:00:36,124][15349] Signal inference workers to resume experience collection... (115900 times) [2024-06-23 17:00:36,155][15401] InferenceWorker_p0-w0: stopping experience collection (115900 times) [2024-06-23 17:00:36,155][15401] InferenceWorker_p0-w0: resuming experience collection (115900 times) [2024-06-23 17:00:38,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7819722752. Throughput: 0: 42741.3. Samples: 7819780140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 17:00:39,157][15401] Updated weights for policy 0, policy_version 477280 (0.0047) [2024-06-23 17:00:43,088][15401] Updated weights for policy 0, policy_version 477290 (0.0029) [2024-06-23 17:00:43,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7819935744. Throughput: 0: 42867.6. Samples: 7820041700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 17:00:46,808][15401] Updated weights for policy 0, policy_version 477300 (0.0031) [2024-06-23 17:00:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 7820099584. Throughput: 0: 42985.4. Samples: 7820305900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:48,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 17:00:50,657][15401] Updated weights for policy 0, policy_version 477310 (0.0034) [2024-06-23 17:00:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 7820378112. Throughput: 0: 42810.2. Samples: 7820425580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 17:00:54,930][15401] Updated weights for policy 0, policy_version 477320 (0.0039) [2024-06-23 17:00:58,188][15401] Updated weights for policy 0, policy_version 477330 (0.0041) [2024-06-23 17:00:58,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7820574720. Throughput: 0: 43034.9. Samples: 7820690920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:00:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 17:01:02,290][15401] Updated weights for policy 0, policy_version 477340 (0.0045) [2024-06-23 17:01:03,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 7820754944. Throughput: 0: 43155.4. Samples: 7820961180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 17:01:05,770][15401] Updated weights for policy 0, policy_version 477350 (0.0033) [2024-06-23 17:01:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 7821000704. Throughput: 0: 43096.0. Samples: 7821079380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 17:01:09,826][15401] Updated weights for policy 0, policy_version 477360 (0.0044) [2024-06-23 17:01:13,329][15401] Updated weights for policy 0, policy_version 477370 (0.0036) [2024-06-23 17:01:13,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.4, 300 sec: 42987.2). Total num frames: 7821230080. Throughput: 0: 43319.8. Samples: 7821346980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 17:01:17,340][15401] Updated weights for policy 0, policy_version 477380 (0.0037) [2024-06-23 17:01:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 7821410304. Throughput: 0: 43361.5. Samples: 7821611780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 17:01:20,869][15401] Updated weights for policy 0, policy_version 477390 (0.0048) [2024-06-23 17:01:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7821656064. Throughput: 0: 43400.2. Samples: 7821733140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:23,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 17:01:24,842][15401] Updated weights for policy 0, policy_version 477400 (0.0030) [2024-06-23 17:01:28,311][15401] Updated weights for policy 0, policy_version 477410 (0.0027) [2024-06-23 17:01:28,392][15132] Fps is (10 sec: 47501.5, 60 sec: 43415.7, 300 sec: 43042.4). Total num frames: 7821885440. Throughput: 0: 43460.8. Samples: 7821997540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:28,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 17:01:32,930][15401] Updated weights for policy 0, policy_version 477420 (0.0036) [2024-06-23 17:01:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 7822065664. Throughput: 0: 43302.2. Samples: 7822254500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:33,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 17:01:36,094][15401] Updated weights for policy 0, policy_version 477430 (0.0029) [2024-06-23 17:01:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7822311424. Throughput: 0: 43315.1. Samples: 7822374760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:38,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-23 17:01:40,450][15401] Updated weights for policy 0, policy_version 477440 (0.0029) [2024-06-23 17:01:43,363][15349] Signal inference workers to stop experience collection... (115950 times) [2024-06-23 17:01:43,368][15349] Signal inference workers to resume experience collection... (115950 times) [2024-06-23 17:01:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 7822524416. Throughput: 0: 43301.8. Samples: 7822639500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 17:01:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 17:01:43,405][15401] InferenceWorker_p0-w0: stopping experience collection (115950 times) [2024-06-23 17:01:43,405][15401] InferenceWorker_p0-w0: resuming experience collection (115950 times) [2024-06-23 17:01:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477450_7822540800.pth... [2024-06-23 17:01:43,510][15401] Updated weights for policy 0, policy_version 477450 (0.0044) [2024-06-23 17:01:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000476821_7812235264.pth [2024-06-23 17:01:48,183][15401] Updated weights for policy 0, policy_version 477460 (0.0037) [2024-06-23 17:01:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 7822704640. Throughput: 0: 43045.0. Samples: 7822898200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:01:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 17:01:51,144][15401] Updated weights for policy 0, policy_version 477470 (0.0036) [2024-06-23 17:01:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7822966784. Throughput: 0: 43099.9. Samples: 7823018880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:01:53,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 17:01:55,685][15401] Updated weights for policy 0, policy_version 477480 (0.0046) [2024-06-23 17:01:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 7823163392. Throughput: 0: 42929.1. Samples: 7823278780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:01:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 17:01:58,808][15401] Updated weights for policy 0, policy_version 477490 (0.0041) [2024-06-23 17:02:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 7823343616. Throughput: 0: 42918.2. Samples: 7823543100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 17:02:03,468][15401] Updated weights for policy 0, policy_version 477500 (0.0041) [2024-06-23 17:02:06,231][15401] Updated weights for policy 0, policy_version 477510 (0.0028) [2024-06-23 17:02:08,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43415.8, 300 sec: 42987.1). Total num frames: 7823605760. Throughput: 0: 42850.0. Samples: 7823661500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:08,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 17:02:11,175][15401] Updated weights for policy 0, policy_version 477520 (0.0043) [2024-06-23 17:02:13,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 43043.6). Total num frames: 7823818752. Throughput: 0: 42912.0. Samples: 7823928480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 17:02:13,714][15401] Updated weights for policy 0, policy_version 477530 (0.0031) [2024-06-23 17:02:18,389][15132] Fps is (10 sec: 37692.9, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 7823982592. Throughput: 0: 42933.6. Samples: 7824186500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 17:02:18,721][15401] Updated weights for policy 0, policy_version 477540 (0.0035) [2024-06-23 17:02:21,258][15401] Updated weights for policy 0, policy_version 477550 (0.0027) [2024-06-23 17:02:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7824228352. Throughput: 0: 42852.5. Samples: 7824303120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 17:02:26,171][15401] Updated weights for policy 0, policy_version 477560 (0.0040) [2024-06-23 17:02:28,389][15132] Fps is (10 sec: 49151.3, 60 sec: 43146.3, 300 sec: 43153.8). Total num frames: 7824474112. Throughput: 0: 42952.4. Samples: 7824572360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 17:02:28,643][15401] Updated weights for policy 0, policy_version 477570 (0.0029) [2024-06-23 17:02:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 7824654336. Throughput: 0: 42987.6. Samples: 7824832640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 17:02:33,730][15401] Updated weights for policy 0, policy_version 477580 (0.0021) [2024-06-23 17:02:36,779][15401] Updated weights for policy 0, policy_version 477590 (0.0027) [2024-06-23 17:02:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7824883712. Throughput: 0: 43029.0. Samples: 7824955180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 17:02:41,638][15401] Updated weights for policy 0, policy_version 477600 (0.0039) [2024-06-23 17:02:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 7825096704. Throughput: 0: 43092.8. Samples: 7825217960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 17:02:44,555][15401] Updated weights for policy 0, policy_version 477610 (0.0045) [2024-06-23 17:02:48,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42931.6). Total num frames: 7825293312. Throughput: 0: 42948.5. Samples: 7825476060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:48,396][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 17:02:48,963][15401] Updated weights for policy 0, policy_version 477620 (0.0039) [2024-06-23 17:02:52,207][15401] Updated weights for policy 0, policy_version 477630 (0.0037) [2024-06-23 17:02:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 7825522688. Throughput: 0: 43146.9. Samples: 7825603000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 17:02:56,460][15401] Updated weights for policy 0, policy_version 477640 (0.0028) [2024-06-23 17:02:58,392][15132] Fps is (10 sec: 45893.4, 60 sec: 43142.7, 300 sec: 43153.4). Total num frames: 7825752064. Throughput: 0: 42996.0. Samples: 7825863400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:02:58,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 17:02:59,684][15401] Updated weights for policy 0, policy_version 477650 (0.0037) [2024-06-23 17:03:01,120][15349] Signal inference workers to stop experience collection... (116000 times) [2024-06-23 17:03:01,122][15349] Signal inference workers to resume experience collection... (116000 times) [2024-06-23 17:03:01,156][15401] InferenceWorker_p0-w0: stopping experience collection (116000 times) [2024-06-23 17:03:01,156][15401] InferenceWorker_p0-w0: resuming experience collection (116000 times) [2024-06-23 17:03:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7825932288. Throughput: 0: 43146.2. Samples: 7826128080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:03:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 17:03:03,882][15401] Updated weights for policy 0, policy_version 477660 (0.0034) [2024-06-23 17:03:07,651][15401] Updated weights for policy 0, policy_version 477670 (0.0035) [2024-06-23 17:03:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 7826161664. Throughput: 0: 43093.7. Samples: 7826242340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 17:03:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 17:03:11,563][15401] Updated weights for policy 0, policy_version 477680 (0.0036) [2024-06-23 17:03:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 43153.8). Total num frames: 7826407424. Throughput: 0: 42988.4. Samples: 7826506840. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 17:03:15,106][15401] Updated weights for policy 0, policy_version 477690 (0.0024) [2024-06-23 17:03:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 7826587648. Throughput: 0: 43103.8. Samples: 7826772320. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 17:03:18,885][15401] Updated weights for policy 0, policy_version 477700 (0.0037) [2024-06-23 17:03:22,555][15401] Updated weights for policy 0, policy_version 477710 (0.0027) [2024-06-23 17:03:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7826817024. Throughput: 0: 43128.4. Samples: 7826895960. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 17:03:26,240][15401] Updated weights for policy 0, policy_version 477720 (0.0033) [2024-06-23 17:03:28,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 7827046400. Throughput: 0: 43087.0. Samples: 7827156880. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 17:03:30,002][15401] Updated weights for policy 0, policy_version 477730 (0.0037) [2024-06-23 17:03:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 7827259392. Throughput: 0: 43252.3. Samples: 7827422140. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 17:03:33,734][15401] Updated weights for policy 0, policy_version 477740 (0.0032) [2024-06-23 17:03:37,969][15401] Updated weights for policy 0, policy_version 477750 (0.0031) [2024-06-23 17:03:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.7, 300 sec: 43042.4). Total num frames: 7827472384. Throughput: 0: 43156.2. Samples: 7827545140. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:38,393][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 17:03:41,629][15401] Updated weights for policy 0, policy_version 477760 (0.0042) [2024-06-23 17:03:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 43153.8). Total num frames: 7827685376. Throughput: 0: 43019.6. Samples: 7827799180. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:43,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 17:03:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477765_7827701760.pth... [2024-06-23 17:03:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477134_7817363456.pth [2024-06-23 17:03:45,918][15401] Updated weights for policy 0, policy_version 477770 (0.0031) [2024-06-23 17:03:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43149.1, 300 sec: 43042.7). Total num frames: 7827881984. Throughput: 0: 42793.3. Samples: 7828053780. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 17:03:49,221][15401] Updated weights for policy 0, policy_version 477780 (0.0037) [2024-06-23 17:03:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7828094976. Throughput: 0: 43019.1. Samples: 7828178200. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 17:03:53,515][15401] Updated weights for policy 0, policy_version 477790 (0.0043) [2024-06-23 17:03:56,838][15401] Updated weights for policy 0, policy_version 477800 (0.0034) [2024-06-23 17:03:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 43153.8). Total num frames: 7828324352. Throughput: 0: 42872.8. Samples: 7828436120. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:03:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 17:04:01,083][15401] Updated weights for policy 0, policy_version 477810 (0.0029) [2024-06-23 17:04:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 7828537344. Throughput: 0: 42736.6. Samples: 7828695460. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 17:04:04,772][15401] Updated weights for policy 0, policy_version 477820 (0.0039) [2024-06-23 17:04:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7828750336. Throughput: 0: 42886.1. Samples: 7828825840. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 17:04:08,579][15401] Updated weights for policy 0, policy_version 477830 (0.0026) [2024-06-23 17:04:12,197][15401] Updated weights for policy 0, policy_version 477840 (0.0041) [2024-06-23 17:04:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 7828963328. Throughput: 0: 42788.1. Samples: 7829082340. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 17:04:16,085][15401] Updated weights for policy 0, policy_version 477850 (0.0037) [2024-06-23 17:04:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7829159936. Throughput: 0: 42709.3. Samples: 7829344060. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:18,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 17:04:19,732][15401] Updated weights for policy 0, policy_version 477860 (0.0032) [2024-06-23 17:04:22,023][15349] Signal inference workers to stop experience collection... (116050 times) [2024-06-23 17:04:22,023][15349] Signal inference workers to resume experience collection... (116050 times) [2024-06-23 17:04:22,053][15401] InferenceWorker_p0-w0: stopping experience collection (116050 times) [2024-06-23 17:04:22,053][15401] InferenceWorker_p0-w0: resuming experience collection (116050 times) [2024-06-23 17:04:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7829389312. Throughput: 0: 42826.8. Samples: 7829472240. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 17:04:23,648][15401] Updated weights for policy 0, policy_version 477870 (0.0034) [2024-06-23 17:04:27,597][15401] Updated weights for policy 0, policy_version 477880 (0.0032) [2024-06-23 17:04:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 7829602304. Throughput: 0: 42821.4. Samples: 7829726140. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 17:04:31,928][15401] Updated weights for policy 0, policy_version 477890 (0.0049) [2024-06-23 17:04:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 7829815296. Throughput: 0: 42892.0. Samples: 7829983920. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-06-23 17:04:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 17:04:35,362][15401] Updated weights for policy 0, policy_version 477900 (0.0032) [2024-06-23 17:04:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 7830044672. Throughput: 0: 42985.3. Samples: 7830112540. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:04:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 17:04:39,382][15401] Updated weights for policy 0, policy_version 477910 (0.0032) [2024-06-23 17:04:43,066][15401] Updated weights for policy 0, policy_version 477920 (0.0043) [2024-06-23 17:04:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 7830241280. Throughput: 0: 42954.7. Samples: 7830369080. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:04:43,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 17:04:46,956][15401] Updated weights for policy 0, policy_version 477930 (0.0038) [2024-06-23 17:04:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7830454272. Throughput: 0: 42967.9. Samples: 7830629020. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:04:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 17:04:50,953][15401] Updated weights for policy 0, policy_version 477940 (0.0033) [2024-06-23 17:04:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7830683648. Throughput: 0: 42853.4. Samples: 7830754240. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:04:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 17:04:54,804][15401] Updated weights for policy 0, policy_version 477950 (0.0026) [2024-06-23 17:04:58,355][15401] Updated weights for policy 0, policy_version 477960 (0.0031) [2024-06-23 17:04:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 7830896640. Throughput: 0: 42836.7. Samples: 7831010000. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:04:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 17:05:02,329][15401] Updated weights for policy 0, policy_version 477970 (0.0040) [2024-06-23 17:05:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 7831093248. Throughput: 0: 42772.6. Samples: 7831268820. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 17:05:05,965][15401] Updated weights for policy 0, policy_version 477980 (0.0028) [2024-06-23 17:05:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 7831322624. Throughput: 0: 42684.1. Samples: 7831393020. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 17:05:09,861][15401] Updated weights for policy 0, policy_version 477990 (0.0030) [2024-06-23 17:05:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 7831535616. Throughput: 0: 42676.0. Samples: 7831646560. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 17:05:13,575][15401] Updated weights for policy 0, policy_version 478000 (0.0032) [2024-06-23 17:05:17,568][15401] Updated weights for policy 0, policy_version 478010 (0.0032) [2024-06-23 17:05:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7831732224. Throughput: 0: 42719.0. Samples: 7831906280. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 17:05:21,306][15401] Updated weights for policy 0, policy_version 478020 (0.0028) [2024-06-23 17:05:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 7831961600. Throughput: 0: 42695.6. Samples: 7832033940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:23,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 17:05:25,137][15401] Updated weights for policy 0, policy_version 478030 (0.0034) [2024-06-23 17:05:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 7832158208. Throughput: 0: 42623.5. Samples: 7832287140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 17:05:29,231][15401] Updated weights for policy 0, policy_version 478040 (0.0027) [2024-06-23 17:05:32,586][15401] Updated weights for policy 0, policy_version 478050 (0.0029) [2024-06-23 17:05:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7832387584. Throughput: 0: 42427.2. Samples: 7832538240. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 17:05:36,872][15401] Updated weights for policy 0, policy_version 478060 (0.0030) [2024-06-23 17:05:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 7832567808. Throughput: 0: 42570.2. Samples: 7832669900. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 17:05:40,321][15401] Updated weights for policy 0, policy_version 478070 (0.0039) [2024-06-23 17:05:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 7832797184. Throughput: 0: 42615.3. Samples: 7832927680. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 17:05:43,490][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478077_7832813568.pth... [2024-06-23 17:05:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477450_7822540800.pth [2024-06-23 17:05:44,441][15401] Updated weights for policy 0, policy_version 478080 (0.0030) [2024-06-23 17:05:46,949][15349] Signal inference workers to stop experience collection... (116100 times) [2024-06-23 17:05:46,950][15349] Signal inference workers to resume experience collection... (116100 times) [2024-06-23 17:05:46,984][15401] InferenceWorker_p0-w0: stopping experience collection (116100 times) [2024-06-23 17:05:46,984][15401] InferenceWorker_p0-w0: resuming experience collection (116100 times) [2024-06-23 17:05:48,316][15401] Updated weights for policy 0, policy_version 478090 (0.0042) [2024-06-23 17:05:48,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 7833026560. Throughput: 0: 42470.3. Samples: 7833180260. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:48,397][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 17:05:52,289][15401] Updated weights for policy 0, policy_version 478100 (0.0037) [2024-06-23 17:05:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 7833206784. Throughput: 0: 42687.1. Samples: 7833313940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 17:05:56,134][15401] Updated weights for policy 0, policy_version 478110 (0.0031) [2024-06-23 17:05:58,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 7833436160. Throughput: 0: 42396.5. Samples: 7833554400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-06-23 17:05:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 17:05:59,935][15401] Updated weights for policy 0, policy_version 478120 (0.0031) [2024-06-23 17:06:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7833649152. Throughput: 0: 42648.5. Samples: 7833825460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 17:06:03,722][15401] Updated weights for policy 0, policy_version 478130 (0.0034) [2024-06-23 17:06:07,383][15401] Updated weights for policy 0, policy_version 478140 (0.0023) [2024-06-23 17:06:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 7833845760. Throughput: 0: 42601.5. Samples: 7833950900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 17:06:11,255][15401] Updated weights for policy 0, policy_version 478150 (0.0036) [2024-06-23 17:06:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42987.1). Total num frames: 7834091520. Throughput: 0: 42563.6. Samples: 7834202500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 17:06:14,878][15401] Updated weights for policy 0, policy_version 478160 (0.0055) [2024-06-23 17:06:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7834304512. Throughput: 0: 42834.2. Samples: 7834465780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 17:06:18,894][15401] Updated weights for policy 0, policy_version 478170 (0.0036) [2024-06-23 17:06:22,802][15401] Updated weights for policy 0, policy_version 478180 (0.0034) [2024-06-23 17:06:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 7834501120. Throughput: 0: 42665.6. Samples: 7834589860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 17:06:26,551][15401] Updated weights for policy 0, policy_version 478190 (0.0029) [2024-06-23 17:06:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7834730496. Throughput: 0: 42683.9. Samples: 7834848460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 17:06:30,996][15401] Updated weights for policy 0, policy_version 478200 (0.0039) [2024-06-23 17:06:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7834927104. Throughput: 0: 42779.9. Samples: 7835105080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 17:06:34,220][15401] Updated weights for policy 0, policy_version 478210 (0.0026) [2024-06-23 17:06:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 7835140096. Throughput: 0: 42595.9. Samples: 7835230860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:38,392][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 17:06:38,435][15401] Updated weights for policy 0, policy_version 478220 (0.0034) [2024-06-23 17:06:42,056][15401] Updated weights for policy 0, policy_version 478230 (0.0037) [2024-06-23 17:06:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7835369472. Throughput: 0: 42905.4. Samples: 7835485140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 17:06:46,031][15401] Updated weights for policy 0, policy_version 478240 (0.0045) [2024-06-23 17:06:48,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42329.7, 300 sec: 42709.5). Total num frames: 7835566080. Throughput: 0: 42634.5. Samples: 7835744020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 17:06:49,867][15401] Updated weights for policy 0, policy_version 478250 (0.0030) [2024-06-23 17:06:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 7835795456. Throughput: 0: 42674.2. Samples: 7835871240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 17:06:53,500][15401] Updated weights for policy 0, policy_version 478260 (0.0039) [2024-06-23 17:06:57,389][15401] Updated weights for policy 0, policy_version 478270 (0.0030) [2024-06-23 17:06:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7836008448. Throughput: 0: 42817.5. Samples: 7836129280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:06:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 17:07:00,930][15401] Updated weights for policy 0, policy_version 478280 (0.0035) [2024-06-23 17:07:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 7836221440. Throughput: 0: 42655.1. Samples: 7836385260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 17:07:04,069][15349] Signal inference workers to stop experience collection... (116150 times) [2024-06-23 17:07:04,076][15349] Signal inference workers to resume experience collection... (116150 times) [2024-06-23 17:07:04,083][15401] InferenceWorker_p0-w0: stopping experience collection (116150 times) [2024-06-23 17:07:04,096][15401] InferenceWorker_p0-w0: resuming experience collection (116150 times) [2024-06-23 17:07:05,047][15401] Updated weights for policy 0, policy_version 478290 (0.0034) [2024-06-23 17:07:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7836434432. Throughput: 0: 42769.0. Samples: 7836514460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 17:07:09,072][15401] Updated weights for policy 0, policy_version 478300 (0.0034) [2024-06-23 17:07:12,882][15401] Updated weights for policy 0, policy_version 478310 (0.0034) [2024-06-23 17:07:13,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42593.9, 300 sec: 42930.7). Total num frames: 7836647424. Throughput: 0: 42561.1. Samples: 7836763980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:13,396][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 17:07:16,747][15401] Updated weights for policy 0, policy_version 478320 (0.0035) [2024-06-23 17:07:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7836844032. Throughput: 0: 42653.3. Samples: 7837024480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 17:07:20,411][15401] Updated weights for policy 0, policy_version 478330 (0.0031) [2024-06-23 17:07:23,390][15132] Fps is (10 sec: 44264.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7837089792. Throughput: 0: 42718.7. Samples: 7837153100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 17:07:24,135][15401] Updated weights for policy 0, policy_version 478340 (0.0030) [2024-06-23 17:07:27,874][15401] Updated weights for policy 0, policy_version 478350 (0.0033) [2024-06-23 17:07:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7837302784. Throughput: 0: 42792.8. Samples: 7837410820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 17:07:28,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 17:07:31,820][15401] Updated weights for policy 0, policy_version 478360 (0.0034) [2024-06-23 17:07:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7837483008. Throughput: 0: 42652.2. Samples: 7837663360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 17:07:35,525][15401] Updated weights for policy 0, policy_version 478370 (0.0024) [2024-06-23 17:07:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7837696000. Throughput: 0: 42542.2. Samples: 7837785640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 17:07:39,582][15401] Updated weights for policy 0, policy_version 478380 (0.0035) [2024-06-23 17:07:43,088][15401] Updated weights for policy 0, policy_version 478390 (0.0024) [2024-06-23 17:07:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 7837941760. Throughput: 0: 42567.0. Samples: 7838044800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:43,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 17:07:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478390_7837941760.pth... [2024-06-23 17:07:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000477765_7827701760.pth [2024-06-23 17:07:47,565][15401] Updated weights for policy 0, policy_version 478400 (0.0029) [2024-06-23 17:07:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 7838121984. Throughput: 0: 42475.7. Samples: 7838296660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 17:07:51,084][15401] Updated weights for policy 0, policy_version 478410 (0.0031) [2024-06-23 17:07:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.5, 300 sec: 42653.9). Total num frames: 7838334976. Throughput: 0: 42416.8. Samples: 7838423320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:53,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 17:07:55,290][15401] Updated weights for policy 0, policy_version 478420 (0.0037) [2024-06-23 17:07:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7838564352. Throughput: 0: 42662.1. Samples: 7838683500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:07:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 17:07:58,682][15401] Updated weights for policy 0, policy_version 478430 (0.0031) [2024-06-23 17:08:02,732][15401] Updated weights for policy 0, policy_version 478440 (0.0031) [2024-06-23 17:08:03,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7838777344. Throughput: 0: 42628.0. Samples: 7838942740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 17:08:06,141][15401] Updated weights for policy 0, policy_version 478450 (0.0025) [2024-06-23 17:08:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7838973952. Throughput: 0: 42635.1. Samples: 7839071680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 17:08:10,387][15401] Updated weights for policy 0, policy_version 478460 (0.0034) [2024-06-23 17:08:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 7839219712. Throughput: 0: 42820.0. Samples: 7839337720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 17:08:13,631][15401] Updated weights for policy 0, policy_version 478470 (0.0036) [2024-06-23 17:08:17,072][15349] Signal inference workers to stop experience collection... (116200 times) [2024-06-23 17:08:17,127][15401] InferenceWorker_p0-w0: stopping experience collection (116200 times) [2024-06-23 17:08:17,187][15349] Signal inference workers to resume experience collection... (116200 times) [2024-06-23 17:08:17,188][15401] InferenceWorker_p0-w0: resuming experience collection (116200 times) [2024-06-23 17:08:18,088][15401] Updated weights for policy 0, policy_version 478480 (0.0024) [2024-06-23 17:08:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7839432704. Throughput: 0: 42985.3. Samples: 7839597700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:18,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 17:08:21,219][15401] Updated weights for policy 0, policy_version 478490 (0.0034) [2024-06-23 17:08:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7839629312. Throughput: 0: 43056.5. Samples: 7839723180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 17:08:25,595][15401] Updated weights for policy 0, policy_version 478500 (0.0038) [2024-06-23 17:08:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7839875072. Throughput: 0: 43171.7. Samples: 7839987520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 17:08:28,756][15401] Updated weights for policy 0, policy_version 478510 (0.0032) [2024-06-23 17:08:33,327][15401] Updated weights for policy 0, policy_version 478520 (0.0036) [2024-06-23 17:08:33,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 7840071680. Throughput: 0: 43219.7. Samples: 7840241560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 17:08:36,514][15401] Updated weights for policy 0, policy_version 478530 (0.0041) [2024-06-23 17:08:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7840284672. Throughput: 0: 43177.9. Samples: 7840366220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 17:08:41,206][15401] Updated weights for policy 0, policy_version 478540 (0.0035) [2024-06-23 17:08:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7840497664. Throughput: 0: 43134.1. Samples: 7840624540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:43,394][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 17:08:44,123][15401] Updated weights for policy 0, policy_version 478550 (0.0028) [2024-06-23 17:08:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7840694272. Throughput: 0: 42962.7. Samples: 7840876060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 17:08:48,841][15401] Updated weights for policy 0, policy_version 478560 (0.0035) [2024-06-23 17:08:51,878][15401] Updated weights for policy 0, policy_version 478570 (0.0033) [2024-06-23 17:08:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.4, 300 sec: 42709.5). Total num frames: 7840923648. Throughput: 0: 42880.1. Samples: 7841001280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 17:08:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 17:08:56,470][15401] Updated weights for policy 0, policy_version 478580 (0.0042) [2024-06-23 17:08:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7841120256. Throughput: 0: 42621.8. Samples: 7841255700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:08:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 17:08:59,947][15401] Updated weights for policy 0, policy_version 478590 (0.0026) [2024-06-23 17:09:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7841316864. Throughput: 0: 42620.0. Samples: 7841515600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 17:09:03,943][15401] Updated weights for policy 0, policy_version 478600 (0.0030) [2024-06-23 17:09:07,676][15401] Updated weights for policy 0, policy_version 478610 (0.0038) [2024-06-23 17:09:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 7841562624. Throughput: 0: 42476.8. Samples: 7841634740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:08,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 17:09:11,840][15401] Updated weights for policy 0, policy_version 478620 (0.0043) [2024-06-23 17:09:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7841759232. Throughput: 0: 42438.1. Samples: 7841897240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 17:09:15,318][15401] Updated weights for policy 0, policy_version 478630 (0.0032) [2024-06-23 17:09:18,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7841972224. Throughput: 0: 42511.7. Samples: 7842154580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 17:09:19,633][15401] Updated weights for policy 0, policy_version 478640 (0.0035) [2024-06-23 17:09:23,125][15401] Updated weights for policy 0, policy_version 478650 (0.0030) [2024-06-23 17:09:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7842201600. Throughput: 0: 42434.7. Samples: 7842275780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 17:09:27,211][15401] Updated weights for policy 0, policy_version 478660 (0.0026) [2024-06-23 17:09:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7842430976. Throughput: 0: 42603.3. Samples: 7842541680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 17:09:30,819][15401] Updated weights for policy 0, policy_version 478670 (0.0038) [2024-06-23 17:09:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7842627584. Throughput: 0: 42628.3. Samples: 7842794340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 17:09:34,906][15401] Updated weights for policy 0, policy_version 478680 (0.0032) [2024-06-23 17:09:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7842840576. Throughput: 0: 42619.8. Samples: 7842919180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 17:09:38,596][15401] Updated weights for policy 0, policy_version 478690 (0.0037) [2024-06-23 17:09:41,473][15349] Signal inference workers to stop experience collection... (116250 times) [2024-06-23 17:09:41,482][15349] Signal inference workers to resume experience collection... (116250 times) [2024-06-23 17:09:41,491][15401] InferenceWorker_p0-w0: stopping experience collection (116250 times) [2024-06-23 17:09:41,505][15401] InferenceWorker_p0-w0: resuming experience collection (116250 times) [2024-06-23 17:09:42,480][15401] Updated weights for policy 0, policy_version 478700 (0.0043) [2024-06-23 17:09:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7843053568. Throughput: 0: 42831.0. Samples: 7843183100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 17:09:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478702_7843053568.pth... [2024-06-23 17:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478077_7832813568.pth [2024-06-23 17:09:46,314][15401] Updated weights for policy 0, policy_version 478710 (0.0031) [2024-06-23 17:09:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7843266560. Throughput: 0: 42506.7. Samples: 7843428400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 17:09:50,451][15401] Updated weights for policy 0, policy_version 478720 (0.0040) [2024-06-23 17:09:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 7843479552. Throughput: 0: 42859.1. Samples: 7843563300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:53,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-23 17:09:53,841][15401] Updated weights for policy 0, policy_version 478730 (0.0031) [2024-06-23 17:09:57,938][15401] Updated weights for policy 0, policy_version 478740 (0.0037) [2024-06-23 17:09:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7843676160. Throughput: 0: 42744.6. Samples: 7843820740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:09:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 17:10:01,482][15401] Updated weights for policy 0, policy_version 478750 (0.0036) [2024-06-23 17:10:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7843921920. Throughput: 0: 42448.0. Samples: 7844064740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:10:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 17:10:05,904][15401] Updated weights for policy 0, policy_version 478760 (0.0032) [2024-06-23 17:10:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 7844134912. Throughput: 0: 42823.0. Samples: 7844202820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:10:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 17:10:09,039][15401] Updated weights for policy 0, policy_version 478770 (0.0024) [2024-06-23 17:10:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7844315136. Throughput: 0: 42643.9. Samples: 7844460660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:10:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 17:10:13,471][15401] Updated weights for policy 0, policy_version 478780 (0.0033) [2024-06-23 17:10:17,083][15401] Updated weights for policy 0, policy_version 478790 (0.0038) [2024-06-23 17:10:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42765.4). Total num frames: 7844577280. Throughput: 0: 42551.1. Samples: 7844709140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 17:10:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 17:10:21,115][15401] Updated weights for policy 0, policy_version 478800 (0.0052) [2024-06-23 17:10:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7844757504. Throughput: 0: 42726.2. Samples: 7844841860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 17:10:24,902][15401] Updated weights for policy 0, policy_version 478810 (0.0032) [2024-06-23 17:10:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 7844954112. Throughput: 0: 42457.3. Samples: 7845093680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 17:10:28,699][15401] Updated weights for policy 0, policy_version 478820 (0.0024) [2024-06-23 17:10:32,596][15401] Updated weights for policy 0, policy_version 478830 (0.0035) [2024-06-23 17:10:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 7845199872. Throughput: 0: 42574.2. Samples: 7845344240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 17:10:36,438][15401] Updated weights for policy 0, policy_version 478840 (0.0041) [2024-06-23 17:10:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7845396480. Throughput: 0: 42608.1. Samples: 7845480660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 17:10:40,269][15401] Updated weights for policy 0, policy_version 478850 (0.0032) [2024-06-23 17:10:43,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.7, 300 sec: 42599.0). Total num frames: 7845593088. Throughput: 0: 42408.8. Samples: 7845729240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:43,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 17:10:44,279][15401] Updated weights for policy 0, policy_version 478860 (0.0035) [2024-06-23 17:10:47,785][15401] Updated weights for policy 0, policy_version 478870 (0.0036) [2024-06-23 17:10:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 7845838848. Throughput: 0: 42677.8. Samples: 7845985340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:48,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 17:10:51,826][15401] Updated weights for policy 0, policy_version 478880 (0.0030) [2024-06-23 17:10:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7846035456. Throughput: 0: 42510.3. Samples: 7846115780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:53,391][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 17:10:55,642][15401] Updated weights for policy 0, policy_version 478890 (0.0029) [2024-06-23 17:10:58,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7846232064. Throughput: 0: 42356.4. Samples: 7846366700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:10:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 17:10:59,194][15349] Signal inference workers to stop experience collection... (116300 times) [2024-06-23 17:10:59,195][15349] Signal inference workers to resume experience collection... (116300 times) [2024-06-23 17:10:59,209][15401] InferenceWorker_p0-w0: stopping experience collection (116300 times) [2024-06-23 17:10:59,233][15401] InferenceWorker_p0-w0: resuming experience collection (116300 times) [2024-06-23 17:10:59,340][15401] Updated weights for policy 0, policy_version 478900 (0.0034) [2024-06-23 17:11:03,149][15401] Updated weights for policy 0, policy_version 478910 (0.0038) [2024-06-23 17:11:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7846477824. Throughput: 0: 42546.7. Samples: 7846623740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 17:11:06,951][15401] Updated weights for policy 0, policy_version 478920 (0.0039) [2024-06-23 17:11:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7846658048. Throughput: 0: 42482.7. Samples: 7846753580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 17:11:10,790][15401] Updated weights for policy 0, policy_version 478930 (0.0033) [2024-06-23 17:11:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7846887424. Throughput: 0: 42535.9. Samples: 7847007800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 17:11:14,520][15401] Updated weights for policy 0, policy_version 478940 (0.0035) [2024-06-23 17:11:18,388][15401] Updated weights for policy 0, policy_version 478950 (0.0030) [2024-06-23 17:11:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7847116800. Throughput: 0: 42704.0. Samples: 7847265920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:18,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-23 17:11:22,262][15401] Updated weights for policy 0, policy_version 478960 (0.0039) [2024-06-23 17:11:23,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 7847297024. Throughput: 0: 42479.0. Samples: 7847392320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:23,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 17:11:26,140][15401] Updated weights for policy 0, policy_version 478970 (0.0031) [2024-06-23 17:11:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7847526400. Throughput: 0: 42666.2. Samples: 7847649120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 17:11:30,096][15401] Updated weights for policy 0, policy_version 478980 (0.0029) [2024-06-23 17:11:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 7847739392. Throughput: 0: 42599.1. Samples: 7847902200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 17:11:33,660][15401] Updated weights for policy 0, policy_version 478990 (0.0034) [2024-06-23 17:11:37,978][15401] Updated weights for policy 0, policy_version 479000 (0.0029) [2024-06-23 17:11:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7847952384. Throughput: 0: 42582.7. Samples: 7848032000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 17:11:41,409][15401] Updated weights for policy 0, policy_version 479010 (0.0027) [2024-06-23 17:11:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 7848148992. Throughput: 0: 42799.2. Samples: 7848292660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 17:11:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 17:11:43,590][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479015_7848181760.pth... [2024-06-23 17:11:43,655][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478390_7837941760.pth [2024-06-23 17:11:45,662][15401] Updated weights for policy 0, policy_version 479020 (0.0024) [2024-06-23 17:11:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 7848378368. Throughput: 0: 42671.0. Samples: 7848543940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:11:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 17:11:49,035][15401] Updated weights for policy 0, policy_version 479030 (0.0038) [2024-06-23 17:11:53,255][15401] Updated weights for policy 0, policy_version 479040 (0.0033) [2024-06-23 17:11:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 7848591360. Throughput: 0: 42692.4. Samples: 7848674840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:11:53,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 17:11:56,472][15401] Updated weights for policy 0, policy_version 479050 (0.0051) [2024-06-23 17:11:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7848787968. Throughput: 0: 42639.7. Samples: 7848926580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:11:58,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 17:12:00,849][15401] Updated weights for policy 0, policy_version 479060 (0.0041) [2024-06-23 17:12:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7849017344. Throughput: 0: 42538.7. Samples: 7849180160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 17:12:04,926][15401] Updated weights for policy 0, policy_version 479070 (0.0029) [2024-06-23 17:12:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 7849230336. Throughput: 0: 42723.7. Samples: 7849314780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 17:12:08,713][15401] Updated weights for policy 0, policy_version 479080 (0.0039) [2024-06-23 17:12:12,645][15401] Updated weights for policy 0, policy_version 479090 (0.0038) [2024-06-23 17:12:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7849426944. Throughput: 0: 42515.7. Samples: 7849562320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 17:12:16,301][15401] Updated weights for policy 0, policy_version 479100 (0.0039) [2024-06-23 17:12:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7849656320. Throughput: 0: 42505.9. Samples: 7849814960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 17:12:20,376][15401] Updated weights for policy 0, policy_version 479110 (0.0023) [2024-06-23 17:12:23,109][15349] Signal inference workers to stop experience collection... (116350 times) [2024-06-23 17:12:23,136][15401] InferenceWorker_p0-w0: stopping experience collection (116350 times) [2024-06-23 17:12:23,170][15349] Signal inference workers to resume experience collection... (116350 times) [2024-06-23 17:12:23,171][15401] InferenceWorker_p0-w0: resuming experience collection (116350 times) [2024-06-23 17:12:23,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42871.5, 300 sec: 42598.0). Total num frames: 7849869312. Throughput: 0: 42626.2. Samples: 7849950280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:23,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 17:12:23,765][15401] Updated weights for policy 0, policy_version 479120 (0.0036) [2024-06-23 17:12:27,985][15401] Updated weights for policy 0, policy_version 479130 (0.0030) [2024-06-23 17:12:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7850082304. Throughput: 0: 42445.8. Samples: 7850202720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 17:12:31,317][15401] Updated weights for policy 0, policy_version 479140 (0.0034) [2024-06-23 17:12:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7850311680. Throughput: 0: 42501.9. Samples: 7850456520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 17:12:35,650][15401] Updated weights for policy 0, policy_version 479150 (0.0035) [2024-06-23 17:12:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7850508288. Throughput: 0: 42624.8. Samples: 7850592860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:38,399][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 17:12:39,103][15401] Updated weights for policy 0, policy_version 479160 (0.0023) [2024-06-23 17:12:43,298][15401] Updated weights for policy 0, policy_version 479170 (0.0030) [2024-06-23 17:12:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7850721280. Throughput: 0: 42753.3. Samples: 7850850480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 17:12:46,925][15401] Updated weights for policy 0, policy_version 479180 (0.0038) [2024-06-23 17:12:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 7850950656. Throughput: 0: 42718.5. Samples: 7851102600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:48,393][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 17:12:50,888][15401] Updated weights for policy 0, policy_version 479190 (0.0034) [2024-06-23 17:12:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 7851163648. Throughput: 0: 42669.2. Samples: 7851235000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:53,392][15132] Avg episode reward: [(0, '0.892')] [2024-06-23 17:12:54,644][15401] Updated weights for policy 0, policy_version 479200 (0.0034) [2024-06-23 17:12:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7851360256. Throughput: 0: 42913.3. Samples: 7851493420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:12:58,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 17:12:58,482][15401] Updated weights for policy 0, policy_version 479210 (0.0033) [2024-06-23 17:13:02,153][15401] Updated weights for policy 0, policy_version 479220 (0.0034) [2024-06-23 17:13:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7851589632. Throughput: 0: 42892.4. Samples: 7851745120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 17:13:06,102][15401] Updated weights for policy 0, policy_version 479230 (0.0051) [2024-06-23 17:13:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7851802624. Throughput: 0: 42745.5. Samples: 7851873720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 17:13:09,810][15401] Updated weights for policy 0, policy_version 479240 (0.0022) [2024-06-23 17:13:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7852015616. Throughput: 0: 42873.8. Samples: 7852132040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 17:13:13,747][15401] Updated weights for policy 0, policy_version 479250 (0.0034) [2024-06-23 17:13:17,720][15401] Updated weights for policy 0, policy_version 479260 (0.0033) [2024-06-23 17:13:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7852212224. Throughput: 0: 42911.0. Samples: 7852387520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 17:13:21,414][15401] Updated weights for policy 0, policy_version 479270 (0.0044) [2024-06-23 17:13:23,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42868.6, 300 sec: 42597.5). Total num frames: 7852441600. Throughput: 0: 42854.9. Samples: 7852521600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:23,397][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 17:13:25,151][15401] Updated weights for policy 0, policy_version 479280 (0.0031) [2024-06-23 17:13:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 7852654592. Throughput: 0: 42848.4. Samples: 7852778660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:28,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 17:13:28,885][15401] Updated weights for policy 0, policy_version 479290 (0.0032) [2024-06-23 17:13:32,897][15401] Updated weights for policy 0, policy_version 479300 (0.0036) [2024-06-23 17:13:33,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7852851200. Throughput: 0: 42880.1. Samples: 7853032100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 17:13:36,687][15401] Updated weights for policy 0, policy_version 479310 (0.0027) [2024-06-23 17:13:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7853096960. Throughput: 0: 42759.6. Samples: 7853159080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 17:13:40,426][15401] Updated weights for policy 0, policy_version 479320 (0.0036) [2024-06-23 17:13:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 7853293568. Throughput: 0: 42761.6. Samples: 7853417800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:43,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 17:13:43,562][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479328_7853309952.pth... [2024-06-23 17:13:43,612][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000478702_7843053568.pth [2024-06-23 17:13:44,320][15401] Updated weights for policy 0, policy_version 479330 (0.0048) [2024-06-23 17:13:47,342][15349] Signal inference workers to stop experience collection... (116400 times) [2024-06-23 17:13:47,342][15349] Signal inference workers to resume experience collection... (116400 times) [2024-06-23 17:13:47,357][15401] InferenceWorker_p0-w0: stopping experience collection (116400 times) [2024-06-23 17:13:47,358][15401] InferenceWorker_p0-w0: resuming experience collection (116400 times) [2024-06-23 17:13:47,922][15401] Updated weights for policy 0, policy_version 479340 (0.0029) [2024-06-23 17:13:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 7853506560. Throughput: 0: 42821.3. Samples: 7853672180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:48,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 17:13:51,954][15401] Updated weights for policy 0, policy_version 479350 (0.0040) [2024-06-23 17:13:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 7853719552. Throughput: 0: 42896.9. Samples: 7853804080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 17:13:55,667][15401] Updated weights for policy 0, policy_version 479360 (0.0036) [2024-06-23 17:13:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7853932544. Throughput: 0: 42865.9. Samples: 7854061000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:13:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 17:13:59,486][15401] Updated weights for policy 0, policy_version 479370 (0.0038) [2024-06-23 17:14:03,388][15401] Updated weights for policy 0, policy_version 479380 (0.0055) [2024-06-23 17:14:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 7854161920. Throughput: 0: 42867.2. Samples: 7854316540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:03,390][15132] Avg episode reward: [(0, '0.104')] [2024-06-23 17:14:07,185][15401] Updated weights for policy 0, policy_version 479390 (0.0031) [2024-06-23 17:14:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7854374912. Throughput: 0: 42770.9. Samples: 7854446020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:08,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-23 17:14:10,895][15401] Updated weights for policy 0, policy_version 479400 (0.0034) [2024-06-23 17:14:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7854571520. Throughput: 0: 42756.0. Samples: 7854702680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 17:14:14,927][15401] Updated weights for policy 0, policy_version 479410 (0.0031) [2024-06-23 17:14:18,321][15401] Updated weights for policy 0, policy_version 479420 (0.0025) [2024-06-23 17:14:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7854817280. Throughput: 0: 42858.2. Samples: 7854960720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 17:14:22,654][15401] Updated weights for policy 0, policy_version 479430 (0.0033) [2024-06-23 17:14:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42653.9). Total num frames: 7855013888. Throughput: 0: 42845.3. Samples: 7855087120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 17:14:26,150][15401] Updated weights for policy 0, policy_version 479440 (0.0034) [2024-06-23 17:14:28,396][15132] Fps is (10 sec: 40935.1, 60 sec: 42867.2, 300 sec: 42708.6). Total num frames: 7855226880. Throughput: 0: 42764.5. Samples: 7855342360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:28,396][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 17:14:30,143][15401] Updated weights for policy 0, policy_version 479450 (0.0042) [2024-06-23 17:14:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7855439872. Throughput: 0: 42923.2. Samples: 7855603620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 17:14:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 17:14:33,774][15401] Updated weights for policy 0, policy_version 479460 (0.0034) [2024-06-23 17:14:38,028][15401] Updated weights for policy 0, policy_version 479470 (0.0030) [2024-06-23 17:14:38,390][15132] Fps is (10 sec: 40984.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7855636480. Throughput: 0: 42764.3. Samples: 7855728480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:14:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 17:14:41,696][15401] Updated weights for policy 0, policy_version 479480 (0.0044) [2024-06-23 17:14:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7855865856. Throughput: 0: 42751.5. Samples: 7855984820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:14:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 17:14:45,528][15401] Updated weights for policy 0, policy_version 479490 (0.0028) [2024-06-23 17:14:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42598.5, 300 sec: 42653.6). Total num frames: 7856062464. Throughput: 0: 42866.2. Samples: 7856245620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:14:48,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 17:14:49,349][15401] Updated weights for policy 0, policy_version 479500 (0.0040) [2024-06-23 17:14:53,034][15401] Updated weights for policy 0, policy_version 479510 (0.0021) [2024-06-23 17:14:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7856308224. Throughput: 0: 42773.9. Samples: 7856370840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:14:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 17:14:56,960][15401] Updated weights for policy 0, policy_version 479520 (0.0041) [2024-06-23 17:14:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7856504832. Throughput: 0: 42737.0. Samples: 7856625840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:14:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 17:15:00,765][15401] Updated weights for policy 0, policy_version 479530 (0.0034) [2024-06-23 17:15:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7856701440. Throughput: 0: 42914.3. Samples: 7856891860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 17:15:04,519][15401] Updated weights for policy 0, policy_version 479540 (0.0033) [2024-06-23 17:15:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7856930816. Throughput: 0: 42948.5. Samples: 7857019800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 17:15:08,461][15401] Updated weights for policy 0, policy_version 479550 (0.0041) [2024-06-23 17:15:12,175][15401] Updated weights for policy 0, policy_version 479560 (0.0036) [2024-06-23 17:15:13,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7857160192. Throughput: 0: 43006.1. Samples: 7857277380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 17:15:15,912][15401] Updated weights for policy 0, policy_version 479570 (0.0031) [2024-06-23 17:15:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7857373184. Throughput: 0: 42927.6. Samples: 7857535360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 17:15:19,655][15401] Updated weights for policy 0, policy_version 479580 (0.0036) [2024-06-23 17:15:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7857586176. Throughput: 0: 43049.8. Samples: 7857665720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:23,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-23 17:15:23,564][15401] Updated weights for policy 0, policy_version 479590 (0.0033) [2024-06-23 17:15:27,172][15401] Updated weights for policy 0, policy_version 479600 (0.0038) [2024-06-23 17:15:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43148.8, 300 sec: 42765.0). Total num frames: 7857815552. Throughput: 0: 43096.7. Samples: 7857924180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 17:15:31,154][15401] Updated weights for policy 0, policy_version 479610 (0.0037) [2024-06-23 17:15:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7858012160. Throughput: 0: 43106.7. Samples: 7858185320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 17:15:34,509][15349] Signal inference workers to stop experience collection... (116450 times) [2024-06-23 17:15:34,552][15401] InferenceWorker_p0-w0: stopping experience collection (116450 times) [2024-06-23 17:15:34,570][15349] Signal inference workers to resume experience collection... (116450 times) [2024-06-23 17:15:34,570][15401] InferenceWorker_p0-w0: resuming experience collection (116450 times) [2024-06-23 17:15:34,713][15401] Updated weights for policy 0, policy_version 479620 (0.0034) [2024-06-23 17:15:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 7858241536. Throughput: 0: 43143.1. Samples: 7858312280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 17:15:38,674][15401] Updated weights for policy 0, policy_version 479630 (0.0040) [2024-06-23 17:15:42,483][15401] Updated weights for policy 0, policy_version 479640 (0.0041) [2024-06-23 17:15:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 7858438144. Throughput: 0: 43244.7. Samples: 7858571860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 17:15:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479641_7858438144.pth... [2024-06-23 17:15:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479015_7848181760.pth [2024-06-23 17:15:46,203][15401] Updated weights for policy 0, policy_version 479650 (0.0034) [2024-06-23 17:15:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43417.6, 300 sec: 42820.2). Total num frames: 7858667520. Throughput: 0: 42992.7. Samples: 7858826640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:48,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 17:15:50,057][15401] Updated weights for policy 0, policy_version 479660 (0.0021) [2024-06-23 17:15:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7858880512. Throughput: 0: 43093.3. Samples: 7858959000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 17:15:53,741][15401] Updated weights for policy 0, policy_version 479670 (0.0037) [2024-06-23 17:15:57,963][15401] Updated weights for policy 0, policy_version 479680 (0.0022) [2024-06-23 17:15:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7859093504. Throughput: 0: 43136.7. Samples: 7859218520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 17:15:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 17:16:01,672][15401] Updated weights for policy 0, policy_version 479690 (0.0033) [2024-06-23 17:16:03,393][15132] Fps is (10 sec: 44219.4, 60 sec: 43687.7, 300 sec: 42931.1). Total num frames: 7859322880. Throughput: 0: 42957.1. Samples: 7859468600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:03,394][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 17:16:05,551][15401] Updated weights for policy 0, policy_version 479700 (0.0031) [2024-06-23 17:16:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 7859535872. Throughput: 0: 42984.5. Samples: 7859600020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 17:16:09,081][15401] Updated weights for policy 0, policy_version 479710 (0.0026) [2024-06-23 17:16:13,150][15401] Updated weights for policy 0, policy_version 479720 (0.0030) [2024-06-23 17:16:13,389][15132] Fps is (10 sec: 42615.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7859748864. Throughput: 0: 42841.9. Samples: 7859852060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 17:16:16,929][15401] Updated weights for policy 0, policy_version 479730 (0.0029) [2024-06-23 17:16:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 7859945472. Throughput: 0: 42718.7. Samples: 7860107660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 17:16:20,878][15401] Updated weights for policy 0, policy_version 479740 (0.0034) [2024-06-23 17:16:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7860142080. Throughput: 0: 42768.5. Samples: 7860236860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 17:16:24,884][15401] Updated weights for policy 0, policy_version 479750 (0.0044) [2024-06-23 17:16:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7860371456. Throughput: 0: 42582.3. Samples: 7860488060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 17:16:28,769][15401] Updated weights for policy 0, policy_version 479760 (0.0028) [2024-06-23 17:16:32,480][15401] Updated weights for policy 0, policy_version 479770 (0.0033) [2024-06-23 17:16:33,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 7860600832. Throughput: 0: 42469.8. Samples: 7860737780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:33,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 17:16:36,362][15401] Updated weights for policy 0, policy_version 479780 (0.0033) [2024-06-23 17:16:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7860781056. Throughput: 0: 42464.8. Samples: 7860869920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 17:16:40,101][15401] Updated weights for policy 0, policy_version 479790 (0.0042) [2024-06-23 17:16:43,396][15132] Fps is (10 sec: 40943.2, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 7861010432. Throughput: 0: 42352.0. Samples: 7861124640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:43,397][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 17:16:44,453][15401] Updated weights for policy 0, policy_version 479800 (0.0029) [2024-06-23 17:16:47,983][15401] Updated weights for policy 0, policy_version 479810 (0.0040) [2024-06-23 17:16:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7861207040. Throughput: 0: 42472.6. Samples: 7861379800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:48,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 17:16:52,199][15401] Updated weights for policy 0, policy_version 479820 (0.0047) [2024-06-23 17:16:53,389][15132] Fps is (10 sec: 40986.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7861420032. Throughput: 0: 42367.6. Samples: 7861506560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 17:16:55,769][15401] Updated weights for policy 0, policy_version 479830 (0.0028) [2024-06-23 17:16:58,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 7861649408. Throughput: 0: 42412.9. Samples: 7861760640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:16:58,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 17:16:59,721][15401] Updated weights for policy 0, policy_version 479840 (0.0030) [2024-06-23 17:17:03,323][15401] Updated weights for policy 0, policy_version 479850 (0.0028) [2024-06-23 17:17:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42328.2, 300 sec: 42820.6). Total num frames: 7861862400. Throughput: 0: 42750.7. Samples: 7862031440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 17:17:07,227][15401] Updated weights for policy 0, policy_version 479860 (0.0037) [2024-06-23 17:17:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 7862075392. Throughput: 0: 42616.5. Samples: 7862154600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 17:17:11,197][15401] Updated weights for policy 0, policy_version 479870 (0.0043) [2024-06-23 17:17:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7862304768. Throughput: 0: 42624.8. Samples: 7862406180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 17:17:14,552][15349] Signal inference workers to stop experience collection... (116500 times) [2024-06-23 17:17:14,552][15349] Signal inference workers to resume experience collection... (116500 times) [2024-06-23 17:17:14,599][15401] InferenceWorker_p0-w0: stopping experience collection (116500 times) [2024-06-23 17:17:14,599][15401] InferenceWorker_p0-w0: resuming experience collection (116500 times) [2024-06-23 17:17:14,686][15401] Updated weights for policy 0, policy_version 479880 (0.0029) [2024-06-23 17:17:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7862484992. Throughput: 0: 42853.9. Samples: 7862666100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:18,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 17:17:18,667][15401] Updated weights for policy 0, policy_version 479890 (0.0033) [2024-06-23 17:17:22,466][15401] Updated weights for policy 0, policy_version 479900 (0.0037) [2024-06-23 17:17:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 7862714368. Throughput: 0: 42673.4. Samples: 7862790220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 17:17:26,166][15401] Updated weights for policy 0, policy_version 479910 (0.0039) [2024-06-23 17:17:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7862943744. Throughput: 0: 42845.3. Samples: 7863052400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-23 17:17:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 17:17:30,147][15401] Updated weights for policy 0, policy_version 479920 (0.0045) [2024-06-23 17:17:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 7863123968. Throughput: 0: 42829.8. Samples: 7863307040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 17:17:33,971][15401] Updated weights for policy 0, policy_version 479930 (0.0031) [2024-06-23 17:17:37,633][15401] Updated weights for policy 0, policy_version 479940 (0.0040) [2024-06-23 17:17:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7863353344. Throughput: 0: 42692.8. Samples: 7863427740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 17:17:41,758][15401] Updated weights for policy 0, policy_version 479950 (0.0031) [2024-06-23 17:17:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42602.9, 300 sec: 42765.3). Total num frames: 7863566336. Throughput: 0: 42816.8. Samples: 7863687400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 17:17:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479955_7863582720.pth... [2024-06-23 17:17:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479328_7853309952.pth [2024-06-23 17:17:45,272][15401] Updated weights for policy 0, policy_version 479960 (0.0034) [2024-06-23 17:17:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 7863762944. Throughput: 0: 42507.5. Samples: 7863944280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 17:17:49,754][15401] Updated weights for policy 0, policy_version 479970 (0.0040) [2024-06-23 17:17:52,796][15401] Updated weights for policy 0, policy_version 479980 (0.0037) [2024-06-23 17:17:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7864008704. Throughput: 0: 42517.7. Samples: 7864067900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 17:17:57,369][15401] Updated weights for policy 0, policy_version 479990 (0.0029) [2024-06-23 17:17:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7864205312. Throughput: 0: 42816.1. Samples: 7864332900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:17:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 17:18:00,455][15401] Updated weights for policy 0, policy_version 480000 (0.0033) [2024-06-23 17:18:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7864418304. Throughput: 0: 42774.2. Samples: 7864590940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 17:18:04,902][15401] Updated weights for policy 0, policy_version 480010 (0.0029) [2024-06-23 17:18:08,182][15401] Updated weights for policy 0, policy_version 480020 (0.0034) [2024-06-23 17:18:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7864664064. Throughput: 0: 42786.7. Samples: 7864715620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 17:18:12,495][15401] Updated weights for policy 0, policy_version 480030 (0.0034) [2024-06-23 17:18:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7864844288. Throughput: 0: 42764.9. Samples: 7864976820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 17:18:15,711][15401] Updated weights for policy 0, policy_version 480040 (0.0047) [2024-06-23 17:18:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42765.9). Total num frames: 7865057280. Throughput: 0: 42726.1. Samples: 7865229720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 17:18:19,985][15401] Updated weights for policy 0, policy_version 480050 (0.0041) [2024-06-23 17:18:23,117][15401] Updated weights for policy 0, policy_version 480060 (0.0034) [2024-06-23 17:18:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7865303040. Throughput: 0: 42918.6. Samples: 7865359080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 17:18:27,648][15401] Updated weights for policy 0, policy_version 480070 (0.0043) [2024-06-23 17:18:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7865483264. Throughput: 0: 42781.9. Samples: 7865612580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 17:18:31,501][15401] Updated weights for policy 0, policy_version 480080 (0.0035) [2024-06-23 17:18:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7865696256. Throughput: 0: 42795.2. Samples: 7865870060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 17:18:35,491][15401] Updated weights for policy 0, policy_version 480090 (0.0026) [2024-06-23 17:18:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 7865925632. Throughput: 0: 42829.7. Samples: 7865995240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 17:18:39,243][15401] Updated weights for policy 0, policy_version 480100 (0.0026) [2024-06-23 17:18:43,307][15401] Updated weights for policy 0, policy_version 480110 (0.0040) [2024-06-23 17:18:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 7866122240. Throughput: 0: 42557.3. Samples: 7866247980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 17:18:46,795][15401] Updated weights for policy 0, policy_version 480120 (0.0042) [2024-06-23 17:18:48,027][15349] Signal inference workers to stop experience collection... (116550 times) [2024-06-23 17:18:48,027][15349] Signal inference workers to resume experience collection... (116550 times) [2024-06-23 17:18:48,077][15401] InferenceWorker_p0-w0: stopping experience collection (116550 times) [2024-06-23 17:18:48,077][15401] InferenceWorker_p0-w0: resuming experience collection (116550 times) [2024-06-23 17:18:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7866335232. Throughput: 0: 42563.0. Samples: 7866506280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:48,390][15132] Avg episode reward: [(0, '0.145')] [2024-06-23 17:18:51,085][15401] Updated weights for policy 0, policy_version 480130 (0.0047) [2024-06-23 17:18:53,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 7866564608. Throughput: 0: 42676.9. Samples: 7866636180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-23 17:18:53,392][15132] Avg episode reward: [(0, '0.145')] [2024-06-23 17:18:54,459][15401] Updated weights for policy 0, policy_version 480140 (0.0033) [2024-06-23 17:18:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7866744832. Throughput: 0: 42332.5. Samples: 7866881780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:18:58,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 17:18:58,847][15401] Updated weights for policy 0, policy_version 480150 (0.0040) [2024-06-23 17:19:02,058][15401] Updated weights for policy 0, policy_version 480160 (0.0032) [2024-06-23 17:19:03,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 7866957824. Throughput: 0: 42455.7. Samples: 7867140220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 17:19:06,440][15401] Updated weights for policy 0, policy_version 480170 (0.0046) [2024-06-23 17:19:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 7867203584. Throughput: 0: 42325.4. Samples: 7867263720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 17:19:09,732][15401] Updated weights for policy 0, policy_version 480180 (0.0034) [2024-06-23 17:19:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7867400192. Throughput: 0: 42398.2. Samples: 7867520500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:13,404][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 17:19:14,090][15401] Updated weights for policy 0, policy_version 480190 (0.0027) [2024-06-23 17:19:17,596][15401] Updated weights for policy 0, policy_version 480200 (0.0038) [2024-06-23 17:19:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7867596800. Throughput: 0: 42300.8. Samples: 7867773600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:18,391][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 17:19:21,718][15401] Updated weights for policy 0, policy_version 480210 (0.0039) [2024-06-23 17:19:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.9). Total num frames: 7867842560. Throughput: 0: 42277.9. Samples: 7867897740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:23,398][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 17:19:25,454][15401] Updated weights for policy 0, policy_version 480220 (0.0040) [2024-06-23 17:19:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7868039168. Throughput: 0: 42450.4. Samples: 7868158240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 17:19:29,360][15401] Updated weights for policy 0, policy_version 480230 (0.0038) [2024-06-23 17:19:33,385][15401] Updated weights for policy 0, policy_version 480240 (0.0032) [2024-06-23 17:19:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7868252160. Throughput: 0: 42434.3. Samples: 7868415820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 17:19:36,815][15401] Updated weights for policy 0, policy_version 480250 (0.0033) [2024-06-23 17:19:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7868465152. Throughput: 0: 42256.9. Samples: 7868537640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 17:19:41,003][15401] Updated weights for policy 0, policy_version 480260 (0.0026) [2024-06-23 17:19:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 7868678144. Throughput: 0: 42710.7. Samples: 7868803760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 17:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480267_7868694528.pth... [2024-06-23 17:19:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479641_7858438144.pth [2024-06-23 17:19:44,279][15401] Updated weights for policy 0, policy_version 480270 (0.0031) [2024-06-23 17:19:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7868891136. Throughput: 0: 42764.4. Samples: 7869064620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:48,399][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 17:19:48,540][15401] Updated weights for policy 0, policy_version 480280 (0.0028) [2024-06-23 17:19:51,878][15401] Updated weights for policy 0, policy_version 480290 (0.0034) [2024-06-23 17:19:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 7869104128. Throughput: 0: 42759.5. Samples: 7869188000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:53,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 17:19:56,656][15401] Updated weights for policy 0, policy_version 480300 (0.0038) [2024-06-23 17:19:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7869317120. Throughput: 0: 42801.7. Samples: 7869446580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:19:58,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 17:20:00,010][15401] Updated weights for policy 0, policy_version 480310 (0.0031) [2024-06-23 17:20:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7869530112. Throughput: 0: 42737.8. Samples: 7869696800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:20:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 17:20:04,162][15401] Updated weights for policy 0, policy_version 480320 (0.0030) [2024-06-23 17:20:05,495][15349] Signal inference workers to stop experience collection... (116600 times) [2024-06-23 17:20:05,495][15349] Signal inference workers to resume experience collection... (116600 times) [2024-06-23 17:20:05,514][15401] InferenceWorker_p0-w0: stopping experience collection (116600 times) [2024-06-23 17:20:05,514][15401] InferenceWorker_p0-w0: resuming experience collection (116600 times) [2024-06-23 17:20:07,381][15401] Updated weights for policy 0, policy_version 480330 (0.0034) [2024-06-23 17:20:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7869743104. Throughput: 0: 42924.9. Samples: 7869829360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:20:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 17:20:11,612][15401] Updated weights for policy 0, policy_version 480340 (0.0026) [2024-06-23 17:20:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7869972480. Throughput: 0: 42872.4. Samples: 7870087500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:20:13,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 17:20:15,710][15401] Updated weights for policy 0, policy_version 480350 (0.0051) [2024-06-23 17:20:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7870185472. Throughput: 0: 42658.1. Samples: 7870335440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:20:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 17:20:19,179][15401] Updated weights for policy 0, policy_version 480360 (0.0035) [2024-06-23 17:20:23,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42050.5, 300 sec: 42542.5). Total num frames: 7870365696. Throughput: 0: 42815.4. Samples: 7870464440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:23,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 17:20:23,671][15401] Updated weights for policy 0, policy_version 480370 (0.0033) [2024-06-23 17:20:26,775][15401] Updated weights for policy 0, policy_version 480380 (0.0028) [2024-06-23 17:20:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7870595072. Throughput: 0: 42705.3. Samples: 7870725500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 17:20:31,102][15401] Updated weights for policy 0, policy_version 480390 (0.0035) [2024-06-23 17:20:33,392][15132] Fps is (10 sec: 47514.2, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 7870840832. Throughput: 0: 42509.0. Samples: 7870977620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:33,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 17:20:34,582][15401] Updated weights for policy 0, policy_version 480400 (0.0042) [2024-06-23 17:20:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7871021056. Throughput: 0: 42721.0. Samples: 7871110340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:38,396][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 17:20:38,583][15401] Updated weights for policy 0, policy_version 480410 (0.0036) [2024-06-23 17:20:42,063][15401] Updated weights for policy 0, policy_version 480420 (0.0034) [2024-06-23 17:20:43,389][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 7871250432. Throughput: 0: 42712.1. Samples: 7871368620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 17:20:46,459][15401] Updated weights for policy 0, policy_version 480430 (0.0041) [2024-06-23 17:20:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7871479808. Throughput: 0: 42715.5. Samples: 7871619000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 17:20:49,719][15401] Updated weights for policy 0, policy_version 480440 (0.0037) [2024-06-23 17:20:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42325.4, 300 sec: 42542.5). Total num frames: 7871643648. Throughput: 0: 42709.7. Samples: 7871751400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:53,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 17:20:54,118][15401] Updated weights for policy 0, policy_version 480450 (0.0034) [2024-06-23 17:20:57,438][15401] Updated weights for policy 0, policy_version 480460 (0.0027) [2024-06-23 17:20:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42599.0). Total num frames: 7871889408. Throughput: 0: 42553.6. Samples: 7872002420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:20:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 17:21:01,705][15401] Updated weights for policy 0, policy_version 480470 (0.0033) [2024-06-23 17:21:03,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7872102400. Throughput: 0: 42780.5. Samples: 7872260560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 17:21:04,864][15401] Updated weights for policy 0, policy_version 480480 (0.0049) [2024-06-23 17:21:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7872299008. Throughput: 0: 42794.4. Samples: 7872390080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 17:21:09,294][15401] Updated weights for policy 0, policy_version 480490 (0.0034) [2024-06-23 17:21:12,589][15401] Updated weights for policy 0, policy_version 480500 (0.0036) [2024-06-23 17:21:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7872528384. Throughput: 0: 42665.2. Samples: 7872645440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 17:21:16,864][15401] Updated weights for policy 0, policy_version 480510 (0.0027) [2024-06-23 17:21:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7872741376. Throughput: 0: 42752.9. Samples: 7872901400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 17:21:20,791][15401] Updated weights for policy 0, policy_version 480520 (0.0032) [2024-06-23 17:21:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 7872937984. Throughput: 0: 42697.8. Samples: 7873031740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 17:21:24,251][15401] Updated weights for policy 0, policy_version 480530 (0.0034) [2024-06-23 17:21:28,268][15401] Updated weights for policy 0, policy_version 480540 (0.0045) [2024-06-23 17:21:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 7873167360. Throughput: 0: 42686.1. Samples: 7873289500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 17:21:31,663][15401] Updated weights for policy 0, policy_version 480550 (0.0024) [2024-06-23 17:21:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 7873396736. Throughput: 0: 42865.0. Samples: 7873547920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 17:21:36,132][15401] Updated weights for policy 0, policy_version 480560 (0.0041) [2024-06-23 17:21:36,411][15349] Signal inference workers to stop experience collection... (116650 times) [2024-06-23 17:21:36,418][15349] Signal inference workers to resume experience collection... (116650 times) [2024-06-23 17:21:36,430][15401] InferenceWorker_p0-w0: stopping experience collection (116650 times) [2024-06-23 17:21:36,431][15401] InferenceWorker_p0-w0: resuming experience collection (116650 times) [2024-06-23 17:21:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 7873593344. Throughput: 0: 42962.3. Samples: 7873684600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 17:21:39,119][15401] Updated weights for policy 0, policy_version 480570 (0.0044) [2024-06-23 17:21:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 7873806336. Throughput: 0: 43004.1. Samples: 7873937600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-23 17:21:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 17:21:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480579_7873806336.pth... [2024-06-23 17:21:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000479955_7863582720.pth [2024-06-23 17:21:43,713][15401] Updated weights for policy 0, policy_version 480580 (0.0043) [2024-06-23 17:21:47,062][15401] Updated weights for policy 0, policy_version 480590 (0.0029) [2024-06-23 17:21:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 7874019328. Throughput: 0: 43040.6. Samples: 7874197380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:21:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 17:21:51,284][15401] Updated weights for policy 0, policy_version 480600 (0.0046) [2024-06-23 17:21:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 7874232320. Throughput: 0: 42979.6. Samples: 7874324160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:21:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 17:21:54,761][15401] Updated weights for policy 0, policy_version 480610 (0.0043) [2024-06-23 17:21:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7874445312. Throughput: 0: 42912.4. Samples: 7874576500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:21:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 17:21:59,006][15401] Updated weights for policy 0, policy_version 480620 (0.0033) [2024-06-23 17:22:02,397][15401] Updated weights for policy 0, policy_version 480630 (0.0038) [2024-06-23 17:22:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7874674688. Throughput: 0: 42957.7. Samples: 7874834500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 17:22:06,727][15401] Updated weights for policy 0, policy_version 480640 (0.0033) [2024-06-23 17:22:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7874887680. Throughput: 0: 42939.5. Samples: 7874964020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 17:22:10,145][15401] Updated weights for policy 0, policy_version 480650 (0.0036) [2024-06-23 17:22:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7875084288. Throughput: 0: 42806.8. Samples: 7875215800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 17:22:14,342][15401] Updated weights for policy 0, policy_version 480660 (0.0021) [2024-06-23 17:22:17,789][15401] Updated weights for policy 0, policy_version 480670 (0.0029) [2024-06-23 17:22:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7875313664. Throughput: 0: 42799.5. Samples: 7875474000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:18,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 17:22:21,958][15401] Updated weights for policy 0, policy_version 480680 (0.0032) [2024-06-23 17:22:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7875510272. Throughput: 0: 42696.0. Samples: 7875605920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 17:22:25,437][15401] Updated weights for policy 0, policy_version 480690 (0.0034) [2024-06-23 17:22:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7875739648. Throughput: 0: 42654.3. Samples: 7875857040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 17:22:29,951][15401] Updated weights for policy 0, policy_version 480700 (0.0042) [2024-06-23 17:22:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 7875936256. Throughput: 0: 42571.0. Samples: 7876113180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:33,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 17:22:33,538][15401] Updated weights for policy 0, policy_version 480710 (0.0039) [2024-06-23 17:22:37,719][15401] Updated weights for policy 0, policy_version 480720 (0.0037) [2024-06-23 17:22:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7876165632. Throughput: 0: 42581.3. Samples: 7876240320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:38,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 17:22:41,239][15401] Updated weights for policy 0, policy_version 480730 (0.0037) [2024-06-23 17:22:43,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7876378624. Throughput: 0: 42678.6. Samples: 7876497040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 17:22:45,206][15401] Updated weights for policy 0, policy_version 480740 (0.0037) [2024-06-23 17:22:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7876575232. Throughput: 0: 42765.8. Samples: 7876758960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 17:22:48,606][15401] Updated weights for policy 0, policy_version 480750 (0.0031) [2024-06-23 17:22:50,030][15349] Signal inference workers to stop experience collection... (116700 times) [2024-06-23 17:22:50,061][15401] InferenceWorker_p0-w0: stopping experience collection (116700 times) [2024-06-23 17:22:50,087][15349] Signal inference workers to resume experience collection... (116700 times) [2024-06-23 17:22:50,092][15401] InferenceWorker_p0-w0: resuming experience collection (116700 times) [2024-06-23 17:22:52,745][15401] Updated weights for policy 0, policy_version 480760 (0.0038) [2024-06-23 17:22:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7876804608. Throughput: 0: 42699.5. Samples: 7876885500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 17:22:56,135][15401] Updated weights for policy 0, policy_version 480770 (0.0046) [2024-06-23 17:22:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 7877033984. Throughput: 0: 42715.7. Samples: 7877138000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:22:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 17:23:00,807][15401] Updated weights for policy 0, policy_version 480780 (0.0044) [2024-06-23 17:23:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7877230592. Throughput: 0: 42677.8. Samples: 7877394400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:23:03,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 17:23:03,911][15401] Updated weights for policy 0, policy_version 480790 (0.0035) [2024-06-23 17:23:08,360][15401] Updated weights for policy 0, policy_version 480800 (0.0040) [2024-06-23 17:23:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7877427200. Throughput: 0: 42607.5. Samples: 7877523260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 17:23:08,391][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 17:23:11,322][15401] Updated weights for policy 0, policy_version 480810 (0.0035) [2024-06-23 17:23:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7877672960. Throughput: 0: 42799.9. Samples: 7877783040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 17:23:16,089][15401] Updated weights for policy 0, policy_version 480820 (0.0036) [2024-06-23 17:23:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 7877885952. Throughput: 0: 42740.0. Samples: 7878036380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 17:23:19,142][15401] Updated weights for policy 0, policy_version 480830 (0.0042) [2024-06-23 17:23:23,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7878049792. Throughput: 0: 42763.4. Samples: 7878164680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 17:23:23,650][15401] Updated weights for policy 0, policy_version 480840 (0.0036) [2024-06-23 17:23:26,927][15401] Updated weights for policy 0, policy_version 480850 (0.0044) [2024-06-23 17:23:28,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 7878328320. Throughput: 0: 42732.1. Samples: 7878420080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:28,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 17:23:31,445][15401] Updated weights for policy 0, policy_version 480860 (0.0041) [2024-06-23 17:23:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 7878508544. Throughput: 0: 42511.1. Samples: 7878671960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 17:23:34,669][15401] Updated weights for policy 0, policy_version 480870 (0.0036) [2024-06-23 17:23:38,390][15132] Fps is (10 sec: 36052.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7878688768. Throughput: 0: 42470.2. Samples: 7878796660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 17:23:39,065][15401] Updated weights for policy 0, policy_version 480880 (0.0028) [2024-06-23 17:23:42,308][15401] Updated weights for policy 0, policy_version 480890 (0.0029) [2024-06-23 17:23:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7878950912. Throughput: 0: 42558.9. Samples: 7879053160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 17:23:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480893_7878950912.pth... [2024-06-23 17:23:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480267_7868694528.pth [2024-06-23 17:23:46,930][15401] Updated weights for policy 0, policy_version 480900 (0.0034) [2024-06-23 17:23:48,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7879147520. Throughput: 0: 42437.9. Samples: 7879304100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:48,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 17:23:49,948][15401] Updated weights for policy 0, policy_version 480910 (0.0044) [2024-06-23 17:23:53,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 7879327744. Throughput: 0: 42348.9. Samples: 7879428960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 17:23:54,509][15401] Updated weights for policy 0, policy_version 480920 (0.0034) [2024-06-23 17:23:57,465][15401] Updated weights for policy 0, policy_version 480930 (0.0041) [2024-06-23 17:23:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7879573504. Throughput: 0: 42345.8. Samples: 7879688600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:23:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 17:24:02,117][15401] Updated weights for policy 0, policy_version 480940 (0.0041) [2024-06-23 17:24:03,349][15349] Signal inference workers to stop experience collection... (116750 times) [2024-06-23 17:24:03,349][15349] Signal inference workers to resume experience collection... (116750 times) [2024-06-23 17:24:03,367][15401] InferenceWorker_p0-w0: stopping experience collection (116750 times) [2024-06-23 17:24:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7879786496. Throughput: 0: 42418.4. Samples: 7879945200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 17:24:03,398][15401] InferenceWorker_p0-w0: resuming experience collection (116750 times) [2024-06-23 17:24:05,525][15401] Updated weights for policy 0, policy_version 480950 (0.0047) [2024-06-23 17:24:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7879983104. Throughput: 0: 42267.2. Samples: 7880066700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 17:24:09,717][15401] Updated weights for policy 0, policy_version 480960 (0.0042) [2024-06-23 17:24:13,165][15401] Updated weights for policy 0, policy_version 480970 (0.0040) [2024-06-23 17:24:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 7880212480. Throughput: 0: 42311.2. Samples: 7880323980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 17:24:17,606][15401] Updated weights for policy 0, policy_version 480980 (0.0034) [2024-06-23 17:24:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7880409088. Throughput: 0: 42456.0. Samples: 7880582480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 17:24:20,939][15401] Updated weights for policy 0, policy_version 480990 (0.0035) [2024-06-23 17:24:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7880622080. Throughput: 0: 42384.9. Samples: 7880703980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 17:24:25,109][15401] Updated weights for policy 0, policy_version 481000 (0.0030) [2024-06-23 17:24:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 7880851456. Throughput: 0: 42400.6. Samples: 7880961180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 17:24:28,532][15401] Updated weights for policy 0, policy_version 481010 (0.0043) [2024-06-23 17:24:32,526][15401] Updated weights for policy 0, policy_version 481020 (0.0046) [2024-06-23 17:24:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7881048064. Throughput: 0: 42625.7. Samples: 7881222260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-23 17:24:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 17:24:36,178][15401] Updated weights for policy 0, policy_version 481030 (0.0034) [2024-06-23 17:24:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 7881277440. Throughput: 0: 42614.4. Samples: 7881346600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:24:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 17:24:40,550][15401] Updated weights for policy 0, policy_version 481040 (0.0044) [2024-06-23 17:24:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7881490432. Throughput: 0: 42525.3. Samples: 7881602240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:24:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 17:24:43,744][15401] Updated weights for policy 0, policy_version 481050 (0.0036) [2024-06-23 17:24:48,196][15401] Updated weights for policy 0, policy_version 481060 (0.0026) [2024-06-23 17:24:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 7881687040. Throughput: 0: 42590.6. Samples: 7881861880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:24:48,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 17:24:51,317][15401] Updated weights for policy 0, policy_version 481070 (0.0029) [2024-06-23 17:24:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7881900032. Throughput: 0: 42577.8. Samples: 7881982700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:24:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 17:24:56,090][15401] Updated weights for policy 0, policy_version 481080 (0.0033) [2024-06-23 17:24:58,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7882113024. Throughput: 0: 42476.5. Samples: 7882235420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:24:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 17:24:59,675][15401] Updated weights for policy 0, policy_version 481090 (0.0030) [2024-06-23 17:25:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 7882326016. Throughput: 0: 42546.1. Samples: 7882497160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:03,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 17:25:03,556][15401] Updated weights for policy 0, policy_version 481100 (0.0040) [2024-06-23 17:25:07,272][15401] Updated weights for policy 0, policy_version 481110 (0.0032) [2024-06-23 17:25:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7882555392. Throughput: 0: 42590.3. Samples: 7882620540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 17:25:11,034][15401] Updated weights for policy 0, policy_version 481120 (0.0042) [2024-06-23 17:25:13,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7882768384. Throughput: 0: 42718.2. Samples: 7882883500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 17:25:15,032][15401] Updated weights for policy 0, policy_version 481130 (0.0038) [2024-06-23 17:25:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 7882964992. Throughput: 0: 42510.6. Samples: 7883135240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:18,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 17:25:18,781][15401] Updated weights for policy 0, policy_version 481140 (0.0028) [2024-06-23 17:25:22,595][15401] Updated weights for policy 0, policy_version 481150 (0.0034) [2024-06-23 17:25:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7883210752. Throughput: 0: 42600.3. Samples: 7883263620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 17:25:26,493][15401] Updated weights for policy 0, policy_version 481160 (0.0039) [2024-06-23 17:25:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 7883390976. Throughput: 0: 42616.1. Samples: 7883519960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 17:25:30,128][15401] Updated weights for policy 0, policy_version 481170 (0.0038) [2024-06-23 17:25:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7883603968. Throughput: 0: 42572.9. Samples: 7883777560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 17:25:34,140][15401] Updated weights for policy 0, policy_version 481180 (0.0028) [2024-06-23 17:25:36,034][15349] Signal inference workers to stop experience collection... (116800 times) [2024-06-23 17:25:36,063][15401] InferenceWorker_p0-w0: stopping experience collection (116800 times) [2024-06-23 17:25:36,101][15349] Signal inference workers to resume experience collection... (116800 times) [2024-06-23 17:25:36,102][15401] InferenceWorker_p0-w0: resuming experience collection (116800 times) [2024-06-23 17:25:37,895][15401] Updated weights for policy 0, policy_version 481190 (0.0039) [2024-06-23 17:25:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7883866112. Throughput: 0: 42852.1. Samples: 7883911040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 17:25:41,617][15401] Updated weights for policy 0, policy_version 481200 (0.0032) [2024-06-23 17:25:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 7884013568. Throughput: 0: 42760.0. Samples: 7884159620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 17:25:43,453][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481203_7884029952.pth... [2024-06-23 17:25:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480579_7873806336.pth [2024-06-23 17:25:45,550][15401] Updated weights for policy 0, policy_version 481210 (0.0027) [2024-06-23 17:25:48,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 7884242944. Throughput: 0: 42704.2. Samples: 7884418740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:48,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 17:25:49,295][15401] Updated weights for policy 0, policy_version 481220 (0.0029) [2024-06-23 17:25:53,133][15401] Updated weights for policy 0, policy_version 481230 (0.0027) [2024-06-23 17:25:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 7884472320. Throughput: 0: 42839.0. Samples: 7884548300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 17:25:56,886][15401] Updated weights for policy 0, policy_version 481240 (0.0047) [2024-06-23 17:25:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 7884652544. Throughput: 0: 42575.2. Samples: 7884799380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:25:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 17:26:00,714][15401] Updated weights for policy 0, policy_version 481250 (0.0041) [2024-06-23 17:26:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 7884881920. Throughput: 0: 42718.0. Samples: 7885057560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 17:26:04,448][15401] Updated weights for policy 0, policy_version 481260 (0.0030) [2024-06-23 17:26:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7885111296. Throughput: 0: 42803.7. Samples: 7885189780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 17:26:08,588][15401] Updated weights for policy 0, policy_version 481270 (0.0031) [2024-06-23 17:26:12,014][15401] Updated weights for policy 0, policy_version 481280 (0.0037) [2024-06-23 17:26:13,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7885307904. Throughput: 0: 42630.8. Samples: 7885438340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 17:26:16,182][15401] Updated weights for policy 0, policy_version 481290 (0.0047) [2024-06-23 17:26:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7885537280. Throughput: 0: 42702.2. Samples: 7885699260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:18,393][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 17:26:20,164][15401] Updated weights for policy 0, policy_version 481300 (0.0029) [2024-06-23 17:26:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7885733888. Throughput: 0: 42556.3. Samples: 7885826080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 17:26:23,775][15401] Updated weights for policy 0, policy_version 481310 (0.0034) [2024-06-23 17:26:27,974][15401] Updated weights for policy 0, policy_version 481320 (0.0035) [2024-06-23 17:26:28,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7885946880. Throughput: 0: 42795.0. Samples: 7886085400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 17:26:31,292][15401] Updated weights for policy 0, policy_version 481330 (0.0029) [2024-06-23 17:26:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7886192640. Throughput: 0: 42560.3. Samples: 7886333960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:33,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 17:26:35,316][15349] Signal inference workers to stop experience collection... (116850 times) [2024-06-23 17:26:35,320][15349] Signal inference workers to resume experience collection... (116850 times) [2024-06-23 17:26:35,330][15401] InferenceWorker_p0-w0: stopping experience collection (116850 times) [2024-06-23 17:26:35,361][15401] InferenceWorker_p0-w0: resuming experience collection (116850 times) [2024-06-23 17:26:35,463][15401] Updated weights for policy 0, policy_version 481340 (0.0032) [2024-06-23 17:26:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 7886372864. Throughput: 0: 42624.6. Samples: 7886466400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 17:26:38,946][15401] Updated weights for policy 0, policy_version 481350 (0.0036) [2024-06-23 17:26:43,058][15401] Updated weights for policy 0, policy_version 481360 (0.0055) [2024-06-23 17:26:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7886602240. Throughput: 0: 42891.1. Samples: 7886729480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 17:26:46,667][15401] Updated weights for policy 0, policy_version 481370 (0.0038) [2024-06-23 17:26:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7886848000. Throughput: 0: 42688.1. Samples: 7886978520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:48,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-23 17:26:50,757][15401] Updated weights for policy 0, policy_version 481380 (0.0034) [2024-06-23 17:26:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7887011840. Throughput: 0: 42655.1. Samples: 7887109260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 17:26:54,268][15401] Updated weights for policy 0, policy_version 481390 (0.0035) [2024-06-23 17:26:58,331][15401] Updated weights for policy 0, policy_version 481400 (0.0036) [2024-06-23 17:26:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43415.9, 300 sec: 42653.6). Total num frames: 7887257600. Throughput: 0: 42885.6. Samples: 7887368300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:26:58,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 17:27:02,151][15401] Updated weights for policy 0, policy_version 481410 (0.0024) [2024-06-23 17:27:03,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43690.9, 300 sec: 42765.0). Total num frames: 7887503360. Throughput: 0: 42486.8. Samples: 7887611060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:03,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 17:27:06,628][15401] Updated weights for policy 0, policy_version 481420 (0.0045) [2024-06-23 17:27:08,391][15132] Fps is (10 sec: 40965.0, 60 sec: 42597.5, 300 sec: 42653.8). Total num frames: 7887667200. Throughput: 0: 42724.2. Samples: 7887748720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:08,391][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 17:27:09,895][15401] Updated weights for policy 0, policy_version 481430 (0.0026) [2024-06-23 17:27:13,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 7887880192. Throughput: 0: 42692.8. Samples: 7888006580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 17:27:14,245][15401] Updated weights for policy 0, policy_version 481440 (0.0034) [2024-06-23 17:27:17,480][15401] Updated weights for policy 0, policy_version 481450 (0.0039) [2024-06-23 17:27:18,390][15132] Fps is (10 sec: 47519.3, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 7888142336. Throughput: 0: 42667.1. Samples: 7888253980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 17:27:21,953][15401] Updated weights for policy 0, policy_version 481460 (0.0038) [2024-06-23 17:27:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 7888306176. Throughput: 0: 42815.1. Samples: 7888393080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 17:27:25,011][15401] Updated weights for policy 0, policy_version 481470 (0.0035) [2024-06-23 17:27:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7888519168. Throughput: 0: 42573.4. Samples: 7888645280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 17:27:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 17:27:29,714][15401] Updated weights for policy 0, policy_version 481480 (0.0040) [2024-06-23 17:27:32,595][15401] Updated weights for policy 0, policy_version 481490 (0.0026) [2024-06-23 17:27:33,390][15132] Fps is (10 sec: 47510.0, 60 sec: 43144.1, 300 sec: 42764.9). Total num frames: 7888781312. Throughput: 0: 42656.4. Samples: 7888898080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:33,391][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 17:27:37,353][15401] Updated weights for policy 0, policy_version 481500 (0.0034) [2024-06-23 17:27:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 7888945152. Throughput: 0: 42840.3. Samples: 7889037180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:38,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 17:27:39,678][15349] Signal inference workers to stop experience collection... (116900 times) [2024-06-23 17:27:39,705][15401] InferenceWorker_p0-w0: stopping experience collection (116900 times) [2024-06-23 17:27:39,739][15349] Signal inference workers to resume experience collection... (116900 times) [2024-06-23 17:27:39,740][15401] InferenceWorker_p0-w0: resuming experience collection (116900 times) [2024-06-23 17:27:40,219][15401] Updated weights for policy 0, policy_version 481510 (0.0045) [2024-06-23 17:27:43,389][15132] Fps is (10 sec: 37686.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7889158144. Throughput: 0: 42562.8. Samples: 7889283520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 17:27:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481517_7889174528.pth... [2024-06-23 17:27:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000480893_7878950912.pth [2024-06-23 17:27:44,868][15401] Updated weights for policy 0, policy_version 481520 (0.0038) [2024-06-23 17:27:47,946][15401] Updated weights for policy 0, policy_version 481530 (0.0027) [2024-06-23 17:27:48,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7889403904. Throughput: 0: 42920.4. Samples: 7889542480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 17:27:53,059][15401] Updated weights for policy 0, policy_version 481540 (0.0032) [2024-06-23 17:27:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7889567744. Throughput: 0: 42733.2. Samples: 7889671660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 17:27:55,556][15401] Updated weights for policy 0, policy_version 481550 (0.0030) [2024-06-23 17:27:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 7889829888. Throughput: 0: 42501.8. Samples: 7889919160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:27:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 17:28:00,607][15401] Updated weights for policy 0, policy_version 481560 (0.0043) [2024-06-23 17:28:03,191][15401] Updated weights for policy 0, policy_version 481570 (0.0031) [2024-06-23 17:28:03,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 7890042880. Throughput: 0: 42770.2. Samples: 7890178740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:03,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 17:28:08,251][15401] Updated weights for policy 0, policy_version 481580 (0.0043) [2024-06-23 17:28:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42326.2, 300 sec: 42487.3). Total num frames: 7890206720. Throughput: 0: 42560.9. Samples: 7890308320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 17:28:11,102][15401] Updated weights for policy 0, policy_version 481590 (0.0036) [2024-06-23 17:28:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 7890468864. Throughput: 0: 42599.6. Samples: 7890562260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:13,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 17:28:15,906][15401] Updated weights for policy 0, policy_version 481600 (0.0041) [2024-06-23 17:28:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 7890649088. Throughput: 0: 42839.7. Samples: 7890825840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 17:28:18,805][15401] Updated weights for policy 0, policy_version 481610 (0.0031) [2024-06-23 17:28:23,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.2, 300 sec: 42432.1). Total num frames: 7890845696. Throughput: 0: 42354.7. Samples: 7890943040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 17:28:23,551][15401] Updated weights for policy 0, policy_version 481620 (0.0026) [2024-06-23 17:28:26,652][15401] Updated weights for policy 0, policy_version 481630 (0.0044) [2024-06-23 17:28:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7891124224. Throughput: 0: 42781.1. Samples: 7891208680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 17:28:31,115][15401] Updated weights for policy 0, policy_version 481640 (0.0032) [2024-06-23 17:28:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.6, 300 sec: 42709.5). Total num frames: 7891288064. Throughput: 0: 42804.4. Samples: 7891468680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 17:28:34,216][15401] Updated weights for policy 0, policy_version 481650 (0.0042) [2024-06-23 17:28:38,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 7891501056. Throughput: 0: 42475.1. Samples: 7891583040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 17:28:38,702][15401] Updated weights for policy 0, policy_version 481660 (0.0028) [2024-06-23 17:28:41,786][15401] Updated weights for policy 0, policy_version 481670 (0.0043) [2024-06-23 17:28:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 7891763200. Throughput: 0: 42821.7. Samples: 7891846140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 17:28:46,379][15401] Updated weights for policy 0, policy_version 481680 (0.0036) [2024-06-23 17:28:48,213][15349] Signal inference workers to stop experience collection... (116950 times) [2024-06-23 17:28:48,215][15349] Signal inference workers to resume experience collection... (116950 times) [2024-06-23 17:28:48,230][15401] InferenceWorker_p0-w0: stopping experience collection (116950 times) [2024-06-23 17:28:48,230][15401] InferenceWorker_p0-w0: resuming experience collection (116950 times) [2024-06-23 17:28:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 7891927040. Throughput: 0: 42988.0. Samples: 7892113100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:48,396][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 17:28:49,580][15401] Updated weights for policy 0, policy_version 481690 (0.0032) [2024-06-23 17:28:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7892156416. Throughput: 0: 42571.9. Samples: 7892224060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 17:28:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 17:28:53,903][15401] Updated weights for policy 0, policy_version 481700 (0.0031) [2024-06-23 17:28:57,286][15401] Updated weights for policy 0, policy_version 481710 (0.0039) [2024-06-23 17:28:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7892385792. Throughput: 0: 42816.9. Samples: 7892489020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:28:58,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-23 17:29:01,530][15401] Updated weights for policy 0, policy_version 481720 (0.0035) [2024-06-23 17:29:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 7892549632. Throughput: 0: 42896.5. Samples: 7892756180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 17:29:05,089][15401] Updated weights for policy 0, policy_version 481730 (0.0045) [2024-06-23 17:29:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 7892811776. Throughput: 0: 42715.2. Samples: 7892865220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:08,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 17:29:09,025][15401] Updated weights for policy 0, policy_version 481740 (0.0028) [2024-06-23 17:29:12,753][15401] Updated weights for policy 0, policy_version 481750 (0.0034) [2024-06-23 17:29:13,392][15132] Fps is (10 sec: 49140.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 7893041152. Throughput: 0: 42718.3. Samples: 7893131100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:13,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 17:29:16,678][15401] Updated weights for policy 0, policy_version 481760 (0.0026) [2024-06-23 17:29:18,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 7893172224. Throughput: 0: 42864.4. Samples: 7893397580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 17:29:20,265][15401] Updated weights for policy 0, policy_version 481770 (0.0032) [2024-06-23 17:29:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 7893450752. Throughput: 0: 42756.9. Samples: 7893507100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 17:29:24,245][15401] Updated weights for policy 0, policy_version 481780 (0.0037) [2024-06-23 17:29:28,041][15401] Updated weights for policy 0, policy_version 481790 (0.0034) [2024-06-23 17:29:28,390][15132] Fps is (10 sec: 49151.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 7893663744. Throughput: 0: 42844.4. Samples: 7893774140. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 17:29:31,821][15401] Updated weights for policy 0, policy_version 481800 (0.0043) [2024-06-23 17:29:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7893843968. Throughput: 0: 42668.8. Samples: 7894033200. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 17:29:35,706][15401] Updated weights for policy 0, policy_version 481810 (0.0034) [2024-06-23 17:29:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7894106112. Throughput: 0: 42794.6. Samples: 7894149820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 17:29:39,444][15401] Updated weights for policy 0, policy_version 481820 (0.0030) [2024-06-23 17:29:43,184][15401] Updated weights for policy 0, policy_version 481830 (0.0034) [2024-06-23 17:29:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 7894302720. Throughput: 0: 42843.0. Samples: 7894416960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 17:29:43,569][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481831_7894319104.pth... [2024-06-23 17:29:43,622][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481203_7884029952.pth [2024-06-23 17:29:47,235][15401] Updated weights for policy 0, policy_version 481840 (0.0032) [2024-06-23 17:29:48,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7894482944. Throughput: 0: 42509.4. Samples: 7894669100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 17:29:50,851][15401] Updated weights for policy 0, policy_version 481850 (0.0033) [2024-06-23 17:29:51,099][15349] Signal inference workers to stop experience collection... (117000 times) [2024-06-23 17:29:51,123][15401] InferenceWorker_p0-w0: stopping experience collection (117000 times) [2024-06-23 17:29:51,161][15349] Signal inference workers to resume experience collection... (117000 times) [2024-06-23 17:29:51,162][15401] InferenceWorker_p0-w0: resuming experience collection (117000 times) [2024-06-23 17:29:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7894728704. Throughput: 0: 42845.4. Samples: 7894793260. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 17:29:54,682][15401] Updated weights for policy 0, policy_version 481860 (0.0022) [2024-06-23 17:29:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 7894925312. Throughput: 0: 42765.9. Samples: 7895055460. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:29:58,390][15132] Avg episode reward: [(0, '0.896')] [2024-06-23 17:29:58,602][15401] Updated weights for policy 0, policy_version 481870 (0.0024) [2024-06-23 17:30:02,759][15401] Updated weights for policy 0, policy_version 481880 (0.0033) [2024-06-23 17:30:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7895121920. Throughput: 0: 42426.2. Samples: 7895306760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:30:03,390][15132] Avg episode reward: [(0, '0.116')] [2024-06-23 17:30:06,221][15401] Updated weights for policy 0, policy_version 481890 (0.0037) [2024-06-23 17:30:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7895351296. Throughput: 0: 42673.3. Samples: 7895427400. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:30:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 17:30:10,460][15401] Updated weights for policy 0, policy_version 481900 (0.0032) [2024-06-23 17:30:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 7895580672. Throughput: 0: 42687.1. Samples: 7895695060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:30:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 17:30:13,970][15401] Updated weights for policy 0, policy_version 481910 (0.0034) [2024-06-23 17:30:17,990][15401] Updated weights for policy 0, policy_version 481920 (0.0040) [2024-06-23 17:30:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 7895777280. Throughput: 0: 42375.6. Samples: 7895940100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-23 17:30:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 17:30:21,738][15401] Updated weights for policy 0, policy_version 481930 (0.0029) [2024-06-23 17:30:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7895973888. Throughput: 0: 42599.5. Samples: 7896066800. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 17:30:25,555][15401] Updated weights for policy 0, policy_version 481940 (0.0032) [2024-06-23 17:30:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7896219648. Throughput: 0: 42565.8. Samples: 7896332420. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 17:30:29,311][15401] Updated weights for policy 0, policy_version 481950 (0.0039) [2024-06-23 17:30:33,171][15401] Updated weights for policy 0, policy_version 481960 (0.0036) [2024-06-23 17:30:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 7896432640. Throughput: 0: 42570.9. Samples: 7896584800. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 17:30:36,951][15401] Updated weights for policy 0, policy_version 481970 (0.0033) [2024-06-23 17:30:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 7896612864. Throughput: 0: 42557.3. Samples: 7896708340. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 17:30:40,796][15401] Updated weights for policy 0, policy_version 481980 (0.0039) [2024-06-23 17:30:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7896858624. Throughput: 0: 42664.8. Samples: 7896975380. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 17:30:44,556][15401] Updated weights for policy 0, policy_version 481990 (0.0042) [2024-06-23 17:30:48,386][15401] Updated weights for policy 0, policy_version 482000 (0.0038) [2024-06-23 17:30:48,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7897088000. Throughput: 0: 42651.2. Samples: 7897226060. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 17:30:52,215][15401] Updated weights for policy 0, policy_version 482010 (0.0041) [2024-06-23 17:30:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 7897284608. Throughput: 0: 42904.9. Samples: 7897358120. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:53,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 17:30:55,969][15401] Updated weights for policy 0, policy_version 482020 (0.0029) [2024-06-23 17:30:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7897497600. Throughput: 0: 42815.3. Samples: 7897621740. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:30:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 17:30:59,607][15401] Updated weights for policy 0, policy_version 482030 (0.0035) [2024-06-23 17:31:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7897710592. Throughput: 0: 43144.6. Samples: 7897881600. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 17:31:03,649][15401] Updated weights for policy 0, policy_version 482040 (0.0032) [2024-06-23 17:31:06,108][15349] Signal inference workers to stop experience collection... (117050 times) [2024-06-23 17:31:06,109][15349] Signal inference workers to resume experience collection... (117050 times) [2024-06-23 17:31:06,134][15401] InferenceWorker_p0-w0: stopping experience collection (117050 times) [2024-06-23 17:31:06,134][15401] InferenceWorker_p0-w0: resuming experience collection (117050 times) [2024-06-23 17:31:07,134][15401] Updated weights for policy 0, policy_version 482050 (0.0030) [2024-06-23 17:31:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7897923584. Throughput: 0: 43196.1. Samples: 7898010620. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 17:31:11,158][15401] Updated weights for policy 0, policy_version 482060 (0.0034) [2024-06-23 17:31:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 7898136576. Throughput: 0: 43142.1. Samples: 7898273820. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:13,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 17:31:14,614][15401] Updated weights for policy 0, policy_version 482070 (0.0027) [2024-06-23 17:31:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7898349568. Throughput: 0: 43193.4. Samples: 7898528500. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 17:31:18,968][15401] Updated weights for policy 0, policy_version 482080 (0.0044) [2024-06-23 17:31:22,208][15401] Updated weights for policy 0, policy_version 482090 (0.0037) [2024-06-23 17:31:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 7898578944. Throughput: 0: 43170.5. Samples: 7898651020. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 17:31:26,882][15401] Updated weights for policy 0, policy_version 482100 (0.0035) [2024-06-23 17:31:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7898791936. Throughput: 0: 43087.5. Samples: 7898914320. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 17:31:30,083][15401] Updated weights for policy 0, policy_version 482110 (0.0033) [2024-06-23 17:31:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7899004928. Throughput: 0: 43285.2. Samples: 7899173900. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:33,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 17:31:34,518][15401] Updated weights for policy 0, policy_version 482120 (0.0033) [2024-06-23 17:31:37,592][15401] Updated weights for policy 0, policy_version 482130 (0.0034) [2024-06-23 17:31:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 7899217920. Throughput: 0: 43149.4. Samples: 7899299840. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:38,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 17:31:42,038][15401] Updated weights for policy 0, policy_version 482140 (0.0039) [2024-06-23 17:31:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7899447296. Throughput: 0: 43068.0. Samples: 7899559800. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-23 17:31:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 17:31:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482145_7899463680.pth... [2024-06-23 17:31:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481517_7889174528.pth [2024-06-23 17:31:45,209][15401] Updated weights for policy 0, policy_version 482150 (0.0042) [2024-06-23 17:31:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7899643904. Throughput: 0: 43024.4. Samples: 7899817700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:31:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 17:31:49,777][15401] Updated weights for policy 0, policy_version 482160 (0.0045) [2024-06-23 17:31:52,805][15401] Updated weights for policy 0, policy_version 482170 (0.0047) [2024-06-23 17:31:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 7899873280. Throughput: 0: 42824.4. Samples: 7899937720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:31:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 17:31:57,655][15401] Updated weights for policy 0, policy_version 482180 (0.0048) [2024-06-23 17:31:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 7900086272. Throughput: 0: 42858.4. Samples: 7900202440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:31:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 17:32:00,630][15401] Updated weights for policy 0, policy_version 482190 (0.0027) [2024-06-23 17:32:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 7900282880. Throughput: 0: 42893.8. Samples: 7900458720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 17:32:05,249][15401] Updated weights for policy 0, policy_version 482200 (0.0049) [2024-06-23 17:32:08,274][15401] Updated weights for policy 0, policy_version 482210 (0.0035) [2024-06-23 17:32:08,394][15132] Fps is (10 sec: 44215.2, 60 sec: 43414.1, 300 sec: 42875.4). Total num frames: 7900528640. Throughput: 0: 42843.1. Samples: 7900579160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:08,395][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 17:32:12,657][15401] Updated weights for policy 0, policy_version 482220 (0.0036) [2024-06-23 17:32:13,186][15349] Signal inference workers to stop experience collection... (117100 times) [2024-06-23 17:32:13,186][15349] Signal inference workers to resume experience collection... (117100 times) [2024-06-23 17:32:13,208][15401] InferenceWorker_p0-w0: stopping experience collection (117100 times) [2024-06-23 17:32:13,208][15401] InferenceWorker_p0-w0: resuming experience collection (117100 times) [2024-06-23 17:32:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 7900741632. Throughput: 0: 42903.2. Samples: 7900844960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 17:32:16,349][15401] Updated weights for policy 0, policy_version 482230 (0.0025) [2024-06-23 17:32:18,389][15132] Fps is (10 sec: 39340.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7900921856. Throughput: 0: 42934.8. Samples: 7901105960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 17:32:20,128][15401] Updated weights for policy 0, policy_version 482240 (0.0040) [2024-06-23 17:32:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7901151232. Throughput: 0: 42773.6. Samples: 7901224660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 17:32:23,800][15401] Updated weights for policy 0, policy_version 482250 (0.0036) [2024-06-23 17:32:27,602][15401] Updated weights for policy 0, policy_version 482260 (0.0031) [2024-06-23 17:32:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7901364224. Throughput: 0: 42786.2. Samples: 7901485180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 17:32:31,624][15401] Updated weights for policy 0, policy_version 482270 (0.0037) [2024-06-23 17:32:33,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 7901544448. Throughput: 0: 43014.6. Samples: 7901753360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:33,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 17:32:35,521][15401] Updated weights for policy 0, policy_version 482280 (0.0030) [2024-06-23 17:32:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7901822976. Throughput: 0: 43070.2. Samples: 7901875880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 17:32:38,902][15401] Updated weights for policy 0, policy_version 482290 (0.0042) [2024-06-23 17:32:43,135][15401] Updated weights for policy 0, policy_version 482300 (0.0026) [2024-06-23 17:32:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7902003200. Throughput: 0: 42923.5. Samples: 7902134000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 17:32:46,361][15401] Updated weights for policy 0, policy_version 482310 (0.0048) [2024-06-23 17:32:48,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 7902183424. Throughput: 0: 43149.7. Samples: 7902400460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 17:32:50,768][15401] Updated weights for policy 0, policy_version 482320 (0.0031) [2024-06-23 17:32:53,391][15132] Fps is (10 sec: 45868.4, 60 sec: 43143.6, 300 sec: 42820.4). Total num frames: 7902461952. Throughput: 0: 43157.9. Samples: 7902521120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:53,391][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 17:32:54,478][15401] Updated weights for policy 0, policy_version 482330 (0.0049) [2024-06-23 17:32:58,252][15401] Updated weights for policy 0, policy_version 482340 (0.0036) [2024-06-23 17:32:58,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42869.6, 300 sec: 42765.0). Total num frames: 7902658560. Throughput: 0: 42994.0. Samples: 7902779800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:32:58,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 17:33:02,426][15401] Updated weights for policy 0, policy_version 482350 (0.0035) [2024-06-23 17:33:03,389][15132] Fps is (10 sec: 37688.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7902838784. Throughput: 0: 43100.9. Samples: 7903045500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:33:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 17:33:06,082][15401] Updated weights for policy 0, policy_version 482360 (0.0028) [2024-06-23 17:33:08,389][15132] Fps is (10 sec: 44248.2, 60 sec: 42874.9, 300 sec: 42820.6). Total num frames: 7903100928. Throughput: 0: 43058.9. Samples: 7903162300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 17:33:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 17:33:10,199][15401] Updated weights for policy 0, policy_version 482370 (0.0050) [2024-06-23 17:33:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 7903281152. Throughput: 0: 43023.6. Samples: 7903421240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 17:33:13,392][15349] Signal inference workers to stop experience collection... (117150 times) [2024-06-23 17:33:13,395][15349] Signal inference workers to resume experience collection... (117150 times) [2024-06-23 17:33:13,440][15401] InferenceWorker_p0-w0: stopping experience collection (117150 times) [2024-06-23 17:33:13,440][15401] InferenceWorker_p0-w0: resuming experience collection (117150 times) [2024-06-23 17:33:13,562][15401] Updated weights for policy 0, policy_version 482380 (0.0029) [2024-06-23 17:33:17,862][15401] Updated weights for policy 0, policy_version 482390 (0.0038) [2024-06-23 17:33:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7903477760. Throughput: 0: 42891.7. Samples: 7903683480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 17:33:21,013][15401] Updated weights for policy 0, policy_version 482400 (0.0042) [2024-06-23 17:33:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7903739904. Throughput: 0: 42840.8. Samples: 7903803720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-23 17:33:25,490][15401] Updated weights for policy 0, policy_version 482410 (0.0040) [2024-06-23 17:33:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7903952896. Throughput: 0: 42900.0. Samples: 7904064500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 17:33:28,581][15401] Updated weights for policy 0, policy_version 482420 (0.0037) [2024-06-23 17:33:33,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7904116736. Throughput: 0: 42639.1. Samples: 7904319220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 17:33:33,497][15401] Updated weights for policy 0, policy_version 482430 (0.0037) [2024-06-23 17:33:36,402][15401] Updated weights for policy 0, policy_version 482440 (0.0034) [2024-06-23 17:33:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7904378880. Throughput: 0: 42632.9. Samples: 7904439540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 17:33:40,954][15401] Updated weights for policy 0, policy_version 482450 (0.0028) [2024-06-23 17:33:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 7904575488. Throughput: 0: 42716.9. Samples: 7904701960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:43,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 17:33:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482458_7904591872.pth... [2024-06-23 17:33:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000481831_7894319104.pth [2024-06-23 17:33:44,104][15401] Updated weights for policy 0, policy_version 482460 (0.0029) [2024-06-23 17:33:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7904772096. Throughput: 0: 42550.7. Samples: 7904960280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 17:33:48,563][15401] Updated weights for policy 0, policy_version 482470 (0.0045) [2024-06-23 17:33:51,591][15401] Updated weights for policy 0, policy_version 482480 (0.0038) [2024-06-23 17:33:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42599.4, 300 sec: 42820.5). Total num frames: 7905017856. Throughput: 0: 42746.6. Samples: 7905085900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 17:33:56,339][15401] Updated weights for policy 0, policy_version 482490 (0.0029) [2024-06-23 17:33:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 7905230848. Throughput: 0: 42749.7. Samples: 7905344980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:33:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 17:33:59,012][15401] Updated weights for policy 0, policy_version 482500 (0.0034) [2024-06-23 17:34:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7905411072. Throughput: 0: 42736.0. Samples: 7905606600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:03,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 17:34:03,894][15401] Updated weights for policy 0, policy_version 482510 (0.0029) [2024-06-23 17:34:06,630][15401] Updated weights for policy 0, policy_version 482520 (0.0027) [2024-06-23 17:34:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 7905673216. Throughput: 0: 42862.4. Samples: 7905732520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 17:34:11,369][15401] Updated weights for policy 0, policy_version 482530 (0.0029) [2024-06-23 17:34:13,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 7905886208. Throughput: 0: 42998.2. Samples: 7905999420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 17:34:14,151][15401] Updated weights for policy 0, policy_version 482540 (0.0039) [2024-06-23 17:34:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7906066432. Throughput: 0: 42936.6. Samples: 7906251360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 17:34:18,973][15401] Updated weights for policy 0, policy_version 482550 (0.0027) [2024-06-23 17:34:21,806][15401] Updated weights for policy 0, policy_version 482560 (0.0027) [2024-06-23 17:34:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 7906312192. Throughput: 0: 42966.7. Samples: 7906373040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 17:34:26,918][15401] Updated weights for policy 0, policy_version 482570 (0.0029) [2024-06-23 17:34:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 7906508800. Throughput: 0: 43026.0. Samples: 7906638120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 17:34:29,907][15401] Updated weights for policy 0, policy_version 482580 (0.0035) [2024-06-23 17:34:33,392][15132] Fps is (10 sec: 39311.9, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 7906705408. Throughput: 0: 43011.0. Samples: 7906895880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-23 17:34:33,392][15132] Avg episode reward: [(0, '0.892')] [2024-06-23 17:34:34,572][15401] Updated weights for policy 0, policy_version 482590 (0.0042) [2024-06-23 17:34:34,812][15349] Signal inference workers to stop experience collection... (117200 times) [2024-06-23 17:34:34,860][15401] InferenceWorker_p0-w0: stopping experience collection (117200 times) [2024-06-23 17:34:34,866][15349] Signal inference workers to resume experience collection... (117200 times) [2024-06-23 17:34:34,883][15401] InferenceWorker_p0-w0: resuming experience collection (117200 times) [2024-06-23 17:34:37,497][15401] Updated weights for policy 0, policy_version 482600 (0.0048) [2024-06-23 17:34:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7906934784. Throughput: 0: 42906.2. Samples: 7907016680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:34:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 17:34:42,048][15401] Updated weights for policy 0, policy_version 482610 (0.0031) [2024-06-23 17:34:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7907147776. Throughput: 0: 42970.7. Samples: 7907278660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:34:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 17:34:44,886][15401] Updated weights for policy 0, policy_version 482620 (0.0035) [2024-06-23 17:34:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7907344384. Throughput: 0: 42889.3. Samples: 7907536620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:34:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 17:34:49,524][15401] Updated weights for policy 0, policy_version 482630 (0.0045) [2024-06-23 17:34:52,458][15401] Updated weights for policy 0, policy_version 482640 (0.0038) [2024-06-23 17:34:53,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 7907590144. Throughput: 0: 42915.0. Samples: 7907663800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:34:53,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 17:34:57,091][15401] Updated weights for policy 0, policy_version 482650 (0.0042) [2024-06-23 17:34:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7907786752. Throughput: 0: 42772.0. Samples: 7907924160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:34:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 17:35:00,442][15401] Updated weights for policy 0, policy_version 482660 (0.0027) [2024-06-23 17:35:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7907999744. Throughput: 0: 42880.8. Samples: 7908181000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:03,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 17:35:04,695][15401] Updated weights for policy 0, policy_version 482670 (0.0025) [2024-06-23 17:35:07,814][15401] Updated weights for policy 0, policy_version 482680 (0.0026) [2024-06-23 17:35:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 7908245504. Throughput: 0: 42946.1. Samples: 7908305620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 17:35:12,424][15401] Updated weights for policy 0, policy_version 482690 (0.0034) [2024-06-23 17:35:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 7908442112. Throughput: 0: 42747.8. Samples: 7908561880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:13,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 17:35:15,774][15401] Updated weights for policy 0, policy_version 482700 (0.0027) [2024-06-23 17:35:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 7908655104. Throughput: 0: 42718.2. Samples: 7908818100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:18,400][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 17:35:19,876][15401] Updated weights for policy 0, policy_version 482710 (0.0031) [2024-06-23 17:35:23,306][15401] Updated weights for policy 0, policy_version 482720 (0.0039) [2024-06-23 17:35:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 7908884480. Throughput: 0: 43021.8. Samples: 7908952660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:23,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 17:35:27,678][15401] Updated weights for policy 0, policy_version 482730 (0.0038) [2024-06-23 17:35:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 7909097472. Throughput: 0: 42982.3. Samples: 7909212860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 17:35:30,953][15401] Updated weights for policy 0, policy_version 482740 (0.0029) [2024-06-23 17:35:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 7909310464. Throughput: 0: 42821.3. Samples: 7909463580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 17:35:35,579][15401] Updated weights for policy 0, policy_version 482750 (0.0029) [2024-06-23 17:35:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7909507072. Throughput: 0: 42677.8. Samples: 7909584200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 17:35:38,846][15401] Updated weights for policy 0, policy_version 482760 (0.0028) [2024-06-23 17:35:43,063][15401] Updated weights for policy 0, policy_version 482770 (0.0028) [2024-06-23 17:35:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 7909720064. Throughput: 0: 42701.7. Samples: 7909845740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 17:35:43,551][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482772_7909736448.pth... [2024-06-23 17:35:43,632][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482145_7899463680.pth [2024-06-23 17:35:46,417][15401] Updated weights for policy 0, policy_version 482780 (0.0032) [2024-06-23 17:35:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 7909949440. Throughput: 0: 42659.5. Samples: 7910100680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 17:35:50,942][15349] Signal inference workers to stop experience collection... (117250 times) [2024-06-23 17:35:50,951][15349] Signal inference workers to resume experience collection... (117250 times) [2024-06-23 17:35:50,965][15401] InferenceWorker_p0-w0: stopping experience collection (117250 times) [2024-06-23 17:35:50,968][15401] Updated weights for policy 0, policy_version 482790 (0.0041) [2024-06-23 17:35:50,996][15401] InferenceWorker_p0-w0: resuming experience collection (117250 times) [2024-06-23 17:35:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 7910162432. Throughput: 0: 42776.1. Samples: 7910230540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 17:35:54,003][15401] Updated weights for policy 0, policy_version 482800 (0.0031) [2024-06-23 17:35:58,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7910326272. Throughput: 0: 42811.3. Samples: 7910488280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-23 17:35:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 17:35:58,646][15401] Updated weights for policy 0, policy_version 482810 (0.0030) [2024-06-23 17:36:01,812][15401] Updated weights for policy 0, policy_version 482820 (0.0035) [2024-06-23 17:36:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7910588416. Throughput: 0: 42765.7. Samples: 7910742560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 17:36:06,212][15401] Updated weights for policy 0, policy_version 482830 (0.0041) [2024-06-23 17:36:08,390][15132] Fps is (10 sec: 49151.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 7910817792. Throughput: 0: 42791.5. Samples: 7910878280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:08,395][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 17:36:09,637][15401] Updated weights for policy 0, policy_version 482840 (0.0036) [2024-06-23 17:36:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 7910981632. Throughput: 0: 42742.6. Samples: 7911136280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 17:36:13,635][15401] Updated weights for policy 0, policy_version 482850 (0.0028) [2024-06-23 17:36:17,046][15401] Updated weights for policy 0, policy_version 482860 (0.0023) [2024-06-23 17:36:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7911227392. Throughput: 0: 42773.8. Samples: 7911388400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 17:36:21,197][15401] Updated weights for policy 0, policy_version 482870 (0.0035) [2024-06-23 17:36:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 7911456768. Throughput: 0: 43144.1. Samples: 7911525680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 17:36:24,495][15401] Updated weights for policy 0, policy_version 482880 (0.0033) [2024-06-23 17:36:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 7911620608. Throughput: 0: 42996.0. Samples: 7911780560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 17:36:28,913][15401] Updated weights for policy 0, policy_version 482890 (0.0041) [2024-06-23 17:36:32,713][15401] Updated weights for policy 0, policy_version 482900 (0.0031) [2024-06-23 17:36:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7911866368. Throughput: 0: 42955.5. Samples: 7912033680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 17:36:36,594][15401] Updated weights for policy 0, policy_version 482910 (0.0042) [2024-06-23 17:36:38,390][15132] Fps is (10 sec: 49152.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7912112128. Throughput: 0: 43063.0. Samples: 7912168380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 17:36:40,305][15401] Updated weights for policy 0, policy_version 482920 (0.0036) [2024-06-23 17:36:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 7912259584. Throughput: 0: 42856.4. Samples: 7912416820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 17:36:44,107][15401] Updated weights for policy 0, policy_version 482930 (0.0046) [2024-06-23 17:36:47,939][15401] Updated weights for policy 0, policy_version 482940 (0.0039) [2024-06-23 17:36:48,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 7912488960. Throughput: 0: 42814.7. Samples: 7912669320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:48,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 17:36:51,724][15401] Updated weights for policy 0, policy_version 482950 (0.0041) [2024-06-23 17:36:53,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7912751104. Throughput: 0: 42821.0. Samples: 7912805220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 17:36:55,510][15401] Updated weights for policy 0, policy_version 482960 (0.0034) [2024-06-23 17:36:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7912914944. Throughput: 0: 42616.5. Samples: 7913054020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:36:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 17:36:59,570][15401] Updated weights for policy 0, policy_version 482970 (0.0044) [2024-06-23 17:37:00,617][15349] Signal inference workers to stop experience collection... (117300 times) [2024-06-23 17:37:00,625][15349] Signal inference workers to resume experience collection... (117300 times) [2024-06-23 17:37:00,669][15401] InferenceWorker_p0-w0: stopping experience collection (117300 times) [2024-06-23 17:37:00,669][15401] InferenceWorker_p0-w0: resuming experience collection (117300 times) [2024-06-23 17:37:03,019][15401] Updated weights for policy 0, policy_version 482980 (0.0043) [2024-06-23 17:37:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.7). Total num frames: 7913144320. Throughput: 0: 42651.9. Samples: 7913307740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 17:37:07,415][15401] Updated weights for policy 0, policy_version 482990 (0.0033) [2024-06-23 17:37:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7913390080. Throughput: 0: 42531.0. Samples: 7913439580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 17:37:10,739][15401] Updated weights for policy 0, policy_version 483000 (0.0033) [2024-06-23 17:37:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 7913570304. Throughput: 0: 42588.4. Samples: 7913697040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:13,394][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 17:37:15,005][15401] Updated weights for policy 0, policy_version 483010 (0.0029) [2024-06-23 17:37:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7913783296. Throughput: 0: 42613.5. Samples: 7913951280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 17:37:18,414][15401] Updated weights for policy 0, policy_version 483020 (0.0027) [2024-06-23 17:37:22,683][15401] Updated weights for policy 0, policy_version 483030 (0.0027) [2024-06-23 17:37:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7914012672. Throughput: 0: 42426.3. Samples: 7914077560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 17:37:26,043][15401] Updated weights for policy 0, policy_version 483040 (0.0031) [2024-06-23 17:37:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 7914209280. Throughput: 0: 42641.7. Samples: 7914335700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 17:37:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 17:37:30,279][15401] Updated weights for policy 0, policy_version 483050 (0.0033) [2024-06-23 17:37:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7914422272. Throughput: 0: 42613.9. Samples: 7914586840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 17:37:33,821][15401] Updated weights for policy 0, policy_version 483060 (0.0037) [2024-06-23 17:37:37,859][15401] Updated weights for policy 0, policy_version 483070 (0.0038) [2024-06-23 17:37:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 7914635264. Throughput: 0: 42476.8. Samples: 7914716680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 17:37:41,810][15401] Updated weights for policy 0, policy_version 483080 (0.0034) [2024-06-23 17:37:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7914848256. Throughput: 0: 42669.3. Samples: 7914974140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 17:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483084_7914848256.pth... [2024-06-23 17:37:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482458_7904591872.pth [2024-06-23 17:37:45,466][15401] Updated weights for policy 0, policy_version 483090 (0.0035) [2024-06-23 17:37:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42873.2, 300 sec: 42709.7). Total num frames: 7915061248. Throughput: 0: 42720.6. Samples: 7915230160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 17:37:49,261][15401] Updated weights for policy 0, policy_version 483100 (0.0043) [2024-06-23 17:37:52,929][15401] Updated weights for policy 0, policy_version 483110 (0.0028) [2024-06-23 17:37:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 7915274240. Throughput: 0: 42601.0. Samples: 7915356620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 17:37:56,994][15401] Updated weights for policy 0, policy_version 483120 (0.0039) [2024-06-23 17:37:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7915503616. Throughput: 0: 42701.8. Samples: 7915618620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:37:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 17:38:00,764][15401] Updated weights for policy 0, policy_version 483130 (0.0041) [2024-06-23 17:38:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 7915732992. Throughput: 0: 42799.9. Samples: 7915877280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 17:38:04,439][15401] Updated weights for policy 0, policy_version 483140 (0.0039) [2024-06-23 17:38:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 7915913216. Throughput: 0: 42822.3. Samples: 7916004560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 17:38:08,414][15401] Updated weights for policy 0, policy_version 483150 (0.0036) [2024-06-23 17:38:11,983][15401] Updated weights for policy 0, policy_version 483160 (0.0033) [2024-06-23 17:38:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7916158976. Throughput: 0: 42813.4. Samples: 7916262300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 17:38:16,013][15401] Updated weights for policy 0, policy_version 483170 (0.0035) [2024-06-23 17:38:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7916355584. Throughput: 0: 42956.9. Samples: 7916519900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 17:38:18,878][15349] Signal inference workers to stop experience collection... (117350 times) [2024-06-23 17:38:18,929][15401] InferenceWorker_p0-w0: stopping experience collection (117350 times) [2024-06-23 17:38:18,938][15349] Signal inference workers to resume experience collection... (117350 times) [2024-06-23 17:38:18,943][15401] InferenceWorker_p0-w0: resuming experience collection (117350 times) [2024-06-23 17:38:19,890][15401] Updated weights for policy 0, policy_version 483180 (0.0043) [2024-06-23 17:38:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7916552192. Throughput: 0: 42839.2. Samples: 7916644440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 17:38:23,675][15401] Updated weights for policy 0, policy_version 483190 (0.0028) [2024-06-23 17:38:27,523][15401] Updated weights for policy 0, policy_version 483200 (0.0030) [2024-06-23 17:38:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 7916797952. Throughput: 0: 43080.6. Samples: 7916912760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 17:38:31,365][15401] Updated weights for policy 0, policy_version 483210 (0.0030) [2024-06-23 17:38:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7916994560. Throughput: 0: 42951.5. Samples: 7917162980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 17:38:35,322][15401] Updated weights for policy 0, policy_version 483220 (0.0042) [2024-06-23 17:38:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7917191168. Throughput: 0: 42904.4. Samples: 7917287320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 17:38:38,885][15401] Updated weights for policy 0, policy_version 483230 (0.0050) [2024-06-23 17:38:42,965][15401] Updated weights for policy 0, policy_version 483240 (0.0038) [2024-06-23 17:38:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 7917436928. Throughput: 0: 42903.3. Samples: 7917549260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 17:38:46,759][15401] Updated weights for policy 0, policy_version 483250 (0.0041) [2024-06-23 17:38:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7917649920. Throughput: 0: 42838.8. Samples: 7917805020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 17:38:50,532][15401] Updated weights for policy 0, policy_version 483260 (0.0034) [2024-06-23 17:38:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 7917846528. Throughput: 0: 42943.8. Samples: 7917937040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:38:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 17:38:54,292][15401] Updated weights for policy 0, policy_version 483270 (0.0028) [2024-06-23 17:38:57,953][15401] Updated weights for policy 0, policy_version 483280 (0.0032) [2024-06-23 17:38:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 7918075904. Throughput: 0: 42872.5. Samples: 7918191560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:38:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 17:39:01,847][15401] Updated weights for policy 0, policy_version 483290 (0.0027) [2024-06-23 17:39:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7918272512. Throughput: 0: 42894.2. Samples: 7918450140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 17:39:05,475][15401] Updated weights for policy 0, policy_version 483300 (0.0035) [2024-06-23 17:39:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7918485504. Throughput: 0: 42943.6. Samples: 7918576900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:08,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-23 17:39:09,488][15401] Updated weights for policy 0, policy_version 483310 (0.0034) [2024-06-23 17:39:12,905][15401] Updated weights for policy 0, policy_version 483320 (0.0035) [2024-06-23 17:39:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7918714880. Throughput: 0: 42566.2. Samples: 7918828240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:13,390][15132] Avg episode reward: [(0, '0.031')] [2024-06-23 17:39:17,176][15401] Updated weights for policy 0, policy_version 483330 (0.0030) [2024-06-23 17:39:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7918911488. Throughput: 0: 42764.9. Samples: 7919087400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:18,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-23 17:39:20,479][15401] Updated weights for policy 0, policy_version 483340 (0.0032) [2024-06-23 17:39:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7919140864. Throughput: 0: 42687.8. Samples: 7919208280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 17:39:24,899][15401] Updated weights for policy 0, policy_version 483350 (0.0033) [2024-06-23 17:39:28,266][15401] Updated weights for policy 0, policy_version 483360 (0.0029) [2024-06-23 17:39:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42932.0). Total num frames: 7919370240. Throughput: 0: 42577.2. Samples: 7919465240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 17:39:31,715][15349] Signal inference workers to stop experience collection... (117400 times) [2024-06-23 17:39:31,721][15349] Signal inference workers to resume experience collection... (117400 times) [2024-06-23 17:39:31,735][15401] InferenceWorker_p0-w0: stopping experience collection (117400 times) [2024-06-23 17:39:31,735][15401] InferenceWorker_p0-w0: resuming experience collection (117400 times) [2024-06-23 17:39:32,543][15401] Updated weights for policy 0, policy_version 483370 (0.0025) [2024-06-23 17:39:33,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 7919566848. Throughput: 0: 42684.4. Samples: 7919726100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:33,396][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 17:39:35,929][15401] Updated weights for policy 0, policy_version 483380 (0.0035) [2024-06-23 17:39:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7919747072. Throughput: 0: 42522.8. Samples: 7919850560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 17:39:40,494][15401] Updated weights for policy 0, policy_version 483390 (0.0033) [2024-06-23 17:39:43,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 7919992832. Throughput: 0: 42603.8. Samples: 7920108740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:43,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 17:39:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483399_7920009216.pth... [2024-06-23 17:39:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000482772_7909736448.pth [2024-06-23 17:39:43,774][15401] Updated weights for policy 0, policy_version 483400 (0.0042) [2024-06-23 17:39:48,026][15401] Updated weights for policy 0, policy_version 483410 (0.0030) [2024-06-23 17:39:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 7920205824. Throughput: 0: 42553.7. Samples: 7920365060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 17:39:51,310][15401] Updated weights for policy 0, policy_version 483420 (0.0028) [2024-06-23 17:39:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7920402432. Throughput: 0: 42535.5. Samples: 7920491000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 17:39:55,621][15401] Updated weights for policy 0, policy_version 483430 (0.0047) [2024-06-23 17:39:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7920648192. Throughput: 0: 42704.3. Samples: 7920749940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:39:58,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 17:39:59,003][15401] Updated weights for policy 0, policy_version 483440 (0.0027) [2024-06-23 17:40:03,281][15401] Updated weights for policy 0, policy_version 483450 (0.0041) [2024-06-23 17:40:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7920844800. Throughput: 0: 42551.1. Samples: 7921002200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:40:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 17:40:06,572][15401] Updated weights for policy 0, policy_version 483460 (0.0038) [2024-06-23 17:40:08,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 7921025024. Throughput: 0: 42549.4. Samples: 7921123000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:40:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 17:40:11,339][15401] Updated weights for policy 0, policy_version 483470 (0.0035) [2024-06-23 17:40:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7921270784. Throughput: 0: 42661.4. Samples: 7921385000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:40:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 17:40:14,268][15401] Updated weights for policy 0, policy_version 483480 (0.0034) [2024-06-23 17:40:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7921467392. Throughput: 0: 42478.5. Samples: 7921637360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 17:40:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 17:40:18,947][15401] Updated weights for policy 0, policy_version 483490 (0.0030) [2024-06-23 17:40:22,062][15401] Updated weights for policy 0, policy_version 483500 (0.0031) [2024-06-23 17:40:23,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 7921680384. Throughput: 0: 42482.7. Samples: 7921762560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:23,397][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 17:40:26,403][15401] Updated weights for policy 0, policy_version 483510 (0.0026) [2024-06-23 17:40:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7921909760. Throughput: 0: 42606.0. Samples: 7922026000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 17:40:29,914][15401] Updated weights for policy 0, policy_version 483520 (0.0020) [2024-06-23 17:40:33,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 7922106368. Throughput: 0: 42614.7. Samples: 7922282720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 17:40:34,121][15401] Updated weights for policy 0, policy_version 483530 (0.0037) [2024-06-23 17:40:37,533][15401] Updated weights for policy 0, policy_version 483540 (0.0039) [2024-06-23 17:40:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 7922335744. Throughput: 0: 42500.0. Samples: 7922403600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:38,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 17:40:41,684][15401] Updated weights for policy 0, policy_version 483550 (0.0030) [2024-06-23 17:40:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7922532352. Throughput: 0: 42491.9. Samples: 7922662080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 17:40:45,097][15401] Updated weights for policy 0, policy_version 483560 (0.0027) [2024-06-23 17:40:47,604][15349] Signal inference workers to stop experience collection... (117450 times) [2024-06-23 17:40:47,610][15349] Signal inference workers to resume experience collection... (117450 times) [2024-06-23 17:40:47,625][15401] InferenceWorker_p0-w0: stopping experience collection (117450 times) [2024-06-23 17:40:47,625][15401] InferenceWorker_p0-w0: resuming experience collection (117450 times) [2024-06-23 17:40:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7922745344. Throughput: 0: 42620.4. Samples: 7922920120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:48,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 17:40:49,530][15401] Updated weights for policy 0, policy_version 483570 (0.0028) [2024-06-23 17:40:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 7922958336. Throughput: 0: 42770.3. Samples: 7923047660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 17:40:53,438][15401] Updated weights for policy 0, policy_version 483580 (0.0040) [2024-06-23 17:40:57,499][15401] Updated weights for policy 0, policy_version 483590 (0.0029) [2024-06-23 17:40:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7923204096. Throughput: 0: 42737.8. Samples: 7923308200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:40:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 17:41:01,109][15401] Updated weights for policy 0, policy_version 483600 (0.0033) [2024-06-23 17:41:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7923417088. Throughput: 0: 42705.4. Samples: 7923559100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 17:41:04,950][15401] Updated weights for policy 0, policy_version 483610 (0.0038) [2024-06-23 17:41:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7923613696. Throughput: 0: 42876.0. Samples: 7923691700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:08,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-23 17:41:08,562][15401] Updated weights for policy 0, policy_version 483620 (0.0043) [2024-06-23 17:41:12,514][15401] Updated weights for policy 0, policy_version 483630 (0.0038) [2024-06-23 17:41:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7923826688. Throughput: 0: 42670.7. Samples: 7923946180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 17:41:16,134][15401] Updated weights for policy 0, policy_version 483640 (0.0022) [2024-06-23 17:41:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7924023296. Throughput: 0: 42709.8. Samples: 7924204660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 17:41:20,027][15401] Updated weights for policy 0, policy_version 483650 (0.0038) [2024-06-23 17:41:23,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42874.3, 300 sec: 42820.2). Total num frames: 7924252672. Throughput: 0: 42892.4. Samples: 7924333760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:23,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 17:41:23,598][15401] Updated weights for policy 0, policy_version 483660 (0.0038) [2024-06-23 17:41:27,612][15401] Updated weights for policy 0, policy_version 483670 (0.0032) [2024-06-23 17:41:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7924465664. Throughput: 0: 42898.0. Samples: 7924592480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 17:41:31,039][15401] Updated weights for policy 0, policy_version 483680 (0.0038) [2024-06-23 17:41:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7924678656. Throughput: 0: 42870.6. Samples: 7924849300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 17:41:35,098][15401] Updated weights for policy 0, policy_version 483690 (0.0029) [2024-06-23 17:41:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 7924908032. Throughput: 0: 42905.4. Samples: 7924978400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 17:41:38,732][15401] Updated weights for policy 0, policy_version 483700 (0.0026) [2024-06-23 17:41:42,615][15401] Updated weights for policy 0, policy_version 483710 (0.0027) [2024-06-23 17:41:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42765.0). Total num frames: 7925104640. Throughput: 0: 42772.3. Samples: 7925233060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 17:41:43,393][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 17:41:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483710_7925104640.pth... [2024-06-23 17:41:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483084_7914848256.pth [2024-06-23 17:41:46,722][15401] Updated weights for policy 0, policy_version 483720 (0.0043) [2024-06-23 17:41:48,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7925317632. Throughput: 0: 42862.7. Samples: 7925487920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:41:48,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-23 17:41:50,286][15401] Updated weights for policy 0, policy_version 483730 (0.0046) [2024-06-23 17:41:53,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7925530624. Throughput: 0: 42805.7. Samples: 7925617960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:41:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 17:41:54,521][15401] Updated weights for policy 0, policy_version 483740 (0.0042) [2024-06-23 17:41:58,019][15401] Updated weights for policy 0, policy_version 483750 (0.0032) [2024-06-23 17:41:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7925760000. Throughput: 0: 42801.1. Samples: 7925872240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:41:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 17:42:02,264][15401] Updated weights for policy 0, policy_version 483760 (0.0043) [2024-06-23 17:42:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7925972992. Throughput: 0: 42626.1. Samples: 7926122840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 17:42:05,039][15349] Signal inference workers to stop experience collection... (117500 times) [2024-06-23 17:42:05,071][15401] InferenceWorker_p0-w0: stopping experience collection (117500 times) [2024-06-23 17:42:05,089][15349] Signal inference workers to resume experience collection... (117500 times) [2024-06-23 17:42:05,090][15401] InferenceWorker_p0-w0: resuming experience collection (117500 times) [2024-06-23 17:42:05,556][15401] Updated weights for policy 0, policy_version 483770 (0.0034) [2024-06-23 17:42:08,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7926169600. Throughput: 0: 42690.3. Samples: 7926254720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 17:42:10,065][15401] Updated weights for policy 0, policy_version 483780 (0.0038) [2024-06-23 17:42:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7926398976. Throughput: 0: 42626.8. Samples: 7926510680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 17:42:13,428][15401] Updated weights for policy 0, policy_version 483790 (0.0045) [2024-06-23 17:42:17,827][15401] Updated weights for policy 0, policy_version 483800 (0.0049) [2024-06-23 17:42:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7926611968. Throughput: 0: 42557.9. Samples: 7926764400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:18,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 17:42:21,101][15401] Updated weights for policy 0, policy_version 483810 (0.0040) [2024-06-23 17:42:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 7926808576. Throughput: 0: 42483.4. Samples: 7926890160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 17:42:25,530][15401] Updated weights for policy 0, policy_version 483820 (0.0041) [2024-06-23 17:42:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7927054336. Throughput: 0: 42648.1. Samples: 7927152120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 17:42:28,704][15401] Updated weights for policy 0, policy_version 483830 (0.0038) [2024-06-23 17:42:33,024][15401] Updated weights for policy 0, policy_version 483840 (0.0034) [2024-06-23 17:42:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7927250944. Throughput: 0: 42714.5. Samples: 7927410080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 17:42:36,414][15401] Updated weights for policy 0, policy_version 483850 (0.0053) [2024-06-23 17:42:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 7927447552. Throughput: 0: 42640.9. Samples: 7927536800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 17:42:40,664][15401] Updated weights for policy 0, policy_version 483860 (0.0034) [2024-06-23 17:42:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 7927676928. Throughput: 0: 42734.5. Samples: 7927795280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:43,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 17:42:44,111][15401] Updated weights for policy 0, policy_version 483870 (0.0024) [2024-06-23 17:42:48,354][15401] Updated weights for policy 0, policy_version 483880 (0.0034) [2024-06-23 17:42:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7927889920. Throughput: 0: 42667.2. Samples: 7928042860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 17:42:52,264][15401] Updated weights for policy 0, policy_version 483890 (0.0053) [2024-06-23 17:42:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7928070144. Throughput: 0: 42443.5. Samples: 7928164680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:53,395][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 17:42:56,389][15401] Updated weights for policy 0, policy_version 483900 (0.0029) [2024-06-23 17:42:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 7928315904. Throughput: 0: 42428.4. Samples: 7928419960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:42:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 17:42:59,886][15401] Updated weights for policy 0, policy_version 483910 (0.0025) [2024-06-23 17:43:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7928512512. Throughput: 0: 42581.3. Samples: 7928680560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:43:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 17:43:04,145][15401] Updated weights for policy 0, policy_version 483920 (0.0036) [2024-06-23 17:43:07,583][15401] Updated weights for policy 0, policy_version 483930 (0.0030) [2024-06-23 17:43:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 7928725504. Throughput: 0: 42580.4. Samples: 7928806380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-23 17:43:08,393][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 17:43:11,890][15401] Updated weights for policy 0, policy_version 483940 (0.0036) [2024-06-23 17:43:13,041][15349] Signal inference workers to stop experience collection... (117550 times) [2024-06-23 17:43:13,042][15349] Signal inference workers to resume experience collection... (117550 times) [2024-06-23 17:43:13,056][15401] InferenceWorker_p0-w0: stopping experience collection (117550 times) [2024-06-23 17:43:13,056][15401] InferenceWorker_p0-w0: resuming experience collection (117550 times) [2024-06-23 17:43:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7928954880. Throughput: 0: 42556.4. Samples: 7929067160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 17:43:15,322][15401] Updated weights for policy 0, policy_version 483950 (0.0038) [2024-06-23 17:43:18,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7929167872. Throughput: 0: 42446.0. Samples: 7929320140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 17:43:19,650][15401] Updated weights for policy 0, policy_version 483960 (0.0029) [2024-06-23 17:43:22,787][15401] Updated weights for policy 0, policy_version 483970 (0.0036) [2024-06-23 17:43:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7929364480. Throughput: 0: 42410.8. Samples: 7929445280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 17:43:27,369][15401] Updated weights for policy 0, policy_version 483980 (0.0022) [2024-06-23 17:43:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7929610240. Throughput: 0: 42673.2. Samples: 7929715580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 17:43:30,765][15401] Updated weights for policy 0, policy_version 483990 (0.0027) [2024-06-23 17:43:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 7929806848. Throughput: 0: 42712.9. Samples: 7929965040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:33,392][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 17:43:34,949][15401] Updated weights for policy 0, policy_version 484000 (0.0033) [2024-06-23 17:43:38,261][15401] Updated weights for policy 0, policy_version 484010 (0.0031) [2024-06-23 17:43:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7930019840. Throughput: 0: 42883.1. Samples: 7930094420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 17:43:42,386][15401] Updated weights for policy 0, policy_version 484020 (0.0030) [2024-06-23 17:43:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7930216448. Throughput: 0: 43092.4. Samples: 7930359120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 17:43:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484023_7930232832.pth... [2024-06-23 17:43:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483399_7920009216.pth [2024-06-23 17:43:45,641][15401] Updated weights for policy 0, policy_version 484030 (0.0048) [2024-06-23 17:43:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7930445824. Throughput: 0: 42880.8. Samples: 7930610200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 17:43:49,887][15401] Updated weights for policy 0, policy_version 484040 (0.0037) [2024-06-23 17:43:53,292][15401] Updated weights for policy 0, policy_version 484050 (0.0033) [2024-06-23 17:43:53,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 7930675200. Throughput: 0: 42945.3. Samples: 7930738920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:53,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 17:43:57,327][15401] Updated weights for policy 0, policy_version 484060 (0.0030) [2024-06-23 17:43:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7930855424. Throughput: 0: 42854.9. Samples: 7930995620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:43:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 17:44:00,861][15401] Updated weights for policy 0, policy_version 484070 (0.0037) [2024-06-23 17:44:03,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7931084800. Throughput: 0: 42957.1. Samples: 7931253220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 17:44:05,432][15401] Updated weights for policy 0, policy_version 484080 (0.0030) [2024-06-23 17:44:08,389][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 7931314176. Throughput: 0: 43014.6. Samples: 7931380940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 17:44:08,409][15401] Updated weights for policy 0, policy_version 484090 (0.0034) [2024-06-23 17:44:12,944][15401] Updated weights for policy 0, policy_version 484100 (0.0027) [2024-06-23 17:44:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7931510784. Throughput: 0: 42799.1. Samples: 7931641540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 17:44:16,001][15401] Updated weights for policy 0, policy_version 484110 (0.0042) [2024-06-23 17:44:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7931740160. Throughput: 0: 42818.7. Samples: 7931891780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 17:44:20,364][15401] Updated weights for policy 0, policy_version 484120 (0.0046) [2024-06-23 17:44:22,863][15349] Signal inference workers to stop experience collection... (117600 times) [2024-06-23 17:44:22,868][15349] Signal inference workers to resume experience collection... (117600 times) [2024-06-23 17:44:22,908][15401] InferenceWorker_p0-w0: stopping experience collection (117600 times) [2024-06-23 17:44:22,908][15401] InferenceWorker_p0-w0: resuming experience collection (117600 times) [2024-06-23 17:44:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 7931953152. Throughput: 0: 42966.2. Samples: 7932027900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 17:44:23,646][15401] Updated weights for policy 0, policy_version 484130 (0.0034) [2024-06-23 17:44:28,149][15401] Updated weights for policy 0, policy_version 484140 (0.0039) [2024-06-23 17:44:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 7932149760. Throughput: 0: 42693.3. Samples: 7932280320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 17:44:31,886][15401] Updated weights for policy 0, policy_version 484150 (0.0037) [2024-06-23 17:44:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 7932379136. Throughput: 0: 42716.5. Samples: 7932532440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 17:44:35,729][15401] Updated weights for policy 0, policy_version 484160 (0.0034) [2024-06-23 17:44:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7932575744. Throughput: 0: 42786.3. Samples: 7932664200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-23 17:44:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 17:44:39,514][15401] Updated weights for policy 0, policy_version 484170 (0.0027) [2024-06-23 17:44:43,371][15401] Updated weights for policy 0, policy_version 484180 (0.0040) [2024-06-23 17:44:43,396][15132] Fps is (10 sec: 42571.2, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 7932805120. Throughput: 0: 42823.0. Samples: 7932922940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:44:43,396][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 17:44:47,094][15401] Updated weights for policy 0, policy_version 484190 (0.0040) [2024-06-23 17:44:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 7933034496. Throughput: 0: 42657.0. Samples: 7933172780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:44:48,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 17:44:50,977][15401] Updated weights for policy 0, policy_version 484200 (0.0030) [2024-06-23 17:44:53,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 7933214720. Throughput: 0: 42757.4. Samples: 7933305020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:44:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 17:44:54,827][15401] Updated weights for policy 0, policy_version 484210 (0.0037) [2024-06-23 17:44:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7933427712. Throughput: 0: 42704.5. Samples: 7933563240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:44:58,396][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 17:44:58,583][15401] Updated weights for policy 0, policy_version 484220 (0.0030) [2024-06-23 17:45:02,265][15401] Updated weights for policy 0, policy_version 484230 (0.0037) [2024-06-23 17:45:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 7933657088. Throughput: 0: 42726.7. Samples: 7933814480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 17:45:06,342][15401] Updated weights for policy 0, policy_version 484240 (0.0036) [2024-06-23 17:45:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7933853696. Throughput: 0: 42680.1. Samples: 7933948500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 17:45:09,791][15401] Updated weights for policy 0, policy_version 484250 (0.0036) [2024-06-23 17:45:13,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 7934083072. Throughput: 0: 42801.0. Samples: 7934206640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:13,396][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 17:45:13,804][15401] Updated weights for policy 0, policy_version 484260 (0.0026) [2024-06-23 17:45:17,235][15401] Updated weights for policy 0, policy_version 484270 (0.0024) [2024-06-23 17:45:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 7934312448. Throughput: 0: 42848.0. Samples: 7934460600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 17:45:21,422][15401] Updated weights for policy 0, policy_version 484280 (0.0028) [2024-06-23 17:45:23,390][15132] Fps is (10 sec: 40985.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 7934492672. Throughput: 0: 42918.5. Samples: 7934595540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 17:45:25,063][15401] Updated weights for policy 0, policy_version 484290 (0.0041) [2024-06-23 17:45:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7934722048. Throughput: 0: 42752.0. Samples: 7934846500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 17:45:28,926][15401] Updated weights for policy 0, policy_version 484300 (0.0033) [2024-06-23 17:45:32,786][15401] Updated weights for policy 0, policy_version 484310 (0.0026) [2024-06-23 17:45:33,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 7934967808. Throughput: 0: 42913.3. Samples: 7935103880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 17:45:36,766][15401] Updated weights for policy 0, policy_version 484320 (0.0034) [2024-06-23 17:45:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7935148032. Throughput: 0: 42933.7. Samples: 7935237040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 17:45:40,359][15401] Updated weights for policy 0, policy_version 484330 (0.0040) [2024-06-23 17:45:41,464][15349] Signal inference workers to stop experience collection... (117650 times) [2024-06-23 17:45:41,465][15349] Signal inference workers to resume experience collection... (117650 times) [2024-06-23 17:45:41,516][15401] InferenceWorker_p0-w0: stopping experience collection (117650 times) [2024-06-23 17:45:41,516][15401] InferenceWorker_p0-w0: resuming experience collection (117650 times) [2024-06-23 17:45:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 7935361024. Throughput: 0: 42922.9. Samples: 7935494780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 17:45:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484336_7935361024.pth... [2024-06-23 17:45:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000483710_7925104640.pth [2024-06-23 17:45:44,443][15401] Updated weights for policy 0, policy_version 484340 (0.0035) [2024-06-23 17:45:48,151][15401] Updated weights for policy 0, policy_version 484350 (0.0024) [2024-06-23 17:45:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 7935606784. Throughput: 0: 42839.0. Samples: 7935742340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:48,393][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 17:45:52,215][15401] Updated weights for policy 0, policy_version 484360 (0.0039) [2024-06-23 17:45:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7935787008. Throughput: 0: 42919.8. Samples: 7935879900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 17:45:55,747][15401] Updated weights for policy 0, policy_version 484370 (0.0036) [2024-06-23 17:45:58,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7936000000. Throughput: 0: 42782.0. Samples: 7936131560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:45:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 17:45:59,981][15401] Updated weights for policy 0, policy_version 484380 (0.0033) [2024-06-23 17:46:03,297][15401] Updated weights for policy 0, policy_version 484390 (0.0037) [2024-06-23 17:46:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7936245760. Throughput: 0: 42829.3. Samples: 7936387920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 17:46:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 17:46:07,539][15401] Updated weights for policy 0, policy_version 484400 (0.0028) [2024-06-23 17:46:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7936425984. Throughput: 0: 42810.4. Samples: 7936522000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 17:46:11,119][15401] Updated weights for policy 0, policy_version 484410 (0.0028) [2024-06-23 17:46:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42602.7, 300 sec: 42765.0). Total num frames: 7936638976. Throughput: 0: 42693.4. Samples: 7936767720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:13,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 17:46:15,203][15401] Updated weights for policy 0, policy_version 484420 (0.0037) [2024-06-23 17:46:18,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42820.6). Total num frames: 7936884736. Throughput: 0: 42652.8. Samples: 7937023360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:18,392][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 17:46:18,681][15401] Updated weights for policy 0, policy_version 484430 (0.0042) [2024-06-23 17:46:22,688][15401] Updated weights for policy 0, policy_version 484440 (0.0036) [2024-06-23 17:46:23,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7937064960. Throughput: 0: 42594.1. Samples: 7937153780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 17:46:27,084][15401] Updated weights for policy 0, policy_version 484450 (0.0033) [2024-06-23 17:46:28,396][15132] Fps is (10 sec: 40943.5, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 7937294336. Throughput: 0: 42643.4. Samples: 7937414000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:28,396][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 17:46:30,673][15401] Updated weights for policy 0, policy_version 484460 (0.0035) [2024-06-23 17:46:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7937523712. Throughput: 0: 42845.0. Samples: 7937670260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 17:46:34,514][15401] Updated weights for policy 0, policy_version 484470 (0.0030) [2024-06-23 17:46:38,274][15401] Updated weights for policy 0, policy_version 484480 (0.0046) [2024-06-23 17:46:38,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 7937720320. Throughput: 0: 42701.8. Samples: 7937801480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 17:46:42,109][15401] Updated weights for policy 0, policy_version 484490 (0.0033) [2024-06-23 17:46:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7937933312. Throughput: 0: 42871.0. Samples: 7938060760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:43,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-23 17:46:46,019][15401] Updated weights for policy 0, policy_version 484500 (0.0029) [2024-06-23 17:46:48,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42871.5, 300 sec: 42875.7). Total num frames: 7938179072. Throughput: 0: 42705.8. Samples: 7938309780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:48,392][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 17:46:49,802][15401] Updated weights for policy 0, policy_version 484510 (0.0031) [2024-06-23 17:46:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 7938342912. Throughput: 0: 42715.2. Samples: 7938444180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 17:46:53,575][15401] Updated weights for policy 0, policy_version 484520 (0.0033) [2024-06-23 17:46:57,385][15401] Updated weights for policy 0, policy_version 484530 (0.0041) [2024-06-23 17:46:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7938572288. Throughput: 0: 43054.6. Samples: 7938705160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:46:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 17:47:01,095][15401] Updated weights for policy 0, policy_version 484540 (0.0020) [2024-06-23 17:47:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7938801664. Throughput: 0: 42887.1. Samples: 7938953180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:03,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 17:47:04,959][15349] Signal inference workers to stop experience collection... (117700 times) [2024-06-23 17:47:04,959][15349] Signal inference workers to resume experience collection... (117700 times) [2024-06-23 17:47:04,988][15401] InferenceWorker_p0-w0: stopping experience collection (117700 times) [2024-06-23 17:47:04,989][15401] InferenceWorker_p0-w0: resuming experience collection (117700 times) [2024-06-23 17:47:05,106][15401] Updated weights for policy 0, policy_version 484550 (0.0045) [2024-06-23 17:47:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7938998272. Throughput: 0: 43062.9. Samples: 7939091600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 17:47:08,530][15401] Updated weights for policy 0, policy_version 484560 (0.0029) [2024-06-23 17:47:12,740][15401] Updated weights for policy 0, policy_version 484570 (0.0036) [2024-06-23 17:47:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 7939211264. Throughput: 0: 42954.0. Samples: 7939346660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 17:47:16,335][15401] Updated weights for policy 0, policy_version 484580 (0.0029) [2024-06-23 17:47:18,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 7939440640. Throughput: 0: 42832.3. Samples: 7939597820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:18,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 17:47:20,426][15401] Updated weights for policy 0, policy_version 484590 (0.0035) [2024-06-23 17:47:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7939653632. Throughput: 0: 42929.0. Samples: 7939733280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 17:47:23,997][15401] Updated weights for policy 0, policy_version 484600 (0.0043) [2024-06-23 17:47:28,230][15401] Updated weights for policy 0, policy_version 484610 (0.0027) [2024-06-23 17:47:28,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42601.3, 300 sec: 42709.2). Total num frames: 7939850240. Throughput: 0: 42808.5. Samples: 7939987240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 17:47:28,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 17:47:31,755][15401] Updated weights for policy 0, policy_version 484620 (0.0034) [2024-06-23 17:47:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7940096000. Throughput: 0: 42748.5. Samples: 7940233360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:33,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 17:47:35,951][15401] Updated weights for policy 0, policy_version 484630 (0.0033) [2024-06-23 17:47:38,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7940276224. Throughput: 0: 42762.1. Samples: 7940368480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 17:47:39,445][15401] Updated weights for policy 0, policy_version 484640 (0.0033) [2024-06-23 17:47:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 7940472832. Throughput: 0: 42659.6. Samples: 7940624840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 17:47:43,490][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484649_7940489216.pth... [2024-06-23 17:47:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484023_7930232832.pth [2024-06-23 17:47:43,716][15401] Updated weights for policy 0, policy_version 484650 (0.0040) [2024-06-23 17:47:46,963][15401] Updated weights for policy 0, policy_version 484660 (0.0045) [2024-06-23 17:47:48,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42322.5, 300 sec: 42875.2). Total num frames: 7940718592. Throughput: 0: 42774.4. Samples: 7940878300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:48,397][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 17:47:51,436][15401] Updated weights for policy 0, policy_version 484670 (0.0037) [2024-06-23 17:47:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7940931584. Throughput: 0: 42734.2. Samples: 7941014640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 17:47:54,699][15401] Updated weights for policy 0, policy_version 484680 (0.0028) [2024-06-23 17:47:58,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7941128192. Throughput: 0: 42625.7. Samples: 7941264820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:47:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 17:47:59,053][15401] Updated weights for policy 0, policy_version 484690 (0.0039) [2024-06-23 17:48:02,183][15401] Updated weights for policy 0, policy_version 484700 (0.0026) [2024-06-23 17:48:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 7941357568. Throughput: 0: 42835.0. Samples: 7941525300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 17:48:06,561][15401] Updated weights for policy 0, policy_version 484710 (0.0026) [2024-06-23 17:48:08,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 7941586944. Throughput: 0: 42828.0. Samples: 7941660540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:08,395][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 17:48:09,654][15401] Updated weights for policy 0, policy_version 484720 (0.0027) [2024-06-23 17:48:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7941767168. Throughput: 0: 42795.5. Samples: 7941912940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 17:48:14,309][15401] Updated weights for policy 0, policy_version 484730 (0.0044) [2024-06-23 17:48:17,158][15401] Updated weights for policy 0, policy_version 484740 (0.0042) [2024-06-23 17:48:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 7941996544. Throughput: 0: 43148.4. Samples: 7942175040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 17:48:21,734][15401] Updated weights for policy 0, policy_version 484750 (0.0040) [2024-06-23 17:48:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7942225920. Throughput: 0: 43016.9. Samples: 7942304240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 17:48:24,969][15401] Updated weights for policy 0, policy_version 484760 (0.0032) [2024-06-23 17:48:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42765.4). Total num frames: 7942422528. Throughput: 0: 42950.5. Samples: 7942557620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 17:48:29,011][15349] Signal inference workers to stop experience collection... (117750 times) [2024-06-23 17:48:29,063][15401] InferenceWorker_p0-w0: stopping experience collection (117750 times) [2024-06-23 17:48:29,063][15349] Signal inference workers to resume experience collection... (117750 times) [2024-06-23 17:48:29,086][15401] InferenceWorker_p0-w0: resuming experience collection (117750 times) [2024-06-23 17:48:29,197][15401] Updated weights for policy 0, policy_version 484770 (0.0042) [2024-06-23 17:48:32,535][15401] Updated weights for policy 0, policy_version 484780 (0.0025) [2024-06-23 17:48:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7942651904. Throughput: 0: 43060.8. Samples: 7942815760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 17:48:36,857][15401] Updated weights for policy 0, policy_version 484790 (0.0035) [2024-06-23 17:48:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 7942881280. Throughput: 0: 43013.2. Samples: 7942950240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 17:48:40,441][15401] Updated weights for policy 0, policy_version 484800 (0.0033) [2024-06-23 17:48:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7943061504. Throughput: 0: 43114.7. Samples: 7943204980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 17:48:44,503][15401] Updated weights for policy 0, policy_version 484810 (0.0032) [2024-06-23 17:48:48,072][15401] Updated weights for policy 0, policy_version 484820 (0.0027) [2024-06-23 17:48:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42765.3). Total num frames: 7943290880. Throughput: 0: 42987.1. Samples: 7943459720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 17:48:52,127][15401] Updated weights for policy 0, policy_version 484830 (0.0029) [2024-06-23 17:48:53,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 7943536640. Throughput: 0: 43034.1. Samples: 7943597080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 17:48:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 17:48:55,662][15401] Updated weights for policy 0, policy_version 484840 (0.0038) [2024-06-23 17:48:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7943716864. Throughput: 0: 43134.6. Samples: 7943854000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:48:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 17:48:59,572][15401] Updated weights for policy 0, policy_version 484850 (0.0032) [2024-06-23 17:49:03,253][15401] Updated weights for policy 0, policy_version 484860 (0.0041) [2024-06-23 17:49:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 7943946240. Throughput: 0: 43012.9. Samples: 7944110620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 17:49:07,284][15401] Updated weights for policy 0, policy_version 484870 (0.0035) [2024-06-23 17:49:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7944159232. Throughput: 0: 43052.4. Samples: 7944241600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 17:49:10,894][15401] Updated weights for policy 0, policy_version 484880 (0.0032) [2024-06-23 17:49:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 7944372224. Throughput: 0: 43023.0. Samples: 7944493760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:13,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 17:49:14,787][15401] Updated weights for policy 0, policy_version 484890 (0.0045) [2024-06-23 17:49:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7944585216. Throughput: 0: 42971.5. Samples: 7944749480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 17:49:18,471][15401] Updated weights for policy 0, policy_version 484900 (0.0034) [2024-06-23 17:49:22,546][15401] Updated weights for policy 0, policy_version 484910 (0.0039) [2024-06-23 17:49:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7944781824. Throughput: 0: 42786.3. Samples: 7944875620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 17:49:26,057][15401] Updated weights for policy 0, policy_version 484920 (0.0035) [2024-06-23 17:49:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7944994816. Throughput: 0: 42673.8. Samples: 7945125300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 17:49:30,233][15401] Updated weights for policy 0, policy_version 484930 (0.0042) [2024-06-23 17:49:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7945224192. Throughput: 0: 42771.2. Samples: 7945384420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 17:49:33,791][15401] Updated weights for policy 0, policy_version 484940 (0.0024) [2024-06-23 17:49:37,989][15401] Updated weights for policy 0, policy_version 484950 (0.0023) [2024-06-23 17:49:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42821.1). Total num frames: 7945437184. Throughput: 0: 42577.4. Samples: 7945513160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:38,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 17:49:41,806][15401] Updated weights for policy 0, policy_version 484960 (0.0027) [2024-06-23 17:49:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7945633792. Throughput: 0: 42507.7. Samples: 7945766840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 17:49:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484964_7945650176.pth... [2024-06-23 17:49:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484336_7935361024.pth [2024-06-23 17:49:45,553][15401] Updated weights for policy 0, policy_version 484970 (0.0035) [2024-06-23 17:49:48,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 7945846784. Throughput: 0: 42593.7. Samples: 7946027340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 17:49:49,383][15401] Updated weights for policy 0, policy_version 484980 (0.0043) [2024-06-23 17:49:53,181][15401] Updated weights for policy 0, policy_version 484990 (0.0044) [2024-06-23 17:49:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 7946092544. Throughput: 0: 42428.9. Samples: 7946150900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 17:49:56,892][15349] Signal inference workers to stop experience collection... (117800 times) [2024-06-23 17:49:56,893][15349] Signal inference workers to resume experience collection... (117800 times) [2024-06-23 17:49:56,908][15401] InferenceWorker_p0-w0: stopping experience collection (117800 times) [2024-06-23 17:49:56,908][15401] InferenceWorker_p0-w0: resuming experience collection (117800 times) [2024-06-23 17:49:57,041][15401] Updated weights for policy 0, policy_version 485000 (0.0028) [2024-06-23 17:49:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 7946272768. Throughput: 0: 42569.9. Samples: 7946409300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:49:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 17:50:00,738][15401] Updated weights for policy 0, policy_version 485010 (0.0045) [2024-06-23 17:50:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7946502144. Throughput: 0: 42548.6. Samples: 7946664160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:50:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 17:50:04,614][15401] Updated weights for policy 0, policy_version 485020 (0.0037) [2024-06-23 17:50:08,299][15401] Updated weights for policy 0, policy_version 485030 (0.0038) [2024-06-23 17:50:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 7946731520. Throughput: 0: 42687.5. Samples: 7946796560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:50:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 17:50:12,583][15401] Updated weights for policy 0, policy_version 485040 (0.0030) [2024-06-23 17:50:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7946911744. Throughput: 0: 42773.9. Samples: 7947050120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:50:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 17:50:16,342][15401] Updated weights for policy 0, policy_version 485050 (0.0042) [2024-06-23 17:50:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 7947141120. Throughput: 0: 42553.2. Samples: 7947299420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-23 17:50:18,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 17:50:20,418][15401] Updated weights for policy 0, policy_version 485060 (0.0040) [2024-06-23 17:50:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7947354112. Throughput: 0: 42792.0. Samples: 7947438700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 17:50:23,854][15401] Updated weights for policy 0, policy_version 485070 (0.0034) [2024-06-23 17:50:27,980][15401] Updated weights for policy 0, policy_version 485080 (0.0039) [2024-06-23 17:50:28,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7947550720. Throughput: 0: 42753.8. Samples: 7947690760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:28,396][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 17:50:31,719][15401] Updated weights for policy 0, policy_version 485090 (0.0026) [2024-06-23 17:50:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 7947780096. Throughput: 0: 42479.6. Samples: 7947938920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 17:50:35,503][15401] Updated weights for policy 0, policy_version 485100 (0.0040) [2024-06-23 17:50:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 7947976704. Throughput: 0: 42750.8. Samples: 7948074680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 17:50:39,373][15401] Updated weights for policy 0, policy_version 485110 (0.0048) [2024-06-23 17:50:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 7948189696. Throughput: 0: 42499.0. Samples: 7948321760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 17:50:43,581][15401] Updated weights for policy 0, policy_version 485120 (0.0030) [2024-06-23 17:50:46,957][15401] Updated weights for policy 0, policy_version 485130 (0.0032) [2024-06-23 17:50:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 7948419072. Throughput: 0: 42503.0. Samples: 7948576800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:48,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 17:50:51,260][15401] Updated weights for policy 0, policy_version 485140 (0.0052) [2024-06-23 17:50:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 7948615680. Throughput: 0: 42568.5. Samples: 7948712140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 17:50:54,641][15401] Updated weights for policy 0, policy_version 485150 (0.0046) [2024-06-23 17:50:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7948828672. Throughput: 0: 42512.4. Samples: 7948963180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:50:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 17:50:59,024][15401] Updated weights for policy 0, policy_version 485160 (0.0036) [2024-06-23 17:51:02,065][15401] Updated weights for policy 0, policy_version 485170 (0.0029) [2024-06-23 17:51:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7949074432. Throughput: 0: 42676.1. Samples: 7949219740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 17:51:06,563][15401] Updated weights for policy 0, policy_version 485180 (0.0040) [2024-06-23 17:51:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 7949271040. Throughput: 0: 42677.0. Samples: 7949359160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:08,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 17:51:09,868][15401] Updated weights for policy 0, policy_version 485190 (0.0028) [2024-06-23 17:51:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 7949484032. Throughput: 0: 42780.8. Samples: 7949615900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 17:51:14,103][15401] Updated weights for policy 0, policy_version 485200 (0.0028) [2024-06-23 17:51:17,245][15401] Updated weights for policy 0, policy_version 485210 (0.0024) [2024-06-23 17:51:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42871.5, 300 sec: 42875.8). Total num frames: 7949713408. Throughput: 0: 42879.1. Samples: 7949868580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:18,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 17:51:21,610][15401] Updated weights for policy 0, policy_version 485220 (0.0034) [2024-06-23 17:51:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 7949926400. Throughput: 0: 42913.8. Samples: 7950005800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 17:51:24,591][15401] Updated weights for policy 0, policy_version 485230 (0.0032) [2024-06-23 17:51:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7950139392. Throughput: 0: 43047.1. Samples: 7950258880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 17:51:29,170][15401] Updated weights for policy 0, policy_version 485240 (0.0032) [2024-06-23 17:51:32,480][15401] Updated weights for policy 0, policy_version 485250 (0.0040) [2024-06-23 17:51:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 7950368768. Throughput: 0: 43105.4. Samples: 7950516540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 17:51:36,844][15401] Updated weights for policy 0, policy_version 485260 (0.0037) [2024-06-23 17:51:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 7950565376. Throughput: 0: 43095.0. Samples: 7950651420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 17:51:40,024][15401] Updated weights for policy 0, policy_version 485270 (0.0026) [2024-06-23 17:51:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 7950761984. Throughput: 0: 43157.9. Samples: 7950905280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 17:51:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 17:51:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485276_7950761984.pth... [2024-06-23 17:51:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484649_7940489216.pth [2024-06-23 17:51:44,538][15401] Updated weights for policy 0, policy_version 485280 (0.0032) [2024-06-23 17:51:47,721][15401] Updated weights for policy 0, policy_version 485290 (0.0037) [2024-06-23 17:51:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 7951007744. Throughput: 0: 42869.3. Samples: 7951148860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:51:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 17:51:52,276][15401] Updated weights for policy 0, policy_version 485300 (0.0033) [2024-06-23 17:51:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 7951204352. Throughput: 0: 42762.9. Samples: 7951283500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:51:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 17:51:53,618][15349] Signal inference workers to stop experience collection... (117850 times) [2024-06-23 17:51:53,619][15349] Signal inference workers to resume experience collection... (117850 times) [2024-06-23 17:51:53,643][15401] InferenceWorker_p0-w0: stopping experience collection (117850 times) [2024-06-23 17:51:53,643][15401] InferenceWorker_p0-w0: resuming experience collection (117850 times) [2024-06-23 17:51:55,422][15401] Updated weights for policy 0, policy_version 485310 (0.0037) [2024-06-23 17:51:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 7951417344. Throughput: 0: 42865.3. Samples: 7951544840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:51:58,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 17:52:00,035][15401] Updated weights for policy 0, policy_version 485320 (0.0027) [2024-06-23 17:52:03,125][15401] Updated weights for policy 0, policy_version 485330 (0.0027) [2024-06-23 17:52:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 7951646720. Throughput: 0: 42818.6. Samples: 7951795320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 17:52:07,636][15401] Updated weights for policy 0, policy_version 485340 (0.0025) [2024-06-23 17:52:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7951826944. Throughput: 0: 42735.6. Samples: 7951928900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 17:52:10,979][15401] Updated weights for policy 0, policy_version 485350 (0.0022) [2024-06-23 17:52:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 7952056320. Throughput: 0: 42662.3. Samples: 7952178680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 17:52:15,495][15401] Updated weights for policy 0, policy_version 485360 (0.0036) [2024-06-23 17:52:18,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 7952285696. Throughput: 0: 42655.7. Samples: 7952436060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 17:52:18,539][15401] Updated weights for policy 0, policy_version 485370 (0.0030) [2024-06-23 17:52:23,238][15401] Updated weights for policy 0, policy_version 485380 (0.0036) [2024-06-23 17:52:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 7952465920. Throughput: 0: 42560.9. Samples: 7952566660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 17:52:26,661][15401] Updated weights for policy 0, policy_version 485390 (0.0036) [2024-06-23 17:52:28,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7952711680. Throughput: 0: 42575.9. Samples: 7952821200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 17:52:30,784][15401] Updated weights for policy 0, policy_version 485400 (0.0034) [2024-06-23 17:52:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 7952924672. Throughput: 0: 42780.9. Samples: 7953074000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 17:52:34,055][15401] Updated weights for policy 0, policy_version 485410 (0.0038) [2024-06-23 17:52:38,239][15401] Updated weights for policy 0, policy_version 485420 (0.0033) [2024-06-23 17:52:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 7953121280. Throughput: 0: 42713.4. Samples: 7953205600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 17:52:41,536][15401] Updated weights for policy 0, policy_version 485430 (0.0030) [2024-06-23 17:52:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 7953334272. Throughput: 0: 42560.0. Samples: 7953460040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 17:52:46,317][15401] Updated weights for policy 0, policy_version 485440 (0.0042) [2024-06-23 17:52:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 7953580032. Throughput: 0: 42602.8. Samples: 7953712440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 17:52:49,028][15401] Updated weights for policy 0, policy_version 485450 (0.0028) [2024-06-23 17:52:53,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 7953760256. Throughput: 0: 42553.7. Samples: 7953843920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:53,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 17:52:53,772][15401] Updated weights for policy 0, policy_version 485460 (0.0027) [2024-06-23 17:52:56,628][15401] Updated weights for policy 0, policy_version 485470 (0.0035) [2024-06-23 17:52:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7953956864. Throughput: 0: 42537.3. Samples: 7954092860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:52:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 17:53:01,526][15401] Updated weights for policy 0, policy_version 485480 (0.0031) [2024-06-23 17:53:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 7954202624. Throughput: 0: 42488.2. Samples: 7954348020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:53:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 17:53:04,434][15401] Updated weights for policy 0, policy_version 485490 (0.0041) [2024-06-23 17:53:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7954366464. Throughput: 0: 42494.8. Samples: 7954478920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:53:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 17:53:09,378][15401] Updated weights for policy 0, policy_version 485500 (0.0041) [2024-06-23 17:53:11,009][15349] Signal inference workers to stop experience collection... (117900 times) [2024-06-23 17:53:11,013][15349] Signal inference workers to resume experience collection... (117900 times) [2024-06-23 17:53:11,031][15401] InferenceWorker_p0-w0: stopping experience collection (117900 times) [2024-06-23 17:53:11,031][15401] InferenceWorker_p0-w0: resuming experience collection (117900 times) [2024-06-23 17:53:12,285][15401] Updated weights for policy 0, policy_version 485510 (0.0034) [2024-06-23 17:53:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 7954612224. Throughput: 0: 42265.7. Samples: 7954723160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-23 17:53:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 17:53:16,931][15401] Updated weights for policy 0, policy_version 485520 (0.0042) [2024-06-23 17:53:18,392][15132] Fps is (10 sec: 47501.7, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 7954841600. Throughput: 0: 42518.7. Samples: 7954987440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:18,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 17:53:19,828][15401] Updated weights for policy 0, policy_version 485530 (0.0036) [2024-06-23 17:53:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7955005440. Throughput: 0: 42381.8. Samples: 7955112780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 17:53:25,028][15401] Updated weights for policy 0, policy_version 485540 (0.0028) [2024-06-23 17:53:28,014][15401] Updated weights for policy 0, policy_version 485550 (0.0035) [2024-06-23 17:53:28,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7955251200. Throughput: 0: 42260.5. Samples: 7955361760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 17:53:32,488][15401] Updated weights for policy 0, policy_version 485560 (0.0035) [2024-06-23 17:53:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7955464192. Throughput: 0: 42510.8. Samples: 7955625420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 17:53:35,626][15401] Updated weights for policy 0, policy_version 485570 (0.0041) [2024-06-23 17:53:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 7955660800. Throughput: 0: 42295.1. Samples: 7955747100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 17:53:39,986][15401] Updated weights for policy 0, policy_version 485580 (0.0029) [2024-06-23 17:53:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7955890176. Throughput: 0: 42401.2. Samples: 7956000920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 17:53:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485589_7955890176.pth... [2024-06-23 17:53:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000484964_7945650176.pth [2024-06-23 17:53:43,675][15401] Updated weights for policy 0, policy_version 485590 (0.0035) [2024-06-23 17:53:47,829][15401] Updated weights for policy 0, policy_version 485600 (0.0047) [2024-06-23 17:53:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7956103168. Throughput: 0: 42616.0. Samples: 7956265740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 17:53:51,093][15401] Updated weights for policy 0, policy_version 485610 (0.0021) [2024-06-23 17:53:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 7956299776. Throughput: 0: 42550.6. Samples: 7956393700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:53,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 17:53:55,326][15401] Updated weights for policy 0, policy_version 485620 (0.0028) [2024-06-23 17:53:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7956545536. Throughput: 0: 42758.3. Samples: 7956647280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:53:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 17:53:58,630][15401] Updated weights for policy 0, policy_version 485630 (0.0033) [2024-06-23 17:54:02,931][15401] Updated weights for policy 0, policy_version 485640 (0.0030) [2024-06-23 17:54:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7956758528. Throughput: 0: 42682.8. Samples: 7956908060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:03,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 17:54:06,313][15401] Updated weights for policy 0, policy_version 485650 (0.0039) [2024-06-23 17:54:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 7956938752. Throughput: 0: 42718.7. Samples: 7957035120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:08,396][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 17:54:10,603][15401] Updated weights for policy 0, policy_version 485660 (0.0045) [2024-06-23 17:54:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 7957184512. Throughput: 0: 42826.2. Samples: 7957289040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:13,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 17:54:14,284][15401] Updated weights for policy 0, policy_version 485670 (0.0045) [2024-06-23 17:54:18,385][15401] Updated weights for policy 0, policy_version 485680 (0.0024) [2024-06-23 17:54:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7957381120. Throughput: 0: 42691.6. Samples: 7957546540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 17:54:21,895][15401] Updated weights for policy 0, policy_version 485690 (0.0045) [2024-06-23 17:54:23,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 7957577728. Throughput: 0: 42786.2. Samples: 7957672580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:23,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 17:54:26,007][15401] Updated weights for policy 0, policy_version 485700 (0.0041) [2024-06-23 17:54:26,024][15349] Signal inference workers to stop experience collection... (117950 times) [2024-06-23 17:54:26,024][15349] Signal inference workers to resume experience collection... (117950 times) [2024-06-23 17:54:26,049][15401] InferenceWorker_p0-w0: stopping experience collection (117950 times) [2024-06-23 17:54:26,049][15401] InferenceWorker_p0-w0: resuming experience collection (117950 times) [2024-06-23 17:54:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7957823488. Throughput: 0: 42827.7. Samples: 7957928160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 17:54:29,442][15401] Updated weights for policy 0, policy_version 485710 (0.0024) [2024-06-23 17:54:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 7958020096. Throughput: 0: 42944.0. Samples: 7958198220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 17:54:33,480][15401] Updated weights for policy 0, policy_version 485720 (0.0027) [2024-06-23 17:54:37,013][15401] Updated weights for policy 0, policy_version 485730 (0.0040) [2024-06-23 17:54:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 7958233088. Throughput: 0: 42866.9. Samples: 7958322820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 17:54:38,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 17:54:41,105][15401] Updated weights for policy 0, policy_version 485740 (0.0026) [2024-06-23 17:54:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7958462464. Throughput: 0: 42804.3. Samples: 7958573480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:54:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 17:54:44,550][15401] Updated weights for policy 0, policy_version 485750 (0.0024) [2024-06-23 17:54:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7958659072. Throughput: 0: 42892.0. Samples: 7958838200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:54:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 17:54:48,964][15401] Updated weights for policy 0, policy_version 485760 (0.0037) [2024-06-23 17:54:52,116][15401] Updated weights for policy 0, policy_version 485770 (0.0039) [2024-06-23 17:54:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7958872064. Throughput: 0: 42793.7. Samples: 7958960840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:54:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 17:54:56,609][15401] Updated weights for policy 0, policy_version 485780 (0.0037) [2024-06-23 17:54:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7959117824. Throughput: 0: 42883.6. Samples: 7959218700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:54:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 17:55:00,223][15401] Updated weights for policy 0, policy_version 485790 (0.0031) [2024-06-23 17:55:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 7959298048. Throughput: 0: 42855.9. Samples: 7959475160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:03,393][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 17:55:04,130][15401] Updated weights for policy 0, policy_version 485800 (0.0040) [2024-06-23 17:55:07,814][15401] Updated weights for policy 0, policy_version 485810 (0.0033) [2024-06-23 17:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7959527424. Throughput: 0: 42765.4. Samples: 7959596920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:08,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 17:55:11,930][15401] Updated weights for policy 0, policy_version 485820 (0.0052) [2024-06-23 17:55:13,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 7959740416. Throughput: 0: 42800.5. Samples: 7959854180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 17:55:15,544][15401] Updated weights for policy 0, policy_version 485830 (0.0046) [2024-06-23 17:55:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7959937024. Throughput: 0: 42623.6. Samples: 7960116280. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 17:55:19,617][15401] Updated weights for policy 0, policy_version 485840 (0.0047) [2024-06-23 17:55:23,190][15401] Updated weights for policy 0, policy_version 485850 (0.0035) [2024-06-23 17:55:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 7960182784. Throughput: 0: 42553.8. Samples: 7960237640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 17:55:27,200][15401] Updated weights for policy 0, policy_version 485860 (0.0042) [2024-06-23 17:55:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7960379392. Throughput: 0: 42691.6. Samples: 7960494600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 17:55:30,898][15401] Updated weights for policy 0, policy_version 485870 (0.0043) [2024-06-23 17:55:33,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 7960576000. Throughput: 0: 42517.2. Samples: 7960751580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:33,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 17:55:34,858][15401] Updated weights for policy 0, policy_version 485880 (0.0025) [2024-06-23 17:55:37,400][15349] Signal inference workers to stop experience collection... (118000 times) [2024-06-23 17:55:37,401][15349] Signal inference workers to resume experience collection... (118000 times) [2024-06-23 17:55:37,421][15401] InferenceWorker_p0-w0: stopping experience collection (118000 times) [2024-06-23 17:55:37,452][15401] InferenceWorker_p0-w0: resuming experience collection (118000 times) [2024-06-23 17:55:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 7960805376. Throughput: 0: 42527.2. Samples: 7960874560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:38,392][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 17:55:38,599][15401] Updated weights for policy 0, policy_version 485890 (0.0035) [2024-06-23 17:55:42,614][15401] Updated weights for policy 0, policy_version 485900 (0.0036) [2024-06-23 17:55:43,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 7961001984. Throughput: 0: 42524.0. Samples: 7961132280. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 17:55:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485901_7961001984.pth... [2024-06-23 17:55:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485276_7950761984.pth [2024-06-23 17:55:46,929][15401] Updated weights for policy 0, policy_version 485910 (0.0041) [2024-06-23 17:55:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7961231360. Throughput: 0: 42433.7. Samples: 7961384580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 17:55:50,441][15401] Updated weights for policy 0, policy_version 485920 (0.0032) [2024-06-23 17:55:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7961444352. Throughput: 0: 42586.6. Samples: 7961513320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 17:55:54,520][15401] Updated weights for policy 0, policy_version 485930 (0.0038) [2024-06-23 17:55:58,272][15401] Updated weights for policy 0, policy_version 485940 (0.0041) [2024-06-23 17:55:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 7961640960. Throughput: 0: 42361.7. Samples: 7961760460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:55:58,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 17:56:02,399][15401] Updated weights for policy 0, policy_version 485950 (0.0033) [2024-06-23 17:56:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 7961870336. Throughput: 0: 42229.6. Samples: 7962016720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-23 17:56:03,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 17:56:05,797][15401] Updated weights for policy 0, policy_version 485960 (0.0028) [2024-06-23 17:56:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7962066944. Throughput: 0: 42466.8. Samples: 7962148640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 17:56:10,120][15401] Updated weights for policy 0, policy_version 485970 (0.0043) [2024-06-23 17:56:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 7962279936. Throughput: 0: 42430.3. Samples: 7962403960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 17:56:13,442][15401] Updated weights for policy 0, policy_version 485980 (0.0030) [2024-06-23 17:56:17,797][15401] Updated weights for policy 0, policy_version 485990 (0.0037) [2024-06-23 17:56:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7962492928. Throughput: 0: 42445.3. Samples: 7962661520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 17:56:20,998][15401] Updated weights for policy 0, policy_version 486000 (0.0034) [2024-06-23 17:56:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 7962705920. Throughput: 0: 42426.3. Samples: 7962783740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 17:56:25,533][15401] Updated weights for policy 0, policy_version 486010 (0.0037) [2024-06-23 17:56:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7962918912. Throughput: 0: 42278.2. Samples: 7963034800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 17:56:28,881][15401] Updated weights for policy 0, policy_version 486020 (0.0042) [2024-06-23 17:56:33,348][15401] Updated weights for policy 0, policy_version 486030 (0.0044) [2024-06-23 17:56:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 7963115520. Throughput: 0: 42599.6. Samples: 7963301560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 17:56:36,574][15401] Updated weights for policy 0, policy_version 486040 (0.0040) [2024-06-23 17:56:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7963361280. Throughput: 0: 42487.2. Samples: 7963425240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 17:56:41,014][15401] Updated weights for policy 0, policy_version 486050 (0.0040) [2024-06-23 17:56:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7963557888. Throughput: 0: 42620.8. Samples: 7963678400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 17:56:44,169][15401] Updated weights for policy 0, policy_version 486060 (0.0031) [2024-06-23 17:56:48,364][15349] Signal inference workers to stop experience collection... (118050 times) [2024-06-23 17:56:48,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 42487.4). Total num frames: 7963738112. Throughput: 0: 42731.7. Samples: 7963939540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 17:56:48,417][15401] InferenceWorker_p0-w0: stopping experience collection (118050 times) [2024-06-23 17:56:48,425][15349] Signal inference workers to resume experience collection... (118050 times) [2024-06-23 17:56:48,436][15401] InferenceWorker_p0-w0: resuming experience collection (118050 times) [2024-06-23 17:56:48,568][15401] Updated weights for policy 0, policy_version 486070 (0.0037) [2024-06-23 17:56:51,857][15401] Updated weights for policy 0, policy_version 486080 (0.0035) [2024-06-23 17:56:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7963983872. Throughput: 0: 42444.4. Samples: 7964058640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 17:56:56,115][15401] Updated weights for policy 0, policy_version 486090 (0.0036) [2024-06-23 17:56:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7964196864. Throughput: 0: 42482.2. Samples: 7964315660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:56:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 17:56:59,926][15401] Updated weights for policy 0, policy_version 486100 (0.0033) [2024-06-23 17:57:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41780.9, 300 sec: 42542.9). Total num frames: 7964377088. Throughput: 0: 42456.1. Samples: 7964572040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 17:57:03,957][15401] Updated weights for policy 0, policy_version 486110 (0.0039) [2024-06-23 17:57:07,486][15401] Updated weights for policy 0, policy_version 486120 (0.0043) [2024-06-23 17:57:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7964590080. Throughput: 0: 42456.7. Samples: 7964694300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 17:57:11,550][15401] Updated weights for policy 0, policy_version 486130 (0.0053) [2024-06-23 17:57:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7964852224. Throughput: 0: 42536.9. Samples: 7964948960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 17:57:15,003][15401] Updated weights for policy 0, policy_version 486140 (0.0036) [2024-06-23 17:57:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7965032448. Throughput: 0: 42274.5. Samples: 7965203920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 17:57:19,210][15401] Updated weights for policy 0, policy_version 486150 (0.0037) [2024-06-23 17:57:22,609][15401] Updated weights for policy 0, policy_version 486160 (0.0042) [2024-06-23 17:57:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7965245440. Throughput: 0: 42275.6. Samples: 7965327640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 17:57:27,200][15401] Updated weights for policy 0, policy_version 486170 (0.0033) [2024-06-23 17:57:28,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 7965458432. Throughput: 0: 42408.5. Samples: 7965586880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 17:57:28,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 17:57:30,121][15401] Updated weights for policy 0, policy_version 486180 (0.0021) [2024-06-23 17:57:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7965671424. Throughput: 0: 42299.0. Samples: 7965843000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 17:57:34,764][15401] Updated weights for policy 0, policy_version 486190 (0.0028) [2024-06-23 17:57:37,680][15401] Updated weights for policy 0, policy_version 486200 (0.0035) [2024-06-23 17:57:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7965900800. Throughput: 0: 42466.7. Samples: 7965969640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 17:57:42,807][15401] Updated weights for policy 0, policy_version 486210 (0.0042) [2024-06-23 17:57:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 7966081024. Throughput: 0: 42423.6. Samples: 7966224720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 17:57:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486212_7966097408.pth... [2024-06-23 17:57:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485589_7955890176.pth [2024-06-23 17:57:45,199][15401] Updated weights for policy 0, policy_version 486220 (0.0037) [2024-06-23 17:57:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 7966310400. Throughput: 0: 42396.5. Samples: 7966479880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:48,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 17:57:50,539][15401] Updated weights for policy 0, policy_version 486230 (0.0038) [2024-06-23 17:57:53,326][15401] Updated weights for policy 0, policy_version 486240 (0.0033) [2024-06-23 17:57:53,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7966556160. Throughput: 0: 42396.1. Samples: 7966602120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:53,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 17:57:56,375][15349] Signal inference workers to stop experience collection... (118100 times) [2024-06-23 17:57:56,403][15401] InferenceWorker_p0-w0: stopping experience collection (118100 times) [2024-06-23 17:57:56,433][15349] Signal inference workers to resume experience collection... (118100 times) [2024-06-23 17:57:56,440][15401] InferenceWorker_p0-w0: resuming experience collection (118100 times) [2024-06-23 17:57:58,167][15401] Updated weights for policy 0, policy_version 486250 (0.0034) [2024-06-23 17:57:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7966736384. Throughput: 0: 42485.3. Samples: 7966860800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:57:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 17:58:00,788][15401] Updated weights for policy 0, policy_version 486260 (0.0031) [2024-06-23 17:58:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7966949376. Throughput: 0: 42608.1. Samples: 7967121280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 17:58:05,775][15401] Updated weights for policy 0, policy_version 486270 (0.0029) [2024-06-23 17:58:08,371][15401] Updated weights for policy 0, policy_version 486280 (0.0041) [2024-06-23 17:58:08,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 7967211520. Throughput: 0: 42653.3. Samples: 7967247040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 17:58:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42432.1). Total num frames: 7967358976. Throughput: 0: 42412.9. Samples: 7967495360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 17:58:13,414][15401] Updated weights for policy 0, policy_version 486290 (0.0038) [2024-06-23 17:58:16,612][15401] Updated weights for policy 0, policy_version 486300 (0.0033) [2024-06-23 17:58:18,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7967571968. Throughput: 0: 42503.4. Samples: 7967755660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 17:58:21,197][15401] Updated weights for policy 0, policy_version 486310 (0.0039) [2024-06-23 17:58:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7967801344. Throughput: 0: 42486.7. Samples: 7967881540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 17:58:24,341][15401] Updated weights for policy 0, policy_version 486320 (0.0023) [2024-06-23 17:58:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 7967997952. Throughput: 0: 42450.5. Samples: 7968135000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 17:58:28,730][15401] Updated weights for policy 0, policy_version 486330 (0.0028) [2024-06-23 17:58:31,826][15401] Updated weights for policy 0, policy_version 486340 (0.0023) [2024-06-23 17:58:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7968210944. Throughput: 0: 42570.2. Samples: 7968395540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 17:58:36,387][15401] Updated weights for policy 0, policy_version 486350 (0.0040) [2024-06-23 17:58:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7968456704. Throughput: 0: 42689.7. Samples: 7968523160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 17:58:39,555][15401] Updated weights for policy 0, policy_version 486360 (0.0044) [2024-06-23 17:58:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7968636928. Throughput: 0: 42657.3. Samples: 7968780380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:43,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 17:58:43,902][15401] Updated weights for policy 0, policy_version 486370 (0.0049) [2024-06-23 17:58:47,483][15401] Updated weights for policy 0, policy_version 486380 (0.0048) [2024-06-23 17:58:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7968866304. Throughput: 0: 42493.5. Samples: 7969033480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 17:58:51,565][15401] Updated weights for policy 0, policy_version 486390 (0.0043) [2024-06-23 17:58:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 7969079296. Throughput: 0: 42548.1. Samples: 7969161700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 17:58:53,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 17:58:55,265][15401] Updated weights for policy 0, policy_version 486400 (0.0047) [2024-06-23 17:58:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 7969275904. Throughput: 0: 42760.4. Samples: 7969419580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:58:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 17:58:59,203][15401] Updated weights for policy 0, policy_version 486410 (0.0039) [2024-06-23 17:59:03,091][15401] Updated weights for policy 0, policy_version 486420 (0.0033) [2024-06-23 17:59:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7969521664. Throughput: 0: 42565.3. Samples: 7969671100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 17:59:06,822][15401] Updated weights for policy 0, policy_version 486430 (0.0037) [2024-06-23 17:59:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 42487.7). Total num frames: 7969718272. Throughput: 0: 42750.2. Samples: 7969805300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 17:59:10,761][15401] Updated weights for policy 0, policy_version 486440 (0.0046) [2024-06-23 17:59:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7969914880. Throughput: 0: 42720.5. Samples: 7970057420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:13,391][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 17:59:14,485][15401] Updated weights for policy 0, policy_version 486450 (0.0039) [2024-06-23 17:59:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 7970144256. Throughput: 0: 42723.6. Samples: 7970318100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 17:59:18,511][15401] Updated weights for policy 0, policy_version 486460 (0.0038) [2024-06-23 17:59:20,820][15349] Signal inference workers to stop experience collection... (118150 times) [2024-06-23 17:59:20,820][15349] Signal inference workers to resume experience collection... (118150 times) [2024-06-23 17:59:20,854][15401] InferenceWorker_p0-w0: stopping experience collection (118150 times) [2024-06-23 17:59:20,854][15401] InferenceWorker_p0-w0: resuming experience collection (118150 times) [2024-06-23 17:59:22,354][15401] Updated weights for policy 0, policy_version 486470 (0.0032) [2024-06-23 17:59:23,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 7970373632. Throughput: 0: 42785.7. Samples: 7970448620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:23,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 17:59:26,275][15401] Updated weights for policy 0, policy_version 486480 (0.0032) [2024-06-23 17:59:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 7970553856. Throughput: 0: 42630.2. Samples: 7970698740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 17:59:29,992][15401] Updated weights for policy 0, policy_version 486490 (0.0044) [2024-06-23 17:59:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 7970783232. Throughput: 0: 42572.8. Samples: 7970949260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 17:59:34,253][15401] Updated weights for policy 0, policy_version 486500 (0.0030) [2024-06-23 17:59:37,509][15401] Updated weights for policy 0, policy_version 486510 (0.0030) [2024-06-23 17:59:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7971012608. Throughput: 0: 42704.9. Samples: 7971083420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 17:59:41,831][15401] Updated weights for policy 0, policy_version 486520 (0.0026) [2024-06-23 17:59:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7971192832. Throughput: 0: 42703.1. Samples: 7971341220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 17:59:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486524_7971209216.pth... [2024-06-23 17:59:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000485901_7961001984.pth [2024-06-23 17:59:45,079][15401] Updated weights for policy 0, policy_version 486530 (0.0041) [2024-06-23 17:59:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 7971422208. Throughput: 0: 42636.0. Samples: 7971589720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 17:59:49,486][15401] Updated weights for policy 0, policy_version 486540 (0.0042) [2024-06-23 17:59:53,310][15401] Updated weights for policy 0, policy_version 486550 (0.0032) [2024-06-23 17:59:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7971635200. Throughput: 0: 42542.1. Samples: 7971719700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:53,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 17:59:57,093][15401] Updated weights for policy 0, policy_version 486560 (0.0031) [2024-06-23 17:59:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 7971848192. Throughput: 0: 42641.0. Samples: 7971976260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 17:59:58,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 18:00:00,954][15401] Updated weights for policy 0, policy_version 486570 (0.0038) [2024-06-23 18:00:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 7972061184. Throughput: 0: 42318.9. Samples: 7972222460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 18:00:03,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 18:00:04,701][15401] Updated weights for policy 0, policy_version 486580 (0.0031) [2024-06-23 18:00:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7972274176. Throughput: 0: 42435.1. Samples: 7972358100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 18:00:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 18:00:08,513][15401] Updated weights for policy 0, policy_version 486590 (0.0032) [2024-06-23 18:00:12,895][15401] Updated weights for policy 0, policy_version 486600 (0.0032) [2024-06-23 18:00:13,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 7972470784. Throughput: 0: 42513.8. Samples: 7972611860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 18:00:13,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 18:00:16,273][15401] Updated weights for policy 0, policy_version 486610 (0.0029) [2024-06-23 18:00:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7972700160. Throughput: 0: 42577.4. Samples: 7972865240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 18:00:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 18:00:20,627][15401] Updated weights for policy 0, policy_version 486620 (0.0023) [2024-06-23 18:00:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 7972929536. Throughput: 0: 42525.7. Samples: 7972997080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 18:00:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 18:00:23,831][15401] Updated weights for policy 0, policy_version 486630 (0.0029) [2024-06-23 18:00:28,183][15401] Updated weights for policy 0, policy_version 486640 (0.0033) [2024-06-23 18:00:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 7973109760. Throughput: 0: 42439.9. Samples: 7973251020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 18:00:31,591][15401] Updated weights for policy 0, policy_version 486650 (0.0037) [2024-06-23 18:00:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 7973355520. Throughput: 0: 42561.7. Samples: 7973505000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 18:00:35,749][15401] Updated weights for policy 0, policy_version 486660 (0.0022) [2024-06-23 18:00:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7973568512. Throughput: 0: 42603.6. Samples: 7973636860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 18:00:39,099][15401] Updated weights for policy 0, policy_version 486670 (0.0031) [2024-06-23 18:00:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 7973748736. Throughput: 0: 42462.3. Samples: 7973887060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 18:00:43,662][15401] Updated weights for policy 0, policy_version 486680 (0.0032) [2024-06-23 18:00:46,891][15401] Updated weights for policy 0, policy_version 486690 (0.0031) [2024-06-23 18:00:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 7973961728. Throughput: 0: 42711.7. Samples: 7974144480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:00:51,314][15401] Updated weights for policy 0, policy_version 486700 (0.0032) [2024-06-23 18:00:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7974207488. Throughput: 0: 42555.1. Samples: 7974273080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 18:00:54,130][15349] Signal inference workers to stop experience collection... (118200 times) [2024-06-23 18:00:54,159][15401] InferenceWorker_p0-w0: stopping experience collection (118200 times) [2024-06-23 18:00:54,246][15349] Signal inference workers to resume experience collection... (118200 times) [2024-06-23 18:00:54,246][15401] InferenceWorker_p0-w0: resuming experience collection (118200 times) [2024-06-23 18:00:54,384][15401] Updated weights for policy 0, policy_version 486710 (0.0031) [2024-06-23 18:00:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 7974387712. Throughput: 0: 42431.6. Samples: 7974521280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:00:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 18:00:58,922][15401] Updated weights for policy 0, policy_version 486720 (0.0035) [2024-06-23 18:01:02,094][15401] Updated weights for policy 0, policy_version 486730 (0.0053) [2024-06-23 18:01:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7974600704. Throughput: 0: 42539.9. Samples: 7974779540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 18:01:06,791][15401] Updated weights for policy 0, policy_version 486740 (0.0032) [2024-06-23 18:01:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 7974830080. Throughput: 0: 42499.7. Samples: 7974909660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:08,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 18:01:09,640][15401] Updated weights for policy 0, policy_version 486750 (0.0036) [2024-06-23 18:01:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 7975043072. Throughput: 0: 42409.5. Samples: 7975159440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 18:01:14,843][15401] Updated weights for policy 0, policy_version 486760 (0.0031) [2024-06-23 18:01:17,271][15401] Updated weights for policy 0, policy_version 486770 (0.0042) [2024-06-23 18:01:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 7975256064. Throughput: 0: 42381.9. Samples: 7975412180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 18:01:22,471][15401] Updated weights for policy 0, policy_version 486780 (0.0042) [2024-06-23 18:01:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 7975452672. Throughput: 0: 42438.2. Samples: 7975546580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:23,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 18:01:24,969][15401] Updated weights for policy 0, policy_version 486790 (0.0031) [2024-06-23 18:01:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 7975665664. Throughput: 0: 42460.5. Samples: 7975797780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 18:01:30,038][15401] Updated weights for policy 0, policy_version 486800 (0.0038) [2024-06-23 18:01:32,458][15401] Updated weights for policy 0, policy_version 486810 (0.0032) [2024-06-23 18:01:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 7975911424. Throughput: 0: 42221.3. Samples: 7976044440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 18:01:37,971][15401] Updated weights for policy 0, policy_version 486820 (0.0027) [2024-06-23 18:01:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 7976075264. Throughput: 0: 42391.1. Samples: 7976180680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:38,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 18:01:40,380][15401] Updated weights for policy 0, policy_version 486830 (0.0030) [2024-06-23 18:01:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7976304640. Throughput: 0: 42495.1. Samples: 7976433560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:43,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 18:01:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486836_7976321024.pth... [2024-06-23 18:01:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486212_7966097408.pth [2024-06-23 18:01:45,506][15401] Updated weights for policy 0, policy_version 486840 (0.0044) [2024-06-23 18:01:47,932][15401] Updated weights for policy 0, policy_version 486850 (0.0036) [2024-06-23 18:01:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 7976550400. Throughput: 0: 42336.9. Samples: 7976684700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 18:01:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 18:01:53,037][15401] Updated weights for policy 0, policy_version 486860 (0.0022) [2024-06-23 18:01:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 7976730624. Throughput: 0: 42514.7. Samples: 7976822720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:01:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 18:01:55,628][15401] Updated weights for policy 0, policy_version 486870 (0.0029) [2024-06-23 18:01:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7976943616. Throughput: 0: 42593.8. Samples: 7977076160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:01:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 18:02:00,599][15401] Updated weights for policy 0, policy_version 486880 (0.0033) [2024-06-23 18:02:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7977189376. Throughput: 0: 42614.7. Samples: 7977329840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 18:02:03,417][15401] Updated weights for policy 0, policy_version 486890 (0.0040) [2024-06-23 18:02:07,824][15349] Signal inference workers to stop experience collection... (118250 times) [2024-06-23 18:02:07,825][15349] Signal inference workers to resume experience collection... (118250 times) [2024-06-23 18:02:07,852][15401] InferenceWorker_p0-w0: stopping experience collection (118250 times) [2024-06-23 18:02:07,852][15401] InferenceWorker_p0-w0: resuming experience collection (118250 times) [2024-06-23 18:02:08,251][15401] Updated weights for policy 0, policy_version 486900 (0.0043) [2024-06-23 18:02:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 7977385984. Throughput: 0: 42566.7. Samples: 7977462080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 18:02:11,048][15401] Updated weights for policy 0, policy_version 486910 (0.0043) [2024-06-23 18:02:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 7977598976. Throughput: 0: 42596.8. Samples: 7977714640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:13,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 18:02:15,748][15401] Updated weights for policy 0, policy_version 486920 (0.0032) [2024-06-23 18:02:18,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 7977828352. Throughput: 0: 42842.9. Samples: 7977972640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:18,396][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 18:02:18,681][15401] Updated weights for policy 0, policy_version 486930 (0.0030) [2024-06-23 18:02:23,326][15401] Updated weights for policy 0, policy_version 486940 (0.0029) [2024-06-23 18:02:23,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42866.9, 300 sec: 42597.8). Total num frames: 7978024960. Throughput: 0: 42655.3. Samples: 7978100440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:23,396][15132] Avg episode reward: [(0, '0.225')] [2024-06-23 18:02:26,559][15401] Updated weights for policy 0, policy_version 486950 (0.0023) [2024-06-23 18:02:28,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7978237952. Throughput: 0: 42726.2. Samples: 7978356240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 18:02:31,166][15401] Updated weights for policy 0, policy_version 486960 (0.0049) [2024-06-23 18:02:33,390][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7978467328. Throughput: 0: 42831.6. Samples: 7978612120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 18:02:34,186][15401] Updated weights for policy 0, policy_version 486970 (0.0048) [2024-06-23 18:02:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7978647552. Throughput: 0: 42644.4. Samples: 7978741720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 18:02:39,028][15401] Updated weights for policy 0, policy_version 486980 (0.0039) [2024-06-23 18:02:41,731][15401] Updated weights for policy 0, policy_version 486990 (0.0028) [2024-06-23 18:02:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 7978876928. Throughput: 0: 42564.0. Samples: 7978991540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 18:02:46,855][15401] Updated weights for policy 0, policy_version 487000 (0.0033) [2024-06-23 18:02:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 7979089920. Throughput: 0: 42592.8. Samples: 7979246520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 18:02:49,603][15401] Updated weights for policy 0, policy_version 487010 (0.0043) [2024-06-23 18:02:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 7979270144. Throughput: 0: 42549.7. Samples: 7979376820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:53,391][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 18:02:54,604][15401] Updated weights for policy 0, policy_version 487020 (0.0042) [2024-06-23 18:02:57,085][15401] Updated weights for policy 0, policy_version 487030 (0.0037) [2024-06-23 18:02:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7979515904. Throughput: 0: 42469.4. Samples: 7979625760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:02:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 18:03:02,320][15401] Updated weights for policy 0, policy_version 487040 (0.0033) [2024-06-23 18:03:03,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 7979745280. Throughput: 0: 42636.2. Samples: 7979891000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:03:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 18:03:04,659][15401] Updated weights for policy 0, policy_version 487050 (0.0039) [2024-06-23 18:03:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7979925504. Throughput: 0: 42674.5. Samples: 7980020520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:03:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 18:03:09,809][15401] Updated weights for policy 0, policy_version 487060 (0.0038) [2024-06-23 18:03:12,278][15401] Updated weights for policy 0, policy_version 487070 (0.0031) [2024-06-23 18:03:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7980171264. Throughput: 0: 42582.7. Samples: 7980272460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 18:03:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 18:03:17,566][15401] Updated weights for policy 0, policy_version 487080 (0.0035) [2024-06-23 18:03:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 7980367872. Throughput: 0: 42833.7. Samples: 7980539640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 18:03:20,107][15401] Updated weights for policy 0, policy_version 487090 (0.0038) [2024-06-23 18:03:21,850][15349] Signal inference workers to stop experience collection... (118300 times) [2024-06-23 18:03:21,851][15349] Signal inference workers to resume experience collection... (118300 times) [2024-06-23 18:03:21,896][15401] InferenceWorker_p0-w0: stopping experience collection (118300 times) [2024-06-23 18:03:21,896][15401] InferenceWorker_p0-w0: resuming experience collection (118300 times) [2024-06-23 18:03:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 7980564480. Throughput: 0: 42625.8. Samples: 7980659880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:23,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 18:03:25,236][15401] Updated weights for policy 0, policy_version 487100 (0.0036) [2024-06-23 18:03:28,257][15401] Updated weights for policy 0, policy_version 487110 (0.0039) [2024-06-23 18:03:28,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7980810240. Throughput: 0: 42746.6. Samples: 7980915140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:28,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 18:03:32,682][15401] Updated weights for policy 0, policy_version 487120 (0.0040) [2024-06-23 18:03:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 7981006848. Throughput: 0: 42900.4. Samples: 7981177040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 18:03:35,917][15401] Updated weights for policy 0, policy_version 487130 (0.0034) [2024-06-23 18:03:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 7981203456. Throughput: 0: 42761.4. Samples: 7981301080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 18:03:40,110][15401] Updated weights for policy 0, policy_version 487140 (0.0041) [2024-06-23 18:03:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7981449216. Throughput: 0: 42992.4. Samples: 7981560420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 18:03:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487150_7981465600.pth... [2024-06-23 18:03:43,454][15401] Updated weights for policy 0, policy_version 487150 (0.0028) [2024-06-23 18:03:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486524_7971209216.pth [2024-06-23 18:03:47,610][15401] Updated weights for policy 0, policy_version 487160 (0.0039) [2024-06-23 18:03:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7981645824. Throughput: 0: 42832.9. Samples: 7981818480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 18:03:51,111][15401] Updated weights for policy 0, policy_version 487170 (0.0032) [2024-06-23 18:03:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7981858816. Throughput: 0: 42789.3. Samples: 7981946040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 18:03:55,069][15401] Updated weights for policy 0, policy_version 487180 (0.0035) [2024-06-23 18:03:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7982088192. Throughput: 0: 42993.7. Samples: 7982207180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:03:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 18:03:58,789][15401] Updated weights for policy 0, policy_version 487190 (0.0054) [2024-06-23 18:04:02,705][15401] Updated weights for policy 0, policy_version 487200 (0.0040) [2024-06-23 18:04:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7982301184. Throughput: 0: 42491.1. Samples: 7982451740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 18:04:06,594][15401] Updated weights for policy 0, policy_version 487210 (0.0043) [2024-06-23 18:04:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 7982481408. Throughput: 0: 42754.2. Samples: 7982583820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 18:04:10,343][15401] Updated weights for policy 0, policy_version 487220 (0.0030) [2024-06-23 18:04:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7982710784. Throughput: 0: 42851.2. Samples: 7982843440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 18:04:14,283][15401] Updated weights for policy 0, policy_version 487230 (0.0044) [2024-06-23 18:04:17,931][15401] Updated weights for policy 0, policy_version 487240 (0.0036) [2024-06-23 18:04:18,395][15132] Fps is (10 sec: 47489.2, 60 sec: 43140.9, 300 sec: 42653.5). Total num frames: 7982956544. Throughput: 0: 42568.1. Samples: 7983092820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:18,395][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 18:04:22,010][15401] Updated weights for policy 0, policy_version 487250 (0.0030) [2024-06-23 18:04:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 7983136768. Throughput: 0: 42826.7. Samples: 7983228280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 18:04:25,530][15401] Updated weights for policy 0, policy_version 487260 (0.0045) [2024-06-23 18:04:28,389][15132] Fps is (10 sec: 39342.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7983349760. Throughput: 0: 42777.8. Samples: 7983485420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 18:04:29,521][15401] Updated weights for policy 0, policy_version 487270 (0.0035) [2024-06-23 18:04:33,000][15401] Updated weights for policy 0, policy_version 487280 (0.0024) [2024-06-23 18:04:33,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 7983595520. Throughput: 0: 42536.8. Samples: 7983732640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 18:04:37,571][15401] Updated weights for policy 0, policy_version 487290 (0.0035) [2024-06-23 18:04:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 7983775744. Throughput: 0: 42623.9. Samples: 7983864120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 18:04:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 18:04:41,221][15401] Updated weights for policy 0, policy_version 487300 (0.0036) [2024-06-23 18:04:43,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 7983988736. Throughput: 0: 42489.5. Samples: 7984119200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:04:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 18:04:43,409][15349] Signal inference workers to stop experience collection... (118350 times) [2024-06-23 18:04:43,410][15349] Signal inference workers to resume experience collection... (118350 times) [2024-06-23 18:04:43,471][15401] InferenceWorker_p0-w0: stopping experience collection (118350 times) [2024-06-23 18:04:43,471][15401] InferenceWorker_p0-w0: resuming experience collection (118350 times) [2024-06-23 18:04:45,301][15401] Updated weights for policy 0, policy_version 487310 (0.0032) [2024-06-23 18:04:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 7984234496. Throughput: 0: 42741.0. Samples: 7984375080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:04:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 18:04:48,817][15401] Updated weights for policy 0, policy_version 487320 (0.0029) [2024-06-23 18:04:53,082][15401] Updated weights for policy 0, policy_version 487330 (0.0032) [2024-06-23 18:04:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 7984431104. Throughput: 0: 42669.9. Samples: 7984503960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:04:53,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 18:04:56,528][15401] Updated weights for policy 0, policy_version 487340 (0.0031) [2024-06-23 18:04:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7984644096. Throughput: 0: 42525.2. Samples: 7984757080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:04:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 18:05:00,681][15401] Updated weights for policy 0, policy_version 487350 (0.0028) [2024-06-23 18:05:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7984873472. Throughput: 0: 42811.5. Samples: 7985019120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 18:05:04,108][15401] Updated weights for policy 0, policy_version 487360 (0.0029) [2024-06-23 18:05:08,277][15401] Updated weights for policy 0, policy_version 487370 (0.0046) [2024-06-23 18:05:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 7985070080. Throughput: 0: 42635.4. Samples: 7985146880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 18:05:11,586][15401] Updated weights for policy 0, policy_version 487380 (0.0031) [2024-06-23 18:05:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7985283072. Throughput: 0: 42547.1. Samples: 7985400040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 18:05:15,839][15401] Updated weights for policy 0, policy_version 487390 (0.0028) [2024-06-23 18:05:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42327.3, 300 sec: 42598.1). Total num frames: 7985496064. Throughput: 0: 42831.2. Samples: 7985660140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:18,393][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 18:05:19,203][15401] Updated weights for policy 0, policy_version 487400 (0.0042) [2024-06-23 18:05:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7985709056. Throughput: 0: 42745.0. Samples: 7985787640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:23,391][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 18:05:23,507][15401] Updated weights for policy 0, policy_version 487410 (0.0037) [2024-06-23 18:05:27,300][15401] Updated weights for policy 0, policy_version 487420 (0.0044) [2024-06-23 18:05:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 7985922048. Throughput: 0: 42801.6. Samples: 7986045280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 18:05:31,465][15401] Updated weights for policy 0, policy_version 487430 (0.0040) [2024-06-23 18:05:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 7986151424. Throughput: 0: 42857.8. Samples: 7986303680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:33,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 18:05:35,110][15401] Updated weights for policy 0, policy_version 487440 (0.0038) [2024-06-23 18:05:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 7986348032. Throughput: 0: 42801.1. Samples: 7986430020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 18:05:38,998][15401] Updated weights for policy 0, policy_version 487450 (0.0030) [2024-06-23 18:05:42,800][15401] Updated weights for policy 0, policy_version 487460 (0.0032) [2024-06-23 18:05:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 7986577408. Throughput: 0: 42971.1. Samples: 7986690780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 18:05:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487462_7986577408.pth... [2024-06-23 18:05:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000486836_7976321024.pth [2024-06-23 18:05:46,766][15401] Updated weights for policy 0, policy_version 487470 (0.0038) [2024-06-23 18:05:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 7986790400. Throughput: 0: 42820.9. Samples: 7986946160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:48,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 18:05:50,463][15401] Updated weights for policy 0, policy_version 487480 (0.0037) [2024-06-23 18:05:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 7986987008. Throughput: 0: 42717.4. Samples: 7987069160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 18:05:54,598][15401] Updated weights for policy 0, policy_version 487490 (0.0028) [2024-06-23 18:05:58,275][15401] Updated weights for policy 0, policy_version 487500 (0.0030) [2024-06-23 18:05:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 7987200000. Throughput: 0: 42750.3. Samples: 7987323800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:05:58,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 18:05:59,858][15349] Signal inference workers to stop experience collection... (118400 times) [2024-06-23 18:05:59,860][15349] Signal inference workers to resume experience collection... (118400 times) [2024-06-23 18:05:59,897][15401] InferenceWorker_p0-w0: stopping experience collection (118400 times) [2024-06-23 18:05:59,897][15401] InferenceWorker_p0-w0: resuming experience collection (118400 times) [2024-06-23 18:06:02,277][15401] Updated weights for policy 0, policy_version 487510 (0.0032) [2024-06-23 18:06:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 7987412992. Throughput: 0: 42710.3. Samples: 7987582000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:06:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 18:06:05,787][15401] Updated weights for policy 0, policy_version 487520 (0.0027) [2024-06-23 18:06:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7987625984. Throughput: 0: 42697.8. Samples: 7987709040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 18:06:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 18:06:10,003][15401] Updated weights for policy 0, policy_version 487530 (0.0029) [2024-06-23 18:06:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 7987838976. Throughput: 0: 42628.2. Samples: 7987963540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:06:13,406][15401] Updated weights for policy 0, policy_version 487540 (0.0033) [2024-06-23 18:06:17,633][15401] Updated weights for policy 0, policy_version 487550 (0.0038) [2024-06-23 18:06:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 7988035584. Throughput: 0: 42738.4. Samples: 7988226900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 18:06:21,022][15401] Updated weights for policy 0, policy_version 487560 (0.0035) [2024-06-23 18:06:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7988264960. Throughput: 0: 42660.0. Samples: 7988349720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 18:06:25,342][15401] Updated weights for policy 0, policy_version 487570 (0.0038) [2024-06-23 18:06:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7988494336. Throughput: 0: 42492.1. Samples: 7988602920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 18:06:28,598][15401] Updated weights for policy 0, policy_version 487580 (0.0027) [2024-06-23 18:06:33,182][15401] Updated weights for policy 0, policy_version 487590 (0.0030) [2024-06-23 18:06:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 7988674560. Throughput: 0: 42673.0. Samples: 7988866340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 18:06:36,277][15401] Updated weights for policy 0, policy_version 487600 (0.0029) [2024-06-23 18:06:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7988903936. Throughput: 0: 42539.1. Samples: 7988983420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 18:06:40,626][15401] Updated weights for policy 0, policy_version 487610 (0.0032) [2024-06-23 18:06:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7989133312. Throughput: 0: 42687.5. Samples: 7989244740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 18:06:43,951][15401] Updated weights for policy 0, policy_version 487620 (0.0027) [2024-06-23 18:06:48,153][15401] Updated weights for policy 0, policy_version 487630 (0.0029) [2024-06-23 18:06:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 7989329920. Throughput: 0: 42676.6. Samples: 7989502440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 18:06:51,619][15401] Updated weights for policy 0, policy_version 487640 (0.0029) [2024-06-23 18:06:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7989559296. Throughput: 0: 42604.1. Samples: 7989626220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 18:06:56,050][15401] Updated weights for policy 0, policy_version 487650 (0.0029) [2024-06-23 18:06:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 7989772288. Throughput: 0: 42836.0. Samples: 7989891160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:06:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:06:59,268][15401] Updated weights for policy 0, policy_version 487660 (0.0027) [2024-06-23 18:07:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7989968896. Throughput: 0: 42766.6. Samples: 7990151400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 18:07:03,818][15401] Updated weights for policy 0, policy_version 487670 (0.0046) [2024-06-23 18:07:06,690][15401] Updated weights for policy 0, policy_version 487680 (0.0030) [2024-06-23 18:07:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7990214656. Throughput: 0: 42892.1. Samples: 7990279860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 18:07:11,655][15401] Updated weights for policy 0, policy_version 487690 (0.0039) [2024-06-23 18:07:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 7990394880. Throughput: 0: 42851.5. Samples: 7990531240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 18:07:14,129][15349] Signal inference workers to stop experience collection... (118450 times) [2024-06-23 18:07:14,129][15349] Signal inference workers to resume experience collection... (118450 times) [2024-06-23 18:07:14,155][15401] InferenceWorker_p0-w0: stopping experience collection (118450 times) [2024-06-23 18:07:14,155][15401] InferenceWorker_p0-w0: resuming experience collection (118450 times) [2024-06-23 18:07:14,267][15401] Updated weights for policy 0, policy_version 487700 (0.0027) [2024-06-23 18:07:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 7990591488. Throughput: 0: 42877.3. Samples: 7990795820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:18,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 18:07:19,142][15401] Updated weights for policy 0, policy_version 487710 (0.0040) [2024-06-23 18:07:21,841][15401] Updated weights for policy 0, policy_version 487720 (0.0022) [2024-06-23 18:07:23,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 7990870016. Throughput: 0: 42951.4. Samples: 7990916240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:23,404][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 18:07:26,815][15401] Updated weights for policy 0, policy_version 487730 (0.0034) [2024-06-23 18:07:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7991033856. Throughput: 0: 42850.7. Samples: 7991173020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:28,399][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:07:29,404][15401] Updated weights for policy 0, policy_version 487740 (0.0038) [2024-06-23 18:07:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 7991246848. Throughput: 0: 42835.3. Samples: 7991430040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 18:07:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 18:07:34,486][15401] Updated weights for policy 0, policy_version 487750 (0.0040) [2024-06-23 18:07:37,356][15401] Updated weights for policy 0, policy_version 487760 (0.0041) [2024-06-23 18:07:38,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 7991508992. Throughput: 0: 42830.0. Samples: 7991553580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:07:38,399][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:07:42,005][15401] Updated weights for policy 0, policy_version 487770 (0.0037) [2024-06-23 18:07:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7991672832. Throughput: 0: 42709.2. Samples: 7991813080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:07:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 18:07:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487774_7991689216.pth... [2024-06-23 18:07:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487150_7981465600.pth [2024-06-23 18:07:45,140][15401] Updated weights for policy 0, policy_version 487780 (0.0050) [2024-06-23 18:07:48,390][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 7991885824. Throughput: 0: 42573.2. Samples: 7992067200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:07:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 18:07:49,776][15401] Updated weights for policy 0, policy_version 487790 (0.0044) [2024-06-23 18:07:53,141][15401] Updated weights for policy 0, policy_version 487800 (0.0030) [2024-06-23 18:07:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7992131584. Throughput: 0: 42637.2. Samples: 7992198540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:07:53,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-23 18:07:57,387][15401] Updated weights for policy 0, policy_version 487810 (0.0028) [2024-06-23 18:07:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7992311808. Throughput: 0: 42818.3. Samples: 7992458060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:07:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 18:08:00,651][15401] Updated weights for policy 0, policy_version 487820 (0.0039) [2024-06-23 18:08:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7992541184. Throughput: 0: 42524.9. Samples: 7992709440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 18:08:05,017][15401] Updated weights for policy 0, policy_version 487830 (0.0045) [2024-06-23 18:08:08,213][15401] Updated weights for policy 0, policy_version 487840 (0.0026) [2024-06-23 18:08:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7992770560. Throughput: 0: 42748.7. Samples: 7992839920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 18:08:12,613][15401] Updated weights for policy 0, policy_version 487850 (0.0038) [2024-06-23 18:08:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 7992950784. Throughput: 0: 42708.9. Samples: 7993094920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 18:08:15,957][15401] Updated weights for policy 0, policy_version 487860 (0.0033) [2024-06-23 18:08:16,810][15349] Signal inference workers to stop experience collection... (118500 times) [2024-06-23 18:08:16,852][15401] InferenceWorker_p0-w0: stopping experience collection (118500 times) [2024-06-23 18:08:16,875][15349] Signal inference workers to resume experience collection... (118500 times) [2024-06-23 18:08:16,876][15401] InferenceWorker_p0-w0: resuming experience collection (118500 times) [2024-06-23 18:08:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7993180160. Throughput: 0: 42517.0. Samples: 7993343300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 18:08:20,379][15401] Updated weights for policy 0, policy_version 487870 (0.0042) [2024-06-23 18:08:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7993409536. Throughput: 0: 42778.8. Samples: 7993478620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 18:08:23,560][15401] Updated weights for policy 0, policy_version 487880 (0.0036) [2024-06-23 18:08:28,025][15401] Updated weights for policy 0, policy_version 487890 (0.0038) [2024-06-23 18:08:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 7993589760. Throughput: 0: 42591.1. Samples: 7993729680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 18:08:31,212][15401] Updated weights for policy 0, policy_version 487900 (0.0037) [2024-06-23 18:08:33,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 7993835520. Throughput: 0: 42498.7. Samples: 7993979740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:33,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 18:08:35,759][15401] Updated weights for policy 0, policy_version 487910 (0.0036) [2024-06-23 18:08:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.5, 300 sec: 42654.0). Total num frames: 7994032128. Throughput: 0: 42658.4. Samples: 7994118160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 18:08:38,826][15401] Updated weights for policy 0, policy_version 487920 (0.0034) [2024-06-23 18:08:43,390][15132] Fps is (10 sec: 39330.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 7994228736. Throughput: 0: 42485.1. Samples: 7994369900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 18:08:43,719][15401] Updated weights for policy 0, policy_version 487930 (0.0033) [2024-06-23 18:08:46,792][15401] Updated weights for policy 0, policy_version 487940 (0.0038) [2024-06-23 18:08:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 7994474496. Throughput: 0: 42361.0. Samples: 7994615680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:48,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 18:08:51,421][15401] Updated weights for policy 0, policy_version 487950 (0.0028) [2024-06-23 18:08:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 7994671104. Throughput: 0: 42547.8. Samples: 7994754580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:53,391][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 18:08:54,675][15401] Updated weights for policy 0, policy_version 487960 (0.0032) [2024-06-23 18:08:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 7994851328. Throughput: 0: 42422.8. Samples: 7995003940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 18:08:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 18:08:58,949][15401] Updated weights for policy 0, policy_version 487970 (0.0038) [2024-06-23 18:09:02,194][15401] Updated weights for policy 0, policy_version 487980 (0.0021) [2024-06-23 18:09:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 7995113472. Throughput: 0: 42682.5. Samples: 7995264020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 18:09:06,427][15401] Updated weights for policy 0, policy_version 487990 (0.0025) [2024-06-23 18:09:08,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42320.8, 300 sec: 42708.5). Total num frames: 7995310080. Throughput: 0: 42606.9. Samples: 7995396200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:08,396][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 18:09:09,784][15401] Updated weights for policy 0, policy_version 488000 (0.0041) [2024-06-23 18:09:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42599.1). Total num frames: 7995523072. Throughput: 0: 42602.5. Samples: 7995646800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 18:09:13,963][15401] Updated weights for policy 0, policy_version 488010 (0.0032) [2024-06-23 18:09:17,243][15401] Updated weights for policy 0, policy_version 488020 (0.0030) [2024-06-23 18:09:18,389][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7995752448. Throughput: 0: 42806.3. Samples: 7995905920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 18:09:21,684][15401] Updated weights for policy 0, policy_version 488030 (0.0029) [2024-06-23 18:09:23,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 7995965440. Throughput: 0: 42678.5. Samples: 7996038700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 18:09:24,908][15401] Updated weights for policy 0, policy_version 488040 (0.0043) [2024-06-23 18:09:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 7996178432. Throughput: 0: 42827.3. Samples: 7996297120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:28,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 18:09:29,422][15401] Updated weights for policy 0, policy_version 488050 (0.0031) [2024-06-23 18:09:30,051][15349] Signal inference workers to stop experience collection... (118550 times) [2024-06-23 18:09:30,051][15349] Signal inference workers to resume experience collection... (118550 times) [2024-06-23 18:09:30,094][15401] InferenceWorker_p0-w0: stopping experience collection (118550 times) [2024-06-23 18:09:30,094][15401] InferenceWorker_p0-w0: resuming experience collection (118550 times) [2024-06-23 18:09:32,453][15401] Updated weights for policy 0, policy_version 488060 (0.0039) [2024-06-23 18:09:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 7996391424. Throughput: 0: 43075.9. Samples: 7996554100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:33,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 18:09:37,065][15401] Updated weights for policy 0, policy_version 488070 (0.0023) [2024-06-23 18:09:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 7996620800. Throughput: 0: 42849.1. Samples: 7996682780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 18:09:40,551][15401] Updated weights for policy 0, policy_version 488080 (0.0044) [2024-06-23 18:09:43,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42597.5). Total num frames: 7996801024. Throughput: 0: 42972.9. Samples: 7996938000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:43,397][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 18:09:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488086_7996801024.pth... [2024-06-23 18:09:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487462_7986577408.pth [2024-06-23 18:09:44,706][15401] Updated weights for policy 0, policy_version 488090 (0.0035) [2024-06-23 18:09:48,105][15401] Updated weights for policy 0, policy_version 488100 (0.0039) [2024-06-23 18:09:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 7997046784. Throughput: 0: 42787.6. Samples: 7997189460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 18:09:52,496][15401] Updated weights for policy 0, policy_version 488110 (0.0026) [2024-06-23 18:09:53,389][15132] Fps is (10 sec: 45905.6, 60 sec: 43144.8, 300 sec: 42765.1). Total num frames: 7997259776. Throughput: 0: 42877.8. Samples: 7997325420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 18:09:55,704][15401] Updated weights for policy 0, policy_version 488120 (0.0035) [2024-06-23 18:09:58,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 7997423616. Throughput: 0: 42746.4. Samples: 7997570380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:09:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 18:09:59,967][15401] Updated weights for policy 0, policy_version 488130 (0.0029) [2024-06-23 18:10:03,204][15401] Updated weights for policy 0, policy_version 488140 (0.0031) [2024-06-23 18:10:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 7997685760. Throughput: 0: 42702.2. Samples: 7997827520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:10:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 18:10:07,685][15401] Updated weights for policy 0, policy_version 488150 (0.0030) [2024-06-23 18:10:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 7997882368. Throughput: 0: 42778.7. Samples: 7997963740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:10:08,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 18:10:10,889][15401] Updated weights for policy 0, policy_version 488160 (0.0029) [2024-06-23 18:10:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.5, 300 sec: 42598.8). Total num frames: 7998062592. Throughput: 0: 42593.8. Samples: 7998213840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:10:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 18:10:15,365][15401] Updated weights for policy 0, policy_version 488170 (0.0042) [2024-06-23 18:10:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 7998324736. Throughput: 0: 42522.0. Samples: 7998467580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:10:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 18:10:18,485][15401] Updated weights for policy 0, policy_version 488180 (0.0037) [2024-06-23 18:10:22,833][15401] Updated weights for policy 0, policy_version 488190 (0.0038) [2024-06-23 18:10:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 7998504960. Throughput: 0: 42692.4. Samples: 7998603940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-23 18:10:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 18:10:26,221][15401] Updated weights for policy 0, policy_version 488200 (0.0037) [2024-06-23 18:10:28,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 7998717952. Throughput: 0: 42616.4. Samples: 7998855460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 18:10:30,863][15401] Updated weights for policy 0, policy_version 488210 (0.0035) [2024-06-23 18:10:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 7998947328. Throughput: 0: 42722.6. Samples: 7999111980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 18:10:33,974][15401] Updated weights for policy 0, policy_version 488220 (0.0038) [2024-06-23 18:10:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 7999143936. Throughput: 0: 42654.5. Samples: 7999244880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 18:10:38,654][15401] Updated weights for policy 0, policy_version 488230 (0.0044) [2024-06-23 18:10:38,905][15349] Signal inference workers to stop experience collection... (118600 times) [2024-06-23 18:10:38,936][15401] InferenceWorker_p0-w0: stopping experience collection (118600 times) [2024-06-23 18:10:38,959][15349] Signal inference workers to resume experience collection... (118600 times) [2024-06-23 18:10:38,960][15401] InferenceWorker_p0-w0: resuming experience collection (118600 times) [2024-06-23 18:10:41,642][15401] Updated weights for policy 0, policy_version 488240 (0.0027) [2024-06-23 18:10:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42603.0, 300 sec: 42598.8). Total num frames: 7999356928. Throughput: 0: 42745.0. Samples: 7999493900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-23 18:10:46,261][15401] Updated weights for policy 0, policy_version 488250 (0.0025) [2024-06-23 18:10:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 7999586304. Throughput: 0: 42760.9. Samples: 7999751760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 18:10:49,401][15401] Updated weights for policy 0, policy_version 488260 (0.0039) [2024-06-23 18:10:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 7999782912. Throughput: 0: 42746.8. Samples: 7999887340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:53,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 18:10:53,871][15401] Updated weights for policy 0, policy_version 488270 (0.0032) [2024-06-23 18:10:56,923][15401] Updated weights for policy 0, policy_version 488280 (0.0037) [2024-06-23 18:10:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 7999995904. Throughput: 0: 42703.4. Samples: 8000135500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:10:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 18:11:01,566][15401] Updated weights for policy 0, policy_version 488290 (0.0031) [2024-06-23 18:11:03,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8000241664. Throughput: 0: 42804.6. Samples: 8000393800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:03,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 18:11:04,707][15401] Updated weights for policy 0, policy_version 488300 (0.0037) [2024-06-23 18:11:08,394][15132] Fps is (10 sec: 44215.1, 60 sec: 42594.8, 300 sec: 42708.7). Total num frames: 8000438272. Throughput: 0: 42729.0. Samples: 8000526960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:08,395][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 18:11:09,074][15401] Updated weights for policy 0, policy_version 488310 (0.0027) [2024-06-23 18:11:12,067][15401] Updated weights for policy 0, policy_version 488320 (0.0040) [2024-06-23 18:11:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8000651264. Throughput: 0: 42613.2. Samples: 8000773060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:13,394][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 18:11:16,809][15401] Updated weights for policy 0, policy_version 488330 (0.0032) [2024-06-23 18:11:18,389][15132] Fps is (10 sec: 44259.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8000880640. Throughput: 0: 42757.5. Samples: 8001036060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 18:11:20,212][15401] Updated weights for policy 0, policy_version 488340 (0.0045) [2024-06-23 18:11:23,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8001093632. Throughput: 0: 42590.1. Samples: 8001161540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:23,393][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 18:11:24,443][15401] Updated weights for policy 0, policy_version 488350 (0.0033) [2024-06-23 18:11:27,841][15401] Updated weights for policy 0, policy_version 488360 (0.0041) [2024-06-23 18:11:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 8001306624. Throughput: 0: 42574.4. Samples: 8001409760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 18:11:32,065][15401] Updated weights for policy 0, policy_version 488370 (0.0029) [2024-06-23 18:11:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8001503232. Throughput: 0: 42769.6. Samples: 8001676400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 18:11:35,368][15401] Updated weights for policy 0, policy_version 488380 (0.0035) [2024-06-23 18:11:38,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8001732608. Throughput: 0: 42457.7. Samples: 8001797940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 18:11:39,581][15401] Updated weights for policy 0, policy_version 488390 (0.0042) [2024-06-23 18:11:42,922][15401] Updated weights for policy 0, policy_version 488400 (0.0037) [2024-06-23 18:11:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8001945600. Throughput: 0: 42644.5. Samples: 8002054500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 18:11:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488400_8001945600.pth... [2024-06-23 18:11:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000487774_7991689216.pth [2024-06-23 18:11:47,135][15401] Updated weights for policy 0, policy_version 488410 (0.0033) [2024-06-23 18:11:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8002142208. Throughput: 0: 42743.6. Samples: 8002317260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:48,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 18:11:48,812][15349] Signal inference workers to stop experience collection... (118650 times) [2024-06-23 18:11:48,859][15401] InferenceWorker_p0-w0: stopping experience collection (118650 times) [2024-06-23 18:11:48,861][15349] Signal inference workers to resume experience collection... (118650 times) [2024-06-23 18:11:48,874][15401] InferenceWorker_p0-w0: resuming experience collection (118650 times) [2024-06-23 18:11:50,910][15401] Updated weights for policy 0, policy_version 488420 (0.0030) [2024-06-23 18:11:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8002355200. Throughput: 0: 42473.9. Samples: 8002438080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:11:53,400][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:11:54,650][15401] Updated weights for policy 0, policy_version 488430 (0.0035) [2024-06-23 18:11:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8002584576. Throughput: 0: 42758.8. Samples: 8002697200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:11:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 18:11:58,875][15401] Updated weights for policy 0, policy_version 488440 (0.0024) [2024-06-23 18:12:02,579][15401] Updated weights for policy 0, policy_version 488450 (0.0036) [2024-06-23 18:12:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8002797568. Throughput: 0: 42666.5. Samples: 8002956060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:03,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 18:12:06,310][15401] Updated weights for policy 0, policy_version 488460 (0.0039) [2024-06-23 18:12:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42602.0, 300 sec: 42709.5). Total num frames: 8002994176. Throughput: 0: 42716.2. Samples: 8003083660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 18:12:10,231][15401] Updated weights for policy 0, policy_version 488470 (0.0041) [2024-06-23 18:12:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8003239936. Throughput: 0: 42836.6. Samples: 8003337400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 18:12:13,733][15401] Updated weights for policy 0, policy_version 488480 (0.0034) [2024-06-23 18:12:18,071][15401] Updated weights for policy 0, policy_version 488490 (0.0024) [2024-06-23 18:12:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8003436544. Throughput: 0: 42798.8. Samples: 8003602340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:12:21,500][15401] Updated weights for policy 0, policy_version 488500 (0.0028) [2024-06-23 18:12:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8003649536. Throughput: 0: 42867.1. Samples: 8003726960. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 18:12:25,583][15401] Updated weights for policy 0, policy_version 488510 (0.0029) [2024-06-23 18:12:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 8003878912. Throughput: 0: 42911.6. Samples: 8003985520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 18:12:29,026][15401] Updated weights for policy 0, policy_version 488520 (0.0038) [2024-06-23 18:12:33,328][15401] Updated weights for policy 0, policy_version 488530 (0.0048) [2024-06-23 18:12:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8004075520. Throughput: 0: 42986.7. Samples: 8004251760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:33,401][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 18:12:36,435][15401] Updated weights for policy 0, policy_version 488540 (0.0033) [2024-06-23 18:12:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8004288512. Throughput: 0: 42991.6. Samples: 8004372700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 18:12:40,785][15401] Updated weights for policy 0, policy_version 488550 (0.0025) [2024-06-23 18:12:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8004517888. Throughput: 0: 42929.8. Samples: 8004629040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:43,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 18:12:44,135][15401] Updated weights for policy 0, policy_version 488560 (0.0039) [2024-06-23 18:12:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8004714496. Throughput: 0: 43038.5. Samples: 8004892780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:12:48,416][15401] Updated weights for policy 0, policy_version 488570 (0.0043) [2024-06-23 18:12:51,877][15401] Updated weights for policy 0, policy_version 488580 (0.0034) [2024-06-23 18:12:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 8004943872. Throughput: 0: 42888.3. Samples: 8005013740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:53,392][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 18:12:55,885][15401] Updated weights for policy 0, policy_version 488590 (0.0030) [2024-06-23 18:12:58,390][15132] Fps is (10 sec: 45873.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8005173248. Throughput: 0: 43134.6. Samples: 8005278460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:12:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 18:12:59,637][15401] Updated weights for policy 0, policy_version 488600 (0.0026) [2024-06-23 18:13:03,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 8005369856. Throughput: 0: 43093.5. Samples: 8005541540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:13:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 18:13:03,424][15401] Updated weights for policy 0, policy_version 488610 (0.0028) [2024-06-23 18:13:07,247][15401] Updated weights for policy 0, policy_version 488620 (0.0039) [2024-06-23 18:13:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8005599232. Throughput: 0: 43172.1. Samples: 8005669700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:13:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 18:13:11,073][15401] Updated weights for policy 0, policy_version 488630 (0.0027) [2024-06-23 18:13:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8005812224. Throughput: 0: 43205.4. Samples: 8005929760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:13:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 18:13:14,899][15401] Updated weights for policy 0, policy_version 488640 (0.0025) [2024-06-23 18:13:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8006025216. Throughput: 0: 43061.1. Samples: 8006189400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 18:13:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 18:13:18,491][15401] Updated weights for policy 0, policy_version 488650 (0.0032) [2024-06-23 18:13:22,529][15401] Updated weights for policy 0, policy_version 488660 (0.0036) [2024-06-23 18:13:23,239][15349] Signal inference workers to stop experience collection... (118700 times) [2024-06-23 18:13:23,241][15349] Signal inference workers to resume experience collection... (118700 times) [2024-06-23 18:13:23,284][15401] InferenceWorker_p0-w0: stopping experience collection (118700 times) [2024-06-23 18:13:23,284][15401] InferenceWorker_p0-w0: resuming experience collection (118700 times) [2024-06-23 18:13:23,390][15132] Fps is (10 sec: 44235.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8006254592. Throughput: 0: 43197.2. Samples: 8006316580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 18:13:26,034][15401] Updated weights for policy 0, policy_version 488670 (0.0034) [2024-06-23 18:13:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 8006451200. Throughput: 0: 43231.4. Samples: 8006574460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 18:13:30,383][15401] Updated weights for policy 0, policy_version 488680 (0.0040) [2024-06-23 18:13:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 8006664192. Throughput: 0: 43055.5. Samples: 8006830280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 18:13:33,847][15401] Updated weights for policy 0, policy_version 488690 (0.0029) [2024-06-23 18:13:37,903][15401] Updated weights for policy 0, policy_version 488700 (0.0038) [2024-06-23 18:13:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8006877184. Throughput: 0: 43264.1. Samples: 8006960520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:38,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-23 18:13:41,640][15401] Updated weights for policy 0, policy_version 488710 (0.0033) [2024-06-23 18:13:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8007106560. Throughput: 0: 43078.7. Samples: 8007217000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 18:13:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488715_8007106560.pth... [2024-06-23 18:13:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488086_7996801024.pth [2024-06-23 18:13:45,458][15401] Updated weights for policy 0, policy_version 488720 (0.0038) [2024-06-23 18:13:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8007303168. Throughput: 0: 42948.7. Samples: 8007474240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:48,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 18:13:49,363][15401] Updated weights for policy 0, policy_version 488730 (0.0040) [2024-06-23 18:13:52,961][15401] Updated weights for policy 0, policy_version 488740 (0.0036) [2024-06-23 18:13:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.0, 300 sec: 42931.6). Total num frames: 8007516160. Throughput: 0: 42857.5. Samples: 8007598300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 18:13:56,993][15401] Updated weights for policy 0, policy_version 488750 (0.0038) [2024-06-23 18:13:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8007745536. Throughput: 0: 42802.5. Samples: 8007855880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:13:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 18:14:00,447][15401] Updated weights for policy 0, policy_version 488760 (0.0033) [2024-06-23 18:14:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.3, 300 sec: 42821.5). Total num frames: 8007942144. Throughput: 0: 42792.3. Samples: 8008115060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:03,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 18:14:04,609][15401] Updated weights for policy 0, policy_version 488770 (0.0037) [2024-06-23 18:14:08,322][15401] Updated weights for policy 0, policy_version 488780 (0.0032) [2024-06-23 18:14:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8008171520. Throughput: 0: 42760.6. Samples: 8008240800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 18:14:12,295][15401] Updated weights for policy 0, policy_version 488790 (0.0025) [2024-06-23 18:14:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8008400896. Throughput: 0: 42832.0. Samples: 8008501900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 18:14:15,840][15401] Updated weights for policy 0, policy_version 488800 (0.0033) [2024-06-23 18:14:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8008581120. Throughput: 0: 42767.1. Samples: 8008754800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 18:14:19,910][15401] Updated weights for policy 0, policy_version 488810 (0.0039) [2024-06-23 18:14:23,377][15401] Updated weights for policy 0, policy_version 488820 (0.0033) [2024-06-23 18:14:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8008826880. Throughput: 0: 42677.3. Samples: 8008881000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 18:14:27,802][15401] Updated weights for policy 0, policy_version 488830 (0.0034) [2024-06-23 18:14:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8009023488. Throughput: 0: 42773.5. Samples: 8009141800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 18:14:30,915][15401] Updated weights for policy 0, policy_version 488840 (0.0035) [2024-06-23 18:14:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8009236480. Throughput: 0: 42727.5. Samples: 8009397080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:33,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 18:14:35,419][15401] Updated weights for policy 0, policy_version 488850 (0.0041) [2024-06-23 18:14:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42932.6). Total num frames: 8009465856. Throughput: 0: 42752.7. Samples: 8009522160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 18:14:38,488][15401] Updated weights for policy 0, policy_version 488860 (0.0047) [2024-06-23 18:14:43,101][15401] Updated weights for policy 0, policy_version 488870 (0.0036) [2024-06-23 18:14:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8009662464. Throughput: 0: 42943.6. Samples: 8009788340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 18:14:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 18:14:46,114][15401] Updated weights for policy 0, policy_version 488880 (0.0030) [2024-06-23 18:14:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8009891840. Throughput: 0: 42835.7. Samples: 8010042660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:14:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 18:14:50,695][15401] Updated weights for policy 0, policy_version 488890 (0.0032) [2024-06-23 18:14:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8010104832. Throughput: 0: 42893.2. Samples: 8010171000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:14:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:14:53,743][15401] Updated weights for policy 0, policy_version 488900 (0.0029) [2024-06-23 18:14:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 8010285056. Throughput: 0: 42924.2. Samples: 8010433480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:14:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 18:14:58,409][15401] Updated weights for policy 0, policy_version 488910 (0.0030) [2024-06-23 18:15:01,274][15401] Updated weights for policy 0, policy_version 488920 (0.0044) [2024-06-23 18:15:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8010514432. Throughput: 0: 42948.8. Samples: 8010687500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 18:15:06,046][15401] Updated weights for policy 0, policy_version 488930 (0.0036) [2024-06-23 18:15:08,392][15132] Fps is (10 sec: 47501.5, 60 sec: 43142.8, 300 sec: 43042.4). Total num frames: 8010760192. Throughput: 0: 43021.7. Samples: 8010817080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:08,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 18:15:09,251][15401] Updated weights for policy 0, policy_version 488940 (0.0029) [2024-06-23 18:15:12,174][15349] Signal inference workers to stop experience collection... (118750 times) [2024-06-23 18:15:12,174][15349] Signal inference workers to resume experience collection... (118750 times) [2024-06-23 18:15:12,222][15401] InferenceWorker_p0-w0: stopping experience collection (118750 times) [2024-06-23 18:15:12,222][15401] InferenceWorker_p0-w0: resuming experience collection (118750 times) [2024-06-23 18:15:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8010940416. Throughput: 0: 43107.0. Samples: 8011081620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:13,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 18:15:13,850][15401] Updated weights for policy 0, policy_version 488950 (0.0033) [2024-06-23 18:15:16,684][15401] Updated weights for policy 0, policy_version 488960 (0.0024) [2024-06-23 18:15:18,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43417.4, 300 sec: 42987.2). Total num frames: 8011186176. Throughput: 0: 43058.2. Samples: 8011334600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 18:15:21,263][15401] Updated weights for policy 0, policy_version 488970 (0.0039) [2024-06-23 18:15:23,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8011415552. Throughput: 0: 43299.9. Samples: 8011470660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 18:15:24,227][15401] Updated weights for policy 0, policy_version 488980 (0.0033) [2024-06-23 18:15:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8011595776. Throughput: 0: 43176.9. Samples: 8011731300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 18:15:28,749][15401] Updated weights for policy 0, policy_version 488990 (0.0035) [2024-06-23 18:15:31,684][15401] Updated weights for policy 0, policy_version 489000 (0.0042) [2024-06-23 18:15:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 8011825152. Throughput: 0: 43195.0. Samples: 8011986440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:33,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 18:15:36,350][15401] Updated weights for policy 0, policy_version 489010 (0.0032) [2024-06-23 18:15:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 8012054528. Throughput: 0: 43376.5. Samples: 8012122940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 18:15:39,289][15401] Updated weights for policy 0, policy_version 489020 (0.0029) [2024-06-23 18:15:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8012234752. Throughput: 0: 43222.5. Samples: 8012378500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 18:15:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489028_8012234752.pth... [2024-06-23 18:15:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488400_8001945600.pth [2024-06-23 18:15:43,870][15401] Updated weights for policy 0, policy_version 489030 (0.0025) [2024-06-23 18:15:47,008][15401] Updated weights for policy 0, policy_version 489040 (0.0034) [2024-06-23 18:15:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 43042.3). Total num frames: 8012480512. Throughput: 0: 43169.6. Samples: 8012630240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:48,393][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 18:15:51,435][15401] Updated weights for policy 0, policy_version 489050 (0.0029) [2024-06-23 18:15:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8012693504. Throughput: 0: 43289.9. Samples: 8012765020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:53,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 18:15:54,746][15401] Updated weights for policy 0, policy_version 489060 (0.0032) [2024-06-23 18:15:58,390][15132] Fps is (10 sec: 39331.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8012873728. Throughput: 0: 43166.3. Samples: 8013024100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:15:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 18:15:58,908][15401] Updated weights for policy 0, policy_version 489070 (0.0031) [2024-06-23 18:16:02,110][15401] Updated weights for policy 0, policy_version 489080 (0.0033) [2024-06-23 18:16:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 43043.4). Total num frames: 8013135872. Throughput: 0: 43076.0. Samples: 8013273020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:16:03,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 18:16:06,602][15401] Updated weights for policy 0, policy_version 489090 (0.0033) [2024-06-23 18:16:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 8013332480. Throughput: 0: 43097.8. Samples: 8013410060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:16:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 18:16:09,980][15401] Updated weights for policy 0, policy_version 489100 (0.0031) [2024-06-23 18:16:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8013529088. Throughput: 0: 42881.3. Samples: 8013660960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-23 18:16:13,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 18:16:14,486][15401] Updated weights for policy 0, policy_version 489110 (0.0036) [2024-06-23 18:16:17,612][15401] Updated weights for policy 0, policy_version 489120 (0.0033) [2024-06-23 18:16:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 8013758464. Throughput: 0: 42728.5. Samples: 8013909220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:18,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 18:16:22,053][15401] Updated weights for policy 0, policy_version 489130 (0.0032) [2024-06-23 18:16:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8013955072. Throughput: 0: 42783.1. Samples: 8014048180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 18:16:25,249][15401] Updated weights for policy 0, policy_version 489140 (0.0043) [2024-06-23 18:16:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 8014168064. Throughput: 0: 42711.6. Samples: 8014300520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 18:16:29,751][15401] Updated weights for policy 0, policy_version 489150 (0.0030) [2024-06-23 18:16:32,818][15401] Updated weights for policy 0, policy_version 489160 (0.0047) [2024-06-23 18:16:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8014413824. Throughput: 0: 42701.4. Samples: 8014551700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 18:16:37,459][15401] Updated weights for policy 0, policy_version 489170 (0.0049) [2024-06-23 18:16:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8014610432. Throughput: 0: 42751.1. Samples: 8014688820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:38,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 18:16:40,153][15349] Signal inference workers to stop experience collection... (118800 times) [2024-06-23 18:16:40,153][15349] Signal inference workers to resume experience collection... (118800 times) [2024-06-23 18:16:40,191][15401] InferenceWorker_p0-w0: stopping experience collection (118800 times) [2024-06-23 18:16:40,191][15401] InferenceWorker_p0-w0: resuming experience collection (118800 times) [2024-06-23 18:16:40,301][15401] Updated weights for policy 0, policy_version 489180 (0.0029) [2024-06-23 18:16:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8014807040. Throughput: 0: 42610.6. Samples: 8014941580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:43,396][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 18:16:45,159][15401] Updated weights for policy 0, policy_version 489190 (0.0033) [2024-06-23 18:16:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 8015020032. Throughput: 0: 42653.8. Samples: 8015192440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 18:16:48,569][15401] Updated weights for policy 0, policy_version 489200 (0.0037) [2024-06-23 18:16:52,757][15401] Updated weights for policy 0, policy_version 489210 (0.0033) [2024-06-23 18:16:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8015233024. Throughput: 0: 42567.5. Samples: 8015325600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 18:16:56,402][15401] Updated weights for policy 0, policy_version 489220 (0.0029) [2024-06-23 18:16:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8015462400. Throughput: 0: 42528.3. Samples: 8015574740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:16:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 18:17:00,588][15401] Updated weights for policy 0, policy_version 489230 (0.0032) [2024-06-23 18:17:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 8015675392. Throughput: 0: 42664.4. Samples: 8015829120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 18:17:04,137][15401] Updated weights for policy 0, policy_version 489240 (0.0028) [2024-06-23 18:17:08,192][15401] Updated weights for policy 0, policy_version 489250 (0.0037) [2024-06-23 18:17:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8015872000. Throughput: 0: 42619.1. Samples: 8015966040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 18:17:11,733][15401] Updated weights for policy 0, policy_version 489260 (0.0033) [2024-06-23 18:17:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8016084992. Throughput: 0: 42553.7. Samples: 8016215440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 18:17:15,879][15401] Updated weights for policy 0, policy_version 489270 (0.0029) [2024-06-23 18:17:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8016314368. Throughput: 0: 42617.8. Samples: 8016469500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 18:17:19,337][15401] Updated weights for policy 0, policy_version 489280 (0.0038) [2024-06-23 18:17:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8016494592. Throughput: 0: 42469.3. Samples: 8016599940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 18:17:23,673][15401] Updated weights for policy 0, policy_version 489290 (0.0046) [2024-06-23 18:17:27,465][15401] Updated weights for policy 0, policy_version 489300 (0.0046) [2024-06-23 18:17:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 8016723968. Throughput: 0: 42550.9. Samples: 8016856360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:28,396][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 18:17:31,344][15401] Updated weights for policy 0, policy_version 489310 (0.0034) [2024-06-23 18:17:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 8016953344. Throughput: 0: 42446.7. Samples: 8017102540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 18:17:35,205][15401] Updated weights for policy 0, policy_version 489320 (0.0023) [2024-06-23 18:17:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8017149952. Throughput: 0: 42285.3. Samples: 8017228440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-23 18:17:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 18:17:39,215][15401] Updated weights for policy 0, policy_version 489330 (0.0035) [2024-06-23 18:17:42,786][15401] Updated weights for policy 0, policy_version 489340 (0.0048) [2024-06-23 18:17:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8017362944. Throughput: 0: 42455.3. Samples: 8017485220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:17:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 18:17:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489341_8017362944.pth... [2024-06-23 18:17:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000488715_8007106560.pth [2024-06-23 18:17:46,742][15401] Updated weights for policy 0, policy_version 489350 (0.0031) [2024-06-23 18:17:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8017592320. Throughput: 0: 42592.5. Samples: 8017745780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:17:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 18:17:50,299][15401] Updated weights for policy 0, policy_version 489360 (0.0043) [2024-06-23 18:17:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8017805312. Throughput: 0: 42411.9. Samples: 8017874680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:17:53,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 18:17:54,383][15401] Updated weights for policy 0, policy_version 489370 (0.0033) [2024-06-23 18:17:57,810][15401] Updated weights for policy 0, policy_version 489380 (0.0037) [2024-06-23 18:17:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8018018304. Throughput: 0: 42581.8. Samples: 8018131620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:17:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 18:18:01,915][15401] Updated weights for policy 0, policy_version 489390 (0.0040) [2024-06-23 18:18:03,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8018231296. Throughput: 0: 42646.1. Samples: 8018388580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 18:18:05,464][15401] Updated weights for policy 0, policy_version 489400 (0.0029) [2024-06-23 18:18:08,137][15349] Signal inference workers to stop experience collection... (118850 times) [2024-06-23 18:18:08,187][15401] InferenceWorker_p0-w0: stopping experience collection (118850 times) [2024-06-23 18:18:08,193][15349] Signal inference workers to resume experience collection... (118850 times) [2024-06-23 18:18:08,210][15401] InferenceWorker_p0-w0: resuming experience collection (118850 times) [2024-06-23 18:18:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8018460672. Throughput: 0: 42508.5. Samples: 8018512820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:08,392][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 18:18:09,775][15401] Updated weights for policy 0, policy_version 489410 (0.0028) [2024-06-23 18:18:13,312][15401] Updated weights for policy 0, policy_version 489420 (0.0045) [2024-06-23 18:18:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 8018657280. Throughput: 0: 42503.9. Samples: 8018769040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 18:18:17,313][15401] Updated weights for policy 0, policy_version 489430 (0.0030) [2024-06-23 18:18:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8018853888. Throughput: 0: 42783.2. Samples: 8019027780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 18:18:21,158][15401] Updated weights for policy 0, policy_version 489440 (0.0033) [2024-06-23 18:18:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8019066880. Throughput: 0: 42770.7. Samples: 8019153120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:23,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 18:18:24,853][15401] Updated weights for policy 0, policy_version 489450 (0.0031) [2024-06-23 18:18:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8019279872. Throughput: 0: 42705.7. Samples: 8019406980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:28,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 18:18:28,810][15401] Updated weights for policy 0, policy_version 489460 (0.0028) [2024-06-23 18:18:32,453][15401] Updated weights for policy 0, policy_version 489470 (0.0046) [2024-06-23 18:18:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 8019492864. Throughput: 0: 42603.1. Samples: 8019663020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:33,392][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 18:18:36,346][15401] Updated weights for policy 0, policy_version 489480 (0.0024) [2024-06-23 18:18:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8019705856. Throughput: 0: 42612.9. Samples: 8019792160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 18:18:40,326][15401] Updated weights for policy 0, policy_version 489490 (0.0036) [2024-06-23 18:18:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8019935232. Throughput: 0: 42694.0. Samples: 8020052840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:43,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 18:18:43,777][15401] Updated weights for policy 0, policy_version 489500 (0.0047) [2024-06-23 18:18:47,769][15401] Updated weights for policy 0, policy_version 489510 (0.0043) [2024-06-23 18:18:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8020148224. Throughput: 0: 42585.0. Samples: 8020304900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 18:18:51,430][15401] Updated weights for policy 0, policy_version 489520 (0.0039) [2024-06-23 18:18:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8020361216. Throughput: 0: 42753.3. Samples: 8020436720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 18:18:55,339][15401] Updated weights for policy 0, policy_version 489530 (0.0042) [2024-06-23 18:18:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8020557824. Throughput: 0: 42760.4. Samples: 8020693260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:18:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 18:18:59,038][15401] Updated weights for policy 0, policy_version 489540 (0.0023) [2024-06-23 18:19:02,843][15401] Updated weights for policy 0, policy_version 489550 (0.0031) [2024-06-23 18:19:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 8020803584. Throughput: 0: 42639.9. Samples: 8020946580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:19:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 18:19:06,644][15401] Updated weights for policy 0, policy_version 489560 (0.0032) [2024-06-23 18:19:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8021000192. Throughput: 0: 42817.2. Samples: 8021079900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 18:19:10,839][15401] Updated weights for policy 0, policy_version 489570 (0.0033) [2024-06-23 18:19:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8021213184. Throughput: 0: 42856.9. Samples: 8021335540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 18:19:14,602][15401] Updated weights for policy 0, policy_version 489580 (0.0034) [2024-06-23 18:19:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8021426176. Throughput: 0: 42841.9. Samples: 8021590800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 18:19:18,489][15401] Updated weights for policy 0, policy_version 489590 (0.0030) [2024-06-23 18:19:22,292][15401] Updated weights for policy 0, policy_version 489600 (0.0024) [2024-06-23 18:19:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8021639168. Throughput: 0: 42772.0. Samples: 8021716900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 18:19:26,162][15401] Updated weights for policy 0, policy_version 489610 (0.0044) [2024-06-23 18:19:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8021835776. Throughput: 0: 42660.9. Samples: 8021972580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 18:19:29,860][15401] Updated weights for policy 0, policy_version 489620 (0.0040) [2024-06-23 18:19:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8022065152. Throughput: 0: 42735.5. Samples: 8022228000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:33,394][15132] Avg episode reward: [(0, '0.215')] [2024-06-23 18:19:34,091][15401] Updated weights for policy 0, policy_version 489630 (0.0030) [2024-06-23 18:19:37,887][15401] Updated weights for policy 0, policy_version 489640 (0.0026) [2024-06-23 18:19:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8022294528. Throughput: 0: 42644.9. Samples: 8022355740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:38,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 18:19:41,584][15401] Updated weights for policy 0, policy_version 489650 (0.0040) [2024-06-23 18:19:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8022474752. Throughput: 0: 42622.6. Samples: 8022611280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 18:19:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489654_8022491136.pth... [2024-06-23 18:19:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489028_8012234752.pth [2024-06-23 18:19:45,564][15401] Updated weights for policy 0, policy_version 489660 (0.0041) [2024-06-23 18:19:48,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 8022704128. Throughput: 0: 42722.4. Samples: 8022869360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:48,397][15132] Avg episode reward: [(0, '0.846')] [2024-06-23 18:19:49,074][15401] Updated weights for policy 0, policy_version 489670 (0.0029) [2024-06-23 18:19:50,662][15349] Signal inference workers to stop experience collection... (118900 times) [2024-06-23 18:19:50,715][15401] InferenceWorker_p0-w0: stopping experience collection (118900 times) [2024-06-23 18:19:50,782][15349] Signal inference workers to resume experience collection... (118900 times) [2024-06-23 18:19:50,782][15401] InferenceWorker_p0-w0: resuming experience collection (118900 times) [2024-06-23 18:19:53,384][15401] Updated weights for policy 0, policy_version 489680 (0.0035) [2024-06-23 18:19:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8022917120. Throughput: 0: 42666.8. Samples: 8022999900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 18:19:57,098][15401] Updated weights for policy 0, policy_version 489690 (0.0025) [2024-06-23 18:19:58,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8023113728. Throughput: 0: 42543.5. Samples: 8023250000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:19:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 18:20:01,086][15401] Updated weights for policy 0, policy_version 489700 (0.0038) [2024-06-23 18:20:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 8023343104. Throughput: 0: 42586.7. Samples: 8023507200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 18:20:04,844][15401] Updated weights for policy 0, policy_version 489710 (0.0020) [2024-06-23 18:20:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8023539712. Throughput: 0: 42707.6. Samples: 8023638740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 18:20:08,833][15401] Updated weights for policy 0, policy_version 489720 (0.0029) [2024-06-23 18:20:12,462][15401] Updated weights for policy 0, policy_version 489730 (0.0031) [2024-06-23 18:20:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8023752704. Throughput: 0: 42604.7. Samples: 8023889800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 18:20:16,268][15401] Updated weights for policy 0, policy_version 489740 (0.0038) [2024-06-23 18:20:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8023982080. Throughput: 0: 42528.3. Samples: 8024141780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 18:20:20,166][15401] Updated weights for policy 0, policy_version 489750 (0.0032) [2024-06-23 18:20:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8024195072. Throughput: 0: 42651.9. Samples: 8024275080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:23,399][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 18:20:23,860][15401] Updated weights for policy 0, policy_version 489760 (0.0037) [2024-06-23 18:20:27,722][15401] Updated weights for policy 0, policy_version 489770 (0.0029) [2024-06-23 18:20:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8024408064. Throughput: 0: 42660.0. Samples: 8024530980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 18:20:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 18:20:31,611][15401] Updated weights for policy 0, policy_version 489780 (0.0030) [2024-06-23 18:20:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8024637440. Throughput: 0: 42526.0. Samples: 8024782760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 18:20:35,374][15401] Updated weights for policy 0, policy_version 489790 (0.0043) [2024-06-23 18:20:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8024850432. Throughput: 0: 42583.2. Samples: 8024916140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 18:20:39,089][15401] Updated weights for policy 0, policy_version 489800 (0.0035) [2024-06-23 18:20:43,044][15401] Updated weights for policy 0, policy_version 489810 (0.0043) [2024-06-23 18:20:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 8025047040. Throughput: 0: 42740.9. Samples: 8025173340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:43,404][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 18:20:46,739][15401] Updated weights for policy 0, policy_version 489820 (0.0028) [2024-06-23 18:20:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43147.4, 300 sec: 42709.1). Total num frames: 8025292800. Throughput: 0: 42659.5. Samples: 8025426980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:48,401][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 18:20:50,959][15401] Updated weights for policy 0, policy_version 489830 (0.0034) [2024-06-23 18:20:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8025473024. Throughput: 0: 42652.3. Samples: 8025558100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 18:20:54,631][15401] Updated weights for policy 0, policy_version 489840 (0.0039) [2024-06-23 18:20:58,390][15132] Fps is (10 sec: 39328.3, 60 sec: 42871.0, 300 sec: 42542.8). Total num frames: 8025686016. Throughput: 0: 42665.2. Samples: 8025809760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:20:58,391][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 18:20:58,590][15401] Updated weights for policy 0, policy_version 489850 (0.0031) [2024-06-23 18:21:02,385][15401] Updated weights for policy 0, policy_version 489860 (0.0037) [2024-06-23 18:21:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8025915392. Throughput: 0: 42666.3. Samples: 8026061760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 18:21:06,323][15401] Updated weights for policy 0, policy_version 489870 (0.0033) [2024-06-23 18:21:08,390][15132] Fps is (10 sec: 40962.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8026095616. Throughput: 0: 42499.6. Samples: 8026187560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:21:10,210][15401] Updated weights for policy 0, policy_version 489880 (0.0046) [2024-06-23 18:21:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 8026341376. Throughput: 0: 42417.9. Samples: 8026439780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 18:21:14,050][15401] Updated weights for policy 0, policy_version 489890 (0.0033) [2024-06-23 18:21:17,845][15401] Updated weights for policy 0, policy_version 489900 (0.0037) [2024-06-23 18:21:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8026521600. Throughput: 0: 42689.0. Samples: 8026703760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:18,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 18:21:21,711][15401] Updated weights for policy 0, policy_version 489910 (0.0029) [2024-06-23 18:21:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8026734592. Throughput: 0: 42453.8. Samples: 8026826560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 18:21:25,806][15401] Updated weights for policy 0, policy_version 489920 (0.0031) [2024-06-23 18:21:26,345][15349] Signal inference workers to stop experience collection... (118950 times) [2024-06-23 18:21:26,403][15349] Signal inference workers to resume experience collection... (118950 times) [2024-06-23 18:21:26,403][15401] InferenceWorker_p0-w0: stopping experience collection (118950 times) [2024-06-23 18:21:26,423][15401] InferenceWorker_p0-w0: resuming experience collection (118950 times) [2024-06-23 18:21:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8026980352. Throughput: 0: 42204.5. Samples: 8027072540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 18:21:29,558][15401] Updated weights for policy 0, policy_version 489930 (0.0022) [2024-06-23 18:21:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8027176960. Throughput: 0: 42432.4. Samples: 8027336340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 18:21:33,391][15401] Updated weights for policy 0, policy_version 489940 (0.0043) [2024-06-23 18:21:36,982][15401] Updated weights for policy 0, policy_version 489950 (0.0037) [2024-06-23 18:21:38,392][15132] Fps is (10 sec: 37674.0, 60 sec: 41777.5, 300 sec: 42542.5). Total num frames: 8027357184. Throughput: 0: 42306.3. Samples: 8027461980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:38,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 18:21:41,044][15401] Updated weights for policy 0, policy_version 489960 (0.0036) [2024-06-23 18:21:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8027602944. Throughput: 0: 42325.8. Samples: 8027714400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:21:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489967_8027619328.pth... [2024-06-23 18:21:43,600][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489341_8017362944.pth [2024-06-23 18:21:44,981][15401] Updated weights for policy 0, policy_version 489970 (0.0037) [2024-06-23 18:21:48,392][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 42598.1). Total num frames: 8027799552. Throughput: 0: 42641.9. Samples: 8027980740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:48,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 18:21:48,933][15401] Updated weights for policy 0, policy_version 489980 (0.0027) [2024-06-23 18:21:52,451][15401] Updated weights for policy 0, policy_version 489990 (0.0034) [2024-06-23 18:21:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8028012544. Throughput: 0: 42545.3. Samples: 8028102100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 18:21:56,430][15401] Updated weights for policy 0, policy_version 490000 (0.0032) [2024-06-23 18:21:58,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42871.9, 300 sec: 42653.9). Total num frames: 8028258304. Throughput: 0: 42724.8. Samples: 8028362400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:21:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 18:21:59,958][15401] Updated weights for policy 0, policy_version 490010 (0.0032) [2024-06-23 18:22:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8028454912. Throughput: 0: 42704.8. Samples: 8028625480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 18:22:03,901][15401] Updated weights for policy 0, policy_version 490020 (0.0041) [2024-06-23 18:22:07,543][15401] Updated weights for policy 0, policy_version 490030 (0.0038) [2024-06-23 18:22:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8028667904. Throughput: 0: 42697.6. Samples: 8028747960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:08,391][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 18:22:11,467][15401] Updated weights for policy 0, policy_version 490040 (0.0036) [2024-06-23 18:22:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 8028897280. Throughput: 0: 43049.2. Samples: 8029009760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 18:22:15,283][15401] Updated weights for policy 0, policy_version 490050 (0.0033) [2024-06-23 18:22:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8029077504. Throughput: 0: 42813.9. Samples: 8029262960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 18:22:19,331][15401] Updated weights for policy 0, policy_version 490060 (0.0034) [2024-06-23 18:22:22,990][15401] Updated weights for policy 0, policy_version 490070 (0.0038) [2024-06-23 18:22:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8029306880. Throughput: 0: 42735.6. Samples: 8029384980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:23,400][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 18:22:26,862][15401] Updated weights for policy 0, policy_version 490080 (0.0027) [2024-06-23 18:22:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8029503488. Throughput: 0: 42766.9. Samples: 8029638900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:28,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-23 18:22:30,686][15401] Updated weights for policy 0, policy_version 490090 (0.0034) [2024-06-23 18:22:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8029700096. Throughput: 0: 42752.4. Samples: 8029904500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 18:22:34,814][15401] Updated weights for policy 0, policy_version 490100 (0.0035) [2024-06-23 18:22:38,098][15349] Signal inference workers to stop experience collection... (119000 times) [2024-06-23 18:22:38,101][15349] Signal inference workers to resume experience collection... (119000 times) [2024-06-23 18:22:38,149][15401] InferenceWorker_p0-w0: stopping experience collection (119000 times) [2024-06-23 18:22:38,149][15401] InferenceWorker_p0-w0: resuming experience collection (119000 times) [2024-06-23 18:22:38,237][15401] Updated weights for policy 0, policy_version 490110 (0.0024) [2024-06-23 18:22:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43419.3, 300 sec: 42709.5). Total num frames: 8029962240. Throughput: 0: 42677.0. Samples: 8030022560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:22:42,465][15401] Updated weights for policy 0, policy_version 490120 (0.0044) [2024-06-23 18:22:43,396][15132] Fps is (10 sec: 45845.7, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 8030158848. Throughput: 0: 42652.2. Samples: 8030282020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:43,396][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 18:22:45,874][15401] Updated weights for policy 0, policy_version 490130 (0.0032) [2024-06-23 18:22:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42326.9, 300 sec: 42487.7). Total num frames: 8030339072. Throughput: 0: 42402.2. Samples: 8030533580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 18:22:50,430][15401] Updated weights for policy 0, policy_version 490140 (0.0029) [2024-06-23 18:22:53,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8030584832. Throughput: 0: 42385.3. Samples: 8030655300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 18:22:53,595][15401] Updated weights for policy 0, policy_version 490150 (0.0035) [2024-06-23 18:22:57,962][15401] Updated weights for policy 0, policy_version 490160 (0.0033) [2024-06-23 18:22:58,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 8030797824. Throughput: 0: 42475.6. Samples: 8030921260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:22:58,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 18:23:01,422][15401] Updated weights for policy 0, policy_version 490170 (0.0034) [2024-06-23 18:23:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8030994432. Throughput: 0: 42445.6. Samples: 8031173020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:23:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 18:23:05,575][15401] Updated weights for policy 0, policy_version 490180 (0.0037) [2024-06-23 18:23:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8031207424. Throughput: 0: 42433.0. Samples: 8031294460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:23:08,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 18:23:09,051][15401] Updated weights for policy 0, policy_version 490190 (0.0027) [2024-06-23 18:23:13,218][15401] Updated weights for policy 0, policy_version 490200 (0.0027) [2024-06-23 18:23:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 8031436800. Throughput: 0: 42617.2. Samples: 8031556780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:23:13,393][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 18:23:16,718][15401] Updated weights for policy 0, policy_version 490210 (0.0029) [2024-06-23 18:23:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8031633408. Throughput: 0: 42349.8. Samples: 8031810240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:23:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 18:23:21,138][15401] Updated weights for policy 0, policy_version 490220 (0.0037) [2024-06-23 18:23:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8031862784. Throughput: 0: 42567.6. Samples: 8031938100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 18:23:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 18:23:24,199][15401] Updated weights for policy 0, policy_version 490230 (0.0037) [2024-06-23 18:23:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 8032059392. Throughput: 0: 42552.7. Samples: 8032196620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 18:23:28,909][15401] Updated weights for policy 0, policy_version 490240 (0.0030) [2024-06-23 18:23:31,645][15401] Updated weights for policy 0, policy_version 490250 (0.0034) [2024-06-23 18:23:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8032272384. Throughput: 0: 42674.2. Samples: 8032453920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 18:23:36,514][15401] Updated weights for policy 0, policy_version 490260 (0.0029) [2024-06-23 18:23:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8032501760. Throughput: 0: 42751.6. Samples: 8032579120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 18:23:39,688][15401] Updated weights for policy 0, policy_version 490270 (0.0032) [2024-06-23 18:23:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 8032714752. Throughput: 0: 42559.2. Samples: 8032836320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:43,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 18:23:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490278_8032714752.pth... [2024-06-23 18:23:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489654_8022491136.pth [2024-06-23 18:23:44,081][15401] Updated weights for policy 0, policy_version 490280 (0.0025) [2024-06-23 18:23:47,034][15401] Updated weights for policy 0, policy_version 490290 (0.0026) [2024-06-23 18:23:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8032927744. Throughput: 0: 42697.8. Samples: 8033094420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 18:23:51,784][15401] Updated weights for policy 0, policy_version 490300 (0.0039) [2024-06-23 18:23:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8033157120. Throughput: 0: 42831.9. Samples: 8033221900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 18:23:54,454][15349] Signal inference workers to stop experience collection... (119050 times) [2024-06-23 18:23:54,454][15349] Signal inference workers to resume experience collection... (119050 times) [2024-06-23 18:23:54,501][15401] InferenceWorker_p0-w0: stopping experience collection (119050 times) [2024-06-23 18:23:54,501][15401] InferenceWorker_p0-w0: resuming experience collection (119050 times) [2024-06-23 18:23:54,591][15401] Updated weights for policy 0, policy_version 490310 (0.0038) [2024-06-23 18:23:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 8033337344. Throughput: 0: 42846.4. Samples: 8033484760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:23:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:23:59,383][15401] Updated weights for policy 0, policy_version 490320 (0.0020) [2024-06-23 18:24:02,527][15401] Updated weights for policy 0, policy_version 490330 (0.0038) [2024-06-23 18:24:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 8033583104. Throughput: 0: 42687.8. Samples: 8033731300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:03,393][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 18:24:07,483][15401] Updated weights for policy 0, policy_version 490340 (0.0035) [2024-06-23 18:24:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8033796096. Throughput: 0: 42885.3. Samples: 8033867940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 18:24:10,362][15401] Updated weights for policy 0, policy_version 490350 (0.0028) [2024-06-23 18:24:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 8033976320. Throughput: 0: 42797.4. Samples: 8034122500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 18:24:15,019][15401] Updated weights for policy 0, policy_version 490360 (0.0033) [2024-06-23 18:24:18,121][15401] Updated weights for policy 0, policy_version 490370 (0.0037) [2024-06-23 18:24:18,391][15132] Fps is (10 sec: 44229.4, 60 sec: 43416.4, 300 sec: 42709.2). Total num frames: 8034238464. Throughput: 0: 42582.5. Samples: 8034370200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:18,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 18:24:22,634][15401] Updated weights for policy 0, policy_version 490380 (0.0033) [2024-06-23 18:24:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8034418688. Throughput: 0: 42878.3. Samples: 8034508640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 18:24:25,604][15401] Updated weights for policy 0, policy_version 490390 (0.0031) [2024-06-23 18:24:28,390][15132] Fps is (10 sec: 37689.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 8034615296. Throughput: 0: 42767.4. Samples: 8034760860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 18:24:30,222][15401] Updated weights for policy 0, policy_version 490400 (0.0032) [2024-06-23 18:24:33,155][15401] Updated weights for policy 0, policy_version 490410 (0.0036) [2024-06-23 18:24:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8034877440. Throughput: 0: 42523.2. Samples: 8035007960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 18:24:38,034][15401] Updated weights for policy 0, policy_version 490420 (0.0037) [2024-06-23 18:24:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8035057664. Throughput: 0: 42754.7. Samples: 8035145860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 18:24:40,573][15401] Updated weights for policy 0, policy_version 490430 (0.0035) [2024-06-23 18:24:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 8035270656. Throughput: 0: 42543.9. Samples: 8035399240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:43,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 18:24:45,615][15401] Updated weights for policy 0, policy_version 490440 (0.0037) [2024-06-23 18:24:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8035500032. Throughput: 0: 42623.2. Samples: 8035649240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 18:24:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 18:24:48,680][15401] Updated weights for policy 0, policy_version 490450 (0.0029) [2024-06-23 18:24:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8035680256. Throughput: 0: 42668.1. Samples: 8035788000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:24:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 18:24:53,475][15401] Updated weights for policy 0, policy_version 490460 (0.0032) [2024-06-23 18:24:56,272][15401] Updated weights for policy 0, policy_version 490470 (0.0033) [2024-06-23 18:24:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 8035909632. Throughput: 0: 42547.5. Samples: 8036037240. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:24:58,393][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 18:25:01,075][15401] Updated weights for policy 0, policy_version 490480 (0.0038) [2024-06-23 18:25:03,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 8036155392. Throughput: 0: 42745.6. Samples: 8036293680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 18:25:03,990][15401] Updated weights for policy 0, policy_version 490490 (0.0031) [2024-06-23 18:25:08,239][15349] Signal inference workers to stop experience collection... (119100 times) [2024-06-23 18:25:08,277][15401] InferenceWorker_p0-w0: stopping experience collection (119100 times) [2024-06-23 18:25:08,286][15349] Signal inference workers to resume experience collection... (119100 times) [2024-06-23 18:25:08,295][15401] InferenceWorker_p0-w0: resuming experience collection (119100 times) [2024-06-23 18:25:08,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8036319232. Throughput: 0: 42657.3. Samples: 8036428220. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 18:25:08,615][15401] Updated weights for policy 0, policy_version 490500 (0.0030) [2024-06-23 18:25:11,628][15401] Updated weights for policy 0, policy_version 490510 (0.0031) [2024-06-23 18:25:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 8036564992. Throughput: 0: 42778.7. Samples: 8036686000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:13,393][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 18:25:16,163][15401] Updated weights for policy 0, policy_version 490520 (0.0038) [2024-06-23 18:25:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42326.5, 300 sec: 42654.0). Total num frames: 8036777984. Throughput: 0: 42978.3. Samples: 8036941980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 18:25:19,246][15401] Updated weights for policy 0, policy_version 490530 (0.0033) [2024-06-23 18:25:23,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8036974592. Throughput: 0: 42713.8. Samples: 8037067980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 18:25:23,899][15401] Updated weights for policy 0, policy_version 490540 (0.0031) [2024-06-23 18:25:27,147][15401] Updated weights for policy 0, policy_version 490550 (0.0046) [2024-06-23 18:25:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8037220352. Throughput: 0: 42919.0. Samples: 8037330600. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 18:25:31,394][15401] Updated weights for policy 0, policy_version 490560 (0.0029) [2024-06-23 18:25:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8037433344. Throughput: 0: 43184.4. Samples: 8037592540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:33,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 18:25:34,802][15401] Updated weights for policy 0, policy_version 490570 (0.0040) [2024-06-23 18:25:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8037629952. Throughput: 0: 42945.3. Samples: 8037720540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 18:25:38,812][15401] Updated weights for policy 0, policy_version 490580 (0.0036) [2024-06-23 18:25:42,468][15401] Updated weights for policy 0, policy_version 490590 (0.0037) [2024-06-23 18:25:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42654.3). Total num frames: 8037875712. Throughput: 0: 43241.5. Samples: 8037983000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:25:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490594_8037892096.pth... [2024-06-23 18:25:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000489967_8027619328.pth [2024-06-23 18:25:46,349][15401] Updated weights for policy 0, policy_version 490600 (0.0029) [2024-06-23 18:25:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8038088704. Throughput: 0: 43340.4. Samples: 8038244000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 18:25:50,092][15401] Updated weights for policy 0, policy_version 490610 (0.0046) [2024-06-23 18:25:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42709.6). Total num frames: 8038285312. Throughput: 0: 43113.2. Samples: 8038368320. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 18:25:53,892][15401] Updated weights for policy 0, policy_version 490620 (0.0037) [2024-06-23 18:25:57,431][15401] Updated weights for policy 0, policy_version 490630 (0.0037) [2024-06-23 18:25:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43419.3, 300 sec: 42709.5). Total num frames: 8038514688. Throughput: 0: 43313.3. Samples: 8038635000. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:25:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:26:01,497][15401] Updated weights for policy 0, policy_version 490640 (0.0037) [2024-06-23 18:26:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8038744064. Throughput: 0: 43390.3. Samples: 8038894540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:26:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 18:26:04,782][15401] Updated weights for policy 0, policy_version 490650 (0.0023) [2024-06-23 18:26:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8038924288. Throughput: 0: 43572.4. Samples: 8039028740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:26:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 18:26:09,155][15401] Updated weights for policy 0, policy_version 490660 (0.0049) [2024-06-23 18:26:12,195][15401] Updated weights for policy 0, policy_version 490670 (0.0039) [2024-06-23 18:26:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 8039153664. Throughput: 0: 43361.8. Samples: 8039281880. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:26:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 18:26:16,404][15349] Signal inference workers to stop experience collection... (119150 times) [2024-06-23 18:26:16,428][15401] InferenceWorker_p0-w0: stopping experience collection (119150 times) [2024-06-23 18:26:16,466][15349] Signal inference workers to resume experience collection... (119150 times) [2024-06-23 18:26:16,468][15401] InferenceWorker_p0-w0: resuming experience collection (119150 times) [2024-06-23 18:26:16,623][15401] Updated weights for policy 0, policy_version 490680 (0.0036) [2024-06-23 18:26:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8039383040. Throughput: 0: 43387.2. Samples: 8039544960. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-23 18:26:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 18:26:19,905][15401] Updated weights for policy 0, policy_version 490690 (0.0029) [2024-06-23 18:26:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8039563264. Throughput: 0: 43413.8. Samples: 8039674160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:23,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 18:26:24,163][15401] Updated weights for policy 0, policy_version 490700 (0.0026) [2024-06-23 18:26:27,552][15401] Updated weights for policy 0, policy_version 490710 (0.0047) [2024-06-23 18:26:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8039809024. Throughput: 0: 43148.8. Samples: 8039924700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 18:26:32,182][15401] Updated weights for policy 0, policy_version 490720 (0.0038) [2024-06-23 18:26:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.7, 300 sec: 42987.5). Total num frames: 8040038400. Throughput: 0: 43165.3. Samples: 8040186440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:33,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 18:26:35,104][15401] Updated weights for policy 0, policy_version 490730 (0.0044) [2024-06-23 18:26:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8040218624. Throughput: 0: 43210.2. Samples: 8040312780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 18:26:39,705][15401] Updated weights for policy 0, policy_version 490740 (0.0041) [2024-06-23 18:26:42,507][15401] Updated weights for policy 0, policy_version 490750 (0.0035) [2024-06-23 18:26:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 8040448000. Throughput: 0: 42842.2. Samples: 8040562900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 18:26:47,459][15401] Updated weights for policy 0, policy_version 490760 (0.0035) [2024-06-23 18:26:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8040677376. Throughput: 0: 42916.8. Samples: 8040825800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 18:26:50,596][15401] Updated weights for policy 0, policy_version 490770 (0.0031) [2024-06-23 18:26:53,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8040857600. Throughput: 0: 42759.5. Samples: 8040952920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 18:26:55,001][15401] Updated weights for policy 0, policy_version 490780 (0.0034) [2024-06-23 18:26:58,384][15401] Updated weights for policy 0, policy_version 490790 (0.0038) [2024-06-23 18:26:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8041103360. Throughput: 0: 42648.0. Samples: 8041201040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:26:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 18:27:02,559][15401] Updated weights for policy 0, policy_version 490800 (0.0037) [2024-06-23 18:27:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8041299968. Throughput: 0: 42735.5. Samples: 8041468060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 18:27:06,066][15401] Updated weights for policy 0, policy_version 490810 (0.0048) [2024-06-23 18:27:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8041496576. Throughput: 0: 42471.1. Samples: 8041585360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 18:27:10,517][15401] Updated weights for policy 0, policy_version 490820 (0.0046) [2024-06-23 18:27:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8041725952. Throughput: 0: 42591.7. Samples: 8041841320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:13,390][15132] Avg episode reward: [(0, '0.099')] [2024-06-23 18:27:13,659][15401] Updated weights for policy 0, policy_version 490830 (0.0041) [2024-06-23 18:27:18,219][15401] Updated weights for policy 0, policy_version 490840 (0.0032) [2024-06-23 18:27:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8041938944. Throughput: 0: 42757.8. Samples: 8042110540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:18,390][15132] Avg episode reward: [(0, '0.112')] [2024-06-23 18:27:21,254][15401] Updated weights for policy 0, policy_version 490850 (0.0031) [2024-06-23 18:27:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8042151936. Throughput: 0: 42548.1. Samples: 8042227440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 18:27:25,742][15401] Updated weights for policy 0, policy_version 490860 (0.0031) [2024-06-23 18:27:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8042381312. Throughput: 0: 42703.8. Samples: 8042484560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 18:27:28,656][15401] Updated weights for policy 0, policy_version 490870 (0.0022) [2024-06-23 18:27:33,211][15401] Updated weights for policy 0, policy_version 490880 (0.0029) [2024-06-23 18:27:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8042577920. Throughput: 0: 42725.8. Samples: 8042748460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 18:27:36,690][15401] Updated weights for policy 0, policy_version 490890 (0.0032) [2024-06-23 18:27:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42821.1). Total num frames: 8042790912. Throughput: 0: 42632.0. Samples: 8042871460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:38,401][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 18:27:40,503][15349] Signal inference workers to stop experience collection... (119200 times) [2024-06-23 18:27:40,525][15401] InferenceWorker_p0-w0: stopping experience collection (119200 times) [2024-06-23 18:27:40,562][15349] Signal inference workers to resume experience collection... (119200 times) [2024-06-23 18:27:40,562][15401] InferenceWorker_p0-w0: resuming experience collection (119200 times) [2024-06-23 18:27:40,704][15401] Updated weights for policy 0, policy_version 490900 (0.0037) [2024-06-23 18:27:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 8043036672. Throughput: 0: 42891.7. Samples: 8043131160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:27:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:27:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490908_8043036672.pth... [2024-06-23 18:27:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490278_8032714752.pth [2024-06-23 18:27:44,349][15401] Updated weights for policy 0, policy_version 490910 (0.0034) [2024-06-23 18:27:48,262][15401] Updated weights for policy 0, policy_version 490920 (0.0037) [2024-06-23 18:27:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8043233280. Throughput: 0: 42742.2. Samples: 8043391460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:27:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 18:27:52,577][15401] Updated weights for policy 0, policy_version 490930 (0.0030) [2024-06-23 18:27:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 8043446272. Throughput: 0: 42907.4. Samples: 8043516300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:27:53,393][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:27:55,970][15401] Updated weights for policy 0, policy_version 490940 (0.0032) [2024-06-23 18:27:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8043675648. Throughput: 0: 43060.9. Samples: 8043779060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:27:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 18:28:00,073][15401] Updated weights for policy 0, policy_version 490950 (0.0028) [2024-06-23 18:28:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8043872256. Throughput: 0: 42884.5. Samples: 8044040340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 18:28:03,514][15401] Updated weights for policy 0, policy_version 490960 (0.0036) [2024-06-23 18:28:07,503][15401] Updated weights for policy 0, policy_version 490970 (0.0042) [2024-06-23 18:28:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8044085248. Throughput: 0: 43030.6. Samples: 8044163820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 18:28:11,330][15401] Updated weights for policy 0, policy_version 490980 (0.0028) [2024-06-23 18:28:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8044314624. Throughput: 0: 43245.2. Samples: 8044430600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 18:28:15,078][15401] Updated weights for policy 0, policy_version 490990 (0.0029) [2024-06-23 18:28:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8044527616. Throughput: 0: 42930.7. Samples: 8044680340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 18:28:18,869][15401] Updated weights for policy 0, policy_version 491000 (0.0034) [2024-06-23 18:28:22,711][15401] Updated weights for policy 0, policy_version 491010 (0.0027) [2024-06-23 18:28:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 8044740608. Throughput: 0: 43073.3. Samples: 8044809660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:23,392][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 18:28:26,358][15401] Updated weights for policy 0, policy_version 491020 (0.0039) [2024-06-23 18:28:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 8044969984. Throughput: 0: 43114.5. Samples: 8045071320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 18:28:30,274][15401] Updated weights for policy 0, policy_version 491030 (0.0043) [2024-06-23 18:28:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8045166592. Throughput: 0: 42948.9. Samples: 8045324160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 18:28:33,810][15401] Updated weights for policy 0, policy_version 491040 (0.0032) [2024-06-23 18:28:37,931][15401] Updated weights for policy 0, policy_version 491050 (0.0044) [2024-06-23 18:28:38,392][15132] Fps is (10 sec: 40951.4, 60 sec: 43144.6, 300 sec: 42931.3). Total num frames: 8045379584. Throughput: 0: 43065.5. Samples: 8045454240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:38,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 18:28:41,731][15401] Updated weights for policy 0, policy_version 491060 (0.0033) [2024-06-23 18:28:43,391][15132] Fps is (10 sec: 44232.3, 60 sec: 42870.7, 300 sec: 42987.0). Total num frames: 8045608960. Throughput: 0: 42887.0. Samples: 8045709020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:43,391][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 18:28:45,505][15401] Updated weights for policy 0, policy_version 491070 (0.0026) [2024-06-23 18:28:48,389][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8045805568. Throughput: 0: 42845.7. Samples: 8045968400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 18:28:49,257][15401] Updated weights for policy 0, policy_version 491080 (0.0037) [2024-06-23 18:28:53,114][15401] Updated weights for policy 0, policy_version 491090 (0.0035) [2024-06-23 18:28:53,390][15132] Fps is (10 sec: 40963.4, 60 sec: 42873.1, 300 sec: 42987.1). Total num frames: 8046018560. Throughput: 0: 42937.6. Samples: 8046096020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 18:28:56,730][15401] Updated weights for policy 0, policy_version 491100 (0.0035) [2024-06-23 18:28:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 8046231552. Throughput: 0: 42691.5. Samples: 8046351720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:28:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 18:29:00,707][15401] Updated weights for policy 0, policy_version 491110 (0.0046) [2024-06-23 18:29:02,674][15349] Signal inference workers to stop experience collection... (119250 times) [2024-06-23 18:29:02,714][15401] InferenceWorker_p0-w0: stopping experience collection (119250 times) [2024-06-23 18:29:02,724][15349] Signal inference workers to resume experience collection... (119250 times) [2024-06-23 18:29:02,731][15401] InferenceWorker_p0-w0: resuming experience collection (119250 times) [2024-06-23 18:29:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8046428160. Throughput: 0: 42940.0. Samples: 8046612640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:29:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 18:29:04,283][15401] Updated weights for policy 0, policy_version 491120 (0.0031) [2024-06-23 18:29:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8046657536. Throughput: 0: 42872.6. Samples: 8046738920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 18:29:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 18:29:08,847][15401] Updated weights for policy 0, policy_version 491130 (0.0035) [2024-06-23 18:29:11,959][15401] Updated weights for policy 0, policy_version 491140 (0.0023) [2024-06-23 18:29:13,392][15132] Fps is (10 sec: 45865.0, 60 sec: 42869.9, 300 sec: 42876.0). Total num frames: 8046886912. Throughput: 0: 42666.1. Samples: 8046991380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:13,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 18:29:16,398][15401] Updated weights for policy 0, policy_version 491150 (0.0044) [2024-06-23 18:29:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8047083520. Throughput: 0: 42853.8. Samples: 8047252580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 18:29:19,916][15401] Updated weights for policy 0, policy_version 491160 (0.0032) [2024-06-23 18:29:23,390][15132] Fps is (10 sec: 42607.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8047312896. Throughput: 0: 42748.3. Samples: 8047377820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 18:29:23,907][15401] Updated weights for policy 0, policy_version 491170 (0.0035) [2024-06-23 18:29:27,363][15401] Updated weights for policy 0, policy_version 491180 (0.0046) [2024-06-23 18:29:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8047525888. Throughput: 0: 42907.6. Samples: 8047639820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 18:29:31,434][15401] Updated weights for policy 0, policy_version 491190 (0.0034) [2024-06-23 18:29:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 8047706112. Throughput: 0: 42921.4. Samples: 8047899860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:29:34,903][15401] Updated weights for policy 0, policy_version 491200 (0.0028) [2024-06-23 18:29:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42599.9, 300 sec: 42931.6). Total num frames: 8047935488. Throughput: 0: 42836.5. Samples: 8048023660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 18:29:39,066][15401] Updated weights for policy 0, policy_version 491210 (0.0037) [2024-06-23 18:29:42,961][15401] Updated weights for policy 0, policy_version 491220 (0.0033) [2024-06-23 18:29:43,390][15132] Fps is (10 sec: 47512.3, 60 sec: 42872.1, 300 sec: 42987.2). Total num frames: 8048181248. Throughput: 0: 42799.1. Samples: 8048277680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 18:29:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491222_8048181248.pth... [2024-06-23 18:29:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490594_8037892096.pth [2024-06-23 18:29:46,771][15401] Updated weights for policy 0, policy_version 491230 (0.0039) [2024-06-23 18:29:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 8048361472. Throughput: 0: 42895.0. Samples: 8048542920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 18:29:50,515][15401] Updated weights for policy 0, policy_version 491240 (0.0034) [2024-06-23 18:29:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42987.5). Total num frames: 8048590848. Throughput: 0: 42710.2. Samples: 8048660880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 18:29:54,561][15401] Updated weights for policy 0, policy_version 491250 (0.0032) [2024-06-23 18:29:58,064][15401] Updated weights for policy 0, policy_version 491260 (0.0037) [2024-06-23 18:29:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 8048803840. Throughput: 0: 42807.8. Samples: 8048917740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:29:58,392][15132] Avg episode reward: [(0, '0.143')] [2024-06-23 18:30:02,279][15401] Updated weights for policy 0, policy_version 491270 (0.0027) [2024-06-23 18:30:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8048984064. Throughput: 0: 42767.4. Samples: 8049177120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:03,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-23 18:30:05,662][15401] Updated weights for policy 0, policy_version 491280 (0.0031) [2024-06-23 18:30:08,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42869.7, 300 sec: 42931.6). Total num frames: 8049229824. Throughput: 0: 42766.6. Samples: 8049302420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:08,393][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 18:30:09,865][15401] Updated weights for policy 0, policy_version 491290 (0.0043) [2024-06-23 18:30:13,386][15401] Updated weights for policy 0, policy_version 491300 (0.0032) [2024-06-23 18:30:13,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42873.0, 300 sec: 42987.2). Total num frames: 8049459200. Throughput: 0: 42783.1. Samples: 8049565060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 18:30:17,601][15401] Updated weights for policy 0, policy_version 491310 (0.0042) [2024-06-23 18:30:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8049639424. Throughput: 0: 42786.6. Samples: 8049825260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 18:30:20,806][15349] Signal inference workers to stop experience collection... (119300 times) [2024-06-23 18:30:20,812][15349] Signal inference workers to resume experience collection... (119300 times) [2024-06-23 18:30:20,819][15401] InferenceWorker_p0-w0: stopping experience collection (119300 times) [2024-06-23 18:30:20,854][15401] InferenceWorker_p0-w0: resuming experience collection (119300 times) [2024-06-23 18:30:20,951][15401] Updated weights for policy 0, policy_version 491320 (0.0035) [2024-06-23 18:30:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8049885184. Throughput: 0: 42772.1. Samples: 8049948400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 18:30:25,082][15401] Updated weights for policy 0, policy_version 491330 (0.0032) [2024-06-23 18:30:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8050081792. Throughput: 0: 42836.7. Samples: 8050205320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 18:30:28,702][15401] Updated weights for policy 0, policy_version 491340 (0.0036) [2024-06-23 18:30:33,007][15401] Updated weights for policy 0, policy_version 491350 (0.0032) [2024-06-23 18:30:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8050278400. Throughput: 0: 42721.3. Samples: 8050465380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 18:30:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 18:30:36,327][15401] Updated weights for policy 0, policy_version 491360 (0.0027) [2024-06-23 18:30:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8050540544. Throughput: 0: 42955.9. Samples: 8050593900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:30:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 18:30:40,803][15401] Updated weights for policy 0, policy_version 491370 (0.0040) [2024-06-23 18:30:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 8050720768. Throughput: 0: 42887.6. Samples: 8050847580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:30:43,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 18:30:43,947][15401] Updated weights for policy 0, policy_version 491380 (0.0038) [2024-06-23 18:30:48,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8050917376. Throughput: 0: 42887.3. Samples: 8051107040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:30:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 18:30:48,527][15401] Updated weights for policy 0, policy_version 491390 (0.0039) [2024-06-23 18:30:51,446][15401] Updated weights for policy 0, policy_version 491400 (0.0027) [2024-06-23 18:30:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8051179520. Throughput: 0: 42851.0. Samples: 8051230620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:30:53,396][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 18:30:56,497][15401] Updated weights for policy 0, policy_version 491410 (0.0040) [2024-06-23 18:30:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8051359744. Throughput: 0: 42854.2. Samples: 8051493500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:30:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 18:30:59,418][15401] Updated weights for policy 0, policy_version 491420 (0.0025) [2024-06-23 18:31:03,392][15132] Fps is (10 sec: 37674.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8051556352. Throughput: 0: 42690.0. Samples: 8051746420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:03,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 18:31:03,990][15401] Updated weights for policy 0, policy_version 491430 (0.0030) [2024-06-23 18:31:06,825][15401] Updated weights for policy 0, policy_version 491440 (0.0032) [2024-06-23 18:31:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43144.5, 300 sec: 42931.3). Total num frames: 8051818496. Throughput: 0: 42803.5. Samples: 8051874660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:08,393][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 18:31:11,448][15401] Updated weights for policy 0, policy_version 491450 (0.0038) [2024-06-23 18:31:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8051998720. Throughput: 0: 42834.5. Samples: 8052132880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:13,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 18:31:14,631][15401] Updated weights for policy 0, policy_version 491460 (0.0032) [2024-06-23 18:31:18,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8052211712. Throughput: 0: 42640.0. Samples: 8052384180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 18:31:18,948][15401] Updated weights for policy 0, policy_version 491470 (0.0037) [2024-06-23 18:31:22,107][15401] Updated weights for policy 0, policy_version 491480 (0.0035) [2024-06-23 18:31:23,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8052473856. Throughput: 0: 42729.3. Samples: 8052516720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 18:31:26,708][15401] Updated weights for policy 0, policy_version 491490 (0.0032) [2024-06-23 18:31:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8052621312. Throughput: 0: 42790.3. Samples: 8052773140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 18:31:29,815][15401] Updated weights for policy 0, policy_version 491500 (0.0036) [2024-06-23 18:31:33,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8052834304. Throughput: 0: 42719.1. Samples: 8053029400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:33,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 18:31:33,487][15349] Signal inference workers to stop experience collection... (119350 times) [2024-06-23 18:31:33,487][15349] Signal inference workers to resume experience collection... (119350 times) [2024-06-23 18:31:33,528][15401] InferenceWorker_p0-w0: stopping experience collection (119350 times) [2024-06-23 18:31:33,528][15401] InferenceWorker_p0-w0: resuming experience collection (119350 times) [2024-06-23 18:31:34,136][15401] Updated weights for policy 0, policy_version 491510 (0.0052) [2024-06-23 18:31:37,866][15401] Updated weights for policy 0, policy_version 491520 (0.0042) [2024-06-23 18:31:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8053096448. Throughput: 0: 42782.4. Samples: 8053155820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 18:31:41,883][15401] Updated weights for policy 0, policy_version 491530 (0.0038) [2024-06-23 18:31:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8053276672. Throughput: 0: 42571.6. Samples: 8053409220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 18:31:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491533_8053276672.pth... [2024-06-23 18:31:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000490908_8043036672.pth [2024-06-23 18:31:45,454][15401] Updated weights for policy 0, policy_version 491540 (0.0033) [2024-06-23 18:31:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8053489664. Throughput: 0: 42637.3. Samples: 8053665000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:48,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 18:31:49,560][15401] Updated weights for policy 0, policy_version 491550 (0.0037) [2024-06-23 18:31:53,037][15401] Updated weights for policy 0, policy_version 491560 (0.0043) [2024-06-23 18:31:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8053719040. Throughput: 0: 42676.9. Samples: 8053795020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 18:31:57,325][15401] Updated weights for policy 0, policy_version 491570 (0.0047) [2024-06-23 18:31:58,394][15132] Fps is (10 sec: 40941.8, 60 sec: 42322.1, 300 sec: 42708.8). Total num frames: 8053899264. Throughput: 0: 42571.3. Samples: 8054048780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:31:58,394][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 18:32:00,958][15401] Updated weights for policy 0, policy_version 491580 (0.0033) [2024-06-23 18:32:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 8054128640. Throughput: 0: 42614.8. Samples: 8054301840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-23 18:32:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 18:32:05,139][15401] Updated weights for policy 0, policy_version 491590 (0.0033) [2024-06-23 18:32:08,389][15132] Fps is (10 sec: 44257.0, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 8054341632. Throughput: 0: 42674.3. Samples: 8054437060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 18:32:08,623][15401] Updated weights for policy 0, policy_version 491600 (0.0040) [2024-06-23 18:32:12,775][15401] Updated weights for policy 0, policy_version 491610 (0.0043) [2024-06-23 18:32:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8054538240. Throughput: 0: 42410.6. Samples: 8054681620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 18:32:16,277][15401] Updated weights for policy 0, policy_version 491620 (0.0041) [2024-06-23 18:32:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8054784000. Throughput: 0: 42286.1. Samples: 8054932280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:18,394][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 18:32:20,776][15401] Updated weights for policy 0, policy_version 491630 (0.0034) [2024-06-23 18:32:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 8054980608. Throughput: 0: 42361.2. Samples: 8055062080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 18:32:23,964][15401] Updated weights for policy 0, policy_version 491640 (0.0033) [2024-06-23 18:32:28,367][15401] Updated weights for policy 0, policy_version 491650 (0.0041) [2024-06-23 18:32:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8055193600. Throughput: 0: 42268.9. Samples: 8055311320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:28,401][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 18:32:32,181][15401] Updated weights for policy 0, policy_version 491660 (0.0048) [2024-06-23 18:32:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 8055406592. Throughput: 0: 42323.6. Samples: 8055569560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 18:32:36,038][15401] Updated weights for policy 0, policy_version 491670 (0.0034) [2024-06-23 18:32:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 41777.5, 300 sec: 42598.0). Total num frames: 8055603200. Throughput: 0: 42143.6. Samples: 8055691580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:38,392][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 18:32:39,805][15401] Updated weights for policy 0, policy_version 491680 (0.0028) [2024-06-23 18:32:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8055816192. Throughput: 0: 42157.0. Samples: 8055945660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 18:32:43,585][15401] Updated weights for policy 0, policy_version 491690 (0.0035) [2024-06-23 18:32:47,379][15401] Updated weights for policy 0, policy_version 491700 (0.0039) [2024-06-23 18:32:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 8056029184. Throughput: 0: 42299.9. Samples: 8056205340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 18:32:51,189][15401] Updated weights for policy 0, policy_version 491710 (0.0038) [2024-06-23 18:32:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8056242176. Throughput: 0: 42107.5. Samples: 8056331900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:53,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 18:32:55,264][15401] Updated weights for policy 0, policy_version 491720 (0.0023) [2024-06-23 18:32:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42874.7, 300 sec: 42709.5). Total num frames: 8056471552. Throughput: 0: 42348.0. Samples: 8056587280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:32:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 18:32:58,733][15401] Updated weights for policy 0, policy_version 491730 (0.0023) [2024-06-23 18:33:02,779][15401] Updated weights for policy 0, policy_version 491740 (0.0028) [2024-06-23 18:33:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8056668160. Throughput: 0: 42453.0. Samples: 8056842660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 18:33:03,550][15349] Signal inference workers to stop experience collection... (119400 times) [2024-06-23 18:33:03,550][15349] Signal inference workers to resume experience collection... (119400 times) [2024-06-23 18:33:03,596][15401] InferenceWorker_p0-w0: stopping experience collection (119400 times) [2024-06-23 18:33:03,596][15401] InferenceWorker_p0-w0: resuming experience collection (119400 times) [2024-06-23 18:33:06,782][15401] Updated weights for policy 0, policy_version 491750 (0.0031) [2024-06-23 18:33:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8056881152. Throughput: 0: 42422.3. Samples: 8056971080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 18:33:10,630][15401] Updated weights for policy 0, policy_version 491760 (0.0043) [2024-06-23 18:33:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8057110528. Throughput: 0: 42781.2. Samples: 8057236480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 18:33:14,227][15401] Updated weights for policy 0, policy_version 491770 (0.0029) [2024-06-23 18:33:18,037][15401] Updated weights for policy 0, policy_version 491780 (0.0041) [2024-06-23 18:33:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 8057323520. Throughput: 0: 42559.7. Samples: 8057484740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 18:33:22,212][15401] Updated weights for policy 0, policy_version 491790 (0.0029) [2024-06-23 18:33:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8057536512. Throughput: 0: 42789.9. Samples: 8057617020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:23,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 18:33:25,652][15401] Updated weights for policy 0, policy_version 491800 (0.0040) [2024-06-23 18:33:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8057749504. Throughput: 0: 42912.1. Samples: 8057876700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-23 18:33:28,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 18:33:29,758][15401] Updated weights for policy 0, policy_version 491810 (0.0029) [2024-06-23 18:33:33,185][15401] Updated weights for policy 0, policy_version 491820 (0.0038) [2024-06-23 18:33:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 8057978880. Throughput: 0: 42812.0. Samples: 8058131880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:33,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 18:33:37,144][15401] Updated weights for policy 0, policy_version 491830 (0.0033) [2024-06-23 18:33:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43419.3, 300 sec: 42709.6). Total num frames: 8058208256. Throughput: 0: 42860.9. Samples: 8058260640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 18:33:40,923][15401] Updated weights for policy 0, policy_version 491840 (0.0024) [2024-06-23 18:33:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8058404864. Throughput: 0: 42895.0. Samples: 8058517560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 18:33:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491846_8058404864.pth... [2024-06-23 18:33:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491222_8048181248.pth [2024-06-23 18:33:44,913][15401] Updated weights for policy 0, policy_version 491850 (0.0031) [2024-06-23 18:33:48,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 8058601472. Throughput: 0: 42870.6. Samples: 8058771940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:48,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 18:33:48,919][15401] Updated weights for policy 0, policy_version 491860 (0.0035) [2024-06-23 18:33:52,902][15401] Updated weights for policy 0, policy_version 491870 (0.0050) [2024-06-23 18:33:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8058830848. Throughput: 0: 42744.4. Samples: 8058894580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 18:33:56,611][15401] Updated weights for policy 0, policy_version 491880 (0.0036) [2024-06-23 18:33:58,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8059043840. Throughput: 0: 42657.8. Samples: 8059156080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:33:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 18:34:00,407][15401] Updated weights for policy 0, policy_version 491890 (0.0040) [2024-06-23 18:34:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8059240448. Throughput: 0: 42701.3. Samples: 8059406300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 18:34:04,638][15401] Updated weights for policy 0, policy_version 491900 (0.0038) [2024-06-23 18:34:08,209][15401] Updated weights for policy 0, policy_version 491910 (0.0036) [2024-06-23 18:34:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 8059453440. Throughput: 0: 42528.8. Samples: 8059530820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 18:34:12,259][15401] Updated weights for policy 0, policy_version 491920 (0.0033) [2024-06-23 18:34:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8059682816. Throughput: 0: 42589.8. Samples: 8059793240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 18:34:16,151][15401] Updated weights for policy 0, policy_version 491930 (0.0032) [2024-06-23 18:34:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8059879424. Throughput: 0: 42565.0. Samples: 8060047300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 18:34:19,919][15401] Updated weights for policy 0, policy_version 491940 (0.0033) [2024-06-23 18:34:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 8060076032. Throughput: 0: 42492.5. Samples: 8060172800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:23,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 18:34:23,687][15401] Updated weights for policy 0, policy_version 491950 (0.0029) [2024-06-23 18:34:27,593][15401] Updated weights for policy 0, policy_version 491960 (0.0040) [2024-06-23 18:34:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8060321792. Throughput: 0: 42496.6. Samples: 8060429900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 18:34:31,215][15401] Updated weights for policy 0, policy_version 491970 (0.0034) [2024-06-23 18:34:32,647][15349] Signal inference workers to stop experience collection... (119450 times) [2024-06-23 18:34:32,702][15401] InferenceWorker_p0-w0: stopping experience collection (119450 times) [2024-06-23 18:34:32,762][15349] Signal inference workers to resume experience collection... (119450 times) [2024-06-23 18:34:32,762][15401] InferenceWorker_p0-w0: resuming experience collection (119450 times) [2024-06-23 18:34:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8060534784. Throughput: 0: 42472.4. Samples: 8060683100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 18:34:35,355][15401] Updated weights for policy 0, policy_version 491980 (0.0030) [2024-06-23 18:34:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 8060747776. Throughput: 0: 42680.0. Samples: 8060815280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:38,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 18:34:38,795][15401] Updated weights for policy 0, policy_version 491990 (0.0036) [2024-06-23 18:34:42,780][15401] Updated weights for policy 0, policy_version 492000 (0.0035) [2024-06-23 18:34:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8060944384. Throughput: 0: 42564.0. Samples: 8061071460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:43,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 18:34:46,460][15401] Updated weights for policy 0, policy_version 492010 (0.0041) [2024-06-23 18:34:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 8061190144. Throughput: 0: 42451.5. Samples: 8061316620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 18:34:50,750][15401] Updated weights for policy 0, policy_version 492020 (0.0031) [2024-06-23 18:34:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 8061353984. Throughput: 0: 42741.8. Samples: 8061454200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 18:34:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 18:34:54,148][15401] Updated weights for policy 0, policy_version 492030 (0.0037) [2024-06-23 18:34:58,324][15401] Updated weights for policy 0, policy_version 492040 (0.0040) [2024-06-23 18:34:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8061583360. Throughput: 0: 42396.8. Samples: 8061701100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:34:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 18:35:01,771][15401] Updated weights for policy 0, policy_version 492050 (0.0029) [2024-06-23 18:35:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 8061812736. Throughput: 0: 42624.5. Samples: 8061965400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 18:35:05,776][15401] Updated weights for policy 0, policy_version 492060 (0.0029) [2024-06-23 18:35:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 8062009344. Throughput: 0: 42740.9. Samples: 8062096140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 18:35:09,798][15401] Updated weights for policy 0, policy_version 492070 (0.0044) [2024-06-23 18:35:13,291][15401] Updated weights for policy 0, policy_version 492080 (0.0029) [2024-06-23 18:35:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8062238720. Throughput: 0: 42560.0. Samples: 8062345100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:13,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 18:35:17,342][15401] Updated weights for policy 0, policy_version 492090 (0.0038) [2024-06-23 18:35:18,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8062451712. Throughput: 0: 42693.8. Samples: 8062604420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:18,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 18:35:20,851][15401] Updated weights for policy 0, policy_version 492100 (0.0026) [2024-06-23 18:35:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8062648320. Throughput: 0: 42560.5. Samples: 8062730400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 18:35:24,943][15401] Updated weights for policy 0, policy_version 492110 (0.0032) [2024-06-23 18:35:28,341][15401] Updated weights for policy 0, policy_version 492120 (0.0036) [2024-06-23 18:35:28,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8062894080. Throughput: 0: 42611.2. Samples: 8062988960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-23 18:35:32,785][15401] Updated weights for policy 0, policy_version 492130 (0.0028) [2024-06-23 18:35:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8063074304. Throughput: 0: 42846.7. Samples: 8063244720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 18:35:36,042][15401] Updated weights for policy 0, policy_version 492140 (0.0034) [2024-06-23 18:35:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 8063287296. Throughput: 0: 42490.3. Samples: 8063366260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 18:35:40,254][15401] Updated weights for policy 0, policy_version 492150 (0.0030) [2024-06-23 18:35:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8063516672. Throughput: 0: 42842.7. Samples: 8063629020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 18:35:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492158_8063516672.pth... [2024-06-23 18:35:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491533_8053276672.pth [2024-06-23 18:35:43,905][15401] Updated weights for policy 0, policy_version 492160 (0.0036) [2024-06-23 18:35:47,924][15401] Updated weights for policy 0, policy_version 492170 (0.0030) [2024-06-23 18:35:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8063729664. Throughput: 0: 42765.6. Samples: 8063889860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 18:35:51,376][15401] Updated weights for policy 0, policy_version 492180 (0.0032) [2024-06-23 18:35:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8063926272. Throughput: 0: 42672.5. Samples: 8064016400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 18:35:55,465][15401] Updated weights for policy 0, policy_version 492190 (0.0038) [2024-06-23 18:35:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 8064172032. Throughput: 0: 42825.3. Samples: 8064272240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:35:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 18:35:59,063][15401] Updated weights for policy 0, policy_version 492200 (0.0027) [2024-06-23 18:36:03,170][15349] Signal inference workers to stop experience collection... (119500 times) [2024-06-23 18:36:03,170][15349] Signal inference workers to resume experience collection... (119500 times) [2024-06-23 18:36:03,200][15401] InferenceWorker_p0-w0: stopping experience collection (119500 times) [2024-06-23 18:36:03,200][15401] InferenceWorker_p0-w0: resuming experience collection (119500 times) [2024-06-23 18:36:03,304][15401] Updated weights for policy 0, policy_version 492210 (0.0028) [2024-06-23 18:36:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 8064368640. Throughput: 0: 42786.8. Samples: 8064529720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:36:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 18:36:06,644][15401] Updated weights for policy 0, policy_version 492220 (0.0032) [2024-06-23 18:36:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8064548864. Throughput: 0: 42715.9. Samples: 8064652620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:36:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 18:36:10,887][15401] Updated weights for policy 0, policy_version 492230 (0.0034) [2024-06-23 18:36:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 8064811008. Throughput: 0: 42770.7. Samples: 8064913740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:36:13,392][15132] Avg episode reward: [(0, '0.877')] [2024-06-23 18:36:14,609][15401] Updated weights for policy 0, policy_version 492240 (0.0038) [2024-06-23 18:36:18,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 8065024000. Throughput: 0: 42866.3. Samples: 8065173700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:36:18,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-23 18:36:18,395][15401] Updated weights for policy 0, policy_version 492250 (0.0032) [2024-06-23 18:36:22,434][15401] Updated weights for policy 0, policy_version 492260 (0.0040) [2024-06-23 18:36:23,389][15132] Fps is (10 sec: 39330.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8065204224. Throughput: 0: 42936.0. Samples: 8065298380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 18:36:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 18:36:26,482][15401] Updated weights for policy 0, policy_version 492270 (0.0044) [2024-06-23 18:36:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8065449984. Throughput: 0: 42836.9. Samples: 8065556680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:28,394][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 18:36:30,005][15401] Updated weights for policy 0, policy_version 492280 (0.0032) [2024-06-23 18:36:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 8065630208. Throughput: 0: 42840.2. Samples: 8065817660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 18:36:33,862][15401] Updated weights for policy 0, policy_version 492290 (0.0028) [2024-06-23 18:36:37,874][15401] Updated weights for policy 0, policy_version 492300 (0.0046) [2024-06-23 18:36:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8065843200. Throughput: 0: 42622.3. Samples: 8065934400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:38,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 18:36:41,426][15401] Updated weights for policy 0, policy_version 492310 (0.0043) [2024-06-23 18:36:43,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8066072576. Throughput: 0: 42643.6. Samples: 8066191200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 18:36:45,373][15401] Updated weights for policy 0, policy_version 492320 (0.0037) [2024-06-23 18:36:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8066269184. Throughput: 0: 42720.4. Samples: 8066452140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 18:36:49,105][15401] Updated weights for policy 0, policy_version 492330 (0.0038) [2024-06-23 18:36:52,862][15401] Updated weights for policy 0, policy_version 492340 (0.0028) [2024-06-23 18:36:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42710.1). Total num frames: 8066498560. Throughput: 0: 42817.0. Samples: 8066579380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 18:36:56,705][15401] Updated weights for policy 0, policy_version 492350 (0.0037) [2024-06-23 18:36:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8066711552. Throughput: 0: 42639.2. Samples: 8066832400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:36:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 18:37:00,958][15401] Updated weights for policy 0, policy_version 492360 (0.0043) [2024-06-23 18:37:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8066924544. Throughput: 0: 42591.8. Samples: 8067090340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 18:37:04,269][15401] Updated weights for policy 0, policy_version 492370 (0.0042) [2024-06-23 18:37:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8067121152. Throughput: 0: 42813.4. Samples: 8067224980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:08,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 18:37:08,522][15401] Updated weights for policy 0, policy_version 492380 (0.0030) [2024-06-23 18:37:11,932][15401] Updated weights for policy 0, policy_version 492390 (0.0047) [2024-06-23 18:37:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 8067350528. Throughput: 0: 42609.8. Samples: 8067474120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 18:37:16,244][15401] Updated weights for policy 0, policy_version 492400 (0.0040) [2024-06-23 18:37:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8067563520. Throughput: 0: 42494.5. Samples: 8067729920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 18:37:19,494][15401] Updated weights for policy 0, policy_version 492410 (0.0036) [2024-06-23 18:37:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8067776512. Throughput: 0: 42888.9. Samples: 8067864400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 18:37:23,978][15401] Updated weights for policy 0, policy_version 492420 (0.0033) [2024-06-23 18:37:27,570][15401] Updated weights for policy 0, policy_version 492430 (0.0029) [2024-06-23 18:37:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8068005888. Throughput: 0: 42775.2. Samples: 8068116080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 18:37:31,574][15401] Updated weights for policy 0, policy_version 492440 (0.0029) [2024-06-23 18:37:32,897][15349] Signal inference workers to stop experience collection... (119550 times) [2024-06-23 18:37:32,952][15401] InferenceWorker_p0-w0: stopping experience collection (119550 times) [2024-06-23 18:37:32,957][15349] Signal inference workers to resume experience collection... (119550 times) [2024-06-23 18:37:32,967][15401] InferenceWorker_p0-w0: resuming experience collection (119550 times) [2024-06-23 18:37:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.3, 300 sec: 42765.3). Total num frames: 8068218880. Throughput: 0: 42667.8. Samples: 8068372200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 18:37:35,062][15401] Updated weights for policy 0, policy_version 492450 (0.0032) [2024-06-23 18:37:38,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8068415488. Throughput: 0: 42777.7. Samples: 8068504380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 18:37:39,548][15401] Updated weights for policy 0, policy_version 492460 (0.0033) [2024-06-23 18:37:42,602][15401] Updated weights for policy 0, policy_version 492470 (0.0033) [2024-06-23 18:37:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8068644864. Throughput: 0: 42893.7. Samples: 8068762620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 18:37:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492471_8068644864.pth... [2024-06-23 18:37:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000491846_8058404864.pth [2024-06-23 18:37:47,121][15401] Updated weights for policy 0, policy_version 492480 (0.0036) [2024-06-23 18:37:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8068857856. Throughput: 0: 42758.7. Samples: 8069014480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-23 18:37:48,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 18:37:50,173][15401] Updated weights for policy 0, policy_version 492490 (0.0030) [2024-06-23 18:37:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 8069054464. Throughput: 0: 42701.7. Samples: 8069146660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:37:53,393][15132] Avg episode reward: [(0, '0.249')] [2024-06-23 18:37:54,739][15401] Updated weights for policy 0, policy_version 492500 (0.0036) [2024-06-23 18:37:57,700][15401] Updated weights for policy 0, policy_version 492510 (0.0044) [2024-06-23 18:37:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8069283840. Throughput: 0: 42920.8. Samples: 8069405560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:37:58,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 18:38:02,433][15401] Updated weights for policy 0, policy_version 492520 (0.0028) [2024-06-23 18:38:03,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8069496832. Throughput: 0: 42969.4. Samples: 8069663540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 18:38:05,631][15401] Updated weights for policy 0, policy_version 492530 (0.0052) [2024-06-23 18:38:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8069709824. Throughput: 0: 42882.0. Samples: 8069794100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 18:38:09,914][15401] Updated weights for policy 0, policy_version 492540 (0.0038) [2024-06-23 18:38:13,175][15401] Updated weights for policy 0, policy_version 492550 (0.0028) [2024-06-23 18:38:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8069939200. Throughput: 0: 43009.1. Samples: 8070051500. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 18:38:17,646][15401] Updated weights for policy 0, policy_version 492560 (0.0036) [2024-06-23 18:38:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8070135808. Throughput: 0: 43152.2. Samples: 8070314040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 18:38:20,927][15401] Updated weights for policy 0, policy_version 492570 (0.0037) [2024-06-23 18:38:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8070365184. Throughput: 0: 42966.2. Samples: 8070437860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 18:38:25,291][15401] Updated weights for policy 0, policy_version 492580 (0.0028) [2024-06-23 18:38:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8070578176. Throughput: 0: 42903.1. Samples: 8070693260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 18:38:28,584][15401] Updated weights for policy 0, policy_version 492590 (0.0036) [2024-06-23 18:38:32,725][15401] Updated weights for policy 0, policy_version 492600 (0.0034) [2024-06-23 18:38:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 8070774784. Throughput: 0: 43334.8. Samples: 8070964540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 18:38:34,287][15349] Signal inference workers to stop experience collection... (119600 times) [2024-06-23 18:38:34,314][15401] InferenceWorker_p0-w0: stopping experience collection (119600 times) [2024-06-23 18:38:34,343][15349] Signal inference workers to resume experience collection... (119600 times) [2024-06-23 18:38:34,348][15401] InferenceWorker_p0-w0: resuming experience collection (119600 times) [2024-06-23 18:38:36,162][15401] Updated weights for policy 0, policy_version 492610 (0.0032) [2024-06-23 18:38:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8071020544. Throughput: 0: 43193.8. Samples: 8071090280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 18:38:40,174][15401] Updated weights for policy 0, policy_version 492620 (0.0024) [2024-06-23 18:38:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 8071217152. Throughput: 0: 43155.1. Samples: 8071347640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:43,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 18:38:43,883][15401] Updated weights for policy 0, policy_version 492630 (0.0029) [2024-06-23 18:38:47,940][15401] Updated weights for policy 0, policy_version 492640 (0.0021) [2024-06-23 18:38:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8071430144. Throughput: 0: 43192.3. Samples: 8071607200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 18:38:51,321][15401] Updated weights for policy 0, policy_version 492650 (0.0028) [2024-06-23 18:38:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 8071659520. Throughput: 0: 43110.8. Samples: 8071734080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:53,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-23 18:38:55,910][15401] Updated weights for policy 0, policy_version 492660 (0.0039) [2024-06-23 18:38:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 8071872512. Throughput: 0: 43026.0. Samples: 8071987660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:38:58,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 18:38:58,935][15401] Updated weights for policy 0, policy_version 492670 (0.0037) [2024-06-23 18:39:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8072052736. Throughput: 0: 43109.3. Samples: 8072253960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:39:03,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 18:39:03,454][15401] Updated weights for policy 0, policy_version 492680 (0.0030) [2024-06-23 18:39:06,628][15401] Updated weights for policy 0, policy_version 492690 (0.0024) [2024-06-23 18:39:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 8072298496. Throughput: 0: 42939.5. Samples: 8072370140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:39:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 18:39:10,911][15401] Updated weights for policy 0, policy_version 492700 (0.0041) [2024-06-23 18:39:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8072527872. Throughput: 0: 42982.2. Samples: 8072627460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 18:39:13,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 18:39:14,224][15401] Updated weights for policy 0, policy_version 492710 (0.0032) [2024-06-23 18:39:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8072708096. Throughput: 0: 42736.4. Samples: 8072887680. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 18:39:18,570][15401] Updated weights for policy 0, policy_version 492720 (0.0035) [2024-06-23 18:39:21,805][15401] Updated weights for policy 0, policy_version 492730 (0.0028) [2024-06-23 18:39:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8072937472. Throughput: 0: 42583.1. Samples: 8073006520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:23,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 18:39:26,105][15401] Updated weights for policy 0, policy_version 492740 (0.0033) [2024-06-23 18:39:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8073150464. Throughput: 0: 42667.1. Samples: 8073267560. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:28,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-23 18:39:29,457][15401] Updated weights for policy 0, policy_version 492750 (0.0041) [2024-06-23 18:39:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8073347072. Throughput: 0: 42715.7. Samples: 8073529400. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 18:39:33,677][15401] Updated weights for policy 0, policy_version 492760 (0.0036) [2024-06-23 18:39:37,334][15401] Updated weights for policy 0, policy_version 492770 (0.0043) [2024-06-23 18:39:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8073576448. Throughput: 0: 42662.4. Samples: 8073653880. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 18:39:41,397][15401] Updated weights for policy 0, policy_version 492780 (0.0029) [2024-06-23 18:39:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 8073805824. Throughput: 0: 42764.8. Samples: 8073912080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 18:39:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492786_8073805824.pth... [2024-06-23 18:39:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492158_8063516672.pth [2024-06-23 18:39:45,035][15401] Updated weights for policy 0, policy_version 492790 (0.0029) [2024-06-23 18:39:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8074002432. Throughput: 0: 42578.6. Samples: 8074170000. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 18:39:49,129][15401] Updated weights for policy 0, policy_version 492800 (0.0030) [2024-06-23 18:39:52,582][15401] Updated weights for policy 0, policy_version 492810 (0.0025) [2024-06-23 18:39:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8074231808. Throughput: 0: 42787.2. Samples: 8074295560. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 18:39:56,528][15349] Signal inference workers to stop experience collection... (119650 times) [2024-06-23 18:39:56,529][15349] Signal inference workers to resume experience collection... (119650 times) [2024-06-23 18:39:56,570][15401] InferenceWorker_p0-w0: stopping experience collection (119650 times) [2024-06-23 18:39:56,570][15401] InferenceWorker_p0-w0: resuming experience collection (119650 times) [2024-06-23 18:39:56,675][15401] Updated weights for policy 0, policy_version 492820 (0.0050) [2024-06-23 18:39:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8074428416. Throughput: 0: 42891.2. Samples: 8074557560. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:39:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 18:40:00,077][15401] Updated weights for policy 0, policy_version 492830 (0.0035) [2024-06-23 18:40:03,392][15132] Fps is (10 sec: 40948.7, 60 sec: 43142.6, 300 sec: 42820.2). Total num frames: 8074641408. Throughput: 0: 43002.7. Samples: 8074822920. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:03,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 18:40:04,242][15401] Updated weights for policy 0, policy_version 492840 (0.0036) [2024-06-23 18:40:07,502][15401] Updated weights for policy 0, policy_version 492850 (0.0039) [2024-06-23 18:40:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8074887168. Throughput: 0: 43112.1. Samples: 8074946560. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 18:40:11,832][15401] Updated weights for policy 0, policy_version 492860 (0.0034) [2024-06-23 18:40:13,390][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 8075083776. Throughput: 0: 43002.5. Samples: 8075202680. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 18:40:15,342][15401] Updated weights for policy 0, policy_version 492870 (0.0032) [2024-06-23 18:40:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8075280384. Throughput: 0: 43071.4. Samples: 8075467620. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 18:40:19,575][15401] Updated weights for policy 0, policy_version 492880 (0.0035) [2024-06-23 18:40:22,819][15401] Updated weights for policy 0, policy_version 492890 (0.0037) [2024-06-23 18:40:23,390][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8075526144. Throughput: 0: 43067.9. Samples: 8075591940. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:23,397][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 18:40:27,178][15401] Updated weights for policy 0, policy_version 492900 (0.0035) [2024-06-23 18:40:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8075739136. Throughput: 0: 43108.3. Samples: 8075851960. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:28,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 18:40:30,545][15401] Updated weights for policy 0, policy_version 492910 (0.0039) [2024-06-23 18:40:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8075935744. Throughput: 0: 42998.3. Samples: 8076104920. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 18:40:34,900][15401] Updated weights for policy 0, policy_version 492920 (0.0043) [2024-06-23 18:40:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8076148736. Throughput: 0: 43001.3. Samples: 8076230620. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 18:40:38,408][15401] Updated weights for policy 0, policy_version 492930 (0.0032) [2024-06-23 18:40:42,527][15401] Updated weights for policy 0, policy_version 492940 (0.0036) [2024-06-23 18:40:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8076378112. Throughput: 0: 43147.1. Samples: 8076499180. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-23 18:40:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 18:40:45,937][15401] Updated weights for policy 0, policy_version 492950 (0.0039) [2024-06-23 18:40:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8076574720. Throughput: 0: 42785.3. Samples: 8076748140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:40:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 18:40:50,125][15401] Updated weights for policy 0, policy_version 492960 (0.0036) [2024-06-23 18:40:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8076804096. Throughput: 0: 42907.0. Samples: 8076877380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:40:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 18:40:53,518][15401] Updated weights for policy 0, policy_version 492970 (0.0036) [2024-06-23 18:40:57,558][15401] Updated weights for policy 0, policy_version 492980 (0.0034) [2024-06-23 18:40:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8077017088. Throughput: 0: 43083.8. Samples: 8077141440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:40:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 18:41:01,049][15401] Updated weights for policy 0, policy_version 492990 (0.0042) [2024-06-23 18:41:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.5, 300 sec: 42987.2). Total num frames: 8077230080. Throughput: 0: 42842.3. Samples: 8077395520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 18:41:05,053][15401] Updated weights for policy 0, policy_version 493000 (0.0027) [2024-06-23 18:41:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8077459456. Throughput: 0: 43054.3. Samples: 8077529380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 18:41:08,533][15401] Updated weights for policy 0, policy_version 493010 (0.0033) [2024-06-23 18:41:12,588][15401] Updated weights for policy 0, policy_version 493020 (0.0038) [2024-06-23 18:41:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8077639680. Throughput: 0: 42934.2. Samples: 8077784000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 18:41:16,321][15401] Updated weights for policy 0, policy_version 493030 (0.0043) [2024-06-23 18:41:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 8077885440. Throughput: 0: 42926.3. Samples: 8078036600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 18:41:20,563][15401] Updated weights for policy 0, policy_version 493040 (0.0026) [2024-06-23 18:41:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8078098432. Throughput: 0: 43115.5. Samples: 8078170820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:23,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 18:41:24,101][15401] Updated weights for policy 0, policy_version 493050 (0.0034) [2024-06-23 18:41:27,920][15401] Updated weights for policy 0, policy_version 493060 (0.0036) [2024-06-23 18:41:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8078295040. Throughput: 0: 42749.3. Samples: 8078422900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:28,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 18:41:31,544][15401] Updated weights for policy 0, policy_version 493070 (0.0032) [2024-06-23 18:41:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8078524416. Throughput: 0: 43085.4. Samples: 8078686980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 18:41:35,371][15401] Updated weights for policy 0, policy_version 493080 (0.0024) [2024-06-23 18:41:37,741][15349] Signal inference workers to stop experience collection... (119700 times) [2024-06-23 18:41:37,772][15401] InferenceWorker_p0-w0: stopping experience collection (119700 times) [2024-06-23 18:41:37,800][15349] Signal inference workers to resume experience collection... (119700 times) [2024-06-23 18:41:37,804][15401] InferenceWorker_p0-w0: resuming experience collection (119700 times) [2024-06-23 18:41:38,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42871.2, 300 sec: 42876.0). Total num frames: 8078721024. Throughput: 0: 43139.8. Samples: 8078818680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:38,399][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 18:41:39,123][15401] Updated weights for policy 0, policy_version 493090 (0.0022) [2024-06-23 18:41:42,910][15401] Updated weights for policy 0, policy_version 493100 (0.0030) [2024-06-23 18:41:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8078950400. Throughput: 0: 42929.6. Samples: 8079073280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 18:41:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493100_8078950400.pth... [2024-06-23 18:41:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492471_8068644864.pth [2024-06-23 18:41:46,680][15401] Updated weights for policy 0, policy_version 493110 (0.0041) [2024-06-23 18:41:48,390][15132] Fps is (10 sec: 45876.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8079179776. Throughput: 0: 43119.1. Samples: 8079335880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 18:41:50,326][15401] Updated weights for policy 0, policy_version 493120 (0.0034) [2024-06-23 18:41:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8079376384. Throughput: 0: 43068.0. Samples: 8079467440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:41:54,622][15401] Updated weights for policy 0, policy_version 493130 (0.0034) [2024-06-23 18:41:57,754][15401] Updated weights for policy 0, policy_version 493140 (0.0025) [2024-06-23 18:41:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8079605760. Throughput: 0: 43045.9. Samples: 8079721060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:41:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 18:42:02,273][15401] Updated weights for policy 0, policy_version 493150 (0.0035) [2024-06-23 18:42:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 8079835136. Throughput: 0: 43263.0. Samples: 8079983440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:42:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 18:42:05,261][15401] Updated weights for policy 0, policy_version 493160 (0.0027) [2024-06-23 18:42:08,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42593.9, 300 sec: 42930.7). Total num frames: 8080015360. Throughput: 0: 43165.9. Samples: 8080113560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 18:42:08,396][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 18:42:09,814][15401] Updated weights for policy 0, policy_version 493170 (0.0032) [2024-06-23 18:42:12,917][15401] Updated weights for policy 0, policy_version 493180 (0.0040) [2024-06-23 18:42:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43690.8, 300 sec: 43042.7). Total num frames: 8080261120. Throughput: 0: 43092.5. Samples: 8080362060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 18:42:17,518][15401] Updated weights for policy 0, policy_version 493190 (0.0022) [2024-06-23 18:42:18,389][15132] Fps is (10 sec: 45904.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8080474112. Throughput: 0: 42970.1. Samples: 8080620640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:18,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 18:42:20,795][15401] Updated weights for policy 0, policy_version 493200 (0.0021) [2024-06-23 18:42:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8080654336. Throughput: 0: 42958.6. Samples: 8080751800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:23,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 18:42:25,163][15401] Updated weights for policy 0, policy_version 493210 (0.0040) [2024-06-23 18:42:28,245][15401] Updated weights for policy 0, policy_version 493220 (0.0029) [2024-06-23 18:42:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 8080916480. Throughput: 0: 43065.4. Samples: 8081011220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 18:42:32,651][15401] Updated weights for policy 0, policy_version 493230 (0.0029) [2024-06-23 18:42:33,396][15132] Fps is (10 sec: 47483.0, 60 sec: 43412.9, 300 sec: 43097.3). Total num frames: 8081129472. Throughput: 0: 43026.3. Samples: 8081272340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:33,397][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 18:42:35,786][15401] Updated weights for policy 0, policy_version 493240 (0.0036) [2024-06-23 18:42:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.8, 300 sec: 42931.6). Total num frames: 8081309696. Throughput: 0: 42910.2. Samples: 8081398400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:38,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 18:42:40,410][15401] Updated weights for policy 0, policy_version 493250 (0.0027) [2024-06-23 18:42:43,371][15401] Updated weights for policy 0, policy_version 493260 (0.0032) [2024-06-23 18:42:43,389][15132] Fps is (10 sec: 44265.3, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 8081571840. Throughput: 0: 42999.1. Samples: 8081656020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:43,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 18:42:47,931][15401] Updated weights for policy 0, policy_version 493270 (0.0033) [2024-06-23 18:42:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 8081752064. Throughput: 0: 42964.9. Samples: 8081916860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:48,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-23 18:42:50,928][15349] Signal inference workers to stop experience collection... (119750 times) [2024-06-23 18:42:50,929][15349] Signal inference workers to resume experience collection... (119750 times) [2024-06-23 18:42:50,985][15401] InferenceWorker_p0-w0: stopping experience collection (119750 times) [2024-06-23 18:42:50,985][15401] InferenceWorker_p0-w0: resuming experience collection (119750 times) [2024-06-23 18:42:51,067][15401] Updated weights for policy 0, policy_version 493280 (0.0031) [2024-06-23 18:42:53,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 8081948672. Throughput: 0: 42693.9. Samples: 8082034620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:53,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 18:42:55,642][15401] Updated weights for policy 0, policy_version 493290 (0.0035) [2024-06-23 18:42:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 8082210816. Throughput: 0: 42907.9. Samples: 8082292920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:42:58,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 18:42:58,519][15401] Updated weights for policy 0, policy_version 493300 (0.0039) [2024-06-23 18:43:03,241][15401] Updated weights for policy 0, policy_version 493310 (0.0030) [2024-06-23 18:43:03,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 8082391040. Throughput: 0: 42964.8. Samples: 8082554060. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:03,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 18:43:07,075][15401] Updated weights for policy 0, policy_version 493320 (0.0033) [2024-06-23 18:43:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43149.1, 300 sec: 42931.7). Total num frames: 8082604032. Throughput: 0: 42803.5. Samples: 8082677960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 18:43:10,785][15401] Updated weights for policy 0, policy_version 493330 (0.0024) [2024-06-23 18:43:13,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 8082833408. Throughput: 0: 42800.6. Samples: 8082937240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 18:43:14,677][15401] Updated weights for policy 0, policy_version 493340 (0.0032) [2024-06-23 18:43:18,317][15401] Updated weights for policy 0, policy_version 493350 (0.0028) [2024-06-23 18:43:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8083046400. Throughput: 0: 42869.1. Samples: 8083201180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 18:43:22,755][15401] Updated weights for policy 0, policy_version 493360 (0.0040) [2024-06-23 18:43:23,390][15132] Fps is (10 sec: 40958.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8083243008. Throughput: 0: 42768.8. Samples: 8083323000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 18:43:25,935][15401] Updated weights for policy 0, policy_version 493370 (0.0024) [2024-06-23 18:43:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 8083472384. Throughput: 0: 42671.0. Samples: 8083576220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 18:43:30,447][15401] Updated weights for policy 0, policy_version 493380 (0.0033) [2024-06-23 18:43:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42603.0, 300 sec: 42931.6). Total num frames: 8083685376. Throughput: 0: 42662.6. Samples: 8083836680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-23 18:43:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 18:43:33,472][15401] Updated weights for policy 0, policy_version 493390 (0.0035) [2024-06-23 18:43:37,945][15401] Updated weights for policy 0, policy_version 493400 (0.0032) [2024-06-23 18:43:38,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 8083865600. Throughput: 0: 42906.8. Samples: 8083965320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:43:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:43:41,130][15401] Updated weights for policy 0, policy_version 493410 (0.0042) [2024-06-23 18:43:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.7, 300 sec: 42986.8). Total num frames: 8084111360. Throughput: 0: 42810.2. Samples: 8084219480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:43:43,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 18:43:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493415_8084111360.pth... [2024-06-23 18:43:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000492786_8073805824.pth [2024-06-23 18:43:45,409][15401] Updated weights for policy 0, policy_version 493420 (0.0035) [2024-06-23 18:43:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8084324352. Throughput: 0: 42758.4. Samples: 8084478180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:43:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 18:43:48,618][15401] Updated weights for policy 0, policy_version 493430 (0.0033) [2024-06-23 18:43:53,025][15401] Updated weights for policy 0, policy_version 493440 (0.0028) [2024-06-23 18:43:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 8084520960. Throughput: 0: 42814.2. Samples: 8084604600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:43:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 18:43:56,609][15401] Updated weights for policy 0, policy_version 493450 (0.0038) [2024-06-23 18:43:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 43098.2). Total num frames: 8084766720. Throughput: 0: 42782.4. Samples: 8084862460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:43:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 18:44:00,955][15401] Updated weights for policy 0, policy_version 493460 (0.0033) [2024-06-23 18:44:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8084963328. Throughput: 0: 42769.9. Samples: 8085125820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 18:44:04,325][15401] Updated weights for policy 0, policy_version 493470 (0.0037) [2024-06-23 18:44:05,556][15349] Signal inference workers to stop experience collection... (119800 times) [2024-06-23 18:44:05,560][15349] Signal inference workers to resume experience collection... (119800 times) [2024-06-23 18:44:05,586][15401] InferenceWorker_p0-w0: stopping experience collection (119800 times) [2024-06-23 18:44:05,587][15401] InferenceWorker_p0-w0: resuming experience collection (119800 times) [2024-06-23 18:44:08,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8085159936. Throughput: 0: 42769.0. Samples: 8085247600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 18:44:08,441][15401] Updated weights for policy 0, policy_version 493480 (0.0034) [2024-06-23 18:44:11,748][15401] Updated weights for policy 0, policy_version 493490 (0.0029) [2024-06-23 18:44:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 8085422080. Throughput: 0: 42946.8. Samples: 8085508820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 18:44:15,914][15401] Updated weights for policy 0, policy_version 493500 (0.0033) [2024-06-23 18:44:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8085585920. Throughput: 0: 43034.1. Samples: 8085773220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:18,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 18:44:19,270][15401] Updated weights for policy 0, policy_version 493510 (0.0037) [2024-06-23 18:44:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8085815296. Throughput: 0: 42848.4. Samples: 8085893500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 18:44:23,872][15401] Updated weights for policy 0, policy_version 493520 (0.0037) [2024-06-23 18:44:27,108][15401] Updated weights for policy 0, policy_version 493530 (0.0030) [2024-06-23 18:44:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 8086044672. Throughput: 0: 43001.8. Samples: 8086154460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 18:44:31,367][15401] Updated weights for policy 0, policy_version 493540 (0.0039) [2024-06-23 18:44:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8086241280. Throughput: 0: 42943.4. Samples: 8086410640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-23 18:44:34,788][15401] Updated weights for policy 0, policy_version 493550 (0.0035) [2024-06-23 18:44:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8086454272. Throughput: 0: 42913.3. Samples: 8086535700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 18:44:38,902][15401] Updated weights for policy 0, policy_version 493560 (0.0035) [2024-06-23 18:44:42,854][15401] Updated weights for policy 0, policy_version 493570 (0.0034) [2024-06-23 18:44:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.0, 300 sec: 42987.1). Total num frames: 8086683648. Throughput: 0: 42902.6. Samples: 8086793080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 18:44:46,655][15401] Updated weights for policy 0, policy_version 493580 (0.0032) [2024-06-23 18:44:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 8086863872. Throughput: 0: 42735.6. Samples: 8087048920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 18:44:50,555][15401] Updated weights for policy 0, policy_version 493590 (0.0033) [2024-06-23 18:44:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 8087109632. Throughput: 0: 42719.0. Samples: 8087169960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 18:44:54,238][15401] Updated weights for policy 0, policy_version 493600 (0.0036) [2024-06-23 18:44:58,022][15401] Updated weights for policy 0, policy_version 493610 (0.0036) [2024-06-23 18:44:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42987.6). Total num frames: 8087322624. Throughput: 0: 42808.0. Samples: 8087435180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:44:58,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 18:45:01,870][15401] Updated weights for policy 0, policy_version 493620 (0.0040) [2024-06-23 18:45:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8087519232. Throughput: 0: 42696.1. Samples: 8087694540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-23 18:45:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 18:45:05,642][15401] Updated weights for policy 0, policy_version 493630 (0.0030) [2024-06-23 18:45:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8087732224. Throughput: 0: 42684.0. Samples: 8087814280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 18:45:09,437][15401] Updated weights for policy 0, policy_version 493640 (0.0029) [2024-06-23 18:45:13,116][15401] Updated weights for policy 0, policy_version 493650 (0.0035) [2024-06-23 18:45:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 8087961600. Throughput: 0: 42626.6. Samples: 8088072660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 18:45:17,268][15401] Updated weights for policy 0, policy_version 493660 (0.0030) [2024-06-23 18:45:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8088158208. Throughput: 0: 42645.4. Samples: 8088329680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 18:45:20,834][15401] Updated weights for policy 0, policy_version 493670 (0.0022) [2024-06-23 18:45:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8088387584. Throughput: 0: 42585.4. Samples: 8088452040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 18:45:25,103][15401] Updated weights for policy 0, policy_version 493680 (0.0041) [2024-06-23 18:45:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8088600576. Throughput: 0: 42708.7. Samples: 8088714960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 18:45:28,474][15401] Updated weights for policy 0, policy_version 493690 (0.0029) [2024-06-23 18:45:29,611][15349] Signal inference workers to stop experience collection... (119850 times) [2024-06-23 18:45:29,612][15349] Signal inference workers to resume experience collection... (119850 times) [2024-06-23 18:45:29,664][15401] InferenceWorker_p0-w0: stopping experience collection (119850 times) [2024-06-23 18:45:29,664][15401] InferenceWorker_p0-w0: resuming experience collection (119850 times) [2024-06-23 18:45:32,669][15401] Updated weights for policy 0, policy_version 493700 (0.0033) [2024-06-23 18:45:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8088780800. Throughput: 0: 42699.2. Samples: 8088970380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 18:45:36,099][15401] Updated weights for policy 0, policy_version 493710 (0.0038) [2024-06-23 18:45:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8089026560. Throughput: 0: 42643.6. Samples: 8089088920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 18:45:40,310][15401] Updated weights for policy 0, policy_version 493720 (0.0044) [2024-06-23 18:45:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8089239552. Throughput: 0: 42627.6. Samples: 8089353420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 18:45:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493729_8089255936.pth... [2024-06-23 18:45:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493100_8078950400.pth [2024-06-23 18:45:43,800][15401] Updated weights for policy 0, policy_version 493730 (0.0035) [2024-06-23 18:45:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8089419776. Throughput: 0: 42504.0. Samples: 8089607220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 18:45:48,444][15401] Updated weights for policy 0, policy_version 493740 (0.0038) [2024-06-23 18:45:51,431][15401] Updated weights for policy 0, policy_version 493750 (0.0033) [2024-06-23 18:45:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8089649152. Throughput: 0: 42553.9. Samples: 8089729200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 18:45:56,117][15401] Updated weights for policy 0, policy_version 493760 (0.0043) [2024-06-23 18:45:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8089878528. Throughput: 0: 42553.4. Samples: 8089987560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:45:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 18:45:59,262][15401] Updated weights for policy 0, policy_version 493770 (0.0034) [2024-06-23 18:46:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8090058752. Throughput: 0: 42615.0. Samples: 8090247360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 18:46:03,826][15401] Updated weights for policy 0, policy_version 493780 (0.0038) [2024-06-23 18:46:06,873][15401] Updated weights for policy 0, policy_version 493790 (0.0030) [2024-06-23 18:46:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 8090288128. Throughput: 0: 42575.1. Samples: 8090368020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:08,393][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 18:46:11,328][15401] Updated weights for policy 0, policy_version 493800 (0.0040) [2024-06-23 18:46:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8090517504. Throughput: 0: 42528.6. Samples: 8090628760. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:46:14,727][15401] Updated weights for policy 0, policy_version 493810 (0.0035) [2024-06-23 18:46:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8090697728. Throughput: 0: 42701.2. Samples: 8090891940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 18:46:18,830][15401] Updated weights for policy 0, policy_version 493820 (0.0048) [2024-06-23 18:46:22,201][15401] Updated weights for policy 0, policy_version 493830 (0.0033) [2024-06-23 18:46:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8090943488. Throughput: 0: 42781.7. Samples: 8091014100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:23,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 18:46:26,353][15401] Updated weights for policy 0, policy_version 493840 (0.0022) [2024-06-23 18:46:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8091172864. Throughput: 0: 42706.1. Samples: 8091275200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-23 18:46:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 18:46:30,035][15401] Updated weights for policy 0, policy_version 493850 (0.0028) [2024-06-23 18:46:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 8091336704. Throughput: 0: 42918.7. Samples: 8091538560. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:33,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 18:46:33,939][15401] Updated weights for policy 0, policy_version 493860 (0.0033) [2024-06-23 18:46:37,525][15401] Updated weights for policy 0, policy_version 493870 (0.0040) [2024-06-23 18:46:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8091598848. Throughput: 0: 42973.6. Samples: 8091663020. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 18:46:41,611][15401] Updated weights for policy 0, policy_version 493880 (0.0038) [2024-06-23 18:46:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8091811840. Throughput: 0: 42944.4. Samples: 8091920060. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 18:46:45,116][15401] Updated weights for policy 0, policy_version 493890 (0.0029) [2024-06-23 18:46:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8092008448. Throughput: 0: 43048.1. Samples: 8092184520. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 18:46:49,194][15401] Updated weights for policy 0, policy_version 493900 (0.0031) [2024-06-23 18:46:51,871][15349] Signal inference workers to stop experience collection... (119900 times) [2024-06-23 18:46:51,924][15401] InferenceWorker_p0-w0: stopping experience collection (119900 times) [2024-06-23 18:46:51,988][15349] Signal inference workers to resume experience collection... (119900 times) [2024-06-23 18:46:51,988][15401] InferenceWorker_p0-w0: resuming experience collection (119900 times) [2024-06-23 18:46:52,540][15401] Updated weights for policy 0, policy_version 493910 (0.0023) [2024-06-23 18:46:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8092237824. Throughput: 0: 43191.6. Samples: 8092311540. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:53,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 18:46:56,556][15401] Updated weights for policy 0, policy_version 493920 (0.0031) [2024-06-23 18:46:58,390][15132] Fps is (10 sec: 42594.1, 60 sec: 42597.7, 300 sec: 42709.3). Total num frames: 8092434432. Throughput: 0: 43175.7. Samples: 8092571700. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:46:58,391][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 18:47:00,594][15401] Updated weights for policy 0, policy_version 493930 (0.0040) [2024-06-23 18:47:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42877.0). Total num frames: 8092663808. Throughput: 0: 43059.2. Samples: 8092829600. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 18:47:04,212][15401] Updated weights for policy 0, policy_version 493940 (0.0042) [2024-06-23 18:47:08,355][15401] Updated weights for policy 0, policy_version 493950 (0.0045) [2024-06-23 18:47:08,389][15132] Fps is (10 sec: 44241.7, 60 sec: 43146.4, 300 sec: 42765.0). Total num frames: 8092876800. Throughput: 0: 43097.5. Samples: 8092953480. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 18:47:11,707][15401] Updated weights for policy 0, policy_version 493960 (0.0029) [2024-06-23 18:47:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 8093089792. Throughput: 0: 43063.7. Samples: 8093213060. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 18:47:15,946][15401] Updated weights for policy 0, policy_version 493970 (0.0030) [2024-06-23 18:47:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8093286400. Throughput: 0: 42963.6. Samples: 8093471920. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 18:47:19,618][15401] Updated weights for policy 0, policy_version 493980 (0.0029) [2024-06-23 18:47:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8093515776. Throughput: 0: 43030.3. Samples: 8093599380. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:23,394][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 18:47:23,405][15401] Updated weights for policy 0, policy_version 493990 (0.0033) [2024-06-23 18:47:27,316][15401] Updated weights for policy 0, policy_version 494000 (0.0035) [2024-06-23 18:47:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 8093745152. Throughput: 0: 43055.5. Samples: 8093857560. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 18:47:31,193][15401] Updated weights for policy 0, policy_version 494010 (0.0032) [2024-06-23 18:47:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8093941760. Throughput: 0: 42881.0. Samples: 8094114160. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 18:47:34,800][15401] Updated weights for policy 0, policy_version 494020 (0.0029) [2024-06-23 18:47:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8094171136. Throughput: 0: 42924.0. Samples: 8094243120. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 18:47:38,765][15401] Updated weights for policy 0, policy_version 494030 (0.0024) [2024-06-23 18:47:42,410][15401] Updated weights for policy 0, policy_version 494040 (0.0037) [2024-06-23 18:47:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8094384128. Throughput: 0: 42853.4. Samples: 8094500060. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:47:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494043_8094400512.pth... [2024-06-23 18:47:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493415_8084111360.pth [2024-06-23 18:47:46,415][15401] Updated weights for policy 0, policy_version 494050 (0.0025) [2024-06-23 18:47:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 8094597120. Throughput: 0: 42897.8. Samples: 8094760000. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 18:47:50,061][15401] Updated weights for policy 0, policy_version 494060 (0.0035) [2024-06-23 18:47:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8094777344. Throughput: 0: 42945.3. Samples: 8094886020. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-23 18:47:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 18:47:54,208][15401] Updated weights for policy 0, policy_version 494070 (0.0038) [2024-06-23 18:47:57,654][15401] Updated weights for policy 0, policy_version 494080 (0.0051) [2024-06-23 18:47:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43145.2, 300 sec: 42820.6). Total num frames: 8095023104. Throughput: 0: 42938.1. Samples: 8095145280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:47:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 18:48:02,075][15401] Updated weights for policy 0, policy_version 494090 (0.0031) [2024-06-23 18:48:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8095252480. Throughput: 0: 42765.7. Samples: 8095396380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 18:48:05,150][15401] Updated weights for policy 0, policy_version 494100 (0.0034) [2024-06-23 18:48:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8095416320. Throughput: 0: 42718.3. Samples: 8095521700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 18:48:09,054][15349] Signal inference workers to stop experience collection... (119950 times) [2024-06-23 18:48:09,100][15401] InferenceWorker_p0-w0: stopping experience collection (119950 times) [2024-06-23 18:48:09,167][15349] Signal inference workers to resume experience collection... (119950 times) [2024-06-23 18:48:09,168][15401] InferenceWorker_p0-w0: resuming experience collection (119950 times) [2024-06-23 18:48:09,909][15401] Updated weights for policy 0, policy_version 494110 (0.0034) [2024-06-23 18:48:12,725][15401] Updated weights for policy 0, policy_version 494120 (0.0033) [2024-06-23 18:48:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8095678464. Throughput: 0: 42712.5. Samples: 8095779620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 18:48:17,515][15401] Updated weights for policy 0, policy_version 494130 (0.0039) [2024-06-23 18:48:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8095875072. Throughput: 0: 42793.2. Samples: 8096039860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:18,395][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 18:48:20,404][15401] Updated weights for policy 0, policy_version 494140 (0.0029) [2024-06-23 18:48:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8096071680. Throughput: 0: 42621.8. Samples: 8096161100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 18:48:25,278][15401] Updated weights for policy 0, policy_version 494150 (0.0033) [2024-06-23 18:48:27,939][15401] Updated weights for policy 0, policy_version 494160 (0.0026) [2024-06-23 18:48:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8096333824. Throughput: 0: 42791.0. Samples: 8096425660. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 18:48:33,081][15401] Updated weights for policy 0, policy_version 494170 (0.0031) [2024-06-23 18:48:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8096497664. Throughput: 0: 42798.2. Samples: 8096685920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 18:48:35,662][15401] Updated weights for policy 0, policy_version 494180 (0.0027) [2024-06-23 18:48:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 8096727040. Throughput: 0: 42490.3. Samples: 8096798080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 18:48:40,864][15401] Updated weights for policy 0, policy_version 494190 (0.0039) [2024-06-23 18:48:43,162][15401] Updated weights for policy 0, policy_version 494200 (0.0036) [2024-06-23 18:48:43,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8096972800. Throughput: 0: 42746.2. Samples: 8097068860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:43,395][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 18:48:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8097120256. Throughput: 0: 42937.4. Samples: 8097328560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:48,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-23 18:48:48,426][15401] Updated weights for policy 0, policy_version 494210 (0.0028) [2024-06-23 18:48:50,878][15401] Updated weights for policy 0, policy_version 494220 (0.0037) [2024-06-23 18:48:53,390][15132] Fps is (10 sec: 40958.4, 60 sec: 43417.2, 300 sec: 42765.0). Total num frames: 8097382400. Throughput: 0: 42643.5. Samples: 8097440680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:53,391][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 18:48:55,993][15401] Updated weights for policy 0, policy_version 494230 (0.0039) [2024-06-23 18:48:58,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8097611776. Throughput: 0: 42923.2. Samples: 8097711160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:48:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 18:48:58,580][15401] Updated weights for policy 0, policy_version 494240 (0.0033) [2024-06-23 18:49:03,389][15132] Fps is (10 sec: 39324.1, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 8097775616. Throughput: 0: 42873.1. Samples: 8097969140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:49:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 18:49:03,497][15401] Updated weights for policy 0, policy_version 494250 (0.0041) [2024-06-23 18:49:06,404][15401] Updated weights for policy 0, policy_version 494260 (0.0043) [2024-06-23 18:49:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 8098037760. Throughput: 0: 42855.5. Samples: 8098089600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:49:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 18:49:11,198][15401] Updated weights for policy 0, policy_version 494270 (0.0037) [2024-06-23 18:49:13,389][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8098250752. Throughput: 0: 42924.6. Samples: 8098357260. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:49:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 18:49:14,033][15401] Updated weights for policy 0, policy_version 494280 (0.0035) [2024-06-23 18:49:18,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8098414592. Throughput: 0: 42971.0. Samples: 8098619620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:49:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 18:49:18,751][15401] Updated weights for policy 0, policy_version 494290 (0.0028) [2024-06-23 18:49:21,480][15401] Updated weights for policy 0, policy_version 494300 (0.0035) [2024-06-23 18:49:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 8098676736. Throughput: 0: 43229.6. Samples: 8098743420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-23 18:49:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 18:49:25,818][15349] Signal inference workers to stop experience collection... (120000 times) [2024-06-23 18:49:25,818][15349] Signal inference workers to resume experience collection... (120000 times) [2024-06-23 18:49:25,834][15401] InferenceWorker_p0-w0: stopping experience collection (120000 times) [2024-06-23 18:49:25,834][15401] InferenceWorker_p0-w0: resuming experience collection (120000 times) [2024-06-23 18:49:26,307][15401] Updated weights for policy 0, policy_version 494310 (0.0040) [2024-06-23 18:49:28,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8098889728. Throughput: 0: 43059.6. Samples: 8099006540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 18:49:29,098][15401] Updated weights for policy 0, policy_version 494320 (0.0036) [2024-06-23 18:49:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8099053568. Throughput: 0: 43261.8. Samples: 8099275340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 18:49:33,847][15401] Updated weights for policy 0, policy_version 494330 (0.0040) [2024-06-23 18:49:36,799][15401] Updated weights for policy 0, policy_version 494340 (0.0021) [2024-06-23 18:49:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8099332096. Throughput: 0: 43333.7. Samples: 8099390680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 18:49:41,355][15401] Updated weights for policy 0, policy_version 494350 (0.0036) [2024-06-23 18:49:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8099528704. Throughput: 0: 43146.7. Samples: 8099652760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 18:49:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494356_8099528704.pth... [2024-06-23 18:49:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000493729_8089255936.pth [2024-06-23 18:49:44,711][15401] Updated weights for policy 0, policy_version 494360 (0.0032) [2024-06-23 18:49:48,389][15132] Fps is (10 sec: 37683.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8099708928. Throughput: 0: 43181.7. Samples: 8099912320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 18:49:48,847][15401] Updated weights for policy 0, policy_version 494370 (0.0034) [2024-06-23 18:49:52,353][15401] Updated weights for policy 0, policy_version 494380 (0.0027) [2024-06-23 18:49:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.8, 300 sec: 42876.1). Total num frames: 8099971072. Throughput: 0: 43185.3. Samples: 8100032940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:53,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 18:49:56,999][15401] Updated weights for policy 0, policy_version 494390 (0.0037) [2024-06-23 18:49:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8100167680. Throughput: 0: 42870.1. Samples: 8100286420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:49:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:49:59,950][15401] Updated weights for policy 0, policy_version 494400 (0.0032) [2024-06-23 18:50:03,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8100347904. Throughput: 0: 42898.8. Samples: 8100550060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:50:04,415][15401] Updated weights for policy 0, policy_version 494410 (0.0040) [2024-06-23 18:50:07,575][15401] Updated weights for policy 0, policy_version 494420 (0.0041) [2024-06-23 18:50:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8100610048. Throughput: 0: 42890.7. Samples: 8100673500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 18:50:11,893][15401] Updated weights for policy 0, policy_version 494430 (0.0040) [2024-06-23 18:50:13,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 8100823040. Throughput: 0: 42818.6. Samples: 8100933380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 18:50:15,206][15401] Updated weights for policy 0, policy_version 494440 (0.0029) [2024-06-23 18:50:18,392][15132] Fps is (10 sec: 39312.4, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 8101003264. Throughput: 0: 42545.3. Samples: 8101189980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:18,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 18:50:18,936][15349] Signal inference workers to stop experience collection... (120050 times) [2024-06-23 18:50:18,936][15349] Signal inference workers to resume experience collection... (120050 times) [2024-06-23 18:50:18,972][15401] InferenceWorker_p0-w0: stopping experience collection (120050 times) [2024-06-23 18:50:18,972][15401] InferenceWorker_p0-w0: resuming experience collection (120050 times) [2024-06-23 18:50:19,863][15401] Updated weights for policy 0, policy_version 494450 (0.0035) [2024-06-23 18:50:22,900][15401] Updated weights for policy 0, policy_version 494460 (0.0035) [2024-06-23 18:50:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8101249024. Throughput: 0: 42830.2. Samples: 8101318040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 18:50:27,334][15401] Updated weights for policy 0, policy_version 494470 (0.0040) [2024-06-23 18:50:28,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8101445632. Throughput: 0: 42851.4. Samples: 8101581080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 18:50:30,434][15401] Updated weights for policy 0, policy_version 494480 (0.0033) [2024-06-23 18:50:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8101658624. Throughput: 0: 42642.6. Samples: 8101831240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 18:50:34,889][15401] Updated weights for policy 0, policy_version 494490 (0.0034) [2024-06-23 18:50:38,024][15401] Updated weights for policy 0, policy_version 494500 (0.0028) [2024-06-23 18:50:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8101904384. Throughput: 0: 42846.7. Samples: 8101961040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:38,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 18:50:42,392][15401] Updated weights for policy 0, policy_version 494510 (0.0036) [2024-06-23 18:50:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8102100992. Throughput: 0: 42980.0. Samples: 8102220520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 18:50:45,580][15401] Updated weights for policy 0, policy_version 494520 (0.0035) [2024-06-23 18:50:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8102297600. Throughput: 0: 42677.6. Samples: 8102470560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 18:50:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 18:50:50,004][15401] Updated weights for policy 0, policy_version 494530 (0.0031) [2024-06-23 18:50:53,163][15401] Updated weights for policy 0, policy_version 494540 (0.0033) [2024-06-23 18:50:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8102543360. Throughput: 0: 42760.5. Samples: 8102597720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:50:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 18:50:57,644][15401] Updated weights for policy 0, policy_version 494550 (0.0028) [2024-06-23 18:50:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8102739968. Throughput: 0: 42953.6. Samples: 8102866280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:50:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 18:51:01,194][15401] Updated weights for policy 0, policy_version 494560 (0.0048) [2024-06-23 18:51:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8102936576. Throughput: 0: 42814.7. Samples: 8103116540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 18:51:05,287][15401] Updated weights for policy 0, policy_version 494570 (0.0035) [2024-06-23 18:51:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8103165952. Throughput: 0: 42793.4. Samples: 8103243740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:08,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 18:51:08,638][15401] Updated weights for policy 0, policy_version 494580 (0.0031) [2024-06-23 18:51:12,626][15401] Updated weights for policy 0, policy_version 494590 (0.0026) [2024-06-23 18:51:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 8103378944. Throughput: 0: 42783.2. Samples: 8103506320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:13,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-23 18:51:16,102][15401] Updated weights for policy 0, policy_version 494600 (0.0023) [2024-06-23 18:51:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 8103575552. Throughput: 0: 42952.9. Samples: 8103764120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 18:51:20,144][15401] Updated weights for policy 0, policy_version 494610 (0.0026) [2024-06-23 18:51:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8103821312. Throughput: 0: 43003.7. Samples: 8103896200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 18:51:23,782][15401] Updated weights for policy 0, policy_version 494620 (0.0023) [2024-06-23 18:51:27,699][15401] Updated weights for policy 0, policy_version 494630 (0.0045) [2024-06-23 18:51:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8104034304. Throughput: 0: 43000.4. Samples: 8104155540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 18:51:31,251][15401] Updated weights for policy 0, policy_version 494640 (0.0041) [2024-06-23 18:51:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8104230912. Throughput: 0: 43176.4. Samples: 8104413600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:33,393][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 18:51:35,683][15401] Updated weights for policy 0, policy_version 494650 (0.0029) [2024-06-23 18:51:37,959][15349] Signal inference workers to stop experience collection... (120100 times) [2024-06-23 18:51:37,964][15349] Signal inference workers to resume experience collection... (120100 times) [2024-06-23 18:51:37,975][15401] InferenceWorker_p0-w0: stopping experience collection (120100 times) [2024-06-23 18:51:37,987][15401] InferenceWorker_p0-w0: resuming experience collection (120100 times) [2024-06-23 18:51:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8104476672. Throughput: 0: 43233.2. Samples: 8104543220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 18:51:39,122][15401] Updated weights for policy 0, policy_version 494660 (0.0037) [2024-06-23 18:51:43,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8104640512. Throughput: 0: 42944.5. Samples: 8104798780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 18:51:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494669_8104656896.pth... [2024-06-23 18:51:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494043_8094400512.pth [2024-06-23 18:51:43,711][15401] Updated weights for policy 0, policy_version 494670 (0.0037) [2024-06-23 18:51:46,725][15401] Updated weights for policy 0, policy_version 494680 (0.0036) [2024-06-23 18:51:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8104869888. Throughput: 0: 42892.9. Samples: 8105046720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 18:51:51,231][15401] Updated weights for policy 0, policy_version 494690 (0.0040) [2024-06-23 18:51:53,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43144.5, 300 sec: 43042.8). Total num frames: 8105132032. Throughput: 0: 43010.5. Samples: 8105179220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 18:51:54,188][15401] Updated weights for policy 0, policy_version 494700 (0.0025) [2024-06-23 18:51:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8105295872. Throughput: 0: 43016.9. Samples: 8105442080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:51:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 18:51:58,811][15401] Updated weights for policy 0, policy_version 494710 (0.0036) [2024-06-23 18:52:02,091][15401] Updated weights for policy 0, policy_version 494720 (0.0047) [2024-06-23 18:52:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8105508864. Throughput: 0: 42787.9. Samples: 8105689580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:52:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 18:52:06,685][15401] Updated weights for policy 0, policy_version 494730 (0.0033) [2024-06-23 18:52:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 8105771008. Throughput: 0: 42747.9. Samples: 8105819860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:52:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 18:52:09,856][15401] Updated weights for policy 0, policy_version 494740 (0.0030) [2024-06-23 18:52:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8105951232. Throughput: 0: 42698.3. Samples: 8106076960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-23 18:52:13,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 18:52:14,296][15401] Updated weights for policy 0, policy_version 494750 (0.0034) [2024-06-23 18:52:17,747][15401] Updated weights for policy 0, policy_version 494760 (0.0034) [2024-06-23 18:52:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8106147840. Throughput: 0: 42618.4. Samples: 8106331320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 18:52:22,095][15401] Updated weights for policy 0, policy_version 494770 (0.0025) [2024-06-23 18:52:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8106409984. Throughput: 0: 42544.1. Samples: 8106457700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 18:52:25,266][15401] Updated weights for policy 0, policy_version 494780 (0.0030) [2024-06-23 18:52:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 8106590208. Throughput: 0: 42625.6. Samples: 8106717040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:28,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 18:52:29,551][15401] Updated weights for policy 0, policy_version 494790 (0.0035) [2024-06-23 18:52:32,747][15401] Updated weights for policy 0, policy_version 494800 (0.0035) [2024-06-23 18:52:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 8106803200. Throughput: 0: 42924.9. Samples: 8106978340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 18:52:36,983][15401] Updated weights for policy 0, policy_version 494810 (0.0037) [2024-06-23 18:52:38,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 8107048960. Throughput: 0: 42946.0. Samples: 8107111780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 18:52:40,489][15401] Updated weights for policy 0, policy_version 494820 (0.0038) [2024-06-23 18:52:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8107229184. Throughput: 0: 42829.3. Samples: 8107369400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 18:52:44,394][15401] Updated weights for policy 0, policy_version 494830 (0.0033) [2024-06-23 18:52:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8107442176. Throughput: 0: 43096.0. Samples: 8107628900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 18:52:48,444][15401] Updated weights for policy 0, policy_version 494840 (0.0029) [2024-06-23 18:52:52,023][15401] Updated weights for policy 0, policy_version 494850 (0.0034) [2024-06-23 18:52:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 8107671552. Throughput: 0: 43050.4. Samples: 8107757120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 18:52:53,782][15349] Signal inference workers to stop experience collection... (120150 times) [2024-06-23 18:52:53,821][15401] InferenceWorker_p0-w0: stopping experience collection (120150 times) [2024-06-23 18:52:53,833][15349] Signal inference workers to resume experience collection... (120150 times) [2024-06-23 18:52:53,844][15401] InferenceWorker_p0-w0: resuming experience collection (120150 times) [2024-06-23 18:52:55,889][15401] Updated weights for policy 0, policy_version 494860 (0.0034) [2024-06-23 18:52:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8107884544. Throughput: 0: 43011.1. Samples: 8108012460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:52:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 18:52:59,987][15401] Updated weights for policy 0, policy_version 494870 (0.0024) [2024-06-23 18:53:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8108097536. Throughput: 0: 43052.9. Samples: 8108268700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:03,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-23 18:53:03,402][15401] Updated weights for policy 0, policy_version 494880 (0.0027) [2024-06-23 18:53:07,546][15401] Updated weights for policy 0, policy_version 494890 (0.0031) [2024-06-23 18:53:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8108326912. Throughput: 0: 43068.0. Samples: 8108395760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 18:53:10,851][15401] Updated weights for policy 0, policy_version 494900 (0.0033) [2024-06-23 18:53:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8108539904. Throughput: 0: 43176.5. Samples: 8108659880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 18:53:15,060][15401] Updated weights for policy 0, policy_version 494910 (0.0041) [2024-06-23 18:53:18,305][15401] Updated weights for policy 0, policy_version 494920 (0.0034) [2024-06-23 18:53:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 8108769280. Throughput: 0: 42962.7. Samples: 8108911660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 18:53:22,757][15401] Updated weights for policy 0, policy_version 494930 (0.0023) [2024-06-23 18:53:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8108982272. Throughput: 0: 43049.3. Samples: 8109049000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 18:53:25,823][15401] Updated weights for policy 0, policy_version 494940 (0.0041) [2024-06-23 18:53:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43144.5, 300 sec: 42986.8). Total num frames: 8109178880. Throughput: 0: 43023.9. Samples: 8109305580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:28,393][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 18:53:30,259][15401] Updated weights for policy 0, policy_version 494950 (0.0037) [2024-06-23 18:53:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8109408256. Throughput: 0: 42728.9. Samples: 8109551700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 18:53:33,889][15401] Updated weights for policy 0, policy_version 494960 (0.0027) [2024-06-23 18:53:38,075][15401] Updated weights for policy 0, policy_version 494970 (0.0032) [2024-06-23 18:53:38,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8109604864. Throughput: 0: 42809.8. Samples: 8109683560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 18:53:41,395][15401] Updated weights for policy 0, policy_version 494980 (0.0039) [2024-06-23 18:53:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8109817856. Throughput: 0: 42969.3. Samples: 8109946080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-23 18:53:43,391][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 18:53:43,531][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494985_8109834240.pth... [2024-06-23 18:53:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494356_8099528704.pth [2024-06-23 18:53:45,578][15401] Updated weights for policy 0, policy_version 494990 (0.0036) [2024-06-23 18:53:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 8110047232. Throughput: 0: 42869.2. Samples: 8110197820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:53:48,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 18:53:49,211][15401] Updated weights for policy 0, policy_version 495000 (0.0046) [2024-06-23 18:53:53,124][15401] Updated weights for policy 0, policy_version 495010 (0.0029) [2024-06-23 18:53:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8110260224. Throughput: 0: 43003.7. Samples: 8110330920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:53:53,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 18:53:56,910][15401] Updated weights for policy 0, policy_version 495020 (0.0034) [2024-06-23 18:53:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8110456832. Throughput: 0: 42858.8. Samples: 8110588520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:53:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 18:54:00,777][15401] Updated weights for policy 0, policy_version 495030 (0.0023) [2024-06-23 18:54:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8110686208. Throughput: 0: 42927.4. Samples: 8110843400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 18:54:04,666][15401] Updated weights for policy 0, policy_version 495040 (0.0039) [2024-06-23 18:54:08,336][15349] Signal inference workers to stop experience collection... (120200 times) [2024-06-23 18:54:08,371][15401] InferenceWorker_p0-w0: stopping experience collection (120200 times) [2024-06-23 18:54:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8110866432. Throughput: 0: 42755.1. Samples: 8110972980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 18:54:08,393][15349] Signal inference workers to resume experience collection... (120200 times) [2024-06-23 18:54:08,394][15401] InferenceWorker_p0-w0: resuming experience collection (120200 times) [2024-06-23 18:54:08,566][15401] Updated weights for policy 0, policy_version 495050 (0.0034) [2024-06-23 18:54:12,310][15401] Updated weights for policy 0, policy_version 495060 (0.0023) [2024-06-23 18:54:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8111112192. Throughput: 0: 42780.4. Samples: 8111230600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 18:54:16,123][15401] Updated weights for policy 0, policy_version 495070 (0.0029) [2024-06-23 18:54:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8111325184. Throughput: 0: 42953.4. Samples: 8111484600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:54:19,829][15401] Updated weights for policy 0, policy_version 495080 (0.0032) [2024-06-23 18:54:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 8111521792. Throughput: 0: 42910.5. Samples: 8111614540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 18:54:23,667][15401] Updated weights for policy 0, policy_version 495090 (0.0032) [2024-06-23 18:54:27,258][15401] Updated weights for policy 0, policy_version 495100 (0.0024) [2024-06-23 18:54:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 8111734784. Throughput: 0: 42800.5. Samples: 8111872100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 18:54:31,217][15401] Updated weights for policy 0, policy_version 495110 (0.0034) [2024-06-23 18:54:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8111980544. Throughput: 0: 42918.2. Samples: 8112129140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 18:54:35,026][15401] Updated weights for policy 0, policy_version 495120 (0.0035) [2024-06-23 18:54:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8112160768. Throughput: 0: 42892.4. Samples: 8112261080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 18:54:39,133][15401] Updated weights for policy 0, policy_version 495130 (0.0035) [2024-06-23 18:54:42,581][15401] Updated weights for policy 0, policy_version 495140 (0.0058) [2024-06-23 18:54:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8112373760. Throughput: 0: 42744.3. Samples: 8112512020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 18:54:46,751][15401] Updated weights for policy 0, policy_version 495150 (0.0049) [2024-06-23 18:54:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8112603136. Throughput: 0: 42793.9. Samples: 8112769120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 18:54:50,019][15401] Updated weights for policy 0, policy_version 495160 (0.0029) [2024-06-23 18:54:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8112816128. Throughput: 0: 42876.8. Samples: 8112902440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 18:54:54,145][15401] Updated weights for policy 0, policy_version 495170 (0.0042) [2024-06-23 18:54:57,650][15401] Updated weights for policy 0, policy_version 495180 (0.0026) [2024-06-23 18:54:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8113029120. Throughput: 0: 42771.1. Samples: 8113155300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:54:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 18:55:01,778][15401] Updated weights for policy 0, policy_version 495190 (0.0024) [2024-06-23 18:55:03,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 8113242112. Throughput: 0: 42970.5. Samples: 8113418380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:55:03,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 18:55:05,290][15401] Updated weights for policy 0, policy_version 495200 (0.0027) [2024-06-23 18:55:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8113455104. Throughput: 0: 43030.4. Samples: 8113550900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 18:55:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 18:55:09,221][15401] Updated weights for policy 0, policy_version 495210 (0.0036) [2024-06-23 18:55:12,835][15401] Updated weights for policy 0, policy_version 495220 (0.0038) [2024-06-23 18:55:13,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 8113684480. Throughput: 0: 42959.0. Samples: 8113805260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:13,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 18:55:16,781][15401] Updated weights for policy 0, policy_version 495230 (0.0044) [2024-06-23 18:55:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8113881088. Throughput: 0: 43099.8. Samples: 8114068620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 18:55:20,563][15401] Updated weights for policy 0, policy_version 495240 (0.0037) [2024-06-23 18:55:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8114110464. Throughput: 0: 42986.1. Samples: 8114195460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:23,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 18:55:24,259][15401] Updated weights for policy 0, policy_version 495250 (0.0036) [2024-06-23 18:55:27,962][15401] Updated weights for policy 0, policy_version 495260 (0.0045) [2024-06-23 18:55:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8114339840. Throughput: 0: 43078.3. Samples: 8114450540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:28,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 18:55:31,633][15349] Signal inference workers to stop experience collection... (120250 times) [2024-06-23 18:55:31,676][15401] InferenceWorker_p0-w0: stopping experience collection (120250 times) [2024-06-23 18:55:31,696][15349] Signal inference workers to resume experience collection... (120250 times) [2024-06-23 18:55:31,696][15401] InferenceWorker_p0-w0: resuming experience collection (120250 times) [2024-06-23 18:55:31,848][15401] Updated weights for policy 0, policy_version 495270 (0.0034) [2024-06-23 18:55:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8114520064. Throughput: 0: 43112.9. Samples: 8114709200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 18:55:35,816][15401] Updated weights for policy 0, policy_version 495280 (0.0041) [2024-06-23 18:55:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8114765824. Throughput: 0: 42956.9. Samples: 8114835500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 18:55:39,464][15401] Updated weights for policy 0, policy_version 495290 (0.0035) [2024-06-23 18:55:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 8114978816. Throughput: 0: 43082.3. Samples: 8115094000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 18:55:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495299_8114978816.pth... [2024-06-23 18:55:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494669_8104656896.pth [2024-06-23 18:55:43,646][15401] Updated weights for policy 0, policy_version 495300 (0.0037) [2024-06-23 18:55:47,225][15401] Updated weights for policy 0, policy_version 495310 (0.0037) [2024-06-23 18:55:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8115175424. Throughput: 0: 42902.3. Samples: 8115348880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 18:55:51,347][15401] Updated weights for policy 0, policy_version 495320 (0.0033) [2024-06-23 18:55:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8115388416. Throughput: 0: 42723.4. Samples: 8115473460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 18:55:55,238][15401] Updated weights for policy 0, policy_version 495330 (0.0036) [2024-06-23 18:55:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8115617792. Throughput: 0: 42897.4. Samples: 8115735640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:55:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:55:59,002][15401] Updated weights for policy 0, policy_version 495340 (0.0035) [2024-06-23 18:56:02,931][15401] Updated weights for policy 0, policy_version 495350 (0.0036) [2024-06-23 18:56:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 8115814400. Throughput: 0: 42736.7. Samples: 8115991780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:03,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 18:56:06,557][15401] Updated weights for policy 0, policy_version 495360 (0.0037) [2024-06-23 18:56:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8116043776. Throughput: 0: 42669.4. Samples: 8116115580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 18:56:10,400][15401] Updated weights for policy 0, policy_version 495370 (0.0029) [2024-06-23 18:56:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 8116273152. Throughput: 0: 42851.2. Samples: 8116378840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 18:56:14,683][15401] Updated weights for policy 0, policy_version 495380 (0.0038) [2024-06-23 18:56:17,916][15401] Updated weights for policy 0, policy_version 495390 (0.0045) [2024-06-23 18:56:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 8116469760. Throughput: 0: 42786.2. Samples: 8116634680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:18,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 18:56:22,259][15401] Updated weights for policy 0, policy_version 495400 (0.0039) [2024-06-23 18:56:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8116682752. Throughput: 0: 42793.7. Samples: 8116761220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 18:56:25,845][15401] Updated weights for policy 0, policy_version 495410 (0.0033) [2024-06-23 18:56:28,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 8116912128. Throughput: 0: 42877.8. Samples: 8117023500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 18:56:29,641][15401] Updated weights for policy 0, policy_version 495420 (0.0046) [2024-06-23 18:56:33,240][15401] Updated weights for policy 0, policy_version 495430 (0.0046) [2024-06-23 18:56:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8117125120. Throughput: 0: 42894.0. Samples: 8117279120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 18:56:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 18:56:37,127][15401] Updated weights for policy 0, policy_version 495440 (0.0037) [2024-06-23 18:56:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 43042.3). Total num frames: 8117338112. Throughput: 0: 43052.5. Samples: 8117410920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:56:38,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 18:56:40,982][15401] Updated weights for policy 0, policy_version 495450 (0.0030) [2024-06-23 18:56:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8117534720. Throughput: 0: 42862.5. Samples: 8117664460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:56:43,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 18:56:44,963][15401] Updated weights for policy 0, policy_version 495460 (0.0034) [2024-06-23 18:56:48,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8117731328. Throughput: 0: 42942.4. Samples: 8117924180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:56:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 18:56:48,755][15401] Updated weights for policy 0, policy_version 495470 (0.0030) [2024-06-23 18:56:52,554][15401] Updated weights for policy 0, policy_version 495480 (0.0037) [2024-06-23 18:56:53,389][15132] Fps is (10 sec: 44238.2, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 8117977088. Throughput: 0: 43052.2. Samples: 8118052920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:56:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 18:56:56,432][15401] Updated weights for policy 0, policy_version 495490 (0.0037) [2024-06-23 18:56:56,902][15349] Signal inference workers to stop experience collection... (120300 times) [2024-06-23 18:56:56,929][15401] InferenceWorker_p0-w0: stopping experience collection (120300 times) [2024-06-23 18:56:56,957][15349] Signal inference workers to resume experience collection... (120300 times) [2024-06-23 18:56:56,960][15401] InferenceWorker_p0-w0: resuming experience collection (120300 times) [2024-06-23 18:56:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8118190080. Throughput: 0: 42817.6. Samples: 8118305640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:56:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 18:57:00,069][15401] Updated weights for policy 0, policy_version 495500 (0.0035) [2024-06-23 18:57:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8118386688. Throughput: 0: 43058.8. Samples: 8118572220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 18:57:03,926][15401] Updated weights for policy 0, policy_version 495510 (0.0048) [2024-06-23 18:57:07,581][15401] Updated weights for policy 0, policy_version 495520 (0.0030) [2024-06-23 18:57:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8118632448. Throughput: 0: 43037.3. Samples: 8118697900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 18:57:11,558][15401] Updated weights for policy 0, policy_version 495530 (0.0037) [2024-06-23 18:57:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8118829056. Throughput: 0: 42916.5. Samples: 8118954740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 18:57:15,113][15401] Updated weights for policy 0, policy_version 495540 (0.0025) [2024-06-23 18:57:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8119025664. Throughput: 0: 42937.5. Samples: 8119211300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:18,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-23 18:57:19,159][15401] Updated weights for policy 0, policy_version 495550 (0.0039) [2024-06-23 18:57:23,067][15401] Updated weights for policy 0, policy_version 495560 (0.0034) [2024-06-23 18:57:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 8119271424. Throughput: 0: 42938.3. Samples: 8119343040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 18:57:26,679][15401] Updated weights for policy 0, policy_version 495570 (0.0038) [2024-06-23 18:57:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8119468032. Throughput: 0: 42931.3. Samples: 8119596360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 18:57:30,498][15401] Updated weights for policy 0, policy_version 495580 (0.0044) [2024-06-23 18:57:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 8119681024. Throughput: 0: 42849.2. Samples: 8119852400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 18:57:34,745][15401] Updated weights for policy 0, policy_version 495590 (0.0029) [2024-06-23 18:57:38,105][15401] Updated weights for policy 0, policy_version 495600 (0.0042) [2024-06-23 18:57:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 8119910400. Throughput: 0: 42892.3. Samples: 8119983080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 18:57:42,248][15401] Updated weights for policy 0, policy_version 495610 (0.0034) [2024-06-23 18:57:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8120107008. Throughput: 0: 43009.7. Samples: 8120241080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 18:57:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495613_8120123392.pth... [2024-06-23 18:57:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000494985_8109834240.pth [2024-06-23 18:57:45,623][15401] Updated weights for policy 0, policy_version 495620 (0.0025) [2024-06-23 18:57:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8120320000. Throughput: 0: 42730.7. Samples: 8120495100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:57:49,822][15401] Updated weights for policy 0, policy_version 495630 (0.0027) [2024-06-23 18:57:53,329][15401] Updated weights for policy 0, policy_version 495640 (0.0027) [2024-06-23 18:57:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 8120565760. Throughput: 0: 42786.3. Samples: 8120623280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:53,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 18:57:57,344][15401] Updated weights for policy 0, policy_version 495650 (0.0039) [2024-06-23 18:57:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 8120745984. Throughput: 0: 42769.2. Samples: 8120879460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:57:58,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 18:58:01,335][15401] Updated weights for policy 0, policy_version 495660 (0.0033) [2024-06-23 18:58:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8120975360. Throughput: 0: 42771.1. Samples: 8121136000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-23 18:58:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 18:58:05,137][15401] Updated weights for policy 0, policy_version 495670 (0.0039) [2024-06-23 18:58:08,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8121188352. Throughput: 0: 42714.6. Samples: 8121265200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 18:58:09,098][15401] Updated weights for policy 0, policy_version 495680 (0.0048) [2024-06-23 18:58:12,633][15401] Updated weights for policy 0, policy_version 495690 (0.0023) [2024-06-23 18:58:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8121384960. Throughput: 0: 42768.4. Samples: 8121520940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 18:58:13,474][15349] Signal inference workers to stop experience collection... (120350 times) [2024-06-23 18:58:13,475][15349] Signal inference workers to resume experience collection... (120350 times) [2024-06-23 18:58:13,491][15401] InferenceWorker_p0-w0: stopping experience collection (120350 times) [2024-06-23 18:58:13,492][15401] InferenceWorker_p0-w0: resuming experience collection (120350 times) [2024-06-23 18:58:16,567][15401] Updated weights for policy 0, policy_version 495700 (0.0030) [2024-06-23 18:58:18,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8121614336. Throughput: 0: 42862.8. Samples: 8121781220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 18:58:20,125][15401] Updated weights for policy 0, policy_version 495710 (0.0029) [2024-06-23 18:58:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 8121827328. Throughput: 0: 42825.0. Samples: 8121910200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 18:58:24,114][15401] Updated weights for policy 0, policy_version 495720 (0.0032) [2024-06-23 18:58:27,611][15401] Updated weights for policy 0, policy_version 495730 (0.0029) [2024-06-23 18:58:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8122056704. Throughput: 0: 42749.4. Samples: 8122164800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 18:58:31,624][15401] Updated weights for policy 0, policy_version 495740 (0.0033) [2024-06-23 18:58:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8122269696. Throughput: 0: 43082.2. Samples: 8122433800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 18:58:35,106][15401] Updated weights for policy 0, policy_version 495750 (0.0044) [2024-06-23 18:58:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8122466304. Throughput: 0: 42995.6. Samples: 8122558080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 18:58:39,625][15401] Updated weights for policy 0, policy_version 495760 (0.0030) [2024-06-23 18:58:42,872][15401] Updated weights for policy 0, policy_version 495770 (0.0028) [2024-06-23 18:58:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8122695680. Throughput: 0: 42991.3. Samples: 8122813960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 18:58:47,222][15401] Updated weights for policy 0, policy_version 495780 (0.0041) [2024-06-23 18:58:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8122892288. Throughput: 0: 43110.5. Samples: 8123075980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 18:58:50,698][15401] Updated weights for policy 0, policy_version 495790 (0.0024) [2024-06-23 18:58:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8123138048. Throughput: 0: 42887.7. Samples: 8123195140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 18:58:55,016][15401] Updated weights for policy 0, policy_version 495800 (0.0041) [2024-06-23 18:58:58,224][15401] Updated weights for policy 0, policy_version 495810 (0.0027) [2024-06-23 18:58:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43419.3, 300 sec: 42931.7). Total num frames: 8123351040. Throughput: 0: 43045.8. Samples: 8123458000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:58:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 18:59:02,833][15401] Updated weights for policy 0, policy_version 495820 (0.0038) [2024-06-23 18:59:03,394][15132] Fps is (10 sec: 39304.4, 60 sec: 42595.3, 300 sec: 42931.0). Total num frames: 8123531264. Throughput: 0: 42936.7. Samples: 8123713560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:03,394][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 18:59:06,132][15401] Updated weights for policy 0, policy_version 495830 (0.0032) [2024-06-23 18:59:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 8123777024. Throughput: 0: 42901.4. Samples: 8123840760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 18:59:10,528][15401] Updated weights for policy 0, policy_version 495840 (0.0027) [2024-06-23 18:59:13,390][15132] Fps is (10 sec: 44255.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8123973632. Throughput: 0: 42997.2. Samples: 8124099680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 18:59:13,940][15401] Updated weights for policy 0, policy_version 495850 (0.0031) [2024-06-23 18:59:18,037][15401] Updated weights for policy 0, policy_version 495860 (0.0021) [2024-06-23 18:59:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 8124186624. Throughput: 0: 42841.4. Samples: 8124361660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 18:59:21,489][15401] Updated weights for policy 0, policy_version 495870 (0.0038) [2024-06-23 18:59:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 8124416000. Throughput: 0: 42886.0. Samples: 8124487960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:23,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 18:59:25,571][15401] Updated weights for policy 0, policy_version 495880 (0.0034) [2024-06-23 18:59:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8124612608. Throughput: 0: 42909.3. Samples: 8124744880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 18:59:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 18:59:29,036][15401] Updated weights for policy 0, policy_version 495890 (0.0030) [2024-06-23 18:59:33,063][15401] Updated weights for policy 0, policy_version 495900 (0.0031) [2024-06-23 18:59:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8124825600. Throughput: 0: 42786.2. Samples: 8125001360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:33,399][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 18:59:36,845][15401] Updated weights for policy 0, policy_version 495910 (0.0030) [2024-06-23 18:59:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 8125071360. Throughput: 0: 43128.0. Samples: 8125135900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 18:59:40,891][15401] Updated weights for policy 0, policy_version 495920 (0.0033) [2024-06-23 18:59:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8125251584. Throughput: 0: 42836.4. Samples: 8125385640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 18:59:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495926_8125251584.pth... [2024-06-23 18:59:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495299_8114978816.pth [2024-06-23 18:59:44,489][15401] Updated weights for policy 0, policy_version 495930 (0.0044) [2024-06-23 18:59:46,664][15349] Signal inference workers to stop experience collection... (120400 times) [2024-06-23 18:59:46,672][15349] Signal inference workers to resume experience collection... (120400 times) [2024-06-23 18:59:46,677][15401] InferenceWorker_p0-w0: stopping experience collection (120400 times) [2024-06-23 18:59:46,712][15401] InferenceWorker_p0-w0: resuming experience collection (120400 times) [2024-06-23 18:59:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8125464576. Throughput: 0: 42996.9. Samples: 8125648240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 18:59:48,529][15401] Updated weights for policy 0, policy_version 495940 (0.0047) [2024-06-23 18:59:51,949][15401] Updated weights for policy 0, policy_version 495950 (0.0043) [2024-06-23 18:59:53,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8125710336. Throughput: 0: 42958.7. Samples: 8125773900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 18:59:56,513][15401] Updated weights for policy 0, policy_version 495960 (0.0045) [2024-06-23 18:59:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42876.4). Total num frames: 8125890560. Throughput: 0: 42868.6. Samples: 8126028760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 18:59:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 18:59:59,446][15401] Updated weights for policy 0, policy_version 495970 (0.0045) [2024-06-23 19:00:03,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42874.5, 300 sec: 42876.1). Total num frames: 8126103552. Throughput: 0: 42894.5. Samples: 8126291920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 19:00:04,045][15401] Updated weights for policy 0, policy_version 495980 (0.0031) [2024-06-23 19:00:07,318][15401] Updated weights for policy 0, policy_version 495990 (0.0037) [2024-06-23 19:00:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8126365696. Throughput: 0: 42894.4. Samples: 8126418200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:08,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-23 19:00:11,803][15401] Updated weights for policy 0, policy_version 496000 (0.0038) [2024-06-23 19:00:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8126545920. Throughput: 0: 42860.5. Samples: 8126673600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 19:00:15,063][15401] Updated weights for policy 0, policy_version 496010 (0.0024) [2024-06-23 19:00:18,391][15132] Fps is (10 sec: 39317.2, 60 sec: 42870.6, 300 sec: 42875.9). Total num frames: 8126758912. Throughput: 0: 42832.8. Samples: 8126928880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:18,391][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 19:00:19,337][15401] Updated weights for policy 0, policy_version 496020 (0.0032) [2024-06-23 19:00:22,629][15401] Updated weights for policy 0, policy_version 496030 (0.0044) [2024-06-23 19:00:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 8127004672. Throughput: 0: 42696.0. Samples: 8127057220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 19:00:27,301][15401] Updated weights for policy 0, policy_version 496040 (0.0033) [2024-06-23 19:00:28,392][15132] Fps is (10 sec: 42592.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 8127184896. Throughput: 0: 42919.1. Samples: 8127317100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:28,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 19:00:30,238][15401] Updated weights for policy 0, policy_version 496050 (0.0037) [2024-06-23 19:00:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8127414272. Throughput: 0: 42660.5. Samples: 8127567960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 19:00:34,776][15401] Updated weights for policy 0, policy_version 496060 (0.0030) [2024-06-23 19:00:37,811][15401] Updated weights for policy 0, policy_version 496070 (0.0033) [2024-06-23 19:00:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8127627264. Throughput: 0: 42838.6. Samples: 8127701640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 19:00:42,267][15401] Updated weights for policy 0, policy_version 496080 (0.0041) [2024-06-23 19:00:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8127807488. Throughput: 0: 42826.5. Samples: 8127955960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 19:00:45,576][15401] Updated weights for policy 0, policy_version 496090 (0.0035) [2024-06-23 19:00:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8128053248. Throughput: 0: 42537.8. Samples: 8128206120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 19:00:49,790][15401] Updated weights for policy 0, policy_version 496100 (0.0037) [2024-06-23 19:00:53,349][15401] Updated weights for policy 0, policy_version 496110 (0.0039) [2024-06-23 19:00:53,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 8128266240. Throughput: 0: 42718.6. Samples: 8128340640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:53,392][15132] Avg episode reward: [(0, '0.278')] [2024-06-23 19:00:57,308][15401] Updated weights for policy 0, policy_version 496120 (0.0034) [2024-06-23 19:00:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8128446464. Throughput: 0: 42662.2. Samples: 8128593400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 19:00:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 19:01:01,090][15401] Updated weights for policy 0, policy_version 496130 (0.0039) [2024-06-23 19:01:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 8128708608. Throughput: 0: 42495.6. Samples: 8128841140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 19:01:05,323][15401] Updated weights for policy 0, policy_version 496140 (0.0039) [2024-06-23 19:01:07,914][15349] Signal inference workers to stop experience collection... (120450 times) [2024-06-23 19:01:07,958][15401] InferenceWorker_p0-w0: stopping experience collection (120450 times) [2024-06-23 19:01:07,966][15349] Signal inference workers to resume experience collection... (120450 times) [2024-06-23 19:01:07,968][15401] InferenceWorker_p0-w0: resuming experience collection (120450 times) [2024-06-23 19:01:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 8128905216. Throughput: 0: 42674.6. Samples: 8128977580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 19:01:08,523][15401] Updated weights for policy 0, policy_version 496150 (0.0026) [2024-06-23 19:01:12,822][15401] Updated weights for policy 0, policy_version 496160 (0.0033) [2024-06-23 19:01:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 8129101824. Throughput: 0: 42578.7. Samples: 8129233040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 19:01:16,125][15401] Updated weights for policy 0, policy_version 496170 (0.0052) [2024-06-23 19:01:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43145.4, 300 sec: 42931.7). Total num frames: 8129347584. Throughput: 0: 42625.1. Samples: 8129486080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 19:01:20,337][15401] Updated weights for policy 0, policy_version 496180 (0.0047) [2024-06-23 19:01:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 8129527808. Throughput: 0: 42646.3. Samples: 8129620720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 19:01:23,752][15401] Updated weights for policy 0, policy_version 496190 (0.0037) [2024-06-23 19:01:27,750][15401] Updated weights for policy 0, policy_version 496200 (0.0035) [2024-06-23 19:01:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 8129740800. Throughput: 0: 42591.3. Samples: 8129872560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 19:01:31,546][15401] Updated weights for policy 0, policy_version 496210 (0.0032) [2024-06-23 19:01:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 8129953792. Throughput: 0: 42701.3. Samples: 8130127680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 19:01:35,978][15401] Updated weights for policy 0, policy_version 496220 (0.0024) [2024-06-23 19:01:38,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 8130166784. Throughput: 0: 42424.0. Samples: 8130249720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:38,392][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 19:01:39,249][15401] Updated weights for policy 0, policy_version 496230 (0.0044) [2024-06-23 19:01:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8130379776. Throughput: 0: 42412.0. Samples: 8130501940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 19:01:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496240_8130396160.pth... [2024-06-23 19:01:43,522][15401] Updated weights for policy 0, policy_version 496240 (0.0041) [2024-06-23 19:01:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495613_8120123392.pth [2024-06-23 19:01:47,274][15401] Updated weights for policy 0, policy_version 496250 (0.0036) [2024-06-23 19:01:48,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 8130576384. Throughput: 0: 42608.6. Samples: 8130758520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 19:01:51,054][15401] Updated weights for policy 0, policy_version 496260 (0.0042) [2024-06-23 19:01:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 8130805760. Throughput: 0: 42324.5. Samples: 8130882180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 19:01:54,899][15401] Updated weights for policy 0, policy_version 496270 (0.0026) [2024-06-23 19:01:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8131035136. Throughput: 0: 42472.9. Samples: 8131144320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:01:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 19:01:59,199][15401] Updated weights for policy 0, policy_version 496280 (0.0030) [2024-06-23 19:02:02,466][15401] Updated weights for policy 0, policy_version 496290 (0.0035) [2024-06-23 19:02:03,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42047.8, 300 sec: 42708.6). Total num frames: 8131231744. Throughput: 0: 42485.0. Samples: 8131398180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:02:03,397][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 19:02:06,794][15401] Updated weights for policy 0, policy_version 496300 (0.0038) [2024-06-23 19:02:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8131444736. Throughput: 0: 42388.7. Samples: 8131528220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:02:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 19:02:10,121][15401] Updated weights for policy 0, policy_version 496310 (0.0036) [2024-06-23 19:02:13,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8131674112. Throughput: 0: 42512.7. Samples: 8131785640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:02:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 19:02:14,365][15401] Updated weights for policy 0, policy_version 496320 (0.0038) [2024-06-23 19:02:17,926][15401] Updated weights for policy 0, policy_version 496330 (0.0038) [2024-06-23 19:02:18,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 8131887104. Throughput: 0: 42419.2. Samples: 8132036640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:02:18,392][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 19:02:21,926][15401] Updated weights for policy 0, policy_version 496340 (0.0032) [2024-06-23 19:02:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8132083712. Throughput: 0: 42585.9. Samples: 8132165980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-23 19:02:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 19:02:25,820][15349] Signal inference workers to stop experience collection... (120500 times) [2024-06-23 19:02:25,821][15349] Signal inference workers to resume experience collection... (120500 times) [2024-06-23 19:02:25,834][15401] InferenceWorker_p0-w0: stopping experience collection (120500 times) [2024-06-23 19:02:25,837][15401] Updated weights for policy 0, policy_version 496350 (0.0031) [2024-06-23 19:02:25,865][15401] InferenceWorker_p0-w0: resuming experience collection (120500 times) [2024-06-23 19:02:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8132296704. Throughput: 0: 42756.4. Samples: 8132425980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 19:02:29,514][15401] Updated weights for policy 0, policy_version 496360 (0.0035) [2024-06-23 19:02:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 8132509696. Throughput: 0: 42750.6. Samples: 8132682300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 19:02:33,431][15401] Updated weights for policy 0, policy_version 496370 (0.0035) [2024-06-23 19:02:37,350][15401] Updated weights for policy 0, policy_version 496380 (0.0041) [2024-06-23 19:02:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 8132739072. Throughput: 0: 42867.2. Samples: 8132811200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:38,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 19:02:41,037][15401] Updated weights for policy 0, policy_version 496390 (0.0034) [2024-06-23 19:02:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8132935680. Throughput: 0: 42636.3. Samples: 8133062960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:43,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 19:02:45,162][15401] Updated weights for policy 0, policy_version 496400 (0.0035) [2024-06-23 19:02:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8133148672. Throughput: 0: 42676.8. Samples: 8133318360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 19:02:48,749][15401] Updated weights for policy 0, policy_version 496410 (0.0031) [2024-06-23 19:02:52,620][15401] Updated weights for policy 0, policy_version 496420 (0.0025) [2024-06-23 19:02:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 8133361664. Throughput: 0: 42714.7. Samples: 8133450380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 19:02:56,314][15401] Updated weights for policy 0, policy_version 496430 (0.0029) [2024-06-23 19:02:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8133591040. Throughput: 0: 42747.6. Samples: 8133709280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:02:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 19:03:00,217][15401] Updated weights for policy 0, policy_version 496440 (0.0040) [2024-06-23 19:03:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 8133804032. Throughput: 0: 42805.7. Samples: 8133962800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 19:03:03,947][15401] Updated weights for policy 0, policy_version 496450 (0.0025) [2024-06-23 19:03:07,839][15401] Updated weights for policy 0, policy_version 496460 (0.0027) [2024-06-23 19:03:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8134017024. Throughput: 0: 42739.0. Samples: 8134089240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 19:03:11,793][15401] Updated weights for policy 0, policy_version 496470 (0.0028) [2024-06-23 19:03:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 8134213632. Throughput: 0: 42693.0. Samples: 8134347160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:13,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-23 19:03:15,301][15401] Updated weights for policy 0, policy_version 496480 (0.0033) [2024-06-23 19:03:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 8134443008. Throughput: 0: 42556.5. Samples: 8134597340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 19:03:19,397][15401] Updated weights for policy 0, policy_version 496490 (0.0034) [2024-06-23 19:03:23,034][15401] Updated weights for policy 0, policy_version 496500 (0.0037) [2024-06-23 19:03:23,391][15132] Fps is (10 sec: 44230.8, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 8134656000. Throughput: 0: 42684.0. Samples: 8134732040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:23,391][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 19:03:26,870][15401] Updated weights for policy 0, policy_version 496510 (0.0033) [2024-06-23 19:03:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8134836224. Throughput: 0: 42654.4. Samples: 8134982400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:28,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 19:03:31,123][15401] Updated weights for policy 0, policy_version 496520 (0.0038) [2024-06-23 19:03:33,390][15132] Fps is (10 sec: 40964.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8135065600. Throughput: 0: 42633.6. Samples: 8135236880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 19:03:34,689][15401] Updated weights for policy 0, policy_version 496530 (0.0029) [2024-06-23 19:03:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8135278592. Throughput: 0: 42634.3. Samples: 8135368920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 19:03:38,795][15401] Updated weights for policy 0, policy_version 496540 (0.0026) [2024-06-23 19:03:42,251][15401] Updated weights for policy 0, policy_version 496550 (0.0035) [2024-06-23 19:03:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8135491584. Throughput: 0: 42415.9. Samples: 8135618100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:43,392][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 19:03:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496551_8135491584.pth... [2024-06-23 19:03:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000495926_8125251584.pth [2024-06-23 19:03:46,426][15401] Updated weights for policy 0, policy_version 496560 (0.0038) [2024-06-23 19:03:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8135720960. Throughput: 0: 42394.8. Samples: 8135870560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:03:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 19:03:49,952][15401] Updated weights for policy 0, policy_version 496570 (0.0031) [2024-06-23 19:03:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8135917568. Throughput: 0: 42581.4. Samples: 8136005400. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:03:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 19:03:53,996][15401] Updated weights for policy 0, policy_version 496580 (0.0040) [2024-06-23 19:03:56,835][15349] Signal inference workers to stop experience collection... (120550 times) [2024-06-23 19:03:56,835][15349] Signal inference workers to resume experience collection... (120550 times) [2024-06-23 19:03:56,870][15401] InferenceWorker_p0-w0: stopping experience collection (120550 times) [2024-06-23 19:03:56,870][15401] InferenceWorker_p0-w0: resuming experience collection (120550 times) [2024-06-23 19:03:57,829][15401] Updated weights for policy 0, policy_version 496590 (0.0045) [2024-06-23 19:03:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.6). Total num frames: 8136146944. Throughput: 0: 42495.8. Samples: 8136259480. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:03:58,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-23 19:04:01,396][15401] Updated weights for policy 0, policy_version 496600 (0.0030) [2024-06-23 19:04:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8136376320. Throughput: 0: 42698.2. Samples: 8136518760. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:03,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 19:04:05,654][15401] Updated weights for policy 0, policy_version 496610 (0.0037) [2024-06-23 19:04:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8136572928. Throughput: 0: 42541.6. Samples: 8136646360. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 19:04:09,036][15401] Updated weights for policy 0, policy_version 496620 (0.0033) [2024-06-23 19:04:13,251][15401] Updated weights for policy 0, policy_version 496630 (0.0036) [2024-06-23 19:04:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8136785920. Throughput: 0: 42723.9. Samples: 8136904980. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 19:04:16,674][15401] Updated weights for policy 0, policy_version 496640 (0.0034) [2024-06-23 19:04:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8136998912. Throughput: 0: 42796.2. Samples: 8137162700. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 19:04:20,792][15401] Updated weights for policy 0, policy_version 496650 (0.0030) [2024-06-23 19:04:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 8137211904. Throughput: 0: 42736.5. Samples: 8137292060. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 19:04:24,186][15401] Updated weights for policy 0, policy_version 496660 (0.0027) [2024-06-23 19:04:28,389][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8137424896. Throughput: 0: 42837.9. Samples: 8137545700. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 19:04:28,444][15401] Updated weights for policy 0, policy_version 496670 (0.0035) [2024-06-23 19:04:32,001][15401] Updated weights for policy 0, policy_version 496680 (0.0033) [2024-06-23 19:04:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 8137637888. Throughput: 0: 42943.6. Samples: 8137803020. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 19:04:36,875][15401] Updated weights for policy 0, policy_version 496690 (0.0046) [2024-06-23 19:04:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8137850880. Throughput: 0: 42862.7. Samples: 8137934220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 19:04:39,633][15401] Updated weights for policy 0, policy_version 496700 (0.0034) [2024-06-23 19:04:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8138063872. Throughput: 0: 42792.9. Samples: 8138185160. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 19:04:44,333][15401] Updated weights for policy 0, policy_version 496710 (0.0039) [2024-06-23 19:04:47,390][15401] Updated weights for policy 0, policy_version 496720 (0.0037) [2024-06-23 19:04:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8138276864. Throughput: 0: 42670.5. Samples: 8138438940. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 19:04:52,134][15401] Updated weights for policy 0, policy_version 496730 (0.0034) [2024-06-23 19:04:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8138489856. Throughput: 0: 42679.6. Samples: 8138566940. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 19:04:55,193][15401] Updated weights for policy 0, policy_version 496740 (0.0042) [2024-06-23 19:04:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 8138686464. Throughput: 0: 42574.2. Samples: 8138820920. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:04:58,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 19:04:59,896][15401] Updated weights for policy 0, policy_version 496750 (0.0034) [2024-06-23 19:05:02,978][15349] Signal inference workers to stop experience collection... (120600 times) [2024-06-23 19:05:02,978][15349] Signal inference workers to resume experience collection... (120600 times) [2024-06-23 19:05:03,024][15401] InferenceWorker_p0-w0: stopping experience collection (120600 times) [2024-06-23 19:05:03,024][15401] InferenceWorker_p0-w0: resuming experience collection (120600 times) [2024-06-23 19:05:03,113][15401] Updated weights for policy 0, policy_version 496760 (0.0047) [2024-06-23 19:05:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 8138915840. Throughput: 0: 42406.5. Samples: 8139071000. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:05:03,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 19:05:07,584][15401] Updated weights for policy 0, policy_version 496770 (0.0027) [2024-06-23 19:05:08,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 8139096064. Throughput: 0: 42508.9. Samples: 8139204960. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:05:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 19:05:11,033][15401] Updated weights for policy 0, policy_version 496780 (0.0028) [2024-06-23 19:05:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 8139341824. Throughput: 0: 42366.3. Samples: 8139452180. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:05:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 19:05:15,320][15401] Updated weights for policy 0, policy_version 496790 (0.0042) [2024-06-23 19:05:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8139538432. Throughput: 0: 42297.2. Samples: 8139706400. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-23 19:05:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 19:05:18,907][15401] Updated weights for policy 0, policy_version 496800 (0.0035) [2024-06-23 19:05:23,011][15401] Updated weights for policy 0, policy_version 496810 (0.0036) [2024-06-23 19:05:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 8139735040. Throughput: 0: 42149.4. Samples: 8139830940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 19:05:26,598][15401] Updated weights for policy 0, policy_version 496820 (0.0032) [2024-06-23 19:05:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 8139948032. Throughput: 0: 42280.0. Samples: 8140087760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 19:05:30,655][15401] Updated weights for policy 0, policy_version 496830 (0.0037) [2024-06-23 19:05:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8140193792. Throughput: 0: 42169.4. Samples: 8140336560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:33,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 19:05:34,289][15401] Updated weights for policy 0, policy_version 496840 (0.0033) [2024-06-23 19:05:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8140374016. Throughput: 0: 42255.6. Samples: 8140468440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:38,390][15132] Avg episode reward: [(0, '0.892')] [2024-06-23 19:05:38,878][15401] Updated weights for policy 0, policy_version 496850 (0.0044) [2024-06-23 19:05:42,145][15401] Updated weights for policy 0, policy_version 496860 (0.0033) [2024-06-23 19:05:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 8140570624. Throughput: 0: 42250.4. Samples: 8140722080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 19:05:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496862_8140587008.pth... [2024-06-23 19:05:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496240_8130396160.pth [2024-06-23 19:05:46,466][15401] Updated weights for policy 0, policy_version 496870 (0.0029) [2024-06-23 19:05:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 8140849152. Throughput: 0: 42173.4. Samples: 8140968800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 19:05:50,290][15401] Updated weights for policy 0, policy_version 496880 (0.0023) [2024-06-23 19:05:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8141012992. Throughput: 0: 42252.7. Samples: 8141106340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 19:05:54,112][15401] Updated weights for policy 0, policy_version 496890 (0.0034) [2024-06-23 19:05:58,027][15401] Updated weights for policy 0, policy_version 496900 (0.0031) [2024-06-23 19:05:58,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42053.9, 300 sec: 42376.2). Total num frames: 8141209600. Throughput: 0: 42215.4. Samples: 8141351880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:05:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 19:06:01,837][15401] Updated weights for policy 0, policy_version 496910 (0.0030) [2024-06-23 19:06:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 8141455360. Throughput: 0: 42053.9. Samples: 8141598820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 19:06:05,641][15401] Updated weights for policy 0, policy_version 496920 (0.0044) [2024-06-23 19:06:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8141651968. Throughput: 0: 42298.6. Samples: 8141734380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:08,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 19:06:09,650][15401] Updated weights for policy 0, policy_version 496930 (0.0041) [2024-06-23 19:06:10,543][15349] Signal inference workers to stop experience collection... (120650 times) [2024-06-23 19:06:10,586][15401] InferenceWorker_p0-w0: stopping experience collection (120650 times) [2024-06-23 19:06:10,593][15349] Signal inference workers to resume experience collection... (120650 times) [2024-06-23 19:06:10,606][15401] InferenceWorker_p0-w0: resuming experience collection (120650 times) [2024-06-23 19:06:13,390][15132] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 8141848576. Throughput: 0: 42086.7. Samples: 8141981660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 19:06:13,776][15401] Updated weights for policy 0, policy_version 496940 (0.0022) [2024-06-23 19:06:17,101][15401] Updated weights for policy 0, policy_version 496950 (0.0035) [2024-06-23 19:06:18,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8142110720. Throughput: 0: 42322.6. Samples: 8142241180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:18,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 19:06:21,692][15401] Updated weights for policy 0, policy_version 496960 (0.0027) [2024-06-23 19:06:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8142290944. Throughput: 0: 42431.9. Samples: 8142377880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 19:06:24,589][15401] Updated weights for policy 0, policy_version 496970 (0.0042) [2024-06-23 19:06:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8142503936. Throughput: 0: 42249.7. Samples: 8142623320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 19:06:29,317][15401] Updated weights for policy 0, policy_version 496980 (0.0050) [2024-06-23 19:06:32,122][15401] Updated weights for policy 0, policy_version 496990 (0.0031) [2024-06-23 19:06:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 8142733312. Throughput: 0: 42576.0. Samples: 8142884720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 19:06:36,783][15401] Updated weights for policy 0, policy_version 497000 (0.0030) [2024-06-23 19:06:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 8142897152. Throughput: 0: 42397.4. Samples: 8143014220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 19:06:39,711][15401] Updated weights for policy 0, policy_version 497010 (0.0036) [2024-06-23 19:06:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8143142912. Throughput: 0: 42526.4. Samples: 8143265560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 19:06:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 19:06:44,220][15401] Updated weights for policy 0, policy_version 497020 (0.0046) [2024-06-23 19:06:47,559][15401] Updated weights for policy 0, policy_version 497030 (0.0031) [2024-06-23 19:06:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8143372288. Throughput: 0: 42803.4. Samples: 8143524980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:06:48,390][15132] Avg episode reward: [(0, '0.142')] [2024-06-23 19:06:51,595][15401] Updated weights for policy 0, policy_version 497040 (0.0028) [2024-06-23 19:06:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 8143552512. Throughput: 0: 42632.4. Samples: 8143652840. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:06:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 19:06:55,261][15401] Updated weights for policy 0, policy_version 497050 (0.0027) [2024-06-23 19:06:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 8143798272. Throughput: 0: 42845.5. Samples: 8143909700. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:06:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 19:06:59,385][15401] Updated weights for policy 0, policy_version 497060 (0.0045) [2024-06-23 19:07:02,868][15401] Updated weights for policy 0, policy_version 497070 (0.0035) [2024-06-23 19:07:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8144011264. Throughput: 0: 42778.9. Samples: 8144166120. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 19:07:06,888][15401] Updated weights for policy 0, policy_version 497080 (0.0058) [2024-06-23 19:07:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 8144191488. Throughput: 0: 42542.3. Samples: 8144292280. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 19:07:10,443][15401] Updated weights for policy 0, policy_version 497090 (0.0034) [2024-06-23 19:07:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42598.8). Total num frames: 8144453632. Throughput: 0: 42985.8. Samples: 8144557680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 19:07:14,287][15401] Updated weights for policy 0, policy_version 497100 (0.0033) [2024-06-23 19:07:18,169][15401] Updated weights for policy 0, policy_version 497110 (0.0036) [2024-06-23 19:07:18,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8144666624. Throughput: 0: 42923.5. Samples: 8144816280. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 19:07:21,893][15401] Updated weights for policy 0, policy_version 497120 (0.0029) [2024-06-23 19:07:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8144846848. Throughput: 0: 42806.1. Samples: 8144940500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 19:07:25,697][15401] Updated weights for policy 0, policy_version 497130 (0.0040) [2024-06-23 19:07:27,775][15349] Signal inference workers to stop experience collection... (120700 times) [2024-06-23 19:07:27,775][15349] Signal inference workers to resume experience collection... (120700 times) [2024-06-23 19:07:27,817][15401] InferenceWorker_p0-w0: stopping experience collection (120700 times) [2024-06-23 19:07:27,817][15401] InferenceWorker_p0-w0: resuming experience collection (120700 times) [2024-06-23 19:07:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8145108992. Throughput: 0: 43014.2. Samples: 8145201200. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:28,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-23 19:07:29,786][15401] Updated weights for policy 0, policy_version 497140 (0.0029) [2024-06-23 19:07:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8145272832. Throughput: 0: 42896.0. Samples: 8145455300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 19:07:33,640][15401] Updated weights for policy 0, policy_version 497150 (0.0044) [2024-06-23 19:07:37,482][15401] Updated weights for policy 0, policy_version 497160 (0.0046) [2024-06-23 19:07:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 8145502208. Throughput: 0: 42753.0. Samples: 8145576720. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 19:07:41,232][15401] Updated weights for policy 0, policy_version 497170 (0.0022) [2024-06-23 19:07:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8145731584. Throughput: 0: 42920.0. Samples: 8145841100. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 19:07:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497177_8145747968.pth... [2024-06-23 19:07:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496551_8135491584.pth [2024-06-23 19:07:45,219][15401] Updated weights for policy 0, policy_version 497180 (0.0024) [2024-06-23 19:07:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8145928192. Throughput: 0: 43035.8. Samples: 8146102740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 19:07:49,044][15401] Updated weights for policy 0, policy_version 497190 (0.0029) [2024-06-23 19:07:52,973][15401] Updated weights for policy 0, policy_version 497200 (0.0042) [2024-06-23 19:07:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 8146141184. Throughput: 0: 42892.9. Samples: 8146222460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 19:07:56,558][15401] Updated weights for policy 0, policy_version 497210 (0.0031) [2024-06-23 19:07:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8146370560. Throughput: 0: 42779.1. Samples: 8146482740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:07:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 19:08:00,603][15401] Updated weights for policy 0, policy_version 497220 (0.0046) [2024-06-23 19:08:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8146550784. Throughput: 0: 42740.2. Samples: 8146739580. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:08:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 19:08:04,190][15401] Updated weights for policy 0, policy_version 497230 (0.0033) [2024-06-23 19:08:08,321][15401] Updated weights for policy 0, policy_version 497240 (0.0052) [2024-06-23 19:08:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8146780160. Throughput: 0: 42663.7. Samples: 8146860360. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-23 19:08:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 19:08:11,866][15401] Updated weights for policy 0, policy_version 497250 (0.0033) [2024-06-23 19:08:13,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 8147009536. Throughput: 0: 42646.5. Samples: 8147120300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:13,391][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 19:08:15,816][15401] Updated weights for policy 0, policy_version 497260 (0.0035) [2024-06-23 19:08:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 8147206144. Throughput: 0: 42763.4. Samples: 8147379660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:18,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 19:08:19,690][15401] Updated weights for policy 0, policy_version 497270 (0.0025) [2024-06-23 19:08:23,226][15401] Updated weights for policy 0, policy_version 497280 (0.0041) [2024-06-23 19:08:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 8147435520. Throughput: 0: 42863.8. Samples: 8147505600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 19:08:27,263][15401] Updated weights for policy 0, policy_version 497290 (0.0032) [2024-06-23 19:08:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8147632128. Throughput: 0: 42659.1. Samples: 8147760760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 19:08:31,048][15401] Updated weights for policy 0, policy_version 497300 (0.0039) [2024-06-23 19:08:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8147845120. Throughput: 0: 42533.9. Samples: 8148016760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 19:08:34,988][15401] Updated weights for policy 0, policy_version 497310 (0.0033) [2024-06-23 19:08:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 8148058112. Throughput: 0: 42736.1. Samples: 8148145580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 19:08:38,819][15401] Updated weights for policy 0, policy_version 497320 (0.0044) [2024-06-23 19:08:42,620][15401] Updated weights for policy 0, policy_version 497330 (0.0041) [2024-06-23 19:08:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8148271104. Throughput: 0: 42699.0. Samples: 8148404200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 19:08:46,464][15401] Updated weights for policy 0, policy_version 497340 (0.0050) [2024-06-23 19:08:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8148484096. Throughput: 0: 42588.3. Samples: 8148656060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:48,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-23 19:08:50,247][15401] Updated weights for policy 0, policy_version 497350 (0.0038) [2024-06-23 19:08:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8148697088. Throughput: 0: 42775.8. Samples: 8148785280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 19:08:54,076][15401] Updated weights for policy 0, policy_version 497360 (0.0031) [2024-06-23 19:08:57,746][15401] Updated weights for policy 0, policy_version 497370 (0.0036) [2024-06-23 19:08:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8148926464. Throughput: 0: 42808.2. Samples: 8149046660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:08:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 19:09:01,767][15401] Updated weights for policy 0, policy_version 497380 (0.0053) [2024-06-23 19:09:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 8149139456. Throughput: 0: 42593.8. Samples: 8149296380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 19:09:05,802][15401] Updated weights for policy 0, policy_version 497390 (0.0032) [2024-06-23 19:09:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8149336064. Throughput: 0: 42583.6. Samples: 8149421860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 19:09:09,408][15401] Updated weights for policy 0, policy_version 497400 (0.0040) [2024-06-23 19:09:12,262][15349] Signal inference workers to stop experience collection... (120750 times) [2024-06-23 19:09:12,308][15401] InferenceWorker_p0-w0: stopping experience collection (120750 times) [2024-06-23 19:09:12,316][15349] Signal inference workers to resume experience collection... (120750 times) [2024-06-23 19:09:12,325][15401] InferenceWorker_p0-w0: resuming experience collection (120750 times) [2024-06-23 19:09:13,350][15401] Updated weights for policy 0, policy_version 497410 (0.0027) [2024-06-23 19:09:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8149565440. Throughput: 0: 42603.1. Samples: 8149677900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 19:09:17,061][15401] Updated weights for policy 0, policy_version 497420 (0.0026) [2024-06-23 19:09:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8149778432. Throughput: 0: 42441.2. Samples: 8149926620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 19:09:21,015][15401] Updated weights for policy 0, policy_version 497430 (0.0038) [2024-06-23 19:09:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 8149975040. Throughput: 0: 42508.0. Samples: 8150058440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 19:09:24,676][15401] Updated weights for policy 0, policy_version 497440 (0.0033) [2024-06-23 19:09:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8150188032. Throughput: 0: 42517.2. Samples: 8150317480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:28,391][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 19:09:28,774][15401] Updated weights for policy 0, policy_version 497450 (0.0040) [2024-06-23 19:09:32,523][15401] Updated weights for policy 0, policy_version 497460 (0.0044) [2024-06-23 19:09:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8150401024. Throughput: 0: 42566.3. Samples: 8150571540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 19:09:36,510][15401] Updated weights for policy 0, policy_version 497470 (0.0026) [2024-06-23 19:09:38,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8150614016. Throughput: 0: 42489.5. Samples: 8150697300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:09:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 19:09:40,254][15401] Updated weights for policy 0, policy_version 497480 (0.0026) [2024-06-23 19:09:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8150843392. Throughput: 0: 42307.9. Samples: 8150950620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:09:43,393][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 19:09:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497488_8150843392.pth... [2024-06-23 19:09:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000496862_8140587008.pth [2024-06-23 19:09:43,994][15401] Updated weights for policy 0, policy_version 497490 (0.0027) [2024-06-23 19:09:47,998][15401] Updated weights for policy 0, policy_version 497500 (0.0030) [2024-06-23 19:09:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8151056384. Throughput: 0: 42402.7. Samples: 8151204500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:09:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 19:09:51,634][15401] Updated weights for policy 0, policy_version 497510 (0.0034) [2024-06-23 19:09:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 8151252992. Throughput: 0: 42505.0. Samples: 8151334580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:09:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 19:09:55,661][15401] Updated weights for policy 0, policy_version 497520 (0.0032) [2024-06-23 19:09:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8151482368. Throughput: 0: 42639.6. Samples: 8151596680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:09:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 19:09:59,180][15401] Updated weights for policy 0, policy_version 497530 (0.0038) [2024-06-23 19:10:03,289][15401] Updated weights for policy 0, policy_version 497540 (0.0050) [2024-06-23 19:10:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 8151695360. Throughput: 0: 42782.2. Samples: 8151851820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 19:10:07,065][15401] Updated weights for policy 0, policy_version 497550 (0.0035) [2024-06-23 19:10:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 8151908352. Throughput: 0: 42658.2. Samples: 8151978060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:08,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 19:10:11,019][15401] Updated weights for policy 0, policy_version 497560 (0.0034) [2024-06-23 19:10:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8152121344. Throughput: 0: 42638.7. Samples: 8152236220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 19:10:14,720][15401] Updated weights for policy 0, policy_version 497570 (0.0036) [2024-06-23 19:10:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 8152317952. Throughput: 0: 42870.6. Samples: 8152500720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:18,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 19:10:18,675][15401] Updated weights for policy 0, policy_version 497580 (0.0040) [2024-06-23 19:10:22,616][15401] Updated weights for policy 0, policy_version 497590 (0.0039) [2024-06-23 19:10:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8152547328. Throughput: 0: 42787.1. Samples: 8152622720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 19:10:26,363][15401] Updated weights for policy 0, policy_version 497600 (0.0043) [2024-06-23 19:10:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 8152760320. Throughput: 0: 42895.7. Samples: 8152880820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 19:10:30,289][15401] Updated weights for policy 0, policy_version 497610 (0.0055) [2024-06-23 19:10:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8152973312. Throughput: 0: 42947.1. Samples: 8153137120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 19:10:33,811][15401] Updated weights for policy 0, policy_version 497620 (0.0034) [2024-06-23 19:10:37,766][15401] Updated weights for policy 0, policy_version 497630 (0.0036) [2024-06-23 19:10:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8153186304. Throughput: 0: 42923.5. Samples: 8153266240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:38,392][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 19:10:41,327][15401] Updated weights for policy 0, policy_version 497640 (0.0030) [2024-06-23 19:10:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 8153415680. Throughput: 0: 42846.5. Samples: 8153524780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:43,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 19:10:45,234][15401] Updated weights for policy 0, policy_version 497650 (0.0026) [2024-06-23 19:10:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8153612288. Throughput: 0: 43003.6. Samples: 8153786980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 19:10:49,273][15401] Updated weights for policy 0, policy_version 497660 (0.0030) [2024-06-23 19:10:51,744][15349] Signal inference workers to stop experience collection... (120800 times) [2024-06-23 19:10:51,744][15349] Signal inference workers to resume experience collection... (120800 times) [2024-06-23 19:10:51,760][15401] InferenceWorker_p0-w0: stopping experience collection (120800 times) [2024-06-23 19:10:51,761][15401] InferenceWorker_p0-w0: resuming experience collection (120800 times) [2024-06-23 19:10:53,208][15401] Updated weights for policy 0, policy_version 497670 (0.0045) [2024-06-23 19:10:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 8153841664. Throughput: 0: 42922.1. Samples: 8153909660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:53,393][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 19:10:56,732][15401] Updated weights for policy 0, policy_version 497680 (0.0027) [2024-06-23 19:10:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8154054656. Throughput: 0: 42891.6. Samples: 8154166340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:10:58,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 19:11:00,750][15401] Updated weights for policy 0, policy_version 497690 (0.0031) [2024-06-23 19:11:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8154251264. Throughput: 0: 42832.3. Samples: 8154428180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 19:11:03,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 19:11:04,283][15401] Updated weights for policy 0, policy_version 497700 (0.0025) [2024-06-23 19:11:08,394][15132] Fps is (10 sec: 40941.4, 60 sec: 42595.1, 300 sec: 42764.4). Total num frames: 8154464256. Throughput: 0: 42897.9. Samples: 8154553320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:08,395][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 19:11:08,539][15401] Updated weights for policy 0, policy_version 497710 (0.0033) [2024-06-23 19:11:11,880][15401] Updated weights for policy 0, policy_version 497720 (0.0034) [2024-06-23 19:11:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 8154693632. Throughput: 0: 42871.9. Samples: 8154810060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 19:11:15,939][15401] Updated weights for policy 0, policy_version 497730 (0.0027) [2024-06-23 19:11:18,389][15132] Fps is (10 sec: 42618.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8154890240. Throughput: 0: 43071.3. Samples: 8155075320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 19:11:19,363][15401] Updated weights for policy 0, policy_version 497740 (0.0037) [2024-06-23 19:11:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8155119616. Throughput: 0: 43003.5. Samples: 8155201400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:23,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 19:11:23,537][15401] Updated weights for policy 0, policy_version 497750 (0.0028) [2024-06-23 19:11:26,866][15401] Updated weights for policy 0, policy_version 497760 (0.0034) [2024-06-23 19:11:28,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8155348992. Throughput: 0: 42946.2. Samples: 8155457360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 19:11:30,932][15401] Updated weights for policy 0, policy_version 497770 (0.0031) [2024-06-23 19:11:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8155529216. Throughput: 0: 43041.5. Samples: 8155723840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 19:11:34,485][15401] Updated weights for policy 0, policy_version 497780 (0.0040) [2024-06-23 19:11:38,259][15401] Updated weights for policy 0, policy_version 497790 (0.0033) [2024-06-23 19:11:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 8155791360. Throughput: 0: 43055.2. Samples: 8155847040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 19:11:42,009][15401] Updated weights for policy 0, policy_version 497800 (0.0035) [2024-06-23 19:11:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8155987968. Throughput: 0: 42980.8. Samples: 8156100480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-23 19:11:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497802_8155987968.pth... [2024-06-23 19:11:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497177_8145747968.pth [2024-06-23 19:11:46,627][15401] Updated weights for policy 0, policy_version 497810 (0.0024) [2024-06-23 19:11:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8156168192. Throughput: 0: 43031.8. Samples: 8156364600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:48,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 19:11:49,770][15401] Updated weights for policy 0, policy_version 497820 (0.0038) [2024-06-23 19:11:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8156413952. Throughput: 0: 43000.0. Samples: 8156488120. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 19:11:54,206][15401] Updated weights for policy 0, policy_version 497830 (0.0031) [2024-06-23 19:11:56,826][15349] Signal inference workers to stop experience collection... (120850 times) [2024-06-23 19:11:56,858][15401] InferenceWorker_p0-w0: stopping experience collection (120850 times) [2024-06-23 19:11:56,888][15349] Signal inference workers to resume experience collection... (120850 times) [2024-06-23 19:11:56,889][15401] InferenceWorker_p0-w0: resuming experience collection (120850 times) [2024-06-23 19:11:57,482][15401] Updated weights for policy 0, policy_version 497840 (0.0032) [2024-06-23 19:11:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8156626944. Throughput: 0: 42946.0. Samples: 8156742620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:11:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 19:12:01,930][15401] Updated weights for policy 0, policy_version 497850 (0.0028) [2024-06-23 19:12:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8156807168. Throughput: 0: 42875.8. Samples: 8157004740. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:03,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 19:12:05,138][15401] Updated weights for policy 0, policy_version 497860 (0.0033) [2024-06-23 19:12:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43420.9, 300 sec: 42765.0). Total num frames: 8157069312. Throughput: 0: 42732.5. Samples: 8157124260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 19:12:09,564][15401] Updated weights for policy 0, policy_version 497870 (0.0034) [2024-06-23 19:12:13,054][15401] Updated weights for policy 0, policy_version 497880 (0.0039) [2024-06-23 19:12:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8157265920. Throughput: 0: 42769.9. Samples: 8157382000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 19:12:17,182][15401] Updated weights for policy 0, policy_version 497890 (0.0035) [2024-06-23 19:12:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8157462528. Throughput: 0: 42549.7. Samples: 8157638580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 19:12:20,668][15401] Updated weights for policy 0, policy_version 497900 (0.0026) [2024-06-23 19:12:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 8157708288. Throughput: 0: 42483.1. Samples: 8157758780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 19:12:24,689][15401] Updated weights for policy 0, policy_version 497910 (0.0036) [2024-06-23 19:12:28,379][15401] Updated weights for policy 0, policy_version 497920 (0.0034) [2024-06-23 19:12:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8157921280. Throughput: 0: 42749.6. Samples: 8158024200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 19:12:32,164][15401] Updated weights for policy 0, policy_version 497930 (0.0047) [2024-06-23 19:12:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8158101504. Throughput: 0: 42381.1. Samples: 8158271760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-06-23 19:12:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 19:12:36,403][15401] Updated weights for policy 0, policy_version 497940 (0.0026) [2024-06-23 19:12:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8158330880. Throughput: 0: 42482.2. Samples: 8158399820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:12:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 19:12:40,060][15401] Updated weights for policy 0, policy_version 497950 (0.0047) [2024-06-23 19:12:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8158543872. Throughput: 0: 42712.7. Samples: 8158664700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:12:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 19:12:44,211][15401] Updated weights for policy 0, policy_version 497960 (0.0034) [2024-06-23 19:12:47,661][15401] Updated weights for policy 0, policy_version 497970 (0.0044) [2024-06-23 19:12:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8158740480. Throughput: 0: 42329.0. Samples: 8158909540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:12:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 19:12:50,497][15349] Signal inference workers to stop experience collection... (120900 times) [2024-06-23 19:12:50,497][15349] Signal inference workers to resume experience collection... (120900 times) [2024-06-23 19:12:50,547][15401] InferenceWorker_p0-w0: stopping experience collection (120900 times) [2024-06-23 19:12:50,547][15401] InferenceWorker_p0-w0: resuming experience collection (120900 times) [2024-06-23 19:12:51,821][15401] Updated weights for policy 0, policy_version 497980 (0.0034) [2024-06-23 19:12:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8158953472. Throughput: 0: 42588.4. Samples: 8159040740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:12:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 19:12:55,083][15401] Updated weights for policy 0, policy_version 497990 (0.0026) [2024-06-23 19:12:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8159166464. Throughput: 0: 42728.0. Samples: 8159304760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:12:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 19:12:59,358][15401] Updated weights for policy 0, policy_version 498000 (0.0036) [2024-06-23 19:13:02,546][15401] Updated weights for policy 0, policy_version 498010 (0.0036) [2024-06-23 19:13:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 8159395840. Throughput: 0: 42514.2. Samples: 8159551820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:03,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 19:13:07,174][15401] Updated weights for policy 0, policy_version 498020 (0.0041) [2024-06-23 19:13:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8159608832. Throughput: 0: 42742.2. Samples: 8159682180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 19:13:10,635][15401] Updated weights for policy 0, policy_version 498030 (0.0037) [2024-06-23 19:13:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8159821824. Throughput: 0: 42797.6. Samples: 8159950100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 19:13:14,744][15401] Updated weights for policy 0, policy_version 498040 (0.0040) [2024-06-23 19:13:18,176][15401] Updated weights for policy 0, policy_version 498050 (0.0035) [2024-06-23 19:13:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 8160067584. Throughput: 0: 42716.6. Samples: 8160194000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:13:22,562][15401] Updated weights for policy 0, policy_version 498060 (0.0030) [2024-06-23 19:13:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8160247808. Throughput: 0: 42784.6. Samples: 8160325120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:23,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 19:13:25,862][15401] Updated weights for policy 0, policy_version 498070 (0.0043) [2024-06-23 19:13:28,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8160444416. Throughput: 0: 42619.6. Samples: 8160582580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 19:13:29,991][15401] Updated weights for policy 0, policy_version 498080 (0.0033) [2024-06-23 19:13:33,393][15132] Fps is (10 sec: 42585.0, 60 sec: 42869.3, 300 sec: 42764.6). Total num frames: 8160673792. Throughput: 0: 42760.6. Samples: 8160833900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:33,393][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 19:13:33,726][15401] Updated weights for policy 0, policy_version 498090 (0.0030) [2024-06-23 19:13:38,012][15401] Updated weights for policy 0, policy_version 498100 (0.0056) [2024-06-23 19:13:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8160886784. Throughput: 0: 42761.8. Samples: 8160965020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 19:13:41,583][15401] Updated weights for policy 0, policy_version 498110 (0.0025) [2024-06-23 19:13:43,390][15132] Fps is (10 sec: 42611.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8161099776. Throughput: 0: 42718.6. Samples: 8161227100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 19:13:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498114_8161099776.pth... [2024-06-23 19:13:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497488_8150843392.pth [2024-06-23 19:13:45,493][15401] Updated weights for policy 0, policy_version 498120 (0.0031) [2024-06-23 19:13:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8161329152. Throughput: 0: 42891.7. Samples: 8161481840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 19:13:49,254][15401] Updated weights for policy 0, policy_version 498130 (0.0029) [2024-06-23 19:13:53,025][15401] Updated weights for policy 0, policy_version 498140 (0.0028) [2024-06-23 19:13:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8161525760. Throughput: 0: 42906.7. Samples: 8161612980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 19:13:57,102][15401] Updated weights for policy 0, policy_version 498150 (0.0033) [2024-06-23 19:13:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8161738752. Throughput: 0: 42629.9. Samples: 8161868440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:13:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 19:14:00,664][15401] Updated weights for policy 0, policy_version 498160 (0.0031) [2024-06-23 19:14:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 8161968128. Throughput: 0: 42924.0. Samples: 8162125580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 19:14:04,903][15401] Updated weights for policy 0, policy_version 498170 (0.0026) [2024-06-23 19:14:08,285][15401] Updated weights for policy 0, policy_version 498180 (0.0032) [2024-06-23 19:14:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8162181120. Throughput: 0: 42922.6. Samples: 8162256640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:08,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 19:14:12,495][15401] Updated weights for policy 0, policy_version 498190 (0.0030) [2024-06-23 19:14:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8162377728. Throughput: 0: 42954.7. Samples: 8162515540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 19:14:15,840][15401] Updated weights for policy 0, policy_version 498200 (0.0037) [2024-06-23 19:14:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 8162607104. Throughput: 0: 43008.7. Samples: 8162769160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 19:14:20,304][15401] Updated weights for policy 0, policy_version 498210 (0.0030) [2024-06-23 19:14:21,360][15349] Signal inference workers to stop experience collection... (120950 times) [2024-06-23 19:14:21,395][15401] InferenceWorker_p0-w0: stopping experience collection (120950 times) [2024-06-23 19:14:21,422][15349] Signal inference workers to resume experience collection... (120950 times) [2024-06-23 19:14:21,423][15401] InferenceWorker_p0-w0: resuming experience collection (120950 times) [2024-06-23 19:14:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8162820096. Throughput: 0: 42925.3. Samples: 8162896660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 19:14:23,488][15401] Updated weights for policy 0, policy_version 498220 (0.0028) [2024-06-23 19:14:27,771][15401] Updated weights for policy 0, policy_version 498230 (0.0041) [2024-06-23 19:14:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8163016704. Throughput: 0: 42860.9. Samples: 8163155840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 19:14:31,007][15401] Updated weights for policy 0, policy_version 498240 (0.0031) [2024-06-23 19:14:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.6, 300 sec: 42820.6). Total num frames: 8163246080. Throughput: 0: 42918.6. Samples: 8163413180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:33,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 19:14:35,647][15401] Updated weights for policy 0, policy_version 498250 (0.0039) [2024-06-23 19:14:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 8163459072. Throughput: 0: 42892.0. Samples: 8163543120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 19:14:38,926][15401] Updated weights for policy 0, policy_version 498260 (0.0034) [2024-06-23 19:14:43,028][15401] Updated weights for policy 0, policy_version 498270 (0.0034) [2024-06-23 19:14:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8163655680. Throughput: 0: 42873.7. Samples: 8163797760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:43,391][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 19:14:46,775][15401] Updated weights for policy 0, policy_version 498280 (0.0027) [2024-06-23 19:14:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8163901440. Throughput: 0: 42837.3. Samples: 8164053260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 19:14:50,451][15401] Updated weights for policy 0, policy_version 498290 (0.0039) [2024-06-23 19:14:53,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42593.9, 300 sec: 42708.5). Total num frames: 8164081664. Throughput: 0: 42852.1. Samples: 8164185260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:53,396][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 19:14:54,240][15401] Updated weights for policy 0, policy_version 498300 (0.0033) [2024-06-23 19:14:58,327][15401] Updated weights for policy 0, policy_version 498310 (0.0034) [2024-06-23 19:14:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8164311040. Throughput: 0: 42913.4. Samples: 8164446640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:14:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 19:15:01,793][15401] Updated weights for policy 0, policy_version 498320 (0.0045) [2024-06-23 19:15:03,392][15132] Fps is (10 sec: 45893.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8164540416. Throughput: 0: 42925.2. Samples: 8164700900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:15:03,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 19:15:05,747][15401] Updated weights for policy 0, policy_version 498330 (0.0033) [2024-06-23 19:15:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8164737024. Throughput: 0: 42916.9. Samples: 8164827920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:15:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 19:15:09,363][15401] Updated weights for policy 0, policy_version 498340 (0.0030) [2024-06-23 19:15:13,202][15401] Updated weights for policy 0, policy_version 498350 (0.0033) [2024-06-23 19:15:13,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8164982784. Throughput: 0: 42952.5. Samples: 8165088700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:15:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 19:15:17,089][15401] Updated weights for policy 0, policy_version 498360 (0.0033) [2024-06-23 19:15:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8165212160. Throughput: 0: 42888.0. Samples: 8165343140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:15:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 19:15:20,635][15401] Updated weights for policy 0, policy_version 498370 (0.0027) [2024-06-23 19:15:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8165376000. Throughput: 0: 42907.5. Samples: 8165473960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:15:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 19:15:24,762][15401] Updated weights for policy 0, policy_version 498380 (0.0033) [2024-06-23 19:15:28,122][15401] Updated weights for policy 0, policy_version 498390 (0.0036) [2024-06-23 19:15:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8165621760. Throughput: 0: 43121.9. Samples: 8165738240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 19:15:32,347][15401] Updated weights for policy 0, policy_version 498400 (0.0035) [2024-06-23 19:15:33,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 8165834752. Throughput: 0: 43191.9. Samples: 8165997000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:33,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 19:15:35,681][15401] Updated weights for policy 0, policy_version 498410 (0.0037) [2024-06-23 19:15:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8166047744. Throughput: 0: 43084.9. Samples: 8166123800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 19:15:40,063][15401] Updated weights for policy 0, policy_version 498420 (0.0043) [2024-06-23 19:15:43,246][15401] Updated weights for policy 0, policy_version 498430 (0.0030) [2024-06-23 19:15:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 8166277120. Throughput: 0: 43120.4. Samples: 8166387060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 19:15:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498430_8166277120.pth... [2024-06-23 19:15:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000497802_8155987968.pth [2024-06-23 19:15:47,455][15401] Updated weights for policy 0, policy_version 498440 (0.0032) [2024-06-23 19:15:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8166490112. Throughput: 0: 43152.6. Samples: 8166642660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 19:15:50,891][15401] Updated weights for policy 0, policy_version 498450 (0.0032) [2024-06-23 19:15:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43695.3, 300 sec: 42876.1). Total num frames: 8166703104. Throughput: 0: 43243.9. Samples: 8166773900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 19:15:54,955][15401] Updated weights for policy 0, policy_version 498460 (0.0041) [2024-06-23 19:15:58,347][15401] Updated weights for policy 0, policy_version 498470 (0.0023) [2024-06-23 19:15:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 8166932480. Throughput: 0: 43245.7. Samples: 8167034760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:15:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 19:16:00,675][15349] Signal inference workers to stop experience collection... (121000 times) [2024-06-23 19:16:00,675][15349] Signal inference workers to resume experience collection... (121000 times) [2024-06-23 19:16:00,689][15401] InferenceWorker_p0-w0: stopping experience collection (121000 times) [2024-06-23 19:16:00,689][15401] InferenceWorker_p0-w0: resuming experience collection (121000 times) [2024-06-23 19:16:02,529][15401] Updated weights for policy 0, policy_version 498480 (0.0047) [2024-06-23 19:16:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42873.3, 300 sec: 42876.8). Total num frames: 8167112704. Throughput: 0: 43310.8. Samples: 8167292120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 19:16:06,145][15401] Updated weights for policy 0, policy_version 498490 (0.0029) [2024-06-23 19:16:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8167342080. Throughput: 0: 43192.1. Samples: 8167417600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 19:16:10,139][15401] Updated weights for policy 0, policy_version 498500 (0.0041) [2024-06-23 19:16:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8167555072. Throughput: 0: 42963.0. Samples: 8167671580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 19:16:13,982][15401] Updated weights for policy 0, policy_version 498510 (0.0025) [2024-06-23 19:16:17,705][15401] Updated weights for policy 0, policy_version 498520 (0.0025) [2024-06-23 19:16:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 8167751680. Throughput: 0: 43023.5. Samples: 8167932960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 19:16:21,684][15401] Updated weights for policy 0, policy_version 498530 (0.0038) [2024-06-23 19:16:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 8167997440. Throughput: 0: 43023.5. Samples: 8168059860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 19:16:25,518][15401] Updated weights for policy 0, policy_version 498540 (0.0044) [2024-06-23 19:16:28,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8168210432. Throughput: 0: 43025.7. Samples: 8168323220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 19:16:29,208][15401] Updated weights for policy 0, policy_version 498550 (0.0029) [2024-06-23 19:16:33,331][15401] Updated weights for policy 0, policy_version 498560 (0.0023) [2024-06-23 19:16:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8168407040. Throughput: 0: 43017.8. Samples: 8168578460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:33,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-23 19:16:36,697][15401] Updated weights for policy 0, policy_version 498570 (0.0039) [2024-06-23 19:16:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 8168636416. Throughput: 0: 42751.6. Samples: 8168697820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:38,392][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 19:16:41,027][15401] Updated weights for policy 0, policy_version 498580 (0.0035) [2024-06-23 19:16:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8168833024. Throughput: 0: 42754.7. Samples: 8168958720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 19:16:44,838][15401] Updated weights for policy 0, policy_version 498590 (0.0029) [2024-06-23 19:16:48,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8169029632. Throughput: 0: 42617.2. Samples: 8169209900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 19:16:48,756][15401] Updated weights for policy 0, policy_version 498600 (0.0036) [2024-06-23 19:16:52,339][15401] Updated weights for policy 0, policy_version 498610 (0.0035) [2024-06-23 19:16:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8169275392. Throughput: 0: 42662.6. Samples: 8169337420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 19:16:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 19:16:56,519][15401] Updated weights for policy 0, policy_version 498620 (0.0038) [2024-06-23 19:16:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8169488384. Throughput: 0: 42793.9. Samples: 8169597300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:16:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 19:16:59,743][15401] Updated weights for policy 0, policy_version 498630 (0.0040) [2024-06-23 19:17:03,392][15132] Fps is (10 sec: 37674.6, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 8169652224. Throughput: 0: 42756.5. Samples: 8169857100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:03,392][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 19:17:04,421][15401] Updated weights for policy 0, policy_version 498640 (0.0039) [2024-06-23 19:17:07,219][15401] Updated weights for policy 0, policy_version 498650 (0.0038) [2024-06-23 19:17:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8169930752. Throughput: 0: 42645.3. Samples: 8169978900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 19:17:10,960][15349] Signal inference workers to stop experience collection... (121050 times) [2024-06-23 19:17:10,965][15349] Signal inference workers to resume experience collection... (121050 times) [2024-06-23 19:17:11,016][15401] InferenceWorker_p0-w0: stopping experience collection (121050 times) [2024-06-23 19:17:11,016][15401] InferenceWorker_p0-w0: resuming experience collection (121050 times) [2024-06-23 19:17:12,175][15401] Updated weights for policy 0, policy_version 498660 (0.0025) [2024-06-23 19:17:13,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8170110976. Throughput: 0: 42695.1. Samples: 8170244500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 19:17:14,669][15401] Updated weights for policy 0, policy_version 498670 (0.0026) [2024-06-23 19:17:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8170307584. Throughput: 0: 42639.5. Samples: 8170497240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 19:17:19,917][15401] Updated weights for policy 0, policy_version 498680 (0.0025) [2024-06-23 19:17:22,344][15401] Updated weights for policy 0, policy_version 498690 (0.0032) [2024-06-23 19:17:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8170569728. Throughput: 0: 42818.4. Samples: 8170624540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 19:17:27,397][15401] Updated weights for policy 0, policy_version 498700 (0.0038) [2024-06-23 19:17:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 8170749952. Throughput: 0: 42735.9. Samples: 8170881940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:28,393][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 19:17:29,922][15401] Updated weights for policy 0, policy_version 498710 (0.0027) [2024-06-23 19:17:33,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 8170962944. Throughput: 0: 42904.9. Samples: 8171140720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:33,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 19:17:35,130][15401] Updated weights for policy 0, policy_version 498720 (0.0029) [2024-06-23 19:17:37,455][15401] Updated weights for policy 0, policy_version 498730 (0.0027) [2024-06-23 19:17:38,392][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42986.8). Total num frames: 8171225088. Throughput: 0: 42936.1. Samples: 8171269640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:38,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 19:17:42,853][15401] Updated weights for policy 0, policy_version 498740 (0.0035) [2024-06-23 19:17:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8171405312. Throughput: 0: 43048.1. Samples: 8171534460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 19:17:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498743_8171405312.pth... [2024-06-23 19:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498114_8161099776.pth [2024-06-23 19:17:45,217][15401] Updated weights for policy 0, policy_version 498750 (0.0035) [2024-06-23 19:17:48,390][15132] Fps is (10 sec: 37691.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8171601920. Throughput: 0: 42798.6. Samples: 8171782940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 19:17:50,389][15401] Updated weights for policy 0, policy_version 498760 (0.0029) [2024-06-23 19:17:52,728][15401] Updated weights for policy 0, policy_version 498770 (0.0029) [2024-06-23 19:17:53,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.7, 300 sec: 43098.2). Total num frames: 8171880448. Throughput: 0: 43001.3. Samples: 8171913960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 19:17:57,983][15401] Updated weights for policy 0, policy_version 498780 (0.0045) [2024-06-23 19:17:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 8172011520. Throughput: 0: 42870.3. Samples: 8172173660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:17:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 19:18:00,592][15401] Updated weights for policy 0, policy_version 498790 (0.0028) [2024-06-23 19:18:03,389][15132] Fps is (10 sec: 36044.9, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 8172240896. Throughput: 0: 42697.9. Samples: 8172418640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:18:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:18:05,849][15401] Updated weights for policy 0, policy_version 498800 (0.0034) [2024-06-23 19:18:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8172486656. Throughput: 0: 42812.0. Samples: 8172551080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:18:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 19:18:08,462][15401] Updated weights for policy 0, policy_version 498810 (0.0030) [2024-06-23 19:18:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 8172650496. Throughput: 0: 42712.7. Samples: 8172803900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:18:13,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 19:18:13,504][15401] Updated weights for policy 0, policy_version 498820 (0.0036) [2024-06-23 19:18:16,104][15401] Updated weights for policy 0, policy_version 498830 (0.0035) [2024-06-23 19:18:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8172879872. Throughput: 0: 42590.7. Samples: 8173057200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 19:18:18,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 19:18:21,255][15401] Updated weights for policy 0, policy_version 498840 (0.0040) [2024-06-23 19:18:23,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 8173125632. Throughput: 0: 42777.3. Samples: 8173194520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:23,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 19:18:23,731][15401] Updated weights for policy 0, policy_version 498850 (0.0041) [2024-06-23 19:18:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42765.5). Total num frames: 8173289472. Throughput: 0: 42521.3. Samples: 8173447920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 19:18:28,761][15401] Updated weights for policy 0, policy_version 498860 (0.0047) [2024-06-23 19:18:28,771][15349] Signal inference workers to stop experience collection... (121100 times) [2024-06-23 19:18:28,771][15349] Signal inference workers to resume experience collection... (121100 times) [2024-06-23 19:18:28,791][15401] InferenceWorker_p0-w0: stopping experience collection (121100 times) [2024-06-23 19:18:28,792][15401] InferenceWorker_p0-w0: resuming experience collection (121100 times) [2024-06-23 19:18:31,739][15401] Updated weights for policy 0, policy_version 498870 (0.0039) [2024-06-23 19:18:33,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42868.6, 300 sec: 42875.2). Total num frames: 8173535232. Throughput: 0: 42520.7. Samples: 8173696640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:33,396][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 19:18:36,218][15401] Updated weights for policy 0, policy_version 498880 (0.0047) [2024-06-23 19:18:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42053.9, 300 sec: 42876.1). Total num frames: 8173748224. Throughput: 0: 42659.5. Samples: 8173833640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 19:18:39,206][15401] Updated weights for policy 0, policy_version 498890 (0.0030) [2024-06-23 19:18:43,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8173944832. Throughput: 0: 42657.8. Samples: 8174093260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 19:18:43,576][15401] Updated weights for policy 0, policy_version 498900 (0.0023) [2024-06-23 19:18:46,764][15401] Updated weights for policy 0, policy_version 498910 (0.0029) [2024-06-23 19:18:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8174190592. Throughput: 0: 42716.4. Samples: 8174340880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 19:18:51,363][15401] Updated weights for policy 0, policy_version 498920 (0.0026) [2024-06-23 19:18:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 8174403584. Throughput: 0: 42788.9. Samples: 8174476580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 19:18:54,430][15401] Updated weights for policy 0, policy_version 498930 (0.0036) [2024-06-23 19:18:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8174583808. Throughput: 0: 42986.2. Samples: 8174738280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:18:58,398][15132] Avg episode reward: [(0, '0.194')] [2024-06-23 19:18:59,053][15401] Updated weights for policy 0, policy_version 498940 (0.0025) [2024-06-23 19:19:01,996][15401] Updated weights for policy 0, policy_version 498950 (0.0033) [2024-06-23 19:19:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8174845952. Throughput: 0: 42679.5. Samples: 8174977780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 19:19:06,485][15401] Updated weights for policy 0, policy_version 498960 (0.0027) [2024-06-23 19:19:08,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8175042560. Throughput: 0: 42731.9. Samples: 8175117460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:08,398][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 19:19:09,711][15401] Updated weights for policy 0, policy_version 498970 (0.0042) [2024-06-23 19:19:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8175239168. Throughput: 0: 42934.2. Samples: 8175379960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 19:19:14,180][15401] Updated weights for policy 0, policy_version 498980 (0.0046) [2024-06-23 19:19:17,212][15401] Updated weights for policy 0, policy_version 498990 (0.0048) [2024-06-23 19:19:18,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8175484928. Throughput: 0: 42936.8. Samples: 8175628520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 19:19:21,808][15401] Updated weights for policy 0, policy_version 499000 (0.0046) [2024-06-23 19:19:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8175681536. Throughput: 0: 42938.6. Samples: 8175765880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 19:19:24,698][15401] Updated weights for policy 0, policy_version 499010 (0.0034) [2024-06-23 19:19:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8175878144. Throughput: 0: 42898.6. Samples: 8176023700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:28,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 19:19:29,507][15401] Updated weights for policy 0, policy_version 499020 (0.0025) [2024-06-23 19:19:32,473][15401] Updated weights for policy 0, policy_version 499030 (0.0042) [2024-06-23 19:19:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43149.1, 300 sec: 42931.6). Total num frames: 8176123904. Throughput: 0: 42920.0. Samples: 8176272280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 19:19:37,139][15401] Updated weights for policy 0, policy_version 499040 (0.0042) [2024-06-23 19:19:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8176320512. Throughput: 0: 42992.4. Samples: 8176411240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 19:19:40,185][15401] Updated weights for policy 0, policy_version 499050 (0.0031) [2024-06-23 19:19:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8176533504. Throughput: 0: 42797.7. Samples: 8176664180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 19:19:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 19:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499056_8176533504.pth... [2024-06-23 19:19:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498430_8166277120.pth [2024-06-23 19:19:44,821][15401] Updated weights for policy 0, policy_version 499060 (0.0042) [2024-06-23 19:19:46,723][15349] Signal inference workers to stop experience collection... (121150 times) [2024-06-23 19:19:46,746][15401] InferenceWorker_p0-w0: stopping experience collection (121150 times) [2024-06-23 19:19:46,839][15349] Signal inference workers to resume experience collection... (121150 times) [2024-06-23 19:19:46,839][15401] InferenceWorker_p0-w0: resuming experience collection (121150 times) [2024-06-23 19:19:47,733][15401] Updated weights for policy 0, policy_version 499070 (0.0040) [2024-06-23 19:19:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43043.6). Total num frames: 8176779264. Throughput: 0: 42945.8. Samples: 8176910340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:19:48,395][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 19:19:52,578][15401] Updated weights for policy 0, policy_version 499080 (0.0026) [2024-06-23 19:19:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 8176959488. Throughput: 0: 42803.6. Samples: 8177043720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:19:53,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 19:19:55,510][15401] Updated weights for policy 0, policy_version 499090 (0.0029) [2024-06-23 19:19:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 8177172480. Throughput: 0: 42701.8. Samples: 8177301540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:19:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 19:20:00,279][15401] Updated weights for policy 0, policy_version 499100 (0.0032) [2024-06-23 19:20:03,284][15401] Updated weights for policy 0, policy_version 499110 (0.0046) [2024-06-23 19:20:03,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8177418240. Throughput: 0: 42676.8. Samples: 8177548980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 19:20:07,845][15401] Updated weights for policy 0, policy_version 499120 (0.0033) [2024-06-23 19:20:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8177598464. Throughput: 0: 42575.6. Samples: 8177681780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 19:20:10,841][15401] Updated weights for policy 0, policy_version 499130 (0.0039) [2024-06-23 19:20:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8177811456. Throughput: 0: 42532.1. Samples: 8177937640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:13,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 19:20:15,411][15401] Updated weights for policy 0, policy_version 499140 (0.0045) [2024-06-23 19:20:18,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8178057216. Throughput: 0: 42645.0. Samples: 8178191300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 19:20:18,546][15401] Updated weights for policy 0, policy_version 499150 (0.0039) [2024-06-23 19:20:23,249][15401] Updated weights for policy 0, policy_version 499160 (0.0038) [2024-06-23 19:20:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8178237440. Throughput: 0: 42554.2. Samples: 8178326180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 19:20:26,433][15401] Updated weights for policy 0, policy_version 499170 (0.0044) [2024-06-23 19:20:28,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8178434048. Throughput: 0: 42461.3. Samples: 8178574940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 19:20:30,998][15401] Updated weights for policy 0, policy_version 499180 (0.0038) [2024-06-23 19:20:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8178696192. Throughput: 0: 42687.3. Samples: 8178831260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 19:20:34,267][15401] Updated weights for policy 0, policy_version 499190 (0.0040) [2024-06-23 19:20:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8178876416. Throughput: 0: 42720.6. Samples: 8178966040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 19:20:38,644][15401] Updated weights for policy 0, policy_version 499200 (0.0032) [2024-06-23 19:20:41,995][15401] Updated weights for policy 0, policy_version 499210 (0.0036) [2024-06-23 19:20:43,389][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8179073024. Throughput: 0: 42234.1. Samples: 8179202080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 19:20:46,382][15401] Updated weights for policy 0, policy_version 499220 (0.0037) [2024-06-23 19:20:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8179302400. Throughput: 0: 42593.4. Samples: 8179465680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 19:20:49,752][15401] Updated weights for policy 0, policy_version 499230 (0.0041) [2024-06-23 19:20:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 8179515392. Throughput: 0: 42602.8. Samples: 8179598900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 19:20:54,014][15401] Updated weights for policy 0, policy_version 499240 (0.0028) [2024-06-23 19:20:57,684][15401] Updated weights for policy 0, policy_version 499250 (0.0037) [2024-06-23 19:20:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 8179728384. Throughput: 0: 42434.6. Samples: 8179847300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:20:58,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 19:21:01,653][15401] Updated weights for policy 0, policy_version 499260 (0.0029) [2024-06-23 19:21:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8179941376. Throughput: 0: 42526.3. Samples: 8180104980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:21:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 19:21:05,521][15401] Updated weights for policy 0, policy_version 499270 (0.0027) [2024-06-23 19:21:06,664][15349] Signal inference workers to stop experience collection... (121200 times) [2024-06-23 19:21:06,668][15349] Signal inference workers to resume experience collection... (121200 times) [2024-06-23 19:21:06,715][15401] InferenceWorker_p0-w0: stopping experience collection (121200 times) [2024-06-23 19:21:06,715][15401] InferenceWorker_p0-w0: resuming experience collection (121200 times) [2024-06-23 19:21:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8180154368. Throughput: 0: 42491.1. Samples: 8180238280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:21:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 19:21:09,260][15401] Updated weights for policy 0, policy_version 499280 (0.0045) [2024-06-23 19:21:13,289][15401] Updated weights for policy 0, policy_version 499290 (0.0040) [2024-06-23 19:21:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8180367360. Throughput: 0: 42505.4. Samples: 8180487680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-23 19:21:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 19:21:16,887][15401] Updated weights for policy 0, policy_version 499300 (0.0032) [2024-06-23 19:21:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8180580352. Throughput: 0: 42497.6. Samples: 8180743660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 19:21:20,897][15401] Updated weights for policy 0, policy_version 499310 (0.0049) [2024-06-23 19:21:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8180793344. Throughput: 0: 42377.2. Samples: 8180873020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 19:21:24,668][15401] Updated weights for policy 0, policy_version 499320 (0.0040) [2024-06-23 19:21:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8181006336. Throughput: 0: 42726.3. Samples: 8181124760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 19:21:28,727][15401] Updated weights for policy 0, policy_version 499330 (0.0031) [2024-06-23 19:21:32,363][15401] Updated weights for policy 0, policy_version 499340 (0.0026) [2024-06-23 19:21:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 42598.7). Total num frames: 8181202944. Throughput: 0: 42579.5. Samples: 8181381760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 19:21:36,993][15401] Updated weights for policy 0, policy_version 499350 (0.0042) [2024-06-23 19:21:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8181432320. Throughput: 0: 42439.1. Samples: 8181508660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:21:40,297][15401] Updated weights for policy 0, policy_version 499360 (0.0044) [2024-06-23 19:21:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8181645312. Throughput: 0: 42451.3. Samples: 8181757500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:43,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 19:21:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499368_8181645312.pth... [2024-06-23 19:21:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000498743_8171405312.pth [2024-06-23 19:21:44,529][15401] Updated weights for policy 0, policy_version 499370 (0.0021) [2024-06-23 19:21:48,051][15401] Updated weights for policy 0, policy_version 499380 (0.0031) [2024-06-23 19:21:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8181858304. Throughput: 0: 42402.9. Samples: 8182013120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 19:21:52,015][15401] Updated weights for policy 0, policy_version 499390 (0.0037) [2024-06-23 19:21:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8182054912. Throughput: 0: 42261.8. Samples: 8182140060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 19:21:55,531][15401] Updated weights for policy 0, policy_version 499400 (0.0033) [2024-06-23 19:21:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 8182300672. Throughput: 0: 42488.8. Samples: 8182399680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:21:58,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 19:21:59,965][15401] Updated weights for policy 0, policy_version 499410 (0.0043) [2024-06-23 19:22:03,040][15401] Updated weights for policy 0, policy_version 499420 (0.0043) [2024-06-23 19:22:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8182497280. Throughput: 0: 42308.1. Samples: 8182647520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 19:22:07,827][15401] Updated weights for policy 0, policy_version 499430 (0.0030) [2024-06-23 19:22:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8182677504. Throughput: 0: 42363.6. Samples: 8182779380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 19:22:10,715][15401] Updated weights for policy 0, policy_version 499440 (0.0034) [2024-06-23 19:22:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8182923264. Throughput: 0: 42452.5. Samples: 8183035120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 19:22:15,289][15401] Updated weights for policy 0, policy_version 499450 (0.0034) [2024-06-23 19:22:18,349][15349] Signal inference workers to stop experience collection... (121250 times) [2024-06-23 19:22:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8183119872. Throughput: 0: 42533.4. Samples: 8183295760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:22:18,392][15401] InferenceWorker_p0-w0: stopping experience collection (121250 times) [2024-06-23 19:22:18,413][15349] Signal inference workers to resume experience collection... (121250 times) [2024-06-23 19:22:18,414][15401] InferenceWorker_p0-w0: resuming experience collection (121250 times) [2024-06-23 19:22:18,583][15401] Updated weights for policy 0, policy_version 499460 (0.0037) [2024-06-23 19:22:22,711][15401] Updated weights for policy 0, policy_version 499470 (0.0028) [2024-06-23 19:22:23,394][15132] Fps is (10 sec: 40942.2, 60 sec: 42322.4, 300 sec: 42653.7). Total num frames: 8183332864. Throughput: 0: 42507.1. Samples: 8183421660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:23,394][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 19:22:26,435][15401] Updated weights for policy 0, policy_version 499480 (0.0035) [2024-06-23 19:22:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 8183562240. Throughput: 0: 42516.7. Samples: 8183670760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 19:22:30,246][15401] Updated weights for policy 0, policy_version 499490 (0.0038) [2024-06-23 19:22:33,389][15132] Fps is (10 sec: 42616.4, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 8183758848. Throughput: 0: 42485.5. Samples: 8183924960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 19:22:34,323][15401] Updated weights for policy 0, policy_version 499500 (0.0038) [2024-06-23 19:22:37,890][15401] Updated weights for policy 0, policy_version 499510 (0.0031) [2024-06-23 19:22:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8183971840. Throughput: 0: 42484.0. Samples: 8184051840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-23 19:22:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 19:22:42,091][15401] Updated weights for policy 0, policy_version 499520 (0.0038) [2024-06-23 19:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8184201216. Throughput: 0: 42250.7. Samples: 8184300960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:22:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 19:22:45,654][15401] Updated weights for policy 0, policy_version 499530 (0.0031) [2024-06-23 19:22:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 8184397824. Throughput: 0: 42441.3. Samples: 8184557380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:22:48,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 19:22:49,776][15401] Updated weights for policy 0, policy_version 499540 (0.0039) [2024-06-23 19:22:53,364][15401] Updated weights for policy 0, policy_version 499550 (0.0028) [2024-06-23 19:22:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8184627200. Throughput: 0: 42316.9. Samples: 8184683640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:22:53,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 19:22:57,385][15401] Updated weights for policy 0, policy_version 499560 (0.0034) [2024-06-23 19:22:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8184840192. Throughput: 0: 42361.6. Samples: 8184941400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:22:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 19:23:01,041][15401] Updated weights for policy 0, policy_version 499570 (0.0036) [2024-06-23 19:23:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8185036800. Throughput: 0: 42116.8. Samples: 8185191020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:03,399][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 19:23:05,063][15401] Updated weights for policy 0, policy_version 499580 (0.0042) [2024-06-23 19:23:08,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 8185233408. Throughput: 0: 42013.2. Samples: 8185312180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:08,393][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 19:23:09,560][15401] Updated weights for policy 0, policy_version 499590 (0.0033) [2024-06-23 19:23:12,602][15401] Updated weights for policy 0, policy_version 499600 (0.0027) [2024-06-23 19:23:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8185479168. Throughput: 0: 42344.0. Samples: 8185576240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 19:23:17,116][15401] Updated weights for policy 0, policy_version 499610 (0.0041) [2024-06-23 19:23:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8185659392. Throughput: 0: 42438.5. Samples: 8185834700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 19:23:19,657][15349] Signal inference workers to stop experience collection... (121300 times) [2024-06-23 19:23:19,657][15349] Signal inference workers to resume experience collection... (121300 times) [2024-06-23 19:23:19,679][15401] InferenceWorker_p0-w0: stopping experience collection (121300 times) [2024-06-23 19:23:19,679][15401] InferenceWorker_p0-w0: resuming experience collection (121300 times) [2024-06-23 19:23:20,479][15401] Updated weights for policy 0, policy_version 499620 (0.0044) [2024-06-23 19:23:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42328.3, 300 sec: 42653.9). Total num frames: 8185872384. Throughput: 0: 42304.5. Samples: 8185955540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 19:23:25,092][15401] Updated weights for policy 0, policy_version 499630 (0.0035) [2024-06-23 19:23:28,143][15401] Updated weights for policy 0, policy_version 499640 (0.0049) [2024-06-23 19:23:28,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 8186118144. Throughput: 0: 42574.3. Samples: 8186216800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:28,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 19:23:32,642][15401] Updated weights for policy 0, policy_version 499650 (0.0028) [2024-06-23 19:23:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8186314752. Throughput: 0: 42561.0. Samples: 8186472620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 19:23:35,810][15401] Updated weights for policy 0, policy_version 499660 (0.0032) [2024-06-23 19:23:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8186511360. Throughput: 0: 42407.5. Samples: 8186591980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 19:23:40,217][15401] Updated weights for policy 0, policy_version 499670 (0.0022) [2024-06-23 19:23:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8186740736. Throughput: 0: 42337.9. Samples: 8186846600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:43,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-23 19:23:43,508][15401] Updated weights for policy 0, policy_version 499680 (0.0048) [2024-06-23 19:23:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499680_8186757120.pth... [2024-06-23 19:23:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499056_8176533504.pth [2024-06-23 19:23:47,819][15401] Updated weights for policy 0, policy_version 499690 (0.0029) [2024-06-23 19:23:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8186937344. Throughput: 0: 42373.5. Samples: 8187097820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 19:23:51,214][15401] Updated weights for policy 0, policy_version 499700 (0.0032) [2024-06-23 19:23:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 8187133952. Throughput: 0: 42473.9. Samples: 8187223400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 19:23:55,391][15401] Updated weights for policy 0, policy_version 499710 (0.0024) [2024-06-23 19:23:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.7, 300 sec: 42431.5). Total num frames: 8187363328. Throughput: 0: 42372.0. Samples: 8187483080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:23:58,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 19:23:59,347][15401] Updated weights for policy 0, policy_version 499720 (0.0031) [2024-06-23 19:24:03,070][15401] Updated weights for policy 0, policy_version 499730 (0.0045) [2024-06-23 19:24:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8187592704. Throughput: 0: 42278.4. Samples: 8187737220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:24:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 19:24:06,969][15401] Updated weights for policy 0, policy_version 499740 (0.0033) [2024-06-23 19:24:08,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.5). Total num frames: 8187789312. Throughput: 0: 42544.9. Samples: 8187870160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-23 19:24:08,393][15132] Avg episode reward: [(0, '0.839')] [2024-06-23 19:24:10,868][15401] Updated weights for policy 0, policy_version 499750 (0.0032) [2024-06-23 19:24:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 8188002304. Throughput: 0: 42405.2. Samples: 8188125040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 19:24:14,688][15401] Updated weights for policy 0, policy_version 499760 (0.0028) [2024-06-23 19:24:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 8188215296. Throughput: 0: 42416.4. Samples: 8188381360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:18,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-23 19:24:18,487][15401] Updated weights for policy 0, policy_version 499770 (0.0041) [2024-06-23 19:24:22,372][15401] Updated weights for policy 0, policy_version 499780 (0.0042) [2024-06-23 19:24:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8188444672. Throughput: 0: 42696.5. Samples: 8188513320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:23,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 19:24:25,987][15401] Updated weights for policy 0, policy_version 499790 (0.0038) [2024-06-23 19:24:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.5, 300 sec: 42431.5). Total num frames: 8188641280. Throughput: 0: 42666.6. Samples: 8188766700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:28,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 19:24:29,896][15401] Updated weights for policy 0, policy_version 499800 (0.0041) [2024-06-23 19:24:33,385][15401] Updated weights for policy 0, policy_version 499810 (0.0029) [2024-06-23 19:24:33,391][15132] Fps is (10 sec: 44230.8, 60 sec: 42870.4, 300 sec: 42598.2). Total num frames: 8188887040. Throughput: 0: 42855.9. Samples: 8189026400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:33,391][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 19:24:37,483][15401] Updated weights for policy 0, policy_version 499820 (0.0026) [2024-06-23 19:24:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 8189083648. Throughput: 0: 42992.9. Samples: 8189158080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:38,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 19:24:40,853][15401] Updated weights for policy 0, policy_version 499830 (0.0032) [2024-06-23 19:24:43,390][15132] Fps is (10 sec: 40965.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 8189296640. Throughput: 0: 42911.6. Samples: 8189414000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:43,396][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 19:24:45,060][15401] Updated weights for policy 0, policy_version 499840 (0.0035) [2024-06-23 19:24:48,394][15132] Fps is (10 sec: 44214.8, 60 sec: 43141.0, 300 sec: 42598.0). Total num frames: 8189526016. Throughput: 0: 42968.6. Samples: 8189671020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:48,395][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 19:24:48,545][15401] Updated weights for policy 0, policy_version 499850 (0.0028) [2024-06-23 19:24:52,827][15401] Updated weights for policy 0, policy_version 499860 (0.0035) [2024-06-23 19:24:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 8189706240. Throughput: 0: 42859.7. Samples: 8189798740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 19:24:56,212][15401] Updated weights for policy 0, policy_version 499870 (0.0036) [2024-06-23 19:24:58,389][15132] Fps is (10 sec: 39341.0, 60 sec: 42600.1, 300 sec: 42376.3). Total num frames: 8189919232. Throughput: 0: 42916.0. Samples: 8190056260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:24:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 19:25:00,462][15401] Updated weights for policy 0, policy_version 499880 (0.0036) [2024-06-23 19:25:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8190148608. Throughput: 0: 42901.3. Samples: 8190311920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 19:25:03,888][15401] Updated weights for policy 0, policy_version 499890 (0.0034) [2024-06-23 19:25:06,347][15349] Signal inference workers to stop experience collection... (121350 times) [2024-06-23 19:25:06,347][15349] Signal inference workers to resume experience collection... (121350 times) [2024-06-23 19:25:06,408][15401] InferenceWorker_p0-w0: stopping experience collection (121350 times) [2024-06-23 19:25:06,408][15401] InferenceWorker_p0-w0: resuming experience collection (121350 times) [2024-06-23 19:25:08,054][15401] Updated weights for policy 0, policy_version 499900 (0.0036) [2024-06-23 19:25:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.1, 300 sec: 42542.8). Total num frames: 8190361600. Throughput: 0: 42904.8. Samples: 8190444040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 19:25:11,712][15401] Updated weights for policy 0, policy_version 499910 (0.0037) [2024-06-23 19:25:13,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42866.9, 300 sec: 42430.9). Total num frames: 8190574592. Throughput: 0: 42853.5. Samples: 8190695280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:13,397][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 19:25:15,603][15401] Updated weights for policy 0, policy_version 499920 (0.0034) [2024-06-23 19:25:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 8190787584. Throughput: 0: 42905.7. Samples: 8190957100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 19:25:19,456][15401] Updated weights for policy 0, policy_version 499930 (0.0027) [2024-06-23 19:25:23,181][15401] Updated weights for policy 0, policy_version 499940 (0.0037) [2024-06-23 19:25:23,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8191016960. Throughput: 0: 42715.9. Samples: 8191080300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 19:25:27,347][15401] Updated weights for policy 0, policy_version 499950 (0.0042) [2024-06-23 19:25:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 8191197184. Throughput: 0: 42680.8. Samples: 8191334640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:28,404][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 19:25:30,682][15401] Updated weights for policy 0, policy_version 499960 (0.0032) [2024-06-23 19:25:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42326.4, 300 sec: 42542.9). Total num frames: 8191426560. Throughput: 0: 42753.6. Samples: 8191594720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 19:25:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 19:25:34,892][15401] Updated weights for policy 0, policy_version 499970 (0.0034) [2024-06-23 19:25:38,328][15401] Updated weights for policy 0, policy_version 499980 (0.0039) [2024-06-23 19:25:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8191672320. Throughput: 0: 42734.5. Samples: 8191721800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:25:38,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 19:25:42,424][15401] Updated weights for policy 0, policy_version 499990 (0.0038) [2024-06-23 19:25:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8191852544. Throughput: 0: 42739.1. Samples: 8191979520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:25:43,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-23 19:25:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499992_8191868928.pth... [2024-06-23 19:25:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499368_8181645312.pth [2024-06-23 19:25:45,806][15401] Updated weights for policy 0, policy_version 500000 (0.0037) [2024-06-23 19:25:48,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42055.8, 300 sec: 42487.3). Total num frames: 8192049152. Throughput: 0: 42902.7. Samples: 8192242540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:25:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:25:50,166][15401] Updated weights for policy 0, policy_version 500010 (0.0030) [2024-06-23 19:25:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 8192294912. Throughput: 0: 42698.3. Samples: 8192365460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:25:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 19:25:53,862][15401] Updated weights for policy 0, policy_version 500020 (0.0031) [2024-06-23 19:25:57,873][15401] Updated weights for policy 0, policy_version 500030 (0.0039) [2024-06-23 19:25:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8192507904. Throughput: 0: 42848.3. Samples: 8192623180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:25:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 19:26:01,540][15401] Updated weights for policy 0, policy_version 500040 (0.0042) [2024-06-23 19:26:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8192704512. Throughput: 0: 42825.0. Samples: 8192884220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 19:26:05,579][15401] Updated weights for policy 0, policy_version 500050 (0.0028) [2024-06-23 19:26:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8192950272. Throughput: 0: 42858.7. Samples: 8193008940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:08,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 19:26:08,996][15401] Updated weights for policy 0, policy_version 500060 (0.0033) [2024-06-23 19:26:13,297][15401] Updated weights for policy 0, policy_version 500070 (0.0030) [2024-06-23 19:26:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 8193146880. Throughput: 0: 43017.4. Samples: 8193270420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 19:26:16,436][15401] Updated weights for policy 0, policy_version 500080 (0.0025) [2024-06-23 19:26:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8193359872. Throughput: 0: 42907.4. Samples: 8193525660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:18,393][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 19:26:20,893][15401] Updated weights for policy 0, policy_version 500090 (0.0039) [2024-06-23 19:26:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8193589248. Throughput: 0: 42904.0. Samples: 8193652480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 19:26:24,539][15401] Updated weights for policy 0, policy_version 500100 (0.0042) [2024-06-23 19:26:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 8193785856. Throughput: 0: 42963.7. Samples: 8193912880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 19:26:28,501][15401] Updated weights for policy 0, policy_version 500110 (0.0036) [2024-06-23 19:26:32,179][15401] Updated weights for policy 0, policy_version 500120 (0.0030) [2024-06-23 19:26:33,349][15349] Signal inference workers to stop experience collection... (121400 times) [2024-06-23 19:26:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8193982464. Throughput: 0: 42875.1. Samples: 8194171920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 19:26:33,401][15349] Signal inference workers to resume experience collection... (121400 times) [2024-06-23 19:26:33,402][15401] InferenceWorker_p0-w0: stopping experience collection (121400 times) [2024-06-23 19:26:33,419][15401] InferenceWorker_p0-w0: resuming experience collection (121400 times) [2024-06-23 19:26:36,253][15401] Updated weights for policy 0, policy_version 500130 (0.0044) [2024-06-23 19:26:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8194228224. Throughput: 0: 42967.9. Samples: 8194299020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 19:26:39,927][15401] Updated weights for policy 0, policy_version 500140 (0.0030) [2024-06-23 19:26:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8194424832. Throughput: 0: 42940.0. Samples: 8194555480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:43,391][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 19:26:43,666][15401] Updated weights for policy 0, policy_version 500150 (0.0033) [2024-06-23 19:26:47,374][15401] Updated weights for policy 0, policy_version 500160 (0.0037) [2024-06-23 19:26:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8194654208. Throughput: 0: 42865.9. Samples: 8194813180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 19:26:51,366][15401] Updated weights for policy 0, policy_version 500170 (0.0038) [2024-06-23 19:26:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 8194867200. Throughput: 0: 43057.7. Samples: 8194946640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:53,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 19:26:54,905][15401] Updated weights for policy 0, policy_version 500180 (0.0029) [2024-06-23 19:26:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8195063808. Throughput: 0: 42833.9. Samples: 8195197940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 19:26:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 19:26:59,253][15401] Updated weights for policy 0, policy_version 500190 (0.0050) [2024-06-23 19:27:02,345][15401] Updated weights for policy 0, policy_version 500200 (0.0035) [2024-06-23 19:27:03,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8195293184. Throughput: 0: 42822.4. Samples: 8195452560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 19:27:06,891][15401] Updated weights for policy 0, policy_version 500210 (0.0023) [2024-06-23 19:27:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8195506176. Throughput: 0: 42957.3. Samples: 8195585560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 19:27:10,365][15401] Updated weights for policy 0, policy_version 500220 (0.0029) [2024-06-23 19:27:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8195719168. Throughput: 0: 42879.4. Samples: 8195842460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 19:27:14,427][15401] Updated weights for policy 0, policy_version 500230 (0.0030) [2024-06-23 19:27:17,726][15401] Updated weights for policy 0, policy_version 500240 (0.0037) [2024-06-23 19:27:18,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43419.3, 300 sec: 42821.2). Total num frames: 8195964928. Throughput: 0: 42841.2. Samples: 8196099780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 19:27:21,854][15401] Updated weights for policy 0, policy_version 500250 (0.0029) [2024-06-23 19:27:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8196145152. Throughput: 0: 43057.6. Samples: 8196236600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 19:27:25,189][15401] Updated weights for policy 0, policy_version 500260 (0.0042) [2024-06-23 19:27:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 8196358144. Throughput: 0: 42980.8. Samples: 8196489620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 19:27:29,410][15401] Updated weights for policy 0, policy_version 500270 (0.0036) [2024-06-23 19:27:32,837][15401] Updated weights for policy 0, policy_version 500280 (0.0041) [2024-06-23 19:27:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8196587520. Throughput: 0: 42878.6. Samples: 8196742720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 19:27:37,028][15401] Updated weights for policy 0, policy_version 500290 (0.0034) [2024-06-23 19:27:38,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8196800512. Throughput: 0: 42963.6. Samples: 8196879900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 19:27:40,480][15401] Updated weights for policy 0, policy_version 500300 (0.0040) [2024-06-23 19:27:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8196997120. Throughput: 0: 43071.0. Samples: 8197136140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 19:27:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500306_8197013504.pth... [2024-06-23 19:27:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499680_8186757120.pth [2024-06-23 19:27:44,549][15401] Updated weights for policy 0, policy_version 500310 (0.0033) [2024-06-23 19:27:48,246][15401] Updated weights for policy 0, policy_version 500320 (0.0046) [2024-06-23 19:27:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8197242880. Throughput: 0: 43069.6. Samples: 8197390700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:27:52,386][15349] Signal inference workers to stop experience collection... (121450 times) [2024-06-23 19:27:52,388][15349] Signal inference workers to resume experience collection... (121450 times) [2024-06-23 19:27:52,394][15401] Updated weights for policy 0, policy_version 500330 (0.0024) [2024-06-23 19:27:52,405][15401] InferenceWorker_p0-w0: stopping experience collection (121450 times) [2024-06-23 19:27:52,406][15401] InferenceWorker_p0-w0: resuming experience collection (121450 times) [2024-06-23 19:27:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8197439488. Throughput: 0: 42966.3. Samples: 8197519040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 19:27:56,010][15401] Updated weights for policy 0, policy_version 500340 (0.0045) [2024-06-23 19:27:58,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8197636096. Throughput: 0: 42917.1. Samples: 8197773720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:27:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 19:28:00,004][15401] Updated weights for policy 0, policy_version 500350 (0.0028) [2024-06-23 19:28:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 8197865472. Throughput: 0: 42948.5. Samples: 8198032460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 19:28:03,796][15401] Updated weights for policy 0, policy_version 500360 (0.0032) [2024-06-23 19:28:07,822][15401] Updated weights for policy 0, policy_version 500370 (0.0032) [2024-06-23 19:28:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8198078464. Throughput: 0: 42742.6. Samples: 8198160020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 19:28:11,450][15401] Updated weights for policy 0, policy_version 500380 (0.0040) [2024-06-23 19:28:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8198307840. Throughput: 0: 42855.7. Samples: 8198418120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:13,395][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 19:28:15,418][15401] Updated weights for policy 0, policy_version 500390 (0.0032) [2024-06-23 19:28:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8198520832. Throughput: 0: 42914.6. Samples: 8198673880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:18,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 19:28:19,298][15401] Updated weights for policy 0, policy_version 500400 (0.0049) [2024-06-23 19:28:22,985][15401] Updated weights for policy 0, policy_version 500410 (0.0033) [2024-06-23 19:28:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 8198717440. Throughput: 0: 42744.0. Samples: 8198803380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 19:28:26,782][15401] Updated weights for policy 0, policy_version 500420 (0.0035) [2024-06-23 19:28:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8198930432. Throughput: 0: 42700.4. Samples: 8199057660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-23 19:28:28,394][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 19:28:30,721][15401] Updated weights for policy 0, policy_version 500430 (0.0032) [2024-06-23 19:28:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 8199176192. Throughput: 0: 42690.8. Samples: 8199311780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 19:28:34,414][15401] Updated weights for policy 0, policy_version 500440 (0.0035) [2024-06-23 19:28:38,377][15401] Updated weights for policy 0, policy_version 500450 (0.0027) [2024-06-23 19:28:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8199372800. Throughput: 0: 42880.4. Samples: 8199448660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 19:28:42,016][15401] Updated weights for policy 0, policy_version 500460 (0.0036) [2024-06-23 19:28:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8199569408. Throughput: 0: 42783.9. Samples: 8199699000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 19:28:45,983][15401] Updated weights for policy 0, policy_version 500470 (0.0032) [2024-06-23 19:28:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8199798784. Throughput: 0: 42809.7. Samples: 8199958900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 19:28:49,554][15401] Updated weights for policy 0, policy_version 500480 (0.0033) [2024-06-23 19:28:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8200011776. Throughput: 0: 42897.4. Samples: 8200090400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 19:28:53,477][15401] Updated weights for policy 0, policy_version 500490 (0.0031) [2024-06-23 19:28:57,069][15401] Updated weights for policy 0, policy_version 500500 (0.0025) [2024-06-23 19:28:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8200224768. Throughput: 0: 42728.5. Samples: 8200340900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:28:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 19:29:01,024][15401] Updated weights for policy 0, policy_version 500510 (0.0043) [2024-06-23 19:29:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 8200437760. Throughput: 0: 42817.9. Samples: 8200600680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 19:29:04,642][15401] Updated weights for policy 0, policy_version 500520 (0.0028) [2024-06-23 19:29:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8200650752. Throughput: 0: 42924.1. Samples: 8200734960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 19:29:08,943][15401] Updated weights for policy 0, policy_version 500530 (0.0028) [2024-06-23 19:29:12,278][15401] Updated weights for policy 0, policy_version 500540 (0.0029) [2024-06-23 19:29:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8200863744. Throughput: 0: 42901.4. Samples: 8200988220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 19:29:16,710][15401] Updated weights for policy 0, policy_version 500550 (0.0045) [2024-06-23 19:29:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8201093120. Throughput: 0: 42849.3. Samples: 8201240000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 19:29:20,331][15401] Updated weights for policy 0, policy_version 500560 (0.0034) [2024-06-23 19:29:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 8201273344. Throughput: 0: 42758.7. Samples: 8201372800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 19:29:24,323][15401] Updated weights for policy 0, policy_version 500570 (0.0036) [2024-06-23 19:29:26,083][15349] Signal inference workers to stop experience collection... (121500 times) [2024-06-23 19:29:26,084][15349] Signal inference workers to resume experience collection... (121500 times) [2024-06-23 19:29:26,113][15401] InferenceWorker_p0-w0: stopping experience collection (121500 times) [2024-06-23 19:29:26,113][15401] InferenceWorker_p0-w0: resuming experience collection (121500 times) [2024-06-23 19:29:27,874][15401] Updated weights for policy 0, policy_version 500580 (0.0041) [2024-06-23 19:29:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42820.4). Total num frames: 8201519104. Throughput: 0: 42746.6. Samples: 8201622700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:28,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 19:29:32,047][15401] Updated weights for policy 0, policy_version 500590 (0.0031) [2024-06-23 19:29:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8201715712. Throughput: 0: 42565.9. Samples: 8201874360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 19:29:35,595][15401] Updated weights for policy 0, policy_version 500600 (0.0036) [2024-06-23 19:29:38,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8201912320. Throughput: 0: 42506.2. Samples: 8202003180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 19:29:39,580][15401] Updated weights for policy 0, policy_version 500610 (0.0038) [2024-06-23 19:29:43,159][15401] Updated weights for policy 0, policy_version 500620 (0.0051) [2024-06-23 19:29:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42821.2). Total num frames: 8202158080. Throughput: 0: 42800.7. Samples: 8202266940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 19:29:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500620_8202158080.pth... [2024-06-23 19:29:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000499992_8191868928.pth [2024-06-23 19:29:47,156][15401] Updated weights for policy 0, policy_version 500630 (0.0036) [2024-06-23 19:29:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 8202354688. Throughput: 0: 42663.0. Samples: 8202520620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:48,393][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 19:29:50,627][15401] Updated weights for policy 0, policy_version 500640 (0.0035) [2024-06-23 19:29:53,396][15132] Fps is (10 sec: 39297.1, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 8202551296. Throughput: 0: 42483.7. Samples: 8202647000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 19:29:53,396][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 19:29:54,986][15401] Updated weights for policy 0, policy_version 500650 (0.0035) [2024-06-23 19:29:58,148][15401] Updated weights for policy 0, policy_version 500660 (0.0029) [2024-06-23 19:29:58,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8202813440. Throughput: 0: 42667.9. Samples: 8202908280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:29:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 19:30:02,679][15401] Updated weights for policy 0, policy_version 500670 (0.0035) [2024-06-23 19:30:03,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8202993664. Throughput: 0: 42596.9. Samples: 8203156860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 19:30:05,903][15401] Updated weights for policy 0, policy_version 500680 (0.0030) [2024-06-23 19:30:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 8203190272. Throughput: 0: 42431.1. Samples: 8203282200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 19:30:10,520][15401] Updated weights for policy 0, policy_version 500690 (0.0038) [2024-06-23 19:30:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 8203452416. Throughput: 0: 42704.1. Samples: 8203544280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 19:30:13,689][15401] Updated weights for policy 0, policy_version 500700 (0.0032) [2024-06-23 19:30:18,018][15401] Updated weights for policy 0, policy_version 500710 (0.0028) [2024-06-23 19:30:18,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 8203632640. Throughput: 0: 42674.6. Samples: 8203794820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:18,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 19:30:21,585][15401] Updated weights for policy 0, policy_version 500720 (0.0040) [2024-06-23 19:30:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8203845632. Throughput: 0: 42587.5. Samples: 8203919620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 19:30:25,680][15401] Updated weights for policy 0, policy_version 500730 (0.0030) [2024-06-23 19:30:27,155][15349] Signal inference workers to stop experience collection... (121550 times) [2024-06-23 19:30:27,157][15349] Signal inference workers to resume experience collection... (121550 times) [2024-06-23 19:30:27,198][15401] InferenceWorker_p0-w0: stopping experience collection (121550 times) [2024-06-23 19:30:27,198][15401] InferenceWorker_p0-w0: resuming experience collection (121550 times) [2024-06-23 19:30:28,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 8204091392. Throughput: 0: 42662.3. Samples: 8204186740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 19:30:29,345][15401] Updated weights for policy 0, policy_version 500740 (0.0030) [2024-06-23 19:30:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 8204271616. Throughput: 0: 42608.3. Samples: 8204437900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 19:30:33,566][15401] Updated weights for policy 0, policy_version 500750 (0.0033) [2024-06-23 19:30:37,005][15401] Updated weights for policy 0, policy_version 500760 (0.0031) [2024-06-23 19:30:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8204500992. Throughput: 0: 42414.0. Samples: 8204555360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 19:30:41,381][15401] Updated weights for policy 0, policy_version 500770 (0.0033) [2024-06-23 19:30:43,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8204713984. Throughput: 0: 42572.9. Samples: 8204824060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 19:30:44,485][15401] Updated weights for policy 0, policy_version 500780 (0.0028) [2024-06-23 19:30:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 8204894208. Throughput: 0: 42787.5. Samples: 8205082300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 19:30:49,265][15401] Updated weights for policy 0, policy_version 500790 (0.0038) [2024-06-23 19:30:52,061][15401] Updated weights for policy 0, policy_version 500800 (0.0029) [2024-06-23 19:30:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 8205156352. Throughput: 0: 42738.2. Samples: 8205205420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 19:30:57,041][15401] Updated weights for policy 0, policy_version 500810 (0.0039) [2024-06-23 19:30:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 8205352960. Throughput: 0: 42598.2. Samples: 8205461200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:30:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 19:30:59,733][15401] Updated weights for policy 0, policy_version 500820 (0.0033) [2024-06-23 19:31:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8205533184. Throughput: 0: 42778.1. Samples: 8205719740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:31:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 19:31:04,536][15401] Updated weights for policy 0, policy_version 500830 (0.0034) [2024-06-23 19:31:07,348][15401] Updated weights for policy 0, policy_version 500840 (0.0042) [2024-06-23 19:31:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8205778944. Throughput: 0: 42722.3. Samples: 8205842120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:31:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 19:31:12,307][15401] Updated weights for policy 0, policy_version 500850 (0.0038) [2024-06-23 19:31:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.1, 300 sec: 42765.4). Total num frames: 8205975552. Throughput: 0: 42593.2. Samples: 8206103440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:31:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 19:31:15,030][15401] Updated weights for policy 0, policy_version 500860 (0.0028) [2024-06-23 19:31:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 8206172160. Throughput: 0: 42767.3. Samples: 8206362420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:31:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 19:31:20,019][15401] Updated weights for policy 0, policy_version 500870 (0.0037) [2024-06-23 19:31:22,626][15401] Updated weights for policy 0, policy_version 500880 (0.0041) [2024-06-23 19:31:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8206434304. Throughput: 0: 42867.1. Samples: 8206484380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 19:31:23,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 19:31:27,854][15401] Updated weights for policy 0, policy_version 500890 (0.0039) [2024-06-23 19:31:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 8206614528. Throughput: 0: 42812.4. Samples: 8206750620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:28,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 19:31:30,132][15401] Updated weights for policy 0, policy_version 500900 (0.0037) [2024-06-23 19:31:33,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8206811136. Throughput: 0: 42693.7. Samples: 8207003520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 19:31:35,608][15401] Updated weights for policy 0, policy_version 500910 (0.0028) [2024-06-23 19:31:37,155][15349] Signal inference workers to stop experience collection... (121600 times) [2024-06-23 19:31:37,155][15349] Signal inference workers to resume experience collection... (121600 times) [2024-06-23 19:31:37,194][15401] InferenceWorker_p0-w0: stopping experience collection (121600 times) [2024-06-23 19:31:37,194][15401] InferenceWorker_p0-w0: resuming experience collection (121600 times) [2024-06-23 19:31:37,811][15401] Updated weights for policy 0, policy_version 500920 (0.0034) [2024-06-23 19:31:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8207073280. Throughput: 0: 42697.2. Samples: 8207126800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 19:31:43,232][15401] Updated weights for policy 0, policy_version 500930 (0.0035) [2024-06-23 19:31:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8207253504. Throughput: 0: 42969.3. Samples: 8207394820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 19:31:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500931_8207253504.pth... [2024-06-23 19:31:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500306_8197013504.pth [2024-06-23 19:31:45,227][15401] Updated weights for policy 0, policy_version 500940 (0.0029) [2024-06-23 19:31:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8207466496. Throughput: 0: 42695.3. Samples: 8207641020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 19:31:50,728][15401] Updated weights for policy 0, policy_version 500950 (0.0030) [2024-06-23 19:31:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8207712256. Throughput: 0: 42903.0. Samples: 8207772760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 19:31:53,684][15401] Updated weights for policy 0, policy_version 500960 (0.0051) [2024-06-23 19:31:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8207876096. Throughput: 0: 42811.3. Samples: 8208029940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:31:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 19:31:58,414][15401] Updated weights for policy 0, policy_version 500970 (0.0035) [2024-06-23 19:32:01,292][15401] Updated weights for policy 0, policy_version 500980 (0.0031) [2024-06-23 19:32:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8208105472. Throughput: 0: 42613.6. Samples: 8208280040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 19:32:06,094][15401] Updated weights for policy 0, policy_version 500990 (0.0055) [2024-06-23 19:32:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8208351232. Throughput: 0: 42811.2. Samples: 8208410880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 19:32:09,002][15401] Updated weights for policy 0, policy_version 501000 (0.0031) [2024-06-23 19:32:13,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 8208515072. Throughput: 0: 42491.6. Samples: 8208662840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:13,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 19:32:13,804][15401] Updated weights for policy 0, policy_version 501010 (0.0039) [2024-06-23 19:32:16,718][15401] Updated weights for policy 0, policy_version 501020 (0.0030) [2024-06-23 19:32:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8208760832. Throughput: 0: 42302.4. Samples: 8208907120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:18,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 19:32:21,684][15401] Updated weights for policy 0, policy_version 501030 (0.0039) [2024-06-23 19:32:23,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8208973824. Throughput: 0: 42636.1. Samples: 8209045420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 19:32:24,759][15401] Updated weights for policy 0, policy_version 501040 (0.0044) [2024-06-23 19:32:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8209154048. Throughput: 0: 42381.3. Samples: 8209301980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 19:32:29,263][15401] Updated weights for policy 0, policy_version 501050 (0.0031) [2024-06-23 19:32:32,486][15401] Updated weights for policy 0, policy_version 501060 (0.0044) [2024-06-23 19:32:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 8209416192. Throughput: 0: 42315.6. Samples: 8209545220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 19:32:36,862][15401] Updated weights for policy 0, policy_version 501070 (0.0049) [2024-06-23 19:32:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8209612800. Throughput: 0: 42598.2. Samples: 8209689680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:38,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 19:32:40,172][15401] Updated weights for policy 0, policy_version 501080 (0.0037) [2024-06-23 19:32:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8209809408. Throughput: 0: 42460.3. Samples: 8209940660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 19:32:44,433][15401] Updated weights for policy 0, policy_version 501090 (0.0047) [2024-06-23 19:32:45,763][15349] Signal inference workers to stop experience collection... (121650 times) [2024-06-23 19:32:45,806][15401] InferenceWorker_p0-w0: stopping experience collection (121650 times) [2024-06-23 19:32:45,872][15349] Signal inference workers to resume experience collection... (121650 times) [2024-06-23 19:32:45,872][15401] InferenceWorker_p0-w0: resuming experience collection (121650 times) [2024-06-23 19:32:47,925][15401] Updated weights for policy 0, policy_version 501100 (0.0039) [2024-06-23 19:32:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8210055168. Throughput: 0: 42511.3. Samples: 8210193040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-23 19:32:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 19:32:52,067][15401] Updated weights for policy 0, policy_version 501110 (0.0031) [2024-06-23 19:32:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8210251776. Throughput: 0: 42583.9. Samples: 8210327160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:32:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 19:32:55,657][15401] Updated weights for policy 0, policy_version 501120 (0.0024) [2024-06-23 19:32:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8210448384. Throughput: 0: 42694.3. Samples: 8210583980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:32:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 19:32:59,899][15401] Updated weights for policy 0, policy_version 501130 (0.0024) [2024-06-23 19:33:02,923][15401] Updated weights for policy 0, policy_version 501140 (0.0038) [2024-06-23 19:33:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8210694144. Throughput: 0: 42972.3. Samples: 8210840880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 19:33:07,406][15401] Updated weights for policy 0, policy_version 501150 (0.0030) [2024-06-23 19:33:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8210874368. Throughput: 0: 42809.9. Samples: 8210971860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 19:33:10,635][15401] Updated weights for policy 0, policy_version 501160 (0.0036) [2024-06-23 19:33:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 8211103744. Throughput: 0: 42819.0. Samples: 8211228840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 19:33:15,034][15401] Updated weights for policy 0, policy_version 501170 (0.0028) [2024-06-23 19:33:18,134][15401] Updated weights for policy 0, policy_version 501180 (0.0034) [2024-06-23 19:33:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8211333120. Throughput: 0: 42979.4. Samples: 8211479300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 19:33:22,609][15401] Updated weights for policy 0, policy_version 501190 (0.0041) [2024-06-23 19:33:23,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 8211529728. Throughput: 0: 42804.1. Samples: 8211615960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:23,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 19:33:25,598][15401] Updated weights for policy 0, policy_version 501200 (0.0045) [2024-06-23 19:33:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8211759104. Throughput: 0: 43062.3. Samples: 8211878460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 19:33:30,171][15401] Updated weights for policy 0, policy_version 501210 (0.0030) [2024-06-23 19:33:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8211972096. Throughput: 0: 43088.3. Samples: 8212132020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 19:33:33,400][15401] Updated weights for policy 0, policy_version 501220 (0.0035) [2024-06-23 19:33:37,602][15401] Updated weights for policy 0, policy_version 501230 (0.0029) [2024-06-23 19:33:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 8212168704. Throughput: 0: 42999.9. Samples: 8212262160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 19:33:41,109][15401] Updated weights for policy 0, policy_version 501240 (0.0031) [2024-06-23 19:33:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8212398080. Throughput: 0: 42976.7. Samples: 8212518040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:43,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 19:33:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501246_8212414464.pth... [2024-06-23 19:33:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500620_8202158080.pth [2024-06-23 19:33:45,131][15401] Updated weights for policy 0, policy_version 501250 (0.0040) [2024-06-23 19:33:48,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8212594688. Throughput: 0: 42937.8. Samples: 8212773080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:48,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-23 19:33:48,843][15401] Updated weights for policy 0, policy_version 501260 (0.0033) [2024-06-23 19:33:52,763][15401] Updated weights for policy 0, policy_version 501270 (0.0031) [2024-06-23 19:33:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8212824064. Throughput: 0: 42802.1. Samples: 8212897960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:53,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 19:33:56,435][15401] Updated weights for policy 0, policy_version 501280 (0.0030) [2024-06-23 19:33:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8213037056. Throughput: 0: 42994.8. Samples: 8213163600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:33:58,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-23 19:34:00,525][15401] Updated weights for policy 0, policy_version 501290 (0.0032) [2024-06-23 19:34:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8213266432. Throughput: 0: 43096.4. Samples: 8213418640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:34:03,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 19:34:04,074][15401] Updated weights for policy 0, policy_version 501300 (0.0029) [2024-06-23 19:34:07,967][15401] Updated weights for policy 0, policy_version 501310 (0.0035) [2024-06-23 19:34:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8213463040. Throughput: 0: 42958.7. Samples: 8213549000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:34:08,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 19:34:11,754][15401] Updated weights for policy 0, policy_version 501320 (0.0037) [2024-06-23 19:34:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8213676032. Throughput: 0: 42894.2. Samples: 8213808700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:34:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 19:34:14,438][15349] Signal inference workers to stop experience collection... (121700 times) [2024-06-23 19:34:14,438][15349] Signal inference workers to resume experience collection... (121700 times) [2024-06-23 19:34:14,448][15401] InferenceWorker_p0-w0: stopping experience collection (121700 times) [2024-06-23 19:34:14,452][15401] InferenceWorker_p0-w0: resuming experience collection (121700 times) [2024-06-23 19:34:15,514][15401] Updated weights for policy 0, policy_version 501330 (0.0036) [2024-06-23 19:34:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8213921792. Throughput: 0: 42955.7. Samples: 8214065020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:34:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 19:34:19,460][15401] Updated weights for policy 0, policy_version 501340 (0.0035) [2024-06-23 19:34:23,281][15401] Updated weights for policy 0, policy_version 501350 (0.0036) [2024-06-23 19:34:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 8214118400. Throughput: 0: 43116.1. Samples: 8214202380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 19:34:27,168][15401] Updated weights for policy 0, policy_version 501360 (0.0038) [2024-06-23 19:34:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8214331392. Throughput: 0: 43165.8. Samples: 8214460400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 19:34:30,811][15401] Updated weights for policy 0, policy_version 501370 (0.0035) [2024-06-23 19:34:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8214577152. Throughput: 0: 42991.0. Samples: 8214707680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 19:34:34,667][15401] Updated weights for policy 0, policy_version 501380 (0.0028) [2024-06-23 19:34:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 8214757376. Throughput: 0: 43171.2. Samples: 8214840660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:38,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-23 19:34:38,522][15401] Updated weights for policy 0, policy_version 501390 (0.0041) [2024-06-23 19:34:42,186][15401] Updated weights for policy 0, policy_version 501400 (0.0030) [2024-06-23 19:34:43,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42600.2, 300 sec: 42709.8). Total num frames: 8214953984. Throughput: 0: 42896.5. Samples: 8215093940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 19:34:46,027][15401] Updated weights for policy 0, policy_version 501410 (0.0043) [2024-06-23 19:34:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42877.0). Total num frames: 8215199744. Throughput: 0: 42882.4. Samples: 8215348340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 19:34:50,282][15401] Updated weights for policy 0, policy_version 501420 (0.0033) [2024-06-23 19:34:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8215396352. Throughput: 0: 42945.9. Samples: 8215481560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 19:34:53,952][15401] Updated weights for policy 0, policy_version 501430 (0.0047) [2024-06-23 19:34:57,835][15401] Updated weights for policy 0, policy_version 501440 (0.0036) [2024-06-23 19:34:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8215609344. Throughput: 0: 42826.1. Samples: 8215735880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:34:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 19:35:01,375][15401] Updated weights for policy 0, policy_version 501450 (0.0028) [2024-06-23 19:35:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8215838720. Throughput: 0: 42889.7. Samples: 8215995060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 19:35:05,173][15401] Updated weights for policy 0, policy_version 501460 (0.0043) [2024-06-23 19:35:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8216035328. Throughput: 0: 42778.6. Samples: 8216127420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 19:35:08,827][15401] Updated weights for policy 0, policy_version 501470 (0.0036) [2024-06-23 19:35:12,650][15401] Updated weights for policy 0, policy_version 501480 (0.0024) [2024-06-23 19:35:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 8216264704. Throughput: 0: 42837.0. Samples: 8216388060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:13,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 19:35:16,435][15401] Updated weights for policy 0, policy_version 501490 (0.0028) [2024-06-23 19:35:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8216494080. Throughput: 0: 43025.1. Samples: 8216643800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 19:35:20,222][15401] Updated weights for policy 0, policy_version 501500 (0.0041) [2024-06-23 19:35:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8216674304. Throughput: 0: 42994.2. Samples: 8216775400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 19:35:23,931][15401] Updated weights for policy 0, policy_version 501510 (0.0040) [2024-06-23 19:35:26,325][15349] Signal inference workers to stop experience collection... (121750 times) [2024-06-23 19:35:26,325][15349] Signal inference workers to resume experience collection... (121750 times) [2024-06-23 19:35:26,372][15401] InferenceWorker_p0-w0: stopping experience collection (121750 times) [2024-06-23 19:35:26,372][15401] InferenceWorker_p0-w0: resuming experience collection (121750 times) [2024-06-23 19:35:27,561][15401] Updated weights for policy 0, policy_version 501520 (0.0036) [2024-06-23 19:35:28,394][15132] Fps is (10 sec: 44215.2, 60 sec: 43414.2, 300 sec: 42931.0). Total num frames: 8216936448. Throughput: 0: 43186.9. Samples: 8217037560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:28,395][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 19:35:31,558][15401] Updated weights for policy 0, policy_version 501530 (0.0028) [2024-06-23 19:35:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8217133056. Throughput: 0: 43245.3. Samples: 8217294380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 19:35:35,065][15401] Updated weights for policy 0, policy_version 501540 (0.0032) [2024-06-23 19:35:38,390][15132] Fps is (10 sec: 39340.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8217329664. Throughput: 0: 43184.4. Samples: 8217424860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 19:35:39,502][15401] Updated weights for policy 0, policy_version 501550 (0.0042) [2024-06-23 19:35:43,174][15401] Updated weights for policy 0, policy_version 501560 (0.0037) [2024-06-23 19:35:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 8217575424. Throughput: 0: 43174.3. Samples: 8217678720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-23 19:35:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 19:35:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501561_8217575424.pth... [2024-06-23 19:35:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000500931_8207253504.pth [2024-06-23 19:35:47,211][15401] Updated weights for policy 0, policy_version 501570 (0.0029) [2024-06-23 19:35:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8217788416. Throughput: 0: 43133.0. Samples: 8217936040. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:35:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 19:35:50,701][15401] Updated weights for policy 0, policy_version 501580 (0.0035) [2024-06-23 19:35:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8217968640. Throughput: 0: 42950.8. Samples: 8218060200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:35:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 19:35:54,917][15401] Updated weights for policy 0, policy_version 501590 (0.0022) [2024-06-23 19:35:58,336][15401] Updated weights for policy 0, policy_version 501600 (0.0029) [2024-06-23 19:35:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 8218214400. Throughput: 0: 42860.9. Samples: 8218316800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:35:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 19:36:02,634][15401] Updated weights for policy 0, policy_version 501610 (0.0029) [2024-06-23 19:36:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8218411008. Throughput: 0: 42981.1. Samples: 8218578060. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:03,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 19:36:06,021][15401] Updated weights for policy 0, policy_version 501620 (0.0042) [2024-06-23 19:36:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8218607616. Throughput: 0: 42810.7. Samples: 8218701880. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 19:36:10,312][15401] Updated weights for policy 0, policy_version 501630 (0.0036) [2024-06-23 19:36:13,392][15132] Fps is (10 sec: 44237.0, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 8218853376. Throughput: 0: 42642.3. Samples: 8218956360. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:13,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 19:36:13,637][15401] Updated weights for policy 0, policy_version 501640 (0.0042) [2024-06-23 19:36:17,818][15401] Updated weights for policy 0, policy_version 501650 (0.0029) [2024-06-23 19:36:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8219049984. Throughput: 0: 42845.4. Samples: 8219222420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 19:36:21,037][15401] Updated weights for policy 0, policy_version 501660 (0.0041) [2024-06-23 19:36:23,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8219246592. Throughput: 0: 42774.6. Samples: 8219349720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:23,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 19:36:25,493][15401] Updated weights for policy 0, policy_version 501670 (0.0030) [2024-06-23 19:36:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42601.8, 300 sec: 42987.2). Total num frames: 8219492352. Throughput: 0: 42783.1. Samples: 8219603960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 19:36:28,574][15401] Updated weights for policy 0, policy_version 501680 (0.0038) [2024-06-23 19:36:33,035][15401] Updated weights for policy 0, policy_version 501690 (0.0038) [2024-06-23 19:36:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8219688960. Throughput: 0: 42980.8. Samples: 8219870180. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 19:36:36,135][15401] Updated weights for policy 0, policy_version 501700 (0.0044) [2024-06-23 19:36:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8219901952. Throughput: 0: 42967.5. Samples: 8219993740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 19:36:40,673][15401] Updated weights for policy 0, policy_version 501710 (0.0039) [2024-06-23 19:36:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8220147712. Throughput: 0: 42963.1. Samples: 8220250140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:36:44,256][15401] Updated weights for policy 0, policy_version 501720 (0.0042) [2024-06-23 19:36:48,198][15401] Updated weights for policy 0, policy_version 501730 (0.0037) [2024-06-23 19:36:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8220344320. Throughput: 0: 42910.7. Samples: 8220508940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 19:36:51,769][15401] Updated weights for policy 0, policy_version 501740 (0.0039) [2024-06-23 19:36:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.3, 300 sec: 42987.1). Total num frames: 8220557312. Throughput: 0: 42932.6. Samples: 8220633860. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 19:36:55,672][15401] Updated weights for policy 0, policy_version 501750 (0.0036) [2024-06-23 19:36:56,918][15349] Signal inference workers to stop experience collection... (121800 times) [2024-06-23 19:36:56,975][15401] InferenceWorker_p0-w0: stopping experience collection (121800 times) [2024-06-23 19:36:57,033][15349] Signal inference workers to resume experience collection... (121800 times) [2024-06-23 19:36:57,034][15401] InferenceWorker_p0-w0: resuming experience collection (121800 times) [2024-06-23 19:36:58,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8220803072. Throughput: 0: 43118.3. Samples: 8220896580. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:36:58,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-23 19:36:59,679][15401] Updated weights for policy 0, policy_version 501760 (0.0032) [2024-06-23 19:37:03,250][15401] Updated weights for policy 0, policy_version 501770 (0.0032) [2024-06-23 19:37:03,390][15132] Fps is (10 sec: 44237.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 8220999680. Throughput: 0: 42951.9. Samples: 8221155260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:37:03,390][15132] Avg episode reward: [(0, '0.022')] [2024-06-23 19:37:07,156][15401] Updated weights for policy 0, policy_version 501780 (0.0042) [2024-06-23 19:37:08,390][15132] Fps is (10 sec: 39319.1, 60 sec: 43144.0, 300 sec: 42987.4). Total num frames: 8221196288. Throughput: 0: 42878.1. Samples: 8221279260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 22.0) [2024-06-23 19:37:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 19:37:10,981][15401] Updated weights for policy 0, policy_version 501790 (0.0029) [2024-06-23 19:37:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 8221442048. Throughput: 0: 43140.5. Samples: 8221545280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:13,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 19:37:14,870][15401] Updated weights for policy 0, policy_version 501800 (0.0026) [2024-06-23 19:37:18,390][15132] Fps is (10 sec: 42601.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8221622272. Throughput: 0: 42827.1. Samples: 8221797400. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 19:37:18,932][15401] Updated weights for policy 0, policy_version 501810 (0.0031) [2024-06-23 19:37:22,406][15401] Updated weights for policy 0, policy_version 501820 (0.0031) [2024-06-23 19:37:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8221835264. Throughput: 0: 42866.5. Samples: 8221922740. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 19:37:26,438][15401] Updated weights for policy 0, policy_version 501830 (0.0037) [2024-06-23 19:37:28,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 8222081024. Throughput: 0: 43013.5. Samples: 8222186020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:28,396][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 19:37:29,847][15401] Updated weights for policy 0, policy_version 501840 (0.0026) [2024-06-23 19:37:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8222277632. Throughput: 0: 42908.8. Samples: 8222439840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 19:37:34,256][15401] Updated weights for policy 0, policy_version 501850 (0.0048) [2024-06-23 19:37:37,373][15401] Updated weights for policy 0, policy_version 501860 (0.0034) [2024-06-23 19:37:38,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8222474240. Throughput: 0: 43044.4. Samples: 8222570840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 19:37:41,724][15401] Updated weights for policy 0, policy_version 501870 (0.0048) [2024-06-23 19:37:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8222720000. Throughput: 0: 43016.8. Samples: 8222832340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 19:37:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501875_8222720000.pth... [2024-06-23 19:37:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501246_8212414464.pth [2024-06-23 19:37:44,865][15401] Updated weights for policy 0, policy_version 501880 (0.0035) [2024-06-23 19:37:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8222916608. Throughput: 0: 42807.6. Samples: 8223081600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 19:37:49,339][15401] Updated weights for policy 0, policy_version 501890 (0.0022) [2024-06-23 19:37:52,618][15401] Updated weights for policy 0, policy_version 501900 (0.0034) [2024-06-23 19:37:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8223129600. Throughput: 0: 42920.7. Samples: 8223210660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 19:37:57,075][15401] Updated weights for policy 0, policy_version 501910 (0.0032) [2024-06-23 19:37:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 8223358976. Throughput: 0: 42793.3. Samples: 8223470980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:37:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 19:38:00,185][15401] Updated weights for policy 0, policy_version 501920 (0.0039) [2024-06-23 19:38:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8223555584. Throughput: 0: 42896.4. Samples: 8223727740. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:03,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-23 19:38:04,611][15401] Updated weights for policy 0, policy_version 501930 (0.0043) [2024-06-23 19:38:07,618][15349] Signal inference workers to stop experience collection... (121850 times) [2024-06-23 19:38:07,619][15349] Signal inference workers to resume experience collection... (121850 times) [2024-06-23 19:38:07,661][15401] InferenceWorker_p0-w0: stopping experience collection (121850 times) [2024-06-23 19:38:07,661][15401] InferenceWorker_p0-w0: resuming experience collection (121850 times) [2024-06-23 19:38:07,756][15401] Updated weights for policy 0, policy_version 501940 (0.0032) [2024-06-23 19:38:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43145.0, 300 sec: 42987.2). Total num frames: 8223784960. Throughput: 0: 42917.8. Samples: 8223854040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 19:38:12,175][15401] Updated weights for policy 0, policy_version 501950 (0.0035) [2024-06-23 19:38:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8223997952. Throughput: 0: 42869.6. Samples: 8224114880. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 19:38:15,554][15401] Updated weights for policy 0, policy_version 501960 (0.0041) [2024-06-23 19:38:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 8224194560. Throughput: 0: 42892.1. Samples: 8224369980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:18,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 19:38:20,071][15401] Updated weights for policy 0, policy_version 501970 (0.0030) [2024-06-23 19:38:23,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 8224423936. Throughput: 0: 42782.1. Samples: 8224496320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:23,397][15132] Avg episode reward: [(0, '0.377')] [2024-06-23 19:38:23,647][15401] Updated weights for policy 0, policy_version 501980 (0.0039) [2024-06-23 19:38:27,694][15401] Updated weights for policy 0, policy_version 501990 (0.0029) [2024-06-23 19:38:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42329.7, 300 sec: 42876.1). Total num frames: 8224620544. Throughput: 0: 42785.3. Samples: 8224757680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 19:38:31,186][15401] Updated weights for policy 0, policy_version 502000 (0.0037) [2024-06-23 19:38:33,390][15132] Fps is (10 sec: 42625.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8224849920. Throughput: 0: 42887.1. Samples: 8225011520. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:33,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-23 19:38:35,315][15401] Updated weights for policy 0, policy_version 502010 (0.0030) [2024-06-23 19:38:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 8225062912. Throughput: 0: 42949.4. Samples: 8225143380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-23 19:38:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 19:38:38,602][15401] Updated weights for policy 0, policy_version 502020 (0.0034) [2024-06-23 19:38:43,172][15401] Updated weights for policy 0, policy_version 502030 (0.0034) [2024-06-23 19:38:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8225275904. Throughput: 0: 42911.0. Samples: 8225401980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:38:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 19:38:46,356][15401] Updated weights for policy 0, policy_version 502040 (0.0038) [2024-06-23 19:38:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8225488896. Throughput: 0: 42785.0. Samples: 8225653060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:38:48,390][15132] Avg episode reward: [(0, '0.914')] [2024-06-23 19:38:50,720][15401] Updated weights for policy 0, policy_version 502050 (0.0035) [2024-06-23 19:38:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 8225718272. Throughput: 0: 42888.4. Samples: 8225784020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:38:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 19:38:53,930][15401] Updated weights for policy 0, policy_version 502060 (0.0031) [2024-06-23 19:38:58,182][15401] Updated weights for policy 0, policy_version 502070 (0.0027) [2024-06-23 19:38:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8225914880. Throughput: 0: 42934.2. Samples: 8226046920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:38:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 19:39:01,535][15401] Updated weights for policy 0, policy_version 502080 (0.0033) [2024-06-23 19:39:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 8226127872. Throughput: 0: 42960.4. Samples: 8226303300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:03,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 19:39:05,788][15401] Updated weights for policy 0, policy_version 502090 (0.0031) [2024-06-23 19:39:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8226373632. Throughput: 0: 43005.8. Samples: 8226431300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 19:39:08,999][15401] Updated weights for policy 0, policy_version 502100 (0.0044) [2024-06-23 19:39:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8226553856. Throughput: 0: 42912.5. Samples: 8226688740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 19:39:13,412][15401] Updated weights for policy 0, policy_version 502110 (0.0028) [2024-06-23 19:39:16,581][15401] Updated weights for policy 0, policy_version 502120 (0.0033) [2024-06-23 19:39:18,092][15349] Signal inference workers to stop experience collection... (121900 times) [2024-06-23 19:39:18,140][15401] InferenceWorker_p0-w0: stopping experience collection (121900 times) [2024-06-23 19:39:18,208][15349] Signal inference workers to resume experience collection... (121900 times) [2024-06-23 19:39:18,209][15401] InferenceWorker_p0-w0: resuming experience collection (121900 times) [2024-06-23 19:39:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 8226783232. Throughput: 0: 43055.5. Samples: 8226949120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:18,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 19:39:21,081][15401] Updated weights for policy 0, policy_version 502130 (0.0037) [2024-06-23 19:39:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.1, 300 sec: 42876.1). Total num frames: 8226979840. Throughput: 0: 42900.9. Samples: 8227073920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 19:39:24,463][15401] Updated weights for policy 0, policy_version 502140 (0.0031) [2024-06-23 19:39:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8227209216. Throughput: 0: 42895.6. Samples: 8227332280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 19:39:28,695][15401] Updated weights for policy 0, policy_version 502150 (0.0027) [2024-06-23 19:39:32,115][15401] Updated weights for policy 0, policy_version 502160 (0.0039) [2024-06-23 19:39:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8227422208. Throughput: 0: 42982.6. Samples: 8227587280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 19:39:36,331][15401] Updated weights for policy 0, policy_version 502170 (0.0034) [2024-06-23 19:39:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8227618816. Throughput: 0: 42952.5. Samples: 8227716880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:38,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 19:39:39,624][15401] Updated weights for policy 0, policy_version 502180 (0.0034) [2024-06-23 19:39:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8227848192. Throughput: 0: 42785.0. Samples: 8227972240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 19:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502188_8227848192.pth... [2024-06-23 19:39:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501561_8217575424.pth [2024-06-23 19:39:44,096][15401] Updated weights for policy 0, policy_version 502190 (0.0024) [2024-06-23 19:39:47,207][15401] Updated weights for policy 0, policy_version 502200 (0.0031) [2024-06-23 19:39:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8228061184. Throughput: 0: 42858.0. Samples: 8228231800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 19:39:51,927][15401] Updated weights for policy 0, policy_version 502210 (0.0030) [2024-06-23 19:39:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8228257792. Throughput: 0: 42778.1. Samples: 8228356320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 19:39:55,027][15401] Updated weights for policy 0, policy_version 502220 (0.0036) [2024-06-23 19:39:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8228487168. Throughput: 0: 42788.0. Samples: 8228614200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:39:58,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 19:39:59,357][15401] Updated weights for policy 0, policy_version 502230 (0.0036) [2024-06-23 19:40:02,739][15401] Updated weights for policy 0, policy_version 502240 (0.0033) [2024-06-23 19:40:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 8228716544. Throughput: 0: 42648.9. Samples: 8228868220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 19:40:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 19:40:06,798][15401] Updated weights for policy 0, policy_version 502250 (0.0024) [2024-06-23 19:40:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 8228896768. Throughput: 0: 42798.6. Samples: 8228999860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 19:40:10,318][15401] Updated weights for policy 0, policy_version 502260 (0.0038) [2024-06-23 19:40:13,394][15132] Fps is (10 sec: 40942.3, 60 sec: 42868.4, 300 sec: 42819.9). Total num frames: 8229126144. Throughput: 0: 42707.4. Samples: 8229254300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:13,394][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 19:40:14,838][15401] Updated weights for policy 0, policy_version 502270 (0.0031) [2024-06-23 19:40:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 8229339136. Throughput: 0: 42732.5. Samples: 8229510240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 19:40:18,441][15401] Updated weights for policy 0, policy_version 502280 (0.0037) [2024-06-23 19:40:22,461][15401] Updated weights for policy 0, policy_version 502290 (0.0036) [2024-06-23 19:40:23,392][15132] Fps is (10 sec: 40967.9, 60 sec: 42596.6, 300 sec: 42709.8). Total num frames: 8229535744. Throughput: 0: 42607.0. Samples: 8229634300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:23,401][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 19:40:25,933][15401] Updated weights for policy 0, policy_version 502300 (0.0027) [2024-06-23 19:40:27,237][15349] Signal inference workers to stop experience collection... (121950 times) [2024-06-23 19:40:27,282][15401] InferenceWorker_p0-w0: stopping experience collection (121950 times) [2024-06-23 19:40:27,291][15349] Signal inference workers to resume experience collection... (121950 times) [2024-06-23 19:40:27,296][15401] InferenceWorker_p0-w0: resuming experience collection (121950 times) [2024-06-23 19:40:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8229781504. Throughput: 0: 42676.0. Samples: 8229892660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 19:40:29,986][15401] Updated weights for policy 0, policy_version 502310 (0.0034) [2024-06-23 19:40:33,391][15132] Fps is (10 sec: 44239.0, 60 sec: 42597.0, 300 sec: 42875.8). Total num frames: 8229978112. Throughput: 0: 42708.7. Samples: 8230153780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:33,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 19:40:33,767][15401] Updated weights for policy 0, policy_version 502320 (0.0031) [2024-06-23 19:40:37,567][15401] Updated weights for policy 0, policy_version 502330 (0.0030) [2024-06-23 19:40:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8230191104. Throughput: 0: 42604.1. Samples: 8230273500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 19:40:41,453][15401] Updated weights for policy 0, policy_version 502340 (0.0031) [2024-06-23 19:40:43,389][15132] Fps is (10 sec: 45884.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8230436864. Throughput: 0: 42640.6. Samples: 8230533020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 19:40:45,078][15401] Updated weights for policy 0, policy_version 502350 (0.0033) [2024-06-23 19:40:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8230617088. Throughput: 0: 42935.1. Samples: 8230800300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 19:40:49,002][15401] Updated weights for policy 0, policy_version 502360 (0.0041) [2024-06-23 19:40:52,604][15401] Updated weights for policy 0, policy_version 502370 (0.0031) [2024-06-23 19:40:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8230830080. Throughput: 0: 42686.3. Samples: 8230920740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 19:40:56,663][15401] Updated weights for policy 0, policy_version 502380 (0.0030) [2024-06-23 19:40:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 8231075840. Throughput: 0: 42835.4. Samples: 8231181700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:40:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 19:41:00,530][15401] Updated weights for policy 0, policy_version 502390 (0.0033) [2024-06-23 19:41:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8231256064. Throughput: 0: 43052.2. Samples: 8231447600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 19:41:04,282][15401] Updated weights for policy 0, policy_version 502400 (0.0025) [2024-06-23 19:41:07,947][15401] Updated weights for policy 0, policy_version 502410 (0.0030) [2024-06-23 19:41:08,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 8231485440. Throughput: 0: 42871.6. Samples: 8231563420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 19:41:11,994][15401] Updated weights for policy 0, policy_version 502420 (0.0034) [2024-06-23 19:41:13,389][15132] Fps is (10 sec: 47514.8, 60 sec: 43420.8, 300 sec: 42987.2). Total num frames: 8231731200. Throughput: 0: 42895.1. Samples: 8231822940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 19:41:15,482][15401] Updated weights for policy 0, policy_version 502430 (0.0048) [2024-06-23 19:41:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8231895040. Throughput: 0: 42755.6. Samples: 8232077700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 19:41:19,774][15401] Updated weights for policy 0, policy_version 502440 (0.0026) [2024-06-23 19:41:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 8232124416. Throughput: 0: 42760.4. Samples: 8232197720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 19:41:23,653][15401] Updated weights for policy 0, policy_version 502450 (0.0043) [2024-06-23 19:41:27,313][15401] Updated weights for policy 0, policy_version 502460 (0.0032) [2024-06-23 19:41:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8232337408. Throughput: 0: 42794.5. Samples: 8232458780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 19:41:31,389][15401] Updated weights for policy 0, policy_version 502470 (0.0036) [2024-06-23 19:41:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42326.8, 300 sec: 42765.0). Total num frames: 8232517632. Throughput: 0: 42496.2. Samples: 8232712620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 19:41:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 19:41:35,283][15401] Updated weights for policy 0, policy_version 502480 (0.0041) [2024-06-23 19:41:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8232763392. Throughput: 0: 42549.8. Samples: 8232835480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:41:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 19:41:39,292][15401] Updated weights for policy 0, policy_version 502490 (0.0034) [2024-06-23 19:41:42,806][15401] Updated weights for policy 0, policy_version 502500 (0.0038) [2024-06-23 19:41:43,396][15132] Fps is (10 sec: 45845.2, 60 sec: 42320.7, 300 sec: 42819.6). Total num frames: 8232976384. Throughput: 0: 42416.1. Samples: 8233090700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:41:43,397][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 19:41:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502501_8232976384.pth... [2024-06-23 19:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000501875_8222720000.pth [2024-06-23 19:41:47,089][15401] Updated weights for policy 0, policy_version 502510 (0.0041) [2024-06-23 19:41:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8233172992. Throughput: 0: 42132.1. Samples: 8233343540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:41:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 19:41:49,803][15349] Signal inference workers to stop experience collection... (122000 times) [2024-06-23 19:41:49,864][15349] Signal inference workers to resume experience collection... (122000 times) [2024-06-23 19:41:49,864][15401] InferenceWorker_p0-w0: stopping experience collection (122000 times) [2024-06-23 19:41:49,878][15401] InferenceWorker_p0-w0: resuming experience collection (122000 times) [2024-06-23 19:41:50,337][15401] Updated weights for policy 0, policy_version 502520 (0.0032) [2024-06-23 19:41:53,389][15132] Fps is (10 sec: 44265.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8233418752. Throughput: 0: 42442.0. Samples: 8233473300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:41:53,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 19:41:54,653][15401] Updated weights for policy 0, policy_version 502530 (0.0034) [2024-06-23 19:41:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8233598976. Throughput: 0: 42332.5. Samples: 8233727900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:41:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 19:41:58,402][15401] Updated weights for policy 0, policy_version 502540 (0.0035) [2024-06-23 19:42:02,796][15401] Updated weights for policy 0, policy_version 502550 (0.0042) [2024-06-23 19:42:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.5, 300 sec: 42709.6). Total num frames: 8233795584. Throughput: 0: 42421.9. Samples: 8233986680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 19:42:05,923][15401] Updated weights for policy 0, policy_version 502560 (0.0027) [2024-06-23 19:42:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8234041344. Throughput: 0: 42474.2. Samples: 8234109160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:08,392][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 19:42:10,254][15401] Updated weights for policy 0, policy_version 502570 (0.0034) [2024-06-23 19:42:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 8234254336. Throughput: 0: 42424.5. Samples: 8234367880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 19:42:13,399][15401] Updated weights for policy 0, policy_version 502580 (0.0032) [2024-06-23 19:42:17,943][15401] Updated weights for policy 0, policy_version 502590 (0.0035) [2024-06-23 19:42:18,392][15132] Fps is (10 sec: 39321.7, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 8234434560. Throughput: 0: 42752.3. Samples: 8234636580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:18,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 19:42:20,987][15401] Updated weights for policy 0, policy_version 502600 (0.0037) [2024-06-23 19:42:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 8234696704. Throughput: 0: 42660.0. Samples: 8234755180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 19:42:25,471][15401] Updated weights for policy 0, policy_version 502610 (0.0041) [2024-06-23 19:42:28,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8234893312. Throughput: 0: 42882.9. Samples: 8235020160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 19:42:28,624][15401] Updated weights for policy 0, policy_version 502620 (0.0033) [2024-06-23 19:42:33,085][15401] Updated weights for policy 0, policy_version 502630 (0.0027) [2024-06-23 19:42:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8235089920. Throughput: 0: 43149.7. Samples: 8235285280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 19:42:36,112][15401] Updated weights for policy 0, policy_version 502640 (0.0033) [2024-06-23 19:42:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8235335680. Throughput: 0: 42971.1. Samples: 8235407000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 19:42:40,616][15401] Updated weights for policy 0, policy_version 502650 (0.0051) [2024-06-23 19:42:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42876.0, 300 sec: 42820.6). Total num frames: 8235548672. Throughput: 0: 43159.4. Samples: 8235670080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 19:42:43,526][15401] Updated weights for policy 0, policy_version 502660 (0.0041) [2024-06-23 19:42:48,226][15401] Updated weights for policy 0, policy_version 502670 (0.0043) [2024-06-23 19:42:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8235745280. Throughput: 0: 43207.0. Samples: 8235931000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 19:42:51,369][15401] Updated weights for policy 0, policy_version 502680 (0.0046) [2024-06-23 19:42:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8235991040. Throughput: 0: 43240.0. Samples: 8236054860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 19:42:55,689][15401] Updated weights for policy 0, policy_version 502690 (0.0039) [2024-06-23 19:42:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8236171264. Throughput: 0: 43172.5. Samples: 8236310640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 19:42:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 19:42:59,007][15401] Updated weights for policy 0, policy_version 502700 (0.0045) [2024-06-23 19:43:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8236367872. Throughput: 0: 42976.8. Samples: 8236570440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 19:43:03,686][15401] Updated weights for policy 0, policy_version 502710 (0.0037) [2024-06-23 19:43:06,396][15349] Signal inference workers to stop experience collection... (122050 times) [2024-06-23 19:43:06,448][15401] InferenceWorker_p0-w0: stopping experience collection (122050 times) [2024-06-23 19:43:06,447][15349] Signal inference workers to resume experience collection... (122050 times) [2024-06-23 19:43:06,473][15401] InferenceWorker_p0-w0: resuming experience collection (122050 times) [2024-06-23 19:43:06,579][15401] Updated weights for policy 0, policy_version 502720 (0.0027) [2024-06-23 19:43:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8236613632. Throughput: 0: 43157.7. Samples: 8236697280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 19:43:11,160][15401] Updated weights for policy 0, policy_version 502730 (0.0034) [2024-06-23 19:43:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8236826624. Throughput: 0: 42989.0. Samples: 8236954660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 19:43:14,320][15401] Updated weights for policy 0, policy_version 502740 (0.0041) [2024-06-23 19:43:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.2, 300 sec: 42710.4). Total num frames: 8237023232. Throughput: 0: 42770.7. Samples: 8237209960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:18,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 19:43:19,271][15401] Updated weights for policy 0, policy_version 502750 (0.0043) [2024-06-23 19:43:21,933][15401] Updated weights for policy 0, policy_version 502760 (0.0022) [2024-06-23 19:43:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 8237268992. Throughput: 0: 42874.0. Samples: 8237336440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:23,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 19:43:26,811][15401] Updated weights for policy 0, policy_version 502770 (0.0041) [2024-06-23 19:43:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8237465600. Throughput: 0: 42825.0. Samples: 8237597200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 19:43:29,592][15401] Updated weights for policy 0, policy_version 502780 (0.0041) [2024-06-23 19:43:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8237678592. Throughput: 0: 42786.3. Samples: 8237856380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 19:43:34,301][15401] Updated weights for policy 0, policy_version 502790 (0.0032) [2024-06-23 19:43:37,305][15401] Updated weights for policy 0, policy_version 502800 (0.0033) [2024-06-23 19:43:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8237891584. Throughput: 0: 42795.6. Samples: 8237980660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 19:43:41,816][15401] Updated weights for policy 0, policy_version 502810 (0.0033) [2024-06-23 19:43:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8238120960. Throughput: 0: 42853.1. Samples: 8238239040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 19:43:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502815_8238120960.pth... [2024-06-23 19:43:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502188_8227848192.pth [2024-06-23 19:43:45,038][15401] Updated weights for policy 0, policy_version 502820 (0.0030) [2024-06-23 19:43:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8238333952. Throughput: 0: 42775.6. Samples: 8238495340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 19:43:49,275][15401] Updated weights for policy 0, policy_version 502830 (0.0037) [2024-06-23 19:43:52,476][15401] Updated weights for policy 0, policy_version 502840 (0.0028) [2024-06-23 19:43:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8238546944. Throughput: 0: 42834.5. Samples: 8238624840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 19:43:56,598][15401] Updated weights for policy 0, policy_version 502850 (0.0030) [2024-06-23 19:43:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 8238759936. Throughput: 0: 42826.6. Samples: 8238881860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:43:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 19:43:59,989][15401] Updated weights for policy 0, policy_version 502860 (0.0037) [2024-06-23 19:44:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8238972928. Throughput: 0: 42906.7. Samples: 8239140760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:03,394][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 19:44:04,037][15401] Updated weights for policy 0, policy_version 502870 (0.0038) [2024-06-23 19:44:07,547][15401] Updated weights for policy 0, policy_version 502880 (0.0041) [2024-06-23 19:44:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8239185920. Throughput: 0: 43054.0. Samples: 8239273760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 19:44:11,654][15401] Updated weights for policy 0, policy_version 502890 (0.0042) [2024-06-23 19:44:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 8239382528. Throughput: 0: 42780.9. Samples: 8239522340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 19:44:15,599][15401] Updated weights for policy 0, policy_version 502900 (0.0041) [2024-06-23 19:44:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8239628288. Throughput: 0: 42737.6. Samples: 8239779580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 19:44:19,226][15401] Updated weights for policy 0, policy_version 502910 (0.0039) [2024-06-23 19:44:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8239824896. Throughput: 0: 42832.8. Samples: 8239908140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:23,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 19:44:23,429][15401] Updated weights for policy 0, policy_version 502920 (0.0039) [2024-06-23 19:44:24,997][15349] Signal inference workers to stop experience collection... (122100 times) [2024-06-23 19:44:24,998][15349] Signal inference workers to resume experience collection... (122100 times) [2024-06-23 19:44:25,014][15401] InferenceWorker_p0-w0: stopping experience collection (122100 times) [2024-06-23 19:44:25,015][15401] InferenceWorker_p0-w0: resuming experience collection (122100 times) [2024-06-23 19:44:27,260][15401] Updated weights for policy 0, policy_version 502930 (0.0027) [2024-06-23 19:44:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8240037888. Throughput: 0: 42833.0. Samples: 8240166520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-23 19:44:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 19:44:30,994][15401] Updated weights for policy 0, policy_version 502940 (0.0043) [2024-06-23 19:44:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8240267264. Throughput: 0: 42774.7. Samples: 8240420200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 19:44:34,917][15401] Updated weights for policy 0, policy_version 502950 (0.0037) [2024-06-23 19:44:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8240480256. Throughput: 0: 42849.9. Samples: 8240553080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 19:44:38,619][15401] Updated weights for policy 0, policy_version 502960 (0.0049) [2024-06-23 19:44:42,619][15401] Updated weights for policy 0, policy_version 502970 (0.0029) [2024-06-23 19:44:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8240676864. Throughput: 0: 42758.7. Samples: 8240806000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 19:44:46,331][15401] Updated weights for policy 0, policy_version 502980 (0.0037) [2024-06-23 19:44:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8240889856. Throughput: 0: 42849.7. Samples: 8241069000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 19:44:50,131][15401] Updated weights for policy 0, policy_version 502990 (0.0037) [2024-06-23 19:44:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8241119232. Throughput: 0: 42741.8. Samples: 8241197140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 19:44:54,125][15401] Updated weights for policy 0, policy_version 503000 (0.0039) [2024-06-23 19:44:57,697][15401] Updated weights for policy 0, policy_version 503010 (0.0035) [2024-06-23 19:44:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8241315840. Throughput: 0: 42809.8. Samples: 8241448780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:44:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 19:45:01,799][15401] Updated weights for policy 0, policy_version 503020 (0.0035) [2024-06-23 19:45:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8241528832. Throughput: 0: 42824.6. Samples: 8241706680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 19:45:05,421][15401] Updated weights for policy 0, policy_version 503030 (0.0044) [2024-06-23 19:45:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42821.2). Total num frames: 8241758208. Throughput: 0: 42837.3. Samples: 8241835820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-23 19:45:09,319][15401] Updated weights for policy 0, policy_version 503040 (0.0024) [2024-06-23 19:45:12,943][15401] Updated weights for policy 0, policy_version 503050 (0.0029) [2024-06-23 19:45:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8241987584. Throughput: 0: 42741.3. Samples: 8242089880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 19:45:16,966][15401] Updated weights for policy 0, policy_version 503060 (0.0038) [2024-06-23 19:45:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 8242184192. Throughput: 0: 42736.5. Samples: 8242343340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 19:45:20,689][15401] Updated weights for policy 0, policy_version 503070 (0.0042) [2024-06-23 19:45:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8242380800. Throughput: 0: 42625.9. Samples: 8242471240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 19:45:24,713][15401] Updated weights for policy 0, policy_version 503080 (0.0036) [2024-06-23 19:45:28,363][15401] Updated weights for policy 0, policy_version 503090 (0.0033) [2024-06-23 19:45:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8242626560. Throughput: 0: 42763.9. Samples: 8242730380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 19:45:32,384][15401] Updated weights for policy 0, policy_version 503100 (0.0037) [2024-06-23 19:45:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8242823168. Throughput: 0: 42725.5. Samples: 8242991640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 19:45:35,855][15401] Updated weights for policy 0, policy_version 503110 (0.0036) [2024-06-23 19:45:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8243036160. Throughput: 0: 42571.5. Samples: 8243112860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:38,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-23 19:45:40,218][15401] Updated weights for policy 0, policy_version 503120 (0.0048) [2024-06-23 19:45:43,390][15132] Fps is (10 sec: 42593.9, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 8243249152. Throughput: 0: 42708.3. Samples: 8243370700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:43,391][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 19:45:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503128_8243249152.pth... [2024-06-23 19:45:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502501_8232976384.pth [2024-06-23 19:45:43,795][15401] Updated weights for policy 0, policy_version 503130 (0.0026) [2024-06-23 19:45:47,947][15401] Updated weights for policy 0, policy_version 503140 (0.0031) [2024-06-23 19:45:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8243462144. Throughput: 0: 42664.3. Samples: 8243626580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:48,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-23 19:45:51,373][15401] Updated weights for policy 0, policy_version 503150 (0.0036) [2024-06-23 19:45:53,390][15132] Fps is (10 sec: 40963.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8243658752. Throughput: 0: 42638.2. Samples: 8243754540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 19:45:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 19:45:55,679][15401] Updated weights for policy 0, policy_version 503160 (0.0036) [2024-06-23 19:45:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8243904512. Throughput: 0: 42826.6. Samples: 8244017080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:45:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 19:45:58,941][15401] Updated weights for policy 0, policy_version 503170 (0.0043) [2024-06-23 19:46:03,339][15401] Updated weights for policy 0, policy_version 503180 (0.0034) [2024-06-23 19:46:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8244101120. Throughput: 0: 42957.4. Samples: 8244276420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:03,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 19:46:06,372][15401] Updated weights for policy 0, policy_version 503190 (0.0038) [2024-06-23 19:46:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8244314112. Throughput: 0: 42933.6. Samples: 8244403260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 19:46:10,848][15401] Updated weights for policy 0, policy_version 503200 (0.0040) [2024-06-23 19:46:13,161][15349] Signal inference workers to stop experience collection... (122150 times) [2024-06-23 19:46:13,161][15349] Signal inference workers to resume experience collection... (122150 times) [2024-06-23 19:46:13,178][15401] InferenceWorker_p0-w0: stopping experience collection (122150 times) [2024-06-23 19:46:13,179][15401] InferenceWorker_p0-w0: resuming experience collection (122150 times) [2024-06-23 19:46:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8244559872. Throughput: 0: 42910.3. Samples: 8244661340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 19:46:14,240][15401] Updated weights for policy 0, policy_version 503210 (0.0026) [2024-06-23 19:46:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8244723712. Throughput: 0: 42969.3. Samples: 8244925260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 19:46:18,648][15401] Updated weights for policy 0, policy_version 503220 (0.0029) [2024-06-23 19:46:21,915][15401] Updated weights for policy 0, policy_version 503230 (0.0037) [2024-06-23 19:46:23,396][15132] Fps is (10 sec: 40934.1, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 8244969472. Throughput: 0: 42909.0. Samples: 8245044040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:23,396][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 19:46:26,123][15401] Updated weights for policy 0, policy_version 503240 (0.0033) [2024-06-23 19:46:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8245198848. Throughput: 0: 43051.7. Samples: 8245307980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 19:46:29,397][15401] Updated weights for policy 0, policy_version 503250 (0.0032) [2024-06-23 19:46:33,389][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8245395456. Throughput: 0: 43108.1. Samples: 8245566440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 19:46:33,733][15401] Updated weights for policy 0, policy_version 503260 (0.0034) [2024-06-23 19:46:36,954][15401] Updated weights for policy 0, policy_version 503270 (0.0030) [2024-06-23 19:46:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 8245608448. Throughput: 0: 43090.3. Samples: 8245693600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 19:46:41,368][15401] Updated weights for policy 0, policy_version 503280 (0.0034) [2024-06-23 19:46:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43145.3, 300 sec: 42931.7). Total num frames: 8245837824. Throughput: 0: 42971.2. Samples: 8245950780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 19:46:44,522][15401] Updated weights for policy 0, policy_version 503290 (0.0045) [2024-06-23 19:46:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8246018048. Throughput: 0: 42927.5. Samples: 8246208160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 19:46:49,311][15401] Updated weights for policy 0, policy_version 503300 (0.0041) [2024-06-23 19:46:52,107][15401] Updated weights for policy 0, policy_version 503310 (0.0030) [2024-06-23 19:46:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 8246263808. Throughput: 0: 42761.3. Samples: 8246327520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 19:46:56,864][15401] Updated weights for policy 0, policy_version 503320 (0.0038) [2024-06-23 19:46:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8246460416. Throughput: 0: 42771.6. Samples: 8246586060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:46:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 19:46:59,843][15401] Updated weights for policy 0, policy_version 503330 (0.0044) [2024-06-23 19:47:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 8246657024. Throughput: 0: 42438.6. Samples: 8246835000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:47:03,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 19:47:04,633][15401] Updated weights for policy 0, policy_version 503340 (0.0032) [2024-06-23 19:47:07,989][15401] Updated weights for policy 0, policy_version 503350 (0.0046) [2024-06-23 19:47:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8246919168. Throughput: 0: 42673.1. Samples: 8246964060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:47:08,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 19:47:12,125][15401] Updated weights for policy 0, policy_version 503360 (0.0037) [2024-06-23 19:47:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42932.0). Total num frames: 8247099392. Throughput: 0: 42512.4. Samples: 8247221040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:47:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 19:47:15,693][15401] Updated weights for policy 0, policy_version 503370 (0.0037) [2024-06-23 19:47:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8247312384. Throughput: 0: 42468.1. Samples: 8247477500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-23 19:47:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 19:47:20,015][15401] Updated weights for policy 0, policy_version 503380 (0.0038) [2024-06-23 19:47:23,317][15401] Updated weights for policy 0, policy_version 503390 (0.0038) [2024-06-23 19:47:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 8247541760. Throughput: 0: 42459.1. Samples: 8247604260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 19:47:27,663][15401] Updated weights for policy 0, policy_version 503400 (0.0028) [2024-06-23 19:47:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 8247721984. Throughput: 0: 42403.0. Samples: 8247858920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 19:47:30,827][15401] Updated weights for policy 0, policy_version 503410 (0.0031) [2024-06-23 19:47:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8247918592. Throughput: 0: 42352.5. Samples: 8248114020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 19:47:35,256][15401] Updated weights for policy 0, policy_version 503420 (0.0044) [2024-06-23 19:47:38,197][15349] Signal inference workers to stop experience collection... (122200 times) [2024-06-23 19:47:38,197][15349] Signal inference workers to resume experience collection... (122200 times) [2024-06-23 19:47:38,225][15401] InferenceWorker_p0-w0: stopping experience collection (122200 times) [2024-06-23 19:47:38,225][15401] InferenceWorker_p0-w0: resuming experience collection (122200 times) [2024-06-23 19:47:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8248180736. Throughput: 0: 42479.2. Samples: 8248239080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 19:47:38,502][15401] Updated weights for policy 0, policy_version 503430 (0.0034) [2024-06-23 19:47:43,274][15401] Updated weights for policy 0, policy_version 503440 (0.0030) [2024-06-23 19:47:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 8248360960. Throughput: 0: 42405.7. Samples: 8248494320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:43,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 19:47:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503441_8248377344.pth... [2024-06-23 19:47:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000502815_8238120960.pth [2024-06-23 19:47:46,731][15401] Updated weights for policy 0, policy_version 503450 (0.0033) [2024-06-23 19:47:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8248573952. Throughput: 0: 42379.2. Samples: 8248742060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 19:47:50,943][15401] Updated weights for policy 0, policy_version 503460 (0.0048) [2024-06-23 19:47:53,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 8248803328. Throughput: 0: 42392.4. Samples: 8248871820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:53,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 19:47:54,377][15401] Updated weights for policy 0, policy_version 503470 (0.0032) [2024-06-23 19:47:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8248999936. Throughput: 0: 42456.1. Samples: 8249131560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:47:58,390][15132] Avg episode reward: [(0, '0.155')] [2024-06-23 19:47:58,495][15401] Updated weights for policy 0, policy_version 503480 (0.0034) [2024-06-23 19:48:02,235][15401] Updated weights for policy 0, policy_version 503490 (0.0031) [2024-06-23 19:48:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8249229312. Throughput: 0: 42380.0. Samples: 8249384600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:03,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-23 19:48:05,935][15401] Updated weights for policy 0, policy_version 503500 (0.0042) [2024-06-23 19:48:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8249458688. Throughput: 0: 42444.2. Samples: 8249514240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 19:48:09,693][15401] Updated weights for policy 0, policy_version 503510 (0.0033) [2024-06-23 19:48:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8249638912. Throughput: 0: 42725.8. Samples: 8249781580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 19:48:13,660][15401] Updated weights for policy 0, policy_version 503520 (0.0025) [2024-06-23 19:48:17,589][15401] Updated weights for policy 0, policy_version 503530 (0.0042) [2024-06-23 19:48:18,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42709.8). Total num frames: 8249868288. Throughput: 0: 42466.1. Samples: 8250025000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 19:48:21,409][15401] Updated weights for policy 0, policy_version 503540 (0.0037) [2024-06-23 19:48:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8250097664. Throughput: 0: 42657.7. Samples: 8250158680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 19:48:25,096][15401] Updated weights for policy 0, policy_version 503550 (0.0031) [2024-06-23 19:48:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8250294272. Throughput: 0: 42746.7. Samples: 8250417920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 19:48:28,899][15401] Updated weights for policy 0, policy_version 503560 (0.0040) [2024-06-23 19:48:32,838][15401] Updated weights for policy 0, policy_version 503570 (0.0029) [2024-06-23 19:48:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8250507264. Throughput: 0: 42880.4. Samples: 8250671680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 19:48:36,631][15401] Updated weights for policy 0, policy_version 503580 (0.0030) [2024-06-23 19:48:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8250736640. Throughput: 0: 42871.5. Samples: 8250800940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:38,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 19:48:40,429][15401] Updated weights for policy 0, policy_version 503590 (0.0038) [2024-06-23 19:48:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8250933248. Throughput: 0: 42735.4. Samples: 8251054660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:43,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 19:48:44,615][15401] Updated weights for policy 0, policy_version 503600 (0.0034) [2024-06-23 19:48:48,115][15401] Updated weights for policy 0, policy_version 503610 (0.0029) [2024-06-23 19:48:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8251146240. Throughput: 0: 42563.0. Samples: 8251299940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-23 19:48:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 19:48:52,174][15401] Updated weights for policy 0, policy_version 503620 (0.0034) [2024-06-23 19:48:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 8251359232. Throughput: 0: 42675.4. Samples: 8251434640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:48:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 19:48:55,888][15401] Updated weights for policy 0, policy_version 503630 (0.0043) [2024-06-23 19:48:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8251555840. Throughput: 0: 42563.9. Samples: 8251696960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:48:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 19:48:59,598][15401] Updated weights for policy 0, policy_version 503640 (0.0044) [2024-06-23 19:49:00,368][15349] Signal inference workers to stop experience collection... (122250 times) [2024-06-23 19:49:00,369][15349] Signal inference workers to resume experience collection... (122250 times) [2024-06-23 19:49:00,412][15401] InferenceWorker_p0-w0: stopping experience collection (122250 times) [2024-06-23 19:49:00,412][15401] InferenceWorker_p0-w0: resuming experience collection (122250 times) [2024-06-23 19:49:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8251785216. Throughput: 0: 42807.8. Samples: 8251951340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 19:49:03,449][15401] Updated weights for policy 0, policy_version 503650 (0.0039) [2024-06-23 19:49:07,419][15401] Updated weights for policy 0, policy_version 503660 (0.0032) [2024-06-23 19:49:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8252014592. Throughput: 0: 42653.8. Samples: 8252078100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 19:49:10,917][15401] Updated weights for policy 0, policy_version 503670 (0.0046) [2024-06-23 19:49:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8252211200. Throughput: 0: 42689.4. Samples: 8252338940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 19:49:15,221][15401] Updated weights for policy 0, policy_version 503680 (0.0031) [2024-06-23 19:49:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8252424192. Throughput: 0: 42584.5. Samples: 8252587980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:18,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 19:49:18,890][15401] Updated weights for policy 0, policy_version 503690 (0.0027) [2024-06-23 19:49:22,817][15401] Updated weights for policy 0, policy_version 503700 (0.0041) [2024-06-23 19:49:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8252653568. Throughput: 0: 42618.7. Samples: 8252718880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:23,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 19:49:26,531][15401] Updated weights for policy 0, policy_version 503710 (0.0036) [2024-06-23 19:49:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8252833792. Throughput: 0: 42718.2. Samples: 8252976980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 19:49:30,524][15401] Updated weights for policy 0, policy_version 503720 (0.0036) [2024-06-23 19:49:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8253079552. Throughput: 0: 42838.2. Samples: 8253227660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 19:49:34,142][15401] Updated weights for policy 0, policy_version 503730 (0.0034) [2024-06-23 19:49:38,005][15401] Updated weights for policy 0, policy_version 503740 (0.0034) [2024-06-23 19:49:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8253292544. Throughput: 0: 42797.3. Samples: 8253360520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 19:49:41,900][15401] Updated weights for policy 0, policy_version 503750 (0.0034) [2024-06-23 19:49:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8253489152. Throughput: 0: 42739.6. Samples: 8253620240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:43,395][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 19:49:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503754_8253505536.pth... [2024-06-23 19:49:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503128_8243249152.pth [2024-06-23 19:49:45,537][15401] Updated weights for policy 0, policy_version 503760 (0.0040) [2024-06-23 19:49:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8253718528. Throughput: 0: 42669.2. Samples: 8253871460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 19:49:49,583][15401] Updated weights for policy 0, policy_version 503770 (0.0028) [2024-06-23 19:49:53,263][15401] Updated weights for policy 0, policy_version 503780 (0.0028) [2024-06-23 19:49:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8253931520. Throughput: 0: 42763.0. Samples: 8254002440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 19:49:57,202][15401] Updated weights for policy 0, policy_version 503790 (0.0033) [2024-06-23 19:49:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8254128128. Throughput: 0: 42594.2. Samples: 8254255680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:49:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 19:50:00,975][15401] Updated weights for policy 0, policy_version 503800 (0.0027) [2024-06-23 19:50:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8254357504. Throughput: 0: 42793.7. Samples: 8254513700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:50:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 19:50:05,296][15401] Updated weights for policy 0, policy_version 503810 (0.0029) [2024-06-23 19:50:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8254554112. Throughput: 0: 42727.1. Samples: 8254641500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:50:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 19:50:08,741][15401] Updated weights for policy 0, policy_version 503820 (0.0043) [2024-06-23 19:50:12,819][15401] Updated weights for policy 0, policy_version 503830 (0.0036) [2024-06-23 19:50:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8254783488. Throughput: 0: 42712.5. Samples: 8254899040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-23 19:50:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 19:50:16,699][15401] Updated weights for policy 0, policy_version 503840 (0.0036) [2024-06-23 19:50:17,519][15349] Signal inference workers to stop experience collection... (122300 times) [2024-06-23 19:50:17,570][15401] InferenceWorker_p0-w0: stopping experience collection (122300 times) [2024-06-23 19:50:17,586][15349] Signal inference workers to resume experience collection... (122300 times) [2024-06-23 19:50:17,587][15401] InferenceWorker_p0-w0: resuming experience collection (122300 times) [2024-06-23 19:50:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8254996480. Throughput: 0: 42705.4. Samples: 8255149400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:18,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 19:50:20,306][15401] Updated weights for policy 0, policy_version 503850 (0.0021) [2024-06-23 19:50:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 8255193088. Throughput: 0: 42722.6. Samples: 8255283040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 19:50:24,168][15401] Updated weights for policy 0, policy_version 503860 (0.0045) [2024-06-23 19:50:27,906][15401] Updated weights for policy 0, policy_version 503870 (0.0033) [2024-06-23 19:50:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8255422464. Throughput: 0: 42631.1. Samples: 8255538640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 19:50:31,879][15401] Updated weights for policy 0, policy_version 503880 (0.0029) [2024-06-23 19:50:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8255635456. Throughput: 0: 42875.2. Samples: 8255800840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 19:50:35,678][15401] Updated weights for policy 0, policy_version 503890 (0.0032) [2024-06-23 19:50:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 8255832064. Throughput: 0: 42732.6. Samples: 8255925400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 19:50:39,425][15401] Updated weights for policy 0, policy_version 503900 (0.0039) [2024-06-23 19:50:43,138][15401] Updated weights for policy 0, policy_version 503910 (0.0033) [2024-06-23 19:50:43,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 8256061440. Throughput: 0: 42740.6. Samples: 8256179280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:43,397][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 19:50:47,094][15401] Updated weights for policy 0, policy_version 503920 (0.0036) [2024-06-23 19:50:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8256274432. Throughput: 0: 42731.2. Samples: 8256436600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 19:50:51,134][15401] Updated weights for policy 0, policy_version 503930 (0.0037) [2024-06-23 19:50:53,391][15132] Fps is (10 sec: 42619.4, 60 sec: 42597.4, 300 sec: 42653.7). Total num frames: 8256487424. Throughput: 0: 42876.0. Samples: 8256570980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:53,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 19:50:54,980][15401] Updated weights for policy 0, policy_version 503940 (0.0032) [2024-06-23 19:50:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8256700416. Throughput: 0: 42742.1. Samples: 8256822440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:50:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 19:50:58,759][15401] Updated weights for policy 0, policy_version 503950 (0.0036) [2024-06-23 19:51:02,651][15401] Updated weights for policy 0, policy_version 503960 (0.0020) [2024-06-23 19:51:03,390][15132] Fps is (10 sec: 44242.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8256929792. Throughput: 0: 42808.3. Samples: 8257075780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 19:51:06,355][15401] Updated weights for policy 0, policy_version 503970 (0.0024) [2024-06-23 19:51:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8257126400. Throughput: 0: 42859.6. Samples: 8257211820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:08,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 19:51:10,231][15401] Updated weights for policy 0, policy_version 503980 (0.0029) [2024-06-23 19:51:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8257323008. Throughput: 0: 42611.1. Samples: 8257456140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:13,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-23 19:51:14,018][15401] Updated weights for policy 0, policy_version 503990 (0.0034) [2024-06-23 19:51:17,743][15401] Updated weights for policy 0, policy_version 504000 (0.0024) [2024-06-23 19:51:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 8257552384. Throughput: 0: 42560.6. Samples: 8257716060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 19:51:21,583][15401] Updated weights for policy 0, policy_version 504010 (0.0037) [2024-06-23 19:51:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 8257748992. Throughput: 0: 42640.3. Samples: 8257844220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 19:51:25,553][15401] Updated weights for policy 0, policy_version 504020 (0.0027) [2024-06-23 19:51:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8257978368. Throughput: 0: 42619.4. Samples: 8258096880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:51:29,054][15401] Updated weights for policy 0, policy_version 504030 (0.0035) [2024-06-23 19:51:32,927][15401] Updated weights for policy 0, policy_version 504040 (0.0028) [2024-06-23 19:51:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8258191360. Throughput: 0: 42817.7. Samples: 8258363400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:51:36,598][15401] Updated weights for policy 0, policy_version 504050 (0.0033) [2024-06-23 19:51:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8258387968. Throughput: 0: 42667.2. Samples: 8258490940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 19:51:40,834][15401] Updated weights for policy 0, policy_version 504060 (0.0044) [2024-06-23 19:51:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 8258633728. Throughput: 0: 42747.2. Samples: 8258746060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-23 19:51:43,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 19:51:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504068_8258650112.pth... [2024-06-23 19:51:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503441_8248377344.pth [2024-06-23 19:51:44,290][15401] Updated weights for policy 0, policy_version 504070 (0.0035) [2024-06-23 19:51:48,291][15401] Updated weights for policy 0, policy_version 504080 (0.0033) [2024-06-23 19:51:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8258846720. Throughput: 0: 42933.4. Samples: 8259007780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:51:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 19:51:52,219][15401] Updated weights for policy 0, policy_version 504090 (0.0044) [2024-06-23 19:51:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42326.4, 300 sec: 42598.4). Total num frames: 8259026944. Throughput: 0: 42669.4. Samples: 8259131840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:51:53,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 19:51:53,926][15349] Signal inference workers to stop experience collection... (122350 times) [2024-06-23 19:51:53,927][15349] Signal inference workers to resume experience collection... (122350 times) [2024-06-23 19:51:53,962][15401] InferenceWorker_p0-w0: stopping experience collection (122350 times) [2024-06-23 19:51:53,962][15401] InferenceWorker_p0-w0: resuming experience collection (122350 times) [2024-06-23 19:51:55,818][15401] Updated weights for policy 0, policy_version 504100 (0.0035) [2024-06-23 19:51:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8259289088. Throughput: 0: 42953.8. Samples: 8259389060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:51:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 19:52:00,041][15401] Updated weights for policy 0, policy_version 504110 (0.0033) [2024-06-23 19:52:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 8259485696. Throughput: 0: 42907.6. Samples: 8259646900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 19:52:03,433][15401] Updated weights for policy 0, policy_version 504120 (0.0033) [2024-06-23 19:52:07,597][15401] Updated weights for policy 0, policy_version 504130 (0.0050) [2024-06-23 19:52:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8259682304. Throughput: 0: 42925.8. Samples: 8259775880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 19:52:10,943][15401] Updated weights for policy 0, policy_version 504140 (0.0031) [2024-06-23 19:52:13,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8259928064. Throughput: 0: 43064.5. Samples: 8260034780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 19:52:15,180][15401] Updated weights for policy 0, policy_version 504150 (0.0049) [2024-06-23 19:52:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8260141056. Throughput: 0: 42758.1. Samples: 8260287520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 19:52:18,552][15401] Updated weights for policy 0, policy_version 504160 (0.0032) [2024-06-23 19:52:23,167][15401] Updated weights for policy 0, policy_version 504170 (0.0035) [2024-06-23 19:52:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8260321280. Throughput: 0: 42926.2. Samples: 8260422620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:23,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 19:52:26,192][15401] Updated weights for policy 0, policy_version 504180 (0.0031) [2024-06-23 19:52:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8260567040. Throughput: 0: 42829.4. Samples: 8260673380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:28,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 19:52:30,676][15401] Updated weights for policy 0, policy_version 504190 (0.0038) [2024-06-23 19:52:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8260780032. Throughput: 0: 42903.0. Samples: 8260938420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:33,391][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 19:52:33,791][15401] Updated weights for policy 0, policy_version 504200 (0.0037) [2024-06-23 19:52:38,180][15401] Updated weights for policy 0, policy_version 504210 (0.0033) [2024-06-23 19:52:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8260976640. Throughput: 0: 42913.7. Samples: 8261062960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 19:52:41,349][15401] Updated weights for policy 0, policy_version 504220 (0.0021) [2024-06-23 19:52:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8261206016. Throughput: 0: 42720.9. Samples: 8261311500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 19:52:46,062][15401] Updated weights for policy 0, policy_version 504230 (0.0041) [2024-06-23 19:52:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8261402624. Throughput: 0: 42988.8. Samples: 8261581400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 19:52:49,033][15401] Updated weights for policy 0, policy_version 504240 (0.0037) [2024-06-23 19:52:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8261615616. Throughput: 0: 42782.6. Samples: 8261701100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 19:52:53,515][15401] Updated weights for policy 0, policy_version 504250 (0.0040) [2024-06-23 19:52:56,553][15401] Updated weights for policy 0, policy_version 504260 (0.0041) [2024-06-23 19:52:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8261844992. Throughput: 0: 42730.2. Samples: 8261957640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:52:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 19:53:01,069][15401] Updated weights for policy 0, policy_version 504270 (0.0027) [2024-06-23 19:53:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 8262041600. Throughput: 0: 43005.8. Samples: 8262222780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:53:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 19:53:03,448][15349] Signal inference workers to stop experience collection... (122400 times) [2024-06-23 19:53:03,476][15401] InferenceWorker_p0-w0: stopping experience collection (122400 times) [2024-06-23 19:53:03,566][15349] Signal inference workers to resume experience collection... (122400 times) [2024-06-23 19:53:03,566][15401] InferenceWorker_p0-w0: resuming experience collection (122400 times) [2024-06-23 19:53:04,157][15401] Updated weights for policy 0, policy_version 504280 (0.0021) [2024-06-23 19:53:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8262254592. Throughput: 0: 42686.3. Samples: 8262343500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-23 19:53:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 19:53:08,763][15401] Updated weights for policy 0, policy_version 504290 (0.0039) [2024-06-23 19:53:11,850][15401] Updated weights for policy 0, policy_version 504300 (0.0046) [2024-06-23 19:53:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8262483968. Throughput: 0: 42702.5. Samples: 8262595000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 19:53:16,640][15401] Updated weights for policy 0, policy_version 504310 (0.0037) [2024-06-23 19:53:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8262696960. Throughput: 0: 42670.8. Samples: 8262858600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:18,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 19:53:20,021][15401] Updated weights for policy 0, policy_version 504320 (0.0041) [2024-06-23 19:53:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8262893568. Throughput: 0: 42789.0. Samples: 8262988460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 19:53:23,994][15401] Updated weights for policy 0, policy_version 504330 (0.0031) [2024-06-23 19:53:27,583][15401] Updated weights for policy 0, policy_version 504340 (0.0024) [2024-06-23 19:53:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8263122944. Throughput: 0: 43008.0. Samples: 8263246960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:28,392][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 19:53:31,427][15401] Updated weights for policy 0, policy_version 504350 (0.0038) [2024-06-23 19:53:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8263352320. Throughput: 0: 42758.6. Samples: 8263505540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 19:53:34,989][15401] Updated weights for policy 0, policy_version 504360 (0.0034) [2024-06-23 19:53:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8263548928. Throughput: 0: 42990.4. Samples: 8263635660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 19:53:38,832][15401] Updated weights for policy 0, policy_version 504370 (0.0049) [2024-06-23 19:53:42,767][15401] Updated weights for policy 0, policy_version 504380 (0.0035) [2024-06-23 19:53:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8263778304. Throughput: 0: 43076.8. Samples: 8263896100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:43,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 19:53:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504381_8263778304.pth... [2024-06-23 19:53:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000503754_8253505536.pth [2024-06-23 19:53:46,819][15401] Updated weights for policy 0, policy_version 504390 (0.0029) [2024-06-23 19:53:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8264007680. Throughput: 0: 42888.1. Samples: 8264152740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 19:53:50,348][15401] Updated weights for policy 0, policy_version 504400 (0.0025) [2024-06-23 19:53:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8264204288. Throughput: 0: 43153.4. Samples: 8264285400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 19:53:54,329][15401] Updated weights for policy 0, policy_version 504410 (0.0036) [2024-06-23 19:53:57,782][15401] Updated weights for policy 0, policy_version 504420 (0.0039) [2024-06-23 19:53:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8264433664. Throughput: 0: 43292.8. Samples: 8264543180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:53:58,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 19:54:01,826][15401] Updated weights for policy 0, policy_version 504430 (0.0031) [2024-06-23 19:54:02,675][15349] Signal inference workers to stop experience collection... (122450 times) [2024-06-23 19:54:02,719][15401] InferenceWorker_p0-w0: stopping experience collection (122450 times) [2024-06-23 19:54:02,733][15349] Signal inference workers to resume experience collection... (122450 times) [2024-06-23 19:54:02,734][15401] InferenceWorker_p0-w0: resuming experience collection (122450 times) [2024-06-23 19:54:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 8264663040. Throughput: 0: 43192.4. Samples: 8264802260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:03,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 19:54:05,402][15401] Updated weights for policy 0, policy_version 504440 (0.0037) [2024-06-23 19:54:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8264843264. Throughput: 0: 43289.4. Samples: 8264936480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 19:54:09,343][15401] Updated weights for policy 0, policy_version 504450 (0.0026) [2024-06-23 19:54:13,195][15401] Updated weights for policy 0, policy_version 504460 (0.0029) [2024-06-23 19:54:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8265089024. Throughput: 0: 43246.2. Samples: 8265192940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 19:54:16,901][15401] Updated weights for policy 0, policy_version 504470 (0.0033) [2024-06-23 19:54:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 8265302016. Throughput: 0: 43052.4. Samples: 8265442900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 19:54:20,938][15401] Updated weights for policy 0, policy_version 504480 (0.0026) [2024-06-23 19:54:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8265465856. Throughput: 0: 43066.9. Samples: 8265573680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 19:54:24,745][15401] Updated weights for policy 0, policy_version 504490 (0.0048) [2024-06-23 19:54:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 8265711616. Throughput: 0: 42893.4. Samples: 8265826400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:28,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 19:54:28,620][15401] Updated weights for policy 0, policy_version 504500 (0.0036) [2024-06-23 19:54:32,408][15401] Updated weights for policy 0, policy_version 504510 (0.0034) [2024-06-23 19:54:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8265924608. Throughput: 0: 42903.0. Samples: 8266083380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 19:54:36,296][15401] Updated weights for policy 0, policy_version 504520 (0.0028) [2024-06-23 19:54:38,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8266104832. Throughput: 0: 42928.4. Samples: 8266217180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-23 19:54:38,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 19:54:40,088][15401] Updated weights for policy 0, policy_version 504530 (0.0038) [2024-06-23 19:54:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8266334208. Throughput: 0: 42675.3. Samples: 8266463560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:54:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 19:54:44,178][15401] Updated weights for policy 0, policy_version 504540 (0.0043) [2024-06-23 19:54:47,656][15401] Updated weights for policy 0, policy_version 504550 (0.0042) [2024-06-23 19:54:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8266563584. Throughput: 0: 42634.8. Samples: 8266720820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:54:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 19:54:51,788][15401] Updated weights for policy 0, policy_version 504560 (0.0034) [2024-06-23 19:54:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8266760192. Throughput: 0: 42659.6. Samples: 8266856160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:54:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 19:54:55,092][15401] Updated weights for policy 0, policy_version 504570 (0.0031) [2024-06-23 19:54:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8266989568. Throughput: 0: 42736.9. Samples: 8267116100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:54:58,394][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 19:54:59,346][15401] Updated weights for policy 0, policy_version 504580 (0.0022) [2024-06-23 19:55:02,698][15401] Updated weights for policy 0, policy_version 504590 (0.0036) [2024-06-23 19:55:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8267218944. Throughput: 0: 42840.8. Samples: 8267370740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 19:55:07,226][15401] Updated weights for policy 0, policy_version 504600 (0.0031) [2024-06-23 19:55:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8267415552. Throughput: 0: 43011.2. Samples: 8267509280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:08,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 19:55:10,363][15401] Updated weights for policy 0, policy_version 504610 (0.0024) [2024-06-23 19:55:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8267628544. Throughput: 0: 42997.5. Samples: 8267761180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 19:55:14,764][15401] Updated weights for policy 0, policy_version 504620 (0.0042) [2024-06-23 19:55:18,063][15401] Updated weights for policy 0, policy_version 504630 (0.0036) [2024-06-23 19:55:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8267874304. Throughput: 0: 42953.9. Samples: 8268016300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 19:55:22,491][15401] Updated weights for policy 0, policy_version 504640 (0.0040) [2024-06-23 19:55:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 8268070912. Throughput: 0: 43004.5. Samples: 8268152380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 19:55:25,385][15401] Updated weights for policy 0, policy_version 504650 (0.0036) [2024-06-23 19:55:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 8268283904. Throughput: 0: 43369.1. Samples: 8268415180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 19:55:30,021][15401] Updated weights for policy 0, policy_version 504660 (0.0040) [2024-06-23 19:55:32,887][15401] Updated weights for policy 0, policy_version 504670 (0.0037) [2024-06-23 19:55:33,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 8268513280. Throughput: 0: 43187.7. Samples: 8268664280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-23 19:55:37,547][15401] Updated weights for policy 0, policy_version 504680 (0.0044) [2024-06-23 19:55:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43690.6, 300 sec: 42932.6). Total num frames: 8268726272. Throughput: 0: 43135.0. Samples: 8268797240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 19:55:40,755][15401] Updated weights for policy 0, policy_version 504690 (0.0043) [2024-06-23 19:55:42,965][15349] Signal inference workers to stop experience collection... (122500 times) [2024-06-23 19:55:42,965][15349] Signal inference workers to resume experience collection... (122500 times) [2024-06-23 19:55:42,984][15401] InferenceWorker_p0-w0: stopping experience collection (122500 times) [2024-06-23 19:55:42,985][15401] InferenceWorker_p0-w0: resuming experience collection (122500 times) [2024-06-23 19:55:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8268939264. Throughput: 0: 43157.8. Samples: 8269058200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:43,395][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 19:55:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504696_8268939264.pth... [2024-06-23 19:55:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504068_8258650112.pth [2024-06-23 19:55:45,136][15401] Updated weights for policy 0, policy_version 504700 (0.0025) [2024-06-23 19:55:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42931.8). Total num frames: 8269152256. Throughput: 0: 43046.3. Samples: 8269307820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 19:55:48,510][15401] Updated weights for policy 0, policy_version 504710 (0.0033) [2024-06-23 19:55:52,718][15401] Updated weights for policy 0, policy_version 504720 (0.0027) [2024-06-23 19:55:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8269348864. Throughput: 0: 42951.2. Samples: 8269441980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 19:55:55,807][15401] Updated weights for policy 0, policy_version 504730 (0.0025) [2024-06-23 19:55:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 8269594624. Throughput: 0: 43072.5. Samples: 8269699440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:55:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 19:56:00,266][15401] Updated weights for policy 0, policy_version 504740 (0.0029) [2024-06-23 19:56:03,310][15401] Updated weights for policy 0, policy_version 504750 (0.0027) [2024-06-23 19:56:03,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 43043.0). Total num frames: 8269824000. Throughput: 0: 43100.4. Samples: 8269955820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-23 19:56:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 19:56:07,835][15401] Updated weights for policy 0, policy_version 504760 (0.0036) [2024-06-23 19:56:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 8269987840. Throughput: 0: 42994.2. Samples: 8270087120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 19:56:11,037][15401] Updated weights for policy 0, policy_version 504770 (0.0035) [2024-06-23 19:56:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 8270233600. Throughput: 0: 42850.3. Samples: 8270343440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:13,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 19:56:15,736][15401] Updated weights for policy 0, policy_version 504780 (0.0025) [2024-06-23 19:56:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 8270446592. Throughput: 0: 43055.4. Samples: 8270601760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 19:56:18,845][15401] Updated weights for policy 0, policy_version 504790 (0.0050) [2024-06-23 19:56:23,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 8270626816. Throughput: 0: 42835.7. Samples: 8270725120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:23,396][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 19:56:23,749][15401] Updated weights for policy 0, policy_version 504800 (0.0040) [2024-06-23 19:56:26,751][15401] Updated weights for policy 0, policy_version 504810 (0.0030) [2024-06-23 19:56:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8270872576. Throughput: 0: 42716.0. Samples: 8270980420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 19:56:31,251][15401] Updated weights for policy 0, policy_version 504820 (0.0033) [2024-06-23 19:56:33,391][15132] Fps is (10 sec: 44258.2, 60 sec: 42597.4, 300 sec: 42986.9). Total num frames: 8271069184. Throughput: 0: 42889.2. Samples: 8271237900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:33,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 19:56:34,358][15401] Updated weights for policy 0, policy_version 504830 (0.0030) [2024-06-23 19:56:38,394][15132] Fps is (10 sec: 39303.4, 60 sec: 42322.0, 300 sec: 42819.9). Total num frames: 8271265792. Throughput: 0: 42648.9. Samples: 8271361380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:38,395][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 19:56:38,844][15401] Updated weights for policy 0, policy_version 504840 (0.0032) [2024-06-23 19:56:41,885][15401] Updated weights for policy 0, policy_version 504850 (0.0035) [2024-06-23 19:56:43,389][15132] Fps is (10 sec: 44244.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8271511552. Throughput: 0: 42584.9. Samples: 8271615760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 19:56:46,349][15401] Updated weights for policy 0, policy_version 504860 (0.0035) [2024-06-23 19:56:48,390][15132] Fps is (10 sec: 44257.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8271708160. Throughput: 0: 42689.8. Samples: 8271876860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 19:56:49,544][15401] Updated weights for policy 0, policy_version 504870 (0.0034) [2024-06-23 19:56:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8271921152. Throughput: 0: 42608.5. Samples: 8272004500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 19:56:54,068][15401] Updated weights for policy 0, policy_version 504880 (0.0029) [2024-06-23 19:56:55,788][15349] Signal inference workers to stop experience collection... (122550 times) [2024-06-23 19:56:55,788][15349] Signal inference workers to resume experience collection... (122550 times) [2024-06-23 19:56:55,825][15401] InferenceWorker_p0-w0: stopping experience collection (122550 times) [2024-06-23 19:56:55,826][15401] InferenceWorker_p0-w0: resuming experience collection (122550 times) [2024-06-23 19:56:57,464][15401] Updated weights for policy 0, policy_version 504890 (0.0033) [2024-06-23 19:56:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8272150528. Throughput: 0: 42557.0. Samples: 8272258500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:56:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 19:57:01,723][15401] Updated weights for policy 0, policy_version 504900 (0.0038) [2024-06-23 19:57:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42931.7). Total num frames: 8272347136. Throughput: 0: 42558.6. Samples: 8272516900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 19:57:05,205][15401] Updated weights for policy 0, policy_version 504910 (0.0036) [2024-06-23 19:57:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8272560128. Throughput: 0: 42586.6. Samples: 8272641240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 19:57:09,439][15401] Updated weights for policy 0, policy_version 504920 (0.0036) [2024-06-23 19:57:12,839][15401] Updated weights for policy 0, policy_version 504930 (0.0030) [2024-06-23 19:57:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8272789504. Throughput: 0: 42604.6. Samples: 8272897620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:13,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 19:57:17,072][15401] Updated weights for policy 0, policy_version 504940 (0.0041) [2024-06-23 19:57:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 8272986112. Throughput: 0: 42558.8. Samples: 8273152980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 19:57:20,543][15401] Updated weights for policy 0, policy_version 504950 (0.0034) [2024-06-23 19:57:23,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43149.0, 300 sec: 42876.1). Total num frames: 8273215488. Throughput: 0: 42673.2. Samples: 8273281480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 19:57:24,668][15401] Updated weights for policy 0, policy_version 504960 (0.0040) [2024-06-23 19:57:28,271][15401] Updated weights for policy 0, policy_version 504970 (0.0045) [2024-06-23 19:57:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8273428480. Throughput: 0: 42813.8. Samples: 8273542380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 19:57:32,307][15401] Updated weights for policy 0, policy_version 504980 (0.0030) [2024-06-23 19:57:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42599.6, 300 sec: 42876.1). Total num frames: 8273625088. Throughput: 0: 42587.7. Samples: 8273793300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-23 19:57:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 19:57:35,939][15401] Updated weights for policy 0, policy_version 504990 (0.0021) [2024-06-23 19:57:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42874.8, 300 sec: 42820.6). Total num frames: 8273838080. Throughput: 0: 42569.3. Samples: 8273920120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:57:38,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 19:57:39,932][15401] Updated weights for policy 0, policy_version 505000 (0.0030) [2024-06-23 19:57:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8274067456. Throughput: 0: 42698.6. Samples: 8274179940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:57:43,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 19:57:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505010_8274083840.pth... [2024-06-23 19:57:43,442][15401] Updated weights for policy 0, policy_version 505010 (0.0038) [2024-06-23 19:57:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504381_8263778304.pth [2024-06-23 19:57:47,738][15401] Updated weights for policy 0, policy_version 505020 (0.0042) [2024-06-23 19:57:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8274264064. Throughput: 0: 42685.7. Samples: 8274437760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:57:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 19:57:51,581][15401] Updated weights for policy 0, policy_version 505030 (0.0032) [2024-06-23 19:57:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8274493440. Throughput: 0: 42744.0. Samples: 8274564720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:57:53,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-23 19:57:55,375][15401] Updated weights for policy 0, policy_version 505040 (0.0036) [2024-06-23 19:57:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8274706432. Throughput: 0: 42686.5. Samples: 8274818520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:57:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 19:57:59,142][15401] Updated weights for policy 0, policy_version 505050 (0.0032) [2024-06-23 19:58:02,996][15401] Updated weights for policy 0, policy_version 505060 (0.0025) [2024-06-23 19:58:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 8274903040. Throughput: 0: 42683.9. Samples: 8275073860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:03,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 19:58:06,732][15401] Updated weights for policy 0, policy_version 505070 (0.0041) [2024-06-23 19:58:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8275148800. Throughput: 0: 42742.3. Samples: 8275204880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 19:58:10,718][15401] Updated weights for policy 0, policy_version 505080 (0.0032) [2024-06-23 19:58:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 8275329024. Throughput: 0: 42532.3. Samples: 8275456340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 19:58:14,272][15401] Updated weights for policy 0, policy_version 505090 (0.0037) [2024-06-23 19:58:18,307][15401] Updated weights for policy 0, policy_version 505100 (0.0041) [2024-06-23 19:58:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8275558400. Throughput: 0: 42683.5. Samples: 8275714060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 19:58:22,131][15401] Updated weights for policy 0, policy_version 505110 (0.0042) [2024-06-23 19:58:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 8275771392. Throughput: 0: 42709.7. Samples: 8275842060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 19:58:25,982][15401] Updated weights for policy 0, policy_version 505120 (0.0028) [2024-06-23 19:58:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8275984384. Throughput: 0: 42586.2. Samples: 8276096320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:58:30,209][15401] Updated weights for policy 0, policy_version 505130 (0.0043) [2024-06-23 19:58:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8276197376. Throughput: 0: 42605.2. Samples: 8276355000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:33,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 19:58:33,495][15401] Updated weights for policy 0, policy_version 505140 (0.0035) [2024-06-23 19:58:37,798][15401] Updated weights for policy 0, policy_version 505150 (0.0040) [2024-06-23 19:58:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8276393984. Throughput: 0: 42604.8. Samples: 8276481940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 19:58:40,164][15349] Signal inference workers to stop experience collection... (122600 times) [2024-06-23 19:58:40,164][15349] Signal inference workers to resume experience collection... (122600 times) [2024-06-23 19:58:40,220][15401] InferenceWorker_p0-w0: stopping experience collection (122600 times) [2024-06-23 19:58:40,220][15401] InferenceWorker_p0-w0: resuming experience collection (122600 times) [2024-06-23 19:58:41,338][15401] Updated weights for policy 0, policy_version 505160 (0.0031) [2024-06-23 19:58:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8276623360. Throughput: 0: 42608.6. Samples: 8276735900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 19:58:45,701][15401] Updated weights for policy 0, policy_version 505170 (0.0030) [2024-06-23 19:58:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8276836352. Throughput: 0: 42598.2. Samples: 8276990780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:48,392][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 19:58:48,923][15401] Updated weights for policy 0, policy_version 505180 (0.0032) [2024-06-23 19:58:53,321][15401] Updated weights for policy 0, policy_version 505190 (0.0027) [2024-06-23 19:58:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8277032960. Throughput: 0: 42478.8. Samples: 8277116420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 19:58:56,543][15401] Updated weights for policy 0, policy_version 505200 (0.0031) [2024-06-23 19:58:58,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8277278720. Throughput: 0: 42634.2. Samples: 8277374880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-23 19:58:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 19:59:00,833][15401] Updated weights for policy 0, policy_version 505210 (0.0037) [2024-06-23 19:59:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8277491712. Throughput: 0: 42692.8. Samples: 8277635240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:03,400][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 19:59:04,440][15401] Updated weights for policy 0, policy_version 505220 (0.0043) [2024-06-23 19:59:08,240][15401] Updated weights for policy 0, policy_version 505230 (0.0039) [2024-06-23 19:59:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 8277688320. Throughput: 0: 42714.2. Samples: 8277764300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:08,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 19:59:11,921][15401] Updated weights for policy 0, policy_version 505240 (0.0036) [2024-06-23 19:59:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8277901312. Throughput: 0: 42772.6. Samples: 8278021080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 19:59:15,710][15401] Updated weights for policy 0, policy_version 505250 (0.0035) [2024-06-23 19:59:18,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8278114304. Throughput: 0: 42753.4. Samples: 8278278900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 19:59:19,582][15401] Updated weights for policy 0, policy_version 505260 (0.0033) [2024-06-23 19:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 8278327296. Throughput: 0: 42789.8. Samples: 8278407480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 19:59:23,574][15401] Updated weights for policy 0, policy_version 505270 (0.0035) [2024-06-23 19:59:27,093][15401] Updated weights for policy 0, policy_version 505280 (0.0037) [2024-06-23 19:59:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8278556672. Throughput: 0: 42907.0. Samples: 8278666720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 19:59:31,578][15401] Updated weights for policy 0, policy_version 505290 (0.0029) [2024-06-23 19:59:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8278753280. Throughput: 0: 43049.8. Samples: 8278927920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 19:59:34,643][15401] Updated weights for policy 0, policy_version 505300 (0.0034) [2024-06-23 19:59:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8278949888. Throughput: 0: 42907.6. Samples: 8279047260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:38,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 19:59:39,099][15401] Updated weights for policy 0, policy_version 505310 (0.0033) [2024-06-23 19:59:42,215][15401] Updated weights for policy 0, policy_version 505320 (0.0034) [2024-06-23 19:59:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8279212032. Throughput: 0: 42945.7. Samples: 8279307440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 19:59:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505323_8279212032.pth... [2024-06-23 19:59:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000504696_8268939264.pth [2024-06-23 19:59:46,773][15401] Updated weights for policy 0, policy_version 505330 (0.0041) [2024-06-23 19:59:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 8279375872. Throughput: 0: 43034.2. Samples: 8279571780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:48,391][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 19:59:49,795][15401] Updated weights for policy 0, policy_version 505340 (0.0036) [2024-06-23 19:59:53,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8279605248. Throughput: 0: 42868.5. Samples: 8279693280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 19:59:54,209][15401] Updated weights for policy 0, policy_version 505350 (0.0037) [2024-06-23 19:59:57,579][15401] Updated weights for policy 0, policy_version 505360 (0.0042) [2024-06-23 19:59:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8279834624. Throughput: 0: 42862.0. Samples: 8279949880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 19:59:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 20:00:01,835][15401] Updated weights for policy 0, policy_version 505370 (0.0044) [2024-06-23 20:00:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 8280031232. Throughput: 0: 42798.7. Samples: 8280204840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:03,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 20:00:05,671][15401] Updated weights for policy 0, policy_version 505380 (0.0035) [2024-06-23 20:00:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8280244224. Throughput: 0: 42677.8. Samples: 8280327980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:08,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 20:00:09,203][15349] Signal inference workers to stop experience collection... (122650 times) [2024-06-23 20:00:09,203][15349] Signal inference workers to resume experience collection... (122650 times) [2024-06-23 20:00:09,244][15401] InferenceWorker_p0-w0: stopping experience collection (122650 times) [2024-06-23 20:00:09,244][15401] InferenceWorker_p0-w0: resuming experience collection (122650 times) [2024-06-23 20:00:09,717][15401] Updated weights for policy 0, policy_version 505390 (0.0033) [2024-06-23 20:00:13,146][15401] Updated weights for policy 0, policy_version 505400 (0.0030) [2024-06-23 20:00:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8280473600. Throughput: 0: 42619.6. Samples: 8280584600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 20:00:17,262][15401] Updated weights for policy 0, policy_version 505410 (0.0043) [2024-06-23 20:00:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8280686592. Throughput: 0: 42531.1. Samples: 8280841820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 20:00:21,029][15401] Updated weights for policy 0, policy_version 505420 (0.0033) [2024-06-23 20:00:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8280883200. Throughput: 0: 42614.0. Samples: 8280964900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 20:00:24,871][15401] Updated weights for policy 0, policy_version 505430 (0.0029) [2024-06-23 20:00:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8281096192. Throughput: 0: 42680.6. Samples: 8281228060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-23 20:00:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 20:00:28,785][15401] Updated weights for policy 0, policy_version 505440 (0.0044) [2024-06-23 20:00:32,479][15401] Updated weights for policy 0, policy_version 505450 (0.0034) [2024-06-23 20:00:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8281309184. Throughput: 0: 42490.8. Samples: 8281483860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 20:00:36,399][15401] Updated weights for policy 0, policy_version 505460 (0.0034) [2024-06-23 20:00:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 8281538560. Throughput: 0: 42590.2. Samples: 8281609940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:38,392][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 20:00:40,139][15401] Updated weights for policy 0, policy_version 505470 (0.0022) [2024-06-23 20:00:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 8281735168. Throughput: 0: 42682.9. Samples: 8281870600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:43,390][15132] Avg episode reward: [(0, '0.898')] [2024-06-23 20:00:44,030][15401] Updated weights for policy 0, policy_version 505480 (0.0039) [2024-06-23 20:00:48,042][15401] Updated weights for policy 0, policy_version 505490 (0.0034) [2024-06-23 20:00:48,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 8281948160. Throughput: 0: 42627.6. Samples: 8282123180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:48,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 20:00:51,724][15401] Updated weights for policy 0, policy_version 505500 (0.0034) [2024-06-23 20:00:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8282177536. Throughput: 0: 42796.0. Samples: 8282253800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 20:00:55,542][15401] Updated weights for policy 0, policy_version 505510 (0.0028) [2024-06-23 20:00:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8282374144. Throughput: 0: 42836.0. Samples: 8282512220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:00:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 20:00:59,366][15401] Updated weights for policy 0, policy_version 505520 (0.0036) [2024-06-23 20:01:03,019][15401] Updated weights for policy 0, policy_version 505530 (0.0033) [2024-06-23 20:01:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8282603520. Throughput: 0: 42687.9. Samples: 8282762780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:03,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-23 20:01:07,091][15401] Updated weights for policy 0, policy_version 505540 (0.0027) [2024-06-23 20:01:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8282816512. Throughput: 0: 42940.9. Samples: 8282897340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:08,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 20:01:10,933][15401] Updated weights for policy 0, policy_version 505550 (0.0032) [2024-06-23 20:01:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8283013120. Throughput: 0: 42752.9. Samples: 8283151940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 20:01:14,575][15401] Updated weights for policy 0, policy_version 505560 (0.0030) [2024-06-23 20:01:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 8283242496. Throughput: 0: 42905.8. Samples: 8283414620. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 20:01:18,484][15401] Updated weights for policy 0, policy_version 505570 (0.0035) [2024-06-23 20:01:22,230][15401] Updated weights for policy 0, policy_version 505580 (0.0044) [2024-06-23 20:01:23,394][15132] Fps is (10 sec: 45855.0, 60 sec: 43141.4, 300 sec: 42708.8). Total num frames: 8283471872. Throughput: 0: 42995.0. Samples: 8283544800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:23,394][15132] Avg episode reward: [(0, '0.195')] [2024-06-23 20:01:26,034][15401] Updated weights for policy 0, policy_version 505590 (0.0037) [2024-06-23 20:01:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 8283684864. Throughput: 0: 42865.2. Samples: 8283799540. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:28,390][15132] Avg episode reward: [(0, '0.187')] [2024-06-23 20:01:29,830][15401] Updated weights for policy 0, policy_version 505600 (0.0034) [2024-06-23 20:01:33,389][15132] Fps is (10 sec: 42617.4, 60 sec: 43144.6, 300 sec: 42821.2). Total num frames: 8283897856. Throughput: 0: 42948.5. Samples: 8284055760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 20:01:33,710][15401] Updated weights for policy 0, policy_version 505610 (0.0029) [2024-06-23 20:01:36,543][15349] Signal inference workers to stop experience collection... (122700 times) [2024-06-23 20:01:36,544][15349] Signal inference workers to resume experience collection... (122700 times) [2024-06-23 20:01:36,561][15401] InferenceWorker_p0-w0: stopping experience collection (122700 times) [2024-06-23 20:01:36,592][15401] InferenceWorker_p0-w0: resuming experience collection (122700 times) [2024-06-23 20:01:38,024][15401] Updated weights for policy 0, policy_version 505620 (0.0034) [2024-06-23 20:01:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 8284094464. Throughput: 0: 42975.9. Samples: 8284187820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:38,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 20:01:41,248][15401] Updated weights for policy 0, policy_version 505630 (0.0023) [2024-06-23 20:01:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8284323840. Throughput: 0: 42898.2. Samples: 8284442640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 20:01:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505635_8284323840.pth... [2024-06-23 20:01:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505010_8274083840.pth [2024-06-23 20:01:45,504][15401] Updated weights for policy 0, policy_version 505640 (0.0022) [2024-06-23 20:01:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8284520448. Throughput: 0: 43196.2. Samples: 8284706600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 20:01:48,858][15401] Updated weights for policy 0, policy_version 505650 (0.0041) [2024-06-23 20:01:52,887][15401] Updated weights for policy 0, policy_version 505660 (0.0033) [2024-06-23 20:01:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8284749824. Throughput: 0: 42992.5. Samples: 8284831900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-23 20:01:53,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 20:01:56,753][15401] Updated weights for policy 0, policy_version 505670 (0.0037) [2024-06-23 20:01:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 8284979200. Throughput: 0: 42948.8. Samples: 8285084640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:01:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 20:02:00,618][15401] Updated weights for policy 0, policy_version 505680 (0.0040) [2024-06-23 20:02:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8285175808. Throughput: 0: 42803.9. Samples: 8285340800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 20:02:04,458][15401] Updated weights for policy 0, policy_version 505690 (0.0033) [2024-06-23 20:02:08,160][15401] Updated weights for policy 0, policy_version 505700 (0.0029) [2024-06-23 20:02:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8285388800. Throughput: 0: 42686.0. Samples: 8285465480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:02:12,084][15401] Updated weights for policy 0, policy_version 505710 (0.0028) [2024-06-23 20:02:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 8285634560. Throughput: 0: 42862.8. Samples: 8285728360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 20:02:16,067][15401] Updated weights for policy 0, policy_version 505720 (0.0025) [2024-06-23 20:02:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8285814784. Throughput: 0: 42896.3. Samples: 8285986100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 20:02:19,733][15401] Updated weights for policy 0, policy_version 505730 (0.0032) [2024-06-23 20:02:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42328.5, 300 sec: 42653.9). Total num frames: 8286011392. Throughput: 0: 42641.4. Samples: 8286106580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:23,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 20:02:23,801][15401] Updated weights for policy 0, policy_version 505740 (0.0042) [2024-06-23 20:02:27,502][15401] Updated weights for policy 0, policy_version 505750 (0.0035) [2024-06-23 20:02:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8286257152. Throughput: 0: 42971.1. Samples: 8286376340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 20:02:31,330][15401] Updated weights for policy 0, policy_version 505760 (0.0052) [2024-06-23 20:02:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8286453760. Throughput: 0: 42589.2. Samples: 8286623120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 20:02:35,177][15401] Updated weights for policy 0, policy_version 505770 (0.0036) [2024-06-23 20:02:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 8286666752. Throughput: 0: 42702.2. Samples: 8286753500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 20:02:38,850][15401] Updated weights for policy 0, policy_version 505780 (0.0033) [2024-06-23 20:02:42,685][15401] Updated weights for policy 0, policy_version 505790 (0.0038) [2024-06-23 20:02:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8286896128. Throughput: 0: 42667.6. Samples: 8287004680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 20:02:46,867][15401] Updated weights for policy 0, policy_version 505800 (0.0042) [2024-06-23 20:02:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8287092736. Throughput: 0: 42714.7. Samples: 8287262960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 20:02:50,227][15401] Updated weights for policy 0, policy_version 505810 (0.0032) [2024-06-23 20:02:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8287322112. Throughput: 0: 42722.2. Samples: 8287387980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 20:02:54,369][15401] Updated weights for policy 0, policy_version 505820 (0.0038) [2024-06-23 20:02:58,079][15401] Updated weights for policy 0, policy_version 505830 (0.0027) [2024-06-23 20:02:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 8287535104. Throughput: 0: 42665.3. Samples: 8287648300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:02:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 20:02:59,444][15349] Signal inference workers to stop experience collection... (122750 times) [2024-06-23 20:02:59,479][15401] InferenceWorker_p0-w0: stopping experience collection (122750 times) [2024-06-23 20:02:59,509][15349] Signal inference workers to resume experience collection... (122750 times) [2024-06-23 20:02:59,544][15401] InferenceWorker_p0-w0: resuming experience collection (122750 times) [2024-06-23 20:03:02,002][15401] Updated weights for policy 0, policy_version 505840 (0.0033) [2024-06-23 20:03:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8287731712. Throughput: 0: 42481.4. Samples: 8287897760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:03:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 20:03:05,722][15401] Updated weights for policy 0, policy_version 505850 (0.0039) [2024-06-23 20:03:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8287961088. Throughput: 0: 42659.4. Samples: 8288026260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:03:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 20:03:09,745][15401] Updated weights for policy 0, policy_version 505860 (0.0030) [2024-06-23 20:03:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8288157696. Throughput: 0: 42465.5. Samples: 8288287280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:03:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 20:03:13,398][15401] Updated weights for policy 0, policy_version 505870 (0.0033) [2024-06-23 20:03:17,249][15401] Updated weights for policy 0, policy_version 505880 (0.0039) [2024-06-23 20:03:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8288370688. Throughput: 0: 42684.4. Samples: 8288543920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:03:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 20:03:21,133][15401] Updated weights for policy 0, policy_version 505890 (0.0033) [2024-06-23 20:03:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 8288600064. Throughput: 0: 42610.7. Samples: 8288671080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:03:23,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 20:03:24,835][15401] Updated weights for policy 0, policy_version 505900 (0.0036) [2024-06-23 20:03:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 8288780288. Throughput: 0: 42765.9. Samples: 8288929140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 20:03:28,845][15401] Updated weights for policy 0, policy_version 505910 (0.0039) [2024-06-23 20:03:32,554][15401] Updated weights for policy 0, policy_version 505920 (0.0026) [2024-06-23 20:03:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8289009664. Throughput: 0: 42664.4. Samples: 8289182960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:33,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 20:03:36,321][15401] Updated weights for policy 0, policy_version 505930 (0.0046) [2024-06-23 20:03:38,389][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8289239040. Throughput: 0: 42755.1. Samples: 8289311960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 20:03:40,423][15401] Updated weights for policy 0, policy_version 505940 (0.0037) [2024-06-23 20:03:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 8289435648. Throughput: 0: 42738.6. Samples: 8289571540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 20:03:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505947_8289435648.pth... [2024-06-23 20:03:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505323_8279212032.pth [2024-06-23 20:03:43,979][15401] Updated weights for policy 0, policy_version 505950 (0.0030) [2024-06-23 20:03:47,967][15401] Updated weights for policy 0, policy_version 505960 (0.0033) [2024-06-23 20:03:48,393][15132] Fps is (10 sec: 42581.8, 60 sec: 42868.7, 300 sec: 42820.0). Total num frames: 8289665024. Throughput: 0: 42656.3. Samples: 8289817460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:48,394][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 20:03:51,947][15401] Updated weights for policy 0, policy_version 505970 (0.0037) [2024-06-23 20:03:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8289861632. Throughput: 0: 42747.3. Samples: 8289949880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 20:03:55,469][15401] Updated weights for policy 0, policy_version 505980 (0.0037) [2024-06-23 20:03:58,390][15132] Fps is (10 sec: 39336.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8290058240. Throughput: 0: 42547.4. Samples: 8290201920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:03:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 20:03:59,629][15401] Updated weights for policy 0, policy_version 505990 (0.0034) [2024-06-23 20:04:03,285][15401] Updated weights for policy 0, policy_version 506000 (0.0036) [2024-06-23 20:04:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 8290304000. Throughput: 0: 42366.8. Samples: 8290450420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:03,390][15132] Avg episode reward: [(0, '0.088')] [2024-06-23 20:04:07,408][15401] Updated weights for policy 0, policy_version 506010 (0.0036) [2024-06-23 20:04:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 8290484224. Throughput: 0: 42468.2. Samples: 8290582040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:08,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 20:04:11,089][15401] Updated weights for policy 0, policy_version 506020 (0.0036) [2024-06-23 20:04:13,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8290680832. Throughput: 0: 42288.4. Samples: 8290832120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 20:04:14,993][15401] Updated weights for policy 0, policy_version 506030 (0.0030) [2024-06-23 20:04:16,700][15349] Signal inference workers to stop experience collection... (122800 times) [2024-06-23 20:04:16,702][15349] Signal inference workers to resume experience collection... (122800 times) [2024-06-23 20:04:16,721][15401] InferenceWorker_p0-w0: stopping experience collection (122800 times) [2024-06-23 20:04:16,750][15401] InferenceWorker_p0-w0: resuming experience collection (122800 times) [2024-06-23 20:04:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8290926592. Throughput: 0: 42414.3. Samples: 8291091500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:18,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 20:04:18,735][15401] Updated weights for policy 0, policy_version 506040 (0.0038) [2024-06-23 20:04:22,860][15401] Updated weights for policy 0, policy_version 506050 (0.0029) [2024-06-23 20:04:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 8291139584. Throughput: 0: 42570.6. Samples: 8291227640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 20:04:26,254][15401] Updated weights for policy 0, policy_version 506060 (0.0032) [2024-06-23 20:04:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8291336192. Throughput: 0: 42230.7. Samples: 8291471920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:28,393][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 20:04:30,551][15401] Updated weights for policy 0, policy_version 506070 (0.0029) [2024-06-23 20:04:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 8291565568. Throughput: 0: 42606.4. Samples: 8291734580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 20:04:33,872][15401] Updated weights for policy 0, policy_version 506080 (0.0033) [2024-06-23 20:04:38,115][15401] Updated weights for policy 0, policy_version 506090 (0.0040) [2024-06-23 20:04:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8291778560. Throughput: 0: 42532.9. Samples: 8291863860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 20:04:41,535][15401] Updated weights for policy 0, policy_version 506100 (0.0034) [2024-06-23 20:04:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8291991552. Throughput: 0: 42413.8. Samples: 8292110540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:43,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 20:04:45,743][15401] Updated weights for policy 0, policy_version 506110 (0.0033) [2024-06-23 20:04:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42601.1, 300 sec: 42765.0). Total num frames: 8292220928. Throughput: 0: 42558.5. Samples: 8292365560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:04:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 20:04:49,596][15401] Updated weights for policy 0, policy_version 506120 (0.0035) [2024-06-23 20:04:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8292401152. Throughput: 0: 42447.1. Samples: 8292492160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:04:53,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 20:04:53,549][15401] Updated weights for policy 0, policy_version 506130 (0.0031) [2024-06-23 20:04:57,496][15401] Updated weights for policy 0, policy_version 506140 (0.0023) [2024-06-23 20:04:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8292630528. Throughput: 0: 42551.1. Samples: 8292746920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:04:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 20:05:01,478][15401] Updated weights for policy 0, policy_version 506150 (0.0031) [2024-06-23 20:05:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8292859904. Throughput: 0: 42391.4. Samples: 8292999120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:03,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 20:05:05,229][15401] Updated weights for policy 0, policy_version 506160 (0.0032) [2024-06-23 20:05:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8293040128. Throughput: 0: 42360.0. Samples: 8293133840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 20:05:08,972][15401] Updated weights for policy 0, policy_version 506170 (0.0041) [2024-06-23 20:05:12,945][15401] Updated weights for policy 0, policy_version 506180 (0.0040) [2024-06-23 20:05:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8293253120. Throughput: 0: 42545.8. Samples: 8293386480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 20:05:16,580][15401] Updated weights for policy 0, policy_version 506190 (0.0030) [2024-06-23 20:05:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8293498880. Throughput: 0: 42343.1. Samples: 8293640020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 20:05:20,503][15401] Updated weights for policy 0, policy_version 506200 (0.0039) [2024-06-23 20:05:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8293695488. Throughput: 0: 42530.7. Samples: 8293777740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 20:05:24,169][15401] Updated weights for policy 0, policy_version 506210 (0.0037) [2024-06-23 20:05:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 8293892096. Throughput: 0: 42527.6. Samples: 8294024380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:28,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 20:05:28,457][15401] Updated weights for policy 0, policy_version 506220 (0.0030) [2024-06-23 20:05:31,875][15401] Updated weights for policy 0, policy_version 506230 (0.0030) [2024-06-23 20:05:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 8294154240. Throughput: 0: 42562.9. Samples: 8294280880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 20:05:35,910][15401] Updated weights for policy 0, policy_version 506240 (0.0034) [2024-06-23 20:05:38,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8294318080. Throughput: 0: 42663.6. Samples: 8294412020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 20:05:39,519][15401] Updated weights for policy 0, policy_version 506250 (0.0037) [2024-06-23 20:05:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 8294547456. Throughput: 0: 42643.9. Samples: 8294665900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 20:05:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506260_8294563840.pth... [2024-06-23 20:05:43,514][15401] Updated weights for policy 0, policy_version 506260 (0.0042) [2024-06-23 20:05:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505635_8284323840.pth [2024-06-23 20:05:45,275][15349] Signal inference workers to stop experience collection... (122850 times) [2024-06-23 20:05:45,324][15401] InferenceWorker_p0-w0: stopping experience collection (122850 times) [2024-06-23 20:05:45,333][15349] Signal inference workers to resume experience collection... (122850 times) [2024-06-23 20:05:45,344][15401] InferenceWorker_p0-w0: resuming experience collection (122850 times) [2024-06-23 20:05:47,247][15401] Updated weights for policy 0, policy_version 506270 (0.0044) [2024-06-23 20:05:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8294776832. Throughput: 0: 42714.8. Samples: 8294921280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 20:05:51,375][15401] Updated weights for policy 0, policy_version 506280 (0.0031) [2024-06-23 20:05:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8294957056. Throughput: 0: 42565.3. Samples: 8295049280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 20:05:54,895][15401] Updated weights for policy 0, policy_version 506290 (0.0029) [2024-06-23 20:05:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 8295186432. Throughput: 0: 42644.9. Samples: 8295305500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:05:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:05:58,914][15401] Updated weights for policy 0, policy_version 506300 (0.0033) [2024-06-23 20:06:02,434][15401] Updated weights for policy 0, policy_version 506310 (0.0031) [2024-06-23 20:06:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 8295399424. Throughput: 0: 42691.8. Samples: 8295561160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:06:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 20:06:06,537][15401] Updated weights for policy 0, policy_version 506320 (0.0044) [2024-06-23 20:06:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8295596032. Throughput: 0: 42641.8. Samples: 8295696620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:06:08,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 20:06:10,204][15401] Updated weights for policy 0, policy_version 506330 (0.0036) [2024-06-23 20:06:13,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8295825408. Throughput: 0: 42688.0. Samples: 8295945240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:06:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 20:06:14,100][15401] Updated weights for policy 0, policy_version 506340 (0.0035) [2024-06-23 20:06:17,713][15401] Updated weights for policy 0, policy_version 506350 (0.0030) [2024-06-23 20:06:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42599.0). Total num frames: 8296038400. Throughput: 0: 42808.0. Samples: 8296207240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 20:06:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:06:22,075][15401] Updated weights for policy 0, policy_version 506360 (0.0047) [2024-06-23 20:06:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8296235008. Throughput: 0: 42764.3. Samples: 8296336420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 20:06:25,600][15401] Updated weights for policy 0, policy_version 506370 (0.0028) [2024-06-23 20:06:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 8296464384. Throughput: 0: 42707.5. Samples: 8296587740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:28,395][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 20:06:29,710][15401] Updated weights for policy 0, policy_version 506380 (0.0023) [2024-06-23 20:06:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 8296677376. Throughput: 0: 42670.7. Samples: 8296841460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 20:06:33,581][15401] Updated weights for policy 0, policy_version 506390 (0.0026) [2024-06-23 20:06:37,531][15401] Updated weights for policy 0, policy_version 506400 (0.0029) [2024-06-23 20:06:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8296873984. Throughput: 0: 42772.9. Samples: 8296974060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:38,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 20:06:41,167][15401] Updated weights for policy 0, policy_version 506410 (0.0030) [2024-06-23 20:06:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8297103360. Throughput: 0: 42649.8. Samples: 8297224740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 20:06:45,112][15401] Updated weights for policy 0, policy_version 506420 (0.0052) [2024-06-23 20:06:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8297316352. Throughput: 0: 42658.4. Samples: 8297480780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 20:06:48,774][15401] Updated weights for policy 0, policy_version 506430 (0.0033) [2024-06-23 20:06:52,636][15401] Updated weights for policy 0, policy_version 506440 (0.0031) [2024-06-23 20:06:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 8297529344. Throughput: 0: 42539.3. Samples: 8297610900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 20:06:56,362][15401] Updated weights for policy 0, policy_version 506450 (0.0040) [2024-06-23 20:06:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8297758720. Throughput: 0: 42728.1. Samples: 8297868000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:06:58,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 20:07:00,895][15401] Updated weights for policy 0, policy_version 506460 (0.0034) [2024-06-23 20:07:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 8297955328. Throughput: 0: 42540.5. Samples: 8298121560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 20:07:03,879][15401] Updated weights for policy 0, policy_version 506470 (0.0033) [2024-06-23 20:07:07,166][15349] Signal inference workers to stop experience collection... (122900 times) [2024-06-23 20:07:07,219][15401] InferenceWorker_p0-w0: stopping experience collection (122900 times) [2024-06-23 20:07:07,223][15349] Signal inference workers to resume experience collection... (122900 times) [2024-06-23 20:07:07,242][15401] InferenceWorker_p0-w0: resuming experience collection (122900 times) [2024-06-23 20:07:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 8298151936. Throughput: 0: 42539.6. Samples: 8298250700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 20:07:08,482][15401] Updated weights for policy 0, policy_version 506480 (0.0037) [2024-06-23 20:07:11,669][15401] Updated weights for policy 0, policy_version 506490 (0.0035) [2024-06-23 20:07:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8298381312. Throughput: 0: 42503.9. Samples: 8298500420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 20:07:16,105][15401] Updated weights for policy 0, policy_version 506500 (0.0032) [2024-06-23 20:07:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8298594304. Throughput: 0: 42582.2. Samples: 8298757660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 20:07:19,467][15401] Updated weights for policy 0, policy_version 506510 (0.0034) [2024-06-23 20:07:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 8298790912. Throughput: 0: 42524.8. Samples: 8298887680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:07:23,607][15401] Updated weights for policy 0, policy_version 506520 (0.0028) [2024-06-23 20:07:27,246][15401] Updated weights for policy 0, policy_version 506530 (0.0036) [2024-06-23 20:07:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8299020288. Throughput: 0: 42539.5. Samples: 8299139020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 20:07:31,152][15401] Updated weights for policy 0, policy_version 506540 (0.0031) [2024-06-23 20:07:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8299216896. Throughput: 0: 42642.3. Samples: 8299399680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 20:07:34,820][15401] Updated weights for policy 0, policy_version 506550 (0.0038) [2024-06-23 20:07:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 8299446272. Throughput: 0: 42532.0. Samples: 8299524840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 20:07:38,728][15401] Updated weights for policy 0, policy_version 506560 (0.0046) [2024-06-23 20:07:42,390][15401] Updated weights for policy 0, policy_version 506570 (0.0035) [2024-06-23 20:07:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8299659264. Throughput: 0: 42433.7. Samples: 8299777520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-23 20:07:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 20:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506571_8299659264.pth... [2024-06-23 20:07:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000505947_8289435648.pth [2024-06-23 20:07:46,903][15401] Updated weights for policy 0, policy_version 506580 (0.0033) [2024-06-23 20:07:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8299855872. Throughput: 0: 42444.8. Samples: 8300031580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:07:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 20:07:50,065][15401] Updated weights for policy 0, policy_version 506590 (0.0047) [2024-06-23 20:07:53,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 8300085248. Throughput: 0: 42379.9. Samples: 8300157900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:07:53,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 20:07:54,514][15401] Updated weights for policy 0, policy_version 506600 (0.0038) [2024-06-23 20:07:57,935][15401] Updated weights for policy 0, policy_version 506610 (0.0027) [2024-06-23 20:07:58,391][15132] Fps is (10 sec: 44228.6, 60 sec: 42324.0, 300 sec: 42598.1). Total num frames: 8300298240. Throughput: 0: 42647.7. Samples: 8300419640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:07:58,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 20:08:02,115][15401] Updated weights for policy 0, policy_version 506620 (0.0030) [2024-06-23 20:08:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8300511232. Throughput: 0: 42588.9. Samples: 8300674160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 20:08:05,478][15401] Updated weights for policy 0, policy_version 506630 (0.0028) [2024-06-23 20:08:08,389][15132] Fps is (10 sec: 42606.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8300724224. Throughput: 0: 42490.7. Samples: 8300799760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 20:08:09,573][15401] Updated weights for policy 0, policy_version 506640 (0.0043) [2024-06-23 20:08:12,980][15349] Signal inference workers to stop experience collection... (122950 times) [2024-06-23 20:08:12,981][15349] Signal inference workers to resume experience collection... (122950 times) [2024-06-23 20:08:13,025][15401] InferenceWorker_p0-w0: stopping experience collection (122950 times) [2024-06-23 20:08:13,025][15401] InferenceWorker_p0-w0: resuming experience collection (122950 times) [2024-06-23 20:08:13,124][15401] Updated weights for policy 0, policy_version 506650 (0.0027) [2024-06-23 20:08:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8300953600. Throughput: 0: 42471.6. Samples: 8301050240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 20:08:17,166][15401] Updated weights for policy 0, policy_version 506660 (0.0038) [2024-06-23 20:08:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 8301150208. Throughput: 0: 42451.0. Samples: 8301309980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:18,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 20:08:20,739][15401] Updated weights for policy 0, policy_version 506670 (0.0039) [2024-06-23 20:08:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 8301330432. Throughput: 0: 42455.1. Samples: 8301435320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 20:08:24,708][15401] Updated weights for policy 0, policy_version 506680 (0.0031) [2024-06-23 20:08:28,377][15401] Updated weights for policy 0, policy_version 506690 (0.0028) [2024-06-23 20:08:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 8301608960. Throughput: 0: 42549.4. Samples: 8301692240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 20:08:32,822][15401] Updated weights for policy 0, policy_version 506700 (0.0046) [2024-06-23 20:08:33,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 8301789184. Throughput: 0: 42672.0. Samples: 8301951820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:33,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-23 20:08:36,267][15401] Updated weights for policy 0, policy_version 506710 (0.0029) [2024-06-23 20:08:38,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 8301969408. Throughput: 0: 42633.3. Samples: 8302076300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:38,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 20:08:40,339][15401] Updated weights for policy 0, policy_version 506720 (0.0031) [2024-06-23 20:08:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.9). Total num frames: 8302231552. Throughput: 0: 42516.8. Samples: 8302332820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 20:08:43,949][15401] Updated weights for policy 0, policy_version 506730 (0.0022) [2024-06-23 20:08:47,916][15401] Updated weights for policy 0, policy_version 506740 (0.0038) [2024-06-23 20:08:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8302428160. Throughput: 0: 42636.5. Samples: 8302592800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:48,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 20:08:51,387][15401] Updated weights for policy 0, policy_version 506750 (0.0030) [2024-06-23 20:08:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 8302608384. Throughput: 0: 42763.6. Samples: 8302724120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 20:08:55,527][15401] Updated weights for policy 0, policy_version 506760 (0.0048) [2024-06-23 20:08:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42599.7, 300 sec: 42542.8). Total num frames: 8302854144. Throughput: 0: 42829.8. Samples: 8302977580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:08:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 20:08:59,038][15401] Updated weights for policy 0, policy_version 506770 (0.0037) [2024-06-23 20:09:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8303067136. Throughput: 0: 42801.4. Samples: 8303236040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:09:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 20:09:03,502][15401] Updated weights for policy 0, policy_version 506780 (0.0028) [2024-06-23 20:09:07,040][15401] Updated weights for policy 0, policy_version 506790 (0.0026) [2024-06-23 20:09:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8303263744. Throughput: 0: 42771.7. Samples: 8303360040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:09:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 20:09:11,113][15401] Updated weights for policy 0, policy_version 506800 (0.0043) [2024-06-23 20:09:12,518][15349] Signal inference workers to stop experience collection... (123000 times) [2024-06-23 20:09:12,520][15349] Signal inference workers to resume experience collection... (123000 times) [2024-06-23 20:09:12,557][15401] InferenceWorker_p0-w0: stopping experience collection (123000 times) [2024-06-23 20:09:12,557][15401] InferenceWorker_p0-w0: resuming experience collection (123000 times) [2024-06-23 20:09:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8303509504. Throughput: 0: 42890.7. Samples: 8303622320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-23 20:09:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 20:09:14,561][15401] Updated weights for policy 0, policy_version 506810 (0.0045) [2024-06-23 20:09:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8303689728. Throughput: 0: 42843.1. Samples: 8303879760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:18,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 20:09:18,890][15401] Updated weights for policy 0, policy_version 506820 (0.0031) [2024-06-23 20:09:22,130][15401] Updated weights for policy 0, policy_version 506830 (0.0047) [2024-06-23 20:09:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 8303902720. Throughput: 0: 42757.4. Samples: 8304000380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:09:27,128][15401] Updated weights for policy 0, policy_version 506840 (0.0026) [2024-06-23 20:09:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8304164864. Throughput: 0: 42977.4. Samples: 8304266800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 20:09:29,725][15401] Updated weights for policy 0, policy_version 506850 (0.0036) [2024-06-23 20:09:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 8304328704. Throughput: 0: 42989.6. Samples: 8304527340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:33,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 20:09:34,606][15401] Updated weights for policy 0, policy_version 506860 (0.0051) [2024-06-23 20:09:37,220][15401] Updated weights for policy 0, policy_version 506870 (0.0044) [2024-06-23 20:09:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8304558080. Throughput: 0: 42658.3. Samples: 8304643740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 20:09:42,063][15401] Updated weights for policy 0, policy_version 506880 (0.0032) [2024-06-23 20:09:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8304803840. Throughput: 0: 42870.2. Samples: 8304906740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 20:09:43,530][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506886_8304820224.pth... [2024-06-23 20:09:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506260_8294563840.pth [2024-06-23 20:09:45,032][15401] Updated weights for policy 0, policy_version 506890 (0.0031) [2024-06-23 20:09:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8304967680. Throughput: 0: 42856.0. Samples: 8305164560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 20:09:49,614][15401] Updated weights for policy 0, policy_version 506900 (0.0036) [2024-06-23 20:09:52,647][15401] Updated weights for policy 0, policy_version 506910 (0.0026) [2024-06-23 20:09:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 8305213440. Throughput: 0: 42746.5. Samples: 8305283640. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 20:09:57,393][15401] Updated weights for policy 0, policy_version 506920 (0.0034) [2024-06-23 20:09:58,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 8305426432. Throughput: 0: 42746.6. Samples: 8305546020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:09:58,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 20:10:00,390][15401] Updated weights for policy 0, policy_version 506930 (0.0026) [2024-06-23 20:10:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8305623040. Throughput: 0: 42566.6. Samples: 8305795260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 20:10:04,937][15401] Updated weights for policy 0, policy_version 506940 (0.0038) [2024-06-23 20:10:07,955][15401] Updated weights for policy 0, policy_version 506950 (0.0024) [2024-06-23 20:10:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8305868800. Throughput: 0: 42704.8. Samples: 8305922100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:08,391][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 20:10:12,632][15401] Updated weights for policy 0, policy_version 506960 (0.0031) [2024-06-23 20:10:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8306065408. Throughput: 0: 42690.8. Samples: 8306187880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:13,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 20:10:15,362][15401] Updated weights for policy 0, policy_version 506970 (0.0034) [2024-06-23 20:10:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8306262016. Throughput: 0: 42620.5. Samples: 8306445260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 20:10:20,133][15401] Updated weights for policy 0, policy_version 506980 (0.0028) [2024-06-23 20:10:21,308][15349] Signal inference workers to stop experience collection... (123050 times) [2024-06-23 20:10:21,316][15349] Signal inference workers to resume experience collection... (123050 times) [2024-06-23 20:10:21,350][15401] InferenceWorker_p0-w0: stopping experience collection (123050 times) [2024-06-23 20:10:21,350][15401] InferenceWorker_p0-w0: resuming experience collection (123050 times) [2024-06-23 20:10:23,217][15401] Updated weights for policy 0, policy_version 506990 (0.0038) [2024-06-23 20:10:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 42820.9). Total num frames: 8306524160. Throughput: 0: 42878.2. Samples: 8306573260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 20:10:28,087][15401] Updated weights for policy 0, policy_version 507000 (0.0046) [2024-06-23 20:10:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 8306704384. Throughput: 0: 42745.0. Samples: 8306830260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 20:10:30,812][15401] Updated weights for policy 0, policy_version 507010 (0.0046) [2024-06-23 20:10:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8306917376. Throughput: 0: 42756.5. Samples: 8307088600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 20:10:35,745][15401] Updated weights for policy 0, policy_version 507020 (0.0021) [2024-06-23 20:10:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8307163136. Throughput: 0: 42982.0. Samples: 8307217820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-23 20:10:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 20:10:38,669][15401] Updated weights for policy 0, policy_version 507030 (0.0041) [2024-06-23 20:10:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8307326976. Throughput: 0: 42915.2. Samples: 8307477100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:10:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 20:10:43,690][15401] Updated weights for policy 0, policy_version 507040 (0.0034) [2024-06-23 20:10:46,171][15401] Updated weights for policy 0, policy_version 507050 (0.0041) [2024-06-23 20:10:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8307556352. Throughput: 0: 43090.7. Samples: 8307734340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:10:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 20:10:51,177][15401] Updated weights for policy 0, policy_version 507060 (0.0033) [2024-06-23 20:10:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8307802112. Throughput: 0: 43199.0. Samples: 8307866060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:10:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 20:10:53,771][15401] Updated weights for policy 0, policy_version 507070 (0.0031) [2024-06-23 20:10:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 8307965952. Throughput: 0: 42961.3. Samples: 8308121140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:10:58,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 20:10:58,724][15401] Updated weights for policy 0, policy_version 507080 (0.0035) [2024-06-23 20:11:01,389][15401] Updated weights for policy 0, policy_version 507090 (0.0036) [2024-06-23 20:11:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8308195328. Throughput: 0: 42874.3. Samples: 8308374600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:03,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-23 20:11:06,332][15401] Updated weights for policy 0, policy_version 507100 (0.0031) [2024-06-23 20:11:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8308424704. Throughput: 0: 42934.5. Samples: 8308505320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 20:11:09,007][15401] Updated weights for policy 0, policy_version 507110 (0.0020) [2024-06-23 20:11:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 8308621312. Throughput: 0: 42883.0. Samples: 8308760000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 20:11:13,858][15401] Updated weights for policy 0, policy_version 507120 (0.0029) [2024-06-23 20:11:16,871][15401] Updated weights for policy 0, policy_version 507130 (0.0031) [2024-06-23 20:11:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8308850688. Throughput: 0: 42846.2. Samples: 8309016680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:18,392][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 20:11:21,676][15401] Updated weights for policy 0, policy_version 507140 (0.0042) [2024-06-23 20:11:23,307][15349] Signal inference workers to stop experience collection... (123100 times) [2024-06-23 20:11:23,356][15401] InferenceWorker_p0-w0: stopping experience collection (123100 times) [2024-06-23 20:11:23,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 8309047296. Throughput: 0: 42809.0. Samples: 8309144220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 20:11:23,432][15349] Signal inference workers to resume experience collection... (123100 times) [2024-06-23 20:11:23,432][15401] InferenceWorker_p0-w0: resuming experience collection (123100 times) [2024-06-23 20:11:24,635][15401] Updated weights for policy 0, policy_version 507150 (0.0033) [2024-06-23 20:11:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8309260288. Throughput: 0: 42591.1. Samples: 8309393700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 20:11:29,466][15401] Updated weights for policy 0, policy_version 507160 (0.0039) [2024-06-23 20:11:32,349][15401] Updated weights for policy 0, policy_version 507170 (0.0029) [2024-06-23 20:11:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8309506048. Throughput: 0: 42537.8. Samples: 8309648540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 20:11:37,111][15401] Updated weights for policy 0, policy_version 507180 (0.0028) [2024-06-23 20:11:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8309686272. Throughput: 0: 42589.9. Samples: 8309782600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 20:11:40,163][15401] Updated weights for policy 0, policy_version 507190 (0.0032) [2024-06-23 20:11:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8309899264. Throughput: 0: 42403.0. Samples: 8310029280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 20:11:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507196_8309899264.pth... [2024-06-23 20:11:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506571_8299659264.pth [2024-06-23 20:11:45,021][15401] Updated weights for policy 0, policy_version 507200 (0.0042) [2024-06-23 20:11:47,984][15401] Updated weights for policy 0, policy_version 507210 (0.0035) [2024-06-23 20:11:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 8310145024. Throughput: 0: 42402.7. Samples: 8310282720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 20:11:52,689][15401] Updated weights for policy 0, policy_version 507220 (0.0046) [2024-06-23 20:11:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 8310325248. Throughput: 0: 42506.0. Samples: 8310418080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 20:11:55,726][15401] Updated weights for policy 0, policy_version 507230 (0.0040) [2024-06-23 20:11:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8310554624. Throughput: 0: 42454.3. Samples: 8310670440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:11:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 20:12:00,507][15401] Updated weights for policy 0, policy_version 507240 (0.0031) [2024-06-23 20:12:03,368][15401] Updated weights for policy 0, policy_version 507250 (0.0046) [2024-06-23 20:12:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8310784000. Throughput: 0: 42413.4. Samples: 8310925280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:12:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 20:12:08,098][15401] Updated weights for policy 0, policy_version 507260 (0.0037) [2024-06-23 20:12:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8310964224. Throughput: 0: 42425.7. Samples: 8311053380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:12:08,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-23 20:12:11,122][15401] Updated weights for policy 0, policy_version 507270 (0.0034) [2024-06-23 20:12:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8311209984. Throughput: 0: 42390.7. Samples: 8311301280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 20:12:15,653][15401] Updated weights for policy 0, policy_version 507280 (0.0032) [2024-06-23 20:12:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8311406592. Throughput: 0: 42503.0. Samples: 8311561180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 20:12:18,847][15401] Updated weights for policy 0, policy_version 507290 (0.0032) [2024-06-23 20:12:23,219][15401] Updated weights for policy 0, policy_version 507300 (0.0029) [2024-06-23 20:12:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 8311603200. Throughput: 0: 42272.4. Samples: 8311684860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 20:12:26,822][15401] Updated weights for policy 0, policy_version 507310 (0.0035) [2024-06-23 20:12:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8311848960. Throughput: 0: 42471.7. Samples: 8311940500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 20:12:30,948][15401] Updated weights for policy 0, policy_version 507320 (0.0031) [2024-06-23 20:12:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 8312029184. Throughput: 0: 42654.3. Samples: 8312202160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 20:12:34,580][15401] Updated weights for policy 0, policy_version 507330 (0.0031) [2024-06-23 20:12:38,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8312242176. Throughput: 0: 42280.7. Samples: 8312320720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 20:12:38,612][15401] Updated weights for policy 0, policy_version 507340 (0.0039) [2024-06-23 20:12:42,137][15401] Updated weights for policy 0, policy_version 507350 (0.0043) [2024-06-23 20:12:43,392][15132] Fps is (10 sec: 44225.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8312471552. Throughput: 0: 42489.3. Samples: 8312582560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:43,393][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 20:12:46,365][15401] Updated weights for policy 0, policy_version 507360 (0.0043) [2024-06-23 20:12:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 8312668160. Throughput: 0: 42519.5. Samples: 8312838660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 20:12:48,642][15349] Signal inference workers to stop experience collection... (123150 times) [2024-06-23 20:12:48,700][15401] InferenceWorker_p0-w0: stopping experience collection (123150 times) [2024-06-23 20:12:48,702][15349] Signal inference workers to resume experience collection... (123150 times) [2024-06-23 20:12:48,709][15401] InferenceWorker_p0-w0: resuming experience collection (123150 times) [2024-06-23 20:12:50,246][15401] Updated weights for policy 0, policy_version 507370 (0.0037) [2024-06-23 20:12:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 8312881152. Throughput: 0: 42425.2. Samples: 8312962520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 20:12:53,991][15401] Updated weights for policy 0, policy_version 507380 (0.0034) [2024-06-23 20:12:57,936][15401] Updated weights for policy 0, policy_version 507390 (0.0038) [2024-06-23 20:12:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8313094144. Throughput: 0: 42636.5. Samples: 8313219920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:12:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 20:13:01,528][15401] Updated weights for policy 0, policy_version 507400 (0.0026) [2024-06-23 20:13:03,392][15132] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 42598.0). Total num frames: 8313290752. Throughput: 0: 42592.5. Samples: 8313477940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:03,393][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 20:13:05,685][15401] Updated weights for policy 0, policy_version 507410 (0.0049) [2024-06-23 20:13:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8313536512. Throughput: 0: 42501.0. Samples: 8313597400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 20:13:09,086][15401] Updated weights for policy 0, policy_version 507420 (0.0033) [2024-06-23 20:13:13,210][15401] Updated weights for policy 0, policy_version 507430 (0.0034) [2024-06-23 20:13:13,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8313733120. Throughput: 0: 42521.6. Samples: 8313853980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 20:13:17,298][15401] Updated weights for policy 0, policy_version 507440 (0.0033) [2024-06-23 20:13:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 8313929728. Throughput: 0: 42314.6. Samples: 8314106320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:13:21,146][15401] Updated weights for policy 0, policy_version 507450 (0.0041) [2024-06-23 20:13:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8314159104. Throughput: 0: 42510.3. Samples: 8314233680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 20:13:24,870][15401] Updated weights for policy 0, policy_version 507460 (0.0032) [2024-06-23 20:13:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 8314355712. Throughput: 0: 42304.5. Samples: 8314486160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 20:13:28,847][15401] Updated weights for policy 0, policy_version 507470 (0.0038) [2024-06-23 20:13:32,446][15401] Updated weights for policy 0, policy_version 507480 (0.0044) [2024-06-23 20:13:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8314568704. Throughput: 0: 42263.5. Samples: 8314740520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-23 20:13:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 20:13:36,518][15401] Updated weights for policy 0, policy_version 507490 (0.0032) [2024-06-23 20:13:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8314781696. Throughput: 0: 42340.0. Samples: 8314867820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:13:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 20:13:40,178][15401] Updated weights for policy 0, policy_version 507500 (0.0042) [2024-06-23 20:13:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 8314994688. Throughput: 0: 42088.7. Samples: 8315113920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:13:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 20:13:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507508_8315011072.pth... [2024-06-23 20:13:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000506886_8304820224.pth [2024-06-23 20:13:44,220][15401] Updated weights for policy 0, policy_version 507510 (0.0032) [2024-06-23 20:13:47,991][15401] Updated weights for policy 0, policy_version 507520 (0.0042) [2024-06-23 20:13:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8315207680. Throughput: 0: 41971.1. Samples: 8315366540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:13:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 20:13:51,860][15401] Updated weights for policy 0, policy_version 507530 (0.0044) [2024-06-23 20:13:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8315404288. Throughput: 0: 42227.9. Samples: 8315497660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:13:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 20:13:55,894][15401] Updated weights for policy 0, policy_version 507540 (0.0035) [2024-06-23 20:13:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8315633664. Throughput: 0: 42227.8. Samples: 8315754220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:13:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 20:13:59,427][15401] Updated weights for policy 0, policy_version 507550 (0.0045) [2024-06-23 20:14:03,394][15132] Fps is (10 sec: 44217.6, 60 sec: 42597.0, 300 sec: 42653.3). Total num frames: 8315846656. Throughput: 0: 42360.2. Samples: 8316012720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:03,394][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 20:14:03,509][15401] Updated weights for policy 0, policy_version 507560 (0.0033) [2024-06-23 20:14:07,263][15401] Updated weights for policy 0, policy_version 507570 (0.0036) [2024-06-23 20:14:08,389][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 8316043264. Throughput: 0: 42253.8. Samples: 8316135100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:08,390][15132] Avg episode reward: [(0, '0.097')] [2024-06-23 20:14:11,011][15349] Signal inference workers to stop experience collection... (123200 times) [2024-06-23 20:14:11,066][15401] InferenceWorker_p0-w0: stopping experience collection (123200 times) [2024-06-23 20:14:11,067][15349] Signal inference workers to resume experience collection... (123200 times) [2024-06-23 20:14:11,083][15401] InferenceWorker_p0-w0: resuming experience collection (123200 times) [2024-06-23 20:14:11,213][15401] Updated weights for policy 0, policy_version 507580 (0.0028) [2024-06-23 20:14:13,390][15132] Fps is (10 sec: 42617.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8316272640. Throughput: 0: 42221.4. Samples: 8316386120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:13,393][15132] Avg episode reward: [(0, '0.222')] [2024-06-23 20:14:14,863][15401] Updated weights for policy 0, policy_version 507590 (0.0032) [2024-06-23 20:14:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 8316452864. Throughput: 0: 42466.6. Samples: 8316651520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 20:14:18,968][15401] Updated weights for policy 0, policy_version 507600 (0.0038) [2024-06-23 20:14:22,596][15401] Updated weights for policy 0, policy_version 507610 (0.0030) [2024-06-23 20:14:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 8316682240. Throughput: 0: 42415.3. Samples: 8316776500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:14:26,490][15401] Updated weights for policy 0, policy_version 507620 (0.0039) [2024-06-23 20:14:28,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8316928000. Throughput: 0: 42505.8. Samples: 8317026680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 20:14:30,877][15401] Updated weights for policy 0, policy_version 507630 (0.0036) [2024-06-23 20:14:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 8317091840. Throughput: 0: 42770.7. Samples: 8317291220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 20:14:34,173][15401] Updated weights for policy 0, policy_version 507640 (0.0042) [2024-06-23 20:14:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 8317321216. Throughput: 0: 42523.1. Samples: 8317411200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:38,390][15401] Updated weights for policy 0, policy_version 507650 (0.0030) [2024-06-23 20:14:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 20:14:42,282][15401] Updated weights for policy 0, policy_version 507660 (0.0031) [2024-06-23 20:14:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8317566976. Throughput: 0: 42628.0. Samples: 8317672480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 20:14:45,973][15401] Updated weights for policy 0, policy_version 507670 (0.0041) [2024-06-23 20:14:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 8317763584. Throughput: 0: 42583.6. Samples: 8317928900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:48,393][15132] Avg episode reward: [(0, '0.235')] [2024-06-23 20:14:50,048][15401] Updated weights for policy 0, policy_version 507680 (0.0027) [2024-06-23 20:14:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 8317976576. Throughput: 0: 42583.5. Samples: 8318051460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:53,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 20:14:53,536][15401] Updated weights for policy 0, policy_version 507690 (0.0027) [2024-06-23 20:14:57,678][15401] Updated weights for policy 0, policy_version 507700 (0.0031) [2024-06-23 20:14:58,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8318189568. Throughput: 0: 42839.3. Samples: 8318313880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:14:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 20:15:01,134][15401] Updated weights for policy 0, policy_version 507710 (0.0032) [2024-06-23 20:15:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42874.6, 300 sec: 42542.9). Total num frames: 8318418944. Throughput: 0: 42572.1. Samples: 8318567260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-23 20:15:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 20:15:05,279][15401] Updated weights for policy 0, policy_version 507720 (0.0036) [2024-06-23 20:15:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8318615552. Throughput: 0: 42651.5. Samples: 8318695820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 20:15:08,888][15401] Updated weights for policy 0, policy_version 507730 (0.0026) [2024-06-23 20:15:12,883][15401] Updated weights for policy 0, policy_version 507740 (0.0045) [2024-06-23 20:15:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8318828544. Throughput: 0: 42743.1. Samples: 8318950120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 20:15:16,807][15401] Updated weights for policy 0, policy_version 507750 (0.0043) [2024-06-23 20:15:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 8319025152. Throughput: 0: 42360.0. Samples: 8319197420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:18,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 20:15:20,698][15401] Updated weights for policy 0, policy_version 507760 (0.0033) [2024-06-23 20:15:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 8319238144. Throughput: 0: 42508.1. Samples: 8319324060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 20:15:24,270][15401] Updated weights for policy 0, policy_version 507770 (0.0039) [2024-06-23 20:15:28,173][15401] Updated weights for policy 0, policy_version 507780 (0.0036) [2024-06-23 20:15:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 8319467520. Throughput: 0: 42448.7. Samples: 8319582680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 20:15:31,666][15349] Signal inference workers to stop experience collection... (123250 times) [2024-06-23 20:15:31,674][15349] Signal inference workers to resume experience collection... (123250 times) [2024-06-23 20:15:31,699][15401] InferenceWorker_p0-w0: stopping experience collection (123250 times) [2024-06-23 20:15:31,700][15401] InferenceWorker_p0-w0: resuming experience collection (123250 times) [2024-06-23 20:15:31,804][15401] Updated weights for policy 0, policy_version 507790 (0.0033) [2024-06-23 20:15:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 8319680512. Throughput: 0: 42605.0. Samples: 8319846020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 20:15:35,713][15401] Updated weights for policy 0, policy_version 507800 (0.0031) [2024-06-23 20:15:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8319893504. Throughput: 0: 42633.0. Samples: 8319969840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:38,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-23 20:15:39,992][15401] Updated weights for policy 0, policy_version 507810 (0.0033) [2024-06-23 20:15:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 8320090112. Throughput: 0: 42304.7. Samples: 8320217600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 20:15:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507818_8320090112.pth... [2024-06-23 20:15:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507196_8309899264.pth [2024-06-23 20:15:43,802][15401] Updated weights for policy 0, policy_version 507820 (0.0040) [2024-06-23 20:15:47,755][15401] Updated weights for policy 0, policy_version 507830 (0.0043) [2024-06-23 20:15:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 8320319488. Throughput: 0: 42406.6. Samples: 8320475560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 20:15:51,540][15401] Updated weights for policy 0, policy_version 507840 (0.0025) [2024-06-23 20:15:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 8320499712. Throughput: 0: 42382.2. Samples: 8320603020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 20:15:55,690][15401] Updated weights for policy 0, policy_version 507850 (0.0035) [2024-06-23 20:15:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8320729088. Throughput: 0: 42200.0. Samples: 8320849120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:15:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 20:15:59,086][15401] Updated weights for policy 0, policy_version 507860 (0.0028) [2024-06-23 20:16:03,201][15401] Updated weights for policy 0, policy_version 507870 (0.0039) [2024-06-23 20:16:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 8320942080. Throughput: 0: 42542.1. Samples: 8321111820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:03,394][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 20:16:06,695][15401] Updated weights for policy 0, policy_version 507880 (0.0037) [2024-06-23 20:16:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 8321138688. Throughput: 0: 42571.1. Samples: 8321239760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:08,391][15132] Avg episode reward: [(0, '0.925')] [2024-06-23 20:16:10,850][15401] Updated weights for policy 0, policy_version 507890 (0.0034) [2024-06-23 20:16:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 8321368064. Throughput: 0: 42342.3. Samples: 8321488080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 20:16:14,347][15401] Updated weights for policy 0, policy_version 507900 (0.0046) [2024-06-23 20:16:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 8321564672. Throughput: 0: 42217.4. Samples: 8321745800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 20:16:18,619][15401] Updated weights for policy 0, policy_version 507910 (0.0044) [2024-06-23 20:16:22,019][15401] Updated weights for policy 0, policy_version 507920 (0.0038) [2024-06-23 20:16:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 8321794048. Throughput: 0: 42206.1. Samples: 8321869120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:23,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 20:16:26,278][15401] Updated weights for policy 0, policy_version 507930 (0.0050) [2024-06-23 20:16:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 8322007040. Throughput: 0: 42298.4. Samples: 8322121020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 20:16:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 20:16:29,833][15401] Updated weights for policy 0, policy_version 507940 (0.0040) [2024-06-23 20:16:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 8322187264. Throughput: 0: 42421.4. Samples: 8322384520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:33,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 20:16:34,023][15401] Updated weights for policy 0, policy_version 507950 (0.0031) [2024-06-23 20:16:37,515][15401] Updated weights for policy 0, policy_version 507960 (0.0038) [2024-06-23 20:16:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8322433024. Throughput: 0: 42321.6. Samples: 8322507500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 20:16:41,733][15401] Updated weights for policy 0, policy_version 507970 (0.0029) [2024-06-23 20:16:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 8322662400. Throughput: 0: 42476.5. Samples: 8322760560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 20:16:45,218][15401] Updated weights for policy 0, policy_version 507980 (0.0030) [2024-06-23 20:16:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 8322842624. Throughput: 0: 42548.5. Samples: 8323026500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 20:16:49,239][15401] Updated weights for policy 0, policy_version 507990 (0.0031) [2024-06-23 20:16:52,732][15401] Updated weights for policy 0, policy_version 508000 (0.0035) [2024-06-23 20:16:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 8323088384. Throughput: 0: 42382.2. Samples: 8323146960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 20:16:56,401][15349] Signal inference workers to stop experience collection... (123300 times) [2024-06-23 20:16:56,432][15401] InferenceWorker_p0-w0: stopping experience collection (123300 times) [2024-06-23 20:16:56,458][15349] Signal inference workers to resume experience collection... (123300 times) [2024-06-23 20:16:56,458][15401] InferenceWorker_p0-w0: resuming experience collection (123300 times) [2024-06-23 20:16:56,929][15401] Updated weights for policy 0, policy_version 508010 (0.0035) [2024-06-23 20:16:58,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 8323317760. Throughput: 0: 42612.5. Samples: 8323405640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:16:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 20:17:00,922][15401] Updated weights for policy 0, policy_version 508020 (0.0024) [2024-06-23 20:17:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 8323481600. Throughput: 0: 42735.8. Samples: 8323668920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 20:17:04,611][15401] Updated weights for policy 0, policy_version 508030 (0.0030) [2024-06-23 20:17:08,362][15401] Updated weights for policy 0, policy_version 508040 (0.0031) [2024-06-23 20:17:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 8323727360. Throughput: 0: 42684.5. Samples: 8323789920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:08,392][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 20:17:12,123][15401] Updated weights for policy 0, policy_version 508050 (0.0042) [2024-06-23 20:17:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 8323956736. Throughput: 0: 42982.6. Samples: 8324055240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 20:17:15,922][15401] Updated weights for policy 0, policy_version 508060 (0.0035) [2024-06-23 20:17:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 8324136960. Throughput: 0: 42894.5. Samples: 8324314780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 20:17:19,667][15401] Updated weights for policy 0, policy_version 508070 (0.0032) [2024-06-23 20:17:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 8324349952. Throughput: 0: 42837.0. Samples: 8324435160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 20:17:23,770][15401] Updated weights for policy 0, policy_version 508080 (0.0033) [2024-06-23 20:17:27,275][15401] Updated weights for policy 0, policy_version 508090 (0.0046) [2024-06-23 20:17:28,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 8324595712. Throughput: 0: 43137.3. Samples: 8324701740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-23 20:17:31,320][15401] Updated weights for policy 0, policy_version 508100 (0.0034) [2024-06-23 20:17:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 8324775936. Throughput: 0: 43000.7. Samples: 8324961540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 20:17:35,015][15401] Updated weights for policy 0, policy_version 508110 (0.0038) [2024-06-23 20:17:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42487.7). Total num frames: 8325005312. Throughput: 0: 43051.2. Samples: 8325084260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:38,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 20:17:38,806][15401] Updated weights for policy 0, policy_version 508120 (0.0036) [2024-06-23 20:17:42,824][15401] Updated weights for policy 0, policy_version 508130 (0.0044) [2024-06-23 20:17:43,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8325234688. Throughput: 0: 43139.1. Samples: 8325346900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 20:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508132_8325234688.pth... [2024-06-23 20:17:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507508_8315011072.pth [2024-06-23 20:17:46,330][15401] Updated weights for policy 0, policy_version 508140 (0.0035) [2024-06-23 20:17:48,392][15132] Fps is (10 sec: 40951.4, 60 sec: 42870.0, 300 sec: 42487.0). Total num frames: 8325414912. Throughput: 0: 42950.5. Samples: 8325601780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:48,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 20:17:50,305][15401] Updated weights for policy 0, policy_version 508150 (0.0041) [2024-06-23 20:17:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8325660672. Throughput: 0: 43079.5. Samples: 8325728500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 20:17:53,804][15401] Updated weights for policy 0, policy_version 508160 (0.0043) [2024-06-23 20:17:57,968][15401] Updated weights for policy 0, policy_version 508170 (0.0034) [2024-06-23 20:17:58,390][15132] Fps is (10 sec: 45884.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 8325873664. Throughput: 0: 42971.0. Samples: 8325988940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 20:17:58,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 20:18:01,528][15401] Updated weights for policy 0, policy_version 508180 (0.0028) [2024-06-23 20:18:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 8326070272. Throughput: 0: 42811.3. Samples: 8326241280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:03,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-23 20:18:05,723][15401] Updated weights for policy 0, policy_version 508190 (0.0025) [2024-06-23 20:18:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 8326316032. Throughput: 0: 43066.7. Samples: 8326373160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:08,390][15132] Avg episode reward: [(0, '0.023')] [2024-06-23 20:18:09,002][15401] Updated weights for policy 0, policy_version 508200 (0.0040) [2024-06-23 20:18:13,374][15401] Updated weights for policy 0, policy_version 508210 (0.0038) [2024-06-23 20:18:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8326512640. Throughput: 0: 42924.9. Samples: 8326633360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:18:13,921][15349] Signal inference workers to stop experience collection... (123350 times) [2024-06-23 20:18:13,928][15349] Signal inference workers to resume experience collection... (123350 times) [2024-06-23 20:18:13,937][15401] InferenceWorker_p0-w0: stopping experience collection (123350 times) [2024-06-23 20:18:13,937][15401] InferenceWorker_p0-w0: resuming experience collection (123350 times) [2024-06-23 20:18:17,155][15401] Updated weights for policy 0, policy_version 508220 (0.0026) [2024-06-23 20:18:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 8326725632. Throughput: 0: 42716.7. Samples: 8326883780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 20:18:20,989][15401] Updated weights for policy 0, policy_version 508230 (0.0028) [2024-06-23 20:18:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 8326938624. Throughput: 0: 42880.1. Samples: 8327013860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 20:18:24,679][15401] Updated weights for policy 0, policy_version 508240 (0.0041) [2024-06-23 20:18:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8327151616. Throughput: 0: 42869.8. Samples: 8327276040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 20:18:28,563][15401] Updated weights for policy 0, policy_version 508250 (0.0033) [2024-06-23 20:18:32,167][15401] Updated weights for policy 0, policy_version 508260 (0.0031) [2024-06-23 20:18:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 8327380992. Throughput: 0: 43029.5. Samples: 8327538020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 20:18:35,985][15401] Updated weights for policy 0, policy_version 508270 (0.0030) [2024-06-23 20:18:38,391][15132] Fps is (10 sec: 44232.5, 60 sec: 43143.8, 300 sec: 42709.3). Total num frames: 8327593984. Throughput: 0: 43088.8. Samples: 8327667540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:38,391][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 20:18:39,689][15401] Updated weights for policy 0, policy_version 508280 (0.0035) [2024-06-23 20:18:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8327806976. Throughput: 0: 43045.7. Samples: 8327926000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 20:18:43,551][15401] Updated weights for policy 0, policy_version 508290 (0.0030) [2024-06-23 20:18:47,177][15401] Updated weights for policy 0, policy_version 508300 (0.0041) [2024-06-23 20:18:48,389][15132] Fps is (10 sec: 44241.8, 60 sec: 43692.3, 300 sec: 42820.6). Total num frames: 8328036352. Throughput: 0: 43118.3. Samples: 8328181600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 20:18:51,816][15401] Updated weights for policy 0, policy_version 508310 (0.0039) [2024-06-23 20:18:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8328232960. Throughput: 0: 43090.1. Samples: 8328312220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 20:18:54,887][15401] Updated weights for policy 0, policy_version 508320 (0.0033) [2024-06-23 20:18:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.7). Total num frames: 8328462336. Throughput: 0: 43075.2. Samples: 8328571740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:18:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 20:18:59,383][15401] Updated weights for policy 0, policy_version 508330 (0.0052) [2024-06-23 20:19:02,658][15401] Updated weights for policy 0, policy_version 508340 (0.0039) [2024-06-23 20:19:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 8328675328. Throughput: 0: 43218.1. Samples: 8328828600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:19:03,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 20:19:06,886][15401] Updated weights for policy 0, policy_version 508350 (0.0033) [2024-06-23 20:19:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8328888320. Throughput: 0: 43175.1. Samples: 8328956740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:19:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:19:10,232][15401] Updated weights for policy 0, policy_version 508360 (0.0051) [2024-06-23 20:19:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8329101312. Throughput: 0: 43186.8. Samples: 8329219440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:19:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 20:19:14,370][15401] Updated weights for policy 0, policy_version 508370 (0.0034) [2024-06-23 20:19:17,726][15401] Updated weights for policy 0, policy_version 508380 (0.0037) [2024-06-23 20:19:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8329314304. Throughput: 0: 42990.2. Samples: 8329472580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:19:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 20:19:21,964][15401] Updated weights for policy 0, policy_version 508390 (0.0037) [2024-06-23 20:19:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8329527296. Throughput: 0: 42979.7. Samples: 8329601580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-23 20:19:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:19:25,265][15401] Updated weights for policy 0, policy_version 508400 (0.0027) [2024-06-23 20:19:27,459][15349] Signal inference workers to stop experience collection... (123400 times) [2024-06-23 20:19:27,459][15349] Signal inference workers to resume experience collection... (123400 times) [2024-06-23 20:19:27,516][15401] InferenceWorker_p0-w0: stopping experience collection (123400 times) [2024-06-23 20:19:27,516][15401] InferenceWorker_p0-w0: resuming experience collection (123400 times) [2024-06-23 20:19:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8329723904. Throughput: 0: 43044.6. Samples: 8329863000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 20:19:29,472][15401] Updated weights for policy 0, policy_version 508410 (0.0024) [2024-06-23 20:19:32,998][15401] Updated weights for policy 0, policy_version 508420 (0.0030) [2024-06-23 20:19:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8329969664. Throughput: 0: 43021.3. Samples: 8330117560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 20:19:36,995][15401] Updated weights for policy 0, policy_version 508430 (0.0027) [2024-06-23 20:19:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42872.2, 300 sec: 42709.5). Total num frames: 8330166272. Throughput: 0: 42943.5. Samples: 8330244680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 20:19:40,607][15401] Updated weights for policy 0, policy_version 508440 (0.0038) [2024-06-23 20:19:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 8330379264. Throughput: 0: 42948.8. Samples: 8330504440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 20:19:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508447_8330395648.pth... [2024-06-23 20:19:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000507818_8320090112.pth [2024-06-23 20:19:44,518][15401] Updated weights for policy 0, policy_version 508450 (0.0030) [2024-06-23 20:19:48,244][15401] Updated weights for policy 0, policy_version 508460 (0.0039) [2024-06-23 20:19:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 8330608640. Throughput: 0: 42902.8. Samples: 8330759220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:48,391][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 20:19:52,383][15401] Updated weights for policy 0, policy_version 508470 (0.0029) [2024-06-23 20:19:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8330805248. Throughput: 0: 42924.0. Samples: 8330888320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 20:19:55,989][15401] Updated weights for policy 0, policy_version 508480 (0.0032) [2024-06-23 20:19:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8331001856. Throughput: 0: 42727.9. Samples: 8331142200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:19:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:19:59,987][15401] Updated weights for policy 0, policy_version 508490 (0.0025) [2024-06-23 20:20:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8331231232. Throughput: 0: 42761.9. Samples: 8331396860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:20:03,634][15401] Updated weights for policy 0, policy_version 508500 (0.0034) [2024-06-23 20:20:07,615][15401] Updated weights for policy 0, policy_version 508510 (0.0047) [2024-06-23 20:20:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8331427840. Throughput: 0: 42891.8. Samples: 8331531720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 20:20:11,206][15401] Updated weights for policy 0, policy_version 508520 (0.0030) [2024-06-23 20:20:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8331657216. Throughput: 0: 42669.4. Samples: 8331783120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 20:20:15,271][15401] Updated weights for policy 0, policy_version 508530 (0.0032) [2024-06-23 20:20:18,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8331886592. Throughput: 0: 42665.7. Samples: 8332037520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 20:20:19,020][15401] Updated weights for policy 0, policy_version 508540 (0.0040) [2024-06-23 20:20:22,999][15401] Updated weights for policy 0, policy_version 508550 (0.0040) [2024-06-23 20:20:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8332083200. Throughput: 0: 42855.1. Samples: 8332173160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 20:20:26,626][15401] Updated weights for policy 0, policy_version 508560 (0.0030) [2024-06-23 20:20:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8332312576. Throughput: 0: 42624.8. Samples: 8332422560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:28,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 20:20:30,814][15401] Updated weights for policy 0, policy_version 508570 (0.0036) [2024-06-23 20:20:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8332492800. Throughput: 0: 42783.2. Samples: 8332684460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:33,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 20:20:34,246][15401] Updated weights for policy 0, policy_version 508580 (0.0039) [2024-06-23 20:20:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8332705792. Throughput: 0: 42645.0. Samples: 8332807340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 20:20:38,601][15401] Updated weights for policy 0, policy_version 508590 (0.0049) [2024-06-23 20:20:42,069][15401] Updated weights for policy 0, policy_version 508600 (0.0042) [2024-06-23 20:20:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8332967936. Throughput: 0: 42692.8. Samples: 8333063380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 20:20:45,157][15349] Signal inference workers to stop experience collection... (123450 times) [2024-06-23 20:20:45,182][15401] InferenceWorker_p0-w0: stopping experience collection (123450 times) [2024-06-23 20:20:45,222][15349] Signal inference workers to resume experience collection... (123450 times) [2024-06-23 20:20:45,222][15401] InferenceWorker_p0-w0: resuming experience collection (123450 times) [2024-06-23 20:20:46,220][15401] Updated weights for policy 0, policy_version 508610 (0.0027) [2024-06-23 20:20:48,394][15132] Fps is (10 sec: 44215.8, 60 sec: 42322.0, 300 sec: 42875.4). Total num frames: 8333148160. Throughput: 0: 42768.4. Samples: 8333321640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:48,395][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 20:20:49,990][15401] Updated weights for policy 0, policy_version 508620 (0.0031) [2024-06-23 20:20:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8333361152. Throughput: 0: 42541.1. Samples: 8333446060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-23 20:20:53,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 20:20:53,879][15401] Updated weights for policy 0, policy_version 508630 (0.0024) [2024-06-23 20:20:57,624][15401] Updated weights for policy 0, policy_version 508640 (0.0021) [2024-06-23 20:20:58,389][15132] Fps is (10 sec: 45897.0, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 8333606912. Throughput: 0: 42859.6. Samples: 8333711800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:20:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:21:01,362][15401] Updated weights for policy 0, policy_version 508650 (0.0036) [2024-06-23 20:21:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8333803520. Throughput: 0: 42829.3. Samples: 8333964840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 20:21:05,112][15401] Updated weights for policy 0, policy_version 508660 (0.0037) [2024-06-23 20:21:08,396][15132] Fps is (10 sec: 40933.5, 60 sec: 43140.1, 300 sec: 42875.2). Total num frames: 8334016512. Throughput: 0: 42665.5. Samples: 8334093380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:08,396][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 20:21:08,946][15401] Updated weights for policy 0, policy_version 508670 (0.0041) [2024-06-23 20:21:12,847][15401] Updated weights for policy 0, policy_version 508680 (0.0029) [2024-06-23 20:21:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8334229504. Throughput: 0: 42851.3. Samples: 8334350860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 20:21:17,118][15401] Updated weights for policy 0, policy_version 508690 (0.0032) [2024-06-23 20:21:18,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8334426112. Throughput: 0: 42732.5. Samples: 8334607420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 20:21:20,500][15401] Updated weights for policy 0, policy_version 508700 (0.0040) [2024-06-23 20:21:23,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8334655488. Throughput: 0: 42664.2. Samples: 8334727240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 20:21:24,849][15401] Updated weights for policy 0, policy_version 508710 (0.0041) [2024-06-23 20:21:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 8334852096. Throughput: 0: 42747.2. Samples: 8334987000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 20:21:28,507][15401] Updated weights for policy 0, policy_version 508720 (0.0043) [2024-06-23 20:21:32,383][15401] Updated weights for policy 0, policy_version 508730 (0.0032) [2024-06-23 20:21:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8335065088. Throughput: 0: 42823.5. Samples: 8335248500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:33,404][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 20:21:36,039][15401] Updated weights for policy 0, policy_version 508740 (0.0035) [2024-06-23 20:21:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8335294464. Throughput: 0: 42831.0. Samples: 8335373460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:38,393][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 20:21:39,974][15401] Updated weights for policy 0, policy_version 508750 (0.0032) [2024-06-23 20:21:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 8335507456. Throughput: 0: 42677.3. Samples: 8335632280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 20:21:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508760_8335523840.pth... [2024-06-23 20:21:43,421][15401] Updated weights for policy 0, policy_version 508760 (0.0028) [2024-06-23 20:21:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508132_8325234688.pth [2024-06-23 20:21:47,671][15401] Updated weights for policy 0, policy_version 508770 (0.0037) [2024-06-23 20:21:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42601.6, 300 sec: 42765.0). Total num frames: 8335704064. Throughput: 0: 42818.6. Samples: 8335891680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:48,396][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 20:21:50,494][15349] Signal inference workers to stop experience collection... (123500 times) [2024-06-23 20:21:50,532][15401] InferenceWorker_p0-w0: stopping experience collection (123500 times) [2024-06-23 20:21:50,543][15349] Signal inference workers to resume experience collection... (123500 times) [2024-06-23 20:21:50,548][15401] InferenceWorker_p0-w0: resuming experience collection (123500 times) [2024-06-23 20:21:51,457][15401] Updated weights for policy 0, policy_version 508780 (0.0034) [2024-06-23 20:21:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8335949824. Throughput: 0: 42792.7. Samples: 8336018780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 20:21:55,402][15401] Updated weights for policy 0, policy_version 508790 (0.0032) [2024-06-23 20:21:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 8336130048. Throughput: 0: 42703.9. Samples: 8336272540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:21:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 20:21:59,003][15401] Updated weights for policy 0, policy_version 508800 (0.0027) [2024-06-23 20:22:02,863][15401] Updated weights for policy 0, policy_version 508810 (0.0034) [2024-06-23 20:22:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8336343040. Throughput: 0: 42733.2. Samples: 8336530420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:22:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 20:22:06,618][15401] Updated weights for policy 0, policy_version 508820 (0.0043) [2024-06-23 20:22:08,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 8336605184. Throughput: 0: 42915.2. Samples: 8336658420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:22:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 20:22:11,046][15401] Updated weights for policy 0, policy_version 508830 (0.0044) [2024-06-23 20:22:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8336785408. Throughput: 0: 42766.2. Samples: 8336911480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:22:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 20:22:14,194][15401] Updated weights for policy 0, policy_version 508840 (0.0042) [2024-06-23 20:22:18,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8336982016. Throughput: 0: 42725.8. Samples: 8337171160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-23 20:22:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 20:22:18,663][15401] Updated weights for policy 0, policy_version 508850 (0.0034) [2024-06-23 20:22:21,686][15401] Updated weights for policy 0, policy_version 508860 (0.0029) [2024-06-23 20:22:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8337244160. Throughput: 0: 42746.7. Samples: 8337297060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 20:22:26,537][15401] Updated weights for policy 0, policy_version 508870 (0.0028) [2024-06-23 20:22:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8337424384. Throughput: 0: 42698.1. Samples: 8337553700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 20:22:29,684][15401] Updated weights for policy 0, policy_version 508880 (0.0024) [2024-06-23 20:22:33,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8337620992. Throughput: 0: 42572.5. Samples: 8337807440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 20:22:33,993][15401] Updated weights for policy 0, policy_version 508890 (0.0025) [2024-06-23 20:22:37,092][15401] Updated weights for policy 0, policy_version 508900 (0.0042) [2024-06-23 20:22:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 8337883136. Throughput: 0: 42426.3. Samples: 8337928060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:38,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 20:22:41,839][15401] Updated weights for policy 0, policy_version 508910 (0.0029) [2024-06-23 20:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 8338063360. Throughput: 0: 42634.2. Samples: 8338191080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:22:44,773][15401] Updated weights for policy 0, policy_version 508920 (0.0029) [2024-06-23 20:22:48,390][15132] Fps is (10 sec: 37691.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8338259968. Throughput: 0: 42560.9. Samples: 8338445660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 20:22:49,441][15401] Updated weights for policy 0, policy_version 508930 (0.0042) [2024-06-23 20:22:52,508][15401] Updated weights for policy 0, policy_version 508940 (0.0045) [2024-06-23 20:22:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8338522112. Throughput: 0: 42413.8. Samples: 8338567040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 20:22:57,082][15401] Updated weights for policy 0, policy_version 508950 (0.0041) [2024-06-23 20:22:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8338702336. Throughput: 0: 42527.1. Samples: 8338825200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:22:58,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 20:23:00,346][15401] Updated weights for policy 0, policy_version 508960 (0.0031) [2024-06-23 20:23:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8338898944. Throughput: 0: 42366.7. Samples: 8339077660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 20:23:04,741][15401] Updated weights for policy 0, policy_version 508970 (0.0031) [2024-06-23 20:23:07,988][15401] Updated weights for policy 0, policy_version 508980 (0.0031) [2024-06-23 20:23:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 8339128320. Throughput: 0: 42387.5. Samples: 8339204500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 20:23:12,503][15401] Updated weights for policy 0, policy_version 508990 (0.0037) [2024-06-23 20:23:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8339324928. Throughput: 0: 42469.4. Samples: 8339464820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 20:23:13,439][15349] Signal inference workers to stop experience collection... (123550 times) [2024-06-23 20:23:13,490][15401] InferenceWorker_p0-w0: stopping experience collection (123550 times) [2024-06-23 20:23:13,499][15349] Signal inference workers to resume experience collection... (123550 times) [2024-06-23 20:23:13,504][15401] InferenceWorker_p0-w0: resuming experience collection (123550 times) [2024-06-23 20:23:15,624][15401] Updated weights for policy 0, policy_version 509000 (0.0031) [2024-06-23 20:23:18,390][15132] Fps is (10 sec: 40956.6, 60 sec: 42597.8, 300 sec: 42709.4). Total num frames: 8339537920. Throughput: 0: 42483.8. Samples: 8339719240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:18,391][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 20:23:20,124][15401] Updated weights for policy 0, policy_version 509010 (0.0047) [2024-06-23 20:23:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 8339767296. Throughput: 0: 42508.4. Samples: 8339840840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 20:23:23,413][15401] Updated weights for policy 0, policy_version 509020 (0.0042) [2024-06-23 20:23:27,985][15401] Updated weights for policy 0, policy_version 509030 (0.0031) [2024-06-23 20:23:28,389][15132] Fps is (10 sec: 44240.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8339980288. Throughput: 0: 42578.8. Samples: 8340107120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 20:23:31,029][15401] Updated weights for policy 0, policy_version 509040 (0.0036) [2024-06-23 20:23:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 8340176896. Throughput: 0: 42345.4. Samples: 8340351200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 20:23:35,674][15401] Updated weights for policy 0, policy_version 509050 (0.0038) [2024-06-23 20:23:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 8340406272. Throughput: 0: 42486.2. Samples: 8340478920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 20:23:38,937][15401] Updated weights for policy 0, policy_version 509060 (0.0036) [2024-06-23 20:23:43,379][15401] Updated weights for policy 0, policy_version 509070 (0.0027) [2024-06-23 20:23:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8340602880. Throughput: 0: 42564.5. Samples: 8340740600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 20:23:43,520][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509071_8340619264.pth... [2024-06-23 20:23:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508447_8330395648.pth [2024-06-23 20:23:46,431][15401] Updated weights for policy 0, policy_version 509080 (0.0043) [2024-06-23 20:23:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8340815872. Throughput: 0: 42341.5. Samples: 8340983040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-23 20:23:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 20:23:51,043][15401] Updated weights for policy 0, policy_version 509090 (0.0045) [2024-06-23 20:23:53,393][15132] Fps is (10 sec: 45856.8, 60 sec: 42322.6, 300 sec: 42708.9). Total num frames: 8341061632. Throughput: 0: 42368.7. Samples: 8341111260. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:23:53,394][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 20:23:53,981][15401] Updated weights for policy 0, policy_version 509100 (0.0034) [2024-06-23 20:23:58,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 8341225472. Throughput: 0: 42377.2. Samples: 8341371800. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:23:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 20:23:58,772][15401] Updated weights for policy 0, policy_version 509110 (0.0028) [2024-06-23 20:24:01,880][15401] Updated weights for policy 0, policy_version 509120 (0.0038) [2024-06-23 20:24:03,389][15132] Fps is (10 sec: 40976.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8341471232. Throughput: 0: 42422.6. Samples: 8341628220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 20:24:06,354][15401] Updated weights for policy 0, policy_version 509130 (0.0029) [2024-06-23 20:24:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8341684224. Throughput: 0: 42617.4. Samples: 8341758620. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 20:24:09,807][15401] Updated weights for policy 0, policy_version 509140 (0.0030) [2024-06-23 20:24:13,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 8341864448. Throughput: 0: 42333.6. Samples: 8342012140. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 20:24:13,899][15401] Updated weights for policy 0, policy_version 509150 (0.0031) [2024-06-23 20:24:17,778][15401] Updated weights for policy 0, policy_version 509160 (0.0032) [2024-06-23 20:24:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.9, 300 sec: 42598.4). Total num frames: 8342093824. Throughput: 0: 42481.7. Samples: 8342262880. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 20:24:21,712][15401] Updated weights for policy 0, policy_version 509170 (0.0031) [2024-06-23 20:24:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8342323200. Throughput: 0: 42640.5. Samples: 8342397740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 20:24:25,410][15401] Updated weights for policy 0, policy_version 509180 (0.0037) [2024-06-23 20:24:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 8342519808. Throughput: 0: 42463.4. Samples: 8342651460. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 20:24:29,323][15349] Signal inference workers to stop experience collection... (123600 times) [2024-06-23 20:24:29,323][15349] Signal inference workers to resume experience collection... (123600 times) [2024-06-23 20:24:29,349][15401] InferenceWorker_p0-w0: stopping experience collection (123600 times) [2024-06-23 20:24:29,349][15401] InferenceWorker_p0-w0: resuming experience collection (123600 times) [2024-06-23 20:24:29,466][15401] Updated weights for policy 0, policy_version 509190 (0.0044) [2024-06-23 20:24:32,950][15401] Updated weights for policy 0, policy_version 509200 (0.0024) [2024-06-23 20:24:33,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 8342749184. Throughput: 0: 42736.4. Samples: 8342906440. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:33,405][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 20:24:37,156][15401] Updated weights for policy 0, policy_version 509210 (0.0039) [2024-06-23 20:24:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 8342962176. Throughput: 0: 42713.0. Samples: 8343033280. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:38,392][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 20:24:40,468][15401] Updated weights for policy 0, policy_version 509220 (0.0038) [2024-06-23 20:24:43,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8343158784. Throughput: 0: 42605.4. Samples: 8343289040. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:43,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-23 20:24:44,877][15401] Updated weights for policy 0, policy_version 509230 (0.0046) [2024-06-23 20:24:48,162][15401] Updated weights for policy 0, policy_version 509240 (0.0039) [2024-06-23 20:24:48,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 8343388160. Throughput: 0: 42547.9. Samples: 8343542980. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:48,392][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 20:24:52,336][15401] Updated weights for policy 0, policy_version 509250 (0.0035) [2024-06-23 20:24:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42055.0, 300 sec: 42653.9). Total num frames: 8343584768. Throughput: 0: 42580.4. Samples: 8343674740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 20:24:55,701][15401] Updated weights for policy 0, policy_version 509260 (0.0044) [2024-06-23 20:24:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8343814144. Throughput: 0: 42607.2. Samples: 8343929460. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:24:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 20:24:59,838][15401] Updated weights for policy 0, policy_version 509270 (0.0034) [2024-06-23 20:25:03,200][15401] Updated weights for policy 0, policy_version 509280 (0.0035) [2024-06-23 20:25:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8344043520. Throughput: 0: 42744.5. Samples: 8344186380. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:25:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 20:25:07,448][15401] Updated weights for policy 0, policy_version 509290 (0.0038) [2024-06-23 20:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8344223744. Throughput: 0: 42689.3. Samples: 8344318760. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:25:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 20:25:10,680][15401] Updated weights for policy 0, policy_version 509300 (0.0034) [2024-06-23 20:25:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8344469504. Throughput: 0: 42856.0. Samples: 8344579980. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-23 20:25:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 20:25:15,339][15401] Updated weights for policy 0, policy_version 509310 (0.0028) [2024-06-23 20:25:18,181][15401] Updated weights for policy 0, policy_version 509320 (0.0038) [2024-06-23 20:25:18,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 8344698880. Throughput: 0: 42844.3. Samples: 8344834260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:18,392][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 20:25:22,807][15401] Updated weights for policy 0, policy_version 509330 (0.0036) [2024-06-23 20:25:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8344879104. Throughput: 0: 43110.2. Samples: 8344973140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 20:25:25,825][15401] Updated weights for policy 0, policy_version 509340 (0.0044) [2024-06-23 20:25:28,392][15132] Fps is (10 sec: 40959.9, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 8345108480. Throughput: 0: 43117.3. Samples: 8345229420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:28,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 20:25:30,251][15401] Updated weights for policy 0, policy_version 509350 (0.0030) [2024-06-23 20:25:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 8345337856. Throughput: 0: 43055.7. Samples: 8345480380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 20:25:33,632][15401] Updated weights for policy 0, policy_version 509360 (0.0038) [2024-06-23 20:25:37,882][15401] Updated weights for policy 0, policy_version 509370 (0.0042) [2024-06-23 20:25:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 8345534464. Throughput: 0: 43047.0. Samples: 8345611860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:38,394][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 20:25:41,159][15401] Updated weights for policy 0, policy_version 509380 (0.0032) [2024-06-23 20:25:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.7). Total num frames: 8345763840. Throughput: 0: 43097.8. Samples: 8345868860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 20:25:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509385_8345763840.pth... [2024-06-23 20:25:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000508760_8335523840.pth [2024-06-23 20:25:45,785][15401] Updated weights for policy 0, policy_version 509390 (0.0023) [2024-06-23 20:25:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8345960448. Throughput: 0: 43020.5. Samples: 8346122300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 20:25:48,773][15401] Updated weights for policy 0, policy_version 509400 (0.0031) [2024-06-23 20:25:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8346157056. Throughput: 0: 42903.2. Samples: 8346249400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:25:53,444][15401] Updated weights for policy 0, policy_version 509410 (0.0040) [2024-06-23 20:25:55,801][15349] Signal inference workers to stop experience collection... (123650 times) [2024-06-23 20:25:55,802][15349] Signal inference workers to resume experience collection... (123650 times) [2024-06-23 20:25:55,850][15401] InferenceWorker_p0-w0: stopping experience collection (123650 times) [2024-06-23 20:25:55,851][15401] InferenceWorker_p0-w0: resuming experience collection (123650 times) [2024-06-23 20:25:56,470][15401] Updated weights for policy 0, policy_version 509420 (0.0030) [2024-06-23 20:25:58,391][15132] Fps is (10 sec: 45866.4, 60 sec: 43416.2, 300 sec: 42764.8). Total num frames: 8346419200. Throughput: 0: 42753.9. Samples: 8346503980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:25:58,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 20:26:01,010][15401] Updated weights for policy 0, policy_version 509430 (0.0038) [2024-06-23 20:26:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 8346599424. Throughput: 0: 42916.9. Samples: 8346765420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 20:26:04,637][15401] Updated weights for policy 0, policy_version 509440 (0.0041) [2024-06-23 20:26:08,390][15132] Fps is (10 sec: 37690.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8346796032. Throughput: 0: 42438.3. Samples: 8346882860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 20:26:08,873][15401] Updated weights for policy 0, policy_version 509450 (0.0042) [2024-06-23 20:26:12,146][15401] Updated weights for policy 0, policy_version 509460 (0.0040) [2024-06-23 20:26:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8347058176. Throughput: 0: 42538.7. Samples: 8347143560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 20:26:16,579][15401] Updated weights for policy 0, policy_version 509470 (0.0033) [2024-06-23 20:26:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 8347238400. Throughput: 0: 42798.6. Samples: 8347406320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 20:26:20,229][15401] Updated weights for policy 0, policy_version 509480 (0.0039) [2024-06-23 20:26:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8347451392. Throughput: 0: 42320.4. Samples: 8347516280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 20:26:24,229][15401] Updated weights for policy 0, policy_version 509490 (0.0029) [2024-06-23 20:26:27,827][15401] Updated weights for policy 0, policy_version 509500 (0.0043) [2024-06-23 20:26:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8347680768. Throughput: 0: 42539.6. Samples: 8347783140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 20:26:31,710][15401] Updated weights for policy 0, policy_version 509510 (0.0036) [2024-06-23 20:26:33,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42050.5, 300 sec: 42598.1). Total num frames: 8347860992. Throughput: 0: 42718.6. Samples: 8348044740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:33,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 20:26:35,352][15401] Updated weights for policy 0, policy_version 509520 (0.0030) [2024-06-23 20:26:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8348090368. Throughput: 0: 42605.7. Samples: 8348166660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 20:26:39,272][15401] Updated weights for policy 0, policy_version 509530 (0.0038) [2024-06-23 20:26:43,057][15401] Updated weights for policy 0, policy_version 509540 (0.0044) [2024-06-23 20:26:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8348303360. Throughput: 0: 42666.6. Samples: 8348423900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-23 20:26:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 20:26:46,856][15401] Updated weights for policy 0, policy_version 509550 (0.0036) [2024-06-23 20:26:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8348499968. Throughput: 0: 42631.2. Samples: 8348683820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:26:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 20:26:50,776][15401] Updated weights for policy 0, policy_version 509560 (0.0031) [2024-06-23 20:26:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8348712960. Throughput: 0: 42653.5. Samples: 8348802260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:26:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 20:26:54,947][15401] Updated weights for policy 0, policy_version 509570 (0.0043) [2024-06-23 20:26:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42053.6, 300 sec: 42709.5). Total num frames: 8348942336. Throughput: 0: 42600.9. Samples: 8349060600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:26:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 20:26:58,476][15401] Updated weights for policy 0, policy_version 509580 (0.0036) [2024-06-23 20:27:01,913][15349] Signal inference workers to stop experience collection... (123700 times) [2024-06-23 20:27:01,913][15349] Signal inference workers to resume experience collection... (123700 times) [2024-06-23 20:27:01,953][15401] InferenceWorker_p0-w0: stopping experience collection (123700 times) [2024-06-23 20:27:01,953][15401] InferenceWorker_p0-w0: resuming experience collection (123700 times) [2024-06-23 20:27:02,655][15401] Updated weights for policy 0, policy_version 509590 (0.0038) [2024-06-23 20:27:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8349138944. Throughput: 0: 42390.2. Samples: 8349313880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:03,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 20:27:06,301][15401] Updated weights for policy 0, policy_version 509600 (0.0048) [2024-06-23 20:27:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8349368320. Throughput: 0: 42625.6. Samples: 8349434420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 20:27:10,540][15401] Updated weights for policy 0, policy_version 509610 (0.0023) [2024-06-23 20:27:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 8349564928. Throughput: 0: 42580.0. Samples: 8349699240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 20:27:13,877][15401] Updated weights for policy 0, policy_version 509620 (0.0038) [2024-06-23 20:27:18,041][15401] Updated weights for policy 0, policy_version 509630 (0.0031) [2024-06-23 20:27:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8349777920. Throughput: 0: 42304.0. Samples: 8349948320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 20:27:21,581][15401] Updated weights for policy 0, policy_version 509640 (0.0042) [2024-06-23 20:27:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8350007296. Throughput: 0: 42524.9. Samples: 8350080280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:27:25,566][15401] Updated weights for policy 0, policy_version 509650 (0.0032) [2024-06-23 20:27:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 8350203904. Throughput: 0: 42578.4. Samples: 8350339920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:28,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 20:27:29,376][15401] Updated weights for policy 0, policy_version 509660 (0.0039) [2024-06-23 20:27:33,126][15401] Updated weights for policy 0, policy_version 509670 (0.0023) [2024-06-23 20:27:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 8350433280. Throughput: 0: 42327.9. Samples: 8350588680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:33,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 20:27:37,144][15401] Updated weights for policy 0, policy_version 509680 (0.0040) [2024-06-23 20:27:38,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 8350646272. Throughput: 0: 42652.1. Samples: 8350721880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:38,397][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 20:27:40,817][15401] Updated weights for policy 0, policy_version 509690 (0.0040) [2024-06-23 20:27:43,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8350842880. Throughput: 0: 42642.7. Samples: 8350979520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 20:27:43,455][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509696_8350859264.pth... [2024-06-23 20:27:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509071_8340619264.pth [2024-06-23 20:27:44,953][15401] Updated weights for policy 0, policy_version 509700 (0.0037) [2024-06-23 20:27:48,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8351072256. Throughput: 0: 42637.9. Samples: 8351232580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-23 20:27:48,494][15401] Updated weights for policy 0, policy_version 509710 (0.0027) [2024-06-23 20:27:52,501][15401] Updated weights for policy 0, policy_version 509720 (0.0037) [2024-06-23 20:27:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8351285248. Throughput: 0: 42841.2. Samples: 8351362380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:53,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 20:27:56,232][15401] Updated weights for policy 0, policy_version 509730 (0.0049) [2024-06-23 20:27:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8351481856. Throughput: 0: 42626.1. Samples: 8351617420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:27:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 20:27:59,959][15401] Updated weights for policy 0, policy_version 509740 (0.0032) [2024-06-23 20:28:03,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8351727616. Throughput: 0: 42713.3. Samples: 8351870520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:28:03,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 20:28:03,711][15401] Updated weights for policy 0, policy_version 509750 (0.0035) [2024-06-23 20:28:07,962][15401] Updated weights for policy 0, policy_version 509760 (0.0036) [2024-06-23 20:28:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8351940608. Throughput: 0: 42766.7. Samples: 8352004780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-23 20:28:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 20:28:11,230][15401] Updated weights for policy 0, policy_version 509770 (0.0032) [2024-06-23 20:28:13,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 8352120832. Throughput: 0: 42682.5. Samples: 8352260640. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 20:28:15,486][15401] Updated weights for policy 0, policy_version 509780 (0.0041) [2024-06-23 20:28:17,823][15349] Signal inference workers to stop experience collection... (123750 times) [2024-06-23 20:28:17,872][15401] InferenceWorker_p0-w0: stopping experience collection (123750 times) [2024-06-23 20:28:17,876][15349] Signal inference workers to resume experience collection... (123750 times) [2024-06-23 20:28:17,894][15401] InferenceWorker_p0-w0: resuming experience collection (123750 times) [2024-06-23 20:28:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8352366592. Throughput: 0: 42717.8. Samples: 8352510880. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:18,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 20:28:18,933][15401] Updated weights for policy 0, policy_version 509790 (0.0041) [2024-06-23 20:28:22,973][15401] Updated weights for policy 0, policy_version 509800 (0.0040) [2024-06-23 20:28:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8352579584. Throughput: 0: 42848.3. Samples: 8352649780. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 20:28:26,350][15401] Updated weights for policy 0, policy_version 509810 (0.0044) [2024-06-23 20:28:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8352776192. Throughput: 0: 42775.1. Samples: 8352904400. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 20:28:30,682][15401] Updated weights for policy 0, policy_version 509820 (0.0053) [2024-06-23 20:28:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8353005568. Throughput: 0: 42842.1. Samples: 8353160480. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 20:28:33,857][15401] Updated weights for policy 0, policy_version 509830 (0.0034) [2024-06-23 20:28:38,334][15401] Updated weights for policy 0, policy_version 509840 (0.0030) [2024-06-23 20:28:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 8353218560. Throughput: 0: 42884.6. Samples: 8353292080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 20:28:41,388][15401] Updated weights for policy 0, policy_version 509850 (0.0039) [2024-06-23 20:28:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 8353431552. Throughput: 0: 42938.3. Samples: 8353549640. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:43,396][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 20:28:45,892][15401] Updated weights for policy 0, policy_version 509860 (0.0031) [2024-06-23 20:28:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42710.1). Total num frames: 8353660928. Throughput: 0: 43076.2. Samples: 8353808840. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:48,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 20:28:49,289][15401] Updated weights for policy 0, policy_version 509870 (0.0035) [2024-06-23 20:28:53,346][15401] Updated weights for policy 0, policy_version 509880 (0.0032) [2024-06-23 20:28:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8353873920. Throughput: 0: 42953.2. Samples: 8353937680. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 20:28:56,980][15401] Updated weights for policy 0, policy_version 509890 (0.0033) [2024-06-23 20:28:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8354070528. Throughput: 0: 42854.4. Samples: 8354189080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:28:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 20:29:01,329][15401] Updated weights for policy 0, policy_version 509900 (0.0029) [2024-06-23 20:29:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 8354299904. Throughput: 0: 43153.5. Samples: 8354452780. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 20:29:04,690][15401] Updated weights for policy 0, policy_version 509910 (0.0036) [2024-06-23 20:29:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8354480128. Throughput: 0: 42861.8. Samples: 8354578560. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 20:29:08,898][15401] Updated weights for policy 0, policy_version 509920 (0.0029) [2024-06-23 20:29:12,830][15401] Updated weights for policy 0, policy_version 509930 (0.0030) [2024-06-23 20:29:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 8354709504. Throughput: 0: 42792.9. Samples: 8354830080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 20:29:16,613][15401] Updated weights for policy 0, policy_version 509940 (0.0022) [2024-06-23 20:29:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8354938880. Throughput: 0: 42952.5. Samples: 8355093340. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 20:29:20,219][15401] Updated weights for policy 0, policy_version 509950 (0.0031) [2024-06-23 20:29:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8355135488. Throughput: 0: 42851.0. Samples: 8355220380. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 20:29:24,458][15401] Updated weights for policy 0, policy_version 509960 (0.0033) [2024-06-23 20:29:27,732][15401] Updated weights for policy 0, policy_version 509970 (0.0052) [2024-06-23 20:29:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42766.0). Total num frames: 8355364864. Throughput: 0: 42727.2. Samples: 8355472360. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 20:29:32,132][15401] Updated weights for policy 0, policy_version 509980 (0.0035) [2024-06-23 20:29:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8355561472. Throughput: 0: 42844.9. Samples: 8355736860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:33,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 20:29:35,697][15401] Updated weights for policy 0, policy_version 509990 (0.0040) [2024-06-23 20:29:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8355790848. Throughput: 0: 42760.1. Samples: 8355861880. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-23 20:29:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 20:29:39,819][15401] Updated weights for policy 0, policy_version 510000 (0.0033) [2024-06-23 20:29:41,307][15349] Signal inference workers to stop experience collection... (123800 times) [2024-06-23 20:29:41,308][15349] Signal inference workers to resume experience collection... (123800 times) [2024-06-23 20:29:41,343][15401] InferenceWorker_p0-w0: stopping experience collection (123800 times) [2024-06-23 20:29:41,343][15401] InferenceWorker_p0-w0: resuming experience collection (123800 times) [2024-06-23 20:29:43,154][15401] Updated weights for policy 0, policy_version 510010 (0.0038) [2024-06-23 20:29:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 8356003840. Throughput: 0: 42795.6. Samples: 8356114880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:29:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 20:29:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510011_8356020224.pth... [2024-06-23 20:29:43,524][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509385_8345763840.pth [2024-06-23 20:29:47,437][15401] Updated weights for policy 0, policy_version 510020 (0.0046) [2024-06-23 20:29:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 8356200448. Throughput: 0: 42803.8. Samples: 8356378960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:29:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 20:29:50,639][15401] Updated weights for policy 0, policy_version 510030 (0.0022) [2024-06-23 20:29:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8356397056. Throughput: 0: 42744.9. Samples: 8356502080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:29:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 20:29:55,090][15401] Updated weights for policy 0, policy_version 510040 (0.0030) [2024-06-23 20:29:58,169][15401] Updated weights for policy 0, policy_version 510050 (0.0033) [2024-06-23 20:29:58,396][15132] Fps is (10 sec: 45845.9, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 8356659200. Throughput: 0: 42768.9. Samples: 8356754960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:29:58,397][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:30:02,573][15401] Updated weights for policy 0, policy_version 510060 (0.0035) [2024-06-23 20:30:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8356839424. Throughput: 0: 42692.9. Samples: 8357014520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 20:30:06,207][15401] Updated weights for policy 0, policy_version 510070 (0.0033) [2024-06-23 20:30:08,390][15132] Fps is (10 sec: 39347.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8357052416. Throughput: 0: 42718.3. Samples: 8357142700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 20:30:10,564][15401] Updated weights for policy 0, policy_version 510080 (0.0031) [2024-06-23 20:30:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 8357298176. Throughput: 0: 42716.8. Samples: 8357394620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:13,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 20:30:13,651][15401] Updated weights for policy 0, policy_version 510090 (0.0040) [2024-06-23 20:30:18,104][15401] Updated weights for policy 0, policy_version 510100 (0.0041) [2024-06-23 20:30:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8357478400. Throughput: 0: 42714.5. Samples: 8357659020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:30:21,177][15401] Updated weights for policy 0, policy_version 510110 (0.0029) [2024-06-23 20:30:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 8357691392. Throughput: 0: 42647.5. Samples: 8357781020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 20:30:25,680][15401] Updated weights for policy 0, policy_version 510120 (0.0027) [2024-06-23 20:30:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8357937152. Throughput: 0: 42816.5. Samples: 8358041620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 20:30:28,960][15401] Updated weights for policy 0, policy_version 510130 (0.0037) [2024-06-23 20:30:33,169][15401] Updated weights for policy 0, policy_version 510140 (0.0035) [2024-06-23 20:30:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 8358133760. Throughput: 0: 42808.8. Samples: 8358305360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 20:30:36,489][15401] Updated weights for policy 0, policy_version 510150 (0.0038) [2024-06-23 20:30:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8358346752. Throughput: 0: 42832.4. Samples: 8358429540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 20:30:40,853][15401] Updated weights for policy 0, policy_version 510160 (0.0026) [2024-06-23 20:30:43,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8358592512. Throughput: 0: 42955.0. Samples: 8358687660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 20:30:44,198][15401] Updated weights for policy 0, policy_version 510170 (0.0037) [2024-06-23 20:30:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8358772736. Throughput: 0: 43009.4. Samples: 8358949940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 20:30:48,476][15401] Updated weights for policy 0, policy_version 510180 (0.0043) [2024-06-23 20:30:49,485][15349] Signal inference workers to stop experience collection... (123850 times) [2024-06-23 20:30:49,489][15349] Signal inference workers to resume experience collection... (123850 times) [2024-06-23 20:30:49,528][15401] InferenceWorker_p0-w0: stopping experience collection (123850 times) [2024-06-23 20:30:49,556][15401] InferenceWorker_p0-w0: resuming experience collection (123850 times) [2024-06-23 20:30:51,755][15401] Updated weights for policy 0, policy_version 510190 (0.0036) [2024-06-23 20:30:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42654.2). Total num frames: 8359002112. Throughput: 0: 42772.4. Samples: 8359067460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 20:30:56,175][15401] Updated weights for policy 0, policy_version 510200 (0.0030) [2024-06-23 20:30:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 8359215104. Throughput: 0: 43086.3. Samples: 8359333500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:30:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 20:30:59,452][15401] Updated weights for policy 0, policy_version 510210 (0.0030) [2024-06-23 20:31:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8359395328. Throughput: 0: 43039.6. Samples: 8359595800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:31:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 20:31:04,155][15401] Updated weights for policy 0, policy_version 510220 (0.0036) [2024-06-23 20:31:07,193][15401] Updated weights for policy 0, policy_version 510230 (0.0030) [2024-06-23 20:31:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8359641088. Throughput: 0: 42851.5. Samples: 8359709340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 20:31:11,615][15401] Updated weights for policy 0, policy_version 510240 (0.0038) [2024-06-23 20:31:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8359870464. Throughput: 0: 43044.0. Samples: 8359978600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 20:31:14,783][15401] Updated weights for policy 0, policy_version 510250 (0.0025) [2024-06-23 20:31:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8360034304. Throughput: 0: 43053.5. Samples: 8360242760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:18,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 20:31:19,179][15401] Updated weights for policy 0, policy_version 510260 (0.0028) [2024-06-23 20:31:22,537][15401] Updated weights for policy 0, policy_version 510270 (0.0036) [2024-06-23 20:31:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8360296448. Throughput: 0: 42883.4. Samples: 8360359300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 20:31:26,958][15401] Updated weights for policy 0, policy_version 510280 (0.0028) [2024-06-23 20:31:28,396][15132] Fps is (10 sec: 47483.1, 60 sec: 42866.8, 300 sec: 42875.5). Total num frames: 8360509440. Throughput: 0: 42960.1. Samples: 8360621140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:28,396][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 20:31:30,212][15401] Updated weights for policy 0, policy_version 510290 (0.0044) [2024-06-23 20:31:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8360689664. Throughput: 0: 42959.4. Samples: 8360883120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:33,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 20:31:34,521][15401] Updated weights for policy 0, policy_version 510300 (0.0026) [2024-06-23 20:31:37,737][15401] Updated weights for policy 0, policy_version 510310 (0.0040) [2024-06-23 20:31:38,390][15132] Fps is (10 sec: 42625.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8360935424. Throughput: 0: 43014.6. Samples: 8361003120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 20:31:42,134][15401] Updated weights for policy 0, policy_version 510320 (0.0030) [2024-06-23 20:31:43,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 8361132032. Throughput: 0: 42932.9. Samples: 8361265760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:43,397][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 20:31:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510323_8361132032.pth... [2024-06-23 20:31:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000509696_8350859264.pth [2024-06-23 20:31:45,195][15401] Updated weights for policy 0, policy_version 510330 (0.0024) [2024-06-23 20:31:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8361345024. Throughput: 0: 42743.1. Samples: 8361519240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 20:31:49,628][15401] Updated weights for policy 0, policy_version 510340 (0.0043) [2024-06-23 20:31:53,239][15401] Updated weights for policy 0, policy_version 510350 (0.0027) [2024-06-23 20:31:53,392][15132] Fps is (10 sec: 44254.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8361574400. Throughput: 0: 43117.8. Samples: 8361649740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:53,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 20:31:57,215][15401] Updated weights for policy 0, policy_version 510360 (0.0031) [2024-06-23 20:31:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8361771008. Throughput: 0: 42885.8. Samples: 8361908460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:31:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 20:32:00,965][15401] Updated weights for policy 0, policy_version 510370 (0.0028) [2024-06-23 20:32:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 8362000384. Throughput: 0: 42637.8. Samples: 8362161460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 20:32:04,409][15349] Signal inference workers to stop experience collection... (123900 times) [2024-06-23 20:32:04,416][15349] Signal inference workers to resume experience collection... (123900 times) [2024-06-23 20:32:04,447][15401] InferenceWorker_p0-w0: stopping experience collection (123900 times) [2024-06-23 20:32:04,447][15401] InferenceWorker_p0-w0: resuming experience collection (123900 times) [2024-06-23 20:32:04,871][15401] Updated weights for policy 0, policy_version 510380 (0.0027) [2024-06-23 20:32:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8362213376. Throughput: 0: 42933.5. Samples: 8362291300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 20:32:08,543][15401] Updated weights for policy 0, policy_version 510390 (0.0024) [2024-06-23 20:32:12,805][15401] Updated weights for policy 0, policy_version 510400 (0.0042) [2024-06-23 20:32:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8362409984. Throughput: 0: 42870.1. Samples: 8362550020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 20:32:16,165][15401] Updated weights for policy 0, policy_version 510410 (0.0036) [2024-06-23 20:32:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 8362655744. Throughput: 0: 42540.9. Samples: 8362797460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:18,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 20:32:20,240][15401] Updated weights for policy 0, policy_version 510420 (0.0035) [2024-06-23 20:32:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8362852352. Throughput: 0: 42920.1. Samples: 8362934520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 20:32:23,803][15401] Updated weights for policy 0, policy_version 510430 (0.0032) [2024-06-23 20:32:27,978][15401] Updated weights for policy 0, policy_version 510440 (0.0035) [2024-06-23 20:32:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42329.9, 300 sec: 42765.4). Total num frames: 8363048960. Throughput: 0: 42804.4. Samples: 8363191680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:28,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-23 20:32:31,297][15401] Updated weights for policy 0, policy_version 510450 (0.0044) [2024-06-23 20:32:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42877.0). Total num frames: 8363294720. Throughput: 0: 42825.3. Samples: 8363446380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-23 20:32:33,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-23 20:32:35,490][15401] Updated weights for policy 0, policy_version 510460 (0.0032) [2024-06-23 20:32:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 8363491328. Throughput: 0: 42797.0. Samples: 8363575500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:32:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 20:32:39,125][15401] Updated weights for policy 0, policy_version 510470 (0.0032) [2024-06-23 20:32:43,132][15401] Updated weights for policy 0, policy_version 510480 (0.0033) [2024-06-23 20:32:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 8363704320. Throughput: 0: 42800.0. Samples: 8363834460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:32:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 20:32:46,757][15401] Updated weights for policy 0, policy_version 510490 (0.0026) [2024-06-23 20:32:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8363933696. Throughput: 0: 42664.8. Samples: 8364081380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:32:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:32:50,737][15401] Updated weights for policy 0, policy_version 510500 (0.0034) [2024-06-23 20:32:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 8364113920. Throughput: 0: 42663.9. Samples: 8364211180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:32:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 20:32:54,579][15401] Updated weights for policy 0, policy_version 510510 (0.0031) [2024-06-23 20:32:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 8364343296. Throughput: 0: 42516.8. Samples: 8364463280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:32:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 20:32:58,670][15401] Updated weights for policy 0, policy_version 510520 (0.0030) [2024-06-23 20:33:02,577][15401] Updated weights for policy 0, policy_version 510530 (0.0028) [2024-06-23 20:33:03,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 8364556288. Throughput: 0: 42581.6. Samples: 8364713900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:03,396][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:33:06,097][15401] Updated weights for policy 0, policy_version 510540 (0.0033) [2024-06-23 20:33:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8364769280. Throughput: 0: 42378.6. Samples: 8364841560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:33:10,300][15401] Updated weights for policy 0, policy_version 510550 (0.0033) [2024-06-23 20:33:13,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8364982272. Throughput: 0: 42334.9. Samples: 8365096760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 20:33:13,894][15401] Updated weights for policy 0, policy_version 510560 (0.0034) [2024-06-23 20:33:18,102][15401] Updated weights for policy 0, policy_version 510570 (0.0037) [2024-06-23 20:33:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 8365195264. Throughput: 0: 42398.3. Samples: 8365354400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:18,392][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 20:33:21,857][15401] Updated weights for policy 0, policy_version 510580 (0.0034) [2024-06-23 20:33:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8365408256. Throughput: 0: 42376.8. Samples: 8365482460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 20:33:25,533][15401] Updated weights for policy 0, policy_version 510590 (0.0034) [2024-06-23 20:33:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8365621248. Throughput: 0: 42206.1. Samples: 8365733740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 20:33:29,525][15401] Updated weights for policy 0, policy_version 510600 (0.0041) [2024-06-23 20:33:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8365817856. Throughput: 0: 42473.4. Samples: 8365992680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:33:33,471][15401] Updated weights for policy 0, policy_version 510610 (0.0035) [2024-06-23 20:33:37,052][15401] Updated weights for policy 0, policy_version 510620 (0.0037) [2024-06-23 20:33:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 8366047232. Throughput: 0: 42455.5. Samples: 8366121680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 20:33:41,185][15401] Updated weights for policy 0, policy_version 510630 (0.0036) [2024-06-23 20:33:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8366260224. Throughput: 0: 42472.9. Samples: 8366374560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 20:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510636_8366260224.pth... [2024-06-23 20:33:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510011_8356020224.pth [2024-06-23 20:33:44,606][15401] Updated weights for policy 0, policy_version 510640 (0.0023) [2024-06-23 20:33:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8366456832. Throughput: 0: 42620.6. Samples: 8366631560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 20:33:48,754][15401] Updated weights for policy 0, policy_version 510650 (0.0044) [2024-06-23 20:33:51,674][15349] Signal inference workers to stop experience collection... (123950 times) [2024-06-23 20:33:51,727][15401] InferenceWorker_p0-w0: stopping experience collection (123950 times) [2024-06-23 20:33:51,731][15349] Signal inference workers to resume experience collection... (123950 times) [2024-06-23 20:33:51,741][15401] InferenceWorker_p0-w0: resuming experience collection (123950 times) [2024-06-23 20:33:52,190][15401] Updated weights for policy 0, policy_version 510660 (0.0028) [2024-06-23 20:33:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8366686208. Throughput: 0: 42659.1. Samples: 8366761220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:53,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 20:33:56,363][15401] Updated weights for policy 0, policy_version 510670 (0.0026) [2024-06-23 20:33:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8366882816. Throughput: 0: 42736.6. Samples: 8367019900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:33:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 20:33:59,770][15401] Updated weights for policy 0, policy_version 510680 (0.0037) [2024-06-23 20:34:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 8367095808. Throughput: 0: 42775.2. Samples: 8367279180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 20:34:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-23 20:34:04,040][15401] Updated weights for policy 0, policy_version 510690 (0.0036) [2024-06-23 20:34:07,465][15401] Updated weights for policy 0, policy_version 510700 (0.0030) [2024-06-23 20:34:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8367341568. Throughput: 0: 42729.7. Samples: 8367405300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:08,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 20:34:11,734][15401] Updated weights for policy 0, policy_version 510710 (0.0037) [2024-06-23 20:34:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8367538176. Throughput: 0: 42854.1. Samples: 8367662180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 20:34:15,176][15401] Updated weights for policy 0, policy_version 510720 (0.0040) [2024-06-23 20:34:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 8367734784. Throughput: 0: 42802.2. Samples: 8367918780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 20:34:19,335][15401] Updated weights for policy 0, policy_version 510730 (0.0038) [2024-06-23 20:34:22,857][15401] Updated weights for policy 0, policy_version 510740 (0.0032) [2024-06-23 20:34:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8367980544. Throughput: 0: 42648.0. Samples: 8368040840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:23,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 20:34:27,246][15401] Updated weights for policy 0, policy_version 510750 (0.0025) [2024-06-23 20:34:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8368193536. Throughput: 0: 42869.4. Samples: 8368303680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 20:34:30,450][15401] Updated weights for policy 0, policy_version 510760 (0.0027) [2024-06-23 20:34:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8368373760. Throughput: 0: 42909.0. Samples: 8368562460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 20:34:34,771][15401] Updated weights for policy 0, policy_version 510770 (0.0031) [2024-06-23 20:34:38,257][15401] Updated weights for policy 0, policy_version 510780 (0.0039) [2024-06-23 20:34:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8368619520. Throughput: 0: 42863.8. Samples: 8368690080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 20:34:42,483][15401] Updated weights for policy 0, policy_version 510790 (0.0039) [2024-06-23 20:34:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8368816128. Throughput: 0: 42848.7. Samples: 8368948100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:43,399][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 20:34:45,887][15401] Updated weights for policy 0, policy_version 510800 (0.0037) [2024-06-23 20:34:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8369029120. Throughput: 0: 42608.4. Samples: 8369196560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 20:34:50,063][15401] Updated weights for policy 0, policy_version 510810 (0.0028) [2024-06-23 20:34:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 8369242112. Throughput: 0: 42645.4. Samples: 8369324340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 20:34:53,696][15401] Updated weights for policy 0, policy_version 510820 (0.0035) [2024-06-23 20:34:57,728][15401] Updated weights for policy 0, policy_version 510830 (0.0052) [2024-06-23 20:34:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8369455104. Throughput: 0: 42866.0. Samples: 8369591140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:34:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 20:35:01,271][15401] Updated weights for policy 0, policy_version 510840 (0.0031) [2024-06-23 20:35:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8369684480. Throughput: 0: 42649.7. Samples: 8369838020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 20:35:05,399][15401] Updated weights for policy 0, policy_version 510850 (0.0034) [2024-06-23 20:35:05,938][15349] Signal inference workers to stop experience collection... (124000 times) [2024-06-23 20:35:05,990][15401] InferenceWorker_p0-w0: stopping experience collection (124000 times) [2024-06-23 20:35:05,991][15349] Signal inference workers to resume experience collection... (124000 times) [2024-06-23 20:35:06,006][15401] InferenceWorker_p0-w0: resuming experience collection (124000 times) [2024-06-23 20:35:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8369897472. Throughput: 0: 42803.5. Samples: 8369967000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 20:35:08,911][15401] Updated weights for policy 0, policy_version 510860 (0.0027) [2024-06-23 20:35:13,267][15401] Updated weights for policy 0, policy_version 510870 (0.0038) [2024-06-23 20:35:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8370094080. Throughput: 0: 42774.6. Samples: 8370228540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 20:35:16,675][15401] Updated weights for policy 0, policy_version 510880 (0.0035) [2024-06-23 20:35:18,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8370307072. Throughput: 0: 42679.4. Samples: 8370483040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 20:35:20,718][15401] Updated weights for policy 0, policy_version 510890 (0.0028) [2024-06-23 20:35:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8370536448. Throughput: 0: 42710.2. Samples: 8370612040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 20:35:24,398][15401] Updated weights for policy 0, policy_version 510900 (0.0041) [2024-06-23 20:35:28,316][15401] Updated weights for policy 0, policy_version 510910 (0.0033) [2024-06-23 20:35:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8370749440. Throughput: 0: 42719.3. Samples: 8370870460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-23 20:35:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 20:35:31,977][15401] Updated weights for policy 0, policy_version 510920 (0.0039) [2024-06-23 20:35:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8370962432. Throughput: 0: 42963.1. Samples: 8371129900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 20:35:35,964][15401] Updated weights for policy 0, policy_version 510930 (0.0029) [2024-06-23 20:35:38,393][15132] Fps is (10 sec: 45856.8, 60 sec: 43141.6, 300 sec: 42764.4). Total num frames: 8371208192. Throughput: 0: 42906.9. Samples: 8371255320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:38,394][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 20:35:39,386][15401] Updated weights for policy 0, policy_version 510940 (0.0033) [2024-06-23 20:35:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 8371372032. Throughput: 0: 42812.7. Samples: 8371517720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 20:35:43,593][15401] Updated weights for policy 0, policy_version 510950 (0.0047) [2024-06-23 20:35:43,596][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510950_8371404800.pth... [2024-06-23 20:35:43,634][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510323_8361132032.pth [2024-06-23 20:35:46,877][15401] Updated weights for policy 0, policy_version 510960 (0.0035) [2024-06-23 20:35:48,390][15132] Fps is (10 sec: 37698.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8371585024. Throughput: 0: 43127.6. Samples: 8371778760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 20:35:51,556][15401] Updated weights for policy 0, policy_version 510970 (0.0035) [2024-06-23 20:35:53,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8371847168. Throughput: 0: 43096.6. Samples: 8371906340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 20:35:54,274][15401] Updated weights for policy 0, policy_version 510980 (0.0032) [2024-06-23 20:35:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8372011008. Throughput: 0: 42971.7. Samples: 8372162260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:35:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 20:35:59,116][15401] Updated weights for policy 0, policy_version 510990 (0.0028) [2024-06-23 20:36:02,366][15401] Updated weights for policy 0, policy_version 511000 (0.0021) [2024-06-23 20:36:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8372224000. Throughput: 0: 43056.1. Samples: 8372420560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:36:06,549][15401] Updated weights for policy 0, policy_version 511010 (0.0032) [2024-06-23 20:36:08,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8372486144. Throughput: 0: 43080.3. Samples: 8372550660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 20:36:10,275][15401] Updated weights for policy 0, policy_version 511020 (0.0028) [2024-06-23 20:36:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8372682752. Throughput: 0: 43023.1. Samples: 8372806500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:36:14,069][15401] Updated weights for policy 0, policy_version 511030 (0.0046) [2024-06-23 20:36:14,635][15349] Signal inference workers to stop experience collection... (124050 times) [2024-06-23 20:36:14,684][15401] InferenceWorker_p0-w0: stopping experience collection (124050 times) [2024-06-23 20:36:14,691][15349] Signal inference workers to resume experience collection... (124050 times) [2024-06-23 20:36:14,701][15401] InferenceWorker_p0-w0: resuming experience collection (124050 times) [2024-06-23 20:36:18,245][15401] Updated weights for policy 0, policy_version 511040 (0.0041) [2024-06-23 20:36:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8372879360. Throughput: 0: 43005.0. Samples: 8373065120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:18,390][15132] Avg episode reward: [(0, '0.088')] [2024-06-23 20:36:22,178][15401] Updated weights for policy 0, policy_version 511050 (0.0041) [2024-06-23 20:36:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42766.0). Total num frames: 8373125120. Throughput: 0: 43032.3. Samples: 8373191600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 20:36:25,794][15401] Updated weights for policy 0, policy_version 511060 (0.0029) [2024-06-23 20:36:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8373305344. Throughput: 0: 42871.6. Samples: 8373446940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:28,399][15132] Avg episode reward: [(0, '0.778')] [2024-06-23 20:36:29,641][15401] Updated weights for policy 0, policy_version 511070 (0.0028) [2024-06-23 20:36:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8373534720. Throughput: 0: 42704.6. Samples: 8373700460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 20:36:33,395][15401] Updated weights for policy 0, policy_version 511080 (0.0037) [2024-06-23 20:36:37,195][15401] Updated weights for policy 0, policy_version 511090 (0.0050) [2024-06-23 20:36:38,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42323.6, 300 sec: 42765.0). Total num frames: 8373747712. Throughput: 0: 42802.4. Samples: 8373832720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:38,397][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:36:41,044][15401] Updated weights for policy 0, policy_version 511100 (0.0038) [2024-06-23 20:36:43,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8373960704. Throughput: 0: 42795.3. Samples: 8374088060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 20:36:44,952][15401] Updated weights for policy 0, policy_version 511110 (0.0037) [2024-06-23 20:36:48,389][15132] Fps is (10 sec: 42625.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 8374173696. Throughput: 0: 42685.3. Samples: 8374341400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-23 20:36:48,475][15401] Updated weights for policy 0, policy_version 511120 (0.0033) [2024-06-23 20:36:52,451][15401] Updated weights for policy 0, policy_version 511130 (0.0024) [2024-06-23 20:36:53,393][15132] Fps is (10 sec: 42585.9, 60 sec: 42323.1, 300 sec: 42764.6). Total num frames: 8374386688. Throughput: 0: 42748.8. Samples: 8374474480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:53,393][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 20:36:56,095][15401] Updated weights for policy 0, policy_version 511140 (0.0041) [2024-06-23 20:36:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8374616064. Throughput: 0: 42839.1. Samples: 8374734260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:36:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 20:37:00,081][15401] Updated weights for policy 0, policy_version 511150 (0.0022) [2024-06-23 20:37:03,389][15132] Fps is (10 sec: 44250.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8374829056. Throughput: 0: 42675.6. Samples: 8374985520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 20:37:03,586][15401] Updated weights for policy 0, policy_version 511160 (0.0034) [2024-06-23 20:37:07,554][15401] Updated weights for policy 0, policy_version 511170 (0.0043) [2024-06-23 20:37:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 8375042048. Throughput: 0: 42863.5. Samples: 8375120460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 20:37:11,183][15401] Updated weights for policy 0, policy_version 511180 (0.0047) [2024-06-23 20:37:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8375271424. Throughput: 0: 42844.5. Samples: 8375374940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:13,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 20:37:15,407][15401] Updated weights for policy 0, policy_version 511190 (0.0035) [2024-06-23 20:37:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8375468032. Throughput: 0: 42990.5. Samples: 8375635040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 20:37:18,819][15401] Updated weights for policy 0, policy_version 511200 (0.0031) [2024-06-23 20:37:23,017][15401] Updated weights for policy 0, policy_version 511210 (0.0036) [2024-06-23 20:37:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 8375681024. Throughput: 0: 42882.3. Samples: 8375762160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:23,391][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 20:37:26,438][15401] Updated weights for policy 0, policy_version 511220 (0.0032) [2024-06-23 20:37:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8375910400. Throughput: 0: 42788.2. Samples: 8376013520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 20:37:29,214][15349] Signal inference workers to stop experience collection... (124100 times) [2024-06-23 20:37:29,215][15349] Signal inference workers to resume experience collection... (124100 times) [2024-06-23 20:37:29,253][15401] InferenceWorker_p0-w0: stopping experience collection (124100 times) [2024-06-23 20:37:29,253][15401] InferenceWorker_p0-w0: resuming experience collection (124100 times) [2024-06-23 20:37:30,629][15401] Updated weights for policy 0, policy_version 511230 (0.0048) [2024-06-23 20:37:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 8376090624. Throughput: 0: 42847.4. Samples: 8376269540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 20:37:34,361][15401] Updated weights for policy 0, policy_version 511240 (0.0045) [2024-06-23 20:37:38,137][15401] Updated weights for policy 0, policy_version 511250 (0.0048) [2024-06-23 20:37:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 8376320000. Throughput: 0: 42720.7. Samples: 8376396780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 20:37:41,969][15401] Updated weights for policy 0, policy_version 511260 (0.0034) [2024-06-23 20:37:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8376532992. Throughput: 0: 42657.4. Samples: 8376653840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 20:37:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511264_8376549376.pth... [2024-06-23 20:37:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510636_8366260224.pth [2024-06-23 20:37:45,586][15401] Updated weights for policy 0, policy_version 511270 (0.0024) [2024-06-23 20:37:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8376745984. Throughput: 0: 42934.1. Samples: 8376917560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 20:37:49,364][15401] Updated weights for policy 0, policy_version 511280 (0.0036) [2024-06-23 20:37:52,938][15401] Updated weights for policy 0, policy_version 511290 (0.0031) [2024-06-23 20:37:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.7, 300 sec: 42820.6). Total num frames: 8376975360. Throughput: 0: 42752.3. Samples: 8377044320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 20:37:56,873][15401] Updated weights for policy 0, policy_version 511300 (0.0035) [2024-06-23 20:37:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42766.0). Total num frames: 8377171968. Throughput: 0: 42791.2. Samples: 8377300540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:37:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 20:38:00,538][15401] Updated weights for policy 0, policy_version 511310 (0.0024) [2024-06-23 20:38:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8377384960. Throughput: 0: 42837.8. Samples: 8377562740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:38:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 20:38:04,614][15401] Updated weights for policy 0, policy_version 511320 (0.0029) [2024-06-23 20:38:08,366][15401] Updated weights for policy 0, policy_version 511330 (0.0032) [2024-06-23 20:38:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8377630720. Throughput: 0: 42789.5. Samples: 8377687680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:38:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-23 20:38:12,283][15401] Updated weights for policy 0, policy_version 511340 (0.0030) [2024-06-23 20:38:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 8377827328. Throughput: 0: 42920.5. Samples: 8377944940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:38:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 20:38:16,435][15401] Updated weights for policy 0, policy_version 511350 (0.0039) [2024-06-23 20:38:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8378023936. Throughput: 0: 43009.0. Samples: 8378204940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:38:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 20:38:20,113][15401] Updated weights for policy 0, policy_version 511360 (0.0024) [2024-06-23 20:38:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8378269696. Throughput: 0: 42946.7. Samples: 8378329380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-23 20:38:23,400][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 20:38:23,928][15401] Updated weights for policy 0, policy_version 511370 (0.0029) [2024-06-23 20:38:27,806][15401] Updated weights for policy 0, policy_version 511380 (0.0039) [2024-06-23 20:38:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8378466304. Throughput: 0: 43002.6. Samples: 8378588960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:28,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 20:38:31,616][15401] Updated weights for policy 0, policy_version 511390 (0.0037) [2024-06-23 20:38:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8378679296. Throughput: 0: 42979.5. Samples: 8378851640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 20:38:35,345][15401] Updated weights for policy 0, policy_version 511400 (0.0035) [2024-06-23 20:38:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8378908672. Throughput: 0: 42922.2. Samples: 8378975820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 20:38:39,051][15401] Updated weights for policy 0, policy_version 511410 (0.0040) [2024-06-23 20:38:43,051][15401] Updated weights for policy 0, policy_version 511420 (0.0038) [2024-06-23 20:38:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 8379121664. Throughput: 0: 42894.2. Samples: 8379230780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:38:46,891][15401] Updated weights for policy 0, policy_version 511430 (0.0039) [2024-06-23 20:38:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8379318272. Throughput: 0: 42784.9. Samples: 8379488060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 20:38:51,087][15401] Updated weights for policy 0, policy_version 511440 (0.0039) [2024-06-23 20:38:51,478][15349] Signal inference workers to stop experience collection... (124150 times) [2024-06-23 20:38:51,485][15349] Signal inference workers to resume experience collection... (124150 times) [2024-06-23 20:38:51,492][15401] InferenceWorker_p0-w0: stopping experience collection (124150 times) [2024-06-23 20:38:51,516][15401] InferenceWorker_p0-w0: resuming experience collection (124150 times) [2024-06-23 20:38:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8379531264. Throughput: 0: 42868.5. Samples: 8379616760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 20:38:54,533][15401] Updated weights for policy 0, policy_version 511450 (0.0035) [2024-06-23 20:38:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8379727872. Throughput: 0: 42728.4. Samples: 8379867720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:38:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:38:58,854][15401] Updated weights for policy 0, policy_version 511460 (0.0033) [2024-06-23 20:39:02,106][15401] Updated weights for policy 0, policy_version 511470 (0.0042) [2024-06-23 20:39:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8379973632. Throughput: 0: 42726.1. Samples: 8380127620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 20:39:06,519][15401] Updated weights for policy 0, policy_version 511480 (0.0031) [2024-06-23 20:39:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8380170240. Throughput: 0: 42997.3. Samples: 8380264260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 20:39:09,801][15401] Updated weights for policy 0, policy_version 511490 (0.0021) [2024-06-23 20:39:13,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8380383232. Throughput: 0: 42726.8. Samples: 8380511660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 20:39:14,031][15401] Updated weights for policy 0, policy_version 511500 (0.0023) [2024-06-23 20:39:17,492][15401] Updated weights for policy 0, policy_version 511510 (0.0028) [2024-06-23 20:39:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8380596224. Throughput: 0: 42515.3. Samples: 8380764820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 20:39:21,756][15401] Updated weights for policy 0, policy_version 511520 (0.0044) [2024-06-23 20:39:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8380825600. Throughput: 0: 42654.3. Samples: 8380895260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:39:25,225][15401] Updated weights for policy 0, policy_version 511530 (0.0038) [2024-06-23 20:39:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8381022208. Throughput: 0: 42560.0. Samples: 8381145980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 20:39:29,604][15401] Updated weights for policy 0, policy_version 511540 (0.0027) [2024-06-23 20:39:32,954][15401] Updated weights for policy 0, policy_version 511550 (0.0050) [2024-06-23 20:39:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8381251584. Throughput: 0: 42498.8. Samples: 8381400500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 20:39:37,071][15401] Updated weights for policy 0, policy_version 511560 (0.0031) [2024-06-23 20:39:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 8381448192. Throughput: 0: 42542.3. Samples: 8381531160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 20:39:40,462][15401] Updated weights for policy 0, policy_version 511570 (0.0037) [2024-06-23 20:39:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8381677568. Throughput: 0: 42717.9. Samples: 8381790020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511577_8381677568.pth... [2024-06-23 20:39:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000510950_8371404800.pth [2024-06-23 20:39:44,651][15401] Updated weights for policy 0, policy_version 511580 (0.0043) [2024-06-23 20:39:47,983][15401] Updated weights for policy 0, policy_version 511590 (0.0033) [2024-06-23 20:39:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8381890560. Throughput: 0: 42574.8. Samples: 8382043480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 20:39:52,151][15401] Updated weights for policy 0, policy_version 511600 (0.0038) [2024-06-23 20:39:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8382103552. Throughput: 0: 42612.8. Samples: 8382181840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 20:39:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 20:39:55,495][15401] Updated weights for policy 0, policy_version 511610 (0.0028) [2024-06-23 20:39:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8382300160. Throughput: 0: 42743.5. Samples: 8382435120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:39:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 20:39:59,639][15401] Updated weights for policy 0, policy_version 511620 (0.0029) [2024-06-23 20:40:03,020][15401] Updated weights for policy 0, policy_version 511630 (0.0033) [2024-06-23 20:40:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8382545920. Throughput: 0: 42710.5. Samples: 8382686800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:03,400][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 20:40:07,622][15401] Updated weights for policy 0, policy_version 511640 (0.0040) [2024-06-23 20:40:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8382726144. Throughput: 0: 42766.3. Samples: 8382819740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 20:40:09,695][15349] Signal inference workers to stop experience collection... (124200 times) [2024-06-23 20:40:09,700][15349] Signal inference workers to resume experience collection... (124200 times) [2024-06-23 20:40:09,741][15401] InferenceWorker_p0-w0: stopping experience collection (124200 times) [2024-06-23 20:40:09,742][15401] InferenceWorker_p0-w0: resuming experience collection (124200 times) [2024-06-23 20:40:11,282][15401] Updated weights for policy 0, policy_version 511650 (0.0022) [2024-06-23 20:40:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8382955520. Throughput: 0: 42754.7. Samples: 8383069940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 20:40:15,071][15401] Updated weights for policy 0, policy_version 511660 (0.0032) [2024-06-23 20:40:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8383168512. Throughput: 0: 43007.0. Samples: 8383335820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 20:40:18,793][15401] Updated weights for policy 0, policy_version 511670 (0.0028) [2024-06-23 20:40:22,660][15401] Updated weights for policy 0, policy_version 511680 (0.0033) [2024-06-23 20:40:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8383381504. Throughput: 0: 42990.6. Samples: 8383465740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:23,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 20:40:26,428][15401] Updated weights for policy 0, policy_version 511690 (0.0042) [2024-06-23 20:40:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8383610880. Throughput: 0: 42900.7. Samples: 8383720560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 20:40:30,291][15401] Updated weights for policy 0, policy_version 511700 (0.0037) [2024-06-23 20:40:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 8383807488. Throughput: 0: 43045.8. Samples: 8383980540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 20:40:34,045][15401] Updated weights for policy 0, policy_version 511710 (0.0037) [2024-06-23 20:40:37,699][15401] Updated weights for policy 0, policy_version 511720 (0.0027) [2024-06-23 20:40:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8384036864. Throughput: 0: 42823.5. Samples: 8384108900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:40:41,563][15401] Updated weights for policy 0, policy_version 511730 (0.0034) [2024-06-23 20:40:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 8384266240. Throughput: 0: 42935.0. Samples: 8384367300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:43,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:40:45,255][15401] Updated weights for policy 0, policy_version 511740 (0.0031) [2024-06-23 20:40:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8384462848. Throughput: 0: 43172.0. Samples: 8384629540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 20:40:49,291][15401] Updated weights for policy 0, policy_version 511750 (0.0028) [2024-06-23 20:40:53,097][15401] Updated weights for policy 0, policy_version 511760 (0.0034) [2024-06-23 20:40:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8384675840. Throughput: 0: 42995.9. Samples: 8384754560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 20:40:56,839][15401] Updated weights for policy 0, policy_version 511770 (0.0035) [2024-06-23 20:40:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8384905216. Throughput: 0: 43107.5. Samples: 8385009780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:40:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 20:41:00,758][15401] Updated weights for policy 0, policy_version 511780 (0.0042) [2024-06-23 20:41:03,390][15132] Fps is (10 sec: 40957.9, 60 sec: 42325.0, 300 sec: 42709.4). Total num frames: 8385085440. Throughput: 0: 43040.8. Samples: 8385272680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:41:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 20:41:04,526][15401] Updated weights for policy 0, policy_version 511790 (0.0030) [2024-06-23 20:41:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8385314816. Throughput: 0: 42830.2. Samples: 8385393100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:41:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 20:41:08,563][15401] Updated weights for policy 0, policy_version 511800 (0.0040) [2024-06-23 20:41:12,156][15401] Updated weights for policy 0, policy_version 511810 (0.0036) [2024-06-23 20:41:13,390][15132] Fps is (10 sec: 45877.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8385544192. Throughput: 0: 42815.2. Samples: 8385647240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:41:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 20:41:16,169][15401] Updated weights for policy 0, policy_version 511820 (0.0037) [2024-06-23 20:41:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8385724416. Throughput: 0: 42828.4. Samples: 8385907820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 20:41:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 20:41:19,899][15401] Updated weights for policy 0, policy_version 511830 (0.0033) [2024-06-23 20:41:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8385953792. Throughput: 0: 42738.3. Samples: 8386032120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 20:41:23,785][15401] Updated weights for policy 0, policy_version 511840 (0.0038) [2024-06-23 20:41:27,572][15401] Updated weights for policy 0, policy_version 511850 (0.0033) [2024-06-23 20:41:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8386183168. Throughput: 0: 42792.5. Samples: 8386292860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 20:41:31,397][15401] Updated weights for policy 0, policy_version 511860 (0.0041) [2024-06-23 20:41:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42765.9). Total num frames: 8386363392. Throughput: 0: 42719.0. Samples: 8386551900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:33,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 20:41:35,390][15401] Updated weights for policy 0, policy_version 511870 (0.0032) [2024-06-23 20:41:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8386576384. Throughput: 0: 42730.6. Samples: 8386677440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:38,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 20:41:39,107][15401] Updated weights for policy 0, policy_version 511880 (0.0036) [2024-06-23 20:41:43,108][15401] Updated weights for policy 0, policy_version 511890 (0.0028) [2024-06-23 20:41:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 8386805760. Throughput: 0: 42785.0. Samples: 8386935100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 20:41:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511891_8386822144.pth... [2024-06-23 20:41:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511264_8376549376.pth [2024-06-23 20:41:46,931][15401] Updated weights for policy 0, policy_version 511900 (0.0020) [2024-06-23 20:41:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 8387018752. Throughput: 0: 42472.4. Samples: 8387183920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 20:41:50,895][15401] Updated weights for policy 0, policy_version 511910 (0.0028) [2024-06-23 20:41:52,367][15349] Signal inference workers to stop experience collection... (124250 times) [2024-06-23 20:41:52,372][15349] Signal inference workers to resume experience collection... (124250 times) [2024-06-23 20:41:52,409][15401] InferenceWorker_p0-w0: stopping experience collection (124250 times) [2024-06-23 20:41:52,409][15401] InferenceWorker_p0-w0: resuming experience collection (124250 times) [2024-06-23 20:41:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8387231744. Throughput: 0: 42643.7. Samples: 8387312060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 20:41:54,514][15401] Updated weights for policy 0, policy_version 511920 (0.0035) [2024-06-23 20:41:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8387428352. Throughput: 0: 42598.4. Samples: 8387564160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:41:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 20:41:58,627][15401] Updated weights for policy 0, policy_version 511930 (0.0037) [2024-06-23 20:42:02,749][15401] Updated weights for policy 0, policy_version 511940 (0.0036) [2024-06-23 20:42:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.9, 300 sec: 42820.5). Total num frames: 8387674112. Throughput: 0: 42340.9. Samples: 8387813160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:03,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-23 20:42:06,169][15401] Updated weights for policy 0, policy_version 511950 (0.0038) [2024-06-23 20:42:08,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 8387870720. Throughput: 0: 42418.9. Samples: 8387940980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 20:42:10,433][15401] Updated weights for policy 0, policy_version 511960 (0.0045) [2024-06-23 20:42:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8388067328. Throughput: 0: 42304.1. Samples: 8388196540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 20:42:14,215][15401] Updated weights for policy 0, policy_version 511970 (0.0034) [2024-06-23 20:42:18,009][15401] Updated weights for policy 0, policy_version 511980 (0.0039) [2024-06-23 20:42:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8388280320. Throughput: 0: 42110.3. Samples: 8388446860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 20:42:21,780][15401] Updated weights for policy 0, policy_version 511990 (0.0042) [2024-06-23 20:42:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8388493312. Throughput: 0: 42254.0. Samples: 8388578860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 20:42:25,730][15401] Updated weights for policy 0, policy_version 512000 (0.0031) [2024-06-23 20:42:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8388722688. Throughput: 0: 42159.5. Samples: 8388832280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 20:42:29,430][15401] Updated weights for policy 0, policy_version 512010 (0.0036) [2024-06-23 20:42:33,278][15401] Updated weights for policy 0, policy_version 512020 (0.0028) [2024-06-23 20:42:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8388935680. Throughput: 0: 42415.1. Samples: 8389092600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 20:42:36,911][15401] Updated weights for policy 0, policy_version 512030 (0.0030) [2024-06-23 20:42:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 8389132288. Throughput: 0: 42506.3. Samples: 8389224840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:38,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 20:42:40,914][15401] Updated weights for policy 0, policy_version 512040 (0.0035) [2024-06-23 20:42:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8389361664. Throughput: 0: 42504.7. Samples: 8389476880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 20:42:44,697][15401] Updated weights for policy 0, policy_version 512050 (0.0041) [2024-06-23 20:42:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8389574656. Throughput: 0: 42727.6. Samples: 8389735900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:42:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 20:42:48,523][15401] Updated weights for policy 0, policy_version 512060 (0.0049) [2024-06-23 20:42:52,490][15401] Updated weights for policy 0, policy_version 512070 (0.0041) [2024-06-23 20:42:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8389771264. Throughput: 0: 42745.4. Samples: 8389864520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:42:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 20:42:56,253][15401] Updated weights for policy 0, policy_version 512080 (0.0045) [2024-06-23 20:42:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8390000640. Throughput: 0: 42721.8. Samples: 8390119020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:42:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 20:43:00,106][15401] Updated weights for policy 0, policy_version 512090 (0.0032) [2024-06-23 20:43:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8390197248. Throughput: 0: 43132.0. Samples: 8390387800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 20:43:03,919][15401] Updated weights for policy 0, policy_version 512100 (0.0033) [2024-06-23 20:43:07,627][15401] Updated weights for policy 0, policy_version 512110 (0.0036) [2024-06-23 20:43:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8390426624. Throughput: 0: 42803.1. Samples: 8390505000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 20:43:11,491][15401] Updated weights for policy 0, policy_version 512120 (0.0038) [2024-06-23 20:43:13,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 8390656000. Throughput: 0: 42873.7. Samples: 8390761700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:13,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 20:43:15,282][15401] Updated weights for policy 0, policy_version 512130 (0.0037) [2024-06-23 20:43:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 8390836224. Throughput: 0: 43006.0. Samples: 8391027860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 20:43:19,037][15401] Updated weights for policy 0, policy_version 512140 (0.0044) [2024-06-23 20:43:20,694][15349] Signal inference workers to stop experience collection... (124300 times) [2024-06-23 20:43:20,694][15349] Signal inference workers to resume experience collection... (124300 times) [2024-06-23 20:43:20,745][15401] InferenceWorker_p0-w0: stopping experience collection (124300 times) [2024-06-23 20:43:20,745][15401] InferenceWorker_p0-w0: resuming experience collection (124300 times) [2024-06-23 20:43:23,091][15401] Updated weights for policy 0, policy_version 512150 (0.0031) [2024-06-23 20:43:23,392][15132] Fps is (10 sec: 40959.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8391065600. Throughput: 0: 42710.9. Samples: 8391146940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:23,401][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 20:43:26,706][15401] Updated weights for policy 0, policy_version 512160 (0.0041) [2024-06-23 20:43:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8391294976. Throughput: 0: 42847.7. Samples: 8391405020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:28,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-23 20:43:30,472][15401] Updated weights for policy 0, policy_version 512170 (0.0031) [2024-06-23 20:43:33,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8391507968. Throughput: 0: 42956.1. Samples: 8391668920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 20:43:34,107][15401] Updated weights for policy 0, policy_version 512180 (0.0041) [2024-06-23 20:43:38,231][15401] Updated weights for policy 0, policy_version 512190 (0.0042) [2024-06-23 20:43:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8391720960. Throughput: 0: 42801.7. Samples: 8391790600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:38,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 20:43:41,970][15401] Updated weights for policy 0, policy_version 512200 (0.0040) [2024-06-23 20:43:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8391933952. Throughput: 0: 42902.0. Samples: 8392049620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 20:43:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512204_8391950336.pth... [2024-06-23 20:43:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511577_8381677568.pth [2024-06-23 20:43:45,859][15401] Updated weights for policy 0, policy_version 512210 (0.0034) [2024-06-23 20:43:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8392114176. Throughput: 0: 42841.8. Samples: 8392315680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 20:43:49,482][15401] Updated weights for policy 0, policy_version 512220 (0.0033) [2024-06-23 20:43:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8392359936. Throughput: 0: 42890.2. Samples: 8392435060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 20:43:53,514][15401] Updated weights for policy 0, policy_version 512230 (0.0030) [2024-06-23 20:43:57,519][15401] Updated weights for policy 0, policy_version 512240 (0.0033) [2024-06-23 20:43:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8392556544. Throughput: 0: 42898.8. Samples: 8392692040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:43:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 20:44:01,143][15401] Updated weights for policy 0, policy_version 512250 (0.0036) [2024-06-23 20:44:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8392753152. Throughput: 0: 42733.8. Samples: 8392950880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:44:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 20:44:05,166][15401] Updated weights for policy 0, policy_version 512260 (0.0042) [2024-06-23 20:44:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8392998912. Throughput: 0: 42912.1. Samples: 8393077880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:44:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 20:44:08,861][15401] Updated weights for policy 0, policy_version 512270 (0.0029) [2024-06-23 20:44:12,744][15401] Updated weights for policy 0, policy_version 512280 (0.0043) [2024-06-23 20:44:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8393211904. Throughput: 0: 42875.9. Samples: 8393334440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:44:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 20:44:16,702][15401] Updated weights for policy 0, policy_version 512290 (0.0033) [2024-06-23 20:44:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8393408512. Throughput: 0: 42740.7. Samples: 8393592260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 20:44:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 20:44:20,359][15401] Updated weights for policy 0, policy_version 512300 (0.0050) [2024-06-23 20:44:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 8393654272. Throughput: 0: 42824.1. Samples: 8393717680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 20:44:24,204][15401] Updated weights for policy 0, policy_version 512310 (0.0046) [2024-06-23 20:44:27,977][15401] Updated weights for policy 0, policy_version 512320 (0.0024) [2024-06-23 20:44:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8393850880. Throughput: 0: 42720.1. Samples: 8393972020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:28,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 20:44:31,706][15401] Updated weights for policy 0, policy_version 512330 (0.0033) [2024-06-23 20:44:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 8394047488. Throughput: 0: 42500.0. Samples: 8394228280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:33,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 20:44:35,762][15401] Updated weights for policy 0, policy_version 512340 (0.0033) [2024-06-23 20:44:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8394276864. Throughput: 0: 42635.1. Samples: 8394353640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 20:44:39,840][15401] Updated weights for policy 0, policy_version 512350 (0.0044) [2024-06-23 20:44:43,367][15401] Updated weights for policy 0, policy_version 512360 (0.0043) [2024-06-23 20:44:43,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8394506240. Throughput: 0: 42685.3. Samples: 8394612880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 20:44:47,224][15349] Signal inference workers to stop experience collection... (124350 times) [2024-06-23 20:44:47,224][15349] Signal inference workers to resume experience collection... (124350 times) [2024-06-23 20:44:47,265][15401] InferenceWorker_p0-w0: stopping experience collection (124350 times) [2024-06-23 20:44:47,265][15401] InferenceWorker_p0-w0: resuming experience collection (124350 times) [2024-06-23 20:44:47,364][15401] Updated weights for policy 0, policy_version 512370 (0.0035) [2024-06-23 20:44:48,390][15132] Fps is (10 sec: 42594.7, 60 sec: 43144.0, 300 sec: 42709.4). Total num frames: 8394702848. Throughput: 0: 42633.4. Samples: 8394869420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:48,391][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 20:44:50,925][15401] Updated weights for policy 0, policy_version 512380 (0.0038) [2024-06-23 20:44:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8394915840. Throughput: 0: 42530.7. Samples: 8394991760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:53,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 20:44:54,902][15401] Updated weights for policy 0, policy_version 512390 (0.0031) [2024-06-23 20:44:58,390][15132] Fps is (10 sec: 42601.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8395128832. Throughput: 0: 42628.4. Samples: 8395252720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:44:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 20:44:58,555][15401] Updated weights for policy 0, policy_version 512400 (0.0028) [2024-06-23 20:45:03,054][15401] Updated weights for policy 0, policy_version 512410 (0.0045) [2024-06-23 20:45:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8395325440. Throughput: 0: 42521.0. Samples: 8395505700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 20:45:06,751][15401] Updated weights for policy 0, policy_version 512420 (0.0041) [2024-06-23 20:45:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8395538432. Throughput: 0: 42437.7. Samples: 8395627380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:08,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-23 20:45:10,674][15401] Updated weights for policy 0, policy_version 512430 (0.0040) [2024-06-23 20:45:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8395767808. Throughput: 0: 42588.9. Samples: 8395888520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 20:45:14,182][15401] Updated weights for policy 0, policy_version 512440 (0.0040) [2024-06-23 20:45:18,190][15401] Updated weights for policy 0, policy_version 512450 (0.0031) [2024-06-23 20:45:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8395980800. Throughput: 0: 42604.0. Samples: 8396145360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 20:45:21,581][15401] Updated weights for policy 0, policy_version 512460 (0.0039) [2024-06-23 20:45:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 8396193792. Throughput: 0: 42557.7. Samples: 8396268740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-23 20:45:25,969][15401] Updated weights for policy 0, policy_version 512470 (0.0028) [2024-06-23 20:45:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8396406784. Throughput: 0: 42606.1. Samples: 8396530160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 20:45:29,556][15401] Updated weights for policy 0, policy_version 512480 (0.0034) [2024-06-23 20:45:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 8396603392. Throughput: 0: 42574.1. Samples: 8396785220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 20:45:33,691][15401] Updated weights for policy 0, policy_version 512490 (0.0035) [2024-06-23 20:45:37,070][15401] Updated weights for policy 0, policy_version 512500 (0.0024) [2024-06-23 20:45:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 8396849152. Throughput: 0: 42640.4. Samples: 8396910580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 20:45:41,347][15401] Updated weights for policy 0, policy_version 512510 (0.0021) [2024-06-23 20:45:43,394][15132] Fps is (10 sec: 45853.1, 60 sec: 42594.9, 300 sec: 42708.8). Total num frames: 8397062144. Throughput: 0: 42715.9. Samples: 8397175140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-23 20:45:43,395][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 20:45:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512516_8397062144.pth... [2024-06-23 20:45:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000511891_8386822144.pth [2024-06-23 20:45:44,538][15401] Updated weights for policy 0, policy_version 512520 (0.0042) [2024-06-23 20:45:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.9, 300 sec: 42598.4). Total num frames: 8397242368. Throughput: 0: 42880.4. Samples: 8397435320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:45:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 20:45:48,992][15401] Updated weights for policy 0, policy_version 512530 (0.0037) [2024-06-23 20:45:52,309][15401] Updated weights for policy 0, policy_version 512540 (0.0031) [2024-06-23 20:45:53,389][15132] Fps is (10 sec: 42619.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8397488128. Throughput: 0: 42952.6. Samples: 8397560240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:45:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 20:45:56,487][15401] Updated weights for policy 0, policy_version 512550 (0.0038) [2024-06-23 20:45:58,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8397717504. Throughput: 0: 42888.8. Samples: 8397818520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:45:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 20:45:59,717][15349] Signal inference workers to stop experience collection... (124400 times) [2024-06-23 20:45:59,718][15349] Signal inference workers to resume experience collection... (124400 times) [2024-06-23 20:45:59,760][15401] InferenceWorker_p0-w0: stopping experience collection (124400 times) [2024-06-23 20:45:59,760][15401] InferenceWorker_p0-w0: resuming experience collection (124400 times) [2024-06-23 20:45:59,886][15401] Updated weights for policy 0, policy_version 512560 (0.0036) [2024-06-23 20:46:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8397897728. Throughput: 0: 43125.5. Samples: 8398086000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 20:46:04,023][15401] Updated weights for policy 0, policy_version 512570 (0.0025) [2024-06-23 20:46:07,428][15401] Updated weights for policy 0, policy_version 512580 (0.0034) [2024-06-23 20:46:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8398127104. Throughput: 0: 42997.3. Samples: 8398203620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 20:46:11,963][15401] Updated weights for policy 0, policy_version 512590 (0.0033) [2024-06-23 20:46:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8398356480. Throughput: 0: 42895.1. Samples: 8398460440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 20:46:15,006][15401] Updated weights for policy 0, policy_version 512600 (0.0031) [2024-06-23 20:46:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8398536704. Throughput: 0: 43031.2. Samples: 8398721620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 20:46:19,398][15401] Updated weights for policy 0, policy_version 512610 (0.0042) [2024-06-23 20:46:22,658][15401] Updated weights for policy 0, policy_version 512620 (0.0028) [2024-06-23 20:46:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8398782464. Throughput: 0: 43018.2. Samples: 8398846400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 20:46:26,863][15401] Updated weights for policy 0, policy_version 512630 (0.0029) [2024-06-23 20:46:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8398995456. Throughput: 0: 42907.7. Samples: 8399105780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 20:46:30,230][15401] Updated weights for policy 0, policy_version 512640 (0.0033) [2024-06-23 20:46:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8399175680. Throughput: 0: 43161.0. Samples: 8399377560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 20:46:34,232][15401] Updated weights for policy 0, policy_version 512650 (0.0031) [2024-06-23 20:46:37,783][15401] Updated weights for policy 0, policy_version 512660 (0.0027) [2024-06-23 20:46:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8399421440. Throughput: 0: 42955.9. Samples: 8399493260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:38,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-23 20:46:41,794][15401] Updated weights for policy 0, policy_version 512670 (0.0040) [2024-06-23 20:46:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42875.0, 300 sec: 42765.0). Total num frames: 8399634432. Throughput: 0: 43009.9. Samples: 8399753960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 20:46:45,179][15401] Updated weights for policy 0, policy_version 512680 (0.0047) [2024-06-23 20:46:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8399831040. Throughput: 0: 43023.0. Samples: 8400022040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 20:46:49,607][15401] Updated weights for policy 0, policy_version 512690 (0.0025) [2024-06-23 20:46:53,057][15401] Updated weights for policy 0, policy_version 512700 (0.0036) [2024-06-23 20:46:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8400076800. Throughput: 0: 43085.8. Samples: 8400142480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:53,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 20:46:57,109][15401] Updated weights for policy 0, policy_version 512710 (0.0039) [2024-06-23 20:46:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8400257024. Throughput: 0: 43099.7. Samples: 8400399920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:46:58,398][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 20:47:00,696][15401] Updated weights for policy 0, policy_version 512720 (0.0037) [2024-06-23 20:47:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8400453632. Throughput: 0: 43120.0. Samples: 8400662020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:47:03,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-23 20:47:04,749][15401] Updated weights for policy 0, policy_version 512730 (0.0038) [2024-06-23 20:47:08,267][15401] Updated weights for policy 0, policy_version 512740 (0.0037) [2024-06-23 20:47:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8400732160. Throughput: 0: 43088.0. Samples: 8400785360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:47:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 20:47:12,544][15401] Updated weights for policy 0, policy_version 512750 (0.0042) [2024-06-23 20:47:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8400912384. Throughput: 0: 43013.0. Samples: 8401041360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 20:47:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 20:47:15,878][15401] Updated weights for policy 0, policy_version 512760 (0.0043) [2024-06-23 20:47:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8401108992. Throughput: 0: 42854.3. Samples: 8401306000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:18,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 20:47:20,024][15349] Signal inference workers to stop experience collection... (124450 times) [2024-06-23 20:47:20,027][15349] Signal inference workers to resume experience collection... (124450 times) [2024-06-23 20:47:20,051][15401] InferenceWorker_p0-w0: stopping experience collection (124450 times) [2024-06-23 20:47:20,051][15401] InferenceWorker_p0-w0: resuming experience collection (124450 times) [2024-06-23 20:47:20,181][15401] Updated weights for policy 0, policy_version 512770 (0.0034) [2024-06-23 20:47:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8401354752. Throughput: 0: 43089.4. Samples: 8401432280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:23,400][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 20:47:23,921][15401] Updated weights for policy 0, policy_version 512780 (0.0035) [2024-06-23 20:47:27,811][15401] Updated weights for policy 0, policy_version 512790 (0.0028) [2024-06-23 20:47:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8401567744. Throughput: 0: 42972.4. Samples: 8401687820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:28,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 20:47:31,457][15401] Updated weights for policy 0, policy_version 512800 (0.0027) [2024-06-23 20:47:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8401764352. Throughput: 0: 42724.1. Samples: 8401944620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 20:47:35,520][15401] Updated weights for policy 0, policy_version 512810 (0.0026) [2024-06-23 20:47:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8402010112. Throughput: 0: 42986.7. Samples: 8402076880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 20:47:39,030][15401] Updated weights for policy 0, policy_version 512820 (0.0033) [2024-06-23 20:47:43,070][15401] Updated weights for policy 0, policy_version 512830 (0.0028) [2024-06-23 20:47:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8402223104. Throughput: 0: 43054.5. Samples: 8402337380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:43,396][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 20:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512831_8402223104.pth... [2024-06-23 20:47:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512204_8391950336.pth [2024-06-23 20:47:46,731][15401] Updated weights for policy 0, policy_version 512840 (0.0033) [2024-06-23 20:47:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8402419712. Throughput: 0: 42811.4. Samples: 8402588540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 20:47:50,587][15401] Updated weights for policy 0, policy_version 512850 (0.0034) [2024-06-23 20:47:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 8402649088. Throughput: 0: 42908.4. Samples: 8402716340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:53,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 20:47:54,167][15401] Updated weights for policy 0, policy_version 512860 (0.0043) [2024-06-23 20:47:58,059][15401] Updated weights for policy 0, policy_version 512870 (0.0032) [2024-06-23 20:47:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 8402878464. Throughput: 0: 43129.7. Samples: 8402982200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:47:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 20:48:01,652][15401] Updated weights for policy 0, policy_version 512880 (0.0042) [2024-06-23 20:48:03,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 8403075072. Throughput: 0: 42837.1. Samples: 8403233680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 20:48:05,831][15401] Updated weights for policy 0, policy_version 512890 (0.0024) [2024-06-23 20:48:08,393][15132] Fps is (10 sec: 40946.5, 60 sec: 42596.1, 300 sec: 42820.4). Total num frames: 8403288064. Throughput: 0: 42899.5. Samples: 8403362900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:08,393][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 20:48:09,685][15401] Updated weights for policy 0, policy_version 512900 (0.0037) [2024-06-23 20:48:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8403501056. Throughput: 0: 42974.2. Samples: 8403621560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 20:48:13,447][15401] Updated weights for policy 0, policy_version 512910 (0.0039) [2024-06-23 20:48:17,602][15401] Updated weights for policy 0, policy_version 512920 (0.0041) [2024-06-23 20:48:18,389][15132] Fps is (10 sec: 44251.6, 60 sec: 43690.6, 300 sec: 42932.0). Total num frames: 8403730432. Throughput: 0: 42832.0. Samples: 8403872060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 20:48:21,515][15401] Updated weights for policy 0, policy_version 512930 (0.0031) [2024-06-23 20:48:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8403927040. Throughput: 0: 42778.7. Samples: 8404001920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 20:48:25,273][15401] Updated weights for policy 0, policy_version 512940 (0.0027) [2024-06-23 20:48:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 8404140032. Throughput: 0: 42683.9. Samples: 8404258160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 20:48:29,077][15401] Updated weights for policy 0, policy_version 512950 (0.0034) [2024-06-23 20:48:32,784][15401] Updated weights for policy 0, policy_version 512960 (0.0036) [2024-06-23 20:48:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8404353024. Throughput: 0: 42852.3. Samples: 8404516900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 20:48:36,687][15401] Updated weights for policy 0, policy_version 512970 (0.0042) [2024-06-23 20:48:38,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8404566016. Throughput: 0: 42862.0. Samples: 8404645020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 20:48:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-23 20:48:40,555][15401] Updated weights for policy 0, policy_version 512980 (0.0031) [2024-06-23 20:48:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 8404762624. Throughput: 0: 42541.8. Samples: 8404896580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:48:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 20:48:44,353][15401] Updated weights for policy 0, policy_version 512990 (0.0041) [2024-06-23 20:48:48,102][15401] Updated weights for policy 0, policy_version 513000 (0.0036) [2024-06-23 20:48:48,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8404992000. Throughput: 0: 42631.7. Samples: 8405152200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:48:48,392][15132] Avg episode reward: [(0, '0.239')] [2024-06-23 20:48:52,177][15401] Updated weights for policy 0, policy_version 513010 (0.0035) [2024-06-23 20:48:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 8405204992. Throughput: 0: 42667.5. Samples: 8405282800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:48:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 20:48:56,139][15401] Updated weights for policy 0, policy_version 513020 (0.0033) [2024-06-23 20:48:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 8405417984. Throughput: 0: 42615.6. Samples: 8405539260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:48:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 20:48:59,629][15401] Updated weights for policy 0, policy_version 513030 (0.0036) [2024-06-23 20:49:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8405614592. Throughput: 0: 42649.3. Samples: 8405791280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 20:49:03,535][15349] Signal inference workers to stop experience collection... (124500 times) [2024-06-23 20:49:03,535][15349] Signal inference workers to resume experience collection... (124500 times) [2024-06-23 20:49:03,553][15401] InferenceWorker_p0-w0: stopping experience collection (124500 times) [2024-06-23 20:49:03,553][15401] InferenceWorker_p0-w0: resuming experience collection (124500 times) [2024-06-23 20:49:03,678][15401] Updated weights for policy 0, policy_version 513040 (0.0029) [2024-06-23 20:49:07,416][15401] Updated weights for policy 0, policy_version 513050 (0.0039) [2024-06-23 20:49:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.8, 300 sec: 42876.1). Total num frames: 8405860352. Throughput: 0: 42658.6. Samples: 8405921560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:49:11,160][15401] Updated weights for policy 0, policy_version 513060 (0.0031) [2024-06-23 20:49:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 8406056960. Throughput: 0: 42804.0. Samples: 8406184440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:13,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 20:49:15,042][15401] Updated weights for policy 0, policy_version 513070 (0.0037) [2024-06-23 20:49:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8406269952. Throughput: 0: 42661.9. Samples: 8406436680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:49:18,930][15401] Updated weights for policy 0, policy_version 513080 (0.0028) [2024-06-23 20:49:22,698][15401] Updated weights for policy 0, policy_version 513090 (0.0039) [2024-06-23 20:49:23,392][15132] Fps is (10 sec: 44236.5, 60 sec: 42869.6, 300 sec: 42875.7). Total num frames: 8406499328. Throughput: 0: 42713.5. Samples: 8406567240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:23,393][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 20:49:26,578][15401] Updated weights for policy 0, policy_version 513100 (0.0028) [2024-06-23 20:49:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 8406695936. Throughput: 0: 42788.9. Samples: 8406822080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:28,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-23 20:49:30,205][15401] Updated weights for policy 0, policy_version 513110 (0.0058) [2024-06-23 20:49:33,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 8406908928. Throughput: 0: 42895.7. Samples: 8407082400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 20:49:34,000][15401] Updated weights for policy 0, policy_version 513120 (0.0021) [2024-06-23 20:49:37,811][15401] Updated weights for policy 0, policy_version 513130 (0.0041) [2024-06-23 20:49:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 8407121920. Throughput: 0: 42806.6. Samples: 8407209100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:38,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 20:49:41,590][15401] Updated weights for policy 0, policy_version 513140 (0.0030) [2024-06-23 20:49:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 8407334912. Throughput: 0: 42712.0. Samples: 8407461300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 20:49:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513144_8407351296.pth... [2024-06-23 20:49:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512516_8397062144.pth [2024-06-23 20:49:45,276][15401] Updated weights for policy 0, policy_version 513150 (0.0031) [2024-06-23 20:49:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 8407547904. Throughput: 0: 43041.2. Samples: 8407728140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 20:49:49,158][15401] Updated weights for policy 0, policy_version 513160 (0.0033) [2024-06-23 20:49:52,871][15401] Updated weights for policy 0, policy_version 513170 (0.0029) [2024-06-23 20:49:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8407777280. Throughput: 0: 43005.8. Samples: 8407856820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 20:49:56,977][15401] Updated weights for policy 0, policy_version 513180 (0.0035) [2024-06-23 20:49:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8407973888. Throughput: 0: 42725.0. Samples: 8408106960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:49:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 20:50:01,082][15401] Updated weights for policy 0, policy_version 513190 (0.0037) [2024-06-23 20:50:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8408203264. Throughput: 0: 42878.6. Samples: 8408366220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:50:03,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-23 20:50:04,666][15401] Updated weights for policy 0, policy_version 513200 (0.0038) [2024-06-23 20:50:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 8408399872. Throughput: 0: 42825.0. Samples: 8408494260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-23 20:50:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 20:50:08,829][15401] Updated weights for policy 0, policy_version 513210 (0.0033) [2024-06-23 20:50:12,253][15401] Updated weights for policy 0, policy_version 513220 (0.0025) [2024-06-23 20:50:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 8408612864. Throughput: 0: 42867.6. Samples: 8408751120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:13,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-23 20:50:16,279][15401] Updated weights for policy 0, policy_version 513230 (0.0034) [2024-06-23 20:50:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8408842240. Throughput: 0: 42756.3. Samples: 8409006440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:18,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 20:50:20,210][15401] Updated weights for policy 0, policy_version 513240 (0.0032) [2024-06-23 20:50:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 8409055232. Throughput: 0: 42791.3. Samples: 8409134700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:23,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 20:50:23,697][15401] Updated weights for policy 0, policy_version 513250 (0.0032) [2024-06-23 20:50:26,040][15349] Signal inference workers to stop experience collection... (124550 times) [2024-06-23 20:50:26,046][15349] Signal inference workers to resume experience collection... (124550 times) [2024-06-23 20:50:26,084][15401] InferenceWorker_p0-w0: stopping experience collection (124550 times) [2024-06-23 20:50:26,085][15401] InferenceWorker_p0-w0: resuming experience collection (124550 times) [2024-06-23 20:50:27,879][15401] Updated weights for policy 0, policy_version 513260 (0.0035) [2024-06-23 20:50:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8409284608. Throughput: 0: 42973.3. Samples: 8409395100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 20:50:31,385][15401] Updated weights for policy 0, policy_version 513270 (0.0032) [2024-06-23 20:50:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8409464832. Throughput: 0: 42737.5. Samples: 8409651320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 20:50:35,395][15401] Updated weights for policy 0, policy_version 513280 (0.0037) [2024-06-23 20:50:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42876.4). Total num frames: 8409710592. Throughput: 0: 42681.3. Samples: 8409777580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:38,401][15132] Avg episode reward: [(0, '0.504')] [2024-06-23 20:50:38,851][15401] Updated weights for policy 0, policy_version 513290 (0.0034) [2024-06-23 20:50:43,082][15401] Updated weights for policy 0, policy_version 513300 (0.0030) [2024-06-23 20:50:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8409907200. Throughput: 0: 42925.7. Samples: 8410038620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 20:50:46,201][15401] Updated weights for policy 0, policy_version 513310 (0.0030) [2024-06-23 20:50:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8410120192. Throughput: 0: 42944.1. Samples: 8410298700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 20:50:50,772][15401] Updated weights for policy 0, policy_version 513320 (0.0029) [2024-06-23 20:50:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8410365952. Throughput: 0: 42848.8. Samples: 8410422460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 20:50:54,104][15401] Updated weights for policy 0, policy_version 513330 (0.0025) [2024-06-23 20:50:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8410546176. Throughput: 0: 42835.5. Samples: 8410678720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:50:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 20:50:58,647][15401] Updated weights for policy 0, policy_version 513340 (0.0038) [2024-06-23 20:51:01,903][15401] Updated weights for policy 0, policy_version 513350 (0.0023) [2024-06-23 20:51:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8410759168. Throughput: 0: 42934.8. Samples: 8410938500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:03,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 20:51:06,053][15401] Updated weights for policy 0, policy_version 513360 (0.0035) [2024-06-23 20:51:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8411004928. Throughput: 0: 42943.8. Samples: 8411067180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 20:51:09,376][15401] Updated weights for policy 0, policy_version 513370 (0.0047) [2024-06-23 20:51:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8411201536. Throughput: 0: 42945.0. Samples: 8411327620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:13,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 20:51:13,482][15401] Updated weights for policy 0, policy_version 513380 (0.0037) [2024-06-23 20:51:17,086][15401] Updated weights for policy 0, policy_version 513390 (0.0039) [2024-06-23 20:51:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8411398144. Throughput: 0: 42849.7. Samples: 8411579560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 20:51:21,488][15401] Updated weights for policy 0, policy_version 513400 (0.0042) [2024-06-23 20:51:23,394][15132] Fps is (10 sec: 42580.4, 60 sec: 42868.4, 300 sec: 42820.0). Total num frames: 8411627520. Throughput: 0: 42923.6. Samples: 8411709220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:23,394][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 20:51:24,904][15401] Updated weights for policy 0, policy_version 513410 (0.0037) [2024-06-23 20:51:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8411840512. Throughput: 0: 42825.8. Samples: 8411965780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 20:51:28,869][15401] Updated weights for policy 0, policy_version 513420 (0.0045) [2024-06-23 20:51:32,759][15401] Updated weights for policy 0, policy_version 513430 (0.0036) [2024-06-23 20:51:33,390][15132] Fps is (10 sec: 42615.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8412053504. Throughput: 0: 42679.5. Samples: 8412219280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 20:51:36,714][15401] Updated weights for policy 0, policy_version 513440 (0.0038) [2024-06-23 20:51:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 8412266496. Throughput: 0: 42803.3. Samples: 8412348600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 20:51:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 20:51:40,440][15401] Updated weights for policy 0, policy_version 513450 (0.0039) [2024-06-23 20:51:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8412479488. Throughput: 0: 42805.2. Samples: 8412604960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:51:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 20:51:43,553][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513458_8412495872.pth... [2024-06-23 20:51:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000512831_8402223104.pth [2024-06-23 20:51:44,221][15401] Updated weights for policy 0, policy_version 513460 (0.0045) [2024-06-23 20:51:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8412676096. Throughput: 0: 42751.5. Samples: 8412862320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:51:48,390][15132] Avg episode reward: [(0, '0.885')] [2024-06-23 20:51:48,499][15401] Updated weights for policy 0, policy_version 513470 (0.0035) [2024-06-23 20:51:51,843][15401] Updated weights for policy 0, policy_version 513480 (0.0029) [2024-06-23 20:51:53,396][15132] Fps is (10 sec: 44209.3, 60 sec: 42594.0, 300 sec: 42930.7). Total num frames: 8412921856. Throughput: 0: 42790.5. Samples: 8412993020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:51:53,396][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 20:51:56,054][15401] Updated weights for policy 0, policy_version 513490 (0.0055) [2024-06-23 20:51:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8413118464. Throughput: 0: 42649.3. Samples: 8413246840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:51:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 20:51:59,427][15401] Updated weights for policy 0, policy_version 513500 (0.0034) [2024-06-23 20:52:03,392][15132] Fps is (10 sec: 40976.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8413331456. Throughput: 0: 42857.8. Samples: 8413508260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:03,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 20:52:03,549][15401] Updated weights for policy 0, policy_version 513510 (0.0048) [2024-06-23 20:52:06,887][15401] Updated weights for policy 0, policy_version 513520 (0.0033) [2024-06-23 20:52:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8413560832. Throughput: 0: 42787.0. Samples: 8413634460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:08,400][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 20:52:10,971][15401] Updated weights for policy 0, policy_version 513530 (0.0033) [2024-06-23 20:52:13,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8413773824. Throughput: 0: 42815.6. Samples: 8413892480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 20:52:14,814][15401] Updated weights for policy 0, policy_version 513540 (0.0045) [2024-06-23 20:52:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8413986816. Throughput: 0: 42833.0. Samples: 8414146760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 20:52:18,458][15401] Updated weights for policy 0, policy_version 513550 (0.0031) [2024-06-23 20:52:22,483][15401] Updated weights for policy 0, policy_version 513560 (0.0040) [2024-06-23 20:52:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42874.4, 300 sec: 42820.9). Total num frames: 8414199808. Throughput: 0: 42743.4. Samples: 8414272060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 20:52:26,747][15401] Updated weights for policy 0, policy_version 513570 (0.0042) [2024-06-23 20:52:27,098][15349] Signal inference workers to stop experience collection... (124600 times) [2024-06-23 20:52:27,104][15349] Signal inference workers to resume experience collection... (124600 times) [2024-06-23 20:52:27,143][15401] InferenceWorker_p0-w0: stopping experience collection (124600 times) [2024-06-23 20:52:27,144][15401] InferenceWorker_p0-w0: resuming experience collection (124600 times) [2024-06-23 20:52:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8414412800. Throughput: 0: 42824.7. Samples: 8414532060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 20:52:30,085][15401] Updated weights for policy 0, policy_version 513580 (0.0049) [2024-06-23 20:52:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8414625792. Throughput: 0: 42887.4. Samples: 8414792260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 20:52:34,238][15401] Updated weights for policy 0, policy_version 513590 (0.0035) [2024-06-23 20:52:37,818][15401] Updated weights for policy 0, policy_version 513600 (0.0029) [2024-06-23 20:52:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8414838784. Throughput: 0: 42745.6. Samples: 8414916400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:38,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 20:52:41,922][15401] Updated weights for policy 0, policy_version 513610 (0.0035) [2024-06-23 20:52:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8415035392. Throughput: 0: 42772.0. Samples: 8415171580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 20:52:45,398][15401] Updated weights for policy 0, policy_version 513620 (0.0035) [2024-06-23 20:52:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 8415264768. Throughput: 0: 42471.6. Samples: 8415419380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 20:52:49,549][15401] Updated weights for policy 0, policy_version 513630 (0.0032) [2024-06-23 20:52:53,034][15401] Updated weights for policy 0, policy_version 513640 (0.0047) [2024-06-23 20:52:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 8415494144. Throughput: 0: 42681.4. Samples: 8415555120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-23 20:52:57,441][15401] Updated weights for policy 0, policy_version 513650 (0.0040) [2024-06-23 20:52:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 8415657984. Throughput: 0: 42587.5. Samples: 8415808920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:52:58,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 20:53:00,529][15401] Updated weights for policy 0, policy_version 513660 (0.0027) [2024-06-23 20:53:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42765.5). Total num frames: 8415903744. Throughput: 0: 42565.3. Samples: 8416062200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:53:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 20:53:05,052][15401] Updated weights for policy 0, policy_version 513670 (0.0052) [2024-06-23 20:53:08,302][15401] Updated weights for policy 0, policy_version 513680 (0.0028) [2024-06-23 20:53:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8416133120. Throughput: 0: 42688.5. Samples: 8416193040. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:08,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 20:53:12,576][15401] Updated weights for policy 0, policy_version 513690 (0.0034) [2024-06-23 20:53:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8416313344. Throughput: 0: 42660.8. Samples: 8416451800. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 20:53:16,220][15401] Updated weights for policy 0, policy_version 513700 (0.0026) [2024-06-23 20:53:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8416559104. Throughput: 0: 42510.8. Samples: 8416705240. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 20:53:20,307][15401] Updated weights for policy 0, policy_version 513710 (0.0041) [2024-06-23 20:53:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8416772096. Throughput: 0: 42675.7. Samples: 8416836700. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 20:53:23,922][15401] Updated weights for policy 0, policy_version 513720 (0.0043) [2024-06-23 20:53:27,832][15401] Updated weights for policy 0, policy_version 513730 (0.0036) [2024-06-23 20:53:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8416968704. Throughput: 0: 42687.4. Samples: 8417092520. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:28,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 20:53:31,481][15401] Updated weights for policy 0, policy_version 513740 (0.0040) [2024-06-23 20:53:33,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8417181696. Throughput: 0: 42893.2. Samples: 8417349580. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:33,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-23 20:53:35,444][15401] Updated weights for policy 0, policy_version 513750 (0.0032) [2024-06-23 20:53:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 8417411072. Throughput: 0: 42680.0. Samples: 8417475720. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:38,392][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 20:53:39,078][15401] Updated weights for policy 0, policy_version 513760 (0.0036) [2024-06-23 20:53:43,270][15401] Updated weights for policy 0, policy_version 513770 (0.0035) [2024-06-23 20:53:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 8417607680. Throughput: 0: 42691.9. Samples: 8417730060. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:43,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-23 20:53:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513770_8417607680.pth... [2024-06-23 20:53:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513144_8407351296.pth [2024-06-23 20:53:47,300][15401] Updated weights for policy 0, policy_version 513780 (0.0027) [2024-06-23 20:53:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8417837056. Throughput: 0: 42673.0. Samples: 8417982480. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 20:53:49,538][15349] Signal inference workers to stop experience collection... (124650 times) [2024-06-23 20:53:49,541][15349] Signal inference workers to resume experience collection... (124650 times) [2024-06-23 20:53:49,584][15401] InferenceWorker_p0-w0: stopping experience collection (124650 times) [2024-06-23 20:53:49,584][15401] InferenceWorker_p0-w0: resuming experience collection (124650 times) [2024-06-23 20:53:50,867][15401] Updated weights for policy 0, policy_version 513790 (0.0033) [2024-06-23 20:53:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8418033664. Throughput: 0: 42661.3. Samples: 8418112800. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 20:53:54,873][15401] Updated weights for policy 0, policy_version 513800 (0.0043) [2024-06-23 20:53:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 8418263040. Throughput: 0: 42633.1. Samples: 8418370280. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:53:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 20:53:58,394][15401] Updated weights for policy 0, policy_version 513810 (0.0035) [2024-06-23 20:54:02,274][15401] Updated weights for policy 0, policy_version 513820 (0.0035) [2024-06-23 20:54:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8418443264. Throughput: 0: 42806.1. Samples: 8418631520. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:03,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 20:54:05,958][15401] Updated weights for policy 0, policy_version 513830 (0.0030) [2024-06-23 20:54:08,389][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 8418672640. Throughput: 0: 42617.2. Samples: 8418754480. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 20:54:09,755][15401] Updated weights for policy 0, policy_version 513840 (0.0028) [2024-06-23 20:54:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8418885632. Throughput: 0: 42669.9. Samples: 8419012660. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 20:54:13,646][15401] Updated weights for policy 0, policy_version 513850 (0.0030) [2024-06-23 20:54:17,545][15401] Updated weights for policy 0, policy_version 513860 (0.0028) [2024-06-23 20:54:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 8419098624. Throughput: 0: 42610.0. Samples: 8419267020. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 20:54:21,386][15401] Updated weights for policy 0, policy_version 513870 (0.0041) [2024-06-23 20:54:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8419295232. Throughput: 0: 42655.2. Samples: 8419395200. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 20:54:25,215][15401] Updated weights for policy 0, policy_version 513880 (0.0024) [2024-06-23 20:54:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8419524608. Throughput: 0: 42642.2. Samples: 8419648960. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:28,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 20:54:29,213][15401] Updated weights for policy 0, policy_version 513890 (0.0043) [2024-06-23 20:54:32,882][15401] Updated weights for policy 0, policy_version 513900 (0.0040) [2024-06-23 20:54:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8419753984. Throughput: 0: 42790.5. Samples: 8419908060. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-06-23 20:54:33,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-23 20:54:36,843][15401] Updated weights for policy 0, policy_version 513910 (0.0029) [2024-06-23 20:54:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8419950592. Throughput: 0: 42743.1. Samples: 8420036240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:54:38,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-23 20:54:40,397][15401] Updated weights for policy 0, policy_version 513920 (0.0038) [2024-06-23 20:54:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8420147200. Throughput: 0: 42667.8. Samples: 8420290340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:54:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 20:54:44,592][15401] Updated weights for policy 0, policy_version 513930 (0.0035) [2024-06-23 20:54:48,035][15401] Updated weights for policy 0, policy_version 513940 (0.0040) [2024-06-23 20:54:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 8420392960. Throughput: 0: 42458.6. Samples: 8420542160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:54:48,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 20:54:52,331][15401] Updated weights for policy 0, policy_version 513950 (0.0045) [2024-06-23 20:54:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8420605952. Throughput: 0: 42758.2. Samples: 8420678600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:54:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 20:54:55,971][15401] Updated weights for policy 0, policy_version 513960 (0.0039) [2024-06-23 20:54:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.0, 300 sec: 42653.9). Total num frames: 8420786176. Throughput: 0: 42671.9. Samples: 8420932900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:54:58,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 20:54:59,735][15401] Updated weights for policy 0, policy_version 513970 (0.0038) [2024-06-23 20:55:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8421015552. Throughput: 0: 42597.7. Samples: 8421183920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 20:55:03,630][15401] Updated weights for policy 0, policy_version 513980 (0.0033) [2024-06-23 20:55:07,334][15401] Updated weights for policy 0, policy_version 513990 (0.0022) [2024-06-23 20:55:08,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8421228544. Throughput: 0: 42677.7. Samples: 8421315800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:08,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 20:55:11,398][15401] Updated weights for policy 0, policy_version 514000 (0.0038) [2024-06-23 20:55:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8421441536. Throughput: 0: 42603.2. Samples: 8421566100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 20:55:15,093][15401] Updated weights for policy 0, policy_version 514010 (0.0030) [2024-06-23 20:55:18,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8421670912. Throughput: 0: 42603.5. Samples: 8421825220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 20:55:19,008][15401] Updated weights for policy 0, policy_version 514020 (0.0040) [2024-06-23 20:55:23,088][15401] Updated weights for policy 0, policy_version 514030 (0.0041) [2024-06-23 20:55:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8421867520. Throughput: 0: 42584.0. Samples: 8421952620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:23,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 20:55:24,340][15349] Signal inference workers to stop experience collection... (124700 times) [2024-06-23 20:55:24,340][15349] Signal inference workers to resume experience collection... (124700 times) [2024-06-23 20:55:24,385][15401] InferenceWorker_p0-w0: stopping experience collection (124700 times) [2024-06-23 20:55:24,385][15401] InferenceWorker_p0-w0: resuming experience collection (124700 times) [2024-06-23 20:55:26,649][15401] Updated weights for policy 0, policy_version 514040 (0.0032) [2024-06-23 20:55:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8422064128. Throughput: 0: 42537.8. Samples: 8422204540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 20:55:30,674][15401] Updated weights for policy 0, policy_version 514050 (0.0033) [2024-06-23 20:55:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 8422293504. Throughput: 0: 42700.5. Samples: 8422463680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 20:55:34,202][15401] Updated weights for policy 0, policy_version 514060 (0.0025) [2024-06-23 20:55:38,280][15401] Updated weights for policy 0, policy_version 514070 (0.0028) [2024-06-23 20:55:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8422522880. Throughput: 0: 42659.5. Samples: 8422598280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 20:55:41,681][15401] Updated weights for policy 0, policy_version 514080 (0.0032) [2024-06-23 20:55:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8422719488. Throughput: 0: 42517.5. Samples: 8422846180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:43,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 20:55:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514083_8422735872.pth... [2024-06-23 20:55:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513458_8412495872.pth [2024-06-23 20:55:46,221][15401] Updated weights for policy 0, policy_version 514090 (0.0033) [2024-06-23 20:55:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8422932480. Throughput: 0: 42633.6. Samples: 8423102440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 20:55:49,644][15401] Updated weights for policy 0, policy_version 514100 (0.0044) [2024-06-23 20:55:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 8423145472. Throughput: 0: 42492.5. Samples: 8423227960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:53,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 20:55:54,233][15401] Updated weights for policy 0, policy_version 514110 (0.0045) [2024-06-23 20:55:57,295][15401] Updated weights for policy 0, policy_version 514120 (0.0034) [2024-06-23 20:55:58,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8423358464. Throughput: 0: 42597.8. Samples: 8423483000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 20:55:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 20:56:01,666][15401] Updated weights for policy 0, policy_version 514130 (0.0036) [2024-06-23 20:56:03,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8423604224. Throughput: 0: 42579.7. Samples: 8423741300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 20:56:04,820][15401] Updated weights for policy 0, policy_version 514140 (0.0025) [2024-06-23 20:56:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 8423784448. Throughput: 0: 42825.5. Samples: 8423879660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 20:56:09,288][15401] Updated weights for policy 0, policy_version 514150 (0.0027) [2024-06-23 20:56:12,240][15401] Updated weights for policy 0, policy_version 514160 (0.0040) [2024-06-23 20:56:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8424013824. Throughput: 0: 42889.2. Samples: 8424134560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:13,396][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 20:56:17,034][15401] Updated weights for policy 0, policy_version 514170 (0.0030) [2024-06-23 20:56:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42765.6). Total num frames: 8424243200. Throughput: 0: 42837.4. Samples: 8424391360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 20:56:20,203][15401] Updated weights for policy 0, policy_version 514180 (0.0028) [2024-06-23 20:56:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8424423424. Throughput: 0: 42813.8. Samples: 8424524900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 20:56:24,468][15401] Updated weights for policy 0, policy_version 514190 (0.0034) [2024-06-23 20:56:27,762][15401] Updated weights for policy 0, policy_version 514200 (0.0032) [2024-06-23 20:56:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8424652800. Throughput: 0: 43005.8. Samples: 8424781440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 20:56:32,209][15401] Updated weights for policy 0, policy_version 514210 (0.0049) [2024-06-23 20:56:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 8424898560. Throughput: 0: 43044.5. Samples: 8425039440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:33,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 20:56:35,616][15401] Updated weights for policy 0, policy_version 514220 (0.0035) [2024-06-23 20:56:36,998][15349] Signal inference workers to stop experience collection... (124750 times) [2024-06-23 20:56:37,000][15349] Signal inference workers to resume experience collection... (124750 times) [2024-06-23 20:56:37,044][15401] InferenceWorker_p0-w0: stopping experience collection (124750 times) [2024-06-23 20:56:37,044][15401] InferenceWorker_p0-w0: resuming experience collection (124750 times) [2024-06-23 20:56:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 8425078784. Throughput: 0: 43129.3. Samples: 8425168780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:38,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 20:56:39,761][15401] Updated weights for policy 0, policy_version 514230 (0.0034) [2024-06-23 20:56:43,305][15401] Updated weights for policy 0, policy_version 514240 (0.0037) [2024-06-23 20:56:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8425308160. Throughput: 0: 43096.8. Samples: 8425422360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 20:56:47,664][15401] Updated weights for policy 0, policy_version 514250 (0.0046) [2024-06-23 20:56:48,392][15132] Fps is (10 sec: 45875.2, 60 sec: 43416.0, 300 sec: 42765.6). Total num frames: 8425537536. Throughput: 0: 43068.4. Samples: 8425679480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:48,393][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 20:56:51,119][15401] Updated weights for policy 0, policy_version 514260 (0.0034) [2024-06-23 20:56:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8425717760. Throughput: 0: 42888.8. Samples: 8425809660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:53,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 20:56:55,180][15401] Updated weights for policy 0, policy_version 514270 (0.0032) [2024-06-23 20:56:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 8425947136. Throughput: 0: 42824.1. Samples: 8426061640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:56:58,390][15132] Avg episode reward: [(0, '0.901')] [2024-06-23 20:56:58,714][15401] Updated weights for policy 0, policy_version 514280 (0.0032) [2024-06-23 20:57:02,711][15401] Updated weights for policy 0, policy_version 514290 (0.0033) [2024-06-23 20:57:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8426160128. Throughput: 0: 42877.7. Samples: 8426320860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 20:57:06,284][15401] Updated weights for policy 0, policy_version 514300 (0.0037) [2024-06-23 20:57:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8426373120. Throughput: 0: 42808.1. Samples: 8426451260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 20:57:10,223][15401] Updated weights for policy 0, policy_version 514310 (0.0037) [2024-06-23 20:57:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8426586112. Throughput: 0: 42699.9. Samples: 8426702940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 20:57:14,148][15401] Updated weights for policy 0, policy_version 514320 (0.0026) [2024-06-23 20:57:17,843][15401] Updated weights for policy 0, policy_version 514330 (0.0034) [2024-06-23 20:57:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8426799104. Throughput: 0: 42700.2. Samples: 8426960940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 20:57:21,689][15401] Updated weights for policy 0, policy_version 514340 (0.0025) [2024-06-23 20:57:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8427012096. Throughput: 0: 42696.1. Samples: 8427090000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 20:57:25,563][15401] Updated weights for policy 0, policy_version 514350 (0.0024) [2024-06-23 20:57:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8427241472. Throughput: 0: 42625.3. Samples: 8427340500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 20:57:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 20:57:29,386][15401] Updated weights for policy 0, policy_version 514360 (0.0036) [2024-06-23 20:57:33,225][15401] Updated weights for policy 0, policy_version 514370 (0.0048) [2024-06-23 20:57:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 8427438080. Throughput: 0: 42537.4. Samples: 8427593560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 20:57:36,973][15401] Updated weights for policy 0, policy_version 514380 (0.0030) [2024-06-23 20:57:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 8427634688. Throughput: 0: 42372.9. Samples: 8427716440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 20:57:40,958][15401] Updated weights for policy 0, policy_version 514390 (0.0033) [2024-06-23 20:57:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8427880448. Throughput: 0: 42551.4. Samples: 8427976460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 20:57:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514397_8427880448.pth... [2024-06-23 20:57:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000513770_8417607680.pth [2024-06-23 20:57:44,959][15401] Updated weights for policy 0, policy_version 514400 (0.0038) [2024-06-23 20:57:48,165][15349] Signal inference workers to stop experience collection... (124800 times) [2024-06-23 20:57:48,220][15349] Signal inference workers to resume experience collection... (124800 times) [2024-06-23 20:57:48,224][15401] InferenceWorker_p0-w0: stopping experience collection (124800 times) [2024-06-23 20:57:48,236][15401] InferenceWorker_p0-w0: resuming experience collection (124800 times) [2024-06-23 20:57:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 8428077056. Throughput: 0: 42609.0. Samples: 8428238260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 20:57:48,527][15401] Updated weights for policy 0, policy_version 514410 (0.0030) [2024-06-23 20:57:52,410][15401] Updated weights for policy 0, policy_version 514420 (0.0036) [2024-06-23 20:57:53,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8428257280. Throughput: 0: 42401.2. Samples: 8428359320. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-23 20:57:56,120][15401] Updated weights for policy 0, policy_version 514430 (0.0041) [2024-06-23 20:57:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8428519424. Throughput: 0: 42441.3. Samples: 8428612800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:57:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 20:58:00,108][15401] Updated weights for policy 0, policy_version 514440 (0.0042) [2024-06-23 20:58:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8428699648. Throughput: 0: 42597.9. Samples: 8428877840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 20:58:03,927][15401] Updated weights for policy 0, policy_version 514450 (0.0032) [2024-06-23 20:58:07,742][15401] Updated weights for policy 0, policy_version 514460 (0.0026) [2024-06-23 20:58:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8428912640. Throughput: 0: 42338.2. Samples: 8428995220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 20:58:11,681][15401] Updated weights for policy 0, policy_version 514470 (0.0047) [2024-06-23 20:58:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8429158400. Throughput: 0: 42507.6. Samples: 8429253340. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:13,396][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 20:58:16,163][15401] Updated weights for policy 0, policy_version 514480 (0.0029) [2024-06-23 20:58:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8429338624. Throughput: 0: 42700.9. Samples: 8429515100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 20:58:19,162][15401] Updated weights for policy 0, policy_version 514490 (0.0025) [2024-06-23 20:58:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 8429551616. Throughput: 0: 42667.1. Samples: 8429636460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 20:58:23,901][15401] Updated weights for policy 0, policy_version 514500 (0.0026) [2024-06-23 20:58:27,006][15401] Updated weights for policy 0, policy_version 514510 (0.0030) [2024-06-23 20:58:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8429797376. Throughput: 0: 42764.4. Samples: 8429900860. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 20:58:31,290][15401] Updated weights for policy 0, policy_version 514520 (0.0033) [2024-06-23 20:58:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8429977600. Throughput: 0: 42812.8. Samples: 8430164840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 20:58:34,860][15401] Updated weights for policy 0, policy_version 514530 (0.0031) [2024-06-23 20:58:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8430206976. Throughput: 0: 42665.5. Samples: 8430279260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 20:58:38,812][15401] Updated weights for policy 0, policy_version 514540 (0.0027) [2024-06-23 20:58:42,454][15401] Updated weights for policy 0, policy_version 514550 (0.0039) [2024-06-23 20:58:43,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8430452736. Throughput: 0: 43074.3. Samples: 8430551140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:43,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 20:58:46,244][15401] Updated weights for policy 0, policy_version 514560 (0.0032) [2024-06-23 20:58:48,393][15132] Fps is (10 sec: 44222.9, 60 sec: 42869.3, 300 sec: 42764.6). Total num frames: 8430649344. Throughput: 0: 42918.3. Samples: 8430809300. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:48,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 20:58:50,087][15401] Updated weights for policy 0, policy_version 514570 (0.0034) [2024-06-23 20:58:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8430845952. Throughput: 0: 43021.8. Samples: 8430931200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:53,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 20:58:53,767][15401] Updated weights for policy 0, policy_version 514580 (0.0033) [2024-06-23 20:58:57,689][15401] Updated weights for policy 0, policy_version 514590 (0.0019) [2024-06-23 20:58:58,209][15349] Signal inference workers to stop experience collection... (124850 times) [2024-06-23 20:58:58,244][15401] InferenceWorker_p0-w0: stopping experience collection (124850 times) [2024-06-23 20:58:58,276][15349] Signal inference workers to resume experience collection... (124850 times) [2024-06-23 20:58:58,276][15401] InferenceWorker_p0-w0: resuming experience collection (124850 times) [2024-06-23 20:58:58,389][15132] Fps is (10 sec: 42612.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8431075328. Throughput: 0: 43248.6. Samples: 8431199520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-23 20:58:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 20:59:01,447][15401] Updated weights for policy 0, policy_version 514600 (0.0025) [2024-06-23 20:59:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8431288320. Throughput: 0: 43018.9. Samples: 8431450960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 20:59:05,194][15401] Updated weights for policy 0, policy_version 514610 (0.0037) [2024-06-23 20:59:08,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8431501312. Throughput: 0: 43226.1. Samples: 8431581640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 20:59:08,908][15401] Updated weights for policy 0, policy_version 514620 (0.0033) [2024-06-23 20:59:12,927][15401] Updated weights for policy 0, policy_version 514630 (0.0041) [2024-06-23 20:59:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8431714304. Throughput: 0: 43112.6. Samples: 8431840920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:13,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 20:59:16,430][15401] Updated weights for policy 0, policy_version 514640 (0.0031) [2024-06-23 20:59:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8431927296. Throughput: 0: 43031.6. Samples: 8432101260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 20:59:20,406][15401] Updated weights for policy 0, policy_version 514650 (0.0036) [2024-06-23 20:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8432140288. Throughput: 0: 43367.9. Samples: 8432230820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 20:59:23,963][15401] Updated weights for policy 0, policy_version 514660 (0.0024) [2024-06-23 20:59:27,883][15401] Updated weights for policy 0, policy_version 514670 (0.0035) [2024-06-23 20:59:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8432369664. Throughput: 0: 42944.6. Samples: 8432483660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 20:59:31,709][15401] Updated weights for policy 0, policy_version 514680 (0.0040) [2024-06-23 20:59:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8432566272. Throughput: 0: 43148.7. Samples: 8432750860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:33,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 20:59:35,270][15401] Updated weights for policy 0, policy_version 514690 (0.0032) [2024-06-23 20:59:38,390][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8432795648. Throughput: 0: 43221.7. Samples: 8432876180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 20:59:39,486][15401] Updated weights for policy 0, policy_version 514700 (0.0032) [2024-06-23 20:59:43,073][15401] Updated weights for policy 0, policy_version 514710 (0.0029) [2024-06-23 20:59:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8433008640. Throughput: 0: 42999.0. Samples: 8433134480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:43,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 20:59:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514711_8433025024.pth... [2024-06-23 20:59:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514083_8422735872.pth [2024-06-23 20:59:47,063][15401] Updated weights for policy 0, policy_version 514720 (0.0046) [2024-06-23 20:59:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.7, 300 sec: 42765.0). Total num frames: 8433221632. Throughput: 0: 43091.7. Samples: 8433390080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:48,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 20:59:50,759][15401] Updated weights for policy 0, policy_version 514730 (0.0028) [2024-06-23 20:59:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 8433451008. Throughput: 0: 43162.9. Samples: 8433523960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 20:59:54,619][15401] Updated weights for policy 0, policy_version 514740 (0.0030) [2024-06-23 20:59:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8433647616. Throughput: 0: 43001.2. Samples: 8433775980. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 20:59:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 20:59:58,455][15401] Updated weights for policy 0, policy_version 514750 (0.0032) [2024-06-23 21:00:02,263][15401] Updated weights for policy 0, policy_version 514760 (0.0040) [2024-06-23 21:00:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 8433860608. Throughput: 0: 42952.0. Samples: 8434034100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 21:00:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 21:00:05,966][15401] Updated weights for policy 0, policy_version 514770 (0.0024) [2024-06-23 21:00:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8434089984. Throughput: 0: 42846.3. Samples: 8434158900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 21:00:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 21:00:09,892][15401] Updated weights for policy 0, policy_version 514780 (0.0040) [2024-06-23 21:00:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8434302976. Throughput: 0: 42965.5. Samples: 8434417100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 21:00:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 21:00:13,556][15401] Updated weights for policy 0, policy_version 514790 (0.0035) [2024-06-23 21:00:17,604][15401] Updated weights for policy 0, policy_version 514800 (0.0028) [2024-06-23 21:00:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 8434515968. Throughput: 0: 42764.4. Samples: 8434675260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 21:00:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 21:00:21,103][15401] Updated weights for policy 0, policy_version 514810 (0.0024) [2024-06-23 21:00:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8434728960. Throughput: 0: 42769.6. Samples: 8434800820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-23 21:00:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 21:00:25,554][15401] Updated weights for policy 0, policy_version 514820 (0.0035) [2024-06-23 21:00:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 8434941952. Throughput: 0: 42766.7. Samples: 8435058980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 21:00:28,747][15401] Updated weights for policy 0, policy_version 514830 (0.0039) [2024-06-23 21:00:32,000][15349] Signal inference workers to stop experience collection... (124900 times) [2024-06-23 21:00:32,051][15401] InferenceWorker_p0-w0: stopping experience collection (124900 times) [2024-06-23 21:00:32,050][15349] Signal inference workers to resume experience collection... (124900 times) [2024-06-23 21:00:32,065][15401] InferenceWorker_p0-w0: resuming experience collection (124900 times) [2024-06-23 21:00:32,911][15401] Updated weights for policy 0, policy_version 514840 (0.0030) [2024-06-23 21:00:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8435154944. Throughput: 0: 42886.6. Samples: 8435319980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 21:00:36,404][15401] Updated weights for policy 0, policy_version 514850 (0.0044) [2024-06-23 21:00:38,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8435384320. Throughput: 0: 42777.6. Samples: 8435448960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 21:00:40,559][15401] Updated weights for policy 0, policy_version 514860 (0.0027) [2024-06-23 21:00:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 8435580928. Throughput: 0: 42748.1. Samples: 8435699740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:43,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:00:44,164][15401] Updated weights for policy 0, policy_version 514870 (0.0031) [2024-06-23 21:00:48,290][15401] Updated weights for policy 0, policy_version 514880 (0.0034) [2024-06-23 21:00:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 8435793920. Throughput: 0: 42744.8. Samples: 8435957620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:00:51,832][15401] Updated weights for policy 0, policy_version 514890 (0.0032) [2024-06-23 21:00:53,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8436006912. Throughput: 0: 42876.3. Samples: 8436088340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 21:00:55,923][15401] Updated weights for policy 0, policy_version 514900 (0.0038) [2024-06-23 21:00:58,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8436219904. Throughput: 0: 42645.3. Samples: 8436336240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:00:58,393][15132] Avg episode reward: [(0, '0.853')] [2024-06-23 21:00:59,739][15401] Updated weights for policy 0, policy_version 514910 (0.0029) [2024-06-23 21:01:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8436416512. Throughput: 0: 42526.3. Samples: 8436588940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:03,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 21:01:04,018][15401] Updated weights for policy 0, policy_version 514920 (0.0025) [2024-06-23 21:01:07,327][15401] Updated weights for policy 0, policy_version 514930 (0.0031) [2024-06-23 21:01:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8436629504. Throughput: 0: 42694.4. Samples: 8436722060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 21:01:11,533][15401] Updated weights for policy 0, policy_version 514940 (0.0035) [2024-06-23 21:01:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8436858880. Throughput: 0: 42589.6. Samples: 8436975620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:13,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 21:01:14,983][15401] Updated weights for policy 0, policy_version 514950 (0.0036) [2024-06-23 21:01:18,391][15132] Fps is (10 sec: 44228.2, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 8437071872. Throughput: 0: 42552.1. Samples: 8437234900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:18,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 21:01:19,015][15401] Updated weights for policy 0, policy_version 514960 (0.0027) [2024-06-23 21:01:22,983][15401] Updated weights for policy 0, policy_version 514970 (0.0032) [2024-06-23 21:01:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 8437284864. Throughput: 0: 42531.6. Samples: 8437362880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:01:26,675][15401] Updated weights for policy 0, policy_version 514980 (0.0029) [2024-06-23 21:01:28,389][15132] Fps is (10 sec: 40967.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 8437481472. Throughput: 0: 42711.2. Samples: 8437621640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 21:01:30,615][15401] Updated weights for policy 0, policy_version 514990 (0.0038) [2024-06-23 21:01:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 8437710848. Throughput: 0: 42652.0. Samples: 8437876960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 21:01:34,483][15401] Updated weights for policy 0, policy_version 515000 (0.0034) [2024-06-23 21:01:38,168][15401] Updated weights for policy 0, policy_version 515010 (0.0035) [2024-06-23 21:01:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8437923840. Throughput: 0: 42548.0. Samples: 8438003000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:38,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 21:01:42,033][15401] Updated weights for policy 0, policy_version 515020 (0.0040) [2024-06-23 21:01:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 8438120448. Throughput: 0: 42543.1. Samples: 8438250580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 21:01:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515023_8438136832.pth... [2024-06-23 21:01:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514397_8427880448.pth [2024-06-23 21:01:45,972][15401] Updated weights for policy 0, policy_version 515030 (0.0032) [2024-06-23 21:01:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 8438349824. Throughput: 0: 42557.7. Samples: 8438504040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 21:01:49,546][15401] Updated weights for policy 0, policy_version 515040 (0.0030) [2024-06-23 21:01:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 8438530048. Throughput: 0: 42604.9. Samples: 8438639280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-23 21:01:53,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 21:01:53,569][15349] Signal inference workers to stop experience collection... (124950 times) [2024-06-23 21:01:53,569][15349] Signal inference workers to resume experience collection... (124950 times) [2024-06-23 21:01:53,587][15401] InferenceWorker_p0-w0: stopping experience collection (124950 times) [2024-06-23 21:01:53,616][15401] InferenceWorker_p0-w0: resuming experience collection (124950 times) [2024-06-23 21:01:53,717][15401] Updated weights for policy 0, policy_version 515050 (0.0034) [2024-06-23 21:01:57,267][15401] Updated weights for policy 0, policy_version 515060 (0.0030) [2024-06-23 21:01:58,394][15132] Fps is (10 sec: 42578.8, 60 sec: 42596.8, 300 sec: 42764.4). Total num frames: 8438775808. Throughput: 0: 42545.4. Samples: 8438890260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:01:58,395][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 21:02:01,435][15401] Updated weights for policy 0, policy_version 515070 (0.0035) [2024-06-23 21:02:03,396][15132] Fps is (10 sec: 45846.5, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 8438988800. Throughput: 0: 42510.6. Samples: 8439148060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:03,396][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 21:02:04,811][15401] Updated weights for policy 0, policy_version 515080 (0.0030) [2024-06-23 21:02:08,390][15132] Fps is (10 sec: 40979.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8439185408. Throughput: 0: 42533.8. Samples: 8439276900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 21:02:09,151][15401] Updated weights for policy 0, policy_version 515090 (0.0029) [2024-06-23 21:02:12,505][15401] Updated weights for policy 0, policy_version 515100 (0.0037) [2024-06-23 21:02:13,390][15132] Fps is (10 sec: 42624.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8439414784. Throughput: 0: 42395.4. Samples: 8439529440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 21:02:16,827][15401] Updated weights for policy 0, policy_version 515110 (0.0035) [2024-06-23 21:02:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 8439627776. Throughput: 0: 42236.9. Samples: 8439777620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:18,393][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 21:02:20,339][15401] Updated weights for policy 0, policy_version 515120 (0.0030) [2024-06-23 21:02:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 8439808000. Throughput: 0: 42322.4. Samples: 8439907500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 21:02:24,509][15401] Updated weights for policy 0, policy_version 515130 (0.0033) [2024-06-23 21:02:27,939][15401] Updated weights for policy 0, policy_version 515140 (0.0034) [2024-06-23 21:02:28,390][15132] Fps is (10 sec: 42594.5, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 8440053760. Throughput: 0: 42464.1. Samples: 8440161500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:28,391][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 21:02:32,378][15401] Updated weights for policy 0, policy_version 515150 (0.0045) [2024-06-23 21:02:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8440266752. Throughput: 0: 42427.6. Samples: 8440413280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 21:02:35,630][15401] Updated weights for policy 0, policy_version 515160 (0.0039) [2024-06-23 21:02:38,390][15132] Fps is (10 sec: 40963.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8440463360. Throughput: 0: 42324.2. Samples: 8440543880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 21:02:40,013][15401] Updated weights for policy 0, policy_version 515170 (0.0039) [2024-06-23 21:02:43,074][15401] Updated weights for policy 0, policy_version 515180 (0.0024) [2024-06-23 21:02:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8440709120. Throughput: 0: 42486.1. Samples: 8440801940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:43,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-23 21:02:47,839][15401] Updated weights for policy 0, policy_version 515190 (0.0039) [2024-06-23 21:02:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8440905728. Throughput: 0: 42493.8. Samples: 8441060020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 21:02:50,568][15401] Updated weights for policy 0, policy_version 515200 (0.0037) [2024-06-23 21:02:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8441102336. Throughput: 0: 42419.0. Samples: 8441185760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:02:55,406][15401] Updated weights for policy 0, policy_version 515210 (0.0035) [2024-06-23 21:02:57,209][15349] Signal inference workers to stop experience collection... (125000 times) [2024-06-23 21:02:57,210][15349] Signal inference workers to resume experience collection... (125000 times) [2024-06-23 21:02:57,221][15401] InferenceWorker_p0-w0: stopping experience collection (125000 times) [2024-06-23 21:02:57,221][15401] InferenceWorker_p0-w0: resuming experience collection (125000 times) [2024-06-23 21:02:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42874.7, 300 sec: 42876.1). Total num frames: 8441348096. Throughput: 0: 42650.2. Samples: 8441448700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:02:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 21:02:58,567][15401] Updated weights for policy 0, policy_version 515220 (0.0030) [2024-06-23 21:03:02,943][15401] Updated weights for policy 0, policy_version 515230 (0.0031) [2024-06-23 21:03:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42602.8, 300 sec: 42820.6). Total num frames: 8441544704. Throughput: 0: 42951.6. Samples: 8441710440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:03:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:03:06,140][15401] Updated weights for policy 0, policy_version 515240 (0.0039) [2024-06-23 21:03:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8441741312. Throughput: 0: 42859.0. Samples: 8441836160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:03:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 21:03:10,858][15401] Updated weights for policy 0, policy_version 515250 (0.0027) [2024-06-23 21:03:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8441987072. Throughput: 0: 42856.9. Samples: 8442090020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:03:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 21:03:13,771][15401] Updated weights for policy 0, policy_version 515260 (0.0031) [2024-06-23 21:03:18,381][15401] Updated weights for policy 0, policy_version 515270 (0.0027) [2024-06-23 21:03:18,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 8442183680. Throughput: 0: 43334.3. Samples: 8442363600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-23 21:03:18,396][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 21:03:21,267][15401] Updated weights for policy 0, policy_version 515280 (0.0035) [2024-06-23 21:03:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8442396672. Throughput: 0: 43008.6. Samples: 8442479260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 21:03:26,010][15401] Updated weights for policy 0, policy_version 515290 (0.0031) [2024-06-23 21:03:28,394][15132] Fps is (10 sec: 45884.8, 60 sec: 43142.1, 300 sec: 42931.0). Total num frames: 8442642432. Throughput: 0: 42911.5. Samples: 8442733140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:28,394][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 21:03:28,951][15401] Updated weights for policy 0, policy_version 515300 (0.0026) [2024-06-23 21:03:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8442806272. Throughput: 0: 43226.7. Samples: 8443005220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 21:03:33,651][15401] Updated weights for policy 0, policy_version 515310 (0.0029) [2024-06-23 21:03:36,524][15401] Updated weights for policy 0, policy_version 515320 (0.0035) [2024-06-23 21:03:38,389][15132] Fps is (10 sec: 37699.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8443019264. Throughput: 0: 42954.8. Samples: 8443118720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 21:03:41,219][15401] Updated weights for policy 0, policy_version 515330 (0.0028) [2024-06-23 21:03:43,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42821.0). Total num frames: 8443281408. Throughput: 0: 42838.8. Samples: 8443376440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 21:03:43,480][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515338_8443297792.pth... [2024-06-23 21:03:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000514711_8433025024.pth [2024-06-23 21:03:44,326][15401] Updated weights for policy 0, policy_version 515340 (0.0029) [2024-06-23 21:03:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8443461632. Throughput: 0: 42926.2. Samples: 8443642120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:48,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-23 21:03:48,782][15401] Updated weights for policy 0, policy_version 515350 (0.0035) [2024-06-23 21:03:52,129][15401] Updated weights for policy 0, policy_version 515360 (0.0031) [2024-06-23 21:03:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8443658240. Throughput: 0: 42757.3. Samples: 8443760240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 21:03:56,555][15401] Updated weights for policy 0, policy_version 515370 (0.0034) [2024-06-23 21:03:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8443920384. Throughput: 0: 42864.4. Samples: 8444018920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:03:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 21:03:59,725][15401] Updated weights for policy 0, policy_version 515380 (0.0046) [2024-06-23 21:04:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8444100608. Throughput: 0: 42639.1. Samples: 8444282080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 21:04:04,195][15401] Updated weights for policy 0, policy_version 515390 (0.0027) [2024-06-23 21:04:05,128][15349] Signal inference workers to stop experience collection... (125050 times) [2024-06-23 21:04:05,167][15401] InferenceWorker_p0-w0: stopping experience collection (125050 times) [2024-06-23 21:04:05,192][15349] Signal inference workers to resume experience collection... (125050 times) [2024-06-23 21:04:05,196][15401] InferenceWorker_p0-w0: resuming experience collection (125050 times) [2024-06-23 21:04:07,237][15401] Updated weights for policy 0, policy_version 515400 (0.0037) [2024-06-23 21:04:08,396][15132] Fps is (10 sec: 40933.8, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 8444329984. Throughput: 0: 42750.8. Samples: 8444403320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:08,397][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 21:04:11,562][15401] Updated weights for policy 0, policy_version 515410 (0.0028) [2024-06-23 21:04:13,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8444559360. Throughput: 0: 42998.2. Samples: 8444667880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:13,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 21:04:15,096][15401] Updated weights for policy 0, policy_version 515420 (0.0037) [2024-06-23 21:04:18,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 8444739584. Throughput: 0: 42653.2. Samples: 8444924620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 21:04:19,351][15401] Updated weights for policy 0, policy_version 515430 (0.0031) [2024-06-23 21:04:22,722][15401] Updated weights for policy 0, policy_version 515440 (0.0035) [2024-06-23 21:04:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8444968960. Throughput: 0: 42887.5. Samples: 8445048660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 21:04:26,877][15401] Updated weights for policy 0, policy_version 515450 (0.0051) [2024-06-23 21:04:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42601.4, 300 sec: 42820.5). Total num frames: 8445198336. Throughput: 0: 43026.6. Samples: 8445312640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 21:04:30,190][15401] Updated weights for policy 0, policy_version 515460 (0.0026) [2024-06-23 21:04:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8445394944. Throughput: 0: 42838.1. Samples: 8445569840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 21:04:34,773][15401] Updated weights for policy 0, policy_version 515470 (0.0029) [2024-06-23 21:04:37,942][15401] Updated weights for policy 0, policy_version 515480 (0.0034) [2024-06-23 21:04:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8445624320. Throughput: 0: 42993.8. Samples: 8445694960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 21:04:42,334][15401] Updated weights for policy 0, policy_version 515490 (0.0033) [2024-06-23 21:04:43,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8445853696. Throughput: 0: 43182.3. Samples: 8445962120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 21:04:45,503][15401] Updated weights for policy 0, policy_version 515500 (0.0033) [2024-06-23 21:04:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 8446050304. Throughput: 0: 42910.8. Samples: 8446213080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-23 21:04:48,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 21:04:49,861][15401] Updated weights for policy 0, policy_version 515510 (0.0037) [2024-06-23 21:04:53,250][15401] Updated weights for policy 0, policy_version 515520 (0.0031) [2024-06-23 21:04:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 8446279680. Throughput: 0: 43025.8. Samples: 8446339200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:04:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 21:04:57,380][15401] Updated weights for policy 0, policy_version 515530 (0.0046) [2024-06-23 21:04:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8446459904. Throughput: 0: 42646.4. Samples: 8446586960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:04:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 21:05:00,989][15401] Updated weights for policy 0, policy_version 515540 (0.0027) [2024-06-23 21:05:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8446672896. Throughput: 0: 42684.4. Samples: 8446845420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 21:05:05,247][15401] Updated weights for policy 0, policy_version 515550 (0.0037) [2024-06-23 21:05:07,226][15349] Signal inference workers to stop experience collection... (125100 times) [2024-06-23 21:05:07,226][15349] Signal inference workers to resume experience collection... (125100 times) [2024-06-23 21:05:07,245][15401] InferenceWorker_p0-w0: stopping experience collection (125100 times) [2024-06-23 21:05:07,245][15401] InferenceWorker_p0-w0: resuming experience collection (125100 times) [2024-06-23 21:05:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 8446902272. Throughput: 0: 42786.6. Samples: 8446974060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 21:05:08,721][15401] Updated weights for policy 0, policy_version 515560 (0.0023) [2024-06-23 21:05:13,064][15401] Updated weights for policy 0, policy_version 515570 (0.0031) [2024-06-23 21:05:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8447098880. Throughput: 0: 42520.9. Samples: 8447226080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 21:05:16,919][15401] Updated weights for policy 0, policy_version 515580 (0.0042) [2024-06-23 21:05:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8447311872. Throughput: 0: 42519.7. Samples: 8447483220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 21:05:20,554][15401] Updated weights for policy 0, policy_version 515590 (0.0032) [2024-06-23 21:05:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8447557632. Throughput: 0: 42616.1. Samples: 8447612680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 21:05:24,655][15401] Updated weights for policy 0, policy_version 515600 (0.0036) [2024-06-23 21:05:28,389][15401] Updated weights for policy 0, policy_version 515610 (0.0033) [2024-06-23 21:05:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8447754240. Throughput: 0: 42282.2. Samples: 8447864820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 21:05:32,217][15401] Updated weights for policy 0, policy_version 515620 (0.0036) [2024-06-23 21:05:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 8447934464. Throughput: 0: 42397.9. Samples: 8448120980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 21:05:36,893][15401] Updated weights for policy 0, policy_version 515630 (0.0041) [2024-06-23 21:05:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8448180224. Throughput: 0: 42492.5. Samples: 8448251360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 21:05:39,745][15401] Updated weights for policy 0, policy_version 515640 (0.0033) [2024-06-23 21:05:43,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42047.7, 300 sec: 42653.0). Total num frames: 8448376832. Throughput: 0: 42605.0. Samples: 8448504460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:43,396][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 21:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515648_8448376832.pth... [2024-06-23 21:05:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515023_8438136832.pth [2024-06-23 21:05:44,441][15401] Updated weights for policy 0, policy_version 515650 (0.0029) [2024-06-23 21:05:47,570][15401] Updated weights for policy 0, policy_version 515660 (0.0033) [2024-06-23 21:05:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 8448589824. Throughput: 0: 42407.6. Samples: 8448753760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:48,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-23 21:05:52,156][15401] Updated weights for policy 0, policy_version 515670 (0.0024) [2024-06-23 21:05:53,389][15132] Fps is (10 sec: 42626.4, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 8448802816. Throughput: 0: 42529.6. Samples: 8448887880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 21:05:55,200][15401] Updated weights for policy 0, policy_version 515680 (0.0038) [2024-06-23 21:05:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8449032192. Throughput: 0: 42653.4. Samples: 8449145480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:05:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 21:05:59,797][15401] Updated weights for policy 0, policy_version 515690 (0.0045) [2024-06-23 21:06:02,844][15401] Updated weights for policy 0, policy_version 515700 (0.0026) [2024-06-23 21:06:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8449245184. Throughput: 0: 42485.3. Samples: 8449395060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:06:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 21:06:07,344][15401] Updated weights for policy 0, policy_version 515710 (0.0030) [2024-06-23 21:06:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 8449441792. Throughput: 0: 42472.0. Samples: 8449523920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:06:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 21:06:10,306][15401] Updated weights for policy 0, policy_version 515720 (0.0045) [2024-06-23 21:06:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 8449671168. Throughput: 0: 42606.7. Samples: 8449782120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:06:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:06:14,939][15401] Updated weights for policy 0, policy_version 515730 (0.0028) [2024-06-23 21:06:17,933][15401] Updated weights for policy 0, policy_version 515740 (0.0038) [2024-06-23 21:06:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8449884160. Throughput: 0: 42508.0. Samples: 8450033840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 21:06:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-23 21:06:22,562][15401] Updated weights for policy 0, policy_version 515750 (0.0037) [2024-06-23 21:06:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 8450064384. Throughput: 0: 42516.4. Samples: 8450164600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 21:06:24,226][15349] Signal inference workers to stop experience collection... (125150 times) [2024-06-23 21:06:24,257][15401] InferenceWorker_p0-w0: stopping experience collection (125150 times) [2024-06-23 21:06:24,273][15349] Signal inference workers to resume experience collection... (125150 times) [2024-06-23 21:06:24,274][15401] InferenceWorker_p0-w0: resuming experience collection (125150 times) [2024-06-23 21:06:25,649][15401] Updated weights for policy 0, policy_version 515760 (0.0041) [2024-06-23 21:06:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8450310144. Throughput: 0: 42664.4. Samples: 8450424080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:28,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 21:06:30,364][15401] Updated weights for policy 0, policy_version 515770 (0.0027) [2024-06-23 21:06:33,302][15401] Updated weights for policy 0, policy_version 515780 (0.0040) [2024-06-23 21:06:33,397][15132] Fps is (10 sec: 47477.0, 60 sec: 43412.1, 300 sec: 42763.9). Total num frames: 8450539520. Throughput: 0: 42909.6. Samples: 8450685020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:33,398][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 21:06:37,917][15401] Updated weights for policy 0, policy_version 515790 (0.0026) [2024-06-23 21:06:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8450719744. Throughput: 0: 42816.2. Samples: 8450814620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 21:06:40,951][15401] Updated weights for policy 0, policy_version 515800 (0.0037) [2024-06-23 21:06:43,390][15132] Fps is (10 sec: 40991.1, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 8450949120. Throughput: 0: 42651.9. Samples: 8451064820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 21:06:45,611][15401] Updated weights for policy 0, policy_version 515810 (0.0038) [2024-06-23 21:06:48,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8451178496. Throughput: 0: 42916.4. Samples: 8451326300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 21:06:48,651][15401] Updated weights for policy 0, policy_version 515820 (0.0042) [2024-06-23 21:06:53,236][15401] Updated weights for policy 0, policy_version 515830 (0.0028) [2024-06-23 21:06:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 8451358720. Throughput: 0: 42979.6. Samples: 8451458000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 21:06:56,175][15401] Updated weights for policy 0, policy_version 515840 (0.0041) [2024-06-23 21:06:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 8451604480. Throughput: 0: 42868.0. Samples: 8451711180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:06:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 21:07:00,813][15401] Updated weights for policy 0, policy_version 515850 (0.0048) [2024-06-23 21:07:03,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8451817472. Throughput: 0: 43167.5. Samples: 8451976480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:03,393][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 21:07:03,837][15401] Updated weights for policy 0, policy_version 515860 (0.0037) [2024-06-23 21:07:08,229][15401] Updated weights for policy 0, policy_version 515870 (0.0030) [2024-06-23 21:07:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8452030464. Throughput: 0: 43187.6. Samples: 8452108040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:08,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 21:07:11,200][15401] Updated weights for policy 0, policy_version 515880 (0.0035) [2024-06-23 21:07:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8452243456. Throughput: 0: 42986.2. Samples: 8452358460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:07:15,784][15401] Updated weights for policy 0, policy_version 515890 (0.0037) [2024-06-23 21:07:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8452472832. Throughput: 0: 43016.6. Samples: 8452620440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:18,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-23 21:07:18,508][15349] Signal inference workers to stop experience collection... (125200 times) [2024-06-23 21:07:18,508][15349] Signal inference workers to resume experience collection... (125200 times) [2024-06-23 21:07:18,556][15401] InferenceWorker_p0-w0: stopping experience collection (125200 times) [2024-06-23 21:07:18,556][15401] InferenceWorker_p0-w0: resuming experience collection (125200 times) [2024-06-23 21:07:18,642][15401] Updated weights for policy 0, policy_version 515900 (0.0026) [2024-06-23 21:07:23,183][15401] Updated weights for policy 0, policy_version 515910 (0.0023) [2024-06-23 21:07:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42765.1). Total num frames: 8452669440. Throughput: 0: 43069.9. Samples: 8452752760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 21:07:26,138][15401] Updated weights for policy 0, policy_version 515920 (0.0023) [2024-06-23 21:07:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8452898816. Throughput: 0: 43101.4. Samples: 8453004380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 21:07:30,792][15401] Updated weights for policy 0, policy_version 515930 (0.0037) [2024-06-23 21:07:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42877.0, 300 sec: 42876.1). Total num frames: 8453111808. Throughput: 0: 43049.5. Samples: 8453263520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:07:34,330][15401] Updated weights for policy 0, policy_version 515940 (0.0034) [2024-06-23 21:07:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8453308416. Throughput: 0: 42942.6. Samples: 8453390420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 21:07:38,683][15401] Updated weights for policy 0, policy_version 515950 (0.0034) [2024-06-23 21:07:41,690][15401] Updated weights for policy 0, policy_version 515960 (0.0030) [2024-06-23 21:07:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8453537792. Throughput: 0: 42942.7. Samples: 8453643600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-23 21:07:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 21:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515963_8453537792.pth... [2024-06-23 21:07:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515338_8443297792.pth [2024-06-23 21:07:46,141][15401] Updated weights for policy 0, policy_version 515970 (0.0037) [2024-06-23 21:07:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 8453734400. Throughput: 0: 42852.8. Samples: 8453904860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:07:48,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 21:07:49,455][15401] Updated weights for policy 0, policy_version 515980 (0.0023) [2024-06-23 21:07:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8453947392. Throughput: 0: 42782.5. Samples: 8454033260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:07:53,391][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 21:07:53,902][15401] Updated weights for policy 0, policy_version 515990 (0.0037) [2024-06-23 21:07:56,888][15401] Updated weights for policy 0, policy_version 516000 (0.0051) [2024-06-23 21:07:58,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8454176768. Throughput: 0: 42772.4. Samples: 8454283220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:07:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 21:08:01,352][15401] Updated weights for policy 0, policy_version 516010 (0.0043) [2024-06-23 21:08:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 8454389760. Throughput: 0: 43020.6. Samples: 8454556360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 21:08:04,392][15401] Updated weights for policy 0, policy_version 516020 (0.0037) [2024-06-23 21:08:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8454586368. Throughput: 0: 42763.2. Samples: 8454677100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 21:08:08,930][15401] Updated weights for policy 0, policy_version 516030 (0.0032) [2024-06-23 21:08:11,967][15401] Updated weights for policy 0, policy_version 516040 (0.0035) [2024-06-23 21:08:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 8454815744. Throughput: 0: 42665.6. Samples: 8454924340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:13,395][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 21:08:17,065][15401] Updated weights for policy 0, policy_version 516050 (0.0032) [2024-06-23 21:08:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8454995968. Throughput: 0: 42991.4. Samples: 8455198140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:18,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 21:08:18,572][15349] Signal inference workers to stop experience collection... (125250 times) [2024-06-23 21:08:18,628][15401] InferenceWorker_p0-w0: stopping experience collection (125250 times) [2024-06-23 21:08:18,628][15349] Signal inference workers to resume experience collection... (125250 times) [2024-06-23 21:08:18,643][15401] InferenceWorker_p0-w0: resuming experience collection (125250 times) [2024-06-23 21:08:19,784][15401] Updated weights for policy 0, policy_version 516060 (0.0035) [2024-06-23 21:08:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.6). Total num frames: 8455225344. Throughput: 0: 42827.2. Samples: 8455317640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 21:08:24,511][15401] Updated weights for policy 0, policy_version 516070 (0.0024) [2024-06-23 21:08:27,321][15401] Updated weights for policy 0, policy_version 516080 (0.0038) [2024-06-23 21:08:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8455471104. Throughput: 0: 42923.9. Samples: 8455575180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:08:32,059][15401] Updated weights for policy 0, policy_version 516090 (0.0031) [2024-06-23 21:08:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 8455667712. Throughput: 0: 43112.0. Samples: 8455844800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 21:08:34,848][15401] Updated weights for policy 0, policy_version 516100 (0.0034) [2024-06-23 21:08:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8455880704. Throughput: 0: 42944.1. Samples: 8455965740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 21:08:39,537][15401] Updated weights for policy 0, policy_version 516110 (0.0034) [2024-06-23 21:08:42,841][15401] Updated weights for policy 0, policy_version 516120 (0.0029) [2024-06-23 21:08:43,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 8456126464. Throughput: 0: 43148.0. Samples: 8456224980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:43,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 21:08:47,206][15401] Updated weights for policy 0, policy_version 516130 (0.0034) [2024-06-23 21:08:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 8456306688. Throughput: 0: 42870.6. Samples: 8456485540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 21:08:50,643][15401] Updated weights for policy 0, policy_version 516140 (0.0039) [2024-06-23 21:08:53,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8456519680. Throughput: 0: 42867.0. Samples: 8456606120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:53,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-23 21:08:54,922][15401] Updated weights for policy 0, policy_version 516150 (0.0044) [2024-06-23 21:08:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 8456749056. Throughput: 0: 43118.3. Samples: 8456864760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:08:58,393][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 21:08:58,555][15401] Updated weights for policy 0, policy_version 516160 (0.0027) [2024-06-23 21:09:02,578][15401] Updated weights for policy 0, policy_version 516170 (0.0031) [2024-06-23 21:09:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 8456962048. Throughput: 0: 42768.0. Samples: 8457122700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:09:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:09:06,085][15401] Updated weights for policy 0, policy_version 516180 (0.0049) [2024-06-23 21:09:08,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8457175040. Throughput: 0: 42893.2. Samples: 8457247840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:09:08,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 21:09:10,125][15401] Updated weights for policy 0, policy_version 516190 (0.0041) [2024-06-23 21:09:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 8457371648. Throughput: 0: 42972.2. Samples: 8457508920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-23 21:09:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 21:09:13,753][15401] Updated weights for policy 0, policy_version 516200 (0.0039) [2024-06-23 21:09:17,610][15401] Updated weights for policy 0, policy_version 516210 (0.0031) [2024-06-23 21:09:18,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 8457601024. Throughput: 0: 42577.6. Samples: 8457761060. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:18,396][15132] Avg episode reward: [(0, '0.845')] [2024-06-23 21:09:20,325][15349] Signal inference workers to stop experience collection... (125300 times) [2024-06-23 21:09:20,361][15401] InferenceWorker_p0-w0: stopping experience collection (125300 times) [2024-06-23 21:09:20,389][15349] Signal inference workers to resume experience collection... (125300 times) [2024-06-23 21:09:20,391][15401] InferenceWorker_p0-w0: resuming experience collection (125300 times) [2024-06-23 21:09:21,467][15401] Updated weights for policy 0, policy_version 516220 (0.0036) [2024-06-23 21:09:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8457797632. Throughput: 0: 42810.8. Samples: 8457892220. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 21:09:25,173][15401] Updated weights for policy 0, policy_version 516230 (0.0044) [2024-06-23 21:09:28,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8458027008. Throughput: 0: 42899.7. Samples: 8458155360. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 21:09:29,068][15401] Updated weights for policy 0, policy_version 516240 (0.0029) [2024-06-23 21:09:32,728][15401] Updated weights for policy 0, policy_version 516250 (0.0040) [2024-06-23 21:09:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8458240000. Throughput: 0: 42685.2. Samples: 8458406380. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 21:09:36,590][15401] Updated weights for policy 0, policy_version 516260 (0.0029) [2024-06-23 21:09:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8458452992. Throughput: 0: 42932.3. Samples: 8458538060. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 21:09:40,685][15401] Updated weights for policy 0, policy_version 516270 (0.0032) [2024-06-23 21:09:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 8458649600. Throughput: 0: 42919.6. Samples: 8458796040. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 21:09:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516276_8458665984.pth... [2024-06-23 21:09:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515648_8448376832.pth [2024-06-23 21:09:44,644][15401] Updated weights for policy 0, policy_version 516280 (0.0028) [2024-06-23 21:09:48,384][15401] Updated weights for policy 0, policy_version 516290 (0.0031) [2024-06-23 21:09:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8458895360. Throughput: 0: 42871.4. Samples: 8459051920. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 21:09:52,106][15401] Updated weights for policy 0, policy_version 516300 (0.0028) [2024-06-23 21:09:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8459108352. Throughput: 0: 42960.0. Samples: 8459181040. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 21:09:55,838][15401] Updated weights for policy 0, policy_version 516310 (0.0034) [2024-06-23 21:09:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42598.3, 300 sec: 42820.2). Total num frames: 8459304960. Throughput: 0: 42975.4. Samples: 8459442920. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:09:58,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 21:09:59,920][15401] Updated weights for policy 0, policy_version 516320 (0.0034) [2024-06-23 21:10:03,339][15401] Updated weights for policy 0, policy_version 516330 (0.0031) [2024-06-23 21:10:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8459550720. Throughput: 0: 42912.9. Samples: 8459691860. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:10:07,552][15401] Updated weights for policy 0, policy_version 516340 (0.0033) [2024-06-23 21:10:08,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8459747328. Throughput: 0: 42968.4. Samples: 8459825800. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:08,393][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 21:10:10,842][15401] Updated weights for policy 0, policy_version 516350 (0.0031) [2024-06-23 21:10:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8459943936. Throughput: 0: 42825.4. Samples: 8460082500. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 21:10:15,086][15401] Updated weights for policy 0, policy_version 516360 (0.0028) [2024-06-23 21:10:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 8460189696. Throughput: 0: 42816.1. Samples: 8460333100. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 21:10:18,655][15401] Updated weights for policy 0, policy_version 516370 (0.0036) [2024-06-23 21:10:22,619][15401] Updated weights for policy 0, policy_version 516380 (0.0038) [2024-06-23 21:10:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8460386304. Throughput: 0: 42984.6. Samples: 8460472380. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 21:10:26,283][15401] Updated weights for policy 0, policy_version 516390 (0.0037) [2024-06-23 21:10:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8460599296. Throughput: 0: 42834.6. Samples: 8460723600. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 21:10:30,349][15401] Updated weights for policy 0, policy_version 516400 (0.0046) [2024-06-23 21:10:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8460812288. Throughput: 0: 42823.1. Samples: 8460978960. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 21:10:33,999][15401] Updated weights for policy 0, policy_version 516410 (0.0035) [2024-06-23 21:10:37,929][15401] Updated weights for policy 0, policy_version 516420 (0.0029) [2024-06-23 21:10:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 8461025280. Throughput: 0: 42851.2. Samples: 8461109340. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 21:10:39,285][15349] Signal inference workers to stop experience collection... (125350 times) [2024-06-23 21:10:39,338][15401] InferenceWorker_p0-w0: stopping experience collection (125350 times) [2024-06-23 21:10:39,396][15349] Signal inference workers to resume experience collection... (125350 times) [2024-06-23 21:10:39,396][15401] InferenceWorker_p0-w0: resuming experience collection (125350 times) [2024-06-23 21:10:41,473][15401] Updated weights for policy 0, policy_version 516430 (0.0029) [2024-06-23 21:10:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8461238272. Throughput: 0: 42762.3. Samples: 8461367120. Policy #0 lag: (min: 2.0, avg: 11.2, max: 21.0) [2024-06-23 21:10:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 21:10:45,577][15401] Updated weights for policy 0, policy_version 516440 (0.0025) [2024-06-23 21:10:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8461451264. Throughput: 0: 42936.3. Samples: 8461624000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:10:48,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 21:10:49,179][15401] Updated weights for policy 0, policy_version 516450 (0.0026) [2024-06-23 21:10:53,362][15401] Updated weights for policy 0, policy_version 516460 (0.0035) [2024-06-23 21:10:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8461680640. Throughput: 0: 42897.3. Samples: 8461756180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:10:53,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 21:10:56,993][15401] Updated weights for policy 0, policy_version 516470 (0.0033) [2024-06-23 21:10:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8461893632. Throughput: 0: 42791.9. Samples: 8462008140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:10:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 21:11:00,912][15401] Updated weights for policy 0, policy_version 516480 (0.0032) [2024-06-23 21:11:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8462106624. Throughput: 0: 42991.1. Samples: 8462267700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 21:11:04,713][15401] Updated weights for policy 0, policy_version 516490 (0.0033) [2024-06-23 21:11:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8462303232. Throughput: 0: 42627.2. Samples: 8462390600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:08,404][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 21:11:08,596][15401] Updated weights for policy 0, policy_version 516500 (0.0032) [2024-06-23 21:11:12,810][15401] Updated weights for policy 0, policy_version 516510 (0.0037) [2024-06-23 21:11:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8462532608. Throughput: 0: 42682.6. Samples: 8462644320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:13,390][15132] Avg episode reward: [(0, '0.152')] [2024-06-23 21:11:16,118][15401] Updated weights for policy 0, policy_version 516520 (0.0046) [2024-06-23 21:11:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 8462712832. Throughput: 0: 42886.4. Samples: 8462908840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:18,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 21:11:20,255][15401] Updated weights for policy 0, policy_version 516530 (0.0032) [2024-06-23 21:11:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8462974976. Throughput: 0: 42750.7. Samples: 8463033120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 21:11:23,736][15401] Updated weights for policy 0, policy_version 516540 (0.0041) [2024-06-23 21:11:27,798][15401] Updated weights for policy 0, policy_version 516550 (0.0022) [2024-06-23 21:11:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42821.7). Total num frames: 8463171584. Throughput: 0: 42733.4. Samples: 8463290120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 21:11:31,387][15401] Updated weights for policy 0, policy_version 516560 (0.0037) [2024-06-23 21:11:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8463368192. Throughput: 0: 42799.9. Samples: 8463550000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 21:11:35,448][15401] Updated weights for policy 0, policy_version 516570 (0.0048) [2024-06-23 21:11:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8463597568. Throughput: 0: 42806.2. Samples: 8463682460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:38,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 21:11:38,883][15401] Updated weights for policy 0, policy_version 516580 (0.0029) [2024-06-23 21:11:43,106][15401] Updated weights for policy 0, policy_version 516590 (0.0032) [2024-06-23 21:11:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8463810560. Throughput: 0: 42914.1. Samples: 8463939280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:43,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-23 21:11:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516590_8463810560.pth... [2024-06-23 21:11:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000515963_8453537792.pth [2024-06-23 21:11:46,670][15401] Updated weights for policy 0, policy_version 516600 (0.0024) [2024-06-23 21:11:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8464023552. Throughput: 0: 42857.6. Samples: 8464196300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 21:11:50,876][15401] Updated weights for policy 0, policy_version 516610 (0.0022) [2024-06-23 21:11:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8464236544. Throughput: 0: 43025.9. Samples: 8464326760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 21:11:54,100][15401] Updated weights for policy 0, policy_version 516620 (0.0034) [2024-06-23 21:11:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 8464449536. Throughput: 0: 43035.7. Samples: 8464580920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:11:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 21:11:58,523][15401] Updated weights for policy 0, policy_version 516630 (0.0047) [2024-06-23 21:12:01,671][15401] Updated weights for policy 0, policy_version 516640 (0.0028) [2024-06-23 21:12:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8464662528. Throughput: 0: 42778.0. Samples: 8464833860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:12:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:12:06,148][15401] Updated weights for policy 0, policy_version 516650 (0.0049) [2024-06-23 21:12:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8464875520. Throughput: 0: 42925.8. Samples: 8464964780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-23 21:12:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 21:12:09,083][15349] Signal inference workers to stop experience collection... (125400 times) [2024-06-23 21:12:09,127][15401] InferenceWorker_p0-w0: stopping experience collection (125400 times) [2024-06-23 21:12:09,141][15349] Signal inference workers to resume experience collection... (125400 times) [2024-06-23 21:12:09,143][15401] InferenceWorker_p0-w0: resuming experience collection (125400 times) [2024-06-23 21:12:09,293][15401] Updated weights for policy 0, policy_version 516660 (0.0038) [2024-06-23 21:12:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8465088512. Throughput: 0: 42884.9. Samples: 8465219940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:12:13,623][15401] Updated weights for policy 0, policy_version 516670 (0.0038) [2024-06-23 21:12:17,106][15401] Updated weights for policy 0, policy_version 516680 (0.0039) [2024-06-23 21:12:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8465317888. Throughput: 0: 42707.6. Samples: 8465471840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 21:12:21,273][15401] Updated weights for policy 0, policy_version 516690 (0.0042) [2024-06-23 21:12:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8465498112. Throughput: 0: 42634.7. Samples: 8465601020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 21:12:24,639][15401] Updated weights for policy 0, policy_version 516700 (0.0037) [2024-06-23 21:12:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8465727488. Throughput: 0: 42643.4. Samples: 8465858220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 21:12:28,902][15401] Updated weights for policy 0, policy_version 516710 (0.0042) [2024-06-23 21:12:32,684][15401] Updated weights for policy 0, policy_version 516720 (0.0039) [2024-06-23 21:12:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8465956864. Throughput: 0: 42577.6. Samples: 8466112280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 21:12:36,667][15401] Updated weights for policy 0, policy_version 516730 (0.0035) [2024-06-23 21:12:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8466153472. Throughput: 0: 42679.5. Samples: 8466247340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 21:12:40,363][15401] Updated weights for policy 0, policy_version 516740 (0.0025) [2024-06-23 21:12:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8466382848. Throughput: 0: 42812.3. Samples: 8466507480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 21:12:44,404][15401] Updated weights for policy 0, policy_version 516750 (0.0033) [2024-06-23 21:12:47,886][15401] Updated weights for policy 0, policy_version 516760 (0.0032) [2024-06-23 21:12:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 8466612224. Throughput: 0: 42734.4. Samples: 8466756900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 21:12:52,074][15401] Updated weights for policy 0, policy_version 516770 (0.0032) [2024-06-23 21:12:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8466792448. Throughput: 0: 42743.0. Samples: 8466888220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:53,391][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 21:12:55,538][15401] Updated weights for policy 0, policy_version 516780 (0.0039) [2024-06-23 21:12:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8467021824. Throughput: 0: 42628.3. Samples: 8467138220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:12:58,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 21:12:59,878][15401] Updated weights for policy 0, policy_version 516790 (0.0033) [2024-06-23 21:13:03,234][15401] Updated weights for policy 0, policy_version 516800 (0.0031) [2024-06-23 21:13:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 8467251200. Throughput: 0: 42734.8. Samples: 8467394900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 21:13:07,476][15401] Updated weights for policy 0, policy_version 516810 (0.0037) [2024-06-23 21:13:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8467431424. Throughput: 0: 42809.0. Samples: 8467527420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:08,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 21:13:10,807][15401] Updated weights for policy 0, policy_version 516820 (0.0028) [2024-06-23 21:13:13,392][15132] Fps is (10 sec: 40947.6, 60 sec: 42869.3, 300 sec: 42931.2). Total num frames: 8467660800. Throughput: 0: 42781.5. Samples: 8467783520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:13,393][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 21:13:14,996][15401] Updated weights for policy 0, policy_version 516830 (0.0028) [2024-06-23 21:13:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8467890176. Throughput: 0: 42818.1. Samples: 8468039100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:18,396][15132] Avg episode reward: [(0, '0.199')] [2024-06-23 21:13:18,534][15401] Updated weights for policy 0, policy_version 516840 (0.0029) [2024-06-23 21:13:20,259][15349] Signal inference workers to stop experience collection... (125450 times) [2024-06-23 21:13:20,259][15349] Signal inference workers to resume experience collection... (125450 times) [2024-06-23 21:13:20,304][15401] InferenceWorker_p0-w0: stopping experience collection (125450 times) [2024-06-23 21:13:20,304][15401] InferenceWorker_p0-w0: resuming experience collection (125450 times) [2024-06-23 21:13:22,599][15401] Updated weights for policy 0, policy_version 516850 (0.0032) [2024-06-23 21:13:23,389][15132] Fps is (10 sec: 42611.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8468086784. Throughput: 0: 42776.9. Samples: 8468172300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 21:13:26,206][15401] Updated weights for policy 0, policy_version 516860 (0.0026) [2024-06-23 21:13:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8468316160. Throughput: 0: 42598.4. Samples: 8468424400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 21:13:30,526][15401] Updated weights for policy 0, policy_version 516870 (0.0040) [2024-06-23 21:13:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8468529152. Throughput: 0: 42731.1. Samples: 8468679800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 21:13:34,017][15401] Updated weights for policy 0, policy_version 516880 (0.0024) [2024-06-23 21:13:38,346][15401] Updated weights for policy 0, policy_version 516890 (0.0039) [2024-06-23 21:13:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8468725760. Throughput: 0: 42616.0. Samples: 8468805940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-23 21:13:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 21:13:42,034][15401] Updated weights for policy 0, policy_version 516900 (0.0037) [2024-06-23 21:13:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8468955136. Throughput: 0: 42717.8. Samples: 8469060520. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:13:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 21:13:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516905_8468971520.pth... [2024-06-23 21:13:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516276_8458665984.pth [2024-06-23 21:13:45,914][15401] Updated weights for policy 0, policy_version 516910 (0.0032) [2024-06-23 21:13:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8469151744. Throughput: 0: 42659.9. Samples: 8469314600. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:13:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 21:13:49,716][15401] Updated weights for policy 0, policy_version 516920 (0.0033) [2024-06-23 21:13:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 8469364736. Throughput: 0: 42448.3. Samples: 8469437600. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:13:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 21:13:53,470][15401] Updated weights for policy 0, policy_version 516930 (0.0028) [2024-06-23 21:13:57,341][15401] Updated weights for policy 0, policy_version 516940 (0.0030) [2024-06-23 21:13:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8469594112. Throughput: 0: 42566.0. Samples: 8469698860. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:13:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 21:14:01,013][15401] Updated weights for policy 0, policy_version 516950 (0.0029) [2024-06-23 21:14:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 8469774336. Throughput: 0: 42695.0. Samples: 8469960380. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:03,394][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 21:14:04,996][15401] Updated weights for policy 0, policy_version 516960 (0.0037) [2024-06-23 21:14:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8470003712. Throughput: 0: 42362.7. Samples: 8470078620. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 21:14:09,360][15401] Updated weights for policy 0, policy_version 516970 (0.0032) [2024-06-23 21:14:12,663][15401] Updated weights for policy 0, policy_version 516980 (0.0034) [2024-06-23 21:14:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.5, 300 sec: 42821.5). Total num frames: 8470233088. Throughput: 0: 42522.2. Samples: 8470337900. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 21:14:16,918][15401] Updated weights for policy 0, policy_version 516990 (0.0030) [2024-06-23 21:14:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 8470413312. Throughput: 0: 42664.8. Samples: 8470599720. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 21:14:20,115][15401] Updated weights for policy 0, policy_version 517000 (0.0038) [2024-06-23 21:14:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8470659072. Throughput: 0: 42589.7. Samples: 8470722580. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:23,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 21:14:24,331][15401] Updated weights for policy 0, policy_version 517010 (0.0030) [2024-06-23 21:14:28,059][15401] Updated weights for policy 0, policy_version 517020 (0.0033) [2024-06-23 21:14:28,390][15132] Fps is (10 sec: 44234.6, 60 sec: 42325.0, 300 sec: 42764.9). Total num frames: 8470855680. Throughput: 0: 42870.2. Samples: 8470989700. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:28,391][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 21:14:31,722][15401] Updated weights for policy 0, policy_version 517030 (0.0036) [2024-06-23 21:14:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8471052288. Throughput: 0: 42921.8. Samples: 8471246080. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 21:14:35,325][15349] Signal inference workers to stop experience collection... (125500 times) [2024-06-23 21:14:35,357][15401] InferenceWorker_p0-w0: stopping experience collection (125500 times) [2024-06-23 21:14:35,385][15349] Signal inference workers to resume experience collection... (125500 times) [2024-06-23 21:14:35,385][15401] InferenceWorker_p0-w0: resuming experience collection (125500 times) [2024-06-23 21:14:35,518][15401] Updated weights for policy 0, policy_version 517040 (0.0037) [2024-06-23 21:14:38,390][15132] Fps is (10 sec: 45877.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8471314432. Throughput: 0: 42974.2. Samples: 8471371440. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 21:14:39,226][15401] Updated weights for policy 0, policy_version 517050 (0.0039) [2024-06-23 21:14:43,213][15401] Updated weights for policy 0, policy_version 517060 (0.0033) [2024-06-23 21:14:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8471511040. Throughput: 0: 43035.6. Samples: 8471635460. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 21:14:46,880][15401] Updated weights for policy 0, policy_version 517070 (0.0041) [2024-06-23 21:14:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8471707648. Throughput: 0: 42878.2. Samples: 8471889900. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:14:50,695][15401] Updated weights for policy 0, policy_version 517080 (0.0041) [2024-06-23 21:14:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 8471953408. Throughput: 0: 43050.2. Samples: 8472015880. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 21:14:54,481][15401] Updated weights for policy 0, policy_version 517090 (0.0037) [2024-06-23 21:14:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8472150016. Throughput: 0: 43031.1. Samples: 8472274300. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:14:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 21:14:58,514][15401] Updated weights for policy 0, policy_version 517100 (0.0030) [2024-06-23 21:15:02,082][15401] Updated weights for policy 0, policy_version 517110 (0.0032) [2024-06-23 21:15:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 8472363008. Throughput: 0: 42883.2. Samples: 8472529460. Policy #0 lag: (min: 2.0, avg: 9.6, max: 22.0) [2024-06-23 21:15:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 21:15:06,109][15401] Updated weights for policy 0, policy_version 517120 (0.0033) [2024-06-23 21:15:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8472576000. Throughput: 0: 43053.4. Samples: 8472659880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 21:15:09,683][15401] Updated weights for policy 0, policy_version 517130 (0.0027) [2024-06-23 21:15:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8472788992. Throughput: 0: 42757.3. Samples: 8472913760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 21:15:13,806][15401] Updated weights for policy 0, policy_version 517140 (0.0036) [2024-06-23 21:15:17,781][15401] Updated weights for policy 0, policy_version 517150 (0.0037) [2024-06-23 21:15:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8473001984. Throughput: 0: 42605.7. Samples: 8473163340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:18,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:15:21,402][15401] Updated weights for policy 0, policy_version 517160 (0.0036) [2024-06-23 21:15:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 8473198592. Throughput: 0: 42680.4. Samples: 8473292060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 21:15:25,398][15401] Updated weights for policy 0, policy_version 517170 (0.0028) [2024-06-23 21:15:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.8, 300 sec: 42820.6). Total num frames: 8473444352. Throughput: 0: 42451.8. Samples: 8473545800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 21:15:28,934][15401] Updated weights for policy 0, policy_version 517180 (0.0044) [2024-06-23 21:15:33,232][15401] Updated weights for policy 0, policy_version 517190 (0.0029) [2024-06-23 21:15:33,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 8473640960. Throughput: 0: 42628.1. Samples: 8473808260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:33,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 21:15:36,480][15401] Updated weights for policy 0, policy_version 517200 (0.0042) [2024-06-23 21:15:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8473853952. Throughput: 0: 42579.0. Samples: 8473931940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 21:15:41,017][15401] Updated weights for policy 0, policy_version 517210 (0.0032) [2024-06-23 21:15:43,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8474083328. Throughput: 0: 42664.8. Samples: 8474194220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 21:15:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517217_8474083328.pth... [2024-06-23 21:15:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516590_8463810560.pth [2024-06-23 21:15:44,555][15401] Updated weights for policy 0, policy_version 517220 (0.0036) [2024-06-23 21:15:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8474263552. Throughput: 0: 42715.9. Samples: 8474451680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:48,396][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 21:15:48,552][15401] Updated weights for policy 0, policy_version 517230 (0.0039) [2024-06-23 21:15:52,274][15401] Updated weights for policy 0, policy_version 517240 (0.0034) [2024-06-23 21:15:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 8474476544. Throughput: 0: 42587.6. Samples: 8474576320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 21:15:56,284][15401] Updated weights for policy 0, policy_version 517250 (0.0033) [2024-06-23 21:15:56,987][15349] Signal inference workers to stop experience collection... (125550 times) [2024-06-23 21:15:56,992][15349] Signal inference workers to resume experience collection... (125550 times) [2024-06-23 21:15:57,001][15401] InferenceWorker_p0-w0: stopping experience collection (125550 times) [2024-06-23 21:15:57,009][15401] InferenceWorker_p0-w0: resuming experience collection (125550 times) [2024-06-23 21:15:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8474722304. Throughput: 0: 42607.6. Samples: 8474831100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:15:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 21:16:00,141][15401] Updated weights for policy 0, policy_version 517260 (0.0051) [2024-06-23 21:16:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 8474886144. Throughput: 0: 42997.0. Samples: 8475098200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:16:03,946][15401] Updated weights for policy 0, policy_version 517270 (0.0027) [2024-06-23 21:16:07,623][15401] Updated weights for policy 0, policy_version 517280 (0.0021) [2024-06-23 21:16:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8475131904. Throughput: 0: 42708.4. Samples: 8475213940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 21:16:11,443][15401] Updated weights for policy 0, policy_version 517290 (0.0032) [2024-06-23 21:16:13,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8475377664. Throughput: 0: 42989.4. Samples: 8475480320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 21:16:15,143][15401] Updated weights for policy 0, policy_version 517300 (0.0030) [2024-06-23 21:16:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8475541504. Throughput: 0: 42844.5. Samples: 8475736160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:18,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 21:16:19,235][15401] Updated weights for policy 0, policy_version 517310 (0.0035) [2024-06-23 21:16:23,209][15401] Updated weights for policy 0, policy_version 517320 (0.0037) [2024-06-23 21:16:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8475787264. Throughput: 0: 42816.0. Samples: 8475858660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 21:16:27,033][15401] Updated weights for policy 0, policy_version 517330 (0.0041) [2024-06-23 21:16:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8476016640. Throughput: 0: 42833.3. Samples: 8476121720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 21:16:30,565][15401] Updated weights for policy 0, policy_version 517340 (0.0038) [2024-06-23 21:16:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8476213248. Throughput: 0: 42877.0. Samples: 8476381140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 21:16:33,396][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 21:16:34,633][15401] Updated weights for policy 0, policy_version 517350 (0.0033) [2024-06-23 21:16:38,031][15401] Updated weights for policy 0, policy_version 517360 (0.0024) [2024-06-23 21:16:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8476426240. Throughput: 0: 42853.6. Samples: 8476504740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:16:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 21:16:42,209][15401] Updated weights for policy 0, policy_version 517370 (0.0031) [2024-06-23 21:16:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8476655616. Throughput: 0: 42957.3. Samples: 8476764280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:16:43,392][15132] Avg episode reward: [(0, '0.281')] [2024-06-23 21:16:45,439][15401] Updated weights for policy 0, policy_version 517380 (0.0023) [2024-06-23 21:16:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8476835840. Throughput: 0: 42749.3. Samples: 8477021920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:16:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 21:16:49,690][15401] Updated weights for policy 0, policy_version 517390 (0.0033) [2024-06-23 21:16:52,945][15401] Updated weights for policy 0, policy_version 517400 (0.0030) [2024-06-23 21:16:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8477081600. Throughput: 0: 43020.7. Samples: 8477149860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:16:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 21:16:57,253][15401] Updated weights for policy 0, policy_version 517410 (0.0032) [2024-06-23 21:16:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8477294592. Throughput: 0: 42839.1. Samples: 8477408080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:16:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 21:17:00,367][15401] Updated weights for policy 0, policy_version 517420 (0.0042) [2024-06-23 21:17:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8477491200. Throughput: 0: 42969.3. Samples: 8477669780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 21:17:04,989][15401] Updated weights for policy 0, policy_version 517430 (0.0020) [2024-06-23 21:17:08,121][15401] Updated weights for policy 0, policy_version 517440 (0.0032) [2024-06-23 21:17:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 8477736960. Throughput: 0: 42998.3. Samples: 8477793580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:08,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 21:17:12,655][15401] Updated weights for policy 0, policy_version 517450 (0.0030) [2024-06-23 21:17:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 8477917184. Throughput: 0: 42897.5. Samples: 8478052100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:13,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 21:17:14,978][15349] Signal inference workers to stop experience collection... (125600 times) [2024-06-23 21:17:14,978][15349] Signal inference workers to resume experience collection... (125600 times) [2024-06-23 21:17:14,997][15401] InferenceWorker_p0-w0: stopping experience collection (125600 times) [2024-06-23 21:17:14,997][15401] InferenceWorker_p0-w0: resuming experience collection (125600 times) [2024-06-23 21:17:15,843][15401] Updated weights for policy 0, policy_version 517460 (0.0031) [2024-06-23 21:17:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8478130176. Throughput: 0: 42856.4. Samples: 8478309680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 21:17:20,224][15401] Updated weights for policy 0, policy_version 517470 (0.0034) [2024-06-23 21:17:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8478359552. Throughput: 0: 42877.4. Samples: 8478434220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 21:17:23,640][15401] Updated weights for policy 0, policy_version 517480 (0.0050) [2024-06-23 21:17:27,921][15401] Updated weights for policy 0, policy_version 517490 (0.0028) [2024-06-23 21:17:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 8478572544. Throughput: 0: 42817.8. Samples: 8478691080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:28,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 21:17:31,505][15401] Updated weights for policy 0, policy_version 517500 (0.0038) [2024-06-23 21:17:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8478785536. Throughput: 0: 42860.8. Samples: 8478950660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 21:17:35,654][15401] Updated weights for policy 0, policy_version 517510 (0.0027) [2024-06-23 21:17:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8479014912. Throughput: 0: 42836.3. Samples: 8479077500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 21:17:39,261][15401] Updated weights for policy 0, policy_version 517520 (0.0043) [2024-06-23 21:17:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 8479195136. Throughput: 0: 42827.2. Samples: 8479335300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 21:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517530_8479211520.pth... [2024-06-23 21:17:43,420][15401] Updated weights for policy 0, policy_version 517530 (0.0038) [2024-06-23 21:17:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000516905_8468971520.pth [2024-06-23 21:17:46,874][15401] Updated weights for policy 0, policy_version 517540 (0.0029) [2024-06-23 21:17:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8479408128. Throughput: 0: 42683.1. Samples: 8479590520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 21:17:51,099][15401] Updated weights for policy 0, policy_version 517550 (0.0031) [2024-06-23 21:17:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8479637504. Throughput: 0: 42734.1. Samples: 8479716620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 21:17:54,925][15401] Updated weights for policy 0, policy_version 517560 (0.0036) [2024-06-23 21:17:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8479834112. Throughput: 0: 42596.0. Samples: 8479968920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:17:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 21:17:59,084][15401] Updated weights for policy 0, policy_version 517570 (0.0038) [2024-06-23 21:18:02,520][15401] Updated weights for policy 0, policy_version 517580 (0.0032) [2024-06-23 21:18:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8480047104. Throughput: 0: 42388.9. Samples: 8480217180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 21:18:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 21:18:07,015][15401] Updated weights for policy 0, policy_version 517590 (0.0040) [2024-06-23 21:18:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 8480276480. Throughput: 0: 42516.0. Samples: 8480347440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 21:18:10,116][15401] Updated weights for policy 0, policy_version 517600 (0.0030) [2024-06-23 21:18:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 8480473088. Throughput: 0: 42410.6. Samples: 8480599460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 21:18:14,478][15401] Updated weights for policy 0, policy_version 517610 (0.0020) [2024-06-23 21:18:17,995][15401] Updated weights for policy 0, policy_version 517620 (0.0034) [2024-06-23 21:18:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8480686080. Throughput: 0: 42275.6. Samples: 8480853060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:18:22,154][15401] Updated weights for policy 0, policy_version 517630 (0.0036) [2024-06-23 21:18:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8480899072. Throughput: 0: 42336.3. Samples: 8480982640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 21:18:25,719][15401] Updated weights for policy 0, policy_version 517640 (0.0031) [2024-06-23 21:18:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 8481112064. Throughput: 0: 42297.8. Samples: 8481238700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:28,399][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 21:18:29,717][15401] Updated weights for policy 0, policy_version 517650 (0.0045) [2024-06-23 21:18:33,241][15401] Updated weights for policy 0, policy_version 517660 (0.0039) [2024-06-23 21:18:33,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8481341440. Throughput: 0: 42350.6. Samples: 8481496300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 21:18:37,346][15401] Updated weights for policy 0, policy_version 517670 (0.0032) [2024-06-23 21:18:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8481538048. Throughput: 0: 42471.7. Samples: 8481627840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 21:18:41,001][15401] Updated weights for policy 0, policy_version 517680 (0.0045) [2024-06-23 21:18:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8481767424. Throughput: 0: 42403.9. Samples: 8481877100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:43,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-23 21:18:45,355][15401] Updated weights for policy 0, policy_version 517690 (0.0040) [2024-06-23 21:18:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8481980416. Throughput: 0: 42534.3. Samples: 8482131220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 21:18:48,597][15401] Updated weights for policy 0, policy_version 517700 (0.0043) [2024-06-23 21:18:52,942][15401] Updated weights for policy 0, policy_version 517710 (0.0039) [2024-06-23 21:18:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8482160640. Throughput: 0: 42442.7. Samples: 8482257360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 21:18:56,495][15401] Updated weights for policy 0, policy_version 517720 (0.0038) [2024-06-23 21:18:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8482390016. Throughput: 0: 42440.2. Samples: 8482509260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:18:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 21:19:00,367][15401] Updated weights for policy 0, policy_version 517730 (0.0044) [2024-06-23 21:19:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8482619392. Throughput: 0: 42528.8. Samples: 8482766860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 21:19:04,080][15401] Updated weights for policy 0, policy_version 517740 (0.0042) [2024-06-23 21:19:07,719][15349] Signal inference workers to stop experience collection... (125650 times) [2024-06-23 21:19:07,772][15401] InferenceWorker_p0-w0: stopping experience collection (125650 times) [2024-06-23 21:19:07,780][15349] Signal inference workers to resume experience collection... (125650 times) [2024-06-23 21:19:07,787][15401] InferenceWorker_p0-w0: resuming experience collection (125650 times) [2024-06-23 21:19:07,913][15401] Updated weights for policy 0, policy_version 517750 (0.0032) [2024-06-23 21:19:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8482816000. Throughput: 0: 42604.9. Samples: 8482899860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 21:19:11,841][15401] Updated weights for policy 0, policy_version 517760 (0.0036) [2024-06-23 21:19:13,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8483045376. Throughput: 0: 42463.6. Samples: 8483149560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 21:19:15,648][15401] Updated weights for policy 0, policy_version 517770 (0.0033) [2024-06-23 21:19:18,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 8483274752. Throughput: 0: 42538.8. Samples: 8483410540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 21:19:19,553][15401] Updated weights for policy 0, policy_version 517780 (0.0024) [2024-06-23 21:19:23,265][15401] Updated weights for policy 0, policy_version 517790 (0.0032) [2024-06-23 21:19:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 8483471360. Throughput: 0: 42490.7. Samples: 8483539920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 21:19:27,201][15401] Updated weights for policy 0, policy_version 517800 (0.0026) [2024-06-23 21:19:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8483684352. Throughput: 0: 42590.3. Samples: 8483793660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-23 21:19:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 21:19:31,230][15401] Updated weights for policy 0, policy_version 517810 (0.0025) [2024-06-23 21:19:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8483897344. Throughput: 0: 42562.6. Samples: 8484046540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 21:19:34,848][15401] Updated weights for policy 0, policy_version 517820 (0.0053) [2024-06-23 21:19:38,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8484061184. Throughput: 0: 42521.5. Samples: 8484170820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:38,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 21:19:38,957][15401] Updated weights for policy 0, policy_version 517830 (0.0030) [2024-06-23 21:19:42,416][15401] Updated weights for policy 0, policy_version 517840 (0.0045) [2024-06-23 21:19:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8484339712. Throughput: 0: 42786.6. Samples: 8484434660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:43,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-23 21:19:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517843_8484339712.pth... [2024-06-23 21:19:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517217_8474083328.pth [2024-06-23 21:19:46,710][15401] Updated weights for policy 0, policy_version 517850 (0.0052) [2024-06-23 21:19:48,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8484536320. Throughput: 0: 42697.1. Samples: 8484688220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 21:19:49,922][15401] Updated weights for policy 0, policy_version 517860 (0.0030) [2024-06-23 21:19:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8484716544. Throughput: 0: 42624.2. Samples: 8484817940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 21:19:54,472][15401] Updated weights for policy 0, policy_version 517870 (0.0030) [2024-06-23 21:19:57,412][15401] Updated weights for policy 0, policy_version 517880 (0.0041) [2024-06-23 21:19:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8484962304. Throughput: 0: 42815.6. Samples: 8485076260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:19:58,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 21:20:02,038][15401] Updated weights for policy 0, policy_version 517890 (0.0034) [2024-06-23 21:20:03,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8485175296. Throughput: 0: 43009.6. Samples: 8485346080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:03,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 21:20:04,714][15401] Updated weights for policy 0, policy_version 517900 (0.0038) [2024-06-23 21:20:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8485388288. Throughput: 0: 42955.5. Samples: 8485472920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 21:20:09,503][15401] Updated weights for policy 0, policy_version 517910 (0.0040) [2024-06-23 21:20:12,251][15401] Updated weights for policy 0, policy_version 517920 (0.0035) [2024-06-23 21:20:13,393][15132] Fps is (10 sec: 45870.8, 60 sec: 43142.1, 300 sec: 42820.1). Total num frames: 8485634048. Throughput: 0: 43021.1. Samples: 8485729760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:13,393][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 21:20:17,224][15401] Updated weights for policy 0, policy_version 517930 (0.0041) [2024-06-23 21:20:18,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 8485830656. Throughput: 0: 43222.2. Samples: 8485991640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:18,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 21:20:20,135][15401] Updated weights for policy 0, policy_version 517940 (0.0037) [2024-06-23 21:20:23,389][15132] Fps is (10 sec: 39334.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8486027264. Throughput: 0: 43244.8. Samples: 8486116840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 21:20:24,779][15401] Updated weights for policy 0, policy_version 517950 (0.0033) [2024-06-23 21:20:27,468][15349] Signal inference workers to stop experience collection... (125700 times) [2024-06-23 21:20:27,468][15349] Signal inference workers to resume experience collection... (125700 times) [2024-06-23 21:20:27,489][15401] InferenceWorker_p0-w0: stopping experience collection (125700 times) [2024-06-23 21:20:27,490][15401] InferenceWorker_p0-w0: resuming experience collection (125700 times) [2024-06-23 21:20:27,794][15401] Updated weights for policy 0, policy_version 517960 (0.0030) [2024-06-23 21:20:28,389][15132] Fps is (10 sec: 45886.1, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 8486289408. Throughput: 0: 43082.7. Samples: 8486373380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 21:20:32,700][15401] Updated weights for policy 0, policy_version 517970 (0.0048) [2024-06-23 21:20:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8486469632. Throughput: 0: 43372.4. Samples: 8486639980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-23 21:20:35,270][15401] Updated weights for policy 0, policy_version 517980 (0.0031) [2024-06-23 21:20:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 8486682624. Throughput: 0: 43097.8. Samples: 8486757340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 21:20:40,360][15401] Updated weights for policy 0, policy_version 517990 (0.0028) [2024-06-23 21:20:42,945][15401] Updated weights for policy 0, policy_version 518000 (0.0033) [2024-06-23 21:20:43,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8486928384. Throughput: 0: 43130.5. Samples: 8487017140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 21:20:47,888][15401] Updated weights for policy 0, policy_version 518010 (0.0050) [2024-06-23 21:20:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8487075840. Throughput: 0: 42882.8. Samples: 8487275700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:48,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-23 21:20:50,787][15401] Updated weights for policy 0, policy_version 518020 (0.0040) [2024-06-23 21:20:53,390][15132] Fps is (10 sec: 37683.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8487305216. Throughput: 0: 42694.3. Samples: 8487394160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 21:20:55,729][15401] Updated weights for policy 0, policy_version 518030 (0.0031) [2024-06-23 21:20:58,275][15401] Updated weights for policy 0, policy_version 518040 (0.0033) [2024-06-23 21:20:58,392][15132] Fps is (10 sec: 49140.0, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 8487567360. Throughput: 0: 42833.8. Samples: 8487657240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-23 21:20:58,393][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 21:21:03,067][15401] Updated weights for policy 0, policy_version 518050 (0.0025) [2024-06-23 21:21:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8487747584. Throughput: 0: 42846.2. Samples: 8487919620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:21:05,851][15401] Updated weights for policy 0, policy_version 518060 (0.0031) [2024-06-23 21:21:08,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8487960576. Throughput: 0: 42742.2. Samples: 8488040240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:08,394][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:21:10,735][15401] Updated weights for policy 0, policy_version 518070 (0.0036) [2024-06-23 21:21:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.9, 300 sec: 42931.6). Total num frames: 8488206336. Throughput: 0: 43046.7. Samples: 8488310480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 21:21:13,394][15401] Updated weights for policy 0, policy_version 518080 (0.0024) [2024-06-23 21:21:18,234][15401] Updated weights for policy 0, policy_version 518090 (0.0031) [2024-06-23 21:21:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 8488386560. Throughput: 0: 42949.8. Samples: 8488572720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-23 21:21:20,968][15401] Updated weights for policy 0, policy_version 518100 (0.0036) [2024-06-23 21:21:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8488615936. Throughput: 0: 42993.0. Samples: 8488692020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 21:21:25,765][15401] Updated weights for policy 0, policy_version 518110 (0.0033) [2024-06-23 21:21:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8488828928. Throughput: 0: 43115.3. Samples: 8488957320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 21:21:28,838][15401] Updated weights for policy 0, policy_version 518120 (0.0045) [2024-06-23 21:21:30,068][15349] Signal inference workers to stop experience collection... (125750 times) [2024-06-23 21:21:30,068][15349] Signal inference workers to resume experience collection... (125750 times) [2024-06-23 21:21:30,088][15401] InferenceWorker_p0-w0: stopping experience collection (125750 times) [2024-06-23 21:21:30,088][15401] InferenceWorker_p0-w0: resuming experience collection (125750 times) [2024-06-23 21:21:33,251][15401] Updated weights for policy 0, policy_version 518130 (0.0029) [2024-06-23 21:21:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 8489041920. Throughput: 0: 43106.8. Samples: 8489215500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 21:21:36,703][15401] Updated weights for policy 0, policy_version 518140 (0.0030) [2024-06-23 21:21:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42820.9). Total num frames: 8489287680. Throughput: 0: 43255.1. Samples: 8489340640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 21:21:40,839][15401] Updated weights for policy 0, policy_version 518150 (0.0034) [2024-06-23 21:21:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 8489467904. Throughput: 0: 43172.5. Samples: 8489599900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 21:21:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518156_8489467904.pth... [2024-06-23 21:21:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517530_8479211520.pth [2024-06-23 21:21:44,217][15401] Updated weights for policy 0, policy_version 518160 (0.0037) [2024-06-23 21:21:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8489680896. Throughput: 0: 43135.5. Samples: 8489860720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:48,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 21:21:48,404][15401] Updated weights for policy 0, policy_version 518170 (0.0035) [2024-06-23 21:21:51,828][15401] Updated weights for policy 0, policy_version 518180 (0.0036) [2024-06-23 21:21:53,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43963.7, 300 sec: 42876.1). Total num frames: 8489943040. Throughput: 0: 43431.0. Samples: 8489994640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 21:21:55,724][15401] Updated weights for policy 0, policy_version 518190 (0.0050) [2024-06-23 21:21:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 8490123264. Throughput: 0: 43115.4. Samples: 8490250680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:21:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-23 21:21:59,374][15401] Updated weights for policy 0, policy_version 518200 (0.0040) [2024-06-23 21:22:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8490336256. Throughput: 0: 43105.7. Samples: 8490512480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 21:22:03,631][15401] Updated weights for policy 0, policy_version 518210 (0.0029) [2024-06-23 21:22:06,975][15401] Updated weights for policy 0, policy_version 518220 (0.0035) [2024-06-23 21:22:08,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43963.7, 300 sec: 42987.2). Total num frames: 8490598400. Throughput: 0: 43442.5. Samples: 8490646940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 21:22:11,082][15401] Updated weights for policy 0, policy_version 518230 (0.0029) [2024-06-23 21:22:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8490778624. Throughput: 0: 43274.2. Samples: 8490904660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 21:22:14,602][15401] Updated weights for policy 0, policy_version 518240 (0.0041) [2024-06-23 21:22:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8490991616. Throughput: 0: 43272.4. Samples: 8491162760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 21:22:18,527][15401] Updated weights for policy 0, policy_version 518250 (0.0041) [2024-06-23 21:22:22,247][15401] Updated weights for policy 0, policy_version 518260 (0.0038) [2024-06-23 21:22:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.4, 300 sec: 42876.4). Total num frames: 8491220992. Throughput: 0: 43463.0. Samples: 8491296480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 21:22:25,974][15401] Updated weights for policy 0, policy_version 518270 (0.0028) [2024-06-23 21:22:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8491433984. Throughput: 0: 43391.6. Samples: 8491552520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 21:22:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 21:22:29,732][15401] Updated weights for policy 0, policy_version 518280 (0.0033) [2024-06-23 21:22:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8491646976. Throughput: 0: 43336.5. Samples: 8491810860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:33,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 21:22:33,430][15401] Updated weights for policy 0, policy_version 518290 (0.0038) [2024-06-23 21:22:37,602][15401] Updated weights for policy 0, policy_version 518300 (0.0032) [2024-06-23 21:22:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8491876352. Throughput: 0: 43222.3. Samples: 8491939640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 21:22:41,450][15401] Updated weights for policy 0, policy_version 518310 (0.0033) [2024-06-23 21:22:42,932][15349] Signal inference workers to stop experience collection... (125800 times) [2024-06-23 21:22:42,934][15349] Signal inference workers to resume experience collection... (125800 times) [2024-06-23 21:22:42,973][15401] InferenceWorker_p0-w0: stopping experience collection (125800 times) [2024-06-23 21:22:42,973][15401] InferenceWorker_p0-w0: resuming experience collection (125800 times) [2024-06-23 21:22:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 8492072960. Throughput: 0: 43305.0. Samples: 8492199400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 21:22:44,953][15401] Updated weights for policy 0, policy_version 518320 (0.0037) [2024-06-23 21:22:48,393][15132] Fps is (10 sec: 40946.9, 60 sec: 43415.3, 300 sec: 42875.6). Total num frames: 8492285952. Throughput: 0: 43238.7. Samples: 8492458360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:48,393][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 21:22:48,912][15401] Updated weights for policy 0, policy_version 518330 (0.0024) [2024-06-23 21:22:52,389][15401] Updated weights for policy 0, policy_version 518340 (0.0034) [2024-06-23 21:22:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8492531712. Throughput: 0: 43107.5. Samples: 8492586780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 21:22:56,471][15401] Updated weights for policy 0, policy_version 518350 (0.0038) [2024-06-23 21:22:58,390][15132] Fps is (10 sec: 44250.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8492728320. Throughput: 0: 43159.0. Samples: 8492846820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:22:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 21:23:00,138][15401] Updated weights for policy 0, policy_version 518360 (0.0034) [2024-06-23 21:23:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8492908544. Throughput: 0: 43284.4. Samples: 8493110560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 21:23:03,992][15401] Updated weights for policy 0, policy_version 518370 (0.0030) [2024-06-23 21:23:07,590][15401] Updated weights for policy 0, policy_version 518380 (0.0029) [2024-06-23 21:23:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8493154304. Throughput: 0: 42942.0. Samples: 8493228860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 21:23:11,677][15401] Updated weights for policy 0, policy_version 518390 (0.0035) [2024-06-23 21:23:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8493350912. Throughput: 0: 42951.6. Samples: 8493485340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:13,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-23 21:23:15,333][15401] Updated weights for policy 0, policy_version 518400 (0.0035) [2024-06-23 21:23:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 8493563904. Throughput: 0: 42909.3. Samples: 8493741780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 21:23:19,397][15401] Updated weights for policy 0, policy_version 518410 (0.0035) [2024-06-23 21:23:22,782][15401] Updated weights for policy 0, policy_version 518420 (0.0030) [2024-06-23 21:23:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8493809664. Throughput: 0: 43011.1. Samples: 8493875140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 21:23:27,408][15401] Updated weights for policy 0, policy_version 518430 (0.0033) [2024-06-23 21:23:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8494006272. Throughput: 0: 42949.6. Samples: 8494132140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 21:23:30,707][15401] Updated weights for policy 0, policy_version 518440 (0.0025) [2024-06-23 21:23:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8494202880. Throughput: 0: 42735.2. Samples: 8494381300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:33,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-23 21:23:34,767][15401] Updated weights for policy 0, policy_version 518450 (0.0047) [2024-06-23 21:23:38,254][15401] Updated weights for policy 0, policy_version 518460 (0.0043) [2024-06-23 21:23:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8494448640. Throughput: 0: 42727.2. Samples: 8494509500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:38,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-23 21:23:42,213][15401] Updated weights for policy 0, policy_version 518470 (0.0044) [2024-06-23 21:23:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8494628864. Throughput: 0: 42715.6. Samples: 8494769020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:43,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 21:23:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518472_8494645248.pth... [2024-06-23 21:23:43,458][15349] Signal inference workers to stop experience collection... (125850 times) [2024-06-23 21:23:43,458][15349] Signal inference workers to resume experience collection... (125850 times) [2024-06-23 21:23:43,469][15401] InferenceWorker_p0-w0: stopping experience collection (125850 times) [2024-06-23 21:23:43,470][15401] InferenceWorker_p0-w0: resuming experience collection (125850 times) [2024-06-23 21:23:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000517843_8484339712.pth [2024-06-23 21:23:45,846][15401] Updated weights for policy 0, policy_version 518480 (0.0033) [2024-06-23 21:23:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.8, 300 sec: 43042.7). Total num frames: 8494858240. Throughput: 0: 42679.0. Samples: 8495031120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:48,396][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 21:23:49,940][15401] Updated weights for policy 0, policy_version 518490 (0.0043) [2024-06-23 21:23:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 8495087616. Throughput: 0: 42888.8. Samples: 8495158860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-23 21:23:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 21:23:53,587][15401] Updated weights for policy 0, policy_version 518500 (0.0051) [2024-06-23 21:23:57,359][15401] Updated weights for policy 0, policy_version 518510 (0.0027) [2024-06-23 21:23:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 8495284224. Throughput: 0: 42880.9. Samples: 8495414980. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:23:58,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 21:24:01,123][15401] Updated weights for policy 0, policy_version 518520 (0.0039) [2024-06-23 21:24:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8495480832. Throughput: 0: 42972.0. Samples: 8495675520. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 21:24:05,150][15401] Updated weights for policy 0, policy_version 518530 (0.0032) [2024-06-23 21:24:08,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 8495710208. Throughput: 0: 42695.9. Samples: 8495796560. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:08,393][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 21:24:09,097][15401] Updated weights for policy 0, policy_version 518540 (0.0030) [2024-06-23 21:24:12,827][15401] Updated weights for policy 0, policy_version 518550 (0.0037) [2024-06-23 21:24:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8495923200. Throughput: 0: 42737.0. Samples: 8496055300. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 21:24:16,957][15401] Updated weights for policy 0, policy_version 518560 (0.0039) [2024-06-23 21:24:18,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8496136192. Throughput: 0: 42834.1. Samples: 8496308840. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 21:24:20,623][15401] Updated weights for policy 0, policy_version 518570 (0.0035) [2024-06-23 21:24:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42987.1). Total num frames: 8496365568. Throughput: 0: 42785.2. Samples: 8496434840. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 21:24:24,832][15401] Updated weights for policy 0, policy_version 518580 (0.0038) [2024-06-23 21:24:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8496562176. Throughput: 0: 42686.6. Samples: 8496689920. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:28,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 21:24:28,594][15401] Updated weights for policy 0, policy_version 518590 (0.0031) [2024-06-23 21:24:32,322][15401] Updated weights for policy 0, policy_version 518600 (0.0025) [2024-06-23 21:24:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 8496758784. Throughput: 0: 42660.9. Samples: 8496950860. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 21:24:36,170][15401] Updated weights for policy 0, policy_version 518610 (0.0038) [2024-06-23 21:24:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8497004544. Throughput: 0: 42589.3. Samples: 8497075380. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 21:24:39,781][15401] Updated weights for policy 0, policy_version 518620 (0.0044) [2024-06-23 21:24:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8497201152. Throughput: 0: 42551.8. Samples: 8497329820. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 21:24:43,898][15401] Updated weights for policy 0, policy_version 518630 (0.0028) [2024-06-23 21:24:47,424][15401] Updated weights for policy 0, policy_version 518640 (0.0039) [2024-06-23 21:24:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 8497414144. Throughput: 0: 42328.7. Samples: 8497580320. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 21:24:52,066][15401] Updated weights for policy 0, policy_version 518650 (0.0040) [2024-06-23 21:24:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8497643520. Throughput: 0: 42545.8. Samples: 8497711020. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 21:24:54,977][15401] Updated weights for policy 0, policy_version 518660 (0.0039) [2024-06-23 21:24:58,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 42987.2). Total num frames: 8497856512. Throughput: 0: 42368.4. Samples: 8497961980. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:24:58,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 21:24:59,782][15401] Updated weights for policy 0, policy_version 518670 (0.0040) [2024-06-23 21:25:02,929][15401] Updated weights for policy 0, policy_version 518680 (0.0048) [2024-06-23 21:25:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8498069504. Throughput: 0: 42364.1. Samples: 8498215220. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:25:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:25:04,344][15349] Signal inference workers to stop experience collection... (125900 times) [2024-06-23 21:25:04,348][15349] Signal inference workers to resume experience collection... (125900 times) [2024-06-23 21:25:04,396][15401] InferenceWorker_p0-w0: stopping experience collection (125900 times) [2024-06-23 21:25:04,396][15401] InferenceWorker_p0-w0: resuming experience collection (125900 times) [2024-06-23 21:25:07,866][15401] Updated weights for policy 0, policy_version 518690 (0.0030) [2024-06-23 21:25:08,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42054.0, 300 sec: 42710.0). Total num frames: 8498233344. Throughput: 0: 42432.1. Samples: 8498344280. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:25:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 21:25:10,538][15401] Updated weights for policy 0, policy_version 518700 (0.0036) [2024-06-23 21:25:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 8498479104. Throughput: 0: 42370.7. Samples: 8498596600. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:25:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 21:25:15,603][15401] Updated weights for policy 0, policy_version 518710 (0.0033) [2024-06-23 21:25:18,206][15401] Updated weights for policy 0, policy_version 518720 (0.0037) [2024-06-23 21:25:18,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8498708480. Throughput: 0: 42190.2. Samples: 8498849420. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:25:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-23 21:25:23,103][15401] Updated weights for policy 0, policy_version 518730 (0.0030) [2024-06-23 21:25:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 8498872320. Throughput: 0: 42304.3. Samples: 8498979080. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-06-23 21:25:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 21:25:26,159][15401] Updated weights for policy 0, policy_version 518740 (0.0041) [2024-06-23 21:25:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8499134464. Throughput: 0: 42412.6. Samples: 8499238380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 21:25:30,554][15401] Updated weights for policy 0, policy_version 518750 (0.0039) [2024-06-23 21:25:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8499331072. Throughput: 0: 42577.3. Samples: 8499496300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 21:25:33,741][15401] Updated weights for policy 0, policy_version 518760 (0.0038) [2024-06-23 21:25:38,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 8499511296. Throughput: 0: 42513.9. Samples: 8499624140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 21:25:38,407][15401] Updated weights for policy 0, policy_version 518770 (0.0034) [2024-06-23 21:25:41,385][15401] Updated weights for policy 0, policy_version 518780 (0.0039) [2024-06-23 21:25:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8499773440. Throughput: 0: 42633.2. Samples: 8499880380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:43,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-23 21:25:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518785_8499773440.pth... [2024-06-23 21:25:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518156_8489467904.pth [2024-06-23 21:25:45,758][15401] Updated weights for policy 0, policy_version 518790 (0.0037) [2024-06-23 21:25:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8499970048. Throughput: 0: 42788.8. Samples: 8500140720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 21:25:49,056][15401] Updated weights for policy 0, policy_version 518800 (0.0030) [2024-06-23 21:25:53,210][15401] Updated weights for policy 0, policy_version 518810 (0.0031) [2024-06-23 21:25:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 8500183040. Throughput: 0: 42663.8. Samples: 8500264160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 21:25:56,774][15401] Updated weights for policy 0, policy_version 518820 (0.0039) [2024-06-23 21:25:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 8500412416. Throughput: 0: 42816.0. Samples: 8500523320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:25:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 21:26:00,762][15401] Updated weights for policy 0, policy_version 518830 (0.0034) [2024-06-23 21:26:03,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42320.8, 300 sec: 42875.2). Total num frames: 8500609024. Throughput: 0: 42958.8. Samples: 8500782840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:03,397][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 21:26:04,438][15401] Updated weights for policy 0, policy_version 518840 (0.0030) [2024-06-23 21:26:08,282][15401] Updated weights for policy 0, policy_version 518850 (0.0033) [2024-06-23 21:26:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 8500838400. Throughput: 0: 42832.5. Samples: 8500906540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 21:26:12,133][15401] Updated weights for policy 0, policy_version 518860 (0.0046) [2024-06-23 21:26:13,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8501035008. Throughput: 0: 42847.9. Samples: 8501166540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:26:15,809][15401] Updated weights for policy 0, policy_version 518870 (0.0026) [2024-06-23 21:26:17,828][15349] Signal inference workers to stop experience collection... (125950 times) [2024-06-23 21:26:17,829][15349] Signal inference workers to resume experience collection... (125950 times) [2024-06-23 21:26:17,840][15401] InferenceWorker_p0-w0: stopping experience collection (125950 times) [2024-06-23 21:26:17,841][15401] InferenceWorker_p0-w0: resuming experience collection (125950 times) [2024-06-23 21:26:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 8501248000. Throughput: 0: 42878.2. Samples: 8501425820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:18,391][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 21:26:19,757][15401] Updated weights for policy 0, policy_version 518880 (0.0034) [2024-06-23 21:26:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 8501477376. Throughput: 0: 42928.3. Samples: 8501555920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:23,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 21:26:23,545][15401] Updated weights for policy 0, policy_version 518890 (0.0044) [2024-06-23 21:26:27,561][15401] Updated weights for policy 0, policy_version 518900 (0.0042) [2024-06-23 21:26:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 8501673984. Throughput: 0: 42987.8. Samples: 8501814820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:28,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-23 21:26:31,117][15401] Updated weights for policy 0, policy_version 518910 (0.0035) [2024-06-23 21:26:33,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8501886976. Throughput: 0: 42914.7. Samples: 8502071980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:33,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 21:26:35,013][15401] Updated weights for policy 0, policy_version 518920 (0.0027) [2024-06-23 21:26:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8502116352. Throughput: 0: 43090.8. Samples: 8502203240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 21:26:38,593][15401] Updated weights for policy 0, policy_version 518930 (0.0041) [2024-06-23 21:26:42,778][15401] Updated weights for policy 0, policy_version 518940 (0.0035) [2024-06-23 21:26:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8502312960. Throughput: 0: 42897.4. Samples: 8502453700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 21:26:46,440][15401] Updated weights for policy 0, policy_version 518950 (0.0033) [2024-06-23 21:26:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8502542336. Throughput: 0: 42824.2. Samples: 8502709660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 21:26:50,377][15401] Updated weights for policy 0, policy_version 518960 (0.0033) [2024-06-23 21:26:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8502755328. Throughput: 0: 43032.1. Samples: 8502842980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-23 21:26:53,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-23 21:26:53,859][15401] Updated weights for policy 0, policy_version 518970 (0.0041) [2024-06-23 21:26:57,967][15401] Updated weights for policy 0, policy_version 518980 (0.0042) [2024-06-23 21:26:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8502968320. Throughput: 0: 42949.8. Samples: 8503099280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:26:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:27:01,347][15401] Updated weights for policy 0, policy_version 518990 (0.0046) [2024-06-23 21:27:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 8503197696. Throughput: 0: 42693.5. Samples: 8503347020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 21:27:05,791][15401] Updated weights for policy 0, policy_version 519000 (0.0030) [2024-06-23 21:27:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8503394304. Throughput: 0: 42792.4. Samples: 8503481580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 21:27:09,064][15401] Updated weights for policy 0, policy_version 519010 (0.0041) [2024-06-23 21:27:13,389][15401] Updated weights for policy 0, policy_version 519020 (0.0032) [2024-06-23 21:27:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8503623680. Throughput: 0: 42868.3. Samples: 8503743900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 21:27:16,670][15401] Updated weights for policy 0, policy_version 519030 (0.0035) [2024-06-23 21:27:18,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 8503853056. Throughput: 0: 42873.4. Samples: 8504001180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 21:27:20,968][15401] Updated weights for policy 0, policy_version 519040 (0.0032) [2024-06-23 21:27:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8504066048. Throughput: 0: 42958.1. Samples: 8504136360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 21:27:24,036][15401] Updated weights for policy 0, policy_version 519050 (0.0032) [2024-06-23 21:27:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8504262656. Throughput: 0: 43148.9. Samples: 8504395400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 21:27:28,607][15401] Updated weights for policy 0, policy_version 519060 (0.0035) [2024-06-23 21:27:29,817][15349] Signal inference workers to stop experience collection... (126000 times) [2024-06-23 21:27:29,819][15349] Signal inference workers to resume experience collection... (126000 times) [2024-06-23 21:27:29,865][15401] InferenceWorker_p0-w0: stopping experience collection (126000 times) [2024-06-23 21:27:29,865][15401] InferenceWorker_p0-w0: resuming experience collection (126000 times) [2024-06-23 21:27:31,731][15401] Updated weights for policy 0, policy_version 519070 (0.0030) [2024-06-23 21:27:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 8504492032. Throughput: 0: 43094.4. Samples: 8504648900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-23 21:27:36,210][15401] Updated weights for policy 0, policy_version 519080 (0.0031) [2024-06-23 21:27:38,391][15132] Fps is (10 sec: 44232.2, 60 sec: 43143.8, 300 sec: 42820.4). Total num frames: 8504705024. Throughput: 0: 43182.1. Samples: 8504786220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:38,391][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 21:27:39,359][15401] Updated weights for policy 0, policy_version 519090 (0.0023) [2024-06-23 21:27:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.5). Total num frames: 8504901632. Throughput: 0: 43049.8. Samples: 8505036520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 21:27:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519098_8504901632.pth... [2024-06-23 21:27:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518472_8494645248.pth [2024-06-23 21:27:44,085][15401] Updated weights for policy 0, policy_version 519100 (0.0045) [2024-06-23 21:27:47,238][15401] Updated weights for policy 0, policy_version 519110 (0.0048) [2024-06-23 21:27:48,389][15132] Fps is (10 sec: 44241.7, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 8505147392. Throughput: 0: 43138.4. Samples: 8505288240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 21:27:51,813][15401] Updated weights for policy 0, policy_version 519120 (0.0031) [2024-06-23 21:27:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8505344000. Throughput: 0: 43243.7. Samples: 8505427540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:53,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 21:27:54,678][15401] Updated weights for policy 0, policy_version 519130 (0.0038) [2024-06-23 21:27:58,390][15132] Fps is (10 sec: 39319.3, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 8505540608. Throughput: 0: 42968.0. Samples: 8505677480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:27:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 21:27:59,300][15401] Updated weights for policy 0, policy_version 519140 (0.0039) [2024-06-23 21:28:02,492][15401] Updated weights for policy 0, policy_version 519150 (0.0043) [2024-06-23 21:28:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 8505802752. Throughput: 0: 42931.2. Samples: 8505933080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:28:03,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-23 21:28:06,751][15401] Updated weights for policy 0, policy_version 519160 (0.0045) [2024-06-23 21:28:08,390][15132] Fps is (10 sec: 44238.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8505982976. Throughput: 0: 43026.8. Samples: 8506072560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:28:08,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 21:28:10,054][15401] Updated weights for policy 0, policy_version 519170 (0.0030) [2024-06-23 21:28:13,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8506179584. Throughput: 0: 42789.4. Samples: 8506320920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:28:13,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-23 21:28:14,406][15401] Updated weights for policy 0, policy_version 519180 (0.0036) [2024-06-23 21:28:17,735][15401] Updated weights for policy 0, policy_version 519190 (0.0030) [2024-06-23 21:28:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8506425344. Throughput: 0: 42807.1. Samples: 8506575220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-23 21:28:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 21:28:22,521][15401] Updated weights for policy 0, policy_version 519200 (0.0033) [2024-06-23 21:28:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 8506621952. Throughput: 0: 42805.5. Samples: 8506712420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 21:28:25,654][15401] Updated weights for policy 0, policy_version 519210 (0.0039) [2024-06-23 21:28:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8506834944. Throughput: 0: 42772.1. Samples: 8506961260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 21:28:30,032][15401] Updated weights for policy 0, policy_version 519220 (0.0033) [2024-06-23 21:28:33,284][15401] Updated weights for policy 0, policy_version 519230 (0.0027) [2024-06-23 21:28:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8507064320. Throughput: 0: 42887.6. Samples: 8507218180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:28:37,583][15401] Updated weights for policy 0, policy_version 519240 (0.0040) [2024-06-23 21:28:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42599.1, 300 sec: 42820.5). Total num frames: 8507260928. Throughput: 0: 42792.4. Samples: 8507353200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:28:40,939][15401] Updated weights for policy 0, policy_version 519250 (0.0029) [2024-06-23 21:28:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8507473920. Throughput: 0: 42611.5. Samples: 8507595080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:43,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 21:28:45,532][15401] Updated weights for policy 0, policy_version 519260 (0.0036) [2024-06-23 21:28:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8507686912. Throughput: 0: 42572.0. Samples: 8507848820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 21:28:48,624][15349] Signal inference workers to stop experience collection... (126050 times) [2024-06-23 21:28:48,674][15401] InferenceWorker_p0-w0: stopping experience collection (126050 times) [2024-06-23 21:28:48,681][15349] Signal inference workers to resume experience collection... (126050 times) [2024-06-23 21:28:48,697][15401] InferenceWorker_p0-w0: resuming experience collection (126050 times) [2024-06-23 21:28:48,817][15401] Updated weights for policy 0, policy_version 519270 (0.0027) [2024-06-23 21:28:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8507867136. Throughput: 0: 42440.1. Samples: 8507982360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 21:28:53,401][15401] Updated weights for policy 0, policy_version 519280 (0.0032) [2024-06-23 21:28:56,339][15401] Updated weights for policy 0, policy_version 519290 (0.0032) [2024-06-23 21:28:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.8, 300 sec: 42876.1). Total num frames: 8508129280. Throughput: 0: 42528.3. Samples: 8508234700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:28:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 21:29:00,884][15401] Updated weights for policy 0, policy_version 519300 (0.0030) [2024-06-23 21:29:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 8508342272. Throughput: 0: 42544.0. Samples: 8508489700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 21:29:04,074][15401] Updated weights for policy 0, policy_version 519310 (0.0037) [2024-06-23 21:29:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8508506112. Throughput: 0: 42368.7. Samples: 8508619020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-23 21:29:08,636][15401] Updated weights for policy 0, policy_version 519320 (0.0043) [2024-06-23 21:29:11,930][15401] Updated weights for policy 0, policy_version 519330 (0.0038) [2024-06-23 21:29:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8508784640. Throughput: 0: 42445.3. Samples: 8508871300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:13,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 21:29:16,251][15401] Updated weights for policy 0, policy_version 519340 (0.0028) [2024-06-23 21:29:18,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8508981248. Throughput: 0: 42477.8. Samples: 8509129680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 21:29:19,432][15401] Updated weights for policy 0, policy_version 519350 (0.0027) [2024-06-23 21:29:23,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8509161472. Throughput: 0: 42216.4. Samples: 8509252940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:29:24,079][15401] Updated weights for policy 0, policy_version 519360 (0.0039) [2024-06-23 21:29:26,887][15401] Updated weights for policy 0, policy_version 519370 (0.0038) [2024-06-23 21:29:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8509407232. Throughput: 0: 42633.9. Samples: 8509513500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 21:29:31,450][15401] Updated weights for policy 0, policy_version 519380 (0.0031) [2024-06-23 21:29:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8509620224. Throughput: 0: 42916.0. Samples: 8509780040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 21:29:34,671][15401] Updated weights for policy 0, policy_version 519390 (0.0043) [2024-06-23 21:29:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8509816832. Throughput: 0: 42735.4. Samples: 8509905460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:29:38,913][15401] Updated weights for policy 0, policy_version 519400 (0.0035) [2024-06-23 21:29:42,296][15401] Updated weights for policy 0, policy_version 519410 (0.0041) [2024-06-23 21:29:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8510062592. Throughput: 0: 42941.8. Samples: 8510167080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 21:29:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519413_8510062592.pth... [2024-06-23 21:29:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000518785_8499773440.pth [2024-06-23 21:29:46,608][15401] Updated weights for policy 0, policy_version 519420 (0.0047) [2024-06-23 21:29:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8510259200. Throughput: 0: 42944.9. Samples: 8510422220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 21:29:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 21:29:49,823][15401] Updated weights for policy 0, policy_version 519430 (0.0037) [2024-06-23 21:29:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42765.4). Total num frames: 8510472192. Throughput: 0: 42916.5. Samples: 8510550260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:29:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 21:29:54,154][15401] Updated weights for policy 0, policy_version 519440 (0.0042) [2024-06-23 21:29:57,548][15401] Updated weights for policy 0, policy_version 519450 (0.0025) [2024-06-23 21:29:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8510701568. Throughput: 0: 43094.2. Samples: 8510810540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:29:58,395][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 21:30:01,928][15401] Updated weights for policy 0, policy_version 519460 (0.0043) [2024-06-23 21:30:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8510881792. Throughput: 0: 43087.9. Samples: 8511068640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:03,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 21:30:03,476][15349] Signal inference workers to stop experience collection... (126100 times) [2024-06-23 21:30:03,477][15349] Signal inference workers to resume experience collection... (126100 times) [2024-06-23 21:30:03,495][15401] InferenceWorker_p0-w0: stopping experience collection (126100 times) [2024-06-23 21:30:03,495][15401] InferenceWorker_p0-w0: resuming experience collection (126100 times) [2024-06-23 21:30:05,180][15401] Updated weights for policy 0, policy_version 519470 (0.0044) [2024-06-23 21:30:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 8511111168. Throughput: 0: 43123.6. Samples: 8511193500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 21:30:09,665][15401] Updated weights for policy 0, policy_version 519480 (0.0039) [2024-06-23 21:30:12,707][15401] Updated weights for policy 0, policy_version 519490 (0.0036) [2024-06-23 21:30:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8511356928. Throughput: 0: 43237.2. Samples: 8511459180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:30:17,417][15401] Updated weights for policy 0, policy_version 519500 (0.0024) [2024-06-23 21:30:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 8511504384. Throughput: 0: 43035.5. Samples: 8511716640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 21:30:20,314][15401] Updated weights for policy 0, policy_version 519510 (0.0042) [2024-06-23 21:30:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 8511782912. Throughput: 0: 42842.3. Samples: 8511833360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 21:30:24,973][15401] Updated weights for policy 0, policy_version 519520 (0.0041) [2024-06-23 21:30:27,902][15401] Updated weights for policy 0, policy_version 519530 (0.0020) [2024-06-23 21:30:28,390][15132] Fps is (10 sec: 50789.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 8512012288. Throughput: 0: 42976.5. Samples: 8512101020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 21:30:32,309][15401] Updated weights for policy 0, policy_version 519540 (0.0041) [2024-06-23 21:30:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 8512159744. Throughput: 0: 43277.7. Samples: 8512369720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 21:30:35,570][15401] Updated weights for policy 0, policy_version 519550 (0.0040) [2024-06-23 21:30:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8512421888. Throughput: 0: 43072.0. Samples: 8512488500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 21:30:39,939][15401] Updated weights for policy 0, policy_version 519560 (0.0038) [2024-06-23 21:30:43,163][15401] Updated weights for policy 0, policy_version 519570 (0.0041) [2024-06-23 21:30:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8512634880. Throughput: 0: 43077.7. Samples: 8512749040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:43,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 21:30:47,802][15401] Updated weights for policy 0, policy_version 519580 (0.0042) [2024-06-23 21:30:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8512815104. Throughput: 0: 43339.2. Samples: 8513018900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:30:50,742][15401] Updated weights for policy 0, policy_version 519590 (0.0039) [2024-06-23 21:30:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 8513077248. Throughput: 0: 43297.8. Samples: 8513141900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:30:55,236][15401] Updated weights for policy 0, policy_version 519600 (0.0032) [2024-06-23 21:30:58,270][15401] Updated weights for policy 0, policy_version 519610 (0.0030) [2024-06-23 21:30:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42988.1). Total num frames: 8513290240. Throughput: 0: 43145.4. Samples: 8513400720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:30:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 21:31:02,870][15401] Updated weights for policy 0, policy_version 519620 (0.0033) [2024-06-23 21:31:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 8513470464. Throughput: 0: 43256.8. Samples: 8513663300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:31:03,392][15132] Avg episode reward: [(0, '0.226')] [2024-06-23 21:31:05,835][15401] Updated weights for policy 0, policy_version 519630 (0.0031) [2024-06-23 21:31:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 8513716224. Throughput: 0: 43323.9. Samples: 8513782940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:31:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 21:31:10,191][15401] Updated weights for policy 0, policy_version 519640 (0.0028) [2024-06-23 21:31:13,344][15401] Updated weights for policy 0, policy_version 519650 (0.0034) [2024-06-23 21:31:13,390][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8513945600. Throughput: 0: 43180.0. Samples: 8514044120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:31:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 21:31:17,671][15401] Updated weights for policy 0, policy_version 519660 (0.0039) [2024-06-23 21:31:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 8514125824. Throughput: 0: 43087.6. Samples: 8514308660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-23 21:31:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 21:31:20,884][15401] Updated weights for policy 0, policy_version 519670 (0.0028) [2024-06-23 21:31:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8514355200. Throughput: 0: 43175.2. Samples: 8514431380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 21:31:25,530][15401] Updated weights for policy 0, policy_version 519680 (0.0041) [2024-06-23 21:31:28,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 43042.7). Total num frames: 8514584576. Throughput: 0: 43230.2. Samples: 8514694500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:28,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 21:31:28,919][15401] Updated weights for policy 0, policy_version 519690 (0.0024) [2024-06-23 21:31:28,933][15349] Signal inference workers to stop experience collection... (126150 times) [2024-06-23 21:31:28,940][15349] Signal inference workers to resume experience collection... (126150 times) [2024-06-23 21:31:28,985][15401] InferenceWorker_p0-w0: stopping experience collection (126150 times) [2024-06-23 21:31:28,985][15401] InferenceWorker_p0-w0: resuming experience collection (126150 times) [2024-06-23 21:31:32,942][15401] Updated weights for policy 0, policy_version 519700 (0.0033) [2024-06-23 21:31:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 8514764800. Throughput: 0: 42963.1. Samples: 8514952240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-23 21:31:36,557][15401] Updated weights for policy 0, policy_version 519710 (0.0035) [2024-06-23 21:31:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8514994176. Throughput: 0: 42963.2. Samples: 8515075240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 21:31:40,612][15401] Updated weights for policy 0, policy_version 519720 (0.0034) [2024-06-23 21:31:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8515223552. Throughput: 0: 42883.9. Samples: 8515330500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 21:31:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519728_8515223552.pth... [2024-06-23 21:31:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519098_8504901632.pth [2024-06-23 21:31:44,094][15401] Updated weights for policy 0, policy_version 519730 (0.0037) [2024-06-23 21:31:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8515403776. Throughput: 0: 42823.2. Samples: 8515590240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-23 21:31:48,470][15401] Updated weights for policy 0, policy_version 519740 (0.0029) [2024-06-23 21:31:51,661][15401] Updated weights for policy 0, policy_version 519750 (0.0029) [2024-06-23 21:31:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 8515633152. Throughput: 0: 42873.8. Samples: 8515712360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:53,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 21:31:56,047][15401] Updated weights for policy 0, policy_version 519760 (0.0029) [2024-06-23 21:31:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8515846144. Throughput: 0: 42910.6. Samples: 8515975100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:31:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:31:59,326][15401] Updated weights for policy 0, policy_version 519770 (0.0041) [2024-06-23 21:32:03,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 8516059136. Throughput: 0: 42804.2. Samples: 8516234840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 21:32:03,475][15401] Updated weights for policy 0, policy_version 519780 (0.0032) [2024-06-23 21:32:06,926][15401] Updated weights for policy 0, policy_version 519790 (0.0029) [2024-06-23 21:32:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8516288512. Throughput: 0: 42867.5. Samples: 8516360420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 21:32:11,122][15401] Updated weights for policy 0, policy_version 519800 (0.0038) [2024-06-23 21:32:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 8516485120. Throughput: 0: 42720.8. Samples: 8516616840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 21:32:14,575][15401] Updated weights for policy 0, policy_version 519810 (0.0030) [2024-06-23 21:32:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8516698112. Throughput: 0: 42770.7. Samples: 8516876920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:18,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 21:32:18,576][15401] Updated weights for policy 0, policy_version 519820 (0.0047) [2024-06-23 21:32:22,087][15401] Updated weights for policy 0, policy_version 519830 (0.0041) [2024-06-23 21:32:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8516943872. Throughput: 0: 42909.8. Samples: 8517006180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 21:32:26,304][15401] Updated weights for policy 0, policy_version 519840 (0.0040) [2024-06-23 21:32:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 8517140480. Throughput: 0: 43069.7. Samples: 8517268640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:28,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-23 21:32:29,976][15401] Updated weights for policy 0, policy_version 519850 (0.0030) [2024-06-23 21:32:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.2). Total num frames: 8517353472. Throughput: 0: 42933.7. Samples: 8517522260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:32:34,088][15401] Updated weights for policy 0, policy_version 519860 (0.0030) [2024-06-23 21:32:37,509][15401] Updated weights for policy 0, policy_version 519870 (0.0041) [2024-06-23 21:32:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8517566464. Throughput: 0: 43091.2. Samples: 8517651360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:38,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 21:32:41,565][15401] Updated weights for policy 0, policy_version 519880 (0.0034) [2024-06-23 21:32:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8517779456. Throughput: 0: 43012.0. Samples: 8517910640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 21:32:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 21:32:44,981][15401] Updated weights for policy 0, policy_version 519890 (0.0029) [2024-06-23 21:32:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8517992448. Throughput: 0: 43048.0. Samples: 8518172000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:32:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 21:32:49,250][15401] Updated weights for policy 0, policy_version 519900 (0.0039) [2024-06-23 21:32:52,559][15401] Updated weights for policy 0, policy_version 519910 (0.0032) [2024-06-23 21:32:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43144.5, 300 sec: 42986.9). Total num frames: 8518221824. Throughput: 0: 43024.9. Samples: 8518296640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:32:53,393][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 21:32:56,997][15401] Updated weights for policy 0, policy_version 519920 (0.0027) [2024-06-23 21:32:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8518418432. Throughput: 0: 43026.8. Samples: 8518553040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:32:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 21:33:00,147][15401] Updated weights for policy 0, policy_version 519930 (0.0034) [2024-06-23 21:33:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8518631424. Throughput: 0: 42962.5. Samples: 8518810240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 21:33:04,560][15401] Updated weights for policy 0, policy_version 519940 (0.0033) [2024-06-23 21:33:07,758][15401] Updated weights for policy 0, policy_version 519950 (0.0037) [2024-06-23 21:33:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.9, 300 sec: 42986.8). Total num frames: 8518860800. Throughput: 0: 42978.6. Samples: 8518940320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:08,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 21:33:12,021][15401] Updated weights for policy 0, policy_version 519960 (0.0023) [2024-06-23 21:33:12,588][15349] Signal inference workers to stop experience collection... (126200 times) [2024-06-23 21:33:12,589][15349] Signal inference workers to resume experience collection... (126200 times) [2024-06-23 21:33:12,599][15401] InferenceWorker_p0-w0: stopping experience collection (126200 times) [2024-06-23 21:33:12,599][15401] InferenceWorker_p0-w0: resuming experience collection (126200 times) [2024-06-23 21:33:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8519073792. Throughput: 0: 42762.4. Samples: 8519192940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 21:33:15,917][15401] Updated weights for policy 0, policy_version 519970 (0.0038) [2024-06-23 21:33:18,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8519270400. Throughput: 0: 42785.8. Samples: 8519447620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 21:33:19,747][15401] Updated weights for policy 0, policy_version 519980 (0.0031) [2024-06-23 21:33:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8519499776. Throughput: 0: 42796.4. Samples: 8519577200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 21:33:23,542][15401] Updated weights for policy 0, policy_version 519990 (0.0039) [2024-06-23 21:33:27,200][15401] Updated weights for policy 0, policy_version 520000 (0.0028) [2024-06-23 21:33:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8519729152. Throughput: 0: 42825.8. Samples: 8519837800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 21:33:31,022][15401] Updated weights for policy 0, policy_version 520010 (0.0031) [2024-06-23 21:33:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8519925760. Throughput: 0: 42805.0. Samples: 8520098240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:33,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-23 21:33:34,942][15401] Updated weights for policy 0, policy_version 520020 (0.0024) [2024-06-23 21:33:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 8520155136. Throughput: 0: 42817.9. Samples: 8520223340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 21:33:38,578][15401] Updated weights for policy 0, policy_version 520030 (0.0033) [2024-06-23 21:33:42,581][15401] Updated weights for policy 0, policy_version 520040 (0.0029) [2024-06-23 21:33:43,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8520351744. Throughput: 0: 42849.0. Samples: 8520481240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 21:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520042_8520368128.pth... [2024-06-23 21:33:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519413_8510062592.pth [2024-06-23 21:33:46,389][15401] Updated weights for policy 0, policy_version 520050 (0.0035) [2024-06-23 21:33:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8520564736. Throughput: 0: 42983.7. Samples: 8520744500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:48,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-23 21:33:50,306][15401] Updated weights for policy 0, policy_version 520060 (0.0032) [2024-06-23 21:33:53,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 8520794112. Throughput: 0: 42868.0. Samples: 8520869280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:53,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 21:33:53,760][15401] Updated weights for policy 0, policy_version 520070 (0.0027) [2024-06-23 21:33:57,711][15401] Updated weights for policy 0, policy_version 520080 (0.0033) [2024-06-23 21:33:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8521007104. Throughput: 0: 42978.5. Samples: 8521126980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:33:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 21:34:01,367][15401] Updated weights for policy 0, policy_version 520090 (0.0035) [2024-06-23 21:34:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8521203712. Throughput: 0: 43121.7. Samples: 8521388100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:34:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:34:05,409][15401] Updated weights for policy 0, policy_version 520100 (0.0027) [2024-06-23 21:34:08,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 8521449472. Throughput: 0: 43053.4. Samples: 8521514600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:34:08,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-23 21:34:08,981][15401] Updated weights for policy 0, policy_version 520110 (0.0032) [2024-06-23 21:34:13,061][15401] Updated weights for policy 0, policy_version 520120 (0.0033) [2024-06-23 21:34:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 8521662464. Throughput: 0: 42822.7. Samples: 8521764820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:34:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 21:34:16,997][15401] Updated weights for policy 0, policy_version 520130 (0.0040) [2024-06-23 21:34:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8521842688. Throughput: 0: 42823.4. Samples: 8522025280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 21:34:20,844][15401] Updated weights for policy 0, policy_version 520140 (0.0041) [2024-06-23 21:34:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8522072064. Throughput: 0: 42736.9. Samples: 8522146500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 21:34:24,653][15401] Updated weights for policy 0, policy_version 520150 (0.0034) [2024-06-23 21:34:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8522285056. Throughput: 0: 42731.8. Samples: 8522404180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 21:34:28,620][15401] Updated weights for policy 0, policy_version 520160 (0.0036) [2024-06-23 21:34:32,197][15401] Updated weights for policy 0, policy_version 520170 (0.0049) [2024-06-23 21:34:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8522481664. Throughput: 0: 42611.5. Samples: 8522662020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 21:34:36,144][15401] Updated weights for policy 0, policy_version 520180 (0.0025) [2024-06-23 21:34:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8522727424. Throughput: 0: 42698.1. Samples: 8522790700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 21:34:39,713][15401] Updated weights for policy 0, policy_version 520190 (0.0051) [2024-06-23 21:34:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8522924032. Throughput: 0: 42684.2. Samples: 8523047760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 21:34:43,950][15401] Updated weights for policy 0, policy_version 520200 (0.0047) [2024-06-23 21:34:46,826][15349] Signal inference workers to stop experience collection... (126250 times) [2024-06-23 21:34:46,860][15401] InferenceWorker_p0-w0: stopping experience collection (126250 times) [2024-06-23 21:34:46,943][15349] Signal inference workers to resume experience collection... (126250 times) [2024-06-23 21:34:46,943][15401] InferenceWorker_p0-w0: resuming experience collection (126250 times) [2024-06-23 21:34:47,510][15401] Updated weights for policy 0, policy_version 520210 (0.0040) [2024-06-23 21:34:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8523120640. Throughput: 0: 42448.1. Samples: 8523298260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 21:34:51,868][15401] Updated weights for policy 0, policy_version 520220 (0.0029) [2024-06-23 21:34:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8523366400. Throughput: 0: 42557.6. Samples: 8523429700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 21:34:55,237][15401] Updated weights for policy 0, policy_version 520230 (0.0032) [2024-06-23 21:34:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 8523579392. Throughput: 0: 42783.7. Samples: 8523690080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:34:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 21:34:59,510][15401] Updated weights for policy 0, policy_version 520240 (0.0024) [2024-06-23 21:35:03,017][15401] Updated weights for policy 0, policy_version 520250 (0.0030) [2024-06-23 21:35:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8523776000. Throughput: 0: 42594.9. Samples: 8523942060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 21:35:07,115][15401] Updated weights for policy 0, policy_version 520260 (0.0027) [2024-06-23 21:35:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8524005376. Throughput: 0: 42749.7. Samples: 8524070240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:08,404][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 21:35:10,823][15401] Updated weights for policy 0, policy_version 520270 (0.0038) [2024-06-23 21:35:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 8524201984. Throughput: 0: 42879.7. Samples: 8524333760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 21:35:14,726][15401] Updated weights for policy 0, policy_version 520280 (0.0031) [2024-06-23 21:35:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8524414976. Throughput: 0: 42610.3. Samples: 8524579480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 21:35:18,501][15401] Updated weights for policy 0, policy_version 520290 (0.0027) [2024-06-23 21:35:22,323][15401] Updated weights for policy 0, policy_version 520300 (0.0027) [2024-06-23 21:35:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8524627968. Throughput: 0: 42643.3. Samples: 8524709640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 21:35:26,139][15401] Updated weights for policy 0, policy_version 520310 (0.0033) [2024-06-23 21:35:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8524840960. Throughput: 0: 42654.2. Samples: 8524967200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:28,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-23 21:35:29,838][15401] Updated weights for policy 0, policy_version 520320 (0.0041) [2024-06-23 21:35:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8525070336. Throughput: 0: 42786.6. Samples: 8525223660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 21:35:33,561][15401] Updated weights for policy 0, policy_version 520330 (0.0029) [2024-06-23 21:35:37,769][15401] Updated weights for policy 0, policy_version 520340 (0.0035) [2024-06-23 21:35:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 8525266944. Throughput: 0: 42681.5. Samples: 8525350360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:38,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-23 21:35:41,525][15401] Updated weights for policy 0, policy_version 520350 (0.0032) [2024-06-23 21:35:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8525496320. Throughput: 0: 42747.4. Samples: 8525613720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-23 21:35:43,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 21:35:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520356_8525512704.pth... [2024-06-23 21:35:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000519728_8515223552.pth [2024-06-23 21:35:45,494][15401] Updated weights for policy 0, policy_version 520360 (0.0030) [2024-06-23 21:35:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8525709312. Throughput: 0: 42803.6. Samples: 8525868220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:35:48,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-23 21:35:49,041][15401] Updated weights for policy 0, policy_version 520370 (0.0051) [2024-06-23 21:35:52,987][15401] Updated weights for policy 0, policy_version 520380 (0.0040) [2024-06-23 21:35:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8525905920. Throughput: 0: 42760.9. Samples: 8525994480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:35:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 21:35:53,849][15349] Signal inference workers to stop experience collection... (126300 times) [2024-06-23 21:35:53,849][15349] Signal inference workers to resume experience collection... (126300 times) [2024-06-23 21:35:53,863][15401] InferenceWorker_p0-w0: stopping experience collection (126300 times) [2024-06-23 21:35:53,864][15401] InferenceWorker_p0-w0: resuming experience collection (126300 times) [2024-06-23 21:35:56,838][15401] Updated weights for policy 0, policy_version 520390 (0.0038) [2024-06-23 21:35:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 8526135296. Throughput: 0: 42695.5. Samples: 8526255060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:35:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 21:36:00,521][15401] Updated weights for policy 0, policy_version 520400 (0.0027) [2024-06-23 21:36:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8526348288. Throughput: 0: 43014.0. Samples: 8526515120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:03,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-23 21:36:04,369][15401] Updated weights for policy 0, policy_version 520410 (0.0037) [2024-06-23 21:36:08,383][15401] Updated weights for policy 0, policy_version 520420 (0.0028) [2024-06-23 21:36:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8526561280. Throughput: 0: 42839.0. Samples: 8526637400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 21:36:11,842][15401] Updated weights for policy 0, policy_version 520430 (0.0021) [2024-06-23 21:36:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8526790656. Throughput: 0: 42860.9. Samples: 8526895940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:13,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 21:36:15,938][15401] Updated weights for policy 0, policy_version 520440 (0.0036) [2024-06-23 21:36:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8526987264. Throughput: 0: 42993.4. Samples: 8527158360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:18,394][15132] Avg episode reward: [(0, '0.680')] [2024-06-23 21:36:19,410][15401] Updated weights for policy 0, policy_version 520450 (0.0026) [2024-06-23 21:36:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 8527200256. Throughput: 0: 43037.7. Samples: 8527287060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 21:36:23,621][15401] Updated weights for policy 0, policy_version 520460 (0.0030) [2024-06-23 21:36:27,503][15401] Updated weights for policy 0, policy_version 520470 (0.0038) [2024-06-23 21:36:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8527429632. Throughput: 0: 42786.4. Samples: 8527539100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 21:36:31,141][15401] Updated weights for policy 0, policy_version 520480 (0.0042) [2024-06-23 21:36:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42875.7). Total num frames: 8527642624. Throughput: 0: 42748.9. Samples: 8527792020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:33,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 21:36:35,039][15401] Updated weights for policy 0, policy_version 520490 (0.0035) [2024-06-23 21:36:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8527855616. Throughput: 0: 42768.9. Samples: 8527919080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 21:36:38,681][15401] Updated weights for policy 0, policy_version 520500 (0.0028) [2024-06-23 21:36:42,470][15401] Updated weights for policy 0, policy_version 520510 (0.0031) [2024-06-23 21:36:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8528052224. Throughput: 0: 42846.5. Samples: 8528183160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 21:36:46,277][15401] Updated weights for policy 0, policy_version 520520 (0.0042) [2024-06-23 21:36:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8528281600. Throughput: 0: 42740.5. Samples: 8528438440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:48,395][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 21:36:50,055][15401] Updated weights for policy 0, policy_version 520530 (0.0038) [2024-06-23 21:36:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 8528510976. Throughput: 0: 43009.9. Samples: 8528572840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:53,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-23 21:36:54,312][15401] Updated weights for policy 0, policy_version 520540 (0.0040) [2024-06-23 21:36:57,658][15401] Updated weights for policy 0, policy_version 520550 (0.0031) [2024-06-23 21:36:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8528707584. Throughput: 0: 42875.6. Samples: 8528825340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:36:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 21:37:01,740][15401] Updated weights for policy 0, policy_version 520560 (0.0030) [2024-06-23 21:37:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8528936960. Throughput: 0: 42756.1. Samples: 8529082380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:37:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 21:37:05,366][15401] Updated weights for policy 0, policy_version 520570 (0.0045) [2024-06-23 21:37:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 8529149952. Throughput: 0: 42854.3. Samples: 8529215500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:37:08,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 21:37:09,244][15401] Updated weights for policy 0, policy_version 520580 (0.0028) [2024-06-23 21:37:12,918][15401] Updated weights for policy 0, policy_version 520590 (0.0027) [2024-06-23 21:37:13,394][15132] Fps is (10 sec: 40941.2, 60 sec: 42595.2, 300 sec: 42875.4). Total num frames: 8529346560. Throughput: 0: 43010.7. Samples: 8529474780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-23 21:37:13,394][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 21:37:15,812][15349] Signal inference workers to stop experience collection... (126350 times) [2024-06-23 21:37:15,817][15349] Signal inference workers to resume experience collection... (126350 times) [2024-06-23 21:37:15,836][15401] InferenceWorker_p0-w0: stopping experience collection (126350 times) [2024-06-23 21:37:15,836][15401] InferenceWorker_p0-w0: resuming experience collection (126350 times) [2024-06-23 21:37:16,649][15401] Updated weights for policy 0, policy_version 520600 (0.0036) [2024-06-23 21:37:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8529559552. Throughput: 0: 43101.4. Samples: 8529731480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:18,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-23 21:37:20,490][15401] Updated weights for policy 0, policy_version 520610 (0.0046) [2024-06-23 21:37:23,389][15132] Fps is (10 sec: 42617.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8529772544. Throughput: 0: 43228.0. Samples: 8529864340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:23,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-23 21:37:24,073][15401] Updated weights for policy 0, policy_version 520620 (0.0039) [2024-06-23 21:37:28,140][15401] Updated weights for policy 0, policy_version 520630 (0.0037) [2024-06-23 21:37:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8530001920. Throughput: 0: 42895.6. Samples: 8530113460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 21:37:32,263][15401] Updated weights for policy 0, policy_version 520640 (0.0028) [2024-06-23 21:37:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 8530198528. Throughput: 0: 42984.0. Samples: 8530372820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:33,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 21:37:35,829][15401] Updated weights for policy 0, policy_version 520650 (0.0034) [2024-06-23 21:37:38,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 8530411520. Throughput: 0: 42690.6. Samples: 8530494020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:38,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 21:37:39,872][15401] Updated weights for policy 0, policy_version 520660 (0.0037) [2024-06-23 21:37:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8530640896. Throughput: 0: 42933.4. Samples: 8530757340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 21:37:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520670_8530657280.pth... [2024-06-23 21:37:43,511][15401] Updated weights for policy 0, policy_version 520670 (0.0032) [2024-06-23 21:37:43,570][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520042_8520368128.pth [2024-06-23 21:37:47,510][15401] Updated weights for policy 0, policy_version 520680 (0.0031) [2024-06-23 21:37:48,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 8530837504. Throughput: 0: 42775.5. Samples: 8531007280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 21:37:51,278][15401] Updated weights for policy 0, policy_version 520690 (0.0048) [2024-06-23 21:37:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8531050496. Throughput: 0: 42722.3. Samples: 8531138000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 21:37:55,474][15401] Updated weights for policy 0, policy_version 520700 (0.0042) [2024-06-23 21:37:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8531263488. Throughput: 0: 42825.1. Samples: 8531401720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:37:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 21:37:58,981][15401] Updated weights for policy 0, policy_version 520710 (0.0028) [2024-06-23 21:38:03,049][15401] Updated weights for policy 0, policy_version 520720 (0.0027) [2024-06-23 21:38:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 8531492864. Throughput: 0: 42816.9. Samples: 8531658240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 21:38:06,504][15401] Updated weights for policy 0, policy_version 520730 (0.0039) [2024-06-23 21:38:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8531722240. Throughput: 0: 42793.7. Samples: 8531790060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 21:38:10,622][15401] Updated weights for policy 0, policy_version 520740 (0.0032) [2024-06-23 21:38:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42601.6, 300 sec: 42820.6). Total num frames: 8531902464. Throughput: 0: 42908.0. Samples: 8532044320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 21:38:14,166][15401] Updated weights for policy 0, policy_version 520750 (0.0025) [2024-06-23 21:38:18,270][15401] Updated weights for policy 0, policy_version 520760 (0.0040) [2024-06-23 21:38:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8532131840. Throughput: 0: 42886.7. Samples: 8532302720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:18,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 21:38:22,029][15401] Updated weights for policy 0, policy_version 520770 (0.0037) [2024-06-23 21:38:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8532361216. Throughput: 0: 43126.6. Samples: 8532434620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 21:38:25,857][15401] Updated weights for policy 0, policy_version 520780 (0.0033) [2024-06-23 21:38:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8532557824. Throughput: 0: 42899.0. Samples: 8532687800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 21:38:29,519][15401] Updated weights for policy 0, policy_version 520790 (0.0036) [2024-06-23 21:38:33,291][15401] Updated weights for policy 0, policy_version 520800 (0.0029) [2024-06-23 21:38:33,394][15132] Fps is (10 sec: 42580.3, 60 sec: 43143.1, 300 sec: 42819.9). Total num frames: 8532787200. Throughput: 0: 43155.8. Samples: 8532949480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:33,394][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 21:38:37,012][15401] Updated weights for policy 0, policy_version 520810 (0.0046) [2024-06-23 21:38:37,940][15349] Signal inference workers to stop experience collection... (126400 times) [2024-06-23 21:38:37,941][15349] Signal inference workers to resume experience collection... (126400 times) [2024-06-23 21:38:37,961][15401] InferenceWorker_p0-w0: stopping experience collection (126400 times) [2024-06-23 21:38:37,961][15401] InferenceWorker_p0-w0: resuming experience collection (126400 times) [2024-06-23 21:38:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8533000192. Throughput: 0: 43118.5. Samples: 8533078340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-23 21:38:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 21:38:40,714][15401] Updated weights for policy 0, policy_version 520820 (0.0030) [2024-06-23 21:38:43,390][15132] Fps is (10 sec: 40977.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8533196800. Throughput: 0: 42999.5. Samples: 8533336700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:38:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 21:38:44,735][15401] Updated weights for policy 0, policy_version 520830 (0.0040) [2024-06-23 21:38:48,185][15401] Updated weights for policy 0, policy_version 520840 (0.0032) [2024-06-23 21:38:48,396][15132] Fps is (10 sec: 44208.4, 60 sec: 43413.0, 300 sec: 42875.2). Total num frames: 8533442560. Throughput: 0: 42929.0. Samples: 8533590320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:38:48,396][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 21:38:52,258][15401] Updated weights for policy 0, policy_version 520850 (0.0045) [2024-06-23 21:38:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8533655552. Throughput: 0: 43033.9. Samples: 8533726580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:38:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 21:38:55,795][15401] Updated weights for policy 0, policy_version 520860 (0.0027) [2024-06-23 21:38:58,389][15132] Fps is (10 sec: 40986.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8533852160. Throughput: 0: 43004.1. Samples: 8533979500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:38:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 21:38:59,924][15401] Updated weights for policy 0, policy_version 520870 (0.0025) [2024-06-23 21:39:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8534081536. Throughput: 0: 42834.8. Samples: 8534230180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 21:39:03,394][15401] Updated weights for policy 0, policy_version 520880 (0.0040) [2024-06-23 21:39:08,014][15401] Updated weights for policy 0, policy_version 520890 (0.0035) [2024-06-23 21:39:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8534294528. Throughput: 0: 42868.1. Samples: 8534363680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 21:39:10,902][15401] Updated weights for policy 0, policy_version 520900 (0.0037) [2024-06-23 21:39:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8534491136. Throughput: 0: 43012.9. Samples: 8534623380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:39:15,571][15401] Updated weights for policy 0, policy_version 520910 (0.0036) [2024-06-23 21:39:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 8534736896. Throughput: 0: 42796.2. Samples: 8534875120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:18,398][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 21:39:18,441][15401] Updated weights for policy 0, policy_version 520920 (0.0036) [2024-06-23 21:39:23,011][15401] Updated weights for policy 0, policy_version 520930 (0.0027) [2024-06-23 21:39:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8534933504. Throughput: 0: 42905.2. Samples: 8535009080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:23,402][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 21:39:26,113][15401] Updated weights for policy 0, policy_version 520940 (0.0037) [2024-06-23 21:39:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8535130112. Throughput: 0: 42777.8. Samples: 8535261700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 21:39:30,670][15401] Updated weights for policy 0, policy_version 520950 (0.0022) [2024-06-23 21:39:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43147.6, 300 sec: 42876.1). Total num frames: 8535375872. Throughput: 0: 42835.8. Samples: 8535517660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 21:39:33,819][15401] Updated weights for policy 0, policy_version 520960 (0.0039) [2024-06-23 21:39:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8535539712. Throughput: 0: 42742.6. Samples: 8535650000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:38,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 21:39:38,768][15401] Updated weights for policy 0, policy_version 520970 (0.0038) [2024-06-23 21:39:41,751][15401] Updated weights for policy 0, policy_version 520980 (0.0023) [2024-06-23 21:39:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8535785472. Throughput: 0: 42768.9. Samples: 8535904100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 21:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520983_8535785472.pth... [2024-06-23 21:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520356_8525512704.pth [2024-06-23 21:39:46,126][15401] Updated weights for policy 0, policy_version 520990 (0.0039) [2024-06-23 21:39:48,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42874.3, 300 sec: 42875.8). Total num frames: 8536014848. Throughput: 0: 42972.3. Samples: 8536164040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:48,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 21:39:49,363][15401] Updated weights for policy 0, policy_version 521000 (0.0032) [2024-06-23 21:39:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8536195072. Throughput: 0: 42962.8. Samples: 8536297000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 21:39:53,460][15349] Signal inference workers to stop experience collection... (126450 times) [2024-06-23 21:39:53,472][15401] InferenceWorker_p0-w0: stopping experience collection (126450 times) [2024-06-23 21:39:53,522][15349] Signal inference workers to resume experience collection... (126450 times) [2024-06-23 21:39:53,522][15401] InferenceWorker_p0-w0: resuming experience collection (126450 times) [2024-06-23 21:39:53,654][15401] Updated weights for policy 0, policy_version 521010 (0.0036) [2024-06-23 21:39:57,450][15401] Updated weights for policy 0, policy_version 521020 (0.0045) [2024-06-23 21:39:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8536424448. Throughput: 0: 42853.5. Samples: 8536551780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:39:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 21:40:01,442][15401] Updated weights for policy 0, policy_version 521030 (0.0034) [2024-06-23 21:40:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8536653824. Throughput: 0: 42769.6. Samples: 8536799760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:40:03,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 21:40:05,232][15401] Updated weights for policy 0, policy_version 521040 (0.0030) [2024-06-23 21:40:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8536850432. Throughput: 0: 42884.0. Samples: 8536938860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-23 21:40:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 21:40:08,949][15401] Updated weights for policy 0, policy_version 521050 (0.0034) [2024-06-23 21:40:12,959][15401] Updated weights for policy 0, policy_version 521060 (0.0038) [2024-06-23 21:40:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8537047040. Throughput: 0: 42760.2. Samples: 8537185900. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 21:40:16,839][15401] Updated weights for policy 0, policy_version 521070 (0.0048) [2024-06-23 21:40:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8537309184. Throughput: 0: 42645.0. Samples: 8537436680. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 21:40:20,551][15401] Updated weights for policy 0, policy_version 521080 (0.0037) [2024-06-23 21:40:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8537505792. Throughput: 0: 42891.6. Samples: 8537580120. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 21:40:24,205][15401] Updated weights for policy 0, policy_version 521090 (0.0027) [2024-06-23 21:40:28,075][15401] Updated weights for policy 0, policy_version 521100 (0.0024) [2024-06-23 21:40:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8537702400. Throughput: 0: 42787.0. Samples: 8537829520. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 21:40:31,745][15401] Updated weights for policy 0, policy_version 521110 (0.0048) [2024-06-23 21:40:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8537931776. Throughput: 0: 42689.5. Samples: 8538084960. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 21:40:35,661][15401] Updated weights for policy 0, policy_version 521120 (0.0044) [2024-06-23 21:40:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8538144768. Throughput: 0: 42663.4. Samples: 8538216860. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 21:40:39,343][15401] Updated weights for policy 0, policy_version 521130 (0.0032) [2024-06-23 21:40:43,280][15401] Updated weights for policy 0, policy_version 521140 (0.0038) [2024-06-23 21:40:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8538357760. Throughput: 0: 42678.6. Samples: 8538472320. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 21:40:46,986][15401] Updated weights for policy 0, policy_version 521150 (0.0041) [2024-06-23 21:40:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 8538570752. Throughput: 0: 42813.0. Samples: 8538726340. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 21:40:51,242][15401] Updated weights for policy 0, policy_version 521160 (0.0036) [2024-06-23 21:40:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8538783744. Throughput: 0: 42690.3. Samples: 8538859920. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 21:40:54,854][15401] Updated weights for policy 0, policy_version 521170 (0.0033) [2024-06-23 21:40:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8538980352. Throughput: 0: 42727.0. Samples: 8539108620. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:40:58,396][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 21:40:58,802][15401] Updated weights for policy 0, policy_version 521180 (0.0028) [2024-06-23 21:41:02,325][15401] Updated weights for policy 0, policy_version 521190 (0.0034) [2024-06-23 21:41:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8539209728. Throughput: 0: 42906.1. Samples: 8539367460. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:03,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-23 21:41:06,583][15401] Updated weights for policy 0, policy_version 521200 (0.0023) [2024-06-23 21:41:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8539422720. Throughput: 0: 42613.2. Samples: 8539497720. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:08,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 21:41:09,798][15401] Updated weights for policy 0, policy_version 521210 (0.0032) [2024-06-23 21:41:12,084][15349] Signal inference workers to stop experience collection... (126500 times) [2024-06-23 21:41:12,131][15349] Signal inference workers to resume experience collection... (126500 times) [2024-06-23 21:41:12,132][15401] InferenceWorker_p0-w0: stopping experience collection (126500 times) [2024-06-23 21:41:12,154][15401] InferenceWorker_p0-w0: resuming experience collection (126500 times) [2024-06-23 21:41:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8539619328. Throughput: 0: 42768.1. Samples: 8539754080. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 21:41:14,083][15401] Updated weights for policy 0, policy_version 521220 (0.0047) [2024-06-23 21:41:17,643][15401] Updated weights for policy 0, policy_version 521230 (0.0026) [2024-06-23 21:41:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8539865088. Throughput: 0: 42739.8. Samples: 8540008260. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 21:41:21,582][15401] Updated weights for policy 0, policy_version 521240 (0.0029) [2024-06-23 21:41:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8540045312. Throughput: 0: 42809.4. Samples: 8540143280. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 21:41:25,171][15401] Updated weights for policy 0, policy_version 521250 (0.0029) [2024-06-23 21:41:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 8540258304. Throughput: 0: 42737.7. Samples: 8540395520. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 21:41:29,507][15401] Updated weights for policy 0, policy_version 521260 (0.0038) [2024-06-23 21:41:32,747][15401] Updated weights for policy 0, policy_version 521270 (0.0023) [2024-06-23 21:41:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8540504064. Throughput: 0: 42692.4. Samples: 8540647500. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 21:41:37,123][15401] Updated weights for policy 0, policy_version 521280 (0.0046) [2024-06-23 21:41:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8540700672. Throughput: 0: 42765.5. Samples: 8540784360. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-06-23 21:41:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 21:41:40,571][15401] Updated weights for policy 0, policy_version 521290 (0.0041) [2024-06-23 21:41:43,394][15132] Fps is (10 sec: 40942.4, 60 sec: 42595.3, 300 sec: 42819.9). Total num frames: 8540913664. Throughput: 0: 42894.1. Samples: 8541039040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:41:43,394][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 21:41:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521296_8540913664.pth... [2024-06-23 21:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520670_8530657280.pth [2024-06-23 21:41:45,022][15401] Updated weights for policy 0, policy_version 521300 (0.0044) [2024-06-23 21:41:48,387][15401] Updated weights for policy 0, policy_version 521310 (0.0026) [2024-06-23 21:41:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8541143040. Throughput: 0: 42777.4. Samples: 8541292440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:41:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 21:41:52,639][15401] Updated weights for policy 0, policy_version 521320 (0.0028) [2024-06-23 21:41:53,390][15132] Fps is (10 sec: 44255.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8541356032. Throughput: 0: 42779.5. Samples: 8541422800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:41:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 21:41:55,971][15401] Updated weights for policy 0, policy_version 521330 (0.0046) [2024-06-23 21:41:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8541569024. Throughput: 0: 42889.7. Samples: 8541684120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:41:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 21:42:00,081][15401] Updated weights for policy 0, policy_version 521340 (0.0035) [2024-06-23 21:42:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8541782016. Throughput: 0: 42845.9. Samples: 8541936320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-23 21:42:03,503][15401] Updated weights for policy 0, policy_version 521350 (0.0033) [2024-06-23 21:42:07,596][15401] Updated weights for policy 0, policy_version 521360 (0.0047) [2024-06-23 21:42:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.8). Total num frames: 8541995008. Throughput: 0: 42805.3. Samples: 8542069520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 21:42:11,073][15401] Updated weights for policy 0, policy_version 521370 (0.0029) [2024-06-23 21:42:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8542191616. Throughput: 0: 42790.7. Samples: 8542321100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 21:42:15,451][15401] Updated weights for policy 0, policy_version 521380 (0.0034) [2024-06-23 21:42:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8542420992. Throughput: 0: 42768.0. Samples: 8542572060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:18,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 21:42:18,800][15401] Updated weights for policy 0, policy_version 521390 (0.0030) [2024-06-23 21:42:23,341][15401] Updated weights for policy 0, policy_version 521400 (0.0036) [2024-06-23 21:42:23,395][15132] Fps is (10 sec: 42574.3, 60 sec: 42867.4, 300 sec: 42764.2). Total num frames: 8542617600. Throughput: 0: 42693.6. Samples: 8542705820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:23,396][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 21:42:26,326][15349] Signal inference workers to stop experience collection... (126550 times) [2024-06-23 21:42:26,332][15349] Signal inference workers to resume experience collection... (126550 times) [2024-06-23 21:42:26,349][15401] InferenceWorker_p0-w0: stopping experience collection (126550 times) [2024-06-23 21:42:26,387][15401] InferenceWorker_p0-w0: resuming experience collection (126550 times) [2024-06-23 21:42:26,470][15401] Updated weights for policy 0, policy_version 521410 (0.0037) [2024-06-23 21:42:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 8542830592. Throughput: 0: 42624.0. Samples: 8542956940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 21:42:30,867][15401] Updated weights for policy 0, policy_version 521420 (0.0039) [2024-06-23 21:42:33,389][15132] Fps is (10 sec: 42622.8, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 8543043584. Throughput: 0: 42677.8. Samples: 8543212940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 21:42:34,322][15401] Updated weights for policy 0, policy_version 521430 (0.0029) [2024-06-23 21:42:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8543256576. Throughput: 0: 42672.1. Samples: 8543343040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 21:42:38,771][15401] Updated weights for policy 0, policy_version 521440 (0.0034) [2024-06-23 21:42:42,409][15401] Updated weights for policy 0, policy_version 521450 (0.0041) [2024-06-23 21:42:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42328.5, 300 sec: 42765.0). Total num frames: 8543453184. Throughput: 0: 42479.7. Samples: 8543595700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 21:42:46,341][15401] Updated weights for policy 0, policy_version 521460 (0.0034) [2024-06-23 21:42:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8543698944. Throughput: 0: 42313.3. Samples: 8543840420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-23 21:42:49,986][15401] Updated weights for policy 0, policy_version 521470 (0.0041) [2024-06-23 21:42:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 8543895552. Throughput: 0: 42399.7. Samples: 8543977500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:53,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 21:42:54,165][15401] Updated weights for policy 0, policy_version 521480 (0.0031) [2024-06-23 21:42:57,502][15401] Updated weights for policy 0, policy_version 521490 (0.0030) [2024-06-23 21:42:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8544108544. Throughput: 0: 42375.1. Samples: 8544227980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:42:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 21:43:01,708][15401] Updated weights for policy 0, policy_version 521500 (0.0033) [2024-06-23 21:43:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8544337920. Throughput: 0: 42598.7. Samples: 8544489000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-23 21:43:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 21:43:04,873][15401] Updated weights for policy 0, policy_version 521510 (0.0036) [2024-06-23 21:43:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8544534528. Throughput: 0: 42578.3. Samples: 8544621600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 21:43:09,434][15401] Updated weights for policy 0, policy_version 521520 (0.0036) [2024-06-23 21:43:12,372][15401] Updated weights for policy 0, policy_version 521530 (0.0041) [2024-06-23 21:43:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 8544763904. Throughput: 0: 42570.2. Samples: 8544872600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:13,396][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 21:43:17,152][15401] Updated weights for policy 0, policy_version 521540 (0.0034) [2024-06-23 21:43:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8544960512. Throughput: 0: 42687.6. Samples: 8545133880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 21:43:20,291][15401] Updated weights for policy 0, policy_version 521550 (0.0038) [2024-06-23 21:43:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42602.4, 300 sec: 42765.0). Total num frames: 8545173504. Throughput: 0: 42598.1. Samples: 8545259960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 21:43:24,702][15401] Updated weights for policy 0, policy_version 521560 (0.0032) [2024-06-23 21:43:27,843][15401] Updated weights for policy 0, policy_version 521570 (0.0037) [2024-06-23 21:43:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 8545402880. Throughput: 0: 42574.1. Samples: 8545511540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 21:43:32,585][15401] Updated weights for policy 0, policy_version 521580 (0.0034) [2024-06-23 21:43:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8545599488. Throughput: 0: 42876.0. Samples: 8545769840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 21:43:35,787][15401] Updated weights for policy 0, policy_version 521590 (0.0030) [2024-06-23 21:43:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8545812480. Throughput: 0: 42667.0. Samples: 8545897620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:38,392][15132] Avg episode reward: [(0, '0.189')] [2024-06-23 21:43:40,191][15401] Updated weights for policy 0, policy_version 521600 (0.0044) [2024-06-23 21:43:43,392][15132] Fps is (10 sec: 44225.2, 60 sec: 43142.6, 300 sec: 42710.0). Total num frames: 8546041856. Throughput: 0: 42703.3. Samples: 8546149740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:43,393][15132] Avg episode reward: [(0, '0.258')] [2024-06-23 21:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521610_8546058240.pth... [2024-06-23 21:43:43,414][15401] Updated weights for policy 0, policy_version 521610 (0.0032) [2024-06-23 21:43:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000520983_8535785472.pth [2024-06-23 21:43:48,020][15401] Updated weights for policy 0, policy_version 521620 (0.0044) [2024-06-23 21:43:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8546222080. Throughput: 0: 42671.5. Samples: 8546409220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 21:43:48,551][15349] Signal inference workers to stop experience collection... (126600 times) [2024-06-23 21:43:48,585][15401] InferenceWorker_p0-w0: stopping experience collection (126600 times) [2024-06-23 21:43:48,607][15349] Signal inference workers to resume experience collection... (126600 times) [2024-06-23 21:43:48,608][15401] InferenceWorker_p0-w0: resuming experience collection (126600 times) [2024-06-23 21:43:51,031][15401] Updated weights for policy 0, policy_version 521630 (0.0036) [2024-06-23 21:43:53,389][15132] Fps is (10 sec: 40971.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8546451456. Throughput: 0: 42412.0. Samples: 8546530140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:53,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 21:43:55,907][15401] Updated weights for policy 0, policy_version 521640 (0.0043) [2024-06-23 21:43:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8546680832. Throughput: 0: 42485.4. Samples: 8546784440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:43:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 21:43:58,811][15401] Updated weights for policy 0, policy_version 521650 (0.0035) [2024-06-23 21:44:03,390][15132] Fps is (10 sec: 40957.5, 60 sec: 42051.9, 300 sec: 42598.3). Total num frames: 8546861056. Throughput: 0: 42542.1. Samples: 8547048300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 21:44:03,546][15401] Updated weights for policy 0, policy_version 521660 (0.0038) [2024-06-23 21:44:06,617][15401] Updated weights for policy 0, policy_version 521670 (0.0033) [2024-06-23 21:44:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8547090432. Throughput: 0: 42496.5. Samples: 8547172300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 21:44:11,172][15401] Updated weights for policy 0, policy_version 521680 (0.0035) [2024-06-23 21:44:13,389][15132] Fps is (10 sec: 47516.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8547336192. Throughput: 0: 42607.2. Samples: 8547428860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 21:44:14,127][15401] Updated weights for policy 0, policy_version 521690 (0.0046) [2024-06-23 21:44:18,393][15132] Fps is (10 sec: 40946.5, 60 sec: 42323.0, 300 sec: 42597.9). Total num frames: 8547500032. Throughput: 0: 42708.8. Samples: 8547691880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:18,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 21:44:18,720][15401] Updated weights for policy 0, policy_version 521700 (0.0043) [2024-06-23 21:44:21,906][15401] Updated weights for policy 0, policy_version 521710 (0.0033) [2024-06-23 21:44:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8547729408. Throughput: 0: 42469.3. Samples: 8547808640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:23,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 21:44:26,200][15401] Updated weights for policy 0, policy_version 521720 (0.0036) [2024-06-23 21:44:28,390][15132] Fps is (10 sec: 45890.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8547958784. Throughput: 0: 42724.2. Samples: 8548072220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 21:44:29,410][15401] Updated weights for policy 0, policy_version 521730 (0.0041) [2024-06-23 21:44:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8548155392. Throughput: 0: 42731.6. Samples: 8548332140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 21:44:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 21:44:33,971][15401] Updated weights for policy 0, policy_version 521740 (0.0035) [2024-06-23 21:44:36,987][15401] Updated weights for policy 0, policy_version 521750 (0.0036) [2024-06-23 21:44:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8548368384. Throughput: 0: 42709.8. Samples: 8548452080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:44:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 21:44:41,643][15401] Updated weights for policy 0, policy_version 521760 (0.0042) [2024-06-23 21:44:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.3, 300 sec: 42654.3). Total num frames: 8548597760. Throughput: 0: 42874.6. Samples: 8548713800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:44:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 21:44:44,489][15401] Updated weights for policy 0, policy_version 521770 (0.0030) [2024-06-23 21:44:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8548794368. Throughput: 0: 42785.4. Samples: 8548973620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:44:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 21:44:49,225][15401] Updated weights for policy 0, policy_version 521780 (0.0033) [2024-06-23 21:44:52,052][15401] Updated weights for policy 0, policy_version 521790 (0.0031) [2024-06-23 21:44:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8549023744. Throughput: 0: 42754.2. Samples: 8549096240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:44:53,392][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 21:44:56,817][15401] Updated weights for policy 0, policy_version 521800 (0.0047) [2024-06-23 21:44:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8549236736. Throughput: 0: 42913.3. Samples: 8549359960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:44:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 21:45:00,218][15401] Updated weights for policy 0, policy_version 521810 (0.0023) [2024-06-23 21:45:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 8549433344. Throughput: 0: 42800.8. Samples: 8549617780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 21:45:04,529][15401] Updated weights for policy 0, policy_version 521820 (0.0030) [2024-06-23 21:45:07,909][15401] Updated weights for policy 0, policy_version 521830 (0.0041) [2024-06-23 21:45:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8549662720. Throughput: 0: 42922.3. Samples: 8549740140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 21:45:12,127][15401] Updated weights for policy 0, policy_version 521840 (0.0029) [2024-06-23 21:45:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8549892096. Throughput: 0: 42897.4. Samples: 8550002600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 21:45:15,383][15401] Updated weights for policy 0, policy_version 521850 (0.0041) [2024-06-23 21:45:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.9, 300 sec: 42598.4). Total num frames: 8550072320. Throughput: 0: 42885.0. Samples: 8550261960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 21:45:19,982][15401] Updated weights for policy 0, policy_version 521860 (0.0042) [2024-06-23 21:45:21,245][15349] Signal inference workers to stop experience collection... (126650 times) [2024-06-23 21:45:21,300][15401] InferenceWorker_p0-w0: stopping experience collection (126650 times) [2024-06-23 21:45:21,304][15349] Signal inference workers to resume experience collection... (126650 times) [2024-06-23 21:45:21,314][15401] InferenceWorker_p0-w0: resuming experience collection (126650 times) [2024-06-23 21:45:23,268][15401] Updated weights for policy 0, policy_version 521870 (0.0038) [2024-06-23 21:45:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 8550318080. Throughput: 0: 42896.4. Samples: 8550382520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:23,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 21:45:27,557][15401] Updated weights for policy 0, policy_version 521880 (0.0032) [2024-06-23 21:45:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8550531072. Throughput: 0: 42772.4. Samples: 8550638560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 21:45:30,764][15401] Updated weights for policy 0, policy_version 521890 (0.0039) [2024-06-23 21:45:33,392][15132] Fps is (10 sec: 39321.4, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 8550711296. Throughput: 0: 42759.5. Samples: 8550897900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:33,392][15132] Avg episode reward: [(0, '0.873')] [2024-06-23 21:45:35,122][15401] Updated weights for policy 0, policy_version 521900 (0.0030) [2024-06-23 21:45:38,239][15401] Updated weights for policy 0, policy_version 521910 (0.0025) [2024-06-23 21:45:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8550973440. Throughput: 0: 42794.7. Samples: 8551022000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:38,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-23 21:45:42,799][15401] Updated weights for policy 0, policy_version 521920 (0.0030) [2024-06-23 21:45:43,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8551170048. Throughput: 0: 42757.7. Samples: 8551284060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 21:45:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521922_8551170048.pth... [2024-06-23 21:45:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521296_8540913664.pth [2024-06-23 21:45:45,814][15401] Updated weights for policy 0, policy_version 521930 (0.0028) [2024-06-23 21:45:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8551350272. Throughput: 0: 42757.5. Samples: 8551541860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 21:45:50,537][15401] Updated weights for policy 0, policy_version 521940 (0.0033) [2024-06-23 21:45:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8551612416. Throughput: 0: 42803.6. Samples: 8551666300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:53,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 21:45:53,475][15401] Updated weights for policy 0, policy_version 521950 (0.0044) [2024-06-23 21:45:58,088][15401] Updated weights for policy 0, policy_version 521960 (0.0037) [2024-06-23 21:45:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8551809024. Throughput: 0: 42660.3. Samples: 8551922420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:45:58,392][15132] Avg episode reward: [(0, '0.245')] [2024-06-23 21:46:01,724][15401] Updated weights for policy 0, policy_version 521970 (0.0029) [2024-06-23 21:46:03,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 8552005632. Throughput: 0: 42586.6. Samples: 8552178460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 21:46:03,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 21:46:05,479][15401] Updated weights for policy 0, policy_version 521980 (0.0031) [2024-06-23 21:46:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8552235008. Throughput: 0: 42817.9. Samples: 8552309220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:08,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 21:46:09,415][15401] Updated weights for policy 0, policy_version 521990 (0.0027) [2024-06-23 21:46:13,013][15401] Updated weights for policy 0, policy_version 522000 (0.0032) [2024-06-23 21:46:13,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8552448000. Throughput: 0: 42799.6. Samples: 8552564540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 21:46:17,095][15401] Updated weights for policy 0, policy_version 522010 (0.0025) [2024-06-23 21:46:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8552660992. Throughput: 0: 42703.0. Samples: 8552819440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 21:46:20,740][15401] Updated weights for policy 0, policy_version 522020 (0.0031) [2024-06-23 21:46:23,391][15132] Fps is (10 sec: 42592.7, 60 sec: 42599.1, 300 sec: 42764.8). Total num frames: 8552873984. Throughput: 0: 42762.8. Samples: 8552946380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:23,391][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 21:46:25,180][15401] Updated weights for policy 0, policy_version 522030 (0.0032) [2024-06-23 21:46:28,365][15401] Updated weights for policy 0, policy_version 522040 (0.0044) [2024-06-23 21:46:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8553103360. Throughput: 0: 42764.8. Samples: 8553208580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:28,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 21:46:32,653][15401] Updated weights for policy 0, policy_version 522050 (0.0039) [2024-06-23 21:46:33,390][15132] Fps is (10 sec: 40964.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 8553283584. Throughput: 0: 42745.2. Samples: 8553465400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 21:46:33,631][15349] Signal inference workers to stop experience collection... (126700 times) [2024-06-23 21:46:33,660][15401] InferenceWorker_p0-w0: stopping experience collection (126700 times) [2024-06-23 21:46:33,698][15349] Signal inference workers to resume experience collection... (126700 times) [2024-06-23 21:46:33,698][15401] InferenceWorker_p0-w0: resuming experience collection (126700 times) [2024-06-23 21:46:35,834][15401] Updated weights for policy 0, policy_version 522060 (0.0038) [2024-06-23 21:46:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 8553529344. Throughput: 0: 42776.3. Samples: 8553591240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:38,399][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 21:46:40,234][15401] Updated weights for policy 0, policy_version 522070 (0.0035) [2024-06-23 21:46:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8553725952. Throughput: 0: 42711.1. Samples: 8553844320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:43,398][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 21:46:43,782][15401] Updated weights for policy 0, policy_version 522080 (0.0029) [2024-06-23 21:46:47,910][15401] Updated weights for policy 0, policy_version 522090 (0.0033) [2024-06-23 21:46:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8553922560. Throughput: 0: 42761.0. Samples: 8554102600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 21:46:51,457][15401] Updated weights for policy 0, policy_version 522100 (0.0038) [2024-06-23 21:46:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 8554135552. Throughput: 0: 42709.6. Samples: 8554231160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 21:46:55,929][15401] Updated weights for policy 0, policy_version 522110 (0.0028) [2024-06-23 21:46:58,394][15132] Fps is (10 sec: 45854.6, 60 sec: 42870.0, 300 sec: 42708.8). Total num frames: 8554381312. Throughput: 0: 42780.7. Samples: 8554489860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:46:58,394][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:46:59,138][15401] Updated weights for policy 0, policy_version 522120 (0.0023) [2024-06-23 21:47:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 8554561536. Throughput: 0: 42769.9. Samples: 8554744080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 21:47:03,521][15401] Updated weights for policy 0, policy_version 522130 (0.0041) [2024-06-23 21:47:06,929][15401] Updated weights for policy 0, policy_version 522140 (0.0033) [2024-06-23 21:47:08,390][15132] Fps is (10 sec: 39338.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8554774528. Throughput: 0: 42764.2. Samples: 8554870720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 21:47:11,133][15401] Updated weights for policy 0, policy_version 522150 (0.0032) [2024-06-23 21:47:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8555036672. Throughput: 0: 42716.5. Samples: 8555130720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:47:14,794][15401] Updated weights for policy 0, policy_version 522160 (0.0038) [2024-06-23 21:47:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42710.3). Total num frames: 8555216896. Throughput: 0: 42684.6. Samples: 8555386200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 21:47:18,810][15401] Updated weights for policy 0, policy_version 522170 (0.0031) [2024-06-23 21:47:22,336][15401] Updated weights for policy 0, policy_version 522180 (0.0036) [2024-06-23 21:47:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42326.3, 300 sec: 42654.0). Total num frames: 8555413504. Throughput: 0: 42620.6. Samples: 8555509160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 21:47:26,240][15401] Updated weights for policy 0, policy_version 522190 (0.0041) [2024-06-23 21:47:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 8555675648. Throughput: 0: 42880.4. Samples: 8555773940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-23 21:47:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 21:47:29,874][15401] Updated weights for policy 0, policy_version 522200 (0.0041) [2024-06-23 21:47:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 8555872256. Throughput: 0: 42866.7. Samples: 8556031600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 21:47:33,879][15401] Updated weights for policy 0, policy_version 522210 (0.0031) [2024-06-23 21:47:37,321][15401] Updated weights for policy 0, policy_version 522220 (0.0038) [2024-06-23 21:47:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8556068864. Throughput: 0: 42722.7. Samples: 8556153680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:38,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 21:47:40,753][15349] Signal inference workers to stop experience collection... (126750 times) [2024-06-23 21:47:40,753][15349] Signal inference workers to resume experience collection... (126750 times) [2024-06-23 21:47:40,794][15401] InferenceWorker_p0-w0: stopping experience collection (126750 times) [2024-06-23 21:47:40,795][15401] InferenceWorker_p0-w0: resuming experience collection (126750 times) [2024-06-23 21:47:41,266][15401] Updated weights for policy 0, policy_version 522230 (0.0033) [2024-06-23 21:47:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 8556331008. Throughput: 0: 43022.8. Samples: 8556425700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:43,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 21:47:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522238_8556347392.pth... [2024-06-23 21:47:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521610_8546058240.pth [2024-06-23 21:47:44,777][15401] Updated weights for policy 0, policy_version 522240 (0.0031) [2024-06-23 21:47:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8556511232. Throughput: 0: 43085.7. Samples: 8556682940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 21:47:48,755][15401] Updated weights for policy 0, policy_version 522250 (0.0039) [2024-06-23 21:47:52,228][15401] Updated weights for policy 0, policy_version 522260 (0.0040) [2024-06-23 21:47:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8556707840. Throughput: 0: 43000.2. Samples: 8556805720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 21:47:56,374][15401] Updated weights for policy 0, policy_version 522270 (0.0028) [2024-06-23 21:47:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42874.6, 300 sec: 42765.0). Total num frames: 8556953600. Throughput: 0: 42926.6. Samples: 8557062420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:47:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 21:47:59,670][15401] Updated weights for policy 0, policy_version 522280 (0.0026) [2024-06-23 21:48:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8557150208. Throughput: 0: 43127.4. Samples: 8557326940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 21:48:03,963][15401] Updated weights for policy 0, policy_version 522290 (0.0028) [2024-06-23 21:48:07,405][15401] Updated weights for policy 0, policy_version 522300 (0.0035) [2024-06-23 21:48:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 8557363200. Throughput: 0: 43101.3. Samples: 8557448720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 21:48:11,815][15401] Updated weights for policy 0, policy_version 522310 (0.0033) [2024-06-23 21:48:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8557608960. Throughput: 0: 43002.7. Samples: 8557709060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 21:48:15,417][15401] Updated weights for policy 0, policy_version 522320 (0.0041) [2024-06-23 21:48:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8557772800. Throughput: 0: 43083.8. Samples: 8557970380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-23 21:48:19,507][15401] Updated weights for policy 0, policy_version 522330 (0.0035) [2024-06-23 21:48:23,237][15401] Updated weights for policy 0, policy_version 522340 (0.0039) [2024-06-23 21:48:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8558018560. Throughput: 0: 43056.9. Samples: 8558091240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 21:48:27,124][15401] Updated weights for policy 0, policy_version 522350 (0.0026) [2024-06-23 21:48:28,392][15132] Fps is (10 sec: 47502.9, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 8558247936. Throughput: 0: 42832.5. Samples: 8558353260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:28,392][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 21:48:30,895][15401] Updated weights for policy 0, policy_version 522360 (0.0028) [2024-06-23 21:48:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 8558444544. Throughput: 0: 42846.2. Samples: 8558611020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:33,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 21:48:34,890][15401] Updated weights for policy 0, policy_version 522370 (0.0028) [2024-06-23 21:48:38,392][15132] Fps is (10 sec: 40959.9, 60 sec: 43142.9, 300 sec: 42765.1). Total num frames: 8558657536. Throughput: 0: 42779.9. Samples: 8558730920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:38,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 21:48:38,542][15401] Updated weights for policy 0, policy_version 522380 (0.0033) [2024-06-23 21:48:42,614][15401] Updated weights for policy 0, policy_version 522390 (0.0026) [2024-06-23 21:48:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8558886912. Throughput: 0: 42943.2. Samples: 8558994860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:43,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 21:48:46,303][15401] Updated weights for policy 0, policy_version 522400 (0.0031) [2024-06-23 21:48:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8559083520. Throughput: 0: 42676.2. Samples: 8559247360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 21:48:50,280][15401] Updated weights for policy 0, policy_version 522410 (0.0028) [2024-06-23 21:48:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 8559296512. Throughput: 0: 42706.2. Samples: 8559370600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:53,401][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 21:48:54,013][15401] Updated weights for policy 0, policy_version 522420 (0.0035) [2024-06-23 21:48:57,909][15401] Updated weights for policy 0, policy_version 522430 (0.0036) [2024-06-23 21:48:58,107][15349] Signal inference workers to stop experience collection... (126800 times) [2024-06-23 21:48:58,107][15349] Signal inference workers to resume experience collection... (126800 times) [2024-06-23 21:48:58,128][15401] InferenceWorker_p0-w0: stopping experience collection (126800 times) [2024-06-23 21:48:58,128][15401] InferenceWorker_p0-w0: resuming experience collection (126800 times) [2024-06-23 21:48:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8559525888. Throughput: 0: 42725.8. Samples: 8559631720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-23 21:48:58,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 21:49:01,866][15401] Updated weights for policy 0, policy_version 522440 (0.0029) [2024-06-23 21:49:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8559706112. Throughput: 0: 42637.4. Samples: 8559889060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 21:49:05,451][15401] Updated weights for policy 0, policy_version 522450 (0.0048) [2024-06-23 21:49:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8559935488. Throughput: 0: 42683.5. Samples: 8560012100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:08,393][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 21:49:09,370][15401] Updated weights for policy 0, policy_version 522460 (0.0046) [2024-06-23 21:49:13,016][15401] Updated weights for policy 0, policy_version 522470 (0.0036) [2024-06-23 21:49:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42932.1). Total num frames: 8560164864. Throughput: 0: 42644.5. Samples: 8560272160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-23 21:49:17,019][15401] Updated weights for policy 0, policy_version 522480 (0.0033) [2024-06-23 21:49:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8560345088. Throughput: 0: 42647.2. Samples: 8560530140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 21:49:20,621][15401] Updated weights for policy 0, policy_version 522490 (0.0041) [2024-06-23 21:49:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 8560574464. Throughput: 0: 42809.8. Samples: 8560657360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:23,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-23 21:49:24,493][15401] Updated weights for policy 0, policy_version 522500 (0.0037) [2024-06-23 21:49:28,201][15401] Updated weights for policy 0, policy_version 522510 (0.0035) [2024-06-23 21:49:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 8560803840. Throughput: 0: 42618.6. Samples: 8560912700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 21:49:32,518][15401] Updated weights for policy 0, policy_version 522520 (0.0035) [2024-06-23 21:49:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8560984064. Throughput: 0: 42731.0. Samples: 8561170260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 21:49:36,001][15401] Updated weights for policy 0, policy_version 522530 (0.0030) [2024-06-23 21:49:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 8561229824. Throughput: 0: 42743.5. Samples: 8561293960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 21:49:40,139][15401] Updated weights for policy 0, policy_version 522540 (0.0036) [2024-06-23 21:49:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8561426432. Throughput: 0: 42727.1. Samples: 8561554440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 21:49:43,468][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522549_8561442816.pth... [2024-06-23 21:49:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000521922_8551170048.pth [2024-06-23 21:49:43,694][15401] Updated weights for policy 0, policy_version 522550 (0.0030) [2024-06-23 21:49:47,911][15401] Updated weights for policy 0, policy_version 522560 (0.0039) [2024-06-23 21:49:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8561623040. Throughput: 0: 42661.4. Samples: 8561808820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 21:49:51,302][15401] Updated weights for policy 0, policy_version 522570 (0.0041) [2024-06-23 21:49:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 8561868800. Throughput: 0: 42764.5. Samples: 8561936400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:53,391][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 21:49:55,649][15401] Updated weights for policy 0, policy_version 522580 (0.0035) [2024-06-23 21:49:58,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42320.8, 300 sec: 42819.7). Total num frames: 8562065408. Throughput: 0: 42754.4. Samples: 8562196380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:49:58,397][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 21:49:58,887][15401] Updated weights for policy 0, policy_version 522590 (0.0028) [2024-06-23 21:50:03,292][15401] Updated weights for policy 0, policy_version 522600 (0.0037) [2024-06-23 21:50:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8562278400. Throughput: 0: 42796.3. Samples: 8562455980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:03,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 21:50:06,527][15401] Updated weights for policy 0, policy_version 522610 (0.0027) [2024-06-23 21:50:08,390][15132] Fps is (10 sec: 45904.4, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 8562524160. Throughput: 0: 42747.6. Samples: 8562580900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 21:50:10,750][15401] Updated weights for policy 0, policy_version 522620 (0.0051) [2024-06-23 21:50:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8562720768. Throughput: 0: 42839.5. Samples: 8562840480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 21:50:14,434][15401] Updated weights for policy 0, policy_version 522630 (0.0034) [2024-06-23 21:50:18,265][15401] Updated weights for policy 0, policy_version 522640 (0.0030) [2024-06-23 21:50:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 8562933760. Throughput: 0: 42894.6. Samples: 8563100520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 21:50:22,242][15401] Updated weights for policy 0, policy_version 522650 (0.0029) [2024-06-23 21:50:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 8563179520. Throughput: 0: 42916.0. Samples: 8563225180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 21:50:26,435][15401] Updated weights for policy 0, policy_version 522660 (0.0041) [2024-06-23 21:50:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 8563359744. Throughput: 0: 42704.8. Samples: 8563476160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 21:50:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 21:50:28,844][15349] Signal inference workers to stop experience collection... (126850 times) [2024-06-23 21:50:28,889][15401] InferenceWorker_p0-w0: stopping experience collection (126850 times) [2024-06-23 21:50:28,895][15349] Signal inference workers to resume experience collection... (126850 times) [2024-06-23 21:50:28,904][15401] InferenceWorker_p0-w0: resuming experience collection (126850 times) [2024-06-23 21:50:29,860][15401] Updated weights for policy 0, policy_version 522670 (0.0033) [2024-06-23 21:50:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8563556352. Throughput: 0: 42827.0. Samples: 8563736040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 21:50:34,063][15401] Updated weights for policy 0, policy_version 522680 (0.0026) [2024-06-23 21:50:37,420][15401] Updated weights for policy 0, policy_version 522690 (0.0031) [2024-06-23 21:50:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8563802112. Throughput: 0: 42659.1. Samples: 8563856060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 21:50:41,581][15401] Updated weights for policy 0, policy_version 522700 (0.0032) [2024-06-23 21:50:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8563998720. Throughput: 0: 42699.4. Samples: 8564117580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 21:50:45,110][15401] Updated weights for policy 0, policy_version 522710 (0.0026) [2024-06-23 21:50:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8564178944. Throughput: 0: 42699.7. Samples: 8564377460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-23 21:50:49,353][15401] Updated weights for policy 0, policy_version 522720 (0.0022) [2024-06-23 21:50:52,652][15401] Updated weights for policy 0, policy_version 522730 (0.0042) [2024-06-23 21:50:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8564457472. Throughput: 0: 42742.7. Samples: 8564504320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:53,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 21:50:56,896][15401] Updated weights for policy 0, policy_version 522740 (0.0038) [2024-06-23 21:50:58,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43147.4, 300 sec: 42876.1). Total num frames: 8564654080. Throughput: 0: 42766.7. Samples: 8564765080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:50:58,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 21:51:00,243][15401] Updated weights for policy 0, policy_version 522750 (0.0030) [2024-06-23 21:51:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 8564834304. Throughput: 0: 42692.0. Samples: 8565021660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 21:51:04,574][15401] Updated weights for policy 0, policy_version 522760 (0.0045) [2024-06-23 21:51:07,642][15401] Updated weights for policy 0, policy_version 522770 (0.0035) [2024-06-23 21:51:08,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8565080064. Throughput: 0: 42678.7. Samples: 8565145720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:08,396][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 21:51:12,145][15401] Updated weights for policy 0, policy_version 522780 (0.0040) [2024-06-23 21:51:13,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8565293056. Throughput: 0: 42951.2. Samples: 8565408960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 21:51:15,188][15401] Updated weights for policy 0, policy_version 522790 (0.0042) [2024-06-23 21:51:18,390][15132] Fps is (10 sec: 40956.7, 60 sec: 42597.9, 300 sec: 42765.1). Total num frames: 8565489664. Throughput: 0: 42765.1. Samples: 8565660500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:18,391][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 21:51:19,817][15401] Updated weights for policy 0, policy_version 522800 (0.0044) [2024-06-23 21:51:23,181][15401] Updated weights for policy 0, policy_version 522810 (0.0026) [2024-06-23 21:51:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42820.6). Total num frames: 8565735424. Throughput: 0: 42944.8. Samples: 8565788680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:23,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 21:51:27,497][15401] Updated weights for policy 0, policy_version 522820 (0.0040) [2024-06-23 21:51:28,390][15132] Fps is (10 sec: 44239.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8565932032. Throughput: 0: 42919.8. Samples: 8566048980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:28,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 21:51:30,783][15401] Updated weights for policy 0, policy_version 522830 (0.0035) [2024-06-23 21:51:33,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8566145024. Throughput: 0: 42715.4. Samples: 8566299660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 21:51:34,785][15349] Signal inference workers to stop experience collection... (126900 times) [2024-06-23 21:51:34,785][15349] Signal inference workers to resume experience collection... (126900 times) [2024-06-23 21:51:34,820][15401] InferenceWorker_p0-w0: stopping experience collection (126900 times) [2024-06-23 21:51:34,820][15401] InferenceWorker_p0-w0: resuming experience collection (126900 times) [2024-06-23 21:51:35,205][15401] Updated weights for policy 0, policy_version 522840 (0.0042) [2024-06-23 21:51:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8566341632. Throughput: 0: 42692.4. Samples: 8566425480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:51:38,717][15401] Updated weights for policy 0, policy_version 522850 (0.0034) [2024-06-23 21:51:42,778][15401] Updated weights for policy 0, policy_version 522860 (0.0027) [2024-06-23 21:51:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8566587392. Throughput: 0: 42849.8. Samples: 8566693220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:43,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 21:51:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522863_8566587392.pth... [2024-06-23 21:51:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522238_8556347392.pth [2024-06-23 21:51:46,189][15401] Updated weights for policy 0, policy_version 522870 (0.0032) [2024-06-23 21:51:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8566784000. Throughput: 0: 42686.4. Samples: 8566942540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 21:51:50,358][15401] Updated weights for policy 0, policy_version 522880 (0.0032) [2024-06-23 21:51:53,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42765.3). Total num frames: 8566996992. Throughput: 0: 42716.4. Samples: 8567068060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 21:51:53,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 21:51:53,789][15401] Updated weights for policy 0, policy_version 522890 (0.0036) [2024-06-23 21:51:57,971][15401] Updated weights for policy 0, policy_version 522900 (0.0029) [2024-06-23 21:51:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 8567193600. Throughput: 0: 42655.5. Samples: 8567328460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:51:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 21:52:01,345][15401] Updated weights for policy 0, policy_version 522910 (0.0043) [2024-06-23 21:52:03,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8567422976. Throughput: 0: 42716.4. Samples: 8567582700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 21:52:05,481][15401] Updated weights for policy 0, policy_version 522920 (0.0037) [2024-06-23 21:52:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8567635968. Throughput: 0: 42855.5. Samples: 8567717080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 21:52:09,004][15401] Updated weights for policy 0, policy_version 522930 (0.0032) [2024-06-23 21:52:13,030][15401] Updated weights for policy 0, policy_version 522940 (0.0024) [2024-06-23 21:52:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8567848960. Throughput: 0: 42738.3. Samples: 8567972200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 21:52:16,509][15401] Updated weights for policy 0, policy_version 522950 (0.0027) [2024-06-23 21:52:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42599.0, 300 sec: 42820.6). Total num frames: 8568045568. Throughput: 0: 42915.3. Samples: 8568230840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 21:52:21,125][15401] Updated weights for policy 0, policy_version 522960 (0.0027) [2024-06-23 21:52:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8568291328. Throughput: 0: 43025.8. Samples: 8568361640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 21:52:23,989][15401] Updated weights for policy 0, policy_version 522970 (0.0032) [2024-06-23 21:52:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8568487936. Throughput: 0: 42779.9. Samples: 8568618320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 21:52:28,549][15401] Updated weights for policy 0, policy_version 522980 (0.0026) [2024-06-23 21:52:31,994][15401] Updated weights for policy 0, policy_version 522990 (0.0026) [2024-06-23 21:52:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8568700928. Throughput: 0: 42987.0. Samples: 8568876960. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 21:52:36,065][15401] Updated weights for policy 0, policy_version 523000 (0.0041) [2024-06-23 21:52:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8568930304. Throughput: 0: 42993.0. Samples: 8569002640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 21:52:39,771][15401] Updated weights for policy 0, policy_version 523010 (0.0031) [2024-06-23 21:52:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8569143296. Throughput: 0: 42937.4. Samples: 8569260640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 21:52:43,593][15401] Updated weights for policy 0, policy_version 523020 (0.0033) [2024-06-23 21:52:47,605][15401] Updated weights for policy 0, policy_version 523030 (0.0037) [2024-06-23 21:52:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8569356288. Throughput: 0: 42991.9. Samples: 8569517340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 21:52:51,167][15401] Updated weights for policy 0, policy_version 523040 (0.0041) [2024-06-23 21:52:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 8569585664. Throughput: 0: 42857.3. Samples: 8569645660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 21:52:55,095][15401] Updated weights for policy 0, policy_version 523050 (0.0050) [2024-06-23 21:52:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8569765888. Throughput: 0: 42976.9. Samples: 8569906160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:52:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 21:52:58,933][15401] Updated weights for policy 0, policy_version 523060 (0.0048) [2024-06-23 21:53:02,655][15401] Updated weights for policy 0, policy_version 523070 (0.0039) [2024-06-23 21:53:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 8569995264. Throughput: 0: 42802.9. Samples: 8570157080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:53:03,393][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 21:53:04,787][15349] Signal inference workers to stop experience collection... (126950 times) [2024-06-23 21:53:04,787][15349] Signal inference workers to resume experience collection... (126950 times) [2024-06-23 21:53:04,801][15401] InferenceWorker_p0-w0: stopping experience collection (126950 times) [2024-06-23 21:53:04,801][15401] InferenceWorker_p0-w0: resuming experience collection (126950 times) [2024-06-23 21:53:06,809][15401] Updated weights for policy 0, policy_version 523080 (0.0042) [2024-06-23 21:53:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8570224640. Throughput: 0: 42728.2. Samples: 8570284420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:53:08,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 21:53:10,217][15401] Updated weights for policy 0, policy_version 523090 (0.0036) [2024-06-23 21:53:13,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8570404864. Throughput: 0: 42864.9. Samples: 8570547240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:53:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 21:53:14,860][15401] Updated weights for policy 0, policy_version 523100 (0.0036) [2024-06-23 21:53:17,910][15401] Updated weights for policy 0, policy_version 523110 (0.0036) [2024-06-23 21:53:18,392][15132] Fps is (10 sec: 40951.0, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 8570634240. Throughput: 0: 42585.3. Samples: 8570793400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:53:18,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 21:53:22,417][15401] Updated weights for policy 0, policy_version 523120 (0.0044) [2024-06-23 21:53:23,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 8570880000. Throughput: 0: 42760.8. Samples: 8570926880. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-23 21:53:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 21:53:25,865][15401] Updated weights for policy 0, policy_version 523130 (0.0030) [2024-06-23 21:53:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8571060224. Throughput: 0: 42804.5. Samples: 8571186840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 21:53:29,989][15401] Updated weights for policy 0, policy_version 523140 (0.0028) [2024-06-23 21:53:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 8571273216. Throughput: 0: 42607.9. Samples: 8571434700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:33,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 21:53:33,527][15401] Updated weights for policy 0, policy_version 523150 (0.0032) [2024-06-23 21:53:37,626][15401] Updated weights for policy 0, policy_version 523160 (0.0031) [2024-06-23 21:53:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8571502592. Throughput: 0: 42732.5. Samples: 8571568620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:38,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 21:53:41,502][15401] Updated weights for policy 0, policy_version 523170 (0.0031) [2024-06-23 21:53:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8571699200. Throughput: 0: 42645.7. Samples: 8571825220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 21:53:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523175_8571699200.pth... [2024-06-23 21:53:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522549_8561442816.pth [2024-06-23 21:53:45,242][15401] Updated weights for policy 0, policy_version 523180 (0.0033) [2024-06-23 21:53:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 8571928576. Throughput: 0: 42638.3. Samples: 8572075700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:48,396][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 21:53:49,095][15401] Updated weights for policy 0, policy_version 523190 (0.0029) [2024-06-23 21:53:52,815][15401] Updated weights for policy 0, policy_version 523200 (0.0032) [2024-06-23 21:53:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8572141568. Throughput: 0: 42866.0. Samples: 8572213380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 21:53:56,515][15401] Updated weights for policy 0, policy_version 523210 (0.0047) [2024-06-23 21:53:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8572321792. Throughput: 0: 42592.1. Samples: 8572463880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:53:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 21:54:00,616][15401] Updated weights for policy 0, policy_version 523220 (0.0025) [2024-06-23 21:54:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 8572567552. Throughput: 0: 42695.2. Samples: 8572714580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 21:54:04,085][15401] Updated weights for policy 0, policy_version 523230 (0.0038) [2024-06-23 21:54:08,110][15401] Updated weights for policy 0, policy_version 523240 (0.0032) [2024-06-23 21:54:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 8572780544. Throughput: 0: 42867.6. Samples: 8572855920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 21:54:11,613][15401] Updated weights for policy 0, policy_version 523250 (0.0035) [2024-06-23 21:54:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8572960768. Throughput: 0: 42680.4. Samples: 8573107460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 21:54:15,312][15349] Signal inference workers to stop experience collection... (127000 times) [2024-06-23 21:54:15,313][15349] Signal inference workers to resume experience collection... (127000 times) [2024-06-23 21:54:15,326][15401] InferenceWorker_p0-w0: stopping experience collection (127000 times) [2024-06-23 21:54:15,359][15401] InferenceWorker_p0-w0: resuming experience collection (127000 times) [2024-06-23 21:54:15,785][15401] Updated weights for policy 0, policy_version 523260 (0.0037) [2024-06-23 21:54:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8573222912. Throughput: 0: 42763.2. Samples: 8573359140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:18,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 21:54:19,207][15401] Updated weights for policy 0, policy_version 523270 (0.0031) [2024-06-23 21:54:23,339][15401] Updated weights for policy 0, policy_version 523280 (0.0028) [2024-06-23 21:54:23,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 8573419520. Throughput: 0: 42927.5. Samples: 8573500460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:23,392][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 21:54:26,647][15401] Updated weights for policy 0, policy_version 523290 (0.0040) [2024-06-23 21:54:28,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8573616128. Throughput: 0: 42819.6. Samples: 8573752100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-23 21:54:30,965][15401] Updated weights for policy 0, policy_version 523300 (0.0049) [2024-06-23 21:54:33,390][15132] Fps is (10 sec: 45885.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8573878272. Throughput: 0: 42897.2. Samples: 8574006080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 21:54:34,193][15401] Updated weights for policy 0, policy_version 523310 (0.0042) [2024-06-23 21:54:38,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 8574058496. Throughput: 0: 42920.5. Samples: 8574145080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:38,396][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 21:54:38,509][15401] Updated weights for policy 0, policy_version 523320 (0.0036) [2024-06-23 21:54:41,813][15401] Updated weights for policy 0, policy_version 523330 (0.0039) [2024-06-23 21:54:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8574255104. Throughput: 0: 42873.8. Samples: 8574393200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-23 21:54:46,463][15401] Updated weights for policy 0, policy_version 523340 (0.0043) [2024-06-23 21:54:48,389][15132] Fps is (10 sec: 47544.1, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 8574533632. Throughput: 0: 42860.0. Samples: 8574643280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 21:54:49,631][15401] Updated weights for policy 0, policy_version 523350 (0.0036) [2024-06-23 21:54:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 8574681088. Throughput: 0: 42806.6. Samples: 8574782220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 21:54:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 21:54:54,273][15401] Updated weights for policy 0, policy_version 523360 (0.0040) [2024-06-23 21:54:57,032][15401] Updated weights for policy 0, policy_version 523370 (0.0038) [2024-06-23 21:54:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8574910464. Throughput: 0: 42685.4. Samples: 8575028300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:54:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 21:55:01,874][15401] Updated weights for policy 0, policy_version 523380 (0.0036) [2024-06-23 21:55:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8575156224. Throughput: 0: 42790.2. Samples: 8575284600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:03,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 21:55:05,102][15401] Updated weights for policy 0, policy_version 523390 (0.0041) [2024-06-23 21:55:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8575336448. Throughput: 0: 42646.1. Samples: 8575419440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:08,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 21:55:09,722][15401] Updated weights for policy 0, policy_version 523400 (0.0036) [2024-06-23 21:55:12,811][15401] Updated weights for policy 0, policy_version 523410 (0.0026) [2024-06-23 21:55:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 43144.7, 300 sec: 42765.1). Total num frames: 8575549440. Throughput: 0: 42464.2. Samples: 8575662980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 21:55:17,366][15401] Updated weights for policy 0, policy_version 523420 (0.0037) [2024-06-23 21:55:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 8575778816. Throughput: 0: 42608.6. Samples: 8575923460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-23 21:55:20,309][15401] Updated weights for policy 0, policy_version 523430 (0.0039) [2024-06-23 21:55:23,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 8575942656. Throughput: 0: 42385.5. Samples: 8576052160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 21:55:23,780][15349] Signal inference workers to stop experience collection... (127050 times) [2024-06-23 21:55:23,780][15349] Signal inference workers to resume experience collection... (127050 times) [2024-06-23 21:55:23,791][15401] InferenceWorker_p0-w0: stopping experience collection (127050 times) [2024-06-23 21:55:23,815][15401] InferenceWorker_p0-w0: resuming experience collection (127050 times) [2024-06-23 21:55:25,019][15401] Updated weights for policy 0, policy_version 523440 (0.0022) [2024-06-23 21:55:27,894][15401] Updated weights for policy 0, policy_version 523450 (0.0034) [2024-06-23 21:55:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8576204800. Throughput: 0: 42496.6. Samples: 8576305560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 21:55:32,698][15401] Updated weights for policy 0, policy_version 523460 (0.0034) [2024-06-23 21:55:33,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 8576417792. Throughput: 0: 42711.1. Samples: 8576565280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:33,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-23 21:55:35,948][15401] Updated weights for policy 0, policy_version 523470 (0.0028) [2024-06-23 21:55:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 8576598016. Throughput: 0: 42417.8. Samples: 8576691020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 21:55:40,258][15401] Updated weights for policy 0, policy_version 523480 (0.0035) [2024-06-23 21:55:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8576843776. Throughput: 0: 42748.1. Samples: 8576951960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 21:55:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523490_8576860160.pth... [2024-06-23 21:55:43,421][15401] Updated weights for policy 0, policy_version 523490 (0.0027) [2024-06-23 21:55:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000522863_8566587392.pth [2024-06-23 21:55:47,791][15401] Updated weights for policy 0, policy_version 523500 (0.0025) [2024-06-23 21:55:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 8577040384. Throughput: 0: 42804.6. Samples: 8577210800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 21:55:51,006][15401] Updated weights for policy 0, policy_version 523510 (0.0038) [2024-06-23 21:55:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8577253376. Throughput: 0: 42561.9. Samples: 8577334720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 21:55:55,343][15401] Updated weights for policy 0, policy_version 523520 (0.0031) [2024-06-23 21:55:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8577482752. Throughput: 0: 42887.9. Samples: 8577592940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:55:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 21:55:58,773][15401] Updated weights for policy 0, policy_version 523530 (0.0032) [2024-06-23 21:56:03,110][15401] Updated weights for policy 0, policy_version 523540 (0.0024) [2024-06-23 21:56:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 8577679360. Throughput: 0: 42836.8. Samples: 8577851120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:56:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 21:56:06,689][15401] Updated weights for policy 0, policy_version 523550 (0.0027) [2024-06-23 21:56:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8577892352. Throughput: 0: 42709.7. Samples: 8577974100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:56:08,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 21:56:10,910][15401] Updated weights for policy 0, policy_version 523560 (0.0040) [2024-06-23 21:56:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.7). Total num frames: 8578121728. Throughput: 0: 42853.6. Samples: 8578233960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:56:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 21:56:14,405][15401] Updated weights for policy 0, policy_version 523570 (0.0035) [2024-06-23 21:56:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 8578318336. Throughput: 0: 42751.0. Samples: 8578489080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:56:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 21:56:18,496][15401] Updated weights for policy 0, policy_version 523580 (0.0034) [2024-06-23 21:56:22,109][15401] Updated weights for policy 0, policy_version 523590 (0.0029) [2024-06-23 21:56:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8578514944. Throughput: 0: 42759.9. Samples: 8578615220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-23 21:56:23,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 21:56:26,021][15401] Updated weights for policy 0, policy_version 523600 (0.0035) [2024-06-23 21:56:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8578760704. Throughput: 0: 42555.0. Samples: 8578866940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 21:56:30,007][15401] Updated weights for policy 0, policy_version 523610 (0.0030) [2024-06-23 21:56:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 8578957312. Throughput: 0: 42655.8. Samples: 8579130320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 21:56:33,893][15401] Updated weights for policy 0, policy_version 523620 (0.0040) [2024-06-23 21:56:37,599][15401] Updated weights for policy 0, policy_version 523630 (0.0041) [2024-06-23 21:56:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8579170304. Throughput: 0: 42669.7. Samples: 8579254860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:38,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 21:56:41,394][15401] Updated weights for policy 0, policy_version 523640 (0.0043) [2024-06-23 21:56:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8579399680. Throughput: 0: 42584.0. Samples: 8579509220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:43,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 21:56:45,449][15401] Updated weights for policy 0, policy_version 523650 (0.0030) [2024-06-23 21:56:47,120][15349] Signal inference workers to stop experience collection... (127100 times) [2024-06-23 21:56:47,120][15349] Signal inference workers to resume experience collection... (127100 times) [2024-06-23 21:56:47,137][15401] InferenceWorker_p0-w0: stopping experience collection (127100 times) [2024-06-23 21:56:47,138][15401] InferenceWorker_p0-w0: resuming experience collection (127100 times) [2024-06-23 21:56:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 8579596288. Throughput: 0: 42432.0. Samples: 8579760560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 21:56:49,072][15401] Updated weights for policy 0, policy_version 523660 (0.0030) [2024-06-23 21:56:53,077][15401] Updated weights for policy 0, policy_version 523670 (0.0039) [2024-06-23 21:56:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8579809280. Throughput: 0: 42507.1. Samples: 8579886920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:53,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-23 21:56:56,941][15401] Updated weights for policy 0, policy_version 523680 (0.0048) [2024-06-23 21:56:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8580005888. Throughput: 0: 42442.6. Samples: 8580143880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:56:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 21:57:00,846][15401] Updated weights for policy 0, policy_version 523690 (0.0045) [2024-06-23 21:57:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 8580235264. Throughput: 0: 42371.1. Samples: 8580395880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:03,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 21:57:04,849][15401] Updated weights for policy 0, policy_version 523700 (0.0036) [2024-06-23 21:57:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8580448256. Throughput: 0: 42477.9. Samples: 8580526720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:08,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 21:57:08,505][15401] Updated weights for policy 0, policy_version 523710 (0.0036) [2024-06-23 21:57:12,407][15401] Updated weights for policy 0, policy_version 523720 (0.0030) [2024-06-23 21:57:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8580644864. Throughput: 0: 42474.7. Samples: 8580778300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 21:57:16,272][15401] Updated weights for policy 0, policy_version 523730 (0.0032) [2024-06-23 21:57:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8580874240. Throughput: 0: 42329.1. Samples: 8581035120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 21:57:20,163][15401] Updated weights for policy 0, policy_version 523740 (0.0030) [2024-06-23 21:57:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8581087232. Throughput: 0: 42427.2. Samples: 8581164080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 21:57:24,385][15401] Updated weights for policy 0, policy_version 523750 (0.0027) [2024-06-23 21:57:27,725][15401] Updated weights for policy 0, policy_version 523760 (0.0027) [2024-06-23 21:57:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8581300224. Throughput: 0: 42501.3. Samples: 8581421780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 21:57:32,125][15401] Updated weights for policy 0, policy_version 523770 (0.0044) [2024-06-23 21:57:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 8581513216. Throughput: 0: 42655.0. Samples: 8581680140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:33,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 21:57:35,237][15401] Updated weights for policy 0, policy_version 523780 (0.0029) [2024-06-23 21:57:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8581726208. Throughput: 0: 42601.8. Samples: 8581804000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 21:57:39,615][15401] Updated weights for policy 0, policy_version 523790 (0.0033) [2024-06-23 21:57:42,998][15401] Updated weights for policy 0, policy_version 523800 (0.0044) [2024-06-23 21:57:43,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8581955584. Throughput: 0: 42691.5. Samples: 8582065000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 21:57:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523802_8581971968.pth... [2024-06-23 21:57:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523175_8571699200.pth [2024-06-23 21:57:47,598][15401] Updated weights for policy 0, policy_version 523810 (0.0031) [2024-06-23 21:57:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8582152192. Throughput: 0: 42652.6. Samples: 8582315140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-23 21:57:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 21:57:50,811][15401] Updated weights for policy 0, policy_version 523820 (0.0036) [2024-06-23 21:57:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8582365184. Throughput: 0: 42534.1. Samples: 8582440760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:57:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-23 21:57:55,155][15401] Updated weights for policy 0, policy_version 523830 (0.0039) [2024-06-23 21:57:58,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42869.9, 300 sec: 42654.0). Total num frames: 8582578176. Throughput: 0: 42600.2. Samples: 8582695400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:57:58,392][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 21:57:58,464][15401] Updated weights for policy 0, policy_version 523840 (0.0032) [2024-06-23 21:58:02,610][15401] Updated weights for policy 0, policy_version 523850 (0.0034) [2024-06-23 21:58:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 8582807552. Throughput: 0: 42622.1. Samples: 8582953120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 21:58:06,001][15401] Updated weights for policy 0, policy_version 523860 (0.0035) [2024-06-23 21:58:08,390][15132] Fps is (10 sec: 42607.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8583004160. Throughput: 0: 42673.2. Samples: 8583084380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 21:58:10,122][15401] Updated weights for policy 0, policy_version 523870 (0.0031) [2024-06-23 21:58:10,549][15349] Signal inference workers to stop experience collection... (127150 times) [2024-06-23 21:58:10,572][15401] InferenceWorker_p0-w0: stopping experience collection (127150 times) [2024-06-23 21:58:10,664][15349] Signal inference workers to resume experience collection... (127150 times) [2024-06-23 21:58:10,665][15401] InferenceWorker_p0-w0: resuming experience collection (127150 times) [2024-06-23 21:58:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 8583233536. Throughput: 0: 42646.6. Samples: 8583340880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 21:58:13,757][15401] Updated weights for policy 0, policy_version 523880 (0.0041) [2024-06-23 21:58:17,802][15401] Updated weights for policy 0, policy_version 523890 (0.0036) [2024-06-23 21:58:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8583430144. Throughput: 0: 42621.8. Samples: 8583598020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:18,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 21:58:21,521][15401] Updated weights for policy 0, policy_version 523900 (0.0032) [2024-06-23 21:58:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8583659520. Throughput: 0: 42785.3. Samples: 8583729340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 21:58:25,596][15401] Updated weights for policy 0, policy_version 523910 (0.0040) [2024-06-23 21:58:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8583856128. Throughput: 0: 42501.8. Samples: 8583977580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 21:58:29,176][15401] Updated weights for policy 0, policy_version 523920 (0.0036) [2024-06-23 21:58:33,214][15401] Updated weights for policy 0, policy_version 523930 (0.0036) [2024-06-23 21:58:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 8584069120. Throughput: 0: 42853.3. Samples: 8584243540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 21:58:36,839][15401] Updated weights for policy 0, policy_version 523940 (0.0029) [2024-06-23 21:58:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8584298496. Throughput: 0: 42918.2. Samples: 8584372080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:38,395][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 21:58:40,857][15401] Updated weights for policy 0, policy_version 523950 (0.0025) [2024-06-23 21:58:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8584495104. Throughput: 0: 42940.7. Samples: 8584627640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 21:58:44,334][15401] Updated weights for policy 0, policy_version 523960 (0.0028) [2024-06-23 21:58:48,362][15401] Updated weights for policy 0, policy_version 523970 (0.0050) [2024-06-23 21:58:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8584724480. Throughput: 0: 43069.4. Samples: 8584891240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 21:58:51,838][15401] Updated weights for policy 0, policy_version 523980 (0.0038) [2024-06-23 21:58:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8584921088. Throughput: 0: 42912.1. Samples: 8585015420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 21:58:55,938][15401] Updated weights for policy 0, policy_version 523990 (0.0041) [2024-06-23 21:58:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 8585150464. Throughput: 0: 42896.9. Samples: 8585271240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:58:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 21:58:59,453][15401] Updated weights for policy 0, policy_version 524000 (0.0032) [2024-06-23 21:59:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8585363456. Throughput: 0: 42940.9. Samples: 8585530360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:59:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 21:59:03,513][15401] Updated weights for policy 0, policy_version 524010 (0.0036) [2024-06-23 21:59:07,450][15401] Updated weights for policy 0, policy_version 524020 (0.0025) [2024-06-23 21:59:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8585576448. Throughput: 0: 42882.3. Samples: 8585659040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:59:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 21:59:11,339][15401] Updated weights for policy 0, policy_version 524030 (0.0033) [2024-06-23 21:59:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 8585789440. Throughput: 0: 43120.4. Samples: 8585918000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:59:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 21:59:15,023][15401] Updated weights for policy 0, policy_version 524040 (0.0046) [2024-06-23 21:59:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 8586002432. Throughput: 0: 42736.5. Samples: 8586166680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-23 21:59:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 21:59:18,899][15401] Updated weights for policy 0, policy_version 524050 (0.0036) [2024-06-23 21:59:22,842][15401] Updated weights for policy 0, policy_version 524060 (0.0033) [2024-06-23 21:59:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8586215424. Throughput: 0: 42844.9. Samples: 8586300100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 21:59:26,690][15401] Updated weights for policy 0, policy_version 524070 (0.0035) [2024-06-23 21:59:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8586428416. Throughput: 0: 42817.8. Samples: 8586554440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 21:59:30,341][15401] Updated weights for policy 0, policy_version 524080 (0.0033) [2024-06-23 21:59:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 8586657792. Throughput: 0: 42578.2. Samples: 8586807260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 21:59:34,363][15401] Updated weights for policy 0, policy_version 524090 (0.0038) [2024-06-23 21:59:37,791][15401] Updated weights for policy 0, policy_version 524100 (0.0024) [2024-06-23 21:59:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8586854400. Throughput: 0: 42797.2. Samples: 8586941300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 21:59:41,761][15401] Updated weights for policy 0, policy_version 524110 (0.0042) [2024-06-23 21:59:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 8587051008. Throughput: 0: 42697.8. Samples: 8587192640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 21:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524113_8587067392.pth... [2024-06-23 21:59:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523490_8576860160.pth [2024-06-23 21:59:45,874][15401] Updated weights for policy 0, policy_version 524120 (0.0037) [2024-06-23 21:59:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 8587280384. Throughput: 0: 42688.8. Samples: 8587451460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:48,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 21:59:49,583][15401] Updated weights for policy 0, policy_version 524130 (0.0033) [2024-06-23 21:59:53,332][15401] Updated weights for policy 0, policy_version 524140 (0.0038) [2024-06-23 21:59:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8587509760. Throughput: 0: 42769.3. Samples: 8587583660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:53,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 21:59:56,979][15349] Signal inference workers to stop experience collection... (127200 times) [2024-06-23 21:59:56,979][15349] Signal inference workers to resume experience collection... (127200 times) [2024-06-23 21:59:57,011][15401] InferenceWorker_p0-w0: stopping experience collection (127200 times) [2024-06-23 21:59:57,011][15401] InferenceWorker_p0-w0: resuming experience collection (127200 times) [2024-06-23 21:59:57,134][15401] Updated weights for policy 0, policy_version 524150 (0.0031) [2024-06-23 21:59:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8587689984. Throughput: 0: 42700.1. Samples: 8587839500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 21:59:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 22:00:00,804][15401] Updated weights for policy 0, policy_version 524160 (0.0034) [2024-06-23 22:00:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8587935744. Throughput: 0: 42874.6. Samples: 8588096040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 22:00:04,689][15401] Updated weights for policy 0, policy_version 524170 (0.0033) [2024-06-23 22:00:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8588132352. Throughput: 0: 42795.5. Samples: 8588225900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 22:00:08,842][15401] Updated weights for policy 0, policy_version 524180 (0.0030) [2024-06-23 22:00:12,375][15401] Updated weights for policy 0, policy_version 524190 (0.0036) [2024-06-23 22:00:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8588345344. Throughput: 0: 42728.8. Samples: 8588477240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 22:00:16,819][15401] Updated weights for policy 0, policy_version 524200 (0.0028) [2024-06-23 22:00:18,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8588558336. Throughput: 0: 42742.9. Samples: 8588730680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 22:00:20,274][15401] Updated weights for policy 0, policy_version 524210 (0.0031) [2024-06-23 22:00:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8588754944. Throughput: 0: 42653.0. Samples: 8588860680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:23,399][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 22:00:24,422][15401] Updated weights for policy 0, policy_version 524220 (0.0044) [2024-06-23 22:00:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8588984320. Throughput: 0: 42551.2. Samples: 8589107440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 22:00:28,393][15401] Updated weights for policy 0, policy_version 524230 (0.0036) [2024-06-23 22:00:32,091][15401] Updated weights for policy 0, policy_version 524240 (0.0027) [2024-06-23 22:00:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8589213696. Throughput: 0: 42500.8. Samples: 8589363900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 22:00:35,869][15401] Updated weights for policy 0, policy_version 524250 (0.0027) [2024-06-23 22:00:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8589410304. Throughput: 0: 42429.5. Samples: 8589492980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:38,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-23 22:00:39,711][15401] Updated weights for policy 0, policy_version 524260 (0.0029) [2024-06-23 22:00:43,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8589623296. Throughput: 0: 42484.0. Samples: 8589751280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 22:00:43,396][15401] Updated weights for policy 0, policy_version 524270 (0.0038) [2024-06-23 22:00:47,458][15401] Updated weights for policy 0, policy_version 524280 (0.0035) [2024-06-23 22:00:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 8589869056. Throughput: 0: 42487.6. Samples: 8590007980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-23 22:00:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 22:00:50,894][15401] Updated weights for policy 0, policy_version 524290 (0.0033) [2024-06-23 22:00:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8590049280. Throughput: 0: 42414.3. Samples: 8590134540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:00:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 22:00:55,063][15401] Updated weights for policy 0, policy_version 524300 (0.0038) [2024-06-23 22:00:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8590278656. Throughput: 0: 42506.7. Samples: 8590390040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:00:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 22:00:58,491][15401] Updated weights for policy 0, policy_version 524310 (0.0026) [2024-06-23 22:01:02,882][15401] Updated weights for policy 0, policy_version 524320 (0.0029) [2024-06-23 22:01:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8590491648. Throughput: 0: 42788.3. Samples: 8590656160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 22:01:05,819][15349] Signal inference workers to stop experience collection... (127250 times) [2024-06-23 22:01:05,872][15401] InferenceWorker_p0-w0: stopping experience collection (127250 times) [2024-06-23 22:01:05,880][15349] Signal inference workers to resume experience collection... (127250 times) [2024-06-23 22:01:05,885][15401] InferenceWorker_p0-w0: resuming experience collection (127250 times) [2024-06-23 22:01:06,062][15401] Updated weights for policy 0, policy_version 524330 (0.0045) [2024-06-23 22:01:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8590688256. Throughput: 0: 42580.5. Samples: 8590776800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 22:01:10,699][15401] Updated weights for policy 0, policy_version 524340 (0.0024) [2024-06-23 22:01:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8590934016. Throughput: 0: 42800.0. Samples: 8591033440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 22:01:13,510][15401] Updated weights for policy 0, policy_version 524350 (0.0026) [2024-06-23 22:01:18,222][15401] Updated weights for policy 0, policy_version 524360 (0.0038) [2024-06-23 22:01:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8591114240. Throughput: 0: 42941.1. Samples: 8591296240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 22:01:21,186][15401] Updated weights for policy 0, policy_version 524370 (0.0021) [2024-06-23 22:01:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8591327232. Throughput: 0: 42681.8. Samples: 8591413660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 22:01:25,807][15401] Updated weights for policy 0, policy_version 524380 (0.0045) [2024-06-23 22:01:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8591556608. Throughput: 0: 42764.8. Samples: 8591675700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 22:01:28,738][15401] Updated weights for policy 0, policy_version 524390 (0.0030) [2024-06-23 22:01:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8591736832. Throughput: 0: 43026.1. Samples: 8591944160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 22:01:33,667][15401] Updated weights for policy 0, policy_version 524400 (0.0030) [2024-06-23 22:01:36,237][15401] Updated weights for policy 0, policy_version 524410 (0.0038) [2024-06-23 22:01:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8591982592. Throughput: 0: 42751.2. Samples: 8592058340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 22:01:41,328][15401] Updated weights for policy 0, policy_version 524420 (0.0028) [2024-06-23 22:01:43,389][15132] Fps is (10 sec: 47514.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8592211968. Throughput: 0: 42856.1. Samples: 8592318560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 22:01:43,447][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524428_8592228352.pth... [2024-06-23 22:01:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000523802_8581971968.pth [2024-06-23 22:01:44,173][15401] Updated weights for policy 0, policy_version 524430 (0.0037) [2024-06-23 22:01:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 8592375808. Throughput: 0: 42873.9. Samples: 8592585480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 22:01:48,768][15401] Updated weights for policy 0, policy_version 524440 (0.0028) [2024-06-23 22:01:51,747][15401] Updated weights for policy 0, policy_version 524450 (0.0039) [2024-06-23 22:01:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8592637952. Throughput: 0: 42762.7. Samples: 8592701120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 22:01:56,137][15349] Signal inference workers to stop experience collection... (127300 times) [2024-06-23 22:01:56,165][15401] InferenceWorker_p0-w0: stopping experience collection (127300 times) [2024-06-23 22:01:56,195][15349] Signal inference workers to resume experience collection... (127300 times) [2024-06-23 22:01:56,200][15401] InferenceWorker_p0-w0: resuming experience collection (127300 times) [2024-06-23 22:01:56,334][15401] Updated weights for policy 0, policy_version 524460 (0.0024) [2024-06-23 22:01:58,396][15132] Fps is (10 sec: 47482.8, 60 sec: 42866.9, 300 sec: 42764.4). Total num frames: 8592850944. Throughput: 0: 42946.7. Samples: 8592966320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:01:58,396][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 22:01:59,545][15401] Updated weights for policy 0, policy_version 524470 (0.0035) [2024-06-23 22:02:03,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8593014784. Throughput: 0: 42963.1. Samples: 8593229580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:02:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 22:02:03,889][15401] Updated weights for policy 0, policy_version 524480 (0.0030) [2024-06-23 22:02:07,119][15401] Updated weights for policy 0, policy_version 524490 (0.0022) [2024-06-23 22:02:08,390][15132] Fps is (10 sec: 44264.5, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 8593293312. Throughput: 0: 42947.4. Samples: 8593346300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:02:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 22:02:11,705][15401] Updated weights for policy 0, policy_version 524500 (0.0034) [2024-06-23 22:02:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8593489920. Throughput: 0: 42829.3. Samples: 8593603020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:02:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 22:02:14,822][15401] Updated weights for policy 0, policy_version 524510 (0.0043) [2024-06-23 22:02:18,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8593670144. Throughput: 0: 42576.7. Samples: 8593860100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 22:02:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 22:02:19,511][15401] Updated weights for policy 0, policy_version 524520 (0.0031) [2024-06-23 22:02:22,662][15401] Updated weights for policy 0, policy_version 524530 (0.0050) [2024-06-23 22:02:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8593915904. Throughput: 0: 42744.3. Samples: 8593981840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 22:02:27,161][15401] Updated weights for policy 0, policy_version 524540 (0.0037) [2024-06-23 22:02:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 8594128896. Throughput: 0: 42774.5. Samples: 8594243420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 22:02:30,317][15401] Updated weights for policy 0, policy_version 524550 (0.0030) [2024-06-23 22:02:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8594309120. Throughput: 0: 42504.8. Samples: 8594498200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 22:02:34,773][15401] Updated weights for policy 0, policy_version 524560 (0.0039) [2024-06-23 22:02:38,030][15401] Updated weights for policy 0, policy_version 524570 (0.0024) [2024-06-23 22:02:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8594554880. Throughput: 0: 42690.1. Samples: 8594622180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 22:02:42,455][15401] Updated weights for policy 0, policy_version 524580 (0.0035) [2024-06-23 22:02:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8594751488. Throughput: 0: 42671.9. Samples: 8594886280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:43,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 22:02:45,807][15401] Updated weights for policy 0, policy_version 524590 (0.0032) [2024-06-23 22:02:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8594964480. Throughput: 0: 42297.3. Samples: 8595132960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 22:02:50,001][15401] Updated weights for policy 0, policy_version 524600 (0.0035) [2024-06-23 22:02:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 8595177472. Throughput: 0: 42578.3. Samples: 8595262320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 22:02:53,795][15401] Updated weights for policy 0, policy_version 524610 (0.0034) [2024-06-23 22:02:57,791][15401] Updated weights for policy 0, policy_version 524620 (0.0037) [2024-06-23 22:02:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 8595390464. Throughput: 0: 42623.2. Samples: 8595521060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:02:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 22:03:01,398][15401] Updated weights for policy 0, policy_version 524630 (0.0043) [2024-06-23 22:03:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 8595619840. Throughput: 0: 42657.8. Samples: 8595779700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 22:03:05,513][15401] Updated weights for policy 0, policy_version 524640 (0.0035) [2024-06-23 22:03:08,394][15132] Fps is (10 sec: 44217.9, 60 sec: 42322.4, 300 sec: 42708.9). Total num frames: 8595832832. Throughput: 0: 42695.1. Samples: 8595903300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:08,394][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 22:03:09,009][15401] Updated weights for policy 0, policy_version 524650 (0.0031) [2024-06-23 22:03:12,149][15349] Signal inference workers to stop experience collection... (127350 times) [2024-06-23 22:03:12,190][15401] InferenceWorker_p0-w0: stopping experience collection (127350 times) [2024-06-23 22:03:12,222][15349] Signal inference workers to resume experience collection... (127350 times) [2024-06-23 22:03:12,223][15401] InferenceWorker_p0-w0: resuming experience collection (127350 times) [2024-06-23 22:03:13,011][15401] Updated weights for policy 0, policy_version 524660 (0.0038) [2024-06-23 22:03:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8596045824. Throughput: 0: 42681.4. Samples: 8596164080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 22:03:16,839][15401] Updated weights for policy 0, policy_version 524670 (0.0034) [2024-06-23 22:03:18,392][15132] Fps is (10 sec: 40967.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8596242432. Throughput: 0: 42508.0. Samples: 8596411160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:18,393][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 22:03:20,975][15401] Updated weights for policy 0, policy_version 524680 (0.0033) [2024-06-23 22:03:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 8596455424. Throughput: 0: 42673.2. Samples: 8596542580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:23,392][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 22:03:24,290][15401] Updated weights for policy 0, policy_version 524690 (0.0029) [2024-06-23 22:03:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8596668416. Throughput: 0: 42516.8. Samples: 8596799540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 22:03:28,507][15401] Updated weights for policy 0, policy_version 524700 (0.0040) [2024-06-23 22:03:31,898][15401] Updated weights for policy 0, policy_version 524710 (0.0037) [2024-06-23 22:03:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8596897792. Throughput: 0: 42609.8. Samples: 8597050400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 22:03:36,050][15401] Updated weights for policy 0, policy_version 524720 (0.0034) [2024-06-23 22:03:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8597094400. Throughput: 0: 42621.4. Samples: 8597180280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:38,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 22:03:39,593][15401] Updated weights for policy 0, policy_version 524730 (0.0047) [2024-06-23 22:03:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8597307392. Throughput: 0: 42727.5. Samples: 8597443800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 22:03:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 22:03:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524738_8597307392.pth... [2024-06-23 22:03:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524113_8587067392.pth [2024-06-23 22:03:43,997][15401] Updated weights for policy 0, policy_version 524740 (0.0032) [2024-06-23 22:03:47,226][15401] Updated weights for policy 0, policy_version 524750 (0.0032) [2024-06-23 22:03:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8597553152. Throughput: 0: 42496.7. Samples: 8597692060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:03:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 22:03:51,534][15401] Updated weights for policy 0, policy_version 524760 (0.0037) [2024-06-23 22:03:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8597749760. Throughput: 0: 42828.4. Samples: 8597830400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:03:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 22:03:54,977][15401] Updated weights for policy 0, policy_version 524770 (0.0023) [2024-06-23 22:03:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8597962752. Throughput: 0: 42649.7. Samples: 8598083320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:03:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 22:03:59,107][15401] Updated weights for policy 0, policy_version 524780 (0.0037) [2024-06-23 22:04:02,751][15401] Updated weights for policy 0, policy_version 524790 (0.0033) [2024-06-23 22:04:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8598192128. Throughput: 0: 42719.2. Samples: 8598333420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-23 22:04:06,565][15401] Updated weights for policy 0, policy_version 524800 (0.0030) [2024-06-23 22:04:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42601.5, 300 sec: 42709.5). Total num frames: 8598388736. Throughput: 0: 42855.7. Samples: 8598470980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 22:04:10,250][15401] Updated weights for policy 0, policy_version 524810 (0.0035) [2024-06-23 22:04:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8598618112. Throughput: 0: 42832.4. Samples: 8598727000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 22:04:14,173][15401] Updated weights for policy 0, policy_version 524820 (0.0028) [2024-06-23 22:04:17,880][15401] Updated weights for policy 0, policy_version 524830 (0.0038) [2024-06-23 22:04:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 8598831104. Throughput: 0: 42890.1. Samples: 8598980460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 22:04:21,908][15401] Updated weights for policy 0, policy_version 524840 (0.0034) [2024-06-23 22:04:22,997][15349] Signal inference workers to stop experience collection... (127400 times) [2024-06-23 22:04:23,052][15401] InferenceWorker_p0-w0: stopping experience collection (127400 times) [2024-06-23 22:04:23,119][15349] Signal inference workers to resume experience collection... (127400 times) [2024-06-23 22:04:23,120][15401] InferenceWorker_p0-w0: resuming experience collection (127400 times) [2024-06-23 22:04:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8599027712. Throughput: 0: 42913.7. Samples: 8599111400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 22:04:25,605][15401] Updated weights for policy 0, policy_version 524850 (0.0036) [2024-06-23 22:04:28,392][15132] Fps is (10 sec: 40948.4, 60 sec: 42869.3, 300 sec: 42653.5). Total num frames: 8599240704. Throughput: 0: 42677.7. Samples: 8599364420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:28,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 22:04:30,118][15401] Updated weights for policy 0, policy_version 524860 (0.0030) [2024-06-23 22:04:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8599453696. Throughput: 0: 42830.1. Samples: 8599619420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 22:04:33,584][15401] Updated weights for policy 0, policy_version 524870 (0.0034) [2024-06-23 22:04:37,777][15401] Updated weights for policy 0, policy_version 524880 (0.0028) [2024-06-23 22:04:38,389][15132] Fps is (10 sec: 40972.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8599650304. Throughput: 0: 42579.8. Samples: 8599746480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 22:04:41,142][15401] Updated weights for policy 0, policy_version 524890 (0.0036) [2024-06-23 22:04:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 8599896064. Throughput: 0: 42597.0. Samples: 8600000180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 22:04:45,445][15401] Updated weights for policy 0, policy_version 524900 (0.0041) [2024-06-23 22:04:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8600092672. Throughput: 0: 42771.1. Samples: 8600258120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 22:04:48,834][15401] Updated weights for policy 0, policy_version 524910 (0.0034) [2024-06-23 22:04:53,059][15401] Updated weights for policy 0, policy_version 524920 (0.0028) [2024-06-23 22:04:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8600289280. Throughput: 0: 42680.3. Samples: 8600391600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 22:04:56,528][15401] Updated weights for policy 0, policy_version 524930 (0.0029) [2024-06-23 22:04:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 8600535040. Throughput: 0: 42650.2. Samples: 8600646360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:04:58,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 22:05:00,541][15401] Updated weights for policy 0, policy_version 524940 (0.0036) [2024-06-23 22:05:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8600748032. Throughput: 0: 42679.7. Samples: 8600901040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:05:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-23 22:05:04,154][15401] Updated weights for policy 0, policy_version 524950 (0.0031) [2024-06-23 22:05:08,098][15401] Updated weights for policy 0, policy_version 524960 (0.0035) [2024-06-23 22:05:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8600944640. Throughput: 0: 42748.5. Samples: 8601035080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:05:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 22:05:11,902][15401] Updated weights for policy 0, policy_version 524970 (0.0033) [2024-06-23 22:05:13,391][15132] Fps is (10 sec: 40955.5, 60 sec: 42324.7, 300 sec: 42709.3). Total num frames: 8601157632. Throughput: 0: 42675.2. Samples: 8601284720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-23 22:05:13,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 22:05:15,931][15401] Updated weights for policy 0, policy_version 524980 (0.0028) [2024-06-23 22:05:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 8601387008. Throughput: 0: 42616.7. Samples: 8601537160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:18,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-23 22:05:19,930][15401] Updated weights for policy 0, policy_version 524990 (0.0042) [2024-06-23 22:05:23,389][15132] Fps is (10 sec: 40964.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8601567232. Throughput: 0: 42634.6. Samples: 8601665040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:23,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-23 22:05:23,624][15401] Updated weights for policy 0, policy_version 525000 (0.0029) [2024-06-23 22:05:27,413][15401] Updated weights for policy 0, policy_version 525010 (0.0032) [2024-06-23 22:05:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.5, 300 sec: 42654.0). Total num frames: 8601796608. Throughput: 0: 42846.2. Samples: 8601928260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 22:05:31,140][15401] Updated weights for policy 0, policy_version 525020 (0.0038) [2024-06-23 22:05:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8602025984. Throughput: 0: 42768.4. Samples: 8602182700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 22:05:34,987][15401] Updated weights for policy 0, policy_version 525030 (0.0029) [2024-06-23 22:05:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 8602222592. Throughput: 0: 42726.2. Samples: 8602314280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 22:05:38,691][15401] Updated weights for policy 0, policy_version 525040 (0.0035) [2024-06-23 22:05:42,672][15401] Updated weights for policy 0, policy_version 525050 (0.0041) [2024-06-23 22:05:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8602435584. Throughput: 0: 42740.1. Samples: 8602569560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:05:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525052_8602451968.pth... [2024-06-23 22:05:43,524][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524428_8592228352.pth [2024-06-23 22:05:44,534][15349] Signal inference workers to stop experience collection... (127450 times) [2024-06-23 22:05:44,536][15349] Signal inference workers to resume experience collection... (127450 times) [2024-06-23 22:05:44,557][15401] InferenceWorker_p0-w0: stopping experience collection (127450 times) [2024-06-23 22:05:44,591][15401] InferenceWorker_p0-w0: resuming experience collection (127450 times) [2024-06-23 22:05:46,202][15401] Updated weights for policy 0, policy_version 525060 (0.0035) [2024-06-23 22:05:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8602664960. Throughput: 0: 42902.6. Samples: 8602831660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:05:50,335][15401] Updated weights for policy 0, policy_version 525070 (0.0022) [2024-06-23 22:05:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8602877952. Throughput: 0: 42782.5. Samples: 8602960300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 22:05:53,767][15401] Updated weights for policy 0, policy_version 525080 (0.0034) [2024-06-23 22:05:58,006][15401] Updated weights for policy 0, policy_version 525090 (0.0036) [2024-06-23 22:05:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 8603090944. Throughput: 0: 42836.0. Samples: 8603212300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:05:58,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-23 22:06:02,005][15401] Updated weights for policy 0, policy_version 525100 (0.0031) [2024-06-23 22:06:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8603303936. Throughput: 0: 42952.3. Samples: 8603470020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 22:06:05,504][15401] Updated weights for policy 0, policy_version 525110 (0.0027) [2024-06-23 22:06:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8603533312. Throughput: 0: 43158.3. Samples: 8603607160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 22:06:09,429][15401] Updated weights for policy 0, policy_version 525120 (0.0027) [2024-06-23 22:06:13,032][15401] Updated weights for policy 0, policy_version 525130 (0.0040) [2024-06-23 22:06:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42872.2, 300 sec: 42765.0). Total num frames: 8603729920. Throughput: 0: 43042.2. Samples: 8603865160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 22:06:17,384][15401] Updated weights for policy 0, policy_version 525140 (0.0038) [2024-06-23 22:06:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8603942912. Throughput: 0: 42963.1. Samples: 8604116040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 22:06:20,538][15401] Updated weights for policy 0, policy_version 525150 (0.0041) [2024-06-23 22:06:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 8604188672. Throughput: 0: 42966.7. Samples: 8604247780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 22:06:25,024][15401] Updated weights for policy 0, policy_version 525160 (0.0038) [2024-06-23 22:06:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8604368896. Throughput: 0: 43003.5. Samples: 8604504720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:28,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 22:06:28,647][15401] Updated weights for policy 0, policy_version 525170 (0.0028) [2024-06-23 22:06:32,510][15401] Updated weights for policy 0, policy_version 525180 (0.0033) [2024-06-23 22:06:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8604581888. Throughput: 0: 42758.1. Samples: 8604755780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-23 22:06:36,270][15401] Updated weights for policy 0, policy_version 525190 (0.0036) [2024-06-23 22:06:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8604811264. Throughput: 0: 42869.4. Samples: 8604889420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 22:06:40,103][15401] Updated weights for policy 0, policy_version 525200 (0.0034) [2024-06-23 22:06:43,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 8605007872. Throughput: 0: 43009.9. Samples: 8605148020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-23 22:06:43,397][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 22:06:43,790][15401] Updated weights for policy 0, policy_version 525210 (0.0038) [2024-06-23 22:06:47,800][15401] Updated weights for policy 0, policy_version 525220 (0.0036) [2024-06-23 22:06:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8605237248. Throughput: 0: 42914.3. Samples: 8605401160. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:06:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-23 22:06:51,346][15401] Updated weights for policy 0, policy_version 525230 (0.0029) [2024-06-23 22:06:53,390][15132] Fps is (10 sec: 45904.1, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 8605466624. Throughput: 0: 42780.2. Samples: 8605532280. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:06:53,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 22:06:55,237][15401] Updated weights for policy 0, policy_version 525240 (0.0048) [2024-06-23 22:06:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8605663232. Throughput: 0: 42945.8. Samples: 8605797720. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:06:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 22:06:58,976][15401] Updated weights for policy 0, policy_version 525250 (0.0023) [2024-06-23 22:06:59,387][15349] Signal inference workers to stop experience collection... (127500 times) [2024-06-23 22:06:59,436][15401] InferenceWorker_p0-w0: stopping experience collection (127500 times) [2024-06-23 22:06:59,438][15349] Signal inference workers to resume experience collection... (127500 times) [2024-06-23 22:06:59,449][15401] InferenceWorker_p0-w0: resuming experience collection (127500 times) [2024-06-23 22:07:02,832][15401] Updated weights for policy 0, policy_version 525260 (0.0031) [2024-06-23 22:07:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8605876224. Throughput: 0: 43090.1. Samples: 8606055100. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 22:07:06,359][15401] Updated weights for policy 0, policy_version 525270 (0.0030) [2024-06-23 22:07:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8606121984. Throughput: 0: 42994.7. Samples: 8606182540. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 22:07:10,321][15401] Updated weights for policy 0, policy_version 525280 (0.0035) [2024-06-23 22:07:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8606318592. Throughput: 0: 43183.0. Samples: 8606447960. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 22:07:13,772][15401] Updated weights for policy 0, policy_version 525290 (0.0039) [2024-06-23 22:07:17,799][15401] Updated weights for policy 0, policy_version 525300 (0.0034) [2024-06-23 22:07:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8606515200. Throughput: 0: 43268.0. Samples: 8606702840. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-23 22:07:21,546][15401] Updated weights for policy 0, policy_version 525310 (0.0045) [2024-06-23 22:07:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8606777344. Throughput: 0: 43161.3. Samples: 8606831680. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 22:07:25,544][15401] Updated weights for policy 0, policy_version 525320 (0.0036) [2024-06-23 22:07:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8606941184. Throughput: 0: 43198.1. Samples: 8607091760. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:28,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 22:07:29,078][15401] Updated weights for policy 0, policy_version 525330 (0.0033) [2024-06-23 22:07:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8607154176. Throughput: 0: 43342.6. Samples: 8607351580. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:33,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 22:07:33,477][15401] Updated weights for policy 0, policy_version 525340 (0.0026) [2024-06-23 22:07:36,563][15401] Updated weights for policy 0, policy_version 525350 (0.0030) [2024-06-23 22:07:38,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8607416320. Throughput: 0: 43237.9. Samples: 8607477980. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 22:07:40,934][15401] Updated weights for policy 0, policy_version 525360 (0.0036) [2024-06-23 22:07:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 8607580160. Throughput: 0: 43044.8. Samples: 8607734740. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-23 22:07:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525366_8607596544.pth... [2024-06-23 22:07:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000524738_8597307392.pth [2024-06-23 22:07:44,305][15401] Updated weights for policy 0, policy_version 525370 (0.0039) [2024-06-23 22:07:48,315][15401] Updated weights for policy 0, policy_version 525380 (0.0034) [2024-06-23 22:07:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8607825920. Throughput: 0: 43008.1. Samples: 8607990460. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 22:07:52,020][15401] Updated weights for policy 0, policy_version 525390 (0.0029) [2024-06-23 22:07:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8608055296. Throughput: 0: 43088.7. Samples: 8608121540. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 22:07:56,297][15401] Updated weights for policy 0, policy_version 525400 (0.0036) [2024-06-23 22:07:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8608235520. Throughput: 0: 42969.4. Samples: 8608381580. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:07:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 22:07:59,409][15401] Updated weights for policy 0, policy_version 525410 (0.0036) [2024-06-23 22:08:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42821.2). Total num frames: 8608464896. Throughput: 0: 42891.1. Samples: 8608632940. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:08:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 22:08:03,636][15401] Updated weights for policy 0, policy_version 525420 (0.0042) [2024-06-23 22:08:05,481][15349] Signal inference workers to stop experience collection... (127550 times) [2024-06-23 22:08:05,481][15349] Signal inference workers to resume experience collection... (127550 times) [2024-06-23 22:08:05,524][15401] InferenceWorker_p0-w0: stopping experience collection (127550 times) [2024-06-23 22:08:05,524][15401] InferenceWorker_p0-w0: resuming experience collection (127550 times) [2024-06-23 22:08:07,163][15401] Updated weights for policy 0, policy_version 525430 (0.0044) [2024-06-23 22:08:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8608694272. Throughput: 0: 42943.9. Samples: 8608764160. Policy #0 lag: (min: 2.0, avg: 12.4, max: 23.0) [2024-06-23 22:08:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 22:08:11,272][15401] Updated weights for policy 0, policy_version 525440 (0.0037) [2024-06-23 22:08:13,391][15132] Fps is (10 sec: 40955.7, 60 sec: 42597.7, 300 sec: 42820.7). Total num frames: 8608874496. Throughput: 0: 43037.7. Samples: 8609028400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:13,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 22:08:14,675][15401] Updated weights for policy 0, policy_version 525450 (0.0028) [2024-06-23 22:08:18,395][15132] Fps is (10 sec: 40938.9, 60 sec: 43140.8, 300 sec: 42875.7). Total num frames: 8609103872. Throughput: 0: 42797.2. Samples: 8609277680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:18,395][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 22:08:19,319][15401] Updated weights for policy 0, policy_version 525460 (0.0030) [2024-06-23 22:08:22,462][15401] Updated weights for policy 0, policy_version 525470 (0.0040) [2024-06-23 22:08:23,390][15132] Fps is (10 sec: 47518.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8609349632. Throughput: 0: 42937.3. Samples: 8609410160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 22:08:26,885][15401] Updated weights for policy 0, policy_version 525480 (0.0028) [2024-06-23 22:08:28,389][15132] Fps is (10 sec: 40981.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8609513472. Throughput: 0: 43042.4. Samples: 8609671640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 22:08:29,858][15401] Updated weights for policy 0, policy_version 525490 (0.0020) [2024-06-23 22:08:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8609759232. Throughput: 0: 42785.4. Samples: 8609915800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 22:08:34,378][15401] Updated weights for policy 0, policy_version 525500 (0.0037) [2024-06-23 22:08:37,424][15401] Updated weights for policy 0, policy_version 525510 (0.0039) [2024-06-23 22:08:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8609988608. Throughput: 0: 42976.5. Samples: 8610055480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 22:08:41,913][15401] Updated weights for policy 0, policy_version 525520 (0.0032) [2024-06-23 22:08:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8610168832. Throughput: 0: 42900.1. Samples: 8610312080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:08:45,260][15401] Updated weights for policy 0, policy_version 525530 (0.0039) [2024-06-23 22:08:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 8610414592. Throughput: 0: 42900.2. Samples: 8610563440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 22:08:49,286][15401] Updated weights for policy 0, policy_version 525540 (0.0042) [2024-06-23 22:08:52,873][15401] Updated weights for policy 0, policy_version 525550 (0.0027) [2024-06-23 22:08:53,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8610643968. Throughput: 0: 42996.4. Samples: 8610699000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 22:08:56,994][15401] Updated weights for policy 0, policy_version 525560 (0.0033) [2024-06-23 22:08:58,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8610791424. Throughput: 0: 42753.6. Samples: 8610952260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:08:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 22:09:00,503][15401] Updated weights for policy 0, policy_version 525570 (0.0034) [2024-06-23 22:09:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8611053568. Throughput: 0: 42756.9. Samples: 8611201520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:09:04,547][15401] Updated weights for policy 0, policy_version 525580 (0.0041) [2024-06-23 22:09:08,292][15401] Updated weights for policy 0, policy_version 525590 (0.0038) [2024-06-23 22:09:08,391][15132] Fps is (10 sec: 47508.1, 60 sec: 42870.7, 300 sec: 42875.9). Total num frames: 8611266560. Throughput: 0: 42945.7. Samples: 8611342760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:08,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 22:09:12,100][15401] Updated weights for policy 0, policy_version 525600 (0.0034) [2024-06-23 22:09:13,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42597.5, 300 sec: 42709.1). Total num frames: 8611430400. Throughput: 0: 42621.1. Samples: 8611589700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:13,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 22:09:13,554][15349] Signal inference workers to stop experience collection... (127600 times) [2024-06-23 22:09:13,554][15349] Signal inference workers to resume experience collection... (127600 times) [2024-06-23 22:09:13,580][15401] InferenceWorker_p0-w0: stopping experience collection (127600 times) [2024-06-23 22:09:13,580][15401] InferenceWorker_p0-w0: resuming experience collection (127600 times) [2024-06-23 22:09:15,789][15401] Updated weights for policy 0, policy_version 525610 (0.0035) [2024-06-23 22:09:18,390][15132] Fps is (10 sec: 42602.9, 60 sec: 43148.3, 300 sec: 42931.6). Total num frames: 8611692544. Throughput: 0: 42828.3. Samples: 8611843080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 22:09:19,722][15401] Updated weights for policy 0, policy_version 525620 (0.0034) [2024-06-23 22:09:23,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42598.4, 300 sec: 42932.1). Total num frames: 8611905536. Throughput: 0: 42878.7. Samples: 8611985020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 22:09:23,925][15401] Updated weights for policy 0, policy_version 525630 (0.0036) [2024-06-23 22:09:27,537][15401] Updated weights for policy 0, policy_version 525640 (0.0032) [2024-06-23 22:09:28,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 8612085760. Throughput: 0: 42691.4. Samples: 8612233300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:28,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 22:09:31,493][15401] Updated weights for policy 0, policy_version 525650 (0.0030) [2024-06-23 22:09:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 8612331520. Throughput: 0: 42838.5. Samples: 8612491180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:09:34,916][15401] Updated weights for policy 0, policy_version 525660 (0.0028) [2024-06-23 22:09:38,390][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8612544512. Throughput: 0: 42830.3. Samples: 8612626360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-23 22:09:38,391][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 22:09:38,973][15401] Updated weights for policy 0, policy_version 525670 (0.0044) [2024-06-23 22:09:42,499][15401] Updated weights for policy 0, policy_version 525680 (0.0026) [2024-06-23 22:09:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8612741120. Throughput: 0: 42858.7. Samples: 8612880900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:09:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 22:09:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525680_8612741120.pth... [2024-06-23 22:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525052_8602451968.pth [2024-06-23 22:09:46,427][15401] Updated weights for policy 0, policy_version 525690 (0.0041) [2024-06-23 22:09:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8612970496. Throughput: 0: 43112.6. Samples: 8613141580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:09:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 22:09:50,031][15401] Updated weights for policy 0, policy_version 525700 (0.0034) [2024-06-23 22:09:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42876.5). Total num frames: 8613183488. Throughput: 0: 42940.7. Samples: 8613275040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:09:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-23 22:09:54,207][15401] Updated weights for policy 0, policy_version 525710 (0.0027) [2024-06-23 22:09:57,497][15401] Updated weights for policy 0, policy_version 525720 (0.0039) [2024-06-23 22:09:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8613396480. Throughput: 0: 42906.4. Samples: 8613520380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:09:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-23 22:10:01,948][15401] Updated weights for policy 0, policy_version 525730 (0.0031) [2024-06-23 22:10:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8613609472. Throughput: 0: 42941.4. Samples: 8613775440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 22:10:05,222][15401] Updated weights for policy 0, policy_version 525740 (0.0029) [2024-06-23 22:10:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42599.2, 300 sec: 42931.8). Total num frames: 8613822464. Throughput: 0: 42691.2. Samples: 8613906120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 22:10:09,621][15401] Updated weights for policy 0, policy_version 525750 (0.0031) [2024-06-23 22:10:13,045][15401] Updated weights for policy 0, policy_version 525760 (0.0033) [2024-06-23 22:10:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43692.4, 300 sec: 42931.6). Total num frames: 8614051840. Throughput: 0: 42956.6. Samples: 8614166240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-23 22:10:17,214][15401] Updated weights for policy 0, policy_version 525770 (0.0023) [2024-06-23 22:10:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 8614248448. Throughput: 0: 42978.3. Samples: 8614425200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 22:10:21,126][15401] Updated weights for policy 0, policy_version 525780 (0.0052) [2024-06-23 22:10:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 8614477824. Throughput: 0: 42752.8. Samples: 8614550240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 22:10:24,695][15401] Updated weights for policy 0, policy_version 525790 (0.0033) [2024-06-23 22:10:25,883][15349] Signal inference workers to stop experience collection... (127650 times) [2024-06-23 22:10:25,885][15349] Signal inference workers to resume experience collection... (127650 times) [2024-06-23 22:10:25,905][15401] InferenceWorker_p0-w0: stopping experience collection (127650 times) [2024-06-23 22:10:25,905][15401] InferenceWorker_p0-w0: resuming experience collection (127650 times) [2024-06-23 22:10:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43417.7, 300 sec: 42931.3). Total num frames: 8614690816. Throughput: 0: 42852.4. Samples: 8614809360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:28,392][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 22:10:28,805][15401] Updated weights for policy 0, policy_version 525800 (0.0048) [2024-06-23 22:10:32,316][15401] Updated weights for policy 0, policy_version 525810 (0.0041) [2024-06-23 22:10:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8614903808. Throughput: 0: 42699.0. Samples: 8615063040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 22:10:36,691][15401] Updated weights for policy 0, policy_version 525820 (0.0041) [2024-06-23 22:10:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8615133184. Throughput: 0: 42659.5. Samples: 8615194720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 22:10:39,993][15401] Updated weights for policy 0, policy_version 525830 (0.0036) [2024-06-23 22:10:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8615313408. Throughput: 0: 43004.9. Samples: 8615455600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-23 22:10:44,136][15401] Updated weights for policy 0, policy_version 525840 (0.0042) [2024-06-23 22:10:47,983][15401] Updated weights for policy 0, policy_version 525850 (0.0026) [2024-06-23 22:10:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 8615526400. Throughput: 0: 42879.3. Samples: 8615705020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 22:10:51,593][15401] Updated weights for policy 0, policy_version 525860 (0.0032) [2024-06-23 22:10:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8615772160. Throughput: 0: 42870.1. Samples: 8615835280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:53,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:10:55,776][15401] Updated weights for policy 0, policy_version 525870 (0.0027) [2024-06-23 22:10:58,396][15132] Fps is (10 sec: 42572.0, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 8615952384. Throughput: 0: 42897.1. Samples: 8616096880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:10:58,396][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 22:10:59,121][15401] Updated weights for policy 0, policy_version 525880 (0.0048) [2024-06-23 22:11:03,357][15401] Updated weights for policy 0, policy_version 525890 (0.0040) [2024-06-23 22:11:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8616181760. Throughput: 0: 42853.7. Samples: 8616353620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:11:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 22:11:07,304][15401] Updated weights for policy 0, policy_version 525900 (0.0032) [2024-06-23 22:11:08,389][15132] Fps is (10 sec: 45904.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8616411136. Throughput: 0: 42792.6. Samples: 8616475900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-23 22:11:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 22:11:11,021][15401] Updated weights for policy 0, policy_version 525910 (0.0044) [2024-06-23 22:11:13,391][15132] Fps is (10 sec: 42593.2, 60 sec: 42597.6, 300 sec: 42931.5). Total num frames: 8616607744. Throughput: 0: 42775.8. Samples: 8616734220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:13,391][15132] Avg episode reward: [(0, '0.559')] [2024-06-23 22:11:14,865][15401] Updated weights for policy 0, policy_version 525920 (0.0050) [2024-06-23 22:11:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8616804352. Throughput: 0: 42783.7. Samples: 8616988300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:18,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 22:11:18,670][15401] Updated weights for policy 0, policy_version 525930 (0.0037) [2024-06-23 22:11:22,610][15401] Updated weights for policy 0, policy_version 525940 (0.0030) [2024-06-23 22:11:23,389][15132] Fps is (10 sec: 44242.1, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8617050112. Throughput: 0: 42656.9. Samples: 8617114280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:23,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-23 22:11:26,134][15401] Updated weights for policy 0, policy_version 525950 (0.0032) [2024-06-23 22:11:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 8617246720. Throughput: 0: 42665.3. Samples: 8617375540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 22:11:30,170][15401] Updated weights for policy 0, policy_version 525960 (0.0037) [2024-06-23 22:11:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8617459712. Throughput: 0: 42805.9. Samples: 8617631280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 22:11:33,759][15401] Updated weights for policy 0, policy_version 525970 (0.0029) [2024-06-23 22:11:38,188][15401] Updated weights for policy 0, policy_version 525980 (0.0033) [2024-06-23 22:11:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42932.6). Total num frames: 8617672704. Throughput: 0: 42766.3. Samples: 8617759760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:38,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 22:11:38,531][15349] Signal inference workers to stop experience collection... (127700 times) [2024-06-23 22:11:38,568][15401] InferenceWorker_p0-w0: stopping experience collection (127700 times) [2024-06-23 22:11:38,644][15349] Signal inference workers to resume experience collection... (127700 times) [2024-06-23 22:11:38,644][15401] InferenceWorker_p0-w0: resuming experience collection (127700 times) [2024-06-23 22:11:41,462][15401] Updated weights for policy 0, policy_version 525990 (0.0034) [2024-06-23 22:11:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8617902080. Throughput: 0: 42673.6. Samples: 8618016920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 22:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525995_8617902080.pth... [2024-06-23 22:11:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525366_8607596544.pth [2024-06-23 22:11:45,694][15401] Updated weights for policy 0, policy_version 526000 (0.0032) [2024-06-23 22:11:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8618115072. Throughput: 0: 42503.1. Samples: 8618266260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 22:11:49,056][15401] Updated weights for policy 0, policy_version 526010 (0.0030) [2024-06-23 22:11:53,315][15401] Updated weights for policy 0, policy_version 526020 (0.0039) [2024-06-23 22:11:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 8618311680. Throughput: 0: 42780.0. Samples: 8618401000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 22:11:56,614][15401] Updated weights for policy 0, policy_version 526030 (0.0035) [2024-06-23 22:11:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42875.9, 300 sec: 42876.1). Total num frames: 8618524672. Throughput: 0: 42714.4. Samples: 8618656320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:11:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 22:12:00,897][15401] Updated weights for policy 0, policy_version 526040 (0.0039) [2024-06-23 22:12:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8618770432. Throughput: 0: 42708.0. Samples: 8618910160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 22:12:04,383][15401] Updated weights for policy 0, policy_version 526050 (0.0038) [2024-06-23 22:12:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8618950656. Throughput: 0: 42939.6. Samples: 8619046560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 22:12:08,519][15401] Updated weights for policy 0, policy_version 526060 (0.0029) [2024-06-23 22:12:11,903][15401] Updated weights for policy 0, policy_version 526070 (0.0025) [2024-06-23 22:12:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42872.4, 300 sec: 42931.7). Total num frames: 8619180032. Throughput: 0: 42775.2. Samples: 8619300420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 22:12:16,179][15401] Updated weights for policy 0, policy_version 526080 (0.0035) [2024-06-23 22:12:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8619393024. Throughput: 0: 42683.1. Samples: 8619552020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 22:12:19,513][15401] Updated weights for policy 0, policy_version 526090 (0.0032) [2024-06-23 22:12:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42820.9). Total num frames: 8619573248. Throughput: 0: 42675.0. Samples: 8619680140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 22:12:23,876][15401] Updated weights for policy 0, policy_version 526100 (0.0026) [2024-06-23 22:12:27,367][15401] Updated weights for policy 0, policy_version 526110 (0.0035) [2024-06-23 22:12:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8619835392. Throughput: 0: 42680.5. Samples: 8619937540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 22:12:31,507][15401] Updated weights for policy 0, policy_version 526120 (0.0030) [2024-06-23 22:12:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8620048384. Throughput: 0: 42836.0. Samples: 8620193880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 22:12:35,091][15401] Updated weights for policy 0, policy_version 526130 (0.0036) [2024-06-23 22:12:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8620228608. Throughput: 0: 42737.3. Samples: 8620324180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:12:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 22:12:39,305][15401] Updated weights for policy 0, policy_version 526140 (0.0042) [2024-06-23 22:12:42,654][15401] Updated weights for policy 0, policy_version 526150 (0.0046) [2024-06-23 22:12:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8620457984. Throughput: 0: 42809.9. Samples: 8620582760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:12:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-23 22:12:46,889][15401] Updated weights for policy 0, policy_version 526160 (0.0039) [2024-06-23 22:12:48,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 8620703744. Throughput: 0: 42785.3. Samples: 8620835600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:12:48,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 22:12:50,328][15401] Updated weights for policy 0, policy_version 526170 (0.0029) [2024-06-23 22:12:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8620867584. Throughput: 0: 42684.3. Samples: 8620967360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:12:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 22:12:54,461][15401] Updated weights for policy 0, policy_version 526180 (0.0038) [2024-06-23 22:12:57,168][15349] Signal inference workers to stop experience collection... (127750 times) [2024-06-23 22:12:57,169][15349] Signal inference workers to resume experience collection... (127750 times) [2024-06-23 22:12:57,211][15401] InferenceWorker_p0-w0: stopping experience collection (127750 times) [2024-06-23 22:12:57,211][15401] InferenceWorker_p0-w0: resuming experience collection (127750 times) [2024-06-23 22:12:57,838][15401] Updated weights for policy 0, policy_version 526190 (0.0038) [2024-06-23 22:12:58,389][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8621096960. Throughput: 0: 42903.9. Samples: 8621231100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:12:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 22:13:02,067][15401] Updated weights for policy 0, policy_version 526200 (0.0049) [2024-06-23 22:13:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8621326336. Throughput: 0: 42952.1. Samples: 8621484860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-23 22:13:05,458][15401] Updated weights for policy 0, policy_version 526210 (0.0034) [2024-06-23 22:13:08,393][15132] Fps is (10 sec: 40945.9, 60 sec: 42595.9, 300 sec: 42820.2). Total num frames: 8621506560. Throughput: 0: 42920.8. Samples: 8621611720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:08,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 22:13:09,599][15401] Updated weights for policy 0, policy_version 526220 (0.0027) [2024-06-23 22:13:13,112][15401] Updated weights for policy 0, policy_version 526230 (0.0031) [2024-06-23 22:13:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.8). Total num frames: 8621752320. Throughput: 0: 42977.7. Samples: 8621871540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 22:13:17,221][15401] Updated weights for policy 0, policy_version 526240 (0.0028) [2024-06-23 22:13:18,390][15132] Fps is (10 sec: 45890.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8621965312. Throughput: 0: 43048.5. Samples: 8622131060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:18,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-23 22:13:20,846][15401] Updated weights for policy 0, policy_version 526250 (0.0035) [2024-06-23 22:13:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8622161920. Throughput: 0: 42863.5. Samples: 8622253040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 22:13:24,691][15401] Updated weights for policy 0, policy_version 526260 (0.0028) [2024-06-23 22:13:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 8622391296. Throughput: 0: 42954.4. Samples: 8622515720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 22:13:28,410][15401] Updated weights for policy 0, policy_version 526270 (0.0036) [2024-06-23 22:13:32,416][15401] Updated weights for policy 0, policy_version 526280 (0.0037) [2024-06-23 22:13:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8622604288. Throughput: 0: 43062.3. Samples: 8622773300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 22:13:36,026][15401] Updated weights for policy 0, policy_version 526290 (0.0047) [2024-06-23 22:13:38,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8622817280. Throughput: 0: 42893.0. Samples: 8622897540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 22:13:40,066][15401] Updated weights for policy 0, policy_version 526300 (0.0045) [2024-06-23 22:13:43,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 8623030272. Throughput: 0: 42813.6. Samples: 8623157760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:43,391][15132] Avg episode reward: [(0, '0.750')] [2024-06-23 22:13:43,533][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526310_8623063040.pth... [2024-06-23 22:13:43,553][15401] Updated weights for policy 0, policy_version 526310 (0.0036) [2024-06-23 22:13:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525680_8612741120.pth [2024-06-23 22:13:47,874][15401] Updated weights for policy 0, policy_version 526320 (0.0041) [2024-06-23 22:13:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8623259648. Throughput: 0: 42829.0. Samples: 8623412160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:48,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 22:13:51,441][15401] Updated weights for policy 0, policy_version 526330 (0.0028) [2024-06-23 22:13:53,390][15132] Fps is (10 sec: 44241.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8623472640. Throughput: 0: 42794.3. Samples: 8623537320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 22:13:55,506][15401] Updated weights for policy 0, policy_version 526340 (0.0033) [2024-06-23 22:13:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8623669248. Throughput: 0: 42730.4. Samples: 8623794400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:13:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 22:13:59,075][15401] Updated weights for policy 0, policy_version 526350 (0.0032) [2024-06-23 22:14:03,123][15401] Updated weights for policy 0, policy_version 526360 (0.0038) [2024-06-23 22:14:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 8623898624. Throughput: 0: 42790.3. Samples: 8624056620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:14:03,394][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 22:14:06,616][15401] Updated weights for policy 0, policy_version 526370 (0.0033) [2024-06-23 22:14:07,764][15349] Signal inference workers to stop experience collection... (127800 times) [2024-06-23 22:14:07,765][15349] Signal inference workers to resume experience collection... (127800 times) [2024-06-23 22:14:07,813][15401] InferenceWorker_p0-w0: stopping experience collection (127800 times) [2024-06-23 22:14:07,813][15401] InferenceWorker_p0-w0: resuming experience collection (127800 times) [2024-06-23 22:14:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43420.0, 300 sec: 42987.5). Total num frames: 8624111616. Throughput: 0: 42925.3. Samples: 8624184680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-23 22:14:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 22:14:10,621][15401] Updated weights for policy 0, policy_version 526380 (0.0043) [2024-06-23 22:14:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8624324608. Throughput: 0: 42762.5. Samples: 8624440020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 22:14:14,078][15401] Updated weights for policy 0, policy_version 526390 (0.0025) [2024-06-23 22:14:18,149][15401] Updated weights for policy 0, policy_version 526400 (0.0048) [2024-06-23 22:14:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8624537600. Throughput: 0: 42885.7. Samples: 8624703160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 22:14:22,367][15401] Updated weights for policy 0, policy_version 526410 (0.0045) [2024-06-23 22:14:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 8624734208. Throughput: 0: 42946.3. Samples: 8624830120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 22:14:25,754][15401] Updated weights for policy 0, policy_version 526420 (0.0037) [2024-06-23 22:14:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8624963584. Throughput: 0: 42761.0. Samples: 8625081960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 22:14:30,083][15401] Updated weights for policy 0, policy_version 526430 (0.0027) [2024-06-23 22:14:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8625176576. Throughput: 0: 42763.1. Samples: 8625336500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 22:14:33,608][15401] Updated weights for policy 0, policy_version 526440 (0.0037) [2024-06-23 22:14:37,691][15401] Updated weights for policy 0, policy_version 526450 (0.0033) [2024-06-23 22:14:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8625373184. Throughput: 0: 42885.3. Samples: 8625467160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 22:14:40,947][15401] Updated weights for policy 0, policy_version 526460 (0.0027) [2024-06-23 22:14:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43145.3, 300 sec: 42876.1). Total num frames: 8625618944. Throughput: 0: 42867.5. Samples: 8625723440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 22:14:45,288][15401] Updated weights for policy 0, policy_version 526470 (0.0032) [2024-06-23 22:14:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8625815552. Throughput: 0: 42795.2. Samples: 8625982400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 22:14:48,543][15401] Updated weights for policy 0, policy_version 526480 (0.0043) [2024-06-23 22:14:52,795][15401] Updated weights for policy 0, policy_version 526490 (0.0031) [2024-06-23 22:14:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8626028544. Throughput: 0: 42761.3. Samples: 8626108940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 22:14:56,292][15401] Updated weights for policy 0, policy_version 526500 (0.0038) [2024-06-23 22:14:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8626241536. Throughput: 0: 42720.3. Samples: 8626362440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:14:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 22:15:00,687][15401] Updated weights for policy 0, policy_version 526510 (0.0044) [2024-06-23 22:15:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8626454528. Throughput: 0: 42552.4. Samples: 8626618020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 22:15:03,925][15401] Updated weights for policy 0, policy_version 526520 (0.0046) [2024-06-23 22:15:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8626667520. Throughput: 0: 42629.6. Samples: 8626748460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 22:15:08,392][15401] Updated weights for policy 0, policy_version 526530 (0.0027) [2024-06-23 22:15:11,673][15401] Updated weights for policy 0, policy_version 526540 (0.0053) [2024-06-23 22:15:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 8626896896. Throughput: 0: 42652.5. Samples: 8627001420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:13,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 22:15:16,136][15401] Updated weights for policy 0, policy_version 526550 (0.0028) [2024-06-23 22:15:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8627093504. Throughput: 0: 42763.0. Samples: 8627260840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 22:15:19,553][15401] Updated weights for policy 0, policy_version 526560 (0.0028) [2024-06-23 22:15:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 8627290112. Throughput: 0: 42559.8. Samples: 8627382340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 22:15:23,793][15401] Updated weights for policy 0, policy_version 526570 (0.0028) [2024-06-23 22:15:27,133][15401] Updated weights for policy 0, policy_version 526580 (0.0045) [2024-06-23 22:15:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8627535872. Throughput: 0: 42625.4. Samples: 8627641580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 22:15:31,657][15401] Updated weights for policy 0, policy_version 526590 (0.0031) [2024-06-23 22:15:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8627716096. Throughput: 0: 42631.5. Samples: 8627900820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 22:15:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 22:15:34,772][15401] Updated weights for policy 0, policy_version 526600 (0.0036) [2024-06-23 22:15:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 8627945472. Throughput: 0: 42569.0. Samples: 8628024540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:15:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 22:15:39,225][15401] Updated weights for policy 0, policy_version 526610 (0.0041) [2024-06-23 22:15:42,380][15401] Updated weights for policy 0, policy_version 526620 (0.0040) [2024-06-23 22:15:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8628158464. Throughput: 0: 42599.6. Samples: 8628279420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:15:43,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 22:15:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526621_8628158464.pth... [2024-06-23 22:15:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000525995_8617902080.pth [2024-06-23 22:15:46,859][15401] Updated weights for policy 0, policy_version 526630 (0.0027) [2024-06-23 22:15:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8628355072. Throughput: 0: 42678.5. Samples: 8628538560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:15:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 22:15:50,083][15401] Updated weights for policy 0, policy_version 526640 (0.0034) [2024-06-23 22:15:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42877.0). Total num frames: 8628600832. Throughput: 0: 42529.0. Samples: 8628662260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:15:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 22:15:54,569][15401] Updated weights for policy 0, policy_version 526650 (0.0044) [2024-06-23 22:15:56,983][15349] Signal inference workers to stop experience collection... (127850 times) [2024-06-23 22:15:57,003][15401] InferenceWorker_p0-w0: stopping experience collection (127850 times) [2024-06-23 22:15:57,105][15349] Signal inference workers to resume experience collection... (127850 times) [2024-06-23 22:15:57,106][15401] InferenceWorker_p0-w0: resuming experience collection (127850 times) [2024-06-23 22:15:57,785][15401] Updated weights for policy 0, policy_version 526660 (0.0034) [2024-06-23 22:15:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8628797440. Throughput: 0: 42416.4. Samples: 8628910060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:15:58,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 22:16:02,373][15401] Updated weights for policy 0, policy_version 526670 (0.0044) [2024-06-23 22:16:03,393][15132] Fps is (10 sec: 39308.9, 60 sec: 42323.1, 300 sec: 42653.5). Total num frames: 8628994048. Throughput: 0: 42383.2. Samples: 8629168220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:03,393][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 22:16:05,453][15401] Updated weights for policy 0, policy_version 526680 (0.0037) [2024-06-23 22:16:08,394][15132] Fps is (10 sec: 40942.5, 60 sec: 42322.3, 300 sec: 42709.0). Total num frames: 8629207040. Throughput: 0: 42369.6. Samples: 8629289160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:08,394][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 22:16:09,957][15401] Updated weights for policy 0, policy_version 526690 (0.0028) [2024-06-23 22:16:13,389][15132] Fps is (10 sec: 44251.0, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 8629436416. Throughput: 0: 42456.8. Samples: 8629552140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-23 22:16:13,604][15401] Updated weights for policy 0, policy_version 526700 (0.0037) [2024-06-23 22:16:17,639][15401] Updated weights for policy 0, policy_version 526710 (0.0032) [2024-06-23 22:16:18,389][15132] Fps is (10 sec: 42617.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8629633024. Throughput: 0: 42307.6. Samples: 8629804660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 22:16:21,230][15401] Updated weights for policy 0, policy_version 526720 (0.0031) [2024-06-23 22:16:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8629846016. Throughput: 0: 42341.8. Samples: 8629929920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 22:16:25,117][15401] Updated weights for policy 0, policy_version 526730 (0.0027) [2024-06-23 22:16:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8630075392. Throughput: 0: 42446.7. Samples: 8630189520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 22:16:28,654][15401] Updated weights for policy 0, policy_version 526740 (0.0029) [2024-06-23 22:16:32,563][15401] Updated weights for policy 0, policy_version 526750 (0.0028) [2024-06-23 22:16:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8630288384. Throughput: 0: 42513.4. Samples: 8630451660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-23 22:16:36,285][15401] Updated weights for policy 0, policy_version 526760 (0.0045) [2024-06-23 22:16:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 8630484992. Throughput: 0: 42604.4. Samples: 8630579560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:38,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 22:16:40,304][15401] Updated weights for policy 0, policy_version 526770 (0.0033) [2024-06-23 22:16:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8630714368. Throughput: 0: 42834.3. Samples: 8630837600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 22:16:44,302][15401] Updated weights for policy 0, policy_version 526780 (0.0043) [2024-06-23 22:16:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8630910976. Throughput: 0: 42754.6. Samples: 8631092040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 22:16:48,426][15401] Updated weights for policy 0, policy_version 526790 (0.0030) [2024-06-23 22:16:51,946][15401] Updated weights for policy 0, policy_version 526800 (0.0025) [2024-06-23 22:16:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 8631140352. Throughput: 0: 42834.3. Samples: 8631216620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:53,401][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 22:16:56,084][15401] Updated weights for policy 0, policy_version 526810 (0.0034) [2024-06-23 22:16:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8631353344. Throughput: 0: 42712.3. Samples: 8631474200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:16:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 22:16:59,534][15401] Updated weights for policy 0, policy_version 526820 (0.0031) [2024-06-23 22:17:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42600.6, 300 sec: 42709.4). Total num frames: 8631549952. Throughput: 0: 42717.6. Samples: 8631726960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-23 22:17:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 22:17:03,599][15401] Updated weights for policy 0, policy_version 526830 (0.0032) [2024-06-23 22:17:07,214][15401] Updated weights for policy 0, policy_version 526840 (0.0031) [2024-06-23 22:17:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42874.6, 300 sec: 42709.5). Total num frames: 8631779328. Throughput: 0: 42809.7. Samples: 8631856360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-23 22:17:11,133][15401] Updated weights for policy 0, policy_version 526850 (0.0039) [2024-06-23 22:17:12,135][15349] Signal inference workers to stop experience collection... (127900 times) [2024-06-23 22:17:12,135][15349] Signal inference workers to resume experience collection... (127900 times) [2024-06-23 22:17:12,178][15401] InferenceWorker_p0-w0: stopping experience collection (127900 times) [2024-06-23 22:17:12,178][15401] InferenceWorker_p0-w0: resuming experience collection (127900 times) [2024-06-23 22:17:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8631992320. Throughput: 0: 42880.4. Samples: 8632119140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-23 22:17:14,906][15401] Updated weights for policy 0, policy_version 526860 (0.0031) [2024-06-23 22:17:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8632205312. Throughput: 0: 42568.0. Samples: 8632367220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:18,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 22:17:18,802][15401] Updated weights for policy 0, policy_version 526870 (0.0038) [2024-06-23 22:17:22,924][15401] Updated weights for policy 0, policy_version 526880 (0.0034) [2024-06-23 22:17:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8632418304. Throughput: 0: 42597.4. Samples: 8632496340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 22:17:26,293][15401] Updated weights for policy 0, policy_version 526890 (0.0024) [2024-06-23 22:17:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8632647680. Throughput: 0: 42585.7. Samples: 8632753960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 22:17:30,459][15401] Updated weights for policy 0, policy_version 526900 (0.0056) [2024-06-23 22:17:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8632827904. Throughput: 0: 42610.7. Samples: 8633009520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 22:17:34,031][15401] Updated weights for policy 0, policy_version 526910 (0.0035) [2024-06-23 22:17:37,919][15401] Updated weights for policy 0, policy_version 526920 (0.0037) [2024-06-23 22:17:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42709.4). Total num frames: 8633057280. Throughput: 0: 42715.5. Samples: 8633138720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 22:17:42,046][15401] Updated weights for policy 0, policy_version 526930 (0.0033) [2024-06-23 22:17:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 8633270272. Throughput: 0: 42756.9. Samples: 8633398260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 22:17:43,568][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526934_8633286656.pth... [2024-06-23 22:17:43,638][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526310_8623063040.pth [2024-06-23 22:17:45,465][15401] Updated weights for policy 0, policy_version 526940 (0.0042) [2024-06-23 22:17:48,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8633483264. Throughput: 0: 42897.8. Samples: 8633657460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:48,392][15132] Avg episode reward: [(0, '0.798')] [2024-06-23 22:17:49,554][15401] Updated weights for policy 0, policy_version 526950 (0.0034) [2024-06-23 22:17:53,219][15401] Updated weights for policy 0, policy_version 526960 (0.0034) [2024-06-23 22:17:53,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42868.6, 300 sec: 42764.1). Total num frames: 8633712640. Throughput: 0: 42885.0. Samples: 8633786460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:53,397][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 22:17:56,929][15401] Updated weights for policy 0, policy_version 526970 (0.0040) [2024-06-23 22:17:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8633909248. Throughput: 0: 42696.1. Samples: 8634040460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:17:58,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 22:18:01,062][15401] Updated weights for policy 0, policy_version 526980 (0.0025) [2024-06-23 22:18:03,390][15132] Fps is (10 sec: 42625.2, 60 sec: 43144.5, 300 sec: 42821.0). Total num frames: 8634138624. Throughput: 0: 42934.6. Samples: 8634299280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 22:18:04,413][15401] Updated weights for policy 0, policy_version 526990 (0.0032) [2024-06-23 22:18:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8634351616. Throughput: 0: 42962.7. Samples: 8634429660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:08,396][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 22:18:08,571][15401] Updated weights for policy 0, policy_version 527000 (0.0031) [2024-06-23 22:18:12,211][15401] Updated weights for policy 0, policy_version 527010 (0.0025) [2024-06-23 22:18:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8634548224. Throughput: 0: 42950.8. Samples: 8634686740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 22:18:16,248][15401] Updated weights for policy 0, policy_version 527020 (0.0026) [2024-06-23 22:18:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8634777600. Throughput: 0: 42931.4. Samples: 8634941440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 22:18:19,771][15401] Updated weights for policy 0, policy_version 527030 (0.0036) [2024-06-23 22:18:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8634974208. Throughput: 0: 42861.0. Samples: 8635067460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 22:18:24,367][15401] Updated weights for policy 0, policy_version 527040 (0.0034) [2024-06-23 22:18:27,880][15401] Updated weights for policy 0, policy_version 527050 (0.0026) [2024-06-23 22:18:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8635187200. Throughput: 0: 42681.8. Samples: 8635318940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:28,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 22:18:31,930][15349] Signal inference workers to stop experience collection... (127950 times) [2024-06-23 22:18:31,930][15349] Signal inference workers to resume experience collection... (127950 times) [2024-06-23 22:18:31,942][15401] Updated weights for policy 0, policy_version 527060 (0.0037) [2024-06-23 22:18:31,975][15401] InferenceWorker_p0-w0: stopping experience collection (127950 times) [2024-06-23 22:18:31,975][15401] InferenceWorker_p0-w0: resuming experience collection (127950 times) [2024-06-23 22:18:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 8635416576. Throughput: 0: 42675.9. Samples: 8635577780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 22:18:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 22:18:35,925][15401] Updated weights for policy 0, policy_version 527070 (0.0041) [2024-06-23 22:18:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.6). Total num frames: 8635629568. Throughput: 0: 42660.3. Samples: 8635705900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:18:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 22:18:39,486][15401] Updated weights for policy 0, policy_version 527080 (0.0028) [2024-06-23 22:18:43,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8635826176. Throughput: 0: 42740.5. Samples: 8635963780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:18:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 22:18:43,548][15401] Updated weights for policy 0, policy_version 527090 (0.0031) [2024-06-23 22:18:47,333][15401] Updated weights for policy 0, policy_version 527100 (0.0033) [2024-06-23 22:18:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 8636071936. Throughput: 0: 42582.8. Samples: 8636215500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:18:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 22:18:51,038][15401] Updated weights for policy 0, policy_version 527110 (0.0022) [2024-06-23 22:18:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 8636268544. Throughput: 0: 42648.9. Samples: 8636348860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:18:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 22:18:54,685][15401] Updated weights for policy 0, policy_version 527120 (0.0028) [2024-06-23 22:18:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8636481536. Throughput: 0: 42688.5. Samples: 8636607720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:18:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 22:18:58,499][15401] Updated weights for policy 0, policy_version 527130 (0.0030) [2024-06-23 22:19:02,129][15401] Updated weights for policy 0, policy_version 527140 (0.0036) [2024-06-23 22:19:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 8636694528. Throughput: 0: 42948.5. Samples: 8636874120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 22:19:05,943][15401] Updated weights for policy 0, policy_version 527150 (0.0037) [2024-06-23 22:19:08,393][15132] Fps is (10 sec: 44222.8, 60 sec: 42869.2, 300 sec: 42709.0). Total num frames: 8636923904. Throughput: 0: 42933.5. Samples: 8636999600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:08,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 22:19:10,489][15401] Updated weights for policy 0, policy_version 527160 (0.0028) [2024-06-23 22:19:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8637136896. Throughput: 0: 43004.5. Samples: 8637254140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:13,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 22:19:13,481][15401] Updated weights for policy 0, policy_version 527170 (0.0039) [2024-06-23 22:19:17,823][15401] Updated weights for policy 0, policy_version 527180 (0.0036) [2024-06-23 22:19:18,390][15132] Fps is (10 sec: 42611.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8637349888. Throughput: 0: 43120.2. Samples: 8637518180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 22:19:21,189][15401] Updated weights for policy 0, policy_version 527190 (0.0041) [2024-06-23 22:19:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8637579264. Throughput: 0: 42990.7. Samples: 8637640480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 22:19:25,424][15401] Updated weights for policy 0, policy_version 527200 (0.0026) [2024-06-23 22:19:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 8637775872. Throughput: 0: 42986.0. Samples: 8637898160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 22:19:28,704][15401] Updated weights for policy 0, policy_version 527210 (0.0052) [2024-06-23 22:19:32,834][15401] Updated weights for policy 0, policy_version 527220 (0.0032) [2024-06-23 22:19:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8637988864. Throughput: 0: 43119.0. Samples: 8638155860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 22:19:36,191][15401] Updated weights for policy 0, policy_version 527230 (0.0041) [2024-06-23 22:19:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8638218240. Throughput: 0: 42908.4. Samples: 8638279740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 22:19:40,442][15401] Updated weights for policy 0, policy_version 527240 (0.0025) [2024-06-23 22:19:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 8638431232. Throughput: 0: 42989.6. Samples: 8638542260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 22:19:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527248_8638431232.pth... [2024-06-23 22:19:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526621_8628158464.pth [2024-06-23 22:19:44,123][15401] Updated weights for policy 0, policy_version 527250 (0.0028) [2024-06-23 22:19:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8638611456. Throughput: 0: 42756.8. Samples: 8638798180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 22:19:48,766][15401] Updated weights for policy 0, policy_version 527260 (0.0033) [2024-06-23 22:19:49,352][15349] Signal inference workers to stop experience collection... (128000 times) [2024-06-23 22:19:49,381][15401] InferenceWorker_p0-w0: stopping experience collection (128000 times) [2024-06-23 22:19:49,407][15349] Signal inference workers to resume experience collection... (128000 times) [2024-06-23 22:19:49,408][15401] InferenceWorker_p0-w0: resuming experience collection (128000 times) [2024-06-23 22:19:51,931][15401] Updated weights for policy 0, policy_version 527270 (0.0036) [2024-06-23 22:19:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 8638873600. Throughput: 0: 42622.5. Samples: 8638917480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 22:19:56,491][15401] Updated weights for policy 0, policy_version 527280 (0.0040) [2024-06-23 22:19:58,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8639070208. Throughput: 0: 42884.4. Samples: 8639183940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:19:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 22:19:59,435][15401] Updated weights for policy 0, policy_version 527290 (0.0029) [2024-06-23 22:20:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8639250432. Throughput: 0: 42758.2. Samples: 8639442300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-23 22:20:03,394][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 22:20:04,051][15401] Updated weights for policy 0, policy_version 527300 (0.0035) [2024-06-23 22:20:07,292][15401] Updated weights for policy 0, policy_version 527310 (0.0044) [2024-06-23 22:20:08,390][15132] Fps is (10 sec: 44232.9, 60 sec: 43146.1, 300 sec: 42765.2). Total num frames: 8639512576. Throughput: 0: 42838.3. Samples: 8639568240. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:08,391][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 22:20:11,688][15401] Updated weights for policy 0, policy_version 527320 (0.0023) [2024-06-23 22:20:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8639709184. Throughput: 0: 42846.4. Samples: 8639826240. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-23 22:20:14,861][15401] Updated weights for policy 0, policy_version 527330 (0.0041) [2024-06-23 22:20:18,390][15132] Fps is (10 sec: 39324.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8639905792. Throughput: 0: 42945.8. Samples: 8640088420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 22:20:19,145][15401] Updated weights for policy 0, policy_version 527340 (0.0037) [2024-06-23 22:20:22,514][15401] Updated weights for policy 0, policy_version 527350 (0.0032) [2024-06-23 22:20:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8640135168. Throughput: 0: 42927.6. Samples: 8640211480. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 22:20:26,845][15401] Updated weights for policy 0, policy_version 527360 (0.0046) [2024-06-23 22:20:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8640348160. Throughput: 0: 42778.0. Samples: 8640467260. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 22:20:30,221][15401] Updated weights for policy 0, policy_version 527370 (0.0033) [2024-06-23 22:20:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8640544768. Throughput: 0: 42834.9. Samples: 8640725740. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 22:20:34,359][15401] Updated weights for policy 0, policy_version 527380 (0.0049) [2024-06-23 22:20:37,909][15401] Updated weights for policy 0, policy_version 527390 (0.0027) [2024-06-23 22:20:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8640757760. Throughput: 0: 42896.1. Samples: 8640847800. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 22:20:41,935][15401] Updated weights for policy 0, policy_version 527400 (0.0022) [2024-06-23 22:20:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8641003520. Throughput: 0: 42704.9. Samples: 8641105660. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-23 22:20:45,996][15401] Updated weights for policy 0, policy_version 527410 (0.0028) [2024-06-23 22:20:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 8641200128. Throughput: 0: 42646.8. Samples: 8641361400. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 22:20:49,502][15401] Updated weights for policy 0, policy_version 527420 (0.0035) [2024-06-23 22:20:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8641396736. Throughput: 0: 42601.8. Samples: 8641485280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 22:20:53,525][15401] Updated weights for policy 0, policy_version 527430 (0.0037) [2024-06-23 22:20:57,545][15401] Updated weights for policy 0, policy_version 527440 (0.0036) [2024-06-23 22:20:57,574][15349] Signal inference workers to stop experience collection... (128050 times) [2024-06-23 22:20:57,575][15349] Signal inference workers to resume experience collection... (128050 times) [2024-06-23 22:20:57,642][15401] InferenceWorker_p0-w0: stopping experience collection (128050 times) [2024-06-23 22:20:57,642][15401] InferenceWorker_p0-w0: resuming experience collection (128050 times) [2024-06-23 22:20:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 8641626112. Throughput: 0: 42637.7. Samples: 8641744940. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:20:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 22:21:01,255][15401] Updated weights for policy 0, policy_version 527450 (0.0038) [2024-06-23 22:21:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42876.7). Total num frames: 8641855488. Throughput: 0: 42497.4. Samples: 8642000800. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 22:21:05,197][15401] Updated weights for policy 0, policy_version 527460 (0.0026) [2024-06-23 22:21:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.8, 300 sec: 42709.5). Total num frames: 8642035712. Throughput: 0: 42572.3. Samples: 8642127240. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 22:21:08,771][15401] Updated weights for policy 0, policy_version 527470 (0.0035) [2024-06-23 22:21:12,816][15401] Updated weights for policy 0, policy_version 527480 (0.0038) [2024-06-23 22:21:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8642265088. Throughput: 0: 42606.6. Samples: 8642384560. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:13,398][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 22:21:16,638][15401] Updated weights for policy 0, policy_version 527490 (0.0033) [2024-06-23 22:21:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8642478080. Throughput: 0: 42408.0. Samples: 8642634100. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:18,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-23 22:21:20,452][15401] Updated weights for policy 0, policy_version 527500 (0.0024) [2024-06-23 22:21:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8642674688. Throughput: 0: 42619.5. Samples: 8642765680. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:23,398][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 22:21:24,138][15401] Updated weights for policy 0, policy_version 527510 (0.0043) [2024-06-23 22:21:28,074][15401] Updated weights for policy 0, policy_version 527520 (0.0037) [2024-06-23 22:21:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8642887680. Throughput: 0: 42461.4. Samples: 8643016420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-23 22:21:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 22:21:31,824][15401] Updated weights for policy 0, policy_version 527530 (0.0038) [2024-06-23 22:21:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 8643117056. Throughput: 0: 42619.5. Samples: 8643279280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 22:21:35,689][15401] Updated weights for policy 0, policy_version 527540 (0.0033) [2024-06-23 22:21:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8643313664. Throughput: 0: 42767.9. Samples: 8643409840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:38,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 22:21:39,414][15401] Updated weights for policy 0, policy_version 527550 (0.0039) [2024-06-23 22:21:43,242][15401] Updated weights for policy 0, policy_version 527560 (0.0040) [2024-06-23 22:21:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8643543040. Throughput: 0: 42677.3. Samples: 8643665420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:43,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-23 22:21:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527560_8643543040.pth... [2024-06-23 22:21:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000526934_8633286656.pth [2024-06-23 22:21:47,215][15401] Updated weights for policy 0, policy_version 527570 (0.0039) [2024-06-23 22:21:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 8643756032. Throughput: 0: 42634.1. Samples: 8643919340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 22:21:50,896][15401] Updated weights for policy 0, policy_version 527580 (0.0048) [2024-06-23 22:21:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8643969024. Throughput: 0: 42699.2. Samples: 8644048700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 22:21:54,751][15401] Updated weights for policy 0, policy_version 527590 (0.0034) [2024-06-23 22:21:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 8644182016. Throughput: 0: 42697.9. Samples: 8644306240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:21:58,397][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 22:21:58,554][15401] Updated weights for policy 0, policy_version 527600 (0.0046) [2024-06-23 22:22:02,513][15401] Updated weights for policy 0, policy_version 527610 (0.0028) [2024-06-23 22:22:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8644395008. Throughput: 0: 42603.1. Samples: 8644551240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 22:22:06,625][15401] Updated weights for policy 0, policy_version 527620 (0.0035) [2024-06-23 22:22:08,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8644608000. Throughput: 0: 42641.3. Samples: 8644684540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 22:22:10,145][15401] Updated weights for policy 0, policy_version 527630 (0.0034) [2024-06-23 22:22:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8644820992. Throughput: 0: 42718.5. Samples: 8644938860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:13,393][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 22:22:14,187][15401] Updated weights for policy 0, policy_version 527640 (0.0033) [2024-06-23 22:22:14,822][15349] Signal inference workers to stop experience collection... (128100 times) [2024-06-23 22:22:14,823][15349] Signal inference workers to resume experience collection... (128100 times) [2024-06-23 22:22:14,845][15401] InferenceWorker_p0-w0: stopping experience collection (128100 times) [2024-06-23 22:22:14,845][15401] InferenceWorker_p0-w0: resuming experience collection (128100 times) [2024-06-23 22:22:17,723][15401] Updated weights for policy 0, policy_version 527650 (0.0038) [2024-06-23 22:22:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8645033984. Throughput: 0: 42569.3. Samples: 8645194900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 22:22:21,793][15401] Updated weights for policy 0, policy_version 527660 (0.0021) [2024-06-23 22:22:23,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8645263360. Throughput: 0: 42561.7. Samples: 8645325120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 22:22:25,296][15401] Updated weights for policy 0, policy_version 527670 (0.0038) [2024-06-23 22:22:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8645443584. Throughput: 0: 42545.9. Samples: 8645579980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 22:22:29,329][15401] Updated weights for policy 0, policy_version 527680 (0.0043) [2024-06-23 22:22:33,124][15401] Updated weights for policy 0, policy_version 527690 (0.0049) [2024-06-23 22:22:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8645672960. Throughput: 0: 42550.4. Samples: 8645834100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 22:22:37,003][15401] Updated weights for policy 0, policy_version 527700 (0.0041) [2024-06-23 22:22:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8645885952. Throughput: 0: 42658.2. Samples: 8645968320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 22:22:40,684][15401] Updated weights for policy 0, policy_version 527710 (0.0041) [2024-06-23 22:22:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 8646082560. Throughput: 0: 42565.2. Samples: 8646221400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:43,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 22:22:44,585][15401] Updated weights for policy 0, policy_version 527720 (0.0033) [2024-06-23 22:22:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 8646295552. Throughput: 0: 42788.1. Samples: 8646476700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:48,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-23 22:22:48,782][15401] Updated weights for policy 0, policy_version 527730 (0.0037) [2024-06-23 22:22:52,256][15401] Updated weights for policy 0, policy_version 527740 (0.0031) [2024-06-23 22:22:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8646524928. Throughput: 0: 42705.7. Samples: 8646606300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 22:22:56,357][15401] Updated weights for policy 0, policy_version 527750 (0.0042) [2024-06-23 22:22:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 8646737920. Throughput: 0: 42576.5. Samples: 8646854700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-23 22:22:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 22:23:00,085][15401] Updated weights for policy 0, policy_version 527760 (0.0029) [2024-06-23 22:23:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8646918144. Throughput: 0: 42778.1. Samples: 8647119920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-23 22:23:03,981][15401] Updated weights for policy 0, policy_version 527770 (0.0031) [2024-06-23 22:23:07,907][15401] Updated weights for policy 0, policy_version 527780 (0.0041) [2024-06-23 22:23:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8647147520. Throughput: 0: 42611.6. Samples: 8647242640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 22:23:11,542][15401] Updated weights for policy 0, policy_version 527790 (0.0029) [2024-06-23 22:23:13,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8647393280. Throughput: 0: 42538.6. Samples: 8647494220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 22:23:15,703][15401] Updated weights for policy 0, policy_version 527800 (0.0029) [2024-06-23 22:23:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 8647573504. Throughput: 0: 42629.6. Samples: 8647752440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 22:23:19,131][15401] Updated weights for policy 0, policy_version 527810 (0.0047) [2024-06-23 22:23:23,335][15401] Updated weights for policy 0, policy_version 527820 (0.0033) [2024-06-23 22:23:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8647802880. Throughput: 0: 42386.3. Samples: 8647875700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 22:23:27,094][15401] Updated weights for policy 0, policy_version 527830 (0.0028) [2024-06-23 22:23:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8648015872. Throughput: 0: 42467.1. Samples: 8648132420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 22:23:30,898][15401] Updated weights for policy 0, policy_version 527840 (0.0034) [2024-06-23 22:23:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8648212480. Throughput: 0: 42602.6. Samples: 8648393820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 22:23:34,709][15401] Updated weights for policy 0, policy_version 527850 (0.0036) [2024-06-23 22:23:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8648441856. Throughput: 0: 42499.2. Samples: 8648518760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 22:23:38,438][15401] Updated weights for policy 0, policy_version 527860 (0.0034) [2024-06-23 22:23:42,648][15401] Updated weights for policy 0, policy_version 527870 (0.0033) [2024-06-23 22:23:42,755][15349] Signal inference workers to stop experience collection... (128150 times) [2024-06-23 22:23:42,800][15401] InferenceWorker_p0-w0: stopping experience collection (128150 times) [2024-06-23 22:23:42,809][15349] Signal inference workers to resume experience collection... (128150 times) [2024-06-23 22:23:42,811][15401] InferenceWorker_p0-w0: resuming experience collection (128150 times) [2024-06-23 22:23:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8648671232. Throughput: 0: 42627.1. Samples: 8648772920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527873_8648671232.pth... [2024-06-23 22:23:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527248_8638431232.pth [2024-06-23 22:23:46,126][15401] Updated weights for policy 0, policy_version 527880 (0.0046) [2024-06-23 22:23:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8648835072. Throughput: 0: 42509.8. Samples: 8649032860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 22:23:50,252][15401] Updated weights for policy 0, policy_version 527890 (0.0042) [2024-06-23 22:23:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8649080832. Throughput: 0: 42446.7. Samples: 8649152740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 22:23:53,692][15401] Updated weights for policy 0, policy_version 527900 (0.0041) [2024-06-23 22:23:57,735][15401] Updated weights for policy 0, policy_version 527910 (0.0036) [2024-06-23 22:23:58,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8649310208. Throughput: 0: 42708.4. Samples: 8649416200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:23:58,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 22:24:01,632][15401] Updated weights for policy 0, policy_version 527920 (0.0038) [2024-06-23 22:24:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 8649490432. Throughput: 0: 42584.1. Samples: 8649668720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:03,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-23 22:24:05,476][15401] Updated weights for policy 0, policy_version 527930 (0.0032) [2024-06-23 22:24:08,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8649703424. Throughput: 0: 42648.4. Samples: 8649794880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:08,398][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 22:24:09,421][15401] Updated weights for policy 0, policy_version 527940 (0.0031) [2024-06-23 22:24:13,175][15401] Updated weights for policy 0, policy_version 527950 (0.0028) [2024-06-23 22:24:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8649949184. Throughput: 0: 42885.7. Samples: 8650062280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:13,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-23 22:24:17,330][15401] Updated weights for policy 0, policy_version 527960 (0.0033) [2024-06-23 22:24:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8650129408. Throughput: 0: 42626.2. Samples: 8650312000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 22:24:20,823][15401] Updated weights for policy 0, policy_version 527970 (0.0036) [2024-06-23 22:24:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8650358784. Throughput: 0: 42566.1. Samples: 8650434240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 22:24:25,200][15401] Updated weights for policy 0, policy_version 527980 (0.0025) [2024-06-23 22:24:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8650571776. Throughput: 0: 42591.5. Samples: 8650689540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:24:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 22:24:28,662][15401] Updated weights for policy 0, policy_version 527990 (0.0036) [2024-06-23 22:24:32,760][15401] Updated weights for policy 0, policy_version 528000 (0.0035) [2024-06-23 22:24:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8650768384. Throughput: 0: 42540.4. Samples: 8650947180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 22:24:36,470][15401] Updated weights for policy 0, policy_version 528010 (0.0025) [2024-06-23 22:24:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8651014144. Throughput: 0: 42696.0. Samples: 8651074060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:24:40,346][15401] Updated weights for policy 0, policy_version 528020 (0.0029) [2024-06-23 22:24:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 8651194368. Throughput: 0: 42563.6. Samples: 8651331460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 22:24:44,310][15401] Updated weights for policy 0, policy_version 528030 (0.0038) [2024-06-23 22:24:48,016][15401] Updated weights for policy 0, policy_version 528040 (0.0028) [2024-06-23 22:24:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 8651407360. Throughput: 0: 42459.6. Samples: 8651579400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 22:24:51,909][15401] Updated weights for policy 0, policy_version 528050 (0.0028) [2024-06-23 22:24:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8651653120. Throughput: 0: 42567.1. Samples: 8651710400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 22:24:55,739][15401] Updated weights for policy 0, policy_version 528060 (0.0038) [2024-06-23 22:24:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 8651816960. Throughput: 0: 42404.0. Samples: 8651970460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:24:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 22:24:59,547][15401] Updated weights for policy 0, policy_version 528070 (0.0031) [2024-06-23 22:25:03,353][15401] Updated weights for policy 0, policy_version 528080 (0.0040) [2024-06-23 22:25:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42543.0). Total num frames: 8652062720. Throughput: 0: 42507.1. Samples: 8652224820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 22:25:07,135][15401] Updated weights for policy 0, policy_version 528090 (0.0029) [2024-06-23 22:25:07,545][15349] Signal inference workers to stop experience collection... (128200 times) [2024-06-23 22:25:07,596][15401] InferenceWorker_p0-w0: stopping experience collection (128200 times) [2024-06-23 22:25:07,663][15349] Signal inference workers to resume experience collection... (128200 times) [2024-06-23 22:25:07,663][15401] InferenceWorker_p0-w0: resuming experience collection (128200 times) [2024-06-23 22:25:08,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 8652308480. Throughput: 0: 42756.1. Samples: 8652358260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 22:25:10,970][15401] Updated weights for policy 0, policy_version 528100 (0.0035) [2024-06-23 22:25:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8652472320. Throughput: 0: 42764.0. Samples: 8652613920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 22:25:14,886][15401] Updated weights for policy 0, policy_version 528110 (0.0031) [2024-06-23 22:25:18,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8652701696. Throughput: 0: 42592.0. Samples: 8652863820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 22:25:18,451][15401] Updated weights for policy 0, policy_version 528120 (0.0034) [2024-06-23 22:25:22,435][15401] Updated weights for policy 0, policy_version 528130 (0.0027) [2024-06-23 22:25:23,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8652947456. Throughput: 0: 42819.0. Samples: 8653001020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:23,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 22:25:25,940][15401] Updated weights for policy 0, policy_version 528140 (0.0029) [2024-06-23 22:25:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8653094912. Throughput: 0: 42937.4. Samples: 8653263640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 22:25:29,987][15401] Updated weights for policy 0, policy_version 528150 (0.0032) [2024-06-23 22:25:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8653340672. Throughput: 0: 42903.1. Samples: 8653510040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 22:25:33,561][15401] Updated weights for policy 0, policy_version 528160 (0.0036) [2024-06-23 22:25:37,581][15401] Updated weights for policy 0, policy_version 528170 (0.0033) [2024-06-23 22:25:38,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8653586432. Throughput: 0: 43048.1. Samples: 8653647560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 22:25:41,112][15401] Updated weights for policy 0, policy_version 528180 (0.0043) [2024-06-23 22:25:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8653750272. Throughput: 0: 43012.9. Samples: 8653906040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 22:25:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528184_8653766656.pth... [2024-06-23 22:25:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527560_8643543040.pth [2024-06-23 22:25:44,947][15401] Updated weights for policy 0, policy_version 528190 (0.0035) [2024-06-23 22:25:48,391][15132] Fps is (10 sec: 40953.9, 60 sec: 43143.4, 300 sec: 42709.3). Total num frames: 8653996032. Throughput: 0: 42820.9. Samples: 8654151820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:48,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-23 22:25:48,933][15401] Updated weights for policy 0, policy_version 528200 (0.0035) [2024-06-23 22:25:52,731][15401] Updated weights for policy 0, policy_version 528210 (0.0033) [2024-06-23 22:25:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8654225408. Throughput: 0: 42993.2. Samples: 8654292960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 22:25:56,654][15401] Updated weights for policy 0, policy_version 528220 (0.0031) [2024-06-23 22:25:58,389][15132] Fps is (10 sec: 37689.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 8654372864. Throughput: 0: 42901.4. Samples: 8654544480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-23 22:25:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 22:26:00,336][15401] Updated weights for policy 0, policy_version 528230 (0.0040) [2024-06-23 22:26:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8654651392. Throughput: 0: 42746.8. Samples: 8654787420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 22:26:04,462][15401] Updated weights for policy 0, policy_version 528240 (0.0037) [2024-06-23 22:26:08,233][15401] Updated weights for policy 0, policy_version 528250 (0.0051) [2024-06-23 22:26:08,396][15132] Fps is (10 sec: 47483.1, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 8654848000. Throughput: 0: 42936.7. Samples: 8654933340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:08,397][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 22:26:12,176][15401] Updated weights for policy 0, policy_version 528260 (0.0039) [2024-06-23 22:26:13,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8655028224. Throughput: 0: 42563.8. Samples: 8655179020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-23 22:26:15,820][15401] Updated weights for policy 0, policy_version 528270 (0.0038) [2024-06-23 22:26:18,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8655273984. Throughput: 0: 42713.7. Samples: 8655432160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-23 22:26:19,755][15401] Updated weights for policy 0, policy_version 528280 (0.0030) [2024-06-23 22:26:23,029][15349] Signal inference workers to stop experience collection... (128250 times) [2024-06-23 22:26:23,086][15401] InferenceWorker_p0-w0: stopping experience collection (128250 times) [2024-06-23 22:26:23,095][15349] Signal inference workers to resume experience collection... (128250 times) [2024-06-23 22:26:23,099][15401] InferenceWorker_p0-w0: resuming experience collection (128250 times) [2024-06-23 22:26:23,393][15132] Fps is (10 sec: 45858.4, 60 sec: 42324.3, 300 sec: 42708.9). Total num frames: 8655486976. Throughput: 0: 42748.8. Samples: 8655571420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:23,394][15132] Avg episode reward: [(0, '0.567')] [2024-06-23 22:26:23,430][15401] Updated weights for policy 0, policy_version 528290 (0.0038) [2024-06-23 22:26:27,290][15401] Updated weights for policy 0, policy_version 528300 (0.0031) [2024-06-23 22:26:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 8655667200. Throughput: 0: 42557.3. Samples: 8655821120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 22:26:30,927][15401] Updated weights for policy 0, policy_version 528310 (0.0041) [2024-06-23 22:26:33,390][15132] Fps is (10 sec: 44252.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8655929344. Throughput: 0: 42699.9. Samples: 8656073260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:33,391][15132] Avg episode reward: [(0, '0.378')] [2024-06-23 22:26:35,604][15401] Updated weights for policy 0, policy_version 528320 (0.0037) [2024-06-23 22:26:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8656142336. Throughput: 0: 42820.6. Samples: 8656219880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 22:26:38,477][15401] Updated weights for policy 0, policy_version 528330 (0.0035) [2024-06-23 22:26:43,162][15401] Updated weights for policy 0, policy_version 528340 (0.0038) [2024-06-23 22:26:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8656322560. Throughput: 0: 42744.9. Samples: 8656468000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 22:26:46,197][15401] Updated weights for policy 0, policy_version 528350 (0.0045) [2024-06-23 22:26:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43145.7, 300 sec: 42765.0). Total num frames: 8656584704. Throughput: 0: 42880.0. Samples: 8656717020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 22:26:51,238][15401] Updated weights for policy 0, policy_version 528360 (0.0030) [2024-06-23 22:26:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 8656781312. Throughput: 0: 42842.9. Samples: 8656861000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:53,391][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 22:26:53,767][15401] Updated weights for policy 0, policy_version 528370 (0.0024) [2024-06-23 22:26:58,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8656945152. Throughput: 0: 42838.9. Samples: 8657106760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:26:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-23 22:26:58,642][15401] Updated weights for policy 0, policy_version 528380 (0.0028) [2024-06-23 22:27:01,550][15401] Updated weights for policy 0, policy_version 528390 (0.0029) [2024-06-23 22:27:03,394][15132] Fps is (10 sec: 44218.3, 60 sec: 42868.4, 300 sec: 42764.4). Total num frames: 8657223680. Throughput: 0: 42824.4. Samples: 8657359440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:03,394][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 22:27:06,031][15401] Updated weights for policy 0, policy_version 528400 (0.0032) [2024-06-23 22:27:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42602.9, 300 sec: 42654.3). Total num frames: 8657403904. Throughput: 0: 42889.9. Samples: 8657501300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 22:27:09,029][15401] Updated weights for policy 0, policy_version 528410 (0.0040) [2024-06-23 22:27:13,389][15132] Fps is (10 sec: 39338.3, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 8657616896. Throughput: 0: 42933.4. Samples: 8657753120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 22:27:13,498][15401] Updated weights for policy 0, policy_version 528420 (0.0031) [2024-06-23 22:27:15,107][15349] Signal inference workers to stop experience collection... (128300 times) [2024-06-23 22:27:15,112][15349] Signal inference workers to resume experience collection... (128300 times) [2024-06-23 22:27:15,157][15401] InferenceWorker_p0-w0: stopping experience collection (128300 times) [2024-06-23 22:27:15,164][15401] InferenceWorker_p0-w0: resuming experience collection (128300 times) [2024-06-23 22:27:16,820][15401] Updated weights for policy 0, policy_version 528430 (0.0039) [2024-06-23 22:27:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8657862656. Throughput: 0: 42991.7. Samples: 8658007880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 22:27:21,503][15401] Updated weights for policy 0, policy_version 528440 (0.0045) [2024-06-23 22:27:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42874.2, 300 sec: 42765.0). Total num frames: 8658059264. Throughput: 0: 42823.0. Samples: 8658146920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 22:27:24,435][15401] Updated weights for policy 0, policy_version 528450 (0.0032) [2024-06-23 22:27:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8658255872. Throughput: 0: 42846.6. Samples: 8658396100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-23 22:27:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 22:27:29,015][15401] Updated weights for policy 0, policy_version 528460 (0.0044) [2024-06-23 22:27:32,159][15401] Updated weights for policy 0, policy_version 528470 (0.0033) [2024-06-23 22:27:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8658501632. Throughput: 0: 42980.9. Samples: 8658651160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 22:27:36,514][15401] Updated weights for policy 0, policy_version 528480 (0.0035) [2024-06-23 22:27:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8658714624. Throughput: 0: 42747.6. Samples: 8658784640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 22:27:40,079][15401] Updated weights for policy 0, policy_version 528490 (0.0033) [2024-06-23 22:27:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8658911232. Throughput: 0: 42865.2. Samples: 8659035700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 22:27:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528498_8658911232.pth... [2024-06-23 22:27:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000527873_8648671232.pth [2024-06-23 22:27:43,967][15401] Updated weights for policy 0, policy_version 528500 (0.0037) [2024-06-23 22:27:47,640][15401] Updated weights for policy 0, policy_version 528510 (0.0028) [2024-06-23 22:27:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 8659140608. Throughput: 0: 43099.0. Samples: 8659298720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-23 22:27:51,815][15401] Updated weights for policy 0, policy_version 528520 (0.0037) [2024-06-23 22:27:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8659353600. Throughput: 0: 42816.5. Samples: 8659428040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:53,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-23 22:27:55,333][15401] Updated weights for policy 0, policy_version 528530 (0.0032) [2024-06-23 22:27:58,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8659550208. Throughput: 0: 42729.8. Samples: 8659675960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:27:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 22:27:59,654][15401] Updated weights for policy 0, policy_version 528540 (0.0041) [2024-06-23 22:28:02,898][15401] Updated weights for policy 0, policy_version 528550 (0.0029) [2024-06-23 22:28:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42601.3, 300 sec: 42820.5). Total num frames: 8659779584. Throughput: 0: 42795.0. Samples: 8659933660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:03,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 22:28:07,315][15401] Updated weights for policy 0, policy_version 528560 (0.0032) [2024-06-23 22:28:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8659992576. Throughput: 0: 42628.9. Samples: 8660065220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:08,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 22:28:10,644][15401] Updated weights for policy 0, policy_version 528570 (0.0027) [2024-06-23 22:28:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8660205568. Throughput: 0: 42790.7. Samples: 8660321680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 22:28:14,742][15401] Updated weights for policy 0, policy_version 528580 (0.0028) [2024-06-23 22:28:18,268][15401] Updated weights for policy 0, policy_version 528590 (0.0034) [2024-06-23 22:28:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8660418560. Throughput: 0: 42843.0. Samples: 8660579100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 22:28:22,248][15401] Updated weights for policy 0, policy_version 528600 (0.0049) [2024-06-23 22:28:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8660631552. Throughput: 0: 42668.4. Samples: 8660704720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 22:28:24,779][15349] Signal inference workers to stop experience collection... (128350 times) [2024-06-23 22:28:24,786][15349] Signal inference workers to resume experience collection... (128350 times) [2024-06-23 22:28:24,829][15401] InferenceWorker_p0-w0: stopping experience collection (128350 times) [2024-06-23 22:28:24,829][15401] InferenceWorker_p0-w0: resuming experience collection (128350 times) [2024-06-23 22:28:25,807][15401] Updated weights for policy 0, policy_version 528610 (0.0034) [2024-06-23 22:28:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 8660860928. Throughput: 0: 42954.7. Samples: 8660968660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 22:28:29,682][15401] Updated weights for policy 0, policy_version 528620 (0.0047) [2024-06-23 22:28:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8661057536. Throughput: 0: 42802.4. Samples: 8661224820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:28:33,433][15401] Updated weights for policy 0, policy_version 528630 (0.0041) [2024-06-23 22:28:37,621][15401] Updated weights for policy 0, policy_version 528640 (0.0033) [2024-06-23 22:28:38,391][15132] Fps is (10 sec: 40952.6, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 8661270528. Throughput: 0: 42730.3. Samples: 8661350980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:38,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 22:28:41,057][15401] Updated weights for policy 0, policy_version 528650 (0.0044) [2024-06-23 22:28:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 8661499904. Throughput: 0: 42846.1. Samples: 8661604140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:43,393][15132] Avg episode reward: [(0, '0.183')] [2024-06-23 22:28:45,177][15401] Updated weights for policy 0, policy_version 528660 (0.0026) [2024-06-23 22:28:48,390][15132] Fps is (10 sec: 42605.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8661696512. Throughput: 0: 42910.3. Samples: 8661864620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 22:28:48,838][15401] Updated weights for policy 0, policy_version 528670 (0.0032) [2024-06-23 22:28:52,858][15401] Updated weights for policy 0, policy_version 528680 (0.0031) [2024-06-23 22:28:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 8661909504. Throughput: 0: 42877.3. Samples: 8661994700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-23 22:28:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 22:28:56,453][15401] Updated weights for policy 0, policy_version 528690 (0.0039) [2024-06-23 22:28:58,394][15132] Fps is (10 sec: 44218.8, 60 sec: 43141.5, 300 sec: 42875.5). Total num frames: 8662138880. Throughput: 0: 42737.4. Samples: 8662245040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:28:58,394][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 22:29:00,604][15401] Updated weights for policy 0, policy_version 528700 (0.0026) [2024-06-23 22:29:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8662351872. Throughput: 0: 42982.8. Samples: 8662513320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 22:29:03,985][15401] Updated weights for policy 0, policy_version 528710 (0.0037) [2024-06-23 22:29:08,027][15401] Updated weights for policy 0, policy_version 528720 (0.0033) [2024-06-23 22:29:08,390][15132] Fps is (10 sec: 42613.8, 60 sec: 42871.1, 300 sec: 42764.9). Total num frames: 8662564864. Throughput: 0: 42972.0. Samples: 8662638480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:08,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 22:29:11,671][15401] Updated weights for policy 0, policy_version 528730 (0.0031) [2024-06-23 22:29:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8662777856. Throughput: 0: 42780.0. Samples: 8662893760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 22:29:15,511][15401] Updated weights for policy 0, policy_version 528740 (0.0046) [2024-06-23 22:29:18,390][15132] Fps is (10 sec: 40961.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8662974464. Throughput: 0: 42975.9. Samples: 8663158740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 22:29:19,214][15401] Updated weights for policy 0, policy_version 528750 (0.0039) [2024-06-23 22:29:22,995][15401] Updated weights for policy 0, policy_version 528760 (0.0042) [2024-06-23 22:29:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8663203840. Throughput: 0: 42828.4. Samples: 8663278180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 22:29:26,802][15401] Updated weights for policy 0, policy_version 528770 (0.0035) [2024-06-23 22:29:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8663416832. Throughput: 0: 42841.7. Samples: 8663531920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 22:29:30,876][15401] Updated weights for policy 0, policy_version 528780 (0.0033) [2024-06-23 22:29:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8663613440. Throughput: 0: 43001.9. Samples: 8663799700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 22:29:34,554][15401] Updated weights for policy 0, policy_version 528790 (0.0036) [2024-06-23 22:29:38,382][15401] Updated weights for policy 0, policy_version 528800 (0.0032) [2024-06-23 22:29:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43145.8, 300 sec: 42931.6). Total num frames: 8663859200. Throughput: 0: 42860.1. Samples: 8663923400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 22:29:42,620][15401] Updated weights for policy 0, policy_version 528810 (0.0030) [2024-06-23 22:29:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 8664055808. Throughput: 0: 43070.1. Samples: 8664183120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:43,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 22:29:43,554][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528813_8664072192.pth... [2024-06-23 22:29:43,625][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528184_8653766656.pth [2024-06-23 22:29:45,889][15401] Updated weights for policy 0, policy_version 528820 (0.0020) [2024-06-23 22:29:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8664252416. Throughput: 0: 42899.6. Samples: 8664443800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 22:29:50,271][15401] Updated weights for policy 0, policy_version 528830 (0.0038) [2024-06-23 22:29:53,389][15401] Updated weights for policy 0, policy_version 528840 (0.0027) [2024-06-23 22:29:53,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 8664514560. Throughput: 0: 42819.6. Samples: 8664565340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:53,391][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 22:29:57,760][15401] Updated weights for policy 0, policy_version 528850 (0.0049) [2024-06-23 22:29:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42874.5, 300 sec: 42876.1). Total num frames: 8664711168. Throughput: 0: 42873.3. Samples: 8664823060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:29:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 22:30:00,561][15349] Signal inference workers to stop experience collection... (128400 times) [2024-06-23 22:30:00,561][15349] Signal inference workers to resume experience collection... (128400 times) [2024-06-23 22:30:00,588][15401] InferenceWorker_p0-w0: stopping experience collection (128400 times) [2024-06-23 22:30:00,588][15401] InferenceWorker_p0-w0: resuming experience collection (128400 times) [2024-06-23 22:30:01,221][15401] Updated weights for policy 0, policy_version 528860 (0.0029) [2024-06-23 22:30:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8664907776. Throughput: 0: 42740.1. Samples: 8665082040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:30:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 22:30:05,218][15401] Updated weights for policy 0, policy_version 528870 (0.0027) [2024-06-23 22:30:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.9, 300 sec: 42931.6). Total num frames: 8665137152. Throughput: 0: 42816.0. Samples: 8665204900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:30:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 22:30:08,834][15401] Updated weights for policy 0, policy_version 528880 (0.0039) [2024-06-23 22:30:13,021][15401] Updated weights for policy 0, policy_version 528890 (0.0035) [2024-06-23 22:30:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8665350144. Throughput: 0: 43124.6. Samples: 8665472520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:30:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 22:30:16,442][15401] Updated weights for policy 0, policy_version 528900 (0.0040) [2024-06-23 22:30:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 8665546752. Throughput: 0: 42808.5. Samples: 8665726080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:30:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 22:30:20,761][15401] Updated weights for policy 0, policy_version 528910 (0.0029) [2024-06-23 22:30:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8665792512. Throughput: 0: 42893.7. Samples: 8665853620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-23 22:30:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-23 22:30:23,876][15401] Updated weights for policy 0, policy_version 528920 (0.0036) [2024-06-23 22:30:28,271][15401] Updated weights for policy 0, policy_version 528930 (0.0035) [2024-06-23 22:30:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8665989120. Throughput: 0: 43008.2. Samples: 8666118380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:28,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 22:30:31,657][15401] Updated weights for policy 0, policy_version 528940 (0.0031) [2024-06-23 22:30:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8666202112. Throughput: 0: 42837.6. Samples: 8666371500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 22:30:35,889][15401] Updated weights for policy 0, policy_version 528950 (0.0040) [2024-06-23 22:30:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8666431488. Throughput: 0: 43056.4. Samples: 8666502880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 22:30:39,283][15401] Updated weights for policy 0, policy_version 528960 (0.0033) [2024-06-23 22:30:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42820.8). Total num frames: 8666628096. Throughput: 0: 43063.4. Samples: 8666760920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 22:30:43,676][15401] Updated weights for policy 0, policy_version 528970 (0.0038) [2024-06-23 22:30:46,917][15401] Updated weights for policy 0, policy_version 528980 (0.0033) [2024-06-23 22:30:48,394][15132] Fps is (10 sec: 40941.9, 60 sec: 43141.2, 300 sec: 42764.4). Total num frames: 8666841088. Throughput: 0: 42855.7. Samples: 8667010740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:48,394][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 22:30:51,188][15401] Updated weights for policy 0, policy_version 528990 (0.0027) [2024-06-23 22:30:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 8667037696. Throughput: 0: 42975.1. Samples: 8667138780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 22:30:54,399][15401] Updated weights for policy 0, policy_version 529000 (0.0024) [2024-06-23 22:30:58,389][15132] Fps is (10 sec: 42617.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8667267072. Throughput: 0: 42712.0. Samples: 8667394560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:30:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 22:30:58,991][15401] Updated weights for policy 0, policy_version 529010 (0.0027) [2024-06-23 22:31:02,085][15401] Updated weights for policy 0, policy_version 529020 (0.0031) [2024-06-23 22:31:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 8667480064. Throughput: 0: 42723.4. Samples: 8667648640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 22:31:06,510][15401] Updated weights for policy 0, policy_version 529030 (0.0039) [2024-06-23 22:31:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8667709440. Throughput: 0: 42870.3. Samples: 8667782780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-23 22:31:09,752][15401] Updated weights for policy 0, policy_version 529040 (0.0027) [2024-06-23 22:31:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8667889664. Throughput: 0: 42658.2. Samples: 8668038000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 22:31:14,117][15401] Updated weights for policy 0, policy_version 529050 (0.0033) [2024-06-23 22:31:17,462][15401] Updated weights for policy 0, policy_version 529060 (0.0039) [2024-06-23 22:31:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.7). Total num frames: 8668135424. Throughput: 0: 42616.2. Samples: 8668289220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 22:31:21,156][15349] Signal inference workers to stop experience collection... (128450 times) [2024-06-23 22:31:21,157][15349] Signal inference workers to resume experience collection... (128450 times) [2024-06-23 22:31:21,177][15401] InferenceWorker_p0-w0: stopping experience collection (128450 times) [2024-06-23 22:31:21,213][15401] InferenceWorker_p0-w0: resuming experience collection (128450 times) [2024-06-23 22:31:21,763][15401] Updated weights for policy 0, policy_version 529070 (0.0044) [2024-06-23 22:31:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 8668332032. Throughput: 0: 42768.9. Samples: 8668427480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-23 22:31:25,066][15401] Updated weights for policy 0, policy_version 529080 (0.0045) [2024-06-23 22:31:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8668528640. Throughput: 0: 42650.4. Samples: 8668680180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 22:31:29,590][15401] Updated weights for policy 0, policy_version 529090 (0.0023) [2024-06-23 22:31:32,715][15401] Updated weights for policy 0, policy_version 529100 (0.0045) [2024-06-23 22:31:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8668790784. Throughput: 0: 42685.5. Samples: 8668931400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 22:31:37,223][15401] Updated weights for policy 0, policy_version 529110 (0.0027) [2024-06-23 22:31:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 8668971008. Throughput: 0: 43038.7. Samples: 8669075520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 22:31:40,331][15401] Updated weights for policy 0, policy_version 529120 (0.0044) [2024-06-23 22:31:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8669184000. Throughput: 0: 42883.5. Samples: 8669324320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-23 22:31:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529125_8669184000.pth... [2024-06-23 22:31:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528498_8658911232.pth [2024-06-23 22:31:44,710][15401] Updated weights for policy 0, policy_version 529130 (0.0022) [2024-06-23 22:31:47,937][15401] Updated weights for policy 0, policy_version 529140 (0.0036) [2024-06-23 22:31:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43147.8, 300 sec: 42876.1). Total num frames: 8669429760. Throughput: 0: 42744.5. Samples: 8669572140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 22:31:52,406][15401] Updated weights for policy 0, policy_version 529150 (0.0038) [2024-06-23 22:31:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8669609984. Throughput: 0: 42876.9. Samples: 8669712240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 22:31:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 22:31:56,115][15401] Updated weights for policy 0, policy_version 529160 (0.0042) [2024-06-23 22:31:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.6). Total num frames: 8669839360. Throughput: 0: 42738.7. Samples: 8669961240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:31:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 22:32:00,398][15401] Updated weights for policy 0, policy_version 529170 (0.0028) [2024-06-23 22:32:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8670052352. Throughput: 0: 42854.2. Samples: 8670217660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 22:32:03,696][15401] Updated weights for policy 0, policy_version 529180 (0.0030) [2024-06-23 22:32:07,936][15401] Updated weights for policy 0, policy_version 529190 (0.0035) [2024-06-23 22:32:08,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8670265344. Throughput: 0: 42749.4. Samples: 8670351200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:08,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-23 22:32:11,499][15401] Updated weights for policy 0, policy_version 529200 (0.0034) [2024-06-23 22:32:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 8670494720. Throughput: 0: 42808.4. Samples: 8670606560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 22:32:15,497][15401] Updated weights for policy 0, policy_version 529210 (0.0025) [2024-06-23 22:32:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8670707712. Throughput: 0: 42904.5. Samples: 8670862100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 22:32:19,044][15401] Updated weights for policy 0, policy_version 529220 (0.0035) [2024-06-23 22:32:22,948][15401] Updated weights for policy 0, policy_version 529230 (0.0046) [2024-06-23 22:32:23,396][15132] Fps is (10 sec: 42572.9, 60 sec: 43140.3, 300 sec: 42930.8). Total num frames: 8670920704. Throughput: 0: 42587.6. Samples: 8670992220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:23,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 22:32:26,537][15401] Updated weights for policy 0, policy_version 529240 (0.0030) [2024-06-23 22:32:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43688.9, 300 sec: 42875.7). Total num frames: 8671150080. Throughput: 0: 42921.8. Samples: 8671255900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:28,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 22:32:30,297][15401] Updated weights for policy 0, policy_version 529250 (0.0031) [2024-06-23 22:32:33,390][15132] Fps is (10 sec: 44263.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8671363072. Throughput: 0: 43251.6. Samples: 8671518460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 22:32:33,954][15401] Updated weights for policy 0, policy_version 529260 (0.0031) [2024-06-23 22:32:37,846][15401] Updated weights for policy 0, policy_version 529270 (0.0024) [2024-06-23 22:32:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8671576064. Throughput: 0: 42834.6. Samples: 8671639800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 22:32:39,388][15349] Signal inference workers to stop experience collection... (128500 times) [2024-06-23 22:32:39,388][15349] Signal inference workers to resume experience collection... (128500 times) [2024-06-23 22:32:39,409][15401] InferenceWorker_p0-w0: stopping experience collection (128500 times) [2024-06-23 22:32:39,409][15401] InferenceWorker_p0-w0: resuming experience collection (128500 times) [2024-06-23 22:32:41,483][15401] Updated weights for policy 0, policy_version 529280 (0.0025) [2024-06-23 22:32:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8671772672. Throughput: 0: 43235.5. Samples: 8671906840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 22:32:45,374][15401] Updated weights for policy 0, policy_version 529290 (0.0050) [2024-06-23 22:32:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8672018432. Throughput: 0: 43284.1. Samples: 8672165440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:48,397][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 22:32:49,251][15401] Updated weights for policy 0, policy_version 529300 (0.0049) [2024-06-23 22:32:52,851][15401] Updated weights for policy 0, policy_version 529310 (0.0028) [2024-06-23 22:32:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8672215040. Throughput: 0: 43225.4. Samples: 8672296340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:32:56,810][15401] Updated weights for policy 0, policy_version 529320 (0.0033) [2024-06-23 22:32:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8672411648. Throughput: 0: 43113.8. Samples: 8672546680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:32:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:33:00,879][15401] Updated weights for policy 0, policy_version 529330 (0.0034) [2024-06-23 22:33:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8672641024. Throughput: 0: 43028.9. Samples: 8672798400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:33:03,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-23 22:33:05,162][15401] Updated weights for policy 0, policy_version 529340 (0.0046) [2024-06-23 22:33:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8672854016. Throughput: 0: 43141.3. Samples: 8672933320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:33:08,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-23 22:33:08,569][15401] Updated weights for policy 0, policy_version 529350 (0.0040) [2024-06-23 22:33:12,825][15401] Updated weights for policy 0, policy_version 529360 (0.0048) [2024-06-23 22:33:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8673050624. Throughput: 0: 42832.0. Samples: 8673183240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:33:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-23 22:33:16,163][15401] Updated weights for policy 0, policy_version 529370 (0.0029) [2024-06-23 22:33:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8673280000. Throughput: 0: 42600.0. Samples: 8673435460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:33:18,391][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 22:33:20,422][15401] Updated weights for policy 0, policy_version 529380 (0.0031) [2024-06-23 22:33:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42875.8, 300 sec: 42820.6). Total num frames: 8673492992. Throughput: 0: 42807.2. Samples: 8673566120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-23 22:33:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 22:33:23,890][15401] Updated weights for policy 0, policy_version 529390 (0.0027) [2024-06-23 22:33:28,047][15401] Updated weights for policy 0, policy_version 529400 (0.0032) [2024-06-23 22:33:28,390][15132] Fps is (10 sec: 40956.4, 60 sec: 42326.4, 300 sec: 42820.4). Total num frames: 8673689600. Throughput: 0: 42400.9. Samples: 8673814920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:28,391][15132] Avg episode reward: [(0, '0.692')] [2024-06-23 22:33:31,673][15401] Updated weights for policy 0, policy_version 529410 (0.0037) [2024-06-23 22:33:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42820.8). Total num frames: 8673902592. Throughput: 0: 42284.8. Samples: 8674068260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-23 22:33:35,927][15401] Updated weights for policy 0, policy_version 529420 (0.0027) [2024-06-23 22:33:38,389][15132] Fps is (10 sec: 42602.5, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 8674115584. Throughput: 0: 42282.2. Samples: 8674199040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 22:33:39,306][15401] Updated weights for policy 0, policy_version 529430 (0.0036) [2024-06-23 22:33:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8674328576. Throughput: 0: 42374.3. Samples: 8674453520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 22:33:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529439_8674328576.pth... [2024-06-23 22:33:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000528813_8664072192.pth [2024-06-23 22:33:43,605][15401] Updated weights for policy 0, policy_version 529440 (0.0043) [2024-06-23 22:33:47,003][15401] Updated weights for policy 0, policy_version 529450 (0.0040) [2024-06-23 22:33:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 8674525184. Throughput: 0: 42374.7. Samples: 8674705260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 22:33:51,522][15401] Updated weights for policy 0, policy_version 529460 (0.0043) [2024-06-23 22:33:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.6). Total num frames: 8674754560. Throughput: 0: 42233.4. Samples: 8674833820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 22:33:54,697][15401] Updated weights for policy 0, policy_version 529470 (0.0033) [2024-06-23 22:33:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8674967552. Throughput: 0: 42444.5. Samples: 8675093240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:33:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 22:33:58,949][15401] Updated weights for policy 0, policy_version 529480 (0.0034) [2024-06-23 22:34:02,369][15401] Updated weights for policy 0, policy_version 529490 (0.0052) [2024-06-23 22:34:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 8675180544. Throughput: 0: 42455.2. Samples: 8675345940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 22:34:06,550][15401] Updated weights for policy 0, policy_version 529500 (0.0028) [2024-06-23 22:34:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8675393536. Throughput: 0: 42517.8. Samples: 8675479420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 22:34:09,157][15349] Signal inference workers to stop experience collection... (128550 times) [2024-06-23 22:34:09,197][15401] InferenceWorker_p0-w0: stopping experience collection (128550 times) [2024-06-23 22:34:09,206][15349] Signal inference workers to resume experience collection... (128550 times) [2024-06-23 22:34:09,214][15401] InferenceWorker_p0-w0: resuming experience collection (128550 times) [2024-06-23 22:34:10,027][15401] Updated weights for policy 0, policy_version 529510 (0.0037) [2024-06-23 22:34:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8675590144. Throughput: 0: 42520.9. Samples: 8675728320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 22:34:14,294][15401] Updated weights for policy 0, policy_version 529520 (0.0039) [2024-06-23 22:34:17,814][15401] Updated weights for policy 0, policy_version 529530 (0.0032) [2024-06-23 22:34:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8675819520. Throughput: 0: 42388.4. Samples: 8675975740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 22:34:21,911][15401] Updated weights for policy 0, policy_version 529540 (0.0040) [2024-06-23 22:34:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8676016128. Throughput: 0: 42478.1. Samples: 8676110560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:23,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-23 22:34:25,716][15401] Updated weights for policy 0, policy_version 529550 (0.0045) [2024-06-23 22:34:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42326.1, 300 sec: 42765.0). Total num frames: 8676229120. Throughput: 0: 42302.3. Samples: 8676357120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 22:34:29,809][15401] Updated weights for policy 0, policy_version 529560 (0.0039) [2024-06-23 22:34:33,291][15401] Updated weights for policy 0, policy_version 529570 (0.0045) [2024-06-23 22:34:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8676474880. Throughput: 0: 42337.8. Samples: 8676610460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 22:34:37,698][15401] Updated weights for policy 0, policy_version 529580 (0.0040) [2024-06-23 22:34:38,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 8676671488. Throughput: 0: 42478.6. Samples: 8676745460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:38,392][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 22:34:41,365][15401] Updated weights for policy 0, policy_version 529590 (0.0040) [2024-06-23 22:34:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8676868096. Throughput: 0: 42306.7. Samples: 8676997040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:43,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-23 22:34:45,416][15401] Updated weights for policy 0, policy_version 529600 (0.0030) [2024-06-23 22:34:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8677097472. Throughput: 0: 42250.6. Samples: 8677247220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 22:34:49,167][15401] Updated weights for policy 0, policy_version 529610 (0.0044) [2024-06-23 22:34:53,029][15401] Updated weights for policy 0, policy_version 529620 (0.0033) [2024-06-23 22:34:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8677310464. Throughput: 0: 42209.3. Samples: 8677378840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-23 22:34:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 22:34:56,965][15401] Updated weights for policy 0, policy_version 529630 (0.0029) [2024-06-23 22:34:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8677507072. Throughput: 0: 42358.2. Samples: 8677634440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:34:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 22:35:00,584][15401] Updated weights for policy 0, policy_version 529640 (0.0026) [2024-06-23 22:35:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8677736448. Throughput: 0: 42603.1. Samples: 8677892880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-23 22:35:04,623][15401] Updated weights for policy 0, policy_version 529650 (0.0033) [2024-06-23 22:35:08,204][15401] Updated weights for policy 0, policy_version 529660 (0.0031) [2024-06-23 22:35:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8677949440. Throughput: 0: 42439.5. Samples: 8678020340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 22:35:12,291][15401] Updated weights for policy 0, policy_version 529670 (0.0031) [2024-06-23 22:35:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8678146048. Throughput: 0: 42716.4. Samples: 8678279360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 22:35:15,908][15401] Updated weights for policy 0, policy_version 529680 (0.0038) [2024-06-23 22:35:16,898][15349] Signal inference workers to stop experience collection... (128600 times) [2024-06-23 22:35:16,898][15349] Signal inference workers to resume experience collection... (128600 times) [2024-06-23 22:35:16,918][15401] InferenceWorker_p0-w0: stopping experience collection (128600 times) [2024-06-23 22:35:16,918][15401] InferenceWorker_p0-w0: resuming experience collection (128600 times) [2024-06-23 22:35:18,391][15132] Fps is (10 sec: 42593.5, 60 sec: 42597.5, 300 sec: 42653.8). Total num frames: 8678375424. Throughput: 0: 42649.0. Samples: 8678529720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:18,391][15132] Avg episode reward: [(0, '0.300')] [2024-06-23 22:35:20,021][15401] Updated weights for policy 0, policy_version 529690 (0.0033) [2024-06-23 22:35:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8678572032. Throughput: 0: 42477.8. Samples: 8678656860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 22:35:23,840][15401] Updated weights for policy 0, policy_version 529700 (0.0028) [2024-06-23 22:35:27,700][15401] Updated weights for policy 0, policy_version 529710 (0.0040) [2024-06-23 22:35:28,390][15132] Fps is (10 sec: 40964.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 8678785024. Throughput: 0: 42557.2. Samples: 8678912120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:28,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-23 22:35:31,461][15401] Updated weights for policy 0, policy_version 529720 (0.0035) [2024-06-23 22:35:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 8678981632. Throughput: 0: 42688.5. Samples: 8679168200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 22:35:35,386][15401] Updated weights for policy 0, policy_version 529730 (0.0033) [2024-06-23 22:35:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 8679211008. Throughput: 0: 42575.5. Samples: 8679294740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:38,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-23 22:35:39,229][15401] Updated weights for policy 0, policy_version 529740 (0.0023) [2024-06-23 22:35:42,973][15401] Updated weights for policy 0, policy_version 529750 (0.0034) [2024-06-23 22:35:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 8679424000. Throughput: 0: 42567.0. Samples: 8679549960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 22:35:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529750_8679424000.pth... [2024-06-23 22:35:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529125_8669184000.pth [2024-06-23 22:35:46,977][15401] Updated weights for policy 0, policy_version 529760 (0.0037) [2024-06-23 22:35:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8679636992. Throughput: 0: 42441.8. Samples: 8679802760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 22:35:51,290][15401] Updated weights for policy 0, policy_version 529770 (0.0032) [2024-06-23 22:35:53,393][15132] Fps is (10 sec: 42581.8, 60 sec: 42322.5, 300 sec: 42653.4). Total num frames: 8679849984. Throughput: 0: 42548.8. Samples: 8679935200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:53,394][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 22:35:54,587][15401] Updated weights for policy 0, policy_version 529780 (0.0036) [2024-06-23 22:35:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8680062976. Throughput: 0: 42426.2. Samples: 8680188540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:35:58,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 22:35:58,825][15401] Updated weights for policy 0, policy_version 529790 (0.0037) [2024-06-23 22:36:02,118][15401] Updated weights for policy 0, policy_version 529800 (0.0039) [2024-06-23 22:36:03,389][15132] Fps is (10 sec: 42615.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8680275968. Throughput: 0: 42443.0. Samples: 8680439600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:36:03,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-23 22:36:06,428][15401] Updated weights for policy 0, policy_version 529810 (0.0046) [2024-06-23 22:36:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8680505344. Throughput: 0: 42541.4. Samples: 8680571220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:36:08,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 22:36:10,162][15401] Updated weights for policy 0, policy_version 529820 (0.0021) [2024-06-23 22:36:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8680701952. Throughput: 0: 42601.0. Samples: 8680829160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:36:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 22:36:13,999][15401] Updated weights for policy 0, policy_version 529830 (0.0026) [2024-06-23 22:36:17,748][15401] Updated weights for policy 0, policy_version 529840 (0.0049) [2024-06-23 22:36:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42326.2, 300 sec: 42653.9). Total num frames: 8680914944. Throughput: 0: 42353.6. Samples: 8681074120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 22:36:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 22:36:21,733][15401] Updated weights for policy 0, policy_version 529850 (0.0035) [2024-06-23 22:36:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8681144320. Throughput: 0: 42456.9. Samples: 8681205300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-23 22:36:25,499][15401] Updated weights for policy 0, policy_version 529860 (0.0029) [2024-06-23 22:36:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8681324544. Throughput: 0: 42484.0. Samples: 8681461740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 22:36:29,364][15401] Updated weights for policy 0, policy_version 529870 (0.0032) [2024-06-23 22:36:33,005][15401] Updated weights for policy 0, policy_version 529880 (0.0033) [2024-06-23 22:36:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8681553920. Throughput: 0: 42379.5. Samples: 8681709940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:33,393][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 22:36:36,933][15401] Updated weights for policy 0, policy_version 529890 (0.0036) [2024-06-23 22:36:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 8681766912. Throughput: 0: 42459.7. Samples: 8681845820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:38,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 22:36:41,050][15401] Updated weights for policy 0, policy_version 529900 (0.0032) [2024-06-23 22:36:43,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 8681947136. Throughput: 0: 42512.0. Samples: 8682101580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-23 22:36:44,498][15401] Updated weights for policy 0, policy_version 529910 (0.0040) [2024-06-23 22:36:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8682192896. Throughput: 0: 42512.0. Samples: 8682352640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 22:36:48,624][15401] Updated weights for policy 0, policy_version 529920 (0.0023) [2024-06-23 22:36:52,351][15401] Updated weights for policy 0, policy_version 529930 (0.0039) [2024-06-23 22:36:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42601.2, 300 sec: 42598.4). Total num frames: 8682405888. Throughput: 0: 42635.5. Samples: 8682489820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 22:36:56,165][15401] Updated weights for policy 0, policy_version 529940 (0.0043) [2024-06-23 22:36:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8682602496. Throughput: 0: 42520.4. Samples: 8682742580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:36:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 22:36:58,800][15349] Signal inference workers to stop experience collection... (128650 times) [2024-06-23 22:36:58,800][15349] Signal inference workers to resume experience collection... (128650 times) [2024-06-23 22:36:58,823][15401] InferenceWorker_p0-w0: stopping experience collection (128650 times) [2024-06-23 22:36:58,823][15401] InferenceWorker_p0-w0: resuming experience collection (128650 times) [2024-06-23 22:37:00,317][15401] Updated weights for policy 0, policy_version 529950 (0.0047) [2024-06-23 22:37:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8682831872. Throughput: 0: 42695.2. Samples: 8682995400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 22:37:03,662][15401] Updated weights for policy 0, policy_version 529960 (0.0026) [2024-06-23 22:37:07,877][15401] Updated weights for policy 0, policy_version 529970 (0.0031) [2024-06-23 22:37:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8683044864. Throughput: 0: 42855.2. Samples: 8683133780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:08,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 22:37:11,263][15401] Updated weights for policy 0, policy_version 529980 (0.0030) [2024-06-23 22:37:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8683241472. Throughput: 0: 42873.9. Samples: 8683391060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 22:37:15,642][15401] Updated weights for policy 0, policy_version 529990 (0.0034) [2024-06-23 22:37:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42543.7). Total num frames: 8683470848. Throughput: 0: 42868.6. Samples: 8683638920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 22:37:18,892][15401] Updated weights for policy 0, policy_version 530000 (0.0048) [2024-06-23 22:37:23,331][15401] Updated weights for policy 0, policy_version 530010 (0.0029) [2024-06-23 22:37:23,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 8683683840. Throughput: 0: 42897.7. Samples: 8683776120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:23,399][15132] Avg episode reward: [(0, '0.740')] [2024-06-23 22:37:26,494][15401] Updated weights for policy 0, policy_version 530020 (0.0041) [2024-06-23 22:37:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 8683880448. Throughput: 0: 42751.9. Samples: 8684025420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:28,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 22:37:30,837][15401] Updated weights for policy 0, policy_version 530030 (0.0035) [2024-06-23 22:37:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 8684126208. Throughput: 0: 42806.2. Samples: 8684278920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-23 22:37:34,671][15401] Updated weights for policy 0, policy_version 530040 (0.0038) [2024-06-23 22:37:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 8684322816. Throughput: 0: 42776.0. Samples: 8684414740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:38,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-23 22:37:38,459][15401] Updated weights for policy 0, policy_version 530050 (0.0037) [2024-06-23 22:37:42,130][15401] Updated weights for policy 0, policy_version 530060 (0.0029) [2024-06-23 22:37:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 8684519424. Throughput: 0: 42701.0. Samples: 8684664120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 22:37:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530062_8684535808.pth... [2024-06-23 22:37:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529439_8674328576.pth [2024-06-23 22:37:46,126][15401] Updated weights for policy 0, policy_version 530070 (0.0030) [2024-06-23 22:37:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8684781568. Throughput: 0: 42612.3. Samples: 8684912960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-23 22:37:48,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-23 22:37:49,648][15401] Updated weights for policy 0, policy_version 530080 (0.0036) [2024-06-23 22:37:53,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8684978176. Throughput: 0: 42691.4. Samples: 8685054900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:37:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 22:37:53,586][15401] Updated weights for policy 0, policy_version 530090 (0.0022) [2024-06-23 22:37:57,527][15401] Updated weights for policy 0, policy_version 530100 (0.0034) [2024-06-23 22:37:58,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 8685174784. Throughput: 0: 42586.2. Samples: 8685307440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:37:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 22:38:01,412][15349] Signal inference workers to stop experience collection... (128700 times) [2024-06-23 22:38:01,452][15401] InferenceWorker_p0-w0: stopping experience collection (128700 times) [2024-06-23 22:38:01,462][15349] Signal inference workers to resume experience collection... (128700 times) [2024-06-23 22:38:01,471][15401] InferenceWorker_p0-w0: resuming experience collection (128700 times) [2024-06-23 22:38:01,474][15401] Updated weights for policy 0, policy_version 530110 (0.0032) [2024-06-23 22:38:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8685420544. Throughput: 0: 42806.7. Samples: 8685565220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-23 22:38:05,044][15401] Updated weights for policy 0, policy_version 530120 (0.0032) [2024-06-23 22:38:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8685600768. Throughput: 0: 42742.9. Samples: 8685699540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 22:38:08,991][15401] Updated weights for policy 0, policy_version 530130 (0.0033) [2024-06-23 22:38:12,640][15401] Updated weights for policy 0, policy_version 530140 (0.0047) [2024-06-23 22:38:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 8685813760. Throughput: 0: 42728.3. Samples: 8685948200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:13,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-23 22:38:16,566][15401] Updated weights for policy 0, policy_version 530150 (0.0036) [2024-06-23 22:38:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8686059520. Throughput: 0: 42693.0. Samples: 8686200100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 22:38:20,248][15401] Updated weights for policy 0, policy_version 530160 (0.0042) [2024-06-23 22:38:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42543.0). Total num frames: 8686239744. Throughput: 0: 42694.2. Samples: 8686335980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-23 22:38:24,303][15401] Updated weights for policy 0, policy_version 530170 (0.0034) [2024-06-23 22:38:28,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 8686452736. Throughput: 0: 42792.5. Samples: 8686590060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:28,396][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 22:38:28,446][15401] Updated weights for policy 0, policy_version 530180 (0.0026) [2024-06-23 22:38:32,150][15401] Updated weights for policy 0, policy_version 530190 (0.0028) [2024-06-23 22:38:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8686714880. Throughput: 0: 42751.7. Samples: 8686836780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 22:38:36,192][15401] Updated weights for policy 0, policy_version 530200 (0.0034) [2024-06-23 22:38:38,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8686878720. Throughput: 0: 42661.1. Samples: 8686974640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 22:38:39,672][15401] Updated weights for policy 0, policy_version 530210 (0.0032) [2024-06-23 22:38:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8687091712. Throughput: 0: 42760.1. Samples: 8687231640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 22:38:43,715][15401] Updated weights for policy 0, policy_version 530220 (0.0030) [2024-06-23 22:38:47,267][15401] Updated weights for policy 0, policy_version 530230 (0.0026) [2024-06-23 22:38:48,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8687353856. Throughput: 0: 42508.9. Samples: 8687478120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:48,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 22:38:51,340][15401] Updated weights for policy 0, policy_version 530240 (0.0043) [2024-06-23 22:38:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8687534080. Throughput: 0: 42499.6. Samples: 8687612020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 22:38:55,119][15401] Updated weights for policy 0, policy_version 530250 (0.0034) [2024-06-23 22:38:58,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 8687747072. Throughput: 0: 42617.8. Samples: 8687866100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:38:58,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 22:38:59,067][15401] Updated weights for policy 0, policy_version 530260 (0.0040) [2024-06-23 22:39:02,808][15401] Updated weights for policy 0, policy_version 530270 (0.0033) [2024-06-23 22:39:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8687976448. Throughput: 0: 42586.1. Samples: 8688116480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:39:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 22:39:06,804][15401] Updated weights for policy 0, policy_version 530280 (0.0037) [2024-06-23 22:39:08,392][15132] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8688189440. Throughput: 0: 42587.5. Samples: 8688252520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:39:08,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 22:39:09,776][15349] Signal inference workers to stop experience collection... (128750 times) [2024-06-23 22:39:09,777][15349] Signal inference workers to resume experience collection... (128750 times) [2024-06-23 22:39:09,814][15401] InferenceWorker_p0-w0: stopping experience collection (128750 times) [2024-06-23 22:39:09,814][15401] InferenceWorker_p0-w0: resuming experience collection (128750 times) [2024-06-23 22:39:10,216][15401] Updated weights for policy 0, policy_version 530290 (0.0051) [2024-06-23 22:39:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8688369664. Throughput: 0: 42535.0. Samples: 8688503860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:39:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 22:39:14,569][15401] Updated weights for policy 0, policy_version 530300 (0.0044) [2024-06-23 22:39:17,782][15401] Updated weights for policy 0, policy_version 530310 (0.0032) [2024-06-23 22:39:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8688615424. Throughput: 0: 42767.1. Samples: 8688761300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-23 22:39:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 22:39:22,538][15401] Updated weights for policy 0, policy_version 530320 (0.0033) [2024-06-23 22:39:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8688828416. Throughput: 0: 42675.5. Samples: 8688895040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-23 22:39:25,633][15401] Updated weights for policy 0, policy_version 530330 (0.0039) [2024-06-23 22:39:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 8689025024. Throughput: 0: 42454.6. Samples: 8689142100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 22:39:30,209][15401] Updated weights for policy 0, policy_version 530340 (0.0038) [2024-06-23 22:39:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42543.2). Total num frames: 8689221632. Throughput: 0: 42669.7. Samples: 8689398260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 22:39:33,604][15401] Updated weights for policy 0, policy_version 530350 (0.0030) [2024-06-23 22:39:37,825][15401] Updated weights for policy 0, policy_version 530360 (0.0027) [2024-06-23 22:39:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8689434624. Throughput: 0: 42524.4. Samples: 8689525620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 22:39:41,220][15401] Updated weights for policy 0, policy_version 530370 (0.0026) [2024-06-23 22:39:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8689664000. Throughput: 0: 42540.5. Samples: 8689780320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 22:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530375_8689664000.pth... [2024-06-23 22:39:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000529750_8679424000.pth [2024-06-23 22:39:45,314][15401] Updated weights for policy 0, policy_version 530380 (0.0042) [2024-06-23 22:39:48,395][15132] Fps is (10 sec: 42573.1, 60 sec: 41775.0, 300 sec: 42542.0). Total num frames: 8689860608. Throughput: 0: 42737.6. Samples: 8690039920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:48,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-23 22:39:48,887][15401] Updated weights for policy 0, policy_version 530390 (0.0034) [2024-06-23 22:39:53,017][15401] Updated weights for policy 0, policy_version 530400 (0.0040) [2024-06-23 22:39:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8690073600. Throughput: 0: 42439.1. Samples: 8690162180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 22:39:56,810][15401] Updated weights for policy 0, policy_version 530410 (0.0035) [2024-06-23 22:39:58,389][15132] Fps is (10 sec: 45902.6, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 8690319360. Throughput: 0: 42543.1. Samples: 8690418300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:39:58,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-23 22:40:00,782][15401] Updated weights for policy 0, policy_version 530420 (0.0036) [2024-06-23 22:40:03,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 8690515968. Throughput: 0: 42563.0. Samples: 8690676740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:03,393][15132] Avg episode reward: [(0, '0.865')] [2024-06-23 22:40:04,395][15401] Updated weights for policy 0, policy_version 530430 (0.0027) [2024-06-23 22:40:08,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41780.9, 300 sec: 42542.9). Total num frames: 8690696192. Throughput: 0: 42283.1. Samples: 8690797780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 22:40:08,741][15401] Updated weights for policy 0, policy_version 530440 (0.0042) [2024-06-23 22:40:11,983][15401] Updated weights for policy 0, policy_version 530450 (0.0036) [2024-06-23 22:40:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42598.6). Total num frames: 8690941952. Throughput: 0: 42420.9. Samples: 8691051040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-23 22:40:16,784][15401] Updated weights for policy 0, policy_version 530460 (0.0031) [2024-06-23 22:40:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8691154944. Throughput: 0: 42461.3. Samples: 8691309020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 22:40:19,403][15349] Signal inference workers to stop experience collection... (128800 times) [2024-06-23 22:40:19,440][15401] InferenceWorker_p0-w0: stopping experience collection (128800 times) [2024-06-23 22:40:19,450][15349] Signal inference workers to resume experience collection... (128800 times) [2024-06-23 22:40:19,454][15401] InferenceWorker_p0-w0: resuming experience collection (128800 times) [2024-06-23 22:40:19,588][15401] Updated weights for policy 0, policy_version 530470 (0.0028) [2024-06-23 22:40:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 8691335168. Throughput: 0: 42441.7. Samples: 8691435500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:40:24,235][15401] Updated weights for policy 0, policy_version 530480 (0.0038) [2024-06-23 22:40:27,213][15401] Updated weights for policy 0, policy_version 530490 (0.0024) [2024-06-23 22:40:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8691580928. Throughput: 0: 42382.3. Samples: 8691687520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 22:40:31,736][15401] Updated weights for policy 0, policy_version 530500 (0.0028) [2024-06-23 22:40:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8691777536. Throughput: 0: 42556.3. Samples: 8691954700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 22:40:34,977][15401] Updated weights for policy 0, policy_version 530510 (0.0030) [2024-06-23 22:40:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8691990528. Throughput: 0: 42439.2. Samples: 8692071940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 22:40:39,309][15401] Updated weights for policy 0, policy_version 530520 (0.0034) [2024-06-23 22:40:42,589][15401] Updated weights for policy 0, policy_version 530530 (0.0033) [2024-06-23 22:40:43,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 8692236288. Throughput: 0: 42550.2. Samples: 8692333160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:43,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 22:40:46,853][15401] Updated weights for policy 0, policy_version 530540 (0.0031) [2024-06-23 22:40:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42056.4, 300 sec: 42487.9). Total num frames: 8692383744. Throughput: 0: 42557.8. Samples: 8692591740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 22:40:48,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 22:40:50,369][15401] Updated weights for policy 0, policy_version 530550 (0.0026) [2024-06-23 22:40:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8692645888. Throughput: 0: 42565.3. Samples: 8692713220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:40:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 22:40:54,442][15401] Updated weights for policy 0, policy_version 530560 (0.0035) [2024-06-23 22:40:57,914][15401] Updated weights for policy 0, policy_version 530570 (0.0034) [2024-06-23 22:40:58,389][15132] Fps is (10 sec: 49152.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8692875264. Throughput: 0: 42775.5. Samples: 8692975940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:40:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 22:41:02,538][15401] Updated weights for policy 0, policy_version 530580 (0.0050) [2024-06-23 22:41:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 8693055488. Throughput: 0: 42872.0. Samples: 8693238360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:03,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 22:41:05,528][15401] Updated weights for policy 0, policy_version 530590 (0.0042) [2024-06-23 22:41:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8693284864. Throughput: 0: 42815.1. Samples: 8693362180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 22:41:09,966][15401] Updated weights for policy 0, policy_version 530600 (0.0023) [2024-06-23 22:41:12,990][15401] Updated weights for policy 0, policy_version 530610 (0.0029) [2024-06-23 22:41:13,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8693514240. Throughput: 0: 43032.4. Samples: 8693623980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:13,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-23 22:41:17,435][15401] Updated weights for policy 0, policy_version 530620 (0.0043) [2024-06-23 22:41:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8693694464. Throughput: 0: 42772.8. Samples: 8693879480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:18,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 22:41:21,110][15401] Updated weights for policy 0, policy_version 530630 (0.0034) [2024-06-23 22:41:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 8693940224. Throughput: 0: 42988.9. Samples: 8694006440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:23,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 22:41:24,989][15401] Updated weights for policy 0, policy_version 530640 (0.0032) [2024-06-23 22:41:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 8694136832. Throughput: 0: 42965.9. Samples: 8694266520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 22:41:28,681][15401] Updated weights for policy 0, policy_version 530650 (0.0036) [2024-06-23 22:41:32,826][15401] Updated weights for policy 0, policy_version 530660 (0.0037) [2024-06-23 22:41:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 8694366208. Throughput: 0: 42958.4. Samples: 8694524860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 22:41:36,393][15401] Updated weights for policy 0, policy_version 530670 (0.0034) [2024-06-23 22:41:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8694579200. Throughput: 0: 43159.4. Samples: 8694655400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 22:41:40,343][15401] Updated weights for policy 0, policy_version 530680 (0.0030) [2024-06-23 22:41:43,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42599.9, 300 sec: 42709.4). Total num frames: 8694792192. Throughput: 0: 42934.4. Samples: 8694908000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 22:41:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530688_8694792192.pth... [2024-06-23 22:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530062_8684535808.pth [2024-06-23 22:41:44,367][15401] Updated weights for policy 0, policy_version 530690 (0.0033) [2024-06-23 22:41:47,840][15401] Updated weights for policy 0, policy_version 530700 (0.0029) [2024-06-23 22:41:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 8695005184. Throughput: 0: 42745.4. Samples: 8695161800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 22:41:52,012][15401] Updated weights for policy 0, policy_version 530710 (0.0031) [2024-06-23 22:41:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8695201792. Throughput: 0: 42847.9. Samples: 8695290340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 22:41:55,293][15349] Signal inference workers to stop experience collection... (128850 times) [2024-06-23 22:41:55,336][15401] InferenceWorker_p0-w0: stopping experience collection (128850 times) [2024-06-23 22:41:55,352][15349] Signal inference workers to resume experience collection... (128850 times) [2024-06-23 22:41:55,353][15401] InferenceWorker_p0-w0: resuming experience collection (128850 times) [2024-06-23 22:41:55,531][15401] Updated weights for policy 0, policy_version 530720 (0.0033) [2024-06-23 22:41:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8695431168. Throughput: 0: 42767.8. Samples: 8695548540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:41:58,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 22:41:59,418][15401] Updated weights for policy 0, policy_version 530730 (0.0036) [2024-06-23 22:42:02,996][15401] Updated weights for policy 0, policy_version 530740 (0.0033) [2024-06-23 22:42:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 8695660544. Throughput: 0: 42821.8. Samples: 8695806460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:42:03,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 22:42:07,335][15401] Updated weights for policy 0, policy_version 530750 (0.0040) [2024-06-23 22:42:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8695840768. Throughput: 0: 42970.3. Samples: 8695940100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:42:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 22:42:10,560][15401] Updated weights for policy 0, policy_version 530760 (0.0031) [2024-06-23 22:42:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8696070144. Throughput: 0: 42862.1. Samples: 8696195320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:42:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 22:42:14,831][15401] Updated weights for policy 0, policy_version 530770 (0.0039) [2024-06-23 22:42:18,159][15401] Updated weights for policy 0, policy_version 530780 (0.0028) [2024-06-23 22:42:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8696299520. Throughput: 0: 42747.0. Samples: 8696448480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-23 22:42:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 22:42:22,421][15401] Updated weights for policy 0, policy_version 530790 (0.0037) [2024-06-23 22:42:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 8696479744. Throughput: 0: 42843.6. Samples: 8696583460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:23,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 22:42:25,846][15401] Updated weights for policy 0, policy_version 530800 (0.0026) [2024-06-23 22:42:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8696725504. Throughput: 0: 42798.8. Samples: 8696833940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 22:42:29,922][15401] Updated weights for policy 0, policy_version 530810 (0.0043) [2024-06-23 22:42:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8696922112. Throughput: 0: 42953.7. Samples: 8697094720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 22:42:33,632][15401] Updated weights for policy 0, policy_version 530820 (0.0039) [2024-06-23 22:42:37,384][15401] Updated weights for policy 0, policy_version 530830 (0.0033) [2024-06-23 22:42:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 8697118720. Throughput: 0: 42962.4. Samples: 8697223640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-23 22:42:41,190][15401] Updated weights for policy 0, policy_version 530840 (0.0041) [2024-06-23 22:42:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 8697364480. Throughput: 0: 42816.6. Samples: 8697475280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-23 22:42:45,076][15401] Updated weights for policy 0, policy_version 530850 (0.0042) [2024-06-23 22:42:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8697561088. Throughput: 0: 42942.6. Samples: 8697738880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 22:42:48,757][15401] Updated weights for policy 0, policy_version 530860 (0.0035) [2024-06-23 22:42:52,998][15401] Updated weights for policy 0, policy_version 530870 (0.0039) [2024-06-23 22:42:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8697774080. Throughput: 0: 42733.1. Samples: 8697863100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 22:42:56,301][15401] Updated weights for policy 0, policy_version 530880 (0.0041) [2024-06-23 22:42:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8698019840. Throughput: 0: 42719.6. Samples: 8698117700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:42:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 22:43:00,599][15401] Updated weights for policy 0, policy_version 530890 (0.0042) [2024-06-23 22:43:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8698200064. Throughput: 0: 42943.1. Samples: 8698380920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 22:43:04,189][15401] Updated weights for policy 0, policy_version 530900 (0.0043) [2024-06-23 22:43:08,263][15401] Updated weights for policy 0, policy_version 530910 (0.0039) [2024-06-23 22:43:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8698429440. Throughput: 0: 42611.2. Samples: 8698500860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-23 22:43:11,907][15401] Updated weights for policy 0, policy_version 530920 (0.0031) [2024-06-23 22:43:13,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8698675200. Throughput: 0: 42903.1. Samples: 8698764580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 22:43:16,694][15401] Updated weights for policy 0, policy_version 530930 (0.0036) [2024-06-23 22:43:16,726][15349] Signal inference workers to stop experience collection... (128900 times) [2024-06-23 22:43:16,727][15349] Signal inference workers to resume experience collection... (128900 times) [2024-06-23 22:43:16,740][15401] InferenceWorker_p0-w0: stopping experience collection (128900 times) [2024-06-23 22:43:16,741][15401] InferenceWorker_p0-w0: resuming experience collection (128900 times) [2024-06-23 22:43:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8698839040. Throughput: 0: 42882.3. Samples: 8699024420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 22:43:19,538][15401] Updated weights for policy 0, policy_version 530940 (0.0042) [2024-06-23 22:43:23,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42873.2, 300 sec: 42710.4). Total num frames: 8699052032. Throughput: 0: 42708.3. Samples: 8699145520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 22:43:24,040][15401] Updated weights for policy 0, policy_version 530950 (0.0030) [2024-06-23 22:43:27,218][15401] Updated weights for policy 0, policy_version 530960 (0.0026) [2024-06-23 22:43:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 8699314176. Throughput: 0: 42913.4. Samples: 8699406380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 22:43:31,428][15401] Updated weights for policy 0, policy_version 530970 (0.0027) [2024-06-23 22:43:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8699478016. Throughput: 0: 42897.4. Samples: 8699669260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:33,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 22:43:34,726][15401] Updated weights for policy 0, policy_version 530980 (0.0030) [2024-06-23 22:43:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8699707392. Throughput: 0: 42747.6. Samples: 8699786740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 22:43:39,151][15401] Updated weights for policy 0, policy_version 530990 (0.0034) [2024-06-23 22:43:42,387][15401] Updated weights for policy 0, policy_version 531000 (0.0043) [2024-06-23 22:43:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8699936768. Throughput: 0: 42870.2. Samples: 8700046960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:43,393][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 22:43:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531003_8699953152.pth... [2024-06-23 22:43:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530375_8689664000.pth [2024-06-23 22:43:46,686][15401] Updated weights for policy 0, policy_version 531010 (0.0035) [2024-06-23 22:43:48,391][15132] Fps is (10 sec: 40955.5, 60 sec: 42597.6, 300 sec: 42653.8). Total num frames: 8700116992. Throughput: 0: 42802.4. Samples: 8700307080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-23 22:43:48,391][15132] Avg episode reward: [(0, '0.849')] [2024-06-23 22:43:50,246][15401] Updated weights for policy 0, policy_version 531020 (0.0034) [2024-06-23 22:43:53,390][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 8700362752. Throughput: 0: 42883.6. Samples: 8700430620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:43:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-23 22:43:54,062][15401] Updated weights for policy 0, policy_version 531030 (0.0043) [2024-06-23 22:43:58,180][15401] Updated weights for policy 0, policy_version 531040 (0.0042) [2024-06-23 22:43:58,389][15132] Fps is (10 sec: 45880.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8700575744. Throughput: 0: 42810.3. Samples: 8700691040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:43:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 22:44:01,626][15401] Updated weights for policy 0, policy_version 531050 (0.0030) [2024-06-23 22:44:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 8700772352. Throughput: 0: 42673.6. Samples: 8700944740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:03,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:44:06,032][15401] Updated weights for policy 0, policy_version 531060 (0.0043) [2024-06-23 22:44:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8701001728. Throughput: 0: 42818.8. Samples: 8701072360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 22:44:09,679][15401] Updated weights for policy 0, policy_version 531070 (0.0023) [2024-06-23 22:44:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 8701181952. Throughput: 0: 42814.5. Samples: 8701333040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 22:44:13,554][15401] Updated weights for policy 0, policy_version 531080 (0.0030) [2024-06-23 22:44:17,171][15401] Updated weights for policy 0, policy_version 531090 (0.0033) [2024-06-23 22:44:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8701427712. Throughput: 0: 42593.7. Samples: 8701585980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 22:44:21,122][15401] Updated weights for policy 0, policy_version 531100 (0.0024) [2024-06-23 22:44:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 8701657088. Throughput: 0: 42699.1. Samples: 8701708200. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 22:44:24,775][15401] Updated weights for policy 0, policy_version 531110 (0.0038) [2024-06-23 22:44:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 8701820928. Throughput: 0: 42703.7. Samples: 8701968520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 22:44:28,879][15401] Updated weights for policy 0, policy_version 531120 (0.0035) [2024-06-23 22:44:32,372][15401] Updated weights for policy 0, policy_version 531130 (0.0042) [2024-06-23 22:44:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8702050304. Throughput: 0: 42494.0. Samples: 8702219260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 22:44:36,761][15401] Updated weights for policy 0, policy_version 531140 (0.0039) [2024-06-23 22:44:36,913][15349] Signal inference workers to stop experience collection... (128950 times) [2024-06-23 22:44:36,915][15349] Signal inference workers to resume experience collection... (128950 times) [2024-06-23 22:44:36,965][15401] InferenceWorker_p0-w0: stopping experience collection (128950 times) [2024-06-23 22:44:36,965][15401] InferenceWorker_p0-w0: resuming experience collection (128950 times) [2024-06-23 22:44:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8702296064. Throughput: 0: 42702.6. Samples: 8702352240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-23 22:44:40,281][15401] Updated weights for policy 0, policy_version 531150 (0.0039) [2024-06-23 22:44:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42053.9, 300 sec: 42710.3). Total num frames: 8702459904. Throughput: 0: 42485.3. Samples: 8702602880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 22:44:44,458][15401] Updated weights for policy 0, policy_version 531160 (0.0026) [2024-06-23 22:44:47,850][15401] Updated weights for policy 0, policy_version 531170 (0.0025) [2024-06-23 22:44:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42872.3, 300 sec: 42765.0). Total num frames: 8702689280. Throughput: 0: 42403.8. Samples: 8702852900. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 22:44:52,022][15401] Updated weights for policy 0, policy_version 531180 (0.0027) [2024-06-23 22:44:53,395][15132] Fps is (10 sec: 45851.9, 60 sec: 42594.7, 300 sec: 42708.7). Total num frames: 8702918656. Throughput: 0: 42601.3. Samples: 8702989640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:53,395][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 22:44:55,420][15401] Updated weights for policy 0, policy_version 531190 (0.0027) [2024-06-23 22:44:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 8703115264. Throughput: 0: 42421.3. Samples: 8703242000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:44:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 22:44:59,910][15401] Updated weights for policy 0, policy_version 531200 (0.0034) [2024-06-23 22:45:03,389][15132] Fps is (10 sec: 40981.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8703328256. Throughput: 0: 42403.6. Samples: 8703494140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:45:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 22:45:03,798][15401] Updated weights for policy 0, policy_version 531210 (0.0025) [2024-06-23 22:45:07,581][15401] Updated weights for policy 0, policy_version 531220 (0.0034) [2024-06-23 22:45:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8703541248. Throughput: 0: 42599.2. Samples: 8703625160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:45:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-23 22:45:11,299][15401] Updated weights for policy 0, policy_version 531230 (0.0041) [2024-06-23 22:45:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8703754240. Throughput: 0: 42456.9. Samples: 8703879080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-23 22:45:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 22:45:15,157][15401] Updated weights for policy 0, policy_version 531240 (0.0040) [2024-06-23 22:45:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8703967232. Throughput: 0: 42421.7. Samples: 8704128240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-23 22:45:18,927][15401] Updated weights for policy 0, policy_version 531250 (0.0032) [2024-06-23 22:45:22,731][15401] Updated weights for policy 0, policy_version 531260 (0.0041) [2024-06-23 22:45:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8704180224. Throughput: 0: 42387.1. Samples: 8704259660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 22:45:26,575][15401] Updated weights for policy 0, policy_version 531270 (0.0033) [2024-06-23 22:45:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8704376832. Throughput: 0: 42434.4. Samples: 8704512420. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 22:45:30,773][15401] Updated weights for policy 0, policy_version 531280 (0.0033) [2024-06-23 22:45:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8704622592. Throughput: 0: 42436.3. Samples: 8704762540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 22:45:34,122][15401] Updated weights for policy 0, policy_version 531290 (0.0036) [2024-06-23 22:45:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.3, 300 sec: 42598.7). Total num frames: 8704802816. Throughput: 0: 42357.3. Samples: 8704895500. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-23 22:45:38,425][15401] Updated weights for policy 0, policy_version 531300 (0.0029) [2024-06-23 22:45:41,685][15401] Updated weights for policy 0, policy_version 531310 (0.0036) [2024-06-23 22:45:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8705032192. Throughput: 0: 42469.3. Samples: 8705153120. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 22:45:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531313_8705032192.pth... [2024-06-23 22:45:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000530688_8694792192.pth [2024-06-23 22:45:45,848][15401] Updated weights for policy 0, policy_version 531320 (0.0030) [2024-06-23 22:45:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8705261568. Throughput: 0: 42550.2. Samples: 8705408900. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:45:49,509][15401] Updated weights for policy 0, policy_version 531330 (0.0031) [2024-06-23 22:45:53,312][15401] Updated weights for policy 0, policy_version 531340 (0.0034) [2024-06-23 22:45:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42602.0, 300 sec: 42709.5). Total num frames: 8705474560. Throughput: 0: 42547.9. Samples: 8705539820. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 22:45:57,136][15401] Updated weights for policy 0, policy_version 531350 (0.0030) [2024-06-23 22:45:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 8705654784. Throughput: 0: 42563.9. Samples: 8705794460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:45:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:46:00,817][15401] Updated weights for policy 0, policy_version 531360 (0.0031) [2024-06-23 22:46:01,994][15349] Signal inference workers to stop experience collection... (129000 times) [2024-06-23 22:46:01,994][15349] Signal inference workers to resume experience collection... (129000 times) [2024-06-23 22:46:02,039][15401] InferenceWorker_p0-w0: stopping experience collection (129000 times) [2024-06-23 22:46:02,040][15401] InferenceWorker_p0-w0: resuming experience collection (129000 times) [2024-06-23 22:46:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8705900544. Throughput: 0: 42823.6. Samples: 8706055300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 22:46:04,730][15401] Updated weights for policy 0, policy_version 531370 (0.0042) [2024-06-23 22:46:08,265][15401] Updated weights for policy 0, policy_version 531380 (0.0036) [2024-06-23 22:46:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8706129920. Throughput: 0: 43092.0. Samples: 8706198800. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:08,393][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 22:46:12,399][15401] Updated weights for policy 0, policy_version 531390 (0.0045) [2024-06-23 22:46:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8706310144. Throughput: 0: 42982.5. Samples: 8706446640. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-23 22:46:15,813][15401] Updated weights for policy 0, policy_version 531400 (0.0042) [2024-06-23 22:46:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8706539520. Throughput: 0: 43114.3. Samples: 8706702680. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 22:46:19,926][15401] Updated weights for policy 0, policy_version 531410 (0.0034) [2024-06-23 22:46:23,390][15132] Fps is (10 sec: 44233.6, 60 sec: 42870.9, 300 sec: 42764.9). Total num frames: 8706752512. Throughput: 0: 43177.0. Samples: 8706838500. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:23,391][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 22:46:23,951][15401] Updated weights for policy 0, policy_version 531420 (0.0039) [2024-06-23 22:46:27,657][15401] Updated weights for policy 0, policy_version 531430 (0.0030) [2024-06-23 22:46:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8706965504. Throughput: 0: 43072.4. Samples: 8707091380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 22:46:31,395][15401] Updated weights for policy 0, policy_version 531440 (0.0036) [2024-06-23 22:46:33,389][15132] Fps is (10 sec: 44240.7, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 8707194880. Throughput: 0: 43050.3. Samples: 8707346160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 22:46:35,195][15401] Updated weights for policy 0, policy_version 531450 (0.0035) [2024-06-23 22:46:38,396][15132] Fps is (10 sec: 44209.0, 60 sec: 43413.0, 300 sec: 42764.1). Total num frames: 8707407872. Throughput: 0: 43074.0. Samples: 8707478420. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:38,396][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 22:46:39,168][15401] Updated weights for policy 0, policy_version 531460 (0.0032) [2024-06-23 22:46:42,985][15401] Updated weights for policy 0, policy_version 531470 (0.0027) [2024-06-23 22:46:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8707604480. Throughput: 0: 43124.0. Samples: 8707735040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 20.0) [2024-06-23 22:46:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 22:46:46,660][15401] Updated weights for policy 0, policy_version 531480 (0.0029) [2024-06-23 22:46:48,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8707833856. Throughput: 0: 43025.3. Samples: 8707991440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:46:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 22:46:50,567][15401] Updated weights for policy 0, policy_version 531490 (0.0029) [2024-06-23 22:46:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8708046848. Throughput: 0: 42843.6. Samples: 8708126760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:46:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 22:46:53,872][15401] Updated weights for policy 0, policy_version 531500 (0.0030) [2024-06-23 22:46:58,148][15401] Updated weights for policy 0, policy_version 531510 (0.0035) [2024-06-23 22:46:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8708259840. Throughput: 0: 43020.1. Samples: 8708382540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:46:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 22:47:01,385][15401] Updated weights for policy 0, policy_version 531520 (0.0038) [2024-06-23 22:47:03,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 8708489216. Throughput: 0: 43009.0. Samples: 8708638360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:03,396][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 22:47:05,715][15401] Updated weights for policy 0, policy_version 531530 (0.0032) [2024-06-23 22:47:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8708702208. Throughput: 0: 43047.4. Samples: 8708775600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 22:47:09,088][15401] Updated weights for policy 0, policy_version 531540 (0.0034) [2024-06-23 22:47:13,133][15401] Updated weights for policy 0, policy_version 531550 (0.0035) [2024-06-23 22:47:13,390][15132] Fps is (10 sec: 42625.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8708915200. Throughput: 0: 43292.9. Samples: 8709039560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 22:47:16,620][15401] Updated weights for policy 0, policy_version 531560 (0.0021) [2024-06-23 22:47:17,592][15349] Signal inference workers to stop experience collection... (129050 times) [2024-06-23 22:47:17,647][15401] InferenceWorker_p0-w0: stopping experience collection (129050 times) [2024-06-23 22:47:17,657][15349] Signal inference workers to resume experience collection... (129050 times) [2024-06-23 22:47:17,658][15401] InferenceWorker_p0-w0: resuming experience collection (129050 times) [2024-06-23 22:47:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 8709144576. Throughput: 0: 43141.3. Samples: 8709287520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 22:47:20,635][15401] Updated weights for policy 0, policy_version 531570 (0.0037) [2024-06-23 22:47:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43418.2, 300 sec: 42820.6). Total num frames: 8709357568. Throughput: 0: 43271.9. Samples: 8709425380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 22:47:24,194][15401] Updated weights for policy 0, policy_version 531580 (0.0029) [2024-06-23 22:47:28,279][15401] Updated weights for policy 0, policy_version 531590 (0.0024) [2024-06-23 22:47:28,391][15132] Fps is (10 sec: 42592.8, 60 sec: 43416.7, 300 sec: 42875.9). Total num frames: 8709570560. Throughput: 0: 43397.5. Samples: 8709687980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:28,391][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 22:47:31,806][15401] Updated weights for policy 0, policy_version 531600 (0.0042) [2024-06-23 22:47:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8709783552. Throughput: 0: 43245.3. Samples: 8709937480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:33,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-23 22:47:36,121][15401] Updated weights for policy 0, policy_version 531610 (0.0027) [2024-06-23 22:47:38,390][15132] Fps is (10 sec: 42603.7, 60 sec: 43149.1, 300 sec: 42820.5). Total num frames: 8709996544. Throughput: 0: 43202.6. Samples: 8710070880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-23 22:47:39,483][15401] Updated weights for policy 0, policy_version 531620 (0.0021) [2024-06-23 22:47:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8710193152. Throughput: 0: 43229.8. Samples: 8710327880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 22:47:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531629_8710209536.pth... [2024-06-23 22:47:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531003_8699953152.pth [2024-06-23 22:47:43,822][15401] Updated weights for policy 0, policy_version 531630 (0.0033) [2024-06-23 22:47:47,308][15401] Updated weights for policy 0, policy_version 531640 (0.0035) [2024-06-23 22:47:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8710438912. Throughput: 0: 43130.9. Samples: 8710578980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 22:47:51,602][15401] Updated weights for policy 0, policy_version 531650 (0.0043) [2024-06-23 22:47:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8710651904. Throughput: 0: 43160.9. Samples: 8710717840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 22:47:54,743][15401] Updated weights for policy 0, policy_version 531660 (0.0035) [2024-06-23 22:47:58,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8710832128. Throughput: 0: 42965.4. Samples: 8710973000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:47:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 22:47:59,134][15401] Updated weights for policy 0, policy_version 531670 (0.0035) [2024-06-23 22:48:02,579][15401] Updated weights for policy 0, policy_version 531680 (0.0034) [2024-06-23 22:48:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43422.2, 300 sec: 42931.6). Total num frames: 8711094272. Throughput: 0: 43111.1. Samples: 8711227520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:48:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-23 22:48:06,951][15401] Updated weights for policy 0, policy_version 531690 (0.0032) [2024-06-23 22:48:08,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 8711307264. Throughput: 0: 43241.7. Samples: 8711371360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:48:08,393][15132] Avg episode reward: [(0, '0.335')] [2024-06-23 22:48:09,824][15401] Updated weights for policy 0, policy_version 531700 (0.0030) [2024-06-23 22:48:13,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8711471104. Throughput: 0: 43112.5. Samples: 8711627980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-23 22:48:13,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 22:48:14,260][15401] Updated weights for policy 0, policy_version 531710 (0.0037) [2024-06-23 22:48:17,294][15401] Updated weights for policy 0, policy_version 531720 (0.0033) [2024-06-23 22:48:18,392][15132] Fps is (10 sec: 44236.8, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 8711749632. Throughput: 0: 43196.9. Samples: 8711881440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:18,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 22:48:21,803][15401] Updated weights for policy 0, policy_version 531730 (0.0034) [2024-06-23 22:48:23,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8711946240. Throughput: 0: 43354.2. Samples: 8712021820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 22:48:24,744][15401] Updated weights for policy 0, policy_version 531740 (0.0039) [2024-06-23 22:48:28,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42599.3, 300 sec: 42876.1). Total num frames: 8712126464. Throughput: 0: 43288.4. Samples: 8712275860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 22:48:29,357][15401] Updated weights for policy 0, policy_version 531750 (0.0038) [2024-06-23 22:48:31,341][15349] Signal inference workers to stop experience collection... (129100 times) [2024-06-23 22:48:31,350][15349] Signal inference workers to resume experience collection... (129100 times) [2024-06-23 22:48:31,396][15401] InferenceWorker_p0-w0: stopping experience collection (129100 times) [2024-06-23 22:48:31,396][15401] InferenceWorker_p0-w0: resuming experience collection (129100 times) [2024-06-23 22:48:32,550][15401] Updated weights for policy 0, policy_version 531760 (0.0027) [2024-06-23 22:48:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 8712404992. Throughput: 0: 43292.1. Samples: 8712527120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 22:48:36,984][15401] Updated weights for policy 0, policy_version 531770 (0.0034) [2024-06-23 22:48:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 8712585216. Throughput: 0: 43433.8. Samples: 8712672360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 22:48:40,230][15401] Updated weights for policy 0, policy_version 531780 (0.0023) [2024-06-23 22:48:43,390][15132] Fps is (10 sec: 37683.0, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 8712781824. Throughput: 0: 43198.2. Samples: 8712916920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 22:48:44,693][15401] Updated weights for policy 0, policy_version 531790 (0.0033) [2024-06-23 22:48:47,639][15401] Updated weights for policy 0, policy_version 531800 (0.0042) [2024-06-23 22:48:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8713027584. Throughput: 0: 43334.2. Samples: 8713177560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:48,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 22:48:52,307][15401] Updated weights for policy 0, policy_version 531810 (0.0022) [2024-06-23 22:48:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8713224192. Throughput: 0: 43140.9. Samples: 8713312600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 22:48:55,209][15401] Updated weights for policy 0, policy_version 531820 (0.0030) [2024-06-23 22:48:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8713437184. Throughput: 0: 42974.9. Samples: 8713561860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:48:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 22:48:59,892][15401] Updated weights for policy 0, policy_version 531830 (0.0038) [2024-06-23 22:49:02,806][15401] Updated weights for policy 0, policy_version 531840 (0.0028) [2024-06-23 22:49:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8713666560. Throughput: 0: 43013.8. Samples: 8713816960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 22:49:07,539][15401] Updated weights for policy 0, policy_version 531850 (0.0029) [2024-06-23 22:49:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 8713863168. Throughput: 0: 43034.3. Samples: 8713958360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:08,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-23 22:49:10,841][15401] Updated weights for policy 0, policy_version 531860 (0.0029) [2024-06-23 22:49:13,393][15132] Fps is (10 sec: 40944.3, 60 sec: 43414.7, 300 sec: 42875.5). Total num frames: 8714076160. Throughput: 0: 42822.1. Samples: 8714203020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:13,394][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 22:49:15,378][15401] Updated weights for policy 0, policy_version 531870 (0.0026) [2024-06-23 22:49:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 8714305536. Throughput: 0: 43080.6. Samples: 8714465740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-23 22:49:18,507][15401] Updated weights for policy 0, policy_version 531880 (0.0033) [2024-06-23 22:49:23,089][15401] Updated weights for policy 0, policy_version 531890 (0.0033) [2024-06-23 22:49:23,389][15132] Fps is (10 sec: 42615.3, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 8714502144. Throughput: 0: 42670.3. Samples: 8714592520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-23 22:49:26,192][15401] Updated weights for policy 0, policy_version 531900 (0.0031) [2024-06-23 22:49:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8714715136. Throughput: 0: 42797.8. Samples: 8714842820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 22:49:30,617][15401] Updated weights for policy 0, policy_version 531910 (0.0043) [2024-06-23 22:49:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8714944512. Throughput: 0: 42867.6. Samples: 8715106600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 22:49:33,722][15401] Updated weights for policy 0, policy_version 531920 (0.0033) [2024-06-23 22:49:37,962][15349] Signal inference workers to stop experience collection... (129150 times) [2024-06-23 22:49:37,968][15349] Signal inference workers to resume experience collection... (129150 times) [2024-06-23 22:49:37,998][15401] InferenceWorker_p0-w0: stopping experience collection (129150 times) [2024-06-23 22:49:37,998][15401] InferenceWorker_p0-w0: resuming experience collection (129150 times) [2024-06-23 22:49:38,109][15401] Updated weights for policy 0, policy_version 531930 (0.0047) [2024-06-23 22:49:38,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 43041.8). Total num frames: 8715157504. Throughput: 0: 42590.5. Samples: 8715229440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:38,396][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 22:49:41,269][15401] Updated weights for policy 0, policy_version 531940 (0.0038) [2024-06-23 22:49:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8715370496. Throughput: 0: 42842.7. Samples: 8715489780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-23 22:49:43,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-23 22:49:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531944_8715370496.pth... [2024-06-23 22:49:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531313_8705032192.pth [2024-06-23 22:49:45,547][15401] Updated weights for policy 0, policy_version 531950 (0.0030) [2024-06-23 22:49:48,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42932.4). Total num frames: 8715583488. Throughput: 0: 42845.9. Samples: 8715745020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:49:48,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-23 22:49:48,946][15401] Updated weights for policy 0, policy_version 531960 (0.0042) [2024-06-23 22:49:53,167][15401] Updated weights for policy 0, policy_version 531970 (0.0041) [2024-06-23 22:49:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8715796480. Throughput: 0: 42614.6. Samples: 8715876020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:49:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 22:49:56,644][15401] Updated weights for policy 0, policy_version 531980 (0.0031) [2024-06-23 22:49:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8716009472. Throughput: 0: 42864.6. Samples: 8716131760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:49:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 22:50:00,633][15401] Updated weights for policy 0, policy_version 531990 (0.0038) [2024-06-23 22:50:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8716238848. Throughput: 0: 42773.6. Samples: 8716390560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 22:50:04,841][15401] Updated weights for policy 0, policy_version 532000 (0.0033) [2024-06-23 22:50:08,367][15401] Updated weights for policy 0, policy_version 532010 (0.0026) [2024-06-23 22:50:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8716451840. Throughput: 0: 42851.1. Samples: 8716520820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-23 22:50:12,540][15401] Updated weights for policy 0, policy_version 532020 (0.0033) [2024-06-23 22:50:13,391][15132] Fps is (10 sec: 42593.6, 60 sec: 43146.5, 300 sec: 43042.5). Total num frames: 8716664832. Throughput: 0: 42982.0. Samples: 8716777060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:13,391][15132] Avg episode reward: [(0, '0.564')] [2024-06-23 22:50:15,791][15401] Updated weights for policy 0, policy_version 532030 (0.0044) [2024-06-23 22:50:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 8716877824. Throughput: 0: 42783.5. Samples: 8717031860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:18,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-23 22:50:20,284][15401] Updated weights for policy 0, policy_version 532040 (0.0035) [2024-06-23 22:50:23,389][15132] Fps is (10 sec: 40964.7, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8717074432. Throughput: 0: 42943.9. Samples: 8717161640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 22:50:23,619][15401] Updated weights for policy 0, policy_version 532050 (0.0054) [2024-06-23 22:50:27,759][15401] Updated weights for policy 0, policy_version 532060 (0.0030) [2024-06-23 22:50:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 8717303808. Throughput: 0: 42950.2. Samples: 8717422640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:28,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 22:50:31,286][15401] Updated weights for policy 0, policy_version 532070 (0.0040) [2024-06-23 22:50:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 8717533184. Throughput: 0: 42805.7. Samples: 8717671280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 22:50:35,315][15401] Updated weights for policy 0, policy_version 532080 (0.0047) [2024-06-23 22:50:38,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42329.9, 300 sec: 42931.6). Total num frames: 8717697024. Throughput: 0: 42848.1. Samples: 8717804180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 22:50:39,204][15401] Updated weights for policy 0, policy_version 532090 (0.0035) [2024-06-23 22:50:43,044][15401] Updated weights for policy 0, policy_version 532100 (0.0031) [2024-06-23 22:50:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 8717942784. Throughput: 0: 42895.6. Samples: 8718062160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:43,392][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 22:50:46,770][15401] Updated weights for policy 0, policy_version 532110 (0.0051) [2024-06-23 22:50:48,391][15132] Fps is (10 sec: 47507.5, 60 sec: 43143.6, 300 sec: 43042.5). Total num frames: 8718172160. Throughput: 0: 42748.2. Samples: 8718314280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:48,391][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 22:50:50,626][15401] Updated weights for policy 0, policy_version 532120 (0.0034) [2024-06-23 22:50:53,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.7, 300 sec: 43042.4). Total num frames: 8718352384. Throughput: 0: 42943.9. Samples: 8718453400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:53,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 22:50:54,033][15349] Signal inference workers to stop experience collection... (129200 times) [2024-06-23 22:50:54,033][15349] Signal inference workers to resume experience collection... (129200 times) [2024-06-23 22:50:54,061][15401] InferenceWorker_p0-w0: stopping experience collection (129200 times) [2024-06-23 22:50:54,088][15401] InferenceWorker_p0-w0: resuming experience collection (129200 times) [2024-06-23 22:50:54,171][15401] Updated weights for policy 0, policy_version 532130 (0.0029) [2024-06-23 22:50:58,107][15401] Updated weights for policy 0, policy_version 532140 (0.0037) [2024-06-23 22:50:58,390][15132] Fps is (10 sec: 40964.5, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 8718581760. Throughput: 0: 42997.8. Samples: 8718711920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:50:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 22:51:01,660][15401] Updated weights for policy 0, policy_version 532150 (0.0037) [2024-06-23 22:51:03,389][15132] Fps is (10 sec: 47525.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8718827520. Throughput: 0: 42957.9. Samples: 8718964960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:51:03,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-23 22:51:05,586][15401] Updated weights for policy 0, policy_version 532160 (0.0025) [2024-06-23 22:51:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 8719007744. Throughput: 0: 43200.5. Samples: 8719105660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:51:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 22:51:09,069][15401] Updated weights for policy 0, policy_version 532170 (0.0027) [2024-06-23 22:51:13,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42599.1, 300 sec: 42987.2). Total num frames: 8719220736. Throughput: 0: 43034.6. Samples: 8719359100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-23 22:51:13,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 22:51:13,754][15401] Updated weights for policy 0, policy_version 532180 (0.0041) [2024-06-23 22:51:16,821][15401] Updated weights for policy 0, policy_version 532190 (0.0046) [2024-06-23 22:51:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.5, 300 sec: 43153.9). Total num frames: 8719482880. Throughput: 0: 43222.6. Samples: 8719616300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:18,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 22:51:21,189][15401] Updated weights for policy 0, policy_version 532200 (0.0042) [2024-06-23 22:51:23,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8719663104. Throughput: 0: 43316.1. Samples: 8719753400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 22:51:24,323][15401] Updated weights for policy 0, policy_version 532210 (0.0047) [2024-06-23 22:51:28,392][15132] Fps is (10 sec: 37674.7, 60 sec: 42598.4, 300 sec: 42931.3). Total num frames: 8719859712. Throughput: 0: 43166.7. Samples: 8720004660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:28,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 22:51:28,777][15401] Updated weights for policy 0, policy_version 532220 (0.0041) [2024-06-23 22:51:32,195][15401] Updated weights for policy 0, policy_version 532230 (0.0038) [2024-06-23 22:51:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 43043.6). Total num frames: 8720105472. Throughput: 0: 43115.5. Samples: 8720254420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 22:51:36,455][15401] Updated weights for policy 0, policy_version 532240 (0.0038) [2024-06-23 22:51:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 8720302080. Throughput: 0: 43159.2. Samples: 8720395460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 22:51:39,898][15401] Updated weights for policy 0, policy_version 532250 (0.0033) [2024-06-23 22:51:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 8720515072. Throughput: 0: 42947.2. Samples: 8720644540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 22:51:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532258_8720515072.pth... [2024-06-23 22:51:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531629_8710209536.pth [2024-06-23 22:51:43,886][15401] Updated weights for policy 0, policy_version 532260 (0.0032) [2024-06-23 22:51:47,280][15401] Updated weights for policy 0, policy_version 532270 (0.0031) [2024-06-23 22:51:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42872.3, 300 sec: 43042.7). Total num frames: 8720744448. Throughput: 0: 43067.4. Samples: 8720903000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-23 22:51:51,905][15401] Updated weights for policy 0, policy_version 532280 (0.0033) [2024-06-23 22:51:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 8720957440. Throughput: 0: 42931.0. Samples: 8721037560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:53,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-23 22:51:54,876][15401] Updated weights for policy 0, policy_version 532290 (0.0029) [2024-06-23 22:51:58,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42987.8). Total num frames: 8721170432. Throughput: 0: 42884.5. Samples: 8721289000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:51:58,393][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 22:51:59,277][15401] Updated weights for policy 0, policy_version 532300 (0.0028) [2024-06-23 22:52:02,738][15401] Updated weights for policy 0, policy_version 532310 (0.0040) [2024-06-23 22:52:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 8721383424. Throughput: 0: 42947.1. Samples: 8721548920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 22:52:06,769][15401] Updated weights for policy 0, policy_version 532320 (0.0046) [2024-06-23 22:52:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8721580032. Throughput: 0: 42689.2. Samples: 8721674420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 22:52:10,277][15401] Updated weights for policy 0, policy_version 532330 (0.0036) [2024-06-23 22:52:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.8, 300 sec: 42987.2). Total num frames: 8721825792. Throughput: 0: 42901.9. Samples: 8721935140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 22:52:14,688][15401] Updated weights for policy 0, policy_version 532340 (0.0030) [2024-06-23 22:52:16,004][15349] Signal inference workers to stop experience collection... (129250 times) [2024-06-23 22:52:16,008][15349] Signal inference workers to resume experience collection... (129250 times) [2024-06-23 22:52:16,023][15401] InferenceWorker_p0-w0: stopping experience collection (129250 times) [2024-06-23 22:52:16,023][15401] InferenceWorker_p0-w0: resuming experience collection (129250 times) [2024-06-23 22:52:17,887][15401] Updated weights for policy 0, policy_version 532350 (0.0025) [2024-06-23 22:52:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 8722022400. Throughput: 0: 43051.6. Samples: 8722191740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 22:52:22,121][15401] Updated weights for policy 0, policy_version 532360 (0.0025) [2024-06-23 22:52:23,392][15132] Fps is (10 sec: 40948.5, 60 sec: 42869.4, 300 sec: 42931.4). Total num frames: 8722235392. Throughput: 0: 42825.4. Samples: 8722322720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:23,393][15132] Avg episode reward: [(0, '0.277')] [2024-06-23 22:52:25,534][15401] Updated weights for policy 0, policy_version 532370 (0.0035) [2024-06-23 22:52:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43419.3, 300 sec: 42987.2). Total num frames: 8722464768. Throughput: 0: 43057.0. Samples: 8722582100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:28,390][15132] Avg episode reward: [(0, '0.133')] [2024-06-23 22:52:29,571][15401] Updated weights for policy 0, policy_version 532380 (0.0029) [2024-06-23 22:52:33,258][15401] Updated weights for policy 0, policy_version 532390 (0.0029) [2024-06-23 22:52:33,389][15132] Fps is (10 sec: 44249.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8722677760. Throughput: 0: 43022.8. Samples: 8722839020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 22:52:37,017][15401] Updated weights for policy 0, policy_version 532400 (0.0028) [2024-06-23 22:52:38,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.9, 300 sec: 43041.8). Total num frames: 8722890752. Throughput: 0: 42823.7. Samples: 8722964900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 22:52:38,397][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 22:52:40,785][15401] Updated weights for policy 0, policy_version 532410 (0.0034) [2024-06-23 22:52:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 8723120128. Throughput: 0: 43107.1. Samples: 8723228720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:52:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 22:52:44,398][15401] Updated weights for policy 0, policy_version 532420 (0.0040) [2024-06-23 22:52:48,394][15132] Fps is (10 sec: 42607.3, 60 sec: 42868.5, 300 sec: 42931.0). Total num frames: 8723316736. Throughput: 0: 43024.9. Samples: 8723485220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:52:48,394][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 22:52:48,642][15401] Updated weights for policy 0, policy_version 532430 (0.0034) [2024-06-23 22:52:51,822][15401] Updated weights for policy 0, policy_version 532440 (0.0029) [2024-06-23 22:52:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8723529728. Throughput: 0: 43006.7. Samples: 8723609720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:52:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 22:52:56,329][15401] Updated weights for policy 0, policy_version 532450 (0.0037) [2024-06-23 22:52:58,389][15132] Fps is (10 sec: 44256.1, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 8723759104. Throughput: 0: 43121.7. Samples: 8723875620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:52:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 22:52:59,172][15401] Updated weights for policy 0, policy_version 532460 (0.0032) [2024-06-23 22:53:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8723955712. Throughput: 0: 43131.0. Samples: 8724132640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 22:53:03,989][15401] Updated weights for policy 0, policy_version 532470 (0.0034) [2024-06-23 22:53:07,116][15401] Updated weights for policy 0, policy_version 532480 (0.0036) [2024-06-23 22:53:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 43042.3). Total num frames: 8724168704. Throughput: 0: 42813.6. Samples: 8724249320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:08,393][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 22:53:11,653][15401] Updated weights for policy 0, policy_version 532490 (0.0034) [2024-06-23 22:53:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 8724414464. Throughput: 0: 43034.8. Samples: 8724518660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:13,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-23 22:53:14,612][15401] Updated weights for policy 0, policy_version 532500 (0.0042) [2024-06-23 22:53:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8724594688. Throughput: 0: 43088.3. Samples: 8724778000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 22:53:19,189][15401] Updated weights for policy 0, policy_version 532510 (0.0033) [2024-06-23 22:53:22,067][15401] Updated weights for policy 0, policy_version 532520 (0.0032) [2024-06-23 22:53:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43146.4, 300 sec: 43042.7). Total num frames: 8724824064. Throughput: 0: 42998.9. Samples: 8724899580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 22:53:26,662][15401] Updated weights for policy 0, policy_version 532530 (0.0034) [2024-06-23 22:53:27,623][15349] Signal inference workers to stop experience collection... (129300 times) [2024-06-23 22:53:27,676][15349] Signal inference workers to resume experience collection... (129300 times) [2024-06-23 22:53:27,676][15401] InferenceWorker_p0-w0: stopping experience collection (129300 times) [2024-06-23 22:53:27,691][15401] InferenceWorker_p0-w0: resuming experience collection (129300 times) [2024-06-23 22:53:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8725053440. Throughput: 0: 43009.4. Samples: 8725164140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 22:53:29,530][15401] Updated weights for policy 0, policy_version 532540 (0.0030) [2024-06-23 22:53:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8725250048. Throughput: 0: 43186.5. Samples: 8725428420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 22:53:34,085][15401] Updated weights for policy 0, policy_version 532550 (0.0045) [2024-06-23 22:53:37,566][15401] Updated weights for policy 0, policy_version 532560 (0.0030) [2024-06-23 22:53:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43149.1, 300 sec: 43042.7). Total num frames: 8725479424. Throughput: 0: 43168.8. Samples: 8725552320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-23 22:53:41,515][15401] Updated weights for policy 0, policy_version 532570 (0.0025) [2024-06-23 22:53:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8725708800. Throughput: 0: 43195.6. Samples: 8725819420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-23 22:53:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532575_8725708800.pth... [2024-06-23 22:53:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000531944_8715370496.pth [2024-06-23 22:53:45,158][15401] Updated weights for policy 0, policy_version 532580 (0.0032) [2024-06-23 22:53:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42874.5, 300 sec: 42931.6). Total num frames: 8725889024. Throughput: 0: 43212.4. Samples: 8726077200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 22:53:49,776][15401] Updated weights for policy 0, policy_version 532590 (0.0042) [2024-06-23 22:53:52,925][15401] Updated weights for policy 0, policy_version 532600 (0.0037) [2024-06-23 22:53:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8726118400. Throughput: 0: 43285.5. Samples: 8726197060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-23 22:53:57,341][15401] Updated weights for policy 0, policy_version 532610 (0.0039) [2024-06-23 22:53:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 8726331392. Throughput: 0: 43278.7. Samples: 8726466200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:53:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 22:54:00,258][15401] Updated weights for policy 0, policy_version 532620 (0.0039) [2024-06-23 22:54:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8726528000. Throughput: 0: 43297.3. Samples: 8726726380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:54:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 22:54:04,721][15401] Updated weights for policy 0, policy_version 532630 (0.0037) [2024-06-23 22:54:08,105][15401] Updated weights for policy 0, policy_version 532640 (0.0040) [2024-06-23 22:54:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 43043.3). Total num frames: 8726773760. Throughput: 0: 43324.0. Samples: 8726849160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 22:54:08,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-23 22:54:12,507][15401] Updated weights for policy 0, policy_version 532650 (0.0033) [2024-06-23 22:54:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8726970368. Throughput: 0: 43177.0. Samples: 8727107100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 22:54:15,584][15401] Updated weights for policy 0, policy_version 532660 (0.0047) [2024-06-23 22:54:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8727166976. Throughput: 0: 43084.4. Samples: 8727367220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 22:54:20,167][15401] Updated weights for policy 0, policy_version 532670 (0.0029) [2024-06-23 22:54:23,101][15401] Updated weights for policy 0, policy_version 532680 (0.0034) [2024-06-23 22:54:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 8727429120. Throughput: 0: 43024.0. Samples: 8727488400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-23 22:54:27,675][15401] Updated weights for policy 0, policy_version 532690 (0.0039) [2024-06-23 22:54:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 8727609344. Throughput: 0: 42802.2. Samples: 8727745520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 22:54:30,510][15401] Updated weights for policy 0, policy_version 532700 (0.0042) [2024-06-23 22:54:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 8727805952. Throughput: 0: 42997.9. Samples: 8728012100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-23 22:54:34,450][15349] Signal inference workers to stop experience collection... (129350 times) [2024-06-23 22:54:34,452][15349] Signal inference workers to resume experience collection... (129350 times) [2024-06-23 22:54:34,487][15401] InferenceWorker_p0-w0: stopping experience collection (129350 times) [2024-06-23 22:54:34,487][15401] InferenceWorker_p0-w0: resuming experience collection (129350 times) [2024-06-23 22:54:35,131][15401] Updated weights for policy 0, policy_version 532710 (0.0037) [2024-06-23 22:54:38,142][15401] Updated weights for policy 0, policy_version 532720 (0.0030) [2024-06-23 22:54:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 8728084480. Throughput: 0: 43101.7. Samples: 8728136640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 22:54:43,032][15401] Updated weights for policy 0, policy_version 532730 (0.0035) [2024-06-23 22:54:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 8728264704. Throughput: 0: 42880.3. Samples: 8728395820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 22:54:45,682][15401] Updated weights for policy 0, policy_version 532740 (0.0035) [2024-06-23 22:54:48,394][15132] Fps is (10 sec: 37668.2, 60 sec: 42868.7, 300 sec: 42931.1). Total num frames: 8728461312. Throughput: 0: 42782.9. Samples: 8728651780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:48,394][15132] Avg episode reward: [(0, '0.276')] [2024-06-23 22:54:50,518][15401] Updated weights for policy 0, policy_version 532750 (0.0029) [2024-06-23 22:54:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 8728723456. Throughput: 0: 42872.6. Samples: 8728778420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 22:54:53,901][15401] Updated weights for policy 0, policy_version 532760 (0.0041) [2024-06-23 22:54:58,000][15401] Updated weights for policy 0, policy_version 532770 (0.0028) [2024-06-23 22:54:58,390][15132] Fps is (10 sec: 44254.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8728903680. Throughput: 0: 42919.8. Samples: 8729038500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:54:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 22:55:01,331][15401] Updated weights for policy 0, policy_version 532780 (0.0030) [2024-06-23 22:55:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8729116672. Throughput: 0: 42815.4. Samples: 8729293920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 22:55:05,608][15401] Updated weights for policy 0, policy_version 532790 (0.0043) [2024-06-23 22:55:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 43042.9). Total num frames: 8729362432. Throughput: 0: 43068.1. Samples: 8729426460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-23 22:55:08,810][15401] Updated weights for policy 0, policy_version 532800 (0.0035) [2024-06-23 22:55:13,243][15401] Updated weights for policy 0, policy_version 532810 (0.0040) [2024-06-23 22:55:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 8729559040. Throughput: 0: 43221.3. Samples: 8729690480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:13,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-23 22:55:16,387][15401] Updated weights for policy 0, policy_version 532820 (0.0035) [2024-06-23 22:55:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 8729772032. Throughput: 0: 42929.2. Samples: 8729943920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 22:55:20,951][15401] Updated weights for policy 0, policy_version 532830 (0.0027) [2024-06-23 22:55:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 43098.6). Total num frames: 8730017792. Throughput: 0: 43007.7. Samples: 8730071980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 22:55:23,897][15401] Updated weights for policy 0, policy_version 532840 (0.0037) [2024-06-23 22:55:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8730165248. Throughput: 0: 43076.9. Samples: 8730334280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 22:55:28,995][15401] Updated weights for policy 0, policy_version 532850 (0.0026) [2024-06-23 22:55:31,858][15401] Updated weights for policy 0, policy_version 532860 (0.0024) [2024-06-23 22:55:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43690.6, 300 sec: 43153.8). Total num frames: 8730427392. Throughput: 0: 42861.2. Samples: 8730580360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-23 22:55:36,410][15401] Updated weights for policy 0, policy_version 532870 (0.0026) [2024-06-23 22:55:38,389][15132] Fps is (10 sec: 49152.5, 60 sec: 42871.5, 300 sec: 43098.6). Total num frames: 8730656768. Throughput: 0: 43074.2. Samples: 8730716760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-23 22:55:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-23 22:55:39,370][15401] Updated weights for policy 0, policy_version 532880 (0.0042) [2024-06-23 22:55:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 8730820608. Throughput: 0: 43042.3. Samples: 8730975400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:55:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-23 22:55:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532887_8730820608.pth... [2024-06-23 22:55:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532258_8720515072.pth [2024-06-23 22:55:44,151][15401] Updated weights for policy 0, policy_version 532890 (0.0034) [2024-06-23 22:55:45,783][15349] Signal inference workers to stop experience collection... (129400 times) [2024-06-23 22:55:45,830][15401] InferenceWorker_p0-w0: stopping experience collection (129400 times) [2024-06-23 22:55:45,839][15349] Signal inference workers to resume experience collection... (129400 times) [2024-06-23 22:55:45,851][15401] InferenceWorker_p0-w0: resuming experience collection (129400 times) [2024-06-23 22:55:46,925][15401] Updated weights for policy 0, policy_version 532900 (0.0033) [2024-06-23 22:55:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43420.6, 300 sec: 43098.6). Total num frames: 8731066368. Throughput: 0: 42782.3. Samples: 8731219120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:55:48,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-23 22:55:51,709][15401] Updated weights for policy 0, policy_version 532910 (0.0039) [2024-06-23 22:55:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 8731279360. Throughput: 0: 42866.3. Samples: 8731355440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:55:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 22:55:54,791][15401] Updated weights for policy 0, policy_version 532920 (0.0031) [2024-06-23 22:55:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 8731459584. Throughput: 0: 42672.5. Samples: 8731610740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:55:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 22:55:59,351][15401] Updated weights for policy 0, policy_version 532930 (0.0028) [2024-06-23 22:56:02,339][15401] Updated weights for policy 0, policy_version 532940 (0.0033) [2024-06-23 22:56:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 8731721728. Throughput: 0: 42525.1. Samples: 8731857540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 22:56:06,799][15401] Updated weights for policy 0, policy_version 532950 (0.0026) [2024-06-23 22:56:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 8731918336. Throughput: 0: 42910.9. Samples: 8732002980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-23 22:56:09,740][15401] Updated weights for policy 0, policy_version 532960 (0.0031) [2024-06-23 22:56:13,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8732098560. Throughput: 0: 42634.2. Samples: 8732252820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 22:56:14,339][15401] Updated weights for policy 0, policy_version 532970 (0.0029) [2024-06-23 22:56:17,786][15401] Updated weights for policy 0, policy_version 532980 (0.0030) [2024-06-23 22:56:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8732360704. Throughput: 0: 42538.6. Samples: 8732494600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:18,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-23 22:56:21,920][15401] Updated weights for policy 0, policy_version 532990 (0.0033) [2024-06-23 22:56:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42932.0). Total num frames: 8732524544. Throughput: 0: 42605.4. Samples: 8732634000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 22:56:25,392][15401] Updated weights for policy 0, policy_version 533000 (0.0036) [2024-06-23 22:56:28,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8732721152. Throughput: 0: 42427.5. Samples: 8732884640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 22:56:29,852][15401] Updated weights for policy 0, policy_version 533010 (0.0022) [2024-06-23 22:56:33,291][15401] Updated weights for policy 0, policy_version 533020 (0.0051) [2024-06-23 22:56:33,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8732999680. Throughput: 0: 42581.3. Samples: 8733135280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 22:56:37,966][15401] Updated weights for policy 0, policy_version 533030 (0.0038) [2024-06-23 22:56:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42052.2, 300 sec: 42931.6). Total num frames: 8733179904. Throughput: 0: 42610.2. Samples: 8733272900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 22:56:40,785][15401] Updated weights for policy 0, policy_version 533040 (0.0026) [2024-06-23 22:56:43,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8733376512. Throughput: 0: 42649.8. Samples: 8733529980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 22:56:45,538][15401] Updated weights for policy 0, policy_version 533050 (0.0029) [2024-06-23 22:56:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8733638656. Throughput: 0: 42681.3. Samples: 8733778200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:48,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-23 22:56:48,419][15401] Updated weights for policy 0, policy_version 533060 (0.0036) [2024-06-23 22:56:51,143][15349] Signal inference workers to stop experience collection... (129450 times) [2024-06-23 22:56:51,163][15401] InferenceWorker_p0-w0: stopping experience collection (129450 times) [2024-06-23 22:56:51,204][15349] Signal inference workers to resume experience collection... (129450 times) [2024-06-23 22:56:51,205][15401] InferenceWorker_p0-w0: resuming experience collection (129450 times) [2024-06-23 22:56:53,022][15401] Updated weights for policy 0, policy_version 533070 (0.0029) [2024-06-23 22:56:53,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.6, 300 sec: 42931.6). Total num frames: 8733835264. Throughput: 0: 42576.9. Samples: 8733919040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:53,392][15132] Avg episode reward: [(0, '0.372')] [2024-06-23 22:56:56,254][15401] Updated weights for policy 0, policy_version 533080 (0.0043) [2024-06-23 22:56:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8734031872. Throughput: 0: 42642.2. Samples: 8734171720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:56:58,393][15132] Avg episode reward: [(0, '0.379')] [2024-06-23 22:57:00,616][15401] Updated weights for policy 0, policy_version 533090 (0.0037) [2024-06-23 22:57:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 8734261248. Throughput: 0: 42945.9. Samples: 8734427160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:57:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 22:57:03,917][15401] Updated weights for policy 0, policy_version 533100 (0.0032) [2024-06-23 22:57:08,155][15401] Updated weights for policy 0, policy_version 533110 (0.0029) [2024-06-23 22:57:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 8734474240. Throughput: 0: 42807.4. Samples: 8734560440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-23 22:57:08,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 22:57:11,470][15401] Updated weights for policy 0, policy_version 533120 (0.0033) [2024-06-23 22:57:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8734687232. Throughput: 0: 42780.8. Samples: 8734809780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:13,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-23 22:57:16,382][15401] Updated weights for policy 0, policy_version 533130 (0.0055) [2024-06-23 22:57:18,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42876.5). Total num frames: 8734883840. Throughput: 0: 42831.1. Samples: 8735062680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-23 22:57:19,377][15401] Updated weights for policy 0, policy_version 533140 (0.0035) [2024-06-23 22:57:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8735096832. Throughput: 0: 42631.1. Samples: 8735191300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 22:57:23,991][15401] Updated weights for policy 0, policy_version 533150 (0.0026) [2024-06-23 22:57:26,996][15401] Updated weights for policy 0, policy_version 533160 (0.0027) [2024-06-23 22:57:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8735309824. Throughput: 0: 42478.2. Samples: 8735441500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 22:57:31,583][15401] Updated weights for policy 0, policy_version 533170 (0.0040) [2024-06-23 22:57:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42765.9). Total num frames: 8735506432. Throughput: 0: 42865.7. Samples: 8735707160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 22:57:34,562][15401] Updated weights for policy 0, policy_version 533180 (0.0035) [2024-06-23 22:57:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8735752192. Throughput: 0: 42486.2. Samples: 8735830820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 22:57:39,115][15401] Updated weights for policy 0, policy_version 533190 (0.0037) [2024-06-23 22:57:42,403][15401] Updated weights for policy 0, policy_version 533200 (0.0032) [2024-06-23 22:57:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.7). Total num frames: 8735965184. Throughput: 0: 42440.9. Samples: 8736081560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 22:57:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533201_8735965184.pth... [2024-06-23 22:57:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532575_8725708800.pth [2024-06-23 22:57:46,873][15401] Updated weights for policy 0, policy_version 533210 (0.0033) [2024-06-23 22:57:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 8736161792. Throughput: 0: 42529.3. Samples: 8736340980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 22:57:49,997][15401] Updated weights for policy 0, policy_version 533220 (0.0038) [2024-06-23 22:57:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 8736374784. Throughput: 0: 42381.5. Samples: 8736467500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 22:57:54,405][15401] Updated weights for policy 0, policy_version 533230 (0.0037) [2024-06-23 22:57:58,355][15401] Updated weights for policy 0, policy_version 533240 (0.0034) [2024-06-23 22:57:58,390][15132] Fps is (10 sec: 44233.5, 60 sec: 42871.0, 300 sec: 42876.0). Total num frames: 8736604160. Throughput: 0: 42412.8. Samples: 8736718380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:57:58,391][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 22:58:02,093][15401] Updated weights for policy 0, policy_version 533250 (0.0029) [2024-06-23 22:58:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 8736800768. Throughput: 0: 42449.9. Samples: 8736972920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:03,390][15132] Avg episode reward: [(0, '0.125')] [2024-06-23 22:58:05,961][15401] Updated weights for policy 0, policy_version 533260 (0.0040) [2024-06-23 22:58:08,389][15132] Fps is (10 sec: 42601.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 8737030144. Throughput: 0: 42343.6. Samples: 8737096760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 22:58:09,939][15401] Updated weights for policy 0, policy_version 533270 (0.0039) [2024-06-23 22:58:13,343][15349] Signal inference workers to stop experience collection... (129500 times) [2024-06-23 22:58:13,350][15349] Signal inference workers to resume experience collection... (129500 times) [2024-06-23 22:58:13,379][15401] InferenceWorker_p0-w0: stopping experience collection (129500 times) [2024-06-23 22:58:13,379][15401] InferenceWorker_p0-w0: resuming experience collection (129500 times) [2024-06-23 22:58:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 8737210368. Throughput: 0: 42505.8. Samples: 8737354260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 22:58:13,811][15401] Updated weights for policy 0, policy_version 533280 (0.0032) [2024-06-23 22:58:17,693][15401] Updated weights for policy 0, policy_version 533290 (0.0028) [2024-06-23 22:58:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8737456128. Throughput: 0: 42107.1. Samples: 8737601980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:18,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 22:58:21,654][15401] Updated weights for policy 0, policy_version 533300 (0.0030) [2024-06-23 22:58:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8737669120. Throughput: 0: 42353.0. Samples: 8737736700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 22:58:25,390][15401] Updated weights for policy 0, policy_version 533310 (0.0034) [2024-06-23 22:58:28,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 8737849344. Throughput: 0: 42433.8. Samples: 8737991180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:28,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 22:58:29,341][15401] Updated weights for policy 0, policy_version 533320 (0.0039) [2024-06-23 22:58:33,031][15401] Updated weights for policy 0, policy_version 533330 (0.0022) [2024-06-23 22:58:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8738095104. Throughput: 0: 42249.8. Samples: 8738242220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-23 22:58:37,026][15401] Updated weights for policy 0, policy_version 533340 (0.0032) [2024-06-23 22:58:38,389][15132] Fps is (10 sec: 45886.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8738308096. Throughput: 0: 42488.4. Samples: 8738379480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-23 22:58:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 22:58:40,655][15401] Updated weights for policy 0, policy_version 533350 (0.0027) [2024-06-23 22:58:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8738488320. Throughput: 0: 42443.8. Samples: 8738628320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:58:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-23 22:58:44,773][15401] Updated weights for policy 0, policy_version 533360 (0.0029) [2024-06-23 22:58:48,226][15401] Updated weights for policy 0, policy_version 533370 (0.0034) [2024-06-23 22:58:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8738734080. Throughput: 0: 42502.2. Samples: 8738885520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:58:48,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-23 22:58:52,447][15401] Updated weights for policy 0, policy_version 533380 (0.0042) [2024-06-23 22:58:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8738947072. Throughput: 0: 42732.4. Samples: 8739019720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:58:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 22:58:55,823][15401] Updated weights for policy 0, policy_version 533390 (0.0030) [2024-06-23 22:58:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.8, 300 sec: 42765.0). Total num frames: 8739143680. Throughput: 0: 42583.9. Samples: 8739270540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:58:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 22:59:00,046][15401] Updated weights for policy 0, policy_version 533400 (0.0036) [2024-06-23 22:59:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8739373056. Throughput: 0: 42799.1. Samples: 8739528040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:03,393][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 22:59:03,780][15401] Updated weights for policy 0, policy_version 533410 (0.0031) [2024-06-23 22:59:07,533][15401] Updated weights for policy 0, policy_version 533420 (0.0028) [2024-06-23 22:59:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8739569664. Throughput: 0: 42743.6. Samples: 8739660160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 22:59:11,294][15401] Updated weights for policy 0, policy_version 533430 (0.0046) [2024-06-23 22:59:13,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8739766272. Throughput: 0: 42801.1. Samples: 8739917120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-23 22:59:15,096][15401] Updated weights for policy 0, policy_version 533440 (0.0030) [2024-06-23 22:59:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8740012032. Throughput: 0: 42840.4. Samples: 8740170040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 22:59:19,291][15401] Updated weights for policy 0, policy_version 533450 (0.0046) [2024-06-23 22:59:22,623][15401] Updated weights for policy 0, policy_version 533460 (0.0029) [2024-06-23 22:59:23,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8740225024. Throughput: 0: 42698.1. Samples: 8740300900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 22:59:27,035][15401] Updated weights for policy 0, policy_version 533470 (0.0042) [2024-06-23 22:59:27,617][15349] Signal inference workers to stop experience collection... (129550 times) [2024-06-23 22:59:27,617][15349] Signal inference workers to resume experience collection... (129550 times) [2024-06-23 22:59:27,630][15401] InferenceWorker_p0-w0: stopping experience collection (129550 times) [2024-06-23 22:59:27,641][15401] InferenceWorker_p0-w0: resuming experience collection (129550 times) [2024-06-23 22:59:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8740421632. Throughput: 0: 42747.1. Samples: 8740551940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 22:59:30,479][15401] Updated weights for policy 0, policy_version 533480 (0.0039) [2024-06-23 22:59:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8740634624. Throughput: 0: 42707.5. Samples: 8740807360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 22:59:34,602][15401] Updated weights for policy 0, policy_version 533490 (0.0029) [2024-06-23 22:59:37,982][15401] Updated weights for policy 0, policy_version 533500 (0.0026) [2024-06-23 22:59:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8740864000. Throughput: 0: 42584.9. Samples: 8740936040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-23 22:59:42,171][15401] Updated weights for policy 0, policy_version 533510 (0.0042) [2024-06-23 22:59:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.6). Total num frames: 8741076992. Throughput: 0: 42686.6. Samples: 8741191440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-23 22:59:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533513_8741076992.pth... [2024-06-23 22:59:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000532887_8730820608.pth [2024-06-23 22:59:45,950][15401] Updated weights for policy 0, policy_version 533520 (0.0040) [2024-06-23 22:59:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8741273600. Throughput: 0: 42645.9. Samples: 8741447000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 22:59:49,970][15401] Updated weights for policy 0, policy_version 533530 (0.0047) [2024-06-23 22:59:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8741502976. Throughput: 0: 42476.4. Samples: 8741571600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:53,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-23 22:59:53,438][15401] Updated weights for policy 0, policy_version 533540 (0.0029) [2024-06-23 22:59:57,598][15401] Updated weights for policy 0, policy_version 533550 (0.0032) [2024-06-23 22:59:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8741732352. Throughput: 0: 42687.4. Samples: 8741838060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 22:59:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 23:00:01,495][15401] Updated weights for policy 0, policy_version 533560 (0.0042) [2024-06-23 23:00:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 8741912576. Throughput: 0: 42574.1. Samples: 8742085880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 23:00:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 23:00:05,373][15401] Updated weights for policy 0, policy_version 533570 (0.0024) [2024-06-23 23:00:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8742141952. Throughput: 0: 42494.8. Samples: 8742213160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-23 23:00:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 23:00:09,062][15401] Updated weights for policy 0, policy_version 533580 (0.0040) [2024-06-23 23:00:12,902][15401] Updated weights for policy 0, policy_version 533590 (0.0032) [2024-06-23 23:00:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 8742354944. Throughput: 0: 42807.1. Samples: 8742478260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 23:00:16,735][15401] Updated weights for policy 0, policy_version 533600 (0.0034) [2024-06-23 23:00:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8742567936. Throughput: 0: 42861.3. Samples: 8742736120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 23:00:20,487][15401] Updated weights for policy 0, policy_version 533610 (0.0040) [2024-06-23 23:00:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8742797312. Throughput: 0: 42885.3. Samples: 8742865880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 23:00:24,068][15401] Updated weights for policy 0, policy_version 533620 (0.0022) [2024-06-23 23:00:28,291][15401] Updated weights for policy 0, policy_version 533630 (0.0029) [2024-06-23 23:00:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8742993920. Throughput: 0: 42973.4. Samples: 8743125240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-23 23:00:31,710][15401] Updated weights for policy 0, policy_version 533640 (0.0023) [2024-06-23 23:00:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8743223296. Throughput: 0: 42939.0. Samples: 8743379260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-23 23:00:35,654][15401] Updated weights for policy 0, policy_version 533650 (0.0029) [2024-06-23 23:00:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8743436288. Throughput: 0: 43156.5. Samples: 8743513640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-23 23:00:39,273][15401] Updated weights for policy 0, policy_version 533660 (0.0038) [2024-06-23 23:00:43,345][15401] Updated weights for policy 0, policy_version 533670 (0.0039) [2024-06-23 23:00:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8743649280. Throughput: 0: 42977.0. Samples: 8743772020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 23:00:46,988][15401] Updated weights for policy 0, policy_version 533680 (0.0029) [2024-06-23 23:00:48,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 8743862272. Throughput: 0: 42947.7. Samples: 8744018800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:48,396][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 23:00:50,967][15401] Updated weights for policy 0, policy_version 533690 (0.0038) [2024-06-23 23:00:53,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 8744075264. Throughput: 0: 42915.7. Samples: 8744144640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:53,396][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 23:00:54,529][15349] Signal inference workers to stop experience collection... (129600 times) [2024-06-23 23:00:54,584][15401] InferenceWorker_p0-w0: stopping experience collection (129600 times) [2024-06-23 23:00:54,587][15349] Signal inference workers to resume experience collection... (129600 times) [2024-06-23 23:00:54,596][15401] InferenceWorker_p0-w0: resuming experience collection (129600 times) [2024-06-23 23:00:54,734][15401] Updated weights for policy 0, policy_version 533700 (0.0032) [2024-06-23 23:00:58,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8744288256. Throughput: 0: 42923.2. Samples: 8744409800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:00:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 23:00:58,558][15401] Updated weights for policy 0, policy_version 533710 (0.0029) [2024-06-23 23:01:02,193][15401] Updated weights for policy 0, policy_version 533720 (0.0031) [2024-06-23 23:01:03,389][15132] Fps is (10 sec: 44265.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 8744517632. Throughput: 0: 42719.2. Samples: 8744658480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-23 23:01:06,266][15401] Updated weights for policy 0, policy_version 533730 (0.0052) [2024-06-23 23:01:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8744714240. Throughput: 0: 42721.7. Samples: 8744788360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-23 23:01:10,401][15401] Updated weights for policy 0, policy_version 533740 (0.0036) [2024-06-23 23:01:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8744910848. Throughput: 0: 42638.3. Samples: 8745043960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-23 23:01:13,904][15401] Updated weights for policy 0, policy_version 533750 (0.0031) [2024-06-23 23:01:17,978][15401] Updated weights for policy 0, policy_version 533760 (0.0032) [2024-06-23 23:01:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8745140224. Throughput: 0: 42525.9. Samples: 8745292920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-23 23:01:21,489][15401] Updated weights for policy 0, policy_version 533770 (0.0045) [2024-06-23 23:01:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8745336832. Throughput: 0: 42456.8. Samples: 8745424200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-23 23:01:25,648][15401] Updated weights for policy 0, policy_version 533780 (0.0038) [2024-06-23 23:01:28,393][15132] Fps is (10 sec: 40945.7, 60 sec: 42596.0, 300 sec: 42542.4). Total num frames: 8745549824. Throughput: 0: 42408.3. Samples: 8745680540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:28,394][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 23:01:29,362][15401] Updated weights for policy 0, policy_version 533790 (0.0024) [2024-06-23 23:01:33,235][15401] Updated weights for policy 0, policy_version 533800 (0.0049) [2024-06-23 23:01:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8745779200. Throughput: 0: 42656.3. Samples: 8745938160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:33,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 23:01:37,187][15401] Updated weights for policy 0, policy_version 533810 (0.0037) [2024-06-23 23:01:38,390][15132] Fps is (10 sec: 44251.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8745992192. Throughput: 0: 42698.4. Samples: 8746065800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-23 23:01:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 23:01:40,803][15401] Updated weights for policy 0, policy_version 533820 (0.0028) [2024-06-23 23:01:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 8746188800. Throughput: 0: 42408.8. Samples: 8746318200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:01:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 23:01:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533825_8746188800.pth... [2024-06-23 23:01:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533201_8735965184.pth [2024-06-23 23:01:44,687][15401] Updated weights for policy 0, policy_version 533830 (0.0049) [2024-06-23 23:01:48,353][15401] Updated weights for policy 0, policy_version 533840 (0.0024) [2024-06-23 23:01:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42876.1, 300 sec: 42709.8). Total num frames: 8746434560. Throughput: 0: 42677.8. Samples: 8746578980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:01:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 23:01:52,222][15401] Updated weights for policy 0, policy_version 533850 (0.0037) [2024-06-23 23:01:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 8746647552. Throughput: 0: 42661.4. Samples: 8746708120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:01:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 23:01:56,037][15401] Updated weights for policy 0, policy_version 533860 (0.0042) [2024-06-23 23:01:58,394][15132] Fps is (10 sec: 40942.1, 60 sec: 42595.2, 300 sec: 42653.3). Total num frames: 8746844160. Throughput: 0: 42646.9. Samples: 8746963260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:01:58,394][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 23:01:59,224][15349] Signal inference workers to stop experience collection... (129650 times) [2024-06-23 23:01:59,224][15349] Signal inference workers to resume experience collection... (129650 times) [2024-06-23 23:01:59,262][15401] InferenceWorker_p0-w0: stopping experience collection (129650 times) [2024-06-23 23:01:59,263][15401] InferenceWorker_p0-w0: resuming experience collection (129650 times) [2024-06-23 23:02:00,214][15401] Updated weights for policy 0, policy_version 533870 (0.0029) [2024-06-23 23:02:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 8747057152. Throughput: 0: 42918.6. Samples: 8747224260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:03,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-23 23:02:03,618][15401] Updated weights for policy 0, policy_version 533880 (0.0026) [2024-06-23 23:02:07,705][15401] Updated weights for policy 0, policy_version 533890 (0.0031) [2024-06-23 23:02:08,390][15132] Fps is (10 sec: 42616.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8747270144. Throughput: 0: 42861.8. Samples: 8747352980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:08,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-23 23:02:11,161][15401] Updated weights for policy 0, policy_version 533900 (0.0032) [2024-06-23 23:02:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8747466752. Throughput: 0: 42745.9. Samples: 8747603960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 23:02:15,722][15401] Updated weights for policy 0, policy_version 533910 (0.0049) [2024-06-23 23:02:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8747712512. Throughput: 0: 42799.0. Samples: 8747864020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:18,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-23 23:02:18,680][15401] Updated weights for policy 0, policy_version 533920 (0.0031) [2024-06-23 23:02:23,315][15401] Updated weights for policy 0, policy_version 533930 (0.0034) [2024-06-23 23:02:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8747909120. Throughput: 0: 42934.4. Samples: 8747997840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 23:02:26,362][15401] Updated weights for policy 0, policy_version 533940 (0.0033) [2024-06-23 23:02:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.9, 300 sec: 42765.0). Total num frames: 8748122112. Throughput: 0: 42838.3. Samples: 8748245920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 23:02:30,884][15401] Updated weights for policy 0, policy_version 533950 (0.0034) [2024-06-23 23:02:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8748351488. Throughput: 0: 42832.5. Samples: 8748506440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 23:02:34,582][15401] Updated weights for policy 0, policy_version 533960 (0.0034) [2024-06-23 23:02:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8748548096. Throughput: 0: 42860.0. Samples: 8748636820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 23:02:38,473][15401] Updated weights for policy 0, policy_version 533970 (0.0032) [2024-06-23 23:02:42,318][15401] Updated weights for policy 0, policy_version 533980 (0.0041) [2024-06-23 23:02:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8748777472. Throughput: 0: 42933.5. Samples: 8748895080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 23:02:46,053][15401] Updated weights for policy 0, policy_version 533990 (0.0035) [2024-06-23 23:02:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8748990464. Throughput: 0: 42764.0. Samples: 8749148640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:48,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 23:02:49,969][15401] Updated weights for policy 0, policy_version 534000 (0.0036) [2024-06-23 23:02:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 8749187072. Throughput: 0: 42741.4. Samples: 8749276340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:53,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-23 23:02:53,642][15401] Updated weights for policy 0, policy_version 534010 (0.0035) [2024-06-23 23:02:57,510][15401] Updated weights for policy 0, policy_version 534020 (0.0022) [2024-06-23 23:02:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43147.6, 300 sec: 42820.5). Total num frames: 8749432832. Throughput: 0: 42990.6. Samples: 8749538540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:02:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 23:03:01,555][15401] Updated weights for policy 0, policy_version 534030 (0.0035) [2024-06-23 23:03:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8749645824. Throughput: 0: 42805.8. Samples: 8749790280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:03:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 23:03:05,081][15401] Updated weights for policy 0, policy_version 534040 (0.0034) [2024-06-23 23:03:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8749826048. Throughput: 0: 42826.6. Samples: 8749925040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-23 23:03:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 23:03:08,934][15401] Updated weights for policy 0, policy_version 534050 (0.0035) [2024-06-23 23:03:12,962][15401] Updated weights for policy 0, policy_version 534060 (0.0032) [2024-06-23 23:03:13,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8750055424. Throughput: 0: 42997.0. Samples: 8750180780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 23:03:16,655][15401] Updated weights for policy 0, policy_version 534070 (0.0035) [2024-06-23 23:03:17,275][15349] Signal inference workers to stop experience collection... (129700 times) [2024-06-23 23:03:17,275][15349] Signal inference workers to resume experience collection... (129700 times) [2024-06-23 23:03:17,325][15401] InferenceWorker_p0-w0: stopping experience collection (129700 times) [2024-06-23 23:03:17,325][15401] InferenceWorker_p0-w0: resuming experience collection (129700 times) [2024-06-23 23:03:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8750284800. Throughput: 0: 42839.4. Samples: 8750434220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 23:03:20,547][15401] Updated weights for policy 0, policy_version 534080 (0.0035) [2024-06-23 23:03:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 8750481408. Throughput: 0: 42834.9. Samples: 8750564380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 23:03:24,264][15401] Updated weights for policy 0, policy_version 534090 (0.0028) [2024-06-23 23:03:28,159][15401] Updated weights for policy 0, policy_version 534100 (0.0033) [2024-06-23 23:03:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8750694400. Throughput: 0: 42755.5. Samples: 8750819080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-23 23:03:31,816][15401] Updated weights for policy 0, policy_version 534110 (0.0038) [2024-06-23 23:03:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8750907392. Throughput: 0: 42976.5. Samples: 8751082580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:33,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-23 23:03:35,975][15401] Updated weights for policy 0, policy_version 534120 (0.0040) [2024-06-23 23:03:38,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 8751120384. Throughput: 0: 42912.8. Samples: 8751207480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:38,396][15132] Avg episode reward: [(0, '0.817')] [2024-06-23 23:03:39,337][15401] Updated weights for policy 0, policy_version 534130 (0.0038) [2024-06-23 23:03:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8751333376. Throughput: 0: 42769.0. Samples: 8751463140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 23:03:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534140_8751349760.pth... [2024-06-23 23:03:43,412][15401] Updated weights for policy 0, policy_version 534140 (0.0038) [2024-06-23 23:03:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533513_8741076992.pth [2024-06-23 23:03:47,355][15401] Updated weights for policy 0, policy_version 534150 (0.0036) [2024-06-23 23:03:48,389][15132] Fps is (10 sec: 42604.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8751546368. Throughput: 0: 42898.0. Samples: 8751720680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 23:03:51,132][15401] Updated weights for policy 0, policy_version 534160 (0.0035) [2024-06-23 23:03:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8751759360. Throughput: 0: 42659.4. Samples: 8751844720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:53,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 23:03:54,943][15401] Updated weights for policy 0, policy_version 534170 (0.0039) [2024-06-23 23:03:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 8751972352. Throughput: 0: 42578.6. Samples: 8752096820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:03:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 23:03:58,949][15401] Updated weights for policy 0, policy_version 534180 (0.0024) [2024-06-23 23:04:02,425][15401] Updated weights for policy 0, policy_version 534190 (0.0023) [2024-06-23 23:04:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8752185344. Throughput: 0: 42664.4. Samples: 8752354120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 23:04:06,331][15401] Updated weights for policy 0, policy_version 534200 (0.0024) [2024-06-23 23:04:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8752398336. Throughput: 0: 42670.1. Samples: 8752484540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:08,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-23 23:04:10,054][15401] Updated weights for policy 0, policy_version 534210 (0.0045) [2024-06-23 23:04:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8752611328. Throughput: 0: 42785.0. Samples: 8752744400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:13,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-23 23:04:14,093][15401] Updated weights for policy 0, policy_version 534220 (0.0029) [2024-06-23 23:04:17,630][15401] Updated weights for policy 0, policy_version 534230 (0.0031) [2024-06-23 23:04:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8752840704. Throughput: 0: 42657.2. Samples: 8753002160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:18,392][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 23:04:21,593][15401] Updated weights for policy 0, policy_version 534240 (0.0027) [2024-06-23 23:04:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8753037312. Throughput: 0: 42746.1. Samples: 8753131000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:23,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-23 23:04:25,174][15401] Updated weights for policy 0, policy_version 534250 (0.0042) [2024-06-23 23:04:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8753266688. Throughput: 0: 42778.2. Samples: 8753388160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 23:04:29,539][15401] Updated weights for policy 0, policy_version 534260 (0.0034) [2024-06-23 23:04:32,786][15401] Updated weights for policy 0, policy_version 534270 (0.0027) [2024-06-23 23:04:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8753479680. Throughput: 0: 42609.7. Samples: 8753638120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 23:04:37,118][15401] Updated weights for policy 0, policy_version 534280 (0.0032) [2024-06-23 23:04:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 8753676288. Throughput: 0: 42827.2. Samples: 8753771940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-23 23:04:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-23 23:04:40,398][15401] Updated weights for policy 0, policy_version 534290 (0.0029) [2024-06-23 23:04:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8753905664. Throughput: 0: 42977.8. Samples: 8754030820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:04:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-23 23:04:45,054][15401] Updated weights for policy 0, policy_version 534300 (0.0040) [2024-06-23 23:04:48,311][15401] Updated weights for policy 0, policy_version 534310 (0.0041) [2024-06-23 23:04:48,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 8754135040. Throughput: 0: 42768.0. Samples: 8754278780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:04:48,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 23:04:50,982][15349] Signal inference workers to stop experience collection... (129750 times) [2024-06-23 23:04:50,983][15349] Signal inference workers to resume experience collection... (129750 times) [2024-06-23 23:04:51,019][15401] InferenceWorker_p0-w0: stopping experience collection (129750 times) [2024-06-23 23:04:51,019][15401] InferenceWorker_p0-w0: resuming experience collection (129750 times) [2024-06-23 23:04:52,876][15401] Updated weights for policy 0, policy_version 534320 (0.0044) [2024-06-23 23:04:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8754315264. Throughput: 0: 42766.8. Samples: 8754409040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:04:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-23 23:04:56,171][15401] Updated weights for policy 0, policy_version 534330 (0.0028) [2024-06-23 23:04:58,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8754544640. Throughput: 0: 42701.6. Samples: 8754665980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:04:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 23:05:00,270][15401] Updated weights for policy 0, policy_version 534340 (0.0028) [2024-06-23 23:05:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8754757632. Throughput: 0: 42808.6. Samples: 8754928540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 23:05:03,678][15401] Updated weights for policy 0, policy_version 534350 (0.0030) [2024-06-23 23:05:07,923][15401] Updated weights for policy 0, policy_version 534360 (0.0022) [2024-06-23 23:05:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8754970624. Throughput: 0: 42846.4. Samples: 8755059080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 23:05:11,202][15401] Updated weights for policy 0, policy_version 534370 (0.0033) [2024-06-23 23:05:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 8755200000. Throughput: 0: 42795.8. Samples: 8755313980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 23:05:15,529][15401] Updated weights for policy 0, policy_version 534380 (0.0034) [2024-06-23 23:05:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8755412992. Throughput: 0: 42931.5. Samples: 8755570040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-23 23:05:19,142][15401] Updated weights for policy 0, policy_version 534390 (0.0033) [2024-06-23 23:05:22,976][15401] Updated weights for policy 0, policy_version 534400 (0.0034) [2024-06-23 23:05:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 8755625984. Throughput: 0: 42768.6. Samples: 8755696520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-23 23:05:26,603][15401] Updated weights for policy 0, policy_version 534410 (0.0030) [2024-06-23 23:05:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8755822592. Throughput: 0: 42710.2. Samples: 8755952780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 23:05:30,693][15401] Updated weights for policy 0, policy_version 534420 (0.0030) [2024-06-23 23:05:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8756051968. Throughput: 0: 42947.5. Samples: 8756211320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-23 23:05:34,335][15401] Updated weights for policy 0, policy_version 534430 (0.0031) [2024-06-23 23:05:38,305][15401] Updated weights for policy 0, policy_version 534440 (0.0030) [2024-06-23 23:05:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8756264960. Throughput: 0: 42892.3. Samples: 8756339200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:38,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-23 23:05:41,840][15401] Updated weights for policy 0, policy_version 534450 (0.0029) [2024-06-23 23:05:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42765.9). Total num frames: 8756477952. Throughput: 0: 42940.0. Samples: 8756598280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:43,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-23 23:05:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534453_8756477952.pth... [2024-06-23 23:05:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000533825_8746188800.pth [2024-06-23 23:05:45,787][15401] Updated weights for policy 0, policy_version 534460 (0.0026) [2024-06-23 23:05:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42765.9). Total num frames: 8756690944. Throughput: 0: 42846.1. Samples: 8756856620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-23 23:05:49,482][15401] Updated weights for policy 0, policy_version 534470 (0.0047) [2024-06-23 23:05:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8756887552. Throughput: 0: 42843.1. Samples: 8756987020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:53,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 23:05:53,672][15401] Updated weights for policy 0, policy_version 534480 (0.0025) [2024-06-23 23:05:57,171][15401] Updated weights for policy 0, policy_version 534490 (0.0034) [2024-06-23 23:05:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8757133312. Throughput: 0: 42848.5. Samples: 8757242160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:05:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 23:06:01,530][15401] Updated weights for policy 0, policy_version 534500 (0.0029) [2024-06-23 23:06:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8757346304. Throughput: 0: 42800.9. Samples: 8757496080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-23 23:06:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-23 23:06:04,861][15401] Updated weights for policy 0, policy_version 534510 (0.0034) [2024-06-23 23:06:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8757542912. Throughput: 0: 42971.9. Samples: 8757630360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:08,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 23:06:08,970][15401] Updated weights for policy 0, policy_version 534520 (0.0030) [2024-06-23 23:06:12,455][15401] Updated weights for policy 0, policy_version 534530 (0.0032) [2024-06-23 23:06:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8757788672. Throughput: 0: 42921.3. Samples: 8757884240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 23:06:16,510][15401] Updated weights for policy 0, policy_version 534540 (0.0032) [2024-06-23 23:06:18,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8758001664. Throughput: 0: 42970.4. Samples: 8758144980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 23:06:19,871][15401] Updated weights for policy 0, policy_version 534550 (0.0028) [2024-06-23 23:06:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42876.6). Total num frames: 8758198272. Throughput: 0: 43026.2. Samples: 8758275380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 23:06:23,955][15401] Updated weights for policy 0, policy_version 534560 (0.0025) [2024-06-23 23:06:27,265][15401] Updated weights for policy 0, policy_version 534570 (0.0039) [2024-06-23 23:06:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 8758427648. Throughput: 0: 42951.2. Samples: 8758531080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 23:06:32,042][15401] Updated weights for policy 0, policy_version 534580 (0.0034) [2024-06-23 23:06:33,328][15349] Signal inference workers to stop experience collection... (129800 times) [2024-06-23 23:06:33,376][15401] InferenceWorker_p0-w0: stopping experience collection (129800 times) [2024-06-23 23:06:33,385][15349] Signal inference workers to resume experience collection... (129800 times) [2024-06-23 23:06:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8758624256. Throughput: 0: 43110.8. Samples: 8758796600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:33,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-23 23:06:33,391][15401] InferenceWorker_p0-w0: resuming experience collection (129800 times) [2024-06-23 23:06:34,895][15401] Updated weights for policy 0, policy_version 534590 (0.0031) [2024-06-23 23:06:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 8758837248. Throughput: 0: 43037.7. Samples: 8758923820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:38,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-23 23:06:39,464][15401] Updated weights for policy 0, policy_version 534600 (0.0039) [2024-06-23 23:06:42,315][15401] Updated weights for policy 0, policy_version 534610 (0.0027) [2024-06-23 23:06:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 8759083008. Throughput: 0: 43080.1. Samples: 8759180760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:43,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-23 23:06:47,077][15401] Updated weights for policy 0, policy_version 534620 (0.0041) [2024-06-23 23:06:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8759279616. Throughput: 0: 43317.3. Samples: 8759445360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 23:06:49,789][15401] Updated weights for policy 0, policy_version 534630 (0.0039) [2024-06-23 23:06:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43417.4, 300 sec: 42876.7). Total num frames: 8759492608. Throughput: 0: 43139.0. Samples: 8759571520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 23:06:54,656][15401] Updated weights for policy 0, policy_version 534640 (0.0039) [2024-06-23 23:06:57,737][15401] Updated weights for policy 0, policy_version 534650 (0.0041) [2024-06-23 23:06:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8759721984. Throughput: 0: 43288.9. Samples: 8759832240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:06:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 23:07:02,380][15401] Updated weights for policy 0, policy_version 534660 (0.0027) [2024-06-23 23:07:03,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8759934976. Throughput: 0: 43284.9. Samples: 8760092800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 23:07:05,098][15401] Updated weights for policy 0, policy_version 534670 (0.0032) [2024-06-23 23:07:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 8760131584. Throughput: 0: 43340.6. Samples: 8760225700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 23:07:09,761][15401] Updated weights for policy 0, policy_version 534680 (0.0038) [2024-06-23 23:07:12,628][15401] Updated weights for policy 0, policy_version 534690 (0.0023) [2024-06-23 23:07:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8760360960. Throughput: 0: 43221.7. Samples: 8760476060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-23 23:07:17,395][15401] Updated weights for policy 0, policy_version 534700 (0.0040) [2024-06-23 23:07:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8760557568. Throughput: 0: 43211.1. Samples: 8760741100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 23:07:20,200][15401] Updated weights for policy 0, policy_version 534710 (0.0036) [2024-06-23 23:07:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8760786944. Throughput: 0: 43147.6. Samples: 8760865360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 23:07:24,868][15401] Updated weights for policy 0, policy_version 534720 (0.0044) [2024-06-23 23:07:27,765][15401] Updated weights for policy 0, policy_version 534730 (0.0028) [2024-06-23 23:07:28,396][15132] Fps is (10 sec: 45845.6, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 8761016320. Throughput: 0: 43104.9. Samples: 8761120760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:28,396][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 23:07:32,432][15401] Updated weights for policy 0, policy_version 534740 (0.0034) [2024-06-23 23:07:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 8761196544. Throughput: 0: 43175.9. Samples: 8761388380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:33,392][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 23:07:35,297][15401] Updated weights for policy 0, policy_version 534750 (0.0042) [2024-06-23 23:07:38,392][15132] Fps is (10 sec: 40976.3, 60 sec: 43144.5, 300 sec: 42875.7). Total num frames: 8761425920. Throughput: 0: 43076.5. Samples: 8761510060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:38,393][15132] Avg episode reward: [(0, '0.402')] [2024-06-23 23:07:40,347][15401] Updated weights for policy 0, policy_version 534760 (0.0034) [2024-06-23 23:07:43,082][15401] Updated weights for policy 0, policy_version 534770 (0.0039) [2024-06-23 23:07:43,389][15132] Fps is (10 sec: 47525.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8761671680. Throughput: 0: 42963.2. Samples: 8761765580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-23 23:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534770_8761671680.pth... [2024-06-23 23:07:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534140_8751349760.pth [2024-06-23 23:07:48,042][15401] Updated weights for policy 0, policy_version 534780 (0.0029) [2024-06-23 23:07:48,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8761835520. Throughput: 0: 42821.6. Samples: 8762019780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 23:07:51,130][15401] Updated weights for policy 0, policy_version 534790 (0.0047) [2024-06-23 23:07:53,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8762048512. Throughput: 0: 42584.3. Samples: 8762142000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 23:07:55,599][15401] Updated weights for policy 0, policy_version 534800 (0.0038) [2024-06-23 23:07:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8762277888. Throughput: 0: 42795.2. Samples: 8762401840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:07:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 23:07:58,911][15401] Updated weights for policy 0, policy_version 534810 (0.0027) [2024-06-23 23:08:01,112][15349] Signal inference workers to stop experience collection... (129850 times) [2024-06-23 23:08:01,160][15401] InferenceWorker_p0-w0: stopping experience collection (129850 times) [2024-06-23 23:08:01,169][15349] Signal inference workers to resume experience collection... (129850 times) [2024-06-23 23:08:01,175][15401] InferenceWorker_p0-w0: resuming experience collection (129850 times) [2024-06-23 23:08:03,187][15401] Updated weights for policy 0, policy_version 534820 (0.0027) [2024-06-23 23:08:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8762490880. Throughput: 0: 42604.4. Samples: 8762658300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-23 23:08:06,619][15401] Updated weights for policy 0, policy_version 534830 (0.0032) [2024-06-23 23:08:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8762703872. Throughput: 0: 42681.8. Samples: 8762786040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-23 23:08:10,845][15401] Updated weights for policy 0, policy_version 534840 (0.0028) [2024-06-23 23:08:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8762916864. Throughput: 0: 42710.1. Samples: 8763042440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 23:08:14,551][15401] Updated weights for policy 0, policy_version 534850 (0.0028) [2024-06-23 23:08:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8763129856. Throughput: 0: 42335.6. Samples: 8763293380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 23:08:18,739][15401] Updated weights for policy 0, policy_version 534860 (0.0038) [2024-06-23 23:08:22,337][15401] Updated weights for policy 0, policy_version 534870 (0.0030) [2024-06-23 23:08:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8763359232. Throughput: 0: 42585.4. Samples: 8763426300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 23:08:26,197][15401] Updated weights for policy 0, policy_version 534880 (0.0042) [2024-06-23 23:08:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42329.9, 300 sec: 42876.1). Total num frames: 8763555840. Throughput: 0: 42560.0. Samples: 8763680780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-23 23:08:30,318][15401] Updated weights for policy 0, policy_version 534890 (0.0038) [2024-06-23 23:08:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42876.3). Total num frames: 8763768832. Throughput: 0: 42443.2. Samples: 8763929720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:33,398][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 23:08:34,076][15401] Updated weights for policy 0, policy_version 534900 (0.0031) [2024-06-23 23:08:38,012][15401] Updated weights for policy 0, policy_version 534910 (0.0033) [2024-06-23 23:08:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 8763981824. Throughput: 0: 42678.3. Samples: 8764062520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 23:08:41,610][15401] Updated weights for policy 0, policy_version 534920 (0.0028) [2024-06-23 23:08:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 8764211200. Throughput: 0: 42657.7. Samples: 8764321440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-23 23:08:45,383][15401] Updated weights for policy 0, policy_version 534930 (0.0029) [2024-06-23 23:08:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 8764424192. Throughput: 0: 42764.1. Samples: 8764582680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 23:08:49,098][15401] Updated weights for policy 0, policy_version 534940 (0.0026) [2024-06-23 23:08:52,920][15401] Updated weights for policy 0, policy_version 534950 (0.0030) [2024-06-23 23:08:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8764620800. Throughput: 0: 42629.6. Samples: 8764704380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 23:08:56,649][15401] Updated weights for policy 0, policy_version 534960 (0.0028) [2024-06-23 23:08:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8764850176. Throughput: 0: 42705.7. Samples: 8764964200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:08:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 23:09:00,961][15401] Updated weights for policy 0, policy_version 534970 (0.0027) [2024-06-23 23:09:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8765063168. Throughput: 0: 42936.9. Samples: 8765225540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-23 23:09:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 23:09:04,129][15401] Updated weights for policy 0, policy_version 534980 (0.0033) [2024-06-23 23:09:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8765259776. Throughput: 0: 42686.2. Samples: 8765347180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 23:09:08,555][15401] Updated weights for policy 0, policy_version 534990 (0.0038) [2024-06-23 23:09:11,573][15401] Updated weights for policy 0, policy_version 535000 (0.0036) [2024-06-23 23:09:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8765489152. Throughput: 0: 42969.6. Samples: 8765614420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-23 23:09:15,971][15401] Updated weights for policy 0, policy_version 535010 (0.0042) [2024-06-23 23:09:18,029][15349] Signal inference workers to stop experience collection... (129900 times) [2024-06-23 23:09:18,029][15349] Signal inference workers to resume experience collection... (129900 times) [2024-06-23 23:09:18,051][15401] InferenceWorker_p0-w0: stopping experience collection (129900 times) [2024-06-23 23:09:18,051][15401] InferenceWorker_p0-w0: resuming experience collection (129900 times) [2024-06-23 23:09:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8765702144. Throughput: 0: 43170.2. Samples: 8765872380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:18,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-23 23:09:19,373][15401] Updated weights for policy 0, policy_version 535020 (0.0039) [2024-06-23 23:09:23,379][15401] Updated weights for policy 0, policy_version 535030 (0.0028) [2024-06-23 23:09:23,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8765931520. Throughput: 0: 42977.8. Samples: 8765996520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 23:09:27,234][15401] Updated weights for policy 0, policy_version 535040 (0.0042) [2024-06-23 23:09:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8766144512. Throughput: 0: 43086.8. Samples: 8766260340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:28,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-23 23:09:30,976][15401] Updated weights for policy 0, policy_version 535050 (0.0027) [2024-06-23 23:09:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8766341120. Throughput: 0: 42990.9. Samples: 8766517280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-23 23:09:34,743][15401] Updated weights for policy 0, policy_version 535060 (0.0031) [2024-06-23 23:09:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8766570496. Throughput: 0: 43097.1. Samples: 8766643740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-23 23:09:38,401][15401] Updated weights for policy 0, policy_version 535070 (0.0039) [2024-06-23 23:09:42,256][15401] Updated weights for policy 0, policy_version 535080 (0.0027) [2024-06-23 23:09:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 8766799872. Throughput: 0: 43170.2. Samples: 8766906860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 23:09:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535083_8766799872.pth... [2024-06-23 23:09:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534453_8756477952.pth [2024-06-23 23:09:46,021][15401] Updated weights for policy 0, policy_version 535090 (0.0039) [2024-06-23 23:09:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8766980096. Throughput: 0: 43180.0. Samples: 8767168640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:48,391][15132] Avg episode reward: [(0, '0.655')] [2024-06-23 23:09:49,746][15401] Updated weights for policy 0, policy_version 535100 (0.0025) [2024-06-23 23:09:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.8, 300 sec: 42987.2). Total num frames: 8767225856. Throughput: 0: 43253.4. Samples: 8767293580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 23:09:53,634][15401] Updated weights for policy 0, policy_version 535110 (0.0032) [2024-06-23 23:09:57,398][15401] Updated weights for policy 0, policy_version 535120 (0.0043) [2024-06-23 23:09:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8767422464. Throughput: 0: 43028.7. Samples: 8767550700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:09:58,396][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 23:10:01,425][15401] Updated weights for policy 0, policy_version 535130 (0.0033) [2024-06-23 23:10:03,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8767619072. Throughput: 0: 43156.8. Samples: 8767814440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 23:10:04,915][15401] Updated weights for policy 0, policy_version 535140 (0.0027) [2024-06-23 23:10:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8767848448. Throughput: 0: 43271.7. Samples: 8767943740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 23:10:09,046][15401] Updated weights for policy 0, policy_version 535150 (0.0043) [2024-06-23 23:10:12,406][15401] Updated weights for policy 0, policy_version 535160 (0.0035) [2024-06-23 23:10:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 8768077824. Throughput: 0: 42968.4. Samples: 8768193920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 23:10:16,818][15401] Updated weights for policy 0, policy_version 535170 (0.0029) [2024-06-23 23:10:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8768274432. Throughput: 0: 43010.8. Samples: 8768452760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:18,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-23 23:10:20,107][15401] Updated weights for policy 0, policy_version 535180 (0.0036) [2024-06-23 23:10:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8768487424. Throughput: 0: 43042.6. Samples: 8768580660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:23,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-23 23:10:24,461][15401] Updated weights for policy 0, policy_version 535190 (0.0032) [2024-06-23 23:10:28,239][15401] Updated weights for policy 0, policy_version 535200 (0.0031) [2024-06-23 23:10:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8768716800. Throughput: 0: 42750.6. Samples: 8768830640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-23 23:10:32,005][15349] Signal inference workers to stop experience collection... (129950 times) [2024-06-23 23:10:32,006][15349] Signal inference workers to resume experience collection... (129950 times) [2024-06-23 23:10:32,054][15401] InferenceWorker_p0-w0: stopping experience collection (129950 times) [2024-06-23 23:10:32,054][15401] InferenceWorker_p0-w0: resuming experience collection (129950 times) [2024-06-23 23:10:32,149][15401] Updated weights for policy 0, policy_version 535210 (0.0042) [2024-06-23 23:10:33,390][15132] Fps is (10 sec: 44235.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8768929792. Throughput: 0: 42493.6. Samples: 8769080860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-23 23:10:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 23:10:36,140][15401] Updated weights for policy 0, policy_version 535220 (0.0038) [2024-06-23 23:10:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8769126400. Throughput: 0: 42719.1. Samples: 8769215940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:10:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 23:10:39,843][15401] Updated weights for policy 0, policy_version 535230 (0.0048) [2024-06-23 23:10:43,390][15132] Fps is (10 sec: 40961.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8769339392. Throughput: 0: 42680.3. Samples: 8769471320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:10:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 23:10:43,814][15401] Updated weights for policy 0, policy_version 535240 (0.0036) [2024-06-23 23:10:47,478][15401] Updated weights for policy 0, policy_version 535250 (0.0033) [2024-06-23 23:10:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 8769585152. Throughput: 0: 42329.9. Samples: 8769719280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:10:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-23 23:10:51,722][15401] Updated weights for policy 0, policy_version 535260 (0.0035) [2024-06-23 23:10:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8769765376. Throughput: 0: 42502.1. Samples: 8769856340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:10:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-23 23:10:54,998][15401] Updated weights for policy 0, policy_version 535270 (0.0029) [2024-06-23 23:10:58,392][15132] Fps is (10 sec: 37674.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 8769961984. Throughput: 0: 42589.3. Samples: 8770110540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:10:58,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 23:10:59,347][15401] Updated weights for policy 0, policy_version 535280 (0.0031) [2024-06-23 23:11:02,425][15401] Updated weights for policy 0, policy_version 535290 (0.0040) [2024-06-23 23:11:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42987.5). Total num frames: 8770224128. Throughput: 0: 42383.7. Samples: 8770360020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 23:11:06,936][15401] Updated weights for policy 0, policy_version 535300 (0.0034) [2024-06-23 23:11:08,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8770404352. Throughput: 0: 42546.2. Samples: 8770495240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 23:11:10,439][15401] Updated weights for policy 0, policy_version 535310 (0.0039) [2024-06-23 23:11:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8770617344. Throughput: 0: 42539.1. Samples: 8770744900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-23 23:11:14,662][15401] Updated weights for policy 0, policy_version 535320 (0.0033) [2024-06-23 23:11:18,103][15401] Updated weights for policy 0, policy_version 535330 (0.0046) [2024-06-23 23:11:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 8770863104. Throughput: 0: 42622.2. Samples: 8770998840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-23 23:11:22,299][15401] Updated weights for policy 0, policy_version 535340 (0.0030) [2024-06-23 23:11:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8771059712. Throughput: 0: 42607.0. Samples: 8771133260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:23,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-23 23:11:25,756][15401] Updated weights for policy 0, policy_version 535350 (0.0031) [2024-06-23 23:11:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8771272704. Throughput: 0: 42557.9. Samples: 8771386420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 23:11:30,039][15401] Updated weights for policy 0, policy_version 535360 (0.0032) [2024-06-23 23:11:33,302][15401] Updated weights for policy 0, policy_version 535370 (0.0024) [2024-06-23 23:11:33,391][15132] Fps is (10 sec: 44230.4, 60 sec: 42870.7, 300 sec: 42931.8). Total num frames: 8771502080. Throughput: 0: 42718.6. Samples: 8771641680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:33,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 23:11:38,067][15401] Updated weights for policy 0, policy_version 535380 (0.0029) [2024-06-23 23:11:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8771682304. Throughput: 0: 42574.3. Samples: 8771772180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 23:11:40,859][15401] Updated weights for policy 0, policy_version 535390 (0.0043) [2024-06-23 23:11:43,390][15132] Fps is (10 sec: 40965.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8771911680. Throughput: 0: 42549.2. Samples: 8772025160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:43,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-23 23:11:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535396_8771928064.pth... [2024-06-23 23:11:43,525][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000534770_8761671680.pth [2024-06-23 23:11:45,584][15401] Updated weights for policy 0, policy_version 535400 (0.0039) [2024-06-23 23:11:48,388][15401] Updated weights for policy 0, policy_version 535410 (0.0042) [2024-06-23 23:11:48,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 8772157440. Throughput: 0: 42665.7. Samples: 8772279980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 23:11:53,077][15401] Updated weights for policy 0, policy_version 535420 (0.0033) [2024-06-23 23:11:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8772321280. Throughput: 0: 42589.7. Samples: 8772411780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-23 23:11:56,234][15401] Updated weights for policy 0, policy_version 535430 (0.0036) [2024-06-23 23:11:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 8772550656. Throughput: 0: 42725.2. Samples: 8772667540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:11:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-23 23:11:59,593][15349] Signal inference workers to stop experience collection... (130000 times) [2024-06-23 23:11:59,600][15349] Signal inference workers to resume experience collection... (130000 times) [2024-06-23 23:11:59,633][15401] InferenceWorker_p0-w0: stopping experience collection (130000 times) [2024-06-23 23:11:59,634][15401] InferenceWorker_p0-w0: resuming experience collection (130000 times) [2024-06-23 23:12:00,470][15401] Updated weights for policy 0, policy_version 535440 (0.0033) [2024-06-23 23:12:03,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 8772780032. Throughput: 0: 42845.2. Samples: 8772926980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-23 23:12:03,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 23:12:03,956][15401] Updated weights for policy 0, policy_version 535450 (0.0023) [2024-06-23 23:12:07,979][15401] Updated weights for policy 0, policy_version 535460 (0.0030) [2024-06-23 23:12:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8772976640. Throughput: 0: 42651.1. Samples: 8773052560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 23:12:11,474][15401] Updated weights for policy 0, policy_version 535470 (0.0044) [2024-06-23 23:12:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8773189632. Throughput: 0: 42709.7. Samples: 8773308360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:13,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 23:12:15,534][15401] Updated weights for policy 0, policy_version 535480 (0.0036) [2024-06-23 23:12:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8773386240. Throughput: 0: 42829.0. Samples: 8773568920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-23 23:12:19,315][15401] Updated weights for policy 0, policy_version 535490 (0.0034) [2024-06-23 23:12:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 8773615616. Throughput: 0: 42751.6. Samples: 8773696000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-23 23:12:23,523][15401] Updated weights for policy 0, policy_version 535500 (0.0029) [2024-06-23 23:12:26,830][15401] Updated weights for policy 0, policy_version 535510 (0.0024) [2024-06-23 23:12:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 8773844992. Throughput: 0: 42861.8. Samples: 8773953940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 23:12:30,990][15401] Updated weights for policy 0, policy_version 535520 (0.0029) [2024-06-23 23:12:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42326.3, 300 sec: 42765.4). Total num frames: 8774041600. Throughput: 0: 43106.1. Samples: 8774219760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:33,396][15132] Avg episode reward: [(0, '0.795')] [2024-06-23 23:12:34,536][15401] Updated weights for policy 0, policy_version 535530 (0.0038) [2024-06-23 23:12:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8774270976. Throughput: 0: 42882.8. Samples: 8774341500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 23:12:38,530][15401] Updated weights for policy 0, policy_version 535540 (0.0030) [2024-06-23 23:12:42,221][15401] Updated weights for policy 0, policy_version 535550 (0.0036) [2024-06-23 23:12:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8774483968. Throughput: 0: 42930.8. Samples: 8774599420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 23:12:46,017][15401] Updated weights for policy 0, policy_version 535560 (0.0030) [2024-06-23 23:12:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 8774680576. Throughput: 0: 42993.5. Samples: 8774861580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 23:12:49,905][15401] Updated weights for policy 0, policy_version 535570 (0.0030) [2024-06-23 23:12:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8774909952. Throughput: 0: 42816.9. Samples: 8774979320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 23:12:53,698][15401] Updated weights for policy 0, policy_version 535580 (0.0032) [2024-06-23 23:12:57,542][15401] Updated weights for policy 0, policy_version 535590 (0.0037) [2024-06-23 23:12:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8775122944. Throughput: 0: 42958.7. Samples: 8775241500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:12:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-23 23:12:58,597][15349] Signal inference workers to stop experience collection... (130050 times) [2024-06-23 23:12:58,652][15349] Signal inference workers to resume experience collection... (130050 times) [2024-06-23 23:12:58,652][15401] InferenceWorker_p0-w0: stopping experience collection (130050 times) [2024-06-23 23:12:58,664][15401] InferenceWorker_p0-w0: resuming experience collection (130050 times) [2024-06-23 23:13:01,836][15401] Updated weights for policy 0, policy_version 535600 (0.0038) [2024-06-23 23:13:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 8775303168. Throughput: 0: 42868.5. Samples: 8775498000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 23:13:05,463][15401] Updated weights for policy 0, policy_version 535610 (0.0033) [2024-06-23 23:13:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8775548928. Throughput: 0: 42796.1. Samples: 8775621820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 23:13:09,255][15401] Updated weights for policy 0, policy_version 535620 (0.0036) [2024-06-23 23:13:13,337][15401] Updated weights for policy 0, policy_version 535630 (0.0045) [2024-06-23 23:13:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8775761920. Throughput: 0: 42785.3. Samples: 8775879280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 23:13:16,918][15401] Updated weights for policy 0, policy_version 535640 (0.0030) [2024-06-23 23:13:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8775958528. Throughput: 0: 42597.4. Samples: 8776136640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 23:13:20,930][15401] Updated weights for policy 0, policy_version 535650 (0.0037) [2024-06-23 23:13:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8776187904. Throughput: 0: 42567.9. Samples: 8776257160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:23,393][15132] Avg episode reward: [(0, '0.266')] [2024-06-23 23:13:25,035][15401] Updated weights for policy 0, policy_version 535660 (0.0027) [2024-06-23 23:13:28,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 8776400896. Throughput: 0: 42640.2. Samples: 8776518500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:28,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-23 23:13:28,809][15401] Updated weights for policy 0, policy_version 535670 (0.0034) [2024-06-23 23:13:32,667][15401] Updated weights for policy 0, policy_version 535680 (0.0030) [2024-06-23 23:13:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8776597504. Throughput: 0: 42472.7. Samples: 8776772860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-23 23:13:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-23 23:13:36,393][15401] Updated weights for policy 0, policy_version 535690 (0.0044) [2024-06-23 23:13:38,390][15132] Fps is (10 sec: 44262.7, 60 sec: 42871.1, 300 sec: 42820.5). Total num frames: 8776843264. Throughput: 0: 42662.6. Samples: 8776899160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:13:38,391][15132] Avg episode reward: [(0, '0.565')] [2024-06-23 23:13:40,160][15401] Updated weights for policy 0, policy_version 535700 (0.0024) [2024-06-23 23:13:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8777039872. Throughput: 0: 42680.4. Samples: 8777162120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:13:43,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-23 23:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535708_8777039872.pth... [2024-06-23 23:13:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535083_8766799872.pth [2024-06-23 23:13:43,839][15401] Updated weights for policy 0, policy_version 535710 (0.0034) [2024-06-23 23:13:48,083][15401] Updated weights for policy 0, policy_version 535720 (0.0043) [2024-06-23 23:13:48,393][15132] Fps is (10 sec: 39309.6, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 8777236480. Throughput: 0: 42397.8. Samples: 8777406060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:13:48,394][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 23:13:51,584][15401] Updated weights for policy 0, policy_version 535730 (0.0040) [2024-06-23 23:13:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8777465856. Throughput: 0: 42407.3. Samples: 8777530160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:13:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-23 23:13:55,683][15401] Updated weights for policy 0, policy_version 535740 (0.0030) [2024-06-23 23:13:58,392][15132] Fps is (10 sec: 44242.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8777678848. Throughput: 0: 42535.6. Samples: 8777793480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:13:58,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 23:13:59,468][15401] Updated weights for policy 0, policy_version 535750 (0.0041) [2024-06-23 23:14:03,209][15401] Updated weights for policy 0, policy_version 535760 (0.0029) [2024-06-23 23:14:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8777891840. Throughput: 0: 42453.8. Samples: 8778047060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-23 23:14:06,953][15401] Updated weights for policy 0, policy_version 535770 (0.0037) [2024-06-23 23:14:08,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8778104832. Throughput: 0: 42550.3. Samples: 8778171920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:08,392][15132] Avg episode reward: [(0, '0.847')] [2024-06-23 23:14:11,122][15401] Updated weights for policy 0, policy_version 535780 (0.0033) [2024-06-23 23:14:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 8778317824. Throughput: 0: 42694.9. Samples: 8778439600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:13,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 23:14:14,575][15401] Updated weights for policy 0, policy_version 535790 (0.0029) [2024-06-23 23:14:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8778530816. Throughput: 0: 42585.9. Samples: 8778689220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 23:14:18,689][15401] Updated weights for policy 0, policy_version 535800 (0.0027) [2024-06-23 23:14:22,480][15401] Updated weights for policy 0, policy_version 535810 (0.0032) [2024-06-23 23:14:23,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8778760192. Throughput: 0: 42650.3. Samples: 8778818400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-23 23:14:26,321][15401] Updated weights for policy 0, policy_version 535820 (0.0024) [2024-06-23 23:14:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 8778940416. Throughput: 0: 42579.7. Samples: 8779078200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 23:14:29,901][15349] Signal inference workers to stop experience collection... (130100 times) [2024-06-23 23:14:29,902][15349] Signal inference workers to resume experience collection... (130100 times) [2024-06-23 23:14:29,915][15401] InferenceWorker_p0-w0: stopping experience collection (130100 times) [2024-06-23 23:14:29,915][15401] InferenceWorker_p0-w0: resuming experience collection (130100 times) [2024-06-23 23:14:30,042][15401] Updated weights for policy 0, policy_version 535830 (0.0042) [2024-06-23 23:14:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8779169792. Throughput: 0: 42811.8. Samples: 8779332440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 23:14:33,929][15401] Updated weights for policy 0, policy_version 535840 (0.0041) [2024-06-23 23:14:37,623][15401] Updated weights for policy 0, policy_version 535850 (0.0026) [2024-06-23 23:14:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 8779399168. Throughput: 0: 42862.4. Samples: 8779458960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 23:14:41,578][15401] Updated weights for policy 0, policy_version 535860 (0.0035) [2024-06-23 23:14:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8779595776. Throughput: 0: 42771.6. Samples: 8779718100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:43,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-23 23:14:45,600][15401] Updated weights for policy 0, policy_version 535870 (0.0025) [2024-06-23 23:14:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42874.0, 300 sec: 42653.9). Total num frames: 8779808768. Throughput: 0: 42682.6. Samples: 8779967780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 23:14:49,238][15401] Updated weights for policy 0, policy_version 535880 (0.0029) [2024-06-23 23:14:53,129][15401] Updated weights for policy 0, policy_version 535890 (0.0032) [2024-06-23 23:14:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 8780021760. Throughput: 0: 42848.9. Samples: 8780100120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:53,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 23:14:56,915][15401] Updated weights for policy 0, policy_version 535900 (0.0028) [2024-06-23 23:14:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8780234752. Throughput: 0: 42492.9. Samples: 8780351680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:14:58,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-23 23:15:00,906][15401] Updated weights for policy 0, policy_version 535910 (0.0030) [2024-06-23 23:15:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8780447744. Throughput: 0: 42655.5. Samples: 8780608720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-23 23:15:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-23 23:15:04,597][15401] Updated weights for policy 0, policy_version 535920 (0.0040) [2024-06-23 23:15:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8780660736. Throughput: 0: 42701.9. Samples: 8780739980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 23:15:08,496][15401] Updated weights for policy 0, policy_version 535930 (0.0032) [2024-06-23 23:15:12,267][15401] Updated weights for policy 0, policy_version 535940 (0.0034) [2024-06-23 23:15:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 8780873728. Throughput: 0: 42637.2. Samples: 8780996880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 23:15:16,110][15401] Updated weights for policy 0, policy_version 535950 (0.0030) [2024-06-23 23:15:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8781103104. Throughput: 0: 42616.9. Samples: 8781250300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:18,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-23 23:15:20,113][15401] Updated weights for policy 0, policy_version 535960 (0.0034) [2024-06-23 23:15:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 8781283328. Throughput: 0: 42647.9. Samples: 8781378120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:23,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-23 23:15:23,907][15401] Updated weights for policy 0, policy_version 535970 (0.0028) [2024-06-23 23:15:27,914][15401] Updated weights for policy 0, policy_version 535980 (0.0029) [2024-06-23 23:15:28,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42598.5). Total num frames: 8781496320. Throughput: 0: 42440.5. Samples: 8781627920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 23:15:31,635][15401] Updated weights for policy 0, policy_version 535990 (0.0034) [2024-06-23 23:15:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8781742080. Throughput: 0: 42708.0. Samples: 8781889640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 23:15:35,427][15401] Updated weights for policy 0, policy_version 536000 (0.0031) [2024-06-23 23:15:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 8781905920. Throughput: 0: 42700.0. Samples: 8782021520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 23:15:39,230][15401] Updated weights for policy 0, policy_version 536010 (0.0033) [2024-06-23 23:15:42,933][15401] Updated weights for policy 0, policy_version 536020 (0.0043) [2024-06-23 23:15:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8782151680. Throughput: 0: 42523.2. Samples: 8782265220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 23:15:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536020_8782151680.pth... [2024-06-23 23:15:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535396_8771928064.pth [2024-06-23 23:15:44,498][15349] Signal inference workers to stop experience collection... (130150 times) [2024-06-23 23:15:44,548][15401] InferenceWorker_p0-w0: stopping experience collection (130150 times) [2024-06-23 23:15:44,558][15349] Signal inference workers to resume experience collection... (130150 times) [2024-06-23 23:15:44,570][15401] InferenceWorker_p0-w0: resuming experience collection (130150 times) [2024-06-23 23:15:46,834][15401] Updated weights for policy 0, policy_version 536030 (0.0038) [2024-06-23 23:15:48,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8782381056. Throughput: 0: 42812.5. Samples: 8782535280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 23:15:50,401][15401] Updated weights for policy 0, policy_version 536040 (0.0028) [2024-06-23 23:15:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 8782561280. Throughput: 0: 42774.6. Samples: 8782664840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 23:15:54,229][15401] Updated weights for policy 0, policy_version 536050 (0.0037) [2024-06-23 23:15:58,111][15401] Updated weights for policy 0, policy_version 536060 (0.0035) [2024-06-23 23:15:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 8782807040. Throughput: 0: 42602.8. Samples: 8782914100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:15:58,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 23:16:01,886][15401] Updated weights for policy 0, policy_version 536070 (0.0039) [2024-06-23 23:16:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8783003648. Throughput: 0: 42791.6. Samples: 8783175820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 23:16:05,709][15401] Updated weights for policy 0, policy_version 536080 (0.0030) [2024-06-23 23:16:08,390][15132] Fps is (10 sec: 39330.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8783200256. Throughput: 0: 42770.2. Samples: 8783302780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:08,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 23:16:09,669][15401] Updated weights for policy 0, policy_version 536090 (0.0037) [2024-06-23 23:16:13,128][15401] Updated weights for policy 0, policy_version 536100 (0.0044) [2024-06-23 23:16:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 8783462400. Throughput: 0: 42828.5. Samples: 8783555200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:13,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-23 23:16:17,439][15401] Updated weights for policy 0, policy_version 536110 (0.0031) [2024-06-23 23:16:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 8783659008. Throughput: 0: 42821.0. Samples: 8783816580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:18,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-23 23:16:21,258][15401] Updated weights for policy 0, policy_version 536120 (0.0038) [2024-06-23 23:16:23,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42596.8, 300 sec: 42598.0). Total num frames: 8783839232. Throughput: 0: 42665.8. Samples: 8783941580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:23,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 23:16:24,940][15401] Updated weights for policy 0, policy_version 536130 (0.0031) [2024-06-23 23:16:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.8, 300 sec: 42709.3). Total num frames: 8784101376. Throughput: 0: 42996.7. Samples: 8784200180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:28,393][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 23:16:28,756][15401] Updated weights for policy 0, policy_version 536140 (0.0035) [2024-06-23 23:16:32,433][15401] Updated weights for policy 0, policy_version 536150 (0.0035) [2024-06-23 23:16:33,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8784297984. Throughput: 0: 42977.3. Samples: 8784469260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-23 23:16:33,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-23 23:16:36,340][15401] Updated weights for policy 0, policy_version 536160 (0.0039) [2024-06-23 23:16:38,390][15132] Fps is (10 sec: 37692.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8784478208. Throughput: 0: 42869.4. Samples: 8784593960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:16:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 23:16:40,167][15401] Updated weights for policy 0, policy_version 536170 (0.0034) [2024-06-23 23:16:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 8784740352. Throughput: 0: 43074.4. Samples: 8784852340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:16:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-23 23:16:43,907][15401] Updated weights for policy 0, policy_version 536180 (0.0041) [2024-06-23 23:16:47,877][15401] Updated weights for policy 0, policy_version 536190 (0.0028) [2024-06-23 23:16:48,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8784969728. Throughput: 0: 42999.4. Samples: 8785110800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:16:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 23:16:51,489][15401] Updated weights for policy 0, policy_version 536200 (0.0029) [2024-06-23 23:16:53,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8785117184. Throughput: 0: 42951.3. Samples: 8785235580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:16:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-23 23:16:55,403][15349] Signal inference workers to stop experience collection... (130200 times) [2024-06-23 23:16:55,442][15401] InferenceWorker_p0-w0: stopping experience collection (130200 times) [2024-06-23 23:16:55,453][15349] Signal inference workers to resume experience collection... (130200 times) [2024-06-23 23:16:55,463][15401] InferenceWorker_p0-w0: resuming experience collection (130200 times) [2024-06-23 23:16:55,592][15401] Updated weights for policy 0, policy_version 536210 (0.0033) [2024-06-23 23:16:58,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 8785395712. Throughput: 0: 43023.6. Samples: 8785491260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:16:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 23:16:59,606][15401] Updated weights for policy 0, policy_version 536220 (0.0026) [2024-06-23 23:17:03,090][15401] Updated weights for policy 0, policy_version 536230 (0.0029) [2024-06-23 23:17:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8785592320. Throughput: 0: 43033.4. Samples: 8785753080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 23:17:07,046][15401] Updated weights for policy 0, policy_version 536240 (0.0031) [2024-06-23 23:17:08,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8785772544. Throughput: 0: 43095.7. Samples: 8785880780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 23:17:10,931][15401] Updated weights for policy 0, policy_version 536250 (0.0023) [2024-06-23 23:17:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 8786051072. Throughput: 0: 42995.5. Samples: 8786134980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:13,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 23:17:14,379][15401] Updated weights for policy 0, policy_version 536260 (0.0032) [2024-06-23 23:17:18,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8786231296. Throughput: 0: 43079.1. Samples: 8786407920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:18,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 23:17:18,569][15401] Updated weights for policy 0, policy_version 536270 (0.0041) [2024-06-23 23:17:22,062][15401] Updated weights for policy 0, policy_version 536280 (0.0028) [2024-06-23 23:17:23,389][15132] Fps is (10 sec: 37692.6, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 8786427904. Throughput: 0: 42992.9. Samples: 8786528640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 23:17:25,894][15401] Updated weights for policy 0, policy_version 536290 (0.0025) [2024-06-23 23:17:28,389][15132] Fps is (10 sec: 45886.1, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 8786690048. Throughput: 0: 43086.1. Samples: 8786791220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 23:17:29,498][15401] Updated weights for policy 0, policy_version 536300 (0.0045) [2024-06-23 23:17:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8786886656. Throughput: 0: 43183.8. Samples: 8787054060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 23:17:33,489][15401] Updated weights for policy 0, policy_version 536310 (0.0049) [2024-06-23 23:17:37,324][15401] Updated weights for policy 0, policy_version 536320 (0.0043) [2024-06-23 23:17:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 8787099648. Throughput: 0: 43211.5. Samples: 8787180100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-23 23:17:40,960][15401] Updated weights for policy 0, policy_version 536330 (0.0035) [2024-06-23 23:17:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 8787345408. Throughput: 0: 43333.3. Samples: 8787441260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 23:17:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536338_8787361792.pth... [2024-06-23 23:17:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000535708_8777039872.pth [2024-06-23 23:17:44,717][15401] Updated weights for policy 0, policy_version 536340 (0.0046) [2024-06-23 23:17:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8787542016. Throughput: 0: 43543.1. Samples: 8787712520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-23 23:17:48,491][15401] Updated weights for policy 0, policy_version 536350 (0.0032) [2024-06-23 23:17:52,171][15401] Updated weights for policy 0, policy_version 536360 (0.0037) [2024-06-23 23:17:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 8787738624. Throughput: 0: 43428.9. Samples: 8787835080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-23 23:17:55,967][15401] Updated weights for policy 0, policy_version 536370 (0.0041) [2024-06-23 23:17:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 8788000768. Throughput: 0: 43621.9. Samples: 8788097860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-23 23:17:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 23:17:59,886][15401] Updated weights for policy 0, policy_version 536380 (0.0040) [2024-06-23 23:18:03,393][15132] Fps is (10 sec: 45857.8, 60 sec: 43414.8, 300 sec: 42875.5). Total num frames: 8788197376. Throughput: 0: 43364.4. Samples: 8788359380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:03,394][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 23:18:03,733][15401] Updated weights for policy 0, policy_version 536390 (0.0033) [2024-06-23 23:18:07,322][15401] Updated weights for policy 0, policy_version 536400 (0.0031) [2024-06-23 23:18:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 8788393984. Throughput: 0: 43498.1. Samples: 8788486060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 23:18:11,456][15401] Updated weights for policy 0, policy_version 536410 (0.0033) [2024-06-23 23:18:13,390][15132] Fps is (10 sec: 44253.0, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 8788639744. Throughput: 0: 43484.3. Samples: 8788748020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 23:18:14,707][15401] Updated weights for policy 0, policy_version 536420 (0.0022) [2024-06-23 23:18:17,753][15349] Signal inference workers to stop experience collection... (130250 times) [2024-06-23 23:18:17,755][15349] Signal inference workers to resume experience collection... (130250 times) [2024-06-23 23:18:17,766][15401] InferenceWorker_p0-w0: stopping experience collection (130250 times) [2024-06-23 23:18:17,800][15401] InferenceWorker_p0-w0: resuming experience collection (130250 times) [2024-06-23 23:18:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43419.4, 300 sec: 42876.5). Total num frames: 8788836352. Throughput: 0: 43437.8. Samples: 8789008760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-23 23:18:18,986][15401] Updated weights for policy 0, policy_version 536430 (0.0030) [2024-06-23 23:18:22,293][15401] Updated weights for policy 0, policy_version 536440 (0.0022) [2024-06-23 23:18:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43690.6, 300 sec: 42877.0). Total num frames: 8789049344. Throughput: 0: 43553.3. Samples: 8789140000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 23:18:26,892][15401] Updated weights for policy 0, policy_version 536450 (0.0047) [2024-06-23 23:18:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 8789295104. Throughput: 0: 43602.1. Samples: 8789403360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:28,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-23 23:18:29,801][15401] Updated weights for policy 0, policy_version 536460 (0.0045) [2024-06-23 23:18:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.2). Total num frames: 8789491712. Throughput: 0: 43154.6. Samples: 8789654480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 23:18:34,461][15401] Updated weights for policy 0, policy_version 536470 (0.0040) [2024-06-23 23:18:37,490][15401] Updated weights for policy 0, policy_version 536480 (0.0047) [2024-06-23 23:18:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8789704704. Throughput: 0: 43219.1. Samples: 8789779940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 23:18:41,929][15401] Updated weights for policy 0, policy_version 536490 (0.0036) [2024-06-23 23:18:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 43043.2). Total num frames: 8789934080. Throughput: 0: 43301.2. Samples: 8790046420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-23 23:18:45,242][15401] Updated weights for policy 0, policy_version 536500 (0.0036) [2024-06-23 23:18:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 8790130688. Throughput: 0: 43199.5. Samples: 8790303300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:48,393][15132] Avg episode reward: [(0, '0.270')] [2024-06-23 23:18:49,477][15401] Updated weights for policy 0, policy_version 536510 (0.0035) [2024-06-23 23:18:53,325][15401] Updated weights for policy 0, policy_version 536520 (0.0035) [2024-06-23 23:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 8790343680. Throughput: 0: 43141.5. Samples: 8790427420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 23:18:57,052][15401] Updated weights for policy 0, policy_version 536530 (0.0031) [2024-06-23 23:18:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8790573056. Throughput: 0: 43158.4. Samples: 8790690140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:18:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-23 23:19:00,787][15401] Updated weights for policy 0, policy_version 536540 (0.0033) [2024-06-23 23:19:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42874.0, 300 sec: 42932.0). Total num frames: 8790769664. Throughput: 0: 43138.5. Samples: 8790950000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 23:19:04,464][15401] Updated weights for policy 0, policy_version 536550 (0.0045) [2024-06-23 23:19:08,081][15401] Updated weights for policy 0, policy_version 536560 (0.0033) [2024-06-23 23:19:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.8, 300 sec: 43043.1). Total num frames: 8791015424. Throughput: 0: 42996.5. Samples: 8791074840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-23 23:19:11,834][15401] Updated weights for policy 0, policy_version 536570 (0.0021) [2024-06-23 23:19:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8791228416. Throughput: 0: 42973.4. Samples: 8791337160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-23 23:19:15,337][15401] Updated weights for policy 0, policy_version 536580 (0.0028) [2024-06-23 23:19:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8791425024. Throughput: 0: 43321.3. Samples: 8791603940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-23 23:19:19,286][15401] Updated weights for policy 0, policy_version 536590 (0.0033) [2024-06-23 23:19:22,821][15401] Updated weights for policy 0, policy_version 536600 (0.0043) [2024-06-23 23:19:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 43153.8). Total num frames: 8791670784. Throughput: 0: 43374.6. Samples: 8791731800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 23:19:26,793][15401] Updated weights for policy 0, policy_version 536610 (0.0026) [2024-06-23 23:19:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8791867392. Throughput: 0: 43228.4. Samples: 8791991700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-23 23:19:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-23 23:19:29,644][15349] Signal inference workers to stop experience collection... (130300 times) [2024-06-23 23:19:29,666][15401] InferenceWorker_p0-w0: stopping experience collection (130300 times) [2024-06-23 23:19:29,702][15349] Signal inference workers to resume experience collection... (130300 times) [2024-06-23 23:19:29,702][15401] InferenceWorker_p0-w0: resuming experience collection (130300 times) [2024-06-23 23:19:30,224][15401] Updated weights for policy 0, policy_version 536620 (0.0035) [2024-06-23 23:19:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8792080384. Throughput: 0: 43289.0. Samples: 8792251200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:33,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-23 23:19:34,705][15401] Updated weights for policy 0, policy_version 536630 (0.0034) [2024-06-23 23:19:38,100][15401] Updated weights for policy 0, policy_version 536640 (0.0035) [2024-06-23 23:19:38,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43688.9, 300 sec: 43153.4). Total num frames: 8792326144. Throughput: 0: 43452.8. Samples: 8792382900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:38,392][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 23:19:42,258][15401] Updated weights for policy 0, policy_version 536650 (0.0035) [2024-06-23 23:19:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 8792522752. Throughput: 0: 43433.2. Samples: 8792644640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:43,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-23 23:19:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536653_8792522752.pth... [2024-06-23 23:19:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536020_8782151680.pth [2024-06-23 23:19:45,603][15401] Updated weights for policy 0, policy_version 536660 (0.0028) [2024-06-23 23:19:48,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43419.4, 300 sec: 43098.6). Total num frames: 8792735744. Throughput: 0: 43284.6. Samples: 8792897800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 23:19:49,887][15401] Updated weights for policy 0, policy_version 536670 (0.0039) [2024-06-23 23:19:53,197][15401] Updated weights for policy 0, policy_version 536680 (0.0034) [2024-06-23 23:19:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.5, 300 sec: 43153.8). Total num frames: 8792965120. Throughput: 0: 43414.0. Samples: 8793028480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-23 23:19:57,616][15401] Updated weights for policy 0, policy_version 536690 (0.0032) [2024-06-23 23:19:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8793145344. Throughput: 0: 43331.0. Samples: 8793287060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:19:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 23:20:00,705][15401] Updated weights for policy 0, policy_version 536700 (0.0039) [2024-06-23 23:20:03,390][15132] Fps is (10 sec: 40960.7, 60 sec: 43417.7, 300 sec: 43098.2). Total num frames: 8793374720. Throughput: 0: 42795.5. Samples: 8793529740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:03,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-23 23:20:05,547][15401] Updated weights for policy 0, policy_version 536710 (0.0037) [2024-06-23 23:20:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 43098.3). Total num frames: 8793587712. Throughput: 0: 42995.1. Samples: 8793666580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:08,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-23 23:20:08,701][15401] Updated weights for policy 0, policy_version 536720 (0.0044) [2024-06-23 23:20:13,256][15401] Updated weights for policy 0, policy_version 536730 (0.0028) [2024-06-23 23:20:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 8793784320. Throughput: 0: 42924.6. Samples: 8793923300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 23:20:16,464][15401] Updated weights for policy 0, policy_version 536740 (0.0047) [2024-06-23 23:20:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 8794030080. Throughput: 0: 42743.6. Samples: 8794174660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 23:20:20,854][15401] Updated weights for policy 0, policy_version 536750 (0.0029) [2024-06-23 23:20:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 8794243072. Throughput: 0: 42815.1. Samples: 8794309480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-23 23:20:24,221][15401] Updated weights for policy 0, policy_version 536760 (0.0042) [2024-06-23 23:20:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8794423296. Throughput: 0: 42692.4. Samples: 8794565800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:28,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-23 23:20:28,412][15401] Updated weights for policy 0, policy_version 536770 (0.0028) [2024-06-23 23:20:31,994][15401] Updated weights for policy 0, policy_version 536780 (0.0039) [2024-06-23 23:20:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 43320.4). Total num frames: 8794685440. Throughput: 0: 42601.9. Samples: 8794814880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 23:20:36,388][15401] Updated weights for policy 0, policy_version 536790 (0.0038) [2024-06-23 23:20:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.0, 300 sec: 43153.8). Total num frames: 8794882048. Throughput: 0: 42809.4. Samples: 8794954900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-23 23:20:39,639][15401] Updated weights for policy 0, policy_version 536800 (0.0032) [2024-06-23 23:20:43,389][15132] Fps is (10 sec: 36044.4, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 8795045888. Throughput: 0: 42622.3. Samples: 8795205060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-23 23:20:44,001][15401] Updated weights for policy 0, policy_version 536810 (0.0035) [2024-06-23 23:20:47,172][15401] Updated weights for policy 0, policy_version 536820 (0.0028) [2024-06-23 23:20:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 43209.3). Total num frames: 8795308032. Throughput: 0: 42912.9. Samples: 8795460820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-23 23:20:51,494][15401] Updated weights for policy 0, policy_version 536830 (0.0035) [2024-06-23 23:20:52,365][15349] Signal inference workers to stop experience collection... (130350 times) [2024-06-23 23:20:52,367][15349] Signal inference workers to resume experience collection... (130350 times) [2024-06-23 23:20:52,412][15401] InferenceWorker_p0-w0: stopping experience collection (130350 times) [2024-06-23 23:20:52,412][15401] InferenceWorker_p0-w0: resuming experience collection (130350 times) [2024-06-23 23:20:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.5, 300 sec: 43043.1). Total num frames: 8795504640. Throughput: 0: 42905.3. Samples: 8795597320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 23:20:54,907][15401] Updated weights for policy 0, policy_version 536840 (0.0036) [2024-06-23 23:20:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 8795717632. Throughput: 0: 42756.4. Samples: 8795847340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:20:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 23:20:59,507][15401] Updated weights for policy 0, policy_version 536850 (0.0032) [2024-06-23 23:21:02,480][15401] Updated weights for policy 0, policy_version 536860 (0.0030) [2024-06-23 23:21:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 43264.9). Total num frames: 8795963392. Throughput: 0: 42808.5. Samples: 8796101040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:03,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-23 23:21:07,142][15401] Updated weights for policy 0, policy_version 536870 (0.0033) [2024-06-23 23:21:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 8796160000. Throughput: 0: 42800.0. Samples: 8796235580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:08,393][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 23:21:10,149][15401] Updated weights for policy 0, policy_version 536880 (0.0031) [2024-06-23 23:21:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 8796372992. Throughput: 0: 42699.2. Samples: 8796487260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:13,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-23 23:21:14,677][15401] Updated weights for policy 0, policy_version 536890 (0.0040) [2024-06-23 23:21:17,630][15401] Updated weights for policy 0, policy_version 536900 (0.0031) [2024-06-23 23:21:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 43265.2). Total num frames: 8796602368. Throughput: 0: 42866.2. Samples: 8796743860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 23:21:22,184][15401] Updated weights for policy 0, policy_version 536910 (0.0022) [2024-06-23 23:21:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 8796782592. Throughput: 0: 42807.2. Samples: 8796881220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 23:21:25,291][15401] Updated weights for policy 0, policy_version 536920 (0.0029) [2024-06-23 23:21:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 8797011968. Throughput: 0: 42808.9. Samples: 8797131460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:28,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-23 23:21:29,689][15401] Updated weights for policy 0, policy_version 536930 (0.0033) [2024-06-23 23:21:33,006][15401] Updated weights for policy 0, policy_version 536940 (0.0032) [2024-06-23 23:21:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.2, 300 sec: 43264.8). Total num frames: 8797241344. Throughput: 0: 42809.6. Samples: 8797387260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-23 23:21:37,202][15401] Updated weights for policy 0, policy_version 536950 (0.0035) [2024-06-23 23:21:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 8797421568. Throughput: 0: 42788.9. Samples: 8797522820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-23 23:21:40,869][15401] Updated weights for policy 0, policy_version 536960 (0.0049) [2024-06-23 23:21:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 8797667328. Throughput: 0: 42836.0. Samples: 8797774960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 23:21:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536967_8797667328.pth... [2024-06-23 23:21:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536338_8787361792.pth [2024-06-23 23:21:44,768][15401] Updated weights for policy 0, policy_version 536970 (0.0026) [2024-06-23 23:21:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 8797863936. Throughput: 0: 42933.7. Samples: 8798033060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 23:21:48,540][15401] Updated weights for policy 0, policy_version 536980 (0.0034) [2024-06-23 23:21:52,331][15401] Updated weights for policy 0, policy_version 536990 (0.0038) [2024-06-23 23:21:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8798076928. Throughput: 0: 42870.7. Samples: 8798164660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 23:21:55,953][15401] Updated weights for policy 0, policy_version 537000 (0.0043) [2024-06-23 23:21:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 43097.9). Total num frames: 8798306304. Throughput: 0: 42927.5. Samples: 8798419100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:21:58,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-23 23:21:59,962][15401] Updated weights for policy 0, policy_version 537010 (0.0033) [2024-06-23 23:22:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 43209.3). Total num frames: 8798519296. Throughput: 0: 42895.6. Samples: 8798674160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 23:22:03,506][15401] Updated weights for policy 0, policy_version 537020 (0.0038) [2024-06-23 23:22:07,640][15401] Updated weights for policy 0, policy_version 537030 (0.0033) [2024-06-23 23:22:08,394][15132] Fps is (10 sec: 42591.0, 60 sec: 42870.2, 300 sec: 42986.9). Total num frames: 8798732288. Throughput: 0: 42685.5. Samples: 8798802240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:08,394][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 23:22:11,168][15401] Updated weights for policy 0, policy_version 537040 (0.0031) [2024-06-23 23:22:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 43098.6). Total num frames: 8798945280. Throughput: 0: 42863.0. Samples: 8799060300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-23 23:22:15,394][15401] Updated weights for policy 0, policy_version 537050 (0.0036) [2024-06-23 23:22:18,390][15132] Fps is (10 sec: 42615.1, 60 sec: 42598.2, 300 sec: 43153.7). Total num frames: 8799158272. Throughput: 0: 42783.9. Samples: 8799312540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:18,400][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 23:22:18,715][15401] Updated weights for policy 0, policy_version 537060 (0.0028) [2024-06-23 23:22:22,993][15401] Updated weights for policy 0, policy_version 537070 (0.0037) [2024-06-23 23:22:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8799354880. Throughput: 0: 42656.5. Samples: 8799442360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 23:22:26,596][15401] Updated weights for policy 0, policy_version 537080 (0.0048) [2024-06-23 23:22:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 8799600640. Throughput: 0: 42839.6. Samples: 8799702740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-23 23:22:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-23 23:22:30,550][15401] Updated weights for policy 0, policy_version 537090 (0.0045) [2024-06-23 23:22:32,547][15349] Signal inference workers to stop experience collection... (130400 times) [2024-06-23 23:22:32,547][15349] Signal inference workers to resume experience collection... (130400 times) [2024-06-23 23:22:32,566][15401] InferenceWorker_p0-w0: stopping experience collection (130400 times) [2024-06-23 23:22:32,566][15401] InferenceWorker_p0-w0: resuming experience collection (130400 times) [2024-06-23 23:22:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 8799797248. Throughput: 0: 42769.8. Samples: 8799957700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:33,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-23 23:22:34,201][15401] Updated weights for policy 0, policy_version 537100 (0.0024) [2024-06-23 23:22:38,224][15401] Updated weights for policy 0, policy_version 537110 (0.0045) [2024-06-23 23:22:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8800010240. Throughput: 0: 42654.8. Samples: 8800084120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 23:22:42,081][15401] Updated weights for policy 0, policy_version 537120 (0.0043) [2024-06-23 23:22:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8800239616. Throughput: 0: 42776.1. Samples: 8800343920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-23 23:22:45,945][15401] Updated weights for policy 0, policy_version 537130 (0.0037) [2024-06-23 23:22:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8800436224. Throughput: 0: 42825.2. Samples: 8800601300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-23 23:22:49,531][15401] Updated weights for policy 0, policy_version 537140 (0.0047) [2024-06-23 23:22:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8800649216. Throughput: 0: 42890.7. Samples: 8800732140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-23 23:22:53,449][15401] Updated weights for policy 0, policy_version 537150 (0.0038) [2024-06-23 23:22:57,150][15401] Updated weights for policy 0, policy_version 537160 (0.0027) [2024-06-23 23:22:58,394][15132] Fps is (10 sec: 44218.3, 60 sec: 42870.2, 300 sec: 42987.1). Total num frames: 8800878592. Throughput: 0: 42806.2. Samples: 8800986760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:22:58,395][15132] Avg episode reward: [(0, '0.332')] [2024-06-23 23:23:01,361][15401] Updated weights for policy 0, policy_version 537170 (0.0042) [2024-06-23 23:23:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42986.8). Total num frames: 8801075200. Throughput: 0: 42857.5. Samples: 8801241220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:03,401][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 23:23:04,882][15401] Updated weights for policy 0, policy_version 537180 (0.0027) [2024-06-23 23:23:08,390][15132] Fps is (10 sec: 40976.9, 60 sec: 42601.3, 300 sec: 42876.1). Total num frames: 8801288192. Throughput: 0: 42946.1. Samples: 8801374940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 23:23:08,761][15401] Updated weights for policy 0, policy_version 537190 (0.0032) [2024-06-23 23:23:12,403][15401] Updated weights for policy 0, policy_version 537200 (0.0031) [2024-06-23 23:23:13,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 8801501184. Throughput: 0: 42874.2. Samples: 8801632180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:13,393][15132] Avg episode reward: [(0, '0.260')] [2024-06-23 23:23:16,162][15401] Updated weights for policy 0, policy_version 537210 (0.0043) [2024-06-23 23:23:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.7, 300 sec: 42987.2). Total num frames: 8801730560. Throughput: 0: 43015.2. Samples: 8801893380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:18,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-23 23:23:19,958][15401] Updated weights for policy 0, policy_version 537220 (0.0033) [2024-06-23 23:23:23,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8801943552. Throughput: 0: 43079.4. Samples: 8802022700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-23 23:23:23,627][15401] Updated weights for policy 0, policy_version 537230 (0.0034) [2024-06-23 23:23:27,362][15401] Updated weights for policy 0, policy_version 537240 (0.0029) [2024-06-23 23:23:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8802140160. Throughput: 0: 42914.2. Samples: 8802275060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-23 23:23:31,618][15401] Updated weights for policy 0, policy_version 537250 (0.0041) [2024-06-23 23:23:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8802353152. Throughput: 0: 43031.2. Samples: 8802537700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-23 23:23:34,957][15401] Updated weights for policy 0, policy_version 537260 (0.0032) [2024-06-23 23:23:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 8802598912. Throughput: 0: 43015.8. Samples: 8802667860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:38,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-23 23:23:39,107][15401] Updated weights for policy 0, policy_version 537270 (0.0038) [2024-06-23 23:23:42,460][15401] Updated weights for policy 0, policy_version 537280 (0.0043) [2024-06-23 23:23:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42932.0). Total num frames: 8802795520. Throughput: 0: 42953.3. Samples: 8802919480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:43,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-23 23:23:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537281_8802811904.pth... [2024-06-23 23:23:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536653_8792522752.pth [2024-06-23 23:23:46,559][15401] Updated weights for policy 0, policy_version 537290 (0.0037) [2024-06-23 23:23:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8802992128. Throughput: 0: 43236.5. Samples: 8803186760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 23:23:50,316][15401] Updated weights for policy 0, policy_version 537300 (0.0022) [2024-06-23 23:23:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8803237888. Throughput: 0: 42920.1. Samples: 8803306340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 23:23:53,960][15401] Updated weights for policy 0, policy_version 537310 (0.0033) [2024-06-23 23:23:57,932][15401] Updated weights for policy 0, policy_version 537320 (0.0031) [2024-06-23 23:23:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42872.8, 300 sec: 42986.8). Total num frames: 8803450880. Throughput: 0: 42993.4. Samples: 8803566880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:23:58,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 23:24:01,166][15349] Signal inference workers to stop experience collection... (130450 times) [2024-06-23 23:24:01,169][15349] Signal inference workers to resume experience collection... (130450 times) [2024-06-23 23:24:01,187][15401] InferenceWorker_p0-w0: stopping experience collection (130450 times) [2024-06-23 23:24:01,188][15401] InferenceWorker_p0-w0: resuming experience collection (130450 times) [2024-06-23 23:24:02,098][15401] Updated weights for policy 0, policy_version 537330 (0.0046) [2024-06-23 23:24:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 8803647488. Throughput: 0: 42901.7. Samples: 8803823960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-23 23:24:05,751][15401] Updated weights for policy 0, policy_version 537340 (0.0044) [2024-06-23 23:24:08,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8803876864. Throughput: 0: 42828.9. Samples: 8803950000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:08,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-23 23:24:09,594][15401] Updated weights for policy 0, policy_version 537350 (0.0031) [2024-06-23 23:24:13,375][15401] Updated weights for policy 0, policy_version 537360 (0.0033) [2024-06-23 23:24:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43419.3, 300 sec: 42987.2). Total num frames: 8804106240. Throughput: 0: 43028.8. Samples: 8804211360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 23:24:17,331][15401] Updated weights for policy 0, policy_version 537370 (0.0027) [2024-06-23 23:24:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 8804302848. Throughput: 0: 42888.3. Samples: 8804467780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:18,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 23:24:21,035][15401] Updated weights for policy 0, policy_version 537380 (0.0038) [2024-06-23 23:24:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8804515840. Throughput: 0: 42680.1. Samples: 8804588460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-23 23:24:24,987][15401] Updated weights for policy 0, policy_version 537390 (0.0036) [2024-06-23 23:24:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8804728832. Throughput: 0: 42915.5. Samples: 8804850680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-23 23:24:28,734][15401] Updated weights for policy 0, policy_version 537400 (0.0036) [2024-06-23 23:24:32,988][15401] Updated weights for policy 0, policy_version 537410 (0.0035) [2024-06-23 23:24:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 8804941824. Throughput: 0: 42607.1. Samples: 8805104080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 23:24:36,377][15401] Updated weights for policy 0, policy_version 537420 (0.0030) [2024-06-23 23:24:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8805154816. Throughput: 0: 42785.4. Samples: 8805231680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 23:24:40,527][15401] Updated weights for policy 0, policy_version 537430 (0.0039) [2024-06-23 23:24:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8805367808. Throughput: 0: 42795.7. Samples: 8805492580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 23:24:43,968][15401] Updated weights for policy 0, policy_version 537440 (0.0032) [2024-06-23 23:24:47,947][15401] Updated weights for policy 0, policy_version 537450 (0.0027) [2024-06-23 23:24:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8805580800. Throughput: 0: 42680.0. Samples: 8805744560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 23:24:51,632][15401] Updated weights for policy 0, policy_version 537460 (0.0036) [2024-06-23 23:24:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8805793792. Throughput: 0: 42792.6. Samples: 8805875660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 23:24:55,541][15401] Updated weights for policy 0, policy_version 537470 (0.0026) [2024-06-23 23:24:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 8805990400. Throughput: 0: 42710.5. Samples: 8806133340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:24:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-23 23:24:59,858][15401] Updated weights for policy 0, policy_version 537480 (0.0037) [2024-06-23 23:25:03,022][15401] Updated weights for policy 0, policy_version 537490 (0.0027) [2024-06-23 23:25:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8806236160. Throughput: 0: 42482.7. Samples: 8806379400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-23 23:25:07,373][15401] Updated weights for policy 0, policy_version 537500 (0.0024) [2024-06-23 23:25:08,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 8806432768. Throughput: 0: 42900.5. Samples: 8806519080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:08,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 23:25:10,893][15401] Updated weights for policy 0, policy_version 537510 (0.0032) [2024-06-23 23:25:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8806662144. Throughput: 0: 42950.7. Samples: 8806783460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 23:25:14,919][15401] Updated weights for policy 0, policy_version 537520 (0.0032) [2024-06-23 23:25:16,358][15349] Signal inference workers to stop experience collection... (130500 times) [2024-06-23 23:25:16,407][15401] InferenceWorker_p0-w0: stopping experience collection (130500 times) [2024-06-23 23:25:16,407][15349] Signal inference workers to resume experience collection... (130500 times) [2024-06-23 23:25:16,430][15401] InferenceWorker_p0-w0: resuming experience collection (130500 times) [2024-06-23 23:25:18,375][15401] Updated weights for policy 0, policy_version 537530 (0.0036) [2024-06-23 23:25:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8806891520. Throughput: 0: 42842.7. Samples: 8807032000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 23:25:22,483][15401] Updated weights for policy 0, policy_version 537540 (0.0023) [2024-06-23 23:25:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 8807071744. Throughput: 0: 43084.8. Samples: 8807170500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 23:25:25,765][15401] Updated weights for policy 0, policy_version 537550 (0.0033) [2024-06-23 23:25:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8807301120. Throughput: 0: 43007.5. Samples: 8807427920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-23 23:25:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 23:25:29,898][15401] Updated weights for policy 0, policy_version 537560 (0.0026) [2024-06-23 23:25:33,333][15401] Updated weights for policy 0, policy_version 537570 (0.0053) [2024-06-23 23:25:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 8807546880. Throughput: 0: 43045.9. Samples: 8807681620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 23:25:37,335][15401] Updated weights for policy 0, policy_version 537580 (0.0028) [2024-06-23 23:25:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8807727104. Throughput: 0: 43211.4. Samples: 8807820180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:38,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-23 23:25:40,842][15401] Updated weights for policy 0, policy_version 537590 (0.0029) [2024-06-23 23:25:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8807956480. Throughput: 0: 43333.0. Samples: 8808083320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 23:25:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537595_8807956480.pth... [2024-06-23 23:25:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000536967_8797667328.pth [2024-06-23 23:25:44,885][15401] Updated weights for policy 0, policy_version 537600 (0.0041) [2024-06-23 23:25:48,255][15401] Updated weights for policy 0, policy_version 537610 (0.0038) [2024-06-23 23:25:48,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43690.8, 300 sec: 43042.7). Total num frames: 8808202240. Throughput: 0: 43459.1. Samples: 8808335060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 23:25:52,869][15401] Updated weights for policy 0, policy_version 537620 (0.0037) [2024-06-23 23:25:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 8808382464. Throughput: 0: 43353.3. Samples: 8808469980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:53,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 23:25:55,942][15401] Updated weights for policy 0, policy_version 537630 (0.0045) [2024-06-23 23:25:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 8808579072. Throughput: 0: 43152.5. Samples: 8808725320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:25:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 23:26:00,391][15401] Updated weights for policy 0, policy_version 537640 (0.0042) [2024-06-23 23:26:03,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 8808841216. Throughput: 0: 43192.0. Samples: 8808975640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 23:26:03,548][15401] Updated weights for policy 0, policy_version 537650 (0.0021) [2024-06-23 23:26:07,801][15401] Updated weights for policy 0, policy_version 537660 (0.0026) [2024-06-23 23:26:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 8809037824. Throughput: 0: 43192.9. Samples: 8809114180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 23:26:11,110][15401] Updated weights for policy 0, policy_version 537670 (0.0026) [2024-06-23 23:26:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8809234432. Throughput: 0: 43240.0. Samples: 8809373720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:13,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 23:26:15,670][15401] Updated weights for policy 0, policy_version 537680 (0.0038) [2024-06-23 23:26:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 8809480192. Throughput: 0: 43111.6. Samples: 8809621640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 23:26:19,134][15401] Updated weights for policy 0, policy_version 537690 (0.0041) [2024-06-23 23:26:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 8809660416. Throughput: 0: 43050.6. Samples: 8809757460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 23:26:23,571][15401] Updated weights for policy 0, policy_version 537700 (0.0023) [2024-06-23 23:26:26,894][15401] Updated weights for policy 0, policy_version 537710 (0.0027) [2024-06-23 23:26:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8809889792. Throughput: 0: 42766.3. Samples: 8810007800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 23:26:31,174][15401] Updated weights for policy 0, policy_version 537720 (0.0032) [2024-06-23 23:26:32,437][15349] Signal inference workers to stop experience collection... (130550 times) [2024-06-23 23:26:32,465][15401] InferenceWorker_p0-w0: stopping experience collection (130550 times) [2024-06-23 23:26:32,554][15349] Signal inference workers to resume experience collection... (130550 times) [2024-06-23 23:26:32,555][15401] InferenceWorker_p0-w0: resuming experience collection (130550 times) [2024-06-23 23:26:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 8810119168. Throughput: 0: 42953.2. Samples: 8810267960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 23:26:34,453][15401] Updated weights for policy 0, policy_version 537730 (0.0031) [2024-06-23 23:26:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8810315776. Throughput: 0: 42958.3. Samples: 8810403000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-23 23:26:38,619][15401] Updated weights for policy 0, policy_version 537740 (0.0036) [2024-06-23 23:26:41,899][15401] Updated weights for policy 0, policy_version 537750 (0.0039) [2024-06-23 23:26:43,395][15132] Fps is (10 sec: 42576.9, 60 sec: 43140.8, 300 sec: 42986.4). Total num frames: 8810545152. Throughput: 0: 42875.4. Samples: 8810654940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:43,395][15132] Avg episode reward: [(0, '0.309')] [2024-06-23 23:26:46,212][15401] Updated weights for policy 0, policy_version 537760 (0.0039) [2024-06-23 23:26:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8810758144. Throughput: 0: 43028.0. Samples: 8810911900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 23:26:49,592][15401] Updated weights for policy 0, policy_version 537770 (0.0032) [2024-06-23 23:26:53,392][15132] Fps is (10 sec: 40971.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8810954752. Throughput: 0: 42859.5. Samples: 8811042960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:53,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-23 23:26:53,664][15401] Updated weights for policy 0, policy_version 537780 (0.0030) [2024-06-23 23:26:57,472][15401] Updated weights for policy 0, policy_version 537790 (0.0040) [2024-06-23 23:26:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 8811200512. Throughput: 0: 42855.1. Samples: 8811302200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-23 23:26:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-23 23:27:01,278][15401] Updated weights for policy 0, policy_version 537800 (0.0037) [2024-06-23 23:27:03,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.3, 300 sec: 42932.2). Total num frames: 8811397120. Throughput: 0: 43011.3. Samples: 8811557160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-23 23:27:05,054][15401] Updated weights for policy 0, policy_version 537810 (0.0042) [2024-06-23 23:27:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8811593728. Throughput: 0: 42816.0. Samples: 8811684180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:08,391][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 23:27:08,783][15401] Updated weights for policy 0, policy_version 537820 (0.0042) [2024-06-23 23:27:12,695][15401] Updated weights for policy 0, policy_version 537830 (0.0025) [2024-06-23 23:27:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 8811823104. Throughput: 0: 42900.0. Samples: 8811938300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:13,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-23 23:27:16,416][15401] Updated weights for policy 0, policy_version 537840 (0.0026) [2024-06-23 23:27:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 8812019712. Throughput: 0: 42886.4. Samples: 8812197840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:18,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-23 23:27:20,550][15401] Updated weights for policy 0, policy_version 537850 (0.0047) [2024-06-23 23:27:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8812232704. Throughput: 0: 42718.1. Samples: 8812325320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 23:27:24,369][15401] Updated weights for policy 0, policy_version 537860 (0.0034) [2024-06-23 23:27:28,133][15401] Updated weights for policy 0, policy_version 537870 (0.0036) [2024-06-23 23:27:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8812462080. Throughput: 0: 42744.0. Samples: 8812578200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 23:27:31,939][15401] Updated weights for policy 0, policy_version 537880 (0.0033) [2024-06-23 23:27:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 8812658688. Throughput: 0: 42771.6. Samples: 8812836620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 23:27:36,030][15401] Updated weights for policy 0, policy_version 537890 (0.0027) [2024-06-23 23:27:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8812888064. Throughput: 0: 42752.7. Samples: 8812966720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-23 23:27:39,420][15401] Updated weights for policy 0, policy_version 537900 (0.0029) [2024-06-23 23:27:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42327.3, 300 sec: 42875.7). Total num frames: 8813084672. Throughput: 0: 42597.2. Samples: 8813219180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:43,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-23 23:27:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537908_8813084672.pth... [2024-06-23 23:27:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537281_8802811904.pth [2024-06-23 23:27:44,059][15401] Updated weights for policy 0, policy_version 537910 (0.0045) [2024-06-23 23:27:47,224][15401] Updated weights for policy 0, policy_version 537920 (0.0033) [2024-06-23 23:27:48,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 8813297664. Throughput: 0: 42540.1. Samples: 8813471460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 23:27:51,587][15349] Signal inference workers to stop experience collection... (130600 times) [2024-06-23 23:27:51,591][15349] Signal inference workers to resume experience collection... (130600 times) [2024-06-23 23:27:51,609][15401] Updated weights for policy 0, policy_version 537930 (0.0043) [2024-06-23 23:27:51,640][15401] InferenceWorker_p0-w0: stopping experience collection (130600 times) [2024-06-23 23:27:51,640][15401] InferenceWorker_p0-w0: resuming experience collection (130600 times) [2024-06-23 23:27:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42873.2, 300 sec: 42876.7). Total num frames: 8813527040. Throughput: 0: 42653.0. Samples: 8813603560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 23:27:54,767][15401] Updated weights for policy 0, policy_version 537940 (0.0025) [2024-06-23 23:27:58,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42323.6, 300 sec: 42931.6). Total num frames: 8813740032. Throughput: 0: 42775.1. Samples: 8813863280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:27:58,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 23:27:59,245][15401] Updated weights for policy 0, policy_version 537950 (0.0024) [2024-06-23 23:28:02,437][15401] Updated weights for policy 0, policy_version 537960 (0.0040) [2024-06-23 23:28:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8813953024. Throughput: 0: 42568.6. Samples: 8814113440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-23 23:28:06,827][15401] Updated weights for policy 0, policy_version 537970 (0.0034) [2024-06-23 23:28:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 8814166016. Throughput: 0: 42569.5. Samples: 8814240940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 23:28:09,962][15401] Updated weights for policy 0, policy_version 537980 (0.0022) [2024-06-23 23:28:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 8814379008. Throughput: 0: 42740.7. Samples: 8814501540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-23 23:28:14,461][15401] Updated weights for policy 0, policy_version 537990 (0.0038) [2024-06-23 23:28:17,655][15401] Updated weights for policy 0, policy_version 538000 (0.0038) [2024-06-23 23:28:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8814592000. Throughput: 0: 42529.3. Samples: 8814750440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-23 23:28:22,323][15401] Updated weights for policy 0, policy_version 538010 (0.0039) [2024-06-23 23:28:23,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 8814804992. Throughput: 0: 42525.2. Samples: 8814880360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:23,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 23:28:25,588][15401] Updated weights for policy 0, policy_version 538020 (0.0029) [2024-06-23 23:28:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8815017984. Throughput: 0: 42646.3. Samples: 8815138160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-23 23:28:28,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 23:28:29,978][15401] Updated weights for policy 0, policy_version 538030 (0.0024) [2024-06-23 23:28:33,164][15401] Updated weights for policy 0, policy_version 538040 (0.0031) [2024-06-23 23:28:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 8815247360. Throughput: 0: 42612.6. Samples: 8815389020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-23 23:28:37,389][15401] Updated weights for policy 0, policy_version 538050 (0.0028) [2024-06-23 23:28:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 8815427584. Throughput: 0: 42726.2. Samples: 8815526240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-23 23:28:40,854][15401] Updated weights for policy 0, policy_version 538060 (0.0034) [2024-06-23 23:28:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 8815656960. Throughput: 0: 42682.2. Samples: 8815783880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-23 23:28:44,982][15401] Updated weights for policy 0, policy_version 538070 (0.0029) [2024-06-23 23:28:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8815886336. Throughput: 0: 42638.5. Samples: 8816032160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:48,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-23 23:28:48,443][15401] Updated weights for policy 0, policy_version 538080 (0.0036) [2024-06-23 23:28:52,763][15401] Updated weights for policy 0, policy_version 538090 (0.0034) [2024-06-23 23:28:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 8816066560. Throughput: 0: 42811.4. Samples: 8816167460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 23:28:56,104][15401] Updated weights for policy 0, policy_version 538100 (0.0023) [2024-06-23 23:28:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 8816312320. Throughput: 0: 42676.6. Samples: 8816421980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:28:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-23 23:29:00,263][15401] Updated weights for policy 0, policy_version 538110 (0.0024) [2024-06-23 23:29:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8816525312. Throughput: 0: 42726.5. Samples: 8816673140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:03,399][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 23:29:03,765][15401] Updated weights for policy 0, policy_version 538120 (0.0035) [2024-06-23 23:29:08,057][15401] Updated weights for policy 0, policy_version 538130 (0.0033) [2024-06-23 23:29:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8816721920. Throughput: 0: 42750.3. Samples: 8816804120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 23:29:12,238][15401] Updated weights for policy 0, policy_version 538140 (0.0029) [2024-06-23 23:29:13,002][15349] Signal inference workers to stop experience collection... (130650 times) [2024-06-23 23:29:13,003][15349] Signal inference workers to resume experience collection... (130650 times) [2024-06-23 23:29:13,031][15401] InferenceWorker_p0-w0: stopping experience collection (130650 times) [2024-06-23 23:29:13,032][15401] InferenceWorker_p0-w0: resuming experience collection (130650 times) [2024-06-23 23:29:13,392][15132] Fps is (10 sec: 44226.9, 60 sec: 43143.0, 300 sec: 42931.6). Total num frames: 8816967680. Throughput: 0: 42881.3. Samples: 8817067920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:13,392][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 23:29:16,064][15401] Updated weights for policy 0, policy_version 538150 (0.0039) [2024-06-23 23:29:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8817164288. Throughput: 0: 42815.9. Samples: 8817315740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-23 23:29:19,693][15401] Updated weights for policy 0, policy_version 538160 (0.0027) [2024-06-23 23:29:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8817360896. Throughput: 0: 42601.0. Samples: 8817443280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-23 23:29:23,574][15401] Updated weights for policy 0, policy_version 538170 (0.0038) [2024-06-23 23:29:27,256][15401] Updated weights for policy 0, policy_version 538180 (0.0038) [2024-06-23 23:29:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8817606656. Throughput: 0: 42709.4. Samples: 8817705800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-23 23:29:31,134][15401] Updated weights for policy 0, policy_version 538190 (0.0028) [2024-06-23 23:29:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8817786880. Throughput: 0: 42792.4. Samples: 8817957820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 23:29:34,774][15401] Updated weights for policy 0, policy_version 538200 (0.0044) [2024-06-23 23:29:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8817999872. Throughput: 0: 42610.8. Samples: 8818084940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-23 23:29:38,878][15401] Updated weights for policy 0, policy_version 538210 (0.0036) [2024-06-23 23:29:42,410][15401] Updated weights for policy 0, policy_version 538220 (0.0049) [2024-06-23 23:29:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 8818245632. Throughput: 0: 42678.8. Samples: 8818342520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:43,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-23 23:29:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538223_8818245632.pth... [2024-06-23 23:29:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537595_8807956480.pth [2024-06-23 23:29:46,693][15401] Updated weights for policy 0, policy_version 538230 (0.0032) [2024-06-23 23:29:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 8818425856. Throughput: 0: 42818.7. Samples: 8818599980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 23:29:50,200][15401] Updated weights for policy 0, policy_version 538240 (0.0028) [2024-06-23 23:29:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8818638848. Throughput: 0: 42603.5. Samples: 8818721280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:53,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-23 23:29:54,063][15401] Updated weights for policy 0, policy_version 538250 (0.0041) [2024-06-23 23:29:58,037][15401] Updated weights for policy 0, policy_version 538260 (0.0027) [2024-06-23 23:29:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 8818868224. Throughput: 0: 42593.9. Samples: 8818984540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 26.0) [2024-06-23 23:29:58,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-23 23:30:01,892][15401] Updated weights for policy 0, policy_version 538270 (0.0031) [2024-06-23 23:30:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.5, 300 sec: 42765.4). Total num frames: 8819048448. Throughput: 0: 42776.2. Samples: 8819240660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 23:30:05,629][15401] Updated weights for policy 0, policy_version 538280 (0.0024) [2024-06-23 23:30:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8819277824. Throughput: 0: 42695.4. Samples: 8819364680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:08,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 23:30:09,478][15401] Updated weights for policy 0, policy_version 538290 (0.0049) [2024-06-23 23:30:13,383][15401] Updated weights for policy 0, policy_version 538300 (0.0033) [2024-06-23 23:30:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 8819507200. Throughput: 0: 42544.1. Samples: 8819620280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 23:30:17,471][15401] Updated weights for policy 0, policy_version 538310 (0.0027) [2024-06-23 23:30:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8819703808. Throughput: 0: 42619.6. Samples: 8819875700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-23 23:30:20,963][15401] Updated weights for policy 0, policy_version 538320 (0.0036) [2024-06-23 23:30:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8819933184. Throughput: 0: 42521.3. Samples: 8819998400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 23:30:24,890][15401] Updated weights for policy 0, policy_version 538330 (0.0036) [2024-06-23 23:30:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8820146176. Throughput: 0: 42584.5. Samples: 8820258820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-23 23:30:28,944][15401] Updated weights for policy 0, policy_version 538340 (0.0039) [2024-06-23 23:30:32,353][15401] Updated weights for policy 0, policy_version 538350 (0.0046) [2024-06-23 23:30:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8820342784. Throughput: 0: 42605.5. Samples: 8820517220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-23 23:30:36,482][15401] Updated weights for policy 0, policy_version 538360 (0.0035) [2024-06-23 23:30:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8820572160. Throughput: 0: 42786.7. Samples: 8820646680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-23 23:30:40,129][15401] Updated weights for policy 0, policy_version 538370 (0.0039) [2024-06-23 23:30:41,612][15349] Signal inference workers to stop experience collection... (130700 times) [2024-06-23 23:30:41,647][15401] InferenceWorker_p0-w0: stopping experience collection (130700 times) [2024-06-23 23:30:41,670][15349] Signal inference workers to resume experience collection... (130700 times) [2024-06-23 23:30:41,671][15401] InferenceWorker_p0-w0: resuming experience collection (130700 times) [2024-06-23 23:30:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 8820752384. Throughput: 0: 42470.0. Samples: 8820895700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-23 23:30:44,081][15401] Updated weights for policy 0, policy_version 538380 (0.0026) [2024-06-23 23:30:47,931][15401] Updated weights for policy 0, policy_version 538390 (0.0034) [2024-06-23 23:30:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 8820981760. Throughput: 0: 42510.2. Samples: 8821153620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-23 23:30:51,623][15401] Updated weights for policy 0, policy_version 538400 (0.0031) [2024-06-23 23:30:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8821211136. Throughput: 0: 42724.8. Samples: 8821287200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:53,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 23:30:55,641][15401] Updated weights for policy 0, policy_version 538410 (0.0047) [2024-06-23 23:30:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 8821391360. Throughput: 0: 42646.6. Samples: 8821539380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:30:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 23:30:59,629][15401] Updated weights for policy 0, policy_version 538420 (0.0040) [2024-06-23 23:31:03,209][15401] Updated weights for policy 0, policy_version 538430 (0.0028) [2024-06-23 23:31:03,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 8821637120. Throughput: 0: 42746.6. Samples: 8821799400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:03,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 23:31:07,289][15401] Updated weights for policy 0, policy_version 538440 (0.0025) [2024-06-23 23:31:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8821850112. Throughput: 0: 42896.1. Samples: 8821928720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-23 23:31:10,959][15401] Updated weights for policy 0, policy_version 538450 (0.0029) [2024-06-23 23:31:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8822046720. Throughput: 0: 42672.4. Samples: 8822179080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-23 23:31:14,773][15401] Updated weights for policy 0, policy_version 538460 (0.0024) [2024-06-23 23:31:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8822276096. Throughput: 0: 42579.0. Samples: 8822433280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:18,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 23:31:18,451][15401] Updated weights for policy 0, policy_version 538470 (0.0033) [2024-06-23 23:31:22,552][15401] Updated weights for policy 0, policy_version 538480 (0.0037) [2024-06-23 23:31:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8822489088. Throughput: 0: 42784.4. Samples: 8822571980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 23:31:25,869][15401] Updated weights for policy 0, policy_version 538490 (0.0035) [2024-06-23 23:31:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8822669312. Throughput: 0: 42819.2. Samples: 8822822560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-23 23:31:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 23:31:30,143][15401] Updated weights for policy 0, policy_version 538500 (0.0029) [2024-06-23 23:31:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8822915072. Throughput: 0: 42829.8. Samples: 8823080960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-23 23:31:33,606][15401] Updated weights for policy 0, policy_version 538510 (0.0028) [2024-06-23 23:31:37,817][15401] Updated weights for policy 0, policy_version 538520 (0.0037) [2024-06-23 23:31:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 8823128064. Throughput: 0: 42840.5. Samples: 8823215020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 23:31:41,257][15401] Updated weights for policy 0, policy_version 538530 (0.0046) [2024-06-23 23:31:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8823324672. Throughput: 0: 42732.3. Samples: 8823462340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 23:31:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538533_8823324672.pth... [2024-06-23 23:31:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000537908_8813084672.pth [2024-06-23 23:31:45,671][15401] Updated weights for policy 0, policy_version 538540 (0.0034) [2024-06-23 23:31:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 8823570432. Throughput: 0: 42607.5. Samples: 8823716640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 23:31:49,014][15401] Updated weights for policy 0, policy_version 538550 (0.0034) [2024-06-23 23:31:53,341][15401] Updated weights for policy 0, policy_version 538560 (0.0029) [2024-06-23 23:31:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8823767040. Throughput: 0: 42787.1. Samples: 8823854140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 23:31:54,338][15349] Signal inference workers to stop experience collection... (130750 times) [2024-06-23 23:31:54,399][15401] InferenceWorker_p0-w0: stopping experience collection (130750 times) [2024-06-23 23:31:54,459][15349] Signal inference workers to resume experience collection... (130750 times) [2024-06-23 23:31:54,459][15401] InferenceWorker_p0-w0: resuming experience collection (130750 times) [2024-06-23 23:31:56,497][15401] Updated weights for policy 0, policy_version 538570 (0.0034) [2024-06-23 23:31:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 8823980032. Throughput: 0: 42705.3. Samples: 8824100820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:31:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-23 23:32:00,986][15401] Updated weights for policy 0, policy_version 538580 (0.0033) [2024-06-23 23:32:03,390][15132] Fps is (10 sec: 44234.2, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 8824209408. Throughput: 0: 42837.8. Samples: 8824361000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 23:32:03,939][15401] Updated weights for policy 0, policy_version 538590 (0.0044) [2024-06-23 23:32:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8824389632. Throughput: 0: 42682.7. Samples: 8824492700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-23 23:32:08,673][15401] Updated weights for policy 0, policy_version 538600 (0.0052) [2024-06-23 23:32:11,520][15401] Updated weights for policy 0, policy_version 538610 (0.0041) [2024-06-23 23:32:13,389][15132] Fps is (10 sec: 40962.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8824619008. Throughput: 0: 42671.1. Samples: 8824742760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-23 23:32:16,350][15401] Updated weights for policy 0, policy_version 538620 (0.0025) [2024-06-23 23:32:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8824848384. Throughput: 0: 42659.5. Samples: 8825000640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 23:32:19,636][15401] Updated weights for policy 0, policy_version 538630 (0.0037) [2024-06-23 23:32:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8825012224. Throughput: 0: 42466.3. Samples: 8825126000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:23,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 23:32:24,209][15401] Updated weights for policy 0, policy_version 538640 (0.0051) [2024-06-23 23:32:27,107][15401] Updated weights for policy 0, policy_version 538650 (0.0029) [2024-06-23 23:32:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 8825274368. Throughput: 0: 42653.3. Samples: 8825381740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:28,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-23 23:32:31,728][15401] Updated weights for policy 0, policy_version 538660 (0.0038) [2024-06-23 23:32:33,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8825503744. Throughput: 0: 42730.4. Samples: 8825639500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-23 23:32:34,679][15401] Updated weights for policy 0, policy_version 538670 (0.0030) [2024-06-23 23:32:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 8825651200. Throughput: 0: 42712.8. Samples: 8825776220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-23 23:32:39,235][15401] Updated weights for policy 0, policy_version 538680 (0.0041) [2024-06-23 23:32:42,478][15401] Updated weights for policy 0, policy_version 538690 (0.0027) [2024-06-23 23:32:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8825896960. Throughput: 0: 42770.3. Samples: 8826025480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 23:32:46,962][15401] Updated weights for policy 0, policy_version 538700 (0.0035) [2024-06-23 23:32:47,660][15349] Signal inference workers to stop experience collection... (130800 times) [2024-06-23 23:32:47,712][15401] InferenceWorker_p0-w0: stopping experience collection (130800 times) [2024-06-23 23:32:47,719][15349] Signal inference workers to resume experience collection... (130800 times) [2024-06-23 23:32:47,730][15401] InferenceWorker_p0-w0: resuming experience collection (130800 times) [2024-06-23 23:32:48,389][15132] Fps is (10 sec: 49152.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8826142720. Throughput: 0: 42647.2. Samples: 8826280100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-23 23:32:50,299][15401] Updated weights for policy 0, policy_version 538710 (0.0045) [2024-06-23 23:32:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 8826306560. Throughput: 0: 42745.7. Samples: 8826416260. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-23 23:32:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-23 23:32:54,471][15401] Updated weights for policy 0, policy_version 538720 (0.0049) [2024-06-23 23:32:58,311][15401] Updated weights for policy 0, policy_version 538730 (0.0036) [2024-06-23 23:32:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8826552320. Throughput: 0: 42631.6. Samples: 8826661180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:32:58,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-23 23:33:02,229][15401] Updated weights for policy 0, policy_version 538740 (0.0035) [2024-06-23 23:33:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 8826765312. Throughput: 0: 42756.0. Samples: 8826924660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 23:33:05,873][15401] Updated weights for policy 0, policy_version 538750 (0.0032) [2024-06-23 23:33:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.5). Total num frames: 8826945536. Throughput: 0: 42889.0. Samples: 8827056000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 23:33:09,820][15401] Updated weights for policy 0, policy_version 538760 (0.0036) [2024-06-23 23:33:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8827191296. Throughput: 0: 42883.7. Samples: 8827311500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-23 23:33:13,527][15401] Updated weights for policy 0, policy_version 538770 (0.0031) [2024-06-23 23:33:17,426][15401] Updated weights for policy 0, policy_version 538780 (0.0035) [2024-06-23 23:33:18,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8827420672. Throughput: 0: 42884.8. Samples: 8827569320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 23:33:21,247][15401] Updated weights for policy 0, policy_version 538790 (0.0032) [2024-06-23 23:33:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8827600896. Throughput: 0: 42712.8. Samples: 8827698300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-23 23:33:25,069][15401] Updated weights for policy 0, policy_version 538800 (0.0038) [2024-06-23 23:33:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8827846656. Throughput: 0: 42929.8. Samples: 8827957320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-23 23:33:28,706][15401] Updated weights for policy 0, policy_version 538810 (0.0029) [2024-06-23 23:33:32,553][15401] Updated weights for policy 0, policy_version 538820 (0.0024) [2024-06-23 23:33:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8828059648. Throughput: 0: 42964.0. Samples: 8828213480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 23:33:36,644][15401] Updated weights for policy 0, policy_version 538830 (0.0030) [2024-06-23 23:33:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8828256256. Throughput: 0: 42876.9. Samples: 8828345720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-23 23:33:39,973][15401] Updated weights for policy 0, policy_version 538840 (0.0033) [2024-06-23 23:33:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8828485632. Throughput: 0: 43192.8. Samples: 8828604860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-23 23:33:43,476][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538849_8828502016.pth... [2024-06-23 23:33:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538223_8818245632.pth [2024-06-23 23:33:44,113][15401] Updated weights for policy 0, policy_version 538850 (0.0044) [2024-06-23 23:33:47,803][15401] Updated weights for policy 0, policy_version 538860 (0.0021) [2024-06-23 23:33:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 8828698624. Throughput: 0: 43000.7. Samples: 8828859700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-23 23:33:51,794][15401] Updated weights for policy 0, policy_version 538870 (0.0038) [2024-06-23 23:33:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8828895232. Throughput: 0: 43004.7. Samples: 8828991220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-23 23:33:55,179][15401] Updated weights for policy 0, policy_version 538880 (0.0055) [2024-06-23 23:33:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 8829140992. Throughput: 0: 43151.1. Samples: 8829253300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:33:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 23:33:59,254][15401] Updated weights for policy 0, policy_version 538890 (0.0032) [2024-06-23 23:34:02,755][15401] Updated weights for policy 0, policy_version 538900 (0.0027) [2024-06-23 23:34:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8829353984. Throughput: 0: 43067.9. Samples: 8829507380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:34:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 23:34:06,680][15401] Updated weights for policy 0, policy_version 538910 (0.0035) [2024-06-23 23:34:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42654.3). Total num frames: 8829550592. Throughput: 0: 43029.0. Samples: 8829634600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:34:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 23:34:10,399][15401] Updated weights for policy 0, policy_version 538920 (0.0038) [2024-06-23 23:34:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8829763584. Throughput: 0: 43035.5. Samples: 8829893920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:34:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-23 23:34:14,710][15401] Updated weights for policy 0, policy_version 538930 (0.0035) [2024-06-23 23:34:18,077][15401] Updated weights for policy 0, policy_version 538940 (0.0033) [2024-06-23 23:34:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8829992960. Throughput: 0: 43017.7. Samples: 8830149280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:34:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 23:34:22,236][15401] Updated weights for policy 0, policy_version 538950 (0.0048) [2024-06-23 23:34:23,267][15349] Signal inference workers to stop experience collection... (130850 times) [2024-06-23 23:34:23,268][15349] Signal inference workers to resume experience collection... (130850 times) [2024-06-23 23:34:23,315][15401] InferenceWorker_p0-w0: stopping experience collection (130850 times) [2024-06-23 23:34:23,315][15401] InferenceWorker_p0-w0: resuming experience collection (130850 times) [2024-06-23 23:34:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8830189568. Throughput: 0: 43025.8. Samples: 8830281880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-23 23:34:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-23 23:34:26,021][15401] Updated weights for policy 0, policy_version 538960 (0.0042) [2024-06-23 23:34:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8830418944. Throughput: 0: 42891.5. Samples: 8830534980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-23 23:34:30,203][15401] Updated weights for policy 0, policy_version 538970 (0.0025) [2024-06-23 23:34:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8830615552. Throughput: 0: 43073.9. Samples: 8830798020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-23 23:34:33,765][15401] Updated weights for policy 0, policy_version 538980 (0.0029) [2024-06-23 23:34:37,601][15401] Updated weights for policy 0, policy_version 538990 (0.0033) [2024-06-23 23:34:38,391][15132] Fps is (10 sec: 42593.2, 60 sec: 43143.6, 300 sec: 42709.3). Total num frames: 8830844928. Throughput: 0: 42973.5. Samples: 8830925080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:38,400][15132] Avg episode reward: [(0, '0.578')] [2024-06-23 23:34:41,254][15401] Updated weights for policy 0, policy_version 539000 (0.0043) [2024-06-23 23:34:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8831057920. Throughput: 0: 42745.2. Samples: 8831176840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 23:34:45,494][15401] Updated weights for policy 0, policy_version 539010 (0.0042) [2024-06-23 23:34:48,389][15132] Fps is (10 sec: 42604.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8831270912. Throughput: 0: 42781.5. Samples: 8831432540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 23:34:48,879][15401] Updated weights for policy 0, policy_version 539020 (0.0029) [2024-06-23 23:34:53,021][15401] Updated weights for policy 0, policy_version 539030 (0.0038) [2024-06-23 23:34:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8831483904. Throughput: 0: 42806.8. Samples: 8831560900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 23:34:56,454][15401] Updated weights for policy 0, policy_version 539040 (0.0041) [2024-06-23 23:34:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 8831680512. Throughput: 0: 42675.9. Samples: 8831814340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:34:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 23:35:00,686][15401] Updated weights for policy 0, policy_version 539050 (0.0031) [2024-06-23 23:35:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 8831893504. Throughput: 0: 42843.2. Samples: 8832077220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-23 23:35:04,009][15401] Updated weights for policy 0, policy_version 539060 (0.0032) [2024-06-23 23:35:08,247][15401] Updated weights for policy 0, policy_version 539070 (0.0025) [2024-06-23 23:35:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8832122880. Throughput: 0: 42781.7. Samples: 8832207060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-23 23:35:12,042][15401] Updated weights for policy 0, policy_version 539080 (0.0031) [2024-06-23 23:35:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8832335872. Throughput: 0: 42761.8. Samples: 8832459260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 23:35:15,842][15401] Updated weights for policy 0, policy_version 539090 (0.0035) [2024-06-23 23:35:18,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8832548864. Throughput: 0: 42511.1. Samples: 8832711120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:18,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 23:35:19,758][15401] Updated weights for policy 0, policy_version 539100 (0.0024) [2024-06-23 23:35:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8832761856. Throughput: 0: 42637.6. Samples: 8832843820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:23,393][15132] Avg episode reward: [(0, '0.395')] [2024-06-23 23:35:24,006][15401] Updated weights for policy 0, policy_version 539110 (0.0035) [2024-06-23 23:35:27,447][15401] Updated weights for policy 0, policy_version 539120 (0.0040) [2024-06-23 23:35:28,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8832991232. Throughput: 0: 42711.6. Samples: 8833098860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-23 23:35:31,603][15401] Updated weights for policy 0, policy_version 539130 (0.0031) [2024-06-23 23:35:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8833204224. Throughput: 0: 42613.7. Samples: 8833350160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 23:35:35,008][15401] Updated weights for policy 0, policy_version 539140 (0.0032) [2024-06-23 23:35:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.3, 300 sec: 42876.1). Total num frames: 8833400832. Throughput: 0: 42686.1. Samples: 8833481780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 23:35:39,263][15401] Updated weights for policy 0, policy_version 539150 (0.0036) [2024-06-23 23:35:42,607][15401] Updated weights for policy 0, policy_version 539160 (0.0041) [2024-06-23 23:35:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8833613824. Throughput: 0: 42618.2. Samples: 8833732160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-23 23:35:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539162_8833630208.pth... [2024-06-23 23:35:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538533_8823324672.pth [2024-06-23 23:35:46,979][15401] Updated weights for policy 0, policy_version 539170 (0.0036) [2024-06-23 23:35:47,227][15349] Signal inference workers to stop experience collection... (130900 times) [2024-06-23 23:35:47,253][15401] InferenceWorker_p0-w0: stopping experience collection (130900 times) [2024-06-23 23:35:47,293][15349] Signal inference workers to resume experience collection... (130900 times) [2024-06-23 23:35:47,293][15401] InferenceWorker_p0-w0: resuming experience collection (130900 times) [2024-06-23 23:35:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8833843200. Throughput: 0: 42360.8. Samples: 8833983460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-23 23:35:50,319][15401] Updated weights for policy 0, policy_version 539180 (0.0039) [2024-06-23 23:35:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 8834007040. Throughput: 0: 42296.1. Samples: 8834110380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-23 23:35:53,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 23:35:54,528][15401] Updated weights for policy 0, policy_version 539190 (0.0030) [2024-06-23 23:35:57,869][15401] Updated weights for policy 0, policy_version 539200 (0.0035) [2024-06-23 23:35:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 8834252800. Throughput: 0: 42517.9. Samples: 8834372560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:35:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-23 23:36:02,345][15401] Updated weights for policy 0, policy_version 539210 (0.0028) [2024-06-23 23:36:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8834465792. Throughput: 0: 42587.5. Samples: 8834627460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 23:36:05,323][15401] Updated weights for policy 0, policy_version 539220 (0.0032) [2024-06-23 23:36:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8834662400. Throughput: 0: 42457.4. Samples: 8834754300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-23 23:36:10,258][15401] Updated weights for policy 0, policy_version 539230 (0.0037) [2024-06-23 23:36:13,145][15401] Updated weights for policy 0, policy_version 539240 (0.0029) [2024-06-23 23:36:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 8834908160. Throughput: 0: 42486.1. Samples: 8835010840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:13,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 23:36:17,849][15401] Updated weights for policy 0, policy_version 539250 (0.0041) [2024-06-23 23:36:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 8835088384. Throughput: 0: 42626.8. Samples: 8835268360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-23 23:36:20,870][15401] Updated weights for policy 0, policy_version 539260 (0.0041) [2024-06-23 23:36:23,390][15132] Fps is (10 sec: 37692.3, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 8835284992. Throughput: 0: 42342.2. Samples: 8835387180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 23:36:25,449][15401] Updated weights for policy 0, policy_version 539270 (0.0040) [2024-06-23 23:36:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8835547136. Throughput: 0: 42564.1. Samples: 8835647540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 23:36:28,414][15401] Updated weights for policy 0, policy_version 539280 (0.0033) [2024-06-23 23:36:33,065][15401] Updated weights for policy 0, policy_version 539290 (0.0045) [2024-06-23 23:36:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8835727360. Throughput: 0: 42652.5. Samples: 8835902820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-23 23:36:36,086][15401] Updated weights for policy 0, policy_version 539300 (0.0037) [2024-06-23 23:36:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8835923968. Throughput: 0: 42447.5. Samples: 8836020520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-23 23:36:40,723][15401] Updated weights for policy 0, policy_version 539310 (0.0041) [2024-06-23 23:36:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8836186112. Throughput: 0: 42388.3. Samples: 8836280040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-23 23:36:44,104][15401] Updated weights for policy 0, policy_version 539320 (0.0040) [2024-06-23 23:36:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 8836366336. Throughput: 0: 42480.2. Samples: 8836539060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-23 23:36:48,428][15401] Updated weights for policy 0, policy_version 539330 (0.0042) [2024-06-23 23:36:51,807][15401] Updated weights for policy 0, policy_version 539340 (0.0040) [2024-06-23 23:36:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8836562944. Throughput: 0: 42394.3. Samples: 8836662040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-23 23:36:56,153][15401] Updated weights for policy 0, policy_version 539350 (0.0032) [2024-06-23 23:36:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8836808704. Throughput: 0: 42409.3. Samples: 8836919160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:36:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-23 23:36:59,247][15401] Updated weights for policy 0, policy_version 539360 (0.0034) [2024-06-23 23:37:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8837005312. Throughput: 0: 42482.6. Samples: 8837180080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:37:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-23 23:37:04,120][15401] Updated weights for policy 0, policy_version 539370 (0.0032) [2024-06-23 23:37:05,971][15349] Signal inference workers to stop experience collection... (130950 times) [2024-06-23 23:37:06,020][15401] InferenceWorker_p0-w0: stopping experience collection (130950 times) [2024-06-23 23:37:06,021][15349] Signal inference workers to resume experience collection... (130950 times) [2024-06-23 23:37:06,037][15401] InferenceWorker_p0-w0: resuming experience collection (130950 times) [2024-06-23 23:37:06,609][15401] Updated weights for policy 0, policy_version 539380 (0.0029) [2024-06-23 23:37:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8837218304. Throughput: 0: 42601.7. Samples: 8837304260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:37:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-23 23:37:11,562][15401] Updated weights for policy 0, policy_version 539390 (0.0033) [2024-06-23 23:37:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 8837464064. Throughput: 0: 42711.5. Samples: 8837569560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:37:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 23:37:14,276][15401] Updated weights for policy 0, policy_version 539400 (0.0029) [2024-06-23 23:37:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8837660672. Throughput: 0: 42760.0. Samples: 8837827020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:37:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-23 23:37:19,222][15401] Updated weights for policy 0, policy_version 539410 (0.0033) [2024-06-23 23:37:22,423][15401] Updated weights for policy 0, policy_version 539420 (0.0027) [2024-06-23 23:37:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8837873664. Throughput: 0: 42777.5. Samples: 8837945500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:37:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-23 23:37:26,794][15401] Updated weights for policy 0, policy_version 539430 (0.0031) [2024-06-23 23:37:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8838086656. Throughput: 0: 42832.6. Samples: 8838207500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 23:37:30,077][15401] Updated weights for policy 0, policy_version 539440 (0.0035) [2024-06-23 23:37:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8838283264. Throughput: 0: 43000.8. Samples: 8838474100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-23 23:37:34,461][15401] Updated weights for policy 0, policy_version 539450 (0.0044) [2024-06-23 23:37:37,614][15401] Updated weights for policy 0, policy_version 539460 (0.0040) [2024-06-23 23:37:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 8838529024. Throughput: 0: 42919.5. Samples: 8838593420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 23:37:41,913][15401] Updated weights for policy 0, policy_version 539470 (0.0019) [2024-06-23 23:37:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8838725632. Throughput: 0: 42956.9. Samples: 8838852220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 23:37:43,548][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539474_8838742016.pth... [2024-06-23 23:37:43,612][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000538849_8828502016.pth [2024-06-23 23:37:45,119][15401] Updated weights for policy 0, policy_version 539480 (0.0028) [2024-06-23 23:37:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8838938624. Throughput: 0: 42988.4. Samples: 8839114560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-23 23:37:49,350][15401] Updated weights for policy 0, policy_version 539490 (0.0034) [2024-06-23 23:37:52,689][15401] Updated weights for policy 0, policy_version 539500 (0.0030) [2024-06-23 23:37:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8839168000. Throughput: 0: 42885.8. Samples: 8839234120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 23:37:56,974][15401] Updated weights for policy 0, policy_version 539510 (0.0026) [2024-06-23 23:37:58,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 8839348224. Throughput: 0: 42670.7. Samples: 8839489840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:37:58,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-23 23:38:00,213][15401] Updated weights for policy 0, policy_version 539520 (0.0030) [2024-06-23 23:38:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 8839577600. Throughput: 0: 42838.3. Samples: 8839754740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-23 23:38:04,616][15401] Updated weights for policy 0, policy_version 539530 (0.0033) [2024-06-23 23:38:06,521][15349] Signal inference workers to stop experience collection... (131000 times) [2024-06-23 23:38:06,521][15349] Signal inference workers to resume experience collection... (131000 times) [2024-06-23 23:38:06,544][15401] InferenceWorker_p0-w0: stopping experience collection (131000 times) [2024-06-23 23:38:06,544][15401] InferenceWorker_p0-w0: resuming experience collection (131000 times) [2024-06-23 23:38:07,959][15401] Updated weights for policy 0, policy_version 539540 (0.0039) [2024-06-23 23:38:08,390][15132] Fps is (10 sec: 47521.9, 60 sec: 43417.2, 300 sec: 42820.5). Total num frames: 8839823360. Throughput: 0: 43007.8. Samples: 8839880880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:08,391][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 23:38:12,220][15401] Updated weights for policy 0, policy_version 539550 (0.0038) [2024-06-23 23:38:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 8840003584. Throughput: 0: 42857.2. Samples: 8840136180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:13,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-23 23:38:15,564][15401] Updated weights for policy 0, policy_version 539560 (0.0040) [2024-06-23 23:38:18,390][15132] Fps is (10 sec: 37685.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8840200192. Throughput: 0: 42696.4. Samples: 8840395440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-23 23:38:20,327][15401] Updated weights for policy 0, policy_version 539570 (0.0029) [2024-06-23 23:38:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8840445952. Throughput: 0: 42948.5. Samples: 8840526100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-23 23:38:24,079][15401] Updated weights for policy 0, policy_version 539580 (0.0023) [2024-06-23 23:38:27,734][15401] Updated weights for policy 0, policy_version 539590 (0.0039) [2024-06-23 23:38:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8840642560. Throughput: 0: 42720.4. Samples: 8840774640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 23:38:31,642][15401] Updated weights for policy 0, policy_version 539600 (0.0032) [2024-06-23 23:38:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8840855552. Throughput: 0: 42694.7. Samples: 8841035820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-23 23:38:35,213][15401] Updated weights for policy 0, policy_version 539610 (0.0033) [2024-06-23 23:38:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8841101312. Throughput: 0: 42936.0. Samples: 8841166240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 23:38:39,298][15401] Updated weights for policy 0, policy_version 539620 (0.0035) [2024-06-23 23:38:42,668][15401] Updated weights for policy 0, policy_version 539630 (0.0027) [2024-06-23 23:38:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8841297920. Throughput: 0: 42913.3. Samples: 8841420940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:43,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-23 23:38:46,890][15401] Updated weights for policy 0, policy_version 539640 (0.0032) [2024-06-23 23:38:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8841510912. Throughput: 0: 42758.6. Samples: 8841678880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-23 23:38:50,980][15401] Updated weights for policy 0, policy_version 539650 (0.0024) [2024-06-23 23:38:53,390][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8841756672. Throughput: 0: 42836.1. Samples: 8841808480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-23 23:38:53,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 23:38:54,696][15401] Updated weights for policy 0, policy_version 539660 (0.0041) [2024-06-23 23:38:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 8841936896. Throughput: 0: 42844.0. Samples: 8842064060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:38:58,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-23 23:38:58,541][15401] Updated weights for policy 0, policy_version 539670 (0.0040) [2024-06-23 23:39:02,356][15401] Updated weights for policy 0, policy_version 539680 (0.0042) [2024-06-23 23:39:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8842133504. Throughput: 0: 42636.5. Samples: 8842314080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-23 23:39:06,269][15401] Updated weights for policy 0, policy_version 539690 (0.0034) [2024-06-23 23:39:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 8842379264. Throughput: 0: 42519.4. Samples: 8842439480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:08,399][15132] Avg episode reward: [(0, '0.838')] [2024-06-23 23:39:09,940][15401] Updated weights for policy 0, policy_version 539700 (0.0042) [2024-06-23 23:39:13,394][15132] Fps is (10 sec: 44216.6, 60 sec: 42869.9, 300 sec: 42653.3). Total num frames: 8842575872. Throughput: 0: 42746.8. Samples: 8842698440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:13,395][15132] Avg episode reward: [(0, '0.493')] [2024-06-23 23:39:13,733][15401] Updated weights for policy 0, policy_version 539710 (0.0027) [2024-06-23 23:39:15,531][15349] Signal inference workers to stop experience collection... (131050 times) [2024-06-23 23:39:15,532][15349] Signal inference workers to resume experience collection... (131050 times) [2024-06-23 23:39:15,562][15401] InferenceWorker_p0-w0: stopping experience collection (131050 times) [2024-06-23 23:39:15,562][15401] InferenceWorker_p0-w0: resuming experience collection (131050 times) [2024-06-23 23:39:17,578][15401] Updated weights for policy 0, policy_version 539720 (0.0038) [2024-06-23 23:39:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8842772480. Throughput: 0: 42668.8. Samples: 8842955920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 23:39:21,331][15401] Updated weights for policy 0, policy_version 539730 (0.0045) [2024-06-23 23:39:23,389][15132] Fps is (10 sec: 44257.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8843018240. Throughput: 0: 42617.8. Samples: 8843084040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-23 23:39:25,435][15401] Updated weights for policy 0, policy_version 539740 (0.0031) [2024-06-23 23:39:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8843231232. Throughput: 0: 42728.1. Samples: 8843343600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 23:39:28,864][15401] Updated weights for policy 0, policy_version 539750 (0.0038) [2024-06-23 23:39:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42598.6). Total num frames: 8843411456. Throughput: 0: 42586.7. Samples: 8843595280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-23 23:39:33,475][15401] Updated weights for policy 0, policy_version 539760 (0.0042) [2024-06-23 23:39:36,719][15401] Updated weights for policy 0, policy_version 539770 (0.0034) [2024-06-23 23:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8843640832. Throughput: 0: 42497.9. Samples: 8843720880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:38,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-23 23:39:41,079][15401] Updated weights for policy 0, policy_version 539780 (0.0030) [2024-06-23 23:39:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 8843886592. Throughput: 0: 42698.2. Samples: 8843985480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-23 23:39:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539788_8843886592.pth... [2024-06-23 23:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539162_8833630208.pth [2024-06-23 23:39:44,290][15401] Updated weights for policy 0, policy_version 539790 (0.0035) [2024-06-23 23:39:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8844066816. Throughput: 0: 42744.9. Samples: 8844237600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 23:39:48,767][15401] Updated weights for policy 0, policy_version 539800 (0.0056) [2024-06-23 23:39:52,212][15401] Updated weights for policy 0, policy_version 539810 (0.0026) [2024-06-23 23:39:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8844279808. Throughput: 0: 42793.0. Samples: 8844365160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-23 23:39:56,624][15401] Updated weights for policy 0, policy_version 539820 (0.0034) [2024-06-23 23:39:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8844509184. Throughput: 0: 42644.8. Samples: 8844617260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:39:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-23 23:39:59,725][15401] Updated weights for policy 0, policy_version 539830 (0.0039) [2024-06-23 23:40:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 8844705792. Throughput: 0: 42525.8. Samples: 8844869680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:40:03,393][15132] Avg episode reward: [(0, '0.736')] [2024-06-23 23:40:04,273][15401] Updated weights for policy 0, policy_version 539840 (0.0024) [2024-06-23 23:40:07,435][15401] Updated weights for policy 0, policy_version 539850 (0.0030) [2024-06-23 23:40:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8844918784. Throughput: 0: 42606.1. Samples: 8845001320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:40:08,391][15132] Avg episode reward: [(0, '0.573')] [2024-06-23 23:40:11,709][15401] Updated weights for policy 0, policy_version 539860 (0.0040) [2024-06-23 23:40:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42601.7, 300 sec: 42654.3). Total num frames: 8845131776. Throughput: 0: 42520.0. Samples: 8845257000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:40:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 23:40:15,207][15401] Updated weights for policy 0, policy_version 539870 (0.0029) [2024-06-23 23:40:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 8845344768. Throughput: 0: 42600.5. Samples: 8845512300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:40:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-23 23:40:19,318][15401] Updated weights for policy 0, policy_version 539880 (0.0031) [2024-06-23 23:40:22,808][15401] Updated weights for policy 0, policy_version 539890 (0.0032) [2024-06-23 23:40:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8845557760. Throughput: 0: 42579.0. Samples: 8845636940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-23 23:40:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 23:40:26,954][15401] Updated weights for policy 0, policy_version 539900 (0.0033) [2024-06-23 23:40:28,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 8845787136. Throughput: 0: 42480.7. Samples: 8845897380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:28,396][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 23:40:30,367][15401] Updated weights for policy 0, policy_version 539910 (0.0028) [2024-06-23 23:40:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8845983744. Throughput: 0: 42607.6. Samples: 8846154940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 23:40:34,479][15401] Updated weights for policy 0, policy_version 539920 (0.0027) [2024-06-23 23:40:38,195][15401] Updated weights for policy 0, policy_version 539930 (0.0039) [2024-06-23 23:40:38,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8846213120. Throughput: 0: 42516.9. Samples: 8846278420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-23 23:40:39,938][15349] Signal inference workers to stop experience collection... (131100 times) [2024-06-23 23:40:39,939][15349] Signal inference workers to resume experience collection... (131100 times) [2024-06-23 23:40:39,967][15401] InferenceWorker_p0-w0: stopping experience collection (131100 times) [2024-06-23 23:40:39,967][15401] InferenceWorker_p0-w0: resuming experience collection (131100 times) [2024-06-23 23:40:42,098][15401] Updated weights for policy 0, policy_version 539940 (0.0029) [2024-06-23 23:40:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8846442496. Throughput: 0: 42735.2. Samples: 8846540340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-23 23:40:45,947][15401] Updated weights for policy 0, policy_version 539950 (0.0035) [2024-06-23 23:40:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8846622720. Throughput: 0: 42979.7. Samples: 8846803660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 23:40:49,572][15401] Updated weights for policy 0, policy_version 539960 (0.0036) [2024-06-23 23:40:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8846852096. Throughput: 0: 42786.3. Samples: 8846926700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 23:40:53,614][15401] Updated weights for policy 0, policy_version 539970 (0.0036) [2024-06-23 23:40:57,096][15401] Updated weights for policy 0, policy_version 539980 (0.0037) [2024-06-23 23:40:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8847097856. Throughput: 0: 42927.0. Samples: 8847188720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:40:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-23 23:41:01,408][15401] Updated weights for policy 0, policy_version 539990 (0.0028) [2024-06-23 23:41:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 8847278080. Throughput: 0: 43091.4. Samples: 8847451520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:03,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-23 23:41:04,716][15401] Updated weights for policy 0, policy_version 540000 (0.0036) [2024-06-23 23:41:08,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.8, 300 sec: 42653.9). Total num frames: 8847491072. Throughput: 0: 43100.0. Samples: 8847576540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:08,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 23:41:09,041][15401] Updated weights for policy 0, policy_version 540010 (0.0040) [2024-06-23 23:41:12,423][15401] Updated weights for policy 0, policy_version 540020 (0.0027) [2024-06-23 23:41:13,390][15132] Fps is (10 sec: 45885.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8847736832. Throughput: 0: 43102.0. Samples: 8847836700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 23:41:16,898][15401] Updated weights for policy 0, policy_version 540030 (0.0032) [2024-06-23 23:41:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8847900672. Throughput: 0: 43094.3. Samples: 8848094180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-23 23:41:20,108][15401] Updated weights for policy 0, policy_version 540040 (0.0037) [2024-06-23 23:41:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8848146432. Throughput: 0: 43000.0. Samples: 8848213420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-23 23:41:24,611][15401] Updated weights for policy 0, policy_version 540050 (0.0029) [2024-06-23 23:41:27,683][15401] Updated weights for policy 0, policy_version 540060 (0.0033) [2024-06-23 23:41:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 8848375808. Throughput: 0: 43035.5. Samples: 8848476940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 23:41:32,265][15401] Updated weights for policy 0, policy_version 540070 (0.0044) [2024-06-23 23:41:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8848539648. Throughput: 0: 42947.0. Samples: 8848736280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-23 23:41:35,374][15401] Updated weights for policy 0, policy_version 540080 (0.0030) [2024-06-23 23:41:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8848769024. Throughput: 0: 42814.8. Samples: 8848853360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 23:41:39,803][15401] Updated weights for policy 0, policy_version 540090 (0.0029) [2024-06-23 23:41:43,068][15401] Updated weights for policy 0, policy_version 540100 (0.0037) [2024-06-23 23:41:43,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 8849014784. Throughput: 0: 42816.8. Samples: 8849115480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:43,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-23 23:41:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540101_8849014784.pth... [2024-06-23 23:41:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539474_8838742016.pth [2024-06-23 23:41:47,864][15401] Updated weights for policy 0, policy_version 540110 (0.0034) [2024-06-23 23:41:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8849178624. Throughput: 0: 42673.4. Samples: 8849371720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-23 23:41:50,796][15401] Updated weights for policy 0, policy_version 540120 (0.0040) [2024-06-23 23:41:51,232][15349] Signal inference workers to stop experience collection... (131150 times) [2024-06-23 23:41:51,233][15349] Signal inference workers to resume experience collection... (131150 times) [2024-06-23 23:41:51,272][15401] InferenceWorker_p0-w0: stopping experience collection (131150 times) [2024-06-23 23:41:51,272][15401] InferenceWorker_p0-w0: resuming experience collection (131150 times) [2024-06-23 23:41:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8849424384. Throughput: 0: 42448.0. Samples: 8849486600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-23 23:41:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 23:41:55,490][15401] Updated weights for policy 0, policy_version 540130 (0.0035) [2024-06-23 23:41:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 8849620992. Throughput: 0: 42518.4. Samples: 8849750020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:41:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-23 23:41:58,558][15401] Updated weights for policy 0, policy_version 540140 (0.0043) [2024-06-23 23:42:03,235][15401] Updated weights for policy 0, policy_version 540150 (0.0042) [2024-06-23 23:42:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 8849817600. Throughput: 0: 42451.4. Samples: 8850004500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-23 23:42:06,120][15401] Updated weights for policy 0, policy_version 540160 (0.0043) [2024-06-23 23:42:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 8850063360. Throughput: 0: 42406.4. Samples: 8850121720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-23 23:42:10,798][15401] Updated weights for policy 0, policy_version 540170 (0.0037) [2024-06-23 23:42:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8850276352. Throughput: 0: 42459.4. Samples: 8850387620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 23:42:13,649][15401] Updated weights for policy 0, policy_version 540180 (0.0029) [2024-06-23 23:42:18,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8850456576. Throughput: 0: 42463.1. Samples: 8850647120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-23 23:42:18,566][15401] Updated weights for policy 0, policy_version 540190 (0.0025) [2024-06-23 23:42:21,172][15401] Updated weights for policy 0, policy_version 540200 (0.0035) [2024-06-23 23:42:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8850702336. Throughput: 0: 42480.4. Samples: 8850764980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 23:42:26,394][15401] Updated weights for policy 0, policy_version 540210 (0.0036) [2024-06-23 23:42:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 8850898944. Throughput: 0: 42450.3. Samples: 8851025740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:28,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-23 23:42:29,279][15401] Updated weights for policy 0, policy_version 540220 (0.0039) [2024-06-23 23:42:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8851095552. Throughput: 0: 42441.2. Samples: 8851281580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-23 23:42:34,000][15401] Updated weights for policy 0, policy_version 540230 (0.0030) [2024-06-23 23:42:36,922][15401] Updated weights for policy 0, policy_version 540240 (0.0023) [2024-06-23 23:42:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8851341312. Throughput: 0: 42657.8. Samples: 8851406200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:38,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-23 23:42:41,498][15401] Updated weights for policy 0, policy_version 540250 (0.0032) [2024-06-23 23:42:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 8851537920. Throughput: 0: 42558.2. Samples: 8851665140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-23 23:42:44,628][15401] Updated weights for policy 0, policy_version 540260 (0.0038) [2024-06-23 23:42:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8851750912. Throughput: 0: 42540.5. Samples: 8851918820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-23 23:42:49,298][15401] Updated weights for policy 0, policy_version 540270 (0.0033) [2024-06-23 23:42:52,487][15401] Updated weights for policy 0, policy_version 540280 (0.0040) [2024-06-23 23:42:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 8851963904. Throughput: 0: 42765.1. Samples: 8852046140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-23 23:42:57,055][15401] Updated weights for policy 0, policy_version 540290 (0.0024) [2024-06-23 23:42:57,855][15349] Signal inference workers to stop experience collection... (131200 times) [2024-06-23 23:42:57,855][15349] Signal inference workers to resume experience collection... (131200 times) [2024-06-23 23:42:57,901][15401] InferenceWorker_p0-w0: stopping experience collection (131200 times) [2024-06-23 23:42:57,901][15401] InferenceWorker_p0-w0: resuming experience collection (131200 times) [2024-06-23 23:42:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8852176896. Throughput: 0: 42665.5. Samples: 8852307560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:42:58,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-23 23:42:59,967][15401] Updated weights for policy 0, policy_version 540300 (0.0028) [2024-06-23 23:43:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42543.0). Total num frames: 8852373504. Throughput: 0: 42607.2. Samples: 8852564440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:43:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-23 23:43:04,689][15401] Updated weights for policy 0, policy_version 540310 (0.0031) [2024-06-23 23:43:07,690][15401] Updated weights for policy 0, policy_version 540320 (0.0042) [2024-06-23 23:43:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 8852619264. Throughput: 0: 42787.1. Samples: 8852690400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:43:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 23:43:12,239][15401] Updated weights for policy 0, policy_version 540330 (0.0024) [2024-06-23 23:43:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8852799488. Throughput: 0: 42783.0. Samples: 8852950980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:43:13,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-23 23:43:15,342][15401] Updated weights for policy 0, policy_version 540340 (0.0028) [2024-06-23 23:43:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 8853028864. Throughput: 0: 42570.7. Samples: 8853197260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:43:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-23 23:43:19,870][15401] Updated weights for policy 0, policy_version 540350 (0.0039) [2024-06-23 23:43:23,008][15401] Updated weights for policy 0, policy_version 540360 (0.0036) [2024-06-23 23:43:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8853258240. Throughput: 0: 42744.0. Samples: 8853329680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-23 23:43:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-23 23:43:27,819][15401] Updated weights for policy 0, policy_version 540370 (0.0042) [2024-06-23 23:43:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8853438464. Throughput: 0: 42661.4. Samples: 8853584900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-23 23:43:30,981][15401] Updated weights for policy 0, policy_version 540380 (0.0030) [2024-06-23 23:43:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8853684224. Throughput: 0: 42601.3. Samples: 8853835880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-23 23:43:35,403][15401] Updated weights for policy 0, policy_version 540390 (0.0032) [2024-06-23 23:43:38,390][15132] Fps is (10 sec: 44234.8, 60 sec: 42325.0, 300 sec: 42654.2). Total num frames: 8853880832. Throughput: 0: 42776.5. Samples: 8853971100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-23 23:43:38,632][15401] Updated weights for policy 0, policy_version 540400 (0.0036) [2024-06-23 23:43:43,215][15401] Updated weights for policy 0, policy_version 540410 (0.0030) [2024-06-23 23:43:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8854077440. Throughput: 0: 42743.4. Samples: 8854231020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 23:43:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540410_8854077440.pth... [2024-06-23 23:43:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000539788_8843886592.pth [2024-06-23 23:43:46,222][15401] Updated weights for policy 0, policy_version 540420 (0.0036) [2024-06-23 23:43:48,390][15132] Fps is (10 sec: 45876.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8854339584. Throughput: 0: 42503.5. Samples: 8854477100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-23 23:43:51,167][15401] Updated weights for policy 0, policy_version 540430 (0.0029) [2024-06-23 23:43:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8854536192. Throughput: 0: 42887.9. Samples: 8854620360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-23 23:43:53,786][15401] Updated weights for policy 0, policy_version 540440 (0.0024) [2024-06-23 23:43:58,389][15132] Fps is (10 sec: 36045.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8854700032. Throughput: 0: 42596.2. Samples: 8854867800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:43:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 23:43:58,757][15401] Updated weights for policy 0, policy_version 540450 (0.0043) [2024-06-23 23:44:01,555][15401] Updated weights for policy 0, policy_version 540460 (0.0033) [2024-06-23 23:44:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 8854994944. Throughput: 0: 42711.9. Samples: 8855119300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 23:44:06,259][15401] Updated weights for policy 0, policy_version 540470 (0.0041) [2024-06-23 23:44:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42654.6). Total num frames: 8855158784. Throughput: 0: 42868.0. Samples: 8855258740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 23:44:08,444][15349] Signal inference workers to stop experience collection... (131250 times) [2024-06-23 23:44:08,445][15349] Signal inference workers to resume experience collection... (131250 times) [2024-06-23 23:44:08,486][15401] InferenceWorker_p0-w0: stopping experience collection (131250 times) [2024-06-23 23:44:08,486][15401] InferenceWorker_p0-w0: resuming experience collection (131250 times) [2024-06-23 23:44:09,130][15401] Updated weights for policy 0, policy_version 540480 (0.0034) [2024-06-23 23:44:13,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8855355392. Throughput: 0: 42584.8. Samples: 8855501220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-23 23:44:13,992][15401] Updated weights for policy 0, policy_version 540490 (0.0039) [2024-06-23 23:44:16,944][15401] Updated weights for policy 0, policy_version 540500 (0.0038) [2024-06-23 23:44:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8855617536. Throughput: 0: 42593.8. Samples: 8855752600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-23 23:44:21,455][15401] Updated weights for policy 0, policy_version 540510 (0.0036) [2024-06-23 23:44:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8855797760. Throughput: 0: 42704.4. Samples: 8855892780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-23 23:44:25,014][15401] Updated weights for policy 0, policy_version 540520 (0.0022) [2024-06-23 23:44:28,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 8856010752. Throughput: 0: 42453.0. Samples: 8856141500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:28,393][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 23:44:28,947][15401] Updated weights for policy 0, policy_version 540530 (0.0026) [2024-06-23 23:44:32,644][15401] Updated weights for policy 0, policy_version 540540 (0.0031) [2024-06-23 23:44:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8856256512. Throughput: 0: 42684.1. Samples: 8856397880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 23:44:36,520][15401] Updated weights for policy 0, policy_version 540550 (0.0033) [2024-06-23 23:44:38,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.8, 300 sec: 42598.4). Total num frames: 8856453120. Throughput: 0: 42428.6. Samples: 8856529640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:38,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-23 23:44:40,043][15401] Updated weights for policy 0, policy_version 540560 (0.0042) [2024-06-23 23:44:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8856666112. Throughput: 0: 42511.9. Samples: 8856780840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-23 23:44:44,480][15401] Updated weights for policy 0, policy_version 540570 (0.0037) [2024-06-23 23:44:47,649][15401] Updated weights for policy 0, policy_version 540580 (0.0036) [2024-06-23 23:44:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8856895488. Throughput: 0: 42704.4. Samples: 8857041000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 23:44:52,067][15401] Updated weights for policy 0, policy_version 540590 (0.0038) [2024-06-23 23:44:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8857075712. Throughput: 0: 42421.8. Samples: 8857167720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-23 23:44:53,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-23 23:44:55,615][15401] Updated weights for policy 0, policy_version 540600 (0.0041) [2024-06-23 23:44:58,395][15132] Fps is (10 sec: 40939.1, 60 sec: 43413.8, 300 sec: 42709.1). Total num frames: 8857305088. Throughput: 0: 42705.9. Samples: 8857423200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:44:58,395][15132] Avg episode reward: [(0, '0.672')] [2024-06-23 23:44:59,666][15401] Updated weights for policy 0, policy_version 540610 (0.0031) [2024-06-23 23:45:03,153][15401] Updated weights for policy 0, policy_version 540620 (0.0043) [2024-06-23 23:45:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8857518080. Throughput: 0: 42643.1. Samples: 8857671540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 23:45:07,372][15401] Updated weights for policy 0, policy_version 540630 (0.0030) [2024-06-23 23:45:08,390][15132] Fps is (10 sec: 39341.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8857698304. Throughput: 0: 42385.7. Samples: 8857800140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 23:45:10,985][15401] Updated weights for policy 0, policy_version 540640 (0.0042) [2024-06-23 23:45:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8857944064. Throughput: 0: 42592.9. Samples: 8858058080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 23:45:15,013][15401] Updated weights for policy 0, policy_version 540650 (0.0041) [2024-06-23 23:45:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8858157056. Throughput: 0: 42547.6. Samples: 8858312520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-23 23:45:18,513][15401] Updated weights for policy 0, policy_version 540660 (0.0031) [2024-06-23 23:45:22,823][15401] Updated weights for policy 0, policy_version 540670 (0.0029) [2024-06-23 23:45:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 8858353664. Throughput: 0: 42434.6. Samples: 8858439200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-23 23:45:26,040][15401] Updated weights for policy 0, policy_version 540680 (0.0042) [2024-06-23 23:45:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8858583040. Throughput: 0: 42644.5. Samples: 8858699840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-23 23:45:30,518][15401] Updated weights for policy 0, policy_version 540690 (0.0037) [2024-06-23 23:45:33,228][15349] Signal inference workers to stop experience collection... (131300 times) [2024-06-23 23:45:33,228][15349] Signal inference workers to resume experience collection... (131300 times) [2024-06-23 23:45:33,257][15401] InferenceWorker_p0-w0: stopping experience collection (131300 times) [2024-06-23 23:45:33,257][15401] InferenceWorker_p0-w0: resuming experience collection (131300 times) [2024-06-23 23:45:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8858796032. Throughput: 0: 42620.2. Samples: 8858958900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-23 23:45:33,925][15401] Updated weights for policy 0, policy_version 540700 (0.0042) [2024-06-23 23:45:38,189][15401] Updated weights for policy 0, policy_version 540710 (0.0034) [2024-06-23 23:45:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8859009024. Throughput: 0: 42527.6. Samples: 8859081460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-23 23:45:41,984][15401] Updated weights for policy 0, policy_version 540720 (0.0035) [2024-06-23 23:45:43,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42320.9, 300 sec: 42653.0). Total num frames: 8859205632. Throughput: 0: 42451.3. Samples: 8859333560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:43,396][15132] Avg episode reward: [(0, '0.283')] [2024-06-23 23:45:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540723_8859205632.pth... [2024-06-23 23:45:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540101_8849014784.pth [2024-06-23 23:45:45,961][15401] Updated weights for policy 0, policy_version 540730 (0.0028) [2024-06-23 23:45:48,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 8859402240. Throughput: 0: 42558.3. Samples: 8859586660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:48,390][15132] Avg episode reward: [(0, '0.885')] [2024-06-23 23:45:49,734][15401] Updated weights for policy 0, policy_version 540740 (0.0045) [2024-06-23 23:45:53,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 8859631616. Throughput: 0: 42542.3. Samples: 8859714540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-23 23:45:53,485][15401] Updated weights for policy 0, policy_version 540750 (0.0035) [2024-06-23 23:45:57,375][15401] Updated weights for policy 0, policy_version 540760 (0.0036) [2024-06-23 23:45:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42055.9, 300 sec: 42543.2). Total num frames: 8859828224. Throughput: 0: 42442.8. Samples: 8859968000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:45:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-23 23:46:01,258][15401] Updated weights for policy 0, policy_version 540770 (0.0037) [2024-06-23 23:46:03,393][15132] Fps is (10 sec: 42583.1, 60 sec: 42322.8, 300 sec: 42598.2). Total num frames: 8860057600. Throughput: 0: 42484.2. Samples: 8860224460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:46:03,394][15132] Avg episode reward: [(0, '0.688')] [2024-06-23 23:46:05,044][15401] Updated weights for policy 0, policy_version 540780 (0.0032) [2024-06-23 23:46:08,395][15132] Fps is (10 sec: 44214.0, 60 sec: 42867.8, 300 sec: 42486.6). Total num frames: 8860270592. Throughput: 0: 42569.9. Samples: 8860355060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:46:08,395][15132] Avg episode reward: [(0, '0.451')] [2024-06-23 23:46:08,956][15401] Updated weights for policy 0, policy_version 540790 (0.0028) [2024-06-23 23:46:12,684][15401] Updated weights for policy 0, policy_version 540800 (0.0026) [2024-06-23 23:46:13,390][15132] Fps is (10 sec: 40974.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8860467200. Throughput: 0: 42318.6. Samples: 8860604180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:46:13,394][15132] Avg episode reward: [(0, '0.312')] [2024-06-23 23:46:16,826][15401] Updated weights for policy 0, policy_version 540810 (0.0028) [2024-06-23 23:46:18,390][15132] Fps is (10 sec: 44259.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8860712960. Throughput: 0: 42140.3. Samples: 8860855220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:46:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-23 23:46:20,580][15401] Updated weights for policy 0, policy_version 540820 (0.0033) [2024-06-23 23:46:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 8860893184. Throughput: 0: 42386.1. Samples: 8860988840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 19.0) [2024-06-23 23:46:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 23:46:24,190][15401] Updated weights for policy 0, policy_version 540830 (0.0028) [2024-06-23 23:46:28,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 8861089792. Throughput: 0: 42442.8. Samples: 8861243220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-23 23:46:29,017][15401] Updated weights for policy 0, policy_version 540840 (0.0025) [2024-06-23 23:46:31,872][15401] Updated weights for policy 0, policy_version 540850 (0.0032) [2024-06-23 23:46:33,396][15132] Fps is (10 sec: 47483.4, 60 sec: 42866.8, 300 sec: 42708.5). Total num frames: 8861368320. Throughput: 0: 42308.6. Samples: 8861490820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:33,396][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 23:46:36,567][15401] Updated weights for policy 0, policy_version 540860 (0.0039) [2024-06-23 23:46:38,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 8861532160. Throughput: 0: 42633.4. Samples: 8861633040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-23 23:46:39,279][15401] Updated weights for policy 0, policy_version 540870 (0.0036) [2024-06-23 23:46:43,390][15132] Fps is (10 sec: 36067.5, 60 sec: 42056.7, 300 sec: 42542.8). Total num frames: 8861728768. Throughput: 0: 42460.3. Samples: 8861878720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-23 23:46:44,173][15401] Updated weights for policy 0, policy_version 540880 (0.0040) [2024-06-23 23:46:47,141][15401] Updated weights for policy 0, policy_version 540890 (0.0037) [2024-06-23 23:46:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8861990912. Throughput: 0: 42493.7. Samples: 8862136520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-23 23:46:51,773][15401] Updated weights for policy 0, policy_version 540900 (0.0037) [2024-06-23 23:46:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 8862171136. Throughput: 0: 42578.7. Samples: 8862270880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-23 23:46:54,643][15401] Updated weights for policy 0, policy_version 540910 (0.0029) [2024-06-23 23:46:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8862400512. Throughput: 0: 42524.9. Samples: 8862517800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:46:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-23 23:46:59,258][15401] Updated weights for policy 0, policy_version 540920 (0.0035) [2024-06-23 23:47:00,682][15349] Signal inference workers to stop experience collection... (131350 times) [2024-06-23 23:47:00,683][15349] Signal inference workers to resume experience collection... (131350 times) [2024-06-23 23:47:00,713][15401] InferenceWorker_p0-w0: stopping experience collection (131350 times) [2024-06-23 23:47:00,713][15401] InferenceWorker_p0-w0: resuming experience collection (131350 times) [2024-06-23 23:47:02,197][15401] Updated weights for policy 0, policy_version 540930 (0.0033) [2024-06-23 23:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 8862613504. Throughput: 0: 42736.5. Samples: 8862778360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-23 23:47:06,959][15401] Updated weights for policy 0, policy_version 540940 (0.0041) [2024-06-23 23:47:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42328.9, 300 sec: 42487.3). Total num frames: 8862810112. Throughput: 0: 42677.7. Samples: 8862909340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:08,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-23 23:47:10,066][15401] Updated weights for policy 0, policy_version 540950 (0.0047) [2024-06-23 23:47:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8863055872. Throughput: 0: 42574.8. Samples: 8863159080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-23 23:47:15,162][15401] Updated weights for policy 0, policy_version 540960 (0.0042) [2024-06-23 23:47:17,775][15401] Updated weights for policy 0, policy_version 540970 (0.0034) [2024-06-23 23:47:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8863268864. Throughput: 0: 42732.3. Samples: 8863413500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-23 23:47:22,747][15401] Updated weights for policy 0, policy_version 540980 (0.0024) [2024-06-23 23:47:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 8863432704. Throughput: 0: 42357.4. Samples: 8863539120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-23 23:47:25,801][15401] Updated weights for policy 0, policy_version 540990 (0.0038) [2024-06-23 23:47:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8863694848. Throughput: 0: 42624.0. Samples: 8863796800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-23 23:47:30,326][15401] Updated weights for policy 0, policy_version 541000 (0.0037) [2024-06-23 23:47:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42056.8, 300 sec: 42542.9). Total num frames: 8863891456. Throughput: 0: 42589.3. Samples: 8864053040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:33,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-23 23:47:33,402][15401] Updated weights for policy 0, policy_version 541010 (0.0034) [2024-06-23 23:47:37,940][15401] Updated weights for policy 0, policy_version 541020 (0.0027) [2024-06-23 23:47:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 8864071680. Throughput: 0: 42485.6. Samples: 8864182740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:38,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-23 23:47:40,812][15401] Updated weights for policy 0, policy_version 541030 (0.0039) [2024-06-23 23:47:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.9, 300 sec: 42653.6). Total num frames: 8864333824. Throughput: 0: 42680.8. Samples: 8864438540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:43,393][15132] Avg episode reward: [(0, '0.765')] [2024-06-23 23:47:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541036_8864333824.pth... [2024-06-23 23:47:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540410_8854077440.pth [2024-06-23 23:47:45,670][15401] Updated weights for policy 0, policy_version 541040 (0.0038) [2024-06-23 23:47:48,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8864546816. Throughput: 0: 42753.3. Samples: 8864702260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 23:47:48,516][15401] Updated weights for policy 0, policy_version 541050 (0.0039) [2024-06-23 23:47:53,318][15401] Updated weights for policy 0, policy_version 541060 (0.0029) [2024-06-23 23:47:53,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8864727040. Throughput: 0: 42664.9. Samples: 8864829260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-23 23:47:53,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-23 23:47:55,969][15401] Updated weights for policy 0, policy_version 541070 (0.0033) [2024-06-23 23:47:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8864972800. Throughput: 0: 42743.5. Samples: 8865082540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:47:58,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-23 23:48:00,964][15401] Updated weights for policy 0, policy_version 541080 (0.0027) [2024-06-23 23:48:03,058][15349] Signal inference workers to stop experience collection... (131400 times) [2024-06-23 23:48:03,058][15349] Signal inference workers to resume experience collection... (131400 times) [2024-06-23 23:48:03,096][15401] InferenceWorker_p0-w0: stopping experience collection (131400 times) [2024-06-23 23:48:03,096][15401] InferenceWorker_p0-w0: resuming experience collection (131400 times) [2024-06-23 23:48:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 8865185792. Throughput: 0: 43041.7. Samples: 8865350380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-23 23:48:03,582][15401] Updated weights for policy 0, policy_version 541090 (0.0027) [2024-06-23 23:48:08,314][15401] Updated weights for policy 0, policy_version 541100 (0.0044) [2024-06-23 23:48:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8865382400. Throughput: 0: 43084.8. Samples: 8865477940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-23 23:48:11,546][15401] Updated weights for policy 0, policy_version 541110 (0.0037) [2024-06-23 23:48:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8865628160. Throughput: 0: 43075.1. Samples: 8865735180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:13,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 23:48:15,809][15401] Updated weights for policy 0, policy_version 541120 (0.0034) [2024-06-23 23:48:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8865841152. Throughput: 0: 43182.2. Samples: 8865996240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-23 23:48:19,194][15401] Updated weights for policy 0, policy_version 541130 (0.0037) [2024-06-23 23:48:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8866021376. Throughput: 0: 43063.7. Samples: 8866120600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 23:48:23,395][15401] Updated weights for policy 0, policy_version 541140 (0.0037) [2024-06-23 23:48:26,712][15401] Updated weights for policy 0, policy_version 541150 (0.0044) [2024-06-23 23:48:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8866283520. Throughput: 0: 43123.7. Samples: 8866379000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-23 23:48:30,840][15401] Updated weights for policy 0, policy_version 541160 (0.0027) [2024-06-23 23:48:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8866463744. Throughput: 0: 43007.9. Samples: 8866637620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:33,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-23 23:48:34,232][15401] Updated weights for policy 0, policy_version 541170 (0.0032) [2024-06-23 23:48:38,364][15401] Updated weights for policy 0, policy_version 541180 (0.0030) [2024-06-23 23:48:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 8866693120. Throughput: 0: 42932.0. Samples: 8866761200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-23 23:48:42,120][15401] Updated weights for policy 0, policy_version 541190 (0.0033) [2024-06-23 23:48:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 8866938880. Throughput: 0: 43134.3. Samples: 8867023580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-23 23:48:46,305][15401] Updated weights for policy 0, policy_version 541200 (0.0037) [2024-06-23 23:48:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8867119104. Throughput: 0: 42865.5. Samples: 8867279320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 23:48:49,699][15401] Updated weights for policy 0, policy_version 541210 (0.0028) [2024-06-23 23:48:53,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8867299328. Throughput: 0: 42706.1. Samples: 8867399720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:53,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-23 23:48:53,872][15401] Updated weights for policy 0, policy_version 541220 (0.0032) [2024-06-23 23:48:57,170][15401] Updated weights for policy 0, policy_version 541230 (0.0033) [2024-06-23 23:48:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 8867577856. Throughput: 0: 42908.1. Samples: 8867666040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:48:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-23 23:49:01,695][15401] Updated weights for policy 0, policy_version 541240 (0.0039) [2024-06-23 23:49:03,396][15132] Fps is (10 sec: 47483.5, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 8867774464. Throughput: 0: 42791.7. Samples: 8867922140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:49:03,396][15132] Avg episode reward: [(0, '0.467')] [2024-06-23 23:49:04,837][15401] Updated weights for policy 0, policy_version 541250 (0.0043) [2024-06-23 23:49:08,393][15132] Fps is (10 sec: 36030.6, 60 sec: 42595.6, 300 sec: 42653.4). Total num frames: 8867938304. Throughput: 0: 42712.7. Samples: 8868042840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:49:08,394][15132] Avg episode reward: [(0, '0.755')] [2024-06-23 23:49:09,264][15401] Updated weights for policy 0, policy_version 541260 (0.0032) [2024-06-23 23:49:09,783][15349] Signal inference workers to stop experience collection... (131450 times) [2024-06-23 23:49:09,820][15401] InferenceWorker_p0-w0: stopping experience collection (131450 times) [2024-06-23 23:49:09,896][15349] Signal inference workers to resume experience collection... (131450 times) [2024-06-23 23:49:09,896][15401] InferenceWorker_p0-w0: resuming experience collection (131450 times) [2024-06-23 23:49:12,303][15401] Updated weights for policy 0, policy_version 541270 (0.0025) [2024-06-23 23:49:13,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8868200448. Throughput: 0: 42831.1. Samples: 8868306400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:49:13,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-23 23:49:17,366][15401] Updated weights for policy 0, policy_version 541280 (0.0043) [2024-06-23 23:49:18,390][15132] Fps is (10 sec: 47531.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8868413440. Throughput: 0: 42744.3. Samples: 8868561120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:49:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-23 23:49:20,029][15401] Updated weights for policy 0, policy_version 541290 (0.0033) [2024-06-23 23:49:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 8868593664. Throughput: 0: 42796.9. Samples: 8868687060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-23 23:49:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 23:49:24,884][15401] Updated weights for policy 0, policy_version 541300 (0.0044) [2024-06-23 23:49:27,813][15401] Updated weights for policy 0, policy_version 541310 (0.0038) [2024-06-23 23:49:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8868839424. Throughput: 0: 42636.4. Samples: 8868942220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-23 23:49:32,514][15401] Updated weights for policy 0, policy_version 541320 (0.0032) [2024-06-23 23:49:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8869019648. Throughput: 0: 42693.1. Samples: 8869200520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-23 23:49:35,607][15401] Updated weights for policy 0, policy_version 541330 (0.0027) [2024-06-23 23:49:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8869232640. Throughput: 0: 42715.6. Samples: 8869321920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-23 23:49:40,133][15401] Updated weights for policy 0, policy_version 541340 (0.0038) [2024-06-23 23:49:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8869462016. Throughput: 0: 42501.9. Samples: 8869578620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-23 23:49:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541350_8869478400.pth... [2024-06-23 23:49:43,488][15401] Updated weights for policy 0, policy_version 541350 (0.0047) [2024-06-23 23:49:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000540723_8859205632.pth [2024-06-23 23:49:47,809][15401] Updated weights for policy 0, policy_version 541360 (0.0027) [2024-06-23 23:49:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8869658624. Throughput: 0: 42358.5. Samples: 8869828000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-23 23:49:51,320][15401] Updated weights for policy 0, policy_version 541370 (0.0036) [2024-06-23 23:49:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42654.7). Total num frames: 8869888000. Throughput: 0: 42434.8. Samples: 8869952240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-23 23:49:55,342][15401] Updated weights for policy 0, policy_version 541380 (0.0031) [2024-06-23 23:49:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 42542.9). Total num frames: 8870068224. Throughput: 0: 42378.3. Samples: 8870213420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:49:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 23:49:58,921][15401] Updated weights for policy 0, policy_version 541390 (0.0043) [2024-06-23 23:50:03,219][15401] Updated weights for policy 0, policy_version 541400 (0.0042) [2024-06-23 23:50:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 8870313984. Throughput: 0: 42329.8. Samples: 8870465960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:03,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-23 23:50:06,860][15401] Updated weights for policy 0, policy_version 541410 (0.0032) [2024-06-23 23:50:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43147.4, 300 sec: 42653.9). Total num frames: 8870526976. Throughput: 0: 42324.0. Samples: 8870591640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-23 23:50:10,776][15401] Updated weights for policy 0, policy_version 541420 (0.0028) [2024-06-23 23:50:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8870723584. Throughput: 0: 42501.4. Samples: 8870854780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 23:50:14,398][15401] Updated weights for policy 0, policy_version 541430 (0.0034) [2024-06-23 23:50:18,384][15401] Updated weights for policy 0, policy_version 541440 (0.0038) [2024-06-23 23:50:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8870952960. Throughput: 0: 42304.1. Samples: 8871104200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-23 23:50:21,992][15401] Updated weights for policy 0, policy_version 541450 (0.0040) [2024-06-23 23:50:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8871165952. Throughput: 0: 42450.7. Samples: 8871232200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:23,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-23 23:50:25,957][15401] Updated weights for policy 0, policy_version 541460 (0.0044) [2024-06-23 23:50:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8871362560. Throughput: 0: 42404.8. Samples: 8871486840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 23:50:29,633][15401] Updated weights for policy 0, policy_version 541470 (0.0043) [2024-06-23 23:50:31,064][15349] Signal inference workers to stop experience collection... (131500 times) [2024-06-23 23:50:31,064][15349] Signal inference workers to resume experience collection... (131500 times) [2024-06-23 23:50:31,093][15401] InferenceWorker_p0-w0: stopping experience collection (131500 times) [2024-06-23 23:50:31,093][15401] InferenceWorker_p0-w0: resuming experience collection (131500 times) [2024-06-23 23:50:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8871575552. Throughput: 0: 42422.6. Samples: 8871737020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 23:50:33,554][15401] Updated weights for policy 0, policy_version 541480 (0.0031) [2024-06-23 23:50:38,181][15401] Updated weights for policy 0, policy_version 541490 (0.0037) [2024-06-23 23:50:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 8871772160. Throughput: 0: 42557.8. Samples: 8871867340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-23 23:50:41,251][15401] Updated weights for policy 0, policy_version 541500 (0.0040) [2024-06-23 23:50:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 8871968768. Throughput: 0: 42250.7. Samples: 8872114700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-23 23:50:45,962][15401] Updated weights for policy 0, policy_version 541510 (0.0041) [2024-06-23 23:50:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8872214528. Throughput: 0: 42173.0. Samples: 8872363740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-23 23:50:48,823][15401] Updated weights for policy 0, policy_version 541520 (0.0030) [2024-06-23 23:50:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 8872411136. Throughput: 0: 42477.4. Samples: 8872503120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-23 23:50:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-23 23:50:53,421][15401] Updated weights for policy 0, policy_version 541530 (0.0032) [2024-06-23 23:50:56,692][15401] Updated weights for policy 0, policy_version 541540 (0.0037) [2024-06-23 23:50:58,391][15132] Fps is (10 sec: 42590.2, 60 sec: 42870.1, 300 sec: 42654.2). Total num frames: 8872640512. Throughput: 0: 42235.2. Samples: 8872755440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:50:58,392][15132] Avg episode reward: [(0, '0.328')] [2024-06-23 23:51:00,977][15401] Updated weights for policy 0, policy_version 541550 (0.0042) [2024-06-23 23:51:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 8872886272. Throughput: 0: 42371.1. Samples: 8873010900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 23:51:04,600][15401] Updated weights for policy 0, policy_version 541560 (0.0045) [2024-06-23 23:51:08,389][15132] Fps is (10 sec: 40967.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 8873050112. Throughput: 0: 42545.7. Samples: 8873146760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 23:51:09,136][15401] Updated weights for policy 0, policy_version 541570 (0.0027) [2024-06-23 23:51:12,110][15401] Updated weights for policy 0, policy_version 541580 (0.0035) [2024-06-23 23:51:13,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8873263104. Throughput: 0: 42475.5. Samples: 8873398240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-23 23:51:16,802][15401] Updated weights for policy 0, policy_version 541590 (0.0035) [2024-06-23 23:51:18,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 8873492480. Throughput: 0: 42728.6. Samples: 8873660080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:18,396][15132] Avg episode reward: [(0, '0.663')] [2024-06-23 23:51:19,703][15401] Updated weights for policy 0, policy_version 541600 (0.0031) [2024-06-23 23:51:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 8873705472. Throughput: 0: 42702.2. Samples: 8873788940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-23 23:51:24,434][15401] Updated weights for policy 0, policy_version 541610 (0.0035) [2024-06-23 23:51:27,280][15401] Updated weights for policy 0, policy_version 541620 (0.0039) [2024-06-23 23:51:28,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 8873918464. Throughput: 0: 42783.6. Samples: 8874039960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:28,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-23 23:51:31,945][15401] Updated weights for policy 0, policy_version 541630 (0.0029) [2024-06-23 23:51:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8874115072. Throughput: 0: 43116.7. Samples: 8874304000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-23 23:51:35,365][15401] Updated weights for policy 0, policy_version 541640 (0.0029) [2024-06-23 23:51:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8874360832. Throughput: 0: 42817.6. Samples: 8874429920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-23 23:51:39,615][15401] Updated weights for policy 0, policy_version 541650 (0.0032) [2024-06-23 23:51:42,976][15401] Updated weights for policy 0, policy_version 541660 (0.0045) [2024-06-23 23:51:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8874557440. Throughput: 0: 42788.5. Samples: 8874680840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-23 23:51:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541661_8874573824.pth... [2024-06-23 23:51:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541036_8864333824.pth [2024-06-23 23:51:47,150][15401] Updated weights for policy 0, policy_version 541670 (0.0036) [2024-06-23 23:51:48,334][15349] Signal inference workers to stop experience collection... (131550 times) [2024-06-23 23:51:48,334][15349] Signal inference workers to resume experience collection... (131550 times) [2024-06-23 23:51:48,353][15401] InferenceWorker_p0-w0: stopping experience collection (131550 times) [2024-06-23 23:51:48,353][15401] InferenceWorker_p0-w0: resuming experience collection (131550 times) [2024-06-23 23:51:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8874754048. Throughput: 0: 43026.6. Samples: 8874947100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-23 23:51:50,963][15401] Updated weights for policy 0, policy_version 541680 (0.0036) [2024-06-23 23:51:53,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 8874983424. Throughput: 0: 42684.8. Samples: 8875067680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:53,393][15132] Avg episode reward: [(0, '0.806')] [2024-06-23 23:51:54,768][15401] Updated weights for policy 0, policy_version 541690 (0.0031) [2024-06-23 23:51:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 8875180032. Throughput: 0: 42660.4. Samples: 8875317960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:51:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 23:51:58,585][15401] Updated weights for policy 0, policy_version 541700 (0.0039) [2024-06-23 23:52:02,386][15401] Updated weights for policy 0, policy_version 541710 (0.0031) [2024-06-23 23:52:03,390][15132] Fps is (10 sec: 40969.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 8875393024. Throughput: 0: 42715.8. Samples: 8875582020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:52:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-23 23:52:06,232][15401] Updated weights for policy 0, policy_version 541720 (0.0043) [2024-06-23 23:52:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8875638784. Throughput: 0: 42526.7. Samples: 8875702640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:52:08,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-23 23:52:10,067][15401] Updated weights for policy 0, policy_version 541730 (0.0028) [2024-06-23 23:52:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8875835392. Throughput: 0: 42626.2. Samples: 8875958140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:52:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-23 23:52:13,802][15401] Updated weights for policy 0, policy_version 541740 (0.0046) [2024-06-23 23:52:17,734][15401] Updated weights for policy 0, policy_version 541750 (0.0035) [2024-06-23 23:52:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 8876032000. Throughput: 0: 42567.7. Samples: 8876219540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-23 23:52:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-23 23:52:21,305][15401] Updated weights for policy 0, policy_version 541760 (0.0035) [2024-06-23 23:52:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8876277760. Throughput: 0: 42538.1. Samples: 8876344140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 23:52:25,814][15401] Updated weights for policy 0, policy_version 541770 (0.0022) [2024-06-23 23:52:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8876474368. Throughput: 0: 42668.8. Samples: 8876600940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-23 23:52:28,966][15401] Updated weights for policy 0, policy_version 541780 (0.0032) [2024-06-23 23:52:33,308][15401] Updated weights for policy 0, policy_version 541790 (0.0027) [2024-06-23 23:52:33,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 8876687360. Throughput: 0: 42606.2. Samples: 8876864480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:33,401][15132] Avg episode reward: [(0, '0.521')] [2024-06-23 23:52:36,453][15401] Updated weights for policy 0, policy_version 541800 (0.0037) [2024-06-23 23:52:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 8876900352. Throughput: 0: 42618.8. Samples: 8876985420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-23 23:52:40,800][15401] Updated weights for policy 0, policy_version 541810 (0.0038) [2024-06-23 23:52:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8877113344. Throughput: 0: 42725.9. Samples: 8877240620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-23 23:52:44,276][15401] Updated weights for policy 0, policy_version 541820 (0.0037) [2024-06-23 23:52:48,312][15401] Updated weights for policy 0, policy_version 541830 (0.0027) [2024-06-23 23:52:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8877342720. Throughput: 0: 42754.7. Samples: 8877505980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-23 23:52:51,848][15401] Updated weights for policy 0, policy_version 541840 (0.0029) [2024-06-23 23:52:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 8877555712. Throughput: 0: 42876.3. Samples: 8877632080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-23 23:52:56,069][15401] Updated weights for policy 0, policy_version 541850 (0.0036) [2024-06-23 23:52:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 8877768704. Throughput: 0: 42811.2. Samples: 8877884640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:52:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 23:52:59,596][15401] Updated weights for policy 0, policy_version 541860 (0.0031) [2024-06-23 23:53:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8877965312. Throughput: 0: 42903.5. Samples: 8878150200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:03,393][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 23:53:03,779][15401] Updated weights for policy 0, policy_version 541870 (0.0031) [2024-06-23 23:53:04,439][15349] Signal inference workers to stop experience collection... (131600 times) [2024-06-23 23:53:04,440][15349] Signal inference workers to resume experience collection... (131600 times) [2024-06-23 23:53:04,483][15401] InferenceWorker_p0-w0: stopping experience collection (131600 times) [2024-06-23 23:53:04,483][15401] InferenceWorker_p0-w0: resuming experience collection (131600 times) [2024-06-23 23:53:07,193][15401] Updated weights for policy 0, policy_version 541880 (0.0034) [2024-06-23 23:53:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8878194688. Throughput: 0: 42762.7. Samples: 8878268460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-23 23:53:11,542][15401] Updated weights for policy 0, policy_version 541890 (0.0032) [2024-06-23 23:53:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8878424064. Throughput: 0: 42879.5. Samples: 8878530520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-23 23:53:14,822][15401] Updated weights for policy 0, policy_version 541900 (0.0027) [2024-06-23 23:53:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8878587904. Throughput: 0: 42830.8. Samples: 8878791760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-23 23:53:19,151][15401] Updated weights for policy 0, policy_version 541910 (0.0038) [2024-06-23 23:53:22,465][15401] Updated weights for policy 0, policy_version 541920 (0.0024) [2024-06-23 23:53:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 8878833664. Throughput: 0: 42747.1. Samples: 8878909040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-23 23:53:26,882][15401] Updated weights for policy 0, policy_version 541930 (0.0037) [2024-06-23 23:53:28,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8879063040. Throughput: 0: 42888.0. Samples: 8879170580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 23:53:30,293][15401] Updated weights for policy 0, policy_version 541940 (0.0036) [2024-06-23 23:53:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 8879243264. Throughput: 0: 42718.7. Samples: 8879428320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-23 23:53:34,493][15401] Updated weights for policy 0, policy_version 541950 (0.0037) [2024-06-23 23:53:38,162][15401] Updated weights for policy 0, policy_version 541960 (0.0031) [2024-06-23 23:53:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 8879472640. Throughput: 0: 42640.5. Samples: 8879550900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-23 23:53:42,190][15401] Updated weights for policy 0, policy_version 541970 (0.0028) [2024-06-23 23:53:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8879702016. Throughput: 0: 42856.3. Samples: 8879813180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-23 23:53:43,447][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541975_8879718400.pth... [2024-06-23 23:53:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541350_8869478400.pth [2024-06-23 23:53:45,614][15401] Updated weights for policy 0, policy_version 541980 (0.0043) [2024-06-23 23:53:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8879882240. Throughput: 0: 42679.6. Samples: 8880070780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-23 23:53:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-23 23:53:49,674][15401] Updated weights for policy 0, policy_version 541990 (0.0026) [2024-06-23 23:53:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 8880111616. Throughput: 0: 42731.8. Samples: 8880191380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:53:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-23 23:53:53,494][15401] Updated weights for policy 0, policy_version 542000 (0.0026) [2024-06-23 23:53:57,209][15401] Updated weights for policy 0, policy_version 542010 (0.0034) [2024-06-23 23:53:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 42654.8). Total num frames: 8880357376. Throughput: 0: 42831.5. Samples: 8880457940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:53:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-23 23:54:01,034][15401] Updated weights for policy 0, policy_version 542020 (0.0022) [2024-06-23 23:54:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 8880537600. Throughput: 0: 42865.7. Samples: 8880720720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-23 23:54:04,801][15401] Updated weights for policy 0, policy_version 542030 (0.0023) [2024-06-23 23:54:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8880766976. Throughput: 0: 42931.5. Samples: 8880840960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:08,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 23:54:08,681][15401] Updated weights for policy 0, policy_version 542040 (0.0047) [2024-06-23 23:54:12,362][15401] Updated weights for policy 0, policy_version 542050 (0.0033) [2024-06-23 23:54:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8880996352. Throughput: 0: 43096.1. Samples: 8881109900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-23 23:54:16,022][15401] Updated weights for policy 0, policy_version 542060 (0.0029) [2024-06-23 23:54:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 8881192960. Throughput: 0: 43191.5. Samples: 8881371940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-23 23:54:20,005][15401] Updated weights for policy 0, policy_version 542070 (0.0027) [2024-06-23 23:54:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 8881422336. Throughput: 0: 43130.8. Samples: 8881491780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 23:54:23,656][15401] Updated weights for policy 0, policy_version 542080 (0.0029) [2024-06-23 23:54:27,567][15401] Updated weights for policy 0, policy_version 542090 (0.0036) [2024-06-23 23:54:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8881635328. Throughput: 0: 43230.4. Samples: 8881758540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-23 23:54:31,173][15401] Updated weights for policy 0, policy_version 542100 (0.0029) [2024-06-23 23:54:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 8881831936. Throughput: 0: 43022.1. Samples: 8882006880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:33,392][15132] Avg episode reward: [(0, '0.752')] [2024-06-23 23:54:35,578][15401] Updated weights for policy 0, policy_version 542110 (0.0032) [2024-06-23 23:54:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8882077696. Throughput: 0: 43140.7. Samples: 8882132720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:38,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-23 23:54:38,612][15401] Updated weights for policy 0, policy_version 542120 (0.0038) [2024-06-23 23:54:39,942][15349] Signal inference workers to stop experience collection... (131650 times) [2024-06-23 23:54:39,942][15349] Signal inference workers to resume experience collection... (131650 times) [2024-06-23 23:54:39,988][15401] InferenceWorker_p0-w0: stopping experience collection (131650 times) [2024-06-23 23:54:39,988][15401] InferenceWorker_p0-w0: resuming experience collection (131650 times) [2024-06-23 23:54:43,149][15401] Updated weights for policy 0, policy_version 542130 (0.0033) [2024-06-23 23:54:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8882274304. Throughput: 0: 43130.0. Samples: 8882398780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:43,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-23 23:54:46,053][15401] Updated weights for policy 0, policy_version 542140 (0.0037) [2024-06-23 23:54:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 8882470912. Throughput: 0: 42965.3. Samples: 8882654160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-23 23:54:50,806][15401] Updated weights for policy 0, policy_version 542150 (0.0031) [2024-06-23 23:54:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8882716672. Throughput: 0: 43267.1. Samples: 8882787980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-23 23:54:53,567][15401] Updated weights for policy 0, policy_version 542160 (0.0042) [2024-06-23 23:54:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 8882896896. Throughput: 0: 42988.4. Samples: 8883044380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:54:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-23 23:54:58,420][15401] Updated weights for policy 0, policy_version 542170 (0.0047) [2024-06-23 23:55:01,163][15401] Updated weights for policy 0, policy_version 542180 (0.0037) [2024-06-23 23:55:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 8883142656. Throughput: 0: 42885.9. Samples: 8883301800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:55:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-23 23:55:05,958][15401] Updated weights for policy 0, policy_version 542190 (0.0027) [2024-06-23 23:55:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8883355648. Throughput: 0: 43199.5. Samples: 8883435760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:55:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-23 23:55:09,034][15401] Updated weights for policy 0, policy_version 542200 (0.0026) [2024-06-23 23:55:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8883552256. Throughput: 0: 42891.5. Samples: 8883688660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:55:13,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-23 23:55:13,517][15401] Updated weights for policy 0, policy_version 542210 (0.0032) [2024-06-23 23:55:16,884][15401] Updated weights for policy 0, policy_version 542220 (0.0038) [2024-06-23 23:55:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8883781632. Throughput: 0: 42978.2. Samples: 8883940800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-23 23:55:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-23 23:55:21,047][15401] Updated weights for policy 0, policy_version 542230 (0.0033) [2024-06-23 23:55:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8883978240. Throughput: 0: 43070.8. Samples: 8884070900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-23 23:55:24,358][15401] Updated weights for policy 0, policy_version 542240 (0.0039) [2024-06-23 23:55:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8884207616. Throughput: 0: 43000.8. Samples: 8884333820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:28,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-23 23:55:28,564][15401] Updated weights for policy 0, policy_version 542250 (0.0033) [2024-06-23 23:55:32,192][15401] Updated weights for policy 0, policy_version 542260 (0.0026) [2024-06-23 23:55:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 8884420608. Throughput: 0: 43049.3. Samples: 8884591380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-23 23:55:36,139][15401] Updated weights for policy 0, policy_version 542270 (0.0047) [2024-06-23 23:55:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8884633600. Throughput: 0: 42917.3. Samples: 8884719260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 23:55:39,675][15401] Updated weights for policy 0, policy_version 542280 (0.0028) [2024-06-23 23:55:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 8884846592. Throughput: 0: 43128.7. Samples: 8884985180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-23 23:55:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542289_8884862976.pth... [2024-06-23 23:55:43,586][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541661_8874573824.pth [2024-06-23 23:55:43,728][15401] Updated weights for policy 0, policy_version 542290 (0.0032) [2024-06-23 23:55:47,340][15401] Updated weights for policy 0, policy_version 542300 (0.0041) [2024-06-23 23:55:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 8885075968. Throughput: 0: 43003.3. Samples: 8885236960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-23 23:55:51,346][15401] Updated weights for policy 0, policy_version 542310 (0.0034) [2024-06-23 23:55:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 8885288960. Throughput: 0: 42999.1. Samples: 8885370720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-23 23:55:54,849][15401] Updated weights for policy 0, policy_version 542320 (0.0033) [2024-06-23 23:55:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 8885501952. Throughput: 0: 43186.2. Samples: 8885632040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:55:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 23:55:58,869][15401] Updated weights for policy 0, policy_version 542330 (0.0037) [2024-06-23 23:56:02,385][15401] Updated weights for policy 0, policy_version 542340 (0.0036) [2024-06-23 23:56:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 8885731328. Throughput: 0: 43239.1. Samples: 8885886560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-23 23:56:06,347][15401] Updated weights for policy 0, policy_version 542350 (0.0034) [2024-06-23 23:56:07,626][15349] Signal inference workers to stop experience collection... (131700 times) [2024-06-23 23:56:07,627][15349] Signal inference workers to resume experience collection... (131700 times) [2024-06-23 23:56:07,672][15401] InferenceWorker_p0-w0: stopping experience collection (131700 times) [2024-06-23 23:56:07,672][15401] InferenceWorker_p0-w0: resuming experience collection (131700 times) [2024-06-23 23:56:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 8885944320. Throughput: 0: 43189.2. Samples: 8886014420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-23 23:56:10,199][15401] Updated weights for policy 0, policy_version 542360 (0.0034) [2024-06-23 23:56:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42932.6). Total num frames: 8886157312. Throughput: 0: 43150.3. Samples: 8886275580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-23 23:56:14,331][15401] Updated weights for policy 0, policy_version 542370 (0.0043) [2024-06-23 23:56:17,684][15401] Updated weights for policy 0, policy_version 542380 (0.0039) [2024-06-23 23:56:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 8886370304. Throughput: 0: 43035.2. Samples: 8886527960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-23 23:56:21,923][15401] Updated weights for policy 0, policy_version 542390 (0.0032) [2024-06-23 23:56:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 8886599680. Throughput: 0: 43092.1. Samples: 8886658400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-23 23:56:25,278][15401] Updated weights for policy 0, policy_version 542400 (0.0030) [2024-06-23 23:56:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8886763520. Throughput: 0: 42911.7. Samples: 8886916200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:28,399][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 23:56:29,487][15401] Updated weights for policy 0, policy_version 542410 (0.0032) [2024-06-23 23:56:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8886992896. Throughput: 0: 43085.4. Samples: 8887175800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:33,398][15132] Avg episode reward: [(0, '0.505')] [2024-06-23 23:56:33,504][15401] Updated weights for policy 0, policy_version 542420 (0.0033) [2024-06-23 23:56:36,967][15401] Updated weights for policy 0, policy_version 542430 (0.0036) [2024-06-23 23:56:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 8887222272. Throughput: 0: 43053.9. Samples: 8887308140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-23 23:56:41,139][15401] Updated weights for policy 0, policy_version 542440 (0.0036) [2024-06-23 23:56:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8887435264. Throughput: 0: 42890.1. Samples: 8887562100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-23 23:56:44,776][15401] Updated weights for policy 0, policy_version 542450 (0.0027) [2024-06-23 23:56:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 8887648256. Throughput: 0: 43160.9. Samples: 8887828800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-23 23:56:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-23 23:56:48,565][15401] Updated weights for policy 0, policy_version 542460 (0.0023) [2024-06-23 23:56:52,557][15401] Updated weights for policy 0, policy_version 542470 (0.0035) [2024-06-23 23:56:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8887861248. Throughput: 0: 43083.3. Samples: 8887953160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:56:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-23 23:56:56,057][15401] Updated weights for policy 0, policy_version 542480 (0.0027) [2024-06-23 23:56:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 8888074240. Throughput: 0: 42746.2. Samples: 8888199160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:56:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-23 23:57:00,139][15401] Updated weights for policy 0, policy_version 542490 (0.0024) [2024-06-23 23:57:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 8888303616. Throughput: 0: 42935.0. Samples: 8888460040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-23 23:57:03,519][15401] Updated weights for policy 0, policy_version 542500 (0.0035) [2024-06-23 23:57:07,731][15401] Updated weights for policy 0, policy_version 542510 (0.0036) [2024-06-23 23:57:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 8888500224. Throughput: 0: 42878.8. Samples: 8888587940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-23 23:57:11,031][15401] Updated weights for policy 0, policy_version 542520 (0.0033) [2024-06-23 23:57:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 8888729600. Throughput: 0: 42852.5. Samples: 8888844560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-23 23:57:15,426][15401] Updated weights for policy 0, policy_version 542530 (0.0034) [2024-06-23 23:57:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 8888942592. Throughput: 0: 42798.2. Samples: 8889101720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-23 23:57:18,871][15401] Updated weights for policy 0, policy_version 542540 (0.0040) [2024-06-23 23:57:23,115][15401] Updated weights for policy 0, policy_version 542550 (0.0023) [2024-06-23 23:57:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 8889155584. Throughput: 0: 42675.4. Samples: 8889228540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-23 23:57:26,701][15401] Updated weights for policy 0, policy_version 542560 (0.0028) [2024-06-23 23:57:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 8889352192. Throughput: 0: 42804.4. Samples: 8889488300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-23 23:57:30,793][15401] Updated weights for policy 0, policy_version 542570 (0.0025) [2024-06-23 23:57:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 8889581568. Throughput: 0: 42397.0. Samples: 8889736660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-23 23:57:34,337][15401] Updated weights for policy 0, policy_version 542580 (0.0031) [2024-06-23 23:57:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 8889778176. Throughput: 0: 42608.8. Samples: 8889870560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-23 23:57:38,562][15401] Updated weights for policy 0, policy_version 542590 (0.0033) [2024-06-23 23:57:42,201][15401] Updated weights for policy 0, policy_version 542600 (0.0040) [2024-06-23 23:57:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8890007552. Throughput: 0: 42873.0. Samples: 8890128440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-23 23:57:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542603_8890007552.pth... [2024-06-23 23:57:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000541975_8879718400.pth [2024-06-23 23:57:44,987][15349] Signal inference workers to stop experience collection... (131750 times) [2024-06-23 23:57:45,034][15401] InferenceWorker_p0-w0: stopping experience collection (131750 times) [2024-06-23 23:57:45,039][15349] Signal inference workers to resume experience collection... (131750 times) [2024-06-23 23:57:45,052][15401] InferenceWorker_p0-w0: resuming experience collection (131750 times) [2024-06-23 23:57:46,164][15401] Updated weights for policy 0, policy_version 542610 (0.0047) [2024-06-23 23:57:48,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 8890220544. Throughput: 0: 42638.3. Samples: 8890378860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:48,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-23 23:57:49,945][15401] Updated weights for policy 0, policy_version 542620 (0.0038) [2024-06-23 23:57:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 8890417152. Throughput: 0: 42717.6. Samples: 8890510240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:53,394][15132] Avg episode reward: [(0, '0.621')] [2024-06-23 23:57:53,850][15401] Updated weights for policy 0, policy_version 542630 (0.0033) [2024-06-23 23:57:57,403][15401] Updated weights for policy 0, policy_version 542640 (0.0033) [2024-06-23 23:57:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 8890630144. Throughput: 0: 42780.0. Samples: 8890769660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:57:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-23 23:58:01,785][15401] Updated weights for policy 0, policy_version 542650 (0.0051) [2024-06-23 23:58:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 8890875904. Throughput: 0: 42582.3. Samples: 8891017920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:58:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-23 23:58:05,375][15401] Updated weights for policy 0, policy_version 542660 (0.0041) [2024-06-23 23:58:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8891039744. Throughput: 0: 42665.5. Samples: 8891148480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:58:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-23 23:58:09,571][15401] Updated weights for policy 0, policy_version 542670 (0.0034) [2024-06-23 23:58:13,035][15401] Updated weights for policy 0, policy_version 542680 (0.0028) [2024-06-23 23:58:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 8891285504. Throughput: 0: 42562.2. Samples: 8891403600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:58:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-23 23:58:17,238][15401] Updated weights for policy 0, policy_version 542690 (0.0029) [2024-06-23 23:58:18,389][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 8891514880. Throughput: 0: 42668.0. Samples: 8891656720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-23 23:58:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-23 23:58:20,496][15401] Updated weights for policy 0, policy_version 542700 (0.0041) [2024-06-23 23:58:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8891695104. Throughput: 0: 42585.9. Samples: 8891786920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-23 23:58:24,733][15401] Updated weights for policy 0, policy_version 542710 (0.0026) [2024-06-23 23:58:27,982][15401] Updated weights for policy 0, policy_version 542720 (0.0034) [2024-06-23 23:58:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 8891940864. Throughput: 0: 42449.6. Samples: 8892038680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-23 23:58:32,279][15401] Updated weights for policy 0, policy_version 542730 (0.0033) [2024-06-23 23:58:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 8892137472. Throughput: 0: 42772.9. Samples: 8892303540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-23 23:58:35,538][15401] Updated weights for policy 0, policy_version 542740 (0.0026) [2024-06-23 23:58:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8892350464. Throughput: 0: 42584.8. Samples: 8892426560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:38,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-23 23:58:39,727][15401] Updated weights for policy 0, policy_version 542750 (0.0036) [2024-06-23 23:58:43,068][15401] Updated weights for policy 0, policy_version 542760 (0.0025) [2024-06-23 23:58:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 8892579840. Throughput: 0: 42546.6. Samples: 8892684260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:43,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-23 23:58:47,553][15401] Updated weights for policy 0, policy_version 542770 (0.0035) [2024-06-23 23:58:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 8892743680. Throughput: 0: 42865.4. Samples: 8892946860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-23 23:58:50,716][15401] Updated weights for policy 0, policy_version 542780 (0.0028) [2024-06-23 23:58:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8892989440. Throughput: 0: 42555.4. Samples: 8893063480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-23 23:58:55,064][15401] Updated weights for policy 0, policy_version 542790 (0.0042) [2024-06-23 23:58:58,392][15132] Fps is (10 sec: 47500.3, 60 sec: 43142.6, 300 sec: 42986.8). Total num frames: 8893218816. Throughput: 0: 42706.0. Samples: 8893325480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:58:58,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-23 23:58:58,485][15401] Updated weights for policy 0, policy_version 542800 (0.0031) [2024-06-23 23:59:02,765][15401] Updated weights for policy 0, policy_version 542810 (0.0028) [2024-06-23 23:59:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.6, 300 sec: 42820.2). Total num frames: 8893399040. Throughput: 0: 42665.3. Samples: 8893576760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:03,393][15132] Avg episode reward: [(0, '0.389')] [2024-06-23 23:59:06,396][15401] Updated weights for policy 0, policy_version 542820 (0.0035) [2024-06-23 23:59:08,389][15132] Fps is (10 sec: 40971.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8893628416. Throughput: 0: 42501.3. Samples: 8893699480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:08,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-23 23:59:10,189][15401] Updated weights for policy 0, policy_version 542830 (0.0030) [2024-06-23 23:59:12,009][15349] Signal inference workers to stop experience collection... (131800 times) [2024-06-23 23:59:12,009][15349] Signal inference workers to resume experience collection... (131800 times) [2024-06-23 23:59:12,049][15401] InferenceWorker_p0-w0: stopping experience collection (131800 times) [2024-06-23 23:59:12,056][15401] InferenceWorker_p0-w0: resuming experience collection (131800 times) [2024-06-23 23:59:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 8893841408. Throughput: 0: 42794.7. Samples: 8893964440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-23 23:59:14,109][15401] Updated weights for policy 0, policy_version 542840 (0.0040) [2024-06-23 23:59:18,236][15401] Updated weights for policy 0, policy_version 542850 (0.0036) [2024-06-23 23:59:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 8894054400. Throughput: 0: 42525.0. Samples: 8894217160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-23 23:59:21,728][15401] Updated weights for policy 0, policy_version 542860 (0.0044) [2024-06-23 23:59:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8894267392. Throughput: 0: 42704.5. Samples: 8894348260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-23 23:59:25,750][15401] Updated weights for policy 0, policy_version 542870 (0.0035) [2024-06-23 23:59:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42820.9). Total num frames: 8894464000. Throughput: 0: 42756.1. Samples: 8894608280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:28,396][15132] Avg episode reward: [(0, '0.595')] [2024-06-23 23:59:29,264][15401] Updated weights for policy 0, policy_version 542880 (0.0032) [2024-06-23 23:59:33,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8894693376. Throughput: 0: 42494.1. Samples: 8894859200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:33,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-23 23:59:33,637][15401] Updated weights for policy 0, policy_version 542890 (0.0037) [2024-06-23 23:59:36,824][15401] Updated weights for policy 0, policy_version 542900 (0.0036) [2024-06-23 23:59:38,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 8894922752. Throughput: 0: 42828.8. Samples: 8894990880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:38,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-23 23:59:41,263][15401] Updated weights for policy 0, policy_version 542910 (0.0027) [2024-06-23 23:59:43,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 8895102976. Throughput: 0: 42703.5. Samples: 8895247020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-23 23:59:43,612][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542916_8895135744.pth... [2024-06-23 23:59:43,686][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542289_8884862976.pth [2024-06-23 23:59:44,747][15401] Updated weights for policy 0, policy_version 542920 (0.0039) [2024-06-23 23:59:48,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 8895332352. Throughput: 0: 42828.0. Samples: 8895504020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-23 23:59:48,392][15132] Avg episode reward: [(0, '0.397')] [2024-06-23 23:59:48,945][15401] Updated weights for policy 0, policy_version 542930 (0.0041) [2024-06-23 23:59:52,502][15401] Updated weights for policy 0, policy_version 542940 (0.0028) [2024-06-23 23:59:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8895561728. Throughput: 0: 42958.6. Samples: 8895632620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 23:59:53,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-23 23:59:56,534][15401] Updated weights for policy 0, policy_version 542950 (0.0028) [2024-06-23 23:59:58,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42327.2, 300 sec: 42765.0). Total num frames: 8895758336. Throughput: 0: 42792.4. Samples: 8895890100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-23 23:59:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 00:00:00,211][15401] Updated weights for policy 0, policy_version 542960 (0.0034) [2024-06-24 00:00:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 8895971328. Throughput: 0: 42936.3. Samples: 8896149300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:03,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 00:00:03,974][15401] Updated weights for policy 0, policy_version 542970 (0.0038) [2024-06-24 00:00:07,748][15401] Updated weights for policy 0, policy_version 542980 (0.0024) [2024-06-24 00:00:08,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 8896200704. Throughput: 0: 42901.4. Samples: 8896278920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:08,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 00:00:11,467][15401] Updated weights for policy 0, policy_version 542990 (0.0033) [2024-06-24 00:00:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8896397312. Throughput: 0: 42850.1. Samples: 8896536540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 00:00:15,249][15401] Updated weights for policy 0, policy_version 543000 (0.0034) [2024-06-24 00:00:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8896610304. Throughput: 0: 43005.5. Samples: 8896794340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 00:00:19,054][15401] Updated weights for policy 0, policy_version 543010 (0.0033) [2024-06-24 00:00:21,616][15349] Signal inference workers to stop experience collection... (131850 times) [2024-06-24 00:00:21,616][15349] Signal inference workers to resume experience collection... (131850 times) [2024-06-24 00:00:21,653][15401] InferenceWorker_p0-w0: stopping experience collection (131850 times) [2024-06-24 00:00:21,653][15401] InferenceWorker_p0-w0: resuming experience collection (131850 times) [2024-06-24 00:00:22,882][15401] Updated weights for policy 0, policy_version 543020 (0.0048) [2024-06-24 00:00:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 8896856064. Throughput: 0: 42982.3. Samples: 8896924980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:23,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 00:00:26,923][15401] Updated weights for policy 0, policy_version 543030 (0.0032) [2024-06-24 00:00:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8897036288. Throughput: 0: 43009.7. Samples: 8897182460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 00:00:30,383][15401] Updated weights for policy 0, policy_version 543040 (0.0041) [2024-06-24 00:00:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 8897265664. Throughput: 0: 43112.4. Samples: 8897443980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 00:00:34,504][15401] Updated weights for policy 0, policy_version 543050 (0.0045) [2024-06-24 00:00:37,958][15401] Updated weights for policy 0, policy_version 543060 (0.0037) [2024-06-24 00:00:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 8897511424. Throughput: 0: 43093.3. Samples: 8897571820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 00:00:42,337][15401] Updated weights for policy 0, policy_version 543070 (0.0038) [2024-06-24 00:00:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8897691648. Throughput: 0: 43066.8. Samples: 8897828100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 00:00:45,631][15401] Updated weights for policy 0, policy_version 543080 (0.0035) [2024-06-24 00:00:48,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 8897904640. Throughput: 0: 43024.5. Samples: 8898085500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:48,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 00:00:49,850][15401] Updated weights for policy 0, policy_version 543090 (0.0033) [2024-06-24 00:00:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8898134016. Throughput: 0: 42920.5. Samples: 8898210240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 00:00:53,569][15401] Updated weights for policy 0, policy_version 543100 (0.0034) [2024-06-24 00:00:57,509][15401] Updated weights for policy 0, policy_version 543110 (0.0042) [2024-06-24 00:00:58,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8898330624. Throughput: 0: 42993.3. Samples: 8898471240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:00:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 00:01:01,009][15401] Updated weights for policy 0, policy_version 543120 (0.0030) [2024-06-24 00:01:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8898543616. Throughput: 0: 43081.4. Samples: 8898733000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:01:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 00:01:05,141][15401] Updated weights for policy 0, policy_version 543130 (0.0039) [2024-06-24 00:01:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 8898772992. Throughput: 0: 42870.3. Samples: 8898854140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:01:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 00:01:08,582][15401] Updated weights for policy 0, policy_version 543140 (0.0034) [2024-06-24 00:01:12,862][15401] Updated weights for policy 0, policy_version 543150 (0.0031) [2024-06-24 00:01:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 8898985984. Throughput: 0: 42883.9. Samples: 8899112240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:01:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 00:01:16,187][15401] Updated weights for policy 0, policy_version 543160 (0.0032) [2024-06-24 00:01:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8899182592. Throughput: 0: 42802.3. Samples: 8899370080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 00:01:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 00:01:20,948][15401] Updated weights for policy 0, policy_version 543170 (0.0035) [2024-06-24 00:01:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 8899428352. Throughput: 0: 42625.4. Samples: 8899489960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 00:01:23,796][15401] Updated weights for policy 0, policy_version 543180 (0.0036) [2024-06-24 00:01:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8899608576. Throughput: 0: 42634.7. Samples: 8899746660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:28,390][15132] Avg episode reward: [(0, '0.082')] [2024-06-24 00:01:28,489][15401] Updated weights for policy 0, policy_version 543190 (0.0020) [2024-06-24 00:01:31,303][15401] Updated weights for policy 0, policy_version 543200 (0.0031) [2024-06-24 00:01:33,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8899805184. Throughput: 0: 42617.8. Samples: 8900003200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:33,400][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 00:01:36,135][15401] Updated weights for policy 0, policy_version 543210 (0.0025) [2024-06-24 00:01:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8900050944. Throughput: 0: 42640.5. Samples: 8900129060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 00:01:39,233][15401] Updated weights for policy 0, policy_version 543220 (0.0051) [2024-06-24 00:01:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8900247552. Throughput: 0: 42448.8. Samples: 8900381440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 00:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543228_8900247552.pth... [2024-06-24 00:01:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542603_8890007552.pth [2024-06-24 00:01:44,099][15401] Updated weights for policy 0, policy_version 543230 (0.0038) [2024-06-24 00:01:45,090][15349] Signal inference workers to stop experience collection... (131900 times) [2024-06-24 00:01:45,090][15349] Signal inference workers to resume experience collection... (131900 times) [2024-06-24 00:01:45,130][15401] InferenceWorker_p0-w0: stopping experience collection (131900 times) [2024-06-24 00:01:45,130][15401] InferenceWorker_p0-w0: resuming experience collection (131900 times) [2024-06-24 00:01:47,297][15401] Updated weights for policy 0, policy_version 543240 (0.0039) [2024-06-24 00:01:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42709.4). Total num frames: 8900460544. Throughput: 0: 42088.3. Samples: 8900626980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 00:01:51,711][15401] Updated weights for policy 0, policy_version 543250 (0.0031) [2024-06-24 00:01:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8900689920. Throughput: 0: 42305.7. Samples: 8900757900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 00:01:55,104][15401] Updated weights for policy 0, policy_version 543260 (0.0039) [2024-06-24 00:01:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8900870144. Throughput: 0: 42402.8. Samples: 8901020360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:01:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:01:59,469][15401] Updated weights for policy 0, policy_version 543270 (0.0036) [2024-06-24 00:02:02,793][15401] Updated weights for policy 0, policy_version 543280 (0.0041) [2024-06-24 00:02:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8901115904. Throughput: 0: 42104.4. Samples: 8901264780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 00:02:07,242][15401] Updated weights for policy 0, policy_version 543290 (0.0048) [2024-06-24 00:02:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8901345280. Throughput: 0: 42439.5. Samples: 8901399740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:08,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 00:02:10,354][15401] Updated weights for policy 0, policy_version 543300 (0.0023) [2024-06-24 00:02:13,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 8901492736. Throughput: 0: 42411.8. Samples: 8901655200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 00:02:14,855][15401] Updated weights for policy 0, policy_version 543310 (0.0028) [2024-06-24 00:02:18,331][15401] Updated weights for policy 0, policy_version 543320 (0.0048) [2024-06-24 00:02:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8901754880. Throughput: 0: 42113.8. Samples: 8901898320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 00:02:22,562][15401] Updated weights for policy 0, policy_version 543330 (0.0042) [2024-06-24 00:02:23,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8901967872. Throughput: 0: 42378.2. Samples: 8902036080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 00:02:26,021][15401] Updated weights for policy 0, policy_version 543340 (0.0025) [2024-06-24 00:02:28,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 8902131712. Throughput: 0: 42291.2. Samples: 8902284540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 00:02:30,249][15401] Updated weights for policy 0, policy_version 543350 (0.0027) [2024-06-24 00:02:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8902377472. Throughput: 0: 42437.0. Samples: 8902536640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 00:02:33,681][15401] Updated weights for policy 0, policy_version 543360 (0.0038) [2024-06-24 00:02:37,957][15401] Updated weights for policy 0, policy_version 543370 (0.0035) [2024-06-24 00:02:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8902590464. Throughput: 0: 42523.1. Samples: 8902671440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 00:02:41,366][15401] Updated weights for policy 0, policy_version 543380 (0.0041) [2024-06-24 00:02:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 8902787072. Throughput: 0: 42161.7. Samples: 8902917640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:43,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 00:02:45,603][15401] Updated weights for policy 0, policy_version 543390 (0.0034) [2024-06-24 00:02:46,244][15349] Signal inference workers to stop experience collection... (131950 times) [2024-06-24 00:02:46,244][15349] Signal inference workers to resume experience collection... (131950 times) [2024-06-24 00:02:46,296][15401] InferenceWorker_p0-w0: stopping experience collection (131950 times) [2024-06-24 00:02:46,296][15401] InferenceWorker_p0-w0: resuming experience collection (131950 times) [2024-06-24 00:02:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8903016448. Throughput: 0: 42352.9. Samples: 8903170660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 00:02:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 00:02:49,386][15401] Updated weights for policy 0, policy_version 543400 (0.0028) [2024-06-24 00:02:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 8903213056. Throughput: 0: 42371.5. Samples: 8903306460. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:02:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 00:02:53,463][15401] Updated weights for policy 0, policy_version 543410 (0.0036) [2024-06-24 00:02:56,893][15401] Updated weights for policy 0, policy_version 543420 (0.0041) [2024-06-24 00:02:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8903442432. Throughput: 0: 42221.4. Samples: 8903555160. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:02:58,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 00:03:01,215][15401] Updated weights for policy 0, policy_version 543430 (0.0028) [2024-06-24 00:03:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8903655424. Throughput: 0: 42449.2. Samples: 8903808540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 00:03:04,643][15401] Updated weights for policy 0, policy_version 543440 (0.0040) [2024-06-24 00:03:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 42542.9). Total num frames: 8903835648. Throughput: 0: 42275.4. Samples: 8903938480. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 00:03:08,903][15401] Updated weights for policy 0, policy_version 543450 (0.0030) [2024-06-24 00:03:12,569][15401] Updated weights for policy 0, policy_version 543460 (0.0035) [2024-06-24 00:03:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8904081408. Throughput: 0: 42463.6. Samples: 8904195400. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 00:03:16,742][15401] Updated weights for policy 0, policy_version 543470 (0.0034) [2024-06-24 00:03:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8904294400. Throughput: 0: 42423.1. Samples: 8904445680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 00:03:20,487][15401] Updated weights for policy 0, policy_version 543480 (0.0036) [2024-06-24 00:03:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 8904474624. Throughput: 0: 42391.4. Samples: 8904579060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 00:03:24,243][15401] Updated weights for policy 0, policy_version 543490 (0.0025) [2024-06-24 00:03:27,950][15401] Updated weights for policy 0, policy_version 543500 (0.0032) [2024-06-24 00:03:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8904704000. Throughput: 0: 42619.5. Samples: 8904835520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 00:03:31,769][15401] Updated weights for policy 0, policy_version 543510 (0.0056) [2024-06-24 00:03:33,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8904949760. Throughput: 0: 42720.1. Samples: 8905093060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 00:03:35,452][15401] Updated weights for policy 0, policy_version 543520 (0.0023) [2024-06-24 00:03:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 8905129984. Throughput: 0: 42679.1. Samples: 8905227020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 00:03:39,391][15401] Updated weights for policy 0, policy_version 543530 (0.0038) [2024-06-24 00:03:42,951][15401] Updated weights for policy 0, policy_version 543540 (0.0053) [2024-06-24 00:03:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8905359360. Throughput: 0: 42774.5. Samples: 8905480020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 00:03:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543540_8905359360.pth... [2024-06-24 00:03:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000542916_8895135744.pth [2024-06-24 00:03:47,334][15401] Updated weights for policy 0, policy_version 543550 (0.0039) [2024-06-24 00:03:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8905572352. Throughput: 0: 42740.5. Samples: 8905731860. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 00:03:50,546][15401] Updated weights for policy 0, policy_version 543560 (0.0037) [2024-06-24 00:03:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 8905785344. Throughput: 0: 42784.5. Samples: 8905863780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 00:03:54,794][15401] Updated weights for policy 0, policy_version 543570 (0.0038) [2024-06-24 00:03:58,029][15401] Updated weights for policy 0, policy_version 543580 (0.0040) [2024-06-24 00:03:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 8906014720. Throughput: 0: 42751.4. Samples: 8906119220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:03:58,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 00:04:02,645][15401] Updated weights for policy 0, policy_version 543590 (0.0029) [2024-06-24 00:04:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8906194944. Throughput: 0: 42941.7. Samples: 8906378060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:04:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 00:04:03,598][15349] Signal inference workers to stop experience collection... (132000 times) [2024-06-24 00:04:03,649][15401] InferenceWorker_p0-w0: stopping experience collection (132000 times) [2024-06-24 00:04:03,657][15349] Signal inference workers to resume experience collection... (132000 times) [2024-06-24 00:04:03,661][15401] InferenceWorker_p0-w0: resuming experience collection (132000 times) [2024-06-24 00:04:05,570][15401] Updated weights for policy 0, policy_version 543600 (0.0024) [2024-06-24 00:04:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8906424320. Throughput: 0: 42796.5. Samples: 8906504900. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:04:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 00:04:10,261][15401] Updated weights for policy 0, policy_version 543610 (0.0038) [2024-06-24 00:04:13,230][15401] Updated weights for policy 0, policy_version 543620 (0.0034) [2024-06-24 00:04:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8906670080. Throughput: 0: 42953.4. Samples: 8906768420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:04:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 00:04:17,741][15401] Updated weights for policy 0, policy_version 543630 (0.0034) [2024-06-24 00:04:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8906833920. Throughput: 0: 42910.3. Samples: 8907024020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-24 00:04:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 00:04:20,853][15401] Updated weights for policy 0, policy_version 543640 (0.0033) [2024-06-24 00:04:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8907063296. Throughput: 0: 42648.6. Samples: 8907146200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 00:04:25,418][15401] Updated weights for policy 0, policy_version 543650 (0.0041) [2024-06-24 00:04:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 8907292672. Throughput: 0: 42861.5. Samples: 8907408780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 00:04:28,627][15401] Updated weights for policy 0, policy_version 543660 (0.0035) [2024-06-24 00:04:33,060][15401] Updated weights for policy 0, policy_version 543670 (0.0034) [2024-06-24 00:04:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 8907489280. Throughput: 0: 42914.3. Samples: 8907663000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 00:04:36,457][15401] Updated weights for policy 0, policy_version 543680 (0.0031) [2024-06-24 00:04:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 8907702272. Throughput: 0: 42718.5. Samples: 8907786120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 00:04:40,696][15401] Updated weights for policy 0, policy_version 543690 (0.0039) [2024-06-24 00:04:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8907931648. Throughput: 0: 42847.6. Samples: 8908047360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 00:04:44,441][15401] Updated weights for policy 0, policy_version 543700 (0.0030) [2024-06-24 00:04:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 8908111872. Throughput: 0: 42857.3. Samples: 8908306640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 00:04:48,727][15401] Updated weights for policy 0, policy_version 543710 (0.0034) [2024-06-24 00:04:52,140][15401] Updated weights for policy 0, policy_version 543720 (0.0042) [2024-06-24 00:04:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8908357632. Throughput: 0: 42718.3. Samples: 8908427220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 00:04:56,211][15401] Updated weights for policy 0, policy_version 543730 (0.0040) [2024-06-24 00:04:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8908554240. Throughput: 0: 42590.8. Samples: 8908685000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:04:58,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-24 00:04:59,627][15401] Updated weights for policy 0, policy_version 543740 (0.0030) [2024-06-24 00:05:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 8908767232. Throughput: 0: 42541.2. Samples: 8908938380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 00:05:03,743][15401] Updated weights for policy 0, policy_version 543750 (0.0039) [2024-06-24 00:05:07,522][15401] Updated weights for policy 0, policy_version 543760 (0.0027) [2024-06-24 00:05:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8908980224. Throughput: 0: 42703.6. Samples: 8909067860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 00:05:11,245][15401] Updated weights for policy 0, policy_version 543770 (0.0028) [2024-06-24 00:05:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 8909176832. Throughput: 0: 42591.5. Samples: 8909325400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 00:05:15,326][15401] Updated weights for policy 0, policy_version 543780 (0.0041) [2024-06-24 00:05:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8909422592. Throughput: 0: 42501.3. Samples: 8909575560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 00:05:18,695][15401] Updated weights for policy 0, policy_version 543790 (0.0026) [2024-06-24 00:05:20,998][15349] Signal inference workers to stop experience collection... (132050 times) [2024-06-24 00:05:21,032][15401] InferenceWorker_p0-w0: stopping experience collection (132050 times) [2024-06-24 00:05:21,115][15349] Signal inference workers to resume experience collection... (132050 times) [2024-06-24 00:05:21,115][15401] InferenceWorker_p0-w0: resuming experience collection (132050 times) [2024-06-24 00:05:23,215][15401] Updated weights for policy 0, policy_version 543800 (0.0036) [2024-06-24 00:05:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 8909635584. Throughput: 0: 42758.2. Samples: 8909710240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 00:05:26,355][15401] Updated weights for policy 0, policy_version 543810 (0.0027) [2024-06-24 00:05:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 8909815808. Throughput: 0: 42566.6. Samples: 8909962860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 00:05:30,862][15401] Updated weights for policy 0, policy_version 543820 (0.0047) [2024-06-24 00:05:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 8910061568. Throughput: 0: 42335.2. Samples: 8910211720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 00:05:34,299][15401] Updated weights for policy 0, policy_version 543830 (0.0039) [2024-06-24 00:05:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 8910258176. Throughput: 0: 42764.5. Samples: 8910351620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 00:05:38,405][15401] Updated weights for policy 0, policy_version 543840 (0.0041) [2024-06-24 00:05:41,724][15401] Updated weights for policy 0, policy_version 543850 (0.0034) [2024-06-24 00:05:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42598.7). Total num frames: 8910471168. Throughput: 0: 42591.3. Samples: 8910601620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 00:05:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543852_8910471168.pth... [2024-06-24 00:05:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543228_8900247552.pth [2024-06-24 00:05:45,873][15401] Updated weights for policy 0, policy_version 543860 (0.0041) [2024-06-24 00:05:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8910700544. Throughput: 0: 42639.6. Samples: 8910857160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 00:05:48,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 00:05:49,371][15401] Updated weights for policy 0, policy_version 543870 (0.0041) [2024-06-24 00:05:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8910913536. Throughput: 0: 42750.5. Samples: 8910991640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:05:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 00:05:53,533][15401] Updated weights for policy 0, policy_version 543880 (0.0043) [2024-06-24 00:05:57,113][15401] Updated weights for policy 0, policy_version 543890 (0.0045) [2024-06-24 00:05:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8911126528. Throughput: 0: 42595.5. Samples: 8911242200. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:05:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 00:06:01,247][15401] Updated weights for policy 0, policy_version 543900 (0.0033) [2024-06-24 00:06:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8911339520. Throughput: 0: 42729.8. Samples: 8911498400. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 00:06:04,635][15401] Updated weights for policy 0, policy_version 543910 (0.0048) [2024-06-24 00:06:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8911536128. Throughput: 0: 42622.8. Samples: 8911628260. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 00:06:09,288][15401] Updated weights for policy 0, policy_version 543920 (0.0033) [2024-06-24 00:06:12,579][15401] Updated weights for policy 0, policy_version 543930 (0.0037) [2024-06-24 00:06:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 8911765504. Throughput: 0: 42635.1. Samples: 8911881540. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:13,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 00:06:16,842][15401] Updated weights for policy 0, policy_version 543940 (0.0025) [2024-06-24 00:06:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 8911994880. Throughput: 0: 42795.9. Samples: 8912137640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:18,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 00:06:20,256][15401] Updated weights for policy 0, policy_version 543950 (0.0030) [2024-06-24 00:06:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8912175104. Throughput: 0: 42564.3. Samples: 8912267020. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 00:06:24,406][15401] Updated weights for policy 0, policy_version 543960 (0.0034) [2024-06-24 00:06:28,080][15401] Updated weights for policy 0, policy_version 543970 (0.0046) [2024-06-24 00:06:28,392][15132] Fps is (10 sec: 42598.6, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 8912420864. Throughput: 0: 42806.8. Samples: 8912528020. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:28,392][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 00:06:32,009][15401] Updated weights for policy 0, policy_version 543980 (0.0034) [2024-06-24 00:06:33,391][15132] Fps is (10 sec: 47506.0, 60 sec: 43143.3, 300 sec: 42709.2). Total num frames: 8912650240. Throughput: 0: 42790.9. Samples: 8912782820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:33,391][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 00:06:35,441][15401] Updated weights for policy 0, policy_version 543990 (0.0038) [2024-06-24 00:06:38,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8912814080. Throughput: 0: 42725.0. Samples: 8912914260. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:38,396][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 00:06:39,788][15401] Updated weights for policy 0, policy_version 544000 (0.0041) [2024-06-24 00:06:42,903][15401] Updated weights for policy 0, policy_version 544010 (0.0035) [2024-06-24 00:06:43,390][15132] Fps is (10 sec: 40966.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8913059840. Throughput: 0: 42872.8. Samples: 8913171480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 00:06:47,264][15401] Updated weights for policy 0, policy_version 544020 (0.0032) [2024-06-24 00:06:47,764][15349] Signal inference workers to stop experience collection... (132100 times) [2024-06-24 00:06:47,764][15349] Signal inference workers to resume experience collection... (132100 times) [2024-06-24 00:06:47,800][15401] InferenceWorker_p0-w0: stopping experience collection (132100 times) [2024-06-24 00:06:47,800][15401] InferenceWorker_p0-w0: resuming experience collection (132100 times) [2024-06-24 00:06:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8913289216. Throughput: 0: 42959.9. Samples: 8913431600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:48,392][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 00:06:50,429][15401] Updated weights for policy 0, policy_version 544030 (0.0037) [2024-06-24 00:06:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8913469440. Throughput: 0: 42857.0. Samples: 8913556820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 00:06:55,048][15401] Updated weights for policy 0, policy_version 544040 (0.0037) [2024-06-24 00:06:58,329][15401] Updated weights for policy 0, policy_version 544050 (0.0041) [2024-06-24 00:06:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8913715200. Throughput: 0: 43038.7. Samples: 8913818180. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:06:58,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 00:07:02,585][15401] Updated weights for policy 0, policy_version 544060 (0.0042) [2024-06-24 00:07:03,392][15132] Fps is (10 sec: 45863.5, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 8913928192. Throughput: 0: 43080.4. Samples: 8914076260. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:07:03,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 00:07:05,905][15401] Updated weights for policy 0, policy_version 544070 (0.0051) [2024-06-24 00:07:08,396][15132] Fps is (10 sec: 39296.8, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 8914108416. Throughput: 0: 42998.4. Samples: 8914202220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:07:08,396][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 00:07:10,162][15401] Updated weights for policy 0, policy_version 544080 (0.0033) [2024-06-24 00:07:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 8914354176. Throughput: 0: 42973.4. Samples: 8914461720. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:07:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 00:07:13,403][15401] Updated weights for policy 0, policy_version 544090 (0.0028) [2024-06-24 00:07:17,769][15401] Updated weights for policy 0, policy_version 544100 (0.0031) [2024-06-24 00:07:18,389][15132] Fps is (10 sec: 45904.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 8914567168. Throughput: 0: 43098.1. Samples: 8914722160. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-06-24 00:07:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 00:07:21,494][15401] Updated weights for policy 0, policy_version 544110 (0.0040) [2024-06-24 00:07:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8914747392. Throughput: 0: 42936.4. Samples: 8914846400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:23,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 00:07:25,431][15401] Updated weights for policy 0, policy_version 544120 (0.0039) [2024-06-24 00:07:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 8915009536. Throughput: 0: 42969.7. Samples: 8915105120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:28,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 00:07:29,278][15401] Updated weights for policy 0, policy_version 544130 (0.0023) [2024-06-24 00:07:33,031][15401] Updated weights for policy 0, policy_version 544140 (0.0036) [2024-06-24 00:07:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42599.5, 300 sec: 42765.0). Total num frames: 8915206144. Throughput: 0: 42896.9. Samples: 8915361960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 00:07:36,729][15401] Updated weights for policy 0, policy_version 544150 (0.0033) [2024-06-24 00:07:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8915402752. Throughput: 0: 42960.7. Samples: 8915490060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 00:07:40,660][15401] Updated weights for policy 0, policy_version 544160 (0.0031) [2024-06-24 00:07:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8915648512. Throughput: 0: 42975.5. Samples: 8915752080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 00:07:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544168_8915648512.pth... [2024-06-24 00:07:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543540_8905359360.pth [2024-06-24 00:07:44,355][15401] Updated weights for policy 0, policy_version 544170 (0.0041) [2024-06-24 00:07:48,238][15401] Updated weights for policy 0, policy_version 544180 (0.0041) [2024-06-24 00:07:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 8915861504. Throughput: 0: 42912.6. Samples: 8916007220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 00:07:51,991][15401] Updated weights for policy 0, policy_version 544190 (0.0034) [2024-06-24 00:07:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8916058112. Throughput: 0: 42929.1. Samples: 8916133760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:53,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 00:07:55,977][15401] Updated weights for policy 0, policy_version 544200 (0.0034) [2024-06-24 00:07:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8916271104. Throughput: 0: 42924.0. Samples: 8916393300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:07:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 00:07:59,620][15401] Updated weights for policy 0, policy_version 544210 (0.0041) [2024-06-24 00:08:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 8916467712. Throughput: 0: 42648.7. Samples: 8916641360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 00:08:03,719][15401] Updated weights for policy 0, policy_version 544220 (0.0031) [2024-06-24 00:08:07,587][15401] Updated weights for policy 0, policy_version 544230 (0.0022) [2024-06-24 00:08:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 8916680704. Throughput: 0: 42624.4. Samples: 8916764500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 00:08:09,140][15349] Signal inference workers to stop experience collection... (132150 times) [2024-06-24 00:08:09,180][15401] InferenceWorker_p0-w0: stopping experience collection (132150 times) [2024-06-24 00:08:09,189][15349] Signal inference workers to resume experience collection... (132150 times) [2024-06-24 00:08:09,199][15401] InferenceWorker_p0-w0: resuming experience collection (132150 times) [2024-06-24 00:08:11,764][15401] Updated weights for policy 0, policy_version 544240 (0.0039) [2024-06-24 00:08:13,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42320.7, 300 sec: 42708.5). Total num frames: 8916893696. Throughput: 0: 42577.1. Samples: 8917021360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:13,397][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 00:08:15,438][15401] Updated weights for policy 0, policy_version 544250 (0.0029) [2024-06-24 00:08:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 8917106688. Throughput: 0: 42548.1. Samples: 8917276620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 00:08:19,317][15401] Updated weights for policy 0, policy_version 544260 (0.0038) [2024-06-24 00:08:22,986][15401] Updated weights for policy 0, policy_version 544270 (0.0036) [2024-06-24 00:08:23,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8917336064. Throughput: 0: 42588.4. Samples: 8917406540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 00:08:26,805][15401] Updated weights for policy 0, policy_version 544280 (0.0038) [2024-06-24 00:08:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 8917532672. Throughput: 0: 42360.6. Samples: 8917658300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 00:08:30,574][15401] Updated weights for policy 0, policy_version 544290 (0.0035) [2024-06-24 00:08:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8917762048. Throughput: 0: 42397.2. Samples: 8917915100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 00:08:34,422][15401] Updated weights for policy 0, policy_version 544300 (0.0022) [2024-06-24 00:08:38,104][15401] Updated weights for policy 0, policy_version 544310 (0.0030) [2024-06-24 00:08:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 8917991424. Throughput: 0: 42549.0. Samples: 8918048460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 00:08:41,976][15401] Updated weights for policy 0, policy_version 544320 (0.0023) [2024-06-24 00:08:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8918188032. Throughput: 0: 42453.7. Samples: 8918303720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 00:08:45,798][15401] Updated weights for policy 0, policy_version 544330 (0.0030) [2024-06-24 00:08:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8918384640. Throughput: 0: 42657.9. Samples: 8918560960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 00:08:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 00:08:49,562][15401] Updated weights for policy 0, policy_version 544340 (0.0047) [2024-06-24 00:08:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8918614016. Throughput: 0: 42781.8. Samples: 8918689680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:08:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 00:08:53,682][15401] Updated weights for policy 0, policy_version 544350 (0.0024) [2024-06-24 00:08:57,154][15401] Updated weights for policy 0, policy_version 544360 (0.0025) [2024-06-24 00:08:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8918843392. Throughput: 0: 42811.4. Samples: 8918947600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:08:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 00:09:01,215][15401] Updated weights for policy 0, policy_version 544370 (0.0037) [2024-06-24 00:09:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8919040000. Throughput: 0: 42923.9. Samples: 8919208200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 00:09:04,724][15401] Updated weights for policy 0, policy_version 544380 (0.0036) [2024-06-24 00:09:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8919252992. Throughput: 0: 42821.4. Samples: 8919333500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 00:09:08,701][15401] Updated weights for policy 0, policy_version 544390 (0.0035) [2024-06-24 00:09:12,587][15401] Updated weights for policy 0, policy_version 544400 (0.0038) [2024-06-24 00:09:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 8919465984. Throughput: 0: 43049.4. Samples: 8919595520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 00:09:16,878][15401] Updated weights for policy 0, policy_version 544410 (0.0040) [2024-06-24 00:09:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8919695360. Throughput: 0: 42927.5. Samples: 8919846840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 00:09:20,154][15401] Updated weights for policy 0, policy_version 544420 (0.0032) [2024-06-24 00:09:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8919875584. Throughput: 0: 42844.8. Samples: 8919976480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:23,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 00:09:24,419][15401] Updated weights for policy 0, policy_version 544430 (0.0034) [2024-06-24 00:09:27,822][15401] Updated weights for policy 0, policy_version 544440 (0.0042) [2024-06-24 00:09:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 8920121344. Throughput: 0: 42996.8. Samples: 8920238580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 00:09:31,983][15401] Updated weights for policy 0, policy_version 544450 (0.0027) [2024-06-24 00:09:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8920334336. Throughput: 0: 42742.3. Samples: 8920484360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 00:09:35,345][15401] Updated weights for policy 0, policy_version 544460 (0.0036) [2024-06-24 00:09:38,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 8920514560. Throughput: 0: 42821.9. Samples: 8920616660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 00:09:39,961][15401] Updated weights for policy 0, policy_version 544470 (0.0032) [2024-06-24 00:09:42,656][15349] Signal inference workers to stop experience collection... (132200 times) [2024-06-24 00:09:42,691][15401] InferenceWorker_p0-w0: stopping experience collection (132200 times) [2024-06-24 00:09:42,718][15349] Signal inference workers to resume experience collection... (132200 times) [2024-06-24 00:09:42,724][15401] InferenceWorker_p0-w0: resuming experience collection (132200 times) [2024-06-24 00:09:43,040][15401] Updated weights for policy 0, policy_version 544480 (0.0027) [2024-06-24 00:09:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 8920760320. Throughput: 0: 42725.9. Samples: 8920870260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 00:09:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544480_8920760320.pth... [2024-06-24 00:09:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000543852_8910471168.pth [2024-06-24 00:09:47,468][15401] Updated weights for policy 0, policy_version 544490 (0.0038) [2024-06-24 00:09:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8920973312. Throughput: 0: 42511.9. Samples: 8921121240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 00:09:50,754][15401] Updated weights for policy 0, policy_version 544500 (0.0032) [2024-06-24 00:09:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8921153536. Throughput: 0: 42571.1. Samples: 8921249200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 00:09:55,327][15401] Updated weights for policy 0, policy_version 544510 (0.0034) [2024-06-24 00:09:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8921399296. Throughput: 0: 42442.1. Samples: 8921505420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:09:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 00:09:58,578][15401] Updated weights for policy 0, policy_version 544520 (0.0042) [2024-06-24 00:10:02,948][15401] Updated weights for policy 0, policy_version 544530 (0.0043) [2024-06-24 00:10:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8921595904. Throughput: 0: 42476.8. Samples: 8921758300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:10:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 00:10:06,343][15401] Updated weights for policy 0, policy_version 544540 (0.0028) [2024-06-24 00:10:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 8921776128. Throughput: 0: 42426.6. Samples: 8921885680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:10:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 00:10:10,535][15401] Updated weights for policy 0, policy_version 544550 (0.0039) [2024-06-24 00:10:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 8922038272. Throughput: 0: 42244.9. Samples: 8922139600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:10:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 00:10:13,871][15401] Updated weights for policy 0, policy_version 544560 (0.0029) [2024-06-24 00:10:18,074][15401] Updated weights for policy 0, policy_version 544570 (0.0031) [2024-06-24 00:10:18,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8922251264. Throughput: 0: 42523.5. Samples: 8922397920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:10:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 00:10:22,456][15401] Updated weights for policy 0, policy_version 544580 (0.0030) [2024-06-24 00:10:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8922431488. Throughput: 0: 42308.8. Samples: 8922520560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 00:10:25,679][15401] Updated weights for policy 0, policy_version 544590 (0.0051) [2024-06-24 00:10:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8922677248. Throughput: 0: 42464.8. Samples: 8922781180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 00:10:30,006][15401] Updated weights for policy 0, policy_version 544600 (0.0024) [2024-06-24 00:10:33,266][15401] Updated weights for policy 0, policy_version 544610 (0.0028) [2024-06-24 00:10:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8922890240. Throughput: 0: 42616.1. Samples: 8923038960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:10:37,555][15401] Updated weights for policy 0, policy_version 544620 (0.0036) [2024-06-24 00:10:38,393][15132] Fps is (10 sec: 40947.5, 60 sec: 42869.2, 300 sec: 42764.6). Total num frames: 8923086848. Throughput: 0: 42676.1. Samples: 8923169760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:38,393][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 00:10:40,828][15401] Updated weights for policy 0, policy_version 544630 (0.0040) [2024-06-24 00:10:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8923316224. Throughput: 0: 42667.1. Samples: 8923425440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 00:10:44,970][15401] Updated weights for policy 0, policy_version 544640 (0.0045) [2024-06-24 00:10:48,390][15132] Fps is (10 sec: 44250.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8923529216. Throughput: 0: 42879.7. Samples: 8923687880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:48,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 00:10:48,617][15401] Updated weights for policy 0, policy_version 544650 (0.0036) [2024-06-24 00:10:52,462][15401] Updated weights for policy 0, policy_version 544660 (0.0038) [2024-06-24 00:10:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8923725824. Throughput: 0: 42864.1. Samples: 8923814560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 00:10:56,279][15401] Updated weights for policy 0, policy_version 544670 (0.0041) [2024-06-24 00:10:57,511][15349] Signal inference workers to stop experience collection... (132250 times) [2024-06-24 00:10:57,511][15349] Signal inference workers to resume experience collection... (132250 times) [2024-06-24 00:10:57,556][15401] InferenceWorker_p0-w0: stopping experience collection (132250 times) [2024-06-24 00:10:57,556][15401] InferenceWorker_p0-w0: resuming experience collection (132250 times) [2024-06-24 00:10:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8923955200. Throughput: 0: 43033.4. Samples: 8924076100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:10:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 00:11:00,230][15401] Updated weights for policy 0, policy_version 544680 (0.0040) [2024-06-24 00:11:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 8924168192. Throughput: 0: 42855.5. Samples: 8924326420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 00:11:03,839][15401] Updated weights for policy 0, policy_version 544690 (0.0031) [2024-06-24 00:11:07,991][15401] Updated weights for policy 0, policy_version 544700 (0.0044) [2024-06-24 00:11:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 8924364800. Throughput: 0: 43090.7. Samples: 8924459640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:08,390][15132] Avg episode reward: [(0, '0.151')] [2024-06-24 00:11:11,476][15401] Updated weights for policy 0, policy_version 544710 (0.0027) [2024-06-24 00:11:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 8924610560. Throughput: 0: 42922.3. Samples: 8924712680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 00:11:16,017][15401] Updated weights for policy 0, policy_version 544720 (0.0039) [2024-06-24 00:11:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8924790784. Throughput: 0: 42870.8. Samples: 8924968140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 00:11:19,366][15401] Updated weights for policy 0, policy_version 544730 (0.0026) [2024-06-24 00:11:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 8925003776. Throughput: 0: 42771.4. Samples: 8925094340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:23,392][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 00:11:23,662][15401] Updated weights for policy 0, policy_version 544740 (0.0025) [2024-06-24 00:11:26,811][15401] Updated weights for policy 0, policy_version 544750 (0.0034) [2024-06-24 00:11:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42709.7). Total num frames: 8925249536. Throughput: 0: 42880.2. Samples: 8925355040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:28,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 00:11:31,236][15401] Updated weights for policy 0, policy_version 544760 (0.0028) [2024-06-24 00:11:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 8925446144. Throughput: 0: 42705.3. Samples: 8925609620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 00:11:34,727][15401] Updated weights for policy 0, policy_version 544770 (0.0039) [2024-06-24 00:11:38,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42600.6, 300 sec: 42653.9). Total num frames: 8925642752. Throughput: 0: 42639.6. Samples: 8925733340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 00:11:38,798][15401] Updated weights for policy 0, policy_version 544780 (0.0028) [2024-06-24 00:11:42,138][15401] Updated weights for policy 0, policy_version 544790 (0.0040) [2024-06-24 00:11:43,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 8925888512. Throughput: 0: 42495.7. Samples: 8925988420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 00:11:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544793_8925888512.pth... [2024-06-24 00:11:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544168_8915648512.pth [2024-06-24 00:11:46,407][15401] Updated weights for policy 0, policy_version 544800 (0.0029) [2024-06-24 00:11:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8926101504. Throughput: 0: 42878.6. Samples: 8926255960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 00:11:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 00:11:49,678][15401] Updated weights for policy 0, policy_version 544810 (0.0032) [2024-06-24 00:11:53,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8926298112. Throughput: 0: 42644.4. Samples: 8926378640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:11:53,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-24 00:11:54,021][15401] Updated weights for policy 0, policy_version 544820 (0.0034) [2024-06-24 00:11:57,089][15401] Updated weights for policy 0, policy_version 544830 (0.0045) [2024-06-24 00:11:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8926527488. Throughput: 0: 42690.8. Samples: 8926633760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:11:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 00:12:01,535][15401] Updated weights for policy 0, policy_version 544840 (0.0026) [2024-06-24 00:12:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 8926740480. Throughput: 0: 42882.6. Samples: 8926897860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 00:12:04,645][15401] Updated weights for policy 0, policy_version 544850 (0.0045) [2024-06-24 00:12:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8926937088. Throughput: 0: 42824.4. Samples: 8927021440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 00:12:09,450][15401] Updated weights for policy 0, policy_version 544860 (0.0028) [2024-06-24 00:12:12,571][15401] Updated weights for policy 0, policy_version 544870 (0.0038) [2024-06-24 00:12:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 8927166464. Throughput: 0: 42683.8. Samples: 8927275920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:13,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 00:12:16,938][15401] Updated weights for policy 0, policy_version 544880 (0.0029) [2024-06-24 00:12:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8927363072. Throughput: 0: 42864.0. Samples: 8927538500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 00:12:20,234][15401] Updated weights for policy 0, policy_version 544890 (0.0043) [2024-06-24 00:12:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8927559680. Throughput: 0: 42960.5. Samples: 8927666560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 00:12:24,533][15401] Updated weights for policy 0, policy_version 544900 (0.0035) [2024-06-24 00:12:27,721][15401] Updated weights for policy 0, policy_version 544910 (0.0040) [2024-06-24 00:12:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8927821824. Throughput: 0: 43006.1. Samples: 8927923680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 00:12:31,773][15349] Signal inference workers to stop experience collection... (132300 times) [2024-06-24 00:12:31,799][15401] InferenceWorker_p0-w0: stopping experience collection (132300 times) [2024-06-24 00:12:31,833][15349] Signal inference workers to resume experience collection... (132300 times) [2024-06-24 00:12:31,834][15401] InferenceWorker_p0-w0: resuming experience collection (132300 times) [2024-06-24 00:12:32,285][15401] Updated weights for policy 0, policy_version 544920 (0.0026) [2024-06-24 00:12:33,390][15132] Fps is (10 sec: 45871.3, 60 sec: 42870.9, 300 sec: 42764.9). Total num frames: 8928018432. Throughput: 0: 42690.9. Samples: 8928177080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:33,391][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 00:12:35,684][15401] Updated weights for policy 0, policy_version 544930 (0.0029) [2024-06-24 00:12:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8928215040. Throughput: 0: 42732.0. Samples: 8928301580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:38,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 00:12:40,006][15401] Updated weights for policy 0, policy_version 544940 (0.0027) [2024-06-24 00:12:43,115][15401] Updated weights for policy 0, policy_version 544950 (0.0028) [2024-06-24 00:12:43,390][15132] Fps is (10 sec: 44239.9, 60 sec: 42871.6, 300 sec: 42709.4). Total num frames: 8928460800. Throughput: 0: 42764.7. Samples: 8928558180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 00:12:47,744][15401] Updated weights for policy 0, policy_version 544960 (0.0024) [2024-06-24 00:12:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 8928641024. Throughput: 0: 42734.8. Samples: 8928820920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 00:12:50,935][15401] Updated weights for policy 0, policy_version 544970 (0.0029) [2024-06-24 00:12:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8928870400. Throughput: 0: 42618.2. Samples: 8928939260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 00:12:55,561][15401] Updated weights for policy 0, policy_version 544980 (0.0046) [2024-06-24 00:12:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8929099776. Throughput: 0: 42767.7. Samples: 8929200360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:12:58,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-24 00:12:59,003][15401] Updated weights for policy 0, policy_version 544990 (0.0034) [2024-06-24 00:13:03,059][15401] Updated weights for policy 0, policy_version 545000 (0.0025) [2024-06-24 00:13:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 8929280000. Throughput: 0: 42708.0. Samples: 8929460460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:13:03,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 00:13:06,439][15401] Updated weights for policy 0, policy_version 545010 (0.0037) [2024-06-24 00:13:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 8929492992. Throughput: 0: 42634.0. Samples: 8929585100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:13:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 00:13:10,493][15401] Updated weights for policy 0, policy_version 545020 (0.0032) [2024-06-24 00:13:13,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 8929738752. Throughput: 0: 42743.2. Samples: 8929847120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:13:13,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-24 00:13:13,877][15401] Updated weights for policy 0, policy_version 545030 (0.0042) [2024-06-24 00:13:18,153][15401] Updated weights for policy 0, policy_version 545040 (0.0032) [2024-06-24 00:13:18,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8929935360. Throughput: 0: 42849.2. Samples: 8930105260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 00:13:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 00:13:21,574][15401] Updated weights for policy 0, policy_version 545050 (0.0039) [2024-06-24 00:13:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 8930148352. Throughput: 0: 42840.8. Samples: 8930229420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 00:13:25,659][15401] Updated weights for policy 0, policy_version 545060 (0.0043) [2024-06-24 00:13:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8930377728. Throughput: 0: 42907.6. Samples: 8930489020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 00:13:29,510][15401] Updated weights for policy 0, policy_version 545070 (0.0037) [2024-06-24 00:13:33,163][15401] Updated weights for policy 0, policy_version 545080 (0.0022) [2024-06-24 00:13:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42872.0, 300 sec: 42709.5). Total num frames: 8930590720. Throughput: 0: 42775.4. Samples: 8930745820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 00:13:37,281][15401] Updated weights for policy 0, policy_version 545090 (0.0031) [2024-06-24 00:13:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8930770944. Throughput: 0: 42874.7. Samples: 8930868620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 00:13:41,046][15401] Updated weights for policy 0, policy_version 545100 (0.0031) [2024-06-24 00:13:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 8931016704. Throughput: 0: 42801.1. Samples: 8931126420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 00:13:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545107_8931033088.pth... [2024-06-24 00:13:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544480_8920760320.pth [2024-06-24 00:13:44,792][15401] Updated weights for policy 0, policy_version 545110 (0.0031) [2024-06-24 00:13:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8931213312. Throughput: 0: 42704.9. Samples: 8931382080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 00:13:48,621][15401] Updated weights for policy 0, policy_version 545120 (0.0041) [2024-06-24 00:13:52,365][15401] Updated weights for policy 0, policy_version 545130 (0.0038) [2024-06-24 00:13:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 8931426304. Throughput: 0: 42749.4. Samples: 8931508920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:53,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 00:13:56,391][15401] Updated weights for policy 0, policy_version 545140 (0.0028) [2024-06-24 00:13:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 8931672064. Throughput: 0: 42694.6. Samples: 8931768380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:13:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 00:14:00,100][15401] Updated weights for policy 0, policy_version 545150 (0.0033) [2024-06-24 00:14:03,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.1). Total num frames: 8931852288. Throughput: 0: 42575.5. Samples: 8932021260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:03,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 00:14:04,307][15401] Updated weights for policy 0, policy_version 545160 (0.0043) [2024-06-24 00:14:04,811][15349] Signal inference workers to stop experience collection... (132350 times) [2024-06-24 00:14:04,875][15401] InferenceWorker_p0-w0: stopping experience collection (132350 times) [2024-06-24 00:14:04,934][15349] Signal inference workers to resume experience collection... (132350 times) [2024-06-24 00:14:04,934][15401] InferenceWorker_p0-w0: resuming experience collection (132350 times) [2024-06-24 00:14:08,226][15401] Updated weights for policy 0, policy_version 545170 (0.0028) [2024-06-24 00:14:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8932065280. Throughput: 0: 42358.4. Samples: 8932135540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 00:14:12,090][15401] Updated weights for policy 0, policy_version 545180 (0.0034) [2024-06-24 00:14:13,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8932294656. Throughput: 0: 42324.5. Samples: 8932393620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 00:14:16,025][15401] Updated weights for policy 0, policy_version 545190 (0.0031) [2024-06-24 00:14:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 8932491264. Throughput: 0: 42410.7. Samples: 8932654300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:14:19,915][15401] Updated weights for policy 0, policy_version 545200 (0.0024) [2024-06-24 00:14:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8932704256. Throughput: 0: 42404.8. Samples: 8932776840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:23,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 00:14:23,522][15401] Updated weights for policy 0, policy_version 545210 (0.0034) [2024-06-24 00:14:27,460][15401] Updated weights for policy 0, policy_version 545220 (0.0035) [2024-06-24 00:14:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8932917248. Throughput: 0: 42515.3. Samples: 8933039600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 00:14:31,103][15401] Updated weights for policy 0, policy_version 545230 (0.0034) [2024-06-24 00:14:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 8933130240. Throughput: 0: 42592.8. Samples: 8933298760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:33,391][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 00:14:35,257][15401] Updated weights for policy 0, policy_version 545240 (0.0041) [2024-06-24 00:14:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8933343232. Throughput: 0: 42547.6. Samples: 8933423460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 00:14:38,923][15401] Updated weights for policy 0, policy_version 545250 (0.0028) [2024-06-24 00:14:42,812][15401] Updated weights for policy 0, policy_version 545260 (0.0033) [2024-06-24 00:14:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8933556224. Throughput: 0: 42648.1. Samples: 8933687560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 00:14:46,387][15401] Updated weights for policy 0, policy_version 545270 (0.0042) [2024-06-24 00:14:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8933752832. Throughput: 0: 42602.4. Samples: 8933938260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 00:14:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 00:14:50,446][15401] Updated weights for policy 0, policy_version 545280 (0.0044) [2024-06-24 00:14:53,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 8933982208. Throughput: 0: 42828.8. Samples: 8934062840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:14:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 00:14:54,110][15401] Updated weights for policy 0, policy_version 545290 (0.0026) [2024-06-24 00:14:58,111][15401] Updated weights for policy 0, policy_version 545300 (0.0040) [2024-06-24 00:14:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 8934195200. Throughput: 0: 42879.6. Samples: 8934323200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:14:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 00:15:01,839][15401] Updated weights for policy 0, policy_version 545310 (0.0036) [2024-06-24 00:15:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 8934408192. Throughput: 0: 42671.0. Samples: 8934574600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:03,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 00:15:05,687][15401] Updated weights for policy 0, policy_version 545320 (0.0034) [2024-06-24 00:15:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8934621184. Throughput: 0: 42725.9. Samples: 8934699500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 00:15:09,738][15401] Updated weights for policy 0, policy_version 545330 (0.0030) [2024-06-24 00:15:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8934834176. Throughput: 0: 42624.0. Samples: 8934957680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 00:15:13,499][15401] Updated weights for policy 0, policy_version 545340 (0.0030) [2024-06-24 00:15:17,391][15401] Updated weights for policy 0, policy_version 545350 (0.0028) [2024-06-24 00:15:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8935047168. Throughput: 0: 42491.7. Samples: 8935210880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 00:15:21,209][15401] Updated weights for policy 0, policy_version 545360 (0.0036) [2024-06-24 00:15:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 8935243776. Throughput: 0: 42645.5. Samples: 8935342500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 00:15:24,948][15401] Updated weights for policy 0, policy_version 545370 (0.0034) [2024-06-24 00:15:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8935473152. Throughput: 0: 42435.9. Samples: 8935597160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 00:15:28,839][15401] Updated weights for policy 0, policy_version 545380 (0.0027) [2024-06-24 00:15:32,601][15401] Updated weights for policy 0, policy_version 545390 (0.0027) [2024-06-24 00:15:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.9). Total num frames: 8935686144. Throughput: 0: 42583.0. Samples: 8935854500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 00:15:36,381][15401] Updated weights for policy 0, policy_version 545400 (0.0038) [2024-06-24 00:15:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8935882752. Throughput: 0: 42726.5. Samples: 8935985540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:38,399][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 00:15:40,064][15401] Updated weights for policy 0, policy_version 545410 (0.0037) [2024-06-24 00:15:43,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42870.0, 300 sec: 42709.1). Total num frames: 8936128512. Throughput: 0: 42580.4. Samples: 8936239420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:43,404][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 00:15:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545418_8936128512.pth... [2024-06-24 00:15:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000544793_8925888512.pth [2024-06-24 00:15:44,069][15401] Updated weights for policy 0, policy_version 545420 (0.0047) [2024-06-24 00:15:48,117][15349] Signal inference workers to stop experience collection... (132400 times) [2024-06-24 00:15:48,118][15349] Signal inference workers to resume experience collection... (132400 times) [2024-06-24 00:15:48,130][15401] Updated weights for policy 0, policy_version 545430 (0.0030) [2024-06-24 00:15:48,160][15401] InferenceWorker_p0-w0: stopping experience collection (132400 times) [2024-06-24 00:15:48,160][15401] InferenceWorker_p0-w0: resuming experience collection (132400 times) [2024-06-24 00:15:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8936341504. Throughput: 0: 42714.4. Samples: 8936496640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 00:15:51,965][15401] Updated weights for policy 0, policy_version 545440 (0.0036) [2024-06-24 00:15:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8936538112. Throughput: 0: 42777.2. Samples: 8936624480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 00:15:55,817][15401] Updated weights for policy 0, policy_version 545450 (0.0046) [2024-06-24 00:15:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8936783872. Throughput: 0: 42582.7. Samples: 8936873900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:15:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 00:15:59,630][15401] Updated weights for policy 0, policy_version 545460 (0.0033) [2024-06-24 00:16:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 8936964096. Throughput: 0: 42872.5. Samples: 8937140140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:16:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 00:16:03,490][15401] Updated weights for policy 0, policy_version 545470 (0.0040) [2024-06-24 00:16:07,285][15401] Updated weights for policy 0, policy_version 545480 (0.0043) [2024-06-24 00:16:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8937177088. Throughput: 0: 42555.6. Samples: 8937257500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:16:08,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 00:16:11,288][15401] Updated weights for policy 0, policy_version 545490 (0.0043) [2024-06-24 00:16:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 8937439232. Throughput: 0: 42538.5. Samples: 8937511400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:16:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 00:16:15,429][15401] Updated weights for policy 0, policy_version 545500 (0.0030) [2024-06-24 00:16:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8937570304. Throughput: 0: 42852.6. Samples: 8937782860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 00:16:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 00:16:18,856][15401] Updated weights for policy 0, policy_version 545510 (0.0029) [2024-06-24 00:16:23,069][15401] Updated weights for policy 0, policy_version 545520 (0.0037) [2024-06-24 00:16:23,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 8937816064. Throughput: 0: 42346.7. Samples: 8937891240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:23,393][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 00:16:26,485][15401] Updated weights for policy 0, policy_version 545530 (0.0040) [2024-06-24 00:16:28,390][15132] Fps is (10 sec: 50789.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 8938078208. Throughput: 0: 42663.1. Samples: 8938159160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:28,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 00:16:30,570][15401] Updated weights for policy 0, policy_version 545540 (0.0037) [2024-06-24 00:16:33,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 8938225664. Throughput: 0: 42822.6. Samples: 8938423760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:33,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 00:16:34,326][15401] Updated weights for policy 0, policy_version 545550 (0.0035) [2024-06-24 00:16:38,094][15401] Updated weights for policy 0, policy_version 545560 (0.0028) [2024-06-24 00:16:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8938455040. Throughput: 0: 42400.0. Samples: 8938532480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:38,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 00:16:42,024][15401] Updated weights for policy 0, policy_version 545570 (0.0047) [2024-06-24 00:16:43,390][15132] Fps is (10 sec: 49163.6, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 8938717184. Throughput: 0: 42744.8. Samples: 8938797420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:43,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 00:16:45,546][15401] Updated weights for policy 0, policy_version 545580 (0.0033) [2024-06-24 00:16:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8938864640. Throughput: 0: 42692.9. Samples: 8939061320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 00:16:49,863][15401] Updated weights for policy 0, policy_version 545590 (0.0030) [2024-06-24 00:16:53,051][15401] Updated weights for policy 0, policy_version 545600 (0.0040) [2024-06-24 00:16:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8939110400. Throughput: 0: 42636.3. Samples: 8939176140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 00:16:57,399][15401] Updated weights for policy 0, policy_version 545610 (0.0028) [2024-06-24 00:16:58,392][15132] Fps is (10 sec: 47501.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 8939339776. Throughput: 0: 43035.5. Samples: 8939448100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:16:58,403][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 00:16:58,982][15349] Signal inference workers to stop experience collection... (132450 times) [2024-06-24 00:16:59,027][15401] InferenceWorker_p0-w0: stopping experience collection (132450 times) [2024-06-24 00:16:59,036][15349] Signal inference workers to resume experience collection... (132450 times) [2024-06-24 00:16:59,047][15401] InferenceWorker_p0-w0: resuming experience collection (132450 times) [2024-06-24 00:17:00,452][15401] Updated weights for policy 0, policy_version 545620 (0.0034) [2024-06-24 00:17:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8939503616. Throughput: 0: 42637.7. Samples: 8939701560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 00:17:05,113][15401] Updated weights for policy 0, policy_version 545630 (0.0044) [2024-06-24 00:17:07,955][15401] Updated weights for policy 0, policy_version 545640 (0.0034) [2024-06-24 00:17:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 8939765760. Throughput: 0: 42956.1. Samples: 8939824160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 00:17:12,628][15401] Updated weights for policy 0, policy_version 545650 (0.0040) [2024-06-24 00:17:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 8939945984. Throughput: 0: 42903.1. Samples: 8940089800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 00:17:15,523][15401] Updated weights for policy 0, policy_version 545660 (0.0032) [2024-06-24 00:17:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8940158976. Throughput: 0: 42710.3. Samples: 8940345620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 00:17:20,135][15401] Updated weights for policy 0, policy_version 545670 (0.0040) [2024-06-24 00:17:23,201][15401] Updated weights for policy 0, policy_version 545680 (0.0028) [2024-06-24 00:17:23,396][15132] Fps is (10 sec: 47483.3, 60 sec: 43414.7, 300 sec: 42708.5). Total num frames: 8940421120. Throughput: 0: 43079.3. Samples: 8940471320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:23,396][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 00:17:27,812][15401] Updated weights for policy 0, policy_version 545690 (0.0031) [2024-06-24 00:17:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 8940601344. Throughput: 0: 42874.7. Samples: 8940726780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:28,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-24 00:17:30,912][15401] Updated weights for policy 0, policy_version 545700 (0.0035) [2024-06-24 00:17:33,390][15132] Fps is (10 sec: 37707.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 8940797952. Throughput: 0: 42801.2. Samples: 8940987380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 00:17:35,377][15401] Updated weights for policy 0, policy_version 545710 (0.0031) [2024-06-24 00:17:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 8941043712. Throughput: 0: 43058.2. Samples: 8941113760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 00:17:38,757][15401] Updated weights for policy 0, policy_version 545720 (0.0040) [2024-06-24 00:17:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 8941223936. Throughput: 0: 42626.8. Samples: 8941366200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 00:17:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545730_8941240320.pth... [2024-06-24 00:17:43,447][15401] Updated weights for policy 0, policy_version 545730 (0.0037) [2024-06-24 00:17:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545107_8931033088.pth [2024-06-24 00:17:46,465][15401] Updated weights for policy 0, policy_version 545740 (0.0035) [2024-06-24 00:17:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8941436928. Throughput: 0: 42770.3. Samples: 8941626220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 00:17:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 00:17:51,014][15401] Updated weights for policy 0, policy_version 545750 (0.0030) [2024-06-24 00:17:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8941666304. Throughput: 0: 42816.4. Samples: 8941750900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:17:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 00:17:54,041][15401] Updated weights for policy 0, policy_version 545760 (0.0036) [2024-06-24 00:17:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42054.0, 300 sec: 42654.3). Total num frames: 8941862912. Throughput: 0: 42645.4. Samples: 8942008840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:17:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 00:17:58,656][15401] Updated weights for policy 0, policy_version 545770 (0.0031) [2024-06-24 00:18:02,213][15401] Updated weights for policy 0, policy_version 545780 (0.0035) [2024-06-24 00:18:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8942075904. Throughput: 0: 42548.9. Samples: 8942260320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:03,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 00:18:06,279][15401] Updated weights for policy 0, policy_version 545790 (0.0040) [2024-06-24 00:18:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8942321664. Throughput: 0: 42565.2. Samples: 8942386480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:08,391][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 00:18:09,852][15401] Updated weights for policy 0, policy_version 545800 (0.0036) [2024-06-24 00:18:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8942501888. Throughput: 0: 42485.0. Samples: 8942638600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 00:18:14,053][15401] Updated weights for policy 0, policy_version 545810 (0.0037) [2024-06-24 00:18:17,409][15401] Updated weights for policy 0, policy_version 545820 (0.0034) [2024-06-24 00:18:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8942731264. Throughput: 0: 42533.5. Samples: 8942901380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 00:18:19,507][15349] Signal inference workers to stop experience collection... (132500 times) [2024-06-24 00:18:19,508][15349] Signal inference workers to resume experience collection... (132500 times) [2024-06-24 00:18:19,534][15401] InferenceWorker_p0-w0: stopping experience collection (132500 times) [2024-06-24 00:18:19,534][15401] InferenceWorker_p0-w0: resuming experience collection (132500 times) [2024-06-24 00:18:21,592][15401] Updated weights for policy 0, policy_version 545830 (0.0032) [2024-06-24 00:18:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42056.8, 300 sec: 42598.4). Total num frames: 8942944256. Throughput: 0: 42557.4. Samples: 8943028840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 00:18:25,012][15401] Updated weights for policy 0, policy_version 545840 (0.0036) [2024-06-24 00:18:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8943157248. Throughput: 0: 42591.0. Samples: 8943282800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 00:18:28,987][15401] Updated weights for policy 0, policy_version 545850 (0.0037) [2024-06-24 00:18:33,071][15401] Updated weights for policy 0, policy_version 545860 (0.0033) [2024-06-24 00:18:33,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 8943370240. Throughput: 0: 42381.9. Samples: 8943533680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:33,396][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 00:18:36,631][15401] Updated weights for policy 0, policy_version 545870 (0.0028) [2024-06-24 00:18:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8943583232. Throughput: 0: 42523.6. Samples: 8943664460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 00:18:40,555][15401] Updated weights for policy 0, policy_version 545880 (0.0033) [2024-06-24 00:18:43,389][15132] Fps is (10 sec: 44265.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8943812608. Throughput: 0: 42618.2. Samples: 8943926660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 00:18:44,093][15401] Updated weights for policy 0, policy_version 545890 (0.0030) [2024-06-24 00:18:48,076][15401] Updated weights for policy 0, policy_version 545900 (0.0028) [2024-06-24 00:18:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 8944025600. Throughput: 0: 42653.3. Samples: 8944179720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 00:18:51,556][15401] Updated weights for policy 0, policy_version 545910 (0.0035) [2024-06-24 00:18:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8944238592. Throughput: 0: 42713.0. Samples: 8944308560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 00:18:55,725][15401] Updated weights for policy 0, policy_version 545920 (0.0034) [2024-06-24 00:18:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 8944451584. Throughput: 0: 43044.3. Samples: 8944575600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:18:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 00:18:59,432][15401] Updated weights for policy 0, policy_version 545930 (0.0039) [2024-06-24 00:19:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8944664576. Throughput: 0: 42668.0. Samples: 8944821440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:19:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 00:19:03,428][15401] Updated weights for policy 0, policy_version 545940 (0.0028) [2024-06-24 00:19:07,190][15401] Updated weights for policy 0, policy_version 545950 (0.0036) [2024-06-24 00:19:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8944861184. Throughput: 0: 42724.4. Samples: 8944951440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:19:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 00:19:11,097][15401] Updated weights for policy 0, policy_version 545960 (0.0045) [2024-06-24 00:19:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8945090560. Throughput: 0: 42771.1. Samples: 8945207500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:19:13,404][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:19:15,427][15401] Updated weights for policy 0, policy_version 545970 (0.0030) [2024-06-24 00:19:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8945319936. Throughput: 0: 42769.2. Samples: 8945458020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 00:19:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 00:19:18,749][15401] Updated weights for policy 0, policy_version 545980 (0.0029) [2024-06-24 00:19:23,139][15401] Updated weights for policy 0, policy_version 545990 (0.0026) [2024-06-24 00:19:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8945500160. Throughput: 0: 42745.4. Samples: 8945588000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 00:19:26,613][15401] Updated weights for policy 0, policy_version 546000 (0.0034) [2024-06-24 00:19:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8945729536. Throughput: 0: 42580.9. Samples: 8945842800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 00:19:30,751][15401] Updated weights for policy 0, policy_version 546010 (0.0032) [2024-06-24 00:19:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42329.8, 300 sec: 42598.4). Total num frames: 8945909760. Throughput: 0: 42774.3. Samples: 8946104560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 00:19:34,238][15401] Updated weights for policy 0, policy_version 546020 (0.0029) [2024-06-24 00:19:38,269][15401] Updated weights for policy 0, policy_version 546030 (0.0029) [2024-06-24 00:19:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8946155520. Throughput: 0: 42663.9. Samples: 8946228440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 00:19:42,076][15349] Signal inference workers to stop experience collection... (132550 times) [2024-06-24 00:19:42,076][15349] Signal inference workers to resume experience collection... (132550 times) [2024-06-24 00:19:42,107][15401] InferenceWorker_p0-w0: stopping experience collection (132550 times) [2024-06-24 00:19:42,107][15401] InferenceWorker_p0-w0: resuming experience collection (132550 times) [2024-06-24 00:19:42,216][15401] Updated weights for policy 0, policy_version 546040 (0.0026) [2024-06-24 00:19:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 8946368512. Throughput: 0: 42400.9. Samples: 8946483640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 00:19:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546043_8946368512.pth... [2024-06-24 00:19:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545418_8936128512.pth [2024-06-24 00:19:45,776][15401] Updated weights for policy 0, policy_version 546050 (0.0037) [2024-06-24 00:19:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 8946565120. Throughput: 0: 42538.6. Samples: 8946735680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 00:19:50,272][15401] Updated weights for policy 0, policy_version 546060 (0.0036) [2024-06-24 00:19:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8946794496. Throughput: 0: 42452.0. Samples: 8946861780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 00:19:53,814][15401] Updated weights for policy 0, policy_version 546070 (0.0026) [2024-06-24 00:19:57,637][15401] Updated weights for policy 0, policy_version 546080 (0.0021) [2024-06-24 00:19:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 8947023872. Throughput: 0: 42652.0. Samples: 8947126840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:19:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 00:20:01,361][15401] Updated weights for policy 0, policy_version 546090 (0.0033) [2024-06-24 00:20:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8947220480. Throughput: 0: 42772.4. Samples: 8947382780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 00:20:05,109][15401] Updated weights for policy 0, policy_version 546100 (0.0021) [2024-06-24 00:20:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8947433472. Throughput: 0: 42724.0. Samples: 8947510580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 00:20:09,001][15401] Updated weights for policy 0, policy_version 546110 (0.0030) [2024-06-24 00:20:12,721][15401] Updated weights for policy 0, policy_version 546120 (0.0027) [2024-06-24 00:20:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 8947662848. Throughput: 0: 42987.1. Samples: 8947777220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 00:20:16,766][15401] Updated weights for policy 0, policy_version 546130 (0.0042) [2024-06-24 00:20:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 8947843072. Throughput: 0: 42836.6. Samples: 8948032200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 00:20:20,163][15401] Updated weights for policy 0, policy_version 546140 (0.0029) [2024-06-24 00:20:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 8948072448. Throughput: 0: 42751.5. Samples: 8948152260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 00:20:24,232][15401] Updated weights for policy 0, policy_version 546150 (0.0041) [2024-06-24 00:20:27,828][15401] Updated weights for policy 0, policy_version 546160 (0.0036) [2024-06-24 00:20:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8948301824. Throughput: 0: 42914.7. Samples: 8948414800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:28,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-24 00:20:31,735][15401] Updated weights for policy 0, policy_version 546170 (0.0045) [2024-06-24 00:20:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8948482048. Throughput: 0: 42989.9. Samples: 8948670220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:33,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-24 00:20:35,914][15401] Updated weights for policy 0, policy_version 546180 (0.0035) [2024-06-24 00:20:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 8948711424. Throughput: 0: 42982.2. Samples: 8948795980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:38,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 00:20:39,354][15401] Updated weights for policy 0, policy_version 546190 (0.0043) [2024-06-24 00:20:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8948908032. Throughput: 0: 42685.3. Samples: 8949047680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 00:20:43,646][15401] Updated weights for policy 0, policy_version 546200 (0.0022) [2024-06-24 00:20:47,214][15401] Updated weights for policy 0, policy_version 546210 (0.0047) [2024-06-24 00:20:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 8949121024. Throughput: 0: 42657.4. Samples: 8949302360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-24 00:20:51,274][15401] Updated weights for policy 0, policy_version 546220 (0.0049) [2024-06-24 00:20:51,384][15349] Signal inference workers to stop experience collection... (132600 times) [2024-06-24 00:20:51,424][15401] InferenceWorker_p0-w0: stopping experience collection (132600 times) [2024-06-24 00:20:51,494][15349] Signal inference workers to resume experience collection... (132600 times) [2024-06-24 00:20:51,494][15401] InferenceWorker_p0-w0: resuming experience collection (132600 times) [2024-06-24 00:20:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8949350400. Throughput: 0: 42749.7. Samples: 8949434320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 00:20:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 00:20:54,834][15401] Updated weights for policy 0, policy_version 546230 (0.0036) [2024-06-24 00:20:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8949563392. Throughput: 0: 42550.5. Samples: 8949692000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:20:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 00:20:58,743][15401] Updated weights for policy 0, policy_version 546240 (0.0031) [2024-06-24 00:21:02,406][15401] Updated weights for policy 0, policy_version 546250 (0.0028) [2024-06-24 00:21:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.5, 300 sec: 42653.6). Total num frames: 8949760000. Throughput: 0: 42598.8. Samples: 8949949260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:03,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 00:21:06,289][15401] Updated weights for policy 0, policy_version 546260 (0.0033) [2024-06-24 00:21:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8949989376. Throughput: 0: 42778.7. Samples: 8950077300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 00:21:09,902][15401] Updated weights for policy 0, policy_version 546270 (0.0035) [2024-06-24 00:21:13,390][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 8950202368. Throughput: 0: 42740.3. Samples: 8950338120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 00:21:13,875][15401] Updated weights for policy 0, policy_version 546280 (0.0023) [2024-06-24 00:21:17,403][15401] Updated weights for policy 0, policy_version 546290 (0.0030) [2024-06-24 00:21:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 8950415360. Throughput: 0: 42787.5. Samples: 8950595660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 00:21:21,588][15401] Updated weights for policy 0, policy_version 546300 (0.0028) [2024-06-24 00:21:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8950644736. Throughput: 0: 42895.1. Samples: 8950726260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:21:24,896][15401] Updated weights for policy 0, policy_version 546310 (0.0032) [2024-06-24 00:21:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 8950841344. Throughput: 0: 42940.1. Samples: 8950979980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:28,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 00:21:29,449][15401] Updated weights for policy 0, policy_version 546320 (0.0033) [2024-06-24 00:21:32,433][15401] Updated weights for policy 0, policy_version 546330 (0.0036) [2024-06-24 00:21:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8951070720. Throughput: 0: 42902.2. Samples: 8951232960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 00:21:37,223][15401] Updated weights for policy 0, policy_version 546340 (0.0032) [2024-06-24 00:21:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8951283712. Throughput: 0: 42878.6. Samples: 8951363860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 00:21:40,726][15401] Updated weights for policy 0, policy_version 546350 (0.0038) [2024-06-24 00:21:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8951480320. Throughput: 0: 42744.5. Samples: 8951615500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 00:21:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546356_8951496704.pth... [2024-06-24 00:21:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000545730_8941240320.pth [2024-06-24 00:21:44,980][15401] Updated weights for policy 0, policy_version 546360 (0.0034) [2024-06-24 00:21:48,316][15401] Updated weights for policy 0, policy_version 546370 (0.0034) [2024-06-24 00:21:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 8951726080. Throughput: 0: 42684.1. Samples: 8951869940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 00:21:52,766][15401] Updated weights for policy 0, policy_version 546380 (0.0028) [2024-06-24 00:21:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 8951906304. Throughput: 0: 42681.7. Samples: 8951997980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 00:21:55,904][15401] Updated weights for policy 0, policy_version 546390 (0.0035) [2024-06-24 00:21:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8952119296. Throughput: 0: 42547.6. Samples: 8952252760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:21:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 00:22:00,415][15401] Updated weights for policy 0, policy_version 546400 (0.0030) [2024-06-24 00:22:02,901][15349] Signal inference workers to stop experience collection... (132650 times) [2024-06-24 00:22:02,902][15349] Signal inference workers to resume experience collection... (132650 times) [2024-06-24 00:22:02,947][15401] InferenceWorker_p0-w0: stopping experience collection (132650 times) [2024-06-24 00:22:02,947][15401] InferenceWorker_p0-w0: resuming experience collection (132650 times) [2024-06-24 00:22:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43146.4, 300 sec: 42653.9). Total num frames: 8952348672. Throughput: 0: 42441.8. Samples: 8952505540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:22:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 00:22:03,652][15401] Updated weights for policy 0, policy_version 546410 (0.0036) [2024-06-24 00:22:08,236][15401] Updated weights for policy 0, policy_version 546420 (0.0035) [2024-06-24 00:22:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8952545280. Throughput: 0: 42460.8. Samples: 8952637000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:22:08,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 00:22:11,287][15401] Updated weights for policy 0, policy_version 546430 (0.0036) [2024-06-24 00:22:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8952758272. Throughput: 0: 42416.3. Samples: 8952888720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:22:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 00:22:16,234][15401] Updated weights for policy 0, policy_version 546440 (0.0029) [2024-06-24 00:22:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42654.9). Total num frames: 8953004032. Throughput: 0: 42371.9. Samples: 8953139700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:22:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 00:22:19,233][15401] Updated weights for policy 0, policy_version 546450 (0.0038) [2024-06-24 00:22:23,393][15132] Fps is (10 sec: 40944.6, 60 sec: 42049.6, 300 sec: 42597.9). Total num frames: 8953167872. Throughput: 0: 42351.2. Samples: 8953269820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 00:22:23,394][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 00:22:23,926][15401] Updated weights for policy 0, policy_version 546460 (0.0020) [2024-06-24 00:22:26,989][15401] Updated weights for policy 0, policy_version 546470 (0.0031) [2024-06-24 00:22:28,394][15132] Fps is (10 sec: 40943.5, 60 sec: 42868.5, 300 sec: 42764.4). Total num frames: 8953413632. Throughput: 0: 42263.7. Samples: 8953517540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:28,394][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 00:22:31,534][15401] Updated weights for policy 0, policy_version 546480 (0.0040) [2024-06-24 00:22:33,389][15132] Fps is (10 sec: 44254.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8953610240. Throughput: 0: 42374.0. Samples: 8953776760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:33,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 00:22:34,630][15401] Updated weights for policy 0, policy_version 546490 (0.0043) [2024-06-24 00:22:38,390][15132] Fps is (10 sec: 40976.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8953823232. Throughput: 0: 42351.6. Samples: 8953903800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 00:22:39,167][15401] Updated weights for policy 0, policy_version 546500 (0.0035) [2024-06-24 00:22:42,239][15401] Updated weights for policy 0, policy_version 546510 (0.0035) [2024-06-24 00:22:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8954052608. Throughput: 0: 42367.1. Samples: 8954159280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:43,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-24 00:22:46,994][15401] Updated weights for policy 0, policy_version 546520 (0.0038) [2024-06-24 00:22:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 8954249216. Throughput: 0: 42563.1. Samples: 8954420880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 00:22:49,851][15401] Updated weights for policy 0, policy_version 546530 (0.0054) [2024-06-24 00:22:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 8954462208. Throughput: 0: 42339.0. Samples: 8954542260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 00:22:54,454][15401] Updated weights for policy 0, policy_version 546540 (0.0028) [2024-06-24 00:22:57,662][15401] Updated weights for policy 0, policy_version 546550 (0.0035) [2024-06-24 00:22:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8954691584. Throughput: 0: 42513.8. Samples: 8954801840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:22:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:23:02,007][15401] Updated weights for policy 0, policy_version 546560 (0.0029) [2024-06-24 00:23:03,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42324.2, 300 sec: 42598.2). Total num frames: 8954888192. Throughput: 0: 42601.7. Samples: 8955056840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:03,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 00:23:05,650][15401] Updated weights for policy 0, policy_version 546570 (0.0041) [2024-06-24 00:23:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8955101184. Throughput: 0: 42480.0. Samples: 8955181260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 00:23:09,633][15401] Updated weights for policy 0, policy_version 546580 (0.0025) [2024-06-24 00:23:13,168][15401] Updated weights for policy 0, policy_version 546590 (0.0047) [2024-06-24 00:23:13,390][15132] Fps is (10 sec: 44243.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8955330560. Throughput: 0: 42757.2. Samples: 8955441440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 00:23:17,311][15401] Updated weights for policy 0, policy_version 546600 (0.0029) [2024-06-24 00:23:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 8955510784. Throughput: 0: 42587.9. Samples: 8955693220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 00:23:21,267][15401] Updated weights for policy 0, policy_version 546610 (0.0045) [2024-06-24 00:23:23,391][15132] Fps is (10 sec: 40953.2, 60 sec: 42873.0, 300 sec: 42653.7). Total num frames: 8955740160. Throughput: 0: 42588.7. Samples: 8955820360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:23,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 00:23:25,360][15401] Updated weights for policy 0, policy_version 546620 (0.0033) [2024-06-24 00:23:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42328.2, 300 sec: 42654.9). Total num frames: 8955953152. Throughput: 0: 42458.2. Samples: 8956069900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 00:23:28,717][15401] Updated weights for policy 0, policy_version 546630 (0.0030) [2024-06-24 00:23:32,954][15401] Updated weights for policy 0, policy_version 546640 (0.0043) [2024-06-24 00:23:33,390][15132] Fps is (10 sec: 40966.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8956149760. Throughput: 0: 42488.0. Samples: 8956332840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 00:23:36,133][15401] Updated weights for policy 0, policy_version 546650 (0.0033) [2024-06-24 00:23:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8956379136. Throughput: 0: 42548.5. Samples: 8956456940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:38,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 00:23:40,642][15401] Updated weights for policy 0, policy_version 546660 (0.0029) [2024-06-24 00:23:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8956608512. Throughput: 0: 42697.4. Samples: 8956723220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 00:23:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546669_8956624896.pth... [2024-06-24 00:23:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546043_8946368512.pth [2024-06-24 00:23:44,182][15401] Updated weights for policy 0, policy_version 546670 (0.0043) [2024-06-24 00:23:47,201][15349] Signal inference workers to stop experience collection... (132700 times) [2024-06-24 00:23:47,202][15349] Signal inference workers to resume experience collection... (132700 times) [2024-06-24 00:23:47,236][15401] InferenceWorker_p0-w0: stopping experience collection (132700 times) [2024-06-24 00:23:47,237][15401] InferenceWorker_p0-w0: resuming experience collection (132700 times) [2024-06-24 00:23:48,232][15401] Updated weights for policy 0, policy_version 546680 (0.0044) [2024-06-24 00:23:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 8956805120. Throughput: 0: 42725.8. Samples: 8956979540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:48,393][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 00:23:51,819][15401] Updated weights for policy 0, policy_version 546690 (0.0029) [2024-06-24 00:23:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8957034496. Throughput: 0: 42622.6. Samples: 8957099280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 00:23:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 00:23:56,299][15401] Updated weights for policy 0, policy_version 546700 (0.0039) [2024-06-24 00:23:58,392][15132] Fps is (10 sec: 45875.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 8957263872. Throughput: 0: 42581.7. Samples: 8957357720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:23:58,393][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 00:23:59,372][15401] Updated weights for policy 0, policy_version 546710 (0.0031) [2024-06-24 00:24:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 8957427712. Throughput: 0: 42803.2. Samples: 8957619360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 00:24:03,894][15401] Updated weights for policy 0, policy_version 546720 (0.0033) [2024-06-24 00:24:06,927][15401] Updated weights for policy 0, policy_version 546730 (0.0030) [2024-06-24 00:24:08,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8957673472. Throughput: 0: 42575.3. Samples: 8957736180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 00:24:11,827][15401] Updated weights for policy 0, policy_version 546740 (0.0029) [2024-06-24 00:24:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8957886464. Throughput: 0: 42887.9. Samples: 8957999860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 00:24:14,468][15401] Updated weights for policy 0, policy_version 546750 (0.0043) [2024-06-24 00:24:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8958066688. Throughput: 0: 42861.3. Samples: 8958261600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 00:24:19,350][15401] Updated weights for policy 0, policy_version 546760 (0.0028) [2024-06-24 00:24:22,384][15401] Updated weights for policy 0, policy_version 546770 (0.0031) [2024-06-24 00:24:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43145.7, 300 sec: 42709.5). Total num frames: 8958328832. Throughput: 0: 42677.0. Samples: 8958377400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 00:24:26,979][15401] Updated weights for policy 0, policy_version 546780 (0.0023) [2024-06-24 00:24:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8958525440. Throughput: 0: 42563.0. Samples: 8958638560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 00:24:30,111][15401] Updated weights for policy 0, policy_version 546790 (0.0037) [2024-06-24 00:24:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8958705664. Throughput: 0: 42489.0. Samples: 8958891440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:33,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-24 00:24:34,982][15401] Updated weights for policy 0, policy_version 546800 (0.0029) [2024-06-24 00:24:37,961][15401] Updated weights for policy 0, policy_version 546810 (0.0027) [2024-06-24 00:24:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8958951424. Throughput: 0: 42444.9. Samples: 8959009300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 00:24:42,525][15401] Updated weights for policy 0, policy_version 546820 (0.0033) [2024-06-24 00:24:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 8959131648. Throughput: 0: 42475.7. Samples: 8959269020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 00:24:45,679][15401] Updated weights for policy 0, policy_version 546830 (0.0042) [2024-06-24 00:24:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 8959344640. Throughput: 0: 42196.9. Samples: 8959518220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 00:24:50,374][15401] Updated weights for policy 0, policy_version 546840 (0.0027) [2024-06-24 00:24:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 8959574016. Throughput: 0: 42425.4. Samples: 8959645320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 00:24:53,443][15401] Updated weights for policy 0, policy_version 546850 (0.0030) [2024-06-24 00:24:58,033][15401] Updated weights for policy 0, policy_version 546860 (0.0031) [2024-06-24 00:24:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 8959787008. Throughput: 0: 42404.6. Samples: 8959908060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:24:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 00:25:01,059][15401] Updated weights for policy 0, policy_version 546870 (0.0047) [2024-06-24 00:25:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 8959983616. Throughput: 0: 42032.0. Samples: 8960153040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:25:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 00:25:05,699][15401] Updated weights for policy 0, policy_version 546880 (0.0038) [2024-06-24 00:25:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 8960212992. Throughput: 0: 42357.8. Samples: 8960283500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:25:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 00:25:08,992][15401] Updated weights for policy 0, policy_version 546890 (0.0043) [2024-06-24 00:25:13,035][15349] Signal inference workers to stop experience collection... (132750 times) [2024-06-24 00:25:13,084][15401] InferenceWorker_p0-w0: stopping experience collection (132750 times) [2024-06-24 00:25:13,092][15349] Signal inference workers to resume experience collection... (132750 times) [2024-06-24 00:25:13,102][15401] InferenceWorker_p0-w0: resuming experience collection (132750 times) [2024-06-24 00:25:13,218][15401] Updated weights for policy 0, policy_version 546900 (0.0039) [2024-06-24 00:25:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 8960409600. Throughput: 0: 42311.7. Samples: 8960542580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:25:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 00:25:16,752][15401] Updated weights for policy 0, policy_version 546910 (0.0037) [2024-06-24 00:25:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8960638976. Throughput: 0: 42305.9. Samples: 8960795200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:25:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 00:25:20,757][15401] Updated weights for policy 0, policy_version 546920 (0.0038) [2024-06-24 00:25:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 8960835584. Throughput: 0: 42522.7. Samples: 8960922820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-24 00:25:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 00:25:24,399][15401] Updated weights for policy 0, policy_version 546930 (0.0028) [2024-06-24 00:25:28,286][15401] Updated weights for policy 0, policy_version 546940 (0.0027) [2024-06-24 00:25:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 8961064960. Throughput: 0: 42417.6. Samples: 8961177820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 00:25:32,104][15401] Updated weights for policy 0, policy_version 546950 (0.0032) [2024-06-24 00:25:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 8961245184. Throughput: 0: 42662.0. Samples: 8961438120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:33,393][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 00:25:35,909][15401] Updated weights for policy 0, policy_version 546960 (0.0032) [2024-06-24 00:25:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 8961490944. Throughput: 0: 42560.3. Samples: 8961560640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:38,392][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 00:25:39,700][15401] Updated weights for policy 0, policy_version 546970 (0.0038) [2024-06-24 00:25:43,358][15401] Updated weights for policy 0, policy_version 546980 (0.0028) [2024-06-24 00:25:43,392][15132] Fps is (10 sec: 47513.9, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 8961720320. Throughput: 0: 42634.5. Samples: 8961826720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:43,393][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 00:25:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546980_8961720320.pth... [2024-06-24 00:25:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546356_8951496704.pth [2024-06-24 00:25:47,226][15401] Updated weights for policy 0, policy_version 546990 (0.0020) [2024-06-24 00:25:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 8961900544. Throughput: 0: 42869.4. Samples: 8962082160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 00:25:51,085][15401] Updated weights for policy 0, policy_version 547000 (0.0040) [2024-06-24 00:25:53,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8962146304. Throughput: 0: 42718.2. Samples: 8962205820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 00:25:54,806][15401] Updated weights for policy 0, policy_version 547010 (0.0038) [2024-06-24 00:25:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 8962342912. Throughput: 0: 42804.7. Samples: 8962468800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:25:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 00:25:59,083][15401] Updated weights for policy 0, policy_version 547020 (0.0024) [2024-06-24 00:26:02,427][15401] Updated weights for policy 0, policy_version 547030 (0.0033) [2024-06-24 00:26:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8962555904. Throughput: 0: 42587.4. Samples: 8962711640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 00:26:06,813][15401] Updated weights for policy 0, policy_version 547040 (0.0031) [2024-06-24 00:26:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 8962768896. Throughput: 0: 42805.3. Samples: 8962849060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 00:26:09,480][15349] Signal inference workers to stop experience collection... (132800 times) [2024-06-24 00:26:09,480][15349] Signal inference workers to resume experience collection... (132800 times) [2024-06-24 00:26:09,513][15401] InferenceWorker_p0-w0: stopping experience collection (132800 times) [2024-06-24 00:26:09,513][15401] InferenceWorker_p0-w0: resuming experience collection (132800 times) [2024-06-24 00:26:09,961][15401] Updated weights for policy 0, policy_version 547050 (0.0029) [2024-06-24 00:26:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8962949120. Throughput: 0: 42836.1. Samples: 8963105440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:13,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 00:26:14,540][15401] Updated weights for policy 0, policy_version 547060 (0.0037) [2024-06-24 00:26:17,855][15401] Updated weights for policy 0, policy_version 547070 (0.0036) [2024-06-24 00:26:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 8963194880. Throughput: 0: 42461.9. Samples: 8963348900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:18,392][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 00:26:22,207][15401] Updated weights for policy 0, policy_version 547080 (0.0026) [2024-06-24 00:26:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8963424256. Throughput: 0: 42913.9. Samples: 8963491660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 00:26:25,606][15401] Updated weights for policy 0, policy_version 547090 (0.0035) [2024-06-24 00:26:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8963604480. Throughput: 0: 42563.6. Samples: 8963741980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:26:29,651][15401] Updated weights for policy 0, policy_version 547100 (0.0031) [2024-06-24 00:26:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.3, 300 sec: 42542.9). Total num frames: 8963833856. Throughput: 0: 42514.7. Samples: 8963995320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 00:26:33,415][15401] Updated weights for policy 0, policy_version 547110 (0.0039) [2024-06-24 00:26:37,578][15401] Updated weights for policy 0, policy_version 547120 (0.0031) [2024-06-24 00:26:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 8964046848. Throughput: 0: 42754.7. Samples: 8964129780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:38,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-24 00:26:41,285][15401] Updated weights for policy 0, policy_version 547130 (0.0031) [2024-06-24 00:26:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41780.8, 300 sec: 42376.2). Total num frames: 8964227072. Throughput: 0: 42567.6. Samples: 8964384340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 00:26:45,188][15401] Updated weights for policy 0, policy_version 547140 (0.0040) [2024-06-24 00:26:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8964472832. Throughput: 0: 42627.6. Samples: 8964629880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 00:26:48,936][15401] Updated weights for policy 0, policy_version 547150 (0.0043) [2024-06-24 00:26:52,842][15401] Updated weights for policy 0, policy_version 547160 (0.0035) [2024-06-24 00:26:53,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 8964685824. Throughput: 0: 42673.8. Samples: 8964769480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 00:26:53,392][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 00:26:56,531][15401] Updated weights for policy 0, policy_version 547170 (0.0032) [2024-06-24 00:26:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8964882432. Throughput: 0: 42576.8. Samples: 8965021400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:26:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 00:27:00,474][15401] Updated weights for policy 0, policy_version 547180 (0.0041) [2024-06-24 00:27:03,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 8965128192. Throughput: 0: 42758.3. Samples: 8965272920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 00:27:04,568][15401] Updated weights for policy 0, policy_version 547190 (0.0024) [2024-06-24 00:27:07,943][15401] Updated weights for policy 0, policy_version 547200 (0.0025) [2024-06-24 00:27:08,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8965324800. Throughput: 0: 42634.7. Samples: 8965410220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 00:27:12,273][15401] Updated weights for policy 0, policy_version 547210 (0.0038) [2024-06-24 00:27:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 8965537792. Throughput: 0: 42865.4. Samples: 8965670920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 00:27:15,508][15401] Updated weights for policy 0, policy_version 547220 (0.0024) [2024-06-24 00:27:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42710.0). Total num frames: 8965767168. Throughput: 0: 42726.7. Samples: 8965918020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:18,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 00:27:19,932][15401] Updated weights for policy 0, policy_version 547230 (0.0032) [2024-06-24 00:27:23,173][15401] Updated weights for policy 0, policy_version 547240 (0.0037) [2024-06-24 00:27:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42599.0). Total num frames: 8965980160. Throughput: 0: 42841.8. Samples: 8966057660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 00:27:27,561][15401] Updated weights for policy 0, policy_version 547250 (0.0026) [2024-06-24 00:27:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8966176768. Throughput: 0: 42997.9. Samples: 8966319240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:27:29,652][15349] Signal inference workers to stop experience collection... (132850 times) [2024-06-24 00:27:29,676][15401] InferenceWorker_p0-w0: stopping experience collection (132850 times) [2024-06-24 00:27:29,767][15349] Signal inference workers to resume experience collection... (132850 times) [2024-06-24 00:27:29,768][15401] InferenceWorker_p0-w0: resuming experience collection (132850 times) [2024-06-24 00:27:30,851][15401] Updated weights for policy 0, policy_version 547260 (0.0034) [2024-06-24 00:27:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8966406144. Throughput: 0: 43089.0. Samples: 8966568880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 00:27:35,362][15401] Updated weights for policy 0, policy_version 547270 (0.0033) [2024-06-24 00:27:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 8966619136. Throughput: 0: 42933.9. Samples: 8966701400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 00:27:38,415][15401] Updated weights for policy 0, policy_version 547280 (0.0032) [2024-06-24 00:27:43,140][15401] Updated weights for policy 0, policy_version 547290 (0.0042) [2024-06-24 00:27:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8966815744. Throughput: 0: 42976.9. Samples: 8966955360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:43,394][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 00:27:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547291_8966815744.pth... [2024-06-24 00:27:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546669_8956624896.pth [2024-06-24 00:27:45,913][15401] Updated weights for policy 0, policy_version 547300 (0.0045) [2024-06-24 00:27:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8967061504. Throughput: 0: 43004.5. Samples: 8967208120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 00:27:50,505][15401] Updated weights for policy 0, policy_version 547310 (0.0037) [2024-06-24 00:27:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 8967274496. Throughput: 0: 42975.9. Samples: 8967344140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 00:27:53,514][15401] Updated weights for policy 0, policy_version 547320 (0.0027) [2024-06-24 00:27:58,046][15401] Updated weights for policy 0, policy_version 547330 (0.0035) [2024-06-24 00:27:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.6, 300 sec: 42598.6). Total num frames: 8967454720. Throughput: 0: 42764.9. Samples: 8967595340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:27:58,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 00:28:01,237][15401] Updated weights for policy 0, policy_version 547340 (0.0032) [2024-06-24 00:28:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 8967684096. Throughput: 0: 42868.0. Samples: 8967847080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:28:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 00:28:05,586][15401] Updated weights for policy 0, policy_version 547350 (0.0034) [2024-06-24 00:28:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 8967913472. Throughput: 0: 42760.8. Samples: 8967981900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:28:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 00:28:08,714][15401] Updated weights for policy 0, policy_version 547360 (0.0038) [2024-06-24 00:28:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8968093696. Throughput: 0: 42672.9. Samples: 8968239520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:28:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 00:28:13,446][15401] Updated weights for policy 0, policy_version 547370 (0.0035) [2024-06-24 00:28:16,264][15401] Updated weights for policy 0, policy_version 547380 (0.0035) [2024-06-24 00:28:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 8968323072. Throughput: 0: 42732.9. Samples: 8968491860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:28:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 00:28:21,061][15401] Updated weights for policy 0, policy_version 547390 (0.0033) [2024-06-24 00:28:23,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 8968552448. Throughput: 0: 42736.7. Samples: 8968624660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 00:28:23,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 00:28:23,907][15401] Updated weights for policy 0, policy_version 547400 (0.0044) [2024-06-24 00:28:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 8968749056. Throughput: 0: 42806.4. Samples: 8968881640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:28,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 00:28:28,653][15401] Updated weights for policy 0, policy_version 547410 (0.0040) [2024-06-24 00:28:31,553][15401] Updated weights for policy 0, policy_version 547420 (0.0026) [2024-06-24 00:28:33,396][15132] Fps is (10 sec: 42581.5, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 8968978432. Throughput: 0: 42702.7. Samples: 8969130020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:33,397][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 00:28:36,336][15401] Updated weights for policy 0, policy_version 547430 (0.0029) [2024-06-24 00:28:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 8969191424. Throughput: 0: 42581.7. Samples: 8969260320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:38,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-24 00:28:39,705][15401] Updated weights for policy 0, policy_version 547440 (0.0032) [2024-06-24 00:28:43,389][15132] Fps is (10 sec: 37707.4, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 8969355264. Throughput: 0: 42676.8. Samples: 8969515800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 00:28:44,108][15401] Updated weights for policy 0, policy_version 547450 (0.0032) [2024-06-24 00:28:47,278][15401] Updated weights for policy 0, policy_version 547460 (0.0029) [2024-06-24 00:28:48,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 8969617408. Throughput: 0: 42567.9. Samples: 8969762640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 00:28:51,556][15349] Signal inference workers to stop experience collection... (132900 times) [2024-06-24 00:28:51,557][15349] Signal inference workers to resume experience collection... (132900 times) [2024-06-24 00:28:51,573][15401] InferenceWorker_p0-w0: stopping experience collection (132900 times) [2024-06-24 00:28:51,573][15401] InferenceWorker_p0-w0: resuming experience collection (132900 times) [2024-06-24 00:28:51,739][15401] Updated weights for policy 0, policy_version 547470 (0.0039) [2024-06-24 00:28:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 8969797632. Throughput: 0: 42696.0. Samples: 8969903220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 00:28:55,127][15401] Updated weights for policy 0, policy_version 547480 (0.0028) [2024-06-24 00:28:58,391][15132] Fps is (10 sec: 39313.8, 60 sec: 42597.0, 300 sec: 42653.6). Total num frames: 8970010624. Throughput: 0: 42535.5. Samples: 8970153700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:28:58,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 00:28:59,431][15401] Updated weights for policy 0, policy_version 547490 (0.0046) [2024-06-24 00:29:02,747][15401] Updated weights for policy 0, policy_version 547500 (0.0034) [2024-06-24 00:29:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 8970256384. Throughput: 0: 42547.1. Samples: 8970406480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 00:29:07,126][15401] Updated weights for policy 0, policy_version 547510 (0.0044) [2024-06-24 00:29:08,396][15132] Fps is (10 sec: 44217.2, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 8970452992. Throughput: 0: 42585.6. Samples: 8970541180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:08,396][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 00:29:10,877][15401] Updated weights for policy 0, policy_version 547520 (0.0034) [2024-06-24 00:29:13,394][15132] Fps is (10 sec: 40942.2, 60 sec: 42868.4, 300 sec: 42708.9). Total num frames: 8970665984. Throughput: 0: 42255.5. Samples: 8970783320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:13,394][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 00:29:15,151][15401] Updated weights for policy 0, policy_version 547530 (0.0053) [2024-06-24 00:29:18,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 8970878976. Throughput: 0: 42483.9. Samples: 8971041520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:18,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 00:29:18,447][15401] Updated weights for policy 0, policy_version 547540 (0.0044) [2024-06-24 00:29:22,698][15401] Updated weights for policy 0, policy_version 547550 (0.0027) [2024-06-24 00:29:23,389][15132] Fps is (10 sec: 40977.7, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 8971075584. Throughput: 0: 42387.3. Samples: 8971167740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 00:29:26,110][15401] Updated weights for policy 0, policy_version 547560 (0.0028) [2024-06-24 00:29:28,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 8971304960. Throughput: 0: 42345.2. Samples: 8971421440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:28,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 00:29:30,458][15401] Updated weights for policy 0, policy_version 547570 (0.0042) [2024-06-24 00:29:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 8971517952. Throughput: 0: 42664.1. Samples: 8971682520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 00:29:33,797][15401] Updated weights for policy 0, policy_version 547580 (0.0038) [2024-06-24 00:29:38,190][15401] Updated weights for policy 0, policy_version 547590 (0.0033) [2024-06-24 00:29:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 8971714560. Throughput: 0: 42371.7. Samples: 8971809940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 00:29:41,509][15401] Updated weights for policy 0, policy_version 547600 (0.0032) [2024-06-24 00:29:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8971927552. Throughput: 0: 42412.1. Samples: 8972062160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 00:29:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547604_8971943936.pth... [2024-06-24 00:29:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000546980_8961720320.pth [2024-06-24 00:29:45,627][15401] Updated weights for policy 0, policy_version 547610 (0.0034) [2024-06-24 00:29:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8972173312. Throughput: 0: 42465.7. Samples: 8972317440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:29:49,154][15401] Updated weights for policy 0, policy_version 547620 (0.0033) [2024-06-24 00:29:53,148][15401] Updated weights for policy 0, policy_version 547630 (0.0039) [2024-06-24 00:29:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8972369920. Throughput: 0: 42343.8. Samples: 8972446380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-24 00:29:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 00:29:56,745][15401] Updated weights for policy 0, policy_version 547640 (0.0041) [2024-06-24 00:29:58,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 8972550144. Throughput: 0: 42475.7. Samples: 8972694540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:29:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 00:30:01,099][15401] Updated weights for policy 0, policy_version 547650 (0.0029) [2024-06-24 00:30:03,177][15349] Signal inference workers to stop experience collection... (132950 times) [2024-06-24 00:30:03,220][15401] InferenceWorker_p0-w0: stopping experience collection (132950 times) [2024-06-24 00:30:03,227][15349] Signal inference workers to resume experience collection... (132950 times) [2024-06-24 00:30:03,233][15401] InferenceWorker_p0-w0: resuming experience collection (132950 times) [2024-06-24 00:30:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 8972812288. Throughput: 0: 42442.1. Samples: 8972951520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:03,393][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 00:30:04,375][15401] Updated weights for policy 0, policy_version 547660 (0.0039) [2024-06-24 00:30:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42329.9, 300 sec: 42653.9). Total num frames: 8972992512. Throughput: 0: 42710.2. Samples: 8973089700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 00:30:08,661][15401] Updated weights for policy 0, policy_version 547670 (0.0027) [2024-06-24 00:30:12,109][15401] Updated weights for policy 0, policy_version 547680 (0.0028) [2024-06-24 00:30:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42328.4, 300 sec: 42598.4). Total num frames: 8973205504. Throughput: 0: 42537.5. Samples: 8973335520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 00:30:16,416][15401] Updated weights for policy 0, policy_version 547690 (0.0046) [2024-06-24 00:30:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8973434880. Throughput: 0: 42460.0. Samples: 8973593220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 00:30:19,621][15401] Updated weights for policy 0, policy_version 547700 (0.0038) [2024-06-24 00:30:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 8973631488. Throughput: 0: 42598.5. Samples: 8973726880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 00:30:24,122][15401] Updated weights for policy 0, policy_version 547710 (0.0038) [2024-06-24 00:30:27,080][15401] Updated weights for policy 0, policy_version 547720 (0.0034) [2024-06-24 00:30:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 8973844480. Throughput: 0: 42431.5. Samples: 8973971580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:28,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-24 00:30:32,131][15401] Updated weights for policy 0, policy_version 547730 (0.0027) [2024-06-24 00:30:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 8974090240. Throughput: 0: 42561.3. Samples: 8974232700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 00:30:35,168][15401] Updated weights for policy 0, policy_version 547740 (0.0041) [2024-06-24 00:30:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 8974270464. Throughput: 0: 42702.2. Samples: 8974367980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:38,394][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 00:30:39,798][15401] Updated weights for policy 0, policy_version 547750 (0.0025) [2024-06-24 00:30:42,621][15401] Updated weights for policy 0, policy_version 547760 (0.0047) [2024-06-24 00:30:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8974499840. Throughput: 0: 42567.9. Samples: 8974610100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 00:30:47,555][15401] Updated weights for policy 0, policy_version 547770 (0.0030) [2024-06-24 00:30:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 8974696448. Throughput: 0: 42814.7. Samples: 8974878080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 00:30:50,157][15401] Updated weights for policy 0, policy_version 547780 (0.0041) [2024-06-24 00:30:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8974909440. Throughput: 0: 42431.4. Samples: 8974999120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 00:30:55,286][15401] Updated weights for policy 0, policy_version 547790 (0.0027) [2024-06-24 00:30:57,816][15401] Updated weights for policy 0, policy_version 547800 (0.0035) [2024-06-24 00:30:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 8975155200. Throughput: 0: 42593.7. Samples: 8975252240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:30:58,404][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 00:31:03,075][15401] Updated weights for policy 0, policy_version 547810 (0.0024) [2024-06-24 00:31:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.8, 300 sec: 42542.9). Total num frames: 8975319040. Throughput: 0: 42669.2. Samples: 8975513340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:31:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 00:31:05,989][15401] Updated weights for policy 0, policy_version 547820 (0.0026) [2024-06-24 00:31:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 8975548416. Throughput: 0: 42308.0. Samples: 8975630740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:31:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 00:31:10,916][15401] Updated weights for policy 0, policy_version 547830 (0.0043) [2024-06-24 00:31:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 8975777792. Throughput: 0: 42469.8. Samples: 8975882720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:31:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 00:31:13,782][15401] Updated weights for policy 0, policy_version 547840 (0.0038) [2024-06-24 00:31:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 8975958016. Throughput: 0: 42445.5. Samples: 8976142740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:31:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 00:31:18,506][15401] Updated weights for policy 0, policy_version 547850 (0.0037) [2024-06-24 00:31:21,319][15349] Signal inference workers to stop experience collection... (133000 times) [2024-06-24 00:31:21,319][15349] Signal inference workers to resume experience collection... (133000 times) [2024-06-24 00:31:21,360][15401] InferenceWorker_p0-w0: stopping experience collection (133000 times) [2024-06-24 00:31:21,360][15401] InferenceWorker_p0-w0: resuming experience collection (133000 times) [2024-06-24 00:31:21,458][15401] Updated weights for policy 0, policy_version 547860 (0.0033) [2024-06-24 00:31:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8976171008. Throughput: 0: 42097.9. Samples: 8976262380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 00:31:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 00:31:26,091][15401] Updated weights for policy 0, policy_version 547870 (0.0038) [2024-06-24 00:31:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8976400384. Throughput: 0: 42516.5. Samples: 8976523340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 00:31:28,981][15401] Updated weights for policy 0, policy_version 547880 (0.0034) [2024-06-24 00:31:33,391][15132] Fps is (10 sec: 42589.9, 60 sec: 41777.9, 300 sec: 42542.6). Total num frames: 8976596992. Throughput: 0: 42423.1. Samples: 8976787200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:33,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:31:33,693][15401] Updated weights for policy 0, policy_version 547890 (0.0037) [2024-06-24 00:31:36,633][15401] Updated weights for policy 0, policy_version 547900 (0.0030) [2024-06-24 00:31:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8976826368. Throughput: 0: 42311.3. Samples: 8976903120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 00:31:41,469][15401] Updated weights for policy 0, policy_version 547910 (0.0028) [2024-06-24 00:31:43,390][15132] Fps is (10 sec: 45884.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8977055744. Throughput: 0: 42504.0. Samples: 8977164920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:43,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 00:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547916_8977055744.pth... [2024-06-24 00:31:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547291_8966815744.pth [2024-06-24 00:31:44,542][15401] Updated weights for policy 0, policy_version 547920 (0.0037) [2024-06-24 00:31:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 8977235968. Throughput: 0: 42390.3. Samples: 8977420900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 00:31:49,174][15401] Updated weights for policy 0, policy_version 547930 (0.0030) [2024-06-24 00:31:52,282][15401] Updated weights for policy 0, policy_version 547940 (0.0034) [2024-06-24 00:31:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 8977465344. Throughput: 0: 42458.8. Samples: 8977541380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:53,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 00:31:56,950][15401] Updated weights for policy 0, policy_version 547950 (0.0037) [2024-06-24 00:31:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 8977694720. Throughput: 0: 42664.4. Samples: 8977802620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:31:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 00:31:59,836][15401] Updated weights for policy 0, policy_version 547960 (0.0027) [2024-06-24 00:32:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8977891328. Throughput: 0: 42535.9. Samples: 8978056860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:03,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 00:32:04,560][15401] Updated weights for policy 0, policy_version 547970 (0.0039) [2024-06-24 00:32:07,580][15401] Updated weights for policy 0, policy_version 547980 (0.0041) [2024-06-24 00:32:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8978104320. Throughput: 0: 42680.5. Samples: 8978183000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 00:32:12,176][15401] Updated weights for policy 0, policy_version 547990 (0.0037) [2024-06-24 00:32:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8978333696. Throughput: 0: 42610.2. Samples: 8978440800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 00:32:15,641][15401] Updated weights for policy 0, policy_version 548000 (0.0032) [2024-06-24 00:32:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 8978513920. Throughput: 0: 42480.6. Samples: 8978698740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 00:32:19,858][15401] Updated weights for policy 0, policy_version 548010 (0.0032) [2024-06-24 00:32:23,226][15401] Updated weights for policy 0, policy_version 548020 (0.0030) [2024-06-24 00:32:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 8978759680. Throughput: 0: 42576.0. Samples: 8978819040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 00:32:27,633][15401] Updated weights for policy 0, policy_version 548030 (0.0031) [2024-06-24 00:32:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 8978972672. Throughput: 0: 42585.0. Samples: 8979081240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 00:32:31,066][15401] Updated weights for policy 0, policy_version 548040 (0.0038) [2024-06-24 00:32:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42326.7, 300 sec: 42431.8). Total num frames: 8979136512. Throughput: 0: 42679.2. Samples: 8979341460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 00:32:35,381][15401] Updated weights for policy 0, policy_version 548050 (0.0037) [2024-06-24 00:32:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 8979398656. Throughput: 0: 42595.1. Samples: 8979458160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 00:32:38,549][15401] Updated weights for policy 0, policy_version 548060 (0.0036) [2024-06-24 00:32:43,020][15401] Updated weights for policy 0, policy_version 548070 (0.0033) [2024-06-24 00:32:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 8979595264. Throughput: 0: 42704.1. Samples: 8979724300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 00:32:45,957][15401] Updated weights for policy 0, policy_version 548080 (0.0042) [2024-06-24 00:32:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 8979791872. Throughput: 0: 42614.3. Samples: 8979974500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 00:32:50,606][15401] Updated weights for policy 0, policy_version 548090 (0.0040) [2024-06-24 00:32:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8980054016. Throughput: 0: 42624.4. Samples: 8980101100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-24 00:32:53,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 00:32:53,509][15401] Updated weights for policy 0, policy_version 548100 (0.0030) [2024-06-24 00:32:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 8980217856. Throughput: 0: 42744.9. Samples: 8980364320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:32:58,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 00:32:58,431][15401] Updated weights for policy 0, policy_version 548110 (0.0034) [2024-06-24 00:33:01,690][15401] Updated weights for policy 0, policy_version 548120 (0.0036) [2024-06-24 00:33:03,212][15349] Signal inference workers to stop experience collection... (133050 times) [2024-06-24 00:33:03,213][15349] Signal inference workers to resume experience collection... (133050 times) [2024-06-24 00:33:03,261][15401] InferenceWorker_p0-w0: stopping experience collection (133050 times) [2024-06-24 00:33:03,261][15401] InferenceWorker_p0-w0: resuming experience collection (133050 times) [2024-06-24 00:33:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 8980447232. Throughput: 0: 42831.9. Samples: 8980626180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:03,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-24 00:33:05,974][15401] Updated weights for policy 0, policy_version 548130 (0.0028) [2024-06-24 00:33:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 8980692992. Throughput: 0: 42897.8. Samples: 8980749440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:08,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 00:33:09,216][15401] Updated weights for policy 0, policy_version 548140 (0.0047) [2024-06-24 00:33:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 8980856832. Throughput: 0: 42778.3. Samples: 8981006260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:13,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 00:33:13,805][15401] Updated weights for policy 0, policy_version 548150 (0.0035) [2024-06-24 00:33:17,111][15401] Updated weights for policy 0, policy_version 548160 (0.0031) [2024-06-24 00:33:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 8981069824. Throughput: 0: 42569.3. Samples: 8981257080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 00:33:21,427][15401] Updated weights for policy 0, policy_version 548170 (0.0040) [2024-06-24 00:33:23,390][15132] Fps is (10 sec: 47508.7, 60 sec: 42870.8, 300 sec: 42653.8). Total num frames: 8981331968. Throughput: 0: 42934.7. Samples: 8981390260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:23,391][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 00:33:24,559][15401] Updated weights for policy 0, policy_version 548180 (0.0034) [2024-06-24 00:33:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42488.2). Total num frames: 8981512192. Throughput: 0: 42702.6. Samples: 8981645920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 00:33:29,017][15401] Updated weights for policy 0, policy_version 548190 (0.0039) [2024-06-24 00:33:32,033][15401] Updated weights for policy 0, policy_version 548200 (0.0035) [2024-06-24 00:33:33,390][15132] Fps is (10 sec: 39325.0, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 8981725184. Throughput: 0: 42751.2. Samples: 8981898300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 00:33:36,526][15401] Updated weights for policy 0, policy_version 548210 (0.0038) [2024-06-24 00:33:38,391][15132] Fps is (10 sec: 45866.6, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 8981970944. Throughput: 0: 42817.2. Samples: 8982027960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:38,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 00:33:39,490][15401] Updated weights for policy 0, policy_version 548220 (0.0032) [2024-06-24 00:33:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 8982167552. Throughput: 0: 42893.2. Samples: 8982294520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:43,391][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 00:33:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548228_8982167552.pth... [2024-06-24 00:33:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547604_8971943936.pth [2024-06-24 00:33:44,068][15401] Updated weights for policy 0, policy_version 548230 (0.0034) [2024-06-24 00:33:47,132][15401] Updated weights for policy 0, policy_version 548240 (0.0032) [2024-06-24 00:33:48,390][15132] Fps is (10 sec: 40967.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 8982380544. Throughput: 0: 42615.6. Samples: 8982543880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 00:33:51,989][15401] Updated weights for policy 0, policy_version 548250 (0.0033) [2024-06-24 00:33:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 8982577152. Throughput: 0: 42820.9. Samples: 8982676380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 00:33:54,726][15401] Updated weights for policy 0, policy_version 548260 (0.0026) [2024-06-24 00:33:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 8982773760. Throughput: 0: 42681.2. Samples: 8982926920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:33:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 00:33:59,405][15401] Updated weights for policy 0, policy_version 548270 (0.0032) [2024-06-24 00:34:02,856][15401] Updated weights for policy 0, policy_version 548280 (0.0024) [2024-06-24 00:34:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 8983019520. Throughput: 0: 42682.7. Samples: 8983177800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:34:03,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 00:34:07,120][15401] Updated weights for policy 0, policy_version 548290 (0.0032) [2024-06-24 00:34:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42599.0). Total num frames: 8983232512. Throughput: 0: 42751.1. Samples: 8983314020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:34:08,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 00:34:10,465][15401] Updated weights for policy 0, policy_version 548300 (0.0030) [2024-06-24 00:34:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 8983412736. Throughput: 0: 42657.8. Samples: 8983565520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:34:13,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 00:34:14,691][15401] Updated weights for policy 0, policy_version 548310 (0.0043) [2024-06-24 00:34:18,202][15401] Updated weights for policy 0, policy_version 548320 (0.0030) [2024-06-24 00:34:18,391][15132] Fps is (10 sec: 44230.8, 60 sec: 43416.6, 300 sec: 42709.3). Total num frames: 8983674880. Throughput: 0: 42746.3. Samples: 8983821940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:34:18,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 00:34:22,528][15401] Updated weights for policy 0, policy_version 548330 (0.0039) [2024-06-24 00:34:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42599.1, 300 sec: 42654.3). Total num frames: 8983887872. Throughput: 0: 42928.6. Samples: 8983959660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 00:34:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 00:34:24,849][15349] Signal inference workers to stop experience collection... (133100 times) [2024-06-24 00:34:24,850][15349] Signal inference workers to resume experience collection... (133100 times) [2024-06-24 00:34:24,871][15401] InferenceWorker_p0-w0: stopping experience collection (133100 times) [2024-06-24 00:34:24,871][15401] InferenceWorker_p0-w0: resuming experience collection (133100 times) [2024-06-24 00:34:26,019][15401] Updated weights for policy 0, policy_version 548340 (0.0033) [2024-06-24 00:34:28,390][15132] Fps is (10 sec: 39326.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 8984068096. Throughput: 0: 42647.1. Samples: 8984213640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 00:34:29,916][15401] Updated weights for policy 0, policy_version 548350 (0.0040) [2024-06-24 00:34:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 8984313856. Throughput: 0: 42758.7. Samples: 8984468020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:33,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 00:34:33,536][15401] Updated weights for policy 0, policy_version 548360 (0.0044) [2024-06-24 00:34:37,596][15401] Updated weights for policy 0, policy_version 548370 (0.0044) [2024-06-24 00:34:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42599.8, 300 sec: 42709.5). Total num frames: 8984526848. Throughput: 0: 42830.6. Samples: 8984603760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:38,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 00:34:41,041][15401] Updated weights for policy 0, policy_version 548380 (0.0030) [2024-06-24 00:34:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 8984707072. Throughput: 0: 42833.3. Samples: 8984854420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:43,393][15132] Avg episode reward: [(0, '0.176')] [2024-06-24 00:34:45,537][15401] Updated weights for policy 0, policy_version 548390 (0.0038) [2024-06-24 00:34:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 8984952832. Throughput: 0: 42744.3. Samples: 8985101300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 00:34:49,167][15401] Updated weights for policy 0, policy_version 548400 (0.0035) [2024-06-24 00:34:53,163][15401] Updated weights for policy 0, policy_version 548410 (0.0033) [2024-06-24 00:34:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 8985149440. Throughput: 0: 42812.4. Samples: 8985240580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 00:34:56,635][15401] Updated weights for policy 0, policy_version 548420 (0.0034) [2024-06-24 00:34:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 8985362432. Throughput: 0: 42912.8. Samples: 8985496600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:34:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 00:35:00,740][15401] Updated weights for policy 0, policy_version 548430 (0.0038) [2024-06-24 00:35:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8985608192. Throughput: 0: 42853.2. Samples: 8985750280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 00:35:04,160][15401] Updated weights for policy 0, policy_version 548440 (0.0042) [2024-06-24 00:35:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8985788416. Throughput: 0: 42821.1. Samples: 8985886620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 00:35:08,577][15401] Updated weights for policy 0, policy_version 548450 (0.0031) [2024-06-24 00:35:11,803][15401] Updated weights for policy 0, policy_version 548460 (0.0041) [2024-06-24 00:35:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 8986001408. Throughput: 0: 42701.8. Samples: 8986135220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 00:35:16,461][15401] Updated weights for policy 0, policy_version 548470 (0.0025) [2024-06-24 00:35:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 8986230784. Throughput: 0: 42736.0. Samples: 8986391140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 00:35:19,564][15401] Updated weights for policy 0, policy_version 548480 (0.0036) [2024-06-24 00:35:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 8986427392. Throughput: 0: 42624.3. Samples: 8986521860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 00:35:24,049][15401] Updated weights for policy 0, policy_version 548490 (0.0030) [2024-06-24 00:35:27,032][15401] Updated weights for policy 0, policy_version 548500 (0.0035) [2024-06-24 00:35:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 8986656768. Throughput: 0: 42645.0. Samples: 8986773440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 00:35:31,554][15401] Updated weights for policy 0, policy_version 548510 (0.0034) [2024-06-24 00:35:33,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 8986886144. Throughput: 0: 42939.1. Samples: 8987033660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:33,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 00:35:34,615][15401] Updated weights for policy 0, policy_version 548520 (0.0040) [2024-06-24 00:35:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8987082752. Throughput: 0: 42827.0. Samples: 8987167800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 00:35:39,082][15401] Updated weights for policy 0, policy_version 548530 (0.0042) [2024-06-24 00:35:42,193][15401] Updated weights for policy 0, policy_version 548540 (0.0028) [2024-06-24 00:35:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 8987295744. Throughput: 0: 42682.1. Samples: 8987417300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 00:35:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548541_8987295744.pth... [2024-06-24 00:35:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000547916_8977055744.pth [2024-06-24 00:35:46,678][15401] Updated weights for policy 0, policy_version 548550 (0.0033) [2024-06-24 00:35:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8987525120. Throughput: 0: 42753.8. Samples: 8987674200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 00:35:49,958][15401] Updated weights for policy 0, policy_version 548560 (0.0026) [2024-06-24 00:35:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 8987721728. Throughput: 0: 42672.1. Samples: 8987806860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 00:35:54,230][15401] Updated weights for policy 0, policy_version 548570 (0.0029) [2024-06-24 00:35:57,446][15401] Updated weights for policy 0, policy_version 548580 (0.0031) [2024-06-24 00:35:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8987934720. Throughput: 0: 42717.8. Samples: 8988057520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:35:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 00:36:01,596][15349] Signal inference workers to stop experience collection... (133150 times) [2024-06-24 00:36:01,645][15401] InferenceWorker_p0-w0: stopping experience collection (133150 times) [2024-06-24 00:36:01,652][15349] Signal inference workers to resume experience collection... (133150 times) [2024-06-24 00:36:01,659][15401] InferenceWorker_p0-w0: resuming experience collection (133150 times) [2024-06-24 00:36:01,794][15401] Updated weights for policy 0, policy_version 548590 (0.0028) [2024-06-24 00:36:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 8988147712. Throughput: 0: 42852.9. Samples: 8988319520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 00:36:05,665][15401] Updated weights for policy 0, policy_version 548600 (0.0028) [2024-06-24 00:36:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 8988344320. Throughput: 0: 42912.5. Samples: 8988452920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 00:36:09,265][15401] Updated weights for policy 0, policy_version 548610 (0.0030) [2024-06-24 00:36:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8988573696. Throughput: 0: 42887.5. Samples: 8988703380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 00:36:13,500][15401] Updated weights for policy 0, policy_version 548620 (0.0027) [2024-06-24 00:36:17,195][15401] Updated weights for policy 0, policy_version 548630 (0.0039) [2024-06-24 00:36:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8988803072. Throughput: 0: 42799.1. Samples: 8988959520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 00:36:21,030][15401] Updated weights for policy 0, policy_version 548640 (0.0035) [2024-06-24 00:36:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 8988999680. Throughput: 0: 42685.9. Samples: 8989088660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 00:36:24,759][15401] Updated weights for policy 0, policy_version 548650 (0.0033) [2024-06-24 00:36:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 8989212672. Throughput: 0: 42917.1. Samples: 8989348560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 00:36:28,533][15401] Updated weights for policy 0, policy_version 548660 (0.0044) [2024-06-24 00:36:32,408][15401] Updated weights for policy 0, policy_version 548670 (0.0040) [2024-06-24 00:36:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 8989458432. Throughput: 0: 42772.8. Samples: 8989598980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 00:36:36,088][15401] Updated weights for policy 0, policy_version 548680 (0.0036) [2024-06-24 00:36:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 8989622272. Throughput: 0: 42703.6. Samples: 8989728520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 00:36:40,087][15401] Updated weights for policy 0, policy_version 548690 (0.0025) [2024-06-24 00:36:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 8989884416. Throughput: 0: 42967.6. Samples: 8989991060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 00:36:43,531][15401] Updated weights for policy 0, policy_version 548700 (0.0033) [2024-06-24 00:36:47,630][15401] Updated weights for policy 0, policy_version 548710 (0.0035) [2024-06-24 00:36:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8990081024. Throughput: 0: 42877.8. Samples: 8990249020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 00:36:51,530][15401] Updated weights for policy 0, policy_version 548720 (0.0026) [2024-06-24 00:36:53,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8990277632. Throughput: 0: 42699.4. Samples: 8990374400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:36:55,246][15401] Updated weights for policy 0, policy_version 548730 (0.0031) [2024-06-24 00:36:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8990507008. Throughput: 0: 42646.1. Samples: 8990622460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:36:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 00:36:59,440][15401] Updated weights for policy 0, policy_version 548740 (0.0038) [2024-06-24 00:37:03,136][15401] Updated weights for policy 0, policy_version 548750 (0.0037) [2024-06-24 00:37:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8990720000. Throughput: 0: 42768.5. Samples: 8990884100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:37:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 00:37:06,894][15401] Updated weights for policy 0, policy_version 548760 (0.0026) [2024-06-24 00:37:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 8990916608. Throughput: 0: 42828.4. Samples: 8991015940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:37:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 00:37:10,769][15401] Updated weights for policy 0, policy_version 548770 (0.0036) [2024-06-24 00:37:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8991145984. Throughput: 0: 42617.7. Samples: 8991266360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:37:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 00:37:14,711][15401] Updated weights for policy 0, policy_version 548780 (0.0041) [2024-06-24 00:37:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8991358976. Throughput: 0: 42782.7. Samples: 8991524200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:37:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 00:37:18,436][15401] Updated weights for policy 0, policy_version 548790 (0.0030) [2024-06-24 00:37:22,350][15401] Updated weights for policy 0, policy_version 548800 (0.0040) [2024-06-24 00:37:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 8991555584. Throughput: 0: 42785.9. Samples: 8991653880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:37:23,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-24 00:37:25,989][15401] Updated weights for policy 0, policy_version 548810 (0.0028) [2024-06-24 00:37:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8991784960. Throughput: 0: 42448.3. Samples: 8991901240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 00:37:29,967][15401] Updated weights for policy 0, policy_version 548820 (0.0033) [2024-06-24 00:37:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 8991997952. Throughput: 0: 42570.1. Samples: 8992164680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 00:37:33,586][15401] Updated weights for policy 0, policy_version 548830 (0.0029) [2024-06-24 00:37:37,595][15401] Updated weights for policy 0, policy_version 548840 (0.0038) [2024-06-24 00:37:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8992210944. Throughput: 0: 42639.7. Samples: 8992293180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 00:37:39,827][15349] Signal inference workers to stop experience collection... (133200 times) [2024-06-24 00:37:39,832][15349] Signal inference workers to resume experience collection... (133200 times) [2024-06-24 00:37:39,841][15401] InferenceWorker_p0-w0: stopping experience collection (133200 times) [2024-06-24 00:37:39,851][15401] InferenceWorker_p0-w0: resuming experience collection (133200 times) [2024-06-24 00:37:41,564][15401] Updated weights for policy 0, policy_version 548850 (0.0027) [2024-06-24 00:37:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 8992407552. Throughput: 0: 42515.1. Samples: 8992535640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 00:37:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548853_8992407552.pth... [2024-06-24 00:37:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548228_8982167552.pth [2024-06-24 00:37:45,265][15401] Updated weights for policy 0, policy_version 548860 (0.0023) [2024-06-24 00:37:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 8992620544. Throughput: 0: 42560.7. Samples: 8992799340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 00:37:49,119][15401] Updated weights for policy 0, policy_version 548870 (0.0031) [2024-06-24 00:37:52,807][15401] Updated weights for policy 0, policy_version 548880 (0.0030) [2024-06-24 00:37:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 8992849920. Throughput: 0: 42626.6. Samples: 8992934240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:53,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 00:37:56,614][15401] Updated weights for policy 0, policy_version 548890 (0.0029) [2024-06-24 00:37:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 8993062912. Throughput: 0: 42593.8. Samples: 8993183080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:37:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 00:38:00,821][15401] Updated weights for policy 0, policy_version 548900 (0.0033) [2024-06-24 00:38:03,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 8993275904. Throughput: 0: 42680.3. Samples: 8993444820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 00:38:04,237][15401] Updated weights for policy 0, policy_version 548910 (0.0037) [2024-06-24 00:38:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 8993488896. Throughput: 0: 42773.7. Samples: 8993578700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 00:38:08,403][15401] Updated weights for policy 0, policy_version 548920 (0.0029) [2024-06-24 00:38:11,899][15401] Updated weights for policy 0, policy_version 548930 (0.0034) [2024-06-24 00:38:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 8993718272. Throughput: 0: 42900.8. Samples: 8993831780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 00:38:16,057][15401] Updated weights for policy 0, policy_version 548940 (0.0027) [2024-06-24 00:38:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 8993914880. Throughput: 0: 42746.8. Samples: 8994088280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 00:38:19,582][15401] Updated weights for policy 0, policy_version 548950 (0.0026) [2024-06-24 00:38:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 8994144256. Throughput: 0: 42768.4. Samples: 8994217760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:23,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 00:38:23,733][15401] Updated weights for policy 0, policy_version 548960 (0.0040) [2024-06-24 00:38:27,183][15401] Updated weights for policy 0, policy_version 548970 (0.0031) [2024-06-24 00:38:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 8994357248. Throughput: 0: 42988.9. Samples: 8994470140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 00:38:31,439][15401] Updated weights for policy 0, policy_version 548980 (0.0046) [2024-06-24 00:38:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 8994553856. Throughput: 0: 43067.2. Samples: 8994737360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 00:38:34,694][15401] Updated weights for policy 0, policy_version 548990 (0.0038) [2024-06-24 00:38:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8994783232. Throughput: 0: 42774.8. Samples: 8994859000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 00:38:39,059][15401] Updated weights for policy 0, policy_version 549000 (0.0037) [2024-06-24 00:38:39,920][15349] Signal inference workers to stop experience collection... (133250 times) [2024-06-24 00:38:39,962][15401] InferenceWorker_p0-w0: stopping experience collection (133250 times) [2024-06-24 00:38:40,028][15349] Signal inference workers to resume experience collection... (133250 times) [2024-06-24 00:38:40,029][15401] InferenceWorker_p0-w0: resuming experience collection (133250 times) [2024-06-24 00:38:42,255][15401] Updated weights for policy 0, policy_version 549010 (0.0029) [2024-06-24 00:38:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 8994996224. Throughput: 0: 43047.5. Samples: 8995120220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 00:38:46,635][15401] Updated weights for policy 0, policy_version 549020 (0.0034) [2024-06-24 00:38:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8995209216. Throughput: 0: 42847.2. Samples: 8995372940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 00:38:50,444][15401] Updated weights for policy 0, policy_version 549030 (0.0030) [2024-06-24 00:38:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 8995422208. Throughput: 0: 42705.7. Samples: 8995500460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 00:38:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 00:38:54,167][15401] Updated weights for policy 0, policy_version 549040 (0.0035) [2024-06-24 00:38:57,964][15401] Updated weights for policy 0, policy_version 549050 (0.0045) [2024-06-24 00:38:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 8995651584. Throughput: 0: 42767.6. Samples: 8995756320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:38:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 00:39:01,753][15401] Updated weights for policy 0, policy_version 549060 (0.0043) [2024-06-24 00:39:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 8995864576. Throughput: 0: 42814.5. Samples: 8996014940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 00:39:05,579][15401] Updated weights for policy 0, policy_version 549070 (0.0030) [2024-06-24 00:39:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8996044800. Throughput: 0: 42884.9. Samples: 8996147580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 00:39:09,625][15401] Updated weights for policy 0, policy_version 549080 (0.0035) [2024-06-24 00:39:13,228][15401] Updated weights for policy 0, policy_version 549090 (0.0039) [2024-06-24 00:39:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 8996290560. Throughput: 0: 42939.9. Samples: 8996402440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 00:39:17,486][15401] Updated weights for policy 0, policy_version 549100 (0.0033) [2024-06-24 00:39:18,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8996519936. Throughput: 0: 42632.5. Samples: 8996655820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 00:39:20,886][15401] Updated weights for policy 0, policy_version 549110 (0.0047) [2024-06-24 00:39:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8996700160. Throughput: 0: 42821.3. Samples: 8996785960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 00:39:25,041][15401] Updated weights for policy 0, policy_version 549120 (0.0040) [2024-06-24 00:39:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 8996929536. Throughput: 0: 42767.5. Samples: 8997044760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 00:39:28,795][15401] Updated weights for policy 0, policy_version 549130 (0.0025) [2024-06-24 00:39:32,744][15401] Updated weights for policy 0, policy_version 549140 (0.0031) [2024-06-24 00:39:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 8997158912. Throughput: 0: 42842.2. Samples: 8997300840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 00:39:36,429][15401] Updated weights for policy 0, policy_version 549150 (0.0035) [2024-06-24 00:39:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8997339136. Throughput: 0: 42952.9. Samples: 8997433340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 00:39:40,203][15401] Updated weights for policy 0, policy_version 549160 (0.0029) [2024-06-24 00:39:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8997552128. Throughput: 0: 42731.2. Samples: 8997679220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 00:39:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549167_8997552128.pth... [2024-06-24 00:39:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548541_8987295744.pth [2024-06-24 00:39:44,154][15401] Updated weights for policy 0, policy_version 549170 (0.0041) [2024-06-24 00:39:47,854][15401] Updated weights for policy 0, policy_version 549180 (0.0036) [2024-06-24 00:39:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8997781504. Throughput: 0: 42790.2. Samples: 8997940500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 00:39:52,066][15401] Updated weights for policy 0, policy_version 549190 (0.0033) [2024-06-24 00:39:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 8997978112. Throughput: 0: 42771.5. Samples: 8998072400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:53,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 00:39:55,725][15401] Updated weights for policy 0, policy_version 549200 (0.0036) [2024-06-24 00:39:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8998207488. Throughput: 0: 42590.3. Samples: 8998319000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:39:58,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 00:39:59,937][15401] Updated weights for policy 0, policy_version 549210 (0.0033) [2024-06-24 00:40:03,202][15349] Signal inference workers to stop experience collection... (133300 times) [2024-06-24 00:40:03,202][15349] Signal inference workers to resume experience collection... (133300 times) [2024-06-24 00:40:03,213][15401] InferenceWorker_p0-w0: stopping experience collection (133300 times) [2024-06-24 00:40:03,214][15401] InferenceWorker_p0-w0: resuming experience collection (133300 times) [2024-06-24 00:40:03,349][15401] Updated weights for policy 0, policy_version 549220 (0.0029) [2024-06-24 00:40:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 8998420480. Throughput: 0: 42884.8. Samples: 8998585640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:40:03,392][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 00:40:07,509][15401] Updated weights for policy 0, policy_version 549230 (0.0033) [2024-06-24 00:40:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 8998617088. Throughput: 0: 42672.6. Samples: 8998706220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:40:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 00:40:11,054][15401] Updated weights for policy 0, policy_version 549240 (0.0031) [2024-06-24 00:40:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 8998862848. Throughput: 0: 42471.6. Samples: 8998955980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:40:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 00:40:15,063][15401] Updated weights for policy 0, policy_version 549250 (0.0029) [2024-06-24 00:40:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 8999043072. Throughput: 0: 42582.7. Samples: 8999217060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:40:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 00:40:18,851][15401] Updated weights for policy 0, policy_version 549260 (0.0030) [2024-06-24 00:40:22,782][15401] Updated weights for policy 0, policy_version 549270 (0.0029) [2024-06-24 00:40:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 8999256064. Throughput: 0: 42438.6. Samples: 8999343080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 00:40:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 00:40:26,409][15401] Updated weights for policy 0, policy_version 549280 (0.0027) [2024-06-24 00:40:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 8999501824. Throughput: 0: 42663.1. Samples: 8999599060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 00:40:30,406][15401] Updated weights for policy 0, policy_version 549290 (0.0037) [2024-06-24 00:40:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 8999698432. Throughput: 0: 42778.8. Samples: 8999865540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 00:40:33,928][15401] Updated weights for policy 0, policy_version 549300 (0.0037) [2024-06-24 00:40:37,958][15401] Updated weights for policy 0, policy_version 549310 (0.0027) [2024-06-24 00:40:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 8999895040. Throughput: 0: 42625.1. Samples: 8999990420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 00:40:41,395][15401] Updated weights for policy 0, policy_version 549320 (0.0027) [2024-06-24 00:40:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 9000157184. Throughput: 0: 42857.2. Samples: 9000247680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:43,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 00:40:45,675][15401] Updated weights for policy 0, policy_version 549330 (0.0030) [2024-06-24 00:40:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9000321024. Throughput: 0: 42794.8. Samples: 9000511400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:48,392][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 00:40:49,020][15401] Updated weights for policy 0, policy_version 549340 (0.0033) [2024-06-24 00:40:53,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 9000534016. Throughput: 0: 42840.4. Samples: 9000634040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 00:40:53,431][15401] Updated weights for policy 0, policy_version 549350 (0.0035) [2024-06-24 00:40:57,000][15401] Updated weights for policy 0, policy_version 549360 (0.0033) [2024-06-24 00:40:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9000796160. Throughput: 0: 43087.7. Samples: 9000894920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:40:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 00:41:00,957][15401] Updated weights for policy 0, policy_version 549370 (0.0054) [2024-06-24 00:41:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9000992768. Throughput: 0: 43099.1. Samples: 9001156520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 00:41:04,557][15401] Updated weights for policy 0, policy_version 549380 (0.0036) [2024-06-24 00:41:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9001189376. Throughput: 0: 42959.7. Samples: 9001276260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 00:41:08,556][15401] Updated weights for policy 0, policy_version 549390 (0.0036) [2024-06-24 00:41:12,134][15401] Updated weights for policy 0, policy_version 549400 (0.0026) [2024-06-24 00:41:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9001451520. Throughput: 0: 43144.9. Samples: 9001540580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 00:41:16,115][15401] Updated weights for policy 0, policy_version 549410 (0.0027) [2024-06-24 00:41:17,454][15349] Signal inference workers to stop experience collection... (133350 times) [2024-06-24 00:41:17,455][15349] Signal inference workers to resume experience collection... (133350 times) [2024-06-24 00:41:17,502][15401] InferenceWorker_p0-w0: stopping experience collection (133350 times) [2024-06-24 00:41:17,502][15401] InferenceWorker_p0-w0: resuming experience collection (133350 times) [2024-06-24 00:41:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9001631744. Throughput: 0: 42935.5. Samples: 9001797640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 00:41:19,626][15401] Updated weights for policy 0, policy_version 549420 (0.0033) [2024-06-24 00:41:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9001828352. Throughput: 0: 42956.4. Samples: 9001923460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:41:24,278][15401] Updated weights for policy 0, policy_version 549430 (0.0028) [2024-06-24 00:41:27,117][15401] Updated weights for policy 0, policy_version 549440 (0.0045) [2024-06-24 00:41:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9002090496. Throughput: 0: 43095.3. Samples: 9002186860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 00:41:31,764][15401] Updated weights for policy 0, policy_version 549450 (0.0037) [2024-06-24 00:41:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9002287104. Throughput: 0: 42905.3. Samples: 9002442140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 00:41:34,918][15401] Updated weights for policy 0, policy_version 549460 (0.0029) [2024-06-24 00:41:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9002483712. Throughput: 0: 42933.2. Samples: 9002566040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 00:41:39,316][15401] Updated weights for policy 0, policy_version 549470 (0.0026) [2024-06-24 00:41:42,632][15401] Updated weights for policy 0, policy_version 549480 (0.0030) [2024-06-24 00:41:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9002729472. Throughput: 0: 43042.7. Samples: 9002831840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 00:41:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549484_9002745856.pth... [2024-06-24 00:41:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000548853_8992407552.pth [2024-06-24 00:41:46,859][15401] Updated weights for policy 0, policy_version 549490 (0.0032) [2024-06-24 00:41:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9002926080. Throughput: 0: 43000.0. Samples: 9003091520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 00:41:50,357][15401] Updated weights for policy 0, policy_version 549500 (0.0027) [2024-06-24 00:41:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9003139072. Throughput: 0: 42890.6. Samples: 9003206340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 00:41:53,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 00:41:54,506][15401] Updated weights for policy 0, policy_version 549510 (0.0039) [2024-06-24 00:41:57,820][15401] Updated weights for policy 0, policy_version 549520 (0.0028) [2024-06-24 00:41:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9003368448. Throughput: 0: 42896.0. Samples: 9003470900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:41:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 00:42:02,173][15401] Updated weights for policy 0, policy_version 549530 (0.0028) [2024-06-24 00:42:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9003532288. Throughput: 0: 42926.2. Samples: 9003729320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 00:42:05,723][15401] Updated weights for policy 0, policy_version 549540 (0.0031) [2024-06-24 00:42:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9003794432. Throughput: 0: 42717.7. Samples: 9003845760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:08,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 00:42:09,637][15401] Updated weights for policy 0, policy_version 549550 (0.0044) [2024-06-24 00:42:13,350][15401] Updated weights for policy 0, policy_version 549560 (0.0034) [2024-06-24 00:42:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9003991040. Throughput: 0: 42686.6. Samples: 9004107760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:13,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 00:42:17,389][15401] Updated weights for policy 0, policy_version 549570 (0.0029) [2024-06-24 00:42:18,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9004171264. Throughput: 0: 42743.9. Samples: 9004365620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 00:42:20,997][15401] Updated weights for policy 0, policy_version 549580 (0.0036) [2024-06-24 00:42:23,030][15349] Signal inference workers to stop experience collection... (133400 times) [2024-06-24 00:42:23,031][15349] Signal inference workers to resume experience collection... (133400 times) [2024-06-24 00:42:23,071][15401] InferenceWorker_p0-w0: stopping experience collection (133400 times) [2024-06-24 00:42:23,071][15401] InferenceWorker_p0-w0: resuming experience collection (133400 times) [2024-06-24 00:42:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 9004449792. Throughput: 0: 42704.5. Samples: 9004487740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 00:42:25,229][15401] Updated weights for policy 0, policy_version 549590 (0.0036) [2024-06-24 00:42:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9004630016. Throughput: 0: 42719.0. Samples: 9004754200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 00:42:28,668][15401] Updated weights for policy 0, policy_version 549600 (0.0033) [2024-06-24 00:42:32,728][15401] Updated weights for policy 0, policy_version 549610 (0.0032) [2024-06-24 00:42:33,396][15132] Fps is (10 sec: 36022.1, 60 sec: 42047.8, 300 sec: 42708.6). Total num frames: 9004810240. Throughput: 0: 42532.6. Samples: 9005005760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:33,396][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 00:42:36,020][15401] Updated weights for policy 0, policy_version 549620 (0.0045) [2024-06-24 00:42:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9005056000. Throughput: 0: 42570.2. Samples: 9005122000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 00:42:40,168][15401] Updated weights for policy 0, policy_version 549630 (0.0026) [2024-06-24 00:42:43,390][15132] Fps is (10 sec: 45903.9, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 9005268992. Throughput: 0: 42740.3. Samples: 9005394220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 00:42:44,097][15401] Updated weights for policy 0, policy_version 549640 (0.0033) [2024-06-24 00:42:47,654][15401] Updated weights for policy 0, policy_version 549650 (0.0033) [2024-06-24 00:42:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 9005465600. Throughput: 0: 42427.5. Samples: 9005638560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 00:42:51,762][15401] Updated weights for policy 0, policy_version 549660 (0.0034) [2024-06-24 00:42:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9005711360. Throughput: 0: 42759.2. Samples: 9005769920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 00:42:55,237][15401] Updated weights for policy 0, policy_version 549670 (0.0040) [2024-06-24 00:42:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 9005891584. Throughput: 0: 42797.2. Samples: 9006033640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:42:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 00:42:59,366][15401] Updated weights for policy 0, policy_version 549680 (0.0036) [2024-06-24 00:43:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9006104576. Throughput: 0: 42478.3. Samples: 9006277140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:43:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 00:43:03,417][15401] Updated weights for policy 0, policy_version 549690 (0.0039) [2024-06-24 00:43:06,990][15401] Updated weights for policy 0, policy_version 549700 (0.0028) [2024-06-24 00:43:08,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 9006350336. Throughput: 0: 42738.7. Samples: 9006411080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:43:08,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 00:43:10,859][15401] Updated weights for policy 0, policy_version 549710 (0.0042) [2024-06-24 00:43:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9006530560. Throughput: 0: 42690.8. Samples: 9006675280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:43:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 00:43:14,565][15401] Updated weights for policy 0, policy_version 549720 (0.0038) [2024-06-24 00:43:18,355][15401] Updated weights for policy 0, policy_version 549730 (0.0030) [2024-06-24 00:43:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9006776320. Throughput: 0: 42648.7. Samples: 9006924680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:43:18,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 00:43:22,149][15401] Updated weights for policy 0, policy_version 549740 (0.0037) [2024-06-24 00:43:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 9006989312. Throughput: 0: 43051.1. Samples: 9007059300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 27.0) [2024-06-24 00:43:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 00:43:25,809][15401] Updated weights for policy 0, policy_version 549750 (0.0042) [2024-06-24 00:43:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9007185920. Throughput: 0: 42834.0. Samples: 9007321740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 00:43:29,632][15401] Updated weights for policy 0, policy_version 549760 (0.0032) [2024-06-24 00:43:33,238][15401] Updated weights for policy 0, policy_version 549770 (0.0038) [2024-06-24 00:43:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43695.4, 300 sec: 42876.1). Total num frames: 9007431680. Throughput: 0: 43042.7. Samples: 9007575480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 00:43:35,202][15349] Signal inference workers to stop experience collection... (133450 times) [2024-06-24 00:43:35,227][15401] InferenceWorker_p0-w0: stopping experience collection (133450 times) [2024-06-24 00:43:35,264][15349] Signal inference workers to resume experience collection... (133450 times) [2024-06-24 00:43:35,265][15401] InferenceWorker_p0-w0: resuming experience collection (133450 times) [2024-06-24 00:43:37,150][15401] Updated weights for policy 0, policy_version 549780 (0.0027) [2024-06-24 00:43:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9007611904. Throughput: 0: 43105.7. Samples: 9007709680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:43:41,123][15401] Updated weights for policy 0, policy_version 549790 (0.0030) [2024-06-24 00:43:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 9007824896. Throughput: 0: 42889.5. Samples: 9007963660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:43,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 00:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549795_9007841280.pth... [2024-06-24 00:43:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549167_8997552128.pth [2024-06-24 00:43:44,757][15401] Updated weights for policy 0, policy_version 549800 (0.0039) [2024-06-24 00:43:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9008070656. Throughput: 0: 43058.1. Samples: 9008214760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 00:43:48,640][15401] Updated weights for policy 0, policy_version 549810 (0.0031) [2024-06-24 00:43:52,573][15401] Updated weights for policy 0, policy_version 549820 (0.0034) [2024-06-24 00:43:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9008267264. Throughput: 0: 43064.4. Samples: 9008348880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 00:43:56,284][15401] Updated weights for policy 0, policy_version 549830 (0.0029) [2024-06-24 00:43:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 9008463872. Throughput: 0: 42837.1. Samples: 9008603060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:43:58,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 00:44:00,271][15401] Updated weights for policy 0, policy_version 549840 (0.0023) [2024-06-24 00:44:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9008693248. Throughput: 0: 43114.3. Samples: 9008864820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 00:44:04,155][15401] Updated weights for policy 0, policy_version 549850 (0.0026) [2024-06-24 00:44:07,942][15401] Updated weights for policy 0, policy_version 549860 (0.0037) [2024-06-24 00:44:08,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9008906240. Throughput: 0: 42973.4. Samples: 9008993100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 00:44:11,603][15401] Updated weights for policy 0, policy_version 549870 (0.0025) [2024-06-24 00:44:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9009102848. Throughput: 0: 42893.7. Samples: 9009251960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:44:15,665][15401] Updated weights for policy 0, policy_version 549880 (0.0043) [2024-06-24 00:44:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9009332224. Throughput: 0: 42847.0. Samples: 9009503600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 00:44:19,058][15401] Updated weights for policy 0, policy_version 549890 (0.0034) [2024-06-24 00:44:23,293][15401] Updated weights for policy 0, policy_version 549900 (0.0032) [2024-06-24 00:44:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9009561600. Throughput: 0: 42861.2. Samples: 9009638440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 00:44:27,308][15401] Updated weights for policy 0, policy_version 549910 (0.0033) [2024-06-24 00:44:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9009741824. Throughput: 0: 42773.2. Samples: 9009888460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 00:44:30,811][15401] Updated weights for policy 0, policy_version 549920 (0.0046) [2024-06-24 00:44:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9009987584. Throughput: 0: 42902.4. Samples: 9010145360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 00:44:34,978][15401] Updated weights for policy 0, policy_version 549930 (0.0032) [2024-06-24 00:44:38,334][15401] Updated weights for policy 0, policy_version 549940 (0.0028) [2024-06-24 00:44:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9010216960. Throughput: 0: 42886.6. Samples: 9010278780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:38,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 00:44:42,581][15401] Updated weights for policy 0, policy_version 549950 (0.0033) [2024-06-24 00:44:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9010397184. Throughput: 0: 42856.2. Samples: 9010531480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 00:44:45,951][15401] Updated weights for policy 0, policy_version 549960 (0.0031) [2024-06-24 00:44:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 9010626560. Throughput: 0: 42755.9. Samples: 9010788840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 00:44:50,101][15401] Updated weights for policy 0, policy_version 549970 (0.0039) [2024-06-24 00:44:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9010839552. Throughput: 0: 42888.0. Samples: 9010923060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 00:44:53,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 00:44:53,528][15349] Signal inference workers to stop experience collection... (133500 times) [2024-06-24 00:44:53,529][15349] Signal inference workers to resume experience collection... (133500 times) [2024-06-24 00:44:53,552][15401] InferenceWorker_p0-w0: stopping experience collection (133500 times) [2024-06-24 00:44:53,552][15401] InferenceWorker_p0-w0: resuming experience collection (133500 times) [2024-06-24 00:44:53,689][15401] Updated weights for policy 0, policy_version 549980 (0.0037) [2024-06-24 00:44:57,678][15401] Updated weights for policy 0, policy_version 549990 (0.0046) [2024-06-24 00:44:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 9011052544. Throughput: 0: 42731.9. Samples: 9011174900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:44:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 00:45:01,442][15401] Updated weights for policy 0, policy_version 550000 (0.0026) [2024-06-24 00:45:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9011265536. Throughput: 0: 42781.8. Samples: 9011428780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:03,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-24 00:45:05,657][15401] Updated weights for policy 0, policy_version 550010 (0.0056) [2024-06-24 00:45:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9011462144. Throughput: 0: 42615.7. Samples: 9011556140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 00:45:09,438][15401] Updated weights for policy 0, policy_version 550020 (0.0030) [2024-06-24 00:45:13,317][15401] Updated weights for policy 0, policy_version 550030 (0.0056) [2024-06-24 00:45:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9011691520. Throughput: 0: 42556.8. Samples: 9011803520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:13,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 00:45:16,981][15401] Updated weights for policy 0, policy_version 550040 (0.0041) [2024-06-24 00:45:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9011888128. Throughput: 0: 42681.4. Samples: 9012066020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:18,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 00:45:20,827][15401] Updated weights for policy 0, policy_version 550050 (0.0027) [2024-06-24 00:45:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9012117504. Throughput: 0: 42538.7. Samples: 9012193020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 00:45:24,573][15401] Updated weights for policy 0, policy_version 550060 (0.0028) [2024-06-24 00:45:28,315][15401] Updated weights for policy 0, policy_version 550070 (0.0035) [2024-06-24 00:45:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9012346880. Throughput: 0: 42671.0. Samples: 9012451680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 00:45:32,522][15401] Updated weights for policy 0, policy_version 550080 (0.0026) [2024-06-24 00:45:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9012559872. Throughput: 0: 42641.8. Samples: 9012707720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 00:45:36,266][15401] Updated weights for policy 0, policy_version 550090 (0.0029) [2024-06-24 00:45:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9012756480. Throughput: 0: 42617.2. Samples: 9012840840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:38,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 00:45:40,413][15401] Updated weights for policy 0, policy_version 550100 (0.0028) [2024-06-24 00:45:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9012969472. Throughput: 0: 42571.9. Samples: 9013090640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:43,390][15132] Avg episode reward: [(0, '0.147')] [2024-06-24 00:45:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550108_9012969472.pth... [2024-06-24 00:45:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549484_9002745856.pth [2024-06-24 00:45:43,942][15401] Updated weights for policy 0, policy_version 550110 (0.0035) [2024-06-24 00:45:48,074][15401] Updated weights for policy 0, policy_version 550120 (0.0037) [2024-06-24 00:45:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9013166080. Throughput: 0: 42622.3. Samples: 9013346780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 00:45:51,421][15401] Updated weights for policy 0, policy_version 550130 (0.0033) [2024-06-24 00:45:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9013395456. Throughput: 0: 42644.1. Samples: 9013475120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 00:45:55,592][15401] Updated weights for policy 0, policy_version 550140 (0.0037) [2024-06-24 00:45:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9013592064. Throughput: 0: 42955.2. Samples: 9013736500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:45:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 00:45:58,999][15401] Updated weights for policy 0, policy_version 550150 (0.0031) [2024-06-24 00:46:03,132][15401] Updated weights for policy 0, policy_version 550160 (0.0038) [2024-06-24 00:46:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9013837824. Throughput: 0: 42840.8. Samples: 9013993860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:46:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 00:46:06,729][15401] Updated weights for policy 0, policy_version 550170 (0.0023) [2024-06-24 00:46:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9014050816. Throughput: 0: 42832.0. Samples: 9014120460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:46:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 00:46:10,642][15401] Updated weights for policy 0, policy_version 550180 (0.0036) [2024-06-24 00:46:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9014263808. Throughput: 0: 42876.4. Samples: 9014381120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:46:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 00:46:14,158][15401] Updated weights for policy 0, policy_version 550190 (0.0025) [2024-06-24 00:46:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9014460416. Throughput: 0: 42907.5. Samples: 9014638560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:46:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 00:46:18,454][15401] Updated weights for policy 0, policy_version 550200 (0.0037) [2024-06-24 00:46:21,803][15401] Updated weights for policy 0, policy_version 550210 (0.0023) [2024-06-24 00:46:23,397][15132] Fps is (10 sec: 42566.6, 60 sec: 42866.1, 300 sec: 42708.4). Total num frames: 9014689792. Throughput: 0: 42740.5. Samples: 9014764480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 00:46:23,397][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 00:46:25,979][15401] Updated weights for policy 0, policy_version 550220 (0.0041) [2024-06-24 00:46:28,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9014919168. Throughput: 0: 43047.8. Samples: 9015027780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 00:46:29,710][15401] Updated weights for policy 0, policy_version 550230 (0.0040) [2024-06-24 00:46:33,390][15132] Fps is (10 sec: 42630.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9015115776. Throughput: 0: 43071.5. Samples: 9015285000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 00:46:33,587][15401] Updated weights for policy 0, policy_version 550240 (0.0031) [2024-06-24 00:46:37,179][15401] Updated weights for policy 0, policy_version 550250 (0.0028) [2024-06-24 00:46:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9015312384. Throughput: 0: 43048.4. Samples: 9015412300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 00:46:41,202][15401] Updated weights for policy 0, policy_version 550260 (0.0025) [2024-06-24 00:46:42,546][15349] Signal inference workers to stop experience collection... (133550 times) [2024-06-24 00:46:42,587][15401] InferenceWorker_p0-w0: stopping experience collection (133550 times) [2024-06-24 00:46:42,597][15349] Signal inference workers to resume experience collection... (133550 times) [2024-06-24 00:46:42,604][15401] InferenceWorker_p0-w0: resuming experience collection (133550 times) [2024-06-24 00:46:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9015558144. Throughput: 0: 43115.0. Samples: 9015676680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:43,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-24 00:46:45,187][15401] Updated weights for policy 0, policy_version 550270 (0.0035) [2024-06-24 00:46:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9015771136. Throughput: 0: 42904.6. Samples: 9015924560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 00:46:48,733][15401] Updated weights for policy 0, policy_version 550280 (0.0026) [2024-06-24 00:46:52,583][15401] Updated weights for policy 0, policy_version 550290 (0.0023) [2024-06-24 00:46:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9015951360. Throughput: 0: 43031.1. Samples: 9016056860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 00:46:56,194][15401] Updated weights for policy 0, policy_version 550300 (0.0026) [2024-06-24 00:46:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9016197120. Throughput: 0: 43031.4. Samples: 9016317540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:46:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 00:47:00,002][15401] Updated weights for policy 0, policy_version 550310 (0.0038) [2024-06-24 00:47:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9016410112. Throughput: 0: 43026.6. Samples: 9016574760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 00:47:03,866][15401] Updated weights for policy 0, policy_version 550320 (0.0031) [2024-06-24 00:47:07,628][15401] Updated weights for policy 0, policy_version 550330 (0.0036) [2024-06-24 00:47:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9016623104. Throughput: 0: 43180.4. Samples: 9016707280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 00:47:11,391][15401] Updated weights for policy 0, policy_version 550340 (0.0022) [2024-06-24 00:47:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9016852480. Throughput: 0: 43058.1. Samples: 9016965400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 00:47:15,362][15401] Updated weights for policy 0, policy_version 550350 (0.0023) [2024-06-24 00:47:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9017065472. Throughput: 0: 42971.6. Samples: 9017218720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 00:47:19,175][15401] Updated weights for policy 0, policy_version 550360 (0.0040) [2024-06-24 00:47:23,262][15401] Updated weights for policy 0, policy_version 550370 (0.0030) [2024-06-24 00:47:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42876.7, 300 sec: 42820.5). Total num frames: 9017262080. Throughput: 0: 43061.6. Samples: 9017350080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 00:47:26,820][15401] Updated weights for policy 0, policy_version 550380 (0.0038) [2024-06-24 00:47:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42932.6). Total num frames: 9017475072. Throughput: 0: 42894.2. Samples: 9017606920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:28,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 00:47:30,777][15401] Updated weights for policy 0, policy_version 550390 (0.0041) [2024-06-24 00:47:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9017704448. Throughput: 0: 43136.0. Samples: 9017865680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 00:47:34,385][15401] Updated weights for policy 0, policy_version 550400 (0.0027) [2024-06-24 00:47:38,251][15401] Updated weights for policy 0, policy_version 550410 (0.0039) [2024-06-24 00:47:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9017917440. Throughput: 0: 43043.1. Samples: 9017993800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 00:47:41,907][15401] Updated weights for policy 0, policy_version 550420 (0.0035) [2024-06-24 00:47:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9018130432. Throughput: 0: 42902.3. Samples: 9018248140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 00:47:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550423_9018130432.pth... [2024-06-24 00:47:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000549795_9007841280.pth [2024-06-24 00:47:45,758][15401] Updated weights for policy 0, policy_version 550430 (0.0041) [2024-06-24 00:47:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9018343424. Throughput: 0: 42870.4. Samples: 9018503920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 00:47:49,590][15401] Updated weights for policy 0, policy_version 550440 (0.0033) [2024-06-24 00:47:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 9018556416. Throughput: 0: 42749.4. Samples: 9018631100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:53,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 00:47:53,516][15401] Updated weights for policy 0, policy_version 550450 (0.0022) [2024-06-24 00:47:57,276][15401] Updated weights for policy 0, policy_version 550460 (0.0039) [2024-06-24 00:47:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9018753024. Throughput: 0: 42716.5. Samples: 9018887640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 00:47:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 00:48:01,461][15401] Updated weights for policy 0, policy_version 550470 (0.0039) [2024-06-24 00:48:03,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9018982400. Throughput: 0: 42719.5. Samples: 9019141100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 00:48:04,989][15401] Updated weights for policy 0, policy_version 550480 (0.0028) [2024-06-24 00:48:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9019179008. Throughput: 0: 42765.0. Samples: 9019274500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 00:48:08,981][15401] Updated weights for policy 0, policy_version 550490 (0.0034) [2024-06-24 00:48:12,620][15401] Updated weights for policy 0, policy_version 550500 (0.0041) [2024-06-24 00:48:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9019408384. Throughput: 0: 42731.2. Samples: 9019529820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 00:48:15,732][15349] Signal inference workers to stop experience collection... (133600 times) [2024-06-24 00:48:15,732][15349] Signal inference workers to resume experience collection... (133600 times) [2024-06-24 00:48:15,774][15401] InferenceWorker_p0-w0: stopping experience collection (133600 times) [2024-06-24 00:48:15,774][15401] InferenceWorker_p0-w0: resuming experience collection (133600 times) [2024-06-24 00:48:16,733][15401] Updated weights for policy 0, policy_version 550510 (0.0048) [2024-06-24 00:48:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9019637760. Throughput: 0: 42578.6. Samples: 9019781720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 00:48:20,373][15401] Updated weights for policy 0, policy_version 550520 (0.0027) [2024-06-24 00:48:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9019801600. Throughput: 0: 42633.3. Samples: 9019912300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 00:48:24,370][15401] Updated weights for policy 0, policy_version 550530 (0.0026) [2024-06-24 00:48:28,248][15401] Updated weights for policy 0, policy_version 550540 (0.0037) [2024-06-24 00:48:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9020047360. Throughput: 0: 42515.1. Samples: 9020161320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 00:48:32,099][15401] Updated weights for policy 0, policy_version 550550 (0.0038) [2024-06-24 00:48:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9020276736. Throughput: 0: 42463.0. Samples: 9020414760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:33,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-24 00:48:36,352][15401] Updated weights for policy 0, policy_version 550560 (0.0023) [2024-06-24 00:48:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9020456960. Throughput: 0: 42558.6. Samples: 9020546140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 00:48:39,947][15401] Updated weights for policy 0, policy_version 550570 (0.0033) [2024-06-24 00:48:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9020669952. Throughput: 0: 42437.2. Samples: 9020797320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 00:48:43,901][15401] Updated weights for policy 0, policy_version 550580 (0.0024) [2024-06-24 00:48:47,637][15401] Updated weights for policy 0, policy_version 550590 (0.0035) [2024-06-24 00:48:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9020915712. Throughput: 0: 42512.2. Samples: 9021054140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 00:48:51,399][15401] Updated weights for policy 0, policy_version 550600 (0.0027) [2024-06-24 00:48:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42054.0, 300 sec: 42765.4). Total num frames: 9021079552. Throughput: 0: 42568.9. Samples: 9021190100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:53,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 00:48:55,194][15401] Updated weights for policy 0, policy_version 550610 (0.0035) [2024-06-24 00:48:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9021292544. Throughput: 0: 42319.1. Samples: 9021434180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:48:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 00:48:58,948][15401] Updated weights for policy 0, policy_version 550620 (0.0030) [2024-06-24 00:49:02,859][15401] Updated weights for policy 0, policy_version 550630 (0.0026) [2024-06-24 00:49:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9021554688. Throughput: 0: 42492.8. Samples: 9021693900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 00:49:06,578][15401] Updated weights for policy 0, policy_version 550640 (0.0031) [2024-06-24 00:49:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9021734912. Throughput: 0: 42568.9. Samples: 9021827900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:08,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 00:49:10,505][15401] Updated weights for policy 0, policy_version 550650 (0.0044) [2024-06-24 00:49:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9021947904. Throughput: 0: 42538.8. Samples: 9022075560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 00:49:14,271][15401] Updated weights for policy 0, policy_version 550660 (0.0039) [2024-06-24 00:49:18,101][15401] Updated weights for policy 0, policy_version 550670 (0.0038) [2024-06-24 00:49:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9022177280. Throughput: 0: 42691.6. Samples: 9022335880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 00:49:21,987][15401] Updated weights for policy 0, policy_version 550680 (0.0024) [2024-06-24 00:49:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9022373888. Throughput: 0: 42599.6. Samples: 9022463120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 00:49:25,850][15401] Updated weights for policy 0, policy_version 550690 (0.0034) [2024-06-24 00:49:26,148][15349] Signal inference workers to stop experience collection... (133650 times) [2024-06-24 00:49:26,192][15401] InferenceWorker_p0-w0: stopping experience collection (133650 times) [2024-06-24 00:49:26,201][15349] Signal inference workers to resume experience collection... (133650 times) [2024-06-24 00:49:26,207][15401] InferenceWorker_p0-w0: resuming experience collection (133650 times) [2024-06-24 00:49:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9022603264. Throughput: 0: 42615.2. Samples: 9022715000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 00:49:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 00:49:29,519][15401] Updated weights for policy 0, policy_version 550700 (0.0039) [2024-06-24 00:49:33,401][15132] Fps is (10 sec: 44185.2, 60 sec: 42317.1, 300 sec: 42707.8). Total num frames: 9022816256. Throughput: 0: 42761.3. Samples: 9022978900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:33,401][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 00:49:33,543][15401] Updated weights for policy 0, policy_version 550710 (0.0024) [2024-06-24 00:49:37,235][15401] Updated weights for policy 0, policy_version 550720 (0.0035) [2024-06-24 00:49:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9023012864. Throughput: 0: 42540.7. Samples: 9023104440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:38,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 00:49:41,194][15401] Updated weights for policy 0, policy_version 550730 (0.0038) [2024-06-24 00:49:43,390][15132] Fps is (10 sec: 44288.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9023258624. Throughput: 0: 42669.2. Samples: 9023354300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 00:49:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550736_9023258624.pth... [2024-06-24 00:49:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550108_9012969472.pth [2024-06-24 00:49:44,970][15401] Updated weights for policy 0, policy_version 550740 (0.0039) [2024-06-24 00:49:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9023455232. Throughput: 0: 42749.4. Samples: 9023617620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:48,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 00:49:48,990][15401] Updated weights for policy 0, policy_version 550750 (0.0036) [2024-06-24 00:49:52,456][15401] Updated weights for policy 0, policy_version 550760 (0.0039) [2024-06-24 00:49:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9023651840. Throughput: 0: 42606.7. Samples: 9023745200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:53,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 00:49:56,743][15401] Updated weights for policy 0, policy_version 550770 (0.0038) [2024-06-24 00:49:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9023897600. Throughput: 0: 42810.1. Samples: 9024002020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:49:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 00:49:59,959][15401] Updated weights for policy 0, policy_version 550780 (0.0036) [2024-06-24 00:50:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 9024077824. Throughput: 0: 42837.3. Samples: 9024263560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 00:50:04,235][15401] Updated weights for policy 0, policy_version 550790 (0.0041) [2024-06-24 00:50:07,543][15401] Updated weights for policy 0, policy_version 550800 (0.0040) [2024-06-24 00:50:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9024307200. Throughput: 0: 42727.9. Samples: 9024385880. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 00:50:12,046][15401] Updated weights for policy 0, policy_version 550810 (0.0043) [2024-06-24 00:50:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9024536576. Throughput: 0: 42813.0. Samples: 9024641580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:13,396][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 00:50:15,762][15401] Updated weights for policy 0, policy_version 550820 (0.0040) [2024-06-24 00:50:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9024716800. Throughput: 0: 42592.9. Samples: 9024895080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 00:50:19,748][15401] Updated weights for policy 0, policy_version 550830 (0.0022) [2024-06-24 00:50:23,291][15401] Updated weights for policy 0, policy_version 550840 (0.0034) [2024-06-24 00:50:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9024962560. Throughput: 0: 42515.6. Samples: 9025017640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 00:50:27,558][15401] Updated weights for policy 0, policy_version 550850 (0.0037) [2024-06-24 00:50:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9025175552. Throughput: 0: 42920.0. Samples: 9025285700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 00:50:31,041][15401] Updated weights for policy 0, policy_version 550860 (0.0037) [2024-06-24 00:50:32,524][15349] Signal inference workers to stop experience collection... (133700 times) [2024-06-24 00:50:32,574][15401] InferenceWorker_p0-w0: stopping experience collection (133700 times) [2024-06-24 00:50:32,582][15349] Signal inference workers to resume experience collection... (133700 times) [2024-06-24 00:50:32,589][15401] InferenceWorker_p0-w0: resuming experience collection (133700 times) [2024-06-24 00:50:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42606.7, 300 sec: 42765.0). Total num frames: 9025372160. Throughput: 0: 42688.5. Samples: 9025538600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:33,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 00:50:35,271][15401] Updated weights for policy 0, policy_version 550870 (0.0035) [2024-06-24 00:50:38,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 9025601536. Throughput: 0: 42688.0. Samples: 9025666260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:38,392][15132] Avg episode reward: [(0, '0.224')] [2024-06-24 00:50:38,709][15401] Updated weights for policy 0, policy_version 550880 (0.0038) [2024-06-24 00:50:42,834][15401] Updated weights for policy 0, policy_version 550890 (0.0041) [2024-06-24 00:50:43,392][15132] Fps is (10 sec: 44224.7, 60 sec: 42596.5, 300 sec: 42875.7). Total num frames: 9025814528. Throughput: 0: 42733.4. Samples: 9025925140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:43,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 00:50:46,437][15401] Updated weights for policy 0, policy_version 550900 (0.0034) [2024-06-24 00:50:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9026027520. Throughput: 0: 42601.7. Samples: 9026180640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 00:50:50,500][15401] Updated weights for policy 0, policy_version 550910 (0.0038) [2024-06-24 00:50:53,390][15132] Fps is (10 sec: 42610.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9026240512. Throughput: 0: 42700.1. Samples: 9026307380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 00:50:54,091][15401] Updated weights for policy 0, policy_version 550920 (0.0032) [2024-06-24 00:50:58,029][15401] Updated weights for policy 0, policy_version 550930 (0.0028) [2024-06-24 00:50:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9026453504. Throughput: 0: 42874.1. Samples: 9026570920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 00:50:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 00:51:01,820][15401] Updated weights for policy 0, policy_version 550940 (0.0041) [2024-06-24 00:51:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9026682880. Throughput: 0: 42737.7. Samples: 9026818280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 00:51:05,923][15401] Updated weights for policy 0, policy_version 550950 (0.0026) [2024-06-24 00:51:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9026863104. Throughput: 0: 42901.6. Samples: 9026948220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 00:51:09,455][15401] Updated weights for policy 0, policy_version 550960 (0.0034) [2024-06-24 00:51:13,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 9027059712. Throughput: 0: 42678.3. Samples: 9027206220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:51:13,641][15401] Updated weights for policy 0, policy_version 550970 (0.0041) [2024-06-24 00:51:17,269][15401] Updated weights for policy 0, policy_version 550980 (0.0034) [2024-06-24 00:51:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42766.1). Total num frames: 9027305472. Throughput: 0: 42555.2. Samples: 9027453580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 00:51:21,334][15401] Updated weights for policy 0, policy_version 550990 (0.0035) [2024-06-24 00:51:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9027518464. Throughput: 0: 42752.6. Samples: 9027590020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 00:51:25,162][15401] Updated weights for policy 0, policy_version 551000 (0.0042) [2024-06-24 00:51:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9027715072. Throughput: 0: 42626.2. Samples: 9027843200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 00:51:28,836][15401] Updated weights for policy 0, policy_version 551010 (0.0042) [2024-06-24 00:51:32,800][15401] Updated weights for policy 0, policy_version 551020 (0.0028) [2024-06-24 00:51:32,895][15349] Signal inference workers to stop experience collection... (133750 times) [2024-06-24 00:51:32,916][15401] InferenceWorker_p0-w0: stopping experience collection (133750 times) [2024-06-24 00:51:32,951][15349] Signal inference workers to resume experience collection... (133750 times) [2024-06-24 00:51:32,952][15401] InferenceWorker_p0-w0: resuming experience collection (133750 times) [2024-06-24 00:51:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9027960832. Throughput: 0: 42595.9. Samples: 9028097460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:33,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 00:51:36,404][15401] Updated weights for policy 0, policy_version 551030 (0.0032) [2024-06-24 00:51:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 9028157440. Throughput: 0: 42687.1. Samples: 9028228300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:38,394][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 00:51:40,439][15401] Updated weights for policy 0, policy_version 551040 (0.0037) [2024-06-24 00:51:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.3, 300 sec: 42709.5). Total num frames: 9028370432. Throughput: 0: 42539.2. Samples: 9028485180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 00:51:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551048_9028370432.pth... [2024-06-24 00:51:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550423_9018130432.pth [2024-06-24 00:51:44,210][15401] Updated weights for policy 0, policy_version 551050 (0.0031) [2024-06-24 00:51:47,872][15401] Updated weights for policy 0, policy_version 551060 (0.0038) [2024-06-24 00:51:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9028583424. Throughput: 0: 42751.5. Samples: 9028742100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 00:51:51,736][15401] Updated weights for policy 0, policy_version 551070 (0.0034) [2024-06-24 00:51:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9028780032. Throughput: 0: 42721.4. Samples: 9028870680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 00:51:55,445][15401] Updated weights for policy 0, policy_version 551080 (0.0029) [2024-06-24 00:51:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9029025792. Throughput: 0: 42773.3. Samples: 9029131020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:51:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 00:51:59,331][15401] Updated weights for policy 0, policy_version 551090 (0.0026) [2024-06-24 00:52:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9029206016. Throughput: 0: 43035.9. Samples: 9029390200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 00:52:03,401][15401] Updated weights for policy 0, policy_version 551100 (0.0034) [2024-06-24 00:52:06,886][15401] Updated weights for policy 0, policy_version 551110 (0.0034) [2024-06-24 00:52:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9029435392. Throughput: 0: 42768.7. Samples: 9029514620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:08,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 00:52:10,997][15401] Updated weights for policy 0, policy_version 551120 (0.0041) [2024-06-24 00:52:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9029648384. Throughput: 0: 42711.1. Samples: 9029765200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 00:52:14,637][15401] Updated weights for policy 0, policy_version 551130 (0.0048) [2024-06-24 00:52:18,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9029861376. Throughput: 0: 42907.8. Samples: 9030028300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 00:52:18,513][15401] Updated weights for policy 0, policy_version 551140 (0.0034) [2024-06-24 00:52:22,247][15401] Updated weights for policy 0, policy_version 551150 (0.0036) [2024-06-24 00:52:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9030074368. Throughput: 0: 42803.2. Samples: 9030154440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 00:52:26,283][15401] Updated weights for policy 0, policy_version 551160 (0.0023) [2024-06-24 00:52:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9030303744. Throughput: 0: 42865.3. Samples: 9030414120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 00:52:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 00:52:30,126][15401] Updated weights for policy 0, policy_version 551170 (0.0034) [2024-06-24 00:52:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 9030500352. Throughput: 0: 42860.0. Samples: 9030670900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:33,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 00:52:34,033][15401] Updated weights for policy 0, policy_version 551180 (0.0029) [2024-06-24 00:52:37,685][15401] Updated weights for policy 0, policy_version 551190 (0.0022) [2024-06-24 00:52:38,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 9030713344. Throughput: 0: 42734.9. Samples: 9030794020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:38,396][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 00:52:41,606][15401] Updated weights for policy 0, policy_version 551200 (0.0034) [2024-06-24 00:52:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9030942720. Throughput: 0: 42722.3. Samples: 9031053520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 00:52:45,416][15401] Updated weights for policy 0, policy_version 551210 (0.0058) [2024-06-24 00:52:48,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 9031122944. Throughput: 0: 42605.8. Samples: 9031307460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:48,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 00:52:49,289][15401] Updated weights for policy 0, policy_version 551220 (0.0049) [2024-06-24 00:52:52,879][15401] Updated weights for policy 0, policy_version 551230 (0.0042) [2024-06-24 00:52:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9031352320. Throughput: 0: 42505.4. Samples: 9031427360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 00:52:54,675][15349] Signal inference workers to stop experience collection... (133800 times) [2024-06-24 00:52:54,723][15401] InferenceWorker_p0-w0: stopping experience collection (133800 times) [2024-06-24 00:52:54,726][15349] Signal inference workers to resume experience collection... (133800 times) [2024-06-24 00:52:54,734][15401] InferenceWorker_p0-w0: resuming experience collection (133800 times) [2024-06-24 00:52:57,035][15401] Updated weights for policy 0, policy_version 551240 (0.0033) [2024-06-24 00:52:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 9031565312. Throughput: 0: 42588.9. Samples: 9031681700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:52:58,391][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 00:53:00,393][15401] Updated weights for policy 0, policy_version 551250 (0.0040) [2024-06-24 00:53:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9031778304. Throughput: 0: 42443.4. Samples: 9031938260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 00:53:05,111][15401] Updated weights for policy 0, policy_version 551260 (0.0033) [2024-06-24 00:53:08,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 9031991296. Throughput: 0: 42446.3. Samples: 9032064800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:08,397][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 00:53:08,573][15401] Updated weights for policy 0, policy_version 551270 (0.0025) [2024-06-24 00:53:12,697][15401] Updated weights for policy 0, policy_version 551280 (0.0033) [2024-06-24 00:53:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9032187904. Throughput: 0: 42450.8. Samples: 9032324400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 00:53:16,904][15401] Updated weights for policy 0, policy_version 551290 (0.0039) [2024-06-24 00:53:18,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9032417280. Throughput: 0: 42381.3. Samples: 9032577960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 00:53:20,393][15401] Updated weights for policy 0, policy_version 551300 (0.0034) [2024-06-24 00:53:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9032630272. Throughput: 0: 42531.9. Samples: 9032707680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 00:53:24,486][15401] Updated weights for policy 0, policy_version 551310 (0.0027) [2024-06-24 00:53:28,251][15401] Updated weights for policy 0, policy_version 551320 (0.0048) [2024-06-24 00:53:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 9032826880. Throughput: 0: 42193.7. Samples: 9032952240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 00:53:32,060][15401] Updated weights for policy 0, policy_version 551330 (0.0033) [2024-06-24 00:53:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 9033039872. Throughput: 0: 42325.9. Samples: 9033212120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 00:53:36,261][15401] Updated weights for policy 0, policy_version 551340 (0.0029) [2024-06-24 00:53:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 9033285632. Throughput: 0: 42624.4. Samples: 9033345460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 00:53:39,490][15401] Updated weights for policy 0, policy_version 551350 (0.0033) [2024-06-24 00:53:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 9033449472. Throughput: 0: 42562.1. Samples: 9033597000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 00:53:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551358_9033449472.pth... [2024-06-24 00:53:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000550736_9023258624.pth [2024-06-24 00:53:43,908][15401] Updated weights for policy 0, policy_version 551360 (0.0032) [2024-06-24 00:53:47,166][15401] Updated weights for policy 0, policy_version 551370 (0.0026) [2024-06-24 00:53:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9033678848. Throughput: 0: 42457.8. Samples: 9033848860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:48,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 00:53:51,603][15401] Updated weights for policy 0, policy_version 551380 (0.0031) [2024-06-24 00:53:53,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9033924608. Throughput: 0: 42541.7. Samples: 9033978900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 00:53:54,753][15401] Updated weights for policy 0, policy_version 551390 (0.0032) [2024-06-24 00:53:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9034104832. Throughput: 0: 42497.6. Samples: 9034236800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 00:53:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 00:53:59,075][15401] Updated weights for policy 0, policy_version 551400 (0.0022) [2024-06-24 00:54:02,397][15401] Updated weights for policy 0, policy_version 551410 (0.0033) [2024-06-24 00:54:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9034317824. Throughput: 0: 42548.8. Samples: 9034492660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 00:54:06,491][15401] Updated weights for policy 0, policy_version 551420 (0.0026) [2024-06-24 00:54:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 9034563584. Throughput: 0: 42497.8. Samples: 9034620080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 00:54:09,968][15401] Updated weights for policy 0, policy_version 551430 (0.0027) [2024-06-24 00:54:11,459][15349] Signal inference workers to stop experience collection... (133850 times) [2024-06-24 00:54:11,464][15349] Signal inference workers to resume experience collection... (133850 times) [2024-06-24 00:54:11,481][15401] InferenceWorker_p0-w0: stopping experience collection (133850 times) [2024-06-24 00:54:11,482][15401] InferenceWorker_p0-w0: resuming experience collection (133850 times) [2024-06-24 00:54:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9034743808. Throughput: 0: 42784.1. Samples: 9034877520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 00:54:13,962][15401] Updated weights for policy 0, policy_version 551440 (0.0047) [2024-06-24 00:54:17,426][15401] Updated weights for policy 0, policy_version 551450 (0.0034) [2024-06-24 00:54:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9034973184. Throughput: 0: 42757.2. Samples: 9035136200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 00:54:21,566][15401] Updated weights for policy 0, policy_version 551460 (0.0033) [2024-06-24 00:54:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 9035186176. Throughput: 0: 42681.3. Samples: 9035266220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:23,393][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 00:54:25,411][15401] Updated weights for policy 0, policy_version 551470 (0.0029) [2024-06-24 00:54:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42544.5). Total num frames: 9035366400. Throughput: 0: 42750.3. Samples: 9035520760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 00:54:29,287][15401] Updated weights for policy 0, policy_version 551480 (0.0027) [2024-06-24 00:54:33,088][15401] Updated weights for policy 0, policy_version 551490 (0.0043) [2024-06-24 00:54:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9035628544. Throughput: 0: 42791.9. Samples: 9035774500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 00:54:36,881][15401] Updated weights for policy 0, policy_version 551500 (0.0041) [2024-06-24 00:54:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9035825152. Throughput: 0: 42963.0. Samples: 9035912240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:38,394][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 00:54:40,726][15401] Updated weights for policy 0, policy_version 551510 (0.0033) [2024-06-24 00:54:43,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 9036021760. Throughput: 0: 42758.3. Samples: 9036161020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:43,392][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 00:54:44,847][15401] Updated weights for policy 0, policy_version 551520 (0.0036) [2024-06-24 00:54:48,102][15401] Updated weights for policy 0, policy_version 551530 (0.0028) [2024-06-24 00:54:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9036267520. Throughput: 0: 42631.6. Samples: 9036411080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 00:54:52,594][15401] Updated weights for policy 0, policy_version 551540 (0.0040) [2024-06-24 00:54:53,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9036480512. Throughput: 0: 42874.1. Samples: 9036549420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 00:54:55,588][15401] Updated weights for policy 0, policy_version 551550 (0.0029) [2024-06-24 00:54:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9036660736. Throughput: 0: 42758.3. Samples: 9036801640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:54:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 00:55:00,159][15401] Updated weights for policy 0, policy_version 551560 (0.0028) [2024-06-24 00:55:03,113][15401] Updated weights for policy 0, policy_version 551570 (0.0038) [2024-06-24 00:55:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 9036922880. Throughput: 0: 42553.4. Samples: 9037051200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:03,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 00:55:07,917][15401] Updated weights for policy 0, policy_version 551580 (0.0028) [2024-06-24 00:55:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 9037103104. Throughput: 0: 42791.1. Samples: 9037191820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:08,393][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 00:55:10,684][15401] Updated weights for policy 0, policy_version 551590 (0.0032) [2024-06-24 00:55:13,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9037316096. Throughput: 0: 42706.2. Samples: 9037442540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:13,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 00:55:15,729][15401] Updated weights for policy 0, policy_version 551600 (0.0035) [2024-06-24 00:55:18,392][15132] Fps is (10 sec: 45874.8, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 9037561856. Throughput: 0: 42616.3. Samples: 9037692340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:18,393][15132] Avg episode reward: [(0, '0.884')] [2024-06-24 00:55:18,618][15401] Updated weights for policy 0, policy_version 551610 (0.0027) [2024-06-24 00:55:23,254][15401] Updated weights for policy 0, policy_version 551620 (0.0033) [2024-06-24 00:55:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 9037742080. Throughput: 0: 42564.0. Samples: 9037827620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:23,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 00:55:24,276][15349] Signal inference workers to stop experience collection... (133900 times) [2024-06-24 00:55:24,317][15401] InferenceWorker_p0-w0: stopping experience collection (133900 times) [2024-06-24 00:55:24,337][15349] Signal inference workers to resume experience collection... (133900 times) [2024-06-24 00:55:24,338][15401] InferenceWorker_p0-w0: resuming experience collection (133900 times) [2024-06-24 00:55:26,531][15401] Updated weights for policy 0, policy_version 551630 (0.0034) [2024-06-24 00:55:28,390][15132] Fps is (10 sec: 39331.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9037955072. Throughput: 0: 42641.4. Samples: 9038079780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 00:55:28,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 00:55:30,819][15401] Updated weights for policy 0, policy_version 551640 (0.0042) [2024-06-24 00:55:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 9038168064. Throughput: 0: 42730.6. Samples: 9038333960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 00:55:34,275][15401] Updated weights for policy 0, policy_version 551650 (0.0038) [2024-06-24 00:55:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 9038381056. Throughput: 0: 42530.3. Samples: 9038463280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:38,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-24 00:55:38,473][15401] Updated weights for policy 0, policy_version 551660 (0.0045) [2024-06-24 00:55:42,487][15401] Updated weights for policy 0, policy_version 551670 (0.0029) [2024-06-24 00:55:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 9038594048. Throughput: 0: 42687.5. Samples: 9038722580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 00:55:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551673_9038610432.pth... [2024-06-24 00:55:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551048_9028370432.pth [2024-06-24 00:55:46,202][15401] Updated weights for policy 0, policy_version 551680 (0.0045) [2024-06-24 00:55:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9038823424. Throughput: 0: 42691.1. Samples: 9038972200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:48,395][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 00:55:50,106][15401] Updated weights for policy 0, policy_version 551690 (0.0037) [2024-06-24 00:55:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9039036416. Throughput: 0: 42639.6. Samples: 9039110500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:53,396][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 00:55:53,718][15401] Updated weights for policy 0, policy_version 551700 (0.0033) [2024-06-24 00:55:57,633][15401] Updated weights for policy 0, policy_version 551710 (0.0033) [2024-06-24 00:55:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 9039233024. Throughput: 0: 42740.4. Samples: 9039365860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:55:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 00:56:01,162][15401] Updated weights for policy 0, policy_version 551720 (0.0033) [2024-06-24 00:56:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 9039478784. Throughput: 0: 42931.6. Samples: 9039624160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 00:56:05,199][15401] Updated weights for policy 0, policy_version 551730 (0.0048) [2024-06-24 00:56:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 9039675392. Throughput: 0: 42877.7. Samples: 9039757120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 00:56:08,801][15401] Updated weights for policy 0, policy_version 551740 (0.0041) [2024-06-24 00:56:12,646][15401] Updated weights for policy 0, policy_version 551750 (0.0036) [2024-06-24 00:56:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9039872000. Throughput: 0: 42962.6. Samples: 9040013100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 00:56:16,355][15401] Updated weights for policy 0, policy_version 551760 (0.0034) [2024-06-24 00:56:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 9040117760. Throughput: 0: 43117.8. Samples: 9040274260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 00:56:20,138][15401] Updated weights for policy 0, policy_version 551770 (0.0032) [2024-06-24 00:56:23,390][15132] Fps is (10 sec: 45872.1, 60 sec: 43144.0, 300 sec: 42764.9). Total num frames: 9040330752. Throughput: 0: 43194.3. Samples: 9040407060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:23,391][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 00:56:23,895][15401] Updated weights for policy 0, policy_version 551780 (0.0028) [2024-06-24 00:56:27,602][15401] Updated weights for policy 0, policy_version 551790 (0.0038) [2024-06-24 00:56:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9040527360. Throughput: 0: 43072.5. Samples: 9040660840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 00:56:31,420][15401] Updated weights for policy 0, policy_version 551800 (0.0040) [2024-06-24 00:56:33,389][15132] Fps is (10 sec: 40963.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9040740352. Throughput: 0: 43371.2. Samples: 9040923900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 00:56:35,443][15401] Updated weights for policy 0, policy_version 551810 (0.0032) [2024-06-24 00:56:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9040969728. Throughput: 0: 43132.0. Samples: 9041051440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 00:56:39,133][15401] Updated weights for policy 0, policy_version 551820 (0.0023) [2024-06-24 00:56:42,824][15349] Signal inference workers to stop experience collection... (133950 times) [2024-06-24 00:56:42,825][15349] Signal inference workers to resume experience collection... (133950 times) [2024-06-24 00:56:42,845][15401] InferenceWorker_p0-w0: stopping experience collection (133950 times) [2024-06-24 00:56:42,845][15401] InferenceWorker_p0-w0: resuming experience collection (133950 times) [2024-06-24 00:56:42,975][15401] Updated weights for policy 0, policy_version 551830 (0.0031) [2024-06-24 00:56:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9041182720. Throughput: 0: 43061.4. Samples: 9041303620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 00:56:46,772][15401] Updated weights for policy 0, policy_version 551840 (0.0030) [2024-06-24 00:56:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9041395712. Throughput: 0: 43115.6. Samples: 9041564360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:48,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 00:56:50,607][15401] Updated weights for policy 0, policy_version 551850 (0.0034) [2024-06-24 00:56:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9041608704. Throughput: 0: 43053.4. Samples: 9041694520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 00:56:54,248][15401] Updated weights for policy 0, policy_version 551860 (0.0041) [2024-06-24 00:56:58,115][15401] Updated weights for policy 0, policy_version 551870 (0.0040) [2024-06-24 00:56:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 9041838080. Throughput: 0: 43033.4. Samples: 9041949600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 00:56:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 00:57:02,052][15401] Updated weights for policy 0, policy_version 551880 (0.0039) [2024-06-24 00:57:03,394][15132] Fps is (10 sec: 44218.1, 60 sec: 42868.5, 300 sec: 42764.4). Total num frames: 9042051072. Throughput: 0: 42844.9. Samples: 9042202460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:03,394][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 00:57:05,982][15401] Updated weights for policy 0, policy_version 551890 (0.0037) [2024-06-24 00:57:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9042247680. Throughput: 0: 42757.6. Samples: 9042331120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:08,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 00:57:09,575][15401] Updated weights for policy 0, policy_version 551900 (0.0042) [2024-06-24 00:57:13,389][15132] Fps is (10 sec: 40977.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 9042460672. Throughput: 0: 42873.3. Samples: 9042590140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 00:57:13,640][15401] Updated weights for policy 0, policy_version 551910 (0.0044) [2024-06-24 00:57:17,452][15401] Updated weights for policy 0, policy_version 551920 (0.0037) [2024-06-24 00:57:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9042690048. Throughput: 0: 42704.9. Samples: 9042845620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 00:57:21,407][15401] Updated weights for policy 0, policy_version 551930 (0.0028) [2024-06-24 00:57:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.9, 300 sec: 42709.5). Total num frames: 9042903040. Throughput: 0: 42733.7. Samples: 9042974460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:23,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 00:57:25,519][15401] Updated weights for policy 0, policy_version 551940 (0.0027) [2024-06-24 00:57:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 9043083264. Throughput: 0: 42676.8. Samples: 9043224080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:28,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 00:57:29,106][15401] Updated weights for policy 0, policy_version 551950 (0.0041) [2024-06-24 00:57:33,060][15401] Updated weights for policy 0, policy_version 551960 (0.0027) [2024-06-24 00:57:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 9043329024. Throughput: 0: 42659.1. Samples: 9043484020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 00:57:36,849][15401] Updated weights for policy 0, policy_version 551970 (0.0040) [2024-06-24 00:57:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9043542016. Throughput: 0: 42677.0. Samples: 9043614980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 00:57:40,637][15401] Updated weights for policy 0, policy_version 551980 (0.0041) [2024-06-24 00:57:43,390][15132] Fps is (10 sec: 40957.0, 60 sec: 42597.8, 300 sec: 42764.9). Total num frames: 9043738624. Throughput: 0: 42713.0. Samples: 9043871720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:43,391][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 00:57:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551986_9043738624.pth... [2024-06-24 00:57:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551358_9033449472.pth [2024-06-24 00:57:44,715][15401] Updated weights for policy 0, policy_version 551990 (0.0035) [2024-06-24 00:57:48,189][15401] Updated weights for policy 0, policy_version 552000 (0.0034) [2024-06-24 00:57:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9043968000. Throughput: 0: 42722.3. Samples: 9044124780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 00:57:52,299][15401] Updated weights for policy 0, policy_version 552010 (0.0036) [2024-06-24 00:57:53,389][15132] Fps is (10 sec: 42602.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9044164608. Throughput: 0: 42777.9. Samples: 9044256120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 00:57:56,010][15401] Updated weights for policy 0, policy_version 552020 (0.0036) [2024-06-24 00:57:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9044377600. Throughput: 0: 42658.2. Samples: 9044509760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:57:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 00:57:59,685][15349] Signal inference workers to stop experience collection... (134000 times) [2024-06-24 00:57:59,688][15349] Signal inference workers to resume experience collection... (134000 times) [2024-06-24 00:57:59,719][15401] InferenceWorker_p0-w0: stopping experience collection (134000 times) [2024-06-24 00:57:59,719][15401] InferenceWorker_p0-w0: resuming experience collection (134000 times) [2024-06-24 00:57:59,845][15401] Updated weights for policy 0, policy_version 552030 (0.0030) [2024-06-24 00:58:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42601.4, 300 sec: 42765.9). Total num frames: 9044606976. Throughput: 0: 42549.3. Samples: 9044760340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 00:58:03,495][15401] Updated weights for policy 0, policy_version 552040 (0.0041) [2024-06-24 00:58:08,297][15401] Updated weights for policy 0, policy_version 552050 (0.0036) [2024-06-24 00:58:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9044787200. Throughput: 0: 42554.3. Samples: 9044889400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 00:58:10,903][15401] Updated weights for policy 0, policy_version 552060 (0.0032) [2024-06-24 00:58:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9045032960. Throughput: 0: 42769.8. Samples: 9045148720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 00:58:15,966][15401] Updated weights for policy 0, policy_version 552070 (0.0028) [2024-06-24 00:58:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9045262336. Throughput: 0: 42588.0. Samples: 9045400480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:18,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 00:58:18,611][15401] Updated weights for policy 0, policy_version 552080 (0.0039) [2024-06-24 00:58:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9045426176. Throughput: 0: 42565.6. Samples: 9045530440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 00:58:23,548][15401] Updated weights for policy 0, policy_version 552090 (0.0035) [2024-06-24 00:58:26,530][15401] Updated weights for policy 0, policy_version 552100 (0.0033) [2024-06-24 00:58:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9045671936. Throughput: 0: 42496.3. Samples: 9045784020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 00:58:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 00:58:31,079][15401] Updated weights for policy 0, policy_version 552110 (0.0037) [2024-06-24 00:58:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9045868544. Throughput: 0: 42633.4. Samples: 9046043280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 00:58:34,135][15401] Updated weights for policy 0, policy_version 552120 (0.0037) [2024-06-24 00:58:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 9046081536. Throughput: 0: 42487.9. Samples: 9046168180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:38,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 00:58:38,558][15401] Updated weights for policy 0, policy_version 552130 (0.0036) [2024-06-24 00:58:42,244][15401] Updated weights for policy 0, policy_version 552140 (0.0031) [2024-06-24 00:58:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42872.0, 300 sec: 42820.5). Total num frames: 9046310912. Throughput: 0: 42686.6. Samples: 9046430660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 00:58:45,925][15401] Updated weights for policy 0, policy_version 552150 (0.0028) [2024-06-24 00:58:48,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9046507520. Throughput: 0: 42795.3. Samples: 9046686120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 00:58:49,791][15401] Updated weights for policy 0, policy_version 552160 (0.0040) [2024-06-24 00:58:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9046736896. Throughput: 0: 42778.7. Samples: 9046814440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 00:58:53,636][15401] Updated weights for policy 0, policy_version 552170 (0.0036) [2024-06-24 00:58:57,293][15401] Updated weights for policy 0, policy_version 552180 (0.0032) [2024-06-24 00:58:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9046949888. Throughput: 0: 42788.0. Samples: 9047074180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:58:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 00:59:01,754][15401] Updated weights for policy 0, policy_version 552190 (0.0028) [2024-06-24 00:59:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9047179264. Throughput: 0: 42847.1. Samples: 9047328700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:03,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 00:59:04,944][15401] Updated weights for policy 0, policy_version 552200 (0.0033) [2024-06-24 00:59:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9047375872. Throughput: 0: 42857.3. Samples: 9047459120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:08,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 00:59:09,267][15401] Updated weights for policy 0, policy_version 552210 (0.0028) [2024-06-24 00:59:12,652][15401] Updated weights for policy 0, policy_version 552220 (0.0041) [2024-06-24 00:59:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9047605248. Throughput: 0: 42993.4. Samples: 9047718720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 00:59:16,914][15401] Updated weights for policy 0, policy_version 552230 (0.0040) [2024-06-24 00:59:18,392][15132] Fps is (10 sec: 44237.0, 60 sec: 42596.7, 300 sec: 42820.6). Total num frames: 9047818240. Throughput: 0: 42824.8. Samples: 9047970500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:18,392][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 00:59:20,228][15401] Updated weights for policy 0, policy_version 552240 (0.0032) [2024-06-24 00:59:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9047998464. Throughput: 0: 42956.7. Samples: 9048101120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 00:59:24,393][15401] Updated weights for policy 0, policy_version 552250 (0.0029) [2024-06-24 00:59:27,724][15349] Signal inference workers to stop experience collection... (134050 times) [2024-06-24 00:59:27,726][15349] Signal inference workers to resume experience collection... (134050 times) [2024-06-24 00:59:27,755][15401] InferenceWorker_p0-w0: stopping experience collection (134050 times) [2024-06-24 00:59:27,755][15401] InferenceWorker_p0-w0: resuming experience collection (134050 times) [2024-06-24 00:59:27,859][15401] Updated weights for policy 0, policy_version 552260 (0.0031) [2024-06-24 00:59:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9048227840. Throughput: 0: 42798.3. Samples: 9048356580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 00:59:31,942][15401] Updated weights for policy 0, policy_version 552270 (0.0035) [2024-06-24 00:59:33,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9048473600. Throughput: 0: 42800.2. Samples: 9048612140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 00:59:35,616][15401] Updated weights for policy 0, policy_version 552280 (0.0044) [2024-06-24 00:59:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.3, 300 sec: 42820.9). Total num frames: 9048653824. Throughput: 0: 42938.8. Samples: 9048746680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 00:59:39,554][15401] Updated weights for policy 0, policy_version 552290 (0.0032) [2024-06-24 00:59:43,066][15401] Updated weights for policy 0, policy_version 552300 (0.0023) [2024-06-24 00:59:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9048883200. Throughput: 0: 42832.4. Samples: 9049001740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:43,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 00:59:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552300_9048883200.pth... [2024-06-24 00:59:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551673_9038610432.pth [2024-06-24 00:59:47,031][15401] Updated weights for policy 0, policy_version 552310 (0.0035) [2024-06-24 00:59:48,391][15132] Fps is (10 sec: 44228.8, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 9049096192. Throughput: 0: 42871.4. Samples: 9049257880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:48,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 00:59:51,225][15401] Updated weights for policy 0, policy_version 552320 (0.0037) [2024-06-24 00:59:53,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9049292800. Throughput: 0: 42816.1. Samples: 9049385740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 00:59:54,776][15401] Updated weights for policy 0, policy_version 552330 (0.0037) [2024-06-24 00:59:58,389][15132] Fps is (10 sec: 40967.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 9049505792. Throughput: 0: 42654.2. Samples: 9049638160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 00:59:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 00:59:58,937][15401] Updated weights for policy 0, policy_version 552340 (0.0035) [2024-06-24 01:00:02,378][15401] Updated weights for policy 0, policy_version 552350 (0.0025) [2024-06-24 01:00:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 9049718784. Throughput: 0: 42744.9. Samples: 9049893920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:03,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 01:00:06,664][15401] Updated weights for policy 0, policy_version 552360 (0.0027) [2024-06-24 01:00:08,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 9049948160. Throughput: 0: 42752.1. Samples: 9050024980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:00:10,215][15401] Updated weights for policy 0, policy_version 552370 (0.0023) [2024-06-24 01:00:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 9050161152. Throughput: 0: 42730.1. Samples: 9050279440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 01:00:14,215][15401] Updated weights for policy 0, policy_version 552380 (0.0041) [2024-06-24 01:00:17,853][15401] Updated weights for policy 0, policy_version 552390 (0.0034) [2024-06-24 01:00:18,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 9050357760. Throughput: 0: 42650.8. Samples: 9050531420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 01:00:21,820][15401] Updated weights for policy 0, policy_version 552400 (0.0028) [2024-06-24 01:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9050587136. Throughput: 0: 42462.0. Samples: 9050657480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 01:00:25,687][15401] Updated weights for policy 0, policy_version 552410 (0.0043) [2024-06-24 01:00:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9050783744. Throughput: 0: 42478.7. Samples: 9050913180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 01:00:29,513][15401] Updated weights for policy 0, policy_version 552420 (0.0037) [2024-06-24 01:00:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 9050996736. Throughput: 0: 42473.0. Samples: 9051169100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 01:00:33,536][15401] Updated weights for policy 0, policy_version 552430 (0.0028) [2024-06-24 01:00:37,226][15401] Updated weights for policy 0, policy_version 552440 (0.0034) [2024-06-24 01:00:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9051226112. Throughput: 0: 42504.0. Samples: 9051298420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:38,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 01:00:41,212][15401] Updated weights for policy 0, policy_version 552450 (0.0037) [2024-06-24 01:00:43,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9051439104. Throughput: 0: 42480.8. Samples: 9051549800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:43,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 01:00:44,824][15401] Updated weights for policy 0, policy_version 552460 (0.0041) [2024-06-24 01:00:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42599.5, 300 sec: 42765.0). Total num frames: 9051652096. Throughput: 0: 42580.8. Samples: 9051810060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:48,393][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 01:00:48,659][15401] Updated weights for policy 0, policy_version 552470 (0.0042) [2024-06-24 01:00:52,446][15401] Updated weights for policy 0, policy_version 552480 (0.0038) [2024-06-24 01:00:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9051881472. Throughput: 0: 42446.8. Samples: 9051935080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 01:00:56,426][15401] Updated weights for policy 0, policy_version 552490 (0.0022) [2024-06-24 01:00:57,201][15349] Signal inference workers to stop experience collection... (134100 times) [2024-06-24 01:00:57,235][15401] InferenceWorker_p0-w0: stopping experience collection (134100 times) [2024-06-24 01:00:57,249][15349] Signal inference workers to resume experience collection... (134100 times) [2024-06-24 01:00:57,251][15401] InferenceWorker_p0-w0: resuming experience collection (134100 times) [2024-06-24 01:00:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9052094464. Throughput: 0: 42557.0. Samples: 9052194500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:00:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 01:01:00,183][15401] Updated weights for policy 0, policy_version 552500 (0.0024) [2024-06-24 01:01:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9052274688. Throughput: 0: 42744.0. Samples: 9052454900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:03,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 01:01:03,940][15401] Updated weights for policy 0, policy_version 552510 (0.0039) [2024-06-24 01:01:07,837][15401] Updated weights for policy 0, policy_version 552520 (0.0030) [2024-06-24 01:01:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 9052504064. Throughput: 0: 42706.8. Samples: 9052579280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 01:01:11,914][15401] Updated weights for policy 0, policy_version 552530 (0.0029) [2024-06-24 01:01:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9052733440. Throughput: 0: 42684.0. Samples: 9052833960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 01:01:15,532][15401] Updated weights for policy 0, policy_version 552540 (0.0047) [2024-06-24 01:01:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.6). Total num frames: 9052930048. Throughput: 0: 42668.1. Samples: 9053089160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 01:01:19,475][15401] Updated weights for policy 0, policy_version 552550 (0.0041) [2024-06-24 01:01:23,376][15401] Updated weights for policy 0, policy_version 552560 (0.0039) [2024-06-24 01:01:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9053143040. Throughput: 0: 42522.7. Samples: 9053211940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 01:01:26,961][15401] Updated weights for policy 0, policy_version 552570 (0.0028) [2024-06-24 01:01:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9053372416. Throughput: 0: 42731.6. Samples: 9053472720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 01:01:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 01:01:30,925][15401] Updated weights for policy 0, policy_version 552580 (0.0043) [2024-06-24 01:01:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9053552640. Throughput: 0: 42697.9. Samples: 9053731460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 01:01:34,826][15401] Updated weights for policy 0, policy_version 552590 (0.0031) [2024-06-24 01:01:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9053782016. Throughput: 0: 42676.1. Samples: 9053855500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:01:38,492][15401] Updated weights for policy 0, policy_version 552600 (0.0040) [2024-06-24 01:01:42,403][15401] Updated weights for policy 0, policy_version 552610 (0.0038) [2024-06-24 01:01:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9053995008. Throughput: 0: 42735.9. Samples: 9054117620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 01:01:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552613_9054011392.pth... [2024-06-24 01:01:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000551986_9043738624.pth [2024-06-24 01:01:46,027][15401] Updated weights for policy 0, policy_version 552620 (0.0029) [2024-06-24 01:01:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9054208000. Throughput: 0: 42642.1. Samples: 9054373800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 01:01:50,007][15401] Updated weights for policy 0, policy_version 552630 (0.0033) [2024-06-24 01:01:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9054420992. Throughput: 0: 42678.2. Samples: 9054499800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 01:01:53,933][15401] Updated weights for policy 0, policy_version 552640 (0.0036) [2024-06-24 01:01:57,600][15401] Updated weights for policy 0, policy_version 552650 (0.0044) [2024-06-24 01:01:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 9054633984. Throughput: 0: 42893.3. Samples: 9054764160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:01:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 01:02:01,499][15401] Updated weights for policy 0, policy_version 552660 (0.0038) [2024-06-24 01:02:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9054830592. Throughput: 0: 42824.0. Samples: 9055016240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 01:02:05,489][15401] Updated weights for policy 0, policy_version 552670 (0.0043) [2024-06-24 01:02:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9055059968. Throughput: 0: 42791.6. Samples: 9055137560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:08,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 01:02:09,181][15401] Updated weights for policy 0, policy_version 552680 (0.0038) [2024-06-24 01:02:13,047][15401] Updated weights for policy 0, policy_version 552690 (0.0042) [2024-06-24 01:02:13,395][15132] Fps is (10 sec: 45852.3, 60 sec: 42594.8, 300 sec: 42708.7). Total num frames: 9055289344. Throughput: 0: 42742.2. Samples: 9055396340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:13,395][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 01:02:17,167][15401] Updated weights for policy 0, policy_version 552700 (0.0032) [2024-06-24 01:02:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9055485952. Throughput: 0: 42561.7. Samples: 9055646740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:18,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 01:02:20,548][15401] Updated weights for policy 0, policy_version 552710 (0.0039) [2024-06-24 01:02:23,390][15132] Fps is (10 sec: 39341.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9055682560. Throughput: 0: 42532.0. Samples: 9055769440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 01:02:24,915][15401] Updated weights for policy 0, policy_version 552720 (0.0030) [2024-06-24 01:02:27,527][15349] Signal inference workers to stop experience collection... (134150 times) [2024-06-24 01:02:27,572][15401] InferenceWorker_p0-w0: stopping experience collection (134150 times) [2024-06-24 01:02:27,589][15349] Signal inference workers to resume experience collection... (134150 times) [2024-06-24 01:02:27,590][15401] InferenceWorker_p0-w0: resuming experience collection (134150 times) [2024-06-24 01:02:28,107][15401] Updated weights for policy 0, policy_version 552730 (0.0039) [2024-06-24 01:02:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9055928320. Throughput: 0: 42562.3. Samples: 9056032920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 01:02:32,499][15401] Updated weights for policy 0, policy_version 552740 (0.0032) [2024-06-24 01:02:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 9056108544. Throughput: 0: 42569.8. Samples: 9056289540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:33,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 01:02:35,737][15401] Updated weights for policy 0, policy_version 552750 (0.0026) [2024-06-24 01:02:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42654.1). Total num frames: 9056321536. Throughput: 0: 42486.7. Samples: 9056411700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 01:02:40,261][15401] Updated weights for policy 0, policy_version 552760 (0.0032) [2024-06-24 01:02:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9056550912. Throughput: 0: 42429.8. Samples: 9056673500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 01:02:43,661][15401] Updated weights for policy 0, policy_version 552770 (0.0031) [2024-06-24 01:02:47,874][15401] Updated weights for policy 0, policy_version 552780 (0.0043) [2024-06-24 01:02:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9056747520. Throughput: 0: 42429.3. Samples: 9056925560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 01:02:51,250][15401] Updated weights for policy 0, policy_version 552790 (0.0032) [2024-06-24 01:02:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9056976896. Throughput: 0: 42584.5. Samples: 9057053860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 01:02:55,438][15401] Updated weights for policy 0, policy_version 552800 (0.0036) [2024-06-24 01:02:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9057189888. Throughput: 0: 42678.2. Samples: 9057316640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:02:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 01:02:58,770][15401] Updated weights for policy 0, policy_version 552810 (0.0035) [2024-06-24 01:03:02,949][15401] Updated weights for policy 0, policy_version 552820 (0.0035) [2024-06-24 01:03:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9057402880. Throughput: 0: 42679.2. Samples: 9057567300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 01:03:06,582][15401] Updated weights for policy 0, policy_version 552830 (0.0039) [2024-06-24 01:03:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9057632256. Throughput: 0: 42799.1. Samples: 9057695400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 01:03:10,790][15401] Updated weights for policy 0, policy_version 552840 (0.0034) [2024-06-24 01:03:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42328.8, 300 sec: 42598.4). Total num frames: 9057828864. Throughput: 0: 42734.9. Samples: 9057956000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:13,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-24 01:03:14,247][15401] Updated weights for policy 0, policy_version 552850 (0.0035) [2024-06-24 01:03:18,383][15401] Updated weights for policy 0, policy_version 552860 (0.0033) [2024-06-24 01:03:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9058058240. Throughput: 0: 42607.2. Samples: 9058206760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 01:03:22,120][15401] Updated weights for policy 0, policy_version 552870 (0.0036) [2024-06-24 01:03:23,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 9058271232. Throughput: 0: 42738.1. Samples: 9058335020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:23,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 01:03:25,914][15401] Updated weights for policy 0, policy_version 552880 (0.0035) [2024-06-24 01:03:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9058451456. Throughput: 0: 42686.2. Samples: 9058594380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 01:03:30,023][15401] Updated weights for policy 0, policy_version 552890 (0.0036) [2024-06-24 01:03:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 9058697216. Throughput: 0: 42589.0. Samples: 9058842060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 01:03:33,502][15401] Updated weights for policy 0, policy_version 552900 (0.0026) [2024-06-24 01:03:37,460][15401] Updated weights for policy 0, policy_version 552910 (0.0052) [2024-06-24 01:03:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9058910208. Throughput: 0: 42751.9. Samples: 9058977700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:38,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 01:03:41,361][15401] Updated weights for policy 0, policy_version 552920 (0.0031) [2024-06-24 01:03:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9059090432. Throughput: 0: 42636.8. Samples: 9059235300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:43,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 01:03:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552923_9059090432.pth... [2024-06-24 01:03:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552300_9048883200.pth [2024-06-24 01:03:45,025][15401] Updated weights for policy 0, policy_version 552930 (0.0037) [2024-06-24 01:03:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9059319808. Throughput: 0: 42716.5. Samples: 9059489540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 01:03:48,907][15401] Updated weights for policy 0, policy_version 552940 (0.0034) [2024-06-24 01:03:52,576][15401] Updated weights for policy 0, policy_version 552950 (0.0033) [2024-06-24 01:03:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9059549184. Throughput: 0: 42882.7. Samples: 9059625120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 01:03:56,675][15401] Updated weights for policy 0, policy_version 552960 (0.0030) [2024-06-24 01:03:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 9059729408. Throughput: 0: 42553.5. Samples: 9059870900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:03:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 01:04:00,465][15401] Updated weights for policy 0, policy_version 552970 (0.0043) [2024-06-24 01:04:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9059958784. Throughput: 0: 42695.5. Samples: 9060128060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 01:04:03,421][15349] Signal inference workers to stop experience collection... (134200 times) [2024-06-24 01:04:03,452][15401] InferenceWorker_p0-w0: stopping experience collection (134200 times) [2024-06-24 01:04:03,477][15349] Signal inference workers to resume experience collection... (134200 times) [2024-06-24 01:04:03,478][15401] InferenceWorker_p0-w0: resuming experience collection (134200 times) [2024-06-24 01:04:04,109][15401] Updated weights for policy 0, policy_version 552980 (0.0029) [2024-06-24 01:04:07,979][15401] Updated weights for policy 0, policy_version 552990 (0.0030) [2024-06-24 01:04:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9060188160. Throughput: 0: 42845.0. Samples: 9060262940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 01:04:11,706][15401] Updated weights for policy 0, policy_version 553000 (0.0040) [2024-06-24 01:04:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 9060368384. Throughput: 0: 42630.7. Samples: 9060512760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 01:04:15,886][15401] Updated weights for policy 0, policy_version 553010 (0.0041) [2024-06-24 01:04:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9060614144. Throughput: 0: 42935.1. Samples: 9060774140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 01:04:19,358][15401] Updated weights for policy 0, policy_version 553020 (0.0030) [2024-06-24 01:04:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 9060827136. Throughput: 0: 42791.2. Samples: 9060903300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 01:04:23,877][15401] Updated weights for policy 0, policy_version 553030 (0.0030) [2024-06-24 01:04:27,159][15401] Updated weights for policy 0, policy_version 553040 (0.0032) [2024-06-24 01:04:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 9061023744. Throughput: 0: 42720.8. Samples: 9061157740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 01:04:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 01:04:31,302][15401] Updated weights for policy 0, policy_version 553050 (0.0033) [2024-06-24 01:04:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9061253120. Throughput: 0: 42954.6. Samples: 9061422500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 01:04:34,594][15401] Updated weights for policy 0, policy_version 553060 (0.0034) [2024-06-24 01:04:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 9061466112. Throughput: 0: 42717.8. Samples: 9061547420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 01:04:38,902][15401] Updated weights for policy 0, policy_version 553070 (0.0028) [2024-06-24 01:04:42,198][15401] Updated weights for policy 0, policy_version 553080 (0.0047) [2024-06-24 01:04:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43416.0, 300 sec: 42709.4). Total num frames: 9061695488. Throughput: 0: 42932.8. Samples: 9061802980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:43,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 01:04:46,434][15401] Updated weights for policy 0, policy_version 553090 (0.0026) [2024-06-24 01:04:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9061892096. Throughput: 0: 43027.7. Samples: 9062064300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 01:04:50,076][15401] Updated weights for policy 0, policy_version 553100 (0.0026) [2024-06-24 01:04:53,390][15132] Fps is (10 sec: 42607.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9062121472. Throughput: 0: 42799.3. Samples: 9062188920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:53,396][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 01:04:54,099][15401] Updated weights for policy 0, policy_version 553110 (0.0033) [2024-06-24 01:04:57,618][15401] Updated weights for policy 0, policy_version 553120 (0.0027) [2024-06-24 01:04:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 9062350848. Throughput: 0: 43024.8. Samples: 9062448880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:04:58,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 01:05:02,197][15401] Updated weights for policy 0, policy_version 553130 (0.0033) [2024-06-24 01:05:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9062531072. Throughput: 0: 42959.1. Samples: 9062707300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 01:05:05,091][15401] Updated weights for policy 0, policy_version 553140 (0.0034) [2024-06-24 01:05:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9062744064. Throughput: 0: 42958.2. Samples: 9062836420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:05:09,716][15401] Updated weights for policy 0, policy_version 553150 (0.0050) [2024-06-24 01:05:12,639][15401] Updated weights for policy 0, policy_version 553160 (0.0028) [2024-06-24 01:05:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 42820.5). Total num frames: 9062989824. Throughput: 0: 43122.3. Samples: 9063098240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:05:17,330][15401] Updated weights for policy 0, policy_version 553170 (0.0034) [2024-06-24 01:05:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9063186432. Throughput: 0: 42875.6. Samples: 9063351900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 01:05:20,329][15401] Updated weights for policy 0, policy_version 553180 (0.0033) [2024-06-24 01:05:22,550][15349] Signal inference workers to stop experience collection... (134250 times) [2024-06-24 01:05:22,600][15401] InferenceWorker_p0-w0: stopping experience collection (134250 times) [2024-06-24 01:05:22,607][15349] Signal inference workers to resume experience collection... (134250 times) [2024-06-24 01:05:22,619][15401] InferenceWorker_p0-w0: resuming experience collection (134250 times) [2024-06-24 01:05:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9063399424. Throughput: 0: 42899.5. Samples: 9063477900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 01:05:25,052][15401] Updated weights for policy 0, policy_version 553190 (0.0038) [2024-06-24 01:05:27,819][15401] Updated weights for policy 0, policy_version 553200 (0.0039) [2024-06-24 01:05:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 9063628800. Throughput: 0: 43001.8. Samples: 9063737960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 01:05:32,726][15401] Updated weights for policy 0, policy_version 553210 (0.0034) [2024-06-24 01:05:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9063809024. Throughput: 0: 42973.2. Samples: 9063998100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 01:05:35,954][15401] Updated weights for policy 0, policy_version 553220 (0.0037) [2024-06-24 01:05:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 9064054784. Throughput: 0: 42944.2. Samples: 9064121500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:38,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 01:05:40,237][15401] Updated weights for policy 0, policy_version 553230 (0.0032) [2024-06-24 01:05:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9064267776. Throughput: 0: 42906.8. Samples: 9064379680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:43,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 01:05:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553240_9064284160.pth... [2024-06-24 01:05:43,467][15401] Updated weights for policy 0, policy_version 553240 (0.0029) [2024-06-24 01:05:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552613_9054011392.pth [2024-06-24 01:05:47,819][15401] Updated weights for policy 0, policy_version 553250 (0.0051) [2024-06-24 01:05:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9064464384. Throughput: 0: 43037.4. Samples: 9064643980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:48,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 01:05:51,093][15401] Updated weights for policy 0, policy_version 553260 (0.0034) [2024-06-24 01:05:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 9064693760. Throughput: 0: 42812.9. Samples: 9064763000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 01:05:55,277][15401] Updated weights for policy 0, policy_version 553270 (0.0031) [2024-06-24 01:05:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9064906752. Throughput: 0: 42784.4. Samples: 9065023540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:05:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 01:05:58,727][15401] Updated weights for policy 0, policy_version 553280 (0.0038) [2024-06-24 01:06:02,856][15401] Updated weights for policy 0, policy_version 553290 (0.0042) [2024-06-24 01:06:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9065103360. Throughput: 0: 42966.7. Samples: 9065285400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 01:06:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 01:06:06,316][15401] Updated weights for policy 0, policy_version 553300 (0.0043) [2024-06-24 01:06:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9065332736. Throughput: 0: 42995.1. Samples: 9065412680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:08,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 01:06:10,424][15401] Updated weights for policy 0, policy_version 553310 (0.0036) [2024-06-24 01:06:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9065562112. Throughput: 0: 42944.4. Samples: 9065670460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 01:06:13,862][15401] Updated weights for policy 0, policy_version 553320 (0.0030) [2024-06-24 01:06:17,969][15401] Updated weights for policy 0, policy_version 553330 (0.0039) [2024-06-24 01:06:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9065775104. Throughput: 0: 42918.4. Samples: 9065929420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 01:06:21,512][15401] Updated weights for policy 0, policy_version 553340 (0.0045) [2024-06-24 01:06:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9065988096. Throughput: 0: 43091.2. Samples: 9066060500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 01:06:25,714][15401] Updated weights for policy 0, policy_version 553350 (0.0026) [2024-06-24 01:06:28,391][15132] Fps is (10 sec: 40954.7, 60 sec: 42597.5, 300 sec: 42820.4). Total num frames: 9066184704. Throughput: 0: 43098.3. Samples: 9066319160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:28,391][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 01:06:29,124][15401] Updated weights for policy 0, policy_version 553360 (0.0026) [2024-06-24 01:06:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 9066397696. Throughput: 0: 43021.8. Samples: 9066579960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 01:06:33,457][15401] Updated weights for policy 0, policy_version 553370 (0.0045) [2024-06-24 01:06:36,803][15401] Updated weights for policy 0, policy_version 553380 (0.0034) [2024-06-24 01:06:38,390][15132] Fps is (10 sec: 45880.5, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 9066643456. Throughput: 0: 43131.9. Samples: 9066703940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 01:06:40,948][15349] Signal inference workers to stop experience collection... (134300 times) [2024-06-24 01:06:40,948][15349] Signal inference workers to resume experience collection... (134300 times) [2024-06-24 01:06:40,978][15401] InferenceWorker_p0-w0: stopping experience collection (134300 times) [2024-06-24 01:06:41,011][15401] InferenceWorker_p0-w0: resuming experience collection (134300 times) [2024-06-24 01:06:41,083][15401] Updated weights for policy 0, policy_version 553390 (0.0023) [2024-06-24 01:06:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9066840064. Throughput: 0: 43117.7. Samples: 9066963840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 01:06:44,492][15401] Updated weights for policy 0, policy_version 553400 (0.0043) [2024-06-24 01:06:48,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9067020288. Throughput: 0: 43075.5. Samples: 9067223800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:48,396][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 01:06:48,819][15401] Updated weights for policy 0, policy_version 553410 (0.0032) [2024-06-24 01:06:52,280][15401] Updated weights for policy 0, policy_version 553420 (0.0037) [2024-06-24 01:06:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9067266048. Throughput: 0: 42871.1. Samples: 9067341880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 01:06:56,266][15401] Updated weights for policy 0, policy_version 553430 (0.0028) [2024-06-24 01:06:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9067495424. Throughput: 0: 42924.4. Samples: 9067602060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:06:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 01:06:59,814][15401] Updated weights for policy 0, policy_version 553440 (0.0032) [2024-06-24 01:07:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9067675648. Throughput: 0: 43044.7. Samples: 9067866440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 01:07:03,876][15401] Updated weights for policy 0, policy_version 553450 (0.0037) [2024-06-24 01:07:07,222][15401] Updated weights for policy 0, policy_version 553460 (0.0041) [2024-06-24 01:07:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 9067905024. Throughput: 0: 42856.5. Samples: 9067989040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 01:07:11,548][15401] Updated weights for policy 0, policy_version 553470 (0.0052) [2024-06-24 01:07:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9068134400. Throughput: 0: 42864.7. Samples: 9068248020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 01:07:15,178][15401] Updated weights for policy 0, policy_version 553480 (0.0047) [2024-06-24 01:07:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9068347392. Throughput: 0: 42849.6. Samples: 9068508200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 01:07:19,207][15401] Updated weights for policy 0, policy_version 553490 (0.0031) [2024-06-24 01:07:23,170][15401] Updated weights for policy 0, policy_version 553500 (0.0033) [2024-06-24 01:07:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9068544000. Throughput: 0: 42843.6. Samples: 9068631900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 01:07:26,725][15401] Updated weights for policy 0, policy_version 553510 (0.0032) [2024-06-24 01:07:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43145.4, 300 sec: 42932.0). Total num frames: 9068773376. Throughput: 0: 42777.4. Samples: 9068888820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 01:07:30,623][15401] Updated weights for policy 0, policy_version 553520 (0.0037) [2024-06-24 01:07:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9068969984. Throughput: 0: 42807.6. Samples: 9069150140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 01:07:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 01:07:34,229][15401] Updated weights for policy 0, policy_version 553530 (0.0038) [2024-06-24 01:07:38,274][15401] Updated weights for policy 0, policy_version 553540 (0.0029) [2024-06-24 01:07:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9069199360. Throughput: 0: 42964.5. Samples: 9069275280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:07:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 01:07:41,854][15401] Updated weights for policy 0, policy_version 553550 (0.0036) [2024-06-24 01:07:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9069428736. Throughput: 0: 42912.9. Samples: 9069533140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:07:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 01:07:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553555_9069445120.pth... [2024-06-24 01:07:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000552923_9059090432.pth [2024-06-24 01:07:46,144][15401] Updated weights for policy 0, policy_version 553560 (0.0033) [2024-06-24 01:07:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9069625344. Throughput: 0: 42853.8. Samples: 9069794860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:07:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 01:07:49,511][15401] Updated weights for policy 0, policy_version 553570 (0.0030) [2024-06-24 01:07:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9069821952. Throughput: 0: 42880.1. Samples: 9069918640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:07:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 01:07:53,762][15401] Updated weights for policy 0, policy_version 553580 (0.0032) [2024-06-24 01:07:57,320][15401] Updated weights for policy 0, policy_version 553590 (0.0033) [2024-06-24 01:07:58,392][15132] Fps is (10 sec: 47502.6, 60 sec: 43416.0, 300 sec: 43042.4). Total num frames: 9070100480. Throughput: 0: 42916.4. Samples: 9070179360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:07:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 01:08:01,297][15401] Updated weights for policy 0, policy_version 553600 (0.0030) [2024-06-24 01:08:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9070264320. Throughput: 0: 42912.6. Samples: 9070439260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 01:08:04,806][15401] Updated weights for policy 0, policy_version 553610 (0.0028) [2024-06-24 01:08:05,268][15349] Signal inference workers to stop experience collection... (134350 times) [2024-06-24 01:08:05,269][15349] Signal inference workers to resume experience collection... (134350 times) [2024-06-24 01:08:05,312][15401] InferenceWorker_p0-w0: stopping experience collection (134350 times) [2024-06-24 01:08:05,312][15401] InferenceWorker_p0-w0: resuming experience collection (134350 times) [2024-06-24 01:08:08,390][15132] Fps is (10 sec: 37691.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9070477312. Throughput: 0: 42794.1. Samples: 9070557640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 01:08:09,258][15401] Updated weights for policy 0, policy_version 553620 (0.0034) [2024-06-24 01:08:12,605][15401] Updated weights for policy 0, policy_version 553630 (0.0030) [2024-06-24 01:08:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9070723072. Throughput: 0: 42965.7. Samples: 9070822280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 01:08:16,926][15401] Updated weights for policy 0, policy_version 553640 (0.0030) [2024-06-24 01:08:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 9070886912. Throughput: 0: 42877.8. Samples: 9071079640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:18,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 01:08:20,331][15401] Updated weights for policy 0, policy_version 553650 (0.0050) [2024-06-24 01:08:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9071116288. Throughput: 0: 42739.1. Samples: 9071198540. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:23,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 01:08:24,562][15401] Updated weights for policy 0, policy_version 553660 (0.0035) [2024-06-24 01:08:27,899][15401] Updated weights for policy 0, policy_version 553670 (0.0033) [2024-06-24 01:08:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9071345664. Throughput: 0: 42824.1. Samples: 9071460220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 01:08:32,005][15401] Updated weights for policy 0, policy_version 553680 (0.0038) [2024-06-24 01:08:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9071509504. Throughput: 0: 42839.5. Samples: 9071722640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 01:08:35,375][15401] Updated weights for policy 0, policy_version 553690 (0.0030) [2024-06-24 01:08:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9071771648. Throughput: 0: 42705.7. Samples: 9071840400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 01:08:39,427][15401] Updated weights for policy 0, policy_version 553700 (0.0030) [2024-06-24 01:08:42,870][15401] Updated weights for policy 0, policy_version 553710 (0.0048) [2024-06-24 01:08:43,392][15132] Fps is (10 sec: 49140.6, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 9072001024. Throughput: 0: 42813.3. Samples: 9072105960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:43,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 01:08:47,059][15401] Updated weights for policy 0, policy_version 553720 (0.0031) [2024-06-24 01:08:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9072164864. Throughput: 0: 42899.1. Samples: 9072369720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 01:08:50,489][15401] Updated weights for policy 0, policy_version 553730 (0.0035) [2024-06-24 01:08:53,391][15132] Fps is (10 sec: 40962.9, 60 sec: 43143.2, 300 sec: 42986.9). Total num frames: 9072410624. Throughput: 0: 42882.1. Samples: 9072487400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:53,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 01:08:54,914][15401] Updated weights for policy 0, policy_version 553740 (0.0034) [2024-06-24 01:08:58,319][15401] Updated weights for policy 0, policy_version 553750 (0.0030) [2024-06-24 01:08:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42327.0, 300 sec: 42987.2). Total num frames: 9072640000. Throughput: 0: 42927.7. Samples: 9072754020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:08:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 01:09:02,464][15401] Updated weights for policy 0, policy_version 553760 (0.0030) [2024-06-24 01:09:03,389][15132] Fps is (10 sec: 40967.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9072820224. Throughput: 0: 43014.2. Samples: 9073015280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-06-24 01:09:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 01:09:05,941][15401] Updated weights for policy 0, policy_version 553770 (0.0038) [2024-06-24 01:09:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 9073065984. Throughput: 0: 43171.0. Samples: 9073141340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:08,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 01:09:09,848][15401] Updated weights for policy 0, policy_version 553780 (0.0029) [2024-06-24 01:09:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.6, 300 sec: 42931.7). Total num frames: 9073278976. Throughput: 0: 43174.3. Samples: 9073403060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 01:09:13,543][15401] Updated weights for policy 0, policy_version 553790 (0.0032) [2024-06-24 01:09:17,243][15349] Signal inference workers to stop experience collection... (134400 times) [2024-06-24 01:09:17,244][15349] Signal inference workers to resume experience collection... (134400 times) [2024-06-24 01:09:17,264][15401] InferenceWorker_p0-w0: stopping experience collection (134400 times) [2024-06-24 01:09:17,264][15401] InferenceWorker_p0-w0: resuming experience collection (134400 times) [2024-06-24 01:09:17,395][15401] Updated weights for policy 0, policy_version 553800 (0.0030) [2024-06-24 01:09:18,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9073475584. Throughput: 0: 43081.9. Samples: 9073661320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 01:09:21,089][15401] Updated weights for policy 0, policy_version 553810 (0.0050) [2024-06-24 01:09:23,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 9073721344. Throughput: 0: 43224.8. Samples: 9073785520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 01:09:24,892][15401] Updated weights for policy 0, policy_version 553820 (0.0044) [2024-06-24 01:09:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9073917952. Throughput: 0: 43081.8. Samples: 9074044540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:28,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 01:09:28,633][15401] Updated weights for policy 0, policy_version 553830 (0.0029) [2024-06-24 01:09:32,528][15401] Updated weights for policy 0, policy_version 553840 (0.0034) [2024-06-24 01:09:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 9074130944. Throughput: 0: 42954.2. Samples: 9074302660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 01:09:36,461][15401] Updated weights for policy 0, policy_version 553850 (0.0038) [2024-06-24 01:09:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 9074360320. Throughput: 0: 43077.1. Samples: 9074425800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:38,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 01:09:40,577][15401] Updated weights for policy 0, policy_version 553860 (0.0035) [2024-06-24 01:09:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42873.0, 300 sec: 42987.1). Total num frames: 9074573312. Throughput: 0: 42960.6. Samples: 9074687260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:43,391][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 01:09:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553868_9074573312.pth... [2024-06-24 01:09:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553240_9064284160.pth [2024-06-24 01:09:44,307][15401] Updated weights for policy 0, policy_version 553870 (0.0037) [2024-06-24 01:09:48,174][15401] Updated weights for policy 0, policy_version 553880 (0.0032) [2024-06-24 01:09:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9074769920. Throughput: 0: 42760.4. Samples: 9074939500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:48,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 01:09:51,834][15401] Updated weights for policy 0, policy_version 553890 (0.0027) [2024-06-24 01:09:53,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42872.7, 300 sec: 42820.6). Total num frames: 9074982912. Throughput: 0: 42735.3. Samples: 9075064320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 01:09:55,731][15401] Updated weights for policy 0, policy_version 553900 (0.0030) [2024-06-24 01:09:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9075212288. Throughput: 0: 42809.2. Samples: 9075329480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:09:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 01:09:59,287][15401] Updated weights for policy 0, policy_version 553910 (0.0028) [2024-06-24 01:10:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9075408896. Throughput: 0: 42776.9. Samples: 9075586280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 01:10:03,739][15401] Updated weights for policy 0, policy_version 553920 (0.0027) [2024-06-24 01:10:06,742][15401] Updated weights for policy 0, policy_version 553930 (0.0031) [2024-06-24 01:10:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9075638272. Throughput: 0: 42744.1. Samples: 9075709000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 01:10:11,302][15401] Updated weights for policy 0, policy_version 553940 (0.0023) [2024-06-24 01:10:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9075851264. Throughput: 0: 42818.7. Samples: 9075971380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 01:10:14,251][15401] Updated weights for policy 0, policy_version 553950 (0.0023) [2024-06-24 01:10:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9076047872. Throughput: 0: 42699.1. Samples: 9076224120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:18,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 01:10:19,240][15401] Updated weights for policy 0, policy_version 553960 (0.0028) [2024-06-24 01:10:21,989][15401] Updated weights for policy 0, policy_version 553970 (0.0043) [2024-06-24 01:10:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9076260864. Throughput: 0: 42785.3. Samples: 9076351140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:23,399][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:10:26,755][15401] Updated weights for policy 0, policy_version 553980 (0.0043) [2024-06-24 01:10:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9076490240. Throughput: 0: 42813.1. Samples: 9076613840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 01:10:29,690][15401] Updated weights for policy 0, policy_version 553990 (0.0030) [2024-06-24 01:10:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.6, 300 sec: 42820.5). Total num frames: 9076686848. Throughput: 0: 42899.9. Samples: 9076870100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 01:10:33,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 01:10:34,187][15401] Updated weights for policy 0, policy_version 554000 (0.0056) [2024-06-24 01:10:37,459][15401] Updated weights for policy 0, policy_version 554010 (0.0021) [2024-06-24 01:10:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9076916224. Throughput: 0: 42992.4. Samples: 9076998980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:10:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 01:10:41,566][15401] Updated weights for policy 0, policy_version 554020 (0.0023) [2024-06-24 01:10:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.6, 300 sec: 42931.6). Total num frames: 9077129216. Throughput: 0: 42844.5. Samples: 9077257480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:10:43,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 01:10:44,822][15349] Signal inference workers to stop experience collection... (134450 times) [2024-06-24 01:10:44,822][15349] Signal inference workers to resume experience collection... (134450 times) [2024-06-24 01:10:44,866][15401] InferenceWorker_p0-w0: stopping experience collection (134450 times) [2024-06-24 01:10:44,866][15401] InferenceWorker_p0-w0: resuming experience collection (134450 times) [2024-06-24 01:10:45,143][15401] Updated weights for policy 0, policy_version 554030 (0.0036) [2024-06-24 01:10:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9077325824. Throughput: 0: 42820.8. Samples: 9077513220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:10:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 01:10:49,106][15401] Updated weights for policy 0, policy_version 554040 (0.0036) [2024-06-24 01:10:52,721][15401] Updated weights for policy 0, policy_version 554050 (0.0028) [2024-06-24 01:10:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9077555200. Throughput: 0: 42832.8. Samples: 9077636480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:10:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:10:57,188][15401] Updated weights for policy 0, policy_version 554060 (0.0046) [2024-06-24 01:10:58,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 9077784576. Throughput: 0: 42814.6. Samples: 9077898140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:10:58,393][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 01:11:00,476][15401] Updated weights for policy 0, policy_version 554070 (0.0036) [2024-06-24 01:11:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9077981184. Throughput: 0: 42867.1. Samples: 9078153140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 01:11:04,706][15401] Updated weights for policy 0, policy_version 554080 (0.0032) [2024-06-24 01:11:08,307][15401] Updated weights for policy 0, policy_version 554090 (0.0028) [2024-06-24 01:11:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9078210560. Throughput: 0: 42914.4. Samples: 9078282280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:08,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 01:11:12,123][15401] Updated weights for policy 0, policy_version 554100 (0.0044) [2024-06-24 01:11:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9078423552. Throughput: 0: 42643.9. Samples: 9078532820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:13,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 01:11:15,940][15401] Updated weights for policy 0, policy_version 554110 (0.0028) [2024-06-24 01:11:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9078620160. Throughput: 0: 42741.0. Samples: 9078793340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 01:11:19,576][15401] Updated weights for policy 0, policy_version 554120 (0.0035) [2024-06-24 01:11:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42931.8). Total num frames: 9078849536. Throughput: 0: 42657.8. Samples: 9078918580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 01:11:23,425][15401] Updated weights for policy 0, policy_version 554130 (0.0036) [2024-06-24 01:11:27,666][15401] Updated weights for policy 0, policy_version 554140 (0.0042) [2024-06-24 01:11:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9079078912. Throughput: 0: 42603.9. Samples: 9079174660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:28,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 01:11:31,265][15401] Updated weights for policy 0, policy_version 554150 (0.0040) [2024-06-24 01:11:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 9079275520. Throughput: 0: 42665.8. Samples: 9079433180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 01:11:35,197][15401] Updated weights for policy 0, policy_version 554160 (0.0031) [2024-06-24 01:11:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 9079488512. Throughput: 0: 42744.9. Samples: 9079560100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:38,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 01:11:38,761][15401] Updated weights for policy 0, policy_version 554170 (0.0027) [2024-06-24 01:11:42,805][15401] Updated weights for policy 0, policy_version 554180 (0.0033) [2024-06-24 01:11:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9079717888. Throughput: 0: 42762.7. Samples: 9079822360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 01:11:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554183_9079734272.pth... [2024-06-24 01:11:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553555_9069445120.pth [2024-06-24 01:11:46,819][15401] Updated weights for policy 0, policy_version 554190 (0.0027) [2024-06-24 01:11:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9079914496. Throughput: 0: 42684.0. Samples: 9080073920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 01:11:50,517][15401] Updated weights for policy 0, policy_version 554200 (0.0037) [2024-06-24 01:11:50,876][15349] Signal inference workers to stop experience collection... (134500 times) [2024-06-24 01:11:50,888][15401] InferenceWorker_p0-w0: stopping experience collection (134500 times) [2024-06-24 01:11:50,939][15349] Signal inference workers to resume experience collection... (134500 times) [2024-06-24 01:11:50,939][15401] InferenceWorker_p0-w0: resuming experience collection (134500 times) [2024-06-24 01:11:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9080111104. Throughput: 0: 42467.9. Samples: 9080193440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:53,393][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 01:11:54,516][15401] Updated weights for policy 0, policy_version 554210 (0.0029) [2024-06-24 01:11:58,324][15401] Updated weights for policy 0, policy_version 554220 (0.0031) [2024-06-24 01:11:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 9080340480. Throughput: 0: 42685.3. Samples: 9080453660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:11:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 01:12:02,156][15401] Updated weights for policy 0, policy_version 554230 (0.0032) [2024-06-24 01:12:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9080537088. Throughput: 0: 42512.9. Samples: 9080706420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 01:12:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 01:12:05,883][15401] Updated weights for policy 0, policy_version 554240 (0.0029) [2024-06-24 01:12:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9080766464. Throughput: 0: 42513.3. Samples: 9080831680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 01:12:09,973][15401] Updated weights for policy 0, policy_version 554250 (0.0031) [2024-06-24 01:12:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9080963072. Throughput: 0: 42689.0. Samples: 9081095660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 01:12:13,729][15401] Updated weights for policy 0, policy_version 554260 (0.0036) [2024-06-24 01:12:17,497][15401] Updated weights for policy 0, policy_version 554270 (0.0031) [2024-06-24 01:12:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9081192448. Throughput: 0: 42518.3. Samples: 9081346500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 01:12:21,473][15401] Updated weights for policy 0, policy_version 554280 (0.0039) [2024-06-24 01:12:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9081405440. Throughput: 0: 42569.7. Samples: 9081475640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 01:12:25,307][15401] Updated weights for policy 0, policy_version 554290 (0.0047) [2024-06-24 01:12:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 9081585664. Throughput: 0: 42371.1. Samples: 9081729060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 01:12:29,240][15401] Updated weights for policy 0, policy_version 554300 (0.0033) [2024-06-24 01:12:33,251][15401] Updated weights for policy 0, policy_version 554310 (0.0029) [2024-06-24 01:12:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9081815040. Throughput: 0: 42430.6. Samples: 9081983300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 01:12:36,959][15401] Updated weights for policy 0, policy_version 554320 (0.0048) [2024-06-24 01:12:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9082044416. Throughput: 0: 42702.7. Samples: 9082114960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 01:12:40,916][15401] Updated weights for policy 0, policy_version 554330 (0.0039) [2024-06-24 01:12:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 9082224640. Throughput: 0: 42468.0. Samples: 9082364720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 01:12:44,524][15401] Updated weights for policy 0, policy_version 554340 (0.0028) [2024-06-24 01:12:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 9082437632. Throughput: 0: 42575.0. Samples: 9082622300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 01:12:48,737][15401] Updated weights for policy 0, policy_version 554350 (0.0035) [2024-06-24 01:12:52,172][15401] Updated weights for policy 0, policy_version 554360 (0.0029) [2024-06-24 01:12:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42598.7). Total num frames: 9082667008. Throughput: 0: 42623.0. Samples: 9082749720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 01:12:56,460][15401] Updated weights for policy 0, policy_version 554370 (0.0037) [2024-06-24 01:12:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9082880000. Throughput: 0: 42444.0. Samples: 9083005640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:12:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 01:13:00,149][15401] Updated weights for policy 0, policy_version 554380 (0.0034) [2024-06-24 01:13:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9083076608. Throughput: 0: 42435.5. Samples: 9083256100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 01:13:04,248][15401] Updated weights for policy 0, policy_version 554390 (0.0033) [2024-06-24 01:13:07,695][15401] Updated weights for policy 0, policy_version 554400 (0.0036) [2024-06-24 01:13:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9083322368. Throughput: 0: 42352.5. Samples: 9083381500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 01:13:11,958][15401] Updated weights for policy 0, policy_version 554410 (0.0024) [2024-06-24 01:13:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9083518976. Throughput: 0: 42502.2. Samples: 9083641660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 01:13:15,277][15401] Updated weights for policy 0, policy_version 554420 (0.0032) [2024-06-24 01:13:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9083731968. Throughput: 0: 42448.3. Samples: 9083893480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:18,395][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 01:13:19,474][15401] Updated weights for policy 0, policy_version 554430 (0.0045) [2024-06-24 01:13:23,285][15401] Updated weights for policy 0, policy_version 554440 (0.0030) [2024-06-24 01:13:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9083944960. Throughput: 0: 42502.2. Samples: 9084027560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 01:13:24,591][15349] Signal inference workers to stop experience collection... (134550 times) [2024-06-24 01:13:24,591][15349] Signal inference workers to resume experience collection... (134550 times) [2024-06-24 01:13:24,607][15401] InferenceWorker_p0-w0: stopping experience collection (134550 times) [2024-06-24 01:13:24,607][15401] InferenceWorker_p0-w0: resuming experience collection (134550 times) [2024-06-24 01:13:26,957][15401] Updated weights for policy 0, policy_version 554450 (0.0047) [2024-06-24 01:13:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9084141568. Throughput: 0: 42623.6. Samples: 9084282780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 01:13:30,737][15401] Updated weights for policy 0, policy_version 554460 (0.0041) [2024-06-24 01:13:33,392][15132] Fps is (10 sec: 45864.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9084403712. Throughput: 0: 42507.2. Samples: 9084535220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 01:13:33,392][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 01:13:34,901][15401] Updated weights for policy 0, policy_version 554470 (0.0034) [2024-06-24 01:13:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 9084583936. Throughput: 0: 42686.3. Samples: 9084670600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:13:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 01:13:38,925][15401] Updated weights for policy 0, policy_version 554480 (0.0035) [2024-06-24 01:13:42,516][15401] Updated weights for policy 0, policy_version 554490 (0.0031) [2024-06-24 01:13:43,389][15132] Fps is (10 sec: 37692.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9084780544. Throughput: 0: 42595.9. Samples: 9084922460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:13:43,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 01:13:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554492_9084796928.pth... [2024-06-24 01:13:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000553868_9074573312.pth [2024-06-24 01:13:46,384][15401] Updated weights for policy 0, policy_version 554500 (0.0034) [2024-06-24 01:13:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 9085026304. Throughput: 0: 42691.0. Samples: 9085177200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:13:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 01:13:50,026][15401] Updated weights for policy 0, policy_version 554510 (0.0037) [2024-06-24 01:13:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9085222912. Throughput: 0: 42932.8. Samples: 9085313480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:13:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 01:13:53,887][15401] Updated weights for policy 0, policy_version 554520 (0.0037) [2024-06-24 01:13:57,750][15401] Updated weights for policy 0, policy_version 554530 (0.0025) [2024-06-24 01:13:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9085435904. Throughput: 0: 42803.9. Samples: 9085567840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:13:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 01:14:01,255][15401] Updated weights for policy 0, policy_version 554540 (0.0029) [2024-06-24 01:14:03,394][15132] Fps is (10 sec: 44217.3, 60 sec: 43141.3, 300 sec: 42709.2). Total num frames: 9085665280. Throughput: 0: 42956.6. Samples: 9085826720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:03,394][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 01:14:05,660][15401] Updated weights for policy 0, policy_version 554550 (0.0026) [2024-06-24 01:14:08,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9085878272. Throughput: 0: 42842.9. Samples: 9085955480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 01:14:09,547][15401] Updated weights for policy 0, policy_version 554560 (0.0021) [2024-06-24 01:14:12,996][15401] Updated weights for policy 0, policy_version 554570 (0.0031) [2024-06-24 01:14:13,390][15132] Fps is (10 sec: 42617.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9086091264. Throughput: 0: 42953.7. Samples: 9086215700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 01:14:17,269][15401] Updated weights for policy 0, policy_version 554580 (0.0025) [2024-06-24 01:14:18,393][15132] Fps is (10 sec: 42581.3, 60 sec: 42868.7, 300 sec: 42653.4). Total num frames: 9086304256. Throughput: 0: 43032.8. Samples: 9086471760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:18,394][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 01:14:20,545][15401] Updated weights for policy 0, policy_version 554590 (0.0038) [2024-06-24 01:14:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9086533632. Throughput: 0: 42885.7. Samples: 9086600460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 01:14:24,857][15401] Updated weights for policy 0, policy_version 554600 (0.0043) [2024-06-24 01:14:28,257][15401] Updated weights for policy 0, policy_version 554610 (0.0045) [2024-06-24 01:14:28,389][15132] Fps is (10 sec: 42615.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9086730240. Throughput: 0: 42844.5. Samples: 9086850460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 01:14:32,345][15401] Updated weights for policy 0, policy_version 554620 (0.0028) [2024-06-24 01:14:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 9086926848. Throughput: 0: 43043.6. Samples: 9087114160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 01:14:35,824][15401] Updated weights for policy 0, policy_version 554630 (0.0038) [2024-06-24 01:14:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9087172608. Throughput: 0: 42858.7. Samples: 9087242120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 01:14:39,642][15401] Updated weights for policy 0, policy_version 554640 (0.0033) [2024-06-24 01:14:43,374][15401] Updated weights for policy 0, policy_version 554650 (0.0036) [2024-06-24 01:14:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9087385600. Throughput: 0: 43105.9. Samples: 9087507600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 01:14:45,811][15349] Signal inference workers to stop experience collection... (134600 times) [2024-06-24 01:14:45,812][15349] Signal inference workers to resume experience collection... (134600 times) [2024-06-24 01:14:45,846][15401] InferenceWorker_p0-w0: stopping experience collection (134600 times) [2024-06-24 01:14:45,846][15401] InferenceWorker_p0-w0: resuming experience collection (134600 times) [2024-06-24 01:14:47,112][15401] Updated weights for policy 0, policy_version 554660 (0.0040) [2024-06-24 01:14:48,393][15132] Fps is (10 sec: 42582.0, 60 sec: 42868.7, 300 sec: 42764.4). Total num frames: 9087598592. Throughput: 0: 43012.6. Samples: 9087762260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:48,394][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 01:14:51,075][15401] Updated weights for policy 0, policy_version 554670 (0.0035) [2024-06-24 01:14:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 9087811584. Throughput: 0: 42897.7. Samples: 9087885880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 01:14:54,975][15401] Updated weights for policy 0, policy_version 554680 (0.0025) [2024-06-24 01:14:58,389][15132] Fps is (10 sec: 40976.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9088008192. Throughput: 0: 42911.6. Samples: 9088146720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:14:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 01:14:58,871][15401] Updated weights for policy 0, policy_version 554690 (0.0024) [2024-06-24 01:15:02,346][15401] Updated weights for policy 0, policy_version 554700 (0.0045) [2024-06-24 01:15:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42874.8, 300 sec: 42709.5). Total num frames: 9088237568. Throughput: 0: 42887.8. Samples: 9088401540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:15:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 01:15:06,540][15401] Updated weights for policy 0, policy_version 554710 (0.0033) [2024-06-24 01:15:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9088466944. Throughput: 0: 42969.0. Samples: 9088534060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 01:15:09,886][15401] Updated weights for policy 0, policy_version 554720 (0.0030) [2024-06-24 01:15:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9088647168. Throughput: 0: 43134.6. Samples: 9088791520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 01:15:14,137][15401] Updated weights for policy 0, policy_version 554730 (0.0028) [2024-06-24 01:15:17,651][15401] Updated weights for policy 0, policy_version 554740 (0.0043) [2024-06-24 01:15:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43147.4, 300 sec: 42820.6). Total num frames: 9088892928. Throughput: 0: 42874.7. Samples: 9089043520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 01:15:22,007][15401] Updated weights for policy 0, policy_version 554750 (0.0037) [2024-06-24 01:15:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9089105920. Throughput: 0: 42996.9. Samples: 9089176980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 01:15:25,114][15401] Updated weights for policy 0, policy_version 554760 (0.0029) [2024-06-24 01:15:28,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 9089269760. Throughput: 0: 42689.3. Samples: 9089428620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:15:29,588][15401] Updated weights for policy 0, policy_version 554770 (0.0036) [2024-06-24 01:15:33,013][15401] Updated weights for policy 0, policy_version 554780 (0.0030) [2024-06-24 01:15:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 9089515520. Throughput: 0: 42651.2. Samples: 9089681500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:33,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:15:37,174][15401] Updated weights for policy 0, policy_version 554790 (0.0023) [2024-06-24 01:15:38,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9089761280. Throughput: 0: 42915.4. Samples: 9089817080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 01:15:40,801][15401] Updated weights for policy 0, policy_version 554800 (0.0027) [2024-06-24 01:15:43,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9089908736. Throughput: 0: 42614.5. Samples: 9090064380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:43,396][15132] Avg episode reward: [(0, '0.197')] [2024-06-24 01:15:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554804_9089908736.pth... [2024-06-24 01:15:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554183_9079734272.pth [2024-06-24 01:15:44,758][15401] Updated weights for policy 0, policy_version 554810 (0.0032) [2024-06-24 01:15:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42601.2, 300 sec: 42709.5). Total num frames: 9090154496. Throughput: 0: 42596.0. Samples: 9090318360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:48,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 01:15:48,421][15401] Updated weights for policy 0, policy_version 554820 (0.0026) [2024-06-24 01:15:52,552][15401] Updated weights for policy 0, policy_version 554830 (0.0026) [2024-06-24 01:15:53,196][15349] Signal inference workers to stop experience collection... (134650 times) [2024-06-24 01:15:53,221][15401] InferenceWorker_p0-w0: stopping experience collection (134650 times) [2024-06-24 01:15:53,313][15349] Signal inference workers to resume experience collection... (134650 times) [2024-06-24 01:15:53,314][15401] InferenceWorker_p0-w0: resuming experience collection (134650 times) [2024-06-24 01:15:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9090367488. Throughput: 0: 42616.9. Samples: 9090451820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 01:15:56,226][15401] Updated weights for policy 0, policy_version 554840 (0.0035) [2024-06-24 01:15:58,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 9090547712. Throughput: 0: 42298.6. Samples: 9090695060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:15:58,392][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 01:16:00,393][15401] Updated weights for policy 0, policy_version 554850 (0.0031) [2024-06-24 01:16:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9090793472. Throughput: 0: 42334.2. Samples: 9090948560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 01:16:03,772][15401] Updated weights for policy 0, policy_version 554860 (0.0031) [2024-06-24 01:16:08,009][15401] Updated weights for policy 0, policy_version 554870 (0.0037) [2024-06-24 01:16:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9090990080. Throughput: 0: 42424.5. Samples: 9091086080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 01:16:11,303][15401] Updated weights for policy 0, policy_version 554880 (0.0041) [2024-06-24 01:16:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9091203072. Throughput: 0: 42446.7. Samples: 9091338720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 01:16:15,629][15401] Updated weights for policy 0, policy_version 554890 (0.0025) [2024-06-24 01:16:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9091448832. Throughput: 0: 42318.2. Samples: 9091585720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 01:16:19,134][15401] Updated weights for policy 0, policy_version 554900 (0.0027) [2024-06-24 01:16:23,343][15401] Updated weights for policy 0, policy_version 554910 (0.0032) [2024-06-24 01:16:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9091645440. Throughput: 0: 42361.3. Samples: 9091723340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:23,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 01:16:26,633][15401] Updated weights for policy 0, policy_version 554920 (0.0037) [2024-06-24 01:16:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9091842048. Throughput: 0: 42503.7. Samples: 9091977040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 01:16:31,190][15401] Updated weights for policy 0, policy_version 554930 (0.0027) [2024-06-24 01:16:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43146.2, 300 sec: 42765.4). Total num frames: 9092104192. Throughput: 0: 42457.2. Samples: 9092228940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 01:16:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 01:16:34,201][15401] Updated weights for policy 0, policy_version 554940 (0.0037) [2024-06-24 01:16:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 42487.3). Total num frames: 9092251648. Throughput: 0: 42466.6. Samples: 9092362820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:16:38,398][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 01:16:38,832][15401] Updated weights for policy 0, policy_version 554950 (0.0038) [2024-06-24 01:16:41,700][15401] Updated weights for policy 0, policy_version 554960 (0.0034) [2024-06-24 01:16:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9092497408. Throughput: 0: 42696.9. Samples: 9092616320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:16:43,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 01:16:46,374][15401] Updated weights for policy 0, policy_version 554970 (0.0036) [2024-06-24 01:16:48,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 9092743168. Throughput: 0: 42798.2. Samples: 9092874480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:16:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 01:16:49,191][15401] Updated weights for policy 0, policy_version 554980 (0.0035) [2024-06-24 01:16:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9092907008. Throughput: 0: 42840.9. Samples: 9093013920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:16:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 01:16:53,961][15401] Updated weights for policy 0, policy_version 554990 (0.0034) [2024-06-24 01:16:56,795][15401] Updated weights for policy 0, policy_version 555000 (0.0032) [2024-06-24 01:16:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 9093136384. Throughput: 0: 42662.6. Samples: 9093258540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:16:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:17:01,590][15401] Updated weights for policy 0, policy_version 555010 (0.0033) [2024-06-24 01:17:02,767][15349] Signal inference workers to stop experience collection... (134700 times) [2024-06-24 01:17:02,816][15401] InferenceWorker_p0-w0: stopping experience collection (134700 times) [2024-06-24 01:17:02,877][15349] Signal inference workers to resume experience collection... (134700 times) [2024-06-24 01:17:02,877][15401] InferenceWorker_p0-w0: resuming experience collection (134700 times) [2024-06-24 01:17:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9093382144. Throughput: 0: 43087.7. Samples: 9093524660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 01:17:04,487][15401] Updated weights for policy 0, policy_version 555020 (0.0032) [2024-06-24 01:17:08,390][15132] Fps is (10 sec: 40958.3, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 9093545984. Throughput: 0: 43007.3. Samples: 9093658680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:08,391][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 01:17:09,514][15401] Updated weights for policy 0, policy_version 555030 (0.0040) [2024-06-24 01:17:12,136][15401] Updated weights for policy 0, policy_version 555040 (0.0030) [2024-06-24 01:17:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9093791744. Throughput: 0: 42963.6. Samples: 9093910400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:13,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-24 01:17:17,053][15401] Updated weights for policy 0, policy_version 555050 (0.0033) [2024-06-24 01:17:18,389][15132] Fps is (10 sec: 47516.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9094021120. Throughput: 0: 43244.6. Samples: 9094174940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:18,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 01:17:19,711][15401] Updated weights for policy 0, policy_version 555060 (0.0027) [2024-06-24 01:17:23,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 9094201344. Throughput: 0: 43057.0. Samples: 9094300660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:23,397][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 01:17:24,660][15401] Updated weights for policy 0, policy_version 555070 (0.0022) [2024-06-24 01:17:27,672][15401] Updated weights for policy 0, policy_version 555080 (0.0034) [2024-06-24 01:17:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9094447104. Throughput: 0: 42993.0. Samples: 9094551000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:28,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 01:17:32,331][15401] Updated weights for policy 0, policy_version 555090 (0.0027) [2024-06-24 01:17:33,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9094643712. Throughput: 0: 43059.6. Samples: 9094812160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:33,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 01:17:35,417][15401] Updated weights for policy 0, policy_version 555100 (0.0030) [2024-06-24 01:17:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9094856704. Throughput: 0: 42698.7. Samples: 9094935360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:38,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 01:17:39,995][15401] Updated weights for policy 0, policy_version 555110 (0.0031) [2024-06-24 01:17:43,063][15401] Updated weights for policy 0, policy_version 555120 (0.0024) [2024-06-24 01:17:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9095086080. Throughput: 0: 43027.0. Samples: 9095194760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 01:17:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555121_9095102464.pth... [2024-06-24 01:17:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554492_9084796928.pth [2024-06-24 01:17:47,716][15401] Updated weights for policy 0, policy_version 555130 (0.0042) [2024-06-24 01:17:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 9095266304. Throughput: 0: 42788.4. Samples: 9095450140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:48,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 01:17:51,038][15401] Updated weights for policy 0, policy_version 555140 (0.0032) [2024-06-24 01:17:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9095495680. Throughput: 0: 42490.5. Samples: 9095570740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 01:17:55,478][15401] Updated weights for policy 0, policy_version 555150 (0.0034) [2024-06-24 01:17:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9095708672. Throughput: 0: 42575.5. Samples: 9095826300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:17:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 01:17:58,929][15401] Updated weights for policy 0, policy_version 555160 (0.0039) [2024-06-24 01:18:03,030][15401] Updated weights for policy 0, policy_version 555170 (0.0034) [2024-06-24 01:18:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9095905280. Throughput: 0: 42320.8. Samples: 9096079380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:18:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 01:18:06,613][15401] Updated weights for policy 0, policy_version 555180 (0.0036) [2024-06-24 01:18:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.9, 300 sec: 42765.0). Total num frames: 9096134656. Throughput: 0: 42330.6. Samples: 9096205260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 01:18:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 01:18:10,994][15401] Updated weights for policy 0, policy_version 555190 (0.0025) [2024-06-24 01:18:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 9096331264. Throughput: 0: 42463.9. Samples: 9096461980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:13,393][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 01:18:14,318][15401] Updated weights for policy 0, policy_version 555200 (0.0027) [2024-06-24 01:18:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 9096544256. Throughput: 0: 42355.0. Samples: 9096718140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:18,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 01:18:18,557][15401] Updated weights for policy 0, policy_version 555210 (0.0029) [2024-06-24 01:18:22,094][15401] Updated weights for policy 0, policy_version 555220 (0.0029) [2024-06-24 01:18:23,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 9096773632. Throughput: 0: 42412.1. Samples: 9096843900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 01:18:26,001][15401] Updated weights for policy 0, policy_version 555230 (0.0038) [2024-06-24 01:18:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 9096970240. Throughput: 0: 42403.7. Samples: 9097102920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 01:18:29,834][15401] Updated weights for policy 0, policy_version 555240 (0.0036) [2024-06-24 01:18:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9097183232. Throughput: 0: 42487.9. Samples: 9097362100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 01:18:34,086][15401] Updated weights for policy 0, policy_version 555250 (0.0025) [2024-06-24 01:18:34,521][15349] Signal inference workers to stop experience collection... (134750 times) [2024-06-24 01:18:34,521][15349] Signal inference workers to resume experience collection... (134750 times) [2024-06-24 01:18:34,557][15401] InferenceWorker_p0-w0: stopping experience collection (134750 times) [2024-06-24 01:18:34,557][15401] InferenceWorker_p0-w0: resuming experience collection (134750 times) [2024-06-24 01:18:37,445][15401] Updated weights for policy 0, policy_version 555260 (0.0033) [2024-06-24 01:18:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9097428992. Throughput: 0: 42593.3. Samples: 9097487440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 01:18:41,539][15401] Updated weights for policy 0, policy_version 555270 (0.0034) [2024-06-24 01:18:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9097625600. Throughput: 0: 42630.7. Samples: 9097744680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:43,390][15132] Avg episode reward: [(0, '0.114')] [2024-06-24 01:18:45,143][15401] Updated weights for policy 0, policy_version 555280 (0.0040) [2024-06-24 01:18:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9097822208. Throughput: 0: 42655.9. Samples: 9097998900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 01:18:49,249][15401] Updated weights for policy 0, policy_version 555290 (0.0046) [2024-06-24 01:18:52,863][15401] Updated weights for policy 0, policy_version 555300 (0.0032) [2024-06-24 01:18:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9098051584. Throughput: 0: 42640.3. Samples: 9098124080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 01:18:57,181][15401] Updated weights for policy 0, policy_version 555310 (0.0043) [2024-06-24 01:18:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 9098264576. Throughput: 0: 42795.7. Samples: 9098387680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:18:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 01:19:00,597][15401] Updated weights for policy 0, policy_version 555320 (0.0038) [2024-06-24 01:19:03,394][15132] Fps is (10 sec: 40942.9, 60 sec: 42595.4, 300 sec: 42653.3). Total num frames: 9098461184. Throughput: 0: 42720.9. Samples: 9098640760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:03,394][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 01:19:04,760][15401] Updated weights for policy 0, policy_version 555330 (0.0039) [2024-06-24 01:19:08,154][15401] Updated weights for policy 0, policy_version 555340 (0.0023) [2024-06-24 01:19:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9098690560. Throughput: 0: 42701.1. Samples: 9098765460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 01:19:12,178][15401] Updated weights for policy 0, policy_version 555350 (0.0038) [2024-06-24 01:19:13,389][15132] Fps is (10 sec: 44255.7, 60 sec: 42873.2, 300 sec: 42710.0). Total num frames: 9098903552. Throughput: 0: 42806.2. Samples: 9099029200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 01:19:16,130][15401] Updated weights for policy 0, policy_version 555360 (0.0030) [2024-06-24 01:19:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9099116544. Throughput: 0: 42565.4. Samples: 9099277540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 01:19:20,041][15401] Updated weights for policy 0, policy_version 555370 (0.0028) [2024-06-24 01:19:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9099313152. Throughput: 0: 42641.0. Samples: 9099406280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 01:19:23,579][15401] Updated weights for policy 0, policy_version 555380 (0.0037) [2024-06-24 01:19:27,674][15401] Updated weights for policy 0, policy_version 555390 (0.0033) [2024-06-24 01:19:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9099542528. Throughput: 0: 42655.9. Samples: 9099664200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 01:19:31,195][15401] Updated weights for policy 0, policy_version 555400 (0.0034) [2024-06-24 01:19:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9099771904. Throughput: 0: 42575.6. Samples: 9099914800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:33,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-24 01:19:35,411][15401] Updated weights for policy 0, policy_version 555410 (0.0031) [2024-06-24 01:19:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9099968512. Throughput: 0: 42744.6. Samples: 9100047580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 01:19:38,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 01:19:39,046][15401] Updated weights for policy 0, policy_version 555420 (0.0052) [2024-06-24 01:19:43,187][15401] Updated weights for policy 0, policy_version 555430 (0.0037) [2024-06-24 01:19:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42599.0). Total num frames: 9100165120. Throughput: 0: 42600.4. Samples: 9100304700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:19:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 01:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555431_9100181504.pth... [2024-06-24 01:19:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000554804_9089908736.pth [2024-06-24 01:19:46,863][15401] Updated weights for policy 0, policy_version 555440 (0.0028) [2024-06-24 01:19:47,849][15349] Signal inference workers to stop experience collection... (134800 times) [2024-06-24 01:19:47,879][15401] InferenceWorker_p0-w0: stopping experience collection (134800 times) [2024-06-24 01:19:47,903][15349] Signal inference workers to resume experience collection... (134800 times) [2024-06-24 01:19:47,904][15401] InferenceWorker_p0-w0: resuming experience collection (134800 times) [2024-06-24 01:19:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9100410880. Throughput: 0: 42552.0. Samples: 9100555420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:19:48,395][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 01:19:50,831][15401] Updated weights for policy 0, policy_version 555450 (0.0048) [2024-06-24 01:19:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9100607488. Throughput: 0: 42805.3. Samples: 9100691700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:19:53,390][15132] Avg episode reward: [(0, '0.165')] [2024-06-24 01:19:54,397][15401] Updated weights for policy 0, policy_version 555460 (0.0034) [2024-06-24 01:19:58,372][15401] Updated weights for policy 0, policy_version 555470 (0.0030) [2024-06-24 01:19:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9100820480. Throughput: 0: 42462.3. Samples: 9100940000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:19:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 01:20:02,140][15401] Updated weights for policy 0, policy_version 555480 (0.0047) [2024-06-24 01:20:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43147.6, 300 sec: 42653.9). Total num frames: 9101049856. Throughput: 0: 42614.7. Samples: 9101195200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 01:20:05,871][15401] Updated weights for policy 0, policy_version 555490 (0.0044) [2024-06-24 01:20:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9101246464. Throughput: 0: 42667.1. Samples: 9101326300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:20:09,834][15401] Updated weights for policy 0, policy_version 555500 (0.0044) [2024-06-24 01:20:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9101459456. Throughput: 0: 42558.7. Samples: 9101579340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 01:20:13,448][15401] Updated weights for policy 0, policy_version 555510 (0.0031) [2024-06-24 01:20:17,489][15401] Updated weights for policy 0, policy_version 555520 (0.0027) [2024-06-24 01:20:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 9101672448. Throughput: 0: 42610.1. Samples: 9101832360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:18,393][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 01:20:20,955][15401] Updated weights for policy 0, policy_version 555530 (0.0031) [2024-06-24 01:20:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9101869056. Throughput: 0: 42527.5. Samples: 9101961320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 01:20:25,206][15401] Updated weights for policy 0, policy_version 555540 (0.0035) [2024-06-24 01:20:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9102098432. Throughput: 0: 42566.6. Samples: 9102220200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 01:20:28,550][15401] Updated weights for policy 0, policy_version 555550 (0.0031) [2024-06-24 01:20:32,801][15401] Updated weights for policy 0, policy_version 555560 (0.0035) [2024-06-24 01:20:33,391][15132] Fps is (10 sec: 44230.6, 60 sec: 42324.4, 300 sec: 42542.7). Total num frames: 9102311424. Throughput: 0: 42669.4. Samples: 9102475600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:33,391][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 01:20:36,477][15401] Updated weights for policy 0, policy_version 555570 (0.0037) [2024-06-24 01:20:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 9102508032. Throughput: 0: 42457.4. Samples: 9102602380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:38,393][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 01:20:40,541][15401] Updated weights for policy 0, policy_version 555580 (0.0033) [2024-06-24 01:20:43,392][15132] Fps is (10 sec: 44231.8, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 9102753792. Throughput: 0: 42605.1. Samples: 9102857340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:43,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 01:20:44,077][15401] Updated weights for policy 0, policy_version 555590 (0.0044) [2024-06-24 01:20:48,330][15401] Updated weights for policy 0, policy_version 555600 (0.0043) [2024-06-24 01:20:48,392][15132] Fps is (10 sec: 44237.0, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 9102950400. Throughput: 0: 42717.7. Samples: 9103117600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:48,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 01:20:51,710][15401] Updated weights for policy 0, policy_version 555610 (0.0038) [2024-06-24 01:20:53,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42325.5, 300 sec: 42709.8). Total num frames: 9103147008. Throughput: 0: 42518.3. Samples: 9103239620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 01:20:56,069][15401] Updated weights for policy 0, policy_version 555620 (0.0028) [2024-06-24 01:20:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9103376384. Throughput: 0: 42645.8. Samples: 9103498400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:20:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 01:20:59,246][15401] Updated weights for policy 0, policy_version 555630 (0.0033) [2024-06-24 01:21:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9103556608. Throughput: 0: 42721.4. Samples: 9103754720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:21:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 01:21:03,785][15401] Updated weights for policy 0, policy_version 555640 (0.0058) [2024-06-24 01:21:07,253][15401] Updated weights for policy 0, policy_version 555650 (0.0047) [2024-06-24 01:21:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 9103785984. Throughput: 0: 42516.4. Samples: 9103874660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 01:21:08,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 01:21:11,605][15401] Updated weights for policy 0, policy_version 555660 (0.0027) [2024-06-24 01:21:13,345][15349] Signal inference workers to stop experience collection... (134850 times) [2024-06-24 01:21:13,345][15349] Signal inference workers to resume experience collection... (134850 times) [2024-06-24 01:21:13,373][15401] InferenceWorker_p0-w0: stopping experience collection (134850 times) [2024-06-24 01:21:13,373][15401] InferenceWorker_p0-w0: resuming experience collection (134850 times) [2024-06-24 01:21:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9104015360. Throughput: 0: 42502.6. Samples: 9104132820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 01:21:14,851][15401] Updated weights for policy 0, policy_version 555670 (0.0037) [2024-06-24 01:21:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 9104211968. Throughput: 0: 42565.7. Samples: 9104391000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 01:21:19,281][15401] Updated weights for policy 0, policy_version 555680 (0.0036) [2024-06-24 01:21:22,755][15401] Updated weights for policy 0, policy_version 555690 (0.0033) [2024-06-24 01:21:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9104441344. Throughput: 0: 42494.3. Samples: 9104514520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 01:21:27,160][15401] Updated weights for policy 0, policy_version 555700 (0.0049) [2024-06-24 01:21:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9104654336. Throughput: 0: 42564.1. Samples: 9104772620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 01:21:30,454][15401] Updated weights for policy 0, policy_version 555710 (0.0029) [2024-06-24 01:21:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42326.2, 300 sec: 42709.5). Total num frames: 9104850944. Throughput: 0: 42434.2. Samples: 9105027040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:21:34,741][15401] Updated weights for policy 0, policy_version 555720 (0.0034) [2024-06-24 01:21:38,084][15401] Updated weights for policy 0, policy_version 555730 (0.0029) [2024-06-24 01:21:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 9105080320. Throughput: 0: 42493.2. Samples: 9105151820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 01:21:42,297][15401] Updated weights for policy 0, policy_version 555740 (0.0038) [2024-06-24 01:21:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 9105293312. Throughput: 0: 42497.3. Samples: 9105410780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 01:21:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555743_9105293312.pth... [2024-06-24 01:21:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555121_9095102464.pth [2024-06-24 01:21:45,917][15401] Updated weights for policy 0, policy_version 555750 (0.0040) [2024-06-24 01:21:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 9105489920. Throughput: 0: 42464.9. Samples: 9105665640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 01:21:49,969][15401] Updated weights for policy 0, policy_version 555760 (0.0027) [2024-06-24 01:21:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9105719296. Throughput: 0: 42516.1. Samples: 9105787780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 01:21:53,591][15401] Updated weights for policy 0, policy_version 555770 (0.0031) [2024-06-24 01:21:57,526][15401] Updated weights for policy 0, policy_version 555780 (0.0026) [2024-06-24 01:21:58,394][15132] Fps is (10 sec: 42579.1, 60 sec: 42322.1, 300 sec: 42486.7). Total num frames: 9105915904. Throughput: 0: 42561.5. Samples: 9106048280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:21:58,395][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 01:22:01,094][15401] Updated weights for policy 0, policy_version 555790 (0.0023) [2024-06-24 01:22:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9106145280. Throughput: 0: 42685.8. Samples: 9106311860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 01:22:04,955][15401] Updated weights for policy 0, policy_version 555800 (0.0043) [2024-06-24 01:22:08,389][15132] Fps is (10 sec: 44257.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 9106358272. Throughput: 0: 42620.0. Samples: 9106432420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 01:22:08,935][15401] Updated weights for policy 0, policy_version 555810 (0.0034) [2024-06-24 01:22:12,956][15401] Updated weights for policy 0, policy_version 555820 (0.0027) [2024-06-24 01:22:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 9106571264. Throughput: 0: 42649.4. Samples: 9106691840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 01:22:16,596][15401] Updated weights for policy 0, policy_version 555830 (0.0026) [2024-06-24 01:22:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 9106767872. Throughput: 0: 42701.9. Samples: 9106948620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 01:22:20,723][15401] Updated weights for policy 0, policy_version 555840 (0.0026) [2024-06-24 01:22:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9107013632. Throughput: 0: 42704.4. Samples: 9107073520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:23,399][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 01:22:24,018][15401] Updated weights for policy 0, policy_version 555850 (0.0046) [2024-06-24 01:22:28,269][15401] Updated weights for policy 0, policy_version 555860 (0.0040) [2024-06-24 01:22:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9107210240. Throughput: 0: 42703.0. Samples: 9107332420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 01:22:31,928][15401] Updated weights for policy 0, policy_version 555870 (0.0041) [2024-06-24 01:22:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9107406848. Throughput: 0: 42790.3. Samples: 9107591200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:33,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 01:22:35,794][15401] Updated weights for policy 0, policy_version 555880 (0.0035) [2024-06-24 01:22:38,396][15132] Fps is (10 sec: 44209.1, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 9107652608. Throughput: 0: 42895.6. Samples: 9107718360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 01:22:38,397][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 01:22:39,632][15401] Updated weights for policy 0, policy_version 555890 (0.0042) [2024-06-24 01:22:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9107849216. Throughput: 0: 42765.8. Samples: 9107972540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:22:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 01:22:43,422][15401] Updated weights for policy 0, policy_version 555900 (0.0033) [2024-06-24 01:22:47,457][15401] Updated weights for policy 0, policy_version 555910 (0.0031) [2024-06-24 01:22:48,392][15132] Fps is (10 sec: 39337.3, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 9108045824. Throughput: 0: 42591.5. Samples: 9108228580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:22:48,392][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 01:22:51,064][15401] Updated weights for policy 0, policy_version 555920 (0.0032) [2024-06-24 01:22:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9108291584. Throughput: 0: 42652.9. Samples: 9108351800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:22:53,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 01:22:55,081][15401] Updated weights for policy 0, policy_version 555930 (0.0040) [2024-06-24 01:22:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42874.7, 300 sec: 42653.9). Total num frames: 9108488192. Throughput: 0: 42652.8. Samples: 9108611220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:22:58,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 01:22:58,866][15401] Updated weights for policy 0, policy_version 555940 (0.0043) [2024-06-24 01:23:02,794][15401] Updated weights for policy 0, policy_version 555950 (0.0028) [2024-06-24 01:23:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 9108684800. Throughput: 0: 42730.1. Samples: 9108871480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 01:23:04,201][15349] Signal inference workers to stop experience collection... (134900 times) [2024-06-24 01:23:04,236][15401] InferenceWorker_p0-w0: stopping experience collection (134900 times) [2024-06-24 01:23:04,258][15349] Signal inference workers to resume experience collection... (134900 times) [2024-06-24 01:23:04,259][15401] InferenceWorker_p0-w0: resuming experience collection (134900 times) [2024-06-24 01:23:06,436][15401] Updated weights for policy 0, policy_version 555960 (0.0042) [2024-06-24 01:23:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 9108930560. Throughput: 0: 42792.9. Samples: 9108999200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 01:23:10,301][15401] Updated weights for policy 0, policy_version 555970 (0.0033) [2024-06-24 01:23:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9109127168. Throughput: 0: 42768.5. Samples: 9109257000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:13,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 01:23:14,313][15401] Updated weights for policy 0, policy_version 555980 (0.0036) [2024-06-24 01:23:17,911][15401] Updated weights for policy 0, policy_version 555990 (0.0038) [2024-06-24 01:23:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9109356544. Throughput: 0: 42737.7. Samples: 9109514400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 01:23:21,878][15401] Updated weights for policy 0, policy_version 556000 (0.0030) [2024-06-24 01:23:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9109585920. Throughput: 0: 42842.1. Samples: 9109645980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 01:23:25,665][15401] Updated weights for policy 0, policy_version 556010 (0.0041) [2024-06-24 01:23:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9109782528. Throughput: 0: 42933.7. Samples: 9109904560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 01:23:29,447][15401] Updated weights for policy 0, policy_version 556020 (0.0037) [2024-06-24 01:23:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9109979136. Throughput: 0: 42879.3. Samples: 9110158040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:33,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 01:23:33,577][15401] Updated weights for policy 0, policy_version 556030 (0.0038) [2024-06-24 01:23:36,875][15401] Updated weights for policy 0, policy_version 556040 (0.0034) [2024-06-24 01:23:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 9110208512. Throughput: 0: 43013.8. Samples: 9110287420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 01:23:41,035][15401] Updated weights for policy 0, policy_version 556050 (0.0027) [2024-06-24 01:23:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9110421504. Throughput: 0: 42873.7. Samples: 9110540540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 01:23:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556056_9110421504.pth... [2024-06-24 01:23:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555431_9100181504.pth [2024-06-24 01:23:44,560][15401] Updated weights for policy 0, policy_version 556060 (0.0049) [2024-06-24 01:23:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 9110618112. Throughput: 0: 42735.2. Samples: 9110794560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 01:23:48,730][15401] Updated weights for policy 0, policy_version 556070 (0.0043) [2024-06-24 01:23:52,314][15401] Updated weights for policy 0, policy_version 556080 (0.0044) [2024-06-24 01:23:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9110863872. Throughput: 0: 42896.5. Samples: 9110929540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 01:23:56,211][15401] Updated weights for policy 0, policy_version 556090 (0.0030) [2024-06-24 01:23:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42710.1). Total num frames: 9111060480. Throughput: 0: 42888.9. Samples: 9111187000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:23:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 01:23:59,774][15401] Updated weights for policy 0, policy_version 556100 (0.0031) [2024-06-24 01:24:03,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 9111257088. Throughput: 0: 42940.0. Samples: 9111446800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:24:03,392][15132] Avg episode reward: [(0, '0.201')] [2024-06-24 01:24:03,823][15401] Updated weights for policy 0, policy_version 556110 (0.0033) [2024-06-24 01:24:07,196][15401] Updated weights for policy 0, policy_version 556120 (0.0033) [2024-06-24 01:24:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9111502848. Throughput: 0: 42941.4. Samples: 9111578340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 01:24:08,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 01:24:11,581][15401] Updated weights for policy 0, policy_version 556130 (0.0036) [2024-06-24 01:24:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9111683072. Throughput: 0: 42767.2. Samples: 9111829080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 01:24:14,763][15401] Updated weights for policy 0, policy_version 556140 (0.0046) [2024-06-24 01:24:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9111896064. Throughput: 0: 43059.4. Samples: 9112095720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 01:24:19,233][15401] Updated weights for policy 0, policy_version 556150 (0.0034) [2024-06-24 01:24:21,935][15349] Signal inference workers to stop experience collection... (134950 times) [2024-06-24 01:24:21,972][15401] InferenceWorker_p0-w0: stopping experience collection (134950 times) [2024-06-24 01:24:21,992][15349] Signal inference workers to resume experience collection... (134950 times) [2024-06-24 01:24:21,993][15401] InferenceWorker_p0-w0: resuming experience collection (134950 times) [2024-06-24 01:24:22,388][15401] Updated weights for policy 0, policy_version 556160 (0.0039) [2024-06-24 01:24:23,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9112158208. Throughput: 0: 42931.1. Samples: 9112219320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 01:24:26,775][15401] Updated weights for policy 0, policy_version 556170 (0.0026) [2024-06-24 01:24:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9112338432. Throughput: 0: 43011.3. Samples: 9112476040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 01:24:29,945][15401] Updated weights for policy 0, policy_version 556180 (0.0034) [2024-06-24 01:24:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9112535040. Throughput: 0: 43225.7. Samples: 9112739720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 01:24:34,469][15401] Updated weights for policy 0, policy_version 556190 (0.0030) [2024-06-24 01:24:37,564][15401] Updated weights for policy 0, policy_version 556200 (0.0034) [2024-06-24 01:24:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9112797184. Throughput: 0: 42933.4. Samples: 9112861540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 01:24:42,090][15401] Updated weights for policy 0, policy_version 556210 (0.0031) [2024-06-24 01:24:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 9113010176. Throughput: 0: 43035.2. Samples: 9113123580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 01:24:45,669][15401] Updated weights for policy 0, policy_version 556220 (0.0042) [2024-06-24 01:24:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 9113190400. Throughput: 0: 42945.8. Samples: 9113379260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:48,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 01:24:50,041][15401] Updated weights for policy 0, policy_version 556230 (0.0036) [2024-06-24 01:24:53,251][15401] Updated weights for policy 0, policy_version 556240 (0.0034) [2024-06-24 01:24:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9113436160. Throughput: 0: 42768.8. Samples: 9113502940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 01:24:57,637][15401] Updated weights for policy 0, policy_version 556250 (0.0042) [2024-06-24 01:24:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9113649152. Throughput: 0: 43010.5. Samples: 9113764560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:24:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 01:25:00,882][15401] Updated weights for policy 0, policy_version 556260 (0.0038) [2024-06-24 01:25:03,394][15132] Fps is (10 sec: 40940.5, 60 sec: 43142.8, 300 sec: 42708.8). Total num frames: 9113845760. Throughput: 0: 42793.7. Samples: 9114021640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:03,395][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 01:25:05,433][15401] Updated weights for policy 0, policy_version 556270 (0.0032) [2024-06-24 01:25:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9114058752. Throughput: 0: 42793.8. Samples: 9114145040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 01:25:08,853][15401] Updated weights for policy 0, policy_version 556280 (0.0047) [2024-06-24 01:25:12,740][15401] Updated weights for policy 0, policy_version 556290 (0.0039) [2024-06-24 01:25:13,389][15132] Fps is (10 sec: 42619.3, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 9114271744. Throughput: 0: 42892.5. Samples: 9114406200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 01:25:16,225][15401] Updated weights for policy 0, policy_version 556300 (0.0042) [2024-06-24 01:25:18,396][15132] Fps is (10 sec: 44208.9, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 9114501120. Throughput: 0: 42715.3. Samples: 9114662180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:18,396][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 01:25:20,252][15401] Updated weights for policy 0, policy_version 556310 (0.0034) [2024-06-24 01:25:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9114714112. Throughput: 0: 42950.7. Samples: 9114794320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 01:25:23,917][15401] Updated weights for policy 0, policy_version 556320 (0.0037) [2024-06-24 01:25:27,723][15401] Updated weights for policy 0, policy_version 556330 (0.0040) [2024-06-24 01:25:28,389][15132] Fps is (10 sec: 42625.8, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 9114927104. Throughput: 0: 42918.2. Samples: 9115054900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 01:25:31,554][15401] Updated weights for policy 0, policy_version 556340 (0.0029) [2024-06-24 01:25:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43690.7, 300 sec: 42876.4). Total num frames: 9115156480. Throughput: 0: 42913.4. Samples: 9115310360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 01:25:35,516][15401] Updated weights for policy 0, policy_version 556350 (0.0032) [2024-06-24 01:25:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 9115353088. Throughput: 0: 42964.9. Samples: 9115436360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 01:25:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 01:25:39,163][15401] Updated weights for policy 0, policy_version 556360 (0.0030) [2024-06-24 01:25:43,035][15401] Updated weights for policy 0, policy_version 556370 (0.0029) [2024-06-24 01:25:43,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42593.8, 300 sec: 42764.4). Total num frames: 9115566080. Throughput: 0: 42900.7. Samples: 9115695360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:25:43,397][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 01:25:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556370_9115566080.pth... [2024-06-24 01:25:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000555743_9105293312.pth [2024-06-24 01:25:46,618][15349] Signal inference workers to stop experience collection... (135000 times) [2024-06-24 01:25:46,666][15401] InferenceWorker_p0-w0: stopping experience collection (135000 times) [2024-06-24 01:25:46,677][15349] Signal inference workers to resume experience collection... (135000 times) [2024-06-24 01:25:46,682][15401] InferenceWorker_p0-w0: resuming experience collection (135000 times) [2024-06-24 01:25:47,014][15401] Updated weights for policy 0, policy_version 556380 (0.0031) [2024-06-24 01:25:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9115795456. Throughput: 0: 42644.9. Samples: 9115940460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:25:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 01:25:50,569][15401] Updated weights for policy 0, policy_version 556390 (0.0025) [2024-06-24 01:25:53,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9115975680. Throughput: 0: 42832.4. Samples: 9116072500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:25:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 01:25:54,473][15401] Updated weights for policy 0, policy_version 556400 (0.0031) [2024-06-24 01:25:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9116205056. Throughput: 0: 42807.0. Samples: 9116332520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:25:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 01:25:58,651][15401] Updated weights for policy 0, policy_version 556410 (0.0037) [2024-06-24 01:26:02,189][15401] Updated weights for policy 0, policy_version 556420 (0.0049) [2024-06-24 01:26:03,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43421.0, 300 sec: 42932.0). Total num frames: 9116450816. Throughput: 0: 42682.8. Samples: 9116582640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 01:26:06,206][15401] Updated weights for policy 0, policy_version 556430 (0.0031) [2024-06-24 01:26:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9116614656. Throughput: 0: 42683.3. Samples: 9116715080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:08,391][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 01:26:09,846][15401] Updated weights for policy 0, policy_version 556440 (0.0029) [2024-06-24 01:26:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9116844032. Throughput: 0: 42544.8. Samples: 9116969420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:13,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 01:26:13,746][15401] Updated weights for policy 0, policy_version 556450 (0.0022) [2024-06-24 01:26:17,719][15401] Updated weights for policy 0, policy_version 556460 (0.0050) [2024-06-24 01:26:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 9117057024. Throughput: 0: 42478.1. Samples: 9117221880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 01:26:21,286][15401] Updated weights for policy 0, policy_version 556470 (0.0030) [2024-06-24 01:26:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9117253632. Throughput: 0: 42493.8. Samples: 9117348580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 01:26:25,595][15401] Updated weights for policy 0, policy_version 556480 (0.0042) [2024-06-24 01:26:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9117466624. Throughput: 0: 42421.0. Samples: 9117604040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 01:26:29,119][15401] Updated weights for policy 0, policy_version 556490 (0.0035) [2024-06-24 01:26:33,137][15401] Updated weights for policy 0, policy_version 556500 (0.0035) [2024-06-24 01:26:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9117712384. Throughput: 0: 42749.4. Samples: 9117864180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:33,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 01:26:36,677][15401] Updated weights for policy 0, policy_version 556510 (0.0036) [2024-06-24 01:26:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9117908992. Throughput: 0: 42680.4. Samples: 9117993120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 01:26:40,766][15401] Updated weights for policy 0, policy_version 556520 (0.0036) [2024-06-24 01:26:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 9118138368. Throughput: 0: 42454.3. Samples: 9118242960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 01:26:44,534][15401] Updated weights for policy 0, policy_version 556530 (0.0034) [2024-06-24 01:26:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9118334976. Throughput: 0: 42636.9. Samples: 9118501300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 01:26:48,644][15401] Updated weights for policy 0, policy_version 556540 (0.0047) [2024-06-24 01:26:52,023][15401] Updated weights for policy 0, policy_version 556550 (0.0035) [2024-06-24 01:26:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42821.2). Total num frames: 9118547968. Throughput: 0: 42547.7. Samples: 9118629720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:26:56,198][15401] Updated weights for policy 0, policy_version 556560 (0.0025) [2024-06-24 01:26:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9118777344. Throughput: 0: 42555.3. Samples: 9118884400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:26:58,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 01:26:59,792][15401] Updated weights for policy 0, policy_version 556570 (0.0047) [2024-06-24 01:27:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 9118973952. Throughput: 0: 42646.8. Samples: 9119140980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 01:27:04,098][15401] Updated weights for policy 0, policy_version 556580 (0.0032) [2024-06-24 01:27:07,311][15401] Updated weights for policy 0, policy_version 556590 (0.0027) [2024-06-24 01:27:08,390][15132] Fps is (10 sec: 40957.0, 60 sec: 42871.1, 300 sec: 42764.9). Total num frames: 9119186944. Throughput: 0: 42623.8. Samples: 9119266680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:08,391][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 01:27:11,791][15401] Updated weights for policy 0, policy_version 556600 (0.0030) [2024-06-24 01:27:13,205][15349] Signal inference workers to stop experience collection... (135050 times) [2024-06-24 01:27:13,262][15349] Signal inference workers to resume experience collection... (135050 times) [2024-06-24 01:27:13,264][15401] InferenceWorker_p0-w0: stopping experience collection (135050 times) [2024-06-24 01:27:13,275][15401] InferenceWorker_p0-w0: resuming experience collection (135050 times) [2024-06-24 01:27:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9119399936. Throughput: 0: 42605.3. Samples: 9119521280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 01:27:15,199][15401] Updated weights for policy 0, policy_version 556610 (0.0028) [2024-06-24 01:27:18,389][15132] Fps is (10 sec: 40962.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9119596544. Throughput: 0: 42568.9. Samples: 9119779780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 01:27:19,434][15401] Updated weights for policy 0, policy_version 556620 (0.0026) [2024-06-24 01:27:22,909][15401] Updated weights for policy 0, policy_version 556630 (0.0041) [2024-06-24 01:27:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9119842304. Throughput: 0: 42405.4. Samples: 9119901360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 01:27:27,255][15401] Updated weights for policy 0, policy_version 556640 (0.0033) [2024-06-24 01:27:28,392][15132] Fps is (10 sec: 45865.8, 60 sec: 43143.2, 300 sec: 42875.8). Total num frames: 9120055296. Throughput: 0: 42742.9. Samples: 9120166480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:28,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 01:27:30,373][15401] Updated weights for policy 0, policy_version 556650 (0.0037) [2024-06-24 01:27:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42654.9). Total num frames: 9120235520. Throughput: 0: 42774.8. Samples: 9120426160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 01:27:34,757][15401] Updated weights for policy 0, policy_version 556660 (0.0033) [2024-06-24 01:27:37,838][15401] Updated weights for policy 0, policy_version 556670 (0.0043) [2024-06-24 01:27:38,390][15132] Fps is (10 sec: 44245.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9120497664. Throughput: 0: 42616.3. Samples: 9120547460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:27:42,141][15401] Updated weights for policy 0, policy_version 556680 (0.0031) [2024-06-24 01:27:43,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42869.7, 300 sec: 42931.6). Total num frames: 9120710656. Throughput: 0: 42828.3. Samples: 9120811780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:43,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 01:27:43,450][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556685_9120727040.pth... [2024-06-24 01:27:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556056_9110421504.pth [2024-06-24 01:27:45,436][15401] Updated weights for policy 0, policy_version 556690 (0.0036) [2024-06-24 01:27:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9120874496. Throughput: 0: 43071.5. Samples: 9121079200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 01:27:49,700][15401] Updated weights for policy 0, policy_version 556700 (0.0031) [2024-06-24 01:27:53,061][15401] Updated weights for policy 0, policy_version 556710 (0.0036) [2024-06-24 01:27:53,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9121136640. Throughput: 0: 42853.1. Samples: 9121195040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:53,396][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 01:27:57,266][15401] Updated weights for policy 0, policy_version 556720 (0.0033) [2024-06-24 01:27:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9121349632. Throughput: 0: 43137.3. Samples: 9121462460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:27:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 01:28:00,793][15401] Updated weights for policy 0, policy_version 556730 (0.0029) [2024-06-24 01:28:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9121513472. Throughput: 0: 43184.8. Samples: 9121723100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 01:28:05,019][15401] Updated weights for policy 0, policy_version 556740 (0.0050) [2024-06-24 01:28:08,324][15401] Updated weights for policy 0, policy_version 556750 (0.0038) [2024-06-24 01:28:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43418.0, 300 sec: 42931.6). Total num frames: 9121792000. Throughput: 0: 43135.5. Samples: 9121842460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 01:28:12,760][15401] Updated weights for policy 0, policy_version 556760 (0.0023) [2024-06-24 01:28:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9121988608. Throughput: 0: 43155.6. Samples: 9122108400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 01:28:16,001][15401] Updated weights for policy 0, policy_version 556770 (0.0038) [2024-06-24 01:28:18,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 9122168832. Throughput: 0: 42954.6. Samples: 9122359220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:18,393][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 01:28:20,533][15401] Updated weights for policy 0, policy_version 556780 (0.0044) [2024-06-24 01:28:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9122414592. Throughput: 0: 42950.3. Samples: 9122480220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:23,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 01:28:24,059][15401] Updated weights for policy 0, policy_version 556790 (0.0037) [2024-06-24 01:28:28,207][15401] Updated weights for policy 0, policy_version 556800 (0.0038) [2024-06-24 01:28:28,295][15349] Signal inference workers to stop experience collection... (135100 times) [2024-06-24 01:28:28,316][15401] InferenceWorker_p0-w0: stopping experience collection (135100 times) [2024-06-24 01:28:28,356][15349] Signal inference workers to resume experience collection... (135100 times) [2024-06-24 01:28:28,356][15401] InferenceWorker_p0-w0: resuming experience collection (135100 times) [2024-06-24 01:28:28,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 9122627584. Throughput: 0: 43078.8. Samples: 9122750220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 01:28:31,767][15401] Updated weights for policy 0, policy_version 556810 (0.0029) [2024-06-24 01:28:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9122807808. Throughput: 0: 42731.9. Samples: 9123002140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:28:35,871][15401] Updated weights for policy 0, policy_version 556820 (0.0038) [2024-06-24 01:28:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 9123069952. Throughput: 0: 42846.1. Samples: 9123123220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 01:28:38,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 01:28:39,495][15401] Updated weights for policy 0, policy_version 556830 (0.0036) [2024-06-24 01:28:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 9123250176. Throughput: 0: 42762.2. Samples: 9123386860. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:28:43,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 01:28:43,539][15401] Updated weights for policy 0, policy_version 556840 (0.0033) [2024-06-24 01:28:47,146][15401] Updated weights for policy 0, policy_version 556850 (0.0037) [2024-06-24 01:28:48,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9123446784. Throughput: 0: 42577.9. Samples: 9123639100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:28:48,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 01:28:51,281][15401] Updated weights for policy 0, policy_version 556860 (0.0023) [2024-06-24 01:28:53,392][15132] Fps is (10 sec: 47513.7, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 9123725312. Throughput: 0: 42816.4. Samples: 9123769300. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:28:53,393][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 01:28:54,723][15401] Updated weights for policy 0, policy_version 556870 (0.0028) [2024-06-24 01:28:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 9123889152. Throughput: 0: 42739.3. Samples: 9124031660. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:28:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 01:28:58,730][15401] Updated weights for policy 0, policy_version 556880 (0.0023) [2024-06-24 01:29:02,405][15401] Updated weights for policy 0, policy_version 556890 (0.0033) [2024-06-24 01:29:03,390][15132] Fps is (10 sec: 37692.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9124102144. Throughput: 0: 42736.0. Samples: 9124282240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 01:29:06,315][15401] Updated weights for policy 0, policy_version 556900 (0.0028) [2024-06-24 01:29:08,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 9124364288. Throughput: 0: 42949.4. Samples: 9124413040. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:08,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 01:29:10,248][15401] Updated weights for policy 0, policy_version 556910 (0.0049) [2024-06-24 01:29:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 9124528128. Throughput: 0: 42793.8. Samples: 9124675940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:13,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 01:29:13,894][15401] Updated weights for policy 0, policy_version 556920 (0.0025) [2024-06-24 01:29:17,711][15401] Updated weights for policy 0, policy_version 556930 (0.0034) [2024-06-24 01:29:18,389][15132] Fps is (10 sec: 39330.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 9124757504. Throughput: 0: 42782.7. Samples: 9124927360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 01:29:21,467][15401] Updated weights for policy 0, policy_version 556940 (0.0031) [2024-06-24 01:29:23,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9125003264. Throughput: 0: 43023.2. Samples: 9125059160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 01:29:25,613][15401] Updated weights for policy 0, policy_version 556950 (0.0032) [2024-06-24 01:29:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9125167104. Throughput: 0: 42862.8. Samples: 9125315580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 01:29:28,941][15401] Updated weights for policy 0, policy_version 556960 (0.0023) [2024-06-24 01:29:33,203][15401] Updated weights for policy 0, policy_version 556970 (0.0044) [2024-06-24 01:29:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9125396480. Throughput: 0: 42994.1. Samples: 9125573840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 01:29:36,972][15401] Updated weights for policy 0, policy_version 556980 (0.0031) [2024-06-24 01:29:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 9125642240. Throughput: 0: 42899.7. Samples: 9125699680. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 01:29:40,805][15401] Updated weights for policy 0, policy_version 556990 (0.0046) [2024-06-24 01:29:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 9125806080. Throughput: 0: 42751.6. Samples: 9125955480. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 01:29:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556996_9125822464.pth... [2024-06-24 01:29:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556370_9115566080.pth [2024-06-24 01:29:43,806][15349] Signal inference workers to stop experience collection... (135150 times) [2024-06-24 01:29:43,806][15349] Signal inference workers to resume experience collection... (135150 times) [2024-06-24 01:29:43,821][15401] InferenceWorker_p0-w0: stopping experience collection (135150 times) [2024-06-24 01:29:43,821][15401] InferenceWorker_p0-w0: resuming experience collection (135150 times) [2024-06-24 01:29:44,550][15401] Updated weights for policy 0, policy_version 557000 (0.0031) [2024-06-24 01:29:48,279][15401] Updated weights for policy 0, policy_version 557010 (0.0033) [2024-06-24 01:29:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 9126051840. Throughput: 0: 42876.0. Samples: 9126211660. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 01:29:52,133][15401] Updated weights for policy 0, policy_version 557020 (0.0042) [2024-06-24 01:29:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 9126264832. Throughput: 0: 42890.7. Samples: 9126343020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 01:29:55,894][15401] Updated weights for policy 0, policy_version 557030 (0.0033) [2024-06-24 01:29:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42710.2). Total num frames: 9126445056. Throughput: 0: 42591.0. Samples: 9126592540. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:29:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 01:30:00,182][15401] Updated weights for policy 0, policy_version 557040 (0.0037) [2024-06-24 01:30:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9126690816. Throughput: 0: 42567.5. Samples: 9126842900. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:30:03,391][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 01:30:03,411][15401] Updated weights for policy 0, policy_version 557050 (0.0038) [2024-06-24 01:30:07,762][15401] Updated weights for policy 0, policy_version 557060 (0.0027) [2024-06-24 01:30:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 9126903808. Throughput: 0: 42602.3. Samples: 9126976260. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:30:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 01:30:10,986][15401] Updated weights for policy 0, policy_version 557070 (0.0037) [2024-06-24 01:30:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42654.9). Total num frames: 9127084032. Throughput: 0: 42467.1. Samples: 9127226600. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-24 01:30:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 01:30:15,385][15401] Updated weights for policy 0, policy_version 557080 (0.0028) [2024-06-24 01:30:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9127329792. Throughput: 0: 42296.1. Samples: 9127477260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:18,392][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 01:30:19,077][15401] Updated weights for policy 0, policy_version 557090 (0.0045) [2024-06-24 01:30:22,956][15401] Updated weights for policy 0, policy_version 557100 (0.0041) [2024-06-24 01:30:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9127526400. Throughput: 0: 42560.8. Samples: 9127614920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:23,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 01:30:26,643][15401] Updated weights for policy 0, policy_version 557110 (0.0030) [2024-06-24 01:30:28,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9127739392. Throughput: 0: 42359.5. Samples: 9127861660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 01:30:30,440][15401] Updated weights for policy 0, policy_version 557120 (0.0032) [2024-06-24 01:30:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9127968768. Throughput: 0: 42320.0. Samples: 9128116060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 01:30:34,430][15401] Updated weights for policy 0, policy_version 557130 (0.0040) [2024-06-24 01:30:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42710.4). Total num frames: 9128165376. Throughput: 0: 42245.3. Samples: 9128244060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:38,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 01:30:38,417][15401] Updated weights for policy 0, policy_version 557140 (0.0046) [2024-06-24 01:30:42,331][15401] Updated weights for policy 0, policy_version 557150 (0.0041) [2024-06-24 01:30:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 9128394752. Throughput: 0: 42367.1. Samples: 9128499160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:43,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 01:30:46,266][15401] Updated weights for policy 0, policy_version 557160 (0.0034) [2024-06-24 01:30:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9128591360. Throughput: 0: 42474.2. Samples: 9128754240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 01:30:50,342][15401] Updated weights for policy 0, policy_version 557170 (0.0030) [2024-06-24 01:30:53,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9128804352. Throughput: 0: 42403.5. Samples: 9128884420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 01:30:53,622][15401] Updated weights for policy 0, policy_version 557180 (0.0034) [2024-06-24 01:30:57,707][15401] Updated weights for policy 0, policy_version 557190 (0.0046) [2024-06-24 01:30:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9129017344. Throughput: 0: 42692.0. Samples: 9129147740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:30:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 01:31:00,917][15349] Signal inference workers to stop experience collection... (135200 times) [2024-06-24 01:31:00,917][15349] Signal inference workers to resume experience collection... (135200 times) [2024-06-24 01:31:00,945][15401] InferenceWorker_p0-w0: stopping experience collection (135200 times) [2024-06-24 01:31:00,945][15401] InferenceWorker_p0-w0: resuming experience collection (135200 times) [2024-06-24 01:31:01,240][15401] Updated weights for policy 0, policy_version 557200 (0.0045) [2024-06-24 01:31:03,393][15132] Fps is (10 sec: 44219.8, 60 sec: 42595.7, 300 sec: 42820.0). Total num frames: 9129246720. Throughput: 0: 42555.4. Samples: 9129392320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:03,394][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 01:31:05,487][15401] Updated weights for policy 0, policy_version 557210 (0.0033) [2024-06-24 01:31:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9129443328. Throughput: 0: 42481.7. Samples: 9129526600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 01:31:08,982][15401] Updated weights for policy 0, policy_version 557220 (0.0040) [2024-06-24 01:31:12,924][15401] Updated weights for policy 0, policy_version 557230 (0.0030) [2024-06-24 01:31:13,390][15132] Fps is (10 sec: 42614.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9129672704. Throughput: 0: 42788.4. Samples: 9129787140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 01:31:16,718][15401] Updated weights for policy 0, policy_version 557240 (0.0026) [2024-06-24 01:31:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 9129869312. Throughput: 0: 42843.0. Samples: 9130044000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:18,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-24 01:31:20,860][15401] Updated weights for policy 0, policy_version 557250 (0.0033) [2024-06-24 01:31:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9130098688. Throughput: 0: 42873.3. Samples: 9130173360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:23,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 01:31:24,238][15401] Updated weights for policy 0, policy_version 557260 (0.0038) [2024-06-24 01:31:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9130295296. Throughput: 0: 42988.1. Samples: 9130433520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 01:31:28,586][15401] Updated weights for policy 0, policy_version 557270 (0.0036) [2024-06-24 01:31:31,804][15401] Updated weights for policy 0, policy_version 557280 (0.0035) [2024-06-24 01:31:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 9130508288. Throughput: 0: 42998.2. Samples: 9130689260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:33,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 01:31:36,094][15401] Updated weights for policy 0, policy_version 557290 (0.0029) [2024-06-24 01:31:38,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 9130737664. Throughput: 0: 42912.1. Samples: 9130815740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:38,397][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 01:31:39,304][15401] Updated weights for policy 0, policy_version 557300 (0.0044) [2024-06-24 01:31:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 9130950656. Throughput: 0: 42925.3. Samples: 9131079380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:31:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 01:31:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557310_9130967040.pth... [2024-06-24 01:31:43,533][15401] Updated weights for policy 0, policy_version 557310 (0.0033) [2024-06-24 01:31:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556685_9120727040.pth [2024-06-24 01:31:47,130][15401] Updated weights for policy 0, policy_version 557320 (0.0034) [2024-06-24 01:31:48,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9131163648. Throughput: 0: 43185.6. Samples: 9131335500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:31:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 01:31:51,335][15401] Updated weights for policy 0, policy_version 557330 (0.0034) [2024-06-24 01:31:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9131393024. Throughput: 0: 43168.1. Samples: 9131469160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:31:53,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 01:31:54,474][15401] Updated weights for policy 0, policy_version 557340 (0.0027) [2024-06-24 01:31:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9131589632. Throughput: 0: 43097.7. Samples: 9131726540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:31:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 01:31:58,921][15401] Updated weights for policy 0, policy_version 557350 (0.0031) [2024-06-24 01:32:01,979][15349] Signal inference workers to stop experience collection... (135250 times) [2024-06-24 01:32:01,984][15349] Signal inference workers to resume experience collection... (135250 times) [2024-06-24 01:32:02,000][15401] InferenceWorker_p0-w0: stopping experience collection (135250 times) [2024-06-24 01:32:02,000][15401] InferenceWorker_p0-w0: resuming experience collection (135250 times) [2024-06-24 01:32:02,134][15401] Updated weights for policy 0, policy_version 557360 (0.0045) [2024-06-24 01:32:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42599.4, 300 sec: 42764.8). Total num frames: 9131802624. Throughput: 0: 42908.9. Samples: 9131975000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:03,393][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 01:32:06,557][15401] Updated weights for policy 0, policy_version 557370 (0.0035) [2024-06-24 01:32:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9132032000. Throughput: 0: 42954.1. Samples: 9132106300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 01:32:09,714][15401] Updated weights for policy 0, policy_version 557380 (0.0040) [2024-06-24 01:32:13,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9132212224. Throughput: 0: 42806.0. Samples: 9132359800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 01:32:14,253][15401] Updated weights for policy 0, policy_version 557390 (0.0033) [2024-06-24 01:32:17,557][15401] Updated weights for policy 0, policy_version 557400 (0.0039) [2024-06-24 01:32:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 9132457984. Throughput: 0: 42610.4. Samples: 9132606620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 01:32:21,955][15401] Updated weights for policy 0, policy_version 557410 (0.0032) [2024-06-24 01:32:23,396][15132] Fps is (10 sec: 47483.9, 60 sec: 43139.9, 300 sec: 42819.9). Total num frames: 9132687360. Throughput: 0: 42902.6. Samples: 9132746360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:23,396][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 01:32:25,178][15401] Updated weights for policy 0, policy_version 557420 (0.0032) [2024-06-24 01:32:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9132867584. Throughput: 0: 42725.5. Samples: 9133002020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 01:32:29,713][15401] Updated weights for policy 0, policy_version 557430 (0.0037) [2024-06-24 01:32:32,874][15401] Updated weights for policy 0, policy_version 557440 (0.0025) [2024-06-24 01:32:33,389][15132] Fps is (10 sec: 42626.0, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 9133113344. Throughput: 0: 42437.8. Samples: 9133245200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 01:32:37,475][15401] Updated weights for policy 0, policy_version 557450 (0.0033) [2024-06-24 01:32:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42709.8). Total num frames: 9133309952. Throughput: 0: 42608.0. Samples: 9133386520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 01:32:40,554][15401] Updated weights for policy 0, policy_version 557460 (0.0029) [2024-06-24 01:32:43,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9133506560. Throughput: 0: 42469.3. Samples: 9133637760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:43,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 01:32:45,077][15401] Updated weights for policy 0, policy_version 557470 (0.0039) [2024-06-24 01:32:48,177][15401] Updated weights for policy 0, policy_version 557480 (0.0021) [2024-06-24 01:32:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9133752320. Throughput: 0: 42587.6. Samples: 9133891340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 01:32:52,563][15401] Updated weights for policy 0, policy_version 557490 (0.0036) [2024-06-24 01:32:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9133932544. Throughput: 0: 42719.7. Samples: 9134028680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 01:32:55,815][15401] Updated weights for policy 0, policy_version 557500 (0.0040) [2024-06-24 01:32:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9134145536. Throughput: 0: 42608.9. Samples: 9134277200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:32:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:33:00,112][15401] Updated weights for policy 0, policy_version 557510 (0.0035) [2024-06-24 01:33:03,368][15349] Signal inference workers to stop experience collection... (135300 times) [2024-06-24 01:33:03,368][15349] Signal inference workers to resume experience collection... (135300 times) [2024-06-24 01:33:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.4, 300 sec: 42709.5). Total num frames: 9134391296. Throughput: 0: 42894.3. Samples: 9134536860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:33:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 01:33:03,404][15401] InferenceWorker_p0-w0: stopping experience collection (135300 times) [2024-06-24 01:33:03,404][15401] InferenceWorker_p0-w0: resuming experience collection (135300 times) [2024-06-24 01:33:03,507][15401] Updated weights for policy 0, policy_version 557520 (0.0034) [2024-06-24 01:33:07,605][15401] Updated weights for policy 0, policy_version 557530 (0.0031) [2024-06-24 01:33:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9134587904. Throughput: 0: 42677.5. Samples: 9134666580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:33:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 01:33:11,005][15401] Updated weights for policy 0, policy_version 557540 (0.0022) [2024-06-24 01:33:13,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 9134784512. Throughput: 0: 42661.6. Samples: 9134921800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 01:33:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 01:33:15,466][15401] Updated weights for policy 0, policy_version 557550 (0.0023) [2024-06-24 01:33:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9135030272. Throughput: 0: 42988.9. Samples: 9135179700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 01:33:18,731][15401] Updated weights for policy 0, policy_version 557560 (0.0043) [2024-06-24 01:33:23,056][15401] Updated weights for policy 0, policy_version 557570 (0.0037) [2024-06-24 01:33:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 9135243264. Throughput: 0: 42854.2. Samples: 9135314960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 01:33:26,614][15401] Updated weights for policy 0, policy_version 557580 (0.0042) [2024-06-24 01:33:28,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9135439872. Throughput: 0: 42824.4. Samples: 9135564760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 01:33:30,736][15401] Updated weights for policy 0, policy_version 557590 (0.0033) [2024-06-24 01:33:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 9135669248. Throughput: 0: 42964.0. Samples: 9135824720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 01:33:34,373][15401] Updated weights for policy 0, policy_version 557600 (0.0025) [2024-06-24 01:33:38,269][15401] Updated weights for policy 0, policy_version 557610 (0.0039) [2024-06-24 01:33:38,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9135882240. Throughput: 0: 42867.9. Samples: 9135957740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:38,395][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 01:33:41,986][15401] Updated weights for policy 0, policy_version 557620 (0.0031) [2024-06-24 01:33:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 9136078848. Throughput: 0: 42989.4. Samples: 9136211720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 01:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557623_9136095232.pth... [2024-06-24 01:33:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000556996_9125822464.pth [2024-06-24 01:33:45,679][15401] Updated weights for policy 0, policy_version 557630 (0.0035) [2024-06-24 01:33:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9136324608. Throughput: 0: 42918.9. Samples: 9136468220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 01:33:49,806][15401] Updated weights for policy 0, policy_version 557640 (0.0034) [2024-06-24 01:33:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9136521216. Throughput: 0: 42959.7. Samples: 9136599760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 01:33:53,416][15401] Updated weights for policy 0, policy_version 557650 (0.0041) [2024-06-24 01:33:57,514][15401] Updated weights for policy 0, policy_version 557660 (0.0036) [2024-06-24 01:33:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 9136717824. Throughput: 0: 43014.7. Samples: 9136857560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:33:58,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 01:34:01,113][15401] Updated weights for policy 0, policy_version 557670 (0.0023) [2024-06-24 01:34:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 9136963584. Throughput: 0: 42934.4. Samples: 9137111760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 01:34:05,165][15401] Updated weights for policy 0, policy_version 557680 (0.0023) [2024-06-24 01:34:08,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9137176576. Throughput: 0: 42868.0. Samples: 9137244020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 01:34:08,553][15401] Updated weights for policy 0, policy_version 557690 (0.0045) [2024-06-24 01:34:12,711][15401] Updated weights for policy 0, policy_version 557700 (0.0034) [2024-06-24 01:34:13,392][15132] Fps is (10 sec: 40950.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 9137373184. Throughput: 0: 43062.3. Samples: 9137502660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:13,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 01:34:16,251][15401] Updated weights for policy 0, policy_version 557710 (0.0031) [2024-06-24 01:34:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9137602560. Throughput: 0: 43036.8. Samples: 9137761380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 01:34:20,165][15401] Updated weights for policy 0, policy_version 557720 (0.0038) [2024-06-24 01:34:23,067][15349] Signal inference workers to stop experience collection... (135350 times) [2024-06-24 01:34:23,068][15349] Signal inference workers to resume experience collection... (135350 times) [2024-06-24 01:34:23,090][15401] InferenceWorker_p0-w0: stopping experience collection (135350 times) [2024-06-24 01:34:23,090][15401] InferenceWorker_p0-w0: resuming experience collection (135350 times) [2024-06-24 01:34:23,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 9137815552. Throughput: 0: 43059.0. Samples: 9137895500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:23,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 01:34:23,761][15401] Updated weights for policy 0, policy_version 557730 (0.0048) [2024-06-24 01:34:27,740][15401] Updated weights for policy 0, policy_version 557740 (0.0035) [2024-06-24 01:34:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 9138028544. Throughput: 0: 43065.8. Samples: 9138149680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 01:34:31,418][15401] Updated weights for policy 0, policy_version 557750 (0.0033) [2024-06-24 01:34:33,392][15132] Fps is (10 sec: 44237.2, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 9138257920. Throughput: 0: 43056.5. Samples: 9138405860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:33,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 01:34:35,435][15401] Updated weights for policy 0, policy_version 557760 (0.0030) [2024-06-24 01:34:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9138438144. Throughput: 0: 43038.3. Samples: 9138536480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 01:34:39,244][15401] Updated weights for policy 0, policy_version 557770 (0.0039) [2024-06-24 01:34:43,025][15401] Updated weights for policy 0, policy_version 557780 (0.0022) [2024-06-24 01:34:43,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9138667520. Throughput: 0: 42960.0. Samples: 9138790660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 01:34:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 01:34:46,997][15401] Updated weights for policy 0, policy_version 557790 (0.0036) [2024-06-24 01:34:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9138880512. Throughput: 0: 42953.0. Samples: 9139044640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:34:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 01:34:50,582][15401] Updated weights for policy 0, policy_version 557800 (0.0037) [2024-06-24 01:34:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9139093504. Throughput: 0: 42907.6. Samples: 9139174860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:34:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 01:34:54,753][15401] Updated weights for policy 0, policy_version 557810 (0.0045) [2024-06-24 01:34:58,049][15401] Updated weights for policy 0, policy_version 557820 (0.0030) [2024-06-24 01:34:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 9139322880. Throughput: 0: 42874.3. Samples: 9139431900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:34:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 01:35:02,276][15401] Updated weights for policy 0, policy_version 557830 (0.0034) [2024-06-24 01:35:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 9139519488. Throughput: 0: 42885.1. Samples: 9139691200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 01:35:05,967][15401] Updated weights for policy 0, policy_version 557840 (0.0029) [2024-06-24 01:35:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9139732480. Throughput: 0: 42659.6. Samples: 9139815080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 01:35:10,095][15401] Updated weights for policy 0, policy_version 557850 (0.0035) [2024-06-24 01:35:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.4, 300 sec: 42820.9). Total num frames: 9139961856. Throughput: 0: 42680.5. Samples: 9140070300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 01:35:13,494][15401] Updated weights for policy 0, policy_version 557860 (0.0043) [2024-06-24 01:35:17,727][15401] Updated weights for policy 0, policy_version 557870 (0.0038) [2024-06-24 01:35:18,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 9140158464. Throughput: 0: 42726.0. Samples: 9140328700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:18,396][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 01:35:20,991][15401] Updated weights for policy 0, policy_version 557880 (0.0038) [2024-06-24 01:35:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9140371456. Throughput: 0: 42636.0. Samples: 9140455100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 01:35:25,195][15401] Updated weights for policy 0, policy_version 557890 (0.0031) [2024-06-24 01:35:28,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9140600832. Throughput: 0: 42653.7. Samples: 9140710080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 01:35:28,757][15401] Updated weights for policy 0, policy_version 557900 (0.0031) [2024-06-24 01:35:33,050][15401] Updated weights for policy 0, policy_version 557910 (0.0050) [2024-06-24 01:35:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 9140813824. Throughput: 0: 42673.0. Samples: 9140964920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 01:35:36,791][15401] Updated weights for policy 0, policy_version 557920 (0.0038) [2024-06-24 01:35:38,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42866.8, 300 sec: 42764.4). Total num frames: 9141010432. Throughput: 0: 42584.5. Samples: 9141091440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:38,396][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 01:35:40,457][15401] Updated weights for policy 0, policy_version 557930 (0.0030) [2024-06-24 01:35:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9141256192. Throughput: 0: 42747.1. Samples: 9141355520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 01:35:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557938_9141256192.pth... [2024-06-24 01:35:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557310_9130967040.pth [2024-06-24 01:35:44,326][15401] Updated weights for policy 0, policy_version 557940 (0.0034) [2024-06-24 01:35:48,085][15349] Signal inference workers to stop experience collection... (135400 times) [2024-06-24 01:35:48,134][15401] InferenceWorker_p0-w0: stopping experience collection (135400 times) [2024-06-24 01:35:48,143][15349] Signal inference workers to resume experience collection... (135400 times) [2024-06-24 01:35:48,150][15401] InferenceWorker_p0-w0: resuming experience collection (135400 times) [2024-06-24 01:35:48,290][15401] Updated weights for policy 0, policy_version 557950 (0.0036) [2024-06-24 01:35:48,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9141452800. Throughput: 0: 42673.7. Samples: 9141611520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:48,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 01:35:51,879][15401] Updated weights for policy 0, policy_version 557960 (0.0025) [2024-06-24 01:35:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9141649408. Throughput: 0: 42648.9. Samples: 9141734280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 01:35:56,145][15401] Updated weights for policy 0, policy_version 557970 (0.0030) [2024-06-24 01:35:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.7). Total num frames: 9141895168. Throughput: 0: 42805.3. Samples: 9141996540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:35:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 01:35:59,468][15401] Updated weights for policy 0, policy_version 557980 (0.0040) [2024-06-24 01:36:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9142091776. Throughput: 0: 42815.9. Samples: 9142255140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:36:03,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 01:36:03,811][15401] Updated weights for policy 0, policy_version 557990 (0.0030) [2024-06-24 01:36:07,033][15401] Updated weights for policy 0, policy_version 558000 (0.0043) [2024-06-24 01:36:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9142288384. Throughput: 0: 42606.6. Samples: 9142372400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:36:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 01:36:11,551][15401] Updated weights for policy 0, policy_version 558010 (0.0029) [2024-06-24 01:36:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9142534144. Throughput: 0: 42915.1. Samples: 9142641360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 01:36:13,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 01:36:14,678][15401] Updated weights for policy 0, policy_version 558020 (0.0036) [2024-06-24 01:36:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42874.3, 300 sec: 42820.2). Total num frames: 9142730752. Throughput: 0: 43011.9. Samples: 9142900560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:18,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 01:36:19,215][15401] Updated weights for policy 0, policy_version 558030 (0.0033) [2024-06-24 01:36:22,526][15401] Updated weights for policy 0, policy_version 558040 (0.0033) [2024-06-24 01:36:23,392][15132] Fps is (10 sec: 40959.7, 60 sec: 42869.6, 300 sec: 42875.7). Total num frames: 9142943744. Throughput: 0: 42820.7. Samples: 9143018200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:23,393][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 01:36:26,984][15401] Updated weights for policy 0, policy_version 558050 (0.0041) [2024-06-24 01:36:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 9143156736. Throughput: 0: 42716.5. Samples: 9143277760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 01:36:30,123][15401] Updated weights for policy 0, policy_version 558060 (0.0038) [2024-06-24 01:36:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.2, 300 sec: 42765.9). Total num frames: 9143353344. Throughput: 0: 42678.2. Samples: 9143532040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 01:36:34,598][15401] Updated weights for policy 0, policy_version 558070 (0.0039) [2024-06-24 01:36:37,684][15401] Updated weights for policy 0, policy_version 558080 (0.0029) [2024-06-24 01:36:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 9143599104. Throughput: 0: 42803.5. Samples: 9143660440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 01:36:42,457][15401] Updated weights for policy 0, policy_version 558090 (0.0034) [2024-06-24 01:36:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9143795712. Throughput: 0: 42780.8. Samples: 9143921680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 01:36:45,199][15401] Updated weights for policy 0, policy_version 558100 (0.0038) [2024-06-24 01:36:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9144008704. Throughput: 0: 42586.6. Samples: 9144171640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:48,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 01:36:49,979][15401] Updated weights for policy 0, policy_version 558110 (0.0028) [2024-06-24 01:36:53,008][15401] Updated weights for policy 0, policy_version 558120 (0.0041) [2024-06-24 01:36:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9144238080. Throughput: 0: 42886.2. Samples: 9144302280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 01:36:57,766][15401] Updated weights for policy 0, policy_version 558130 (0.0032) [2024-06-24 01:36:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.2, 300 sec: 42765.4). Total num frames: 9144418304. Throughput: 0: 42539.2. Samples: 9144555520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:36:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 01:37:00,765][15401] Updated weights for policy 0, policy_version 558140 (0.0038) [2024-06-24 01:37:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9144647680. Throughput: 0: 42509.0. Samples: 9144813360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 01:37:05,333][15401] Updated weights for policy 0, policy_version 558150 (0.0033) [2024-06-24 01:37:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 9144877056. Throughput: 0: 42726.0. Samples: 9144940760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 01:37:08,487][15401] Updated weights for policy 0, policy_version 558160 (0.0028) [2024-06-24 01:37:12,615][15349] Signal inference workers to stop experience collection... (135450 times) [2024-06-24 01:37:12,640][15401] InferenceWorker_p0-w0: stopping experience collection (135450 times) [2024-06-24 01:37:12,678][15349] Signal inference workers to resume experience collection... (135450 times) [2024-06-24 01:37:12,679][15401] InferenceWorker_p0-w0: resuming experience collection (135450 times) [2024-06-24 01:37:12,835][15401] Updated weights for policy 0, policy_version 558170 (0.0033) [2024-06-24 01:37:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 9145057280. Throughput: 0: 42602.7. Samples: 9145194880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:37:16,695][15401] Updated weights for policy 0, policy_version 558180 (0.0035) [2024-06-24 01:37:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42710.4). Total num frames: 9145286656. Throughput: 0: 42584.1. Samples: 9145448320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 01:37:20,401][15401] Updated weights for policy 0, policy_version 558190 (0.0030) [2024-06-24 01:37:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 9145483264. Throughput: 0: 42584.1. Samples: 9145576720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:23,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 01:37:24,196][15401] Updated weights for policy 0, policy_version 558200 (0.0029) [2024-06-24 01:37:27,935][15401] Updated weights for policy 0, policy_version 558210 (0.0029) [2024-06-24 01:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9145712640. Throughput: 0: 42543.6. Samples: 9145836140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 01:37:31,833][15401] Updated weights for policy 0, policy_version 558220 (0.0043) [2024-06-24 01:37:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9145909248. Throughput: 0: 42636.6. Samples: 9146090180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:33,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 01:37:35,901][15401] Updated weights for policy 0, policy_version 558230 (0.0039) [2024-06-24 01:37:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.6, 300 sec: 42765.0). Total num frames: 9146122240. Throughput: 0: 42601.2. Samples: 9146219440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:38,393][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 01:37:39,282][15401] Updated weights for policy 0, policy_version 558240 (0.0034) [2024-06-24 01:37:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9146335232. Throughput: 0: 42701.3. Samples: 9146477080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 01:37:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 01:37:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558249_9146351616.pth... [2024-06-24 01:37:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557623_9136095232.pth [2024-06-24 01:37:43,632][15401] Updated weights for policy 0, policy_version 558250 (0.0036) [2024-06-24 01:37:47,071][15401] Updated weights for policy 0, policy_version 558260 (0.0025) [2024-06-24 01:37:48,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 9146564608. Throughput: 0: 42697.6. Samples: 9146734760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:37:48,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 01:37:51,153][15401] Updated weights for policy 0, policy_version 558270 (0.0034) [2024-06-24 01:37:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 9146761216. Throughput: 0: 42624.8. Samples: 9146858980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:37:53,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 01:37:54,844][15401] Updated weights for policy 0, policy_version 558280 (0.0032) [2024-06-24 01:37:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9146990592. Throughput: 0: 42631.1. Samples: 9147113280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:37:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 01:37:58,677][15401] Updated weights for policy 0, policy_version 558290 (0.0054) [2024-06-24 01:38:02,769][15401] Updated weights for policy 0, policy_version 558300 (0.0042) [2024-06-24 01:38:03,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9147219968. Throughput: 0: 42694.7. Samples: 9147369580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:03,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 01:38:06,101][15401] Updated weights for policy 0, policy_version 558310 (0.0026) [2024-06-24 01:38:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9147416576. Throughput: 0: 42649.9. Samples: 9147495960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 01:38:10,265][15401] Updated weights for policy 0, policy_version 558320 (0.0032) [2024-06-24 01:38:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9147645952. Throughput: 0: 42694.2. Samples: 9147757380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 01:38:14,172][15401] Updated weights for policy 0, policy_version 558330 (0.0038) [2024-06-24 01:38:17,935][15401] Updated weights for policy 0, policy_version 558340 (0.0042) [2024-06-24 01:38:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9147858944. Throughput: 0: 42703.9. Samples: 9148011860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:38:21,603][15401] Updated weights for policy 0, policy_version 558350 (0.0037) [2024-06-24 01:38:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9148055552. Throughput: 0: 42718.6. Samples: 9148141680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 01:38:24,745][15349] Signal inference workers to stop experience collection... (135500 times) [2024-06-24 01:38:24,798][15349] Signal inference workers to resume experience collection... (135500 times) [2024-06-24 01:38:24,808][15401] InferenceWorker_p0-w0: stopping experience collection (135500 times) [2024-06-24 01:38:24,844][15401] InferenceWorker_p0-w0: resuming experience collection (135500 times) [2024-06-24 01:38:25,474][15401] Updated weights for policy 0, policy_version 558360 (0.0029) [2024-06-24 01:38:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9148268544. Throughput: 0: 42633.3. Samples: 9148395580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 01:38:29,433][15401] Updated weights for policy 0, policy_version 558370 (0.0042) [2024-06-24 01:38:33,054][15401] Updated weights for policy 0, policy_version 558380 (0.0032) [2024-06-24 01:38:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 9148514304. Throughput: 0: 42565.0. Samples: 9148650180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 01:38:37,128][15401] Updated weights for policy 0, policy_version 558390 (0.0048) [2024-06-24 01:38:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9148694528. Throughput: 0: 42818.8. Samples: 9148785720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 01:38:40,740][15401] Updated weights for policy 0, policy_version 558400 (0.0036) [2024-06-24 01:38:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9148923904. Throughput: 0: 42880.8. Samples: 9149042920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 01:38:44,954][15401] Updated weights for policy 0, policy_version 558410 (0.0031) [2024-06-24 01:38:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9149136896. Throughput: 0: 42796.8. Samples: 9149295440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 01:38:48,421][15401] Updated weights for policy 0, policy_version 558420 (0.0052) [2024-06-24 01:38:52,581][15401] Updated weights for policy 0, policy_version 558430 (0.0030) [2024-06-24 01:38:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42765.3). Total num frames: 9149333504. Throughput: 0: 42916.7. Samples: 9149427220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 01:38:56,174][15401] Updated weights for policy 0, policy_version 558440 (0.0023) [2024-06-24 01:38:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9149562880. Throughput: 0: 42622.2. Samples: 9149675380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:38:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 01:39:00,562][15401] Updated weights for policy 0, policy_version 558450 (0.0027) [2024-06-24 01:39:03,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9149775872. Throughput: 0: 42701.3. Samples: 9149933420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:39:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 01:39:03,672][15401] Updated weights for policy 0, policy_version 558460 (0.0033) [2024-06-24 01:39:08,371][15401] Updated weights for policy 0, policy_version 558470 (0.0032) [2024-06-24 01:39:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42709.8). Total num frames: 9149972480. Throughput: 0: 42615.1. Samples: 9150059360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:39:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 01:39:11,478][15401] Updated weights for policy 0, policy_version 558480 (0.0035) [2024-06-24 01:39:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9150201856. Throughput: 0: 42755.3. Samples: 9150319560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:39:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 01:39:15,919][15401] Updated weights for policy 0, policy_version 558490 (0.0034) [2024-06-24 01:39:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 9150414848. Throughput: 0: 42868.0. Samples: 9150579240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 01:39:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 01:39:18,987][15401] Updated weights for policy 0, policy_version 558500 (0.0034) [2024-06-24 01:39:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9150611456. Throughput: 0: 42653.3. Samples: 9150705120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 01:39:23,754][15401] Updated weights for policy 0, policy_version 558510 (0.0041) [2024-06-24 01:39:26,705][15401] Updated weights for policy 0, policy_version 558520 (0.0041) [2024-06-24 01:39:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 9150857216. Throughput: 0: 42596.5. Samples: 9150959760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 01:39:31,600][15401] Updated weights for policy 0, policy_version 558530 (0.0032) [2024-06-24 01:39:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9151053824. Throughput: 0: 42688.1. Samples: 9151216400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 01:39:34,322][15401] Updated weights for policy 0, policy_version 558540 (0.0041) [2024-06-24 01:39:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9151250432. Throughput: 0: 42501.8. Samples: 9151339800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 01:39:39,068][15401] Updated weights for policy 0, policy_version 558550 (0.0038) [2024-06-24 01:39:41,998][15401] Updated weights for policy 0, policy_version 558560 (0.0039) [2024-06-24 01:39:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9151496192. Throughput: 0: 42649.7. Samples: 9151594720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:43,392][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 01:39:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558563_9151496192.pth... [2024-06-24 01:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000557938_9141256192.pth [2024-06-24 01:39:46,411][15401] Updated weights for policy 0, policy_version 558570 (0.0035) [2024-06-24 01:39:47,539][15349] Signal inference workers to stop experience collection... (135550 times) [2024-06-24 01:39:47,544][15349] Signal inference workers to resume experience collection... (135550 times) [2024-06-24 01:39:47,594][15401] InferenceWorker_p0-w0: stopping experience collection (135550 times) [2024-06-24 01:39:47,594][15401] InferenceWorker_p0-w0: resuming experience collection (135550 times) [2024-06-24 01:39:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9151692800. Throughput: 0: 42843.2. Samples: 9151861360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 01:39:49,482][15401] Updated weights for policy 0, policy_version 558580 (0.0037) [2024-06-24 01:39:53,392][15132] Fps is (10 sec: 39321.4, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 9151889408. Throughput: 0: 42808.9. Samples: 9151985860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:53,393][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 01:39:53,981][15401] Updated weights for policy 0, policy_version 558590 (0.0026) [2024-06-24 01:39:57,180][15401] Updated weights for policy 0, policy_version 558600 (0.0041) [2024-06-24 01:39:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9152151552. Throughput: 0: 42677.2. Samples: 9152240040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:39:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 01:40:01,723][15401] Updated weights for policy 0, policy_version 558610 (0.0029) [2024-06-24 01:40:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9152331776. Throughput: 0: 42772.9. Samples: 9152504020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 01:40:04,854][15401] Updated weights for policy 0, policy_version 558620 (0.0026) [2024-06-24 01:40:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9152544768. Throughput: 0: 42663.6. Samples: 9152624980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 01:40:09,763][15401] Updated weights for policy 0, policy_version 558630 (0.0036) [2024-06-24 01:40:12,514][15401] Updated weights for policy 0, policy_version 558640 (0.0028) [2024-06-24 01:40:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42766.0). Total num frames: 9152774144. Throughput: 0: 42787.6. Samples: 9152885200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 01:40:17,414][15401] Updated weights for policy 0, policy_version 558650 (0.0034) [2024-06-24 01:40:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9152970752. Throughput: 0: 42903.5. Samples: 9153147060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 01:40:20,115][15401] Updated weights for policy 0, policy_version 558660 (0.0031) [2024-06-24 01:40:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9153183744. Throughput: 0: 42679.6. Samples: 9153260380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:23,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 01:40:25,372][15401] Updated weights for policy 0, policy_version 558670 (0.0031) [2024-06-24 01:40:27,855][15401] Updated weights for policy 0, policy_version 558680 (0.0045) [2024-06-24 01:40:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9153429504. Throughput: 0: 42784.1. Samples: 9153519900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 01:40:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42543.8). Total num frames: 9153560576. Throughput: 0: 42724.8. Samples: 9153783980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 01:40:33,421][15401] Updated weights for policy 0, policy_version 558690 (0.0046) [2024-06-24 01:40:35,797][15401] Updated weights for policy 0, policy_version 558700 (0.0038) [2024-06-24 01:40:38,394][15132] Fps is (10 sec: 40941.4, 60 sec: 43141.3, 300 sec: 42653.3). Total num frames: 9153839104. Throughput: 0: 42403.3. Samples: 9153894100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:38,394][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 01:40:40,963][15401] Updated weights for policy 0, policy_version 558710 (0.0030) [2024-06-24 01:40:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42327.0, 300 sec: 42654.0). Total num frames: 9154035712. Throughput: 0: 42567.6. Samples: 9154155580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 01:40:43,662][15401] Updated weights for policy 0, policy_version 558720 (0.0028) [2024-06-24 01:40:48,389][15132] Fps is (10 sec: 36061.2, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 9154199552. Throughput: 0: 42306.2. Samples: 9154407800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 01:40:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 01:40:48,774][15401] Updated weights for policy 0, policy_version 558730 (0.0033) [2024-06-24 01:40:51,460][15401] Updated weights for policy 0, policy_version 558740 (0.0042) [2024-06-24 01:40:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 9154478080. Throughput: 0: 42227.1. Samples: 9154525200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:40:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 01:40:56,394][15401] Updated weights for policy 0, policy_version 558750 (0.0032) [2024-06-24 01:40:57,894][15349] Signal inference workers to stop experience collection... (135600 times) [2024-06-24 01:40:57,896][15349] Signal inference workers to resume experience collection... (135600 times) [2024-06-24 01:40:57,920][15401] InferenceWorker_p0-w0: stopping experience collection (135600 times) [2024-06-24 01:40:57,920][15401] InferenceWorker_p0-w0: resuming experience collection (135600 times) [2024-06-24 01:40:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9154658304. Throughput: 0: 42283.1. Samples: 9154787940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:40:58,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 01:40:59,551][15401] Updated weights for policy 0, policy_version 558760 (0.0046) [2024-06-24 01:41:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 9154854912. Throughput: 0: 42031.3. Samples: 9155038480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 01:41:04,202][15401] Updated weights for policy 0, policy_version 558770 (0.0035) [2024-06-24 01:41:07,252][15401] Updated weights for policy 0, policy_version 558780 (0.0036) [2024-06-24 01:41:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 9155100672. Throughput: 0: 42248.4. Samples: 9155161560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:41:12,116][15401] Updated weights for policy 0, policy_version 558790 (0.0045) [2024-06-24 01:41:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.1, 300 sec: 42598.7). Total num frames: 9155297280. Throughput: 0: 42191.9. Samples: 9155418540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 01:41:15,041][15401] Updated weights for policy 0, policy_version 558800 (0.0048) [2024-06-24 01:41:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 42487.7). Total num frames: 9155477504. Throughput: 0: 41931.1. Samples: 9155670880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 01:41:19,767][15401] Updated weights for policy 0, policy_version 558810 (0.0037) [2024-06-24 01:41:22,629][15401] Updated weights for policy 0, policy_version 558820 (0.0035) [2024-06-24 01:41:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9155739648. Throughput: 0: 42227.8. Samples: 9155794160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 01:41:27,334][15401] Updated weights for policy 0, policy_version 558830 (0.0025) [2024-06-24 01:41:28,390][15132] Fps is (10 sec: 44235.6, 60 sec: 41506.0, 300 sec: 42598.4). Total num frames: 9155919872. Throughput: 0: 42406.4. Samples: 9156063880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 01:41:30,252][15401] Updated weights for policy 0, policy_version 558840 (0.0025) [2024-06-24 01:41:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9156132864. Throughput: 0: 42250.6. Samples: 9156309080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 01:41:34,944][15401] Updated weights for policy 0, policy_version 558850 (0.0037) [2024-06-24 01:41:37,778][15401] Updated weights for policy 0, policy_version 558860 (0.0033) [2024-06-24 01:41:38,390][15132] Fps is (10 sec: 45876.0, 60 sec: 42328.5, 300 sec: 42653.9). Total num frames: 9156378624. Throughput: 0: 42491.1. Samples: 9156437300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 01:41:42,686][15401] Updated weights for policy 0, policy_version 558870 (0.0031) [2024-06-24 01:41:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42487.7). Total num frames: 9156542464. Throughput: 0: 42553.9. Samples: 9156702860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 01:41:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558872_9156558848.pth... [2024-06-24 01:41:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558249_9146351616.pth [2024-06-24 01:41:45,176][15401] Updated weights for policy 0, policy_version 558880 (0.0041) [2024-06-24 01:41:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 9156788224. Throughput: 0: 42458.9. Samples: 9156949120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 01:41:50,199][15401] Updated weights for policy 0, policy_version 558890 (0.0036) [2024-06-24 01:41:52,758][15401] Updated weights for policy 0, policy_version 558900 (0.0027) [2024-06-24 01:41:53,390][15132] Fps is (10 sec: 47512.1, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 9157017600. Throughput: 0: 42775.4. Samples: 9157086460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 01:41:58,219][15401] Updated weights for policy 0, policy_version 558910 (0.0045) [2024-06-24 01:41:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9157197824. Throughput: 0: 42792.1. Samples: 9157344180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:41:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 01:42:00,597][15401] Updated weights for policy 0, policy_version 558920 (0.0042) [2024-06-24 01:42:03,389][15132] Fps is (10 sec: 42599.7, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 9157443584. Throughput: 0: 42659.6. Samples: 9157590560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:42:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 01:42:05,646][15401] Updated weights for policy 0, policy_version 558930 (0.0030) [2024-06-24 01:42:08,134][15401] Updated weights for policy 0, policy_version 558940 (0.0035) [2024-06-24 01:42:08,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9157672960. Throughput: 0: 42995.7. Samples: 9157728960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:42:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 01:42:12,780][15349] Signal inference workers to stop experience collection... (135650 times) [2024-06-24 01:42:12,821][15401] InferenceWorker_p0-w0: stopping experience collection (135650 times) [2024-06-24 01:42:12,831][15349] Signal inference workers to resume experience collection... (135650 times) [2024-06-24 01:42:12,843][15401] InferenceWorker_p0-w0: resuming experience collection (135650 times) [2024-06-24 01:42:13,291][15401] Updated weights for policy 0, policy_version 558950 (0.0036) [2024-06-24 01:42:13,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 9157836800. Throughput: 0: 42742.7. Samples: 9157987300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:42:13,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 01:42:15,805][15401] Updated weights for policy 0, policy_version 558960 (0.0043) [2024-06-24 01:42:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43415.8, 300 sec: 42709.1). Total num frames: 9158082560. Throughput: 0: 42730.2. Samples: 9158232040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 24.0) [2024-06-24 01:42:18,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 01:42:21,039][15401] Updated weights for policy 0, policy_version 558970 (0.0036) [2024-06-24 01:42:23,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9158311936. Throughput: 0: 42908.1. Samples: 9158368160. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 01:42:23,466][15401] Updated weights for policy 0, policy_version 558980 (0.0055) [2024-06-24 01:42:28,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 9158475776. Throughput: 0: 42739.0. Samples: 9158626120. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 01:42:28,432][15401] Updated weights for policy 0, policy_version 558990 (0.0035) [2024-06-24 01:42:31,123][15401] Updated weights for policy 0, policy_version 559000 (0.0027) [2024-06-24 01:42:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 9158705152. Throughput: 0: 42843.6. Samples: 9158877080. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 01:42:35,969][15401] Updated weights for policy 0, policy_version 559010 (0.0045) [2024-06-24 01:42:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9158934528. Throughput: 0: 42762.1. Samples: 9159010740. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:38,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 01:42:39,050][15401] Updated weights for policy 0, policy_version 559020 (0.0027) [2024-06-24 01:42:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 9159131136. Throughput: 0: 42667.5. Samples: 9159264220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 01:42:43,602][15401] Updated weights for policy 0, policy_version 559030 (0.0029) [2024-06-24 01:42:46,734][15401] Updated weights for policy 0, policy_version 559040 (0.0038) [2024-06-24 01:42:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9159360512. Throughput: 0: 42668.8. Samples: 9159510660. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 01:42:50,961][15401] Updated weights for policy 0, policy_version 559050 (0.0038) [2024-06-24 01:42:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9159557120. Throughput: 0: 42534.2. Samples: 9159643000. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 01:42:54,642][15401] Updated weights for policy 0, policy_version 559060 (0.0025) [2024-06-24 01:42:58,330][15401] Updated weights for policy 0, policy_version 559070 (0.0036) [2024-06-24 01:42:58,393][15132] Fps is (10 sec: 44220.1, 60 sec: 43414.9, 300 sec: 42653.4). Total num frames: 9159802880. Throughput: 0: 42597.0. Samples: 9159904320. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:42:58,394][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 01:43:02,129][15401] Updated weights for policy 0, policy_version 559080 (0.0036) [2024-06-24 01:43:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 9160015872. Throughput: 0: 42820.8. Samples: 9160158880. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:03,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 01:43:06,248][15401] Updated weights for policy 0, policy_version 559090 (0.0048) [2024-06-24 01:43:08,390][15132] Fps is (10 sec: 39334.1, 60 sec: 42051.8, 300 sec: 42542.8). Total num frames: 9160196096. Throughput: 0: 42646.1. Samples: 9160287260. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:08,391][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 01:43:09,960][15401] Updated weights for policy 0, policy_version 559100 (0.0028) [2024-06-24 01:43:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 9160425472. Throughput: 0: 42742.2. Samples: 9160549520. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:13,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 01:43:13,626][15401] Updated weights for policy 0, policy_version 559110 (0.0028) [2024-06-24 01:43:17,584][15401] Updated weights for policy 0, policy_version 559120 (0.0027) [2024-06-24 01:43:18,389][15132] Fps is (10 sec: 45877.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 9160654848. Throughput: 0: 42771.1. Samples: 9160801780. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 01:43:21,609][15401] Updated weights for policy 0, policy_version 559130 (0.0036) [2024-06-24 01:43:23,194][15349] Signal inference workers to stop experience collection... (135700 times) [2024-06-24 01:43:23,244][15401] InferenceWorker_p0-w0: stopping experience collection (135700 times) [2024-06-24 01:43:23,250][15349] Signal inference workers to resume experience collection... (135700 times) [2024-06-24 01:43:23,255][15401] InferenceWorker_p0-w0: resuming experience collection (135700 times) [2024-06-24 01:43:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 9160818688. Throughput: 0: 42683.5. Samples: 9160931500. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:23,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-24 01:43:25,222][15401] Updated weights for policy 0, policy_version 559140 (0.0037) [2024-06-24 01:43:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 9161080832. Throughput: 0: 42820.5. Samples: 9161191140. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 01:43:29,282][15401] Updated weights for policy 0, policy_version 559150 (0.0030) [2024-06-24 01:43:32,850][15401] Updated weights for policy 0, policy_version 559160 (0.0045) [2024-06-24 01:43:33,396][15132] Fps is (10 sec: 49120.6, 60 sec: 43413.0, 300 sec: 42764.1). Total num frames: 9161310208. Throughput: 0: 42881.5. Samples: 9161440600. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:33,396][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:43:37,021][15401] Updated weights for policy 0, policy_version 559170 (0.0037) [2024-06-24 01:43:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 9161457664. Throughput: 0: 42871.0. Samples: 9161572200. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 01:43:40,550][15401] Updated weights for policy 0, policy_version 559180 (0.0036) [2024-06-24 01:43:43,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9161703424. Throughput: 0: 42840.6. Samples: 9161831980. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 01:43:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559187_9161719808.pth... [2024-06-24 01:43:43,592][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558563_9151496192.pth [2024-06-24 01:43:44,652][15401] Updated weights for policy 0, policy_version 559190 (0.0049) [2024-06-24 01:43:48,171][15401] Updated weights for policy 0, policy_version 559200 (0.0029) [2024-06-24 01:43:48,389][15132] Fps is (10 sec: 49152.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9161949184. Throughput: 0: 42773.5. Samples: 9162083680. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-24 01:43:48,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 01:43:52,366][15401] Updated weights for policy 0, policy_version 559210 (0.0038) [2024-06-24 01:43:53,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9162113024. Throughput: 0: 42921.4. Samples: 9162218700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:43:53,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 01:43:55,726][15401] Updated weights for policy 0, policy_version 559220 (0.0032) [2024-06-24 01:43:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42328.0, 300 sec: 42598.4). Total num frames: 9162342400. Throughput: 0: 42701.8. Samples: 9162471100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:43:58,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 01:44:00,050][15401] Updated weights for policy 0, policy_version 559230 (0.0024) [2024-06-24 01:44:03,373][15401] Updated weights for policy 0, policy_version 559240 (0.0029) [2024-06-24 01:44:03,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9162588160. Throughput: 0: 42829.7. Samples: 9162729120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 01:44:07,621][15401] Updated weights for policy 0, policy_version 559250 (0.0036) [2024-06-24 01:44:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.8, 300 sec: 42598.4). Total num frames: 9162768384. Throughput: 0: 42819.9. Samples: 9162858400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 01:44:10,949][15401] Updated weights for policy 0, policy_version 559260 (0.0024) [2024-06-24 01:44:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9162981376. Throughput: 0: 42721.7. Samples: 9163113620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 01:44:15,740][15401] Updated weights for policy 0, policy_version 559270 (0.0040) [2024-06-24 01:44:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9163210752. Throughput: 0: 42805.6. Samples: 9163366580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 01:44:18,623][15401] Updated weights for policy 0, policy_version 559280 (0.0036) [2024-06-24 01:44:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9163390976. Throughput: 0: 42848.6. Samples: 9163500380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 01:44:23,400][15401] Updated weights for policy 0, policy_version 559290 (0.0029) [2024-06-24 01:44:26,587][15401] Updated weights for policy 0, policy_version 559300 (0.0030) [2024-06-24 01:44:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9163620352. Throughput: 0: 42592.2. Samples: 9163748640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 01:44:30,907][15401] Updated weights for policy 0, policy_version 559310 (0.0023) [2024-06-24 01:44:33,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42328.1, 300 sec: 42709.1). Total num frames: 9163849728. Throughput: 0: 42762.5. Samples: 9164008100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:33,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 01:44:34,272][15401] Updated weights for policy 0, policy_version 559320 (0.0045) [2024-06-24 01:44:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 9164046336. Throughput: 0: 42639.6. Samples: 9164137480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 01:44:38,723][15401] Updated weights for policy 0, policy_version 559330 (0.0030) [2024-06-24 01:44:41,733][15401] Updated weights for policy 0, policy_version 559340 (0.0029) [2024-06-24 01:44:42,077][15349] Signal inference workers to stop experience collection... (135750 times) [2024-06-24 01:44:42,077][15349] Signal inference workers to resume experience collection... (135750 times) [2024-06-24 01:44:42,097][15401] InferenceWorker_p0-w0: stopping experience collection (135750 times) [2024-06-24 01:44:42,097][15401] InferenceWorker_p0-w0: resuming experience collection (135750 times) [2024-06-24 01:44:43,389][15132] Fps is (10 sec: 44248.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9164292096. Throughput: 0: 42755.7. Samples: 9164395100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 01:44:46,355][15401] Updated weights for policy 0, policy_version 559350 (0.0026) [2024-06-24 01:44:48,393][15132] Fps is (10 sec: 44222.4, 60 sec: 42323.0, 300 sec: 42709.4). Total num frames: 9164488704. Throughput: 0: 42958.7. Samples: 9164662400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:48,393][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 01:44:49,246][15401] Updated weights for policy 0, policy_version 559360 (0.0032) [2024-06-24 01:44:53,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9164685312. Throughput: 0: 42814.8. Samples: 9164785060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 01:44:53,893][15401] Updated weights for policy 0, policy_version 559370 (0.0051) [2024-06-24 01:44:56,718][15401] Updated weights for policy 0, policy_version 559380 (0.0029) [2024-06-24 01:44:58,392][15132] Fps is (10 sec: 45879.0, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 9164947456. Throughput: 0: 42943.1. Samples: 9165046160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:44:58,392][15132] Avg episode reward: [(0, '0.912')] [2024-06-24 01:45:01,300][15401] Updated weights for policy 0, policy_version 559390 (0.0027) [2024-06-24 01:45:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9165144064. Throughput: 0: 43181.7. Samples: 9165309760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:45:03,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 01:45:04,294][15401] Updated weights for policy 0, policy_version 559400 (0.0047) [2024-06-24 01:45:08,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9165340672. Throughput: 0: 42992.8. Samples: 9165435060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:45:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 01:45:08,837][15401] Updated weights for policy 0, policy_version 559410 (0.0044) [2024-06-24 01:45:11,780][15401] Updated weights for policy 0, policy_version 559420 (0.0029) [2024-06-24 01:45:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 9165602816. Throughput: 0: 43245.8. Samples: 9165694800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:45:13,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 01:45:16,250][15401] Updated weights for policy 0, policy_version 559430 (0.0038) [2024-06-24 01:45:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9165783040. Throughput: 0: 43383.6. Samples: 9165960260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-24 01:45:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 01:45:19,676][15401] Updated weights for policy 0, policy_version 559440 (0.0023) [2024-06-24 01:45:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 9166012416. Throughput: 0: 43230.1. Samples: 9166082840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:23,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 01:45:23,748][15401] Updated weights for policy 0, policy_version 559450 (0.0033) [2024-06-24 01:45:27,457][15401] Updated weights for policy 0, policy_version 559460 (0.0028) [2024-06-24 01:45:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 9166225408. Throughput: 0: 43285.2. Samples: 9166342940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:28,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-24 01:45:31,474][15401] Updated weights for policy 0, policy_version 559470 (0.0034) [2024-06-24 01:45:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42654.6). Total num frames: 9166422016. Throughput: 0: 43141.3. Samples: 9166603620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 01:45:35,152][15401] Updated weights for policy 0, policy_version 559480 (0.0032) [2024-06-24 01:45:38,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42708.5). Total num frames: 9166635008. Throughput: 0: 43041.8. Samples: 9166722220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:38,397][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 01:45:39,190][15401] Updated weights for policy 0, policy_version 559490 (0.0031) [2024-06-24 01:45:42,767][15401] Updated weights for policy 0, policy_version 559500 (0.0025) [2024-06-24 01:45:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9166864384. Throughput: 0: 42946.8. Samples: 9166978660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 01:45:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559502_9166880768.pth... [2024-06-24 01:45:43,607][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000558872_9156558848.pth [2024-06-24 01:45:46,861][15401] Updated weights for policy 0, policy_version 559510 (0.0026) [2024-06-24 01:45:48,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42600.7, 300 sec: 42598.4). Total num frames: 9167044608. Throughput: 0: 42900.0. Samples: 9167240260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 01:45:50,273][15401] Updated weights for policy 0, policy_version 559520 (0.0036) [2024-06-24 01:45:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9167290368. Throughput: 0: 42736.4. Samples: 9167358200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 01:45:54,777][15401] Updated weights for policy 0, policy_version 559530 (0.0040) [2024-06-24 01:45:56,937][15349] Signal inference workers to stop experience collection... (135800 times) [2024-06-24 01:45:56,937][15349] Signal inference workers to resume experience collection... (135800 times) [2024-06-24 01:45:56,972][15401] InferenceWorker_p0-w0: stopping experience collection (135800 times) [2024-06-24 01:45:56,972][15401] InferenceWorker_p0-w0: resuming experience collection (135800 times) [2024-06-24 01:45:57,983][15401] Updated weights for policy 0, policy_version 559540 (0.0041) [2024-06-24 01:45:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 9167519744. Throughput: 0: 42822.2. Samples: 9167621700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:45:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 01:46:02,219][15401] Updated weights for policy 0, policy_version 559550 (0.0024) [2024-06-24 01:46:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9167683584. Throughput: 0: 42775.3. Samples: 9167885140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 01:46:05,614][15401] Updated weights for policy 0, policy_version 559560 (0.0033) [2024-06-24 01:46:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9167929344. Throughput: 0: 42709.4. Samples: 9168004760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:08,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 01:46:09,999][15401] Updated weights for policy 0, policy_version 559570 (0.0030) [2024-06-24 01:46:13,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42326.9, 300 sec: 42931.6). Total num frames: 9168142336. Throughput: 0: 42776.7. Samples: 9168267900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 01:46:13,542][15401] Updated weights for policy 0, policy_version 559580 (0.0033) [2024-06-24 01:46:17,727][15401] Updated weights for policy 0, policy_version 559590 (0.0029) [2024-06-24 01:46:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9168338944. Throughput: 0: 42598.3. Samples: 9168520540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 01:46:21,201][15401] Updated weights for policy 0, policy_version 559600 (0.0029) [2024-06-24 01:46:23,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9168584704. Throughput: 0: 42748.7. Samples: 9168645640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 01:46:25,211][15401] Updated weights for policy 0, policy_version 559610 (0.0031) [2024-06-24 01:46:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9168781312. Throughput: 0: 42900.8. Samples: 9168909200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 01:46:28,740][15401] Updated weights for policy 0, policy_version 559620 (0.0045) [2024-06-24 01:46:33,088][15401] Updated weights for policy 0, policy_version 559630 (0.0033) [2024-06-24 01:46:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9168977920. Throughput: 0: 42736.9. Samples: 9169163420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 01:46:36,352][15401] Updated weights for policy 0, policy_version 559640 (0.0042) [2024-06-24 01:46:38,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42871.5, 300 sec: 42930.7). Total num frames: 9169207296. Throughput: 0: 42897.5. Samples: 9169288860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:38,396][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 01:46:40,542][15401] Updated weights for policy 0, policy_version 559650 (0.0034) [2024-06-24 01:46:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9169403904. Throughput: 0: 42762.7. Samples: 9169546020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 01:46:43,982][15401] Updated weights for policy 0, policy_version 559660 (0.0034) [2024-06-24 01:46:48,114][15401] Updated weights for policy 0, policy_version 559670 (0.0032) [2024-06-24 01:46:48,390][15132] Fps is (10 sec: 42625.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9169633280. Throughput: 0: 42648.8. Samples: 9169804340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 01:46:51,583][15401] Updated weights for policy 0, policy_version 559680 (0.0038) [2024-06-24 01:46:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 9169846272. Throughput: 0: 42818.2. Samples: 9169931680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 01:46:53,393][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 01:46:55,766][15401] Updated weights for policy 0, policy_version 559690 (0.0043) [2024-06-24 01:46:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9170042880. Throughput: 0: 42515.2. Samples: 9170181080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:46:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 01:46:59,414][15401] Updated weights for policy 0, policy_version 559700 (0.0033) [2024-06-24 01:47:03,276][15401] Updated weights for policy 0, policy_version 559710 (0.0033) [2024-06-24 01:47:03,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 9170288640. Throughput: 0: 42636.4. Samples: 9170439180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 01:47:07,001][15401] Updated weights for policy 0, policy_version 559720 (0.0046) [2024-06-24 01:47:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9170485248. Throughput: 0: 42737.8. Samples: 9170568840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 01:47:10,603][15349] Signal inference workers to stop experience collection... (135850 times) [2024-06-24 01:47:10,603][15349] Signal inference workers to resume experience collection... (135850 times) [2024-06-24 01:47:10,633][15401] InferenceWorker_p0-w0: stopping experience collection (135850 times) [2024-06-24 01:47:10,633][15401] InferenceWorker_p0-w0: resuming experience collection (135850 times) [2024-06-24 01:47:10,747][15401] Updated weights for policy 0, policy_version 559730 (0.0036) [2024-06-24 01:47:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 9170698240. Throughput: 0: 42601.3. Samples: 9170826260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:13,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 01:47:14,674][15401] Updated weights for policy 0, policy_version 559740 (0.0029) [2024-06-24 01:47:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9170927616. Throughput: 0: 42443.1. Samples: 9171073360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 01:47:18,482][15401] Updated weights for policy 0, policy_version 559750 (0.0041) [2024-06-24 01:47:22,670][15401] Updated weights for policy 0, policy_version 559760 (0.0027) [2024-06-24 01:47:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9171140608. Throughput: 0: 42571.8. Samples: 9171204320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 01:47:26,263][15401] Updated weights for policy 0, policy_version 559770 (0.0023) [2024-06-24 01:47:28,394][15132] Fps is (10 sec: 39301.9, 60 sec: 42321.8, 300 sec: 42764.3). Total num frames: 9171320832. Throughput: 0: 42505.5. Samples: 9171458980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:28,395][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 01:47:30,106][15401] Updated weights for policy 0, policy_version 559780 (0.0040) [2024-06-24 01:47:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9171566592. Throughput: 0: 42448.3. Samples: 9171714520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:47:33,796][15401] Updated weights for policy 0, policy_version 559790 (0.0029) [2024-06-24 01:47:37,899][15401] Updated weights for policy 0, policy_version 559800 (0.0041) [2024-06-24 01:47:38,390][15132] Fps is (10 sec: 45898.1, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 9171779584. Throughput: 0: 42585.0. Samples: 9171847900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 01:47:41,769][15401] Updated weights for policy 0, policy_version 559810 (0.0045) [2024-06-24 01:47:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9171976192. Throughput: 0: 42577.7. Samples: 9172097080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:43,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-24 01:47:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559813_9171976192.pth... [2024-06-24 01:47:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559187_9161719808.pth [2024-06-24 01:47:45,587][15401] Updated weights for policy 0, policy_version 559820 (0.0047) [2024-06-24 01:47:48,390][15132] Fps is (10 sec: 40957.3, 60 sec: 42597.9, 300 sec: 42820.5). Total num frames: 9172189184. Throughput: 0: 42527.4. Samples: 9172352940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:48,391][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 01:47:49,605][15401] Updated weights for policy 0, policy_version 559830 (0.0034) [2024-06-24 01:47:53,360][15401] Updated weights for policy 0, policy_version 559840 (0.0031) [2024-06-24 01:47:53,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42873.2, 300 sec: 42765.6). Total num frames: 9172418560. Throughput: 0: 42668.0. Samples: 9172488900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 01:47:57,166][15401] Updated weights for policy 0, policy_version 559850 (0.0038) [2024-06-24 01:47:58,390][15132] Fps is (10 sec: 42601.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9172615168. Throughput: 0: 42547.2. Samples: 9172740880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:47:58,394][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 01:48:00,933][15401] Updated weights for policy 0, policy_version 559860 (0.0037) [2024-06-24 01:48:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 9172844544. Throughput: 0: 42707.9. Samples: 9172995320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:48:03,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 01:48:04,807][15401] Updated weights for policy 0, policy_version 559870 (0.0028) [2024-06-24 01:48:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9173057536. Throughput: 0: 42704.4. Samples: 9173126020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:48:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 01:48:08,495][15401] Updated weights for policy 0, policy_version 559880 (0.0039) [2024-06-24 01:48:12,886][15401] Updated weights for policy 0, policy_version 559890 (0.0028) [2024-06-24 01:48:13,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9173254144. Throughput: 0: 42682.9. Samples: 9173379500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:48:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 01:48:16,516][15401] Updated weights for policy 0, policy_version 559900 (0.0033) [2024-06-24 01:48:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9173483520. Throughput: 0: 42637.5. Samples: 9173633200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:48:18,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 01:48:20,456][15401] Updated weights for policy 0, policy_version 559910 (0.0039) [2024-06-24 01:48:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9173663744. Throughput: 0: 42495.9. Samples: 9173760220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 01:48:23,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-24 01:48:24,352][15401] Updated weights for policy 0, policy_version 559920 (0.0044) [2024-06-24 01:48:24,361][15349] Signal inference workers to stop experience collection... (135900 times) [2024-06-24 01:48:24,361][15349] Signal inference workers to resume experience collection... (135900 times) [2024-06-24 01:48:24,408][15401] InferenceWorker_p0-w0: stopping experience collection (135900 times) [2024-06-24 01:48:24,408][15401] InferenceWorker_p0-w0: resuming experience collection (135900 times) [2024-06-24 01:48:28,008][15401] Updated weights for policy 0, policy_version 559930 (0.0031) [2024-06-24 01:48:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42875.0, 300 sec: 42654.8). Total num frames: 9173893120. Throughput: 0: 42555.2. Samples: 9174012060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 01:48:31,914][15401] Updated weights for policy 0, policy_version 559940 (0.0039) [2024-06-24 01:48:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 9174122496. Throughput: 0: 42634.0. Samples: 9174271440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 01:48:35,614][15401] Updated weights for policy 0, policy_version 559950 (0.0035) [2024-06-24 01:48:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9174319104. Throughput: 0: 42472.8. Samples: 9174400180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:38,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 01:48:39,538][15401] Updated weights for policy 0, policy_version 559960 (0.0027) [2024-06-24 01:48:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9174532096. Throughput: 0: 42529.4. Samples: 9174654700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 01:48:43,420][15401] Updated weights for policy 0, policy_version 559970 (0.0029) [2024-06-24 01:48:47,343][15401] Updated weights for policy 0, policy_version 559980 (0.0030) [2024-06-24 01:48:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.8, 300 sec: 42765.0). Total num frames: 9174728704. Throughput: 0: 42588.1. Samples: 9174911680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:48,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-24 01:48:51,043][15401] Updated weights for policy 0, policy_version 559990 (0.0030) [2024-06-24 01:48:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9174941696. Throughput: 0: 42388.1. Samples: 9175033480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 01:48:55,148][15401] Updated weights for policy 0, policy_version 560000 (0.0027) [2024-06-24 01:48:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9175187456. Throughput: 0: 42432.0. Samples: 9175288940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:48:58,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 01:48:58,566][15401] Updated weights for policy 0, policy_version 560010 (0.0030) [2024-06-24 01:49:03,253][15401] Updated weights for policy 0, policy_version 560020 (0.0036) [2024-06-24 01:49:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 9175367680. Throughput: 0: 42620.7. Samples: 9175551140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 01:49:06,015][15401] Updated weights for policy 0, policy_version 560030 (0.0038) [2024-06-24 01:49:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9175580672. Throughput: 0: 42329.4. Samples: 9175665040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 01:49:10,632][15401] Updated weights for policy 0, policy_version 560040 (0.0028) [2024-06-24 01:49:13,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9175842816. Throughput: 0: 42773.0. Samples: 9175936840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 01:49:13,559][15401] Updated weights for policy 0, policy_version 560050 (0.0034) [2024-06-24 01:49:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 9176006656. Throughput: 0: 42736.5. Samples: 9176194580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 01:49:18,619][15401] Updated weights for policy 0, policy_version 560060 (0.0032) [2024-06-24 01:49:21,087][15401] Updated weights for policy 0, policy_version 560070 (0.0022) [2024-06-24 01:49:23,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9176203264. Throughput: 0: 42510.6. Samples: 9176313160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 01:49:25,992][15401] Updated weights for policy 0, policy_version 560080 (0.0045) [2024-06-24 01:49:28,392][15132] Fps is (10 sec: 49141.7, 60 sec: 43416.2, 300 sec: 42876.1). Total num frames: 9176498176. Throughput: 0: 42814.0. Samples: 9176581420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:28,392][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 01:49:28,571][15401] Updated weights for policy 0, policy_version 560090 (0.0029) [2024-06-24 01:49:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9176662016. Throughput: 0: 42892.5. Samples: 9176841840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 01:49:33,634][15401] Updated weights for policy 0, policy_version 560100 (0.0038) [2024-06-24 01:49:36,059][15401] Updated weights for policy 0, policy_version 560110 (0.0033) [2024-06-24 01:49:38,390][15132] Fps is (10 sec: 36052.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9176858624. Throughput: 0: 42820.8. Samples: 9176960420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 01:49:41,021][15401] Updated weights for policy 0, policy_version 560120 (0.0021) [2024-06-24 01:49:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42821.0). Total num frames: 9177120768. Throughput: 0: 43091.2. Samples: 9177228040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 01:49:43,480][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560128_9177137152.pth... [2024-06-24 01:49:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559502_9166880768.pth [2024-06-24 01:49:44,064][15401] Updated weights for policy 0, policy_version 560130 (0.0030) [2024-06-24 01:49:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9177317376. Throughput: 0: 43046.4. Samples: 9177488220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 01:49:48,506][15401] Updated weights for policy 0, policy_version 560140 (0.0027) [2024-06-24 01:49:50,852][15349] Signal inference workers to stop experience collection... (135950 times) [2024-06-24 01:49:50,852][15349] Signal inference workers to resume experience collection... (135950 times) [2024-06-24 01:49:50,866][15401] InferenceWorker_p0-w0: stopping experience collection (135950 times) [2024-06-24 01:49:50,872][15401] InferenceWorker_p0-w0: resuming experience collection (135950 times) [2024-06-24 01:49:51,640][15401] Updated weights for policy 0, policy_version 560150 (0.0031) [2024-06-24 01:49:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 9177530368. Throughput: 0: 43253.3. Samples: 9177611440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 01:49:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 01:49:56,179][15401] Updated weights for policy 0, policy_version 560160 (0.0037) [2024-06-24 01:49:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9177759744. Throughput: 0: 42911.2. Samples: 9177867840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:49:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 01:49:59,207][15401] Updated weights for policy 0, policy_version 560170 (0.0033) [2024-06-24 01:50:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9177956352. Throughput: 0: 42960.9. Samples: 9178127820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 01:50:03,812][15401] Updated weights for policy 0, policy_version 560180 (0.0031) [2024-06-24 01:50:07,327][15401] Updated weights for policy 0, policy_version 560190 (0.0031) [2024-06-24 01:50:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 9178169344. Throughput: 0: 43020.4. Samples: 9178249080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 01:50:11,239][15401] Updated weights for policy 0, policy_version 560200 (0.0046) [2024-06-24 01:50:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9178382336. Throughput: 0: 42738.4. Samples: 9178504560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 01:50:14,895][15401] Updated weights for policy 0, policy_version 560210 (0.0030) [2024-06-24 01:50:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9178578944. Throughput: 0: 42807.9. Samples: 9178768200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 01:50:19,227][15401] Updated weights for policy 0, policy_version 560220 (0.0033) [2024-06-24 01:50:22,551][15401] Updated weights for policy 0, policy_version 560230 (0.0029) [2024-06-24 01:50:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 9178808320. Throughput: 0: 42741.8. Samples: 9178883800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 01:50:27,128][15401] Updated weights for policy 0, policy_version 560240 (0.0028) [2024-06-24 01:50:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42053.8, 300 sec: 42709.5). Total num frames: 9179021312. Throughput: 0: 42515.5. Samples: 9179141240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 01:50:30,498][15401] Updated weights for policy 0, policy_version 560250 (0.0046) [2024-06-24 01:50:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 9179201536. Throughput: 0: 42659.1. Samples: 9179407880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:50:34,529][15401] Updated weights for policy 0, policy_version 560260 (0.0040) [2024-06-24 01:50:38,067][15401] Updated weights for policy 0, policy_version 560270 (0.0031) [2024-06-24 01:50:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 9179463680. Throughput: 0: 42508.6. Samples: 9179524320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 01:50:42,050][15401] Updated weights for policy 0, policy_version 560280 (0.0031) [2024-06-24 01:50:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9179676672. Throughput: 0: 42624.0. Samples: 9179785920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 01:50:45,593][15401] Updated weights for policy 0, policy_version 560290 (0.0028) [2024-06-24 01:50:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9179856896. Throughput: 0: 42806.3. Samples: 9180054100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 01:50:49,904][15401] Updated weights for policy 0, policy_version 560300 (0.0033) [2024-06-24 01:50:53,377][15401] Updated weights for policy 0, policy_version 560310 (0.0039) [2024-06-24 01:50:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9180119040. Throughput: 0: 42767.6. Samples: 9180173620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 01:50:57,693][15401] Updated weights for policy 0, policy_version 560320 (0.0037) [2024-06-24 01:50:58,005][15349] Signal inference workers to stop experience collection... (136000 times) [2024-06-24 01:50:58,053][15401] InferenceWorker_p0-w0: stopping experience collection (136000 times) [2024-06-24 01:50:58,060][15349] Signal inference workers to resume experience collection... (136000 times) [2024-06-24 01:50:58,066][15401] InferenceWorker_p0-w0: resuming experience collection (136000 times) [2024-06-24 01:50:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9180315648. Throughput: 0: 42684.1. Samples: 9180425340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:50:58,391][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 01:51:01,405][15401] Updated weights for policy 0, policy_version 560330 (0.0029) [2024-06-24 01:51:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9180495872. Throughput: 0: 42611.6. Samples: 9180685720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:51:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:51:05,376][15401] Updated weights for policy 0, policy_version 560340 (0.0019) [2024-06-24 01:51:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9180741632. Throughput: 0: 42728.7. Samples: 9180806600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:51:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 01:51:09,409][15401] Updated weights for policy 0, policy_version 560350 (0.0031) [2024-06-24 01:51:13,167][15401] Updated weights for policy 0, policy_version 560360 (0.0035) [2024-06-24 01:51:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9180938240. Throughput: 0: 42852.4. Samples: 9181069600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:51:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 01:51:17,116][15401] Updated weights for policy 0, policy_version 560370 (0.0042) [2024-06-24 01:51:18,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9181134848. Throughput: 0: 42594.1. Samples: 9181324620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:51:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 01:51:20,834][15401] Updated weights for policy 0, policy_version 560380 (0.0029) [2024-06-24 01:51:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9181396992. Throughput: 0: 42687.4. Samples: 9181445260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 01:51:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 01:51:24,851][15401] Updated weights for policy 0, policy_version 560390 (0.0040) [2024-06-24 01:51:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9181577216. Throughput: 0: 42658.5. Samples: 9181705560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 01:51:28,466][15401] Updated weights for policy 0, policy_version 560400 (0.0033) [2024-06-24 01:51:32,626][15401] Updated weights for policy 0, policy_version 560410 (0.0036) [2024-06-24 01:51:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 9181773824. Throughput: 0: 42353.8. Samples: 9181960020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 01:51:36,103][15401] Updated weights for policy 0, policy_version 560420 (0.0042) [2024-06-24 01:51:38,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9182019584. Throughput: 0: 42409.4. Samples: 9182082140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:38,401][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 01:51:40,327][15401] Updated weights for policy 0, policy_version 560430 (0.0028) [2024-06-24 01:51:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9182216192. Throughput: 0: 42715.0. Samples: 9182347520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 01:51:43,503][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560439_9182232576.pth... [2024-06-24 01:51:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000559813_9171976192.pth [2024-06-24 01:51:43,931][15401] Updated weights for policy 0, policy_version 560440 (0.0037) [2024-06-24 01:51:47,995][15401] Updated weights for policy 0, policy_version 560450 (0.0032) [2024-06-24 01:51:48,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.7, 300 sec: 42653.9). Total num frames: 9182429184. Throughput: 0: 42548.8. Samples: 9182600520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:48,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 01:51:51,535][15401] Updated weights for policy 0, policy_version 560460 (0.0028) [2024-06-24 01:51:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9182674944. Throughput: 0: 42686.3. Samples: 9182727480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 01:51:55,679][15401] Updated weights for policy 0, policy_version 560470 (0.0032) [2024-06-24 01:51:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9182855168. Throughput: 0: 42494.7. Samples: 9182981860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:51:58,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 01:51:59,100][15401] Updated weights for policy 0, policy_version 560480 (0.0038) [2024-06-24 01:52:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9183051776. Throughput: 0: 42625.7. Samples: 9183242780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 01:52:03,487][15401] Updated weights for policy 0, policy_version 560490 (0.0037) [2024-06-24 01:52:06,667][15401] Updated weights for policy 0, policy_version 560500 (0.0028) [2024-06-24 01:52:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9183313920. Throughput: 0: 42753.5. Samples: 9183369160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 01:52:11,078][15401] Updated weights for policy 0, policy_version 560510 (0.0042) [2024-06-24 01:52:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9183477760. Throughput: 0: 42582.7. Samples: 9183621780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 01:52:14,535][15401] Updated weights for policy 0, policy_version 560520 (0.0037) [2024-06-24 01:52:14,981][15349] Signal inference workers to stop experience collection... (136050 times) [2024-06-24 01:52:14,982][15349] Signal inference workers to resume experience collection... (136050 times) [2024-06-24 01:52:15,002][15401] InferenceWorker_p0-w0: stopping experience collection (136050 times) [2024-06-24 01:52:15,002][15401] InferenceWorker_p0-w0: resuming experience collection (136050 times) [2024-06-24 01:52:18,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9183690752. Throughput: 0: 42600.8. Samples: 9183877060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 01:52:18,903][15401] Updated weights for policy 0, policy_version 560530 (0.0029) [2024-06-24 01:52:22,297][15401] Updated weights for policy 0, policy_version 560540 (0.0046) [2024-06-24 01:52:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42765.7). Total num frames: 9183936512. Throughput: 0: 42751.2. Samples: 9184005840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 01:52:26,456][15401] Updated weights for policy 0, policy_version 560550 (0.0036) [2024-06-24 01:52:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9184133120. Throughput: 0: 42689.5. Samples: 9184268540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 01:52:29,745][15401] Updated weights for policy 0, policy_version 560560 (0.0036) [2024-06-24 01:52:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9184346112. Throughput: 0: 42778.3. Samples: 9184525440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 01:52:34,323][15401] Updated weights for policy 0, policy_version 560570 (0.0027) [2024-06-24 01:52:37,353][15401] Updated weights for policy 0, policy_version 560580 (0.0040) [2024-06-24 01:52:38,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42871.4, 300 sec: 42764.7). Total num frames: 9184591872. Throughput: 0: 42804.0. Samples: 9184653760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:38,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 01:52:41,859][15401] Updated weights for policy 0, policy_version 560590 (0.0034) [2024-06-24 01:52:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.6). Total num frames: 9184788480. Throughput: 0: 42946.6. Samples: 9184914460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 01:52:44,912][15401] Updated weights for policy 0, policy_version 560600 (0.0029) [2024-06-24 01:52:48,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 9184968704. Throughput: 0: 42808.6. Samples: 9185169160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:48,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-24 01:52:49,398][15401] Updated weights for policy 0, policy_version 560610 (0.0043) [2024-06-24 01:52:52,525][15401] Updated weights for policy 0, policy_version 560620 (0.0040) [2024-06-24 01:52:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9185247232. Throughput: 0: 42767.9. Samples: 9185293720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-24 01:52:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 01:52:57,155][15401] Updated weights for policy 0, policy_version 560630 (0.0031) [2024-06-24 01:52:58,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.8, 300 sec: 42597.8). Total num frames: 9185411072. Throughput: 0: 42858.4. Samples: 9185550680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:52:58,396][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 01:53:00,351][15401] Updated weights for policy 0, policy_version 560640 (0.0032) [2024-06-24 01:53:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9185624064. Throughput: 0: 42925.3. Samples: 9185808700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 01:53:04,855][15401] Updated weights for policy 0, policy_version 560650 (0.0023) [2024-06-24 01:53:07,723][15401] Updated weights for policy 0, policy_version 560660 (0.0041) [2024-06-24 01:53:08,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9185853440. Throughput: 0: 42891.5. Samples: 9185935960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 01:53:12,401][15401] Updated weights for policy 0, policy_version 560670 (0.0034) [2024-06-24 01:53:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9186050048. Throughput: 0: 42902.2. Samples: 9186199140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 01:53:15,563][15401] Updated weights for policy 0, policy_version 560680 (0.0036) [2024-06-24 01:53:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9186279424. Throughput: 0: 42751.0. Samples: 9186449240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 01:53:19,854][15401] Updated weights for policy 0, policy_version 560690 (0.0041) [2024-06-24 01:53:20,654][15349] Signal inference workers to stop experience collection... (136100 times) [2024-06-24 01:53:20,697][15401] InferenceWorker_p0-w0: stopping experience collection (136100 times) [2024-06-24 01:53:20,768][15349] Signal inference workers to resume experience collection... (136100 times) [2024-06-24 01:53:20,769][15401] InferenceWorker_p0-w0: resuming experience collection (136100 times) [2024-06-24 01:53:23,341][15401] Updated weights for policy 0, policy_version 560700 (0.0037) [2024-06-24 01:53:23,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9186508800. Throughput: 0: 42782.8. Samples: 9186578880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 01:53:28,017][15401] Updated weights for policy 0, policy_version 560710 (0.0042) [2024-06-24 01:53:28,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 9186689024. Throughput: 0: 42703.1. Samples: 9186836200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:28,392][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 01:53:30,957][15401] Updated weights for policy 0, policy_version 560720 (0.0030) [2024-06-24 01:53:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9186934784. Throughput: 0: 42603.1. Samples: 9187086300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 01:53:35,804][15401] Updated weights for policy 0, policy_version 560730 (0.0041) [2024-06-24 01:53:38,389][15132] Fps is (10 sec: 45886.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9187147776. Throughput: 0: 42840.9. Samples: 9187221560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 01:53:38,446][15401] Updated weights for policy 0, policy_version 560740 (0.0028) [2024-06-24 01:53:43,232][15401] Updated weights for policy 0, policy_version 560750 (0.0027) [2024-06-24 01:53:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9187344384. Throughput: 0: 42830.0. Samples: 9187477760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 01:53:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560751_9187344384.pth... [2024-06-24 01:53:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560128_9177137152.pth [2024-06-24 01:53:46,466][15401] Updated weights for policy 0, policy_version 560760 (0.0035) [2024-06-24 01:53:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 9187590144. Throughput: 0: 42692.7. Samples: 9187729860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 01:53:50,650][15401] Updated weights for policy 0, policy_version 560770 (0.0024) [2024-06-24 01:53:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9187786752. Throughput: 0: 42934.2. Samples: 9187868000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 01:53:54,143][15401] Updated weights for policy 0, policy_version 560780 (0.0025) [2024-06-24 01:53:58,194][15401] Updated weights for policy 0, policy_version 560790 (0.0026) [2024-06-24 01:53:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 9187983360. Throughput: 0: 42882.9. Samples: 9188128880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:53:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 01:54:01,690][15401] Updated weights for policy 0, policy_version 560800 (0.0027) [2024-06-24 01:54:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 9188245504. Throughput: 0: 42831.6. Samples: 9188376660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:03,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-24 01:54:05,586][15401] Updated weights for policy 0, policy_version 560810 (0.0039) [2024-06-24 01:54:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9188442112. Throughput: 0: 42884.4. Samples: 9188508680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:08,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 01:54:09,227][15401] Updated weights for policy 0, policy_version 560820 (0.0036) [2024-06-24 01:54:13,135][15401] Updated weights for policy 0, policy_version 560830 (0.0036) [2024-06-24 01:54:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9188638720. Throughput: 0: 43011.9. Samples: 9188771640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 01:54:16,900][15401] Updated weights for policy 0, policy_version 560840 (0.0037) [2024-06-24 01:54:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9188884480. Throughput: 0: 42988.9. Samples: 9189020800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:18,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-24 01:54:20,707][15401] Updated weights for policy 0, policy_version 560850 (0.0041) [2024-06-24 01:54:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 9189081088. Throughput: 0: 42920.4. Samples: 9189152980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 01:54:24,433][15401] Updated weights for policy 0, policy_version 560860 (0.0027) [2024-06-24 01:54:28,145][15349] Signal inference workers to stop experience collection... (136150 times) [2024-06-24 01:54:28,197][15349] Signal inference workers to resume experience collection... (136150 times) [2024-06-24 01:54:28,197][15401] InferenceWorker_p0-w0: stopping experience collection (136150 times) [2024-06-24 01:54:28,217][15401] InferenceWorker_p0-w0: resuming experience collection (136150 times) [2024-06-24 01:54:28,333][15401] Updated weights for policy 0, policy_version 560870 (0.0033) [2024-06-24 01:54:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43419.2, 300 sec: 42820.5). Total num frames: 9189294080. Throughput: 0: 42862.7. Samples: 9189406580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 01:54:28,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 01:54:32,011][15401] Updated weights for policy 0, policy_version 560880 (0.0030) [2024-06-24 01:54:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9189507072. Throughput: 0: 42995.0. Samples: 9189664640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:33,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 01:54:36,005][15401] Updated weights for policy 0, policy_version 560890 (0.0031) [2024-06-24 01:54:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9189703680. Throughput: 0: 42835.7. Samples: 9189795600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 01:54:39,645][15401] Updated weights for policy 0, policy_version 560900 (0.0028) [2024-06-24 01:54:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9189933056. Throughput: 0: 42704.9. Samples: 9190050600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 01:54:43,622][15401] Updated weights for policy 0, policy_version 560910 (0.0031) [2024-06-24 01:54:47,280][15401] Updated weights for policy 0, policy_version 560920 (0.0034) [2024-06-24 01:54:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 9190146048. Throughput: 0: 42808.0. Samples: 9190303120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:48,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 01:54:51,325][15401] Updated weights for policy 0, policy_version 560930 (0.0051) [2024-06-24 01:54:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9190342656. Throughput: 0: 42797.8. Samples: 9190434580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 01:54:54,913][15401] Updated weights for policy 0, policy_version 560940 (0.0040) [2024-06-24 01:54:58,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9190555648. Throughput: 0: 42651.2. Samples: 9190690940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:54:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 01:54:58,939][15401] Updated weights for policy 0, policy_version 560950 (0.0032) [2024-06-24 01:55:02,466][15401] Updated weights for policy 0, policy_version 560960 (0.0031) [2024-06-24 01:55:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9190785024. Throughput: 0: 42788.9. Samples: 9190946300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 01:55:06,721][15401] Updated weights for policy 0, policy_version 560970 (0.0038) [2024-06-24 01:55:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9190998016. Throughput: 0: 42842.3. Samples: 9191080880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 01:55:10,281][15401] Updated weights for policy 0, policy_version 560980 (0.0037) [2024-06-24 01:55:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9191211008. Throughput: 0: 42878.3. Samples: 9191336100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 01:55:14,242][15401] Updated weights for policy 0, policy_version 560990 (0.0028) [2024-06-24 01:55:18,185][15401] Updated weights for policy 0, policy_version 561000 (0.0038) [2024-06-24 01:55:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9191424000. Throughput: 0: 42700.0. Samples: 9191586140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 01:55:22,238][15401] Updated weights for policy 0, policy_version 561010 (0.0037) [2024-06-24 01:55:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9191620608. Throughput: 0: 42612.0. Samples: 9191713140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 01:55:25,874][15401] Updated weights for policy 0, policy_version 561020 (0.0035) [2024-06-24 01:55:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9191849984. Throughput: 0: 42622.4. Samples: 9191968600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 01:55:30,080][15401] Updated weights for policy 0, policy_version 561030 (0.0027) [2024-06-24 01:55:33,242][15401] Updated weights for policy 0, policy_version 561040 (0.0029) [2024-06-24 01:55:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9192079360. Throughput: 0: 42813.9. Samples: 9192229640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 01:55:37,504][15401] Updated weights for policy 0, policy_version 561050 (0.0023) [2024-06-24 01:55:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9192275968. Throughput: 0: 42692.0. Samples: 9192355720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 01:55:40,681][15401] Updated weights for policy 0, policy_version 561060 (0.0034) [2024-06-24 01:55:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9192505344. Throughput: 0: 42808.1. Samples: 9192617300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 01:55:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561066_9192505344.pth... [2024-06-24 01:55:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560439_9182232576.pth [2024-06-24 01:55:45,046][15401] Updated weights for policy 0, policy_version 561070 (0.0035) [2024-06-24 01:55:48,368][15401] Updated weights for policy 0, policy_version 561080 (0.0053) [2024-06-24 01:55:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 9192734720. Throughput: 0: 42855.9. Samples: 9192874820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:48,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 01:55:52,492][15401] Updated weights for policy 0, policy_version 561090 (0.0032) [2024-06-24 01:55:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9192914944. Throughput: 0: 42797.2. Samples: 9193006760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 01:55:55,975][15401] Updated weights for policy 0, policy_version 561100 (0.0025) [2024-06-24 01:55:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9193144320. Throughput: 0: 42865.3. Samples: 9193265040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 01:55:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 01:56:00,619][15401] Updated weights for policy 0, policy_version 561110 (0.0026) [2024-06-24 01:56:01,265][15349] Signal inference workers to stop experience collection... (136200 times) [2024-06-24 01:56:01,265][15349] Signal inference workers to resume experience collection... (136200 times) [2024-06-24 01:56:01,305][15401] InferenceWorker_p0-w0: stopping experience collection (136200 times) [2024-06-24 01:56:01,305][15401] InferenceWorker_p0-w0: resuming experience collection (136200 times) [2024-06-24 01:56:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9193357312. Throughput: 0: 42926.5. Samples: 9193517840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 01:56:03,960][15401] Updated weights for policy 0, policy_version 561120 (0.0029) [2024-06-24 01:56:08,208][15401] Updated weights for policy 0, policy_version 561130 (0.0043) [2024-06-24 01:56:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9193570304. Throughput: 0: 43035.1. Samples: 9193649720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 01:56:11,771][15401] Updated weights for policy 0, policy_version 561140 (0.0035) [2024-06-24 01:56:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9193766912. Throughput: 0: 43083.5. Samples: 9193907360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:13,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-24 01:56:15,583][15401] Updated weights for policy 0, policy_version 561150 (0.0026) [2024-06-24 01:56:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9193996288. Throughput: 0: 42940.9. Samples: 9194161980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 01:56:19,274][15401] Updated weights for policy 0, policy_version 561160 (0.0037) [2024-06-24 01:56:23,048][15401] Updated weights for policy 0, policy_version 561170 (0.0042) [2024-06-24 01:56:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9194209280. Throughput: 0: 42967.5. Samples: 9194289260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 01:56:27,281][15401] Updated weights for policy 0, policy_version 561180 (0.0027) [2024-06-24 01:56:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9194422272. Throughput: 0: 42916.9. Samples: 9194548560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 01:56:30,608][15401] Updated weights for policy 0, policy_version 561190 (0.0048) [2024-06-24 01:56:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 9194635264. Throughput: 0: 42804.0. Samples: 9194801000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 01:56:34,963][15401] Updated weights for policy 0, policy_version 561200 (0.0032) [2024-06-24 01:56:38,258][15401] Updated weights for policy 0, policy_version 561210 (0.0038) [2024-06-24 01:56:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9194864640. Throughput: 0: 42755.1. Samples: 9194930740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:38,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 01:56:42,607][15401] Updated weights for policy 0, policy_version 561220 (0.0027) [2024-06-24 01:56:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9195061248. Throughput: 0: 42752.1. Samples: 9195188880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:43,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 01:56:46,087][15401] Updated weights for policy 0, policy_version 561230 (0.0031) [2024-06-24 01:56:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9195274240. Throughput: 0: 42675.7. Samples: 9195438240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 01:56:50,298][15401] Updated weights for policy 0, policy_version 561240 (0.0048) [2024-06-24 01:56:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9195487232. Throughput: 0: 42666.3. Samples: 9195569700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 01:56:53,701][15401] Updated weights for policy 0, policy_version 561250 (0.0034) [2024-06-24 01:56:57,989][15401] Updated weights for policy 0, policy_version 561260 (0.0036) [2024-06-24 01:56:58,391][15132] Fps is (10 sec: 42590.4, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 9195700224. Throughput: 0: 42677.8. Samples: 9195827940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:56:58,392][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 01:57:01,482][15401] Updated weights for policy 0, policy_version 561270 (0.0026) [2024-06-24 01:57:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9195929600. Throughput: 0: 42465.2. Samples: 9196072920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 01:57:05,454][15401] Updated weights for policy 0, policy_version 561280 (0.0040) [2024-06-24 01:57:08,390][15132] Fps is (10 sec: 42606.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9196126208. Throughput: 0: 42713.3. Samples: 9196211360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 01:57:09,216][15401] Updated weights for policy 0, policy_version 561290 (0.0042) [2024-06-24 01:57:12,940][15401] Updated weights for policy 0, policy_version 561300 (0.0037) [2024-06-24 01:57:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9196339200. Throughput: 0: 42567.1. Samples: 9196464080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 01:57:16,882][15401] Updated weights for policy 0, policy_version 561310 (0.0036) [2024-06-24 01:57:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9196568576. Throughput: 0: 42431.5. Samples: 9196710420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:18,399][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 01:57:21,032][15401] Updated weights for policy 0, policy_version 561320 (0.0045) [2024-06-24 01:57:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9196748800. Throughput: 0: 42576.7. Samples: 9196846680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 01:57:24,443][15401] Updated weights for policy 0, policy_version 561330 (0.0028) [2024-06-24 01:57:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9196978176. Throughput: 0: 42564.0. Samples: 9197104260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 01:57:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 01:57:28,715][15401] Updated weights for policy 0, policy_version 561340 (0.0027) [2024-06-24 01:57:32,143][15401] Updated weights for policy 0, policy_version 561350 (0.0043) [2024-06-24 01:57:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9197207552. Throughput: 0: 42589.0. Samples: 9197354740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 01:57:36,311][15401] Updated weights for policy 0, policy_version 561360 (0.0045) [2024-06-24 01:57:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 9197371392. Throughput: 0: 42663.1. Samples: 9197489540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 01:57:39,823][15401] Updated weights for policy 0, policy_version 561370 (0.0033) [2024-06-24 01:57:43,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9197617152. Throughput: 0: 42479.1. Samples: 9197739420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 01:57:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561379_9197633536.pth... [2024-06-24 01:57:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000560751_9187344384.pth [2024-06-24 01:57:43,995][15401] Updated weights for policy 0, policy_version 561380 (0.0028) [2024-06-24 01:57:47,352][15349] Signal inference workers to stop experience collection... (136250 times) [2024-06-24 01:57:47,375][15401] InferenceWorker_p0-w0: stopping experience collection (136250 times) [2024-06-24 01:57:47,414][15349] Signal inference workers to resume experience collection... (136250 times) [2024-06-24 01:57:47,414][15401] InferenceWorker_p0-w0: resuming experience collection (136250 times) [2024-06-24 01:57:47,550][15401] Updated weights for policy 0, policy_version 561390 (0.0030) [2024-06-24 01:57:48,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9197846528. Throughput: 0: 42521.4. Samples: 9197986380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 01:57:52,106][15401] Updated weights for policy 0, policy_version 561400 (0.0032) [2024-06-24 01:57:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.1, 300 sec: 42710.4). Total num frames: 9198010368. Throughput: 0: 42392.4. Samples: 9198119020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 01:57:55,254][15401] Updated weights for policy 0, policy_version 561410 (0.0033) [2024-06-24 01:57:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42599.7, 300 sec: 42820.6). Total num frames: 9198256128. Throughput: 0: 42283.1. Samples: 9198366820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:57:58,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 01:57:59,674][15401] Updated weights for policy 0, policy_version 561420 (0.0042) [2024-06-24 01:58:03,135][15401] Updated weights for policy 0, policy_version 561430 (0.0029) [2024-06-24 01:58:03,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9198485504. Throughput: 0: 42598.1. Samples: 9198627340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 01:58:07,236][15401] Updated weights for policy 0, policy_version 561440 (0.0047) [2024-06-24 01:58:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9198665728. Throughput: 0: 42473.1. Samples: 9198757980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:08,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 01:58:10,779][15401] Updated weights for policy 0, policy_version 561450 (0.0032) [2024-06-24 01:58:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9198895104. Throughput: 0: 42195.0. Samples: 9199003040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:13,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 01:58:15,075][15401] Updated weights for policy 0, policy_version 561460 (0.0039) [2024-06-24 01:58:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9199108096. Throughput: 0: 42449.2. Samples: 9199264960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:18,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 01:58:18,528][15401] Updated weights for policy 0, policy_version 561470 (0.0039) [2024-06-24 01:58:22,547][15401] Updated weights for policy 0, policy_version 561480 (0.0032) [2024-06-24 01:58:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 9199304704. Throughput: 0: 42300.8. Samples: 9199393080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 01:58:26,149][15401] Updated weights for policy 0, policy_version 561490 (0.0032) [2024-06-24 01:58:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9199534080. Throughput: 0: 42415.2. Samples: 9199648100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 01:58:29,954][15401] Updated weights for policy 0, policy_version 561500 (0.0037) [2024-06-24 01:58:33,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42320.7, 300 sec: 42708.5). Total num frames: 9199747072. Throughput: 0: 42768.0. Samples: 9199911220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:33,397][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 01:58:33,787][15401] Updated weights for policy 0, policy_version 561510 (0.0030) [2024-06-24 01:58:37,546][15401] Updated weights for policy 0, policy_version 561520 (0.0031) [2024-06-24 01:58:38,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9199960064. Throughput: 0: 42579.9. Samples: 9200035120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 01:58:41,467][15401] Updated weights for policy 0, policy_version 561530 (0.0027) [2024-06-24 01:58:43,389][15132] Fps is (10 sec: 44265.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9200189440. Throughput: 0: 42817.8. Samples: 9200293620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:43,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 01:58:45,160][15401] Updated weights for policy 0, policy_version 561540 (0.0050) [2024-06-24 01:58:48,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9200369664. Throughput: 0: 42823.7. Samples: 9200554400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:48,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 01:58:49,210][15401] Updated weights for policy 0, policy_version 561550 (0.0032) [2024-06-24 01:58:52,830][15401] Updated weights for policy 0, policy_version 561560 (0.0035) [2024-06-24 01:58:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9200599040. Throughput: 0: 42620.5. Samples: 9200675900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 01:58:56,890][15401] Updated weights for policy 0, policy_version 561570 (0.0031) [2024-06-24 01:58:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 9200828416. Throughput: 0: 42929.5. Samples: 9200934860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-24 01:58:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 01:59:00,508][15401] Updated weights for policy 0, policy_version 561580 (0.0035) [2024-06-24 01:59:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9201008640. Throughput: 0: 42864.3. Samples: 9201193860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 01:59:04,564][15349] Signal inference workers to stop experience collection... (136300 times) [2024-06-24 01:59:04,565][15349] Signal inference workers to resume experience collection... (136300 times) [2024-06-24 01:59:04,577][15401] InferenceWorker_p0-w0: stopping experience collection (136300 times) [2024-06-24 01:59:04,610][15401] InferenceWorker_p0-w0: resuming experience collection (136300 times) [2024-06-24 01:59:04,706][15401] Updated weights for policy 0, policy_version 561590 (0.0026) [2024-06-24 01:59:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9201238016. Throughput: 0: 42660.5. Samples: 9201312800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 01:59:08,437][15401] Updated weights for policy 0, policy_version 561600 (0.0037) [2024-06-24 01:59:12,503][15401] Updated weights for policy 0, policy_version 561610 (0.0033) [2024-06-24 01:59:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9201467392. Throughput: 0: 42812.0. Samples: 9201574640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 01:59:16,089][15401] Updated weights for policy 0, policy_version 561620 (0.0028) [2024-06-24 01:59:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9201647616. Throughput: 0: 42560.3. Samples: 9201826160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 01:59:20,120][15401] Updated weights for policy 0, policy_version 561630 (0.0038) [2024-06-24 01:59:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9201876992. Throughput: 0: 42566.0. Samples: 9201950580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 01:59:23,896][15401] Updated weights for policy 0, policy_version 561640 (0.0037) [2024-06-24 01:59:27,755][15401] Updated weights for policy 0, policy_version 561650 (0.0037) [2024-06-24 01:59:28,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 9202089984. Throughput: 0: 42605.8. Samples: 9202210900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 01:59:31,394][15401] Updated weights for policy 0, policy_version 561660 (0.0027) [2024-06-24 01:59:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42329.9, 300 sec: 42653.9). Total num frames: 9202286592. Throughput: 0: 42385.9. Samples: 9202461760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 01:59:35,632][15401] Updated weights for policy 0, policy_version 561670 (0.0045) [2024-06-24 01:59:38,389][15132] Fps is (10 sec: 40961.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9202499584. Throughput: 0: 42565.4. Samples: 9202591340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 01:59:39,330][15401] Updated weights for policy 0, policy_version 561680 (0.0032) [2024-06-24 01:59:43,330][15401] Updated weights for policy 0, policy_version 561690 (0.0041) [2024-06-24 01:59:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 9202728960. Throughput: 0: 42540.7. Samples: 9202849200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:43,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 01:59:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561690_9202728960.pth... [2024-06-24 01:59:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561066_9192505344.pth [2024-06-24 01:59:47,097][15401] Updated weights for policy 0, policy_version 561700 (0.0029) [2024-06-24 01:59:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9202958336. Throughput: 0: 42481.4. Samples: 9203105520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 01:59:50,747][15401] Updated weights for policy 0, policy_version 561710 (0.0025) [2024-06-24 01:59:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9203154944. Throughput: 0: 42742.7. Samples: 9203236220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:53,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 01:59:54,580][15401] Updated weights for policy 0, policy_version 561720 (0.0033) [2024-06-24 01:59:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9203351552. Throughput: 0: 42677.3. Samples: 9203495120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 01:59:58,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 01:59:58,881][15401] Updated weights for policy 0, policy_version 561730 (0.0042) [2024-06-24 02:00:02,108][15401] Updated weights for policy 0, policy_version 561740 (0.0027) [2024-06-24 02:00:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9203597312. Throughput: 0: 42680.0. Samples: 9203746760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 02:00:06,284][15401] Updated weights for policy 0, policy_version 561750 (0.0030) [2024-06-24 02:00:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9203793920. Throughput: 0: 42838.2. Samples: 9203878300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 02:00:09,783][15401] Updated weights for policy 0, policy_version 561760 (0.0027) [2024-06-24 02:00:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9204006912. Throughput: 0: 42698.2. Samples: 9204132300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 02:00:13,928][15401] Updated weights for policy 0, policy_version 561770 (0.0029) [2024-06-24 02:00:17,858][15401] Updated weights for policy 0, policy_version 561780 (0.0051) [2024-06-24 02:00:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9204219904. Throughput: 0: 42673.3. Samples: 9204382060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 02:00:21,863][15401] Updated weights for policy 0, policy_version 561790 (0.0038) [2024-06-24 02:00:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9204432896. Throughput: 0: 42606.7. Samples: 9204508640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 02:00:25,642][15401] Updated weights for policy 0, policy_version 561800 (0.0036) [2024-06-24 02:00:28,035][15349] Signal inference workers to stop experience collection... (136350 times) [2024-06-24 02:00:28,037][15349] Signal inference workers to resume experience collection... (136350 times) [2024-06-24 02:00:28,047][15401] InferenceWorker_p0-w0: stopping experience collection (136350 times) [2024-06-24 02:00:28,061][15401] InferenceWorker_p0-w0: resuming experience collection (136350 times) [2024-06-24 02:00:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 9204662272. Throughput: 0: 42588.0. Samples: 9204765660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 02:00:29,747][15401] Updated weights for policy 0, policy_version 561810 (0.0039) [2024-06-24 02:00:33,146][15401] Updated weights for policy 0, policy_version 561820 (0.0036) [2024-06-24 02:00:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 9204858880. Throughput: 0: 42545.7. Samples: 9205020080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 02:00:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 02:00:37,199][15401] Updated weights for policy 0, policy_version 561830 (0.0037) [2024-06-24 02:00:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9205071872. Throughput: 0: 42559.6. Samples: 9205151400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:00:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 02:00:40,589][15401] Updated weights for policy 0, policy_version 561840 (0.0036) [2024-06-24 02:00:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 9205284864. Throughput: 0: 42477.3. Samples: 9205406600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:00:43,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 02:00:44,715][15401] Updated weights for policy 0, policy_version 561850 (0.0037) [2024-06-24 02:00:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9205481472. Throughput: 0: 42497.3. Samples: 9205659140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:00:48,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-24 02:00:48,855][15401] Updated weights for policy 0, policy_version 561860 (0.0030) [2024-06-24 02:00:52,382][15401] Updated weights for policy 0, policy_version 561870 (0.0038) [2024-06-24 02:00:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 9205694464. Throughput: 0: 42380.4. Samples: 9205785520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:00:53,401][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 02:00:56,434][15401] Updated weights for policy 0, policy_version 561880 (0.0025) [2024-06-24 02:00:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 9205940224. Throughput: 0: 42565.7. Samples: 9206047760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:00:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 02:00:59,991][15401] Updated weights for policy 0, policy_version 561890 (0.0028) [2024-06-24 02:01:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 9206120448. Throughput: 0: 42677.6. Samples: 9206302560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 02:01:03,984][15401] Updated weights for policy 0, policy_version 561900 (0.0044) [2024-06-24 02:01:07,501][15401] Updated weights for policy 0, policy_version 561910 (0.0031) [2024-06-24 02:01:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9206349824. Throughput: 0: 42724.0. Samples: 9206431220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 02:01:11,764][15401] Updated weights for policy 0, policy_version 561920 (0.0045) [2024-06-24 02:01:13,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 9206579200. Throughput: 0: 42658.3. Samples: 9206685380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:13,393][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 02:01:15,220][15401] Updated weights for policy 0, policy_version 561930 (0.0023) [2024-06-24 02:01:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 9206759424. Throughput: 0: 42702.3. Samples: 9206941680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 02:01:19,344][15401] Updated weights for policy 0, policy_version 561940 (0.0036) [2024-06-24 02:01:23,306][15401] Updated weights for policy 0, policy_version 561950 (0.0034) [2024-06-24 02:01:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9206988800. Throughput: 0: 42637.7. Samples: 9207070100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 02:01:27,065][15401] Updated weights for policy 0, policy_version 561960 (0.0049) [2024-06-24 02:01:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9207201792. Throughput: 0: 42646.4. Samples: 9207325680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 02:01:30,828][15401] Updated weights for policy 0, policy_version 561970 (0.0035) [2024-06-24 02:01:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 9207398400. Throughput: 0: 42761.4. Samples: 9207583400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 02:01:34,845][15401] Updated weights for policy 0, policy_version 561980 (0.0031) [2024-06-24 02:01:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9207627776. Throughput: 0: 42713.0. Samples: 9207707500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 02:01:38,417][15401] Updated weights for policy 0, policy_version 561990 (0.0034) [2024-06-24 02:01:42,400][15401] Updated weights for policy 0, policy_version 562000 (0.0025) [2024-06-24 02:01:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9207857152. Throughput: 0: 42719.4. Samples: 9207970140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 02:01:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562003_9207857152.pth... [2024-06-24 02:01:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561379_9197633536.pth [2024-06-24 02:01:46,078][15401] Updated weights for policy 0, policy_version 562010 (0.0037) [2024-06-24 02:01:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9208053760. Throughput: 0: 42528.1. Samples: 9208216320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 02:01:50,147][15401] Updated weights for policy 0, policy_version 562020 (0.0035) [2024-06-24 02:01:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42598.7). Total num frames: 9208266752. Throughput: 0: 42445.7. Samples: 9208341280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 02:01:54,007][15401] Updated weights for policy 0, policy_version 562030 (0.0042) [2024-06-24 02:01:57,838][15401] Updated weights for policy 0, policy_version 562040 (0.0020) [2024-06-24 02:01:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9208479744. Throughput: 0: 42610.4. Samples: 9208602740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:01:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 02:02:02,001][15401] Updated weights for policy 0, policy_version 562050 (0.0032) [2024-06-24 02:02:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9208709120. Throughput: 0: 42484.0. Samples: 9208853460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 02:02:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 02:02:05,538][15401] Updated weights for policy 0, policy_version 562060 (0.0027) [2024-06-24 02:02:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9208905728. Throughput: 0: 42540.9. Samples: 9208984440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 02:02:09,604][15401] Updated weights for policy 0, policy_version 562070 (0.0033) [2024-06-24 02:02:10,610][15349] Signal inference workers to stop experience collection... (136400 times) [2024-06-24 02:02:10,611][15349] Signal inference workers to resume experience collection... (136400 times) [2024-06-24 02:02:10,640][15401] InferenceWorker_p0-w0: stopping experience collection (136400 times) [2024-06-24 02:02:10,640][15401] InferenceWorker_p0-w0: resuming experience collection (136400 times) [2024-06-24 02:02:13,321][15401] Updated weights for policy 0, policy_version 562080 (0.0037) [2024-06-24 02:02:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 9209118720. Throughput: 0: 42621.2. Samples: 9209243640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 02:02:17,156][15401] Updated weights for policy 0, policy_version 562090 (0.0032) [2024-06-24 02:02:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9209331712. Throughput: 0: 42530.2. Samples: 9209497260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:18,399][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 02:02:20,816][15401] Updated weights for policy 0, policy_version 562100 (0.0024) [2024-06-24 02:02:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9209528320. Throughput: 0: 42582.2. Samples: 9209623700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:02:24,683][15401] Updated weights for policy 0, policy_version 562110 (0.0026) [2024-06-24 02:02:28,356][15401] Updated weights for policy 0, policy_version 562120 (0.0027) [2024-06-24 02:02:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 9209774080. Throughput: 0: 42435.5. Samples: 9209879740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 02:02:32,141][15401] Updated weights for policy 0, policy_version 562130 (0.0033) [2024-06-24 02:02:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9209987072. Throughput: 0: 42766.1. Samples: 9210140800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 02:02:35,890][15401] Updated weights for policy 0, policy_version 562140 (0.0023) [2024-06-24 02:02:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9210183680. Throughput: 0: 42813.8. Samples: 9210267900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 02:02:39,570][15401] Updated weights for policy 0, policy_version 562150 (0.0036) [2024-06-24 02:02:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9210413056. Throughput: 0: 42885.2. Samples: 9210532580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:43,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-24 02:02:43,483][15401] Updated weights for policy 0, policy_version 562160 (0.0039) [2024-06-24 02:02:47,334][15401] Updated weights for policy 0, policy_version 562170 (0.0028) [2024-06-24 02:02:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9210626048. Throughput: 0: 42902.8. Samples: 9210784080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 02:02:50,995][15401] Updated weights for policy 0, policy_version 562180 (0.0034) [2024-06-24 02:02:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9210822656. Throughput: 0: 42774.2. Samples: 9210909280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:53,399][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:02:55,075][15401] Updated weights for policy 0, policy_version 562190 (0.0029) [2024-06-24 02:02:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9211052032. Throughput: 0: 42727.3. Samples: 9211166360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:02:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 02:02:58,619][15401] Updated weights for policy 0, policy_version 562200 (0.0028) [2024-06-24 02:03:03,048][15401] Updated weights for policy 0, policy_version 562210 (0.0040) [2024-06-24 02:03:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 9211265024. Throughput: 0: 42723.7. Samples: 9211419820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 02:03:06,204][15401] Updated weights for policy 0, policy_version 562220 (0.0033) [2024-06-24 02:03:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9211461632. Throughput: 0: 42675.6. Samples: 9211544100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 02:03:10,643][15401] Updated weights for policy 0, policy_version 562230 (0.0027) [2024-06-24 02:03:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9211691008. Throughput: 0: 42878.4. Samples: 9211809260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 02:03:13,933][15401] Updated weights for policy 0, policy_version 562240 (0.0035) [2024-06-24 02:03:18,132][15401] Updated weights for policy 0, policy_version 562250 (0.0036) [2024-06-24 02:03:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9211904000. Throughput: 0: 42786.7. Samples: 9212066300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:18,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 02:03:21,387][15401] Updated weights for policy 0, policy_version 562260 (0.0032) [2024-06-24 02:03:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9212116992. Throughput: 0: 42828.1. Samples: 9212195160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 02:03:25,568][15401] Updated weights for policy 0, policy_version 562270 (0.0039) [2024-06-24 02:03:28,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 9212329984. Throughput: 0: 42741.7. Samples: 9212455960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 02:03:29,128][15401] Updated weights for policy 0, policy_version 562280 (0.0036) [2024-06-24 02:03:31,568][15349] Signal inference workers to stop experience collection... (136450 times) [2024-06-24 02:03:31,612][15401] InferenceWorker_p0-w0: stopping experience collection (136450 times) [2024-06-24 02:03:31,631][15349] Signal inference workers to resume experience collection... (136450 times) [2024-06-24 02:03:31,636][15401] InferenceWorker_p0-w0: resuming experience collection (136450 times) [2024-06-24 02:03:33,293][15401] Updated weights for policy 0, policy_version 562290 (0.0049) [2024-06-24 02:03:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9212559360. Throughput: 0: 42777.3. Samples: 9212709060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:03:33,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 02:03:37,005][15401] Updated weights for policy 0, policy_version 562300 (0.0041) [2024-06-24 02:03:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9212755968. Throughput: 0: 42924.1. Samples: 9212840860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:03:38,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 02:03:40,798][15401] Updated weights for policy 0, policy_version 562310 (0.0039) [2024-06-24 02:03:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9212968960. Throughput: 0: 42973.1. Samples: 9213100160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:03:43,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 02:03:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562316_9212985344.pth... [2024-06-24 02:03:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000561690_9202728960.pth [2024-06-24 02:03:44,830][15401] Updated weights for policy 0, policy_version 562320 (0.0032) [2024-06-24 02:03:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9213198336. Throughput: 0: 43084.3. Samples: 9213358620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:03:48,396][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 02:03:48,730][15401] Updated weights for policy 0, policy_version 562330 (0.0033) [2024-06-24 02:03:52,470][15401] Updated weights for policy 0, policy_version 562340 (0.0042) [2024-06-24 02:03:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9213394944. Throughput: 0: 43206.7. Samples: 9213488400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:03:53,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-24 02:03:56,257][15401] Updated weights for policy 0, policy_version 562350 (0.0027) [2024-06-24 02:03:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9213624320. Throughput: 0: 43040.0. Samples: 9213746060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:03:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 02:04:00,062][15401] Updated weights for policy 0, policy_version 562360 (0.0028) [2024-06-24 02:04:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9213837312. Throughput: 0: 42962.8. Samples: 9213999520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 02:04:03,672][15401] Updated weights for policy 0, policy_version 562370 (0.0024) [2024-06-24 02:04:07,670][15401] Updated weights for policy 0, policy_version 562380 (0.0033) [2024-06-24 02:04:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9214050304. Throughput: 0: 43080.4. Samples: 9214133780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 02:04:11,116][15401] Updated weights for policy 0, policy_version 562390 (0.0040) [2024-06-24 02:04:13,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9214263296. Throughput: 0: 42983.6. Samples: 9214390320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:13,392][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 02:04:15,306][15401] Updated weights for policy 0, policy_version 562400 (0.0044) [2024-06-24 02:04:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 9214492672. Throughput: 0: 42988.8. Samples: 9214643560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 02:04:18,668][15401] Updated weights for policy 0, policy_version 562410 (0.0036) [2024-06-24 02:04:23,003][15401] Updated weights for policy 0, policy_version 562420 (0.0040) [2024-06-24 02:04:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9214689280. Throughput: 0: 42927.5. Samples: 9214772600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 02:04:26,623][15401] Updated weights for policy 0, policy_version 562430 (0.0028) [2024-06-24 02:04:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9214902272. Throughput: 0: 42883.6. Samples: 9215029920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 02:04:30,562][15401] Updated weights for policy 0, policy_version 562440 (0.0028) [2024-06-24 02:04:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9215131648. Throughput: 0: 43001.0. Samples: 9215293660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 02:04:34,059][15401] Updated weights for policy 0, policy_version 562450 (0.0030) [2024-06-24 02:04:38,191][15401] Updated weights for policy 0, policy_version 562460 (0.0034) [2024-06-24 02:04:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9215344640. Throughput: 0: 42957.4. Samples: 9215421480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 02:04:41,890][15401] Updated weights for policy 0, policy_version 562470 (0.0027) [2024-06-24 02:04:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9215557632. Throughput: 0: 42894.1. Samples: 9215676300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 02:04:45,921][15401] Updated weights for policy 0, policy_version 562480 (0.0042) [2024-06-24 02:04:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9215770624. Throughput: 0: 43026.0. Samples: 9215935700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 02:04:49,504][15401] Updated weights for policy 0, policy_version 562490 (0.0024) [2024-06-24 02:04:53,393][15132] Fps is (10 sec: 42582.1, 60 sec: 43141.7, 300 sec: 42820.0). Total num frames: 9215983616. Throughput: 0: 42822.0. Samples: 9216060940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:53,394][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 02:04:53,553][15401] Updated weights for policy 0, policy_version 562500 (0.0046) [2024-06-24 02:04:56,974][15401] Updated weights for policy 0, policy_version 562510 (0.0027) [2024-06-24 02:04:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9216196608. Throughput: 0: 42856.1. Samples: 9216318740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:04:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 02:05:00,824][15349] Signal inference workers to stop experience collection... (136500 times) [2024-06-24 02:05:00,825][15349] Signal inference workers to resume experience collection... (136500 times) [2024-06-24 02:05:00,869][15401] InferenceWorker_p0-w0: stopping experience collection (136500 times) [2024-06-24 02:05:00,876][15401] InferenceWorker_p0-w0: resuming experience collection (136500 times) [2024-06-24 02:05:01,223][15401] Updated weights for policy 0, policy_version 562520 (0.0045) [2024-06-24 02:05:03,392][15132] Fps is (10 sec: 42605.0, 60 sec: 42869.6, 300 sec: 42764.7). Total num frames: 9216409600. Throughput: 0: 43028.0. Samples: 9216579920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 02:05:03,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 02:05:04,526][15401] Updated weights for policy 0, policy_version 562530 (0.0037) [2024-06-24 02:05:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9216622592. Throughput: 0: 42921.0. Samples: 9216704040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 02:05:08,828][15401] Updated weights for policy 0, policy_version 562540 (0.0033) [2024-06-24 02:05:12,373][15401] Updated weights for policy 0, policy_version 562550 (0.0034) [2024-06-24 02:05:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 9216835584. Throughput: 0: 43028.6. Samples: 9216966200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 02:05:16,750][15401] Updated weights for policy 0, policy_version 562560 (0.0031) [2024-06-24 02:05:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9217064960. Throughput: 0: 42832.5. Samples: 9217221120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:18,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 02:05:19,912][15401] Updated weights for policy 0, policy_version 562570 (0.0030) [2024-06-24 02:05:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9217261568. Throughput: 0: 42790.9. Samples: 9217347080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 02:05:24,367][15401] Updated weights for policy 0, policy_version 562580 (0.0027) [2024-06-24 02:05:27,395][15401] Updated weights for policy 0, policy_version 562590 (0.0034) [2024-06-24 02:05:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 9217490944. Throughput: 0: 42877.0. Samples: 9217605860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:28,393][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 02:05:31,910][15401] Updated weights for policy 0, policy_version 562600 (0.0034) [2024-06-24 02:05:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9217703936. Throughput: 0: 42944.1. Samples: 9217868180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:33,396][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 02:05:35,197][15401] Updated weights for policy 0, policy_version 562610 (0.0042) [2024-06-24 02:05:38,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9217900544. Throughput: 0: 42977.0. Samples: 9217994740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 02:05:39,467][15401] Updated weights for policy 0, policy_version 562620 (0.0026) [2024-06-24 02:05:42,664][15401] Updated weights for policy 0, policy_version 562630 (0.0032) [2024-06-24 02:05:43,394][15132] Fps is (10 sec: 44216.2, 60 sec: 43141.3, 300 sec: 42931.0). Total num frames: 9218146304. Throughput: 0: 43023.0. Samples: 9218254980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:43,395][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 02:05:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562632_9218162688.pth... [2024-06-24 02:05:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562003_9207857152.pth [2024-06-24 02:05:46,989][15401] Updated weights for policy 0, policy_version 562640 (0.0032) [2024-06-24 02:05:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 9218342912. Throughput: 0: 42895.1. Samples: 9218510200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:48,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 02:05:50,341][15401] Updated weights for policy 0, policy_version 562650 (0.0042) [2024-06-24 02:05:53,390][15132] Fps is (10 sec: 40978.8, 60 sec: 42874.2, 300 sec: 42765.0). Total num frames: 9218555904. Throughput: 0: 42914.5. Samples: 9218635200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 02:05:54,885][15401] Updated weights for policy 0, policy_version 562660 (0.0047) [2024-06-24 02:05:58,210][15401] Updated weights for policy 0, policy_version 562670 (0.0030) [2024-06-24 02:05:58,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 9218785280. Throughput: 0: 42938.7. Samples: 9218898440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:05:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 02:06:02,378][15401] Updated weights for policy 0, policy_version 562680 (0.0032) [2024-06-24 02:06:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9218965504. Throughput: 0: 43134.2. Samples: 9219162160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 02:06:05,756][15401] Updated weights for policy 0, policy_version 562690 (0.0039) [2024-06-24 02:06:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 9219194880. Throughput: 0: 43051.6. Samples: 9219284400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 02:06:10,528][15401] Updated weights for policy 0, policy_version 562700 (0.0031) [2024-06-24 02:06:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9219424256. Throughput: 0: 42868.9. Samples: 9219534860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 02:06:13,513][15401] Updated weights for policy 0, policy_version 562710 (0.0037) [2024-06-24 02:06:18,038][15401] Updated weights for policy 0, policy_version 562720 (0.0030) [2024-06-24 02:06:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9219620864. Throughput: 0: 42887.6. Samples: 9219798120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 02:06:21,258][15401] Updated weights for policy 0, policy_version 562730 (0.0040) [2024-06-24 02:06:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9219850240. Throughput: 0: 42796.5. Samples: 9219920580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 02:06:25,922][15401] Updated weights for policy 0, policy_version 562740 (0.0028) [2024-06-24 02:06:28,391][15132] Fps is (10 sec: 42592.3, 60 sec: 42599.0, 300 sec: 42875.9). Total num frames: 9220046848. Throughput: 0: 42683.5. Samples: 9220175600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:28,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:06:28,921][15401] Updated weights for policy 0, policy_version 562750 (0.0044) [2024-06-24 02:06:32,207][15349] Signal inference workers to stop experience collection... (136550 times) [2024-06-24 02:06:32,252][15401] InferenceWorker_p0-w0: stopping experience collection (136550 times) [2024-06-24 02:06:32,263][15349] Signal inference workers to resume experience collection... (136550 times) [2024-06-24 02:06:32,273][15401] InferenceWorker_p0-w0: resuming experience collection (136550 times) [2024-06-24 02:06:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9220227072. Throughput: 0: 42778.3. Samples: 9220435120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 02:06:33,615][15401] Updated weights for policy 0, policy_version 562760 (0.0032) [2024-06-24 02:06:36,551][15401] Updated weights for policy 0, policy_version 562770 (0.0028) [2024-06-24 02:06:38,390][15132] Fps is (10 sec: 44242.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9220489216. Throughput: 0: 42654.6. Samples: 9220554660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 02:06:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 02:06:41,247][15401] Updated weights for policy 0, policy_version 562780 (0.0033) [2024-06-24 02:06:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42328.6, 300 sec: 42820.5). Total num frames: 9220685824. Throughput: 0: 42589.1. Samples: 9220814960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:06:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 02:06:44,084][15401] Updated weights for policy 0, policy_version 562790 (0.0034) [2024-06-24 02:06:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 9220882432. Throughput: 0: 42516.8. Samples: 9221075420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:06:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 02:06:48,851][15401] Updated weights for policy 0, policy_version 562800 (0.0039) [2024-06-24 02:06:52,022][15401] Updated weights for policy 0, policy_version 562810 (0.0039) [2024-06-24 02:06:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9221128192. Throughput: 0: 42548.2. Samples: 9221199060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:06:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 02:06:56,531][15401] Updated weights for policy 0, policy_version 562820 (0.0031) [2024-06-24 02:06:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9221324800. Throughput: 0: 42700.9. Samples: 9221456400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:06:58,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 02:06:59,612][15401] Updated weights for policy 0, policy_version 562830 (0.0042) [2024-06-24 02:07:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9221521408. Throughput: 0: 42523.7. Samples: 9221711680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 02:07:04,050][15401] Updated weights for policy 0, policy_version 562840 (0.0029) [2024-06-24 02:07:07,629][15401] Updated weights for policy 0, policy_version 562850 (0.0036) [2024-06-24 02:07:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9221767168. Throughput: 0: 42515.9. Samples: 9221833800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 02:07:11,484][15401] Updated weights for policy 0, policy_version 562860 (0.0047) [2024-06-24 02:07:13,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9221980160. Throughput: 0: 42675.0. Samples: 9222095920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 02:07:15,179][15401] Updated weights for policy 0, policy_version 562870 (0.0033) [2024-06-24 02:07:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9222176768. Throughput: 0: 42588.5. Samples: 9222351600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 02:07:18,987][15401] Updated weights for policy 0, policy_version 562880 (0.0033) [2024-06-24 02:07:22,724][15401] Updated weights for policy 0, policy_version 562890 (0.0030) [2024-06-24 02:07:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9222406144. Throughput: 0: 42696.1. Samples: 9222475980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:23,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 02:07:26,971][15401] Updated weights for policy 0, policy_version 562900 (0.0034) [2024-06-24 02:07:28,393][15132] Fps is (10 sec: 45858.2, 60 sec: 43142.9, 300 sec: 42875.6). Total num frames: 9222635520. Throughput: 0: 42758.8. Samples: 9222739260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:28,394][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 02:07:30,324][15401] Updated weights for policy 0, policy_version 562910 (0.0032) [2024-06-24 02:07:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9222815744. Throughput: 0: 42754.8. Samples: 9222999380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:33,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 02:07:34,536][15401] Updated weights for policy 0, policy_version 562920 (0.0052) [2024-06-24 02:07:38,078][15401] Updated weights for policy 0, policy_version 562930 (0.0043) [2024-06-24 02:07:38,389][15132] Fps is (10 sec: 40975.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 9223045120. Throughput: 0: 42715.6. Samples: 9223121260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 02:07:42,093][15401] Updated weights for policy 0, policy_version 562940 (0.0040) [2024-06-24 02:07:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 9223290880. Throughput: 0: 42944.1. Samples: 9223388880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:43,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 02:07:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562945_9223290880.pth... [2024-06-24 02:07:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562316_9212985344.pth [2024-06-24 02:07:45,638][15401] Updated weights for policy 0, policy_version 562950 (0.0034) [2024-06-24 02:07:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9223471104. Throughput: 0: 43015.4. Samples: 9223647380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:48,399][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 02:07:49,535][15401] Updated weights for policy 0, policy_version 562960 (0.0039) [2024-06-24 02:07:53,211][15401] Updated weights for policy 0, policy_version 562970 (0.0026) [2024-06-24 02:07:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9223700480. Throughput: 0: 43108.5. Samples: 9223773680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 02:07:56,993][15401] Updated weights for policy 0, policy_version 562980 (0.0031) [2024-06-24 02:07:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9223913472. Throughput: 0: 43004.5. Samples: 9224031120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:07:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 02:08:00,819][15401] Updated weights for policy 0, policy_version 562990 (0.0024) [2024-06-24 02:08:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9224110080. Throughput: 0: 42979.9. Samples: 9224285700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:08:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 02:08:05,083][15401] Updated weights for policy 0, policy_version 563000 (0.0030) [2024-06-24 02:08:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9224339456. Throughput: 0: 43035.6. Samples: 9224412580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 02:08:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 02:08:08,428][15401] Updated weights for policy 0, policy_version 563010 (0.0031) [2024-06-24 02:08:10,812][15349] Signal inference workers to stop experience collection... (136600 times) [2024-06-24 02:08:10,820][15349] Signal inference workers to resume experience collection... (136600 times) [2024-06-24 02:08:10,863][15401] InferenceWorker_p0-w0: stopping experience collection (136600 times) [2024-06-24 02:08:10,863][15401] InferenceWorker_p0-w0: resuming experience collection (136600 times) [2024-06-24 02:08:12,557][15401] Updated weights for policy 0, policy_version 563020 (0.0051) [2024-06-24 02:08:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 9224552448. Throughput: 0: 42934.7. Samples: 9224671160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 02:08:16,138][15401] Updated weights for policy 0, policy_version 563030 (0.0042) [2024-06-24 02:08:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9224765440. Throughput: 0: 42883.5. Samples: 9224929140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 02:08:20,203][15401] Updated weights for policy 0, policy_version 563040 (0.0034) [2024-06-24 02:08:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9224978432. Throughput: 0: 42891.5. Samples: 9225051380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 02:08:23,858][15401] Updated weights for policy 0, policy_version 563050 (0.0032) [2024-06-24 02:08:27,810][15401] Updated weights for policy 0, policy_version 563060 (0.0037) [2024-06-24 02:08:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42874.1, 300 sec: 42876.1). Total num frames: 9225207808. Throughput: 0: 42696.0. Samples: 9225310200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 02:08:31,591][15401] Updated weights for policy 0, policy_version 563070 (0.0036) [2024-06-24 02:08:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9225388032. Throughput: 0: 42817.0. Samples: 9225574140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 02:08:35,251][15401] Updated weights for policy 0, policy_version 563080 (0.0027) [2024-06-24 02:08:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 9225601024. Throughput: 0: 42685.3. Samples: 9225694520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 02:08:39,277][15401] Updated weights for policy 0, policy_version 563090 (0.0038) [2024-06-24 02:08:42,694][15401] Updated weights for policy 0, policy_version 563100 (0.0034) [2024-06-24 02:08:43,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9225863168. Throughput: 0: 42773.1. Samples: 9225955900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 02:08:47,027][15401] Updated weights for policy 0, policy_version 563110 (0.0026) [2024-06-24 02:08:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 9226043392. Throughput: 0: 42920.9. Samples: 9226217240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:48,393][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 02:08:50,403][15401] Updated weights for policy 0, policy_version 563120 (0.0039) [2024-06-24 02:08:53,396][15132] Fps is (10 sec: 39295.9, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 9226256384. Throughput: 0: 42924.1. Samples: 9226344440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:53,396][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 02:08:54,565][15401] Updated weights for policy 0, policy_version 563130 (0.0043) [2024-06-24 02:08:57,869][15401] Updated weights for policy 0, policy_version 563140 (0.0035) [2024-06-24 02:08:58,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9226502144. Throughput: 0: 43070.7. Samples: 9226609340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:08:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 02:09:02,318][15401] Updated weights for policy 0, policy_version 563150 (0.0032) [2024-06-24 02:09:03,390][15132] Fps is (10 sec: 44264.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9226698752. Throughput: 0: 43014.5. Samples: 9226864800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 02:09:05,272][15401] Updated weights for policy 0, policy_version 563160 (0.0032) [2024-06-24 02:09:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 9226911744. Throughput: 0: 43102.3. Samples: 9226990980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 02:09:09,847][15401] Updated weights for policy 0, policy_version 563170 (0.0026) [2024-06-24 02:09:12,963][15401] Updated weights for policy 0, policy_version 563180 (0.0040) [2024-06-24 02:09:13,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9227141120. Throughput: 0: 43124.4. Samples: 9227250800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 02:09:17,399][15401] Updated weights for policy 0, policy_version 563190 (0.0034) [2024-06-24 02:09:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9227337728. Throughput: 0: 42945.7. Samples: 9227506700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:18,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 02:09:20,649][15401] Updated weights for policy 0, policy_version 563200 (0.0032) [2024-06-24 02:09:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9227550720. Throughput: 0: 43075.2. Samples: 9227632900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 02:09:24,382][15349] Signal inference workers to stop experience collection... (136650 times) [2024-06-24 02:09:24,382][15349] Signal inference workers to resume experience collection... (136650 times) [2024-06-24 02:09:24,426][15401] InferenceWorker_p0-w0: stopping experience collection (136650 times) [2024-06-24 02:09:24,426][15401] InferenceWorker_p0-w0: resuming experience collection (136650 times) [2024-06-24 02:09:25,143][15401] Updated weights for policy 0, policy_version 563210 (0.0032) [2024-06-24 02:09:28,277][15401] Updated weights for policy 0, policy_version 563220 (0.0034) [2024-06-24 02:09:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9227796480. Throughput: 0: 43001.7. Samples: 9227890980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 02:09:32,629][15401] Updated weights for policy 0, policy_version 563230 (0.0029) [2024-06-24 02:09:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9227976704. Throughput: 0: 42924.4. Samples: 9228148740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 02:09:35,973][15401] Updated weights for policy 0, policy_version 563240 (0.0040) [2024-06-24 02:09:38,392][15132] Fps is (10 sec: 39312.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 9228189696. Throughput: 0: 42854.0. Samples: 9228272700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 02:09:38,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 02:09:40,131][15401] Updated weights for policy 0, policy_version 563250 (0.0044) [2024-06-24 02:09:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9228419072. Throughput: 0: 42904.4. Samples: 9228540040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:09:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 02:09:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563258_9228419072.pth... [2024-06-24 02:09:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562632_9218162688.pth [2024-06-24 02:09:43,973][15401] Updated weights for policy 0, policy_version 563260 (0.0025) [2024-06-24 02:09:47,674][15401] Updated weights for policy 0, policy_version 563270 (0.0022) [2024-06-24 02:09:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42873.2, 300 sec: 42821.1). Total num frames: 9228615680. Throughput: 0: 42953.5. Samples: 9228797700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:09:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 02:09:51,507][15401] Updated weights for policy 0, policy_version 563280 (0.0029) [2024-06-24 02:09:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 9228845056. Throughput: 0: 42938.6. Samples: 9228923220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:09:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 02:09:55,465][15401] Updated weights for policy 0, policy_version 563290 (0.0033) [2024-06-24 02:09:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9229074432. Throughput: 0: 43014.6. Samples: 9229186460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:09:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 02:09:58,983][15401] Updated weights for policy 0, policy_version 563300 (0.0038) [2024-06-24 02:10:03,271][15401] Updated weights for policy 0, policy_version 563310 (0.0028) [2024-06-24 02:10:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9229271040. Throughput: 0: 43036.0. Samples: 9229443320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:03,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 02:10:06,902][15401] Updated weights for policy 0, policy_version 563320 (0.0035) [2024-06-24 02:10:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.3, 300 sec: 42931.6). Total num frames: 9229500416. Throughput: 0: 42938.9. Samples: 9229565160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:08,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-24 02:10:11,017][15401] Updated weights for policy 0, policy_version 563330 (0.0028) [2024-06-24 02:10:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9229729792. Throughput: 0: 43030.7. Samples: 9229827360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 02:10:14,706][15401] Updated weights for policy 0, policy_version 563340 (0.0023) [2024-06-24 02:10:18,389][15132] Fps is (10 sec: 39322.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9229893632. Throughput: 0: 42960.6. Samples: 9230081960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 02:10:18,704][15401] Updated weights for policy 0, policy_version 563350 (0.0035) [2024-06-24 02:10:22,375][15401] Updated weights for policy 0, policy_version 563360 (0.0040) [2024-06-24 02:10:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 9230139392. Throughput: 0: 42807.6. Samples: 9230198940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 02:10:26,700][15401] Updated weights for policy 0, policy_version 563370 (0.0034) [2024-06-24 02:10:28,395][15132] Fps is (10 sec: 47488.2, 60 sec: 42867.7, 300 sec: 42930.9). Total num frames: 9230368768. Throughput: 0: 42794.6. Samples: 9230466020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:28,395][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 02:10:29,975][15401] Updated weights for policy 0, policy_version 563380 (0.0042) [2024-06-24 02:10:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9230548992. Throughput: 0: 42756.8. Samples: 9230721760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 02:10:34,413][15401] Updated weights for policy 0, policy_version 563390 (0.0040) [2024-06-24 02:10:34,629][15349] Signal inference workers to stop experience collection... (136700 times) [2024-06-24 02:10:34,690][15401] InferenceWorker_p0-w0: stopping experience collection (136700 times) [2024-06-24 02:10:34,743][15349] Signal inference workers to resume experience collection... (136700 times) [2024-06-24 02:10:34,743][15401] InferenceWorker_p0-w0: resuming experience collection (136700 times) [2024-06-24 02:10:37,420][15401] Updated weights for policy 0, policy_version 563400 (0.0027) [2024-06-24 02:10:38,390][15132] Fps is (10 sec: 42620.4, 60 sec: 43419.3, 300 sec: 42876.8). Total num frames: 9230794752. Throughput: 0: 42673.3. Samples: 9230843520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 02:10:41,903][15401] Updated weights for policy 0, policy_version 563410 (0.0034) [2024-06-24 02:10:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 9231007744. Throughput: 0: 42642.2. Samples: 9231105360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 02:10:45,108][15401] Updated weights for policy 0, policy_version 563420 (0.0042) [2024-06-24 02:10:48,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9231171584. Throughput: 0: 42661.7. Samples: 9231363100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 02:10:49,373][15401] Updated weights for policy 0, policy_version 563430 (0.0030) [2024-06-24 02:10:52,839][15401] Updated weights for policy 0, policy_version 563440 (0.0043) [2024-06-24 02:10:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9231433728. Throughput: 0: 42654.4. Samples: 9231484600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:10:56,996][15401] Updated weights for policy 0, policy_version 563450 (0.0027) [2024-06-24 02:10:58,394][15132] Fps is (10 sec: 45856.8, 60 sec: 42595.6, 300 sec: 42931.0). Total num frames: 9231630336. Throughput: 0: 42610.3. Samples: 9231745000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:10:58,394][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 02:11:00,415][15401] Updated weights for policy 0, policy_version 563460 (0.0039) [2024-06-24 02:11:03,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9231826944. Throughput: 0: 42615.9. Samples: 9231999780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:11:03,393][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 02:11:04,987][15401] Updated weights for policy 0, policy_version 563470 (0.0043) [2024-06-24 02:11:08,264][15401] Updated weights for policy 0, policy_version 563480 (0.0035) [2024-06-24 02:11:08,390][15132] Fps is (10 sec: 42615.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9232056320. Throughput: 0: 42720.8. Samples: 9232121380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:11:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 02:11:12,479][15401] Updated weights for policy 0, policy_version 563490 (0.0040) [2024-06-24 02:11:13,392][15132] Fps is (10 sec: 44237.4, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 9232269312. Throughput: 0: 42762.4. Samples: 9232390200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:13,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 02:11:15,970][15401] Updated weights for policy 0, policy_version 563500 (0.0035) [2024-06-24 02:11:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9232465920. Throughput: 0: 42671.2. Samples: 9232641960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 02:11:19,918][15401] Updated weights for policy 0, policy_version 563510 (0.0027) [2024-06-24 02:11:23,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42876.3). Total num frames: 9232695296. Throughput: 0: 42664.2. Samples: 9232763400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 02:11:23,410][15401] Updated weights for policy 0, policy_version 563520 (0.0036) [2024-06-24 02:11:27,550][15401] Updated weights for policy 0, policy_version 563530 (0.0034) [2024-06-24 02:11:28,391][15132] Fps is (10 sec: 42590.8, 60 sec: 42054.7, 300 sec: 42931.4). Total num frames: 9232891904. Throughput: 0: 42695.4. Samples: 9233026720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:28,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 02:11:31,011][15401] Updated weights for policy 0, policy_version 563540 (0.0029) [2024-06-24 02:11:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9233121280. Throughput: 0: 42715.9. Samples: 9233285320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 02:11:35,099][15401] Updated weights for policy 0, policy_version 563550 (0.0042) [2024-06-24 02:11:38,390][15132] Fps is (10 sec: 45882.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9233350656. Throughput: 0: 42756.0. Samples: 9233408620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 02:11:38,477][15401] Updated weights for policy 0, policy_version 563560 (0.0035) [2024-06-24 02:11:42,867][15401] Updated weights for policy 0, policy_version 563570 (0.0028) [2024-06-24 02:11:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42931.7). Total num frames: 9233547264. Throughput: 0: 42799.0. Samples: 9233670780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 02:11:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563571_9233547264.pth... [2024-06-24 02:11:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000562945_9223290880.pth [2024-06-24 02:11:46,251][15401] Updated weights for policy 0, policy_version 563580 (0.0031) [2024-06-24 02:11:47,707][15349] Signal inference workers to stop experience collection... (136750 times) [2024-06-24 02:11:47,708][15349] Signal inference workers to resume experience collection... (136750 times) [2024-06-24 02:11:47,724][15401] InferenceWorker_p0-w0: stopping experience collection (136750 times) [2024-06-24 02:11:47,724][15401] InferenceWorker_p0-w0: resuming experience collection (136750 times) [2024-06-24 02:11:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9233743872. Throughput: 0: 42736.0. Samples: 9233922800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 02:11:50,543][15401] Updated weights for policy 0, policy_version 563590 (0.0043) [2024-06-24 02:11:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9233973248. Throughput: 0: 42846.2. Samples: 9234049460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 02:11:54,001][15401] Updated weights for policy 0, policy_version 563600 (0.0033) [2024-06-24 02:11:58,117][15401] Updated weights for policy 0, policy_version 563610 (0.0043) [2024-06-24 02:11:58,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42599.5, 300 sec: 42931.3). Total num frames: 9234186240. Throughput: 0: 42649.1. Samples: 9234309420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:11:58,393][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 02:12:01,619][15401] Updated weights for policy 0, policy_version 563620 (0.0032) [2024-06-24 02:12:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 9234399232. Throughput: 0: 42735.0. Samples: 9234565040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 02:12:05,683][15401] Updated weights for policy 0, policy_version 563630 (0.0033) [2024-06-24 02:12:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9234612224. Throughput: 0: 42942.4. Samples: 9234695820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 02:12:09,151][15401] Updated weights for policy 0, policy_version 563640 (0.0035) [2024-06-24 02:12:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 9234825216. Throughput: 0: 42830.1. Samples: 9234954000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 02:12:13,508][15401] Updated weights for policy 0, policy_version 563650 (0.0032) [2024-06-24 02:12:16,822][15401] Updated weights for policy 0, policy_version 563660 (0.0040) [2024-06-24 02:12:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9235021824. Throughput: 0: 42926.9. Samples: 9235217020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:18,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 02:12:21,199][15401] Updated weights for policy 0, policy_version 563670 (0.0029) [2024-06-24 02:12:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.5). Total num frames: 9235251200. Throughput: 0: 43008.9. Samples: 9235344020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 02:12:24,686][15401] Updated weights for policy 0, policy_version 563680 (0.0039) [2024-06-24 02:12:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42872.7, 300 sec: 42876.1). Total num frames: 9235464192. Throughput: 0: 42834.6. Samples: 9235598340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 02:12:28,641][15401] Updated weights for policy 0, policy_version 563690 (0.0030) [2024-06-24 02:12:32,402][15401] Updated weights for policy 0, policy_version 563700 (0.0030) [2024-06-24 02:12:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9235677184. Throughput: 0: 42924.4. Samples: 9235854400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 02:12:36,319][15401] Updated weights for policy 0, policy_version 563710 (0.0040) [2024-06-24 02:12:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9235906560. Throughput: 0: 42877.7. Samples: 9235978960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 02:12:40,106][15401] Updated weights for policy 0, policy_version 563720 (0.0023) [2024-06-24 02:12:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9236119552. Throughput: 0: 42868.6. Samples: 9236238400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 02:12:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 02:12:43,868][15401] Updated weights for policy 0, policy_version 563730 (0.0038) [2024-06-24 02:12:47,934][15401] Updated weights for policy 0, policy_version 563740 (0.0033) [2024-06-24 02:12:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9236332544. Throughput: 0: 42883.1. Samples: 9236494780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:12:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 02:12:51,531][15401] Updated weights for policy 0, policy_version 563750 (0.0032) [2024-06-24 02:12:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9236545536. Throughput: 0: 42832.5. Samples: 9236623280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:12:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 02:12:55,408][15401] Updated weights for policy 0, policy_version 563760 (0.0038) [2024-06-24 02:12:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9236758528. Throughput: 0: 42833.3. Samples: 9236881500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:12:58,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 02:12:59,708][15401] Updated weights for policy 0, policy_version 563770 (0.0028) [2024-06-24 02:13:02,964][15401] Updated weights for policy 0, policy_version 563780 (0.0033) [2024-06-24 02:13:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 9236971520. Throughput: 0: 42512.8. Samples: 9237130200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:03,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 02:13:07,563][15401] Updated weights for policy 0, policy_version 563790 (0.0039) [2024-06-24 02:13:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9237168128. Throughput: 0: 42549.9. Samples: 9237258760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 02:13:10,694][15401] Updated weights for policy 0, policy_version 563800 (0.0035) [2024-06-24 02:13:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9237397504. Throughput: 0: 42593.9. Samples: 9237515060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 02:13:15,107][15401] Updated weights for policy 0, policy_version 563810 (0.0039) [2024-06-24 02:13:18,354][15401] Updated weights for policy 0, policy_version 563820 (0.0032) [2024-06-24 02:13:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9237626880. Throughput: 0: 42698.3. Samples: 9237775820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:18,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 02:13:22,697][15401] Updated weights for policy 0, policy_version 563830 (0.0038) [2024-06-24 02:13:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9237823488. Throughput: 0: 42771.7. Samples: 9237903680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 02:13:24,190][15349] Signal inference workers to stop experience collection... (136800 times) [2024-06-24 02:13:24,190][15349] Signal inference workers to resume experience collection... (136800 times) [2024-06-24 02:13:24,222][15401] InferenceWorker_p0-w0: stopping experience collection (136800 times) [2024-06-24 02:13:24,222][15401] InferenceWorker_p0-w0: resuming experience collection (136800 times) [2024-06-24 02:13:25,824][15401] Updated weights for policy 0, policy_version 563840 (0.0032) [2024-06-24 02:13:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9238052864. Throughput: 0: 42765.4. Samples: 9238162840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 02:13:30,232][15401] Updated weights for policy 0, policy_version 563850 (0.0037) [2024-06-24 02:13:33,390][15132] Fps is (10 sec: 44235.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9238265856. Throughput: 0: 42776.7. Samples: 9238419740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 02:13:33,461][15401] Updated weights for policy 0, policy_version 563860 (0.0025) [2024-06-24 02:13:37,818][15401] Updated weights for policy 0, policy_version 563870 (0.0044) [2024-06-24 02:13:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 9238462464. Throughput: 0: 42771.2. Samples: 9238548080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:38,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 02:13:41,246][15401] Updated weights for policy 0, policy_version 563880 (0.0033) [2024-06-24 02:13:43,390][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 9238708224. Throughput: 0: 42678.7. Samples: 9238802040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:43,395][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 02:13:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563886_9238708224.pth... [2024-06-24 02:13:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563258_9228419072.pth [2024-06-24 02:13:45,756][15401] Updated weights for policy 0, policy_version 563890 (0.0042) [2024-06-24 02:13:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 9238904832. Throughput: 0: 42753.4. Samples: 9239054000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:48,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 02:13:49,332][15401] Updated weights for policy 0, policy_version 563900 (0.0042) [2024-06-24 02:13:53,304][15401] Updated weights for policy 0, policy_version 563910 (0.0037) [2024-06-24 02:13:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9239101440. Throughput: 0: 42661.1. Samples: 9239178520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 02:13:57,067][15401] Updated weights for policy 0, policy_version 563920 (0.0036) [2024-06-24 02:13:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9239347200. Throughput: 0: 42755.0. Samples: 9239439040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:13:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 02:14:00,815][15401] Updated weights for policy 0, policy_version 563930 (0.0034) [2024-06-24 02:14:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9239527424. Throughput: 0: 42661.0. Samples: 9239695560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:14:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 02:14:04,590][15401] Updated weights for policy 0, policy_version 563940 (0.0034) [2024-06-24 02:14:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9239740416. Throughput: 0: 42537.3. Samples: 9239817860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:14:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 02:14:09,036][15401] Updated weights for policy 0, policy_version 563950 (0.0030) [2024-06-24 02:14:12,328][15401] Updated weights for policy 0, policy_version 563960 (0.0036) [2024-06-24 02:14:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9239969792. Throughput: 0: 42587.0. Samples: 9240079260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 02:14:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 02:14:16,637][15401] Updated weights for policy 0, policy_version 563970 (0.0031) [2024-06-24 02:14:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9240182784. Throughput: 0: 42539.8. Samples: 9240334020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 02:14:19,980][15401] Updated weights for policy 0, policy_version 563980 (0.0032) [2024-06-24 02:14:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9240379392. Throughput: 0: 42601.3. Samples: 9240465040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 02:14:24,032][15401] Updated weights for policy 0, policy_version 563990 (0.0026) [2024-06-24 02:14:27,574][15401] Updated weights for policy 0, policy_version 564000 (0.0032) [2024-06-24 02:14:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9240592384. Throughput: 0: 42655.6. Samples: 9240721540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 02:14:31,882][15401] Updated weights for policy 0, policy_version 564010 (0.0035) [2024-06-24 02:14:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42820.9). Total num frames: 9240821760. Throughput: 0: 42820.0. Samples: 9240980900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:33,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 02:14:34,970][15401] Updated weights for policy 0, policy_version 564020 (0.0040) [2024-06-24 02:14:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9241034752. Throughput: 0: 42929.0. Samples: 9241110320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 02:14:39,316][15401] Updated weights for policy 0, policy_version 564030 (0.0036) [2024-06-24 02:14:42,455][15401] Updated weights for policy 0, policy_version 564040 (0.0033) [2024-06-24 02:14:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9241247744. Throughput: 0: 42770.6. Samples: 9241363720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 02:14:43,787][15349] Signal inference workers to stop experience collection... (136850 times) [2024-06-24 02:14:43,787][15349] Signal inference workers to resume experience collection... (136850 times) [2024-06-24 02:14:43,835][15401] InferenceWorker_p0-w0: stopping experience collection (136850 times) [2024-06-24 02:14:43,836][15401] InferenceWorker_p0-w0: resuming experience collection (136850 times) [2024-06-24 02:14:47,329][15401] Updated weights for policy 0, policy_version 564050 (0.0054) [2024-06-24 02:14:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9241460736. Throughput: 0: 42926.7. Samples: 9241627260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 02:14:50,655][15401] Updated weights for policy 0, policy_version 564060 (0.0043) [2024-06-24 02:14:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9241673728. Throughput: 0: 42910.7. Samples: 9241748840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 02:14:55,099][15401] Updated weights for policy 0, policy_version 564070 (0.0024) [2024-06-24 02:14:58,151][15401] Updated weights for policy 0, policy_version 564080 (0.0028) [2024-06-24 02:14:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9241886720. Throughput: 0: 42827.6. Samples: 9242006500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:14:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 02:15:02,632][15401] Updated weights for policy 0, policy_version 564090 (0.0041) [2024-06-24 02:15:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9242099712. Throughput: 0: 43031.2. Samples: 9242270420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 02:15:05,756][15401] Updated weights for policy 0, policy_version 564100 (0.0037) [2024-06-24 02:15:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9242329088. Throughput: 0: 42939.1. Samples: 9242397300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 02:15:10,196][15401] Updated weights for policy 0, policy_version 564110 (0.0030) [2024-06-24 02:15:13,295][15401] Updated weights for policy 0, policy_version 564120 (0.0029) [2024-06-24 02:15:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9242542080. Throughput: 0: 42875.4. Samples: 9242650940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:13,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 02:15:18,002][15401] Updated weights for policy 0, policy_version 564130 (0.0034) [2024-06-24 02:15:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9242722304. Throughput: 0: 42933.7. Samples: 9242912920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 02:15:20,990][15401] Updated weights for policy 0, policy_version 564140 (0.0033) [2024-06-24 02:15:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 42765.8). Total num frames: 9242984448. Throughput: 0: 42792.0. Samples: 9243035960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 02:15:25,741][15401] Updated weights for policy 0, policy_version 564150 (0.0043) [2024-06-24 02:15:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9243164672. Throughput: 0: 42887.7. Samples: 9243293660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 02:15:28,685][15401] Updated weights for policy 0, policy_version 564160 (0.0043) [2024-06-24 02:15:33,227][15401] Updated weights for policy 0, policy_version 564170 (0.0034) [2024-06-24 02:15:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9243377664. Throughput: 0: 42928.4. Samples: 9243559040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 02:15:36,311][15401] Updated weights for policy 0, policy_version 564180 (0.0032) [2024-06-24 02:15:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9243623424. Throughput: 0: 42952.5. Samples: 9243681700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 02:15:40,557][15401] Updated weights for policy 0, policy_version 564190 (0.0060) [2024-06-24 02:15:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9243820032. Throughput: 0: 43019.6. Samples: 9243942380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 02:15:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 02:15:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564198_9243820032.pth... [2024-06-24 02:15:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563571_9233547264.pth [2024-06-24 02:15:43,922][15401] Updated weights for policy 0, policy_version 564200 (0.0027) [2024-06-24 02:15:48,088][15401] Updated weights for policy 0, policy_version 564210 (0.0039) [2024-06-24 02:15:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9244016640. Throughput: 0: 42789.7. Samples: 9244195960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:15:48,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 02:15:52,170][15401] Updated weights for policy 0, policy_version 564220 (0.0032) [2024-06-24 02:15:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 9244262400. Throughput: 0: 42683.6. Samples: 9244318060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:15:53,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 02:15:54,291][15349] Signal inference workers to stop experience collection... (136900 times) [2024-06-24 02:15:54,347][15401] InferenceWorker_p0-w0: stopping experience collection (136900 times) [2024-06-24 02:15:54,355][15349] Signal inference workers to resume experience collection... (136900 times) [2024-06-24 02:15:54,365][15401] InferenceWorker_p0-w0: resuming experience collection (136900 times) [2024-06-24 02:15:56,257][15401] Updated weights for policy 0, policy_version 564230 (0.0040) [2024-06-24 02:15:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9244459008. Throughput: 0: 42847.3. Samples: 9244579060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:15:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 02:15:59,620][15401] Updated weights for policy 0, policy_version 564240 (0.0038) [2024-06-24 02:16:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9244639232. Throughput: 0: 42772.9. Samples: 9244837700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 02:16:03,985][15401] Updated weights for policy 0, policy_version 564250 (0.0029) [2024-06-24 02:16:06,973][15401] Updated weights for policy 0, policy_version 564260 (0.0041) [2024-06-24 02:16:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 9244884992. Throughput: 0: 42756.9. Samples: 9244960020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 02:16:11,451][15401] Updated weights for policy 0, policy_version 564270 (0.0036) [2024-06-24 02:16:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 9245097984. Throughput: 0: 42824.0. Samples: 9245220740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 02:16:14,562][15401] Updated weights for policy 0, policy_version 564280 (0.0029) [2024-06-24 02:16:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9245310976. Throughput: 0: 42583.9. Samples: 9245475320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 02:16:19,119][15401] Updated weights for policy 0, policy_version 564290 (0.0027) [2024-06-24 02:16:22,540][15401] Updated weights for policy 0, policy_version 564300 (0.0034) [2024-06-24 02:16:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42050.5, 300 sec: 42764.9). Total num frames: 9245507584. Throughput: 0: 42658.6. Samples: 9245601440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:23,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 02:16:26,524][15401] Updated weights for policy 0, policy_version 564310 (0.0035) [2024-06-24 02:16:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9245736960. Throughput: 0: 42666.7. Samples: 9245862380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 02:16:30,196][15401] Updated weights for policy 0, policy_version 564320 (0.0032) [2024-06-24 02:16:33,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9245966336. Throughput: 0: 42534.7. Samples: 9246110020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:33,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 02:16:34,016][15401] Updated weights for policy 0, policy_version 564330 (0.0039) [2024-06-24 02:16:38,201][15401] Updated weights for policy 0, policy_version 564340 (0.0037) [2024-06-24 02:16:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42709.4). Total num frames: 9246146560. Throughput: 0: 42679.8. Samples: 9246238660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 02:16:41,521][15401] Updated weights for policy 0, policy_version 564350 (0.0036) [2024-06-24 02:16:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9246375936. Throughput: 0: 42626.7. Samples: 9246497260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 02:16:45,779][15401] Updated weights for policy 0, policy_version 564360 (0.0026) [2024-06-24 02:16:48,392][15132] Fps is (10 sec: 45865.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9246605312. Throughput: 0: 42575.0. Samples: 9246753680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:48,393][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 02:16:49,218][15401] Updated weights for policy 0, policy_version 564370 (0.0029) [2024-06-24 02:16:53,287][15401] Updated weights for policy 0, policy_version 564380 (0.0040) [2024-06-24 02:16:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 9246801920. Throughput: 0: 42719.4. Samples: 9246882400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:53,391][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 02:16:56,772][15401] Updated weights for policy 0, policy_version 564390 (0.0036) [2024-06-24 02:16:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9247014912. Throughput: 0: 42450.6. Samples: 9247131020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:16:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 02:17:00,878][15401] Updated weights for policy 0, policy_version 564400 (0.0031) [2024-06-24 02:17:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9247211520. Throughput: 0: 42551.1. Samples: 9247390120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:17:03,390][15132] Avg episode reward: [(0, '0.129')] [2024-06-24 02:17:04,703][15401] Updated weights for policy 0, policy_version 564410 (0.0039) [2024-06-24 02:17:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9247440896. Throughput: 0: 42574.0. Samples: 9247517160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:17:08,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 02:17:08,464][15401] Updated weights for policy 0, policy_version 564420 (0.0037) [2024-06-24 02:17:12,135][15401] Updated weights for policy 0, policy_version 564430 (0.0029) [2024-06-24 02:17:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9247653888. Throughput: 0: 42451.6. Samples: 9247772700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:17:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 02:17:16,695][15401] Updated weights for policy 0, policy_version 564440 (0.0032) [2024-06-24 02:17:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9247866880. Throughput: 0: 42807.5. Samples: 9248036360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 02:17:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 02:17:19,758][15401] Updated weights for policy 0, policy_version 564450 (0.0051) [2024-06-24 02:17:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9248079872. Throughput: 0: 42689.5. Samples: 9248159680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 02:17:24,190][15401] Updated weights for policy 0, policy_version 564460 (0.0038) [2024-06-24 02:17:27,523][15401] Updated weights for policy 0, policy_version 564470 (0.0035) [2024-06-24 02:17:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9248309248. Throughput: 0: 42719.5. Samples: 9248419640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:28,390][15132] Avg episode reward: [(0, '0.929')] [2024-06-24 02:17:31,513][15349] Signal inference workers to stop experience collection... (136950 times) [2024-06-24 02:17:31,566][15349] Signal inference workers to resume experience collection... (136950 times) [2024-06-24 02:17:31,566][15401] InferenceWorker_p0-w0: stopping experience collection (136950 times) [2024-06-24 02:17:31,579][15401] InferenceWorker_p0-w0: resuming experience collection (136950 times) [2024-06-24 02:17:31,715][15401] Updated weights for policy 0, policy_version 564480 (0.0026) [2024-06-24 02:17:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9248505856. Throughput: 0: 42774.2. Samples: 9248678420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 02:17:35,365][15401] Updated weights for policy 0, policy_version 564490 (0.0037) [2024-06-24 02:17:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9248702464. Throughput: 0: 42792.9. Samples: 9248808080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 02:17:39,163][15401] Updated weights for policy 0, policy_version 564500 (0.0030) [2024-06-24 02:17:42,801][15401] Updated weights for policy 0, policy_version 564510 (0.0028) [2024-06-24 02:17:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9248931840. Throughput: 0: 43068.9. Samples: 9249069120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 02:17:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564511_9248948224.pth... [2024-06-24 02:17:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000563886_9238708224.pth [2024-06-24 02:17:46,814][15401] Updated weights for policy 0, policy_version 564520 (0.0043) [2024-06-24 02:17:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 9249161216. Throughput: 0: 42897.5. Samples: 9249320500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:48,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 02:17:50,301][15401] Updated weights for policy 0, policy_version 564530 (0.0038) [2024-06-24 02:17:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 9249341440. Throughput: 0: 42902.0. Samples: 9249447860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:53,401][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 02:17:54,407][15401] Updated weights for policy 0, policy_version 564540 (0.0039) [2024-06-24 02:17:58,204][15401] Updated weights for policy 0, policy_version 564550 (0.0034) [2024-06-24 02:17:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9249587200. Throughput: 0: 42911.5. Samples: 9249703720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:17:58,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 02:18:02,185][15401] Updated weights for policy 0, policy_version 564560 (0.0043) [2024-06-24 02:18:03,392][15132] Fps is (10 sec: 45875.2, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 9249800192. Throughput: 0: 42684.4. Samples: 9249957260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:03,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 02:18:05,745][15401] Updated weights for policy 0, policy_version 564570 (0.0037) [2024-06-24 02:18:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9249996800. Throughput: 0: 42758.3. Samples: 9250083800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:08,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 02:18:09,963][15401] Updated weights for policy 0, policy_version 564580 (0.0043) [2024-06-24 02:18:13,293][15401] Updated weights for policy 0, policy_version 564590 (0.0027) [2024-06-24 02:18:13,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9250242560. Throughput: 0: 42628.4. Samples: 9250337920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 02:18:17,554][15401] Updated weights for policy 0, policy_version 564600 (0.0024) [2024-06-24 02:18:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9250422784. Throughput: 0: 42599.5. Samples: 9250595400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:18:21,493][15401] Updated weights for policy 0, policy_version 564610 (0.0037) [2024-06-24 02:18:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9250635776. Throughput: 0: 42555.1. Samples: 9250723060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 02:18:25,487][15401] Updated weights for policy 0, policy_version 564620 (0.0040) [2024-06-24 02:18:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9250865152. Throughput: 0: 42333.9. Samples: 9250974140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 02:18:29,005][15401] Updated weights for policy 0, policy_version 564630 (0.0043) [2024-06-24 02:18:33,303][15401] Updated weights for policy 0, policy_version 564640 (0.0046) [2024-06-24 02:18:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 9251061760. Throughput: 0: 42701.2. Samples: 9251242060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 02:18:36,632][15401] Updated weights for policy 0, policy_version 564650 (0.0026) [2024-06-24 02:18:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9251274752. Throughput: 0: 42600.4. Samples: 9251364780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:38,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 02:18:40,838][15401] Updated weights for policy 0, policy_version 564660 (0.0039) [2024-06-24 02:18:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9251504128. Throughput: 0: 42607.6. Samples: 9251621060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 02:18:44,068][15401] Updated weights for policy 0, policy_version 564670 (0.0041) [2024-06-24 02:18:48,294][15401] Updated weights for policy 0, policy_version 564680 (0.0029) [2024-06-24 02:18:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9251717120. Throughput: 0: 42776.5. Samples: 9251882100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 02:18:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 02:18:51,601][15401] Updated weights for policy 0, policy_version 564690 (0.0036) [2024-06-24 02:18:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 9251913728. Throughput: 0: 42756.0. Samples: 9252007820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:18:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:18:56,105][15401] Updated weights for policy 0, policy_version 564700 (0.0032) [2024-06-24 02:18:57,205][15349] Signal inference workers to stop experience collection... (137000 times) [2024-06-24 02:18:57,237][15401] InferenceWorker_p0-w0: stopping experience collection (137000 times) [2024-06-24 02:18:57,264][15349] Signal inference workers to resume experience collection... (137000 times) [2024-06-24 02:18:57,268][15401] InferenceWorker_p0-w0: resuming experience collection (137000 times) [2024-06-24 02:18:58,392][15132] Fps is (10 sec: 44227.2, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 9252159488. Throughput: 0: 42880.2. Samples: 9252267620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:18:58,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:18:59,149][15401] Updated weights for policy 0, policy_version 564710 (0.0030) [2024-06-24 02:19:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9252356096. Throughput: 0: 42831.1. Samples: 9252522800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 02:19:03,693][15401] Updated weights for policy 0, policy_version 564720 (0.0034) [2024-06-24 02:19:06,776][15401] Updated weights for policy 0, policy_version 564730 (0.0027) [2024-06-24 02:19:08,390][15132] Fps is (10 sec: 40968.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9252569088. Throughput: 0: 42758.6. Samples: 9252647200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 02:19:11,310][15401] Updated weights for policy 0, policy_version 564740 (0.0033) [2024-06-24 02:19:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9252798464. Throughput: 0: 43012.8. Samples: 9252909720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:13,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:19:14,510][15401] Updated weights for policy 0, policy_version 564750 (0.0031) [2024-06-24 02:19:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9252995072. Throughput: 0: 42648.9. Samples: 9253161260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 02:19:18,867][15401] Updated weights for policy 0, policy_version 564760 (0.0028) [2024-06-24 02:19:22,674][15401] Updated weights for policy 0, policy_version 564770 (0.0027) [2024-06-24 02:19:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9253208064. Throughput: 0: 42760.5. Samples: 9253289000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 02:19:26,660][15401] Updated weights for policy 0, policy_version 564780 (0.0030) [2024-06-24 02:19:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9253437440. Throughput: 0: 42837.8. Samples: 9253548760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 02:19:30,162][15401] Updated weights for policy 0, policy_version 564790 (0.0027) [2024-06-24 02:19:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9253650432. Throughput: 0: 42842.3. Samples: 9253810000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 02:19:34,141][15401] Updated weights for policy 0, policy_version 564800 (0.0047) [2024-06-24 02:19:38,338][15401] Updated weights for policy 0, policy_version 564810 (0.0038) [2024-06-24 02:19:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9253847040. Throughput: 0: 42743.9. Samples: 9253931300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 02:19:41,739][15401] Updated weights for policy 0, policy_version 564820 (0.0043) [2024-06-24 02:19:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9254092800. Throughput: 0: 42785.9. Samples: 9254192900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:43,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-24 02:19:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564825_9254092800.pth... [2024-06-24 02:19:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564198_9243820032.pth [2024-06-24 02:19:45,967][15401] Updated weights for policy 0, policy_version 564830 (0.0028) [2024-06-24 02:19:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9254273024. Throughput: 0: 42725.5. Samples: 9254445440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 02:19:49,492][15401] Updated weights for policy 0, policy_version 564840 (0.0040) [2024-06-24 02:19:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9254469632. Throughput: 0: 42664.0. Samples: 9254567080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 02:19:53,623][15401] Updated weights for policy 0, policy_version 564850 (0.0037) [2024-06-24 02:19:57,405][15401] Updated weights for policy 0, policy_version 564860 (0.0030) [2024-06-24 02:19:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 9254715392. Throughput: 0: 42660.5. Samples: 9254829440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:19:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 02:20:01,388][15401] Updated weights for policy 0, policy_version 564870 (0.0041) [2024-06-24 02:20:03,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9254895616. Throughput: 0: 42756.9. Samples: 9255085320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:20:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 02:20:04,955][15401] Updated weights for policy 0, policy_version 564880 (0.0032) [2024-06-24 02:20:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9255124992. Throughput: 0: 42671.1. Samples: 9255209200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:20:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 02:20:08,981][15401] Updated weights for policy 0, policy_version 564890 (0.0035) [2024-06-24 02:20:12,328][15401] Updated weights for policy 0, policy_version 564900 (0.0046) [2024-06-24 02:20:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9255370752. Throughput: 0: 42639.1. Samples: 9255467520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:20:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 02:20:16,798][15401] Updated weights for policy 0, policy_version 564910 (0.0034) [2024-06-24 02:20:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9255534592. Throughput: 0: 42596.4. Samples: 9255726840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 02:20:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 02:20:19,932][15401] Updated weights for policy 0, policy_version 564920 (0.0031) [2024-06-24 02:20:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9255780352. Throughput: 0: 42566.0. Samples: 9255846760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 02:20:24,675][15401] Updated weights for policy 0, policy_version 564930 (0.0034) [2024-06-24 02:20:25,534][15349] Signal inference workers to stop experience collection... (137050 times) [2024-06-24 02:20:25,555][15401] InferenceWorker_p0-w0: stopping experience collection (137050 times) [2024-06-24 02:20:25,593][15349] Signal inference workers to resume experience collection... (137050 times) [2024-06-24 02:20:25,593][15401] InferenceWorker_p0-w0: resuming experience collection (137050 times) [2024-06-24 02:20:27,777][15401] Updated weights for policy 0, policy_version 564940 (0.0038) [2024-06-24 02:20:28,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 9255993344. Throughput: 0: 42464.9. Samples: 9256103920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:28,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 02:20:32,384][15401] Updated weights for policy 0, policy_version 564950 (0.0039) [2024-06-24 02:20:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 9256173568. Throughput: 0: 42677.2. Samples: 9256365920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 02:20:35,721][15401] Updated weights for policy 0, policy_version 564960 (0.0033) [2024-06-24 02:20:38,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9256402944. Throughput: 0: 42650.3. Samples: 9256486340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:20:39,939][15401] Updated weights for policy 0, policy_version 564970 (0.0029) [2024-06-24 02:20:43,242][15401] Updated weights for policy 0, policy_version 564980 (0.0029) [2024-06-24 02:20:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9256632320. Throughput: 0: 42507.1. Samples: 9256742260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 02:20:47,722][15401] Updated weights for policy 0, policy_version 564990 (0.0039) [2024-06-24 02:20:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9256812544. Throughput: 0: 42700.6. Samples: 9257006840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 02:20:50,977][15401] Updated weights for policy 0, policy_version 565000 (0.0049) [2024-06-24 02:20:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9257058304. Throughput: 0: 42520.9. Samples: 9257122640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 02:20:55,490][15401] Updated weights for policy 0, policy_version 565010 (0.0035) [2024-06-24 02:20:58,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9257271296. Throughput: 0: 42646.1. Samples: 9257386600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:20:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 02:20:58,540][15401] Updated weights for policy 0, policy_version 565020 (0.0035) [2024-06-24 02:21:03,358][15401] Updated weights for policy 0, policy_version 565030 (0.0032) [2024-06-24 02:21:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9257451520. Throughput: 0: 42561.8. Samples: 9257642120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 02:21:06,159][15401] Updated weights for policy 0, policy_version 565040 (0.0041) [2024-06-24 02:21:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9257697280. Throughput: 0: 42558.6. Samples: 9257761900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 02:21:10,869][15401] Updated weights for policy 0, policy_version 565050 (0.0034) [2024-06-24 02:21:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9257910272. Throughput: 0: 42710.3. Samples: 9258025780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:13,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 02:21:13,641][15401] Updated weights for policy 0, policy_version 565060 (0.0031) [2024-06-24 02:21:18,345][15401] Updated weights for policy 0, policy_version 565070 (0.0029) [2024-06-24 02:21:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 9258106880. Throughput: 0: 42639.6. Samples: 9258284700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:18,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 02:21:21,379][15401] Updated weights for policy 0, policy_version 565080 (0.0034) [2024-06-24 02:21:23,390][15132] Fps is (10 sec: 44233.9, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 9258352640. Throughput: 0: 42708.8. Samples: 9258408260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:23,391][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 02:21:25,933][15401] Updated weights for policy 0, policy_version 565090 (0.0039) [2024-06-24 02:21:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 9258532864. Throughput: 0: 43033.4. Samples: 9258678760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 02:21:28,926][15401] Updated weights for policy 0, policy_version 565100 (0.0037) [2024-06-24 02:21:33,389][15132] Fps is (10 sec: 39324.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9258745856. Throughput: 0: 42796.3. Samples: 9258932680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:33,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 02:21:33,452][15401] Updated weights for policy 0, policy_version 565110 (0.0028) [2024-06-24 02:21:35,949][15349] Signal inference workers to stop experience collection... (137100 times) [2024-06-24 02:21:35,990][15401] InferenceWorker_p0-w0: stopping experience collection (137100 times) [2024-06-24 02:21:36,011][15349] Signal inference workers to resume experience collection... (137100 times) [2024-06-24 02:21:36,018][15401] InferenceWorker_p0-w0: resuming experience collection (137100 times) [2024-06-24 02:21:36,490][15401] Updated weights for policy 0, policy_version 565120 (0.0041) [2024-06-24 02:21:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9258991616. Throughput: 0: 42998.7. Samples: 9259057580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 02:21:41,396][15401] Updated weights for policy 0, policy_version 565130 (0.0040) [2024-06-24 02:21:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 9259171840. Throughput: 0: 43000.9. Samples: 9259321640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:43,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 02:21:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565136_9259188224.pth... [2024-06-24 02:21:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564511_9248948224.pth [2024-06-24 02:21:44,184][15401] Updated weights for policy 0, policy_version 565140 (0.0026) [2024-06-24 02:21:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9259384832. Throughput: 0: 43011.5. Samples: 9259577640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 02:21:48,862][15401] Updated weights for policy 0, policy_version 565150 (0.0040) [2024-06-24 02:21:51,953][15401] Updated weights for policy 0, policy_version 565160 (0.0030) [2024-06-24 02:21:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9259630592. Throughput: 0: 43215.5. Samples: 9259706600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 02:21:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 02:21:56,197][15401] Updated weights for policy 0, policy_version 565170 (0.0043) [2024-06-24 02:21:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9259827200. Throughput: 0: 43129.8. Samples: 9259966620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:21:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 02:21:59,435][15401] Updated weights for policy 0, policy_version 565180 (0.0029) [2024-06-24 02:22:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 9260040192. Throughput: 0: 43083.1. Samples: 9260223540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:03,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 02:22:03,898][15401] Updated weights for policy 0, policy_version 565190 (0.0033) [2024-06-24 02:22:07,068][15401] Updated weights for policy 0, policy_version 565200 (0.0041) [2024-06-24 02:22:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9260285952. Throughput: 0: 43204.9. Samples: 9260352460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 02:22:11,898][15401] Updated weights for policy 0, policy_version 565210 (0.0030) [2024-06-24 02:22:13,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9260466176. Throughput: 0: 42856.8. Samples: 9260607320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 02:22:14,934][15401] Updated weights for policy 0, policy_version 565220 (0.0035) [2024-06-24 02:22:18,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9260679168. Throughput: 0: 42879.0. Samples: 9260862340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:18,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 02:22:19,359][15401] Updated weights for policy 0, policy_version 565230 (0.0027) [2024-06-24 02:22:22,521][15401] Updated weights for policy 0, policy_version 565240 (0.0041) [2024-06-24 02:22:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42872.0, 300 sec: 42765.0). Total num frames: 9260924928. Throughput: 0: 42997.4. Samples: 9260992460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 02:22:27,284][15401] Updated weights for policy 0, policy_version 565250 (0.0042) [2024-06-24 02:22:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9261105152. Throughput: 0: 42956.2. Samples: 9261254660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:22:30,218][15401] Updated weights for policy 0, policy_version 565260 (0.0027) [2024-06-24 02:22:33,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9261334528. Throughput: 0: 42859.9. Samples: 9261506340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:33,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 02:22:34,587][15401] Updated weights for policy 0, policy_version 565270 (0.0032) [2024-06-24 02:22:37,833][15401] Updated weights for policy 0, policy_version 565280 (0.0037) [2024-06-24 02:22:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9261563904. Throughput: 0: 43000.5. Samples: 9261641620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 02:22:42,019][15401] Updated weights for policy 0, policy_version 565290 (0.0032) [2024-06-24 02:22:43,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9261744128. Throughput: 0: 42854.3. Samples: 9261895060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:22:45,431][15401] Updated weights for policy 0, policy_version 565300 (0.0032) [2024-06-24 02:22:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 9261989888. Throughput: 0: 42843.6. Samples: 9262151400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 02:22:49,468][15401] Updated weights for policy 0, policy_version 565310 (0.0027) [2024-06-24 02:22:53,086][15401] Updated weights for policy 0, policy_version 565320 (0.0034) [2024-06-24 02:22:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9262202880. Throughput: 0: 42972.6. Samples: 9262286220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 02:22:56,313][15349] Signal inference workers to stop experience collection... (137150 times) [2024-06-24 02:22:56,314][15349] Signal inference workers to resume experience collection... (137150 times) [2024-06-24 02:22:56,345][15401] InferenceWorker_p0-w0: stopping experience collection (137150 times) [2024-06-24 02:22:56,345][15401] InferenceWorker_p0-w0: resuming experience collection (137150 times) [2024-06-24 02:22:56,999][15401] Updated weights for policy 0, policy_version 565330 (0.0040) [2024-06-24 02:22:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 9262399488. Throughput: 0: 42886.2. Samples: 9262537300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:22:58,393][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 02:23:00,734][15401] Updated weights for policy 0, policy_version 565340 (0.0035) [2024-06-24 02:23:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 9262628864. Throughput: 0: 43083.6. Samples: 9262801000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:23:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 02:23:04,639][15401] Updated weights for policy 0, policy_version 565350 (0.0033) [2024-06-24 02:23:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 9262841856. Throughput: 0: 43017.7. Samples: 9262928260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:23:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 02:23:08,454][15401] Updated weights for policy 0, policy_version 565360 (0.0030) [2024-06-24 02:23:11,986][15401] Updated weights for policy 0, policy_version 565370 (0.0035) [2024-06-24 02:23:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9263038464. Throughput: 0: 42812.8. Samples: 9263181240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:23:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 02:23:15,903][15401] Updated weights for policy 0, policy_version 565380 (0.0026) [2024-06-24 02:23:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 9263284224. Throughput: 0: 42885.6. Samples: 9263436180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:23:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 02:23:19,862][15401] Updated weights for policy 0, policy_version 565390 (0.0033) [2024-06-24 02:23:23,392][15132] Fps is (10 sec: 45863.3, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 9263497216. Throughput: 0: 42920.6. Samples: 9263573160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 02:23:23,393][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 02:23:23,718][15401] Updated weights for policy 0, policy_version 565400 (0.0030) [2024-06-24 02:23:27,232][15401] Updated weights for policy 0, policy_version 565410 (0.0042) [2024-06-24 02:23:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 9263693824. Throughput: 0: 42855.4. Samples: 9263823560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 02:23:31,326][15401] Updated weights for policy 0, policy_version 565420 (0.0022) [2024-06-24 02:23:33,389][15132] Fps is (10 sec: 42609.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9263923200. Throughput: 0: 42898.3. Samples: 9264081820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 02:23:35,035][15401] Updated weights for policy 0, policy_version 565430 (0.0051) [2024-06-24 02:23:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9264152576. Throughput: 0: 42971.2. Samples: 9264219920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 02:23:38,751][15401] Updated weights for policy 0, policy_version 565440 (0.0036) [2024-06-24 02:23:42,591][15401] Updated weights for policy 0, policy_version 565450 (0.0037) [2024-06-24 02:23:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9264332800. Throughput: 0: 42949.4. Samples: 9264469920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 02:23:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565450_9264332800.pth... [2024-06-24 02:23:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000564825_9254092800.pth [2024-06-24 02:23:46,330][15401] Updated weights for policy 0, policy_version 565460 (0.0048) [2024-06-24 02:23:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9264578560. Throughput: 0: 42756.9. Samples: 9264725060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:48,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 02:23:50,388][15401] Updated weights for policy 0, policy_version 565470 (0.0037) [2024-06-24 02:23:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 9264758784. Throughput: 0: 42849.6. Samples: 9264856500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:53,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 02:23:54,210][15401] Updated weights for policy 0, policy_version 565480 (0.0026) [2024-06-24 02:23:58,129][15401] Updated weights for policy 0, policy_version 565490 (0.0034) [2024-06-24 02:23:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 9264988160. Throughput: 0: 42874.9. Samples: 9265110620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:23:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 02:24:02,350][15401] Updated weights for policy 0, policy_version 565500 (0.0036) [2024-06-24 02:24:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9265201152. Throughput: 0: 42828.8. Samples: 9265363480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 02:24:05,828][15401] Updated weights for policy 0, policy_version 565510 (0.0033) [2024-06-24 02:24:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9265414144. Throughput: 0: 42542.4. Samples: 9265487460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 02:24:09,872][15401] Updated weights for policy 0, policy_version 565520 (0.0029) [2024-06-24 02:24:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9265627136. Throughput: 0: 42697.0. Samples: 9265744920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 02:24:13,509][15401] Updated weights for policy 0, policy_version 565530 (0.0048) [2024-06-24 02:24:17,623][15401] Updated weights for policy 0, policy_version 565540 (0.0031) [2024-06-24 02:24:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9265823744. Throughput: 0: 42599.1. Samples: 9265998780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 02:24:21,497][15401] Updated weights for policy 0, policy_version 565550 (0.0038) [2024-06-24 02:24:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9266053120. Throughput: 0: 42351.4. Samples: 9266125740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 02:24:25,251][15401] Updated weights for policy 0, policy_version 565560 (0.0044) [2024-06-24 02:24:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9266249728. Throughput: 0: 42434.3. Samples: 9266379460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 02:24:29,130][15401] Updated weights for policy 0, policy_version 565570 (0.0032) [2024-06-24 02:24:32,868][15401] Updated weights for policy 0, policy_version 565580 (0.0038) [2024-06-24 02:24:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9266462720. Throughput: 0: 42305.9. Samples: 9266628820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 02:24:36,737][15401] Updated weights for policy 0, policy_version 565590 (0.0029) [2024-06-24 02:24:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 9266675712. Throughput: 0: 42309.5. Samples: 9266760420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 02:24:41,054][15401] Updated weights for policy 0, policy_version 565600 (0.0024) [2024-06-24 02:24:43,094][15349] Signal inference workers to stop experience collection... (137200 times) [2024-06-24 02:24:43,142][15401] InferenceWorker_p0-w0: stopping experience collection (137200 times) [2024-06-24 02:24:43,150][15349] Signal inference workers to resume experience collection... (137200 times) [2024-06-24 02:24:43,158][15401] InferenceWorker_p0-w0: resuming experience collection (137200 times) [2024-06-24 02:24:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9266905088. Throughput: 0: 42387.2. Samples: 9267018040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 02:24:44,485][15401] Updated weights for policy 0, policy_version 565610 (0.0035) [2024-06-24 02:24:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 9267101696. Throughput: 0: 42399.6. Samples: 9267271460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 02:24:48,627][15401] Updated weights for policy 0, policy_version 565620 (0.0038) [2024-06-24 02:24:51,990][15401] Updated weights for policy 0, policy_version 565630 (0.0040) [2024-06-24 02:24:53,390][15132] Fps is (10 sec: 39318.6, 60 sec: 42324.8, 300 sec: 42653.8). Total num frames: 9267298304. Throughput: 0: 42466.7. Samples: 9267398500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:24:53,391][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:24:56,120][15401] Updated weights for policy 0, policy_version 565640 (0.0031) [2024-06-24 02:24:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 9267527680. Throughput: 0: 42489.8. Samples: 9267656960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:24:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 02:24:59,488][15401] Updated weights for policy 0, policy_version 565650 (0.0042) [2024-06-24 02:25:03,390][15132] Fps is (10 sec: 45878.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9267757056. Throughput: 0: 42467.5. Samples: 9267909820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:03,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 02:25:03,648][15401] Updated weights for policy 0, policy_version 565660 (0.0031) [2024-06-24 02:25:07,387][15401] Updated weights for policy 0, policy_version 565670 (0.0035) [2024-06-24 02:25:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9267953664. Throughput: 0: 42549.5. Samples: 9268040460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 02:25:11,086][15401] Updated weights for policy 0, policy_version 565680 (0.0028) [2024-06-24 02:25:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 9268166656. Throughput: 0: 42759.9. Samples: 9268303660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 02:25:14,869][15401] Updated weights for policy 0, policy_version 565690 (0.0032) [2024-06-24 02:25:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9268396032. Throughput: 0: 42653.3. Samples: 9268548220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 02:25:19,649][15401] Updated weights for policy 0, policy_version 565700 (0.0033) [2024-06-24 02:25:22,403][15401] Updated weights for policy 0, policy_version 565710 (0.0038) [2024-06-24 02:25:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42709.8). Total num frames: 9268592640. Throughput: 0: 42714.7. Samples: 9268682580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 02:25:27,179][15401] Updated weights for policy 0, policy_version 565720 (0.0024) [2024-06-24 02:25:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9268805632. Throughput: 0: 42821.3. Samples: 9268945000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 02:25:30,151][15401] Updated weights for policy 0, policy_version 565730 (0.0039) [2024-06-24 02:25:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9269035008. Throughput: 0: 42841.3. Samples: 9269199320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 02:25:34,487][15401] Updated weights for policy 0, policy_version 565740 (0.0032) [2024-06-24 02:25:37,971][15401] Updated weights for policy 0, policy_version 565750 (0.0040) [2024-06-24 02:25:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9269264384. Throughput: 0: 42819.4. Samples: 9269325340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 02:25:42,322][15401] Updated weights for policy 0, policy_version 565760 (0.0025) [2024-06-24 02:25:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 9269460992. Throughput: 0: 42981.5. Samples: 9269591240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:43,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 02:25:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565764_9269477376.pth... [2024-06-24 02:25:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565136_9259188224.pth [2024-06-24 02:25:45,615][15401] Updated weights for policy 0, policy_version 565770 (0.0035) [2024-06-24 02:25:48,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 9269690368. Throughput: 0: 42956.1. Samples: 9269843120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:48,405][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 02:25:50,057][15401] Updated weights for policy 0, policy_version 565780 (0.0048) [2024-06-24 02:25:53,229][15401] Updated weights for policy 0, policy_version 565790 (0.0031) [2024-06-24 02:25:53,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43418.1, 300 sec: 42820.6). Total num frames: 9269903360. Throughput: 0: 42861.6. Samples: 9269969240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:53,394][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 02:25:57,526][15401] Updated weights for policy 0, policy_version 565800 (0.0036) [2024-06-24 02:25:58,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9270099968. Throughput: 0: 42755.6. Samples: 9270227660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:25:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 02:26:01,175][15401] Updated weights for policy 0, policy_version 565810 (0.0040) [2024-06-24 02:26:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9270329344. Throughput: 0: 43011.5. Samples: 9270483740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:26:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 02:26:05,050][15401] Updated weights for policy 0, policy_version 565820 (0.0032) [2024-06-24 02:26:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9270525952. Throughput: 0: 42987.6. Samples: 9270617020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:26:08,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 02:26:08,659][15401] Updated weights for policy 0, policy_version 565830 (0.0037) [2024-06-24 02:26:12,591][15401] Updated weights for policy 0, policy_version 565840 (0.0033) [2024-06-24 02:26:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9270738944. Throughput: 0: 42862.3. Samples: 9270873800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:26:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 02:26:16,292][15401] Updated weights for policy 0, policy_version 565850 (0.0028) [2024-06-24 02:26:18,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42709.6). Total num frames: 9270951936. Throughput: 0: 42871.9. Samples: 9271128560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:26:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 02:26:20,189][15401] Updated weights for policy 0, policy_version 565860 (0.0041) [2024-06-24 02:26:23,307][15349] Signal inference workers to stop experience collection... (137250 times) [2024-06-24 02:26:23,307][15349] Signal inference workers to resume experience collection... (137250 times) [2024-06-24 02:26:23,360][15401] InferenceWorker_p0-w0: stopping experience collection (137250 times) [2024-06-24 02:26:23,360][15401] InferenceWorker_p0-w0: resuming experience collection (137250 times) [2024-06-24 02:26:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9271164928. Throughput: 0: 42891.7. Samples: 9271255460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:26:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 02:26:23,853][15401] Updated weights for policy 0, policy_version 565870 (0.0048) [2024-06-24 02:26:27,873][15401] Updated weights for policy 0, policy_version 565880 (0.0036) [2024-06-24 02:26:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9271377920. Throughput: 0: 42657.5. Samples: 9271510720. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 02:26:31,596][15401] Updated weights for policy 0, policy_version 565890 (0.0030) [2024-06-24 02:26:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9271590912. Throughput: 0: 42835.4. Samples: 9271770440. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 02:26:35,366][15401] Updated weights for policy 0, policy_version 565900 (0.0031) [2024-06-24 02:26:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9271820288. Throughput: 0: 42914.7. Samples: 9271900400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 02:26:39,058][15401] Updated weights for policy 0, policy_version 565910 (0.0035) [2024-06-24 02:26:43,077][15401] Updated weights for policy 0, policy_version 565920 (0.0034) [2024-06-24 02:26:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9272033280. Throughput: 0: 42943.0. Samples: 9272160100. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 02:26:46,697][15401] Updated weights for policy 0, policy_version 565930 (0.0034) [2024-06-24 02:26:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 9272229888. Throughput: 0: 43000.5. Samples: 9272418760. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 02:26:50,729][15401] Updated weights for policy 0, policy_version 565940 (0.0029) [2024-06-24 02:26:53,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9272459264. Throughput: 0: 42765.6. Samples: 9272541580. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:53,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 02:26:54,202][15401] Updated weights for policy 0, policy_version 565950 (0.0038) [2024-06-24 02:26:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9272672256. Throughput: 0: 42842.7. Samples: 9272801720. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:26:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 02:26:58,505][15401] Updated weights for policy 0, policy_version 565960 (0.0041) [2024-06-24 02:27:01,810][15401] Updated weights for policy 0, policy_version 565970 (0.0030) [2024-06-24 02:27:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9272885248. Throughput: 0: 42699.6. Samples: 9273050040. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 02:27:06,165][15401] Updated weights for policy 0, policy_version 565980 (0.0033) [2024-06-24 02:27:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9273098240. Throughput: 0: 42752.4. Samples: 9273179320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 02:27:09,737][15401] Updated weights for policy 0, policy_version 565990 (0.0037) [2024-06-24 02:27:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 9273294848. Throughput: 0: 42772.5. Samples: 9273435480. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 02:27:14,039][15401] Updated weights for policy 0, policy_version 566000 (0.0029) [2024-06-24 02:27:17,253][15401] Updated weights for policy 0, policy_version 566010 (0.0026) [2024-06-24 02:27:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9273524224. Throughput: 0: 42528.4. Samples: 9273684220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 02:27:21,791][15401] Updated weights for policy 0, policy_version 566020 (0.0027) [2024-06-24 02:27:23,394][15132] Fps is (10 sec: 45852.7, 60 sec: 43141.0, 300 sec: 42875.4). Total num frames: 9273753600. Throughput: 0: 42658.6. Samples: 9273820240. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:23,395][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 02:27:25,124][15401] Updated weights for policy 0, policy_version 566030 (0.0034) [2024-06-24 02:27:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9273917440. Throughput: 0: 42439.1. Samples: 9274069860. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 02:27:29,595][15401] Updated weights for policy 0, policy_version 566040 (0.0042) [2024-06-24 02:27:32,795][15401] Updated weights for policy 0, policy_version 566050 (0.0028) [2024-06-24 02:27:33,390][15132] Fps is (10 sec: 42618.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9274179584. Throughput: 0: 42268.7. Samples: 9274320860. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:33,396][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 02:27:37,332][15401] Updated weights for policy 0, policy_version 566060 (0.0031) [2024-06-24 02:27:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9274376192. Throughput: 0: 42610.4. Samples: 9274458940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 02:27:40,283][15401] Updated weights for policy 0, policy_version 566070 (0.0023) [2024-06-24 02:27:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 9274572800. Throughput: 0: 42272.5. Samples: 9274703980. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 02:27:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566075_9274572800.pth... [2024-06-24 02:27:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565450_9264332800.pth [2024-06-24 02:27:44,938][15401] Updated weights for policy 0, policy_version 566080 (0.0043) [2024-06-24 02:27:47,835][15401] Updated weights for policy 0, policy_version 566090 (0.0041) [2024-06-24 02:27:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9274818560. Throughput: 0: 42287.2. Samples: 9274952960. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 02:27:52,849][15401] Updated weights for policy 0, policy_version 566100 (0.0036) [2024-06-24 02:27:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42654.3). Total num frames: 9274982400. Throughput: 0: 42486.7. Samples: 9275091220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 02:27:53,569][15349] Signal inference workers to stop experience collection... (137300 times) [2024-06-24 02:27:53,569][15349] Signal inference workers to resume experience collection... (137300 times) [2024-06-24 02:27:53,583][15401] InferenceWorker_p0-w0: stopping experience collection (137300 times) [2024-06-24 02:27:53,583][15401] InferenceWorker_p0-w0: resuming experience collection (137300 times) [2024-06-24 02:27:55,475][15401] Updated weights for policy 0, policy_version 566110 (0.0039) [2024-06-24 02:27:58,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9275195392. Throughput: 0: 42201.2. Samples: 9275334540. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 02:27:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 02:28:00,922][15401] Updated weights for policy 0, policy_version 566120 (0.0034) [2024-06-24 02:28:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9275457536. Throughput: 0: 42214.8. Samples: 9275583880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 02:28:03,481][15401] Updated weights for policy 0, policy_version 566130 (0.0040) [2024-06-24 02:28:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9275621376. Throughput: 0: 42425.4. Samples: 9275729180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 02:28:08,588][15401] Updated weights for policy 0, policy_version 566140 (0.0052) [2024-06-24 02:28:11,092][15401] Updated weights for policy 0, policy_version 566150 (0.0034) [2024-06-24 02:28:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9275850752. Throughput: 0: 42298.3. Samples: 9275973280. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 02:28:16,268][15401] Updated weights for policy 0, policy_version 566160 (0.0031) [2024-06-24 02:28:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 9276080128. Throughput: 0: 42376.3. Samples: 9276227780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:18,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-24 02:28:18,804][15401] Updated weights for policy 0, policy_version 566170 (0.0040) [2024-06-24 02:28:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41782.5, 300 sec: 42598.4). Total num frames: 9276260352. Throughput: 0: 42284.8. Samples: 9276361760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 02:28:23,776][15401] Updated weights for policy 0, policy_version 566180 (0.0029) [2024-06-24 02:28:26,398][15401] Updated weights for policy 0, policy_version 566190 (0.0046) [2024-06-24 02:28:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9276489728. Throughput: 0: 42351.9. Samples: 9276609820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 02:28:31,713][15401] Updated weights for policy 0, policy_version 566200 (0.0034) [2024-06-24 02:28:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9276719104. Throughput: 0: 42679.4. Samples: 9276873540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 02:28:34,182][15401] Updated weights for policy 0, policy_version 566210 (0.0027) [2024-06-24 02:28:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9276899328. Throughput: 0: 42425.7. Samples: 9277000380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:38,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 02:28:39,406][15401] Updated weights for policy 0, policy_version 566220 (0.0039) [2024-06-24 02:28:42,001][15401] Updated weights for policy 0, policy_version 566230 (0.0044) [2024-06-24 02:28:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9277128704. Throughput: 0: 42567.6. Samples: 9277250080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 02:28:46,914][15401] Updated weights for policy 0, policy_version 566240 (0.0030) [2024-06-24 02:28:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 9277341696. Throughput: 0: 42935.1. Samples: 9277515960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 02:28:50,182][15401] Updated weights for policy 0, policy_version 566250 (0.0037) [2024-06-24 02:28:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9277538304. Throughput: 0: 42428.5. Samples: 9277638460. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 02:28:54,546][15401] Updated weights for policy 0, policy_version 566260 (0.0033) [2024-06-24 02:28:57,914][15401] Updated weights for policy 0, policy_version 566270 (0.0032) [2024-06-24 02:28:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9277784064. Throughput: 0: 42536.4. Samples: 9277887420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:28:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 02:29:02,233][15401] Updated weights for policy 0, policy_version 566280 (0.0047) [2024-06-24 02:29:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9277980672. Throughput: 0: 42784.8. Samples: 9278153100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 02:29:03,971][15349] Signal inference workers to stop experience collection... (137350 times) [2024-06-24 02:29:04,006][15401] InferenceWorker_p0-w0: stopping experience collection (137350 times) [2024-06-24 02:29:04,029][15349] Signal inference workers to resume experience collection... (137350 times) [2024-06-24 02:29:04,029][15401] InferenceWorker_p0-w0: resuming experience collection (137350 times) [2024-06-24 02:29:05,736][15401] Updated weights for policy 0, policy_version 566290 (0.0029) [2024-06-24 02:29:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9278193664. Throughput: 0: 42573.0. Samples: 9278277540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 02:29:10,046][15401] Updated weights for policy 0, policy_version 566300 (0.0027) [2024-06-24 02:29:13,340][15401] Updated weights for policy 0, policy_version 566310 (0.0039) [2024-06-24 02:29:13,390][15132] Fps is (10 sec: 44235.3, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 9278423040. Throughput: 0: 42610.4. Samples: 9278527300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 02:29:17,771][15401] Updated weights for policy 0, policy_version 566320 (0.0042) [2024-06-24 02:29:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 9278603264. Throughput: 0: 42540.6. Samples: 9278787860. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 02:29:20,822][15401] Updated weights for policy 0, policy_version 566330 (0.0051) [2024-06-24 02:29:23,389][15132] Fps is (10 sec: 40961.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9278832640. Throughput: 0: 42271.7. Samples: 9278902600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 02:29:25,648][15401] Updated weights for policy 0, policy_version 566340 (0.0033) [2024-06-24 02:29:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9279062016. Throughput: 0: 42594.6. Samples: 9279166840. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-24 02:29:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 02:29:28,712][15401] Updated weights for policy 0, policy_version 566350 (0.0029) [2024-06-24 02:29:33,199][15401] Updated weights for policy 0, policy_version 566360 (0.0023) [2024-06-24 02:29:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9279242240. Throughput: 0: 42360.7. Samples: 9279422200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 02:29:36,163][15401] Updated weights for policy 0, policy_version 566370 (0.0035) [2024-06-24 02:29:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9279471616. Throughput: 0: 42298.7. Samples: 9279541900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 02:29:40,731][15401] Updated weights for policy 0, policy_version 566380 (0.0027) [2024-06-24 02:29:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9279684608. Throughput: 0: 42691.5. Samples: 9279808540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 02:29:43,548][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566389_9279717376.pth... [2024-06-24 02:29:43,616][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000565764_9269477376.pth [2024-06-24 02:29:43,791][15401] Updated weights for policy 0, policy_version 566390 (0.0023) [2024-06-24 02:29:48,312][15401] Updated weights for policy 0, policy_version 566400 (0.0041) [2024-06-24 02:29:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 9279897600. Throughput: 0: 42375.2. Samples: 9280059980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 02:29:51,649][15401] Updated weights for policy 0, policy_version 566410 (0.0027) [2024-06-24 02:29:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 9280077824. Throughput: 0: 42349.3. Samples: 9280183260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:53,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 02:29:56,270][15401] Updated weights for policy 0, policy_version 566420 (0.0033) [2024-06-24 02:29:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 9280307200. Throughput: 0: 42513.2. Samples: 9280440380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:29:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 02:29:59,505][15401] Updated weights for policy 0, policy_version 566430 (0.0034) [2024-06-24 02:30:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9280520192. Throughput: 0: 42241.8. Samples: 9280688740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 02:30:03,814][15401] Updated weights for policy 0, policy_version 566440 (0.0026) [2024-06-24 02:30:07,201][15401] Updated weights for policy 0, policy_version 566450 (0.0041) [2024-06-24 02:30:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9280733184. Throughput: 0: 42575.9. Samples: 9280818520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 02:30:11,375][15401] Updated weights for policy 0, policy_version 566460 (0.0033) [2024-06-24 02:30:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 9280929792. Throughput: 0: 42289.2. Samples: 9281069860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:13,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 02:30:15,219][15401] Updated weights for policy 0, policy_version 566470 (0.0041) [2024-06-24 02:30:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9281175552. Throughput: 0: 42341.5. Samples: 9281327560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 02:30:19,043][15401] Updated weights for policy 0, policy_version 566480 (0.0022) [2024-06-24 02:30:22,739][15401] Updated weights for policy 0, policy_version 566490 (0.0028) [2024-06-24 02:30:23,389][15132] Fps is (10 sec: 45876.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9281388544. Throughput: 0: 42640.9. Samples: 9281460740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 02:30:26,906][15401] Updated weights for policy 0, policy_version 566500 (0.0029) [2024-06-24 02:30:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 9281568768. Throughput: 0: 42219.2. Samples: 9281708400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:28,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-24 02:30:30,458][15401] Updated weights for policy 0, policy_version 566510 (0.0038) [2024-06-24 02:30:31,800][15349] Signal inference workers to stop experience collection... (137400 times) [2024-06-24 02:30:31,802][15349] Signal inference workers to resume experience collection... (137400 times) [2024-06-24 02:30:31,817][15401] InferenceWorker_p0-w0: stopping experience collection (137400 times) [2024-06-24 02:30:31,817][15401] InferenceWorker_p0-w0: resuming experience collection (137400 times) [2024-06-24 02:30:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 9281814528. Throughput: 0: 42240.3. Samples: 9281960800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:33,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 02:30:35,054][15401] Updated weights for policy 0, policy_version 566520 (0.0046) [2024-06-24 02:30:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 9281994752. Throughput: 0: 42376.5. Samples: 9282090200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 02:30:38,579][15401] Updated weights for policy 0, policy_version 566530 (0.0030) [2024-06-24 02:30:42,674][15401] Updated weights for policy 0, policy_version 566540 (0.0035) [2024-06-24 02:30:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42432.7). Total num frames: 9282207744. Throughput: 0: 42250.6. Samples: 9282341660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 02:30:46,109][15401] Updated weights for policy 0, policy_version 566550 (0.0032) [2024-06-24 02:30:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 9282453504. Throughput: 0: 42452.0. Samples: 9282599080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 02:30:50,258][15401] Updated weights for policy 0, policy_version 566560 (0.0029) [2024-06-24 02:30:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9282633728. Throughput: 0: 42454.3. Samples: 9282728960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 02:30:54,125][15401] Updated weights for policy 0, policy_version 566570 (0.0034) [2024-06-24 02:30:57,893][15401] Updated weights for policy 0, policy_version 566580 (0.0022) [2024-06-24 02:30:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 9282863104. Throughput: 0: 42543.7. Samples: 9282984420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 02:30:58,393][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 02:31:01,847][15401] Updated weights for policy 0, policy_version 566590 (0.0035) [2024-06-24 02:31:03,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9283108864. Throughput: 0: 42473.1. Samples: 9283238860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:03,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-24 02:31:05,439][15401] Updated weights for policy 0, policy_version 566600 (0.0034) [2024-06-24 02:31:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 9283272704. Throughput: 0: 42502.2. Samples: 9283373340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 02:31:09,326][15401] Updated weights for policy 0, policy_version 566610 (0.0034) [2024-06-24 02:31:13,135][15401] Updated weights for policy 0, policy_version 566620 (0.0035) [2024-06-24 02:31:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9283502080. Throughput: 0: 42600.3. Samples: 9283625420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 02:31:17,187][15401] Updated weights for policy 0, policy_version 566630 (0.0038) [2024-06-24 02:31:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9283731456. Throughput: 0: 42635.6. Samples: 9283879400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 02:31:20,755][15401] Updated weights for policy 0, policy_version 566640 (0.0033) [2024-06-24 02:31:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 9283928064. Throughput: 0: 42685.2. Samples: 9284011040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 02:31:24,671][15401] Updated weights for policy 0, policy_version 566650 (0.0041) [2024-06-24 02:31:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9284141056. Throughput: 0: 42732.1. Samples: 9284264600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 02:31:28,411][15401] Updated weights for policy 0, policy_version 566660 (0.0030) [2024-06-24 02:31:32,159][15401] Updated weights for policy 0, policy_version 566670 (0.0028) [2024-06-24 02:31:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 9284370432. Throughput: 0: 42756.9. Samples: 9284523140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 02:31:36,026][15401] Updated weights for policy 0, policy_version 566680 (0.0032) [2024-06-24 02:31:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 9284583424. Throughput: 0: 42696.0. Samples: 9284650280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 02:31:39,869][15401] Updated weights for policy 0, policy_version 566690 (0.0043) [2024-06-24 02:31:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 9284796416. Throughput: 0: 42873.4. Samples: 9284913620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 02:31:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566699_9284796416.pth... [2024-06-24 02:31:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566075_9274572800.pth [2024-06-24 02:31:43,640][15401] Updated weights for policy 0, policy_version 566700 (0.0031) [2024-06-24 02:31:47,569][15401] Updated weights for policy 0, policy_version 566710 (0.0030) [2024-06-24 02:31:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 9285009408. Throughput: 0: 42812.2. Samples: 9285165400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 02:31:51,709][15401] Updated weights for policy 0, policy_version 566720 (0.0041) [2024-06-24 02:31:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9285206016. Throughput: 0: 42708.4. Samples: 9285295220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:53,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 02:31:55,326][15401] Updated weights for policy 0, policy_version 566730 (0.0037) [2024-06-24 02:31:58,392][15132] Fps is (10 sec: 40951.6, 60 sec: 42598.7, 300 sec: 42487.0). Total num frames: 9285419008. Throughput: 0: 42729.8. Samples: 9285548340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:31:58,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 02:31:59,118][15401] Updated weights for policy 0, policy_version 566740 (0.0037) [2024-06-24 02:32:02,976][15349] Signal inference workers to stop experience collection... (137450 times) [2024-06-24 02:32:03,027][15349] Signal inference workers to resume experience collection... (137450 times) [2024-06-24 02:32:03,027][15401] InferenceWorker_p0-w0: stopping experience collection (137450 times) [2024-06-24 02:32:03,034][15401] Updated weights for policy 0, policy_version 566750 (0.0030) [2024-06-24 02:32:03,046][15401] InferenceWorker_p0-w0: resuming experience collection (137450 times) [2024-06-24 02:32:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 9285648384. Throughput: 0: 42810.2. Samples: 9285805860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 02:32:06,935][15401] Updated weights for policy 0, policy_version 566760 (0.0033) [2024-06-24 02:32:08,389][15132] Fps is (10 sec: 42607.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9285844992. Throughput: 0: 42736.1. Samples: 9285934160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 02:32:10,554][15401] Updated weights for policy 0, policy_version 566770 (0.0031) [2024-06-24 02:32:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 9286057984. Throughput: 0: 42684.5. Samples: 9286185400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 02:32:14,547][15401] Updated weights for policy 0, policy_version 566780 (0.0035) [2024-06-24 02:32:18,201][15401] Updated weights for policy 0, policy_version 566790 (0.0036) [2024-06-24 02:32:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42543.6). Total num frames: 9286303744. Throughput: 0: 42653.4. Samples: 9286442540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 02:32:22,253][15401] Updated weights for policy 0, policy_version 566800 (0.0031) [2024-06-24 02:32:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 9286483968. Throughput: 0: 42736.0. Samples: 9286573500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:23,393][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 02:32:25,804][15401] Updated weights for policy 0, policy_version 566810 (0.0044) [2024-06-24 02:32:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9286713344. Throughput: 0: 42577.7. Samples: 9286829620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 02:32:29,859][15401] Updated weights for policy 0, policy_version 566820 (0.0028) [2024-06-24 02:32:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 9286926336. Throughput: 0: 42770.3. Samples: 9287090060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 02:32:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 02:32:33,419][15401] Updated weights for policy 0, policy_version 566830 (0.0032) [2024-06-24 02:32:37,538][15401] Updated weights for policy 0, policy_version 566840 (0.0032) [2024-06-24 02:32:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9287139328. Throughput: 0: 42738.1. Samples: 9287218440. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:32:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 02:32:41,082][15401] Updated weights for policy 0, policy_version 566850 (0.0037) [2024-06-24 02:32:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9287352320. Throughput: 0: 42712.1. Samples: 9287470300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:32:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 02:32:45,231][15401] Updated weights for policy 0, policy_version 566860 (0.0030) [2024-06-24 02:32:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9287565312. Throughput: 0: 43014.2. Samples: 9287741500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:32:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 02:32:48,597][15401] Updated weights for policy 0, policy_version 566870 (0.0033) [2024-06-24 02:32:52,681][15401] Updated weights for policy 0, policy_version 566880 (0.0036) [2024-06-24 02:32:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 9287778304. Throughput: 0: 42927.4. Samples: 9287866000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:32:53,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 02:32:56,412][15401] Updated weights for policy 0, policy_version 566890 (0.0058) [2024-06-24 02:32:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.0, 300 sec: 42542.9). Total num frames: 9288007680. Throughput: 0: 42821.3. Samples: 9288112360. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:32:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 02:33:00,494][15401] Updated weights for policy 0, policy_version 566900 (0.0038) [2024-06-24 02:33:03,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9288187904. Throughput: 0: 43059.0. Samples: 9288380200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 02:33:04,001][15349] Signal inference workers to stop experience collection... (137500 times) [2024-06-24 02:33:04,022][15401] InferenceWorker_p0-w0: stopping experience collection (137500 times) [2024-06-24 02:33:04,111][15349] Signal inference workers to resume experience collection... (137500 times) [2024-06-24 02:33:04,111][15401] InferenceWorker_p0-w0: resuming experience collection (137500 times) [2024-06-24 02:33:04,255][15401] Updated weights for policy 0, policy_version 566910 (0.0031) [2024-06-24 02:33:08,083][15401] Updated weights for policy 0, policy_version 566920 (0.0034) [2024-06-24 02:33:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9288433664. Throughput: 0: 42724.4. Samples: 9288496000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 02:33:11,999][15401] Updated weights for policy 0, policy_version 566930 (0.0028) [2024-06-24 02:33:13,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 9288663040. Throughput: 0: 42721.0. Samples: 9288752060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:13,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 02:33:15,688][15401] Updated weights for policy 0, policy_version 566940 (0.0051) [2024-06-24 02:33:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9288826880. Throughput: 0: 42671.1. Samples: 9289010260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:18,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 02:33:19,605][15401] Updated weights for policy 0, policy_version 566950 (0.0026) [2024-06-24 02:33:23,396][15132] Fps is (10 sec: 39296.6, 60 sec: 42868.6, 300 sec: 42597.5). Total num frames: 9289056256. Throughput: 0: 42462.1. Samples: 9289129500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:23,396][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 02:33:23,673][15401] Updated weights for policy 0, policy_version 566960 (0.0043) [2024-06-24 02:33:27,258][15401] Updated weights for policy 0, policy_version 566970 (0.0032) [2024-06-24 02:33:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 9289302016. Throughput: 0: 42744.1. Samples: 9289393780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:33:31,251][15401] Updated weights for policy 0, policy_version 566980 (0.0029) [2024-06-24 02:33:33,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9289465856. Throughput: 0: 42464.8. Samples: 9289652420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 02:33:34,934][15401] Updated weights for policy 0, policy_version 566990 (0.0027) [2024-06-24 02:33:38,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 9289695232. Throughput: 0: 42212.0. Samples: 9289765540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:38,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 02:33:39,333][15401] Updated weights for policy 0, policy_version 567000 (0.0037) [2024-06-24 02:33:42,499][15401] Updated weights for policy 0, policy_version 567010 (0.0036) [2024-06-24 02:33:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9289924608. Throughput: 0: 42679.4. Samples: 9290032940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 02:33:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567013_9289940992.pth... [2024-06-24 02:33:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566389_9279717376.pth [2024-06-24 02:33:47,293][15401] Updated weights for policy 0, policy_version 567020 (0.0032) [2024-06-24 02:33:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9290104832. Throughput: 0: 42367.1. Samples: 9290286720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:48,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 02:33:50,206][15401] Updated weights for policy 0, policy_version 567030 (0.0023) [2024-06-24 02:33:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 9290317824. Throughput: 0: 42573.9. Samples: 9290411820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 02:33:54,965][15401] Updated weights for policy 0, policy_version 567040 (0.0033) [2024-06-24 02:33:58,099][15401] Updated weights for policy 0, policy_version 567050 (0.0033) [2024-06-24 02:33:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9290547200. Throughput: 0: 42529.0. Samples: 9290665860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:33:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 02:34:02,610][15401] Updated weights for policy 0, policy_version 567060 (0.0030) [2024-06-24 02:34:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 9290743808. Throughput: 0: 42541.3. Samples: 9290924620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 02:34:03,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 02:34:05,690][15401] Updated weights for policy 0, policy_version 567070 (0.0031) [2024-06-24 02:34:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9290973184. Throughput: 0: 42589.5. Samples: 9291045760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 02:34:10,290][15401] Updated weights for policy 0, policy_version 567080 (0.0029) [2024-06-24 02:34:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9291186176. Throughput: 0: 42555.5. Samples: 9291308780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 02:34:13,451][15401] Updated weights for policy 0, policy_version 567090 (0.0026) [2024-06-24 02:34:17,771][15401] Updated weights for policy 0, policy_version 567100 (0.0028) [2024-06-24 02:34:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9291382784. Throughput: 0: 42688.1. Samples: 9291573380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 02:34:19,091][15349] Signal inference workers to stop experience collection... (137550 times) [2024-06-24 02:34:19,091][15349] Signal inference workers to resume experience collection... (137550 times) [2024-06-24 02:34:19,136][15401] InferenceWorker_p0-w0: stopping experience collection (137550 times) [2024-06-24 02:34:19,137][15401] InferenceWorker_p0-w0: resuming experience collection (137550 times) [2024-06-24 02:34:21,054][15401] Updated weights for policy 0, policy_version 567110 (0.0038) [2024-06-24 02:34:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42874.3, 300 sec: 42598.1). Total num frames: 9291628544. Throughput: 0: 42885.7. Samples: 9291695400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:23,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 02:34:25,223][15401] Updated weights for policy 0, policy_version 567120 (0.0045) [2024-06-24 02:34:28,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42047.8, 300 sec: 42653.0). Total num frames: 9291825152. Throughput: 0: 42690.0. Samples: 9291954260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:28,396][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 02:34:28,644][15401] Updated weights for policy 0, policy_version 567130 (0.0028) [2024-06-24 02:34:32,902][15401] Updated weights for policy 0, policy_version 567140 (0.0032) [2024-06-24 02:34:33,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 9292021760. Throughput: 0: 42864.8. Samples: 9292215740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:33,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 02:34:36,248][15401] Updated weights for policy 0, policy_version 567150 (0.0029) [2024-06-24 02:34:38,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 9292267520. Throughput: 0: 42819.5. Samples: 9292338700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 02:34:40,401][15401] Updated weights for policy 0, policy_version 567160 (0.0035) [2024-06-24 02:34:43,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9292464128. Throughput: 0: 42857.7. Samples: 9292594460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 02:34:44,148][15401] Updated weights for policy 0, policy_version 567170 (0.0031) [2024-06-24 02:34:48,233][15401] Updated weights for policy 0, policy_version 567180 (0.0049) [2024-06-24 02:34:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9292677120. Throughput: 0: 42760.3. Samples: 9292848840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 02:34:51,910][15401] Updated weights for policy 0, policy_version 567190 (0.0036) [2024-06-24 02:34:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9292906496. Throughput: 0: 42873.3. Samples: 9292975060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 02:34:55,766][15401] Updated weights for policy 0, policy_version 567200 (0.0037) [2024-06-24 02:34:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9293103104. Throughput: 0: 42786.2. Samples: 9293234160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:34:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 02:34:59,496][15401] Updated weights for policy 0, policy_version 567210 (0.0027) [2024-06-24 02:35:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9293316096. Throughput: 0: 42639.1. Samples: 9293492140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 02:35:03,530][15401] Updated weights for policy 0, policy_version 567220 (0.0035) [2024-06-24 02:35:07,019][15401] Updated weights for policy 0, policy_version 567230 (0.0033) [2024-06-24 02:35:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9293561856. Throughput: 0: 42681.7. Samples: 9293615980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:08,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-24 02:35:11,094][15401] Updated weights for policy 0, policy_version 567240 (0.0042) [2024-06-24 02:35:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9293725696. Throughput: 0: 42651.4. Samples: 9293873300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:13,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 02:35:14,583][15401] Updated weights for policy 0, policy_version 567250 (0.0031) [2024-06-24 02:35:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 9293955072. Throughput: 0: 42540.9. Samples: 9294129980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:18,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 02:35:18,820][15401] Updated weights for policy 0, policy_version 567260 (0.0029) [2024-06-24 02:35:22,314][15401] Updated weights for policy 0, policy_version 567270 (0.0029) [2024-06-24 02:35:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 9294200832. Throughput: 0: 42733.5. Samples: 9294261700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 02:35:26,414][15401] Updated weights for policy 0, policy_version 567280 (0.0048) [2024-06-24 02:35:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42329.9, 300 sec: 42542.9). Total num frames: 9294364672. Throughput: 0: 42710.7. Samples: 9294516440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 02:35:29,934][15401] Updated weights for policy 0, policy_version 567290 (0.0027) [2024-06-24 02:35:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 9294610432. Throughput: 0: 42749.0. Samples: 9294772540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 02:35:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 02:35:33,974][15401] Updated weights for policy 0, policy_version 567300 (0.0032) [2024-06-24 02:35:37,624][15401] Updated weights for policy 0, policy_version 567310 (0.0033) [2024-06-24 02:35:38,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9294839808. Throughput: 0: 42882.3. Samples: 9294904760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:35:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 02:35:41,551][15401] Updated weights for policy 0, policy_version 567320 (0.0041) [2024-06-24 02:35:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9295020032. Throughput: 0: 42807.6. Samples: 9295160500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:35:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 02:35:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567323_9295020032.pth... [2024-06-24 02:35:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000566699_9284796416.pth [2024-06-24 02:35:45,368][15401] Updated weights for policy 0, policy_version 567330 (0.0026) [2024-06-24 02:35:46,469][15349] Signal inference workers to stop experience collection... (137600 times) [2024-06-24 02:35:46,470][15349] Signal inference workers to resume experience collection... (137600 times) [2024-06-24 02:35:46,515][15401] InferenceWorker_p0-w0: stopping experience collection (137600 times) [2024-06-24 02:35:46,515][15401] InferenceWorker_p0-w0: resuming experience collection (137600 times) [2024-06-24 02:35:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9295265792. Throughput: 0: 42726.2. Samples: 9295414820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:35:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 02:35:49,272][15401] Updated weights for policy 0, policy_version 567340 (0.0030) [2024-06-24 02:35:53,290][15401] Updated weights for policy 0, policy_version 567350 (0.0036) [2024-06-24 02:35:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 9295462400. Throughput: 0: 42894.7. Samples: 9295546240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:35:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 02:35:57,211][15401] Updated weights for policy 0, policy_version 567360 (0.0034) [2024-06-24 02:35:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9295659008. Throughput: 0: 42595.5. Samples: 9295790100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:35:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 02:36:00,982][15401] Updated weights for policy 0, policy_version 567370 (0.0040) [2024-06-24 02:36:03,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9295888384. Throughput: 0: 42436.1. Samples: 9296039600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:03,398][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 02:36:04,836][15401] Updated weights for policy 0, policy_version 567380 (0.0031) [2024-06-24 02:36:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9296084992. Throughput: 0: 42418.9. Samples: 9296170560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 02:36:08,686][15401] Updated weights for policy 0, policy_version 567390 (0.0026) [2024-06-24 02:36:12,395][15401] Updated weights for policy 0, policy_version 567400 (0.0028) [2024-06-24 02:36:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9296314368. Throughput: 0: 42347.9. Samples: 9296422100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 02:36:16,468][15401] Updated weights for policy 0, policy_version 567410 (0.0026) [2024-06-24 02:36:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9296510976. Throughput: 0: 42352.4. Samples: 9296678400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 02:36:19,883][15401] Updated weights for policy 0, policy_version 567420 (0.0047) [2024-06-24 02:36:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9296707584. Throughput: 0: 42233.8. Samples: 9296805280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 02:36:24,171][15401] Updated weights for policy 0, policy_version 567430 (0.0032) [2024-06-24 02:36:27,473][15401] Updated weights for policy 0, policy_version 567440 (0.0047) [2024-06-24 02:36:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 9296953344. Throughput: 0: 42278.8. Samples: 9297063040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 02:36:32,028][15401] Updated weights for policy 0, policy_version 567450 (0.0031) [2024-06-24 02:36:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9297149952. Throughput: 0: 42304.8. Samples: 9297318540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 02:36:35,495][15401] Updated weights for policy 0, policy_version 567460 (0.0051) [2024-06-24 02:36:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 9297346560. Throughput: 0: 42099.7. Samples: 9297440720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:38,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 02:36:39,843][15401] Updated weights for policy 0, policy_version 567470 (0.0035) [2024-06-24 02:36:43,119][15401] Updated weights for policy 0, policy_version 567480 (0.0033) [2024-06-24 02:36:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9297592320. Throughput: 0: 42434.3. Samples: 9297699640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 02:36:47,566][15401] Updated weights for policy 0, policy_version 567490 (0.0047) [2024-06-24 02:36:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9297805312. Throughput: 0: 42342.1. Samples: 9297945000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:36:50,825][15401] Updated weights for policy 0, policy_version 567500 (0.0030) [2024-06-24 02:36:53,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42320.9, 300 sec: 42653.3). Total num frames: 9298001920. Throughput: 0: 42324.3. Samples: 9298075420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:53,397][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 02:36:54,662][15349] Signal inference workers to stop experience collection... (137650 times) [2024-06-24 02:36:54,712][15401] InferenceWorker_p0-w0: stopping experience collection (137650 times) [2024-06-24 02:36:54,776][15349] Signal inference workers to resume experience collection... (137650 times) [2024-06-24 02:36:54,776][15401] InferenceWorker_p0-w0: resuming experience collection (137650 times) [2024-06-24 02:36:54,916][15401] Updated weights for policy 0, policy_version 567510 (0.0027) [2024-06-24 02:36:58,386][15401] Updated weights for policy 0, policy_version 567520 (0.0045) [2024-06-24 02:36:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9298247680. Throughput: 0: 42590.7. Samples: 9298338680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:36:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 02:37:02,586][15401] Updated weights for policy 0, policy_version 567530 (0.0050) [2024-06-24 02:37:03,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9298411520. Throughput: 0: 42508.8. Samples: 9298591300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:37:03,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 02:37:06,138][15401] Updated weights for policy 0, policy_version 567540 (0.0038) [2024-06-24 02:37:08,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 9298608128. Throughput: 0: 42320.7. Samples: 9298709720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 02:37:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 02:37:10,232][15401] Updated weights for policy 0, policy_version 567550 (0.0034) [2024-06-24 02:37:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9298853888. Throughput: 0: 42448.8. Samples: 9298973240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 02:37:14,119][15401] Updated weights for policy 0, policy_version 567560 (0.0029) [2024-06-24 02:37:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 9299034112. Throughput: 0: 42480.1. Samples: 9299230140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 02:37:18,574][15401] Updated weights for policy 0, policy_version 567570 (0.0027) [2024-06-24 02:37:21,682][15401] Updated weights for policy 0, policy_version 567580 (0.0042) [2024-06-24 02:37:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 9299263488. Throughput: 0: 42420.0. Samples: 9299349620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 02:37:25,947][15401] Updated weights for policy 0, policy_version 567590 (0.0047) [2024-06-24 02:37:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9299492864. Throughput: 0: 42425.3. Samples: 9299608780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 02:37:29,275][15401] Updated weights for policy 0, policy_version 567600 (0.0032) [2024-06-24 02:37:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9299705856. Throughput: 0: 42772.2. Samples: 9299869740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:33,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 02:37:33,488][15401] Updated weights for policy 0, policy_version 567610 (0.0032) [2024-06-24 02:37:36,926][15401] Updated weights for policy 0, policy_version 567620 (0.0048) [2024-06-24 02:37:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 9299902464. Throughput: 0: 42689.3. Samples: 9299996160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:38,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 02:37:41,364][15401] Updated weights for policy 0, policy_version 567630 (0.0035) [2024-06-24 02:37:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9300131840. Throughput: 0: 42428.9. Samples: 9300247980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 02:37:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567635_9300131840.pth... [2024-06-24 02:37:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567013_9289940992.pth [2024-06-24 02:37:45,005][15401] Updated weights for policy 0, policy_version 567640 (0.0038) [2024-06-24 02:37:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 9300344832. Throughput: 0: 42409.4. Samples: 9300499720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 02:37:49,005][15401] Updated weights for policy 0, policy_version 567650 (0.0028) [2024-06-24 02:37:52,513][15401] Updated weights for policy 0, policy_version 567660 (0.0034) [2024-06-24 02:37:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42329.9, 300 sec: 42487.3). Total num frames: 9300541440. Throughput: 0: 42684.6. Samples: 9300630520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:53,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 02:37:56,526][15401] Updated weights for policy 0, policy_version 567670 (0.0029) [2024-06-24 02:37:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9300754432. Throughput: 0: 42509.3. Samples: 9300886160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:37:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 02:38:00,393][15401] Updated weights for policy 0, policy_version 567680 (0.0023) [2024-06-24 02:38:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9300983808. Throughput: 0: 42327.1. Samples: 9301134860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 02:38:04,675][15401] Updated weights for policy 0, policy_version 567690 (0.0021) [2024-06-24 02:38:07,991][15401] Updated weights for policy 0, policy_version 567700 (0.0025) [2024-06-24 02:38:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 9301196800. Throughput: 0: 42623.3. Samples: 9301267660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 02:38:12,334][15401] Updated weights for policy 0, policy_version 567710 (0.0027) [2024-06-24 02:38:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9301393408. Throughput: 0: 42555.1. Samples: 9301523760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 02:38:15,978][15401] Updated weights for policy 0, policy_version 567720 (0.0044) [2024-06-24 02:38:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 9301622784. Throughput: 0: 42304.0. Samples: 9301773420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 02:38:20,002][15401] Updated weights for policy 0, policy_version 567730 (0.0024) [2024-06-24 02:38:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9301835776. Throughput: 0: 42482.1. Samples: 9301907860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 02:38:23,488][15401] Updated weights for policy 0, policy_version 567740 (0.0036) [2024-06-24 02:38:27,462][15401] Updated weights for policy 0, policy_version 567750 (0.0033) [2024-06-24 02:38:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9302032384. Throughput: 0: 42632.5. Samples: 9302166440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:38:30,312][15349] Signal inference workers to stop experience collection... (137700 times) [2024-06-24 02:38:30,312][15349] Signal inference workers to resume experience collection... (137700 times) [2024-06-24 02:38:30,349][15401] InferenceWorker_p0-w0: stopping experience collection (137700 times) [2024-06-24 02:38:30,349][15401] InferenceWorker_p0-w0: resuming experience collection (137700 times) [2024-06-24 02:38:31,133][15401] Updated weights for policy 0, policy_version 567760 (0.0029) [2024-06-24 02:38:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 9302261760. Throughput: 0: 42751.6. Samples: 9302423540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 02:38:35,109][15401] Updated weights for policy 0, policy_version 567770 (0.0025) [2024-06-24 02:38:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 9302474752. Throughput: 0: 42836.8. Samples: 9302558180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 02:38:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 02:38:38,661][15401] Updated weights for policy 0, policy_version 567780 (0.0032) [2024-06-24 02:38:42,419][15401] Updated weights for policy 0, policy_version 567790 (0.0026) [2024-06-24 02:38:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 9302687744. Throughput: 0: 42935.9. Samples: 9302818380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:38:43,393][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 02:38:46,381][15401] Updated weights for policy 0, policy_version 567800 (0.0033) [2024-06-24 02:38:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9302917120. Throughput: 0: 43046.3. Samples: 9303071940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:38:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 02:38:50,370][15401] Updated weights for policy 0, policy_version 567810 (0.0022) [2024-06-24 02:38:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9303130112. Throughput: 0: 43170.1. Samples: 9303210320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:38:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 02:38:53,984][15401] Updated weights for policy 0, policy_version 567820 (0.0032) [2024-06-24 02:38:57,881][15401] Updated weights for policy 0, policy_version 567830 (0.0032) [2024-06-24 02:38:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9303343104. Throughput: 0: 43177.7. Samples: 9303466760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:38:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 02:39:01,651][15401] Updated weights for policy 0, policy_version 567840 (0.0031) [2024-06-24 02:39:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9303572480. Throughput: 0: 43135.5. Samples: 9303714520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:03,396][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 02:39:05,480][15401] Updated weights for policy 0, policy_version 567850 (0.0039) [2024-06-24 02:39:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9303752704. Throughput: 0: 43101.2. Samples: 9303847420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 02:39:09,356][15401] Updated weights for policy 0, policy_version 567860 (0.0036) [2024-06-24 02:39:13,182][15401] Updated weights for policy 0, policy_version 567870 (0.0033) [2024-06-24 02:39:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9303982080. Throughput: 0: 42967.9. Samples: 9304100000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 02:39:17,183][15401] Updated weights for policy 0, policy_version 567880 (0.0040) [2024-06-24 02:39:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 9304211456. Throughput: 0: 42965.4. Samples: 9304356980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 02:39:20,712][15401] Updated weights for policy 0, policy_version 567890 (0.0031) [2024-06-24 02:39:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 9304408064. Throughput: 0: 43011.3. Samples: 9304493680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:23,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 02:39:24,722][15401] Updated weights for policy 0, policy_version 567900 (0.0027) [2024-06-24 02:39:28,379][15401] Updated weights for policy 0, policy_version 567910 (0.0030) [2024-06-24 02:39:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 9304637440. Throughput: 0: 42957.5. Samples: 9304751360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:39:32,178][15401] Updated weights for policy 0, policy_version 567920 (0.0030) [2024-06-24 02:39:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9304850432. Throughput: 0: 42880.8. Samples: 9305001580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 02:39:35,953][15401] Updated weights for policy 0, policy_version 567930 (0.0028) [2024-06-24 02:39:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9305047040. Throughput: 0: 42741.5. Samples: 9305133680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:38,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 02:39:40,100][15401] Updated weights for policy 0, policy_version 567940 (0.0029) [2024-06-24 02:39:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 9305276416. Throughput: 0: 42800.9. Samples: 9305392800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:43,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 02:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567950_9305292800.pth... [2024-06-24 02:39:43,415][15401] Updated weights for policy 0, policy_version 567950 (0.0037) [2024-06-24 02:39:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567323_9295020032.pth [2024-06-24 02:39:47,665][15401] Updated weights for policy 0, policy_version 567960 (0.0041) [2024-06-24 02:39:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9305505792. Throughput: 0: 42995.1. Samples: 9305649300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:48,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 02:39:50,959][15401] Updated weights for policy 0, policy_version 567970 (0.0036) [2024-06-24 02:39:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9305702400. Throughput: 0: 42857.9. Samples: 9305776020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 02:39:55,082][15401] Updated weights for policy 0, policy_version 567980 (0.0043) [2024-06-24 02:39:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9305931776. Throughput: 0: 43081.5. Samples: 9306038660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:39:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 02:39:58,936][15401] Updated weights for policy 0, policy_version 567990 (0.0039) [2024-06-24 02:40:00,214][15349] Signal inference workers to stop experience collection... (137750 times) [2024-06-24 02:40:00,243][15401] InferenceWorker_p0-w0: stopping experience collection (137750 times) [2024-06-24 02:40:00,327][15349] Signal inference workers to resume experience collection... (137750 times) [2024-06-24 02:40:00,327][15401] InferenceWorker_p0-w0: resuming experience collection (137750 times) [2024-06-24 02:40:02,464][15401] Updated weights for policy 0, policy_version 568000 (0.0024) [2024-06-24 02:40:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9306144768. Throughput: 0: 43170.6. Samples: 9306299660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:40:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 02:40:06,423][15401] Updated weights for policy 0, policy_version 568010 (0.0031) [2024-06-24 02:40:08,392][15132] Fps is (10 sec: 39311.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9306324992. Throughput: 0: 43009.1. Samples: 9306429200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 02:40:08,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 02:40:10,302][15401] Updated weights for policy 0, policy_version 568020 (0.0033) [2024-06-24 02:40:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 9306570752. Throughput: 0: 42934.1. Samples: 9306683500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:13,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 02:40:14,171][15401] Updated weights for policy 0, policy_version 568030 (0.0044) [2024-06-24 02:40:17,890][15401] Updated weights for policy 0, policy_version 568040 (0.0029) [2024-06-24 02:40:18,389][15132] Fps is (10 sec: 47525.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9306800128. Throughput: 0: 43148.6. Samples: 9306943260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 02:40:21,837][15401] Updated weights for policy 0, policy_version 568050 (0.0045) [2024-06-24 02:40:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9306980352. Throughput: 0: 43103.5. Samples: 9307073340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:23,391][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 02:40:25,439][15401] Updated weights for policy 0, policy_version 568060 (0.0035) [2024-06-24 02:40:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9307209728. Throughput: 0: 43047.6. Samples: 9307329940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 02:40:29,353][15401] Updated weights for policy 0, policy_version 568070 (0.0041) [2024-06-24 02:40:32,993][15401] Updated weights for policy 0, policy_version 568080 (0.0033) [2024-06-24 02:40:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9307439104. Throughput: 0: 43082.3. Samples: 9307588000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 02:40:37,118][15401] Updated weights for policy 0, policy_version 568090 (0.0025) [2024-06-24 02:40:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9307619328. Throughput: 0: 43264.3. Samples: 9307722920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 02:40:40,446][15401] Updated weights for policy 0, policy_version 568100 (0.0042) [2024-06-24 02:40:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9307848704. Throughput: 0: 42894.6. Samples: 9307968920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:43,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 02:40:44,816][15401] Updated weights for policy 0, policy_version 568110 (0.0031) [2024-06-24 02:40:48,197][15401] Updated weights for policy 0, policy_version 568120 (0.0031) [2024-06-24 02:40:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9308078080. Throughput: 0: 42896.8. Samples: 9308230020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:48,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 02:40:52,446][15401] Updated weights for policy 0, policy_version 568130 (0.0036) [2024-06-24 02:40:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9308258304. Throughput: 0: 42895.2. Samples: 9308359380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 02:40:55,937][15401] Updated weights for policy 0, policy_version 568140 (0.0037) [2024-06-24 02:40:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9308504064. Throughput: 0: 42879.9. Samples: 9308613000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:40:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 02:41:00,077][15401] Updated weights for policy 0, policy_version 568150 (0.0026) [2024-06-24 02:41:03,326][15401] Updated weights for policy 0, policy_version 568160 (0.0042) [2024-06-24 02:41:03,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9308733440. Throughput: 0: 42927.5. Samples: 9308875000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 02:41:07,901][15401] Updated weights for policy 0, policy_version 568170 (0.0042) [2024-06-24 02:41:08,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 9308897280. Throughput: 0: 42875.6. Samples: 9309002740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 02:41:11,031][15401] Updated weights for policy 0, policy_version 568180 (0.0023) [2024-06-24 02:41:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 9309143040. Throughput: 0: 42738.6. Samples: 9309253180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 02:41:15,555][15401] Updated weights for policy 0, policy_version 568190 (0.0034) [2024-06-24 02:41:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9309356032. Throughput: 0: 42691.0. Samples: 9309509100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 02:41:18,686][15401] Updated weights for policy 0, policy_version 568200 (0.0026) [2024-06-24 02:41:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9309536256. Throughput: 0: 42575.1. Samples: 9309638800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 02:41:23,536][15401] Updated weights for policy 0, policy_version 568210 (0.0033) [2024-06-24 02:41:25,456][15349] Signal inference workers to stop experience collection... (137800 times) [2024-06-24 02:41:25,457][15349] Signal inference workers to resume experience collection... (137800 times) [2024-06-24 02:41:25,471][15401] InferenceWorker_p0-w0: stopping experience collection (137800 times) [2024-06-24 02:41:25,471][15401] InferenceWorker_p0-w0: resuming experience collection (137800 times) [2024-06-24 02:41:26,225][15401] Updated weights for policy 0, policy_version 568220 (0.0039) [2024-06-24 02:41:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9309798400. Throughput: 0: 42677.7. Samples: 9309889420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 02:41:31,012][15401] Updated weights for policy 0, policy_version 568230 (0.0037) [2024-06-24 02:41:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9309978624. Throughput: 0: 42798.8. Samples: 9310155960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 02:41:34,202][15401] Updated weights for policy 0, policy_version 568240 (0.0045) [2024-06-24 02:41:38,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9310175232. Throughput: 0: 42651.1. Samples: 9310278680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 02:41:38,399][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 02:41:38,695][15401] Updated weights for policy 0, policy_version 568250 (0.0027) [2024-06-24 02:41:41,841][15401] Updated weights for policy 0, policy_version 568260 (0.0033) [2024-06-24 02:41:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 9310437376. Throughput: 0: 42602.7. Samples: 9310530120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:41:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 02:41:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568264_9310437376.pth... [2024-06-24 02:41:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567635_9300131840.pth [2024-06-24 02:41:46,337][15401] Updated weights for policy 0, policy_version 568270 (0.0031) [2024-06-24 02:41:48,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42320.9, 300 sec: 42765.0). Total num frames: 9310617600. Throughput: 0: 42673.4. Samples: 9310795580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:41:48,396][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 02:41:49,472][15401] Updated weights for policy 0, policy_version 568280 (0.0034) [2024-06-24 02:41:53,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9310830592. Throughput: 0: 42594.3. Samples: 9310919480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:41:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 02:41:53,962][15401] Updated weights for policy 0, policy_version 568290 (0.0038) [2024-06-24 02:41:57,274][15401] Updated weights for policy 0, policy_version 568300 (0.0031) [2024-06-24 02:41:58,389][15132] Fps is (10 sec: 47544.2, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 9311092736. Throughput: 0: 42850.3. Samples: 9311181440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:41:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:42:01,415][15401] Updated weights for policy 0, policy_version 568310 (0.0026) [2024-06-24 02:42:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42323.6, 300 sec: 42931.3). Total num frames: 9311272960. Throughput: 0: 43071.0. Samples: 9311447400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:03,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 02:42:04,626][15401] Updated weights for policy 0, policy_version 568320 (0.0040) [2024-06-24 02:42:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9311485952. Throughput: 0: 42926.3. Samples: 9311570480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 02:42:08,729][15401] Updated weights for policy 0, policy_version 568330 (0.0036) [2024-06-24 02:42:12,126][15401] Updated weights for policy 0, policy_version 568340 (0.0027) [2024-06-24 02:42:13,389][15132] Fps is (10 sec: 45886.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9311731712. Throughput: 0: 43214.0. Samples: 9311834040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 02:42:16,328][15401] Updated weights for policy 0, policy_version 568350 (0.0033) [2024-06-24 02:42:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9311928320. Throughput: 0: 43164.7. Samples: 9312098380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 02:42:19,611][15401] Updated weights for policy 0, policy_version 568360 (0.0028) [2024-06-24 02:42:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9312141312. Throughput: 0: 43245.2. Samples: 9312224720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:23,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-24 02:42:23,995][15401] Updated weights for policy 0, policy_version 568370 (0.0039) [2024-06-24 02:42:27,161][15401] Updated weights for policy 0, policy_version 568380 (0.0031) [2024-06-24 02:42:28,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9312370688. Throughput: 0: 43317.5. Samples: 9312479400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 02:42:31,905][15401] Updated weights for policy 0, policy_version 568390 (0.0047) [2024-06-24 02:42:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9312550912. Throughput: 0: 43272.9. Samples: 9312742580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 02:42:34,819][15401] Updated weights for policy 0, policy_version 568400 (0.0038) [2024-06-24 02:42:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9312780288. Throughput: 0: 43186.2. Samples: 9312862860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:38,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 02:42:39,833][15401] Updated weights for policy 0, policy_version 568410 (0.0050) [2024-06-24 02:42:41,636][15349] Signal inference workers to stop experience collection... (137850 times) [2024-06-24 02:42:41,636][15349] Signal inference workers to resume experience collection... (137850 times) [2024-06-24 02:42:41,657][15401] InferenceWorker_p0-w0: stopping experience collection (137850 times) [2024-06-24 02:42:41,657][15401] InferenceWorker_p0-w0: resuming experience collection (137850 times) [2024-06-24 02:42:42,586][15401] Updated weights for policy 0, policy_version 568420 (0.0036) [2024-06-24 02:42:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9313009664. Throughput: 0: 43047.1. Samples: 9313118560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 02:42:47,347][15401] Updated weights for policy 0, policy_version 568430 (0.0036) [2024-06-24 02:42:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43149.2, 300 sec: 42931.6). Total num frames: 9313206272. Throughput: 0: 42967.3. Samples: 9313380820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 02:42:50,471][15401] Updated weights for policy 0, policy_version 568440 (0.0037) [2024-06-24 02:42:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9313419264. Throughput: 0: 42821.3. Samples: 9313497440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 02:42:54,879][15401] Updated weights for policy 0, policy_version 568450 (0.0032) [2024-06-24 02:42:58,189][15401] Updated weights for policy 0, policy_version 568460 (0.0035) [2024-06-24 02:42:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9313648640. Throughput: 0: 42891.1. Samples: 9313764140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:42:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 02:43:02,301][15401] Updated weights for policy 0, policy_version 568470 (0.0033) [2024-06-24 02:43:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9313828864. Throughput: 0: 42715.3. Samples: 9314020560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:43:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 02:43:05,830][15401] Updated weights for policy 0, policy_version 568480 (0.0031) [2024-06-24 02:43:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9314058240. Throughput: 0: 42553.9. Samples: 9314139640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:43:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 02:43:09,955][15401] Updated weights for policy 0, policy_version 568490 (0.0029) [2024-06-24 02:43:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9314287616. Throughput: 0: 42762.8. Samples: 9314403720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 02:43:13,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 02:43:13,497][15401] Updated weights for policy 0, policy_version 568500 (0.0025) [2024-06-24 02:43:17,483][15401] Updated weights for policy 0, policy_version 568510 (0.0032) [2024-06-24 02:43:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9314484224. Throughput: 0: 42625.2. Samples: 9314660720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:18,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 02:43:21,112][15401] Updated weights for policy 0, policy_version 568520 (0.0037) [2024-06-24 02:43:23,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 9314713600. Throughput: 0: 42702.7. Samples: 9314784480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 02:43:25,246][15401] Updated weights for policy 0, policy_version 568530 (0.0035) [2024-06-24 02:43:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9314926592. Throughput: 0: 42844.4. Samples: 9315046560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:28,391][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 02:43:28,761][15401] Updated weights for policy 0, policy_version 568540 (0.0033) [2024-06-24 02:43:32,754][15401] Updated weights for policy 0, policy_version 568550 (0.0027) [2024-06-24 02:43:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9315123200. Throughput: 0: 42724.2. Samples: 9315303420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 02:43:36,540][15401] Updated weights for policy 0, policy_version 568560 (0.0037) [2024-06-24 02:43:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 9315368960. Throughput: 0: 42983.5. Samples: 9315431700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 02:43:40,308][15401] Updated weights for policy 0, policy_version 568570 (0.0021) [2024-06-24 02:43:43,391][15132] Fps is (10 sec: 44233.2, 60 sec: 42597.7, 300 sec: 42875.9). Total num frames: 9315565568. Throughput: 0: 42883.0. Samples: 9315693920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:43,391][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 02:43:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568577_9315565568.pth... [2024-06-24 02:43:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000567950_9305292800.pth [2024-06-24 02:43:44,291][15401] Updated weights for policy 0, policy_version 568580 (0.0032) [2024-06-24 02:43:48,242][15401] Updated weights for policy 0, policy_version 568590 (0.0036) [2024-06-24 02:43:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9315778560. Throughput: 0: 42853.2. Samples: 9315948960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 02:43:51,939][15401] Updated weights for policy 0, policy_version 568600 (0.0028) [2024-06-24 02:43:53,390][15132] Fps is (10 sec: 44240.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9316007936. Throughput: 0: 43022.1. Samples: 9316075640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 02:43:55,678][15401] Updated weights for policy 0, policy_version 568610 (0.0028) [2024-06-24 02:43:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9316204544. Throughput: 0: 42864.4. Samples: 9316332620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:43:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 02:43:59,438][15401] Updated weights for policy 0, policy_version 568620 (0.0034) [2024-06-24 02:44:02,314][15349] Signal inference workers to stop experience collection... (137900 times) [2024-06-24 02:44:02,314][15349] Signal inference workers to resume experience collection... (137900 times) [2024-06-24 02:44:02,337][15401] InferenceWorker_p0-w0: stopping experience collection (137900 times) [2024-06-24 02:44:02,337][15401] InferenceWorker_p0-w0: resuming experience collection (137900 times) [2024-06-24 02:44:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9316417536. Throughput: 0: 42985.4. Samples: 9316595060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 02:44:03,582][15401] Updated weights for policy 0, policy_version 568630 (0.0037) [2024-06-24 02:44:06,957][15401] Updated weights for policy 0, policy_version 568640 (0.0047) [2024-06-24 02:44:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9316663296. Throughput: 0: 43149.8. Samples: 9316726220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 02:44:10,979][15401] Updated weights for policy 0, policy_version 568650 (0.0043) [2024-06-24 02:44:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9316859904. Throughput: 0: 42958.6. Samples: 9316979700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 02:44:14,420][15401] Updated weights for policy 0, policy_version 568660 (0.0043) [2024-06-24 02:44:18,373][15401] Updated weights for policy 0, policy_version 568670 (0.0038) [2024-06-24 02:44:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9317089280. Throughput: 0: 43029.1. Samples: 9317239720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 02:44:22,142][15401] Updated weights for policy 0, policy_version 568680 (0.0032) [2024-06-24 02:44:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9317302272. Throughput: 0: 43043.6. Samples: 9317368660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:23,399][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 02:44:25,963][15401] Updated weights for policy 0, policy_version 568690 (0.0031) [2024-06-24 02:44:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9317498880. Throughput: 0: 42946.2. Samples: 9317626460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 02:44:29,952][15401] Updated weights for policy 0, policy_version 568700 (0.0040) [2024-06-24 02:44:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9317728256. Throughput: 0: 42984.0. Samples: 9317883240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 02:44:33,483][15401] Updated weights for policy 0, policy_version 568710 (0.0038) [2024-06-24 02:44:37,882][15401] Updated weights for policy 0, policy_version 568720 (0.0030) [2024-06-24 02:44:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9317941248. Throughput: 0: 42919.6. Samples: 9318007020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 02:44:41,494][15401] Updated weights for policy 0, policy_version 568730 (0.0039) [2024-06-24 02:44:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42872.1, 300 sec: 42820.5). Total num frames: 9318137856. Throughput: 0: 42937.6. Samples: 9318264820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-24 02:44:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 02:44:45,446][15401] Updated weights for policy 0, policy_version 568740 (0.0035) [2024-06-24 02:44:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9318350848. Throughput: 0: 42776.0. Samples: 9318519980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:44:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 02:44:49,097][15401] Updated weights for policy 0, policy_version 568750 (0.0039) [2024-06-24 02:44:53,147][15401] Updated weights for policy 0, policy_version 568760 (0.0035) [2024-06-24 02:44:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9318563840. Throughput: 0: 42627.2. Samples: 9318644440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:44:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 02:44:57,035][15401] Updated weights for policy 0, policy_version 568770 (0.0035) [2024-06-24 02:44:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9318776832. Throughput: 0: 42771.6. Samples: 9318904420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:44:58,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 02:45:00,821][15401] Updated weights for policy 0, policy_version 568780 (0.0048) [2024-06-24 02:45:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 9319006208. Throughput: 0: 42630.1. Samples: 9319158080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:45:04,537][15401] Updated weights for policy 0, policy_version 568790 (0.0023) [2024-06-24 02:45:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 9319202816. Throughput: 0: 42765.3. Samples: 9319293100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 02:45:08,591][15401] Updated weights for policy 0, policy_version 568800 (0.0044) [2024-06-24 02:45:12,002][15401] Updated weights for policy 0, policy_version 568810 (0.0029) [2024-06-24 02:45:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9319432192. Throughput: 0: 42600.5. Samples: 9319543480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 02:45:16,349][15401] Updated weights for policy 0, policy_version 568820 (0.0029) [2024-06-24 02:45:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9319628800. Throughput: 0: 42684.1. Samples: 9319804020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 02:45:19,575][15401] Updated weights for policy 0, policy_version 568830 (0.0038) [2024-06-24 02:45:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9319858176. Throughput: 0: 42755.1. Samples: 9319931000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 02:45:24,055][15401] Updated weights for policy 0, policy_version 568840 (0.0036) [2024-06-24 02:45:27,101][15401] Updated weights for policy 0, policy_version 568850 (0.0027) [2024-06-24 02:45:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9320071168. Throughput: 0: 42689.7. Samples: 9320185860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 02:45:31,678][15401] Updated weights for policy 0, policy_version 568860 (0.0028) [2024-06-24 02:45:32,297][15349] Signal inference workers to stop experience collection... (137950 times) [2024-06-24 02:45:32,299][15349] Signal inference workers to resume experience collection... (137950 times) [2024-06-24 02:45:32,321][15401] InferenceWorker_p0-w0: stopping experience collection (137950 times) [2024-06-24 02:45:32,322][15401] InferenceWorker_p0-w0: resuming experience collection (137950 times) [2024-06-24 02:45:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9320284160. Throughput: 0: 42820.0. Samples: 9320446880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 02:45:34,603][15401] Updated weights for policy 0, policy_version 568870 (0.0030) [2024-06-24 02:45:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9320497152. Throughput: 0: 42910.2. Samples: 9320575400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 02:45:39,873][15401] Updated weights for policy 0, policy_version 568880 (0.0036) [2024-06-24 02:45:42,315][15401] Updated weights for policy 0, policy_version 568890 (0.0027) [2024-06-24 02:45:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9320710144. Throughput: 0: 42668.0. Samples: 9320824480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 02:45:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568892_9320726528.pth... [2024-06-24 02:45:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568264_9310437376.pth [2024-06-24 02:45:47,580][15401] Updated weights for policy 0, policy_version 568900 (0.0042) [2024-06-24 02:45:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9320906752. Throughput: 0: 42899.6. Samples: 9321088560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 02:45:49,913][15401] Updated weights for policy 0, policy_version 568910 (0.0035) [2024-06-24 02:45:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9321136128. Throughput: 0: 42615.2. Samples: 9321210780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 02:45:55,121][15401] Updated weights for policy 0, policy_version 568920 (0.0040) [2024-06-24 02:45:57,873][15401] Updated weights for policy 0, policy_version 568930 (0.0030) [2024-06-24 02:45:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9321365504. Throughput: 0: 42672.8. Samples: 9321463760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:45:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 02:46:02,774][15401] Updated weights for policy 0, policy_version 568940 (0.0030) [2024-06-24 02:46:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9321545728. Throughput: 0: 42860.9. Samples: 9321732760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:46:03,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 02:46:05,577][15401] Updated weights for policy 0, policy_version 568950 (0.0039) [2024-06-24 02:46:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9321775104. Throughput: 0: 42678.7. Samples: 9321851540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:46:08,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 02:46:10,386][15401] Updated weights for policy 0, policy_version 568960 (0.0030) [2024-06-24 02:46:13,090][15401] Updated weights for policy 0, policy_version 568970 (0.0043) [2024-06-24 02:46:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9322004480. Throughput: 0: 42683.3. Samples: 9322106600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 02:46:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 02:46:18,097][15401] Updated weights for policy 0, policy_version 568980 (0.0041) [2024-06-24 02:46:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9322168320. Throughput: 0: 42729.8. Samples: 9322369720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 02:46:21,033][15401] Updated weights for policy 0, policy_version 568990 (0.0041) [2024-06-24 02:46:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9322414080. Throughput: 0: 42456.0. Samples: 9322485920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:23,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-24 02:46:25,786][15401] Updated weights for policy 0, policy_version 569000 (0.0038) [2024-06-24 02:46:28,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9322643456. Throughput: 0: 42613.2. Samples: 9322742080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 02:46:28,612][15401] Updated weights for policy 0, policy_version 569010 (0.0032) [2024-06-24 02:46:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 9322807296. Throughput: 0: 42625.0. Samples: 9323006680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 02:46:33,477][15401] Updated weights for policy 0, policy_version 569020 (0.0027) [2024-06-24 02:46:36,370][15401] Updated weights for policy 0, policy_version 569030 (0.0041) [2024-06-24 02:46:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9323053056. Throughput: 0: 42528.5. Samples: 9323124560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 02:46:41,240][15401] Updated weights for policy 0, policy_version 569040 (0.0029) [2024-06-24 02:46:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 9323266048. Throughput: 0: 42808.9. Samples: 9323390160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 02:46:44,441][15401] Updated weights for policy 0, policy_version 569050 (0.0034) [2024-06-24 02:46:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9323446272. Throughput: 0: 42491.1. Samples: 9323644860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 02:46:48,781][15401] Updated weights for policy 0, policy_version 569060 (0.0037) [2024-06-24 02:46:50,030][15349] Signal inference workers to stop experience collection... (138000 times) [2024-06-24 02:46:50,031][15349] Signal inference workers to resume experience collection... (138000 times) [2024-06-24 02:46:50,076][15401] InferenceWorker_p0-w0: stopping experience collection (138000 times) [2024-06-24 02:46:50,076][15401] InferenceWorker_p0-w0: resuming experience collection (138000 times) [2024-06-24 02:46:51,936][15401] Updated weights for policy 0, policy_version 569070 (0.0036) [2024-06-24 02:46:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9323692032. Throughput: 0: 42514.1. Samples: 9323764680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 02:46:56,569][15401] Updated weights for policy 0, policy_version 569080 (0.0026) [2024-06-24 02:46:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 9323905024. Throughput: 0: 42722.7. Samples: 9324029120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:46:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 02:46:59,569][15401] Updated weights for policy 0, policy_version 569090 (0.0033) [2024-06-24 02:47:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9324101632. Throughput: 0: 42489.8. Samples: 9324281860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:03,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 02:47:04,007][15401] Updated weights for policy 0, policy_version 569100 (0.0044) [2024-06-24 02:47:07,038][15401] Updated weights for policy 0, policy_version 569110 (0.0035) [2024-06-24 02:47:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9324331008. Throughput: 0: 42687.9. Samples: 9324406880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 02:47:11,355][15401] Updated weights for policy 0, policy_version 569120 (0.0033) [2024-06-24 02:47:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9324527616. Throughput: 0: 42760.1. Samples: 9324666280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:13,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 02:47:15,031][15401] Updated weights for policy 0, policy_version 569130 (0.0030) [2024-06-24 02:47:18,395][15132] Fps is (10 sec: 39299.8, 60 sec: 42594.5, 300 sec: 42653.2). Total num frames: 9324724224. Throughput: 0: 42364.5. Samples: 9324913320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:18,395][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 02:47:19,389][15401] Updated weights for policy 0, policy_version 569140 (0.0036) [2024-06-24 02:47:22,991][15401] Updated weights for policy 0, policy_version 569150 (0.0029) [2024-06-24 02:47:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9324969984. Throughput: 0: 42553.7. Samples: 9325039480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 02:47:27,012][15401] Updated weights for policy 0, policy_version 569160 (0.0034) [2024-06-24 02:47:28,389][15132] Fps is (10 sec: 44261.6, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 9325166592. Throughput: 0: 42428.9. Samples: 9325299460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 02:47:30,342][15401] Updated weights for policy 0, policy_version 569170 (0.0023) [2024-06-24 02:47:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9325379584. Throughput: 0: 42606.0. Samples: 9325562140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:33,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 02:47:34,702][15401] Updated weights for policy 0, policy_version 569180 (0.0038) [2024-06-24 02:47:37,789][15401] Updated weights for policy 0, policy_version 569190 (0.0042) [2024-06-24 02:47:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9325625344. Throughput: 0: 42754.8. Samples: 9325688640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:47:42,177][15401] Updated weights for policy 0, policy_version 569200 (0.0026) [2024-06-24 02:47:43,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9325821952. Throughput: 0: 42694.5. Samples: 9325950480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:43,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:47:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569203_9325821952.pth... [2024-06-24 02:47:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568577_9315565568.pth [2024-06-24 02:47:45,547][15401] Updated weights for policy 0, policy_version 569210 (0.0039) [2024-06-24 02:47:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9326034944. Throughput: 0: 42812.5. Samples: 9326208320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-24 02:47:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 02:47:49,759][15401] Updated weights for policy 0, policy_version 569220 (0.0030) [2024-06-24 02:47:53,375][15401] Updated weights for policy 0, policy_version 569230 (0.0042) [2024-06-24 02:47:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9326264320. Throughput: 0: 42899.6. Samples: 9326337360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:47:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 02:47:57,575][15401] Updated weights for policy 0, policy_version 569240 (0.0031) [2024-06-24 02:47:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9326444544. Throughput: 0: 42836.3. Samples: 9326593920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:47:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 02:48:01,002][15401] Updated weights for policy 0, policy_version 569250 (0.0032) [2024-06-24 02:48:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 9326690304. Throughput: 0: 42935.0. Samples: 9326845160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:03,393][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 02:48:05,288][15401] Updated weights for policy 0, policy_version 569260 (0.0042) [2024-06-24 02:48:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9326903296. Throughput: 0: 43108.5. Samples: 9326979360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 02:48:08,504][15401] Updated weights for policy 0, policy_version 569270 (0.0037) [2024-06-24 02:48:12,843][15401] Updated weights for policy 0, policy_version 569280 (0.0028) [2024-06-24 02:48:13,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9327083520. Throughput: 0: 43027.9. Samples: 9327235820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:13,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 02:48:14,372][15349] Signal inference workers to stop experience collection... (138050 times) [2024-06-24 02:48:14,411][15401] InferenceWorker_p0-w0: stopping experience collection (138050 times) [2024-06-24 02:48:14,440][15349] Signal inference workers to resume experience collection... (138050 times) [2024-06-24 02:48:14,444][15401] InferenceWorker_p0-w0: resuming experience collection (138050 times) [2024-06-24 02:48:16,077][15401] Updated weights for policy 0, policy_version 569290 (0.0028) [2024-06-24 02:48:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43421.6, 300 sec: 42765.0). Total num frames: 9327329280. Throughput: 0: 42781.0. Samples: 9327487280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 02:48:20,540][15401] Updated weights for policy 0, policy_version 569300 (0.0039) [2024-06-24 02:48:23,392][15132] Fps is (10 sec: 45872.9, 60 sec: 42869.4, 300 sec: 42764.6). Total num frames: 9327542272. Throughput: 0: 42994.9. Samples: 9327623540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:23,393][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 02:48:23,813][15401] Updated weights for policy 0, policy_version 569310 (0.0034) [2024-06-24 02:48:28,261][15401] Updated weights for policy 0, policy_version 569320 (0.0041) [2024-06-24 02:48:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9327738880. Throughput: 0: 42924.9. Samples: 9327882100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:28,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 02:48:31,301][15401] Updated weights for policy 0, policy_version 569330 (0.0052) [2024-06-24 02:48:33,390][15132] Fps is (10 sec: 44246.2, 60 sec: 43417.1, 300 sec: 42764.9). Total num frames: 9327984640. Throughput: 0: 42710.8. Samples: 9328130340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:33,391][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 02:48:35,871][15401] Updated weights for policy 0, policy_version 569340 (0.0030) [2024-06-24 02:48:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.3, 300 sec: 42765.2). Total num frames: 9328181248. Throughput: 0: 42950.6. Samples: 9328270140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:38,398][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 02:48:39,307][15401] Updated weights for policy 0, policy_version 569350 (0.0044) [2024-06-24 02:48:43,390][15132] Fps is (10 sec: 39324.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 9328377856. Throughput: 0: 42893.3. Samples: 9328524120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:43,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 02:48:43,729][15401] Updated weights for policy 0, policy_version 569360 (0.0032) [2024-06-24 02:48:46,783][15401] Updated weights for policy 0, policy_version 569370 (0.0034) [2024-06-24 02:48:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9328640000. Throughput: 0: 42865.4. Samples: 9328774100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:48,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 02:48:51,266][15401] Updated weights for policy 0, policy_version 569380 (0.0028) [2024-06-24 02:48:53,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9328852992. Throughput: 0: 43159.9. Samples: 9328921560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:53,396][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 02:48:54,126][15401] Updated weights for policy 0, policy_version 569390 (0.0039) [2024-06-24 02:48:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9329033216. Throughput: 0: 43100.0. Samples: 9329175220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:48:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 02:48:58,631][15401] Updated weights for policy 0, policy_version 569400 (0.0028) [2024-06-24 02:49:01,930][15401] Updated weights for policy 0, policy_version 569410 (0.0036) [2024-06-24 02:49:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 9329295360. Throughput: 0: 42987.9. Samples: 9329421840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:49:03,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 02:49:06,797][15401] Updated weights for policy 0, policy_version 569420 (0.0038) [2024-06-24 02:49:08,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9329491968. Throughput: 0: 43208.5. Samples: 9329567900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:49:08,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 02:49:09,296][15401] Updated weights for policy 0, policy_version 569430 (0.0038) [2024-06-24 02:49:13,390][15132] Fps is (10 sec: 39330.0, 60 sec: 43419.1, 300 sec: 42709.4). Total num frames: 9329688576. Throughput: 0: 43105.6. Samples: 9329821760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:49:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:49:14,162][15401] Updated weights for policy 0, policy_version 569440 (0.0043) [2024-06-24 02:49:14,188][15349] Signal inference workers to stop experience collection... (138100 times) [2024-06-24 02:49:14,189][15349] Signal inference workers to resume experience collection... (138100 times) [2024-06-24 02:49:14,206][15401] InferenceWorker_p0-w0: stopping experience collection (138100 times) [2024-06-24 02:49:14,206][15401] InferenceWorker_p0-w0: resuming experience collection (138100 times) [2024-06-24 02:49:16,813][15401] Updated weights for policy 0, policy_version 569450 (0.0039) [2024-06-24 02:49:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9329934336. Throughput: 0: 43197.7. Samples: 9330074200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 02:49:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:49:21,675][15401] Updated weights for policy 0, policy_version 569460 (0.0038) [2024-06-24 02:49:23,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42873.6, 300 sec: 42765.0). Total num frames: 9330114560. Throughput: 0: 43121.0. Samples: 9330210580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 02:49:24,566][15401] Updated weights for policy 0, policy_version 569470 (0.0036) [2024-06-24 02:49:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 9330327552. Throughput: 0: 43021.8. Samples: 9330460100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 02:49:29,150][15401] Updated weights for policy 0, policy_version 569480 (0.0030) [2024-06-24 02:49:32,229][15401] Updated weights for policy 0, policy_version 569490 (0.0029) [2024-06-24 02:49:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43145.1, 300 sec: 42820.6). Total num frames: 9330573312. Throughput: 0: 43118.6. Samples: 9330714440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 02:49:36,629][15401] Updated weights for policy 0, policy_version 569500 (0.0031) [2024-06-24 02:49:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9330753536. Throughput: 0: 42841.0. Samples: 9330849400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 02:49:39,755][15401] Updated weights for policy 0, policy_version 569510 (0.0032) [2024-06-24 02:49:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 9330982912. Throughput: 0: 42861.9. Samples: 9331104000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:49:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569518_9330982912.pth... [2024-06-24 02:49:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000568892_9320726528.pth [2024-06-24 02:49:44,148][15401] Updated weights for policy 0, policy_version 569520 (0.0034) [2024-06-24 02:49:47,350][15401] Updated weights for policy 0, policy_version 569530 (0.0034) [2024-06-24 02:49:48,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9331228672. Throughput: 0: 42899.7. Samples: 9331352220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:48,391][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 02:49:51,749][15401] Updated weights for policy 0, policy_version 569540 (0.0039) [2024-06-24 02:49:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9331392512. Throughput: 0: 42750.1. Samples: 9331491560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 02:49:54,935][15401] Updated weights for policy 0, policy_version 569550 (0.0034) [2024-06-24 02:49:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9331638272. Throughput: 0: 42709.0. Samples: 9331743660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:49:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 02:49:59,456][15401] Updated weights for policy 0, policy_version 569560 (0.0035) [2024-06-24 02:50:02,710][15401] Updated weights for policy 0, policy_version 569570 (0.0025) [2024-06-24 02:50:03,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 9331867648. Throughput: 0: 42783.9. Samples: 9331999480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 02:50:07,546][15401] Updated weights for policy 0, policy_version 569580 (0.0034) [2024-06-24 02:50:08,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 9332031488. Throughput: 0: 42691.0. Samples: 9332131680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:08,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 02:50:10,277][15401] Updated weights for policy 0, policy_version 569590 (0.0041) [2024-06-24 02:50:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9332260864. Throughput: 0: 42866.8. Samples: 9332389100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:13,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 02:50:15,237][15401] Updated weights for policy 0, policy_version 569600 (0.0045) [2024-06-24 02:50:17,165][15349] Signal inference workers to stop experience collection... (138150 times) [2024-06-24 02:50:17,165][15349] Signal inference workers to resume experience collection... (138150 times) [2024-06-24 02:50:17,202][15401] InferenceWorker_p0-w0: stopping experience collection (138150 times) [2024-06-24 02:50:17,202][15401] InferenceWorker_p0-w0: resuming experience collection (138150 times) [2024-06-24 02:50:18,059][15401] Updated weights for policy 0, policy_version 569610 (0.0044) [2024-06-24 02:50:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9332490240. Throughput: 0: 42788.6. Samples: 9332639920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 02:50:23,124][15401] Updated weights for policy 0, policy_version 569620 (0.0036) [2024-06-24 02:50:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9332670464. Throughput: 0: 42833.3. Samples: 9332776900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 02:50:25,834][15401] Updated weights for policy 0, policy_version 569630 (0.0026) [2024-06-24 02:50:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9332899840. Throughput: 0: 42579.1. Samples: 9333020060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:28,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 02:50:30,709][15401] Updated weights for policy 0, policy_version 569640 (0.0033) [2024-06-24 02:50:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9333129216. Throughput: 0: 42819.0. Samples: 9333279080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 02:50:33,769][15401] Updated weights for policy 0, policy_version 569650 (0.0031) [2024-06-24 02:50:38,270][15401] Updated weights for policy 0, policy_version 569660 (0.0037) [2024-06-24 02:50:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9333309440. Throughput: 0: 42658.8. Samples: 9333411200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 02:50:41,544][15401] Updated weights for policy 0, policy_version 569670 (0.0048) [2024-06-24 02:50:43,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 9333555200. Throughput: 0: 42723.8. Samples: 9333666500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:43,396][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 02:50:45,784][15401] Updated weights for policy 0, policy_version 569680 (0.0042) [2024-06-24 02:50:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9333768192. Throughput: 0: 42840.5. Samples: 9333927300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 02:50:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 02:50:49,149][15401] Updated weights for policy 0, policy_version 569690 (0.0037) [2024-06-24 02:50:53,217][15401] Updated weights for policy 0, policy_version 569700 (0.0025) [2024-06-24 02:50:53,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9333964800. Throughput: 0: 42791.2. Samples: 9334057280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:50:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 02:50:56,698][15401] Updated weights for policy 0, policy_version 569710 (0.0030) [2024-06-24 02:50:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9334210560. Throughput: 0: 42705.8. Samples: 9334310860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:50:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 02:51:00,830][15401] Updated weights for policy 0, policy_version 569720 (0.0046) [2024-06-24 02:51:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9334423552. Throughput: 0: 42860.2. Samples: 9334568640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 02:51:04,173][15401] Updated weights for policy 0, policy_version 569730 (0.0047) [2024-06-24 02:51:08,348][15401] Updated weights for policy 0, policy_version 569740 (0.0026) [2024-06-24 02:51:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9334620160. Throughput: 0: 42820.8. Samples: 9334703840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 02:51:11,563][15401] Updated weights for policy 0, policy_version 569750 (0.0029) [2024-06-24 02:51:13,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 9334833152. Throughput: 0: 43089.9. Samples: 9334959380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:13,396][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 02:51:15,890][15401] Updated weights for policy 0, policy_version 569760 (0.0041) [2024-06-24 02:51:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9335062528. Throughput: 0: 43089.3. Samples: 9335218100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 02:51:19,137][15401] Updated weights for policy 0, policy_version 569770 (0.0042) [2024-06-24 02:51:23,390][15132] Fps is (10 sec: 42625.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9335259136. Throughput: 0: 43219.0. Samples: 9335356060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 02:51:23,489][15401] Updated weights for policy 0, policy_version 569780 (0.0033) [2024-06-24 02:51:26,720][15401] Updated weights for policy 0, policy_version 569790 (0.0045) [2024-06-24 02:51:28,391][15132] Fps is (10 sec: 42592.3, 60 sec: 43143.5, 300 sec: 42987.0). Total num frames: 9335488512. Throughput: 0: 43071.4. Samples: 9335604500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:28,392][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 02:51:30,983][15401] Updated weights for policy 0, policy_version 569800 (0.0031) [2024-06-24 02:51:31,330][15349] Signal inference workers to stop experience collection... (138200 times) [2024-06-24 02:51:31,330][15349] Signal inference workers to resume experience collection... (138200 times) [2024-06-24 02:51:31,373][15401] InferenceWorker_p0-w0: stopping experience collection (138200 times) [2024-06-24 02:51:31,373][15401] InferenceWorker_p0-w0: resuming experience collection (138200 times) [2024-06-24 02:51:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9335701504. Throughput: 0: 43124.4. Samples: 9335867900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 02:51:34,346][15401] Updated weights for policy 0, policy_version 569810 (0.0036) [2024-06-24 02:51:38,390][15132] Fps is (10 sec: 40965.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9335898112. Throughput: 0: 43024.7. Samples: 9335993400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 02:51:38,734][15401] Updated weights for policy 0, policy_version 569820 (0.0028) [2024-06-24 02:51:42,031][15401] Updated weights for policy 0, policy_version 569830 (0.0036) [2024-06-24 02:51:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.1, 300 sec: 42987.2). Total num frames: 9336127488. Throughput: 0: 42928.9. Samples: 9336242660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 02:51:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569833_9336143872.pth... [2024-06-24 02:51:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569203_9325821952.pth [2024-06-24 02:51:46,352][15401] Updated weights for policy 0, policy_version 569840 (0.0042) [2024-06-24 02:51:48,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9336340480. Throughput: 0: 43127.3. Samples: 9336509360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 02:51:49,478][15401] Updated weights for policy 0, policy_version 569850 (0.0034) [2024-06-24 02:51:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9336553472. Throughput: 0: 42984.9. Samples: 9336638160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 02:51:53,775][15401] Updated weights for policy 0, policy_version 569860 (0.0029) [2024-06-24 02:51:56,833][15401] Updated weights for policy 0, policy_version 569870 (0.0031) [2024-06-24 02:51:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 9336782848. Throughput: 0: 43113.2. Samples: 9336899200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:51:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 02:52:01,215][15401] Updated weights for policy 0, policy_version 569880 (0.0024) [2024-06-24 02:52:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9336995840. Throughput: 0: 43204.1. Samples: 9337162280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:52:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 02:52:04,237][15401] Updated weights for policy 0, policy_version 569890 (0.0032) [2024-06-24 02:52:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9337208832. Throughput: 0: 43122.7. Samples: 9337296580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:52:08,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 02:52:09,252][15401] Updated weights for policy 0, policy_version 569900 (0.0032) [2024-06-24 02:52:11,710][15401] Updated weights for policy 0, policy_version 569910 (0.0044) [2024-06-24 02:52:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43149.2, 300 sec: 43043.5). Total num frames: 9337421824. Throughput: 0: 43148.6. Samples: 9337546120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:52:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 02:52:16,750][15401] Updated weights for policy 0, policy_version 569920 (0.0033) [2024-06-24 02:52:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9337651200. Throughput: 0: 43189.5. Samples: 9337811420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:52:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 02:52:19,711][15401] Updated weights for policy 0, policy_version 569930 (0.0024) [2024-06-24 02:52:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 9337864192. Throughput: 0: 43295.7. Samples: 9337941700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 02:52:23,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 02:52:24,217][15401] Updated weights for policy 0, policy_version 569940 (0.0028) [2024-06-24 02:52:27,552][15401] Updated weights for policy 0, policy_version 569950 (0.0034) [2024-06-24 02:52:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42872.5, 300 sec: 42987.2). Total num frames: 9338060800. Throughput: 0: 43125.3. Samples: 9338183300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 02:52:32,432][15401] Updated weights for policy 0, policy_version 569960 (0.0038) [2024-06-24 02:52:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9338290176. Throughput: 0: 43014.7. Samples: 9338445020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 02:52:35,208][15401] Updated weights for policy 0, policy_version 569970 (0.0025) [2024-06-24 02:52:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42932.0). Total num frames: 9338486784. Throughput: 0: 43041.0. Samples: 9338575000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 02:52:40,092][15401] Updated weights for policy 0, policy_version 569980 (0.0046) [2024-06-24 02:52:42,855][15401] Updated weights for policy 0, policy_version 569990 (0.0040) [2024-06-24 02:52:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9338716160. Throughput: 0: 42744.0. Samples: 9338822680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 02:52:47,645][15401] Updated weights for policy 0, policy_version 570000 (0.0040) [2024-06-24 02:52:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9338896384. Throughput: 0: 42870.6. Samples: 9339091460. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:48,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 02:52:48,475][15349] Signal inference workers to stop experience collection... (138250 times) [2024-06-24 02:52:48,479][15349] Signal inference workers to resume experience collection... (138250 times) [2024-06-24 02:52:48,495][15401] InferenceWorker_p0-w0: stopping experience collection (138250 times) [2024-06-24 02:52:48,495][15401] InferenceWorker_p0-w0: resuming experience collection (138250 times) [2024-06-24 02:52:50,895][15401] Updated weights for policy 0, policy_version 570010 (0.0033) [2024-06-24 02:52:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9339125760. Throughput: 0: 42504.5. Samples: 9339209280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:53,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 02:52:55,170][15401] Updated weights for policy 0, policy_version 570020 (0.0039) [2024-06-24 02:52:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9339355136. Throughput: 0: 42641.8. Samples: 9339465000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:52:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 02:52:58,654][15401] Updated weights for policy 0, policy_version 570030 (0.0026) [2024-06-24 02:53:02,705][15401] Updated weights for policy 0, policy_version 570040 (0.0036) [2024-06-24 02:53:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9339551744. Throughput: 0: 42627.0. Samples: 9339729640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 02:53:06,788][15401] Updated weights for policy 0, policy_version 570050 (0.0026) [2024-06-24 02:53:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 9339781120. Throughput: 0: 42484.5. Samples: 9339853500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 02:53:10,886][15401] Updated weights for policy 0, policy_version 570060 (0.0036) [2024-06-24 02:53:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9339994112. Throughput: 0: 42789.7. Samples: 9340108840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 02:53:14,285][15401] Updated weights for policy 0, policy_version 570070 (0.0035) [2024-06-24 02:53:18,316][15401] Updated weights for policy 0, policy_version 570080 (0.0033) [2024-06-24 02:53:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42876.5). Total num frames: 9340190720. Throughput: 0: 42857.2. Samples: 9340373600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 02:53:21,756][15401] Updated weights for policy 0, policy_version 570090 (0.0037) [2024-06-24 02:53:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 9340420096. Throughput: 0: 42671.5. Samples: 9340495220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 02:53:25,821][15401] Updated weights for policy 0, policy_version 570100 (0.0031) [2024-06-24 02:53:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.2). Total num frames: 9340633088. Throughput: 0: 42854.1. Samples: 9340751120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 02:53:29,716][15401] Updated weights for policy 0, policy_version 570110 (0.0037) [2024-06-24 02:53:33,260][15401] Updated weights for policy 0, policy_version 570120 (0.0027) [2024-06-24 02:53:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 9340846080. Throughput: 0: 42713.7. Samples: 9341013680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:33,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 02:53:37,156][15401] Updated weights for policy 0, policy_version 570130 (0.0041) [2024-06-24 02:53:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9341059072. Throughput: 0: 42864.8. Samples: 9341138200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 02:53:40,759][15401] Updated weights for policy 0, policy_version 570140 (0.0025) [2024-06-24 02:53:43,390][15132] Fps is (10 sec: 44244.2, 60 sec: 42870.9, 300 sec: 42876.0). Total num frames: 9341288448. Throughput: 0: 42922.4. Samples: 9341396540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:43,391][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 02:53:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570147_9341288448.pth... [2024-06-24 02:53:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569518_9330982912.pth [2024-06-24 02:53:44,592][15401] Updated weights for policy 0, policy_version 570150 (0.0028) [2024-06-24 02:53:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9341485056. Throughput: 0: 42782.7. Samples: 9341654860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:48,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 02:53:48,718][15401] Updated weights for policy 0, policy_version 570160 (0.0040) [2024-06-24 02:53:52,513][15401] Updated weights for policy 0, policy_version 570170 (0.0035) [2024-06-24 02:53:53,301][15349] Signal inference workers to stop experience collection... (138300 times) [2024-06-24 02:53:53,347][15401] InferenceWorker_p0-w0: stopping experience collection (138300 times) [2024-06-24 02:53:53,352][15349] Signal inference workers to resume experience collection... (138300 times) [2024-06-24 02:53:53,360][15401] InferenceWorker_p0-w0: resuming experience collection (138300 times) [2024-06-24 02:53:53,390][15132] Fps is (10 sec: 42601.4, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 9341714432. Throughput: 0: 42792.8. Samples: 9341779180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-24 02:53:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 02:53:56,285][15401] Updated weights for policy 0, policy_version 570180 (0.0030) [2024-06-24 02:53:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9341927424. Throughput: 0: 42914.3. Samples: 9342039980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:53:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 02:54:00,487][15401] Updated weights for policy 0, policy_version 570190 (0.0026) [2024-06-24 02:54:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9342124032. Throughput: 0: 42692.6. Samples: 9342294760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 02:54:03,843][15401] Updated weights for policy 0, policy_version 570200 (0.0030) [2024-06-24 02:54:08,115][15401] Updated weights for policy 0, policy_version 570210 (0.0020) [2024-06-24 02:54:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9342337024. Throughput: 0: 42831.5. Samples: 9342422640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 02:54:11,246][15401] Updated weights for policy 0, policy_version 570220 (0.0033) [2024-06-24 02:54:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9342566400. Throughput: 0: 42867.6. Samples: 9342680160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 02:54:15,774][15401] Updated weights for policy 0, policy_version 570230 (0.0045) [2024-06-24 02:54:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 9342779392. Throughput: 0: 42555.1. Samples: 9342928660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:18,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 02:54:19,062][15401] Updated weights for policy 0, policy_version 570240 (0.0024) [2024-06-24 02:54:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 9342959616. Throughput: 0: 42747.9. Samples: 9343061860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 02:54:23,399][15401] Updated weights for policy 0, policy_version 570250 (0.0037) [2024-06-24 02:54:26,645][15401] Updated weights for policy 0, policy_version 570260 (0.0038) [2024-06-24 02:54:28,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 9343188992. Throughput: 0: 42583.5. Samples: 9343312760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 02:54:31,054][15401] Updated weights for policy 0, policy_version 570270 (0.0032) [2024-06-24 02:54:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 9343418368. Throughput: 0: 42573.2. Samples: 9343570660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 02:54:34,185][15401] Updated weights for policy 0, policy_version 570280 (0.0043) [2024-06-24 02:54:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9343614976. Throughput: 0: 42779.7. Samples: 9343704260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 02:54:38,518][15401] Updated weights for policy 0, policy_version 570290 (0.0034) [2024-06-24 02:54:41,757][15401] Updated weights for policy 0, policy_version 570300 (0.0038) [2024-06-24 02:54:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 9343844352. Throughput: 0: 42627.5. Samples: 9343958220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:43,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 02:54:46,044][15401] Updated weights for policy 0, policy_version 570310 (0.0042) [2024-06-24 02:54:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9344057344. Throughput: 0: 42706.0. Samples: 9344216540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:48,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 02:54:49,486][15401] Updated weights for policy 0, policy_version 570320 (0.0028) [2024-06-24 02:54:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9344253952. Throughput: 0: 42782.3. Samples: 9344347840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:53,391][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 02:54:53,686][15401] Updated weights for policy 0, policy_version 570330 (0.0033) [2024-06-24 02:54:56,938][15401] Updated weights for policy 0, policy_version 570340 (0.0030) [2024-06-24 02:54:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9344466944. Throughput: 0: 42751.3. Samples: 9344603960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:54:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 02:55:00,409][15349] Signal inference workers to stop experience collection... (138350 times) [2024-06-24 02:55:00,412][15349] Signal inference workers to resume experience collection... (138350 times) [2024-06-24 02:55:00,427][15401] InferenceWorker_p0-w0: stopping experience collection (138350 times) [2024-06-24 02:55:00,427][15401] InferenceWorker_p0-w0: resuming experience collection (138350 times) [2024-06-24 02:55:01,160][15401] Updated weights for policy 0, policy_version 570350 (0.0028) [2024-06-24 02:55:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9344712704. Throughput: 0: 42972.6. Samples: 9344862320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:55:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 02:55:04,608][15401] Updated weights for policy 0, policy_version 570360 (0.0038) [2024-06-24 02:55:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9344909312. Throughput: 0: 43006.3. Samples: 9344997140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:55:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 02:55:09,203][15401] Updated weights for policy 0, policy_version 570370 (0.0033) [2024-06-24 02:55:12,341][15401] Updated weights for policy 0, policy_version 570380 (0.0031) [2024-06-24 02:55:13,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9345122304. Throughput: 0: 43008.2. Samples: 9345248140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:55:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 02:55:16,819][15401] Updated weights for policy 0, policy_version 570390 (0.0030) [2024-06-24 02:55:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 9345351680. Throughput: 0: 43001.3. Samples: 9345505720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:55:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 02:55:19,756][15401] Updated weights for policy 0, policy_version 570400 (0.0031) [2024-06-24 02:55:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9345548288. Throughput: 0: 43001.6. Samples: 9345639340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 02:55:23,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 02:55:24,380][15401] Updated weights for policy 0, policy_version 570410 (0.0035) [2024-06-24 02:55:27,238][15401] Updated weights for policy 0, policy_version 570420 (0.0034) [2024-06-24 02:55:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9345777664. Throughput: 0: 42871.0. Samples: 9345887420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 02:55:31,807][15401] Updated weights for policy 0, policy_version 570430 (0.0037) [2024-06-24 02:55:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9345990656. Throughput: 0: 42960.0. Samples: 9346149740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:33,404][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 02:55:34,844][15401] Updated weights for policy 0, policy_version 570440 (0.0036) [2024-06-24 02:55:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42821.5). Total num frames: 9346187264. Throughput: 0: 42946.6. Samples: 9346280440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:55:39,314][15401] Updated weights for policy 0, policy_version 570450 (0.0040) [2024-06-24 02:55:42,906][15401] Updated weights for policy 0, policy_version 570460 (0.0034) [2024-06-24 02:55:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9346433024. Throughput: 0: 42813.2. Samples: 9346530560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 02:55:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570461_9346433024.pth... [2024-06-24 02:55:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000569833_9336143872.pth [2024-06-24 02:55:47,209][15401] Updated weights for policy 0, policy_version 570470 (0.0047) [2024-06-24 02:55:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9346629632. Throughput: 0: 42795.4. Samples: 9346788120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 02:55:50,333][15401] Updated weights for policy 0, policy_version 570480 (0.0038) [2024-06-24 02:55:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9346826240. Throughput: 0: 42656.4. Samples: 9346916680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:53,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 02:55:54,766][15401] Updated weights for policy 0, policy_version 570490 (0.0033) [2024-06-24 02:55:58,387][15401] Updated weights for policy 0, policy_version 570500 (0.0044) [2024-06-24 02:55:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9347072000. Throughput: 0: 42784.7. Samples: 9347173440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:55:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 02:56:02,334][15401] Updated weights for policy 0, policy_version 570510 (0.0037) [2024-06-24 02:56:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9347268608. Throughput: 0: 42738.3. Samples: 9347428940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 02:56:05,986][15401] Updated weights for policy 0, policy_version 570520 (0.0033) [2024-06-24 02:56:08,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42867.0, 300 sec: 42876.1). Total num frames: 9347481600. Throughput: 0: 42529.6. Samples: 9347553440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:08,397][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 02:56:10,328][15401] Updated weights for policy 0, policy_version 570530 (0.0035) [2024-06-24 02:56:10,874][15349] Signal inference workers to stop experience collection... (138400 times) [2024-06-24 02:56:10,874][15349] Signal inference workers to resume experience collection... (138400 times) [2024-06-24 02:56:10,921][15401] InferenceWorker_p0-w0: stopping experience collection (138400 times) [2024-06-24 02:56:10,928][15401] InferenceWorker_p0-w0: resuming experience collection (138400 times) [2024-06-24 02:56:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9347710976. Throughput: 0: 42939.3. Samples: 9347819680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 02:56:13,723][15401] Updated weights for policy 0, policy_version 570540 (0.0028) [2024-06-24 02:56:17,835][15401] Updated weights for policy 0, policy_version 570550 (0.0040) [2024-06-24 02:56:18,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9347923968. Throughput: 0: 42754.3. Samples: 9348073680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 02:56:21,282][15401] Updated weights for policy 0, policy_version 570560 (0.0022) [2024-06-24 02:56:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42876.3). Total num frames: 9348136960. Throughput: 0: 42717.8. Samples: 9348202740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 02:56:25,697][15401] Updated weights for policy 0, policy_version 570570 (0.0042) [2024-06-24 02:56:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9348333568. Throughput: 0: 42974.3. Samples: 9348464400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 02:56:28,823][15401] Updated weights for policy 0, policy_version 570580 (0.0031) [2024-06-24 02:56:33,065][15401] Updated weights for policy 0, policy_version 570590 (0.0035) [2024-06-24 02:56:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9348546560. Throughput: 0: 42937.4. Samples: 9348720300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 02:56:36,547][15401] Updated weights for policy 0, policy_version 570600 (0.0030) [2024-06-24 02:56:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9348759552. Throughput: 0: 42878.7. Samples: 9348846220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 02:56:40,670][15401] Updated weights for policy 0, policy_version 570610 (0.0031) [2024-06-24 02:56:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 9348972544. Throughput: 0: 42952.8. Samples: 9349106320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 02:56:44,113][15401] Updated weights for policy 0, policy_version 570620 (0.0028) [2024-06-24 02:56:48,287][15401] Updated weights for policy 0, policy_version 570630 (0.0029) [2024-06-24 02:56:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9349201920. Throughput: 0: 43116.9. Samples: 9349369200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 02:56:51,984][15401] Updated weights for policy 0, policy_version 570640 (0.0029) [2024-06-24 02:56:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 9349431296. Throughput: 0: 43186.1. Samples: 9349496540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:56:55,859][15401] Updated weights for policy 0, policy_version 570650 (0.0024) [2024-06-24 02:56:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9349627904. Throughput: 0: 43079.0. Samples: 9349758240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-24 02:56:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 02:56:59,406][15401] Updated weights for policy 0, policy_version 570660 (0.0032) [2024-06-24 02:57:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9349857280. Throughput: 0: 43233.6. Samples: 9350019200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 02:57:03,392][15401] Updated weights for policy 0, policy_version 570670 (0.0036) [2024-06-24 02:57:07,376][15401] Updated weights for policy 0, policy_version 570680 (0.0032) [2024-06-24 02:57:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 9350070272. Throughput: 0: 43124.9. Samples: 9350143360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 02:57:10,749][15401] Updated weights for policy 0, policy_version 570690 (0.0030) [2024-06-24 02:57:13,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9350266880. Throughput: 0: 43083.6. Samples: 9350403160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:57:14,835][15401] Updated weights for policy 0, policy_version 570700 (0.0040) [2024-06-24 02:57:18,243][15401] Updated weights for policy 0, policy_version 570710 (0.0033) [2024-06-24 02:57:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9350512640. Throughput: 0: 43148.8. Samples: 9350662000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 02:57:22,511][15401] Updated weights for policy 0, policy_version 570720 (0.0035) [2024-06-24 02:57:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9350709248. Throughput: 0: 43270.7. Samples: 9350793400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 02:57:25,794][15401] Updated weights for policy 0, policy_version 570730 (0.0031) [2024-06-24 02:57:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9350922240. Throughput: 0: 43181.3. Samples: 9351049480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 02:57:29,987][15401] Updated weights for policy 0, policy_version 570740 (0.0028) [2024-06-24 02:57:32,963][15349] Signal inference workers to stop experience collection... (138450 times) [2024-06-24 02:57:32,964][15349] Signal inference workers to resume experience collection... (138450 times) [2024-06-24 02:57:32,977][15401] InferenceWorker_p0-w0: stopping experience collection (138450 times) [2024-06-24 02:57:32,977][15401] InferenceWorker_p0-w0: resuming experience collection (138450 times) [2024-06-24 02:57:33,383][15401] Updated weights for policy 0, policy_version 570750 (0.0041) [2024-06-24 02:57:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 9351168000. Throughput: 0: 43013.0. Samples: 9351304780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 02:57:37,603][15401] Updated weights for policy 0, policy_version 570760 (0.0038) [2024-06-24 02:57:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9351364608. Throughput: 0: 43098.1. Samples: 9351435960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 02:57:40,975][15401] Updated weights for policy 0, policy_version 570770 (0.0035) [2024-06-24 02:57:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 9351577600. Throughput: 0: 43161.7. Samples: 9351700520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 02:57:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570775_9351577600.pth... [2024-06-24 02:57:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570147_9341288448.pth [2024-06-24 02:57:45,258][15401] Updated weights for policy 0, policy_version 570780 (0.0034) [2024-06-24 02:57:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42987.1). Total num frames: 9351806976. Throughput: 0: 43001.4. Samples: 9351954260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-24 02:57:48,469][15401] Updated weights for policy 0, policy_version 570790 (0.0032) [2024-06-24 02:57:53,023][15401] Updated weights for policy 0, policy_version 570800 (0.0031) [2024-06-24 02:57:53,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 9352003584. Throughput: 0: 43144.0. Samples: 9352084940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:53,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 02:57:55,995][15401] Updated weights for policy 0, policy_version 570810 (0.0041) [2024-06-24 02:57:58,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9352216576. Throughput: 0: 43049.7. Samples: 9352340400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:57:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 02:58:00,872][15401] Updated weights for policy 0, policy_version 570820 (0.0037) [2024-06-24 02:58:03,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 9352445952. Throughput: 0: 42951.2. Samples: 9352594800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 02:58:03,638][15401] Updated weights for policy 0, policy_version 570830 (0.0032) [2024-06-24 02:58:08,338][15401] Updated weights for policy 0, policy_version 570840 (0.0042) [2024-06-24 02:58:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 9352642560. Throughput: 0: 42941.6. Samples: 9352725880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:08,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 02:58:11,290][15401] Updated weights for policy 0, policy_version 570850 (0.0043) [2024-06-24 02:58:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9352855552. Throughput: 0: 42866.7. Samples: 9352978480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 02:58:16,037][15401] Updated weights for policy 0, policy_version 570860 (0.0038) [2024-06-24 02:58:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9353068544. Throughput: 0: 42932.1. Samples: 9353236720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 02:58:19,256][15401] Updated weights for policy 0, policy_version 570870 (0.0044) [2024-06-24 02:58:23,337][15401] Updated weights for policy 0, policy_version 570880 (0.0035) [2024-06-24 02:58:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 9353297920. Throughput: 0: 42899.7. Samples: 9353366440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 02:58:26,640][15401] Updated weights for policy 0, policy_version 570890 (0.0049) [2024-06-24 02:58:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 9353510912. Throughput: 0: 42746.4. Samples: 9353624100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 02:58:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 02:58:30,706][15401] Updated weights for policy 0, policy_version 570900 (0.0053) [2024-06-24 02:58:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9353723904. Throughput: 0: 42984.0. Samples: 9353888540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:33,395][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 02:58:34,209][15401] Updated weights for policy 0, policy_version 570910 (0.0033) [2024-06-24 02:58:38,235][15401] Updated weights for policy 0, policy_version 570920 (0.0037) [2024-06-24 02:58:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 9353953280. Throughput: 0: 42922.2. Samples: 9354016340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 02:58:41,913][15401] Updated weights for policy 0, policy_version 570930 (0.0042) [2024-06-24 02:58:43,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 9354166272. Throughput: 0: 42839.1. Samples: 9354268260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:43,402][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 02:58:46,142][15401] Updated weights for policy 0, policy_version 570940 (0.0039) [2024-06-24 02:58:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9354379264. Throughput: 0: 43077.8. Samples: 9354533300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 02:58:49,525][15401] Updated weights for policy 0, policy_version 570950 (0.0033) [2024-06-24 02:58:52,593][15349] Signal inference workers to stop experience collection... (138500 times) [2024-06-24 02:58:52,633][15401] InferenceWorker_p0-w0: stopping experience collection (138500 times) [2024-06-24 02:58:52,711][15349] Signal inference workers to resume experience collection... (138500 times) [2024-06-24 02:58:52,711][15401] InferenceWorker_p0-w0: resuming experience collection (138500 times) [2024-06-24 02:58:53,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9354575872. Throughput: 0: 42968.5. Samples: 9354659360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 02:58:53,642][15401] Updated weights for policy 0, policy_version 570960 (0.0022) [2024-06-24 02:58:57,218][15401] Updated weights for policy 0, policy_version 570970 (0.0033) [2024-06-24 02:58:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9354805248. Throughput: 0: 43006.2. Samples: 9354913760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:58:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 02:59:01,115][15401] Updated weights for policy 0, policy_version 570980 (0.0032) [2024-06-24 02:59:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 9355001856. Throughput: 0: 43131.1. Samples: 9355177620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 02:59:04,984][15401] Updated weights for policy 0, policy_version 570990 (0.0032) [2024-06-24 02:59:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 9355231232. Throughput: 0: 43033.8. Samples: 9355302960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 02:59:08,985][15401] Updated weights for policy 0, policy_version 571000 (0.0036) [2024-06-24 02:59:12,751][15401] Updated weights for policy 0, policy_version 571010 (0.0036) [2024-06-24 02:59:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 9355444224. Throughput: 0: 43064.4. Samples: 9355562000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 02:59:16,456][15401] Updated weights for policy 0, policy_version 571020 (0.0036) [2024-06-24 02:59:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9355657216. Throughput: 0: 42932.5. Samples: 9355820500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 02:59:20,254][15401] Updated weights for policy 0, policy_version 571030 (0.0045) [2024-06-24 02:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 9355870208. Throughput: 0: 42857.8. Samples: 9355944940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 02:59:23,949][15401] Updated weights for policy 0, policy_version 571040 (0.0031) [2024-06-24 02:59:28,147][15401] Updated weights for policy 0, policy_version 571050 (0.0045) [2024-06-24 02:59:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9356099584. Throughput: 0: 43066.8. Samples: 9356206160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 02:59:32,182][15401] Updated weights for policy 0, policy_version 571060 (0.0032) [2024-06-24 02:59:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9356312576. Throughput: 0: 42796.9. Samples: 9356459160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:33,395][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 02:59:35,761][15401] Updated weights for policy 0, policy_version 571070 (0.0036) [2024-06-24 02:59:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9356509184. Throughput: 0: 42797.3. Samples: 9356585240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 02:59:39,760][15401] Updated weights for policy 0, policy_version 571080 (0.0032) [2024-06-24 02:59:43,281][15401] Updated weights for policy 0, policy_version 571090 (0.0030) [2024-06-24 02:59:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42871.5, 300 sec: 42986.8). Total num frames: 9356738560. Throughput: 0: 43016.0. Samples: 9356849580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:43,392][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 02:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571090_9356738560.pth... [2024-06-24 02:59:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570461_9346433024.pth [2024-06-24 02:59:47,506][15401] Updated weights for policy 0, policy_version 571100 (0.0035) [2024-06-24 02:59:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 9356935168. Throughput: 0: 42728.4. Samples: 9357100400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 02:59:51,197][15401] Updated weights for policy 0, policy_version 571110 (0.0047) [2024-06-24 02:59:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9357148160. Throughput: 0: 42826.2. Samples: 9357230140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 02:59:55,138][15401] Updated weights for policy 0, policy_version 571120 (0.0032) [2024-06-24 02:59:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9357361152. Throughput: 0: 42859.1. Samples: 9357490660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 02:59:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 02:59:58,649][15401] Updated weights for policy 0, policy_version 571130 (0.0022) [2024-06-24 03:00:02,753][15401] Updated weights for policy 0, policy_version 571140 (0.0029) [2024-06-24 03:00:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9357574144. Throughput: 0: 42743.4. Samples: 9357743960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 03:00:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 03:00:06,324][15401] Updated weights for policy 0, policy_version 571150 (0.0055) [2024-06-24 03:00:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9357770752. Throughput: 0: 42829.9. Samples: 9357872280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 03:00:10,238][15401] Updated weights for policy 0, policy_version 571160 (0.0028) [2024-06-24 03:00:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9358016512. Throughput: 0: 42749.6. Samples: 9358129900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 03:00:14,191][15401] Updated weights for policy 0, policy_version 571170 (0.0052) [2024-06-24 03:00:17,738][15401] Updated weights for policy 0, policy_version 571180 (0.0038) [2024-06-24 03:00:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9358229504. Throughput: 0: 42680.5. Samples: 9358379780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 03:00:21,789][15401] Updated weights for policy 0, policy_version 571190 (0.0027) [2024-06-24 03:00:23,284][15349] Signal inference workers to stop experience collection... (138550 times) [2024-06-24 03:00:23,284][15349] Signal inference workers to resume experience collection... (138550 times) [2024-06-24 03:00:23,332][15401] InferenceWorker_p0-w0: stopping experience collection (138550 times) [2024-06-24 03:00:23,332][15401] InferenceWorker_p0-w0: resuming experience collection (138550 times) [2024-06-24 03:00:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9358426112. Throughput: 0: 42829.9. Samples: 9358512580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 03:00:25,440][15401] Updated weights for policy 0, policy_version 571200 (0.0035) [2024-06-24 03:00:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 9358622720. Throughput: 0: 42680.1. Samples: 9358770080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 03:00:29,328][15401] Updated weights for policy 0, policy_version 571210 (0.0030) [2024-06-24 03:00:32,897][15401] Updated weights for policy 0, policy_version 571220 (0.0041) [2024-06-24 03:00:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 9358884864. Throughput: 0: 42703.9. Samples: 9359022080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:33,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 03:00:36,974][15401] Updated weights for policy 0, policy_version 571230 (0.0048) [2024-06-24 03:00:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9359065088. Throughput: 0: 42868.0. Samples: 9359159200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 03:00:40,399][15401] Updated weights for policy 0, policy_version 571240 (0.0025) [2024-06-24 03:00:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 9359278080. Throughput: 0: 42809.8. Samples: 9359417100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:43,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 03:00:44,645][15401] Updated weights for policy 0, policy_version 571250 (0.0033) [2024-06-24 03:00:48,015][15401] Updated weights for policy 0, policy_version 571260 (0.0031) [2024-06-24 03:00:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9359523840. Throughput: 0: 42630.8. Samples: 9359662340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 03:00:52,170][15401] Updated weights for policy 0, policy_version 571270 (0.0035) [2024-06-24 03:00:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9359720448. Throughput: 0: 42899.1. Samples: 9359802740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 03:00:55,705][15401] Updated weights for policy 0, policy_version 571280 (0.0027) [2024-06-24 03:00:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9359917056. Throughput: 0: 42998.8. Samples: 9360064840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:00:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 03:00:59,758][15401] Updated weights for policy 0, policy_version 571290 (0.0029) [2024-06-24 03:01:03,330][15401] Updated weights for policy 0, policy_version 571300 (0.0035) [2024-06-24 03:01:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 43043.6). Total num frames: 9360179200. Throughput: 0: 42952.4. Samples: 9360312640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 03:01:07,672][15401] Updated weights for policy 0, policy_version 571310 (0.0042) [2024-06-24 03:01:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9360359424. Throughput: 0: 43011.1. Samples: 9360448080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 03:01:11,025][15401] Updated weights for policy 0, policy_version 571320 (0.0038) [2024-06-24 03:01:13,392][15132] Fps is (10 sec: 39313.7, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 9360572416. Throughput: 0: 42817.6. Samples: 9360696960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:13,392][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 03:01:15,973][15401] Updated weights for policy 0, policy_version 571330 (0.0037) [2024-06-24 03:01:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9360801792. Throughput: 0: 42787.1. Samples: 9360947500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 03:01:18,860][15401] Updated weights for policy 0, policy_version 571340 (0.0035) [2024-06-24 03:01:23,312][15401] Updated weights for policy 0, policy_version 571350 (0.0037) [2024-06-24 03:01:23,392][15132] Fps is (10 sec: 42596.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9360998400. Throughput: 0: 42695.4. Samples: 9361080600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:23,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 03:01:26,317][15401] Updated weights for policy 0, policy_version 571360 (0.0044) [2024-06-24 03:01:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9361211392. Throughput: 0: 42684.5. Samples: 9361337900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 03:01:30,840][15401] Updated weights for policy 0, policy_version 571370 (0.0036) [2024-06-24 03:01:33,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9361440768. Throughput: 0: 43084.9. Samples: 9361601160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 03:01:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 03:01:33,856][15401] Updated weights for policy 0, policy_version 571380 (0.0033) [2024-06-24 03:01:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9361637376. Throughput: 0: 42896.5. Samples: 9361733080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:01:38,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 03:01:38,482][15401] Updated weights for policy 0, policy_version 571390 (0.0029) [2024-06-24 03:01:41,461][15401] Updated weights for policy 0, policy_version 571400 (0.0033) [2024-06-24 03:01:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9361866752. Throughput: 0: 42642.1. Samples: 9361983740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:01:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 03:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571403_9361866752.pth... [2024-06-24 03:01:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000570775_9351577600.pth [2024-06-24 03:01:44,406][15349] Signal inference workers to stop experience collection... (138600 times) [2024-06-24 03:01:44,449][15349] Signal inference workers to resume experience collection... (138600 times) [2024-06-24 03:01:44,456][15401] InferenceWorker_p0-w0: stopping experience collection (138600 times) [2024-06-24 03:01:44,492][15401] InferenceWorker_p0-w0: resuming experience collection (138600 times) [2024-06-24 03:01:46,314][15401] Updated weights for policy 0, policy_version 571410 (0.0022) [2024-06-24 03:01:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9362079744. Throughput: 0: 42890.6. Samples: 9362242720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:01:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 03:01:49,153][15401] Updated weights for policy 0, policy_version 571420 (0.0024) [2024-06-24 03:01:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9362276352. Throughput: 0: 42736.9. Samples: 9362371240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:01:53,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 03:01:53,753][15401] Updated weights for policy 0, policy_version 571430 (0.0036) [2024-06-24 03:01:56,986][15401] Updated weights for policy 0, policy_version 571440 (0.0029) [2024-06-24 03:01:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.5, 300 sec: 42931.7). Total num frames: 9362522112. Throughput: 0: 42845.5. Samples: 9362624920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:01:58,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-24 03:02:01,371][15401] Updated weights for policy 0, policy_version 571450 (0.0034) [2024-06-24 03:02:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 9362735104. Throughput: 0: 43105.9. Samples: 9362887260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:03,390][15132] Avg episode reward: [(0, '0.189')] [2024-06-24 03:02:04,452][15401] Updated weights for policy 0, policy_version 571460 (0.0029) [2024-06-24 03:02:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9362931712. Throughput: 0: 43058.3. Samples: 9363018120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:08,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 03:02:08,944][15401] Updated weights for policy 0, policy_version 571470 (0.0031) [2024-06-24 03:02:12,451][15401] Updated weights for policy 0, policy_version 571480 (0.0031) [2024-06-24 03:02:13,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42871.2, 300 sec: 42820.2). Total num frames: 9363144704. Throughput: 0: 43015.8. Samples: 9363273720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:13,393][15132] Avg episode reward: [(0, '0.845')] [2024-06-24 03:02:16,467][15401] Updated weights for policy 0, policy_version 571490 (0.0036) [2024-06-24 03:02:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9363374080. Throughput: 0: 42802.3. Samples: 9363527260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 03:02:19,994][15401] Updated weights for policy 0, policy_version 571500 (0.0031) [2024-06-24 03:02:23,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 9363587072. Throughput: 0: 42827.1. Samples: 9363660300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:23,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-24 03:02:23,988][15401] Updated weights for policy 0, policy_version 571510 (0.0022) [2024-06-24 03:02:28,036][15401] Updated weights for policy 0, policy_version 571520 (0.0029) [2024-06-24 03:02:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9363800064. Throughput: 0: 42884.9. Samples: 9363913560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 03:02:31,532][15401] Updated weights for policy 0, policy_version 571530 (0.0035) [2024-06-24 03:02:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9363996672. Throughput: 0: 42680.2. Samples: 9364163320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 03:02:35,591][15401] Updated weights for policy 0, policy_version 571540 (0.0030) [2024-06-24 03:02:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9364209664. Throughput: 0: 42650.3. Samples: 9364290500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 03:02:39,118][15401] Updated weights for policy 0, policy_version 571550 (0.0031) [2024-06-24 03:02:43,126][15401] Updated weights for policy 0, policy_version 571560 (0.0030) [2024-06-24 03:02:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9364439040. Throughput: 0: 42775.4. Samples: 9364549820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 03:02:47,145][15401] Updated weights for policy 0, policy_version 571570 (0.0027) [2024-06-24 03:02:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 9364652032. Throughput: 0: 42552.0. Samples: 9364802100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 03:02:49,223][15349] Signal inference workers to stop experience collection... (138650 times) [2024-06-24 03:02:49,224][15349] Signal inference workers to resume experience collection... (138650 times) [2024-06-24 03:02:49,248][15401] InferenceWorker_p0-w0: stopping experience collection (138650 times) [2024-06-24 03:02:49,248][15401] InferenceWorker_p0-w0: resuming experience collection (138650 times) [2024-06-24 03:02:50,736][15401] Updated weights for policy 0, policy_version 571580 (0.0026) [2024-06-24 03:02:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9364832256. Throughput: 0: 42497.4. Samples: 9364930500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 03:02:54,647][15401] Updated weights for policy 0, policy_version 571590 (0.0038) [2024-06-24 03:02:58,345][15401] Updated weights for policy 0, policy_version 571600 (0.0036) [2024-06-24 03:02:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9365094400. Throughput: 0: 42641.3. Samples: 9365192480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:02:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 03:03:02,202][15401] Updated weights for policy 0, policy_version 571610 (0.0038) [2024-06-24 03:03:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 9365291008. Throughput: 0: 42583.0. Samples: 9365443500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 03:03:03,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 03:03:05,818][15401] Updated weights for policy 0, policy_version 571620 (0.0035) [2024-06-24 03:03:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9365487616. Throughput: 0: 42489.4. Samples: 9365572320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 03:03:09,746][15401] Updated weights for policy 0, policy_version 571630 (0.0033) [2024-06-24 03:03:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 9365733376. Throughput: 0: 42755.1. Samples: 9365837540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:13,396][15132] Avg episode reward: [(0, '0.868')] [2024-06-24 03:03:13,624][15401] Updated weights for policy 0, policy_version 571640 (0.0030) [2024-06-24 03:03:17,624][15401] Updated weights for policy 0, policy_version 571650 (0.0027) [2024-06-24 03:03:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9365929984. Throughput: 0: 42716.3. Samples: 9366085560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:18,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 03:03:21,244][15401] Updated weights for policy 0, policy_version 571660 (0.0034) [2024-06-24 03:03:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9366126592. Throughput: 0: 42783.9. Samples: 9366215780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 03:03:25,473][15401] Updated weights for policy 0, policy_version 571670 (0.0042) [2024-06-24 03:03:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9366372352. Throughput: 0: 42735.6. Samples: 9366472920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 03:03:28,886][15401] Updated weights for policy 0, policy_version 571680 (0.0037) [2024-06-24 03:03:33,154][15401] Updated weights for policy 0, policy_version 571690 (0.0044) [2024-06-24 03:03:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9366568960. Throughput: 0: 42685.3. Samples: 9366722940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:33,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 03:03:37,463][15401] Updated weights for policy 0, policy_version 571700 (0.0030) [2024-06-24 03:03:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 9366765568. Throughput: 0: 42624.5. Samples: 9366848600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 03:03:40,935][15401] Updated weights for policy 0, policy_version 571710 (0.0035) [2024-06-24 03:03:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42867.0, 300 sec: 42819.6). Total num frames: 9367011328. Throughput: 0: 42581.1. Samples: 9367108900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:43,396][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 03:03:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571717_9367011328.pth... [2024-06-24 03:03:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571090_9356738560.pth [2024-06-24 03:03:45,147][15401] Updated weights for policy 0, policy_version 571720 (0.0042) [2024-06-24 03:03:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9367207936. Throughput: 0: 42563.2. Samples: 9367358840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 03:03:48,490][15401] Updated weights for policy 0, policy_version 571730 (0.0045) [2024-06-24 03:03:52,811][15401] Updated weights for policy 0, policy_version 571740 (0.0040) [2024-06-24 03:03:53,390][15132] Fps is (10 sec: 37707.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9367388160. Throughput: 0: 42459.9. Samples: 9367483020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 03:03:56,325][15401] Updated weights for policy 0, policy_version 571750 (0.0037) [2024-06-24 03:03:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 9367633920. Throughput: 0: 42310.8. Samples: 9367741520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:03:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 03:04:00,363][15401] Updated weights for policy 0, policy_version 571760 (0.0044) [2024-06-24 03:04:02,203][15349] Signal inference workers to stop experience collection... (138700 times) [2024-06-24 03:04:02,243][15401] InferenceWorker_p0-w0: stopping experience collection (138700 times) [2024-06-24 03:04:02,258][15349] Signal inference workers to resume experience collection... (138700 times) [2024-06-24 03:04:02,264][15401] InferenceWorker_p0-w0: resuming experience collection (138700 times) [2024-06-24 03:04:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9367846912. Throughput: 0: 42648.4. Samples: 9368004740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 03:04:03,860][15401] Updated weights for policy 0, policy_version 571770 (0.0037) [2024-06-24 03:04:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9368027136. Throughput: 0: 42444.5. Samples: 9368125780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 03:04:08,555][15401] Updated weights for policy 0, policy_version 571780 (0.0036) [2024-06-24 03:04:11,750][15401] Updated weights for policy 0, policy_version 571790 (0.0030) [2024-06-24 03:04:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9368272896. Throughput: 0: 42456.1. Samples: 9368383440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 03:04:16,137][15401] Updated weights for policy 0, policy_version 571800 (0.0025) [2024-06-24 03:04:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9368469504. Throughput: 0: 42745.7. Samples: 9368646500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:18,395][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 03:04:19,338][15401] Updated weights for policy 0, policy_version 571810 (0.0037) [2024-06-24 03:04:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9368666112. Throughput: 0: 42614.1. Samples: 9368766240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 03:04:23,683][15401] Updated weights for policy 0, policy_version 571820 (0.0048) [2024-06-24 03:04:27,105][15401] Updated weights for policy 0, policy_version 571830 (0.0033) [2024-06-24 03:04:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9368911872. Throughput: 0: 42575.4. Samples: 9369024520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:28,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 03:04:31,281][15401] Updated weights for policy 0, policy_version 571840 (0.0036) [2024-06-24 03:04:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9369124864. Throughput: 0: 42842.6. Samples: 9369286760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 03:04:34,884][15401] Updated weights for policy 0, policy_version 571850 (0.0037) [2024-06-24 03:04:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9369321472. Throughput: 0: 42781.9. Samples: 9369408200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 03:04:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 03:04:38,939][15401] Updated weights for policy 0, policy_version 571860 (0.0038) [2024-06-24 03:04:42,505][15401] Updated weights for policy 0, policy_version 571870 (0.0034) [2024-06-24 03:04:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42602.9, 300 sec: 42820.5). Total num frames: 9369567232. Throughput: 0: 42995.1. Samples: 9369676300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:04:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 03:04:46,644][15401] Updated weights for policy 0, policy_version 571880 (0.0026) [2024-06-24 03:04:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9369763840. Throughput: 0: 42774.3. Samples: 9369929580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:04:48,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 03:04:49,994][15401] Updated weights for policy 0, policy_version 571890 (0.0030) [2024-06-24 03:04:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9369976832. Throughput: 0: 42824.3. Samples: 9370052880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:04:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 03:04:54,145][15401] Updated weights for policy 0, policy_version 571900 (0.0031) [2024-06-24 03:04:57,610][15401] Updated weights for policy 0, policy_version 571910 (0.0035) [2024-06-24 03:04:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9370206208. Throughput: 0: 42783.1. Samples: 9370308680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:04:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 03:05:01,878][15401] Updated weights for policy 0, policy_version 571920 (0.0030) [2024-06-24 03:05:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9370386432. Throughput: 0: 42672.1. Samples: 9370566740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 03:05:05,398][15401] Updated weights for policy 0, policy_version 571930 (0.0035) [2024-06-24 03:05:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9370599424. Throughput: 0: 42648.5. Samples: 9370685420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 03:05:09,389][15401] Updated weights for policy 0, policy_version 571940 (0.0043) [2024-06-24 03:05:13,237][15401] Updated weights for policy 0, policy_version 571950 (0.0024) [2024-06-24 03:05:13,394][15132] Fps is (10 sec: 44216.6, 60 sec: 42595.1, 300 sec: 42708.8). Total num frames: 9370828800. Throughput: 0: 42597.0. Samples: 9370941580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:13,395][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 03:05:17,597][15401] Updated weights for policy 0, policy_version 571960 (0.0027) [2024-06-24 03:05:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9371009024. Throughput: 0: 42530.7. Samples: 9371200640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 03:05:18,966][15349] Signal inference workers to stop experience collection... (138750 times) [2024-06-24 03:05:18,966][15349] Signal inference workers to resume experience collection... (138750 times) [2024-06-24 03:05:19,014][15401] InferenceWorker_p0-w0: stopping experience collection (138750 times) [2024-06-24 03:05:19,015][15401] InferenceWorker_p0-w0: resuming experience collection (138750 times) [2024-06-24 03:05:20,830][15401] Updated weights for policy 0, policy_version 571970 (0.0043) [2024-06-24 03:05:23,390][15132] Fps is (10 sec: 40978.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9371238400. Throughput: 0: 42570.5. Samples: 9371323880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:23,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 03:05:25,352][15401] Updated weights for policy 0, policy_version 571980 (0.0035) [2024-06-24 03:05:28,364][15401] Updated weights for policy 0, policy_version 571990 (0.0034) [2024-06-24 03:05:28,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9371484160. Throughput: 0: 42298.1. Samples: 9371579720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 03:05:32,831][15401] Updated weights for policy 0, policy_version 572000 (0.0033) [2024-06-24 03:05:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9371664384. Throughput: 0: 42438.6. Samples: 9371839320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 03:05:36,068][15401] Updated weights for policy 0, policy_version 572010 (0.0040) [2024-06-24 03:05:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9371893760. Throughput: 0: 42497.0. Samples: 9371965240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 03:05:40,320][15401] Updated weights for policy 0, policy_version 572020 (0.0048) [2024-06-24 03:05:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9372106752. Throughput: 0: 42668.4. Samples: 9372228760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 03:05:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572029_9372123136.pth... [2024-06-24 03:05:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571403_9361866752.pth [2024-06-24 03:05:43,685][15401] Updated weights for policy 0, policy_version 572030 (0.0026) [2024-06-24 03:05:48,051][15401] Updated weights for policy 0, policy_version 572040 (0.0041) [2024-06-24 03:05:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9372303360. Throughput: 0: 42509.2. Samples: 9372479660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 03:05:51,512][15401] Updated weights for policy 0, policy_version 572050 (0.0024) [2024-06-24 03:05:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9372532736. Throughput: 0: 42616.4. Samples: 9372603160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:05:55,976][15401] Updated weights for policy 0, policy_version 572060 (0.0032) [2024-06-24 03:05:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9372745728. Throughput: 0: 42702.9. Samples: 9372863020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:05:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 03:05:59,177][15401] Updated weights for policy 0, policy_version 572070 (0.0036) [2024-06-24 03:06:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9372942336. Throughput: 0: 42659.5. Samples: 9373120320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:06:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 03:06:03,569][15401] Updated weights for policy 0, policy_version 572080 (0.0033) [2024-06-24 03:06:06,751][15401] Updated weights for policy 0, policy_version 572090 (0.0025) [2024-06-24 03:06:08,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42709.4). Total num frames: 9373171712. Throughput: 0: 42717.4. Samples: 9373246260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 03:06:08,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 03:06:11,166][15401] Updated weights for policy 0, policy_version 572100 (0.0042) [2024-06-24 03:06:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42601.6, 300 sec: 42653.9). Total num frames: 9373384704. Throughput: 0: 42898.2. Samples: 9373510140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:13,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 03:06:14,356][15401] Updated weights for policy 0, policy_version 572110 (0.0044) [2024-06-24 03:06:18,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 9373581312. Throughput: 0: 42629.4. Samples: 9373757640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:18,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-24 03:06:18,910][15401] Updated weights for policy 0, policy_version 572120 (0.0024) [2024-06-24 03:06:21,951][15401] Updated weights for policy 0, policy_version 572130 (0.0027) [2024-06-24 03:06:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9373827072. Throughput: 0: 42583.0. Samples: 9373881480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 03:06:26,350][15401] Updated weights for policy 0, policy_version 572140 (0.0036) [2024-06-24 03:06:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9374023680. Throughput: 0: 42615.0. Samples: 9374146440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:28,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 03:06:29,528][15401] Updated weights for policy 0, policy_version 572150 (0.0048) [2024-06-24 03:06:33,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9374203904. Throughput: 0: 42735.3. Samples: 9374402740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 03:06:33,683][15349] Signal inference workers to stop experience collection... (138800 times) [2024-06-24 03:06:33,728][15349] Signal inference workers to resume experience collection... (138800 times) [2024-06-24 03:06:33,738][15401] InferenceWorker_p0-w0: stopping experience collection (138800 times) [2024-06-24 03:06:33,773][15401] InferenceWorker_p0-w0: resuming experience collection (138800 times) [2024-06-24 03:06:33,857][15401] Updated weights for policy 0, policy_version 572160 (0.0033) [2024-06-24 03:06:37,660][15401] Updated weights for policy 0, policy_version 572170 (0.0032) [2024-06-24 03:06:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9374449664. Throughput: 0: 42695.4. Samples: 9374524460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 03:06:42,064][15401] Updated weights for policy 0, policy_version 572180 (0.0034) [2024-06-24 03:06:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 9374646272. Throughput: 0: 42561.4. Samples: 9374778380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:43,393][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 03:06:45,672][15401] Updated weights for policy 0, policy_version 572190 (0.0035) [2024-06-24 03:06:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9374859264. Throughput: 0: 42537.8. Samples: 9375034520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:48,395][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 03:06:49,948][15401] Updated weights for policy 0, policy_version 572200 (0.0042) [2024-06-24 03:06:53,346][15401] Updated weights for policy 0, policy_version 572210 (0.0036) [2024-06-24 03:06:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9375088640. Throughput: 0: 42506.4. Samples: 9375158940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 03:06:57,465][15401] Updated weights for policy 0, policy_version 572220 (0.0027) [2024-06-24 03:06:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9375285248. Throughput: 0: 42316.1. Samples: 9375414360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:06:58,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 03:07:01,059][15401] Updated weights for policy 0, policy_version 572230 (0.0032) [2024-06-24 03:07:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9375514624. Throughput: 0: 42428.3. Samples: 9375666920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 03:07:05,038][15401] Updated weights for policy 0, policy_version 572240 (0.0040) [2024-06-24 03:07:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42598.7). Total num frames: 9375711232. Throughput: 0: 42535.6. Samples: 9375795580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:08,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 03:07:08,743][15401] Updated weights for policy 0, policy_version 572250 (0.0038) [2024-06-24 03:07:12,511][15401] Updated weights for policy 0, policy_version 572260 (0.0039) [2024-06-24 03:07:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9375924224. Throughput: 0: 42447.2. Samples: 9376056560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 03:07:16,380][15401] Updated weights for policy 0, policy_version 572270 (0.0028) [2024-06-24 03:07:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9376153600. Throughput: 0: 42374.6. Samples: 9376309600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 03:07:20,285][15401] Updated weights for policy 0, policy_version 572280 (0.0031) [2024-06-24 03:07:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9376366592. Throughput: 0: 42589.5. Samples: 9376440980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 03:07:24,367][15401] Updated weights for policy 0, policy_version 572290 (0.0038) [2024-06-24 03:07:27,717][15401] Updated weights for policy 0, policy_version 572300 (0.0031) [2024-06-24 03:07:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 9376563200. Throughput: 0: 42661.4. Samples: 9376698140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:28,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 03:07:31,851][15401] Updated weights for policy 0, policy_version 572310 (0.0042) [2024-06-24 03:07:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9376792576. Throughput: 0: 42638.3. Samples: 9376953240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 03:07:35,445][15401] Updated weights for policy 0, policy_version 572320 (0.0043) [2024-06-24 03:07:38,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 9376989184. Throughput: 0: 42700.8. Samples: 9377080580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:38,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 03:07:39,732][15401] Updated weights for policy 0, policy_version 572330 (0.0038) [2024-06-24 03:07:42,839][15401] Updated weights for policy 0, policy_version 572340 (0.0030) [2024-06-24 03:07:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 9377234944. Throughput: 0: 42864.0. Samples: 9377343240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 03:07:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 03:07:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572341_9377234944.pth... [2024-06-24 03:07:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000571717_9367011328.pth [2024-06-24 03:07:47,144][15401] Updated weights for policy 0, policy_version 572350 (0.0036) [2024-06-24 03:07:48,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9377447936. Throughput: 0: 42899.3. Samples: 9377597380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:07:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 03:07:50,399][15401] Updated weights for policy 0, policy_version 572360 (0.0027) [2024-06-24 03:07:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 9377628160. Throughput: 0: 42806.6. Samples: 9377721880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:07:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 03:07:54,661][15401] Updated weights for policy 0, policy_version 572370 (0.0045) [2024-06-24 03:07:57,922][15401] Updated weights for policy 0, policy_version 572380 (0.0033) [2024-06-24 03:07:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 9377890304. Throughput: 0: 42861.8. Samples: 9377985340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:07:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 03:08:02,254][15401] Updated weights for policy 0, policy_version 572390 (0.0038) [2024-06-24 03:08:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9378070528. Throughput: 0: 42989.4. Samples: 9378244120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 03:08:05,701][15401] Updated weights for policy 0, policy_version 572400 (0.0021) [2024-06-24 03:08:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 9378283520. Throughput: 0: 42854.6. Samples: 9378369440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 03:08:09,848][15349] Signal inference workers to stop experience collection... (138850 times) [2024-06-24 03:08:09,903][15349] Signal inference workers to resume experience collection... (138850 times) [2024-06-24 03:08:09,903][15401] InferenceWorker_p0-w0: stopping experience collection (138850 times) [2024-06-24 03:08:09,920][15401] InferenceWorker_p0-w0: resuming experience collection (138850 times) [2024-06-24 03:08:10,042][15401] Updated weights for policy 0, policy_version 572410 (0.0028) [2024-06-24 03:08:13,310][15401] Updated weights for policy 0, policy_version 572420 (0.0038) [2024-06-24 03:08:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 9378529280. Throughput: 0: 42839.2. Samples: 9378625800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 03:08:17,828][15401] Updated weights for policy 0, policy_version 572430 (0.0048) [2024-06-24 03:08:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 9378725888. Throughput: 0: 42957.6. Samples: 9378886440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:18,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 03:08:20,817][15401] Updated weights for policy 0, policy_version 572440 (0.0036) [2024-06-24 03:08:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9378938880. Throughput: 0: 42841.5. Samples: 9379008340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 03:08:25,763][15401] Updated weights for policy 0, policy_version 572450 (0.0047) [2024-06-24 03:08:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 9379151872. Throughput: 0: 42700.0. Samples: 9379264740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 03:08:28,825][15401] Updated weights for policy 0, policy_version 572460 (0.0037) [2024-06-24 03:08:33,328][15401] Updated weights for policy 0, policy_version 572470 (0.0045) [2024-06-24 03:08:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9379348480. Throughput: 0: 42895.9. Samples: 9379527700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:33,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-24 03:08:36,387][15401] Updated weights for policy 0, policy_version 572480 (0.0025) [2024-06-24 03:08:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42543.8). Total num frames: 9379561472. Throughput: 0: 42791.3. Samples: 9379647480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 03:08:40,955][15401] Updated weights for policy 0, policy_version 572490 (0.0034) [2024-06-24 03:08:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9379790848. Throughput: 0: 42771.6. Samples: 9379910060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 03:08:43,924][15401] Updated weights for policy 0, policy_version 572500 (0.0030) [2024-06-24 03:08:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9379987456. Throughput: 0: 42859.5. Samples: 9380172800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 03:08:48,459][15401] Updated weights for policy 0, policy_version 572510 (0.0045) [2024-06-24 03:08:51,396][15401] Updated weights for policy 0, policy_version 572520 (0.0044) [2024-06-24 03:08:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 9380216832. Throughput: 0: 42784.1. Samples: 9380294720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 03:08:55,906][15401] Updated weights for policy 0, policy_version 572530 (0.0028) [2024-06-24 03:08:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9380429824. Throughput: 0: 43068.4. Samples: 9380563880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:08:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 03:08:59,057][15401] Updated weights for policy 0, policy_version 572540 (0.0046) [2024-06-24 03:09:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9380626432. Throughput: 0: 42963.3. Samples: 9380819680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:09:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 03:09:03,712][15401] Updated weights for policy 0, policy_version 572550 (0.0037) [2024-06-24 03:09:06,772][15401] Updated weights for policy 0, policy_version 572560 (0.0035) [2024-06-24 03:09:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9380872192. Throughput: 0: 42999.8. Samples: 9380943340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:09:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 03:09:11,319][15401] Updated weights for policy 0, policy_version 572570 (0.0043) [2024-06-24 03:09:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9381085184. Throughput: 0: 43063.0. Samples: 9381202580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 03:09:13,392][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 03:09:14,786][15401] Updated weights for policy 0, policy_version 572580 (0.0028) [2024-06-24 03:09:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9381281792. Throughput: 0: 42725.4. Samples: 9381450340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 03:09:18,887][15401] Updated weights for policy 0, policy_version 572590 (0.0038) [2024-06-24 03:09:22,353][15401] Updated weights for policy 0, policy_version 572600 (0.0042) [2024-06-24 03:09:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 9381511168. Throughput: 0: 42978.4. Samples: 9381581620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:23,393][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 03:09:26,761][15401] Updated weights for policy 0, policy_version 572610 (0.0028) [2024-06-24 03:09:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9381707776. Throughput: 0: 42829.0. Samples: 9381837360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 03:09:30,248][15401] Updated weights for policy 0, policy_version 572620 (0.0039) [2024-06-24 03:09:33,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9381920768. Throughput: 0: 42549.9. Samples: 9382087540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 03:09:34,365][15401] Updated weights for policy 0, policy_version 572630 (0.0037) [2024-06-24 03:09:38,048][15401] Updated weights for policy 0, policy_version 572640 (0.0046) [2024-06-24 03:09:38,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9382150144. Throughput: 0: 42733.1. Samples: 9382217720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 03:09:42,067][15401] Updated weights for policy 0, policy_version 572650 (0.0034) [2024-06-24 03:09:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9382330368. Throughput: 0: 42495.6. Samples: 9382476180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 03:09:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572653_9382346752.pth... [2024-06-24 03:09:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572029_9372123136.pth [2024-06-24 03:09:45,427][15349] Signal inference workers to stop experience collection... (138900 times) [2024-06-24 03:09:45,428][15349] Signal inference workers to resume experience collection... (138900 times) [2024-06-24 03:09:45,464][15401] InferenceWorker_p0-w0: stopping experience collection (138900 times) [2024-06-24 03:09:45,464][15401] InferenceWorker_p0-w0: resuming experience collection (138900 times) [2024-06-24 03:09:45,570][15401] Updated weights for policy 0, policy_version 572660 (0.0034) [2024-06-24 03:09:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9382576128. Throughput: 0: 42471.9. Samples: 9382730920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 03:09:49,721][15401] Updated weights for policy 0, policy_version 572670 (0.0028) [2024-06-24 03:09:53,067][15401] Updated weights for policy 0, policy_version 572680 (0.0026) [2024-06-24 03:09:53,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9382805504. Throughput: 0: 42724.9. Samples: 9382865960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 03:09:57,299][15401] Updated weights for policy 0, policy_version 572690 (0.0031) [2024-06-24 03:09:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9382969344. Throughput: 0: 42720.1. Samples: 9383124980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:09:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 03:10:00,527][15401] Updated weights for policy 0, policy_version 572700 (0.0047) [2024-06-24 03:10:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9383215104. Throughput: 0: 42821.8. Samples: 9383377320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 03:10:04,829][15401] Updated weights for policy 0, policy_version 572710 (0.0033) [2024-06-24 03:10:08,323][15401] Updated weights for policy 0, policy_version 572720 (0.0037) [2024-06-24 03:10:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.6, 300 sec: 42765.7). Total num frames: 9383444480. Throughput: 0: 42810.4. Samples: 9383507980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 03:10:12,322][15401] Updated weights for policy 0, policy_version 572730 (0.0025) [2024-06-24 03:10:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9383624704. Throughput: 0: 42688.8. Samples: 9383758360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 03:10:15,756][15401] Updated weights for policy 0, policy_version 572740 (0.0029) [2024-06-24 03:10:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9383870464. Throughput: 0: 42917.7. Samples: 9384018840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 03:10:20,146][15401] Updated weights for policy 0, policy_version 572750 (0.0033) [2024-06-24 03:10:23,308][15401] Updated weights for policy 0, policy_version 572760 (0.0030) [2024-06-24 03:10:23,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43144.6, 300 sec: 42764.7). Total num frames: 9384099840. Throughput: 0: 43043.6. Samples: 9384154780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:23,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 03:10:28,117][15401] Updated weights for policy 0, policy_version 572770 (0.0047) [2024-06-24 03:10:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9384263680. Throughput: 0: 42889.6. Samples: 9384406220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 03:10:30,992][15401] Updated weights for policy 0, policy_version 572780 (0.0040) [2024-06-24 03:10:33,391][15132] Fps is (10 sec: 40963.6, 60 sec: 43143.4, 300 sec: 42764.8). Total num frames: 9384509440. Throughput: 0: 42830.6. Samples: 9384658360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:33,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 03:10:35,857][15401] Updated weights for policy 0, policy_version 572790 (0.0035) [2024-06-24 03:10:38,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9384738816. Throughput: 0: 42821.3. Samples: 9384792920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 03:10:38,819][15401] Updated weights for policy 0, policy_version 572800 (0.0035) [2024-06-24 03:10:43,389][15132] Fps is (10 sec: 39327.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9384902656. Throughput: 0: 42891.5. Samples: 9385055100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 03:10:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 03:10:43,468][15401] Updated weights for policy 0, policy_version 572810 (0.0033) [2024-06-24 03:10:46,785][15401] Updated weights for policy 0, policy_version 572820 (0.0030) [2024-06-24 03:10:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9385164800. Throughput: 0: 42756.7. Samples: 9385301380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:10:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 03:10:50,950][15401] Updated weights for policy 0, policy_version 572830 (0.0032) [2024-06-24 03:10:53,202][15349] Signal inference workers to stop experience collection... (138950 times) [2024-06-24 03:10:53,202][15349] Signal inference workers to resume experience collection... (138950 times) [2024-06-24 03:10:53,253][15401] InferenceWorker_p0-w0: stopping experience collection (138950 times) [2024-06-24 03:10:53,254][15401] InferenceWorker_p0-w0: resuming experience collection (138950 times) [2024-06-24 03:10:53,392][15132] Fps is (10 sec: 47501.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 9385377792. Throughput: 0: 42933.6. Samples: 9385440100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:10:53,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 03:10:54,054][15401] Updated weights for policy 0, policy_version 572840 (0.0047) [2024-06-24 03:10:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9385558016. Throughput: 0: 43181.3. Samples: 9385701520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:10:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 03:10:58,499][15401] Updated weights for policy 0, policy_version 572850 (0.0033) [2024-06-24 03:11:01,540][15401] Updated weights for policy 0, policy_version 572860 (0.0032) [2024-06-24 03:11:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42876.5). Total num frames: 9385820160. Throughput: 0: 42925.0. Samples: 9385950460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:03,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 03:11:06,061][15401] Updated weights for policy 0, policy_version 572870 (0.0046) [2024-06-24 03:11:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9386000384. Throughput: 0: 42923.2. Samples: 9386086220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 03:11:09,514][15401] Updated weights for policy 0, policy_version 572880 (0.0033) [2024-06-24 03:11:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9386213376. Throughput: 0: 43078.8. Samples: 9386344760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 03:11:13,727][15401] Updated weights for policy 0, policy_version 572890 (0.0034) [2024-06-24 03:11:17,034][15401] Updated weights for policy 0, policy_version 572900 (0.0029) [2024-06-24 03:11:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9386475520. Throughput: 0: 42938.7. Samples: 9386590540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 03:11:21,351][15401] Updated weights for policy 0, policy_version 572910 (0.0032) [2024-06-24 03:11:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 9386639360. Throughput: 0: 43149.9. Samples: 9386734660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 03:11:24,569][15401] Updated weights for policy 0, policy_version 572920 (0.0040) [2024-06-24 03:11:28,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9386835968. Throughput: 0: 42824.3. Samples: 9386982200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 03:11:29,005][15401] Updated weights for policy 0, policy_version 572930 (0.0031) [2024-06-24 03:11:32,056][15401] Updated weights for policy 0, policy_version 572940 (0.0037) [2024-06-24 03:11:33,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43418.7, 300 sec: 42931.7). Total num frames: 9387114496. Throughput: 0: 43097.8. Samples: 9387240780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:33,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 03:11:36,509][15401] Updated weights for policy 0, policy_version 572950 (0.0029) [2024-06-24 03:11:38,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.8, 300 sec: 42876.1). Total num frames: 9387294720. Throughput: 0: 43090.7. Samples: 9387379180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:38,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 03:11:39,809][15401] Updated weights for policy 0, policy_version 572960 (0.0029) [2024-06-24 03:11:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9387507712. Throughput: 0: 42938.7. Samples: 9387633760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 03:11:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572968_9387507712.pth... [2024-06-24 03:11:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572341_9377234944.pth [2024-06-24 03:11:44,040][15401] Updated weights for policy 0, policy_version 572970 (0.0037) [2024-06-24 03:11:47,529][15401] Updated weights for policy 0, policy_version 572980 (0.0031) [2024-06-24 03:11:48,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9387753472. Throughput: 0: 43075.6. Samples: 9387888860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 03:11:51,698][15401] Updated weights for policy 0, policy_version 572990 (0.0042) [2024-06-24 03:11:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 9387933696. Throughput: 0: 43195.5. Samples: 9388030120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:53,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 03:11:55,100][15401] Updated weights for policy 0, policy_version 573000 (0.0029) [2024-06-24 03:11:58,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9388130304. Throughput: 0: 42847.1. Samples: 9388272880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:11:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 03:11:59,416][15401] Updated weights for policy 0, policy_version 573010 (0.0050) [2024-06-24 03:12:02,827][15401] Updated weights for policy 0, policy_version 573020 (0.0032) [2024-06-24 03:12:03,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9388376064. Throughput: 0: 43065.9. Samples: 9388528500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:12:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 03:12:06,959][15349] Signal inference workers to stop experience collection... (139000 times) [2024-06-24 03:12:06,959][15349] Signal inference workers to resume experience collection... (139000 times) [2024-06-24 03:12:06,998][15401] InferenceWorker_p0-w0: stopping experience collection (139000 times) [2024-06-24 03:12:06,999][15401] InferenceWorker_p0-w0: resuming experience collection (139000 times) [2024-06-24 03:12:07,093][15401] Updated weights for policy 0, policy_version 573030 (0.0030) [2024-06-24 03:12:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9388556288. Throughput: 0: 42947.9. Samples: 9388667320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:12:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 03:12:10,353][15401] Updated weights for policy 0, policy_version 573040 (0.0042) [2024-06-24 03:12:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9388785664. Throughput: 0: 42879.6. Samples: 9388911780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:12:13,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 03:12:15,107][15401] Updated weights for policy 0, policy_version 573050 (0.0043) [2024-06-24 03:12:17,919][15401] Updated weights for policy 0, policy_version 573060 (0.0021) [2024-06-24 03:12:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9389015040. Throughput: 0: 42898.3. Samples: 9389171200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 03:12:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 03:12:22,735][15401] Updated weights for policy 0, policy_version 573070 (0.0022) [2024-06-24 03:12:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9389195264. Throughput: 0: 42816.1. Samples: 9389305800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 03:12:25,531][15401] Updated weights for policy 0, policy_version 573080 (0.0025) [2024-06-24 03:12:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 9389441024. Throughput: 0: 42745.9. Samples: 9389557320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:28,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 03:12:30,196][15401] Updated weights for policy 0, policy_version 573090 (0.0035) [2024-06-24 03:12:33,371][15401] Updated weights for policy 0, policy_version 573100 (0.0033) [2024-06-24 03:12:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 9389670400. Throughput: 0: 42804.5. Samples: 9389815060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:12:37,727][15401] Updated weights for policy 0, policy_version 573110 (0.0046) [2024-06-24 03:12:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9389850624. Throughput: 0: 42534.7. Samples: 9389944080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 03:12:41,008][15401] Updated weights for policy 0, policy_version 573120 (0.0033) [2024-06-24 03:12:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9390080000. Throughput: 0: 42825.4. Samples: 9390200020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 03:12:45,262][15401] Updated weights for policy 0, policy_version 573130 (0.0041) [2024-06-24 03:12:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 9390292992. Throughput: 0: 42704.8. Samples: 9390450220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 03:12:49,119][15401] Updated weights for policy 0, policy_version 573140 (0.0036) [2024-06-24 03:12:52,996][15401] Updated weights for policy 0, policy_version 573150 (0.0029) [2024-06-24 03:12:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 9390489600. Throughput: 0: 42390.7. Samples: 9390574900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 03:12:56,606][15401] Updated weights for policy 0, policy_version 573160 (0.0026) [2024-06-24 03:12:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9390735360. Throughput: 0: 42683.2. Samples: 9390832520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:12:58,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 03:13:00,586][15401] Updated weights for policy 0, policy_version 573170 (0.0040) [2024-06-24 03:13:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9390915584. Throughput: 0: 42749.3. Samples: 9391094920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 03:13:04,230][15401] Updated weights for policy 0, policy_version 573180 (0.0036) [2024-06-24 03:13:08,221][15401] Updated weights for policy 0, policy_version 573190 (0.0027) [2024-06-24 03:13:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 9391144960. Throughput: 0: 42418.3. Samples: 9391214620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 03:13:11,843][15401] Updated weights for policy 0, policy_version 573200 (0.0031) [2024-06-24 03:13:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42876.5). Total num frames: 9391374336. Throughput: 0: 42691.1. Samples: 9391478420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 03:13:15,506][15349] Signal inference workers to stop experience collection... (139050 times) [2024-06-24 03:13:15,507][15349] Signal inference workers to resume experience collection... (139050 times) [2024-06-24 03:13:15,554][15401] InferenceWorker_p0-w0: stopping experience collection (139050 times) [2024-06-24 03:13:15,554][15401] InferenceWorker_p0-w0: resuming experience collection (139050 times) [2024-06-24 03:13:15,656][15401] Updated weights for policy 0, policy_version 573210 (0.0038) [2024-06-24 03:13:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 9391570944. Throughput: 0: 42617.1. Samples: 9391732840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 03:13:19,760][15401] Updated weights for policy 0, policy_version 573220 (0.0033) [2024-06-24 03:13:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9391767552. Throughput: 0: 42534.6. Samples: 9391858140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:23,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 03:13:23,810][15401] Updated weights for policy 0, policy_version 573230 (0.0036) [2024-06-24 03:13:27,358][15401] Updated weights for policy 0, policy_version 573240 (0.0026) [2024-06-24 03:13:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9391996928. Throughput: 0: 42534.2. Samples: 9392114060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:28,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 03:13:31,450][15401] Updated weights for policy 0, policy_version 573250 (0.0031) [2024-06-24 03:13:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 9392193536. Throughput: 0: 42796.8. Samples: 9392376080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 03:13:34,956][15401] Updated weights for policy 0, policy_version 573260 (0.0034) [2024-06-24 03:13:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9392406528. Throughput: 0: 42756.5. Samples: 9392498940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:38,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 03:13:39,110][15401] Updated weights for policy 0, policy_version 573270 (0.0037) [2024-06-24 03:13:42,471][15401] Updated weights for policy 0, policy_version 573280 (0.0028) [2024-06-24 03:13:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9392635904. Throughput: 0: 42754.3. Samples: 9392756460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 03:13:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573282_9392652288.pth... [2024-06-24 03:13:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572653_9382346752.pth [2024-06-24 03:13:46,719][15401] Updated weights for policy 0, policy_version 573290 (0.0039) [2024-06-24 03:13:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9392816128. Throughput: 0: 42730.2. Samples: 9393017780. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-24 03:13:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 03:13:50,222][15401] Updated weights for policy 0, policy_version 573300 (0.0035) [2024-06-24 03:13:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9393045504. Throughput: 0: 42729.3. Samples: 9393137440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:13:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 03:13:54,477][15401] Updated weights for policy 0, policy_version 573310 (0.0034) [2024-06-24 03:13:57,784][15401] Updated weights for policy 0, policy_version 573320 (0.0025) [2024-06-24 03:13:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9393274880. Throughput: 0: 42779.6. Samples: 9393403500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:13:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 03:14:01,959][15401] Updated weights for policy 0, policy_version 573330 (0.0029) [2024-06-24 03:14:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9393471488. Throughput: 0: 42758.7. Samples: 9393657080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:03,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 03:14:05,729][15401] Updated weights for policy 0, policy_version 573340 (0.0039) [2024-06-24 03:14:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9393700864. Throughput: 0: 42754.3. Samples: 9393782080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 03:14:09,564][15401] Updated weights for policy 0, policy_version 573350 (0.0030) [2024-06-24 03:14:13,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9393913856. Throughput: 0: 42806.7. Samples: 9394040360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 03:14:13,473][15401] Updated weights for policy 0, policy_version 573360 (0.0040) [2024-06-24 03:14:17,534][15401] Updated weights for policy 0, policy_version 573370 (0.0031) [2024-06-24 03:14:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 9394110464. Throughput: 0: 42704.6. Samples: 9394297780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:18,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 03:14:21,383][15401] Updated weights for policy 0, policy_version 573380 (0.0033) [2024-06-24 03:14:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9394356224. Throughput: 0: 42877.4. Samples: 9394428420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 03:14:25,065][15401] Updated weights for policy 0, policy_version 573390 (0.0037) [2024-06-24 03:14:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9394552832. Throughput: 0: 42704.8. Samples: 9394678180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 03:14:28,906][15401] Updated weights for policy 0, policy_version 573400 (0.0036) [2024-06-24 03:14:32,690][15401] Updated weights for policy 0, policy_version 573410 (0.0047) [2024-06-24 03:14:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9394765824. Throughput: 0: 42609.7. Samples: 9394935220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 03:14:33,896][15349] Signal inference workers to stop experience collection... (139100 times) [2024-06-24 03:14:33,946][15401] InferenceWorker_p0-w0: stopping experience collection (139100 times) [2024-06-24 03:14:34,007][15349] Signal inference workers to resume experience collection... (139100 times) [2024-06-24 03:14:34,007][15401] InferenceWorker_p0-w0: resuming experience collection (139100 times) [2024-06-24 03:14:36,338][15401] Updated weights for policy 0, policy_version 573420 (0.0028) [2024-06-24 03:14:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9394995200. Throughput: 0: 42799.5. Samples: 9395063420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 03:14:40,299][15401] Updated weights for policy 0, policy_version 573430 (0.0034) [2024-06-24 03:14:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9395208192. Throughput: 0: 42556.3. Samples: 9395318540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 03:14:44,026][15401] Updated weights for policy 0, policy_version 573440 (0.0035) [2024-06-24 03:14:48,374][15401] Updated weights for policy 0, policy_version 573450 (0.0043) [2024-06-24 03:14:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9395404800. Throughput: 0: 42801.9. Samples: 9395583060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 03:14:51,505][15401] Updated weights for policy 0, policy_version 573460 (0.0046) [2024-06-24 03:14:53,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.9, 300 sec: 42875.1). Total num frames: 9395617792. Throughput: 0: 42746.8. Samples: 9395705960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:53,397][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 03:14:56,154][15401] Updated weights for policy 0, policy_version 573470 (0.0032) [2024-06-24 03:14:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9395847168. Throughput: 0: 42777.7. Samples: 9395965360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:14:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 03:14:59,041][15401] Updated weights for policy 0, policy_version 573480 (0.0033) [2024-06-24 03:15:03,389][15132] Fps is (10 sec: 40986.9, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 9396027392. Throughput: 0: 42856.1. Samples: 9396226300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:15:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 03:15:03,678][15401] Updated weights for policy 0, policy_version 573490 (0.0036) [2024-06-24 03:15:06,562][15401] Updated weights for policy 0, policy_version 573500 (0.0034) [2024-06-24 03:15:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9396273152. Throughput: 0: 42651.9. Samples: 9396347760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:15:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 03:15:11,155][15401] Updated weights for policy 0, policy_version 573510 (0.0032) [2024-06-24 03:15:13,390][15132] Fps is (10 sec: 47512.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9396502528. Throughput: 0: 43043.9. Samples: 9396615160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:15:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 03:15:14,167][15401] Updated weights for policy 0, policy_version 573520 (0.0039) [2024-06-24 03:15:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 9396682752. Throughput: 0: 42980.6. Samples: 9396869340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:15:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 03:15:18,919][15401] Updated weights for policy 0, policy_version 573530 (0.0031) [2024-06-24 03:15:21,778][15401] Updated weights for policy 0, policy_version 573540 (0.0034) [2024-06-24 03:15:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9396928512. Throughput: 0: 42756.4. Samples: 9396987460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 03:15:23,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 03:15:26,737][15401] Updated weights for policy 0, policy_version 573550 (0.0046) [2024-06-24 03:15:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 9397141504. Throughput: 0: 43131.9. Samples: 9397259480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 03:15:29,298][15401] Updated weights for policy 0, policy_version 573560 (0.0032) [2024-06-24 03:15:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9397338112. Throughput: 0: 42885.8. Samples: 9397512920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 03:15:34,179][15401] Updated weights for policy 0, policy_version 573570 (0.0024) [2024-06-24 03:15:36,908][15401] Updated weights for policy 0, policy_version 573580 (0.0038) [2024-06-24 03:15:38,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 9397583872. Throughput: 0: 42918.1. Samples: 9397637100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:38,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 03:15:41,537][15401] Updated weights for policy 0, policy_version 573590 (0.0043) [2024-06-24 03:15:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9397780480. Throughput: 0: 43010.3. Samples: 9397900820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 03:15:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573596_9397796864.pth... [2024-06-24 03:15:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000572968_9387507712.pth [2024-06-24 03:15:44,504][15401] Updated weights for policy 0, policy_version 573600 (0.0030) [2024-06-24 03:15:48,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9397977088. Throughput: 0: 43038.6. Samples: 9398163040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 03:15:49,003][15401] Updated weights for policy 0, policy_version 573610 (0.0032) [2024-06-24 03:15:52,082][15401] Updated weights for policy 0, policy_version 573620 (0.0028) [2024-06-24 03:15:52,856][15349] Signal inference workers to stop experience collection... (139150 times) [2024-06-24 03:15:52,859][15349] Signal inference workers to resume experience collection... (139150 times) [2024-06-24 03:15:52,876][15401] InferenceWorker_p0-w0: stopping experience collection (139150 times) [2024-06-24 03:15:52,876][15401] InferenceWorker_p0-w0: resuming experience collection (139150 times) [2024-06-24 03:15:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43695.3, 300 sec: 42987.2). Total num frames: 9398239232. Throughput: 0: 43138.3. Samples: 9398288980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 03:15:56,740][15401] Updated weights for policy 0, policy_version 573630 (0.0030) [2024-06-24 03:15:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 9398419456. Throughput: 0: 43045.8. Samples: 9398552220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:15:58,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 03:15:59,658][15401] Updated weights for policy 0, policy_version 573640 (0.0037) [2024-06-24 03:16:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 9398632448. Throughput: 0: 43256.4. Samples: 9398815880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 03:16:04,049][15401] Updated weights for policy 0, policy_version 573650 (0.0030) [2024-06-24 03:16:07,426][15401] Updated weights for policy 0, policy_version 573660 (0.0030) [2024-06-24 03:16:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 9398894592. Throughput: 0: 43353.7. Samples: 9398938380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 03:16:12,084][15401] Updated weights for policy 0, policy_version 573670 (0.0033) [2024-06-24 03:16:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9399074816. Throughput: 0: 43019.2. Samples: 9399195340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 03:16:15,115][15401] Updated weights for policy 0, policy_version 573680 (0.0040) [2024-06-24 03:16:18,390][15132] Fps is (10 sec: 37683.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9399271424. Throughput: 0: 43141.8. Samples: 9399454300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 03:16:19,614][15401] Updated weights for policy 0, policy_version 573690 (0.0024) [2024-06-24 03:16:22,786][15401] Updated weights for policy 0, policy_version 573700 (0.0037) [2024-06-24 03:16:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 9399533568. Throughput: 0: 43276.1. Samples: 9399584420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:23,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 03:16:27,029][15401] Updated weights for policy 0, policy_version 573710 (0.0035) [2024-06-24 03:16:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9399713792. Throughput: 0: 43205.7. Samples: 9399845080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 03:16:30,370][15401] Updated weights for policy 0, policy_version 573720 (0.0032) [2024-06-24 03:16:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 9399943168. Throughput: 0: 43151.5. Samples: 9400104860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:33,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 03:16:34,453][15401] Updated weights for policy 0, policy_version 573730 (0.0028) [2024-06-24 03:16:37,947][15401] Updated weights for policy 0, policy_version 573740 (0.0030) [2024-06-24 03:16:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43144.5, 300 sec: 42931.3). Total num frames: 9400172544. Throughput: 0: 43297.3. Samples: 9400237460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:38,393][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 03:16:42,071][15401] Updated weights for policy 0, policy_version 573750 (0.0040) [2024-06-24 03:16:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9400352768. Throughput: 0: 42999.3. Samples: 9400487180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 03:16:45,465][15401] Updated weights for policy 0, policy_version 573760 (0.0028) [2024-06-24 03:16:48,390][15132] Fps is (10 sec: 39330.6, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 9400565760. Throughput: 0: 42927.4. Samples: 9400747620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:48,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 03:16:49,715][15401] Updated weights for policy 0, policy_version 573770 (0.0036) [2024-06-24 03:16:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 9400795136. Throughput: 0: 43058.0. Samples: 9400875980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 03:16:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 03:16:53,479][15401] Updated weights for policy 0, policy_version 573780 (0.0033) [2024-06-24 03:16:57,189][15401] Updated weights for policy 0, policy_version 573790 (0.0030) [2024-06-24 03:16:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9401008128. Throughput: 0: 43002.7. Samples: 9401130460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:16:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 03:17:01,098][15401] Updated weights for policy 0, policy_version 573800 (0.0029) [2024-06-24 03:17:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9401221120. Throughput: 0: 43060.8. Samples: 9401392040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 03:17:05,203][15401] Updated weights for policy 0, policy_version 573810 (0.0024) [2024-06-24 03:17:05,820][15349] Signal inference workers to stop experience collection... (139200 times) [2024-06-24 03:17:05,820][15349] Signal inference workers to resume experience collection... (139200 times) [2024-06-24 03:17:05,839][15401] InferenceWorker_p0-w0: stopping experience collection (139200 times) [2024-06-24 03:17:05,839][15401] InferenceWorker_p0-w0: resuming experience collection (139200 times) [2024-06-24 03:17:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9401434112. Throughput: 0: 42907.1. Samples: 9401515240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 03:17:08,602][15401] Updated weights for policy 0, policy_version 573820 (0.0030) [2024-06-24 03:17:12,586][15401] Updated weights for policy 0, policy_version 573830 (0.0022) [2024-06-24 03:17:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9401663488. Throughput: 0: 42885.0. Samples: 9401774900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 03:17:16,208][15401] Updated weights for policy 0, policy_version 573840 (0.0042) [2024-06-24 03:17:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 9401876480. Throughput: 0: 42951.9. Samples: 9402037800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:18,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 03:17:20,194][15401] Updated weights for policy 0, policy_version 573850 (0.0039) [2024-06-24 03:17:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9402089472. Throughput: 0: 42758.3. Samples: 9402161480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:23,399][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 03:17:24,066][15401] Updated weights for policy 0, policy_version 573860 (0.0031) [2024-06-24 03:17:27,677][15401] Updated weights for policy 0, policy_version 573870 (0.0033) [2024-06-24 03:17:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9402302464. Throughput: 0: 42866.5. Samples: 9402416180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 03:17:31,663][15401] Updated weights for policy 0, policy_version 573880 (0.0028) [2024-06-24 03:17:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 9402531840. Throughput: 0: 42877.3. Samples: 9402677100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 03:17:35,470][15401] Updated weights for policy 0, policy_version 573890 (0.0029) [2024-06-24 03:17:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 9402712064. Throughput: 0: 42880.0. Samples: 9402805580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 03:17:39,222][15401] Updated weights for policy 0, policy_version 573900 (0.0027) [2024-06-24 03:17:43,074][15401] Updated weights for policy 0, policy_version 573910 (0.0040) [2024-06-24 03:17:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9402941440. Throughput: 0: 42899.5. Samples: 9403060940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 03:17:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573910_9402941440.pth... [2024-06-24 03:17:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573282_9392652288.pth [2024-06-24 03:17:47,200][15401] Updated weights for policy 0, policy_version 573920 (0.0026) [2024-06-24 03:17:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9403170816. Throughput: 0: 42656.1. Samples: 9403311560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 03:17:51,211][15401] Updated weights for policy 0, policy_version 573930 (0.0036) [2024-06-24 03:17:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9403351040. Throughput: 0: 42977.7. Samples: 9403449240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 03:17:54,759][15401] Updated weights for policy 0, policy_version 573940 (0.0025) [2024-06-24 03:17:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9403564032. Throughput: 0: 42826.5. Samples: 9403702100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:17:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 03:17:58,692][15401] Updated weights for policy 0, policy_version 573950 (0.0034) [2024-06-24 03:18:02,358][15401] Updated weights for policy 0, policy_version 573960 (0.0035) [2024-06-24 03:18:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9403809792. Throughput: 0: 42710.3. Samples: 9403959660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:18:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 03:18:06,284][15401] Updated weights for policy 0, policy_version 573970 (0.0038) [2024-06-24 03:18:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9404006400. Throughput: 0: 42836.0. Samples: 9404089100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:18:08,404][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 03:18:10,099][15401] Updated weights for policy 0, policy_version 573980 (0.0033) [2024-06-24 03:18:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 9404235776. Throughput: 0: 42829.4. Samples: 9404343500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:18:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 03:18:13,717][15401] Updated weights for policy 0, policy_version 573990 (0.0038) [2024-06-24 03:18:17,564][15401] Updated weights for policy 0, policy_version 574000 (0.0038) [2024-06-24 03:18:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 9404432384. Throughput: 0: 42918.4. Samples: 9404608420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:18:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 03:18:21,271][15401] Updated weights for policy 0, policy_version 574010 (0.0023) [2024-06-24 03:18:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9404645376. Throughput: 0: 42844.4. Samples: 9404733580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 03:18:23,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 03:18:25,549][15401] Updated weights for policy 0, policy_version 574020 (0.0037) [2024-06-24 03:18:26,375][15349] Signal inference workers to stop experience collection... (139250 times) [2024-06-24 03:18:26,379][15349] Signal inference workers to resume experience collection... (139250 times) [2024-06-24 03:18:26,401][15401] InferenceWorker_p0-w0: stopping experience collection (139250 times) [2024-06-24 03:18:26,402][15401] InferenceWorker_p0-w0: resuming experience collection (139250 times) [2024-06-24 03:18:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 9404874752. Throughput: 0: 42656.9. Samples: 9404980500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 03:18:29,003][15401] Updated weights for policy 0, policy_version 574030 (0.0038) [2024-06-24 03:18:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42876.1). Total num frames: 9405054976. Throughput: 0: 42941.4. Samples: 9405243920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 03:18:33,519][15401] Updated weights for policy 0, policy_version 574040 (0.0043) [2024-06-24 03:18:36,621][15401] Updated weights for policy 0, policy_version 574050 (0.0037) [2024-06-24 03:18:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9405267968. Throughput: 0: 42514.6. Samples: 9405362400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:38,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 03:18:41,256][15401] Updated weights for policy 0, policy_version 574060 (0.0028) [2024-06-24 03:18:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 9405513728. Throughput: 0: 42566.3. Samples: 9405617580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:18:44,295][15401] Updated weights for policy 0, policy_version 574070 (0.0029) [2024-06-24 03:18:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42876.1). Total num frames: 9405693952. Throughput: 0: 42669.9. Samples: 9405879800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 03:18:48,829][15401] Updated weights for policy 0, policy_version 574080 (0.0034) [2024-06-24 03:18:52,091][15401] Updated weights for policy 0, policy_version 574090 (0.0032) [2024-06-24 03:18:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9405906944. Throughput: 0: 42542.2. Samples: 9406003500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 03:18:56,471][15401] Updated weights for policy 0, policy_version 574100 (0.0033) [2024-06-24 03:18:58,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.9, 300 sec: 42987.2). Total num frames: 9406152704. Throughput: 0: 42459.5. Samples: 9406254280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:18:58,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 03:18:59,965][15401] Updated weights for policy 0, policy_version 574110 (0.0042) [2024-06-24 03:19:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 9406349312. Throughput: 0: 42388.8. Samples: 9406516020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:03,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 03:19:04,014][15401] Updated weights for policy 0, policy_version 574120 (0.0037) [2024-06-24 03:19:07,529][15401] Updated weights for policy 0, policy_version 574130 (0.0032) [2024-06-24 03:19:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9406545920. Throughput: 0: 42409.0. Samples: 9406641980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:08,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 03:19:11,547][15401] Updated weights for policy 0, policy_version 574140 (0.0032) [2024-06-24 03:19:13,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9406791680. Throughput: 0: 42656.9. Samples: 9406900060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:13,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 03:19:15,354][15401] Updated weights for policy 0, policy_version 574150 (0.0043) [2024-06-24 03:19:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9406988288. Throughput: 0: 42622.7. Samples: 9407161940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 03:19:19,310][15401] Updated weights for policy 0, policy_version 574160 (0.0036) [2024-06-24 03:19:22,971][15401] Updated weights for policy 0, policy_version 574170 (0.0034) [2024-06-24 03:19:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9407201280. Throughput: 0: 42626.3. Samples: 9407280580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 03:19:27,147][15401] Updated weights for policy 0, policy_version 574180 (0.0029) [2024-06-24 03:19:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9407447040. Throughput: 0: 42785.3. Samples: 9407542920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 03:19:29,692][15349] Signal inference workers to stop experience collection... (139300 times) [2024-06-24 03:19:29,692][15349] Signal inference workers to resume experience collection... (139300 times) [2024-06-24 03:19:29,724][15401] InferenceWorker_p0-w0: stopping experience collection (139300 times) [2024-06-24 03:19:29,724][15401] InferenceWorker_p0-w0: resuming experience collection (139300 times) [2024-06-24 03:19:30,531][15401] Updated weights for policy 0, policy_version 574190 (0.0040) [2024-06-24 03:19:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9407627264. Throughput: 0: 42571.0. Samples: 9407795500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 03:19:34,826][15401] Updated weights for policy 0, policy_version 574200 (0.0033) [2024-06-24 03:19:38,388][15401] Updated weights for policy 0, policy_version 574210 (0.0055) [2024-06-24 03:19:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9407856640. Throughput: 0: 42467.9. Samples: 9407914560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 03:19:42,592][15401] Updated weights for policy 0, policy_version 574220 (0.0039) [2024-06-24 03:19:43,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 9408069632. Throughput: 0: 42821.8. Samples: 9408181260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:43,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 03:19:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574224_9408086016.pth... [2024-06-24 03:19:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573596_9397796864.pth [2024-06-24 03:19:46,154][15401] Updated weights for policy 0, policy_version 574230 (0.0032) [2024-06-24 03:19:48,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 9408266240. Throughput: 0: 42629.8. Samples: 9408434260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 03:19:50,286][15401] Updated weights for policy 0, policy_version 574240 (0.0035) [2024-06-24 03:19:53,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 9408479232. Throughput: 0: 42583.9. Samples: 9408558360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:53,393][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 03:19:53,761][15401] Updated weights for policy 0, policy_version 574250 (0.0036) [2024-06-24 03:19:58,035][15401] Updated weights for policy 0, policy_version 574260 (0.0036) [2024-06-24 03:19:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42326.9, 300 sec: 42931.6). Total num frames: 9408692224. Throughput: 0: 42867.4. Samples: 9408829100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 03:19:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 03:20:01,961][15401] Updated weights for policy 0, policy_version 574270 (0.0022) [2024-06-24 03:20:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9408905216. Throughput: 0: 42435.1. Samples: 9409071520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 03:20:05,838][15401] Updated weights for policy 0, policy_version 574280 (0.0035) [2024-06-24 03:20:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 9409134592. Throughput: 0: 42712.0. Samples: 9409202620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:08,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 03:20:09,462][15401] Updated weights for policy 0, policy_version 574290 (0.0030) [2024-06-24 03:20:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 9409314816. Throughput: 0: 42642.1. Samples: 9409461820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 03:20:13,478][15401] Updated weights for policy 0, policy_version 574300 (0.0037) [2024-06-24 03:20:16,934][15401] Updated weights for policy 0, policy_version 574310 (0.0033) [2024-06-24 03:20:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9409576960. Throughput: 0: 42639.1. Samples: 9409714260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 03:20:20,951][15401] Updated weights for policy 0, policy_version 574320 (0.0034) [2024-06-24 03:20:23,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9409789952. Throughput: 0: 42954.4. Samples: 9409847500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 03:20:24,515][15401] Updated weights for policy 0, policy_version 574330 (0.0031) [2024-06-24 03:20:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 9409953792. Throughput: 0: 42774.7. Samples: 9410106020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 03:20:28,790][15401] Updated weights for policy 0, policy_version 574340 (0.0042) [2024-06-24 03:20:32,055][15401] Updated weights for policy 0, policy_version 574350 (0.0036) [2024-06-24 03:20:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9410199552. Throughput: 0: 42798.6. Samples: 9410360200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 03:20:35,995][15349] Signal inference workers to stop experience collection... (139350 times) [2024-06-24 03:20:36,043][15401] InferenceWorker_p0-w0: stopping experience collection (139350 times) [2024-06-24 03:20:36,103][15349] Signal inference workers to resume experience collection... (139350 times) [2024-06-24 03:20:36,104][15401] InferenceWorker_p0-w0: resuming experience collection (139350 times) [2024-06-24 03:20:36,266][15401] Updated weights for policy 0, policy_version 574360 (0.0029) [2024-06-24 03:20:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9410428928. Throughput: 0: 43070.2. Samples: 9410496420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 03:20:39,619][15401] Updated weights for policy 0, policy_version 574370 (0.0030) [2024-06-24 03:20:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 9410609152. Throughput: 0: 42718.8. Samples: 9410751440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 03:20:43,772][15401] Updated weights for policy 0, policy_version 574380 (0.0025) [2024-06-24 03:20:47,222][15401] Updated weights for policy 0, policy_version 574390 (0.0039) [2024-06-24 03:20:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9410838528. Throughput: 0: 42908.0. Samples: 9411002380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 03:20:51,394][15401] Updated weights for policy 0, policy_version 574400 (0.0030) [2024-06-24 03:20:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 9411067904. Throughput: 0: 42971.2. Samples: 9411136320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 03:20:55,054][15401] Updated weights for policy 0, policy_version 574410 (0.0034) [2024-06-24 03:20:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9411248128. Throughput: 0: 42839.3. Samples: 9411389580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:20:58,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 03:20:58,886][15401] Updated weights for policy 0, policy_version 574420 (0.0038) [2024-06-24 03:21:02,497][15401] Updated weights for policy 0, policy_version 574430 (0.0033) [2024-06-24 03:21:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9411477504. Throughput: 0: 42849.5. Samples: 9411642480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 03:21:06,776][15401] Updated weights for policy 0, policy_version 574440 (0.0038) [2024-06-24 03:21:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9411706880. Throughput: 0: 42854.6. Samples: 9411775960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 03:21:10,203][15401] Updated weights for policy 0, policy_version 574450 (0.0035) [2024-06-24 03:21:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9411887104. Throughput: 0: 42692.4. Samples: 9412027180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 03:21:14,415][15401] Updated weights for policy 0, policy_version 574460 (0.0037) [2024-06-24 03:21:17,764][15401] Updated weights for policy 0, policy_version 574470 (0.0023) [2024-06-24 03:21:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9412116480. Throughput: 0: 42631.5. Samples: 9412278620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 03:21:21,985][15401] Updated weights for policy 0, policy_version 574480 (0.0040) [2024-06-24 03:21:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9412329472. Throughput: 0: 42581.5. Samples: 9412412580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 03:21:25,859][15401] Updated weights for policy 0, policy_version 574490 (0.0030) [2024-06-24 03:21:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9412526080. Throughput: 0: 42531.0. Samples: 9412665340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 03:21:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 03:21:29,697][15401] Updated weights for policy 0, policy_version 574500 (0.0029) [2024-06-24 03:21:33,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42593.9, 300 sec: 42653.4). Total num frames: 9412755456. Throughput: 0: 42607.2. Samples: 9412919980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:33,396][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 03:21:33,586][15401] Updated weights for policy 0, policy_version 574510 (0.0034) [2024-06-24 03:21:37,340][15401] Updated weights for policy 0, policy_version 574520 (0.0033) [2024-06-24 03:21:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9412984832. Throughput: 0: 42544.8. Samples: 9413050840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 03:21:41,234][15401] Updated weights for policy 0, policy_version 574530 (0.0033) [2024-06-24 03:21:43,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9413181440. Throughput: 0: 42626.1. Samples: 9413307760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 03:21:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574535_9413181440.pth... [2024-06-24 03:21:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000573910_9402941440.pth [2024-06-24 03:21:44,915][15401] Updated weights for policy 0, policy_version 574540 (0.0030) [2024-06-24 03:21:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9413410816. Throughput: 0: 42614.7. Samples: 9413560140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 03:21:48,836][15401] Updated weights for policy 0, policy_version 574550 (0.0031) [2024-06-24 03:21:52,694][15401] Updated weights for policy 0, policy_version 574560 (0.0038) [2024-06-24 03:21:53,302][15349] Signal inference workers to stop experience collection... (139400 times) [2024-06-24 03:21:53,308][15349] Signal inference workers to resume experience collection... (139400 times) [2024-06-24 03:21:53,350][15401] InferenceWorker_p0-w0: stopping experience collection (139400 times) [2024-06-24 03:21:53,350][15401] InferenceWorker_p0-w0: resuming experience collection (139400 times) [2024-06-24 03:21:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9413623808. Throughput: 0: 42599.7. Samples: 9413692940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:53,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 03:21:57,027][15401] Updated weights for policy 0, policy_version 574570 (0.0037) [2024-06-24 03:21:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9413836800. Throughput: 0: 42777.7. Samples: 9413952180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:21:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 03:22:00,273][15401] Updated weights for policy 0, policy_version 574580 (0.0031) [2024-06-24 03:22:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9414033408. Throughput: 0: 42841.8. Samples: 9414206500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:03,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 03:22:04,612][15401] Updated weights for policy 0, policy_version 574590 (0.0032) [2024-06-24 03:22:07,793][15401] Updated weights for policy 0, policy_version 574600 (0.0030) [2024-06-24 03:22:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9414246400. Throughput: 0: 42598.7. Samples: 9414329520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 03:22:12,099][15401] Updated weights for policy 0, policy_version 574610 (0.0027) [2024-06-24 03:22:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 9414459392. Throughput: 0: 42851.2. Samples: 9414593640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 03:22:15,321][15401] Updated weights for policy 0, policy_version 574620 (0.0038) [2024-06-24 03:22:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9414672384. Throughput: 0: 42924.3. Samples: 9414851300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:18,395][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 03:22:19,629][15401] Updated weights for policy 0, policy_version 574630 (0.0029) [2024-06-24 03:22:23,016][15401] Updated weights for policy 0, policy_version 574640 (0.0033) [2024-06-24 03:22:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9414901760. Throughput: 0: 42750.7. Samples: 9414974620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:23,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 03:22:27,193][15401] Updated weights for policy 0, policy_version 574650 (0.0036) [2024-06-24 03:22:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 9415114752. Throughput: 0: 42819.7. Samples: 9415234640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 03:22:31,034][15401] Updated weights for policy 0, policy_version 574660 (0.0042) [2024-06-24 03:22:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 9415311360. Throughput: 0: 42867.9. Samples: 9415489200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:33,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 03:22:34,750][15401] Updated weights for policy 0, policy_version 574670 (0.0031) [2024-06-24 03:22:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9415524352. Throughput: 0: 42657.6. Samples: 9415612540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:38,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 03:22:38,609][15401] Updated weights for policy 0, policy_version 574680 (0.0041) [2024-06-24 03:22:42,568][15401] Updated weights for policy 0, policy_version 574690 (0.0035) [2024-06-24 03:22:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9415753728. Throughput: 0: 42680.1. Samples: 9415872780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 03:22:46,238][15401] Updated weights for policy 0, policy_version 574700 (0.0029) [2024-06-24 03:22:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9415966720. Throughput: 0: 42619.1. Samples: 9416124360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 03:22:50,371][15401] Updated weights for policy 0, policy_version 574710 (0.0044) [2024-06-24 03:22:53,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42320.7, 300 sec: 42708.6). Total num frames: 9416163328. Throughput: 0: 42656.4. Samples: 9416249340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:53,405][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 03:22:54,077][15401] Updated weights for policy 0, policy_version 574720 (0.0043) [2024-06-24 03:22:58,051][15401] Updated weights for policy 0, policy_version 574730 (0.0039) [2024-06-24 03:22:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9416376320. Throughput: 0: 42417.2. Samples: 9416502420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:22:58,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 03:23:01,646][15401] Updated weights for policy 0, policy_version 574740 (0.0034) [2024-06-24 03:23:02,542][15349] Signal inference workers to stop experience collection... (139450 times) [2024-06-24 03:23:02,582][15401] InferenceWorker_p0-w0: stopping experience collection (139450 times) [2024-06-24 03:23:02,597][15349] Signal inference workers to resume experience collection... (139450 times) [2024-06-24 03:23:02,598][15401] InferenceWorker_p0-w0: resuming experience collection (139450 times) [2024-06-24 03:23:03,390][15132] Fps is (10 sec: 44265.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9416605696. Throughput: 0: 42427.1. Samples: 9416760520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 03:23:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 03:23:06,039][15401] Updated weights for policy 0, policy_version 574750 (0.0043) [2024-06-24 03:23:08,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 9416802304. Throughput: 0: 42564.9. Samples: 9416890140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:08,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 03:23:09,251][15401] Updated weights for policy 0, policy_version 574760 (0.0034) [2024-06-24 03:23:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9417015296. Throughput: 0: 42586.2. Samples: 9417151020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 03:23:13,507][15401] Updated weights for policy 0, policy_version 574770 (0.0025) [2024-06-24 03:23:16,795][15401] Updated weights for policy 0, policy_version 574780 (0.0036) [2024-06-24 03:23:18,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9417244672. Throughput: 0: 42470.1. Samples: 9417400360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 03:23:20,964][15401] Updated weights for policy 0, policy_version 574790 (0.0027) [2024-06-24 03:23:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9417441280. Throughput: 0: 42570.3. Samples: 9417528200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 03:23:24,686][15401] Updated weights for policy 0, policy_version 574800 (0.0027) [2024-06-24 03:23:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9417670656. Throughput: 0: 42426.6. Samples: 9417781980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:28,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 03:23:28,480][15401] Updated weights for policy 0, policy_version 574810 (0.0040) [2024-06-24 03:23:32,236][15401] Updated weights for policy 0, policy_version 574820 (0.0033) [2024-06-24 03:23:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9417883648. Throughput: 0: 42691.3. Samples: 9418045460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 03:23:35,877][15401] Updated weights for policy 0, policy_version 574830 (0.0030) [2024-06-24 03:23:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9418080256. Throughput: 0: 42733.7. Samples: 9418172080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 03:23:39,707][15401] Updated weights for policy 0, policy_version 574840 (0.0033) [2024-06-24 03:23:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9418326016. Throughput: 0: 42853.7. Samples: 9418430840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:43,399][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 03:23:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574849_9418326016.pth... [2024-06-24 03:23:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574224_9408086016.pth [2024-06-24 03:23:43,934][15401] Updated weights for policy 0, policy_version 574850 (0.0031) [2024-06-24 03:23:47,346][15401] Updated weights for policy 0, policy_version 574860 (0.0032) [2024-06-24 03:23:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9418522624. Throughput: 0: 42741.4. Samples: 9418683880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 03:23:51,466][15401] Updated weights for policy 0, policy_version 574870 (0.0031) [2024-06-24 03:23:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42602.9, 300 sec: 42598.7). Total num frames: 9418719232. Throughput: 0: 42760.4. Samples: 9418814260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:53,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 03:23:55,170][15401] Updated weights for policy 0, policy_version 574880 (0.0033) [2024-06-24 03:23:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 9418964992. Throughput: 0: 42637.1. Samples: 9419069700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:23:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 03:23:59,032][15401] Updated weights for policy 0, policy_version 574890 (0.0029) [2024-06-24 03:24:02,790][15401] Updated weights for policy 0, policy_version 574900 (0.0026) [2024-06-24 03:24:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9419161600. Throughput: 0: 42795.2. Samples: 9419326140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:03,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-24 03:24:06,642][15401] Updated weights for policy 0, policy_version 574910 (0.0037) [2024-06-24 03:24:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 9419341824. Throughput: 0: 42806.3. Samples: 9419454480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 03:24:10,356][15401] Updated weights for policy 0, policy_version 574920 (0.0030) [2024-06-24 03:24:11,713][15349] Signal inference workers to stop experience collection... (139500 times) [2024-06-24 03:24:11,743][15401] InferenceWorker_p0-w0: stopping experience collection (139500 times) [2024-06-24 03:24:11,828][15349] Signal inference workers to resume experience collection... (139500 times) [2024-06-24 03:24:11,828][15401] InferenceWorker_p0-w0: resuming experience collection (139500 times) [2024-06-24 03:24:13,390][15132] Fps is (10 sec: 45873.1, 60 sec: 43417.2, 300 sec: 42820.5). Total num frames: 9419620352. Throughput: 0: 42958.7. Samples: 9419715140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:13,391][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 03:24:14,242][15401] Updated weights for policy 0, policy_version 574930 (0.0024) [2024-06-24 03:24:17,950][15401] Updated weights for policy 0, policy_version 574940 (0.0036) [2024-06-24 03:24:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9419816960. Throughput: 0: 42615.3. Samples: 9419963160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 03:24:21,880][15401] Updated weights for policy 0, policy_version 574950 (0.0029) [2024-06-24 03:24:23,389][15132] Fps is (10 sec: 37685.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9419997184. Throughput: 0: 42636.5. Samples: 9420090720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 03:24:25,563][15401] Updated weights for policy 0, policy_version 574960 (0.0031) [2024-06-24 03:24:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9420242944. Throughput: 0: 42723.7. Samples: 9420353400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 03:24:29,519][15401] Updated weights for policy 0, policy_version 574970 (0.0028) [2024-06-24 03:24:33,104][15401] Updated weights for policy 0, policy_version 574980 (0.0037) [2024-06-24 03:24:33,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 9420472320. Throughput: 0: 42634.9. Samples: 9420602460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 03:24:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 03:24:37,818][15401] Updated weights for policy 0, policy_version 574990 (0.0034) [2024-06-24 03:24:38,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42653.4). Total num frames: 9420652544. Throughput: 0: 42691.3. Samples: 9420735640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:24:38,397][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 03:24:40,680][15401] Updated weights for policy 0, policy_version 575000 (0.0046) [2024-06-24 03:24:43,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9420898304. Throughput: 0: 42935.2. Samples: 9421001780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:24:43,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-24 03:24:45,341][15401] Updated weights for policy 0, policy_version 575010 (0.0032) [2024-06-24 03:24:48,389][15132] Fps is (10 sec: 45904.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 9421111296. Throughput: 0: 42745.8. Samples: 9421249700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:24:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 03:24:48,968][15401] Updated weights for policy 0, policy_version 575020 (0.0032) [2024-06-24 03:24:52,845][15401] Updated weights for policy 0, policy_version 575030 (0.0033) [2024-06-24 03:24:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9421307904. Throughput: 0: 42816.7. Samples: 9421381240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:24:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 03:24:56,406][15401] Updated weights for policy 0, policy_version 575040 (0.0039) [2024-06-24 03:24:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9421504512. Throughput: 0: 42731.1. Samples: 9421638020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:24:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 03:25:00,730][15401] Updated weights for policy 0, policy_version 575050 (0.0042) [2024-06-24 03:25:02,409][15349] Signal inference workers to stop experience collection... (139550 times) [2024-06-24 03:25:02,447][15401] InferenceWorker_p0-w0: stopping experience collection (139550 times) [2024-06-24 03:25:02,457][15349] Signal inference workers to resume experience collection... (139550 times) [2024-06-24 03:25:02,465][15401] InferenceWorker_p0-w0: resuming experience collection (139550 times) [2024-06-24 03:25:03,390][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9421750272. Throughput: 0: 43047.7. Samples: 9421900300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 03:25:03,954][15401] Updated weights for policy 0, policy_version 575060 (0.0029) [2024-06-24 03:25:08,145][15401] Updated weights for policy 0, policy_version 575070 (0.0032) [2024-06-24 03:25:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9421946880. Throughput: 0: 43139.1. Samples: 9422031980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 03:25:11,849][15401] Updated weights for policy 0, policy_version 575080 (0.0044) [2024-06-24 03:25:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.6, 300 sec: 42653.9). Total num frames: 9422159872. Throughput: 0: 42987.1. Samples: 9422287820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 03:25:15,661][15401] Updated weights for policy 0, policy_version 575090 (0.0037) [2024-06-24 03:25:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9422389248. Throughput: 0: 43268.5. Samples: 9422549540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 03:25:19,365][15401] Updated weights for policy 0, policy_version 575100 (0.0038) [2024-06-24 03:25:23,198][15401] Updated weights for policy 0, policy_version 575110 (0.0040) [2024-06-24 03:25:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9422602240. Throughput: 0: 43146.2. Samples: 9422676940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:23,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-24 03:25:26,783][15401] Updated weights for policy 0, policy_version 575120 (0.0043) [2024-06-24 03:25:28,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9422815232. Throughput: 0: 42847.9. Samples: 9422930040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:28,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 03:25:31,391][15401] Updated weights for policy 0, policy_version 575130 (0.0022) [2024-06-24 03:25:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9423028224. Throughput: 0: 43043.5. Samples: 9423186660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 03:25:34,407][15401] Updated weights for policy 0, policy_version 575140 (0.0026) [2024-06-24 03:25:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43149.1, 300 sec: 42820.5). Total num frames: 9423241216. Throughput: 0: 43009.5. Samples: 9423316660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 03:25:38,940][15401] Updated weights for policy 0, policy_version 575150 (0.0041) [2024-06-24 03:25:42,110][15401] Updated weights for policy 0, policy_version 575160 (0.0024) [2024-06-24 03:25:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9423437824. Throughput: 0: 42795.1. Samples: 9423563800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 03:25:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575161_9423437824.pth... [2024-06-24 03:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574535_9413181440.pth [2024-06-24 03:25:46,474][15401] Updated weights for policy 0, policy_version 575170 (0.0039) [2024-06-24 03:25:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9423667200. Throughput: 0: 42775.9. Samples: 9423825220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 03:25:49,934][15401] Updated weights for policy 0, policy_version 575180 (0.0033) [2024-06-24 03:25:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9423880192. Throughput: 0: 42743.4. Samples: 9423955440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 03:25:54,305][15401] Updated weights for policy 0, policy_version 575190 (0.0040) [2024-06-24 03:25:57,535][15401] Updated weights for policy 0, policy_version 575200 (0.0030) [2024-06-24 03:25:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9424093184. Throughput: 0: 42637.0. Samples: 9424206480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:25:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 03:26:01,868][15401] Updated weights for policy 0, policy_version 575210 (0.0038) [2024-06-24 03:26:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9424306176. Throughput: 0: 42711.7. Samples: 9424471560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 03:26:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:26:05,132][15401] Updated weights for policy 0, policy_version 575220 (0.0024) [2024-06-24 03:26:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9424519168. Throughput: 0: 42633.3. Samples: 9424595440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 03:26:10,036][15401] Updated weights for policy 0, policy_version 575230 (0.0034) [2024-06-24 03:26:12,586][15401] Updated weights for policy 0, policy_version 575240 (0.0040) [2024-06-24 03:26:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9424732160. Throughput: 0: 42711.1. Samples: 9424851940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:13,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 03:26:17,482][15401] Updated weights for policy 0, policy_version 575250 (0.0039) [2024-06-24 03:26:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9424945152. Throughput: 0: 42848.6. Samples: 9425114840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 03:26:20,483][15401] Updated weights for policy 0, policy_version 575260 (0.0042) [2024-06-24 03:26:22,687][15349] Signal inference workers to stop experience collection... (139600 times) [2024-06-24 03:26:22,687][15349] Signal inference workers to resume experience collection... (139600 times) [2024-06-24 03:26:22,703][15401] InferenceWorker_p0-w0: stopping experience collection (139600 times) [2024-06-24 03:26:22,703][15401] InferenceWorker_p0-w0: resuming experience collection (139600 times) [2024-06-24 03:26:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9425141760. Throughput: 0: 42619.7. Samples: 9425234540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:23,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 03:26:25,423][15401] Updated weights for policy 0, policy_version 575270 (0.0034) [2024-06-24 03:26:28,380][15401] Updated weights for policy 0, policy_version 575280 (0.0036) [2024-06-24 03:26:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42821.5). Total num frames: 9425387520. Throughput: 0: 42785.8. Samples: 9425489160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 03:26:32,813][15401] Updated weights for policy 0, policy_version 575290 (0.0039) [2024-06-24 03:26:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9425600512. Throughput: 0: 42941.0. Samples: 9425757560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:33,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 03:26:35,697][15401] Updated weights for policy 0, policy_version 575300 (0.0034) [2024-06-24 03:26:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 9425797120. Throughput: 0: 42879.7. Samples: 9425885120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:38,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 03:26:40,305][15401] Updated weights for policy 0, policy_version 575310 (0.0024) [2024-06-24 03:26:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9426026496. Throughput: 0: 42959.6. Samples: 9426139660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 03:26:43,446][15401] Updated weights for policy 0, policy_version 575320 (0.0037) [2024-06-24 03:26:47,705][15401] Updated weights for policy 0, policy_version 575330 (0.0037) [2024-06-24 03:26:48,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9426255872. Throughput: 0: 42840.8. Samples: 9426399400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 03:26:50,991][15401] Updated weights for policy 0, policy_version 575340 (0.0047) [2024-06-24 03:26:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 9426419712. Throughput: 0: 42919.9. Samples: 9426526940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:53,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 03:26:55,425][15401] Updated weights for policy 0, policy_version 575350 (0.0043) [2024-06-24 03:26:58,390][15132] Fps is (10 sec: 40958.2, 60 sec: 42871.1, 300 sec: 42820.5). Total num frames: 9426665472. Throughput: 0: 42849.8. Samples: 9426780200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:26:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 03:26:58,803][15401] Updated weights for policy 0, policy_version 575360 (0.0029) [2024-06-24 03:27:02,955][15401] Updated weights for policy 0, policy_version 575370 (0.0033) [2024-06-24 03:27:03,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9426878464. Throughput: 0: 42800.0. Samples: 9427040840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 03:27:06,255][15401] Updated weights for policy 0, policy_version 575380 (0.0038) [2024-06-24 03:27:08,389][15132] Fps is (10 sec: 40962.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9427075072. Throughput: 0: 42962.2. Samples: 9427167840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:08,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 03:27:10,826][15401] Updated weights for policy 0, policy_version 575390 (0.0040) [2024-06-24 03:27:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9427320832. Throughput: 0: 42935.5. Samples: 9427421260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 03:27:13,897][15401] Updated weights for policy 0, policy_version 575400 (0.0034) [2024-06-24 03:27:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9427501056. Throughput: 0: 42803.6. Samples: 9427683720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 03:27:18,432][15401] Updated weights for policy 0, policy_version 575410 (0.0034) [2024-06-24 03:27:19,064][15349] Signal inference workers to stop experience collection... (139650 times) [2024-06-24 03:27:19,064][15349] Signal inference workers to resume experience collection... (139650 times) [2024-06-24 03:27:19,083][15401] InferenceWorker_p0-w0: stopping experience collection (139650 times) [2024-06-24 03:27:19,084][15401] InferenceWorker_p0-w0: resuming experience collection (139650 times) [2024-06-24 03:27:21,568][15401] Updated weights for policy 0, policy_version 575420 (0.0054) [2024-06-24 03:27:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9427730432. Throughput: 0: 42643.5. Samples: 9427803980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 03:27:26,037][15401] Updated weights for policy 0, policy_version 575430 (0.0032) [2024-06-24 03:27:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9427943424. Throughput: 0: 42598.2. Samples: 9428056580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 03:27:29,260][15401] Updated weights for policy 0, policy_version 575440 (0.0040) [2024-06-24 03:27:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9428140032. Throughput: 0: 42781.7. Samples: 9428324580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 03:27:33,635][15401] Updated weights for policy 0, policy_version 575450 (0.0036) [2024-06-24 03:27:36,802][15401] Updated weights for policy 0, policy_version 575460 (0.0026) [2024-06-24 03:27:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9428369408. Throughput: 0: 42650.3. Samples: 9428446100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-06-24 03:27:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 03:27:41,194][15401] Updated weights for policy 0, policy_version 575470 (0.0041) [2024-06-24 03:27:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9428582400. Throughput: 0: 42645.4. Samples: 9428699220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:27:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 03:27:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575476_9428598784.pth... [2024-06-24 03:27:43,608][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000574849_9418326016.pth [2024-06-24 03:27:44,472][15401] Updated weights for policy 0, policy_version 575480 (0.0027) [2024-06-24 03:27:48,391][15132] Fps is (10 sec: 39316.9, 60 sec: 41778.4, 300 sec: 42710.2). Total num frames: 9428762624. Throughput: 0: 42521.5. Samples: 9428954360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:27:48,391][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 03:27:49,379][15401] Updated weights for policy 0, policy_version 575490 (0.0029) [2024-06-24 03:27:52,130][15401] Updated weights for policy 0, policy_version 575500 (0.0039) [2024-06-24 03:27:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 9428992000. Throughput: 0: 42415.8. Samples: 9429076560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:27:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 03:27:56,994][15401] Updated weights for policy 0, policy_version 575510 (0.0041) [2024-06-24 03:27:58,391][15132] Fps is (10 sec: 47512.7, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 9429237760. Throughput: 0: 42565.4. Samples: 9429336760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:27:58,391][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 03:28:00,061][15401] Updated weights for policy 0, policy_version 575520 (0.0032) [2024-06-24 03:28:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 9429401600. Throughput: 0: 42521.3. Samples: 9429597180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:03,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-24 03:28:04,627][15401] Updated weights for policy 0, policy_version 575530 (0.0030) [2024-06-24 03:28:07,991][15401] Updated weights for policy 0, policy_version 575540 (0.0031) [2024-06-24 03:28:08,389][15132] Fps is (10 sec: 40965.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9429647360. Throughput: 0: 42458.7. Samples: 9429714620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 03:28:12,669][15401] Updated weights for policy 0, policy_version 575550 (0.0046) [2024-06-24 03:28:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9429860352. Throughput: 0: 42635.6. Samples: 9429975180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:13,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 03:28:15,549][15401] Updated weights for policy 0, policy_version 575560 (0.0035) [2024-06-24 03:28:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9430056960. Throughput: 0: 42384.0. Samples: 9430231860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 03:28:20,182][15401] Updated weights for policy 0, policy_version 575570 (0.0033) [2024-06-24 03:28:22,138][15349] Signal inference workers to stop experience collection... (139700 times) [2024-06-24 03:28:22,138][15349] Signal inference workers to resume experience collection... (139700 times) [2024-06-24 03:28:22,160][15401] InferenceWorker_p0-w0: stopping experience collection (139700 times) [2024-06-24 03:28:22,160][15401] InferenceWorker_p0-w0: resuming experience collection (139700 times) [2024-06-24 03:28:23,303][15401] Updated weights for policy 0, policy_version 575580 (0.0032) [2024-06-24 03:28:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9430302720. Throughput: 0: 42502.2. Samples: 9430358700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 03:28:27,750][15401] Updated weights for policy 0, policy_version 575590 (0.0035) [2024-06-24 03:28:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 9430482944. Throughput: 0: 42505.7. Samples: 9430611980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 03:28:31,407][15401] Updated weights for policy 0, policy_version 575600 (0.0032) [2024-06-24 03:28:33,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 9430712320. Throughput: 0: 42584.4. Samples: 9430870880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:33,397][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 03:28:35,330][15401] Updated weights for policy 0, policy_version 575610 (0.0033) [2024-06-24 03:28:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9430925312. Throughput: 0: 42744.2. Samples: 9431000040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 03:28:39,067][15401] Updated weights for policy 0, policy_version 575620 (0.0037) [2024-06-24 03:28:42,688][15401] Updated weights for policy 0, policy_version 575630 (0.0029) [2024-06-24 03:28:43,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9431121920. Throughput: 0: 42678.6. Samples: 9431257240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:43,391][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 03:28:46,790][15401] Updated weights for policy 0, policy_version 575640 (0.0028) [2024-06-24 03:28:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43418.5, 300 sec: 42876.1). Total num frames: 9431367680. Throughput: 0: 42622.6. Samples: 9431515200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:48,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 03:28:50,838][15401] Updated weights for policy 0, policy_version 575650 (0.0029) [2024-06-24 03:28:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9431564288. Throughput: 0: 42976.1. Samples: 9431648540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:53,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 03:28:54,515][15401] Updated weights for policy 0, policy_version 575660 (0.0032) [2024-06-24 03:28:58,352][15401] Updated weights for policy 0, policy_version 575670 (0.0027) [2024-06-24 03:28:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.3, 300 sec: 42765.0). Total num frames: 9431777280. Throughput: 0: 42896.9. Samples: 9431905540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:28:58,391][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 03:29:02,058][15401] Updated weights for policy 0, policy_version 575680 (0.0030) [2024-06-24 03:29:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9431990272. Throughput: 0: 42786.7. Samples: 9432157260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:29:03,391][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 03:29:05,842][15401] Updated weights for policy 0, policy_version 575690 (0.0031) [2024-06-24 03:29:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9432219648. Throughput: 0: 42842.5. Samples: 9432286620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 03:29:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 03:29:09,888][15401] Updated weights for policy 0, policy_version 575700 (0.0041) [2024-06-24 03:29:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9432416256. Throughput: 0: 42983.2. Samples: 9432546220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 03:29:13,538][15401] Updated weights for policy 0, policy_version 575710 (0.0028) [2024-06-24 03:29:17,489][15401] Updated weights for policy 0, policy_version 575720 (0.0025) [2024-06-24 03:29:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9432629248. Throughput: 0: 42912.7. Samples: 9432801680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:18,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 03:29:21,154][15401] Updated weights for policy 0, policy_version 575730 (0.0039) [2024-06-24 03:29:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9432858624. Throughput: 0: 42926.6. Samples: 9432931740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 03:29:24,860][15401] Updated weights for policy 0, policy_version 575740 (0.0038) [2024-06-24 03:29:28,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 9433055232. Throughput: 0: 42984.5. Samples: 9433191820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:28,397][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 03:29:28,753][15401] Updated weights for policy 0, policy_version 575750 (0.0031) [2024-06-24 03:29:32,503][15401] Updated weights for policy 0, policy_version 575760 (0.0029) [2024-06-24 03:29:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.0, 300 sec: 42821.5). Total num frames: 9433284608. Throughput: 0: 42875.9. Samples: 9433444620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:33,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 03:29:36,364][15401] Updated weights for policy 0, policy_version 575770 (0.0042) [2024-06-24 03:29:38,390][15132] Fps is (10 sec: 45904.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9433513984. Throughput: 0: 42843.4. Samples: 9433576500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:38,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 03:29:40,039][15401] Updated weights for policy 0, policy_version 575780 (0.0029) [2024-06-24 03:29:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9433694208. Throughput: 0: 42857.7. Samples: 9433834140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 03:29:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575788_9433710592.pth... [2024-06-24 03:29:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575161_9423437824.pth [2024-06-24 03:29:43,925][15401] Updated weights for policy 0, policy_version 575790 (0.0032) [2024-06-24 03:29:47,635][15401] Updated weights for policy 0, policy_version 575800 (0.0029) [2024-06-24 03:29:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9433923584. Throughput: 0: 42910.2. Samples: 9434088220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 03:29:51,488][15401] Updated weights for policy 0, policy_version 575810 (0.0041) [2024-06-24 03:29:52,497][15349] Signal inference workers to stop experience collection... (139750 times) [2024-06-24 03:29:52,497][15349] Signal inference workers to resume experience collection... (139750 times) [2024-06-24 03:29:52,529][15401] InferenceWorker_p0-w0: stopping experience collection (139750 times) [2024-06-24 03:29:52,529][15401] InferenceWorker_p0-w0: resuming experience collection (139750 times) [2024-06-24 03:29:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9434152960. Throughput: 0: 42873.4. Samples: 9434215920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 03:29:55,315][15401] Updated weights for policy 0, policy_version 575820 (0.0033) [2024-06-24 03:29:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9434333184. Throughput: 0: 42794.2. Samples: 9434471960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:29:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 03:29:59,763][15401] Updated weights for policy 0, policy_version 575830 (0.0027) [2024-06-24 03:30:03,273][15401] Updated weights for policy 0, policy_version 575840 (0.0033) [2024-06-24 03:30:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9434562560. Throughput: 0: 42697.9. Samples: 9434723080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 03:30:07,529][15401] Updated weights for policy 0, policy_version 575850 (0.0049) [2024-06-24 03:30:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9434775552. Throughput: 0: 42602.2. Samples: 9434848840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 03:30:10,908][15401] Updated weights for policy 0, policy_version 575860 (0.0028) [2024-06-24 03:30:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9434988544. Throughput: 0: 42459.9. Samples: 9435102240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 03:30:15,124][15401] Updated weights for policy 0, policy_version 575870 (0.0038) [2024-06-24 03:30:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9435185152. Throughput: 0: 42567.6. Samples: 9435360160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 03:30:18,609][15401] Updated weights for policy 0, policy_version 575880 (0.0036) [2024-06-24 03:30:22,581][15401] Updated weights for policy 0, policy_version 575890 (0.0032) [2024-06-24 03:30:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 9435398144. Throughput: 0: 42483.7. Samples: 9435488260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 03:30:26,171][15401] Updated weights for policy 0, policy_version 575900 (0.0029) [2024-06-24 03:30:28,394][15132] Fps is (10 sec: 42581.2, 60 sec: 42600.1, 300 sec: 42653.4). Total num frames: 9435611136. Throughput: 0: 42447.8. Samples: 9435744460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:28,394][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 03:30:30,342][15401] Updated weights for policy 0, policy_version 575910 (0.0028) [2024-06-24 03:30:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9435824128. Throughput: 0: 42424.4. Samples: 9435997320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 03:30:34,391][15401] Updated weights for policy 0, policy_version 575920 (0.0042) [2024-06-24 03:30:37,829][15401] Updated weights for policy 0, policy_version 575930 (0.0029) [2024-06-24 03:30:38,389][15132] Fps is (10 sec: 42616.2, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 9436037120. Throughput: 0: 42346.4. Samples: 9436121500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 03:30:42,062][15401] Updated weights for policy 0, policy_version 575940 (0.0041) [2024-06-24 03:30:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9436233728. Throughput: 0: 42309.8. Samples: 9436375900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:30:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 03:30:45,475][15401] Updated weights for policy 0, policy_version 575950 (0.0037) [2024-06-24 03:30:48,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 9436463104. Throughput: 0: 42375.9. Samples: 9436630100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:30:48,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 03:30:49,687][15401] Updated weights for policy 0, policy_version 575960 (0.0030) [2024-06-24 03:30:53,107][15401] Updated weights for policy 0, policy_version 575970 (0.0027) [2024-06-24 03:30:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9436692480. Throughput: 0: 42520.6. Samples: 9436762260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:30:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 03:30:57,135][15401] Updated weights for policy 0, policy_version 575980 (0.0032) [2024-06-24 03:30:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9436889088. Throughput: 0: 42537.7. Samples: 9437016440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:30:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 03:31:01,331][15401] Updated weights for policy 0, policy_version 575990 (0.0039) [2024-06-24 03:31:02,688][15349] Signal inference workers to stop experience collection... (139800 times) [2024-06-24 03:31:02,688][15349] Signal inference workers to resume experience collection... (139800 times) [2024-06-24 03:31:02,720][15401] InferenceWorker_p0-w0: stopping experience collection (139800 times) [2024-06-24 03:31:02,720][15401] InferenceWorker_p0-w0: resuming experience collection (139800 times) [2024-06-24 03:31:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9437102080. Throughput: 0: 42410.3. Samples: 9437268620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 03:31:04,923][15401] Updated weights for policy 0, policy_version 576000 (0.0031) [2024-06-24 03:31:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9437315072. Throughput: 0: 42522.1. Samples: 9437401760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 03:31:08,937][15401] Updated weights for policy 0, policy_version 576010 (0.0037) [2024-06-24 03:31:12,382][15401] Updated weights for policy 0, policy_version 576020 (0.0029) [2024-06-24 03:31:13,392][15132] Fps is (10 sec: 42587.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 9437528064. Throughput: 0: 42451.7. Samples: 9437654720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:13,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 03:31:16,724][15401] Updated weights for policy 0, policy_version 576030 (0.0044) [2024-06-24 03:31:18,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9437757440. Throughput: 0: 42428.6. Samples: 9437906700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:18,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 03:31:20,105][15401] Updated weights for policy 0, policy_version 576040 (0.0028) [2024-06-24 03:31:23,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9437937664. Throughput: 0: 42561.3. Samples: 9438036760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 03:31:24,568][15401] Updated weights for policy 0, policy_version 576050 (0.0042) [2024-06-24 03:31:27,947][15401] Updated weights for policy 0, policy_version 576060 (0.0029) [2024-06-24 03:31:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42874.4, 300 sec: 42654.0). Total num frames: 9438183424. Throughput: 0: 42615.6. Samples: 9438293600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 03:31:32,110][15401] Updated weights for policy 0, policy_version 576070 (0.0032) [2024-06-24 03:31:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9438396416. Throughput: 0: 42764.5. Samples: 9438554400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 03:31:35,606][15401] Updated weights for policy 0, policy_version 576080 (0.0030) [2024-06-24 03:31:38,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.6, 300 sec: 42653.6). Total num frames: 9438609408. Throughput: 0: 42677.6. Samples: 9438682860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:38,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 03:31:39,543][15401] Updated weights for policy 0, policy_version 576090 (0.0029) [2024-06-24 03:31:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 9438806016. Throughput: 0: 42845.6. Samples: 9438944500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:31:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576100_9438822400.pth... [2024-06-24 03:31:43,502][15401] Updated weights for policy 0, policy_version 576100 (0.0024) [2024-06-24 03:31:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575476_9428598784.pth [2024-06-24 03:31:47,055][15401] Updated weights for policy 0, policy_version 576110 (0.0033) [2024-06-24 03:31:48,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9439035392. Throughput: 0: 42814.9. Samples: 9439195400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:48,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 03:31:51,340][15401] Updated weights for policy 0, policy_version 576120 (0.0029) [2024-06-24 03:31:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.2, 300 sec: 42654.0). Total num frames: 9439248384. Throughput: 0: 42666.6. Samples: 9439321760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 03:31:54,835][15401] Updated weights for policy 0, policy_version 576130 (0.0025) [2024-06-24 03:31:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9439444992. Throughput: 0: 42762.4. Samples: 9439578920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:31:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 03:31:59,213][15401] Updated weights for policy 0, policy_version 576140 (0.0035) [2024-06-24 03:32:02,543][15401] Updated weights for policy 0, policy_version 576150 (0.0030) [2024-06-24 03:32:03,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9439674368. Throughput: 0: 42777.4. Samples: 9439831580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:32:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 03:32:06,643][15401] Updated weights for policy 0, policy_version 576160 (0.0038) [2024-06-24 03:32:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9439887360. Throughput: 0: 42821.2. Samples: 9439963720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:32:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 03:32:10,405][15401] Updated weights for policy 0, policy_version 576170 (0.0036) [2024-06-24 03:32:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 9440083968. Throughput: 0: 42753.3. Samples: 9440217500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:32:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 03:32:14,035][15401] Updated weights for policy 0, policy_version 576180 (0.0025) [2024-06-24 03:32:17,871][15401] Updated weights for policy 0, policy_version 576190 (0.0048) [2024-06-24 03:32:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 9440313344. Throughput: 0: 42837.8. Samples: 9440482100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 03:32:21,537][15401] Updated weights for policy 0, policy_version 576200 (0.0041) [2024-06-24 03:32:23,128][15349] Signal inference workers to stop experience collection... (139850 times) [2024-06-24 03:32:23,128][15349] Signal inference workers to resume experience collection... (139850 times) [2024-06-24 03:32:23,155][15401] InferenceWorker_p0-w0: stopping experience collection (139850 times) [2024-06-24 03:32:23,155][15401] InferenceWorker_p0-w0: resuming experience collection (139850 times) [2024-06-24 03:32:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 9440542720. Throughput: 0: 42837.9. Samples: 9440610460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 03:32:25,370][15401] Updated weights for policy 0, policy_version 576210 (0.0037) [2024-06-24 03:32:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9440739328. Throughput: 0: 42721.9. Samples: 9440866980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 03:32:29,277][15401] Updated weights for policy 0, policy_version 576220 (0.0036) [2024-06-24 03:32:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9440935936. Throughput: 0: 42773.9. Samples: 9441120120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 03:32:33,430][15401] Updated weights for policy 0, policy_version 576230 (0.0028) [2024-06-24 03:32:37,094][15401] Updated weights for policy 0, policy_version 576240 (0.0030) [2024-06-24 03:32:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 9441148928. Throughput: 0: 42838.8. Samples: 9441249500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 03:32:41,006][15401] Updated weights for policy 0, policy_version 576250 (0.0040) [2024-06-24 03:32:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 9441378304. Throughput: 0: 42747.9. Samples: 9441502580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 03:32:44,717][15401] Updated weights for policy 0, policy_version 576260 (0.0032) [2024-06-24 03:32:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 9441591296. Throughput: 0: 42746.0. Samples: 9441755160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 03:32:48,649][15401] Updated weights for policy 0, policy_version 576270 (0.0037) [2024-06-24 03:32:52,274][15401] Updated weights for policy 0, policy_version 576280 (0.0028) [2024-06-24 03:32:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42543.1). Total num frames: 9441787904. Throughput: 0: 42608.0. Samples: 9441881080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 03:32:56,280][15401] Updated weights for policy 0, policy_version 576290 (0.0033) [2024-06-24 03:32:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9442017280. Throughput: 0: 42686.2. Samples: 9442138380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:32:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 03:32:59,882][15401] Updated weights for policy 0, policy_version 576300 (0.0040) [2024-06-24 03:33:03,391][15132] Fps is (10 sec: 44228.9, 60 sec: 42597.0, 300 sec: 42653.7). Total num frames: 9442230272. Throughput: 0: 42432.1. Samples: 9442391620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:03,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 03:33:03,712][15401] Updated weights for policy 0, policy_version 576310 (0.0040) [2024-06-24 03:33:07,947][15401] Updated weights for policy 0, policy_version 576320 (0.0030) [2024-06-24 03:33:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9442426880. Throughput: 0: 42554.3. Samples: 9442525400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 03:33:11,511][15401] Updated weights for policy 0, policy_version 576330 (0.0036) [2024-06-24 03:33:13,390][15132] Fps is (10 sec: 44244.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9442672640. Throughput: 0: 42480.4. Samples: 9442778600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 03:33:15,525][15401] Updated weights for policy 0, policy_version 576340 (0.0041) [2024-06-24 03:33:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9442852864. Throughput: 0: 42537.7. Samples: 9443034320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 03:33:19,127][15401] Updated weights for policy 0, policy_version 576350 (0.0037) [2024-06-24 03:33:23,176][15401] Updated weights for policy 0, policy_version 576360 (0.0058) [2024-06-24 03:33:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9443082240. Throughput: 0: 42384.0. Samples: 9443156780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 03:33:26,940][15401] Updated weights for policy 0, policy_version 576370 (0.0035) [2024-06-24 03:33:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 9443295232. Throughput: 0: 42547.2. Samples: 9443417200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 03:33:30,771][15401] Updated weights for policy 0, policy_version 576380 (0.0037) [2024-06-24 03:33:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9443508224. Throughput: 0: 42642.3. Samples: 9443674060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 03:33:34,559][15401] Updated weights for policy 0, policy_version 576390 (0.0046) [2024-06-24 03:33:36,065][15349] Signal inference workers to stop experience collection... (139900 times) [2024-06-24 03:33:36,068][15349] Signal inference workers to resume experience collection... (139900 times) [2024-06-24 03:33:36,082][15401] InferenceWorker_p0-w0: stopping experience collection (139900 times) [2024-06-24 03:33:36,082][15401] InferenceWorker_p0-w0: resuming experience collection (139900 times) [2024-06-24 03:33:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9443721216. Throughput: 0: 42637.0. Samples: 9443799740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 03:33:38,616][15401] Updated weights for policy 0, policy_version 576400 (0.0024) [2024-06-24 03:33:42,309][15401] Updated weights for policy 0, policy_version 576410 (0.0031) [2024-06-24 03:33:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9443950592. Throughput: 0: 42675.6. Samples: 9444058780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 03:33:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 03:33:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576413_9443950592.pth... [2024-06-24 03:33:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000575788_9433710592.pth [2024-06-24 03:33:46,336][15401] Updated weights for policy 0, policy_version 576420 (0.0031) [2024-06-24 03:33:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9444147200. Throughput: 0: 42750.6. Samples: 9444315320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:33:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 03:33:49,817][15401] Updated weights for policy 0, policy_version 576430 (0.0043) [2024-06-24 03:33:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9444360192. Throughput: 0: 42603.5. Samples: 9444442560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:33:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 03:33:54,060][15401] Updated weights for policy 0, policy_version 576440 (0.0037) [2024-06-24 03:33:57,484][15401] Updated weights for policy 0, policy_version 576450 (0.0029) [2024-06-24 03:33:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9444573184. Throughput: 0: 42727.1. Samples: 9444701320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:33:58,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 03:34:01,765][15401] Updated weights for policy 0, policy_version 576460 (0.0033) [2024-06-24 03:34:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42872.8, 300 sec: 42654.0). Total num frames: 9444802560. Throughput: 0: 42657.3. Samples: 9444953900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 03:34:05,085][15401] Updated weights for policy 0, policy_version 576470 (0.0029) [2024-06-24 03:34:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9444982784. Throughput: 0: 42851.0. Samples: 9445085080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 03:34:09,431][15401] Updated weights for policy 0, policy_version 576480 (0.0033) [2024-06-24 03:34:12,637][15401] Updated weights for policy 0, policy_version 576490 (0.0034) [2024-06-24 03:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9445228544. Throughput: 0: 42733.2. Samples: 9445340200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 03:34:17,204][15401] Updated weights for policy 0, policy_version 576500 (0.0037) [2024-06-24 03:34:18,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9445441536. Throughput: 0: 42689.3. Samples: 9445595080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 03:34:20,138][15401] Updated weights for policy 0, policy_version 576510 (0.0024) [2024-06-24 03:34:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 9445638144. Throughput: 0: 42780.9. Samples: 9445724880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 03:34:24,981][15401] Updated weights for policy 0, policy_version 576520 (0.0034) [2024-06-24 03:34:27,946][15401] Updated weights for policy 0, policy_version 576530 (0.0032) [2024-06-24 03:34:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9445867520. Throughput: 0: 42701.0. Samples: 9445980320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 03:34:32,499][15401] Updated weights for policy 0, policy_version 576540 (0.0025) [2024-06-24 03:34:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9446080512. Throughput: 0: 42612.4. Samples: 9446232880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:33,398][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 03:34:35,663][15401] Updated weights for policy 0, policy_version 576550 (0.0033) [2024-06-24 03:34:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9446277120. Throughput: 0: 42523.2. Samples: 9446356100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 03:34:40,322][15401] Updated weights for policy 0, policy_version 576560 (0.0037) [2024-06-24 03:34:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9446506496. Throughput: 0: 42579.7. Samples: 9446617400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:34:43,417][15401] Updated weights for policy 0, policy_version 576570 (0.0036) [2024-06-24 03:34:47,831][15401] Updated weights for policy 0, policy_version 576580 (0.0040) [2024-06-24 03:34:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 9446686720. Throughput: 0: 42649.7. Samples: 9446873140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:48,402][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 03:34:51,220][15401] Updated weights for policy 0, policy_version 576590 (0.0026) [2024-06-24 03:34:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9446932480. Throughput: 0: 42472.6. Samples: 9446996340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:34:55,777][15401] Updated weights for policy 0, policy_version 576600 (0.0035) [2024-06-24 03:34:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 9447145472. Throughput: 0: 42571.1. Samples: 9447256000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:34:58,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 03:34:58,746][15401] Updated weights for policy 0, policy_version 576610 (0.0027) [2024-06-24 03:35:03,309][15401] Updated weights for policy 0, policy_version 576620 (0.0030) [2024-06-24 03:35:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9447342080. Throughput: 0: 42664.5. Samples: 9447514980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:35:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 03:35:06,989][15401] Updated weights for policy 0, policy_version 576630 (0.0039) [2024-06-24 03:35:06,995][15349] Signal inference workers to stop experience collection... (139950 times) [2024-06-24 03:35:06,995][15349] Signal inference workers to resume experience collection... (139950 times) [2024-06-24 03:35:07,032][15401] InferenceWorker_p0-w0: stopping experience collection (139950 times) [2024-06-24 03:35:07,032][15401] InferenceWorker_p0-w0: resuming experience collection (139950 times) [2024-06-24 03:35:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9447555072. Throughput: 0: 42585.8. Samples: 9447641240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:35:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 03:35:10,765][15401] Updated weights for policy 0, policy_version 576640 (0.0038) [2024-06-24 03:35:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9447784448. Throughput: 0: 42623.9. Samples: 9447898400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:35:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 03:35:14,559][15401] Updated weights for policy 0, policy_version 576650 (0.0041) [2024-06-24 03:35:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9447981056. Throughput: 0: 42797.9. Samples: 9448158780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 03:35:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-24 03:35:18,421][15401] Updated weights for policy 0, policy_version 576660 (0.0034) [2024-06-24 03:35:21,932][15401] Updated weights for policy 0, policy_version 576670 (0.0031) [2024-06-24 03:35:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 9448210432. Throughput: 0: 42955.4. Samples: 9448289100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 03:35:26,031][15401] Updated weights for policy 0, policy_version 576680 (0.0031) [2024-06-24 03:35:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 9448407040. Throughput: 0: 42879.6. Samples: 9448546980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 03:35:29,683][15401] Updated weights for policy 0, policy_version 576690 (0.0027) [2024-06-24 03:35:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9448620032. Throughput: 0: 42790.7. Samples: 9448798720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 03:35:33,587][15401] Updated weights for policy 0, policy_version 576700 (0.0032) [2024-06-24 03:35:37,251][15401] Updated weights for policy 0, policy_version 576710 (0.0031) [2024-06-24 03:35:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9448865792. Throughput: 0: 42879.9. Samples: 9448925940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 03:35:41,131][15401] Updated weights for policy 0, policy_version 576720 (0.0032) [2024-06-24 03:35:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 9449046016. Throughput: 0: 42859.6. Samples: 9449184580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 03:35:43,598][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576726_9449078784.pth... [2024-06-24 03:35:43,648][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576100_9438822400.pth [2024-06-24 03:35:44,812][15401] Updated weights for policy 0, policy_version 576730 (0.0024) [2024-06-24 03:35:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 9449291776. Throughput: 0: 42843.0. Samples: 9449442920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 03:35:48,533][15401] Updated weights for policy 0, policy_version 576740 (0.0033) [2024-06-24 03:35:52,282][15401] Updated weights for policy 0, policy_version 576750 (0.0039) [2024-06-24 03:35:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9449504768. Throughput: 0: 42883.8. Samples: 9449571020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 03:35:56,523][15401] Updated weights for policy 0, policy_version 576760 (0.0031) [2024-06-24 03:35:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 9449701376. Throughput: 0: 42976.9. Samples: 9449832360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:35:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 03:35:59,980][15401] Updated weights for policy 0, policy_version 576770 (0.0033) [2024-06-24 03:36:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9449914368. Throughput: 0: 42888.8. Samples: 9450088780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 03:36:04,190][15401] Updated weights for policy 0, policy_version 576780 (0.0032) [2024-06-24 03:36:07,596][15401] Updated weights for policy 0, policy_version 576790 (0.0032) [2024-06-24 03:36:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.3, 300 sec: 42765.3). Total num frames: 9450143744. Throughput: 0: 42736.8. Samples: 9450212260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 03:36:11,834][15401] Updated weights for policy 0, policy_version 576800 (0.0029) [2024-06-24 03:36:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 9450340352. Throughput: 0: 42642.9. Samples: 9450466020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:13,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 03:36:15,243][15401] Updated weights for policy 0, policy_version 576810 (0.0030) [2024-06-24 03:36:18,390][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9450536960. Throughput: 0: 42848.9. Samples: 9450726920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 03:36:19,597][15401] Updated weights for policy 0, policy_version 576820 (0.0044) [2024-06-24 03:36:23,021][15401] Updated weights for policy 0, policy_version 576830 (0.0037) [2024-06-24 03:36:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9450782720. Throughput: 0: 42863.7. Samples: 9450854800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:36:27,099][15401] Updated weights for policy 0, policy_version 576840 (0.0033) [2024-06-24 03:36:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 9450979328. Throughput: 0: 42764.8. Samples: 9451109000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 03:36:30,726][15401] Updated weights for policy 0, policy_version 576850 (0.0042) [2024-06-24 03:36:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 9451192320. Throughput: 0: 42781.9. Samples: 9451368100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:33,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 03:36:34,905][15401] Updated weights for policy 0, policy_version 576860 (0.0037) [2024-06-24 03:36:37,352][15349] Signal inference workers to stop experience collection... (140000 times) [2024-06-24 03:36:37,353][15349] Signal inference workers to resume experience collection... (140000 times) [2024-06-24 03:36:37,372][15401] InferenceWorker_p0-w0: stopping experience collection (140000 times) [2024-06-24 03:36:37,372][15401] InferenceWorker_p0-w0: resuming experience collection (140000 times) [2024-06-24 03:36:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9451421696. Throughput: 0: 42785.0. Samples: 9451496340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:36:38,434][15401] Updated weights for policy 0, policy_version 576870 (0.0039) [2024-06-24 03:36:42,689][15401] Updated weights for policy 0, policy_version 576880 (0.0038) [2024-06-24 03:36:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 9451618304. Throughput: 0: 42602.6. Samples: 9451749480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 03:36:46,095][15401] Updated weights for policy 0, policy_version 576890 (0.0028) [2024-06-24 03:36:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 9451831296. Throughput: 0: 42528.9. Samples: 9452002580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 03:36:48,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 03:36:50,579][15401] Updated weights for policy 0, policy_version 576900 (0.0035) [2024-06-24 03:36:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9452060672. Throughput: 0: 42735.7. Samples: 9452135360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:36:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 03:36:53,685][15401] Updated weights for policy 0, policy_version 576910 (0.0041) [2024-06-24 03:36:58,228][15401] Updated weights for policy 0, policy_version 576920 (0.0036) [2024-06-24 03:36:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9452257280. Throughput: 0: 42716.4. Samples: 9452388160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:36:58,393][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 03:37:01,290][15401] Updated weights for policy 0, policy_version 576930 (0.0032) [2024-06-24 03:37:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9452470272. Throughput: 0: 42565.4. Samples: 9452642360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 03:37:05,843][15401] Updated weights for policy 0, policy_version 576940 (0.0028) [2024-06-24 03:37:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9452683264. Throughput: 0: 42547.6. Samples: 9452769440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:08,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 03:37:08,907][15401] Updated weights for policy 0, policy_version 576950 (0.0022) [2024-06-24 03:37:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 9452896256. Throughput: 0: 42594.8. Samples: 9453025760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 03:37:13,598][15401] Updated weights for policy 0, policy_version 576960 (0.0034) [2024-06-24 03:37:16,591][15401] Updated weights for policy 0, policy_version 576970 (0.0028) [2024-06-24 03:37:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9453109248. Throughput: 0: 42561.3. Samples: 9453283360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 03:37:21,132][15401] Updated weights for policy 0, policy_version 576980 (0.0022) [2024-06-24 03:37:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9453338624. Throughput: 0: 42619.6. Samples: 9453414220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 03:37:24,162][15401] Updated weights for policy 0, policy_version 576990 (0.0032) [2024-06-24 03:37:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9453535232. Throughput: 0: 42777.8. Samples: 9453674480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:28,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 03:37:28,589][15401] Updated weights for policy 0, policy_version 577000 (0.0033) [2024-06-24 03:37:31,639][15401] Updated weights for policy 0, policy_version 577010 (0.0037) [2024-06-24 03:37:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9453748224. Throughput: 0: 43028.3. Samples: 9453938860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 03:37:36,106][15401] Updated weights for policy 0, policy_version 577020 (0.0038) [2024-06-24 03:37:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9453993984. Throughput: 0: 42861.7. Samples: 9454064240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:38,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 03:37:39,290][15401] Updated weights for policy 0, policy_version 577030 (0.0040) [2024-06-24 03:37:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9454190592. Throughput: 0: 43020.5. Samples: 9454324080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 03:37:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577039_9454206976.pth... [2024-06-24 03:37:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576413_9443950592.pth [2024-06-24 03:37:43,737][15401] Updated weights for policy 0, policy_version 577040 (0.0027) [2024-06-24 03:37:47,018][15401] Updated weights for policy 0, policy_version 577050 (0.0031) [2024-06-24 03:37:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9454403584. Throughput: 0: 42984.8. Samples: 9454576680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 03:37:51,480][15401] Updated weights for policy 0, policy_version 577060 (0.0034) [2024-06-24 03:37:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9454632960. Throughput: 0: 43155.6. Samples: 9454711440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 03:37:54,738][15401] Updated weights for policy 0, policy_version 577070 (0.0036) [2024-06-24 03:37:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 9454829568. Throughput: 0: 43228.8. Samples: 9454971060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:37:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 03:37:58,985][15401] Updated weights for policy 0, policy_version 577080 (0.0033) [2024-06-24 03:38:02,310][15401] Updated weights for policy 0, policy_version 577090 (0.0036) [2024-06-24 03:38:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 9455075328. Throughput: 0: 43221.6. Samples: 9455228340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:38:03,391][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 03:38:06,655][15401] Updated weights for policy 0, policy_version 577100 (0.0038) [2024-06-24 03:38:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9455288320. Throughput: 0: 43345.4. Samples: 9455364760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:38:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 03:38:10,019][15401] Updated weights for policy 0, policy_version 577110 (0.0040) [2024-06-24 03:38:11,575][15349] Signal inference workers to stop experience collection... (140050 times) [2024-06-24 03:38:11,600][15401] InferenceWorker_p0-w0: stopping experience collection (140050 times) [2024-06-24 03:38:11,689][15349] Signal inference workers to resume experience collection... (140050 times) [2024-06-24 03:38:11,689][15401] InferenceWorker_p0-w0: resuming experience collection (140050 times) [2024-06-24 03:38:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9455484928. Throughput: 0: 43008.5. Samples: 9455609860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:38:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 03:38:14,813][15401] Updated weights for policy 0, policy_version 577120 (0.0037) [2024-06-24 03:38:17,620][15401] Updated weights for policy 0, policy_version 577130 (0.0027) [2024-06-24 03:38:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9455714304. Throughput: 0: 42829.5. Samples: 9455866180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:38:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 03:38:22,433][15401] Updated weights for policy 0, policy_version 577140 (0.0035) [2024-06-24 03:38:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9455927296. Throughput: 0: 43024.1. Samples: 9456000220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 03:38:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 03:38:25,258][15401] Updated weights for policy 0, policy_version 577150 (0.0031) [2024-06-24 03:38:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 9456123904. Throughput: 0: 42848.8. Samples: 9456252380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:28,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 03:38:29,974][15401] Updated weights for policy 0, policy_version 577160 (0.0022) [2024-06-24 03:38:32,816][15401] Updated weights for policy 0, policy_version 577170 (0.0044) [2024-06-24 03:38:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 9456353280. Throughput: 0: 42905.8. Samples: 9456507440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 03:38:37,869][15401] Updated weights for policy 0, policy_version 577180 (0.0033) [2024-06-24 03:38:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 9456549888. Throughput: 0: 42843.8. Samples: 9456639420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:38,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 03:38:41,012][15401] Updated weights for policy 0, policy_version 577190 (0.0033) [2024-06-24 03:38:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9456762880. Throughput: 0: 42626.6. Samples: 9456889260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 03:38:45,430][15401] Updated weights for policy 0, policy_version 577200 (0.0048) [2024-06-24 03:38:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9456992256. Throughput: 0: 42587.6. Samples: 9457144780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 03:38:48,697][15401] Updated weights for policy 0, policy_version 577210 (0.0031) [2024-06-24 03:38:53,075][15401] Updated weights for policy 0, policy_version 577220 (0.0028) [2024-06-24 03:38:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9457188864. Throughput: 0: 42481.3. Samples: 9457276420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 03:38:56,272][15401] Updated weights for policy 0, policy_version 577230 (0.0040) [2024-06-24 03:38:58,394][15132] Fps is (10 sec: 42581.7, 60 sec: 43141.7, 300 sec: 42764.4). Total num frames: 9457418240. Throughput: 0: 42631.4. Samples: 9457528440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:38:58,394][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 03:39:00,589][15401] Updated weights for policy 0, policy_version 577240 (0.0032) [2024-06-24 03:39:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9457631232. Throughput: 0: 42616.4. Samples: 9457783920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 03:39:03,974][15401] Updated weights for policy 0, policy_version 577250 (0.0034) [2024-06-24 03:39:08,073][15401] Updated weights for policy 0, policy_version 577260 (0.0042) [2024-06-24 03:39:08,392][15132] Fps is (10 sec: 40966.6, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 9457827840. Throughput: 0: 42551.4. Samples: 9457915140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:08,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 03:39:11,592][15401] Updated weights for policy 0, policy_version 577270 (0.0030) [2024-06-24 03:39:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9458057216. Throughput: 0: 42554.6. Samples: 9458167240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 03:39:16,270][15401] Updated weights for policy 0, policy_version 577280 (0.0043) [2024-06-24 03:39:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9458253824. Throughput: 0: 42609.0. Samples: 9458424840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 03:39:19,215][15401] Updated weights for policy 0, policy_version 577290 (0.0039) [2024-06-24 03:39:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9458466816. Throughput: 0: 42470.3. Samples: 9458550580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 03:39:23,857][15401] Updated weights for policy 0, policy_version 577300 (0.0037) [2024-06-24 03:39:26,608][15349] Signal inference workers to stop experience collection... (140100 times) [2024-06-24 03:39:26,612][15349] Signal inference workers to resume experience collection... (140100 times) [2024-06-24 03:39:26,628][15401] InferenceWorker_p0-w0: stopping experience collection (140100 times) [2024-06-24 03:39:26,659][15401] InferenceWorker_p0-w0: resuming experience collection (140100 times) [2024-06-24 03:39:26,745][15401] Updated weights for policy 0, policy_version 577310 (0.0024) [2024-06-24 03:39:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9458696192. Throughput: 0: 42634.3. Samples: 9458807800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 03:39:31,362][15401] Updated weights for policy 0, policy_version 577320 (0.0028) [2024-06-24 03:39:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9458892800. Throughput: 0: 42783.6. Samples: 9459070040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 03:39:34,297][15401] Updated weights for policy 0, policy_version 577330 (0.0036) [2024-06-24 03:39:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9459122176. Throughput: 0: 42648.3. Samples: 9459195600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:38,390][15132] Avg episode reward: [(0, '0.146')] [2024-06-24 03:39:39,046][15401] Updated weights for policy 0, policy_version 577340 (0.0026) [2024-06-24 03:39:42,133][15401] Updated weights for policy 0, policy_version 577350 (0.0026) [2024-06-24 03:39:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9459335168. Throughput: 0: 42794.9. Samples: 9459454040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:43,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-24 03:39:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577352_9459335168.pth... [2024-06-24 03:39:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000576726_9449078784.pth [2024-06-24 03:39:46,634][15401] Updated weights for policy 0, policy_version 577360 (0.0031) [2024-06-24 03:39:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 9459548160. Throughput: 0: 42907.0. Samples: 9459714840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:48,392][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 03:39:49,777][15401] Updated weights for policy 0, policy_version 577370 (0.0035) [2024-06-24 03:39:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9459761152. Throughput: 0: 42857.8. Samples: 9459843640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-24 03:39:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 03:39:54,117][15401] Updated weights for policy 0, policy_version 577380 (0.0039) [2024-06-24 03:39:57,712][15401] Updated weights for policy 0, policy_version 577390 (0.0030) [2024-06-24 03:39:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42601.2, 300 sec: 42820.6). Total num frames: 9459974144. Throughput: 0: 42929.0. Samples: 9460099040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:39:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 03:40:01,559][15401] Updated weights for policy 0, policy_version 577400 (0.0034) [2024-06-24 03:40:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9460203520. Throughput: 0: 42807.4. Samples: 9460351180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 03:40:05,822][15401] Updated weights for policy 0, policy_version 577410 (0.0040) [2024-06-24 03:40:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 9460416512. Throughput: 0: 43009.7. Samples: 9460486020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 03:40:09,145][15401] Updated weights for policy 0, policy_version 577420 (0.0038) [2024-06-24 03:40:13,263][15401] Updated weights for policy 0, policy_version 577430 (0.0033) [2024-06-24 03:40:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9460613120. Throughput: 0: 43099.2. Samples: 9460747260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:13,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 03:40:16,793][15401] Updated weights for policy 0, policy_version 577440 (0.0032) [2024-06-24 03:40:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9460842496. Throughput: 0: 42797.4. Samples: 9460995920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 03:40:21,093][15401] Updated weights for policy 0, policy_version 577450 (0.0027) [2024-06-24 03:40:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9461055488. Throughput: 0: 43021.5. Samples: 9461131560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 03:40:24,481][15401] Updated weights for policy 0, policy_version 577460 (0.0029) [2024-06-24 03:40:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9461235712. Throughput: 0: 43040.9. Samples: 9461390880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 03:40:28,725][15401] Updated weights for policy 0, policy_version 577470 (0.0041) [2024-06-24 03:40:31,891][15401] Updated weights for policy 0, policy_version 577480 (0.0038) [2024-06-24 03:40:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 9461481472. Throughput: 0: 42803.1. Samples: 9461640980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:33,392][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 03:40:36,275][15401] Updated weights for policy 0, policy_version 577490 (0.0030) [2024-06-24 03:40:38,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9461710848. Throughput: 0: 42944.5. Samples: 9461776140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:38,391][15132] Avg episode reward: [(0, '0.862')] [2024-06-24 03:40:39,405][15401] Updated weights for policy 0, policy_version 577500 (0.0028) [2024-06-24 03:40:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9461891072. Throughput: 0: 43070.2. Samples: 9462037200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:43,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 03:40:43,817][15401] Updated weights for policy 0, policy_version 577510 (0.0027) [2024-06-24 03:40:46,841][15401] Updated weights for policy 0, policy_version 577520 (0.0023) [2024-06-24 03:40:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9462120448. Throughput: 0: 43134.3. Samples: 9462292220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 03:40:51,364][15401] Updated weights for policy 0, policy_version 577530 (0.0035) [2024-06-24 03:40:52,092][15349] Signal inference workers to stop experience collection... (140150 times) [2024-06-24 03:40:52,093][15349] Signal inference workers to resume experience collection... (140150 times) [2024-06-24 03:40:52,141][15401] InferenceWorker_p0-w0: stopping experience collection (140150 times) [2024-06-24 03:40:52,141][15401] InferenceWorker_p0-w0: resuming experience collection (140150 times) [2024-06-24 03:40:53,390][15132] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 9462382592. Throughput: 0: 43209.8. Samples: 9462430460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 03:40:54,671][15401] Updated weights for policy 0, policy_version 577540 (0.0040) [2024-06-24 03:40:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9462546432. Throughput: 0: 43048.4. Samples: 9462684440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:40:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 03:40:58,985][15401] Updated weights for policy 0, policy_version 577550 (0.0025) [2024-06-24 03:41:02,118][15401] Updated weights for policy 0, policy_version 577560 (0.0039) [2024-06-24 03:41:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9462775808. Throughput: 0: 43318.3. Samples: 9462945240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 03:41:06,563][15401] Updated weights for policy 0, policy_version 577570 (0.0025) [2024-06-24 03:41:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 9462988800. Throughput: 0: 43146.2. Samples: 9463073140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 03:41:10,113][15401] Updated weights for policy 0, policy_version 577580 (0.0037) [2024-06-24 03:41:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9463201792. Throughput: 0: 43039.1. Samples: 9463327640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:13,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 03:41:14,154][15401] Updated weights for policy 0, policy_version 577590 (0.0038) [2024-06-24 03:41:17,675][15401] Updated weights for policy 0, policy_version 577600 (0.0039) [2024-06-24 03:41:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9463431168. Throughput: 0: 43026.7. Samples: 9463577080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 03:41:21,644][15401] Updated weights for policy 0, policy_version 577610 (0.0040) [2024-06-24 03:41:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9463627776. Throughput: 0: 43013.8. Samples: 9463711760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 03:41:25,249][15401] Updated weights for policy 0, policy_version 577620 (0.0032) [2024-06-24 03:41:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9463840768. Throughput: 0: 42855.1. Samples: 9463965680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:41:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 03:41:29,377][15401] Updated weights for policy 0, policy_version 577630 (0.0052) [2024-06-24 03:41:32,748][15401] Updated weights for policy 0, policy_version 577640 (0.0028) [2024-06-24 03:41:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 9464070144. Throughput: 0: 42879.5. Samples: 9464221800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 03:41:37,215][15401] Updated weights for policy 0, policy_version 577650 (0.0033) [2024-06-24 03:41:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9464266752. Throughput: 0: 42652.8. Samples: 9464349840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 03:41:40,283][15401] Updated weights for policy 0, policy_version 577660 (0.0031) [2024-06-24 03:41:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9464479744. Throughput: 0: 42771.1. Samples: 9464609140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 03:41:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577667_9464496128.pth... [2024-06-24 03:41:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577039_9454206976.pth [2024-06-24 03:41:44,678][15401] Updated weights for policy 0, policy_version 577670 (0.0037) [2024-06-24 03:41:47,841][15401] Updated weights for policy 0, policy_version 577680 (0.0025) [2024-06-24 03:41:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9464709120. Throughput: 0: 42547.5. Samples: 9464859880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 03:41:52,375][15401] Updated weights for policy 0, policy_version 577690 (0.0029) [2024-06-24 03:41:53,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.7, 300 sec: 42931.3). Total num frames: 9464922112. Throughput: 0: 42677.7. Samples: 9464993740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:53,393][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 03:41:55,473][15401] Updated weights for policy 0, policy_version 577700 (0.0048) [2024-06-24 03:41:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9465118720. Throughput: 0: 42738.2. Samples: 9465250860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:41:58,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-24 03:41:59,774][15401] Updated weights for policy 0, policy_version 577710 (0.0041) [2024-06-24 03:42:03,156][15401] Updated weights for policy 0, policy_version 577720 (0.0038) [2024-06-24 03:42:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 9465364480. Throughput: 0: 42916.4. Samples: 9465508320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:03,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 03:42:07,371][15401] Updated weights for policy 0, policy_version 577730 (0.0028) [2024-06-24 03:42:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9465561088. Throughput: 0: 42860.4. Samples: 9465640480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:08,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-24 03:42:10,146][15349] Signal inference workers to stop experience collection... (140200 times) [2024-06-24 03:42:10,146][15349] Signal inference workers to resume experience collection... (140200 times) [2024-06-24 03:42:10,180][15401] InferenceWorker_p0-w0: stopping experience collection (140200 times) [2024-06-24 03:42:10,180][15401] InferenceWorker_p0-w0: resuming experience collection (140200 times) [2024-06-24 03:42:11,225][15401] Updated weights for policy 0, policy_version 577740 (0.0046) [2024-06-24 03:42:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9465757696. Throughput: 0: 42740.8. Samples: 9465889020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:13,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-24 03:42:15,055][15401] Updated weights for policy 0, policy_version 577750 (0.0033) [2024-06-24 03:42:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9465987072. Throughput: 0: 42926.2. Samples: 9466153480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 03:42:18,829][15401] Updated weights for policy 0, policy_version 577760 (0.0046) [2024-06-24 03:42:22,715][15401] Updated weights for policy 0, policy_version 577770 (0.0030) [2024-06-24 03:42:23,396][15132] Fps is (10 sec: 44209.3, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 9466200064. Throughput: 0: 42974.9. Samples: 9466283980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:23,396][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 03:42:26,455][15401] Updated weights for policy 0, policy_version 577780 (0.0035) [2024-06-24 03:42:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9466413056. Throughput: 0: 42783.1. Samples: 9466534380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 03:42:30,297][15401] Updated weights for policy 0, policy_version 577790 (0.0042) [2024-06-24 03:42:33,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 9466642432. Throughput: 0: 43032.9. Samples: 9466796360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 03:42:33,931][15401] Updated weights for policy 0, policy_version 577800 (0.0029) [2024-06-24 03:42:38,117][15401] Updated weights for policy 0, policy_version 577810 (0.0028) [2024-06-24 03:42:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9466839040. Throughput: 0: 42948.1. Samples: 9466926300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 03:42:41,580][15401] Updated weights for policy 0, policy_version 577820 (0.0043) [2024-06-24 03:42:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9467084800. Throughput: 0: 42982.6. Samples: 9467185080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 03:42:45,688][15401] Updated weights for policy 0, policy_version 577830 (0.0037) [2024-06-24 03:42:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9467297792. Throughput: 0: 42991.7. Samples: 9467442940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 03:42:49,113][15401] Updated weights for policy 0, policy_version 577840 (0.0033) [2024-06-24 03:42:53,289][15401] Updated weights for policy 0, policy_version 577850 (0.0032) [2024-06-24 03:42:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 9467494400. Throughput: 0: 42847.6. Samples: 9467568620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 03:42:56,769][15401] Updated weights for policy 0, policy_version 577860 (0.0044) [2024-06-24 03:42:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9467707392. Throughput: 0: 43144.6. Samples: 9467830520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:42:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 03:43:01,067][15401] Updated weights for policy 0, policy_version 577870 (0.0036) [2024-06-24 03:43:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 9467920384. Throughput: 0: 42805.0. Samples: 9468079700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 03:43:04,578][15401] Updated weights for policy 0, policy_version 577880 (0.0038) [2024-06-24 03:43:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9468133376. Throughput: 0: 42849.5. Samples: 9468211940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:08,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 03:43:08,618][15401] Updated weights for policy 0, policy_version 577890 (0.0032) [2024-06-24 03:43:12,085][15401] Updated weights for policy 0, policy_version 577900 (0.0026) [2024-06-24 03:43:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9468346368. Throughput: 0: 42918.2. Samples: 9468465700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:13,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 03:43:16,445][15401] Updated weights for policy 0, policy_version 577910 (0.0038) [2024-06-24 03:43:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9468559360. Throughput: 0: 42865.6. Samples: 9468725320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 03:43:19,691][15401] Updated weights for policy 0, policy_version 577920 (0.0029) [2024-06-24 03:43:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42820.9). Total num frames: 9468755968. Throughput: 0: 42793.3. Samples: 9468852000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:23,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 03:43:24,335][15401] Updated weights for policy 0, policy_version 577930 (0.0032) [2024-06-24 03:43:27,653][15401] Updated weights for policy 0, policy_version 577940 (0.0047) [2024-06-24 03:43:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9469001728. Throughput: 0: 42654.2. Samples: 9469104520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:28,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 03:43:32,139][15401] Updated weights for policy 0, policy_version 577950 (0.0031) [2024-06-24 03:43:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9469198336. Throughput: 0: 42593.4. Samples: 9469359640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 03:43:35,378][15401] Updated weights for policy 0, policy_version 577960 (0.0031) [2024-06-24 03:43:38,391][15132] Fps is (10 sec: 39315.9, 60 sec: 42597.3, 300 sec: 42820.4). Total num frames: 9469394944. Throughput: 0: 42730.6. Samples: 9469491560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:38,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 03:43:38,506][15349] Signal inference workers to stop experience collection... (140250 times) [2024-06-24 03:43:38,533][15401] InferenceWorker_p0-w0: stopping experience collection (140250 times) [2024-06-24 03:43:38,569][15349] Signal inference workers to resume experience collection... (140250 times) [2024-06-24 03:43:38,569][15401] InferenceWorker_p0-w0: resuming experience collection (140250 times) [2024-06-24 03:43:39,662][15401] Updated weights for policy 0, policy_version 577970 (0.0044) [2024-06-24 03:43:42,911][15401] Updated weights for policy 0, policy_version 577980 (0.0035) [2024-06-24 03:43:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 9469624320. Throughput: 0: 42587.4. Samples: 9469746960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:43,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 03:43:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577981_9469640704.pth... [2024-06-24 03:43:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577352_9459335168.pth [2024-06-24 03:43:47,216][15401] Updated weights for policy 0, policy_version 577990 (0.0032) [2024-06-24 03:43:48,390][15132] Fps is (10 sec: 44243.0, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 9469837312. Throughput: 0: 42668.3. Samples: 9469999780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 03:43:50,856][15401] Updated weights for policy 0, policy_version 578000 (0.0045) [2024-06-24 03:43:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42821.1). Total num frames: 9470050304. Throughput: 0: 42531.2. Samples: 9470125840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 03:43:55,022][15401] Updated weights for policy 0, policy_version 578010 (0.0028) [2024-06-24 03:43:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9470263296. Throughput: 0: 42744.5. Samples: 9470389200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:43:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 03:43:58,803][15401] Updated weights for policy 0, policy_version 578020 (0.0028) [2024-06-24 03:44:02,501][15401] Updated weights for policy 0, policy_version 578030 (0.0029) [2024-06-24 03:44:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 9470459904. Throughput: 0: 42588.6. Samples: 9470641800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 03:44:06,342][15401] Updated weights for policy 0, policy_version 578040 (0.0031) [2024-06-24 03:44:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9470689280. Throughput: 0: 42684.1. Samples: 9470772780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 03:44:09,985][15401] Updated weights for policy 0, policy_version 578050 (0.0035) [2024-06-24 03:44:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9470902272. Throughput: 0: 42676.4. Samples: 9471024960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 03:44:14,005][15401] Updated weights for policy 0, policy_version 578060 (0.0044) [2024-06-24 03:44:17,534][15401] Updated weights for policy 0, policy_version 578070 (0.0036) [2024-06-24 03:44:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9471115264. Throughput: 0: 42661.6. Samples: 9471279420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 03:44:21,721][15401] Updated weights for policy 0, policy_version 578080 (0.0038) [2024-06-24 03:44:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9471311872. Throughput: 0: 42648.2. Samples: 9471410660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:44:25,043][15401] Updated weights for policy 0, policy_version 578090 (0.0041) [2024-06-24 03:44:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9471541248. Throughput: 0: 42714.7. Samples: 9471669120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 03:44:29,322][15401] Updated weights for policy 0, policy_version 578100 (0.0030) [2024-06-24 03:44:32,715][15401] Updated weights for policy 0, policy_version 578110 (0.0029) [2024-06-24 03:44:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9471770624. Throughput: 0: 42742.2. Samples: 9471923180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 03:44:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 03:44:37,002][15401] Updated weights for policy 0, policy_version 578120 (0.0039) [2024-06-24 03:44:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42872.6, 300 sec: 42820.6). Total num frames: 9471967232. Throughput: 0: 42904.9. Samples: 9472056560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:44:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 03:44:40,411][15401] Updated weights for policy 0, policy_version 578130 (0.0035) [2024-06-24 03:44:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9472180224. Throughput: 0: 42550.9. Samples: 9472304000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:44:43,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 03:44:44,854][15401] Updated weights for policy 0, policy_version 578140 (0.0034) [2024-06-24 03:44:48,186][15401] Updated weights for policy 0, policy_version 578150 (0.0033) [2024-06-24 03:44:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9472409600. Throughput: 0: 42680.9. Samples: 9472562440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:44:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 03:44:52,453][15401] Updated weights for policy 0, policy_version 578160 (0.0043) [2024-06-24 03:44:53,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 9472622592. Throughput: 0: 42698.5. Samples: 9472694320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:44:53,404][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 03:44:55,901][15401] Updated weights for policy 0, policy_version 578170 (0.0048) [2024-06-24 03:44:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9472802816. Throughput: 0: 42669.5. Samples: 9472945080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:44:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 03:45:00,159][15401] Updated weights for policy 0, policy_version 578180 (0.0037) [2024-06-24 03:45:02,920][15349] Signal inference workers to stop experience collection... (140300 times) [2024-06-24 03:45:02,922][15349] Signal inference workers to resume experience collection... (140300 times) [2024-06-24 03:45:02,937][15401] InferenceWorker_p0-w0: stopping experience collection (140300 times) [2024-06-24 03:45:02,937][15401] InferenceWorker_p0-w0: resuming experience collection (140300 times) [2024-06-24 03:45:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9473048576. Throughput: 0: 42728.1. Samples: 9473202180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 03:45:03,504][15401] Updated weights for policy 0, policy_version 578190 (0.0041) [2024-06-24 03:45:07,933][15401] Updated weights for policy 0, policy_version 578200 (0.0039) [2024-06-24 03:45:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9473245184. Throughput: 0: 42716.0. Samples: 9473332880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 03:45:11,080][15401] Updated weights for policy 0, policy_version 578210 (0.0031) [2024-06-24 03:45:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9473441792. Throughput: 0: 42489.9. Samples: 9473581160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 03:45:15,468][15401] Updated weights for policy 0, policy_version 578220 (0.0026) [2024-06-24 03:45:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9473671168. Throughput: 0: 42594.8. Samples: 9473839940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 03:45:18,762][15401] Updated weights for policy 0, policy_version 578230 (0.0037) [2024-06-24 03:45:23,147][15401] Updated weights for policy 0, policy_version 578240 (0.0024) [2024-06-24 03:45:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9473884160. Throughput: 0: 42563.6. Samples: 9473971920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 03:45:26,307][15401] Updated weights for policy 0, policy_version 578250 (0.0031) [2024-06-24 03:45:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9474080768. Throughput: 0: 42639.6. Samples: 9474222780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 03:45:30,883][15401] Updated weights for policy 0, policy_version 578260 (0.0027) [2024-06-24 03:45:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9474326528. Throughput: 0: 42605.2. Samples: 9474479680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:33,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 03:45:34,101][15401] Updated weights for policy 0, policy_version 578270 (0.0044) [2024-06-24 03:45:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9474523136. Throughput: 0: 42665.9. Samples: 9474614180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:38,391][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 03:45:38,494][15401] Updated weights for policy 0, policy_version 578280 (0.0028) [2024-06-24 03:45:41,613][15401] Updated weights for policy 0, policy_version 578290 (0.0041) [2024-06-24 03:45:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9474736128. Throughput: 0: 42521.2. Samples: 9474858540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 03:45:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578292_9474736128.pth... [2024-06-24 03:45:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577667_9464496128.pth [2024-06-24 03:45:46,133][15401] Updated weights for policy 0, policy_version 578300 (0.0037) [2024-06-24 03:45:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9474949120. Throughput: 0: 42649.8. Samples: 9475121420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 03:45:49,337][15401] Updated weights for policy 0, policy_version 578310 (0.0032) [2024-06-24 03:45:53,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9475178496. Throughput: 0: 42812.0. Samples: 9475259420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:53,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 03:45:53,523][15401] Updated weights for policy 0, policy_version 578320 (0.0041) [2024-06-24 03:45:56,917][15401] Updated weights for policy 0, policy_version 578330 (0.0028) [2024-06-24 03:45:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9475375104. Throughput: 0: 42730.7. Samples: 9475504040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:45:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 03:46:01,588][15401] Updated weights for policy 0, policy_version 578340 (0.0030) [2024-06-24 03:46:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9475571712. Throughput: 0: 42748.5. Samples: 9475763620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 03:46:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 03:46:04,883][15401] Updated weights for policy 0, policy_version 578350 (0.0045) [2024-06-24 03:46:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 9475784704. Throughput: 0: 42643.1. Samples: 9475890860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:46:09,166][15401] Updated weights for policy 0, policy_version 578360 (0.0039) [2024-06-24 03:46:12,345][15401] Updated weights for policy 0, policy_version 578370 (0.0031) [2024-06-24 03:46:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9476030464. Throughput: 0: 42612.1. Samples: 9476140320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 03:46:15,978][15349] Signal inference workers to stop experience collection... (140350 times) [2024-06-24 03:46:16,006][15401] InferenceWorker_p0-w0: stopping experience collection (140350 times) [2024-06-24 03:46:16,025][15349] Signal inference workers to resume experience collection... (140350 times) [2024-06-24 03:46:16,029][15401] InferenceWorker_p0-w0: resuming experience collection (140350 times) [2024-06-24 03:46:16,850][15401] Updated weights for policy 0, policy_version 578380 (0.0035) [2024-06-24 03:46:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9476227072. Throughput: 0: 42716.1. Samples: 9476401900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 03:46:20,642][15401] Updated weights for policy 0, policy_version 578390 (0.0027) [2024-06-24 03:46:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9476440064. Throughput: 0: 42600.8. Samples: 9476531220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 03:46:24,408][15401] Updated weights for policy 0, policy_version 578400 (0.0029) [2024-06-24 03:46:28,207][15401] Updated weights for policy 0, policy_version 578410 (0.0038) [2024-06-24 03:46:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9476669440. Throughput: 0: 42761.3. Samples: 9476782800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 03:46:32,412][15401] Updated weights for policy 0, policy_version 578420 (0.0024) [2024-06-24 03:46:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9476866048. Throughput: 0: 42671.1. Samples: 9477041620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 03:46:35,955][15401] Updated weights for policy 0, policy_version 578430 (0.0038) [2024-06-24 03:46:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9477095424. Throughput: 0: 42328.3. Samples: 9477164200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 03:46:40,066][15401] Updated weights for policy 0, policy_version 578440 (0.0026) [2024-06-24 03:46:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 9477308416. Throughput: 0: 42547.4. Samples: 9477418680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 03:46:43,540][15401] Updated weights for policy 0, policy_version 578450 (0.0028) [2024-06-24 03:46:48,195][15401] Updated weights for policy 0, policy_version 578460 (0.0043) [2024-06-24 03:46:48,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 9477488640. Throughput: 0: 42501.4. Samples: 9477676180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 03:46:51,170][15401] Updated weights for policy 0, policy_version 578470 (0.0038) [2024-06-24 03:46:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9477734400. Throughput: 0: 42391.9. Samples: 9477798500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 03:46:55,735][15401] Updated weights for policy 0, policy_version 578480 (0.0029) [2024-06-24 03:46:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9477931008. Throughput: 0: 42529.4. Samples: 9478054140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:46:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 03:46:59,037][15401] Updated weights for policy 0, policy_version 578490 (0.0026) [2024-06-24 03:47:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9478127616. Throughput: 0: 42429.8. Samples: 9478311240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 03:47:03,457][15401] Updated weights for policy 0, policy_version 578500 (0.0033) [2024-06-24 03:47:06,817][15401] Updated weights for policy 0, policy_version 578510 (0.0023) [2024-06-24 03:47:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9478356992. Throughput: 0: 42407.2. Samples: 9478439540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 03:47:10,969][15401] Updated weights for policy 0, policy_version 578520 (0.0044) [2024-06-24 03:47:13,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 9478586368. Throughput: 0: 42560.7. Samples: 9478698300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:13,397][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 03:47:14,492][15401] Updated weights for policy 0, policy_version 578530 (0.0034) [2024-06-24 03:47:18,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 9478782976. Throughput: 0: 42496.3. Samples: 9478953960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 03:47:18,805][15401] Updated weights for policy 0, policy_version 578540 (0.0033) [2024-06-24 03:47:22,061][15401] Updated weights for policy 0, policy_version 578550 (0.0040) [2024-06-24 03:47:23,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9478995968. Throughput: 0: 42610.2. Samples: 9479081660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 03:47:26,310][15401] Updated weights for policy 0, policy_version 578560 (0.0039) [2024-06-24 03:47:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9479208960. Throughput: 0: 42619.2. Samples: 9479336540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 03:47:29,796][15401] Updated weights for policy 0, policy_version 578570 (0.0036) [2024-06-24 03:47:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9479421952. Throughput: 0: 42749.2. Samples: 9479599900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 03:47:33,999][15401] Updated weights for policy 0, policy_version 578580 (0.0033) [2024-06-24 03:47:37,541][15401] Updated weights for policy 0, policy_version 578590 (0.0023) [2024-06-24 03:47:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9479651328. Throughput: 0: 42704.0. Samples: 9479720180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 03:47:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 03:47:41,593][15401] Updated weights for policy 0, policy_version 578600 (0.0025) [2024-06-24 03:47:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 9479847936. Throughput: 0: 42716.8. Samples: 9479976400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:47:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 03:47:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578605_9479864320.pth... [2024-06-24 03:47:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000577981_9469640704.pth [2024-06-24 03:47:45,161][15401] Updated weights for policy 0, policy_version 578610 (0.0033) [2024-06-24 03:47:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9480077312. Throughput: 0: 42634.3. Samples: 9480229780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:47:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 03:47:49,133][15401] Updated weights for policy 0, policy_version 578620 (0.0031) [2024-06-24 03:47:50,426][15349] Signal inference workers to stop experience collection... (140400 times) [2024-06-24 03:47:50,426][15349] Signal inference workers to resume experience collection... (140400 times) [2024-06-24 03:47:50,452][15401] InferenceWorker_p0-w0: stopping experience collection (140400 times) [2024-06-24 03:47:50,452][15401] InferenceWorker_p0-w0: resuming experience collection (140400 times) [2024-06-24 03:47:52,961][15401] Updated weights for policy 0, policy_version 578630 (0.0030) [2024-06-24 03:47:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9480290304. Throughput: 0: 42612.3. Samples: 9480357100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:47:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 03:47:56,652][15401] Updated weights for policy 0, policy_version 578640 (0.0029) [2024-06-24 03:47:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9480503296. Throughput: 0: 42725.1. Samples: 9480620660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:47:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 03:48:00,551][15401] Updated weights for policy 0, policy_version 578650 (0.0039) [2024-06-24 03:48:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9480699904. Throughput: 0: 42688.7. Samples: 9480874940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 03:48:04,710][15401] Updated weights for policy 0, policy_version 578660 (0.0039) [2024-06-24 03:48:08,377][15401] Updated weights for policy 0, policy_version 578670 (0.0038) [2024-06-24 03:48:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9480929280. Throughput: 0: 42620.9. Samples: 9480999600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 03:48:12,263][15401] Updated weights for policy 0, policy_version 578680 (0.0027) [2024-06-24 03:48:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 9481142272. Throughput: 0: 42743.0. Samples: 9481259980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:13,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-24 03:48:15,759][15401] Updated weights for policy 0, policy_version 578690 (0.0038) [2024-06-24 03:48:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9481338880. Throughput: 0: 42568.3. Samples: 9481515480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:48:20,142][15401] Updated weights for policy 0, policy_version 578700 (0.0050) [2024-06-24 03:48:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9481568256. Throughput: 0: 42676.5. Samples: 9481640620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 03:48:23,430][15401] Updated weights for policy 0, policy_version 578710 (0.0035) [2024-06-24 03:48:27,561][15401] Updated weights for policy 0, policy_version 578720 (0.0032) [2024-06-24 03:48:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9481781248. Throughput: 0: 42784.0. Samples: 9481901680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 03:48:31,246][15401] Updated weights for policy 0, policy_version 578730 (0.0044) [2024-06-24 03:48:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 9481977856. Throughput: 0: 42773.3. Samples: 9482154580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 03:48:35,415][15401] Updated weights for policy 0, policy_version 578740 (0.0029) [2024-06-24 03:48:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9482190848. Throughput: 0: 42596.6. Samples: 9482273940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 03:48:38,932][15401] Updated weights for policy 0, policy_version 578750 (0.0038) [2024-06-24 03:48:43,160][15401] Updated weights for policy 0, policy_version 578760 (0.0022) [2024-06-24 03:48:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9482403840. Throughput: 0: 42443.7. Samples: 9482530620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 03:48:46,961][15401] Updated weights for policy 0, policy_version 578770 (0.0050) [2024-06-24 03:48:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9482616832. Throughput: 0: 42676.4. Samples: 9482795380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 03:48:50,796][15401] Updated weights for policy 0, policy_version 578780 (0.0029) [2024-06-24 03:48:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9482846208. Throughput: 0: 42613.9. Samples: 9482917220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 03:48:54,664][15401] Updated weights for policy 0, policy_version 578790 (0.0034) [2024-06-24 03:48:58,287][15401] Updated weights for policy 0, policy_version 578800 (0.0031) [2024-06-24 03:48:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9483059200. Throughput: 0: 42503.1. Samples: 9483172620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:48:58,400][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 03:49:02,320][15401] Updated weights for policy 0, policy_version 578810 (0.0031) [2024-06-24 03:49:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9483255808. Throughput: 0: 42510.4. Samples: 9483428440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:49:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 03:49:06,244][15401] Updated weights for policy 0, policy_version 578820 (0.0033) [2024-06-24 03:49:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9483501568. Throughput: 0: 42608.7. Samples: 9483558020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 03:49:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 03:49:10,523][15401] Updated weights for policy 0, policy_version 578830 (0.0028) [2024-06-24 03:49:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9483681792. Throughput: 0: 42479.2. Samples: 9483813240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 03:49:13,813][15349] Signal inference workers to stop experience collection... (140450 times) [2024-06-24 03:49:13,813][15349] Signal inference workers to resume experience collection... (140450 times) [2024-06-24 03:49:13,838][15401] InferenceWorker_p0-w0: stopping experience collection (140450 times) [2024-06-24 03:49:13,838][15401] InferenceWorker_p0-w0: resuming experience collection (140450 times) [2024-06-24 03:49:13,972][15401] Updated weights for policy 0, policy_version 578840 (0.0027) [2024-06-24 03:49:18,322][15401] Updated weights for policy 0, policy_version 578850 (0.0028) [2024-06-24 03:49:18,390][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9483878400. Throughput: 0: 42587.1. Samples: 9484071000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 03:49:21,467][15401] Updated weights for policy 0, policy_version 578860 (0.0028) [2024-06-24 03:49:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9484124160. Throughput: 0: 42627.0. Samples: 9484192160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 03:49:26,242][15401] Updated weights for policy 0, policy_version 578870 (0.0041) [2024-06-24 03:49:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9484337152. Throughput: 0: 42683.1. Samples: 9484451360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 03:49:29,125][15401] Updated weights for policy 0, policy_version 578880 (0.0041) [2024-06-24 03:49:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9484517376. Throughput: 0: 42576.8. Samples: 9484711340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 03:49:33,905][15401] Updated weights for policy 0, policy_version 578890 (0.0030) [2024-06-24 03:49:36,931][15401] Updated weights for policy 0, policy_version 578900 (0.0034) [2024-06-24 03:49:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9484779520. Throughput: 0: 42574.5. Samples: 9484833080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 03:49:41,611][15401] Updated weights for policy 0, policy_version 578910 (0.0035) [2024-06-24 03:49:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9484992512. Throughput: 0: 42663.6. Samples: 9485092480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:43,399][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 03:49:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578918_9484992512.pth... [2024-06-24 03:49:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578292_9474736128.pth [2024-06-24 03:49:44,508][15401] Updated weights for policy 0, policy_version 578920 (0.0033) [2024-06-24 03:49:48,392][15132] Fps is (10 sec: 37675.3, 60 sec: 42323.7, 300 sec: 42487.4). Total num frames: 9485156352. Throughput: 0: 42652.1. Samples: 9485347880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:48,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 03:49:49,215][15401] Updated weights for policy 0, policy_version 578930 (0.0035) [2024-06-24 03:49:52,142][15401] Updated weights for policy 0, policy_version 578940 (0.0022) [2024-06-24 03:49:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9485402112. Throughput: 0: 42448.2. Samples: 9485468180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 03:49:56,731][15401] Updated weights for policy 0, policy_version 578950 (0.0029) [2024-06-24 03:49:58,390][15132] Fps is (10 sec: 47523.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9485631488. Throughput: 0: 42768.8. Samples: 9485737840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:49:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 03:49:59,537][15401] Updated weights for policy 0, policy_version 578960 (0.0034) [2024-06-24 03:50:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9485811712. Throughput: 0: 42652.5. Samples: 9485990360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 03:50:04,221][15401] Updated weights for policy 0, policy_version 578970 (0.0048) [2024-06-24 03:50:07,227][15401] Updated weights for policy 0, policy_version 578980 (0.0046) [2024-06-24 03:50:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9486057472. Throughput: 0: 42726.2. Samples: 9486114840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 03:50:08,926][15349] Signal inference workers to stop experience collection... (140500 times) [2024-06-24 03:50:08,927][15349] Signal inference workers to resume experience collection... (140500 times) [2024-06-24 03:50:08,960][15401] InferenceWorker_p0-w0: stopping experience collection (140500 times) [2024-06-24 03:50:08,990][15401] InferenceWorker_p0-w0: resuming experience collection (140500 times) [2024-06-24 03:50:11,829][15401] Updated weights for policy 0, policy_version 578990 (0.0032) [2024-06-24 03:50:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9486270464. Throughput: 0: 42882.7. Samples: 9486381080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 03:50:14,852][15401] Updated weights for policy 0, policy_version 579000 (0.0026) [2024-06-24 03:50:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9486450688. Throughput: 0: 42718.2. Samples: 9486633660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 03:50:19,534][15401] Updated weights for policy 0, policy_version 579010 (0.0032) [2024-06-24 03:50:22,387][15401] Updated weights for policy 0, policy_version 579020 (0.0036) [2024-06-24 03:50:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9486696448. Throughput: 0: 42723.7. Samples: 9486755640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:23,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 03:50:27,396][15401] Updated weights for policy 0, policy_version 579030 (0.0042) [2024-06-24 03:50:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9486893056. Throughput: 0: 42884.5. Samples: 9487022280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 03:50:30,034][15401] Updated weights for policy 0, policy_version 579040 (0.0025) [2024-06-24 03:50:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9487089664. Throughput: 0: 42611.3. Samples: 9487265300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 03:50:34,964][15401] Updated weights for policy 0, policy_version 579050 (0.0041) [2024-06-24 03:50:37,655][15401] Updated weights for policy 0, policy_version 579060 (0.0035) [2024-06-24 03:50:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9487335424. Throughput: 0: 42673.7. Samples: 9487388500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 03:50:42,507][15401] Updated weights for policy 0, policy_version 579070 (0.0039) [2024-06-24 03:50:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9487532032. Throughput: 0: 42542.0. Samples: 9487652220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 03:50:43,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 03:50:45,392][15401] Updated weights for policy 0, policy_version 579080 (0.0044) [2024-06-24 03:50:48,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 9487712256. Throughput: 0: 42631.6. Samples: 9487908780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:50:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 03:50:50,196][15401] Updated weights for policy 0, policy_version 579090 (0.0023) [2024-06-24 03:50:52,927][15401] Updated weights for policy 0, policy_version 579100 (0.0039) [2024-06-24 03:50:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9487974400. Throughput: 0: 42556.9. Samples: 9488029900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:50:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 03:50:57,918][15401] Updated weights for policy 0, policy_version 579110 (0.0033) [2024-06-24 03:50:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9488171008. Throughput: 0: 42559.5. Samples: 9488296260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:50:58,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 03:51:00,857][15401] Updated weights for policy 0, policy_version 579120 (0.0032) [2024-06-24 03:51:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9488367616. Throughput: 0: 42423.2. Samples: 9488542700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 03:51:05,393][15401] Updated weights for policy 0, policy_version 579130 (0.0053) [2024-06-24 03:51:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9488596992. Throughput: 0: 42562.2. Samples: 9488670940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 03:51:08,568][15401] Updated weights for policy 0, policy_version 579140 (0.0032) [2024-06-24 03:51:12,981][15401] Updated weights for policy 0, policy_version 579150 (0.0038) [2024-06-24 03:51:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 9488793600. Throughput: 0: 42282.1. Samples: 9488924980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 03:51:15,713][15349] Signal inference workers to stop experience collection... (140550 times) [2024-06-24 03:51:15,769][15401] InferenceWorker_p0-w0: stopping experience collection (140550 times) [2024-06-24 03:51:15,777][15349] Signal inference workers to resume experience collection... (140550 times) [2024-06-24 03:51:15,779][15401] InferenceWorker_p0-w0: resuming experience collection (140550 times) [2024-06-24 03:51:16,597][15401] Updated weights for policy 0, policy_version 579160 (0.0035) [2024-06-24 03:51:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9489022976. Throughput: 0: 42405.0. Samples: 9489173520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:18,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 03:51:20,802][15401] Updated weights for policy 0, policy_version 579170 (0.0031) [2024-06-24 03:51:23,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42051.3, 300 sec: 42542.7). Total num frames: 9489219584. Throughput: 0: 42546.9. Samples: 9489303160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:23,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 03:51:24,138][15401] Updated weights for policy 0, policy_version 579180 (0.0035) [2024-06-24 03:51:28,226][15401] Updated weights for policy 0, policy_version 579190 (0.0033) [2024-06-24 03:51:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9489448960. Throughput: 0: 42501.2. Samples: 9489564780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:28,394][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 03:51:31,767][15401] Updated weights for policy 0, policy_version 579200 (0.0030) [2024-06-24 03:51:33,390][15132] Fps is (10 sec: 44242.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9489661952. Throughput: 0: 42407.8. Samples: 9489817140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 03:51:35,837][15401] Updated weights for policy 0, policy_version 579210 (0.0027) [2024-06-24 03:51:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 9489858560. Throughput: 0: 42590.9. Samples: 9489946500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 03:51:39,474][15401] Updated weights for policy 0, policy_version 579220 (0.0031) [2024-06-24 03:51:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 9490087936. Throughput: 0: 42442.1. Samples: 9490206160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 03:51:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579229_9490087936.pth... [2024-06-24 03:51:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578605_9479864320.pth [2024-06-24 03:51:43,641][15401] Updated weights for policy 0, policy_version 579230 (0.0038) [2024-06-24 03:51:47,212][15401] Updated weights for policy 0, policy_version 579240 (0.0034) [2024-06-24 03:51:48,389][15132] Fps is (10 sec: 44238.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 9490300928. Throughput: 0: 42545.4. Samples: 9490457240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 03:51:51,226][15401] Updated weights for policy 0, policy_version 579250 (0.0039) [2024-06-24 03:51:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9490497536. Throughput: 0: 42599.0. Samples: 9490587900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 03:51:54,847][15401] Updated weights for policy 0, policy_version 579260 (0.0045) [2024-06-24 03:51:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9490726912. Throughput: 0: 42744.2. Samples: 9490848460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:51:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 03:51:58,710][15401] Updated weights for policy 0, policy_version 579270 (0.0032) [2024-06-24 03:52:02,705][15401] Updated weights for policy 0, policy_version 579280 (0.0033) [2024-06-24 03:52:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9490956288. Throughput: 0: 42694.7. Samples: 9491094780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:52:03,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 03:52:06,498][15401] Updated weights for policy 0, policy_version 579290 (0.0038) [2024-06-24 03:52:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42543.4). Total num frames: 9491136512. Throughput: 0: 42593.6. Samples: 9491219920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:52:08,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 03:52:10,262][15401] Updated weights for policy 0, policy_version 579300 (0.0039) [2024-06-24 03:52:13,390][15132] Fps is (10 sec: 39320.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9491349504. Throughput: 0: 42556.6. Samples: 9491479840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 03:52:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 03:52:14,407][15401] Updated weights for policy 0, policy_version 579310 (0.0032) [2024-06-24 03:52:17,871][15401] Updated weights for policy 0, policy_version 579320 (0.0039) [2024-06-24 03:52:18,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9491595264. Throughput: 0: 42467.2. Samples: 9491728160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 03:52:21,950][15401] Updated weights for policy 0, policy_version 579330 (0.0042) [2024-06-24 03:52:23,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42599.3, 300 sec: 42598.4). Total num frames: 9491775488. Throughput: 0: 42621.1. Samples: 9491864440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:23,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 03:52:25,681][15401] Updated weights for policy 0, policy_version 579340 (0.0031) [2024-06-24 03:52:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9492004864. Throughput: 0: 42457.4. Samples: 9492116740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:28,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 03:52:29,755][15401] Updated weights for policy 0, policy_version 579350 (0.0036) [2024-06-24 03:52:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9492217856. Throughput: 0: 42647.5. Samples: 9492376380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 03:52:33,406][15401] Updated weights for policy 0, policy_version 579360 (0.0036) [2024-06-24 03:52:37,691][15401] Updated weights for policy 0, policy_version 579370 (0.0039) [2024-06-24 03:52:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9492430848. Throughput: 0: 42613.3. Samples: 9492505500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 03:52:40,149][15349] Signal inference workers to stop experience collection... (140600 times) [2024-06-24 03:52:40,194][15401] InferenceWorker_p0-w0: stopping experience collection (140600 times) [2024-06-24 03:52:40,201][15349] Signal inference workers to resume experience collection... (140600 times) [2024-06-24 03:52:40,205][15401] InferenceWorker_p0-w0: resuming experience collection (140600 times) [2024-06-24 03:52:41,067][15401] Updated weights for policy 0, policy_version 579380 (0.0041) [2024-06-24 03:52:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 9492643840. Throughput: 0: 42532.5. Samples: 9492762420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 03:52:45,239][15401] Updated weights for policy 0, policy_version 579390 (0.0026) [2024-06-24 03:52:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9492856832. Throughput: 0: 42704.0. Samples: 9493016460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 03:52:48,732][15401] Updated weights for policy 0, policy_version 579400 (0.0036) [2024-06-24 03:52:52,982][15401] Updated weights for policy 0, policy_version 579410 (0.0032) [2024-06-24 03:52:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 9493069824. Throughput: 0: 42802.2. Samples: 9493146020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:53,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 03:52:56,308][15401] Updated weights for policy 0, policy_version 579420 (0.0029) [2024-06-24 03:52:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9493266432. Throughput: 0: 42611.5. Samples: 9493397340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:52:58,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 03:53:00,440][15401] Updated weights for policy 0, policy_version 579430 (0.0029) [2024-06-24 03:53:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9493495808. Throughput: 0: 43026.8. Samples: 9493664360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 03:53:04,026][15401] Updated weights for policy 0, policy_version 579440 (0.0035) [2024-06-24 03:53:08,052][15401] Updated weights for policy 0, policy_version 579450 (0.0035) [2024-06-24 03:53:08,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 9493708800. Throughput: 0: 42825.1. Samples: 9493791580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 03:53:11,646][15401] Updated weights for policy 0, policy_version 579460 (0.0035) [2024-06-24 03:53:13,390][15132] Fps is (10 sec: 42594.2, 60 sec: 42871.0, 300 sec: 42653.8). Total num frames: 9493921792. Throughput: 0: 42745.8. Samples: 9494040340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:13,391][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 03:53:15,863][15401] Updated weights for policy 0, policy_version 579470 (0.0025) [2024-06-24 03:53:18,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9494134784. Throughput: 0: 42928.4. Samples: 9494308160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 03:53:19,251][15401] Updated weights for policy 0, policy_version 579480 (0.0035) [2024-06-24 03:53:23,389][15132] Fps is (10 sec: 42602.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9494347776. Throughput: 0: 42825.5. Samples: 9494432640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 03:53:23,414][15401] Updated weights for policy 0, policy_version 579490 (0.0035) [2024-06-24 03:53:26,919][15401] Updated weights for policy 0, policy_version 579500 (0.0037) [2024-06-24 03:53:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9494577152. Throughput: 0: 42733.3. Samples: 9494685420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 03:53:31,029][15401] Updated weights for policy 0, policy_version 579510 (0.0046) [2024-06-24 03:53:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 9494740992. Throughput: 0: 42969.8. Samples: 9494950100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 03:53:34,887][15401] Updated weights for policy 0, policy_version 579520 (0.0033) [2024-06-24 03:53:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9495003136. Throughput: 0: 42752.1. Samples: 9495069760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 03:53:38,610][15401] Updated weights for policy 0, policy_version 579530 (0.0054) [2024-06-24 03:53:42,771][15401] Updated weights for policy 0, policy_version 579540 (0.0030) [2024-06-24 03:53:43,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9495216128. Throughput: 0: 42917.8. Samples: 9495328640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 03:53:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 03:53:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579543_9495232512.pth... [2024-06-24 03:53:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000578918_9484992512.pth [2024-06-24 03:53:46,140][15401] Updated weights for policy 0, policy_version 579550 (0.0032) [2024-06-24 03:53:48,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 9495379968. Throughput: 0: 42675.8. Samples: 9495584780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:53:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 03:53:50,280][15401] Updated weights for policy 0, policy_version 579560 (0.0063) [2024-06-24 03:53:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 9495658496. Throughput: 0: 42470.0. Samples: 9495702720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:53:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 03:53:53,975][15401] Updated weights for policy 0, policy_version 579570 (0.0030) [2024-06-24 03:53:55,957][15349] Signal inference workers to stop experience collection... (140650 times) [2024-06-24 03:53:55,959][15349] Signal inference workers to resume experience collection... (140650 times) [2024-06-24 03:53:55,983][15401] InferenceWorker_p0-w0: stopping experience collection (140650 times) [2024-06-24 03:53:55,983][15401] InferenceWorker_p0-w0: resuming experience collection (140650 times) [2024-06-24 03:53:57,865][15401] Updated weights for policy 0, policy_version 579580 (0.0023) [2024-06-24 03:53:58,389][15132] Fps is (10 sec: 47514.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9495855104. Throughput: 0: 42997.8. Samples: 9495975200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:53:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 03:54:01,356][15401] Updated weights for policy 0, policy_version 579590 (0.0038) [2024-06-24 03:54:03,390][15132] Fps is (10 sec: 36044.1, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 9496018944. Throughput: 0: 42742.5. Samples: 9496231580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 03:54:05,397][15401] Updated weights for policy 0, policy_version 579600 (0.0039) [2024-06-24 03:54:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 9496313856. Throughput: 0: 42695.0. Samples: 9496353920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 03:54:08,864][15401] Updated weights for policy 0, policy_version 579610 (0.0046) [2024-06-24 03:54:13,194][15401] Updated weights for policy 0, policy_version 579620 (0.0035) [2024-06-24 03:54:13,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42872.1, 300 sec: 42765.0). Total num frames: 9496494080. Throughput: 0: 42990.5. Samples: 9496620000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:13,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 03:54:16,916][15401] Updated weights for policy 0, policy_version 579630 (0.0032) [2024-06-24 03:54:18,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9496674304. Throughput: 0: 42839.5. Samples: 9496877880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 03:54:20,661][15401] Updated weights for policy 0, policy_version 579640 (0.0030) [2024-06-24 03:54:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 9496952832. Throughput: 0: 42832.8. Samples: 9496997240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 03:54:24,407][15401] Updated weights for policy 0, policy_version 579650 (0.0029) [2024-06-24 03:54:28,103][15401] Updated weights for policy 0, policy_version 579660 (0.0037) [2024-06-24 03:54:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9497149440. Throughput: 0: 43007.6. Samples: 9497263980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:28,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 03:54:32,501][15401] Updated weights for policy 0, policy_version 579670 (0.0040) [2024-06-24 03:54:33,389][15132] Fps is (10 sec: 36045.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9497313280. Throughput: 0: 43096.2. Samples: 9497524100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 03:54:35,653][15401] Updated weights for policy 0, policy_version 579680 (0.0024) [2024-06-24 03:54:38,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 9497608192. Throughput: 0: 43139.0. Samples: 9497644080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:38,393][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 03:54:39,960][15401] Updated weights for policy 0, policy_version 579690 (0.0027) [2024-06-24 03:54:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9497788416. Throughput: 0: 42944.8. Samples: 9497907720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 03:54:43,541][15401] Updated weights for policy 0, policy_version 579700 (0.0048) [2024-06-24 03:54:47,458][15401] Updated weights for policy 0, policy_version 579710 (0.0043) [2024-06-24 03:54:48,389][15132] Fps is (10 sec: 36054.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 9497968640. Throughput: 0: 43057.6. Samples: 9498169160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 03:54:51,177][15401] Updated weights for policy 0, policy_version 579720 (0.0029) [2024-06-24 03:54:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9498247168. Throughput: 0: 43150.7. Samples: 9498295700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 03:54:55,046][15401] Updated weights for policy 0, policy_version 579730 (0.0037) [2024-06-24 03:54:56,557][15349] Signal inference workers to stop experience collection... (140700 times) [2024-06-24 03:54:56,577][15401] InferenceWorker_p0-w0: stopping experience collection (140700 times) [2024-06-24 03:54:56,670][15349] Signal inference workers to resume experience collection... (140700 times) [2024-06-24 03:54:56,671][15401] InferenceWorker_p0-w0: resuming experience collection (140700 times) [2024-06-24 03:54:58,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9498411008. Throughput: 0: 42941.8. Samples: 9498552480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:54:58,393][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 03:54:58,861][15401] Updated weights for policy 0, policy_version 579740 (0.0037) [2024-06-24 03:55:02,827][15401] Updated weights for policy 0, policy_version 579750 (0.0046) [2024-06-24 03:55:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 9498624000. Throughput: 0: 42911.1. Samples: 9498808880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:55:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 03:55:06,299][15401] Updated weights for policy 0, policy_version 579760 (0.0035) [2024-06-24 03:55:08,390][15132] Fps is (10 sec: 47525.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9498886144. Throughput: 0: 43112.9. Samples: 9498937320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:55:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 03:55:10,241][15401] Updated weights for policy 0, policy_version 579770 (0.0033) [2024-06-24 03:55:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9499066368. Throughput: 0: 43009.3. Samples: 9499199400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:55:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 03:55:13,806][15401] Updated weights for policy 0, policy_version 579780 (0.0041) [2024-06-24 03:55:17,609][15401] Updated weights for policy 0, policy_version 579790 (0.0038) [2024-06-24 03:55:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 9499279360. Throughput: 0: 42983.5. Samples: 9499458360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 03:55:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 03:55:21,532][15401] Updated weights for policy 0, policy_version 579800 (0.0035) [2024-06-24 03:55:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9499525120. Throughput: 0: 43132.6. Samples: 9499584940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 03:55:25,194][15401] Updated weights for policy 0, policy_version 579810 (0.0043) [2024-06-24 03:55:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9499705344. Throughput: 0: 43058.3. Samples: 9499845340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 03:55:29,083][15401] Updated weights for policy 0, policy_version 579820 (0.0028) [2024-06-24 03:55:33,130][15401] Updated weights for policy 0, policy_version 579830 (0.0029) [2024-06-24 03:55:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 9499934720. Throughput: 0: 42853.6. Samples: 9500097580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 03:55:36,879][15401] Updated weights for policy 0, policy_version 579840 (0.0035) [2024-06-24 03:55:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 9500164096. Throughput: 0: 43069.8. Samples: 9500233840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 03:55:40,588][15401] Updated weights for policy 0, policy_version 579850 (0.0036) [2024-06-24 03:55:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 9500344320. Throughput: 0: 43044.6. Samples: 9500489380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 03:55:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579855_9500344320.pth... [2024-06-24 03:55:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579229_9490087936.pth [2024-06-24 03:55:44,331][15401] Updated weights for policy 0, policy_version 579860 (0.0035) [2024-06-24 03:55:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 9500573696. Throughput: 0: 42989.3. Samples: 9500743400. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 03:55:48,725][15401] Updated weights for policy 0, policy_version 579870 (0.0040) [2024-06-24 03:55:52,024][15401] Updated weights for policy 0, policy_version 579880 (0.0038) [2024-06-24 03:55:53,393][15132] Fps is (10 sec: 45860.5, 60 sec: 42596.2, 300 sec: 42820.1). Total num frames: 9500803072. Throughput: 0: 43116.1. Samples: 9500877680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:53,393][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 03:55:56,430][15401] Updated weights for policy 0, policy_version 579890 (0.0034) [2024-06-24 03:55:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 9500999680. Throughput: 0: 42994.2. Samples: 9501134140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:55:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 03:55:59,557][15401] Updated weights for policy 0, policy_version 579900 (0.0037) [2024-06-24 03:56:03,389][15132] Fps is (10 sec: 40973.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9501212672. Throughput: 0: 42890.7. Samples: 9501388440. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:03,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 03:56:03,963][15401] Updated weights for policy 0, policy_version 579910 (0.0038) [2024-06-24 03:56:07,226][15401] Updated weights for policy 0, policy_version 579920 (0.0031) [2024-06-24 03:56:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9501442048. Throughput: 0: 42953.7. Samples: 9501517860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 03:56:11,518][15401] Updated weights for policy 0, policy_version 579930 (0.0040) [2024-06-24 03:56:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9501638656. Throughput: 0: 42882.6. Samples: 9501775060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 03:56:14,855][15401] Updated weights for policy 0, policy_version 579940 (0.0033) [2024-06-24 03:56:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42876.3). Total num frames: 9501868032. Throughput: 0: 42895.5. Samples: 9502027880. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 03:56:19,213][15401] Updated weights for policy 0, policy_version 579950 (0.0047) [2024-06-24 03:56:21,307][15349] Signal inference workers to stop experience collection... (140750 times) [2024-06-24 03:56:21,312][15349] Signal inference workers to resume experience collection... (140750 times) [2024-06-24 03:56:21,358][15401] InferenceWorker_p0-w0: stopping experience collection (140750 times) [2024-06-24 03:56:21,358][15401] InferenceWorker_p0-w0: resuming experience collection (140750 times) [2024-06-24 03:56:22,369][15401] Updated weights for policy 0, policy_version 579960 (0.0034) [2024-06-24 03:56:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9502113792. Throughput: 0: 42876.8. Samples: 9502163300. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 03:56:26,849][15401] Updated weights for policy 0, policy_version 579970 (0.0044) [2024-06-24 03:56:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9502277632. Throughput: 0: 42848.0. Samples: 9502417540. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 03:56:30,003][15401] Updated weights for policy 0, policy_version 579980 (0.0034) [2024-06-24 03:56:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9502507008. Throughput: 0: 42995.9. Samples: 9502678220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 03:56:34,569][15401] Updated weights for policy 0, policy_version 579990 (0.0036) [2024-06-24 03:56:37,499][15401] Updated weights for policy 0, policy_version 580000 (0.0037) [2024-06-24 03:56:38,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9502769152. Throughput: 0: 42995.0. Samples: 9502812320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 03:56:42,167][15401] Updated weights for policy 0, policy_version 580010 (0.0036) [2024-06-24 03:56:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9502932992. Throughput: 0: 43052.4. Samples: 9503071500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 03:56:45,073][15401] Updated weights for policy 0, policy_version 580020 (0.0035) [2024-06-24 03:56:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9503145984. Throughput: 0: 43086.3. Samples: 9503327320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-24 03:56:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 03:56:49,861][15401] Updated weights for policy 0, policy_version 580030 (0.0030) [2024-06-24 03:56:52,750][15401] Updated weights for policy 0, policy_version 580040 (0.0025) [2024-06-24 03:56:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.8, 300 sec: 42931.6). Total num frames: 9503391744. Throughput: 0: 42999.2. Samples: 9503452820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:56:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 03:56:57,462][15401] Updated weights for policy 0, policy_version 580050 (0.0048) [2024-06-24 03:56:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9503571968. Throughput: 0: 43023.5. Samples: 9503711220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:56:58,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 03:57:00,319][15401] Updated weights for policy 0, policy_version 580060 (0.0036) [2024-06-24 03:57:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 9503784960. Throughput: 0: 42981.5. Samples: 9503962040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-24 03:57:05,168][15401] Updated weights for policy 0, policy_version 580070 (0.0033) [2024-06-24 03:57:07,849][15401] Updated weights for policy 0, policy_version 580080 (0.0041) [2024-06-24 03:57:08,389][15132] Fps is (10 sec: 47525.0, 60 sec: 43417.6, 300 sec: 43042.8). Total num frames: 9504047104. Throughput: 0: 42950.4. Samples: 9504096060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 03:57:13,002][15401] Updated weights for policy 0, policy_version 580090 (0.0035) [2024-06-24 03:57:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 9504194560. Throughput: 0: 43091.9. Samples: 9504356780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:13,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 03:57:15,549][15401] Updated weights for policy 0, policy_version 580100 (0.0030) [2024-06-24 03:57:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9504440320. Throughput: 0: 42776.5. Samples: 9504603160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:18,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 03:57:20,728][15401] Updated weights for policy 0, policy_version 580110 (0.0030) [2024-06-24 03:57:23,380][15401] Updated weights for policy 0, policy_version 580120 (0.0042) [2024-06-24 03:57:23,389][15132] Fps is (10 sec: 49164.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 9504686080. Throughput: 0: 42779.6. Samples: 9504737400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 03:57:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9504833536. Throughput: 0: 42697.0. Samples: 9504992860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 03:57:28,444][15401] Updated weights for policy 0, policy_version 580130 (0.0041) [2024-06-24 03:57:31,174][15401] Updated weights for policy 0, policy_version 580140 (0.0021) [2024-06-24 03:57:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 9505062912. Throughput: 0: 42668.5. Samples: 9505247400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 03:57:36,013][15401] Updated weights for policy 0, policy_version 580150 (0.0042) [2024-06-24 03:57:38,393][15132] Fps is (10 sec: 47496.2, 60 sec: 42322.8, 300 sec: 42931.1). Total num frames: 9505308672. Throughput: 0: 42796.5. Samples: 9505378820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:38,393][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 03:57:38,685][15401] Updated weights for policy 0, policy_version 580160 (0.0033) [2024-06-24 03:57:43,396][15132] Fps is (10 sec: 42570.4, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 9505488896. Throughput: 0: 42720.2. Samples: 9505633800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:43,397][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 03:57:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580170_9505505280.pth... [2024-06-24 03:57:43,534][15401] Updated weights for policy 0, policy_version 580170 (0.0041) [2024-06-24 03:57:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579543_9495232512.pth [2024-06-24 03:57:46,283][15401] Updated weights for policy 0, policy_version 580180 (0.0043) [2024-06-24 03:57:48,390][15132] Fps is (10 sec: 40974.3, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 9505718272. Throughput: 0: 42907.9. Samples: 9505892900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 03:57:51,182][15401] Updated weights for policy 0, policy_version 580190 (0.0025) [2024-06-24 03:57:52,806][15349] Signal inference workers to stop experience collection... (140800 times) [2024-06-24 03:57:52,844][15401] InferenceWorker_p0-w0: stopping experience collection (140800 times) [2024-06-24 03:57:52,865][15349] Signal inference workers to resume experience collection... (140800 times) [2024-06-24 03:57:52,865][15401] InferenceWorker_p0-w0: resuming experience collection (140800 times) [2024-06-24 03:57:53,390][15132] Fps is (10 sec: 47543.5, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 9505964032. Throughput: 0: 42931.0. Samples: 9506027960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:53,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 03:57:53,971][15401] Updated weights for policy 0, policy_version 580200 (0.0036) [2024-06-24 03:57:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 9506127872. Throughput: 0: 42752.9. Samples: 9506280560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:57:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 03:57:58,704][15401] Updated weights for policy 0, policy_version 580210 (0.0044) [2024-06-24 03:58:01,448][15401] Updated weights for policy 0, policy_version 580220 (0.0033) [2024-06-24 03:58:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 9506373632. Throughput: 0: 42997.0. Samples: 9506538020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:58:03,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 03:58:06,311][15401] Updated weights for policy 0, policy_version 580230 (0.0039) [2024-06-24 03:58:08,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.4, 300 sec: 43042.8). Total num frames: 9506619392. Throughput: 0: 43006.1. Samples: 9506672680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:58:08,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 03:58:09,334][15401] Updated weights for policy 0, policy_version 580240 (0.0026) [2024-06-24 03:58:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 9506783232. Throughput: 0: 42949.7. Samples: 9506925600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:58:13,395][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 03:58:14,008][15401] Updated weights for policy 0, policy_version 580250 (0.0037) [2024-06-24 03:58:17,011][15401] Updated weights for policy 0, policy_version 580260 (0.0038) [2024-06-24 03:58:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9507012608. Throughput: 0: 42998.1. Samples: 9507182320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:58:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 03:58:21,609][15401] Updated weights for policy 0, policy_version 580270 (0.0053) [2024-06-24 03:58:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9507241984. Throughput: 0: 43000.9. Samples: 9507313700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 03:58:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 03:58:24,641][15401] Updated weights for policy 0, policy_version 580280 (0.0037) [2024-06-24 03:58:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 9507438592. Throughput: 0: 42992.4. Samples: 9507568180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 03:58:29,088][15401] Updated weights for policy 0, policy_version 580290 (0.0036) [2024-06-24 03:58:32,449][15401] Updated weights for policy 0, policy_version 580300 (0.0032) [2024-06-24 03:58:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9507667968. Throughput: 0: 43027.2. Samples: 9507829120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 03:58:36,537][15401] Updated weights for policy 0, policy_version 580310 (0.0032) [2024-06-24 03:58:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43147.1, 300 sec: 42987.2). Total num frames: 9507897344. Throughput: 0: 43041.9. Samples: 9507964840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 03:58:40,287][15401] Updated weights for policy 0, policy_version 580320 (0.0030) [2024-06-24 03:58:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43149.1, 300 sec: 43042.7). Total num frames: 9508077568. Throughput: 0: 43043.4. Samples: 9508217520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 03:58:43,941][15401] Updated weights for policy 0, policy_version 580330 (0.0029) [2024-06-24 03:58:48,267][15401] Updated weights for policy 0, policy_version 580340 (0.0037) [2024-06-24 03:58:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 9508290560. Throughput: 0: 43044.5. Samples: 9508475020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 03:58:51,918][15401] Updated weights for policy 0, policy_version 580350 (0.0040) [2024-06-24 03:58:53,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42986.8). Total num frames: 9508536320. Throughput: 0: 42836.9. Samples: 9508600440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:53,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 03:58:55,707][15401] Updated weights for policy 0, policy_version 580360 (0.0037) [2024-06-24 03:58:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9508716544. Throughput: 0: 42840.1. Samples: 9508853400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:58:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 03:58:59,505][15401] Updated weights for policy 0, policy_version 580370 (0.0029) [2024-06-24 03:59:03,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9508929536. Throughput: 0: 42930.7. Samples: 9509114200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 03:59:03,680][15401] Updated weights for policy 0, policy_version 580380 (0.0031) [2024-06-24 03:59:07,043][15401] Updated weights for policy 0, policy_version 580390 (0.0031) [2024-06-24 03:59:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9509175296. Throughput: 0: 43008.7. Samples: 9509249100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:08,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 03:59:11,066][15401] Updated weights for policy 0, policy_version 580400 (0.0034) [2024-06-24 03:59:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9509355520. Throughput: 0: 42981.4. Samples: 9509502340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 03:59:14,243][15349] Signal inference workers to stop experience collection... (140850 times) [2024-06-24 03:59:14,279][15401] InferenceWorker_p0-w0: stopping experience collection (140850 times) [2024-06-24 03:59:14,362][15349] Signal inference workers to resume experience collection... (140850 times) [2024-06-24 03:59:14,362][15401] InferenceWorker_p0-w0: resuming experience collection (140850 times) [2024-06-24 03:59:14,529][15401] Updated weights for policy 0, policy_version 580410 (0.0043) [2024-06-24 03:59:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9509584896. Throughput: 0: 42971.2. Samples: 9509762820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 03:59:18,482][15401] Updated weights for policy 0, policy_version 580420 (0.0041) [2024-06-24 03:59:22,085][15401] Updated weights for policy 0, policy_version 580430 (0.0042) [2024-06-24 03:59:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9509814272. Throughput: 0: 42890.2. Samples: 9509894900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 03:59:25,974][15401] Updated weights for policy 0, policy_version 580440 (0.0024) [2024-06-24 03:59:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 9510010880. Throughput: 0: 42887.7. Samples: 9510147460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:28,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 03:59:29,808][15401] Updated weights for policy 0, policy_version 580450 (0.0032) [2024-06-24 03:59:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9510240256. Throughput: 0: 42959.1. Samples: 9510408180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:33,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 03:59:33,561][15401] Updated weights for policy 0, policy_version 580460 (0.0041) [2024-06-24 03:59:37,462][15401] Updated weights for policy 0, policy_version 580470 (0.0028) [2024-06-24 03:59:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9510453248. Throughput: 0: 43049.8. Samples: 9510537580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 03:59:41,052][15401] Updated weights for policy 0, policy_version 580480 (0.0045) [2024-06-24 03:59:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9510666240. Throughput: 0: 43059.0. Samples: 9510791060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:43,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 03:59:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580485_9510666240.pth... [2024-06-24 03:59:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000579855_9500344320.pth [2024-06-24 03:59:45,116][15401] Updated weights for policy 0, policy_version 580490 (0.0038) [2024-06-24 03:59:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9510879232. Throughput: 0: 42888.5. Samples: 9511044180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 03:59:48,592][15401] Updated weights for policy 0, policy_version 580500 (0.0037) [2024-06-24 03:59:52,850][15401] Updated weights for policy 0, policy_version 580510 (0.0042) [2024-06-24 03:59:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42987.5). Total num frames: 9511092224. Throughput: 0: 42701.8. Samples: 9511170680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 03:59:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 03:59:56,140][15401] Updated weights for policy 0, policy_version 580520 (0.0046) [2024-06-24 03:59:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9511288832. Throughput: 0: 42800.0. Samples: 9511428340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 03:59:58,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 04:00:00,723][15401] Updated weights for policy 0, policy_version 580530 (0.0040) [2024-06-24 04:00:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9511534592. Throughput: 0: 42628.4. Samples: 9511681100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 04:00:04,470][15401] Updated weights for policy 0, policy_version 580540 (0.0032) [2024-06-24 04:00:08,227][15401] Updated weights for policy 0, policy_version 580550 (0.0024) [2024-06-24 04:00:08,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42930.7). Total num frames: 9511731200. Throughput: 0: 42594.4. Samples: 9511811920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:08,397][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 04:00:12,023][15401] Updated weights for policy 0, policy_version 580560 (0.0034) [2024-06-24 04:00:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 9511927808. Throughput: 0: 42493.7. Samples: 9512059780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:13,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 04:00:16,201][15401] Updated weights for policy 0, policy_version 580570 (0.0026) [2024-06-24 04:00:18,389][15132] Fps is (10 sec: 44265.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9512173568. Throughput: 0: 42474.3. Samples: 9512319520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 04:00:19,554][15401] Updated weights for policy 0, policy_version 580580 (0.0032) [2024-06-24 04:00:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9512353792. Throughput: 0: 42430.3. Samples: 9512446940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 04:00:23,662][15401] Updated weights for policy 0, policy_version 580590 (0.0035) [2024-06-24 04:00:27,094][15401] Updated weights for policy 0, policy_version 580600 (0.0025) [2024-06-24 04:00:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9512566784. Throughput: 0: 42373.4. Samples: 9512697860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:28,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 04:00:31,142][15349] Signal inference workers to stop experience collection... (140900 times) [2024-06-24 04:00:31,142][15349] Signal inference workers to resume experience collection... (140900 times) [2024-06-24 04:00:31,154][15401] Updated weights for policy 0, policy_version 580610 (0.0041) [2024-06-24 04:00:31,181][15401] InferenceWorker_p0-w0: stopping experience collection (140900 times) [2024-06-24 04:00:31,181][15401] InferenceWorker_p0-w0: resuming experience collection (140900 times) [2024-06-24 04:00:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9512812544. Throughput: 0: 42684.5. Samples: 9512964980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 04:00:34,790][15401] Updated weights for policy 0, policy_version 580620 (0.0030) [2024-06-24 04:00:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9513009152. Throughput: 0: 42795.5. Samples: 9513096480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 04:00:38,774][15401] Updated weights for policy 0, policy_version 580630 (0.0041) [2024-06-24 04:00:42,266][15401] Updated weights for policy 0, policy_version 580640 (0.0048) [2024-06-24 04:00:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9513222144. Throughput: 0: 42683.9. Samples: 9513349120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 04:00:46,689][15401] Updated weights for policy 0, policy_version 580650 (0.0037) [2024-06-24 04:00:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.6). Total num frames: 9513451520. Throughput: 0: 42882.7. Samples: 9513610820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 04:00:49,851][15401] Updated weights for policy 0, policy_version 580660 (0.0024) [2024-06-24 04:00:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9513664512. Throughput: 0: 42839.8. Samples: 9513739440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 04:00:54,198][15401] Updated weights for policy 0, policy_version 580670 (0.0023) [2024-06-24 04:00:57,814][15401] Updated weights for policy 0, policy_version 580680 (0.0036) [2024-06-24 04:00:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9513877504. Throughput: 0: 43038.8. Samples: 9513996420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:00:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 04:01:01,824][15401] Updated weights for policy 0, policy_version 580690 (0.0025) [2024-06-24 04:01:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9514090496. Throughput: 0: 43036.4. Samples: 9514256160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 04:01:05,417][15401] Updated weights for policy 0, policy_version 580700 (0.0040) [2024-06-24 04:01:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.1, 300 sec: 42931.6). Total num frames: 9514303488. Throughput: 0: 43057.8. Samples: 9514384540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:08,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 04:01:09,334][15401] Updated weights for policy 0, policy_version 580710 (0.0035) [2024-06-24 04:01:13,005][15401] Updated weights for policy 0, policy_version 580720 (0.0025) [2024-06-24 04:01:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 9514532864. Throughput: 0: 43222.6. Samples: 9514642880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 04:01:16,907][15401] Updated weights for policy 0, policy_version 580730 (0.0033) [2024-06-24 04:01:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9514729472. Throughput: 0: 42960.7. Samples: 9514898320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:18,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 04:01:20,762][15401] Updated weights for policy 0, policy_version 580740 (0.0036) [2024-06-24 04:01:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 9514975232. Throughput: 0: 42889.4. Samples: 9515026500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:23,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 04:01:24,483][15401] Updated weights for policy 0, policy_version 580750 (0.0033) [2024-06-24 04:01:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9515155456. Throughput: 0: 43031.6. Samples: 9515285540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:01:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:01:28,855][15401] Updated weights for policy 0, policy_version 580760 (0.0027) [2024-06-24 04:01:32,124][15401] Updated weights for policy 0, policy_version 580770 (0.0033) [2024-06-24 04:01:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9515384832. Throughput: 0: 42807.9. Samples: 9515537180. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:33,395][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 04:01:36,337][15401] Updated weights for policy 0, policy_version 580780 (0.0036) [2024-06-24 04:01:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9515597824. Throughput: 0: 42927.2. Samples: 9515671160. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 04:01:39,629][15401] Updated weights for policy 0, policy_version 580790 (0.0036) [2024-06-24 04:01:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9515794432. Throughput: 0: 42929.7. Samples: 9515928260. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 04:01:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580798_9515794432.pth... [2024-06-24 04:01:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580170_9505505280.pth [2024-06-24 04:01:43,807][15401] Updated weights for policy 0, policy_version 580800 (0.0034) [2024-06-24 04:01:47,703][15401] Updated weights for policy 0, policy_version 580810 (0.0042) [2024-06-24 04:01:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9516023808. Throughput: 0: 42790.6. Samples: 9516181740. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:48,394][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 04:01:51,591][15401] Updated weights for policy 0, policy_version 580820 (0.0039) [2024-06-24 04:01:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 9516236800. Throughput: 0: 42768.8. Samples: 9516309140. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 04:01:55,005][15349] Signal inference workers to stop experience collection... (140950 times) [2024-06-24 04:01:55,042][15401] InferenceWorker_p0-w0: stopping experience collection (140950 times) [2024-06-24 04:01:55,066][15349] Signal inference workers to resume experience collection... (140950 times) [2024-06-24 04:01:55,066][15401] InferenceWorker_p0-w0: resuming experience collection (140950 times) [2024-06-24 04:01:55,207][15401] Updated weights for policy 0, policy_version 580830 (0.0033) [2024-06-24 04:01:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 9516417024. Throughput: 0: 42679.5. Samples: 9516563460. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:01:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 04:01:59,147][15401] Updated weights for policy 0, policy_version 580840 (0.0029) [2024-06-24 04:02:02,806][15401] Updated weights for policy 0, policy_version 580850 (0.0039) [2024-06-24 04:02:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9516662784. Throughput: 0: 42708.8. Samples: 9516820220. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:03,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 04:02:07,143][15401] Updated weights for policy 0, policy_version 580860 (0.0032) [2024-06-24 04:02:08,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 9516875776. Throughput: 0: 42678.3. Samples: 9516947020. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 04:02:10,371][15401] Updated weights for policy 0, policy_version 580870 (0.0040) [2024-06-24 04:02:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9517072384. Throughput: 0: 42678.7. Samples: 9517206080. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 04:02:14,777][15401] Updated weights for policy 0, policy_version 580880 (0.0034) [2024-06-24 04:02:17,952][15401] Updated weights for policy 0, policy_version 580890 (0.0038) [2024-06-24 04:02:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 9517318144. Throughput: 0: 42775.1. Samples: 9517462060. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:02:22,363][15401] Updated weights for policy 0, policy_version 580900 (0.0036) [2024-06-24 04:02:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 9517531136. Throughput: 0: 42596.6. Samples: 9517588000. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:23,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 04:02:25,749][15401] Updated weights for policy 0, policy_version 580910 (0.0038) [2024-06-24 04:02:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9517744128. Throughput: 0: 42639.2. Samples: 9517847020. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 04:02:29,835][15401] Updated weights for policy 0, policy_version 580920 (0.0035) [2024-06-24 04:02:33,297][15401] Updated weights for policy 0, policy_version 580930 (0.0032) [2024-06-24 04:02:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.6). Total num frames: 9517957120. Throughput: 0: 42806.7. Samples: 9518108040. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 04:02:37,512][15401] Updated weights for policy 0, policy_version 580940 (0.0031) [2024-06-24 04:02:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42988.1). Total num frames: 9518170112. Throughput: 0: 42837.9. Samples: 9518236840. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 04:02:40,971][15401] Updated weights for policy 0, policy_version 580950 (0.0037) [2024-06-24 04:02:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9518366720. Throughput: 0: 42846.7. Samples: 9518491560. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:43,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 04:02:44,999][15401] Updated weights for policy 0, policy_version 580960 (0.0030) [2024-06-24 04:02:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9518596096. Throughput: 0: 42913.8. Samples: 9518751240. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 04:02:48,455][15401] Updated weights for policy 0, policy_version 580970 (0.0024) [2024-06-24 04:02:52,568][15401] Updated weights for policy 0, policy_version 580980 (0.0029) [2024-06-24 04:02:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9518809088. Throughput: 0: 43040.8. Samples: 9518883860. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 04:02:56,054][15401] Updated weights for policy 0, policy_version 580990 (0.0033) [2024-06-24 04:02:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 9519005696. Throughput: 0: 42989.0. Samples: 9519140580. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-24 04:02:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 04:03:00,065][15401] Updated weights for policy 0, policy_version 581000 (0.0036) [2024-06-24 04:03:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 9519235072. Throughput: 0: 42964.0. Samples: 9519395440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:03,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 04:03:03,729][15401] Updated weights for policy 0, policy_version 581010 (0.0028) [2024-06-24 04:03:07,877][15401] Updated weights for policy 0, policy_version 581020 (0.0038) [2024-06-24 04:03:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9519448064. Throughput: 0: 43001.3. Samples: 9519523060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:08,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 04:03:11,459][15401] Updated weights for policy 0, policy_version 581030 (0.0032) [2024-06-24 04:03:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9519644672. Throughput: 0: 42952.4. Samples: 9519779880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 04:03:15,395][15401] Updated weights for policy 0, policy_version 581040 (0.0035) [2024-06-24 04:03:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9519890432. Throughput: 0: 42944.1. Samples: 9520040520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:03:19,088][15401] Updated weights for policy 0, policy_version 581050 (0.0041) [2024-06-24 04:03:22,866][15401] Updated weights for policy 0, policy_version 581060 (0.0044) [2024-06-24 04:03:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9520087040. Throughput: 0: 43056.8. Samples: 9520174400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 04:03:26,773][15401] Updated weights for policy 0, policy_version 581070 (0.0037) [2024-06-24 04:03:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9520300032. Throughput: 0: 43016.5. Samples: 9520427300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 04:03:30,565][15401] Updated weights for policy 0, policy_version 581080 (0.0030) [2024-06-24 04:03:33,019][15349] Signal inference workers to stop experience collection... (141000 times) [2024-06-24 04:03:33,019][15349] Signal inference workers to resume experience collection... (141000 times) [2024-06-24 04:03:33,065][15401] InferenceWorker_p0-w0: stopping experience collection (141000 times) [2024-06-24 04:03:33,065][15401] InferenceWorker_p0-w0: resuming experience collection (141000 times) [2024-06-24 04:03:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9520529408. Throughput: 0: 42893.9. Samples: 9520681460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 04:03:34,576][15401] Updated weights for policy 0, policy_version 581090 (0.0042) [2024-06-24 04:03:38,205][15401] Updated weights for policy 0, policy_version 581100 (0.0033) [2024-06-24 04:03:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 9520742400. Throughput: 0: 42877.8. Samples: 9520813360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 04:03:42,125][15401] Updated weights for policy 0, policy_version 581110 (0.0039) [2024-06-24 04:03:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9520939008. Throughput: 0: 42932.7. Samples: 9521072560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 04:03:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581113_9520955392.pth... [2024-06-24 04:03:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580485_9510666240.pth [2024-06-24 04:03:45,588][15401] Updated weights for policy 0, policy_version 581120 (0.0041) [2024-06-24 04:03:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 9521168384. Throughput: 0: 43030.3. Samples: 9521331800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 04:03:49,662][15401] Updated weights for policy 0, policy_version 581130 (0.0041) [2024-06-24 04:03:53,276][15401] Updated weights for policy 0, policy_version 581140 (0.0038) [2024-06-24 04:03:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 9521397760. Throughput: 0: 43078.5. Samples: 9521461600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 04:03:57,084][15401] Updated weights for policy 0, policy_version 581150 (0.0035) [2024-06-24 04:03:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 9521610752. Throughput: 0: 43073.7. Samples: 9521718200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:03:58,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 04:04:01,196][15401] Updated weights for policy 0, policy_version 581160 (0.0032) [2024-06-24 04:04:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9521807360. Throughput: 0: 43139.1. Samples: 9521981780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:03,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 04:04:04,455][15401] Updated weights for policy 0, policy_version 581170 (0.0044) [2024-06-24 04:04:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9522020352. Throughput: 0: 42957.9. Samples: 9522107500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 04:04:08,814][15401] Updated weights for policy 0, policy_version 581180 (0.0027) [2024-06-24 04:04:12,319][15401] Updated weights for policy 0, policy_version 581190 (0.0040) [2024-06-24 04:04:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9522249728. Throughput: 0: 43058.2. Samples: 9522364920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 04:04:16,338][15401] Updated weights for policy 0, policy_version 581200 (0.0034) [2024-06-24 04:04:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9522429952. Throughput: 0: 43298.7. Samples: 9522629900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 04:04:19,965][15401] Updated weights for policy 0, policy_version 581210 (0.0031) [2024-06-24 04:04:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9522675712. Throughput: 0: 43043.6. Samples: 9522750320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 04:04:23,982][15401] Updated weights for policy 0, policy_version 581220 (0.0032) [2024-06-24 04:04:27,722][15401] Updated weights for policy 0, policy_version 581230 (0.0029) [2024-06-24 04:04:28,396][15132] Fps is (10 sec: 45845.5, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 9522888704. Throughput: 0: 43012.6. Samples: 9523008400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:28,396][15132] Avg episode reward: [(0, '0.163')] [2024-06-24 04:04:31,550][15401] Updated weights for policy 0, policy_version 581240 (0.0046) [2024-06-24 04:04:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9523085312. Throughput: 0: 43093.0. Samples: 9523270980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:04:33,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-24 04:04:35,453][15401] Updated weights for policy 0, policy_version 581250 (0.0034) [2024-06-24 04:04:38,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9523314688. Throughput: 0: 42952.1. Samples: 9523394440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:04:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 04:04:39,135][15401] Updated weights for policy 0, policy_version 581260 (0.0037) [2024-06-24 04:04:43,054][15401] Updated weights for policy 0, policy_version 581270 (0.0031) [2024-06-24 04:04:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9523544064. Throughput: 0: 43001.3. Samples: 9523653260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:04:43,394][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 04:04:46,954][15401] Updated weights for policy 0, policy_version 581280 (0.0032) [2024-06-24 04:04:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9523740672. Throughput: 0: 42910.2. Samples: 9523912740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:04:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 04:04:50,783][15401] Updated weights for policy 0, policy_version 581290 (0.0037) [2024-06-24 04:04:53,394][15132] Fps is (10 sec: 40940.4, 60 sec: 42595.0, 300 sec: 42930.9). Total num frames: 9523953664. Throughput: 0: 42844.2. Samples: 9524035700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:04:53,395][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 04:04:54,579][15401] Updated weights for policy 0, policy_version 581300 (0.0046) [2024-06-24 04:04:57,378][15349] Signal inference workers to stop experience collection... (141050 times) [2024-06-24 04:04:57,426][15401] InferenceWorker_p0-w0: stopping experience collection (141050 times) [2024-06-24 04:04:57,434][15349] Signal inference workers to resume experience collection... (141050 times) [2024-06-24 04:04:57,444][15401] InferenceWorker_p0-w0: resuming experience collection (141050 times) [2024-06-24 04:04:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 9524166656. Throughput: 0: 42894.6. Samples: 9524295280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:04:58,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 04:04:58,405][15401] Updated weights for policy 0, policy_version 581310 (0.0033) [2024-06-24 04:05:02,363][15401] Updated weights for policy 0, policy_version 581320 (0.0028) [2024-06-24 04:05:03,390][15132] Fps is (10 sec: 42619.0, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 9524379648. Throughput: 0: 42698.6. Samples: 9524551340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 04:05:06,077][15401] Updated weights for policy 0, policy_version 581330 (0.0041) [2024-06-24 04:05:08,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 9524609024. Throughput: 0: 42755.2. Samples: 9524674300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:08,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 04:05:09,854][15401] Updated weights for policy 0, policy_version 581340 (0.0032) [2024-06-24 04:05:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9524805632. Throughput: 0: 42769.6. Samples: 9524932760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:13,398][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 04:05:13,726][15401] Updated weights for policy 0, policy_version 581350 (0.0037) [2024-06-24 04:05:18,019][15401] Updated weights for policy 0, policy_version 581360 (0.0030) [2024-06-24 04:05:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9525018624. Throughput: 0: 42642.5. Samples: 9525189900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 04:05:21,342][15401] Updated weights for policy 0, policy_version 581370 (0.0033) [2024-06-24 04:05:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9525248000. Throughput: 0: 42672.4. Samples: 9525314700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 04:05:25,470][15401] Updated weights for policy 0, policy_version 581380 (0.0024) [2024-06-24 04:05:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 9525460992. Throughput: 0: 42671.6. Samples: 9525573480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 04:05:29,288][15401] Updated weights for policy 0, policy_version 581390 (0.0036) [2024-06-24 04:05:33,106][15401] Updated weights for policy 0, policy_version 581400 (0.0034) [2024-06-24 04:05:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9525673984. Throughput: 0: 42645.8. Samples: 9525831800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 04:05:37,117][15401] Updated weights for policy 0, policy_version 581410 (0.0035) [2024-06-24 04:05:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9525886976. Throughput: 0: 42702.0. Samples: 9525957080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 04:05:40,649][15401] Updated weights for policy 0, policy_version 581420 (0.0033) [2024-06-24 04:05:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9526099968. Throughput: 0: 42716.8. Samples: 9526217440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 04:05:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581427_9526099968.pth... [2024-06-24 04:05:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000580798_9515794432.pth [2024-06-24 04:05:44,661][15401] Updated weights for policy 0, policy_version 581430 (0.0028) [2024-06-24 04:05:48,162][15401] Updated weights for policy 0, policy_version 581440 (0.0033) [2024-06-24 04:05:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9526312960. Throughput: 0: 42684.0. Samples: 9526472120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 04:05:52,453][15401] Updated weights for policy 0, policy_version 581450 (0.0037) [2024-06-24 04:05:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42874.9, 300 sec: 42876.1). Total num frames: 9526525952. Throughput: 0: 42843.0. Samples: 9526602240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:53,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-24 04:05:55,722][15401] Updated weights for policy 0, policy_version 581460 (0.0033) [2024-06-24 04:05:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 9526755328. Throughput: 0: 42815.6. Samples: 9526859460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:05:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 04:05:59,918][15401] Updated weights for policy 0, policy_version 581470 (0.0036) [2024-06-24 04:06:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9526951936. Throughput: 0: 42829.7. Samples: 9527117240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-24 04:06:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 04:06:03,564][15401] Updated weights for policy 0, policy_version 581480 (0.0031) [2024-06-24 04:06:07,497][15401] Updated weights for policy 0, policy_version 581490 (0.0036) [2024-06-24 04:06:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9527164928. Throughput: 0: 42868.5. Samples: 9527243780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 04:06:11,187][15401] Updated weights for policy 0, policy_version 581500 (0.0036) [2024-06-24 04:06:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 9527394304. Throughput: 0: 42947.6. Samples: 9527506120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 04:06:14,932][15401] Updated weights for policy 0, policy_version 581510 (0.0022) [2024-06-24 04:06:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9527590912. Throughput: 0: 42835.1. Samples: 9527759380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 04:06:18,762][15401] Updated weights for policy 0, policy_version 581520 (0.0032) [2024-06-24 04:06:22,458][15401] Updated weights for policy 0, policy_version 581530 (0.0036) [2024-06-24 04:06:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9527803904. Throughput: 0: 42894.5. Samples: 9527887340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 04:06:26,354][15401] Updated weights for policy 0, policy_version 581540 (0.0038) [2024-06-24 04:06:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9528033280. Throughput: 0: 43051.7. Samples: 9528154760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 04:06:29,918][15401] Updated weights for policy 0, policy_version 581550 (0.0039) [2024-06-24 04:06:31,376][15349] Signal inference workers to stop experience collection... (141100 times) [2024-06-24 04:06:31,376][15349] Signal inference workers to resume experience collection... (141100 times) [2024-06-24 04:06:31,415][15401] InferenceWorker_p0-w0: stopping experience collection (141100 times) [2024-06-24 04:06:31,415][15401] InferenceWorker_p0-w0: resuming experience collection (141100 times) [2024-06-24 04:06:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 9528246272. Throughput: 0: 43058.1. Samples: 9528409840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:33,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 04:06:34,045][15401] Updated weights for policy 0, policy_version 581560 (0.0038) [2024-06-24 04:06:37,489][15401] Updated weights for policy 0, policy_version 581570 (0.0029) [2024-06-24 04:06:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9528459264. Throughput: 0: 43001.5. Samples: 9528537300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 04:06:41,602][15401] Updated weights for policy 0, policy_version 581580 (0.0032) [2024-06-24 04:06:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9528672256. Throughput: 0: 43124.0. Samples: 9528800040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:43,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 04:06:45,058][15401] Updated weights for policy 0, policy_version 581590 (0.0046) [2024-06-24 04:06:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9528868864. Throughput: 0: 42890.4. Samples: 9529047300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:48,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 04:06:49,394][15401] Updated weights for policy 0, policy_version 581600 (0.0034) [2024-06-24 04:06:52,794][15401] Updated weights for policy 0, policy_version 581610 (0.0033) [2024-06-24 04:06:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9529098240. Throughput: 0: 42812.0. Samples: 9529170320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 04:06:57,159][15401] Updated weights for policy 0, policy_version 581620 (0.0039) [2024-06-24 04:06:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 9529294848. Throughput: 0: 42824.9. Samples: 9529433240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:06:58,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 04:07:00,695][15401] Updated weights for policy 0, policy_version 581630 (0.0033) [2024-06-24 04:07:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9529524224. Throughput: 0: 42885.6. Samples: 9529689240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 04:07:04,658][15401] Updated weights for policy 0, policy_version 581640 (0.0038) [2024-06-24 04:07:08,295][15401] Updated weights for policy 0, policy_version 581650 (0.0035) [2024-06-24 04:07:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 9529753600. Throughput: 0: 42998.3. Samples: 9529822360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:08,393][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 04:07:12,697][15401] Updated weights for policy 0, policy_version 581660 (0.0045) [2024-06-24 04:07:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9529933824. Throughput: 0: 42610.2. Samples: 9530072220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 04:07:15,878][15401] Updated weights for policy 0, policy_version 581670 (0.0025) [2024-06-24 04:07:18,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9530163200. Throughput: 0: 42642.4. Samples: 9530328640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:18,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 04:07:20,282][15401] Updated weights for policy 0, policy_version 581680 (0.0033) [2024-06-24 04:07:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9530392576. Throughput: 0: 42819.0. Samples: 9530464160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:23,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 04:07:23,497][15401] Updated weights for policy 0, policy_version 581690 (0.0033) [2024-06-24 04:07:27,858][15401] Updated weights for policy 0, policy_version 581700 (0.0035) [2024-06-24 04:07:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9530572800. Throughput: 0: 42737.4. Samples: 9530723220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 04:07:31,156][15401] Updated weights for policy 0, policy_version 581710 (0.0036) [2024-06-24 04:07:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.2, 300 sec: 42820.5). Total num frames: 9530802176. Throughput: 0: 42987.1. Samples: 9530981720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 04:07:35,471][15401] Updated weights for policy 0, policy_version 581720 (0.0032) [2024-06-24 04:07:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9531031552. Throughput: 0: 43146.2. Samples: 9531111900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-24 04:07:38,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 04:07:38,614][15401] Updated weights for policy 0, policy_version 581730 (0.0040) [2024-06-24 04:07:43,072][15401] Updated weights for policy 0, policy_version 581740 (0.0046) [2024-06-24 04:07:43,390][15132] Fps is (10 sec: 44233.3, 60 sec: 42870.9, 300 sec: 42876.0). Total num frames: 9531244544. Throughput: 0: 43204.6. Samples: 9531377480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:07:43,391][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581741_9531244544.pth... [2024-06-24 04:07:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581113_9520955392.pth [2024-06-24 04:07:46,024][15401] Updated weights for policy 0, policy_version 581750 (0.0032) [2024-06-24 04:07:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9531457536. Throughput: 0: 43113.9. Samples: 9531629360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:07:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:07:50,722][15401] Updated weights for policy 0, policy_version 581760 (0.0048) [2024-06-24 04:07:53,390][15132] Fps is (10 sec: 44240.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9531686912. Throughput: 0: 43002.7. Samples: 9531757380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:07:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 04:07:54,102][15401] Updated weights for policy 0, policy_version 581770 (0.0034) [2024-06-24 04:07:58,167][15401] Updated weights for policy 0, policy_version 581780 (0.0034) [2024-06-24 04:07:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9531883520. Throughput: 0: 43288.3. Samples: 9532020200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:07:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 04:08:01,477][15401] Updated weights for policy 0, policy_version 581790 (0.0040) [2024-06-24 04:08:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 9532112896. Throughput: 0: 42980.9. Samples: 9532262780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:03,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 04:08:06,028][15401] Updated weights for policy 0, policy_version 581800 (0.0035) [2024-06-24 04:08:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 9532325888. Throughput: 0: 43032.9. Samples: 9532400640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 04:08:08,969][15401] Updated weights for policy 0, policy_version 581810 (0.0026) [2024-06-24 04:08:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9532506112. Throughput: 0: 42999.9. Samples: 9532658220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 04:08:13,850][15401] Updated weights for policy 0, policy_version 581820 (0.0040) [2024-06-24 04:08:16,011][15349] Signal inference workers to stop experience collection... (141150 times) [2024-06-24 04:08:16,064][15401] InferenceWorker_p0-w0: stopping experience collection (141150 times) [2024-06-24 04:08:16,134][15349] Signal inference workers to resume experience collection... (141150 times) [2024-06-24 04:08:16,134][15401] InferenceWorker_p0-w0: resuming experience collection (141150 times) [2024-06-24 04:08:16,564][15401] Updated weights for policy 0, policy_version 581830 (0.0032) [2024-06-24 04:08:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9532751872. Throughput: 0: 42834.1. Samples: 9532909260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:18,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 04:08:21,334][15401] Updated weights for policy 0, policy_version 581840 (0.0032) [2024-06-24 04:08:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9532964864. Throughput: 0: 43035.5. Samples: 9533048500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 04:08:24,205][15401] Updated weights for policy 0, policy_version 581850 (0.0034) [2024-06-24 04:08:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9533161472. Throughput: 0: 42798.9. Samples: 9533303400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 04:08:28,983][15401] Updated weights for policy 0, policy_version 581860 (0.0033) [2024-06-24 04:08:31,769][15401] Updated weights for policy 0, policy_version 581870 (0.0031) [2024-06-24 04:08:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9533407232. Throughput: 0: 42840.9. Samples: 9533557200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:33,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-24 04:08:36,609][15401] Updated weights for policy 0, policy_version 581880 (0.0036) [2024-06-24 04:08:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 9533620224. Throughput: 0: 43099.9. Samples: 9533696880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 04:08:39,479][15401] Updated weights for policy 0, policy_version 581890 (0.0032) [2024-06-24 04:08:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.9, 300 sec: 42876.1). Total num frames: 9533816832. Throughput: 0: 42833.3. Samples: 9533947700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:43,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-24 04:08:44,243][15401] Updated weights for policy 0, policy_version 581900 (0.0037) [2024-06-24 04:08:47,118][15401] Updated weights for policy 0, policy_version 581910 (0.0032) [2024-06-24 04:08:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9534046208. Throughput: 0: 43052.3. Samples: 9534200140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 04:08:51,757][15401] Updated weights for policy 0, policy_version 581920 (0.0035) [2024-06-24 04:08:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9534242816. Throughput: 0: 42981.6. Samples: 9534334820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:53,391][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 04:08:54,631][15401] Updated weights for policy 0, policy_version 581930 (0.0034) [2024-06-24 04:08:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9534439424. Throughput: 0: 42848.9. Samples: 9534586420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:08:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 04:08:59,194][15401] Updated weights for policy 0, policy_version 581940 (0.0029) [2024-06-24 04:09:02,301][15401] Updated weights for policy 0, policy_version 581950 (0.0034) [2024-06-24 04:09:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9534685184. Throughput: 0: 43050.3. Samples: 9534846620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:09:03,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 04:09:06,644][15401] Updated weights for policy 0, policy_version 581960 (0.0033) [2024-06-24 04:09:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9534898176. Throughput: 0: 42996.9. Samples: 9534983360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-24 04:09:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 04:09:09,819][15401] Updated weights for policy 0, policy_version 581970 (0.0031) [2024-06-24 04:09:13,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 9535094784. Throughput: 0: 42909.7. Samples: 9535234440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:13,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 04:09:14,327][15401] Updated weights for policy 0, policy_version 581980 (0.0039) [2024-06-24 04:09:17,605][15401] Updated weights for policy 0, policy_version 581990 (0.0033) [2024-06-24 04:09:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9535324160. Throughput: 0: 42859.2. Samples: 9535485860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 04:09:21,975][15401] Updated weights for policy 0, policy_version 582000 (0.0034) [2024-06-24 04:09:23,392][15132] Fps is (10 sec: 44236.6, 60 sec: 42869.8, 300 sec: 42876.7). Total num frames: 9535537152. Throughput: 0: 42724.9. Samples: 9535619600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:23,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 04:09:25,550][15401] Updated weights for policy 0, policy_version 582010 (0.0030) [2024-06-24 04:09:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9535733760. Throughput: 0: 42755.2. Samples: 9535871680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 04:09:29,455][15401] Updated weights for policy 0, policy_version 582020 (0.0037) [2024-06-24 04:09:31,059][15349] Signal inference workers to stop experience collection... (141200 times) [2024-06-24 04:09:31,108][15401] InferenceWorker_p0-w0: stopping experience collection (141200 times) [2024-06-24 04:09:31,113][15349] Signal inference workers to resume experience collection... (141200 times) [2024-06-24 04:09:31,119][15401] InferenceWorker_p0-w0: resuming experience collection (141200 times) [2024-06-24 04:09:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9535963136. Throughput: 0: 42804.1. Samples: 9536126320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 04:09:33,483][15401] Updated weights for policy 0, policy_version 582030 (0.0023) [2024-06-24 04:09:37,011][15401] Updated weights for policy 0, policy_version 582040 (0.0032) [2024-06-24 04:09:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 9536176128. Throughput: 0: 42868.6. Samples: 9536263900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 04:09:40,902][15401] Updated weights for policy 0, policy_version 582050 (0.0034) [2024-06-24 04:09:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9536389120. Throughput: 0: 42895.1. Samples: 9536516700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 04:09:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582055_9536389120.pth... [2024-06-24 04:09:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581427_9526099968.pth [2024-06-24 04:09:44,863][15401] Updated weights for policy 0, policy_version 582060 (0.0032) [2024-06-24 04:09:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42932.4). Total num frames: 9536618496. Throughput: 0: 42789.5. Samples: 9536772040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 04:09:48,661][15401] Updated weights for policy 0, policy_version 582070 (0.0038) [2024-06-24 04:09:52,408][15401] Updated weights for policy 0, policy_version 582080 (0.0034) [2024-06-24 04:09:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 9536815104. Throughput: 0: 42753.8. Samples: 9536907280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 04:09:56,314][15401] Updated weights for policy 0, policy_version 582090 (0.0036) [2024-06-24 04:09:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9537011712. Throughput: 0: 42694.2. Samples: 9537155580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:09:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 04:10:00,418][15401] Updated weights for policy 0, policy_version 582100 (0.0035) [2024-06-24 04:10:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9537257472. Throughput: 0: 42743.5. Samples: 9537409320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 04:10:03,880][15401] Updated weights for policy 0, policy_version 582110 (0.0028) [2024-06-24 04:10:08,273][15401] Updated weights for policy 0, policy_version 582120 (0.0035) [2024-06-24 04:10:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9537454080. Throughput: 0: 42705.9. Samples: 9537541260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 04:10:11,896][15401] Updated weights for policy 0, policy_version 582130 (0.0032) [2024-06-24 04:10:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9537650688. Throughput: 0: 42686.4. Samples: 9537792560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 04:10:15,848][15401] Updated weights for policy 0, policy_version 582140 (0.0029) [2024-06-24 04:10:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9537880064. Throughput: 0: 42691.9. Samples: 9538047460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:18,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 04:10:19,338][15401] Updated weights for policy 0, policy_version 582150 (0.0045) [2024-06-24 04:10:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9538093056. Throughput: 0: 42702.2. Samples: 9538185500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 04:10:23,429][15401] Updated weights for policy 0, policy_version 582160 (0.0029) [2024-06-24 04:10:26,808][15401] Updated weights for policy 0, policy_version 582170 (0.0043) [2024-06-24 04:10:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9538289664. Throughput: 0: 42561.3. Samples: 9538431960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:28,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 04:10:30,884][15401] Updated weights for policy 0, policy_version 582180 (0.0039) [2024-06-24 04:10:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9538535424. Throughput: 0: 42616.9. Samples: 9538689800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 04:10:34,357][15401] Updated weights for policy 0, policy_version 582190 (0.0038) [2024-06-24 04:10:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 9538732032. Throughput: 0: 42678.5. Samples: 9538827820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 04:10:38,876][15401] Updated weights for policy 0, policy_version 582200 (0.0041) [2024-06-24 04:10:42,615][15401] Updated weights for policy 0, policy_version 582210 (0.0042) [2024-06-24 04:10:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9538928640. Throughput: 0: 42605.0. Samples: 9539072800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-24 04:10:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 04:10:46,423][15401] Updated weights for policy 0, policy_version 582220 (0.0032) [2024-06-24 04:10:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9539190784. Throughput: 0: 42663.8. Samples: 9539329200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:10:48,391][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 04:10:50,423][15401] Updated weights for policy 0, policy_version 582230 (0.0032) [2024-06-24 04:10:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9539371008. Throughput: 0: 42850.8. Samples: 9539469540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:10:53,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-24 04:10:54,095][15401] Updated weights for policy 0, policy_version 582240 (0.0035) [2024-06-24 04:10:58,339][15401] Updated weights for policy 0, policy_version 582250 (0.0031) [2024-06-24 04:10:58,390][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9539584000. Throughput: 0: 42896.8. Samples: 9539722920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:10:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 04:11:01,600][15349] Signal inference workers to stop experience collection... (141250 times) [2024-06-24 04:11:01,600][15349] Signal inference workers to resume experience collection... (141250 times) [2024-06-24 04:11:01,614][15401] InferenceWorker_p0-w0: stopping experience collection (141250 times) [2024-06-24 04:11:01,614][15401] InferenceWorker_p0-w0: resuming experience collection (141250 times) [2024-06-24 04:11:01,770][15401] Updated weights for policy 0, policy_version 582260 (0.0034) [2024-06-24 04:11:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9539846144. Throughput: 0: 42901.4. Samples: 9539978020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:03,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 04:11:05,841][15401] Updated weights for policy 0, policy_version 582270 (0.0039) [2024-06-24 04:11:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9540026368. Throughput: 0: 42926.9. Samples: 9540117320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:08,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 04:11:09,419][15401] Updated weights for policy 0, policy_version 582280 (0.0033) [2024-06-24 04:11:13,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9540222976. Throughput: 0: 42947.0. Samples: 9540364580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 04:11:13,742][15401] Updated weights for policy 0, policy_version 582290 (0.0036) [2024-06-24 04:11:17,215][15401] Updated weights for policy 0, policy_version 582300 (0.0045) [2024-06-24 04:11:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9540485120. Throughput: 0: 42871.4. Samples: 9540619020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 04:11:21,224][15401] Updated weights for policy 0, policy_version 582310 (0.0032) [2024-06-24 04:11:23,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9540665344. Throughput: 0: 42936.1. Samples: 9540759940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 04:11:24,643][15401] Updated weights for policy 0, policy_version 582320 (0.0029) [2024-06-24 04:11:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 9540878336. Throughput: 0: 42919.4. Samples: 9541004180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:28,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 04:11:28,764][15401] Updated weights for policy 0, policy_version 582330 (0.0037) [2024-06-24 04:11:32,083][15401] Updated weights for policy 0, policy_version 582340 (0.0038) [2024-06-24 04:11:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9541124096. Throughput: 0: 43037.2. Samples: 9541265860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 04:11:36,271][15401] Updated weights for policy 0, policy_version 582350 (0.0033) [2024-06-24 04:11:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9541304320. Throughput: 0: 43039.8. Samples: 9541406340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:38,395][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 04:11:39,501][15401] Updated weights for policy 0, policy_version 582360 (0.0039) [2024-06-24 04:11:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9541533696. Throughput: 0: 42859.5. Samples: 9541651600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 04:11:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582369_9541533696.pth... [2024-06-24 04:11:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000581741_9531244544.pth [2024-06-24 04:11:43,795][15401] Updated weights for policy 0, policy_version 582370 (0.0035) [2024-06-24 04:11:47,403][15401] Updated weights for policy 0, policy_version 582380 (0.0029) [2024-06-24 04:11:48,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.7, 300 sec: 42931.7). Total num frames: 9541763072. Throughput: 0: 43053.0. Samples: 9541915400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 04:11:51,305][15401] Updated weights for policy 0, policy_version 582390 (0.0047) [2024-06-24 04:11:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9541926912. Throughput: 0: 42801.9. Samples: 9542043300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 04:11:55,197][15401] Updated weights for policy 0, policy_version 582400 (0.0041) [2024-06-24 04:11:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9542172672. Throughput: 0: 42682.4. Samples: 9542285280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:11:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 04:11:58,802][15401] Updated weights for policy 0, policy_version 582410 (0.0027) [2024-06-24 04:12:02,298][15349] Signal inference workers to stop experience collection... (141300 times) [2024-06-24 04:12:02,299][15349] Signal inference workers to resume experience collection... (141300 times) [2024-06-24 04:12:02,345][15401] InferenceWorker_p0-w0: stopping experience collection (141300 times) [2024-06-24 04:12:02,346][15401] InferenceWorker_p0-w0: resuming experience collection (141300 times) [2024-06-24 04:12:02,787][15401] Updated weights for policy 0, policy_version 582420 (0.0045) [2024-06-24 04:12:03,390][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9542418432. Throughput: 0: 42885.8. Samples: 9542548880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:12:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 04:12:06,511][15401] Updated weights for policy 0, policy_version 582430 (0.0038) [2024-06-24 04:12:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 9542582272. Throughput: 0: 42723.0. Samples: 9542682480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:12:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 04:12:10,430][15401] Updated weights for policy 0, policy_version 582440 (0.0031) [2024-06-24 04:12:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 9542828032. Throughput: 0: 42737.4. Samples: 9542927360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-24 04:12:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 04:12:14,544][15401] Updated weights for policy 0, policy_version 582450 (0.0028) [2024-06-24 04:12:18,151][15401] Updated weights for policy 0, policy_version 582460 (0.0028) [2024-06-24 04:12:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9543024640. Throughput: 0: 42572.3. Samples: 9543181620. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 04:12:22,215][15401] Updated weights for policy 0, policy_version 582470 (0.0030) [2024-06-24 04:12:23,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 9543221248. Throughput: 0: 42224.1. Samples: 9543306520. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:23,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 04:12:25,673][15401] Updated weights for policy 0, policy_version 582480 (0.0023) [2024-06-24 04:12:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 9543467008. Throughput: 0: 42381.9. Samples: 9543558780. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 04:12:29,859][15401] Updated weights for policy 0, policy_version 582490 (0.0028) [2024-06-24 04:12:33,263][15401] Updated weights for policy 0, policy_version 582500 (0.0027) [2024-06-24 04:12:33,392][15132] Fps is (10 sec: 45875.0, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 9543680000. Throughput: 0: 42363.4. Samples: 9543821860. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:33,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 04:12:37,877][15401] Updated weights for policy 0, policy_version 582510 (0.0035) [2024-06-24 04:12:38,391][15132] Fps is (10 sec: 39315.2, 60 sec: 42597.3, 300 sec: 42764.9). Total num frames: 9543860224. Throughput: 0: 42352.7. Samples: 9543949240. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:38,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 04:12:40,878][15401] Updated weights for policy 0, policy_version 582520 (0.0024) [2024-06-24 04:12:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9544089600. Throughput: 0: 42684.3. Samples: 9544206080. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:43,394][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 04:12:45,488][15401] Updated weights for policy 0, policy_version 582530 (0.0041) [2024-06-24 04:12:48,389][15132] Fps is (10 sec: 45882.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9544318976. Throughput: 0: 42511.3. Samples: 9544461880. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 04:12:48,434][15401] Updated weights for policy 0, policy_version 582540 (0.0025) [2024-06-24 04:12:53,232][15401] Updated weights for policy 0, policy_version 582550 (0.0030) [2024-06-24 04:12:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9544499200. Throughput: 0: 42473.0. Samples: 9544593760. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 04:12:56,442][15401] Updated weights for policy 0, policy_version 582560 (0.0032) [2024-06-24 04:12:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9544728576. Throughput: 0: 42566.7. Samples: 9544842860. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:12:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 04:13:01,096][15401] Updated weights for policy 0, policy_version 582570 (0.0039) [2024-06-24 04:13:03,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42047.8, 300 sec: 42764.1). Total num frames: 9544941568. Throughput: 0: 42621.9. Samples: 9545099880. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:03,397][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 04:13:04,011][15401] Updated weights for policy 0, policy_version 582580 (0.0040) [2024-06-24 04:13:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9545138176. Throughput: 0: 42651.2. Samples: 9545225720. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 04:13:08,907][15401] Updated weights for policy 0, policy_version 582590 (0.0042) [2024-06-24 04:13:11,584][15401] Updated weights for policy 0, policy_version 582600 (0.0030) [2024-06-24 04:13:13,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9545367552. Throughput: 0: 42672.3. Samples: 9545479040. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:13,393][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 04:13:16,355][15401] Updated weights for policy 0, policy_version 582610 (0.0028) [2024-06-24 04:13:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9545564160. Throughput: 0: 42649.8. Samples: 9545741000. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:18,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 04:13:19,465][15401] Updated weights for policy 0, policy_version 582620 (0.0032) [2024-06-24 04:13:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 9545793536. Throughput: 0: 42715.7. Samples: 9545871380. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:23,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 04:13:23,923][15401] Updated weights for policy 0, policy_version 582630 (0.0033) [2024-06-24 04:13:24,876][15349] Signal inference workers to stop experience collection... (141350 times) [2024-06-24 04:13:24,877][15349] Signal inference workers to resume experience collection... (141350 times) [2024-06-24 04:13:24,899][15401] InferenceWorker_p0-w0: stopping experience collection (141350 times) [2024-06-24 04:13:24,899][15401] InferenceWorker_p0-w0: resuming experience collection (141350 times) [2024-06-24 04:13:26,991][15401] Updated weights for policy 0, policy_version 582640 (0.0047) [2024-06-24 04:13:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9546006528. Throughput: 0: 42484.1. Samples: 9546117860. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 04:13:31,858][15401] Updated weights for policy 0, policy_version 582650 (0.0045) [2024-06-24 04:13:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 9546203136. Throughput: 0: 42721.6. Samples: 9546384360. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 04:13:34,432][15401] Updated weights for policy 0, policy_version 582660 (0.0027) [2024-06-24 04:13:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42872.5, 300 sec: 42765.0). Total num frames: 9546432512. Throughput: 0: 42591.0. Samples: 9546510360. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 04:13:39,385][15401] Updated weights for policy 0, policy_version 582670 (0.0039) [2024-06-24 04:13:42,281][15401] Updated weights for policy 0, policy_version 582680 (0.0031) [2024-06-24 04:13:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9546645504. Throughput: 0: 42620.4. Samples: 9546760780. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 04:13:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582681_9546645504.pth... [2024-06-24 04:13:43,445][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582055_9536389120.pth [2024-06-24 04:13:46,895][15401] Updated weights for policy 0, policy_version 582690 (0.0029) [2024-06-24 04:13:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 9546842112. Throughput: 0: 42860.7. Samples: 9547028340. Policy #0 lag: (min: 2.0, avg: 8.2, max: 22.0) [2024-06-24 04:13:48,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 04:13:49,939][15401] Updated weights for policy 0, policy_version 582700 (0.0042) [2024-06-24 04:13:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9547071488. Throughput: 0: 42773.3. Samples: 9547150520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:13:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 04:13:54,884][15401] Updated weights for policy 0, policy_version 582710 (0.0037) [2024-06-24 04:13:57,957][15401] Updated weights for policy 0, policy_version 582720 (0.0040) [2024-06-24 04:13:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9547300864. Throughput: 0: 42749.8. Samples: 9547402780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:13:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 04:14:02,407][15401] Updated weights for policy 0, policy_version 582730 (0.0039) [2024-06-24 04:14:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 9547497472. Throughput: 0: 42660.6. Samples: 9547660720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 04:14:05,639][15401] Updated weights for policy 0, policy_version 582740 (0.0022) [2024-06-24 04:14:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9547710464. Throughput: 0: 42585.0. Samples: 9547787700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 04:14:10,097][15401] Updated weights for policy 0, policy_version 582750 (0.0034) [2024-06-24 04:14:13,313][15401] Updated weights for policy 0, policy_version 582760 (0.0024) [2024-06-24 04:14:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9547939840. Throughput: 0: 42715.1. Samples: 9548040040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:13,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 04:14:17,905][15401] Updated weights for policy 0, policy_version 582770 (0.0039) [2024-06-24 04:14:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9548136448. Throughput: 0: 42577.0. Samples: 9548300320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 04:14:20,961][15401] Updated weights for policy 0, policy_version 582780 (0.0035) [2024-06-24 04:14:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9548349440. Throughput: 0: 42506.3. Samples: 9548423140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 04:14:25,411][15401] Updated weights for policy 0, policy_version 582790 (0.0037) [2024-06-24 04:14:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9548562432. Throughput: 0: 42653.0. Samples: 9548680160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 04:14:28,969][15401] Updated weights for policy 0, policy_version 582800 (0.0052) [2024-06-24 04:14:33,026][15401] Updated weights for policy 0, policy_version 582810 (0.0033) [2024-06-24 04:14:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9548759040. Throughput: 0: 42449.0. Samples: 9548938540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:33,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 04:14:36,573][15401] Updated weights for policy 0, policy_version 582820 (0.0034) [2024-06-24 04:14:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9548988416. Throughput: 0: 42671.7. Samples: 9549070740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 04:14:40,639][15401] Updated weights for policy 0, policy_version 582830 (0.0047) [2024-06-24 04:14:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9549201408. Throughput: 0: 42637.4. Samples: 9549321460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 04:14:44,218][15401] Updated weights for policy 0, policy_version 582840 (0.0036) [2024-06-24 04:14:48,306][15401] Updated weights for policy 0, policy_version 582850 (0.0030) [2024-06-24 04:14:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9549414400. Throughput: 0: 42712.3. Samples: 9549582780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 04:14:51,674][15349] Signal inference workers to stop experience collection... (141400 times) [2024-06-24 04:14:51,674][15349] Signal inference workers to resume experience collection... (141400 times) [2024-06-24 04:14:51,705][15401] InferenceWorker_p0-w0: stopping experience collection (141400 times) [2024-06-24 04:14:51,737][15401] InferenceWorker_p0-w0: resuming experience collection (141400 times) [2024-06-24 04:14:51,816][15401] Updated weights for policy 0, policy_version 582860 (0.0040) [2024-06-24 04:14:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9549627392. Throughput: 0: 42659.0. Samples: 9549707360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:53,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 04:14:55,906][15401] Updated weights for policy 0, policy_version 582870 (0.0045) [2024-06-24 04:14:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9549856768. Throughput: 0: 42652.5. Samples: 9549959400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:14:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 04:14:59,841][15401] Updated weights for policy 0, policy_version 582880 (0.0043) [2024-06-24 04:15:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9550036992. Throughput: 0: 42554.8. Samples: 9550215280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:15:03,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 04:15:03,792][15401] Updated weights for policy 0, policy_version 582890 (0.0031) [2024-06-24 04:15:07,439][15401] Updated weights for policy 0, policy_version 582900 (0.0039) [2024-06-24 04:15:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9550266368. Throughput: 0: 42565.8. Samples: 9550338600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:15:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 04:15:11,345][15401] Updated weights for policy 0, policy_version 582910 (0.0044) [2024-06-24 04:15:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9550495744. Throughput: 0: 42716.4. Samples: 9550602400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:15:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 04:15:14,892][15401] Updated weights for policy 0, policy_version 582920 (0.0022) [2024-06-24 04:15:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9550675968. Throughput: 0: 42805.8. Samples: 9550864800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 04:15:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 04:15:18,914][15401] Updated weights for policy 0, policy_version 582930 (0.0043) [2024-06-24 04:15:22,509][15401] Updated weights for policy 0, policy_version 582940 (0.0045) [2024-06-24 04:15:23,391][15132] Fps is (10 sec: 42590.9, 60 sec: 42870.3, 300 sec: 42820.3). Total num frames: 9550921728. Throughput: 0: 42473.0. Samples: 9550982100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:23,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 04:15:26,368][15401] Updated weights for policy 0, policy_version 582950 (0.0039) [2024-06-24 04:15:28,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 9551134720. Throughput: 0: 42834.1. Samples: 9551249000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 04:15:29,954][15401] Updated weights for policy 0, policy_version 582960 (0.0032) [2024-06-24 04:15:33,389][15132] Fps is (10 sec: 42605.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9551347712. Throughput: 0: 42970.8. Samples: 9551516460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 04:15:33,814][15401] Updated weights for policy 0, policy_version 582970 (0.0048) [2024-06-24 04:15:37,418][15401] Updated weights for policy 0, policy_version 582980 (0.0042) [2024-06-24 04:15:38,393][15132] Fps is (10 sec: 44220.1, 60 sec: 43141.6, 300 sec: 42875.5). Total num frames: 9551577088. Throughput: 0: 42966.0. Samples: 9551641000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:38,394][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 04:15:41,653][15401] Updated weights for policy 0, policy_version 582990 (0.0042) [2024-06-24 04:15:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9551790080. Throughput: 0: 43007.8. Samples: 9551894760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 04:15:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582995_9551790080.pth... [2024-06-24 04:15:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582369_9541533696.pth [2024-06-24 04:15:45,302][15401] Updated weights for policy 0, policy_version 583000 (0.0035) [2024-06-24 04:15:48,389][15132] Fps is (10 sec: 39337.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9551970304. Throughput: 0: 43131.9. Samples: 9552156220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 04:15:49,206][15401] Updated weights for policy 0, policy_version 583010 (0.0028) [2024-06-24 04:15:52,951][15401] Updated weights for policy 0, policy_version 583020 (0.0036) [2024-06-24 04:15:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9552216064. Throughput: 0: 43153.7. Samples: 9552280520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:53,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-24 04:15:56,825][15401] Updated weights for policy 0, policy_version 583030 (0.0030) [2024-06-24 04:15:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 9552412672. Throughput: 0: 42820.7. Samples: 9552529440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:15:58,393][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 04:16:00,538][15401] Updated weights for policy 0, policy_version 583040 (0.0033) [2024-06-24 04:16:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 9552625664. Throughput: 0: 42859.9. Samples: 9552793500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 04:16:04,490][15401] Updated weights for policy 0, policy_version 583050 (0.0031) [2024-06-24 04:16:07,703][15349] Signal inference workers to stop experience collection... (141450 times) [2024-06-24 04:16:07,703][15349] Signal inference workers to resume experience collection... (141450 times) [2024-06-24 04:16:07,721][15401] InferenceWorker_p0-w0: stopping experience collection (141450 times) [2024-06-24 04:16:07,724][15401] InferenceWorker_p0-w0: resuming experience collection (141450 times) [2024-06-24 04:16:08,361][15401] Updated weights for policy 0, policy_version 583060 (0.0040) [2024-06-24 04:16:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9552855040. Throughput: 0: 43032.3. Samples: 9552918480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 04:16:12,055][15401] Updated weights for policy 0, policy_version 583070 (0.0027) [2024-06-24 04:16:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9553068032. Throughput: 0: 42790.1. Samples: 9553174540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 04:16:16,067][15401] Updated weights for policy 0, policy_version 583080 (0.0030) [2024-06-24 04:16:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9553281024. Throughput: 0: 42718.7. Samples: 9553438800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 04:16:19,620][15401] Updated weights for policy 0, policy_version 583090 (0.0036) [2024-06-24 04:16:23,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42599.5, 300 sec: 42709.5). Total num frames: 9553477632. Throughput: 0: 42786.8. Samples: 9553566240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 04:16:23,746][15401] Updated weights for policy 0, policy_version 583100 (0.0033) [2024-06-24 04:16:27,155][15401] Updated weights for policy 0, policy_version 583110 (0.0032) [2024-06-24 04:16:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 9553723392. Throughput: 0: 42824.9. Samples: 9553821980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:28,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 04:16:31,324][15401] Updated weights for policy 0, policy_version 583120 (0.0039) [2024-06-24 04:16:33,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9553920000. Throughput: 0: 42839.0. Samples: 9554084080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:33,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 04:16:35,059][15401] Updated weights for policy 0, policy_version 583130 (0.0032) [2024-06-24 04:16:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42328.2, 300 sec: 42654.0). Total num frames: 9554116608. Throughput: 0: 42724.1. Samples: 9554203100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 04:16:38,889][15401] Updated weights for policy 0, policy_version 583140 (0.0041) [2024-06-24 04:16:42,689][15401] Updated weights for policy 0, policy_version 583150 (0.0031) [2024-06-24 04:16:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9554362368. Throughput: 0: 42951.7. Samples: 9554462160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 04:16:46,745][15401] Updated weights for policy 0, policy_version 583160 (0.0043) [2024-06-24 04:16:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9554542592. Throughput: 0: 42900.5. Samples: 9554724020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 04:16:50,253][15401] Updated weights for policy 0, policy_version 583170 (0.0024) [2024-06-24 04:16:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9554755584. Throughput: 0: 42761.7. Samples: 9554842760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 04:16:53,390][15132] Avg episode reward: [(0, '0.880')] [2024-06-24 04:16:54,355][15401] Updated weights for policy 0, policy_version 583180 (0.0039) [2024-06-24 04:16:57,805][15401] Updated weights for policy 0, policy_version 583190 (0.0034) [2024-06-24 04:16:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43144.6, 300 sec: 42653.6). Total num frames: 9555001344. Throughput: 0: 42816.3. Samples: 9555101380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:16:58,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 04:17:02,078][15401] Updated weights for policy 0, policy_version 583200 (0.0039) [2024-06-24 04:17:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9555181568. Throughput: 0: 42734.6. Samples: 9555361860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 04:17:05,678][15401] Updated weights for policy 0, policy_version 583210 (0.0029) [2024-06-24 04:17:08,390][15132] Fps is (10 sec: 39330.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9555394560. Throughput: 0: 42538.7. Samples: 9555480480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 04:17:09,825][15401] Updated weights for policy 0, policy_version 583220 (0.0030) [2024-06-24 04:17:12,344][15349] Signal inference workers to stop experience collection... (141500 times) [2024-06-24 04:17:12,365][15401] InferenceWorker_p0-w0: stopping experience collection (141500 times) [2024-06-24 04:17:12,457][15349] Signal inference workers to resume experience collection... (141500 times) [2024-06-24 04:17:12,457][15401] InferenceWorker_p0-w0: resuming experience collection (141500 times) [2024-06-24 04:17:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 9555623936. Throughput: 0: 42589.3. Samples: 9555738400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 04:17:13,417][15401] Updated weights for policy 0, policy_version 583230 (0.0040) [2024-06-24 04:17:17,759][15401] Updated weights for policy 0, policy_version 583240 (0.0030) [2024-06-24 04:17:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9555820544. Throughput: 0: 42521.0. Samples: 9555997420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 04:17:20,904][15401] Updated weights for policy 0, policy_version 583250 (0.0035) [2024-06-24 04:17:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9556033536. Throughput: 0: 42558.2. Samples: 9556118220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 04:17:25,686][15401] Updated weights for policy 0, policy_version 583260 (0.0036) [2024-06-24 04:17:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 9556262912. Throughput: 0: 42416.3. Samples: 9556370900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 04:17:28,730][15401] Updated weights for policy 0, policy_version 583270 (0.0041) [2024-06-24 04:17:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42054.0, 300 sec: 42654.2). Total num frames: 9556443136. Throughput: 0: 42383.1. Samples: 9556631260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 04:17:33,438][15401] Updated weights for policy 0, policy_version 583280 (0.0043) [2024-06-24 04:17:36,473][15401] Updated weights for policy 0, policy_version 583290 (0.0038) [2024-06-24 04:17:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9556688896. Throughput: 0: 42433.6. Samples: 9556752280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 04:17:41,189][15401] Updated weights for policy 0, policy_version 583300 (0.0049) [2024-06-24 04:17:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9556901888. Throughput: 0: 42251.8. Samples: 9557002620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 04:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583307_9556901888.pth... [2024-06-24 04:17:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582681_9546645504.pth [2024-06-24 04:17:44,185][15401] Updated weights for policy 0, policy_version 583310 (0.0028) [2024-06-24 04:17:48,390][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9557065728. Throughput: 0: 42416.4. Samples: 9557270600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 04:17:48,707][15401] Updated weights for policy 0, policy_version 583320 (0.0035) [2024-06-24 04:17:52,157][15401] Updated weights for policy 0, policy_version 583330 (0.0028) [2024-06-24 04:17:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9557327872. Throughput: 0: 42469.8. Samples: 9557391620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:53,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 04:17:56,330][15401] Updated weights for policy 0, policy_version 583340 (0.0027) [2024-06-24 04:17:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42053.9, 300 sec: 42654.9). Total num frames: 9557524480. Throughput: 0: 42420.1. Samples: 9557647300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:17:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 04:17:59,815][15401] Updated weights for policy 0, policy_version 583350 (0.0034) [2024-06-24 04:18:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9557704704. Throughput: 0: 42484.8. Samples: 9557909240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:18:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 04:18:04,287][15401] Updated weights for policy 0, policy_version 583360 (0.0028) [2024-06-24 04:18:07,468][15401] Updated weights for policy 0, policy_version 583370 (0.0042) [2024-06-24 04:18:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9557950464. Throughput: 0: 42481.4. Samples: 9558029880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:18:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 04:18:11,856][15401] Updated weights for policy 0, policy_version 583380 (0.0048) [2024-06-24 04:18:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9558179840. Throughput: 0: 42556.0. Samples: 9558285920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:18:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 04:18:15,262][15401] Updated weights for policy 0, policy_version 583390 (0.0039) [2024-06-24 04:18:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 9558343680. Throughput: 0: 42596.5. Samples: 9558548100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:18:18,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 04:18:19,485][15401] Updated weights for policy 0, policy_version 583400 (0.0042) [2024-06-24 04:18:22,879][15401] Updated weights for policy 0, policy_version 583410 (0.0038) [2024-06-24 04:18:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9558605824. Throughput: 0: 42569.0. Samples: 9558667880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 04:18:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 04:18:26,592][15349] Signal inference workers to stop experience collection... (141550 times) [2024-06-24 04:18:26,624][15401] InferenceWorker_p0-w0: stopping experience collection (141550 times) [2024-06-24 04:18:26,651][15349] Signal inference workers to resume experience collection... (141550 times) [2024-06-24 04:18:26,656][15401] InferenceWorker_p0-w0: resuming experience collection (141550 times) [2024-06-24 04:18:26,978][15401] Updated weights for policy 0, policy_version 583420 (0.0029) [2024-06-24 04:18:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9558818816. Throughput: 0: 42702.8. Samples: 9558924240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 04:18:30,537][15401] Updated weights for policy 0, policy_version 583430 (0.0027) [2024-06-24 04:18:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9558999040. Throughput: 0: 42625.2. Samples: 9559188740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 04:18:34,507][15401] Updated weights for policy 0, policy_version 583440 (0.0019) [2024-06-24 04:18:38,108][15401] Updated weights for policy 0, policy_version 583450 (0.0032) [2024-06-24 04:18:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9559244800. Throughput: 0: 42688.5. Samples: 9559312600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 04:18:42,145][15401] Updated weights for policy 0, policy_version 583460 (0.0033) [2024-06-24 04:18:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9559457792. Throughput: 0: 42745.3. Samples: 9559570840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 04:18:45,886][15401] Updated weights for policy 0, policy_version 583470 (0.0034) [2024-06-24 04:18:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9559654400. Throughput: 0: 42740.1. Samples: 9559832540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:48,392][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 04:18:49,593][15401] Updated weights for policy 0, policy_version 583480 (0.0031) [2024-06-24 04:18:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9559883776. Throughput: 0: 42939.2. Samples: 9559962140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 04:18:53,429][15401] Updated weights for policy 0, policy_version 583490 (0.0045) [2024-06-24 04:18:57,190][15401] Updated weights for policy 0, policy_version 583500 (0.0030) [2024-06-24 04:18:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9560113152. Throughput: 0: 42931.1. Samples: 9560217820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:18:58,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-24 04:19:01,027][15401] Updated weights for policy 0, policy_version 583510 (0.0041) [2024-06-24 04:19:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9560293376. Throughput: 0: 43016.8. Samples: 9560483860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 04:19:04,682][15401] Updated weights for policy 0, policy_version 583520 (0.0042) [2024-06-24 04:19:08,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 9560522752. Throughput: 0: 43097.9. Samples: 9560607280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 04:19:08,565][15401] Updated weights for policy 0, policy_version 583530 (0.0032) [2024-06-24 04:19:12,318][15401] Updated weights for policy 0, policy_version 583540 (0.0036) [2024-06-24 04:19:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9560768512. Throughput: 0: 43165.9. Samples: 9560866700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 04:19:16,243][15401] Updated weights for policy 0, policy_version 583550 (0.0033) [2024-06-24 04:19:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9560932352. Throughput: 0: 43149.9. Samples: 9561130480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 04:19:19,878][15401] Updated weights for policy 0, policy_version 583560 (0.0033) [2024-06-24 04:19:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9561178112. Throughput: 0: 43079.1. Samples: 9561251160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 04:19:23,893][15401] Updated weights for policy 0, policy_version 583570 (0.0041) [2024-06-24 04:19:27,740][15401] Updated weights for policy 0, policy_version 583580 (0.0037) [2024-06-24 04:19:28,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9561407488. Throughput: 0: 43193.4. Samples: 9561514540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 04:19:31,749][15401] Updated weights for policy 0, policy_version 583590 (0.0045) [2024-06-24 04:19:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 9561587712. Throughput: 0: 43060.0. Samples: 9561770240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 04:19:34,454][15349] Signal inference workers to stop experience collection... (141600 times) [2024-06-24 04:19:34,455][15349] Signal inference workers to resume experience collection... (141600 times) [2024-06-24 04:19:34,473][15401] InferenceWorker_p0-w0: stopping experience collection (141600 times) [2024-06-24 04:19:34,473][15401] InferenceWorker_p0-w0: resuming experience collection (141600 times) [2024-06-24 04:19:35,259][15401] Updated weights for policy 0, policy_version 583600 (0.0037) [2024-06-24 04:19:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9561833472. Throughput: 0: 42920.7. Samples: 9561893680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:38,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 04:19:39,248][15401] Updated weights for policy 0, policy_version 583610 (0.0033) [2024-06-24 04:19:43,093][15401] Updated weights for policy 0, policy_version 583620 (0.0044) [2024-06-24 04:19:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9562046464. Throughput: 0: 43139.3. Samples: 9562159080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 04:19:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583622_9562062848.pth... [2024-06-24 04:19:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000582995_9551790080.pth [2024-06-24 04:19:47,018][15401] Updated weights for policy 0, policy_version 583630 (0.0021) [2024-06-24 04:19:48,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9562226688. Throughput: 0: 42969.0. Samples: 9562417460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 04:19:50,676][15401] Updated weights for policy 0, policy_version 583640 (0.0035) [2024-06-24 04:19:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 9562488832. Throughput: 0: 42804.6. Samples: 9562533500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 04:19:54,824][15401] Updated weights for policy 0, policy_version 583650 (0.0031) [2024-06-24 04:19:58,260][15401] Updated weights for policy 0, policy_version 583660 (0.0038) [2024-06-24 04:19:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9562685440. Throughput: 0: 42937.3. Samples: 9562798880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 04:19:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 04:20:02,486][15401] Updated weights for policy 0, policy_version 583670 (0.0030) [2024-06-24 04:20:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9562865664. Throughput: 0: 42593.7. Samples: 9563047200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 04:20:05,913][15401] Updated weights for policy 0, policy_version 583680 (0.0028) [2024-06-24 04:20:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9563111424. Throughput: 0: 42707.6. Samples: 9563173000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 04:20:10,086][15401] Updated weights for policy 0, policy_version 583690 (0.0038) [2024-06-24 04:20:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9563308032. Throughput: 0: 42734.2. Samples: 9563437580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 04:20:13,596][15401] Updated weights for policy 0, policy_version 583700 (0.0033) [2024-06-24 04:20:17,749][15401] Updated weights for policy 0, policy_version 583710 (0.0040) [2024-06-24 04:20:18,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 9563504640. Throughput: 0: 42518.2. Samples: 9563683560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 04:20:21,457][15401] Updated weights for policy 0, policy_version 583720 (0.0044) [2024-06-24 04:20:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9563750400. Throughput: 0: 42621.0. Samples: 9563811520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 04:20:25,450][15401] Updated weights for policy 0, policy_version 583730 (0.0026) [2024-06-24 04:20:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 9563930624. Throughput: 0: 42585.8. Samples: 9564075440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 04:20:29,052][15401] Updated weights for policy 0, policy_version 583740 (0.0030) [2024-06-24 04:20:33,113][15401] Updated weights for policy 0, policy_version 583750 (0.0035) [2024-06-24 04:20:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 9564160000. Throughput: 0: 42459.3. Samples: 9564328140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 04:20:36,650][15401] Updated weights for policy 0, policy_version 583760 (0.0032) [2024-06-24 04:20:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 9564389376. Throughput: 0: 42823.2. Samples: 9564460540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 04:20:40,939][15401] Updated weights for policy 0, policy_version 583770 (0.0040) [2024-06-24 04:20:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9564585984. Throughput: 0: 42545.4. Samples: 9564713420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 04:20:43,789][15349] Signal inference workers to stop experience collection... (141650 times) [2024-06-24 04:20:43,835][15401] InferenceWorker_p0-w0: stopping experience collection (141650 times) [2024-06-24 04:20:43,844][15349] Signal inference workers to resume experience collection... (141650 times) [2024-06-24 04:20:43,850][15401] InferenceWorker_p0-w0: resuming experience collection (141650 times) [2024-06-24 04:20:44,505][15401] Updated weights for policy 0, policy_version 583780 (0.0041) [2024-06-24 04:20:48,385][15401] Updated weights for policy 0, policy_version 583790 (0.0028) [2024-06-24 04:20:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9564815360. Throughput: 0: 42694.0. Samples: 9564968420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:48,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 04:20:52,429][15401] Updated weights for policy 0, policy_version 583800 (0.0039) [2024-06-24 04:20:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 9565011968. Throughput: 0: 42832.2. Samples: 9565100460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:53,391][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 04:20:56,102][15401] Updated weights for policy 0, policy_version 583810 (0.0033) [2024-06-24 04:20:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9565224960. Throughput: 0: 42593.7. Samples: 9565354300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:20:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 04:21:00,065][15401] Updated weights for policy 0, policy_version 583820 (0.0033) [2024-06-24 04:21:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9565437952. Throughput: 0: 42594.2. Samples: 9565600300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 04:21:04,098][15401] Updated weights for policy 0, policy_version 583830 (0.0039) [2024-06-24 04:21:07,671][15401] Updated weights for policy 0, policy_version 583840 (0.0037) [2024-06-24 04:21:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9565650944. Throughput: 0: 42690.3. Samples: 9565732580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 04:21:12,127][15401] Updated weights for policy 0, policy_version 583850 (0.0029) [2024-06-24 04:21:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9565880320. Throughput: 0: 42598.9. Samples: 9565992400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:13,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 04:21:15,672][15401] Updated weights for policy 0, policy_version 583860 (0.0027) [2024-06-24 04:21:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9566093312. Throughput: 0: 42480.1. Samples: 9566239740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 04:21:19,804][15401] Updated weights for policy 0, policy_version 583870 (0.0039) [2024-06-24 04:21:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 9566273536. Throughput: 0: 42349.7. Samples: 9566366280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 04:21:23,418][15401] Updated weights for policy 0, policy_version 583880 (0.0033) [2024-06-24 04:21:27,221][15401] Updated weights for policy 0, policy_version 583890 (0.0028) [2024-06-24 04:21:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 9566519296. Throughput: 0: 42674.0. Samples: 9566633760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:21:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 04:21:30,968][15401] Updated weights for policy 0, policy_version 583900 (0.0028) [2024-06-24 04:21:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9566732288. Throughput: 0: 42647.8. Samples: 9566887580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 04:21:34,773][15401] Updated weights for policy 0, policy_version 583910 (0.0033) [2024-06-24 04:21:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 9566912512. Throughput: 0: 42648.1. Samples: 9567019620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 04:21:38,917][15401] Updated weights for policy 0, policy_version 583920 (0.0049) [2024-06-24 04:21:42,317][15401] Updated weights for policy 0, policy_version 583930 (0.0026) [2024-06-24 04:21:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9567141888. Throughput: 0: 42718.2. Samples: 9567276620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 04:21:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583932_9567141888.pth... [2024-06-24 04:21:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583307_9556901888.pth [2024-06-24 04:21:46,394][15401] Updated weights for policy 0, policy_version 583940 (0.0033) [2024-06-24 04:21:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 9567371264. Throughput: 0: 42861.6. Samples: 9567529080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 04:21:49,815][15401] Updated weights for policy 0, policy_version 583950 (0.0036) [2024-06-24 04:21:51,284][15349] Signal inference workers to stop experience collection... (141700 times) [2024-06-24 04:21:51,288][15349] Signal inference workers to resume experience collection... (141700 times) [2024-06-24 04:21:51,306][15401] InferenceWorker_p0-w0: stopping experience collection (141700 times) [2024-06-24 04:21:51,306][15401] InferenceWorker_p0-w0: resuming experience collection (141700 times) [2024-06-24 04:21:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 9567551488. Throughput: 0: 42833.7. Samples: 9567660100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 04:21:54,031][15401] Updated weights for policy 0, policy_version 583960 (0.0024) [2024-06-24 04:21:57,271][15401] Updated weights for policy 0, policy_version 583970 (0.0031) [2024-06-24 04:21:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9567797248. Throughput: 0: 42719.7. Samples: 9567914780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:21:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 04:22:01,587][15401] Updated weights for policy 0, policy_version 583980 (0.0038) [2024-06-24 04:22:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9568026624. Throughput: 0: 42806.3. Samples: 9568166020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:03,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 04:22:05,313][15401] Updated weights for policy 0, policy_version 583990 (0.0045) [2024-06-24 04:22:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 9568206848. Throughput: 0: 42896.1. Samples: 9568296600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 04:22:09,295][15401] Updated weights for policy 0, policy_version 584000 (0.0029) [2024-06-24 04:22:13,111][15401] Updated weights for policy 0, policy_version 584010 (0.0027) [2024-06-24 04:22:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9568419840. Throughput: 0: 42645.5. Samples: 9568552800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:13,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-24 04:22:16,962][15401] Updated weights for policy 0, policy_version 584020 (0.0042) [2024-06-24 04:22:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9568632832. Throughput: 0: 42708.2. Samples: 9568809440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 04:22:20,901][15401] Updated weights for policy 0, policy_version 584030 (0.0042) [2024-06-24 04:22:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9568862208. Throughput: 0: 42682.2. Samples: 9568940320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 04:22:24,430][15401] Updated weights for policy 0, policy_version 584040 (0.0039) [2024-06-24 04:22:28,372][15401] Updated weights for policy 0, policy_version 584050 (0.0028) [2024-06-24 04:22:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9569075200. Throughput: 0: 42682.8. Samples: 9569197340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 04:22:31,965][15401] Updated weights for policy 0, policy_version 584060 (0.0045) [2024-06-24 04:22:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9569288192. Throughput: 0: 42763.7. Samples: 9569453440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:33,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 04:22:36,041][15401] Updated weights for policy 0, policy_version 584070 (0.0034) [2024-06-24 04:22:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9569501184. Throughput: 0: 42722.7. Samples: 9569582620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:38,404][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 04:22:39,388][15401] Updated weights for policy 0, policy_version 584080 (0.0035) [2024-06-24 04:22:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9569714176. Throughput: 0: 42922.1. Samples: 9569846280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 04:22:43,455][15401] Updated weights for policy 0, policy_version 584090 (0.0035) [2024-06-24 04:22:47,577][15401] Updated weights for policy 0, policy_version 584100 (0.0044) [2024-06-24 04:22:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9569943552. Throughput: 0: 43067.2. Samples: 9570104040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 04:22:51,107][15401] Updated weights for policy 0, policy_version 584110 (0.0038) [2024-06-24 04:22:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9570156544. Throughput: 0: 43010.7. Samples: 9570232080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 04:22:54,995][15401] Updated weights for policy 0, policy_version 584120 (0.0030) [2024-06-24 04:22:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9570369536. Throughput: 0: 43186.3. Samples: 9570496180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:22:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 04:22:58,608][15401] Updated weights for policy 0, policy_version 584130 (0.0033) [2024-06-24 04:23:02,463][15401] Updated weights for policy 0, policy_version 584140 (0.0025) [2024-06-24 04:23:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9570598912. Throughput: 0: 43219.3. Samples: 9570754320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:23:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 04:23:06,130][15401] Updated weights for policy 0, policy_version 584150 (0.0032) [2024-06-24 04:23:08,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 9570811904. Throughput: 0: 43162.3. Samples: 9570882900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:08,397][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 04:23:10,272][15401] Updated weights for policy 0, policy_version 584160 (0.0027) [2024-06-24 04:23:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9571008512. Throughput: 0: 43118.6. Samples: 9571137680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 04:23:13,875][15401] Updated weights for policy 0, policy_version 584170 (0.0040) [2024-06-24 04:23:16,420][15349] Signal inference workers to stop experience collection... (141750 times) [2024-06-24 04:23:16,469][15401] InferenceWorker_p0-w0: stopping experience collection (141750 times) [2024-06-24 04:23:16,477][15349] Signal inference workers to resume experience collection... (141750 times) [2024-06-24 04:23:16,494][15401] InferenceWorker_p0-w0: resuming experience collection (141750 times) [2024-06-24 04:23:17,962][15401] Updated weights for policy 0, policy_version 584180 (0.0037) [2024-06-24 04:23:18,389][15132] Fps is (10 sec: 40986.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9571221504. Throughput: 0: 43112.5. Samples: 9571393500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 04:23:21,387][15401] Updated weights for policy 0, policy_version 584190 (0.0038) [2024-06-24 04:23:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9571450880. Throughput: 0: 43098.6. Samples: 9571522060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:23,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 04:23:25,453][15401] Updated weights for policy 0, policy_version 584200 (0.0039) [2024-06-24 04:23:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9571647488. Throughput: 0: 42902.3. Samples: 9571776880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 04:23:29,150][15401] Updated weights for policy 0, policy_version 584210 (0.0037) [2024-06-24 04:23:32,972][15401] Updated weights for policy 0, policy_version 584220 (0.0029) [2024-06-24 04:23:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9571860480. Throughput: 0: 42885.7. Samples: 9572033900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 04:23:36,808][15401] Updated weights for policy 0, policy_version 584230 (0.0032) [2024-06-24 04:23:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9572073472. Throughput: 0: 42826.7. Samples: 9572159280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 04:23:40,631][15401] Updated weights for policy 0, policy_version 584240 (0.0035) [2024-06-24 04:23:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9572286464. Throughput: 0: 42591.0. Samples: 9572412780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 04:23:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584246_9572286464.pth... [2024-06-24 04:23:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583622_9562062848.pth [2024-06-24 04:23:44,433][15401] Updated weights for policy 0, policy_version 584250 (0.0030) [2024-06-24 04:23:48,339][15401] Updated weights for policy 0, policy_version 584260 (0.0035) [2024-06-24 04:23:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9572515840. Throughput: 0: 42529.5. Samples: 9572668140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 04:23:52,318][15401] Updated weights for policy 0, policy_version 584270 (0.0028) [2024-06-24 04:23:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9572696064. Throughput: 0: 42558.1. Samples: 9572797740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 04:23:56,064][15401] Updated weights for policy 0, policy_version 584280 (0.0041) [2024-06-24 04:23:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9572909056. Throughput: 0: 42535.6. Samples: 9573051780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:23:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 04:23:59,842][15401] Updated weights for policy 0, policy_version 584290 (0.0038) [2024-06-24 04:24:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9573138432. Throughput: 0: 42378.7. Samples: 9573300540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 04:24:03,808][15401] Updated weights for policy 0, policy_version 584300 (0.0039) [2024-06-24 04:24:07,946][15401] Updated weights for policy 0, policy_version 584310 (0.0045) [2024-06-24 04:24:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42329.7, 300 sec: 42653.9). Total num frames: 9573351424. Throughput: 0: 42416.8. Samples: 9573430820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 04:24:11,663][15401] Updated weights for policy 0, policy_version 584320 (0.0034) [2024-06-24 04:24:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9573564416. Throughput: 0: 42354.6. Samples: 9573682940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:13,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 04:24:15,340][15401] Updated weights for policy 0, policy_version 584330 (0.0033) [2024-06-24 04:24:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9573761024. Throughput: 0: 42440.9. Samples: 9573943740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 04:24:19,244][15401] Updated weights for policy 0, policy_version 584340 (0.0043) [2024-06-24 04:24:22,801][15401] Updated weights for policy 0, policy_version 584350 (0.0031) [2024-06-24 04:24:23,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9574006784. Throughput: 0: 42541.9. Samples: 9574073660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 04:24:27,187][15401] Updated weights for policy 0, policy_version 584360 (0.0029) [2024-06-24 04:24:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9574203392. Throughput: 0: 42732.0. Samples: 9574335720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 04:24:30,257][15401] Updated weights for policy 0, policy_version 584370 (0.0023) [2024-06-24 04:24:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9574416384. Throughput: 0: 42703.5. Samples: 9574589800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 04:24:34,691][15401] Updated weights for policy 0, policy_version 584380 (0.0030) [2024-06-24 04:24:37,749][15401] Updated weights for policy 0, policy_version 584390 (0.0036) [2024-06-24 04:24:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9574645760. Throughput: 0: 42720.0. Samples: 9574720140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:24:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 04:24:42,511][15401] Updated weights for policy 0, policy_version 584400 (0.0029) [2024-06-24 04:24:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9574842368. Throughput: 0: 42813.7. Samples: 9574978400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:24:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 04:24:45,558][15401] Updated weights for policy 0, policy_version 584410 (0.0041) [2024-06-24 04:24:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 9575071744. Throughput: 0: 42895.8. Samples: 9575230960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:24:48,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 04:24:49,974][15401] Updated weights for policy 0, policy_version 584420 (0.0028) [2024-06-24 04:24:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9575284736. Throughput: 0: 42961.6. Samples: 9575364080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:24:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 04:24:53,422][15401] Updated weights for policy 0, policy_version 584430 (0.0034) [2024-06-24 04:24:57,479][15401] Updated weights for policy 0, policy_version 584440 (0.0035) [2024-06-24 04:24:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9575497728. Throughput: 0: 43045.8. Samples: 9575619900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:24:58,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 04:25:00,843][15401] Updated weights for policy 0, policy_version 584450 (0.0033) [2024-06-24 04:25:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9575710720. Throughput: 0: 42955.0. Samples: 9575876720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 04:25:05,000][15401] Updated weights for policy 0, policy_version 584460 (0.0032) [2024-06-24 04:25:05,004][15349] Signal inference workers to stop experience collection... (141800 times) [2024-06-24 04:25:05,005][15349] Signal inference workers to resume experience collection... (141800 times) [2024-06-24 04:25:05,047][15401] InferenceWorker_p0-w0: stopping experience collection (141800 times) [2024-06-24 04:25:05,047][15401] InferenceWorker_p0-w0: resuming experience collection (141800 times) [2024-06-24 04:25:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9575940096. Throughput: 0: 43067.4. Samples: 9576011700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 04:25:08,555][15401] Updated weights for policy 0, policy_version 584470 (0.0031) [2024-06-24 04:25:12,503][15401] Updated weights for policy 0, policy_version 584480 (0.0026) [2024-06-24 04:25:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 9576136704. Throughput: 0: 42968.4. Samples: 9576269400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:13,392][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 04:25:16,244][15401] Updated weights for policy 0, policy_version 584490 (0.0033) [2024-06-24 04:25:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 9576366080. Throughput: 0: 42925.3. Samples: 9576521540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:18,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 04:25:20,198][15401] Updated weights for policy 0, policy_version 584500 (0.0036) [2024-06-24 04:25:23,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 9576562688. Throughput: 0: 43073.2. Samples: 9576658440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 04:25:23,789][15401] Updated weights for policy 0, policy_version 584510 (0.0031) [2024-06-24 04:25:27,787][15401] Updated weights for policy 0, policy_version 584520 (0.0031) [2024-06-24 04:25:28,393][15132] Fps is (10 sec: 42592.9, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 9576792064. Throughput: 0: 43070.3. Samples: 9576916720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:28,394][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 04:25:31,347][15401] Updated weights for policy 0, policy_version 584530 (0.0034) [2024-06-24 04:25:33,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 9577021440. Throughput: 0: 42984.5. Samples: 9577165260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:33,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 04:25:35,386][15401] Updated weights for policy 0, policy_version 584540 (0.0036) [2024-06-24 04:25:38,390][15132] Fps is (10 sec: 40975.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9577201664. Throughput: 0: 43055.9. Samples: 9577301600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:38,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 04:25:38,909][15401] Updated weights for policy 0, policy_version 584550 (0.0040) [2024-06-24 04:25:43,241][15401] Updated weights for policy 0, policy_version 584560 (0.0037) [2024-06-24 04:25:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9577431040. Throughput: 0: 43083.2. Samples: 9577558640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:43,396][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 04:25:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584560_9577431040.pth... [2024-06-24 04:25:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000583932_9567141888.pth [2024-06-24 04:25:46,597][15401] Updated weights for policy 0, policy_version 584570 (0.0034) [2024-06-24 04:25:48,393][15132] Fps is (10 sec: 45859.8, 60 sec: 43143.9, 300 sec: 42875.6). Total num frames: 9577660416. Throughput: 0: 42857.7. Samples: 9577805460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:48,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 04:25:51,013][15401] Updated weights for policy 0, policy_version 584580 (0.0046) [2024-06-24 04:25:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9577857024. Throughput: 0: 42888.1. Samples: 9577941660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 04:25:54,211][15401] Updated weights for policy 0, policy_version 584590 (0.0040) [2024-06-24 04:25:58,390][15132] Fps is (10 sec: 40973.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9578070016. Throughput: 0: 42863.9. Samples: 9578198180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:25:58,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-24 04:25:58,565][15401] Updated weights for policy 0, policy_version 584600 (0.0047) [2024-06-24 04:26:02,004][15401] Updated weights for policy 0, policy_version 584610 (0.0033) [2024-06-24 04:26:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9578299392. Throughput: 0: 42938.4. Samples: 9578453660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:26:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 04:26:06,090][15401] Updated weights for policy 0, policy_version 584620 (0.0029) [2024-06-24 04:26:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9578512384. Throughput: 0: 42816.5. Samples: 9578585180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 04:26:08,391][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 04:26:09,952][15401] Updated weights for policy 0, policy_version 584630 (0.0034) [2024-06-24 04:26:13,391][15132] Fps is (10 sec: 42591.8, 60 sec: 43145.2, 300 sec: 42820.4). Total num frames: 9578725376. Throughput: 0: 42746.1. Samples: 9578840200. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:13,391][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 04:26:13,638][15401] Updated weights for policy 0, policy_version 584640 (0.0034) [2024-06-24 04:26:17,486][15401] Updated weights for policy 0, policy_version 584650 (0.0030) [2024-06-24 04:26:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 9578921984. Throughput: 0: 42958.3. Samples: 9579098280. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:18,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 04:26:20,393][15349] Signal inference workers to stop experience collection... (141850 times) [2024-06-24 04:26:20,423][15401] InferenceWorker_p0-w0: stopping experience collection (141850 times) [2024-06-24 04:26:20,507][15349] Signal inference workers to resume experience collection... (141850 times) [2024-06-24 04:26:20,507][15401] InferenceWorker_p0-w0: resuming experience collection (141850 times) [2024-06-24 04:26:21,588][15401] Updated weights for policy 0, policy_version 584660 (0.0037) [2024-06-24 04:26:23,392][15132] Fps is (10 sec: 42594.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9579151360. Throughput: 0: 42770.6. Samples: 9579226380. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:23,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 04:26:24,976][15401] Updated weights for policy 0, policy_version 584670 (0.0029) [2024-06-24 04:26:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42874.1, 300 sec: 42820.6). Total num frames: 9579364352. Throughput: 0: 42731.9. Samples: 9579481580. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 04:26:29,070][15401] Updated weights for policy 0, policy_version 584680 (0.0033) [2024-06-24 04:26:32,704][15401] Updated weights for policy 0, policy_version 584690 (0.0035) [2024-06-24 04:26:33,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42873.1, 300 sec: 42987.1). Total num frames: 9579593728. Throughput: 0: 43061.3. Samples: 9579743080. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 04:26:36,681][15401] Updated weights for policy 0, policy_version 584700 (0.0033) [2024-06-24 04:26:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9579790336. Throughput: 0: 42910.7. Samples: 9579872640. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 04:26:40,600][15401] Updated weights for policy 0, policy_version 584710 (0.0052) [2024-06-24 04:26:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9580003328. Throughput: 0: 42681.9. Samples: 9580118860. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 04:26:44,292][15401] Updated weights for policy 0, policy_version 584720 (0.0037) [2024-06-24 04:26:48,230][15401] Updated weights for policy 0, policy_version 584730 (0.0036) [2024-06-24 04:26:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.8, 300 sec: 42931.6). Total num frames: 9580216320. Throughput: 0: 42860.4. Samples: 9580382380. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 04:26:52,007][15401] Updated weights for policy 0, policy_version 584740 (0.0023) [2024-06-24 04:26:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9580412928. Throughput: 0: 42745.7. Samples: 9580508740. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 04:26:55,877][15401] Updated weights for policy 0, policy_version 584750 (0.0037) [2024-06-24 04:26:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9580642304. Throughput: 0: 42651.5. Samples: 9580759460. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:26:58,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 04:26:59,578][15401] Updated weights for policy 0, policy_version 584760 (0.0030) [2024-06-24 04:27:03,383][15401] Updated weights for policy 0, policy_version 584770 (0.0038) [2024-06-24 04:27:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9580871680. Throughput: 0: 42829.8. Samples: 9581025620. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 04:27:07,211][15401] Updated weights for policy 0, policy_version 584780 (0.0034) [2024-06-24 04:27:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9581068288. Throughput: 0: 42764.9. Samples: 9581150700. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:08,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 04:27:10,976][15401] Updated weights for policy 0, policy_version 584790 (0.0037) [2024-06-24 04:27:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42599.4, 300 sec: 42876.1). Total num frames: 9581281280. Throughput: 0: 42651.1. Samples: 9581400880. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:13,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 04:27:14,749][15401] Updated weights for policy 0, policy_version 584800 (0.0035) [2024-06-24 04:27:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9581494272. Throughput: 0: 42640.6. Samples: 9581661900. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 04:27:18,876][15401] Updated weights for policy 0, policy_version 584810 (0.0037) [2024-06-24 04:27:22,121][15401] Updated weights for policy 0, policy_version 584820 (0.0043) [2024-06-24 04:27:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 9581707264. Throughput: 0: 42603.9. Samples: 9581789920. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:23,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 04:27:26,450][15401] Updated weights for policy 0, policy_version 584830 (0.0028) [2024-06-24 04:27:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9581936640. Throughput: 0: 42763.1. Samples: 9582043200. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:28,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 04:27:29,890][15401] Updated weights for policy 0, policy_version 584840 (0.0035) [2024-06-24 04:27:33,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41779.4, 300 sec: 42709.5). Total num frames: 9582100480. Throughput: 0: 42747.6. Samples: 9582306020. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 04:27:34,225][15401] Updated weights for policy 0, policy_version 584850 (0.0036) [2024-06-24 04:27:38,093][15401] Updated weights for policy 0, policy_version 584860 (0.0046) [2024-06-24 04:27:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9582346240. Throughput: 0: 42602.4. Samples: 9582425840. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 04:27:41,907][15401] Updated weights for policy 0, policy_version 584870 (0.0028) [2024-06-24 04:27:43,392][15132] Fps is (10 sec: 47501.6, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9582575616. Throughput: 0: 42748.4. Samples: 9582683240. Policy #0 lag: (min: 1.0, avg: 12.3, max: 23.0) [2024-06-24 04:27:43,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 04:27:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584874_9582575616.pth... [2024-06-24 04:27:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584246_9572286464.pth [2024-06-24 04:27:45,520][15401] Updated weights for policy 0, policy_version 584880 (0.0030) [2024-06-24 04:27:47,180][15349] Signal inference workers to stop experience collection... (141900 times) [2024-06-24 04:27:47,183][15349] Signal inference workers to resume experience collection... (141900 times) [2024-06-24 04:27:47,201][15401] InferenceWorker_p0-w0: stopping experience collection (141900 times) [2024-06-24 04:27:47,201][15401] InferenceWorker_p0-w0: resuming experience collection (141900 times) [2024-06-24 04:27:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9582755840. Throughput: 0: 42675.9. Samples: 9582946040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:27:48,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 04:27:49,566][15401] Updated weights for policy 0, policy_version 584890 (0.0036) [2024-06-24 04:27:53,051][15401] Updated weights for policy 0, policy_version 584900 (0.0032) [2024-06-24 04:27:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9583001600. Throughput: 0: 42630.3. Samples: 9583069060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:27:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 04:27:57,196][15401] Updated weights for policy 0, policy_version 584910 (0.0025) [2024-06-24 04:27:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9583214592. Throughput: 0: 42788.1. Samples: 9583326340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:27:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 04:28:00,583][15401] Updated weights for policy 0, policy_version 584920 (0.0037) [2024-06-24 04:28:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42654.9). Total num frames: 9583394816. Throughput: 0: 42610.7. Samples: 9583579380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 04:28:04,946][15401] Updated weights for policy 0, policy_version 584930 (0.0043) [2024-06-24 04:28:08,320][15401] Updated weights for policy 0, policy_version 584940 (0.0029) [2024-06-24 04:28:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9583656960. Throughput: 0: 42506.2. Samples: 9583702600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 04:28:12,919][15401] Updated weights for policy 0, policy_version 584950 (0.0033) [2024-06-24 04:28:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9583853568. Throughput: 0: 42719.5. Samples: 9583965580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 04:28:15,979][15401] Updated weights for policy 0, policy_version 584960 (0.0034) [2024-06-24 04:28:18,389][15132] Fps is (10 sec: 37684.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9584033792. Throughput: 0: 42444.9. Samples: 9584216040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:18,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 04:28:20,650][15401] Updated weights for policy 0, policy_version 584970 (0.0032) [2024-06-24 04:28:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 9584279552. Throughput: 0: 42495.9. Samples: 9584338160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:23,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 04:28:23,689][15401] Updated weights for policy 0, policy_version 584980 (0.0030) [2024-06-24 04:28:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 9584459776. Throughput: 0: 42605.9. Samples: 9584600400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:28,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 04:28:28,515][15401] Updated weights for policy 0, policy_version 584990 (0.0031) [2024-06-24 04:28:31,775][15401] Updated weights for policy 0, policy_version 585000 (0.0028) [2024-06-24 04:28:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9584672768. Throughput: 0: 42373.4. Samples: 9584852840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 04:28:36,218][15401] Updated weights for policy 0, policy_version 585010 (0.0032) [2024-06-24 04:28:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9584934912. Throughput: 0: 42458.6. Samples: 9584979700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:38,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 04:28:39,301][15401] Updated weights for policy 0, policy_version 585020 (0.0041) [2024-06-24 04:28:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41507.8, 300 sec: 42542.9). Total num frames: 9585065984. Throughput: 0: 42500.4. Samples: 9585238860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:43,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 04:28:43,996][15401] Updated weights for policy 0, policy_version 585030 (0.0026) [2024-06-24 04:28:46,797][15401] Updated weights for policy 0, policy_version 585040 (0.0041) [2024-06-24 04:28:48,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9585311744. Throughput: 0: 42593.3. Samples: 9585496080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 04:28:51,516][15401] Updated weights for policy 0, policy_version 585050 (0.0034) [2024-06-24 04:28:53,389][15132] Fps is (10 sec: 50790.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9585573888. Throughput: 0: 42832.6. Samples: 9585630060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 04:28:54,439][15401] Updated weights for policy 0, policy_version 585060 (0.0036) [2024-06-24 04:28:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 9585721344. Throughput: 0: 42684.1. Samples: 9585886360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:28:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 04:28:59,131][15401] Updated weights for policy 0, policy_version 585070 (0.0037) [2024-06-24 04:28:59,520][15349] Signal inference workers to stop experience collection... (141950 times) [2024-06-24 04:28:59,572][15401] InferenceWorker_p0-w0: stopping experience collection (141950 times) [2024-06-24 04:28:59,630][15349] Signal inference workers to resume experience collection... (141950 times) [2024-06-24 04:28:59,630][15401] InferenceWorker_p0-w0: resuming experience collection (141950 times) [2024-06-24 04:29:02,098][15401] Updated weights for policy 0, policy_version 585080 (0.0026) [2024-06-24 04:29:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9585967104. Throughput: 0: 42643.4. Samples: 9586135000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:29:03,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-24 04:29:06,802][15401] Updated weights for policy 0, policy_version 585090 (0.0032) [2024-06-24 04:29:08,392][15132] Fps is (10 sec: 50778.2, 60 sec: 42869.9, 300 sec: 42931.6). Total num frames: 9586229248. Throughput: 0: 42879.5. Samples: 9586267840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:29:08,392][15132] Avg episode reward: [(0, '0.136')] [2024-06-24 04:29:10,333][15401] Updated weights for policy 0, policy_version 585100 (0.0027) [2024-06-24 04:29:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 9586376704. Throughput: 0: 42800.3. Samples: 9586526420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 04:29:13,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 04:29:14,441][15401] Updated weights for policy 0, policy_version 585110 (0.0031) [2024-06-24 04:29:17,758][15401] Updated weights for policy 0, policy_version 585120 (0.0046) [2024-06-24 04:29:18,390][15132] Fps is (10 sec: 37691.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 9586606080. Throughput: 0: 42757.3. Samples: 9586776920. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 04:29:21,992][15401] Updated weights for policy 0, policy_version 585130 (0.0031) [2024-06-24 04:29:23,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9586851840. Throughput: 0: 42897.2. Samples: 9586910080. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 04:29:25,361][15401] Updated weights for policy 0, policy_version 585140 (0.0040) [2024-06-24 04:29:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9587015680. Throughput: 0: 42956.9. Samples: 9587171920. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:28,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 04:29:29,605][15401] Updated weights for policy 0, policy_version 585150 (0.0044) [2024-06-24 04:29:32,855][15401] Updated weights for policy 0, policy_version 585160 (0.0032) [2024-06-24 04:29:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9587261440. Throughput: 0: 42639.0. Samples: 9587414840. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 04:29:37,461][15401] Updated weights for policy 0, policy_version 585170 (0.0043) [2024-06-24 04:29:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9587474432. Throughput: 0: 42655.9. Samples: 9587549580. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 04:29:40,263][15401] Updated weights for policy 0, policy_version 585180 (0.0030) [2024-06-24 04:29:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 9587671040. Throughput: 0: 42781.2. Samples: 9587811520. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 04:29:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585185_9587671040.pth... [2024-06-24 04:29:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584560_9577431040.pth [2024-06-24 04:29:44,945][15401] Updated weights for policy 0, policy_version 585190 (0.0028) [2024-06-24 04:29:48,336][15401] Updated weights for policy 0, policy_version 585200 (0.0038) [2024-06-24 04:29:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9587916800. Throughput: 0: 42782.2. Samples: 9588060200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 04:29:52,522][15401] Updated weights for policy 0, policy_version 585210 (0.0023) [2024-06-24 04:29:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9588113408. Throughput: 0: 42878.1. Samples: 9588197260. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 04:29:55,794][15401] Updated weights for policy 0, policy_version 585220 (0.0042) [2024-06-24 04:29:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9588326400. Throughput: 0: 42837.9. Samples: 9588454120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:29:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 04:30:00,491][15401] Updated weights for policy 0, policy_version 585230 (0.0041) [2024-06-24 04:30:03,351][15401] Updated weights for policy 0, policy_version 585240 (0.0031) [2024-06-24 04:30:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 9588572160. Throughput: 0: 42720.4. Samples: 9588699340. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 04:30:07,717][15349] Signal inference workers to stop experience collection... (142000 times) [2024-06-24 04:30:07,725][15349] Signal inference workers to resume experience collection... (142000 times) [2024-06-24 04:30:07,770][15401] InferenceWorker_p0-w0: stopping experience collection (142000 times) [2024-06-24 04:30:07,770][15401] InferenceWorker_p0-w0: resuming experience collection (142000 times) [2024-06-24 04:30:08,083][15401] Updated weights for policy 0, policy_version 585250 (0.0029) [2024-06-24 04:30:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41780.8, 300 sec: 42709.8). Total num frames: 9588736000. Throughput: 0: 42615.7. Samples: 9588827780. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:08,394][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 04:30:11,165][15401] Updated weights for policy 0, policy_version 585260 (0.0036) [2024-06-24 04:30:13,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 9588948992. Throughput: 0: 42358.3. Samples: 9589078040. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 04:30:15,741][15401] Updated weights for policy 0, policy_version 585270 (0.0040) [2024-06-24 04:30:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 9589178368. Throughput: 0: 42644.2. Samples: 9589333820. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 04:30:18,955][15401] Updated weights for policy 0, policy_version 585280 (0.0030) [2024-06-24 04:30:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.7, 300 sec: 42654.1). Total num frames: 9589374976. Throughput: 0: 42554.2. Samples: 9589464620. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:23,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 04:30:23,724][15401] Updated weights for policy 0, policy_version 585290 (0.0039) [2024-06-24 04:30:26,552][15401] Updated weights for policy 0, policy_version 585300 (0.0035) [2024-06-24 04:30:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 9589604352. Throughput: 0: 42325.4. Samples: 9589716160. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:28,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 04:30:31,346][15401] Updated weights for policy 0, policy_version 585310 (0.0024) [2024-06-24 04:30:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9589817344. Throughput: 0: 42560.4. Samples: 9589975420. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 04:30:34,191][15401] Updated weights for policy 0, policy_version 585320 (0.0021) [2024-06-24 04:30:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9590013952. Throughput: 0: 42430.0. Samples: 9590106600. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 04:30:38,900][15401] Updated weights for policy 0, policy_version 585330 (0.0031) [2024-06-24 04:30:42,069][15401] Updated weights for policy 0, policy_version 585340 (0.0027) [2024-06-24 04:30:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.9). Total num frames: 9590226944. Throughput: 0: 42136.3. Samples: 9590350260. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 04:30:46,600][15401] Updated weights for policy 0, policy_version 585350 (0.0035) [2024-06-24 04:30:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9590423552. Throughput: 0: 42618.3. Samples: 9590617160. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-24 04:30:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 04:30:49,737][15401] Updated weights for policy 0, policy_version 585360 (0.0033) [2024-06-24 04:30:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9590652928. Throughput: 0: 42596.0. Samples: 9590744600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:30:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 04:30:53,976][15401] Updated weights for policy 0, policy_version 585370 (0.0040) [2024-06-24 04:30:57,611][15401] Updated weights for policy 0, policy_version 585380 (0.0037) [2024-06-24 04:30:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9590882304. Throughput: 0: 42613.6. Samples: 9590995660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:30:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:31:01,687][15401] Updated weights for policy 0, policy_version 585390 (0.0024) [2024-06-24 04:31:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9591078912. Throughput: 0: 42747.8. Samples: 9591257480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 04:31:05,156][15401] Updated weights for policy 0, policy_version 585400 (0.0031) [2024-06-24 04:31:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42598.3). Total num frames: 9591291904. Throughput: 0: 42513.8. Samples: 9591377740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:08,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 04:31:09,068][15349] Signal inference workers to stop experience collection... (142050 times) [2024-06-24 04:31:09,068][15349] Signal inference workers to resume experience collection... (142050 times) [2024-06-24 04:31:09,109][15401] InferenceWorker_p0-w0: stopping experience collection (142050 times) [2024-06-24 04:31:09,110][15401] InferenceWorker_p0-w0: resuming experience collection (142050 times) [2024-06-24 04:31:09,371][15401] Updated weights for policy 0, policy_version 585410 (0.0020) [2024-06-24 04:31:13,030][15401] Updated weights for policy 0, policy_version 585420 (0.0035) [2024-06-24 04:31:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 9591521280. Throughput: 0: 42639.1. Samples: 9591635020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:13,392][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 04:31:17,047][15401] Updated weights for policy 0, policy_version 585430 (0.0030) [2024-06-24 04:31:18,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 9591717888. Throughput: 0: 42614.3. Samples: 9591893060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 04:31:20,833][15401] Updated weights for policy 0, policy_version 585440 (0.0034) [2024-06-24 04:31:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 9591947264. Throughput: 0: 42541.2. Samples: 9592020960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 04:31:24,597][15401] Updated weights for policy 0, policy_version 585450 (0.0035) [2024-06-24 04:31:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9592160256. Throughput: 0: 42875.7. Samples: 9592279660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 04:31:28,531][15401] Updated weights for policy 0, policy_version 585460 (0.0038) [2024-06-24 04:31:32,307][15401] Updated weights for policy 0, policy_version 585470 (0.0039) [2024-06-24 04:31:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 9592356864. Throughput: 0: 42663.0. Samples: 9592537100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:33,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 04:31:36,196][15401] Updated weights for policy 0, policy_version 585480 (0.0036) [2024-06-24 04:31:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9592586240. Throughput: 0: 42709.4. Samples: 9592666520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 04:31:39,744][15401] Updated weights for policy 0, policy_version 585490 (0.0040) [2024-06-24 04:31:43,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9592799232. Throughput: 0: 42845.8. Samples: 9592923720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 04:31:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585498_9592799232.pth... [2024-06-24 04:31:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000584874_9582575616.pth [2024-06-24 04:31:43,908][15401] Updated weights for policy 0, policy_version 585500 (0.0028) [2024-06-24 04:31:47,374][15401] Updated weights for policy 0, policy_version 585510 (0.0038) [2024-06-24 04:31:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9593012224. Throughput: 0: 42693.9. Samples: 9593178700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 04:31:51,926][15401] Updated weights for policy 0, policy_version 585520 (0.0039) [2024-06-24 04:31:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9593225216. Throughput: 0: 42924.0. Samples: 9593309220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:53,391][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 04:31:54,887][15401] Updated weights for policy 0, policy_version 585530 (0.0033) [2024-06-24 04:31:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9593421824. Throughput: 0: 42906.3. Samples: 9593565700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:31:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 04:31:59,462][15401] Updated weights for policy 0, policy_version 585540 (0.0033) [2024-06-24 04:32:02,445][15401] Updated weights for policy 0, policy_version 585550 (0.0024) [2024-06-24 04:32:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9593667584. Throughput: 0: 42751.4. Samples: 9593816880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:32:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 04:32:07,474][15401] Updated weights for policy 0, policy_version 585560 (0.0046) [2024-06-24 04:32:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 9593864192. Throughput: 0: 42861.8. Samples: 9593949740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:32:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 04:32:10,329][15401] Updated weights for policy 0, policy_version 585570 (0.0041) [2024-06-24 04:32:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 9594060800. Throughput: 0: 42737.8. Samples: 9594202860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:32:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 04:32:15,036][15401] Updated weights for policy 0, policy_version 585580 (0.0040) [2024-06-24 04:32:17,857][15401] Updated weights for policy 0, policy_version 585590 (0.0051) [2024-06-24 04:32:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 9594322944. Throughput: 0: 42654.8. Samples: 9594456460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 04:32:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 04:32:22,471][15401] Updated weights for policy 0, policy_version 585600 (0.0036) [2024-06-24 04:32:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9594519552. Throughput: 0: 42922.5. Samples: 9594598040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 04:32:24,125][15349] Signal inference workers to stop experience collection... (142100 times) [2024-06-24 04:32:24,125][15349] Signal inference workers to resume experience collection... (142100 times) [2024-06-24 04:32:24,165][15401] InferenceWorker_p0-w0: stopping experience collection (142100 times) [2024-06-24 04:32:24,166][15401] InferenceWorker_p0-w0: resuming experience collection (142100 times) [2024-06-24 04:32:25,572][15401] Updated weights for policy 0, policy_version 585610 (0.0031) [2024-06-24 04:32:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9594699776. Throughput: 0: 42722.7. Samples: 9594846240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 04:32:29,915][15401] Updated weights for policy 0, policy_version 585620 (0.0032) [2024-06-24 04:32:33,217][15401] Updated weights for policy 0, policy_version 585630 (0.0037) [2024-06-24 04:32:33,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43692.4, 300 sec: 42820.5). Total num frames: 9594978304. Throughput: 0: 42688.4. Samples: 9595099680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 04:32:37,442][15401] Updated weights for policy 0, policy_version 585640 (0.0028) [2024-06-24 04:32:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 9595142144. Throughput: 0: 42800.4. Samples: 9595235240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 04:32:40,829][15401] Updated weights for policy 0, policy_version 585650 (0.0030) [2024-06-24 04:32:43,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9595338752. Throughput: 0: 42686.6. Samples: 9595486600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 04:32:45,074][15401] Updated weights for policy 0, policy_version 585660 (0.0038) [2024-06-24 04:32:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9595600896. Throughput: 0: 42749.7. Samples: 9595740620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 04:32:48,511][15401] Updated weights for policy 0, policy_version 585670 (0.0028) [2024-06-24 04:32:52,835][15401] Updated weights for policy 0, policy_version 585680 (0.0035) [2024-06-24 04:32:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9595781120. Throughput: 0: 42847.2. Samples: 9595877860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 04:32:56,282][15401] Updated weights for policy 0, policy_version 585690 (0.0037) [2024-06-24 04:32:58,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9595994112. Throughput: 0: 42740.9. Samples: 9596126200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:32:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 04:33:00,492][15401] Updated weights for policy 0, policy_version 585700 (0.0032) [2024-06-24 04:33:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9596223488. Throughput: 0: 42800.4. Samples: 9596382480. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 04:33:03,906][15401] Updated weights for policy 0, policy_version 585710 (0.0032) [2024-06-24 04:33:08,171][15401] Updated weights for policy 0, policy_version 585720 (0.0028) [2024-06-24 04:33:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9596436480. Throughput: 0: 42685.4. Samples: 9596518880. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 04:33:11,697][15401] Updated weights for policy 0, policy_version 585730 (0.0030) [2024-06-24 04:33:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9596649472. Throughput: 0: 42732.3. Samples: 9596769200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 04:33:15,741][15401] Updated weights for policy 0, policy_version 585740 (0.0051) [2024-06-24 04:33:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9596862464. Throughput: 0: 42688.4. Samples: 9597020660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 04:33:19,631][15401] Updated weights for policy 0, policy_version 585750 (0.0022) [2024-06-24 04:33:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9597059072. Throughput: 0: 42628.6. Samples: 9597153520. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 04:33:23,694][15401] Updated weights for policy 0, policy_version 585760 (0.0040) [2024-06-24 04:33:27,249][15401] Updated weights for policy 0, policy_version 585770 (0.0031) [2024-06-24 04:33:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9597288448. Throughput: 0: 42694.2. Samples: 9597407840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 04:33:31,263][15401] Updated weights for policy 0, policy_version 585780 (0.0030) [2024-06-24 04:33:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9597501440. Throughput: 0: 42781.9. Samples: 9597665800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 04:33:34,876][15401] Updated weights for policy 0, policy_version 585790 (0.0043) [2024-06-24 04:33:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9597698048. Throughput: 0: 42645.8. Samples: 9597796920. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 04:33:39,219][15401] Updated weights for policy 0, policy_version 585800 (0.0027) [2024-06-24 04:33:39,683][15349] Signal inference workers to stop experience collection... (142150 times) [2024-06-24 04:33:39,683][15349] Signal inference workers to resume experience collection... (142150 times) [2024-06-24 04:33:39,704][15401] InferenceWorker_p0-w0: stopping experience collection (142150 times) [2024-06-24 04:33:39,705][15401] InferenceWorker_p0-w0: resuming experience collection (142150 times) [2024-06-24 04:33:42,556][15401] Updated weights for policy 0, policy_version 585810 (0.0028) [2024-06-24 04:33:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9597927424. Throughput: 0: 42831.1. Samples: 9598053600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 04:33:43,614][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585813_9597960192.pth... [2024-06-24 04:33:43,673][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585185_9587671040.pth [2024-06-24 04:33:46,770][15401] Updated weights for policy 0, policy_version 585820 (0.0027) [2024-06-24 04:33:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9598156800. Throughput: 0: 42743.7. Samples: 9598305940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 04:33:50,531][15401] Updated weights for policy 0, policy_version 585830 (0.0040) [2024-06-24 04:33:53,390][15132] Fps is (10 sec: 40957.0, 60 sec: 42597.8, 300 sec: 42764.9). Total num frames: 9598337024. Throughput: 0: 42630.8. Samples: 9598437300. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-24 04:33:53,391][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 04:33:54,232][15401] Updated weights for policy 0, policy_version 585840 (0.0027) [2024-06-24 04:33:58,191][15401] Updated weights for policy 0, policy_version 585850 (0.0045) [2024-06-24 04:33:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9598566400. Throughput: 0: 42701.9. Samples: 9598690780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:33:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 04:34:01,741][15401] Updated weights for policy 0, policy_version 585860 (0.0035) [2024-06-24 04:34:03,389][15132] Fps is (10 sec: 45879.2, 60 sec: 42871.6, 300 sec: 42598.7). Total num frames: 9598795776. Throughput: 0: 42903.7. Samples: 9598951320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 04:34:05,771][15401] Updated weights for policy 0, policy_version 585870 (0.0043) [2024-06-24 04:34:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9598976000. Throughput: 0: 42840.9. Samples: 9599081360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 04:34:09,389][15401] Updated weights for policy 0, policy_version 585880 (0.0033) [2024-06-24 04:34:13,345][15401] Updated weights for policy 0, policy_version 585890 (0.0037) [2024-06-24 04:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9599221760. Throughput: 0: 42718.7. Samples: 9599330180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 04:34:17,029][15401] Updated weights for policy 0, policy_version 585900 (0.0024) [2024-06-24 04:34:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 9599434752. Throughput: 0: 42719.7. Samples: 9599588180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 04:34:21,159][15401] Updated weights for policy 0, policy_version 585910 (0.0036) [2024-06-24 04:34:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9599631360. Throughput: 0: 42781.8. Samples: 9599722100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 04:34:24,787][15401] Updated weights for policy 0, policy_version 585920 (0.0029) [2024-06-24 04:34:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9599860736. Throughput: 0: 42775.2. Samples: 9599978480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 04:34:28,752][15401] Updated weights for policy 0, policy_version 585930 (0.0033) [2024-06-24 04:34:32,312][15401] Updated weights for policy 0, policy_version 585940 (0.0044) [2024-06-24 04:34:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9600090112. Throughput: 0: 42913.7. Samples: 9600237060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 04:34:36,266][15401] Updated weights for policy 0, policy_version 585950 (0.0034) [2024-06-24 04:34:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9600270336. Throughput: 0: 42874.1. Samples: 9600366600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 04:34:39,855][15401] Updated weights for policy 0, policy_version 585960 (0.0030) [2024-06-24 04:34:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9600516096. Throughput: 0: 42897.7. Samples: 9600621180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 04:34:43,768][15401] Updated weights for policy 0, policy_version 585970 (0.0031) [2024-06-24 04:34:47,545][15401] Updated weights for policy 0, policy_version 585980 (0.0033) [2024-06-24 04:34:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9600729088. Throughput: 0: 42934.1. Samples: 9600883360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 04:34:51,334][15401] Updated weights for policy 0, policy_version 585990 (0.0037) [2024-06-24 04:34:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43145.2, 300 sec: 42709.5). Total num frames: 9600925696. Throughput: 0: 42889.8. Samples: 9601011400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 04:34:55,239][15401] Updated weights for policy 0, policy_version 586000 (0.0026) [2024-06-24 04:34:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 9601155072. Throughput: 0: 42905.4. Samples: 9601260920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:34:58,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 04:34:58,770][15349] Signal inference workers to stop experience collection... (142200 times) [2024-06-24 04:34:58,823][15401] InferenceWorker_p0-w0: stopping experience collection (142200 times) [2024-06-24 04:34:58,825][15349] Signal inference workers to resume experience collection... (142200 times) [2024-06-24 04:34:58,851][15401] InferenceWorker_p0-w0: resuming experience collection (142200 times) [2024-06-24 04:34:58,962][15401] Updated weights for policy 0, policy_version 586010 (0.0034) [2024-06-24 04:35:02,945][15401] Updated weights for policy 0, policy_version 586020 (0.0038) [2024-06-24 04:35:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9601368064. Throughput: 0: 42896.8. Samples: 9601518540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:35:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 04:35:06,673][15401] Updated weights for policy 0, policy_version 586030 (0.0038) [2024-06-24 04:35:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9601548288. Throughput: 0: 42859.6. Samples: 9601650780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:35:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 04:35:10,643][15401] Updated weights for policy 0, policy_version 586040 (0.0030) [2024-06-24 04:35:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9601794048. Throughput: 0: 42705.8. Samples: 9601900240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:35:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 04:35:14,467][15401] Updated weights for policy 0, policy_version 586050 (0.0031) [2024-06-24 04:35:18,283][15401] Updated weights for policy 0, policy_version 586060 (0.0029) [2024-06-24 04:35:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9602007040. Throughput: 0: 42647.1. Samples: 9602156180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:35:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 04:35:22,076][15401] Updated weights for policy 0, policy_version 586070 (0.0036) [2024-06-24 04:35:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9602187264. Throughput: 0: 42560.0. Samples: 9602281800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 04:35:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 04:35:26,058][15401] Updated weights for policy 0, policy_version 586080 (0.0037) [2024-06-24 04:35:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9602433024. Throughput: 0: 42553.8. Samples: 9602536100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 04:35:30,170][15401] Updated weights for policy 0, policy_version 586090 (0.0034) [2024-06-24 04:35:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9602629632. Throughput: 0: 42525.6. Samples: 9602797020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 04:35:33,704][15401] Updated weights for policy 0, policy_version 586100 (0.0039) [2024-06-24 04:35:37,763][15401] Updated weights for policy 0, policy_version 586110 (0.0033) [2024-06-24 04:35:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9602826240. Throughput: 0: 42397.2. Samples: 9602919280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 04:35:41,316][15401] Updated weights for policy 0, policy_version 586120 (0.0037) [2024-06-24 04:35:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9603088384. Throughput: 0: 42658.6. Samples: 9603180560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 04:35:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586126_9603088384.pth... [2024-06-24 04:35:43,445][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585498_9592799232.pth [2024-06-24 04:35:46,038][15401] Updated weights for policy 0, policy_version 586130 (0.0039) [2024-06-24 04:35:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9603268608. Throughput: 0: 42552.4. Samples: 9603433400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 04:35:49,158][15401] Updated weights for policy 0, policy_version 586140 (0.0039) [2024-06-24 04:35:53,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9603448832. Throughput: 0: 42247.0. Samples: 9603551900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 04:35:53,906][15401] Updated weights for policy 0, policy_version 586150 (0.0032) [2024-06-24 04:35:57,054][15401] Updated weights for policy 0, policy_version 586160 (0.0040) [2024-06-24 04:35:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9603710976. Throughput: 0: 42392.9. Samples: 9603807920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:35:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 04:36:02,058][15401] Updated weights for policy 0, policy_version 586170 (0.0038) [2024-06-24 04:36:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 9603891200. Throughput: 0: 42382.6. Samples: 9604063400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 04:36:04,561][15401] Updated weights for policy 0, policy_version 586180 (0.0043) [2024-06-24 04:36:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 9604104192. Throughput: 0: 42369.6. Samples: 9604188420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 04:36:09,679][15401] Updated weights for policy 0, policy_version 586190 (0.0029) [2024-06-24 04:36:12,423][15401] Updated weights for policy 0, policy_version 586200 (0.0035) [2024-06-24 04:36:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9604333568. Throughput: 0: 42438.7. Samples: 9604445840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 04:36:17,036][15349] Signal inference workers to stop experience collection... (142250 times) [2024-06-24 04:36:17,060][15401] InferenceWorker_p0-w0: stopping experience collection (142250 times) [2024-06-24 04:36:17,151][15349] Signal inference workers to resume experience collection... (142250 times) [2024-06-24 04:36:17,151][15401] InferenceWorker_p0-w0: resuming experience collection (142250 times) [2024-06-24 04:36:17,282][15401] Updated weights for policy 0, policy_version 586210 (0.0038) [2024-06-24 04:36:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9604530176. Throughput: 0: 42407.8. Samples: 9604705360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 04:36:19,937][15401] Updated weights for policy 0, policy_version 586220 (0.0030) [2024-06-24 04:36:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9604759552. Throughput: 0: 42478.4. Samples: 9604830800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 04:36:24,883][15401] Updated weights for policy 0, policy_version 586230 (0.0023) [2024-06-24 04:36:27,449][15401] Updated weights for policy 0, policy_version 586240 (0.0045) [2024-06-24 04:36:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 9604988928. Throughput: 0: 42332.8. Samples: 9605085540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:28,391][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 04:36:32,721][15401] Updated weights for policy 0, policy_version 586250 (0.0039) [2024-06-24 04:36:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 9605169152. Throughput: 0: 42547.6. Samples: 9605348040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 04:36:35,286][15401] Updated weights for policy 0, policy_version 586260 (0.0036) [2024-06-24 04:36:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9605414912. Throughput: 0: 42680.4. Samples: 9605472520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 04:36:40,346][15401] Updated weights for policy 0, policy_version 586270 (0.0043) [2024-06-24 04:36:42,858][15401] Updated weights for policy 0, policy_version 586280 (0.0037) [2024-06-24 04:36:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9605611520. Throughput: 0: 42675.1. Samples: 9605728300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 04:36:47,818][15401] Updated weights for policy 0, policy_version 586290 (0.0039) [2024-06-24 04:36:48,390][15132] Fps is (10 sec: 37681.3, 60 sec: 42051.9, 300 sec: 42598.3). Total num frames: 9605791744. Throughput: 0: 42713.7. Samples: 9605985540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:48,391][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 04:36:50,679][15401] Updated weights for policy 0, policy_version 586300 (0.0045) [2024-06-24 04:36:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9606053888. Throughput: 0: 42693.6. Samples: 9606109640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 04:36:55,313][15401] Updated weights for policy 0, policy_version 586310 (0.0038) [2024-06-24 04:36:58,307][15401] Updated weights for policy 0, policy_version 586320 (0.0031) [2024-06-24 04:36:58,389][15132] Fps is (10 sec: 47516.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9606266880. Throughput: 0: 42699.1. Samples: 9606367300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 04:36:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 04:37:02,883][15401] Updated weights for policy 0, policy_version 586330 (0.0036) [2024-06-24 04:37:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9606430720. Throughput: 0: 42712.4. Samples: 9606627420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 04:37:06,099][15401] Updated weights for policy 0, policy_version 586340 (0.0031) [2024-06-24 04:37:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9606676480. Throughput: 0: 42633.0. Samples: 9606749280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 04:37:10,504][15401] Updated weights for policy 0, policy_version 586350 (0.0027) [2024-06-24 04:37:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9606889472. Throughput: 0: 42735.1. Samples: 9607008620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 04:37:13,747][15401] Updated weights for policy 0, policy_version 586360 (0.0034) [2024-06-24 04:37:18,274][15401] Updated weights for policy 0, policy_version 586370 (0.0028) [2024-06-24 04:37:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9607086080. Throughput: 0: 42639.1. Samples: 9607266800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 04:37:21,448][15401] Updated weights for policy 0, policy_version 586380 (0.0031) [2024-06-24 04:37:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9607315456. Throughput: 0: 42545.9. Samples: 9607387080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:37:26,110][15401] Updated weights for policy 0, policy_version 586390 (0.0035) [2024-06-24 04:37:26,374][15349] Signal inference workers to stop experience collection... (142300 times) [2024-06-24 04:37:26,375][15349] Signal inference workers to resume experience collection... (142300 times) [2024-06-24 04:37:26,409][15401] InferenceWorker_p0-w0: stopping experience collection (142300 times) [2024-06-24 04:37:26,410][15401] InferenceWorker_p0-w0: resuming experience collection (142300 times) [2024-06-24 04:37:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 9607512064. Throughput: 0: 42567.1. Samples: 9607643820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 04:37:29,475][15401] Updated weights for policy 0, policy_version 586400 (0.0038) [2024-06-24 04:37:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9607708672. Throughput: 0: 42533.8. Samples: 9607899540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 04:37:33,645][15401] Updated weights for policy 0, policy_version 586410 (0.0036) [2024-06-24 04:37:37,509][15401] Updated weights for policy 0, policy_version 586420 (0.0039) [2024-06-24 04:37:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 9607921664. Throughput: 0: 42439.7. Samples: 9608019420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 04:37:41,369][15401] Updated weights for policy 0, policy_version 586430 (0.0032) [2024-06-24 04:37:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9608151040. Throughput: 0: 42355.9. Samples: 9608273320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:43,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 04:37:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586435_9608151040.pth... [2024-06-24 04:37:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000585813_9597960192.pth [2024-06-24 04:37:45,492][15401] Updated weights for policy 0, policy_version 586440 (0.0038) [2024-06-24 04:37:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 9608364032. Throughput: 0: 42295.5. Samples: 9608530720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 04:37:48,887][15401] Updated weights for policy 0, policy_version 586450 (0.0027) [2024-06-24 04:37:53,071][15401] Updated weights for policy 0, policy_version 586460 (0.0038) [2024-06-24 04:37:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 9608560640. Throughput: 0: 42358.5. Samples: 9608655420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 04:37:56,370][15401] Updated weights for policy 0, policy_version 586470 (0.0033) [2024-06-24 04:37:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9608790016. Throughput: 0: 42328.1. Samples: 9608913380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:37:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:38:00,922][15401] Updated weights for policy 0, policy_version 586480 (0.0026) [2024-06-24 04:38:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9609003008. Throughput: 0: 42443.2. Samples: 9609176740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 04:38:03,870][15401] Updated weights for policy 0, policy_version 586490 (0.0030) [2024-06-24 04:38:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 9609199616. Throughput: 0: 42535.1. Samples: 9609301160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 04:38:08,514][15401] Updated weights for policy 0, policy_version 586500 (0.0033) [2024-06-24 04:38:11,568][15401] Updated weights for policy 0, policy_version 586510 (0.0032) [2024-06-24 04:38:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9609428992. Throughput: 0: 42405.4. Samples: 9609552060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 04:38:16,180][15401] Updated weights for policy 0, policy_version 586520 (0.0028) [2024-06-24 04:38:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9609641984. Throughput: 0: 42490.4. Samples: 9609811600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 04:38:19,114][15401] Updated weights for policy 0, policy_version 586530 (0.0028) [2024-06-24 04:38:23,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 9609838592. Throughput: 0: 42654.6. Samples: 9609938880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 04:38:23,890][15401] Updated weights for policy 0, policy_version 586540 (0.0031) [2024-06-24 04:38:27,055][15401] Updated weights for policy 0, policy_version 586550 (0.0038) [2024-06-24 04:38:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9610084352. Throughput: 0: 42676.9. Samples: 9610193780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 04:38:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 04:38:31,492][15401] Updated weights for policy 0, policy_version 586560 (0.0033) [2024-06-24 04:38:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9610280960. Throughput: 0: 42734.7. Samples: 9610453780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 04:38:34,654][15401] Updated weights for policy 0, policy_version 586570 (0.0042) [2024-06-24 04:38:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9610493952. Throughput: 0: 42805.8. Samples: 9610581680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 04:38:39,068][15401] Updated weights for policy 0, policy_version 586580 (0.0033) [2024-06-24 04:38:42,020][15349] Signal inference workers to stop experience collection... (142350 times) [2024-06-24 04:38:42,030][15349] Signal inference workers to resume experience collection... (142350 times) [2024-06-24 04:38:42,044][15401] InferenceWorker_p0-w0: stopping experience collection (142350 times) [2024-06-24 04:38:42,045][15401] InferenceWorker_p0-w0: resuming experience collection (142350 times) [2024-06-24 04:38:42,201][15401] Updated weights for policy 0, policy_version 586590 (0.0023) [2024-06-24 04:38:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9610739712. Throughput: 0: 42837.0. Samples: 9610841040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 04:38:46,647][15401] Updated weights for policy 0, policy_version 586600 (0.0032) [2024-06-24 04:38:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 9610919936. Throughput: 0: 42843.5. Samples: 9611104700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 04:38:49,786][15401] Updated weights for policy 0, policy_version 586610 (0.0051) [2024-06-24 04:38:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9611149312. Throughput: 0: 42727.6. Samples: 9611223900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:53,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 04:38:54,186][15401] Updated weights for policy 0, policy_version 586620 (0.0037) [2024-06-24 04:38:57,482][15401] Updated weights for policy 0, policy_version 586630 (0.0042) [2024-06-24 04:38:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 9611395072. Throughput: 0: 43059.5. Samples: 9611489740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:38:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 04:39:02,182][15401] Updated weights for policy 0, policy_version 586640 (0.0036) [2024-06-24 04:39:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9611575296. Throughput: 0: 43030.6. Samples: 9611747980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 04:39:05,080][15401] Updated weights for policy 0, policy_version 586650 (0.0025) [2024-06-24 04:39:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 9611804672. Throughput: 0: 42888.1. Samples: 9611868840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 04:39:09,661][15401] Updated weights for policy 0, policy_version 586660 (0.0032) [2024-06-24 04:39:12,787][15401] Updated weights for policy 0, policy_version 586670 (0.0030) [2024-06-24 04:39:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.4, 300 sec: 42709.5). Total num frames: 9612034048. Throughput: 0: 43112.8. Samples: 9612133860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 04:39:17,014][15401] Updated weights for policy 0, policy_version 586680 (0.0042) [2024-06-24 04:39:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9612214272. Throughput: 0: 42987.6. Samples: 9612388220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 04:39:20,413][15401] Updated weights for policy 0, policy_version 586690 (0.0029) [2024-06-24 04:39:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 9612443648. Throughput: 0: 42900.9. Samples: 9612512220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 04:39:24,769][15401] Updated weights for policy 0, policy_version 586700 (0.0035) [2024-06-24 04:39:28,142][15401] Updated weights for policy 0, policy_version 586710 (0.0042) [2024-06-24 04:39:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9612673024. Throughput: 0: 43043.4. Samples: 9612778000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 04:39:32,426][15401] Updated weights for policy 0, policy_version 586720 (0.0041) [2024-06-24 04:39:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9612869632. Throughput: 0: 42881.6. Samples: 9613034380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 04:39:35,886][15401] Updated weights for policy 0, policy_version 586730 (0.0029) [2024-06-24 04:39:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.4, 300 sec: 42653.9). Total num frames: 9613099008. Throughput: 0: 43089.1. Samples: 9613162920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 04:39:40,214][15401] Updated weights for policy 0, policy_version 586740 (0.0030) [2024-06-24 04:39:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9613295616. Throughput: 0: 42795.9. Samples: 9613415560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 04:39:43,503][15401] Updated weights for policy 0, policy_version 586750 (0.0042) [2024-06-24 04:39:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586750_9613312000.pth... [2024-06-24 04:39:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586126_9603088384.pth [2024-06-24 04:39:47,936][15401] Updated weights for policy 0, policy_version 586760 (0.0039) [2024-06-24 04:39:48,389][15132] Fps is (10 sec: 40961.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9613508608. Throughput: 0: 42847.3. Samples: 9613676100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 04:39:51,231][15401] Updated weights for policy 0, policy_version 586770 (0.0036) [2024-06-24 04:39:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9613737984. Throughput: 0: 42985.8. Samples: 9613803200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 04:39:55,441][15401] Updated weights for policy 0, policy_version 586780 (0.0029) [2024-06-24 04:39:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9613934592. Throughput: 0: 42715.6. Samples: 9614056060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:39:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 04:39:59,082][15401] Updated weights for policy 0, policy_version 586790 (0.0024) [2024-06-24 04:40:02,883][15401] Updated weights for policy 0, policy_version 586800 (0.0028) [2024-06-24 04:40:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9614147584. Throughput: 0: 42948.0. Samples: 9614320880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 04:40:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 04:40:06,802][15349] Signal inference workers to stop experience collection... (142400 times) [2024-06-24 04:40:06,855][15401] InferenceWorker_p0-w0: stopping experience collection (142400 times) [2024-06-24 04:40:06,865][15349] Signal inference workers to resume experience collection... (142400 times) [2024-06-24 04:40:06,866][15401] InferenceWorker_p0-w0: resuming experience collection (142400 times) [2024-06-24 04:40:06,874][15401] Updated weights for policy 0, policy_version 586810 (0.0047) [2024-06-24 04:40:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9614393344. Throughput: 0: 43070.6. Samples: 9614450400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 04:40:10,349][15401] Updated weights for policy 0, policy_version 586820 (0.0026) [2024-06-24 04:40:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9614573568. Throughput: 0: 42677.8. Samples: 9614698500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 04:40:14,618][15401] Updated weights for policy 0, policy_version 586830 (0.0031) [2024-06-24 04:40:17,841][15401] Updated weights for policy 0, policy_version 586840 (0.0028) [2024-06-24 04:40:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9614786560. Throughput: 0: 42741.8. Samples: 9614957760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:18,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 04:40:22,335][15401] Updated weights for policy 0, policy_version 586850 (0.0037) [2024-06-24 04:40:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9615032320. Throughput: 0: 42776.3. Samples: 9615087840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 04:40:25,365][15401] Updated weights for policy 0, policy_version 586860 (0.0030) [2024-06-24 04:40:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 9615228928. Throughput: 0: 42787.7. Samples: 9615341000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 04:40:29,791][15401] Updated weights for policy 0, policy_version 586870 (0.0033) [2024-06-24 04:40:33,126][15401] Updated weights for policy 0, policy_version 586880 (0.0044) [2024-06-24 04:40:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9615441920. Throughput: 0: 42605.6. Samples: 9615593360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:40:37,313][15401] Updated weights for policy 0, policy_version 586890 (0.0038) [2024-06-24 04:40:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 9615638528. Throughput: 0: 42567.1. Samples: 9615718720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 04:40:41,281][15401] Updated weights for policy 0, policy_version 586900 (0.0033) [2024-06-24 04:40:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9615867904. Throughput: 0: 42688.4. Samples: 9615977040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 04:40:44,819][15401] Updated weights for policy 0, policy_version 586910 (0.0030) [2024-06-24 04:40:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9616064512. Throughput: 0: 42678.2. Samples: 9616241400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 04:40:48,763][15401] Updated weights for policy 0, policy_version 586920 (0.0027) [2024-06-24 04:40:52,250][15401] Updated weights for policy 0, policy_version 586930 (0.0037) [2024-06-24 04:40:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9616277504. Throughput: 0: 42571.2. Samples: 9616366100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 04:40:56,296][15401] Updated weights for policy 0, policy_version 586940 (0.0027) [2024-06-24 04:40:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9616506880. Throughput: 0: 42735.6. Samples: 9616621600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:40:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 04:40:59,823][15401] Updated weights for policy 0, policy_version 586950 (0.0027) [2024-06-24 04:41:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9616703488. Throughput: 0: 42697.0. Samples: 9616879120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 04:41:03,830][15401] Updated weights for policy 0, policy_version 586960 (0.0048) [2024-06-24 04:41:07,493][15401] Updated weights for policy 0, policy_version 586970 (0.0030) [2024-06-24 04:41:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9616916480. Throughput: 0: 42556.8. Samples: 9617002900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 04:41:11,758][15401] Updated weights for policy 0, policy_version 586980 (0.0035) [2024-06-24 04:41:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9617145856. Throughput: 0: 42572.8. Samples: 9617256780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 04:41:15,349][15401] Updated weights for policy 0, policy_version 586990 (0.0041) [2024-06-24 04:41:17,764][15349] Signal inference workers to stop experience collection... (142450 times) [2024-06-24 04:41:17,800][15401] InferenceWorker_p0-w0: stopping experience collection (142450 times) [2024-06-24 04:41:17,829][15349] Signal inference workers to resume experience collection... (142450 times) [2024-06-24 04:41:17,829][15401] InferenceWorker_p0-w0: resuming experience collection (142450 times) [2024-06-24 04:41:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9617342464. Throughput: 0: 42840.9. Samples: 9617521200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 04:41:19,419][15401] Updated weights for policy 0, policy_version 587000 (0.0035) [2024-06-24 04:41:23,107][15401] Updated weights for policy 0, policy_version 587010 (0.0041) [2024-06-24 04:41:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 9617571840. Throughput: 0: 42696.8. Samples: 9617640180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:23,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 04:41:27,164][15401] Updated weights for policy 0, policy_version 587020 (0.0035) [2024-06-24 04:41:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9617801216. Throughput: 0: 42683.5. Samples: 9617897800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:28,404][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 04:41:31,346][15401] Updated weights for policy 0, policy_version 587030 (0.0036) [2024-06-24 04:41:33,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9617981440. Throughput: 0: 42565.0. Samples: 9618156820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 04:41:34,723][15401] Updated weights for policy 0, policy_version 587040 (0.0037) [2024-06-24 04:41:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9618194432. Throughput: 0: 42454.7. Samples: 9618276560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 04:41:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 04:41:39,062][15401] Updated weights for policy 0, policy_version 587050 (0.0031) [2024-06-24 04:41:42,494][15401] Updated weights for policy 0, policy_version 587060 (0.0031) [2024-06-24 04:41:43,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9618423808. Throughput: 0: 42537.8. Samples: 9618535800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:41:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 04:41:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587063_9618440192.pth... [2024-06-24 04:41:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586435_9608151040.pth [2024-06-24 04:41:46,785][15401] Updated weights for policy 0, policy_version 587070 (0.0047) [2024-06-24 04:41:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9618620416. Throughput: 0: 42590.6. Samples: 9618795700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:41:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 04:41:50,034][15401] Updated weights for policy 0, policy_version 587080 (0.0040) [2024-06-24 04:41:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9618833408. Throughput: 0: 42489.7. Samples: 9618914940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:41:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 04:41:54,400][15401] Updated weights for policy 0, policy_version 587090 (0.0037) [2024-06-24 04:41:57,594][15401] Updated weights for policy 0, policy_version 587100 (0.0025) [2024-06-24 04:41:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9619062784. Throughput: 0: 42625.3. Samples: 9619174920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:41:58,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-24 04:42:02,421][15401] Updated weights for policy 0, policy_version 587110 (0.0039) [2024-06-24 04:42:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9619243008. Throughput: 0: 42619.2. Samples: 9619439060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:42:05,214][15401] Updated weights for policy 0, policy_version 587120 (0.0024) [2024-06-24 04:42:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9619488768. Throughput: 0: 42633.9. Samples: 9619558600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 04:42:10,067][15401] Updated weights for policy 0, policy_version 587130 (0.0028) [2024-06-24 04:42:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9619685376. Throughput: 0: 42596.6. Samples: 9619814640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 04:42:13,485][15401] Updated weights for policy 0, policy_version 587140 (0.0048) [2024-06-24 04:42:17,663][15401] Updated weights for policy 0, policy_version 587150 (0.0027) [2024-06-24 04:42:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9619898368. Throughput: 0: 42535.9. Samples: 9620070940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 04:42:21,064][15401] Updated weights for policy 0, policy_version 587160 (0.0046) [2024-06-24 04:42:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9620127744. Throughput: 0: 42627.5. Samples: 9620194800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 04:42:25,172][15401] Updated weights for policy 0, policy_version 587170 (0.0027) [2024-06-24 04:42:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 9620324352. Throughput: 0: 42584.1. Samples: 9620452080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 04:42:28,643][15401] Updated weights for policy 0, policy_version 587180 (0.0051) [2024-06-24 04:42:32,831][15401] Updated weights for policy 0, policy_version 587190 (0.0030) [2024-06-24 04:42:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 9620537344. Throughput: 0: 42652.8. Samples: 9620715080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:33,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 04:42:36,573][15401] Updated weights for policy 0, policy_version 587200 (0.0046) [2024-06-24 04:42:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9620783104. Throughput: 0: 42764.5. Samples: 9620839340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 04:42:40,298][15401] Updated weights for policy 0, policy_version 587210 (0.0039) [2024-06-24 04:42:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9620979712. Throughput: 0: 42646.7. Samples: 9621094020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 04:42:44,272][15401] Updated weights for policy 0, policy_version 587220 (0.0029) [2024-06-24 04:42:48,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9621159936. Throughput: 0: 42430.7. Samples: 9621348440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 04:42:48,416][15401] Updated weights for policy 0, policy_version 587230 (0.0034) [2024-06-24 04:42:49,781][15349] Signal inference workers to stop experience collection... (142500 times) [2024-06-24 04:42:49,788][15349] Signal inference workers to resume experience collection... (142500 times) [2024-06-24 04:42:49,815][15401] InferenceWorker_p0-w0: stopping experience collection (142500 times) [2024-06-24 04:42:49,816][15401] InferenceWorker_p0-w0: resuming experience collection (142500 times) [2024-06-24 04:42:51,809][15401] Updated weights for policy 0, policy_version 587240 (0.0041) [2024-06-24 04:42:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9621389312. Throughput: 0: 42483.6. Samples: 9621470360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 04:42:56,045][15401] Updated weights for policy 0, policy_version 587250 (0.0033) [2024-06-24 04:42:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9621602304. Throughput: 0: 42629.8. Samples: 9621732980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:42:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 04:42:59,305][15401] Updated weights for policy 0, policy_version 587260 (0.0051) [2024-06-24 04:43:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9621815296. Throughput: 0: 42699.2. Samples: 9621992400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:43:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 04:43:03,485][15401] Updated weights for policy 0, policy_version 587270 (0.0028) [2024-06-24 04:43:06,683][15401] Updated weights for policy 0, policy_version 587280 (0.0036) [2024-06-24 04:43:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9622044672. Throughput: 0: 42797.7. Samples: 9622120700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 04:43:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 04:43:10,956][15401] Updated weights for policy 0, policy_version 587290 (0.0033) [2024-06-24 04:43:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9622274048. Throughput: 0: 42799.4. Samples: 9622378060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 04:43:14,265][15401] Updated weights for policy 0, policy_version 587300 (0.0036) [2024-06-24 04:43:18,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9622470656. Throughput: 0: 42680.7. Samples: 9622635700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 04:43:18,487][15401] Updated weights for policy 0, policy_version 587310 (0.0031) [2024-06-24 04:43:22,254][15401] Updated weights for policy 0, policy_version 587320 (0.0031) [2024-06-24 04:43:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9622700032. Throughput: 0: 42595.0. Samples: 9622756120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:23,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 04:43:26,383][15401] Updated weights for policy 0, policy_version 587330 (0.0029) [2024-06-24 04:43:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9622896640. Throughput: 0: 42747.6. Samples: 9623017660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 04:43:29,799][15401] Updated weights for policy 0, policy_version 587340 (0.0036) [2024-06-24 04:43:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9623093248. Throughput: 0: 42911.5. Samples: 9623279460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 04:43:34,177][15401] Updated weights for policy 0, policy_version 587350 (0.0030) [2024-06-24 04:43:37,489][15401] Updated weights for policy 0, policy_version 587360 (0.0030) [2024-06-24 04:43:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 9623339008. Throughput: 0: 42843.9. Samples: 9623398440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:38,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 04:43:41,906][15401] Updated weights for policy 0, policy_version 587370 (0.0039) [2024-06-24 04:43:43,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9623552000. Throughput: 0: 42822.1. Samples: 9623660080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:43,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 04:43:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587376_9623568384.pth... [2024-06-24 04:43:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000586750_9613312000.pth [2024-06-24 04:43:45,047][15401] Updated weights for policy 0, policy_version 587380 (0.0039) [2024-06-24 04:43:48,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9623732224. Throughput: 0: 42804.8. Samples: 9623918620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 04:43:49,546][15401] Updated weights for policy 0, policy_version 587390 (0.0023) [2024-06-24 04:43:52,860][15401] Updated weights for policy 0, policy_version 587400 (0.0027) [2024-06-24 04:43:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9623977984. Throughput: 0: 42657.0. Samples: 9624040260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:53,396][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 04:43:57,200][15401] Updated weights for policy 0, policy_version 587410 (0.0031) [2024-06-24 04:43:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9624190976. Throughput: 0: 42647.2. Samples: 9624297180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:43:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 04:44:00,670][15401] Updated weights for policy 0, policy_version 587420 (0.0028) [2024-06-24 04:44:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9624354816. Throughput: 0: 42533.2. Samples: 9624549700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 04:44:05,225][15401] Updated weights for policy 0, policy_version 587430 (0.0031) [2024-06-24 04:44:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 9624600576. Throughput: 0: 42478.9. Samples: 9624667660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 04:44:08,591][15401] Updated weights for policy 0, policy_version 587440 (0.0032) [2024-06-24 04:44:12,901][15401] Updated weights for policy 0, policy_version 587450 (0.0029) [2024-06-24 04:44:13,186][15349] Signal inference workers to stop experience collection... (142550 times) [2024-06-24 04:44:13,215][15401] InferenceWorker_p0-w0: stopping experience collection (142550 times) [2024-06-24 04:44:13,240][15349] Signal inference workers to resume experience collection... (142550 times) [2024-06-24 04:44:13,244][15401] InferenceWorker_p0-w0: resuming experience collection (142550 times) [2024-06-24 04:44:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 9624797184. Throughput: 0: 42563.6. Samples: 9624933020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 04:44:16,230][15401] Updated weights for policy 0, policy_version 587460 (0.0032) [2024-06-24 04:44:18,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 9624993792. Throughput: 0: 42323.6. Samples: 9625184020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 04:44:20,429][15401] Updated weights for policy 0, policy_version 587470 (0.0039) [2024-06-24 04:44:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9625239552. Throughput: 0: 42495.7. Samples: 9625310640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 04:44:23,823][15401] Updated weights for policy 0, policy_version 587480 (0.0027) [2024-06-24 04:44:27,885][15401] Updated weights for policy 0, policy_version 587490 (0.0029) [2024-06-24 04:44:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9625452544. Throughput: 0: 42498.8. Samples: 9625572420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 04:44:31,373][15401] Updated weights for policy 0, policy_version 587500 (0.0028) [2024-06-24 04:44:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9625649152. Throughput: 0: 42355.5. Samples: 9625824620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 04:44:35,959][15401] Updated weights for policy 0, policy_version 587510 (0.0031) [2024-06-24 04:44:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 9625894912. Throughput: 0: 42472.4. Samples: 9625951520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 04:44:38,933][15401] Updated weights for policy 0, policy_version 587520 (0.0037) [2024-06-24 04:44:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.9, 300 sec: 42542.8). Total num frames: 9626058752. Throughput: 0: 42401.3. Samples: 9626205240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 04:44:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:44:43,711][15401] Updated weights for policy 0, policy_version 587530 (0.0024) [2024-06-24 04:44:46,641][15401] Updated weights for policy 0, policy_version 587540 (0.0036) [2024-06-24 04:44:48,392][15132] Fps is (10 sec: 39311.1, 60 sec: 42596.5, 300 sec: 42542.5). Total num frames: 9626288128. Throughput: 0: 42392.5. Samples: 9626457480. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:44:48,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 04:44:51,382][15401] Updated weights for policy 0, policy_version 587550 (0.0038) [2024-06-24 04:44:53,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9626517504. Throughput: 0: 42705.9. Samples: 9626589440. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:44:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 04:44:54,319][15401] Updated weights for policy 0, policy_version 587560 (0.0028) [2024-06-24 04:44:58,389][15132] Fps is (10 sec: 42610.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9626714112. Throughput: 0: 42536.8. Samples: 9626847180. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:44:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 04:44:59,034][15401] Updated weights for policy 0, policy_version 587570 (0.0036) [2024-06-24 04:45:01,935][15401] Updated weights for policy 0, policy_version 587580 (0.0035) [2024-06-24 04:45:03,390][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 9626943488. Throughput: 0: 42497.8. Samples: 9627096420. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 04:45:06,694][15401] Updated weights for policy 0, policy_version 587590 (0.0032) [2024-06-24 04:45:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9627156480. Throughput: 0: 42768.0. Samples: 9627235200. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 04:45:09,683][15401] Updated weights for policy 0, policy_version 587600 (0.0024) [2024-06-24 04:45:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 9627336704. Throughput: 0: 42546.1. Samples: 9627487000. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:13,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 04:45:13,704][15349] Signal inference workers to stop experience collection... (142600 times) [2024-06-24 04:45:13,704][15349] Signal inference workers to resume experience collection... (142600 times) [2024-06-24 04:45:13,754][15401] InferenceWorker_p0-w0: stopping experience collection (142600 times) [2024-06-24 04:45:13,754][15401] InferenceWorker_p0-w0: resuming experience collection (142600 times) [2024-06-24 04:45:14,293][15401] Updated weights for policy 0, policy_version 587610 (0.0040) [2024-06-24 04:45:17,604][15401] Updated weights for policy 0, policy_version 587620 (0.0036) [2024-06-24 04:45:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 9627582464. Throughput: 0: 42497.0. Samples: 9627736980. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 04:45:21,885][15401] Updated weights for policy 0, policy_version 587630 (0.0038) [2024-06-24 04:45:23,390][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9627811840. Throughput: 0: 42668.5. Samples: 9627871600. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:23,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 04:45:25,643][15401] Updated weights for policy 0, policy_version 587640 (0.0030) [2024-06-24 04:45:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9627992064. Throughput: 0: 42804.4. Samples: 9628131440. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 04:45:29,514][15401] Updated weights for policy 0, policy_version 587650 (0.0028) [2024-06-24 04:45:33,273][15401] Updated weights for policy 0, policy_version 587660 (0.0043) [2024-06-24 04:45:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9628221440. Throughput: 0: 42746.7. Samples: 9628380960. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 04:45:37,076][15401] Updated weights for policy 0, policy_version 587670 (0.0030) [2024-06-24 04:45:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9628434432. Throughput: 0: 42742.0. Samples: 9628512820. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 04:45:40,791][15401] Updated weights for policy 0, policy_version 587680 (0.0029) [2024-06-24 04:45:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9628647424. Throughput: 0: 42673.3. Samples: 9628767480. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 04:45:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587686_9628647424.pth... [2024-06-24 04:45:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587063_9618440192.pth [2024-06-24 04:45:45,072][15401] Updated weights for policy 0, policy_version 587690 (0.0044) [2024-06-24 04:45:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.5, 300 sec: 42653.9). Total num frames: 9628860416. Throughput: 0: 42682.8. Samples: 9629017140. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 04:45:48,522][15401] Updated weights for policy 0, policy_version 587700 (0.0042) [2024-06-24 04:45:52,665][15401] Updated weights for policy 0, policy_version 587710 (0.0045) [2024-06-24 04:45:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9629073408. Throughput: 0: 42441.7. Samples: 9629145080. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 04:45:56,286][15401] Updated weights for policy 0, policy_version 587720 (0.0045) [2024-06-24 04:45:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9629253632. Throughput: 0: 42534.9. Samples: 9629401060. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:45:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 04:46:00,135][15401] Updated weights for policy 0, policy_version 587730 (0.0034) [2024-06-24 04:46:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9629499392. Throughput: 0: 42548.0. Samples: 9629651640. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:46:03,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-24 04:46:04,000][15401] Updated weights for policy 0, policy_version 587740 (0.0046) [2024-06-24 04:46:08,031][15401] Updated weights for policy 0, policy_version 587750 (0.0039) [2024-06-24 04:46:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9629712384. Throughput: 0: 42558.7. Samples: 9629786740. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:46:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 04:46:11,793][15401] Updated weights for policy 0, policy_version 587760 (0.0026) [2024-06-24 04:46:13,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 9629876224. Throughput: 0: 42317.7. Samples: 9630035740. Policy #0 lag: (min: 2.0, avg: 12.5, max: 26.0) [2024-06-24 04:46:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 04:46:15,504][15401] Updated weights for policy 0, policy_version 587770 (0.0032) [2024-06-24 04:46:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 9630121984. Throughput: 0: 42529.6. Samples: 9630294800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 04:46:19,437][15401] Updated weights for policy 0, policy_version 587780 (0.0031) [2024-06-24 04:46:20,862][15349] Signal inference workers to stop experience collection... (142650 times) [2024-06-24 04:46:20,886][15401] InferenceWorker_p0-w0: stopping experience collection (142650 times) [2024-06-24 04:46:20,925][15349] Signal inference workers to resume experience collection... (142650 times) [2024-06-24 04:46:20,925][15401] InferenceWorker_p0-w0: resuming experience collection (142650 times) [2024-06-24 04:46:22,892][15401] Updated weights for policy 0, policy_version 587790 (0.0027) [2024-06-24 04:46:23,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9630367744. Throughput: 0: 42528.8. Samples: 9630426620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 04:46:27,339][15401] Updated weights for policy 0, policy_version 587800 (0.0023) [2024-06-24 04:46:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 9630531584. Throughput: 0: 42590.6. Samples: 9630684060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 04:46:30,445][15401] Updated weights for policy 0, policy_version 587810 (0.0026) [2024-06-24 04:46:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9630777344. Throughput: 0: 42707.0. Samples: 9630938960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:33,399][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 04:46:34,896][15401] Updated weights for policy 0, policy_version 587820 (0.0026) [2024-06-24 04:46:38,109][15401] Updated weights for policy 0, policy_version 587830 (0.0034) [2024-06-24 04:46:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9631006720. Throughput: 0: 42770.2. Samples: 9631069740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 04:46:42,631][15401] Updated weights for policy 0, policy_version 587840 (0.0029) [2024-06-24 04:46:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9631186944. Throughput: 0: 42726.2. Samples: 9631323740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 04:46:45,739][15401] Updated weights for policy 0, policy_version 587850 (0.0029) [2024-06-24 04:46:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9631399936. Throughput: 0: 42791.1. Samples: 9631577240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:48,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 04:46:50,351][15401] Updated weights for policy 0, policy_version 587860 (0.0024) [2024-06-24 04:46:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9631645696. Throughput: 0: 42732.9. Samples: 9631709720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 04:46:53,546][15401] Updated weights for policy 0, policy_version 587870 (0.0033) [2024-06-24 04:46:57,992][15401] Updated weights for policy 0, policy_version 587880 (0.0036) [2024-06-24 04:46:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9631842304. Throughput: 0: 42893.3. Samples: 9631965940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:46:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 04:47:01,126][15401] Updated weights for policy 0, policy_version 587890 (0.0040) [2024-06-24 04:47:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9632055296. Throughput: 0: 42856.9. Samples: 9632223360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:03,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 04:47:05,528][15401] Updated weights for policy 0, policy_version 587900 (0.0032) [2024-06-24 04:47:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9632301056. Throughput: 0: 42990.2. Samples: 9632361180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 04:47:08,521][15401] Updated weights for policy 0, policy_version 587910 (0.0036) [2024-06-24 04:47:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 9632464896. Throughput: 0: 42949.5. Samples: 9632616780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 04:47:13,835][15401] Updated weights for policy 0, policy_version 587920 (0.0045) [2024-06-24 04:47:15,987][15401] Updated weights for policy 0, policy_version 587930 (0.0028) [2024-06-24 04:47:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9632710656. Throughput: 0: 42826.1. Samples: 9632866140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 04:47:21,382][15401] Updated weights for policy 0, policy_version 587940 (0.0036) [2024-06-24 04:47:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9632940032. Throughput: 0: 42959.7. Samples: 9633002920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 04:47:23,694][15401] Updated weights for policy 0, policy_version 587950 (0.0037) [2024-06-24 04:47:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9633103872. Throughput: 0: 42889.8. Samples: 9633253780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 04:47:28,841][15401] Updated weights for policy 0, policy_version 587960 (0.0036) [2024-06-24 04:47:31,236][15401] Updated weights for policy 0, policy_version 587970 (0.0030) [2024-06-24 04:47:33,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9633349632. Throughput: 0: 42949.6. Samples: 9633509980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 04:47:36,233][15401] Updated weights for policy 0, policy_version 587980 (0.0031) [2024-06-24 04:47:38,393][15132] Fps is (10 sec: 47496.8, 60 sec: 42869.0, 300 sec: 42709.0). Total num frames: 9633579008. Throughput: 0: 43040.3. Samples: 9633646680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:38,394][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 04:47:38,917][15401] Updated weights for policy 0, policy_version 587990 (0.0030) [2024-06-24 04:47:40,365][15349] Signal inference workers to stop experience collection... (142700 times) [2024-06-24 04:47:40,404][15401] InferenceWorker_p0-w0: stopping experience collection (142700 times) [2024-06-24 04:47:40,426][15349] Signal inference workers to resume experience collection... (142700 times) [2024-06-24 04:47:40,427][15401] InferenceWorker_p0-w0: resuming experience collection (142700 times) [2024-06-24 04:47:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9633742848. Throughput: 0: 42906.8. Samples: 9633896740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 04:47:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587998_9633759232.pth... [2024-06-24 04:47:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587376_9623568384.pth [2024-06-24 04:47:43,831][15401] Updated weights for policy 0, policy_version 588000 (0.0026) [2024-06-24 04:47:46,814][15401] Updated weights for policy 0, policy_version 588010 (0.0031) [2024-06-24 04:47:48,389][15132] Fps is (10 sec: 42613.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 9634004992. Throughput: 0: 42872.5. Samples: 9634152620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 04:47:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 04:47:51,432][15401] Updated weights for policy 0, policy_version 588020 (0.0042) [2024-06-24 04:47:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9634217984. Throughput: 0: 42884.6. Samples: 9634290980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:47:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 04:47:54,192][15401] Updated weights for policy 0, policy_version 588030 (0.0024) [2024-06-24 04:47:58,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9634381824. Throughput: 0: 42839.4. Samples: 9634544560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:47:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 04:47:59,051][15401] Updated weights for policy 0, policy_version 588040 (0.0046) [2024-06-24 04:48:01,877][15401] Updated weights for policy 0, policy_version 588050 (0.0034) [2024-06-24 04:48:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9634627584. Throughput: 0: 42989.4. Samples: 9634800660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 04:48:06,731][15401] Updated weights for policy 0, policy_version 588060 (0.0037) [2024-06-24 04:48:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9634840576. Throughput: 0: 42884.4. Samples: 9634932720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:08,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 04:48:09,471][15401] Updated weights for policy 0, policy_version 588070 (0.0033) [2024-06-24 04:48:13,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 9635037184. Throughput: 0: 42874.6. Samples: 9635183240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:13,393][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 04:48:14,635][15401] Updated weights for policy 0, policy_version 588080 (0.0028) [2024-06-24 04:48:17,462][15401] Updated weights for policy 0, policy_version 588090 (0.0033) [2024-06-24 04:48:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9635266560. Throughput: 0: 42862.0. Samples: 9635438760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:18,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 04:48:22,240][15401] Updated weights for policy 0, policy_version 588100 (0.0047) [2024-06-24 04:48:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9635479552. Throughput: 0: 42842.5. Samples: 9635574440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 04:48:25,290][15401] Updated weights for policy 0, policy_version 588110 (0.0033) [2024-06-24 04:48:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9635692544. Throughput: 0: 42832.7. Samples: 9635824220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 04:48:29,843][15401] Updated weights for policy 0, policy_version 588120 (0.0033) [2024-06-24 04:48:32,889][15401] Updated weights for policy 0, policy_version 588130 (0.0036) [2024-06-24 04:48:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 9635921920. Throughput: 0: 42779.0. Samples: 9636077680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 04:48:37,376][15401] Updated weights for policy 0, policy_version 588140 (0.0036) [2024-06-24 04:48:38,214][15349] Signal inference workers to stop experience collection... (142750 times) [2024-06-24 04:48:38,214][15349] Signal inference workers to resume experience collection... (142750 times) [2024-06-24 04:48:38,228][15401] InferenceWorker_p0-w0: stopping experience collection (142750 times) [2024-06-24 04:48:38,228][15401] InferenceWorker_p0-w0: resuming experience collection (142750 times) [2024-06-24 04:48:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.9, 300 sec: 42654.3). Total num frames: 9636134912. Throughput: 0: 42749.3. Samples: 9636214700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 04:48:40,874][15401] Updated weights for policy 0, policy_version 588150 (0.0032) [2024-06-24 04:48:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9636331520. Throughput: 0: 42826.2. Samples: 9636471740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 04:48:44,919][15401] Updated weights for policy 0, policy_version 588160 (0.0033) [2024-06-24 04:48:48,390][15401] Updated weights for policy 0, policy_version 588170 (0.0034) [2024-06-24 04:48:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 9636577280. Throughput: 0: 42756.8. Samples: 9636724820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:48,392][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 04:48:52,573][15401] Updated weights for policy 0, policy_version 588180 (0.0035) [2024-06-24 04:48:53,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 9636757504. Throughput: 0: 42755.6. Samples: 9636857000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:53,397][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 04:48:55,905][15401] Updated weights for policy 0, policy_version 588190 (0.0024) [2024-06-24 04:48:58,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9636986880. Throughput: 0: 42839.1. Samples: 9637110900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:48:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 04:49:00,011][15401] Updated weights for policy 0, policy_version 588200 (0.0028) [2024-06-24 04:49:03,338][15401] Updated weights for policy 0, policy_version 588210 (0.0026) [2024-06-24 04:49:03,390][15132] Fps is (10 sec: 47543.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9637232640. Throughput: 0: 42885.7. Samples: 9637368620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:49:03,391][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 04:49:07,997][15401] Updated weights for policy 0, policy_version 588220 (0.0031) [2024-06-24 04:49:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9637412864. Throughput: 0: 42808.8. Samples: 9637500840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:49:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 04:49:10,957][15401] Updated weights for policy 0, policy_version 588230 (0.0026) [2024-06-24 04:49:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 9637642240. Throughput: 0: 42956.6. Samples: 9637757260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:49:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 04:49:15,696][15401] Updated weights for policy 0, policy_version 588240 (0.0036) [2024-06-24 04:49:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 9637871616. Throughput: 0: 42870.2. Samples: 9638006840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:49:18,399][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 04:49:18,537][15401] Updated weights for policy 0, policy_version 588250 (0.0031) [2024-06-24 04:49:23,235][15401] Updated weights for policy 0, policy_version 588260 (0.0034) [2024-06-24 04:49:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9638051840. Throughput: 0: 42929.0. Samples: 9638146500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 04:49:26,224][15401] Updated weights for policy 0, policy_version 588270 (0.0026) [2024-06-24 04:49:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 9638281216. Throughput: 0: 42859.3. Samples: 9638400400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 04:49:31,006][15401] Updated weights for policy 0, policy_version 588280 (0.0047) [2024-06-24 04:49:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9638510592. Throughput: 0: 42784.1. Samples: 9638650000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 04:49:34,152][15401] Updated weights for policy 0, policy_version 588290 (0.0038) [2024-06-24 04:49:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9638674432. Throughput: 0: 42762.6. Samples: 9638781040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 04:49:38,686][15401] Updated weights for policy 0, policy_version 588300 (0.0033) [2024-06-24 04:49:41,623][15349] Signal inference workers to stop experience collection... (142800 times) [2024-06-24 04:49:41,625][15349] Signal inference workers to resume experience collection... (142800 times) [2024-06-24 04:49:41,669][15401] InferenceWorker_p0-w0: stopping experience collection (142800 times) [2024-06-24 04:49:41,669][15401] InferenceWorker_p0-w0: resuming experience collection (142800 times) [2024-06-24 04:49:41,766][15401] Updated weights for policy 0, policy_version 588310 (0.0036) [2024-06-24 04:49:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9638903808. Throughput: 0: 42778.2. Samples: 9639035920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:43,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 04:49:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588312_9638903808.pth... [2024-06-24 04:49:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587686_9628647424.pth [2024-06-24 04:49:46,063][15401] Updated weights for policy 0, policy_version 588320 (0.0038) [2024-06-24 04:49:48,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 9639133184. Throughput: 0: 42623.0. Samples: 9639286660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 04:49:49,547][15401] Updated weights for policy 0, policy_version 588330 (0.0041) [2024-06-24 04:49:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 9639313408. Throughput: 0: 42671.9. Samples: 9639421080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 04:49:53,941][15401] Updated weights for policy 0, policy_version 588340 (0.0040) [2024-06-24 04:49:57,602][15401] Updated weights for policy 0, policy_version 588350 (0.0032) [2024-06-24 04:49:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9639542784. Throughput: 0: 42577.8. Samples: 9639673260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:49:58,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 04:50:01,482][15401] Updated weights for policy 0, policy_version 588360 (0.0041) [2024-06-24 04:50:03,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9639788544. Throughput: 0: 42748.1. Samples: 9639930500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 04:50:05,091][15401] Updated weights for policy 0, policy_version 588370 (0.0035) [2024-06-24 04:50:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9639968768. Throughput: 0: 42757.3. Samples: 9640070580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 04:50:09,126][15401] Updated weights for policy 0, policy_version 588380 (0.0029) [2024-06-24 04:50:12,475][15401] Updated weights for policy 0, policy_version 588390 (0.0048) [2024-06-24 04:50:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9640198144. Throughput: 0: 42641.2. Samples: 9640319260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 04:50:16,661][15401] Updated weights for policy 0, policy_version 588400 (0.0028) [2024-06-24 04:50:18,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9640443904. Throughput: 0: 42860.0. Samples: 9640578700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 04:50:20,199][15401] Updated weights for policy 0, policy_version 588410 (0.0033) [2024-06-24 04:50:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9640624128. Throughput: 0: 42997.6. Samples: 9640715940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 04:50:23,984][15401] Updated weights for policy 0, policy_version 588420 (0.0042) [2024-06-24 04:50:28,246][15401] Updated weights for policy 0, policy_version 588430 (0.0034) [2024-06-24 04:50:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9640837120. Throughput: 0: 42968.0. Samples: 9640969480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:28,394][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 04:50:31,943][15401] Updated weights for policy 0, policy_version 588440 (0.0046) [2024-06-24 04:50:33,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9641099264. Throughput: 0: 43061.1. Samples: 9641224400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 04:50:35,670][15401] Updated weights for policy 0, policy_version 588450 (0.0033) [2024-06-24 04:50:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9641263104. Throughput: 0: 43082.3. Samples: 9641359780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 04:50:39,382][15401] Updated weights for policy 0, policy_version 588460 (0.0032) [2024-06-24 04:50:43,034][15401] Updated weights for policy 0, policy_version 588470 (0.0029) [2024-06-24 04:50:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9641492480. Throughput: 0: 43135.0. Samples: 9641614340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 04:50:46,922][15401] Updated weights for policy 0, policy_version 588480 (0.0029) [2024-06-24 04:50:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9641721856. Throughput: 0: 43134.7. Samples: 9641871560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 04:50:50,767][15401] Updated weights for policy 0, policy_version 588490 (0.0026) [2024-06-24 04:50:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9641902080. Throughput: 0: 42916.0. Samples: 9642001800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-24 04:50:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 04:50:54,509][15401] Updated weights for policy 0, policy_version 588500 (0.0045) [2024-06-24 04:50:56,567][15349] Signal inference workers to stop experience collection... (142850 times) [2024-06-24 04:50:56,584][15401] InferenceWorker_p0-w0: stopping experience collection (142850 times) [2024-06-24 04:50:56,678][15349] Signal inference workers to resume experience collection... (142850 times) [2024-06-24 04:50:56,678][15401] InferenceWorker_p0-w0: resuming experience collection (142850 times) [2024-06-24 04:50:58,228][15401] Updated weights for policy 0, policy_version 588510 (0.0031) [2024-06-24 04:50:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9642147840. Throughput: 0: 42929.8. Samples: 9642251100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:50:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 04:51:02,105][15401] Updated weights for policy 0, policy_version 588520 (0.0033) [2024-06-24 04:51:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9642344448. Throughput: 0: 43044.4. Samples: 9642515700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 04:51:05,859][15401] Updated weights for policy 0, policy_version 588530 (0.0042) [2024-06-24 04:51:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9642557440. Throughput: 0: 42710.3. Samples: 9642637900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 04:51:09,708][15401] Updated weights for policy 0, policy_version 588540 (0.0027) [2024-06-24 04:51:13,344][15401] Updated weights for policy 0, policy_version 588550 (0.0040) [2024-06-24 04:51:13,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43412.9, 300 sec: 42986.2). Total num frames: 9642803200. Throughput: 0: 42827.2. Samples: 9642896980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:13,397][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 04:51:17,321][15401] Updated weights for policy 0, policy_version 588560 (0.0038) [2024-06-24 04:51:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9642983424. Throughput: 0: 42834.9. Samples: 9643151980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 04:51:21,384][15401] Updated weights for policy 0, policy_version 588570 (0.0033) [2024-06-24 04:51:23,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9643196416. Throughput: 0: 42676.9. Samples: 9643280240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 04:51:25,150][15401] Updated weights for policy 0, policy_version 588580 (0.0028) [2024-06-24 04:51:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9643425792. Throughput: 0: 42713.8. Samples: 9643536460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 04:51:29,294][15401] Updated weights for policy 0, policy_version 588590 (0.0033) [2024-06-24 04:51:33,017][15401] Updated weights for policy 0, policy_version 588600 (0.0037) [2024-06-24 04:51:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 9643638784. Throughput: 0: 42745.2. Samples: 9643795100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 04:51:36,907][15401] Updated weights for policy 0, policy_version 588610 (0.0033) [2024-06-24 04:51:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9643835392. Throughput: 0: 42733.7. Samples: 9643924820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 04:51:40,416][15401] Updated weights for policy 0, policy_version 588620 (0.0033) [2024-06-24 04:51:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9644081152. Throughput: 0: 42990.2. Samples: 9644185660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 04:51:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588628_9644081152.pth... [2024-06-24 04:51:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000587998_9633759232.pth [2024-06-24 04:51:44,518][15401] Updated weights for policy 0, policy_version 588630 (0.0032) [2024-06-24 04:51:47,816][15401] Updated weights for policy 0, policy_version 588640 (0.0027) [2024-06-24 04:51:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9644294144. Throughput: 0: 42803.6. Samples: 9644441860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:48,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 04:51:52,026][15401] Updated weights for policy 0, policy_version 588650 (0.0029) [2024-06-24 04:51:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9644490752. Throughput: 0: 42884.7. Samples: 9644567720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:53,396][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 04:51:55,307][15401] Updated weights for policy 0, policy_version 588660 (0.0027) [2024-06-24 04:51:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9644720128. Throughput: 0: 43028.8. Samples: 9644833000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:51:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 04:52:00,103][15401] Updated weights for policy 0, policy_version 588670 (0.0045) [2024-06-24 04:52:02,890][15401] Updated weights for policy 0, policy_version 588680 (0.0026) [2024-06-24 04:52:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9644949504. Throughput: 0: 43002.2. Samples: 9645087080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 04:52:07,528][15401] Updated weights for policy 0, policy_version 588690 (0.0026) [2024-06-24 04:52:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9645113344. Throughput: 0: 43073.7. Samples: 9645218560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 04:52:10,479][15401] Updated weights for policy 0, policy_version 588700 (0.0034) [2024-06-24 04:52:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 9645359104. Throughput: 0: 43081.7. Samples: 9645475140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:13,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 04:52:15,224][15401] Updated weights for policy 0, policy_version 588710 (0.0027) [2024-06-24 04:52:17,117][15349] Signal inference workers to stop experience collection... (142900 times) [2024-06-24 04:52:17,117][15349] Signal inference workers to resume experience collection... (142900 times) [2024-06-24 04:52:17,155][15401] InferenceWorker_p0-w0: stopping experience collection (142900 times) [2024-06-24 04:52:17,156][15401] InferenceWorker_p0-w0: resuming experience collection (142900 times) [2024-06-24 04:52:18,241][15401] Updated weights for policy 0, policy_version 588720 (0.0034) [2024-06-24 04:52:18,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 9645588480. Throughput: 0: 43035.6. Samples: 9645731700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 04:52:22,938][15401] Updated weights for policy 0, policy_version 588730 (0.0031) [2024-06-24 04:52:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9645752320. Throughput: 0: 43024.1. Samples: 9645860900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 04:52:25,842][15401] Updated weights for policy 0, policy_version 588740 (0.0038) [2024-06-24 04:52:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9646014464. Throughput: 0: 42934.1. Samples: 9646117700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 04:52:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 04:52:30,556][15401] Updated weights for policy 0, policy_version 588750 (0.0043) [2024-06-24 04:52:33,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42876.6). Total num frames: 9646227456. Throughput: 0: 42943.1. Samples: 9646374300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 04:52:33,440][15401] Updated weights for policy 0, policy_version 588760 (0.0032) [2024-06-24 04:52:38,147][15401] Updated weights for policy 0, policy_version 588770 (0.0033) [2024-06-24 04:52:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9646407680. Throughput: 0: 43016.6. Samples: 9646503460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 04:52:41,005][15401] Updated weights for policy 0, policy_version 588780 (0.0038) [2024-06-24 04:52:43,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 9646653440. Throughput: 0: 42850.0. Samples: 9646761520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:43,396][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 04:52:45,741][15401] Updated weights for policy 0, policy_version 588790 (0.0041) [2024-06-24 04:52:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9646866432. Throughput: 0: 42907.2. Samples: 9647017900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 04:52:48,676][15401] Updated weights for policy 0, policy_version 588800 (0.0041) [2024-06-24 04:52:53,385][15401] Updated weights for policy 0, policy_version 588810 (0.0034) [2024-06-24 04:52:53,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 9647063040. Throughput: 0: 42811.6. Samples: 9647145080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 04:52:56,234][15401] Updated weights for policy 0, policy_version 588820 (0.0039) [2024-06-24 04:52:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9647308800. Throughput: 0: 42758.8. Samples: 9647399280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:52:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 04:53:01,083][15401] Updated weights for policy 0, policy_version 588830 (0.0026) [2024-06-24 04:53:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 9647505408. Throughput: 0: 42873.9. Samples: 9647661020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:03,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 04:53:03,743][15401] Updated weights for policy 0, policy_version 588840 (0.0029) [2024-06-24 04:53:08,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 9647685632. Throughput: 0: 42702.1. Samples: 9647782500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 04:53:08,694][15401] Updated weights for policy 0, policy_version 588850 (0.0050) [2024-06-24 04:53:11,475][15401] Updated weights for policy 0, policy_version 588860 (0.0025) [2024-06-24 04:53:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9647931392. Throughput: 0: 42589.5. Samples: 9648034220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 04:53:16,934][15401] Updated weights for policy 0, policy_version 588870 (0.0036) [2024-06-24 04:53:18,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9648128000. Throughput: 0: 42731.1. Samples: 9648297200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 04:53:19,444][15401] Updated weights for policy 0, policy_version 588880 (0.0044) [2024-06-24 04:53:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 9648324608. Throughput: 0: 42451.5. Samples: 9648413780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 04:53:24,479][15401] Updated weights for policy 0, policy_version 588890 (0.0032) [2024-06-24 04:53:27,233][15401] Updated weights for policy 0, policy_version 588900 (0.0041) [2024-06-24 04:53:27,241][15349] Signal inference workers to stop experience collection... (142950 times) [2024-06-24 04:53:27,242][15349] Signal inference workers to resume experience collection... (142950 times) [2024-06-24 04:53:27,288][15401] InferenceWorker_p0-w0: stopping experience collection (142950 times) [2024-06-24 04:53:27,288][15401] InferenceWorker_p0-w0: resuming experience collection (142950 times) [2024-06-24 04:53:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9648570368. Throughput: 0: 42432.3. Samples: 9648670700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 04:53:32,042][15401] Updated weights for policy 0, policy_version 588910 (0.0035) [2024-06-24 04:53:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9648766976. Throughput: 0: 42707.5. Samples: 9648939740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:33,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 04:53:34,715][15401] Updated weights for policy 0, policy_version 588920 (0.0028) [2024-06-24 04:53:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9648963584. Throughput: 0: 42489.7. Samples: 9649057120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 04:53:40,205][15401] Updated weights for policy 0, policy_version 588930 (0.0031) [2024-06-24 04:53:42,250][15401] Updated weights for policy 0, policy_version 588940 (0.0036) [2024-06-24 04:53:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42876.0, 300 sec: 42876.4). Total num frames: 9649225728. Throughput: 0: 42441.7. Samples: 9649309160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 04:53:43,436][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588943_9649242112.pth... [2024-06-24 04:53:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588312_9638903808.pth [2024-06-24 04:53:48,130][15401] Updated weights for policy 0, policy_version 588950 (0.0032) [2024-06-24 04:53:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42766.0). Total num frames: 9649373184. Throughput: 0: 42727.1. Samples: 9649583740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 04:53:49,947][15401] Updated weights for policy 0, policy_version 588960 (0.0040) [2024-06-24 04:53:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9649602560. Throughput: 0: 42377.9. Samples: 9649689500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 04:53:55,606][15401] Updated weights for policy 0, policy_version 588970 (0.0029) [2024-06-24 04:53:57,396][15401] Updated weights for policy 0, policy_version 588980 (0.0053) [2024-06-24 04:53:58,389][15132] Fps is (10 sec: 49151.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9649864704. Throughput: 0: 42656.9. Samples: 9649953780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 04:53:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 04:54:03,119][15401] Updated weights for policy 0, policy_version 588990 (0.0035) [2024-06-24 04:54:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 9650028544. Throughput: 0: 42844.5. Samples: 9650225200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 04:54:05,005][15401] Updated weights for policy 0, policy_version 589000 (0.0035) [2024-06-24 04:54:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9650257920. Throughput: 0: 42610.4. Samples: 9650331240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:08,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-24 04:54:10,815][15401] Updated weights for policy 0, policy_version 589010 (0.0030) [2024-06-24 04:54:12,662][15401] Updated weights for policy 0, policy_version 589020 (0.0033) [2024-06-24 04:54:13,346][15349] Signal inference workers to stop experience collection... (143000 times) [2024-06-24 04:54:13,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9650520064. Throughput: 0: 42796.1. Samples: 9650596520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 04:54:13,392][15401] InferenceWorker_p0-w0: stopping experience collection (143000 times) [2024-06-24 04:54:13,398][15349] Signal inference workers to resume experience collection... (143000 times) [2024-06-24 04:54:13,406][15401] InferenceWorker_p0-w0: resuming experience collection (143000 times) [2024-06-24 04:54:18,384][15401] Updated weights for policy 0, policy_version 589030 (0.0040) [2024-06-24 04:54:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9650667520. Throughput: 0: 42876.0. Samples: 9650869160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 04:54:20,312][15401] Updated weights for policy 0, policy_version 589040 (0.0039) [2024-06-24 04:54:23,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9650896896. Throughput: 0: 42618.8. Samples: 9650974960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 04:54:25,997][15401] Updated weights for policy 0, policy_version 589050 (0.0031) [2024-06-24 04:54:27,901][15401] Updated weights for policy 0, policy_version 589060 (0.0033) [2024-06-24 04:54:28,390][15132] Fps is (10 sec: 50790.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9651175424. Throughput: 0: 42903.5. Samples: 9651239820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 04:54:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9651306496. Throughput: 0: 42905.8. Samples: 9651514500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 04:54:33,462][15401] Updated weights for policy 0, policy_version 589070 (0.0042) [2024-06-24 04:54:35,598][15401] Updated weights for policy 0, policy_version 589080 (0.0045) [2024-06-24 04:54:38,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9651535872. Throughput: 0: 42874.6. Samples: 9651618860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 04:54:41,238][15401] Updated weights for policy 0, policy_version 589090 (0.0029) [2024-06-24 04:54:43,219][15401] Updated weights for policy 0, policy_version 589100 (0.0033) [2024-06-24 04:54:43,392][15132] Fps is (10 sec: 50778.8, 60 sec: 43143.0, 300 sec: 42986.9). Total num frames: 9651814400. Throughput: 0: 42927.6. Samples: 9651885620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:43,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 04:54:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9651929088. Throughput: 0: 42948.9. Samples: 9652157900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 04:54:48,838][15401] Updated weights for policy 0, policy_version 589110 (0.0038) [2024-06-24 04:54:50,798][15401] Updated weights for policy 0, policy_version 589120 (0.0030) [2024-06-24 04:54:53,390][15132] Fps is (10 sec: 37691.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9652191232. Throughput: 0: 42861.7. Samples: 9652260020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 04:54:56,681][15401] Updated weights for policy 0, policy_version 589130 (0.0028) [2024-06-24 04:54:57,370][15349] Signal inference workers to stop experience collection... (143050 times) [2024-06-24 04:54:57,421][15401] InferenceWorker_p0-w0: stopping experience collection (143050 times) [2024-06-24 04:54:57,429][15349] Signal inference workers to resume experience collection... (143050 times) [2024-06-24 04:54:57,439][15401] InferenceWorker_p0-w0: resuming experience collection (143050 times) [2024-06-24 04:54:58,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9652420608. Throughput: 0: 42934.6. Samples: 9652528580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:54:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 04:54:59,019][15401] Updated weights for policy 0, policy_version 589140 (0.0027) [2024-06-24 04:55:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9652568064. Throughput: 0: 42742.6. Samples: 9652792580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 04:55:04,413][15401] Updated weights for policy 0, policy_version 589150 (0.0048) [2024-06-24 04:55:06,666][15401] Updated weights for policy 0, policy_version 589160 (0.0034) [2024-06-24 04:55:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9652846592. Throughput: 0: 42859.6. Samples: 9652903640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:08,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 04:55:11,973][15401] Updated weights for policy 0, policy_version 589170 (0.0036) [2024-06-24 04:55:13,389][15132] Fps is (10 sec: 49152.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9653059584. Throughput: 0: 42890.8. Samples: 9653169900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 04:55:14,255][15401] Updated weights for policy 0, policy_version 589180 (0.0034) [2024-06-24 04:55:18,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9653223424. Throughput: 0: 42510.2. Samples: 9653427460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 04:55:19,641][15401] Updated weights for policy 0, policy_version 589190 (0.0038) [2024-06-24 04:55:22,046][15401] Updated weights for policy 0, policy_version 589200 (0.0035) [2024-06-24 04:55:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9653469184. Throughput: 0: 42746.2. Samples: 9653542440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 04:55:27,283][15401] Updated weights for policy 0, policy_version 589210 (0.0034) [2024-06-24 04:55:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9653698560. Throughput: 0: 42763.9. Samples: 9653809900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 04:55:29,633][15401] Updated weights for policy 0, policy_version 589220 (0.0034) [2024-06-24 04:55:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9653862400. Throughput: 0: 42418.1. Samples: 9654066720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 04:55:33,399][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 04:55:34,745][15401] Updated weights for policy 0, policy_version 589230 (0.0028) [2024-06-24 04:55:37,269][15401] Updated weights for policy 0, policy_version 589240 (0.0026) [2024-06-24 04:55:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.9, 300 sec: 42875.8). Total num frames: 9654140928. Throughput: 0: 42860.8. Samples: 9654188860. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:55:38,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 04:55:42,197][15401] Updated weights for policy 0, policy_version 589250 (0.0035) [2024-06-24 04:55:43,389][15132] Fps is (10 sec: 45876.0, 60 sec: 41780.8, 300 sec: 42709.5). Total num frames: 9654321152. Throughput: 0: 42843.6. Samples: 9654456540. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:55:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 04:55:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589254_9654337536.pth... [2024-06-24 04:55:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588628_9644081152.pth [2024-06-24 04:55:44,863][15401] Updated weights for policy 0, policy_version 589260 (0.0043) [2024-06-24 04:55:48,390][15132] Fps is (10 sec: 37691.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9654517760. Throughput: 0: 42587.9. Samples: 9654709040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:55:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 04:55:50,266][15401] Updated weights for policy 0, policy_version 589270 (0.0042) [2024-06-24 04:55:51,192][15349] Signal inference workers to stop experience collection... (143100 times) [2024-06-24 04:55:51,196][15349] Signal inference workers to resume experience collection... (143100 times) [2024-06-24 04:55:51,246][15401] InferenceWorker_p0-w0: stopping experience collection (143100 times) [2024-06-24 04:55:51,247][15401] InferenceWorker_p0-w0: resuming experience collection (143100 times) [2024-06-24 04:55:52,594][15401] Updated weights for policy 0, policy_version 589280 (0.0030) [2024-06-24 04:55:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9654779904. Throughput: 0: 42885.2. Samples: 9654833480. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:55:53,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 04:55:57,670][15401] Updated weights for policy 0, policy_version 589290 (0.0034) [2024-06-24 04:55:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 9654943744. Throughput: 0: 42807.9. Samples: 9655096260. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:55:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 04:56:00,212][15401] Updated weights for policy 0, policy_version 589300 (0.0034) [2024-06-24 04:56:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9655173120. Throughput: 0: 42650.2. Samples: 9655346720. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 04:56:05,279][15401] Updated weights for policy 0, policy_version 589310 (0.0034) [2024-06-24 04:56:08,021][15401] Updated weights for policy 0, policy_version 589320 (0.0035) [2024-06-24 04:56:08,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42766.0). Total num frames: 9655418880. Throughput: 0: 43084.9. Samples: 9655481260. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 04:56:12,739][15401] Updated weights for policy 0, policy_version 589330 (0.0045) [2024-06-24 04:56:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 9655582720. Throughput: 0: 42883.5. Samples: 9655739660. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 04:56:15,612][15401] Updated weights for policy 0, policy_version 589340 (0.0025) [2024-06-24 04:56:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9655828480. Throughput: 0: 42597.4. Samples: 9655983600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 04:56:20,257][15401] Updated weights for policy 0, policy_version 589350 (0.0027) [2024-06-24 04:56:23,233][15401] Updated weights for policy 0, policy_version 589360 (0.0031) [2024-06-24 04:56:23,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9656074240. Throughput: 0: 42955.6. Samples: 9656121760. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 04:56:28,181][15401] Updated weights for policy 0, policy_version 589370 (0.0046) [2024-06-24 04:56:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9656238080. Throughput: 0: 42565.2. Samples: 9656371980. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 04:56:31,447][15401] Updated weights for policy 0, policy_version 589380 (0.0028) [2024-06-24 04:56:33,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9656451072. Throughput: 0: 42532.1. Samples: 9656622980. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 04:56:35,994][15401] Updated weights for policy 0, policy_version 589390 (0.0025) [2024-06-24 04:56:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 9656664064. Throughput: 0: 42693.9. Samples: 9656754700. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 04:56:39,041][15401] Updated weights for policy 0, policy_version 589400 (0.0025) [2024-06-24 04:56:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9656877056. Throughput: 0: 42506.9. Samples: 9657009060. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 04:56:43,474][15401] Updated weights for policy 0, policy_version 589410 (0.0031) [2024-06-24 04:56:46,765][15401] Updated weights for policy 0, policy_version 589420 (0.0032) [2024-06-24 04:56:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9657073664. Throughput: 0: 42662.2. Samples: 9657266520. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 04:56:51,124][15401] Updated weights for policy 0, policy_version 589430 (0.0034) [2024-06-24 04:56:52,236][15349] Signal inference workers to stop experience collection... (143150 times) [2024-06-24 04:56:52,237][15349] Signal inference workers to resume experience collection... (143150 times) [2024-06-24 04:56:52,279][15401] InferenceWorker_p0-w0: stopping experience collection (143150 times) [2024-06-24 04:56:52,279][15401] InferenceWorker_p0-w0: resuming experience collection (143150 times) [2024-06-24 04:56:53,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9657319424. Throughput: 0: 42494.5. Samples: 9657393520. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 04:56:54,828][15401] Updated weights for policy 0, policy_version 589440 (0.0027) [2024-06-24 04:56:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9657516032. Throughput: 0: 42429.0. Samples: 9657648960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:56:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 04:56:58,595][15401] Updated weights for policy 0, policy_version 589450 (0.0028) [2024-06-24 04:57:02,407][15401] Updated weights for policy 0, policy_version 589460 (0.0029) [2024-06-24 04:57:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9657729024. Throughput: 0: 42699.6. Samples: 9657905080. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-06-24 04:57:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 04:57:06,027][15401] Updated weights for policy 0, policy_version 589470 (0.0029) [2024-06-24 04:57:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9657958400. Throughput: 0: 42435.9. Samples: 9658031380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:08,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 04:57:10,014][15401] Updated weights for policy 0, policy_version 589480 (0.0042) [2024-06-24 04:57:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 9658187776. Throughput: 0: 42639.2. Samples: 9658290740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:13,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 04:57:13,618][15401] Updated weights for policy 0, policy_version 589490 (0.0023) [2024-06-24 04:57:17,829][15401] Updated weights for policy 0, policy_version 589500 (0.0028) [2024-06-24 04:57:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9658384384. Throughput: 0: 42637.3. Samples: 9658541660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 04:57:21,414][15401] Updated weights for policy 0, policy_version 589510 (0.0034) [2024-06-24 04:57:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9658580992. Throughput: 0: 42522.1. Samples: 9658668200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 04:57:25,627][15401] Updated weights for policy 0, policy_version 589520 (0.0048) [2024-06-24 04:57:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9658793984. Throughput: 0: 42532.7. Samples: 9658923040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 04:57:29,386][15401] Updated weights for policy 0, policy_version 589530 (0.0031) [2024-06-24 04:57:33,342][15401] Updated weights for policy 0, policy_version 589540 (0.0037) [2024-06-24 04:57:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9659023360. Throughput: 0: 42491.1. Samples: 9659178620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 04:57:37,082][15401] Updated weights for policy 0, policy_version 589550 (0.0045) [2024-06-24 04:57:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 9659236352. Throughput: 0: 42524.6. Samples: 9659307120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 04:57:41,012][15401] Updated weights for policy 0, policy_version 589560 (0.0041) [2024-06-24 04:57:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 9659449344. Throughput: 0: 42560.3. Samples: 9659564180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 04:57:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589566_9659449344.pth... [2024-06-24 04:57:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000588943_9649242112.pth [2024-06-24 04:57:44,650][15401] Updated weights for policy 0, policy_version 589570 (0.0035) [2024-06-24 04:57:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9659645952. Throughput: 0: 42652.0. Samples: 9659824420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:48,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 04:57:48,572][15401] Updated weights for policy 0, policy_version 589580 (0.0038) [2024-06-24 04:57:52,361][15401] Updated weights for policy 0, policy_version 589590 (0.0036) [2024-06-24 04:57:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9659875328. Throughput: 0: 42715.9. Samples: 9659953600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 04:57:56,479][15401] Updated weights for policy 0, policy_version 589600 (0.0029) [2024-06-24 04:57:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9660088320. Throughput: 0: 42604.4. Samples: 9660207940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:57:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 04:58:00,008][15401] Updated weights for policy 0, policy_version 589610 (0.0033) [2024-06-24 04:58:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9660284928. Throughput: 0: 42715.5. Samples: 9660463860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 04:58:04,437][15401] Updated weights for policy 0, policy_version 589620 (0.0039) [2024-06-24 04:58:07,534][15401] Updated weights for policy 0, policy_version 589630 (0.0042) [2024-06-24 04:58:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9660514304. Throughput: 0: 42668.0. Samples: 9660588260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 04:58:11,875][15401] Updated weights for policy 0, policy_version 589640 (0.0034) [2024-06-24 04:58:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9660727296. Throughput: 0: 42778.6. Samples: 9660848080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:13,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-24 04:58:15,513][15401] Updated weights for policy 0, policy_version 589650 (0.0039) [2024-06-24 04:58:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9660923904. Throughput: 0: 42914.2. Samples: 9661109760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 04:58:19,690][15401] Updated weights for policy 0, policy_version 589660 (0.0028) [2024-06-24 04:58:23,125][15401] Updated weights for policy 0, policy_version 589670 (0.0029) [2024-06-24 04:58:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9661153280. Throughput: 0: 42852.4. Samples: 9661235480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 04:58:27,062][15349] Signal inference workers to stop experience collection... (143200 times) [2024-06-24 04:58:27,062][15349] Signal inference workers to resume experience collection... (143200 times) [2024-06-24 04:58:27,099][15401] InferenceWorker_p0-w0: stopping experience collection (143200 times) [2024-06-24 04:58:27,099][15401] InferenceWorker_p0-w0: resuming experience collection (143200 times) [2024-06-24 04:58:27,214][15401] Updated weights for policy 0, policy_version 589680 (0.0038) [2024-06-24 04:58:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9661366272. Throughput: 0: 42941.4. Samples: 9661496540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 04:58:30,655][15401] Updated weights for policy 0, policy_version 589690 (0.0037) [2024-06-24 04:58:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9661579264. Throughput: 0: 42900.0. Samples: 9661754920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 04:58:34,810][15401] Updated weights for policy 0, policy_version 589700 (0.0028) [2024-06-24 04:58:38,167][15401] Updated weights for policy 0, policy_version 589710 (0.0033) [2024-06-24 04:58:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9661808640. Throughput: 0: 42726.3. Samples: 9661876280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 04:58:38,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 04:58:42,269][15401] Updated weights for policy 0, policy_version 589720 (0.0039) [2024-06-24 04:58:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9662021632. Throughput: 0: 42905.2. Samples: 9662138680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:58:43,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 04:58:45,679][15401] Updated weights for policy 0, policy_version 589730 (0.0027) [2024-06-24 04:58:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9662218240. Throughput: 0: 42880.0. Samples: 9662393460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:58:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 04:58:50,149][15401] Updated weights for policy 0, policy_version 589740 (0.0037) [2024-06-24 04:58:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9662431232. Throughput: 0: 42863.1. Samples: 9662517100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:58:53,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 04:58:53,737][15401] Updated weights for policy 0, policy_version 589750 (0.0043) [2024-06-24 04:58:57,646][15401] Updated weights for policy 0, policy_version 589760 (0.0045) [2024-06-24 04:58:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9662644224. Throughput: 0: 42973.5. Samples: 9662781880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:58:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 04:59:01,273][15401] Updated weights for policy 0, policy_version 589770 (0.0037) [2024-06-24 04:59:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9662857216. Throughput: 0: 42791.6. Samples: 9663035380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 04:59:05,126][15401] Updated weights for policy 0, policy_version 589780 (0.0033) [2024-06-24 04:59:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9663086592. Throughput: 0: 42893.3. Samples: 9663165680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:08,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 04:59:08,761][15401] Updated weights for policy 0, policy_version 589790 (0.0036) [2024-06-24 04:59:12,712][15401] Updated weights for policy 0, policy_version 589800 (0.0039) [2024-06-24 04:59:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9663299584. Throughput: 0: 42766.7. Samples: 9663421040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 04:59:16,764][15401] Updated weights for policy 0, policy_version 589810 (0.0031) [2024-06-24 04:59:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9663496192. Throughput: 0: 42753.7. Samples: 9663678840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 04:59:20,489][15401] Updated weights for policy 0, policy_version 589820 (0.0032) [2024-06-24 04:59:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 9663741952. Throughput: 0: 42871.7. Samples: 9663805500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 04:59:24,283][15401] Updated weights for policy 0, policy_version 589830 (0.0040) [2024-06-24 04:59:28,367][15401] Updated weights for policy 0, policy_version 589840 (0.0029) [2024-06-24 04:59:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9663938560. Throughput: 0: 42704.5. Samples: 9664060380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 04:59:31,661][15401] Updated weights for policy 0, policy_version 589850 (0.0023) [2024-06-24 04:59:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 9664118784. Throughput: 0: 42999.6. Samples: 9664328440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 04:59:35,791][15401] Updated weights for policy 0, policy_version 589860 (0.0033) [2024-06-24 04:59:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 9664397312. Throughput: 0: 43003.4. Samples: 9664452260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 04:59:39,463][15401] Updated weights for policy 0, policy_version 589870 (0.0027) [2024-06-24 04:59:43,195][15401] Updated weights for policy 0, policy_version 589880 (0.0026) [2024-06-24 04:59:43,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9664593920. Throughput: 0: 42863.4. Samples: 9664710740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 04:59:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589880_9664593920.pth... [2024-06-24 04:59:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589254_9654337536.pth [2024-06-24 04:59:47,011][15401] Updated weights for policy 0, policy_version 589890 (0.0024) [2024-06-24 04:59:48,392][15132] Fps is (10 sec: 37674.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 9664774144. Throughput: 0: 43012.3. Samples: 9664971040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:48,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 04:59:51,138][15401] Updated weights for policy 0, policy_version 589900 (0.0037) [2024-06-24 04:59:53,393][15132] Fps is (10 sec: 42583.2, 60 sec: 43141.8, 300 sec: 42708.9). Total num frames: 9665019904. Throughput: 0: 42917.4. Samples: 9665097120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:53,394][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 04:59:54,321][15349] Signal inference workers to stop experience collection... (143250 times) [2024-06-24 04:59:54,322][15349] Signal inference workers to resume experience collection... (143250 times) [2024-06-24 04:59:54,365][15401] InferenceWorker_p0-w0: stopping experience collection (143250 times) [2024-06-24 04:59:54,365][15401] InferenceWorker_p0-w0: resuming experience collection (143250 times) [2024-06-24 04:59:54,463][15401] Updated weights for policy 0, policy_version 589910 (0.0043) [2024-06-24 04:59:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9665216512. Throughput: 0: 43024.5. Samples: 9665357140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 04:59:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 04:59:58,584][15401] Updated weights for policy 0, policy_version 589920 (0.0035) [2024-06-24 05:00:01,872][15401] Updated weights for policy 0, policy_version 589930 (0.0027) [2024-06-24 05:00:03,390][15132] Fps is (10 sec: 40974.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 9665429504. Throughput: 0: 43075.4. Samples: 9665617240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 05:00:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 05:00:06,215][15401] Updated weights for policy 0, policy_version 589940 (0.0039) [2024-06-24 05:00:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9665658880. Throughput: 0: 43123.4. Samples: 9665746060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 05:00:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 05:00:09,392][15401] Updated weights for policy 0, policy_version 589950 (0.0038) [2024-06-24 05:00:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9665871872. Throughput: 0: 43237.3. Samples: 9666006060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 05:00:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:00:13,854][15401] Updated weights for policy 0, policy_version 589960 (0.0036) [2024-06-24 05:00:17,063][15401] Updated weights for policy 0, policy_version 589970 (0.0033) [2024-06-24 05:00:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9666101248. Throughput: 0: 42995.2. Samples: 9666263220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 05:00:21,374][15401] Updated weights for policy 0, policy_version 589980 (0.0031) [2024-06-24 05:00:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9666314240. Throughput: 0: 43224.1. Samples: 9666397340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 05:00:24,745][15401] Updated weights for policy 0, policy_version 589990 (0.0040) [2024-06-24 05:00:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9666527232. Throughput: 0: 43130.3. Samples: 9666651600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:28,391][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 05:00:29,239][15401] Updated weights for policy 0, policy_version 590000 (0.0038) [2024-06-24 05:00:32,453][15401] Updated weights for policy 0, policy_version 590010 (0.0036) [2024-06-24 05:00:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 42709.8). Total num frames: 9666740224. Throughput: 0: 43040.5. Samples: 9666907760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 05:00:36,777][15401] Updated weights for policy 0, policy_version 590020 (0.0039) [2024-06-24 05:00:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9666969600. Throughput: 0: 43251.9. Samples: 9667043300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:38,395][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 05:00:39,920][15401] Updated weights for policy 0, policy_version 590030 (0.0028) [2024-06-24 05:00:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9667149824. Throughput: 0: 43174.6. Samples: 9667300000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:43,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 05:00:44,385][15401] Updated weights for policy 0, policy_version 590040 (0.0043) [2024-06-24 05:00:47,699][15401] Updated weights for policy 0, policy_version 590050 (0.0037) [2024-06-24 05:00:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43692.4, 300 sec: 42765.0). Total num frames: 9667395584. Throughput: 0: 42992.1. Samples: 9667551880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 05:00:51,948][15401] Updated weights for policy 0, policy_version 590060 (0.0031) [2024-06-24 05:00:53,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43420.3, 300 sec: 42987.2). Total num frames: 9667624960. Throughput: 0: 43244.0. Samples: 9667692040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:00:55,179][15401] Updated weights for policy 0, policy_version 590070 (0.0044) [2024-06-24 05:00:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9667805184. Throughput: 0: 42951.1. Samples: 9667938860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:00:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:01:00,011][15401] Updated weights for policy 0, policy_version 590080 (0.0024) [2024-06-24 05:01:00,511][15349] Signal inference workers to stop experience collection... (143300 times) [2024-06-24 05:01:00,516][15349] Signal inference workers to resume experience collection... (143300 times) [2024-06-24 05:01:00,531][15401] InferenceWorker_p0-w0: stopping experience collection (143300 times) [2024-06-24 05:01:00,568][15401] InferenceWorker_p0-w0: resuming experience collection (143300 times) [2024-06-24 05:01:03,245][15401] Updated weights for policy 0, policy_version 590090 (0.0043) [2024-06-24 05:01:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 9668034560. Throughput: 0: 42971.0. Samples: 9668196920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 05:01:07,518][15401] Updated weights for policy 0, policy_version 590100 (0.0029) [2024-06-24 05:01:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9668263936. Throughput: 0: 42945.0. Samples: 9668329860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 05:01:10,713][15401] Updated weights for policy 0, policy_version 590110 (0.0037) [2024-06-24 05:01:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9668444160. Throughput: 0: 42837.9. Samples: 9668579300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 05:01:14,992][15401] Updated weights for policy 0, policy_version 590120 (0.0026) [2024-06-24 05:01:18,171][15401] Updated weights for policy 0, policy_version 590130 (0.0033) [2024-06-24 05:01:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9668689920. Throughput: 0: 42874.2. Samples: 9668837100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 05:01:22,549][15401] Updated weights for policy 0, policy_version 590140 (0.0040) [2024-06-24 05:01:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9668902912. Throughput: 0: 42940.0. Samples: 9668975600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 05:01:25,736][15401] Updated weights for policy 0, policy_version 590150 (0.0027) [2024-06-24 05:01:28,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42867.0, 300 sec: 42875.2). Total num frames: 9669099520. Throughput: 0: 42845.5. Samples: 9669228320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:28,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:01:30,088][15401] Updated weights for policy 0, policy_version 590160 (0.0026) [2024-06-24 05:01:33,244][15401] Updated weights for policy 0, policy_version 590170 (0.0029) [2024-06-24 05:01:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9669345280. Throughput: 0: 42872.0. Samples: 9669481120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 05:01:37,650][15401] Updated weights for policy 0, policy_version 590180 (0.0027) [2024-06-24 05:01:38,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9669541888. Throughput: 0: 42796.1. Samples: 9669617860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:38,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 05:01:40,697][15401] Updated weights for policy 0, policy_version 590190 (0.0044) [2024-06-24 05:01:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9669738496. Throughput: 0: 43007.6. Samples: 9669874200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:01:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 05:01:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590194_9669738496.pth... [2024-06-24 05:01:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589566_9659449344.pth [2024-06-24 05:01:45,350][15401] Updated weights for policy 0, policy_version 590200 (0.0031) [2024-06-24 05:01:48,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9669967872. Throughput: 0: 42853.2. Samples: 9670125320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:01:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 05:01:48,816][15401] Updated weights for policy 0, policy_version 590210 (0.0030) [2024-06-24 05:01:52,968][15401] Updated weights for policy 0, policy_version 590220 (0.0024) [2024-06-24 05:01:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9670180864. Throughput: 0: 42940.4. Samples: 9670262180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:01:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:01:56,499][15401] Updated weights for policy 0, policy_version 590230 (0.0043) [2024-06-24 05:01:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9670377472. Throughput: 0: 42966.1. Samples: 9670512780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:01:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 05:02:00,473][15401] Updated weights for policy 0, policy_version 590240 (0.0032) [2024-06-24 05:02:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 9670623232. Throughput: 0: 42972.7. Samples: 9670770860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:03,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 05:02:04,138][15401] Updated weights for policy 0, policy_version 590250 (0.0024) [2024-06-24 05:02:08,114][15401] Updated weights for policy 0, policy_version 590260 (0.0042) [2024-06-24 05:02:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9670819840. Throughput: 0: 42740.5. Samples: 9670898920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 05:02:11,899][15401] Updated weights for policy 0, policy_version 590270 (0.0033) [2024-06-24 05:02:13,392][15132] Fps is (10 sec: 40949.5, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 9671032832. Throughput: 0: 42672.7. Samples: 9671148420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:13,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 05:02:16,069][15401] Updated weights for policy 0, policy_version 590280 (0.0029) [2024-06-24 05:02:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 9671245824. Throughput: 0: 42753.3. Samples: 9671405120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:18,393][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 05:02:19,522][15401] Updated weights for policy 0, policy_version 590290 (0.0031) [2024-06-24 05:02:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9671442432. Throughput: 0: 42544.3. Samples: 9671532360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 05:02:23,722][15401] Updated weights for policy 0, policy_version 590300 (0.0046) [2024-06-24 05:02:27,247][15401] Updated weights for policy 0, policy_version 590310 (0.0037) [2024-06-24 05:02:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 9671655424. Throughput: 0: 42467.1. Samples: 9671785220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:28,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 05:02:31,313][15401] Updated weights for policy 0, policy_version 590320 (0.0034) [2024-06-24 05:02:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9671884800. Throughput: 0: 42623.2. Samples: 9672043360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 05:02:34,094][15349] Signal inference workers to stop experience collection... (143350 times) [2024-06-24 05:02:34,094][15349] Signal inference workers to resume experience collection... (143350 times) [2024-06-24 05:02:34,110][15401] InferenceWorker_p0-w0: stopping experience collection (143350 times) [2024-06-24 05:02:34,110][15401] InferenceWorker_p0-w0: resuming experience collection (143350 times) [2024-06-24 05:02:34,794][15401] Updated weights for policy 0, policy_version 590330 (0.0036) [2024-06-24 05:02:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9672081408. Throughput: 0: 42456.5. Samples: 9672172720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 05:02:39,012][15401] Updated weights for policy 0, policy_version 590340 (0.0029) [2024-06-24 05:02:42,702][15401] Updated weights for policy 0, policy_version 590350 (0.0046) [2024-06-24 05:02:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9672294400. Throughput: 0: 42560.4. Samples: 9672428000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 05:02:46,733][15401] Updated weights for policy 0, policy_version 590360 (0.0048) [2024-06-24 05:02:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9672523776. Throughput: 0: 42454.1. Samples: 9672681300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 05:02:50,391][15401] Updated weights for policy 0, policy_version 590370 (0.0043) [2024-06-24 05:02:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9672736768. Throughput: 0: 42504.5. Samples: 9672811620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 05:02:54,048][15401] Updated weights for policy 0, policy_version 590380 (0.0028) [2024-06-24 05:02:58,221][15401] Updated weights for policy 0, policy_version 590390 (0.0037) [2024-06-24 05:02:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9672949760. Throughput: 0: 42758.7. Samples: 9673072460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:02:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 05:03:01,768][15401] Updated weights for policy 0, policy_version 590400 (0.0040) [2024-06-24 05:03:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9673162752. Throughput: 0: 42752.7. Samples: 9673328880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:03:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 05:03:05,924][15401] Updated weights for policy 0, policy_version 590410 (0.0032) [2024-06-24 05:03:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9673375744. Throughput: 0: 42715.9. Samples: 9673454580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:03:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 05:03:09,355][15401] Updated weights for policy 0, policy_version 590420 (0.0032) [2024-06-24 05:03:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 9673572352. Throughput: 0: 42739.5. Samples: 9673708500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:03:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 05:03:13,559][15401] Updated weights for policy 0, policy_version 590430 (0.0031) [2024-06-24 05:03:17,162][15401] Updated weights for policy 0, policy_version 590440 (0.0032) [2024-06-24 05:03:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 9673801728. Throughput: 0: 42685.7. Samples: 9673964220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:03:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 05:03:21,128][15401] Updated weights for policy 0, policy_version 590450 (0.0031) [2024-06-24 05:03:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9673998336. Throughput: 0: 42639.6. Samples: 9674091500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:23,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 05:03:24,671][15401] Updated weights for policy 0, policy_version 590460 (0.0030) [2024-06-24 05:03:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9674227712. Throughput: 0: 42777.5. Samples: 9674352980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 05:03:28,651][15401] Updated weights for policy 0, policy_version 590470 (0.0041) [2024-06-24 05:03:32,490][15401] Updated weights for policy 0, policy_version 590480 (0.0040) [2024-06-24 05:03:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 9674473472. Throughput: 0: 42736.9. Samples: 9674604460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 05:03:36,225][15401] Updated weights for policy 0, policy_version 590490 (0.0035) [2024-06-24 05:03:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9674637312. Throughput: 0: 42744.0. Samples: 9674735100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 05:03:40,346][15349] Signal inference workers to stop experience collection... (143400 times) [2024-06-24 05:03:40,368][15401] InferenceWorker_p0-w0: stopping experience collection (143400 times) [2024-06-24 05:03:40,404][15349] Signal inference workers to resume experience collection... (143400 times) [2024-06-24 05:03:40,405][15401] InferenceWorker_p0-w0: resuming experience collection (143400 times) [2024-06-24 05:03:40,406][15401] Updated weights for policy 0, policy_version 590500 (0.0027) [2024-06-24 05:03:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 9674883072. Throughput: 0: 42808.8. Samples: 9674998960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:43,393][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 05:03:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590508_9674883072.pth... [2024-06-24 05:03:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000589880_9664593920.pth [2024-06-24 05:03:44,201][15401] Updated weights for policy 0, policy_version 590510 (0.0028) [2024-06-24 05:03:47,942][15401] Updated weights for policy 0, policy_version 590520 (0.0039) [2024-06-24 05:03:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9675096064. Throughput: 0: 42668.2. Samples: 9675248960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 05:03:51,734][15401] Updated weights for policy 0, policy_version 590530 (0.0037) [2024-06-24 05:03:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9675292672. Throughput: 0: 42726.9. Samples: 9675377280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:53,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 05:03:55,431][15401] Updated weights for policy 0, policy_version 590540 (0.0039) [2024-06-24 05:03:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9675522048. Throughput: 0: 42858.6. Samples: 9675637140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:03:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 05:03:59,257][15401] Updated weights for policy 0, policy_version 590550 (0.0026) [2024-06-24 05:04:02,940][15401] Updated weights for policy 0, policy_version 590560 (0.0032) [2024-06-24 05:04:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9675735040. Throughput: 0: 42726.4. Samples: 9675886900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 05:04:06,856][15401] Updated weights for policy 0, policy_version 590570 (0.0051) [2024-06-24 05:04:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9675931648. Throughput: 0: 42828.7. Samples: 9676018800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 05:04:10,606][15401] Updated weights for policy 0, policy_version 590580 (0.0031) [2024-06-24 05:04:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9676161024. Throughput: 0: 42897.7. Samples: 9676283380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:13,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 05:04:14,412][15401] Updated weights for policy 0, policy_version 590590 (0.0034) [2024-06-24 05:04:18,213][15401] Updated weights for policy 0, policy_version 590600 (0.0037) [2024-06-24 05:04:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9676390400. Throughput: 0: 42914.6. Samples: 9676535620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 05:04:22,652][15401] Updated weights for policy 0, policy_version 590610 (0.0045) [2024-06-24 05:04:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9676587008. Throughput: 0: 42849.2. Samples: 9676663320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:23,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 05:04:26,230][15401] Updated weights for policy 0, policy_version 590620 (0.0042) [2024-06-24 05:04:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9676816384. Throughput: 0: 42760.0. Samples: 9676923060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 05:04:30,068][15401] Updated weights for policy 0, policy_version 590630 (0.0034) [2024-06-24 05:04:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 9677012992. Throughput: 0: 42891.3. Samples: 9677179060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:33,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 05:04:33,755][15401] Updated weights for policy 0, policy_version 590640 (0.0025) [2024-06-24 05:04:37,637][15401] Updated weights for policy 0, policy_version 590650 (0.0036) [2024-06-24 05:04:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9677225984. Throughput: 0: 42862.7. Samples: 9677306100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 05:04:41,516][15401] Updated weights for policy 0, policy_version 590660 (0.0031) [2024-06-24 05:04:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42987.5). Total num frames: 9677455360. Throughput: 0: 42844.9. Samples: 9677565160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:04:45,270][15401] Updated weights for policy 0, policy_version 590670 (0.0025) [2024-06-24 05:04:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.6). Total num frames: 9677635584. Throughput: 0: 43138.2. Samples: 9677828120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 05:04:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 05:04:49,078][15401] Updated weights for policy 0, policy_version 590680 (0.0047) [2024-06-24 05:04:52,865][15401] Updated weights for policy 0, policy_version 590690 (0.0031) [2024-06-24 05:04:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9677881344. Throughput: 0: 42898.5. Samples: 9677949220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:04:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 05:04:56,628][15401] Updated weights for policy 0, policy_version 590700 (0.0039) [2024-06-24 05:04:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 9678094336. Throughput: 0: 42846.3. Samples: 9678211460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:04:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 05:05:00,538][15401] Updated weights for policy 0, policy_version 590710 (0.0034) [2024-06-24 05:05:02,208][15349] Signal inference workers to stop experience collection... (143450 times) [2024-06-24 05:05:02,251][15401] InferenceWorker_p0-w0: stopping experience collection (143450 times) [2024-06-24 05:05:02,268][15349] Signal inference workers to resume experience collection... (143450 times) [2024-06-24 05:05:02,269][15401] InferenceWorker_p0-w0: resuming experience collection (143450 times) [2024-06-24 05:05:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9678290944. Throughput: 0: 43061.7. Samples: 9678473400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:05:04,124][15401] Updated weights for policy 0, policy_version 590720 (0.0027) [2024-06-24 05:05:08,151][15401] Updated weights for policy 0, policy_version 590730 (0.0041) [2024-06-24 05:05:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9678520320. Throughput: 0: 43017.9. Samples: 9678599120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 05:05:11,695][15401] Updated weights for policy 0, policy_version 590740 (0.0034) [2024-06-24 05:05:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9678733312. Throughput: 0: 42933.8. Samples: 9678855080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 05:05:16,233][15401] Updated weights for policy 0, policy_version 590750 (0.0046) [2024-06-24 05:05:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9678929920. Throughput: 0: 42824.8. Samples: 9679106180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 05:05:19,520][15401] Updated weights for policy 0, policy_version 590760 (0.0033) [2024-06-24 05:05:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9679142912. Throughput: 0: 42843.4. Samples: 9679234060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 05:05:23,909][15401] Updated weights for policy 0, policy_version 590770 (0.0036) [2024-06-24 05:05:27,279][15401] Updated weights for policy 0, policy_version 590780 (0.0039) [2024-06-24 05:05:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9679372288. Throughput: 0: 42714.1. Samples: 9679487300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 05:05:31,451][15401] Updated weights for policy 0, policy_version 590790 (0.0026) [2024-06-24 05:05:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 9679585280. Throughput: 0: 42510.5. Samples: 9679741100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 05:05:34,898][15401] Updated weights for policy 0, policy_version 590800 (0.0037) [2024-06-24 05:05:38,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9679781888. Throughput: 0: 42786.2. Samples: 9679874600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 05:05:38,941][15401] Updated weights for policy 0, policy_version 590810 (0.0033) [2024-06-24 05:05:42,779][15401] Updated weights for policy 0, policy_version 590820 (0.0034) [2024-06-24 05:05:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9680011264. Throughput: 0: 42793.7. Samples: 9680137180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 05:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590821_9680011264.pth... [2024-06-24 05:05:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590194_9669738496.pth [2024-06-24 05:05:46,529][15401] Updated weights for policy 0, policy_version 590830 (0.0037) [2024-06-24 05:05:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 9680240640. Throughput: 0: 42478.7. Samples: 9680385040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:48,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 05:05:50,278][15401] Updated weights for policy 0, policy_version 590840 (0.0022) [2024-06-24 05:05:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9680437248. Throughput: 0: 42666.5. Samples: 9680519120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:53,395][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 05:05:53,971][15401] Updated weights for policy 0, policy_version 590850 (0.0030) [2024-06-24 05:05:58,103][15401] Updated weights for policy 0, policy_version 590860 (0.0024) [2024-06-24 05:05:58,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9680650240. Throughput: 0: 42812.5. Samples: 9680781640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:05:58,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 05:06:01,523][15401] Updated weights for policy 0, policy_version 590870 (0.0033) [2024-06-24 05:06:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.6). Total num frames: 9680879616. Throughput: 0: 42907.8. Samples: 9681037140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:06:03,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 05:06:05,650][15401] Updated weights for policy 0, policy_version 590880 (0.0037) [2024-06-24 05:06:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9681076224. Throughput: 0: 42927.1. Samples: 9681165780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:06:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 05:06:08,999][15401] Updated weights for policy 0, policy_version 590890 (0.0032) [2024-06-24 05:06:13,325][15401] Updated weights for policy 0, policy_version 590900 (0.0040) [2024-06-24 05:06:13,392][15132] Fps is (10 sec: 42598.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9681305600. Throughput: 0: 43016.1. Samples: 9681423120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:06:13,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 05:06:15,914][15349] Signal inference workers to stop experience collection... (143500 times) [2024-06-24 05:06:15,914][15349] Signal inference workers to resume experience collection... (143500 times) [2024-06-24 05:06:15,940][15401] InferenceWorker_p0-w0: stopping experience collection (143500 times) [2024-06-24 05:06:15,940][15401] InferenceWorker_p0-w0: resuming experience collection (143500 times) [2024-06-24 05:06:17,179][15401] Updated weights for policy 0, policy_version 590910 (0.0032) [2024-06-24 05:06:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9681518592. Throughput: 0: 43138.8. Samples: 9681682340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:06:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 05:06:20,875][15401] Updated weights for policy 0, policy_version 590920 (0.0032) [2024-06-24 05:06:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 9681731584. Throughput: 0: 43039.5. Samples: 9681811380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-24 05:06:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 05:06:24,521][15401] Updated weights for policy 0, policy_version 590930 (0.0029) [2024-06-24 05:06:28,377][15401] Updated weights for policy 0, policy_version 590940 (0.0040) [2024-06-24 05:06:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 9681960960. Throughput: 0: 42986.7. Samples: 9682071580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 05:06:32,033][15401] Updated weights for policy 0, policy_version 590950 (0.0028) [2024-06-24 05:06:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9682157568. Throughput: 0: 43121.4. Samples: 9682325400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 05:06:36,223][15401] Updated weights for policy 0, policy_version 590960 (0.0023) [2024-06-24 05:06:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9682370560. Throughput: 0: 43087.2. Samples: 9682458040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 05:06:39,715][15401] Updated weights for policy 0, policy_version 590970 (0.0032) [2024-06-24 05:06:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9682599936. Throughput: 0: 42983.1. Samples: 9682715980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:43,401][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 05:06:43,646][15401] Updated weights for policy 0, policy_version 590980 (0.0027) [2024-06-24 05:06:47,213][15401] Updated weights for policy 0, policy_version 590990 (0.0036) [2024-06-24 05:06:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 9682796544. Throughput: 0: 43102.7. Samples: 9682976660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 05:06:51,190][15401] Updated weights for policy 0, policy_version 591000 (0.0045) [2024-06-24 05:06:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9683009536. Throughput: 0: 43097.0. Samples: 9683105140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 05:06:54,729][15401] Updated weights for policy 0, policy_version 591010 (0.0025) [2024-06-24 05:06:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9683255296. Throughput: 0: 43120.6. Samples: 9683363440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:06:58,390][15132] Avg episode reward: [(0, '0.157')] [2024-06-24 05:06:58,759][15401] Updated weights for policy 0, policy_version 591020 (0.0026) [2024-06-24 05:07:02,286][15401] Updated weights for policy 0, policy_version 591030 (0.0043) [2024-06-24 05:07:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 9683451904. Throughput: 0: 43056.4. Samples: 9683619880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 05:07:06,331][15401] Updated weights for policy 0, policy_version 591040 (0.0028) [2024-06-24 05:07:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9683648512. Throughput: 0: 43017.3. Samples: 9683747160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:07:10,436][15401] Updated weights for policy 0, policy_version 591050 (0.0032) [2024-06-24 05:07:13,390][15132] Fps is (10 sec: 42595.2, 60 sec: 42872.7, 300 sec: 42820.8). Total num frames: 9683877888. Throughput: 0: 42798.0. Samples: 9683997520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:13,391][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 05:07:13,871][15401] Updated weights for policy 0, policy_version 591060 (0.0045) [2024-06-24 05:07:18,008][15401] Updated weights for policy 0, policy_version 591070 (0.0036) [2024-06-24 05:07:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9684090880. Throughput: 0: 42928.9. Samples: 9684257200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:18,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 05:07:21,485][15401] Updated weights for policy 0, policy_version 591080 (0.0034) [2024-06-24 05:07:23,389][15132] Fps is (10 sec: 40962.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9684287488. Throughput: 0: 42824.9. Samples: 9684385160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 05:07:25,516][15401] Updated weights for policy 0, policy_version 591090 (0.0029) [2024-06-24 05:07:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9684516864. Throughput: 0: 42877.9. Samples: 9684645380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 05:07:29,024][15401] Updated weights for policy 0, policy_version 591100 (0.0043) [2024-06-24 05:07:31,864][15349] Signal inference workers to stop experience collection... (143550 times) [2024-06-24 05:07:31,864][15349] Signal inference workers to resume experience collection... (143550 times) [2024-06-24 05:07:31,909][15401] InferenceWorker_p0-w0: stopping experience collection (143550 times) [2024-06-24 05:07:31,910][15401] InferenceWorker_p0-w0: resuming experience collection (143550 times) [2024-06-24 05:07:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9684729856. Throughput: 0: 42752.2. Samples: 9684900500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 05:07:33,415][15401] Updated weights for policy 0, policy_version 591110 (0.0033) [2024-06-24 05:07:36,779][15401] Updated weights for policy 0, policy_version 591120 (0.0033) [2024-06-24 05:07:38,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9684926464. Throughput: 0: 42639.4. Samples: 9685024020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:38,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 05:07:40,899][15401] Updated weights for policy 0, policy_version 591130 (0.0024) [2024-06-24 05:07:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42871.5, 300 sec: 42875.7). Total num frames: 9685172224. Throughput: 0: 42733.2. Samples: 9685286540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:43,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 05:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591136_9685172224.pth... [2024-06-24 05:07:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590508_9674883072.pth [2024-06-24 05:07:44,861][15401] Updated weights for policy 0, policy_version 591140 (0.0032) [2024-06-24 05:07:48,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9685385216. Throughput: 0: 42696.9. Samples: 9685541240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 05:07:48,401][15401] Updated weights for policy 0, policy_version 591150 (0.0029) [2024-06-24 05:07:52,285][15401] Updated weights for policy 0, policy_version 591160 (0.0031) [2024-06-24 05:07:53,390][15132] Fps is (10 sec: 40967.8, 60 sec: 42871.0, 300 sec: 42820.5). Total num frames: 9685581824. Throughput: 0: 42705.8. Samples: 9685668940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:53,391][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 05:07:56,284][15401] Updated weights for policy 0, policy_version 591170 (0.0050) [2024-06-24 05:07:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9685827584. Throughput: 0: 42984.1. Samples: 9685931880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 05:07:58,393][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 05:07:59,652][15401] Updated weights for policy 0, policy_version 591180 (0.0035) [2024-06-24 05:08:03,390][15132] Fps is (10 sec: 44239.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9686024192. Throughput: 0: 43092.4. Samples: 9686196360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:03,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 05:08:03,781][15401] Updated weights for policy 0, policy_version 591190 (0.0032) [2024-06-24 05:08:07,023][15401] Updated weights for policy 0, policy_version 591200 (0.0040) [2024-06-24 05:08:08,390][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9686237184. Throughput: 0: 42989.3. Samples: 9686319680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 05:08:11,707][15401] Updated weights for policy 0, policy_version 591210 (0.0033) [2024-06-24 05:08:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43145.0, 300 sec: 42931.6). Total num frames: 9686466560. Throughput: 0: 42972.8. Samples: 9686579160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:13,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-24 05:08:14,525][15401] Updated weights for policy 0, policy_version 591220 (0.0030) [2024-06-24 05:08:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9686646784. Throughput: 0: 43096.4. Samples: 9686839840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 05:08:19,327][15401] Updated weights for policy 0, policy_version 591230 (0.0024) [2024-06-24 05:08:22,323][15401] Updated weights for policy 0, policy_version 591240 (0.0035) [2024-06-24 05:08:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9686892544. Throughput: 0: 43083.1. Samples: 9686962660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 05:08:27,017][15401] Updated weights for policy 0, policy_version 591250 (0.0036) [2024-06-24 05:08:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9687105536. Throughput: 0: 42941.1. Samples: 9687218780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 05:08:30,215][15401] Updated weights for policy 0, policy_version 591260 (0.0036) [2024-06-24 05:08:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9687285760. Throughput: 0: 43040.5. Samples: 9687478060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 05:08:34,746][15401] Updated weights for policy 0, policy_version 591270 (0.0030) [2024-06-24 05:08:37,981][15401] Updated weights for policy 0, policy_version 591280 (0.0031) [2024-06-24 05:08:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43419.3, 300 sec: 42876.4). Total num frames: 9687531520. Throughput: 0: 42985.8. Samples: 9687603280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:38,396][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 05:08:42,405][15401] Updated weights for policy 0, policy_version 591290 (0.0042) [2024-06-24 05:08:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9687744512. Throughput: 0: 43030.8. Samples: 9687868160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 05:08:45,519][15401] Updated weights for policy 0, policy_version 591300 (0.0038) [2024-06-24 05:08:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9687941120. Throughput: 0: 42806.6. Samples: 9688122660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 05:08:50,132][15401] Updated weights for policy 0, policy_version 591310 (0.0044) [2024-06-24 05:08:53,215][15401] Updated weights for policy 0, policy_version 591320 (0.0055) [2024-06-24 05:08:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43418.0, 300 sec: 42931.6). Total num frames: 9688186880. Throughput: 0: 42800.9. Samples: 9688245720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 05:08:57,788][15349] Signal inference workers to stop experience collection... (143600 times) [2024-06-24 05:08:57,788][15349] Signal inference workers to resume experience collection... (143600 times) [2024-06-24 05:08:57,803][15401] Updated weights for policy 0, policy_version 591330 (0.0038) [2024-06-24 05:08:57,836][15401] InferenceWorker_p0-w0: stopping experience collection (143600 times) [2024-06-24 05:08:57,836][15401] InferenceWorker_p0-w0: resuming experience collection (143600 times) [2024-06-24 05:08:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 9688399872. Throughput: 0: 42999.6. Samples: 9688514140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:08:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 05:09:01,022][15401] Updated weights for policy 0, policy_version 591340 (0.0030) [2024-06-24 05:09:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 9688596480. Throughput: 0: 42715.1. Samples: 9688762020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 05:09:05,401][15401] Updated weights for policy 0, policy_version 591350 (0.0033) [2024-06-24 05:09:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9688825856. Throughput: 0: 42973.0. Samples: 9688896440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:08,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 05:09:08,519][15401] Updated weights for policy 0, policy_version 591360 (0.0042) [2024-06-24 05:09:13,098][15401] Updated weights for policy 0, policy_version 591370 (0.0035) [2024-06-24 05:09:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9689006080. Throughput: 0: 42992.4. Samples: 9689153440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 05:09:16,038][15401] Updated weights for policy 0, policy_version 591380 (0.0027) [2024-06-24 05:09:18,391][15132] Fps is (10 sec: 42591.6, 60 sec: 43416.4, 300 sec: 42931.4). Total num frames: 9689251840. Throughput: 0: 42663.8. Samples: 9689398000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:18,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 05:09:20,655][15401] Updated weights for policy 0, policy_version 591390 (0.0027) [2024-06-24 05:09:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 9689481216. Throughput: 0: 42961.1. Samples: 9689536520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 05:09:23,812][15401] Updated weights for policy 0, policy_version 591400 (0.0033) [2024-06-24 05:09:28,389][15132] Fps is (10 sec: 39328.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9689645056. Throughput: 0: 42689.8. Samples: 9689789200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-24 05:09:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 05:09:28,434][15401] Updated weights for policy 0, policy_version 591410 (0.0035) [2024-06-24 05:09:31,200][15401] Updated weights for policy 0, policy_version 591420 (0.0036) [2024-06-24 05:09:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9689874432. Throughput: 0: 42714.6. Samples: 9690044820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:33,395][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 05:09:36,111][15401] Updated weights for policy 0, policy_version 591430 (0.0030) [2024-06-24 05:09:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9690120192. Throughput: 0: 42963.5. Samples: 9690179080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 05:09:39,180][15401] Updated weights for policy 0, policy_version 591440 (0.0042) [2024-06-24 05:09:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 9690284032. Throughput: 0: 42532.3. Samples: 9690428100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 05:09:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591448_9690284032.pth... [2024-06-24 05:09:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000590821_9680011264.pth [2024-06-24 05:09:43,865][15401] Updated weights for policy 0, policy_version 591450 (0.0032) [2024-06-24 05:09:46,798][15401] Updated weights for policy 0, policy_version 591460 (0.0038) [2024-06-24 05:09:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9690529792. Throughput: 0: 42651.1. Samples: 9690681320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 05:09:51,279][15401] Updated weights for policy 0, policy_version 591470 (0.0034) [2024-06-24 05:09:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 9690710016. Throughput: 0: 42715.6. Samples: 9690818640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 05:09:54,475][15401] Updated weights for policy 0, policy_version 591480 (0.0038) [2024-06-24 05:09:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 9690923008. Throughput: 0: 42620.0. Samples: 9691071340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:09:58,390][15132] Avg episode reward: [(0, '0.189')] [2024-06-24 05:09:59,017][15401] Updated weights for policy 0, policy_version 591490 (0.0032) [2024-06-24 05:09:59,109][15349] Signal inference workers to stop experience collection... (143650 times) [2024-06-24 05:09:59,160][15401] InferenceWorker_p0-w0: stopping experience collection (143650 times) [2024-06-24 05:09:59,165][15349] Signal inference workers to resume experience collection... (143650 times) [2024-06-24 05:09:59,170][15401] InferenceWorker_p0-w0: resuming experience collection (143650 times) [2024-06-24 05:10:02,067][15401] Updated weights for policy 0, policy_version 591500 (0.0027) [2024-06-24 05:10:03,392][15132] Fps is (10 sec: 47501.7, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 9691185152. Throughput: 0: 42939.7. Samples: 9691330320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:03,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 05:10:06,440][15401] Updated weights for policy 0, policy_version 591510 (0.0039) [2024-06-24 05:10:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9691365376. Throughput: 0: 42883.4. Samples: 9691466280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:08,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:10:09,708][15401] Updated weights for policy 0, policy_version 591520 (0.0042) [2024-06-24 05:10:13,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9691578368. Throughput: 0: 42935.0. Samples: 9691721280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 05:10:13,900][15401] Updated weights for policy 0, policy_version 591530 (0.0034) [2024-06-24 05:10:17,147][15401] Updated weights for policy 0, policy_version 591540 (0.0033) [2024-06-24 05:10:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42872.6, 300 sec: 42987.2). Total num frames: 9691824128. Throughput: 0: 42959.2. Samples: 9691977980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 05:10:21,397][15401] Updated weights for policy 0, policy_version 591550 (0.0032) [2024-06-24 05:10:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9692020736. Throughput: 0: 42992.5. Samples: 9692113740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 05:10:24,740][15401] Updated weights for policy 0, policy_version 591560 (0.0034) [2024-06-24 05:10:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 9692233728. Throughput: 0: 43036.3. Samples: 9692364740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 05:10:28,966][15401] Updated weights for policy 0, policy_version 591570 (0.0034) [2024-06-24 05:10:32,338][15401] Updated weights for policy 0, policy_version 591580 (0.0030) [2024-06-24 05:10:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9692446720. Throughput: 0: 43137.4. Samples: 9692622500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 05:10:37,174][15401] Updated weights for policy 0, policy_version 591590 (0.0035) [2024-06-24 05:10:38,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 9692659712. Throughput: 0: 42916.0. Samples: 9692749860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 05:10:40,525][15401] Updated weights for policy 0, policy_version 591600 (0.0034) [2024-06-24 05:10:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 9692872704. Throughput: 0: 42831.0. Samples: 9692998740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 05:10:44,745][15401] Updated weights for policy 0, policy_version 591610 (0.0054) [2024-06-24 05:10:48,304][15401] Updated weights for policy 0, policy_version 591620 (0.0028) [2024-06-24 05:10:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9693102080. Throughput: 0: 42863.2. Samples: 9693259060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:48,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 05:10:52,430][15401] Updated weights for policy 0, policy_version 591630 (0.0035) [2024-06-24 05:10:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9693282304. Throughput: 0: 42769.0. Samples: 9693390880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:53,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 05:10:56,066][15401] Updated weights for policy 0, policy_version 591640 (0.0025) [2024-06-24 05:10:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42876.5). Total num frames: 9693528064. Throughput: 0: 42766.2. Samples: 9693645760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:10:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 05:10:59,847][15401] Updated weights for policy 0, policy_version 591650 (0.0027) [2024-06-24 05:11:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 9693741056. Throughput: 0: 42909.7. Samples: 9693908920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 05:11:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 05:11:03,455][15401] Updated weights for policy 0, policy_version 591660 (0.0050) [2024-06-24 05:11:04,341][15349] Signal inference workers to stop experience collection... (143700 times) [2024-06-24 05:11:04,343][15349] Signal inference workers to resume experience collection... (143700 times) [2024-06-24 05:11:04,371][15401] InferenceWorker_p0-w0: stopping experience collection (143700 times) [2024-06-24 05:11:04,371][15401] InferenceWorker_p0-w0: resuming experience collection (143700 times) [2024-06-24 05:11:07,351][15401] Updated weights for policy 0, policy_version 591670 (0.0024) [2024-06-24 05:11:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9693937664. Throughput: 0: 42803.4. Samples: 9694039900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 05:11:10,917][15401] Updated weights for policy 0, policy_version 591680 (0.0037) [2024-06-24 05:11:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9694167040. Throughput: 0: 42860.7. Samples: 9694293460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 05:11:15,002][15401] Updated weights for policy 0, policy_version 591690 (0.0038) [2024-06-24 05:11:18,392][15132] Fps is (10 sec: 47502.9, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 9694412800. Throughput: 0: 42940.3. Samples: 9694554920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:18,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 05:11:18,398][15401] Updated weights for policy 0, policy_version 591700 (0.0047) [2024-06-24 05:11:22,873][15401] Updated weights for policy 0, policy_version 591710 (0.0027) [2024-06-24 05:11:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9694576640. Throughput: 0: 42883.8. Samples: 9694679640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 05:11:25,864][15401] Updated weights for policy 0, policy_version 591720 (0.0038) [2024-06-24 05:11:28,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43143.0, 300 sec: 42931.3). Total num frames: 9694822400. Throughput: 0: 43030.3. Samples: 9694935200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:28,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 05:11:30,898][15401] Updated weights for policy 0, policy_version 591730 (0.0031) [2024-06-24 05:11:33,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 9695051776. Throughput: 0: 43013.8. Samples: 9695194680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 05:11:34,057][15401] Updated weights for policy 0, policy_version 591740 (0.0033) [2024-06-24 05:11:38,271][15401] Updated weights for policy 0, policy_version 591750 (0.0035) [2024-06-24 05:11:38,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 9695232000. Throughput: 0: 43032.8. Samples: 9695327360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 05:11:41,502][15401] Updated weights for policy 0, policy_version 591760 (0.0038) [2024-06-24 05:11:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 9695461376. Throughput: 0: 43097.9. Samples: 9695585160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 05:11:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591765_9695477760.pth... [2024-06-24 05:11:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591136_9685172224.pth [2024-06-24 05:11:45,802][15401] Updated weights for policy 0, policy_version 591770 (0.0034) [2024-06-24 05:11:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9695674368. Throughput: 0: 42978.3. Samples: 9695842940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 05:11:49,121][15401] Updated weights for policy 0, policy_version 591780 (0.0032) [2024-06-24 05:11:53,218][15401] Updated weights for policy 0, policy_version 591790 (0.0037) [2024-06-24 05:11:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9695887360. Throughput: 0: 42880.6. Samples: 9695969520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:53,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 05:11:56,882][15401] Updated weights for policy 0, policy_version 591800 (0.0040) [2024-06-24 05:11:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9696100352. Throughput: 0: 43035.4. Samples: 9696230060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:11:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 05:12:00,713][15401] Updated weights for policy 0, policy_version 591810 (0.0047) [2024-06-24 05:12:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9696313344. Throughput: 0: 42949.8. Samples: 9696487560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 05:12:04,459][15401] Updated weights for policy 0, policy_version 591820 (0.0039) [2024-06-24 05:12:06,456][15349] Signal inference workers to stop experience collection... (143750 times) [2024-06-24 05:12:06,464][15349] Signal inference workers to resume experience collection... (143750 times) [2024-06-24 05:12:06,474][15401] InferenceWorker_p0-w0: stopping experience collection (143750 times) [2024-06-24 05:12:06,512][15401] InferenceWorker_p0-w0: resuming experience collection (143750 times) [2024-06-24 05:12:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42876.2). Total num frames: 9696526336. Throughput: 0: 42997.8. Samples: 9696614540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:08,399][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 05:12:08,538][15401] Updated weights for policy 0, policy_version 591830 (0.0044) [2024-06-24 05:12:12,027][15401] Updated weights for policy 0, policy_version 591840 (0.0030) [2024-06-24 05:12:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9696755712. Throughput: 0: 42889.7. Samples: 9696865140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:13,400][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 05:12:16,181][15401] Updated weights for policy 0, policy_version 591850 (0.0032) [2024-06-24 05:12:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42326.9, 300 sec: 42931.6). Total num frames: 9696952320. Throughput: 0: 42885.7. Samples: 9697124540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 05:12:20,277][15401] Updated weights for policy 0, policy_version 591860 (0.0038) [2024-06-24 05:12:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9697165312. Throughput: 0: 42707.6. Samples: 9697249200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 05:12:23,960][15401] Updated weights for policy 0, policy_version 591870 (0.0036) [2024-06-24 05:12:27,728][15401] Updated weights for policy 0, policy_version 591880 (0.0034) [2024-06-24 05:12:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 9697394688. Throughput: 0: 42842.6. Samples: 9697513080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 05:12:31,523][15401] Updated weights for policy 0, policy_version 591890 (0.0026) [2024-06-24 05:12:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 9697607680. Throughput: 0: 42948.8. Samples: 9697775640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 05:12:35,282][15401] Updated weights for policy 0, policy_version 591900 (0.0038) [2024-06-24 05:12:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 9697820672. Throughput: 0: 42844.8. Samples: 9697897540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 05:12:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 05:12:38,989][15401] Updated weights for policy 0, policy_version 591910 (0.0029) [2024-06-24 05:12:42,714][15401] Updated weights for policy 0, policy_version 591920 (0.0028) [2024-06-24 05:12:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9698033664. Throughput: 0: 42827.9. Samples: 9698157320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:12:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 05:12:47,166][15401] Updated weights for policy 0, policy_version 591930 (0.0034) [2024-06-24 05:12:48,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42986.9). Total num frames: 9698263040. Throughput: 0: 42861.3. Samples: 9698416420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:12:48,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 05:12:50,328][15401] Updated weights for policy 0, policy_version 591940 (0.0027) [2024-06-24 05:12:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 9698476032. Throughput: 0: 42865.4. Samples: 9698543480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:12:53,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 05:12:54,575][15401] Updated weights for policy 0, policy_version 591950 (0.0023) [2024-06-24 05:12:57,759][15401] Updated weights for policy 0, policy_version 591960 (0.0031) [2024-06-24 05:12:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9698672640. Throughput: 0: 43093.5. Samples: 9698804340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:12:58,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 05:13:02,019][15401] Updated weights for policy 0, policy_version 591970 (0.0027) [2024-06-24 05:13:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9698885632. Throughput: 0: 43040.2. Samples: 9699061340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 05:13:05,729][15401] Updated weights for policy 0, policy_version 591980 (0.0023) [2024-06-24 05:13:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9699115008. Throughput: 0: 43179.1. Samples: 9699192260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 05:13:09,462][15401] Updated weights for policy 0, policy_version 591990 (0.0027) [2024-06-24 05:13:12,059][15349] Signal inference workers to stop experience collection... (143800 times) [2024-06-24 05:13:12,090][15401] InferenceWorker_p0-w0: stopping experience collection (143800 times) [2024-06-24 05:13:12,102][15349] Signal inference workers to resume experience collection... (143800 times) [2024-06-24 05:13:12,120][15401] InferenceWorker_p0-w0: resuming experience collection (143800 times) [2024-06-24 05:13:13,324][15401] Updated weights for policy 0, policy_version 592000 (0.0036) [2024-06-24 05:13:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9699328000. Throughput: 0: 43013.3. Samples: 9699448680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 05:13:17,030][15401] Updated weights for policy 0, policy_version 592010 (0.0037) [2024-06-24 05:13:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43416.0, 300 sec: 42931.3). Total num frames: 9699557376. Throughput: 0: 42907.1. Samples: 9699706560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:18,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 05:13:20,851][15401] Updated weights for policy 0, policy_version 592020 (0.0036) [2024-06-24 05:13:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9699753984. Throughput: 0: 43154.7. Samples: 9699839500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 05:13:24,464][15401] Updated weights for policy 0, policy_version 592030 (0.0043) [2024-06-24 05:13:28,281][15401] Updated weights for policy 0, policy_version 592040 (0.0040) [2024-06-24 05:13:28,389][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9699983360. Throughput: 0: 43077.5. Samples: 9700095800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 05:13:31,959][15401] Updated weights for policy 0, policy_version 592050 (0.0032) [2024-06-24 05:13:33,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 9700196352. Throughput: 0: 43044.5. Samples: 9700353600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:33,396][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 05:13:36,211][15401] Updated weights for policy 0, policy_version 592060 (0.0034) [2024-06-24 05:13:38,391][15132] Fps is (10 sec: 42592.8, 60 sec: 43143.6, 300 sec: 42931.4). Total num frames: 9700409344. Throughput: 0: 43237.8. Samples: 9700489240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:38,391][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:13:39,388][15401] Updated weights for policy 0, policy_version 592070 (0.0030) [2024-06-24 05:13:43,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9700605952. Throughput: 0: 43172.8. Samples: 9700747120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 05:13:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592078_9700605952.pth... [2024-06-24 05:13:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591448_9690284032.pth [2024-06-24 05:13:43,839][15401] Updated weights for policy 0, policy_version 592080 (0.0041) [2024-06-24 05:13:47,398][15401] Updated weights for policy 0, policy_version 592090 (0.0034) [2024-06-24 05:13:48,389][15132] Fps is (10 sec: 44243.0, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 9700851712. Throughput: 0: 43072.5. Samples: 9700999600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 05:13:51,531][15401] Updated weights for policy 0, policy_version 592100 (0.0035) [2024-06-24 05:13:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9701064704. Throughput: 0: 43132.9. Samples: 9701133240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 05:13:55,182][15401] Updated weights for policy 0, policy_version 592110 (0.0032) [2024-06-24 05:13:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9701261312. Throughput: 0: 43189.5. Samples: 9701392200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:13:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 05:13:59,118][15401] Updated weights for policy 0, policy_version 592120 (0.0035) [2024-06-24 05:14:02,478][15401] Updated weights for policy 0, policy_version 592130 (0.0029) [2024-06-24 05:14:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 9701507072. Throughput: 0: 43041.4. Samples: 9701643320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:14:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 05:14:07,006][15401] Updated weights for policy 0, policy_version 592140 (0.0030) [2024-06-24 05:14:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9701703680. Throughput: 0: 43063.6. Samples: 9701777360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 05:14:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 05:14:10,008][15401] Updated weights for policy 0, policy_version 592150 (0.0045) [2024-06-24 05:14:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42876.3). Total num frames: 9701900288. Throughput: 0: 42977.0. Samples: 9702029760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:13,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 05:14:14,619][15401] Updated weights for policy 0, policy_version 592160 (0.0041) [2024-06-24 05:14:17,656][15401] Updated weights for policy 0, policy_version 592170 (0.0040) [2024-06-24 05:14:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 9702146048. Throughput: 0: 42795.0. Samples: 9702279100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 05:14:22,183][15401] Updated weights for policy 0, policy_version 592180 (0.0034) [2024-06-24 05:14:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9702326272. Throughput: 0: 42690.1. Samples: 9702410240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 05:14:25,153][15401] Updated weights for policy 0, policy_version 592190 (0.0038) [2024-06-24 05:14:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 9702539264. Throughput: 0: 42755.6. Samples: 9702671120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 05:14:30,258][15401] Updated weights for policy 0, policy_version 592200 (0.0030) [2024-06-24 05:14:32,921][15401] Updated weights for policy 0, policy_version 592210 (0.0031) [2024-06-24 05:14:33,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43422.3, 300 sec: 42987.2). Total num frames: 9702801408. Throughput: 0: 42798.3. Samples: 9702925520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 05:14:37,668][15401] Updated weights for policy 0, policy_version 592220 (0.0034) [2024-06-24 05:14:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42872.4, 300 sec: 43042.7). Total num frames: 9702981632. Throughput: 0: 42784.0. Samples: 9703058520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 05:14:40,386][15401] Updated weights for policy 0, policy_version 592230 (0.0033) [2024-06-24 05:14:42,221][15349] Signal inference workers to stop experience collection... (143850 times) [2024-06-24 05:14:42,279][15349] Signal inference workers to resume experience collection... (143850 times) [2024-06-24 05:14:42,284][15401] InferenceWorker_p0-w0: stopping experience collection (143850 times) [2024-06-24 05:14:42,316][15401] InferenceWorker_p0-w0: resuming experience collection (143850 times) [2024-06-24 05:14:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9703211008. Throughput: 0: 42769.3. Samples: 9703316820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:43,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 05:14:45,286][15401] Updated weights for policy 0, policy_version 592240 (0.0025) [2024-06-24 05:14:47,922][15401] Updated weights for policy 0, policy_version 592250 (0.0031) [2024-06-24 05:14:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 9703424000. Throughput: 0: 42841.8. Samples: 9703571200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-24 05:14:52,957][15401] Updated weights for policy 0, policy_version 592260 (0.0032) [2024-06-24 05:14:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 9703604224. Throughput: 0: 42801.0. Samples: 9703703400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 05:14:55,811][15401] Updated weights for policy 0, policy_version 592270 (0.0054) [2024-06-24 05:14:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 9703866368. Throughput: 0: 42968.3. Samples: 9703963340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:14:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 05:15:00,377][15401] Updated weights for policy 0, policy_version 592280 (0.0035) [2024-06-24 05:15:03,236][15401] Updated weights for policy 0, policy_version 592290 (0.0027) [2024-06-24 05:15:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 43098.3). Total num frames: 9704079360. Throughput: 0: 43123.9. Samples: 9704219680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 05:15:08,008][15401] Updated weights for policy 0, policy_version 592300 (0.0039) [2024-06-24 05:15:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9704259584. Throughput: 0: 43145.8. Samples: 9704351800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 05:15:10,853][15401] Updated weights for policy 0, policy_version 592310 (0.0036) [2024-06-24 05:15:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9704505344. Throughput: 0: 43060.0. Samples: 9704608820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 05:15:15,719][15401] Updated weights for policy 0, policy_version 592320 (0.0035) [2024-06-24 05:15:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 9704718336. Throughput: 0: 43026.0. Samples: 9704861700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:18,400][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 05:15:18,799][15401] Updated weights for policy 0, policy_version 592330 (0.0033) [2024-06-24 05:15:23,227][15401] Updated weights for policy 0, policy_version 592340 (0.0042) [2024-06-24 05:15:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 9704898560. Throughput: 0: 42949.7. Samples: 9704991360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:23,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 05:15:26,531][15401] Updated weights for policy 0, policy_version 592350 (0.0040) [2024-06-24 05:15:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 9705144320. Throughput: 0: 42916.0. Samples: 9705248040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:28,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 05:15:30,775][15401] Updated weights for policy 0, policy_version 592360 (0.0036) [2024-06-24 05:15:33,390][15132] Fps is (10 sec: 47524.4, 60 sec: 42871.3, 300 sec: 43098.2). Total num frames: 9705373696. Throughput: 0: 42902.5. Samples: 9705501820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 05:15:34,172][15401] Updated weights for policy 0, policy_version 592370 (0.0042) [2024-06-24 05:15:38,325][15401] Updated weights for policy 0, policy_version 592380 (0.0035) [2024-06-24 05:15:38,391][15132] Fps is (10 sec: 40952.1, 60 sec: 42870.1, 300 sec: 42986.9). Total num frames: 9705553920. Throughput: 0: 42875.0. Samples: 9705632860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:38,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 05:15:41,714][15401] Updated weights for policy 0, policy_version 592390 (0.0024) [2024-06-24 05:15:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9705766912. Throughput: 0: 42822.7. Samples: 9705890360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 05:15:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 05:15:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592394_9705783296.pth... [2024-06-24 05:15:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000591765_9695477760.pth [2024-06-24 05:15:45,907][15401] Updated weights for policy 0, policy_version 592400 (0.0032) [2024-06-24 05:15:48,390][15132] Fps is (10 sec: 45883.6, 60 sec: 43144.4, 300 sec: 43153.8). Total num frames: 9706012672. Throughput: 0: 42760.8. Samples: 9706143920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:15:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 05:15:49,755][15401] Updated weights for policy 0, policy_version 592410 (0.0032) [2024-06-24 05:15:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9706176512. Throughput: 0: 42707.2. Samples: 9706273620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:15:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 05:15:53,670][15401] Updated weights for policy 0, policy_version 592420 (0.0027) [2024-06-24 05:15:57,347][15401] Updated weights for policy 0, policy_version 592430 (0.0031) [2024-06-24 05:15:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 9706405888. Throughput: 0: 42782.3. Samples: 9706534020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:15:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 05:16:01,302][15401] Updated weights for policy 0, policy_version 592440 (0.0038) [2024-06-24 05:16:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 9706635264. Throughput: 0: 42872.2. Samples: 9706790940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 05:16:04,865][15401] Updated weights for policy 0, policy_version 592450 (0.0041) [2024-06-24 05:16:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9706831872. Throughput: 0: 42917.9. Samples: 9706922560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 05:16:09,039][15401] Updated weights for policy 0, policy_version 592460 (0.0041) [2024-06-24 05:16:12,509][15401] Updated weights for policy 0, policy_version 592470 (0.0030) [2024-06-24 05:16:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 9707061248. Throughput: 0: 42919.2. Samples: 9707179400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 05:16:16,633][15401] Updated weights for policy 0, policy_version 592480 (0.0042) [2024-06-24 05:16:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 9707274240. Throughput: 0: 42934.3. Samples: 9707433860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 05:16:20,154][15401] Updated weights for policy 0, policy_version 592490 (0.0033) [2024-06-24 05:16:20,815][15349] Signal inference workers to stop experience collection... (143900 times) [2024-06-24 05:16:20,815][15349] Signal inference workers to resume experience collection... (143900 times) [2024-06-24 05:16:20,827][15401] InferenceWorker_p0-w0: stopping experience collection (143900 times) [2024-06-24 05:16:20,839][15401] InferenceWorker_p0-w0: resuming experience collection (143900 times) [2024-06-24 05:16:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42932.0). Total num frames: 9707487232. Throughput: 0: 42859.5. Samples: 9707561460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 05:16:24,857][15401] Updated weights for policy 0, policy_version 592500 (0.0034) [2024-06-24 05:16:27,504][15401] Updated weights for policy 0, policy_version 592510 (0.0033) [2024-06-24 05:16:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9707700224. Throughput: 0: 42819.1. Samples: 9707817220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 05:16:32,433][15401] Updated weights for policy 0, policy_version 592520 (0.0035) [2024-06-24 05:16:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 9707913216. Throughput: 0: 42974.7. Samples: 9708077780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 05:16:35,270][15401] Updated weights for policy 0, policy_version 592530 (0.0030) [2024-06-24 05:16:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42872.9, 300 sec: 42931.6). Total num frames: 9708126208. Throughput: 0: 43011.6. Samples: 9708209140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 05:16:40,025][15401] Updated weights for policy 0, policy_version 592540 (0.0046) [2024-06-24 05:16:42,771][15401] Updated weights for policy 0, policy_version 592550 (0.0029) [2024-06-24 05:16:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9708339200. Throughput: 0: 42645.6. Samples: 9708453180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:43,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 05:16:47,695][15401] Updated weights for policy 0, policy_version 592560 (0.0022) [2024-06-24 05:16:48,392][15132] Fps is (10 sec: 39312.0, 60 sec: 41777.6, 300 sec: 42820.2). Total num frames: 9708519424. Throughput: 0: 42831.9. Samples: 9708718480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:48,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 05:16:50,283][15401] Updated weights for policy 0, policy_version 592570 (0.0028) [2024-06-24 05:16:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9708765184. Throughput: 0: 42571.9. Samples: 9708838300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 05:16:55,218][15401] Updated weights for policy 0, policy_version 592580 (0.0028) [2024-06-24 05:16:57,750][15401] Updated weights for policy 0, policy_version 592590 (0.0033) [2024-06-24 05:16:58,393][15132] Fps is (10 sec: 47506.7, 60 sec: 43141.7, 300 sec: 42986.6). Total num frames: 9708994560. Throughput: 0: 42491.0. Samples: 9709091660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:16:58,394][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 05:17:02,676][15401] Updated weights for policy 0, policy_version 592600 (0.0047) [2024-06-24 05:17:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 9709158400. Throughput: 0: 42610.3. Samples: 9709351320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:17:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 05:17:06,002][15401] Updated weights for policy 0, policy_version 592610 (0.0029) [2024-06-24 05:17:08,390][15132] Fps is (10 sec: 39335.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9709387776. Throughput: 0: 42423.5. Samples: 9709470520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:17:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 05:17:10,118][15401] Updated weights for policy 0, policy_version 592620 (0.0023) [2024-06-24 05:17:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9709633536. Throughput: 0: 42538.7. Samples: 9709731460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:17:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 05:17:13,683][15401] Updated weights for policy 0, policy_version 592630 (0.0042) [2024-06-24 05:17:17,667][15401] Updated weights for policy 0, policy_version 592640 (0.0028) [2024-06-24 05:17:18,393][15132] Fps is (10 sec: 42585.5, 60 sec: 42323.1, 300 sec: 42875.6). Total num frames: 9709813760. Throughput: 0: 42406.7. Samples: 9709986220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 05:17:18,393][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 05:17:21,367][15401] Updated weights for policy 0, policy_version 592650 (0.0028) [2024-06-24 05:17:22,504][15349] Signal inference workers to stop experience collection... (143950 times) [2024-06-24 05:17:22,558][15401] InferenceWorker_p0-w0: stopping experience collection (143950 times) [2024-06-24 05:17:22,559][15349] Signal inference workers to resume experience collection... (143950 times) [2024-06-24 05:17:22,579][15401] InferenceWorker_p0-w0: resuming experience collection (143950 times) [2024-06-24 05:17:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9710026752. Throughput: 0: 42275.1. Samples: 9710111520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 05:17:25,638][15401] Updated weights for policy 0, policy_version 592660 (0.0027) [2024-06-24 05:17:28,389][15132] Fps is (10 sec: 45890.2, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 9710272512. Throughput: 0: 42662.8. Samples: 9710372900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 05:17:29,098][15401] Updated weights for policy 0, policy_version 592670 (0.0028) [2024-06-24 05:17:33,294][15401] Updated weights for policy 0, policy_version 592680 (0.0035) [2024-06-24 05:17:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9710469120. Throughput: 0: 42318.1. Samples: 9710622700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 05:17:36,871][15401] Updated weights for policy 0, policy_version 592690 (0.0033) [2024-06-24 05:17:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9710665728. Throughput: 0: 42461.4. Samples: 9710749060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 05:17:40,873][15401] Updated weights for policy 0, policy_version 592700 (0.0036) [2024-06-24 05:17:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42820.9). Total num frames: 9710895104. Throughput: 0: 42648.5. Samples: 9711010680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:43,394][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 05:17:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592706_9710895104.pth... [2024-06-24 05:17:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592078_9700605952.pth [2024-06-24 05:17:44,575][15401] Updated weights for policy 0, policy_version 592710 (0.0028) [2024-06-24 05:17:48,391][15132] Fps is (10 sec: 42591.3, 60 sec: 42872.0, 300 sec: 42764.8). Total num frames: 9711091712. Throughput: 0: 42473.1. Samples: 9711262680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:48,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 05:17:48,827][15401] Updated weights for policy 0, policy_version 592720 (0.0038) [2024-06-24 05:17:52,556][15401] Updated weights for policy 0, policy_version 592730 (0.0033) [2024-06-24 05:17:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9711321088. Throughput: 0: 42586.4. Samples: 9711386900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 05:17:56,909][15401] Updated weights for policy 0, policy_version 592740 (0.0036) [2024-06-24 05:17:58,390][15132] Fps is (10 sec: 40966.4, 60 sec: 41781.8, 300 sec: 42765.0). Total num frames: 9711501312. Throughput: 0: 42433.3. Samples: 9711640960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:17:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 05:18:00,155][15401] Updated weights for policy 0, policy_version 592750 (0.0028) [2024-06-24 05:18:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9711730688. Throughput: 0: 42536.4. Samples: 9711900220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 05:18:04,396][15401] Updated weights for policy 0, policy_version 592760 (0.0054) [2024-06-24 05:18:07,728][15401] Updated weights for policy 0, policy_version 592770 (0.0037) [2024-06-24 05:18:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9711960064. Throughput: 0: 42632.8. Samples: 9712030000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 05:18:12,066][15401] Updated weights for policy 0, policy_version 592780 (0.0033) [2024-06-24 05:18:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 9712156672. Throughput: 0: 42497.3. Samples: 9712285280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 05:18:15,295][15401] Updated weights for policy 0, policy_version 592790 (0.0035) [2024-06-24 05:18:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.7, 300 sec: 42765.0). Total num frames: 9712369664. Throughput: 0: 42631.7. Samples: 9712541120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 05:18:19,843][15401] Updated weights for policy 0, policy_version 592800 (0.0031) [2024-06-24 05:18:22,974][15401] Updated weights for policy 0, policy_version 592810 (0.0029) [2024-06-24 05:18:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9712615424. Throughput: 0: 42808.8. Samples: 9712675460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 05:18:27,564][15401] Updated weights for policy 0, policy_version 592820 (0.0032) [2024-06-24 05:18:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42654.9). Total num frames: 9712779264. Throughput: 0: 42744.1. Samples: 9712934160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 05:18:30,536][15401] Updated weights for policy 0, policy_version 592830 (0.0029) [2024-06-24 05:18:33,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42323.7, 300 sec: 42709.3). Total num frames: 9713008640. Throughput: 0: 42708.6. Samples: 9713184600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:33,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 05:18:34,961][15401] Updated weights for policy 0, policy_version 592840 (0.0028) [2024-06-24 05:18:38,151][15401] Updated weights for policy 0, policy_version 592850 (0.0026) [2024-06-24 05:18:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9713254400. Throughput: 0: 42979.5. Samples: 9713320980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 05:18:42,565][15401] Updated weights for policy 0, policy_version 592860 (0.0041) [2024-06-24 05:18:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9713434624. Throughput: 0: 42980.1. Samples: 9713575060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 05:18:45,422][15349] Signal inference workers to stop experience collection... (144000 times) [2024-06-24 05:18:45,454][15401] InferenceWorker_p0-w0: stopping experience collection (144000 times) [2024-06-24 05:18:45,482][15349] Signal inference workers to resume experience collection... (144000 times) [2024-06-24 05:18:45,488][15401] InferenceWorker_p0-w0: resuming experience collection (144000 times) [2024-06-24 05:18:45,628][15401] Updated weights for policy 0, policy_version 592870 (0.0034) [2024-06-24 05:18:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42872.7, 300 sec: 42709.5). Total num frames: 9713664000. Throughput: 0: 42924.9. Samples: 9713831840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 05:18:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 05:18:50,073][15401] Updated weights for policy 0, policy_version 592880 (0.0037) [2024-06-24 05:18:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9713893376. Throughput: 0: 43095.5. Samples: 9713969300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:18:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 05:18:53,809][15401] Updated weights for policy 0, policy_version 592890 (0.0042) [2024-06-24 05:18:57,955][15401] Updated weights for policy 0, policy_version 592900 (0.0029) [2024-06-24 05:18:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9714073600. Throughput: 0: 43021.7. Samples: 9714221260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:18:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 05:19:01,219][15401] Updated weights for policy 0, policy_version 592910 (0.0038) [2024-06-24 05:19:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9714319360. Throughput: 0: 42968.8. Samples: 9714474720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 05:19:05,388][15401] Updated weights for policy 0, policy_version 592920 (0.0042) [2024-06-24 05:19:08,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9714548736. Throughput: 0: 42941.3. Samples: 9714607820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 05:19:08,675][15401] Updated weights for policy 0, policy_version 592930 (0.0026) [2024-06-24 05:19:12,850][15401] Updated weights for policy 0, policy_version 592940 (0.0044) [2024-06-24 05:19:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9714728960. Throughput: 0: 42920.3. Samples: 9714865580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 05:19:16,043][15401] Updated weights for policy 0, policy_version 592950 (0.0038) [2024-06-24 05:19:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9714958336. Throughput: 0: 43029.1. Samples: 9715120800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 05:19:20,408][15401] Updated weights for policy 0, policy_version 592960 (0.0045) [2024-06-24 05:19:23,390][15132] Fps is (10 sec: 45872.4, 60 sec: 42871.0, 300 sec: 42876.0). Total num frames: 9715187712. Throughput: 0: 43028.3. Samples: 9715257280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:23,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 05:19:23,666][15401] Updated weights for policy 0, policy_version 592970 (0.0021) [2024-06-24 05:19:28,236][15401] Updated weights for policy 0, policy_version 592980 (0.0046) [2024-06-24 05:19:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 9715384320. Throughput: 0: 42962.7. Samples: 9715508380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 05:19:31,589][15401] Updated weights for policy 0, policy_version 592990 (0.0038) [2024-06-24 05:19:33,390][15132] Fps is (10 sec: 42601.0, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 9715613696. Throughput: 0: 42994.1. Samples: 9715766580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 05:19:35,574][15401] Updated weights for policy 0, policy_version 593000 (0.0036) [2024-06-24 05:19:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9715826688. Throughput: 0: 42912.9. Samples: 9715900480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:38,392][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 05:19:39,396][15401] Updated weights for policy 0, policy_version 593010 (0.0032) [2024-06-24 05:19:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9716023296. Throughput: 0: 42904.9. Samples: 9716151980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 05:19:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593020_9716039680.pth... [2024-06-24 05:19:43,514][15401] Updated weights for policy 0, policy_version 593020 (0.0037) [2024-06-24 05:19:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592394_9705783296.pth [2024-06-24 05:19:47,048][15401] Updated weights for policy 0, policy_version 593030 (0.0038) [2024-06-24 05:19:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9716252672. Throughput: 0: 42835.3. Samples: 9716402300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 05:19:51,015][15401] Updated weights for policy 0, policy_version 593040 (0.0027) [2024-06-24 05:19:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9716449280. Throughput: 0: 42787.2. Samples: 9716533240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 05:19:54,668][15401] Updated weights for policy 0, policy_version 593050 (0.0029) [2024-06-24 05:19:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 9716678656. Throughput: 0: 42824.5. Samples: 9716792680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:19:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 05:19:58,527][15401] Updated weights for policy 0, policy_version 593060 (0.0032) [2024-06-24 05:20:01,609][15349] Signal inference workers to stop experience collection... (144050 times) [2024-06-24 05:20:01,656][15401] InferenceWorker_p0-w0: stopping experience collection (144050 times) [2024-06-24 05:20:01,666][15349] Signal inference workers to resume experience collection... (144050 times) [2024-06-24 05:20:01,675][15401] InferenceWorker_p0-w0: resuming experience collection (144050 times) [2024-06-24 05:20:02,614][15401] Updated weights for policy 0, policy_version 593070 (0.0029) [2024-06-24 05:20:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9716891648. Throughput: 0: 42736.7. Samples: 9717043960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:20:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 05:20:06,558][15401] Updated weights for policy 0, policy_version 593080 (0.0044) [2024-06-24 05:20:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9717088256. Throughput: 0: 42607.3. Samples: 9717174580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:20:08,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 05:20:10,272][15401] Updated weights for policy 0, policy_version 593090 (0.0033) [2024-06-24 05:20:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9717317632. Throughput: 0: 42745.3. Samples: 9717431920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:20:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 05:20:14,071][15401] Updated weights for policy 0, policy_version 593100 (0.0036) [2024-06-24 05:20:17,779][15401] Updated weights for policy 0, policy_version 593110 (0.0055) [2024-06-24 05:20:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9717530624. Throughput: 0: 42760.1. Samples: 9717690780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:20:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 05:20:21,444][15401] Updated weights for policy 0, policy_version 593120 (0.0038) [2024-06-24 05:20:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 9717743616. Throughput: 0: 42608.0. Samples: 9717817740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 05:20:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 05:20:25,855][15401] Updated weights for policy 0, policy_version 593130 (0.0035) [2024-06-24 05:20:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 9717956608. Throughput: 0: 42838.7. Samples: 9718079720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 05:20:29,005][15401] Updated weights for policy 0, policy_version 593140 (0.0042) [2024-06-24 05:20:33,326][15401] Updated weights for policy 0, policy_version 593150 (0.0035) [2024-06-24 05:20:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 9718169600. Throughput: 0: 43041.7. Samples: 9718339180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 05:20:36,488][15401] Updated weights for policy 0, policy_version 593160 (0.0043) [2024-06-24 05:20:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 9718382592. Throughput: 0: 42917.3. Samples: 9718464620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:38,392][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 05:20:40,859][15401] Updated weights for policy 0, policy_version 593170 (0.0029) [2024-06-24 05:20:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9718611968. Throughput: 0: 42912.4. Samples: 9718723740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:43,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 05:20:44,147][15401] Updated weights for policy 0, policy_version 593180 (0.0041) [2024-06-24 05:20:48,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9718808576. Throughput: 0: 43085.4. Samples: 9718982800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 05:20:48,538][15401] Updated weights for policy 0, policy_version 593190 (0.0034) [2024-06-24 05:20:51,781][15401] Updated weights for policy 0, policy_version 593200 (0.0052) [2024-06-24 05:20:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9719037952. Throughput: 0: 43123.9. Samples: 9719115160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 05:20:56,049][15401] Updated weights for policy 0, policy_version 593210 (0.0040) [2024-06-24 05:20:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9719250944. Throughput: 0: 42980.8. Samples: 9719366060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:20:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 05:20:59,463][15401] Updated weights for policy 0, policy_version 593220 (0.0024) [2024-06-24 05:21:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9719447552. Throughput: 0: 42888.2. Samples: 9719620760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:03,396][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 05:21:03,717][15401] Updated weights for policy 0, policy_version 593230 (0.0041) [2024-06-24 05:21:07,196][15401] Updated weights for policy 0, policy_version 593240 (0.0037) [2024-06-24 05:21:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9719676928. Throughput: 0: 42923.3. Samples: 9719749280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 05:21:11,291][15401] Updated weights for policy 0, policy_version 593250 (0.0032) [2024-06-24 05:21:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9719873536. Throughput: 0: 42837.0. Samples: 9720007380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 05:21:15,086][15401] Updated weights for policy 0, policy_version 593260 (0.0032) [2024-06-24 05:21:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9720119296. Throughput: 0: 42849.8. Samples: 9720267420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 05:21:18,686][15401] Updated weights for policy 0, policy_version 593270 (0.0037) [2024-06-24 05:21:22,656][15401] Updated weights for policy 0, policy_version 593280 (0.0036) [2024-06-24 05:21:23,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9720348672. Throughput: 0: 43045.8. Samples: 9720401580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 05:21:26,244][15401] Updated weights for policy 0, policy_version 593290 (0.0047) [2024-06-24 05:21:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9720512512. Throughput: 0: 42875.1. Samples: 9720653120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:28,394][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 05:21:30,154][15401] Updated weights for policy 0, policy_version 593300 (0.0044) [2024-06-24 05:21:33,165][15349] Signal inference workers to stop experience collection... (144100 times) [2024-06-24 05:21:33,195][15401] InferenceWorker_p0-w0: stopping experience collection (144100 times) [2024-06-24 05:21:33,283][15349] Signal inference workers to resume experience collection... (144100 times) [2024-06-24 05:21:33,283][15401] InferenceWorker_p0-w0: resuming experience collection (144100 times) [2024-06-24 05:21:33,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 9720741888. Throughput: 0: 42893.9. Samples: 9720913040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 05:21:33,738][15401] Updated weights for policy 0, policy_version 593310 (0.0040) [2024-06-24 05:21:37,874][15401] Updated weights for policy 0, policy_version 593320 (0.0026) [2024-06-24 05:21:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.2, 300 sec: 42820.9). Total num frames: 9720971264. Throughput: 0: 42741.4. Samples: 9721038520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 05:21:41,588][15401] Updated weights for policy 0, policy_version 593330 (0.0037) [2024-06-24 05:21:43,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 9721151488. Throughput: 0: 42803.2. Samples: 9721292200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 05:21:43,503][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593333_9721167872.pth... [2024-06-24 05:21:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000592706_9710895104.pth [2024-06-24 05:21:45,339][15401] Updated weights for policy 0, policy_version 593340 (0.0041) [2024-06-24 05:21:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9721397248. Throughput: 0: 42934.9. Samples: 9721552820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 05:21:49,606][15401] Updated weights for policy 0, policy_version 593350 (0.0043) [2024-06-24 05:21:52,991][15401] Updated weights for policy 0, policy_version 593360 (0.0035) [2024-06-24 05:21:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 9721626624. Throughput: 0: 43074.5. Samples: 9721687640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 05:21:57,296][15401] Updated weights for policy 0, policy_version 593370 (0.0050) [2024-06-24 05:21:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9721823232. Throughput: 0: 43011.8. Samples: 9721942920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 05:21:58,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 05:22:00,847][15401] Updated weights for policy 0, policy_version 593380 (0.0033) [2024-06-24 05:22:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9722052608. Throughput: 0: 42860.7. Samples: 9722196160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 05:22:04,908][15401] Updated weights for policy 0, policy_version 593390 (0.0038) [2024-06-24 05:22:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9722249216. Throughput: 0: 42818.7. Samples: 9722328420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 05:22:08,496][15401] Updated weights for policy 0, policy_version 593400 (0.0033) [2024-06-24 05:22:12,335][15401] Updated weights for policy 0, policy_version 593410 (0.0037) [2024-06-24 05:22:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42876.5). Total num frames: 9722462208. Throughput: 0: 42960.4. Samples: 9722586340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:13,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 05:22:16,034][15401] Updated weights for policy 0, policy_version 593420 (0.0035) [2024-06-24 05:22:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9722675200. Throughput: 0: 42895.5. Samples: 9722843320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:18,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 05:22:19,865][15401] Updated weights for policy 0, policy_version 593430 (0.0033) [2024-06-24 05:22:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9722888192. Throughput: 0: 43085.0. Samples: 9722977340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 05:22:24,022][15401] Updated weights for policy 0, policy_version 593440 (0.0036) [2024-06-24 05:22:27,447][15401] Updated weights for policy 0, policy_version 593450 (0.0034) [2024-06-24 05:22:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9723101184. Throughput: 0: 42917.3. Samples: 9723223480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 05:22:31,694][15401] Updated weights for policy 0, policy_version 593460 (0.0039) [2024-06-24 05:22:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 9723314176. Throughput: 0: 42831.5. Samples: 9723480240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:33,392][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 05:22:34,959][15401] Updated weights for policy 0, policy_version 593470 (0.0035) [2024-06-24 05:22:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9723510784. Throughput: 0: 42695.7. Samples: 9723608940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:38,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 05:22:39,363][15401] Updated weights for policy 0, policy_version 593480 (0.0039) [2024-06-24 05:22:42,629][15401] Updated weights for policy 0, policy_version 593490 (0.0039) [2024-06-24 05:22:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42931.9). Total num frames: 9723756544. Throughput: 0: 42539.6. Samples: 9723857200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 05:22:47,321][15401] Updated weights for policy 0, policy_version 593500 (0.0028) [2024-06-24 05:22:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9723953152. Throughput: 0: 42713.4. Samples: 9724118260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 05:22:50,324][15401] Updated weights for policy 0, policy_version 593510 (0.0029) [2024-06-24 05:22:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 9724166144. Throughput: 0: 42576.5. Samples: 9724244360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 05:22:54,905][15401] Updated weights for policy 0, policy_version 593520 (0.0033) [2024-06-24 05:22:57,951][15401] Updated weights for policy 0, policy_version 593530 (0.0036) [2024-06-24 05:22:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 9724395520. Throughput: 0: 42621.0. Samples: 9724504280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:22:58,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 05:23:02,216][15349] Signal inference workers to stop experience collection... (144150 times) [2024-06-24 05:23:02,255][15401] InferenceWorker_p0-w0: stopping experience collection (144150 times) [2024-06-24 05:23:02,266][15349] Signal inference workers to resume experience collection... (144150 times) [2024-06-24 05:23:02,275][15401] InferenceWorker_p0-w0: resuming experience collection (144150 times) [2024-06-24 05:23:02,404][15401] Updated weights for policy 0, policy_version 593540 (0.0037) [2024-06-24 05:23:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 9724575744. Throughput: 0: 42787.1. Samples: 9724768740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 05:23:05,541][15401] Updated weights for policy 0, policy_version 593550 (0.0027) [2024-06-24 05:23:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9724821504. Throughput: 0: 42533.3. Samples: 9724891340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 05:23:10,542][15401] Updated weights for policy 0, policy_version 593560 (0.0047) [2024-06-24 05:23:13,248][15401] Updated weights for policy 0, policy_version 593570 (0.0031) [2024-06-24 05:23:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9725050880. Throughput: 0: 42724.0. Samples: 9725146060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 05:23:18,159][15401] Updated weights for policy 0, policy_version 593580 (0.0037) [2024-06-24 05:23:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9725214720. Throughput: 0: 42860.6. Samples: 9725408960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 05:23:21,177][15401] Updated weights for policy 0, policy_version 593590 (0.0029) [2024-06-24 05:23:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9725444096. Throughput: 0: 42639.0. Samples: 9725527700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:23,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 05:23:25,615][15401] Updated weights for policy 0, policy_version 593600 (0.0025) [2024-06-24 05:23:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9725673472. Throughput: 0: 42877.7. Samples: 9725786700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:23:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 05:23:29,065][15401] Updated weights for policy 0, policy_version 593610 (0.0039) [2024-06-24 05:23:33,295][15401] Updated weights for policy 0, policy_version 593620 (0.0032) [2024-06-24 05:23:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9725870080. Throughput: 0: 42845.7. Samples: 9726046420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:33,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 05:23:36,692][15401] Updated weights for policy 0, policy_version 593630 (0.0031) [2024-06-24 05:23:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9726099456. Throughput: 0: 42790.3. Samples: 9726169920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:38,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 05:23:41,207][15401] Updated weights for policy 0, policy_version 593640 (0.0034) [2024-06-24 05:23:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9726312448. Throughput: 0: 42753.7. Samples: 9726428200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:43,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 05:23:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593647_9726312448.pth... [2024-06-24 05:23:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593020_9716039680.pth [2024-06-24 05:23:44,205][15401] Updated weights for policy 0, policy_version 593650 (0.0034) [2024-06-24 05:23:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9726492672. Throughput: 0: 42702.6. Samples: 9726690360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 05:23:48,785][15401] Updated weights for policy 0, policy_version 593660 (0.0035) [2024-06-24 05:23:51,881][15401] Updated weights for policy 0, policy_version 593670 (0.0043) [2024-06-24 05:23:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9726754816. Throughput: 0: 42597.8. Samples: 9726808240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 05:23:56,211][15401] Updated weights for policy 0, policy_version 593680 (0.0032) [2024-06-24 05:23:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9726935040. Throughput: 0: 42709.3. Samples: 9727067980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:23:58,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 05:24:00,159][15401] Updated weights for policy 0, policy_version 593690 (0.0035) [2024-06-24 05:24:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9727131648. Throughput: 0: 42503.9. Samples: 9727321640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 05:24:04,174][15401] Updated weights for policy 0, policy_version 593700 (0.0043) [2024-06-24 05:24:07,718][15401] Updated weights for policy 0, policy_version 593710 (0.0035) [2024-06-24 05:24:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9727377408. Throughput: 0: 42690.5. Samples: 9727448760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 05:24:11,931][15401] Updated weights for policy 0, policy_version 593720 (0.0037) [2024-06-24 05:24:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9727590400. Throughput: 0: 42638.3. Samples: 9727705420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 05:24:15,287][15401] Updated weights for policy 0, policy_version 593730 (0.0036) [2024-06-24 05:24:18,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.6). Total num frames: 9727787008. Throughput: 0: 42477.3. Samples: 9727957800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:18,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 05:24:19,327][15401] Updated weights for policy 0, policy_version 593740 (0.0033) [2024-06-24 05:24:22,855][15401] Updated weights for policy 0, policy_version 593750 (0.0041) [2024-06-24 05:24:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9728016384. Throughput: 0: 42659.9. Samples: 9728089620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 05:24:25,911][15349] Signal inference workers to stop experience collection... (144200 times) [2024-06-24 05:24:25,911][15349] Signal inference workers to resume experience collection... (144200 times) [2024-06-24 05:24:25,959][15401] InferenceWorker_p0-w0: stopping experience collection (144200 times) [2024-06-24 05:24:25,959][15401] InferenceWorker_p0-w0: resuming experience collection (144200 times) [2024-06-24 05:24:26,764][15401] Updated weights for policy 0, policy_version 593760 (0.0032) [2024-06-24 05:24:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9728229376. Throughput: 0: 42636.3. Samples: 9728346840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 05:24:30,700][15401] Updated weights for policy 0, policy_version 593770 (0.0041) [2024-06-24 05:24:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 9728425984. Throughput: 0: 42473.3. Samples: 9728601660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 05:24:34,284][15401] Updated weights for policy 0, policy_version 593780 (0.0043) [2024-06-24 05:24:38,391][15132] Fps is (10 sec: 39316.9, 60 sec: 42051.3, 300 sec: 42709.3). Total num frames: 9728622592. Throughput: 0: 42717.4. Samples: 9728730580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:38,391][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 05:24:38,551][15401] Updated weights for policy 0, policy_version 593790 (0.0027) [2024-06-24 05:24:41,785][15401] Updated weights for policy 0, policy_version 593800 (0.0035) [2024-06-24 05:24:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9728868352. Throughput: 0: 42620.9. Samples: 9728986020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:43,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 05:24:46,062][15401] Updated weights for policy 0, policy_version 593810 (0.0037) [2024-06-24 05:24:48,389][15132] Fps is (10 sec: 45881.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9729081344. Throughput: 0: 42613.8. Samples: 9729239260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:48,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 05:24:49,282][15401] Updated weights for policy 0, policy_version 593820 (0.0037) [2024-06-24 05:24:53,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 9729294336. Throughput: 0: 42719.3. Samples: 9729371240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:53,393][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 05:24:53,820][15401] Updated weights for policy 0, policy_version 593830 (0.0038) [2024-06-24 05:24:56,809][15401] Updated weights for policy 0, policy_version 593840 (0.0033) [2024-06-24 05:24:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9729507328. Throughput: 0: 42676.8. Samples: 9729625880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:24:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 05:25:01,504][15401] Updated weights for policy 0, policy_version 593850 (0.0029) [2024-06-24 05:25:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9729720320. Throughput: 0: 42866.7. Samples: 9729886800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 05:25:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 05:25:04,467][15401] Updated weights for policy 0, policy_version 593860 (0.0031) [2024-06-24 05:25:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9729933312. Throughput: 0: 42674.3. Samples: 9730009960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:08,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 05:25:09,132][15401] Updated weights for policy 0, policy_version 593870 (0.0039) [2024-06-24 05:25:11,938][15401] Updated weights for policy 0, policy_version 593880 (0.0036) [2024-06-24 05:25:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9730162688. Throughput: 0: 42525.0. Samples: 9730260460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 05:25:16,778][15401] Updated weights for policy 0, policy_version 593890 (0.0029) [2024-06-24 05:25:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9730359296. Throughput: 0: 42711.6. Samples: 9730523680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 05:25:19,941][15401] Updated weights for policy 0, policy_version 593900 (0.0033) [2024-06-24 05:25:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 9730572288. Throughput: 0: 42773.9. Samples: 9730655340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 05:25:24,214][15401] Updated weights for policy 0, policy_version 593910 (0.0029) [2024-06-24 05:25:27,449][15401] Updated weights for policy 0, policy_version 593920 (0.0027) [2024-06-24 05:25:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9730801664. Throughput: 0: 42868.5. Samples: 9730915000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 05:25:31,864][15401] Updated weights for policy 0, policy_version 593930 (0.0051) [2024-06-24 05:25:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9730998272. Throughput: 0: 42988.8. Samples: 9731173760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 05:25:35,063][15401] Updated weights for policy 0, policy_version 593940 (0.0026) [2024-06-24 05:25:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43145.5, 300 sec: 42709.5). Total num frames: 9731211264. Throughput: 0: 42948.6. Samples: 9731303820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 05:25:39,303][15401] Updated weights for policy 0, policy_version 593950 (0.0023) [2024-06-24 05:25:42,741][15401] Updated weights for policy 0, policy_version 593960 (0.0032) [2024-06-24 05:25:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 9731457024. Throughput: 0: 43038.4. Samples: 9731562600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 05:25:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593961_9731457024.pth... [2024-06-24 05:25:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593333_9721167872.pth [2024-06-24 05:25:47,017][15401] Updated weights for policy 0, policy_version 593970 (0.0039) [2024-06-24 05:25:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 9731637248. Throughput: 0: 43002.1. Samples: 9731821900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 05:25:50,403][15401] Updated weights for policy 0, policy_version 593980 (0.0038) [2024-06-24 05:25:53,390][15132] Fps is (10 sec: 39319.8, 60 sec: 42599.9, 300 sec: 42709.4). Total num frames: 9731850240. Throughput: 0: 42960.6. Samples: 9731943200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 05:25:54,781][15401] Updated weights for policy 0, policy_version 593990 (0.0041) [2024-06-24 05:25:58,083][15401] Updated weights for policy 0, policy_version 594000 (0.0034) [2024-06-24 05:25:58,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 9732112384. Throughput: 0: 43224.9. Samples: 9732205580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:25:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 05:26:02,393][15349] Signal inference workers to stop experience collection... (144250 times) [2024-06-24 05:26:02,442][15401] InferenceWorker_p0-w0: stopping experience collection (144250 times) [2024-06-24 05:26:02,514][15349] Signal inference workers to resume experience collection... (144250 times) [2024-06-24 05:26:02,515][15401] InferenceWorker_p0-w0: resuming experience collection (144250 times) [2024-06-24 05:26:02,647][15401] Updated weights for policy 0, policy_version 594010 (0.0035) [2024-06-24 05:26:03,389][15132] Fps is (10 sec: 44238.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9732292608. Throughput: 0: 43116.9. Samples: 9732463940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 05:26:05,839][15401] Updated weights for policy 0, policy_version 594020 (0.0026) [2024-06-24 05:26:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9732505600. Throughput: 0: 42885.6. Samples: 9732585200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 05:26:10,131][15401] Updated weights for policy 0, policy_version 594030 (0.0023) [2024-06-24 05:26:13,381][15401] Updated weights for policy 0, policy_version 594040 (0.0043) [2024-06-24 05:26:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9732751360. Throughput: 0: 43021.8. Samples: 9732850980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 05:26:17,591][15401] Updated weights for policy 0, policy_version 594050 (0.0028) [2024-06-24 05:26:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9732947968. Throughput: 0: 43047.1. Samples: 9733110880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 05:26:21,006][15401] Updated weights for policy 0, policy_version 594060 (0.0025) [2024-06-24 05:26:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9733160960. Throughput: 0: 42927.5. Samples: 9733235560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 05:26:25,265][15401] Updated weights for policy 0, policy_version 594070 (0.0028) [2024-06-24 05:26:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9733373952. Throughput: 0: 43028.4. Samples: 9733498880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 05:26:28,651][15401] Updated weights for policy 0, policy_version 594080 (0.0035) [2024-06-24 05:26:32,858][15401] Updated weights for policy 0, policy_version 594090 (0.0023) [2024-06-24 05:26:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9733586944. Throughput: 0: 43035.7. Samples: 9733758500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 05:26:36,195][15401] Updated weights for policy 0, policy_version 594100 (0.0025) [2024-06-24 05:26:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9733799936. Throughput: 0: 43046.1. Samples: 9733880260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 05:26:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 05:26:40,508][15401] Updated weights for policy 0, policy_version 594110 (0.0044) [2024-06-24 05:26:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 9734029312. Throughput: 0: 43004.8. Samples: 9734140800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:26:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 05:26:43,868][15401] Updated weights for policy 0, policy_version 594120 (0.0041) [2024-06-24 05:26:48,305][15401] Updated weights for policy 0, policy_version 594130 (0.0037) [2024-06-24 05:26:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 9734225920. Throughput: 0: 42899.9. Samples: 9734394440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:26:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 05:26:51,636][15401] Updated weights for policy 0, policy_version 594140 (0.0036) [2024-06-24 05:26:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 9734438912. Throughput: 0: 43121.5. Samples: 9734525660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:26:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 05:26:56,018][15401] Updated weights for policy 0, policy_version 594150 (0.0038) [2024-06-24 05:26:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9734651904. Throughput: 0: 42901.8. Samples: 9734781560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:26:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 05:26:59,242][15401] Updated weights for policy 0, policy_version 594160 (0.0026) [2024-06-24 05:27:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9734848512. Throughput: 0: 42759.5. Samples: 9735035060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:03,390][15132] Avg episode reward: [(0, '0.134')] [2024-06-24 05:27:03,731][15401] Updated weights for policy 0, policy_version 594170 (0.0036) [2024-06-24 05:27:06,762][15401] Updated weights for policy 0, policy_version 594180 (0.0034) [2024-06-24 05:27:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9735094272. Throughput: 0: 42747.1. Samples: 9735159180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:08,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 05:27:11,341][15401] Updated weights for policy 0, policy_version 594190 (0.0033) [2024-06-24 05:27:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9735290880. Throughput: 0: 42652.0. Samples: 9735418220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 05:27:14,496][15401] Updated weights for policy 0, policy_version 594200 (0.0038) [2024-06-24 05:27:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9735487488. Throughput: 0: 42659.5. Samples: 9735678180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 05:27:18,747][15349] Signal inference workers to stop experience collection... (144300 times) [2024-06-24 05:27:18,747][15349] Signal inference workers to resume experience collection... (144300 times) [2024-06-24 05:27:18,789][15401] InferenceWorker_p0-w0: stopping experience collection (144300 times) [2024-06-24 05:27:18,789][15401] InferenceWorker_p0-w0: resuming experience collection (144300 times) [2024-06-24 05:27:18,891][15401] Updated weights for policy 0, policy_version 594210 (0.0029) [2024-06-24 05:27:22,077][15401] Updated weights for policy 0, policy_version 594220 (0.0031) [2024-06-24 05:27:23,391][15132] Fps is (10 sec: 45869.9, 60 sec: 43143.7, 300 sec: 42875.9). Total num frames: 9735749632. Throughput: 0: 42649.2. Samples: 9735799520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:23,391][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 05:27:26,942][15401] Updated weights for policy 0, policy_version 594230 (0.0049) [2024-06-24 05:27:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9735929856. Throughput: 0: 42647.3. Samples: 9736059920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 05:27:30,258][15401] Updated weights for policy 0, policy_version 594240 (0.0029) [2024-06-24 05:27:33,390][15132] Fps is (10 sec: 39325.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9736142848. Throughput: 0: 42645.3. Samples: 9736313480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 05:27:34,578][15401] Updated weights for policy 0, policy_version 594250 (0.0029) [2024-06-24 05:27:37,771][15401] Updated weights for policy 0, policy_version 594260 (0.0023) [2024-06-24 05:27:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9736388608. Throughput: 0: 42549.7. Samples: 9736440400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 05:27:42,122][15401] Updated weights for policy 0, policy_version 594270 (0.0023) [2024-06-24 05:27:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9736568832. Throughput: 0: 42715.2. Samples: 9736703740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 05:27:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594274_9736585216.pth... [2024-06-24 05:27:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593647_9726312448.pth [2024-06-24 05:27:45,177][15401] Updated weights for policy 0, policy_version 594280 (0.0043) [2024-06-24 05:27:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9736781824. Throughput: 0: 42697.7. Samples: 9736956460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 05:27:49,795][15401] Updated weights for policy 0, policy_version 594290 (0.0035) [2024-06-24 05:27:52,740][15401] Updated weights for policy 0, policy_version 594300 (0.0028) [2024-06-24 05:27:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9737027584. Throughput: 0: 42900.3. Samples: 9737089700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:53,391][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 05:27:57,341][15401] Updated weights for policy 0, policy_version 594310 (0.0042) [2024-06-24 05:27:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9737207808. Throughput: 0: 42895.5. Samples: 9737348520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:27:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 05:28:00,656][15401] Updated weights for policy 0, policy_version 594320 (0.0048) [2024-06-24 05:28:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9737437184. Throughput: 0: 42778.6. Samples: 9737603220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:28:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 05:28:04,891][15401] Updated weights for policy 0, policy_version 594330 (0.0024) [2024-06-24 05:28:08,172][15401] Updated weights for policy 0, policy_version 594340 (0.0027) [2024-06-24 05:28:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9737666560. Throughput: 0: 43098.4. Samples: 9737738900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 05:28:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 05:28:12,318][15401] Updated weights for policy 0, policy_version 594350 (0.0041) [2024-06-24 05:28:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9737863168. Throughput: 0: 43036.0. Samples: 9737996540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 05:28:16,273][15401] Updated weights for policy 0, policy_version 594360 (0.0022) [2024-06-24 05:28:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9738076160. Throughput: 0: 42970.2. Samples: 9738247140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 05:28:19,852][15401] Updated weights for policy 0, policy_version 594370 (0.0040) [2024-06-24 05:28:23,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42597.5, 300 sec: 42820.2). Total num frames: 9738305536. Throughput: 0: 43119.4. Samples: 9738380880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:23,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 05:28:23,824][15401] Updated weights for policy 0, policy_version 594380 (0.0033) [2024-06-24 05:28:27,348][15401] Updated weights for policy 0, policy_version 594390 (0.0029) [2024-06-24 05:28:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 9738502144. Throughput: 0: 42930.5. Samples: 9738635720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:28,392][15132] Avg episode reward: [(0, '0.200')] [2024-06-24 05:28:29,041][15349] Signal inference workers to stop experience collection... (144350 times) [2024-06-24 05:28:29,042][15349] Signal inference workers to resume experience collection... (144350 times) [2024-06-24 05:28:29,073][15401] InferenceWorker_p0-w0: stopping experience collection (144350 times) [2024-06-24 05:28:29,073][15401] InferenceWorker_p0-w0: resuming experience collection (144350 times) [2024-06-24 05:28:31,602][15401] Updated weights for policy 0, policy_version 594400 (0.0028) [2024-06-24 05:28:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9738731520. Throughput: 0: 43014.2. Samples: 9738892100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:33,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 05:28:35,249][15401] Updated weights for policy 0, policy_version 594410 (0.0035) [2024-06-24 05:28:38,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9738944512. Throughput: 0: 43100.4. Samples: 9739029320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:38,393][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 05:28:39,189][15401] Updated weights for policy 0, policy_version 594420 (0.0032) [2024-06-24 05:28:42,803][15401] Updated weights for policy 0, policy_version 594430 (0.0028) [2024-06-24 05:28:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9739141120. Throughput: 0: 42876.1. Samples: 9739277940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:28:46,798][15401] Updated weights for policy 0, policy_version 594440 (0.0042) [2024-06-24 05:28:48,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9739370496. Throughput: 0: 43067.2. Samples: 9739541240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 05:28:50,260][15401] Updated weights for policy 0, policy_version 594450 (0.0041) [2024-06-24 05:28:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9739599872. Throughput: 0: 42963.1. Samples: 9739672240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 05:28:54,314][15401] Updated weights for policy 0, policy_version 594460 (0.0046) [2024-06-24 05:28:57,980][15401] Updated weights for policy 0, policy_version 594470 (0.0037) [2024-06-24 05:28:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9739796480. Throughput: 0: 42847.5. Samples: 9739924680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:28:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 05:29:01,923][15401] Updated weights for policy 0, policy_version 594480 (0.0026) [2024-06-24 05:29:03,391][15132] Fps is (10 sec: 40953.1, 60 sec: 42870.3, 300 sec: 42820.3). Total num frames: 9740009472. Throughput: 0: 43116.3. Samples: 9740187440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:03,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 05:29:05,685][15401] Updated weights for policy 0, policy_version 594490 (0.0029) [2024-06-24 05:29:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9740238848. Throughput: 0: 43055.7. Samples: 9740318280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:29:09,429][15401] Updated weights for policy 0, policy_version 594500 (0.0033) [2024-06-24 05:29:13,379][15401] Updated weights for policy 0, policy_version 594510 (0.0019) [2024-06-24 05:29:13,390][15132] Fps is (10 sec: 44243.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9740451840. Throughput: 0: 43110.2. Samples: 9740575580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 05:29:17,014][15401] Updated weights for policy 0, policy_version 594520 (0.0026) [2024-06-24 05:29:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9740648448. Throughput: 0: 43106.8. Samples: 9740831900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 05:29:20,960][15401] Updated weights for policy 0, policy_version 594530 (0.0044) [2024-06-24 05:29:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42931.7). Total num frames: 9740894208. Throughput: 0: 42901.4. Samples: 9740959780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 05:29:24,585][15401] Updated weights for policy 0, policy_version 594540 (0.0035) [2024-06-24 05:29:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9741074432. Throughput: 0: 43010.2. Samples: 9741213400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:29:28,684][15401] Updated weights for policy 0, policy_version 594550 (0.0044) [2024-06-24 05:29:32,720][15401] Updated weights for policy 0, policy_version 594560 (0.0021) [2024-06-24 05:29:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42931.8). Total num frames: 9741287424. Throughput: 0: 42955.6. Samples: 9741474240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 05:29:36,366][15401] Updated weights for policy 0, policy_version 594570 (0.0040) [2024-06-24 05:29:38,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9741533184. Throughput: 0: 42788.8. Samples: 9741597840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:38,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 05:29:40,275][15401] Updated weights for policy 0, policy_version 594580 (0.0041) [2024-06-24 05:29:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9741729792. Throughput: 0: 42932.6. Samples: 9741856660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 05:29:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 05:29:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594588_9741729792.pth... [2024-06-24 05:29:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000593961_9731457024.pth [2024-06-24 05:29:43,983][15401] Updated weights for policy 0, policy_version 594590 (0.0024) [2024-06-24 05:29:47,092][15349] Signal inference workers to stop experience collection... (144400 times) [2024-06-24 05:29:47,092][15349] Signal inference workers to resume experience collection... (144400 times) [2024-06-24 05:29:47,139][15401] InferenceWorker_p0-w0: stopping experience collection (144400 times) [2024-06-24 05:29:47,139][15401] InferenceWorker_p0-w0: resuming experience collection (144400 times) [2024-06-24 05:29:47,990][15401] Updated weights for policy 0, policy_version 594600 (0.0030) [2024-06-24 05:29:48,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9741926400. Throughput: 0: 42613.6. Samples: 9742104980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:29:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 05:29:51,663][15401] Updated weights for policy 0, policy_version 594610 (0.0035) [2024-06-24 05:29:53,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9742155776. Throughput: 0: 42524.4. Samples: 9742231880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:29:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 05:29:56,123][15401] Updated weights for policy 0, policy_version 594620 (0.0035) [2024-06-24 05:29:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9742352384. Throughput: 0: 42473.0. Samples: 9742486860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:29:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 05:29:59,457][15401] Updated weights for policy 0, policy_version 594630 (0.0032) [2024-06-24 05:30:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42326.6, 300 sec: 42765.0). Total num frames: 9742548992. Throughput: 0: 42559.5. Samples: 9742747080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 05:30:03,724][15401] Updated weights for policy 0, policy_version 594640 (0.0022) [2024-06-24 05:30:07,003][15401] Updated weights for policy 0, policy_version 594650 (0.0028) [2024-06-24 05:30:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9742794752. Throughput: 0: 42543.1. Samples: 9742874220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 05:30:11,259][15401] Updated weights for policy 0, policy_version 594660 (0.0029) [2024-06-24 05:30:13,390][15132] Fps is (10 sec: 44232.9, 60 sec: 42324.8, 300 sec: 42820.4). Total num frames: 9742991360. Throughput: 0: 42515.7. Samples: 9743126640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:13,391][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 05:30:14,612][15401] Updated weights for policy 0, policy_version 594670 (0.0046) [2024-06-24 05:30:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9743204352. Throughput: 0: 42403.6. Samples: 9743382400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 05:30:18,812][15401] Updated weights for policy 0, policy_version 594680 (0.0031) [2024-06-24 05:30:22,440][15401] Updated weights for policy 0, policy_version 594690 (0.0025) [2024-06-24 05:30:23,389][15132] Fps is (10 sec: 44240.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9743433728. Throughput: 0: 42569.5. Samples: 9743513360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 05:30:26,274][15401] Updated weights for policy 0, policy_version 594700 (0.0037) [2024-06-24 05:30:28,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 9743630336. Throughput: 0: 42541.9. Samples: 9743771140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:28,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 05:30:29,968][15401] Updated weights for policy 0, policy_version 594710 (0.0033) [2024-06-24 05:30:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9743859712. Throughput: 0: 42697.0. Samples: 9744026340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 05:30:33,727][15401] Updated weights for policy 0, policy_version 594720 (0.0047) [2024-06-24 05:30:37,449][15401] Updated weights for policy 0, policy_version 594730 (0.0037) [2024-06-24 05:30:38,392][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42764.7). Total num frames: 9744072704. Throughput: 0: 42916.9. Samples: 9744163240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:38,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 05:30:41,313][15401] Updated weights for policy 0, policy_version 594740 (0.0044) [2024-06-24 05:30:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 9744269312. Throughput: 0: 42739.7. Samples: 9744410140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 05:30:45,068][15401] Updated weights for policy 0, policy_version 594750 (0.0033) [2024-06-24 05:30:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9744498688. Throughput: 0: 42739.1. Samples: 9744670340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 05:30:49,104][15401] Updated weights for policy 0, policy_version 594760 (0.0029) [2024-06-24 05:30:52,569][15401] Updated weights for policy 0, policy_version 594770 (0.0036) [2024-06-24 05:30:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9744711680. Throughput: 0: 42850.7. Samples: 9744802500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 05:30:56,752][15401] Updated weights for policy 0, policy_version 594780 (0.0031) [2024-06-24 05:30:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9744924672. Throughput: 0: 42926.0. Samples: 9745058280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:30:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 05:31:00,263][15401] Updated weights for policy 0, policy_version 594790 (0.0033) [2024-06-24 05:31:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9745137664. Throughput: 0: 42914.9. Samples: 9745313580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:31:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 05:31:04,694][15401] Updated weights for policy 0, policy_version 594800 (0.0036) [2024-06-24 05:31:08,006][15401] Updated weights for policy 0, policy_version 594810 (0.0035) [2024-06-24 05:31:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9745383424. Throughput: 0: 42968.5. Samples: 9745446940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:31:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 05:31:09,901][15349] Signal inference workers to stop experience collection... (144450 times) [2024-06-24 05:31:09,908][15349] Signal inference workers to resume experience collection... (144450 times) [2024-06-24 05:31:09,952][15401] InferenceWorker_p0-w0: stopping experience collection (144450 times) [2024-06-24 05:31:09,952][15401] InferenceWorker_p0-w0: resuming experience collection (144450 times) [2024-06-24 05:31:12,237][15401] Updated weights for policy 0, policy_version 594820 (0.0046) [2024-06-24 05:31:13,394][15132] Fps is (10 sec: 44216.0, 60 sec: 43141.6, 300 sec: 42819.8). Total num frames: 9745580032. Throughput: 0: 42845.7. Samples: 9745699300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:31:13,395][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 05:31:15,802][15401] Updated weights for policy 0, policy_version 594830 (0.0032) [2024-06-24 05:31:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9745809408. Throughput: 0: 42899.0. Samples: 9745956800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 05:31:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 05:31:19,689][15401] Updated weights for policy 0, policy_version 594840 (0.0040) [2024-06-24 05:31:23,383][15401] Updated weights for policy 0, policy_version 594850 (0.0031) [2024-06-24 05:31:23,389][15132] Fps is (10 sec: 44258.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9746022400. Throughput: 0: 42827.6. Samples: 9746090380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 05:31:27,397][15401] Updated weights for policy 0, policy_version 594860 (0.0031) [2024-06-24 05:31:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 9746235392. Throughput: 0: 43081.7. Samples: 9746348820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 05:31:30,896][15401] Updated weights for policy 0, policy_version 594870 (0.0026) [2024-06-24 05:31:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9746448384. Throughput: 0: 42874.5. Samples: 9746599700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:33,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 05:31:35,168][15401] Updated weights for policy 0, policy_version 594880 (0.0038) [2024-06-24 05:31:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 9746644992. Throughput: 0: 42859.9. Samples: 9746731200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 05:31:38,666][15401] Updated weights for policy 0, policy_version 594890 (0.0033) [2024-06-24 05:31:42,961][15401] Updated weights for policy 0, policy_version 594900 (0.0039) [2024-06-24 05:31:43,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9746841600. Throughput: 0: 42910.0. Samples: 9746989220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 05:31:43,599][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594902_9746874368.pth... [2024-06-24 05:31:43,658][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594274_9736585216.pth [2024-06-24 05:31:46,383][15401] Updated weights for policy 0, policy_version 594910 (0.0032) [2024-06-24 05:31:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9747103744. Throughput: 0: 42799.6. Samples: 9747239560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 05:31:50,464][15401] Updated weights for policy 0, policy_version 594920 (0.0028) [2024-06-24 05:31:53,393][15132] Fps is (10 sec: 44218.9, 60 sec: 42868.6, 300 sec: 42820.0). Total num frames: 9747283968. Throughput: 0: 42747.7. Samples: 9747370760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:53,394][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 05:31:54,211][15401] Updated weights for policy 0, policy_version 594930 (0.0033) [2024-06-24 05:31:58,107][15401] Updated weights for policy 0, policy_version 594940 (0.0041) [2024-06-24 05:31:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9747496960. Throughput: 0: 42932.1. Samples: 9747631040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:31:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 05:32:01,766][15401] Updated weights for policy 0, policy_version 594950 (0.0027) [2024-06-24 05:32:03,389][15132] Fps is (10 sec: 42615.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9747709952. Throughput: 0: 42858.7. Samples: 9747885440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 05:32:05,621][15401] Updated weights for policy 0, policy_version 594960 (0.0028) [2024-06-24 05:32:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.1, 300 sec: 42820.5). Total num frames: 9747922944. Throughput: 0: 42667.3. Samples: 9748010420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 05:32:09,284][15401] Updated weights for policy 0, policy_version 594970 (0.0024) [2024-06-24 05:32:13,243][15401] Updated weights for policy 0, policy_version 594980 (0.0045) [2024-06-24 05:32:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42874.9, 300 sec: 42931.6). Total num frames: 9748152320. Throughput: 0: 42672.4. Samples: 9748269080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 05:32:17,023][15401] Updated weights for policy 0, policy_version 594990 (0.0033) [2024-06-24 05:32:18,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42325.4, 300 sec: 42709.7). Total num frames: 9748348928. Throughput: 0: 42919.7. Samples: 9748531080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:32:20,806][15401] Updated weights for policy 0, policy_version 595000 (0.0034) [2024-06-24 05:32:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9748578304. Throughput: 0: 42731.7. Samples: 9748654120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 05:32:24,526][15401] Updated weights for policy 0, policy_version 595010 (0.0038) [2024-06-24 05:32:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9748774912. Throughput: 0: 42644.8. Samples: 9748908240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:32:28,552][15401] Updated weights for policy 0, policy_version 595020 (0.0035) [2024-06-24 05:32:30,358][15349] Signal inference workers to stop experience collection... (144500 times) [2024-06-24 05:32:30,358][15349] Signal inference workers to resume experience collection... (144500 times) [2024-06-24 05:32:30,398][15401] InferenceWorker_p0-w0: stopping experience collection (144500 times) [2024-06-24 05:32:30,398][15401] InferenceWorker_p0-w0: resuming experience collection (144500 times) [2024-06-24 05:32:32,241][15401] Updated weights for policy 0, policy_version 595030 (0.0034) [2024-06-24 05:32:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9748987904. Throughput: 0: 42938.8. Samples: 9749171800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 05:32:36,254][15401] Updated weights for policy 0, policy_version 595040 (0.0041) [2024-06-24 05:32:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9749217280. Throughput: 0: 42828.1. Samples: 9749297860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:38,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 05:32:40,296][15401] Updated weights for policy 0, policy_version 595050 (0.0032) [2024-06-24 05:32:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9749430272. Throughput: 0: 42606.2. Samples: 9749548320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 05:32:44,127][15401] Updated weights for policy 0, policy_version 595060 (0.0037) [2024-06-24 05:32:47,839][15401] Updated weights for policy 0, policy_version 595070 (0.0035) [2024-06-24 05:32:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9749643264. Throughput: 0: 42537.8. Samples: 9749799640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 05:32:51,827][15401] Updated weights for policy 0, policy_version 595080 (0.0035) [2024-06-24 05:32:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42874.3, 300 sec: 42876.1). Total num frames: 9749856256. Throughput: 0: 42652.2. Samples: 9749929760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:32:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 05:32:55,256][15401] Updated weights for policy 0, policy_version 595090 (0.0030) [2024-06-24 05:32:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9750052864. Throughput: 0: 42669.4. Samples: 9750189200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:32:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 05:32:59,549][15401] Updated weights for policy 0, policy_version 595100 (0.0030) [2024-06-24 05:33:03,359][15401] Updated weights for policy 0, policy_version 595110 (0.0028) [2024-06-24 05:33:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9750282240. Throughput: 0: 42424.7. Samples: 9750440200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 05:33:07,211][15401] Updated weights for policy 0, policy_version 595120 (0.0041) [2024-06-24 05:33:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 9750495232. Throughput: 0: 42643.9. Samples: 9750573100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:08,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 05:33:11,034][15401] Updated weights for policy 0, policy_version 595130 (0.0036) [2024-06-24 05:33:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9750691840. Throughput: 0: 42674.1. Samples: 9750828580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 05:33:14,715][15401] Updated weights for policy 0, policy_version 595140 (0.0033) [2024-06-24 05:33:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9750921216. Throughput: 0: 42437.3. Samples: 9751081480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 05:33:18,635][15401] Updated weights for policy 0, policy_version 595150 (0.0028) [2024-06-24 05:33:22,385][15401] Updated weights for policy 0, policy_version 595160 (0.0034) [2024-06-24 05:33:23,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9751134208. Throughput: 0: 42639.4. Samples: 9751216620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:23,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 05:33:26,241][15401] Updated weights for policy 0, policy_version 595170 (0.0038) [2024-06-24 05:33:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9751330816. Throughput: 0: 42659.2. Samples: 9751467980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:28,408][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 05:33:30,435][15401] Updated weights for policy 0, policy_version 595180 (0.0036) [2024-06-24 05:33:33,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 9751576576. Throughput: 0: 42702.6. Samples: 9751721260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 05:33:33,612][15401] Updated weights for policy 0, policy_version 595190 (0.0022) [2024-06-24 05:33:38,005][15401] Updated weights for policy 0, policy_version 595200 (0.0035) [2024-06-24 05:33:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9751789568. Throughput: 0: 42856.8. Samples: 9751858320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:38,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 05:33:41,456][15401] Updated weights for policy 0, policy_version 595210 (0.0031) [2024-06-24 05:33:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9751986176. Throughput: 0: 42730.5. Samples: 9752112080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 05:33:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595214_9751986176.pth... [2024-06-24 05:33:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594588_9741729792.pth [2024-06-24 05:33:45,642][15401] Updated weights for policy 0, policy_version 595220 (0.0036) [2024-06-24 05:33:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9752215552. Throughput: 0: 42776.2. Samples: 9752365120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:48,396][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 05:33:48,944][15401] Updated weights for policy 0, policy_version 595230 (0.0029) [2024-06-24 05:33:53,267][15401] Updated weights for policy 0, policy_version 595240 (0.0026) [2024-06-24 05:33:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9752412160. Throughput: 0: 42863.2. Samples: 9752501940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 05:33:56,358][15401] Updated weights for policy 0, policy_version 595250 (0.0029) [2024-06-24 05:33:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 9752625152. Throughput: 0: 42917.0. Samples: 9752759840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:33:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 05:34:00,676][15401] Updated weights for policy 0, policy_version 595260 (0.0033) [2024-06-24 05:34:03,375][15349] Signal inference workers to stop experience collection... (144550 times) [2024-06-24 05:34:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9752854528. Throughput: 0: 43142.6. Samples: 9753023000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:34:03,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 05:34:03,420][15401] InferenceWorker_p0-w0: stopping experience collection (144550 times) [2024-06-24 05:34:03,440][15349] Signal inference workers to resume experience collection... (144550 times) [2024-06-24 05:34:03,448][15401] InferenceWorker_p0-w0: resuming experience collection (144550 times) [2024-06-24 05:34:03,734][15401] Updated weights for policy 0, policy_version 595270 (0.0028) [2024-06-24 05:34:08,219][15401] Updated weights for policy 0, policy_version 595280 (0.0028) [2024-06-24 05:34:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9753067520. Throughput: 0: 42998.0. Samples: 9753151540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:34:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:34:11,320][15401] Updated weights for policy 0, policy_version 595290 (0.0037) [2024-06-24 05:34:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9753280512. Throughput: 0: 43074.6. Samples: 9753406340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:34:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 05:34:15,627][15401] Updated weights for policy 0, policy_version 595300 (0.0028) [2024-06-24 05:34:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9753509888. Throughput: 0: 43170.5. Samples: 9753663940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:34:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 05:34:19,028][15401] Updated weights for policy 0, policy_version 595310 (0.0037) [2024-06-24 05:34:23,232][15401] Updated weights for policy 0, policy_version 595320 (0.0036) [2024-06-24 05:34:23,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 9753739264. Throughput: 0: 42958.2. Samples: 9753791540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:34:23,393][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 05:34:27,049][15401] Updated weights for policy 0, policy_version 595330 (0.0036) [2024-06-24 05:34:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9753935872. Throughput: 0: 43106.3. Samples: 9754051860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:28,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 05:34:30,835][15401] Updated weights for policy 0, policy_version 595340 (0.0040) [2024-06-24 05:34:33,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 9754132480. Throughput: 0: 43142.2. Samples: 9754306520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 05:34:34,692][15401] Updated weights for policy 0, policy_version 595350 (0.0042) [2024-06-24 05:34:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9754361856. Throughput: 0: 42977.6. Samples: 9754435940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 05:34:38,698][15401] Updated weights for policy 0, policy_version 595360 (0.0035) [2024-06-24 05:34:42,232][15401] Updated weights for policy 0, policy_version 595370 (0.0038) [2024-06-24 05:34:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9754558464. Throughput: 0: 42844.5. Samples: 9754687840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 05:34:46,386][15401] Updated weights for policy 0, policy_version 595380 (0.0031) [2024-06-24 05:34:48,393][15132] Fps is (10 sec: 42584.7, 60 sec: 42869.0, 300 sec: 42820.1). Total num frames: 9754787840. Throughput: 0: 42779.5. Samples: 9754948120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:48,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:34:49,906][15401] Updated weights for policy 0, policy_version 595390 (0.0030) [2024-06-24 05:34:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9755000832. Throughput: 0: 42808.5. Samples: 9755077920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:53,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 05:34:54,296][15401] Updated weights for policy 0, policy_version 595400 (0.0041) [2024-06-24 05:34:57,649][15401] Updated weights for policy 0, policy_version 595410 (0.0028) [2024-06-24 05:34:58,389][15132] Fps is (10 sec: 42613.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9755213824. Throughput: 0: 42754.8. Samples: 9755330300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:34:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 05:35:01,986][15401] Updated weights for policy 0, policy_version 595420 (0.0024) [2024-06-24 05:35:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 9755443200. Throughput: 0: 42755.6. Samples: 9755587940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:35:05,129][15401] Updated weights for policy 0, policy_version 595430 (0.0025) [2024-06-24 05:35:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.8). Total num frames: 9755656192. Throughput: 0: 42917.0. Samples: 9755722700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 05:35:09,644][15401] Updated weights for policy 0, policy_version 595440 (0.0034) [2024-06-24 05:35:12,951][15401] Updated weights for policy 0, policy_version 595450 (0.0034) [2024-06-24 05:35:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9755852800. Throughput: 0: 42676.3. Samples: 9755972300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 05:35:15,615][15349] Signal inference workers to stop experience collection... (144600 times) [2024-06-24 05:35:15,615][15349] Signal inference workers to resume experience collection... (144600 times) [2024-06-24 05:35:15,660][15401] InferenceWorker_p0-w0: stopping experience collection (144600 times) [2024-06-24 05:35:15,661][15401] InferenceWorker_p0-w0: resuming experience collection (144600 times) [2024-06-24 05:35:17,274][15401] Updated weights for policy 0, policy_version 595460 (0.0037) [2024-06-24 05:35:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9756098560. Throughput: 0: 42736.5. Samples: 9756229660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 05:35:20,718][15401] Updated weights for policy 0, policy_version 595470 (0.0035) [2024-06-24 05:35:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.0, 300 sec: 42876.4). Total num frames: 9756278784. Throughput: 0: 42814.4. Samples: 9756362580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 05:35:25,091][15401] Updated weights for policy 0, policy_version 595480 (0.0031) [2024-06-24 05:35:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9756491776. Throughput: 0: 42883.1. Samples: 9756617580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 05:35:28,444][15401] Updated weights for policy 0, policy_version 595490 (0.0042) [2024-06-24 05:35:32,659][15401] Updated weights for policy 0, policy_version 595500 (0.0034) [2024-06-24 05:35:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 9756721152. Throughput: 0: 42834.6. Samples: 9756875540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 05:35:36,197][15401] Updated weights for policy 0, policy_version 595510 (0.0040) [2024-06-24 05:35:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9756917760. Throughput: 0: 42892.9. Samples: 9757008100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 05:35:40,083][15401] Updated weights for policy 0, policy_version 595520 (0.0026) [2024-06-24 05:35:43,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9757130752. Throughput: 0: 42933.4. Samples: 9757262300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 05:35:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595529_9757147136.pth... [2024-06-24 05:35:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000594902_9746874368.pth [2024-06-24 05:35:43,676][15401] Updated weights for policy 0, policy_version 595530 (0.0036) [2024-06-24 05:35:47,580][15401] Updated weights for policy 0, policy_version 595540 (0.0038) [2024-06-24 05:35:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.8, 300 sec: 42820.6). Total num frames: 9757343744. Throughput: 0: 42954.8. Samples: 9757520900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 05:35:51,470][15401] Updated weights for policy 0, policy_version 595550 (0.0027) [2024-06-24 05:35:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9757573120. Throughput: 0: 42760.3. Samples: 9757646920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:53,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-24 05:35:55,355][15401] Updated weights for policy 0, policy_version 595560 (0.0039) [2024-06-24 05:35:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9757769728. Throughput: 0: 42676.5. Samples: 9757892740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 05:35:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 05:35:59,488][15401] Updated weights for policy 0, policy_version 595570 (0.0028) [2024-06-24 05:36:02,962][15401] Updated weights for policy 0, policy_version 595580 (0.0041) [2024-06-24 05:36:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9757982720. Throughput: 0: 42619.5. Samples: 9758147540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:03,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 05:36:07,082][15401] Updated weights for policy 0, policy_version 595590 (0.0044) [2024-06-24 05:36:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42821.3). Total num frames: 9758212096. Throughput: 0: 42626.2. Samples: 9758280760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 05:36:10,448][15401] Updated weights for policy 0, policy_version 595600 (0.0033) [2024-06-24 05:36:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9758425088. Throughput: 0: 42570.3. Samples: 9758533240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 05:36:14,738][15401] Updated weights for policy 0, policy_version 595610 (0.0041) [2024-06-24 05:36:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9758621696. Throughput: 0: 42578.5. Samples: 9758791560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 05:36:18,508][15401] Updated weights for policy 0, policy_version 595620 (0.0038) [2024-06-24 05:36:22,379][15401] Updated weights for policy 0, policy_version 595630 (0.0036) [2024-06-24 05:36:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9758851072. Throughput: 0: 42454.7. Samples: 9758918560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 05:36:26,208][15401] Updated weights for policy 0, policy_version 595640 (0.0025) [2024-06-24 05:36:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 9759080448. Throughput: 0: 42627.8. Samples: 9759180560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 05:36:30,011][15401] Updated weights for policy 0, policy_version 595650 (0.0035) [2024-06-24 05:36:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9759260672. Throughput: 0: 42595.9. Samples: 9759437720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 05:36:33,744][15401] Updated weights for policy 0, policy_version 595660 (0.0026) [2024-06-24 05:36:37,536][15349] Signal inference workers to stop experience collection... (144650 times) [2024-06-24 05:36:37,540][15349] Signal inference workers to resume experience collection... (144650 times) [2024-06-24 05:36:37,560][15401] InferenceWorker_p0-w0: stopping experience collection (144650 times) [2024-06-24 05:36:37,590][15401] InferenceWorker_p0-w0: resuming experience collection (144650 times) [2024-06-24 05:36:37,678][15401] Updated weights for policy 0, policy_version 595670 (0.0032) [2024-06-24 05:36:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9759490048. Throughput: 0: 42477.9. Samples: 9759558420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 05:36:41,334][15401] Updated weights for policy 0, policy_version 595680 (0.0030) [2024-06-24 05:36:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9759703040. Throughput: 0: 42801.4. Samples: 9759818800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 05:36:45,554][15401] Updated weights for policy 0, policy_version 595690 (0.0037) [2024-06-24 05:36:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42710.1). Total num frames: 9759883264. Throughput: 0: 42884.9. Samples: 9760077360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:36:48,973][15401] Updated weights for policy 0, policy_version 595700 (0.0037) [2024-06-24 05:36:53,215][15401] Updated weights for policy 0, policy_version 595710 (0.0024) [2024-06-24 05:36:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9760112640. Throughput: 0: 42667.5. Samples: 9760200800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 05:36:56,520][15401] Updated weights for policy 0, policy_version 595720 (0.0026) [2024-06-24 05:36:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9760358400. Throughput: 0: 42683.5. Samples: 9760454000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:36:58,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 05:37:00,962][15401] Updated weights for policy 0, policy_version 595730 (0.0029) [2024-06-24 05:37:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9760538624. Throughput: 0: 42814.6. Samples: 9760718220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 05:37:04,160][15401] Updated weights for policy 0, policy_version 595740 (0.0031) [2024-06-24 05:37:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9760751616. Throughput: 0: 42673.8. Samples: 9760838880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 05:37:08,467][15401] Updated weights for policy 0, policy_version 595750 (0.0032) [2024-06-24 05:37:11,773][15401] Updated weights for policy 0, policy_version 595760 (0.0029) [2024-06-24 05:37:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9760997376. Throughput: 0: 42647.7. Samples: 9761099700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 05:37:16,075][15401] Updated weights for policy 0, policy_version 595770 (0.0043) [2024-06-24 05:37:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9761193984. Throughput: 0: 42683.7. Samples: 9761358480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 05:37:19,806][15401] Updated weights for policy 0, policy_version 595780 (0.0033) [2024-06-24 05:37:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9761390592. Throughput: 0: 42648.5. Samples: 9761477600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:23,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 05:37:23,963][15401] Updated weights for policy 0, policy_version 595790 (0.0034) [2024-06-24 05:37:27,459][15401] Updated weights for policy 0, policy_version 595800 (0.0037) [2024-06-24 05:37:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9761636352. Throughput: 0: 42666.7. Samples: 9761738800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:28,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 05:37:31,668][15401] Updated weights for policy 0, policy_version 595810 (0.0043) [2024-06-24 05:37:33,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9761832960. Throughput: 0: 42794.0. Samples: 9762003100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 05:37:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 05:37:35,214][15401] Updated weights for policy 0, policy_version 595820 (0.0037) [2024-06-24 05:37:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9762045952. Throughput: 0: 42673.4. Samples: 9762121100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:37:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 05:37:39,091][15401] Updated weights for policy 0, policy_version 595830 (0.0033) [2024-06-24 05:37:43,034][15401] Updated weights for policy 0, policy_version 595840 (0.0022) [2024-06-24 05:37:43,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9762258944. Throughput: 0: 42926.8. Samples: 9762385700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:37:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 05:37:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595842_9762275328.pth... [2024-06-24 05:37:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595214_9751986176.pth [2024-06-24 05:37:46,619][15401] Updated weights for policy 0, policy_version 595850 (0.0033) [2024-06-24 05:37:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9762471936. Throughput: 0: 42842.2. Samples: 9762646120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:37:48,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 05:37:50,483][15401] Updated weights for policy 0, policy_version 595860 (0.0027) [2024-06-24 05:37:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9762701312. Throughput: 0: 42856.0. Samples: 9762767400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:37:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 05:37:54,135][15401] Updated weights for policy 0, policy_version 595870 (0.0045) [2024-06-24 05:37:57,965][15349] Signal inference workers to stop experience collection... (144700 times) [2024-06-24 05:37:57,966][15349] Signal inference workers to resume experience collection... (144700 times) [2024-06-24 05:37:57,995][15401] InferenceWorker_p0-w0: stopping experience collection (144700 times) [2024-06-24 05:37:57,995][15401] InferenceWorker_p0-w0: resuming experience collection (144700 times) [2024-06-24 05:37:58,117][15401] Updated weights for policy 0, policy_version 595880 (0.0032) [2024-06-24 05:37:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9762914304. Throughput: 0: 43029.3. Samples: 9763036020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:37:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 05:38:01,955][15401] Updated weights for policy 0, policy_version 595890 (0.0028) [2024-06-24 05:38:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9763110912. Throughput: 0: 42968.0. Samples: 9763292040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 05:38:05,682][15401] Updated weights for policy 0, policy_version 595900 (0.0029) [2024-06-24 05:38:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9763340288. Throughput: 0: 43147.1. Samples: 9763419220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 05:38:09,494][15401] Updated weights for policy 0, policy_version 595910 (0.0021) [2024-06-24 05:38:13,256][15401] Updated weights for policy 0, policy_version 595920 (0.0031) [2024-06-24 05:38:13,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 9763553280. Throughput: 0: 43198.5. Samples: 9763682740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:13,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 05:38:17,399][15401] Updated weights for policy 0, policy_version 595930 (0.0036) [2024-06-24 05:38:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9763766272. Throughput: 0: 42939.8. Samples: 9763935380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:18,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 05:38:21,187][15401] Updated weights for policy 0, policy_version 595940 (0.0033) [2024-06-24 05:38:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 9763995648. Throughput: 0: 43197.3. Samples: 9764064980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:23,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 05:38:24,909][15401] Updated weights for policy 0, policy_version 595950 (0.0037) [2024-06-24 05:38:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 9764192256. Throughput: 0: 43039.4. Samples: 9764322480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 05:38:28,649][15401] Updated weights for policy 0, policy_version 595960 (0.0040) [2024-06-24 05:38:32,510][15401] Updated weights for policy 0, policy_version 595970 (0.0028) [2024-06-24 05:38:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9764388864. Throughput: 0: 42904.9. Samples: 9764576840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 05:38:36,014][15401] Updated weights for policy 0, policy_version 595980 (0.0035) [2024-06-24 05:38:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9764634624. Throughput: 0: 42954.6. Samples: 9764700360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 05:38:40,042][15401] Updated weights for policy 0, policy_version 595990 (0.0036) [2024-06-24 05:38:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9764831232. Throughput: 0: 42892.5. Samples: 9764966180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:43,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 05:38:43,579][15401] Updated weights for policy 0, policy_version 596000 (0.0029) [2024-06-24 05:38:48,111][15401] Updated weights for policy 0, policy_version 596010 (0.0031) [2024-06-24 05:38:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9765044224. Throughput: 0: 42793.8. Samples: 9765217760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 05:38:51,621][15401] Updated weights for policy 0, policy_version 596020 (0.0037) [2024-06-24 05:38:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9765273600. Throughput: 0: 42758.3. Samples: 9765343340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 05:38:55,559][15401] Updated weights for policy 0, policy_version 596030 (0.0037) [2024-06-24 05:38:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9765453824. Throughput: 0: 42620.2. Samples: 9765600640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:38:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 05:38:59,127][15401] Updated weights for policy 0, policy_version 596040 (0.0038) [2024-06-24 05:39:02,992][15401] Updated weights for policy 0, policy_version 596050 (0.0039) [2024-06-24 05:39:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9765683200. Throughput: 0: 42586.6. Samples: 9765851780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 05:39:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 05:39:07,027][15401] Updated weights for policy 0, policy_version 596060 (0.0042) [2024-06-24 05:39:08,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 9765896192. Throughput: 0: 42610.9. Samples: 9765982740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:08,396][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 05:39:10,441][15401] Updated weights for policy 0, policy_version 596070 (0.0036) [2024-06-24 05:39:13,391][15132] Fps is (10 sec: 40954.7, 60 sec: 42324.5, 300 sec: 42653.8). Total num frames: 9766092800. Throughput: 0: 42579.8. Samples: 9766238620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:13,391][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 05:39:14,452][15401] Updated weights for policy 0, policy_version 596080 (0.0045) [2024-06-24 05:39:15,400][15349] Signal inference workers to stop experience collection... (144750 times) [2024-06-24 05:39:15,410][15349] Signal inference workers to resume experience collection... (144750 times) [2024-06-24 05:39:15,414][15401] InferenceWorker_p0-w0: stopping experience collection (144750 times) [2024-06-24 05:39:15,444][15401] InferenceWorker_p0-w0: resuming experience collection (144750 times) [2024-06-24 05:39:18,294][15401] Updated weights for policy 0, policy_version 596090 (0.0030) [2024-06-24 05:39:18,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 9766338560. Throughput: 0: 42575.5. Samples: 9766492740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 05:39:22,497][15401] Updated weights for policy 0, policy_version 596100 (0.0029) [2024-06-24 05:39:23,390][15132] Fps is (10 sec: 45880.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9766551552. Throughput: 0: 42814.7. Samples: 9766627020. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 05:39:25,867][15401] Updated weights for policy 0, policy_version 596110 (0.0041) [2024-06-24 05:39:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9766731776. Throughput: 0: 42351.1. Samples: 9766871980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 05:39:30,097][15401] Updated weights for policy 0, policy_version 596120 (0.0039) [2024-06-24 05:39:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 9766977536. Throughput: 0: 42505.3. Samples: 9767130500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 05:39:33,513][15401] Updated weights for policy 0, policy_version 596130 (0.0036) [2024-06-24 05:39:37,686][15401] Updated weights for policy 0, policy_version 596140 (0.0040) [2024-06-24 05:39:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9767174144. Throughput: 0: 42755.9. Samples: 9767267360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:38,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 05:39:41,126][15401] Updated weights for policy 0, policy_version 596150 (0.0023) [2024-06-24 05:39:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42654.4). Total num frames: 9767370752. Throughput: 0: 42632.4. Samples: 9767519100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 05:39:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596153_9767370752.pth... [2024-06-24 05:39:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595529_9757147136.pth [2024-06-24 05:39:45,379][15401] Updated weights for policy 0, policy_version 596160 (0.0042) [2024-06-24 05:39:48,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42866.7, 300 sec: 42764.1). Total num frames: 9767616512. Throughput: 0: 42567.1. Samples: 9767767580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:48,397][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 05:39:48,703][15401] Updated weights for policy 0, policy_version 596170 (0.0022) [2024-06-24 05:39:53,087][15401] Updated weights for policy 0, policy_version 596180 (0.0037) [2024-06-24 05:39:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9767829504. Throughput: 0: 42836.3. Samples: 9767910100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:53,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 05:39:56,278][15401] Updated weights for policy 0, policy_version 596190 (0.0030) [2024-06-24 05:39:58,389][15132] Fps is (10 sec: 39347.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9768009728. Throughput: 0: 42680.4. Samples: 9768159180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:39:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 05:40:00,631][15401] Updated weights for policy 0, policy_version 596200 (0.0036) [2024-06-24 05:40:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9768255488. Throughput: 0: 42661.9. Samples: 9768412520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 05:40:04,091][15401] Updated weights for policy 0, policy_version 596210 (0.0026) [2024-06-24 05:40:08,382][15401] Updated weights for policy 0, policy_version 596220 (0.0030) [2024-06-24 05:40:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 9768468480. Throughput: 0: 42642.4. Samples: 9768545920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 05:40:12,133][15401] Updated weights for policy 0, policy_version 596230 (0.0032) [2024-06-24 05:40:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42872.3, 300 sec: 42598.4). Total num frames: 9768665088. Throughput: 0: 42726.6. Samples: 9768794680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:13,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 05:40:16,021][15401] Updated weights for policy 0, policy_version 596240 (0.0033) [2024-06-24 05:40:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 9768878080. Throughput: 0: 42621.8. Samples: 9769048480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 05:40:19,869][15401] Updated weights for policy 0, policy_version 596250 (0.0035) [2024-06-24 05:40:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9769091072. Throughput: 0: 42533.9. Samples: 9769181380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 05:40:23,619][15401] Updated weights for policy 0, policy_version 596260 (0.0039) [2024-06-24 05:40:27,582][15401] Updated weights for policy 0, policy_version 596270 (0.0039) [2024-06-24 05:40:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 9769304064. Throughput: 0: 42618.1. Samples: 9769436920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 05:40:31,641][15401] Updated weights for policy 0, policy_version 596280 (0.0035) [2024-06-24 05:40:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9769533440. Throughput: 0: 42694.2. Samples: 9769688540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 05:40:35,238][15401] Updated weights for policy 0, policy_version 596290 (0.0043) [2024-06-24 05:40:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9769746432. Throughput: 0: 42522.7. Samples: 9769823620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 05:40:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 05:40:39,057][15401] Updated weights for policy 0, policy_version 596300 (0.0033) [2024-06-24 05:40:42,977][15401] Updated weights for policy 0, policy_version 596310 (0.0040) [2024-06-24 05:40:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 9769943040. Throughput: 0: 42690.0. Samples: 9770080240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:40:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 05:40:46,577][15401] Updated weights for policy 0, policy_version 596320 (0.0046) [2024-06-24 05:40:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43149.3, 300 sec: 42820.6). Total num frames: 9770205184. Throughput: 0: 42728.5. Samples: 9770335300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:40:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 05:40:50,787][15401] Updated weights for policy 0, policy_version 596330 (0.0046) [2024-06-24 05:40:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9770385408. Throughput: 0: 42865.2. Samples: 9770474860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:40:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 05:40:54,063][15401] Updated weights for policy 0, policy_version 596340 (0.0030) [2024-06-24 05:40:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9770582016. Throughput: 0: 42858.4. Samples: 9770723300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:40:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 05:40:58,441][15401] Updated weights for policy 0, policy_version 596350 (0.0033) [2024-06-24 05:40:59,832][15349] Signal inference workers to stop experience collection... (144800 times) [2024-06-24 05:40:59,832][15349] Signal inference workers to resume experience collection... (144800 times) [2024-06-24 05:40:59,861][15401] InferenceWorker_p0-w0: stopping experience collection (144800 times) [2024-06-24 05:40:59,861][15401] InferenceWorker_p0-w0: resuming experience collection (144800 times) [2024-06-24 05:41:01,628][15401] Updated weights for policy 0, policy_version 596360 (0.0046) [2024-06-24 05:41:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9770844160. Throughput: 0: 42876.4. Samples: 9770977920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 05:41:06,139][15401] Updated weights for policy 0, policy_version 596370 (0.0036) [2024-06-24 05:41:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9771024384. Throughput: 0: 42881.7. Samples: 9771111060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 05:41:09,295][15401] Updated weights for policy 0, policy_version 596380 (0.0038) [2024-06-24 05:41:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9771237376. Throughput: 0: 42853.4. Samples: 9771365320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 05:41:13,914][15401] Updated weights for policy 0, policy_version 596390 (0.0023) [2024-06-24 05:41:17,138][15401] Updated weights for policy 0, policy_version 596400 (0.0029) [2024-06-24 05:41:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9771466752. Throughput: 0: 42794.6. Samples: 9771614300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 05:41:21,925][15401] Updated weights for policy 0, policy_version 596410 (0.0041) [2024-06-24 05:41:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 9771663360. Throughput: 0: 42864.8. Samples: 9771752540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 05:41:25,131][15401] Updated weights for policy 0, policy_version 596420 (0.0026) [2024-06-24 05:41:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9771876352. Throughput: 0: 42607.8. Samples: 9771997580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 05:41:29,486][15401] Updated weights for policy 0, policy_version 596430 (0.0033) [2024-06-24 05:41:32,787][15401] Updated weights for policy 0, policy_version 596440 (0.0033) [2024-06-24 05:41:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9772105728. Throughput: 0: 42631.0. Samples: 9772253700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 05:41:37,245][15401] Updated weights for policy 0, policy_version 596450 (0.0039) [2024-06-24 05:41:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9772302336. Throughput: 0: 42447.2. Samples: 9772384980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 05:41:40,326][15401] Updated weights for policy 0, policy_version 596460 (0.0033) [2024-06-24 05:41:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9772531712. Throughput: 0: 42598.6. Samples: 9772640240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:43,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 05:41:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596468_9772531712.pth... [2024-06-24 05:41:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000595842_9762275328.pth [2024-06-24 05:41:44,668][15401] Updated weights for policy 0, policy_version 596470 (0.0037) [2024-06-24 05:41:47,912][15401] Updated weights for policy 0, policy_version 596480 (0.0040) [2024-06-24 05:41:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9772744704. Throughput: 0: 42643.6. Samples: 9772896880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 05:41:52,375][15401] Updated weights for policy 0, policy_version 596490 (0.0034) [2024-06-24 05:41:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9772957696. Throughput: 0: 42500.8. Samples: 9773023600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:41:55,499][15401] Updated weights for policy 0, policy_version 596500 (0.0026) [2024-06-24 05:41:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9773170688. Throughput: 0: 42553.8. Samples: 9773280240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:41:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 05:41:59,819][15401] Updated weights for policy 0, policy_version 596510 (0.0029) [2024-06-24 05:42:02,969][15401] Updated weights for policy 0, policy_version 596520 (0.0035) [2024-06-24 05:42:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9773383680. Throughput: 0: 42878.3. Samples: 9773543820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:42:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 05:42:07,135][15349] Signal inference workers to stop experience collection... (144850 times) [2024-06-24 05:42:07,135][15349] Signal inference workers to resume experience collection... (144850 times) [2024-06-24 05:42:07,166][15401] InferenceWorker_p0-w0: stopping experience collection (144850 times) [2024-06-24 05:42:07,166][15401] InferenceWorker_p0-w0: resuming experience collection (144850 times) [2024-06-24 05:42:07,279][15401] Updated weights for policy 0, policy_version 596530 (0.0030) [2024-06-24 05:42:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9773580288. Throughput: 0: 42615.3. Samples: 9773670220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:42:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 05:42:10,812][15401] Updated weights for policy 0, policy_version 596540 (0.0038) [2024-06-24 05:42:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9773826048. Throughput: 0: 42892.4. Samples: 9773927740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-24 05:42:13,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 05:42:15,192][15401] Updated weights for policy 0, policy_version 596550 (0.0040) [2024-06-24 05:42:18,391][15132] Fps is (10 sec: 44230.7, 60 sec: 42597.5, 300 sec: 42820.3). Total num frames: 9774022656. Throughput: 0: 42978.8. Samples: 9774187800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:18,391][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 05:42:18,452][15401] Updated weights for policy 0, policy_version 596560 (0.0027) [2024-06-24 05:42:22,802][15401] Updated weights for policy 0, policy_version 596570 (0.0039) [2024-06-24 05:42:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9774219264. Throughput: 0: 42784.4. Samples: 9774310280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 05:42:25,886][15401] Updated weights for policy 0, policy_version 596580 (0.0024) [2024-06-24 05:42:28,390][15132] Fps is (10 sec: 44241.0, 60 sec: 43144.2, 300 sec: 42820.5). Total num frames: 9774465024. Throughput: 0: 42874.8. Samples: 9774569620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 05:42:30,338][15401] Updated weights for policy 0, policy_version 596590 (0.0035) [2024-06-24 05:42:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9774661632. Throughput: 0: 42991.5. Samples: 9774831500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 05:42:33,780][15401] Updated weights for policy 0, policy_version 596600 (0.0035) [2024-06-24 05:42:38,070][15401] Updated weights for policy 0, policy_version 596610 (0.0028) [2024-06-24 05:42:38,389][15132] Fps is (10 sec: 40961.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9774874624. Throughput: 0: 42973.1. Samples: 9774957380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 05:42:41,247][15401] Updated weights for policy 0, policy_version 596620 (0.0033) [2024-06-24 05:42:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9775104000. Throughput: 0: 42908.4. Samples: 9775211120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 05:42:45,788][15401] Updated weights for policy 0, policy_version 596630 (0.0033) [2024-06-24 05:42:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9775300608. Throughput: 0: 42735.5. Samples: 9775466920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:48,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 05:42:48,983][15401] Updated weights for policy 0, policy_version 596640 (0.0033) [2024-06-24 05:42:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9775497216. Throughput: 0: 42731.1. Samples: 9775593120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 05:42:53,446][15401] Updated weights for policy 0, policy_version 596650 (0.0034) [2024-06-24 05:42:56,487][15401] Updated weights for policy 0, policy_version 596660 (0.0041) [2024-06-24 05:42:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9775759360. Throughput: 0: 42773.3. Samples: 9775852540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:42:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 05:43:00,896][15401] Updated weights for policy 0, policy_version 596670 (0.0027) [2024-06-24 05:43:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9775939584. Throughput: 0: 42849.8. Samples: 9776115980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 05:43:03,925][15401] Updated weights for policy 0, policy_version 596680 (0.0042) [2024-06-24 05:43:08,343][15401] Updated weights for policy 0, policy_version 596690 (0.0031) [2024-06-24 05:43:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9776168960. Throughput: 0: 42925.7. Samples: 9776241940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 05:43:11,505][15401] Updated weights for policy 0, policy_version 596700 (0.0035) [2024-06-24 05:43:13,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9776398336. Throughput: 0: 42908.2. Samples: 9776500480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 05:43:16,114][15401] Updated weights for policy 0, policy_version 596710 (0.0027) [2024-06-24 05:43:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42870.7, 300 sec: 42709.1). Total num frames: 9776594944. Throughput: 0: 42851.0. Samples: 9776759900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:18,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 05:43:19,269][15401] Updated weights for policy 0, policy_version 596720 (0.0036) [2024-06-24 05:43:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9776791552. Throughput: 0: 42788.7. Samples: 9776882880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 05:43:23,760][15401] Updated weights for policy 0, policy_version 596730 (0.0032) [2024-06-24 05:43:27,025][15401] Updated weights for policy 0, policy_version 596740 (0.0029) [2024-06-24 05:43:28,395][15132] Fps is (10 sec: 44224.6, 60 sec: 42868.0, 300 sec: 42875.3). Total num frames: 9777037312. Throughput: 0: 42763.5. Samples: 9777135700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:28,395][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 05:43:31,470][15401] Updated weights for policy 0, policy_version 596750 (0.0032) [2024-06-24 05:43:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9777233920. Throughput: 0: 42907.2. Samples: 9777397740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 05:43:33,985][15349] Signal inference workers to stop experience collection... (144900 times) [2024-06-24 05:43:34,006][15401] InferenceWorker_p0-w0: stopping experience collection (144900 times) [2024-06-24 05:43:34,046][15349] Signal inference workers to resume experience collection... (144900 times) [2024-06-24 05:43:34,047][15401] InferenceWorker_p0-w0: resuming experience collection (144900 times) [2024-06-24 05:43:34,535][15401] Updated weights for policy 0, policy_version 596760 (0.0033) [2024-06-24 05:43:38,390][15132] Fps is (10 sec: 37702.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9777414144. Throughput: 0: 42891.4. Samples: 9777523240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:38,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 05:43:39,188][15401] Updated weights for policy 0, policy_version 596770 (0.0043) [2024-06-24 05:43:42,310][15401] Updated weights for policy 0, policy_version 596780 (0.0026) [2024-06-24 05:43:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9777659904. Throughput: 0: 42801.3. Samples: 9777778600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 05:43:43,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 05:43:43,450][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596782_9777676288.pth... [2024-06-24 05:43:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596153_9767370752.pth [2024-06-24 05:43:46,899][15401] Updated weights for policy 0, policy_version 596790 (0.0029) [2024-06-24 05:43:48,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9777889280. Throughput: 0: 42633.7. Samples: 9778034500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:43:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 05:43:49,935][15401] Updated weights for policy 0, policy_version 596800 (0.0032) [2024-06-24 05:43:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9778069504. Throughput: 0: 42738.7. Samples: 9778165180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:43:53,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 05:43:54,479][15401] Updated weights for policy 0, policy_version 596810 (0.0032) [2024-06-24 05:43:57,322][15401] Updated weights for policy 0, policy_version 596820 (0.0041) [2024-06-24 05:43:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9778315264. Throughput: 0: 42666.4. Samples: 9778420460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:43:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:44:01,951][15401] Updated weights for policy 0, policy_version 596830 (0.0047) [2024-06-24 05:44:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 9778511872. Throughput: 0: 42841.0. Samples: 9778687640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 05:44:05,096][15401] Updated weights for policy 0, policy_version 596840 (0.0033) [2024-06-24 05:44:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.7). Total num frames: 9778724864. Throughput: 0: 42914.3. Samples: 9778814020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 05:44:09,643][15401] Updated weights for policy 0, policy_version 596850 (0.0032) [2024-06-24 05:44:12,453][15401] Updated weights for policy 0, policy_version 596860 (0.0031) [2024-06-24 05:44:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9778970624. Throughput: 0: 42889.9. Samples: 9779065520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 05:44:17,561][15401] Updated weights for policy 0, policy_version 596870 (0.0039) [2024-06-24 05:44:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 9779150848. Throughput: 0: 42928.5. Samples: 9779329520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 05:44:20,482][15401] Updated weights for policy 0, policy_version 596880 (0.0035) [2024-06-24 05:44:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9779380224. Throughput: 0: 42839.2. Samples: 9779451000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 05:44:25,151][15401] Updated weights for policy 0, policy_version 596890 (0.0038) [2024-06-24 05:44:28,121][15401] Updated weights for policy 0, policy_version 596900 (0.0029) [2024-06-24 05:44:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42875.1, 300 sec: 42820.5). Total num frames: 9779609600. Throughput: 0: 42877.8. Samples: 9779708100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 05:44:32,635][15401] Updated weights for policy 0, policy_version 596910 (0.0031) [2024-06-24 05:44:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9779789824. Throughput: 0: 43056.0. Samples: 9779972020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 05:44:35,658][15401] Updated weights for policy 0, policy_version 596920 (0.0042) [2024-06-24 05:44:38,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9779986432. Throughput: 0: 42914.3. Samples: 9780096320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 05:44:40,218][15401] Updated weights for policy 0, policy_version 596930 (0.0036) [2024-06-24 05:44:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42821.5). Total num frames: 9780248576. Throughput: 0: 42854.3. Samples: 9780348900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 05:44:43,500][15401] Updated weights for policy 0, policy_version 596940 (0.0031) [2024-06-24 05:44:47,754][15401] Updated weights for policy 0, policy_version 596950 (0.0035) [2024-06-24 05:44:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 9780445184. Throughput: 0: 42644.6. Samples: 9780606640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 05:44:51,422][15401] Updated weights for policy 0, policy_version 596960 (0.0027) [2024-06-24 05:44:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9780641792. Throughput: 0: 42800.0. Samples: 9780740020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 05:44:55,505][15401] Updated weights for policy 0, policy_version 596970 (0.0028) [2024-06-24 05:44:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9780871168. Throughput: 0: 42802.6. Samples: 9780991640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:44:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 05:44:58,884][15349] Signal inference workers to stop experience collection... (144950 times) [2024-06-24 05:44:58,920][15401] InferenceWorker_p0-w0: stopping experience collection (144950 times) [2024-06-24 05:44:58,945][15349] Signal inference workers to resume experience collection... (144950 times) [2024-06-24 05:44:58,946][15401] InferenceWorker_p0-w0: resuming experience collection (144950 times) [2024-06-24 05:44:59,101][15401] Updated weights for policy 0, policy_version 596980 (0.0026) [2024-06-24 05:45:03,092][15401] Updated weights for policy 0, policy_version 596990 (0.0041) [2024-06-24 05:45:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9781084160. Throughput: 0: 42713.3. Samples: 9781251620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:45:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 05:45:06,649][15401] Updated weights for policy 0, policy_version 597000 (0.0031) [2024-06-24 05:45:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9781297152. Throughput: 0: 42905.2. Samples: 9781381740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:45:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 05:45:10,998][15401] Updated weights for policy 0, policy_version 597010 (0.0032) [2024-06-24 05:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 9781510144. Throughput: 0: 42779.2. Samples: 9781633160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:45:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 05:45:14,495][15401] Updated weights for policy 0, policy_version 597020 (0.0039) [2024-06-24 05:45:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9781723136. Throughput: 0: 42722.2. Samples: 9781894520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 05:45:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 05:45:18,442][15401] Updated weights for policy 0, policy_version 597030 (0.0032) [2024-06-24 05:45:22,139][15401] Updated weights for policy 0, policy_version 597040 (0.0027) [2024-06-24 05:45:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9781936128. Throughput: 0: 42795.5. Samples: 9782022120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 05:45:26,134][15401] Updated weights for policy 0, policy_version 597050 (0.0035) [2024-06-24 05:45:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9782165504. Throughput: 0: 42823.0. Samples: 9782275940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:28,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 05:45:29,658][15401] Updated weights for policy 0, policy_version 597060 (0.0025) [2024-06-24 05:45:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9782362112. Throughput: 0: 43003.1. Samples: 9782541780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 05:45:33,615][15401] Updated weights for policy 0, policy_version 597070 (0.0042) [2024-06-24 05:45:37,611][15401] Updated weights for policy 0, policy_version 597080 (0.0026) [2024-06-24 05:45:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9782591488. Throughput: 0: 42792.6. Samples: 9782665680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 05:45:41,162][15401] Updated weights for policy 0, policy_version 597090 (0.0034) [2024-06-24 05:45:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9782804480. Throughput: 0: 42855.2. Samples: 9782920120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 05:45:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597096_9782820864.pth... [2024-06-24 05:45:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596468_9772531712.pth [2024-06-24 05:45:45,437][15401] Updated weights for policy 0, policy_version 597100 (0.0035) [2024-06-24 05:45:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9783017472. Throughput: 0: 42845.3. Samples: 9783179660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 05:45:48,819][15401] Updated weights for policy 0, policy_version 597110 (0.0034) [2024-06-24 05:45:53,054][15401] Updated weights for policy 0, policy_version 597120 (0.0031) [2024-06-24 05:45:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9783230464. Throughput: 0: 42663.3. Samples: 9783301580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 05:45:56,668][15401] Updated weights for policy 0, policy_version 597130 (0.0039) [2024-06-24 05:45:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9783459840. Throughput: 0: 42861.4. Samples: 9783561920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:45:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 05:46:00,542][15401] Updated weights for policy 0, policy_version 597140 (0.0036) [2024-06-24 05:46:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9783656448. Throughput: 0: 42816.9. Samples: 9783821280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 05:46:04,282][15401] Updated weights for policy 0, policy_version 597150 (0.0036) [2024-06-24 05:46:08,193][15401] Updated weights for policy 0, policy_version 597160 (0.0038) [2024-06-24 05:46:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 9783885824. Throughput: 0: 42724.5. Samples: 9783944720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 05:46:11,910][15401] Updated weights for policy 0, policy_version 597170 (0.0040) [2024-06-24 05:46:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9784098816. Throughput: 0: 42858.4. Samples: 9784204560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 05:46:15,738][15401] Updated weights for policy 0, policy_version 597180 (0.0023) [2024-06-24 05:46:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9784295424. Throughput: 0: 42780.0. Samples: 9784466880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 05:46:19,474][15401] Updated weights for policy 0, policy_version 597190 (0.0039) [2024-06-24 05:46:21,488][15349] Signal inference workers to stop experience collection... (145000 times) [2024-06-24 05:46:21,488][15349] Signal inference workers to resume experience collection... (145000 times) [2024-06-24 05:46:21,513][15401] InferenceWorker_p0-w0: stopping experience collection (145000 times) [2024-06-24 05:46:21,513][15401] InferenceWorker_p0-w0: resuming experience collection (145000 times) [2024-06-24 05:46:23,272][15401] Updated weights for policy 0, policy_version 597200 (0.0041) [2024-06-24 05:46:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9784524800. Throughput: 0: 42841.8. Samples: 9784593560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 05:46:27,276][15401] Updated weights for policy 0, policy_version 597210 (0.0035) [2024-06-24 05:46:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9784737792. Throughput: 0: 42839.9. Samples: 9784847920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:28,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 05:46:30,991][15401] Updated weights for policy 0, policy_version 597220 (0.0043) [2024-06-24 05:46:33,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 9784918016. Throughput: 0: 42872.3. Samples: 9785109020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:33,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 05:46:34,934][15401] Updated weights for policy 0, policy_version 597230 (0.0026) [2024-06-24 05:46:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9785147392. Throughput: 0: 42836.5. Samples: 9785229220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:38,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 05:46:38,558][15401] Updated weights for policy 0, policy_version 597240 (0.0023) [2024-06-24 05:46:42,457][15401] Updated weights for policy 0, policy_version 597250 (0.0033) [2024-06-24 05:46:43,389][15132] Fps is (10 sec: 47525.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9785393152. Throughput: 0: 42946.2. Samples: 9785494500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 05:46:46,008][15401] Updated weights for policy 0, policy_version 597260 (0.0042) [2024-06-24 05:46:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9785573376. Throughput: 0: 43008.0. Samples: 9785756640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 05:46:50,022][15401] Updated weights for policy 0, policy_version 597270 (0.0039) [2024-06-24 05:46:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9785802752. Throughput: 0: 43009.8. Samples: 9785880160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 05:46:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 05:46:53,551][15401] Updated weights for policy 0, policy_version 597280 (0.0026) [2024-06-24 05:46:57,387][15401] Updated weights for policy 0, policy_version 597290 (0.0033) [2024-06-24 05:46:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9786032128. Throughput: 0: 43158.3. Samples: 9786146680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:46:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 05:47:01,644][15401] Updated weights for policy 0, policy_version 597300 (0.0042) [2024-06-24 05:47:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9786228736. Throughput: 0: 43132.4. Samples: 9786407840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:03,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 05:47:04,851][15401] Updated weights for policy 0, policy_version 597310 (0.0030) [2024-06-24 05:47:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9786474496. Throughput: 0: 42971.9. Samples: 9786527300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 05:47:09,121][15401] Updated weights for policy 0, policy_version 597320 (0.0030) [2024-06-24 05:47:12,768][15401] Updated weights for policy 0, policy_version 597330 (0.0030) [2024-06-24 05:47:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 9786687488. Throughput: 0: 43076.5. Samples: 9786786360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 05:47:16,749][15401] Updated weights for policy 0, policy_version 597340 (0.0028) [2024-06-24 05:47:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9786867712. Throughput: 0: 43123.7. Samples: 9787049480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 05:47:20,222][15401] Updated weights for policy 0, policy_version 597350 (0.0027) [2024-06-24 05:47:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 9787113472. Throughput: 0: 43172.3. Samples: 9787171980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 05:47:24,326][15401] Updated weights for policy 0, policy_version 597360 (0.0044) [2024-06-24 05:47:27,989][15401] Updated weights for policy 0, policy_version 597370 (0.0029) [2024-06-24 05:47:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9787326464. Throughput: 0: 43170.7. Samples: 9787437180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 05:47:31,925][15401] Updated weights for policy 0, policy_version 597380 (0.0035) [2024-06-24 05:47:33,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 9787490304. Throughput: 0: 43234.3. Samples: 9787702200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 05:47:35,639][15401] Updated weights for policy 0, policy_version 597390 (0.0039) [2024-06-24 05:47:38,010][15349] Signal inference workers to stop experience collection... (145050 times) [2024-06-24 05:47:38,010][15349] Signal inference workers to resume experience collection... (145050 times) [2024-06-24 05:47:38,025][15401] InferenceWorker_p0-w0: stopping experience collection (145050 times) [2024-06-24 05:47:38,025][15401] InferenceWorker_p0-w0: resuming experience collection (145050 times) [2024-06-24 05:47:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 9787768832. Throughput: 0: 43172.8. Samples: 9787822940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 05:47:39,436][15401] Updated weights for policy 0, policy_version 597400 (0.0040) [2024-06-24 05:47:43,230][15401] Updated weights for policy 0, policy_version 597410 (0.0031) [2024-06-24 05:47:43,389][15132] Fps is (10 sec: 47515.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9787965440. Throughput: 0: 42971.9. Samples: 9788080420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:47:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597410_9787965440.pth... [2024-06-24 05:47:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000596782_9777676288.pth [2024-06-24 05:47:47,370][15401] Updated weights for policy 0, policy_version 597420 (0.0030) [2024-06-24 05:47:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9788145664. Throughput: 0: 43018.2. Samples: 9788343660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 05:47:50,835][15401] Updated weights for policy 0, policy_version 597430 (0.0038) [2024-06-24 05:47:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9788407808. Throughput: 0: 43175.1. Samples: 9788470180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 05:47:54,889][15401] Updated weights for policy 0, policy_version 597440 (0.0031) [2024-06-24 05:47:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9788604416. Throughput: 0: 43133.4. Samples: 9788727360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:47:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:47:58,458][15401] Updated weights for policy 0, policy_version 597450 (0.0026) [2024-06-24 05:48:02,830][15401] Updated weights for policy 0, policy_version 597460 (0.0023) [2024-06-24 05:48:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9788801024. Throughput: 0: 43027.1. Samples: 9788985700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:03,398][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 05:48:06,224][15401] Updated weights for policy 0, policy_version 597470 (0.0036) [2024-06-24 05:48:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9789046784. Throughput: 0: 43149.3. Samples: 9789113700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:08,391][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 05:48:10,235][15401] Updated weights for policy 0, policy_version 597480 (0.0025) [2024-06-24 05:48:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9789259776. Throughput: 0: 42939.0. Samples: 9789369440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:13,400][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 05:48:13,586][15401] Updated weights for policy 0, policy_version 597490 (0.0032) [2024-06-24 05:48:17,891][15401] Updated weights for policy 0, policy_version 597500 (0.0038) [2024-06-24 05:48:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9789440000. Throughput: 0: 42867.8. Samples: 9789631240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:18,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 05:48:21,057][15401] Updated weights for policy 0, policy_version 597510 (0.0039) [2024-06-24 05:48:23,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43140.0, 300 sec: 42931.5). Total num frames: 9789702144. Throughput: 0: 42859.2. Samples: 9789751880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:23,396][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 05:48:25,423][15401] Updated weights for policy 0, policy_version 597520 (0.0040) [2024-06-24 05:48:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9789915136. Throughput: 0: 43132.9. Samples: 9790021400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 05:48:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 05:48:28,536][15401] Updated weights for policy 0, policy_version 597530 (0.0037) [2024-06-24 05:48:33,010][15401] Updated weights for policy 0, policy_version 597540 (0.0025) [2024-06-24 05:48:33,389][15132] Fps is (10 sec: 39347.1, 60 sec: 43417.8, 300 sec: 42987.2). Total num frames: 9790095360. Throughput: 0: 42971.5. Samples: 9790277380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 05:48:36,068][15401] Updated weights for policy 0, policy_version 597550 (0.0050) [2024-06-24 05:48:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9790341120. Throughput: 0: 42887.0. Samples: 9790400100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 05:48:41,002][15401] Updated weights for policy 0, policy_version 597560 (0.0034) [2024-06-24 05:48:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9790554112. Throughput: 0: 43053.8. Samples: 9790664780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 05:48:43,904][15401] Updated weights for policy 0, policy_version 597570 (0.0030) [2024-06-24 05:48:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9790734336. Throughput: 0: 43099.9. Samples: 9790925200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:48,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 05:48:48,648][15401] Updated weights for policy 0, policy_version 597580 (0.0047) [2024-06-24 05:48:51,631][15401] Updated weights for policy 0, policy_version 597590 (0.0034) [2024-06-24 05:48:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9790980096. Throughput: 0: 42952.1. Samples: 9791046540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 05:48:55,001][15349] Signal inference workers to stop experience collection... (145100 times) [2024-06-24 05:48:55,002][15349] Signal inference workers to resume experience collection... (145100 times) [2024-06-24 05:48:55,017][15401] InferenceWorker_p0-w0: stopping experience collection (145100 times) [2024-06-24 05:48:55,017][15401] InferenceWorker_p0-w0: resuming experience collection (145100 times) [2024-06-24 05:48:56,270][15401] Updated weights for policy 0, policy_version 597600 (0.0037) [2024-06-24 05:48:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9791193088. Throughput: 0: 43093.9. Samples: 9791308660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:48:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 05:48:59,284][15401] Updated weights for policy 0, policy_version 597610 (0.0035) [2024-06-24 05:49:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9791373312. Throughput: 0: 42926.6. Samples: 9791562940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 05:49:03,857][15401] Updated weights for policy 0, policy_version 597620 (0.0038) [2024-06-24 05:49:06,915][15401] Updated weights for policy 0, policy_version 597630 (0.0029) [2024-06-24 05:49:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9791635456. Throughput: 0: 42963.8. Samples: 9791684980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:08,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 05:49:11,407][15401] Updated weights for policy 0, policy_version 597640 (0.0033) [2024-06-24 05:49:13,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 9791815680. Throughput: 0: 42836.4. Samples: 9791949040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 05:49:14,728][15401] Updated weights for policy 0, policy_version 597650 (0.0050) [2024-06-24 05:49:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9792028672. Throughput: 0: 42881.3. Samples: 9792207040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 05:49:19,155][15401] Updated weights for policy 0, policy_version 597660 (0.0024) [2024-06-24 05:49:22,227][15401] Updated weights for policy 0, policy_version 597670 (0.0035) [2024-06-24 05:49:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42876.1, 300 sec: 42931.7). Total num frames: 9792274432. Throughput: 0: 42850.3. Samples: 9792328360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 05:49:27,078][15401] Updated weights for policy 0, policy_version 597680 (0.0035) [2024-06-24 05:49:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 9792454656. Throughput: 0: 42856.4. Samples: 9792593320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 05:49:29,579][15401] Updated weights for policy 0, policy_version 597690 (0.0024) [2024-06-24 05:49:33,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9792651264. Throughput: 0: 42935.6. Samples: 9792857300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 05:49:34,646][15401] Updated weights for policy 0, policy_version 597700 (0.0031) [2024-06-24 05:49:36,882][15401] Updated weights for policy 0, policy_version 597710 (0.0037) [2024-06-24 05:49:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9792897024. Throughput: 0: 42954.7. Samples: 9792979500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 05:49:42,287][15401] Updated weights for policy 0, policy_version 597720 (0.0037) [2024-06-24 05:49:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9793110016. Throughput: 0: 43080.0. Samples: 9793247260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 05:49:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597724_9793110016.pth... [2024-06-24 05:49:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597096_9782820864.pth [2024-06-24 05:49:44,864][15401] Updated weights for policy 0, policy_version 597730 (0.0046) [2024-06-24 05:49:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9793306624. Throughput: 0: 42822.4. Samples: 9793489940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 05:49:49,958][15401] Updated weights for policy 0, policy_version 597740 (0.0033) [2024-06-24 05:49:52,524][15401] Updated weights for policy 0, policy_version 597750 (0.0041) [2024-06-24 05:49:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9793552384. Throughput: 0: 42982.0. Samples: 9793619160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 05:49:57,605][15401] Updated weights for policy 0, policy_version 597760 (0.0029) [2024-06-24 05:49:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9793748992. Throughput: 0: 42948.8. Samples: 9793881740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:49:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 05:50:00,264][15401] Updated weights for policy 0, policy_version 597770 (0.0027) [2024-06-24 05:50:02,270][15349] Signal inference workers to stop experience collection... (145150 times) [2024-06-24 05:50:02,323][15401] InferenceWorker_p0-w0: stopping experience collection (145150 times) [2024-06-24 05:50:02,384][15349] Signal inference workers to resume experience collection... (145150 times) [2024-06-24 05:50:02,384][15401] InferenceWorker_p0-w0: resuming experience collection (145150 times) [2024-06-24 05:50:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 9793961984. Throughput: 0: 42754.7. Samples: 9794131000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 05:50:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 05:50:05,317][15401] Updated weights for policy 0, policy_version 597780 (0.0035) [2024-06-24 05:50:07,834][15401] Updated weights for policy 0, policy_version 597790 (0.0032) [2024-06-24 05:50:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 9794207744. Throughput: 0: 43035.0. Samples: 9794264940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 05:50:12,810][15401] Updated weights for policy 0, policy_version 597800 (0.0029) [2024-06-24 05:50:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9794371584. Throughput: 0: 42958.2. Samples: 9794526440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 05:50:15,489][15401] Updated weights for policy 0, policy_version 597810 (0.0035) [2024-06-24 05:50:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9794617344. Throughput: 0: 42530.2. Samples: 9794771160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 05:50:20,300][15401] Updated weights for policy 0, policy_version 597820 (0.0021) [2024-06-24 05:50:23,386][15401] Updated weights for policy 0, policy_version 597830 (0.0033) [2024-06-24 05:50:23,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9794846720. Throughput: 0: 42770.1. Samples: 9794904160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:50:27,910][15401] Updated weights for policy 0, policy_version 597840 (0.0035) [2024-06-24 05:50:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9795026944. Throughput: 0: 42573.8. Samples: 9795163080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 05:50:31,037][15401] Updated weights for policy 0, policy_version 597850 (0.0034) [2024-06-24 05:50:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 9795272704. Throughput: 0: 42704.5. Samples: 9795411640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 05:50:35,573][15401] Updated weights for policy 0, policy_version 597860 (0.0035) [2024-06-24 05:50:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9795469312. Throughput: 0: 42690.2. Samples: 9795540220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 05:50:38,959][15401] Updated weights for policy 0, policy_version 597870 (0.0031) [2024-06-24 05:50:43,249][15401] Updated weights for policy 0, policy_version 597880 (0.0040) [2024-06-24 05:50:43,395][15132] Fps is (10 sec: 39300.0, 60 sec: 42594.5, 300 sec: 42875.3). Total num frames: 9795665920. Throughput: 0: 42491.0. Samples: 9795794060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:43,395][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 05:50:46,554][15401] Updated weights for policy 0, policy_version 597890 (0.0023) [2024-06-24 05:50:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9795911680. Throughput: 0: 42580.0. Samples: 9796047100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 05:50:50,829][15401] Updated weights for policy 0, policy_version 597900 (0.0029) [2024-06-24 05:50:53,390][15132] Fps is (10 sec: 44260.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9796108288. Throughput: 0: 42505.7. Samples: 9796177700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:53,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 05:50:54,279][15401] Updated weights for policy 0, policy_version 597910 (0.0033) [2024-06-24 05:50:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9796304896. Throughput: 0: 42388.5. Samples: 9796433920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:50:58,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 05:50:58,436][15401] Updated weights for policy 0, policy_version 597920 (0.0030) [2024-06-24 05:51:02,027][15401] Updated weights for policy 0, policy_version 597930 (0.0033) [2024-06-24 05:51:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9796550656. Throughput: 0: 42483.1. Samples: 9796682900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 05:51:06,083][15401] Updated weights for policy 0, policy_version 597940 (0.0029) [2024-06-24 05:51:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 9796730880. Throughput: 0: 42446.3. Samples: 9796814240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:08,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 05:51:09,568][15401] Updated weights for policy 0, policy_version 597950 (0.0032) [2024-06-24 05:51:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9796943872. Throughput: 0: 42312.8. Samples: 9797067160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 05:51:14,083][15401] Updated weights for policy 0, policy_version 597960 (0.0033) [2024-06-24 05:51:16,928][15349] Signal inference workers to stop experience collection... (145200 times) [2024-06-24 05:51:16,929][15349] Signal inference workers to resume experience collection... (145200 times) [2024-06-24 05:51:16,947][15401] InferenceWorker_p0-w0: stopping experience collection (145200 times) [2024-06-24 05:51:16,947][15401] InferenceWorker_p0-w0: resuming experience collection (145200 times) [2024-06-24 05:51:17,239][15401] Updated weights for policy 0, policy_version 597970 (0.0029) [2024-06-24 05:51:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9797189632. Throughput: 0: 42360.8. Samples: 9797317880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 05:51:21,538][15401] Updated weights for policy 0, policy_version 597980 (0.0032) [2024-06-24 05:51:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 9797353472. Throughput: 0: 42456.2. Samples: 9797450760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 05:51:25,171][15401] Updated weights for policy 0, policy_version 597990 (0.0023) [2024-06-24 05:51:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 9797566464. Throughput: 0: 42352.2. Samples: 9797699680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 05:51:29,382][15401] Updated weights for policy 0, policy_version 598000 (0.0037) [2024-06-24 05:51:32,996][15401] Updated weights for policy 0, policy_version 598010 (0.0034) [2024-06-24 05:51:33,390][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 9797828608. Throughput: 0: 42440.3. Samples: 9797956920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 05:51:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 05:51:37,053][15401] Updated weights for policy 0, policy_version 598020 (0.0034) [2024-06-24 05:51:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 9797992448. Throughput: 0: 42482.8. Samples: 9798089520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:51:38,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 05:51:40,806][15401] Updated weights for policy 0, policy_version 598030 (0.0032) [2024-06-24 05:51:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42602.3, 300 sec: 42876.1). Total num frames: 9798221824. Throughput: 0: 42245.7. Samples: 9798334980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:51:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 05:51:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598036_9798221824.pth... [2024-06-24 05:51:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597410_9787965440.pth [2024-06-24 05:51:44,726][15401] Updated weights for policy 0, policy_version 598040 (0.0028) [2024-06-24 05:51:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 9798434816. Throughput: 0: 42597.9. Samples: 9798599800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:51:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 05:51:48,483][15401] Updated weights for policy 0, policy_version 598050 (0.0034) [2024-06-24 05:51:52,661][15401] Updated weights for policy 0, policy_version 598060 (0.0035) [2024-06-24 05:51:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 9798664192. Throughput: 0: 42488.9. Samples: 9798726240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:51:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 05:51:56,115][15401] Updated weights for policy 0, policy_version 598070 (0.0037) [2024-06-24 05:51:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9798877184. Throughput: 0: 42347.9. Samples: 9798972820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:51:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 05:52:00,318][15401] Updated weights for policy 0, policy_version 598080 (0.0039) [2024-06-24 05:52:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 9799057408. Throughput: 0: 42700.5. Samples: 9799239400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:52:03,907][15401] Updated weights for policy 0, policy_version 598090 (0.0034) [2024-06-24 05:52:08,184][15401] Updated weights for policy 0, policy_version 598100 (0.0037) [2024-06-24 05:52:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9799286784. Throughput: 0: 42399.6. Samples: 9799358740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 05:52:11,707][15401] Updated weights for policy 0, policy_version 598110 (0.0047) [2024-06-24 05:52:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9799516160. Throughput: 0: 42569.4. Samples: 9799615300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 05:52:15,691][15401] Updated weights for policy 0, policy_version 598120 (0.0040) [2024-06-24 05:52:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 9799696384. Throughput: 0: 42548.1. Samples: 9799871580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:52:19,347][15401] Updated weights for policy 0, policy_version 598130 (0.0038) [2024-06-24 05:52:23,386][15401] Updated weights for policy 0, policy_version 598140 (0.0032) [2024-06-24 05:52:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 9799925760. Throughput: 0: 42425.7. Samples: 9799998580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 05:52:26,929][15401] Updated weights for policy 0, policy_version 598150 (0.0034) [2024-06-24 05:52:28,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 9800138752. Throughput: 0: 42600.1. Samples: 9800252260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:28,397][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 05:52:30,991][15401] Updated weights for policy 0, policy_version 598160 (0.0039) [2024-06-24 05:52:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 9800335360. Throughput: 0: 42502.7. Samples: 9800512420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 05:52:34,383][15349] Signal inference workers to stop experience collection... (145250 times) [2024-06-24 05:52:34,384][15349] Signal inference workers to resume experience collection... (145250 times) [2024-06-24 05:52:34,424][15401] InferenceWorker_p0-w0: stopping experience collection (145250 times) [2024-06-24 05:52:34,424][15401] InferenceWorker_p0-w0: resuming experience collection (145250 times) [2024-06-24 05:52:34,522][15401] Updated weights for policy 0, policy_version 598170 (0.0029) [2024-06-24 05:52:38,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 9800548352. Throughput: 0: 42488.3. Samples: 9800638220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 05:52:38,722][15401] Updated weights for policy 0, policy_version 598180 (0.0041) [2024-06-24 05:52:42,347][15401] Updated weights for policy 0, policy_version 598190 (0.0046) [2024-06-24 05:52:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9800777728. Throughput: 0: 42748.9. Samples: 9800896520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:43,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 05:52:46,239][15401] Updated weights for policy 0, policy_version 598200 (0.0035) [2024-06-24 05:52:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9800990720. Throughput: 0: 42536.5. Samples: 9801153540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:48,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 05:52:49,968][15401] Updated weights for policy 0, policy_version 598210 (0.0042) [2024-06-24 05:52:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9801187328. Throughput: 0: 42652.0. Samples: 9801278080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 05:52:54,520][15401] Updated weights for policy 0, policy_version 598220 (0.0031) [2024-06-24 05:52:57,609][15401] Updated weights for policy 0, policy_version 598230 (0.0025) [2024-06-24 05:52:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9801416704. Throughput: 0: 42532.0. Samples: 9801529240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:52:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 05:53:01,997][15401] Updated weights for policy 0, policy_version 598240 (0.0045) [2024-06-24 05:53:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9801596928. Throughput: 0: 42639.5. Samples: 9801790360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:53:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 05:53:05,306][15401] Updated weights for policy 0, policy_version 598250 (0.0034) [2024-06-24 05:53:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9801826304. Throughput: 0: 42504.1. Samples: 9801911260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 05:53:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 05:53:09,651][15401] Updated weights for policy 0, policy_version 598260 (0.0023) [2024-06-24 05:53:13,045][15401] Updated weights for policy 0, policy_version 598270 (0.0029) [2024-06-24 05:53:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9802072064. Throughput: 0: 42657.6. Samples: 9802171580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:13,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 05:53:17,482][15401] Updated weights for policy 0, policy_version 598280 (0.0038) [2024-06-24 05:53:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 9802252288. Throughput: 0: 42503.5. Samples: 9802425080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 05:53:20,783][15401] Updated weights for policy 0, policy_version 598290 (0.0032) [2024-06-24 05:53:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9802481664. Throughput: 0: 42428.5. Samples: 9802547500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 05:53:24,978][15401] Updated weights for policy 0, policy_version 598300 (0.0030) [2024-06-24 05:53:28,343][15401] Updated weights for policy 0, policy_version 598310 (0.0027) [2024-06-24 05:53:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 9802711040. Throughput: 0: 42624.0. Samples: 9802814600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 05:53:32,459][15401] Updated weights for policy 0, policy_version 598320 (0.0027) [2024-06-24 05:53:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 9802874880. Throughput: 0: 42658.6. Samples: 9803073180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 05:53:35,898][15401] Updated weights for policy 0, policy_version 598330 (0.0032) [2024-06-24 05:53:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9803120640. Throughput: 0: 42541.8. Samples: 9803192460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:38,391][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 05:53:40,122][15401] Updated weights for policy 0, policy_version 598340 (0.0031) [2024-06-24 05:53:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9803333632. Throughput: 0: 42748.3. Samples: 9803452920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 05:53:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598349_9803350016.pth... [2024-06-24 05:53:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000597724_9793110016.pth [2024-06-24 05:53:43,770][15401] Updated weights for policy 0, policy_version 598350 (0.0033) [2024-06-24 05:53:44,317][15349] Signal inference workers to stop experience collection... (145300 times) [2024-06-24 05:53:44,317][15349] Signal inference workers to resume experience collection... (145300 times) [2024-06-24 05:53:44,329][15401] InferenceWorker_p0-w0: stopping experience collection (145300 times) [2024-06-24 05:53:44,330][15401] InferenceWorker_p0-w0: resuming experience collection (145300 times) [2024-06-24 05:53:47,964][15401] Updated weights for policy 0, policy_version 598360 (0.0033) [2024-06-24 05:53:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 9803530240. Throughput: 0: 42401.7. Samples: 9803698440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 05:53:51,356][15401] Updated weights for policy 0, policy_version 598370 (0.0032) [2024-06-24 05:53:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 9803759616. Throughput: 0: 42569.0. Samples: 9803826860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 05:53:55,929][15401] Updated weights for policy 0, policy_version 598380 (0.0040) [2024-06-24 05:53:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9803939840. Throughput: 0: 42686.7. Samples: 9804092480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:53:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 05:53:59,088][15401] Updated weights for policy 0, policy_version 598390 (0.0061) [2024-06-24 05:54:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9804169216. Throughput: 0: 42572.8. Samples: 9804340860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 05:54:03,680][15401] Updated weights for policy 0, policy_version 598400 (0.0031) [2024-06-24 05:54:06,782][15401] Updated weights for policy 0, policy_version 598410 (0.0027) [2024-06-24 05:54:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9804414976. Throughput: 0: 42749.0. Samples: 9804471200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:08,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 05:54:11,513][15401] Updated weights for policy 0, policy_version 598420 (0.0035) [2024-06-24 05:54:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 9804578816. Throughput: 0: 42473.7. Samples: 9804725920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 05:54:14,658][15401] Updated weights for policy 0, policy_version 598430 (0.0038) [2024-06-24 05:54:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 9804808192. Throughput: 0: 42172.8. Samples: 9804970960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 05:54:19,199][15401] Updated weights for policy 0, policy_version 598440 (0.0032) [2024-06-24 05:54:22,296][15401] Updated weights for policy 0, policy_version 598450 (0.0023) [2024-06-24 05:54:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9805021184. Throughput: 0: 42388.9. Samples: 9805099960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 05:54:26,895][15401] Updated weights for policy 0, policy_version 598460 (0.0035) [2024-06-24 05:54:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 9805217792. Throughput: 0: 42189.2. Samples: 9805351440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 05:54:30,143][15401] Updated weights for policy 0, policy_version 598470 (0.0033) [2024-06-24 05:54:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9805430784. Throughput: 0: 42409.7. Samples: 9805606880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:33,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 05:54:34,532][15401] Updated weights for policy 0, policy_version 598480 (0.0039) [2024-06-24 05:54:37,898][15401] Updated weights for policy 0, policy_version 598490 (0.0040) [2024-06-24 05:54:38,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9805660160. Throughput: 0: 42374.1. Samples: 9805733700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:38,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 05:54:42,396][15401] Updated weights for policy 0, policy_version 598500 (0.0037) [2024-06-24 05:54:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 9805856768. Throughput: 0: 42186.6. Samples: 9805990880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 05:54:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 05:54:45,661][15401] Updated weights for policy 0, policy_version 598510 (0.0036) [2024-06-24 05:54:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9806086144. Throughput: 0: 42249.3. Samples: 9806242080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:54:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 05:54:49,929][15401] Updated weights for policy 0, policy_version 598520 (0.0035) [2024-06-24 05:54:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9806299136. Throughput: 0: 42375.1. Samples: 9806378080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:54:53,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 05:54:53,482][15401] Updated weights for policy 0, policy_version 598530 (0.0051) [2024-06-24 05:54:57,775][15401] Updated weights for policy 0, policy_version 598540 (0.0034) [2024-06-24 05:54:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9806495744. Throughput: 0: 42385.9. Samples: 9806633280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:54:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 05:55:01,132][15401] Updated weights for policy 0, policy_version 598550 (0.0035) [2024-06-24 05:55:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9806741504. Throughput: 0: 42535.2. Samples: 9806885040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 05:55:05,583][15401] Updated weights for policy 0, policy_version 598560 (0.0028) [2024-06-24 05:55:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9806938112. Throughput: 0: 42750.1. Samples: 9807023720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 05:55:08,767][15401] Updated weights for policy 0, policy_version 598570 (0.0028) [2024-06-24 05:55:13,146][15401] Updated weights for policy 0, policy_version 598580 (0.0041) [2024-06-24 05:55:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 9807134720. Throughput: 0: 42673.4. Samples: 9807271740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:13,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 05:55:16,739][15401] Updated weights for policy 0, policy_version 598590 (0.0032) [2024-06-24 05:55:17,444][15349] Signal inference workers to stop experience collection... (145350 times) [2024-06-24 05:55:17,445][15349] Signal inference workers to resume experience collection... (145350 times) [2024-06-24 05:55:17,478][15401] InferenceWorker_p0-w0: stopping experience collection (145350 times) [2024-06-24 05:55:17,510][15401] InferenceWorker_p0-w0: resuming experience collection (145350 times) [2024-06-24 05:55:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 9807364096. Throughput: 0: 42617.4. Samples: 9807524660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 05:55:20,923][15401] Updated weights for policy 0, policy_version 598600 (0.0026) [2024-06-24 05:55:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 9807577088. Throughput: 0: 42828.0. Samples: 9807660960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:23,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 05:55:24,261][15401] Updated weights for policy 0, policy_version 598610 (0.0045) [2024-06-24 05:55:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 9807773696. Throughput: 0: 42787.6. Samples: 9807916320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 05:55:28,693][15401] Updated weights for policy 0, policy_version 598620 (0.0028) [2024-06-24 05:55:31,724][15401] Updated weights for policy 0, policy_version 598630 (0.0031) [2024-06-24 05:55:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 9808019456. Throughput: 0: 42923.6. Samples: 9808173640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 05:55:36,199][15401] Updated weights for policy 0, policy_version 598640 (0.0032) [2024-06-24 05:55:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42598.8). Total num frames: 9808232448. Throughput: 0: 42893.7. Samples: 9808308400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:38,392][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 05:55:39,358][15401] Updated weights for policy 0, policy_version 598650 (0.0033) [2024-06-24 05:55:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 9808412672. Throughput: 0: 42784.9. Samples: 9808558600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:43,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 05:55:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598658_9808412672.pth... [2024-06-24 05:55:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598036_9798221824.pth [2024-06-24 05:55:43,929][15401] Updated weights for policy 0, policy_version 598660 (0.0023) [2024-06-24 05:55:46,872][15401] Updated weights for policy 0, policy_version 598670 (0.0032) [2024-06-24 05:55:48,392][15132] Fps is (10 sec: 44236.8, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 9808674816. Throughput: 0: 42805.3. Samples: 9808811380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:48,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 05:55:51,509][15401] Updated weights for policy 0, policy_version 598680 (0.0029) [2024-06-24 05:55:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 9808838656. Throughput: 0: 42827.8. Samples: 9808950960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 05:55:54,558][15401] Updated weights for policy 0, policy_version 598690 (0.0045) [2024-06-24 05:55:58,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 9809068032. Throughput: 0: 42830.7. Samples: 9809199120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:55:58,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 05:55:59,001][15401] Updated weights for policy 0, policy_version 598700 (0.0040) [2024-06-24 05:56:02,006][15401] Updated weights for policy 0, policy_version 598710 (0.0027) [2024-06-24 05:56:03,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9809313792. Throughput: 0: 42862.5. Samples: 9809453480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:56:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 05:56:06,873][15401] Updated weights for policy 0, policy_version 598720 (0.0043) [2024-06-24 05:56:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 9809477632. Throughput: 0: 42920.5. Samples: 9809592380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:56:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 05:56:09,542][15401] Updated weights for policy 0, policy_version 598730 (0.0033) [2024-06-24 05:56:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 9809723392. Throughput: 0: 42934.2. Samples: 9809848360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 05:56:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 05:56:14,239][15401] Updated weights for policy 0, policy_version 598740 (0.0038) [2024-06-24 05:56:17,428][15401] Updated weights for policy 0, policy_version 598750 (0.0028) [2024-06-24 05:56:18,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9809969152. Throughput: 0: 42689.3. Samples: 9810094660. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 05:56:21,828][15401] Updated weights for policy 0, policy_version 598760 (0.0033) [2024-06-24 05:56:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9810116608. Throughput: 0: 42679.1. Samples: 9810228860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:23,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 05:56:24,974][15401] Updated weights for policy 0, policy_version 598770 (0.0036) [2024-06-24 05:56:28,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42869.8, 300 sec: 42431.4). Total num frames: 9810345984. Throughput: 0: 42823.9. Samples: 9810485780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:28,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 05:56:29,231][15401] Updated weights for policy 0, policy_version 598780 (0.0029) [2024-06-24 05:56:32,651][15401] Updated weights for policy 0, policy_version 598790 (0.0030) [2024-06-24 05:56:33,396][15132] Fps is (10 sec: 47482.7, 60 sec: 42866.8, 300 sec: 42708.9). Total num frames: 9810591744. Throughput: 0: 42814.7. Samples: 9810738220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:33,397][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 05:56:36,706][15401] Updated weights for policy 0, policy_version 598800 (0.0031) [2024-06-24 05:56:38,092][15349] Signal inference workers to stop experience collection... (145400 times) [2024-06-24 05:56:38,118][15401] InferenceWorker_p0-w0: stopping experience collection (145400 times) [2024-06-24 05:56:38,207][15349] Signal inference workers to resume experience collection... (145400 times) [2024-06-24 05:56:38,207][15401] InferenceWorker_p0-w0: resuming experience collection (145400 times) [2024-06-24 05:56:38,393][15132] Fps is (10 sec: 44234.0, 60 sec: 42598.0, 300 sec: 42598.0). Total num frames: 9810788352. Throughput: 0: 42654.4. Samples: 9810870540. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:38,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 05:56:40,235][15401] Updated weights for policy 0, policy_version 598810 (0.0042) [2024-06-24 05:56:43,390][15132] Fps is (10 sec: 40986.2, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 9811001344. Throughput: 0: 42807.4. Samples: 9811125460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 05:56:45,357][15401] Updated weights for policy 0, policy_version 598820 (0.0044) [2024-06-24 05:56:47,802][15401] Updated weights for policy 0, policy_version 598830 (0.0038) [2024-06-24 05:56:48,390][15132] Fps is (10 sec: 45888.9, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 9811247104. Throughput: 0: 42834.8. Samples: 9811381040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:48,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 05:56:52,879][15401] Updated weights for policy 0, policy_version 598840 (0.0029) [2024-06-24 05:56:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9811410944. Throughput: 0: 42730.2. Samples: 9811515240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:53,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 05:56:55,771][15401] Updated weights for policy 0, policy_version 598850 (0.0033) [2024-06-24 05:56:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9811640320. Throughput: 0: 42566.3. Samples: 9811763840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:56:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 05:57:00,442][15401] Updated weights for policy 0, policy_version 598860 (0.0034) [2024-06-24 05:57:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9811869696. Throughput: 0: 42661.8. Samples: 9812014440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 05:57:03,707][15401] Updated weights for policy 0, policy_version 598870 (0.0037) [2024-06-24 05:57:08,066][15401] Updated weights for policy 0, policy_version 598880 (0.0030) [2024-06-24 05:57:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 9812049920. Throughput: 0: 42592.4. Samples: 9812145520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 05:57:11,302][15401] Updated weights for policy 0, policy_version 598890 (0.0033) [2024-06-24 05:57:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9812295680. Throughput: 0: 42485.0. Samples: 9812397500. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 05:57:15,568][15401] Updated weights for policy 0, policy_version 598900 (0.0023) [2024-06-24 05:57:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9812508672. Throughput: 0: 42676.9. Samples: 9812658400. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 05:57:18,963][15401] Updated weights for policy 0, policy_version 598910 (0.0031) [2024-06-24 05:57:23,290][15401] Updated weights for policy 0, policy_version 598920 (0.0036) [2024-06-24 05:57:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 9812705280. Throughput: 0: 42454.9. Samples: 9812780880. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 05:57:26,765][15401] Updated weights for policy 0, policy_version 598930 (0.0032) [2024-06-24 05:57:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 9812934656. Throughput: 0: 42412.9. Samples: 9813034040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 05:57:30,945][15401] Updated weights for policy 0, policy_version 598940 (0.0031) [2024-06-24 05:57:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42056.9, 300 sec: 42598.4). Total num frames: 9813114880. Throughput: 0: 42641.0. Samples: 9813299880. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:33,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 05:57:34,461][15401] Updated weights for policy 0, policy_version 598950 (0.0032) [2024-06-24 05:57:38,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42325.7, 300 sec: 42542.5). Total num frames: 9813327872. Throughput: 0: 42283.1. Samples: 9813418080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:38,393][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 05:57:38,605][15401] Updated weights for policy 0, policy_version 598960 (0.0042) [2024-06-24 05:57:42,139][15401] Updated weights for policy 0, policy_version 598970 (0.0023) [2024-06-24 05:57:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9813573632. Throughput: 0: 42379.0. Samples: 9813670900. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 05:57:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598973_9813573632.pth... [2024-06-24 05:57:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598349_9803350016.pth [2024-06-24 05:57:46,200][15401] Updated weights for policy 0, policy_version 598980 (0.0034) [2024-06-24 05:57:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 9813753856. Throughput: 0: 42623.2. Samples: 9813932480. Policy #0 lag: (min: 1.0, avg: 11.5, max: 19.0) [2024-06-24 05:57:48,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-24 05:57:49,847][15401] Updated weights for policy 0, policy_version 598990 (0.0031) [2024-06-24 05:57:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9813966848. Throughput: 0: 42240.0. Samples: 9814046320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:57:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 05:57:54,852][15401] Updated weights for policy 0, policy_version 599000 (0.0049) [2024-06-24 05:57:57,076][15349] Signal inference workers to stop experience collection... (145450 times) [2024-06-24 05:57:57,114][15401] InferenceWorker_p0-w0: stopping experience collection (145450 times) [2024-06-24 05:57:57,124][15349] Signal inference workers to resume experience collection... (145450 times) [2024-06-24 05:57:57,133][15401] InferenceWorker_p0-w0: resuming experience collection (145450 times) [2024-06-24 05:57:57,886][15401] Updated weights for policy 0, policy_version 599010 (0.0035) [2024-06-24 05:57:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9814196224. Throughput: 0: 42332.3. Samples: 9814302560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:57:58,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 05:58:02,446][15401] Updated weights for policy 0, policy_version 599020 (0.0044) [2024-06-24 05:58:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 9814360064. Throughput: 0: 42151.9. Samples: 9814555240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:03,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 05:58:05,535][15401] Updated weights for policy 0, policy_version 599030 (0.0039) [2024-06-24 05:58:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9814605824. Throughput: 0: 42053.8. Samples: 9814673300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:08,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 05:58:10,026][15401] Updated weights for policy 0, policy_version 599040 (0.0038) [2024-06-24 05:58:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9814818816. Throughput: 0: 42104.4. Samples: 9814928740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:13,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 05:58:13,490][15401] Updated weights for policy 0, policy_version 599050 (0.0041) [2024-06-24 05:58:17,767][15401] Updated weights for policy 0, policy_version 599060 (0.0028) [2024-06-24 05:58:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 9815015424. Throughput: 0: 41932.8. Samples: 9815186860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:18,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 05:58:21,149][15401] Updated weights for policy 0, policy_version 599070 (0.0040) [2024-06-24 05:58:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 9815228416. Throughput: 0: 42036.1. Samples: 9815309600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 05:58:25,398][15401] Updated weights for policy 0, policy_version 599080 (0.0034) [2024-06-24 05:58:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 9815441408. Throughput: 0: 42260.1. Samples: 9815572600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 05:58:28,805][15401] Updated weights for policy 0, policy_version 599090 (0.0045) [2024-06-24 05:58:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 9815638016. Throughput: 0: 42087.9. Samples: 9815826440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:58:33,606][15401] Updated weights for policy 0, policy_version 599100 (0.0024) [2024-06-24 05:58:36,585][15401] Updated weights for policy 0, policy_version 599110 (0.0049) [2024-06-24 05:58:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 9815867392. Throughput: 0: 42289.8. Samples: 9815949360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 05:58:41,679][15401] Updated weights for policy 0, policy_version 599120 (0.0029) [2024-06-24 05:58:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 42431.8). Total num frames: 9816047616. Throughput: 0: 42206.3. Samples: 9816201740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 05:58:44,353][15401] Updated weights for policy 0, policy_version 599130 (0.0047) [2024-06-24 05:58:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 9816276992. Throughput: 0: 42047.1. Samples: 9816447360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 05:58:49,485][15401] Updated weights for policy 0, policy_version 599140 (0.0034) [2024-06-24 05:58:52,039][15401] Updated weights for policy 0, policy_version 599150 (0.0039) [2024-06-24 05:58:53,396][15132] Fps is (10 sec: 47483.7, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 9816522752. Throughput: 0: 42329.1. Samples: 9816578380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:53,396][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 05:58:57,151][15401] Updated weights for policy 0, policy_version 599160 (0.0034) [2024-06-24 05:58:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41507.8, 300 sec: 42431.8). Total num frames: 9816686592. Throughput: 0: 42287.1. Samples: 9816831660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:58:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 05:59:00,057][15401] Updated weights for policy 0, policy_version 599170 (0.0028) [2024-06-24 05:59:03,389][15132] Fps is (10 sec: 37707.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 9816899584. Throughput: 0: 42058.3. Samples: 9817079480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:59:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 05:59:04,707][15401] Updated weights for policy 0, policy_version 599180 (0.0033) [2024-06-24 05:59:07,642][15401] Updated weights for policy 0, policy_version 599190 (0.0027) [2024-06-24 05:59:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9817145344. Throughput: 0: 42185.4. Samples: 9817207940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:59:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 05:59:12,317][15401] Updated weights for policy 0, policy_version 599200 (0.0036) [2024-06-24 05:59:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 42376.3). Total num frames: 9817309184. Throughput: 0: 42019.6. Samples: 9817463480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:59:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 05:59:14,222][15349] Signal inference workers to stop experience collection... (145500 times) [2024-06-24 05:59:14,260][15401] InferenceWorker_p0-w0: stopping experience collection (145500 times) [2024-06-24 05:59:14,277][15349] Signal inference workers to resume experience collection... (145500 times) [2024-06-24 05:59:14,279][15401] InferenceWorker_p0-w0: resuming experience collection (145500 times) [2024-06-24 05:59:15,178][15401] Updated weights for policy 0, policy_version 599210 (0.0036) [2024-06-24 05:59:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 9817538560. Throughput: 0: 42023.6. Samples: 9817717500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:59:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 05:59:20,071][15401] Updated weights for policy 0, policy_version 599220 (0.0024) [2024-06-24 05:59:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9817767936. Throughput: 0: 42256.0. Samples: 9817850880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 05:59:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 05:59:23,439][15401] Updated weights for policy 0, policy_version 599230 (0.0030) [2024-06-24 05:59:27,734][15401] Updated weights for policy 0, policy_version 599240 (0.0031) [2024-06-24 05:59:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 9817964544. Throughput: 0: 42269.4. Samples: 9818103860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 05:59:31,182][15401] Updated weights for policy 0, policy_version 599250 (0.0036) [2024-06-24 05:59:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9818193920. Throughput: 0: 42281.3. Samples: 9818350020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 05:59:35,571][15401] Updated weights for policy 0, policy_version 599260 (0.0034) [2024-06-24 05:59:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9818406912. Throughput: 0: 42429.0. Samples: 9818487420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 05:59:38,774][15401] Updated weights for policy 0, policy_version 599270 (0.0037) [2024-06-24 05:59:43,162][15401] Updated weights for policy 0, policy_version 599280 (0.0032) [2024-06-24 05:59:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 9818603520. Throughput: 0: 42381.2. Samples: 9818738820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:43,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 05:59:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599280_9818603520.pth... [2024-06-24 05:59:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598658_9808412672.pth [2024-06-24 05:59:46,635][15401] Updated weights for policy 0, policy_version 599290 (0.0033) [2024-06-24 05:59:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9818832896. Throughput: 0: 42458.7. Samples: 9818990120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:48,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 05:59:51,099][15401] Updated weights for policy 0, policy_version 599300 (0.0041) [2024-06-24 05:59:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42329.8, 300 sec: 42598.4). Total num frames: 9819062272. Throughput: 0: 42572.4. Samples: 9819123700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 05:59:54,270][15401] Updated weights for policy 0, policy_version 599310 (0.0036) [2024-06-24 05:59:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 9819242496. Throughput: 0: 42522.6. Samples: 9819377000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 05:59:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 05:59:58,923][15401] Updated weights for policy 0, policy_version 599320 (0.0038) [2024-06-24 06:00:02,145][15401] Updated weights for policy 0, policy_version 599330 (0.0040) [2024-06-24 06:00:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 9819488256. Throughput: 0: 42427.2. Samples: 9819626720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 06:00:06,493][15401] Updated weights for policy 0, policy_version 599340 (0.0024) [2024-06-24 06:00:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 9819684864. Throughput: 0: 42563.0. Samples: 9819766320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:08,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 06:00:09,609][15401] Updated weights for policy 0, policy_version 599350 (0.0025) [2024-06-24 06:00:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 9819881472. Throughput: 0: 42572.8. Samples: 9820019640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:13,399][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 06:00:14,053][15401] Updated weights for policy 0, policy_version 599360 (0.0028) [2024-06-24 06:00:17,156][15401] Updated weights for policy 0, policy_version 599370 (0.0034) [2024-06-24 06:00:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 9820127232. Throughput: 0: 42838.7. Samples: 9820277760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 06:00:21,646][15401] Updated weights for policy 0, policy_version 599380 (0.0030) [2024-06-24 06:00:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 9820323840. Throughput: 0: 42711.1. Samples: 9820409420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 06:00:24,757][15401] Updated weights for policy 0, policy_version 599390 (0.0035) [2024-06-24 06:00:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 9820536832. Throughput: 0: 42652.1. Samples: 9820658160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 06:00:29,262][15401] Updated weights for policy 0, policy_version 599400 (0.0042) [2024-06-24 06:00:32,510][15401] Updated weights for policy 0, policy_version 599410 (0.0033) [2024-06-24 06:00:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 9820749824. Throughput: 0: 42815.4. Samples: 9820916820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 06:00:36,753][15401] Updated weights for policy 0, policy_version 599420 (0.0024) [2024-06-24 06:00:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9820962816. Throughput: 0: 42783.6. Samples: 9821048960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 06:00:40,099][15401] Updated weights for policy 0, policy_version 599430 (0.0036) [2024-06-24 06:00:43,390][15132] Fps is (10 sec: 44232.9, 60 sec: 43144.0, 300 sec: 42432.0). Total num frames: 9821192192. Throughput: 0: 42726.7. Samples: 9821299740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:43,391][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 06:00:44,911][15401] Updated weights for policy 0, policy_version 599440 (0.0030) [2024-06-24 06:00:45,244][15349] Signal inference workers to stop experience collection... (145550 times) [2024-06-24 06:00:45,245][15349] Signal inference workers to resume experience collection... (145550 times) [2024-06-24 06:00:45,284][15401] InferenceWorker_p0-w0: stopping experience collection (145550 times) [2024-06-24 06:00:45,285][15401] InferenceWorker_p0-w0: resuming experience collection (145550 times) [2024-06-24 06:00:48,145][15401] Updated weights for policy 0, policy_version 599450 (0.0026) [2024-06-24 06:00:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 9821388800. Throughput: 0: 42715.0. Samples: 9821548900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:48,399][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 06:00:52,551][15401] Updated weights for policy 0, policy_version 599460 (0.0032) [2024-06-24 06:00:53,389][15132] Fps is (10 sec: 39325.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 9821585408. Throughput: 0: 42375.2. Samples: 9821673100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 06:00:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 06:00:55,576][15401] Updated weights for policy 0, policy_version 599470 (0.0035) [2024-06-24 06:00:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42487.3). Total num frames: 9821847552. Throughput: 0: 42572.5. Samples: 9821935400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:00:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 06:01:00,060][15401] Updated weights for policy 0, policy_version 599480 (0.0038) [2024-06-24 06:01:03,378][15401] Updated weights for policy 0, policy_version 599490 (0.0031) [2024-06-24 06:01:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 9822044160. Throughput: 0: 42534.1. Samples: 9822191800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 06:01:07,583][15401] Updated weights for policy 0, policy_version 599500 (0.0029) [2024-06-24 06:01:08,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42600.0, 300 sec: 42431.8). Total num frames: 9822240768. Throughput: 0: 42441.6. Samples: 9822319300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 06:01:10,995][15401] Updated weights for policy 0, policy_version 599510 (0.0029) [2024-06-24 06:01:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42376.2). Total num frames: 9822470144. Throughput: 0: 42757.8. Samples: 9822582260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:13,394][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 06:01:15,140][15401] Updated weights for policy 0, policy_version 599520 (0.0032) [2024-06-24 06:01:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 9822666752. Throughput: 0: 42546.6. Samples: 9822831420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 06:01:18,832][15401] Updated weights for policy 0, policy_version 599530 (0.0035) [2024-06-24 06:01:23,026][15401] Updated weights for policy 0, policy_version 599540 (0.0036) [2024-06-24 06:01:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 9822863360. Throughput: 0: 42324.0. Samples: 9822953540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 06:01:26,399][15401] Updated weights for policy 0, policy_version 599550 (0.0029) [2024-06-24 06:01:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42321.6). Total num frames: 9823076352. Throughput: 0: 42528.8. Samples: 9823213500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 06:01:30,795][15401] Updated weights for policy 0, policy_version 599560 (0.0022) [2024-06-24 06:01:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42376.7). Total num frames: 9823289344. Throughput: 0: 42742.7. Samples: 9823472320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:33,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-24 06:01:34,241][15401] Updated weights for policy 0, policy_version 599570 (0.0034) [2024-06-24 06:01:38,201][15401] Updated weights for policy 0, policy_version 599580 (0.0033) [2024-06-24 06:01:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 9823518720. Throughput: 0: 42775.0. Samples: 9823597980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:38,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 06:01:41,570][15401] Updated weights for policy 0, policy_version 599590 (0.0032) [2024-06-24 06:01:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42326.0, 300 sec: 42320.7). Total num frames: 9823731712. Throughput: 0: 42791.5. Samples: 9823861020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 06:01:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599594_9823748096.pth... [2024-06-24 06:01:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000598973_9813573632.pth [2024-06-24 06:01:45,561][15401] Updated weights for policy 0, policy_version 599600 (0.0051) [2024-06-24 06:01:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 9823944704. Throughput: 0: 42759.5. Samples: 9824115980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 06:01:49,954][15401] Updated weights for policy 0, policy_version 599610 (0.0032) [2024-06-24 06:01:53,064][15401] Updated weights for policy 0, policy_version 599620 (0.0027) [2024-06-24 06:01:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 9824174080. Throughput: 0: 42712.6. Samples: 9824241360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 06:01:57,475][15401] Updated weights for policy 0, policy_version 599630 (0.0030) [2024-06-24 06:01:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 9824387072. Throughput: 0: 42917.0. Samples: 9824513520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:01:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 06:02:00,489][15401] Updated weights for policy 0, policy_version 599640 (0.0039) [2024-06-24 06:02:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 9824600064. Throughput: 0: 42869.9. Samples: 9824760560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 06:02:04,975][15401] Updated weights for policy 0, policy_version 599650 (0.0047) [2024-06-24 06:02:08,243][15401] Updated weights for policy 0, policy_version 599660 (0.0034) [2024-06-24 06:02:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 9824829440. Throughput: 0: 43122.2. Samples: 9824894040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 06:02:12,490][15401] Updated weights for policy 0, policy_version 599670 (0.0035) [2024-06-24 06:02:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 9825009664. Throughput: 0: 43283.6. Samples: 9825161260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 06:02:16,166][15401] Updated weights for policy 0, policy_version 599680 (0.0039) [2024-06-24 06:02:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 9825239040. Throughput: 0: 42941.4. Samples: 9825404680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 06:02:20,247][15401] Updated weights for policy 0, policy_version 599690 (0.0031) [2024-06-24 06:02:21,104][15349] Signal inference workers to stop experience collection... (145600 times) [2024-06-24 06:02:21,156][15401] InferenceWorker_p0-w0: stopping experience collection (145600 times) [2024-06-24 06:02:21,164][15349] Signal inference workers to resume experience collection... (145600 times) [2024-06-24 06:02:21,169][15401] InferenceWorker_p0-w0: resuming experience collection (145600 times) [2024-06-24 06:02:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 9825468416. Throughput: 0: 43140.5. Samples: 9825539300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 06:02:23,703][15401] Updated weights for policy 0, policy_version 599700 (0.0036) [2024-06-24 06:02:27,835][15401] Updated weights for policy 0, policy_version 599710 (0.0037) [2024-06-24 06:02:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 9825665024. Throughput: 0: 43100.5. Samples: 9825800540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 06:02:28,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 06:02:31,461][15401] Updated weights for policy 0, policy_version 599720 (0.0037) [2024-06-24 06:02:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42598.8). Total num frames: 9825894400. Throughput: 0: 42887.3. Samples: 9826045900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 06:02:35,355][15401] Updated weights for policy 0, policy_version 599730 (0.0033) [2024-06-24 06:02:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 9826107392. Throughput: 0: 43135.3. Samples: 9826182440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 06:02:39,053][15401] Updated weights for policy 0, policy_version 599740 (0.0040) [2024-06-24 06:02:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 9826287616. Throughput: 0: 42646.7. Samples: 9826432620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 06:02:43,583][15401] Updated weights for policy 0, policy_version 599750 (0.0037) [2024-06-24 06:02:46,663][15401] Updated weights for policy 0, policy_version 599760 (0.0033) [2024-06-24 06:02:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 9826533376. Throughput: 0: 42806.2. Samples: 9826686940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:48,392][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 06:02:51,157][15401] Updated weights for policy 0, policy_version 599770 (0.0029) [2024-06-24 06:02:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42543.2). Total num frames: 9826746368. Throughput: 0: 42935.2. Samples: 9826826120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 06:02:54,410][15401] Updated weights for policy 0, policy_version 599780 (0.0026) [2024-06-24 06:02:58,396][15132] Fps is (10 sec: 39305.5, 60 sec: 42320.7, 300 sec: 42597.5). Total num frames: 9826926592. Throughput: 0: 42568.6. Samples: 9827077120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:02:58,397][15132] Avg episode reward: [(0, '0.244')] [2024-06-24 06:02:58,878][15401] Updated weights for policy 0, policy_version 599790 (0.0032) [2024-06-24 06:03:01,941][15401] Updated weights for policy 0, policy_version 599800 (0.0032) [2024-06-24 06:03:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 9827188736. Throughput: 0: 42717.8. Samples: 9827326980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:03,396][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 06:03:06,413][15401] Updated weights for policy 0, policy_version 599810 (0.0032) [2024-06-24 06:03:08,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 9827368960. Throughput: 0: 42727.9. Samples: 9827462060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 06:03:09,496][15401] Updated weights for policy 0, policy_version 599820 (0.0041) [2024-06-24 06:03:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9827598336. Throughput: 0: 42644.4. Samples: 9827719540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 06:03:13,885][15401] Updated weights for policy 0, policy_version 599830 (0.0031) [2024-06-24 06:03:17,259][15401] Updated weights for policy 0, policy_version 599840 (0.0035) [2024-06-24 06:03:18,391][15132] Fps is (10 sec: 45866.8, 60 sec: 43143.1, 300 sec: 42709.2). Total num frames: 9827827712. Throughput: 0: 42792.7. Samples: 9827971660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:18,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 06:03:21,627][15401] Updated weights for policy 0, policy_version 599850 (0.0032) [2024-06-24 06:03:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9828024320. Throughput: 0: 42731.5. Samples: 9828105360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 06:03:24,811][15401] Updated weights for policy 0, policy_version 599860 (0.0030) [2024-06-24 06:03:28,390][15132] Fps is (10 sec: 40967.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9828237312. Throughput: 0: 42772.3. Samples: 9828357380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 06:03:29,181][15401] Updated weights for policy 0, policy_version 599870 (0.0038) [2024-06-24 06:03:32,613][15401] Updated weights for policy 0, policy_version 599880 (0.0033) [2024-06-24 06:03:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9828483072. Throughput: 0: 42818.7. Samples: 9828613680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:33,400][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 06:03:36,764][15401] Updated weights for policy 0, policy_version 599890 (0.0036) [2024-06-24 06:03:38,390][15132] Fps is (10 sec: 42594.9, 60 sec: 42597.7, 300 sec: 42764.9). Total num frames: 9828663296. Throughput: 0: 42728.9. Samples: 9828748960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:38,391][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 06:03:40,275][15401] Updated weights for policy 0, policy_version 599900 (0.0049) [2024-06-24 06:03:43,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 9828876288. Throughput: 0: 42727.4. Samples: 9828999680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:43,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 06:03:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599907_9828876288.pth... [2024-06-24 06:03:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599280_9818603520.pth [2024-06-24 06:03:44,711][15401] Updated weights for policy 0, policy_version 599910 (0.0029) [2024-06-24 06:03:47,930][15401] Updated weights for policy 0, policy_version 599920 (0.0045) [2024-06-24 06:03:48,389][15132] Fps is (10 sec: 44240.8, 60 sec: 42873.2, 300 sec: 42654.9). Total num frames: 9829105664. Throughput: 0: 42867.1. Samples: 9829256000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 06:03:49,048][15349] Signal inference workers to stop experience collection... (145650 times) [2024-06-24 06:03:49,049][15349] Signal inference workers to resume experience collection... (145650 times) [2024-06-24 06:03:49,091][15401] InferenceWorker_p0-w0: stopping experience collection (145650 times) [2024-06-24 06:03:49,092][15401] InferenceWorker_p0-w0: resuming experience collection (145650 times) [2024-06-24 06:03:52,250][15401] Updated weights for policy 0, policy_version 599930 (0.0036) [2024-06-24 06:03:53,396][15132] Fps is (10 sec: 42581.6, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 9829302272. Throughput: 0: 42800.7. Samples: 9829388360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:53,397][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 06:03:55,504][15401] Updated weights for policy 0, policy_version 599940 (0.0038) [2024-06-24 06:03:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 9829515264. Throughput: 0: 42656.5. Samples: 9829639080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:03:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 06:03:59,843][15401] Updated weights for policy 0, policy_version 599950 (0.0023) [2024-06-24 06:04:02,979][15401] Updated weights for policy 0, policy_version 599960 (0.0041) [2024-06-24 06:04:03,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9829744640. Throughput: 0: 42847.6. Samples: 9829899720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 06:04:03,391][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 06:04:07,304][15401] Updated weights for policy 0, policy_version 599970 (0.0031) [2024-06-24 06:04:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9829941248. Throughput: 0: 42959.0. Samples: 9830038520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 06:04:10,430][15401] Updated weights for policy 0, policy_version 599980 (0.0040) [2024-06-24 06:04:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 9830170624. Throughput: 0: 42913.3. Samples: 9830288580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:13,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 06:04:14,955][15401] Updated weights for policy 0, policy_version 599990 (0.0046) [2024-06-24 06:04:17,959][15401] Updated weights for policy 0, policy_version 600000 (0.0041) [2024-06-24 06:04:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42872.7, 300 sec: 42820.5). Total num frames: 9830400000. Throughput: 0: 42954.9. Samples: 9830546660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:18,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 06:04:22,405][15401] Updated weights for policy 0, policy_version 600010 (0.0030) [2024-06-24 06:04:23,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9830580224. Throughput: 0: 42894.1. Samples: 9830679160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 06:04:25,624][15401] Updated weights for policy 0, policy_version 600020 (0.0033) [2024-06-24 06:04:28,390][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9830825984. Throughput: 0: 42958.7. Samples: 9830932720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 06:04:29,875][15401] Updated weights for policy 0, policy_version 600030 (0.0044) [2024-06-24 06:04:33,183][15401] Updated weights for policy 0, policy_version 600040 (0.0036) [2024-06-24 06:04:33,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 9831055360. Throughput: 0: 42992.3. Samples: 9831190760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:33,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 06:04:37,436][15401] Updated weights for policy 0, policy_version 600050 (0.0036) [2024-06-24 06:04:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42872.1, 300 sec: 42820.6). Total num frames: 9831235584. Throughput: 0: 43065.2. Samples: 9831326020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 06:04:40,805][15401] Updated weights for policy 0, policy_version 600060 (0.0037) [2024-06-24 06:04:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 9831481344. Throughput: 0: 43071.7. Samples: 9831577300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 06:04:45,084][15401] Updated weights for policy 0, policy_version 600070 (0.0031) [2024-06-24 06:04:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9831694336. Throughput: 0: 43108.9. Samples: 9831839620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 06:04:48,651][15401] Updated weights for policy 0, policy_version 600080 (0.0037) [2024-06-24 06:04:52,924][15401] Updated weights for policy 0, policy_version 600090 (0.0026) [2024-06-24 06:04:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 9831890944. Throughput: 0: 42885.8. Samples: 9831968380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 06:04:55,199][15349] Signal inference workers to stop experience collection... (145700 times) [2024-06-24 06:04:55,199][15349] Signal inference workers to resume experience collection... (145700 times) [2024-06-24 06:04:55,215][15401] InferenceWorker_p0-w0: stopping experience collection (145700 times) [2024-06-24 06:04:55,215][15401] InferenceWorker_p0-w0: resuming experience collection (145700 times) [2024-06-24 06:04:56,370][15401] Updated weights for policy 0, policy_version 600100 (0.0036) [2024-06-24 06:04:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 9832120320. Throughput: 0: 43020.4. Samples: 9832224500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:04:58,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 06:05:00,382][15401] Updated weights for policy 0, policy_version 600110 (0.0033) [2024-06-24 06:05:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 9832316928. Throughput: 0: 43154.7. Samples: 9832488620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 06:05:04,187][15401] Updated weights for policy 0, policy_version 600120 (0.0032) [2024-06-24 06:05:08,371][15401] Updated weights for policy 0, policy_version 600130 (0.0045) [2024-06-24 06:05:08,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9832529920. Throughput: 0: 42975.4. Samples: 9832613060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 06:05:11,656][15401] Updated weights for policy 0, policy_version 600140 (0.0033) [2024-06-24 06:05:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 9832775680. Throughput: 0: 43079.1. Samples: 9832871280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 06:05:15,706][15401] Updated weights for policy 0, policy_version 600150 (0.0037) [2024-06-24 06:05:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 9832972288. Throughput: 0: 43138.8. Samples: 9833131900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 06:05:19,335][15401] Updated weights for policy 0, policy_version 600160 (0.0037) [2024-06-24 06:05:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9833168896. Throughput: 0: 42848.4. Samples: 9833254200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 06:05:23,663][15401] Updated weights for policy 0, policy_version 600170 (0.0046) [2024-06-24 06:05:26,946][15401] Updated weights for policy 0, policy_version 600180 (0.0022) [2024-06-24 06:05:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9833431040. Throughput: 0: 43042.1. Samples: 9833514200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 06:05:31,350][15401] Updated weights for policy 0, policy_version 600190 (0.0038) [2024-06-24 06:05:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 9833611264. Throughput: 0: 42787.0. Samples: 9833765040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:33,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 06:05:34,600][15401] Updated weights for policy 0, policy_version 600200 (0.0030) [2024-06-24 06:05:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 9833807872. Throughput: 0: 42793.9. Samples: 9833894100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:05:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 06:05:39,067][15401] Updated weights for policy 0, policy_version 600210 (0.0030) [2024-06-24 06:05:42,225][15401] Updated weights for policy 0, policy_version 600220 (0.0028) [2024-06-24 06:05:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9834053632. Throughput: 0: 42912.9. Samples: 9834155480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:05:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 06:05:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600223_9834053632.pth... [2024-06-24 06:05:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599594_9823748096.pth [2024-06-24 06:05:46,821][15401] Updated weights for policy 0, policy_version 600230 (0.0028) [2024-06-24 06:05:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9834250240. Throughput: 0: 42647.7. Samples: 9834407760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:05:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 06:05:50,025][15401] Updated weights for policy 0, policy_version 600240 (0.0031) [2024-06-24 06:05:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9834463232. Throughput: 0: 42734.4. Samples: 9834536100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:05:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 06:05:54,472][15401] Updated weights for policy 0, policy_version 600250 (0.0041) [2024-06-24 06:05:57,527][15401] Updated weights for policy 0, policy_version 600260 (0.0041) [2024-06-24 06:05:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 9834692608. Throughput: 0: 42798.3. Samples: 9834797200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:05:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 06:06:01,925][15401] Updated weights for policy 0, policy_version 600270 (0.0041) [2024-06-24 06:06:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 9834872832. Throughput: 0: 42851.2. Samples: 9835060200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 06:06:05,145][15401] Updated weights for policy 0, policy_version 600280 (0.0043) [2024-06-24 06:06:08,396][15132] Fps is (10 sec: 39296.6, 60 sec: 42594.0, 300 sec: 42764.1). Total num frames: 9835085824. Throughput: 0: 42706.4. Samples: 9835176260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:08,396][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 06:06:09,458][15401] Updated weights for policy 0, policy_version 600290 (0.0030) [2024-06-24 06:06:12,709][15401] Updated weights for policy 0, policy_version 600300 (0.0029) [2024-06-24 06:06:13,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9835347968. Throughput: 0: 42791.1. Samples: 9835439800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 06:06:17,103][15349] Signal inference workers to stop experience collection... (145750 times) [2024-06-24 06:06:17,105][15349] Signal inference workers to resume experience collection... (145750 times) [2024-06-24 06:06:17,117][15401] InferenceWorker_p0-w0: stopping experience collection (145750 times) [2024-06-24 06:06:17,132][15401] InferenceWorker_p0-w0: resuming experience collection (145750 times) [2024-06-24 06:06:17,263][15401] Updated weights for policy 0, policy_version 600310 (0.0038) [2024-06-24 06:06:18,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 9835528192. Throughput: 0: 42989.0. Samples: 9835699540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 06:06:20,323][15401] Updated weights for policy 0, policy_version 600320 (0.0041) [2024-06-24 06:06:23,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9835724800. Throughput: 0: 42781.8. Samples: 9835819280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 06:06:25,062][15401] Updated weights for policy 0, policy_version 600330 (0.0044) [2024-06-24 06:06:28,033][15401] Updated weights for policy 0, policy_version 600340 (0.0026) [2024-06-24 06:06:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 9835970560. Throughput: 0: 42762.1. Samples: 9836079780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 06:06:32,579][15401] Updated weights for policy 0, policy_version 600350 (0.0025) [2024-06-24 06:06:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 9836167168. Throughput: 0: 42875.6. Samples: 9836337160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 06:06:35,771][15401] Updated weights for policy 0, policy_version 600360 (0.0049) [2024-06-24 06:06:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9836380160. Throughput: 0: 42761.3. Samples: 9836460360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 06:06:40,332][15401] Updated weights for policy 0, policy_version 600370 (0.0038) [2024-06-24 06:06:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 9836609536. Throughput: 0: 42684.9. Samples: 9836718020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 06:06:43,665][15401] Updated weights for policy 0, policy_version 600380 (0.0039) [2024-06-24 06:06:47,953][15401] Updated weights for policy 0, policy_version 600390 (0.0030) [2024-06-24 06:06:48,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 9836806144. Throughput: 0: 42646.3. Samples: 9836979560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:48,397][15132] Avg episode reward: [(0, '0.883')] [2024-06-24 06:06:51,210][15401] Updated weights for policy 0, policy_version 600400 (0.0035) [2024-06-24 06:06:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9837035520. Throughput: 0: 42790.6. Samples: 9837101560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:06:55,628][15401] Updated weights for policy 0, policy_version 600410 (0.0028) [2024-06-24 06:06:58,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9837248512. Throughput: 0: 42744.1. Samples: 9837363280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:06:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 06:06:58,729][15401] Updated weights for policy 0, policy_version 600420 (0.0035) [2024-06-24 06:07:03,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9837428736. Throughput: 0: 42850.0. Samples: 9837627900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:07:03,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 06:07:03,469][15401] Updated weights for policy 0, policy_version 600430 (0.0033) [2024-06-24 06:07:06,472][15401] Updated weights for policy 0, policy_version 600440 (0.0033) [2024-06-24 06:07:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43149.2, 300 sec: 42931.6). Total num frames: 9837674496. Throughput: 0: 42819.1. Samples: 9837746140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:07:08,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 06:07:11,203][15401] Updated weights for policy 0, policy_version 600450 (0.0030) [2024-06-24 06:07:13,390][15132] Fps is (10 sec: 45886.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9837887488. Throughput: 0: 42719.7. Samples: 9838002160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 06:07:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 06:07:14,047][15401] Updated weights for policy 0, policy_version 600460 (0.0040) [2024-06-24 06:07:18,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9838067712. Throughput: 0: 42734.5. Samples: 9838260220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 06:07:18,827][15401] Updated weights for policy 0, policy_version 600470 (0.0032) [2024-06-24 06:07:21,779][15401] Updated weights for policy 0, policy_version 600480 (0.0038) [2024-06-24 06:07:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9838313472. Throughput: 0: 42684.9. Samples: 9838381180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 06:07:26,296][15401] Updated weights for policy 0, policy_version 600490 (0.0036) [2024-06-24 06:07:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9838510080. Throughput: 0: 42722.2. Samples: 9838640520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:28,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 06:07:29,562][15401] Updated weights for policy 0, policy_version 600500 (0.0061) [2024-06-24 06:07:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9838723072. Throughput: 0: 42527.8. Samples: 9838893040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 06:07:34,242][15401] Updated weights for policy 0, policy_version 600510 (0.0038) [2024-06-24 06:07:37,053][15401] Updated weights for policy 0, policy_version 600520 (0.0043) [2024-06-24 06:07:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9838936064. Throughput: 0: 42655.5. Samples: 9839021060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 06:07:41,658][15401] Updated weights for policy 0, policy_version 600530 (0.0029) [2024-06-24 06:07:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 9839165440. Throughput: 0: 42723.7. Samples: 9839285840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 06:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600535_9839165440.pth... [2024-06-24 06:07:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000599907_9828876288.pth [2024-06-24 06:07:44,669][15401] Updated weights for policy 0, policy_version 600540 (0.0037) [2024-06-24 06:07:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 9839378432. Throughput: 0: 42472.5. Samples: 9839539060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 06:07:49,154][15401] Updated weights for policy 0, policy_version 600550 (0.0028) [2024-06-24 06:07:52,756][15401] Updated weights for policy 0, policy_version 600560 (0.0038) [2024-06-24 06:07:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42932.6). Total num frames: 9839591424. Throughput: 0: 42676.8. Samples: 9839666600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:07:57,008][15401] Updated weights for policy 0, policy_version 600570 (0.0038) [2024-06-24 06:07:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9839788032. Throughput: 0: 42753.3. Samples: 9839926060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:07:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 06:08:00,678][15401] Updated weights for policy 0, policy_version 600580 (0.0038) [2024-06-24 06:08:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43419.4, 300 sec: 42931.7). Total num frames: 9840033792. Throughput: 0: 42401.1. Samples: 9840168260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 06:08:04,555][15401] Updated weights for policy 0, policy_version 600590 (0.0030) [2024-06-24 06:08:08,314][15349] Signal inference workers to stop experience collection... (145800 times) [2024-06-24 06:08:08,372][15401] InferenceWorker_p0-w0: stopping experience collection (145800 times) [2024-06-24 06:08:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9840197632. Throughput: 0: 42751.1. Samples: 9840304980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 06:08:08,428][15349] Signal inference workers to resume experience collection... (145800 times) [2024-06-24 06:08:08,429][15401] InferenceWorker_p0-w0: resuming experience collection (145800 times) [2024-06-24 06:08:08,581][15401] Updated weights for policy 0, policy_version 600600 (0.0036) [2024-06-24 06:08:11,998][15401] Updated weights for policy 0, policy_version 600610 (0.0037) [2024-06-24 06:08:13,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.2, 300 sec: 42654.2). Total num frames: 9840410624. Throughput: 0: 42536.4. Samples: 9840554660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 06:08:16,139][15401] Updated weights for policy 0, policy_version 600620 (0.0032) [2024-06-24 06:08:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9840656384. Throughput: 0: 42606.2. Samples: 9840810320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:18,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 06:08:19,899][15401] Updated weights for policy 0, policy_version 600630 (0.0028) [2024-06-24 06:08:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9840852992. Throughput: 0: 42791.8. Samples: 9840946700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 06:08:23,750][15401] Updated weights for policy 0, policy_version 600640 (0.0031) [2024-06-24 06:08:27,417][15401] Updated weights for policy 0, policy_version 600650 (0.0039) [2024-06-24 06:08:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9841065984. Throughput: 0: 42498.6. Samples: 9841198280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 06:08:31,283][15401] Updated weights for policy 0, policy_version 600660 (0.0034) [2024-06-24 06:08:33,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42876.2). Total num frames: 9841311744. Throughput: 0: 42513.5. Samples: 9841452160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 06:08:35,069][15401] Updated weights for policy 0, policy_version 600670 (0.0025) [2024-06-24 06:08:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9841475584. Throughput: 0: 42735.5. Samples: 9841589700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 06:08:39,036][15401] Updated weights for policy 0, policy_version 600680 (0.0036) [2024-06-24 06:08:42,473][15401] Updated weights for policy 0, policy_version 600690 (0.0034) [2024-06-24 06:08:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9841721344. Throughput: 0: 42633.8. Samples: 9841844580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 06:08:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 06:08:47,066][15401] Updated weights for policy 0, policy_version 600700 (0.0027) [2024-06-24 06:08:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 9841934336. Throughput: 0: 42819.9. Samples: 9842095160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:08:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 06:08:50,038][15401] Updated weights for policy 0, policy_version 600710 (0.0032) [2024-06-24 06:08:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9842130944. Throughput: 0: 42719.9. Samples: 9842227380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:08:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 06:08:54,499][15401] Updated weights for policy 0, policy_version 600720 (0.0033) [2024-06-24 06:08:57,956][15401] Updated weights for policy 0, policy_version 600730 (0.0031) [2024-06-24 06:08:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9842360320. Throughput: 0: 42814.2. Samples: 9842481300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:08:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 06:09:02,076][15401] Updated weights for policy 0, policy_version 600740 (0.0032) [2024-06-24 06:09:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9842573312. Throughput: 0: 42983.7. Samples: 9842744580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 06:09:05,315][15401] Updated weights for policy 0, policy_version 600750 (0.0028) [2024-06-24 06:09:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 9842786304. Throughput: 0: 42903.6. Samples: 9842877360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 06:09:10,215][15401] Updated weights for policy 0, policy_version 600760 (0.0031) [2024-06-24 06:09:13,294][15401] Updated weights for policy 0, policy_version 600770 (0.0032) [2024-06-24 06:09:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9843015680. Throughput: 0: 42818.6. Samples: 9843125120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 06:09:17,962][15349] Signal inference workers to stop experience collection... (145850 times) [2024-06-24 06:09:17,970][15349] Signal inference workers to resume experience collection... (145850 times) [2024-06-24 06:09:17,971][15401] Updated weights for policy 0, policy_version 600780 (0.0048) [2024-06-24 06:09:17,994][15401] InferenceWorker_p0-w0: stopping experience collection (145850 times) [2024-06-24 06:09:17,994][15401] InferenceWorker_p0-w0: resuming experience collection (145850 times) [2024-06-24 06:09:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9843212288. Throughput: 0: 43052.8. Samples: 9843389540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 06:09:20,742][15401] Updated weights for policy 0, policy_version 600790 (0.0028) [2024-06-24 06:09:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 9843425280. Throughput: 0: 42701.4. Samples: 9843511360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:23,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 06:09:25,525][15401] Updated weights for policy 0, policy_version 600800 (0.0033) [2024-06-24 06:09:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 9843654656. Throughput: 0: 42655.5. Samples: 9843764080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:28,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 06:09:28,418][15401] Updated weights for policy 0, policy_version 600810 (0.0048) [2024-06-24 06:09:33,115][15401] Updated weights for policy 0, policy_version 600820 (0.0028) [2024-06-24 06:09:33,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 9843851264. Throughput: 0: 43117.3. Samples: 9844035440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 06:09:35,790][15401] Updated weights for policy 0, policy_version 600830 (0.0033) [2024-06-24 06:09:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 9844080640. Throughput: 0: 42768.1. Samples: 9844151940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 06:09:41,010][15401] Updated weights for policy 0, policy_version 600840 (0.0035) [2024-06-24 06:09:43,379][15401] Updated weights for policy 0, policy_version 600850 (0.0028) [2024-06-24 06:09:43,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 9844326400. Throughput: 0: 42801.9. Samples: 9844407380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 06:09:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600850_9844326400.pth... [2024-06-24 06:09:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600223_9834053632.pth [2024-06-24 06:09:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9844473856. Throughput: 0: 42921.3. Samples: 9844676040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 06:09:48,457][15401] Updated weights for policy 0, policy_version 600860 (0.0034) [2024-06-24 06:09:50,791][15401] Updated weights for policy 0, policy_version 600870 (0.0031) [2024-06-24 06:09:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 9844719616. Throughput: 0: 42633.8. Samples: 9844795880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 06:09:56,025][15401] Updated weights for policy 0, policy_version 600880 (0.0041) [2024-06-24 06:09:58,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 9844965376. Throughput: 0: 42933.0. Samples: 9845057100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:09:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 06:09:58,397][15401] Updated weights for policy 0, policy_version 600890 (0.0043) [2024-06-24 06:10:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9845129216. Throughput: 0: 43012.1. Samples: 9845325080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:10:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 06:10:03,531][15401] Updated weights for policy 0, policy_version 600900 (0.0041) [2024-06-24 06:10:06,101][15401] Updated weights for policy 0, policy_version 600910 (0.0035) [2024-06-24 06:10:08,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9845358592. Throughput: 0: 42777.7. Samples: 9845436260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:10:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 06:10:11,066][15401] Updated weights for policy 0, policy_version 600920 (0.0044) [2024-06-24 06:10:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9845587968. Throughput: 0: 43045.4. Samples: 9845701120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:10:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 06:10:13,975][15401] Updated weights for policy 0, policy_version 600930 (0.0044) [2024-06-24 06:10:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9845784576. Throughput: 0: 42744.6. Samples: 9845958940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:10:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 06:10:18,517][15401] Updated weights for policy 0, policy_version 600940 (0.0035) [2024-06-24 06:10:19,515][15349] Signal inference workers to stop experience collection... (145900 times) [2024-06-24 06:10:19,543][15401] InferenceWorker_p0-w0: stopping experience collection (145900 times) [2024-06-24 06:10:19,582][15349] Signal inference workers to resume experience collection... (145900 times) [2024-06-24 06:10:19,582][15401] InferenceWorker_p0-w0: resuming experience collection (145900 times) [2024-06-24 06:10:21,738][15401] Updated weights for policy 0, policy_version 600950 (0.0033) [2024-06-24 06:10:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 9845997568. Throughput: 0: 42863.0. Samples: 9846080780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:23,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 06:10:26,398][15401] Updated weights for policy 0, policy_version 600960 (0.0037) [2024-06-24 06:10:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9846210560. Throughput: 0: 43043.1. Samples: 9846344320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:28,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 06:10:29,313][15401] Updated weights for policy 0, policy_version 600970 (0.0039) [2024-06-24 06:10:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9846407168. Throughput: 0: 42633.8. Samples: 9846594560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 06:10:34,013][15401] Updated weights for policy 0, policy_version 600980 (0.0033) [2024-06-24 06:10:37,275][15401] Updated weights for policy 0, policy_version 600990 (0.0034) [2024-06-24 06:10:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9846652928. Throughput: 0: 42778.8. Samples: 9846720920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 06:10:41,635][15401] Updated weights for policy 0, policy_version 601000 (0.0027) [2024-06-24 06:10:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 9846833152. Throughput: 0: 42700.4. Samples: 9846978620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:43,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 06:10:44,745][15401] Updated weights for policy 0, policy_version 601010 (0.0030) [2024-06-24 06:10:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9847062528. Throughput: 0: 42419.8. Samples: 9847233980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:48,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 06:10:49,244][15401] Updated weights for policy 0, policy_version 601020 (0.0031) [2024-06-24 06:10:52,302][15401] Updated weights for policy 0, policy_version 601030 (0.0034) [2024-06-24 06:10:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9847291904. Throughput: 0: 42814.3. Samples: 9847362900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:53,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-24 06:10:56,886][15401] Updated weights for policy 0, policy_version 601040 (0.0032) [2024-06-24 06:10:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 9847472128. Throughput: 0: 42483.1. Samples: 9847612860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:10:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 06:11:00,182][15401] Updated weights for policy 0, policy_version 601050 (0.0025) [2024-06-24 06:11:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 9847701504. Throughput: 0: 42517.7. Samples: 9847872240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 06:11:04,464][15401] Updated weights for policy 0, policy_version 601060 (0.0034) [2024-06-24 06:11:07,969][15401] Updated weights for policy 0, policy_version 601070 (0.0037) [2024-06-24 06:11:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 9847930880. Throughput: 0: 42654.4. Samples: 9848000220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 06:11:12,537][15401] Updated weights for policy 0, policy_version 601080 (0.0029) [2024-06-24 06:11:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 9848111104. Throughput: 0: 42489.6. Samples: 9848256360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 06:11:15,614][15401] Updated weights for policy 0, policy_version 601090 (0.0034) [2024-06-24 06:11:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9848324096. Throughput: 0: 42747.6. Samples: 9848518200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 06:11:20,069][15401] Updated weights for policy 0, policy_version 601100 (0.0040) [2024-06-24 06:11:23,283][15401] Updated weights for policy 0, policy_version 601110 (0.0038) [2024-06-24 06:11:23,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9848586240. Throughput: 0: 42734.9. Samples: 9848644000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 06:11:27,673][15401] Updated weights for policy 0, policy_version 601120 (0.0036) [2024-06-24 06:11:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9848766464. Throughput: 0: 42600.3. Samples: 9848895640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 06:11:28,728][15349] Signal inference workers to stop experience collection... (145950 times) [2024-06-24 06:11:28,729][15349] Signal inference workers to resume experience collection... (145950 times) [2024-06-24 06:11:28,739][15401] InferenceWorker_p0-w0: stopping experience collection (145950 times) [2024-06-24 06:11:28,739][15401] InferenceWorker_p0-w0: resuming experience collection (145950 times) [2024-06-24 06:11:30,839][15401] Updated weights for policy 0, policy_version 601130 (0.0034) [2024-06-24 06:11:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9848963072. Throughput: 0: 42718.3. Samples: 9849156300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 06:11:35,314][15401] Updated weights for policy 0, policy_version 601140 (0.0028) [2024-06-24 06:11:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9849225216. Throughput: 0: 42654.3. Samples: 9849282340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:38,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 06:11:38,447][15401] Updated weights for policy 0, policy_version 601150 (0.0025) [2024-06-24 06:11:43,083][15401] Updated weights for policy 0, policy_version 601160 (0.0043) [2024-06-24 06:11:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42765.9). Total num frames: 9849421824. Throughput: 0: 42887.8. Samples: 9849542820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 06:11:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601161_9849421824.pth... [2024-06-24 06:11:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600535_9839165440.pth [2024-06-24 06:11:46,652][15401] Updated weights for policy 0, policy_version 601170 (0.0046) [2024-06-24 06:11:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9849618432. Throughput: 0: 42631.5. Samples: 9849790660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 06:11:50,665][15401] Updated weights for policy 0, policy_version 601180 (0.0038) [2024-06-24 06:11:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9849864192. Throughput: 0: 42581.3. Samples: 9849916380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:11:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 06:11:54,160][15401] Updated weights for policy 0, policy_version 601190 (0.0043) [2024-06-24 06:11:58,244][15401] Updated weights for policy 0, policy_version 601200 (0.0029) [2024-06-24 06:11:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 9850060800. Throughput: 0: 42639.6. Samples: 9850175140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:11:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 06:12:01,665][15401] Updated weights for policy 0, policy_version 601210 (0.0033) [2024-06-24 06:12:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9850273792. Throughput: 0: 42510.6. Samples: 9850431180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 06:12:05,783][15401] Updated weights for policy 0, policy_version 601220 (0.0033) [2024-06-24 06:12:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9850503168. Throughput: 0: 42656.1. Samples: 9850563520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:08,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 06:12:09,498][15401] Updated weights for policy 0, policy_version 601230 (0.0040) [2024-06-24 06:12:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9850699776. Throughput: 0: 42760.5. Samples: 9850819860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 06:12:13,791][15401] Updated weights for policy 0, policy_version 601240 (0.0034) [2024-06-24 06:12:17,309][15401] Updated weights for policy 0, policy_version 601250 (0.0043) [2024-06-24 06:12:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 9850912768. Throughput: 0: 42776.4. Samples: 9851081240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:18,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-24 06:12:21,280][15401] Updated weights for policy 0, policy_version 601260 (0.0045) [2024-06-24 06:12:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9851125760. Throughput: 0: 42806.9. Samples: 9851208660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 06:12:24,995][15401] Updated weights for policy 0, policy_version 601270 (0.0037) [2024-06-24 06:12:28,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 9851338752. Throughput: 0: 42676.2. Samples: 9851463520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:28,396][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 06:12:28,835][15401] Updated weights for policy 0, policy_version 601280 (0.0038) [2024-06-24 06:12:32,678][15401] Updated weights for policy 0, policy_version 601290 (0.0035) [2024-06-24 06:12:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9851551744. Throughput: 0: 42806.3. Samples: 9851716940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 06:12:36,339][15401] Updated weights for policy 0, policy_version 601300 (0.0050) [2024-06-24 06:12:38,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9851764736. Throughput: 0: 42879.5. Samples: 9851845960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 06:12:40,373][15401] Updated weights for policy 0, policy_version 601310 (0.0049) [2024-06-24 06:12:43,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 9851977728. Throughput: 0: 42947.3. Samples: 9852107860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:43,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 06:12:43,892][15401] Updated weights for policy 0, policy_version 601320 (0.0032) [2024-06-24 06:12:48,057][15401] Updated weights for policy 0, policy_version 601330 (0.0035) [2024-06-24 06:12:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9852207104. Throughput: 0: 42723.4. Samples: 9852353740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 06:12:51,965][15401] Updated weights for policy 0, policy_version 601340 (0.0040) [2024-06-24 06:12:53,389][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9852403712. Throughput: 0: 42737.0. Samples: 9852486680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 06:12:55,611][15401] Updated weights for policy 0, policy_version 601350 (0.0043) [2024-06-24 06:12:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 9852616704. Throughput: 0: 42644.0. Samples: 9852738840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:12:58,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 06:12:59,877][15401] Updated weights for policy 0, policy_version 601360 (0.0047) [2024-06-24 06:13:03,355][15401] Updated weights for policy 0, policy_version 601370 (0.0032) [2024-06-24 06:13:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9852846080. Throughput: 0: 42573.8. Samples: 9852997060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 06:13:07,471][15401] Updated weights for policy 0, policy_version 601380 (0.0046) [2024-06-24 06:13:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 9853042688. Throughput: 0: 42603.2. Samples: 9853125800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 06:13:10,572][15349] Signal inference workers to stop experience collection... (146000 times) [2024-06-24 06:13:10,574][15349] Signal inference workers to resume experience collection... (146000 times) [2024-06-24 06:13:10,609][15401] InferenceWorker_p0-w0: stopping experience collection (146000 times) [2024-06-24 06:13:10,609][15401] InferenceWorker_p0-w0: resuming experience collection (146000 times) [2024-06-24 06:13:11,023][15401] Updated weights for policy 0, policy_version 601390 (0.0031) [2024-06-24 06:13:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9853239296. Throughput: 0: 42565.2. Samples: 9853378680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:13,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-24 06:13:14,986][15401] Updated weights for policy 0, policy_version 601400 (0.0039) [2024-06-24 06:13:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9853452288. Throughput: 0: 42607.1. Samples: 9853634260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 06:13:18,826][15401] Updated weights for policy 0, policy_version 601410 (0.0038) [2024-06-24 06:13:22,691][15401] Updated weights for policy 0, policy_version 601420 (0.0054) [2024-06-24 06:13:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9853665280. Throughput: 0: 42551.1. Samples: 9853760760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 06:13:26,541][15401] Updated weights for policy 0, policy_version 601430 (0.0027) [2024-06-24 06:13:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42601.2, 300 sec: 42653.6). Total num frames: 9853894656. Throughput: 0: 42290.5. Samples: 9854010940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 06:13:28,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 06:13:30,897][15401] Updated weights for policy 0, policy_version 601440 (0.0040) [2024-06-24 06:13:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9854091264. Throughput: 0: 42479.7. Samples: 9854265320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 06:13:34,198][15401] Updated weights for policy 0, policy_version 601450 (0.0030) [2024-06-24 06:13:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9854304256. Throughput: 0: 42331.0. Samples: 9854391580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 06:13:38,439][15401] Updated weights for policy 0, policy_version 601460 (0.0032) [2024-06-24 06:13:41,901][15401] Updated weights for policy 0, policy_version 601470 (0.0024) [2024-06-24 06:13:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 9854517248. Throughput: 0: 42329.7. Samples: 9854643680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 06:13:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601473_9854533632.pth... [2024-06-24 06:13:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000600850_9844326400.pth [2024-06-24 06:13:45,944][15401] Updated weights for policy 0, policy_version 601480 (0.0035) [2024-06-24 06:13:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9854746624. Throughput: 0: 42395.5. Samples: 9854904860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 06:13:49,630][15401] Updated weights for policy 0, policy_version 601490 (0.0027) [2024-06-24 06:13:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9854943232. Throughput: 0: 42418.5. Samples: 9855034640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 06:13:53,959][15401] Updated weights for policy 0, policy_version 601500 (0.0035) [2024-06-24 06:13:57,407][15401] Updated weights for policy 0, policy_version 601510 (0.0031) [2024-06-24 06:13:58,393][15132] Fps is (10 sec: 40947.5, 60 sec: 42323.1, 300 sec: 42653.5). Total num frames: 9855156224. Throughput: 0: 42466.8. Samples: 9855289820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:13:58,393][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 06:14:01,516][15401] Updated weights for policy 0, policy_version 601520 (0.0040) [2024-06-24 06:14:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9855385600. Throughput: 0: 42413.8. Samples: 9855542880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 06:14:05,063][15401] Updated weights for policy 0, policy_version 601530 (0.0035) [2024-06-24 06:14:08,392][15132] Fps is (10 sec: 44240.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 9855598592. Throughput: 0: 42486.2. Samples: 9855672740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:08,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 06:14:08,999][15401] Updated weights for policy 0, policy_version 601540 (0.0048) [2024-06-24 06:14:12,926][15401] Updated weights for policy 0, policy_version 601550 (0.0035) [2024-06-24 06:14:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 9855795200. Throughput: 0: 42570.6. Samples: 9855926520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 06:14:16,860][15401] Updated weights for policy 0, policy_version 601560 (0.0035) [2024-06-24 06:14:18,392][15132] Fps is (10 sec: 40959.7, 60 sec: 42596.6, 300 sec: 42653.9). Total num frames: 9856008192. Throughput: 0: 42529.7. Samples: 9856179260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:18,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 06:14:20,935][15401] Updated weights for policy 0, policy_version 601570 (0.0043) [2024-06-24 06:14:21,994][15349] Signal inference workers to stop experience collection... (146050 times) [2024-06-24 06:14:21,994][15349] Signal inference workers to resume experience collection... (146050 times) [2024-06-24 06:14:22,012][15401] InferenceWorker_p0-w0: stopping experience collection (146050 times) [2024-06-24 06:14:22,042][15401] InferenceWorker_p0-w0: resuming experience collection (146050 times) [2024-06-24 06:14:23,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9856237568. Throughput: 0: 42617.9. Samples: 9856309380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 06:14:24,525][15401] Updated weights for policy 0, policy_version 601580 (0.0028) [2024-06-24 06:14:28,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 9856450560. Throughput: 0: 42715.2. Samples: 9856565860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 06:14:28,397][15401] Updated weights for policy 0, policy_version 601590 (0.0038) [2024-06-24 06:14:32,000][15401] Updated weights for policy 0, policy_version 601600 (0.0032) [2024-06-24 06:14:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9856663552. Throughput: 0: 42505.1. Samples: 9856817580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 06:14:35,955][15401] Updated weights for policy 0, policy_version 601610 (0.0028) [2024-06-24 06:14:38,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 9856876544. Throughput: 0: 42510.8. Samples: 9856947620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 06:14:39,559][15401] Updated weights for policy 0, policy_version 601620 (0.0039) [2024-06-24 06:14:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9857073152. Throughput: 0: 42684.3. Samples: 9857210480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 06:14:44,076][15401] Updated weights for policy 0, policy_version 601630 (0.0037) [2024-06-24 06:14:47,005][15401] Updated weights for policy 0, policy_version 601640 (0.0030) [2024-06-24 06:14:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 9857286144. Throughput: 0: 42644.0. Samples: 9857461860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 06:14:51,764][15401] Updated weights for policy 0, policy_version 601650 (0.0037) [2024-06-24 06:14:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 9857515520. Throughput: 0: 42660.4. Samples: 9857592360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 06:14:54,560][15401] Updated weights for policy 0, policy_version 601660 (0.0032) [2024-06-24 06:14:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.7, 300 sec: 42653.9). Total num frames: 9857712128. Throughput: 0: 42854.0. Samples: 9857854940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 06:14:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 06:14:59,392][15401] Updated weights for policy 0, policy_version 601670 (0.0029) [2024-06-24 06:15:02,201][15401] Updated weights for policy 0, policy_version 601680 (0.0036) [2024-06-24 06:15:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9857941504. Throughput: 0: 42649.4. Samples: 9858098380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:03,399][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 06:15:07,133][15401] Updated weights for policy 0, policy_version 601690 (0.0044) [2024-06-24 06:15:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 9858154496. Throughput: 0: 42779.1. Samples: 9858234440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 06:15:09,731][15401] Updated weights for policy 0, policy_version 601700 (0.0029) [2024-06-24 06:15:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9858351104. Throughput: 0: 42864.3. Samples: 9858494760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 06:15:14,630][15401] Updated weights for policy 0, policy_version 601710 (0.0040) [2024-06-24 06:15:17,441][15401] Updated weights for policy 0, policy_version 601720 (0.0026) [2024-06-24 06:15:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 9858596864. Throughput: 0: 42830.6. Samples: 9858744960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 06:15:22,346][15401] Updated weights for policy 0, policy_version 601730 (0.0042) [2024-06-24 06:15:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9858826240. Throughput: 0: 43043.2. Samples: 9858884560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 06:15:25,065][15401] Updated weights for policy 0, policy_version 601740 (0.0038) [2024-06-24 06:15:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9858990080. Throughput: 0: 42745.7. Samples: 9859134040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 06:15:30,017][15401] Updated weights for policy 0, policy_version 601750 (0.0032) [2024-06-24 06:15:33,171][15401] Updated weights for policy 0, policy_version 601760 (0.0028) [2024-06-24 06:15:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9859235840. Throughput: 0: 42644.9. Samples: 9859380880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 06:15:37,627][15401] Updated weights for policy 0, policy_version 601770 (0.0051) [2024-06-24 06:15:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9859448832. Throughput: 0: 42763.1. Samples: 9859516700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 06:15:40,856][15349] Signal inference workers to stop experience collection... (146100 times) [2024-06-24 06:15:40,864][15349] Signal inference workers to resume experience collection... (146100 times) [2024-06-24 06:15:40,914][15401] InferenceWorker_p0-w0: stopping experience collection (146100 times) [2024-06-24 06:15:40,914][15401] InferenceWorker_p0-w0: resuming experience collection (146100 times) [2024-06-24 06:15:40,990][15401] Updated weights for policy 0, policy_version 601780 (0.0040) [2024-06-24 06:15:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9859629056. Throughput: 0: 42653.2. Samples: 9859774340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 06:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601784_9859629056.pth... [2024-06-24 06:15:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601161_9849421824.pth [2024-06-24 06:15:45,140][15401] Updated weights for policy 0, policy_version 601790 (0.0033) [2024-06-24 06:15:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 9859858432. Throughput: 0: 42881.9. Samples: 9860028060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 06:15:48,635][15401] Updated weights for policy 0, policy_version 601800 (0.0043) [2024-06-24 06:15:52,806][15401] Updated weights for policy 0, policy_version 601810 (0.0046) [2024-06-24 06:15:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9860104192. Throughput: 0: 42789.3. Samples: 9860159960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 06:15:56,363][15401] Updated weights for policy 0, policy_version 601820 (0.0037) [2024-06-24 06:15:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9860284416. Throughput: 0: 42635.6. Samples: 9860413360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:15:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 06:16:00,383][15401] Updated weights for policy 0, policy_version 601830 (0.0044) [2024-06-24 06:16:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 9860481024. Throughput: 0: 42716.1. Samples: 9860667180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 06:16:04,262][15401] Updated weights for policy 0, policy_version 601840 (0.0026) [2024-06-24 06:16:07,933][15401] Updated weights for policy 0, policy_version 601850 (0.0034) [2024-06-24 06:16:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 9860743168. Throughput: 0: 42558.5. Samples: 9860799700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 06:16:11,931][15401] Updated weights for policy 0, policy_version 601860 (0.0027) [2024-06-24 06:16:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9860923392. Throughput: 0: 42647.1. Samples: 9861053160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:16:15,690][15401] Updated weights for policy 0, policy_version 601870 (0.0041) [2024-06-24 06:16:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9861136384. Throughput: 0: 42835.9. Samples: 9861308500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:18,395][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 06:16:19,538][15401] Updated weights for policy 0, policy_version 601880 (0.0042) [2024-06-24 06:16:23,189][15401] Updated weights for policy 0, policy_version 601890 (0.0029) [2024-06-24 06:16:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 9861365760. Throughput: 0: 42764.4. Samples: 9861441200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:23,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 06:16:27,322][15401] Updated weights for policy 0, policy_version 601900 (0.0026) [2024-06-24 06:16:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9861562368. Throughput: 0: 42765.4. Samples: 9861698780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 06:16:30,849][15401] Updated weights for policy 0, policy_version 601910 (0.0037) [2024-06-24 06:16:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9861791744. Throughput: 0: 42805.4. Samples: 9861954300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 06:16:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 06:16:34,922][15401] Updated weights for policy 0, policy_version 601920 (0.0047) [2024-06-24 06:16:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9862004736. Throughput: 0: 42918.7. Samples: 9862091300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:16:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 06:16:38,410][15401] Updated weights for policy 0, policy_version 601930 (0.0037) [2024-06-24 06:16:39,915][15349] Signal inference workers to stop experience collection... (146150 times) [2024-06-24 06:16:39,916][15349] Signal inference workers to resume experience collection... (146150 times) [2024-06-24 06:16:39,932][15401] InferenceWorker_p0-w0: stopping experience collection (146150 times) [2024-06-24 06:16:39,932][15401] InferenceWorker_p0-w0: resuming experience collection (146150 times) [2024-06-24 06:16:42,577][15401] Updated weights for policy 0, policy_version 601940 (0.0031) [2024-06-24 06:16:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 9862201344. Throughput: 0: 42987.1. Samples: 9862347780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:16:43,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-24 06:16:46,014][15401] Updated weights for policy 0, policy_version 601950 (0.0034) [2024-06-24 06:16:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 9862447104. Throughput: 0: 42902.8. Samples: 9862597820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:16:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 06:16:50,321][15401] Updated weights for policy 0, policy_version 601960 (0.0037) [2024-06-24 06:16:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9862643712. Throughput: 0: 42969.5. Samples: 9862733320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:16:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 06:16:53,763][15401] Updated weights for policy 0, policy_version 601970 (0.0038) [2024-06-24 06:16:58,149][15401] Updated weights for policy 0, policy_version 601980 (0.0033) [2024-06-24 06:16:58,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9862840320. Throughput: 0: 43012.1. Samples: 9862988700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:16:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 06:17:01,318][15401] Updated weights for policy 0, policy_version 601990 (0.0034) [2024-06-24 06:17:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 9863086080. Throughput: 0: 42880.5. Samples: 9863238120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 06:17:05,757][15401] Updated weights for policy 0, policy_version 602000 (0.0035) [2024-06-24 06:17:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9863282688. Throughput: 0: 42974.3. Samples: 9863374940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 06:17:08,837][15401] Updated weights for policy 0, policy_version 602010 (0.0030) [2024-06-24 06:17:13,303][15401] Updated weights for policy 0, policy_version 602020 (0.0033) [2024-06-24 06:17:13,390][15132] Fps is (10 sec: 40957.7, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 9863495680. Throughput: 0: 42902.2. Samples: 9863629400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:13,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 06:17:16,483][15401] Updated weights for policy 0, policy_version 602030 (0.0042) [2024-06-24 06:17:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9863741440. Throughput: 0: 42747.5. Samples: 9863877940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 06:17:20,824][15401] Updated weights for policy 0, policy_version 602040 (0.0033) [2024-06-24 06:17:23,389][15132] Fps is (10 sec: 40962.5, 60 sec: 42327.1, 300 sec: 42599.3). Total num frames: 9863905280. Throughput: 0: 42721.4. Samples: 9864013760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 06:17:24,218][15401] Updated weights for policy 0, policy_version 602050 (0.0034) [2024-06-24 06:17:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9864134656. Throughput: 0: 42771.1. Samples: 9864272480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:28,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 06:17:28,489][15401] Updated weights for policy 0, policy_version 602060 (0.0044) [2024-06-24 06:17:31,784][15401] Updated weights for policy 0, policy_version 602070 (0.0035) [2024-06-24 06:17:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9864380416. Throughput: 0: 42730.9. Samples: 9864520700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 06:17:36,063][15401] Updated weights for policy 0, policy_version 602080 (0.0033) [2024-06-24 06:17:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 9864560640. Throughput: 0: 42684.1. Samples: 9864654100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 06:17:39,341][15401] Updated weights for policy 0, policy_version 602090 (0.0038) [2024-06-24 06:17:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9864773632. Throughput: 0: 42691.2. Samples: 9864909800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 06:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602099_9864790016.pth... [2024-06-24 06:17:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601473_9854533632.pth [2024-06-24 06:17:43,640][15401] Updated weights for policy 0, policy_version 602100 (0.0047) [2024-06-24 06:17:47,134][15401] Updated weights for policy 0, policy_version 602110 (0.0032) [2024-06-24 06:17:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9865003008. Throughput: 0: 42731.6. Samples: 9865161040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 06:17:51,152][15401] Updated weights for policy 0, policy_version 602120 (0.0040) [2024-06-24 06:17:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9865199616. Throughput: 0: 42641.3. Samples: 9865293800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 06:17:54,872][15401] Updated weights for policy 0, policy_version 602130 (0.0039) [2024-06-24 06:17:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9865428992. Throughput: 0: 42632.0. Samples: 9865547820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:17:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 06:17:58,521][15349] Signal inference workers to stop experience collection... (146200 times) [2024-06-24 06:17:58,529][15349] Signal inference workers to resume experience collection... (146200 times) [2024-06-24 06:17:58,550][15401] InferenceWorker_p0-w0: stopping experience collection (146200 times) [2024-06-24 06:17:58,550][15401] InferenceWorker_p0-w0: resuming experience collection (146200 times) [2024-06-24 06:17:58,688][15401] Updated weights for policy 0, policy_version 602140 (0.0043) [2024-06-24 06:18:02,518][15401] Updated weights for policy 0, policy_version 602150 (0.0023) [2024-06-24 06:18:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9865658368. Throughput: 0: 42766.7. Samples: 9865802440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:18:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 06:18:06,761][15401] Updated weights for policy 0, policy_version 602160 (0.0025) [2024-06-24 06:18:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9865838592. Throughput: 0: 42717.8. Samples: 9865936060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 06:18:08,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 06:18:10,215][15401] Updated weights for policy 0, policy_version 602170 (0.0034) [2024-06-24 06:18:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 9866051584. Throughput: 0: 42543.5. Samples: 9866186940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 06:18:14,572][15401] Updated weights for policy 0, policy_version 602180 (0.0021) [2024-06-24 06:18:17,837][15401] Updated weights for policy 0, policy_version 602190 (0.0029) [2024-06-24 06:18:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9866297344. Throughput: 0: 42739.1. Samples: 9866443960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 06:18:22,076][15401] Updated weights for policy 0, policy_version 602200 (0.0027) [2024-06-24 06:18:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 9866477568. Throughput: 0: 42784.0. Samples: 9866579380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 06:18:25,807][15401] Updated weights for policy 0, policy_version 602210 (0.0034) [2024-06-24 06:18:28,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9866706944. Throughput: 0: 42622.1. Samples: 9866827900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:28,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 06:18:29,705][15401] Updated weights for policy 0, policy_version 602220 (0.0029) [2024-06-24 06:18:33,283][15401] Updated weights for policy 0, policy_version 602230 (0.0026) [2024-06-24 06:18:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9866936320. Throughput: 0: 42884.5. Samples: 9867090840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 06:18:37,055][15401] Updated weights for policy 0, policy_version 602240 (0.0020) [2024-06-24 06:18:38,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9867132928. Throughput: 0: 42872.6. Samples: 9867223060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 06:18:41,057][15401] Updated weights for policy 0, policy_version 602250 (0.0022) [2024-06-24 06:18:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9867345920. Throughput: 0: 42869.8. Samples: 9867476960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 06:18:44,975][15401] Updated weights for policy 0, policy_version 602260 (0.0038) [2024-06-24 06:18:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9867558912. Throughput: 0: 42785.8. Samples: 9867727800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 06:18:48,828][15401] Updated weights for policy 0, policy_version 602270 (0.0030) [2024-06-24 06:18:52,650][15401] Updated weights for policy 0, policy_version 602280 (0.0030) [2024-06-24 06:18:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.5). Total num frames: 9867771904. Throughput: 0: 42684.0. Samples: 9867856840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 06:18:56,483][15401] Updated weights for policy 0, policy_version 602290 (0.0033) [2024-06-24 06:18:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9868001280. Throughput: 0: 42775.9. Samples: 9868111960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:18:58,393][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:19:00,300][15401] Updated weights for policy 0, policy_version 602300 (0.0033) [2024-06-24 06:19:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 9868197888. Throughput: 0: 42752.9. Samples: 9868367840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 06:19:04,104][15401] Updated weights for policy 0, policy_version 602310 (0.0031) [2024-06-24 06:19:07,802][15401] Updated weights for policy 0, policy_version 602320 (0.0039) [2024-06-24 06:19:08,390][15132] Fps is (10 sec: 40968.6, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 9868410880. Throughput: 0: 42565.3. Samples: 9868494840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 06:19:12,004][15401] Updated weights for policy 0, policy_version 602330 (0.0030) [2024-06-24 06:19:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42876.5). Total num frames: 9868656640. Throughput: 0: 42850.0. Samples: 9868756040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 06:19:15,383][15401] Updated weights for policy 0, policy_version 602340 (0.0036) [2024-06-24 06:19:18,389][15132] Fps is (10 sec: 42600.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9868836864. Throughput: 0: 42660.9. Samples: 9869010580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 06:19:19,741][15401] Updated weights for policy 0, policy_version 602350 (0.0035) [2024-06-24 06:19:23,345][15401] Updated weights for policy 0, policy_version 602360 (0.0031) [2024-06-24 06:19:23,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9869066240. Throughput: 0: 42505.2. Samples: 9869135800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 06:19:27,444][15401] Updated weights for policy 0, policy_version 602370 (0.0040) [2024-06-24 06:19:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 9869295616. Throughput: 0: 42871.7. Samples: 9869406180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 06:19:30,854][15401] Updated weights for policy 0, policy_version 602380 (0.0043) [2024-06-24 06:19:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9869492224. Throughput: 0: 42911.7. Samples: 9869658820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 06:19:35,139][15401] Updated weights for policy 0, policy_version 602390 (0.0032) [2024-06-24 06:19:35,231][15349] Signal inference workers to stop experience collection... (146250 times) [2024-06-24 06:19:35,262][15401] InferenceWorker_p0-w0: stopping experience collection (146250 times) [2024-06-24 06:19:35,287][15349] Signal inference workers to resume experience collection... (146250 times) [2024-06-24 06:19:35,287][15401] InferenceWorker_p0-w0: resuming experience collection (146250 times) [2024-06-24 06:19:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9869705216. Throughput: 0: 42822.7. Samples: 9869783860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 06:19:38,416][15401] Updated weights for policy 0, policy_version 602400 (0.0030) [2024-06-24 06:19:42,512][15401] Updated weights for policy 0, policy_version 602410 (0.0034) [2024-06-24 06:19:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9869950976. Throughput: 0: 43230.8. Samples: 9870057240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-24 06:19:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 06:19:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602414_9869950976.pth... [2024-06-24 06:19:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000601784_9859629056.pth [2024-06-24 06:19:45,986][15401] Updated weights for policy 0, policy_version 602420 (0.0045) [2024-06-24 06:19:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 9870147584. Throughput: 0: 43181.4. Samples: 9870311000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:19:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 06:19:50,098][15401] Updated weights for policy 0, policy_version 602430 (0.0027) [2024-06-24 06:19:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9870344192. Throughput: 0: 43081.8. Samples: 9870433500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:19:53,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 06:19:53,641][15401] Updated weights for policy 0, policy_version 602440 (0.0031) [2024-06-24 06:19:57,596][15401] Updated weights for policy 0, policy_version 602450 (0.0024) [2024-06-24 06:19:58,392][15132] Fps is (10 sec: 40949.4, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 9870557184. Throughput: 0: 43106.5. Samples: 9870695940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:19:58,393][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 06:20:01,291][15401] Updated weights for policy 0, policy_version 602460 (0.0044) [2024-06-24 06:20:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9870786560. Throughput: 0: 43059.1. Samples: 9870948240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 06:20:05,181][15401] Updated weights for policy 0, policy_version 602470 (0.0041) [2024-06-24 06:20:08,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43144.9, 300 sec: 42876.1). Total num frames: 9870999552. Throughput: 0: 43143.7. Samples: 9871077260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 06:20:09,011][15401] Updated weights for policy 0, policy_version 602480 (0.0034) [2024-06-24 06:20:13,182][15401] Updated weights for policy 0, policy_version 602490 (0.0027) [2024-06-24 06:20:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9871196160. Throughput: 0: 42824.0. Samples: 9871333260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 06:20:17,017][15401] Updated weights for policy 0, policy_version 602500 (0.0038) [2024-06-24 06:20:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9871441920. Throughput: 0: 42782.7. Samples: 9871584040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 06:20:20,771][15401] Updated weights for policy 0, policy_version 602510 (0.0033) [2024-06-24 06:20:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9871638528. Throughput: 0: 42933.7. Samples: 9871715880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 06:20:24,551][15401] Updated weights for policy 0, policy_version 602520 (0.0036) [2024-06-24 06:20:28,292][15401] Updated weights for policy 0, policy_version 602530 (0.0024) [2024-06-24 06:20:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9871851520. Throughput: 0: 42619.6. Samples: 9871975120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 06:20:32,053][15401] Updated weights for policy 0, policy_version 602540 (0.0034) [2024-06-24 06:20:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 9872080896. Throughput: 0: 42653.1. Samples: 9872230500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:33,392][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 06:20:35,865][15401] Updated weights for policy 0, policy_version 602550 (0.0034) [2024-06-24 06:20:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 9872293888. Throughput: 0: 42771.1. Samples: 9872358200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 06:20:39,474][15401] Updated weights for policy 0, policy_version 602560 (0.0035) [2024-06-24 06:20:43,377][15401] Updated weights for policy 0, policy_version 602570 (0.0031) [2024-06-24 06:20:43,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9872506880. Throughput: 0: 42724.9. Samples: 9872618460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 06:20:47,081][15401] Updated weights for policy 0, policy_version 602580 (0.0038) [2024-06-24 06:20:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9872703488. Throughput: 0: 42931.5. Samples: 9872880160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:48,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 06:20:51,235][15401] Updated weights for policy 0, policy_version 602590 (0.0032) [2024-06-24 06:20:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9872916480. Throughput: 0: 42866.2. Samples: 9873006240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 06:20:54,762][15401] Updated weights for policy 0, policy_version 602600 (0.0041) [2024-06-24 06:20:55,976][15349] Signal inference workers to stop experience collection... (146300 times) [2024-06-24 06:20:56,003][15401] InferenceWorker_p0-w0: stopping experience collection (146300 times) [2024-06-24 06:20:56,039][15349] Signal inference workers to resume experience collection... (146300 times) [2024-06-24 06:20:56,039][15401] InferenceWorker_p0-w0: resuming experience collection (146300 times) [2024-06-24 06:20:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 9873145856. Throughput: 0: 42872.0. Samples: 9873262500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:20:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 06:20:58,652][15401] Updated weights for policy 0, policy_version 602610 (0.0030) [2024-06-24 06:21:02,581][15401] Updated weights for policy 0, policy_version 602620 (0.0041) [2024-06-24 06:21:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9873342464. Throughput: 0: 43020.5. Samples: 9873519960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:21:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 06:21:06,187][15401] Updated weights for policy 0, policy_version 602630 (0.0030) [2024-06-24 06:21:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 9873555456. Throughput: 0: 42791.4. Samples: 9873641600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:21:08,392][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 06:21:10,199][15401] Updated weights for policy 0, policy_version 602640 (0.0037) [2024-06-24 06:21:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9873784832. Throughput: 0: 42808.5. Samples: 9873901500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:21:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 06:21:13,904][15401] Updated weights for policy 0, policy_version 602650 (0.0035) [2024-06-24 06:21:17,843][15401] Updated weights for policy 0, policy_version 602660 (0.0040) [2024-06-24 06:21:18,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9873997824. Throughput: 0: 42870.4. Samples: 9874159560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 06:21:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 06:21:21,672][15401] Updated weights for policy 0, policy_version 602670 (0.0033) [2024-06-24 06:21:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 9874227200. Throughput: 0: 42887.6. Samples: 9874288140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 06:21:25,530][15401] Updated weights for policy 0, policy_version 602680 (0.0031) [2024-06-24 06:21:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9874423808. Throughput: 0: 42876.5. Samples: 9874547900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 06:21:29,152][15401] Updated weights for policy 0, policy_version 602690 (0.0038) [2024-06-24 06:21:32,980][15401] Updated weights for policy 0, policy_version 602700 (0.0023) [2024-06-24 06:21:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9874636800. Throughput: 0: 42771.2. Samples: 9874804860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 06:21:36,796][15401] Updated weights for policy 0, policy_version 602710 (0.0030) [2024-06-24 06:21:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9874882560. Throughput: 0: 42743.9. Samples: 9874929720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 06:21:41,063][15401] Updated weights for policy 0, policy_version 602720 (0.0031) [2024-06-24 06:21:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9875079168. Throughput: 0: 42867.1. Samples: 9875191520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 06:21:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602727_9875079168.pth... [2024-06-24 06:21:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602099_9864790016.pth [2024-06-24 06:21:44,525][15401] Updated weights for policy 0, policy_version 602730 (0.0031) [2024-06-24 06:21:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9875259392. Throughput: 0: 42687.5. Samples: 9875440900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 06:21:48,987][15401] Updated weights for policy 0, policy_version 602740 (0.0036) [2024-06-24 06:21:52,198][15401] Updated weights for policy 0, policy_version 602750 (0.0039) [2024-06-24 06:21:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9875505152. Throughput: 0: 42827.2. Samples: 9875568720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 06:21:56,602][15401] Updated weights for policy 0, policy_version 602760 (0.0041) [2024-06-24 06:21:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9875685376. Throughput: 0: 42772.8. Samples: 9875826280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:21:58,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 06:22:00,029][15401] Updated weights for policy 0, policy_version 602770 (0.0040) [2024-06-24 06:22:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9875914752. Throughput: 0: 42637.3. Samples: 9876078240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 06:22:04,174][15401] Updated weights for policy 0, policy_version 602780 (0.0026) [2024-06-24 06:22:07,603][15401] Updated weights for policy 0, policy_version 602790 (0.0039) [2024-06-24 06:22:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.4, 300 sec: 42876.2). Total num frames: 9876144128. Throughput: 0: 42533.8. Samples: 9876202160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 06:22:11,797][15401] Updated weights for policy 0, policy_version 602800 (0.0036) [2024-06-24 06:22:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 9876307968. Throughput: 0: 42398.7. Samples: 9876455840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:22:15,092][15401] Updated weights for policy 0, policy_version 602810 (0.0033) [2024-06-24 06:22:18,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9876537344. Throughput: 0: 42356.8. Samples: 9876710920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 06:22:19,536][15401] Updated weights for policy 0, policy_version 602820 (0.0045) [2024-06-24 06:22:22,938][15401] Updated weights for policy 0, policy_version 602830 (0.0033) [2024-06-24 06:22:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9876783104. Throughput: 0: 42428.1. Samples: 9876838980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 06:22:27,179][15401] Updated weights for policy 0, policy_version 602840 (0.0038) [2024-06-24 06:22:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9876946944. Throughput: 0: 42110.2. Samples: 9877086480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 06:22:30,420][15401] Updated weights for policy 0, policy_version 602850 (0.0033) [2024-06-24 06:22:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9877176320. Throughput: 0: 42466.6. Samples: 9877351900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:33,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:22:34,670][15401] Updated weights for policy 0, policy_version 602860 (0.0021) [2024-06-24 06:22:37,979][15401] Updated weights for policy 0, policy_version 602870 (0.0028) [2024-06-24 06:22:38,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9877422080. Throughput: 0: 42570.7. Samples: 9877484400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:22:40,851][15349] Signal inference workers to stop experience collection... (146350 times) [2024-06-24 06:22:40,851][15349] Signal inference workers to resume experience collection... (146350 times) [2024-06-24 06:22:40,882][15401] InferenceWorker_p0-w0: stopping experience collection (146350 times) [2024-06-24 06:22:40,883][15401] InferenceWorker_p0-w0: resuming experience collection (146350 times) [2024-06-24 06:22:42,213][15401] Updated weights for policy 0, policy_version 602880 (0.0032) [2024-06-24 06:22:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 9877585920. Throughput: 0: 42434.3. Samples: 9877735820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 06:22:45,636][15401] Updated weights for policy 0, policy_version 602890 (0.0024) [2024-06-24 06:22:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9877831680. Throughput: 0: 42721.0. Samples: 9878000680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 06:22:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 06:22:49,826][15401] Updated weights for policy 0, policy_version 602900 (0.0038) [2024-06-24 06:22:53,215][15401] Updated weights for policy 0, policy_version 602910 (0.0034) [2024-06-24 06:22:53,394][15132] Fps is (10 sec: 49129.6, 60 sec: 42868.2, 300 sec: 42875.5). Total num frames: 9878077440. Throughput: 0: 42839.2. Samples: 9878130120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:22:53,394][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 06:22:57,473][15401] Updated weights for policy 0, policy_version 602920 (0.0036) [2024-06-24 06:22:58,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 9878241280. Throughput: 0: 42772.3. Samples: 9878380700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:22:58,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 06:23:01,153][15401] Updated weights for policy 0, policy_version 602930 (0.0041) [2024-06-24 06:23:03,389][15132] Fps is (10 sec: 40978.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9878487040. Throughput: 0: 42842.7. Samples: 9878638840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 06:23:05,071][15401] Updated weights for policy 0, policy_version 602940 (0.0039) [2024-06-24 06:23:08,389][15132] Fps is (10 sec: 45887.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9878700032. Throughput: 0: 43012.5. Samples: 9878774540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 06:23:08,922][15401] Updated weights for policy 0, policy_version 602950 (0.0034) [2024-06-24 06:23:12,716][15401] Updated weights for policy 0, policy_version 602960 (0.0042) [2024-06-24 06:23:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9878896640. Throughput: 0: 43043.9. Samples: 9879023460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 06:23:16,647][15401] Updated weights for policy 0, policy_version 602970 (0.0038) [2024-06-24 06:23:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9879109632. Throughput: 0: 42942.7. Samples: 9879284320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:18,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 06:23:20,642][15401] Updated weights for policy 0, policy_version 602980 (0.0039) [2024-06-24 06:23:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9879339008. Throughput: 0: 42912.4. Samples: 9879415460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:23,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 06:23:24,028][15401] Updated weights for policy 0, policy_version 602990 (0.0038) [2024-06-24 06:23:28,368][15401] Updated weights for policy 0, policy_version 603000 (0.0043) [2024-06-24 06:23:28,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 9879552000. Throughput: 0: 42831.9. Samples: 9879663260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 06:23:31,647][15401] Updated weights for policy 0, policy_version 603010 (0.0045) [2024-06-24 06:23:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9879764992. Throughput: 0: 42876.0. Samples: 9879930100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 06:23:36,136][15401] Updated weights for policy 0, policy_version 603020 (0.0038) [2024-06-24 06:23:38,390][15132] Fps is (10 sec: 42596.3, 60 sec: 42598.0, 300 sec: 42820.5). Total num frames: 9879977984. Throughput: 0: 42904.7. Samples: 9880060660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 06:23:39,362][15401] Updated weights for policy 0, policy_version 603030 (0.0029) [2024-06-24 06:23:43,389][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9880174592. Throughput: 0: 42912.1. Samples: 9880311640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 06:23:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603038_9880174592.pth... [2024-06-24 06:23:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602414_9869950976.pth [2024-06-24 06:23:43,938][15401] Updated weights for policy 0, policy_version 603040 (0.0023) [2024-06-24 06:23:47,083][15401] Updated weights for policy 0, policy_version 603050 (0.0036) [2024-06-24 06:23:48,106][15349] Signal inference workers to stop experience collection... (146400 times) [2024-06-24 06:23:48,157][15401] InferenceWorker_p0-w0: stopping experience collection (146400 times) [2024-06-24 06:23:48,224][15349] Signal inference workers to resume experience collection... (146400 times) [2024-06-24 06:23:48,224][15401] InferenceWorker_p0-w0: resuming experience collection (146400 times) [2024-06-24 06:23:48,389][15132] Fps is (10 sec: 44239.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9880420352. Throughput: 0: 43105.3. Samples: 9880578580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 06:23:51,489][15401] Updated weights for policy 0, policy_version 603060 (0.0032) [2024-06-24 06:23:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42601.5, 300 sec: 42820.9). Total num frames: 9880633344. Throughput: 0: 43009.5. Samples: 9880709980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 06:23:54,687][15401] Updated weights for policy 0, policy_version 603070 (0.0050) [2024-06-24 06:23:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 9880829952. Throughput: 0: 43011.1. Samples: 9880958960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:23:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 06:23:59,255][15401] Updated weights for policy 0, policy_version 603080 (0.0042) [2024-06-24 06:24:02,219][15401] Updated weights for policy 0, policy_version 603090 (0.0025) [2024-06-24 06:24:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 9881075712. Throughput: 0: 43131.4. Samples: 9881225240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:24:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 06:24:06,803][15401] Updated weights for policy 0, policy_version 603100 (0.0043) [2024-06-24 06:24:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9881272320. Throughput: 0: 43204.9. Samples: 9881359680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:24:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 06:24:10,139][15401] Updated weights for policy 0, policy_version 603110 (0.0024) [2024-06-24 06:24:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9881485312. Throughput: 0: 43289.8. Samples: 9881611300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:24:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 06:24:14,250][15401] Updated weights for policy 0, policy_version 603120 (0.0034) [2024-06-24 06:24:17,608][15401] Updated weights for policy 0, policy_version 603130 (0.0039) [2024-06-24 06:24:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43963.7, 300 sec: 42987.2). Total num frames: 9881747456. Throughput: 0: 43098.2. Samples: 9881869520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:24:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 06:24:21,738][15401] Updated weights for policy 0, policy_version 603140 (0.0024) [2024-06-24 06:24:23,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9881894912. Throughput: 0: 43131.0. Samples: 9882001640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:24:23,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 06:24:25,224][15401] Updated weights for policy 0, policy_version 603150 (0.0038) [2024-06-24 06:24:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9882140672. Throughput: 0: 43144.5. Samples: 9882253140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:28,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 06:24:29,071][15401] Updated weights for policy 0, policy_version 603160 (0.0024) [2024-06-24 06:24:32,738][15401] Updated weights for policy 0, policy_version 603170 (0.0040) [2024-06-24 06:24:33,389][15132] Fps is (10 sec: 45887.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9882353664. Throughput: 0: 43068.1. Samples: 9882516640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 06:24:36,921][15401] Updated weights for policy 0, policy_version 603180 (0.0042) [2024-06-24 06:24:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.9, 300 sec: 42765.0). Total num frames: 9882566656. Throughput: 0: 43166.8. Samples: 9882652480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:38,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 06:24:40,088][15401] Updated weights for policy 0, policy_version 603190 (0.0037) [2024-06-24 06:24:41,338][15349] Signal inference workers to stop experience collection... (146450 times) [2024-06-24 06:24:41,367][15401] InferenceWorker_p0-w0: stopping experience collection (146450 times) [2024-06-24 06:24:41,402][15349] Signal inference workers to resume experience collection... (146450 times) [2024-06-24 06:24:41,402][15401] InferenceWorker_p0-w0: resuming experience collection (146450 times) [2024-06-24 06:24:43,390][15132] Fps is (10 sec: 44233.4, 60 sec: 43690.2, 300 sec: 42876.0). Total num frames: 9882796032. Throughput: 0: 43288.2. Samples: 9882906960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:43,391][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 06:24:44,562][15401] Updated weights for policy 0, policy_version 603200 (0.0029) [2024-06-24 06:24:47,900][15401] Updated weights for policy 0, policy_version 603210 (0.0038) [2024-06-24 06:24:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9883025408. Throughput: 0: 43252.6. Samples: 9883171600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 06:24:51,989][15401] Updated weights for policy 0, policy_version 603220 (0.0040) [2024-06-24 06:24:53,391][15132] Fps is (10 sec: 40956.2, 60 sec: 42870.4, 300 sec: 42876.2). Total num frames: 9883205632. Throughput: 0: 43086.4. Samples: 9883298640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:53,391][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 06:24:55,471][15401] Updated weights for policy 0, policy_version 603230 (0.0038) [2024-06-24 06:24:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 9883451392. Throughput: 0: 43085.6. Samples: 9883550160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:24:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 06:25:00,034][15401] Updated weights for policy 0, policy_version 603240 (0.0037) [2024-06-24 06:25:02,982][15401] Updated weights for policy 0, policy_version 603250 (0.0022) [2024-06-24 06:25:03,389][15132] Fps is (10 sec: 45882.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9883664384. Throughput: 0: 43199.1. Samples: 9883813480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 06:25:07,636][15401] Updated weights for policy 0, policy_version 603260 (0.0031) [2024-06-24 06:25:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9883844608. Throughput: 0: 43145.1. Samples: 9883943060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 06:25:10,489][15401] Updated weights for policy 0, policy_version 603270 (0.0041) [2024-06-24 06:25:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 9884106752. Throughput: 0: 43290.6. Samples: 9884201220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 06:25:15,055][15401] Updated weights for policy 0, policy_version 603280 (0.0039) [2024-06-24 06:25:18,326][15401] Updated weights for policy 0, policy_version 603290 (0.0041) [2024-06-24 06:25:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9884303360. Throughput: 0: 43223.9. Samples: 9884461720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 06:25:22,617][15401] Updated weights for policy 0, policy_version 603300 (0.0034) [2024-06-24 06:25:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43419.5, 300 sec: 42876.1). Total num frames: 9884499968. Throughput: 0: 43028.6. Samples: 9884588760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:23,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 06:25:25,797][15401] Updated weights for policy 0, policy_version 603310 (0.0039) [2024-06-24 06:25:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42987.5). Total num frames: 9884762112. Throughput: 0: 43147.4. Samples: 9884848560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 06:25:29,914][15401] Updated weights for policy 0, policy_version 603320 (0.0030) [2024-06-24 06:25:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9884925952. Throughput: 0: 43242.1. Samples: 9885117500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 06:25:33,791][15401] Updated weights for policy 0, policy_version 603330 (0.0037) [2024-06-24 06:25:34,839][15349] Signal inference workers to stop experience collection... (146500 times) [2024-06-24 06:25:34,845][15349] Signal inference workers to resume experience collection... (146500 times) [2024-06-24 06:25:34,886][15401] InferenceWorker_p0-w0: stopping experience collection (146500 times) [2024-06-24 06:25:34,886][15401] InferenceWorker_p0-w0: resuming experience collection (146500 times) [2024-06-24 06:25:37,501][15401] Updated weights for policy 0, policy_version 603340 (0.0029) [2024-06-24 06:25:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 9885155328. Throughput: 0: 43054.0. Samples: 9885236000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 06:25:41,263][15401] Updated weights for policy 0, policy_version 603350 (0.0030) [2024-06-24 06:25:43,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43691.2, 300 sec: 43098.2). Total num frames: 9885417472. Throughput: 0: 43332.1. Samples: 9885500100. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 06:25:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603358_9885417472.pth... [2024-06-24 06:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000602727_9875079168.pth [2024-06-24 06:25:44,911][15401] Updated weights for policy 0, policy_version 603360 (0.0029) [2024-06-24 06:25:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 9885564928. Throughput: 0: 43362.2. Samples: 9885764780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 06:25:48,983][15401] Updated weights for policy 0, policy_version 603370 (0.0033) [2024-06-24 06:25:52,394][15401] Updated weights for policy 0, policy_version 603380 (0.0038) [2024-06-24 06:25:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43418.9, 300 sec: 42931.6). Total num frames: 9885810688. Throughput: 0: 42939.1. Samples: 9885875320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 06:25:56,558][15401] Updated weights for policy 0, policy_version 603390 (0.0040) [2024-06-24 06:25:58,391][15132] Fps is (10 sec: 47504.5, 60 sec: 43143.3, 300 sec: 43042.4). Total num frames: 9886040064. Throughput: 0: 43088.4. Samples: 9886140280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 06:25:58,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 06:25:59,905][15401] Updated weights for policy 0, policy_version 603400 (0.0041) [2024-06-24 06:26:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 9886220288. Throughput: 0: 43129.9. Samples: 9886402560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 06:26:04,394][15401] Updated weights for policy 0, policy_version 603410 (0.0033) [2024-06-24 06:26:07,687][15401] Updated weights for policy 0, policy_version 603420 (0.0034) [2024-06-24 06:26:08,389][15132] Fps is (10 sec: 40968.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9886449664. Throughput: 0: 42836.9. Samples: 9886516420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 06:26:12,055][15401] Updated weights for policy 0, policy_version 603430 (0.0031) [2024-06-24 06:26:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 9886679040. Throughput: 0: 43032.5. Samples: 9886785020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 06:26:15,163][15401] Updated weights for policy 0, policy_version 603440 (0.0039) [2024-06-24 06:26:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9886859264. Throughput: 0: 42846.3. Samples: 9887045580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 06:26:19,640][15401] Updated weights for policy 0, policy_version 603450 (0.0033) [2024-06-24 06:26:22,623][15401] Updated weights for policy 0, policy_version 603460 (0.0033) [2024-06-24 06:26:23,389][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9887088640. Throughput: 0: 42832.8. Samples: 9887163480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:23,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 06:26:27,530][15401] Updated weights for policy 0, policy_version 603470 (0.0039) [2024-06-24 06:26:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42987.1). Total num frames: 9887318016. Throughput: 0: 42906.5. Samples: 9887430900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 06:26:30,092][15401] Updated weights for policy 0, policy_version 603480 (0.0040) [2024-06-24 06:26:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9887498240. Throughput: 0: 42781.3. Samples: 9887689940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 06:26:35,097][15401] Updated weights for policy 0, policy_version 603490 (0.0049) [2024-06-24 06:26:38,032][15401] Updated weights for policy 0, policy_version 603500 (0.0038) [2024-06-24 06:26:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42986.8). Total num frames: 9887760384. Throughput: 0: 42966.4. Samples: 9887808920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:38,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 06:26:42,659][15401] Updated weights for policy 0, policy_version 603510 (0.0041) [2024-06-24 06:26:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42050.5, 300 sec: 42986.8). Total num frames: 9887940608. Throughput: 0: 42995.5. Samples: 9888075100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:43,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 06:26:45,570][15401] Updated weights for policy 0, policy_version 603520 (0.0034) [2024-06-24 06:26:48,389][15132] Fps is (10 sec: 39331.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9888153600. Throughput: 0: 42906.2. Samples: 9888333340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:48,396][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 06:26:50,198][15401] Updated weights for policy 0, policy_version 603530 (0.0041) [2024-06-24 06:26:51,348][15349] Signal inference workers to stop experience collection... (146550 times) [2024-06-24 06:26:51,383][15401] InferenceWorker_p0-w0: stopping experience collection (146550 times) [2024-06-24 06:26:51,469][15349] Signal inference workers to resume experience collection... (146550 times) [2024-06-24 06:26:51,469][15401] InferenceWorker_p0-w0: resuming experience collection (146550 times) [2024-06-24 06:26:53,277][15401] Updated weights for policy 0, policy_version 603540 (0.0029) [2024-06-24 06:26:53,389][15132] Fps is (10 sec: 45886.8, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 9888399360. Throughput: 0: 43057.8. Samples: 9888454020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 06:26:57,659][15401] Updated weights for policy 0, policy_version 603550 (0.0038) [2024-06-24 06:26:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42326.6, 300 sec: 42931.6). Total num frames: 9888579584. Throughput: 0: 42927.4. Samples: 9888716760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:26:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 06:27:00,608][15401] Updated weights for policy 0, policy_version 603560 (0.0035) [2024-06-24 06:27:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9888792576. Throughput: 0: 42985.8. Samples: 9888979940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 06:27:05,207][15401] Updated weights for policy 0, policy_version 603570 (0.0034) [2024-06-24 06:27:08,196][15401] Updated weights for policy 0, policy_version 603580 (0.0025) [2024-06-24 06:27:08,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43415.7, 300 sec: 43209.0). Total num frames: 9889054720. Throughput: 0: 43184.3. Samples: 9889106880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:08,393][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 06:27:12,700][15401] Updated weights for policy 0, policy_version 603590 (0.0038) [2024-06-24 06:27:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 9889218560. Throughput: 0: 42954.9. Samples: 9889363860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 06:27:15,861][15401] Updated weights for policy 0, policy_version 603600 (0.0032) [2024-06-24 06:27:18,392][15132] Fps is (10 sec: 37683.1, 60 sec: 42869.6, 300 sec: 42875.7). Total num frames: 9889431552. Throughput: 0: 42919.4. Samples: 9889621420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:18,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 06:27:20,469][15401] Updated weights for policy 0, policy_version 603610 (0.0036) [2024-06-24 06:27:23,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 9889693696. Throughput: 0: 43147.7. Samples: 9889750460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 06:27:23,649][15401] Updated weights for policy 0, policy_version 603620 (0.0025) [2024-06-24 06:27:27,941][15401] Updated weights for policy 0, policy_version 603630 (0.0047) [2024-06-24 06:27:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 9889873920. Throughput: 0: 43032.6. Samples: 9890011460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 06:27:31,220][15401] Updated weights for policy 0, policy_version 603640 (0.0043) [2024-06-24 06:27:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9890086912. Throughput: 0: 43018.1. Samples: 9890269160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-24 06:27:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 06:27:35,639][15401] Updated weights for policy 0, policy_version 603650 (0.0035) [2024-06-24 06:27:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.2, 300 sec: 43153.8). Total num frames: 9890316288. Throughput: 0: 43036.9. Samples: 9890390680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:27:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 06:27:39,041][15401] Updated weights for policy 0, policy_version 603660 (0.0028) [2024-06-24 06:27:43,111][15401] Updated weights for policy 0, policy_version 603670 (0.0034) [2024-06-24 06:27:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43419.4, 300 sec: 43098.2). Total num frames: 9890545664. Throughput: 0: 43044.1. Samples: 9890653740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:27:43,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 06:27:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603671_9890545664.pth... [2024-06-24 06:27:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603038_9880174592.pth [2024-06-24 06:27:46,909][15401] Updated weights for policy 0, policy_version 603680 (0.0041) [2024-06-24 06:27:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.8). Total num frames: 9890725888. Throughput: 0: 42925.8. Samples: 9890911600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:27:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 06:27:50,814][15401] Updated weights for policy 0, policy_version 603690 (0.0047) [2024-06-24 06:27:53,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.6, 300 sec: 43098.2). Total num frames: 9890955264. Throughput: 0: 42841.8. Samples: 9891034760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:27:53,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 06:27:54,670][15401] Updated weights for policy 0, policy_version 603700 (0.0028) [2024-06-24 06:27:58,396][15132] Fps is (10 sec: 44206.7, 60 sec: 43139.7, 300 sec: 42986.2). Total num frames: 9891168256. Throughput: 0: 42913.9. Samples: 9891295280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:27:58,397][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 06:27:58,595][15401] Updated weights for policy 0, policy_version 603710 (0.0032) [2024-06-24 06:28:02,091][15401] Updated weights for policy 0, policy_version 603720 (0.0028) [2024-06-24 06:28:03,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9891364864. Throughput: 0: 42920.2. Samples: 9891552720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:03,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:28:06,044][15401] Updated weights for policy 0, policy_version 603730 (0.0030) [2024-06-24 06:28:08,389][15132] Fps is (10 sec: 42627.0, 60 sec: 42327.1, 300 sec: 43042.7). Total num frames: 9891594240. Throughput: 0: 42893.3. Samples: 9891680660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 06:28:09,887][15401] Updated weights for policy 0, policy_version 603740 (0.0027) [2024-06-24 06:28:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 9891823616. Throughput: 0: 42908.9. Samples: 9891942360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 06:28:13,549][15401] Updated weights for policy 0, policy_version 603750 (0.0038) [2024-06-24 06:28:17,400][15401] Updated weights for policy 0, policy_version 603760 (0.0041) [2024-06-24 06:28:18,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43146.1, 300 sec: 42987.1). Total num frames: 9892020224. Throughput: 0: 43028.6. Samples: 9892205460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:28:21,399][15401] Updated weights for policy 0, policy_version 603770 (0.0025) [2024-06-24 06:28:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 43098.3). Total num frames: 9892265984. Throughput: 0: 43126.2. Samples: 9892331360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 06:28:24,790][15401] Updated weights for policy 0, policy_version 603780 (0.0035) [2024-06-24 06:28:26,171][15349] Signal inference workers to stop experience collection... (146600 times) [2024-06-24 06:28:26,227][15401] InferenceWorker_p0-w0: stopping experience collection (146600 times) [2024-06-24 06:28:26,231][15349] Signal inference workers to resume experience collection... (146600 times) [2024-06-24 06:28:26,245][15401] InferenceWorker_p0-w0: resuming experience collection (146600 times) [2024-06-24 06:28:28,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 9892446208. Throughput: 0: 42923.9. Samples: 9892585320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 06:28:29,108][15401] Updated weights for policy 0, policy_version 603790 (0.0034) [2024-06-24 06:28:32,509][15401] Updated weights for policy 0, policy_version 603800 (0.0037) [2024-06-24 06:28:33,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42869.7, 300 sec: 42986.9). Total num frames: 9892659200. Throughput: 0: 42927.8. Samples: 9892843460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:33,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 06:28:37,077][15401] Updated weights for policy 0, policy_version 603810 (0.0031) [2024-06-24 06:28:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 9892904960. Throughput: 0: 42996.2. Samples: 9892969480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 06:28:40,366][15401] Updated weights for policy 0, policy_version 603820 (0.0047) [2024-06-24 06:28:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 9893085184. Throughput: 0: 42910.4. Samples: 9893225960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 06:28:44,461][15401] Updated weights for policy 0, policy_version 603830 (0.0033) [2024-06-24 06:28:47,874][15401] Updated weights for policy 0, policy_version 603840 (0.0024) [2024-06-24 06:28:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 9893314560. Throughput: 0: 42871.5. Samples: 9893481940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:48,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 06:28:51,933][15401] Updated weights for policy 0, policy_version 603850 (0.0034) [2024-06-24 06:28:53,392][15132] Fps is (10 sec: 44227.6, 60 sec: 42871.7, 300 sec: 43042.4). Total num frames: 9893527552. Throughput: 0: 43027.8. Samples: 9893617000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:53,392][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 06:28:55,513][15401] Updated weights for policy 0, policy_version 603860 (0.0033) [2024-06-24 06:28:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42876.3, 300 sec: 42931.6). Total num frames: 9893740544. Throughput: 0: 42941.3. Samples: 9893874720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:28:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 06:28:59,460][15401] Updated weights for policy 0, policy_version 603870 (0.0031) [2024-06-24 06:29:03,144][15401] Updated weights for policy 0, policy_version 603880 (0.0042) [2024-06-24 06:29:03,392][15132] Fps is (10 sec: 44235.6, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 9893969920. Throughput: 0: 42761.6. Samples: 9894129820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:29:03,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 06:29:07,113][15401] Updated weights for policy 0, policy_version 603890 (0.0036) [2024-06-24 06:29:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 43042.3). Total num frames: 9894182912. Throughput: 0: 42989.2. Samples: 9894265980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 06:29:08,392][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 06:29:10,718][15401] Updated weights for policy 0, policy_version 603900 (0.0033) [2024-06-24 06:29:13,389][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9894379520. Throughput: 0: 43018.3. Samples: 9894521140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 06:29:14,953][15401] Updated weights for policy 0, policy_version 603910 (0.0037) [2024-06-24 06:29:18,279][15401] Updated weights for policy 0, policy_version 603920 (0.0029) [2024-06-24 06:29:18,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43417.7, 300 sec: 43154.1). Total num frames: 9894625280. Throughput: 0: 42959.6. Samples: 9894776540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 06:29:22,384][15401] Updated weights for policy 0, policy_version 603930 (0.0031) [2024-06-24 06:29:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9894821888. Throughput: 0: 43004.5. Samples: 9894904680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:23,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 06:29:25,717][15401] Updated weights for policy 0, policy_version 603940 (0.0042) [2024-06-24 06:29:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 9895034880. Throughput: 0: 43006.7. Samples: 9895161260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 06:29:30,051][15401] Updated weights for policy 0, policy_version 603950 (0.0027) [2024-06-24 06:29:33,261][15401] Updated weights for policy 0, policy_version 603960 (0.0053) [2024-06-24 06:29:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43692.4, 300 sec: 43098.2). Total num frames: 9895280640. Throughput: 0: 42925.7. Samples: 9895413600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 06:29:37,655][15401] Updated weights for policy 0, policy_version 603970 (0.0027) [2024-06-24 06:29:38,394][15132] Fps is (10 sec: 44214.8, 60 sec: 42867.9, 300 sec: 42986.5). Total num frames: 9895477248. Throughput: 0: 42975.4. Samples: 9895551020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:38,395][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 06:29:40,875][15401] Updated weights for policy 0, policy_version 603980 (0.0037) [2024-06-24 06:29:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 9895690240. Throughput: 0: 42963.5. Samples: 9895808080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 06:29:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603985_9895690240.pth... [2024-06-24 06:29:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603358_9885417472.pth [2024-06-24 06:29:45,230][15401] Updated weights for policy 0, policy_version 603990 (0.0029) [2024-06-24 06:29:48,389][15132] Fps is (10 sec: 44259.1, 60 sec: 43417.7, 300 sec: 43098.5). Total num frames: 9895919616. Throughput: 0: 42974.8. Samples: 9896063580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 06:29:48,512][15401] Updated weights for policy 0, policy_version 604000 (0.0028) [2024-06-24 06:29:52,905][15401] Updated weights for policy 0, policy_version 604010 (0.0033) [2024-06-24 06:29:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.0, 300 sec: 42931.6). Total num frames: 9896116224. Throughput: 0: 42901.0. Samples: 9896196420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 06:29:56,283][15401] Updated weights for policy 0, policy_version 604020 (0.0029) [2024-06-24 06:29:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9896329216. Throughput: 0: 42988.1. Samples: 9896455600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:29:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 06:30:00,479][15401] Updated weights for policy 0, policy_version 604030 (0.0029) [2024-06-24 06:30:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 43098.2). Total num frames: 9896558592. Throughput: 0: 43021.9. Samples: 9896712520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 06:30:03,857][15401] Updated weights for policy 0, policy_version 604040 (0.0034) [2024-06-24 06:30:08,182][15401] Updated weights for policy 0, policy_version 604050 (0.0027) [2024-06-24 06:30:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 9896755200. Throughput: 0: 42971.1. Samples: 9896838380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 06:30:10,703][15349] Signal inference workers to stop experience collection... (146650 times) [2024-06-24 06:30:10,712][15349] Signal inference workers to resume experience collection... (146650 times) [2024-06-24 06:30:10,727][15401] InferenceWorker_p0-w0: stopping experience collection (146650 times) [2024-06-24 06:30:10,727][15401] InferenceWorker_p0-w0: resuming experience collection (146650 times) [2024-06-24 06:30:11,934][15401] Updated weights for policy 0, policy_version 604060 (0.0032) [2024-06-24 06:30:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9896984576. Throughput: 0: 42995.2. Samples: 9897096040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 06:30:15,803][15401] Updated weights for policy 0, policy_version 604070 (0.0032) [2024-06-24 06:30:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 9897197568. Throughput: 0: 43137.1. Samples: 9897354760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:18,398][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 06:30:19,409][15401] Updated weights for policy 0, policy_version 604080 (0.0045) [2024-06-24 06:30:23,320][15401] Updated weights for policy 0, policy_version 604090 (0.0036) [2024-06-24 06:30:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 9897410560. Throughput: 0: 42935.3. Samples: 9897483000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:23,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 06:30:26,922][15401] Updated weights for policy 0, policy_version 604100 (0.0027) [2024-06-24 06:30:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9897607168. Throughput: 0: 42884.1. Samples: 9897737860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 06:30:30,887][15401] Updated weights for policy 0, policy_version 604110 (0.0021) [2024-06-24 06:30:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9897836544. Throughput: 0: 43046.1. Samples: 9898000660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 06:30:34,330][15401] Updated weights for policy 0, policy_version 604120 (0.0028) [2024-06-24 06:30:38,391][15132] Fps is (10 sec: 42590.1, 60 sec: 42600.6, 300 sec: 42764.7). Total num frames: 9898033152. Throughput: 0: 42987.1. Samples: 9898130920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 06:30:38,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 06:30:38,674][15401] Updated weights for policy 0, policy_version 604130 (0.0034) [2024-06-24 06:30:42,069][15401] Updated weights for policy 0, policy_version 604140 (0.0048) [2024-06-24 06:30:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42986.8). Total num frames: 9898246144. Throughput: 0: 42724.8. Samples: 9898378320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:30:43,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 06:30:46,387][15401] Updated weights for policy 0, policy_version 604150 (0.0027) [2024-06-24 06:30:48,389][15132] Fps is (10 sec: 44245.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9898475520. Throughput: 0: 42879.5. Samples: 9898642100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:30:48,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-24 06:30:49,520][15401] Updated weights for policy 0, policy_version 604160 (0.0039) [2024-06-24 06:30:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 9898688512. Throughput: 0: 43112.9. Samples: 9898778460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:30:53,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 06:30:53,893][15401] Updated weights for policy 0, policy_version 604170 (0.0024) [2024-06-24 06:30:57,235][15401] Updated weights for policy 0, policy_version 604180 (0.0032) [2024-06-24 06:30:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9898901504. Throughput: 0: 43081.8. Samples: 9899034720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:30:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 06:31:01,383][15401] Updated weights for policy 0, policy_version 604190 (0.0033) [2024-06-24 06:31:03,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9899147264. Throughput: 0: 43066.1. Samples: 9899292740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 06:31:05,028][15401] Updated weights for policy 0, policy_version 604200 (0.0033) [2024-06-24 06:31:08,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 9899343872. Throughput: 0: 43214.7. Samples: 9899427660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:08,393][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 06:31:08,929][15401] Updated weights for policy 0, policy_version 604210 (0.0026) [2024-06-24 06:31:12,765][15401] Updated weights for policy 0, policy_version 604220 (0.0039) [2024-06-24 06:31:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9899540480. Throughput: 0: 43187.5. Samples: 9899681300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 06:31:16,428][15401] Updated weights for policy 0, policy_version 604230 (0.0038) [2024-06-24 06:31:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9899786240. Throughput: 0: 43051.6. Samples: 9899937980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:18,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-24 06:31:20,515][15401] Updated weights for policy 0, policy_version 604240 (0.0041) [2024-06-24 06:31:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 9899999232. Throughput: 0: 43148.9. Samples: 9900072540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:23,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 06:31:23,930][15401] Updated weights for policy 0, policy_version 604250 (0.0024) [2024-06-24 06:31:28,012][15401] Updated weights for policy 0, policy_version 604260 (0.0030) [2024-06-24 06:31:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 9900195840. Throughput: 0: 43176.5. Samples: 9900321160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 06:31:31,759][15401] Updated weights for policy 0, policy_version 604270 (0.0023) [2024-06-24 06:31:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 9900425216. Throughput: 0: 43191.6. Samples: 9900585720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:33,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 06:31:35,522][15401] Updated weights for policy 0, policy_version 604280 (0.0025) [2024-06-24 06:31:38,390][15132] Fps is (10 sec: 42596.0, 60 sec: 43145.5, 300 sec: 42987.4). Total num frames: 9900621824. Throughput: 0: 43111.4. Samples: 9900718500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 06:31:38,471][15349] Signal inference workers to stop experience collection... (146700 times) [2024-06-24 06:31:38,471][15349] Signal inference workers to resume experience collection... (146700 times) [2024-06-24 06:31:38,485][15401] InferenceWorker_p0-w0: stopping experience collection (146700 times) [2024-06-24 06:31:38,486][15401] InferenceWorker_p0-w0: resuming experience collection (146700 times) [2024-06-24 06:31:39,593][15401] Updated weights for policy 0, policy_version 604290 (0.0051) [2024-06-24 06:31:43,026][15401] Updated weights for policy 0, policy_version 604300 (0.0031) [2024-06-24 06:31:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43419.3, 300 sec: 43042.7). Total num frames: 9900851200. Throughput: 0: 42940.8. Samples: 9900967060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:43,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 06:31:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604300_9900851200.pth... [2024-06-24 06:31:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603671_9890545664.pth [2024-06-24 06:31:47,062][15401] Updated weights for policy 0, policy_version 604310 (0.0035) [2024-06-24 06:31:48,389][15132] Fps is (10 sec: 45878.1, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 9901080576. Throughput: 0: 43030.7. Samples: 9901229120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:48,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 06:31:50,601][15401] Updated weights for policy 0, policy_version 604320 (0.0048) [2024-06-24 06:31:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.7, 300 sec: 43042.4). Total num frames: 9901277184. Throughput: 0: 42932.0. Samples: 9901359600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:53,397][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 06:31:54,571][15401] Updated weights for policy 0, policy_version 604330 (0.0034) [2024-06-24 06:31:58,316][15401] Updated weights for policy 0, policy_version 604340 (0.0038) [2024-06-24 06:31:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 9901506560. Throughput: 0: 42854.6. Samples: 9901609760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:31:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 06:32:02,438][15401] Updated weights for policy 0, policy_version 604350 (0.0038) [2024-06-24 06:32:03,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9901719552. Throughput: 0: 42969.3. Samples: 9901871600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:32:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 06:32:06,381][15401] Updated weights for policy 0, policy_version 604360 (0.0025) [2024-06-24 06:32:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42871.5, 300 sec: 43042.3). Total num frames: 9901916160. Throughput: 0: 42843.5. Samples: 9902000600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:32:08,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 06:32:09,872][15401] Updated weights for policy 0, policy_version 604370 (0.0037) [2024-06-24 06:32:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 43043.1). Total num frames: 9902129152. Throughput: 0: 43017.2. Samples: 9902256940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 06:32:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 06:32:13,856][15401] Updated weights for policy 0, policy_version 604380 (0.0030) [2024-06-24 06:32:17,456][15401] Updated weights for policy 0, policy_version 604390 (0.0047) [2024-06-24 06:32:18,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 9902358528. Throughput: 0: 42899.0. Samples: 9902516180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 06:32:21,460][15401] Updated weights for policy 0, policy_version 604400 (0.0029) [2024-06-24 06:32:23,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 9902571520. Throughput: 0: 42892.9. Samples: 9902648760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:23,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 06:32:25,432][15401] Updated weights for policy 0, policy_version 604410 (0.0041) [2024-06-24 06:32:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 9902784512. Throughput: 0: 42956.5. Samples: 9902900100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:28,390][15132] Avg episode reward: [(0, '0.136')] [2024-06-24 06:32:29,013][15401] Updated weights for policy 0, policy_version 604420 (0.0028) [2024-06-24 06:32:32,979][15401] Updated weights for policy 0, policy_version 604430 (0.0029) [2024-06-24 06:32:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 9902981120. Throughput: 0: 42826.5. Samples: 9903156420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:33,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 06:32:36,574][15401] Updated weights for policy 0, policy_version 604440 (0.0040) [2024-06-24 06:32:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43418.1, 300 sec: 42987.2). Total num frames: 9903226880. Throughput: 0: 42826.8. Samples: 9903286700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 06:32:40,380][15401] Updated weights for policy 0, policy_version 604450 (0.0028) [2024-06-24 06:32:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 9903423488. Throughput: 0: 43028.4. Samples: 9903546040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 06:32:44,210][15401] Updated weights for policy 0, policy_version 604460 (0.0034) [2024-06-24 06:32:47,911][15401] Updated weights for policy 0, policy_version 604470 (0.0039) [2024-06-24 06:32:48,391][15132] Fps is (10 sec: 40954.0, 60 sec: 42597.4, 300 sec: 42987.3). Total num frames: 9903636480. Throughput: 0: 42751.2. Samples: 9903795460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:48,391][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 06:32:52,018][15401] Updated weights for policy 0, policy_version 604480 (0.0023) [2024-06-24 06:32:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.3, 300 sec: 42988.2). Total num frames: 9903849472. Throughput: 0: 42953.9. Samples: 9903933420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 06:32:55,814][15401] Updated weights for policy 0, policy_version 604490 (0.0037) [2024-06-24 06:32:57,062][15349] Signal inference workers to stop experience collection... (146750 times) [2024-06-24 06:32:57,063][15349] Signal inference workers to resume experience collection... (146750 times) [2024-06-24 06:32:57,108][15401] InferenceWorker_p0-w0: stopping experience collection (146750 times) [2024-06-24 06:32:57,108][15401] InferenceWorker_p0-w0: resuming experience collection (146750 times) [2024-06-24 06:32:58,389][15132] Fps is (10 sec: 42604.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 9904062464. Throughput: 0: 42858.7. Samples: 9904185580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:32:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 06:32:59,535][15401] Updated weights for policy 0, policy_version 604500 (0.0038) [2024-06-24 06:33:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 9904275456. Throughput: 0: 42701.8. Samples: 9904437760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 06:33:03,907][15401] Updated weights for policy 0, policy_version 604510 (0.0036) [2024-06-24 06:33:07,092][15401] Updated weights for policy 0, policy_version 604520 (0.0043) [2024-06-24 06:33:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 9904488448. Throughput: 0: 42790.2. Samples: 9904574220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 06:33:11,437][15401] Updated weights for policy 0, policy_version 604530 (0.0044) [2024-06-24 06:33:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9904701440. Throughput: 0: 42905.7. Samples: 9904830860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 06:33:14,870][15401] Updated weights for policy 0, policy_version 604540 (0.0022) [2024-06-24 06:33:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9904930816. Throughput: 0: 42805.4. Samples: 9905082560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 06:33:19,027][15401] Updated weights for policy 0, policy_version 604550 (0.0036) [2024-06-24 06:33:22,487][15401] Updated weights for policy 0, policy_version 604560 (0.0032) [2024-06-24 06:33:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 9905127424. Throughput: 0: 42835.0. Samples: 9905214280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 06:33:26,587][15401] Updated weights for policy 0, policy_version 604570 (0.0036) [2024-06-24 06:33:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 43043.1). Total num frames: 9905356800. Throughput: 0: 42755.0. Samples: 9905470020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 06:33:30,284][15401] Updated weights for policy 0, policy_version 604580 (0.0045) [2024-06-24 06:33:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 9905569792. Throughput: 0: 42820.4. Samples: 9905722320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 06:33:34,095][15401] Updated weights for policy 0, policy_version 604590 (0.0028) [2024-06-24 06:33:37,934][15401] Updated weights for policy 0, policy_version 604600 (0.0040) [2024-06-24 06:33:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 9905782784. Throughput: 0: 42619.1. Samples: 9905851280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 06:33:41,607][15401] Updated weights for policy 0, policy_version 604610 (0.0023) [2024-06-24 06:33:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 9905995776. Throughput: 0: 42798.6. Samples: 9906111520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 06:33:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604615_9906012160.pth... [2024-06-24 06:33:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000603985_9895690240.pth [2024-06-24 06:33:45,814][15401] Updated weights for policy 0, policy_version 604620 (0.0035) [2024-06-24 06:33:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42599.4, 300 sec: 42931.9). Total num frames: 9906192384. Throughput: 0: 42822.2. Samples: 9906364760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 06:33:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 06:33:49,322][15401] Updated weights for policy 0, policy_version 604630 (0.0042) [2024-06-24 06:33:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 9906405376. Throughput: 0: 42568.0. Samples: 9906489780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:33:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 06:33:53,434][15401] Updated weights for policy 0, policy_version 604640 (0.0030) [2024-06-24 06:33:57,004][15401] Updated weights for policy 0, policy_version 604650 (0.0028) [2024-06-24 06:33:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 9906634752. Throughput: 0: 42622.3. Samples: 9906748860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:33:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 06:34:00,963][15401] Updated weights for policy 0, policy_version 604660 (0.0030) [2024-06-24 06:34:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 9906847744. Throughput: 0: 42751.9. Samples: 9907006400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 06:34:04,607][15401] Updated weights for policy 0, policy_version 604670 (0.0037) [2024-06-24 06:34:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 9907044352. Throughput: 0: 42741.5. Samples: 9907137640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 06:34:08,617][15401] Updated weights for policy 0, policy_version 604680 (0.0041) [2024-06-24 06:34:10,371][15349] Signal inference workers to stop experience collection... (146800 times) [2024-06-24 06:34:10,432][15401] InferenceWorker_p0-w0: stopping experience collection (146800 times) [2024-06-24 06:34:10,434][15349] Signal inference workers to resume experience collection... (146800 times) [2024-06-24 06:34:10,450][15401] InferenceWorker_p0-w0: resuming experience collection (146800 times) [2024-06-24 06:34:12,318][15401] Updated weights for policy 0, policy_version 604690 (0.0051) [2024-06-24 06:34:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9907273728. Throughput: 0: 42674.3. Samples: 9907390360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 06:34:16,206][15401] Updated weights for policy 0, policy_version 604700 (0.0035) [2024-06-24 06:34:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9907503104. Throughput: 0: 42710.3. Samples: 9907644280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 06:34:20,385][15401] Updated weights for policy 0, policy_version 604710 (0.0040) [2024-06-24 06:34:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 9907683328. Throughput: 0: 42672.3. Samples: 9907771640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:23,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 06:34:23,843][15401] Updated weights for policy 0, policy_version 604720 (0.0035) [2024-06-24 06:34:27,840][15401] Updated weights for policy 0, policy_version 604730 (0.0032) [2024-06-24 06:34:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9907929088. Throughput: 0: 42525.4. Samples: 9908025160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 06:34:32,015][15401] Updated weights for policy 0, policy_version 604740 (0.0032) [2024-06-24 06:34:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42821.3). Total num frames: 9908109312. Throughput: 0: 42676.8. Samples: 9908285220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 06:34:35,584][15401] Updated weights for policy 0, policy_version 604750 (0.0032) [2024-06-24 06:34:38,396][15132] Fps is (10 sec: 39296.2, 60 sec: 42320.7, 300 sec: 42819.6). Total num frames: 9908322304. Throughput: 0: 42570.8. Samples: 9908405740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:38,397][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 06:34:39,507][15401] Updated weights for policy 0, policy_version 604760 (0.0025) [2024-06-24 06:34:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9908535296. Throughput: 0: 42546.7. Samples: 9908663460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 06:34:43,443][15401] Updated weights for policy 0, policy_version 604770 (0.0042) [2024-06-24 06:34:47,477][15401] Updated weights for policy 0, policy_version 604780 (0.0048) [2024-06-24 06:34:48,396][15132] Fps is (10 sec: 40960.2, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 9908731904. Throughput: 0: 42490.5. Samples: 9908918740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:48,396][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 06:34:51,028][15401] Updated weights for policy 0, policy_version 604790 (0.0048) [2024-06-24 06:34:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9908961280. Throughput: 0: 42383.5. Samples: 9909044900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 06:34:54,931][15401] Updated weights for policy 0, policy_version 604800 (0.0033) [2024-06-24 06:34:58,380][15401] Updated weights for policy 0, policy_version 604810 (0.0033) [2024-06-24 06:34:58,390][15132] Fps is (10 sec: 47543.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 9909207040. Throughput: 0: 42650.0. Samples: 9909309620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:34:58,396][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 06:35:02,418][15401] Updated weights for policy 0, policy_version 604820 (0.0036) [2024-06-24 06:35:03,392][15132] Fps is (10 sec: 42589.7, 60 sec: 42323.9, 300 sec: 42820.2). Total num frames: 9909387264. Throughput: 0: 42640.7. Samples: 9909563200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:35:03,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 06:35:05,908][15401] Updated weights for policy 0, policy_version 604830 (0.0029) [2024-06-24 06:35:08,391][15132] Fps is (10 sec: 40955.2, 60 sec: 42870.5, 300 sec: 42820.4). Total num frames: 9909616640. Throughput: 0: 42526.9. Samples: 9909685300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:35:08,391][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 06:35:10,092][15401] Updated weights for policy 0, policy_version 604840 (0.0037) [2024-06-24 06:35:13,389][15132] Fps is (10 sec: 44245.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9909829632. Throughput: 0: 42788.9. Samples: 9909950660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:35:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 06:35:13,706][15401] Updated weights for policy 0, policy_version 604850 (0.0039) [2024-06-24 06:35:17,748][15401] Updated weights for policy 0, policy_version 604860 (0.0028) [2024-06-24 06:35:18,396][15132] Fps is (10 sec: 40939.1, 60 sec: 42047.8, 300 sec: 42764.4). Total num frames: 9910026240. Throughput: 0: 42617.9. Samples: 9910203300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:35:18,397][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 06:35:21,019][15349] Signal inference workers to stop experience collection... (146850 times) [2024-06-24 06:35:21,052][15401] InferenceWorker_p0-w0: stopping experience collection (146850 times) [2024-06-24 06:35:21,070][15349] Signal inference workers to resume experience collection... (146850 times) [2024-06-24 06:35:21,071][15401] InferenceWorker_p0-w0: resuming experience collection (146850 times) [2024-06-24 06:35:21,555][15401] Updated weights for policy 0, policy_version 604870 (0.0035) [2024-06-24 06:35:23,390][15132] Fps is (10 sec: 44235.4, 60 sec: 43146.1, 300 sec: 42931.6). Total num frames: 9910272000. Throughput: 0: 42871.2. Samples: 9910334680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:35:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 06:35:25,375][15401] Updated weights for policy 0, policy_version 604880 (0.0041) [2024-06-24 06:35:28,392][15132] Fps is (10 sec: 42615.3, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 9910452224. Throughput: 0: 42825.2. Samples: 9910590700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:28,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 06:35:29,166][15401] Updated weights for policy 0, policy_version 604890 (0.0039) [2024-06-24 06:35:33,390][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.4, 300 sec: 42820.8). Total num frames: 9910665216. Throughput: 0: 42871.9. Samples: 9910847700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 06:35:33,514][15401] Updated weights for policy 0, policy_version 604900 (0.0031) [2024-06-24 06:35:36,731][15401] Updated weights for policy 0, policy_version 604910 (0.0041) [2024-06-24 06:35:38,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42876.1, 300 sec: 42876.5). Total num frames: 9910894592. Throughput: 0: 42886.7. Samples: 9910974800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 06:35:41,035][15401] Updated weights for policy 0, policy_version 604920 (0.0036) [2024-06-24 06:35:43,391][15132] Fps is (10 sec: 42593.0, 60 sec: 42597.4, 300 sec: 42764.8). Total num frames: 9911091200. Throughput: 0: 42638.5. Samples: 9911228400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:43,391][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 06:35:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604925_9911091200.pth... [2024-06-24 06:35:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604300_9900851200.pth [2024-06-24 06:35:44,863][15401] Updated weights for policy 0, policy_version 604930 (0.0045) [2024-06-24 06:35:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 9911304192. Throughput: 0: 42709.5. Samples: 9911485040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 06:35:48,746][15401] Updated weights for policy 0, policy_version 604940 (0.0030) [2024-06-24 06:35:52,513][15401] Updated weights for policy 0, policy_version 604950 (0.0033) [2024-06-24 06:35:53,390][15132] Fps is (10 sec: 45881.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9911549952. Throughput: 0: 42859.9. Samples: 9911613940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 06:35:56,740][15401] Updated weights for policy 0, policy_version 604960 (0.0032) [2024-06-24 06:35:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 9911730176. Throughput: 0: 42545.4. Samples: 9911865200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:35:58,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-24 06:36:00,351][15401] Updated weights for policy 0, policy_version 604970 (0.0042) [2024-06-24 06:36:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.0, 300 sec: 42765.4). Total num frames: 9911959552. Throughput: 0: 42493.3. Samples: 9912115220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 06:36:04,269][15401] Updated weights for policy 0, policy_version 604980 (0.0034) [2024-06-24 06:36:07,800][15401] Updated weights for policy 0, policy_version 604990 (0.0037) [2024-06-24 06:36:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42872.4, 300 sec: 42876.1). Total num frames: 9912188928. Throughput: 0: 42584.3. Samples: 9912250960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 06:36:11,731][15401] Updated weights for policy 0, policy_version 605000 (0.0025) [2024-06-24 06:36:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9912369152. Throughput: 0: 42496.1. Samples: 9912502920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:13,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 06:36:15,650][15401] Updated weights for policy 0, policy_version 605010 (0.0029) [2024-06-24 06:36:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 9912598528. Throughput: 0: 42431.1. Samples: 9912757100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:18,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 06:36:19,135][15401] Updated weights for policy 0, policy_version 605020 (0.0025) [2024-06-24 06:36:23,130][15401] Updated weights for policy 0, policy_version 605030 (0.0036) [2024-06-24 06:36:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9912811520. Throughput: 0: 42652.8. Samples: 9912894180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 06:36:26,635][15401] Updated weights for policy 0, policy_version 605040 (0.0044) [2024-06-24 06:36:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 9913024512. Throughput: 0: 42655.6. Samples: 9913147840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 06:36:30,867][15401] Updated weights for policy 0, policy_version 605050 (0.0028) [2024-06-24 06:36:31,907][15349] Signal inference workers to stop experience collection... (146900 times) [2024-06-24 06:36:31,907][15349] Signal inference workers to resume experience collection... (146900 times) [2024-06-24 06:36:31,925][15401] InferenceWorker_p0-w0: stopping experience collection (146900 times) [2024-06-24 06:36:31,925][15401] InferenceWorker_p0-w0: resuming experience collection (146900 times) [2024-06-24 06:36:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9913253888. Throughput: 0: 42688.8. Samples: 9913406040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:33,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 06:36:34,124][15401] Updated weights for policy 0, policy_version 605060 (0.0040) [2024-06-24 06:36:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 9913434112. Throughput: 0: 42783.2. Samples: 9913539180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 06:36:38,560][15401] Updated weights for policy 0, policy_version 605070 (0.0044) [2024-06-24 06:36:41,712][15401] Updated weights for policy 0, policy_version 605080 (0.0032) [2024-06-24 06:36:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43145.5, 300 sec: 42709.5). Total num frames: 9913679872. Throughput: 0: 42944.5. Samples: 9913797700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 06:36:46,031][15401] Updated weights for policy 0, policy_version 605090 (0.0031) [2024-06-24 06:36:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 9913892864. Throughput: 0: 43052.0. Samples: 9914052560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 06:36:49,355][15401] Updated weights for policy 0, policy_version 605100 (0.0033) [2024-06-24 06:36:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9914105856. Throughput: 0: 42978.7. Samples: 9914185000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:53,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 06:36:53,505][15401] Updated weights for policy 0, policy_version 605110 (0.0035) [2024-06-24 06:36:57,224][15401] Updated weights for policy 0, policy_version 605120 (0.0034) [2024-06-24 06:36:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 9914335232. Throughput: 0: 43068.4. Samples: 9914441000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 06:36:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 06:37:01,157][15401] Updated weights for policy 0, policy_version 605130 (0.0038) [2024-06-24 06:37:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9914531840. Throughput: 0: 43152.5. Samples: 9914698960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 06:37:04,756][15401] Updated weights for policy 0, policy_version 605140 (0.0035) [2024-06-24 06:37:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 9914728448. Throughput: 0: 42977.7. Samples: 9914828180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 06:37:09,090][15401] Updated weights for policy 0, policy_version 605150 (0.0038) [2024-06-24 06:37:12,448][15401] Updated weights for policy 0, policy_version 605160 (0.0037) [2024-06-24 06:37:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9914974208. Throughput: 0: 43145.2. Samples: 9915089380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 06:37:16,577][15401] Updated weights for policy 0, policy_version 605170 (0.0041) [2024-06-24 06:37:18,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 9915187200. Throughput: 0: 42916.9. Samples: 9915337400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:18,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 06:37:20,386][15401] Updated weights for policy 0, policy_version 605180 (0.0032) [2024-06-24 06:37:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9915367424. Throughput: 0: 42806.0. Samples: 9915465460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 06:37:24,259][15401] Updated weights for policy 0, policy_version 605190 (0.0044) [2024-06-24 06:37:28,097][15401] Updated weights for policy 0, policy_version 605200 (0.0045) [2024-06-24 06:37:28,392][15132] Fps is (10 sec: 42598.8, 60 sec: 43142.8, 300 sec: 42820.6). Total num frames: 9915613184. Throughput: 0: 42836.4. Samples: 9915725440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:28,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 06:37:31,982][15401] Updated weights for policy 0, policy_version 605210 (0.0025) [2024-06-24 06:37:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9915826176. Throughput: 0: 42748.3. Samples: 9915976240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:33,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 06:37:35,600][15401] Updated weights for policy 0, policy_version 605220 (0.0026) [2024-06-24 06:37:38,389][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9916006400. Throughput: 0: 42755.5. Samples: 9916109000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 06:37:39,568][15401] Updated weights for policy 0, policy_version 605230 (0.0041) [2024-06-24 06:37:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 9916235776. Throughput: 0: 42733.0. Samples: 9916363980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 06:37:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605240_9916252160.pth... [2024-06-24 06:37:43,493][15401] Updated weights for policy 0, policy_version 605240 (0.0048) [2024-06-24 06:37:43,541][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604615_9906012160.pth [2024-06-24 06:37:44,091][15349] Signal inference workers to stop experience collection... (146950 times) [2024-06-24 06:37:44,092][15349] Signal inference workers to resume experience collection... (146950 times) [2024-06-24 06:37:44,132][15401] InferenceWorker_p0-w0: stopping experience collection (146950 times) [2024-06-24 06:37:44,132][15401] InferenceWorker_p0-w0: resuming experience collection (146950 times) [2024-06-24 06:37:47,355][15401] Updated weights for policy 0, policy_version 605250 (0.0034) [2024-06-24 06:37:48,390][15132] Fps is (10 sec: 45873.0, 60 sec: 42871.1, 300 sec: 42764.9). Total num frames: 9916465152. Throughput: 0: 42576.5. Samples: 9916614920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:48,391][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 06:37:51,043][15401] Updated weights for policy 0, policy_version 605260 (0.0033) [2024-06-24 06:37:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9916661760. Throughput: 0: 42605.5. Samples: 9916745420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 06:37:54,992][15401] Updated weights for policy 0, policy_version 605270 (0.0028) [2024-06-24 06:37:58,390][15132] Fps is (10 sec: 40961.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9916874752. Throughput: 0: 42493.3. Samples: 9917001580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:37:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 06:37:58,727][15401] Updated weights for policy 0, policy_version 605280 (0.0039) [2024-06-24 06:38:02,792][15401] Updated weights for policy 0, policy_version 605290 (0.0034) [2024-06-24 06:38:03,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 9917087744. Throughput: 0: 42670.2. Samples: 9917257560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:03,393][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 06:38:06,317][15401] Updated weights for policy 0, policy_version 605300 (0.0046) [2024-06-24 06:38:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9917317120. Throughput: 0: 42711.2. Samples: 9917387460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 06:38:10,432][15401] Updated weights for policy 0, policy_version 605310 (0.0033) [2024-06-24 06:38:13,391][15132] Fps is (10 sec: 44240.8, 60 sec: 42597.3, 300 sec: 42709.2). Total num frames: 9917530112. Throughput: 0: 42602.5. Samples: 9917642520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:13,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 06:38:13,911][15401] Updated weights for policy 0, policy_version 605320 (0.0032) [2024-06-24 06:38:18,193][15401] Updated weights for policy 0, policy_version 605330 (0.0041) [2024-06-24 06:38:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 9917726720. Throughput: 0: 42814.3. Samples: 9917902880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 06:38:21,446][15401] Updated weights for policy 0, policy_version 605340 (0.0051) [2024-06-24 06:38:23,392][15132] Fps is (10 sec: 42596.1, 60 sec: 43143.1, 300 sec: 42709.2). Total num frames: 9917956096. Throughput: 0: 42597.5. Samples: 9918025980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:23,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 06:38:26,264][15401] Updated weights for policy 0, policy_version 605350 (0.0037) [2024-06-24 06:38:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 9918185472. Throughput: 0: 42718.6. Samples: 9918286320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 06:38:28,981][15401] Updated weights for policy 0, policy_version 605360 (0.0034) [2024-06-24 06:38:33,390][15132] Fps is (10 sec: 39329.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9918349312. Throughput: 0: 42962.1. Samples: 9918548200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 06:38:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 06:38:33,908][15401] Updated weights for policy 0, policy_version 605370 (0.0029) [2024-06-24 06:38:36,802][15401] Updated weights for policy 0, policy_version 605380 (0.0030) [2024-06-24 06:38:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9918611456. Throughput: 0: 42747.1. Samples: 9918669040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:38:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 06:38:41,431][15401] Updated weights for policy 0, policy_version 605390 (0.0026) [2024-06-24 06:38:43,391][15132] Fps is (10 sec: 47508.5, 60 sec: 43143.7, 300 sec: 42820.4). Total num frames: 9918824448. Throughput: 0: 42918.1. Samples: 9918932940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:38:43,391][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 06:38:44,599][15401] Updated weights for policy 0, policy_version 605400 (0.0021) [2024-06-24 06:38:48,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42594.2, 300 sec: 42764.1). Total num frames: 9919021056. Throughput: 0: 42913.2. Samples: 9919188820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:38:48,396][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 06:38:48,983][15401] Updated weights for policy 0, policy_version 605410 (0.0037) [2024-06-24 06:38:52,139][15401] Updated weights for policy 0, policy_version 605420 (0.0038) [2024-06-24 06:38:53,392][15132] Fps is (10 sec: 44231.2, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 9919266816. Throughput: 0: 42840.9. Samples: 9919315400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:38:53,392][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 06:38:56,983][15401] Updated weights for policy 0, policy_version 605430 (0.0039) [2024-06-24 06:38:58,392][15132] Fps is (10 sec: 42615.6, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9919447040. Throughput: 0: 43081.9. Samples: 9919581240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:38:58,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 06:39:00,070][15401] Updated weights for policy 0, policy_version 605440 (0.0037) [2024-06-24 06:39:02,918][15349] Signal inference workers to stop experience collection... (147000 times) [2024-06-24 06:39:02,950][15401] InferenceWorker_p0-w0: stopping experience collection (147000 times) [2024-06-24 06:39:02,976][15349] Signal inference workers to resume experience collection... (147000 times) [2024-06-24 06:39:02,984][15401] InferenceWorker_p0-w0: resuming experience collection (147000 times) [2024-06-24 06:39:03,393][15132] Fps is (10 sec: 40956.0, 60 sec: 43143.9, 300 sec: 42820.1). Total num frames: 9919676416. Throughput: 0: 42930.6. Samples: 9919834900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:03,393][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 06:39:04,332][15401] Updated weights for policy 0, policy_version 605450 (0.0022) [2024-06-24 06:39:07,599][15401] Updated weights for policy 0, policy_version 605460 (0.0040) [2024-06-24 06:39:08,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9919905792. Throughput: 0: 42982.1. Samples: 9919960080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 06:39:11,819][15401] Updated weights for policy 0, policy_version 605470 (0.0036) [2024-06-24 06:39:13,389][15132] Fps is (10 sec: 42612.7, 60 sec: 42872.6, 300 sec: 42709.5). Total num frames: 9920102400. Throughput: 0: 43062.2. Samples: 9920224120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 06:39:15,063][15401] Updated weights for policy 0, policy_version 605480 (0.0041) [2024-06-24 06:39:18,392][15132] Fps is (10 sec: 40949.3, 60 sec: 43142.7, 300 sec: 42820.5). Total num frames: 9920315392. Throughput: 0: 42838.1. Samples: 9920476020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:18,393][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 06:39:19,304][15401] Updated weights for policy 0, policy_version 605490 (0.0028) [2024-06-24 06:39:22,597][15401] Updated weights for policy 0, policy_version 605500 (0.0030) [2024-06-24 06:39:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 9920544768. Throughput: 0: 42976.8. Samples: 9920603000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:23,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 06:39:26,838][15401] Updated weights for policy 0, policy_version 605510 (0.0024) [2024-06-24 06:39:28,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9920724992. Throughput: 0: 42893.5. Samples: 9920863100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 06:39:30,170][15401] Updated weights for policy 0, policy_version 605520 (0.0031) [2024-06-24 06:39:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 9920937984. Throughput: 0: 42770.4. Samples: 9921113220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 06:39:34,963][15401] Updated weights for policy 0, policy_version 605530 (0.0036) [2024-06-24 06:39:37,807][15401] Updated weights for policy 0, policy_version 605540 (0.0036) [2024-06-24 06:39:38,391][15132] Fps is (10 sec: 45866.3, 60 sec: 42870.1, 300 sec: 42875.8). Total num frames: 9921183744. Throughput: 0: 42857.8. Samples: 9921243980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:38,392][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 06:39:42,514][15401] Updated weights for policy 0, policy_version 605550 (0.0031) [2024-06-24 06:39:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42326.2, 300 sec: 42821.5). Total num frames: 9921363968. Throughput: 0: 42625.4. Samples: 9921499280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 06:39:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605553_9921380352.pth... [2024-06-24 06:39:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000604925_9911091200.pth [2024-06-24 06:39:45,651][15401] Updated weights for policy 0, policy_version 605560 (0.0036) [2024-06-24 06:39:48,390][15132] Fps is (10 sec: 39328.8, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 9921576960. Throughput: 0: 42633.8. Samples: 9921753280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 06:39:50,107][15401] Updated weights for policy 0, policy_version 605570 (0.0035) [2024-06-24 06:39:53,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42322.5, 300 sec: 42708.6). Total num frames: 9921806336. Throughput: 0: 42707.2. Samples: 9921882180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:53,397][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 06:39:53,522][15401] Updated weights for policy 0, policy_version 605580 (0.0035) [2024-06-24 06:39:57,789][15401] Updated weights for policy 0, policy_version 605590 (0.0043) [2024-06-24 06:39:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42820.8). Total num frames: 9922019328. Throughput: 0: 42504.8. Samples: 9922136840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:39:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 06:40:01,475][15401] Updated weights for policy 0, policy_version 605600 (0.0042) [2024-06-24 06:40:03,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42600.7, 300 sec: 42765.2). Total num frames: 9922232320. Throughput: 0: 42546.8. Samples: 9922390520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 06:40:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 06:40:05,249][15401] Updated weights for policy 0, policy_version 605610 (0.0035) [2024-06-24 06:40:08,390][15132] Fps is (10 sec: 42594.7, 60 sec: 42324.6, 300 sec: 42764.9). Total num frames: 9922445312. Throughput: 0: 42651.6. Samples: 9922522360. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:08,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 06:40:09,071][15401] Updated weights for policy 0, policy_version 605620 (0.0038) [2024-06-24 06:40:12,858][15401] Updated weights for policy 0, policy_version 605630 (0.0031) [2024-06-24 06:40:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42766.0). Total num frames: 9922641920. Throughput: 0: 42561.4. Samples: 9922778360. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 06:40:16,830][15401] Updated weights for policy 0, policy_version 605640 (0.0038) [2024-06-24 06:40:18,390][15132] Fps is (10 sec: 40961.2, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 9922854912. Throughput: 0: 42646.6. Samples: 9923032340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:18,391][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:40:20,559][15401] Updated weights for policy 0, policy_version 605650 (0.0039) [2024-06-24 06:40:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42820.9). Total num frames: 9923084288. Throughput: 0: 42644.1. Samples: 9923162880. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 06:40:24,287][15401] Updated weights for policy 0, policy_version 605660 (0.0034) [2024-06-24 06:40:27,036][15349] Signal inference workers to stop experience collection... (147050 times) [2024-06-24 06:40:27,036][15349] Signal inference workers to resume experience collection... (147050 times) [2024-06-24 06:40:27,079][15401] InferenceWorker_p0-w0: stopping experience collection (147050 times) [2024-06-24 06:40:27,079][15401] InferenceWorker_p0-w0: resuming experience collection (147050 times) [2024-06-24 06:40:28,208][15401] Updated weights for policy 0, policy_version 605670 (0.0055) [2024-06-24 06:40:28,392][15132] Fps is (10 sec: 44229.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 9923297280. Throughput: 0: 42503.4. Samples: 9923412040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:28,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 06:40:32,343][15401] Updated weights for policy 0, policy_version 605680 (0.0025) [2024-06-24 06:40:33,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 9923493888. Throughput: 0: 42521.8. Samples: 9923666860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:33,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 06:40:35,804][15401] Updated weights for policy 0, policy_version 605690 (0.0038) [2024-06-24 06:40:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42326.7, 300 sec: 42820.7). Total num frames: 9923723264. Throughput: 0: 42450.0. Samples: 9923792160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 06:40:39,817][15401] Updated weights for policy 0, policy_version 605700 (0.0043) [2024-06-24 06:40:43,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9923936256. Throughput: 0: 42526.3. Samples: 9924050520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 06:40:43,704][15401] Updated weights for policy 0, policy_version 605710 (0.0036) [2024-06-24 06:40:47,251][15401] Updated weights for policy 0, policy_version 605720 (0.0029) [2024-06-24 06:40:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9924149248. Throughput: 0: 42725.8. Samples: 9924313180. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 06:40:51,034][15401] Updated weights for policy 0, policy_version 605730 (0.0042) [2024-06-24 06:40:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 9924362240. Throughput: 0: 42467.2. Samples: 9924433340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 06:40:54,741][15401] Updated weights for policy 0, policy_version 605740 (0.0029) [2024-06-24 06:40:58,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 9924591616. Throughput: 0: 42742.3. Samples: 9924702040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:40:58,397][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 06:40:58,955][15401] Updated weights for policy 0, policy_version 605750 (0.0025) [2024-06-24 06:41:02,240][15401] Updated weights for policy 0, policy_version 605760 (0.0042) [2024-06-24 06:41:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9924788224. Throughput: 0: 42685.6. Samples: 9924953160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 06:41:06,682][15401] Updated weights for policy 0, policy_version 605770 (0.0032) [2024-06-24 06:41:08,389][15132] Fps is (10 sec: 42626.5, 60 sec: 42872.3, 300 sec: 42876.1). Total num frames: 9925017600. Throughput: 0: 42599.6. Samples: 9925079860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 06:41:10,523][15401] Updated weights for policy 0, policy_version 605780 (0.0033) [2024-06-24 06:41:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 9925230592. Throughput: 0: 42834.2. Samples: 9925339580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:13,392][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 06:41:14,271][15401] Updated weights for policy 0, policy_version 605790 (0.0023) [2024-06-24 06:41:17,877][15401] Updated weights for policy 0, policy_version 605800 (0.0029) [2024-06-24 06:41:18,392][15132] Fps is (10 sec: 42587.3, 60 sec: 43143.3, 300 sec: 42820.2). Total num frames: 9925443584. Throughput: 0: 42785.3. Samples: 9925592200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:18,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 06:41:21,901][15401] Updated weights for policy 0, policy_version 605810 (0.0034) [2024-06-24 06:41:23,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9925640192. Throughput: 0: 42989.3. Samples: 9925726680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 06:41:25,284][15401] Updated weights for policy 0, policy_version 605820 (0.0031) [2024-06-24 06:41:28,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 9925853184. Throughput: 0: 42975.7. Samples: 9925984420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 06:41:29,529][15401] Updated weights for policy 0, policy_version 605830 (0.0037) [2024-06-24 06:41:33,020][15401] Updated weights for policy 0, policy_version 605840 (0.0030) [2024-06-24 06:41:33,390][15132] Fps is (10 sec: 44233.3, 60 sec: 43145.6, 300 sec: 42876.0). Total num frames: 9926082560. Throughput: 0: 42546.3. Samples: 9926227800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:33,391][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 06:41:37,067][15401] Updated weights for policy 0, policy_version 605850 (0.0031) [2024-06-24 06:41:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9926262784. Throughput: 0: 42884.9. Samples: 9926363160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 06:41:38,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 06:41:40,821][15401] Updated weights for policy 0, policy_version 605860 (0.0038) [2024-06-24 06:41:43,390][15132] Fps is (10 sec: 40963.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9926492160. Throughput: 0: 42566.4. Samples: 9926617260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:41:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 06:41:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605865_9926492160.pth... [2024-06-24 06:41:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605240_9916252160.pth [2024-06-24 06:41:44,903][15401] Updated weights for policy 0, policy_version 605870 (0.0032) [2024-06-24 06:41:46,377][15349] Signal inference workers to stop experience collection... (147100 times) [2024-06-24 06:41:46,384][15349] Signal inference workers to resume experience collection... (147100 times) [2024-06-24 06:41:46,402][15401] InferenceWorker_p0-w0: stopping experience collection (147100 times) [2024-06-24 06:41:46,402][15401] InferenceWorker_p0-w0: resuming experience collection (147100 times) [2024-06-24 06:41:48,347][15401] Updated weights for policy 0, policy_version 605880 (0.0036) [2024-06-24 06:41:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9926737920. Throughput: 0: 42611.0. Samples: 9926870660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:41:48,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 06:41:52,590][15401] Updated weights for policy 0, policy_version 605890 (0.0037) [2024-06-24 06:41:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 9926918144. Throughput: 0: 42824.8. Samples: 9927006980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:41:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 06:41:56,041][15401] Updated weights for policy 0, policy_version 605900 (0.0039) [2024-06-24 06:41:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 9927131136. Throughput: 0: 42601.0. Samples: 9927256520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:41:58,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 06:42:00,635][15401] Updated weights for policy 0, policy_version 605910 (0.0027) [2024-06-24 06:42:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9927376896. Throughput: 0: 42639.7. Samples: 9927510880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 06:42:03,532][15401] Updated weights for policy 0, policy_version 605920 (0.0041) [2024-06-24 06:42:08,252][15401] Updated weights for policy 0, policy_version 605930 (0.0027) [2024-06-24 06:42:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9927557120. Throughput: 0: 42667.6. Samples: 9927646720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 06:42:11,243][15401] Updated weights for policy 0, policy_version 605940 (0.0037) [2024-06-24 06:42:13,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 9927770112. Throughput: 0: 42632.4. Samples: 9927902880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 06:42:15,967][15401] Updated weights for policy 0, policy_version 605950 (0.0038) [2024-06-24 06:42:18,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42871.5, 300 sec: 42875.8). Total num frames: 9928015872. Throughput: 0: 42787.8. Samples: 9928153320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:18,393][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 06:42:18,855][15401] Updated weights for policy 0, policy_version 605960 (0.0033) [2024-06-24 06:42:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9928196096. Throughput: 0: 42752.8. Samples: 9928287040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 06:42:23,547][15401] Updated weights for policy 0, policy_version 605970 (0.0031) [2024-06-24 06:42:26,989][15401] Updated weights for policy 0, policy_version 605980 (0.0038) [2024-06-24 06:42:28,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9928425472. Throughput: 0: 42792.1. Samples: 9928542900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 06:42:31,302][15401] Updated weights for policy 0, policy_version 605990 (0.0036) [2024-06-24 06:42:33,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43145.1, 300 sec: 42931.6). Total num frames: 9928671232. Throughput: 0: 42815.9. Samples: 9928797380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 06:42:34,587][15401] Updated weights for policy 0, policy_version 606000 (0.0038) [2024-06-24 06:42:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9928835072. Throughput: 0: 42705.2. Samples: 9928928720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 06:42:39,117][15401] Updated weights for policy 0, policy_version 606010 (0.0039) [2024-06-24 06:42:42,303][15401] Updated weights for policy 0, policy_version 606020 (0.0035) [2024-06-24 06:42:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 9929048064. Throughput: 0: 42624.0. Samples: 9929174600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 06:42:46,882][15401] Updated weights for policy 0, policy_version 606030 (0.0036) [2024-06-24 06:42:48,391][15132] Fps is (10 sec: 44229.3, 60 sec: 42324.1, 300 sec: 42764.8). Total num frames: 9929277440. Throughput: 0: 42679.6. Samples: 9929431540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:48,392][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 06:42:49,818][15401] Updated weights for policy 0, policy_version 606040 (0.0035) [2024-06-24 06:42:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 9929474048. Throughput: 0: 42651.5. Samples: 9929566140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:53,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 06:42:54,231][15401] Updated weights for policy 0, policy_version 606050 (0.0040) [2024-06-24 06:42:57,722][15401] Updated weights for policy 0, policy_version 606060 (0.0030) [2024-06-24 06:42:58,389][15132] Fps is (10 sec: 42606.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9929703424. Throughput: 0: 42569.4. Samples: 9929818500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:42:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 06:43:01,644][15401] Updated weights for policy 0, policy_version 606070 (0.0035) [2024-06-24 06:43:03,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9929932800. Throughput: 0: 42715.3. Samples: 9930075400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:43:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 06:43:05,472][15401] Updated weights for policy 0, policy_version 606080 (0.0024) [2024-06-24 06:43:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.2). Total num frames: 9930113024. Throughput: 0: 42683.7. Samples: 9930207800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:43:08,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 06:43:09,409][15401] Updated weights for policy 0, policy_version 606090 (0.0031) [2024-06-24 06:43:09,736][15349] Signal inference workers to stop experience collection... (147150 times) [2024-06-24 06:43:09,736][15349] Signal inference workers to resume experience collection... (147150 times) [2024-06-24 06:43:09,772][15401] InferenceWorker_p0-w0: stopping experience collection (147150 times) [2024-06-24 06:43:09,772][15401] InferenceWorker_p0-w0: resuming experience collection (147150 times) [2024-06-24 06:43:13,162][15401] Updated weights for policy 0, policy_version 606100 (0.0035) [2024-06-24 06:43:13,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 9930342400. Throughput: 0: 42735.0. Samples: 9930466080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 06:43:13,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 06:43:17,146][15401] Updated weights for policy 0, policy_version 606110 (0.0037) [2024-06-24 06:43:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.2, 300 sec: 42765.3). Total num frames: 9930571776. Throughput: 0: 42619.7. Samples: 9930715260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:18,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-24 06:43:21,032][15401] Updated weights for policy 0, policy_version 606120 (0.0036) [2024-06-24 06:43:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9930768384. Throughput: 0: 42631.6. Samples: 9930847140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 06:43:24,802][15401] Updated weights for policy 0, policy_version 606130 (0.0031) [2024-06-24 06:43:28,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9930964992. Throughput: 0: 42841.3. Samples: 9931102460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 06:43:28,739][15401] Updated weights for policy 0, policy_version 606140 (0.0037) [2024-06-24 06:43:32,830][15401] Updated weights for policy 0, policy_version 606150 (0.0046) [2024-06-24 06:43:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9931210752. Throughput: 0: 42846.5. Samples: 9931359560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 06:43:36,390][15401] Updated weights for policy 0, policy_version 606160 (0.0035) [2024-06-24 06:43:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.1). Total num frames: 9931407360. Throughput: 0: 42727.7. Samples: 9931488780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 06:43:40,234][15401] Updated weights for policy 0, policy_version 606170 (0.0036) [2024-06-24 06:43:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 9931620352. Throughput: 0: 42825.3. Samples: 9931745640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 06:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606178_9931620352.pth... [2024-06-24 06:43:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605553_9921380352.pth [2024-06-24 06:43:44,362][15401] Updated weights for policy 0, policy_version 606180 (0.0040) [2024-06-24 06:43:47,830][15401] Updated weights for policy 0, policy_version 606190 (0.0046) [2024-06-24 06:43:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42599.6, 300 sec: 42598.7). Total num frames: 9931833344. Throughput: 0: 42753.2. Samples: 9931999300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:48,399][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 06:43:52,086][15401] Updated weights for policy 0, policy_version 606200 (0.0033) [2024-06-24 06:43:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9932062720. Throughput: 0: 42643.4. Samples: 9932126860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:53,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 06:43:55,369][15401] Updated weights for policy 0, policy_version 606210 (0.0043) [2024-06-24 06:43:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42654.1). Total num frames: 9932259328. Throughput: 0: 42563.5. Samples: 9932381440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:43:58,393][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 06:43:59,703][15401] Updated weights for policy 0, policy_version 606220 (0.0028) [2024-06-24 06:44:03,212][15401] Updated weights for policy 0, policy_version 606230 (0.0023) [2024-06-24 06:44:03,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9932472320. Throughput: 0: 42677.4. Samples: 9932635740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 06:44:07,394][15401] Updated weights for policy 0, policy_version 606240 (0.0030) [2024-06-24 06:44:08,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 9932652544. Throughput: 0: 42469.0. Samples: 9932758240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 06:44:10,866][15401] Updated weights for policy 0, policy_version 606250 (0.0030) [2024-06-24 06:44:13,394][15132] Fps is (10 sec: 42580.3, 60 sec: 42597.2, 300 sec: 42653.7). Total num frames: 9932898304. Throughput: 0: 42536.5. Samples: 9933016780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:13,394][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 06:44:15,084][15401] Updated weights for policy 0, policy_version 606260 (0.0039) [2024-06-24 06:44:18,303][15401] Updated weights for policy 0, policy_version 606270 (0.0039) [2024-06-24 06:44:18,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 9933127680. Throughput: 0: 42433.8. Samples: 9933269080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 06:44:22,691][15401] Updated weights for policy 0, policy_version 606280 (0.0030) [2024-06-24 06:44:23,390][15132] Fps is (10 sec: 39337.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 9933291520. Throughput: 0: 42458.5. Samples: 9933399420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 06:44:25,666][15401] Updated weights for policy 0, policy_version 606290 (0.0041) [2024-06-24 06:44:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 9933537280. Throughput: 0: 42526.2. Samples: 9933659320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 06:44:30,318][15401] Updated weights for policy 0, policy_version 606300 (0.0035) [2024-06-24 06:44:31,829][15349] Signal inference workers to stop experience collection... (147200 times) [2024-06-24 06:44:31,885][15349] Signal inference workers to resume experience collection... (147200 times) [2024-06-24 06:44:31,888][15401] InferenceWorker_p0-w0: stopping experience collection (147200 times) [2024-06-24 06:44:31,900][15401] InferenceWorker_p0-w0: resuming experience collection (147200 times) [2024-06-24 06:44:33,195][15401] Updated weights for policy 0, policy_version 606310 (0.0027) [2024-06-24 06:44:33,392][15132] Fps is (10 sec: 49140.3, 60 sec: 42869.8, 300 sec: 42709.4). Total num frames: 9933783040. Throughput: 0: 42424.0. Samples: 9933908480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:33,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 06:44:38,181][15401] Updated weights for policy 0, policy_version 606320 (0.0037) [2024-06-24 06:44:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 9933946880. Throughput: 0: 42582.2. Samples: 9934042960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 06:44:41,438][15401] Updated weights for policy 0, policy_version 606330 (0.0053) [2024-06-24 06:44:43,390][15132] Fps is (10 sec: 40967.6, 60 sec: 42871.1, 300 sec: 42764.9). Total num frames: 9934192640. Throughput: 0: 42706.2. Samples: 9934303140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:43,391][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 06:44:45,790][15401] Updated weights for policy 0, policy_version 606340 (0.0027) [2024-06-24 06:44:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 9934405632. Throughput: 0: 42626.6. Samples: 9934553940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 06:44:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 06:44:49,218][15401] Updated weights for policy 0, policy_version 606350 (0.0039) [2024-06-24 06:44:53,389][15132] Fps is (10 sec: 39324.2, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 9934585856. Throughput: 0: 42777.8. Samples: 9934683240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:44:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 06:44:53,602][15401] Updated weights for policy 0, policy_version 606360 (0.0033) [2024-06-24 06:44:56,987][15401] Updated weights for policy 0, policy_version 606370 (0.0035) [2024-06-24 06:44:58,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 9934815232. Throughput: 0: 42752.7. Samples: 9934940580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:44:58,393][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 06:45:01,107][15401] Updated weights for policy 0, policy_version 606380 (0.0038) [2024-06-24 06:45:03,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.6). Total num frames: 9935044608. Throughput: 0: 42848.5. Samples: 9935197260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 06:45:04,559][15401] Updated weights for policy 0, policy_version 606390 (0.0033) [2024-06-24 06:45:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 9935224832. Throughput: 0: 42858.3. Samples: 9935328040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 06:45:08,797][15401] Updated weights for policy 0, policy_version 606400 (0.0030) [2024-06-24 06:45:12,183][15401] Updated weights for policy 0, policy_version 606410 (0.0027) [2024-06-24 06:45:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42601.4, 300 sec: 42709.6). Total num frames: 9935454208. Throughput: 0: 42748.1. Samples: 9935582980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:13,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 06:45:16,459][15401] Updated weights for policy 0, policy_version 606420 (0.0029) [2024-06-24 06:45:18,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9935699968. Throughput: 0: 42771.7. Samples: 9935833100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 06:45:19,812][15401] Updated weights for policy 0, policy_version 606430 (0.0041) [2024-06-24 06:45:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 9935880192. Throughput: 0: 42731.6. Samples: 9935965880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 06:45:24,035][15401] Updated weights for policy 0, policy_version 606440 (0.0038) [2024-06-24 06:45:27,618][15401] Updated weights for policy 0, policy_version 606450 (0.0039) [2024-06-24 06:45:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 9936109568. Throughput: 0: 42608.5. Samples: 9936220500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:28,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 06:45:31,481][15401] Updated weights for policy 0, policy_version 606460 (0.0031) [2024-06-24 06:45:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 9936322560. Throughput: 0: 42667.5. Samples: 9936474080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:33,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 06:45:35,475][15401] Updated weights for policy 0, policy_version 606470 (0.0045) [2024-06-24 06:45:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 9936519168. Throughput: 0: 42702.1. Samples: 9936604840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 06:45:39,043][15401] Updated weights for policy 0, policy_version 606480 (0.0037) [2024-06-24 06:45:43,130][15401] Updated weights for policy 0, policy_version 606490 (0.0033) [2024-06-24 06:45:43,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.8, 300 sec: 42709.5). Total num frames: 9936748544. Throughput: 0: 42640.5. Samples: 9936859300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 06:45:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606491_9936748544.pth... [2024-06-24 06:45:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000605865_9926492160.pth [2024-06-24 06:45:46,922][15401] Updated weights for policy 0, policy_version 606500 (0.0048) [2024-06-24 06:45:48,065][15349] Signal inference workers to stop experience collection... (147250 times) [2024-06-24 06:45:48,065][15349] Signal inference workers to resume experience collection... (147250 times) [2024-06-24 06:45:48,088][15401] InferenceWorker_p0-w0: stopping experience collection (147250 times) [2024-06-24 06:45:48,088][15401] InferenceWorker_p0-w0: resuming experience collection (147250 times) [2024-06-24 06:45:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9936961536. Throughput: 0: 42691.5. Samples: 9937118380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 06:45:50,735][15401] Updated weights for policy 0, policy_version 606510 (0.0028) [2024-06-24 06:45:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 9937158144. Throughput: 0: 42488.4. Samples: 9937240020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 06:45:54,581][15401] Updated weights for policy 0, policy_version 606520 (0.0028) [2024-06-24 06:45:58,273][15401] Updated weights for policy 0, policy_version 606530 (0.0036) [2024-06-24 06:45:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 9937387520. Throughput: 0: 42469.7. Samples: 9937494120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:45:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 06:46:02,300][15401] Updated weights for policy 0, policy_version 606540 (0.0039) [2024-06-24 06:46:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 9937567744. Throughput: 0: 42742.3. Samples: 9937756500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:46:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 06:46:06,248][15401] Updated weights for policy 0, policy_version 606550 (0.0032) [2024-06-24 06:46:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 9937797120. Throughput: 0: 42548.4. Samples: 9937880560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:46:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 06:46:09,911][15401] Updated weights for policy 0, policy_version 606560 (0.0039) [2024-06-24 06:46:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 9938026496. Throughput: 0: 42542.3. Samples: 9938134900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:46:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 06:46:13,704][15401] Updated weights for policy 0, policy_version 606570 (0.0025) [2024-06-24 06:46:17,433][15401] Updated weights for policy 0, policy_version 606580 (0.0032) [2024-06-24 06:46:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 9938223104. Throughput: 0: 42736.5. Samples: 9938397120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:46:18,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 06:46:21,665][15401] Updated weights for policy 0, policy_version 606590 (0.0034) [2024-06-24 06:46:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9938452480. Throughput: 0: 42635.1. Samples: 9938523520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 06:46:23,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 06:46:25,283][15401] Updated weights for policy 0, policy_version 606600 (0.0043) [2024-06-24 06:46:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 9938665472. Throughput: 0: 42653.3. Samples: 9938778700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 06:46:29,327][15401] Updated weights for policy 0, policy_version 606610 (0.0037) [2024-06-24 06:46:32,926][15401] Updated weights for policy 0, policy_version 606620 (0.0050) [2024-06-24 06:46:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9938878464. Throughput: 0: 42583.5. Samples: 9939034640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:33,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 06:46:36,924][15401] Updated weights for policy 0, policy_version 606630 (0.0030) [2024-06-24 06:46:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 9939107840. Throughput: 0: 42853.2. Samples: 9939168520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:38,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 06:46:40,601][15401] Updated weights for policy 0, policy_version 606640 (0.0038) [2024-06-24 06:46:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 9939304448. Throughput: 0: 42820.9. Samples: 9939421060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 06:46:44,733][15401] Updated weights for policy 0, policy_version 606650 (0.0032) [2024-06-24 06:46:47,999][15401] Updated weights for policy 0, policy_version 606660 (0.0026) [2024-06-24 06:46:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9939533824. Throughput: 0: 42844.8. Samples: 9939684520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:48,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-24 06:46:52,083][15401] Updated weights for policy 0, policy_version 606670 (0.0042) [2024-06-24 06:46:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 9939763200. Throughput: 0: 42960.9. Samples: 9939813800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 06:46:55,534][15401] Updated weights for policy 0, policy_version 606680 (0.0047) [2024-06-24 06:46:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9939943424. Throughput: 0: 43039.5. Samples: 9940071680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:46:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 06:46:59,510][15401] Updated weights for policy 0, policy_version 606690 (0.0031) [2024-06-24 06:47:03,055][15401] Updated weights for policy 0, policy_version 606700 (0.0044) [2024-06-24 06:47:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 9940189184. Throughput: 0: 43074.2. Samples: 9940335460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 06:47:07,039][15401] Updated weights for policy 0, policy_version 606710 (0.0039) [2024-06-24 06:47:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 9940385792. Throughput: 0: 43085.9. Samples: 9940462280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 06:47:10,596][15349] Signal inference workers to stop experience collection... (147300 times) [2024-06-24 06:47:10,599][15349] Signal inference workers to resume experience collection... (147300 times) [2024-06-24 06:47:10,612][15401] InferenceWorker_p0-w0: stopping experience collection (147300 times) [2024-06-24 06:47:10,613][15401] InferenceWorker_p0-w0: resuming experience collection (147300 times) [2024-06-24 06:47:10,744][15401] Updated weights for policy 0, policy_version 606720 (0.0032) [2024-06-24 06:47:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 9940598784. Throughput: 0: 43138.7. Samples: 9940719940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 06:47:14,569][15401] Updated weights for policy 0, policy_version 606730 (0.0022) [2024-06-24 06:47:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 9940811776. Throughput: 0: 43191.1. Samples: 9940978340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:18,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 06:47:18,543][15401] Updated weights for policy 0, policy_version 606740 (0.0030) [2024-06-24 06:47:22,710][15401] Updated weights for policy 0, policy_version 606750 (0.0034) [2024-06-24 06:47:23,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 9941008384. Throughput: 0: 43008.0. Samples: 9941103880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:23,392][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 06:47:26,194][15401] Updated weights for policy 0, policy_version 606760 (0.0035) [2024-06-24 06:47:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 9941237760. Throughput: 0: 43070.2. Samples: 9941359220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 06:47:30,235][15401] Updated weights for policy 0, policy_version 606770 (0.0044) [2024-06-24 06:47:33,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9941450752. Throughput: 0: 43033.7. Samples: 9941621040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 06:47:33,759][15401] Updated weights for policy 0, policy_version 606780 (0.0038) [2024-06-24 06:47:38,037][15401] Updated weights for policy 0, policy_version 606790 (0.0041) [2024-06-24 06:47:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 9941663744. Throughput: 0: 42957.4. Samples: 9941746880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 06:47:41,298][15401] Updated weights for policy 0, policy_version 606800 (0.0027) [2024-06-24 06:47:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 9941893120. Throughput: 0: 42988.0. Samples: 9942006140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 06:47:43,496][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606806_9941909504.pth... [2024-06-24 06:47:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606178_9931620352.pth [2024-06-24 06:47:45,572][15401] Updated weights for policy 0, policy_version 606810 (0.0039) [2024-06-24 06:47:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 9942089728. Throughput: 0: 43010.8. Samples: 9942270940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 06:47:48,955][15401] Updated weights for policy 0, policy_version 606820 (0.0023) [2024-06-24 06:47:53,379][15401] Updated weights for policy 0, policy_version 606830 (0.0035) [2024-06-24 06:47:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9942302720. Throughput: 0: 42894.6. Samples: 9942392540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 06:47:56,680][15401] Updated weights for policy 0, policy_version 606840 (0.0033) [2024-06-24 06:47:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 9942548480. Throughput: 0: 42835.0. Samples: 9942647520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 06:47:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 06:48:00,955][15401] Updated weights for policy 0, policy_version 606850 (0.0029) [2024-06-24 06:48:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 9942712320. Throughput: 0: 42954.4. Samples: 9942911180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 06:48:04,465][15401] Updated weights for policy 0, policy_version 606860 (0.0024) [2024-06-24 06:48:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 9942941696. Throughput: 0: 42849.0. Samples: 9943031980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 06:48:08,474][15401] Updated weights for policy 0, policy_version 606870 (0.0033) [2024-06-24 06:48:12,027][15401] Updated weights for policy 0, policy_version 606880 (0.0036) [2024-06-24 06:48:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9943187456. Throughput: 0: 42919.1. Samples: 9943290580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 06:48:15,950][15401] Updated weights for policy 0, policy_version 606890 (0.0038) [2024-06-24 06:48:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 9943351296. Throughput: 0: 42961.4. Samples: 9943554300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 06:48:19,549][15349] Signal inference workers to stop experience collection... (147350 times) [2024-06-24 06:48:19,551][15349] Signal inference workers to resume experience collection... (147350 times) [2024-06-24 06:48:19,560][15401] InferenceWorker_p0-w0: stopping experience collection (147350 times) [2024-06-24 06:48:19,593][15401] InferenceWorker_p0-w0: resuming experience collection (147350 times) [2024-06-24 06:48:19,698][15401] Updated weights for policy 0, policy_version 606900 (0.0023) [2024-06-24 06:48:23,390][15132] Fps is (10 sec: 40957.0, 60 sec: 43145.7, 300 sec: 42820.4). Total num frames: 9943597056. Throughput: 0: 42780.6. Samples: 9943672040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:23,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 06:48:23,509][15401] Updated weights for policy 0, policy_version 606910 (0.0041) [2024-06-24 06:48:27,593][15401] Updated weights for policy 0, policy_version 606920 (0.0037) [2024-06-24 06:48:28,390][15132] Fps is (10 sec: 47510.7, 60 sec: 43144.2, 300 sec: 42765.0). Total num frames: 9943826432. Throughput: 0: 42845.7. Samples: 9943934220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:28,391][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 06:48:30,903][15401] Updated weights for policy 0, policy_version 606930 (0.0031) [2024-06-24 06:48:33,389][15132] Fps is (10 sec: 39324.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 9943990272. Throughput: 0: 42738.2. Samples: 9944194160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 06:48:35,147][15401] Updated weights for policy 0, policy_version 606940 (0.0029) [2024-06-24 06:48:38,389][15132] Fps is (10 sec: 44239.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9944268800. Throughput: 0: 42829.0. Samples: 9944319840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 06:48:38,401][15401] Updated weights for policy 0, policy_version 606950 (0.0037) [2024-06-24 06:48:42,719][15401] Updated weights for policy 0, policy_version 606960 (0.0036) [2024-06-24 06:48:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9944449024. Throughput: 0: 43012.5. Samples: 9944583080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 06:48:45,980][15401] Updated weights for policy 0, policy_version 606970 (0.0038) [2024-06-24 06:48:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9944645632. Throughput: 0: 42887.6. Samples: 9944841120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 06:48:50,169][15401] Updated weights for policy 0, policy_version 606980 (0.0032) [2024-06-24 06:48:53,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43412.9, 300 sec: 42875.5). Total num frames: 9944907776. Throughput: 0: 42850.2. Samples: 9944960520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:53,396][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 06:48:53,527][15401] Updated weights for policy 0, policy_version 606990 (0.0043) [2024-06-24 06:48:57,763][15401] Updated weights for policy 0, policy_version 607000 (0.0030) [2024-06-24 06:48:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9945104384. Throughput: 0: 42961.8. Samples: 9945223860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:48:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 06:49:01,426][15401] Updated weights for policy 0, policy_version 607010 (0.0032) [2024-06-24 06:49:03,389][15132] Fps is (10 sec: 39347.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9945300992. Throughput: 0: 42806.2. Samples: 9945480580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:03,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 06:49:05,275][15401] Updated weights for policy 0, policy_version 607020 (0.0030) [2024-06-24 06:49:08,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43412.9, 300 sec: 42875.8). Total num frames: 9945546752. Throughput: 0: 42952.3. Samples: 9945605140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:08,397][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 06:49:09,211][15401] Updated weights for policy 0, policy_version 607030 (0.0035) [2024-06-24 06:49:13,140][15401] Updated weights for policy 0, policy_version 607040 (0.0039) [2024-06-24 06:49:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 9945743360. Throughput: 0: 42911.5. Samples: 9945865220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 06:49:16,857][15401] Updated weights for policy 0, policy_version 607050 (0.0045) [2024-06-24 06:49:18,392][15132] Fps is (10 sec: 37698.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9945923584. Throughput: 0: 42775.4. Samples: 9946119160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:18,393][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 06:49:20,744][15401] Updated weights for policy 0, policy_version 607060 (0.0031) [2024-06-24 06:49:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 9946152960. Throughput: 0: 42709.7. Samples: 9946241780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:23,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 06:49:24,684][15401] Updated weights for policy 0, policy_version 607070 (0.0048) [2024-06-24 06:49:28,364][15401] Updated weights for policy 0, policy_version 607080 (0.0035) [2024-06-24 06:49:28,392][15132] Fps is (10 sec: 47513.5, 60 sec: 42870.1, 300 sec: 42765.0). Total num frames: 9946398720. Throughput: 0: 42627.9. Samples: 9946501440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-24 06:49:28,393][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 06:49:32,295][15401] Updated weights for policy 0, policy_version 607090 (0.0029) [2024-06-24 06:49:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9946578944. Throughput: 0: 42531.1. Samples: 9946755020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 06:49:36,048][15401] Updated weights for policy 0, policy_version 607100 (0.0041) [2024-06-24 06:49:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9946824704. Throughput: 0: 42634.0. Samples: 9946878780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:38,398][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 06:49:39,837][15401] Updated weights for policy 0, policy_version 607110 (0.0028) [2024-06-24 06:49:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9947021312. Throughput: 0: 42729.8. Samples: 9947146700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:43,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-24 06:49:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607119_9947037696.pth... [2024-06-24 06:49:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606491_9936748544.pth [2024-06-24 06:49:43,639][15401] Updated weights for policy 0, policy_version 607120 (0.0034) [2024-06-24 06:49:46,022][15349] Signal inference workers to stop experience collection... (147400 times) [2024-06-24 06:49:46,023][15349] Signal inference workers to resume experience collection... (147400 times) [2024-06-24 06:49:46,068][15401] InferenceWorker_p0-w0: stopping experience collection (147400 times) [2024-06-24 06:49:46,068][15401] InferenceWorker_p0-w0: resuming experience collection (147400 times) [2024-06-24 06:49:47,527][15401] Updated weights for policy 0, policy_version 607130 (0.0032) [2024-06-24 06:49:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9947234304. Throughput: 0: 42694.2. Samples: 9947401820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:48,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 06:49:51,317][15401] Updated weights for policy 0, policy_version 607140 (0.0033) [2024-06-24 06:49:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42603.0, 300 sec: 42876.4). Total num frames: 9947463680. Throughput: 0: 42737.7. Samples: 9947528060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 06:49:55,715][15401] Updated weights for policy 0, policy_version 607150 (0.0035) [2024-06-24 06:49:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 9947660288. Throughput: 0: 42759.1. Samples: 9947789480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:49:58,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 06:49:59,245][15401] Updated weights for policy 0, policy_version 607160 (0.0034) [2024-06-24 06:50:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9947856896. Throughput: 0: 42773.4. Samples: 9948043860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 06:50:03,451][15401] Updated weights for policy 0, policy_version 607170 (0.0028) [2024-06-24 06:50:06,959][15401] Updated weights for policy 0, policy_version 607180 (0.0047) [2024-06-24 06:50:08,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 9948119040. Throughput: 0: 42890.6. Samples: 9948171860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 06:50:11,119][15401] Updated weights for policy 0, policy_version 607190 (0.0030) [2024-06-24 06:50:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9948299264. Throughput: 0: 42828.6. Samples: 9948428620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 06:50:14,687][15401] Updated weights for policy 0, policy_version 607200 (0.0039) [2024-06-24 06:50:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 9948512256. Throughput: 0: 43015.1. Samples: 9948690700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 06:50:18,614][15401] Updated weights for policy 0, policy_version 607210 (0.0024) [2024-06-24 06:50:22,317][15401] Updated weights for policy 0, policy_version 607220 (0.0049) [2024-06-24 06:50:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9948741632. Throughput: 0: 42989.5. Samples: 9948813300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 06:50:26,139][15401] Updated weights for policy 0, policy_version 607230 (0.0032) [2024-06-24 06:50:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42820.9). Total num frames: 9948954624. Throughput: 0: 42863.6. Samples: 9949075560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 06:50:29,869][15401] Updated weights for policy 0, policy_version 607240 (0.0033) [2024-06-24 06:50:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 9949151232. Throughput: 0: 42989.3. Samples: 9949336340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 06:50:33,795][15401] Updated weights for policy 0, policy_version 607250 (0.0039) [2024-06-24 06:50:37,608][15401] Updated weights for policy 0, policy_version 607260 (0.0030) [2024-06-24 06:50:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9949396992. Throughput: 0: 42926.7. Samples: 9949459760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 06:50:41,217][15401] Updated weights for policy 0, policy_version 607270 (0.0022) [2024-06-24 06:50:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9949593600. Throughput: 0: 42713.7. Samples: 9949711600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:43,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 06:50:45,173][15401] Updated weights for policy 0, policy_version 607280 (0.0026) [2024-06-24 06:50:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 9949790208. Throughput: 0: 43149.0. Samples: 9949985560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 06:50:48,726][15401] Updated weights for policy 0, policy_version 607290 (0.0037) [2024-06-24 06:50:52,754][15401] Updated weights for policy 0, policy_version 607300 (0.0032) [2024-06-24 06:50:53,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 9950052352. Throughput: 0: 43072.8. Samples: 9950110140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 06:50:56,509][15401] Updated weights for policy 0, policy_version 607310 (0.0046) [2024-06-24 06:50:57,008][15349] Signal inference workers to stop experience collection... (147450 times) [2024-06-24 06:50:57,037][15401] InferenceWorker_p0-w0: stopping experience collection (147450 times) [2024-06-24 06:50:57,062][15349] Signal inference workers to resume experience collection... (147450 times) [2024-06-24 06:50:57,063][15401] InferenceWorker_p0-w0: resuming experience collection (147450 times) [2024-06-24 06:50:58,396][15132] Fps is (10 sec: 45845.4, 60 sec: 43141.6, 300 sec: 42986.2). Total num frames: 9950248960. Throughput: 0: 42977.8. Samples: 9950362900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:50:58,396][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 06:51:00,174][15401] Updated weights for policy 0, policy_version 607320 (0.0044) [2024-06-24 06:51:03,392][15132] Fps is (10 sec: 39312.5, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 9950445568. Throughput: 0: 43019.9. Samples: 9950626700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 06:51:03,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 06:51:04,155][15401] Updated weights for policy 0, policy_version 607330 (0.0028) [2024-06-24 06:51:07,654][15401] Updated weights for policy 0, policy_version 607340 (0.0031) [2024-06-24 06:51:08,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9950691328. Throughput: 0: 43146.6. Samples: 9950754900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 06:51:11,874][15401] Updated weights for policy 0, policy_version 607350 (0.0035) [2024-06-24 06:51:13,391][15132] Fps is (10 sec: 44241.5, 60 sec: 43143.5, 300 sec: 42931.4). Total num frames: 9950887936. Throughput: 0: 42903.2. Samples: 9951006260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:13,391][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 06:51:15,478][15401] Updated weights for policy 0, policy_version 607360 (0.0033) [2024-06-24 06:51:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 9951084544. Throughput: 0: 42833.3. Samples: 9951263940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:18,401][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 06:51:19,403][15401] Updated weights for policy 0, policy_version 607370 (0.0045) [2024-06-24 06:51:23,182][15401] Updated weights for policy 0, policy_version 607380 (0.0022) [2024-06-24 06:51:23,392][15132] Fps is (10 sec: 42593.8, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 9951313920. Throughput: 0: 43004.7. Samples: 9951395080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:23,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 06:51:27,038][15401] Updated weights for policy 0, policy_version 607390 (0.0033) [2024-06-24 06:51:28,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9951526912. Throughput: 0: 42945.5. Samples: 9951644040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 06:51:30,750][15401] Updated weights for policy 0, policy_version 607400 (0.0037) [2024-06-24 06:51:33,392][15132] Fps is (10 sec: 42599.4, 60 sec: 43143.0, 300 sec: 42820.6). Total num frames: 9951739904. Throughput: 0: 42545.4. Samples: 9951900200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:33,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 06:51:34,637][15401] Updated weights for policy 0, policy_version 607410 (0.0036) [2024-06-24 06:51:38,310][15401] Updated weights for policy 0, policy_version 607420 (0.0034) [2024-06-24 06:51:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9951969280. Throughput: 0: 42759.7. Samples: 9952034320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 06:51:42,347][15401] Updated weights for policy 0, policy_version 607430 (0.0024) [2024-06-24 06:51:43,389][15132] Fps is (10 sec: 42607.7, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 9952165888. Throughput: 0: 42819.4. Samples: 9952289500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 06:51:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607433_9952182272.pth... [2024-06-24 06:51:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000606806_9941909504.pth [2024-06-24 06:51:46,089][15401] Updated weights for policy 0, policy_version 607440 (0.0040) [2024-06-24 06:51:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 9952395264. Throughput: 0: 42632.0. Samples: 9952545040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 06:51:50,107][15401] Updated weights for policy 0, policy_version 607450 (0.0049) [2024-06-24 06:51:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9952591872. Throughput: 0: 42642.2. Samples: 9952673800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 06:51:53,844][15401] Updated weights for policy 0, policy_version 607460 (0.0038) [2024-06-24 06:51:57,838][15401] Updated weights for policy 0, policy_version 607470 (0.0023) [2024-06-24 06:51:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 9952804864. Throughput: 0: 42791.5. Samples: 9952931820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:51:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 06:52:01,333][15401] Updated weights for policy 0, policy_version 607480 (0.0037) [2024-06-24 06:52:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 9953034240. Throughput: 0: 42760.5. Samples: 9953188060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:03,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 06:52:05,342][15401] Updated weights for policy 0, policy_version 607490 (0.0044) [2024-06-24 06:52:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9953247232. Throughput: 0: 42767.7. Samples: 9953319520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 06:52:08,978][15401] Updated weights for policy 0, policy_version 607500 (0.0031) [2024-06-24 06:52:12,948][15401] Updated weights for policy 0, policy_version 607510 (0.0042) [2024-06-24 06:52:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.4, 300 sec: 42876.4). Total num frames: 9953460224. Throughput: 0: 42900.3. Samples: 9953574560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:13,404][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 06:52:16,782][15401] Updated weights for policy 0, policy_version 607520 (0.0025) [2024-06-24 06:52:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42876.4). Total num frames: 9953656832. Throughput: 0: 42803.3. Samples: 9953826260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 06:52:20,754][15401] Updated weights for policy 0, policy_version 607530 (0.0032) [2024-06-24 06:52:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 9953869824. Throughput: 0: 42640.5. Samples: 9953953140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 06:52:24,217][15401] Updated weights for policy 0, policy_version 607540 (0.0032) [2024-06-24 06:52:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9954082816. Throughput: 0: 42697.4. Samples: 9954210880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 06:52:28,441][15401] Updated weights for policy 0, policy_version 607550 (0.0031) [2024-06-24 06:52:31,042][15349] Signal inference workers to stop experience collection... (147500 times) [2024-06-24 06:52:31,043][15349] Signal inference workers to resume experience collection... (147500 times) [2024-06-24 06:52:31,057][15401] InferenceWorker_p0-w0: stopping experience collection (147500 times) [2024-06-24 06:52:31,070][15401] InferenceWorker_p0-w0: resuming experience collection (147500 times) [2024-06-24 06:52:31,877][15401] Updated weights for policy 0, policy_version 607560 (0.0044) [2024-06-24 06:52:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 9954328576. Throughput: 0: 42736.1. Samples: 9954468160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 06:52:35,879][15401] Updated weights for policy 0, policy_version 607570 (0.0021) [2024-06-24 06:52:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9954508800. Throughput: 0: 42822.3. Samples: 9954600800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 06:52:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 06:52:39,481][15401] Updated weights for policy 0, policy_version 607580 (0.0043) [2024-06-24 06:52:43,314][15401] Updated weights for policy 0, policy_version 607590 (0.0028) [2024-06-24 06:52:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9954754560. Throughput: 0: 42823.5. Samples: 9954858880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:52:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 06:52:47,073][15401] Updated weights for policy 0, policy_version 607600 (0.0033) [2024-06-24 06:52:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9954967552. Throughput: 0: 42756.5. Samples: 9955112100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:52:48,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-24 06:52:50,915][15401] Updated weights for policy 0, policy_version 607610 (0.0031) [2024-06-24 06:52:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 9955147776. Throughput: 0: 42742.6. Samples: 9955242940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:52:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 06:52:54,682][15401] Updated weights for policy 0, policy_version 607620 (0.0044) [2024-06-24 06:52:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9955377152. Throughput: 0: 42966.8. Samples: 9955508060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:52:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 06:52:58,978][15401] Updated weights for policy 0, policy_version 607630 (0.0028) [2024-06-24 06:53:02,221][15401] Updated weights for policy 0, policy_version 607640 (0.0029) [2024-06-24 06:53:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 9955606528. Throughput: 0: 43120.5. Samples: 9955766680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 06:53:06,493][15401] Updated weights for policy 0, policy_version 607650 (0.0048) [2024-06-24 06:53:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9955803136. Throughput: 0: 43272.4. Samples: 9955900400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:08,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 06:53:09,922][15401] Updated weights for policy 0, policy_version 607660 (0.0041) [2024-06-24 06:53:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9956032512. Throughput: 0: 43269.7. Samples: 9956158020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:13,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 06:53:13,838][15401] Updated weights for policy 0, policy_version 607670 (0.0034) [2024-06-24 06:53:17,358][15401] Updated weights for policy 0, policy_version 607680 (0.0036) [2024-06-24 06:53:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 9956261888. Throughput: 0: 43235.4. Samples: 9956413760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 06:53:21,276][15401] Updated weights for policy 0, policy_version 607690 (0.0037) [2024-06-24 06:53:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42765.1). Total num frames: 9956442112. Throughput: 0: 43209.2. Samples: 9956545220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 06:53:25,479][15401] Updated weights for policy 0, policy_version 607700 (0.0043) [2024-06-24 06:53:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9956671488. Throughput: 0: 43066.8. Samples: 9956796880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 06:53:29,422][15401] Updated weights for policy 0, policy_version 607710 (0.0031) [2024-06-24 06:53:32,962][15401] Updated weights for policy 0, policy_version 607720 (0.0022) [2024-06-24 06:53:33,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9956900864. Throughput: 0: 43191.9. Samples: 9957055840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:33,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 06:53:36,862][15401] Updated weights for policy 0, policy_version 607730 (0.0041) [2024-06-24 06:53:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9957081088. Throughput: 0: 43169.0. Samples: 9957185540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 06:53:40,471][15401] Updated weights for policy 0, policy_version 607740 (0.0031) [2024-06-24 06:53:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9957326848. Throughput: 0: 42992.0. Samples: 9957442700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 06:53:43,435][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607748_9957343232.pth... [2024-06-24 06:53:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607119_9947037696.pth [2024-06-24 06:53:44,244][15401] Updated weights for policy 0, policy_version 607750 (0.0028) [2024-06-24 06:53:48,009][15401] Updated weights for policy 0, policy_version 607760 (0.0038) [2024-06-24 06:53:48,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.8, 300 sec: 42876.7). Total num frames: 9957556224. Throughput: 0: 42972.9. Samples: 9957700560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:48,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 06:53:52,354][15401] Updated weights for policy 0, policy_version 607770 (0.0038) [2024-06-24 06:53:53,350][15349] Signal inference workers to stop experience collection... (147550 times) [2024-06-24 06:53:53,351][15349] Signal inference workers to resume experience collection... (147550 times) [2024-06-24 06:53:53,363][15401] InferenceWorker_p0-w0: stopping experience collection (147550 times) [2024-06-24 06:53:53,387][15401] InferenceWorker_p0-w0: resuming experience collection (147550 times) [2024-06-24 06:53:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9957736448. Throughput: 0: 42786.1. Samples: 9957825780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:53,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-24 06:53:55,689][15401] Updated weights for policy 0, policy_version 607780 (0.0041) [2024-06-24 06:53:58,389][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 9957982208. Throughput: 0: 42800.9. Samples: 9958084060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:53:58,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 06:53:59,814][15401] Updated weights for policy 0, policy_version 607790 (0.0025) [2024-06-24 06:54:03,377][15401] Updated weights for policy 0, policy_version 607800 (0.0046) [2024-06-24 06:54:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 9958195200. Throughput: 0: 43001.4. Samples: 9958348820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:54:03,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 06:54:07,344][15401] Updated weights for policy 0, policy_version 607810 (0.0046) [2024-06-24 06:54:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9958391808. Throughput: 0: 42725.0. Samples: 9958467840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:54:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 06:54:11,237][15401] Updated weights for policy 0, policy_version 607820 (0.0045) [2024-06-24 06:54:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 9958621184. Throughput: 0: 42896.4. Samples: 9958727220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 06:54:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 06:54:14,990][15401] Updated weights for policy 0, policy_version 607830 (0.0029) [2024-06-24 06:54:18,390][15132] Fps is (10 sec: 40956.3, 60 sec: 42324.8, 300 sec: 42876.0). Total num frames: 9958801408. Throughput: 0: 43034.3. Samples: 9958992320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:18,391][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 06:54:18,778][15401] Updated weights for policy 0, policy_version 607840 (0.0023) [2024-06-24 06:54:22,768][15401] Updated weights for policy 0, policy_version 607850 (0.0039) [2024-06-24 06:54:23,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 9959014400. Throughput: 0: 42735.4. Samples: 9959108740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:23,392][15132] Avg episode reward: [(0, '0.841')] [2024-06-24 06:54:26,521][15401] Updated weights for policy 0, policy_version 607860 (0.0038) [2024-06-24 06:54:28,389][15132] Fps is (10 sec: 45879.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9959260160. Throughput: 0: 42706.8. Samples: 9959364500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 06:54:30,636][15401] Updated weights for policy 0, policy_version 607870 (0.0040) [2024-06-24 06:54:33,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 9959440384. Throughput: 0: 42907.9. Samples: 9959631320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 06:54:34,127][15401] Updated weights for policy 0, policy_version 607880 (0.0032) [2024-06-24 06:54:38,155][15401] Updated weights for policy 0, policy_version 607890 (0.0031) [2024-06-24 06:54:38,391][15132] Fps is (10 sec: 40953.2, 60 sec: 43143.3, 300 sec: 42875.9). Total num frames: 9959669760. Throughput: 0: 42832.3. Samples: 9959753300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:38,391][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 06:54:41,781][15401] Updated weights for policy 0, policy_version 607900 (0.0043) [2024-06-24 06:54:43,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 9959915520. Throughput: 0: 42871.1. Samples: 9960013260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 06:54:45,745][15401] Updated weights for policy 0, policy_version 607910 (0.0044) [2024-06-24 06:54:48,390][15132] Fps is (10 sec: 42604.9, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 9960095744. Throughput: 0: 42727.6. Samples: 9960271560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:48,404][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 06:54:49,700][15401] Updated weights for policy 0, policy_version 607920 (0.0037) [2024-06-24 06:54:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 9960292352. Throughput: 0: 42766.6. Samples: 9960392340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:53,398][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 06:54:53,803][15401] Updated weights for policy 0, policy_version 607930 (0.0037) [2024-06-24 06:54:57,279][15401] Updated weights for policy 0, policy_version 607940 (0.0032) [2024-06-24 06:54:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 9960554496. Throughput: 0: 42911.7. Samples: 9960658240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:54:58,398][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 06:55:01,241][15401] Updated weights for policy 0, policy_version 607950 (0.0044) [2024-06-24 06:55:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9960734720. Throughput: 0: 42785.7. Samples: 9960917640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:03,399][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 06:55:03,803][15349] Signal inference workers to stop experience collection... (147600 times) [2024-06-24 06:55:03,805][15349] Signal inference workers to resume experience collection... (147600 times) [2024-06-24 06:55:03,825][15401] InferenceWorker_p0-w0: stopping experience collection (147600 times) [2024-06-24 06:55:03,855][15401] InferenceWorker_p0-w0: resuming experience collection (147600 times) [2024-06-24 06:55:04,989][15401] Updated weights for policy 0, policy_version 607960 (0.0040) [2024-06-24 06:55:08,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9960947712. Throughput: 0: 42801.8. Samples: 9961034720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 06:55:08,624][15401] Updated weights for policy 0, policy_version 607970 (0.0032) [2024-06-24 06:55:12,457][15401] Updated weights for policy 0, policy_version 607980 (0.0046) [2024-06-24 06:55:13,392][15132] Fps is (10 sec: 45864.7, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 9961193472. Throughput: 0: 43013.6. Samples: 9961300220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:13,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 06:55:16,627][15401] Updated weights for policy 0, policy_version 607990 (0.0023) [2024-06-24 06:55:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42872.2, 300 sec: 42820.6). Total num frames: 9961373696. Throughput: 0: 42775.8. Samples: 9961556220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 06:55:19,951][15401] Updated weights for policy 0, policy_version 608000 (0.0042) [2024-06-24 06:55:23,390][15132] Fps is (10 sec: 40968.9, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 9961603072. Throughput: 0: 42784.9. Samples: 9961678560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 06:55:23,926][15401] Updated weights for policy 0, policy_version 608010 (0.0036) [2024-06-24 06:55:27,598][15401] Updated weights for policy 0, policy_version 608020 (0.0036) [2024-06-24 06:55:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 9961832448. Throughput: 0: 42861.0. Samples: 9961942000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 06:55:31,574][15401] Updated weights for policy 0, policy_version 608030 (0.0029) [2024-06-24 06:55:33,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9962029056. Throughput: 0: 42898.7. Samples: 9962202000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 06:55:35,314][15401] Updated weights for policy 0, policy_version 608040 (0.0025) [2024-06-24 06:55:38,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42597.8, 300 sec: 42820.6). Total num frames: 9962225664. Throughput: 0: 43019.9. Samples: 9962328340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:38,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 06:55:39,058][15401] Updated weights for policy 0, policy_version 608050 (0.0028) [2024-06-24 06:55:42,839][15401] Updated weights for policy 0, policy_version 608060 (0.0026) [2024-06-24 06:55:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 9962471424. Throughput: 0: 42858.2. Samples: 9962586860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 06:55:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608062_9962487808.pth... [2024-06-24 06:55:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607433_9952182272.pth [2024-06-24 06:55:46,750][15401] Updated weights for policy 0, policy_version 608070 (0.0028) [2024-06-24 06:55:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9962668032. Throughput: 0: 42919.3. Samples: 9962849000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 06:55:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 06:55:50,403][15401] Updated weights for policy 0, policy_version 608080 (0.0039) [2024-06-24 06:55:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 9962881024. Throughput: 0: 43094.4. Samples: 9962973960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:55:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 06:55:54,194][15401] Updated weights for policy 0, policy_version 608090 (0.0033) [2024-06-24 06:55:58,190][15401] Updated weights for policy 0, policy_version 608100 (0.0030) [2024-06-24 06:55:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42932.0). Total num frames: 9963110400. Throughput: 0: 42992.5. Samples: 9963234780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:55:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 06:56:01,717][15401] Updated weights for policy 0, policy_version 608110 (0.0034) [2024-06-24 06:56:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 9963323392. Throughput: 0: 42968.7. Samples: 9963489820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:03,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 06:56:05,687][15401] Updated weights for policy 0, policy_version 608120 (0.0036) [2024-06-24 06:56:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.8). Total num frames: 9963520000. Throughput: 0: 42951.8. Samples: 9963611380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 06:56:09,758][15401] Updated weights for policy 0, policy_version 608130 (0.0023) [2024-06-24 06:56:13,287][15401] Updated weights for policy 0, policy_version 608140 (0.0030) [2024-06-24 06:56:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42987.5). Total num frames: 9963765760. Throughput: 0: 42926.6. Samples: 9963873700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 06:56:17,325][15401] Updated weights for policy 0, policy_version 608150 (0.0032) [2024-06-24 06:56:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 9963962368. Throughput: 0: 42641.3. Samples: 9964120860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 06:56:21,423][15401] Updated weights for policy 0, policy_version 608160 (0.0040) [2024-06-24 06:56:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.9, 300 sec: 42875.7). Total num frames: 9964175360. Throughput: 0: 42769.3. Samples: 9964252960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:23,392][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 06:56:24,700][15349] Signal inference workers to stop experience collection... (147650 times) [2024-06-24 06:56:24,756][15401] InferenceWorker_p0-w0: stopping experience collection (147650 times) [2024-06-24 06:56:24,761][15349] Signal inference workers to resume experience collection... (147650 times) [2024-06-24 06:56:24,767][15401] InferenceWorker_p0-w0: resuming experience collection (147650 times) [2024-06-24 06:56:24,901][15401] Updated weights for policy 0, policy_version 608170 (0.0040) [2024-06-24 06:56:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 9964371968. Throughput: 0: 42606.7. Samples: 9964504160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 06:56:29,078][15401] Updated weights for policy 0, policy_version 608180 (0.0036) [2024-06-24 06:56:32,741][15401] Updated weights for policy 0, policy_version 608190 (0.0022) [2024-06-24 06:56:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9964601344. Throughput: 0: 42423.9. Samples: 9964758080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 06:56:36,808][15401] Updated weights for policy 0, policy_version 608200 (0.0042) [2024-06-24 06:56:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 9964814336. Throughput: 0: 42615.5. Samples: 9964891660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 06:56:40,289][15401] Updated weights for policy 0, policy_version 608210 (0.0025) [2024-06-24 06:56:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 9965027328. Throughput: 0: 42587.0. Samples: 9965151200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 06:56:44,573][15401] Updated weights for policy 0, policy_version 608220 (0.0035) [2024-06-24 06:56:47,841][15401] Updated weights for policy 0, policy_version 608230 (0.0045) [2024-06-24 06:56:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9965240320. Throughput: 0: 42533.1. Samples: 9965403800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:48,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 06:56:52,299][15401] Updated weights for policy 0, policy_version 608240 (0.0038) [2024-06-24 06:56:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9965469696. Throughput: 0: 42880.9. Samples: 9965541020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 06:56:55,367][15401] Updated weights for policy 0, policy_version 608250 (0.0044) [2024-06-24 06:56:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 9965666304. Throughput: 0: 42736.3. Samples: 9965796840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:56:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 06:56:59,775][15401] Updated weights for policy 0, policy_version 608260 (0.0040) [2024-06-24 06:57:03,262][15401] Updated weights for policy 0, policy_version 608270 (0.0026) [2024-06-24 06:57:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 9965912064. Throughput: 0: 42895.6. Samples: 9966051160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:57:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 06:57:07,470][15401] Updated weights for policy 0, policy_version 608280 (0.0044) [2024-06-24 06:57:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9966108672. Throughput: 0: 42907.6. Samples: 9966183700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:57:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 06:57:10,594][15401] Updated weights for policy 0, policy_version 608290 (0.0041) [2024-06-24 06:57:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 9966321664. Throughput: 0: 43033.8. Samples: 9966440680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:57:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 06:57:15,069][15401] Updated weights for policy 0, policy_version 608300 (0.0044) [2024-06-24 06:57:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 9966518272. Throughput: 0: 43028.3. Samples: 9966694360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:57:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 06:57:18,702][15401] Updated weights for policy 0, policy_version 608310 (0.0037) [2024-06-24 06:57:22,794][15401] Updated weights for policy 0, policy_version 608320 (0.0024) [2024-06-24 06:57:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 9966747648. Throughput: 0: 42910.7. Samples: 9966822640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 06:57:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 06:57:26,269][15401] Updated weights for policy 0, policy_version 608330 (0.0041) [2024-06-24 06:57:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 9966960640. Throughput: 0: 42802.8. Samples: 9967077320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 06:57:30,376][15401] Updated weights for policy 0, policy_version 608340 (0.0044) [2024-06-24 06:57:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 9967157248. Throughput: 0: 43008.8. Samples: 9967339200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 06:57:33,560][15349] Signal inference workers to stop experience collection... (147700 times) [2024-06-24 06:57:33,561][15349] Signal inference workers to resume experience collection... (147700 times) [2024-06-24 06:57:33,582][15401] InferenceWorker_p0-w0: stopping experience collection (147700 times) [2024-06-24 06:57:33,582][15401] InferenceWorker_p0-w0: resuming experience collection (147700 times) [2024-06-24 06:57:33,860][15401] Updated weights for policy 0, policy_version 608350 (0.0029) [2024-06-24 06:57:37,975][15401] Updated weights for policy 0, policy_version 608360 (0.0043) [2024-06-24 06:57:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9967386624. Throughput: 0: 42774.2. Samples: 9967465860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 06:57:41,741][15401] Updated weights for policy 0, policy_version 608370 (0.0028) [2024-06-24 06:57:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 9967599616. Throughput: 0: 42859.7. Samples: 9967725520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 06:57:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608375_9967616000.pth... [2024-06-24 06:57:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000607748_9957343232.pth [2024-06-24 06:57:45,844][15401] Updated weights for policy 0, policy_version 608380 (0.0037) [2024-06-24 06:57:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 9967812608. Throughput: 0: 42784.3. Samples: 9967976460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 06:57:49,359][15401] Updated weights for policy 0, policy_version 608390 (0.0031) [2024-06-24 06:57:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 9968009216. Throughput: 0: 42591.5. Samples: 9968100320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 06:57:53,478][15401] Updated weights for policy 0, policy_version 608400 (0.0035) [2024-06-24 06:57:57,282][15401] Updated weights for policy 0, policy_version 608410 (0.0032) [2024-06-24 06:57:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 9968238592. Throughput: 0: 42761.1. Samples: 9968364940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:57:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 06:58:01,228][15401] Updated weights for policy 0, policy_version 608420 (0.0026) [2024-06-24 06:58:03,390][15132] Fps is (10 sec: 44232.9, 60 sec: 42324.7, 300 sec: 42876.0). Total num frames: 9968451584. Throughput: 0: 42613.9. Samples: 9968612020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:03,391][15132] Avg episode reward: [(0, '0.866')] [2024-06-24 06:58:05,003][15401] Updated weights for policy 0, policy_version 608430 (0.0028) [2024-06-24 06:58:08,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 9968648192. Throughput: 0: 42534.3. Samples: 9968736680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 06:58:08,980][15401] Updated weights for policy 0, policy_version 608440 (0.0039) [2024-06-24 06:58:12,555][15401] Updated weights for policy 0, policy_version 608450 (0.0043) [2024-06-24 06:58:13,392][15132] Fps is (10 sec: 42591.7, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 9968877568. Throughput: 0: 42722.0. Samples: 9968999920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:13,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 06:58:16,455][15401] Updated weights for policy 0, policy_version 608460 (0.0024) [2024-06-24 06:58:18,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 9969106944. Throughput: 0: 42489.3. Samples: 9969251320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:18,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 06:58:20,141][15401] Updated weights for policy 0, policy_version 608470 (0.0034) [2024-06-24 06:58:23,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 9969303552. Throughput: 0: 42531.5. Samples: 9969379780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 06:58:23,839][15401] Updated weights for policy 0, policy_version 608480 (0.0030) [2024-06-24 06:58:27,760][15401] Updated weights for policy 0, policy_version 608490 (0.0056) [2024-06-24 06:58:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 9969516544. Throughput: 0: 42566.6. Samples: 9969641020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 06:58:31,406][15401] Updated weights for policy 0, policy_version 608500 (0.0028) [2024-06-24 06:58:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 9969729536. Throughput: 0: 42692.2. Samples: 9969897600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 06:58:35,377][15401] Updated weights for policy 0, policy_version 608510 (0.0031) [2024-06-24 06:58:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9969958912. Throughput: 0: 42694.3. Samples: 9970021560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 06:58:39,036][15401] Updated weights for policy 0, policy_version 608520 (0.0034) [2024-06-24 06:58:42,959][15401] Updated weights for policy 0, policy_version 608530 (0.0035) [2024-06-24 06:58:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 9970171904. Throughput: 0: 42755.3. Samples: 9970288920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:43,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 06:58:46,673][15401] Updated weights for policy 0, policy_version 608540 (0.0023) [2024-06-24 06:58:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 9970352128. Throughput: 0: 42867.5. Samples: 9970541020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 06:58:50,475][15401] Updated weights for policy 0, policy_version 608550 (0.0033) [2024-06-24 06:58:53,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 9970581504. Throughput: 0: 42798.1. Samples: 9970662700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:53,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 06:58:54,009][15401] Updated weights for policy 0, policy_version 608560 (0.0034) [2024-06-24 06:58:57,642][15349] Signal inference workers to stop experience collection... (147750 times) [2024-06-24 06:58:57,643][15349] Signal inference workers to resume experience collection... (147750 times) [2024-06-24 06:58:57,687][15401] InferenceWorker_p0-w0: stopping experience collection (147750 times) [2024-06-24 06:58:57,688][15401] InferenceWorker_p0-w0: resuming experience collection (147750 times) [2024-06-24 06:58:58,127][15401] Updated weights for policy 0, policy_version 608570 (0.0039) [2024-06-24 06:58:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 9970810880. Throughput: 0: 42951.7. Samples: 9970932640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 06:58:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 06:59:02,320][15401] Updated weights for policy 0, policy_version 608580 (0.0038) [2024-06-24 06:59:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.9, 300 sec: 42709.5). Total num frames: 9970991104. Throughput: 0: 42953.3. Samples: 9971184120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 06:59:05,783][15401] Updated weights for policy 0, policy_version 608590 (0.0024) [2024-06-24 06:59:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 9971236864. Throughput: 0: 42913.7. Samples: 9971310900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:08,391][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 06:59:09,934][15401] Updated weights for policy 0, policy_version 608600 (0.0033) [2024-06-24 06:59:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42820.7). Total num frames: 9971433472. Throughput: 0: 42915.9. Samples: 9971572240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 06:59:13,757][15401] Updated weights for policy 0, policy_version 608610 (0.0034) [2024-06-24 06:59:17,386][15401] Updated weights for policy 0, policy_version 608620 (0.0046) [2024-06-24 06:59:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42876.5). Total num frames: 9971662848. Throughput: 0: 42919.1. Samples: 9971828960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 06:59:21,287][15401] Updated weights for policy 0, policy_version 608630 (0.0029) [2024-06-24 06:59:23,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 9971892224. Throughput: 0: 43011.9. Samples: 9971957200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:23,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 06:59:25,153][15401] Updated weights for policy 0, policy_version 608640 (0.0023) [2024-06-24 06:59:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 9972088832. Throughput: 0: 42837.4. Samples: 9972216600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 06:59:28,923][15401] Updated weights for policy 0, policy_version 608650 (0.0030) [2024-06-24 06:59:32,778][15401] Updated weights for policy 0, policy_version 608660 (0.0034) [2024-06-24 06:59:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.7, 300 sec: 42820.4). Total num frames: 9972301824. Throughput: 0: 42925.7. Samples: 9972472780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:33,392][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 06:59:36,634][15401] Updated weights for policy 0, policy_version 608670 (0.0030) [2024-06-24 06:59:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9972531200. Throughput: 0: 43080.6. Samples: 9972601220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:38,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 06:59:40,665][15401] Updated weights for policy 0, policy_version 608680 (0.0023) [2024-06-24 06:59:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9972727808. Throughput: 0: 42865.8. Samples: 9972861600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:43,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-24 06:59:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608687_9972727808.pth... [2024-06-24 06:59:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608062_9962487808.pth [2024-06-24 06:59:44,090][15401] Updated weights for policy 0, policy_version 608690 (0.0036) [2024-06-24 06:59:48,279][15401] Updated weights for policy 0, policy_version 608700 (0.0032) [2024-06-24 06:59:48,396][15132] Fps is (10 sec: 40933.9, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 9972940800. Throughput: 0: 42953.6. Samples: 9973117300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:48,396][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 06:59:51,659][15401] Updated weights for policy 0, policy_version 608710 (0.0032) [2024-06-24 06:59:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 9973170176. Throughput: 0: 42964.9. Samples: 9973244320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 06:59:55,849][15401] Updated weights for policy 0, policy_version 608720 (0.0043) [2024-06-24 06:59:58,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 9973366784. Throughput: 0: 42775.1. Samples: 9973497120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 06:59:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 06:59:59,221][15401] Updated weights for policy 0, policy_version 608730 (0.0025) [2024-06-24 07:00:03,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.7, 300 sec: 42765.1). Total num frames: 9973563392. Throughput: 0: 42658.8. Samples: 9973748600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 07:00:03,630][15401] Updated weights for policy 0, policy_version 608740 (0.0027) [2024-06-24 07:00:07,204][15401] Updated weights for policy 0, policy_version 608750 (0.0035) [2024-06-24 07:00:08,392][15132] Fps is (10 sec: 44228.1, 60 sec: 42870.1, 300 sec: 42765.1). Total num frames: 9973809152. Throughput: 0: 42724.8. Samples: 9973879800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:08,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 07:00:11,518][15401] Updated weights for policy 0, policy_version 608760 (0.0036) [2024-06-24 07:00:12,526][15349] Signal inference workers to stop experience collection... (147800 times) [2024-06-24 07:00:12,528][15349] Signal inference workers to resume experience collection... (147800 times) [2024-06-24 07:00:12,548][15401] InferenceWorker_p0-w0: stopping experience collection (147800 times) [2024-06-24 07:00:12,548][15401] InferenceWorker_p0-w0: resuming experience collection (147800 times) [2024-06-24 07:00:13,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 9974005760. Throughput: 0: 42791.5. Samples: 9974142220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 07:00:14,653][15401] Updated weights for policy 0, policy_version 608770 (0.0032) [2024-06-24 07:00:18,389][15132] Fps is (10 sec: 40968.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 9974218752. Throughput: 0: 42683.7. Samples: 9974393440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 07:00:18,952][15401] Updated weights for policy 0, policy_version 608780 (0.0038) [2024-06-24 07:00:22,350][15401] Updated weights for policy 0, policy_version 608790 (0.0026) [2024-06-24 07:00:23,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42871.4, 300 sec: 42820.2). Total num frames: 9974464512. Throughput: 0: 42679.0. Samples: 9974521880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:23,393][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 07:00:26,515][15401] Updated weights for policy 0, policy_version 608800 (0.0036) [2024-06-24 07:00:28,391][15132] Fps is (10 sec: 44230.1, 60 sec: 42870.4, 300 sec: 42820.3). Total num frames: 9974661120. Throughput: 0: 42698.2. Samples: 9974783080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:28,391][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 07:00:30,139][15401] Updated weights for policy 0, policy_version 608810 (0.0029) [2024-06-24 07:00:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 9974874112. Throughput: 0: 42778.4. Samples: 9975042160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 07:00:33,393][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 07:00:33,917][15401] Updated weights for policy 0, policy_version 608820 (0.0031) [2024-06-24 07:00:37,709][15401] Updated weights for policy 0, policy_version 608830 (0.0022) [2024-06-24 07:00:38,392][15132] Fps is (10 sec: 44232.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 9975103488. Throughput: 0: 42734.2. Samples: 9975167460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:00:38,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 07:00:41,707][15401] Updated weights for policy 0, policy_version 608840 (0.0033) [2024-06-24 07:00:43,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9975283712. Throughput: 0: 42860.6. Samples: 9975425840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:00:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 07:00:45,234][15401] Updated weights for policy 0, policy_version 608850 (0.0030) [2024-06-24 07:00:48,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 9975513088. Throughput: 0: 42880.2. Samples: 9975678220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:00:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 07:00:49,317][15401] Updated weights for policy 0, policy_version 608860 (0.0040) [2024-06-24 07:00:52,986][15401] Updated weights for policy 0, policy_version 608870 (0.0035) [2024-06-24 07:00:53,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9975742464. Throughput: 0: 42797.0. Samples: 9975805580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:00:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 07:00:56,949][15401] Updated weights for policy 0, policy_version 608880 (0.0029) [2024-06-24 07:00:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 9975922688. Throughput: 0: 42585.0. Samples: 9976058540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:00:58,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 07:01:00,678][15401] Updated weights for policy 0, policy_version 608890 (0.0029) [2024-06-24 07:01:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 9976152064. Throughput: 0: 42754.6. Samples: 9976317400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 07:01:04,418][15401] Updated weights for policy 0, policy_version 608900 (0.0033) [2024-06-24 07:01:08,221][15401] Updated weights for policy 0, policy_version 608910 (0.0042) [2024-06-24 07:01:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 9976381440. Throughput: 0: 42854.0. Samples: 9976450200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 07:01:12,172][15401] Updated weights for policy 0, policy_version 608920 (0.0049) [2024-06-24 07:01:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9976578048. Throughput: 0: 42729.8. Samples: 9976705860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 07:01:15,979][15401] Updated weights for policy 0, policy_version 608930 (0.0032) [2024-06-24 07:01:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 9976807424. Throughput: 0: 42649.9. Samples: 9976961300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 07:01:19,819][15401] Updated weights for policy 0, policy_version 608940 (0.0020) [2024-06-24 07:01:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 9977004032. Throughput: 0: 42855.3. Samples: 9977095840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 07:01:23,763][15401] Updated weights for policy 0, policy_version 608950 (0.0038) [2024-06-24 07:01:27,338][15401] Updated weights for policy 0, policy_version 608960 (0.0035) [2024-06-24 07:01:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42870.7, 300 sec: 42820.2). Total num frames: 9977233408. Throughput: 0: 42738.9. Samples: 9977349200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:28,392][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 07:01:31,107][15349] Signal inference workers to stop experience collection... (147850 times) [2024-06-24 07:01:31,107][15349] Signal inference workers to resume experience collection... (147850 times) [2024-06-24 07:01:31,136][15401] InferenceWorker_p0-w0: stopping experience collection (147850 times) [2024-06-24 07:01:31,136][15401] InferenceWorker_p0-w0: resuming experience collection (147850 times) [2024-06-24 07:01:31,419][15401] Updated weights for policy 0, policy_version 608970 (0.0023) [2024-06-24 07:01:33,390][15132] Fps is (10 sec: 45873.5, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 9977462784. Throughput: 0: 42703.3. Samples: 9977599880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:33,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 07:01:34,916][15401] Updated weights for policy 0, policy_version 608980 (0.0029) [2024-06-24 07:01:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 9977626624. Throughput: 0: 42805.8. Samples: 9977731840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 07:01:39,122][15401] Updated weights for policy 0, policy_version 608990 (0.0041) [2024-06-24 07:01:42,956][15401] Updated weights for policy 0, policy_version 609000 (0.0041) [2024-06-24 07:01:43,392][15132] Fps is (10 sec: 40951.2, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 9977872384. Throughput: 0: 42914.0. Samples: 9977989780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:43,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:01:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609001_9977872384.pth... [2024-06-24 07:01:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608375_9967616000.pth [2024-06-24 07:01:46,731][15401] Updated weights for policy 0, policy_version 609010 (0.0026) [2024-06-24 07:01:48,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 9978101760. Throughput: 0: 42846.7. Samples: 9978245500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 07:01:50,461][15401] Updated weights for policy 0, policy_version 609020 (0.0044) [2024-06-24 07:01:53,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 9978281984. Throughput: 0: 42846.6. Samples: 9978378300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 07:01:54,268][15401] Updated weights for policy 0, policy_version 609030 (0.0037) [2024-06-24 07:01:57,997][15401] Updated weights for policy 0, policy_version 609040 (0.0031) [2024-06-24 07:01:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9978511360. Throughput: 0: 42861.8. Samples: 9978634640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:01:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 07:02:01,934][15401] Updated weights for policy 0, policy_version 609050 (0.0033) [2024-06-24 07:02:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 9978740736. Throughput: 0: 42735.1. Samples: 9978884380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:02:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 07:02:06,006][15401] Updated weights for policy 0, policy_version 609060 (0.0032) [2024-06-24 07:02:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9978904576. Throughput: 0: 42638.7. Samples: 9979014580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 07:02:09,721][15401] Updated weights for policy 0, policy_version 609070 (0.0035) [2024-06-24 07:02:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9979150336. Throughput: 0: 42612.2. Samples: 9979266640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 07:02:13,814][15401] Updated weights for policy 0, policy_version 609080 (0.0032) [2024-06-24 07:02:17,290][15401] Updated weights for policy 0, policy_version 609090 (0.0029) [2024-06-24 07:02:18,392][15132] Fps is (10 sec: 47503.3, 60 sec: 42870.0, 300 sec: 42820.3). Total num frames: 9979379712. Throughput: 0: 42618.3. Samples: 9979517780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:18,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 07:02:21,406][15401] Updated weights for policy 0, policy_version 609100 (0.0041) [2024-06-24 07:02:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 9979543552. Throughput: 0: 42698.7. Samples: 9979653280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 07:02:24,803][15401] Updated weights for policy 0, policy_version 609110 (0.0030) [2024-06-24 07:02:28,389][15132] Fps is (10 sec: 39330.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 9979772928. Throughput: 0: 42610.0. Samples: 9979907120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 07:02:29,042][15401] Updated weights for policy 0, policy_version 609120 (0.0027) [2024-06-24 07:02:32,547][15401] Updated weights for policy 0, policy_version 609130 (0.0040) [2024-06-24 07:02:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.7, 300 sec: 42820.6). Total num frames: 9980018688. Throughput: 0: 42742.7. Samples: 9980168920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:33,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 07:02:36,750][15401] Updated weights for policy 0, policy_version 609140 (0.0034) [2024-06-24 07:02:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9980198912. Throughput: 0: 42601.4. Samples: 9980295360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:38,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 07:02:40,468][15401] Updated weights for policy 0, policy_version 609150 (0.0028) [2024-06-24 07:02:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 9980428288. Throughput: 0: 42468.0. Samples: 9980545700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 07:02:44,330][15401] Updated weights for policy 0, policy_version 609160 (0.0033) [2024-06-24 07:02:46,679][15349] Signal inference workers to stop experience collection... (147900 times) [2024-06-24 07:02:46,679][15349] Signal inference workers to resume experience collection... (147900 times) [2024-06-24 07:02:46,719][15401] InferenceWorker_p0-w0: stopping experience collection (147900 times) [2024-06-24 07:02:46,719][15401] InferenceWorker_p0-w0: resuming experience collection (147900 times) [2024-06-24 07:02:48,049][15401] Updated weights for policy 0, policy_version 609170 (0.0027) [2024-06-24 07:02:48,392][15132] Fps is (10 sec: 47501.7, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 9980674048. Throughput: 0: 42678.1. Samples: 9980805000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:48,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 07:02:51,959][15401] Updated weights for policy 0, policy_version 609180 (0.0028) [2024-06-24 07:02:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9980837888. Throughput: 0: 42679.8. Samples: 9980935180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:53,396][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 07:02:55,663][15401] Updated weights for policy 0, policy_version 609190 (0.0035) [2024-06-24 07:02:58,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 9981067264. Throughput: 0: 42762.2. Samples: 9981190940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:02:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 07:02:59,485][15401] Updated weights for policy 0, policy_version 609200 (0.0041) [2024-06-24 07:03:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 9981280256. Throughput: 0: 43029.5. Samples: 9981454020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 07:03:03,544][15401] Updated weights for policy 0, policy_version 609210 (0.0021) [2024-06-24 07:03:06,883][15401] Updated weights for policy 0, policy_version 609220 (0.0039) [2024-06-24 07:03:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 9981493248. Throughput: 0: 42908.8. Samples: 9981584180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:08,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 07:03:10,960][15401] Updated weights for policy 0, policy_version 609230 (0.0025) [2024-06-24 07:03:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.6, 300 sec: 42765.0). Total num frames: 9981722624. Throughput: 0: 42907.7. Samples: 9981838080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:13,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 07:03:14,485][15401] Updated weights for policy 0, policy_version 609240 (0.0038) [2024-06-24 07:03:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.8, 300 sec: 42765.0). Total num frames: 9981919232. Throughput: 0: 42910.2. Samples: 9982099880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 07:03:18,667][15401] Updated weights for policy 0, policy_version 609250 (0.0041) [2024-06-24 07:03:22,754][15401] Updated weights for policy 0, policy_version 609260 (0.0024) [2024-06-24 07:03:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 9982132224. Throughput: 0: 42859.9. Samples: 9982224060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 07:03:26,151][15401] Updated weights for policy 0, policy_version 609270 (0.0034) [2024-06-24 07:03:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 9982377984. Throughput: 0: 42842.7. Samples: 9982473620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 07:03:30,307][15401] Updated weights for policy 0, policy_version 609280 (0.0036) [2024-06-24 07:03:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 9982558208. Throughput: 0: 43052.1. Samples: 9982742240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 07:03:33,816][15401] Updated weights for policy 0, policy_version 609290 (0.0035) [2024-06-24 07:03:38,250][15401] Updated weights for policy 0, policy_version 609300 (0.0028) [2024-06-24 07:03:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 9982787584. Throughput: 0: 42828.1. Samples: 9982862540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:03:38,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 07:03:41,387][15401] Updated weights for policy 0, policy_version 609310 (0.0036) [2024-06-24 07:03:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 9983016960. Throughput: 0: 42858.1. Samples: 9983119560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:03:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 07:03:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609316_9983033344.pth... [2024-06-24 07:03:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000608687_9972727808.pth [2024-06-24 07:03:45,838][15401] Updated weights for policy 0, policy_version 609320 (0.0029) [2024-06-24 07:03:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42053.9, 300 sec: 42765.4). Total num frames: 9983197184. Throughput: 0: 42833.3. Samples: 9983381520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:03:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 07:03:49,083][15401] Updated weights for policy 0, policy_version 609330 (0.0029) [2024-06-24 07:03:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 9983410176. Throughput: 0: 42642.6. Samples: 9983503200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:03:53,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 07:03:53,722][15401] Updated weights for policy 0, policy_version 609340 (0.0046) [2024-06-24 07:03:56,244][15349] Signal inference workers to stop experience collection... (147950 times) [2024-06-24 07:03:56,253][15349] Signal inference workers to resume experience collection... (147950 times) [2024-06-24 07:03:56,269][15401] InferenceWorker_p0-w0: stopping experience collection (147950 times) [2024-06-24 07:03:56,269][15401] InferenceWorker_p0-w0: resuming experience collection (147950 times) [2024-06-24 07:03:56,570][15401] Updated weights for policy 0, policy_version 609350 (0.0035) [2024-06-24 07:03:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 9983672320. Throughput: 0: 42730.4. Samples: 9983760840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:03:58,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 07:04:01,119][15401] Updated weights for policy 0, policy_version 609360 (0.0048) [2024-06-24 07:04:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 9983852544. Throughput: 0: 42778.1. Samples: 9984024900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 07:04:04,035][15401] Updated weights for policy 0, policy_version 609370 (0.0035) [2024-06-24 07:04:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 9984065536. Throughput: 0: 42760.1. Samples: 9984148260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 07:04:09,140][15401] Updated weights for policy 0, policy_version 609380 (0.0034) [2024-06-24 07:04:11,993][15401] Updated weights for policy 0, policy_version 609390 (0.0043) [2024-06-24 07:04:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 9984327680. Throughput: 0: 42890.4. Samples: 9984403700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 07:04:16,715][15401] Updated weights for policy 0, policy_version 609400 (0.0034) [2024-06-24 07:04:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 9984491520. Throughput: 0: 42766.3. Samples: 9984666720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:18,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 07:04:19,579][15401] Updated weights for policy 0, policy_version 609410 (0.0034) [2024-06-24 07:04:23,390][15132] Fps is (10 sec: 36045.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9984688128. Throughput: 0: 42783.1. Samples: 9984787680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 07:04:24,058][15401] Updated weights for policy 0, policy_version 609420 (0.0031) [2024-06-24 07:04:27,162][15401] Updated weights for policy 0, policy_version 609430 (0.0031) [2024-06-24 07:04:28,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 9984950272. Throughput: 0: 42919.5. Samples: 9985051040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:28,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 07:04:31,527][15401] Updated weights for policy 0, policy_version 609440 (0.0038) [2024-06-24 07:04:33,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 9985130496. Throughput: 0: 43023.2. Samples: 9985317840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:33,397][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:04:34,902][15401] Updated weights for policy 0, policy_version 609450 (0.0039) [2024-06-24 07:04:38,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 9985327104. Throughput: 0: 42984.6. Samples: 9985437400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 07:04:39,165][15401] Updated weights for policy 0, policy_version 609460 (0.0043) [2024-06-24 07:04:42,590][15401] Updated weights for policy 0, policy_version 609470 (0.0025) [2024-06-24 07:04:43,392][15132] Fps is (10 sec: 47532.4, 60 sec: 43142.8, 300 sec: 42932.2). Total num frames: 9985605632. Throughput: 0: 43138.5. Samples: 9985702180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:43,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 07:04:46,987][15401] Updated weights for policy 0, policy_version 609480 (0.0041) [2024-06-24 07:04:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9985769472. Throughput: 0: 42982.0. Samples: 9985959080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 07:04:50,224][15401] Updated weights for policy 0, policy_version 609490 (0.0032) [2024-06-24 07:04:53,389][15132] Fps is (10 sec: 37692.9, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 9985982464. Throughput: 0: 42901.8. Samples: 9986078840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:53,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 07:04:54,503][15401] Updated weights for policy 0, policy_version 609500 (0.0033) [2024-06-24 07:04:57,844][15401] Updated weights for policy 0, policy_version 609510 (0.0041) [2024-06-24 07:04:58,392][15132] Fps is (10 sec: 47501.8, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 9986244608. Throughput: 0: 43023.6. Samples: 9986339860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:04:58,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 07:05:02,502][15401] Updated weights for policy 0, policy_version 609520 (0.0027) [2024-06-24 07:05:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 9986408448. Throughput: 0: 43003.5. Samples: 9986601880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:05:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 07:05:05,393][15401] Updated weights for policy 0, policy_version 609530 (0.0032) [2024-06-24 07:05:08,392][15132] Fps is (10 sec: 39321.3, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 9986637824. Throughput: 0: 42931.9. Samples: 9986719720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:05:08,401][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 07:05:10,157][15401] Updated weights for policy 0, policy_version 609540 (0.0045) [2024-06-24 07:05:11,597][15349] Signal inference workers to stop experience collection... (148000 times) [2024-06-24 07:05:11,597][15349] Signal inference workers to resume experience collection... (148000 times) [2024-06-24 07:05:11,634][15401] InferenceWorker_p0-w0: stopping experience collection (148000 times) [2024-06-24 07:05:11,634][15401] InferenceWorker_p0-w0: resuming experience collection (148000 times) [2024-06-24 07:05:13,132][15401] Updated weights for policy 0, policy_version 609550 (0.0043) [2024-06-24 07:05:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 9986867200. Throughput: 0: 42852.5. Samples: 9986979300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:05:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 07:05:17,691][15401] Updated weights for policy 0, policy_version 609560 (0.0043) [2024-06-24 07:05:18,389][15132] Fps is (10 sec: 40970.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 9987047424. Throughput: 0: 42688.4. Samples: 9987238540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 07:05:20,623][15401] Updated weights for policy 0, policy_version 609570 (0.0034) [2024-06-24 07:05:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42820.8). Total num frames: 9987293184. Throughput: 0: 42757.3. Samples: 9987361480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 07:05:25,514][15401] Updated weights for policy 0, policy_version 609580 (0.0039) [2024-06-24 07:05:28,321][15401] Updated weights for policy 0, policy_version 609590 (0.0039) [2024-06-24 07:05:28,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42873.1, 300 sec: 42876.4). Total num frames: 9987522560. Throughput: 0: 42579.1. Samples: 9987618140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 07:05:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42329.9, 300 sec: 42598.7). Total num frames: 9987670016. Throughput: 0: 42754.6. Samples: 9987883040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 07:05:33,432][15401] Updated weights for policy 0, policy_version 609600 (0.0039) [2024-06-24 07:05:36,245][15401] Updated weights for policy 0, policy_version 609610 (0.0028) [2024-06-24 07:05:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 9987932160. Throughput: 0: 42656.7. Samples: 9987998400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 07:05:41,191][15401] Updated weights for policy 0, policy_version 609620 (0.0050) [2024-06-24 07:05:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42054.1, 300 sec: 42765.0). Total num frames: 9988128768. Throughput: 0: 42559.7. Samples: 9988254940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:43,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 07:05:43,489][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609628_9988145152.pth... [2024-06-24 07:05:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609001_9977872384.pth [2024-06-24 07:05:43,974][15401] Updated weights for policy 0, policy_version 609630 (0.0032) [2024-06-24 07:05:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 9988308992. Throughput: 0: 42464.8. Samples: 9988512800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 07:05:48,851][15401] Updated weights for policy 0, policy_version 609640 (0.0035) [2024-06-24 07:05:51,647][15401] Updated weights for policy 0, policy_version 609650 (0.0030) [2024-06-24 07:05:53,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 9988571136. Throughput: 0: 42493.9. Samples: 9988631840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 07:05:56,695][15401] Updated weights for policy 0, policy_version 609660 (0.0044) [2024-06-24 07:05:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 41780.8, 300 sec: 42709.5). Total num frames: 9988751360. Throughput: 0: 42486.6. Samples: 9988891200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:05:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 07:05:59,506][15401] Updated weights for policy 0, policy_version 609670 (0.0039) [2024-06-24 07:06:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9988947968. Throughput: 0: 42351.8. Samples: 9989144380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 07:06:04,303][15401] Updated weights for policy 0, policy_version 609680 (0.0036) [2024-06-24 07:06:07,070][15401] Updated weights for policy 0, policy_version 609690 (0.0035) [2024-06-24 07:06:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 9989210112. Throughput: 0: 42410.1. Samples: 9989269940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 07:06:11,751][15401] Updated weights for policy 0, policy_version 609700 (0.0037) [2024-06-24 07:06:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 9989406720. Throughput: 0: 42369.0. Samples: 9989524740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 07:06:14,828][15401] Updated weights for policy 0, policy_version 609710 (0.0031) [2024-06-24 07:06:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 9989603328. Throughput: 0: 42108.5. Samples: 9989777920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 07:06:19,375][15401] Updated weights for policy 0, policy_version 609720 (0.0036) [2024-06-24 07:06:22,680][15401] Updated weights for policy 0, policy_version 609730 (0.0045) [2024-06-24 07:06:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 9989832704. Throughput: 0: 42323.2. Samples: 9989902940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 07:06:27,139][15401] Updated weights for policy 0, policy_version 609740 (0.0033) [2024-06-24 07:06:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.3, 300 sec: 42542.9). Total num frames: 9990012928. Throughput: 0: 42218.3. Samples: 9990154760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 07:06:30,343][15401] Updated weights for policy 0, policy_version 609750 (0.0034) [2024-06-24 07:06:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 9990242304. Throughput: 0: 42105.7. Samples: 9990407560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 07:06:35,202][15401] Updated weights for policy 0, policy_version 609760 (0.0034) [2024-06-24 07:06:35,729][15349] Signal inference workers to stop experience collection... (148050 times) [2024-06-24 07:06:35,772][15401] InferenceWorker_p0-w0: stopping experience collection (148050 times) [2024-06-24 07:06:35,840][15349] Signal inference workers to resume experience collection... (148050 times) [2024-06-24 07:06:35,840][15401] InferenceWorker_p0-w0: resuming experience collection (148050 times) [2024-06-24 07:06:37,952][15401] Updated weights for policy 0, policy_version 609770 (0.0033) [2024-06-24 07:06:38,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 9990488064. Throughput: 0: 42270.2. Samples: 9990534000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 07:06:42,774][15401] Updated weights for policy 0, policy_version 609780 (0.0028) [2024-06-24 07:06:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 9990651904. Throughput: 0: 42318.4. Samples: 9990795520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 07:06:45,627][15401] Updated weights for policy 0, policy_version 609790 (0.0037) [2024-06-24 07:06:48,391][15132] Fps is (10 sec: 37678.7, 60 sec: 42597.5, 300 sec: 42653.8). Total num frames: 9990864896. Throughput: 0: 42214.9. Samples: 9991044100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 07:06:48,391][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 07:06:50,297][15401] Updated weights for policy 0, policy_version 609800 (0.0028) [2024-06-24 07:06:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 9991094272. Throughput: 0: 42275.8. Samples: 9991172340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:06:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 07:06:53,665][15401] Updated weights for policy 0, policy_version 609810 (0.0035) [2024-06-24 07:06:58,389][15132] Fps is (10 sec: 40965.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 9991274496. Throughput: 0: 42317.4. Samples: 9991429020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:06:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 07:06:58,411][15401] Updated weights for policy 0, policy_version 609820 (0.0028) [2024-06-24 07:07:01,255][15401] Updated weights for policy 0, policy_version 609830 (0.0032) [2024-06-24 07:07:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 9991503872. Throughput: 0: 42363.0. Samples: 9991684360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:03,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 07:07:05,994][15401] Updated weights for policy 0, policy_version 609840 (0.0030) [2024-06-24 07:07:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 9991733248. Throughput: 0: 42383.2. Samples: 9991810180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 07:07:08,967][15401] Updated weights for policy 0, policy_version 609850 (0.0025) [2024-06-24 07:07:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41779.2, 300 sec: 42487.6). Total num frames: 9991913472. Throughput: 0: 42427.4. Samples: 9992064000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 07:07:13,585][15401] Updated weights for policy 0, policy_version 609860 (0.0041) [2024-06-24 07:07:16,735][15401] Updated weights for policy 0, policy_version 609871 (0.0026) [2024-06-24 07:07:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 9992159232. Throughput: 0: 42327.1. Samples: 9992312280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 07:07:22,174][15401] Updated weights for policy 0, policy_version 609881 (0.0031) [2024-06-24 07:07:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9992355840. Throughput: 0: 42538.2. Samples: 9992448220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 07:07:24,434][15401] Updated weights for policy 0, policy_version 609891 (0.0033) [2024-06-24 07:07:28,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.5, 300 sec: 42487.0). Total num frames: 9992552448. Throughput: 0: 42276.8. Samples: 9992698080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:28,393][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 07:07:29,703][15401] Updated weights for policy 0, policy_version 609901 (0.0036) [2024-06-24 07:07:32,347][15401] Updated weights for policy 0, policy_version 609911 (0.0040) [2024-06-24 07:07:33,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 9992814592. Throughput: 0: 42165.6. Samples: 9992941600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:33,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 07:07:37,482][15401] Updated weights for policy 0, policy_version 609921 (0.0032) [2024-06-24 07:07:38,054][15349] Signal inference workers to stop experience collection... (148100 times) [2024-06-24 07:07:38,057][15349] Signal inference workers to resume experience collection... (148100 times) [2024-06-24 07:07:38,074][15401] InferenceWorker_p0-w0: stopping experience collection (148100 times) [2024-06-24 07:07:38,074][15401] InferenceWorker_p0-w0: resuming experience collection (148100 times) [2024-06-24 07:07:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 9993011200. Throughput: 0: 42482.9. Samples: 9993084080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 07:07:40,002][15401] Updated weights for policy 0, policy_version 609931 (0.0037) [2024-06-24 07:07:43,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 9993207808. Throughput: 0: 42542.1. Samples: 9993343420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 07:07:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609938_9993224192.pth... [2024-06-24 07:07:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609316_9983033344.pth [2024-06-24 07:07:44,996][15401] Updated weights for policy 0, policy_version 609941 (0.0032) [2024-06-24 07:07:47,718][15401] Updated weights for policy 0, policy_version 609951 (0.0030) [2024-06-24 07:07:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43418.5, 300 sec: 42820.6). Total num frames: 9993469952. Throughput: 0: 42180.5. Samples: 9993582380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 07:07:52,887][15401] Updated weights for policy 0, policy_version 609961 (0.0025) [2024-06-24 07:07:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9993633792. Throughput: 0: 42495.9. Samples: 9993722500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 07:07:55,386][15401] Updated weights for policy 0, policy_version 609971 (0.0034) [2024-06-24 07:07:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9993863168. Throughput: 0: 42587.1. Samples: 9993980420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:07:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 07:08:00,624][15401] Updated weights for policy 0, policy_version 609981 (0.0035) [2024-06-24 07:08:02,996][15401] Updated weights for policy 0, policy_version 609991 (0.0029) [2024-06-24 07:08:03,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 9994108928. Throughput: 0: 42689.3. Samples: 9994233300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:08:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 07:08:08,097][15401] Updated weights for policy 0, policy_version 610001 (0.0046) [2024-06-24 07:08:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 9994256384. Throughput: 0: 42691.6. Samples: 9994369340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:08:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 07:08:10,603][15401] Updated weights for policy 0, policy_version 610011 (0.0038) [2024-06-24 07:08:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 9994502144. Throughput: 0: 42856.1. Samples: 9994626500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:08:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 07:08:15,977][15401] Updated weights for policy 0, policy_version 610021 (0.0035) [2024-06-24 07:08:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 9994731520. Throughput: 0: 43001.9. Samples: 9994876580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:08:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 07:08:18,801][15401] Updated weights for policy 0, policy_version 610031 (0.0051) [2024-06-24 07:08:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 9994895360. Throughput: 0: 42738.7. Samples: 9995007320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 07:08:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 07:08:23,613][15401] Updated weights for policy 0, policy_version 610041 (0.0061) [2024-06-24 07:08:26,506][15401] Updated weights for policy 0, policy_version 610051 (0.0046) [2024-06-24 07:08:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 9995141120. Throughput: 0: 42488.9. Samples: 9995255420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 07:08:31,303][15401] Updated weights for policy 0, policy_version 610061 (0.0034) [2024-06-24 07:08:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42054.0, 300 sec: 42543.2). Total num frames: 9995337728. Throughput: 0: 42972.5. Samples: 9995516140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:33,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 07:08:34,163][15401] Updated weights for policy 0, policy_version 610071 (0.0037) [2024-06-24 07:08:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 9995534336. Throughput: 0: 42645.1. Samples: 9995641520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 07:08:38,856][15401] Updated weights for policy 0, policy_version 610081 (0.0034) [2024-06-24 07:08:38,983][15349] Signal inference workers to stop experience collection... (148150 times) [2024-06-24 07:08:39,031][15401] InferenceWorker_p0-w0: stopping experience collection (148150 times) [2024-06-24 07:08:39,040][15349] Signal inference workers to resume experience collection... (148150 times) [2024-06-24 07:08:39,050][15401] InferenceWorker_p0-w0: resuming experience collection (148150 times) [2024-06-24 07:08:41,763][15401] Updated weights for policy 0, policy_version 610091 (0.0042) [2024-06-24 07:08:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 9995796480. Throughput: 0: 42589.7. Samples: 9995896960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 07:08:46,350][15401] Updated weights for policy 0, policy_version 610101 (0.0033) [2024-06-24 07:08:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 9995993088. Throughput: 0: 42708.0. Samples: 9996155160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 07:08:49,607][15401] Updated weights for policy 0, policy_version 610111 (0.0032) [2024-06-24 07:08:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 9996173312. Throughput: 0: 42460.1. Samples: 9996280040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:53,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 07:08:54,289][15401] Updated weights for policy 0, policy_version 610121 (0.0039) [2024-06-24 07:08:57,167][15401] Updated weights for policy 0, policy_version 610131 (0.0043) [2024-06-24 07:08:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 9996435456. Throughput: 0: 42529.8. Samples: 9996540340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:08:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 07:09:01,910][15401] Updated weights for policy 0, policy_version 610141 (0.0032) [2024-06-24 07:09:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 9996615680. Throughput: 0: 42718.7. Samples: 9996798920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 07:09:04,706][15401] Updated weights for policy 0, policy_version 610151 (0.0036) [2024-06-24 07:09:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 9996828672. Throughput: 0: 42571.6. Samples: 9996923040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:08,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-24 07:09:09,496][15401] Updated weights for policy 0, policy_version 610161 (0.0038) [2024-06-24 07:09:12,306][15401] Updated weights for policy 0, policy_version 610171 (0.0038) [2024-06-24 07:09:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 9997074432. Throughput: 0: 42728.4. Samples: 9997178200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 07:09:17,061][15401] Updated weights for policy 0, policy_version 610181 (0.0029) [2024-06-24 07:09:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 9997271040. Throughput: 0: 42783.0. Samples: 9997441480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:18,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 07:09:20,031][15401] Updated weights for policy 0, policy_version 610191 (0.0032) [2024-06-24 07:09:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42432.1). Total num frames: 9997467648. Throughput: 0: 42628.3. Samples: 9997559800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:23,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-24 07:09:24,778][15401] Updated weights for policy 0, policy_version 610201 (0.0025) [2024-06-24 07:09:28,261][15401] Updated weights for policy 0, policy_version 610211 (0.0043) [2024-06-24 07:09:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 9997697024. Throughput: 0: 42638.7. Samples: 9997815700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 07:09:32,475][15401] Updated weights for policy 0, policy_version 610221 (0.0029) [2024-06-24 07:09:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 9997893632. Throughput: 0: 42701.3. Samples: 9998076720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 07:09:35,879][15401] Updated weights for policy 0, policy_version 610231 (0.0037) [2024-06-24 07:09:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42432.1). Total num frames: 9998123008. Throughput: 0: 42729.3. Samples: 9998202860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 07:09:40,202][15401] Updated weights for policy 0, policy_version 610241 (0.0030) [2024-06-24 07:09:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 9998336000. Throughput: 0: 42597.3. Samples: 9998457220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 07:09:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610251_9998352384.pth... [2024-06-24 07:09:43,510][15401] Updated weights for policy 0, policy_version 610251 (0.0031) [2024-06-24 07:09:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609628_9988145152.pth [2024-06-24 07:09:47,793][15401] Updated weights for policy 0, policy_version 610261 (0.0025) [2024-06-24 07:09:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 9998532608. Throughput: 0: 42674.7. Samples: 9998719280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 07:09:51,163][15401] Updated weights for policy 0, policy_version 610271 (0.0033) [2024-06-24 07:09:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42487.7). Total num frames: 9998778368. Throughput: 0: 42711.5. Samples: 9998845060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 07:09:55,406][15401] Updated weights for policy 0, policy_version 610281 (0.0040) [2024-06-24 07:09:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 9998991360. Throughput: 0: 42772.0. Samples: 9999102940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 26.0) [2024-06-24 07:09:58,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 07:09:58,794][15401] Updated weights for policy 0, policy_version 610291 (0.0046) [2024-06-24 07:10:02,940][15401] Updated weights for policy 0, policy_version 610301 (0.0038) [2024-06-24 07:10:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 9999187968. Throughput: 0: 42634.7. Samples: 9999359940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 07:10:06,406][15401] Updated weights for policy 0, policy_version 610311 (0.0023) [2024-06-24 07:10:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 9999400960. Throughput: 0: 42751.2. Samples: 9999483600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:10:10,815][15401] Updated weights for policy 0, policy_version 610321 (0.0045) [2024-06-24 07:10:11,161][15349] Signal inference workers to stop experience collection... (148200 times) [2024-06-24 07:10:11,170][15349] Signal inference workers to resume experience collection... (148200 times) [2024-06-24 07:10:11,208][15401] InferenceWorker_p0-w0: stopping experience collection (148200 times) [2024-06-24 07:10:11,208][15401] InferenceWorker_p0-w0: resuming experience collection (148200 times) [2024-06-24 07:10:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 9999613952. Throughput: 0: 42910.5. Samples: 9999746680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 07:10:14,139][15401] Updated weights for policy 0, policy_version 610331 (0.0041) [2024-06-24 07:10:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 9999810560. Throughput: 0: 42645.8. Samples: 9999995780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 07:10:18,529][15401] Updated weights for policy 0, policy_version 610341 (0.0037) [2024-06-24 07:10:21,954][15401] Updated weights for policy 0, policy_version 610351 (0.0032) [2024-06-24 07:10:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 10000056320. Throughput: 0: 42635.4. Samples: 10000121460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 07:10:26,326][15401] Updated weights for policy 0, policy_version 610361 (0.0045) [2024-06-24 07:10:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10000236544. Throughput: 0: 42770.2. Samples: 10000381880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 07:10:29,737][15401] Updated weights for policy 0, policy_version 610371 (0.0035) [2024-06-24 07:10:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10000465920. Throughput: 0: 42419.5. Samples: 10000628160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 07:10:33,862][15401] Updated weights for policy 0, policy_version 610381 (0.0032) [2024-06-24 07:10:37,581][15401] Updated weights for policy 0, policy_version 610391 (0.0037) [2024-06-24 07:10:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10000695296. Throughput: 0: 42464.4. Samples: 10000755960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 07:10:41,565][15401] Updated weights for policy 0, policy_version 610401 (0.0027) [2024-06-24 07:10:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10000875520. Throughput: 0: 42398.7. Samples: 10001010880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 07:10:45,171][15401] Updated weights for policy 0, policy_version 610411 (0.0036) [2024-06-24 07:10:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.6, 300 sec: 42431.4). Total num frames: 10001088512. Throughput: 0: 42379.9. Samples: 10001267140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:48,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 07:10:49,186][15401] Updated weights for policy 0, policy_version 610421 (0.0028) [2024-06-24 07:10:52,829][15401] Updated weights for policy 0, policy_version 610431 (0.0032) [2024-06-24 07:10:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10001317888. Throughput: 0: 42407.2. Samples: 10001391920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:53,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 07:10:56,800][15401] Updated weights for policy 0, policy_version 610441 (0.0023) [2024-06-24 07:10:58,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10001514496. Throughput: 0: 42201.6. Samples: 10001645740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:10:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 07:11:00,442][15401] Updated weights for policy 0, policy_version 610451 (0.0037) [2024-06-24 07:11:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10001727488. Throughput: 0: 42419.5. Samples: 10001904660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:03,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 07:11:04,335][15401] Updated weights for policy 0, policy_version 610461 (0.0032) [2024-06-24 07:11:07,897][15401] Updated weights for policy 0, policy_version 610471 (0.0036) [2024-06-24 07:11:08,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10001956864. Throughput: 0: 42395.6. Samples: 10002029360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:08,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 07:11:11,799][15401] Updated weights for policy 0, policy_version 610481 (0.0033) [2024-06-24 07:11:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 10002169856. Throughput: 0: 42368.1. Samples: 10002288440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 07:11:15,767][15401] Updated weights for policy 0, policy_version 610491 (0.0033) [2024-06-24 07:11:18,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 10002366464. Throughput: 0: 42703.9. Samples: 10002549940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:18,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 07:11:19,870][15401] Updated weights for policy 0, policy_version 610501 (0.0034) [2024-06-24 07:11:20,328][15349] Signal inference workers to stop experience collection... (148250 times) [2024-06-24 07:11:20,374][15401] InferenceWorker_p0-w0: stopping experience collection (148250 times) [2024-06-24 07:11:20,383][15349] Signal inference workers to resume experience collection... (148250 times) [2024-06-24 07:11:20,393][15401] InferenceWorker_p0-w0: resuming experience collection (148250 times) [2024-06-24 07:11:23,321][15401] Updated weights for policy 0, policy_version 610511 (0.0033) [2024-06-24 07:11:23,391][15132] Fps is (10 sec: 44231.4, 60 sec: 42597.7, 300 sec: 42709.3). Total num frames: 10002612224. Throughput: 0: 42709.2. Samples: 10002677920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:23,391][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 07:11:27,462][15401] Updated weights for policy 0, policy_version 610521 (0.0031) [2024-06-24 07:11:28,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10002808832. Throughput: 0: 42708.8. Samples: 10002932780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 07:11:30,971][15401] Updated weights for policy 0, policy_version 610531 (0.0036) [2024-06-24 07:11:33,390][15132] Fps is (10 sec: 40964.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 10003021824. Throughput: 0: 42756.0. Samples: 10003191060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:11:33,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 07:11:35,219][15401] Updated weights for policy 0, policy_version 610541 (0.0030) [2024-06-24 07:11:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10003234816. Throughput: 0: 42876.9. Samples: 10003321380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:11:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 07:11:38,630][15401] Updated weights for policy 0, policy_version 610551 (0.0031) [2024-06-24 07:11:42,585][15401] Updated weights for policy 0, policy_version 610561 (0.0042) [2024-06-24 07:11:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42654.1). Total num frames: 10003447808. Throughput: 0: 42934.1. Samples: 10003577780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:11:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 07:11:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610563_10003464192.pth... [2024-06-24 07:11:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000609938_9993224192.pth [2024-06-24 07:11:46,271][15401] Updated weights for policy 0, policy_version 610571 (0.0031) [2024-06-24 07:11:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 10003660800. Throughput: 0: 42678.8. Samples: 10003825200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:11:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 07:11:50,684][15401] Updated weights for policy 0, policy_version 610581 (0.0040) [2024-06-24 07:11:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10003857408. Throughput: 0: 42794.3. Samples: 10003955000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:11:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 07:11:53,909][15401] Updated weights for policy 0, policy_version 610591 (0.0040) [2024-06-24 07:11:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 10004070400. Throughput: 0: 42704.4. Samples: 10004210140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:11:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 07:11:58,436][15401] Updated weights for policy 0, policy_version 610601 (0.0031) [2024-06-24 07:12:01,851][15401] Updated weights for policy 0, policy_version 610611 (0.0031) [2024-06-24 07:12:03,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 10004316160. Throughput: 0: 42497.8. Samples: 10004462340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:03,392][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 07:12:06,150][15401] Updated weights for policy 0, policy_version 610621 (0.0025) [2024-06-24 07:12:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10004496384. Throughput: 0: 42607.3. Samples: 10004595200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 07:12:09,445][15401] Updated weights for policy 0, policy_version 610631 (0.0035) [2024-06-24 07:12:13,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10004725760. Throughput: 0: 42673.8. Samples: 10004853100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:12:13,812][15401] Updated weights for policy 0, policy_version 610641 (0.0031) [2024-06-24 07:12:17,295][15401] Updated weights for policy 0, policy_version 610651 (0.0041) [2024-06-24 07:12:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 10004938752. Throughput: 0: 42456.9. Samples: 10005101620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:18,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 07:12:21,468][15401] Updated weights for policy 0, policy_version 610661 (0.0048) [2024-06-24 07:12:23,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42048.6, 300 sec: 42653.4). Total num frames: 10005135360. Throughput: 0: 42407.3. Samples: 10005229980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:23,396][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 07:12:25,197][15401] Updated weights for policy 0, policy_version 610671 (0.0029) [2024-06-24 07:12:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 10005364736. Throughput: 0: 42347.2. Samples: 10005483400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 07:12:29,022][15401] Updated weights for policy 0, policy_version 610681 (0.0046) [2024-06-24 07:12:32,606][15401] Updated weights for policy 0, policy_version 610691 (0.0041) [2024-06-24 07:12:33,389][15132] Fps is (10 sec: 44265.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10005577728. Throughput: 0: 42712.9. Samples: 10005747280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 07:12:36,506][15401] Updated weights for policy 0, policy_version 610701 (0.0032) [2024-06-24 07:12:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10005790720. Throughput: 0: 42551.2. Samples: 10005869800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 07:12:40,198][15401] Updated weights for policy 0, policy_version 610711 (0.0035) [2024-06-24 07:12:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 10006020096. Throughput: 0: 42589.5. Samples: 10006126940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:43,396][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 07:12:44,578][15401] Updated weights for policy 0, policy_version 610721 (0.0043) [2024-06-24 07:12:47,812][15401] Updated weights for policy 0, policy_version 610731 (0.0031) [2024-06-24 07:12:48,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 10006216704. Throughput: 0: 42649.8. Samples: 10006381580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:48,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 07:12:48,670][15349] Signal inference workers to stop experience collection... (148300 times) [2024-06-24 07:12:48,671][15349] Signal inference workers to resume experience collection... (148300 times) [2024-06-24 07:12:48,713][15401] InferenceWorker_p0-w0: stopping experience collection (148300 times) [2024-06-24 07:12:48,714][15401] InferenceWorker_p0-w0: resuming experience collection (148300 times) [2024-06-24 07:12:52,233][15401] Updated weights for policy 0, policy_version 610741 (0.0042) [2024-06-24 07:12:53,389][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10006429696. Throughput: 0: 42533.0. Samples: 10006509180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 07:12:55,424][15401] Updated weights for policy 0, policy_version 610751 (0.0028) [2024-06-24 07:12:58,392][15132] Fps is (10 sec: 44236.8, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 10006659072. Throughput: 0: 42564.4. Samples: 10006768600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:12:58,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 07:13:00,071][15401] Updated weights for policy 0, policy_version 610761 (0.0036) [2024-06-24 07:13:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 10006855680. Throughput: 0: 42666.7. Samples: 10007021620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 07:13:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 07:13:03,420][15401] Updated weights for policy 0, policy_version 610771 (0.0037) [2024-06-24 07:13:07,600][15401] Updated weights for policy 0, policy_version 610781 (0.0038) [2024-06-24 07:13:08,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10007068672. Throughput: 0: 42632.2. Samples: 10007148160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 07:13:11,144][15401] Updated weights for policy 0, policy_version 610791 (0.0036) [2024-06-24 07:13:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10007314432. Throughput: 0: 42827.6. Samples: 10007410640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:13,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 07:13:14,993][15401] Updated weights for policy 0, policy_version 610801 (0.0041) [2024-06-24 07:13:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10007511040. Throughput: 0: 42697.7. Samples: 10007668680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 07:13:18,710][15401] Updated weights for policy 0, policy_version 610811 (0.0036) [2024-06-24 07:13:22,395][15401] Updated weights for policy 0, policy_version 610821 (0.0039) [2024-06-24 07:13:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42875.9, 300 sec: 42598.4). Total num frames: 10007707648. Throughput: 0: 42707.8. Samples: 10007791660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:23,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 07:13:26,366][15401] Updated weights for policy 0, policy_version 610831 (0.0024) [2024-06-24 07:13:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10007937024. Throughput: 0: 42774.4. Samples: 10008051520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 07:13:30,280][15401] Updated weights for policy 0, policy_version 610841 (0.0034) [2024-06-24 07:13:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10008150016. Throughput: 0: 42984.5. Samples: 10008315780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:33,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 07:13:33,775][15401] Updated weights for policy 0, policy_version 610851 (0.0022) [2024-06-24 07:13:37,726][15401] Updated weights for policy 0, policy_version 610861 (0.0035) [2024-06-24 07:13:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10008346624. Throughput: 0: 43021.2. Samples: 10008445140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 07:13:41,642][15401] Updated weights for policy 0, policy_version 610871 (0.0035) [2024-06-24 07:13:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 10008592384. Throughput: 0: 42879.2. Samples: 10008698060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 07:13:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610876_10008592384.pth... [2024-06-24 07:13:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610251_9998352384.pth [2024-06-24 07:13:45,352][15401] Updated weights for policy 0, policy_version 610881 (0.0031) [2024-06-24 07:13:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10008788992. Throughput: 0: 42969.8. Samples: 10008955260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 07:13:49,259][15401] Updated weights for policy 0, policy_version 610891 (0.0042) [2024-06-24 07:13:53,172][15401] Updated weights for policy 0, policy_version 610901 (0.0039) [2024-06-24 07:13:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10009001984. Throughput: 0: 42889.0. Samples: 10009078160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 07:13:56,858][15401] Updated weights for policy 0, policy_version 610911 (0.0028) [2024-06-24 07:13:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10009231360. Throughput: 0: 42803.9. Samples: 10009336820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:13:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 07:14:01,288][15401] Updated weights for policy 0, policy_version 610921 (0.0033) [2024-06-24 07:14:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10009427968. Throughput: 0: 42882.7. Samples: 10009598400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 07:14:04,741][15401] Updated weights for policy 0, policy_version 610931 (0.0036) [2024-06-24 07:14:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 10009640960. Throughput: 0: 42884.4. Samples: 10009721560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:08,392][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 07:14:08,737][15401] Updated weights for policy 0, policy_version 610941 (0.0030) [2024-06-24 07:14:10,975][15349] Signal inference workers to stop experience collection... (148350 times) [2024-06-24 07:14:11,025][15401] InferenceWorker_p0-w0: stopping experience collection (148350 times) [2024-06-24 07:14:11,089][15349] Signal inference workers to resume experience collection... (148350 times) [2024-06-24 07:14:11,089][15401] InferenceWorker_p0-w0: resuming experience collection (148350 times) [2024-06-24 07:14:12,526][15401] Updated weights for policy 0, policy_version 610951 (0.0032) [2024-06-24 07:14:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 10009870336. Throughput: 0: 42908.5. Samples: 10009982400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 07:14:16,277][15401] Updated weights for policy 0, policy_version 610961 (0.0036) [2024-06-24 07:14:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10010066944. Throughput: 0: 42756.8. Samples: 10010239840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:14:20,140][15401] Updated weights for policy 0, policy_version 610971 (0.0041) [2024-06-24 07:14:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10010296320. Throughput: 0: 42663.6. Samples: 10010365000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 07:14:23,880][15401] Updated weights for policy 0, policy_version 610981 (0.0031) [2024-06-24 07:14:27,689][15401] Updated weights for policy 0, policy_version 610991 (0.0036) [2024-06-24 07:14:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10010509312. Throughput: 0: 42806.7. Samples: 10010624360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 07:14:31,282][15401] Updated weights for policy 0, policy_version 611001 (0.0036) [2024-06-24 07:14:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10010689536. Throughput: 0: 42802.7. Samples: 10010881380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 07:14:35,321][15401] Updated weights for policy 0, policy_version 611011 (0.0040) [2024-06-24 07:14:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10010935296. Throughput: 0: 42775.9. Samples: 10011003080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 07:14:38,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 07:14:38,882][15401] Updated weights for policy 0, policy_version 611021 (0.0039) [2024-06-24 07:14:42,879][15401] Updated weights for policy 0, policy_version 611031 (0.0035) [2024-06-24 07:14:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10011148288. Throughput: 0: 42794.2. Samples: 10011262560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:14:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 07:14:46,347][15401] Updated weights for policy 0, policy_version 611041 (0.0035) [2024-06-24 07:14:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10011328512. Throughput: 0: 42828.5. Samples: 10011525680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:14:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:14:50,452][15401] Updated weights for policy 0, policy_version 611051 (0.0036) [2024-06-24 07:14:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10011574272. Throughput: 0: 42791.9. Samples: 10011647100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:14:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 07:14:54,162][15401] Updated weights for policy 0, policy_version 611061 (0.0037) [2024-06-24 07:14:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10011770880. Throughput: 0: 42779.2. Samples: 10011907460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:14:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 07:14:58,406][15401] Updated weights for policy 0, policy_version 611071 (0.0043) [2024-06-24 07:15:01,767][15401] Updated weights for policy 0, policy_version 611081 (0.0035) [2024-06-24 07:15:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10011983872. Throughput: 0: 42885.3. Samples: 10012169680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 07:15:06,371][15401] Updated weights for policy 0, policy_version 611091 (0.0036) [2024-06-24 07:15:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 10012229632. Throughput: 0: 43041.7. Samples: 10012301880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 07:15:09,355][15401] Updated weights for policy 0, policy_version 611101 (0.0031) [2024-06-24 07:15:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10012426240. Throughput: 0: 42942.9. Samples: 10012556800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 07:15:14,036][15401] Updated weights for policy 0, policy_version 611111 (0.0030) [2024-06-24 07:15:16,975][15401] Updated weights for policy 0, policy_version 611121 (0.0043) [2024-06-24 07:15:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10012639232. Throughput: 0: 42818.2. Samples: 10012808200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 07:15:21,568][15401] Updated weights for policy 0, policy_version 611131 (0.0023) [2024-06-24 07:15:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10012852224. Throughput: 0: 43057.8. Samples: 10012940680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:23,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-24 07:15:24,526][15401] Updated weights for policy 0, policy_version 611141 (0.0041) [2024-06-24 07:15:27,394][15349] Signal inference workers to stop experience collection... (148400 times) [2024-06-24 07:15:27,445][15401] InferenceWorker_p0-w0: stopping experience collection (148400 times) [2024-06-24 07:15:27,454][15349] Signal inference workers to resume experience collection... (148400 times) [2024-06-24 07:15:27,465][15401] InferenceWorker_p0-w0: resuming experience collection (148400 times) [2024-06-24 07:15:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10013081600. Throughput: 0: 43063.1. Samples: 10013200400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:28,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-24 07:15:29,040][15401] Updated weights for policy 0, policy_version 611151 (0.0038) [2024-06-24 07:15:32,191][15401] Updated weights for policy 0, policy_version 611161 (0.0032) [2024-06-24 07:15:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10013278208. Throughput: 0: 42783.5. Samples: 10013450940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 07:15:36,872][15401] Updated weights for policy 0, policy_version 611171 (0.0037) [2024-06-24 07:15:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10013491200. Throughput: 0: 42962.7. Samples: 10013580420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:38,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 07:15:39,917][15401] Updated weights for policy 0, policy_version 611181 (0.0028) [2024-06-24 07:15:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 10013704192. Throughput: 0: 42940.8. Samples: 10013839800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:43,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 07:15:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611188_10013704192.pth... [2024-06-24 07:15:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610563_10003464192.pth [2024-06-24 07:15:44,426][15401] Updated weights for policy 0, policy_version 611191 (0.0025) [2024-06-24 07:15:47,966][15401] Updated weights for policy 0, policy_version 611201 (0.0039) [2024-06-24 07:15:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10013917184. Throughput: 0: 42742.7. Samples: 10014093100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 07:15:51,973][15401] Updated weights for policy 0, policy_version 611211 (0.0033) [2024-06-24 07:15:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 10014146560. Throughput: 0: 42776.1. Samples: 10014226800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 07:15:55,786][15401] Updated weights for policy 0, policy_version 611221 (0.0026) [2024-06-24 07:15:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10014343168. Throughput: 0: 42657.5. Samples: 10014476380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:15:58,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 07:15:59,937][15401] Updated weights for policy 0, policy_version 611231 (0.0030) [2024-06-24 07:16:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10014556160. Throughput: 0: 42826.7. Samples: 10014735400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:16:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 07:16:03,634][15401] Updated weights for policy 0, policy_version 611241 (0.0029) [2024-06-24 07:16:07,511][15401] Updated weights for policy 0, policy_version 611251 (0.0037) [2024-06-24 07:16:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10014801920. Throughput: 0: 42842.1. Samples: 10014868580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:16:08,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 07:16:11,123][15401] Updated weights for policy 0, policy_version 611261 (0.0035) [2024-06-24 07:16:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 10014998528. Throughput: 0: 42672.5. Samples: 10015120660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 07:16:13,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-24 07:16:15,420][15401] Updated weights for policy 0, policy_version 611271 (0.0041) [2024-06-24 07:16:18,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42871.2, 300 sec: 42709.6). Total num frames: 10015211520. Throughput: 0: 42679.6. Samples: 10015371540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 07:16:18,688][15401] Updated weights for policy 0, policy_version 611281 (0.0042) [2024-06-24 07:16:22,925][15401] Updated weights for policy 0, policy_version 611291 (0.0032) [2024-06-24 07:16:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10015424512. Throughput: 0: 42751.1. Samples: 10015504220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 07:16:26,179][15401] Updated weights for policy 0, policy_version 611301 (0.0027) [2024-06-24 07:16:28,392][15132] Fps is (10 sec: 44228.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 10015653888. Throughput: 0: 42700.0. Samples: 10015761400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:28,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 07:16:30,734][15401] Updated weights for policy 0, policy_version 611311 (0.0035) [2024-06-24 07:16:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10015850496. Throughput: 0: 42699.6. Samples: 10016014580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:33,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-24 07:16:33,720][15401] Updated weights for policy 0, policy_version 611321 (0.0047) [2024-06-24 07:16:38,308][15401] Updated weights for policy 0, policy_version 611331 (0.0035) [2024-06-24 07:16:38,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10016047104. Throughput: 0: 42557.8. Samples: 10016141900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 07:16:40,343][15349] Signal inference workers to stop experience collection... (148450 times) [2024-06-24 07:16:40,372][15401] InferenceWorker_p0-w0: stopping experience collection (148450 times) [2024-06-24 07:16:40,401][15349] Signal inference workers to resume experience collection... (148450 times) [2024-06-24 07:16:40,402][15401] InferenceWorker_p0-w0: resuming experience collection (148450 times) [2024-06-24 07:16:41,277][15401] Updated weights for policy 0, policy_version 611341 (0.0043) [2024-06-24 07:16:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 10016292864. Throughput: 0: 42805.3. Samples: 10016402620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 07:16:45,752][15401] Updated weights for policy 0, policy_version 611351 (0.0035) [2024-06-24 07:16:48,390][15132] Fps is (10 sec: 44234.1, 60 sec: 42871.1, 300 sec: 42820.5). Total num frames: 10016489472. Throughput: 0: 42734.6. Samples: 10016658480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:48,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 07:16:49,289][15401] Updated weights for policy 0, policy_version 611361 (0.0020) [2024-06-24 07:16:53,251][15401] Updated weights for policy 0, policy_version 611371 (0.0032) [2024-06-24 07:16:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10016702464. Throughput: 0: 42633.0. Samples: 10016787060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 07:16:56,854][15401] Updated weights for policy 0, policy_version 611381 (0.0042) [2024-06-24 07:16:58,390][15132] Fps is (10 sec: 44239.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 10016931840. Throughput: 0: 42807.5. Samples: 10017047000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:16:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 07:17:00,694][15401] Updated weights for policy 0, policy_version 611391 (0.0027) [2024-06-24 07:17:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10017144832. Throughput: 0: 42894.5. Samples: 10017301780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 07:17:04,637][15401] Updated weights for policy 0, policy_version 611401 (0.0044) [2024-06-24 07:17:08,233][15401] Updated weights for policy 0, policy_version 611411 (0.0042) [2024-06-24 07:17:08,391][15132] Fps is (10 sec: 42591.4, 60 sec: 42597.2, 300 sec: 42820.3). Total num frames: 10017357824. Throughput: 0: 42794.5. Samples: 10017430040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:08,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 07:17:12,268][15401] Updated weights for policy 0, policy_version 611421 (0.0039) [2024-06-24 07:17:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10017570816. Throughput: 0: 42912.5. Samples: 10017692360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:13,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 07:17:15,913][15401] Updated weights for policy 0, policy_version 611431 (0.0033) [2024-06-24 07:17:18,389][15132] Fps is (10 sec: 42605.7, 60 sec: 42871.8, 300 sec: 42877.0). Total num frames: 10017783808. Throughput: 0: 42956.4. Samples: 10017947620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 07:17:19,849][15401] Updated weights for policy 0, policy_version 611441 (0.0037) [2024-06-24 07:17:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 10017996800. Throughput: 0: 42859.5. Samples: 10018070580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 07:17:23,626][15401] Updated weights for policy 0, policy_version 611451 (0.0029) [2024-06-24 07:17:27,753][15401] Updated weights for policy 0, policy_version 611461 (0.0041) [2024-06-24 07:17:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 10018209792. Throughput: 0: 42820.8. Samples: 10018329560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:28,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 07:17:31,416][15401] Updated weights for policy 0, policy_version 611471 (0.0039) [2024-06-24 07:17:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10018406400. Throughput: 0: 42741.1. Samples: 10018581800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 07:17:35,304][15401] Updated weights for policy 0, policy_version 611481 (0.0040) [2024-06-24 07:17:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 10018619392. Throughput: 0: 42684.9. Samples: 10018707880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:38,396][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 07:17:39,003][15401] Updated weights for policy 0, policy_version 611491 (0.0030) [2024-06-24 07:17:42,841][15401] Updated weights for policy 0, policy_version 611501 (0.0024) [2024-06-24 07:17:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 10018848768. Throughput: 0: 42644.4. Samples: 10018966000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 07:17:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611502_10018848768.pth... [2024-06-24 07:17:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000610876_10008592384.pth [2024-06-24 07:17:46,794][15401] Updated weights for policy 0, policy_version 611511 (0.0028) [2024-06-24 07:17:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42872.0, 300 sec: 42820.6). Total num frames: 10019061760. Throughput: 0: 42596.2. Samples: 10019218600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 07:17:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 07:17:50,449][15401] Updated weights for policy 0, policy_version 611521 (0.0039) [2024-06-24 07:17:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10019258368. Throughput: 0: 42515.9. Samples: 10019343180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:17:53,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 07:17:54,635][15401] Updated weights for policy 0, policy_version 611531 (0.0027) [2024-06-24 07:17:58,058][15401] Updated weights for policy 0, policy_version 611541 (0.0034) [2024-06-24 07:17:58,394][15132] Fps is (10 sec: 42578.5, 60 sec: 42595.2, 300 sec: 42819.9). Total num frames: 10019487744. Throughput: 0: 42472.6. Samples: 10019603820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:17:58,395][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 07:17:59,039][15349] Signal inference workers to stop experience collection... (148500 times) [2024-06-24 07:17:59,096][15401] InferenceWorker_p0-w0: stopping experience collection (148500 times) [2024-06-24 07:17:59,097][15349] Signal inference workers to resume experience collection... (148500 times) [2024-06-24 07:17:59,120][15401] InferenceWorker_p0-w0: resuming experience collection (148500 times) [2024-06-24 07:18:02,109][15401] Updated weights for policy 0, policy_version 611551 (0.0037) [2024-06-24 07:18:03,391][15132] Fps is (10 sec: 45869.1, 60 sec: 42870.6, 300 sec: 42875.9). Total num frames: 10019717120. Throughput: 0: 42432.1. Samples: 10019857120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:03,391][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 07:18:06,062][15401] Updated weights for policy 0, policy_version 611561 (0.0030) [2024-06-24 07:18:08,389][15132] Fps is (10 sec: 40978.9, 60 sec: 42326.6, 300 sec: 42653.9). Total num frames: 10019897344. Throughput: 0: 42559.2. Samples: 10019985740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 07:18:09,723][15401] Updated weights for policy 0, policy_version 611571 (0.0024) [2024-06-24 07:18:13,391][15132] Fps is (10 sec: 40960.9, 60 sec: 42597.7, 300 sec: 42764.9). Total num frames: 10020126720. Throughput: 0: 42628.8. Samples: 10020247900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:13,391][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 07:18:13,960][15401] Updated weights for policy 0, policy_version 611581 (0.0045) [2024-06-24 07:18:17,321][15401] Updated weights for policy 0, policy_version 611591 (0.0045) [2024-06-24 07:18:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10020339712. Throughput: 0: 42733.4. Samples: 10020504800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 07:18:21,379][15401] Updated weights for policy 0, policy_version 611601 (0.0040) [2024-06-24 07:18:23,390][15132] Fps is (10 sec: 42602.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10020552704. Throughput: 0: 42769.2. Samples: 10020632500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 07:18:24,922][15401] Updated weights for policy 0, policy_version 611611 (0.0034) [2024-06-24 07:18:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10020765696. Throughput: 0: 42709.1. Samples: 10020887900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 07:18:29,215][15401] Updated weights for policy 0, policy_version 611621 (0.0027) [2024-06-24 07:18:32,493][15401] Updated weights for policy 0, policy_version 611631 (0.0037) [2024-06-24 07:18:33,391][15132] Fps is (10 sec: 42590.4, 60 sec: 42870.0, 300 sec: 42820.3). Total num frames: 10020978688. Throughput: 0: 42866.5. Samples: 10021147680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:33,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 07:18:36,820][15401] Updated weights for policy 0, policy_version 611641 (0.0022) [2024-06-24 07:18:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10021208064. Throughput: 0: 42921.3. Samples: 10021274640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 07:18:40,821][15401] Updated weights for policy 0, policy_version 611651 (0.0035) [2024-06-24 07:18:43,389][15132] Fps is (10 sec: 44245.4, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 10021421056. Throughput: 0: 42828.8. Samples: 10021530920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 07:18:44,286][15401] Updated weights for policy 0, policy_version 611661 (0.0036) [2024-06-24 07:18:48,387][15401] Updated weights for policy 0, policy_version 611671 (0.0031) [2024-06-24 07:18:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10021617664. Throughput: 0: 42946.1. Samples: 10021789640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 07:18:52,105][15401] Updated weights for policy 0, policy_version 611681 (0.0047) [2024-06-24 07:18:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10021847040. Throughput: 0: 42840.0. Samples: 10021913540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 07:18:55,978][15401] Updated weights for policy 0, policy_version 611691 (0.0038) [2024-06-24 07:18:58,391][15132] Fps is (10 sec: 44229.3, 60 sec: 42873.6, 300 sec: 42820.3). Total num frames: 10022060032. Throughput: 0: 42734.5. Samples: 10022170980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:18:58,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 07:18:59,684][15401] Updated weights for policy 0, policy_version 611701 (0.0029) [2024-06-24 07:19:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42326.2, 300 sec: 42765.4). Total num frames: 10022256640. Throughput: 0: 42701.2. Samples: 10022426360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:19:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 07:19:03,881][15401] Updated weights for policy 0, policy_version 611711 (0.0034) [2024-06-24 07:19:07,227][15401] Updated weights for policy 0, policy_version 611721 (0.0028) [2024-06-24 07:19:08,389][15132] Fps is (10 sec: 40967.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10022469632. Throughput: 0: 42690.4. Samples: 10022553560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:19:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 07:19:11,369][15401] Updated weights for policy 0, policy_version 611731 (0.0040) [2024-06-24 07:19:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 10022682624. Throughput: 0: 42717.2. Samples: 10022810180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:19:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 07:19:14,924][15401] Updated weights for policy 0, policy_version 611741 (0.0039) [2024-06-24 07:19:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10022879232. Throughput: 0: 42683.7. Samples: 10023068360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:19:18,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 07:19:18,961][15401] Updated weights for policy 0, policy_version 611751 (0.0023) [2024-06-24 07:19:21,343][15349] Signal inference workers to stop experience collection... (148550 times) [2024-06-24 07:19:21,345][15349] Signal inference workers to resume experience collection... (148550 times) [2024-06-24 07:19:21,359][15401] InferenceWorker_p0-w0: stopping experience collection (148550 times) [2024-06-24 07:19:21,400][15401] InferenceWorker_p0-w0: resuming experience collection (148550 times) [2024-06-24 07:19:22,588][15401] Updated weights for policy 0, policy_version 611761 (0.0038) [2024-06-24 07:19:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10023108608. Throughput: 0: 42547.1. Samples: 10023189260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-24 07:19:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 07:19:27,133][15401] Updated weights for policy 0, policy_version 611771 (0.0042) [2024-06-24 07:19:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10023305216. Throughput: 0: 42629.0. Samples: 10023449220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:28,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 07:19:30,403][15401] Updated weights for policy 0, policy_version 611781 (0.0029) [2024-06-24 07:19:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 10023518208. Throughput: 0: 42479.1. Samples: 10023701200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:33,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 07:19:34,726][15401] Updated weights for policy 0, policy_version 611791 (0.0041) [2024-06-24 07:19:38,010][15401] Updated weights for policy 0, policy_version 611801 (0.0034) [2024-06-24 07:19:38,396][15132] Fps is (10 sec: 44207.9, 60 sec: 42320.8, 300 sec: 42708.5). Total num frames: 10023747584. Throughput: 0: 42500.6. Samples: 10023826340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:38,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 07:19:42,336][15401] Updated weights for policy 0, policy_version 611811 (0.0037) [2024-06-24 07:19:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 10023944192. Throughput: 0: 42489.5. Samples: 10024083040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:43,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 07:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611813_10023944192.pth... [2024-06-24 07:19:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611188_10013704192.pth [2024-06-24 07:19:45,645][15401] Updated weights for policy 0, policy_version 611821 (0.0028) [2024-06-24 07:19:48,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10024157184. Throughput: 0: 42395.1. Samples: 10024334140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 07:19:50,269][15401] Updated weights for policy 0, policy_version 611831 (0.0043) [2024-06-24 07:19:53,353][15401] Updated weights for policy 0, policy_version 611841 (0.0030) [2024-06-24 07:19:53,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10024402944. Throughput: 0: 42382.5. Samples: 10024460780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 07:19:58,119][15401] Updated weights for policy 0, policy_version 611851 (0.0038) [2024-06-24 07:19:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42053.4, 300 sec: 42709.5). Total num frames: 10024583168. Throughput: 0: 42340.0. Samples: 10024715480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:19:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 07:20:00,944][15401] Updated weights for policy 0, policy_version 611861 (0.0037) [2024-06-24 07:20:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10024796160. Throughput: 0: 42263.0. Samples: 10024970200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 07:20:05,730][15401] Updated weights for policy 0, policy_version 611871 (0.0033) [2024-06-24 07:20:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10025041920. Throughput: 0: 42417.9. Samples: 10025098060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 07:20:08,449][15401] Updated weights for policy 0, policy_version 611881 (0.0035) [2024-06-24 07:20:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 10025205760. Throughput: 0: 42276.0. Samples: 10025351640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 07:20:13,490][15401] Updated weights for policy 0, policy_version 611891 (0.0031) [2024-06-24 07:20:16,352][15401] Updated weights for policy 0, policy_version 611901 (0.0026) [2024-06-24 07:20:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10025435136. Throughput: 0: 42301.7. Samples: 10025604780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 07:20:20,979][15401] Updated weights for policy 0, policy_version 611911 (0.0022) [2024-06-24 07:20:23,390][15132] Fps is (10 sec: 47511.6, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 10025680896. Throughput: 0: 42416.4. Samples: 10025734820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 07:20:24,138][15401] Updated weights for policy 0, policy_version 611921 (0.0035) [2024-06-24 07:20:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10025861120. Throughput: 0: 42474.7. Samples: 10025994300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:28,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 07:20:28,793][15401] Updated weights for policy 0, policy_version 611931 (0.0047) [2024-06-24 07:20:28,957][15349] Signal inference workers to stop experience collection... (148600 times) [2024-06-24 07:20:28,957][15349] Signal inference workers to resume experience collection... (148600 times) [2024-06-24 07:20:28,973][15401] InferenceWorker_p0-w0: stopping experience collection (148600 times) [2024-06-24 07:20:28,973][15401] InferenceWorker_p0-w0: resuming experience collection (148600 times) [2024-06-24 07:20:31,590][15401] Updated weights for policy 0, policy_version 611941 (0.0037) [2024-06-24 07:20:33,390][15132] Fps is (10 sec: 39322.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10026074112. Throughput: 0: 42510.7. Samples: 10026247120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 07:20:36,204][15401] Updated weights for policy 0, policy_version 611951 (0.0042) [2024-06-24 07:20:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 10026319872. Throughput: 0: 42645.4. Samples: 10026379820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 07:20:38,959][15401] Updated weights for policy 0, policy_version 611961 (0.0031) [2024-06-24 07:20:43,390][15132] Fps is (10 sec: 42594.6, 60 sec: 42599.5, 300 sec: 42653.8). Total num frames: 10026500096. Throughput: 0: 42636.5. Samples: 10026634160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:43,391][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 07:20:44,058][15401] Updated weights for policy 0, policy_version 611971 (0.0036) [2024-06-24 07:20:46,552][15401] Updated weights for policy 0, policy_version 611981 (0.0031) [2024-06-24 07:20:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10026713088. Throughput: 0: 42766.7. Samples: 10026894700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:48,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-24 07:20:51,549][15401] Updated weights for policy 0, policy_version 611991 (0.0034) [2024-06-24 07:20:53,389][15132] Fps is (10 sec: 45879.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10026958848. Throughput: 0: 42743.2. Samples: 10027021500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:53,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 07:20:54,090][15401] Updated weights for policy 0, policy_version 612001 (0.0031) [2024-06-24 07:20:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10027155456. Throughput: 0: 42971.5. Samples: 10027285360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 07:20:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 07:20:59,067][15401] Updated weights for policy 0, policy_version 612011 (0.0029) [2024-06-24 07:21:01,757][15401] Updated weights for policy 0, policy_version 612021 (0.0034) [2024-06-24 07:21:03,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10027368448. Throughput: 0: 43122.9. Samples: 10027545320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:03,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 07:21:06,534][15401] Updated weights for policy 0, policy_version 612031 (0.0046) [2024-06-24 07:21:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10027597824. Throughput: 0: 42919.0. Samples: 10027666160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 07:21:09,668][15401] Updated weights for policy 0, policy_version 612041 (0.0034) [2024-06-24 07:21:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10027794432. Throughput: 0: 43050.2. Samples: 10027931560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 07:21:14,089][15401] Updated weights for policy 0, policy_version 612051 (0.0037) [2024-06-24 07:21:17,224][15401] Updated weights for policy 0, policy_version 612061 (0.0034) [2024-06-24 07:21:18,392][15132] Fps is (10 sec: 40951.0, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 10028007424. Throughput: 0: 43024.6. Samples: 10028183320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:18,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 07:21:21,722][15401] Updated weights for policy 0, policy_version 612071 (0.0032) [2024-06-24 07:21:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.7, 300 sec: 42654.3). Total num frames: 10028236800. Throughput: 0: 43050.3. Samples: 10028317080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 07:21:24,785][15401] Updated weights for policy 0, policy_version 612081 (0.0043) [2024-06-24 07:21:28,394][15132] Fps is (10 sec: 42588.5, 60 sec: 42868.2, 300 sec: 42653.3). Total num frames: 10028433408. Throughput: 0: 43050.3. Samples: 10028571580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:28,394][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 07:21:29,327][15401] Updated weights for policy 0, policy_version 612091 (0.0029) [2024-06-24 07:21:33,153][15401] Updated weights for policy 0, policy_version 612101 (0.0043) [2024-06-24 07:21:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10028662784. Throughput: 0: 42948.1. Samples: 10028827360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:33,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 07:21:36,827][15401] Updated weights for policy 0, policy_version 612111 (0.0041) [2024-06-24 07:21:38,389][15132] Fps is (10 sec: 45895.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10028892160. Throughput: 0: 43093.7. Samples: 10028960720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 07:21:40,818][15401] Updated weights for policy 0, policy_version 612121 (0.0030) [2024-06-24 07:21:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42872.0, 300 sec: 42654.0). Total num frames: 10029072384. Throughput: 0: 42916.3. Samples: 10029216600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 07:21:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612127_10029088768.pth... [2024-06-24 07:21:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611502_10018848768.pth [2024-06-24 07:21:44,376][15401] Updated weights for policy 0, policy_version 612131 (0.0029) [2024-06-24 07:21:48,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 10029301760. Throughput: 0: 42717.9. Samples: 10029467720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:48,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 07:21:49,059][15401] Updated weights for policy 0, policy_version 612141 (0.0030) [2024-06-24 07:21:51,734][15349] Signal inference workers to stop experience collection... (148650 times) [2024-06-24 07:21:51,734][15349] Signal inference workers to resume experience collection... (148650 times) [2024-06-24 07:21:51,746][15401] InferenceWorker_p0-w0: stopping experience collection (148650 times) [2024-06-24 07:21:51,746][15401] InferenceWorker_p0-w0: resuming experience collection (148650 times) [2024-06-24 07:21:52,034][15401] Updated weights for policy 0, policy_version 612151 (0.0033) [2024-06-24 07:21:53,392][15132] Fps is (10 sec: 45864.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10029531136. Throughput: 0: 42959.1. Samples: 10029599420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:53,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 07:21:56,579][15401] Updated weights for policy 0, policy_version 612161 (0.0032) [2024-06-24 07:21:58,392][15132] Fps is (10 sec: 40959.8, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 10029711360. Throughput: 0: 42726.1. Samples: 10029854340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:21:58,393][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 07:21:59,773][15401] Updated weights for policy 0, policy_version 612171 (0.0029) [2024-06-24 07:22:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.7, 300 sec: 42654.2). Total num frames: 10029940736. Throughput: 0: 42858.6. Samples: 10030111860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 07:22:04,254][15401] Updated weights for policy 0, policy_version 612181 (0.0025) [2024-06-24 07:22:07,660][15401] Updated weights for policy 0, policy_version 612191 (0.0030) [2024-06-24 07:22:08,391][15132] Fps is (10 sec: 45879.5, 60 sec: 42870.4, 300 sec: 42709.3). Total num frames: 10030170112. Throughput: 0: 42777.2. Samples: 10030242120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:08,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 07:22:11,850][15401] Updated weights for policy 0, policy_version 612201 (0.0028) [2024-06-24 07:22:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10030366720. Throughput: 0: 42827.4. Samples: 10030498620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 07:22:15,160][15401] Updated weights for policy 0, policy_version 612211 (0.0037) [2024-06-24 07:22:18,390][15132] Fps is (10 sec: 39326.9, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 10030563328. Throughput: 0: 42752.2. Samples: 10030751220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 07:22:19,474][15401] Updated weights for policy 0, policy_version 612221 (0.0036) [2024-06-24 07:22:22,719][15401] Updated weights for policy 0, policy_version 612231 (0.0033) [2024-06-24 07:22:23,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.8, 300 sec: 42708.5). Total num frames: 10030809088. Throughput: 0: 42597.5. Samples: 10030877880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:23,405][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 07:22:26,994][15401] Updated weights for policy 0, policy_version 612241 (0.0037) [2024-06-24 07:22:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42874.7, 300 sec: 42709.5). Total num frames: 10031005696. Throughput: 0: 42669.4. Samples: 10031136720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:22:30,301][15401] Updated weights for policy 0, policy_version 612251 (0.0036) [2024-06-24 07:22:33,390][15132] Fps is (10 sec: 39346.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10031202304. Throughput: 0: 42816.5. Samples: 10031394360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 07:22:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 07:22:34,856][15401] Updated weights for policy 0, policy_version 612261 (0.0028) [2024-06-24 07:22:37,837][15401] Updated weights for policy 0, policy_version 612271 (0.0041) [2024-06-24 07:22:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10031464448. Throughput: 0: 42856.4. Samples: 10031527860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:22:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 07:22:42,995][15401] Updated weights for policy 0, policy_version 612281 (0.0035) [2024-06-24 07:22:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 10031611904. Throughput: 0: 42766.3. Samples: 10031778720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:22:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 07:22:45,475][15401] Updated weights for policy 0, policy_version 612291 (0.0042) [2024-06-24 07:22:48,390][15132] Fps is (10 sec: 39319.7, 60 sec: 42599.7, 300 sec: 42709.4). Total num frames: 10031857664. Throughput: 0: 42648.8. Samples: 10032031080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:22:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 07:22:50,452][15401] Updated weights for policy 0, policy_version 612301 (0.0034) [2024-06-24 07:22:53,189][15401] Updated weights for policy 0, policy_version 612311 (0.0039) [2024-06-24 07:22:53,389][15132] Fps is (10 sec: 49152.2, 60 sec: 42873.2, 300 sec: 42765.7). Total num frames: 10032103424. Throughput: 0: 42869.9. Samples: 10032171200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:22:53,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 07:22:57,887][15401] Updated weights for policy 0, policy_version 612321 (0.0030) [2024-06-24 07:22:58,392][15132] Fps is (10 sec: 40952.3, 60 sec: 42598.4, 300 sec: 42542.7). Total num frames: 10032267264. Throughput: 0: 42795.9. Samples: 10032424540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:22:58,393][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 07:23:00,821][15401] Updated weights for policy 0, policy_version 612331 (0.0032) [2024-06-24 07:23:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10032513024. Throughput: 0: 42719.5. Samples: 10032673600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 07:23:06,053][15401] Updated weights for policy 0, policy_version 612341 (0.0030) [2024-06-24 07:23:06,707][15349] Signal inference workers to stop experience collection... (148700 times) [2024-06-24 07:23:06,708][15349] Signal inference workers to resume experience collection... (148700 times) [2024-06-24 07:23:06,731][15401] InferenceWorker_p0-w0: stopping experience collection (148700 times) [2024-06-24 07:23:06,731][15401] InferenceWorker_p0-w0: resuming experience collection (148700 times) [2024-06-24 07:23:08,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42872.5, 300 sec: 42765.2). Total num frames: 10032742400. Throughput: 0: 43002.5. Samples: 10032812720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 07:23:08,507][15401] Updated weights for policy 0, policy_version 612351 (0.0032) [2024-06-24 07:23:13,390][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10032906240. Throughput: 0: 42823.1. Samples: 10033063760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 07:23:13,828][15401] Updated weights for policy 0, policy_version 612361 (0.0031) [2024-06-24 07:23:16,178][15401] Updated weights for policy 0, policy_version 612371 (0.0037) [2024-06-24 07:23:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10033168384. Throughput: 0: 42482.2. Samples: 10033306060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:18,391][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 07:23:21,457][15401] Updated weights for policy 0, policy_version 612381 (0.0038) [2024-06-24 07:23:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 10033364992. Throughput: 0: 42645.8. Samples: 10033446920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 07:23:24,031][15401] Updated weights for policy 0, policy_version 612391 (0.0024) [2024-06-24 07:23:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 10033561600. Throughput: 0: 42677.8. Samples: 10033699220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 07:23:28,981][15401] Updated weights for policy 0, policy_version 612401 (0.0029) [2024-06-24 07:23:31,882][15401] Updated weights for policy 0, policy_version 612411 (0.0021) [2024-06-24 07:23:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 10033823744. Throughput: 0: 42707.9. Samples: 10033952920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 07:23:36,361][15401] Updated weights for policy 0, policy_version 612421 (0.0033) [2024-06-24 07:23:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10034003968. Throughput: 0: 42628.1. Samples: 10034089460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 07:23:39,271][15401] Updated weights for policy 0, policy_version 612431 (0.0028) [2024-06-24 07:23:43,393][15132] Fps is (10 sec: 39307.8, 60 sec: 43415.0, 300 sec: 42708.9). Total num frames: 10034216960. Throughput: 0: 42666.4. Samples: 10034344580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:43,394][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 07:23:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612440_10034216960.pth... [2024-06-24 07:23:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000611813_10023944192.pth [2024-06-24 07:23:43,835][15401] Updated weights for policy 0, policy_version 612441 (0.0045) [2024-06-24 07:23:46,918][15401] Updated weights for policy 0, policy_version 612451 (0.0041) [2024-06-24 07:23:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.9, 300 sec: 42709.5). Total num frames: 10034446336. Throughput: 0: 42789.0. Samples: 10034599100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 07:23:51,273][15401] Updated weights for policy 0, policy_version 612461 (0.0026) [2024-06-24 07:23:53,390][15132] Fps is (10 sec: 42613.8, 60 sec: 42325.3, 300 sec: 42654.2). Total num frames: 10034642944. Throughput: 0: 42651.1. Samples: 10034732020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 07:23:54,490][15401] Updated weights for policy 0, policy_version 612471 (0.0046) [2024-06-24 07:23:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10034855936. Throughput: 0: 42663.1. Samples: 10034983600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:23:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 07:23:59,141][15401] Updated weights for policy 0, policy_version 612481 (0.0037) [2024-06-24 07:24:02,485][15401] Updated weights for policy 0, policy_version 612491 (0.0023) [2024-06-24 07:24:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42764.6). Total num frames: 10035085312. Throughput: 0: 42905.3. Samples: 10035236900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:24:03,393][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 07:24:06,610][15401] Updated weights for policy 0, policy_version 612501 (0.0034) [2024-06-24 07:24:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10035281920. Throughput: 0: 42677.3. Samples: 10035367400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-24 07:24:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 07:24:10,163][15401] Updated weights for policy 0, policy_version 612511 (0.0028) [2024-06-24 07:24:13,390][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10035494912. Throughput: 0: 42754.6. Samples: 10035623180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:13,394][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 07:24:14,478][15401] Updated weights for policy 0, policy_version 612521 (0.0038) [2024-06-24 07:24:17,946][15401] Updated weights for policy 0, policy_version 612531 (0.0037) [2024-06-24 07:24:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10035724288. Throughput: 0: 42753.3. Samples: 10035876820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 07:24:22,478][15401] Updated weights for policy 0, policy_version 612541 (0.0046) [2024-06-24 07:24:23,391][15132] Fps is (10 sec: 42593.8, 60 sec: 42597.6, 300 sec: 42764.8). Total num frames: 10035920896. Throughput: 0: 42537.1. Samples: 10036003680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:23,391][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 07:24:25,703][15401] Updated weights for policy 0, policy_version 612551 (0.0032) [2024-06-24 07:24:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10036150272. Throughput: 0: 42506.4. Samples: 10036257220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 07:24:30,081][15401] Updated weights for policy 0, policy_version 612561 (0.0036) [2024-06-24 07:24:33,211][15401] Updated weights for policy 0, policy_version 612571 (0.0038) [2024-06-24 07:24:33,392][15132] Fps is (10 sec: 44231.3, 60 sec: 42323.7, 300 sec: 42765.6). Total num frames: 10036363264. Throughput: 0: 42772.0. Samples: 10036523940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:33,392][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 07:24:34,019][15349] Signal inference workers to stop experience collection... (148750 times) [2024-06-24 07:24:34,019][15349] Signal inference workers to resume experience collection... (148750 times) [2024-06-24 07:24:34,053][15401] InferenceWorker_p0-w0: stopping experience collection (148750 times) [2024-06-24 07:24:34,054][15401] InferenceWorker_p0-w0: resuming experience collection (148750 times) [2024-06-24 07:24:37,588][15401] Updated weights for policy 0, policy_version 612581 (0.0031) [2024-06-24 07:24:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 10036576256. Throughput: 0: 42723.5. Samples: 10036654580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:38,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 07:24:40,898][15401] Updated weights for policy 0, policy_version 612591 (0.0021) [2024-06-24 07:24:43,393][15132] Fps is (10 sec: 44230.1, 60 sec: 43144.4, 300 sec: 42875.5). Total num frames: 10036805632. Throughput: 0: 42795.9. Samples: 10036909580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:43,394][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 07:24:45,156][15401] Updated weights for policy 0, policy_version 612601 (0.0032) [2024-06-24 07:24:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10037002240. Throughput: 0: 42926.4. Samples: 10037168480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:48,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 07:24:48,501][15401] Updated weights for policy 0, policy_version 612611 (0.0044) [2024-06-24 07:24:52,856][15401] Updated weights for policy 0, policy_version 612621 (0.0043) [2024-06-24 07:24:53,392][15132] Fps is (10 sec: 40966.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10037215232. Throughput: 0: 42820.5. Samples: 10037294420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:53,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 07:24:56,011][15401] Updated weights for policy 0, policy_version 612631 (0.0037) [2024-06-24 07:24:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10037428224. Throughput: 0: 42831.6. Samples: 10037550600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:24:58,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 07:25:00,382][15401] Updated weights for policy 0, policy_version 612641 (0.0038) [2024-06-24 07:25:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10037641216. Throughput: 0: 42899.8. Samples: 10037807300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 07:25:04,020][15401] Updated weights for policy 0, policy_version 612651 (0.0031) [2024-06-24 07:25:08,105][15401] Updated weights for policy 0, policy_version 612661 (0.0052) [2024-06-24 07:25:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10037854208. Throughput: 0: 42806.4. Samples: 10037929920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 07:25:11,796][15401] Updated weights for policy 0, policy_version 612671 (0.0040) [2024-06-24 07:25:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10038067200. Throughput: 0: 42873.9. Samples: 10038186640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:13,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 07:25:15,729][15401] Updated weights for policy 0, policy_version 612681 (0.0045) [2024-06-24 07:25:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10038280192. Throughput: 0: 42595.1. Samples: 10038440620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:18,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-24 07:25:19,704][15401] Updated weights for policy 0, policy_version 612691 (0.0030) [2024-06-24 07:25:23,363][15401] Updated weights for policy 0, policy_version 612701 (0.0030) [2024-06-24 07:25:23,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42870.5, 300 sec: 42820.2). Total num frames: 10038493184. Throughput: 0: 42506.2. Samples: 10038567460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:23,392][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 07:25:27,328][15401] Updated weights for policy 0, policy_version 612711 (0.0030) [2024-06-24 07:25:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10038689792. Throughput: 0: 42556.5. Samples: 10038824460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:28,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-24 07:25:31,129][15401] Updated weights for policy 0, policy_version 612721 (0.0032) [2024-06-24 07:25:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 10038919168. Throughput: 0: 42418.6. Samples: 10039077320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 07:25:34,934][15401] Updated weights for policy 0, policy_version 612731 (0.0039) [2024-06-24 07:25:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.2). Total num frames: 10039115776. Throughput: 0: 42532.5. Samples: 10039208280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 07:25:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 07:25:38,820][15401] Updated weights for policy 0, policy_version 612741 (0.0029) [2024-06-24 07:25:42,452][15401] Updated weights for policy 0, policy_version 612751 (0.0026) [2024-06-24 07:25:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.9, 300 sec: 42765.0). Total num frames: 10039328768. Throughput: 0: 42439.9. Samples: 10039460400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:25:43,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-24 07:25:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612752_10039328768.pth... [2024-06-24 07:25:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612127_10029088768.pth [2024-06-24 07:25:46,481][15401] Updated weights for policy 0, policy_version 612761 (0.0036) [2024-06-24 07:25:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 10039541760. Throughput: 0: 42204.4. Samples: 10039706600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:25:48,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 07:25:50,322][15401] Updated weights for policy 0, policy_version 612771 (0.0028) [2024-06-24 07:25:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 10039754752. Throughput: 0: 42454.7. Samples: 10039840380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:25:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 07:25:54,127][15401] Updated weights for policy 0, policy_version 612781 (0.0032) [2024-06-24 07:25:58,377][15401] Updated weights for policy 0, policy_version 612791 (0.0038) [2024-06-24 07:25:58,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10039967744. Throughput: 0: 42342.7. Samples: 10040091960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:25:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 07:26:01,922][15401] Updated weights for policy 0, policy_version 612801 (0.0023) [2024-06-24 07:26:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10040197120. Throughput: 0: 42321.8. Samples: 10040345100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 07:26:06,130][15401] Updated weights for policy 0, policy_version 612811 (0.0031) [2024-06-24 07:26:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10040410112. Throughput: 0: 42526.3. Samples: 10040481040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 07:26:09,435][15401] Updated weights for policy 0, policy_version 612821 (0.0029) [2024-06-24 07:26:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42053.9, 300 sec: 42654.3). Total num frames: 10040590336. Throughput: 0: 42437.4. Samples: 10040734140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 07:26:14,086][15401] Updated weights for policy 0, policy_version 612831 (0.0031) [2024-06-24 07:26:14,349][15349] Signal inference workers to stop experience collection... (148800 times) [2024-06-24 07:26:14,356][15349] Signal inference workers to resume experience collection... (148800 times) [2024-06-24 07:26:14,400][15401] InferenceWorker_p0-w0: stopping experience collection (148800 times) [2024-06-24 07:26:14,400][15401] InferenceWorker_p0-w0: resuming experience collection (148800 times) [2024-06-24 07:26:16,977][15401] Updated weights for policy 0, policy_version 612841 (0.0048) [2024-06-24 07:26:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10040852480. Throughput: 0: 42370.2. Samples: 10040983980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:18,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-24 07:26:21,759][15401] Updated weights for policy 0, policy_version 612851 (0.0035) [2024-06-24 07:26:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42710.1). Total num frames: 10041032704. Throughput: 0: 42519.0. Samples: 10041121640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:23,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 07:26:24,558][15401] Updated weights for policy 0, policy_version 612861 (0.0034) [2024-06-24 07:26:28,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10041229312. Throughput: 0: 42408.5. Samples: 10041368780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:28,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-24 07:26:29,307][15401] Updated weights for policy 0, policy_version 612871 (0.0030) [2024-06-24 07:26:32,348][15401] Updated weights for policy 0, policy_version 612881 (0.0040) [2024-06-24 07:26:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10041475072. Throughput: 0: 42574.7. Samples: 10041622360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 07:26:37,017][15401] Updated weights for policy 0, policy_version 612891 (0.0030) [2024-06-24 07:26:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10041671680. Throughput: 0: 42554.6. Samples: 10041755340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 07:26:40,490][15401] Updated weights for policy 0, policy_version 612901 (0.0050) [2024-06-24 07:26:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 10041884672. Throughput: 0: 42485.2. Samples: 10042003800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 07:26:44,549][15401] Updated weights for policy 0, policy_version 612911 (0.0030) [2024-06-24 07:26:48,039][15401] Updated weights for policy 0, policy_version 612921 (0.0028) [2024-06-24 07:26:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42598.7). Total num frames: 10042097664. Throughput: 0: 42673.7. Samples: 10042265420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 07:26:52,258][15401] Updated weights for policy 0, policy_version 612931 (0.0034) [2024-06-24 07:26:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10042310656. Throughput: 0: 42610.3. Samples: 10042398500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 07:26:55,843][15401] Updated weights for policy 0, policy_version 612941 (0.0028) [2024-06-24 07:26:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10042507264. Throughput: 0: 42351.9. Samples: 10042639980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:26:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 07:27:00,184][15401] Updated weights for policy 0, policy_version 612951 (0.0032) [2024-06-24 07:27:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.6). Total num frames: 10042736640. Throughput: 0: 42615.2. Samples: 10042901660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:27:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 07:27:03,424][15401] Updated weights for policy 0, policy_version 612961 (0.0029) [2024-06-24 07:27:07,782][15401] Updated weights for policy 0, policy_version 612971 (0.0039) [2024-06-24 07:27:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10042949632. Throughput: 0: 42438.2. Samples: 10043031360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:27:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 07:27:11,516][15401] Updated weights for policy 0, policy_version 612981 (0.0028) [2024-06-24 07:27:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10043162624. Throughput: 0: 42426.6. Samples: 10043277980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:27:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 07:27:15,286][15401] Updated weights for policy 0, policy_version 612991 (0.0037) [2024-06-24 07:27:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42543.8). Total num frames: 10043359232. Throughput: 0: 42534.7. Samples: 10043536420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 07:27:18,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 07:27:19,168][15401] Updated weights for policy 0, policy_version 613001 (0.0022) [2024-06-24 07:27:21,974][15349] Signal inference workers to stop experience collection... (148850 times) [2024-06-24 07:27:21,975][15349] Signal inference workers to resume experience collection... (148850 times) [2024-06-24 07:27:22,006][15401] InferenceWorker_p0-w0: stopping experience collection (148850 times) [2024-06-24 07:27:22,007][15401] InferenceWorker_p0-w0: resuming experience collection (148850 times) [2024-06-24 07:27:22,900][15401] Updated weights for policy 0, policy_version 613011 (0.0033) [2024-06-24 07:27:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10043572224. Throughput: 0: 42389.9. Samples: 10043662880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 07:27:26,599][15401] Updated weights for policy 0, policy_version 613021 (0.0032) [2024-06-24 07:27:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10043801600. Throughput: 0: 42585.4. Samples: 10043920140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 07:27:30,557][15401] Updated weights for policy 0, policy_version 613031 (0.0027) [2024-06-24 07:27:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10044014592. Throughput: 0: 42621.4. Samples: 10044183380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 07:27:34,309][15401] Updated weights for policy 0, policy_version 613041 (0.0035) [2024-06-24 07:27:38,034][15401] Updated weights for policy 0, policy_version 613051 (0.0023) [2024-06-24 07:27:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10044227584. Throughput: 0: 42602.1. Samples: 10044315600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 07:27:41,747][15401] Updated weights for policy 0, policy_version 613061 (0.0034) [2024-06-24 07:27:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42709.6). Total num frames: 10044456960. Throughput: 0: 42829.4. Samples: 10044567300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 07:27:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613065_10044456960.pth... [2024-06-24 07:27:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612440_10034216960.pth [2024-06-24 07:27:45,599][15401] Updated weights for policy 0, policy_version 613071 (0.0044) [2024-06-24 07:27:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10044637184. Throughput: 0: 42943.5. Samples: 10044834120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 07:27:49,242][15401] Updated weights for policy 0, policy_version 613081 (0.0030) [2024-06-24 07:27:53,219][15401] Updated weights for policy 0, policy_version 613091 (0.0029) [2024-06-24 07:27:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 10044882944. Throughput: 0: 42820.4. Samples: 10044958280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 07:27:56,995][15401] Updated weights for policy 0, policy_version 613101 (0.0040) [2024-06-24 07:27:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10045112320. Throughput: 0: 43080.0. Samples: 10045216580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:27:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 07:28:01,282][15401] Updated weights for policy 0, policy_version 613111 (0.0038) [2024-06-24 07:28:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10045292544. Throughput: 0: 43195.1. Samples: 10045480200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:03,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 07:28:04,536][15401] Updated weights for policy 0, policy_version 613121 (0.0031) [2024-06-24 07:28:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10045521920. Throughput: 0: 43161.6. Samples: 10045605260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:08,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 07:28:08,637][15401] Updated weights for policy 0, policy_version 613131 (0.0035) [2024-06-24 07:28:11,880][15401] Updated weights for policy 0, policy_version 613141 (0.0036) [2024-06-24 07:28:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 10045751296. Throughput: 0: 43234.3. Samples: 10045865680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:13,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-24 07:28:16,048][15401] Updated weights for policy 0, policy_version 613151 (0.0031) [2024-06-24 07:28:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10045931520. Throughput: 0: 43297.3. Samples: 10046131760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 07:28:19,443][15401] Updated weights for policy 0, policy_version 613161 (0.0035) [2024-06-24 07:28:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10046160896. Throughput: 0: 43076.5. Samples: 10046254040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 07:28:23,562][15401] Updated weights for policy 0, policy_version 613171 (0.0027) [2024-06-24 07:28:26,558][15349] Signal inference workers to stop experience collection... (148900 times) [2024-06-24 07:28:26,559][15349] Signal inference workers to resume experience collection... (148900 times) [2024-06-24 07:28:26,579][15401] InferenceWorker_p0-w0: stopping experience collection (148900 times) [2024-06-24 07:28:26,580][15401] InferenceWorker_p0-w0: resuming experience collection (148900 times) [2024-06-24 07:28:27,273][15401] Updated weights for policy 0, policy_version 613181 (0.0033) [2024-06-24 07:28:28,396][15132] Fps is (10 sec: 47483.0, 60 sec: 43413.0, 300 sec: 42653.0). Total num frames: 10046406656. Throughput: 0: 43187.6. Samples: 10046511020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:28,397][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 07:28:31,218][15401] Updated weights for policy 0, policy_version 613191 (0.0039) [2024-06-24 07:28:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10046586880. Throughput: 0: 42945.3. Samples: 10046766660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:33,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 07:28:35,138][15401] Updated weights for policy 0, policy_version 613201 (0.0042) [2024-06-24 07:28:38,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 10046799872. Throughput: 0: 42928.0. Samples: 10046890040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 07:28:38,836][15401] Updated weights for policy 0, policy_version 613211 (0.0027) [2024-06-24 07:28:42,748][15401] Updated weights for policy 0, policy_version 613221 (0.0029) [2024-06-24 07:28:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10047062016. Throughput: 0: 42995.7. Samples: 10047151380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 07:28:46,707][15401] Updated weights for policy 0, policy_version 613231 (0.0028) [2024-06-24 07:28:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10047242240. Throughput: 0: 42771.5. Samples: 10047404920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 07:28:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 07:28:50,486][15401] Updated weights for policy 0, policy_version 613241 (0.0045) [2024-06-24 07:28:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10047438848. Throughput: 0: 42690.9. Samples: 10047526240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:28:53,390][15132] Avg episode reward: [(0, '0.875')] [2024-06-24 07:28:54,378][15401] Updated weights for policy 0, policy_version 613251 (0.0034) [2024-06-24 07:28:58,259][15401] Updated weights for policy 0, policy_version 613261 (0.0025) [2024-06-24 07:28:58,391][15132] Fps is (10 sec: 42592.0, 60 sec: 42597.3, 300 sec: 42654.1). Total num frames: 10047668224. Throughput: 0: 42823.8. Samples: 10047792820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:28:58,392][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 07:29:01,948][15401] Updated weights for policy 0, policy_version 613271 (0.0029) [2024-06-24 07:29:03,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10047897600. Throughput: 0: 42586.2. Samples: 10048048140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 07:29:05,809][15401] Updated weights for policy 0, policy_version 613281 (0.0036) [2024-06-24 07:29:08,390][15132] Fps is (10 sec: 42601.3, 60 sec: 42872.6, 300 sec: 42709.4). Total num frames: 10048094208. Throughput: 0: 42756.0. Samples: 10048178100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:08,391][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 07:29:09,743][15401] Updated weights for policy 0, policy_version 613291 (0.0034) [2024-06-24 07:29:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10048307200. Throughput: 0: 42862.6. Samples: 10048439560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:13,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 07:29:13,432][15401] Updated weights for policy 0, policy_version 613301 (0.0025) [2024-06-24 07:29:17,311][15401] Updated weights for policy 0, policy_version 613311 (0.0037) [2024-06-24 07:29:18,389][15132] Fps is (10 sec: 44240.9, 60 sec: 43417.6, 300 sec: 42765.2). Total num frames: 10048536576. Throughput: 0: 42794.7. Samples: 10048692420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 07:29:21,048][15401] Updated weights for policy 0, policy_version 613321 (0.0038) [2024-06-24 07:29:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10048733184. Throughput: 0: 42956.9. Samples: 10048823100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 07:29:24,860][15401] Updated weights for policy 0, policy_version 613331 (0.0036) [2024-06-24 07:29:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42330.0, 300 sec: 42654.3). Total num frames: 10048946176. Throughput: 0: 42855.2. Samples: 10049079860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 07:29:28,781][15401] Updated weights for policy 0, policy_version 613341 (0.0037) [2024-06-24 07:29:32,477][15401] Updated weights for policy 0, policy_version 613351 (0.0038) [2024-06-24 07:29:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10049175552. Throughput: 0: 42891.7. Samples: 10049335040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 07:29:36,375][15401] Updated weights for policy 0, policy_version 613361 (0.0035) [2024-06-24 07:29:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42654.5). Total num frames: 10049388544. Throughput: 0: 42997.3. Samples: 10049461120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 07:29:40,073][15401] Updated weights for policy 0, policy_version 613371 (0.0031) [2024-06-24 07:29:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10049601536. Throughput: 0: 42861.0. Samples: 10049721500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 07:29:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613380_10049617920.pth... [2024-06-24 07:29:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000612752_10039328768.pth [2024-06-24 07:29:43,836][15401] Updated weights for policy 0, policy_version 613381 (0.0032) [2024-06-24 07:29:47,977][15401] Updated weights for policy 0, policy_version 613391 (0.0039) [2024-06-24 07:29:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 10049798144. Throughput: 0: 42955.1. Samples: 10049981120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:29:51,367][15401] Updated weights for policy 0, policy_version 613401 (0.0032) [2024-06-24 07:29:52,853][15349] Signal inference workers to stop experience collection... (148950 times) [2024-06-24 07:29:52,908][15401] InferenceWorker_p0-w0: stopping experience collection (148950 times) [2024-06-24 07:29:52,912][15349] Signal inference workers to resume experience collection... (148950 times) [2024-06-24 07:29:52,924][15401] InferenceWorker_p0-w0: resuming experience collection (148950 times) [2024-06-24 07:29:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10050043904. Throughput: 0: 42867.4. Samples: 10050107100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 07:29:55,639][15401] Updated weights for policy 0, policy_version 613411 (0.0022) [2024-06-24 07:29:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43145.7, 300 sec: 42765.0). Total num frames: 10050256896. Throughput: 0: 42861.8. Samples: 10050368340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:29:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 07:29:58,901][15401] Updated weights for policy 0, policy_version 613421 (0.0041) [2024-06-24 07:30:03,188][15401] Updated weights for policy 0, policy_version 613431 (0.0037) [2024-06-24 07:30:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10050453504. Throughput: 0: 43017.3. Samples: 10050628200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:30:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 07:30:06,625][15401] Updated weights for policy 0, policy_version 613441 (0.0036) [2024-06-24 07:30:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43145.1, 300 sec: 42765.3). Total num frames: 10050682880. Throughput: 0: 42824.2. Samples: 10050750200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:30:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 07:30:11,231][15401] Updated weights for policy 0, policy_version 613451 (0.0036) [2024-06-24 07:30:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10050895872. Throughput: 0: 42938.0. Samples: 10051012080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:30:13,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-24 07:30:14,528][15401] Updated weights for policy 0, policy_version 613461 (0.0032) [2024-06-24 07:30:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10051092480. Throughput: 0: 42888.9. Samples: 10051265040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:30:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 07:30:18,671][15401] Updated weights for policy 0, policy_version 613471 (0.0038) [2024-06-24 07:30:22,050][15401] Updated weights for policy 0, policy_version 613481 (0.0034) [2024-06-24 07:30:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10051338240. Throughput: 0: 42936.4. Samples: 10051393260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 07:30:23,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 07:30:26,180][15401] Updated weights for policy 0, policy_version 613491 (0.0041) [2024-06-24 07:30:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10051534848. Throughput: 0: 42932.1. Samples: 10051653440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 07:30:29,680][15401] Updated weights for policy 0, policy_version 613501 (0.0038) [2024-06-24 07:30:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 10051747840. Throughput: 0: 42874.6. Samples: 10051910580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:33,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 07:30:33,903][15401] Updated weights for policy 0, policy_version 613511 (0.0035) [2024-06-24 07:30:37,412][15401] Updated weights for policy 0, policy_version 613521 (0.0024) [2024-06-24 07:30:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10051977216. Throughput: 0: 42842.4. Samples: 10052035000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 07:30:41,609][15401] Updated weights for policy 0, policy_version 613531 (0.0040) [2024-06-24 07:30:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 10052173824. Throughput: 0: 42747.5. Samples: 10052291980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 07:30:45,170][15401] Updated weights for policy 0, policy_version 613541 (0.0036) [2024-06-24 07:30:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10052370432. Throughput: 0: 42687.4. Samples: 10052549140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 07:30:49,346][15401] Updated weights for policy 0, policy_version 613551 (0.0034) [2024-06-24 07:30:52,823][15401] Updated weights for policy 0, policy_version 613561 (0.0027) [2024-06-24 07:30:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10052599808. Throughput: 0: 42758.4. Samples: 10052674320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 07:30:56,815][15401] Updated weights for policy 0, policy_version 613571 (0.0031) [2024-06-24 07:30:58,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10052829184. Throughput: 0: 42821.0. Samples: 10052939020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:30:58,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 07:31:00,490][15401] Updated weights for policy 0, policy_version 613581 (0.0029) [2024-06-24 07:31:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10053025792. Throughput: 0: 42847.4. Samples: 10053193180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 07:31:04,592][15401] Updated weights for policy 0, policy_version 613591 (0.0033) [2024-06-24 07:31:08,141][15401] Updated weights for policy 0, policy_version 613601 (0.0043) [2024-06-24 07:31:08,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10053238784. Throughput: 0: 42762.2. Samples: 10053317560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 07:31:12,544][15401] Updated weights for policy 0, policy_version 613611 (0.0042) [2024-06-24 07:31:13,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 10053435392. Throughput: 0: 42820.1. Samples: 10053580340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 07:31:15,648][15401] Updated weights for policy 0, policy_version 613621 (0.0025) [2024-06-24 07:31:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 10053664768. Throughput: 0: 42706.7. Samples: 10053832380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:18,393][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 07:31:19,898][15401] Updated weights for policy 0, policy_version 613631 (0.0033) [2024-06-24 07:31:23,201][15401] Updated weights for policy 0, policy_version 613641 (0.0034) [2024-06-24 07:31:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10053894144. Throughput: 0: 42913.7. Samples: 10053966120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 07:31:27,607][15401] Updated weights for policy 0, policy_version 613651 (0.0036) [2024-06-24 07:31:27,995][15349] Signal inference workers to stop experience collection... (149000 times) [2024-06-24 07:31:27,995][15349] Signal inference workers to resume experience collection... (149000 times) [2024-06-24 07:31:28,019][15401] InferenceWorker_p0-w0: stopping experience collection (149000 times) [2024-06-24 07:31:28,019][15401] InferenceWorker_p0-w0: resuming experience collection (149000 times) [2024-06-24 07:31:28,396][15132] Fps is (10 sec: 42581.4, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 10054090752. Throughput: 0: 43019.2. Samples: 10054228120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:28,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 07:31:30,758][15401] Updated weights for policy 0, policy_version 613661 (0.0040) [2024-06-24 07:31:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 10054303744. Throughput: 0: 42999.1. Samples: 10054484100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:33,394][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 07:31:35,220][15401] Updated weights for policy 0, policy_version 613671 (0.0033) [2024-06-24 07:31:38,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10054533120. Throughput: 0: 43105.4. Samples: 10054614060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 07:31:38,503][15401] Updated weights for policy 0, policy_version 613681 (0.0047) [2024-06-24 07:31:42,908][15401] Updated weights for policy 0, policy_version 613691 (0.0035) [2024-06-24 07:31:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10054729728. Throughput: 0: 42872.8. Samples: 10054868300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 07:31:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613692_10054729728.pth... [2024-06-24 07:31:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613065_10044456960.pth [2024-06-24 07:31:46,153][15401] Updated weights for policy 0, policy_version 613701 (0.0040) [2024-06-24 07:31:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10054942720. Throughput: 0: 42986.5. Samples: 10055127560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 07:31:50,514][15401] Updated weights for policy 0, policy_version 613711 (0.0034) [2024-06-24 07:31:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 10055188480. Throughput: 0: 43033.8. Samples: 10055254080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:53,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 07:31:53,694][15401] Updated weights for policy 0, policy_version 613721 (0.0026) [2024-06-24 07:31:58,281][15401] Updated weights for policy 0, policy_version 613731 (0.0035) [2024-06-24 07:31:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10055368704. Throughput: 0: 42963.5. Samples: 10055513700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 07:31:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 07:32:01,278][15401] Updated weights for policy 0, policy_version 613741 (0.0041) [2024-06-24 07:32:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10055598080. Throughput: 0: 42940.9. Samples: 10055764620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:03,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 07:32:05,840][15401] Updated weights for policy 0, policy_version 613751 (0.0031) [2024-06-24 07:32:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 10055843840. Throughput: 0: 42918.6. Samples: 10055897460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:08,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 07:32:08,847][15401] Updated weights for policy 0, policy_version 613761 (0.0031) [2024-06-24 07:32:13,341][15401] Updated weights for policy 0, policy_version 613771 (0.0031) [2024-06-24 07:32:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10056024064. Throughput: 0: 42794.5. Samples: 10056153600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 07:32:16,582][15401] Updated weights for policy 0, policy_version 613781 (0.0036) [2024-06-24 07:32:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 10056237056. Throughput: 0: 42851.1. Samples: 10056412400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 07:32:21,110][15401] Updated weights for policy 0, policy_version 613791 (0.0026) [2024-06-24 07:32:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 10056482816. Throughput: 0: 42911.5. Samples: 10056545080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 07:32:24,198][15401] Updated weights for policy 0, policy_version 613801 (0.0045) [2024-06-24 07:32:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 10056646656. Throughput: 0: 42713.0. Samples: 10056790380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 07:32:29,072][15401] Updated weights for policy 0, policy_version 613811 (0.0040) [2024-06-24 07:32:31,861][15401] Updated weights for policy 0, policy_version 613821 (0.0050) [2024-06-24 07:32:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10056892416. Throughput: 0: 42600.7. Samples: 10057044600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 07:32:36,566][15401] Updated weights for policy 0, policy_version 613831 (0.0024) [2024-06-24 07:32:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10057089024. Throughput: 0: 42851.5. Samples: 10057182400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 07:32:39,601][15401] Updated weights for policy 0, policy_version 613841 (0.0028) [2024-06-24 07:32:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10057285632. Throughput: 0: 42704.4. Samples: 10057435400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 07:32:44,125][15401] Updated weights for policy 0, policy_version 613851 (0.0044) [2024-06-24 07:32:47,133][15401] Updated weights for policy 0, policy_version 613861 (0.0039) [2024-06-24 07:32:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 10057547776. Throughput: 0: 42729.3. Samples: 10057687440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 07:32:51,628][15401] Updated weights for policy 0, policy_version 613871 (0.0042) [2024-06-24 07:32:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10057728000. Throughput: 0: 42722.3. Samples: 10057819960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:53,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-24 07:32:54,709][15401] Updated weights for policy 0, policy_version 613881 (0.0042) [2024-06-24 07:32:58,272][15349] Signal inference workers to stop experience collection... (149050 times) [2024-06-24 07:32:58,321][15401] InferenceWorker_p0-w0: stopping experience collection (149050 times) [2024-06-24 07:32:58,328][15349] Signal inference workers to resume experience collection... (149050 times) [2024-06-24 07:32:58,336][15401] InferenceWorker_p0-w0: resuming experience collection (149050 times) [2024-06-24 07:32:58,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10057940992. Throughput: 0: 42704.6. Samples: 10058075300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:32:58,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 07:32:59,711][15401] Updated weights for policy 0, policy_version 613891 (0.0038) [2024-06-24 07:33:02,955][15401] Updated weights for policy 0, policy_version 613901 (0.0023) [2024-06-24 07:33:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 10058170368. Throughput: 0: 42579.3. Samples: 10058328460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 07:33:07,239][15401] Updated weights for policy 0, policy_version 613911 (0.0029) [2024-06-24 07:33:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10058383360. Throughput: 0: 42534.3. Samples: 10058459120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 07:33:10,527][15401] Updated weights for policy 0, policy_version 613921 (0.0027) [2024-06-24 07:33:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10058596352. Throughput: 0: 42845.2. Samples: 10058718420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 07:33:14,712][15401] Updated weights for policy 0, policy_version 613931 (0.0043) [2024-06-24 07:33:18,125][15401] Updated weights for policy 0, policy_version 613941 (0.0038) [2024-06-24 07:33:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10058809344. Throughput: 0: 42714.6. Samples: 10058966760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 07:33:22,360][15401] Updated weights for policy 0, policy_version 613951 (0.0039) [2024-06-24 07:33:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42654.9). Total num frames: 10058989568. Throughput: 0: 42546.4. Samples: 10059096980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 07:33:25,876][15401] Updated weights for policy 0, policy_version 613961 (0.0034) [2024-06-24 07:33:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10059218944. Throughput: 0: 42481.2. Samples: 10059347060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 07:33:29,963][15401] Updated weights for policy 0, policy_version 613971 (0.0040) [2024-06-24 07:33:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10059448320. Throughput: 0: 42475.2. Samples: 10059598820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:33:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 07:33:33,423][15401] Updated weights for policy 0, policy_version 613981 (0.0043) [2024-06-24 07:33:37,492][15401] Updated weights for policy 0, policy_version 613991 (0.0037) [2024-06-24 07:33:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10059644928. Throughput: 0: 42473.3. Samples: 10059731360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:33:38,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 07:33:41,634][15401] Updated weights for policy 0, policy_version 614001 (0.0035) [2024-06-24 07:33:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10059857920. Throughput: 0: 42415.4. Samples: 10059984000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:33:43,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-24 07:33:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614005_10059857920.pth... [2024-06-24 07:33:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613380_10049617920.pth [2024-06-24 07:33:44,935][15401] Updated weights for policy 0, policy_version 614011 (0.0037) [2024-06-24 07:33:48,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10060103680. Throughput: 0: 42445.2. Samples: 10060238500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:33:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 07:33:49,154][15401] Updated weights for policy 0, policy_version 614021 (0.0033) [2024-06-24 07:33:52,517][15401] Updated weights for policy 0, policy_version 614031 (0.0034) [2024-06-24 07:33:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42764.9). Total num frames: 10060283904. Throughput: 0: 42559.4. Samples: 10060374400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:33:53,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 07:33:56,612][15401] Updated weights for policy 0, policy_version 614041 (0.0034) [2024-06-24 07:33:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10060513280. Throughput: 0: 42356.1. Samples: 10060624440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:33:58,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 07:34:00,709][15401] Updated weights for policy 0, policy_version 614051 (0.0049) [2024-06-24 07:34:03,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42323.6, 300 sec: 42764.8). Total num frames: 10060709888. Throughput: 0: 42653.0. Samples: 10060886240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:03,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 07:34:04,241][15401] Updated weights for policy 0, policy_version 614061 (0.0037) [2024-06-24 07:34:08,270][15401] Updated weights for policy 0, policy_version 614071 (0.0031) [2024-06-24 07:34:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10060939264. Throughput: 0: 42590.5. Samples: 10061013560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 07:34:11,899][15401] Updated weights for policy 0, policy_version 614081 (0.0042) [2024-06-24 07:34:13,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10061168640. Throughput: 0: 42692.1. Samples: 10061268200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:13,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 07:34:15,855][15401] Updated weights for policy 0, policy_version 614091 (0.0041) [2024-06-24 07:34:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 10061332480. Throughput: 0: 42961.4. Samples: 10061532080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 07:34:19,708][15401] Updated weights for policy 0, policy_version 614101 (0.0029) [2024-06-24 07:34:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10061578240. Throughput: 0: 42697.4. Samples: 10061652640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:23,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 07:34:23,670][15401] Updated weights for policy 0, policy_version 614111 (0.0030) [2024-06-24 07:34:27,565][15401] Updated weights for policy 0, policy_version 614121 (0.0039) [2024-06-24 07:34:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10061791232. Throughput: 0: 42841.0. Samples: 10061911840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 07:34:28,444][15349] Signal inference workers to stop experience collection... (149100 times) [2024-06-24 07:34:28,468][15401] InferenceWorker_p0-w0: stopping experience collection (149100 times) [2024-06-24 07:34:28,505][15349] Signal inference workers to resume experience collection... (149100 times) [2024-06-24 07:34:28,506][15401] InferenceWorker_p0-w0: resuming experience collection (149100 times) [2024-06-24 07:34:31,271][15401] Updated weights for policy 0, policy_version 614131 (0.0027) [2024-06-24 07:34:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10061987840. Throughput: 0: 42945.0. Samples: 10062171020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 07:34:35,201][15401] Updated weights for policy 0, policy_version 614141 (0.0038) [2024-06-24 07:34:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10062200832. Throughput: 0: 42587.3. Samples: 10062290720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 07:34:38,955][15401] Updated weights for policy 0, policy_version 614151 (0.0049) [2024-06-24 07:34:42,927][15401] Updated weights for policy 0, policy_version 614161 (0.0038) [2024-06-24 07:34:43,394][15132] Fps is (10 sec: 42579.2, 60 sec: 42595.3, 300 sec: 42764.4). Total num frames: 10062413824. Throughput: 0: 42607.3. Samples: 10062541960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:43,394][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 07:34:47,199][15401] Updated weights for policy 0, policy_version 614171 (0.0037) [2024-06-24 07:34:48,390][15132] Fps is (10 sec: 40958.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 10062610432. Throughput: 0: 42458.6. Samples: 10062796780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 07:34:50,542][15401] Updated weights for policy 0, policy_version 614181 (0.0031) [2024-06-24 07:34:53,390][15132] Fps is (10 sec: 44255.9, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 10062856192. Throughput: 0: 42436.8. Samples: 10062923220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 07:34:54,921][15401] Updated weights for policy 0, policy_version 614191 (0.0038) [2024-06-24 07:34:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10063052800. Throughput: 0: 42476.5. Samples: 10063179640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:34:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 07:34:58,541][15401] Updated weights for policy 0, policy_version 614201 (0.0036) [2024-06-24 07:35:02,555][15401] Updated weights for policy 0, policy_version 614211 (0.0038) [2024-06-24 07:35:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 10063265792. Throughput: 0: 42276.7. Samples: 10063434540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:35:03,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 07:35:06,071][15401] Updated weights for policy 0, policy_version 614221 (0.0026) [2024-06-24 07:35:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10063478784. Throughput: 0: 42316.3. Samples: 10063556880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:35:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 07:35:10,165][15401] Updated weights for policy 0, policy_version 614231 (0.0031) [2024-06-24 07:35:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10063708160. Throughput: 0: 42332.0. Samples: 10063816780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:13,396][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 07:35:13,823][15401] Updated weights for policy 0, policy_version 614241 (0.0031) [2024-06-24 07:35:18,090][15401] Updated weights for policy 0, policy_version 614251 (0.0029) [2024-06-24 07:35:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10063888384. Throughput: 0: 42318.2. Samples: 10064075340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 07:35:21,642][15401] Updated weights for policy 0, policy_version 614261 (0.0037) [2024-06-24 07:35:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10064117760. Throughput: 0: 42301.5. Samples: 10064194300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:23,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 07:35:25,785][15401] Updated weights for policy 0, policy_version 614271 (0.0037) [2024-06-24 07:35:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10064347136. Throughput: 0: 42522.0. Samples: 10064455260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 07:35:29,165][15401] Updated weights for policy 0, policy_version 614281 (0.0044) [2024-06-24 07:35:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10064527360. Throughput: 0: 42610.4. Samples: 10064714240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:33,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 07:35:33,478][15401] Updated weights for policy 0, policy_version 614291 (0.0031) [2024-06-24 07:35:36,678][15401] Updated weights for policy 0, policy_version 614301 (0.0031) [2024-06-24 07:35:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10064756736. Throughput: 0: 42523.7. Samples: 10064836780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 07:35:41,026][15401] Updated weights for policy 0, policy_version 614311 (0.0030) [2024-06-24 07:35:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42874.7, 300 sec: 42765.0). Total num frames: 10064986112. Throughput: 0: 42643.5. Samples: 10065098600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:43,398][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 07:35:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614318_10064986112.pth... [2024-06-24 07:35:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000613692_10054729728.pth [2024-06-24 07:35:44,359][15401] Updated weights for policy 0, policy_version 614321 (0.0036) [2024-06-24 07:35:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10065166336. Throughput: 0: 42790.7. Samples: 10065360120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 07:35:48,659][15401] Updated weights for policy 0, policy_version 614331 (0.0037) [2024-06-24 07:35:52,011][15401] Updated weights for policy 0, policy_version 614341 (0.0037) [2024-06-24 07:35:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10065412096. Throughput: 0: 42706.8. Samples: 10065478680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 07:35:56,180][15401] Updated weights for policy 0, policy_version 614351 (0.0023) [2024-06-24 07:35:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10065608704. Throughput: 0: 42813.6. Samples: 10065743400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:35:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 07:35:59,782][15401] Updated weights for policy 0, policy_version 614361 (0.0038) [2024-06-24 07:36:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10065805312. Throughput: 0: 42699.0. Samples: 10065996800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 07:36:04,066][15401] Updated weights for policy 0, policy_version 614371 (0.0026) [2024-06-24 07:36:05,361][15349] Signal inference workers to stop experience collection... (149150 times) [2024-06-24 07:36:05,361][15349] Signal inference workers to resume experience collection... (149150 times) [2024-06-24 07:36:05,392][15401] InferenceWorker_p0-w0: stopping experience collection (149150 times) [2024-06-24 07:36:05,392][15401] InferenceWorker_p0-w0: resuming experience collection (149150 times) [2024-06-24 07:36:07,439][15401] Updated weights for policy 0, policy_version 614381 (0.0031) [2024-06-24 07:36:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10066034688. Throughput: 0: 42796.2. Samples: 10066120120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 07:36:11,786][15401] Updated weights for policy 0, policy_version 614391 (0.0032) [2024-06-24 07:36:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 10066247680. Throughput: 0: 42813.8. Samples: 10066381880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 07:36:15,378][15401] Updated weights for policy 0, policy_version 614401 (0.0028) [2024-06-24 07:36:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10066460672. Throughput: 0: 42687.2. Samples: 10066635160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 07:36:19,342][15401] Updated weights for policy 0, policy_version 614411 (0.0030) [2024-06-24 07:36:22,847][15401] Updated weights for policy 0, policy_version 614421 (0.0037) [2024-06-24 07:36:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 10066690048. Throughput: 0: 42949.0. Samples: 10066769480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 07:36:26,855][15401] Updated weights for policy 0, policy_version 614431 (0.0027) [2024-06-24 07:36:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 10066870272. Throughput: 0: 42884.1. Samples: 10067028380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 07:36:30,292][15401] Updated weights for policy 0, policy_version 614441 (0.0044) [2024-06-24 07:36:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10067116032. Throughput: 0: 42781.0. Samples: 10067285260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 07:36:34,702][15401] Updated weights for policy 0, policy_version 614451 (0.0038) [2024-06-24 07:36:37,797][15401] Updated weights for policy 0, policy_version 614461 (0.0033) [2024-06-24 07:36:38,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10067329024. Throughput: 0: 43079.1. Samples: 10067417240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 07:36:42,262][15401] Updated weights for policy 0, policy_version 614471 (0.0026) [2024-06-24 07:36:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10067525632. Throughput: 0: 42933.5. Samples: 10067675400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-24 07:36:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:36:45,720][15401] Updated weights for policy 0, policy_version 614481 (0.0039) [2024-06-24 07:36:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 10067771392. Throughput: 0: 42737.0. Samples: 10067919960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:36:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 07:36:49,933][15401] Updated weights for policy 0, policy_version 614491 (0.0054) [2024-06-24 07:36:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10067968000. Throughput: 0: 43137.3. Samples: 10068061300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:36:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 07:36:53,449][15401] Updated weights for policy 0, policy_version 614501 (0.0039) [2024-06-24 07:36:57,715][15401] Updated weights for policy 0, policy_version 614511 (0.0040) [2024-06-24 07:36:58,390][15132] Fps is (10 sec: 40958.1, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 10068180992. Throughput: 0: 42935.5. Samples: 10068314000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:36:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:37:01,120][15401] Updated weights for policy 0, policy_version 614521 (0.0029) [2024-06-24 07:37:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 10068393984. Throughput: 0: 42865.6. Samples: 10068564220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:03,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 07:37:05,358][15401] Updated weights for policy 0, policy_version 614531 (0.0043) [2024-06-24 07:37:08,390][15132] Fps is (10 sec: 44238.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10068623360. Throughput: 0: 42825.5. Samples: 10068696640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 07:37:08,577][15401] Updated weights for policy 0, policy_version 614541 (0.0043) [2024-06-24 07:37:13,012][15401] Updated weights for policy 0, policy_version 614551 (0.0040) [2024-06-24 07:37:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10068819968. Throughput: 0: 42851.0. Samples: 10068956680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 07:37:15,621][15349] Signal inference workers to stop experience collection... (149200 times) [2024-06-24 07:37:15,621][15349] Signal inference workers to resume experience collection... (149200 times) [2024-06-24 07:37:15,661][15401] InferenceWorker_p0-w0: stopping experience collection (149200 times) [2024-06-24 07:37:15,662][15401] InferenceWorker_p0-w0: resuming experience collection (149200 times) [2024-06-24 07:37:16,018][15401] Updated weights for policy 0, policy_version 614561 (0.0036) [2024-06-24 07:37:18,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10069032960. Throughput: 0: 42823.2. Samples: 10069212300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 07:37:20,477][15401] Updated weights for policy 0, policy_version 614571 (0.0041) [2024-06-24 07:37:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10069278720. Throughput: 0: 42791.0. Samples: 10069342840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:23,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 07:37:23,633][15401] Updated weights for policy 0, policy_version 614581 (0.0026) [2024-06-24 07:37:27,970][15401] Updated weights for policy 0, policy_version 614591 (0.0028) [2024-06-24 07:37:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10069458944. Throughput: 0: 42755.1. Samples: 10069599380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:28,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 07:37:31,390][15401] Updated weights for policy 0, policy_version 614601 (0.0047) [2024-06-24 07:37:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10069688320. Throughput: 0: 42928.4. Samples: 10069851740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 07:37:35,923][15401] Updated weights for policy 0, policy_version 614611 (0.0027) [2024-06-24 07:37:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10069917696. Throughput: 0: 42684.9. Samples: 10069982120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:38,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 07:37:38,916][15401] Updated weights for policy 0, policy_version 614621 (0.0052) [2024-06-24 07:37:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10070097920. Throughput: 0: 42756.0. Samples: 10070238000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:43,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 07:37:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614631_10070114304.pth... [2024-06-24 07:37:43,442][15401] Updated weights for policy 0, policy_version 614631 (0.0035) [2024-06-24 07:37:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614005_10059857920.pth [2024-06-24 07:37:46,727][15401] Updated weights for policy 0, policy_version 614641 (0.0026) [2024-06-24 07:37:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10070343680. Throughput: 0: 42878.3. Samples: 10070493640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 07:37:50,951][15401] Updated weights for policy 0, policy_version 614651 (0.0035) [2024-06-24 07:37:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10070556672. Throughput: 0: 42799.3. Samples: 10070622600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 07:37:54,206][15401] Updated weights for policy 0, policy_version 614661 (0.0032) [2024-06-24 07:37:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 10070753280. Throughput: 0: 42832.0. Samples: 10070884120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:37:58,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 07:37:58,537][15401] Updated weights for policy 0, policy_version 614671 (0.0034) [2024-06-24 07:38:01,636][15401] Updated weights for policy 0, policy_version 614681 (0.0031) [2024-06-24 07:38:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10070982656. Throughput: 0: 42710.6. Samples: 10071134280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:38:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 07:38:06,091][15401] Updated weights for policy 0, policy_version 614691 (0.0035) [2024-06-24 07:38:08,393][15132] Fps is (10 sec: 44219.5, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 10071195648. Throughput: 0: 42767.1. Samples: 10071267520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:38:08,394][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 07:38:09,550][15401] Updated weights for policy 0, policy_version 614701 (0.0043) [2024-06-24 07:38:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 10071408640. Throughput: 0: 42858.0. Samples: 10071528100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:38:13,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 07:38:13,939][15401] Updated weights for policy 0, policy_version 614711 (0.0042) [2024-06-24 07:38:17,289][15401] Updated weights for policy 0, policy_version 614721 (0.0031) [2024-06-24 07:38:18,390][15132] Fps is (10 sec: 42614.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10071621632. Throughput: 0: 42675.1. Samples: 10071772120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-24 07:38:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 07:38:21,512][15401] Updated weights for policy 0, policy_version 614731 (0.0038) [2024-06-24 07:38:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10071818240. Throughput: 0: 42727.0. Samples: 10071904840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 07:38:25,158][15401] Updated weights for policy 0, policy_version 614741 (0.0035) [2024-06-24 07:38:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10072031232. Throughput: 0: 42663.6. Samples: 10072157860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 07:38:29,665][15401] Updated weights for policy 0, policy_version 614751 (0.0028) [2024-06-24 07:38:32,721][15401] Updated weights for policy 0, policy_version 614761 (0.0032) [2024-06-24 07:38:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 10072260608. Throughput: 0: 42504.0. Samples: 10072406420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:33,392][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 07:38:37,307][15401] Updated weights for policy 0, policy_version 614771 (0.0027) [2024-06-24 07:38:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10072457216. Throughput: 0: 42574.0. Samples: 10072538440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 07:38:40,236][15401] Updated weights for policy 0, policy_version 614781 (0.0038) [2024-06-24 07:38:43,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10072670208. Throughput: 0: 42597.3. Samples: 10072801000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 07:38:44,883][15401] Updated weights for policy 0, policy_version 614791 (0.0032) [2024-06-24 07:38:45,374][15349] Signal inference workers to stop experience collection... (149250 times) [2024-06-24 07:38:45,376][15349] Signal inference workers to resume experience collection... (149250 times) [2024-06-24 07:38:45,425][15401] InferenceWorker_p0-w0: stopping experience collection (149250 times) [2024-06-24 07:38:45,426][15401] InferenceWorker_p0-w0: resuming experience collection (149250 times) [2024-06-24 07:38:47,760][15401] Updated weights for policy 0, policy_version 614801 (0.0035) [2024-06-24 07:38:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 10072915968. Throughput: 0: 42705.2. Samples: 10073056020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 07:38:52,461][15401] Updated weights for policy 0, policy_version 614811 (0.0033) [2024-06-24 07:38:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10073096192. Throughput: 0: 42759.6. Samples: 10073191540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:53,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 07:38:55,122][15401] Updated weights for policy 0, policy_version 614821 (0.0023) [2024-06-24 07:38:58,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10073341952. Throughput: 0: 42717.6. Samples: 10073450280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:38:58,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 07:39:00,355][15401] Updated weights for policy 0, policy_version 614831 (0.0034) [2024-06-24 07:39:02,609][15401] Updated weights for policy 0, policy_version 614841 (0.0023) [2024-06-24 07:39:03,392][15132] Fps is (10 sec: 47502.6, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 10073571328. Throughput: 0: 42813.8. Samples: 10073698840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:03,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 07:39:07,913][15401] Updated weights for policy 0, policy_version 614851 (0.0038) [2024-06-24 07:39:08,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42328.1, 300 sec: 42598.4). Total num frames: 10073735168. Throughput: 0: 42916.6. Samples: 10073836080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 07:39:10,154][15401] Updated weights for policy 0, policy_version 614861 (0.0029) [2024-06-24 07:39:13,390][15132] Fps is (10 sec: 39330.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 10073964544. Throughput: 0: 43044.7. Samples: 10074094880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 07:39:15,673][15401] Updated weights for policy 0, policy_version 614871 (0.0037) [2024-06-24 07:39:17,833][15401] Updated weights for policy 0, policy_version 614881 (0.0040) [2024-06-24 07:39:18,390][15132] Fps is (10 sec: 49151.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10074226688. Throughput: 0: 42876.0. Samples: 10074335740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 07:39:23,284][15401] Updated weights for policy 0, policy_version 614891 (0.0036) [2024-06-24 07:39:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10074374144. Throughput: 0: 42988.6. Samples: 10074472920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 07:39:25,680][15401] Updated weights for policy 0, policy_version 614901 (0.0040) [2024-06-24 07:39:28,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10074603520. Throughput: 0: 42857.0. Samples: 10074729560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 07:39:31,115][15401] Updated weights for policy 0, policy_version 614911 (0.0034) [2024-06-24 07:39:33,276][15401] Updated weights for policy 0, policy_version 614921 (0.0032) [2024-06-24 07:39:33,392][15132] Fps is (10 sec: 49140.4, 60 sec: 43417.7, 300 sec: 42931.3). Total num frames: 10074865664. Throughput: 0: 42648.1. Samples: 10074975280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:33,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 07:39:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42710.1). Total num frames: 10075013120. Throughput: 0: 42690.3. Samples: 10075112600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 07:39:38,836][15401] Updated weights for policy 0, policy_version 614931 (0.0036) [2024-06-24 07:39:40,054][15349] Signal inference workers to stop experience collection... (149300 times) [2024-06-24 07:39:40,057][15349] Signal inference workers to resume experience collection... (149300 times) [2024-06-24 07:39:40,073][15401] InferenceWorker_p0-w0: stopping experience collection (149300 times) [2024-06-24 07:39:40,073][15401] InferenceWorker_p0-w0: resuming experience collection (149300 times) [2024-06-24 07:39:41,026][15401] Updated weights for policy 0, policy_version 614941 (0.0024) [2024-06-24 07:39:43,390][15132] Fps is (10 sec: 36053.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10075226112. Throughput: 0: 42495.3. Samples: 10075362580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 07:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614943_10075226112.pth... [2024-06-24 07:39:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614318_10064986112.pth [2024-06-24 07:39:46,415][15401] Updated weights for policy 0, policy_version 614951 (0.0030) [2024-06-24 07:39:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 10075471872. Throughput: 0: 42583.7. Samples: 10075615000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 07:39:48,902][15401] Updated weights for policy 0, policy_version 614961 (0.0055) [2024-06-24 07:39:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10075652096. Throughput: 0: 42419.1. Samples: 10075744940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-24 07:39:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 07:39:54,067][15401] Updated weights for policy 0, policy_version 614971 (0.0029) [2024-06-24 07:39:56,807][15401] Updated weights for policy 0, policy_version 614981 (0.0040) [2024-06-24 07:39:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10075881472. Throughput: 0: 42134.0. Samples: 10075990900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:39:58,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 07:40:01,729][15401] Updated weights for policy 0, policy_version 614991 (0.0038) [2024-06-24 07:40:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 10076110848. Throughput: 0: 42568.5. Samples: 10076251320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:03,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 07:40:04,554][15401] Updated weights for policy 0, policy_version 615001 (0.0029) [2024-06-24 07:40:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10076291072. Throughput: 0: 42460.0. Samples: 10076383620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 07:40:09,217][15401] Updated weights for policy 0, policy_version 615011 (0.0043) [2024-06-24 07:40:12,365][15401] Updated weights for policy 0, policy_version 615021 (0.0044) [2024-06-24 07:40:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10076536832. Throughput: 0: 42363.1. Samples: 10076635900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 07:40:17,115][15401] Updated weights for policy 0, policy_version 615031 (0.0032) [2024-06-24 07:40:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 10076749824. Throughput: 0: 42525.3. Samples: 10076888820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 07:40:20,197][15401] Updated weights for policy 0, policy_version 615041 (0.0028) [2024-06-24 07:40:23,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10076913664. Throughput: 0: 42306.7. Samples: 10077016400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 07:40:24,625][15401] Updated weights for policy 0, policy_version 615051 (0.0033) [2024-06-24 07:40:27,842][15401] Updated weights for policy 0, policy_version 615061 (0.0037) [2024-06-24 07:40:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 10077175808. Throughput: 0: 42487.5. Samples: 10077274520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 07:40:32,625][15401] Updated weights for policy 0, policy_version 615071 (0.0030) [2024-06-24 07:40:33,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42053.8, 300 sec: 42820.5). Total num frames: 10077388800. Throughput: 0: 42474.0. Samples: 10077526340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 07:40:35,850][15401] Updated weights for policy 0, policy_version 615081 (0.0025) [2024-06-24 07:40:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10077569024. Throughput: 0: 42506.2. Samples: 10077657720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 07:40:40,135][15401] Updated weights for policy 0, policy_version 615091 (0.0035) [2024-06-24 07:40:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10077798400. Throughput: 0: 42755.6. Samples: 10077914900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 07:40:43,501][15401] Updated weights for policy 0, policy_version 615101 (0.0054) [2024-06-24 07:40:47,777][15401] Updated weights for policy 0, policy_version 615111 (0.0028) [2024-06-24 07:40:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10078027776. Throughput: 0: 42625.7. Samples: 10078169480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 07:40:51,175][15401] Updated weights for policy 0, policy_version 615121 (0.0036) [2024-06-24 07:40:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10078208000. Throughput: 0: 42566.2. Samples: 10078299100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 07:40:55,350][15401] Updated weights for policy 0, policy_version 615131 (0.0035) [2024-06-24 07:40:56,785][15349] Signal inference workers to stop experience collection... (149350 times) [2024-06-24 07:40:56,785][15349] Signal inference workers to resume experience collection... (149350 times) [2024-06-24 07:40:56,808][15401] InferenceWorker_p0-w0: stopping experience collection (149350 times) [2024-06-24 07:40:56,809][15401] InferenceWorker_p0-w0: resuming experience collection (149350 times) [2024-06-24 07:40:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10078437376. Throughput: 0: 42635.1. Samples: 10078554480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:40:58,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 07:40:58,951][15401] Updated weights for policy 0, policy_version 615141 (0.0022) [2024-06-24 07:41:03,068][15401] Updated weights for policy 0, policy_version 615151 (0.0027) [2024-06-24 07:41:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10078650368. Throughput: 0: 42737.2. Samples: 10078812000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 07:41:06,653][15401] Updated weights for policy 0, policy_version 615161 (0.0027) [2024-06-24 07:41:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10078863360. Throughput: 0: 42731.0. Samples: 10078939300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 07:41:10,663][15401] Updated weights for policy 0, policy_version 615171 (0.0025) [2024-06-24 07:41:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10079092736. Throughput: 0: 42646.7. Samples: 10079193620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:13,400][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 07:41:14,213][15401] Updated weights for policy 0, policy_version 615181 (0.0032) [2024-06-24 07:41:18,380][15401] Updated weights for policy 0, policy_version 615191 (0.0031) [2024-06-24 07:41:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10079289344. Throughput: 0: 42778.0. Samples: 10079451340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:18,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-24 07:41:22,542][15401] Updated weights for policy 0, policy_version 615201 (0.0041) [2024-06-24 07:41:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10079485952. Throughput: 0: 42519.6. Samples: 10079571100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 07:41:26,075][15401] Updated weights for policy 0, policy_version 615211 (0.0043) [2024-06-24 07:41:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10079731712. Throughput: 0: 42546.1. Samples: 10079829480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-24 07:41:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 07:41:30,479][15401] Updated weights for policy 0, policy_version 615221 (0.0029) [2024-06-24 07:41:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 10079895552. Throughput: 0: 42649.5. Samples: 10080088700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 07:41:33,759][15401] Updated weights for policy 0, policy_version 615231 (0.0027) [2024-06-24 07:41:38,052][15401] Updated weights for policy 0, policy_version 615241 (0.0035) [2024-06-24 07:41:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10080108544. Throughput: 0: 42293.2. Samples: 10080202300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 07:41:41,531][15401] Updated weights for policy 0, policy_version 615251 (0.0032) [2024-06-24 07:41:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10080354304. Throughput: 0: 42251.2. Samples: 10080455780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 07:41:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615257_10080370688.pth... [2024-06-24 07:41:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614631_10070114304.pth [2024-06-24 07:41:45,674][15401] Updated weights for policy 0, policy_version 615261 (0.0033) [2024-06-24 07:41:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42542.8). Total num frames: 10080518144. Throughput: 0: 42263.6. Samples: 10080713860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:48,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 07:41:49,314][15401] Updated weights for policy 0, policy_version 615271 (0.0043) [2024-06-24 07:41:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 10080747520. Throughput: 0: 42051.6. Samples: 10080831620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:53,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-24 07:41:53,778][15401] Updated weights for policy 0, policy_version 615281 (0.0038) [2024-06-24 07:41:56,806][15401] Updated weights for policy 0, policy_version 615291 (0.0027) [2024-06-24 07:41:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10080993280. Throughput: 0: 42235.1. Samples: 10081094200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:41:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 07:42:01,460][15401] Updated weights for policy 0, policy_version 615301 (0.0027) [2024-06-24 07:42:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 10081157120. Throughput: 0: 42287.1. Samples: 10081354260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 07:42:04,466][15401] Updated weights for policy 0, policy_version 615311 (0.0034) [2024-06-24 07:42:05,729][15349] Signal inference workers to stop experience collection... (149400 times) [2024-06-24 07:42:05,731][15349] Signal inference workers to resume experience collection... (149400 times) [2024-06-24 07:42:05,777][15401] InferenceWorker_p0-w0: stopping experience collection (149400 times) [2024-06-24 07:42:05,777][15401] InferenceWorker_p0-w0: resuming experience collection (149400 times) [2024-06-24 07:42:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10081402880. Throughput: 0: 42274.2. Samples: 10081473440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 07:42:08,983][15401] Updated weights for policy 0, policy_version 615321 (0.0031) [2024-06-24 07:42:12,372][15401] Updated weights for policy 0, policy_version 615331 (0.0030) [2024-06-24 07:42:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 10081615872. Throughput: 0: 42271.1. Samples: 10081731780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:13,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 07:42:16,679][15401] Updated weights for policy 0, policy_version 615341 (0.0049) [2024-06-24 07:42:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 10081812480. Throughput: 0: 42249.4. Samples: 10081989920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:18,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 07:42:20,090][15401] Updated weights for policy 0, policy_version 615351 (0.0033) [2024-06-24 07:42:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 10082025472. Throughput: 0: 42416.1. Samples: 10082111020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:42:24,781][15401] Updated weights for policy 0, policy_version 615361 (0.0037) [2024-06-24 07:42:27,857][15401] Updated weights for policy 0, policy_version 615371 (0.0041) [2024-06-24 07:42:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10082271232. Throughput: 0: 42429.7. Samples: 10082365120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 07:42:32,289][15401] Updated weights for policy 0, policy_version 615381 (0.0033) [2024-06-24 07:42:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10082451456. Throughput: 0: 42455.6. Samples: 10082624360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:33,396][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 07:42:35,470][15401] Updated weights for policy 0, policy_version 615391 (0.0038) [2024-06-24 07:42:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10082664448. Throughput: 0: 42450.3. Samples: 10082741880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 07:42:40,146][15401] Updated weights for policy 0, policy_version 615401 (0.0033) [2024-06-24 07:42:43,039][15401] Updated weights for policy 0, policy_version 615411 (0.0036) [2024-06-24 07:42:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10082910208. Throughput: 0: 42548.5. Samples: 10083008880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 07:42:47,858][15401] Updated weights for policy 0, policy_version 615421 (0.0043) [2024-06-24 07:42:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 10083090432. Throughput: 0: 42504.9. Samples: 10083266980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 07:42:50,718][15401] Updated weights for policy 0, policy_version 615431 (0.0041) [2024-06-24 07:42:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10083303424. Throughput: 0: 42523.1. Samples: 10083386980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:53,396][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 07:42:55,524][15401] Updated weights for policy 0, policy_version 615441 (0.0039) [2024-06-24 07:42:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10083532800. Throughput: 0: 42523.5. Samples: 10083645240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:42:58,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 07:42:58,835][15401] Updated weights for policy 0, policy_version 615451 (0.0025) [2024-06-24 07:43:02,941][15401] Updated weights for policy 0, policy_version 615461 (0.0030) [2024-06-24 07:43:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42487.9). Total num frames: 10083729408. Throughput: 0: 42516.5. Samples: 10083903160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 07:43:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 07:43:06,486][15401] Updated weights for policy 0, policy_version 615471 (0.0039) [2024-06-24 07:43:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42487.3). Total num frames: 10083942400. Throughput: 0: 42609.3. Samples: 10084028540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:08,392][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 07:43:10,445][15401] Updated weights for policy 0, policy_version 615481 (0.0027) [2024-06-24 07:43:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 10084171776. Throughput: 0: 42689.0. Samples: 10084286120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 07:43:14,093][15401] Updated weights for policy 0, policy_version 615491 (0.0044) [2024-06-24 07:43:17,958][15401] Updated weights for policy 0, policy_version 615501 (0.0041) [2024-06-24 07:43:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10084384768. Throughput: 0: 42665.0. Samples: 10084544280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:18,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 07:43:21,834][15401] Updated weights for policy 0, policy_version 615511 (0.0033) [2024-06-24 07:43:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10084597760. Throughput: 0: 42917.7. Samples: 10084673180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:23,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 07:43:25,489][15401] Updated weights for policy 0, policy_version 615521 (0.0036) [2024-06-24 07:43:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 10084810752. Throughput: 0: 42706.2. Samples: 10084930660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 07:43:29,565][15401] Updated weights for policy 0, policy_version 615531 (0.0035) [2024-06-24 07:43:33,203][15401] Updated weights for policy 0, policy_version 615541 (0.0035) [2024-06-24 07:43:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10085023744. Throughput: 0: 42776.0. Samples: 10085191900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:33,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 07:43:37,126][15401] Updated weights for policy 0, policy_version 615551 (0.0034) [2024-06-24 07:43:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10085253120. Throughput: 0: 43080.5. Samples: 10085325600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 07:43:40,789][15401] Updated weights for policy 0, policy_version 615561 (0.0027) [2024-06-24 07:43:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10085466112. Throughput: 0: 43017.9. Samples: 10085581040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 07:43:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615568_10085466112.pth... [2024-06-24 07:43:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000614943_10075226112.pth [2024-06-24 07:43:45,018][15401] Updated weights for policy 0, policy_version 615571 (0.0023) [2024-06-24 07:43:47,856][15349] Signal inference workers to stop experience collection... (149450 times) [2024-06-24 07:43:47,856][15349] Signal inference workers to resume experience collection... (149450 times) [2024-06-24 07:43:47,904][15401] InferenceWorker_p0-w0: stopping experience collection (149450 times) [2024-06-24 07:43:47,904][15401] InferenceWorker_p0-w0: resuming experience collection (149450 times) [2024-06-24 07:43:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10085662720. Throughput: 0: 42823.5. Samples: 10085830220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 07:43:48,448][15401] Updated weights for policy 0, policy_version 615581 (0.0039) [2024-06-24 07:43:52,505][15401] Updated weights for policy 0, policy_version 615591 (0.0044) [2024-06-24 07:43:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42486.9). Total num frames: 10085875712. Throughput: 0: 42880.0. Samples: 10085958140. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:53,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 07:43:56,235][15401] Updated weights for policy 0, policy_version 615601 (0.0033) [2024-06-24 07:43:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 10086088704. Throughput: 0: 42863.1. Samples: 10086214960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:43:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 07:44:00,353][15401] Updated weights for policy 0, policy_version 615611 (0.0035) [2024-06-24 07:44:03,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 10086301696. Throughput: 0: 42859.0. Samples: 10086473040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:03,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 07:44:03,878][15401] Updated weights for policy 0, policy_version 615621 (0.0028) [2024-06-24 07:44:07,970][15401] Updated weights for policy 0, policy_version 615631 (0.0031) [2024-06-24 07:44:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 10086514688. Throughput: 0: 42797.6. Samples: 10086599060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:08,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 07:44:11,619][15401] Updated weights for policy 0, policy_version 615641 (0.0042) [2024-06-24 07:44:13,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42869.7, 300 sec: 42431.4). Total num frames: 10086744064. Throughput: 0: 42796.8. Samples: 10086856620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:13,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 07:44:15,652][15401] Updated weights for policy 0, policy_version 615651 (0.0045) [2024-06-24 07:44:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10086957056. Throughput: 0: 42807.2. Samples: 10087118220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 07:44:19,375][15401] Updated weights for policy 0, policy_version 615661 (0.0025) [2024-06-24 07:44:23,117][15401] Updated weights for policy 0, policy_version 615671 (0.0033) [2024-06-24 07:44:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10087153664. Throughput: 0: 42699.1. Samples: 10087247060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:23,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 07:44:26,695][15401] Updated weights for policy 0, policy_version 615681 (0.0037) [2024-06-24 07:44:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42487.6). Total num frames: 10087399424. Throughput: 0: 42792.0. Samples: 10087506680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:28,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 07:44:30,524][15401] Updated weights for policy 0, policy_version 615691 (0.0030) [2024-06-24 07:44:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10087596032. Throughput: 0: 43135.1. Samples: 10087771300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:33,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-24 07:44:34,173][15401] Updated weights for policy 0, policy_version 615701 (0.0041) [2024-06-24 07:44:38,363][15401] Updated weights for policy 0, policy_version 615711 (0.0029) [2024-06-24 07:44:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10087809024. Throughput: 0: 43091.2. Samples: 10087897140. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-24 07:44:38,394][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 07:44:41,746][15401] Updated weights for policy 0, policy_version 615721 (0.0031) [2024-06-24 07:44:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10088054784. Throughput: 0: 43043.6. Samples: 10088151920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:44:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 07:44:45,821][15401] Updated weights for policy 0, policy_version 615731 (0.0044) [2024-06-24 07:44:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10088235008. Throughput: 0: 43182.0. Samples: 10088416120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:44:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 07:44:49,408][15401] Updated weights for policy 0, policy_version 615741 (0.0029) [2024-06-24 07:44:53,387][15401] Updated weights for policy 0, policy_version 615751 (0.0037) [2024-06-24 07:44:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 10088464384. Throughput: 0: 43030.1. Samples: 10088535420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:44:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 07:44:57,369][15401] Updated weights for policy 0, policy_version 615761 (0.0041) [2024-06-24 07:44:58,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 10088710144. Throughput: 0: 43152.1. Samples: 10088798360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:44:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 07:45:01,145][15401] Updated weights for policy 0, policy_version 615771 (0.0040) [2024-06-24 07:45:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 10088873984. Throughput: 0: 43028.2. Samples: 10089054500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 07:45:04,900][15401] Updated weights for policy 0, policy_version 615781 (0.0022) [2024-06-24 07:45:05,559][15349] Signal inference workers to stop experience collection... (149500 times) [2024-06-24 07:45:05,564][15349] Signal inference workers to resume experience collection... (149500 times) [2024-06-24 07:45:05,608][15401] InferenceWorker_p0-w0: stopping experience collection (149500 times) [2024-06-24 07:45:05,608][15401] InferenceWorker_p0-w0: resuming experience collection (149500 times) [2024-06-24 07:45:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10089103360. Throughput: 0: 42835.1. Samples: 10089174640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:08,390][15132] Avg episode reward: [(0, '0.107')] [2024-06-24 07:45:08,867][15401] Updated weights for policy 0, policy_version 615791 (0.0039) [2024-06-24 07:45:12,603][15401] Updated weights for policy 0, policy_version 615801 (0.0036) [2024-06-24 07:45:13,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43419.3, 300 sec: 42709.5). Total num frames: 10089349120. Throughput: 0: 42950.2. Samples: 10089439440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 07:45:16,585][15401] Updated weights for policy 0, policy_version 615811 (0.0037) [2024-06-24 07:45:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10089512960. Throughput: 0: 42777.7. Samples: 10089696300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 07:45:20,046][15401] Updated weights for policy 0, policy_version 615821 (0.0032) [2024-06-24 07:45:23,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10089725952. Throughput: 0: 42699.7. Samples: 10089818620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 07:45:24,232][15401] Updated weights for policy 0, policy_version 615831 (0.0029) [2024-06-24 07:45:27,541][15401] Updated weights for policy 0, policy_version 615841 (0.0035) [2024-06-24 07:45:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10089971712. Throughput: 0: 42963.5. Samples: 10090085280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:28,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-24 07:45:31,806][15401] Updated weights for policy 0, policy_version 615851 (0.0033) [2024-06-24 07:45:33,390][15132] Fps is (10 sec: 44234.3, 60 sec: 42871.1, 300 sec: 42709.4). Total num frames: 10090168320. Throughput: 0: 42769.7. Samples: 10090340780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 07:45:35,415][15401] Updated weights for policy 0, policy_version 615861 (0.0041) [2024-06-24 07:45:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10090381312. Throughput: 0: 42793.9. Samples: 10090461140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 07:45:39,383][15401] Updated weights for policy 0, policy_version 615871 (0.0028) [2024-06-24 07:45:42,974][15401] Updated weights for policy 0, policy_version 615881 (0.0040) [2024-06-24 07:45:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.0, 300 sec: 42653.9). Total num frames: 10090610688. Throughput: 0: 42764.0. Samples: 10090722760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:43,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 07:45:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615882_10090610688.pth... [2024-06-24 07:45:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615257_10080370688.pth [2024-06-24 07:45:47,070][15401] Updated weights for policy 0, policy_version 615891 (0.0043) [2024-06-24 07:45:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10090790912. Throughput: 0: 42682.8. Samples: 10090975220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:48,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 07:45:50,545][15401] Updated weights for policy 0, policy_version 615901 (0.0047) [2024-06-24 07:45:53,390][15132] Fps is (10 sec: 40961.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10091020288. Throughput: 0: 42741.3. Samples: 10091098000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:53,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 07:45:54,750][15401] Updated weights for policy 0, policy_version 615911 (0.0043) [2024-06-24 07:45:58,221][15401] Updated weights for policy 0, policy_version 615921 (0.0052) [2024-06-24 07:45:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10091249664. Throughput: 0: 42658.3. Samples: 10091359060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:45:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 07:46:02,375][15401] Updated weights for policy 0, policy_version 615931 (0.0036) [2024-06-24 07:46:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10091446272. Throughput: 0: 42568.8. Samples: 10091611900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:46:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 07:46:05,682][15401] Updated weights for policy 0, policy_version 615941 (0.0031) [2024-06-24 07:46:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10091659264. Throughput: 0: 42582.0. Samples: 10091734820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:46:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 07:46:10,024][15401] Updated weights for policy 0, policy_version 615951 (0.0043) [2024-06-24 07:46:12,672][15349] Signal inference workers to stop experience collection... (149550 times) [2024-06-24 07:46:12,672][15349] Signal inference workers to resume experience collection... (149550 times) [2024-06-24 07:46:12,708][15401] InferenceWorker_p0-w0: stopping experience collection (149550 times) [2024-06-24 07:46:12,708][15401] InferenceWorker_p0-w0: resuming experience collection (149550 times) [2024-06-24 07:46:13,291][15401] Updated weights for policy 0, policy_version 615961 (0.0034) [2024-06-24 07:46:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10091905024. Throughput: 0: 42610.1. Samples: 10092002740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 07:46:13,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 07:46:17,466][15401] Updated weights for policy 0, policy_version 615971 (0.0025) [2024-06-24 07:46:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10092068864. Throughput: 0: 42543.6. Samples: 10092255220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 07:46:20,858][15401] Updated weights for policy 0, policy_version 615981 (0.0035) [2024-06-24 07:46:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10092314624. Throughput: 0: 42557.3. Samples: 10092376220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 07:46:25,403][15401] Updated weights for policy 0, policy_version 615991 (0.0032) [2024-06-24 07:46:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10092527616. Throughput: 0: 42732.9. Samples: 10092645720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 07:46:28,584][15401] Updated weights for policy 0, policy_version 616001 (0.0033) [2024-06-24 07:46:33,339][15401] Updated weights for policy 0, policy_version 616011 (0.0023) [2024-06-24 07:46:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.7, 300 sec: 42765.0). Total num frames: 10092724224. Throughput: 0: 42631.5. Samples: 10092893640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 07:46:36,495][15401] Updated weights for policy 0, policy_version 616021 (0.0034) [2024-06-24 07:46:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10092953600. Throughput: 0: 42671.2. Samples: 10093018200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:38,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 07:46:41,000][15401] Updated weights for policy 0, policy_version 616031 (0.0029) [2024-06-24 07:46:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.6, 300 sec: 42820.6). Total num frames: 10093150208. Throughput: 0: 42598.1. Samples: 10093275980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 07:46:44,106][15401] Updated weights for policy 0, policy_version 616041 (0.0040) [2024-06-24 07:46:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10093346816. Throughput: 0: 42764.5. Samples: 10093536300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 07:46:48,788][15401] Updated weights for policy 0, policy_version 616051 (0.0042) [2024-06-24 07:46:52,007][15401] Updated weights for policy 0, policy_version 616061 (0.0031) [2024-06-24 07:46:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10093592576. Throughput: 0: 42754.8. Samples: 10093658780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 07:46:56,380][15401] Updated weights for policy 0, policy_version 616071 (0.0033) [2024-06-24 07:46:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10093789184. Throughput: 0: 42565.8. Samples: 10093918200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:46:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 07:46:59,757][15401] Updated weights for policy 0, policy_version 616081 (0.0043) [2024-06-24 07:47:03,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10093985792. Throughput: 0: 42504.3. Samples: 10094167920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:03,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 07:47:04,397][15401] Updated weights for policy 0, policy_version 616091 (0.0043) [2024-06-24 07:47:07,478][15401] Updated weights for policy 0, policy_version 616101 (0.0044) [2024-06-24 07:47:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 10094215168. Throughput: 0: 42549.8. Samples: 10094290960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 07:47:12,209][15401] Updated weights for policy 0, policy_version 616111 (0.0039) [2024-06-24 07:47:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 10094428160. Throughput: 0: 42375.9. Samples: 10094552640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 07:47:15,171][15401] Updated weights for policy 0, policy_version 616121 (0.0027) [2024-06-24 07:47:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10094641152. Throughput: 0: 42281.0. Samples: 10094796280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:47:20,129][15401] Updated weights for policy 0, policy_version 616131 (0.0028) [2024-06-24 07:47:23,084][15401] Updated weights for policy 0, policy_version 616141 (0.0034) [2024-06-24 07:47:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10094870528. Throughput: 0: 42439.9. Samples: 10094928000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 07:47:27,523][15401] Updated weights for policy 0, policy_version 616151 (0.0045) [2024-06-24 07:47:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10095050752. Throughput: 0: 42518.3. Samples: 10095189300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 07:47:30,773][15401] Updated weights for policy 0, policy_version 616161 (0.0033) [2024-06-24 07:47:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 10095280128. Throughput: 0: 42248.6. Samples: 10095437480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:33,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-24 07:47:35,238][15401] Updated weights for policy 0, policy_version 616171 (0.0051) [2024-06-24 07:47:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10095493120. Throughput: 0: 42450.3. Samples: 10095569040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 07:47:38,479][15401] Updated weights for policy 0, policy_version 616181 (0.0032) [2024-06-24 07:47:40,632][15349] Signal inference workers to stop experience collection... (149600 times) [2024-06-24 07:47:40,633][15349] Signal inference workers to resume experience collection... (149600 times) [2024-06-24 07:47:40,684][15401] InferenceWorker_p0-w0: stopping experience collection (149600 times) [2024-06-24 07:47:40,684][15401] InferenceWorker_p0-w0: resuming experience collection (149600 times) [2024-06-24 07:47:42,695][15401] Updated weights for policy 0, policy_version 616191 (0.0029) [2024-06-24 07:47:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10095689728. Throughput: 0: 42426.7. Samples: 10095827400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 07:47:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616192_10095689728.pth... [2024-06-24 07:47:43,641][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615568_10085466112.pth [2024-06-24 07:47:46,091][15401] Updated weights for policy 0, policy_version 616201 (0.0036) [2024-06-24 07:47:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10095935488. Throughput: 0: 42469.9. Samples: 10096079060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 27.0) [2024-06-24 07:47:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 07:47:50,258][15401] Updated weights for policy 0, policy_version 616211 (0.0032) [2024-06-24 07:47:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10096115712. Throughput: 0: 42665.3. Samples: 10096210900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:47:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 07:47:53,886][15401] Updated weights for policy 0, policy_version 616221 (0.0040) [2024-06-24 07:47:58,188][15401] Updated weights for policy 0, policy_version 616231 (0.0032) [2024-06-24 07:47:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10096328704. Throughput: 0: 42458.8. Samples: 10096463280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:47:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 07:48:01,410][15401] Updated weights for policy 0, policy_version 616241 (0.0038) [2024-06-24 07:48:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10096574464. Throughput: 0: 42655.0. Samples: 10096715760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 07:48:05,619][15401] Updated weights for policy 0, policy_version 616251 (0.0027) [2024-06-24 07:48:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10096754688. Throughput: 0: 42735.2. Samples: 10096851080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 07:48:09,360][15401] Updated weights for policy 0, policy_version 616261 (0.0032) [2024-06-24 07:48:13,158][15401] Updated weights for policy 0, policy_version 616271 (0.0036) [2024-06-24 07:48:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 10096984064. Throughput: 0: 42425.2. Samples: 10097098440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 07:48:17,125][15401] Updated weights for policy 0, policy_version 616281 (0.0031) [2024-06-24 07:48:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10097213440. Throughput: 0: 42630.5. Samples: 10097355860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 07:48:20,634][15401] Updated weights for policy 0, policy_version 616291 (0.0033) [2024-06-24 07:48:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10097393664. Throughput: 0: 42683.9. Samples: 10097489820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 07:48:24,862][15401] Updated weights for policy 0, policy_version 616301 (0.0028) [2024-06-24 07:48:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10097623040. Throughput: 0: 42517.8. Samples: 10097740700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 07:48:28,685][15401] Updated weights for policy 0, policy_version 616311 (0.0043) [2024-06-24 07:48:32,526][15401] Updated weights for policy 0, policy_version 616321 (0.0039) [2024-06-24 07:48:33,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 10097868800. Throughput: 0: 42627.1. Samples: 10097997380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:33,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 07:48:36,207][15401] Updated weights for policy 0, policy_version 616331 (0.0031) [2024-06-24 07:48:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 10098016256. Throughput: 0: 42556.4. Samples: 10098125940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 07:48:40,389][15401] Updated weights for policy 0, policy_version 616341 (0.0042) [2024-06-24 07:48:43,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10098262016. Throughput: 0: 42583.1. Samples: 10098379520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 07:48:43,841][15401] Updated weights for policy 0, policy_version 616351 (0.0027) [2024-06-24 07:48:47,523][15349] Signal inference workers to stop experience collection... (149650 times) [2024-06-24 07:48:47,524][15349] Signal inference workers to resume experience collection... (149650 times) [2024-06-24 07:48:47,564][15401] InferenceWorker_p0-w0: stopping experience collection (149650 times) [2024-06-24 07:48:47,564][15401] InferenceWorker_p0-w0: resuming experience collection (149650 times) [2024-06-24 07:48:48,024][15401] Updated weights for policy 0, policy_version 616361 (0.0035) [2024-06-24 07:48:48,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 10098491392. Throughput: 0: 42617.0. Samples: 10098633520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 07:48:51,670][15401] Updated weights for policy 0, policy_version 616371 (0.0031) [2024-06-24 07:48:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10098671616. Throughput: 0: 42495.9. Samples: 10098763400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 07:48:55,521][15401] Updated weights for policy 0, policy_version 616381 (0.0047) [2024-06-24 07:48:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10098900992. Throughput: 0: 42700.0. Samples: 10099019940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:48:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 07:48:59,439][15401] Updated weights for policy 0, policy_version 616391 (0.0035) [2024-06-24 07:49:03,297][15401] Updated weights for policy 0, policy_version 616401 (0.0036) [2024-06-24 07:49:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10099113984. Throughput: 0: 42626.2. Samples: 10099274040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:49:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 07:49:07,289][15401] Updated weights for policy 0, policy_version 616411 (0.0033) [2024-06-24 07:49:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 10099310592. Throughput: 0: 42440.9. Samples: 10099399660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:49:08,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 07:49:11,041][15401] Updated weights for policy 0, policy_version 616421 (0.0050) [2024-06-24 07:49:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 10099523584. Throughput: 0: 42409.8. Samples: 10099649140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:49:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 07:49:14,861][15401] Updated weights for policy 0, policy_version 616431 (0.0046) [2024-06-24 07:49:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10099752960. Throughput: 0: 42580.9. Samples: 10099913420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:49:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 07:49:18,615][15401] Updated weights for policy 0, policy_version 616441 (0.0037) [2024-06-24 07:49:22,450][15401] Updated weights for policy 0, policy_version 616451 (0.0044) [2024-06-24 07:49:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10099949568. Throughput: 0: 42502.6. Samples: 10100038560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 07:49:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 07:49:26,270][15401] Updated weights for policy 0, policy_version 616461 (0.0041) [2024-06-24 07:49:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10100162560. Throughput: 0: 42522.8. Samples: 10100293040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 07:49:30,210][15401] Updated weights for policy 0, policy_version 616471 (0.0028) [2024-06-24 07:49:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 10100391936. Throughput: 0: 42599.1. Samples: 10100550480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:33,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 07:49:33,900][15401] Updated weights for policy 0, policy_version 616481 (0.0042) [2024-06-24 07:49:37,950][15401] Updated weights for policy 0, policy_version 616491 (0.0029) [2024-06-24 07:49:38,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42486.4). Total num frames: 10100588544. Throughput: 0: 42674.5. Samples: 10100684020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:38,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 07:49:41,418][15401] Updated weights for policy 0, policy_version 616501 (0.0027) [2024-06-24 07:49:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10100817920. Throughput: 0: 42517.4. Samples: 10100933220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 07:49:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616505_10100817920.pth... [2024-06-24 07:49:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000615882_10090610688.pth [2024-06-24 07:49:45,526][15401] Updated weights for policy 0, policy_version 616511 (0.0034) [2024-06-24 07:49:48,389][15132] Fps is (10 sec: 45904.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10101047296. Throughput: 0: 42707.6. Samples: 10101195880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 07:49:48,975][15401] Updated weights for policy 0, policy_version 616521 (0.0033) [2024-06-24 07:49:53,270][15401] Updated weights for policy 0, policy_version 616531 (0.0027) [2024-06-24 07:49:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 10101243904. Throughput: 0: 42842.6. Samples: 10101327680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:53,392][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 07:49:56,680][15401] Updated weights for policy 0, policy_version 616541 (0.0036) [2024-06-24 07:49:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10101473280. Throughput: 0: 42855.5. Samples: 10101577640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:49:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 07:50:01,011][15401] Updated weights for policy 0, policy_version 616551 (0.0028) [2024-06-24 07:50:03,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10101653504. Throughput: 0: 42776.9. Samples: 10101838380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:03,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 07:50:04,344][15401] Updated weights for policy 0, policy_version 616561 (0.0025) [2024-06-24 07:50:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10101882880. Throughput: 0: 42755.1. Samples: 10101962540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 07:50:08,691][15401] Updated weights for policy 0, policy_version 616571 (0.0043) [2024-06-24 07:50:12,032][15401] Updated weights for policy 0, policy_version 616581 (0.0029) [2024-06-24 07:50:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10102112256. Throughput: 0: 42805.3. Samples: 10102219280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 07:50:15,768][15349] Signal inference workers to stop experience collection... (149700 times) [2024-06-24 07:50:15,769][15349] Signal inference workers to resume experience collection... (149700 times) [2024-06-24 07:50:15,794][15401] InferenceWorker_p0-w0: stopping experience collection (149700 times) [2024-06-24 07:50:15,794][15401] InferenceWorker_p0-w0: resuming experience collection (149700 times) [2024-06-24 07:50:16,421][15401] Updated weights for policy 0, policy_version 616591 (0.0042) [2024-06-24 07:50:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10102308864. Throughput: 0: 42771.5. Samples: 10102475200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 07:50:19,760][15401] Updated weights for policy 0, policy_version 616601 (0.0041) [2024-06-24 07:50:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10102538240. Throughput: 0: 42719.8. Samples: 10102606140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 07:50:24,400][15401] Updated weights for policy 0, policy_version 616611 (0.0036) [2024-06-24 07:50:27,456][15401] Updated weights for policy 0, policy_version 616621 (0.0037) [2024-06-24 07:50:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42654.0). Total num frames: 10102751232. Throughput: 0: 42864.9. Samples: 10102862140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:28,394][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 07:50:31,783][15401] Updated weights for policy 0, policy_version 616631 (0.0030) [2024-06-24 07:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10102947840. Throughput: 0: 42709.6. Samples: 10103117820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 07:50:34,966][15401] Updated weights for policy 0, policy_version 616641 (0.0029) [2024-06-24 07:50:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 10103160832. Throughput: 0: 42500.5. Samples: 10103240100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:38,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 07:50:39,345][15401] Updated weights for policy 0, policy_version 616651 (0.0045) [2024-06-24 07:50:42,965][15401] Updated weights for policy 0, policy_version 616661 (0.0032) [2024-06-24 07:50:43,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 10103406592. Throughput: 0: 42776.1. Samples: 10103502840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:43,396][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 07:50:47,102][15401] Updated weights for policy 0, policy_version 616671 (0.0024) [2024-06-24 07:50:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10103586816. Throughput: 0: 42711.6. Samples: 10103760400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 07:50:50,474][15401] Updated weights for policy 0, policy_version 616681 (0.0030) [2024-06-24 07:50:53,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 10103816192. Throughput: 0: 42609.7. Samples: 10103879980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 07:50:54,981][15401] Updated weights for policy 0, policy_version 616691 (0.0031) [2024-06-24 07:50:58,083][15401] Updated weights for policy 0, policy_version 616701 (0.0031) [2024-06-24 07:50:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 10104029184. Throughput: 0: 42683.8. Samples: 10104140160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 07:50:58,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 07:51:02,405][15401] Updated weights for policy 0, policy_version 616711 (0.0035) [2024-06-24 07:51:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10104209408. Throughput: 0: 42882.7. Samples: 10104404920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:03,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 07:51:05,620][15401] Updated weights for policy 0, policy_version 616721 (0.0033) [2024-06-24 07:51:08,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10104455168. Throughput: 0: 42663.7. Samples: 10104526000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 07:51:09,936][15401] Updated weights for policy 0, policy_version 616731 (0.0031) [2024-06-24 07:51:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10104668160. Throughput: 0: 42652.5. Samples: 10104781500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:13,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 07:51:13,808][15401] Updated weights for policy 0, policy_version 616741 (0.0038) [2024-06-24 07:51:17,412][15401] Updated weights for policy 0, policy_version 616751 (0.0034) [2024-06-24 07:51:18,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 10104848384. Throughput: 0: 42727.0. Samples: 10105040800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:18,396][15132] Avg episode reward: [(0, '0.109')] [2024-06-24 07:51:21,324][15401] Updated weights for policy 0, policy_version 616761 (0.0038) [2024-06-24 07:51:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10105094144. Throughput: 0: 42712.1. Samples: 10105162140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 07:51:24,940][15401] Updated weights for policy 0, policy_version 616771 (0.0030) [2024-06-24 07:51:28,390][15132] Fps is (10 sec: 45904.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10105307136. Throughput: 0: 42675.4. Samples: 10105422960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 07:51:29,028][15401] Updated weights for policy 0, policy_version 616781 (0.0036) [2024-06-24 07:51:32,791][15401] Updated weights for policy 0, policy_version 616791 (0.0030) [2024-06-24 07:51:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 10105503744. Throughput: 0: 42498.1. Samples: 10105672820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:33,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 07:51:36,517][15401] Updated weights for policy 0, policy_version 616801 (0.0034) [2024-06-24 07:51:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10105716736. Throughput: 0: 42777.8. Samples: 10105804980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:38,394][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 07:51:40,866][15401] Updated weights for policy 0, policy_version 616811 (0.0025) [2024-06-24 07:51:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42056.8, 300 sec: 42653.9). Total num frames: 10105929728. Throughput: 0: 42773.9. Samples: 10106064880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 07:51:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616818_10105946112.pth... [2024-06-24 07:51:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616192_10095689728.pth [2024-06-24 07:51:44,025][15401] Updated weights for policy 0, policy_version 616821 (0.0030) [2024-06-24 07:51:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10106142720. Throughput: 0: 42497.0. Samples: 10106317280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 07:51:48,509][15401] Updated weights for policy 0, policy_version 616831 (0.0032) [2024-06-24 07:51:51,181][15349] Signal inference workers to stop experience collection... (149750 times) [2024-06-24 07:51:51,182][15349] Signal inference workers to resume experience collection... (149750 times) [2024-06-24 07:51:51,216][15401] InferenceWorker_p0-w0: stopping experience collection (149750 times) [2024-06-24 07:51:51,216][15401] InferenceWorker_p0-w0: resuming experience collection (149750 times) [2024-06-24 07:51:51,705][15401] Updated weights for policy 0, policy_version 616841 (0.0041) [2024-06-24 07:51:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10106372096. Throughput: 0: 42723.1. Samples: 10106448540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 07:51:56,004][15401] Updated weights for policy 0, policy_version 616851 (0.0036) [2024-06-24 07:51:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 10106568704. Throughput: 0: 42827.2. Samples: 10106708720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:51:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 07:51:59,553][15401] Updated weights for policy 0, policy_version 616861 (0.0039) [2024-06-24 07:52:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10106781696. Throughput: 0: 42690.2. Samples: 10106961580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 07:52:03,589][15401] Updated weights for policy 0, policy_version 616871 (0.0031) [2024-06-24 07:52:07,277][15401] Updated weights for policy 0, policy_version 616881 (0.0040) [2024-06-24 07:52:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10107027456. Throughput: 0: 42864.8. Samples: 10107091060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 07:52:11,903][15401] Updated weights for policy 0, policy_version 616891 (0.0028) [2024-06-24 07:52:13,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 10107191296. Throughput: 0: 42680.3. Samples: 10107343580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 07:52:15,016][15401] Updated weights for policy 0, policy_version 616901 (0.0032) [2024-06-24 07:52:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43149.1, 300 sec: 42598.4). Total num frames: 10107437056. Throughput: 0: 42705.0. Samples: 10107594540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 07:52:19,369][15401] Updated weights for policy 0, policy_version 616911 (0.0032) [2024-06-24 07:52:22,601][15401] Updated weights for policy 0, policy_version 616921 (0.0043) [2024-06-24 07:52:23,390][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10107650048. Throughput: 0: 42723.5. Samples: 10107727540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 07:52:26,813][15401] Updated weights for policy 0, policy_version 616931 (0.0043) [2024-06-24 07:52:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10107846656. Throughput: 0: 42788.8. Samples: 10107990380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 07:52:30,175][15401] Updated weights for policy 0, policy_version 616941 (0.0036) [2024-06-24 07:52:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10108076032. Throughput: 0: 42740.9. Samples: 10108240620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 07:52:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 07:52:34,455][15401] Updated weights for policy 0, policy_version 616951 (0.0034) [2024-06-24 07:52:37,768][15401] Updated weights for policy 0, policy_version 616961 (0.0036) [2024-06-24 07:52:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10108289024. Throughput: 0: 42683.5. Samples: 10108369300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:52:38,390][15132] Avg episode reward: [(0, '0.154')] [2024-06-24 07:52:41,961][15401] Updated weights for policy 0, policy_version 616971 (0.0036) [2024-06-24 07:52:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10108502016. Throughput: 0: 42662.6. Samples: 10108628540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:52:43,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 07:52:45,462][15401] Updated weights for policy 0, policy_version 616981 (0.0039) [2024-06-24 07:52:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10108731392. Throughput: 0: 42769.1. Samples: 10108886200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:52:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 07:52:50,061][15401] Updated weights for policy 0, policy_version 616991 (0.0023) [2024-06-24 07:52:53,085][15401] Updated weights for policy 0, policy_version 617001 (0.0028) [2024-06-24 07:52:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10108944384. Throughput: 0: 42971.6. Samples: 10109024780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:52:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 07:52:57,428][15401] Updated weights for policy 0, policy_version 617011 (0.0039) [2024-06-24 07:52:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10109124608. Throughput: 0: 42932.1. Samples: 10109275520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:52:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 07:53:00,899][15401] Updated weights for policy 0, policy_version 617021 (0.0039) [2024-06-24 07:53:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10109370368. Throughput: 0: 43046.2. Samples: 10109531620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 07:53:04,966][15401] Updated weights for policy 0, policy_version 617031 (0.0029) [2024-06-24 07:53:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10109583360. Throughput: 0: 43144.5. Samples: 10109669040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 07:53:08,528][15401] Updated weights for policy 0, policy_version 617041 (0.0028) [2024-06-24 07:53:12,594][15401] Updated weights for policy 0, policy_version 617051 (0.0037) [2024-06-24 07:53:13,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43413.0, 300 sec: 42653.0). Total num frames: 10109796352. Throughput: 0: 42991.2. Samples: 10109925260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:13,397][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 07:53:16,101][15401] Updated weights for policy 0, policy_version 617061 (0.0032) [2024-06-24 07:53:17,424][15349] Signal inference workers to stop experience collection... (149800 times) [2024-06-24 07:53:17,431][15349] Signal inference workers to resume experience collection... (149800 times) [2024-06-24 07:53:17,446][15401] InferenceWorker_p0-w0: stopping experience collection (149800 times) [2024-06-24 07:53:17,480][15401] InferenceWorker_p0-w0: resuming experience collection (149800 times) [2024-06-24 07:53:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10110042112. Throughput: 0: 43092.4. Samples: 10110179780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 07:53:20,219][15401] Updated weights for policy 0, policy_version 617071 (0.0028) [2024-06-24 07:53:23,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10110222336. Throughput: 0: 43249.4. Samples: 10110315520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:23,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 07:53:23,759][15401] Updated weights for policy 0, policy_version 617081 (0.0036) [2024-06-24 07:53:27,948][15401] Updated weights for policy 0, policy_version 617091 (0.0024) [2024-06-24 07:53:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42598.7). Total num frames: 10110435328. Throughput: 0: 43162.6. Samples: 10110570860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 07:53:31,495][15401] Updated weights for policy 0, policy_version 617101 (0.0025) [2024-06-24 07:53:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10110681088. Throughput: 0: 42920.6. Samples: 10110817620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 07:53:35,416][15401] Updated weights for policy 0, policy_version 617111 (0.0032) [2024-06-24 07:53:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10110894080. Throughput: 0: 42917.9. Samples: 10110956080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 07:53:39,179][15401] Updated weights for policy 0, policy_version 617121 (0.0028) [2024-06-24 07:53:43,085][15401] Updated weights for policy 0, policy_version 617131 (0.0033) [2024-06-24 07:53:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 10111090688. Throughput: 0: 42959.9. Samples: 10111208720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 07:53:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617132_10111090688.pth... [2024-06-24 07:53:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616505_10100817920.pth [2024-06-24 07:53:46,718][15401] Updated weights for policy 0, policy_version 617141 (0.0033) [2024-06-24 07:53:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10111320064. Throughput: 0: 42899.6. Samples: 10111462100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 07:53:50,528][15401] Updated weights for policy 0, policy_version 617151 (0.0025) [2024-06-24 07:53:53,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10111516672. Throughput: 0: 42741.7. Samples: 10111592520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:53,401][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 07:53:54,307][15401] Updated weights for policy 0, policy_version 617161 (0.0033) [2024-06-24 07:53:58,129][15401] Updated weights for policy 0, policy_version 617171 (0.0041) [2024-06-24 07:53:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43689.0, 300 sec: 42820.2). Total num frames: 10111746048. Throughput: 0: 42698.2. Samples: 10111846500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:53:58,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 07:54:02,168][15401] Updated weights for policy 0, policy_version 617181 (0.0036) [2024-06-24 07:54:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10111942656. Throughput: 0: 42850.8. Samples: 10112108060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:54:03,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 07:54:05,586][15401] Updated weights for policy 0, policy_version 617191 (0.0032) [2024-06-24 07:54:08,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10112139264. Throughput: 0: 42554.2. Samples: 10112230460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 07:54:08,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 07:54:09,902][15401] Updated weights for policy 0, policy_version 617201 (0.0046) [2024-06-24 07:54:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 10112368640. Throughput: 0: 42635.2. Samples: 10112489440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 07:54:13,452][15401] Updated weights for policy 0, policy_version 617211 (0.0037) [2024-06-24 07:54:17,786][15401] Updated weights for policy 0, policy_version 617221 (0.0040) [2024-06-24 07:54:18,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 10112581632. Throughput: 0: 42745.6. Samples: 10112741280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:18,393][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 07:54:21,248][15401] Updated weights for policy 0, policy_version 617231 (0.0037) [2024-06-24 07:54:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10112778240. Throughput: 0: 42449.2. Samples: 10112866300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:23,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 07:54:25,405][15401] Updated weights for policy 0, policy_version 617241 (0.0029) [2024-06-24 07:54:28,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10112991232. Throughput: 0: 42530.3. Samples: 10113122580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 07:54:29,025][15401] Updated weights for policy 0, policy_version 617251 (0.0042) [2024-06-24 07:54:32,969][15401] Updated weights for policy 0, policy_version 617261 (0.0033) [2024-06-24 07:54:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42765.9). Total num frames: 10113204224. Throughput: 0: 42629.4. Samples: 10113380420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 07:54:36,648][15401] Updated weights for policy 0, policy_version 617271 (0.0033) [2024-06-24 07:54:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10113433600. Throughput: 0: 42605.5. Samples: 10113509660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 07:54:41,248][15401] Updated weights for policy 0, policy_version 617281 (0.0042) [2024-06-24 07:54:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 10113613824. Throughput: 0: 42570.7. Samples: 10113762080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 07:54:44,291][15401] Updated weights for policy 0, policy_version 617291 (0.0039) [2024-06-24 07:54:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 10113843200. Throughput: 0: 42506.0. Samples: 10114020840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 07:54:48,762][15401] Updated weights for policy 0, policy_version 617301 (0.0030) [2024-06-24 07:54:52,105][15401] Updated weights for policy 0, policy_version 617311 (0.0028) [2024-06-24 07:54:52,420][15349] Signal inference workers to stop experience collection... (149850 times) [2024-06-24 07:54:52,446][15401] InferenceWorker_p0-w0: stopping experience collection (149850 times) [2024-06-24 07:54:52,536][15349] Signal inference workers to resume experience collection... (149850 times) [2024-06-24 07:54:52,536][15401] InferenceWorker_p0-w0: resuming experience collection (149850 times) [2024-06-24 07:54:53,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10114088960. Throughput: 0: 42671.5. Samples: 10114150680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 07:54:56,202][15401] Updated weights for policy 0, policy_version 617321 (0.0033) [2024-06-24 07:54:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 10114269184. Throughput: 0: 42621.8. Samples: 10114407420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:54:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 07:54:59,714][15401] Updated weights for policy 0, policy_version 617331 (0.0036) [2024-06-24 07:55:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10114482176. Throughput: 0: 42682.7. Samples: 10114661900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 07:55:03,757][15401] Updated weights for policy 0, policy_version 617341 (0.0045) [2024-06-24 07:55:07,451][15401] Updated weights for policy 0, policy_version 617351 (0.0031) [2024-06-24 07:55:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10114711552. Throughput: 0: 42808.0. Samples: 10114792660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 07:55:11,430][15401] Updated weights for policy 0, policy_version 617361 (0.0032) [2024-06-24 07:55:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10114908160. Throughput: 0: 42723.0. Samples: 10115045120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 07:55:15,525][15401] Updated weights for policy 0, policy_version 617371 (0.0044) [2024-06-24 07:55:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 10115121152. Throughput: 0: 42406.3. Samples: 10115288700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:18,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 07:55:19,665][15401] Updated weights for policy 0, policy_version 617381 (0.0035) [2024-06-24 07:55:23,209][15401] Updated weights for policy 0, policy_version 617391 (0.0027) [2024-06-24 07:55:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10115334144. Throughput: 0: 42386.6. Samples: 10115417060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 07:55:27,665][15401] Updated weights for policy 0, policy_version 617401 (0.0033) [2024-06-24 07:55:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10115530752. Throughput: 0: 42504.0. Samples: 10115674760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 07:55:31,065][15401] Updated weights for policy 0, policy_version 617411 (0.0029) [2024-06-24 07:55:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10115776512. Throughput: 0: 42339.2. Samples: 10115926100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:33,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 07:55:35,271][15401] Updated weights for policy 0, policy_version 617421 (0.0028) [2024-06-24 07:55:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42543.8). Total num frames: 10115956736. Throughput: 0: 42323.4. Samples: 10116055240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:38,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 07:55:38,611][15401] Updated weights for policy 0, policy_version 617431 (0.0034) [2024-06-24 07:55:42,583][15401] Updated weights for policy 0, policy_version 617441 (0.0033) [2024-06-24 07:55:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10116186112. Throughput: 0: 42319.5. Samples: 10116311800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 07:55:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 07:55:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617443_10116186112.pth... [2024-06-24 07:55:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000616818_10105946112.pth [2024-06-24 07:55:46,373][15401] Updated weights for policy 0, policy_version 617451 (0.0031) [2024-06-24 07:55:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10116415488. Throughput: 0: 42355.1. Samples: 10116567880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:55:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 07:55:49,978][15401] Updated weights for policy 0, policy_version 617461 (0.0039) [2024-06-24 07:55:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 10116612096. Throughput: 0: 42412.0. Samples: 10116701200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:55:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 07:55:54,071][15401] Updated weights for policy 0, policy_version 617471 (0.0034) [2024-06-24 07:55:57,387][15401] Updated weights for policy 0, policy_version 617481 (0.0036) [2024-06-24 07:55:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10116808704. Throughput: 0: 42341.5. Samples: 10116950480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:55:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 07:56:01,886][15401] Updated weights for policy 0, policy_version 617491 (0.0029) [2024-06-24 07:56:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10117070848. Throughput: 0: 42579.9. Samples: 10117204800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 07:56:05,055][15401] Updated weights for policy 0, policy_version 617501 (0.0034) [2024-06-24 07:56:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10117251072. Throughput: 0: 42623.2. Samples: 10117335100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 07:56:09,497][15401] Updated weights for policy 0, policy_version 617511 (0.0027) [2024-06-24 07:56:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.5, 300 sec: 42710.4). Total num frames: 10117447680. Throughput: 0: 42481.3. Samples: 10117586420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 07:56:13,462][15401] Updated weights for policy 0, policy_version 617521 (0.0038) [2024-06-24 07:56:17,100][15401] Updated weights for policy 0, policy_version 617531 (0.0031) [2024-06-24 07:56:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10117693440. Throughput: 0: 42655.5. Samples: 10117845600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 07:56:21,073][15401] Updated weights for policy 0, policy_version 617541 (0.0028) [2024-06-24 07:56:23,395][15132] Fps is (10 sec: 45849.5, 60 sec: 42867.5, 300 sec: 42708.7). Total num frames: 10117906432. Throughput: 0: 42765.5. Samples: 10117979920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:23,396][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 07:56:24,599][15401] Updated weights for policy 0, policy_version 617551 (0.0030) [2024-06-24 07:56:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10118103040. Throughput: 0: 42671.9. Samples: 10118232040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 07:56:28,453][15401] Updated weights for policy 0, policy_version 617561 (0.0032) [2024-06-24 07:56:32,262][15401] Updated weights for policy 0, policy_version 617571 (0.0046) [2024-06-24 07:56:33,389][15132] Fps is (10 sec: 42622.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10118332416. Throughput: 0: 42725.4. Samples: 10118490520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 07:56:35,935][15401] Updated weights for policy 0, policy_version 617581 (0.0043) [2024-06-24 07:56:36,905][15349] Signal inference workers to stop experience collection... (149900 times) [2024-06-24 07:56:36,906][15349] Signal inference workers to resume experience collection... (149900 times) [2024-06-24 07:56:36,964][15401] InferenceWorker_p0-w0: stopping experience collection (149900 times) [2024-06-24 07:56:36,964][15401] InferenceWorker_p0-w0: resuming experience collection (149900 times) [2024-06-24 07:56:38,391][15132] Fps is (10 sec: 42591.6, 60 sec: 42870.4, 300 sec: 42709.2). Total num frames: 10118529024. Throughput: 0: 42534.9. Samples: 10118615340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:38,392][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 07:56:39,809][15401] Updated weights for policy 0, policy_version 617591 (0.0043) [2024-06-24 07:56:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10118758400. Throughput: 0: 42655.1. Samples: 10118869960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 07:56:44,122][15401] Updated weights for policy 0, policy_version 617601 (0.0037) [2024-06-24 07:56:47,873][15401] Updated weights for policy 0, policy_version 617611 (0.0037) [2024-06-24 07:56:48,389][15132] Fps is (10 sec: 42605.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10118955008. Throughput: 0: 42727.6. Samples: 10119127540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 07:56:51,903][15401] Updated weights for policy 0, policy_version 617621 (0.0034) [2024-06-24 07:56:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10119151616. Throughput: 0: 42588.4. Samples: 10119251580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:53,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 07:56:55,513][15401] Updated weights for policy 0, policy_version 617631 (0.0040) [2024-06-24 07:56:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10119380992. Throughput: 0: 42696.0. Samples: 10119507740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:56:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 07:56:59,475][15401] Updated weights for policy 0, policy_version 617641 (0.0036) [2024-06-24 07:57:02,988][15401] Updated weights for policy 0, policy_version 617651 (0.0040) [2024-06-24 07:57:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10119593984. Throughput: 0: 42685.4. Samples: 10119766440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:57:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 07:57:06,980][15401] Updated weights for policy 0, policy_version 617661 (0.0043) [2024-06-24 07:57:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10119790592. Throughput: 0: 42600.8. Samples: 10119896720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:57:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 07:57:10,566][15401] Updated weights for policy 0, policy_version 617671 (0.0034) [2024-06-24 07:57:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10120036352. Throughput: 0: 42559.5. Samples: 10120147220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:57:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 07:57:14,490][15401] Updated weights for policy 0, policy_version 617681 (0.0040) [2024-06-24 07:57:18,181][15401] Updated weights for policy 0, policy_version 617691 (0.0034) [2024-06-24 07:57:18,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10120249344. Throughput: 0: 42579.9. Samples: 10120406720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 07:57:18,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 07:57:22,028][15401] Updated weights for policy 0, policy_version 617701 (0.0029) [2024-06-24 07:57:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42056.2, 300 sec: 42654.0). Total num frames: 10120429568. Throughput: 0: 42641.6. Samples: 10120534140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 07:57:26,015][15401] Updated weights for policy 0, policy_version 617711 (0.0045) [2024-06-24 07:57:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10120675328. Throughput: 0: 42568.3. Samples: 10120785540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 07:57:30,032][15401] Updated weights for policy 0, policy_version 617721 (0.0032) [2024-06-24 07:57:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10120871936. Throughput: 0: 42764.0. Samples: 10121051920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 07:57:33,625][15401] Updated weights for policy 0, policy_version 617731 (0.0028) [2024-06-24 07:57:37,545][15401] Updated weights for policy 0, policy_version 617741 (0.0049) [2024-06-24 07:57:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42599.6, 300 sec: 42653.9). Total num frames: 10121084928. Throughput: 0: 42713.7. Samples: 10121173700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:38,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 07:57:41,375][15401] Updated weights for policy 0, policy_version 617751 (0.0033) [2024-06-24 07:57:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10121314304. Throughput: 0: 42605.3. Samples: 10121424980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:43,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-24 07:57:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617756_10121314304.pth... [2024-06-24 07:57:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617132_10111090688.pth [2024-06-24 07:57:45,150][15401] Updated weights for policy 0, policy_version 617761 (0.0033) [2024-06-24 07:57:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 10121494528. Throughput: 0: 42685.3. Samples: 10121687380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:48,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 07:57:48,644][15349] Signal inference workers to stop experience collection... (149950 times) [2024-06-24 07:57:48,652][15349] Signal inference workers to resume experience collection... (149950 times) [2024-06-24 07:57:48,660][15401] InferenceWorker_p0-w0: stopping experience collection (149950 times) [2024-06-24 07:57:48,692][15401] InferenceWorker_p0-w0: resuming experience collection (149950 times) [2024-06-24 07:57:49,124][15401] Updated weights for policy 0, policy_version 617771 (0.0039) [2024-06-24 07:57:52,987][15401] Updated weights for policy 0, policy_version 617781 (0.0029) [2024-06-24 07:57:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10121723904. Throughput: 0: 42396.5. Samples: 10121804560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 07:57:57,317][15401] Updated weights for policy 0, policy_version 617791 (0.0037) [2024-06-24 07:57:58,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10121953280. Throughput: 0: 42502.7. Samples: 10122059840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:57:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 07:58:00,610][15401] Updated weights for policy 0, policy_version 617801 (0.0033) [2024-06-24 07:58:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10122117120. Throughput: 0: 42526.3. Samples: 10122320300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 07:58:04,850][15401] Updated weights for policy 0, policy_version 617811 (0.0028) [2024-06-24 07:58:08,348][15401] Updated weights for policy 0, policy_version 617821 (0.0034) [2024-06-24 07:58:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42654.9). Total num frames: 10122379264. Throughput: 0: 42248.9. Samples: 10122435340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 07:58:12,433][15401] Updated weights for policy 0, policy_version 617831 (0.0032) [2024-06-24 07:58:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10122592256. Throughput: 0: 42324.6. Samples: 10122690140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 07:58:16,112][15401] Updated weights for policy 0, policy_version 617841 (0.0026) [2024-06-24 07:58:18,389][15132] Fps is (10 sec: 36045.2, 60 sec: 41507.9, 300 sec: 42431.8). Total num frames: 10122739712. Throughput: 0: 42210.3. Samples: 10122951380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 07:58:20,289][15401] Updated weights for policy 0, policy_version 617851 (0.0028) [2024-06-24 07:58:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10123001856. Throughput: 0: 42130.7. Samples: 10123069580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 07:58:24,068][15401] Updated weights for policy 0, policy_version 617861 (0.0042) [2024-06-24 07:58:27,910][15401] Updated weights for policy 0, policy_version 617871 (0.0036) [2024-06-24 07:58:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10123214848. Throughput: 0: 42276.0. Samples: 10123327400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 07:58:31,673][15401] Updated weights for policy 0, policy_version 617881 (0.0027) [2024-06-24 07:58:33,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 10123378688. Throughput: 0: 42184.8. Samples: 10123585600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 07:58:35,780][15401] Updated weights for policy 0, policy_version 617891 (0.0033) [2024-06-24 07:58:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10123640832. Throughput: 0: 42242.2. Samples: 10123705460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 07:58:39,155][15401] Updated weights for policy 0, policy_version 617901 (0.0033) [2024-06-24 07:58:43,214][15401] Updated weights for policy 0, policy_version 617911 (0.0027) [2024-06-24 07:58:43,389][15132] Fps is (10 sec: 47514.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10123853824. Throughput: 0: 42511.3. Samples: 10123972840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 07:58:46,683][15401] Updated weights for policy 0, policy_version 617921 (0.0033) [2024-06-24 07:58:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42487.7). Total num frames: 10124050432. Throughput: 0: 42360.0. Samples: 10124226500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 07:58:51,159][15401] Updated weights for policy 0, policy_version 617931 (0.0034) [2024-06-24 07:58:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 10124279808. Throughput: 0: 42562.3. Samples: 10124350640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 07:58:53,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 07:58:54,501][15401] Updated weights for policy 0, policy_version 617941 (0.0033) [2024-06-24 07:58:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10124476416. Throughput: 0: 42618.6. Samples: 10124607980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:58:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 07:58:59,043][15401] Updated weights for policy 0, policy_version 617951 (0.0035) [2024-06-24 07:59:02,618][15401] Updated weights for policy 0, policy_version 617961 (0.0030) [2024-06-24 07:59:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 10124689408. Throughput: 0: 42350.1. Samples: 10124857140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:03,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 07:59:07,019][15401] Updated weights for policy 0, policy_version 617971 (0.0039) [2024-06-24 07:59:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10124918784. Throughput: 0: 42454.1. Samples: 10124980020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 07:59:10,717][15401] Updated weights for policy 0, policy_version 617981 (0.0035) [2024-06-24 07:59:12,187][15349] Signal inference workers to stop experience collection... (150000 times) [2024-06-24 07:59:12,188][15349] Signal inference workers to resume experience collection... (150000 times) [2024-06-24 07:59:12,205][15401] InferenceWorker_p0-w0: stopping experience collection (150000 times) [2024-06-24 07:59:12,206][15401] InferenceWorker_p0-w0: resuming experience collection (150000 times) [2024-06-24 07:59:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 10125115392. Throughput: 0: 42537.7. Samples: 10125241600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 07:59:14,652][15401] Updated weights for policy 0, policy_version 617991 (0.0025) [2024-06-24 07:59:18,390][15132] Fps is (10 sec: 39319.5, 60 sec: 42870.9, 300 sec: 42487.2). Total num frames: 10125312000. Throughput: 0: 42415.5. Samples: 10125494320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:18,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 07:59:18,614][15401] Updated weights for policy 0, policy_version 618001 (0.0033) [2024-06-24 07:59:22,287][15401] Updated weights for policy 0, policy_version 618011 (0.0030) [2024-06-24 07:59:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10125557760. Throughput: 0: 42639.6. Samples: 10125624240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 07:59:26,213][15401] Updated weights for policy 0, policy_version 618021 (0.0041) [2024-06-24 07:59:28,390][15132] Fps is (10 sec: 44239.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 10125754368. Throughput: 0: 42318.0. Samples: 10125877160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 07:59:29,921][15401] Updated weights for policy 0, policy_version 618031 (0.0047) [2024-06-24 07:59:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 10125950976. Throughput: 0: 42348.5. Samples: 10126132180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 07:59:33,839][15401] Updated weights for policy 0, policy_version 618041 (0.0023) [2024-06-24 07:59:37,456][15401] Updated weights for policy 0, policy_version 618051 (0.0039) [2024-06-24 07:59:38,392][15132] Fps is (10 sec: 44227.6, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10126196736. Throughput: 0: 42459.7. Samples: 10126261420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:38,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 07:59:41,374][15401] Updated weights for policy 0, policy_version 618061 (0.0036) [2024-06-24 07:59:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10126376960. Throughput: 0: 42375.7. Samples: 10126514880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 07:59:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618066_10126393344.pth... [2024-06-24 07:59:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617443_10116186112.pth [2024-06-24 07:59:45,219][15401] Updated weights for policy 0, policy_version 618071 (0.0029) [2024-06-24 07:59:48,390][15132] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 10126606336. Throughput: 0: 42625.8. Samples: 10126775300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 07:59:48,979][15401] Updated weights for policy 0, policy_version 618081 (0.0051) [2024-06-24 07:59:52,756][15401] Updated weights for policy 0, policy_version 618091 (0.0031) [2024-06-24 07:59:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10126835712. Throughput: 0: 42698.3. Samples: 10126901440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 07:59:56,674][15401] Updated weights for policy 0, policy_version 618101 (0.0027) [2024-06-24 07:59:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10127032320. Throughput: 0: 42556.3. Samples: 10127156640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 07:59:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 08:00:00,406][15401] Updated weights for policy 0, policy_version 618111 (0.0033) [2024-06-24 08:00:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 10127245312. Throughput: 0: 42502.7. Samples: 10127407020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:03,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 08:00:04,410][15401] Updated weights for policy 0, policy_version 618121 (0.0045) [2024-06-24 08:00:08,126][15401] Updated weights for policy 0, policy_version 618131 (0.0039) [2024-06-24 08:00:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10127474688. Throughput: 0: 42570.1. Samples: 10127539900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 08:00:12,364][15401] Updated weights for policy 0, policy_version 618141 (0.0037) [2024-06-24 08:00:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10127654912. Throughput: 0: 42631.2. Samples: 10127795560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 08:00:16,117][15401] Updated weights for policy 0, policy_version 618151 (0.0033) [2024-06-24 08:00:17,945][15349] Signal inference workers to stop experience collection... (150050 times) [2024-06-24 08:00:17,946][15349] Signal inference workers to resume experience collection... (150050 times) [2024-06-24 08:00:17,981][15401] InferenceWorker_p0-w0: stopping experience collection (150050 times) [2024-06-24 08:00:17,981][15401] InferenceWorker_p0-w0: resuming experience collection (150050 times) [2024-06-24 08:00:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43145.0, 300 sec: 42598.4). Total num frames: 10127900672. Throughput: 0: 42377.3. Samples: 10128039160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 08:00:19,951][15401] Updated weights for policy 0, policy_version 618161 (0.0031) [2024-06-24 08:00:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 41779.0, 300 sec: 42487.3). Total num frames: 10128064512. Throughput: 0: 42518.4. Samples: 10128174660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 08:00:23,911][15401] Updated weights for policy 0, policy_version 618171 (0.0038) [2024-06-24 08:00:27,600][15401] Updated weights for policy 0, policy_version 618181 (0.0042) [2024-06-24 08:00:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10128293888. Throughput: 0: 42431.8. Samples: 10128424320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-24 08:00:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:00:31,505][15401] Updated weights for policy 0, policy_version 618191 (0.0047) [2024-06-24 08:00:33,392][15132] Fps is (10 sec: 45864.9, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 10128523264. Throughput: 0: 42284.4. Samples: 10128678200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:33,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 08:00:35,232][15401] Updated weights for policy 0, policy_version 618201 (0.0045) [2024-06-24 08:00:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42053.8, 300 sec: 42487.3). Total num frames: 10128719872. Throughput: 0: 42384.0. Samples: 10128808720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 08:00:39,042][15401] Updated weights for policy 0, policy_version 618211 (0.0024) [2024-06-24 08:00:42,795][15401] Updated weights for policy 0, policy_version 618221 (0.0027) [2024-06-24 08:00:43,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10128932864. Throughput: 0: 42498.9. Samples: 10129069080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 08:00:46,928][15401] Updated weights for policy 0, policy_version 618231 (0.0040) [2024-06-24 08:00:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10129162240. Throughput: 0: 42505.1. Samples: 10129319640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 08:00:50,620][15401] Updated weights for policy 0, policy_version 618241 (0.0046) [2024-06-24 08:00:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10129375232. Throughput: 0: 42536.0. Samples: 10129454020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 08:00:54,287][15401] Updated weights for policy 0, policy_version 618251 (0.0033) [2024-06-24 08:00:58,343][15401] Updated weights for policy 0, policy_version 618261 (0.0028) [2024-06-24 08:00:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42431.8). Total num frames: 10129588224. Throughput: 0: 42552.2. Samples: 10129710400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:00:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 08:01:01,881][15401] Updated weights for policy 0, policy_version 618271 (0.0031) [2024-06-24 08:01:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10129817600. Throughput: 0: 42830.2. Samples: 10129966520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 08:01:05,861][15401] Updated weights for policy 0, policy_version 618281 (0.0032) [2024-06-24 08:01:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10130014208. Throughput: 0: 42752.3. Samples: 10130098500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 08:01:09,521][15401] Updated weights for policy 0, policy_version 618291 (0.0030) [2024-06-24 08:01:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10130227200. Throughput: 0: 42846.7. Samples: 10130352420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 08:01:13,405][15401] Updated weights for policy 0, policy_version 618301 (0.0043) [2024-06-24 08:01:17,606][15401] Updated weights for policy 0, policy_version 618311 (0.0028) [2024-06-24 08:01:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42488.1). Total num frames: 10130440192. Throughput: 0: 42840.4. Samples: 10130605920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:18,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-24 08:01:21,444][15401] Updated weights for policy 0, policy_version 618321 (0.0032) [2024-06-24 08:01:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.8, 300 sec: 42598.4). Total num frames: 10130669568. Throughput: 0: 42841.0. Samples: 10130736560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:23,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-24 08:01:25,199][15401] Updated weights for policy 0, policy_version 618331 (0.0033) [2024-06-24 08:01:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 10130882560. Throughput: 0: 42800.8. Samples: 10130995120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 08:01:28,909][15401] Updated weights for policy 0, policy_version 618341 (0.0040) [2024-06-24 08:01:32,787][15401] Updated weights for policy 0, policy_version 618351 (0.0031) [2024-06-24 08:01:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42598.6). Total num frames: 10131095552. Throughput: 0: 42939.9. Samples: 10131251940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 08:01:36,410][15401] Updated weights for policy 0, policy_version 618361 (0.0034) [2024-06-24 08:01:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10131292160. Throughput: 0: 42798.3. Samples: 10131379940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 08:01:38,425][15349] Signal inference workers to stop experience collection... (150100 times) [2024-06-24 08:01:38,426][15349] Signal inference workers to resume experience collection... (150100 times) [2024-06-24 08:01:38,443][15401] InferenceWorker_p0-w0: stopping experience collection (150100 times) [2024-06-24 08:01:38,443][15401] InferenceWorker_p0-w0: resuming experience collection (150100 times) [2024-06-24 08:01:40,299][15401] Updated weights for policy 0, policy_version 618371 (0.0028) [2024-06-24 08:01:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10131537920. Throughput: 0: 42886.2. Samples: 10131640280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 08:01:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618380_10131537920.pth... [2024-06-24 08:01:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000617756_10121314304.pth [2024-06-24 08:01:43,868][15401] Updated weights for policy 0, policy_version 618381 (0.0038) [2024-06-24 08:01:47,919][15401] Updated weights for policy 0, policy_version 618391 (0.0034) [2024-06-24 08:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10131734528. Throughput: 0: 42834.2. Samples: 10131894060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 08:01:51,925][15401] Updated weights for policy 0, policy_version 618401 (0.0029) [2024-06-24 08:01:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10131947520. Throughput: 0: 42755.9. Samples: 10132022520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:53,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 08:01:55,609][15401] Updated weights for policy 0, policy_version 618411 (0.0040) [2024-06-24 08:01:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10132144128. Throughput: 0: 42933.0. Samples: 10132284400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:01:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 08:01:59,566][15401] Updated weights for policy 0, policy_version 618421 (0.0041) [2024-06-24 08:02:03,322][15401] Updated weights for policy 0, policy_version 618431 (0.0039) [2024-06-24 08:02:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10132373504. Throughput: 0: 42993.9. Samples: 10132540640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-24 08:02:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 08:02:07,284][15401] Updated weights for policy 0, policy_version 618441 (0.0032) [2024-06-24 08:02:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10132586496. Throughput: 0: 42855.9. Samples: 10132665080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 08:02:10,745][15401] Updated weights for policy 0, policy_version 618451 (0.0036) [2024-06-24 08:02:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 10132799488. Throughput: 0: 42924.8. Samples: 10132926740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:13,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 08:02:14,870][15401] Updated weights for policy 0, policy_version 618461 (0.0033) [2024-06-24 08:02:18,227][15401] Updated weights for policy 0, policy_version 618471 (0.0045) [2024-06-24 08:02:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 10133028864. Throughput: 0: 42802.7. Samples: 10133178060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 08:02:22,423][15401] Updated weights for policy 0, policy_version 618481 (0.0031) [2024-06-24 08:02:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10133225472. Throughput: 0: 42884.9. Samples: 10133309760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:23,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 08:02:26,220][15401] Updated weights for policy 0, policy_version 618491 (0.0037) [2024-06-24 08:02:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10133438464. Throughput: 0: 42765.7. Samples: 10133564740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 08:02:30,010][15401] Updated weights for policy 0, policy_version 618501 (0.0032) [2024-06-24 08:02:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10133667840. Throughput: 0: 42828.5. Samples: 10133821340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 08:02:33,915][15401] Updated weights for policy 0, policy_version 618511 (0.0050) [2024-06-24 08:02:37,519][15401] Updated weights for policy 0, policy_version 618521 (0.0037) [2024-06-24 08:02:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 10133880832. Throughput: 0: 42993.6. Samples: 10133957240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 08:02:41,499][15401] Updated weights for policy 0, policy_version 618531 (0.0038) [2024-06-24 08:02:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 10134077440. Throughput: 0: 42883.6. Samples: 10134214160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 08:02:45,081][15401] Updated weights for policy 0, policy_version 618541 (0.0033) [2024-06-24 08:02:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10134323200. Throughput: 0: 42886.7. Samples: 10134470540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 08:02:48,827][15401] Updated weights for policy 0, policy_version 618551 (0.0029) [2024-06-24 08:02:51,570][15349] Signal inference workers to stop experience collection... (150150 times) [2024-06-24 08:02:51,572][15349] Signal inference workers to resume experience collection... (150150 times) [2024-06-24 08:02:51,582][15401] InferenceWorker_p0-w0: stopping experience collection (150150 times) [2024-06-24 08:02:51,611][15401] InferenceWorker_p0-w0: resuming experience collection (150150 times) [2024-06-24 08:02:52,743][15401] Updated weights for policy 0, policy_version 618561 (0.0033) [2024-06-24 08:02:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10134503424. Throughput: 0: 43056.5. Samples: 10134602620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 08:02:56,712][15401] Updated weights for policy 0, policy_version 618571 (0.0035) [2024-06-24 08:02:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10134732800. Throughput: 0: 42810.8. Samples: 10134853220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:02:58,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-24 08:03:00,603][15401] Updated weights for policy 0, policy_version 618581 (0.0034) [2024-06-24 08:03:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 10134945792. Throughput: 0: 42989.6. Samples: 10135112700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:03,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 08:03:04,232][15401] Updated weights for policy 0, policy_version 618591 (0.0033) [2024-06-24 08:03:08,326][15401] Updated weights for policy 0, policy_version 618601 (0.0034) [2024-06-24 08:03:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10135158784. Throughput: 0: 43008.8. Samples: 10135245160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 08:03:11,756][15401] Updated weights for policy 0, policy_version 618611 (0.0044) [2024-06-24 08:03:13,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10135371776. Throughput: 0: 43024.3. Samples: 10135500840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 08:03:15,763][15401] Updated weights for policy 0, policy_version 618621 (0.0034) [2024-06-24 08:03:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10135584768. Throughput: 0: 43045.3. Samples: 10135758380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:18,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-24 08:03:19,321][15401] Updated weights for policy 0, policy_version 618631 (0.0034) [2024-06-24 08:03:23,257][15401] Updated weights for policy 0, policy_version 618641 (0.0036) [2024-06-24 08:03:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10135814144. Throughput: 0: 42911.6. Samples: 10135888260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 08:03:27,183][15401] Updated weights for policy 0, policy_version 618651 (0.0033) [2024-06-24 08:03:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10136010752. Throughput: 0: 42968.4. Samples: 10136147740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 08:03:30,793][15401] Updated weights for policy 0, policy_version 618661 (0.0038) [2024-06-24 08:03:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10136223744. Throughput: 0: 42815.6. Samples: 10136397240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 08:03:34,785][15401] Updated weights for policy 0, policy_version 618671 (0.0031) [2024-06-24 08:03:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10136453120. Throughput: 0: 42780.9. Samples: 10136527760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:03:38,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 08:03:38,629][15401] Updated weights for policy 0, policy_version 618681 (0.0041) [2024-06-24 08:03:42,370][15401] Updated weights for policy 0, policy_version 618691 (0.0027) [2024-06-24 08:03:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10136666112. Throughput: 0: 42894.5. Samples: 10136783480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:03:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 08:03:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618693_10136666112.pth... [2024-06-24 08:03:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618066_10126393344.pth [2024-06-24 08:03:46,156][15401] Updated weights for policy 0, policy_version 618701 (0.0021) [2024-06-24 08:03:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10136879104. Throughput: 0: 42887.1. Samples: 10137042620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:03:48,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 08:03:50,018][15401] Updated weights for policy 0, policy_version 618711 (0.0031) [2024-06-24 08:03:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10137092096. Throughput: 0: 42860.9. Samples: 10137173900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:03:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 08:03:53,776][15401] Updated weights for policy 0, policy_version 618721 (0.0031) [2024-06-24 08:03:57,669][15401] Updated weights for policy 0, policy_version 618731 (0.0040) [2024-06-24 08:03:58,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10137305088. Throughput: 0: 42967.3. Samples: 10137434360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:03:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 08:04:01,403][15401] Updated weights for policy 0, policy_version 618741 (0.0030) [2024-06-24 08:04:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 10137534464. Throughput: 0: 42972.8. Samples: 10137692160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 08:04:05,298][15401] Updated weights for policy 0, policy_version 618751 (0.0034) [2024-06-24 08:04:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10137731072. Throughput: 0: 42809.0. Samples: 10137814660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:08,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 08:04:09,099][15401] Updated weights for policy 0, policy_version 618761 (0.0036) [2024-06-24 08:04:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42765.1). Total num frames: 10137927680. Throughput: 0: 42719.6. Samples: 10138070120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:13,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-24 08:04:13,486][15401] Updated weights for policy 0, policy_version 618771 (0.0022) [2024-06-24 08:04:16,917][15401] Updated weights for policy 0, policy_version 618781 (0.0026) [2024-06-24 08:04:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10138173440. Throughput: 0: 42787.5. Samples: 10138322680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 08:04:21,242][15401] Updated weights for policy 0, policy_version 618791 (0.0030) [2024-06-24 08:04:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42765.1). Total num frames: 10138370048. Throughput: 0: 42901.8. Samples: 10138458340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 08:04:23,945][15349] Signal inference workers to stop experience collection... (150200 times) [2024-06-24 08:04:23,945][15349] Signal inference workers to resume experience collection... (150200 times) [2024-06-24 08:04:23,992][15401] InferenceWorker_p0-w0: stopping experience collection (150200 times) [2024-06-24 08:04:23,993][15401] InferenceWorker_p0-w0: resuming experience collection (150200 times) [2024-06-24 08:04:24,536][15401] Updated weights for policy 0, policy_version 618801 (0.0042) [2024-06-24 08:04:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10138583040. Throughput: 0: 42830.8. Samples: 10138710860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 08:04:28,729][15401] Updated weights for policy 0, policy_version 618811 (0.0023) [2024-06-24 08:04:32,486][15401] Updated weights for policy 0, policy_version 618821 (0.0030) [2024-06-24 08:04:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10138796032. Throughput: 0: 42679.7. Samples: 10138963100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 08:04:36,596][15401] Updated weights for policy 0, policy_version 618831 (0.0031) [2024-06-24 08:04:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10138992640. Throughput: 0: 42599.3. Samples: 10139090860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 08:04:40,150][15401] Updated weights for policy 0, policy_version 618841 (0.0041) [2024-06-24 08:04:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10139222016. Throughput: 0: 42460.8. Samples: 10139345100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 08:04:44,143][15401] Updated weights for policy 0, policy_version 618851 (0.0037) [2024-06-24 08:04:47,868][15401] Updated weights for policy 0, policy_version 618861 (0.0044) [2024-06-24 08:04:48,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 10139435008. Throughput: 0: 42456.1. Samples: 10139602680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 08:04:51,588][15401] Updated weights for policy 0, policy_version 618871 (0.0024) [2024-06-24 08:04:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10139648000. Throughput: 0: 42475.6. Samples: 10139726060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 08:04:55,396][15401] Updated weights for policy 0, policy_version 618881 (0.0037) [2024-06-24 08:04:58,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42866.8, 300 sec: 42820.0). Total num frames: 10139877376. Throughput: 0: 42559.6. Samples: 10139985580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:04:58,397][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 08:04:59,061][15401] Updated weights for policy 0, policy_version 618891 (0.0042) [2024-06-24 08:05:03,034][15401] Updated weights for policy 0, policy_version 618901 (0.0035) [2024-06-24 08:05:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10140073984. Throughput: 0: 42571.1. Samples: 10140238380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:05:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 08:05:06,615][15401] Updated weights for policy 0, policy_version 618911 (0.0037) [2024-06-24 08:05:08,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10140286976. Throughput: 0: 42516.7. Samples: 10140371600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:05:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 08:05:10,559][15401] Updated weights for policy 0, policy_version 618921 (0.0033) [2024-06-24 08:05:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10140516352. Throughput: 0: 42659.9. Samples: 10140630560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 08:05:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 08:05:14,621][15401] Updated weights for policy 0, policy_version 618931 (0.0045) [2024-06-24 08:05:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10140712960. Throughput: 0: 42681.7. Samples: 10140883780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 08:05:18,558][15401] Updated weights for policy 0, policy_version 618941 (0.0041) [2024-06-24 08:05:22,100][15401] Updated weights for policy 0, policy_version 618951 (0.0037) [2024-06-24 08:05:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10140925952. Throughput: 0: 42664.4. Samples: 10141010760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:05:26,178][15401] Updated weights for policy 0, policy_version 618961 (0.0029) [2024-06-24 08:05:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 10141155328. Throughput: 0: 42610.2. Samples: 10141262560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 08:05:29,674][15401] Updated weights for policy 0, policy_version 618971 (0.0023) [2024-06-24 08:05:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10141335552. Throughput: 0: 42798.8. Samples: 10141528620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 08:05:33,706][15401] Updated weights for policy 0, policy_version 618981 (0.0032) [2024-06-24 08:05:37,509][15401] Updated weights for policy 0, policy_version 618991 (0.0036) [2024-06-24 08:05:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10141564928. Throughput: 0: 42769.0. Samples: 10141650660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 08:05:41,742][15401] Updated weights for policy 0, policy_version 619001 (0.0031) [2024-06-24 08:05:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10141794304. Throughput: 0: 42658.5. Samples: 10141904940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 08:05:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619006_10141794304.pth... [2024-06-24 08:05:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618380_10131537920.pth [2024-06-24 08:05:45,142][15401] Updated weights for policy 0, policy_version 619011 (0.0034) [2024-06-24 08:05:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10141990912. Throughput: 0: 42748.1. Samples: 10142162040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 08:05:49,490][15401] Updated weights for policy 0, policy_version 619021 (0.0031) [2024-06-24 08:05:52,768][15401] Updated weights for policy 0, policy_version 619031 (0.0038) [2024-06-24 08:05:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10142203904. Throughput: 0: 42480.6. Samples: 10142283220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 08:05:57,003][15401] Updated weights for policy 0, policy_version 619041 (0.0027) [2024-06-24 08:05:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 10142433280. Throughput: 0: 42572.5. Samples: 10142546320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:05:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 08:06:00,010][15349] Signal inference workers to stop experience collection... (150250 times) [2024-06-24 08:06:00,010][15349] Signal inference workers to resume experience collection... (150250 times) [2024-06-24 08:06:00,024][15401] InferenceWorker_p0-w0: stopping experience collection (150250 times) [2024-06-24 08:06:00,024][15401] InferenceWorker_p0-w0: resuming experience collection (150250 times) [2024-06-24 08:06:00,314][15401] Updated weights for policy 0, policy_version 619051 (0.0036) [2024-06-24 08:06:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10142629888. Throughput: 0: 42537.7. Samples: 10142797980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 08:06:04,622][15401] Updated weights for policy 0, policy_version 619061 (0.0032) [2024-06-24 08:06:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 10142842880. Throughput: 0: 42590.7. Samples: 10142927340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 08:06:08,413][15401] Updated weights for policy 0, policy_version 619071 (0.0034) [2024-06-24 08:06:12,361][15401] Updated weights for policy 0, policy_version 619081 (0.0030) [2024-06-24 08:06:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10143055872. Throughput: 0: 42748.1. Samples: 10143186220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 08:06:15,853][15401] Updated weights for policy 0, policy_version 619091 (0.0026) [2024-06-24 08:06:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10143268864. Throughput: 0: 42577.3. Samples: 10143444600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 08:06:19,877][15401] Updated weights for policy 0, policy_version 619101 (0.0030) [2024-06-24 08:06:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10143498240. Throughput: 0: 42657.6. Samples: 10143570260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 08:06:23,502][15401] Updated weights for policy 0, policy_version 619111 (0.0023) [2024-06-24 08:06:27,565][15401] Updated weights for policy 0, policy_version 619121 (0.0034) [2024-06-24 08:06:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10143711232. Throughput: 0: 42745.4. Samples: 10143828480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 08:06:31,256][15401] Updated weights for policy 0, policy_version 619131 (0.0029) [2024-06-24 08:06:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10143924224. Throughput: 0: 42728.8. Samples: 10144084840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 08:06:35,256][15401] Updated weights for policy 0, policy_version 619141 (0.0037) [2024-06-24 08:06:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 10144120832. Throughput: 0: 42807.5. Samples: 10144209660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:38,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 08:06:39,013][15401] Updated weights for policy 0, policy_version 619151 (0.0031) [2024-06-24 08:06:42,846][15401] Updated weights for policy 0, policy_version 619161 (0.0039) [2024-06-24 08:06:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10144333824. Throughput: 0: 42644.0. Samples: 10144465300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 08:06:46,783][15401] Updated weights for policy 0, policy_version 619171 (0.0026) [2024-06-24 08:06:48,389][15132] Fps is (10 sec: 44248.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10144563200. Throughput: 0: 42582.5. Samples: 10144714180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:06:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 08:06:50,946][15401] Updated weights for policy 0, policy_version 619181 (0.0029) [2024-06-24 08:06:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10144759808. Throughput: 0: 42547.1. Samples: 10144841960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:06:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 08:06:54,311][15401] Updated weights for policy 0, policy_version 619191 (0.0028) [2024-06-24 08:06:58,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10144972800. Throughput: 0: 42494.2. Samples: 10145098460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:06:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 08:06:58,400][15401] Updated weights for policy 0, policy_version 619201 (0.0032) [2024-06-24 08:07:02,218][15401] Updated weights for policy 0, policy_version 619211 (0.0038) [2024-06-24 08:07:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10145202176. Throughput: 0: 42528.9. Samples: 10145358400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 08:07:06,340][15401] Updated weights for policy 0, policy_version 619221 (0.0029) [2024-06-24 08:07:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 10145415168. Throughput: 0: 42589.0. Samples: 10145486760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:08,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-24 08:07:09,767][15401] Updated weights for policy 0, policy_version 619231 (0.0035) [2024-06-24 08:07:13,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 10145611776. Throughput: 0: 42564.1. Samples: 10145744140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:13,396][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 08:07:13,915][15401] Updated weights for policy 0, policy_version 619241 (0.0043) [2024-06-24 08:07:17,506][15401] Updated weights for policy 0, policy_version 619251 (0.0034) [2024-06-24 08:07:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10145824768. Throughput: 0: 42537.8. Samples: 10145999040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:18,404][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 08:07:21,572][15401] Updated weights for policy 0, policy_version 619261 (0.0032) [2024-06-24 08:07:23,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10146054144. Throughput: 0: 42660.5. Samples: 10146129280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 08:07:24,998][15401] Updated weights for policy 0, policy_version 619271 (0.0035) [2024-06-24 08:07:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10146250752. Throughput: 0: 42650.7. Samples: 10146384580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 08:07:29,208][15401] Updated weights for policy 0, policy_version 619281 (0.0030) [2024-06-24 08:07:32,094][15349] Signal inference workers to stop experience collection... (150300 times) [2024-06-24 08:07:32,100][15349] Signal inference workers to resume experience collection... (150300 times) [2024-06-24 08:07:32,108][15401] InferenceWorker_p0-w0: stopping experience collection (150300 times) [2024-06-24 08:07:32,137][15401] InferenceWorker_p0-w0: resuming experience collection (150300 times) [2024-06-24 08:07:32,902][15401] Updated weights for policy 0, policy_version 619291 (0.0038) [2024-06-24 08:07:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10146480128. Throughput: 0: 42847.4. Samples: 10146642320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 08:07:36,750][15401] Updated weights for policy 0, policy_version 619301 (0.0038) [2024-06-24 08:07:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 10146709504. Throughput: 0: 42960.8. Samples: 10146775200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 08:07:40,439][15401] Updated weights for policy 0, policy_version 619311 (0.0026) [2024-06-24 08:07:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10146906112. Throughput: 0: 42816.7. Samples: 10147025220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 08:07:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619318_10146906112.pth... [2024-06-24 08:07:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000618693_10136666112.pth [2024-06-24 08:07:44,406][15401] Updated weights for policy 0, policy_version 619321 (0.0034) [2024-06-24 08:07:48,060][15401] Updated weights for policy 0, policy_version 619331 (0.0036) [2024-06-24 08:07:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 10147119104. Throughput: 0: 42719.5. Samples: 10147280880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:48,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 08:07:52,022][15401] Updated weights for policy 0, policy_version 619341 (0.0050) [2024-06-24 08:07:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10147332096. Throughput: 0: 42808.8. Samples: 10147413160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 08:07:55,884][15401] Updated weights for policy 0, policy_version 619351 (0.0043) [2024-06-24 08:07:58,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 10147528704. Throughput: 0: 42746.0. Samples: 10147667440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:07:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 08:07:59,754][15401] Updated weights for policy 0, policy_version 619361 (0.0027) [2024-06-24 08:08:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10147758080. Throughput: 0: 42793.8. Samples: 10147924760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:08:03,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 08:08:03,807][15401] Updated weights for policy 0, policy_version 619371 (0.0033) [2024-06-24 08:08:07,267][15401] Updated weights for policy 0, policy_version 619381 (0.0037) [2024-06-24 08:08:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10147971072. Throughput: 0: 42729.4. Samples: 10148052100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:08:08,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 08:08:11,404][15401] Updated weights for policy 0, policy_version 619391 (0.0045) [2024-06-24 08:08:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 10148167680. Throughput: 0: 42756.8. Samples: 10148308640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:08:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 08:08:14,778][15401] Updated weights for policy 0, policy_version 619401 (0.0037) [2024-06-24 08:08:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10148380672. Throughput: 0: 42897.9. Samples: 10148572720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:08:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 08:08:18,992][15401] Updated weights for policy 0, policy_version 619411 (0.0032) [2024-06-24 08:08:22,863][15401] Updated weights for policy 0, policy_version 619421 (0.0048) [2024-06-24 08:08:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10148626432. Throughput: 0: 42621.8. Samples: 10148693180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 08:08:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 08:08:26,599][15401] Updated weights for policy 0, policy_version 619431 (0.0034) [2024-06-24 08:08:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10148823040. Throughput: 0: 42814.3. Samples: 10148951860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 08:08:30,503][15401] Updated weights for policy 0, policy_version 619441 (0.0042) [2024-06-24 08:08:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10149036032. Throughput: 0: 42800.4. Samples: 10149206800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 08:08:34,175][15401] Updated weights for policy 0, policy_version 619451 (0.0030) [2024-06-24 08:08:38,036][15401] Updated weights for policy 0, policy_version 619461 (0.0034) [2024-06-24 08:08:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 10149249024. Throughput: 0: 42727.4. Samples: 10149336000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:38,393][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 08:08:41,456][15349] Signal inference workers to stop experience collection... (150350 times) [2024-06-24 08:08:41,485][15401] InferenceWorker_p0-w0: stopping experience collection (150350 times) [2024-06-24 08:08:41,512][15349] Signal inference workers to resume experience collection... (150350 times) [2024-06-24 08:08:41,512][15401] InferenceWorker_p0-w0: resuming experience collection (150350 times) [2024-06-24 08:08:41,939][15401] Updated weights for policy 0, policy_version 619471 (0.0037) [2024-06-24 08:08:43,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 10149462016. Throughput: 0: 42680.9. Samples: 10149588180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:43,393][15132] Avg episode reward: [(0, '0.232')] [2024-06-24 08:08:45,623][15401] Updated weights for policy 0, policy_version 619481 (0.0031) [2024-06-24 08:08:48,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 10149691392. Throughput: 0: 42572.0. Samples: 10149840500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 08:08:49,965][15401] Updated weights for policy 0, policy_version 619491 (0.0028) [2024-06-24 08:08:53,267][15401] Updated weights for policy 0, policy_version 619501 (0.0034) [2024-06-24 08:08:53,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10149904384. Throughput: 0: 42637.7. Samples: 10149970800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:53,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 08:08:57,645][15401] Updated weights for policy 0, policy_version 619511 (0.0038) [2024-06-24 08:08:58,393][15132] Fps is (10 sec: 42584.6, 60 sec: 43142.3, 300 sec: 42653.5). Total num frames: 10150117376. Throughput: 0: 42829.8. Samples: 10150236120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:08:58,393][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 08:09:00,944][15401] Updated weights for policy 0, policy_version 619521 (0.0035) [2024-06-24 08:09:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10150313984. Throughput: 0: 42539.0. Samples: 10150486980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 08:09:05,325][15401] Updated weights for policy 0, policy_version 619531 (0.0036) [2024-06-24 08:09:08,390][15132] Fps is (10 sec: 42611.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10150543360. Throughput: 0: 42615.8. Samples: 10150610900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 08:09:08,851][15401] Updated weights for policy 0, policy_version 619541 (0.0028) [2024-06-24 08:09:13,016][15401] Updated weights for policy 0, policy_version 619551 (0.0027) [2024-06-24 08:09:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 10150739968. Throughput: 0: 42610.2. Samples: 10150869420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:13,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 08:09:16,687][15401] Updated weights for policy 0, policy_version 619561 (0.0028) [2024-06-24 08:09:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10150952960. Throughput: 0: 42443.6. Samples: 10151116760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:18,394][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 08:09:20,772][15401] Updated weights for policy 0, policy_version 619571 (0.0038) [2024-06-24 08:09:23,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10151165952. Throughput: 0: 42302.2. Samples: 10151239500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 08:09:24,586][15401] Updated weights for policy 0, policy_version 619581 (0.0049) [2024-06-24 08:09:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10151378944. Throughput: 0: 42469.9. Samples: 10151499220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 08:09:28,394][15401] Updated weights for policy 0, policy_version 619591 (0.0032) [2024-06-24 08:09:32,059][15401] Updated weights for policy 0, policy_version 619601 (0.0029) [2024-06-24 08:09:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 10151575552. Throughput: 0: 42561.9. Samples: 10151755780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 08:09:35,892][15401] Updated weights for policy 0, policy_version 619611 (0.0031) [2024-06-24 08:09:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 10151821312. Throughput: 0: 42481.3. Samples: 10151882460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 08:09:39,677][15401] Updated weights for policy 0, policy_version 619621 (0.0037) [2024-06-24 08:09:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 10152017920. Throughput: 0: 42308.3. Samples: 10152139860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 08:09:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619630_10152017920.pth... [2024-06-24 08:09:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619006_10141794304.pth [2024-06-24 08:09:43,636][15401] Updated weights for policy 0, policy_version 619631 (0.0032) [2024-06-24 08:09:47,272][15401] Updated weights for policy 0, policy_version 619641 (0.0033) [2024-06-24 08:09:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10152214528. Throughput: 0: 42450.7. Samples: 10152397260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 08:09:51,591][15401] Updated weights for policy 0, policy_version 619651 (0.0036) [2024-06-24 08:09:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 10152443904. Throughput: 0: 42493.9. Samples: 10152523120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 08:09:55,402][15401] Updated weights for policy 0, policy_version 619661 (0.0033) [2024-06-24 08:09:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42327.6, 300 sec: 42653.9). Total num frames: 10152656896. Throughput: 0: 42471.5. Samples: 10152780540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 08:09:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 08:09:59,053][15401] Updated weights for policy 0, policy_version 619671 (0.0038) [2024-06-24 08:10:02,986][15401] Updated weights for policy 0, policy_version 619681 (0.0035) [2024-06-24 08:10:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 10152853504. Throughput: 0: 42437.4. Samples: 10153026540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:03,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 08:10:03,534][15349] Signal inference workers to stop experience collection... (150400 times) [2024-06-24 08:10:03,581][15401] InferenceWorker_p0-w0: stopping experience collection (150400 times) [2024-06-24 08:10:03,589][15349] Signal inference workers to resume experience collection... (150400 times) [2024-06-24 08:10:03,604][15401] InferenceWorker_p0-w0: resuming experience collection (150400 times) [2024-06-24 08:10:06,746][15401] Updated weights for policy 0, policy_version 619691 (0.0025) [2024-06-24 08:10:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10153082880. Throughput: 0: 42644.1. Samples: 10153158480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 08:10:10,526][15401] Updated weights for policy 0, policy_version 619701 (0.0046) [2024-06-24 08:10:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 10153295872. Throughput: 0: 42607.5. Samples: 10153416560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 08:10:14,218][15401] Updated weights for policy 0, policy_version 619711 (0.0041) [2024-06-24 08:10:18,224][15401] Updated weights for policy 0, policy_version 619721 (0.0039) [2024-06-24 08:10:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10153508864. Throughput: 0: 42713.2. Samples: 10153677880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 08:10:21,781][15401] Updated weights for policy 0, policy_version 619731 (0.0044) [2024-06-24 08:10:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10153721856. Throughput: 0: 42796.2. Samples: 10153808280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 08:10:26,066][15401] Updated weights for policy 0, policy_version 619741 (0.0038) [2024-06-24 08:10:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10153951232. Throughput: 0: 42660.5. Samples: 10154059580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 08:10:29,301][15401] Updated weights for policy 0, policy_version 619751 (0.0034) [2024-06-24 08:10:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10154147840. Throughput: 0: 42681.6. Samples: 10154317940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:33,391][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 08:10:33,570][15401] Updated weights for policy 0, policy_version 619761 (0.0027) [2024-06-24 08:10:37,288][15401] Updated weights for policy 0, policy_version 619771 (0.0031) [2024-06-24 08:10:38,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10154377216. Throughput: 0: 42684.8. Samples: 10154444040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:38,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:10:41,081][15401] Updated weights for policy 0, policy_version 619781 (0.0032) [2024-06-24 08:10:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10154573824. Throughput: 0: 42706.3. Samples: 10154702320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:43,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 08:10:44,914][15401] Updated weights for policy 0, policy_version 619791 (0.0037) [2024-06-24 08:10:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10154803200. Throughput: 0: 42944.9. Samples: 10154958960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:48,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 08:10:48,710][15401] Updated weights for policy 0, policy_version 619801 (0.0036) [2024-06-24 08:10:52,639][15401] Updated weights for policy 0, policy_version 619811 (0.0032) [2024-06-24 08:10:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10154999808. Throughput: 0: 42883.2. Samples: 10155088220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 08:10:56,585][15401] Updated weights for policy 0, policy_version 619821 (0.0046) [2024-06-24 08:10:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10155229184. Throughput: 0: 42708.9. Samples: 10155338460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:10:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 08:11:00,394][15401] Updated weights for policy 0, policy_version 619831 (0.0035) [2024-06-24 08:11:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 10155425792. Throughput: 0: 42746.1. Samples: 10155601460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 08:11:04,123][15401] Updated weights for policy 0, policy_version 619841 (0.0037) [2024-06-24 08:11:08,093][15401] Updated weights for policy 0, policy_version 619851 (0.0043) [2024-06-24 08:11:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10155655168. Throughput: 0: 42589.2. Samples: 10155724800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 08:11:11,717][15401] Updated weights for policy 0, policy_version 619861 (0.0031) [2024-06-24 08:11:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10155868160. Throughput: 0: 42662.2. Samples: 10155979380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 08:11:15,783][15401] Updated weights for policy 0, policy_version 619871 (0.0035) [2024-06-24 08:11:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10156064768. Throughput: 0: 42662.9. Samples: 10156237760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 08:11:19,412][15401] Updated weights for policy 0, policy_version 619881 (0.0038) [2024-06-24 08:11:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10156277760. Throughput: 0: 42596.4. Samples: 10156360780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 08:11:23,730][15401] Updated weights for policy 0, policy_version 619891 (0.0028) [2024-06-24 08:11:27,154][15401] Updated weights for policy 0, policy_version 619901 (0.0024) [2024-06-24 08:11:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10156507136. Throughput: 0: 42668.8. Samples: 10156622420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 08:11:31,140][15401] Updated weights for policy 0, policy_version 619911 (0.0033) [2024-06-24 08:11:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 10156703744. Throughput: 0: 42801.5. Samples: 10156885020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 08:11:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 08:11:34,615][15401] Updated weights for policy 0, policy_version 619921 (0.0039) [2024-06-24 08:11:35,570][15349] Signal inference workers to stop experience collection... (150450 times) [2024-06-24 08:11:35,597][15401] InferenceWorker_p0-w0: stopping experience collection (150450 times) [2024-06-24 08:11:35,682][15349] Signal inference workers to resume experience collection... (150450 times) [2024-06-24 08:11:35,682][15401] InferenceWorker_p0-w0: resuming experience collection (150450 times) [2024-06-24 08:11:38,394][15132] Fps is (10 sec: 42580.5, 60 sec: 42597.0, 300 sec: 42708.9). Total num frames: 10156933120. Throughput: 0: 42643.4. Samples: 10157007360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:11:38,394][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 08:11:38,708][15401] Updated weights for policy 0, policy_version 619931 (0.0027) [2024-06-24 08:11:42,408][15401] Updated weights for policy 0, policy_version 619941 (0.0029) [2024-06-24 08:11:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10157162496. Throughput: 0: 42847.6. Samples: 10157266600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:11:43,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 08:11:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619945_10157178880.pth... [2024-06-24 08:11:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619318_10146906112.pth [2024-06-24 08:11:46,227][15401] Updated weights for policy 0, policy_version 619951 (0.0032) [2024-06-24 08:11:48,390][15132] Fps is (10 sec: 42616.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10157359104. Throughput: 0: 42793.0. Samples: 10157527140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:11:48,393][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 08:11:49,922][15401] Updated weights for policy 0, policy_version 619961 (0.0039) [2024-06-24 08:11:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10157555712. Throughput: 0: 42677.4. Samples: 10157645280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:11:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 08:11:54,148][15401] Updated weights for policy 0, policy_version 619971 (0.0037) [2024-06-24 08:11:57,737][15401] Updated weights for policy 0, policy_version 619981 (0.0037) [2024-06-24 08:11:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10157817856. Throughput: 0: 42877.3. Samples: 10157908860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:11:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 08:12:01,672][15401] Updated weights for policy 0, policy_version 619991 (0.0036) [2024-06-24 08:12:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10157981696. Throughput: 0: 42950.6. Samples: 10158170540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:12:05,407][15401] Updated weights for policy 0, policy_version 620001 (0.0028) [2024-06-24 08:12:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 10158211072. Throughput: 0: 42864.1. Samples: 10158289660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 08:12:09,287][15401] Updated weights for policy 0, policy_version 620011 (0.0048) [2024-06-24 08:12:12,904][15401] Updated weights for policy 0, policy_version 620021 (0.0025) [2024-06-24 08:12:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10158456832. Throughput: 0: 42780.6. Samples: 10158547540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 08:12:16,910][15401] Updated weights for policy 0, policy_version 620031 (0.0033) [2024-06-24 08:12:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 10158604288. Throughput: 0: 42812.8. Samples: 10158811600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 08:12:20,490][15401] Updated weights for policy 0, policy_version 620041 (0.0037) [2024-06-24 08:12:23,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10158833664. Throughput: 0: 42571.1. Samples: 10158922880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:23,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 08:12:24,816][15401] Updated weights for policy 0, policy_version 620051 (0.0039) [2024-06-24 08:12:28,178][15401] Updated weights for policy 0, policy_version 620061 (0.0053) [2024-06-24 08:12:28,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 10159095808. Throughput: 0: 42542.2. Samples: 10159181000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 08:12:32,690][15401] Updated weights for policy 0, policy_version 620071 (0.0035) [2024-06-24 08:12:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 10159243264. Throughput: 0: 42382.2. Samples: 10159434340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:33,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 08:12:35,831][15401] Updated weights for policy 0, policy_version 620081 (0.0029) [2024-06-24 08:12:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42601.5, 300 sec: 42654.0). Total num frames: 10159489024. Throughput: 0: 42429.8. Samples: 10159554620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 08:12:40,508][15401] Updated weights for policy 0, policy_version 620091 (0.0022) [2024-06-24 08:12:43,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 10159718400. Throughput: 0: 42417.4. Samples: 10159817740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:43,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 08:12:43,675][15401] Updated weights for policy 0, policy_version 620101 (0.0028) [2024-06-24 08:12:48,301][15401] Updated weights for policy 0, policy_version 620111 (0.0031) [2024-06-24 08:12:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 10159898624. Throughput: 0: 42331.0. Samples: 10160075540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:48,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 08:12:51,267][15401] Updated weights for policy 0, policy_version 620121 (0.0037) [2024-06-24 08:12:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10160128000. Throughput: 0: 42365.4. Samples: 10160196100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 08:12:56,010][15401] Updated weights for policy 0, policy_version 620131 (0.0038) [2024-06-24 08:12:57,902][15349] Signal inference workers to stop experience collection... (150500 times) [2024-06-24 08:12:57,943][15401] InferenceWorker_p0-w0: stopping experience collection (150500 times) [2024-06-24 08:12:57,953][15349] Signal inference workers to resume experience collection... (150500 times) [2024-06-24 08:12:57,957][15401] InferenceWorker_p0-w0: resuming experience collection (150500 times) [2024-06-24 08:12:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10160340992. Throughput: 0: 42408.2. Samples: 10160455920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:12:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 08:12:59,174][15401] Updated weights for policy 0, policy_version 620141 (0.0041) [2024-06-24 08:13:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10160537600. Throughput: 0: 42340.9. Samples: 10160716940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:13:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 08:13:03,577][15401] Updated weights for policy 0, policy_version 620151 (0.0031) [2024-06-24 08:13:06,746][15401] Updated weights for policy 0, policy_version 620161 (0.0029) [2024-06-24 08:13:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10160783360. Throughput: 0: 42621.8. Samples: 10160840860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 08:13:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 08:13:11,501][15401] Updated weights for policy 0, policy_version 620171 (0.0045) [2024-06-24 08:13:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 10160963584. Throughput: 0: 42537.2. Samples: 10161095180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:13,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 08:13:14,397][15401] Updated weights for policy 0, policy_version 620181 (0.0033) [2024-06-24 08:13:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10161176576. Throughput: 0: 42676.5. Samples: 10161354780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 08:13:19,181][15401] Updated weights for policy 0, policy_version 620191 (0.0034) [2024-06-24 08:13:22,084][15401] Updated weights for policy 0, policy_version 620201 (0.0033) [2024-06-24 08:13:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10161405952. Throughput: 0: 42761.3. Samples: 10161478880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 08:13:26,662][15401] Updated weights for policy 0, policy_version 620211 (0.0027) [2024-06-24 08:13:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10161618944. Throughput: 0: 42663.7. Samples: 10161737500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 08:13:29,755][15401] Updated weights for policy 0, policy_version 620221 (0.0039) [2024-06-24 08:13:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42598.4). Total num frames: 10161815552. Throughput: 0: 42640.9. Samples: 10161994380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:33,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 08:13:34,205][15401] Updated weights for policy 0, policy_version 620231 (0.0034) [2024-06-24 08:13:37,386][15401] Updated weights for policy 0, policy_version 620241 (0.0044) [2024-06-24 08:13:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 10162061312. Throughput: 0: 42798.6. Samples: 10162122140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:38,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 08:13:41,598][15401] Updated weights for policy 0, policy_version 620251 (0.0035) [2024-06-24 08:13:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 10162241536. Throughput: 0: 42814.4. Samples: 10162382560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 08:13:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620255_10162257920.pth... [2024-06-24 08:13:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619630_10152017920.pth [2024-06-24 08:13:45,426][15401] Updated weights for policy 0, policy_version 620261 (0.0030) [2024-06-24 08:13:48,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 10162454528. Throughput: 0: 42648.9. Samples: 10162636140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 08:13:49,160][15401] Updated weights for policy 0, policy_version 620271 (0.0026) [2024-06-24 08:13:53,018][15401] Updated weights for policy 0, policy_version 620281 (0.0033) [2024-06-24 08:13:53,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42709.6). Total num frames: 10162716672. Throughput: 0: 42636.1. Samples: 10162759580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:53,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 08:13:56,710][15401] Updated weights for policy 0, policy_version 620291 (0.0033) [2024-06-24 08:13:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10162896896. Throughput: 0: 42660.4. Samples: 10163014900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:13:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 08:14:00,799][15401] Updated weights for policy 0, policy_version 620301 (0.0046) [2024-06-24 08:14:03,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10163109888. Throughput: 0: 42570.6. Samples: 10163270460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 08:14:04,951][15401] Updated weights for policy 0, policy_version 620311 (0.0034) [2024-06-24 08:14:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 10163339264. Throughput: 0: 42636.0. Samples: 10163397500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 08:14:08,397][15401] Updated weights for policy 0, policy_version 620321 (0.0038) [2024-06-24 08:14:12,699][15401] Updated weights for policy 0, policy_version 620331 (0.0046) [2024-06-24 08:14:13,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 10163535872. Throughput: 0: 42774.6. Samples: 10163662460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:13,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 08:14:15,832][15401] Updated weights for policy 0, policy_version 620341 (0.0030) [2024-06-24 08:14:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10163732480. Throughput: 0: 42783.1. Samples: 10163919520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 08:14:20,294][15401] Updated weights for policy 0, policy_version 620351 (0.0035) [2024-06-24 08:14:23,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10163978240. Throughput: 0: 42732.1. Samples: 10164044980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 08:14:23,484][15401] Updated weights for policy 0, policy_version 620361 (0.0032) [2024-06-24 08:14:26,966][15349] Signal inference workers to stop experience collection... (150550 times) [2024-06-24 08:14:27,000][15401] InferenceWorker_p0-w0: stopping experience collection (150550 times) [2024-06-24 08:14:27,023][15349] Signal inference workers to resume experience collection... (150550 times) [2024-06-24 08:14:27,024][15401] InferenceWorker_p0-w0: resuming experience collection (150550 times) [2024-06-24 08:14:28,199][15401] Updated weights for policy 0, policy_version 620371 (0.0023) [2024-06-24 08:14:28,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 10164158464. Throughput: 0: 42614.8. Samples: 10164300500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:28,396][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 08:14:31,130][15401] Updated weights for policy 0, policy_version 620381 (0.0032) [2024-06-24 08:14:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10164387840. Throughput: 0: 42688.0. Samples: 10164557100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 08:14:35,676][15401] Updated weights for policy 0, policy_version 620391 (0.0041) [2024-06-24 08:14:38,389][15132] Fps is (10 sec: 45905.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10164617216. Throughput: 0: 42769.5. Samples: 10164684100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:38,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 08:14:38,725][15401] Updated weights for policy 0, policy_version 620401 (0.0038) [2024-06-24 08:14:43,242][15401] Updated weights for policy 0, policy_version 620411 (0.0028) [2024-06-24 08:14:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 10164813824. Throughput: 0: 43014.6. Samples: 10164950560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 08:14:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 08:14:46,248][15401] Updated weights for policy 0, policy_version 620421 (0.0038) [2024-06-24 08:14:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10165043200. Throughput: 0: 42866.2. Samples: 10165199440. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:14:48,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 08:14:50,742][15401] Updated weights for policy 0, policy_version 620431 (0.0035) [2024-06-24 08:14:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 10165239808. Throughput: 0: 43009.9. Samples: 10165332940. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:14:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 08:14:54,215][15401] Updated weights for policy 0, policy_version 620441 (0.0034) [2024-06-24 08:14:58,225][15401] Updated weights for policy 0, policy_version 620451 (0.0037) [2024-06-24 08:14:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 10165469184. Throughput: 0: 42943.5. Samples: 10165594820. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:14:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 08:15:01,778][15401] Updated weights for policy 0, policy_version 620461 (0.0040) [2024-06-24 08:15:03,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10165698560. Throughput: 0: 42672.6. Samples: 10165839780. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 08:15:06,408][15401] Updated weights for policy 0, policy_version 620471 (0.0043) [2024-06-24 08:15:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10165895168. Throughput: 0: 43007.5. Samples: 10165980320. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 08:15:09,270][15401] Updated weights for policy 0, policy_version 620481 (0.0039) [2024-06-24 08:15:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 10166091776. Throughput: 0: 43052.2. Samples: 10166237580. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 08:15:13,955][15401] Updated weights for policy 0, policy_version 620491 (0.0037) [2024-06-24 08:15:17,079][15401] Updated weights for policy 0, policy_version 620501 (0.0029) [2024-06-24 08:15:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10166337536. Throughput: 0: 42826.2. Samples: 10166484280. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:18,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 08:15:21,567][15401] Updated weights for policy 0, policy_version 620511 (0.0038) [2024-06-24 08:15:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10166534144. Throughput: 0: 43086.9. Samples: 10166623020. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 08:15:24,760][15401] Updated weights for policy 0, policy_version 620521 (0.0048) [2024-06-24 08:15:28,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42874.3, 300 sec: 42653.6). Total num frames: 10166730752. Throughput: 0: 42732.5. Samples: 10166873620. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:28,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 08:15:29,086][15401] Updated weights for policy 0, policy_version 620531 (0.0039) [2024-06-24 08:15:32,473][15401] Updated weights for policy 0, policy_version 620541 (0.0028) [2024-06-24 08:15:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 10166976512. Throughput: 0: 42800.5. Samples: 10167125460. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 08:15:36,984][15401] Updated weights for policy 0, policy_version 620551 (0.0050) [2024-06-24 08:15:38,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10167189504. Throughput: 0: 42866.5. Samples: 10167261940. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 08:15:38,865][15349] Signal inference workers to stop experience collection... (150600 times) [2024-06-24 08:15:38,919][15401] InferenceWorker_p0-w0: stopping experience collection (150600 times) [2024-06-24 08:15:38,921][15349] Signal inference workers to resume experience collection... (150600 times) [2024-06-24 08:15:38,929][15401] InferenceWorker_p0-w0: resuming experience collection (150600 times) [2024-06-24 08:15:40,246][15401] Updated weights for policy 0, policy_version 620561 (0.0036) [2024-06-24 08:15:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10167386112. Throughput: 0: 42652.0. Samples: 10167514160. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 08:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620568_10167386112.pth... [2024-06-24 08:15:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000619945_10157178880.pth [2024-06-24 08:15:44,599][15401] Updated weights for policy 0, policy_version 620571 (0.0032) [2024-06-24 08:15:47,666][15401] Updated weights for policy 0, policy_version 620581 (0.0030) [2024-06-24 08:15:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10167631872. Throughput: 0: 42790.2. Samples: 10167765340. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 08:15:52,266][15401] Updated weights for policy 0, policy_version 620591 (0.0031) [2024-06-24 08:15:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10167812096. Throughput: 0: 42724.5. Samples: 10167902920. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 08:15:55,478][15401] Updated weights for policy 0, policy_version 620601 (0.0039) [2024-06-24 08:15:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 10168025088. Throughput: 0: 42676.6. Samples: 10168158120. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:15:58,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 08:15:59,695][15401] Updated weights for policy 0, policy_version 620611 (0.0039) [2024-06-24 08:16:03,216][15401] Updated weights for policy 0, policy_version 620621 (0.0037) [2024-06-24 08:16:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10168254464. Throughput: 0: 42971.6. Samples: 10168418000. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:16:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:16:07,066][15401] Updated weights for policy 0, policy_version 620631 (0.0021) [2024-06-24 08:16:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10168467456. Throughput: 0: 42826.7. Samples: 10168550220. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:16:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 08:16:10,773][15401] Updated weights for policy 0, policy_version 620641 (0.0037) [2024-06-24 08:16:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10168680448. Throughput: 0: 42864.4. Samples: 10168802420. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:16:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 08:16:14,932][15401] Updated weights for policy 0, policy_version 620651 (0.0042) [2024-06-24 08:16:18,354][15401] Updated weights for policy 0, policy_version 620661 (0.0028) [2024-06-24 08:16:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10168909824. Throughput: 0: 42961.2. Samples: 10169058720. Policy #0 lag: (min: 2.0, avg: 11.3, max: 25.0) [2024-06-24 08:16:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 08:16:22,665][15401] Updated weights for policy 0, policy_version 620671 (0.0027) [2024-06-24 08:16:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10169106432. Throughput: 0: 42859.2. Samples: 10169190600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 08:16:25,778][15401] Updated weights for policy 0, policy_version 620681 (0.0031) [2024-06-24 08:16:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 10169335808. Throughput: 0: 42978.2. Samples: 10169448180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 08:16:30,104][15401] Updated weights for policy 0, policy_version 620691 (0.0034) [2024-06-24 08:16:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 10169548800. Throughput: 0: 43124.4. Samples: 10169705940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 08:16:33,856][15401] Updated weights for policy 0, policy_version 620701 (0.0041) [2024-06-24 08:16:37,585][15401] Updated weights for policy 0, policy_version 620711 (0.0032) [2024-06-24 08:16:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10169729024. Throughput: 0: 42935.5. Samples: 10169835020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 08:16:41,510][15401] Updated weights for policy 0, policy_version 620721 (0.0027) [2024-06-24 08:16:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10169974784. Throughput: 0: 42935.6. Samples: 10170090120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 08:16:45,201][15401] Updated weights for policy 0, policy_version 620731 (0.0031) [2024-06-24 08:16:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10170204160. Throughput: 0: 43095.0. Samples: 10170357280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 08:16:49,097][15401] Updated weights for policy 0, policy_version 620741 (0.0028) [2024-06-24 08:16:52,988][15401] Updated weights for policy 0, policy_version 620751 (0.0038) [2024-06-24 08:16:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10170384384. Throughput: 0: 42986.3. Samples: 10170484600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 08:16:56,483][15401] Updated weights for policy 0, policy_version 620761 (0.0033) [2024-06-24 08:16:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 10170613760. Throughput: 0: 43017.0. Samples: 10170738180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:16:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 08:17:00,588][15401] Updated weights for policy 0, policy_version 620771 (0.0023) [2024-06-24 08:17:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10170843136. Throughput: 0: 43034.7. Samples: 10170995280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:03,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 08:17:04,451][15401] Updated weights for policy 0, policy_version 620781 (0.0039) [2024-06-24 08:17:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10171023360. Throughput: 0: 42923.7. Samples: 10171122160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 08:17:08,502][15401] Updated weights for policy 0, policy_version 620791 (0.0032) [2024-06-24 08:17:09,420][15349] Signal inference workers to stop experience collection... (150650 times) [2024-06-24 08:17:09,420][15349] Signal inference workers to resume experience collection... (150650 times) [2024-06-24 08:17:09,463][15401] InferenceWorker_p0-w0: stopping experience collection (150650 times) [2024-06-24 08:17:09,463][15401] InferenceWorker_p0-w0: resuming experience collection (150650 times) [2024-06-24 08:17:11,874][15401] Updated weights for policy 0, policy_version 620801 (0.0032) [2024-06-24 08:17:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 10171252736. Throughput: 0: 43054.1. Samples: 10171385720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:13,393][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 08:17:15,954][15401] Updated weights for policy 0, policy_version 620811 (0.0042) [2024-06-24 08:17:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 10171465728. Throughput: 0: 43089.1. Samples: 10171644940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 08:17:19,749][15401] Updated weights for policy 0, policy_version 620821 (0.0038) [2024-06-24 08:17:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10171678720. Throughput: 0: 42861.4. Samples: 10171763780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:23,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 08:17:23,426][15401] Updated weights for policy 0, policy_version 620831 (0.0038) [2024-06-24 08:17:27,148][15401] Updated weights for policy 0, policy_version 620841 (0.0034) [2024-06-24 08:17:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 10171924480. Throughput: 0: 43139.6. Samples: 10172031400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 08:17:31,218][15401] Updated weights for policy 0, policy_version 620851 (0.0031) [2024-06-24 08:17:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10172104704. Throughput: 0: 42921.5. Samples: 10172288740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 08:17:34,792][15401] Updated weights for policy 0, policy_version 620861 (0.0040) [2024-06-24 08:17:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42765.4). Total num frames: 10172334080. Throughput: 0: 42796.6. Samples: 10172410440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 08:17:38,687][15401] Updated weights for policy 0, policy_version 620871 (0.0042) [2024-06-24 08:17:42,552][15401] Updated weights for policy 0, policy_version 620881 (0.0023) [2024-06-24 08:17:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 10172579840. Throughput: 0: 43013.8. Samples: 10172673800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 08:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620885_10172579840.pth... [2024-06-24 08:17:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620255_10162257920.pth [2024-06-24 08:17:46,306][15401] Updated weights for policy 0, policy_version 620891 (0.0047) [2024-06-24 08:17:48,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10172743680. Throughput: 0: 42958.1. Samples: 10172928400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 08:17:50,042][15401] Updated weights for policy 0, policy_version 620901 (0.0038) [2024-06-24 08:17:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10172989440. Throughput: 0: 42824.7. Samples: 10173049280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-24 08:17:53,394][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:17:54,194][15401] Updated weights for policy 0, policy_version 620911 (0.0033) [2024-06-24 08:17:57,807][15401] Updated weights for policy 0, policy_version 620921 (0.0042) [2024-06-24 08:17:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10173186048. Throughput: 0: 42755.3. Samples: 10173309600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:17:58,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 08:18:01,663][15401] Updated weights for policy 0, policy_version 620931 (0.0026) [2024-06-24 08:18:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10173382656. Throughput: 0: 42663.9. Samples: 10173564820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 08:18:05,505][15401] Updated weights for policy 0, policy_version 620941 (0.0038) [2024-06-24 08:18:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 10173628416. Throughput: 0: 42734.2. Samples: 10173686820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:18:09,262][15401] Updated weights for policy 0, policy_version 620951 (0.0033) [2024-06-24 08:18:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 10173808640. Throughput: 0: 42477.3. Samples: 10173942880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 08:18:13,497][15401] Updated weights for policy 0, policy_version 620961 (0.0024) [2024-06-24 08:18:17,163][15401] Updated weights for policy 0, policy_version 620971 (0.0035) [2024-06-24 08:18:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10174021632. Throughput: 0: 42328.0. Samples: 10174193500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 08:18:20,578][15349] Signal inference workers to stop experience collection... (150700 times) [2024-06-24 08:18:20,637][15401] InferenceWorker_p0-w0: stopping experience collection (150700 times) [2024-06-24 08:18:20,693][15349] Signal inference workers to resume experience collection... (150700 times) [2024-06-24 08:18:20,693][15401] InferenceWorker_p0-w0: resuming experience collection (150700 times) [2024-06-24 08:18:21,270][15401] Updated weights for policy 0, policy_version 620981 (0.0036) [2024-06-24 08:18:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10174267392. Throughput: 0: 42582.0. Samples: 10174326640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 08:18:24,549][15401] Updated weights for policy 0, policy_version 620991 (0.0038) [2024-06-24 08:18:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42765.4). Total num frames: 10174431232. Throughput: 0: 42423.5. Samples: 10174582860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 08:18:28,919][15401] Updated weights for policy 0, policy_version 621001 (0.0027) [2024-06-24 08:18:32,030][15401] Updated weights for policy 0, policy_version 621011 (0.0028) [2024-06-24 08:18:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 10174676992. Throughput: 0: 42280.6. Samples: 10174831020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 08:18:36,519][15401] Updated weights for policy 0, policy_version 621021 (0.0024) [2024-06-24 08:18:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10174889984. Throughput: 0: 42657.8. Samples: 10174968880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 08:18:39,706][15401] Updated weights for policy 0, policy_version 621031 (0.0036) [2024-06-24 08:18:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 42765.0). Total num frames: 10175070208. Throughput: 0: 42509.8. Samples: 10175222540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 08:18:44,142][15401] Updated weights for policy 0, policy_version 621041 (0.0025) [2024-06-24 08:18:47,185][15401] Updated weights for policy 0, policy_version 621051 (0.0033) [2024-06-24 08:18:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 10175332352. Throughput: 0: 42505.3. Samples: 10175477560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 08:18:51,959][15401] Updated weights for policy 0, policy_version 621061 (0.0034) [2024-06-24 08:18:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10175528960. Throughput: 0: 42746.8. Samples: 10175610420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 08:18:54,790][15401] Updated weights for policy 0, policy_version 621071 (0.0033) [2024-06-24 08:18:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10175741952. Throughput: 0: 42715.6. Samples: 10175865080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:18:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 08:18:59,554][15401] Updated weights for policy 0, policy_version 621081 (0.0032) [2024-06-24 08:19:02,494][15401] Updated weights for policy 0, policy_version 621091 (0.0037) [2024-06-24 08:19:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10175971328. Throughput: 0: 42719.4. Samples: 10176115880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:03,390][15132] Avg episode reward: [(0, '0.048')] [2024-06-24 08:19:07,308][15401] Updated weights for policy 0, policy_version 621101 (0.0028) [2024-06-24 08:19:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 10176167936. Throughput: 0: 42752.0. Samples: 10176250480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 08:19:10,311][15401] Updated weights for policy 0, policy_version 621111 (0.0031) [2024-06-24 08:19:13,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 10176380928. Throughput: 0: 42631.3. Samples: 10176501540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:13,396][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 08:19:14,898][15401] Updated weights for policy 0, policy_version 621121 (0.0038) [2024-06-24 08:19:18,039][15401] Updated weights for policy 0, policy_version 621131 (0.0033) [2024-06-24 08:19:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10176610304. Throughput: 0: 42673.8. Samples: 10176751340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 08:19:22,433][15401] Updated weights for policy 0, policy_version 621141 (0.0035) [2024-06-24 08:19:23,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42325.4, 300 sec: 42877.0). Total num frames: 10176806912. Throughput: 0: 42759.1. Samples: 10176893040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 08:19:25,658][15401] Updated weights for policy 0, policy_version 621151 (0.0030) [2024-06-24 08:19:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10177003520. Throughput: 0: 42758.0. Samples: 10177146660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 08:19:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:19:30,127][15401] Updated weights for policy 0, policy_version 621161 (0.0032) [2024-06-24 08:19:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10177249280. Throughput: 0: 42643.9. Samples: 10177396540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 08:19:33,518][15401] Updated weights for policy 0, policy_version 621171 (0.0036) [2024-06-24 08:19:37,736][15401] Updated weights for policy 0, policy_version 621181 (0.0040) [2024-06-24 08:19:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10177445888. Throughput: 0: 42804.9. Samples: 10177536640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 08:19:38,918][15349] Signal inference workers to stop experience collection... (150750 times) [2024-06-24 08:19:38,919][15349] Signal inference workers to resume experience collection... (150750 times) [2024-06-24 08:19:38,939][15401] InferenceWorker_p0-w0: stopping experience collection (150750 times) [2024-06-24 08:19:38,940][15401] InferenceWorker_p0-w0: resuming experience collection (150750 times) [2024-06-24 08:19:41,175][15401] Updated weights for policy 0, policy_version 621191 (0.0035) [2024-06-24 08:19:43,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 10177642496. Throughput: 0: 42661.6. Samples: 10177784960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:43,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 08:19:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621195_10177658880.pth... [2024-06-24 08:19:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620568_10167386112.pth [2024-06-24 08:19:45,336][15401] Updated weights for policy 0, policy_version 621201 (0.0036) [2024-06-24 08:19:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10177904640. Throughput: 0: 42676.4. Samples: 10178036320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 08:19:48,839][15401] Updated weights for policy 0, policy_version 621211 (0.0029) [2024-06-24 08:19:53,359][15401] Updated weights for policy 0, policy_version 621221 (0.0034) [2024-06-24 08:19:53,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10178084864. Throughput: 0: 42689.1. Samples: 10178171480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 08:19:56,705][15401] Updated weights for policy 0, policy_version 621231 (0.0032) [2024-06-24 08:19:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 10178297856. Throughput: 0: 42669.0. Samples: 10178421380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:19:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 08:20:00,990][15401] Updated weights for policy 0, policy_version 621241 (0.0034) [2024-06-24 08:20:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10178527232. Throughput: 0: 42871.1. Samples: 10178680540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:03,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 08:20:04,358][15401] Updated weights for policy 0, policy_version 621251 (0.0036) [2024-06-24 08:20:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10178723840. Throughput: 0: 42655.5. Samples: 10178812540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 08:20:08,524][15401] Updated weights for policy 0, policy_version 621261 (0.0036) [2024-06-24 08:20:12,056][15401] Updated weights for policy 0, policy_version 621271 (0.0031) [2024-06-24 08:20:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 10178953216. Throughput: 0: 42746.2. Samples: 10179070240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 08:20:16,394][15401] Updated weights for policy 0, policy_version 621281 (0.0027) [2024-06-24 08:20:18,391][15132] Fps is (10 sec: 45869.8, 60 sec: 42870.5, 300 sec: 42875.9). Total num frames: 10179182592. Throughput: 0: 42746.9. Samples: 10179320200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:18,391][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 08:20:19,539][15401] Updated weights for policy 0, policy_version 621291 (0.0031) [2024-06-24 08:20:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 10179346432. Throughput: 0: 42519.0. Samples: 10179450000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:23,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 08:20:23,954][15401] Updated weights for policy 0, policy_version 621301 (0.0027) [2024-06-24 08:20:27,149][15401] Updated weights for policy 0, policy_version 621311 (0.0032) [2024-06-24 08:20:28,389][15132] Fps is (10 sec: 42604.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10179608576. Throughput: 0: 42712.6. Samples: 10179706920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 08:20:31,651][15401] Updated weights for policy 0, policy_version 621321 (0.0029) [2024-06-24 08:20:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10179805184. Throughput: 0: 42786.8. Samples: 10179961720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 08:20:35,148][15401] Updated weights for policy 0, policy_version 621331 (0.0049) [2024-06-24 08:20:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10179985408. Throughput: 0: 42621.7. Samples: 10180089460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 08:20:39,550][15401] Updated weights for policy 0, policy_version 621341 (0.0040) [2024-06-24 08:20:42,762][15401] Updated weights for policy 0, policy_version 621351 (0.0032) [2024-06-24 08:20:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43144.6, 300 sec: 42709.1). Total num frames: 10180231168. Throughput: 0: 42599.2. Samples: 10180338440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:43,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 08:20:47,253][15401] Updated weights for policy 0, policy_version 621361 (0.0040) [2024-06-24 08:20:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 10180444160. Throughput: 0: 42520.9. Samples: 10180593980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 08:20:50,497][15401] Updated weights for policy 0, policy_version 621371 (0.0042) [2024-06-24 08:20:53,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 10180624384. Throughput: 0: 42558.2. Samples: 10180727660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 08:20:54,856][15401] Updated weights for policy 0, policy_version 621381 (0.0043) [2024-06-24 08:20:58,096][15401] Updated weights for policy 0, policy_version 621391 (0.0044) [2024-06-24 08:20:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10180870144. Throughput: 0: 42440.1. Samples: 10180980040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:20:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 08:21:02,603][15401] Updated weights for policy 0, policy_version 621401 (0.0034) [2024-06-24 08:21:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10181066752. Throughput: 0: 42720.4. Samples: 10181242560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 08:21:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 08:21:05,802][15401] Updated weights for policy 0, policy_version 621411 (0.0028) [2024-06-24 08:21:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10181279744. Throughput: 0: 42548.0. Samples: 10181364660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 08:21:10,211][15401] Updated weights for policy 0, policy_version 621421 (0.0041) [2024-06-24 08:21:12,397][15349] Signal inference workers to stop experience collection... (150800 times) [2024-06-24 08:21:12,447][15401] InferenceWorker_p0-w0: stopping experience collection (150800 times) [2024-06-24 08:21:12,450][15349] Signal inference workers to resume experience collection... (150800 times) [2024-06-24 08:21:12,464][15401] InferenceWorker_p0-w0: resuming experience collection (150800 times) [2024-06-24 08:21:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10181509120. Throughput: 0: 42564.4. Samples: 10181622320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 08:21:13,532][15401] Updated weights for policy 0, policy_version 621431 (0.0044) [2024-06-24 08:21:17,719][15401] Updated weights for policy 0, policy_version 621441 (0.0032) [2024-06-24 08:21:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42053.2, 300 sec: 42709.5). Total num frames: 10181705728. Throughput: 0: 42709.9. Samples: 10181883660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 08:21:21,150][15401] Updated weights for policy 0, policy_version 621451 (0.0045) [2024-06-24 08:21:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10181918720. Throughput: 0: 42631.2. Samples: 10182007860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:23,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 08:21:25,446][15401] Updated weights for policy 0, policy_version 621461 (0.0022) [2024-06-24 08:21:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10182164480. Throughput: 0: 42809.4. Samples: 10182264760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 08:21:28,864][15401] Updated weights for policy 0, policy_version 621471 (0.0031) [2024-06-24 08:21:32,936][15401] Updated weights for policy 0, policy_version 621481 (0.0029) [2024-06-24 08:21:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10182344704. Throughput: 0: 42899.5. Samples: 10182524460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 08:21:36,626][15401] Updated weights for policy 0, policy_version 621491 (0.0031) [2024-06-24 08:21:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10182574080. Throughput: 0: 42729.0. Samples: 10182650460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:38,390][15132] Avg episode reward: [(0, '0.909')] [2024-06-24 08:21:40,502][15401] Updated weights for policy 0, policy_version 621501 (0.0040) [2024-06-24 08:21:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 10182787072. Throughput: 0: 42791.4. Samples: 10182905660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:43,398][15132] Avg episode reward: [(0, '0.914')] [2024-06-24 08:21:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621508_10182787072.pth... [2024-06-24 08:21:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000620885_10172579840.pth [2024-06-24 08:21:44,353][15401] Updated weights for policy 0, policy_version 621511 (0.0031) [2024-06-24 08:21:48,162][15401] Updated weights for policy 0, policy_version 621521 (0.0032) [2024-06-24 08:21:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10183000064. Throughput: 0: 42717.2. Samples: 10183164840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:48,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 08:21:52,260][15401] Updated weights for policy 0, policy_version 621531 (0.0032) [2024-06-24 08:21:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10183213056. Throughput: 0: 42767.6. Samples: 10183289200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:53,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 08:21:55,834][15401] Updated weights for policy 0, policy_version 621541 (0.0034) [2024-06-24 08:21:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10183426048. Throughput: 0: 42707.6. Samples: 10183544160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:21:58,399][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 08:21:59,895][15401] Updated weights for policy 0, policy_version 621551 (0.0029) [2024-06-24 08:22:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10183622656. Throughput: 0: 42661.2. Samples: 10183803420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 08:22:04,033][15401] Updated weights for policy 0, policy_version 621561 (0.0037) [2024-06-24 08:22:07,457][15401] Updated weights for policy 0, policy_version 621571 (0.0040) [2024-06-24 08:22:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 10183868416. Throughput: 0: 42620.7. Samples: 10183925800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 08:22:11,647][15401] Updated weights for policy 0, policy_version 621581 (0.0040) [2024-06-24 08:22:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10184065024. Throughput: 0: 42756.9. Samples: 10184188820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 08:22:14,923][15401] Updated weights for policy 0, policy_version 621591 (0.0031) [2024-06-24 08:22:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10184261632. Throughput: 0: 42762.5. Samples: 10184448780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:18,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-24 08:22:19,331][15401] Updated weights for policy 0, policy_version 621601 (0.0036) [2024-06-24 08:22:22,914][15401] Updated weights for policy 0, policy_version 621611 (0.0028) [2024-06-24 08:22:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10184491008. Throughput: 0: 42532.4. Samples: 10184564420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 08:22:26,971][15401] Updated weights for policy 0, policy_version 621621 (0.0041) [2024-06-24 08:22:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10184720384. Throughput: 0: 42736.9. Samples: 10184828820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 08:22:30,430][15401] Updated weights for policy 0, policy_version 621631 (0.0039) [2024-06-24 08:22:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10184900608. Throughput: 0: 42735.5. Samples: 10185087940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 08:22:34,600][15401] Updated weights for policy 0, policy_version 621641 (0.0039) [2024-06-24 08:22:37,824][15401] Updated weights for policy 0, policy_version 621651 (0.0034) [2024-06-24 08:22:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10185146368. Throughput: 0: 42731.2. Samples: 10185212100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 08:22:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 08:22:42,334][15401] Updated weights for policy 0, policy_version 621661 (0.0029) [2024-06-24 08:22:43,353][15349] Signal inference workers to stop experience collection... (150850 times) [2024-06-24 08:22:43,353][15349] Signal inference workers to resume experience collection... (150850 times) [2024-06-24 08:22:43,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10185359360. Throughput: 0: 42978.3. Samples: 10185478180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:22:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 08:22:43,396][15401] InferenceWorker_p0-w0: stopping experience collection (150850 times) [2024-06-24 08:22:43,396][15401] InferenceWorker_p0-w0: resuming experience collection (150850 times) [2024-06-24 08:22:45,393][15401] Updated weights for policy 0, policy_version 621671 (0.0051) [2024-06-24 08:22:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10185555968. Throughput: 0: 42880.1. Samples: 10185733020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:22:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 08:22:49,934][15401] Updated weights for policy 0, policy_version 621681 (0.0030) [2024-06-24 08:22:52,967][15401] Updated weights for policy 0, policy_version 621691 (0.0026) [2024-06-24 08:22:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10185801728. Throughput: 0: 42837.4. Samples: 10185853480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:22:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 08:22:57,503][15401] Updated weights for policy 0, policy_version 621701 (0.0037) [2024-06-24 08:22:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10185998336. Throughput: 0: 43013.3. Samples: 10186124420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:22:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 08:23:00,616][15401] Updated weights for policy 0, policy_version 621711 (0.0048) [2024-06-24 08:23:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10186194944. Throughput: 0: 42688.9. Samples: 10186369780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 08:23:05,077][15401] Updated weights for policy 0, policy_version 621721 (0.0044) [2024-06-24 08:23:08,148][15401] Updated weights for policy 0, policy_version 621731 (0.0030) [2024-06-24 08:23:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10186440704. Throughput: 0: 42935.6. Samples: 10186496520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:08,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-24 08:23:13,079][15401] Updated weights for policy 0, policy_version 621741 (0.0024) [2024-06-24 08:23:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10186637312. Throughput: 0: 42848.9. Samples: 10186757020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:13,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 08:23:16,195][15401] Updated weights for policy 0, policy_version 621751 (0.0022) [2024-06-24 08:23:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 10186833920. Throughput: 0: 42601.8. Samples: 10187005120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:18,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 08:23:20,545][15401] Updated weights for policy 0, policy_version 621761 (0.0033) [2024-06-24 08:23:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10187079680. Throughput: 0: 42641.2. Samples: 10187130960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 08:23:24,136][15401] Updated weights for policy 0, policy_version 621771 (0.0027) [2024-06-24 08:23:28,124][15401] Updated weights for policy 0, policy_version 621781 (0.0043) [2024-06-24 08:23:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10187276288. Throughput: 0: 42713.3. Samples: 10187400280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:28,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 08:23:31,698][15401] Updated weights for policy 0, policy_version 621791 (0.0030) [2024-06-24 08:23:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 10187489280. Throughput: 0: 42544.8. Samples: 10187647640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:33,393][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 08:23:35,717][15401] Updated weights for policy 0, policy_version 621801 (0.0027) [2024-06-24 08:23:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10187718656. Throughput: 0: 42690.0. Samples: 10187774520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 08:23:39,285][15401] Updated weights for policy 0, policy_version 621811 (0.0029) [2024-06-24 08:23:43,180][15401] Updated weights for policy 0, policy_version 621821 (0.0042) [2024-06-24 08:23:43,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10187915264. Throughput: 0: 42577.2. Samples: 10188040400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 08:23:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621822_10187931648.pth... [2024-06-24 08:23:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621195_10177658880.pth [2024-06-24 08:23:46,927][15401] Updated weights for policy 0, policy_version 621831 (0.0044) [2024-06-24 08:23:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10188111872. Throughput: 0: 42721.4. Samples: 10188292240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 08:23:50,729][15401] Updated weights for policy 0, policy_version 621841 (0.0030) [2024-06-24 08:23:53,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 10188341248. Throughput: 0: 42569.7. Samples: 10188412260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:53,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 08:23:54,486][15401] Updated weights for policy 0, policy_version 621851 (0.0040) [2024-06-24 08:23:57,735][15349] Signal inference workers to stop experience collection... (150900 times) [2024-06-24 08:23:57,735][15349] Signal inference workers to resume experience collection... (150900 times) [2024-06-24 08:23:57,764][15401] InferenceWorker_p0-w0: stopping experience collection (150900 times) [2024-06-24 08:23:57,764][15401] InferenceWorker_p0-w0: resuming experience collection (150900 times) [2024-06-24 08:23:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10188554240. Throughput: 0: 42494.3. Samples: 10188669260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:23:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 08:23:58,617][15401] Updated weights for policy 0, policy_version 621861 (0.0032) [2024-06-24 08:24:02,214][15401] Updated weights for policy 0, policy_version 621871 (0.0030) [2024-06-24 08:24:03,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10188734464. Throughput: 0: 42650.8. Samples: 10188924300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:24:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 08:24:06,203][15401] Updated weights for policy 0, policy_version 621881 (0.0038) [2024-06-24 08:24:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42654.8). Total num frames: 10188963840. Throughput: 0: 42615.1. Samples: 10189048640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:24:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 08:24:10,306][15401] Updated weights for policy 0, policy_version 621891 (0.0038) [2024-06-24 08:24:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 10189176832. Throughput: 0: 42267.1. Samples: 10189302300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:24:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 08:24:14,472][15401] Updated weights for policy 0, policy_version 621901 (0.0037) [2024-06-24 08:24:18,156][15401] Updated weights for policy 0, policy_version 621911 (0.0032) [2024-06-24 08:24:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 10189389824. Throughput: 0: 42499.6. Samples: 10189560020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 08:24:21,983][15401] Updated weights for policy 0, policy_version 621921 (0.0032) [2024-06-24 08:24:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10189619200. Throughput: 0: 42408.7. Samples: 10189682920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:23,391][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 08:24:26,114][15401] Updated weights for policy 0, policy_version 621931 (0.0037) [2024-06-24 08:24:28,393][15132] Fps is (10 sec: 44219.8, 60 sec: 42595.6, 300 sec: 42653.4). Total num frames: 10189832192. Throughput: 0: 42184.1. Samples: 10189938840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:28,394][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 08:24:29,540][15401] Updated weights for policy 0, policy_version 621941 (0.0033) [2024-06-24 08:24:33,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41780.9, 300 sec: 42542.9). Total num frames: 10189996032. Throughput: 0: 42440.0. Samples: 10190202040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 08:24:33,899][15401] Updated weights for policy 0, policy_version 621951 (0.0024) [2024-06-24 08:24:37,176][15401] Updated weights for policy 0, policy_version 621961 (0.0040) [2024-06-24 08:24:38,389][15132] Fps is (10 sec: 42615.4, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 10190258176. Throughput: 0: 42398.4. Samples: 10190320080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 08:24:41,463][15401] Updated weights for policy 0, policy_version 621971 (0.0038) [2024-06-24 08:24:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 10190471168. Throughput: 0: 42567.6. Samples: 10190584800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 08:24:44,935][15401] Updated weights for policy 0, policy_version 621981 (0.0038) [2024-06-24 08:24:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10190651392. Throughput: 0: 42512.3. Samples: 10190837360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:48,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 08:24:49,054][15401] Updated weights for policy 0, policy_version 621991 (0.0045) [2024-06-24 08:24:52,651][15401] Updated weights for policy 0, policy_version 622001 (0.0039) [2024-06-24 08:24:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10190897152. Throughput: 0: 42518.4. Samples: 10190961960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:53,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-24 08:24:56,890][15401] Updated weights for policy 0, policy_version 622011 (0.0025) [2024-06-24 08:24:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10191093760. Throughput: 0: 42611.0. Samples: 10191219800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:24:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 08:25:00,136][15401] Updated weights for policy 0, policy_version 622021 (0.0033) [2024-06-24 08:25:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10191306752. Throughput: 0: 42552.0. Samples: 10191474860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 08:25:04,675][15401] Updated weights for policy 0, policy_version 622031 (0.0042) [2024-06-24 08:25:07,671][15401] Updated weights for policy 0, policy_version 622041 (0.0034) [2024-06-24 08:25:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10191536128. Throughput: 0: 42660.5. Samples: 10191602640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 08:25:12,603][15401] Updated weights for policy 0, policy_version 622051 (0.0033) [2024-06-24 08:25:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42543.0). Total num frames: 10191732736. Throughput: 0: 42674.7. Samples: 10191859040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 08:25:15,634][15401] Updated weights for policy 0, policy_version 622061 (0.0031) [2024-06-24 08:25:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10191945728. Throughput: 0: 42340.0. Samples: 10192107340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 08:25:20,024][15401] Updated weights for policy 0, policy_version 622071 (0.0044) [2024-06-24 08:25:23,358][15401] Updated weights for policy 0, policy_version 622081 (0.0023) [2024-06-24 08:25:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10192175104. Throughput: 0: 42728.8. Samples: 10192242880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 08:25:23,786][15349] Signal inference workers to stop experience collection... (150950 times) [2024-06-24 08:25:23,826][15401] InferenceWorker_p0-w0: stopping experience collection (150950 times) [2024-06-24 08:25:23,847][15349] Signal inference workers to resume experience collection... (150950 times) [2024-06-24 08:25:23,849][15401] InferenceWorker_p0-w0: resuming experience collection (150950 times) [2024-06-24 08:25:27,442][15401] Updated weights for policy 0, policy_version 622091 (0.0041) [2024-06-24 08:25:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42055.0, 300 sec: 42542.9). Total num frames: 10192355328. Throughput: 0: 42609.3. Samples: 10192502220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 08:25:30,961][15401] Updated weights for policy 0, policy_version 622101 (0.0045) [2024-06-24 08:25:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10192601088. Throughput: 0: 42589.8. Samples: 10192753900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 08:25:35,255][15401] Updated weights for policy 0, policy_version 622111 (0.0039) [2024-06-24 08:25:38,354][15401] Updated weights for policy 0, policy_version 622121 (0.0037) [2024-06-24 08:25:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 10192830464. Throughput: 0: 42842.0. Samples: 10192889860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 08:25:42,759][15401] Updated weights for policy 0, policy_version 622131 (0.0025) [2024-06-24 08:25:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10192994304. Throughput: 0: 42677.7. Samples: 10193140300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 08:25:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622131_10192994304.pth... [2024-06-24 08:25:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621508_10182787072.pth [2024-06-24 08:25:45,808][15401] Updated weights for policy 0, policy_version 622141 (0.0034) [2024-06-24 08:25:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 10193240064. Throughput: 0: 42631.6. Samples: 10193393280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:25:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 08:25:50,139][15401] Updated weights for policy 0, policy_version 622151 (0.0039) [2024-06-24 08:25:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10193453056. Throughput: 0: 42947.9. Samples: 10193535300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:25:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:25:53,815][15401] Updated weights for policy 0, policy_version 622161 (0.0028) [2024-06-24 08:25:57,676][15401] Updated weights for policy 0, policy_version 622171 (0.0047) [2024-06-24 08:25:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10193649664. Throughput: 0: 42733.4. Samples: 10193782040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:25:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 08:26:01,470][15401] Updated weights for policy 0, policy_version 622181 (0.0042) [2024-06-24 08:26:03,391][15132] Fps is (10 sec: 42593.2, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 10193879040. Throughput: 0: 43011.7. Samples: 10194042920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:03,391][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 08:26:05,249][15401] Updated weights for policy 0, policy_version 622191 (0.0029) [2024-06-24 08:26:08,390][15132] Fps is (10 sec: 44235.3, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 10194092032. Throughput: 0: 42995.7. Samples: 10194177700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 08:26:08,931][15401] Updated weights for policy 0, policy_version 622201 (0.0038) [2024-06-24 08:26:13,272][15401] Updated weights for policy 0, policy_version 622211 (0.0043) [2024-06-24 08:26:13,389][15132] Fps is (10 sec: 42603.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10194305024. Throughput: 0: 42848.5. Samples: 10194430400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 08:26:16,923][15401] Updated weights for policy 0, policy_version 622221 (0.0045) [2024-06-24 08:26:18,389][15132] Fps is (10 sec: 44239.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 10194534400. Throughput: 0: 42910.5. Samples: 10194684860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 08:26:21,166][15401] Updated weights for policy 0, policy_version 622231 (0.0035) [2024-06-24 08:26:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10194747392. Throughput: 0: 42817.5. Samples: 10194816640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 08:26:24,481][15401] Updated weights for policy 0, policy_version 622241 (0.0034) [2024-06-24 08:26:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10194944000. Throughput: 0: 43009.8. Samples: 10195075740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 08:26:28,833][15401] Updated weights for policy 0, policy_version 622251 (0.0032) [2024-06-24 08:26:31,905][15401] Updated weights for policy 0, policy_version 622261 (0.0036) [2024-06-24 08:26:33,392][15132] Fps is (10 sec: 42587.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10195173376. Throughput: 0: 43043.7. Samples: 10195330360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:33,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 08:26:36,308][15401] Updated weights for policy 0, policy_version 622271 (0.0043) [2024-06-24 08:26:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 10195369984. Throughput: 0: 42852.5. Samples: 10195463660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 08:26:39,397][15401] Updated weights for policy 0, policy_version 622281 (0.0034) [2024-06-24 08:26:43,392][15132] Fps is (10 sec: 39322.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 10195566592. Throughput: 0: 42915.9. Samples: 10195713360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:43,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 08:26:43,964][15401] Updated weights for policy 0, policy_version 622291 (0.0045) [2024-06-24 08:26:47,331][15401] Updated weights for policy 0, policy_version 622301 (0.0028) [2024-06-24 08:26:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10195779584. Throughput: 0: 42865.6. Samples: 10195971820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:48,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 08:26:51,829][15401] Updated weights for policy 0, policy_version 622311 (0.0033) [2024-06-24 08:26:53,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10196025344. Throughput: 0: 42745.7. Samples: 10196101240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 08:26:54,176][15349] Signal inference workers to stop experience collection... (151000 times) [2024-06-24 08:26:54,228][15401] InferenceWorker_p0-w0: stopping experience collection (151000 times) [2024-06-24 08:26:54,295][15349] Signal inference workers to resume experience collection... (151000 times) [2024-06-24 08:26:54,295][15401] InferenceWorker_p0-w0: resuming experience collection (151000 times) [2024-06-24 08:26:54,831][15401] Updated weights for policy 0, policy_version 622321 (0.0025) [2024-06-24 08:26:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10196205568. Throughput: 0: 42750.2. Samples: 10196354160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:26:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 08:26:59,295][15401] Updated weights for policy 0, policy_version 622331 (0.0044) [2024-06-24 08:27:02,560][15401] Updated weights for policy 0, policy_version 622341 (0.0033) [2024-06-24 08:27:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42597.6, 300 sec: 42598.1). Total num frames: 10196434944. Throughput: 0: 42726.4. Samples: 10196607660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:27:03,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 08:27:06,693][15401] Updated weights for policy 0, policy_version 622351 (0.0033) [2024-06-24 08:27:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42870.0, 300 sec: 42709.1). Total num frames: 10196664320. Throughput: 0: 42785.6. Samples: 10196742100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:27:08,392][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 08:27:10,245][15401] Updated weights for policy 0, policy_version 622361 (0.0020) [2024-06-24 08:27:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10196844544. Throughput: 0: 42670.7. Samples: 10196995920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:27:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 08:27:14,413][15401] Updated weights for policy 0, policy_version 622371 (0.0031) [2024-06-24 08:27:17,974][15401] Updated weights for policy 0, policy_version 622381 (0.0027) [2024-06-24 08:27:18,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42596.5, 300 sec: 42709.1). Total num frames: 10197090304. Throughput: 0: 42653.0. Samples: 10197249740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:27:18,393][15132] Avg episode reward: [(0, '0.841')] [2024-06-24 08:27:21,798][15401] Updated weights for policy 0, policy_version 622391 (0.0039) [2024-06-24 08:27:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10197303296. Throughput: 0: 42704.4. Samples: 10197385360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 08:27:23,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 08:27:25,971][15401] Updated weights for policy 0, policy_version 622401 (0.0032) [2024-06-24 08:27:28,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10197483520. Throughput: 0: 42706.3. Samples: 10197635040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:28,392][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 08:27:29,580][15401] Updated weights for policy 0, policy_version 622411 (0.0033) [2024-06-24 08:27:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 10197729280. Throughput: 0: 42630.2. Samples: 10197890180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 08:27:33,698][15401] Updated weights for policy 0, policy_version 622421 (0.0025) [2024-06-24 08:27:37,176][15401] Updated weights for policy 0, policy_version 622431 (0.0037) [2024-06-24 08:27:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10197942272. Throughput: 0: 42788.4. Samples: 10198026720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 08:27:41,247][15401] Updated weights for policy 0, policy_version 622441 (0.0026) [2024-06-24 08:27:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 10198138880. Throughput: 0: 42748.1. Samples: 10198277820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 08:27:43,488][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622446_10198155264.pth... [2024-06-24 08:27:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000621822_10187931648.pth [2024-06-24 08:27:44,810][15401] Updated weights for policy 0, policy_version 622451 (0.0033) [2024-06-24 08:27:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10198351872. Throughput: 0: 42716.1. Samples: 10198529780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 08:27:49,264][15401] Updated weights for policy 0, policy_version 622461 (0.0027) [2024-06-24 08:27:52,632][15401] Updated weights for policy 0, policy_version 622471 (0.0029) [2024-06-24 08:27:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10198581248. Throughput: 0: 42573.9. Samples: 10198657820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 08:27:56,619][15401] Updated weights for policy 0, policy_version 622481 (0.0032) [2024-06-24 08:27:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10198794240. Throughput: 0: 42577.8. Samples: 10198911920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:27:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 08:28:00,398][15401] Updated weights for policy 0, policy_version 622491 (0.0031) [2024-06-24 08:28:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10199007232. Throughput: 0: 42749.4. Samples: 10199173360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 08:28:04,260][15401] Updated weights for policy 0, policy_version 622501 (0.0045) [2024-06-24 08:28:08,334][15401] Updated weights for policy 0, policy_version 622511 (0.0023) [2024-06-24 08:28:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 10199220224. Throughput: 0: 42552.4. Samples: 10199300220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 08:28:10,439][15349] Signal inference workers to stop experience collection... (151050 times) [2024-06-24 08:28:10,443][15349] Signal inference workers to resume experience collection... (151050 times) [2024-06-24 08:28:10,459][15401] InferenceWorker_p0-w0: stopping experience collection (151050 times) [2024-06-24 08:28:10,459][15401] InferenceWorker_p0-w0: resuming experience collection (151050 times) [2024-06-24 08:28:12,194][15401] Updated weights for policy 0, policy_version 622521 (0.0028) [2024-06-24 08:28:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42765.0). Total num frames: 10199449600. Throughput: 0: 42797.3. Samples: 10199561020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:13,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 08:28:15,782][15401] Updated weights for policy 0, policy_version 622531 (0.0036) [2024-06-24 08:28:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 10199646208. Throughput: 0: 42820.5. Samples: 10199817100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:18,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 08:28:19,915][15401] Updated weights for policy 0, policy_version 622541 (0.0031) [2024-06-24 08:28:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10199859200. Throughput: 0: 42587.2. Samples: 10199943140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 08:28:23,456][15401] Updated weights for policy 0, policy_version 622551 (0.0034) [2024-06-24 08:28:27,481][15401] Updated weights for policy 0, policy_version 622561 (0.0029) [2024-06-24 08:28:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 10200088576. Throughput: 0: 42839.0. Samples: 10200205580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 08:28:31,110][15401] Updated weights for policy 0, policy_version 622571 (0.0037) [2024-06-24 08:28:33,391][15132] Fps is (10 sec: 44228.9, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 10200301568. Throughput: 0: 42937.9. Samples: 10200462060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:33,392][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 08:28:35,211][15401] Updated weights for policy 0, policy_version 622581 (0.0033) [2024-06-24 08:28:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 10200514560. Throughput: 0: 43016.8. Samples: 10200593680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:38,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 08:28:38,679][15401] Updated weights for policy 0, policy_version 622591 (0.0028) [2024-06-24 08:28:42,852][15401] Updated weights for policy 0, policy_version 622601 (0.0027) [2024-06-24 08:28:43,390][15132] Fps is (10 sec: 42605.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10200727552. Throughput: 0: 43123.4. Samples: 10200852480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 08:28:46,038][15401] Updated weights for policy 0, policy_version 622611 (0.0033) [2024-06-24 08:28:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 10200940544. Throughput: 0: 43047.9. Samples: 10201110520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:48,396][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 08:28:50,516][15401] Updated weights for policy 0, policy_version 622621 (0.0031) [2024-06-24 08:28:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10201169920. Throughput: 0: 43166.2. Samples: 10201242700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 08:28:53,533][15401] Updated weights for policy 0, policy_version 622631 (0.0035) [2024-06-24 08:28:58,095][15401] Updated weights for policy 0, policy_version 622641 (0.0030) [2024-06-24 08:28:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10201350144. Throughput: 0: 43013.5. Samples: 10201496520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 08:28:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 08:29:01,268][15401] Updated weights for policy 0, policy_version 622651 (0.0046) [2024-06-24 08:29:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10201595904. Throughput: 0: 42901.7. Samples: 10201747680. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 08:29:05,884][15401] Updated weights for policy 0, policy_version 622661 (0.0033) [2024-06-24 08:29:08,391][15132] Fps is (10 sec: 44231.7, 60 sec: 42870.7, 300 sec: 42764.8). Total num frames: 10201792512. Throughput: 0: 43171.3. Samples: 10201885900. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:08,391][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 08:29:08,850][15401] Updated weights for policy 0, policy_version 622671 (0.0041) [2024-06-24 08:29:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 10201989120. Throughput: 0: 42965.7. Samples: 10202139040. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 08:29:13,629][15401] Updated weights for policy 0, policy_version 622681 (0.0043) [2024-06-24 08:29:16,575][15401] Updated weights for policy 0, policy_version 622691 (0.0026) [2024-06-24 08:29:18,390][15132] Fps is (10 sec: 44241.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10202234880. Throughput: 0: 42842.1. Samples: 10202389880. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 08:29:21,101][15401] Updated weights for policy 0, policy_version 622701 (0.0039) [2024-06-24 08:29:22,525][15349] Signal inference workers to stop experience collection... (151100 times) [2024-06-24 08:29:22,525][15349] Signal inference workers to resume experience collection... (151100 times) [2024-06-24 08:29:22,569][15401] InferenceWorker_p0-w0: stopping experience collection (151100 times) [2024-06-24 08:29:22,569][15401] InferenceWorker_p0-w0: resuming experience collection (151100 times) [2024-06-24 08:29:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42765.6). Total num frames: 10202447872. Throughput: 0: 42940.0. Samples: 10202525880. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 08:29:24,411][15401] Updated weights for policy 0, policy_version 622711 (0.0035) [2024-06-24 08:29:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10202644480. Throughput: 0: 42869.9. Samples: 10202781620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:28,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 08:29:28,634][15401] Updated weights for policy 0, policy_version 622721 (0.0027) [2024-06-24 08:29:31,825][15401] Updated weights for policy 0, policy_version 622731 (0.0035) [2024-06-24 08:29:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43145.8, 300 sec: 42820.5). Total num frames: 10202890240. Throughput: 0: 42723.2. Samples: 10203033060. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 08:29:36,377][15401] Updated weights for policy 0, policy_version 622741 (0.0042) [2024-06-24 08:29:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 10203086848. Throughput: 0: 42792.8. Samples: 10203168380. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 08:29:39,514][15401] Updated weights for policy 0, policy_version 622751 (0.0036) [2024-06-24 08:29:43,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10203267072. Throughput: 0: 42774.5. Samples: 10203421380. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 08:29:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622758_10203267072.pth... [2024-06-24 08:29:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622131_10192994304.pth [2024-06-24 08:29:44,068][15401] Updated weights for policy 0, policy_version 622761 (0.0040) [2024-06-24 08:29:47,059][15401] Updated weights for policy 0, policy_version 622771 (0.0044) [2024-06-24 08:29:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10203545600. Throughput: 0: 42771.6. Samples: 10203672400. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 08:29:51,901][15401] Updated weights for policy 0, policy_version 622781 (0.0037) [2024-06-24 08:29:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10203725824. Throughput: 0: 42757.0. Samples: 10203809920. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 08:29:54,804][15401] Updated weights for policy 0, policy_version 622791 (0.0023) [2024-06-24 08:29:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10203922432. Throughput: 0: 42678.7. Samples: 10204059580. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:29:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 08:29:59,415][15401] Updated weights for policy 0, policy_version 622801 (0.0048) [2024-06-24 08:30:02,451][15401] Updated weights for policy 0, policy_version 622811 (0.0025) [2024-06-24 08:30:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10204151808. Throughput: 0: 42856.5. Samples: 10204318420. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 08:30:06,905][15401] Updated weights for policy 0, policy_version 622821 (0.0025) [2024-06-24 08:30:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42872.3, 300 sec: 42820.6). Total num frames: 10204364800. Throughput: 0: 42652.1. Samples: 10204445220. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 08:30:10,005][15401] Updated weights for policy 0, policy_version 622831 (0.0048) [2024-06-24 08:30:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10204577792. Throughput: 0: 42757.3. Samples: 10204705700. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:13,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 08:30:14,429][15401] Updated weights for policy 0, policy_version 622841 (0.0041) [2024-06-24 08:30:17,438][15401] Updated weights for policy 0, policy_version 622851 (0.0041) [2024-06-24 08:30:18,394][15132] Fps is (10 sec: 42579.4, 60 sec: 42595.3, 300 sec: 42764.4). Total num frames: 10204790784. Throughput: 0: 42777.5. Samples: 10204958240. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:18,394][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 08:30:21,996][15401] Updated weights for policy 0, policy_version 622861 (0.0031) [2024-06-24 08:30:23,390][15132] Fps is (10 sec: 44235.0, 60 sec: 42871.2, 300 sec: 42931.6). Total num frames: 10205020160. Throughput: 0: 42753.8. Samples: 10205092320. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:23,391][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 08:30:25,646][15401] Updated weights for policy 0, policy_version 622871 (0.0032) [2024-06-24 08:30:28,390][15132] Fps is (10 sec: 40977.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10205200384. Throughput: 0: 42771.2. Samples: 10205346080. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:28,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 08:30:29,585][15401] Updated weights for policy 0, policy_version 622881 (0.0041) [2024-06-24 08:30:33,094][15401] Updated weights for policy 0, policy_version 622891 (0.0046) [2024-06-24 08:30:33,389][15132] Fps is (10 sec: 42600.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10205446144. Throughput: 0: 42848.4. Samples: 10205600580. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-24 08:30:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 08:30:37,178][15401] Updated weights for policy 0, policy_version 622901 (0.0029) [2024-06-24 08:30:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 10205659136. Throughput: 0: 42759.7. Samples: 10205734100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:30:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 08:30:40,750][15401] Updated weights for policy 0, policy_version 622911 (0.0028) [2024-06-24 08:30:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10205839360. Throughput: 0: 42818.2. Samples: 10205986400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:30:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 08:30:44,854][15401] Updated weights for policy 0, policy_version 622921 (0.0037) [2024-06-24 08:30:48,380][15401] Updated weights for policy 0, policy_version 622931 (0.0032) [2024-06-24 08:30:48,391][15132] Fps is (10 sec: 44231.9, 60 sec: 42597.7, 300 sec: 42875.9). Total num frames: 10206101504. Throughput: 0: 42610.5. Samples: 10206235940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:30:48,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 08:30:52,587][15401] Updated weights for policy 0, policy_version 622941 (0.0045) [2024-06-24 08:30:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10206281728. Throughput: 0: 42755.2. Samples: 10206369200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:30:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 08:30:56,145][15401] Updated weights for policy 0, policy_version 622951 (0.0042) [2024-06-24 08:30:58,390][15132] Fps is (10 sec: 39325.5, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 10206494720. Throughput: 0: 42559.6. Samples: 10206620880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:30:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 08:31:00,482][15401] Updated weights for policy 0, policy_version 622961 (0.0039) [2024-06-24 08:31:00,899][15349] Signal inference workers to stop experience collection... (151150 times) [2024-06-24 08:31:00,947][15401] InferenceWorker_p0-w0: stopping experience collection (151150 times) [2024-06-24 08:31:00,955][15349] Signal inference workers to resume experience collection... (151150 times) [2024-06-24 08:31:00,961][15401] InferenceWorker_p0-w0: resuming experience collection (151150 times) [2024-06-24 08:31:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10206724096. Throughput: 0: 42625.2. Samples: 10206876180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:03,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 08:31:03,799][15401] Updated weights for policy 0, policy_version 622971 (0.0032) [2024-06-24 08:31:08,224][15401] Updated weights for policy 0, policy_version 622981 (0.0045) [2024-06-24 08:31:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10206920704. Throughput: 0: 42644.0. Samples: 10207011280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 08:31:11,601][15401] Updated weights for policy 0, policy_version 622991 (0.0043) [2024-06-24 08:31:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.4). Total num frames: 10207133696. Throughput: 0: 42674.3. Samples: 10207266420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 08:31:15,876][15401] Updated weights for policy 0, policy_version 623001 (0.0033) [2024-06-24 08:31:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42874.6, 300 sec: 42765.0). Total num frames: 10207363072. Throughput: 0: 42530.6. Samples: 10207514460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 08:31:19,494][15401] Updated weights for policy 0, policy_version 623011 (0.0040) [2024-06-24 08:31:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.7, 300 sec: 42765.0). Total num frames: 10207559680. Throughput: 0: 42635.9. Samples: 10207652720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 08:31:23,547][15401] Updated weights for policy 0, policy_version 623021 (0.0036) [2024-06-24 08:31:27,242][15401] Updated weights for policy 0, policy_version 623031 (0.0024) [2024-06-24 08:31:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 10207772672. Throughput: 0: 42654.7. Samples: 10207905960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:28,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 08:31:31,134][15401] Updated weights for policy 0, policy_version 623041 (0.0032) [2024-06-24 08:31:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10208002048. Throughput: 0: 42698.6. Samples: 10208157340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 08:31:34,828][15401] Updated weights for policy 0, policy_version 623051 (0.0034) [2024-06-24 08:31:38,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.2, 300 sec: 42765.4). Total num frames: 10208182272. Throughput: 0: 42705.7. Samples: 10208290960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 08:31:38,823][15401] Updated weights for policy 0, policy_version 623061 (0.0041) [2024-06-24 08:31:42,900][15401] Updated weights for policy 0, policy_version 623071 (0.0040) [2024-06-24 08:31:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10208428032. Throughput: 0: 42678.1. Samples: 10208541400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 08:31:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623073_10208428032.pth... [2024-06-24 08:31:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622446_10198155264.pth [2024-06-24 08:31:46,448][15401] Updated weights for policy 0, policy_version 623081 (0.0035) [2024-06-24 08:31:48,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42599.2, 300 sec: 42820.6). Total num frames: 10208657408. Throughput: 0: 42644.5. Samples: 10208795180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:48,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-24 08:31:50,455][15401] Updated weights for policy 0, policy_version 623091 (0.0029) [2024-06-24 08:31:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10208821248. Throughput: 0: 42513.4. Samples: 10208924380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 08:31:54,273][15401] Updated weights for policy 0, policy_version 623101 (0.0050) [2024-06-24 08:31:58,040][15401] Updated weights for policy 0, policy_version 623111 (0.0036) [2024-06-24 08:31:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 10209050624. Throughput: 0: 42412.4. Samples: 10209174980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:31:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 08:32:02,060][15401] Updated weights for policy 0, policy_version 623121 (0.0037) [2024-06-24 08:32:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 10209296384. Throughput: 0: 42601.0. Samples: 10209431500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:32:03,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-24 08:32:06,113][15401] Updated weights for policy 0, policy_version 623131 (0.0041) [2024-06-24 08:32:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10209443840. Throughput: 0: 42423.2. Samples: 10209561760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 08:32:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 08:32:09,857][15401] Updated weights for policy 0, policy_version 623141 (0.0042) [2024-06-24 08:32:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10209689600. Throughput: 0: 42256.8. Samples: 10209807420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 08:32:13,584][15401] Updated weights for policy 0, policy_version 623151 (0.0041) [2024-06-24 08:32:17,428][15401] Updated weights for policy 0, policy_version 623161 (0.0029) [2024-06-24 08:32:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10209902592. Throughput: 0: 42444.2. Samples: 10210067320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:18,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 08:32:19,054][15349] Signal inference workers to stop experience collection... (151200 times) [2024-06-24 08:32:19,085][15401] InferenceWorker_p0-w0: stopping experience collection (151200 times) [2024-06-24 08:32:19,111][15349] Signal inference workers to resume experience collection... (151200 times) [2024-06-24 08:32:19,112][15401] InferenceWorker_p0-w0: resuming experience collection (151200 times) [2024-06-24 08:32:21,107][15401] Updated weights for policy 0, policy_version 623171 (0.0031) [2024-06-24 08:32:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10210115584. Throughput: 0: 42385.2. Samples: 10210198300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 08:32:24,887][15401] Updated weights for policy 0, policy_version 623181 (0.0028) [2024-06-24 08:32:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10210344960. Throughput: 0: 42455.7. Samples: 10210451900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 08:32:29,071][15401] Updated weights for policy 0, policy_version 623191 (0.0039) [2024-06-24 08:32:32,307][15401] Updated weights for policy 0, policy_version 623201 (0.0024) [2024-06-24 08:32:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10210557952. Throughput: 0: 42623.4. Samples: 10210713240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 08:32:36,923][15401] Updated weights for policy 0, policy_version 623211 (0.0042) [2024-06-24 08:32:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10210754560. Throughput: 0: 42612.9. Samples: 10210841960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:38,390][15132] Avg episode reward: [(0, '0.121')] [2024-06-24 08:32:40,223][15401] Updated weights for policy 0, policy_version 623221 (0.0032) [2024-06-24 08:32:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 10210983936. Throughput: 0: 42689.0. Samples: 10211095980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 08:32:44,517][15401] Updated weights for policy 0, policy_version 623231 (0.0036) [2024-06-24 08:32:47,711][15401] Updated weights for policy 0, policy_version 623241 (0.0024) [2024-06-24 08:32:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10211196928. Throughput: 0: 42894.5. Samples: 10211361760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 08:32:52,144][15401] Updated weights for policy 0, policy_version 623251 (0.0036) [2024-06-24 08:32:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10211377152. Throughput: 0: 42826.6. Samples: 10211488960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:53,396][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 08:32:55,571][15401] Updated weights for policy 0, policy_version 623261 (0.0034) [2024-06-24 08:32:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10211639296. Throughput: 0: 42952.1. Samples: 10211740260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:32:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 08:32:59,616][15401] Updated weights for policy 0, policy_version 623271 (0.0036) [2024-06-24 08:33:03,300][15401] Updated weights for policy 0, policy_version 623281 (0.0028) [2024-06-24 08:33:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10211835904. Throughput: 0: 43012.5. Samples: 10212002880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 08:33:07,145][15401] Updated weights for policy 0, policy_version 623291 (0.0043) [2024-06-24 08:33:08,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 10211999744. Throughput: 0: 42783.7. Samples: 10212123560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 08:33:10,991][15401] Updated weights for policy 0, policy_version 623301 (0.0024) [2024-06-24 08:33:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10212294656. Throughput: 0: 42754.2. Samples: 10212375840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 08:33:14,980][15401] Updated weights for policy 0, policy_version 623311 (0.0031) [2024-06-24 08:33:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10212474880. Throughput: 0: 42812.6. Samples: 10212639800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 08:33:18,523][15401] Updated weights for policy 0, policy_version 623321 (0.0033) [2024-06-24 08:33:22,464][15401] Updated weights for policy 0, policy_version 623331 (0.0033) [2024-06-24 08:33:23,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10212655104. Throughput: 0: 42521.3. Samples: 10212755420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 08:33:25,660][15349] Signal inference workers to stop experience collection... (151250 times) [2024-06-24 08:33:25,661][15349] Signal inference workers to resume experience collection... (151250 times) [2024-06-24 08:33:25,686][15401] InferenceWorker_p0-w0: stopping experience collection (151250 times) [2024-06-24 08:33:25,686][15401] InferenceWorker_p0-w0: resuming experience collection (151250 times) [2024-06-24 08:33:26,011][15401] Updated weights for policy 0, policy_version 623341 (0.0034) [2024-06-24 08:33:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 10212917248. Throughput: 0: 42722.1. Samples: 10213018480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:28,391][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 08:33:30,274][15401] Updated weights for policy 0, policy_version 623351 (0.0027) [2024-06-24 08:33:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 10213130240. Throughput: 0: 42609.8. Samples: 10213279200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 08:33:33,707][15401] Updated weights for policy 0, policy_version 623361 (0.0035) [2024-06-24 08:33:37,618][15401] Updated weights for policy 0, policy_version 623371 (0.0029) [2024-06-24 08:33:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10213310464. Throughput: 0: 42595.2. Samples: 10213405740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 08:33:41,202][15401] Updated weights for policy 0, policy_version 623381 (0.0026) [2024-06-24 08:33:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10213556224. Throughput: 0: 42796.1. Samples: 10213666080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:33:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 08:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623386_10213556224.pth... [2024-06-24 08:33:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000622758_10203267072.pth [2024-06-24 08:33:45,063][15401] Updated weights for policy 0, policy_version 623391 (0.0035) [2024-06-24 08:33:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10213752832. Throughput: 0: 42713.7. Samples: 10213925000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:33:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 08:33:48,899][15401] Updated weights for policy 0, policy_version 623401 (0.0035) [2024-06-24 08:33:53,120][15401] Updated weights for policy 0, policy_version 623411 (0.0036) [2024-06-24 08:33:53,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 10213965824. Throughput: 0: 42745.7. Samples: 10214047220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:33:53,393][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 08:33:56,706][15401] Updated weights for policy 0, policy_version 623421 (0.0035) [2024-06-24 08:33:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10214178816. Throughput: 0: 42957.9. Samples: 10214308940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:33:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 08:34:00,469][15401] Updated weights for policy 0, policy_version 623431 (0.0034) [2024-06-24 08:34:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42709.6). Total num frames: 10214391808. Throughput: 0: 42824.8. Samples: 10214566920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 08:34:04,241][15401] Updated weights for policy 0, policy_version 623441 (0.0026) [2024-06-24 08:34:07,996][15401] Updated weights for policy 0, policy_version 623451 (0.0043) [2024-06-24 08:34:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 10214621184. Throughput: 0: 43099.9. Samples: 10214694920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 08:34:11,947][15401] Updated weights for policy 0, policy_version 623461 (0.0038) [2024-06-24 08:34:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10214834176. Throughput: 0: 43064.4. Samples: 10214956380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 08:34:15,463][15401] Updated weights for policy 0, policy_version 623471 (0.0032) [2024-06-24 08:34:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10215030784. Throughput: 0: 43027.7. Samples: 10215215440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 08:34:19,698][15401] Updated weights for policy 0, policy_version 623481 (0.0038) [2024-06-24 08:34:23,121][15401] Updated weights for policy 0, policy_version 623491 (0.0026) [2024-06-24 08:34:23,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43686.0, 300 sec: 42819.6). Total num frames: 10215276544. Throughput: 0: 42974.2. Samples: 10215339860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:23,397][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 08:34:27,197][15401] Updated weights for policy 0, policy_version 623501 (0.0040) [2024-06-24 08:34:28,390][15132] Fps is (10 sec: 44235.2, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10215473152. Throughput: 0: 43024.1. Samples: 10215602180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 08:34:31,022][15401] Updated weights for policy 0, policy_version 623511 (0.0039) [2024-06-24 08:34:33,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10215686144. Throughput: 0: 42916.8. Samples: 10215856260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 08:34:35,133][15401] Updated weights for policy 0, policy_version 623521 (0.0031) [2024-06-24 08:34:38,389][15132] Fps is (10 sec: 42600.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10215899136. Throughput: 0: 43049.9. Samples: 10215984360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 08:34:38,707][15401] Updated weights for policy 0, policy_version 623531 (0.0030) [2024-06-24 08:34:40,768][15349] Signal inference workers to stop experience collection... (151300 times) [2024-06-24 08:34:40,768][15349] Signal inference workers to resume experience collection... (151300 times) [2024-06-24 08:34:40,808][15401] InferenceWorker_p0-w0: stopping experience collection (151300 times) [2024-06-24 08:34:40,808][15401] InferenceWorker_p0-w0: resuming experience collection (151300 times) [2024-06-24 08:34:42,587][15401] Updated weights for policy 0, policy_version 623541 (0.0025) [2024-06-24 08:34:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10216112128. Throughput: 0: 42873.6. Samples: 10216238260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 08:34:46,458][15401] Updated weights for policy 0, policy_version 623551 (0.0031) [2024-06-24 08:34:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10216325120. Throughput: 0: 43067.2. Samples: 10216504940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:48,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 08:34:50,090][15401] Updated weights for policy 0, policy_version 623561 (0.0024) [2024-06-24 08:34:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10216538112. Throughput: 0: 42941.4. Samples: 10216627280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 08:34:54,065][15401] Updated weights for policy 0, policy_version 623571 (0.0037) [2024-06-24 08:34:58,245][15401] Updated weights for policy 0, policy_version 623581 (0.0040) [2024-06-24 08:34:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10216767488. Throughput: 0: 42718.7. Samples: 10216878720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:34:58,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 08:35:01,889][15401] Updated weights for policy 0, policy_version 623591 (0.0037) [2024-06-24 08:35:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10216964096. Throughput: 0: 42739.8. Samples: 10217138740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:35:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 08:35:05,660][15401] Updated weights for policy 0, policy_version 623601 (0.0036) [2024-06-24 08:35:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10217193472. Throughput: 0: 42855.4. Samples: 10217268180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:35:08,392][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 08:35:09,639][15401] Updated weights for policy 0, policy_version 623611 (0.0030) [2024-06-24 08:35:13,198][15401] Updated weights for policy 0, policy_version 623621 (0.0037) [2024-06-24 08:35:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 10217406464. Throughput: 0: 42577.5. Samples: 10217518160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:35:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 08:35:17,151][15401] Updated weights for policy 0, policy_version 623631 (0.0041) [2024-06-24 08:35:18,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10217603072. Throughput: 0: 42675.2. Samples: 10217776640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-24 08:35:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 08:35:20,920][15401] Updated weights for policy 0, policy_version 623641 (0.0030) [2024-06-24 08:35:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 10217816064. Throughput: 0: 42552.8. Samples: 10217899240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 08:35:24,827][15401] Updated weights for policy 0, policy_version 623651 (0.0041) [2024-06-24 08:35:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 10218045440. Throughput: 0: 42596.2. Samples: 10218155080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 08:35:28,447][15401] Updated weights for policy 0, policy_version 623661 (0.0032) [2024-06-24 08:35:32,579][15401] Updated weights for policy 0, policy_version 623671 (0.0028) [2024-06-24 08:35:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10218225664. Throughput: 0: 42493.2. Samples: 10218417140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 08:35:36,076][15401] Updated weights for policy 0, policy_version 623681 (0.0038) [2024-06-24 08:35:38,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10218438656. Throughput: 0: 42410.7. Samples: 10218535760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 08:35:40,641][15401] Updated weights for policy 0, policy_version 623691 (0.0031) [2024-06-24 08:35:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 10218684416. Throughput: 0: 42390.7. Samples: 10218786300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:43,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 08:35:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623699_10218684416.pth... [2024-06-24 08:35:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623073_10208428032.pth [2024-06-24 08:35:44,303][15401] Updated weights for policy 0, policy_version 623701 (0.0027) [2024-06-24 08:35:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10218864640. Throughput: 0: 42372.1. Samples: 10219045480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 08:35:48,814][15401] Updated weights for policy 0, policy_version 623711 (0.0038) [2024-06-24 08:35:52,054][15401] Updated weights for policy 0, policy_version 623721 (0.0030) [2024-06-24 08:35:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10219094016. Throughput: 0: 42218.2. Samples: 10219167900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 08:35:56,402][15401] Updated weights for policy 0, policy_version 623731 (0.0026) [2024-06-24 08:35:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10219323392. Throughput: 0: 42389.4. Samples: 10219425680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:35:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 08:35:59,484][15401] Updated weights for policy 0, policy_version 623741 (0.0030) [2024-06-24 08:36:03,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 10219503616. Throughput: 0: 42501.3. Samples: 10219689300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:03,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 08:36:03,949][15401] Updated weights for policy 0, policy_version 623751 (0.0043) [2024-06-24 08:36:07,155][15401] Updated weights for policy 0, policy_version 623761 (0.0027) [2024-06-24 08:36:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 10219732992. Throughput: 0: 42489.3. Samples: 10219811260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 08:36:11,365][15349] Signal inference workers to stop experience collection... (151350 times) [2024-06-24 08:36:11,392][15401] InferenceWorker_p0-w0: stopping experience collection (151350 times) [2024-06-24 08:36:11,427][15349] Signal inference workers to resume experience collection... (151350 times) [2024-06-24 08:36:11,427][15401] InferenceWorker_p0-w0: resuming experience collection (151350 times) [2024-06-24 08:36:11,585][15401] Updated weights for policy 0, policy_version 623771 (0.0032) [2024-06-24 08:36:13,390][15132] Fps is (10 sec: 47524.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10219978752. Throughput: 0: 42562.4. Samples: 10220070400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 08:36:14,618][15401] Updated weights for policy 0, policy_version 623781 (0.0036) [2024-06-24 08:36:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10220126208. Throughput: 0: 42614.7. Samples: 10220334800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:18,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 08:36:19,223][15401] Updated weights for policy 0, policy_version 623791 (0.0030) [2024-06-24 08:36:22,826][15401] Updated weights for policy 0, policy_version 623801 (0.0036) [2024-06-24 08:36:23,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 10220355584. Throughput: 0: 42575.4. Samples: 10220451660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 08:36:26,812][15401] Updated weights for policy 0, policy_version 623811 (0.0036) [2024-06-24 08:36:28,389][15132] Fps is (10 sec: 49152.5, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 10220617728. Throughput: 0: 42821.0. Samples: 10220713240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 08:36:30,473][15401] Updated weights for policy 0, policy_version 623821 (0.0044) [2024-06-24 08:36:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10220781568. Throughput: 0: 42825.0. Samples: 10220972600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:33,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 08:36:34,480][15401] Updated weights for policy 0, policy_version 623831 (0.0034) [2024-06-24 08:36:38,077][15401] Updated weights for policy 0, policy_version 623841 (0.0037) [2024-06-24 08:36:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10221010944. Throughput: 0: 42845.5. Samples: 10221095940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:38,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 08:36:42,151][15401] Updated weights for policy 0, policy_version 623851 (0.0037) [2024-06-24 08:36:43,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10221240320. Throughput: 0: 42854.2. Samples: 10221354120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:43,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 08:36:45,715][15401] Updated weights for policy 0, policy_version 623861 (0.0032) [2024-06-24 08:36:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10221420544. Throughput: 0: 42772.0. Samples: 10221613940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:48,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 08:36:49,791][15401] Updated weights for policy 0, policy_version 623871 (0.0029) [2024-06-24 08:36:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10221649920. Throughput: 0: 42694.4. Samples: 10221732500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 08:36:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 08:36:53,505][15401] Updated weights for policy 0, policy_version 623881 (0.0026) [2024-06-24 08:36:57,438][15401] Updated weights for policy 0, policy_version 623891 (0.0032) [2024-06-24 08:36:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10221879296. Throughput: 0: 42761.9. Samples: 10221994680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:36:58,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 08:37:01,147][15401] Updated weights for policy 0, policy_version 623901 (0.0046) [2024-06-24 08:37:03,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 10222059520. Throughput: 0: 42557.8. Samples: 10222249900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 08:37:04,922][15401] Updated weights for policy 0, policy_version 623911 (0.0032) [2024-06-24 08:37:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10222305280. Throughput: 0: 42722.6. Samples: 10222374180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 08:37:08,594][15401] Updated weights for policy 0, policy_version 623921 (0.0032) [2024-06-24 08:37:12,527][15401] Updated weights for policy 0, policy_version 623931 (0.0043) [2024-06-24 08:37:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10222518272. Throughput: 0: 42742.2. Samples: 10222636640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:13,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 08:37:16,683][15401] Updated weights for policy 0, policy_version 623941 (0.0022) [2024-06-24 08:37:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10222698496. Throughput: 0: 42765.2. Samples: 10222897040. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:18,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-24 08:37:20,210][15401] Updated weights for policy 0, policy_version 623951 (0.0027) [2024-06-24 08:37:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10222960640. Throughput: 0: 42672.3. Samples: 10223016200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 08:37:24,127][15401] Updated weights for policy 0, policy_version 623961 (0.0042) [2024-06-24 08:37:27,827][15401] Updated weights for policy 0, policy_version 623971 (0.0037) [2024-06-24 08:37:28,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 10223173632. Throughput: 0: 42760.4. Samples: 10223278440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:28,393][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 08:37:31,535][15401] Updated weights for policy 0, policy_version 623981 (0.0042) [2024-06-24 08:37:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10223337472. Throughput: 0: 42716.5. Samples: 10223536180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 08:37:35,445][15401] Updated weights for policy 0, policy_version 623991 (0.0022) [2024-06-24 08:37:35,748][15349] Signal inference workers to stop experience collection... (151400 times) [2024-06-24 08:37:35,748][15349] Signal inference workers to resume experience collection... (151400 times) [2024-06-24 08:37:35,792][15401] InferenceWorker_p0-w0: stopping experience collection (151400 times) [2024-06-24 08:37:35,792][15401] InferenceWorker_p0-w0: resuming experience collection (151400 times) [2024-06-24 08:37:38,392][15132] Fps is (10 sec: 42598.4, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 10223599616. Throughput: 0: 42824.7. Samples: 10223659720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:38,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 08:37:39,451][15401] Updated weights for policy 0, policy_version 624001 (0.0036) [2024-06-24 08:37:43,123][15401] Updated weights for policy 0, policy_version 624011 (0.0035) [2024-06-24 08:37:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10223812608. Throughput: 0: 42884.0. Samples: 10223924460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 08:37:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624013_10223828992.pth... [2024-06-24 08:37:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623386_10213556224.pth [2024-06-24 08:37:47,024][15401] Updated weights for policy 0, policy_version 624021 (0.0041) [2024-06-24 08:37:48,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10224009216. Throughput: 0: 42987.5. Samples: 10224184340. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 08:37:50,698][15401] Updated weights for policy 0, policy_version 624031 (0.0053) [2024-06-24 08:37:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10224238592. Throughput: 0: 42962.8. Samples: 10224307500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:53,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 08:37:54,449][15401] Updated weights for policy 0, policy_version 624041 (0.0034) [2024-06-24 08:37:58,179][15401] Updated weights for policy 0, policy_version 624051 (0.0036) [2024-06-24 08:37:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10224451584. Throughput: 0: 43116.3. Samples: 10224576880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:37:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 08:38:01,954][15401] Updated weights for policy 0, policy_version 624061 (0.0033) [2024-06-24 08:38:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10224648192. Throughput: 0: 43057.5. Samples: 10224834620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 08:38:05,657][15401] Updated weights for policy 0, policy_version 624071 (0.0039) [2024-06-24 08:38:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10224893952. Throughput: 0: 43160.8. Samples: 10224958440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:08,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 08:38:09,646][15401] Updated weights for policy 0, policy_version 624081 (0.0027) [2024-06-24 08:38:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10225090560. Throughput: 0: 43189.0. Samples: 10225221840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 08:38:13,418][15401] Updated weights for policy 0, policy_version 624091 (0.0029) [2024-06-24 08:38:17,283][15401] Updated weights for policy 0, policy_version 624101 (0.0029) [2024-06-24 08:38:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10225303552. Throughput: 0: 42973.4. Samples: 10225469980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 08:38:21,179][15401] Updated weights for policy 0, policy_version 624111 (0.0031) [2024-06-24 08:38:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10225532928. Throughput: 0: 43097.0. Samples: 10225598980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 08:38:24,959][15401] Updated weights for policy 0, policy_version 624121 (0.0030) [2024-06-24 08:38:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10225713152. Throughput: 0: 42991.1. Samples: 10225859060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 08:38:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 08:38:28,826][15401] Updated weights for policy 0, policy_version 624131 (0.0044) [2024-06-24 08:38:32,658][15401] Updated weights for policy 0, policy_version 624141 (0.0040) [2024-06-24 08:38:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 10225942528. Throughput: 0: 42770.7. Samples: 10226109020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 08:38:36,482][15401] Updated weights for policy 0, policy_version 624151 (0.0028) [2024-06-24 08:38:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 10226171904. Throughput: 0: 42928.8. Samples: 10226239300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 08:38:40,314][15401] Updated weights for policy 0, policy_version 624161 (0.0024) [2024-06-24 08:38:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 10226368512. Throughput: 0: 42759.6. Samples: 10226501160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:43,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 08:38:44,041][15401] Updated weights for policy 0, policy_version 624171 (0.0037) [2024-06-24 08:38:47,990][15401] Updated weights for policy 0, policy_version 624181 (0.0036) [2024-06-24 08:38:48,391][15132] Fps is (10 sec: 40954.8, 60 sec: 42870.5, 300 sec: 42765.2). Total num frames: 10226581504. Throughput: 0: 42724.0. Samples: 10226757260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:48,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 08:38:51,619][15401] Updated weights for policy 0, policy_version 624191 (0.0032) [2024-06-24 08:38:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10226810880. Throughput: 0: 42841.1. Samples: 10226886280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 08:38:55,882][15401] Updated weights for policy 0, policy_version 624201 (0.0032) [2024-06-24 08:38:56,373][15349] Signal inference workers to stop experience collection... (151450 times) [2024-06-24 08:38:56,374][15349] Signal inference workers to resume experience collection... (151450 times) [2024-06-24 08:38:56,390][15401] InferenceWorker_p0-w0: stopping experience collection (151450 times) [2024-06-24 08:38:56,390][15401] InferenceWorker_p0-w0: resuming experience collection (151450 times) [2024-06-24 08:38:58,396][15132] Fps is (10 sec: 42577.0, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 10227007488. Throughput: 0: 42494.4. Samples: 10227134360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:38:58,397][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 08:38:59,558][15401] Updated weights for policy 0, policy_version 624211 (0.0029) [2024-06-24 08:39:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10227220480. Throughput: 0: 42837.3. Samples: 10227397760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:03,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:39:03,528][15401] Updated weights for policy 0, policy_version 624221 (0.0034) [2024-06-24 08:39:06,840][15401] Updated weights for policy 0, policy_version 624231 (0.0031) [2024-06-24 08:39:08,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10227449856. Throughput: 0: 42888.4. Samples: 10227528960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:08,392][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 08:39:11,123][15401] Updated weights for policy 0, policy_version 624241 (0.0043) [2024-06-24 08:39:13,390][15132] Fps is (10 sec: 44246.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10227662848. Throughput: 0: 42737.7. Samples: 10227782260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 08:39:14,460][15401] Updated weights for policy 0, policy_version 624251 (0.0039) [2024-06-24 08:39:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 10227875840. Throughput: 0: 43038.1. Samples: 10228045740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 08:39:18,799][15401] Updated weights for policy 0, policy_version 624261 (0.0033) [2024-06-24 08:39:22,135][15401] Updated weights for policy 0, policy_version 624271 (0.0040) [2024-06-24 08:39:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 10228088832. Throughput: 0: 42903.2. Samples: 10228169940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:23,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 08:39:26,439][15401] Updated weights for policy 0, policy_version 624281 (0.0038) [2024-06-24 08:39:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10228301824. Throughput: 0: 42617.0. Samples: 10228418820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 08:39:30,139][15401] Updated weights for policy 0, policy_version 624291 (0.0025) [2024-06-24 08:39:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10228514816. Throughput: 0: 42620.4. Samples: 10228675120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 08:39:34,219][15401] Updated weights for policy 0, policy_version 624301 (0.0039) [2024-06-24 08:39:37,719][15401] Updated weights for policy 0, policy_version 624311 (0.0031) [2024-06-24 08:39:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10228727808. Throughput: 0: 42578.6. Samples: 10228802320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 08:39:41,725][15401] Updated weights for policy 0, policy_version 624321 (0.0036) [2024-06-24 08:39:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 10228940800. Throughput: 0: 42744.6. Samples: 10229057600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 08:39:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624325_10228940800.pth... [2024-06-24 08:39:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000623699_10218684416.pth [2024-06-24 08:39:45,262][15401] Updated weights for policy 0, policy_version 624331 (0.0030) [2024-06-24 08:39:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42872.5, 300 sec: 42765.0). Total num frames: 10229153792. Throughput: 0: 42560.0. Samples: 10229312860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 08:39:49,330][15401] Updated weights for policy 0, policy_version 624341 (0.0034) [2024-06-24 08:39:53,374][15401] Updated weights for policy 0, policy_version 624351 (0.0042) [2024-06-24 08:39:53,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 10229366784. Throughput: 0: 42469.0. Samples: 10229440340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:53,396][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 08:39:57,403][15401] Updated weights for policy 0, policy_version 624361 (0.0047) [2024-06-24 08:39:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 10229579776. Throughput: 0: 42528.1. Samples: 10229696020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:39:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 08:40:00,918][15401] Updated weights for policy 0, policy_version 624371 (0.0034) [2024-06-24 08:40:03,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42600.1, 300 sec: 42654.3). Total num frames: 10229776384. Throughput: 0: 42269.9. Samples: 10229947880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-24 08:40:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 08:40:05,107][15401] Updated weights for policy 0, policy_version 624381 (0.0036) [2024-06-24 08:40:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10230005760. Throughput: 0: 42391.6. Samples: 10230077560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:08,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 08:40:08,571][15401] Updated weights for policy 0, policy_version 624391 (0.0027) [2024-06-24 08:40:12,730][15401] Updated weights for policy 0, policy_version 624401 (0.0039) [2024-06-24 08:40:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10230218752. Throughput: 0: 42447.5. Samples: 10230328960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 08:40:16,728][15401] Updated weights for policy 0, policy_version 624411 (0.0030) [2024-06-24 08:40:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 10230398976. Throughput: 0: 42344.0. Samples: 10230580600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 08:40:20,715][15401] Updated weights for policy 0, policy_version 624421 (0.0038) [2024-06-24 08:40:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10230611968. Throughput: 0: 42351.2. Samples: 10230708120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 08:40:24,179][15401] Updated weights for policy 0, policy_version 624431 (0.0035) [2024-06-24 08:40:28,226][15401] Updated weights for policy 0, policy_version 624441 (0.0039) [2024-06-24 08:40:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10230841344. Throughput: 0: 42287.5. Samples: 10230960540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:40:31,669][15349] Signal inference workers to stop experience collection... (151500 times) [2024-06-24 08:40:31,669][15349] Signal inference workers to resume experience collection... (151500 times) [2024-06-24 08:40:31,698][15401] InferenceWorker_p0-w0: stopping experience collection (151500 times) [2024-06-24 08:40:31,699][15401] InferenceWorker_p0-w0: resuming experience collection (151500 times) [2024-06-24 08:40:31,808][15401] Updated weights for policy 0, policy_version 624451 (0.0032) [2024-06-24 08:40:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10231054336. Throughput: 0: 42378.7. Samples: 10231219900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 08:40:35,820][15401] Updated weights for policy 0, policy_version 624461 (0.0056) [2024-06-24 08:40:38,389][15132] Fps is (10 sec: 39322.5, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 10231234560. Throughput: 0: 42332.8. Samples: 10231345040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 08:40:39,817][15401] Updated weights for policy 0, policy_version 624471 (0.0043) [2024-06-24 08:40:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10231496704. Throughput: 0: 42386.7. Samples: 10231603420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 08:40:43,394][15401] Updated weights for policy 0, policy_version 624481 (0.0032) [2024-06-24 08:40:47,381][15401] Updated weights for policy 0, policy_version 624491 (0.0028) [2024-06-24 08:40:48,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 10231693312. Throughput: 0: 42353.1. Samples: 10231854040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:48,396][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 08:40:51,109][15401] Updated weights for policy 0, policy_version 624501 (0.0043) [2024-06-24 08:40:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42056.8, 300 sec: 42598.4). Total num frames: 10231889920. Throughput: 0: 42292.4. Samples: 10231980720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:53,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 08:40:55,567][15401] Updated weights for policy 0, policy_version 624511 (0.0022) [2024-06-24 08:40:58,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 10232135680. Throughput: 0: 42446.8. Samples: 10232239060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:40:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 08:40:58,967][15401] Updated weights for policy 0, policy_version 624521 (0.0035) [2024-06-24 08:41:03,129][15401] Updated weights for policy 0, policy_version 624531 (0.0043) [2024-06-24 08:41:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 10232315904. Throughput: 0: 42546.2. Samples: 10232495280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:03,392][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 08:41:06,494][15401] Updated weights for policy 0, policy_version 624541 (0.0034) [2024-06-24 08:41:08,391][15132] Fps is (10 sec: 40955.1, 60 sec: 42324.5, 300 sec: 42598.3). Total num frames: 10232545280. Throughput: 0: 42393.6. Samples: 10232615880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:08,391][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 08:41:10,893][15401] Updated weights for policy 0, policy_version 624551 (0.0040) [2024-06-24 08:41:13,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10232758272. Throughput: 0: 42634.7. Samples: 10232879100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 08:41:14,119][15401] Updated weights for policy 0, policy_version 624561 (0.0032) [2024-06-24 08:41:18,389][15132] Fps is (10 sec: 40964.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10232954880. Throughput: 0: 42531.1. Samples: 10233133800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 08:41:18,652][15401] Updated weights for policy 0, policy_version 624571 (0.0038) [2024-06-24 08:41:21,711][15401] Updated weights for policy 0, policy_version 624581 (0.0035) [2024-06-24 08:41:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10233167872. Throughput: 0: 42508.8. Samples: 10233257940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 08:41:26,209][15401] Updated weights for policy 0, policy_version 624591 (0.0037) [2024-06-24 08:41:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 10233380864. Throughput: 0: 42459.2. Samples: 10233514080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 08:41:29,593][15401] Updated weights for policy 0, policy_version 624601 (0.0031) [2024-06-24 08:41:33,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 10233610240. Throughput: 0: 42535.9. Samples: 10233768160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:33,396][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 08:41:34,199][15401] Updated weights for policy 0, policy_version 624611 (0.0033) [2024-06-24 08:41:37,127][15401] Updated weights for policy 0, policy_version 624621 (0.0045) [2024-06-24 08:41:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10233823232. Throughput: 0: 42625.9. Samples: 10233898880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 08:41:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 08:41:41,674][15401] Updated weights for policy 0, policy_version 624631 (0.0031) [2024-06-24 08:41:43,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10234036224. Throughput: 0: 42424.3. Samples: 10234148160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:41:43,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 08:41:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624636_10234036224.pth... [2024-06-24 08:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624013_10223828992.pth [2024-06-24 08:41:44,862][15401] Updated weights for policy 0, policy_version 624641 (0.0030) [2024-06-24 08:41:48,158][15349] Signal inference workers to stop experience collection... (151550 times) [2024-06-24 08:41:48,209][15349] Signal inference workers to resume experience collection... (151550 times) [2024-06-24 08:41:48,209][15401] InferenceWorker_p0-w0: stopping experience collection (151550 times) [2024-06-24 08:41:48,227][15401] InferenceWorker_p0-w0: resuming experience collection (151550 times) [2024-06-24 08:41:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 10234249216. Throughput: 0: 42552.6. Samples: 10234410040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:41:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 08:41:49,118][15401] Updated weights for policy 0, policy_version 624651 (0.0039) [2024-06-24 08:41:52,519][15401] Updated weights for policy 0, policy_version 624661 (0.0040) [2024-06-24 08:41:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10234478592. Throughput: 0: 42736.6. Samples: 10234538980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:41:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 08:41:56,689][15401] Updated weights for policy 0, policy_version 624671 (0.0044) [2024-06-24 08:41:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10234675200. Throughput: 0: 42387.1. Samples: 10234786520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:41:58,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-24 08:42:00,310][15401] Updated weights for policy 0, policy_version 624681 (0.0027) [2024-06-24 08:42:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10234904576. Throughput: 0: 42650.2. Samples: 10235053060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:42:04,122][15401] Updated weights for policy 0, policy_version 624691 (0.0033) [2024-06-24 08:42:07,879][15401] Updated weights for policy 0, policy_version 624701 (0.0030) [2024-06-24 08:42:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42599.2, 300 sec: 42653.9). Total num frames: 10235101184. Throughput: 0: 42787.1. Samples: 10235183360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 08:42:11,734][15401] Updated weights for policy 0, policy_version 624711 (0.0044) [2024-06-24 08:42:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10235297792. Throughput: 0: 42707.4. Samples: 10235435920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 08:42:15,934][15401] Updated weights for policy 0, policy_version 624721 (0.0024) [2024-06-24 08:42:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10235527168. Throughput: 0: 42657.2. Samples: 10235687460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 08:42:19,964][15401] Updated weights for policy 0, policy_version 624731 (0.0021) [2024-06-24 08:42:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 10235740160. Throughput: 0: 42711.5. Samples: 10235820900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 08:42:23,480][15401] Updated weights for policy 0, policy_version 624741 (0.0046) [2024-06-24 08:42:27,982][15401] Updated weights for policy 0, policy_version 624751 (0.0038) [2024-06-24 08:42:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10235920384. Throughput: 0: 42716.0. Samples: 10236070380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 08:42:30,974][15401] Updated weights for policy 0, policy_version 624761 (0.0022) [2024-06-24 08:42:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42603.0, 300 sec: 42598.7). Total num frames: 10236166144. Throughput: 0: 42587.0. Samples: 10236326460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:33,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 08:42:35,928][15401] Updated weights for policy 0, policy_version 624771 (0.0038) [2024-06-24 08:42:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10236395520. Throughput: 0: 42745.2. Samples: 10236462520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 08:42:38,675][15401] Updated weights for policy 0, policy_version 624781 (0.0042) [2024-06-24 08:42:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10236559360. Throughput: 0: 42820.4. Samples: 10236713440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:43,395][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 08:42:43,596][15401] Updated weights for policy 0, policy_version 624791 (0.0026) [2024-06-24 08:42:46,355][15401] Updated weights for policy 0, policy_version 624801 (0.0027) [2024-06-24 08:42:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10236805120. Throughput: 0: 42627.1. Samples: 10236971280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:48,391][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 08:42:51,115][15401] Updated weights for policy 0, policy_version 624811 (0.0025) [2024-06-24 08:42:53,389][15132] Fps is (10 sec: 49153.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10237050880. Throughput: 0: 42684.5. Samples: 10237104160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:53,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-24 08:42:54,160][15401] Updated weights for policy 0, policy_version 624821 (0.0033) [2024-06-24 08:42:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10237214720. Throughput: 0: 42606.3. Samples: 10237353200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:42:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 08:42:58,655][15401] Updated weights for policy 0, policy_version 624831 (0.0030) [2024-06-24 08:43:01,899][15401] Updated weights for policy 0, policy_version 624841 (0.0034) [2024-06-24 08:43:03,389][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10237427712. Throughput: 0: 42707.1. Samples: 10237609280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:43:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 08:43:06,192][15401] Updated weights for policy 0, policy_version 624851 (0.0036) [2024-06-24 08:43:07,398][15349] Signal inference workers to stop experience collection... (151600 times) [2024-06-24 08:43:07,398][15349] Signal inference workers to resume experience collection... (151600 times) [2024-06-24 08:43:07,416][15401] InferenceWorker_p0-w0: stopping experience collection (151600 times) [2024-06-24 08:43:07,443][15401] InferenceWorker_p0-w0: resuming experience collection (151600 times) [2024-06-24 08:43:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 10237640704. Throughput: 0: 42456.7. Samples: 10237731460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:43:08,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 08:43:10,005][15401] Updated weights for policy 0, policy_version 624861 (0.0031) [2024-06-24 08:43:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10237870080. Throughput: 0: 42437.0. Samples: 10237980040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 08:43:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 08:43:14,225][15401] Updated weights for policy 0, policy_version 624871 (0.0046) [2024-06-24 08:43:17,747][15401] Updated weights for policy 0, policy_version 624881 (0.0039) [2024-06-24 08:43:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10238083072. Throughput: 0: 42465.8. Samples: 10238237420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 08:43:21,722][15401] Updated weights for policy 0, policy_version 624891 (0.0040) [2024-06-24 08:43:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10238263296. Throughput: 0: 42270.4. Samples: 10238364680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 08:43:25,358][15401] Updated weights for policy 0, policy_version 624901 (0.0027) [2024-06-24 08:43:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 10238509056. Throughput: 0: 42361.6. Samples: 10238619700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 08:43:29,167][15401] Updated weights for policy 0, policy_version 624911 (0.0032) [2024-06-24 08:43:32,944][15401] Updated weights for policy 0, policy_version 624921 (0.0036) [2024-06-24 08:43:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10238705664. Throughput: 0: 42359.9. Samples: 10238877480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:33,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 08:43:37,111][15401] Updated weights for policy 0, policy_version 624931 (0.0031) [2024-06-24 08:43:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.4, 300 sec: 42487.7). Total num frames: 10238902272. Throughput: 0: 42188.9. Samples: 10239002660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 08:43:40,800][15401] Updated weights for policy 0, policy_version 624941 (0.0037) [2024-06-24 08:43:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42543.1). Total num frames: 10239131648. Throughput: 0: 42210.2. Samples: 10239252660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 08:43:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624947_10239131648.pth... [2024-06-24 08:43:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624325_10228940800.pth [2024-06-24 08:43:45,039][15401] Updated weights for policy 0, policy_version 624951 (0.0028) [2024-06-24 08:43:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10239344640. Throughput: 0: 42253.7. Samples: 10239510700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:48,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 08:43:48,501][15401] Updated weights for policy 0, policy_version 624961 (0.0027) [2024-06-24 08:43:52,493][15401] Updated weights for policy 0, policy_version 624971 (0.0035) [2024-06-24 08:43:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.0, 300 sec: 42543.8). Total num frames: 10239557632. Throughput: 0: 42411.6. Samples: 10239639980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 08:43:56,073][15401] Updated weights for policy 0, policy_version 624981 (0.0037) [2024-06-24 08:43:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42542.9). Total num frames: 10239770624. Throughput: 0: 42476.4. Samples: 10239891580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:43:58,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 08:43:59,977][15401] Updated weights for policy 0, policy_version 624991 (0.0029) [2024-06-24 08:44:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 10239967232. Throughput: 0: 42649.6. Samples: 10240156660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 08:44:03,868][15401] Updated weights for policy 0, policy_version 625001 (0.0038) [2024-06-24 08:44:07,289][15401] Updated weights for policy 0, policy_version 625011 (0.0034) [2024-06-24 08:44:08,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10240196608. Throughput: 0: 42666.1. Samples: 10240284660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 08:44:11,379][15401] Updated weights for policy 0, policy_version 625021 (0.0041) [2024-06-24 08:44:13,390][15132] Fps is (10 sec: 45872.2, 60 sec: 42597.8, 300 sec: 42542.8). Total num frames: 10240425984. Throughput: 0: 42702.7. Samples: 10240541360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:13,391][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 08:44:15,144][15401] Updated weights for policy 0, policy_version 625031 (0.0041) [2024-06-24 08:44:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10240622592. Throughput: 0: 42707.3. Samples: 10240799300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 08:44:18,850][15401] Updated weights for policy 0, policy_version 625041 (0.0042) [2024-06-24 08:44:21,457][15349] Signal inference workers to stop experience collection... (151650 times) [2024-06-24 08:44:21,493][15401] InferenceWorker_p0-w0: stopping experience collection (151650 times) [2024-06-24 08:44:21,519][15349] Signal inference workers to resume experience collection... (151650 times) [2024-06-24 08:44:21,520][15401] InferenceWorker_p0-w0: resuming experience collection (151650 times) [2024-06-24 08:44:22,642][15401] Updated weights for policy 0, policy_version 625051 (0.0025) [2024-06-24 08:44:23,389][15132] Fps is (10 sec: 40963.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10240835584. Throughput: 0: 42797.3. Samples: 10240928540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:23,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 08:44:26,434][15401] Updated weights for policy 0, policy_version 625061 (0.0026) [2024-06-24 08:44:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10241064960. Throughput: 0: 43068.2. Samples: 10241190720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 08:44:30,253][15401] Updated weights for policy 0, policy_version 625071 (0.0024) [2024-06-24 08:44:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 10241277952. Throughput: 0: 43046.7. Samples: 10241447800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 08:44:34,333][15401] Updated weights for policy 0, policy_version 625081 (0.0034) [2024-06-24 08:44:37,694][15401] Updated weights for policy 0, policy_version 625091 (0.0036) [2024-06-24 08:44:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 10241490944. Throughput: 0: 42941.9. Samples: 10241572360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 08:44:41,964][15401] Updated weights for policy 0, policy_version 625101 (0.0032) [2024-06-24 08:44:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10241720320. Throughput: 0: 43156.5. Samples: 10241833520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 08:44:45,754][15401] Updated weights for policy 0, policy_version 625111 (0.0032) [2024-06-24 08:44:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 10241933312. Throughput: 0: 43049.5. Samples: 10242093880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:48,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 08:44:49,559][15401] Updated weights for policy 0, policy_version 625121 (0.0028) [2024-06-24 08:44:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10242129920. Throughput: 0: 42883.6. Samples: 10242214420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 08:44:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 08:44:53,431][15401] Updated weights for policy 0, policy_version 625131 (0.0033) [2024-06-24 08:44:57,020][15401] Updated weights for policy 0, policy_version 625141 (0.0048) [2024-06-24 08:44:58,395][15132] Fps is (10 sec: 44212.8, 60 sec: 43415.4, 300 sec: 42708.7). Total num frames: 10242375680. Throughput: 0: 42908.5. Samples: 10242472440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:44:58,395][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 08:45:01,167][15401] Updated weights for policy 0, policy_version 625151 (0.0027) [2024-06-24 08:45:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 10242555904. Throughput: 0: 43096.6. Samples: 10242738640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 08:45:04,470][15401] Updated weights for policy 0, policy_version 625161 (0.0029) [2024-06-24 08:45:08,389][15132] Fps is (10 sec: 39343.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 10242768896. Throughput: 0: 42887.6. Samples: 10242858480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 08:45:08,874][15401] Updated weights for policy 0, policy_version 625171 (0.0041) [2024-06-24 08:45:12,026][15401] Updated weights for policy 0, policy_version 625181 (0.0033) [2024-06-24 08:45:13,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42872.0, 300 sec: 42709.5). Total num frames: 10242998272. Throughput: 0: 42748.3. Samples: 10243114400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:13,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 08:45:16,660][15401] Updated weights for policy 0, policy_version 625191 (0.0033) [2024-06-24 08:45:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10243194880. Throughput: 0: 42819.5. Samples: 10243374680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:18,394][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 08:45:19,754][15401] Updated weights for policy 0, policy_version 625201 (0.0033) [2024-06-24 08:45:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10243407872. Throughput: 0: 42904.5. Samples: 10243503060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 08:45:24,265][15401] Updated weights for policy 0, policy_version 625211 (0.0036) [2024-06-24 08:45:27,487][15401] Updated weights for policy 0, policy_version 625221 (0.0028) [2024-06-24 08:45:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10243653632. Throughput: 0: 42815.1. Samples: 10243760200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 08:45:32,160][15401] Updated weights for policy 0, policy_version 625231 (0.0036) [2024-06-24 08:45:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10243833856. Throughput: 0: 42749.8. Samples: 10244017620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 08:45:35,074][15401] Updated weights for policy 0, policy_version 625241 (0.0030) [2024-06-24 08:45:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10244063232. Throughput: 0: 42803.1. Samples: 10244140560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 08:45:39,687][15401] Updated weights for policy 0, policy_version 625251 (0.0034) [2024-06-24 08:45:40,797][15349] Signal inference workers to stop experience collection... (151700 times) [2024-06-24 08:45:40,799][15349] Signal inference workers to resume experience collection... (151700 times) [2024-06-24 08:45:40,840][15401] InferenceWorker_p0-w0: stopping experience collection (151700 times) [2024-06-24 08:45:40,840][15401] InferenceWorker_p0-w0: resuming experience collection (151700 times) [2024-06-24 08:45:42,823][15401] Updated weights for policy 0, policy_version 625261 (0.0038) [2024-06-24 08:45:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42710.0). Total num frames: 10244292608. Throughput: 0: 42871.7. Samples: 10244401540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:43,393][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 08:45:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625263_10244308992.pth... [2024-06-24 08:45:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624636_10234036224.pth [2024-06-24 08:45:47,302][15401] Updated weights for policy 0, policy_version 625271 (0.0027) [2024-06-24 08:45:48,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 10244489216. Throughput: 0: 42628.1. Samples: 10244657180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:48,396][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 08:45:50,913][15401] Updated weights for policy 0, policy_version 625281 (0.0033) [2024-06-24 08:45:53,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10244685824. Throughput: 0: 42620.0. Samples: 10244776380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 08:45:55,038][15401] Updated weights for policy 0, policy_version 625291 (0.0029) [2024-06-24 08:45:58,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42329.2, 300 sec: 42709.8). Total num frames: 10244915200. Throughput: 0: 42785.4. Samples: 10245039740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:45:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 08:45:58,485][15401] Updated weights for policy 0, policy_version 625301 (0.0045) [2024-06-24 08:46:02,586][15401] Updated weights for policy 0, policy_version 625311 (0.0042) [2024-06-24 08:46:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42598.5). Total num frames: 10245111808. Throughput: 0: 42720.8. Samples: 10245297120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 08:46:06,099][15401] Updated weights for policy 0, policy_version 625321 (0.0046) [2024-06-24 08:46:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10245341184. Throughput: 0: 42567.1. Samples: 10245418580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 08:46:10,215][15401] Updated weights for policy 0, policy_version 625331 (0.0033) [2024-06-24 08:46:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10245570560. Throughput: 0: 42677.7. Samples: 10245680700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:46:13,597][15401] Updated weights for policy 0, policy_version 625341 (0.0022) [2024-06-24 08:46:17,790][15401] Updated weights for policy 0, policy_version 625351 (0.0038) [2024-06-24 08:46:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10245767168. Throughput: 0: 42597.9. Samples: 10245934520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 08:46:21,191][15401] Updated weights for policy 0, policy_version 625361 (0.0026) [2024-06-24 08:46:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 10245980160. Throughput: 0: 42667.9. Samples: 10246060620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 08:46:25,347][15401] Updated weights for policy 0, policy_version 625371 (0.0034) [2024-06-24 08:46:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 10246193152. Throughput: 0: 42626.8. Samples: 10246319640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 08:46:28,399][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 08:46:28,960][15401] Updated weights for policy 0, policy_version 625381 (0.0029) [2024-06-24 08:46:33,270][15401] Updated weights for policy 0, policy_version 625391 (0.0038) [2024-06-24 08:46:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10246406144. Throughput: 0: 42629.5. Samples: 10246575240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 08:46:36,616][15401] Updated weights for policy 0, policy_version 625401 (0.0026) [2024-06-24 08:46:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10246635520. Throughput: 0: 42879.0. Samples: 10246705940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 08:46:40,958][15401] Updated weights for policy 0, policy_version 625411 (0.0025) [2024-06-24 08:46:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 10246815744. Throughput: 0: 42596.4. Samples: 10246956580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:43,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 08:46:44,548][15401] Updated weights for policy 0, policy_version 625421 (0.0032) [2024-06-24 08:46:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42329.8, 300 sec: 42542.8). Total num frames: 10247028736. Throughput: 0: 42629.4. Samples: 10247215440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 08:46:48,953][15401] Updated weights for policy 0, policy_version 625431 (0.0029) [2024-06-24 08:46:52,095][15401] Updated weights for policy 0, policy_version 625441 (0.0026) [2024-06-24 08:46:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10247290880. Throughput: 0: 42822.6. Samples: 10247345600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 08:46:56,408][15401] Updated weights for policy 0, policy_version 625451 (0.0034) [2024-06-24 08:46:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10247471104. Throughput: 0: 42687.1. Samples: 10247601620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:46:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 08:46:59,643][15401] Updated weights for policy 0, policy_version 625461 (0.0032) [2024-06-24 08:47:00,559][15349] Signal inference workers to stop experience collection... (151750 times) [2024-06-24 08:47:00,587][15401] InferenceWorker_p0-w0: stopping experience collection (151750 times) [2024-06-24 08:47:00,624][15349] Signal inference workers to resume experience collection... (151750 times) [2024-06-24 08:47:00,625][15401] InferenceWorker_p0-w0: resuming experience collection (151750 times) [2024-06-24 08:47:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10247684096. Throughput: 0: 42811.4. Samples: 10247861040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 08:47:04,126][15401] Updated weights for policy 0, policy_version 625471 (0.0033) [2024-06-24 08:47:07,526][15401] Updated weights for policy 0, policy_version 625481 (0.0040) [2024-06-24 08:47:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10247913472. Throughput: 0: 42773.4. Samples: 10247985420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 08:47:11,878][15401] Updated weights for policy 0, policy_version 625491 (0.0040) [2024-06-24 08:47:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10248142848. Throughput: 0: 42778.2. Samples: 10248244660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 08:47:15,025][15401] Updated weights for policy 0, policy_version 625501 (0.0026) [2024-06-24 08:47:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10248323072. Throughput: 0: 42913.7. Samples: 10248506360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 08:47:19,343][15401] Updated weights for policy 0, policy_version 625511 (0.0042) [2024-06-24 08:47:22,668][15401] Updated weights for policy 0, policy_version 625521 (0.0033) [2024-06-24 08:47:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10248568832. Throughput: 0: 42710.7. Samples: 10248627920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 08:47:26,827][15401] Updated weights for policy 0, policy_version 625531 (0.0022) [2024-06-24 08:47:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10248765440. Throughput: 0: 42915.8. Samples: 10248887800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 08:47:30,536][15401] Updated weights for policy 0, policy_version 625541 (0.0049) [2024-06-24 08:47:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10248978432. Throughput: 0: 42760.2. Samples: 10249139640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 08:47:34,682][15401] Updated weights for policy 0, policy_version 625551 (0.0029) [2024-06-24 08:47:37,948][15401] Updated weights for policy 0, policy_version 625561 (0.0035) [2024-06-24 08:47:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10249207808. Throughput: 0: 42770.8. Samples: 10249270280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 08:47:42,576][15401] Updated weights for policy 0, policy_version 625571 (0.0039) [2024-06-24 08:47:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10249420800. Throughput: 0: 42958.6. Samples: 10249534760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 08:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625575_10249420800.pth... [2024-06-24 08:47:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000624947_10239131648.pth [2024-06-24 08:47:45,650][15401] Updated weights for policy 0, policy_version 625581 (0.0036) [2024-06-24 08:47:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10249633792. Throughput: 0: 42761.3. Samples: 10249785300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 08:47:50,291][15401] Updated weights for policy 0, policy_version 625591 (0.0041) [2024-06-24 08:47:53,203][15401] Updated weights for policy 0, policy_version 625601 (0.0038) [2024-06-24 08:47:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10249846784. Throughput: 0: 42849.8. Samples: 10249913660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 08:47:57,783][15401] Updated weights for policy 0, policy_version 625611 (0.0032) [2024-06-24 08:47:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10250059776. Throughput: 0: 42937.0. Samples: 10250176820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:47:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 08:48:01,044][15401] Updated weights for policy 0, policy_version 625621 (0.0037) [2024-06-24 08:48:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10250256384. Throughput: 0: 42732.1. Samples: 10250429300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:48:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 08:48:05,264][15401] Updated weights for policy 0, policy_version 625631 (0.0044) [2024-06-24 08:48:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10250485760. Throughput: 0: 42849.9. Samples: 10250556160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 08:48:08,729][15401] Updated weights for policy 0, policy_version 625641 (0.0035) [2024-06-24 08:48:12,896][15401] Updated weights for policy 0, policy_version 625651 (0.0033) [2024-06-24 08:48:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10250665984. Throughput: 0: 42765.9. Samples: 10250812260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 08:48:16,269][15401] Updated weights for policy 0, policy_version 625661 (0.0028) [2024-06-24 08:48:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10250895360. Throughput: 0: 42809.2. Samples: 10251066060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:48:20,379][15401] Updated weights for policy 0, policy_version 625671 (0.0035) [2024-06-24 08:48:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 10251108352. Throughput: 0: 42899.8. Samples: 10251200880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:23,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 08:48:24,067][15401] Updated weights for policy 0, policy_version 625681 (0.0031) [2024-06-24 08:48:28,285][15401] Updated weights for policy 0, policy_version 625691 (0.0035) [2024-06-24 08:48:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10251321344. Throughput: 0: 42597.9. Samples: 10251451660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 08:48:31,654][15401] Updated weights for policy 0, policy_version 625701 (0.0039) [2024-06-24 08:48:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10251550720. Throughput: 0: 42694.3. Samples: 10251706540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 08:48:35,801][15401] Updated weights for policy 0, policy_version 625711 (0.0029) [2024-06-24 08:48:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10251730944. Throughput: 0: 42750.3. Samples: 10251837420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 08:48:38,571][15349] Signal inference workers to stop experience collection... (151800 times) [2024-06-24 08:48:38,576][15349] Signal inference workers to resume experience collection... (151800 times) [2024-06-24 08:48:38,609][15401] InferenceWorker_p0-w0: stopping experience collection (151800 times) [2024-06-24 08:48:38,609][15401] InferenceWorker_p0-w0: resuming experience collection (151800 times) [2024-06-24 08:48:39,429][15401] Updated weights for policy 0, policy_version 625721 (0.0040) [2024-06-24 08:48:43,332][15401] Updated weights for policy 0, policy_version 625731 (0.0039) [2024-06-24 08:48:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10251976704. Throughput: 0: 42637.1. Samples: 10252095500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:48:47,273][15401] Updated weights for policy 0, policy_version 625741 (0.0039) [2024-06-24 08:48:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 10252189696. Throughput: 0: 42601.7. Samples: 10252346380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 08:48:51,093][15401] Updated weights for policy 0, policy_version 625751 (0.0038) [2024-06-24 08:48:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 10252386304. Throughput: 0: 42639.9. Samples: 10252474960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 08:48:54,952][15401] Updated weights for policy 0, policy_version 625761 (0.0046) [2024-06-24 08:48:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10252615680. Throughput: 0: 42726.2. Samples: 10252734940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:48:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 08:48:58,603][15401] Updated weights for policy 0, policy_version 625771 (0.0031) [2024-06-24 08:49:02,571][15401] Updated weights for policy 0, policy_version 625781 (0.0022) [2024-06-24 08:49:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10252828672. Throughput: 0: 42695.2. Samples: 10252987340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:03,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 08:49:06,620][15401] Updated weights for policy 0, policy_version 625791 (0.0030) [2024-06-24 08:49:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.1). Total num frames: 10253041664. Throughput: 0: 42576.9. Samples: 10253116740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 08:49:10,493][15401] Updated weights for policy 0, policy_version 625801 (0.0030) [2024-06-24 08:49:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10253238272. Throughput: 0: 42692.5. Samples: 10253372820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 08:49:14,165][15401] Updated weights for policy 0, policy_version 625811 (0.0032) [2024-06-24 08:49:18,387][15401] Updated weights for policy 0, policy_version 625821 (0.0039) [2024-06-24 08:49:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10253451264. Throughput: 0: 42649.7. Samples: 10253625780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 08:49:21,813][15401] Updated weights for policy 0, policy_version 625831 (0.0031) [2024-06-24 08:49:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 10253680640. Throughput: 0: 42570.9. Samples: 10253753120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 08:49:25,911][15401] Updated weights for policy 0, policy_version 625841 (0.0031) [2024-06-24 08:49:28,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10253877248. Throughput: 0: 42497.8. Samples: 10254008000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:28,393][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 08:49:29,501][15401] Updated weights for policy 0, policy_version 625851 (0.0051) [2024-06-24 08:49:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10254090240. Throughput: 0: 42610.3. Samples: 10254263840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 08:49:33,511][15401] Updated weights for policy 0, policy_version 625861 (0.0025) [2024-06-24 08:49:37,239][15401] Updated weights for policy 0, policy_version 625871 (0.0034) [2024-06-24 08:49:38,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 10254336000. Throughput: 0: 42557.7. Samples: 10254390060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 08:49:38,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 08:49:41,174][15401] Updated weights for policy 0, policy_version 625881 (0.0037) [2024-06-24 08:49:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10254499840. Throughput: 0: 42333.8. Samples: 10254639960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:49:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 08:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625886_10254516224.pth... [2024-06-24 08:49:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625263_10244308992.pth [2024-06-24 08:49:44,943][15401] Updated weights for policy 0, policy_version 625891 (0.0037) [2024-06-24 08:49:48,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10254712832. Throughput: 0: 42515.5. Samples: 10254900540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:49:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 08:49:49,410][15401] Updated weights for policy 0, policy_version 625901 (0.0032) [2024-06-24 08:49:52,566][15401] Updated weights for policy 0, policy_version 625911 (0.0032) [2024-06-24 08:49:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42654.7). Total num frames: 10254958592. Throughput: 0: 42370.6. Samples: 10255023420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:49:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 08:49:56,876][15349] Signal inference workers to stop experience collection... (151850 times) [2024-06-24 08:49:56,876][15349] Signal inference workers to resume experience collection... (151850 times) [2024-06-24 08:49:56,917][15401] InferenceWorker_p0-w0: stopping experience collection (151850 times) [2024-06-24 08:49:56,917][15401] InferenceWorker_p0-w0: resuming experience collection (151850 times) [2024-06-24 08:49:57,043][15401] Updated weights for policy 0, policy_version 625921 (0.0032) [2024-06-24 08:49:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 10255155200. Throughput: 0: 42350.5. Samples: 10255278600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:49:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 08:50:00,210][15401] Updated weights for policy 0, policy_version 625931 (0.0037) [2024-06-24 08:50:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10255351808. Throughput: 0: 42412.9. Samples: 10255534360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 08:50:04,787][15401] Updated weights for policy 0, policy_version 625941 (0.0040) [2024-06-24 08:50:08,200][15401] Updated weights for policy 0, policy_version 625951 (0.0026) [2024-06-24 08:50:08,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10255597568. Throughput: 0: 42327.7. Samples: 10255657960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:08,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 08:50:12,589][15401] Updated weights for policy 0, policy_version 625961 (0.0041) [2024-06-24 08:50:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10255777792. Throughput: 0: 42417.0. Samples: 10255916660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 08:50:15,792][15401] Updated weights for policy 0, policy_version 625971 (0.0050) [2024-06-24 08:50:18,392][15132] Fps is (10 sec: 40959.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10256007168. Throughput: 0: 42472.8. Samples: 10256175220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:18,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 08:50:20,223][15401] Updated weights for policy 0, policy_version 625981 (0.0033) [2024-06-24 08:50:23,376][15401] Updated weights for policy 0, policy_version 625991 (0.0035) [2024-06-24 08:50:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10256236544. Throughput: 0: 42477.8. Samples: 10256301560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:23,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 08:50:27,733][15401] Updated weights for policy 0, policy_version 626001 (0.0031) [2024-06-24 08:50:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10256433152. Throughput: 0: 42693.9. Samples: 10256561180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 08:50:30,913][15401] Updated weights for policy 0, policy_version 626011 (0.0033) [2024-06-24 08:50:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10256646144. Throughput: 0: 42673.7. Samples: 10256820860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 08:50:35,360][15401] Updated weights for policy 0, policy_version 626021 (0.0041) [2024-06-24 08:50:38,389][15401] Updated weights for policy 0, policy_version 626031 (0.0034) [2024-06-24 08:50:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10256891904. Throughput: 0: 42795.1. Samples: 10256949200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 08:50:43,032][15401] Updated weights for policy 0, policy_version 626041 (0.0038) [2024-06-24 08:50:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 10257088512. Throughput: 0: 42924.5. Samples: 10257210200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:43,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 08:50:45,845][15401] Updated weights for policy 0, policy_version 626051 (0.0032) [2024-06-24 08:50:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10257301504. Throughput: 0: 42986.6. Samples: 10257468760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 08:50:50,498][15401] Updated weights for policy 0, policy_version 626061 (0.0032) [2024-06-24 08:50:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10257530880. Throughput: 0: 43203.7. Samples: 10257602020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 08:50:53,394][15401] Updated weights for policy 0, policy_version 626071 (0.0027) [2024-06-24 08:50:57,959][15401] Updated weights for policy 0, policy_version 626081 (0.0030) [2024-06-24 08:50:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10257711104. Throughput: 0: 43224.3. Samples: 10257861760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:50:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 08:51:01,238][15401] Updated weights for policy 0, policy_version 626091 (0.0043) [2024-06-24 08:51:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10257956864. Throughput: 0: 43166.4. Samples: 10258117600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:51:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 08:51:05,750][15401] Updated weights for policy 0, policy_version 626101 (0.0038) [2024-06-24 08:51:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 10258186240. Throughput: 0: 43322.7. Samples: 10258251080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:51:08,391][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 08:51:08,656][15401] Updated weights for policy 0, policy_version 626111 (0.0027) [2024-06-24 08:51:13,356][15401] Updated weights for policy 0, policy_version 626121 (0.0025) [2024-06-24 08:51:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10258366464. Throughput: 0: 43368.8. Samples: 10258512780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 08:51:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 08:51:15,994][15401] Updated weights for policy 0, policy_version 626131 (0.0028) [2024-06-24 08:51:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 10258612224. Throughput: 0: 43213.8. Samples: 10258765480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 08:51:18,960][15349] Signal inference workers to stop experience collection... (151900 times) [2024-06-24 08:51:19,003][15401] InferenceWorker_p0-w0: stopping experience collection (151900 times) [2024-06-24 08:51:19,075][15349] Signal inference workers to resume experience collection... (151900 times) [2024-06-24 08:51:19,076][15401] InferenceWorker_p0-w0: resuming experience collection (151900 times) [2024-06-24 08:51:20,753][15401] Updated weights for policy 0, policy_version 626141 (0.0024) [2024-06-24 08:51:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10258825216. Throughput: 0: 43301.0. Samples: 10258897740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 08:51:23,575][15401] Updated weights for policy 0, policy_version 626151 (0.0024) [2024-06-24 08:51:28,248][15401] Updated weights for policy 0, policy_version 626161 (0.0039) [2024-06-24 08:51:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10259021824. Throughput: 0: 43195.7. Samples: 10259154000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 08:51:31,797][15401] Updated weights for policy 0, policy_version 626171 (0.0033) [2024-06-24 08:51:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10259251200. Throughput: 0: 43187.1. Samples: 10259412180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 08:51:35,831][15401] Updated weights for policy 0, policy_version 626181 (0.0032) [2024-06-24 08:51:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10259464192. Throughput: 0: 43062.6. Samples: 10259539840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 08:51:39,401][15401] Updated weights for policy 0, policy_version 626191 (0.0038) [2024-06-24 08:51:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10259660800. Throughput: 0: 43025.3. Samples: 10259797900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:43,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-24 08:51:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626201_10259677184.pth... [2024-06-24 08:51:43,506][15401] Updated weights for policy 0, policy_version 626201 (0.0039) [2024-06-24 08:51:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625575_10249420800.pth [2024-06-24 08:51:46,898][15401] Updated weights for policy 0, policy_version 626211 (0.0024) [2024-06-24 08:51:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10259890176. Throughput: 0: 43044.4. Samples: 10260054600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:48,390][15132] Avg episode reward: [(0, '0.898')] [2024-06-24 08:51:50,993][15401] Updated weights for policy 0, policy_version 626221 (0.0029) [2024-06-24 08:51:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10260119552. Throughput: 0: 42959.6. Samples: 10260184260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:53,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 08:51:54,852][15401] Updated weights for policy 0, policy_version 626231 (0.0036) [2024-06-24 08:51:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 10260316160. Throughput: 0: 42895.0. Samples: 10260443160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:51:58,393][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 08:51:58,557][15401] Updated weights for policy 0, policy_version 626241 (0.0036) [2024-06-24 08:52:02,356][15401] Updated weights for policy 0, policy_version 626251 (0.0031) [2024-06-24 08:52:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10260529152. Throughput: 0: 43040.3. Samples: 10260702300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 08:52:06,101][15401] Updated weights for policy 0, policy_version 626261 (0.0035) [2024-06-24 08:52:08,390][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10260774912. Throughput: 0: 42895.9. Samples: 10260828060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:08,393][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 08:52:09,897][15401] Updated weights for policy 0, policy_version 626271 (0.0041) [2024-06-24 08:52:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10260955136. Throughput: 0: 42838.2. Samples: 10261081720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 08:52:14,416][15401] Updated weights for policy 0, policy_version 626281 (0.0026) [2024-06-24 08:52:17,654][15401] Updated weights for policy 0, policy_version 626291 (0.0039) [2024-06-24 08:52:18,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10261168128. Throughput: 0: 42803.6. Samples: 10261338440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:18,393][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 08:52:21,913][15401] Updated weights for policy 0, policy_version 626301 (0.0031) [2024-06-24 08:52:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10261397504. Throughput: 0: 42795.9. Samples: 10261465660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:23,391][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 08:52:25,281][15401] Updated weights for policy 0, policy_version 626311 (0.0048) [2024-06-24 08:52:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10261594112. Throughput: 0: 42681.9. Samples: 10261718580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 08:52:29,464][15401] Updated weights for policy 0, policy_version 626321 (0.0045) [2024-06-24 08:52:32,758][15401] Updated weights for policy 0, policy_version 626331 (0.0028) [2024-06-24 08:52:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10261807104. Throughput: 0: 42649.3. Samples: 10261973820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:33,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 08:52:36,944][15401] Updated weights for policy 0, policy_version 626341 (0.0035) [2024-06-24 08:52:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10262036480. Throughput: 0: 42740.8. Samples: 10262107600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:38,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 08:52:40,302][15401] Updated weights for policy 0, policy_version 626351 (0.0033) [2024-06-24 08:52:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10262216704. Throughput: 0: 42609.5. Samples: 10262360480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:43,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 08:52:44,704][15401] Updated weights for policy 0, policy_version 626361 (0.0046) [2024-06-24 08:52:47,913][15401] Updated weights for policy 0, policy_version 626371 (0.0027) [2024-06-24 08:52:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10262462464. Throughput: 0: 42453.1. Samples: 10262612680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 08:52:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 08:52:52,362][15401] Updated weights for policy 0, policy_version 626381 (0.0042) [2024-06-24 08:52:53,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10262691840. Throughput: 0: 42690.4. Samples: 10262749120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:52:53,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-24 08:52:55,584][15401] Updated weights for policy 0, policy_version 626391 (0.0042) [2024-06-24 08:52:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 10262872064. Throughput: 0: 42843.1. Samples: 10263009660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:52:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 08:52:59,851][15401] Updated weights for policy 0, policy_version 626401 (0.0033) [2024-06-24 08:53:03,354][15401] Updated weights for policy 0, policy_version 626411 (0.0029) [2024-06-24 08:53:03,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10263117824. Throughput: 0: 42644.8. Samples: 10263257360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 08:53:07,750][15401] Updated weights for policy 0, policy_version 626421 (0.0032) [2024-06-24 08:53:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10263314432. Throughput: 0: 42771.5. Samples: 10263390380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 08:53:08,528][15349] Signal inference workers to stop experience collection... (151950 times) [2024-06-24 08:53:08,536][15349] Signal inference workers to resume experience collection... (151950 times) [2024-06-24 08:53:08,574][15401] InferenceWorker_p0-w0: stopping experience collection (151950 times) [2024-06-24 08:53:08,574][15401] InferenceWorker_p0-w0: resuming experience collection (151950 times) [2024-06-24 08:53:11,382][15401] Updated weights for policy 0, policy_version 626431 (0.0033) [2024-06-24 08:53:13,396][15132] Fps is (10 sec: 39296.9, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 10263511040. Throughput: 0: 42740.5. Samples: 10263642180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:13,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 08:53:15,287][15401] Updated weights for policy 0, policy_version 626441 (0.0029) [2024-06-24 08:53:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 10263740416. Throughput: 0: 42764.5. Samples: 10263898220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:18,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 08:53:18,888][15401] Updated weights for policy 0, policy_version 626451 (0.0032) [2024-06-24 08:53:23,277][15401] Updated weights for policy 0, policy_version 626461 (0.0041) [2024-06-24 08:53:23,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10263937024. Throughput: 0: 42681.9. Samples: 10264028280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:23,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-24 08:53:26,508][15401] Updated weights for policy 0, policy_version 626471 (0.0040) [2024-06-24 08:53:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10264150016. Throughput: 0: 42681.3. Samples: 10264281140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:28,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-24 08:53:30,905][15401] Updated weights for policy 0, policy_version 626481 (0.0032) [2024-06-24 08:53:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10264379392. Throughput: 0: 42786.2. Samples: 10264538060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 08:53:34,348][15401] Updated weights for policy 0, policy_version 626491 (0.0037) [2024-06-24 08:53:38,381][15401] Updated weights for policy 0, policy_version 626501 (0.0028) [2024-06-24 08:53:38,390][15132] Fps is (10 sec: 44233.2, 60 sec: 42597.9, 300 sec: 42764.9). Total num frames: 10264592384. Throughput: 0: 42644.0. Samples: 10264668140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:38,391][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 08:53:42,506][15401] Updated weights for policy 0, policy_version 626511 (0.0029) [2024-06-24 08:53:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10264788992. Throughput: 0: 42437.7. Samples: 10264919360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 08:53:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626514_10264805376.pth... [2024-06-24 08:53:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000625886_10254516224.pth [2024-06-24 08:53:45,979][15401] Updated weights for policy 0, policy_version 626521 (0.0029) [2024-06-24 08:53:48,390][15132] Fps is (10 sec: 42601.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 10265018368. Throughput: 0: 42502.3. Samples: 10265169960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 08:53:50,167][15401] Updated weights for policy 0, policy_version 626531 (0.0038) [2024-06-24 08:53:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 10265214976. Throughput: 0: 42450.0. Samples: 10265300620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 08:53:53,537][15401] Updated weights for policy 0, policy_version 626541 (0.0026) [2024-06-24 08:53:57,625][15401] Updated weights for policy 0, policy_version 626551 (0.0037) [2024-06-24 08:53:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10265427968. Throughput: 0: 42660.8. Samples: 10265561640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:53:58,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 08:54:01,441][15401] Updated weights for policy 0, policy_version 626561 (0.0043) [2024-06-24 08:54:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10265673728. Throughput: 0: 42475.0. Samples: 10265809600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:54:03,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 08:54:05,831][15401] Updated weights for policy 0, policy_version 626571 (0.0031) [2024-06-24 08:54:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10265853952. Throughput: 0: 42650.2. Samples: 10265947540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:54:08,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 08:54:09,199][15401] Updated weights for policy 0, policy_version 626581 (0.0032) [2024-06-24 08:54:13,250][15401] Updated weights for policy 0, policy_version 626591 (0.0042) [2024-06-24 08:54:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 10266066944. Throughput: 0: 42669.4. Samples: 10266201260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:54:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 08:54:15,608][15349] Signal inference workers to stop experience collection... (152000 times) [2024-06-24 08:54:15,631][15401] InferenceWorker_p0-w0: stopping experience collection (152000 times) [2024-06-24 08:54:15,720][15349] Signal inference workers to resume experience collection... (152000 times) [2024-06-24 08:54:15,720][15401] InferenceWorker_p0-w0: resuming experience collection (152000 times) [2024-06-24 08:54:16,803][15401] Updated weights for policy 0, policy_version 626601 (0.0034) [2024-06-24 08:54:18,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10266329088. Throughput: 0: 42487.0. Samples: 10266449980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:54:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 08:54:20,756][15401] Updated weights for policy 0, policy_version 626611 (0.0038) [2024-06-24 08:54:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 10266492928. Throughput: 0: 42669.8. Samples: 10266588240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 08:54:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 08:54:24,293][15401] Updated weights for policy 0, policy_version 626621 (0.0043) [2024-06-24 08:54:28,306][15401] Updated weights for policy 0, policy_version 626631 (0.0029) [2024-06-24 08:54:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10266722304. Throughput: 0: 42645.8. Samples: 10266838420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:28,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 08:54:31,753][15401] Updated weights for policy 0, policy_version 626641 (0.0036) [2024-06-24 08:54:33,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10266951680. Throughput: 0: 42777.8. Samples: 10267094960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 08:54:35,855][15401] Updated weights for policy 0, policy_version 626651 (0.0048) [2024-06-24 08:54:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.9, 300 sec: 42820.5). Total num frames: 10267131904. Throughput: 0: 42756.3. Samples: 10267224660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 08:54:39,434][15401] Updated weights for policy 0, policy_version 626661 (0.0039) [2024-06-24 08:54:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10267361280. Throughput: 0: 42566.2. Samples: 10267477120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 08:54:43,641][15401] Updated weights for policy 0, policy_version 626671 (0.0033) [2024-06-24 08:54:47,170][15401] Updated weights for policy 0, policy_version 626681 (0.0030) [2024-06-24 08:54:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10267574272. Throughput: 0: 42751.2. Samples: 10267733400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 08:54:51,403][15401] Updated weights for policy 0, policy_version 626691 (0.0040) [2024-06-24 08:54:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10267770880. Throughput: 0: 42636.0. Samples: 10267866160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 08:54:54,793][15401] Updated weights for policy 0, policy_version 626701 (0.0034) [2024-06-24 08:54:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10268000256. Throughput: 0: 42588.0. Samples: 10268117720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:54:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 08:54:59,173][15401] Updated weights for policy 0, policy_version 626711 (0.0033) [2024-06-24 08:55:02,519][15401] Updated weights for policy 0, policy_version 626721 (0.0043) [2024-06-24 08:55:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 10268229632. Throughput: 0: 42720.1. Samples: 10268372380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 08:55:07,069][15401] Updated weights for policy 0, policy_version 626731 (0.0040) [2024-06-24 08:55:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10268409856. Throughput: 0: 42624.4. Samples: 10268506340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 08:55:10,300][15401] Updated weights for policy 0, policy_version 626741 (0.0048) [2024-06-24 08:55:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 10268622848. Throughput: 0: 42690.3. Samples: 10268759480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 08:55:14,674][15401] Updated weights for policy 0, policy_version 626751 (0.0051) [2024-06-24 08:55:18,209][15401] Updated weights for policy 0, policy_version 626761 (0.0036) [2024-06-24 08:55:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 10268852224. Throughput: 0: 42644.3. Samples: 10269013960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 08:55:22,196][15401] Updated weights for policy 0, policy_version 626771 (0.0044) [2024-06-24 08:55:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10269065216. Throughput: 0: 42719.5. Samples: 10269147040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 08:55:25,906][15401] Updated weights for policy 0, policy_version 626781 (0.0042) [2024-06-24 08:55:28,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 10269261824. Throughput: 0: 42696.4. Samples: 10269398560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:28,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 08:55:30,162][15401] Updated weights for policy 0, policy_version 626791 (0.0027) [2024-06-24 08:55:30,341][15349] Signal inference workers to stop experience collection... (152050 times) [2024-06-24 08:55:30,343][15349] Signal inference workers to resume experience collection... (152050 times) [2024-06-24 08:55:30,387][15401] InferenceWorker_p0-w0: stopping experience collection (152050 times) [2024-06-24 08:55:30,387][15401] InferenceWorker_p0-w0: resuming experience collection (152050 times) [2024-06-24 08:55:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10269491200. Throughput: 0: 42665.8. Samples: 10269653360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 08:55:33,579][15401] Updated weights for policy 0, policy_version 626801 (0.0029) [2024-06-24 08:55:37,739][15401] Updated weights for policy 0, policy_version 626811 (0.0037) [2024-06-24 08:55:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10269704192. Throughput: 0: 42700.8. Samples: 10269787700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 08:55:41,128][15401] Updated weights for policy 0, policy_version 626821 (0.0026) [2024-06-24 08:55:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10269917184. Throughput: 0: 42741.6. Samples: 10270041100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 08:55:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626826_10269917184.pth... [2024-06-24 08:55:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626201_10259677184.pth [2024-06-24 08:55:45,225][15401] Updated weights for policy 0, policy_version 626831 (0.0038) [2024-06-24 08:55:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10270146560. Throughput: 0: 42650.2. Samples: 10270291640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 08:55:48,506][15401] Updated weights for policy 0, policy_version 626841 (0.0041) [2024-06-24 08:55:53,056][15401] Updated weights for policy 0, policy_version 626851 (0.0030) [2024-06-24 08:55:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10270343168. Throughput: 0: 42764.9. Samples: 10270430760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 08:55:56,428][15401] Updated weights for policy 0, policy_version 626861 (0.0043) [2024-06-24 08:55:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10270572544. Throughput: 0: 42825.8. Samples: 10270686640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:55:58,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 08:56:00,567][15401] Updated weights for policy 0, policy_version 626871 (0.0033) [2024-06-24 08:56:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10270801920. Throughput: 0: 42957.4. Samples: 10270947040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 08:56:03,390][15132] Avg episode reward: [(0, '0.110')] [2024-06-24 08:56:03,864][15401] Updated weights for policy 0, policy_version 626881 (0.0025) [2024-06-24 08:56:08,076][15401] Updated weights for policy 0, policy_version 626891 (0.0037) [2024-06-24 08:56:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10270982144. Throughput: 0: 42856.2. Samples: 10271075560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 08:56:11,415][15401] Updated weights for policy 0, policy_version 626901 (0.0032) [2024-06-24 08:56:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10271211520. Throughput: 0: 43002.2. Samples: 10271333560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:13,395][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 08:56:15,868][15401] Updated weights for policy 0, policy_version 626911 (0.0049) [2024-06-24 08:56:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10271424512. Throughput: 0: 42907.5. Samples: 10271584200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 08:56:18,926][15401] Updated weights for policy 0, policy_version 626921 (0.0028) [2024-06-24 08:56:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 10271621120. Throughput: 0: 42953.7. Samples: 10271720620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 08:56:23,549][15401] Updated weights for policy 0, policy_version 626931 (0.0046) [2024-06-24 08:56:26,578][15401] Updated weights for policy 0, policy_version 626941 (0.0041) [2024-06-24 08:56:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10271850496. Throughput: 0: 42795.3. Samples: 10271966880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 08:56:31,134][15401] Updated weights for policy 0, policy_version 626951 (0.0037) [2024-06-24 08:56:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10272063488. Throughput: 0: 43113.8. Samples: 10272231760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:33,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 08:56:34,176][15401] Updated weights for policy 0, policy_version 626961 (0.0043) [2024-06-24 08:56:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10272276480. Throughput: 0: 42880.8. Samples: 10272360400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 08:56:38,737][15401] Updated weights for policy 0, policy_version 626971 (0.0040) [2024-06-24 08:56:41,833][15401] Updated weights for policy 0, policy_version 626981 (0.0025) [2024-06-24 08:56:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10272489472. Throughput: 0: 42713.7. Samples: 10272608760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 08:56:46,653][15401] Updated weights for policy 0, policy_version 626991 (0.0030) [2024-06-24 08:56:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10272702464. Throughput: 0: 42647.1. Samples: 10272866160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 08:56:49,158][15349] Signal inference workers to stop experience collection... (152100 times) [2024-06-24 08:56:49,158][15349] Signal inference workers to resume experience collection... (152100 times) [2024-06-24 08:56:49,200][15401] InferenceWorker_p0-w0: stopping experience collection (152100 times) [2024-06-24 08:56:49,200][15401] InferenceWorker_p0-w0: resuming experience collection (152100 times) [2024-06-24 08:56:49,651][15401] Updated weights for policy 0, policy_version 627001 (0.0039) [2024-06-24 08:56:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 10272899072. Throughput: 0: 42564.3. Samples: 10272990960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 08:56:54,467][15401] Updated weights for policy 0, policy_version 627011 (0.0032) [2024-06-24 08:56:58,058][15401] Updated weights for policy 0, policy_version 627021 (0.0034) [2024-06-24 08:56:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10273112064. Throughput: 0: 42421.5. Samples: 10273242520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:56:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 08:57:02,229][15401] Updated weights for policy 0, policy_version 627031 (0.0044) [2024-06-24 08:57:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10273341440. Throughput: 0: 42506.6. Samples: 10273497000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:03,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 08:57:05,598][15401] Updated weights for policy 0, policy_version 627041 (0.0035) [2024-06-24 08:57:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10273554432. Throughput: 0: 42450.5. Samples: 10273630880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 08:57:09,642][15401] Updated weights for policy 0, policy_version 627051 (0.0031) [2024-06-24 08:57:13,333][15401] Updated weights for policy 0, policy_version 627061 (0.0035) [2024-06-24 08:57:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 10273767424. Throughput: 0: 42568.3. Samples: 10273882460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 08:57:17,366][15401] Updated weights for policy 0, policy_version 627071 (0.0034) [2024-06-24 08:57:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10273996800. Throughput: 0: 42422.1. Samples: 10274140760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 08:57:20,957][15401] Updated weights for policy 0, policy_version 627081 (0.0035) [2024-06-24 08:57:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10274177024. Throughput: 0: 42424.0. Samples: 10274269480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 08:57:24,839][15401] Updated weights for policy 0, policy_version 627091 (0.0032) [2024-06-24 08:57:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10274406400. Throughput: 0: 42629.3. Samples: 10274527080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 08:57:28,613][15401] Updated weights for policy 0, policy_version 627101 (0.0028) [2024-06-24 08:57:32,367][15401] Updated weights for policy 0, policy_version 627111 (0.0029) [2024-06-24 08:57:33,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10274635776. Throughput: 0: 42679.9. Samples: 10274786860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:33,393][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 08:57:36,334][15401] Updated weights for policy 0, policy_version 627121 (0.0030) [2024-06-24 08:57:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10274832384. Throughput: 0: 42830.7. Samples: 10274918340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 08:57:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 08:57:39,991][15401] Updated weights for policy 0, policy_version 627131 (0.0036) [2024-06-24 08:57:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10275061760. Throughput: 0: 42898.7. Samples: 10275172960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:57:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 08:57:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627140_10275061760.pth... [2024-06-24 08:57:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626514_10264805376.pth [2024-06-24 08:57:43,929][15401] Updated weights for policy 0, policy_version 627141 (0.0034) [2024-06-24 08:57:47,374][15401] Updated weights for policy 0, policy_version 627151 (0.0033) [2024-06-24 08:57:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10275274752. Throughput: 0: 42985.9. Samples: 10275431360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:57:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 08:57:51,852][15401] Updated weights for policy 0, policy_version 627161 (0.0039) [2024-06-24 08:57:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10275487744. Throughput: 0: 42872.3. Samples: 10275560140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:57:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 08:57:54,824][15401] Updated weights for policy 0, policy_version 627171 (0.0041) [2024-06-24 08:57:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10275717120. Throughput: 0: 42991.2. Samples: 10275817060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:57:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 08:57:59,451][15401] Updated weights for policy 0, policy_version 627181 (0.0029) [2024-06-24 08:58:02,583][15401] Updated weights for policy 0, policy_version 627191 (0.0036) [2024-06-24 08:58:03,121][15349] Signal inference workers to stop experience collection... (152150 times) [2024-06-24 08:58:03,176][15401] InferenceWorker_p0-w0: stopping experience collection (152150 times) [2024-06-24 08:58:03,184][15349] Signal inference workers to resume experience collection... (152150 times) [2024-06-24 08:58:03,189][15401] InferenceWorker_p0-w0: resuming experience collection (152150 times) [2024-06-24 08:58:03,394][15132] Fps is (10 sec: 44218.6, 60 sec: 43141.6, 300 sec: 42764.4). Total num frames: 10275930112. Throughput: 0: 43044.6. Samples: 10276077940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:03,394][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 08:58:07,165][15401] Updated weights for policy 0, policy_version 627201 (0.0034) [2024-06-24 08:58:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 10276110336. Throughput: 0: 42922.8. Samples: 10276201000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 08:58:10,326][15401] Updated weights for policy 0, policy_version 627211 (0.0031) [2024-06-24 08:58:13,390][15132] Fps is (10 sec: 40976.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10276339712. Throughput: 0: 43025.7. Samples: 10276463240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 08:58:14,704][15401] Updated weights for policy 0, policy_version 627221 (0.0042) [2024-06-24 08:58:18,223][15401] Updated weights for policy 0, policy_version 627231 (0.0029) [2024-06-24 08:58:18,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10276552704. Throughput: 0: 42871.1. Samples: 10276715960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 08:58:22,441][15401] Updated weights for policy 0, policy_version 627241 (0.0039) [2024-06-24 08:58:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10276749312. Throughput: 0: 42819.1. Samples: 10276845200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:23,394][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 08:58:25,852][15401] Updated weights for policy 0, policy_version 627251 (0.0042) [2024-06-24 08:58:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10276978688. Throughput: 0: 42737.3. Samples: 10277096140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 08:58:29,923][15401] Updated weights for policy 0, policy_version 627261 (0.0027) [2024-06-24 08:58:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42709.6). Total num frames: 10277191680. Throughput: 0: 42615.9. Samples: 10277349080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 08:58:33,687][15401] Updated weights for policy 0, policy_version 627271 (0.0028) [2024-06-24 08:58:37,749][15401] Updated weights for policy 0, policy_version 627281 (0.0041) [2024-06-24 08:58:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10277371904. Throughput: 0: 42646.7. Samples: 10277479240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 08:58:41,235][15401] Updated weights for policy 0, policy_version 627291 (0.0030) [2024-06-24 08:58:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10277601280. Throughput: 0: 42669.8. Samples: 10277737200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 08:58:45,599][15401] Updated weights for policy 0, policy_version 627301 (0.0040) [2024-06-24 08:58:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10277847040. Throughput: 0: 42517.2. Samples: 10277991040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 08:58:48,819][15401] Updated weights for policy 0, policy_version 627311 (0.0044) [2024-06-24 08:58:53,138][15401] Updated weights for policy 0, policy_version 627321 (0.0037) [2024-06-24 08:58:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10278043648. Throughput: 0: 42727.9. Samples: 10278123760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 08:58:56,413][15401] Updated weights for policy 0, policy_version 627331 (0.0037) [2024-06-24 08:58:58,391][15132] Fps is (10 sec: 40954.0, 60 sec: 42324.3, 300 sec: 42653.7). Total num frames: 10278256640. Throughput: 0: 42600.4. Samples: 10278380320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:58:58,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 08:59:00,724][15401] Updated weights for policy 0, policy_version 627341 (0.0029) [2024-06-24 08:59:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42328.3, 300 sec: 42765.0). Total num frames: 10278469632. Throughput: 0: 42790.8. Samples: 10278641540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:59:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 08:59:03,939][15401] Updated weights for policy 0, policy_version 627351 (0.0035) [2024-06-24 08:59:08,346][15401] Updated weights for policy 0, policy_version 627361 (0.0036) [2024-06-24 08:59:08,396][15132] Fps is (10 sec: 42577.7, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 10278682624. Throughput: 0: 42803.3. Samples: 10278771620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:59:08,396][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 08:59:11,359][15401] Updated weights for policy 0, policy_version 627371 (0.0037) [2024-06-24 08:59:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10278912000. Throughput: 0: 42800.3. Samples: 10279022160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 08:59:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 08:59:16,158][15401] Updated weights for policy 0, policy_version 627381 (0.0030) [2024-06-24 08:59:18,389][15132] Fps is (10 sec: 45904.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 10279141376. Throughput: 0: 42885.0. Samples: 10279278900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 08:59:19,199][15401] Updated weights for policy 0, policy_version 627391 (0.0023) [2024-06-24 08:59:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10279305216. Throughput: 0: 42937.7. Samples: 10279411440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 08:59:23,771][15401] Updated weights for policy 0, policy_version 627401 (0.0038) [2024-06-24 08:59:24,409][15349] Signal inference workers to stop experience collection... (152200 times) [2024-06-24 08:59:24,445][15401] InferenceWorker_p0-w0: stopping experience collection (152200 times) [2024-06-24 08:59:24,458][15349] Signal inference workers to resume experience collection... (152200 times) [2024-06-24 08:59:24,471][15401] InferenceWorker_p0-w0: resuming experience collection (152200 times) [2024-06-24 08:59:26,736][15401] Updated weights for policy 0, policy_version 627411 (0.0029) [2024-06-24 08:59:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10279550976. Throughput: 0: 42860.5. Samples: 10279665920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 08:59:31,400][15401] Updated weights for policy 0, policy_version 627421 (0.0038) [2024-06-24 08:59:33,394][15132] Fps is (10 sec: 47493.2, 60 sec: 43141.4, 300 sec: 42875.5). Total num frames: 10279780352. Throughput: 0: 42804.4. Samples: 10279917420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:33,394][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 08:59:34,171][15401] Updated weights for policy 0, policy_version 627431 (0.0037) [2024-06-24 08:59:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10279944192. Throughput: 0: 42883.1. Samples: 10280053500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:38,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 08:59:39,008][15401] Updated weights for policy 0, policy_version 627441 (0.0051) [2024-06-24 08:59:41,905][15401] Updated weights for policy 0, policy_version 627451 (0.0037) [2024-06-24 08:59:43,389][15132] Fps is (10 sec: 40978.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10280189952. Throughput: 0: 42815.3. Samples: 10280306940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 08:59:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627454_10280206336.pth... [2024-06-24 08:59:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000626826_10269917184.pth [2024-06-24 08:59:46,698][15401] Updated weights for policy 0, policy_version 627461 (0.0031) [2024-06-24 08:59:48,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 10280419328. Throughput: 0: 42633.7. Samples: 10280560160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:48,392][15132] Avg episode reward: [(0, '0.240')] [2024-06-24 08:59:49,598][15401] Updated weights for policy 0, policy_version 627471 (0.0023) [2024-06-24 08:59:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10280599552. Throughput: 0: 42642.4. Samples: 10280690260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 08:59:54,367][15401] Updated weights for policy 0, policy_version 627481 (0.0036) [2024-06-24 08:59:57,534][15401] Updated weights for policy 0, policy_version 627491 (0.0038) [2024-06-24 08:59:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42872.6, 300 sec: 42709.5). Total num frames: 10280828928. Throughput: 0: 42717.6. Samples: 10280944440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 08:59:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 09:00:01,890][15401] Updated weights for policy 0, policy_version 627501 (0.0034) [2024-06-24 09:00:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10281058304. Throughput: 0: 42679.5. Samples: 10281199480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:03,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 09:00:05,441][15401] Updated weights for policy 0, policy_version 627511 (0.0041) [2024-06-24 09:00:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42056.7, 300 sec: 42653.9). Total num frames: 10281205760. Throughput: 0: 42587.1. Samples: 10281327860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:08,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 09:00:09,445][15401] Updated weights for policy 0, policy_version 627521 (0.0039) [2024-06-24 09:00:13,204][15401] Updated weights for policy 0, policy_version 627531 (0.0047) [2024-06-24 09:00:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10281467904. Throughput: 0: 42556.4. Samples: 10281580960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 09:00:17,035][15401] Updated weights for policy 0, policy_version 627541 (0.0032) [2024-06-24 09:00:18,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10281680896. Throughput: 0: 42833.1. Samples: 10281844720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:18,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 09:00:20,857][15401] Updated weights for policy 0, policy_version 627551 (0.0040) [2024-06-24 09:00:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10281861120. Throughput: 0: 42540.8. Samples: 10281967840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:23,390][15132] Avg episode reward: [(0, '0.127')] [2024-06-24 09:00:24,890][15401] Updated weights for policy 0, policy_version 627561 (0.0034) [2024-06-24 09:00:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 10282106880. Throughput: 0: 42492.7. Samples: 10282219120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 09:00:28,522][15401] Updated weights for policy 0, policy_version 627571 (0.0034) [2024-06-24 09:00:31,475][15349] Signal inference workers to stop experience collection... (152250 times) [2024-06-24 09:00:31,528][15401] InferenceWorker_p0-w0: stopping experience collection (152250 times) [2024-06-24 09:00:31,597][15349] Signal inference workers to resume experience collection... (152250 times) [2024-06-24 09:00:31,597][15401] InferenceWorker_p0-w0: resuming experience collection (152250 times) [2024-06-24 09:00:32,933][15401] Updated weights for policy 0, policy_version 627581 (0.0037) [2024-06-24 09:00:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42328.3, 300 sec: 42765.0). Total num frames: 10282319872. Throughput: 0: 42615.5. Samples: 10282477760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 09:00:36,317][15401] Updated weights for policy 0, policy_version 627591 (0.0028) [2024-06-24 09:00:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10282516480. Throughput: 0: 42542.3. Samples: 10282604660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 09:00:40,674][15401] Updated weights for policy 0, policy_version 627601 (0.0033) [2024-06-24 09:00:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10282762240. Throughput: 0: 42495.5. Samples: 10282856740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 09:00:43,946][15401] Updated weights for policy 0, policy_version 627611 (0.0031) [2024-06-24 09:00:48,299][15401] Updated weights for policy 0, policy_version 627621 (0.0027) [2024-06-24 09:00:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 10282942464. Throughput: 0: 42779.1. Samples: 10283124540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 09:00:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 09:00:51,556][15401] Updated weights for policy 0, policy_version 627631 (0.0035) [2024-06-24 09:00:53,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10283139072. Throughput: 0: 42570.6. Samples: 10283243540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:00:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 09:00:55,883][15401] Updated weights for policy 0, policy_version 627641 (0.0033) [2024-06-24 09:00:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10283417600. Throughput: 0: 42605.3. Samples: 10283498200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:00:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 09:00:59,098][15401] Updated weights for policy 0, policy_version 627651 (0.0028) [2024-06-24 09:01:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 10283581440. Throughput: 0: 42629.6. Samples: 10283763060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 09:01:03,544][15401] Updated weights for policy 0, policy_version 627661 (0.0042) [2024-06-24 09:01:06,867][15401] Updated weights for policy 0, policy_version 627671 (0.0039) [2024-06-24 09:01:08,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10283778048. Throughput: 0: 42439.3. Samples: 10283877600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 09:01:11,296][15401] Updated weights for policy 0, policy_version 627681 (0.0035) [2024-06-24 09:01:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10284040192. Throughput: 0: 42589.8. Samples: 10284135660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 09:01:14,752][15401] Updated weights for policy 0, policy_version 627691 (0.0046) [2024-06-24 09:01:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10284220416. Throughput: 0: 42819.2. Samples: 10284404620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 09:01:18,812][15401] Updated weights for policy 0, policy_version 627701 (0.0024) [2024-06-24 09:01:22,726][15401] Updated weights for policy 0, policy_version 627711 (0.0041) [2024-06-24 09:01:23,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10284433408. Throughput: 0: 42485.3. Samples: 10284516500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 09:01:26,196][15401] Updated weights for policy 0, policy_version 627721 (0.0032) [2024-06-24 09:01:28,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 10284695552. Throughput: 0: 42794.2. Samples: 10284782480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 09:01:30,155][15401] Updated weights for policy 0, policy_version 627731 (0.0032) [2024-06-24 09:01:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10284859392. Throughput: 0: 42668.7. Samples: 10285044640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 09:01:33,782][15401] Updated weights for policy 0, policy_version 627741 (0.0036) [2024-06-24 09:01:35,275][15349] Signal inference workers to stop experience collection... (152300 times) [2024-06-24 09:01:35,276][15349] Signal inference workers to resume experience collection... (152300 times) [2024-06-24 09:01:35,324][15401] InferenceWorker_p0-w0: stopping experience collection (152300 times) [2024-06-24 09:01:35,324][15401] InferenceWorker_p0-w0: resuming experience collection (152300 times) [2024-06-24 09:01:37,789][15401] Updated weights for policy 0, policy_version 627751 (0.0047) [2024-06-24 09:01:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10285088768. Throughput: 0: 42595.2. Samples: 10285160320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 09:01:41,512][15401] Updated weights for policy 0, policy_version 627761 (0.0030) [2024-06-24 09:01:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10285318144. Throughput: 0: 42786.5. Samples: 10285423600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 09:01:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627767_10285334528.pth... [2024-06-24 09:01:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627140_10275061760.pth [2024-06-24 09:01:45,615][15401] Updated weights for policy 0, policy_version 627771 (0.0041) [2024-06-24 09:01:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10285498368. Throughput: 0: 42658.0. Samples: 10285682660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 09:01:49,226][15401] Updated weights for policy 0, policy_version 627781 (0.0033) [2024-06-24 09:01:53,232][15401] Updated weights for policy 0, policy_version 627791 (0.0030) [2024-06-24 09:01:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10285727744. Throughput: 0: 42751.4. Samples: 10285801420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:53,393][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 09:01:56,746][15401] Updated weights for policy 0, policy_version 627801 (0.0036) [2024-06-24 09:01:58,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10285973504. Throughput: 0: 42800.2. Samples: 10286061660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:01:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 09:02:00,696][15401] Updated weights for policy 0, policy_version 627811 (0.0036) [2024-06-24 09:02:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10286137344. Throughput: 0: 42561.8. Samples: 10286319900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:02:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 09:02:04,476][15401] Updated weights for policy 0, policy_version 627821 (0.0039) [2024-06-24 09:02:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10286366720. Throughput: 0: 42859.6. Samples: 10286445180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:02:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 09:02:08,662][15401] Updated weights for policy 0, policy_version 627831 (0.0031) [2024-06-24 09:02:12,185][15401] Updated weights for policy 0, policy_version 627841 (0.0049) [2024-06-24 09:02:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10286612480. Throughput: 0: 42764.1. Samples: 10286706860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:02:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 09:02:16,395][15401] Updated weights for policy 0, policy_version 627851 (0.0028) [2024-06-24 09:02:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10286792704. Throughput: 0: 42699.2. Samples: 10286966100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:02:18,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 09:02:19,768][15401] Updated weights for policy 0, policy_version 627861 (0.0038) [2024-06-24 09:02:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10287005696. Throughput: 0: 42843.1. Samples: 10287088260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-24 09:02:23,390][15132] Avg episode reward: [(0, '0.111')] [2024-06-24 09:02:24,187][15401] Updated weights for policy 0, policy_version 627871 (0.0034) [2024-06-24 09:02:27,536][15401] Updated weights for policy 0, policy_version 627881 (0.0042) [2024-06-24 09:02:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 10287235072. Throughput: 0: 42816.2. Samples: 10287350320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 09:02:31,778][15401] Updated weights for policy 0, policy_version 627891 (0.0029) [2024-06-24 09:02:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10287431680. Throughput: 0: 42796.8. Samples: 10287608520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:33,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-24 09:02:35,261][15401] Updated weights for policy 0, policy_version 627901 (0.0037) [2024-06-24 09:02:38,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10287644672. Throughput: 0: 42927.0. Samples: 10287733140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 09:02:39,166][15401] Updated weights for policy 0, policy_version 627911 (0.0036) [2024-06-24 09:02:42,774][15401] Updated weights for policy 0, policy_version 627921 (0.0032) [2024-06-24 09:02:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 10287874048. Throughput: 0: 42859.9. Samples: 10287990360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 09:02:46,851][15401] Updated weights for policy 0, policy_version 627931 (0.0026) [2024-06-24 09:02:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10288070656. Throughput: 0: 42837.8. Samples: 10288247600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:48,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 09:02:50,374][15401] Updated weights for policy 0, policy_version 627941 (0.0059) [2024-06-24 09:02:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10288283648. Throughput: 0: 42822.7. Samples: 10288372200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 09:02:54,386][15401] Updated weights for policy 0, policy_version 627951 (0.0037) [2024-06-24 09:02:58,002][15401] Updated weights for policy 0, policy_version 627961 (0.0025) [2024-06-24 09:02:58,391][15132] Fps is (10 sec: 44229.2, 60 sec: 42324.2, 300 sec: 42654.3). Total num frames: 10288513024. Throughput: 0: 42706.8. Samples: 10288628740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:02:58,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 09:03:02,154][15401] Updated weights for policy 0, policy_version 627971 (0.0038) [2024-06-24 09:03:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10288726016. Throughput: 0: 42677.4. Samples: 10288886580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:03,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 09:03:05,605][15401] Updated weights for policy 0, policy_version 627981 (0.0034) [2024-06-24 09:03:08,389][15132] Fps is (10 sec: 42605.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10288939008. Throughput: 0: 42715.2. Samples: 10289010440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 09:03:09,697][15401] Updated weights for policy 0, policy_version 627991 (0.0037) [2024-06-24 09:03:13,336][15401] Updated weights for policy 0, policy_version 628001 (0.0026) [2024-06-24 09:03:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 10289168384. Throughput: 0: 42699.9. Samples: 10289271920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:13,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 09:03:17,697][15401] Updated weights for policy 0, policy_version 628011 (0.0043) [2024-06-24 09:03:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10289364992. Throughput: 0: 42680.8. Samples: 10289529160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:18,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 09:03:21,173][15401] Updated weights for policy 0, policy_version 628021 (0.0032) [2024-06-24 09:03:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10289577984. Throughput: 0: 42659.7. Samples: 10289652820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:23,396][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 09:03:25,396][15401] Updated weights for policy 0, policy_version 628031 (0.0029) [2024-06-24 09:03:26,971][15349] Signal inference workers to stop experience collection... (152350 times) [2024-06-24 09:03:26,972][15349] Signal inference workers to resume experience collection... (152350 times) [2024-06-24 09:03:26,982][15401] InferenceWorker_p0-w0: stopping experience collection (152350 times) [2024-06-24 09:03:26,983][15401] InferenceWorker_p0-w0: resuming experience collection (152350 times) [2024-06-24 09:03:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10289807360. Throughput: 0: 42620.4. Samples: 10289908280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 09:03:28,557][15401] Updated weights for policy 0, policy_version 628041 (0.0034) [2024-06-24 09:03:33,115][15401] Updated weights for policy 0, policy_version 628051 (0.0032) [2024-06-24 09:03:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10290003968. Throughput: 0: 42770.2. Samples: 10290172260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 09:03:36,077][15401] Updated weights for policy 0, policy_version 628061 (0.0046) [2024-06-24 09:03:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10290216960. Throughput: 0: 42685.3. Samples: 10290293040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 09:03:40,609][15401] Updated weights for policy 0, policy_version 628071 (0.0045) [2024-06-24 09:03:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10290446336. Throughput: 0: 42689.5. Samples: 10290549700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 09:03:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628079_10290446336.pth... [2024-06-24 09:03:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627454_10280206336.pth [2024-06-24 09:03:44,285][15401] Updated weights for policy 0, policy_version 628081 (0.0036) [2024-06-24 09:03:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10290626560. Throughput: 0: 42681.4. Samples: 10290807240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:03:48,407][15401] Updated weights for policy 0, policy_version 628091 (0.0032) [2024-06-24 09:03:51,777][15401] Updated weights for policy 0, policy_version 628101 (0.0025) [2024-06-24 09:03:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 10290855936. Throughput: 0: 42695.9. Samples: 10290931760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 09:03:55,884][15401] Updated weights for policy 0, policy_version 628111 (0.0031) [2024-06-24 09:03:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42872.7, 300 sec: 42765.0). Total num frames: 10291085312. Throughput: 0: 42783.6. Samples: 10291197080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 09:03:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 09:03:59,393][15401] Updated weights for policy 0, policy_version 628121 (0.0034) [2024-06-24 09:04:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 10291281920. Throughput: 0: 42735.5. Samples: 10291452260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:03,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 09:04:03,403][15401] Updated weights for policy 0, policy_version 628131 (0.0047) [2024-06-24 09:04:07,635][15401] Updated weights for policy 0, policy_version 628141 (0.0046) [2024-06-24 09:04:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10291494912. Throughput: 0: 42748.0. Samples: 10291576480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 09:04:11,318][15401] Updated weights for policy 0, policy_version 628151 (0.0032) [2024-06-24 09:04:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 10291724288. Throughput: 0: 42884.6. Samples: 10291838080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 09:04:15,422][15401] Updated weights for policy 0, policy_version 628161 (0.0035) [2024-06-24 09:04:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10291937280. Throughput: 0: 42602.6. Samples: 10292089380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 09:04:18,860][15401] Updated weights for policy 0, policy_version 628171 (0.0026) [2024-06-24 09:04:23,057][15401] Updated weights for policy 0, policy_version 628181 (0.0031) [2024-06-24 09:04:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10292133888. Throughput: 0: 42799.2. Samples: 10292219000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:23,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 09:04:26,406][15401] Updated weights for policy 0, policy_version 628191 (0.0035) [2024-06-24 09:04:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42710.1). Total num frames: 10292379648. Throughput: 0: 42989.0. Samples: 10292484200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 09:04:30,543][15401] Updated weights for policy 0, policy_version 628201 (0.0042) [2024-06-24 09:04:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10292576256. Throughput: 0: 43015.1. Samples: 10292742920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:33,394][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 09:04:33,907][15401] Updated weights for policy 0, policy_version 628211 (0.0039) [2024-06-24 09:04:38,038][15401] Updated weights for policy 0, policy_version 628221 (0.0033) [2024-06-24 09:04:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10292772864. Throughput: 0: 42985.0. Samples: 10292866080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 09:04:41,886][15401] Updated weights for policy 0, policy_version 628231 (0.0040) [2024-06-24 09:04:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10293018624. Throughput: 0: 42939.0. Samples: 10293129340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 09:04:45,573][15401] Updated weights for policy 0, policy_version 628241 (0.0035) [2024-06-24 09:04:47,021][15349] Signal inference workers to stop experience collection... (152400 times) [2024-06-24 09:04:47,083][15401] InferenceWorker_p0-w0: stopping experience collection (152400 times) [2024-06-24 09:04:47,140][15349] Signal inference workers to resume experience collection... (152400 times) [2024-06-24 09:04:47,141][15401] InferenceWorker_p0-w0: resuming experience collection (152400 times) [2024-06-24 09:04:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10293215232. Throughput: 0: 42849.5. Samples: 10293380480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 09:04:49,477][15401] Updated weights for policy 0, policy_version 628251 (0.0026) [2024-06-24 09:04:53,149][15401] Updated weights for policy 0, policy_version 628261 (0.0030) [2024-06-24 09:04:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10293428224. Throughput: 0: 42828.4. Samples: 10293503760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 09:04:57,016][15401] Updated weights for policy 0, policy_version 628271 (0.0036) [2024-06-24 09:04:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10293657600. Throughput: 0: 42850.2. Samples: 10293766340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:04:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 09:05:00,729][15401] Updated weights for policy 0, policy_version 628281 (0.0042) [2024-06-24 09:05:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10293837824. Throughput: 0: 43047.2. Samples: 10294026500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 09:05:04,497][15401] Updated weights for policy 0, policy_version 628291 (0.0023) [2024-06-24 09:05:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10294067200. Throughput: 0: 42825.4. Samples: 10294146140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 09:05:08,495][15401] Updated weights for policy 0, policy_version 628301 (0.0028) [2024-06-24 09:05:12,237][15401] Updated weights for policy 0, policy_version 628311 (0.0037) [2024-06-24 09:05:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10294296576. Throughput: 0: 42804.1. Samples: 10294410380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 09:05:16,046][15401] Updated weights for policy 0, policy_version 628321 (0.0030) [2024-06-24 09:05:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10294493184. Throughput: 0: 42819.6. Samples: 10294669800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 09:05:20,360][15401] Updated weights for policy 0, policy_version 628331 (0.0038) [2024-06-24 09:05:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10294722560. Throughput: 0: 42734.9. Samples: 10294789160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 09:05:23,716][15401] Updated weights for policy 0, policy_version 628341 (0.0039) [2024-06-24 09:05:28,145][15401] Updated weights for policy 0, policy_version 628351 (0.0047) [2024-06-24 09:05:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10294919168. Throughput: 0: 42647.2. Samples: 10295048460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 09:05:31,331][15401] Updated weights for policy 0, policy_version 628361 (0.0032) [2024-06-24 09:05:33,393][15132] Fps is (10 sec: 40944.7, 60 sec: 42595.6, 300 sec: 42764.5). Total num frames: 10295132160. Throughput: 0: 42661.5. Samples: 10295300420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-24 09:05:33,394][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 09:05:35,713][15401] Updated weights for policy 0, policy_version 628371 (0.0050) [2024-06-24 09:05:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10295361536. Throughput: 0: 42760.4. Samples: 10295427980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:05:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 09:05:39,147][15401] Updated weights for policy 0, policy_version 628381 (0.0030) [2024-06-24 09:05:43,147][15401] Updated weights for policy 0, policy_version 628391 (0.0030) [2024-06-24 09:05:43,390][15132] Fps is (10 sec: 44253.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10295574528. Throughput: 0: 42791.7. Samples: 10295691980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:05:43,396][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:05:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628392_10295574528.pth... [2024-06-24 09:05:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000627767_10285334528.pth [2024-06-24 09:05:46,837][15401] Updated weights for policy 0, policy_version 628401 (0.0043) [2024-06-24 09:05:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10295787520. Throughput: 0: 42695.1. Samples: 10295947780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:05:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 09:05:49,456][15349] Signal inference workers to stop experience collection... (152450 times) [2024-06-24 09:05:49,456][15349] Signal inference workers to resume experience collection... (152450 times) [2024-06-24 09:05:49,477][15401] InferenceWorker_p0-w0: stopping experience collection (152450 times) [2024-06-24 09:05:49,478][15401] InferenceWorker_p0-w0: resuming experience collection (152450 times) [2024-06-24 09:05:50,769][15401] Updated weights for policy 0, policy_version 628411 (0.0036) [2024-06-24 09:05:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10296000512. Throughput: 0: 42822.6. Samples: 10296073160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:05:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 09:05:54,582][15401] Updated weights for policy 0, policy_version 628421 (0.0031) [2024-06-24 09:05:58,309][15401] Updated weights for policy 0, policy_version 628431 (0.0034) [2024-06-24 09:05:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 10296213504. Throughput: 0: 42722.2. Samples: 10296332980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:05:58,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 09:06:02,169][15401] Updated weights for policy 0, policy_version 628441 (0.0034) [2024-06-24 09:06:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10296410112. Throughput: 0: 42713.3. Samples: 10296591900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 09:06:06,081][15401] Updated weights for policy 0, policy_version 628451 (0.0042) [2024-06-24 09:06:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10296655872. Throughput: 0: 42850.5. Samples: 10296717420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 09:06:09,772][15401] Updated weights for policy 0, policy_version 628461 (0.0040) [2024-06-24 09:06:13,391][15132] Fps is (10 sec: 44230.2, 60 sec: 42597.4, 300 sec: 42820.3). Total num frames: 10296852480. Throughput: 0: 42812.9. Samples: 10296975100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:13,391][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 09:06:13,650][15401] Updated weights for policy 0, policy_version 628471 (0.0040) [2024-06-24 09:06:17,281][15401] Updated weights for policy 0, policy_version 628481 (0.0032) [2024-06-24 09:06:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10297065472. Throughput: 0: 42815.7. Samples: 10297226960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 09:06:21,500][15401] Updated weights for policy 0, policy_version 628491 (0.0032) [2024-06-24 09:06:23,390][15132] Fps is (10 sec: 44242.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10297294848. Throughput: 0: 42977.3. Samples: 10297361960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 09:06:25,154][15401] Updated weights for policy 0, policy_version 628501 (0.0048) [2024-06-24 09:06:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 10297475072. Throughput: 0: 42757.2. Samples: 10297616040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 09:06:29,102][15401] Updated weights for policy 0, policy_version 628511 (0.0032) [2024-06-24 09:06:32,961][15401] Updated weights for policy 0, policy_version 628521 (0.0032) [2024-06-24 09:06:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42874.3, 300 sec: 42765.0). Total num frames: 10297704448. Throughput: 0: 42699.2. Samples: 10297869240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 09:06:36,791][15401] Updated weights for policy 0, policy_version 628531 (0.0028) [2024-06-24 09:06:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10297933824. Throughput: 0: 42831.0. Samples: 10298000560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 09:06:40,585][15401] Updated weights for policy 0, policy_version 628541 (0.0044) [2024-06-24 09:06:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10298114048. Throughput: 0: 42631.9. Samples: 10298251320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 09:06:44,333][15401] Updated weights for policy 0, policy_version 628551 (0.0027) [2024-06-24 09:06:48,338][15401] Updated weights for policy 0, policy_version 628561 (0.0037) [2024-06-24 09:06:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10298343424. Throughput: 0: 42538.1. Samples: 10298506120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:48,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 09:06:52,013][15401] Updated weights for policy 0, policy_version 628571 (0.0030) [2024-06-24 09:06:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 10298572800. Throughput: 0: 42695.8. Samples: 10298638740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 09:06:56,238][15401] Updated weights for policy 0, policy_version 628581 (0.0036) [2024-06-24 09:06:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 10298736640. Throughput: 0: 42573.4. Samples: 10298890840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:06:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 09:06:59,839][15401] Updated weights for policy 0, policy_version 628591 (0.0031) [2024-06-24 09:07:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10298982400. Throughput: 0: 42418.7. Samples: 10299135800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:07:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:07:04,072][15401] Updated weights for policy 0, policy_version 628601 (0.0032) [2024-06-24 09:07:07,479][15401] Updated weights for policy 0, policy_version 628611 (0.0033) [2024-06-24 09:07:08,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10299195392. Throughput: 0: 42430.1. Samples: 10299271320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:07:08,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 09:07:11,685][15401] Updated weights for policy 0, policy_version 628621 (0.0037) [2024-06-24 09:07:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42051.6, 300 sec: 42653.6). Total num frames: 10299375616. Throughput: 0: 42281.2. Samples: 10299518800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 09:07:13,393][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 09:07:14,496][15349] Signal inference workers to stop experience collection... (152500 times) [2024-06-24 09:07:14,496][15349] Signal inference workers to resume experience collection... (152500 times) [2024-06-24 09:07:14,521][15401] InferenceWorker_p0-w0: stopping experience collection (152500 times) [2024-06-24 09:07:14,521][15401] InferenceWorker_p0-w0: resuming experience collection (152500 times) [2024-06-24 09:07:15,247][15401] Updated weights for policy 0, policy_version 628631 (0.0028) [2024-06-24 09:07:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10299604992. Throughput: 0: 42210.6. Samples: 10299768720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 09:07:19,248][15401] Updated weights for policy 0, policy_version 628641 (0.0043) [2024-06-24 09:07:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 10299801600. Throughput: 0: 42300.5. Samples: 10299904080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 09:07:23,483][15401] Updated weights for policy 0, policy_version 628651 (0.0035) [2024-06-24 09:07:26,775][15401] Updated weights for policy 0, policy_version 628661 (0.0031) [2024-06-24 09:07:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10300014592. Throughput: 0: 42230.3. Samples: 10300151680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 09:07:31,042][15401] Updated weights for policy 0, policy_version 628671 (0.0041) [2024-06-24 09:07:33,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 10300260352. Throughput: 0: 42136.0. Samples: 10300402340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:33,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 09:07:34,354][15401] Updated weights for policy 0, policy_version 628681 (0.0033) [2024-06-24 09:07:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10300456960. Throughput: 0: 42278.8. Samples: 10300541280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 09:07:38,686][15401] Updated weights for policy 0, policy_version 628691 (0.0026) [2024-06-24 09:07:42,098][15401] Updated weights for policy 0, policy_version 628701 (0.0031) [2024-06-24 09:07:43,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10300653568. Throughput: 0: 42252.8. Samples: 10300792220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 09:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628702_10300653568.pth... [2024-06-24 09:07:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628079_10290446336.pth [2024-06-24 09:07:46,341][15401] Updated weights for policy 0, policy_version 628711 (0.0040) [2024-06-24 09:07:48,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10300915712. Throughput: 0: 42434.2. Samples: 10301045440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:48,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 09:07:50,204][15401] Updated weights for policy 0, policy_version 628721 (0.0028) [2024-06-24 09:07:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42654.2). Total num frames: 10301095936. Throughput: 0: 42651.6. Samples: 10301190640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 09:07:53,890][15401] Updated weights for policy 0, policy_version 628731 (0.0033) [2024-06-24 09:07:57,811][15401] Updated weights for policy 0, policy_version 628741 (0.0048) [2024-06-24 09:07:58,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10301308928. Throughput: 0: 42603.6. Samples: 10301435860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:07:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 09:08:01,644][15401] Updated weights for policy 0, policy_version 628751 (0.0047) [2024-06-24 09:08:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10301554688. Throughput: 0: 42711.9. Samples: 10301690760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 09:08:05,358][15401] Updated weights for policy 0, policy_version 628761 (0.0036) [2024-06-24 09:08:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42598.8). Total num frames: 10301734912. Throughput: 0: 42747.6. Samples: 10301827720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 09:08:09,202][15401] Updated weights for policy 0, policy_version 628771 (0.0024) [2024-06-24 09:08:13,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 10301931520. Throughput: 0: 42734.7. Samples: 10302074740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 09:08:13,688][15401] Updated weights for policy 0, policy_version 628781 (0.0043) [2024-06-24 09:08:16,823][15401] Updated weights for policy 0, policy_version 628791 (0.0038) [2024-06-24 09:08:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10302177280. Throughput: 0: 42881.0. Samples: 10302331880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 09:08:21,146][15401] Updated weights for policy 0, policy_version 628801 (0.0027) [2024-06-24 09:08:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10302390272. Throughput: 0: 42870.3. Samples: 10302470440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:23,404][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 09:08:24,524][15401] Updated weights for policy 0, policy_version 628811 (0.0034) [2024-06-24 09:08:24,755][15349] Signal inference workers to stop experience collection... (152550 times) [2024-06-24 09:08:24,758][15349] Signal inference workers to resume experience collection... (152550 times) [2024-06-24 09:08:24,783][15401] InferenceWorker_p0-w0: stopping experience collection (152550 times) [2024-06-24 09:08:24,784][15401] InferenceWorker_p0-w0: resuming experience collection (152550 times) [2024-06-24 09:08:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10302586880. Throughput: 0: 42867.1. Samples: 10302721240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 09:08:28,615][15401] Updated weights for policy 0, policy_version 628821 (0.0039) [2024-06-24 09:08:32,081][15401] Updated weights for policy 0, policy_version 628831 (0.0032) [2024-06-24 09:08:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10302832640. Throughput: 0: 42894.8. Samples: 10302975600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 09:08:36,198][15401] Updated weights for policy 0, policy_version 628841 (0.0036) [2024-06-24 09:08:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10303012864. Throughput: 0: 42735.7. Samples: 10303113740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 09:08:39,759][15401] Updated weights for policy 0, policy_version 628851 (0.0034) [2024-06-24 09:08:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10303242240. Throughput: 0: 42764.0. Samples: 10303360240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:43,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 09:08:43,742][15401] Updated weights for policy 0, policy_version 628861 (0.0026) [2024-06-24 09:08:47,476][15401] Updated weights for policy 0, policy_version 628871 (0.0030) [2024-06-24 09:08:48,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42595.6, 300 sec: 42764.1). Total num frames: 10303471616. Throughput: 0: 42854.0. Samples: 10303619460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 09:08:48,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 09:08:51,419][15401] Updated weights for policy 0, policy_version 628881 (0.0025) [2024-06-24 09:08:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10303651840. Throughput: 0: 42725.3. Samples: 10303750360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:08:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 09:08:55,079][15401] Updated weights for policy 0, policy_version 628891 (0.0039) [2024-06-24 09:08:58,390][15132] Fps is (10 sec: 42625.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10303897600. Throughput: 0: 42793.7. Samples: 10304000460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:08:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 09:08:59,050][15401] Updated weights for policy 0, policy_version 628901 (0.0037) [2024-06-24 09:09:02,846][15401] Updated weights for policy 0, policy_version 628911 (0.0037) [2024-06-24 09:09:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10304110592. Throughput: 0: 42877.4. Samples: 10304261360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:03,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 09:09:06,925][15401] Updated weights for policy 0, policy_version 628921 (0.0030) [2024-06-24 09:09:08,392][15132] Fps is (10 sec: 37672.5, 60 sec: 42323.3, 300 sec: 42542.4). Total num frames: 10304274432. Throughput: 0: 42530.1. Samples: 10304384420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:08,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 09:09:10,574][15401] Updated weights for policy 0, policy_version 628931 (0.0037) [2024-06-24 09:09:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 10304552960. Throughput: 0: 42605.9. Samples: 10304638500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 09:09:14,359][15401] Updated weights for policy 0, policy_version 628941 (0.0033) [2024-06-24 09:09:18,195][15401] Updated weights for policy 0, policy_version 628951 (0.0029) [2024-06-24 09:09:18,390][15132] Fps is (10 sec: 45888.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10304733184. Throughput: 0: 42747.1. Samples: 10304899220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 09:09:21,893][15401] Updated weights for policy 0, policy_version 628961 (0.0029) [2024-06-24 09:09:23,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10304929792. Throughput: 0: 42365.7. Samples: 10305020200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 09:09:25,851][15401] Updated weights for policy 0, policy_version 628971 (0.0032) [2024-06-24 09:09:28,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 10305191936. Throughput: 0: 42632.0. Samples: 10305278780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:28,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 09:09:29,469][15401] Updated weights for policy 0, policy_version 628981 (0.0047) [2024-06-24 09:09:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10305372160. Throughput: 0: 42658.9. Samples: 10305538840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 09:09:33,779][15401] Updated weights for policy 0, policy_version 628991 (0.0031) [2024-06-24 09:09:37,168][15401] Updated weights for policy 0, policy_version 629001 (0.0033) [2024-06-24 09:09:38,248][15349] Signal inference workers to stop experience collection... (152600 times) [2024-06-24 09:09:38,249][15349] Signal inference workers to resume experience collection... (152600 times) [2024-06-24 09:09:38,294][15401] InferenceWorker_p0-w0: stopping experience collection (152600 times) [2024-06-24 09:09:38,294][15401] InferenceWorker_p0-w0: resuming experience collection (152600 times) [2024-06-24 09:09:38,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10305585152. Throughput: 0: 42536.8. Samples: 10305664520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 09:09:41,342][15401] Updated weights for policy 0, policy_version 629011 (0.0031) [2024-06-24 09:09:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 10305814528. Throughput: 0: 42592.4. Samples: 10305917120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 09:09:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629018_10305830912.pth... [2024-06-24 09:09:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628392_10295574528.pth [2024-06-24 09:09:44,872][15401] Updated weights for policy 0, policy_version 629021 (0.0037) [2024-06-24 09:09:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 10306011136. Throughput: 0: 42605.8. Samples: 10306178620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:48,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 09:09:49,127][15401] Updated weights for policy 0, policy_version 629031 (0.0027) [2024-06-24 09:09:52,428][15401] Updated weights for policy 0, policy_version 629041 (0.0039) [2024-06-24 09:09:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10306207744. Throughput: 0: 42604.1. Samples: 10306301480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:53,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 09:09:56,683][15401] Updated weights for policy 0, policy_version 629051 (0.0027) [2024-06-24 09:09:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10306453504. Throughput: 0: 42852.1. Samples: 10306566840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:09:58,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 09:09:59,937][15401] Updated weights for policy 0, policy_version 629061 (0.0026) [2024-06-24 09:10:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10306650112. Throughput: 0: 42838.7. Samples: 10306826960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:10:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 09:10:04,122][15401] Updated weights for policy 0, policy_version 629071 (0.0035) [2024-06-24 09:10:07,713][15401] Updated weights for policy 0, policy_version 629081 (0.0028) [2024-06-24 09:10:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.7, 300 sec: 42598.4). Total num frames: 10306863104. Throughput: 0: 42909.9. Samples: 10306951140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:10:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 09:10:11,757][15401] Updated weights for policy 0, policy_version 629091 (0.0030) [2024-06-24 09:10:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10307092480. Throughput: 0: 42959.3. Samples: 10307211840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:10:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 09:10:15,228][15401] Updated weights for policy 0, policy_version 629101 (0.0031) [2024-06-24 09:10:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10307305472. Throughput: 0: 42985.7. Samples: 10307473200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:10:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 09:10:19,315][15401] Updated weights for policy 0, policy_version 629111 (0.0043) [2024-06-24 09:10:23,355][15401] Updated weights for policy 0, policy_version 629121 (0.0038) [2024-06-24 09:10:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10307518464. Throughput: 0: 42925.0. Samples: 10307596140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-06-24 09:10:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 09:10:27,335][15401] Updated weights for policy 0, policy_version 629131 (0.0028) [2024-06-24 09:10:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42710.0). Total num frames: 10307731456. Throughput: 0: 43152.5. Samples: 10307858980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:28,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 09:10:30,965][15401] Updated weights for policy 0, policy_version 629141 (0.0036) [2024-06-24 09:10:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10307928064. Throughput: 0: 42987.6. Samples: 10308113060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 09:10:35,257][15401] Updated weights for policy 0, policy_version 629151 (0.0025) [2024-06-24 09:10:38,288][15401] Updated weights for policy 0, policy_version 629161 (0.0030) [2024-06-24 09:10:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10308173824. Throughput: 0: 43113.2. Samples: 10308241580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 09:10:42,792][15401] Updated weights for policy 0, policy_version 629171 (0.0030) [2024-06-24 09:10:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10308370432. Throughput: 0: 43012.2. Samples: 10308502400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 09:10:45,699][15401] Updated weights for policy 0, policy_version 629181 (0.0028) [2024-06-24 09:10:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10308616192. Throughput: 0: 43060.5. Samples: 10308764680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 09:10:50,290][15401] Updated weights for policy 0, policy_version 629191 (0.0043) [2024-06-24 09:10:53,117][15401] Updated weights for policy 0, policy_version 629201 (0.0033) [2024-06-24 09:10:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 42765.4). Total num frames: 10308829184. Throughput: 0: 43243.0. Samples: 10308897080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 09:10:57,738][15401] Updated weights for policy 0, policy_version 629211 (0.0040) [2024-06-24 09:10:57,899][15349] Signal inference workers to stop experience collection... (152650 times) [2024-06-24 09:10:57,931][15401] InferenceWorker_p0-w0: stopping experience collection (152650 times) [2024-06-24 09:10:57,962][15349] Signal inference workers to resume experience collection... (152650 times) [2024-06-24 09:10:57,963][15401] InferenceWorker_p0-w0: resuming experience collection (152650 times) [2024-06-24 09:10:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10309025792. Throughput: 0: 43203.9. Samples: 10309156020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:10:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 09:11:00,845][15401] Updated weights for policy 0, policy_version 629221 (0.0028) [2024-06-24 09:11:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 10309271552. Throughput: 0: 43081.3. Samples: 10309411860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 09:11:05,235][15401] Updated weights for policy 0, policy_version 629231 (0.0028) [2024-06-24 09:11:08,383][15401] Updated weights for policy 0, policy_version 629241 (0.0046) [2024-06-24 09:11:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 42820.8). Total num frames: 10309484544. Throughput: 0: 43460.0. Samples: 10309551840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 09:11:12,676][15401] Updated weights for policy 0, policy_version 629251 (0.0044) [2024-06-24 09:11:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 10309664768. Throughput: 0: 43218.6. Samples: 10309803820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 09:11:16,431][15401] Updated weights for policy 0, policy_version 629261 (0.0032) [2024-06-24 09:11:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10309910528. Throughput: 0: 43116.3. Samples: 10310053300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 09:11:20,205][15401] Updated weights for policy 0, policy_version 629271 (0.0041) [2024-06-24 09:11:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10310107136. Throughput: 0: 43307.7. Samples: 10310190420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 09:11:24,089][15401] Updated weights for policy 0, policy_version 629281 (0.0044) [2024-06-24 09:11:27,748][15401] Updated weights for policy 0, policy_version 629291 (0.0041) [2024-06-24 09:11:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10310303744. Throughput: 0: 43166.7. Samples: 10310444900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 09:11:31,748][15401] Updated weights for policy 0, policy_version 629301 (0.0039) [2024-06-24 09:11:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 10310549504. Throughput: 0: 42928.8. Samples: 10310696480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 09:11:35,281][15401] Updated weights for policy 0, policy_version 629311 (0.0037) [2024-06-24 09:11:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10310746112. Throughput: 0: 43060.3. Samples: 10310834800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 09:11:39,316][15401] Updated weights for policy 0, policy_version 629321 (0.0042) [2024-06-24 09:11:42,692][15401] Updated weights for policy 0, policy_version 629331 (0.0034) [2024-06-24 09:11:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10310959104. Throughput: 0: 42865.8. Samples: 10311084980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 09:11:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629331_10310959104.pth... [2024-06-24 09:11:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000628702_10300653568.pth [2024-06-24 09:11:46,825][15401] Updated weights for policy 0, policy_version 629341 (0.0030) [2024-06-24 09:11:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10311188480. Throughput: 0: 42892.0. Samples: 10311342000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 09:11:50,211][15401] Updated weights for policy 0, policy_version 629351 (0.0035) [2024-06-24 09:11:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10311401472. Throughput: 0: 42760.3. Samples: 10311476060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 09:11:54,393][15401] Updated weights for policy 0, policy_version 629361 (0.0041) [2024-06-24 09:11:58,360][15401] Updated weights for policy 0, policy_version 629371 (0.0026) [2024-06-24 09:11:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10311614464. Throughput: 0: 42769.4. Samples: 10311728440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 09:11:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 09:12:02,262][15401] Updated weights for policy 0, policy_version 629381 (0.0043) [2024-06-24 09:12:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10311843840. Throughput: 0: 43020.0. Samples: 10311989200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:03,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-24 09:12:05,763][15401] Updated weights for policy 0, policy_version 629391 (0.0035) [2024-06-24 09:12:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42876.5). Total num frames: 10312024064. Throughput: 0: 42900.8. Samples: 10312120960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 09:12:09,731][15401] Updated weights for policy 0, policy_version 629401 (0.0037) [2024-06-24 09:12:13,281][15401] Updated weights for policy 0, policy_version 629411 (0.0038) [2024-06-24 09:12:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10312269824. Throughput: 0: 42964.0. Samples: 10312378280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:13,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 09:12:17,372][15401] Updated weights for policy 0, policy_version 629421 (0.0041) [2024-06-24 09:12:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10312466432. Throughput: 0: 42972.0. Samples: 10312630220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 09:12:18,727][15349] Signal inference workers to stop experience collection... (152700 times) [2024-06-24 09:12:18,771][15401] InferenceWorker_p0-w0: stopping experience collection (152700 times) [2024-06-24 09:12:18,780][15349] Signal inference workers to resume experience collection... (152700 times) [2024-06-24 09:12:18,790][15401] InferenceWorker_p0-w0: resuming experience collection (152700 times) [2024-06-24 09:12:21,299][15401] Updated weights for policy 0, policy_version 629431 (0.0040) [2024-06-24 09:12:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10312663040. Throughput: 0: 42854.0. Samples: 10312763220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 09:12:25,029][15401] Updated weights for policy 0, policy_version 629441 (0.0039) [2024-06-24 09:12:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43415.9, 300 sec: 42876.1). Total num frames: 10312908800. Throughput: 0: 43021.3. Samples: 10313021040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:28,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 09:12:28,632][15401] Updated weights for policy 0, policy_version 629451 (0.0043) [2024-06-24 09:12:32,526][15401] Updated weights for policy 0, policy_version 629461 (0.0030) [2024-06-24 09:12:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10313121792. Throughput: 0: 43132.5. Samples: 10313282960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 09:12:36,067][15401] Updated weights for policy 0, policy_version 629471 (0.0035) [2024-06-24 09:12:38,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 10313318400. Throughput: 0: 42868.5. Samples: 10313405140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 09:12:40,347][15401] Updated weights for policy 0, policy_version 629481 (0.0034) [2024-06-24 09:12:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 10313547776. Throughput: 0: 43120.5. Samples: 10313668860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 09:12:43,661][15401] Updated weights for policy 0, policy_version 629491 (0.0035) [2024-06-24 09:12:47,871][15401] Updated weights for policy 0, policy_version 629501 (0.0033) [2024-06-24 09:12:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10313760768. Throughput: 0: 43047.9. Samples: 10313926360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 09:12:51,377][15401] Updated weights for policy 0, policy_version 629511 (0.0034) [2024-06-24 09:12:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10313957376. Throughput: 0: 42845.9. Samples: 10314049020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 09:12:55,385][15401] Updated weights for policy 0, policy_version 629521 (0.0041) [2024-06-24 09:12:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10314203136. Throughput: 0: 42940.1. Samples: 10314310580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:12:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 09:12:58,750][15401] Updated weights for policy 0, policy_version 629531 (0.0033) [2024-06-24 09:13:02,930][15401] Updated weights for policy 0, policy_version 629541 (0.0027) [2024-06-24 09:13:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10314399744. Throughput: 0: 43082.7. Samples: 10314568940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 09:13:06,355][15401] Updated weights for policy 0, policy_version 629551 (0.0035) [2024-06-24 09:13:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10314596352. Throughput: 0: 42959.4. Samples: 10314696400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 09:13:11,278][15401] Updated weights for policy 0, policy_version 629561 (0.0024) [2024-06-24 09:13:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 10314842112. Throughput: 0: 43038.3. Samples: 10314957660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 09:13:14,341][15401] Updated weights for policy 0, policy_version 629571 (0.0034) [2024-06-24 09:13:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10315038720. Throughput: 0: 42933.2. Samples: 10315214960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 09:13:18,767][15401] Updated weights for policy 0, policy_version 629581 (0.0028) [2024-06-24 09:13:22,014][15401] Updated weights for policy 0, policy_version 629591 (0.0034) [2024-06-24 09:13:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10315251712. Throughput: 0: 42972.0. Samples: 10315338880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 09:13:26,398][15401] Updated weights for policy 0, policy_version 629601 (0.0034) [2024-06-24 09:13:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 10315481088. Throughput: 0: 42864.5. Samples: 10315597760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:13:29,626][15401] Updated weights for policy 0, policy_version 629611 (0.0035) [2024-06-24 09:13:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10315677696. Throughput: 0: 42756.9. Samples: 10315850420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 09:13:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 09:13:33,952][15401] Updated weights for policy 0, policy_version 629621 (0.0050) [2024-06-24 09:13:37,413][15401] Updated weights for policy 0, policy_version 629631 (0.0039) [2024-06-24 09:13:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10315907072. Throughput: 0: 42786.5. Samples: 10315974420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:13:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 09:13:41,434][15401] Updated weights for policy 0, policy_version 629641 (0.0034) [2024-06-24 09:13:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 10316120064. Throughput: 0: 42945.2. Samples: 10316243120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:13:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 09:13:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629647_10316136448.pth... [2024-06-24 09:13:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629018_10305830912.pth [2024-06-24 09:13:45,130][15401] Updated weights for policy 0, policy_version 629651 (0.0031) [2024-06-24 09:13:45,134][15349] Signal inference workers to stop experience collection... (152750 times) [2024-06-24 09:13:45,135][15349] Signal inference workers to resume experience collection... (152750 times) [2024-06-24 09:13:45,171][15401] InferenceWorker_p0-w0: stopping experience collection (152750 times) [2024-06-24 09:13:45,172][15401] InferenceWorker_p0-w0: resuming experience collection (152750 times) [2024-06-24 09:13:48,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.9, 300 sec: 42986.8). Total num frames: 10316333056. Throughput: 0: 42873.7. Samples: 10316498360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:13:48,393][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 09:13:49,046][15401] Updated weights for policy 0, policy_version 629661 (0.0034) [2024-06-24 09:13:52,799][15401] Updated weights for policy 0, policy_version 629671 (0.0038) [2024-06-24 09:13:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10316546048. Throughput: 0: 42735.1. Samples: 10316619480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:13:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 09:13:56,904][15401] Updated weights for policy 0, policy_version 629681 (0.0036) [2024-06-24 09:13:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10316759040. Throughput: 0: 42838.6. Samples: 10316885400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:13:58,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-24 09:14:00,299][15401] Updated weights for policy 0, policy_version 629691 (0.0037) [2024-06-24 09:14:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 43043.1). Total num frames: 10316972032. Throughput: 0: 42794.3. Samples: 10317140700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:03,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 09:14:04,439][15401] Updated weights for policy 0, policy_version 629701 (0.0033) [2024-06-24 09:14:07,977][15401] Updated weights for policy 0, policy_version 629711 (0.0037) [2024-06-24 09:14:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 10317185024. Throughput: 0: 42812.9. Samples: 10317265560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:08,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 09:14:12,151][15401] Updated weights for policy 0, policy_version 629721 (0.0033) [2024-06-24 09:14:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 10317381632. Throughput: 0: 42806.0. Samples: 10317524040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 09:14:15,494][15401] Updated weights for policy 0, policy_version 629731 (0.0040) [2024-06-24 09:14:18,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 10317627392. Throughput: 0: 42853.4. Samples: 10317778820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 09:14:19,798][15401] Updated weights for policy 0, policy_version 629741 (0.0039) [2024-06-24 09:14:23,223][15401] Updated weights for policy 0, policy_version 629751 (0.0037) [2024-06-24 09:14:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 10317840384. Throughput: 0: 43053.7. Samples: 10317911840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 09:14:27,792][15401] Updated weights for policy 0, policy_version 629761 (0.0033) [2024-06-24 09:14:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10318036992. Throughput: 0: 42616.1. Samples: 10318160840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 09:14:31,050][15401] Updated weights for policy 0, policy_version 629771 (0.0031) [2024-06-24 09:14:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10318249984. Throughput: 0: 42531.1. Samples: 10318412160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 09:14:35,795][15401] Updated weights for policy 0, policy_version 629781 (0.0035) [2024-06-24 09:14:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10318479360. Throughput: 0: 42827.5. Samples: 10318546720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 09:14:39,286][15401] Updated weights for policy 0, policy_version 629791 (0.0032) [2024-06-24 09:14:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 10318643200. Throughput: 0: 42366.3. Samples: 10318791880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 09:14:43,404][15401] Updated weights for policy 0, policy_version 629801 (0.0033) [2024-06-24 09:14:46,756][15401] Updated weights for policy 0, policy_version 629811 (0.0037) [2024-06-24 09:14:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 10318872576. Throughput: 0: 42451.1. Samples: 10319051000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 09:14:51,068][15401] Updated weights for policy 0, policy_version 629821 (0.0032) [2024-06-24 09:14:53,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10319118336. Throughput: 0: 42619.1. Samples: 10319183320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 09:14:54,714][15401] Updated weights for policy 0, policy_version 629831 (0.0041) [2024-06-24 09:14:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 10319298560. Throughput: 0: 42425.3. Samples: 10319433180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:14:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 09:14:58,785][15401] Updated weights for policy 0, policy_version 629841 (0.0036) [2024-06-24 09:14:58,978][15349] Signal inference workers to stop experience collection... (152800 times) [2024-06-24 09:14:58,978][15349] Signal inference workers to resume experience collection... (152800 times) [2024-06-24 09:14:58,998][15401] InferenceWorker_p0-w0: stopping experience collection (152800 times) [2024-06-24 09:14:58,998][15401] InferenceWorker_p0-w0: resuming experience collection (152800 times) [2024-06-24 09:15:02,425][15401] Updated weights for policy 0, policy_version 629851 (0.0034) [2024-06-24 09:15:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10319511552. Throughput: 0: 42505.9. Samples: 10319691580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:15:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:15:06,388][15401] Updated weights for policy 0, policy_version 629861 (0.0031) [2024-06-24 09:15:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 10319757312. Throughput: 0: 42387.7. Samples: 10319819280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 09:15:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 09:15:10,205][15401] Updated weights for policy 0, policy_version 629871 (0.0027) [2024-06-24 09:15:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10319953920. Throughput: 0: 42543.9. Samples: 10320075320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 09:15:14,005][15401] Updated weights for policy 0, policy_version 629881 (0.0033) [2024-06-24 09:15:17,698][15401] Updated weights for policy 0, policy_version 629891 (0.0028) [2024-06-24 09:15:18,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.6, 300 sec: 42820.2). Total num frames: 10320150528. Throughput: 0: 42554.6. Samples: 10320327220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:18,393][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 09:15:21,575][15401] Updated weights for policy 0, policy_version 629901 (0.0033) [2024-06-24 09:15:23,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 10320379904. Throughput: 0: 42413.8. Samples: 10320455440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:23,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 09:15:25,214][15401] Updated weights for policy 0, policy_version 629911 (0.0030) [2024-06-24 09:15:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10320576512. Throughput: 0: 42669.2. Samples: 10320712000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 09:15:29,188][15401] Updated weights for policy 0, policy_version 629921 (0.0031) [2024-06-24 09:15:32,750][15401] Updated weights for policy 0, policy_version 629931 (0.0030) [2024-06-24 09:15:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 10320789504. Throughput: 0: 42603.6. Samples: 10320968260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:33,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 09:15:36,749][15401] Updated weights for policy 0, policy_version 629941 (0.0028) [2024-06-24 09:15:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 10321035264. Throughput: 0: 42537.8. Samples: 10321097520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 09:15:40,198][15401] Updated weights for policy 0, policy_version 629951 (0.0038) [2024-06-24 09:15:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10321231872. Throughput: 0: 42814.7. Samples: 10321359840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 09:15:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629958_10321231872.pth... [2024-06-24 09:15:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629331_10310959104.pth [2024-06-24 09:15:44,233][15401] Updated weights for policy 0, policy_version 629961 (0.0032) [2024-06-24 09:15:47,714][15401] Updated weights for policy 0, policy_version 629971 (0.0046) [2024-06-24 09:15:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10321444864. Throughput: 0: 42708.8. Samples: 10321613480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 09:15:51,908][15401] Updated weights for policy 0, policy_version 629981 (0.0033) [2024-06-24 09:15:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10321674240. Throughput: 0: 42804.5. Samples: 10321745480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 09:15:55,249][15401] Updated weights for policy 0, policy_version 629991 (0.0043) [2024-06-24 09:15:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10321870848. Throughput: 0: 42843.2. Samples: 10322003260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:15:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:15:59,756][15401] Updated weights for policy 0, policy_version 630001 (0.0048) [2024-06-24 09:16:03,274][15401] Updated weights for policy 0, policy_version 630011 (0.0046) [2024-06-24 09:16:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10322100224. Throughput: 0: 42782.2. Samples: 10322252320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 09:16:07,639][15401] Updated weights for policy 0, policy_version 630021 (0.0031) [2024-06-24 09:16:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10322313216. Throughput: 0: 42893.4. Samples: 10322385540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 09:16:10,780][15401] Updated weights for policy 0, policy_version 630031 (0.0029) [2024-06-24 09:16:13,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10322509824. Throughput: 0: 42908.0. Samples: 10322642860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 09:16:15,156][15401] Updated weights for policy 0, policy_version 630041 (0.0039) [2024-06-24 09:16:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 10322739200. Throughput: 0: 42694.2. Samples: 10322889400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 09:16:18,566][15401] Updated weights for policy 0, policy_version 630051 (0.0040) [2024-06-24 09:16:23,023][15401] Updated weights for policy 0, policy_version 630061 (0.0034) [2024-06-24 09:16:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 10322935808. Throughput: 0: 42829.8. Samples: 10323024860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 09:16:24,096][15349] Signal inference workers to stop experience collection... (152850 times) [2024-06-24 09:16:24,130][15401] InferenceWorker_p0-w0: stopping experience collection (152850 times) [2024-06-24 09:16:24,156][15349] Signal inference workers to resume experience collection... (152850 times) [2024-06-24 09:16:24,164][15401] InferenceWorker_p0-w0: resuming experience collection (152850 times) [2024-06-24 09:16:26,189][15401] Updated weights for policy 0, policy_version 630071 (0.0039) [2024-06-24 09:16:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10323148800. Throughput: 0: 42708.9. Samples: 10323281740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 09:16:30,563][15401] Updated weights for policy 0, policy_version 630081 (0.0034) [2024-06-24 09:16:33,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43144.4, 300 sec: 42820.2). Total num frames: 10323378176. Throughput: 0: 42736.8. Samples: 10323536740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:33,393][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 09:16:33,767][15401] Updated weights for policy 0, policy_version 630091 (0.0037) [2024-06-24 09:16:38,374][15401] Updated weights for policy 0, policy_version 630101 (0.0041) [2024-06-24 09:16:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10323574784. Throughput: 0: 42626.2. Samples: 10323663660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 09:16:41,740][15401] Updated weights for policy 0, policy_version 630111 (0.0033) [2024-06-24 09:16:43,390][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10323804160. Throughput: 0: 42645.8. Samples: 10323922320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 09:16:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 09:16:45,811][15401] Updated weights for policy 0, policy_version 630121 (0.0020) [2024-06-24 09:16:48,391][15132] Fps is (10 sec: 44230.5, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 10324017152. Throughput: 0: 42913.1. Samples: 10324183460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:16:48,400][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 09:16:49,280][15401] Updated weights for policy 0, policy_version 630131 (0.0034) [2024-06-24 09:16:53,275][15401] Updated weights for policy 0, policy_version 630141 (0.0026) [2024-06-24 09:16:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10324230144. Throughput: 0: 42654.2. Samples: 10324304980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:16:53,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 09:16:57,185][15401] Updated weights for policy 0, policy_version 630151 (0.0029) [2024-06-24 09:16:58,392][15132] Fps is (10 sec: 42593.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 10324443136. Throughput: 0: 42812.8. Samples: 10324569540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:16:58,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 09:17:00,935][15401] Updated weights for policy 0, policy_version 630161 (0.0028) [2024-06-24 09:17:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10324639744. Throughput: 0: 43088.1. Samples: 10324828360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:03,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 09:17:04,706][15401] Updated weights for policy 0, policy_version 630171 (0.0042) [2024-06-24 09:17:08,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10324869120. Throughput: 0: 42859.5. Samples: 10324953540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 09:17:08,551][15401] Updated weights for policy 0, policy_version 630181 (0.0040) [2024-06-24 09:17:12,283][15401] Updated weights for policy 0, policy_version 630191 (0.0038) [2024-06-24 09:17:13,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 10325082112. Throughput: 0: 42807.3. Samples: 10325208340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:13,396][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 09:17:16,035][15401] Updated weights for policy 0, policy_version 630201 (0.0037) [2024-06-24 09:17:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10325295104. Throughput: 0: 42822.4. Samples: 10325463640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:18,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 09:17:19,971][15401] Updated weights for policy 0, policy_version 630211 (0.0035) [2024-06-24 09:17:23,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10325508096. Throughput: 0: 42856.7. Samples: 10325592220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 09:17:23,875][15401] Updated weights for policy 0, policy_version 630221 (0.0035) [2024-06-24 09:17:27,605][15401] Updated weights for policy 0, policy_version 630231 (0.0044) [2024-06-24 09:17:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10325704704. Throughput: 0: 42750.3. Samples: 10325846080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 09:17:31,370][15401] Updated weights for policy 0, policy_version 630241 (0.0033) [2024-06-24 09:17:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.3, 300 sec: 42820.5). Total num frames: 10325950464. Throughput: 0: 42608.3. Samples: 10326100780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 09:17:35,623][15401] Updated weights for policy 0, policy_version 630251 (0.0026) [2024-06-24 09:17:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10326147072. Throughput: 0: 42901.8. Samples: 10326235560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 09:17:38,898][15401] Updated weights for policy 0, policy_version 630261 (0.0043) [2024-06-24 09:17:42,950][15401] Updated weights for policy 0, policy_version 630271 (0.0024) [2024-06-24 09:17:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10326360064. Throughput: 0: 42840.9. Samples: 10326497280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 09:17:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630272_10326376448.pth... [2024-06-24 09:17:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629647_10316136448.pth [2024-06-24 09:17:44,011][15349] Signal inference workers to stop experience collection... (152900 times) [2024-06-24 09:17:44,016][15349] Signal inference workers to resume experience collection... (152900 times) [2024-06-24 09:17:44,056][15401] InferenceWorker_p0-w0: stopping experience collection (152900 times) [2024-06-24 09:17:44,060][15401] InferenceWorker_p0-w0: resuming experience collection (152900 times) [2024-06-24 09:17:46,748][15401] Updated weights for policy 0, policy_version 630281 (0.0035) [2024-06-24 09:17:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.5, 300 sec: 42820.5). Total num frames: 10326589440. Throughput: 0: 42657.8. Samples: 10326747960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 09:17:50,772][15401] Updated weights for policy 0, policy_version 630291 (0.0034) [2024-06-24 09:17:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10326786048. Throughput: 0: 42784.5. Samples: 10326878840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 09:17:54,231][15401] Updated weights for policy 0, policy_version 630301 (0.0035) [2024-06-24 09:17:58,198][15401] Updated weights for policy 0, policy_version 630311 (0.0041) [2024-06-24 09:17:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 10327015424. Throughput: 0: 42925.3. Samples: 10327139700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:17:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 09:18:01,834][15401] Updated weights for policy 0, policy_version 630321 (0.0035) [2024-06-24 09:18:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10327212032. Throughput: 0: 42889.9. Samples: 10327393680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:18:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 09:18:06,218][15401] Updated weights for policy 0, policy_version 630331 (0.0031) [2024-06-24 09:18:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10327425024. Throughput: 0: 43014.8. Samples: 10327527880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:18:08,396][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 09:18:09,554][15401] Updated weights for policy 0, policy_version 630341 (0.0034) [2024-06-24 09:18:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 10327654400. Throughput: 0: 42943.6. Samples: 10327778540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:18:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 09:18:13,733][15401] Updated weights for policy 0, policy_version 630351 (0.0044) [2024-06-24 09:18:17,318][15401] Updated weights for policy 0, policy_version 630361 (0.0029) [2024-06-24 09:18:18,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 10327867392. Throughput: 0: 42913.0. Samples: 10328032140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:18:18,396][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 09:18:21,317][15401] Updated weights for policy 0, policy_version 630371 (0.0046) [2024-06-24 09:18:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10328080384. Throughput: 0: 42813.4. Samples: 10328162160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:18:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 09:18:24,978][15401] Updated weights for policy 0, policy_version 630381 (0.0026) [2024-06-24 09:18:28,389][15132] Fps is (10 sec: 42626.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10328293376. Throughput: 0: 42669.0. Samples: 10328417380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 09:18:28,937][15401] Updated weights for policy 0, policy_version 630391 (0.0038) [2024-06-24 09:18:32,824][15401] Updated weights for policy 0, policy_version 630401 (0.0030) [2024-06-24 09:18:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10328522752. Throughput: 0: 42872.0. Samples: 10328677200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:33,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 09:18:36,662][15401] Updated weights for policy 0, policy_version 630411 (0.0031) [2024-06-24 09:18:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10328735744. Throughput: 0: 42741.4. Samples: 10328802200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 09:18:40,582][15401] Updated weights for policy 0, policy_version 630421 (0.0034) [2024-06-24 09:18:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 10328948736. Throughput: 0: 42856.4. Samples: 10329068240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:43,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 09:18:44,199][15401] Updated weights for policy 0, policy_version 630431 (0.0030) [2024-06-24 09:18:48,324][15401] Updated weights for policy 0, policy_version 630441 (0.0026) [2024-06-24 09:18:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10329145344. Throughput: 0: 43027.1. Samples: 10329329900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 09:18:51,839][15401] Updated weights for policy 0, policy_version 630451 (0.0035) [2024-06-24 09:18:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10329391104. Throughput: 0: 42707.1. Samples: 10329449700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 09:18:56,025][15401] Updated weights for policy 0, policy_version 630461 (0.0034) [2024-06-24 09:18:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10329571328. Throughput: 0: 42886.2. Samples: 10329708420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:18:58,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 09:18:59,405][15401] Updated weights for policy 0, policy_version 630471 (0.0035) [2024-06-24 09:19:03,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 10329767936. Throughput: 0: 42848.7. Samples: 10329960060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 09:19:03,681][15401] Updated weights for policy 0, policy_version 630481 (0.0035) [2024-06-24 09:19:07,302][15401] Updated weights for policy 0, policy_version 630491 (0.0032) [2024-06-24 09:19:08,392][15132] Fps is (10 sec: 47503.1, 60 sec: 43689.0, 300 sec: 42931.3). Total num frames: 10330046464. Throughput: 0: 42805.8. Samples: 10330088520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:08,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 09:19:11,735][15401] Updated weights for policy 0, policy_version 630501 (0.0034) [2024-06-24 09:19:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10330210304. Throughput: 0: 42840.8. Samples: 10330345220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 09:19:14,762][15401] Updated weights for policy 0, policy_version 630511 (0.0030) [2024-06-24 09:19:15,008][15349] Signal inference workers to stop experience collection... (152950 times) [2024-06-24 09:19:15,010][15349] Signal inference workers to resume experience collection... (152950 times) [2024-06-24 09:19:15,030][15401] InferenceWorker_p0-w0: stopping experience collection (152950 times) [2024-06-24 09:19:15,030][15401] InferenceWorker_p0-w0: resuming experience collection (152950 times) [2024-06-24 09:19:18,390][15132] Fps is (10 sec: 37691.1, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 10330423296. Throughput: 0: 42779.4. Samples: 10330602280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 09:19:19,362][15401] Updated weights for policy 0, policy_version 630521 (0.0043) [2024-06-24 09:19:22,470][15401] Updated weights for policy 0, policy_version 630531 (0.0037) [2024-06-24 09:19:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10330685440. Throughput: 0: 42798.2. Samples: 10330728120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 09:19:26,833][15401] Updated weights for policy 0, policy_version 630541 (0.0035) [2024-06-24 09:19:28,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10330865664. Throughput: 0: 42862.6. Samples: 10330997160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:28,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 09:19:30,020][15401] Updated weights for policy 0, policy_version 630551 (0.0031) [2024-06-24 09:19:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10331078656. Throughput: 0: 42641.7. Samples: 10331248780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 09:19:34,282][15401] Updated weights for policy 0, policy_version 630561 (0.0032) [2024-06-24 09:19:37,616][15401] Updated weights for policy 0, policy_version 630571 (0.0047) [2024-06-24 09:19:38,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10331308032. Throughput: 0: 42931.1. Samples: 10331381600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:38,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 09:19:41,785][15401] Updated weights for policy 0, policy_version 630581 (0.0030) [2024-06-24 09:19:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10331504640. Throughput: 0: 42983.2. Samples: 10331642660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:43,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-24 09:19:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630585_10331504640.pth... [2024-06-24 09:19:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000629958_10321231872.pth [2024-06-24 09:19:45,069][15401] Updated weights for policy 0, policy_version 630591 (0.0035) [2024-06-24 09:19:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 10331750400. Throughput: 0: 42829.1. Samples: 10331887360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 09:19:49,391][15401] Updated weights for policy 0, policy_version 630601 (0.0050) [2024-06-24 09:19:52,650][15401] Updated weights for policy 0, policy_version 630611 (0.0036) [2024-06-24 09:19:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10331947008. Throughput: 0: 43003.1. Samples: 10332023560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 09:19:56,882][15401] Updated weights for policy 0, policy_version 630621 (0.0033) [2024-06-24 09:19:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10332160000. Throughput: 0: 43063.7. Samples: 10332283080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:19:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 09:20:00,320][15401] Updated weights for policy 0, policy_version 630631 (0.0033) [2024-06-24 09:20:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.8, 300 sec: 42820.6). Total num frames: 10332389376. Throughput: 0: 43006.4. Samples: 10332537560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 09:20:04,280][15401] Updated weights for policy 0, policy_version 630641 (0.0037) [2024-06-24 09:20:08,015][15401] Updated weights for policy 0, policy_version 630651 (0.0041) [2024-06-24 09:20:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42820.6). Total num frames: 10332585984. Throughput: 0: 43163.2. Samples: 10332670460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:08,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 09:20:11,819][15401] Updated weights for policy 0, policy_version 630661 (0.0044) [2024-06-24 09:20:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 10332798976. Throughput: 0: 42954.7. Samples: 10332930020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 09:20:15,590][15401] Updated weights for policy 0, policy_version 630671 (0.0033) [2024-06-24 09:20:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42876.4). Total num frames: 10333028352. Throughput: 0: 42929.7. Samples: 10333180620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 09:20:19,486][15401] Updated weights for policy 0, policy_version 630681 (0.0032) [2024-06-24 09:20:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10333224960. Throughput: 0: 42989.9. Samples: 10333316140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 09:20:23,439][15401] Updated weights for policy 0, policy_version 630691 (0.0029) [2024-06-24 09:20:27,073][15401] Updated weights for policy 0, policy_version 630701 (0.0044) [2024-06-24 09:20:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10333437952. Throughput: 0: 42783.5. Samples: 10333568020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:28,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 09:20:31,072][15401] Updated weights for policy 0, policy_version 630711 (0.0035) [2024-06-24 09:20:33,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10333650944. Throughput: 0: 43112.3. Samples: 10333827520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:33,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 09:20:34,572][15401] Updated weights for policy 0, policy_version 630721 (0.0050) [2024-06-24 09:20:38,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10333863936. Throughput: 0: 42971.1. Samples: 10333957260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:38,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 09:20:38,855][15401] Updated weights for policy 0, policy_version 630731 (0.0036) [2024-06-24 09:20:41,679][15349] Signal inference workers to stop experience collection... (153000 times) [2024-06-24 09:20:41,712][15401] InferenceWorker_p0-w0: stopping experience collection (153000 times) [2024-06-24 09:20:41,725][15349] Signal inference workers to resume experience collection... (153000 times) [2024-06-24 09:20:41,772][15401] InferenceWorker_p0-w0: resuming experience collection (153000 times) [2024-06-24 09:20:42,221][15401] Updated weights for policy 0, policy_version 630741 (0.0034) [2024-06-24 09:20:43,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10334093312. Throughput: 0: 42816.2. Samples: 10334209820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 09:20:46,441][15401] Updated weights for policy 0, policy_version 630751 (0.0048) [2024-06-24 09:20:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10334289920. Throughput: 0: 42944.5. Samples: 10334470060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 09:20:49,855][15401] Updated weights for policy 0, policy_version 630761 (0.0034) [2024-06-24 09:20:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10334519296. Throughput: 0: 42827.6. Samples: 10334597700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 09:20:54,183][15401] Updated weights for policy 0, policy_version 630771 (0.0030) [2024-06-24 09:20:57,692][15401] Updated weights for policy 0, policy_version 630781 (0.0040) [2024-06-24 09:20:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10334715904. Throughput: 0: 42622.7. Samples: 10334848040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:20:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 09:21:02,019][15401] Updated weights for policy 0, policy_version 630791 (0.0032) [2024-06-24 09:21:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10334945280. Throughput: 0: 42811.5. Samples: 10335107140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 09:21:05,185][15401] Updated weights for policy 0, policy_version 630801 (0.0033) [2024-06-24 09:21:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10335141888. Throughput: 0: 42718.2. Samples: 10335238460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:08,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 09:21:09,527][15401] Updated weights for policy 0, policy_version 630811 (0.0033) [2024-06-24 09:21:12,923][15401] Updated weights for policy 0, policy_version 630821 (0.0038) [2024-06-24 09:21:13,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10335371264. Throughput: 0: 42759.1. Samples: 10335492180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:13,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 09:21:17,068][15401] Updated weights for policy 0, policy_version 630831 (0.0033) [2024-06-24 09:21:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10335600640. Throughput: 0: 42622.8. Samples: 10335745440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 09:21:20,545][15401] Updated weights for policy 0, policy_version 630841 (0.0043) [2024-06-24 09:21:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10335780864. Throughput: 0: 42597.5. Samples: 10335874160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:23,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 09:21:24,729][15401] Updated weights for policy 0, policy_version 630851 (0.0044) [2024-06-24 09:21:28,179][15401] Updated weights for policy 0, policy_version 630861 (0.0040) [2024-06-24 09:21:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42876.5). Total num frames: 10336026624. Throughput: 0: 42733.9. Samples: 10336132840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 09:21:32,487][15401] Updated weights for policy 0, policy_version 630871 (0.0040) [2024-06-24 09:21:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 10336223232. Throughput: 0: 42674.2. Samples: 10336390400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 09:21:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 09:21:36,098][15401] Updated weights for policy 0, policy_version 630881 (0.0029) [2024-06-24 09:21:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10336436224. Throughput: 0: 42647.8. Samples: 10336516860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:21:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 09:21:40,370][15401] Updated weights for policy 0, policy_version 630891 (0.0028) [2024-06-24 09:21:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.3). Total num frames: 10336665600. Throughput: 0: 42676.4. Samples: 10336768480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:21:43,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 09:21:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630900_10336665600.pth... [2024-06-24 09:21:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630272_10326376448.pth [2024-06-24 09:21:43,874][15401] Updated weights for policy 0, policy_version 630901 (0.0032) [2024-06-24 09:21:48,246][15401] Updated weights for policy 0, policy_version 630911 (0.0036) [2024-06-24 09:21:48,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10336845824. Throughput: 0: 42808.7. Samples: 10337033520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:21:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 09:21:51,771][15401] Updated weights for policy 0, policy_version 630921 (0.0036) [2024-06-24 09:21:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 10337075200. Throughput: 0: 42516.0. Samples: 10337151680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:21:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 09:21:55,789][15401] Updated weights for policy 0, policy_version 630931 (0.0048) [2024-06-24 09:21:58,389][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10337304576. Throughput: 0: 42539.6. Samples: 10337406360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:21:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 09:21:59,210][15401] Updated weights for policy 0, policy_version 630941 (0.0033) [2024-06-24 09:22:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 10337484800. Throughput: 0: 42728.0. Samples: 10337668300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:03,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 09:22:03,625][15401] Updated weights for policy 0, policy_version 630951 (0.0027) [2024-06-24 09:22:06,728][15401] Updated weights for policy 0, policy_version 630961 (0.0029) [2024-06-24 09:22:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 10337714176. Throughput: 0: 42558.5. Samples: 10337789280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 09:22:11,098][15401] Updated weights for policy 0, policy_version 630971 (0.0036) [2024-06-24 09:22:13,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 10337943552. Throughput: 0: 42672.4. Samples: 10338053100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 09:22:14,178][15401] Updated weights for policy 0, policy_version 630981 (0.0041) [2024-06-24 09:22:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10338140160. Throughput: 0: 42606.2. Samples: 10338307680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 09:22:18,899][15401] Updated weights for policy 0, policy_version 630991 (0.0039) [2024-06-24 09:22:20,540][15349] Signal inference workers to stop experience collection... (153050 times) [2024-06-24 09:22:20,565][15401] InferenceWorker_p0-w0: stopping experience collection (153050 times) [2024-06-24 09:22:20,656][15349] Signal inference workers to resume experience collection... (153050 times) [2024-06-24 09:22:20,656][15401] InferenceWorker_p0-w0: resuming experience collection (153050 times) [2024-06-24 09:22:21,691][15401] Updated weights for policy 0, policy_version 631001 (0.0032) [2024-06-24 09:22:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10338353152. Throughput: 0: 42498.2. Samples: 10338429280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 09:22:26,381][15401] Updated weights for policy 0, policy_version 631011 (0.0035) [2024-06-24 09:22:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10338582528. Throughput: 0: 42680.8. Samples: 10338689120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 09:22:30,163][15401] Updated weights for policy 0, policy_version 631021 (0.0033) [2024-06-24 09:22:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10338762752. Throughput: 0: 42557.2. Samples: 10338948600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 09:22:33,847][15401] Updated weights for policy 0, policy_version 631031 (0.0038) [2024-06-24 09:22:37,919][15401] Updated weights for policy 0, policy_version 631041 (0.0041) [2024-06-24 09:22:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10338975744. Throughput: 0: 42612.4. Samples: 10339069240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 09:22:41,428][15401] Updated weights for policy 0, policy_version 631051 (0.0034) [2024-06-24 09:22:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10339205120. Throughput: 0: 42730.5. Samples: 10339329240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 09:22:46,010][15401] Updated weights for policy 0, policy_version 631061 (0.0038) [2024-06-24 09:22:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10339418112. Throughput: 0: 42473.9. Samples: 10339579520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 09:22:48,904][15401] Updated weights for policy 0, policy_version 631071 (0.0024) [2024-06-24 09:22:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 10339614720. Throughput: 0: 42665.1. Samples: 10339709220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 09:22:53,446][15401] Updated weights for policy 0, policy_version 631081 (0.0028) [2024-06-24 09:22:56,829][15401] Updated weights for policy 0, policy_version 631091 (0.0030) [2024-06-24 09:22:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 10339827712. Throughput: 0: 42593.0. Samples: 10339969780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:22:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 09:23:01,115][15401] Updated weights for policy 0, policy_version 631101 (0.0029) [2024-06-24 09:23:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 10340057088. Throughput: 0: 42594.7. Samples: 10340224440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:23:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 09:23:04,697][15401] Updated weights for policy 0, policy_version 631111 (0.0045) [2024-06-24 09:23:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10340253696. Throughput: 0: 42718.7. Samples: 10340351620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:23:08,399][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 09:23:08,945][15401] Updated weights for policy 0, policy_version 631121 (0.0029) [2024-06-24 09:23:12,402][15401] Updated weights for policy 0, policy_version 631131 (0.0027) [2024-06-24 09:23:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 10340483072. Throughput: 0: 42612.9. Samples: 10340606700. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 09:23:16,708][15401] Updated weights for policy 0, policy_version 631141 (0.0035) [2024-06-24 09:23:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10340696064. Throughput: 0: 42360.4. Samples: 10340854820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 09:23:20,158][15401] Updated weights for policy 0, policy_version 631151 (0.0031) [2024-06-24 09:23:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10340892672. Throughput: 0: 42564.9. Samples: 10340984660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:23,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 09:23:24,244][15401] Updated weights for policy 0, policy_version 631161 (0.0029) [2024-06-24 09:23:27,805][15401] Updated weights for policy 0, policy_version 631171 (0.0031) [2024-06-24 09:23:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10341122048. Throughput: 0: 42562.8. Samples: 10341244560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 09:23:31,930][15401] Updated weights for policy 0, policy_version 631181 (0.0028) [2024-06-24 09:23:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10341335040. Throughput: 0: 42620.9. Samples: 10341497460. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:33,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 09:23:35,806][15401] Updated weights for policy 0, policy_version 631191 (0.0032) [2024-06-24 09:23:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10341531648. Throughput: 0: 42550.0. Samples: 10341623960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 09:23:39,572][15349] Signal inference workers to stop experience collection... (153100 times) [2024-06-24 09:23:39,572][15349] Signal inference workers to resume experience collection... (153100 times) [2024-06-24 09:23:39,578][15401] Updated weights for policy 0, policy_version 631201 (0.0038) [2024-06-24 09:23:39,591][15401] InferenceWorker_p0-w0: stopping experience collection (153100 times) [2024-06-24 09:23:39,592][15401] InferenceWorker_p0-w0: resuming experience collection (153100 times) [2024-06-24 09:23:43,366][15401] Updated weights for policy 0, policy_version 631211 (0.0033) [2024-06-24 09:23:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10341761024. Throughput: 0: 42488.4. Samples: 10341881760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 09:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631211_10341761024.pth... [2024-06-24 09:23:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630585_10331504640.pth [2024-06-24 09:23:47,159][15401] Updated weights for policy 0, policy_version 631221 (0.0038) [2024-06-24 09:23:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10341957632. Throughput: 0: 42509.3. Samples: 10342137360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 09:23:50,978][15401] Updated weights for policy 0, policy_version 631231 (0.0037) [2024-06-24 09:23:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10342187008. Throughput: 0: 42496.1. Samples: 10342263940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 09:23:54,958][15401] Updated weights for policy 0, policy_version 631241 (0.0021) [2024-06-24 09:23:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10342400000. Throughput: 0: 42499.6. Samples: 10342519180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:23:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 09:23:58,907][15401] Updated weights for policy 0, policy_version 631251 (0.0033) [2024-06-24 09:24:02,499][15401] Updated weights for policy 0, policy_version 631261 (0.0033) [2024-06-24 09:24:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 10342612992. Throughput: 0: 42629.3. Samples: 10342773140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 09:24:06,589][15401] Updated weights for policy 0, policy_version 631271 (0.0034) [2024-06-24 09:24:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10342825984. Throughput: 0: 42652.7. Samples: 10342904040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 09:24:10,289][15401] Updated weights for policy 0, policy_version 631281 (0.0033) [2024-06-24 09:24:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42765.1). Total num frames: 10343038976. Throughput: 0: 42563.7. Samples: 10343159920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 09:24:14,233][15401] Updated weights for policy 0, policy_version 631291 (0.0047) [2024-06-24 09:24:17,843][15401] Updated weights for policy 0, policy_version 631301 (0.0032) [2024-06-24 09:24:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10343251968. Throughput: 0: 42639.1. Samples: 10343416220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:18,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-24 09:24:21,981][15401] Updated weights for policy 0, policy_version 631311 (0.0036) [2024-06-24 09:24:23,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 10343464960. Throughput: 0: 42598.0. Samples: 10343540880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 09:24:25,463][15401] Updated weights for policy 0, policy_version 631321 (0.0044) [2024-06-24 09:24:28,394][15132] Fps is (10 sec: 42579.4, 60 sec: 42595.3, 300 sec: 42708.8). Total num frames: 10343677952. Throughput: 0: 42578.2. Samples: 10343797960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:28,394][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 09:24:29,734][15401] Updated weights for policy 0, policy_version 631331 (0.0034) [2024-06-24 09:24:33,342][15401] Updated weights for policy 0, policy_version 631341 (0.0038) [2024-06-24 09:24:33,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 10343890944. Throughput: 0: 42574.2. Samples: 10344053300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:33,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 09:24:37,220][15401] Updated weights for policy 0, policy_version 631351 (0.0035) [2024-06-24 09:24:38,389][15132] Fps is (10 sec: 42617.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10344103936. Throughput: 0: 42648.0. Samples: 10344183100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 09:24:40,831][15401] Updated weights for policy 0, policy_version 631361 (0.0032) [2024-06-24 09:24:43,395][15132] Fps is (10 sec: 42585.9, 60 sec: 42594.7, 300 sec: 42597.6). Total num frames: 10344316928. Throughput: 0: 42752.8. Samples: 10344443280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:43,395][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 09:24:44,723][15401] Updated weights for policy 0, policy_version 631371 (0.0039) [2024-06-24 09:24:48,334][15401] Updated weights for policy 0, policy_version 631381 (0.0029) [2024-06-24 09:24:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10344546304. Throughput: 0: 42766.7. Samples: 10344697640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 09:24:52,286][15401] Updated weights for policy 0, policy_version 631391 (0.0043) [2024-06-24 09:24:53,389][15132] Fps is (10 sec: 42621.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10344742912. Throughput: 0: 42860.2. Samples: 10344832740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 09:24:55,830][15401] Updated weights for policy 0, policy_version 631401 (0.0031) [2024-06-24 09:24:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10344939520. Throughput: 0: 42766.1. Samples: 10345084400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:24:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 09:24:59,689][15401] Updated weights for policy 0, policy_version 631411 (0.0032) [2024-06-24 09:25:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10345185280. Throughput: 0: 42779.4. Samples: 10345341300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 09:25:03,402][15401] Updated weights for policy 0, policy_version 631421 (0.0025) [2024-06-24 09:25:07,055][15401] Updated weights for policy 0, policy_version 631431 (0.0033) [2024-06-24 09:25:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10345381888. Throughput: 0: 43029.1. Samples: 10345477180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 09:25:11,323][15401] Updated weights for policy 0, policy_version 631441 (0.0035) [2024-06-24 09:25:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 10345594880. Throughput: 0: 42921.4. Samples: 10345729240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 09:25:14,596][15401] Updated weights for policy 0, policy_version 631451 (0.0042) [2024-06-24 09:25:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10345824256. Throughput: 0: 42849.1. Samples: 10345981400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 09:25:19,239][15401] Updated weights for policy 0, policy_version 631461 (0.0036) [2024-06-24 09:25:21,509][15349] Signal inference workers to stop experience collection... (153150 times) [2024-06-24 09:25:21,567][15401] InferenceWorker_p0-w0: stopping experience collection (153150 times) [2024-06-24 09:25:21,575][15349] Signal inference workers to resume experience collection... (153150 times) [2024-06-24 09:25:21,581][15401] InferenceWorker_p0-w0: resuming experience collection (153150 times) [2024-06-24 09:25:22,650][15401] Updated weights for policy 0, policy_version 631471 (0.0045) [2024-06-24 09:25:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 10346020864. Throughput: 0: 42925.3. Samples: 10346114740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 09:25:26,762][15401] Updated weights for policy 0, policy_version 631481 (0.0029) [2024-06-24 09:25:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42874.5, 300 sec: 42709.8). Total num frames: 10346250240. Throughput: 0: 42728.5. Samples: 10346365840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 09:25:30,298][15401] Updated weights for policy 0, policy_version 631491 (0.0038) [2024-06-24 09:25:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 10346463232. Throughput: 0: 42914.2. Samples: 10346628780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:33,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 09:25:34,315][15401] Updated weights for policy 0, policy_version 631501 (0.0041) [2024-06-24 09:25:37,855][15401] Updated weights for policy 0, policy_version 631511 (0.0033) [2024-06-24 09:25:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10346676224. Throughput: 0: 42789.2. Samples: 10346758260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 09:25:42,420][15401] Updated weights for policy 0, policy_version 631521 (0.0048) [2024-06-24 09:25:43,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43143.7, 300 sec: 42764.1). Total num frames: 10346905600. Throughput: 0: 42888.9. Samples: 10347014680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:43,397][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 09:25:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631525_10346905600.pth... [2024-06-24 09:25:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000630900_10336665600.pth [2024-06-24 09:25:45,575][15401] Updated weights for policy 0, policy_version 631531 (0.0031) [2024-06-24 09:25:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10347102208. Throughput: 0: 42785.5. Samples: 10347266640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 09:25:50,041][15401] Updated weights for policy 0, policy_version 631541 (0.0032) [2024-06-24 09:25:53,137][15401] Updated weights for policy 0, policy_version 631551 (0.0039) [2024-06-24 09:25:53,389][15132] Fps is (10 sec: 42626.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10347331584. Throughput: 0: 42508.1. Samples: 10347390040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 09:25:57,729][15401] Updated weights for policy 0, policy_version 631561 (0.0033) [2024-06-24 09:25:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10347528192. Throughput: 0: 42670.7. Samples: 10347649420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:25:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 09:26:01,215][15401] Updated weights for policy 0, policy_version 631571 (0.0037) [2024-06-24 09:26:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10347741184. Throughput: 0: 42700.8. Samples: 10347902940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 09:26:05,469][15401] Updated weights for policy 0, policy_version 631581 (0.0030) [2024-06-24 09:26:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 10347970560. Throughput: 0: 42628.9. Samples: 10348033040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 09:26:09,033][15401] Updated weights for policy 0, policy_version 631591 (0.0042) [2024-06-24 09:26:13,110][15401] Updated weights for policy 0, policy_version 631601 (0.0041) [2024-06-24 09:26:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 10348150784. Throughput: 0: 42749.7. Samples: 10348289580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 09:26:16,652][15401] Updated weights for policy 0, policy_version 631611 (0.0038) [2024-06-24 09:26:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10348380160. Throughput: 0: 42485.8. Samples: 10348540640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 09:26:20,875][15401] Updated weights for policy 0, policy_version 631621 (0.0031) [2024-06-24 09:26:23,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 10348609536. Throughput: 0: 42425.7. Samples: 10348667520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:23,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 09:26:24,224][15401] Updated weights for policy 0, policy_version 631631 (0.0028) [2024-06-24 09:26:28,347][15401] Updated weights for policy 0, policy_version 631641 (0.0032) [2024-06-24 09:26:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10348806144. Throughput: 0: 42505.5. Samples: 10348927160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 09:26:32,514][15401] Updated weights for policy 0, policy_version 631651 (0.0025) [2024-06-24 09:26:33,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10349002752. Throughput: 0: 42510.5. Samples: 10349179620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 09:26:36,524][15401] Updated weights for policy 0, policy_version 631661 (0.0041) [2024-06-24 09:26:36,617][15349] Signal inference workers to stop experience collection... (153200 times) [2024-06-24 09:26:36,643][15401] InferenceWorker_p0-w0: stopping experience collection (153200 times) [2024-06-24 09:26:36,677][15349] Signal inference workers to resume experience collection... (153200 times) [2024-06-24 09:26:36,677][15401] InferenceWorker_p0-w0: resuming experience collection (153200 times) [2024-06-24 09:26:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10349248512. Throughput: 0: 42635.4. Samples: 10349308640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 09:26:39,929][15401] Updated weights for policy 0, policy_version 631671 (0.0031) [2024-06-24 09:26:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42056.8, 300 sec: 42653.9). Total num frames: 10349428736. Throughput: 0: 42568.1. Samples: 10349564980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 09:26:43,974][15401] Updated weights for policy 0, policy_version 631681 (0.0040) [2024-06-24 09:26:47,409][15401] Updated weights for policy 0, policy_version 631691 (0.0032) [2024-06-24 09:26:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10349658112. Throughput: 0: 42649.2. Samples: 10349822160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 09:26:51,470][15401] Updated weights for policy 0, policy_version 631701 (0.0037) [2024-06-24 09:26:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10349887488. Throughput: 0: 42617.4. Samples: 10349950820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:53,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-24 09:26:54,929][15401] Updated weights for policy 0, policy_version 631711 (0.0035) [2024-06-24 09:26:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 10350067712. Throughput: 0: 42762.3. Samples: 10350213880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:26:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 09:26:59,080][15401] Updated weights for policy 0, policy_version 631721 (0.0029) [2024-06-24 09:27:02,432][15401] Updated weights for policy 0, policy_version 631731 (0.0026) [2024-06-24 09:27:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10350297088. Throughput: 0: 42722.6. Samples: 10350463160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 09:27:06,770][15401] Updated weights for policy 0, policy_version 631741 (0.0025) [2024-06-24 09:27:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10350526464. Throughput: 0: 42807.6. Samples: 10350593760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:08,394][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:27:10,521][15401] Updated weights for policy 0, policy_version 631751 (0.0031) [2024-06-24 09:27:13,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42597.7, 300 sec: 42598.2). Total num frames: 10350706688. Throughput: 0: 42640.8. Samples: 10350846040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:13,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 09:27:14,615][15401] Updated weights for policy 0, policy_version 631761 (0.0031) [2024-06-24 09:27:18,065][15401] Updated weights for policy 0, policy_version 631771 (0.0029) [2024-06-24 09:27:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10350936064. Throughput: 0: 42716.2. Samples: 10351101840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 09:27:22,394][15401] Updated weights for policy 0, policy_version 631781 (0.0036) [2024-06-24 09:27:23,389][15132] Fps is (10 sec: 45880.9, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 10351165440. Throughput: 0: 42895.7. Samples: 10351238940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 09:27:25,653][15401] Updated weights for policy 0, policy_version 631791 (0.0034) [2024-06-24 09:27:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10351362048. Throughput: 0: 42789.7. Samples: 10351490520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 09:27:29,999][15401] Updated weights for policy 0, policy_version 631801 (0.0038) [2024-06-24 09:27:33,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 10351575040. Throughput: 0: 42724.8. Samples: 10351744780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 09:27:33,602][15401] Updated weights for policy 0, policy_version 631811 (0.0029) [2024-06-24 09:27:37,624][15401] Updated weights for policy 0, policy_version 631821 (0.0037) [2024-06-24 09:27:38,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 10351804416. Throughput: 0: 42629.4. Samples: 10351869420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:38,396][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 09:27:41,368][15401] Updated weights for policy 0, policy_version 631831 (0.0034) [2024-06-24 09:27:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10352001024. Throughput: 0: 42430.5. Samples: 10352123260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 09:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631836_10352001024.pth... [2024-06-24 09:27:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631211_10341761024.pth [2024-06-24 09:27:45,107][15401] Updated weights for policy 0, policy_version 631841 (0.0028) [2024-06-24 09:27:47,165][15349] Signal inference workers to stop experience collection... (153250 times) [2024-06-24 09:27:47,166][15349] Signal inference workers to resume experience collection... (153250 times) [2024-06-24 09:27:47,184][15401] InferenceWorker_p0-w0: stopping experience collection (153250 times) [2024-06-24 09:27:47,185][15401] InferenceWorker_p0-w0: resuming experience collection (153250 times) [2024-06-24 09:27:48,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10352214016. Throughput: 0: 42548.4. Samples: 10352377840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 09:27:48,980][15401] Updated weights for policy 0, policy_version 631851 (0.0029) [2024-06-24 09:27:53,047][15401] Updated weights for policy 0, policy_version 631861 (0.0030) [2024-06-24 09:27:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10352427008. Throughput: 0: 42513.0. Samples: 10352506840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 09:27:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 09:27:56,654][15401] Updated weights for policy 0, policy_version 631871 (0.0022) [2024-06-24 09:27:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 10352656384. Throughput: 0: 42617.5. Samples: 10352763880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:27:58,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 09:28:00,627][15401] Updated weights for policy 0, policy_version 631881 (0.0027) [2024-06-24 09:28:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10352852992. Throughput: 0: 42599.9. Samples: 10353018840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 09:28:04,164][15401] Updated weights for policy 0, policy_version 631891 (0.0032) [2024-06-24 09:28:08,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10353049600. Throughput: 0: 42257.2. Samples: 10353140520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 09:28:08,527][15401] Updated weights for policy 0, policy_version 631901 (0.0023) [2024-06-24 09:28:12,035][15401] Updated weights for policy 0, policy_version 631911 (0.0025) [2024-06-24 09:28:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43145.3, 300 sec: 42709.5). Total num frames: 10353295360. Throughput: 0: 42369.7. Samples: 10353397160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 09:28:16,139][15401] Updated weights for policy 0, policy_version 631921 (0.0039) [2024-06-24 09:28:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10353491968. Throughput: 0: 42569.5. Samples: 10353660400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 09:28:19,648][15401] Updated weights for policy 0, policy_version 631931 (0.0045) [2024-06-24 09:28:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10353704960. Throughput: 0: 42570.9. Samples: 10353784840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 09:28:23,629][15401] Updated weights for policy 0, policy_version 631941 (0.0026) [2024-06-24 09:28:27,154][15401] Updated weights for policy 0, policy_version 631951 (0.0033) [2024-06-24 09:28:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10353917952. Throughput: 0: 42704.7. Samples: 10354044960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 09:28:31,075][15401] Updated weights for policy 0, policy_version 631961 (0.0026) [2024-06-24 09:28:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.4). Total num frames: 10354130944. Throughput: 0: 42847.1. Samples: 10354305960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:33,391][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 09:28:34,749][15401] Updated weights for policy 0, policy_version 631971 (0.0024) [2024-06-24 09:28:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 10354343936. Throughput: 0: 42714.1. Samples: 10354428980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 09:28:39,127][15401] Updated weights for policy 0, policy_version 631981 (0.0036) [2024-06-24 09:28:42,427][15401] Updated weights for policy 0, policy_version 631991 (0.0035) [2024-06-24 09:28:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 10354589696. Throughput: 0: 42718.7. Samples: 10354686120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 09:28:46,638][15401] Updated weights for policy 0, policy_version 632001 (0.0037) [2024-06-24 09:28:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10354786304. Throughput: 0: 42828.1. Samples: 10354946100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 09:28:50,159][15401] Updated weights for policy 0, policy_version 632011 (0.0024) [2024-06-24 09:28:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10354982912. Throughput: 0: 42890.3. Samples: 10355070580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 09:28:54,160][15401] Updated weights for policy 0, policy_version 632021 (0.0040) [2024-06-24 09:28:57,783][15401] Updated weights for policy 0, policy_version 632031 (0.0027) [2024-06-24 09:28:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10355228672. Throughput: 0: 42914.8. Samples: 10355328320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:28:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 09:29:01,756][15401] Updated weights for policy 0, policy_version 632041 (0.0034) [2024-06-24 09:29:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10355425280. Throughput: 0: 42747.3. Samples: 10355584020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 09:29:05,487][15401] Updated weights for policy 0, policy_version 632051 (0.0030) [2024-06-24 09:29:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10355638272. Throughput: 0: 42878.8. Samples: 10355714380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 09:29:09,387][15401] Updated weights for policy 0, policy_version 632061 (0.0039) [2024-06-24 09:29:13,249][15401] Updated weights for policy 0, policy_version 632071 (0.0029) [2024-06-24 09:29:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10355851264. Throughput: 0: 42809.2. Samples: 10355971380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 09:29:13,468][15349] Signal inference workers to stop experience collection... (153300 times) [2024-06-24 09:29:13,468][15349] Signal inference workers to resume experience collection... (153300 times) [2024-06-24 09:29:13,513][15401] InferenceWorker_p0-w0: stopping experience collection (153300 times) [2024-06-24 09:29:13,513][15401] InferenceWorker_p0-w0: resuming experience collection (153300 times) [2024-06-24 09:29:16,976][15401] Updated weights for policy 0, policy_version 632081 (0.0037) [2024-06-24 09:29:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10356047872. Throughput: 0: 42759.8. Samples: 10356230140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 09:29:20,762][15401] Updated weights for policy 0, policy_version 632091 (0.0028) [2024-06-24 09:29:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42710.1). Total num frames: 10356277248. Throughput: 0: 42823.6. Samples: 10356356040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 09:29:24,546][15401] Updated weights for policy 0, policy_version 632101 (0.0030) [2024-06-24 09:29:28,375][15401] Updated weights for policy 0, policy_version 632111 (0.0038) [2024-06-24 09:29:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 10356506624. Throughput: 0: 42855.1. Samples: 10356614600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 09:29:32,341][15401] Updated weights for policy 0, policy_version 632121 (0.0038) [2024-06-24 09:29:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10356686848. Throughput: 0: 42812.8. Samples: 10356872780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-24 09:29:33,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 09:29:35,946][15401] Updated weights for policy 0, policy_version 632131 (0.0038) [2024-06-24 09:29:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42710.2). Total num frames: 10356916224. Throughput: 0: 42675.8. Samples: 10356991000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:29:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 09:29:39,899][15401] Updated weights for policy 0, policy_version 632141 (0.0030) [2024-06-24 09:29:43,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10357145600. Throughput: 0: 42868.1. Samples: 10357257380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:29:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 09:29:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632151_10357161984.pth... [2024-06-24 09:29:43,419][15401] Updated weights for policy 0, policy_version 632151 (0.0029) [2024-06-24 09:29:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631525_10346905600.pth [2024-06-24 09:29:47,613][15401] Updated weights for policy 0, policy_version 632161 (0.0027) [2024-06-24 09:29:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10357325824. Throughput: 0: 42826.7. Samples: 10357511220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:29:48,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 09:29:51,238][15401] Updated weights for policy 0, policy_version 632171 (0.0024) [2024-06-24 09:29:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10357555200. Throughput: 0: 42671.8. Samples: 10357634620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:29:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 09:29:55,254][15401] Updated weights for policy 0, policy_version 632181 (0.0040) [2024-06-24 09:29:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10357768192. Throughput: 0: 42698.2. Samples: 10357892800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:29:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 09:29:59,159][15401] Updated weights for policy 0, policy_version 632191 (0.0026) [2024-06-24 09:30:03,146][15401] Updated weights for policy 0, policy_version 632201 (0.0025) [2024-06-24 09:30:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10357981184. Throughput: 0: 42591.8. Samples: 10358146780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:30:06,837][15401] Updated weights for policy 0, policy_version 632211 (0.0023) [2024-06-24 09:30:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10358194176. Throughput: 0: 42487.5. Samples: 10358267980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 09:30:11,139][15401] Updated weights for policy 0, policy_version 632221 (0.0029) [2024-06-24 09:30:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10358423552. Throughput: 0: 42707.7. Samples: 10358536440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 09:30:14,585][15401] Updated weights for policy 0, policy_version 632231 (0.0043) [2024-06-24 09:30:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10358620160. Throughput: 0: 42592.5. Samples: 10358789340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:18,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 09:30:18,667][15401] Updated weights for policy 0, policy_version 632241 (0.0022) [2024-06-24 09:30:22,135][15401] Updated weights for policy 0, policy_version 632251 (0.0032) [2024-06-24 09:30:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10358833152. Throughput: 0: 42818.9. Samples: 10358917840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 09:30:26,335][15401] Updated weights for policy 0, policy_version 632261 (0.0033) [2024-06-24 09:30:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10359029760. Throughput: 0: 42673.2. Samples: 10359177680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 09:30:28,473][15349] Signal inference workers to stop experience collection... (153350 times) [2024-06-24 09:30:28,480][15349] Signal inference workers to resume experience collection... (153350 times) [2024-06-24 09:30:28,521][15401] InferenceWorker_p0-w0: stopping experience collection (153350 times) [2024-06-24 09:30:28,522][15401] InferenceWorker_p0-w0: resuming experience collection (153350 times) [2024-06-24 09:30:29,805][15401] Updated weights for policy 0, policy_version 632271 (0.0043) [2024-06-24 09:30:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10359275520. Throughput: 0: 42672.9. Samples: 10359431500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 09:30:34,338][15401] Updated weights for policy 0, policy_version 632281 (0.0020) [2024-06-24 09:30:37,427][15401] Updated weights for policy 0, policy_version 632291 (0.0030) [2024-06-24 09:30:38,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42654.9). Total num frames: 10359488512. Throughput: 0: 42826.0. Samples: 10359561780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 09:30:41,776][15401] Updated weights for policy 0, policy_version 632301 (0.0031) [2024-06-24 09:30:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10359685120. Throughput: 0: 42988.2. Samples: 10359827260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 09:30:44,963][15401] Updated weights for policy 0, policy_version 632311 (0.0033) [2024-06-24 09:30:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 10359914496. Throughput: 0: 43045.8. Samples: 10360083940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:48,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 09:30:49,157][15401] Updated weights for policy 0, policy_version 632321 (0.0035) [2024-06-24 09:30:52,437][15401] Updated weights for policy 0, policy_version 632331 (0.0042) [2024-06-24 09:30:53,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 10360160256. Throughput: 0: 43267.6. Samples: 10360215120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:53,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 09:30:56,662][15401] Updated weights for policy 0, policy_version 632341 (0.0038) [2024-06-24 09:30:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10360340480. Throughput: 0: 43052.3. Samples: 10360473800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:30:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 09:31:00,148][15401] Updated weights for policy 0, policy_version 632351 (0.0023) [2024-06-24 09:31:03,390][15132] Fps is (10 sec: 39330.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10360553472. Throughput: 0: 43036.8. Samples: 10360726000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:31:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 09:31:04,173][15401] Updated weights for policy 0, policy_version 632361 (0.0048) [2024-06-24 09:31:07,901][15401] Updated weights for policy 0, policy_version 632371 (0.0027) [2024-06-24 09:31:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10360799232. Throughput: 0: 43079.1. Samples: 10360856400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 09:31:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 09:31:11,787][15401] Updated weights for policy 0, policy_version 632381 (0.0040) [2024-06-24 09:31:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10360979456. Throughput: 0: 42950.8. Samples: 10361110460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 09:31:15,563][15401] Updated weights for policy 0, policy_version 632391 (0.0035) [2024-06-24 09:31:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 10361208832. Throughput: 0: 42954.6. Samples: 10361364460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 09:31:19,731][15401] Updated weights for policy 0, policy_version 632401 (0.0031) [2024-06-24 09:31:23,227][15401] Updated weights for policy 0, policy_version 632411 (0.0033) [2024-06-24 09:31:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 10361421824. Throughput: 0: 43137.8. Samples: 10361502980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 09:31:27,269][15401] Updated weights for policy 0, policy_version 632421 (0.0041) [2024-06-24 09:31:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10361634816. Throughput: 0: 43070.6. Samples: 10361765440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 09:31:30,717][15401] Updated weights for policy 0, policy_version 632431 (0.0044) [2024-06-24 09:31:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10361864192. Throughput: 0: 42899.1. Samples: 10362014300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:33,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 09:31:34,231][15349] Signal inference workers to stop experience collection... (153400 times) [2024-06-24 09:31:34,266][15401] InferenceWorker_p0-w0: stopping experience collection (153400 times) [2024-06-24 09:31:34,278][15349] Signal inference workers to resume experience collection... (153400 times) [2024-06-24 09:31:34,285][15401] InferenceWorker_p0-w0: resuming experience collection (153400 times) [2024-06-24 09:31:34,772][15401] Updated weights for policy 0, policy_version 632441 (0.0025) [2024-06-24 09:31:38,314][15401] Updated weights for policy 0, policy_version 632451 (0.0029) [2024-06-24 09:31:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10362077184. Throughput: 0: 43095.2. Samples: 10362154300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:38,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 09:31:42,219][15401] Updated weights for policy 0, policy_version 632461 (0.0032) [2024-06-24 09:31:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10362273792. Throughput: 0: 42992.5. Samples: 10362408460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 09:31:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632463_10362273792.pth... [2024-06-24 09:31:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000631836_10352001024.pth [2024-06-24 09:31:45,786][15401] Updated weights for policy 0, policy_version 632471 (0.0022) [2024-06-24 09:31:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 10362503168. Throughput: 0: 42997.5. Samples: 10362660880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:48,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 09:31:49,729][15401] Updated weights for policy 0, policy_version 632481 (0.0044) [2024-06-24 09:31:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42598.4, 300 sec: 42875.8). Total num frames: 10362716160. Throughput: 0: 43102.1. Samples: 10362796100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:53,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 09:31:53,558][15401] Updated weights for policy 0, policy_version 632491 (0.0032) [2024-06-24 09:31:57,390][15401] Updated weights for policy 0, policy_version 632501 (0.0031) [2024-06-24 09:31:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10362912768. Throughput: 0: 42971.9. Samples: 10363044200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:31:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 09:32:01,127][15401] Updated weights for policy 0, policy_version 632511 (0.0032) [2024-06-24 09:32:03,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10363142144. Throughput: 0: 42975.9. Samples: 10363298380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 09:32:05,201][15401] Updated weights for policy 0, policy_version 632521 (0.0037) [2024-06-24 09:32:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 10363355136. Throughput: 0: 42867.0. Samples: 10363432000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 09:32:08,740][15401] Updated weights for policy 0, policy_version 632531 (0.0040) [2024-06-24 09:32:13,045][15401] Updated weights for policy 0, policy_version 632541 (0.0037) [2024-06-24 09:32:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10363551744. Throughput: 0: 42519.6. Samples: 10363678820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 09:32:16,396][15401] Updated weights for policy 0, policy_version 632551 (0.0033) [2024-06-24 09:32:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10363781120. Throughput: 0: 42837.4. Samples: 10363941980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 09:32:20,487][15401] Updated weights for policy 0, policy_version 632561 (0.0035) [2024-06-24 09:32:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10363994112. Throughput: 0: 42554.9. Samples: 10364069280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 09:32:24,130][15401] Updated weights for policy 0, policy_version 632571 (0.0042) [2024-06-24 09:32:27,914][15401] Updated weights for policy 0, policy_version 632581 (0.0033) [2024-06-24 09:32:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10364223488. Throughput: 0: 42641.4. Samples: 10364327320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:28,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 09:32:32,025][15401] Updated weights for policy 0, policy_version 632591 (0.0038) [2024-06-24 09:32:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 10364403712. Throughput: 0: 42789.4. Samples: 10364586400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 09:32:35,536][15401] Updated weights for policy 0, policy_version 632601 (0.0040) [2024-06-24 09:32:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10364633088. Throughput: 0: 42516.9. Samples: 10364709260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 09:32:40,002][15401] Updated weights for policy 0, policy_version 632611 (0.0031) [2024-06-24 09:32:43,177][15401] Updated weights for policy 0, policy_version 632621 (0.0031) [2024-06-24 09:32:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10364878848. Throughput: 0: 42799.2. Samples: 10364970160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 09:32:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 09:32:48,044][15401] Updated weights for policy 0, policy_version 632631 (0.0032) [2024-06-24 09:32:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10365042688. Throughput: 0: 42933.9. Samples: 10365230400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:32:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 09:32:49,226][15349] Signal inference workers to stop experience collection... (153450 times) [2024-06-24 09:32:49,226][15349] Signal inference workers to resume experience collection... (153450 times) [2024-06-24 09:32:49,237][15401] InferenceWorker_p0-w0: stopping experience collection (153450 times) [2024-06-24 09:32:49,237][15401] InferenceWorker_p0-w0: resuming experience collection (153450 times) [2024-06-24 09:32:50,892][15401] Updated weights for policy 0, policy_version 632641 (0.0038) [2024-06-24 09:32:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 10365272064. Throughput: 0: 42583.1. Samples: 10365348240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:32:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 09:32:55,742][15401] Updated weights for policy 0, policy_version 632651 (0.0035) [2024-06-24 09:32:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10365485056. Throughput: 0: 42826.1. Samples: 10365606000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:32:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 09:32:59,209][15401] Updated weights for policy 0, policy_version 632661 (0.0044) [2024-06-24 09:33:03,241][15401] Updated weights for policy 0, policy_version 632671 (0.0036) [2024-06-24 09:33:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 10365681664. Throughput: 0: 42754.3. Samples: 10365865920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:03,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 09:33:06,775][15401] Updated weights for policy 0, policy_version 632681 (0.0028) [2024-06-24 09:33:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10365927424. Throughput: 0: 42657.0. Samples: 10365988840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 09:33:10,763][15401] Updated weights for policy 0, policy_version 632691 (0.0042) [2024-06-24 09:33:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10366124032. Throughput: 0: 42625.8. Samples: 10366245480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 09:33:14,325][15401] Updated weights for policy 0, policy_version 632701 (0.0031) [2024-06-24 09:33:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10366320640. Throughput: 0: 42471.5. Samples: 10366497620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:18,394][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 09:33:18,861][15401] Updated weights for policy 0, policy_version 632711 (0.0042) [2024-06-24 09:33:22,058][15401] Updated weights for policy 0, policy_version 632721 (0.0034) [2024-06-24 09:33:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 10366550016. Throughput: 0: 42535.6. Samples: 10366623360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 09:33:26,349][15401] Updated weights for policy 0, policy_version 632731 (0.0041) [2024-06-24 09:33:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10366763008. Throughput: 0: 42481.4. Samples: 10366881820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 09:33:29,915][15401] Updated weights for policy 0, policy_version 632741 (0.0037) [2024-06-24 09:33:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 10366959616. Throughput: 0: 42519.6. Samples: 10367143780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 09:33:33,963][15401] Updated weights for policy 0, policy_version 632751 (0.0039) [2024-06-24 09:33:37,424][15401] Updated weights for policy 0, policy_version 632761 (0.0034) [2024-06-24 09:33:38,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 10367188992. Throughput: 0: 42560.6. Samples: 10367263740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:38,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 09:33:41,630][15401] Updated weights for policy 0, policy_version 632771 (0.0034) [2024-06-24 09:33:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 10367385600. Throughput: 0: 42703.5. Samples: 10367527660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 09:33:43,555][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632776_10367401984.pth... [2024-06-24 09:33:43,632][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632151_10357161984.pth [2024-06-24 09:33:44,994][15401] Updated weights for policy 0, policy_version 632781 (0.0031) [2024-06-24 09:33:48,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10367598592. Throughput: 0: 42429.7. Samples: 10367775260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 09:33:49,582][15401] Updated weights for policy 0, policy_version 632791 (0.0028) [2024-06-24 09:33:52,775][15401] Updated weights for policy 0, policy_version 632801 (0.0027) [2024-06-24 09:33:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10367844352. Throughput: 0: 42673.8. Samples: 10367909160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:53,391][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 09:33:57,082][15401] Updated weights for policy 0, policy_version 632811 (0.0031) [2024-06-24 09:33:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10368024576. Throughput: 0: 42703.9. Samples: 10368167160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:33:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 09:34:00,330][15401] Updated weights for policy 0, policy_version 632821 (0.0033) [2024-06-24 09:34:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10368253952. Throughput: 0: 42564.4. Samples: 10368413020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:34:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 09:34:04,771][15401] Updated weights for policy 0, policy_version 632831 (0.0041) [2024-06-24 09:34:08,249][15401] Updated weights for policy 0, policy_version 632841 (0.0048) [2024-06-24 09:34:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10368466944. Throughput: 0: 42711.1. Samples: 10368545360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:34:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 09:34:12,629][15401] Updated weights for policy 0, policy_version 632851 (0.0037) [2024-06-24 09:34:13,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.5, 300 sec: 42764.6). Total num frames: 10368663552. Throughput: 0: 42674.9. Samples: 10368802300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:34:13,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 09:34:15,944][15401] Updated weights for policy 0, policy_version 632861 (0.0041) [2024-06-24 09:34:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10368892928. Throughput: 0: 42163.4. Samples: 10369041140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 09:34:18,396][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 09:34:20,007][15349] Signal inference workers to stop experience collection... (153500 times) [2024-06-24 09:34:20,011][15349] Signal inference workers to resume experience collection... (153500 times) [2024-06-24 09:34:20,059][15401] InferenceWorker_p0-w0: stopping experience collection (153500 times) [2024-06-24 09:34:20,059][15401] InferenceWorker_p0-w0: resuming experience collection (153500 times) [2024-06-24 09:34:20,404][15401] Updated weights for policy 0, policy_version 632871 (0.0043) [2024-06-24 09:34:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10369089536. Throughput: 0: 42548.6. Samples: 10369178160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 09:34:23,825][15401] Updated weights for policy 0, policy_version 632881 (0.0051) [2024-06-24 09:34:28,262][15401] Updated weights for policy 0, policy_version 632891 (0.0030) [2024-06-24 09:34:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 10369286144. Throughput: 0: 42191.1. Samples: 10369426260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 09:34:31,500][15401] Updated weights for policy 0, policy_version 632901 (0.0032) [2024-06-24 09:34:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10369531904. Throughput: 0: 42175.0. Samples: 10369673140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:33,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 09:34:35,971][15401] Updated weights for policy 0, policy_version 632911 (0.0031) [2024-06-24 09:34:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 10369744896. Throughput: 0: 42318.7. Samples: 10369813500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 09:34:39,486][15401] Updated weights for policy 0, policy_version 632921 (0.0033) [2024-06-24 09:34:43,396][15132] Fps is (10 sec: 39297.2, 60 sec: 42320.9, 300 sec: 42708.5). Total num frames: 10369925120. Throughput: 0: 42156.8. Samples: 10370064480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:43,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:34:43,633][15401] Updated weights for policy 0, policy_version 632931 (0.0027) [2024-06-24 09:34:46,941][15401] Updated weights for policy 0, policy_version 632941 (0.0040) [2024-06-24 09:34:48,394][15132] Fps is (10 sec: 44217.2, 60 sec: 43141.3, 300 sec: 42819.9). Total num frames: 10370187264. Throughput: 0: 42314.9. Samples: 10370317380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:48,395][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 09:34:51,230][15401] Updated weights for policy 0, policy_version 632951 (0.0035) [2024-06-24 09:34:53,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 10370367488. Throughput: 0: 42488.0. Samples: 10370457320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 09:34:54,496][15401] Updated weights for policy 0, policy_version 632961 (0.0030) [2024-06-24 09:34:58,395][15132] Fps is (10 sec: 36042.1, 60 sec: 42048.6, 300 sec: 42597.7). Total num frames: 10370547712. Throughput: 0: 42327.1. Samples: 10370707140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:34:58,395][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 09:34:59,023][15401] Updated weights for policy 0, policy_version 632971 (0.0027) [2024-06-24 09:35:02,046][15401] Updated weights for policy 0, policy_version 632981 (0.0035) [2024-06-24 09:35:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10370826240. Throughput: 0: 42510.8. Samples: 10370954120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 09:35:06,892][15401] Updated weights for policy 0, policy_version 632991 (0.0043) [2024-06-24 09:35:08,390][15132] Fps is (10 sec: 45898.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10371006464. Throughput: 0: 42596.8. Samples: 10371095020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 09:35:09,619][15401] Updated weights for policy 0, policy_version 633001 (0.0033) [2024-06-24 09:35:13,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 10371203072. Throughput: 0: 42592.6. Samples: 10371342920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 09:35:14,541][15401] Updated weights for policy 0, policy_version 633011 (0.0041) [2024-06-24 09:35:17,463][15401] Updated weights for policy 0, policy_version 633021 (0.0044) [2024-06-24 09:35:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10371448832. Throughput: 0: 42670.8. Samples: 10371593320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 09:35:22,097][15401] Updated weights for policy 0, policy_version 633031 (0.0032) [2024-06-24 09:35:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10371645440. Throughput: 0: 42635.3. Samples: 10371732080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:23,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 09:35:25,110][15401] Updated weights for policy 0, policy_version 633041 (0.0039) [2024-06-24 09:35:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10371842048. Throughput: 0: 42579.8. Samples: 10371980300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 09:35:29,777][15401] Updated weights for policy 0, policy_version 633051 (0.0030) [2024-06-24 09:35:32,687][15401] Updated weights for policy 0, policy_version 633061 (0.0027) [2024-06-24 09:35:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10372104192. Throughput: 0: 42574.4. Samples: 10372233040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:33,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 09:35:36,669][15349] Signal inference workers to stop experience collection... (153550 times) [2024-06-24 09:35:36,670][15349] Signal inference workers to resume experience collection... (153550 times) [2024-06-24 09:35:36,711][15401] InferenceWorker_p0-w0: stopping experience collection (153550 times) [2024-06-24 09:35:36,712][15401] InferenceWorker_p0-w0: resuming experience collection (153550 times) [2024-06-24 09:35:37,349][15401] Updated weights for policy 0, policy_version 633071 (0.0030) [2024-06-24 09:35:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 10372284416. Throughput: 0: 42566.7. Samples: 10372372820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 09:35:40,560][15401] Updated weights for policy 0, policy_version 633081 (0.0035) [2024-06-24 09:35:43,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42603.0, 300 sec: 42598.8). Total num frames: 10372481024. Throughput: 0: 42471.7. Samples: 10372618140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 09:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633086_10372481024.pth... [2024-06-24 09:35:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632463_10362273792.pth [2024-06-24 09:35:44,855][15401] Updated weights for policy 0, policy_version 633091 (0.0038) [2024-06-24 09:35:48,237][15401] Updated weights for policy 0, policy_version 633101 (0.0028) [2024-06-24 09:35:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42328.5, 300 sec: 42598.7). Total num frames: 10372726784. Throughput: 0: 42666.1. Samples: 10372874100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 09:35:52,466][15401] Updated weights for policy 0, policy_version 633111 (0.0029) [2024-06-24 09:35:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10372923392. Throughput: 0: 42573.8. Samples: 10373010840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:53,396][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 09:35:56,086][15401] Updated weights for policy 0, policy_version 633121 (0.0044) [2024-06-24 09:35:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42875.3, 300 sec: 42598.4). Total num frames: 10373120000. Throughput: 0: 42611.6. Samples: 10373260440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-24 09:35:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 09:36:00,015][15401] Updated weights for policy 0, policy_version 633131 (0.0022) [2024-06-24 09:36:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 10373349376. Throughput: 0: 42744.9. Samples: 10373516840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 09:36:03,864][15401] Updated weights for policy 0, policy_version 633141 (0.0036) [2024-06-24 09:36:07,647][15401] Updated weights for policy 0, policy_version 633151 (0.0030) [2024-06-24 09:36:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10373562368. Throughput: 0: 42680.7. Samples: 10373652720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 09:36:11,581][15401] Updated weights for policy 0, policy_version 633161 (0.0044) [2024-06-24 09:36:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10373775360. Throughput: 0: 42579.1. Samples: 10373896360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 09:36:15,688][15401] Updated weights for policy 0, policy_version 633171 (0.0053) [2024-06-24 09:36:18,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 10374004736. Throughput: 0: 42570.3. Samples: 10374148800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:18,393][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 09:36:19,310][15401] Updated weights for policy 0, policy_version 633181 (0.0028) [2024-06-24 09:36:23,148][15401] Updated weights for policy 0, policy_version 633191 (0.0032) [2024-06-24 09:36:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10374201344. Throughput: 0: 42330.5. Samples: 10374277700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 09:36:27,116][15401] Updated weights for policy 0, policy_version 633201 (0.0034) [2024-06-24 09:36:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10374414336. Throughput: 0: 42699.5. Samples: 10374539620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 09:36:30,653][15401] Updated weights for policy 0, policy_version 633211 (0.0047) [2024-06-24 09:36:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10374643712. Throughput: 0: 42662.2. Samples: 10374793900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 09:36:34,761][15401] Updated weights for policy 0, policy_version 633221 (0.0041) [2024-06-24 09:36:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10374840320. Throughput: 0: 42467.6. Samples: 10374921880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 09:36:38,662][15401] Updated weights for policy 0, policy_version 633231 (0.0038) [2024-06-24 09:36:42,269][15401] Updated weights for policy 0, policy_version 633241 (0.0036) [2024-06-24 09:36:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10375053312. Throughput: 0: 42599.5. Samples: 10375177420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 09:36:46,194][15401] Updated weights for policy 0, policy_version 633251 (0.0031) [2024-06-24 09:36:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 10375282688. Throughput: 0: 42519.0. Samples: 10375430200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 09:36:50,448][15401] Updated weights for policy 0, policy_version 633261 (0.0033) [2024-06-24 09:36:52,585][15349] Signal inference workers to stop experience collection... (153600 times) [2024-06-24 09:36:52,626][15349] Signal inference workers to resume experience collection... (153600 times) [2024-06-24 09:36:52,631][15401] InferenceWorker_p0-w0: stopping experience collection (153600 times) [2024-06-24 09:36:52,645][15401] InferenceWorker_p0-w0: resuming experience collection (153600 times) [2024-06-24 09:36:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10375479296. Throughput: 0: 42385.4. Samples: 10375560060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 09:36:54,234][15401] Updated weights for policy 0, policy_version 633271 (0.0036) [2024-06-24 09:36:58,370][15401] Updated weights for policy 0, policy_version 633281 (0.0036) [2024-06-24 09:36:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10375675904. Throughput: 0: 42632.5. Samples: 10375814820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:36:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 09:37:01,911][15401] Updated weights for policy 0, policy_version 633291 (0.0038) [2024-06-24 09:37:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10375921664. Throughput: 0: 42658.2. Samples: 10376068320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 09:37:05,863][15401] Updated weights for policy 0, policy_version 633301 (0.0047) [2024-06-24 09:37:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10376118272. Throughput: 0: 42783.7. Samples: 10376202960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:08,398][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 09:37:09,557][15401] Updated weights for policy 0, policy_version 633311 (0.0035) [2024-06-24 09:37:13,364][15401] Updated weights for policy 0, policy_version 633321 (0.0035) [2024-06-24 09:37:13,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10376331264. Throughput: 0: 42572.4. Samples: 10376455480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:13,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 09:37:17,149][15401] Updated weights for policy 0, policy_version 633331 (0.0025) [2024-06-24 09:37:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 10376560640. Throughput: 0: 42544.4. Samples: 10376708400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 09:37:20,935][15401] Updated weights for policy 0, policy_version 633341 (0.0036) [2024-06-24 09:37:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10376757248. Throughput: 0: 42636.9. Samples: 10376840540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 09:37:24,654][15401] Updated weights for policy 0, policy_version 633351 (0.0036) [2024-06-24 09:37:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10376970240. Throughput: 0: 42647.1. Samples: 10377096540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 09:37:28,495][15401] Updated weights for policy 0, policy_version 633361 (0.0024) [2024-06-24 09:37:32,280][15401] Updated weights for policy 0, policy_version 633371 (0.0030) [2024-06-24 09:37:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10377199616. Throughput: 0: 42588.6. Samples: 10377346680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 09:37:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 09:37:36,174][15401] Updated weights for policy 0, policy_version 633381 (0.0040) [2024-06-24 09:37:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 10377363456. Throughput: 0: 42582.7. Samples: 10377476280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:37:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 09:37:39,991][15401] Updated weights for policy 0, policy_version 633391 (0.0032) [2024-06-24 09:37:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10377592832. Throughput: 0: 42518.6. Samples: 10377728160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:37:43,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 09:37:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633399_10377609216.pth... [2024-06-24 09:37:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000632776_10367401984.pth [2024-06-24 09:37:44,382][15401] Updated weights for policy 0, policy_version 633401 (0.0036) [2024-06-24 09:37:47,728][15401] Updated weights for policy 0, policy_version 633411 (0.0028) [2024-06-24 09:37:48,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10377838592. Throughput: 0: 42457.0. Samples: 10377978880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:37:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 09:37:52,342][15401] Updated weights for policy 0, policy_version 633421 (0.0033) [2024-06-24 09:37:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10378002432. Throughput: 0: 42371.5. Samples: 10378109680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:37:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 09:37:55,424][15401] Updated weights for policy 0, policy_version 633431 (0.0033) [2024-06-24 09:37:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10378248192. Throughput: 0: 42472.9. Samples: 10378366660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:37:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 09:37:59,814][15401] Updated weights for policy 0, policy_version 633441 (0.0034) [2024-06-24 09:38:02,921][15401] Updated weights for policy 0, policy_version 633451 (0.0032) [2024-06-24 09:38:03,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10378477568. Throughput: 0: 42553.1. Samples: 10378623280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 09:38:07,280][15401] Updated weights for policy 0, policy_version 633461 (0.0033) [2024-06-24 09:38:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10378657792. Throughput: 0: 42552.2. Samples: 10378755380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 09:38:09,937][15349] Signal inference workers to stop experience collection... (153650 times) [2024-06-24 09:38:09,968][15401] InferenceWorker_p0-w0: stopping experience collection (153650 times) [2024-06-24 09:38:09,995][15349] Signal inference workers to resume experience collection... (153650 times) [2024-06-24 09:38:09,996][15401] InferenceWorker_p0-w0: resuming experience collection (153650 times) [2024-06-24 09:38:10,505][15401] Updated weights for policy 0, policy_version 633471 (0.0030) [2024-06-24 09:38:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 10378887168. Throughput: 0: 42552.0. Samples: 10379011380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 09:38:14,728][15401] Updated weights for policy 0, policy_version 633481 (0.0037) [2024-06-24 09:38:18,260][15401] Updated weights for policy 0, policy_version 633491 (0.0035) [2024-06-24 09:38:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10379116544. Throughput: 0: 42750.5. Samples: 10379270460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:18,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 09:38:22,268][15401] Updated weights for policy 0, policy_version 633501 (0.0044) [2024-06-24 09:38:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10379296768. Throughput: 0: 42705.3. Samples: 10379398020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 09:38:25,946][15401] Updated weights for policy 0, policy_version 633511 (0.0037) [2024-06-24 09:38:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10379542528. Throughput: 0: 42730.8. Samples: 10379651040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 09:38:29,655][15401] Updated weights for policy 0, policy_version 633521 (0.0028) [2024-06-24 09:38:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.2, 300 sec: 42599.3). Total num frames: 10379755520. Throughput: 0: 43231.8. Samples: 10379924320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 09:38:33,462][15401] Updated weights for policy 0, policy_version 633531 (0.0032) [2024-06-24 09:38:37,662][15401] Updated weights for policy 0, policy_version 633541 (0.0037) [2024-06-24 09:38:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 10379968512. Throughput: 0: 42994.7. Samples: 10380044440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:38,392][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 09:38:41,272][15401] Updated weights for policy 0, policy_version 633551 (0.0036) [2024-06-24 09:38:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 10380181504. Throughput: 0: 42923.2. Samples: 10380298200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 09:38:45,257][15401] Updated weights for policy 0, policy_version 633561 (0.0040) [2024-06-24 09:38:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 10380378112. Throughput: 0: 43147.4. Samples: 10380564920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 09:38:49,018][15401] Updated weights for policy 0, policy_version 633571 (0.0033) [2024-06-24 09:38:52,778][15401] Updated weights for policy 0, policy_version 633581 (0.0031) [2024-06-24 09:38:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10380607488. Throughput: 0: 42931.8. Samples: 10380687320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 09:38:56,661][15401] Updated weights for policy 0, policy_version 633591 (0.0030) [2024-06-24 09:38:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 10380836864. Throughput: 0: 42854.7. Samples: 10380939840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:38:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 09:39:00,388][15401] Updated weights for policy 0, policy_version 633601 (0.0031) [2024-06-24 09:39:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10381017088. Throughput: 0: 42995.6. Samples: 10381205260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:39:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 09:39:04,315][15401] Updated weights for policy 0, policy_version 633611 (0.0036) [2024-06-24 09:39:08,048][15401] Updated weights for policy 0, policy_version 633621 (0.0040) [2024-06-24 09:39:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 10381246464. Throughput: 0: 42867.0. Samples: 10381327040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 09:39:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 09:39:11,902][15401] Updated weights for policy 0, policy_version 633631 (0.0040) [2024-06-24 09:39:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 10381475840. Throughput: 0: 42886.4. Samples: 10381580940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:13,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-24 09:39:16,199][15401] Updated weights for policy 0, policy_version 633641 (0.0038) [2024-06-24 09:39:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10381656064. Throughput: 0: 42499.6. Samples: 10381836800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:18,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 09:39:19,781][15401] Updated weights for policy 0, policy_version 633651 (0.0038) [2024-06-24 09:39:22,609][15349] Signal inference workers to stop experience collection... (153700 times) [2024-06-24 09:39:22,652][15401] InferenceWorker_p0-w0: stopping experience collection (153700 times) [2024-06-24 09:39:22,663][15349] Signal inference workers to resume experience collection... (153700 times) [2024-06-24 09:39:22,671][15401] InferenceWorker_p0-w0: resuming experience collection (153700 times) [2024-06-24 09:39:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10381885440. Throughput: 0: 42547.1. Samples: 10381959060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 09:39:23,979][15401] Updated weights for policy 0, policy_version 633661 (0.0036) [2024-06-24 09:39:27,499][15401] Updated weights for policy 0, policy_version 633671 (0.0036) [2024-06-24 09:39:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10382131200. Throughput: 0: 42786.1. Samples: 10382223580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:28,390][15132] Avg episode reward: [(0, '0.187')] [2024-06-24 09:39:31,621][15401] Updated weights for policy 0, policy_version 633681 (0.0025) [2024-06-24 09:39:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10382311424. Throughput: 0: 42572.9. Samples: 10382480700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:33,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 09:39:35,087][15401] Updated weights for policy 0, policy_version 633691 (0.0046) [2024-06-24 09:39:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 10382508032. Throughput: 0: 42476.9. Samples: 10382598780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 09:39:39,295][15401] Updated weights for policy 0, policy_version 633701 (0.0026) [2024-06-24 09:39:42,760][15401] Updated weights for policy 0, policy_version 633711 (0.0044) [2024-06-24 09:39:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42599.1). Total num frames: 10382753792. Throughput: 0: 42819.9. Samples: 10382866740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 09:39:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633714_10382770176.pth... [2024-06-24 09:39:43,565][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633086_10372481024.pth [2024-06-24 09:39:46,867][15401] Updated weights for policy 0, policy_version 633721 (0.0033) [2024-06-24 09:39:48,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10382950400. Throughput: 0: 42487.3. Samples: 10383117200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:48,391][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 09:39:50,445][15401] Updated weights for policy 0, policy_version 633731 (0.0034) [2024-06-24 09:39:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.8). Total num frames: 10383163392. Throughput: 0: 42665.3. Samples: 10383246980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:53,400][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 09:39:54,413][15401] Updated weights for policy 0, policy_version 633741 (0.0045) [2024-06-24 09:39:58,172][15401] Updated weights for policy 0, policy_version 633751 (0.0033) [2024-06-24 09:39:58,390][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 10383376384. Throughput: 0: 42615.3. Samples: 10383498620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:39:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 09:40:02,603][15401] Updated weights for policy 0, policy_version 633761 (0.0030) [2024-06-24 09:40:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10383572992. Throughput: 0: 42580.1. Samples: 10383752900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 09:40:05,835][15401] Updated weights for policy 0, policy_version 633771 (0.0035) [2024-06-24 09:40:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 10383802368. Throughput: 0: 42715.6. Samples: 10383881260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 09:40:10,229][15401] Updated weights for policy 0, policy_version 633781 (0.0040) [2024-06-24 09:40:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 10384015360. Throughput: 0: 42500.6. Samples: 10384136100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 09:40:13,449][15401] Updated weights for policy 0, policy_version 633791 (0.0041) [2024-06-24 09:40:17,857][15401] Updated weights for policy 0, policy_version 633801 (0.0033) [2024-06-24 09:40:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10384228352. Throughput: 0: 42588.5. Samples: 10384397180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:18,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 09:40:21,241][15401] Updated weights for policy 0, policy_version 633811 (0.0028) [2024-06-24 09:40:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10384441344. Throughput: 0: 42712.5. Samples: 10384520840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 09:40:25,641][15401] Updated weights for policy 0, policy_version 633821 (0.0027) [2024-06-24 09:40:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 10384654336. Throughput: 0: 42403.2. Samples: 10384774880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 09:40:29,105][15401] Updated weights for policy 0, policy_version 633831 (0.0043) [2024-06-24 09:40:33,372][15401] Updated weights for policy 0, policy_version 633841 (0.0035) [2024-06-24 09:40:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10384850944. Throughput: 0: 42593.7. Samples: 10385033900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 09:40:36,689][15401] Updated weights for policy 0, policy_version 633851 (0.0034) [2024-06-24 09:40:38,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10385096704. Throughput: 0: 42436.5. Samples: 10385156620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 09:40:41,045][15401] Updated weights for policy 0, policy_version 633861 (0.0035) [2024-06-24 09:40:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10385293312. Throughput: 0: 42515.6. Samples: 10385411820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 09:40:43,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 09:40:44,391][15401] Updated weights for policy 0, policy_version 633871 (0.0037) [2024-06-24 09:40:44,954][15349] Signal inference workers to stop experience collection... (153750 times) [2024-06-24 09:40:44,986][15401] InferenceWorker_p0-w0: stopping experience collection (153750 times) [2024-06-24 09:40:45,002][15349] Signal inference workers to resume experience collection... (153750 times) [2024-06-24 09:40:45,041][15401] InferenceWorker_p0-w0: resuming experience collection (153750 times) [2024-06-24 09:40:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.5, 300 sec: 42542.9). Total num frames: 10385473536. Throughput: 0: 42571.7. Samples: 10385668620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:40:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 09:40:48,650][15401] Updated weights for policy 0, policy_version 633881 (0.0026) [2024-06-24 09:40:51,936][15401] Updated weights for policy 0, policy_version 633891 (0.0035) [2024-06-24 09:40:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10385702912. Throughput: 0: 42413.3. Samples: 10385789860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:40:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 09:40:56,182][15401] Updated weights for policy 0, policy_version 633901 (0.0039) [2024-06-24 09:40:58,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 10385932288. Throughput: 0: 42535.0. Samples: 10386050280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:40:58,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 09:40:59,651][15401] Updated weights for policy 0, policy_version 633911 (0.0029) [2024-06-24 09:41:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10386128896. Throughput: 0: 42413.0. Samples: 10386305760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 09:41:03,732][15401] Updated weights for policy 0, policy_version 633921 (0.0037) [2024-06-24 09:41:07,281][15401] Updated weights for policy 0, policy_version 633931 (0.0043) [2024-06-24 09:41:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10386358272. Throughput: 0: 42474.2. Samples: 10386432180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 09:41:11,459][15401] Updated weights for policy 0, policy_version 633941 (0.0029) [2024-06-24 09:41:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 10386571264. Throughput: 0: 42465.2. Samples: 10386685820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 09:41:14,890][15401] Updated weights for policy 0, policy_version 633951 (0.0027) [2024-06-24 09:41:18,394][15132] Fps is (10 sec: 42579.7, 60 sec: 42595.3, 300 sec: 42653.3). Total num frames: 10386784256. Throughput: 0: 42489.2. Samples: 10386946100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:18,395][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 09:41:18,948][15401] Updated weights for policy 0, policy_version 633961 (0.0031) [2024-06-24 09:41:22,763][15401] Updated weights for policy 0, policy_version 633971 (0.0041) [2024-06-24 09:41:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10387013632. Throughput: 0: 42656.0. Samples: 10387076140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 09:41:26,910][15401] Updated weights for policy 0, policy_version 633981 (0.0029) [2024-06-24 09:41:28,390][15132] Fps is (10 sec: 42616.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10387210240. Throughput: 0: 42646.1. Samples: 10387330900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:28,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 09:41:30,274][15401] Updated weights for policy 0, policy_version 633991 (0.0028) [2024-06-24 09:41:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10387423232. Throughput: 0: 42758.5. Samples: 10387592760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 09:41:34,515][15401] Updated weights for policy 0, policy_version 634001 (0.0032) [2024-06-24 09:41:37,981][15401] Updated weights for policy 0, policy_version 634011 (0.0050) [2024-06-24 09:41:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10387636224. Throughput: 0: 42809.2. Samples: 10387716280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 09:41:42,095][15401] Updated weights for policy 0, policy_version 634021 (0.0033) [2024-06-24 09:41:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10387881984. Throughput: 0: 42868.9. Samples: 10387979280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:43,396][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 09:41:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634026_10387881984.pth... [2024-06-24 09:41:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633399_10377609216.pth [2024-06-24 09:41:45,477][15401] Updated weights for policy 0, policy_version 634031 (0.0037) [2024-06-24 09:41:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10388062208. Throughput: 0: 43013.3. Samples: 10388241360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:48,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 09:41:49,662][15401] Updated weights for policy 0, policy_version 634041 (0.0036) [2024-06-24 09:41:53,242][15401] Updated weights for policy 0, policy_version 634051 (0.0028) [2024-06-24 09:41:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10388291584. Throughput: 0: 42939.1. Samples: 10388364440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 09:41:57,194][15401] Updated weights for policy 0, policy_version 634061 (0.0033) [2024-06-24 09:41:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 10388520960. Throughput: 0: 43079.5. Samples: 10388624400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:41:58,400][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 09:42:01,160][15401] Updated weights for policy 0, policy_version 634071 (0.0040) [2024-06-24 09:42:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10388717568. Throughput: 0: 42994.0. Samples: 10388880640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:42:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 09:42:04,890][15401] Updated weights for policy 0, policy_version 634081 (0.0030) [2024-06-24 09:42:06,728][15349] Signal inference workers to stop experience collection... (153800 times) [2024-06-24 09:42:06,728][15349] Signal inference workers to resume experience collection... (153800 times) [2024-06-24 09:42:06,785][15401] InferenceWorker_p0-w0: stopping experience collection (153800 times) [2024-06-24 09:42:06,785][15401] InferenceWorker_p0-w0: resuming experience collection (153800 times) [2024-06-24 09:42:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10388930560. Throughput: 0: 42911.0. Samples: 10389007140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:42:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 09:42:08,667][15401] Updated weights for policy 0, policy_version 634091 (0.0039) [2024-06-24 09:42:12,367][15401] Updated weights for policy 0, policy_version 634101 (0.0045) [2024-06-24 09:42:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10389176320. Throughput: 0: 43070.4. Samples: 10389269060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:42:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 09:42:16,877][15401] Updated weights for policy 0, policy_version 634111 (0.0034) [2024-06-24 09:42:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42874.6, 300 sec: 42709.5). Total num frames: 10389356544. Throughput: 0: 42899.3. Samples: 10389523220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 09:42:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 09:42:19,974][15401] Updated weights for policy 0, policy_version 634121 (0.0031) [2024-06-24 09:42:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10389569536. Throughput: 0: 42816.1. Samples: 10389643000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 09:42:24,473][15401] Updated weights for policy 0, policy_version 634131 (0.0048) [2024-06-24 09:42:27,584][15401] Updated weights for policy 0, policy_version 634141 (0.0037) [2024-06-24 09:42:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10389798912. Throughput: 0: 42767.6. Samples: 10389903820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 09:42:32,305][15401] Updated weights for policy 0, policy_version 634151 (0.0043) [2024-06-24 09:42:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10389979136. Throughput: 0: 42669.2. Samples: 10390161480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 09:42:35,700][15401] Updated weights for policy 0, policy_version 634161 (0.0036) [2024-06-24 09:42:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10390192128. Throughput: 0: 42574.2. Samples: 10390280280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 09:42:40,045][15401] Updated weights for policy 0, policy_version 634171 (0.0035) [2024-06-24 09:42:43,323][15401] Updated weights for policy 0, policy_version 634181 (0.0052) [2024-06-24 09:42:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10390421504. Throughput: 0: 42585.5. Samples: 10390540740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 09:42:47,551][15401] Updated weights for policy 0, policy_version 634191 (0.0045) [2024-06-24 09:42:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10390634496. Throughput: 0: 42766.8. Samples: 10390805140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 09:42:50,970][15401] Updated weights for policy 0, policy_version 634201 (0.0039) [2024-06-24 09:42:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10390847488. Throughput: 0: 42605.0. Samples: 10390924360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:53,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:42:55,391][15401] Updated weights for policy 0, policy_version 634211 (0.0030) [2024-06-24 09:42:58,365][15401] Updated weights for policy 0, policy_version 634221 (0.0036) [2024-06-24 09:42:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10391076864. Throughput: 0: 42561.6. Samples: 10391184340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:42:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 09:43:03,162][15401] Updated weights for policy 0, policy_version 634231 (0.0041) [2024-06-24 09:43:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10391257088. Throughput: 0: 42821.2. Samples: 10391450180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 09:43:06,007][15401] Updated weights for policy 0, policy_version 634241 (0.0033) [2024-06-24 09:43:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10391502848. Throughput: 0: 42747.4. Samples: 10391566640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 09:43:10,514][15401] Updated weights for policy 0, policy_version 634251 (0.0033) [2024-06-24 09:43:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10391715840. Throughput: 0: 42807.1. Samples: 10391830140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 09:43:13,499][15401] Updated weights for policy 0, policy_version 634261 (0.0043) [2024-06-24 09:43:18,095][15401] Updated weights for policy 0, policy_version 634271 (0.0045) [2024-06-24 09:43:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10391896064. Throughput: 0: 42720.2. Samples: 10392083880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 09:43:20,177][15349] Signal inference workers to stop experience collection... (153850 times) [2024-06-24 09:43:20,177][15349] Signal inference workers to resume experience collection... (153850 times) [2024-06-24 09:43:20,224][15401] InferenceWorker_p0-w0: stopping experience collection (153850 times) [2024-06-24 09:43:20,224][15401] InferenceWorker_p0-w0: resuming experience collection (153850 times) [2024-06-24 09:43:21,062][15401] Updated weights for policy 0, policy_version 634281 (0.0036) [2024-06-24 09:43:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10392141824. Throughput: 0: 42740.9. Samples: 10392203620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 09:43:25,930][15401] Updated weights for policy 0, policy_version 634291 (0.0043) [2024-06-24 09:43:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10392338432. Throughput: 0: 42719.0. Samples: 10392463100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 09:43:28,954][15401] Updated weights for policy 0, policy_version 634301 (0.0036) [2024-06-24 09:43:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10392535040. Throughput: 0: 42754.1. Samples: 10392729080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 09:43:33,419][15401] Updated weights for policy 0, policy_version 634311 (0.0038) [2024-06-24 09:43:36,499][15401] Updated weights for policy 0, policy_version 634321 (0.0032) [2024-06-24 09:43:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10392780800. Throughput: 0: 42917.7. Samples: 10392855660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 09:43:40,939][15401] Updated weights for policy 0, policy_version 634331 (0.0031) [2024-06-24 09:43:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10392993792. Throughput: 0: 42896.4. Samples: 10393114680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 09:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634338_10392993792.pth... [2024-06-24 09:43:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000633714_10382770176.pth [2024-06-24 09:43:44,534][15401] Updated weights for policy 0, policy_version 634341 (0.0038) [2024-06-24 09:43:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10393190400. Throughput: 0: 42761.2. Samples: 10393374440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 09:43:48,564][15401] Updated weights for policy 0, policy_version 634351 (0.0025) [2024-06-24 09:43:52,011][15401] Updated weights for policy 0, policy_version 634361 (0.0026) [2024-06-24 09:43:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10393419776. Throughput: 0: 42914.8. Samples: 10393497800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:53,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-24 09:43:56,030][15401] Updated weights for policy 0, policy_version 634371 (0.0037) [2024-06-24 09:43:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10393616384. Throughput: 0: 42755.5. Samples: 10393754140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 09:43:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 09:44:00,113][15401] Updated weights for policy 0, policy_version 634381 (0.0030) [2024-06-24 09:44:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10393845760. Throughput: 0: 42764.0. Samples: 10394008260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 09:44:03,500][15401] Updated weights for policy 0, policy_version 634391 (0.0033) [2024-06-24 09:44:07,657][15401] Updated weights for policy 0, policy_version 634401 (0.0042) [2024-06-24 09:44:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10394042368. Throughput: 0: 42963.1. Samples: 10394136960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:08,400][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 09:44:11,020][15401] Updated weights for policy 0, policy_version 634411 (0.0034) [2024-06-24 09:44:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10394271744. Throughput: 0: 42946.2. Samples: 10394395680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:13,398][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 09:44:15,235][15401] Updated weights for policy 0, policy_version 634421 (0.0031) [2024-06-24 09:44:18,391][15132] Fps is (10 sec: 45867.7, 60 sec: 43416.4, 300 sec: 42764.8). Total num frames: 10394501120. Throughput: 0: 42676.7. Samples: 10394649600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:18,392][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 09:44:19,013][15401] Updated weights for policy 0, policy_version 634431 (0.0027) [2024-06-24 09:44:22,818][15401] Updated weights for policy 0, policy_version 634441 (0.0027) [2024-06-24 09:44:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10394681344. Throughput: 0: 42683.6. Samples: 10394776420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 09:44:26,861][15401] Updated weights for policy 0, policy_version 634451 (0.0047) [2024-06-24 09:44:27,925][15349] Signal inference workers to stop experience collection... (153900 times) [2024-06-24 09:44:27,967][15401] InferenceWorker_p0-w0: stopping experience collection (153900 times) [2024-06-24 09:44:27,977][15349] Signal inference workers to resume experience collection... (153900 times) [2024-06-24 09:44:27,987][15401] InferenceWorker_p0-w0: resuming experience collection (153900 times) [2024-06-24 09:44:28,390][15132] Fps is (10 sec: 40966.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10394910720. Throughput: 0: 42622.2. Samples: 10395032680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 09:44:30,885][15401] Updated weights for policy 0, policy_version 634461 (0.0032) [2024-06-24 09:44:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10395123712. Throughput: 0: 42502.8. Samples: 10395287060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 09:44:34,804][15401] Updated weights for policy 0, policy_version 634471 (0.0045) [2024-06-24 09:44:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10395320320. Throughput: 0: 42586.6. Samples: 10395414200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 09:44:38,630][15401] Updated weights for policy 0, policy_version 634481 (0.0036) [2024-06-24 09:44:42,288][15401] Updated weights for policy 0, policy_version 634491 (0.0040) [2024-06-24 09:44:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 10395566080. Throughput: 0: 42687.1. Samples: 10395675060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 09:44:46,415][15401] Updated weights for policy 0, policy_version 634501 (0.0040) [2024-06-24 09:44:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10395762688. Throughput: 0: 42623.1. Samples: 10395926300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 09:44:49,878][15401] Updated weights for policy 0, policy_version 634511 (0.0036) [2024-06-24 09:44:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10395959296. Throughput: 0: 42600.0. Samples: 10396053960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 09:44:53,937][15401] Updated weights for policy 0, policy_version 634521 (0.0042) [2024-06-24 09:44:57,681][15401] Updated weights for policy 0, policy_version 634531 (0.0048) [2024-06-24 09:44:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10396205056. Throughput: 0: 42527.6. Samples: 10396309420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:44:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 09:45:01,730][15401] Updated weights for policy 0, policy_version 634541 (0.0032) [2024-06-24 09:45:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10396385280. Throughput: 0: 42652.7. Samples: 10396568900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 09:45:05,241][15401] Updated weights for policy 0, policy_version 634551 (0.0029) [2024-06-24 09:45:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10396614656. Throughput: 0: 42645.4. Samples: 10396695460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 09:45:09,494][15401] Updated weights for policy 0, policy_version 634561 (0.0028) [2024-06-24 09:45:12,809][15401] Updated weights for policy 0, policy_version 634571 (0.0037) [2024-06-24 09:45:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10396844032. Throughput: 0: 42657.8. Samples: 10396952280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 09:45:17,151][15401] Updated weights for policy 0, policy_version 634581 (0.0023) [2024-06-24 09:45:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42053.4, 300 sec: 42653.9). Total num frames: 10397024256. Throughput: 0: 42826.6. Samples: 10397214260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 09:45:20,284][15401] Updated weights for policy 0, policy_version 634591 (0.0042) [2024-06-24 09:45:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10397253632. Throughput: 0: 42614.2. Samples: 10397331840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:23,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 09:45:24,664][15401] Updated weights for policy 0, policy_version 634601 (0.0034) [2024-06-24 09:45:27,758][15401] Updated weights for policy 0, policy_version 634611 (0.0034) [2024-06-24 09:45:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10397483008. Throughput: 0: 42672.0. Samples: 10397595300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 09:45:32,279][15401] Updated weights for policy 0, policy_version 634621 (0.0040) [2024-06-24 09:45:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10397663232. Throughput: 0: 42825.2. Samples: 10397853440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:45:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 09:45:35,583][15401] Updated weights for policy 0, policy_version 634631 (0.0034) [2024-06-24 09:45:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10397908992. Throughput: 0: 42673.3. Samples: 10397974260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:45:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 09:45:39,756][15401] Updated weights for policy 0, policy_version 634641 (0.0032) [2024-06-24 09:45:43,206][15401] Updated weights for policy 0, policy_version 634651 (0.0033) [2024-06-24 09:45:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 10398121984. Throughput: 0: 42822.1. Samples: 10398236520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:45:43,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 09:45:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634651_10398121984.pth... [2024-06-24 09:45:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634026_10387881984.pth [2024-06-24 09:45:47,342][15401] Updated weights for policy 0, policy_version 634661 (0.0031) [2024-06-24 09:45:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10398285824. Throughput: 0: 42690.6. Samples: 10398489980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:45:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 09:45:50,977][15401] Updated weights for policy 0, policy_version 634671 (0.0031) [2024-06-24 09:45:53,326][15349] Signal inference workers to stop experience collection... (153950 times) [2024-06-24 09:45:53,328][15349] Signal inference workers to resume experience collection... (153950 times) [2024-06-24 09:45:53,347][15401] InferenceWorker_p0-w0: stopping experience collection (153950 times) [2024-06-24 09:45:53,347][15401] InferenceWorker_p0-w0: resuming experience collection (153950 times) [2024-06-24 09:45:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 10398547968. Throughput: 0: 42559.1. Samples: 10398610620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:45:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 09:45:54,821][15401] Updated weights for policy 0, policy_version 634681 (0.0035) [2024-06-24 09:45:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10398760960. Throughput: 0: 42843.1. Samples: 10398880220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:45:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 09:45:58,554][15401] Updated weights for policy 0, policy_version 634691 (0.0029) [2024-06-24 09:46:02,306][15401] Updated weights for policy 0, policy_version 634701 (0.0028) [2024-06-24 09:46:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10398941184. Throughput: 0: 42602.8. Samples: 10399131380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:03,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 09:46:06,155][15401] Updated weights for policy 0, policy_version 634711 (0.0036) [2024-06-24 09:46:08,394][15132] Fps is (10 sec: 42578.7, 60 sec: 42868.0, 300 sec: 42764.3). Total num frames: 10399186944. Throughput: 0: 42722.2. Samples: 10399254540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:08,395][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 09:46:10,573][15401] Updated weights for policy 0, policy_version 634721 (0.0038) [2024-06-24 09:46:13,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42710.1). Total num frames: 10399383552. Throughput: 0: 42716.9. Samples: 10399517560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 09:46:13,800][15401] Updated weights for policy 0, policy_version 634731 (0.0037) [2024-06-24 09:46:18,131][15401] Updated weights for policy 0, policy_version 634741 (0.0036) [2024-06-24 09:46:18,392][15132] Fps is (10 sec: 40969.7, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 10399596544. Throughput: 0: 42443.1. Samples: 10399763480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:18,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 09:46:22,093][15401] Updated weights for policy 0, policy_version 634751 (0.0037) [2024-06-24 09:46:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10399825920. Throughput: 0: 42467.5. Samples: 10399885300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 09:46:25,798][15401] Updated weights for policy 0, policy_version 634761 (0.0031) [2024-06-24 09:46:28,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10400006144. Throughput: 0: 42466.3. Samples: 10400147400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 09:46:29,798][15401] Updated weights for policy 0, policy_version 634771 (0.0040) [2024-06-24 09:46:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10400235520. Throughput: 0: 42303.1. Samples: 10400393620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 09:46:33,593][15401] Updated weights for policy 0, policy_version 634781 (0.0031) [2024-06-24 09:46:37,444][15401] Updated weights for policy 0, policy_version 634791 (0.0032) [2024-06-24 09:46:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10400448512. Throughput: 0: 42607.9. Samples: 10400527980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 09:46:41,523][15401] Updated weights for policy 0, policy_version 634801 (0.0041) [2024-06-24 09:46:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 10400645120. Throughput: 0: 42249.9. Samples: 10400781460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 09:46:45,170][15401] Updated weights for policy 0, policy_version 634811 (0.0048) [2024-06-24 09:46:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10400874496. Throughput: 0: 42105.7. Samples: 10401026140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 09:46:49,293][15401] Updated weights for policy 0, policy_version 634821 (0.0045) [2024-06-24 09:46:53,083][15401] Updated weights for policy 0, policy_version 634831 (0.0034) [2024-06-24 09:46:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10401071104. Throughput: 0: 42317.8. Samples: 10401158640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 09:46:57,024][15401] Updated weights for policy 0, policy_version 634841 (0.0032) [2024-06-24 09:46:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 10401267712. Throughput: 0: 42100.0. Samples: 10401412060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:46:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 09:47:00,947][15401] Updated weights for policy 0, policy_version 634851 (0.0039) [2024-06-24 09:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10401513472. Throughput: 0: 42128.0. Samples: 10401659140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:47:03,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 09:47:04,798][15401] Updated weights for policy 0, policy_version 634861 (0.0057) [2024-06-24 09:47:06,975][15349] Signal inference workers to stop experience collection... (154000 times) [2024-06-24 09:47:07,015][15401] InferenceWorker_p0-w0: stopping experience collection (154000 times) [2024-06-24 09:47:07,041][15349] Signal inference workers to resume experience collection... (154000 times) [2024-06-24 09:47:07,049][15401] InferenceWorker_p0-w0: resuming experience collection (154000 times) [2024-06-24 09:47:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42055.6, 300 sec: 42487.3). Total num frames: 10401710080. Throughput: 0: 42401.9. Samples: 10401793380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 09:47:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 09:47:08,666][15401] Updated weights for policy 0, policy_version 634871 (0.0032) [2024-06-24 09:47:12,277][15401] Updated weights for policy 0, policy_version 634881 (0.0032) [2024-06-24 09:47:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10401906688. Throughput: 0: 42221.9. Samples: 10402047380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 09:47:16,173][15401] Updated weights for policy 0, policy_version 634891 (0.0041) [2024-06-24 09:47:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 10402136064. Throughput: 0: 42488.8. Samples: 10402305620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 09:47:19,830][15401] Updated weights for policy 0, policy_version 634901 (0.0029) [2024-06-24 09:47:23,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 10402365440. Throughput: 0: 42373.8. Samples: 10402434900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:23,392][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 09:47:23,671][15401] Updated weights for policy 0, policy_version 634911 (0.0031) [2024-06-24 09:47:27,438][15401] Updated weights for policy 0, policy_version 634921 (0.0025) [2024-06-24 09:47:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10402545664. Throughput: 0: 42316.5. Samples: 10402685700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 09:47:31,209][15401] Updated weights for policy 0, policy_version 634931 (0.0033) [2024-06-24 09:47:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10402775040. Throughput: 0: 42551.5. Samples: 10402940960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 09:47:35,298][15401] Updated weights for policy 0, policy_version 634941 (0.0038) [2024-06-24 09:47:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 10402988032. Throughput: 0: 42604.9. Samples: 10403075960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:38,401][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 09:47:38,897][15401] Updated weights for policy 0, policy_version 634951 (0.0032) [2024-06-24 09:47:42,902][15401] Updated weights for policy 0, policy_version 634961 (0.0037) [2024-06-24 09:47:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10403201024. Throughput: 0: 42602.1. Samples: 10403329160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:43,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 09:47:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634961_10403201024.pth... [2024-06-24 09:47:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634338_10392993792.pth [2024-06-24 09:47:46,772][15401] Updated weights for policy 0, policy_version 634971 (0.0030) [2024-06-24 09:47:48,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10403430400. Throughput: 0: 42701.7. Samples: 10403580720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:48,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 09:47:51,062][15401] Updated weights for policy 0, policy_version 634981 (0.0029) [2024-06-24 09:47:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10403627008. Throughput: 0: 42590.1. Samples: 10403709940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:53,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 09:47:54,489][15401] Updated weights for policy 0, policy_version 634991 (0.0055) [2024-06-24 09:47:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10403840000. Throughput: 0: 42583.3. Samples: 10403963640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:47:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 09:47:58,714][15401] Updated weights for policy 0, policy_version 635001 (0.0024) [2024-06-24 09:48:02,383][15401] Updated weights for policy 0, policy_version 635011 (0.0036) [2024-06-24 09:48:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10404069376. Throughput: 0: 42332.5. Samples: 10404210580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 09:48:06,410][15401] Updated weights for policy 0, policy_version 635021 (0.0033) [2024-06-24 09:48:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 10404265984. Throughput: 0: 42380.4. Samples: 10404341920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 09:48:09,918][15401] Updated weights for policy 0, policy_version 635031 (0.0029) [2024-06-24 09:48:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 10404478976. Throughput: 0: 42465.7. Samples: 10404596760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:13,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 09:48:14,451][15401] Updated weights for policy 0, policy_version 635041 (0.0023) [2024-06-24 09:48:17,635][15401] Updated weights for policy 0, policy_version 635051 (0.0044) [2024-06-24 09:48:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10404691968. Throughput: 0: 42469.9. Samples: 10404852100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 09:48:19,336][15349] Signal inference workers to stop experience collection... (154050 times) [2024-06-24 09:48:19,337][15349] Signal inference workers to resume experience collection... (154050 times) [2024-06-24 09:48:19,366][15401] InferenceWorker_p0-w0: stopping experience collection (154050 times) [2024-06-24 09:48:19,366][15401] InferenceWorker_p0-w0: resuming experience collection (154050 times) [2024-06-24 09:48:21,987][15401] Updated weights for policy 0, policy_version 635061 (0.0030) [2024-06-24 09:48:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41780.9, 300 sec: 42487.3). Total num frames: 10404872192. Throughput: 0: 42298.3. Samples: 10404979280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 09:48:25,267][15401] Updated weights for policy 0, policy_version 635071 (0.0036) [2024-06-24 09:48:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10405117952. Throughput: 0: 42302.0. Samples: 10405232740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 09:48:29,483][15401] Updated weights for policy 0, policy_version 635081 (0.0034) [2024-06-24 09:48:33,343][15401] Updated weights for policy 0, policy_version 635091 (0.0038) [2024-06-24 09:48:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10405330944. Throughput: 0: 42502.6. Samples: 10405493340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 09:48:37,363][15401] Updated weights for policy 0, policy_version 635101 (0.0026) [2024-06-24 09:48:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 10405527552. Throughput: 0: 42342.7. Samples: 10405615360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 09:48:40,827][15401] Updated weights for policy 0, policy_version 635111 (0.0032) [2024-06-24 09:48:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 10405773312. Throughput: 0: 42341.0. Samples: 10405869080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 09:48:43,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 09:48:44,882][15401] Updated weights for policy 0, policy_version 635121 (0.0040) [2024-06-24 09:48:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10405953536. Throughput: 0: 42700.5. Samples: 10406132100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:48:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 09:48:48,722][15401] Updated weights for policy 0, policy_version 635131 (0.0034) [2024-06-24 09:48:52,525][15401] Updated weights for policy 0, policy_version 635141 (0.0044) [2024-06-24 09:48:53,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10406182912. Throughput: 0: 42442.3. Samples: 10406251820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:48:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 09:48:56,335][15401] Updated weights for policy 0, policy_version 635151 (0.0027) [2024-06-24 09:48:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10406412288. Throughput: 0: 42645.0. Samples: 10406515680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:48:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 09:49:00,481][15401] Updated weights for policy 0, policy_version 635161 (0.0031) [2024-06-24 09:49:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 10406592512. Throughput: 0: 42829.8. Samples: 10406779440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 09:49:04,048][15401] Updated weights for policy 0, policy_version 635171 (0.0046) [2024-06-24 09:49:08,043][15401] Updated weights for policy 0, policy_version 635181 (0.0029) [2024-06-24 09:49:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10406805504. Throughput: 0: 42686.7. Samples: 10406900180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 09:49:11,640][15401] Updated weights for policy 0, policy_version 635191 (0.0031) [2024-06-24 09:49:13,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42868.6, 300 sec: 42542.2). Total num frames: 10407051264. Throughput: 0: 42824.5. Samples: 10407160120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:13,396][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 09:49:15,391][15401] Updated weights for policy 0, policy_version 635201 (0.0046) [2024-06-24 09:49:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10407231488. Throughput: 0: 42732.1. Samples: 10407416280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 09:49:19,146][15401] Updated weights for policy 0, policy_version 635211 (0.0029) [2024-06-24 09:49:22,879][15401] Updated weights for policy 0, policy_version 635221 (0.0029) [2024-06-24 09:49:23,390][15132] Fps is (10 sec: 40985.9, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 10407460864. Throughput: 0: 42821.4. Samples: 10407542320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 09:49:27,224][15401] Updated weights for policy 0, policy_version 635231 (0.0038) [2024-06-24 09:49:28,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 10407690240. Throughput: 0: 42986.7. Samples: 10407803480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:28,392][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 09:49:30,676][15401] Updated weights for policy 0, policy_version 635241 (0.0036) [2024-06-24 09:49:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10407886848. Throughput: 0: 42761.8. Samples: 10408056380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 09:49:34,782][15401] Updated weights for policy 0, policy_version 635251 (0.0034) [2024-06-24 09:49:38,198][15401] Updated weights for policy 0, policy_version 635261 (0.0028) [2024-06-24 09:49:38,390][15132] Fps is (10 sec: 42608.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10408116224. Throughput: 0: 42869.7. Samples: 10408180960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 09:49:40,168][15349] Signal inference workers to stop experience collection... (154100 times) [2024-06-24 09:49:40,168][15349] Signal inference workers to resume experience collection... (154100 times) [2024-06-24 09:49:40,217][15401] InferenceWorker_p0-w0: stopping experience collection (154100 times) [2024-06-24 09:49:40,217][15401] InferenceWorker_p0-w0: resuming experience collection (154100 times) [2024-06-24 09:49:42,221][15401] Updated weights for policy 0, policy_version 635271 (0.0022) [2024-06-24 09:49:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 10408329216. Throughput: 0: 42788.1. Samples: 10408441140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 09:49:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635275_10408345600.pth... [2024-06-24 09:49:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634651_10398121984.pth [2024-06-24 09:49:45,729][15401] Updated weights for policy 0, policy_version 635281 (0.0034) [2024-06-24 09:49:48,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10408509440. Throughput: 0: 42653.3. Samples: 10408698840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 09:49:49,826][15401] Updated weights for policy 0, policy_version 635291 (0.0023) [2024-06-24 09:49:53,275][15401] Updated weights for policy 0, policy_version 635301 (0.0021) [2024-06-24 09:49:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10408771584. Throughput: 0: 42723.0. Samples: 10408822720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 09:49:57,725][15401] Updated weights for policy 0, policy_version 635311 (0.0038) [2024-06-24 09:49:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10408968192. Throughput: 0: 42838.0. Samples: 10409087560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:49:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 09:50:01,116][15401] Updated weights for policy 0, policy_version 635321 (0.0031) [2024-06-24 09:50:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10409164800. Throughput: 0: 42856.5. Samples: 10409344820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:50:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 09:50:05,238][15401] Updated weights for policy 0, policy_version 635331 (0.0039) [2024-06-24 09:50:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10409394176. Throughput: 0: 42835.6. Samples: 10409469920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:50:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 09:50:08,954][15401] Updated weights for policy 0, policy_version 635341 (0.0023) [2024-06-24 09:50:13,338][15401] Updated weights for policy 0, policy_version 635351 (0.0038) [2024-06-24 09:50:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42328.1, 300 sec: 42598.1). Total num frames: 10409590784. Throughput: 0: 42735.9. Samples: 10409726600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:50:13,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 09:50:16,758][15401] Updated weights for policy 0, policy_version 635361 (0.0035) [2024-06-24 09:50:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10409820160. Throughput: 0: 42668.0. Samples: 10409976440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:50:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 09:50:20,907][15401] Updated weights for policy 0, policy_version 635371 (0.0029) [2024-06-24 09:50:23,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10410033152. Throughput: 0: 42838.7. Samples: 10410108700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 09:50:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 09:50:24,324][15401] Updated weights for policy 0, policy_version 635381 (0.0033) [2024-06-24 09:50:28,390][15132] Fps is (10 sec: 40957.9, 60 sec: 42326.6, 300 sec: 42598.3). Total num frames: 10410229760. Throughput: 0: 42743.9. Samples: 10410364640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 09:50:28,464][15401] Updated weights for policy 0, policy_version 635391 (0.0027) [2024-06-24 09:50:31,756][15401] Updated weights for policy 0, policy_version 635401 (0.0041) [2024-06-24 09:50:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10410442752. Throughput: 0: 42724.8. Samples: 10410621460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 09:50:36,047][15401] Updated weights for policy 0, policy_version 635411 (0.0028) [2024-06-24 09:50:38,389][15132] Fps is (10 sec: 44239.1, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 10410672128. Throughput: 0: 42751.2. Samples: 10410746520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 09:50:39,222][15401] Updated weights for policy 0, policy_version 635421 (0.0027) [2024-06-24 09:50:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10410868736. Throughput: 0: 42680.4. Samples: 10411008180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:43,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 09:50:43,565][15401] Updated weights for policy 0, policy_version 635431 (0.0035) [2024-06-24 09:50:47,204][15401] Updated weights for policy 0, policy_version 635441 (0.0029) [2024-06-24 09:50:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 10411081728. Throughput: 0: 42661.7. Samples: 10411264600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 09:50:51,114][15401] Updated weights for policy 0, policy_version 635451 (0.0036) [2024-06-24 09:50:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10411311104. Throughput: 0: 42627.5. Samples: 10411388160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 09:50:54,843][15401] Updated weights for policy 0, policy_version 635461 (0.0029) [2024-06-24 09:50:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10411507712. Throughput: 0: 42787.2. Samples: 10411651920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:50:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 09:50:59,152][15401] Updated weights for policy 0, policy_version 635471 (0.0033) [2024-06-24 09:51:02,458][15401] Updated weights for policy 0, policy_version 635481 (0.0033) [2024-06-24 09:51:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42488.0). Total num frames: 10411720704. Throughput: 0: 42789.8. Samples: 10411901980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 09:51:06,702][15401] Updated weights for policy 0, policy_version 635491 (0.0041) [2024-06-24 09:51:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10411966464. Throughput: 0: 42746.3. Samples: 10412032280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 09:51:10,124][15401] Updated weights for policy 0, policy_version 635501 (0.0038) [2024-06-24 09:51:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42543.2). Total num frames: 10412146688. Throughput: 0: 42824.4. Samples: 10412291720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 09:51:14,322][15401] Updated weights for policy 0, policy_version 635511 (0.0061) [2024-06-24 09:51:16,965][15349] Signal inference workers to stop experience collection... (154150 times) [2024-06-24 09:51:16,974][15349] Signal inference workers to resume experience collection... (154150 times) [2024-06-24 09:51:16,993][15401] InferenceWorker_p0-w0: stopping experience collection (154150 times) [2024-06-24 09:51:16,993][15401] InferenceWorker_p0-w0: resuming experience collection (154150 times) [2024-06-24 09:51:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10412359680. Throughput: 0: 42637.9. Samples: 10412540160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 09:51:18,441][15401] Updated weights for policy 0, policy_version 635521 (0.0043) [2024-06-24 09:51:22,186][15401] Updated weights for policy 0, policy_version 635531 (0.0023) [2024-06-24 09:51:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10412605440. Throughput: 0: 42734.6. Samples: 10412669580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 09:51:26,014][15401] Updated weights for policy 0, policy_version 635541 (0.0041) [2024-06-24 09:51:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.8, 300 sec: 42542.9). Total num frames: 10412785664. Throughput: 0: 42675.2. Samples: 10412928560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:28,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 09:51:29,814][15401] Updated weights for policy 0, policy_version 635551 (0.0028) [2024-06-24 09:51:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10413015040. Throughput: 0: 42620.0. Samples: 10413182500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 09:51:33,480][15401] Updated weights for policy 0, policy_version 635561 (0.0031) [2024-06-24 09:51:37,345][15401] Updated weights for policy 0, policy_version 635571 (0.0029) [2024-06-24 09:51:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10413244416. Throughput: 0: 42896.0. Samples: 10413318480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:38,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-24 09:51:40,940][15401] Updated weights for policy 0, policy_version 635581 (0.0042) [2024-06-24 09:51:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10413441024. Throughput: 0: 42497.7. Samples: 10413564320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-24 09:51:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635586_10413441024.pth... [2024-06-24 09:51:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000634961_10403201024.pth [2024-06-24 09:51:45,238][15401] Updated weights for policy 0, policy_version 635591 (0.0042) [2024-06-24 09:51:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10413654016. Throughput: 0: 42755.0. Samples: 10413825960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 09:51:48,766][15401] Updated weights for policy 0, policy_version 635601 (0.0032) [2024-06-24 09:51:52,904][15401] Updated weights for policy 0, policy_version 635611 (0.0040) [2024-06-24 09:51:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10413883392. Throughput: 0: 42671.6. Samples: 10413952500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 09:51:56,421][15401] Updated weights for policy 0, policy_version 635621 (0.0028) [2024-06-24 09:51:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10414096384. Throughput: 0: 42527.2. Samples: 10414205440. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-24 09:51:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 09:52:00,390][15401] Updated weights for policy 0, policy_version 635631 (0.0034) [2024-06-24 09:52:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10414292992. Throughput: 0: 42889.2. Samples: 10414470180. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 09:52:04,276][15401] Updated weights for policy 0, policy_version 635641 (0.0028) [2024-06-24 09:52:08,097][15401] Updated weights for policy 0, policy_version 635651 (0.0040) [2024-06-24 09:52:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10414522368. Throughput: 0: 42769.7. Samples: 10414594220. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 09:52:11,800][15401] Updated weights for policy 0, policy_version 635661 (0.0037) [2024-06-24 09:52:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10414735360. Throughput: 0: 42703.1. Samples: 10414850200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 09:52:15,838][15401] Updated weights for policy 0, policy_version 635671 (0.0032) [2024-06-24 09:52:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 10414931968. Throughput: 0: 42935.4. Samples: 10415114600. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 09:52:19,703][15401] Updated weights for policy 0, policy_version 635681 (0.0032) [2024-06-24 09:52:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10415144960. Throughput: 0: 42698.3. Samples: 10415239900. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 09:52:23,436][15401] Updated weights for policy 0, policy_version 635691 (0.0026) [2024-06-24 09:52:27,179][15401] Updated weights for policy 0, policy_version 635701 (0.0026) [2024-06-24 09:52:28,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10415390720. Throughput: 0: 42979.1. Samples: 10415498380. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 09:52:31,029][15401] Updated weights for policy 0, policy_version 635711 (0.0033) [2024-06-24 09:52:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 10415570944. Throughput: 0: 42934.3. Samples: 10415758000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 09:52:34,288][15349] Signal inference workers to stop experience collection... (154200 times) [2024-06-24 09:52:34,289][15349] Signal inference workers to resume experience collection... (154200 times) [2024-06-24 09:52:34,301][15401] InferenceWorker_p0-w0: stopping experience collection (154200 times) [2024-06-24 09:52:34,313][15401] InferenceWorker_p0-w0: resuming experience collection (154200 times) [2024-06-24 09:52:34,757][15401] Updated weights for policy 0, policy_version 635721 (0.0037) [2024-06-24 09:52:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10415800320. Throughput: 0: 42873.3. Samples: 10415881800. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 09:52:38,614][15401] Updated weights for policy 0, policy_version 635731 (0.0045) [2024-06-24 09:52:42,289][15401] Updated weights for policy 0, policy_version 635741 (0.0021) [2024-06-24 09:52:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10416046080. Throughput: 0: 43038.7. Samples: 10416142180. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 09:52:46,126][15401] Updated weights for policy 0, policy_version 635751 (0.0024) [2024-06-24 09:52:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10416226304. Throughput: 0: 42886.3. Samples: 10416400060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 09:52:49,862][15401] Updated weights for policy 0, policy_version 635761 (0.0038) [2024-06-24 09:52:53,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10416422912. Throughput: 0: 42870.8. Samples: 10416523400. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 09:52:53,715][15401] Updated weights for policy 0, policy_version 635771 (0.0031) [2024-06-24 09:52:57,743][15401] Updated weights for policy 0, policy_version 635781 (0.0021) [2024-06-24 09:52:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10416668672. Throughput: 0: 43107.6. Samples: 10416790040. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:52:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 09:53:01,281][15401] Updated weights for policy 0, policy_version 635791 (0.0035) [2024-06-24 09:53:03,394][15132] Fps is (10 sec: 45855.7, 60 sec: 43141.5, 300 sec: 42764.4). Total num frames: 10416881664. Throughput: 0: 42734.7. Samples: 10417037840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:03,394][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 09:53:05,413][15401] Updated weights for policy 0, policy_version 635801 (0.0029) [2024-06-24 09:53:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 10417078272. Throughput: 0: 42781.4. Samples: 10417165060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 09:53:08,947][15401] Updated weights for policy 0, policy_version 635811 (0.0036) [2024-06-24 09:53:12,868][15401] Updated weights for policy 0, policy_version 635821 (0.0039) [2024-06-24 09:53:13,389][15132] Fps is (10 sec: 42616.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10417307648. Throughput: 0: 42964.5. Samples: 10417431780. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 09:53:16,673][15401] Updated weights for policy 0, policy_version 635831 (0.0039) [2024-06-24 09:53:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10417537024. Throughput: 0: 42681.7. Samples: 10417678680. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:18,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 09:53:20,405][15401] Updated weights for policy 0, policy_version 635841 (0.0025) [2024-06-24 09:53:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10417700864. Throughput: 0: 42772.5. Samples: 10417806560. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 09:53:24,195][15401] Updated weights for policy 0, policy_version 635851 (0.0025) [2024-06-24 09:53:28,190][15401] Updated weights for policy 0, policy_version 635861 (0.0045) [2024-06-24 09:53:28,394][15132] Fps is (10 sec: 40941.4, 60 sec: 42595.1, 300 sec: 42764.4). Total num frames: 10417946624. Throughput: 0: 42704.9. Samples: 10418064100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:28,395][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 09:53:32,154][15401] Updated weights for policy 0, policy_version 635871 (0.0032) [2024-06-24 09:53:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10418143232. Throughput: 0: 42554.3. Samples: 10418315000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 24.0) [2024-06-24 09:53:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 09:53:35,926][15401] Updated weights for policy 0, policy_version 635881 (0.0041) [2024-06-24 09:53:36,464][15349] Signal inference workers to stop experience collection... (154250 times) [2024-06-24 09:53:36,464][15349] Signal inference workers to resume experience collection... (154250 times) [2024-06-24 09:53:36,496][15401] InferenceWorker_p0-w0: stopping experience collection (154250 times) [2024-06-24 09:53:36,496][15401] InferenceWorker_p0-w0: resuming experience collection (154250 times) [2024-06-24 09:53:38,389][15132] Fps is (10 sec: 39339.8, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 10418339840. Throughput: 0: 42600.4. Samples: 10418440420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:53:38,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 09:53:39,979][15401] Updated weights for policy 0, policy_version 635891 (0.0042) [2024-06-24 09:53:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10418585600. Throughput: 0: 42440.0. Samples: 10418699840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:53:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 09:53:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635900_10418585600.pth... [2024-06-24 09:53:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635275_10408345600.pth [2024-06-24 09:53:43,847][15401] Updated weights for policy 0, policy_version 635901 (0.0041) [2024-06-24 09:53:47,529][15401] Updated weights for policy 0, policy_version 635911 (0.0034) [2024-06-24 09:53:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10418798592. Throughput: 0: 42668.4. Samples: 10418957740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:53:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 09:53:51,513][15401] Updated weights for policy 0, policy_version 635921 (0.0038) [2024-06-24 09:53:53,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 10418995200. Throughput: 0: 42563.2. Samples: 10419080680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:53:53,396][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 09:53:55,364][15401] Updated weights for policy 0, policy_version 635931 (0.0033) [2024-06-24 09:53:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10419224576. Throughput: 0: 42381.8. Samples: 10419338960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:53:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 09:53:59,120][15401] Updated weights for policy 0, policy_version 635941 (0.0043) [2024-06-24 09:54:03,110][15401] Updated weights for policy 0, policy_version 635951 (0.0039) [2024-06-24 09:54:03,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42328.3, 300 sec: 42765.0). Total num frames: 10419421184. Throughput: 0: 42494.8. Samples: 10419590940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:03,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 09:54:06,564][15401] Updated weights for policy 0, policy_version 635961 (0.0037) [2024-06-24 09:54:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 10419634176. Throughput: 0: 42531.5. Samples: 10419720480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 09:54:10,905][15401] Updated weights for policy 0, policy_version 635971 (0.0026) [2024-06-24 09:54:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10419847168. Throughput: 0: 42607.6. Samples: 10419981240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 09:54:14,125][15401] Updated weights for policy 0, policy_version 635981 (0.0042) [2024-06-24 09:54:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10420060160. Throughput: 0: 42612.3. Samples: 10420232560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:18,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 09:54:18,455][15401] Updated weights for policy 0, policy_version 635991 (0.0031) [2024-06-24 09:54:21,908][15401] Updated weights for policy 0, policy_version 636001 (0.0027) [2024-06-24 09:54:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 10420273152. Throughput: 0: 42661.0. Samples: 10420360160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 09:54:26,040][15401] Updated weights for policy 0, policy_version 636011 (0.0037) [2024-06-24 09:54:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42328.6, 300 sec: 42709.5). Total num frames: 10420486144. Throughput: 0: 42737.7. Samples: 10420623040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 09:54:29,910][15401] Updated weights for policy 0, policy_version 636021 (0.0031) [2024-06-24 09:54:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 10420715520. Throughput: 0: 42546.2. Samples: 10420872320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 09:54:33,567][15401] Updated weights for policy 0, policy_version 636031 (0.0030) [2024-06-24 09:54:37,459][15401] Updated weights for policy 0, policy_version 636041 (0.0038) [2024-06-24 09:54:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10420928512. Throughput: 0: 42743.8. Samples: 10421003880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 09:54:41,143][15401] Updated weights for policy 0, policy_version 636051 (0.0023) [2024-06-24 09:54:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10421108736. Throughput: 0: 42653.0. Samples: 10421258340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:43,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 09:54:45,049][15401] Updated weights for policy 0, policy_version 636061 (0.0026) [2024-06-24 09:54:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10421370880. Throughput: 0: 42602.7. Samples: 10421508060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 09:54:48,695][15401] Updated weights for policy 0, policy_version 636071 (0.0030) [2024-06-24 09:54:52,742][15401] Updated weights for policy 0, policy_version 636081 (0.0034) [2024-06-24 09:54:53,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42874.3, 300 sec: 42709.1). Total num frames: 10421567488. Throughput: 0: 42815.4. Samples: 10421647280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:53,392][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 09:54:56,303][15401] Updated weights for policy 0, policy_version 636091 (0.0032) [2024-06-24 09:54:58,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 10421764096. Throughput: 0: 42652.6. Samples: 10421900880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:54:58,396][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 09:55:00,323][15401] Updated weights for policy 0, policy_version 636101 (0.0032) [2024-06-24 09:55:02,309][15349] Signal inference workers to stop experience collection... (154300 times) [2024-06-24 09:55:02,309][15349] Signal inference workers to resume experience collection... (154300 times) [2024-06-24 09:55:02,341][15401] InferenceWorker_p0-w0: stopping experience collection (154300 times) [2024-06-24 09:55:02,341][15401] InferenceWorker_p0-w0: resuming experience collection (154300 times) [2024-06-24 09:55:03,393][15132] Fps is (10 sec: 45871.9, 60 sec: 43415.3, 300 sec: 42820.1). Total num frames: 10422026240. Throughput: 0: 42605.1. Samples: 10422149920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:55:03,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 09:55:04,489][15401] Updated weights for policy 0, policy_version 636111 (0.0034) [2024-06-24 09:55:08,234][15401] Updated weights for policy 0, policy_version 636121 (0.0037) [2024-06-24 09:55:08,390][15132] Fps is (10 sec: 44264.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 10422206464. Throughput: 0: 42875.4. Samples: 10422289560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 09:55:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 09:55:11,851][15401] Updated weights for policy 0, policy_version 636131 (0.0028) [2024-06-24 09:55:13,389][15132] Fps is (10 sec: 37695.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10422403072. Throughput: 0: 42633.0. Samples: 10422541520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 09:55:15,931][15401] Updated weights for policy 0, policy_version 636141 (0.0029) [2024-06-24 09:55:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10422665216. Throughput: 0: 42562.0. Samples: 10422787600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 09:55:19,502][15401] Updated weights for policy 0, policy_version 636151 (0.0037) [2024-06-24 09:55:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10422829056. Throughput: 0: 42641.3. Samples: 10422922740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 09:55:24,027][15401] Updated weights for policy 0, policy_version 636161 (0.0035) [2024-06-24 09:55:27,313][15401] Updated weights for policy 0, policy_version 636171 (0.0029) [2024-06-24 09:55:28,391][15132] Fps is (10 sec: 39316.5, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 10423058432. Throughput: 0: 42605.0. Samples: 10423175620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:28,391][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 09:55:31,882][15401] Updated weights for policy 0, policy_version 636181 (0.0034) [2024-06-24 09:55:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10423287808. Throughput: 0: 42724.0. Samples: 10423430640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 09:55:34,941][15401] Updated weights for policy 0, policy_version 636191 (0.0038) [2024-06-24 09:55:38,394][15132] Fps is (10 sec: 42584.0, 60 sec: 42595.2, 300 sec: 42764.4). Total num frames: 10423484416. Throughput: 0: 42561.9. Samples: 10423562660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:38,395][15132] Avg episode reward: [(0, '0.891')] [2024-06-24 09:55:39,370][15401] Updated weights for policy 0, policy_version 636201 (0.0045) [2024-06-24 09:55:42,594][15401] Updated weights for policy 0, policy_version 636211 (0.0031) [2024-06-24 09:55:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10423697408. Throughput: 0: 42472.5. Samples: 10423811880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 09:55:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636212_10423697408.pth... [2024-06-24 09:55:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635586_10413441024.pth [2024-06-24 09:55:46,851][15401] Updated weights for policy 0, policy_version 636221 (0.0034) [2024-06-24 09:55:48,389][15132] Fps is (10 sec: 44257.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10423926784. Throughput: 0: 42692.8. Samples: 10424070960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 09:55:50,362][15401] Updated weights for policy 0, policy_version 636231 (0.0028) [2024-06-24 09:55:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 10424107008. Throughput: 0: 42642.3. Samples: 10424208460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 09:55:54,324][15401] Updated weights for policy 0, policy_version 636241 (0.0043) [2024-06-24 09:55:58,269][15401] Updated weights for policy 0, policy_version 636251 (0.0045) [2024-06-24 09:55:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 10424336384. Throughput: 0: 42586.6. Samples: 10424457920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:55:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 09:56:02,287][15401] Updated weights for policy 0, policy_version 636261 (0.0026) [2024-06-24 09:56:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42327.6, 300 sec: 42709.5). Total num frames: 10424565760. Throughput: 0: 42889.3. Samples: 10424717620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 09:56:05,755][15401] Updated weights for policy 0, policy_version 636271 (0.0041) [2024-06-24 09:56:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10424762368. Throughput: 0: 42810.7. Samples: 10424849220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 09:56:09,680][15401] Updated weights for policy 0, policy_version 636281 (0.0039) [2024-06-24 09:56:10,469][15349] Signal inference workers to stop experience collection... (154350 times) [2024-06-24 09:56:10,469][15349] Signal inference workers to resume experience collection... (154350 times) [2024-06-24 09:56:10,489][15401] InferenceWorker_p0-w0: stopping experience collection (154350 times) [2024-06-24 09:56:10,490][15401] InferenceWorker_p0-w0: resuming experience collection (154350 times) [2024-06-24 09:56:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10424991744. Throughput: 0: 42767.4. Samples: 10425100100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 09:56:13,394][15401] Updated weights for policy 0, policy_version 636291 (0.0036) [2024-06-24 09:56:17,692][15401] Updated weights for policy 0, policy_version 636301 (0.0031) [2024-06-24 09:56:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10425204736. Throughput: 0: 42690.7. Samples: 10425351720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 09:56:21,352][15401] Updated weights for policy 0, policy_version 636311 (0.0032) [2024-06-24 09:56:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10425384960. Throughput: 0: 42650.6. Samples: 10425481740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 09:56:25,397][15401] Updated weights for policy 0, policy_version 636321 (0.0033) [2024-06-24 09:56:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42872.3, 300 sec: 42765.0). Total num frames: 10425630720. Throughput: 0: 42661.9. Samples: 10425731660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 09:56:29,158][15401] Updated weights for policy 0, policy_version 636331 (0.0028) [2024-06-24 09:56:32,877][15401] Updated weights for policy 0, policy_version 636341 (0.0038) [2024-06-24 09:56:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10425827328. Throughput: 0: 42619.7. Samples: 10425988840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 09:56:37,163][15401] Updated weights for policy 0, policy_version 636351 (0.0035) [2024-06-24 09:56:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42328.5, 300 sec: 42653.9). Total num frames: 10426023936. Throughput: 0: 42388.7. Samples: 10426115960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:38,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 09:56:40,629][15401] Updated weights for policy 0, policy_version 636361 (0.0059) [2024-06-24 09:56:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10426269696. Throughput: 0: 42526.7. Samples: 10426371620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 09:56:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 09:56:44,753][15401] Updated weights for policy 0, policy_version 636371 (0.0032) [2024-06-24 09:56:48,253][15401] Updated weights for policy 0, policy_version 636381 (0.0037) [2024-06-24 09:56:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10426466304. Throughput: 0: 42491.5. Samples: 10426629740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:56:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 09:56:52,374][15401] Updated weights for policy 0, policy_version 636391 (0.0025) [2024-06-24 09:56:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10426679296. Throughput: 0: 42405.4. Samples: 10426757460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:56:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 09:56:55,965][15401] Updated weights for policy 0, policy_version 636401 (0.0033) [2024-06-24 09:56:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10426908672. Throughput: 0: 42507.1. Samples: 10427012920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:56:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 09:56:59,840][15401] Updated weights for policy 0, policy_version 636411 (0.0041) [2024-06-24 09:57:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10427105280. Throughput: 0: 42630.2. Samples: 10427270080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 09:57:03,394][15401] Updated weights for policy 0, policy_version 636421 (0.0037) [2024-06-24 09:57:07,308][15401] Updated weights for policy 0, policy_version 636431 (0.0035) [2024-06-24 09:57:08,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 10427301888. Throughput: 0: 42598.6. Samples: 10427398780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:08,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 09:57:10,807][15401] Updated weights for policy 0, policy_version 636441 (0.0042) [2024-06-24 09:57:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10427547648. Throughput: 0: 42693.3. Samples: 10427652860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 09:57:15,263][15401] Updated weights for policy 0, policy_version 636451 (0.0030) [2024-06-24 09:57:18,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10427760640. Throughput: 0: 42760.2. Samples: 10427913060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:18,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 09:57:18,440][15401] Updated weights for policy 0, policy_version 636461 (0.0032) [2024-06-24 09:57:23,079][15401] Updated weights for policy 0, policy_version 636471 (0.0035) [2024-06-24 09:57:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10427940864. Throughput: 0: 42749.6. Samples: 10428039680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:23,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 09:57:26,063][15401] Updated weights for policy 0, policy_version 636481 (0.0031) [2024-06-24 09:57:27,661][15349] Signal inference workers to stop experience collection... (154400 times) [2024-06-24 09:57:27,661][15349] Signal inference workers to resume experience collection... (154400 times) [2024-06-24 09:57:27,703][15401] InferenceWorker_p0-w0: stopping experience collection (154400 times) [2024-06-24 09:57:27,703][15401] InferenceWorker_p0-w0: resuming experience collection (154400 times) [2024-06-24 09:57:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10428186624. Throughput: 0: 42786.8. Samples: 10428297020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 09:57:30,738][15401] Updated weights for policy 0, policy_version 636491 (0.0034) [2024-06-24 09:57:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10428399616. Throughput: 0: 42799.1. Samples: 10428555800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:33,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 09:57:33,792][15401] Updated weights for policy 0, policy_version 636501 (0.0026) [2024-06-24 09:57:38,228][15401] Updated weights for policy 0, policy_version 636511 (0.0034) [2024-06-24 09:57:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 10428596224. Throughput: 0: 42702.6. Samples: 10428679080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 09:57:41,445][15401] Updated weights for policy 0, policy_version 636521 (0.0030) [2024-06-24 09:57:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10428841984. Throughput: 0: 42771.6. Samples: 10428937640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 09:57:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636526_10428841984.pth... [2024-06-24 09:57:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000635900_10418585600.pth [2024-06-24 09:57:45,942][15401] Updated weights for policy 0, policy_version 636531 (0.0025) [2024-06-24 09:57:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10429005824. Throughput: 0: 42622.7. Samples: 10429188100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 09:57:49,676][15401] Updated weights for policy 0, policy_version 636541 (0.0033) [2024-06-24 09:57:53,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10429235200. Throughput: 0: 42498.6. Samples: 10429311120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:53,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 09:57:53,549][15401] Updated weights for policy 0, policy_version 636551 (0.0043) [2024-06-24 09:57:57,269][15401] Updated weights for policy 0, policy_version 636561 (0.0044) [2024-06-24 09:57:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 10429464576. Throughput: 0: 42581.2. Samples: 10429569020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:57:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 09:58:01,481][15401] Updated weights for policy 0, policy_version 636571 (0.0043) [2024-06-24 09:58:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 10429644800. Throughput: 0: 42480.9. Samples: 10429824700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:58:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 09:58:05,065][15401] Updated weights for policy 0, policy_version 636581 (0.0035) [2024-06-24 09:58:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10429874176. Throughput: 0: 42368.8. Samples: 10429946280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:58:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 09:58:09,217][15401] Updated weights for policy 0, policy_version 636591 (0.0029) [2024-06-24 09:58:12,811][15401] Updated weights for policy 0, policy_version 636601 (0.0034) [2024-06-24 09:58:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10430087168. Throughput: 0: 42520.8. Samples: 10430210460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:58:13,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 09:58:16,943][15401] Updated weights for policy 0, policy_version 636611 (0.0036) [2024-06-24 09:58:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 10430283776. Throughput: 0: 42395.6. Samples: 10430463500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:58:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 09:58:20,425][15401] Updated weights for policy 0, policy_version 636621 (0.0040) [2024-06-24 09:58:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42599.1). Total num frames: 10430513152. Throughput: 0: 42360.0. Samples: 10430585280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-24 09:58:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 09:58:24,590][15401] Updated weights for policy 0, policy_version 636631 (0.0026) [2024-06-24 09:58:27,946][15401] Updated weights for policy 0, policy_version 636641 (0.0035) [2024-06-24 09:58:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10430726144. Throughput: 0: 42407.1. Samples: 10430845960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 09:58:32,214][15401] Updated weights for policy 0, policy_version 636651 (0.0037) [2024-06-24 09:58:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 10430922752. Throughput: 0: 42458.9. Samples: 10431098760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 09:58:35,799][15401] Updated weights for policy 0, policy_version 636661 (0.0038) [2024-06-24 09:58:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10431135744. Throughput: 0: 42537.1. Samples: 10431225280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 09:58:39,906][15401] Updated weights for policy 0, policy_version 636671 (0.0025) [2024-06-24 09:58:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10431365120. Throughput: 0: 42430.3. Samples: 10431478380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:43,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 09:58:43,527][15401] Updated weights for policy 0, policy_version 636681 (0.0036) [2024-06-24 09:58:47,574][15401] Updated weights for policy 0, policy_version 636691 (0.0043) [2024-06-24 09:58:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 10431578112. Throughput: 0: 42462.2. Samples: 10431735500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:48,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 09:58:48,867][15349] Signal inference workers to stop experience collection... (154450 times) [2024-06-24 09:58:48,872][15349] Signal inference workers to resume experience collection... (154450 times) [2024-06-24 09:58:48,903][15401] InferenceWorker_p0-w0: stopping experience collection (154450 times) [2024-06-24 09:58:48,903][15401] InferenceWorker_p0-w0: resuming experience collection (154450 times) [2024-06-24 09:58:51,248][15401] Updated weights for policy 0, policy_version 636701 (0.0038) [2024-06-24 09:58:53,394][15132] Fps is (10 sec: 40939.6, 60 sec: 42321.9, 300 sec: 42542.1). Total num frames: 10431774720. Throughput: 0: 42589.0. Samples: 10431863000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:53,395][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 09:58:55,499][15401] Updated weights for policy 0, policy_version 636711 (0.0029) [2024-06-24 09:58:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10432020480. Throughput: 0: 42480.0. Samples: 10432122060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:58:58,394][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 09:58:58,967][15401] Updated weights for policy 0, policy_version 636721 (0.0036) [2024-06-24 09:59:02,946][15401] Updated weights for policy 0, policy_version 636731 (0.0030) [2024-06-24 09:59:03,390][15132] Fps is (10 sec: 44258.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10432217088. Throughput: 0: 42646.0. Samples: 10432382580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 09:59:06,411][15401] Updated weights for policy 0, policy_version 636741 (0.0038) [2024-06-24 09:59:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10432430080. Throughput: 0: 42664.9. Samples: 10432505200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 09:59:10,580][15401] Updated weights for policy 0, policy_version 636751 (0.0044) [2024-06-24 09:59:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10432643072. Throughput: 0: 42676.0. Samples: 10432766380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 09:59:14,119][15401] Updated weights for policy 0, policy_version 636761 (0.0036) [2024-06-24 09:59:18,275][15401] Updated weights for policy 0, policy_version 636771 (0.0042) [2024-06-24 09:59:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10432856064. Throughput: 0: 42686.4. Samples: 10433019640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 09:59:21,877][15401] Updated weights for policy 0, policy_version 636781 (0.0040) [2024-06-24 09:59:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10433069056. Throughput: 0: 42602.2. Samples: 10433142380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 09:59:26,101][15401] Updated weights for policy 0, policy_version 636791 (0.0046) [2024-06-24 09:59:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10433282048. Throughput: 0: 42674.2. Samples: 10433398720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:28,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 09:59:29,454][15401] Updated weights for policy 0, policy_version 636801 (0.0041) [2024-06-24 09:59:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 10433478656. Throughput: 0: 42675.7. Samples: 10433655900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 09:59:34,007][15401] Updated weights for policy 0, policy_version 636811 (0.0031) [2024-06-24 09:59:37,409][15401] Updated weights for policy 0, policy_version 636821 (0.0036) [2024-06-24 09:59:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10433708032. Throughput: 0: 42657.6. Samples: 10433782380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 09:59:41,542][15401] Updated weights for policy 0, policy_version 636831 (0.0028) [2024-06-24 09:59:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10433904640. Throughput: 0: 42586.2. Samples: 10434038440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 09:59:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636835_10433904640.pth... [2024-06-24 09:59:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636212_10423697408.pth [2024-06-24 09:59:45,108][15401] Updated weights for policy 0, policy_version 636841 (0.0028) [2024-06-24 09:59:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 10434117632. Throughput: 0: 42469.9. Samples: 10434293720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 09:59:49,187][15401] Updated weights for policy 0, policy_version 636851 (0.0029) [2024-06-24 09:59:52,807][15401] Updated weights for policy 0, policy_version 636861 (0.0040) [2024-06-24 09:59:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42875.0, 300 sec: 42654.9). Total num frames: 10434347008. Throughput: 0: 42463.7. Samples: 10434416060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 09:59:57,259][15401] Updated weights for policy 0, policy_version 636871 (0.0032) [2024-06-24 09:59:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42432.2). Total num frames: 10434543616. Throughput: 0: 42436.9. Samples: 10434676040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 09:59:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 10:00:00,499][15401] Updated weights for policy 0, policy_version 636881 (0.0036) [2024-06-24 10:00:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 10434756608. Throughput: 0: 42534.2. Samples: 10434933680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 10:00:04,829][15401] Updated weights for policy 0, policy_version 636891 (0.0039) [2024-06-24 10:00:08,088][15401] Updated weights for policy 0, policy_version 636901 (0.0043) [2024-06-24 10:00:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10435002368. Throughput: 0: 42490.2. Samples: 10435054440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 10:00:12,390][15401] Updated weights for policy 0, policy_version 636911 (0.0032) [2024-06-24 10:00:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 10435166208. Throughput: 0: 42492.0. Samples: 10435310860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:13,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 10:00:15,646][15349] Signal inference workers to stop experience collection... (154500 times) [2024-06-24 10:00:15,680][15401] InferenceWorker_p0-w0: stopping experience collection (154500 times) [2024-06-24 10:00:15,696][15349] Signal inference workers to resume experience collection... (154500 times) [2024-06-24 10:00:15,698][15401] InferenceWorker_p0-w0: resuming experience collection (154500 times) [2024-06-24 10:00:15,845][15401] Updated weights for policy 0, policy_version 636921 (0.0037) [2024-06-24 10:00:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10435395584. Throughput: 0: 42342.5. Samples: 10435561320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 10:00:20,281][15401] Updated weights for policy 0, policy_version 636931 (0.0042) [2024-06-24 10:00:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 10435624960. Throughput: 0: 42403.3. Samples: 10435690520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:23,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-24 10:00:23,503][15401] Updated weights for policy 0, policy_version 636941 (0.0036) [2024-06-24 10:00:28,079][15401] Updated weights for policy 0, policy_version 636951 (0.0044) [2024-06-24 10:00:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10435805184. Throughput: 0: 42260.1. Samples: 10435940140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 10:00:31,176][15401] Updated weights for policy 0, policy_version 636961 (0.0046) [2024-06-24 10:00:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.6, 300 sec: 42543.2). Total num frames: 10436034560. Throughput: 0: 42184.9. Samples: 10436192140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:33,393][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 10:00:35,726][15401] Updated weights for policy 0, policy_version 636971 (0.0034) [2024-06-24 10:00:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10436247552. Throughput: 0: 42419.0. Samples: 10436324920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 10:00:39,257][15401] Updated weights for policy 0, policy_version 636981 (0.0045) [2024-06-24 10:00:43,396][15132] Fps is (10 sec: 40943.4, 60 sec: 42320.8, 300 sec: 42430.9). Total num frames: 10436444160. Throughput: 0: 42251.2. Samples: 10436577620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:43,396][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 10:00:43,412][15401] Updated weights for policy 0, policy_version 636991 (0.0036) [2024-06-24 10:00:46,847][15401] Updated weights for policy 0, policy_version 637001 (0.0040) [2024-06-24 10:00:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10436657152. Throughput: 0: 42195.6. Samples: 10436832480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 10:00:51,166][15401] Updated weights for policy 0, policy_version 637011 (0.0031) [2024-06-24 10:00:53,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10436886528. Throughput: 0: 42357.3. Samples: 10436960520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 10:00:54,409][15401] Updated weights for policy 0, policy_version 637021 (0.0039) [2024-06-24 10:00:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 10437099520. Throughput: 0: 42342.1. Samples: 10437216260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:00:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 10:00:58,932][15401] Updated weights for policy 0, policy_version 637031 (0.0031) [2024-06-24 10:01:02,341][15401] Updated weights for policy 0, policy_version 637041 (0.0037) [2024-06-24 10:01:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10437312512. Throughput: 0: 42249.3. Samples: 10437462640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:03,393][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 10:01:06,683][15401] Updated weights for policy 0, policy_version 637051 (0.0032) [2024-06-24 10:01:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10437525504. Throughput: 0: 42304.0. Samples: 10437594200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 10:01:10,210][15401] Updated weights for policy 0, policy_version 637061 (0.0040) [2024-06-24 10:01:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10437722112. Throughput: 0: 42444.9. Samples: 10437850160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 10:01:14,206][15401] Updated weights for policy 0, policy_version 637071 (0.0037) [2024-06-24 10:01:17,967][15401] Updated weights for policy 0, policy_version 637081 (0.0031) [2024-06-24 10:01:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10437951488. Throughput: 0: 42347.1. Samples: 10438097660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 10:01:22,244][15401] Updated weights for policy 0, policy_version 637091 (0.0035) [2024-06-24 10:01:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 10438148096. Throughput: 0: 42373.3. Samples: 10438231720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 10:01:25,621][15401] Updated weights for policy 0, policy_version 637101 (0.0040) [2024-06-24 10:01:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 10438344704. Throughput: 0: 42343.5. Samples: 10438482800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 10:01:29,869][15401] Updated weights for policy 0, policy_version 637111 (0.0043) [2024-06-24 10:01:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 10438574080. Throughput: 0: 42400.0. Samples: 10438740480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 10:01:33,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 10:01:33,439][15401] Updated weights for policy 0, policy_version 637121 (0.0045) [2024-06-24 10:01:37,363][15401] Updated weights for policy 0, policy_version 637131 (0.0042) [2024-06-24 10:01:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10438787072. Throughput: 0: 42387.1. Samples: 10438867940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:01:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 10:01:41,309][15401] Updated weights for policy 0, policy_version 637141 (0.0031) [2024-06-24 10:01:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42603.0, 300 sec: 42487.3). Total num frames: 10439000064. Throughput: 0: 42340.6. Samples: 10439121580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:01:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 10:01:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637146_10439000064.pth... [2024-06-24 10:01:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636526_10428841984.pth [2024-06-24 10:01:45,016][15401] Updated weights for policy 0, policy_version 637151 (0.0028) [2024-06-24 10:01:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 10439196672. Throughput: 0: 42636.5. Samples: 10439381180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:01:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 10:01:49,116][15401] Updated weights for policy 0, policy_version 637161 (0.0039) [2024-06-24 10:01:52,597][15401] Updated weights for policy 0, policy_version 637171 (0.0028) [2024-06-24 10:01:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10439442432. Throughput: 0: 42561.6. Samples: 10439509480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:01:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 10:01:56,930][15401] Updated weights for policy 0, policy_version 637181 (0.0046) [2024-06-24 10:01:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10439639040. Throughput: 0: 42483.9. Samples: 10439761940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:01:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 10:02:00,178][15401] Updated weights for policy 0, policy_version 637191 (0.0033) [2024-06-24 10:02:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42543.2). Total num frames: 10439852032. Throughput: 0: 42864.6. Samples: 10440026560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 10:02:04,443][15401] Updated weights for policy 0, policy_version 637201 (0.0035) [2024-06-24 10:02:07,087][15349] Signal inference workers to stop experience collection... (154550 times) [2024-06-24 10:02:07,087][15349] Signal inference workers to resume experience collection... (154550 times) [2024-06-24 10:02:07,109][15401] InferenceWorker_p0-w0: stopping experience collection (154550 times) [2024-06-24 10:02:07,109][15401] InferenceWorker_p0-w0: resuming experience collection (154550 times) [2024-06-24 10:02:07,698][15401] Updated weights for policy 0, policy_version 637211 (0.0034) [2024-06-24 10:02:08,390][15132] Fps is (10 sec: 44233.9, 60 sec: 42597.9, 300 sec: 42487.2). Total num frames: 10440081408. Throughput: 0: 42642.1. Samples: 10440150640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:08,391][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 10:02:12,135][15401] Updated weights for policy 0, policy_version 637221 (0.0033) [2024-06-24 10:02:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10440278016. Throughput: 0: 42816.4. Samples: 10440409540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:13,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-24 10:02:15,144][15401] Updated weights for policy 0, policy_version 637231 (0.0028) [2024-06-24 10:02:18,390][15132] Fps is (10 sec: 40962.6, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 10440491008. Throughput: 0: 42881.7. Samples: 10440670160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:18,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-24 10:02:19,651][15401] Updated weights for policy 0, policy_version 637241 (0.0042) [2024-06-24 10:02:22,723][15401] Updated weights for policy 0, policy_version 637251 (0.0039) [2024-06-24 10:02:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 10440736768. Throughput: 0: 42783.2. Samples: 10440793180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:23,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 10:02:27,187][15401] Updated weights for policy 0, policy_version 637261 (0.0032) [2024-06-24 10:02:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42487.7). Total num frames: 10440933376. Throughput: 0: 42811.4. Samples: 10441048100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 10:02:30,778][15401] Updated weights for policy 0, policy_version 637271 (0.0034) [2024-06-24 10:02:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10441113600. Throughput: 0: 42840.9. Samples: 10441309020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 10:02:35,035][15401] Updated weights for policy 0, policy_version 637281 (0.0027) [2024-06-24 10:02:38,365][15401] Updated weights for policy 0, policy_version 637291 (0.0041) [2024-06-24 10:02:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 10441375744. Throughput: 0: 42818.8. Samples: 10441436320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 10:02:43,032][15401] Updated weights for policy 0, policy_version 637301 (0.0034) [2024-06-24 10:02:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10441572352. Throughput: 0: 42988.5. Samples: 10441696420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 10:02:45,930][15401] Updated weights for policy 0, policy_version 637311 (0.0035) [2024-06-24 10:02:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10441768960. Throughput: 0: 42677.2. Samples: 10441947040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 10:02:50,753][15401] Updated weights for policy 0, policy_version 637321 (0.0044) [2024-06-24 10:02:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10442014720. Throughput: 0: 42863.8. Samples: 10442079480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 10:02:53,503][15401] Updated weights for policy 0, policy_version 637331 (0.0040) [2024-06-24 10:02:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 10442178560. Throughput: 0: 42717.0. Samples: 10442331800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:02:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 10:02:58,488][15401] Updated weights for policy 0, policy_version 637341 (0.0037) [2024-06-24 10:03:01,221][15401] Updated weights for policy 0, policy_version 637351 (0.0047) [2024-06-24 10:03:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 10442424320. Throughput: 0: 42587.5. Samples: 10442586600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:03:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 10:03:05,962][15401] Updated weights for policy 0, policy_version 637361 (0.0037) [2024-06-24 10:03:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.9, 300 sec: 42542.9). Total num frames: 10442637312. Throughput: 0: 42742.8. Samples: 10442716600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:03:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 10:03:08,876][15349] Signal inference workers to stop experience collection... (154600 times) [2024-06-24 10:03:08,878][15349] Signal inference workers to resume experience collection... (154600 times) [2024-06-24 10:03:08,883][15401] Updated weights for policy 0, policy_version 637371 (0.0042) [2024-06-24 10:03:08,897][15401] InferenceWorker_p0-w0: stopping experience collection (154600 times) [2024-06-24 10:03:08,898][15401] InferenceWorker_p0-w0: resuming experience collection (154600 times) [2024-06-24 10:03:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10442833920. Throughput: 0: 42733.4. Samples: 10442971100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 10:03:13,495][15401] Updated weights for policy 0, policy_version 637381 (0.0032) [2024-06-24 10:03:16,597][15401] Updated weights for policy 0, policy_version 637391 (0.0034) [2024-06-24 10:03:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 10443079680. Throughput: 0: 42315.1. Samples: 10443213200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 10:03:21,455][15401] Updated weights for policy 0, policy_version 637401 (0.0037) [2024-06-24 10:03:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10443259904. Throughput: 0: 42526.6. Samples: 10443350020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 10:03:24,224][15401] Updated weights for policy 0, policy_version 637411 (0.0030) [2024-06-24 10:03:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10443472896. Throughput: 0: 42512.9. Samples: 10443609500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 10:03:29,001][15401] Updated weights for policy 0, policy_version 637421 (0.0036) [2024-06-24 10:03:31,847][15401] Updated weights for policy 0, policy_version 637431 (0.0025) [2024-06-24 10:03:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 10443735040. Throughput: 0: 42436.0. Samples: 10443856660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 10:03:36,622][15401] Updated weights for policy 0, policy_version 637441 (0.0052) [2024-06-24 10:03:38,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42320.8, 300 sec: 42541.9). Total num frames: 10443915264. Throughput: 0: 42654.0. Samples: 10443999180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:38,396][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 10:03:39,757][15401] Updated weights for policy 0, policy_version 637451 (0.0039) [2024-06-24 10:03:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10444111872. Throughput: 0: 42476.3. Samples: 10444243240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 10:03:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637458_10444111872.pth... [2024-06-24 10:03:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000636835_10433904640.pth [2024-06-24 10:03:44,160][15401] Updated weights for policy 0, policy_version 637461 (0.0046) [2024-06-24 10:03:47,476][15401] Updated weights for policy 0, policy_version 637471 (0.0048) [2024-06-24 10:03:48,389][15132] Fps is (10 sec: 44265.2, 60 sec: 43144.6, 300 sec: 42654.7). Total num frames: 10444357632. Throughput: 0: 42374.4. Samples: 10444493440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 10:03:51,931][15401] Updated weights for policy 0, policy_version 637481 (0.0024) [2024-06-24 10:03:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 10444537856. Throughput: 0: 42617.2. Samples: 10444634380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 10:03:55,350][15401] Updated weights for policy 0, policy_version 637491 (0.0026) [2024-06-24 10:03:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10444734464. Throughput: 0: 42453.0. Samples: 10444881480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:03:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 10:04:00,182][15401] Updated weights for policy 0, policy_version 637501 (0.0036) [2024-06-24 10:04:03,335][15401] Updated weights for policy 0, policy_version 637511 (0.0036) [2024-06-24 10:04:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10444980224. Throughput: 0: 42712.3. Samples: 10445135260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 10:04:07,564][15401] Updated weights for policy 0, policy_version 637521 (0.0021) [2024-06-24 10:04:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10445160448. Throughput: 0: 42606.7. Samples: 10445267320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 10:04:10,859][15401] Updated weights for policy 0, policy_version 637531 (0.0033) [2024-06-24 10:04:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10445389824. Throughput: 0: 42340.5. Samples: 10445514820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 10:04:15,249][15401] Updated weights for policy 0, policy_version 637541 (0.0026) [2024-06-24 10:04:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10445619200. Throughput: 0: 42553.9. Samples: 10445771580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 10:04:18,499][15401] Updated weights for policy 0, policy_version 637551 (0.0035) [2024-06-24 10:04:22,925][15401] Updated weights for policy 0, policy_version 637561 (0.0045) [2024-06-24 10:04:23,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 10445815808. Throughput: 0: 42321.7. Samples: 10445903660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:23,396][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 10:04:26,282][15401] Updated weights for policy 0, policy_version 637571 (0.0043) [2024-06-24 10:04:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10446045184. Throughput: 0: 42392.5. Samples: 10446150900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 10:04:30,864][15349] Signal inference workers to stop experience collection... (154650 times) [2024-06-24 10:04:30,908][15401] InferenceWorker_p0-w0: stopping experience collection (154650 times) [2024-06-24 10:04:30,909][15349] Signal inference workers to resume experience collection... (154650 times) [2024-06-24 10:04:30,918][15401] InferenceWorker_p0-w0: resuming experience collection (154650 times) [2024-06-24 10:04:30,925][15401] Updated weights for policy 0, policy_version 637581 (0.0027) [2024-06-24 10:04:33,390][15132] Fps is (10 sec: 44262.5, 60 sec: 42051.9, 300 sec: 42542.8). Total num frames: 10446258176. Throughput: 0: 42618.0. Samples: 10446411280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:33,391][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 10:04:34,103][15401] Updated weights for policy 0, policy_version 637591 (0.0025) [2024-06-24 10:04:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41783.6, 300 sec: 42431.8). Total num frames: 10446422016. Throughput: 0: 42359.2. Samples: 10446540540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 10:04:38,645][15401] Updated weights for policy 0, policy_version 637601 (0.0031) [2024-06-24 10:04:41,677][15401] Updated weights for policy 0, policy_version 637611 (0.0045) [2024-06-24 10:04:43,391][15132] Fps is (10 sec: 42595.2, 60 sec: 42870.5, 300 sec: 42598.2). Total num frames: 10446684160. Throughput: 0: 42336.4. Samples: 10446786680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 10:04:43,391][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 10:04:46,408][15401] Updated weights for policy 0, policy_version 637621 (0.0042) [2024-06-24 10:04:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 10446897152. Throughput: 0: 42383.6. Samples: 10447042520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:04:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 10:04:49,380][15401] Updated weights for policy 0, policy_version 637631 (0.0039) [2024-06-24 10:04:53,389][15132] Fps is (10 sec: 37688.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 10447060992. Throughput: 0: 42246.7. Samples: 10447168420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:04:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 10:04:54,056][15401] Updated weights for policy 0, policy_version 637641 (0.0029) [2024-06-24 10:04:57,074][15401] Updated weights for policy 0, policy_version 637651 (0.0043) [2024-06-24 10:04:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10447323136. Throughput: 0: 42426.2. Samples: 10447424000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:04:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 10:05:01,722][15401] Updated weights for policy 0, policy_version 637661 (0.0041) [2024-06-24 10:05:03,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10447536128. Throughput: 0: 42458.1. Samples: 10447682200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 10:05:04,806][15401] Updated weights for policy 0, policy_version 637671 (0.0028) [2024-06-24 10:05:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10447716352. Throughput: 0: 42410.0. Samples: 10447811840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 10:05:09,385][15401] Updated weights for policy 0, policy_version 637681 (0.0035) [2024-06-24 10:05:12,402][15401] Updated weights for policy 0, policy_version 637691 (0.0038) [2024-06-24 10:05:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10447945728. Throughput: 0: 42464.3. Samples: 10448061800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 10:05:17,123][15401] Updated weights for policy 0, policy_version 637701 (0.0027) [2024-06-24 10:05:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10448158720. Throughput: 0: 42446.9. Samples: 10448321360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 10:05:20,092][15401] Updated weights for policy 0, policy_version 637711 (0.0030) [2024-06-24 10:05:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42329.8, 300 sec: 42542.8). Total num frames: 10448355328. Throughput: 0: 42401.7. Samples: 10448448620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 10:05:24,702][15401] Updated weights for policy 0, policy_version 637721 (0.0038) [2024-06-24 10:05:27,871][15401] Updated weights for policy 0, policy_version 637731 (0.0033) [2024-06-24 10:05:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 10448584704. Throughput: 0: 42531.1. Samples: 10448700520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 10:05:32,361][15401] Updated weights for policy 0, policy_version 637741 (0.0041) [2024-06-24 10:05:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.6, 300 sec: 42487.3). Total num frames: 10448781312. Throughput: 0: 42561.7. Samples: 10448957800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 10:05:35,575][15401] Updated weights for policy 0, policy_version 637751 (0.0048) [2024-06-24 10:05:38,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43139.9, 300 sec: 42598.4). Total num frames: 10449010688. Throughput: 0: 42435.2. Samples: 10449078280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:38,396][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 10:05:40,537][15401] Updated weights for policy 0, policy_version 637761 (0.0048) [2024-06-24 10:05:43,306][15401] Updated weights for policy 0, policy_version 637771 (0.0045) [2024-06-24 10:05:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42599.4, 300 sec: 42653.9). Total num frames: 10449240064. Throughput: 0: 42491.1. Samples: 10449336100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 10:05:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637771_10449240064.pth... [2024-06-24 10:05:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637146_10439000064.pth [2024-06-24 10:05:48,162][15401] Updated weights for policy 0, policy_version 637781 (0.0034) [2024-06-24 10:05:48,390][15132] Fps is (10 sec: 39346.5, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 10449403904. Throughput: 0: 42480.9. Samples: 10449593840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 10:05:51,651][15401] Updated weights for policy 0, policy_version 637791 (0.0041) [2024-06-24 10:05:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 10449649664. Throughput: 0: 42208.5. Samples: 10449711220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 10:05:55,916][15401] Updated weights for policy 0, policy_version 637801 (0.0045) [2024-06-24 10:05:57,969][15349] Signal inference workers to stop experience collection... (154700 times) [2024-06-24 10:05:57,970][15349] Signal inference workers to resume experience collection... (154700 times) [2024-06-24 10:05:58,011][15401] InferenceWorker_p0-w0: stopping experience collection (154700 times) [2024-06-24 10:05:58,011][15401] InferenceWorker_p0-w0: resuming experience collection (154700 times) [2024-06-24 10:05:58,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 10449862656. Throughput: 0: 42422.8. Samples: 10449970820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:05:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 10:05:59,231][15401] Updated weights for policy 0, policy_version 637811 (0.0026) [2024-06-24 10:06:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 10450042880. Throughput: 0: 42379.0. Samples: 10450228420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:06:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 10:06:03,439][15401] Updated weights for policy 0, policy_version 637821 (0.0034) [2024-06-24 10:06:06,769][15401] Updated weights for policy 0, policy_version 637831 (0.0039) [2024-06-24 10:06:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10450305024. Throughput: 0: 42294.4. Samples: 10450351860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:06:08,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 10:06:10,862][15401] Updated weights for policy 0, policy_version 637841 (0.0041) [2024-06-24 10:06:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10450501632. Throughput: 0: 42562.2. Samples: 10450615820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:06:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 10:06:14,387][15401] Updated weights for policy 0, policy_version 637851 (0.0032) [2024-06-24 10:06:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10450698240. Throughput: 0: 42422.9. Samples: 10450866820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:06:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 10:06:18,978][15401] Updated weights for policy 0, policy_version 637861 (0.0042) [2024-06-24 10:06:21,946][15401] Updated weights for policy 0, policy_version 637871 (0.0038) [2024-06-24 10:06:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10450927616. Throughput: 0: 42497.1. Samples: 10450990380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-24 10:06:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 10:06:26,954][15401] Updated weights for policy 0, policy_version 637881 (0.0031) [2024-06-24 10:06:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10451124224. Throughput: 0: 42595.7. Samples: 10451252900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 10:06:29,782][15401] Updated weights for policy 0, policy_version 637891 (0.0032) [2024-06-24 10:06:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10451337216. Throughput: 0: 42452.0. Samples: 10451504180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 10:06:34,559][15401] Updated weights for policy 0, policy_version 637901 (0.0037) [2024-06-24 10:06:37,426][15401] Updated weights for policy 0, policy_version 637911 (0.0033) [2024-06-24 10:06:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 10451566592. Throughput: 0: 42650.6. Samples: 10451630500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 10:06:42,113][15401] Updated weights for policy 0, policy_version 637921 (0.0034) [2024-06-24 10:06:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10451763200. Throughput: 0: 42674.3. Samples: 10451891160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:43,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-24 10:06:45,044][15401] Updated weights for policy 0, policy_version 637931 (0.0047) [2024-06-24 10:06:48,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 10451943424. Throughput: 0: 42589.7. Samples: 10452145060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:48,392][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 10:06:49,730][15401] Updated weights for policy 0, policy_version 637941 (0.0047) [2024-06-24 10:06:52,567][15401] Updated weights for policy 0, policy_version 637951 (0.0038) [2024-06-24 10:06:53,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10452221952. Throughput: 0: 42725.2. Samples: 10452274500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 10:06:57,271][15401] Updated weights for policy 0, policy_version 637961 (0.0031) [2024-06-24 10:06:58,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 10452385792. Throughput: 0: 42518.6. Samples: 10452529160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:06:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 10:07:00,568][15401] Updated weights for policy 0, policy_version 637971 (0.0043) [2024-06-24 10:07:03,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42431.9). Total num frames: 10452598784. Throughput: 0: 42621.7. Samples: 10452784800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:03,396][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 10:07:05,129][15401] Updated weights for policy 0, policy_version 637981 (0.0037) [2024-06-24 10:07:08,144][15401] Updated weights for policy 0, policy_version 637991 (0.0033) [2024-06-24 10:07:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10452844544. Throughput: 0: 42786.6. Samples: 10452915780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 10:07:12,841][15401] Updated weights for policy 0, policy_version 638001 (0.0043) [2024-06-24 10:07:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 10453024768. Throughput: 0: 42444.7. Samples: 10453162920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 10:07:15,703][15401] Updated weights for policy 0, policy_version 638011 (0.0029) [2024-06-24 10:07:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10453254144. Throughput: 0: 42586.4. Samples: 10453420560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 10:07:20,762][15401] Updated weights for policy 0, policy_version 638021 (0.0036) [2024-06-24 10:07:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10453483520. Throughput: 0: 42571.5. Samples: 10453546220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 10:07:23,562][15401] Updated weights for policy 0, policy_version 638031 (0.0037) [2024-06-24 10:07:28,378][15401] Updated weights for policy 0, policy_version 638041 (0.0042) [2024-06-24 10:07:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 10453663744. Throughput: 0: 42446.1. Samples: 10453801240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:28,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 10:07:29,399][15349] Signal inference workers to stop experience collection... (154750 times) [2024-06-24 10:07:29,439][15401] InferenceWorker_p0-w0: stopping experience collection (154750 times) [2024-06-24 10:07:29,521][15349] Signal inference workers to resume experience collection... (154750 times) [2024-06-24 10:07:29,521][15401] InferenceWorker_p0-w0: resuming experience collection (154750 times) [2024-06-24 10:07:31,338][15401] Updated weights for policy 0, policy_version 638051 (0.0045) [2024-06-24 10:07:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10453893120. Throughput: 0: 42497.8. Samples: 10454057360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 10:07:35,923][15401] Updated weights for policy 0, policy_version 638061 (0.0035) [2024-06-24 10:07:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10454138880. Throughput: 0: 42512.1. Samples: 10454187540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 10:07:38,837][15401] Updated weights for policy 0, policy_version 638071 (0.0031) [2024-06-24 10:07:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 10454302720. Throughput: 0: 42626.7. Samples: 10454447460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:43,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 10:07:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638081_10454319104.pth... [2024-06-24 10:07:43,446][15401] Updated weights for policy 0, policy_version 638081 (0.0021) [2024-06-24 10:07:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637458_10444111872.pth [2024-06-24 10:07:46,641][15401] Updated weights for policy 0, policy_version 638091 (0.0042) [2024-06-24 10:07:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43146.2, 300 sec: 42431.8). Total num frames: 10454532096. Throughput: 0: 42510.2. Samples: 10454697760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 10:07:51,183][15401] Updated weights for policy 0, policy_version 638101 (0.0041) [2024-06-24 10:07:53,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10454761472. Throughput: 0: 42423.5. Samples: 10454824840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 10:07:54,285][15401] Updated weights for policy 0, policy_version 638111 (0.0030) [2024-06-24 10:07:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10454941696. Throughput: 0: 42572.4. Samples: 10455078680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 10:07:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 10:07:59,000][15401] Updated weights for policy 0, policy_version 638121 (0.0041) [2024-06-24 10:08:02,359][15401] Updated weights for policy 0, policy_version 638131 (0.0024) [2024-06-24 10:08:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10455171072. Throughput: 0: 42484.9. Samples: 10455332380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:03,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 10:08:06,502][15401] Updated weights for policy 0, policy_version 638141 (0.0030) [2024-06-24 10:08:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10455384064. Throughput: 0: 42569.9. Samples: 10455461860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 10:08:10,229][15401] Updated weights for policy 0, policy_version 638151 (0.0023) [2024-06-24 10:08:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 10455580672. Throughput: 0: 42538.2. Samples: 10455715460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:13,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 10:08:14,337][15401] Updated weights for policy 0, policy_version 638161 (0.0025) [2024-06-24 10:08:17,913][15401] Updated weights for policy 0, policy_version 638171 (0.0038) [2024-06-24 10:08:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10455810048. Throughput: 0: 42585.4. Samples: 10455973700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:18,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 10:08:21,904][15401] Updated weights for policy 0, policy_version 638181 (0.0045) [2024-06-24 10:08:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10456023040. Throughput: 0: 42517.3. Samples: 10456100820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 10:08:25,845][15401] Updated weights for policy 0, policy_version 638191 (0.0045) [2024-06-24 10:08:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 10456236032. Throughput: 0: 42298.3. Samples: 10456350780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 10:08:29,350][15401] Updated weights for policy 0, policy_version 638201 (0.0026) [2024-06-24 10:08:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42432.7). Total num frames: 10456432640. Throughput: 0: 42472.2. Samples: 10456609000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 10:08:33,421][15401] Updated weights for policy 0, policy_version 638211 (0.0053) [2024-06-24 10:08:36,818][15401] Updated weights for policy 0, policy_version 638221 (0.0035) [2024-06-24 10:08:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10456662016. Throughput: 0: 42501.4. Samples: 10456737400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 10:08:41,487][15401] Updated weights for policy 0, policy_version 638231 (0.0025) [2024-06-24 10:08:43,393][15132] Fps is (10 sec: 44218.9, 60 sec: 42870.4, 300 sec: 42431.2). Total num frames: 10456875008. Throughput: 0: 42576.8. Samples: 10456994800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:43,394][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 10:08:44,358][15401] Updated weights for policy 0, policy_version 638241 (0.0026) [2024-06-24 10:08:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 10457055232. Throughput: 0: 42567.6. Samples: 10457247920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 10:08:49,410][15401] Updated weights for policy 0, policy_version 638251 (0.0036) [2024-06-24 10:08:50,173][15349] Signal inference workers to stop experience collection... (154800 times) [2024-06-24 10:08:50,180][15349] Signal inference workers to resume experience collection... (154800 times) [2024-06-24 10:08:50,225][15401] InferenceWorker_p0-w0: stopping experience collection (154800 times) [2024-06-24 10:08:50,226][15401] InferenceWorker_p0-w0: resuming experience collection (154800 times) [2024-06-24 10:08:51,964][15401] Updated weights for policy 0, policy_version 638261 (0.0034) [2024-06-24 10:08:53,389][15132] Fps is (10 sec: 40976.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 10457284608. Throughput: 0: 42522.3. Samples: 10457375360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 10:08:57,214][15401] Updated weights for policy 0, policy_version 638271 (0.0037) [2024-06-24 10:08:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10457497600. Throughput: 0: 42722.2. Samples: 10457637960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:08:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 10:08:59,620][15401] Updated weights for policy 0, policy_version 638281 (0.0032) [2024-06-24 10:09:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10457726976. Throughput: 0: 42606.9. Samples: 10457891020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 10:09:04,698][15401] Updated weights for policy 0, policy_version 638291 (0.0034) [2024-06-24 10:09:07,058][15401] Updated weights for policy 0, policy_version 638301 (0.0024) [2024-06-24 10:09:08,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42871.2, 300 sec: 42598.3). Total num frames: 10457956352. Throughput: 0: 42657.4. Samples: 10458020420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 10:09:12,111][15401] Updated weights for policy 0, policy_version 638311 (0.0034) [2024-06-24 10:09:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10458136576. Throughput: 0: 42965.7. Samples: 10458284240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 10:09:14,923][15401] Updated weights for policy 0, policy_version 638321 (0.0034) [2024-06-24 10:09:18,389][15132] Fps is (10 sec: 40962.0, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 10458365952. Throughput: 0: 42708.5. Samples: 10458530880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:18,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-24 10:09:19,670][15401] Updated weights for policy 0, policy_version 638331 (0.0039) [2024-06-24 10:09:22,649][15401] Updated weights for policy 0, policy_version 638341 (0.0038) [2024-06-24 10:09:23,396][15132] Fps is (10 sec: 45846.3, 60 sec: 42866.9, 300 sec: 42541.9). Total num frames: 10458595328. Throughput: 0: 42887.8. Samples: 10458667620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:23,396][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 10:09:27,454][15401] Updated weights for policy 0, policy_version 638351 (0.0034) [2024-06-24 10:09:28,389][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 42320.8). Total num frames: 10458742784. Throughput: 0: 42678.1. Samples: 10458915140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 10:09:30,610][15401] Updated weights for policy 0, policy_version 638361 (0.0030) [2024-06-24 10:09:33,391][15132] Fps is (10 sec: 42619.6, 60 sec: 43143.5, 300 sec: 42709.3). Total num frames: 10459021312. Throughput: 0: 42634.2. Samples: 10459166520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 10:09:33,391][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 10:09:35,337][15401] Updated weights for policy 0, policy_version 638371 (0.0034) [2024-06-24 10:09:38,127][15401] Updated weights for policy 0, policy_version 638381 (0.0023) [2024-06-24 10:09:38,389][15132] Fps is (10 sec: 50790.2, 60 sec: 43144.6, 300 sec: 42598.6). Total num frames: 10459250688. Throughput: 0: 42944.9. Samples: 10459307880. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:09:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 10:09:42,893][15401] Updated weights for policy 0, policy_version 638391 (0.0038) [2024-06-24 10:09:43,390][15132] Fps is (10 sec: 37688.1, 60 sec: 42055.0, 300 sec: 42376.3). Total num frames: 10459398144. Throughput: 0: 42707.5. Samples: 10459559800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:09:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 10:09:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638391_10459398144.pth... [2024-06-24 10:09:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000637771_10449240064.pth [2024-06-24 10:09:45,725][15401] Updated weights for policy 0, policy_version 638401 (0.0029) [2024-06-24 10:09:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10459660288. Throughput: 0: 42673.6. Samples: 10459811320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:09:48,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 10:09:49,896][15349] Signal inference workers to stop experience collection... (154850 times) [2024-06-24 10:09:49,948][15401] InferenceWorker_p0-w0: stopping experience collection (154850 times) [2024-06-24 10:09:49,951][15349] Signal inference workers to resume experience collection... (154850 times) [2024-06-24 10:09:49,965][15401] InferenceWorker_p0-w0: resuming experience collection (154850 times) [2024-06-24 10:09:50,963][15401] Updated weights for policy 0, policy_version 638411 (0.0032) [2024-06-24 10:09:53,287][15401] Updated weights for policy 0, policy_version 638421 (0.0028) [2024-06-24 10:09:53,389][15132] Fps is (10 sec: 49152.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 10459889664. Throughput: 0: 42847.1. Samples: 10459948520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:09:53,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 10:09:58,335][15401] Updated weights for policy 0, policy_version 638431 (0.0027) [2024-06-24 10:09:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10460053504. Throughput: 0: 42654.6. Samples: 10460203700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:09:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 10:10:00,920][15401] Updated weights for policy 0, policy_version 638441 (0.0035) [2024-06-24 10:10:03,391][15132] Fps is (10 sec: 40953.7, 60 sec: 42870.5, 300 sec: 42653.7). Total num frames: 10460299264. Throughput: 0: 42728.3. Samples: 10460453720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:03,391][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 10:10:05,977][15401] Updated weights for policy 0, policy_version 638451 (0.0040) [2024-06-24 10:10:08,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 10460528640. Throughput: 0: 42776.6. Samples: 10460592300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 10:10:08,631][15401] Updated weights for policy 0, policy_version 638461 (0.0030) [2024-06-24 10:10:13,389][15132] Fps is (10 sec: 39327.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10460692480. Throughput: 0: 42920.8. Samples: 10460846580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 10:10:13,646][15401] Updated weights for policy 0, policy_version 638471 (0.0028) [2024-06-24 10:10:16,229][15401] Updated weights for policy 0, policy_version 638481 (0.0038) [2024-06-24 10:10:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10460954624. Throughput: 0: 42857.8. Samples: 10461095060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 10:10:21,557][15401] Updated weights for policy 0, policy_version 638491 (0.0025) [2024-06-24 10:10:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 10461151232. Throughput: 0: 42867.5. Samples: 10461236920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 10:10:23,801][15401] Updated weights for policy 0, policy_version 638501 (0.0021) [2024-06-24 10:10:28,392][15132] Fps is (10 sec: 37673.7, 60 sec: 43142.7, 300 sec: 42542.5). Total num frames: 10461331456. Throughput: 0: 42812.8. Samples: 10461486480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:28,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 10:10:29,127][15401] Updated weights for policy 0, policy_version 638511 (0.0034) [2024-06-24 10:10:31,831][15401] Updated weights for policy 0, policy_version 638521 (0.0037) [2024-06-24 10:10:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43145.5, 300 sec: 42710.4). Total num frames: 10461609984. Throughput: 0: 42862.1. Samples: 10461740120. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 10:10:36,674][15401] Updated weights for policy 0, policy_version 638531 (0.0035) [2024-06-24 10:10:38,389][15132] Fps is (10 sec: 45886.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10461790208. Throughput: 0: 42851.1. Samples: 10461876820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:38,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 10:10:39,442][15401] Updated weights for policy 0, policy_version 638541 (0.0034) [2024-06-24 10:10:43,389][15132] Fps is (10 sec: 37683.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 10461986816. Throughput: 0: 42682.3. Samples: 10462124400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 10:10:44,255][15401] Updated weights for policy 0, policy_version 638551 (0.0025) [2024-06-24 10:10:47,074][15401] Updated weights for policy 0, policy_version 638561 (0.0023) [2024-06-24 10:10:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10462248960. Throughput: 0: 42834.7. Samples: 10462381220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:48,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 10:10:51,634][15401] Updated weights for policy 0, policy_version 638571 (0.0047) [2024-06-24 10:10:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10462429184. Throughput: 0: 42708.5. Samples: 10462514180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:53,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 10:10:54,581][15401] Updated weights for policy 0, policy_version 638581 (0.0029) [2024-06-24 10:10:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10462642176. Throughput: 0: 42578.3. Samples: 10462762600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:10:58,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 10:10:59,399][15401] Updated weights for policy 0, policy_version 638591 (0.0026) [2024-06-24 10:11:00,861][15349] Signal inference workers to stop experience collection... (154900 times) [2024-06-24 10:11:00,903][15401] InferenceWorker_p0-w0: stopping experience collection (154900 times) [2024-06-24 10:11:00,911][15349] Signal inference workers to resume experience collection... (154900 times) [2024-06-24 10:11:00,926][15401] InferenceWorker_p0-w0: resuming experience collection (154900 times) [2024-06-24 10:11:02,354][15401] Updated weights for policy 0, policy_version 638601 (0.0036) [2024-06-24 10:11:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42872.5, 300 sec: 42598.4). Total num frames: 10462871552. Throughput: 0: 42793.3. Samples: 10463020760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:11:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 10:11:07,349][15401] Updated weights for policy 0, policy_version 638611 (0.0037) [2024-06-24 10:11:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 10463035392. Throughput: 0: 42556.5. Samples: 10463151960. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-24 10:11:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 10:11:10,222][15401] Updated weights for policy 0, policy_version 638621 (0.0030) [2024-06-24 10:11:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10463297536. Throughput: 0: 42613.4. Samples: 10463403980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 10:11:14,769][15401] Updated weights for policy 0, policy_version 638631 (0.0046) [2024-06-24 10:11:17,747][15401] Updated weights for policy 0, policy_version 638641 (0.0039) [2024-06-24 10:11:18,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10463510528. Throughput: 0: 42869.7. Samples: 10463669260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:18,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 10:11:22,323][15401] Updated weights for policy 0, policy_version 638651 (0.0032) [2024-06-24 10:11:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10463690752. Throughput: 0: 42667.9. Samples: 10463796880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 10:11:25,330][15401] Updated weights for policy 0, policy_version 638661 (0.0039) [2024-06-24 10:11:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 10463936512. Throughput: 0: 42880.9. Samples: 10464054040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 10:11:29,664][15401] Updated weights for policy 0, policy_version 638671 (0.0038) [2024-06-24 10:11:32,990][15401] Updated weights for policy 0, policy_version 638681 (0.0036) [2024-06-24 10:11:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10464149504. Throughput: 0: 42890.7. Samples: 10464311300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 10:11:37,578][15401] Updated weights for policy 0, policy_version 638691 (0.0028) [2024-06-24 10:11:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10464346112. Throughput: 0: 42786.6. Samples: 10464439580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 10:11:40,503][15401] Updated weights for policy 0, policy_version 638701 (0.0030) [2024-06-24 10:11:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 10464591872. Throughput: 0: 43026.1. Samples: 10464698780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 10:11:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638709_10464608256.pth... [2024-06-24 10:11:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638081_10454319104.pth [2024-06-24 10:11:45,197][15401] Updated weights for policy 0, policy_version 638711 (0.0040) [2024-06-24 10:11:48,122][15401] Updated weights for policy 0, policy_version 638721 (0.0032) [2024-06-24 10:11:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10464804864. Throughput: 0: 42883.1. Samples: 10464950500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:48,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 10:11:52,666][15401] Updated weights for policy 0, policy_version 638731 (0.0042) [2024-06-24 10:11:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10464985088. Throughput: 0: 42822.1. Samples: 10465078960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:53,395][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 10:11:55,749][15401] Updated weights for policy 0, policy_version 638741 (0.0034) [2024-06-24 10:11:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10465214464. Throughput: 0: 42949.4. Samples: 10465336700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:11:58,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-24 10:12:00,326][15401] Updated weights for policy 0, policy_version 638751 (0.0039) [2024-06-24 10:12:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10465443840. Throughput: 0: 42623.9. Samples: 10465587340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 10:12:04,098][15401] Updated weights for policy 0, policy_version 638761 (0.0042) [2024-06-24 10:12:07,871][15401] Updated weights for policy 0, policy_version 638771 (0.0030) [2024-06-24 10:12:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 10465640448. Throughput: 0: 42735.0. Samples: 10465719960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:12:11,912][15401] Updated weights for policy 0, policy_version 638781 (0.0046) [2024-06-24 10:12:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10465853440. Throughput: 0: 42685.3. Samples: 10465974880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 10:12:14,384][15349] Signal inference workers to stop experience collection... (154950 times) [2024-06-24 10:12:14,384][15349] Signal inference workers to resume experience collection... (154950 times) [2024-06-24 10:12:14,436][15401] InferenceWorker_p0-w0: stopping experience collection (154950 times) [2024-06-24 10:12:14,436][15401] InferenceWorker_p0-w0: resuming experience collection (154950 times) [2024-06-24 10:12:15,442][15401] Updated weights for policy 0, policy_version 638791 (0.0027) [2024-06-24 10:12:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10466066432. Throughput: 0: 42613.8. Samples: 10466228920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 10:12:19,678][15401] Updated weights for policy 0, policy_version 638801 (0.0039) [2024-06-24 10:12:23,035][15401] Updated weights for policy 0, policy_version 638811 (0.0038) [2024-06-24 10:12:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 10466295808. Throughput: 0: 42649.8. Samples: 10466358820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:23,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 10:12:27,167][15401] Updated weights for policy 0, policy_version 638821 (0.0042) [2024-06-24 10:12:28,393][15132] Fps is (10 sec: 40943.8, 60 sec: 42322.6, 300 sec: 42653.4). Total num frames: 10466476032. Throughput: 0: 42639.0. Samples: 10466617700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:28,394][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 10:12:30,583][15401] Updated weights for policy 0, policy_version 638831 (0.0030) [2024-06-24 10:12:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10466689024. Throughput: 0: 42740.4. Samples: 10466873820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 10:12:34,765][15401] Updated weights for policy 0, policy_version 638841 (0.0049) [2024-06-24 10:12:38,164][15401] Updated weights for policy 0, policy_version 638851 (0.0031) [2024-06-24 10:12:38,389][15132] Fps is (10 sec: 45893.6, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10466934784. Throughput: 0: 42819.6. Samples: 10467005840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 10:12:42,354][15401] Updated weights for policy 0, policy_version 638861 (0.0043) [2024-06-24 10:12:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10467115008. Throughput: 0: 42664.4. Samples: 10467256600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:43,396][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 10:12:45,832][15401] Updated weights for policy 0, policy_version 638871 (0.0043) [2024-06-24 10:12:48,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 10467344384. Throughput: 0: 42874.0. Samples: 10467516940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-24 10:12:48,396][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 10:12:50,065][15401] Updated weights for policy 0, policy_version 638881 (0.0037) [2024-06-24 10:12:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10467573760. Throughput: 0: 42843.3. Samples: 10467647900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:12:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 10:12:53,508][15401] Updated weights for policy 0, policy_version 638891 (0.0040) [2024-06-24 10:12:57,895][15401] Updated weights for policy 0, policy_version 638901 (0.0035) [2024-06-24 10:12:58,392][15132] Fps is (10 sec: 42615.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 10467770368. Throughput: 0: 42797.7. Samples: 10467900880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:12:58,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 10:13:01,145][15401] Updated weights for policy 0, policy_version 638911 (0.0027) [2024-06-24 10:13:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10467999744. Throughput: 0: 42859.1. Samples: 10468157580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 10:13:05,535][15401] Updated weights for policy 0, policy_version 638921 (0.0025) [2024-06-24 10:13:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10468212736. Throughput: 0: 42883.7. Samples: 10468288580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 10:13:08,889][15401] Updated weights for policy 0, policy_version 638931 (0.0032) [2024-06-24 10:13:13,211][15401] Updated weights for policy 0, policy_version 638941 (0.0032) [2024-06-24 10:13:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10468409344. Throughput: 0: 42750.4. Samples: 10468541300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 10:13:16,826][15401] Updated weights for policy 0, policy_version 638951 (0.0027) [2024-06-24 10:13:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10468638720. Throughput: 0: 42733.2. Samples: 10468796820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 10:13:20,875][15401] Updated weights for policy 0, policy_version 638961 (0.0023) [2024-06-24 10:13:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 10468851712. Throughput: 0: 42741.6. Samples: 10468929320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:23,393][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 10:13:24,274][15401] Updated weights for policy 0, policy_version 638971 (0.0034) [2024-06-24 10:13:28,353][15401] Updated weights for policy 0, policy_version 638981 (0.0034) [2024-06-24 10:13:28,392][15132] Fps is (10 sec: 42589.5, 60 sec: 43145.8, 300 sec: 42820.2). Total num frames: 10469064704. Throughput: 0: 42813.0. Samples: 10469183280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:28,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 10:13:32,060][15401] Updated weights for policy 0, policy_version 638991 (0.0037) [2024-06-24 10:13:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 10469294080. Throughput: 0: 42600.6. Samples: 10469433700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 10:13:35,778][15401] Updated weights for policy 0, policy_version 639001 (0.0029) [2024-06-24 10:13:38,390][15132] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42765.6). Total num frames: 10469490688. Throughput: 0: 42583.8. Samples: 10469564180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 10:13:39,849][15401] Updated weights for policy 0, policy_version 639011 (0.0046) [2024-06-24 10:13:40,596][15349] Signal inference workers to stop experience collection... (155000 times) [2024-06-24 10:13:40,642][15401] InferenceWorker_p0-w0: stopping experience collection (155000 times) [2024-06-24 10:13:40,651][15349] Signal inference workers to resume experience collection... (155000 times) [2024-06-24 10:13:40,659][15401] InferenceWorker_p0-w0: resuming experience collection (155000 times) [2024-06-24 10:13:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10469703680. Throughput: 0: 42769.0. Samples: 10469825380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 10:13:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639020_10469703680.pth... [2024-06-24 10:13:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638391_10459398144.pth [2024-06-24 10:13:43,810][15401] Updated weights for policy 0, policy_version 639021 (0.0035) [2024-06-24 10:13:47,270][15401] Updated weights for policy 0, policy_version 639031 (0.0028) [2024-06-24 10:13:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 10469933056. Throughput: 0: 42606.3. Samples: 10470074860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 10:13:51,578][15401] Updated weights for policy 0, policy_version 639041 (0.0044) [2024-06-24 10:13:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10470113280. Throughput: 0: 42570.6. Samples: 10470204260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 10:13:54,993][15401] Updated weights for policy 0, policy_version 639051 (0.0043) [2024-06-24 10:13:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 10470326272. Throughput: 0: 42608.5. Samples: 10470458680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:13:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 10:13:59,217][15401] Updated weights for policy 0, policy_version 639061 (0.0028) [2024-06-24 10:14:02,797][15401] Updated weights for policy 0, policy_version 639071 (0.0035) [2024-06-24 10:14:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 10470572032. Throughput: 0: 42479.2. Samples: 10470708380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:14:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 10:14:07,100][15401] Updated weights for policy 0, policy_version 639081 (0.0028) [2024-06-24 10:14:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10470768640. Throughput: 0: 42556.6. Samples: 10470844260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:14:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 10:14:10,568][15401] Updated weights for policy 0, policy_version 639091 (0.0042) [2024-06-24 10:14:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10470948864. Throughput: 0: 42407.0. Samples: 10471091500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:14:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 10:14:14,711][15401] Updated weights for policy 0, policy_version 639101 (0.0033) [2024-06-24 10:14:18,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 10471178240. Throughput: 0: 42689.6. Samples: 10471354740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:14:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 10:14:18,541][15401] Updated weights for policy 0, policy_version 639111 (0.0045) [2024-06-24 10:14:22,450][15401] Updated weights for policy 0, policy_version 639121 (0.0038) [2024-06-24 10:14:23,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 10471424000. Throughput: 0: 42648.5. Samples: 10471483360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 10:14:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 10:14:26,323][15401] Updated weights for policy 0, policy_version 639131 (0.0032) [2024-06-24 10:14:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42326.9, 300 sec: 42654.1). Total num frames: 10471604224. Throughput: 0: 42213.3. Samples: 10471724980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 10:14:30,250][15401] Updated weights for policy 0, policy_version 639141 (0.0035) [2024-06-24 10:14:33,392][15132] Fps is (10 sec: 37674.3, 60 sec: 41777.6, 300 sec: 42542.5). Total num frames: 10471800832. Throughput: 0: 42512.3. Samples: 10471988020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:33,392][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 10:14:33,903][15401] Updated weights for policy 0, policy_version 639151 (0.0030) [2024-06-24 10:14:37,912][15401] Updated weights for policy 0, policy_version 639161 (0.0036) [2024-06-24 10:14:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 10472046592. Throughput: 0: 42470.2. Samples: 10472115520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:38,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 10:14:41,297][15401] Updated weights for policy 0, policy_version 639171 (0.0021) [2024-06-24 10:14:43,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 10472259584. Throughput: 0: 42619.9. Samples: 10472376580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:43,396][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 10:14:45,538][15401] Updated weights for policy 0, policy_version 639181 (0.0028) [2024-06-24 10:14:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10472472576. Throughput: 0: 42811.1. Samples: 10472634880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:48,390][15132] Avg episode reward: [(0, '0.112')] [2024-06-24 10:14:49,080][15401] Updated weights for policy 0, policy_version 639191 (0.0027) [2024-06-24 10:14:51,428][15349] Signal inference workers to stop experience collection... (155050 times) [2024-06-24 10:14:51,428][15349] Signal inference workers to resume experience collection... (155050 times) [2024-06-24 10:14:51,447][15401] InferenceWorker_p0-w0: stopping experience collection (155050 times) [2024-06-24 10:14:51,447][15401] InferenceWorker_p0-w0: resuming experience collection (155050 times) [2024-06-24 10:14:52,948][15401] Updated weights for policy 0, policy_version 639201 (0.0034) [2024-06-24 10:14:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10472685568. Throughput: 0: 42588.0. Samples: 10472760720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 10:14:56,962][15401] Updated weights for policy 0, policy_version 639211 (0.0025) [2024-06-24 10:14:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 10472914944. Throughput: 0: 42896.9. Samples: 10473021860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:14:58,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 10:15:00,550][15401] Updated weights for policy 0, policy_version 639221 (0.0027) [2024-06-24 10:15:03,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41777.5, 300 sec: 42542.5). Total num frames: 10473078784. Throughput: 0: 42749.0. Samples: 10473278540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:03,393][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 10:15:04,808][15401] Updated weights for policy 0, policy_version 639231 (0.0038) [2024-06-24 10:15:08,088][15401] Updated weights for policy 0, policy_version 639241 (0.0026) [2024-06-24 10:15:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10473324544. Throughput: 0: 42499.2. Samples: 10473395820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 10:15:12,444][15401] Updated weights for policy 0, policy_version 639251 (0.0035) [2024-06-24 10:15:13,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10473553920. Throughput: 0: 43126.2. Samples: 10473665660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 10:15:15,951][15401] Updated weights for policy 0, policy_version 639261 (0.0026) [2024-06-24 10:15:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10473734144. Throughput: 0: 42863.1. Samples: 10473916760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 10:15:20,315][15401] Updated weights for policy 0, policy_version 639271 (0.0032) [2024-06-24 10:15:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 10473947136. Throughput: 0: 42707.6. Samples: 10474037260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 10:15:23,619][15401] Updated weights for policy 0, policy_version 639281 (0.0027) [2024-06-24 10:15:27,863][15401] Updated weights for policy 0, policy_version 639291 (0.0035) [2024-06-24 10:15:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10474176512. Throughput: 0: 42700.2. Samples: 10474298080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 10:15:31,441][15401] Updated weights for policy 0, policy_version 639301 (0.0038) [2024-06-24 10:15:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10474389504. Throughput: 0: 42478.4. Samples: 10474546400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:33,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 10:15:35,487][15401] Updated weights for policy 0, policy_version 639311 (0.0030) [2024-06-24 10:15:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 10474586112. Throughput: 0: 42652.5. Samples: 10474680080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 10:15:38,891][15401] Updated weights for policy 0, policy_version 639321 (0.0037) [2024-06-24 10:15:43,034][15401] Updated weights for policy 0, policy_version 639331 (0.0032) [2024-06-24 10:15:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10474815488. Throughput: 0: 42681.8. Samples: 10474942540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 10:15:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639333_10474831872.pth... [2024-06-24 10:15:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000638709_10464608256.pth [2024-06-24 10:15:46,525][15401] Updated weights for policy 0, policy_version 639341 (0.0041) [2024-06-24 10:15:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10475044864. Throughput: 0: 42496.5. Samples: 10475190780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 10:15:50,922][15401] Updated weights for policy 0, policy_version 639351 (0.0041) [2024-06-24 10:15:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10475225088. Throughput: 0: 42739.6. Samples: 10475319100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 10:15:54,221][15401] Updated weights for policy 0, policy_version 639361 (0.0042) [2024-06-24 10:15:54,510][15349] Signal inference workers to stop experience collection... (155100 times) [2024-06-24 10:15:54,511][15349] Signal inference workers to resume experience collection... (155100 times) [2024-06-24 10:15:54,564][15401] InferenceWorker_p0-w0: stopping experience collection (155100 times) [2024-06-24 10:15:54,564][15401] InferenceWorker_p0-w0: resuming experience collection (155100 times) [2024-06-24 10:15:58,292][15401] Updated weights for policy 0, policy_version 639371 (0.0031) [2024-06-24 10:15:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10475454464. Throughput: 0: 42381.3. Samples: 10475572820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-24 10:15:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 10:16:02,059][15401] Updated weights for policy 0, policy_version 639381 (0.0033) [2024-06-24 10:16:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 10475683840. Throughput: 0: 42390.2. Samples: 10475824320. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 10:16:06,327][15401] Updated weights for policy 0, policy_version 639391 (0.0029) [2024-06-24 10:16:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 10475847680. Throughput: 0: 42692.8. Samples: 10475958440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:08,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 10:16:09,731][15401] Updated weights for policy 0, policy_version 639401 (0.0035) [2024-06-24 10:16:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10476093440. Throughput: 0: 42452.3. Samples: 10476208440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 10:16:13,795][15401] Updated weights for policy 0, policy_version 639411 (0.0042) [2024-06-24 10:16:17,365][15401] Updated weights for policy 0, policy_version 639421 (0.0035) [2024-06-24 10:16:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10476322816. Throughput: 0: 42755.0. Samples: 10476470380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 10:16:21,445][15401] Updated weights for policy 0, policy_version 639431 (0.0033) [2024-06-24 10:16:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10476503040. Throughput: 0: 42680.4. Samples: 10476600700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:16:25,110][15401] Updated weights for policy 0, policy_version 639441 (0.0039) [2024-06-24 10:16:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10476732416. Throughput: 0: 42356.9. Samples: 10476848600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:28,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-24 10:16:29,170][15401] Updated weights for policy 0, policy_version 639451 (0.0046) [2024-06-24 10:16:32,686][15401] Updated weights for policy 0, policy_version 639461 (0.0030) [2024-06-24 10:16:33,394][15132] Fps is (10 sec: 45852.1, 60 sec: 42867.9, 300 sec: 42764.3). Total num frames: 10476961792. Throughput: 0: 42502.8. Samples: 10477103620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:33,395][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 10:16:36,881][15401] Updated weights for policy 0, policy_version 639471 (0.0031) [2024-06-24 10:16:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10477125632. Throughput: 0: 42542.3. Samples: 10477233500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 10:16:40,495][15401] Updated weights for policy 0, policy_version 639481 (0.0039) [2024-06-24 10:16:43,390][15132] Fps is (10 sec: 40980.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10477371392. Throughput: 0: 42522.2. Samples: 10477486320. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 10:16:44,455][15401] Updated weights for policy 0, policy_version 639491 (0.0025) [2024-06-24 10:16:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10477568000. Throughput: 0: 42517.9. Samples: 10477737620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 10:16:48,481][15401] Updated weights for policy 0, policy_version 639501 (0.0041) [2024-06-24 10:16:52,587][15401] Updated weights for policy 0, policy_version 639511 (0.0031) [2024-06-24 10:16:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10477780992. Throughput: 0: 42332.3. Samples: 10477863400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:53,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 10:16:56,547][15401] Updated weights for policy 0, policy_version 639521 (0.0040) [2024-06-24 10:16:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10478026752. Throughput: 0: 42499.2. Samples: 10478120900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:16:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 10:17:00,319][15401] Updated weights for policy 0, policy_version 639531 (0.0029) [2024-06-24 10:17:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10478206976. Throughput: 0: 42286.3. Samples: 10478373260. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 10:17:04,141][15401] Updated weights for policy 0, policy_version 639541 (0.0036) [2024-06-24 10:17:07,904][15401] Updated weights for policy 0, policy_version 639551 (0.0036) [2024-06-24 10:17:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10478419968. Throughput: 0: 42200.8. Samples: 10478499740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 10:17:11,772][15401] Updated weights for policy 0, policy_version 639561 (0.0034) [2024-06-24 10:17:11,987][15349] Signal inference workers to stop experience collection... (155150 times) [2024-06-24 10:17:11,989][15349] Signal inference workers to resume experience collection... (155150 times) [2024-06-24 10:17:12,027][15401] InferenceWorker_p0-w0: stopping experience collection (155150 times) [2024-06-24 10:17:12,027][15401] InferenceWorker_p0-w0: resuming experience collection (155150 times) [2024-06-24 10:17:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10478632960. Throughput: 0: 42229.7. Samples: 10478748940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 10:17:15,503][15401] Updated weights for policy 0, policy_version 639571 (0.0039) [2024-06-24 10:17:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10478845952. Throughput: 0: 42447.8. Samples: 10479013560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 10:17:19,405][15401] Updated weights for policy 0, policy_version 639581 (0.0038) [2024-06-24 10:17:23,273][15401] Updated weights for policy 0, policy_version 639591 (0.0028) [2024-06-24 10:17:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 10479058944. Throughput: 0: 42387.5. Samples: 10479140940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 10:17:26,931][15401] Updated weights for policy 0, policy_version 639601 (0.0033) [2024-06-24 10:17:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10479288320. Throughput: 0: 42372.4. Samples: 10479393080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 10:17:30,728][15401] Updated weights for policy 0, policy_version 639611 (0.0040) [2024-06-24 10:17:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42055.8, 300 sec: 42542.9). Total num frames: 10479484928. Throughput: 0: 42639.6. Samples: 10479656400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 10:17:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 10:17:34,478][15401] Updated weights for policy 0, policy_version 639621 (0.0038) [2024-06-24 10:17:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10479697920. Throughput: 0: 42574.8. Samples: 10479779260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:17:38,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 10:17:38,423][15401] Updated weights for policy 0, policy_version 639631 (0.0029) [2024-06-24 10:17:42,136][15401] Updated weights for policy 0, policy_version 639641 (0.0027) [2024-06-24 10:17:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 10479927296. Throughput: 0: 42492.4. Samples: 10480033060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:17:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 10:17:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639644_10479927296.pth... [2024-06-24 10:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639020_10469703680.pth [2024-06-24 10:17:46,279][15401] Updated weights for policy 0, policy_version 639651 (0.0029) [2024-06-24 10:17:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10480107520. Throughput: 0: 42616.8. Samples: 10480291020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:17:48,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 10:17:49,658][15401] Updated weights for policy 0, policy_version 639661 (0.0035) [2024-06-24 10:17:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 10480320512. Throughput: 0: 42487.2. Samples: 10480411660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:17:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 10:17:54,245][15401] Updated weights for policy 0, policy_version 639671 (0.0033) [2024-06-24 10:17:57,715][15401] Updated weights for policy 0, policy_version 639681 (0.0030) [2024-06-24 10:17:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10480549888. Throughput: 0: 42585.3. Samples: 10480665280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:17:58,390][15132] Avg episode reward: [(0, '0.148')] [2024-06-24 10:18:01,958][15401] Updated weights for policy 0, policy_version 639691 (0.0034) [2024-06-24 10:18:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10480762880. Throughput: 0: 42431.0. Samples: 10480922960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:03,392][15132] Avg episode reward: [(0, '0.148')] [2024-06-24 10:18:05,317][15401] Updated weights for policy 0, policy_version 639701 (0.0041) [2024-06-24 10:18:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10480975872. Throughput: 0: 42387.6. Samples: 10481048380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 10:18:09,508][15401] Updated weights for policy 0, policy_version 639711 (0.0033) [2024-06-24 10:18:13,054][15401] Updated weights for policy 0, policy_version 639721 (0.0039) [2024-06-24 10:18:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10481188864. Throughput: 0: 42475.6. Samples: 10481304480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 10:18:17,121][15401] Updated weights for policy 0, policy_version 639731 (0.0035) [2024-06-24 10:18:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 10481369088. Throughput: 0: 42327.6. Samples: 10481561140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:18,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 10:18:20,598][15401] Updated weights for policy 0, policy_version 639741 (0.0038) [2024-06-24 10:18:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 10481598464. Throughput: 0: 42317.2. Samples: 10481683540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 10:18:23,471][15349] Signal inference workers to stop experience collection... (155200 times) [2024-06-24 10:18:23,525][15401] InferenceWorker_p0-w0: stopping experience collection (155200 times) [2024-06-24 10:18:23,525][15349] Signal inference workers to resume experience collection... (155200 times) [2024-06-24 10:18:23,547][15401] InferenceWorker_p0-w0: resuming experience collection (155200 times) [2024-06-24 10:18:24,720][15401] Updated weights for policy 0, policy_version 639751 (0.0023) [2024-06-24 10:18:28,267][15401] Updated weights for policy 0, policy_version 639761 (0.0030) [2024-06-24 10:18:28,391][15132] Fps is (10 sec: 47506.6, 60 sec: 42597.5, 300 sec: 42542.7). Total num frames: 10481844224. Throughput: 0: 42431.3. Samples: 10481942520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:28,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 10:18:32,965][15401] Updated weights for policy 0, policy_version 639771 (0.0043) [2024-06-24 10:18:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 10482008064. Throughput: 0: 42268.1. Samples: 10482193080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 10:18:36,172][15401] Updated weights for policy 0, policy_version 639781 (0.0032) [2024-06-24 10:18:38,390][15132] Fps is (10 sec: 39326.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10482237440. Throughput: 0: 42233.7. Samples: 10482312180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 10:18:40,750][15401] Updated weights for policy 0, policy_version 639791 (0.0052) [2024-06-24 10:18:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 10482450432. Throughput: 0: 42451.7. Samples: 10482575600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 10:18:44,038][15401] Updated weights for policy 0, policy_version 639801 (0.0028) [2024-06-24 10:18:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10482647040. Throughput: 0: 42185.4. Samples: 10482821300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:48,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 10:18:48,471][15401] Updated weights for policy 0, policy_version 639811 (0.0030) [2024-06-24 10:18:51,841][15401] Updated weights for policy 0, policy_version 639821 (0.0031) [2024-06-24 10:18:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10482860032. Throughput: 0: 42108.0. Samples: 10482943240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 10:18:56,054][15401] Updated weights for policy 0, policy_version 639831 (0.0029) [2024-06-24 10:18:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 10483089408. Throughput: 0: 42293.4. Samples: 10483207680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:18:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 10:18:59,792][15401] Updated weights for policy 0, policy_version 639841 (0.0026) [2024-06-24 10:19:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10483302400. Throughput: 0: 42210.5. Samples: 10483460620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:19:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 10:19:03,544][15401] Updated weights for policy 0, policy_version 639851 (0.0026) [2024-06-24 10:19:07,626][15401] Updated weights for policy 0, policy_version 639861 (0.0029) [2024-06-24 10:19:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 10483499008. Throughput: 0: 42253.3. Samples: 10483584940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:19:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 10:19:11,636][15401] Updated weights for policy 0, policy_version 639871 (0.0028) [2024-06-24 10:19:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 10483712000. Throughput: 0: 42265.3. Samples: 10483844400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 10:19:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 10:19:15,205][15401] Updated weights for policy 0, policy_version 639881 (0.0037) [2024-06-24 10:19:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 10483924992. Throughput: 0: 42293.8. Samples: 10484096300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 10:19:19,218][15401] Updated weights for policy 0, policy_version 639891 (0.0035) [2024-06-24 10:19:22,797][15401] Updated weights for policy 0, policy_version 639901 (0.0033) [2024-06-24 10:19:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 10484154368. Throughput: 0: 42544.0. Samples: 10484226660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 10:19:26,978][15401] Updated weights for policy 0, policy_version 639911 (0.0038) [2024-06-24 10:19:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41507.1, 300 sec: 42487.7). Total num frames: 10484334592. Throughput: 0: 42333.2. Samples: 10484480600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:28,395][15132] Avg episode reward: [(0, '0.241')] [2024-06-24 10:19:30,456][15401] Updated weights for policy 0, policy_version 639921 (0.0026) [2024-06-24 10:19:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 10484563968. Throughput: 0: 42606.7. Samples: 10484738600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:33,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-24 10:19:34,562][15401] Updated weights for policy 0, policy_version 639931 (0.0028) [2024-06-24 10:19:38,198][15401] Updated weights for policy 0, policy_version 639941 (0.0048) [2024-06-24 10:19:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10484793344. Throughput: 0: 42697.3. Samples: 10484864620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:38,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 10:19:42,256][15401] Updated weights for policy 0, policy_version 639951 (0.0030) [2024-06-24 10:19:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10484989952. Throughput: 0: 42548.4. Samples: 10485122360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 10:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639953_10484989952.pth... [2024-06-24 10:19:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639333_10474831872.pth [2024-06-24 10:19:45,740][15401] Updated weights for policy 0, policy_version 639961 (0.0022) [2024-06-24 10:19:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10485202944. Throughput: 0: 42779.2. Samples: 10485385680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 10:19:49,654][15401] Updated weights for policy 0, policy_version 639971 (0.0040) [2024-06-24 10:19:53,354][15401] Updated weights for policy 0, policy_version 639981 (0.0041) [2024-06-24 10:19:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 10485448704. Throughput: 0: 42841.4. Samples: 10485512800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:53,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 10:19:57,549][15401] Updated weights for policy 0, policy_version 639991 (0.0035) [2024-06-24 10:19:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 10485645312. Throughput: 0: 42662.1. Samples: 10485764200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:19:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:20:01,386][15401] Updated weights for policy 0, policy_version 640001 (0.0030) [2024-06-24 10:20:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10485841920. Throughput: 0: 42897.7. Samples: 10486026700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 10:20:05,163][15401] Updated weights for policy 0, policy_version 640011 (0.0041) [2024-06-24 10:20:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 10486071296. Throughput: 0: 42691.7. Samples: 10486147780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:08,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 10:20:09,265][15401] Updated weights for policy 0, policy_version 640021 (0.0032) [2024-06-24 10:20:12,653][15401] Updated weights for policy 0, policy_version 640031 (0.0040) [2024-06-24 10:20:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10486300672. Throughput: 0: 42946.3. Samples: 10486413180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:20:16,811][15401] Updated weights for policy 0, policy_version 640041 (0.0024) [2024-06-24 10:20:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 10486497280. Throughput: 0: 42913.7. Samples: 10486669720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:18,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 10:20:20,383][15401] Updated weights for policy 0, policy_version 640051 (0.0030) [2024-06-24 10:20:21,484][15349] Signal inference workers to stop experience collection... (155250 times) [2024-06-24 10:20:21,484][15349] Signal inference workers to resume experience collection... (155250 times) [2024-06-24 10:20:21,527][15401] InferenceWorker_p0-w0: stopping experience collection (155250 times) [2024-06-24 10:20:21,527][15401] InferenceWorker_p0-w0: resuming experience collection (155250 times) [2024-06-24 10:20:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 10486726656. Throughput: 0: 42894.7. Samples: 10486794880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 10:20:24,309][15401] Updated weights for policy 0, policy_version 640061 (0.0039) [2024-06-24 10:20:27,911][15401] Updated weights for policy 0, policy_version 640071 (0.0036) [2024-06-24 10:20:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 10486939648. Throughput: 0: 43041.8. Samples: 10487059240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 10:20:31,916][15401] Updated weights for policy 0, policy_version 640081 (0.0038) [2024-06-24 10:20:33,391][15132] Fps is (10 sec: 42592.5, 60 sec: 43143.6, 300 sec: 42598.2). Total num frames: 10487152640. Throughput: 0: 42992.5. Samples: 10487320400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:33,391][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 10:20:35,402][15401] Updated weights for policy 0, policy_version 640091 (0.0042) [2024-06-24 10:20:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 10487365632. Throughput: 0: 42908.4. Samples: 10487443680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 10:20:39,786][15401] Updated weights for policy 0, policy_version 640101 (0.0048) [2024-06-24 10:20:43,000][15401] Updated weights for policy 0, policy_version 640111 (0.0036) [2024-06-24 10:20:43,390][15132] Fps is (10 sec: 42603.7, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 10487578624. Throughput: 0: 43021.9. Samples: 10487700180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 10:20:47,497][15401] Updated weights for policy 0, policy_version 640121 (0.0028) [2024-06-24 10:20:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 10487775232. Throughput: 0: 42861.8. Samples: 10487955480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:20:48,394][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 10:20:50,609][15401] Updated weights for policy 0, policy_version 640131 (0.0033) [2024-06-24 10:20:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10488020992. Throughput: 0: 43008.8. Samples: 10488083180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:20:53,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 10:20:54,980][15401] Updated weights for policy 0, policy_version 640141 (0.0036) [2024-06-24 10:20:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 10488217600. Throughput: 0: 42934.6. Samples: 10488345240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:20:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 10:20:58,619][15401] Updated weights for policy 0, policy_version 640151 (0.0048) [2024-06-24 10:21:02,834][15401] Updated weights for policy 0, policy_version 640161 (0.0033) [2024-06-24 10:21:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10488430592. Throughput: 0: 42936.4. Samples: 10488601860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 10:21:06,285][15401] Updated weights for policy 0, policy_version 640171 (0.0033) [2024-06-24 10:21:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 10488676352. Throughput: 0: 42994.1. Samples: 10488729620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:08,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 10:21:10,207][15401] Updated weights for policy 0, policy_version 640181 (0.0036) [2024-06-24 10:21:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 10488856576. Throughput: 0: 42947.4. Samples: 10488991880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:13,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 10:21:13,819][15401] Updated weights for policy 0, policy_version 640191 (0.0043) [2024-06-24 10:21:17,753][15401] Updated weights for policy 0, policy_version 640201 (0.0031) [2024-06-24 10:21:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10489069568. Throughput: 0: 42934.1. Samples: 10489252380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 10:21:21,268][15401] Updated weights for policy 0, policy_version 640211 (0.0033) [2024-06-24 10:21:23,389][15132] Fps is (10 sec: 47514.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10489331712. Throughput: 0: 42957.1. Samples: 10489376740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:23,391][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 10:21:25,502][15401] Updated weights for policy 0, policy_version 640221 (0.0038) [2024-06-24 10:21:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42488.0). Total num frames: 10489495552. Throughput: 0: 42864.5. Samples: 10489629080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 10:21:29,340][15401] Updated weights for policy 0, policy_version 640231 (0.0039) [2024-06-24 10:21:33,390][15132] Fps is (10 sec: 36044.2, 60 sec: 42326.2, 300 sec: 42598.4). Total num frames: 10489692160. Throughput: 0: 42820.4. Samples: 10489882400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 10:21:33,531][15401] Updated weights for policy 0, policy_version 640241 (0.0036) [2024-06-24 10:21:37,102][15401] Updated weights for policy 0, policy_version 640251 (0.0050) [2024-06-24 10:21:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10489954304. Throughput: 0: 42803.6. Samples: 10490009340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 10:21:41,190][15401] Updated weights for policy 0, policy_version 640261 (0.0038) [2024-06-24 10:21:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 10490118144. Throughput: 0: 42735.1. Samples: 10490268420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:43,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 10:21:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640267_10490134528.pth... [2024-06-24 10:21:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639644_10479927296.pth [2024-06-24 10:21:44,720][15401] Updated weights for policy 0, policy_version 640271 (0.0048) [2024-06-24 10:21:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10490347520. Throughput: 0: 42682.3. Samples: 10490522560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 10:21:48,817][15401] Updated weights for policy 0, policy_version 640281 (0.0042) [2024-06-24 10:21:52,478][15401] Updated weights for policy 0, policy_version 640291 (0.0042) [2024-06-24 10:21:53,389][15132] Fps is (10 sec: 47525.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10490593280. Throughput: 0: 42628.2. Samples: 10490647880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 10:21:56,393][15401] Updated weights for policy 0, policy_version 640301 (0.0029) [2024-06-24 10:21:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10490773504. Throughput: 0: 42538.3. Samples: 10490906100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:21:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 10:22:00,101][15349] Signal inference workers to stop experience collection... (155300 times) [2024-06-24 10:22:00,103][15349] Signal inference workers to resume experience collection... (155300 times) [2024-06-24 10:22:00,114][15401] Updated weights for policy 0, policy_version 640311 (0.0039) [2024-06-24 10:22:00,137][15401] InferenceWorker_p0-w0: stopping experience collection (155300 times) [2024-06-24 10:22:00,137][15401] InferenceWorker_p0-w0: resuming experience collection (155300 times) [2024-06-24 10:22:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10491002880. Throughput: 0: 42258.7. Samples: 10491154020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:22:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 10:22:04,001][15401] Updated weights for policy 0, policy_version 640321 (0.0023) [2024-06-24 10:22:07,853][15401] Updated weights for policy 0, policy_version 640331 (0.0043) [2024-06-24 10:22:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10491215872. Throughput: 0: 42483.4. Samples: 10491288500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:22:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 10:22:11,971][15401] Updated weights for policy 0, policy_version 640341 (0.0034) [2024-06-24 10:22:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 10491412480. Throughput: 0: 42466.2. Samples: 10491540160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:22:13,393][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 10:22:15,534][15401] Updated weights for policy 0, policy_version 640351 (0.0034) [2024-06-24 10:22:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10491641856. Throughput: 0: 42215.5. Samples: 10491782100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:22:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 10:22:19,573][15401] Updated weights for policy 0, policy_version 640361 (0.0035) [2024-06-24 10:22:23,246][15401] Updated weights for policy 0, policy_version 640371 (0.0031) [2024-06-24 10:22:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 10491838464. Throughput: 0: 42437.4. Samples: 10491919020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-24 10:22:23,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 10:22:27,326][15401] Updated weights for policy 0, policy_version 640381 (0.0031) [2024-06-24 10:22:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10492035072. Throughput: 0: 42381.9. Samples: 10492175500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 10:22:30,966][15401] Updated weights for policy 0, policy_version 640391 (0.0038) [2024-06-24 10:22:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10492280832. Throughput: 0: 42172.8. Samples: 10492420340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:33,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-24 10:22:35,016][15401] Updated weights for policy 0, policy_version 640401 (0.0039) [2024-06-24 10:22:38,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 10492477440. Throughput: 0: 42417.6. Samples: 10492556780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:38,393][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 10:22:38,882][15401] Updated weights for policy 0, policy_version 640411 (0.0040) [2024-06-24 10:22:42,621][15401] Updated weights for policy 0, policy_version 640421 (0.0028) [2024-06-24 10:22:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 10492674048. Throughput: 0: 42185.4. Samples: 10492804440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 10:22:46,722][15401] Updated weights for policy 0, policy_version 640431 (0.0040) [2024-06-24 10:22:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10492903424. Throughput: 0: 42188.0. Samples: 10493052480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 10:22:50,759][15401] Updated weights for policy 0, policy_version 640441 (0.0032) [2024-06-24 10:22:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 10493100032. Throughput: 0: 42250.3. Samples: 10493189760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 10:22:54,341][15401] Updated weights for policy 0, policy_version 640451 (0.0040) [2024-06-24 10:22:58,340][15401] Updated weights for policy 0, policy_version 640461 (0.0040) [2024-06-24 10:22:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10493313024. Throughput: 0: 42253.4. Samples: 10493441460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:22:58,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 10:23:01,979][15401] Updated weights for policy 0, policy_version 640471 (0.0039) [2024-06-24 10:23:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10493558784. Throughput: 0: 42455.2. Samples: 10493692580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 10:23:05,855][15401] Updated weights for policy 0, policy_version 640481 (0.0036) [2024-06-24 10:23:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10493739008. Throughput: 0: 42384.0. Samples: 10493826300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 10:23:09,817][15401] Updated weights for policy 0, policy_version 640491 (0.0038) [2024-06-24 10:23:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 10493952000. Throughput: 0: 42266.0. Samples: 10494077480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 10:23:13,584][15401] Updated weights for policy 0, policy_version 640501 (0.0042) [2024-06-24 10:23:17,561][15401] Updated weights for policy 0, policy_version 640511 (0.0032) [2024-06-24 10:23:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10494181376. Throughput: 0: 42463.1. Samples: 10494331180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:18,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 10:23:21,110][15401] Updated weights for policy 0, policy_version 640521 (0.0023) [2024-06-24 10:23:23,394][15132] Fps is (10 sec: 44215.4, 60 sec: 42594.8, 300 sec: 42542.3). Total num frames: 10494394368. Throughput: 0: 42319.8. Samples: 10494461280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:23,395][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 10:23:25,017][15401] Updated weights for policy 0, policy_version 640531 (0.0050) [2024-06-24 10:23:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10494607360. Throughput: 0: 42573.2. Samples: 10494720240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:28,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 10:23:28,709][15401] Updated weights for policy 0, policy_version 640541 (0.0042) [2024-06-24 10:23:32,652][15401] Updated weights for policy 0, policy_version 640551 (0.0043) [2024-06-24 10:23:33,390][15132] Fps is (10 sec: 40980.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10494803968. Throughput: 0: 42883.5. Samples: 10494982240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 10:23:34,389][15349] Signal inference workers to stop experience collection... (155350 times) [2024-06-24 10:23:34,435][15401] InferenceWorker_p0-w0: stopping experience collection (155350 times) [2024-06-24 10:23:34,441][15349] Signal inference workers to resume experience collection... (155350 times) [2024-06-24 10:23:34,446][15401] InferenceWorker_p0-w0: resuming experience collection (155350 times) [2024-06-24 10:23:36,145][15401] Updated weights for policy 0, policy_version 640561 (0.0037) [2024-06-24 10:23:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 10495016960. Throughput: 0: 42628.8. Samples: 10495108060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 10:23:40,241][15401] Updated weights for policy 0, policy_version 640571 (0.0044) [2024-06-24 10:23:43,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 10495262720. Throughput: 0: 42797.9. Samples: 10495367640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:43,397][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 10:23:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640580_10495262720.pth... [2024-06-24 10:23:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000639953_10484989952.pth [2024-06-24 10:23:43,968][15401] Updated weights for policy 0, policy_version 640581 (0.0033) [2024-06-24 10:23:47,807][15401] Updated weights for policy 0, policy_version 640591 (0.0030) [2024-06-24 10:23:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10495475712. Throughput: 0: 43048.5. Samples: 10495629760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 10:23:51,762][15401] Updated weights for policy 0, policy_version 640601 (0.0031) [2024-06-24 10:23:53,390][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10495672320. Throughput: 0: 42857.3. Samples: 10495754880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 10:23:55,522][15401] Updated weights for policy 0, policy_version 640611 (0.0042) [2024-06-24 10:23:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10495918080. Throughput: 0: 43035.3. Samples: 10496014060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 10:23:58,396][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 10:23:59,158][15401] Updated weights for policy 0, policy_version 640621 (0.0036) [2024-06-24 10:24:03,003][15401] Updated weights for policy 0, policy_version 640631 (0.0026) [2024-06-24 10:24:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10496098304. Throughput: 0: 43170.7. Samples: 10496273860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 10:24:07,213][15401] Updated weights for policy 0, policy_version 640641 (0.0025) [2024-06-24 10:24:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10496294912. Throughput: 0: 43031.4. Samples: 10496397480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:08,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-24 10:24:10,477][15401] Updated weights for policy 0, policy_version 640651 (0.0023) [2024-06-24 10:24:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 10496557056. Throughput: 0: 43171.9. Samples: 10496662980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 10:24:14,773][15401] Updated weights for policy 0, policy_version 640661 (0.0039) [2024-06-24 10:24:18,227][15401] Updated weights for policy 0, policy_version 640671 (0.0024) [2024-06-24 10:24:18,393][15132] Fps is (10 sec: 45857.7, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 10496753664. Throughput: 0: 42937.2. Samples: 10496914580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:18,394][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 10:24:22,398][15401] Updated weights for policy 0, policy_version 640681 (0.0032) [2024-06-24 10:24:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42602.0, 300 sec: 42765.0). Total num frames: 10496950272. Throughput: 0: 42952.9. Samples: 10497040940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 10:24:25,672][15401] Updated weights for policy 0, policy_version 640691 (0.0048) [2024-06-24 10:24:28,390][15132] Fps is (10 sec: 42614.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10497179648. Throughput: 0: 42966.2. Samples: 10497300840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 10:24:30,011][15401] Updated weights for policy 0, policy_version 640701 (0.0038) [2024-06-24 10:24:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 10497392640. Throughput: 0: 42691.8. Samples: 10497550900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 10:24:33,578][15401] Updated weights for policy 0, policy_version 640711 (0.0034) [2024-06-24 10:24:38,159][15401] Updated weights for policy 0, policy_version 640721 (0.0030) [2024-06-24 10:24:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10497572864. Throughput: 0: 42817.9. Samples: 10497681680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 10:24:41,069][15401] Updated weights for policy 0, policy_version 640731 (0.0028) [2024-06-24 10:24:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 10497818624. Throughput: 0: 42787.0. Samples: 10497939480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:43,399][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 10:24:45,632][15401] Updated weights for policy 0, policy_version 640741 (0.0036) [2024-06-24 10:24:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10498031616. Throughput: 0: 42634.7. Samples: 10498192420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:48,399][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 10:24:48,718][15401] Updated weights for policy 0, policy_version 640751 (0.0035) [2024-06-24 10:24:53,142][15401] Updated weights for policy 0, policy_version 640761 (0.0040) [2024-06-24 10:24:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10498244608. Throughput: 0: 42792.9. Samples: 10498323160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:53,391][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 10:24:56,634][15401] Updated weights for policy 0, policy_version 640771 (0.0045) [2024-06-24 10:24:57,887][15349] Signal inference workers to stop experience collection... (155400 times) [2024-06-24 10:24:57,887][15349] Signal inference workers to resume experience collection... (155400 times) [2024-06-24 10:24:57,954][15401] InferenceWorker_p0-w0: stopping experience collection (155400 times) [2024-06-24 10:24:57,955][15401] InferenceWorker_p0-w0: resuming experience collection (155400 times) [2024-06-24 10:24:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10498457600. Throughput: 0: 42594.4. Samples: 10498579720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:24:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 10:25:00,739][15401] Updated weights for policy 0, policy_version 640781 (0.0028) [2024-06-24 10:25:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10498670592. Throughput: 0: 42613.4. Samples: 10498832020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 10:25:04,359][15401] Updated weights for policy 0, policy_version 640791 (0.0038) [2024-06-24 10:25:08,317][15401] Updated weights for policy 0, policy_version 640801 (0.0032) [2024-06-24 10:25:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10498883584. Throughput: 0: 42575.9. Samples: 10498956860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:08,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-24 10:25:12,172][15401] Updated weights for policy 0, policy_version 640811 (0.0034) [2024-06-24 10:25:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10499112960. Throughput: 0: 42560.4. Samples: 10499216060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:13,391][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 10:25:15,807][15401] Updated weights for policy 0, policy_version 640821 (0.0033) [2024-06-24 10:25:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42601.2, 300 sec: 42653.9). Total num frames: 10499309568. Throughput: 0: 42625.9. Samples: 10499469060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 10:25:19,858][15401] Updated weights for policy 0, policy_version 640831 (0.0031) [2024-06-24 10:25:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 10499506176. Throughput: 0: 42439.7. Samples: 10499591480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 10:25:23,863][15401] Updated weights for policy 0, policy_version 640841 (0.0041) [2024-06-24 10:25:27,583][15401] Updated weights for policy 0, policy_version 640851 (0.0033) [2024-06-24 10:25:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 10499751936. Throughput: 0: 42524.5. Samples: 10499853080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 10:25:31,553][15401] Updated weights for policy 0, policy_version 640861 (0.0033) [2024-06-24 10:25:33,390][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10499948544. Throughput: 0: 42513.3. Samples: 10500105520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 10:25:35,186][15401] Updated weights for policy 0, policy_version 640871 (0.0026) [2024-06-24 10:25:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10500145152. Throughput: 0: 42468.9. Samples: 10500234260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:25:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 10:25:39,518][15401] Updated weights for policy 0, policy_version 640881 (0.0043) [2024-06-24 10:25:43,201][15401] Updated weights for policy 0, policy_version 640891 (0.0033) [2024-06-24 10:25:43,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 10500374528. Throughput: 0: 42374.8. Samples: 10500486860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:25:43,396][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 10:25:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640893_10500390912.pth... [2024-06-24 10:25:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640267_10490134528.pth [2024-06-24 10:25:47,168][15401] Updated weights for policy 0, policy_version 640901 (0.0036) [2024-06-24 10:25:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10500587520. Throughput: 0: 42392.9. Samples: 10500739700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:25:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 10:25:50,802][15401] Updated weights for policy 0, policy_version 640911 (0.0037) [2024-06-24 10:25:53,390][15132] Fps is (10 sec: 39346.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10500767744. Throughput: 0: 42399.5. Samples: 10500864840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:25:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 10:25:55,065][15401] Updated weights for policy 0, policy_version 640921 (0.0047) [2024-06-24 10:25:58,290][15401] Updated weights for policy 0, policy_version 640931 (0.0037) [2024-06-24 10:25:58,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10501013504. Throughput: 0: 42356.6. Samples: 10501122200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:25:58,392][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 10:26:02,739][15401] Updated weights for policy 0, policy_version 640941 (0.0032) [2024-06-24 10:26:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10501210112. Throughput: 0: 42325.8. Samples: 10501373720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 10:26:06,003][15401] Updated weights for policy 0, policy_version 640951 (0.0043) [2024-06-24 10:26:08,389][15132] Fps is (10 sec: 39330.6, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 10501406720. Throughput: 0: 42315.3. Samples: 10501495660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 10:26:10,427][15401] Updated weights for policy 0, policy_version 640961 (0.0036) [2024-06-24 10:26:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10501636096. Throughput: 0: 42194.8. Samples: 10501751840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 10:26:13,947][15401] Updated weights for policy 0, policy_version 640971 (0.0033) [2024-06-24 10:26:17,990][15401] Updated weights for policy 0, policy_version 640981 (0.0022) [2024-06-24 10:26:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 10501849088. Throughput: 0: 42323.2. Samples: 10502010060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 10:26:21,684][15401] Updated weights for policy 0, policy_version 640991 (0.0032) [2024-06-24 10:26:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10502045696. Throughput: 0: 42261.7. Samples: 10502136040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 10:26:25,591][15401] Updated weights for policy 0, policy_version 641001 (0.0027) [2024-06-24 10:26:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10502291456. Throughput: 0: 42386.4. Samples: 10502393980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:28,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-24 10:26:29,160][15401] Updated weights for policy 0, policy_version 641011 (0.0032) [2024-06-24 10:26:32,288][15349] Signal inference workers to stop experience collection... (155450 times) [2024-06-24 10:26:32,322][15401] InferenceWorker_p0-w0: stopping experience collection (155450 times) [2024-06-24 10:26:32,404][15349] Signal inference workers to resume experience collection... (155450 times) [2024-06-24 10:26:32,404][15401] InferenceWorker_p0-w0: resuming experience collection (155450 times) [2024-06-24 10:26:33,366][15401] Updated weights for policy 0, policy_version 641021 (0.0033) [2024-06-24 10:26:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10502488064. Throughput: 0: 42593.8. Samples: 10502656420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 10:26:36,664][15401] Updated weights for policy 0, policy_version 641031 (0.0032) [2024-06-24 10:26:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 10502701056. Throughput: 0: 42456.0. Samples: 10502775360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 10:26:40,888][15401] Updated weights for policy 0, policy_version 641041 (0.0037) [2024-06-24 10:26:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42329.8, 300 sec: 42598.4). Total num frames: 10502914048. Throughput: 0: 42490.5. Samples: 10503034180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 10:26:44,511][15401] Updated weights for policy 0, policy_version 641051 (0.0042) [2024-06-24 10:26:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10503127040. Throughput: 0: 42612.0. Samples: 10503291260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:48,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 10:26:48,651][15401] Updated weights for policy 0, policy_version 641061 (0.0025) [2024-06-24 10:26:52,070][15401] Updated weights for policy 0, policy_version 641071 (0.0036) [2024-06-24 10:26:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10503356416. Throughput: 0: 42727.0. Samples: 10503418380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:53,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-24 10:26:56,439][15401] Updated weights for policy 0, policy_version 641081 (0.0031) [2024-06-24 10:26:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42325.2, 300 sec: 42542.5). Total num frames: 10503553024. Throughput: 0: 42611.5. Samples: 10503669460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:26:58,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 10:27:00,039][15401] Updated weights for policy 0, policy_version 641091 (0.0029) [2024-06-24 10:27:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10503749632. Throughput: 0: 42694.6. Samples: 10503931320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:27:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 10:27:04,111][15401] Updated weights for policy 0, policy_version 641101 (0.0028) [2024-06-24 10:27:07,801][15401] Updated weights for policy 0, policy_version 641111 (0.0033) [2024-06-24 10:27:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 10503995392. Throughput: 0: 42635.5. Samples: 10504054640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:27:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 10:27:11,750][15401] Updated weights for policy 0, policy_version 641121 (0.0027) [2024-06-24 10:27:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10504192000. Throughput: 0: 42486.6. Samples: 10504305880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 10:27:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 10:27:15,593][15401] Updated weights for policy 0, policy_version 641131 (0.0046) [2024-06-24 10:27:18,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10504372224. Throughput: 0: 42494.7. Samples: 10504568680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 10:27:19,399][15401] Updated weights for policy 0, policy_version 641141 (0.0027) [2024-06-24 10:27:23,080][15401] Updated weights for policy 0, policy_version 641151 (0.0032) [2024-06-24 10:27:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10504617984. Throughput: 0: 42685.0. Samples: 10504696180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 10:27:26,872][15401] Updated weights for policy 0, policy_version 641161 (0.0036) [2024-06-24 10:27:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10504847360. Throughput: 0: 42687.2. Samples: 10504955100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:28,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 10:27:30,635][15401] Updated weights for policy 0, policy_version 641171 (0.0029) [2024-06-24 10:27:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 10505043968. Throughput: 0: 42878.2. Samples: 10505220780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 10:27:34,444][15401] Updated weights for policy 0, policy_version 641181 (0.0028) [2024-06-24 10:27:38,199][15401] Updated weights for policy 0, policy_version 641191 (0.0041) [2024-06-24 10:27:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 10505273344. Throughput: 0: 42801.8. Samples: 10505344560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:38,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 10:27:38,591][15349] Signal inference workers to stop experience collection... (155500 times) [2024-06-24 10:27:38,626][15401] InferenceWorker_p0-w0: stopping experience collection (155500 times) [2024-06-24 10:27:38,637][15349] Signal inference workers to resume experience collection... (155500 times) [2024-06-24 10:27:38,648][15401] InferenceWorker_p0-w0: resuming experience collection (155500 times) [2024-06-24 10:27:41,986][15401] Updated weights for policy 0, policy_version 641201 (0.0034) [2024-06-24 10:27:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10505469952. Throughput: 0: 42920.0. Samples: 10505600760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 10:27:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641203_10505469952.pth... [2024-06-24 10:27:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640580_10495262720.pth [2024-06-24 10:27:45,778][15401] Updated weights for policy 0, policy_version 641211 (0.0025) [2024-06-24 10:27:48,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10505682944. Throughput: 0: 42946.1. Samples: 10505863900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 10:27:49,718][15401] Updated weights for policy 0, policy_version 641221 (0.0033) [2024-06-24 10:27:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10505912320. Throughput: 0: 43003.6. Samples: 10505989800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:53,396][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 10:27:53,519][15401] Updated weights for policy 0, policy_version 641231 (0.0032) [2024-06-24 10:27:57,324][15401] Updated weights for policy 0, policy_version 641241 (0.0035) [2024-06-24 10:27:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10506125312. Throughput: 0: 43199.6. Samples: 10506249860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:27:58,395][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 10:28:00,944][15401] Updated weights for policy 0, policy_version 641251 (0.0034) [2024-06-24 10:28:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10506338304. Throughput: 0: 43132.7. Samples: 10506509660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 10:28:04,900][15401] Updated weights for policy 0, policy_version 641261 (0.0029) [2024-06-24 10:28:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10506567680. Throughput: 0: 43303.5. Samples: 10506644840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 10:28:08,444][15401] Updated weights for policy 0, policy_version 641271 (0.0024) [2024-06-24 10:28:12,476][15401] Updated weights for policy 0, policy_version 641281 (0.0036) [2024-06-24 10:28:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10506764288. Throughput: 0: 43218.2. Samples: 10506899920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 10:28:16,027][15401] Updated weights for policy 0, policy_version 641291 (0.0039) [2024-06-24 10:28:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43690.5, 300 sec: 42710.2). Total num frames: 10506993664. Throughput: 0: 42979.0. Samples: 10507154840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 10:28:20,694][15401] Updated weights for policy 0, policy_version 641301 (0.0039) [2024-06-24 10:28:23,392][15132] Fps is (10 sec: 45863.5, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 10507223040. Throughput: 0: 43193.7. Samples: 10507288280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:23,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 10:28:23,546][15401] Updated weights for policy 0, policy_version 641311 (0.0031) [2024-06-24 10:28:28,240][15401] Updated weights for policy 0, policy_version 641321 (0.0037) [2024-06-24 10:28:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10507403264. Throughput: 0: 43213.7. Samples: 10507545380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:28,391][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 10:28:31,431][15401] Updated weights for policy 0, policy_version 641331 (0.0030) [2024-06-24 10:28:33,390][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10507632640. Throughput: 0: 43044.9. Samples: 10507800920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 10:28:35,971][15401] Updated weights for policy 0, policy_version 641341 (0.0038) [2024-06-24 10:28:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.3, 300 sec: 42710.4). Total num frames: 10507862016. Throughput: 0: 43159.1. Samples: 10507931960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 10:28:39,210][15401] Updated weights for policy 0, policy_version 641351 (0.0038) [2024-06-24 10:28:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10508042240. Throughput: 0: 42993.9. Samples: 10508184580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 10:28:43,576][15401] Updated weights for policy 0, policy_version 641361 (0.0038) [2024-06-24 10:28:47,071][15401] Updated weights for policy 0, policy_version 641371 (0.0035) [2024-06-24 10:28:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10508288000. Throughput: 0: 42971.1. Samples: 10508443360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 10:28:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 10:28:50,987][15401] Updated weights for policy 0, policy_version 641381 (0.0040) [2024-06-24 10:28:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10508500992. Throughput: 0: 42905.3. Samples: 10508575580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:28:53,400][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 10:28:54,541][15401] Updated weights for policy 0, policy_version 641391 (0.0045) [2024-06-24 10:28:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10508697600. Throughput: 0: 42861.4. Samples: 10508828680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:28:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 10:28:58,515][15401] Updated weights for policy 0, policy_version 641401 (0.0038) [2024-06-24 10:28:58,864][15349] Signal inference workers to stop experience collection... (155550 times) [2024-06-24 10:28:58,864][15349] Signal inference workers to resume experience collection... (155550 times) [2024-06-24 10:28:58,907][15401] InferenceWorker_p0-w0: stopping experience collection (155550 times) [2024-06-24 10:28:58,908][15401] InferenceWorker_p0-w0: resuming experience collection (155550 times) [2024-06-24 10:29:02,050][15401] Updated weights for policy 0, policy_version 641411 (0.0041) [2024-06-24 10:29:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10508943360. Throughput: 0: 42913.9. Samples: 10509085960. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 10:29:06,041][15401] Updated weights for policy 0, policy_version 641421 (0.0045) [2024-06-24 10:29:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10509156352. Throughput: 0: 42919.8. Samples: 10509219560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 10:29:09,455][15401] Updated weights for policy 0, policy_version 641431 (0.0035) [2024-06-24 10:29:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42710.0). Total num frames: 10509352960. Throughput: 0: 42970.7. Samples: 10509479060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 10:29:13,430][15401] Updated weights for policy 0, policy_version 641441 (0.0031) [2024-06-24 10:29:16,954][15401] Updated weights for policy 0, policy_version 641451 (0.0034) [2024-06-24 10:29:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10509582336. Throughput: 0: 42907.2. Samples: 10509731740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 10:29:21,645][15401] Updated weights for policy 0, policy_version 641461 (0.0038) [2024-06-24 10:29:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10509795328. Throughput: 0: 42936.4. Samples: 10509864100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:23,390][15132] Avg episode reward: [(0, '0.168')] [2024-06-24 10:29:24,735][15401] Updated weights for policy 0, policy_version 641471 (0.0042) [2024-06-24 10:29:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10510008320. Throughput: 0: 42907.9. Samples: 10510115440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:28,399][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 10:29:29,255][15401] Updated weights for policy 0, policy_version 641481 (0.0030) [2024-06-24 10:29:32,456][15401] Updated weights for policy 0, policy_version 641491 (0.0044) [2024-06-24 10:29:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10510221312. Throughput: 0: 42629.3. Samples: 10510361680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 10:29:37,112][15401] Updated weights for policy 0, policy_version 641501 (0.0038) [2024-06-24 10:29:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10510417920. Throughput: 0: 42729.8. Samples: 10510498420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:38,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 10:29:39,995][15401] Updated weights for policy 0, policy_version 641511 (0.0028) [2024-06-24 10:29:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10510630912. Throughput: 0: 42838.1. Samples: 10510756400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 10:29:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641519_10510647296.pth... [2024-06-24 10:29:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000640893_10500390912.pth [2024-06-24 10:29:44,525][15401] Updated weights for policy 0, policy_version 641521 (0.0040) [2024-06-24 10:29:47,469][15401] Updated weights for policy 0, policy_version 641531 (0.0037) [2024-06-24 10:29:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10510876672. Throughput: 0: 42683.0. Samples: 10511006700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 10:29:51,940][15401] Updated weights for policy 0, policy_version 641541 (0.0034) [2024-06-24 10:29:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10511040512. Throughput: 0: 42681.3. Samples: 10511140220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 10:29:55,075][15401] Updated weights for policy 0, policy_version 641551 (0.0036) [2024-06-24 10:29:58,391][15132] Fps is (10 sec: 40955.0, 60 sec: 43143.5, 300 sec: 42764.8). Total num frames: 10511286272. Throughput: 0: 42685.0. Samples: 10511399940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:29:58,391][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 10:29:59,700][15401] Updated weights for policy 0, policy_version 641561 (0.0030) [2024-06-24 10:30:02,898][15401] Updated weights for policy 0, policy_version 641571 (0.0042) [2024-06-24 10:30:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10511515648. Throughput: 0: 42558.1. Samples: 10511646860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 10:30:07,233][15401] Updated weights for policy 0, policy_version 641581 (0.0039) [2024-06-24 10:30:08,390][15132] Fps is (10 sec: 39326.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10511679488. Throughput: 0: 42707.1. Samples: 10511785920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 10:30:09,422][15349] Signal inference workers to stop experience collection... (155600 times) [2024-06-24 10:30:09,422][15349] Signal inference workers to resume experience collection... (155600 times) [2024-06-24 10:30:09,450][15401] InferenceWorker_p0-w0: stopping experience collection (155600 times) [2024-06-24 10:30:09,450][15401] InferenceWorker_p0-w0: resuming experience collection (155600 times) [2024-06-24 10:30:10,490][15401] Updated weights for policy 0, policy_version 641591 (0.0032) [2024-06-24 10:30:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10511908864. Throughput: 0: 42697.3. Samples: 10512036820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:13,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 10:30:14,605][15401] Updated weights for policy 0, policy_version 641601 (0.0029) [2024-06-24 10:30:18,337][15401] Updated weights for policy 0, policy_version 641611 (0.0039) [2024-06-24 10:30:18,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10512154624. Throughput: 0: 42898.7. Samples: 10512292120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 10:30:22,556][15401] Updated weights for policy 0, policy_version 641621 (0.0027) [2024-06-24 10:30:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10512318464. Throughput: 0: 42725.8. Samples: 10512421080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 10:30:25,845][15401] Updated weights for policy 0, policy_version 641631 (0.0038) [2024-06-24 10:30:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10512564224. Throughput: 0: 42668.9. Samples: 10512676500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-24 10:30:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 10:30:30,903][15401] Updated weights for policy 0, policy_version 641641 (0.0037) [2024-06-24 10:30:33,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10512793600. Throughput: 0: 42828.6. Samples: 10512933980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 10:30:33,449][15401] Updated weights for policy 0, policy_version 641651 (0.0036) [2024-06-24 10:30:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 10512957440. Throughput: 0: 42768.1. Samples: 10513064780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 10:30:38,414][15401] Updated weights for policy 0, policy_version 641661 (0.0039) [2024-06-24 10:30:41,075][15401] Updated weights for policy 0, policy_version 641671 (0.0040) [2024-06-24 10:30:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10513203200. Throughput: 0: 42675.5. Samples: 10513320280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:43,391][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 10:30:45,867][15401] Updated weights for policy 0, policy_version 641681 (0.0030) [2024-06-24 10:30:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10513416192. Throughput: 0: 42885.9. Samples: 10513576720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:48,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 10:30:48,997][15401] Updated weights for policy 0, policy_version 641691 (0.0042) [2024-06-24 10:30:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10513612800. Throughput: 0: 42681.8. Samples: 10513706600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:53,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 10:30:53,475][15401] Updated weights for policy 0, policy_version 641701 (0.0028) [2024-06-24 10:30:56,623][15401] Updated weights for policy 0, policy_version 641711 (0.0031) [2024-06-24 10:30:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42872.5, 300 sec: 42876.1). Total num frames: 10513858560. Throughput: 0: 42773.4. Samples: 10513961620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:30:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 10:31:01,127][15401] Updated weights for policy 0, policy_version 641721 (0.0032) [2024-06-24 10:31:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 10514055168. Throughput: 0: 42925.5. Samples: 10514223760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 10:31:04,010][15349] Signal inference workers to stop experience collection... (155650 times) [2024-06-24 10:31:04,037][15401] InferenceWorker_p0-w0: stopping experience collection (155650 times) [2024-06-24 10:31:04,071][15349] Signal inference workers to resume experience collection... (155650 times) [2024-06-24 10:31:04,072][15401] InferenceWorker_p0-w0: resuming experience collection (155650 times) [2024-06-24 10:31:04,588][15401] Updated weights for policy 0, policy_version 641731 (0.0027) [2024-06-24 10:31:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10514268160. Throughput: 0: 42897.9. Samples: 10514351480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 10:31:08,849][15401] Updated weights for policy 0, policy_version 641741 (0.0044) [2024-06-24 10:31:12,137][15401] Updated weights for policy 0, policy_version 641751 (0.0038) [2024-06-24 10:31:13,390][15132] Fps is (10 sec: 44235.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10514497536. Throughput: 0: 42822.9. Samples: 10514603540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 10:31:16,681][15401] Updated weights for policy 0, policy_version 641761 (0.0055) [2024-06-24 10:31:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 10514677760. Throughput: 0: 42919.6. Samples: 10514865360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 10:31:19,565][15401] Updated weights for policy 0, policy_version 641771 (0.0038) [2024-06-24 10:31:23,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10514907136. Throughput: 0: 42758.1. Samples: 10514988900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 10:31:24,809][15401] Updated weights for policy 0, policy_version 641781 (0.0034) [2024-06-24 10:31:27,385][15401] Updated weights for policy 0, policy_version 641791 (0.0023) [2024-06-24 10:31:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10515136512. Throughput: 0: 42802.8. Samples: 10515246400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 10:31:32,553][15401] Updated weights for policy 0, policy_version 641801 (0.0043) [2024-06-24 10:31:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 10515316736. Throughput: 0: 42944.0. Samples: 10515509200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:33,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 10:31:34,851][15401] Updated weights for policy 0, policy_version 641811 (0.0033) [2024-06-24 10:31:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10515562496. Throughput: 0: 42688.9. Samples: 10515627600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 10:31:40,136][15401] Updated weights for policy 0, policy_version 641821 (0.0034) [2024-06-24 10:31:42,574][15401] Updated weights for policy 0, policy_version 641831 (0.0028) [2024-06-24 10:31:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10515791872. Throughput: 0: 42839.0. Samples: 10515889380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 10:31:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641833_10515791872.pth... [2024-06-24 10:31:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641203_10505469952.pth [2024-06-24 10:31:47,503][15401] Updated weights for policy 0, policy_version 641841 (0.0030) [2024-06-24 10:31:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10515955712. Throughput: 0: 42852.7. Samples: 10516152140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 10:31:50,207][15401] Updated weights for policy 0, policy_version 641851 (0.0039) [2024-06-24 10:31:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 10516201472. Throughput: 0: 42635.9. Samples: 10516270100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:53,395][15132] Avg episode reward: [(0, '0.308')] [2024-06-24 10:31:55,019][15401] Updated weights for policy 0, policy_version 641861 (0.0033) [2024-06-24 10:31:57,853][15401] Updated weights for policy 0, policy_version 641871 (0.0022) [2024-06-24 10:31:58,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 10516447232. Throughput: 0: 42923.8. Samples: 10516535100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:31:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 10:32:02,531][15401] Updated weights for policy 0, policy_version 641881 (0.0035) [2024-06-24 10:32:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10516611072. Throughput: 0: 42952.4. Samples: 10516798220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 10:32:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 10:32:05,339][15401] Updated weights for policy 0, policy_version 641891 (0.0027) [2024-06-24 10:32:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 10516856832. Throughput: 0: 42784.1. Samples: 10516914180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:08,395][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 10:32:10,214][15401] Updated weights for policy 0, policy_version 641901 (0.0030) [2024-06-24 10:32:13,140][15401] Updated weights for policy 0, policy_version 641911 (0.0026) [2024-06-24 10:32:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 10517069824. Throughput: 0: 43019.6. Samples: 10517182280. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 10:32:17,992][15401] Updated weights for policy 0, policy_version 641921 (0.0036) [2024-06-24 10:32:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10517250048. Throughput: 0: 42833.7. Samples: 10517436720. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 10:32:20,894][15401] Updated weights for policy 0, policy_version 641931 (0.0042) [2024-06-24 10:32:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10517495808. Throughput: 0: 42850.6. Samples: 10517555880. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 10:32:25,482][15401] Updated weights for policy 0, policy_version 641941 (0.0033) [2024-06-24 10:32:28,299][15349] Signal inference workers to stop experience collection... (155700 times) [2024-06-24 10:32:28,301][15349] Signal inference workers to resume experience collection... (155700 times) [2024-06-24 10:32:28,350][15401] InferenceWorker_p0-w0: stopping experience collection (155700 times) [2024-06-24 10:32:28,351][15401] InferenceWorker_p0-w0: resuming experience collection (155700 times) [2024-06-24 10:32:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10517692416. Throughput: 0: 42806.8. Samples: 10517815680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 10:32:28,670][15401] Updated weights for policy 0, policy_version 641951 (0.0033) [2024-06-24 10:32:33,379][15401] Updated weights for policy 0, policy_version 641961 (0.0022) [2024-06-24 10:32:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 10517889024. Throughput: 0: 42804.1. Samples: 10518078320. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:33,396][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 10:32:36,558][15401] Updated weights for policy 0, policy_version 641971 (0.0034) [2024-06-24 10:32:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 10518134784. Throughput: 0: 42830.8. Samples: 10518197480. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 10:32:40,880][15401] Updated weights for policy 0, policy_version 641981 (0.0034) [2024-06-24 10:32:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10518347776. Throughput: 0: 42890.5. Samples: 10518465180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 10:32:44,120][15401] Updated weights for policy 0, policy_version 641991 (0.0063) [2024-06-24 10:32:48,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10518528000. Throughput: 0: 42566.6. Samples: 10518713820. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:48,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 10:32:48,570][15401] Updated weights for policy 0, policy_version 642001 (0.0028) [2024-06-24 10:32:51,696][15401] Updated weights for policy 0, policy_version 642011 (0.0042) [2024-06-24 10:32:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10518790144. Throughput: 0: 42733.3. Samples: 10518837180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 10:32:56,506][15401] Updated weights for policy 0, policy_version 642021 (0.0032) [2024-06-24 10:32:58,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 10518970368. Throughput: 0: 42679.8. Samples: 10519102880. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:32:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 10:32:59,447][15401] Updated weights for policy 0, policy_version 642031 (0.0038) [2024-06-24 10:33:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10519166976. Throughput: 0: 42484.1. Samples: 10519348500. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:03,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 10:33:04,421][15401] Updated weights for policy 0, policy_version 642041 (0.0030) [2024-06-24 10:33:07,131][15401] Updated weights for policy 0, policy_version 642051 (0.0037) [2024-06-24 10:33:08,392][15132] Fps is (10 sec: 45866.4, 60 sec: 42870.0, 300 sec: 42931.3). Total num frames: 10519429120. Throughput: 0: 42614.1. Samples: 10519473600. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:08,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 10:33:12,006][15401] Updated weights for policy 0, policy_version 642061 (0.0038) [2024-06-24 10:33:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 10519592960. Throughput: 0: 42607.5. Samples: 10519733020. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 10:33:14,683][15401] Updated weights for policy 0, policy_version 642071 (0.0034) [2024-06-24 10:33:18,396][15132] Fps is (10 sec: 37666.7, 60 sec: 42593.9, 300 sec: 42653.4). Total num frames: 10519805952. Throughput: 0: 42275.3. Samples: 10519980980. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:18,397][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 10:33:19,625][15401] Updated weights for policy 0, policy_version 642081 (0.0032) [2024-06-24 10:33:22,543][15401] Updated weights for policy 0, policy_version 642091 (0.0029) [2024-06-24 10:33:23,391][15132] Fps is (10 sec: 45869.8, 60 sec: 42597.6, 300 sec: 42875.9). Total num frames: 10520051712. Throughput: 0: 42602.8. Samples: 10520114660. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:23,391][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 10:33:27,107][15401] Updated weights for policy 0, policy_version 642101 (0.0035) [2024-06-24 10:33:28,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 10520215552. Throughput: 0: 42300.4. Samples: 10520368700. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 10:33:30,077][15401] Updated weights for policy 0, policy_version 642111 (0.0030) [2024-06-24 10:33:33,390][15132] Fps is (10 sec: 40964.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10520461312. Throughput: 0: 42463.5. Samples: 10520624580. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 10:33:35,011][15401] Updated weights for policy 0, policy_version 642121 (0.0029) [2024-06-24 10:33:37,815][15401] Updated weights for policy 0, policy_version 642131 (0.0032) [2024-06-24 10:33:38,389][15132] Fps is (10 sec: 49152.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10520707072. Throughput: 0: 42665.4. Samples: 10520757120. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-24 10:33:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 10:33:42,639][15401] Updated weights for policy 0, policy_version 642141 (0.0042) [2024-06-24 10:33:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 10520854528. Throughput: 0: 42394.3. Samples: 10521010620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:33:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 10:33:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642142_10520854528.pth... [2024-06-24 10:33:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641519_10510647296.pth [2024-06-24 10:33:45,545][15401] Updated weights for policy 0, policy_version 642151 (0.0041) [2024-06-24 10:33:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 10521100288. Throughput: 0: 42437.2. Samples: 10521258180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:33:48,399][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 10:33:50,133][15401] Updated weights for policy 0, policy_version 642161 (0.0034) [2024-06-24 10:33:52,418][15349] Signal inference workers to stop experience collection... (155750 times) [2024-06-24 10:33:52,423][15349] Signal inference workers to resume experience collection... (155750 times) [2024-06-24 10:33:52,468][15401] InferenceWorker_p0-w0: stopping experience collection (155750 times) [2024-06-24 10:33:52,468][15401] InferenceWorker_p0-w0: resuming experience collection (155750 times) [2024-06-24 10:33:53,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42050.5, 300 sec: 42764.6). Total num frames: 10521313280. Throughput: 0: 42751.2. Samples: 10521397420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:33:53,401][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 10:33:53,436][15401] Updated weights for policy 0, policy_version 642171 (0.0033) [2024-06-24 10:33:57,610][15401] Updated weights for policy 0, policy_version 642181 (0.0031) [2024-06-24 10:33:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10521493504. Throughput: 0: 42547.1. Samples: 10521647640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:33:58,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 10:34:01,376][15401] Updated weights for policy 0, policy_version 642191 (0.0031) [2024-06-24 10:34:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10521739264. Throughput: 0: 42506.9. Samples: 10521893520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 10:34:05,234][15401] Updated weights for policy 0, policy_version 642201 (0.0041) [2024-06-24 10:34:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42053.7, 300 sec: 42709.5). Total num frames: 10521952256. Throughput: 0: 42635.4. Samples: 10522033200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 10:34:09,012][15401] Updated weights for policy 0, policy_version 642211 (0.0047) [2024-06-24 10:34:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10522132480. Throughput: 0: 42479.7. Samples: 10522280280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:13,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 10:34:13,491][15401] Updated weights for policy 0, policy_version 642221 (0.0035) [2024-06-24 10:34:16,681][15401] Updated weights for policy 0, policy_version 642231 (0.0036) [2024-06-24 10:34:18,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42871.5, 300 sec: 42653.0). Total num frames: 10522378240. Throughput: 0: 42391.4. Samples: 10522532460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:18,396][15132] Avg episode reward: [(0, '0.863')] [2024-06-24 10:34:21,135][15401] Updated weights for policy 0, policy_version 642241 (0.0032) [2024-06-24 10:34:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42326.1, 300 sec: 42653.9). Total num frames: 10522591232. Throughput: 0: 42451.8. Samples: 10522667460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 10:34:24,845][15401] Updated weights for policy 0, policy_version 642251 (0.0029) [2024-06-24 10:34:28,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10522787840. Throughput: 0: 42446.7. Samples: 10522920720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 10:34:28,845][15401] Updated weights for policy 0, policy_version 642261 (0.0031) [2024-06-24 10:34:32,414][15401] Updated weights for policy 0, policy_version 642271 (0.0031) [2024-06-24 10:34:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10523033600. Throughput: 0: 42510.6. Samples: 10523171160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:33,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 10:34:36,464][15401] Updated weights for policy 0, policy_version 642281 (0.0048) [2024-06-24 10:34:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10523230208. Throughput: 0: 42499.7. Samples: 10523309800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 10:34:39,923][15401] Updated weights for policy 0, policy_version 642291 (0.0035) [2024-06-24 10:34:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10523426816. Throughput: 0: 42591.6. Samples: 10523564260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 10:34:43,904][15401] Updated weights for policy 0, policy_version 642301 (0.0031) [2024-06-24 10:34:47,477][15401] Updated weights for policy 0, policy_version 642311 (0.0032) [2024-06-24 10:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10523656192. Throughput: 0: 42632.0. Samples: 10523811960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:48,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 10:34:50,359][15349] Signal inference workers to stop experience collection... (155800 times) [2024-06-24 10:34:50,366][15349] Signal inference workers to resume experience collection... (155800 times) [2024-06-24 10:34:50,401][15401] InferenceWorker_p0-w0: stopping experience collection (155800 times) [2024-06-24 10:34:50,401][15401] InferenceWorker_p0-w0: resuming experience collection (155800 times) [2024-06-24 10:34:51,646][15401] Updated weights for policy 0, policy_version 642321 (0.0035) [2024-06-24 10:34:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42654.1). Total num frames: 10523869184. Throughput: 0: 42504.0. Samples: 10523945880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 10:34:55,073][15401] Updated weights for policy 0, policy_version 642331 (0.0037) [2024-06-24 10:34:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10524049408. Throughput: 0: 42551.1. Samples: 10524195080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:34:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 10:34:59,295][15401] Updated weights for policy 0, policy_version 642341 (0.0039) [2024-06-24 10:35:02,618][15401] Updated weights for policy 0, policy_version 642351 (0.0038) [2024-06-24 10:35:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10524311552. Throughput: 0: 42594.4. Samples: 10524448940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:35:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 10:35:07,156][15401] Updated weights for policy 0, policy_version 642361 (0.0034) [2024-06-24 10:35:08,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10524508160. Throughput: 0: 42561.4. Samples: 10524582820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:35:08,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 10:35:10,232][15401] Updated weights for policy 0, policy_version 642371 (0.0054) [2024-06-24 10:35:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10524704768. Throughput: 0: 42605.7. Samples: 10524837980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:35:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 10:35:14,675][15401] Updated weights for policy 0, policy_version 642381 (0.0034) [2024-06-24 10:35:17,726][15401] Updated weights for policy 0, policy_version 642391 (0.0036) [2024-06-24 10:35:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 10524934144. Throughput: 0: 42706.3. Samples: 10525092940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 10:35:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 10:35:22,118][15401] Updated weights for policy 0, policy_version 642401 (0.0038) [2024-06-24 10:35:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10525147136. Throughput: 0: 42735.5. Samples: 10525232900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 10:35:25,333][15401] Updated weights for policy 0, policy_version 642411 (0.0052) [2024-06-24 10:35:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10525343744. Throughput: 0: 42775.6. Samples: 10525489160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 10:35:29,657][15401] Updated weights for policy 0, policy_version 642421 (0.0024) [2024-06-24 10:35:33,021][15401] Updated weights for policy 0, policy_version 642431 (0.0035) [2024-06-24 10:35:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10525589504. Throughput: 0: 42705.3. Samples: 10525733700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 10:35:37,669][15401] Updated weights for policy 0, policy_version 642441 (0.0040) [2024-06-24 10:35:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10525786112. Throughput: 0: 42892.0. Samples: 10525876020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 10:35:40,784][15401] Updated weights for policy 0, policy_version 642451 (0.0033) [2024-06-24 10:35:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10525982720. Throughput: 0: 42813.2. Samples: 10526121680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 10:35:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642455_10525982720.pth... [2024-06-24 10:35:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000641833_10515791872.pth [2024-06-24 10:35:45,303][15401] Updated weights for policy 0, policy_version 642461 (0.0036) [2024-06-24 10:35:48,391][15401] Updated weights for policy 0, policy_version 642471 (0.0036) [2024-06-24 10:35:48,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 10526244864. Throughput: 0: 42790.7. Samples: 10526374620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:48,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 10:35:53,043][15401] Updated weights for policy 0, policy_version 642481 (0.0038) [2024-06-24 10:35:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10526425088. Throughput: 0: 42904.1. Samples: 10526513400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 10:35:55,891][15401] Updated weights for policy 0, policy_version 642491 (0.0023) [2024-06-24 10:35:58,390][15132] Fps is (10 sec: 37691.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 10526621696. Throughput: 0: 42781.7. Samples: 10526763160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:35:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 10:36:00,073][15349] Signal inference workers to stop experience collection... (155850 times) [2024-06-24 10:36:00,120][15401] InferenceWorker_p0-w0: stopping experience collection (155850 times) [2024-06-24 10:36:00,130][15349] Signal inference workers to resume experience collection... (155850 times) [2024-06-24 10:36:00,139][15401] InferenceWorker_p0-w0: resuming experience collection (155850 times) [2024-06-24 10:36:00,627][15401] Updated weights for policy 0, policy_version 642501 (0.0030) [2024-06-24 10:36:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10526883840. Throughput: 0: 42822.2. Samples: 10527019940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:03,394][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 10:36:03,592][15401] Updated weights for policy 0, policy_version 642511 (0.0035) [2024-06-24 10:36:08,242][15401] Updated weights for policy 0, policy_version 642521 (0.0027) [2024-06-24 10:36:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 10527064064. Throughput: 0: 42687.0. Samples: 10527153820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 10:36:11,275][15401] Updated weights for policy 0, policy_version 642531 (0.0034) [2024-06-24 10:36:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10527277056. Throughput: 0: 42646.7. Samples: 10527408260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:13,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 10:36:15,717][15401] Updated weights for policy 0, policy_version 642541 (0.0034) [2024-06-24 10:36:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10527522816. Throughput: 0: 42934.2. Samples: 10527665740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:18,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 10:36:18,895][15401] Updated weights for policy 0, policy_version 642551 (0.0043) [2024-06-24 10:36:23,170][15401] Updated weights for policy 0, policy_version 642561 (0.0027) [2024-06-24 10:36:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10527719424. Throughput: 0: 42730.2. Samples: 10527798880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:23,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 10:36:26,598][15401] Updated weights for policy 0, policy_version 642571 (0.0032) [2024-06-24 10:36:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10527932416. Throughput: 0: 42903.1. Samples: 10528052320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 10:36:30,796][15401] Updated weights for policy 0, policy_version 642581 (0.0045) [2024-06-24 10:36:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10528145408. Throughput: 0: 42966.3. Samples: 10528308000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:33,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 10:36:34,045][15401] Updated weights for policy 0, policy_version 642591 (0.0037) [2024-06-24 10:36:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10528358400. Throughput: 0: 42762.2. Samples: 10528437700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 10:36:38,746][15401] Updated weights for policy 0, policy_version 642601 (0.0033) [2024-06-24 10:36:41,832][15401] Updated weights for policy 0, policy_version 642611 (0.0034) [2024-06-24 10:36:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10528571392. Throughput: 0: 42805.3. Samples: 10528689400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 10:36:46,194][15401] Updated weights for policy 0, policy_version 642621 (0.0032) [2024-06-24 10:36:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 10528800768. Throughput: 0: 42945.9. Samples: 10528952500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 10:36:49,361][15401] Updated weights for policy 0, policy_version 642631 (0.0042) [2024-06-24 10:36:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 10529013760. Throughput: 0: 42962.2. Samples: 10529087120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 10:36:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 10:36:53,706][15401] Updated weights for policy 0, policy_version 642641 (0.0026) [2024-06-24 10:36:56,866][15401] Updated weights for policy 0, policy_version 642651 (0.0038) [2024-06-24 10:36:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 10529210368. Throughput: 0: 42858.7. Samples: 10529336900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:36:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 10:37:01,462][15401] Updated weights for policy 0, policy_version 642661 (0.0029) [2024-06-24 10:37:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10529456128. Throughput: 0: 42989.8. Samples: 10529600280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 10:37:04,577][15401] Updated weights for policy 0, policy_version 642671 (0.0023) [2024-06-24 10:37:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10529619968. Throughput: 0: 43015.2. Samples: 10529734560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 10:37:09,138][15401] Updated weights for policy 0, policy_version 642681 (0.0033) [2024-06-24 10:37:11,654][15349] Signal inference workers to stop experience collection... (155900 times) [2024-06-24 10:37:11,656][15349] Signal inference workers to resume experience collection... (155900 times) [2024-06-24 10:37:11,685][15401] InferenceWorker_p0-w0: stopping experience collection (155900 times) [2024-06-24 10:37:11,711][15401] InferenceWorker_p0-w0: resuming experience collection (155900 times) [2024-06-24 10:37:12,390][15401] Updated weights for policy 0, policy_version 642691 (0.0028) [2024-06-24 10:37:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10529849344. Throughput: 0: 42774.8. Samples: 10529977180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:13,391][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 10:37:16,734][15401] Updated weights for policy 0, policy_version 642701 (0.0049) [2024-06-24 10:37:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10530078720. Throughput: 0: 42871.0. Samples: 10530237200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:18,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-24 10:37:20,166][15401] Updated weights for policy 0, policy_version 642711 (0.0032) [2024-06-24 10:37:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10530275328. Throughput: 0: 43016.5. Samples: 10530373440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 10:37:24,352][15401] Updated weights for policy 0, policy_version 642721 (0.0039) [2024-06-24 10:37:27,762][15401] Updated weights for policy 0, policy_version 642731 (0.0039) [2024-06-24 10:37:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10530504704. Throughput: 0: 43006.3. Samples: 10530624680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 10:37:31,964][15401] Updated weights for policy 0, policy_version 642741 (0.0043) [2024-06-24 10:37:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10530734080. Throughput: 0: 42908.4. Samples: 10530883380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 10:37:35,680][15401] Updated weights for policy 0, policy_version 642751 (0.0030) [2024-06-24 10:37:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10530914304. Throughput: 0: 42681.0. Samples: 10531007760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 10:37:39,850][15401] Updated weights for policy 0, policy_version 642761 (0.0041) [2024-06-24 10:37:43,170][15401] Updated weights for policy 0, policy_version 642771 (0.0044) [2024-06-24 10:37:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10531160064. Throughput: 0: 42777.2. Samples: 10531261880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 10:37:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642771_10531160064.pth... [2024-06-24 10:37:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642142_10520854528.pth [2024-06-24 10:37:47,458][15401] Updated weights for policy 0, policy_version 642781 (0.0031) [2024-06-24 10:37:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10531340288. Throughput: 0: 42624.9. Samples: 10531518400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 10:37:51,262][15401] Updated weights for policy 0, policy_version 642791 (0.0033) [2024-06-24 10:37:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10531553280. Throughput: 0: 42435.6. Samples: 10531644160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 10:37:55,051][15401] Updated weights for policy 0, policy_version 642801 (0.0031) [2024-06-24 10:37:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10531799040. Throughput: 0: 42820.0. Samples: 10531904080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:37:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 10:37:58,789][15401] Updated weights for policy 0, policy_version 642811 (0.0043) [2024-06-24 10:38:02,567][15401] Updated weights for policy 0, policy_version 642821 (0.0036) [2024-06-24 10:38:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 10531995648. Throughput: 0: 42628.1. Samples: 10532155460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 10:38:06,451][15401] Updated weights for policy 0, policy_version 642831 (0.0034) [2024-06-24 10:38:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10532208640. Throughput: 0: 42462.5. Samples: 10532284260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 10:38:10,132][15401] Updated weights for policy 0, policy_version 642841 (0.0024) [2024-06-24 10:38:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42765.6). Total num frames: 10532421632. Throughput: 0: 42780.9. Samples: 10532549920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:13,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 10:38:14,067][15401] Updated weights for policy 0, policy_version 642851 (0.0027) [2024-06-24 10:38:18,240][15401] Updated weights for policy 0, policy_version 642861 (0.0033) [2024-06-24 10:38:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 10532634624. Throughput: 0: 42536.3. Samples: 10532797520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:18,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 10:38:21,703][15401] Updated weights for policy 0, policy_version 642871 (0.0042) [2024-06-24 10:38:23,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10532831232. Throughput: 0: 42507.5. Samples: 10532920600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 10:38:25,910][15401] Updated weights for policy 0, policy_version 642881 (0.0036) [2024-06-24 10:38:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10533060608. Throughput: 0: 42751.5. Samples: 10533185700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:38:28,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 10:38:29,457][15401] Updated weights for policy 0, policy_version 642891 (0.0036) [2024-06-24 10:38:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10533273600. Throughput: 0: 42446.2. Samples: 10533428480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:38:33,559][15401] Updated weights for policy 0, policy_version 642901 (0.0034) [2024-06-24 10:38:36,984][15401] Updated weights for policy 0, policy_version 642911 (0.0043) [2024-06-24 10:38:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10533470208. Throughput: 0: 42637.7. Samples: 10533562860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 10:38:41,230][15401] Updated weights for policy 0, policy_version 642921 (0.0025) [2024-06-24 10:38:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 10533666816. Throughput: 0: 42654.7. Samples: 10533823540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:43,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 10:38:43,723][15349] Signal inference workers to stop experience collection... (155950 times) [2024-06-24 10:38:43,724][15349] Signal inference workers to resume experience collection... (155950 times) [2024-06-24 10:38:43,748][15401] InferenceWorker_p0-w0: stopping experience collection (155950 times) [2024-06-24 10:38:43,748][15401] InferenceWorker_p0-w0: resuming experience collection (155950 times) [2024-06-24 10:38:44,986][15401] Updated weights for policy 0, policy_version 642931 (0.0045) [2024-06-24 10:38:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 10533928960. Throughput: 0: 42459.9. Samples: 10534066160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:48,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 10:38:48,838][15401] Updated weights for policy 0, policy_version 642941 (0.0036) [2024-06-24 10:38:52,632][15401] Updated weights for policy 0, policy_version 642951 (0.0047) [2024-06-24 10:38:53,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10534109184. Throughput: 0: 42644.0. Samples: 10534203240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 10:38:56,519][15401] Updated weights for policy 0, policy_version 642961 (0.0032) [2024-06-24 10:38:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10534322176. Throughput: 0: 42428.5. Samples: 10534459100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:38:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 10:39:00,170][15401] Updated weights for policy 0, policy_version 642971 (0.0034) [2024-06-24 10:39:03,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10534551552. Throughput: 0: 42516.5. Samples: 10534710760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:03,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 10:39:04,463][15401] Updated weights for policy 0, policy_version 642981 (0.0021) [2024-06-24 10:39:08,004][15401] Updated weights for policy 0, policy_version 642991 (0.0028) [2024-06-24 10:39:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 10534764544. Throughput: 0: 42692.5. Samples: 10534841760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 10:39:11,903][15401] Updated weights for policy 0, policy_version 643001 (0.0033) [2024-06-24 10:39:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42710.4). Total num frames: 10534977536. Throughput: 0: 42612.5. Samples: 10535103260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:13,394][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 10:39:15,510][15401] Updated weights for policy 0, policy_version 643011 (0.0036) [2024-06-24 10:39:18,391][15132] Fps is (10 sec: 44229.8, 60 sec: 42870.4, 300 sec: 42764.8). Total num frames: 10535206912. Throughput: 0: 42890.1. Samples: 10535358600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:18,391][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 10:39:19,761][15401] Updated weights for policy 0, policy_version 643021 (0.0040) [2024-06-24 10:39:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10535403520. Throughput: 0: 42831.3. Samples: 10535490260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 10:39:23,396][15401] Updated weights for policy 0, policy_version 643031 (0.0036) [2024-06-24 10:39:27,483][15401] Updated weights for policy 0, policy_version 643041 (0.0035) [2024-06-24 10:39:28,392][15132] Fps is (10 sec: 40956.3, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10535616512. Throughput: 0: 42710.1. Samples: 10535745600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:28,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 10:39:30,845][15401] Updated weights for policy 0, policy_version 643051 (0.0033) [2024-06-24 10:39:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10535862272. Throughput: 0: 43035.9. Samples: 10536002780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 10:39:35,236][15401] Updated weights for policy 0, policy_version 643061 (0.0051) [2024-06-24 10:39:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10536058880. Throughput: 0: 42916.2. Samples: 10536134460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 10:39:38,601][15401] Updated weights for policy 0, policy_version 643071 (0.0035) [2024-06-24 10:39:43,200][15401] Updated weights for policy 0, policy_version 643081 (0.0039) [2024-06-24 10:39:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10536239104. Throughput: 0: 42765.3. Samples: 10536383540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 10:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643081_10536239104.pth... [2024-06-24 10:39:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642455_10525982720.pth [2024-06-24 10:39:46,239][15401] Updated weights for policy 0, policy_version 643091 (0.0031) [2024-06-24 10:39:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10536501248. Throughput: 0: 42916.0. Samples: 10536641980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 10:39:51,250][15401] Updated weights for policy 0, policy_version 643101 (0.0023) [2024-06-24 10:39:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 10536697856. Throughput: 0: 43149.8. Samples: 10536783500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 10:39:53,854][15401] Updated weights for policy 0, policy_version 643111 (0.0026) [2024-06-24 10:39:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10536878080. Throughput: 0: 42766.7. Samples: 10537027760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:39:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 10:39:58,821][15401] Updated weights for policy 0, policy_version 643121 (0.0041) [2024-06-24 10:39:59,808][15349] Signal inference workers to stop experience collection... (156000 times) [2024-06-24 10:39:59,808][15349] Signal inference workers to resume experience collection... (156000 times) [2024-06-24 10:39:59,860][15401] InferenceWorker_p0-w0: stopping experience collection (156000 times) [2024-06-24 10:39:59,860][15401] InferenceWorker_p0-w0: resuming experience collection (156000 times) [2024-06-24 10:40:01,373][15401] Updated weights for policy 0, policy_version 643131 (0.0046) [2024-06-24 10:40:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10537140224. Throughput: 0: 42609.0. Samples: 10537275940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:40:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 10:40:06,443][15401] Updated weights for policy 0, policy_version 643141 (0.0039) [2024-06-24 10:40:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10537320448. Throughput: 0: 42735.9. Samples: 10537413380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 10:40:08,997][15401] Updated weights for policy 0, policy_version 643151 (0.0036) [2024-06-24 10:40:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10537533440. Throughput: 0: 42654.2. Samples: 10537664940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 10:40:13,923][15401] Updated weights for policy 0, policy_version 643161 (0.0040) [2024-06-24 10:40:16,594][15401] Updated weights for policy 0, policy_version 643171 (0.0019) [2024-06-24 10:40:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42872.6, 300 sec: 42820.6). Total num frames: 10537779200. Throughput: 0: 42442.0. Samples: 10537912660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 10:40:22,142][15401] Updated weights for policy 0, policy_version 643181 (0.0022) [2024-06-24 10:40:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10537943040. Throughput: 0: 42603.5. Samples: 10538051620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:23,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 10:40:24,405][15401] Updated weights for policy 0, policy_version 643191 (0.0038) [2024-06-24 10:40:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 10538172416. Throughput: 0: 42676.9. Samples: 10538304000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 10:40:29,721][15401] Updated weights for policy 0, policy_version 643201 (0.0032) [2024-06-24 10:40:32,073][15401] Updated weights for policy 0, policy_version 643211 (0.0036) [2024-06-24 10:40:33,389][15132] Fps is (10 sec: 49152.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10538434560. Throughput: 0: 42420.0. Samples: 10538550880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 10:40:37,294][15401] Updated weights for policy 0, policy_version 643221 (0.0031) [2024-06-24 10:40:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10538598400. Throughput: 0: 42568.2. Samples: 10538699080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:38,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-24 10:40:39,887][15401] Updated weights for policy 0, policy_version 643231 (0.0029) [2024-06-24 10:40:43,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 10538811392. Throughput: 0: 42600.8. Samples: 10538944800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:43,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 10:40:45,072][15401] Updated weights for policy 0, policy_version 643241 (0.0025) [2024-06-24 10:40:47,697][15401] Updated weights for policy 0, policy_version 643251 (0.0032) [2024-06-24 10:40:48,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10539073536. Throughput: 0: 42644.9. Samples: 10539194960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 10:40:52,635][15401] Updated weights for policy 0, policy_version 643261 (0.0033) [2024-06-24 10:40:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10539237376. Throughput: 0: 42670.3. Samples: 10539333540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 10:40:55,399][15401] Updated weights for policy 0, policy_version 643271 (0.0033) [2024-06-24 10:40:55,571][15349] Signal inference workers to stop experience collection... (156050 times) [2024-06-24 10:40:55,571][15349] Signal inference workers to resume experience collection... (156050 times) [2024-06-24 10:40:55,592][15401] InferenceWorker_p0-w0: stopping experience collection (156050 times) [2024-06-24 10:40:55,592][15401] InferenceWorker_p0-w0: resuming experience collection (156050 times) [2024-06-24 10:40:58,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10539466752. Throughput: 0: 42578.3. Samples: 10539580960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:40:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 10:41:00,126][15401] Updated weights for policy 0, policy_version 643281 (0.0032) [2024-06-24 10:41:02,898][15401] Updated weights for policy 0, policy_version 643291 (0.0038) [2024-06-24 10:41:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10539696128. Throughput: 0: 42882.6. Samples: 10539842380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 10:41:07,682][15401] Updated weights for policy 0, policy_version 643301 (0.0033) [2024-06-24 10:41:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10539892736. Throughput: 0: 42684.4. Samples: 10539972420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 10:41:10,538][15401] Updated weights for policy 0, policy_version 643311 (0.0028) [2024-06-24 10:41:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10540122112. Throughput: 0: 42686.9. Samples: 10540224920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 10:41:15,369][15401] Updated weights for policy 0, policy_version 643321 (0.0031) [2024-06-24 10:41:18,043][15401] Updated weights for policy 0, policy_version 643331 (0.0030) [2024-06-24 10:41:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10540335104. Throughput: 0: 42752.0. Samples: 10540474720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 10:41:22,948][15401] Updated weights for policy 0, policy_version 643341 (0.0036) [2024-06-24 10:41:23,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10540498944. Throughput: 0: 42428.9. Samples: 10540608380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 10:41:25,553][15401] Updated weights for policy 0, policy_version 643351 (0.0037) [2024-06-24 10:41:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10540777472. Throughput: 0: 42660.7. Samples: 10540864520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 10:41:30,881][15401] Updated weights for policy 0, policy_version 643361 (0.0044) [2024-06-24 10:41:33,171][15401] Updated weights for policy 0, policy_version 643371 (0.0025) [2024-06-24 10:41:33,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10540990464. Throughput: 0: 42762.1. Samples: 10541119260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 10:41:38,195][15401] Updated weights for policy 0, policy_version 643381 (0.0037) [2024-06-24 10:41:38,389][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10541154304. Throughput: 0: 42771.5. Samples: 10541258260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 10:41:40,629][15401] Updated weights for policy 0, policy_version 643391 (0.0030) [2024-06-24 10:41:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10541383680. Throughput: 0: 42934.3. Samples: 10541513000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 10:41:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 10:41:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643396_10541400064.pth... [2024-06-24 10:41:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000642771_10531160064.pth [2024-06-24 10:41:45,895][15401] Updated weights for policy 0, policy_version 643401 (0.0025) [2024-06-24 10:41:47,642][15349] Signal inference workers to stop experience collection... (156100 times) [2024-06-24 10:41:47,644][15349] Signal inference workers to resume experience collection... (156100 times) [2024-06-24 10:41:47,687][15401] InferenceWorker_p0-w0: stopping experience collection (156100 times) [2024-06-24 10:41:47,688][15401] InferenceWorker_p0-w0: resuming experience collection (156100 times) [2024-06-24 10:41:48,348][15401] Updated weights for policy 0, policy_version 643411 (0.0051) [2024-06-24 10:41:48,390][15132] Fps is (10 sec: 49151.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10541645824. Throughput: 0: 42699.5. Samples: 10541763860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:41:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 10:41:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10541793280. Throughput: 0: 42771.7. Samples: 10541897140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:41:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 10:41:53,709][15401] Updated weights for policy 0, policy_version 643421 (0.0038) [2024-06-24 10:41:56,059][15401] Updated weights for policy 0, policy_version 643431 (0.0029) [2024-06-24 10:41:58,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10542022656. Throughput: 0: 42590.8. Samples: 10542141500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:41:58,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 10:42:01,343][15401] Updated weights for policy 0, policy_version 643441 (0.0035) [2024-06-24 10:42:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10542268416. Throughput: 0: 42830.2. Samples: 10542402080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:03,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 10:42:03,762][15401] Updated weights for policy 0, policy_version 643451 (0.0034) [2024-06-24 10:42:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10542432256. Throughput: 0: 42903.9. Samples: 10542539060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 10:42:08,920][15401] Updated weights for policy 0, policy_version 643461 (0.0042) [2024-06-24 10:42:11,333][15401] Updated weights for policy 0, policy_version 643471 (0.0034) [2024-06-24 10:42:13,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10542678016. Throughput: 0: 42674.9. Samples: 10542785000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:13,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 10:42:16,633][15401] Updated weights for policy 0, policy_version 643481 (0.0044) [2024-06-24 10:42:18,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10542891008. Throughput: 0: 42863.7. Samples: 10543048120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 10:42:18,901][15401] Updated weights for policy 0, policy_version 643491 (0.0026) [2024-06-24 10:42:23,390][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10543071232. Throughput: 0: 42692.8. Samples: 10543179440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 10:42:24,373][15401] Updated weights for policy 0, policy_version 643501 (0.0047) [2024-06-24 10:42:26,630][15401] Updated weights for policy 0, policy_version 643511 (0.0037) [2024-06-24 10:42:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10543316992. Throughput: 0: 42386.5. Samples: 10543420400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 10:42:32,041][15401] Updated weights for policy 0, policy_version 643521 (0.0043) [2024-06-24 10:42:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10543529984. Throughput: 0: 42641.8. Samples: 10543682740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 10:42:34,325][15401] Updated weights for policy 0, policy_version 643531 (0.0032) [2024-06-24 10:42:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10543693824. Throughput: 0: 42524.4. Samples: 10543810740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:38,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 10:42:39,679][15401] Updated weights for policy 0, policy_version 643541 (0.0027) [2024-06-24 10:42:42,335][15401] Updated weights for policy 0, policy_version 643551 (0.0033) [2024-06-24 10:42:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10543972352. Throughput: 0: 42721.7. Samples: 10544063980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:43,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 10:42:47,380][15401] Updated weights for policy 0, policy_version 643561 (0.0037) [2024-06-24 10:42:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 10544152576. Throughput: 0: 42708.5. Samples: 10544323960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 10:42:49,792][15401] Updated weights for policy 0, policy_version 643571 (0.0029) [2024-06-24 10:42:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10544349184. Throughput: 0: 42515.6. Samples: 10544452260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 10:42:54,517][15349] Signal inference workers to stop experience collection... (156150 times) [2024-06-24 10:42:54,518][15349] Signal inference workers to resume experience collection... (156150 times) [2024-06-24 10:42:54,558][15401] InferenceWorker_p0-w0: stopping experience collection (156150 times) [2024-06-24 10:42:54,558][15401] InferenceWorker_p0-w0: resuming experience collection (156150 times) [2024-06-24 10:42:54,850][15401] Updated weights for policy 0, policy_version 643581 (0.0036) [2024-06-24 10:42:57,307][15401] Updated weights for policy 0, policy_version 643591 (0.0041) [2024-06-24 10:42:58,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 10544627712. Throughput: 0: 42602.2. Samples: 10544702000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:42:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 10:43:02,557][15401] Updated weights for policy 0, policy_version 643601 (0.0031) [2024-06-24 10:43:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 10544775168. Throughput: 0: 42705.2. Samples: 10544969860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:43:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 10:43:04,803][15401] Updated weights for policy 0, policy_version 643611 (0.0023) [2024-06-24 10:43:08,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 10544988160. Throughput: 0: 42382.6. Samples: 10545086660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:43:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 10:43:09,951][15401] Updated weights for policy 0, policy_version 643621 (0.0022) [2024-06-24 10:43:12,519][15401] Updated weights for policy 0, policy_version 643631 (0.0028) [2024-06-24 10:43:13,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 10545266688. Throughput: 0: 42784.4. Samples: 10545345700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:43:13,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 10:43:17,490][15401] Updated weights for policy 0, policy_version 643641 (0.0038) [2024-06-24 10:43:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10545430528. Throughput: 0: 42916.3. Samples: 10545613980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 10:43:18,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 10:43:20,219][15401] Updated weights for policy 0, policy_version 643651 (0.0024) [2024-06-24 10:43:23,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10545627136. Throughput: 0: 42722.1. Samples: 10545733240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 10:43:25,132][15401] Updated weights for policy 0, policy_version 643661 (0.0028) [2024-06-24 10:43:28,082][15401] Updated weights for policy 0, policy_version 643671 (0.0032) [2024-06-24 10:43:28,389][15132] Fps is (10 sec: 49153.4, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 10545922048. Throughput: 0: 42980.2. Samples: 10545998080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 10:43:33,121][15401] Updated weights for policy 0, policy_version 643681 (0.0032) [2024-06-24 10:43:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10546069504. Throughput: 0: 42899.5. Samples: 10546254440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:33,390][15132] Avg episode reward: [(0, '0.130')] [2024-06-24 10:43:35,712][15401] Updated weights for policy 0, policy_version 643691 (0.0028) [2024-06-24 10:43:38,389][15132] Fps is (10 sec: 34406.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10546266112. Throughput: 0: 42710.4. Samples: 10546374220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 10:43:40,858][15401] Updated weights for policy 0, policy_version 643701 (0.0037) [2024-06-24 10:43:43,277][15401] Updated weights for policy 0, policy_version 643711 (0.0037) [2024-06-24 10:43:43,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10546561024. Throughput: 0: 42938.3. Samples: 10546634220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 10:43:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643711_10546561024.pth... [2024-06-24 10:43:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643081_10536239104.pth [2024-06-24 10:43:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 10546708480. Throughput: 0: 43043.1. Samples: 10546906800. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:48,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 10:43:48,563][15401] Updated weights for policy 0, policy_version 643721 (0.0032) [2024-06-24 10:43:49,467][15349] Signal inference workers to stop experience collection... (156200 times) [2024-06-24 10:43:49,469][15349] Signal inference workers to resume experience collection... (156200 times) [2024-06-24 10:43:49,485][15401] InferenceWorker_p0-w0: stopping experience collection (156200 times) [2024-06-24 10:43:49,485][15401] InferenceWorker_p0-w0: resuming experience collection (156200 times) [2024-06-24 10:43:50,794][15401] Updated weights for policy 0, policy_version 643731 (0.0031) [2024-06-24 10:43:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10546937856. Throughput: 0: 42980.0. Samples: 10547020760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 10:43:56,040][15401] Updated weights for policy 0, policy_version 643741 (0.0038) [2024-06-24 10:43:58,380][15401] Updated weights for policy 0, policy_version 643751 (0.0044) [2024-06-24 10:43:58,389][15132] Fps is (10 sec: 50791.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10547216384. Throughput: 0: 43153.0. Samples: 10547287580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:43:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 10:44:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10547363840. Throughput: 0: 43028.6. Samples: 10547550260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 10:44:03,454][15401] Updated weights for policy 0, policy_version 643761 (0.0028) [2024-06-24 10:44:06,028][15401] Updated weights for policy 0, policy_version 643771 (0.0029) [2024-06-24 10:44:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10547593216. Throughput: 0: 42900.6. Samples: 10547663760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 10:44:10,953][15401] Updated weights for policy 0, policy_version 643781 (0.0033) [2024-06-24 10:44:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.8). Total num frames: 10547838976. Throughput: 0: 43011.1. Samples: 10547933580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 10:44:13,936][15401] Updated weights for policy 0, policy_version 643791 (0.0028) [2024-06-24 10:44:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10548019200. Throughput: 0: 43155.1. Samples: 10548196420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 10:44:18,735][15401] Updated weights for policy 0, policy_version 643801 (0.0036) [2024-06-24 10:44:21,672][15401] Updated weights for policy 0, policy_version 643811 (0.0035) [2024-06-24 10:44:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43417.7, 300 sec: 42765.4). Total num frames: 10548232192. Throughput: 0: 43133.2. Samples: 10548315220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:23,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 10:44:26,125][15401] Updated weights for policy 0, policy_version 643821 (0.0038) [2024-06-24 10:44:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10548461568. Throughput: 0: 43249.4. Samples: 10548580440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 10:44:29,194][15401] Updated weights for policy 0, policy_version 643831 (0.0043) [2024-06-24 10:44:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10548658176. Throughput: 0: 42965.8. Samples: 10548840260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 10:44:33,629][15401] Updated weights for policy 0, policy_version 643841 (0.0038) [2024-06-24 10:44:37,446][15401] Updated weights for policy 0, policy_version 643851 (0.0039) [2024-06-24 10:44:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 10548871168. Throughput: 0: 43041.5. Samples: 10548957620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 10:44:42,074][15401] Updated weights for policy 0, policy_version 643861 (0.0042) [2024-06-24 10:44:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10549116928. Throughput: 0: 42970.1. Samples: 10549221240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 10:44:45,081][15401] Updated weights for policy 0, policy_version 643871 (0.0046) [2024-06-24 10:44:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 10549297152. Throughput: 0: 42747.4. Samples: 10549473900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:48,395][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 10:44:48,616][15349] Signal inference workers to stop experience collection... (156250 times) [2024-06-24 10:44:48,616][15349] Signal inference workers to resume experience collection... (156250 times) [2024-06-24 10:44:48,643][15401] InferenceWorker_p0-w0: stopping experience collection (156250 times) [2024-06-24 10:44:48,643][15401] InferenceWorker_p0-w0: resuming experience collection (156250 times) [2024-06-24 10:44:49,534][15401] Updated weights for policy 0, policy_version 643881 (0.0037) [2024-06-24 10:44:52,985][15401] Updated weights for policy 0, policy_version 643891 (0.0028) [2024-06-24 10:44:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10549510144. Throughput: 0: 42931.5. Samples: 10549595680. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-24 10:44:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 10:44:57,119][15401] Updated weights for policy 0, policy_version 643901 (0.0037) [2024-06-24 10:44:58,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10549739520. Throughput: 0: 42836.4. Samples: 10549861220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:44:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 10:45:00,848][15401] Updated weights for policy 0, policy_version 643911 (0.0028) [2024-06-24 10:45:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10549952512. Throughput: 0: 42566.6. Samples: 10550111920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 10:45:04,742][15401] Updated weights for policy 0, policy_version 643921 (0.0031) [2024-06-24 10:45:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10550149120. Throughput: 0: 42793.0. Samples: 10550240900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 10:45:08,405][15401] Updated weights for policy 0, policy_version 643931 (0.0043) [2024-06-24 10:45:12,302][15401] Updated weights for policy 0, policy_version 643941 (0.0023) [2024-06-24 10:45:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10550378496. Throughput: 0: 42786.2. Samples: 10550505820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:13,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 10:45:15,942][15401] Updated weights for policy 0, policy_version 643951 (0.0037) [2024-06-24 10:45:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10550607872. Throughput: 0: 42646.3. Samples: 10550759340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 10:45:19,841][15401] Updated weights for policy 0, policy_version 643961 (0.0041) [2024-06-24 10:45:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10550804480. Throughput: 0: 42813.6. Samples: 10550884340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:23,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 10:45:23,806][15401] Updated weights for policy 0, policy_version 643971 (0.0035) [2024-06-24 10:45:27,514][15401] Updated weights for policy 0, policy_version 643981 (0.0041) [2024-06-24 10:45:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10551033856. Throughput: 0: 42669.9. Samples: 10551141380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:28,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 10:45:31,207][15401] Updated weights for policy 0, policy_version 643991 (0.0033) [2024-06-24 10:45:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10551230464. Throughput: 0: 42751.2. Samples: 10551397700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 10:45:35,245][15401] Updated weights for policy 0, policy_version 644001 (0.0034) [2024-06-24 10:45:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10551459840. Throughput: 0: 42844.9. Samples: 10551523700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 10:45:38,735][15401] Updated weights for policy 0, policy_version 644011 (0.0039) [2024-06-24 10:45:42,949][15401] Updated weights for policy 0, policy_version 644021 (0.0039) [2024-06-24 10:45:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10551656448. Throughput: 0: 42752.3. Samples: 10551785080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 10:45:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644023_10551672832.pth... [2024-06-24 10:45:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643396_10541400064.pth [2024-06-24 10:45:46,110][15401] Updated weights for policy 0, policy_version 644031 (0.0039) [2024-06-24 10:45:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 10551853056. Throughput: 0: 42891.7. Samples: 10552042040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 10:45:50,528][15401] Updated weights for policy 0, policy_version 644041 (0.0026) [2024-06-24 10:45:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10552082432. Throughput: 0: 42787.5. Samples: 10552166340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 10:45:53,984][15401] Updated weights for policy 0, policy_version 644051 (0.0029) [2024-06-24 10:45:58,043][15401] Updated weights for policy 0, policy_version 644061 (0.0027) [2024-06-24 10:45:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10552295424. Throughput: 0: 42790.3. Samples: 10552431380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:45:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 10:46:01,814][15401] Updated weights for policy 0, policy_version 644071 (0.0033) [2024-06-24 10:46:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10552508416. Throughput: 0: 42763.5. Samples: 10552683700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 10:46:05,531][15401] Updated weights for policy 0, policy_version 644081 (0.0040) [2024-06-24 10:46:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10552721408. Throughput: 0: 42800.6. Samples: 10552810260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 10:46:09,609][15401] Updated weights for policy 0, policy_version 644091 (0.0050) [2024-06-24 10:46:12,708][15349] Signal inference workers to stop experience collection... (156300 times) [2024-06-24 10:46:12,744][15401] InferenceWorker_p0-w0: stopping experience collection (156300 times) [2024-06-24 10:46:12,824][15349] Signal inference workers to resume experience collection... (156300 times) [2024-06-24 10:46:12,824][15401] InferenceWorker_p0-w0: resuming experience collection (156300 times) [2024-06-24 10:46:13,123][15401] Updated weights for policy 0, policy_version 644101 (0.0036) [2024-06-24 10:46:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10552950784. Throughput: 0: 42892.3. Samples: 10553071540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 10:46:17,322][15401] Updated weights for policy 0, policy_version 644111 (0.0036) [2024-06-24 10:46:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10553147392. Throughput: 0: 42705.4. Samples: 10553319440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 10:46:21,014][15401] Updated weights for policy 0, policy_version 644121 (0.0034) [2024-06-24 10:46:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42709.4). Total num frames: 10553376768. Throughput: 0: 42693.7. Samples: 10553444920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:23,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 10:46:24,914][15401] Updated weights for policy 0, policy_version 644131 (0.0040) [2024-06-24 10:46:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10553573376. Throughput: 0: 42796.5. Samples: 10553710920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 10:46:28,758][15401] Updated weights for policy 0, policy_version 644141 (0.0034) [2024-06-24 10:46:32,429][15401] Updated weights for policy 0, policy_version 644151 (0.0045) [2024-06-24 10:46:33,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10553802752. Throughput: 0: 42750.3. Samples: 10553965800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-24 10:46:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 10:46:36,440][15401] Updated weights for policy 0, policy_version 644161 (0.0046) [2024-06-24 10:46:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10554015744. Throughput: 0: 42792.8. Samples: 10554092020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:46:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 10:46:40,256][15401] Updated weights for policy 0, policy_version 644171 (0.0043) [2024-06-24 10:46:43,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10554212352. Throughput: 0: 42665.1. Samples: 10554351320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:46:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 10:46:43,925][15401] Updated weights for policy 0, policy_version 644181 (0.0027) [2024-06-24 10:46:48,022][15401] Updated weights for policy 0, policy_version 644191 (0.0035) [2024-06-24 10:46:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10554425344. Throughput: 0: 42729.7. Samples: 10554606540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:46:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 10:46:51,547][15401] Updated weights for policy 0, policy_version 644201 (0.0038) [2024-06-24 10:46:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10554654720. Throughput: 0: 42839.4. Samples: 10554738040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:46:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 10:46:55,669][15401] Updated weights for policy 0, policy_version 644211 (0.0035) [2024-06-24 10:46:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10554867712. Throughput: 0: 42646.8. Samples: 10554990640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:46:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 10:46:59,212][15401] Updated weights for policy 0, policy_version 644221 (0.0033) [2024-06-24 10:47:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10555064320. Throughput: 0: 42904.5. Samples: 10555250140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 10:47:03,428][15401] Updated weights for policy 0, policy_version 644231 (0.0028) [2024-06-24 10:47:07,140][15401] Updated weights for policy 0, policy_version 644241 (0.0041) [2024-06-24 10:47:08,391][15132] Fps is (10 sec: 44230.6, 60 sec: 43143.5, 300 sec: 42820.7). Total num frames: 10555310080. Throughput: 0: 42934.8. Samples: 10555377040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:08,391][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 10:47:11,016][15401] Updated weights for policy 0, policy_version 644251 (0.0029) [2024-06-24 10:47:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10555506688. Throughput: 0: 42602.1. Samples: 10555628020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 10:47:14,713][15401] Updated weights for policy 0, policy_version 644261 (0.0051) [2024-06-24 10:47:18,389][15132] Fps is (10 sec: 39327.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10555703296. Throughput: 0: 42783.5. Samples: 10555891060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:18,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-24 10:47:18,639][15401] Updated weights for policy 0, policy_version 644271 (0.0024) [2024-06-24 10:47:22,220][15401] Updated weights for policy 0, policy_version 644281 (0.0031) [2024-06-24 10:47:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10555949056. Throughput: 0: 42666.0. Samples: 10556011980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:23,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 10:47:26,424][15401] Updated weights for policy 0, policy_version 644291 (0.0038) [2024-06-24 10:47:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10556145664. Throughput: 0: 42557.9. Samples: 10556266420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 10:47:29,977][15401] Updated weights for policy 0, policy_version 644301 (0.0040) [2024-06-24 10:47:33,302][15349] Signal inference workers to stop experience collection... (156350 times) [2024-06-24 10:47:33,331][15401] InferenceWorker_p0-w0: stopping experience collection (156350 times) [2024-06-24 10:47:33,351][15349] Signal inference workers to resume experience collection... (156350 times) [2024-06-24 10:47:33,352][15401] InferenceWorker_p0-w0: resuming experience collection (156350 times) [2024-06-24 10:47:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10556342272. Throughput: 0: 42590.0. Samples: 10556523080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 10:47:33,978][15401] Updated weights for policy 0, policy_version 644311 (0.0029) [2024-06-24 10:47:37,651][15401] Updated weights for policy 0, policy_version 644321 (0.0038) [2024-06-24 10:47:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10556588032. Throughput: 0: 42454.4. Samples: 10556648480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 10:47:41,446][15401] Updated weights for policy 0, policy_version 644331 (0.0031) [2024-06-24 10:47:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10556784640. Throughput: 0: 42638.3. Samples: 10556909360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 10:47:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644336_10556801024.pth... [2024-06-24 10:47:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000643711_10546561024.pth [2024-06-24 10:47:45,516][15401] Updated weights for policy 0, policy_version 644341 (0.0038) [2024-06-24 10:47:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 10556981248. Throughput: 0: 42609.0. Samples: 10557167540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:48,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 10:47:49,085][15401] Updated weights for policy 0, policy_version 644351 (0.0044) [2024-06-24 10:47:53,135][15401] Updated weights for policy 0, policy_version 644361 (0.0042) [2024-06-24 10:47:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10557210624. Throughput: 0: 42553.3. Samples: 10557291880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 10:47:57,254][15401] Updated weights for policy 0, policy_version 644371 (0.0028) [2024-06-24 10:47:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 10557407232. Throughput: 0: 42720.4. Samples: 10557550440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:47:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 10:48:00,766][15401] Updated weights for policy 0, policy_version 644381 (0.0031) [2024-06-24 10:48:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 10557620224. Throughput: 0: 42583.9. Samples: 10557807340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:48:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 10:48:04,728][15401] Updated weights for policy 0, policy_version 644391 (0.0032) [2024-06-24 10:48:08,377][15401] Updated weights for policy 0, policy_version 644401 (0.0035) [2024-06-24 10:48:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 10557865984. Throughput: 0: 42709.7. Samples: 10557933920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 10:48:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 10:48:12,287][15401] Updated weights for policy 0, policy_version 644411 (0.0038) [2024-06-24 10:48:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10558062592. Throughput: 0: 42721.4. Samples: 10558188880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 10:48:15,990][15401] Updated weights for policy 0, policy_version 644421 (0.0036) [2024-06-24 10:48:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10558275584. Throughput: 0: 42767.0. Samples: 10558447600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:18,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-24 10:48:19,786][15401] Updated weights for policy 0, policy_version 644431 (0.0029) [2024-06-24 10:48:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10558504960. Throughput: 0: 42831.5. Samples: 10558575900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:23,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 10:48:23,450][15401] Updated weights for policy 0, policy_version 644441 (0.0039) [2024-06-24 10:48:27,876][15401] Updated weights for policy 0, policy_version 644451 (0.0040) [2024-06-24 10:48:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 10558701568. Throughput: 0: 42745.2. Samples: 10558833000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:28,392][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 10:48:31,110][15401] Updated weights for policy 0, policy_version 644461 (0.0028) [2024-06-24 10:48:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10558930944. Throughput: 0: 42755.3. Samples: 10559091540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:33,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-24 10:48:35,370][15401] Updated weights for policy 0, policy_version 644471 (0.0030) [2024-06-24 10:48:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10559143936. Throughput: 0: 42928.9. Samples: 10559223680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:38,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 10:48:38,938][15401] Updated weights for policy 0, policy_version 644481 (0.0032) [2024-06-24 10:48:42,948][15401] Updated weights for policy 0, policy_version 644491 (0.0045) [2024-06-24 10:48:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 10559356928. Throughput: 0: 42989.3. Samples: 10559484960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 10:48:46,462][15401] Updated weights for policy 0, policy_version 644501 (0.0028) [2024-06-24 10:48:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 10559569920. Throughput: 0: 42854.7. Samples: 10559735800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:48,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-24 10:48:50,644][15401] Updated weights for policy 0, policy_version 644511 (0.0036) [2024-06-24 10:48:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10559782912. Throughput: 0: 42973.7. Samples: 10559867740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 10:48:54,114][15401] Updated weights for policy 0, policy_version 644521 (0.0031) [2024-06-24 10:48:58,290][15401] Updated weights for policy 0, policy_version 644531 (0.0028) [2024-06-24 10:48:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 10559995904. Throughput: 0: 42973.8. Samples: 10560122980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:48:58,397][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 10:49:01,639][15401] Updated weights for policy 0, policy_version 644541 (0.0030) [2024-06-24 10:49:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10560208896. Throughput: 0: 42959.5. Samples: 10560380780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 10:49:05,985][15401] Updated weights for policy 0, policy_version 644551 (0.0046) [2024-06-24 10:49:07,073][15349] Signal inference workers to stop experience collection... (156400 times) [2024-06-24 10:49:07,120][15401] InferenceWorker_p0-w0: stopping experience collection (156400 times) [2024-06-24 10:49:07,188][15349] Signal inference workers to resume experience collection... (156400 times) [2024-06-24 10:49:07,188][15401] InferenceWorker_p0-w0: resuming experience collection (156400 times) [2024-06-24 10:49:08,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10560438272. Throughput: 0: 42900.4. Samples: 10560506420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:08,400][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 10:49:09,829][15401] Updated weights for policy 0, policy_version 644561 (0.0034) [2024-06-24 10:49:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10560634880. Throughput: 0: 42752.5. Samples: 10560756760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:13,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 10:49:13,814][15401] Updated weights for policy 0, policy_version 644571 (0.0028) [2024-06-24 10:49:17,359][15401] Updated weights for policy 0, policy_version 644581 (0.0041) [2024-06-24 10:49:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10560847872. Throughput: 0: 42860.9. Samples: 10561020280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 10:49:21,396][15401] Updated weights for policy 0, policy_version 644591 (0.0052) [2024-06-24 10:49:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10561060864. Throughput: 0: 42807.6. Samples: 10561150020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 10:49:25,120][15401] Updated weights for policy 0, policy_version 644601 (0.0039) [2024-06-24 10:49:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 10561257472. Throughput: 0: 42586.4. Samples: 10561401340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 10:49:29,053][15401] Updated weights for policy 0, policy_version 644611 (0.0043) [2024-06-24 10:49:32,877][15401] Updated weights for policy 0, policy_version 644621 (0.0022) [2024-06-24 10:49:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10561486848. Throughput: 0: 42775.5. Samples: 10561660700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 10:49:36,546][15401] Updated weights for policy 0, policy_version 644631 (0.0035) [2024-06-24 10:49:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10561716224. Throughput: 0: 42829.8. Samples: 10561795080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 10:49:40,322][15401] Updated weights for policy 0, policy_version 644641 (0.0037) [2024-06-24 10:49:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10561929216. Throughput: 0: 42843.2. Samples: 10562050660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 10:49:43,391][15132] Avg episode reward: [(0, '0.227')] [2024-06-24 10:49:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644649_10561929216.pth... [2024-06-24 10:49:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644023_10551672832.pth [2024-06-24 10:49:44,108][15401] Updated weights for policy 0, policy_version 644651 (0.0034) [2024-06-24 10:49:47,881][15401] Updated weights for policy 0, policy_version 644661 (0.0030) [2024-06-24 10:49:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10562158592. Throughput: 0: 42795.1. Samples: 10562306560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:49:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 10:49:51,918][15401] Updated weights for policy 0, policy_version 644671 (0.0027) [2024-06-24 10:49:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10562355200. Throughput: 0: 42862.7. Samples: 10562435240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:49:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 10:49:55,349][15401] Updated weights for policy 0, policy_version 644681 (0.0029) [2024-06-24 10:49:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 10562568192. Throughput: 0: 42901.7. Samples: 10562687340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:49:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 10:49:59,467][15401] Updated weights for policy 0, policy_version 644691 (0.0035) [2024-06-24 10:50:03,013][15401] Updated weights for policy 0, policy_version 644701 (0.0036) [2024-06-24 10:50:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10562781184. Throughput: 0: 42873.4. Samples: 10562949580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:03,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 10:50:07,133][15401] Updated weights for policy 0, policy_version 644711 (0.0037) [2024-06-24 10:50:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10562994176. Throughput: 0: 42924.4. Samples: 10563081620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 10:50:10,444][15401] Updated weights for policy 0, policy_version 644721 (0.0038) [2024-06-24 10:50:13,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 10563207168. Throughput: 0: 42970.3. Samples: 10563335280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:13,396][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 10:50:14,618][15401] Updated weights for policy 0, policy_version 644731 (0.0037) [2024-06-24 10:50:17,911][15401] Updated weights for policy 0, policy_version 644741 (0.0031) [2024-06-24 10:50:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 10563436544. Throughput: 0: 42937.4. Samples: 10563592880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 10:50:22,431][15401] Updated weights for policy 0, policy_version 644751 (0.0027) [2024-06-24 10:50:23,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10563633152. Throughput: 0: 42836.0. Samples: 10563722700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 10:50:25,953][15401] Updated weights for policy 0, policy_version 644761 (0.0038) [2024-06-24 10:50:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 10563862528. Throughput: 0: 42688.7. Samples: 10563971640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 10:50:30,158][15401] Updated weights for policy 0, policy_version 644771 (0.0032) [2024-06-24 10:50:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 10564075520. Throughput: 0: 42856.9. Samples: 10564235220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:33,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 10:50:33,563][15401] Updated weights for policy 0, policy_version 644781 (0.0026) [2024-06-24 10:50:34,069][15349] Signal inference workers to stop experience collection... (156450 times) [2024-06-24 10:50:34,096][15401] InferenceWorker_p0-w0: stopping experience collection (156450 times) [2024-06-24 10:50:34,123][15349] Signal inference workers to resume experience collection... (156450 times) [2024-06-24 10:50:34,128][15401] InferenceWorker_p0-w0: resuming experience collection (156450 times) [2024-06-24 10:50:37,840][15401] Updated weights for policy 0, policy_version 644791 (0.0040) [2024-06-24 10:50:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10564272128. Throughput: 0: 42805.3. Samples: 10564361480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 10:50:41,355][15401] Updated weights for policy 0, policy_version 644801 (0.0035) [2024-06-24 10:50:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 10564485120. Throughput: 0: 42759.5. Samples: 10564611520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 10:50:45,477][15401] Updated weights for policy 0, policy_version 644811 (0.0022) [2024-06-24 10:50:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10564698112. Throughput: 0: 42661.9. Samples: 10564869360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:48,399][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 10:50:49,137][15401] Updated weights for policy 0, policy_version 644821 (0.0031) [2024-06-24 10:50:53,090][15401] Updated weights for policy 0, policy_version 644831 (0.0042) [2024-06-24 10:50:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10564911104. Throughput: 0: 42508.9. Samples: 10564994520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:53,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 10:50:56,854][15401] Updated weights for policy 0, policy_version 644841 (0.0028) [2024-06-24 10:50:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10565156864. Throughput: 0: 42745.7. Samples: 10565258560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:50:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 10:51:00,497][15401] Updated weights for policy 0, policy_version 644851 (0.0029) [2024-06-24 10:51:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10565337088. Throughput: 0: 42807.2. Samples: 10565519200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:51:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 10:51:04,342][15401] Updated weights for policy 0, policy_version 644861 (0.0027) [2024-06-24 10:51:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10565550080. Throughput: 0: 42676.0. Samples: 10565643120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:51:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 10:51:08,792][15401] Updated weights for policy 0, policy_version 644871 (0.0032) [2024-06-24 10:51:11,779][15401] Updated weights for policy 0, policy_version 644881 (0.0038) [2024-06-24 10:51:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 10565795840. Throughput: 0: 42843.6. Samples: 10565899600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:51:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 10:51:16,219][15401] Updated weights for policy 0, policy_version 644891 (0.0035) [2024-06-24 10:51:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10565976064. Throughput: 0: 42736.9. Samples: 10566158280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:51:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 10:51:19,821][15401] Updated weights for policy 0, policy_version 644901 (0.0039) [2024-06-24 10:51:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10566205440. Throughput: 0: 42748.9. Samples: 10566285180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 10:51:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 10:51:23,619][15401] Updated weights for policy 0, policy_version 644911 (0.0040) [2024-06-24 10:51:27,255][15401] Updated weights for policy 0, policy_version 644921 (0.0036) [2024-06-24 10:51:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10566434816. Throughput: 0: 42987.1. Samples: 10566545940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 10:51:31,336][15401] Updated weights for policy 0, policy_version 644931 (0.0031) [2024-06-24 10:51:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 10566631424. Throughput: 0: 42986.7. Samples: 10566803760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 10:51:34,818][15401] Updated weights for policy 0, policy_version 644941 (0.0034) [2024-06-24 10:51:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10566844416. Throughput: 0: 42999.6. Samples: 10566929500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 10:51:38,944][15401] Updated weights for policy 0, policy_version 644951 (0.0031) [2024-06-24 10:51:42,483][15401] Updated weights for policy 0, policy_version 644961 (0.0048) [2024-06-24 10:51:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 10567090176. Throughput: 0: 43001.8. Samples: 10567193640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 10:51:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644965_10567106560.pth... [2024-06-24 10:51:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644336_10556801024.pth [2024-06-24 10:51:46,443][15401] Updated weights for policy 0, policy_version 644971 (0.0039) [2024-06-24 10:51:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10567286784. Throughput: 0: 42989.2. Samples: 10567453720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 10:51:50,131][15401] Updated weights for policy 0, policy_version 644981 (0.0038) [2024-06-24 10:51:53,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10567483392. Throughput: 0: 42857.2. Samples: 10567571800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:53,393][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 10:51:54,073][15401] Updated weights for policy 0, policy_version 644991 (0.0039) [2024-06-24 10:51:57,956][15401] Updated weights for policy 0, policy_version 645001 (0.0028) [2024-06-24 10:51:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10567712768. Throughput: 0: 42865.4. Samples: 10567828540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:51:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 10:52:01,607][15401] Updated weights for policy 0, policy_version 645011 (0.0031) [2024-06-24 10:52:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 10567925760. Throughput: 0: 42988.3. Samples: 10568092760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 10:52:03,946][15349] Signal inference workers to stop experience collection... (156500 times) [2024-06-24 10:52:03,973][15401] InferenceWorker_p0-w0: stopping experience collection (156500 times) [2024-06-24 10:52:04,008][15349] Signal inference workers to resume experience collection... (156500 times) [2024-06-24 10:52:04,010][15401] InferenceWorker_p0-w0: resuming experience collection (156500 times) [2024-06-24 10:52:05,497][15401] Updated weights for policy 0, policy_version 645021 (0.0029) [2024-06-24 10:52:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10568138752. Throughput: 0: 42927.1. Samples: 10568216900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 10:52:09,200][15401] Updated weights for policy 0, policy_version 645031 (0.0040) [2024-06-24 10:52:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10568335360. Throughput: 0: 42784.5. Samples: 10568471240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:13,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 10:52:13,478][15401] Updated weights for policy 0, policy_version 645041 (0.0026) [2024-06-24 10:52:17,426][15401] Updated weights for policy 0, policy_version 645051 (0.0033) [2024-06-24 10:52:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10568564736. Throughput: 0: 42792.0. Samples: 10568729400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 10:52:21,698][15401] Updated weights for policy 0, policy_version 645061 (0.0032) [2024-06-24 10:52:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10568794112. Throughput: 0: 42860.2. Samples: 10568858220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:23,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 10:52:24,822][15401] Updated weights for policy 0, policy_version 645071 (0.0026) [2024-06-24 10:52:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 10568974336. Throughput: 0: 42493.8. Samples: 10569105860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 10:52:29,192][15401] Updated weights for policy 0, policy_version 645081 (0.0054) [2024-06-24 10:52:32,504][15401] Updated weights for policy 0, policy_version 645091 (0.0043) [2024-06-24 10:52:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10569187328. Throughput: 0: 42602.2. Samples: 10569370820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:33,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 10:52:36,739][15401] Updated weights for policy 0, policy_version 645101 (0.0035) [2024-06-24 10:52:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10569433088. Throughput: 0: 42813.4. Samples: 10569498300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 10:52:40,103][15401] Updated weights for policy 0, policy_version 645111 (0.0033) [2024-06-24 10:52:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10569629696. Throughput: 0: 42628.0. Samples: 10569746800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:43,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 10:52:44,568][15401] Updated weights for policy 0, policy_version 645121 (0.0033) [2024-06-24 10:52:47,713][15401] Updated weights for policy 0, policy_version 645131 (0.0054) [2024-06-24 10:52:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10569826304. Throughput: 0: 42444.9. Samples: 10570002780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 10:52:52,139][15401] Updated weights for policy 0, policy_version 645141 (0.0037) [2024-06-24 10:52:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 10570039296. Throughput: 0: 42530.9. Samples: 10570130800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:53,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 10:52:55,524][15401] Updated weights for policy 0, policy_version 645151 (0.0033) [2024-06-24 10:52:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10570268672. Throughput: 0: 42550.6. Samples: 10570386020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-24 10:52:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 10:52:59,699][15401] Updated weights for policy 0, policy_version 645161 (0.0042) [2024-06-24 10:53:03,212][15401] Updated weights for policy 0, policy_version 645171 (0.0045) [2024-06-24 10:53:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10570481664. Throughput: 0: 42463.6. Samples: 10570640260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 10:53:07,353][15401] Updated weights for policy 0, policy_version 645181 (0.0028) [2024-06-24 10:53:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10570678272. Throughput: 0: 42417.0. Samples: 10570766980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 10:53:08,440][15349] Signal inference workers to stop experience collection... (156550 times) [2024-06-24 10:53:08,440][15349] Signal inference workers to resume experience collection... (156550 times) [2024-06-24 10:53:08,469][15401] InferenceWorker_p0-w0: stopping experience collection (156550 times) [2024-06-24 10:53:08,469][15401] InferenceWorker_p0-w0: resuming experience collection (156550 times) [2024-06-24 10:53:11,782][15401] Updated weights for policy 0, policy_version 645191 (0.0038) [2024-06-24 10:53:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10570907648. Throughput: 0: 42730.7. Samples: 10571028740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 10:53:14,835][15401] Updated weights for policy 0, policy_version 645201 (0.0039) [2024-06-24 10:53:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10571120640. Throughput: 0: 42370.6. Samples: 10571277500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 10:53:19,419][15401] Updated weights for policy 0, policy_version 645211 (0.0028) [2024-06-24 10:53:22,541][15401] Updated weights for policy 0, policy_version 645221 (0.0036) [2024-06-24 10:53:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42820.6). Total num frames: 10571333632. Throughput: 0: 42336.5. Samples: 10571403540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:23,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 10:53:26,880][15401] Updated weights for policy 0, policy_version 645231 (0.0041) [2024-06-24 10:53:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10571546624. Throughput: 0: 42784.4. Samples: 10571672100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:28,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 10:53:30,169][15401] Updated weights for policy 0, policy_version 645241 (0.0037) [2024-06-24 10:53:33,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10571776000. Throughput: 0: 42555.7. Samples: 10571917780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 10:53:34,418][15401] Updated weights for policy 0, policy_version 645251 (0.0035) [2024-06-24 10:53:37,718][15401] Updated weights for policy 0, policy_version 645261 (0.0035) [2024-06-24 10:53:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10571972608. Throughput: 0: 42557.8. Samples: 10572045900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:38,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 10:53:41,946][15401] Updated weights for policy 0, policy_version 645271 (0.0044) [2024-06-24 10:53:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10572169216. Throughput: 0: 42631.7. Samples: 10572304440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 10:53:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645275_10572185600.pth... [2024-06-24 10:53:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644649_10561929216.pth [2024-06-24 10:53:45,290][15401] Updated weights for policy 0, policy_version 645281 (0.0047) [2024-06-24 10:53:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10572398592. Throughput: 0: 42491.1. Samples: 10572552360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 10:53:49,618][15401] Updated weights for policy 0, policy_version 645291 (0.0024) [2024-06-24 10:53:53,096][15401] Updated weights for policy 0, policy_version 645301 (0.0039) [2024-06-24 10:53:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 10572627968. Throughput: 0: 42700.4. Samples: 10572688500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 10:53:57,838][15401] Updated weights for policy 0, policy_version 645311 (0.0039) [2024-06-24 10:53:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 10572791808. Throughput: 0: 42482.3. Samples: 10572940440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:53:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 10:54:00,606][15401] Updated weights for policy 0, policy_version 645321 (0.0032) [2024-06-24 10:54:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10573037568. Throughput: 0: 42562.7. Samples: 10573192820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:03,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 10:54:05,527][15401] Updated weights for policy 0, policy_version 645331 (0.0027) [2024-06-24 10:54:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10573250560. Throughput: 0: 42708.6. Samples: 10573325320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 10:54:08,439][15401] Updated weights for policy 0, policy_version 645341 (0.0031) [2024-06-24 10:54:13,078][15401] Updated weights for policy 0, policy_version 645351 (0.0038) [2024-06-24 10:54:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10573447168. Throughput: 0: 42295.5. Samples: 10573575400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 10:54:15,824][15349] Signal inference workers to stop experience collection... (156600 times) [2024-06-24 10:54:15,875][15401] InferenceWorker_p0-w0: stopping experience collection (156600 times) [2024-06-24 10:54:15,878][15349] Signal inference workers to resume experience collection... (156600 times) [2024-06-24 10:54:15,889][15401] InferenceWorker_p0-w0: resuming experience collection (156600 times) [2024-06-24 10:54:16,023][15401] Updated weights for policy 0, policy_version 645361 (0.0033) [2024-06-24 10:54:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10573692928. Throughput: 0: 42555.9. Samples: 10573832800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 10:54:20,472][15401] Updated weights for policy 0, policy_version 645371 (0.0035) [2024-06-24 10:54:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 10573889536. Throughput: 0: 42680.2. Samples: 10573966500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 10:54:23,829][15401] Updated weights for policy 0, policy_version 645381 (0.0045) [2024-06-24 10:54:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10574069760. Throughput: 0: 42603.6. Samples: 10574221600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 10:54:28,416][15401] Updated weights for policy 0, policy_version 645391 (0.0031) [2024-06-24 10:54:31,606][15401] Updated weights for policy 0, policy_version 645401 (0.0038) [2024-06-24 10:54:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10574348288. Throughput: 0: 42635.5. Samples: 10574470960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 10:54:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 10:54:35,871][15401] Updated weights for policy 0, policy_version 645411 (0.0028) [2024-06-24 10:54:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10574528512. Throughput: 0: 42697.3. Samples: 10574609880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:54:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 10:54:39,270][15401] Updated weights for policy 0, policy_version 645421 (0.0038) [2024-06-24 10:54:43,358][15401] Updated weights for policy 0, policy_version 645431 (0.0032) [2024-06-24 10:54:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10574741504. Throughput: 0: 42688.8. Samples: 10574861440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:54:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 10:54:47,028][15401] Updated weights for policy 0, policy_version 645441 (0.0041) [2024-06-24 10:54:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 10574987264. Throughput: 0: 42491.5. Samples: 10575105040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:54:48,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 10:54:50,889][15401] Updated weights for policy 0, policy_version 645451 (0.0025) [2024-06-24 10:54:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10575151104. Throughput: 0: 42605.6. Samples: 10575242580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:54:53,394][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 10:54:55,009][15401] Updated weights for policy 0, policy_version 645461 (0.0032) [2024-06-24 10:54:58,392][15132] Fps is (10 sec: 39322.2, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 10575380480. Throughput: 0: 42703.2. Samples: 10575497140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:54:58,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 10:54:58,439][15401] Updated weights for policy 0, policy_version 645471 (0.0032) [2024-06-24 10:55:02,659][15401] Updated weights for policy 0, policy_version 645481 (0.0029) [2024-06-24 10:55:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10575593472. Throughput: 0: 42652.1. Samples: 10575752140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 10:55:06,247][15401] Updated weights for policy 0, policy_version 645491 (0.0033) [2024-06-24 10:55:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 42599.3). Total num frames: 10575773696. Throughput: 0: 42474.7. Samples: 10575877860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 10:55:10,377][15401] Updated weights for policy 0, policy_version 645501 (0.0037) [2024-06-24 10:55:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10576003072. Throughput: 0: 42476.4. Samples: 10576133040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 10:55:14,018][15401] Updated weights for policy 0, policy_version 645511 (0.0033) [2024-06-24 10:55:17,972][15401] Updated weights for policy 0, policy_version 645521 (0.0036) [2024-06-24 10:55:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10576232448. Throughput: 0: 42730.2. Samples: 10576393820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 10:55:21,554][15401] Updated weights for policy 0, policy_version 645531 (0.0034) [2024-06-24 10:55:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10576412672. Throughput: 0: 42484.9. Samples: 10576521700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 10:55:25,656][15401] Updated weights for policy 0, policy_version 645541 (0.0037) [2024-06-24 10:55:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 10576658432. Throughput: 0: 42501.8. Samples: 10576774020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 10:55:29,606][15401] Updated weights for policy 0, policy_version 645551 (0.0037) [2024-06-24 10:55:33,275][15401] Updated weights for policy 0, policy_version 645561 (0.0037) [2024-06-24 10:55:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 10576871424. Throughput: 0: 42849.8. Samples: 10577033180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 10:55:37,233][15401] Updated weights for policy 0, policy_version 645571 (0.0035) [2024-06-24 10:55:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10577051648. Throughput: 0: 42600.1. Samples: 10577159580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 10:55:40,803][15401] Updated weights for policy 0, policy_version 645581 (0.0034) [2024-06-24 10:55:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10577281024. Throughput: 0: 42623.1. Samples: 10577415080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 10:55:43,530][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645588_10577313792.pth... [2024-06-24 10:55:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000644965_10567106560.pth [2024-06-24 10:55:44,816][15401] Updated weights for policy 0, policy_version 645591 (0.0042) [2024-06-24 10:55:45,934][15349] Signal inference workers to stop experience collection... (156650 times) [2024-06-24 10:55:45,935][15349] Signal inference workers to resume experience collection... (156650 times) [2024-06-24 10:55:45,989][15401] InferenceWorker_p0-w0: stopping experience collection (156650 times) [2024-06-24 10:55:45,989][15401] InferenceWorker_p0-w0: resuming experience collection (156650 times) [2024-06-24 10:55:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 10577510400. Throughput: 0: 42661.7. Samples: 10577671920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:48,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 10:55:48,474][15401] Updated weights for policy 0, policy_version 645601 (0.0040) [2024-06-24 10:55:52,582][15401] Updated weights for policy 0, policy_version 645611 (0.0036) [2024-06-24 10:55:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10577707008. Throughput: 0: 42583.9. Samples: 10577794140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 10:55:56,126][15401] Updated weights for policy 0, policy_version 645621 (0.0048) [2024-06-24 10:55:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10577920000. Throughput: 0: 42754.2. Samples: 10578056980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:55:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 10:55:59,963][15401] Updated weights for policy 0, policy_version 645631 (0.0031) [2024-06-24 10:56:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10578165760. Throughput: 0: 42599.1. Samples: 10578310780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:56:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 10:56:04,063][15401] Updated weights for policy 0, policy_version 645641 (0.0038) [2024-06-24 10:56:07,530][15401] Updated weights for policy 0, policy_version 645651 (0.0027) [2024-06-24 10:56:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 10578362368. Throughput: 0: 42717.7. Samples: 10578444000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:56:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 10:56:11,730][15401] Updated weights for policy 0, policy_version 645661 (0.0052) [2024-06-24 10:56:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10578575360. Throughput: 0: 42789.9. Samples: 10578699560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-24 10:56:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 10:56:15,320][15401] Updated weights for policy 0, policy_version 645671 (0.0031) [2024-06-24 10:56:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10578788352. Throughput: 0: 42721.4. Samples: 10578955640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 10:56:19,364][15401] Updated weights for policy 0, policy_version 645681 (0.0024) [2024-06-24 10:56:23,044][15401] Updated weights for policy 0, policy_version 645691 (0.0035) [2024-06-24 10:56:23,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43415.8, 300 sec: 42653.6). Total num frames: 10579017728. Throughput: 0: 42828.3. Samples: 10579086960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:23,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 10:56:26,899][15401] Updated weights for policy 0, policy_version 645701 (0.0032) [2024-06-24 10:56:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 10579197952. Throughput: 0: 42660.3. Samples: 10579334900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:28,392][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 10:56:30,875][15401] Updated weights for policy 0, policy_version 645711 (0.0042) [2024-06-24 10:56:33,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10579443712. Throughput: 0: 42567.0. Samples: 10579587440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 10:56:35,137][15401] Updated weights for policy 0, policy_version 645721 (0.0039) [2024-06-24 10:56:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 10579640320. Throughput: 0: 42794.6. Samples: 10579719900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 10:56:38,446][15401] Updated weights for policy 0, policy_version 645731 (0.0030) [2024-06-24 10:56:43,026][15401] Updated weights for policy 0, policy_version 645741 (0.0042) [2024-06-24 10:56:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10579820544. Throughput: 0: 42556.9. Samples: 10579972040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 10:56:46,675][15401] Updated weights for policy 0, policy_version 645751 (0.0026) [2024-06-24 10:56:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 10580066304. Throughput: 0: 42356.1. Samples: 10580216800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 10:56:50,673][15401] Updated weights for policy 0, policy_version 645761 (0.0027) [2024-06-24 10:56:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10580262912. Throughput: 0: 42450.6. Samples: 10580354280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:53,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 10:56:54,157][15401] Updated weights for policy 0, policy_version 645771 (0.0052) [2024-06-24 10:56:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 10580459520. Throughput: 0: 42288.9. Samples: 10580602560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:56:58,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 10:56:58,574][15401] Updated weights for policy 0, policy_version 645781 (0.0027) [2024-06-24 10:57:01,291][15349] Signal inference workers to stop experience collection... (156700 times) [2024-06-24 10:57:01,292][15349] Signal inference workers to resume experience collection... (156700 times) [2024-06-24 10:57:01,314][15401] InferenceWorker_p0-w0: stopping experience collection (156700 times) [2024-06-24 10:57:01,315][15401] InferenceWorker_p0-w0: resuming experience collection (156700 times) [2024-06-24 10:57:02,416][15401] Updated weights for policy 0, policy_version 645791 (0.0039) [2024-06-24 10:57:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10580705280. Throughput: 0: 42165.3. Samples: 10580853080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:03,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 10:57:06,040][15401] Updated weights for policy 0, policy_version 645801 (0.0034) [2024-06-24 10:57:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10580901888. Throughput: 0: 42272.6. Samples: 10580989120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 10:57:09,974][15401] Updated weights for policy 0, policy_version 645811 (0.0031) [2024-06-24 10:57:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10581114880. Throughput: 0: 42396.1. Samples: 10581242620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 10:57:13,498][15401] Updated weights for policy 0, policy_version 645821 (0.0042) [2024-06-24 10:57:17,565][15401] Updated weights for policy 0, policy_version 645831 (0.0026) [2024-06-24 10:57:18,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.2, 300 sec: 42542.9). Total num frames: 10581344256. Throughput: 0: 42382.1. Samples: 10581494640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:18,395][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 10:57:21,127][15401] Updated weights for policy 0, policy_version 645841 (0.0038) [2024-06-24 10:57:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 10581540864. Throughput: 0: 42394.6. Samples: 10581627660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:23,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 10:57:24,997][15401] Updated weights for policy 0, policy_version 645851 (0.0025) [2024-06-24 10:57:28,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 10581770240. Throughput: 0: 42531.1. Samples: 10581885940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 10:57:28,758][15401] Updated weights for policy 0, policy_version 645861 (0.0023) [2024-06-24 10:57:32,571][15401] Updated weights for policy 0, policy_version 645871 (0.0036) [2024-06-24 10:57:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 10581983232. Throughput: 0: 42641.9. Samples: 10582135700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 10:57:36,277][15401] Updated weights for policy 0, policy_version 645881 (0.0041) [2024-06-24 10:57:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10582179840. Throughput: 0: 42495.1. Samples: 10582266560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 10:57:40,055][15401] Updated weights for policy 0, policy_version 645891 (0.0024) [2024-06-24 10:57:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10582409216. Throughput: 0: 42847.7. Samples: 10582530720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 10:57:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645899_10582409216.pth... [2024-06-24 10:57:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645275_10572185600.pth [2024-06-24 10:57:43,770][15401] Updated weights for policy 0, policy_version 645901 (0.0024) [2024-06-24 10:57:47,626][15401] Updated weights for policy 0, policy_version 645911 (0.0027) [2024-06-24 10:57:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 10582622208. Throughput: 0: 42873.3. Samples: 10582782380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 10:57:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 10:57:51,462][15401] Updated weights for policy 0, policy_version 645921 (0.0027) [2024-06-24 10:57:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10582818816. Throughput: 0: 42664.8. Samples: 10582909040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:57:53,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 10:57:55,422][15401] Updated weights for policy 0, policy_version 645931 (0.0038) [2024-06-24 10:57:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10583048192. Throughput: 0: 42814.7. Samples: 10583169280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:57:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 10:57:59,162][15401] Updated weights for policy 0, policy_version 645941 (0.0035) [2024-06-24 10:58:02,914][15401] Updated weights for policy 0, policy_version 645951 (0.0032) [2024-06-24 10:58:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10583261184. Throughput: 0: 42818.0. Samples: 10583421440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 10:58:06,820][15401] Updated weights for policy 0, policy_version 645961 (0.0027) [2024-06-24 10:58:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10583457792. Throughput: 0: 42833.0. Samples: 10583555140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 10:58:08,749][15349] Signal inference workers to stop experience collection... (156750 times) [2024-06-24 10:58:08,750][15349] Signal inference workers to resume experience collection... (156750 times) [2024-06-24 10:58:08,792][15401] InferenceWorker_p0-w0: stopping experience collection (156750 times) [2024-06-24 10:58:08,792][15401] InferenceWorker_p0-w0: resuming experience collection (156750 times) [2024-06-24 10:58:10,507][15401] Updated weights for policy 0, policy_version 645971 (0.0035) [2024-06-24 10:58:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10583687168. Throughput: 0: 42869.3. Samples: 10583815060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 10:58:14,644][15401] Updated weights for policy 0, policy_version 645981 (0.0041) [2024-06-24 10:58:18,162][15401] Updated weights for policy 0, policy_version 645991 (0.0038) [2024-06-24 10:58:18,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42867.0, 300 sec: 42653.4). Total num frames: 10583916544. Throughput: 0: 42943.4. Samples: 10584068420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:18,397][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 10:58:22,333][15401] Updated weights for policy 0, policy_version 646001 (0.0034) [2024-06-24 10:58:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10584113152. Throughput: 0: 42854.2. Samples: 10584195000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 10:58:25,914][15401] Updated weights for policy 0, policy_version 646011 (0.0033) [2024-06-24 10:58:28,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10584342528. Throughput: 0: 42678.3. Samples: 10584451240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 10:58:29,932][15401] Updated weights for policy 0, policy_version 646021 (0.0041) [2024-06-24 10:58:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10584555520. Throughput: 0: 42911.2. Samples: 10584713380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 10:58:33,718][15401] Updated weights for policy 0, policy_version 646031 (0.0040) [2024-06-24 10:58:37,587][15401] Updated weights for policy 0, policy_version 646041 (0.0032) [2024-06-24 10:58:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 10584768512. Throughput: 0: 42993.3. Samples: 10584843740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 10:58:41,207][15401] Updated weights for policy 0, policy_version 646051 (0.0039) [2024-06-24 10:58:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10584981504. Throughput: 0: 42824.3. Samples: 10585096380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 10:58:45,212][15401] Updated weights for policy 0, policy_version 646061 (0.0031) [2024-06-24 10:58:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10585194496. Throughput: 0: 43036.4. Samples: 10585358080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 10:58:48,717][15401] Updated weights for policy 0, policy_version 646071 (0.0031) [2024-06-24 10:58:52,724][15401] Updated weights for policy 0, policy_version 646081 (0.0043) [2024-06-24 10:58:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10585391104. Throughput: 0: 42936.5. Samples: 10585487280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 10:58:56,451][15401] Updated weights for policy 0, policy_version 646091 (0.0027) [2024-06-24 10:58:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 10585620480. Throughput: 0: 42865.7. Samples: 10585744020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:58:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 10:59:01,045][15401] Updated weights for policy 0, policy_version 646101 (0.0028) [2024-06-24 10:59:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10585833472. Throughput: 0: 42845.7. Samples: 10585996200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:59:03,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 10:59:04,078][15401] Updated weights for policy 0, policy_version 646111 (0.0032) [2024-06-24 10:59:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10586030080. Throughput: 0: 42991.6. Samples: 10586129620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:59:08,394][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 10:59:08,648][15401] Updated weights for policy 0, policy_version 646121 (0.0031) [2024-06-24 10:59:11,564][15401] Updated weights for policy 0, policy_version 646131 (0.0027) [2024-06-24 10:59:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10586275840. Throughput: 0: 43046.3. Samples: 10586388320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:59:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 10:59:16,127][15401] Updated weights for policy 0, policy_version 646141 (0.0043) [2024-06-24 10:59:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 10586488832. Throughput: 0: 42857.7. Samples: 10586641980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:59:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 10:59:18,681][15349] Signal inference workers to stop experience collection... (156800 times) [2024-06-24 10:59:18,682][15349] Signal inference workers to resume experience collection... (156800 times) [2024-06-24 10:59:18,702][15401] InferenceWorker_p0-w0: stopping experience collection (156800 times) [2024-06-24 10:59:18,702][15401] InferenceWorker_p0-w0: resuming experience collection (156800 times) [2024-06-24 10:59:19,345][15401] Updated weights for policy 0, policy_version 646151 (0.0038) [2024-06-24 10:59:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10586685440. Throughput: 0: 42782.9. Samples: 10586768960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 10:59:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 10:59:23,830][15401] Updated weights for policy 0, policy_version 646161 (0.0036) [2024-06-24 10:59:26,827][15401] Updated weights for policy 0, policy_version 646171 (0.0027) [2024-06-24 10:59:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10586898432. Throughput: 0: 42845.4. Samples: 10587024420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 10:59:31,505][15401] Updated weights for policy 0, policy_version 646181 (0.0028) [2024-06-24 10:59:33,390][15132] Fps is (10 sec: 45873.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10587144192. Throughput: 0: 42861.5. Samples: 10587286860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 10:59:34,560][15401] Updated weights for policy 0, policy_version 646191 (0.0047) [2024-06-24 10:59:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10587308032. Throughput: 0: 42818.1. Samples: 10587414100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 10:59:39,060][15401] Updated weights for policy 0, policy_version 646201 (0.0041) [2024-06-24 10:59:42,022][15401] Updated weights for policy 0, policy_version 646211 (0.0038) [2024-06-24 10:59:43,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 10587553792. Throughput: 0: 42814.3. Samples: 10587670660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 10:59:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646213_10587553792.pth... [2024-06-24 10:59:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645588_10577313792.pth [2024-06-24 10:59:46,684][15401] Updated weights for policy 0, policy_version 646221 (0.0034) [2024-06-24 10:59:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10587766784. Throughput: 0: 43022.1. Samples: 10587932200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:48,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 10:59:49,614][15401] Updated weights for policy 0, policy_version 646231 (0.0040) [2024-06-24 10:59:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 10587963392. Throughput: 0: 43010.2. Samples: 10588065080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:53,392][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 10:59:54,164][15401] Updated weights for policy 0, policy_version 646241 (0.0033) [2024-06-24 10:59:57,127][15401] Updated weights for policy 0, policy_version 646251 (0.0032) [2024-06-24 10:59:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10588192768. Throughput: 0: 42774.7. Samples: 10588313180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 10:59:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 11:00:02,002][15401] Updated weights for policy 0, policy_version 646261 (0.0034) [2024-06-24 11:00:03,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 10588405760. Throughput: 0: 42891.8. Samples: 10588572380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:03,396][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 11:00:04,753][15401] Updated weights for policy 0, policy_version 646271 (0.0038) [2024-06-24 11:00:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10588618752. Throughput: 0: 43032.7. Samples: 10588705440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 11:00:09,525][15401] Updated weights for policy 0, policy_version 646281 (0.0028) [2024-06-24 11:00:12,757][15401] Updated weights for policy 0, policy_version 646291 (0.0025) [2024-06-24 11:00:13,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10588831744. Throughput: 0: 43032.8. Samples: 10588960900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 11:00:17,016][15401] Updated weights for policy 0, policy_version 646301 (0.0033) [2024-06-24 11:00:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 10589044736. Throughput: 0: 42998.8. Samples: 10589221900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:18,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 11:00:20,245][15401] Updated weights for policy 0, policy_version 646311 (0.0034) [2024-06-24 11:00:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10589274112. Throughput: 0: 42979.1. Samples: 10589348160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:23,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 11:00:25,062][15401] Updated weights for policy 0, policy_version 646321 (0.0028) [2024-06-24 11:00:27,613][15349] Signal inference workers to stop experience collection... (156850 times) [2024-06-24 11:00:27,613][15349] Signal inference workers to resume experience collection... (156850 times) [2024-06-24 11:00:27,660][15401] InferenceWorker_p0-w0: stopping experience collection (156850 times) [2024-06-24 11:00:27,660][15401] InferenceWorker_p0-w0: resuming experience collection (156850 times) [2024-06-24 11:00:27,750][15401] Updated weights for policy 0, policy_version 646331 (0.0034) [2024-06-24 11:00:28,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10589487104. Throughput: 0: 42965.9. Samples: 10589604120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 11:00:32,567][15401] Updated weights for policy 0, policy_version 646341 (0.0045) [2024-06-24 11:00:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10589700096. Throughput: 0: 43061.3. Samples: 10589869960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 11:00:35,271][15401] Updated weights for policy 0, policy_version 646351 (0.0035) [2024-06-24 11:00:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 10589913088. Throughput: 0: 42795.2. Samples: 10589990860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:38,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 11:00:40,027][15401] Updated weights for policy 0, policy_version 646361 (0.0030) [2024-06-24 11:00:42,881][15401] Updated weights for policy 0, policy_version 646371 (0.0028) [2024-06-24 11:00:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10590142464. Throughput: 0: 43100.4. Samples: 10590252700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 11:00:47,634][15401] Updated weights for policy 0, policy_version 646381 (0.0025) [2024-06-24 11:00:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10590322688. Throughput: 0: 43095.5. Samples: 10590511400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:48,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 11:00:50,889][15401] Updated weights for policy 0, policy_version 646391 (0.0043) [2024-06-24 11:00:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10590552064. Throughput: 0: 42780.0. Samples: 10590630540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 11:00:55,154][15401] Updated weights for policy 0, policy_version 646401 (0.0040) [2024-06-24 11:00:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10590781440. Throughput: 0: 42868.1. Samples: 10590889960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:00:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 11:00:58,508][15401] Updated weights for policy 0, policy_version 646411 (0.0039) [2024-06-24 11:01:02,855][15401] Updated weights for policy 0, policy_version 646421 (0.0029) [2024-06-24 11:01:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 10590961664. Throughput: 0: 42790.2. Samples: 10591147360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 11:01:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 11:01:06,135][15401] Updated weights for policy 0, policy_version 646431 (0.0027) [2024-06-24 11:01:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10591191040. Throughput: 0: 42691.1. Samples: 10591269360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:08,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 11:01:10,550][15401] Updated weights for policy 0, policy_version 646441 (0.0037) [2024-06-24 11:01:13,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10591436800. Throughput: 0: 42934.1. Samples: 10591536160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 11:01:13,625][15401] Updated weights for policy 0, policy_version 646451 (0.0049) [2024-06-24 11:01:18,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10591600640. Throughput: 0: 42778.2. Samples: 10591795080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:18,393][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 11:01:18,810][15401] Updated weights for policy 0, policy_version 646461 (0.0027) [2024-06-24 11:01:21,625][15401] Updated weights for policy 0, policy_version 646471 (0.0026) [2024-06-24 11:01:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 10591846400. Throughput: 0: 42707.5. Samples: 10591912700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 11:01:26,429][15401] Updated weights for policy 0, policy_version 646481 (0.0052) [2024-06-24 11:01:28,389][15132] Fps is (10 sec: 47525.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10592075776. Throughput: 0: 42843.6. Samples: 10592180660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 11:01:29,241][15401] Updated weights for policy 0, policy_version 646491 (0.0035) [2024-06-24 11:01:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10592256000. Throughput: 0: 42823.3. Samples: 10592438460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:33,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 11:01:33,899][15401] Updated weights for policy 0, policy_version 646501 (0.0048) [2024-06-24 11:01:36,546][15401] Updated weights for policy 0, policy_version 646511 (0.0031) [2024-06-24 11:01:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 10592501760. Throughput: 0: 42974.8. Samples: 10592564400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 11:01:39,709][15349] Signal inference workers to stop experience collection... (156900 times) [2024-06-24 11:01:39,756][15401] InferenceWorker_p0-w0: stopping experience collection (156900 times) [2024-06-24 11:01:39,770][15349] Signal inference workers to resume experience collection... (156900 times) [2024-06-24 11:01:39,770][15401] InferenceWorker_p0-w0: resuming experience collection (156900 times) [2024-06-24 11:01:41,517][15401] Updated weights for policy 0, policy_version 646521 (0.0034) [2024-06-24 11:01:43,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10592714752. Throughput: 0: 43020.9. Samples: 10592825900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:43,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 11:01:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646529_10592731136.pth... [2024-06-24 11:01:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000645899_10582409216.pth [2024-06-24 11:01:44,497][15401] Updated weights for policy 0, policy_version 646531 (0.0037) [2024-06-24 11:01:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10592894976. Throughput: 0: 43035.7. Samples: 10593083960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:48,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-24 11:01:49,019][15401] Updated weights for policy 0, policy_version 646541 (0.0042) [2024-06-24 11:01:52,143][15401] Updated weights for policy 0, policy_version 646551 (0.0036) [2024-06-24 11:01:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.8, 300 sec: 43042.7). Total num frames: 10593157120. Throughput: 0: 43106.9. Samples: 10593209060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:53,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-24 11:01:56,917][15401] Updated weights for policy 0, policy_version 646561 (0.0035) [2024-06-24 11:01:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10593337344. Throughput: 0: 42999.7. Samples: 10593471140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:01:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 11:01:59,688][15401] Updated weights for policy 0, policy_version 646571 (0.0027) [2024-06-24 11:02:03,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10593533952. Throughput: 0: 42993.4. Samples: 10593729680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 11:02:04,461][15401] Updated weights for policy 0, policy_version 646581 (0.0038) [2024-06-24 11:02:07,282][15401] Updated weights for policy 0, policy_version 646591 (0.0029) [2024-06-24 11:02:08,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43692.3, 300 sec: 43042.7). Total num frames: 10593812480. Throughput: 0: 43158.1. Samples: 10593854820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 11:02:11,950][15401] Updated weights for policy 0, policy_version 646601 (0.0027) [2024-06-24 11:02:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10593976320. Throughput: 0: 42907.0. Samples: 10594111480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 11:02:15,030][15401] Updated weights for policy 0, policy_version 646611 (0.0035) [2024-06-24 11:02:18,390][15132] Fps is (10 sec: 37683.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 10594189312. Throughput: 0: 42801.0. Samples: 10594364500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 11:02:19,516][15401] Updated weights for policy 0, policy_version 646621 (0.0046) [2024-06-24 11:02:22,758][15401] Updated weights for policy 0, policy_version 646631 (0.0038) [2024-06-24 11:02:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10594435072. Throughput: 0: 42910.5. Samples: 10594495380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 11:02:26,998][15401] Updated weights for policy 0, policy_version 646641 (0.0049) [2024-06-24 11:02:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10594615296. Throughput: 0: 42781.8. Samples: 10594751080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 11:02:30,291][15401] Updated weights for policy 0, policy_version 646651 (0.0030) [2024-06-24 11:02:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 10594844672. Throughput: 0: 42679.5. Samples: 10595004540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 11:02:34,406][15401] Updated weights for policy 0, policy_version 646661 (0.0039) [2024-06-24 11:02:38,057][15401] Updated weights for policy 0, policy_version 646671 (0.0028) [2024-06-24 11:02:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 10595074048. Throughput: 0: 42877.8. Samples: 10595138560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:02:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 11:02:42,502][15401] Updated weights for policy 0, policy_version 646681 (0.0026) [2024-06-24 11:02:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10595254272. Throughput: 0: 42678.1. Samples: 10595391660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:02:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 11:02:45,900][15401] Updated weights for policy 0, policy_version 646691 (0.0035) [2024-06-24 11:02:47,494][15349] Signal inference workers to stop experience collection... (156950 times) [2024-06-24 11:02:47,497][15349] Signal inference workers to resume experience collection... (156950 times) [2024-06-24 11:02:47,544][15401] InferenceWorker_p0-w0: stopping experience collection (156950 times) [2024-06-24 11:02:47,544][15401] InferenceWorker_p0-w0: resuming experience collection (156950 times) [2024-06-24 11:02:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 10595500032. Throughput: 0: 42334.7. Samples: 10595634740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:02:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 11:02:50,276][15401] Updated weights for policy 0, policy_version 646701 (0.0034) [2024-06-24 11:02:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10595696640. Throughput: 0: 42656.6. Samples: 10595774360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:02:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 11:02:53,412][15401] Updated weights for policy 0, policy_version 646711 (0.0037) [2024-06-24 11:02:57,889][15401] Updated weights for policy 0, policy_version 646721 (0.0033) [2024-06-24 11:02:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10595893248. Throughput: 0: 42686.3. Samples: 10596032360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:02:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 11:03:01,191][15401] Updated weights for policy 0, policy_version 646731 (0.0035) [2024-06-24 11:03:03,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43689.0, 300 sec: 43042.4). Total num frames: 10596155392. Throughput: 0: 42575.6. Samples: 10596280500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:03,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 11:03:05,499][15401] Updated weights for policy 0, policy_version 646741 (0.0048) [2024-06-24 11:03:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42876.1). Total num frames: 10596335616. Throughput: 0: 42709.9. Samples: 10596417320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 11:03:08,714][15401] Updated weights for policy 0, policy_version 646751 (0.0033) [2024-06-24 11:03:13,295][15401] Updated weights for policy 0, policy_version 646761 (0.0033) [2024-06-24 11:03:13,389][15132] Fps is (10 sec: 37692.2, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 10596532224. Throughput: 0: 42658.2. Samples: 10596670700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 11:03:16,330][15401] Updated weights for policy 0, policy_version 646771 (0.0034) [2024-06-24 11:03:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 10596777984. Throughput: 0: 42681.4. Samples: 10596925200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 11:03:20,946][15401] Updated weights for policy 0, policy_version 646781 (0.0036) [2024-06-24 11:03:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10596974592. Throughput: 0: 42648.8. Samples: 10597057760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 11:03:24,238][15401] Updated weights for policy 0, policy_version 646791 (0.0031) [2024-06-24 11:03:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10597171200. Throughput: 0: 42615.2. Samples: 10597309340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 11:03:28,415][15401] Updated weights for policy 0, policy_version 646801 (0.0028) [2024-06-24 11:03:32,016][15401] Updated weights for policy 0, policy_version 646811 (0.0037) [2024-06-24 11:03:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10597416960. Throughput: 0: 42995.1. Samples: 10597569520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 11:03:35,930][15401] Updated weights for policy 0, policy_version 646821 (0.0039) [2024-06-24 11:03:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10597629952. Throughput: 0: 42935.9. Samples: 10597706480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 11:03:39,506][15401] Updated weights for policy 0, policy_version 646831 (0.0034) [2024-06-24 11:03:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10597826560. Throughput: 0: 42885.9. Samples: 10597962240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:43,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 11:03:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646840_10597826560.pth... [2024-06-24 11:03:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646213_10587553792.pth [2024-06-24 11:03:43,631][15401] Updated weights for policy 0, policy_version 646841 (0.0033) [2024-06-24 11:03:47,146][15401] Updated weights for policy 0, policy_version 646851 (0.0035) [2024-06-24 11:03:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 10598055936. Throughput: 0: 43090.2. Samples: 10598219560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:48,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 11:03:51,407][15401] Updated weights for policy 0, policy_version 646861 (0.0035) [2024-06-24 11:03:53,390][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10598268928. Throughput: 0: 42986.1. Samples: 10598351700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:53,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 11:03:54,616][15401] Updated weights for policy 0, policy_version 646871 (0.0040) [2024-06-24 11:03:58,389][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10598481920. Throughput: 0: 42975.6. Samples: 10598604600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:03:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 11:03:58,857][15401] Updated weights for policy 0, policy_version 646881 (0.0027) [2024-06-24 11:04:02,178][15401] Updated weights for policy 0, policy_version 646891 (0.0030) [2024-06-24 11:04:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 10598694912. Throughput: 0: 42922.1. Samples: 10598856700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:04:03,394][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 11:04:06,557][15401] Updated weights for policy 0, policy_version 646901 (0.0035) [2024-06-24 11:04:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10598907904. Throughput: 0: 42891.1. Samples: 10598987860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:04:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 11:04:09,631][15401] Updated weights for policy 0, policy_version 646911 (0.0039) [2024-06-24 11:04:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 10599120896. Throughput: 0: 42982.1. Samples: 10599243640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:04:13,393][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 11:04:14,176][15401] Updated weights for policy 0, policy_version 646921 (0.0037) [2024-06-24 11:04:17,691][15401] Updated weights for policy 0, policy_version 646931 (0.0043) [2024-06-24 11:04:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10599317504. Throughput: 0: 42869.0. Samples: 10599498620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:04:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:04:21,826][15401] Updated weights for policy 0, policy_version 646941 (0.0037) [2024-06-24 11:04:23,390][15132] Fps is (10 sec: 40966.6, 60 sec: 42597.8, 300 sec: 42820.4). Total num frames: 10599530496. Throughput: 0: 42660.2. Samples: 10599626220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:23,391][15132] Avg episode reward: [(0, '0.197')] [2024-06-24 11:04:23,474][15349] Signal inference workers to stop experience collection... (157000 times) [2024-06-24 11:04:23,474][15349] Signal inference workers to resume experience collection... (157000 times) [2024-06-24 11:04:23,499][15401] InferenceWorker_p0-w0: stopping experience collection (157000 times) [2024-06-24 11:04:23,499][15401] InferenceWorker_p0-w0: resuming experience collection (157000 times) [2024-06-24 11:04:25,169][15401] Updated weights for policy 0, policy_version 646951 (0.0033) [2024-06-24 11:04:28,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 10599759872. Throughput: 0: 42820.6. Samples: 10599889260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:28,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 11:04:29,332][15401] Updated weights for policy 0, policy_version 646961 (0.0043) [2024-06-24 11:04:32,709][15401] Updated weights for policy 0, policy_version 646971 (0.0041) [2024-06-24 11:04:33,390][15132] Fps is (10 sec: 44240.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10599972864. Throughput: 0: 42695.5. Samples: 10600140760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:33,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 11:04:37,122][15401] Updated weights for policy 0, policy_version 646981 (0.0030) [2024-06-24 11:04:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10600169472. Throughput: 0: 42606.3. Samples: 10600268980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 11:04:40,612][15401] Updated weights for policy 0, policy_version 646991 (0.0042) [2024-06-24 11:04:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10600398848. Throughput: 0: 42677.3. Samples: 10600525080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 11:04:44,811][15401] Updated weights for policy 0, policy_version 647001 (0.0037) [2024-06-24 11:04:48,362][15401] Updated weights for policy 0, policy_version 647011 (0.0039) [2024-06-24 11:04:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 10600628224. Throughput: 0: 42676.5. Samples: 10600777140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 11:04:52,299][15401] Updated weights for policy 0, policy_version 647021 (0.0032) [2024-06-24 11:04:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10600824832. Throughput: 0: 42645.7. Samples: 10600906920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:53,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 11:04:56,235][15401] Updated weights for policy 0, policy_version 647031 (0.0045) [2024-06-24 11:04:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 10601021440. Throughput: 0: 42594.4. Samples: 10601160280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:04:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 11:05:00,074][15401] Updated weights for policy 0, policy_version 647041 (0.0039) [2024-06-24 11:05:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10601267200. Throughput: 0: 42538.7. Samples: 10601412860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:03,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 11:05:03,850][15401] Updated weights for policy 0, policy_version 647051 (0.0033) [2024-06-24 11:05:07,879][15401] Updated weights for policy 0, policy_version 647061 (0.0034) [2024-06-24 11:05:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10601447424. Throughput: 0: 42541.3. Samples: 10601540540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 11:05:11,399][15401] Updated weights for policy 0, policy_version 647071 (0.0031) [2024-06-24 11:05:13,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10601660416. Throughput: 0: 42315.2. Samples: 10601793440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:13,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 11:05:15,724][15401] Updated weights for policy 0, policy_version 647081 (0.0031) [2024-06-24 11:05:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10601873408. Throughput: 0: 42484.5. Samples: 10602052560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 11:05:19,084][15401] Updated weights for policy 0, policy_version 647091 (0.0030) [2024-06-24 11:05:23,343][15401] Updated weights for policy 0, policy_version 647101 (0.0032) [2024-06-24 11:05:23,397][15132] Fps is (10 sec: 44214.2, 60 sec: 42866.7, 300 sec: 42763.9). Total num frames: 10602102784. Throughput: 0: 42433.3. Samples: 10602178800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:23,397][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 11:05:26,709][15401] Updated weights for policy 0, policy_version 647111 (0.0040) [2024-06-24 11:05:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 10602299392. Throughput: 0: 42354.7. Samples: 10602431040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 11:05:31,294][15401] Updated weights for policy 0, policy_version 647121 (0.0022) [2024-06-24 11:05:33,390][15132] Fps is (10 sec: 40990.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10602512384. Throughput: 0: 42431.1. Samples: 10602686540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 11:05:34,735][15401] Updated weights for policy 0, policy_version 647131 (0.0038) [2024-06-24 11:05:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10602725376. Throughput: 0: 42392.5. Samples: 10602814580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 11:05:38,950][15401] Updated weights for policy 0, policy_version 647141 (0.0035) [2024-06-24 11:05:42,390][15401] Updated weights for policy 0, policy_version 647151 (0.0042) [2024-06-24 11:05:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10602954752. Throughput: 0: 42296.8. Samples: 10603063640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 11:05:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647153_10602954752.pth... [2024-06-24 11:05:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646529_10592731136.pth [2024-06-24 11:05:43,981][15349] Signal inference workers to stop experience collection... (157050 times) [2024-06-24 11:05:43,981][15349] Signal inference workers to resume experience collection... (157050 times) [2024-06-24 11:05:44,036][15401] InferenceWorker_p0-w0: stopping experience collection (157050 times) [2024-06-24 11:05:44,036][15401] InferenceWorker_p0-w0: resuming experience collection (157050 times) [2024-06-24 11:05:46,862][15401] Updated weights for policy 0, policy_version 647161 (0.0052) [2024-06-24 11:05:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 10603134976. Throughput: 0: 42444.3. Samples: 10603322860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 11:05:49,982][15401] Updated weights for policy 0, policy_version 647171 (0.0036) [2024-06-24 11:05:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10603364352. Throughput: 0: 42283.4. Samples: 10603443300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 11:05:53,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 11:05:54,898][15401] Updated weights for policy 0, policy_version 647181 (0.0025) [2024-06-24 11:05:57,507][15401] Updated weights for policy 0, policy_version 647191 (0.0033) [2024-06-24 11:05:58,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10603610112. Throughput: 0: 42495.6. Samples: 10603705640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:05:58,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-24 11:06:02,593][15401] Updated weights for policy 0, policy_version 647201 (0.0035) [2024-06-24 11:06:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 10603790336. Throughput: 0: 42396.4. Samples: 10603960400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:03,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 11:06:05,395][15401] Updated weights for policy 0, policy_version 647211 (0.0038) [2024-06-24 11:06:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10604003328. Throughput: 0: 42246.2. Samples: 10604079560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:06:10,086][15401] Updated weights for policy 0, policy_version 647221 (0.0037) [2024-06-24 11:06:13,207][15401] Updated weights for policy 0, policy_version 647231 (0.0029) [2024-06-24 11:06:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 10604232704. Throughput: 0: 42568.4. Samples: 10604346620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:13,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 11:06:17,714][15401] Updated weights for policy 0, policy_version 647241 (0.0031) [2024-06-24 11:06:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10604429312. Throughput: 0: 42583.1. Samples: 10604602780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 11:06:20,808][15401] Updated weights for policy 0, policy_version 647251 (0.0037) [2024-06-24 11:06:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42603.6, 300 sec: 42653.9). Total num frames: 10604658688. Throughput: 0: 42382.6. Samples: 10604721800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:23,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 11:06:25,379][15401] Updated weights for policy 0, policy_version 647261 (0.0033) [2024-06-24 11:06:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10604871680. Throughput: 0: 42709.8. Samples: 10604985580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 11:06:28,454][15401] Updated weights for policy 0, policy_version 647271 (0.0035) [2024-06-24 11:06:33,222][15401] Updated weights for policy 0, policy_version 647281 (0.0033) [2024-06-24 11:06:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10605051904. Throughput: 0: 42655.1. Samples: 10605242340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 11:06:36,450][15401] Updated weights for policy 0, policy_version 647291 (0.0036) [2024-06-24 11:06:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10605297664. Throughput: 0: 42615.2. Samples: 10605360980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 11:06:41,011][15401] Updated weights for policy 0, policy_version 647301 (0.0038) [2024-06-24 11:06:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 10605494272. Throughput: 0: 42503.8. Samples: 10605618320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 11:06:44,381][15401] Updated weights for policy 0, policy_version 647311 (0.0040) [2024-06-24 11:06:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10605690880. Throughput: 0: 42358.8. Samples: 10605866540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:06:48,827][15401] Updated weights for policy 0, policy_version 647321 (0.0039) [2024-06-24 11:06:52,119][15401] Updated weights for policy 0, policy_version 647331 (0.0041) [2024-06-24 11:06:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10605936640. Throughput: 0: 42481.7. Samples: 10605991240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 11:06:57,020][15401] Updated weights for policy 0, policy_version 647341 (0.0026) [2024-06-24 11:06:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 10606116864. Throughput: 0: 42360.5. Samples: 10606252840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:06:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 11:06:59,783][15401] Updated weights for policy 0, policy_version 647351 (0.0033) [2024-06-24 11:07:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10606346240. Throughput: 0: 42128.1. Samples: 10606498540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 11:07:04,606][15401] Updated weights for policy 0, policy_version 647361 (0.0046) [2024-06-24 11:07:07,402][15401] Updated weights for policy 0, policy_version 647371 (0.0037) [2024-06-24 11:07:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10606559232. Throughput: 0: 42405.1. Samples: 10606630020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 11:07:12,212][15401] Updated weights for policy 0, policy_version 647381 (0.0043) [2024-06-24 11:07:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 10606739456. Throughput: 0: 42362.8. Samples: 10606891900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 11:07:14,845][15401] Updated weights for policy 0, policy_version 647391 (0.0042) [2024-06-24 11:07:16,178][15349] Signal inference workers to stop experience collection... (157100 times) [2024-06-24 11:07:16,179][15349] Signal inference workers to resume experience collection... (157100 times) [2024-06-24 11:07:16,224][15401] InferenceWorker_p0-w0: stopping experience collection (157100 times) [2024-06-24 11:07:16,224][15401] InferenceWorker_p0-w0: resuming experience collection (157100 times) [2024-06-24 11:07:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10606968832. Throughput: 0: 42232.4. Samples: 10607142800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 11:07:19,714][15401] Updated weights for policy 0, policy_version 647401 (0.0027) [2024-06-24 11:07:22,633][15401] Updated weights for policy 0, policy_version 647411 (0.0026) [2024-06-24 11:07:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10607181824. Throughput: 0: 42482.2. Samples: 10607272680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 11:07:27,348][15401] Updated weights for policy 0, policy_version 647421 (0.0034) [2024-06-24 11:07:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 10607378432. Throughput: 0: 42545.0. Samples: 10607532840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:07:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 11:07:30,171][15401] Updated weights for policy 0, policy_version 647431 (0.0032) [2024-06-24 11:07:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10607624192. Throughput: 0: 42636.0. Samples: 10607785160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 11:07:34,928][15401] Updated weights for policy 0, policy_version 647441 (0.0040) [2024-06-24 11:07:38,255][15401] Updated weights for policy 0, policy_version 647451 (0.0034) [2024-06-24 11:07:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10607837184. Throughput: 0: 42776.1. Samples: 10607916160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 11:07:42,566][15401] Updated weights for policy 0, policy_version 647461 (0.0032) [2024-06-24 11:07:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10608033792. Throughput: 0: 42680.7. Samples: 10608173480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 11:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647463_10608033792.pth... [2024-06-24 11:07:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000646840_10597826560.pth [2024-06-24 11:07:45,855][15401] Updated weights for policy 0, policy_version 647471 (0.0027) [2024-06-24 11:07:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 10608263168. Throughput: 0: 42806.1. Samples: 10608424820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 11:07:49,974][15401] Updated weights for policy 0, policy_version 647481 (0.0036) [2024-06-24 11:07:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10608476160. Throughput: 0: 42978.6. Samples: 10608564060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 11:07:53,554][15401] Updated weights for policy 0, policy_version 647491 (0.0026) [2024-06-24 11:07:57,279][15401] Updated weights for policy 0, policy_version 647501 (0.0042) [2024-06-24 11:07:58,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 10608672768. Throughput: 0: 42860.0. Samples: 10608820600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:07:58,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 11:08:01,028][15401] Updated weights for policy 0, policy_version 647511 (0.0039) [2024-06-24 11:08:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10608902144. Throughput: 0: 42853.8. Samples: 10609071220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 11:08:05,044][15401] Updated weights for policy 0, policy_version 647521 (0.0034) [2024-06-24 11:08:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10609115136. Throughput: 0: 42964.1. Samples: 10609206060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 11:08:09,087][15401] Updated weights for policy 0, policy_version 647531 (0.0028) [2024-06-24 11:08:12,423][15401] Updated weights for policy 0, policy_version 647541 (0.0031) [2024-06-24 11:08:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10609328128. Throughput: 0: 42828.1. Samples: 10609460100. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 11:08:16,676][15401] Updated weights for policy 0, policy_version 647551 (0.0034) [2024-06-24 11:08:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10609557504. Throughput: 0: 42936.0. Samples: 10609717280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 11:08:19,978][15401] Updated weights for policy 0, policy_version 647561 (0.0034) [2024-06-24 11:08:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10609770496. Throughput: 0: 42946.3. Samples: 10609848740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 11:08:23,977][15401] Updated weights for policy 0, policy_version 647571 (0.0041) [2024-06-24 11:08:27,470][15401] Updated weights for policy 0, policy_version 647581 (0.0039) [2024-06-24 11:08:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 10609983488. Throughput: 0: 42967.2. Samples: 10610107000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 11:08:31,452][15401] Updated weights for policy 0, policy_version 647591 (0.0037) [2024-06-24 11:08:33,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 10610212864. Throughput: 0: 43041.6. Samples: 10610361960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:33,397][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 11:08:35,305][15401] Updated weights for policy 0, policy_version 647601 (0.0048) [2024-06-24 11:08:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10610409472. Throughput: 0: 42801.4. Samples: 10610490120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 11:08:39,338][15349] Signal inference workers to stop experience collection... (157150 times) [2024-06-24 11:08:39,357][15401] InferenceWorker_p0-w0: stopping experience collection (157150 times) [2024-06-24 11:08:39,397][15349] Signal inference workers to resume experience collection... (157150 times) [2024-06-24 11:08:39,397][15401] InferenceWorker_p0-w0: resuming experience collection (157150 times) [2024-06-24 11:08:39,400][15401] Updated weights for policy 0, policy_version 647611 (0.0038) [2024-06-24 11:08:43,095][15401] Updated weights for policy 0, policy_version 647621 (0.0041) [2024-06-24 11:08:43,390][15132] Fps is (10 sec: 40986.2, 60 sec: 43144.6, 300 sec: 42598.7). Total num frames: 10610622464. Throughput: 0: 42685.3. Samples: 10610741440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 11:08:46,917][15401] Updated weights for policy 0, policy_version 647631 (0.0032) [2024-06-24 11:08:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10610851840. Throughput: 0: 42755.6. Samples: 10610995220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 11:08:51,011][15401] Updated weights for policy 0, policy_version 647641 (0.0044) [2024-06-24 11:08:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10611032064. Throughput: 0: 42612.8. Samples: 10611123740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:53,392][15132] Avg episode reward: [(0, '0.835')] [2024-06-24 11:08:54,536][15401] Updated weights for policy 0, policy_version 647651 (0.0022) [2024-06-24 11:08:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10611261440. Throughput: 0: 42678.3. Samples: 10611380620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:08:58,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-24 11:08:58,460][15401] Updated weights for policy 0, policy_version 647661 (0.0035) [2024-06-24 11:09:02,141][15401] Updated weights for policy 0, policy_version 647671 (0.0041) [2024-06-24 11:09:03,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10611490816. Throughput: 0: 42661.4. Samples: 10611637040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:09:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 11:09:06,037][15401] Updated weights for policy 0, policy_version 647681 (0.0043) [2024-06-24 11:09:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 10611671040. Throughput: 0: 42594.7. Samples: 10611765500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 11:09:08,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 11:09:09,678][15401] Updated weights for policy 0, policy_version 647691 (0.0029) [2024-06-24 11:09:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10611916800. Throughput: 0: 42640.4. Samples: 10612025820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 11:09:14,026][15401] Updated weights for policy 0, policy_version 647701 (0.0048) [2024-06-24 11:09:17,115][15401] Updated weights for policy 0, policy_version 647711 (0.0033) [2024-06-24 11:09:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.6). Total num frames: 10612129792. Throughput: 0: 42614.4. Samples: 10612279340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 11:09:22,056][15401] Updated weights for policy 0, policy_version 647721 (0.0024) [2024-06-24 11:09:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 10612293632. Throughput: 0: 42622.3. Samples: 10612408120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 11:09:24,756][15401] Updated weights for policy 0, policy_version 647731 (0.0037) [2024-06-24 11:09:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10612555776. Throughput: 0: 42801.8. Samples: 10612667520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:09:29,672][15401] Updated weights for policy 0, policy_version 647741 (0.0045) [2024-06-24 11:09:32,476][15401] Updated weights for policy 0, policy_version 647751 (0.0031) [2024-06-24 11:09:33,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 10612785152. Throughput: 0: 42741.4. Samples: 10612918580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 11:09:37,366][15401] Updated weights for policy 0, policy_version 647761 (0.0050) [2024-06-24 11:09:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10612948992. Throughput: 0: 42806.8. Samples: 10613049940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:38,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 11:09:39,929][15349] Signal inference workers to stop experience collection... (157200 times) [2024-06-24 11:09:39,929][15349] Signal inference workers to resume experience collection... (157200 times) [2024-06-24 11:09:39,941][15401] InferenceWorker_p0-w0: stopping experience collection (157200 times) [2024-06-24 11:09:39,941][15401] InferenceWorker_p0-w0: resuming experience collection (157200 times) [2024-06-24 11:09:40,328][15401] Updated weights for policy 0, policy_version 647771 (0.0041) [2024-06-24 11:09:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10613194752. Throughput: 0: 42772.0. Samples: 10613305360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:43,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-24 11:09:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647779_10613211136.pth... [2024-06-24 11:09:43,539][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647153_10602954752.pth [2024-06-24 11:09:44,984][15401] Updated weights for policy 0, policy_version 647781 (0.0031) [2024-06-24 11:09:48,136][15401] Updated weights for policy 0, policy_version 647791 (0.0043) [2024-06-24 11:09:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10613407744. Throughput: 0: 42745.8. Samples: 10613560600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:09:52,612][15401] Updated weights for policy 0, policy_version 647801 (0.0033) [2024-06-24 11:09:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 10613604352. Throughput: 0: 42735.5. Samples: 10613688600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 11:09:55,686][15401] Updated weights for policy 0, policy_version 647811 (0.0026) [2024-06-24 11:09:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 10613817344. Throughput: 0: 42742.6. Samples: 10613949240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:09:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 11:10:00,221][15401] Updated weights for policy 0, policy_version 647821 (0.0044) [2024-06-24 11:10:03,184][15401] Updated weights for policy 0, policy_version 647831 (0.0042) [2024-06-24 11:10:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10614063104. Throughput: 0: 42758.7. Samples: 10614203480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 11:10:08,023][15401] Updated weights for policy 0, policy_version 647841 (0.0033) [2024-06-24 11:10:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 10614243328. Throughput: 0: 42780.3. Samples: 10614333240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:08,395][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 11:10:10,776][15401] Updated weights for policy 0, policy_version 647851 (0.0032) [2024-06-24 11:10:13,391][15132] Fps is (10 sec: 40953.5, 60 sec: 42597.2, 300 sec: 42709.2). Total num frames: 10614472704. Throughput: 0: 42742.4. Samples: 10614591000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:13,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 11:10:15,525][15401] Updated weights for policy 0, policy_version 647861 (0.0042) [2024-06-24 11:10:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42655.0). Total num frames: 10614685696. Throughput: 0: 42817.7. Samples: 10614845380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 11:10:18,661][15401] Updated weights for policy 0, policy_version 647871 (0.0034) [2024-06-24 11:10:23,389][15132] Fps is (10 sec: 39328.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10614865920. Throughput: 0: 42682.6. Samples: 10614970660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 11:10:23,544][15401] Updated weights for policy 0, policy_version 647881 (0.0031) [2024-06-24 11:10:26,683][15401] Updated weights for policy 0, policy_version 647891 (0.0033) [2024-06-24 11:10:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10615111680. Throughput: 0: 42507.0. Samples: 10615218180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 11:10:31,553][15401] Updated weights for policy 0, policy_version 647901 (0.0045) [2024-06-24 11:10:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10615308288. Throughput: 0: 42505.2. Samples: 10615473340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:33,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-24 11:10:34,439][15401] Updated weights for policy 0, policy_version 647911 (0.0034) [2024-06-24 11:10:38,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 10615521280. Throughput: 0: 42488.8. Samples: 10615600700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:38,392][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 11:10:39,115][15401] Updated weights for policy 0, policy_version 647921 (0.0031) [2024-06-24 11:10:42,137][15401] Updated weights for policy 0, policy_version 647931 (0.0045) [2024-06-24 11:10:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10615750656. Throughput: 0: 42391.9. Samples: 10615856880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:10:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:10:46,708][15401] Updated weights for policy 0, policy_version 647941 (0.0038) [2024-06-24 11:10:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 10615930880. Throughput: 0: 42494.6. Samples: 10616115740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:10:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 11:10:49,783][15401] Updated weights for policy 0, policy_version 647951 (0.0026) [2024-06-24 11:10:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10616143872. Throughput: 0: 42359.6. Samples: 10616239420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:10:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 11:10:54,339][15401] Updated weights for policy 0, policy_version 647961 (0.0042) [2024-06-24 11:10:57,470][15401] Updated weights for policy 0, policy_version 647971 (0.0039) [2024-06-24 11:10:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10616406016. Throughput: 0: 42330.7. Samples: 10616495820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:10:58,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 11:11:02,307][15401] Updated weights for policy 0, policy_version 647981 (0.0033) [2024-06-24 11:11:02,984][15349] Signal inference workers to stop experience collection... (157250 times) [2024-06-24 11:11:02,984][15349] Signal inference workers to resume experience collection... (157250 times) [2024-06-24 11:11:03,013][15401] InferenceWorker_p0-w0: stopping experience collection (157250 times) [2024-06-24 11:11:03,013][15401] InferenceWorker_p0-w0: resuming experience collection (157250 times) [2024-06-24 11:11:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 10616569856. Throughput: 0: 42501.9. Samples: 10616757960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 11:11:05,347][15401] Updated weights for policy 0, policy_version 647991 (0.0038) [2024-06-24 11:11:08,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10616799232. Throughput: 0: 42313.7. Samples: 10616874780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 11:11:09,982][15401] Updated weights for policy 0, policy_version 648001 (0.0032) [2024-06-24 11:11:13,066][15401] Updated weights for policy 0, policy_version 648011 (0.0035) [2024-06-24 11:11:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42599.6, 300 sec: 42709.5). Total num frames: 10617028608. Throughput: 0: 42630.8. Samples: 10617136560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 11:11:17,587][15401] Updated weights for policy 0, policy_version 648021 (0.0025) [2024-06-24 11:11:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10617225216. Throughput: 0: 42815.1. Samples: 10617400020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 11:11:20,714][15401] Updated weights for policy 0, policy_version 648031 (0.0040) [2024-06-24 11:11:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 10617438208. Throughput: 0: 42652.0. Samples: 10617520040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:23,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 11:11:25,342][15401] Updated weights for policy 0, policy_version 648041 (0.0033) [2024-06-24 11:11:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10617651200. Throughput: 0: 42718.7. Samples: 10617779220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 11:11:28,471][15401] Updated weights for policy 0, policy_version 648051 (0.0038) [2024-06-24 11:11:33,027][15401] Updated weights for policy 0, policy_version 648061 (0.0035) [2024-06-24 11:11:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10617864192. Throughput: 0: 42752.2. Samples: 10618039580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 11:11:36,021][15401] Updated weights for policy 0, policy_version 648071 (0.0040) [2024-06-24 11:11:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 10618077184. Throughput: 0: 42557.3. Samples: 10618154500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 11:11:40,673][15401] Updated weights for policy 0, policy_version 648081 (0.0034) [2024-06-24 11:11:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10618273792. Throughput: 0: 42713.4. Samples: 10618417920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:43,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-24 11:11:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648090_10618306560.pth... [2024-06-24 11:11:43,617][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647463_10608033792.pth [2024-06-24 11:11:43,778][15401] Updated weights for policy 0, policy_version 648091 (0.0033) [2024-06-24 11:11:48,207][15401] Updated weights for policy 0, policy_version 648101 (0.0038) [2024-06-24 11:11:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10618486784. Throughput: 0: 42540.8. Samples: 10618672300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 11:11:51,356][15401] Updated weights for policy 0, policy_version 648111 (0.0038) [2024-06-24 11:11:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10618716160. Throughput: 0: 42643.2. Samples: 10618793720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 11:11:55,955][15401] Updated weights for policy 0, policy_version 648121 (0.0033) [2024-06-24 11:11:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 10618912768. Throughput: 0: 42691.0. Samples: 10619057660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:11:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:11:59,024][15401] Updated weights for policy 0, policy_version 648131 (0.0033) [2024-06-24 11:12:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10619125760. Throughput: 0: 42401.0. Samples: 10619308060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:12:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 11:12:03,540][15401] Updated weights for policy 0, policy_version 648141 (0.0027) [2024-06-24 11:12:06,743][15401] Updated weights for policy 0, policy_version 648151 (0.0026) [2024-06-24 11:12:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10619371520. Throughput: 0: 42572.4. Samples: 10619435700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:12:08,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 11:12:11,031][15401] Updated weights for policy 0, policy_version 648161 (0.0044) [2024-06-24 11:12:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10619551744. Throughput: 0: 42538.7. Samples: 10619693460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:12:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 11:12:14,752][15401] Updated weights for policy 0, policy_version 648171 (0.0036) [2024-06-24 11:12:15,630][15349] Signal inference workers to stop experience collection... (157300 times) [2024-06-24 11:12:15,683][15349] Signal inference workers to resume experience collection... (157300 times) [2024-06-24 11:12:15,688][15401] InferenceWorker_p0-w0: stopping experience collection (157300 times) [2024-06-24 11:12:15,701][15401] InferenceWorker_p0-w0: resuming experience collection (157300 times) [2024-06-24 11:12:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10619781120. Throughput: 0: 42430.0. Samples: 10619948940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 11:12:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 11:12:18,535][15401] Updated weights for policy 0, policy_version 648181 (0.0024) [2024-06-24 11:12:22,301][15401] Updated weights for policy 0, policy_version 648191 (0.0028) [2024-06-24 11:12:23,391][15132] Fps is (10 sec: 45867.6, 60 sec: 42872.0, 300 sec: 42820.3). Total num frames: 10620010496. Throughput: 0: 42764.3. Samples: 10620078960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:23,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 11:12:26,338][15401] Updated weights for policy 0, policy_version 648201 (0.0035) [2024-06-24 11:12:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10620174336. Throughput: 0: 42641.5. Samples: 10620336780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 11:12:30,155][15401] Updated weights for policy 0, policy_version 648211 (0.0034) [2024-06-24 11:12:33,389][15132] Fps is (10 sec: 42605.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10620436480. Throughput: 0: 42504.9. Samples: 10620585020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 11:12:33,789][15401] Updated weights for policy 0, policy_version 648221 (0.0042) [2024-06-24 11:12:38,051][15401] Updated weights for policy 0, policy_version 648231 (0.0027) [2024-06-24 11:12:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10620649472. Throughput: 0: 42767.1. Samples: 10620718240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 11:12:41,235][15401] Updated weights for policy 0, policy_version 648241 (0.0023) [2024-06-24 11:12:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10620829696. Throughput: 0: 42660.5. Samples: 10620977380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 11:12:45,536][15401] Updated weights for policy 0, policy_version 648251 (0.0040) [2024-06-24 11:12:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10621091840. Throughput: 0: 42677.2. Samples: 10621228540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:48,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 11:12:49,174][15401] Updated weights for policy 0, policy_version 648261 (0.0030) [2024-06-24 11:12:53,127][15401] Updated weights for policy 0, policy_version 648271 (0.0032) [2024-06-24 11:12:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10621272064. Throughput: 0: 42829.3. Samples: 10621363020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 11:12:57,310][15401] Updated weights for policy 0, policy_version 648281 (0.0026) [2024-06-24 11:12:58,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10621468672. Throughput: 0: 42738.5. Samples: 10621616700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:12:58,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-24 11:13:01,119][15401] Updated weights for policy 0, policy_version 648291 (0.0033) [2024-06-24 11:13:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10621730816. Throughput: 0: 42533.4. Samples: 10621862940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:03,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 11:13:04,821][15401] Updated weights for policy 0, policy_version 648301 (0.0034) [2024-06-24 11:13:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10621911040. Throughput: 0: 42622.9. Samples: 10621996920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 11:13:08,648][15401] Updated weights for policy 0, policy_version 648311 (0.0037) [2024-06-24 11:13:12,375][15401] Updated weights for policy 0, policy_version 648321 (0.0033) [2024-06-24 11:13:13,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42866.8, 300 sec: 42597.5). Total num frames: 10622124032. Throughput: 0: 42528.0. Samples: 10622250820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:13,397][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 11:13:16,296][15401] Updated weights for policy 0, policy_version 648331 (0.0037) [2024-06-24 11:13:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10622353408. Throughput: 0: 42638.7. Samples: 10622503760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 11:13:20,041][15401] Updated weights for policy 0, policy_version 648341 (0.0035) [2024-06-24 11:13:23,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 10622550016. Throughput: 0: 42617.3. Samples: 10622636020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 11:13:23,943][15401] Updated weights for policy 0, policy_version 648351 (0.0026) [2024-06-24 11:13:27,440][15401] Updated weights for policy 0, policy_version 648361 (0.0044) [2024-06-24 11:13:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42599.3). Total num frames: 10622779392. Throughput: 0: 42486.1. Samples: 10622889260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:28,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 11:13:31,443][15401] Updated weights for policy 0, policy_version 648371 (0.0039) [2024-06-24 11:13:31,826][15349] Signal inference workers to stop experience collection... (157350 times) [2024-06-24 11:13:31,880][15401] InferenceWorker_p0-w0: stopping experience collection (157350 times) [2024-06-24 11:13:31,888][15349] Signal inference workers to resume experience collection... (157350 times) [2024-06-24 11:13:31,898][15401] InferenceWorker_p0-w0: resuming experience collection (157350 times) [2024-06-24 11:13:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10622992384. Throughput: 0: 42730.8. Samples: 10623151420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:33,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 11:13:34,889][15401] Updated weights for policy 0, policy_version 648381 (0.0036) [2024-06-24 11:13:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 10623188992. Throughput: 0: 42608.9. Samples: 10623280420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 11:13:39,435][15401] Updated weights for policy 0, policy_version 648391 (0.0039) [2024-06-24 11:13:42,314][15401] Updated weights for policy 0, policy_version 648401 (0.0027) [2024-06-24 11:13:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 10623401984. Throughput: 0: 42417.4. Samples: 10623525580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:43,392][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 11:13:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648401_10623401984.pth... [2024-06-24 11:13:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000647779_10613211136.pth [2024-06-24 11:13:47,053][15401] Updated weights for policy 0, policy_version 648411 (0.0031) [2024-06-24 11:13:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 10623631360. Throughput: 0: 42663.0. Samples: 10623782780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 11:13:50,016][15401] Updated weights for policy 0, policy_version 648421 (0.0035) [2024-06-24 11:13:53,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 10623827968. Throughput: 0: 42635.0. Samples: 10623915600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:53,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 11:13:54,616][15401] Updated weights for policy 0, policy_version 648431 (0.0035) [2024-06-24 11:13:58,254][15401] Updated weights for policy 0, policy_version 648441 (0.0035) [2024-06-24 11:13:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 10624057344. Throughput: 0: 42679.9. Samples: 10624171140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 11:13:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:14:02,266][15401] Updated weights for policy 0, policy_version 648451 (0.0042) [2024-06-24 11:14:03,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10624286720. Throughput: 0: 42573.6. Samples: 10624419580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 11:14:05,803][15401] Updated weights for policy 0, policy_version 648461 (0.0046) [2024-06-24 11:14:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10624466944. Throughput: 0: 42531.5. Samples: 10624550040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:08,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 11:14:10,185][15401] Updated weights for policy 0, policy_version 648471 (0.0033) [2024-06-24 11:14:13,339][15401] Updated weights for policy 0, policy_version 648481 (0.0042) [2024-06-24 11:14:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43149.1, 300 sec: 42653.9). Total num frames: 10624712704. Throughput: 0: 42740.4. Samples: 10624812580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:13,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 11:14:17,930][15401] Updated weights for policy 0, policy_version 648491 (0.0038) [2024-06-24 11:14:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10624892928. Throughput: 0: 42576.4. Samples: 10625067360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 11:14:20,890][15401] Updated weights for policy 0, policy_version 648501 (0.0028) [2024-06-24 11:14:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10625105920. Throughput: 0: 42340.0. Samples: 10625185720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:14:25,833][15401] Updated weights for policy 0, policy_version 648511 (0.0034) [2024-06-24 11:14:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10625351680. Throughput: 0: 42781.9. Samples: 10625450660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 11:14:28,515][15401] Updated weights for policy 0, policy_version 648521 (0.0032) [2024-06-24 11:14:33,321][15401] Updated weights for policy 0, policy_version 648531 (0.0037) [2024-06-24 11:14:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 10625531904. Throughput: 0: 42826.2. Samples: 10625709960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 11:14:36,066][15401] Updated weights for policy 0, policy_version 648541 (0.0026) [2024-06-24 11:14:38,396][15132] Fps is (10 sec: 37658.4, 60 sec: 42320.8, 300 sec: 42486.4). Total num frames: 10625728512. Throughput: 0: 42497.5. Samples: 10625828160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:38,397][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 11:14:40,765][15401] Updated weights for policy 0, policy_version 648551 (0.0038) [2024-06-24 11:14:41,319][15349] Signal inference workers to stop experience collection... (157400 times) [2024-06-24 11:14:41,365][15401] InferenceWorker_p0-w0: stopping experience collection (157400 times) [2024-06-24 11:14:41,372][15349] Signal inference workers to resume experience collection... (157400 times) [2024-06-24 11:14:41,379][15401] InferenceWorker_p0-w0: resuming experience collection (157400 times) [2024-06-24 11:14:43,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42871.5, 300 sec: 42598.0). Total num frames: 10625974272. Throughput: 0: 42625.7. Samples: 10626089400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:43,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 11:14:44,030][15401] Updated weights for policy 0, policy_version 648561 (0.0024) [2024-06-24 11:14:48,390][15132] Fps is (10 sec: 44265.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10626170880. Throughput: 0: 42880.1. Samples: 10626349180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 11:14:48,763][15401] Updated weights for policy 0, policy_version 648571 (0.0043) [2024-06-24 11:14:51,747][15401] Updated weights for policy 0, policy_version 648581 (0.0029) [2024-06-24 11:14:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 10626383872. Throughput: 0: 42748.1. Samples: 10626473600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 11:14:56,158][15401] Updated weights for policy 0, policy_version 648591 (0.0029) [2024-06-24 11:14:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10626596864. Throughput: 0: 42688.5. Samples: 10626733560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:14:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 11:14:59,367][15401] Updated weights for policy 0, policy_version 648601 (0.0035) [2024-06-24 11:15:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10626826240. Throughput: 0: 42710.6. Samples: 10626989340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 11:15:03,595][15401] Updated weights for policy 0, policy_version 648611 (0.0037) [2024-06-24 11:15:07,275][15401] Updated weights for policy 0, policy_version 648621 (0.0044) [2024-06-24 11:15:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42543.1). Total num frames: 10627022848. Throughput: 0: 42992.0. Samples: 10627120360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 11:15:11,081][15401] Updated weights for policy 0, policy_version 648631 (0.0042) [2024-06-24 11:15:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10627252224. Throughput: 0: 42839.0. Samples: 10627378420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 11:15:14,905][15401] Updated weights for policy 0, policy_version 648641 (0.0033) [2024-06-24 11:15:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10627481600. Throughput: 0: 42751.3. Samples: 10627633760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 11:15:18,666][15401] Updated weights for policy 0, policy_version 648651 (0.0042) [2024-06-24 11:15:22,563][15401] Updated weights for policy 0, policy_version 648661 (0.0035) [2024-06-24 11:15:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10627694592. Throughput: 0: 42968.9. Samples: 10627761480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 11:15:26,581][15401] Updated weights for policy 0, policy_version 648671 (0.0036) [2024-06-24 11:15:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10627891200. Throughput: 0: 42972.6. Samples: 10628023060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:28,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 11:15:30,344][15401] Updated weights for policy 0, policy_version 648681 (0.0038) [2024-06-24 11:15:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 10628120576. Throughput: 0: 43026.3. Samples: 10628285360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 11:15:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 11:15:34,005][15401] Updated weights for policy 0, policy_version 648691 (0.0035) [2024-06-24 11:15:37,800][15401] Updated weights for policy 0, policy_version 648701 (0.0033) [2024-06-24 11:15:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43422.2, 300 sec: 42653.9). Total num frames: 10628333568. Throughput: 0: 43069.2. Samples: 10628411720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:15:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 11:15:41,653][15401] Updated weights for policy 0, policy_version 648711 (0.0039) [2024-06-24 11:15:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42871.4, 300 sec: 42764.7). Total num frames: 10628546560. Throughput: 0: 42953.3. Samples: 10628666560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:15:43,393][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 11:15:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648715_10628546560.pth... [2024-06-24 11:15:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648090_10618306560.pth [2024-06-24 11:15:45,447][15401] Updated weights for policy 0, policy_version 648721 (0.0022) [2024-06-24 11:15:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 10628759552. Throughput: 0: 42840.9. Samples: 10628917280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:15:48,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 11:15:49,918][15401] Updated weights for policy 0, policy_version 648731 (0.0030) [2024-06-24 11:15:51,625][15349] Signal inference workers to stop experience collection... (157450 times) [2024-06-24 11:15:51,625][15349] Signal inference workers to resume experience collection... (157450 times) [2024-06-24 11:15:51,644][15401] InferenceWorker_p0-w0: stopping experience collection (157450 times) [2024-06-24 11:15:51,645][15401] InferenceWorker_p0-w0: resuming experience collection (157450 times) [2024-06-24 11:15:53,106][15401] Updated weights for policy 0, policy_version 648741 (0.0044) [2024-06-24 11:15:53,390][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10628972544. Throughput: 0: 42801.3. Samples: 10629046420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:15:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 11:15:57,604][15401] Updated weights for policy 0, policy_version 648751 (0.0045) [2024-06-24 11:15:58,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10629185536. Throughput: 0: 42706.2. Samples: 10629300200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:15:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 11:16:00,793][15401] Updated weights for policy 0, policy_version 648761 (0.0034) [2024-06-24 11:16:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10629414912. Throughput: 0: 42707.0. Samples: 10629555580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 11:16:05,133][15401] Updated weights for policy 0, policy_version 648771 (0.0037) [2024-06-24 11:16:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10629611520. Throughput: 0: 42623.6. Samples: 10629679540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 11:16:08,494][15401] Updated weights for policy 0, policy_version 648781 (0.0034) [2024-06-24 11:16:12,865][15401] Updated weights for policy 0, policy_version 648791 (0.0037) [2024-06-24 11:16:13,391][15132] Fps is (10 sec: 40955.8, 60 sec: 42870.7, 300 sec: 42709.3). Total num frames: 10629824512. Throughput: 0: 42653.1. Samples: 10629942500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:13,391][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 11:16:16,036][15401] Updated weights for policy 0, policy_version 648801 (0.0040) [2024-06-24 11:16:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 10630021120. Throughput: 0: 42455.5. Samples: 10630195860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 11:16:20,333][15401] Updated weights for policy 0, policy_version 648811 (0.0032) [2024-06-24 11:16:23,389][15132] Fps is (10 sec: 44242.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10630266880. Throughput: 0: 42515.8. Samples: 10630324920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:23,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 11:16:23,547][15401] Updated weights for policy 0, policy_version 648821 (0.0029) [2024-06-24 11:16:27,869][15401] Updated weights for policy 0, policy_version 648831 (0.0027) [2024-06-24 11:16:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10630447104. Throughput: 0: 42656.6. Samples: 10630586000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:28,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-24 11:16:31,455][15401] Updated weights for policy 0, policy_version 648841 (0.0033) [2024-06-24 11:16:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10630676480. Throughput: 0: 42654.2. Samples: 10630836620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:33,394][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:16:35,477][15401] Updated weights for policy 0, policy_version 648851 (0.0028) [2024-06-24 11:16:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10630889472. Throughput: 0: 42693.0. Samples: 10630967600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:16:39,142][15401] Updated weights for policy 0, policy_version 648861 (0.0037) [2024-06-24 11:16:43,326][15401] Updated weights for policy 0, policy_version 648871 (0.0036) [2024-06-24 11:16:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 10631102464. Throughput: 0: 42947.2. Samples: 10631232820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 11:16:46,715][15401] Updated weights for policy 0, policy_version 648881 (0.0037) [2024-06-24 11:16:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10631331840. Throughput: 0: 42549.0. Samples: 10631470280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:48,399][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 11:16:51,450][15401] Updated weights for policy 0, policy_version 648891 (0.0040) [2024-06-24 11:16:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10631528448. Throughput: 0: 42694.3. Samples: 10631600780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 11:16:54,806][15401] Updated weights for policy 0, policy_version 648901 (0.0031) [2024-06-24 11:16:58,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10631708672. Throughput: 0: 42550.0. Samples: 10631857200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:16:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 11:16:59,077][15401] Updated weights for policy 0, policy_version 648911 (0.0027) [2024-06-24 11:17:02,660][15401] Updated weights for policy 0, policy_version 648921 (0.0040) [2024-06-24 11:17:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10631970816. Throughput: 0: 42368.0. Samples: 10632102420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:17:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 11:17:06,787][15401] Updated weights for policy 0, policy_version 648931 (0.0022) [2024-06-24 11:17:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10632167424. Throughput: 0: 42459.3. Samples: 10632235600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:17:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 11:17:10,185][15401] Updated weights for policy 0, policy_version 648941 (0.0030) [2024-06-24 11:17:13,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42321.6, 300 sec: 42653.0). Total num frames: 10632364032. Throughput: 0: 42418.3. Samples: 10632495100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:17:13,397][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 11:17:14,463][15401] Updated weights for policy 0, policy_version 648951 (0.0030) [2024-06-24 11:17:17,929][15401] Updated weights for policy 0, policy_version 648961 (0.0037) [2024-06-24 11:17:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 10632577024. Throughput: 0: 42361.9. Samples: 10632742900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 11:17:22,037][15401] Updated weights for policy 0, policy_version 648971 (0.0046) [2024-06-24 11:17:23,390][15132] Fps is (10 sec: 40985.9, 60 sec: 41779.0, 300 sec: 42709.5). Total num frames: 10632773632. Throughput: 0: 42376.7. Samples: 10632874560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:23,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-24 11:17:23,789][15349] Signal inference workers to stop experience collection... (157500 times) [2024-06-24 11:17:23,790][15349] Signal inference workers to resume experience collection... (157500 times) [2024-06-24 11:17:23,816][15401] InferenceWorker_p0-w0: stopping experience collection (157500 times) [2024-06-24 11:17:23,845][15401] InferenceWorker_p0-w0: resuming experience collection (157500 times) [2024-06-24 11:17:25,819][15401] Updated weights for policy 0, policy_version 648981 (0.0027) [2024-06-24 11:17:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10633003008. Throughput: 0: 42043.0. Samples: 10633124760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 11:17:29,587][15401] Updated weights for policy 0, policy_version 648991 (0.0039) [2024-06-24 11:17:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10633216000. Throughput: 0: 42468.9. Samples: 10633381380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 11:17:33,611][15401] Updated weights for policy 0, policy_version 649001 (0.0035) [2024-06-24 11:17:37,477][15401] Updated weights for policy 0, policy_version 649011 (0.0051) [2024-06-24 11:17:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10633428992. Throughput: 0: 42371.9. Samples: 10633507520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 11:17:41,200][15401] Updated weights for policy 0, policy_version 649021 (0.0023) [2024-06-24 11:17:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 10633641984. Throughput: 0: 42343.5. Samples: 10633762660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 11:17:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649027_10633658368.pth... [2024-06-24 11:17:43,599][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648401_10623401984.pth [2024-06-24 11:17:44,886][15401] Updated weights for policy 0, policy_version 649031 (0.0026) [2024-06-24 11:17:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 10633838592. Throughput: 0: 42644.0. Samples: 10634021400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:48,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 11:17:49,314][15401] Updated weights for policy 0, policy_version 649041 (0.0033) [2024-06-24 11:17:52,582][15401] Updated weights for policy 0, policy_version 649051 (0.0030) [2024-06-24 11:17:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10634067968. Throughput: 0: 42562.3. Samples: 10634150900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:53,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-24 11:17:56,772][15401] Updated weights for policy 0, policy_version 649061 (0.0054) [2024-06-24 11:17:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10634280960. Throughput: 0: 42511.4. Samples: 10634407840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:17:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 11:18:00,283][15401] Updated weights for policy 0, policy_version 649071 (0.0037) [2024-06-24 11:18:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 10634477568. Throughput: 0: 42787.5. Samples: 10634668340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 11:18:04,262][15401] Updated weights for policy 0, policy_version 649081 (0.0042) [2024-06-24 11:18:08,087][15401] Updated weights for policy 0, policy_version 649091 (0.0031) [2024-06-24 11:18:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.9). Total num frames: 10634706944. Throughput: 0: 42562.4. Samples: 10634789860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 11:18:12,142][15401] Updated weights for policy 0, policy_version 649101 (0.0045) [2024-06-24 11:18:13,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42874.3, 300 sec: 42653.6). Total num frames: 10634936320. Throughput: 0: 42767.0. Samples: 10635049380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:13,401][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 11:18:15,946][15401] Updated weights for policy 0, policy_version 649111 (0.0032) [2024-06-24 11:18:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10635132928. Throughput: 0: 42676.4. Samples: 10635301820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 11:18:19,769][15401] Updated weights for policy 0, policy_version 649121 (0.0032) [2024-06-24 11:18:23,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10635345920. Throughput: 0: 42738.9. Samples: 10635430780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 11:18:23,546][15401] Updated weights for policy 0, policy_version 649131 (0.0028) [2024-06-24 11:18:27,402][15401] Updated weights for policy 0, policy_version 649141 (0.0023) [2024-06-24 11:18:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10635575296. Throughput: 0: 42930.8. Samples: 10635694540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 11:18:31,078][15401] Updated weights for policy 0, policy_version 649151 (0.0027) [2024-06-24 11:18:33,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10635771904. Throughput: 0: 42711.6. Samples: 10635943420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 11:18:35,225][15401] Updated weights for policy 0, policy_version 649161 (0.0030) [2024-06-24 11:18:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10636001280. Throughput: 0: 42660.4. Samples: 10636070620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 11:18:38,750][15401] Updated weights for policy 0, policy_version 649171 (0.0022) [2024-06-24 11:18:42,826][15401] Updated weights for policy 0, policy_version 649181 (0.0029) [2024-06-24 11:18:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10636197888. Throughput: 0: 42736.3. Samples: 10636330980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 11:18:46,967][15401] Updated weights for policy 0, policy_version 649191 (0.0035) [2024-06-24 11:18:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 10636427264. Throughput: 0: 42491.0. Samples: 10636580440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 11:18:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:18:50,424][15401] Updated weights for policy 0, policy_version 649201 (0.0042) [2024-06-24 11:18:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10636640256. Throughput: 0: 42770.6. Samples: 10636714540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:18:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:18:54,375][15401] Updated weights for policy 0, policy_version 649211 (0.0023) [2024-06-24 11:18:58,133][15401] Updated weights for policy 0, policy_version 649221 (0.0038) [2024-06-24 11:18:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10636836864. Throughput: 0: 42744.5. Samples: 10636972780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:18:58,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 11:19:00,225][15349] Signal inference workers to stop experience collection... (157550 times) [2024-06-24 11:19:00,225][15349] Signal inference workers to resume experience collection... (157550 times) [2024-06-24 11:19:00,248][15401] InferenceWorker_p0-w0: stopping experience collection (157550 times) [2024-06-24 11:19:00,249][15401] InferenceWorker_p0-w0: resuming experience collection (157550 times) [2024-06-24 11:19:01,917][15401] Updated weights for policy 0, policy_version 649231 (0.0048) [2024-06-24 11:19:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42709.5). Total num frames: 10637066240. Throughput: 0: 42709.3. Samples: 10637223840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:03,393][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 11:19:06,188][15401] Updated weights for policy 0, policy_version 649241 (0.0038) [2024-06-24 11:19:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10637295616. Throughput: 0: 42777.6. Samples: 10637355760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 11:19:09,501][15401] Updated weights for policy 0, policy_version 649251 (0.0037) [2024-06-24 11:19:13,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10637475840. Throughput: 0: 42535.9. Samples: 10637608660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 11:19:13,645][15401] Updated weights for policy 0, policy_version 649261 (0.0033) [2024-06-24 11:19:17,087][15401] Updated weights for policy 0, policy_version 649271 (0.0033) [2024-06-24 11:19:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10637688832. Throughput: 0: 42538.7. Samples: 10637857660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 11:19:21,463][15401] Updated weights for policy 0, policy_version 649281 (0.0032) [2024-06-24 11:19:23,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42598.0). Total num frames: 10637918208. Throughput: 0: 42629.9. Samples: 10637989060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:23,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 11:19:25,213][15401] Updated weights for policy 0, policy_version 649291 (0.0043) [2024-06-24 11:19:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10638114816. Throughput: 0: 42549.0. Samples: 10638245680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 11:19:29,487][15401] Updated weights for policy 0, policy_version 649301 (0.0031) [2024-06-24 11:19:32,695][15401] Updated weights for policy 0, policy_version 649311 (0.0035) [2024-06-24 11:19:33,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42766.0). Total num frames: 10638344192. Throughput: 0: 42525.4. Samples: 10638494080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 11:19:37,121][15401] Updated weights for policy 0, policy_version 649321 (0.0031) [2024-06-24 11:19:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 10638573568. Throughput: 0: 42516.9. Samples: 10638627800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:38,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 11:19:40,178][15401] Updated weights for policy 0, policy_version 649331 (0.0033) [2024-06-24 11:19:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10638753792. Throughput: 0: 42415.5. Samples: 10638881480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 11:19:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649338_10638753792.pth... [2024-06-24 11:19:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000648715_10628546560.pth [2024-06-24 11:19:44,674][15401] Updated weights for policy 0, policy_version 649341 (0.0033) [2024-06-24 11:19:47,764][15401] Updated weights for policy 0, policy_version 649351 (0.0036) [2024-06-24 11:19:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 10638983168. Throughput: 0: 42390.8. Samples: 10639131320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 11:19:52,630][15401] Updated weights for policy 0, policy_version 649361 (0.0028) [2024-06-24 11:19:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10639179776. Throughput: 0: 42330.1. Samples: 10639260620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 11:19:55,403][15401] Updated weights for policy 0, policy_version 649371 (0.0037) [2024-06-24 11:19:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10639376384. Throughput: 0: 42281.8. Samples: 10639511340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:19:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 11:20:00,214][15401] Updated weights for policy 0, policy_version 649381 (0.0038) [2024-06-24 11:20:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10639605760. Throughput: 0: 42371.0. Samples: 10639764360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:20:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 11:20:03,666][15401] Updated weights for policy 0, policy_version 649391 (0.0025) [2024-06-24 11:20:08,055][15401] Updated weights for policy 0, policy_version 649401 (0.0051) [2024-06-24 11:20:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10639818752. Throughput: 0: 42499.1. Samples: 10639901420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:20:08,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 11:20:11,376][15401] Updated weights for policy 0, policy_version 649411 (0.0038) [2024-06-24 11:20:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10639998976. Throughput: 0: 42228.8. Samples: 10640145980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:20:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 11:20:15,515][15401] Updated weights for policy 0, policy_version 649421 (0.0030) [2024-06-24 11:20:15,861][15349] Signal inference workers to stop experience collection... (157600 times) [2024-06-24 11:20:15,863][15349] Signal inference workers to resume experience collection... (157600 times) [2024-06-24 11:20:15,879][15401] InferenceWorker_p0-w0: stopping experience collection (157600 times) [2024-06-24 11:20:15,879][15401] InferenceWorker_p0-w0: resuming experience collection (157600 times) [2024-06-24 11:20:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10640244736. Throughput: 0: 42454.1. Samples: 10640404520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:20:18,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 11:20:19,123][15401] Updated weights for policy 0, policy_version 649431 (0.0028) [2024-06-24 11:20:23,272][15401] Updated weights for policy 0, policy_version 649441 (0.0037) [2024-06-24 11:20:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42053.8, 300 sec: 42542.8). Total num frames: 10640441344. Throughput: 0: 42483.4. Samples: 10640539560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 11:20:23,391][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 11:20:26,643][15401] Updated weights for policy 0, policy_version 649451 (0.0029) [2024-06-24 11:20:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 10640654336. Throughput: 0: 42351.2. Samples: 10640787280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 11:20:30,873][15401] Updated weights for policy 0, policy_version 649461 (0.0032) [2024-06-24 11:20:33,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 10640883712. Throughput: 0: 42422.6. Samples: 10641040440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:33,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 11:20:34,563][15401] Updated weights for policy 0, policy_version 649471 (0.0033) [2024-06-24 11:20:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42487.7). Total num frames: 10641080320. Throughput: 0: 42541.3. Samples: 10641174980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:38,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-24 11:20:38,493][15401] Updated weights for policy 0, policy_version 649481 (0.0046) [2024-06-24 11:20:42,080][15401] Updated weights for policy 0, policy_version 649491 (0.0040) [2024-06-24 11:20:43,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 10641293312. Throughput: 0: 42574.1. Samples: 10641427180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:43,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 11:20:46,103][15401] Updated weights for policy 0, policy_version 649501 (0.0032) [2024-06-24 11:20:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10641539072. Throughput: 0: 42620.0. Samples: 10641682260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 11:20:49,574][15401] Updated weights for policy 0, policy_version 649511 (0.0041) [2024-06-24 11:20:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10641735680. Throughput: 0: 42509.0. Samples: 10641814320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 11:20:53,749][15401] Updated weights for policy 0, policy_version 649521 (0.0030) [2024-06-24 11:20:57,126][15401] Updated weights for policy 0, policy_version 649531 (0.0035) [2024-06-24 11:20:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 10641948672. Throughput: 0: 42592.3. Samples: 10642062640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:20:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 11:21:01,329][15401] Updated weights for policy 0, policy_version 649541 (0.0044) [2024-06-24 11:21:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10642145280. Throughput: 0: 42520.1. Samples: 10642317920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:03,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 11:21:04,801][15401] Updated weights for policy 0, policy_version 649551 (0.0039) [2024-06-24 11:21:08,390][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.2, 300 sec: 42431.9). Total num frames: 10642341888. Throughput: 0: 42337.5. Samples: 10642444740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 11:21:09,289][15401] Updated weights for policy 0, policy_version 649561 (0.0028) [2024-06-24 11:21:12,608][15401] Updated weights for policy 0, policy_version 649571 (0.0051) [2024-06-24 11:21:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10642571264. Throughput: 0: 42327.2. Samples: 10642692000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 11:21:16,875][15401] Updated weights for policy 0, policy_version 649581 (0.0035) [2024-06-24 11:21:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42431.7). Total num frames: 10642784256. Throughput: 0: 42564.4. Samples: 10642955740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 11:21:20,294][15401] Updated weights for policy 0, policy_version 649591 (0.0038) [2024-06-24 11:21:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10642980864. Throughput: 0: 42390.6. Samples: 10643082560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 11:21:24,994][15401] Updated weights for policy 0, policy_version 649601 (0.0041) [2024-06-24 11:21:27,874][15401] Updated weights for policy 0, policy_version 649611 (0.0032) [2024-06-24 11:21:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10643226624. Throughput: 0: 42505.0. Samples: 10643339900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 11:21:32,527][15401] Updated weights for policy 0, policy_version 649621 (0.0036) [2024-06-24 11:21:33,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42598.4, 300 sec: 42542.5). Total num frames: 10643439616. Throughput: 0: 42642.7. Samples: 10643601280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:33,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 11:21:35,974][15401] Updated weights for policy 0, policy_version 649631 (0.0039) [2024-06-24 11:21:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10643636224. Throughput: 0: 42357.2. Samples: 10643720400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 11:21:39,760][15349] Signal inference workers to stop experience collection... (157650 times) [2024-06-24 11:21:39,761][15349] Signal inference workers to resume experience collection... (157650 times) [2024-06-24 11:21:39,768][15401] InferenceWorker_p0-w0: stopping experience collection (157650 times) [2024-06-24 11:21:39,769][15401] InferenceWorker_p0-w0: resuming experience collection (157650 times) [2024-06-24 11:21:40,087][15401] Updated weights for policy 0, policy_version 649641 (0.0036) [2024-06-24 11:21:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 10643849216. Throughput: 0: 42425.2. Samples: 10643971760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 11:21:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649650_10643865600.pth... [2024-06-24 11:21:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649027_10633658368.pth [2024-06-24 11:21:43,691][15401] Updated weights for policy 0, policy_version 649651 (0.0034) [2024-06-24 11:21:47,860][15401] Updated weights for policy 0, policy_version 649661 (0.0020) [2024-06-24 11:21:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 10644078592. Throughput: 0: 42649.9. Samples: 10644237160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 11:21:51,589][15401] Updated weights for policy 0, policy_version 649671 (0.0040) [2024-06-24 11:21:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10644275200. Throughput: 0: 42461.4. Samples: 10644355500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 11:21:55,611][15401] Updated weights for policy 0, policy_version 649681 (0.0037) [2024-06-24 11:21:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 10644488192. Throughput: 0: 42635.0. Samples: 10644610580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:21:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:21:59,389][15401] Updated weights for policy 0, policy_version 649691 (0.0034) [2024-06-24 11:22:03,311][15401] Updated weights for policy 0, policy_version 649701 (0.0029) [2024-06-24 11:22:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10644701184. Throughput: 0: 42519.6. Samples: 10644869120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 11:22:03,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 11:22:07,093][15401] Updated weights for policy 0, policy_version 649711 (0.0038) [2024-06-24 11:22:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42543.8). Total num frames: 10644914176. Throughput: 0: 42527.2. Samples: 10644996280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 11:22:11,121][15401] Updated weights for policy 0, policy_version 649721 (0.0043) [2024-06-24 11:22:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10645127168. Throughput: 0: 42313.8. Samples: 10645244020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 11:22:14,752][15401] Updated weights for policy 0, policy_version 649731 (0.0031) [2024-06-24 11:22:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10645323776. Throughput: 0: 42314.8. Samples: 10645505340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 11:22:18,885][15401] Updated weights for policy 0, policy_version 649741 (0.0042) [2024-06-24 11:22:22,364][15401] Updated weights for policy 0, policy_version 649751 (0.0027) [2024-06-24 11:22:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10645553152. Throughput: 0: 42445.8. Samples: 10645630460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 11:22:26,451][15401] Updated weights for policy 0, policy_version 649761 (0.0041) [2024-06-24 11:22:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10645749760. Throughput: 0: 42357.3. Samples: 10645877840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 11:22:30,434][15401] Updated weights for policy 0, policy_version 649771 (0.0037) [2024-06-24 11:22:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 10645962752. Throughput: 0: 42216.7. Samples: 10646136920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 11:22:34,271][15401] Updated weights for policy 0, policy_version 649781 (0.0039) [2024-06-24 11:22:38,031][15401] Updated weights for policy 0, policy_version 649791 (0.0034) [2024-06-24 11:22:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 10646175744. Throughput: 0: 42298.1. Samples: 10646259020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:38,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 11:22:41,893][15401] Updated weights for policy 0, policy_version 649801 (0.0033) [2024-06-24 11:22:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10646405120. Throughput: 0: 42305.9. Samples: 10646514340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 11:22:45,538][15401] Updated weights for policy 0, policy_version 649811 (0.0030) [2024-06-24 11:22:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 10646585344. Throughput: 0: 42389.4. Samples: 10646776640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 11:22:49,575][15401] Updated weights for policy 0, policy_version 649821 (0.0037) [2024-06-24 11:22:53,243][15401] Updated weights for policy 0, policy_version 649831 (0.0047) [2024-06-24 11:22:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10646831104. Throughput: 0: 42136.8. Samples: 10646892440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:53,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 11:22:57,331][15401] Updated weights for policy 0, policy_version 649841 (0.0038) [2024-06-24 11:22:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 10647027712. Throughput: 0: 42301.3. Samples: 10647147580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:22:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 11:23:00,890][15401] Updated weights for policy 0, policy_version 649851 (0.0033) [2024-06-24 11:23:02,156][15349] Signal inference workers to stop experience collection... (157700 times) [2024-06-24 11:23:02,156][15349] Signal inference workers to resume experience collection... (157700 times) [2024-06-24 11:23:02,193][15401] InferenceWorker_p0-w0: stopping experience collection (157700 times) [2024-06-24 11:23:02,193][15401] InferenceWorker_p0-w0: resuming experience collection (157700 times) [2024-06-24 11:23:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10647224320. Throughput: 0: 42068.0. Samples: 10647398400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 11:23:05,290][15401] Updated weights for policy 0, policy_version 649861 (0.0042) [2024-06-24 11:23:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 10647453696. Throughput: 0: 42064.0. Samples: 10647523340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 11:23:09,182][15401] Updated weights for policy 0, policy_version 649871 (0.0026) [2024-06-24 11:23:12,998][15401] Updated weights for policy 0, policy_version 649881 (0.0034) [2024-06-24 11:23:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10647666688. Throughput: 0: 42376.8. Samples: 10647784800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 11:23:16,813][15401] Updated weights for policy 0, policy_version 649891 (0.0038) [2024-06-24 11:23:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 10647846912. Throughput: 0: 42211.7. Samples: 10648036440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 11:23:20,889][15401] Updated weights for policy 0, policy_version 649901 (0.0042) [2024-06-24 11:23:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10648092672. Throughput: 0: 42305.4. Samples: 10648162660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 11:23:24,573][15401] Updated weights for policy 0, policy_version 649911 (0.0048) [2024-06-24 11:23:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 10648289280. Throughput: 0: 42442.1. Samples: 10648424240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 11:23:28,526][15401] Updated weights for policy 0, policy_version 649921 (0.0037) [2024-06-24 11:23:32,192][15401] Updated weights for policy 0, policy_version 649931 (0.0029) [2024-06-24 11:23:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 10648502272. Throughput: 0: 42116.0. Samples: 10648671860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 11:23:36,229][15401] Updated weights for policy 0, policy_version 649941 (0.0023) [2024-06-24 11:23:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 10648731648. Throughput: 0: 42415.6. Samples: 10648801140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-24 11:23:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 11:23:39,799][15401] Updated weights for policy 0, policy_version 649951 (0.0025) [2024-06-24 11:23:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 10648911872. Throughput: 0: 42395.0. Samples: 10649055360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:23:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 11:23:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649959_10648928256.pth... [2024-06-24 11:23:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649338_10638753792.pth [2024-06-24 11:23:43,863][15401] Updated weights for policy 0, policy_version 649961 (0.0035) [2024-06-24 11:23:47,281][15401] Updated weights for policy 0, policy_version 649971 (0.0042) [2024-06-24 11:23:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 10649157632. Throughput: 0: 42613.4. Samples: 10649316000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:23:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 11:23:51,509][15401] Updated weights for policy 0, policy_version 649981 (0.0028) [2024-06-24 11:23:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10649370624. Throughput: 0: 42680.1. Samples: 10649443940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:23:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 11:23:55,472][15401] Updated weights for policy 0, policy_version 649991 (0.0040) [2024-06-24 11:23:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 10649567232. Throughput: 0: 42538.7. Samples: 10649699040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:23:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 11:23:59,344][15401] Updated weights for policy 0, policy_version 650001 (0.0034) [2024-06-24 11:24:02,895][15401] Updated weights for policy 0, policy_version 650011 (0.0032) [2024-06-24 11:24:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 10649796608. Throughput: 0: 42663.4. Samples: 10649956300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:03,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 11:24:06,818][15401] Updated weights for policy 0, policy_version 650021 (0.0027) [2024-06-24 11:24:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10650009600. Throughput: 0: 42775.2. Samples: 10650087540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 11:24:10,403][15401] Updated weights for policy 0, policy_version 650031 (0.0029) [2024-06-24 11:24:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10650206208. Throughput: 0: 42562.7. Samples: 10650339560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:13,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 11:24:14,581][15401] Updated weights for policy 0, policy_version 650041 (0.0037) [2024-06-24 11:24:18,228][15401] Updated weights for policy 0, policy_version 650051 (0.0022) [2024-06-24 11:24:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42432.1). Total num frames: 10650435584. Throughput: 0: 42739.1. Samples: 10650595120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 11:24:22,321][15401] Updated weights for policy 0, policy_version 650061 (0.0031) [2024-06-24 11:24:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10650664960. Throughput: 0: 42838.7. Samples: 10650728880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 11:24:26,090][15401] Updated weights for policy 0, policy_version 650071 (0.0028) [2024-06-24 11:24:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 10650845184. Throughput: 0: 42795.1. Samples: 10650981140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 11:24:29,376][15349] Signal inference workers to stop experience collection... (157750 times) [2024-06-24 11:24:29,376][15349] Signal inference workers to resume experience collection... (157750 times) [2024-06-24 11:24:29,397][15401] InferenceWorker_p0-w0: stopping experience collection (157750 times) [2024-06-24 11:24:29,428][15401] InferenceWorker_p0-w0: resuming experience collection (157750 times) [2024-06-24 11:24:29,686][15401] Updated weights for policy 0, policy_version 650081 (0.0042) [2024-06-24 11:24:33,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42375.9). Total num frames: 10651074560. Throughput: 0: 42624.3. Samples: 10651234200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:33,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 11:24:33,607][15401] Updated weights for policy 0, policy_version 650091 (0.0031) [2024-06-24 11:24:37,628][15401] Updated weights for policy 0, policy_version 650101 (0.0032) [2024-06-24 11:24:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10651303936. Throughput: 0: 42752.9. Samples: 10651367820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 11:24:41,637][15401] Updated weights for policy 0, policy_version 650111 (0.0042) [2024-06-24 11:24:43,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 10651484160. Throughput: 0: 42699.1. Samples: 10651620500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:43,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 11:24:45,090][15401] Updated weights for policy 0, policy_version 650121 (0.0022) [2024-06-24 11:24:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 10651713536. Throughput: 0: 42612.5. Samples: 10651873860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:48,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 11:24:49,386][15401] Updated weights for policy 0, policy_version 650131 (0.0035) [2024-06-24 11:24:52,590][15401] Updated weights for policy 0, policy_version 650141 (0.0039) [2024-06-24 11:24:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10651926528. Throughput: 0: 42653.7. Samples: 10652006960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 11:24:57,014][15401] Updated weights for policy 0, policy_version 650151 (0.0025) [2024-06-24 11:24:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 10652139520. Throughput: 0: 42735.1. Samples: 10652262640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:24:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 11:25:00,568][15401] Updated weights for policy 0, policy_version 650161 (0.0037) [2024-06-24 11:25:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 10652352512. Throughput: 0: 42539.2. Samples: 10652509380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:25:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 11:25:04,966][15401] Updated weights for policy 0, policy_version 650171 (0.0033) [2024-06-24 11:25:08,257][15401] Updated weights for policy 0, policy_version 650181 (0.0039) [2024-06-24 11:25:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10652565504. Throughput: 0: 42409.3. Samples: 10652637300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:25:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 11:25:12,629][15401] Updated weights for policy 0, policy_version 650191 (0.0039) [2024-06-24 11:25:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 10652745728. Throughput: 0: 42405.0. Samples: 10652889360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:25:13,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 11:25:16,223][15401] Updated weights for policy 0, policy_version 650201 (0.0037) [2024-06-24 11:25:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 10652975104. Throughput: 0: 42248.9. Samples: 10653135300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 11:25:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 11:25:20,411][15401] Updated weights for policy 0, policy_version 650211 (0.0051) [2024-06-24 11:25:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10653188096. Throughput: 0: 42268.5. Samples: 10653269900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 11:25:23,888][15401] Updated weights for policy 0, policy_version 650221 (0.0039) [2024-06-24 11:25:27,892][15401] Updated weights for policy 0, policy_version 650231 (0.0046) [2024-06-24 11:25:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 10653384704. Throughput: 0: 42420.4. Samples: 10653529420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:28,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 11:25:31,489][15401] Updated weights for policy 0, policy_version 650241 (0.0040) [2024-06-24 11:25:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 10653630464. Throughput: 0: 42310.2. Samples: 10653777820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 11:25:35,508][15401] Updated weights for policy 0, policy_version 650251 (0.0032) [2024-06-24 11:25:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 10653827072. Throughput: 0: 42412.5. Samples: 10653915520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 11:25:39,027][15401] Updated weights for policy 0, policy_version 650261 (0.0037) [2024-06-24 11:25:41,536][15349] Signal inference workers to stop experience collection... (157800 times) [2024-06-24 11:25:41,540][15349] Signal inference workers to resume experience collection... (157800 times) [2024-06-24 11:25:41,592][15401] InferenceWorker_p0-w0: stopping experience collection (157800 times) [2024-06-24 11:25:41,592][15401] InferenceWorker_p0-w0: resuming experience collection (157800 times) [2024-06-24 11:25:43,115][15401] Updated weights for policy 0, policy_version 650271 (0.0024) [2024-06-24 11:25:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 10654040064. Throughput: 0: 42385.8. Samples: 10654170000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:43,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-24 11:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650271_10654040064.pth... [2024-06-24 11:25:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649650_10643865600.pth [2024-06-24 11:25:46,578][15401] Updated weights for policy 0, policy_version 650281 (0.0030) [2024-06-24 11:25:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10654269440. Throughput: 0: 42537.3. Samples: 10654423560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:48,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 11:25:51,252][15401] Updated weights for policy 0, policy_version 650291 (0.0037) [2024-06-24 11:25:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 10654466048. Throughput: 0: 42700.8. Samples: 10654558840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 11:25:54,224][15401] Updated weights for policy 0, policy_version 650301 (0.0021) [2024-06-24 11:25:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10654679040. Throughput: 0: 42720.3. Samples: 10654811780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:25:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 11:25:58,769][15401] Updated weights for policy 0, policy_version 650311 (0.0027) [2024-06-24 11:26:01,798][15401] Updated weights for policy 0, policy_version 650321 (0.0033) [2024-06-24 11:26:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10654908416. Throughput: 0: 42973.9. Samples: 10655069120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 11:26:06,359][15401] Updated weights for policy 0, policy_version 650331 (0.0043) [2024-06-24 11:26:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10655105024. Throughput: 0: 42846.7. Samples: 10655198000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 11:26:09,689][15401] Updated weights for policy 0, policy_version 650341 (0.0032) [2024-06-24 11:26:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10655334400. Throughput: 0: 42639.1. Samples: 10655448180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 11:26:13,913][15401] Updated weights for policy 0, policy_version 650351 (0.0028) [2024-06-24 11:26:17,322][15401] Updated weights for policy 0, policy_version 650361 (0.0039) [2024-06-24 11:26:18,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 10655547392. Throughput: 0: 42793.3. Samples: 10655703620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:18,392][15132] Avg episode reward: [(0, '0.201')] [2024-06-24 11:26:21,473][15401] Updated weights for policy 0, policy_version 650371 (0.0034) [2024-06-24 11:26:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 10655744000. Throughput: 0: 42578.5. Samples: 10655831560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 11:26:25,223][15401] Updated weights for policy 0, policy_version 650381 (0.0041) [2024-06-24 11:26:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42487.7). Total num frames: 10655973376. Throughput: 0: 42613.8. Samples: 10656087620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 11:26:29,048][15401] Updated weights for policy 0, policy_version 650391 (0.0028) [2024-06-24 11:26:32,877][15401] Updated weights for policy 0, policy_version 650401 (0.0027) [2024-06-24 11:26:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10656186368. Throughput: 0: 42722.6. Samples: 10656346080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:26:37,084][15401] Updated weights for policy 0, policy_version 650411 (0.0033) [2024-06-24 11:26:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10656382976. Throughput: 0: 42508.7. Samples: 10656471720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 11:26:40,958][15401] Updated weights for policy 0, policy_version 650421 (0.0036) [2024-06-24 11:26:43,398][15132] Fps is (10 sec: 42564.3, 60 sec: 42865.8, 300 sec: 42486.2). Total num frames: 10656612352. Throughput: 0: 42480.9. Samples: 10656723760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:43,398][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 11:26:44,501][15401] Updated weights for policy 0, policy_version 650431 (0.0028) [2024-06-24 11:26:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10656808960. Throughput: 0: 42575.1. Samples: 10656985000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:26:48,551][15401] Updated weights for policy 0, policy_version 650441 (0.0041) [2024-06-24 11:26:52,250][15401] Updated weights for policy 0, policy_version 650451 (0.0043) [2024-06-24 11:26:53,389][15132] Fps is (10 sec: 42632.8, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 10657038336. Throughput: 0: 42576.9. Samples: 10657113960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 11:26:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 11:26:56,169][15401] Updated weights for policy 0, policy_version 650461 (0.0050) [2024-06-24 11:26:56,568][15349] Signal inference workers to stop experience collection... (157850 times) [2024-06-24 11:26:56,572][15349] Signal inference workers to resume experience collection... (157850 times) [2024-06-24 11:26:56,596][15401] InferenceWorker_p0-w0: stopping experience collection (157850 times) [2024-06-24 11:26:56,596][15401] InferenceWorker_p0-w0: resuming experience collection (157850 times) [2024-06-24 11:26:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 10657251328. Throughput: 0: 42661.0. Samples: 10657367920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:26:58,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 11:26:59,783][15401] Updated weights for policy 0, policy_version 650471 (0.0036) [2024-06-24 11:27:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10657431552. Throughput: 0: 42894.8. Samples: 10657633780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 11:27:03,834][15401] Updated weights for policy 0, policy_version 650481 (0.0042) [2024-06-24 11:27:07,441][15401] Updated weights for policy 0, policy_version 650491 (0.0036) [2024-06-24 11:27:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10657677312. Throughput: 0: 42693.0. Samples: 10657752740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 11:27:11,753][15401] Updated weights for policy 0, policy_version 650501 (0.0045) [2024-06-24 11:27:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10657890304. Throughput: 0: 42768.4. Samples: 10658012200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 11:27:15,020][15401] Updated weights for policy 0, policy_version 650511 (0.0029) [2024-06-24 11:27:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 10658086912. Throughput: 0: 42642.7. Samples: 10658265000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:18,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 11:27:19,615][15401] Updated weights for policy 0, policy_version 650521 (0.0039) [2024-06-24 11:27:22,714][15401] Updated weights for policy 0, policy_version 650531 (0.0032) [2024-06-24 11:27:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10658332672. Throughput: 0: 42637.3. Samples: 10658390400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:23,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 11:27:27,346][15401] Updated weights for policy 0, policy_version 650541 (0.0035) [2024-06-24 11:27:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10658529280. Throughput: 0: 42833.0. Samples: 10658650900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 11:27:30,284][15401] Updated weights for policy 0, policy_version 650551 (0.0036) [2024-06-24 11:27:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 10658742272. Throughput: 0: 42558.7. Samples: 10658900140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 11:27:35,230][15401] Updated weights for policy 0, policy_version 650561 (0.0033) [2024-06-24 11:27:38,379][15401] Updated weights for policy 0, policy_version 650571 (0.0040) [2024-06-24 11:27:38,390][15132] Fps is (10 sec: 42594.4, 60 sec: 42870.7, 300 sec: 42542.7). Total num frames: 10658955264. Throughput: 0: 42481.7. Samples: 10659025680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:38,391][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 11:27:42,733][15401] Updated weights for policy 0, policy_version 650581 (0.0028) [2024-06-24 11:27:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42330.9, 300 sec: 42598.4). Total num frames: 10659151872. Throughput: 0: 42439.9. Samples: 10659277720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 11:27:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650583_10659151872.pth... [2024-06-24 11:27:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000649959_10648928256.pth [2024-06-24 11:27:45,979][15401] Updated weights for policy 0, policy_version 650591 (0.0038) [2024-06-24 11:27:48,390][15132] Fps is (10 sec: 40963.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10659364864. Throughput: 0: 42154.6. Samples: 10659530740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 11:27:50,378][15401] Updated weights for policy 0, policy_version 650601 (0.0032) [2024-06-24 11:27:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10659594240. Throughput: 0: 42557.2. Samples: 10659667820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:53,399][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 11:27:53,587][15401] Updated weights for policy 0, policy_version 650611 (0.0036) [2024-06-24 11:27:58,223][15401] Updated weights for policy 0, policy_version 650621 (0.0043) [2024-06-24 11:27:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42050.5, 300 sec: 42542.5). Total num frames: 10659774464. Throughput: 0: 42334.2. Samples: 10659917340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:27:58,393][15132] Avg episode reward: [(0, '0.276')] [2024-06-24 11:28:01,313][15401] Updated weights for policy 0, policy_version 650631 (0.0032) [2024-06-24 11:28:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10660003840. Throughput: 0: 42399.6. Samples: 10660172980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 11:28:05,753][15401] Updated weights for policy 0, policy_version 650641 (0.0041) [2024-06-24 11:28:08,390][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10660233216. Throughput: 0: 42607.5. Samples: 10660307740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 11:28:09,192][15401] Updated weights for policy 0, policy_version 650651 (0.0033) [2024-06-24 11:28:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 10660413440. Throughput: 0: 42382.8. Samples: 10660558120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:13,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 11:28:13,525][15401] Updated weights for policy 0, policy_version 650661 (0.0040) [2024-06-24 11:28:16,995][15401] Updated weights for policy 0, policy_version 650671 (0.0030) [2024-06-24 11:28:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10660659200. Throughput: 0: 42084.4. Samples: 10660793940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 11:28:21,577][15401] Updated weights for policy 0, policy_version 650681 (0.0042) [2024-06-24 11:28:22,420][15349] Signal inference workers to stop experience collection... (157900 times) [2024-06-24 11:28:22,474][15401] InferenceWorker_p0-w0: stopping experience collection (157900 times) [2024-06-24 11:28:22,481][15349] Signal inference workers to resume experience collection... (157900 times) [2024-06-24 11:28:22,488][15401] InferenceWorker_p0-w0: resuming experience collection (157900 times) [2024-06-24 11:28:23,389][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 10660839424. Throughput: 0: 42361.3. Samples: 10660931900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 11:28:24,697][15401] Updated weights for policy 0, policy_version 650691 (0.0034) [2024-06-24 11:28:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10661052416. Throughput: 0: 42422.2. Samples: 10661186720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 11:28:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 11:28:29,170][15401] Updated weights for policy 0, policy_version 650701 (0.0044) [2024-06-24 11:28:32,215][15401] Updated weights for policy 0, policy_version 650711 (0.0037) [2024-06-24 11:28:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10661314560. Throughput: 0: 42337.3. Samples: 10661435920. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:33,391][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:28:36,831][15401] Updated weights for policy 0, policy_version 650721 (0.0038) [2024-06-24 11:28:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.9, 300 sec: 42598.4). Total num frames: 10661478400. Throughput: 0: 42382.3. Samples: 10661575020. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:38,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 11:28:39,973][15401] Updated weights for policy 0, policy_version 650731 (0.0032) [2024-06-24 11:28:43,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10661691392. Throughput: 0: 42423.6. Samples: 10661826300. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 11:28:44,399][15401] Updated weights for policy 0, policy_version 650741 (0.0030) [2024-06-24 11:28:47,589][15401] Updated weights for policy 0, policy_version 650751 (0.0041) [2024-06-24 11:28:48,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 10661969920. Throughput: 0: 42324.8. Samples: 10662077600. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 11:28:51,953][15401] Updated weights for policy 0, policy_version 650761 (0.0031) [2024-06-24 11:28:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10662133760. Throughput: 0: 42445.3. Samples: 10662217780. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 11:28:55,258][15401] Updated weights for policy 0, policy_version 650771 (0.0031) [2024-06-24 11:28:58,390][15132] Fps is (10 sec: 36044.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 10662330368. Throughput: 0: 42441.6. Samples: 10662468000. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:28:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 11:28:59,523][15401] Updated weights for policy 0, policy_version 650781 (0.0036) [2024-06-24 11:29:02,891][15401] Updated weights for policy 0, policy_version 650791 (0.0038) [2024-06-24 11:29:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10662576128. Throughput: 0: 42892.5. Samples: 10662724100. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 11:29:07,050][15401] Updated weights for policy 0, policy_version 650801 (0.0038) [2024-06-24 11:29:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 10662756352. Throughput: 0: 42870.1. Samples: 10662861060. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:08,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-24 11:29:10,931][15401] Updated weights for policy 0, policy_version 650811 (0.0034) [2024-06-24 11:29:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10662985728. Throughput: 0: 42626.8. Samples: 10663104920. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 11:29:14,898][15401] Updated weights for policy 0, policy_version 650821 (0.0038) [2024-06-24 11:29:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10663198720. Throughput: 0: 42951.2. Samples: 10663368720. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 11:29:18,494][15401] Updated weights for policy 0, policy_version 650831 (0.0042) [2024-06-24 11:29:22,411][15401] Updated weights for policy 0, policy_version 650841 (0.0045) [2024-06-24 11:29:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10663395328. Throughput: 0: 42755.5. Samples: 10663499020. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 11:29:26,128][15401] Updated weights for policy 0, policy_version 650851 (0.0039) [2024-06-24 11:29:28,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42871.3, 300 sec: 42543.2). Total num frames: 10663624704. Throughput: 0: 42563.4. Samples: 10663741660. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 11:29:29,952][15401] Updated weights for policy 0, policy_version 650861 (0.0036) [2024-06-24 11:29:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 10663821312. Throughput: 0: 42860.0. Samples: 10664006300. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 11:29:33,899][15401] Updated weights for policy 0, policy_version 650871 (0.0039) [2024-06-24 11:29:34,613][15349] Signal inference workers to stop experience collection... (157950 times) [2024-06-24 11:29:34,664][15349] Signal inference workers to resume experience collection... (157950 times) [2024-06-24 11:29:34,665][15401] InferenceWorker_p0-w0: stopping experience collection (157950 times) [2024-06-24 11:29:34,674][15401] InferenceWorker_p0-w0: resuming experience collection (157950 times) [2024-06-24 11:29:37,863][15401] Updated weights for policy 0, policy_version 650881 (0.0032) [2024-06-24 11:29:38,389][15132] Fps is (10 sec: 40961.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10664034304. Throughput: 0: 42512.5. Samples: 10664130840. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 11:29:41,520][15401] Updated weights for policy 0, policy_version 650891 (0.0043) [2024-06-24 11:29:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10664263680. Throughput: 0: 42611.6. Samples: 10664385520. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 11:29:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650896_10664280064.pth... [2024-06-24 11:29:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650271_10654040064.pth [2024-06-24 11:29:45,870][15401] Updated weights for policy 0, policy_version 650901 (0.0035) [2024-06-24 11:29:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 10664476672. Throughput: 0: 42633.7. Samples: 10664642620. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 11:29:49,244][15401] Updated weights for policy 0, policy_version 650911 (0.0034) [2024-06-24 11:29:53,366][15401] Updated weights for policy 0, policy_version 650921 (0.0029) [2024-06-24 11:29:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10664689664. Throughput: 0: 42310.8. Samples: 10664765040. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 11:29:56,825][15401] Updated weights for policy 0, policy_version 650931 (0.0040) [2024-06-24 11:29:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 10664902656. Throughput: 0: 42658.1. Samples: 10665024540. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:29:58,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 11:30:01,196][15401] Updated weights for policy 0, policy_version 650941 (0.0034) [2024-06-24 11:30:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10665115648. Throughput: 0: 42599.4. Samples: 10665285700. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:30:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 11:30:04,577][15401] Updated weights for policy 0, policy_version 650951 (0.0031) [2024-06-24 11:30:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10665328640. Throughput: 0: 42444.9. Samples: 10665409040. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-06-24 11:30:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 11:30:08,645][15401] Updated weights for policy 0, policy_version 650961 (0.0040) [2024-06-24 11:30:12,138][15401] Updated weights for policy 0, policy_version 650971 (0.0041) [2024-06-24 11:30:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 10665558016. Throughput: 0: 42816.6. Samples: 10665668500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:13,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 11:30:16,073][15401] Updated weights for policy 0, policy_version 650981 (0.0034) [2024-06-24 11:30:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10665738240. Throughput: 0: 42828.4. Samples: 10665933580. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:18,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 11:30:19,590][15401] Updated weights for policy 0, policy_version 650991 (0.0038) [2024-06-24 11:30:23,392][15132] Fps is (10 sec: 42598.6, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 10665984000. Throughput: 0: 42686.1. Samples: 10666051820. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:23,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 11:30:24,291][15401] Updated weights for policy 0, policy_version 651001 (0.0037) [2024-06-24 11:30:27,098][15401] Updated weights for policy 0, policy_version 651011 (0.0026) [2024-06-24 11:30:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 10666196992. Throughput: 0: 42799.1. Samples: 10666311480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 11:30:31,719][15401] Updated weights for policy 0, policy_version 651021 (0.0042) [2024-06-24 11:30:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10666393600. Throughput: 0: 43090.3. Samples: 10666581680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 11:30:34,750][15401] Updated weights for policy 0, policy_version 651031 (0.0039) [2024-06-24 11:30:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10666622976. Throughput: 0: 42986.2. Samples: 10666699420. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:38,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 11:30:39,255][15401] Updated weights for policy 0, policy_version 651041 (0.0034) [2024-06-24 11:30:41,726][15349] Signal inference workers to stop experience collection... (158000 times) [2024-06-24 11:30:41,785][15349] Signal inference workers to resume experience collection... (158000 times) [2024-06-24 11:30:41,786][15401] InferenceWorker_p0-w0: stopping experience collection (158000 times) [2024-06-24 11:30:41,801][15401] InferenceWorker_p0-w0: resuming experience collection (158000 times) [2024-06-24 11:30:42,232][15401] Updated weights for policy 0, policy_version 651051 (0.0031) [2024-06-24 11:30:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10666852352. Throughput: 0: 42888.0. Samples: 10666954500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:43,393][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 11:30:46,878][15401] Updated weights for policy 0, policy_version 651061 (0.0028) [2024-06-24 11:30:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10667016192. Throughput: 0: 43069.4. Samples: 10667223820. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 11:30:50,125][15401] Updated weights for policy 0, policy_version 651071 (0.0024) [2024-06-24 11:30:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10667261952. Throughput: 0: 42882.1. Samples: 10667338740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 11:30:54,504][15401] Updated weights for policy 0, policy_version 651081 (0.0040) [2024-06-24 11:30:58,273][15401] Updated weights for policy 0, policy_version 651091 (0.0036) [2024-06-24 11:30:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10667474944. Throughput: 0: 42950.8. Samples: 10667601180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:30:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 11:31:02,082][15401] Updated weights for policy 0, policy_version 651101 (0.0038) [2024-06-24 11:31:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10667655168. Throughput: 0: 42890.3. Samples: 10667863640. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:03,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 11:31:05,847][15401] Updated weights for policy 0, policy_version 651111 (0.0024) [2024-06-24 11:31:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10667917312. Throughput: 0: 42880.8. Samples: 10667981360. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 11:31:09,815][15401] Updated weights for policy 0, policy_version 651121 (0.0029) [2024-06-24 11:31:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.1, 300 sec: 42598.7). Total num frames: 10668113920. Throughput: 0: 42939.0. Samples: 10668243740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:13,396][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 11:31:13,402][15401] Updated weights for policy 0, policy_version 651131 (0.0036) [2024-06-24 11:31:17,173][15401] Updated weights for policy 0, policy_version 651141 (0.0036) [2024-06-24 11:31:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10668310528. Throughput: 0: 42629.8. Samples: 10668500020. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 11:31:20,822][15401] Updated weights for policy 0, policy_version 651151 (0.0028) [2024-06-24 11:31:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 10668556288. Throughput: 0: 42779.9. Samples: 10668624520. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 11:31:25,116][15401] Updated weights for policy 0, policy_version 651161 (0.0031) [2024-06-24 11:31:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10668769280. Throughput: 0: 43004.1. Samples: 10668889680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 11:31:28,545][15401] Updated weights for policy 0, policy_version 651171 (0.0030) [2024-06-24 11:31:32,873][15401] Updated weights for policy 0, policy_version 651181 (0.0045) [2024-06-24 11:31:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10668982272. Throughput: 0: 42844.0. Samples: 10669151800. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 11:31:35,955][15401] Updated weights for policy 0, policy_version 651191 (0.0038) [2024-06-24 11:31:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42655.1). Total num frames: 10669195264. Throughput: 0: 43098.8. Samples: 10669278180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 11:31:40,298][15401] Updated weights for policy 0, policy_version 651201 (0.0035) [2024-06-24 11:31:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10669424640. Throughput: 0: 43096.0. Samples: 10669540500. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-24 11:31:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 11:31:43,446][15401] Updated weights for policy 0, policy_version 651211 (0.0041) [2024-06-24 11:31:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651211_10669441024.pth... [2024-06-24 11:31:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650583_10659151872.pth [2024-06-24 11:31:47,763][15401] Updated weights for policy 0, policy_version 651221 (0.0029) [2024-06-24 11:31:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10669604864. Throughput: 0: 43032.3. Samples: 10669800100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:31:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 11:31:51,239][15401] Updated weights for policy 0, policy_version 651231 (0.0028) [2024-06-24 11:31:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10669850624. Throughput: 0: 43171.6. Samples: 10669924080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:31:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 11:31:55,703][15401] Updated weights for policy 0, policy_version 651241 (0.0033) [2024-06-24 11:31:56,770][15349] Signal inference workers to stop experience collection... (158050 times) [2024-06-24 11:31:56,821][15401] InferenceWorker_p0-w0: stopping experience collection (158050 times) [2024-06-24 11:31:56,883][15349] Signal inference workers to resume experience collection... (158050 times) [2024-06-24 11:31:56,883][15401] InferenceWorker_p0-w0: resuming experience collection (158050 times) [2024-06-24 11:31:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10670047232. Throughput: 0: 43096.6. Samples: 10670183080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:31:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 11:31:58,814][15401] Updated weights for policy 0, policy_version 651251 (0.0028) [2024-06-24 11:32:03,301][15401] Updated weights for policy 0, policy_version 651261 (0.0029) [2024-06-24 11:32:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10670260224. Throughput: 0: 43181.8. Samples: 10670443200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:03,395][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 11:32:06,375][15401] Updated weights for policy 0, policy_version 651271 (0.0024) [2024-06-24 11:32:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10670489600. Throughput: 0: 43165.3. Samples: 10670566960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:08,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 11:32:10,631][15401] Updated weights for policy 0, policy_version 651281 (0.0028) [2024-06-24 11:32:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10670702592. Throughput: 0: 43163.5. Samples: 10670832040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 11:32:13,823][15401] Updated weights for policy 0, policy_version 651291 (0.0028) [2024-06-24 11:32:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 10670899200. Throughput: 0: 42985.0. Samples: 10671086120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:18,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 11:32:18,494][15401] Updated weights for policy 0, policy_version 651301 (0.0034) [2024-06-24 11:32:21,347][15401] Updated weights for policy 0, policy_version 651311 (0.0035) [2024-06-24 11:32:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10671144960. Throughput: 0: 43006.7. Samples: 10671213480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:23,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 11:32:25,902][15401] Updated weights for policy 0, policy_version 651321 (0.0027) [2024-06-24 11:32:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10671357952. Throughput: 0: 42968.0. Samples: 10671474060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 11:32:29,188][15401] Updated weights for policy 0, policy_version 651331 (0.0031) [2024-06-24 11:32:33,310][15401] Updated weights for policy 0, policy_version 651341 (0.0030) [2024-06-24 11:32:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 10671570944. Throughput: 0: 42957.4. Samples: 10671733180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 11:32:36,533][15401] Updated weights for policy 0, policy_version 651351 (0.0036) [2024-06-24 11:32:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10671800320. Throughput: 0: 43045.4. Samples: 10671861120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 11:32:40,891][15401] Updated weights for policy 0, policy_version 651361 (0.0047) [2024-06-24 11:32:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10672013312. Throughput: 0: 43149.6. Samples: 10672124820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 11:32:44,116][15401] Updated weights for policy 0, policy_version 651371 (0.0030) [2024-06-24 11:32:48,391][15132] Fps is (10 sec: 40955.1, 60 sec: 43416.9, 300 sec: 42764.9). Total num frames: 10672209920. Throughput: 0: 42975.8. Samples: 10672377160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:48,391][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 11:32:48,687][15401] Updated weights for policy 0, policy_version 651381 (0.0030) [2024-06-24 11:32:51,922][15401] Updated weights for policy 0, policy_version 651391 (0.0025) [2024-06-24 11:32:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 10672422912. Throughput: 0: 42891.7. Samples: 10672497080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 11:32:56,497][15401] Updated weights for policy 0, policy_version 651401 (0.0028) [2024-06-24 11:32:58,389][15132] Fps is (10 sec: 44242.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10672652288. Throughput: 0: 42841.4. Samples: 10672759900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:32:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 11:32:59,945][15401] Updated weights for policy 0, policy_version 651411 (0.0032) [2024-06-24 11:33:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10672848896. Throughput: 0: 42960.0. Samples: 10673019320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:33:03,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 11:33:04,011][15401] Updated weights for policy 0, policy_version 651421 (0.0023) [2024-06-24 11:33:07,576][15401] Updated weights for policy 0, policy_version 651431 (0.0034) [2024-06-24 11:33:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10673078272. Throughput: 0: 42880.9. Samples: 10673143120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:33:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 11:33:11,479][15401] Updated weights for policy 0, policy_version 651441 (0.0036) [2024-06-24 11:33:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10673274880. Throughput: 0: 42903.0. Samples: 10673404700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:33:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 11:33:15,179][15401] Updated weights for policy 0, policy_version 651451 (0.0023) [2024-06-24 11:33:18,356][15349] Signal inference workers to stop experience collection... (158100 times) [2024-06-24 11:33:18,364][15349] Signal inference workers to resume experience collection... (158100 times) [2024-06-24 11:33:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10673471488. Throughput: 0: 42771.9. Samples: 10673657920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:33:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 11:33:18,391][15401] InferenceWorker_p0-w0: stopping experience collection (158100 times) [2024-06-24 11:33:18,391][15401] InferenceWorker_p0-w0: resuming experience collection (158100 times) [2024-06-24 11:33:19,085][15401] Updated weights for policy 0, policy_version 651461 (0.0033) [2024-06-24 11:33:23,270][15401] Updated weights for policy 0, policy_version 651471 (0.0038) [2024-06-24 11:33:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10673700864. Throughput: 0: 42611.9. Samples: 10673778660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 11:33:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:33:27,574][15401] Updated weights for policy 0, policy_version 651481 (0.0030) [2024-06-24 11:33:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10673897472. Throughput: 0: 42433.1. Samples: 10674034300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 11:33:30,906][15401] Updated weights for policy 0, policy_version 651491 (0.0039) [2024-06-24 11:33:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10674110464. Throughput: 0: 42613.9. Samples: 10674294740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 11:33:35,158][15401] Updated weights for policy 0, policy_version 651501 (0.0039) [2024-06-24 11:33:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 10674339840. Throughput: 0: 42671.8. Samples: 10674417320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:38,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 11:33:38,583][15401] Updated weights for policy 0, policy_version 651511 (0.0027) [2024-06-24 11:33:42,603][15401] Updated weights for policy 0, policy_version 651521 (0.0036) [2024-06-24 11:33:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10674552832. Throughput: 0: 42477.3. Samples: 10674671380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 11:33:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651523_10674552832.pth... [2024-06-24 11:33:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000650896_10664280064.pth [2024-06-24 11:33:46,332][15401] Updated weights for policy 0, policy_version 651531 (0.0032) [2024-06-24 11:33:48,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42599.2, 300 sec: 42820.6). Total num frames: 10674765824. Throughput: 0: 42413.8. Samples: 10674927940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:48,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 11:33:50,134][15401] Updated weights for policy 0, policy_version 651541 (0.0026) [2024-06-24 11:33:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10674962432. Throughput: 0: 42489.3. Samples: 10675055140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 11:33:53,975][15401] Updated weights for policy 0, policy_version 651551 (0.0032) [2024-06-24 11:33:57,739][15401] Updated weights for policy 0, policy_version 651561 (0.0027) [2024-06-24 11:33:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10675191808. Throughput: 0: 42336.6. Samples: 10675309840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:33:58,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 11:34:01,963][15401] Updated weights for policy 0, policy_version 651571 (0.0038) [2024-06-24 11:34:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10675404800. Throughput: 0: 42363.7. Samples: 10675564280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 11:34:05,371][15401] Updated weights for policy 0, policy_version 651581 (0.0028) [2024-06-24 11:34:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 10675617792. Throughput: 0: 42708.0. Samples: 10675700620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:08,393][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 11:34:09,466][15401] Updated weights for policy 0, policy_version 651591 (0.0031) [2024-06-24 11:34:13,195][15401] Updated weights for policy 0, policy_version 651601 (0.0034) [2024-06-24 11:34:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10675830784. Throughput: 0: 42733.7. Samples: 10675957320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 11:34:16,963][15401] Updated weights for policy 0, policy_version 651611 (0.0034) [2024-06-24 11:34:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10676060160. Throughput: 0: 42657.8. Samples: 10676214340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 11:34:20,827][15401] Updated weights for policy 0, policy_version 651621 (0.0038) [2024-06-24 11:34:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10676273152. Throughput: 0: 42861.0. Samples: 10676346060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 11:34:24,777][15401] Updated weights for policy 0, policy_version 651631 (0.0034) [2024-06-24 11:34:28,344][15401] Updated weights for policy 0, policy_version 651641 (0.0031) [2024-06-24 11:34:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10676486144. Throughput: 0: 42952.5. Samples: 10676604240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:28,392][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 11:34:32,102][15401] Updated weights for policy 0, policy_version 651651 (0.0038) [2024-06-24 11:34:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 10676715520. Throughput: 0: 43156.4. Samples: 10676869980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 11:34:35,854][15401] Updated weights for policy 0, policy_version 651661 (0.0035) [2024-06-24 11:34:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10676928512. Throughput: 0: 43195.1. Samples: 10676998920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 11:34:39,656][15401] Updated weights for policy 0, policy_version 651671 (0.0033) [2024-06-24 11:34:43,275][15401] Updated weights for policy 0, policy_version 651681 (0.0029) [2024-06-24 11:34:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10677141504. Throughput: 0: 43204.0. Samples: 10677254020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 11:34:46,738][15349] Signal inference workers to stop experience collection... (158150 times) [2024-06-24 11:34:46,738][15349] Signal inference workers to resume experience collection... (158150 times) [2024-06-24 11:34:46,754][15401] InferenceWorker_p0-w0: stopping experience collection (158150 times) [2024-06-24 11:34:46,755][15401] InferenceWorker_p0-w0: resuming experience collection (158150 times) [2024-06-24 11:34:47,070][15401] Updated weights for policy 0, policy_version 651691 (0.0030) [2024-06-24 11:34:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10677354496. Throughput: 0: 43366.1. Samples: 10677515760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 11:34:50,782][15401] Updated weights for policy 0, policy_version 651701 (0.0038) [2024-06-24 11:34:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10677551104. Throughput: 0: 43193.5. Samples: 10677644220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 11:34:54,492][15401] Updated weights for policy 0, policy_version 651711 (0.0033) [2024-06-24 11:34:58,198][15401] Updated weights for policy 0, policy_version 651721 (0.0041) [2024-06-24 11:34:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 10677796864. Throughput: 0: 43399.1. Samples: 10677910280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 11:34:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 11:35:02,300][15401] Updated weights for policy 0, policy_version 651731 (0.0031) [2024-06-24 11:35:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 10678009856. Throughput: 0: 43263.2. Samples: 10678161180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 11:35:05,954][15401] Updated weights for policy 0, policy_version 651741 (0.0032) [2024-06-24 11:35:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.3, 300 sec: 42876.4). Total num frames: 10678206464. Throughput: 0: 43193.8. Samples: 10678289780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 11:35:09,962][15401] Updated weights for policy 0, policy_version 651751 (0.0042) [2024-06-24 11:35:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 10678435840. Throughput: 0: 43288.9. Samples: 10678552240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:35:13,511][15401] Updated weights for policy 0, policy_version 651761 (0.0044) [2024-06-24 11:35:17,649][15401] Updated weights for policy 0, policy_version 651771 (0.0038) [2024-06-24 11:35:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 10678648832. Throughput: 0: 43018.2. Samples: 10678805800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 11:35:21,144][15401] Updated weights for policy 0, policy_version 651781 (0.0043) [2024-06-24 11:35:23,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 10678845440. Throughput: 0: 42892.9. Samples: 10678929200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:23,392][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 11:35:25,204][15401] Updated weights for policy 0, policy_version 651791 (0.0036) [2024-06-24 11:35:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10679058432. Throughput: 0: 42947.2. Samples: 10679186640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 11:35:28,706][15401] Updated weights for policy 0, policy_version 651801 (0.0031) [2024-06-24 11:35:32,824][15401] Updated weights for policy 0, policy_version 651811 (0.0042) [2024-06-24 11:35:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10679287808. Throughput: 0: 42866.7. Samples: 10679444760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:33,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-24 11:35:36,338][15401] Updated weights for policy 0, policy_version 651821 (0.0035) [2024-06-24 11:35:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10679500800. Throughput: 0: 42832.4. Samples: 10679571680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 11:35:40,507][15401] Updated weights for policy 0, policy_version 651831 (0.0035) [2024-06-24 11:35:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 10679713792. Throughput: 0: 42685.8. Samples: 10679831140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:35:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651839_10679730176.pth... [2024-06-24 11:35:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651211_10669441024.pth [2024-06-24 11:35:43,843][15401] Updated weights for policy 0, policy_version 651841 (0.0035) [2024-06-24 11:35:48,328][15401] Updated weights for policy 0, policy_version 651851 (0.0038) [2024-06-24 11:35:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10679926784. Throughput: 0: 42768.3. Samples: 10680085760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:48,391][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 11:35:51,674][15401] Updated weights for policy 0, policy_version 651861 (0.0029) [2024-06-24 11:35:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10680123392. Throughput: 0: 42709.0. Samples: 10680211680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 11:35:56,047][15401] Updated weights for policy 0, policy_version 651871 (0.0040) [2024-06-24 11:35:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 10680336384. Throughput: 0: 42544.5. Samples: 10680466740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:35:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 11:35:59,800][15401] Updated weights for policy 0, policy_version 651881 (0.0022) [2024-06-24 11:36:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 10680549376. Throughput: 0: 42719.5. Samples: 10680728180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 11:36:03,787][15401] Updated weights for policy 0, policy_version 651891 (0.0036) [2024-06-24 11:36:07,302][15401] Updated weights for policy 0, policy_version 651901 (0.0025) [2024-06-24 11:36:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10680778752. Throughput: 0: 42760.9. Samples: 10680853340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:08,400][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 11:36:11,492][15401] Updated weights for policy 0, policy_version 651911 (0.0032) [2024-06-24 11:36:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 10680975360. Throughput: 0: 42784.8. Samples: 10681111960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 11:36:14,788][15349] Signal inference workers to stop experience collection... (158200 times) [2024-06-24 11:36:14,789][15349] Signal inference workers to resume experience collection... (158200 times) [2024-06-24 11:36:14,797][15401] InferenceWorker_p0-w0: stopping experience collection (158200 times) [2024-06-24 11:36:14,798][15401] InferenceWorker_p0-w0: resuming experience collection (158200 times) [2024-06-24 11:36:14,935][15401] Updated weights for policy 0, policy_version 651921 (0.0034) [2024-06-24 11:36:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10681188352. Throughput: 0: 42780.9. Samples: 10681369900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 11:36:19,014][15401] Updated weights for policy 0, policy_version 651931 (0.0023) [2024-06-24 11:36:22,596][15401] Updated weights for policy 0, policy_version 651941 (0.0036) [2024-06-24 11:36:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 10681417728. Throughput: 0: 42672.0. Samples: 10681491920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 11:36:26,728][15401] Updated weights for policy 0, policy_version 651951 (0.0035) [2024-06-24 11:36:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10681630720. Throughput: 0: 42689.1. Samples: 10681752140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 11:36:30,274][15401] Updated weights for policy 0, policy_version 651961 (0.0029) [2024-06-24 11:36:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10681827328. Throughput: 0: 42929.9. Samples: 10682017600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:33,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 11:36:34,358][15401] Updated weights for policy 0, policy_version 651971 (0.0034) [2024-06-24 11:36:37,853][15401] Updated weights for policy 0, policy_version 651981 (0.0036) [2024-06-24 11:36:38,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 10682056704. Throughput: 0: 42888.6. Samples: 10682141680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-24 11:36:38,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 11:36:41,996][15401] Updated weights for policy 0, policy_version 651991 (0.0036) [2024-06-24 11:36:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 10682286080. Throughput: 0: 42982.2. Samples: 10682400940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:36:43,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 11:36:45,514][15401] Updated weights for policy 0, policy_version 652001 (0.0029) [2024-06-24 11:36:48,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10682466304. Throughput: 0: 42965.1. Samples: 10682661600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:36:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 11:36:49,584][15401] Updated weights for policy 0, policy_version 652011 (0.0037) [2024-06-24 11:36:52,987][15401] Updated weights for policy 0, policy_version 652021 (0.0029) [2024-06-24 11:36:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10682712064. Throughput: 0: 42999.0. Samples: 10682788300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:36:53,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-24 11:36:57,376][15401] Updated weights for policy 0, policy_version 652031 (0.0030) [2024-06-24 11:36:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 10682941440. Throughput: 0: 43097.4. Samples: 10683051340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:36:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 11:37:00,793][15401] Updated weights for policy 0, policy_version 652041 (0.0035) [2024-06-24 11:37:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10683121664. Throughput: 0: 43071.5. Samples: 10683308120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:03,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 11:37:04,979][15401] Updated weights for policy 0, policy_version 652051 (0.0027) [2024-06-24 11:37:08,298][15401] Updated weights for policy 0, policy_version 652061 (0.0029) [2024-06-24 11:37:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10683367424. Throughput: 0: 43087.1. Samples: 10683430840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 11:37:12,572][15401] Updated weights for policy 0, policy_version 652071 (0.0031) [2024-06-24 11:37:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10683564032. Throughput: 0: 43175.6. Samples: 10683695040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 11:37:16,027][15401] Updated weights for policy 0, policy_version 652081 (0.0030) [2024-06-24 11:37:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 10683777024. Throughput: 0: 42954.5. Samples: 10683950660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:18,393][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 11:37:20,212][15401] Updated weights for policy 0, policy_version 652091 (0.0036) [2024-06-24 11:37:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10684006400. Throughput: 0: 42974.1. Samples: 10684075500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:23,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 11:37:23,628][15401] Updated weights for policy 0, policy_version 652101 (0.0049) [2024-06-24 11:37:28,131][15401] Updated weights for policy 0, policy_version 652111 (0.0042) [2024-06-24 11:37:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10684203008. Throughput: 0: 43083.6. Samples: 10684339700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 11:37:31,415][15401] Updated weights for policy 0, policy_version 652121 (0.0037) [2024-06-24 11:37:33,390][15132] Fps is (10 sec: 42597.1, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 10684432384. Throughput: 0: 42889.5. Samples: 10684591640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 11:37:35,727][15401] Updated weights for policy 0, policy_version 652131 (0.0036) [2024-06-24 11:37:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10684661760. Throughput: 0: 43076.1. Samples: 10684726720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 11:37:38,757][15401] Updated weights for policy 0, policy_version 652141 (0.0030) [2024-06-24 11:37:43,265][15349] Signal inference workers to stop experience collection... (158250 times) [2024-06-24 11:37:43,299][15401] InferenceWorker_p0-w0: stopping experience collection (158250 times) [2024-06-24 11:37:43,328][15349] Signal inference workers to resume experience collection... (158250 times) [2024-06-24 11:37:43,332][15401] InferenceWorker_p0-w0: resuming experience collection (158250 times) [2024-06-24 11:37:43,336][15401] Updated weights for policy 0, policy_version 652151 (0.0031) [2024-06-24 11:37:43,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42820.7). Total num frames: 10684841984. Throughput: 0: 43079.0. Samples: 10684989900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 11:37:43,619][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652153_10684874752.pth... [2024-06-24 11:37:43,677][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651523_10674552832.pth [2024-06-24 11:37:46,162][15401] Updated weights for policy 0, policy_version 652161 (0.0031) [2024-06-24 11:37:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10685071360. Throughput: 0: 42895.7. Samples: 10685238420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 11:37:50,888][15401] Updated weights for policy 0, policy_version 652171 (0.0022) [2024-06-24 11:37:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10685300736. Throughput: 0: 43148.0. Samples: 10685372500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 11:37:53,812][15401] Updated weights for policy 0, policy_version 652181 (0.0031) [2024-06-24 11:37:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10685480960. Throughput: 0: 43000.9. Samples: 10685630080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:37:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 11:37:58,458][15401] Updated weights for policy 0, policy_version 652191 (0.0026) [2024-06-24 11:38:01,362][15401] Updated weights for policy 0, policy_version 652201 (0.0028) [2024-06-24 11:38:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10685726720. Throughput: 0: 42777.4. Samples: 10685875540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:38:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 11:38:06,188][15401] Updated weights for policy 0, policy_version 652211 (0.0035) [2024-06-24 11:38:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 10685939712. Throughput: 0: 42991.0. Samples: 10686010100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:38:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 11:38:09,486][15401] Updated weights for policy 0, policy_version 652221 (0.0032) [2024-06-24 11:38:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 10686136320. Throughput: 0: 42794.2. Samples: 10686265440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 11:38:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 11:38:13,662][15401] Updated weights for policy 0, policy_version 652231 (0.0040) [2024-06-24 11:38:17,184][15401] Updated weights for policy 0, policy_version 652241 (0.0034) [2024-06-24 11:38:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 10686349312. Throughput: 0: 42857.6. Samples: 10686520220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 11:38:21,206][15401] Updated weights for policy 0, policy_version 652251 (0.0037) [2024-06-24 11:38:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10686562304. Throughput: 0: 42700.1. Samples: 10686648220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 11:38:24,918][15401] Updated weights for policy 0, policy_version 652261 (0.0035) [2024-06-24 11:38:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10686758912. Throughput: 0: 42459.6. Samples: 10686900580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 11:38:28,995][15401] Updated weights for policy 0, policy_version 652271 (0.0037) [2024-06-24 11:38:32,685][15401] Updated weights for policy 0, policy_version 652281 (0.0039) [2024-06-24 11:38:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 10687004672. Throughput: 0: 42624.4. Samples: 10687156520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 11:38:36,577][15401] Updated weights for policy 0, policy_version 652291 (0.0033) [2024-06-24 11:38:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 10687201280. Throughput: 0: 42659.4. Samples: 10687292180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 11:38:40,388][15401] Updated weights for policy 0, policy_version 652301 (0.0042) [2024-06-24 11:38:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10687397888. Throughput: 0: 42516.4. Samples: 10687543320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 11:38:44,239][15401] Updated weights for policy 0, policy_version 652311 (0.0034) [2024-06-24 11:38:48,076][15401] Updated weights for policy 0, policy_version 652321 (0.0036) [2024-06-24 11:38:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10687627264. Throughput: 0: 42666.2. Samples: 10687795520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 11:38:52,213][15401] Updated weights for policy 0, policy_version 652331 (0.0028) [2024-06-24 11:38:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10687840256. Throughput: 0: 42487.1. Samples: 10687922020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 11:38:55,535][15401] Updated weights for policy 0, policy_version 652341 (0.0038) [2024-06-24 11:38:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 10688053248. Throughput: 0: 42601.2. Samples: 10688182500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:38:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 11:38:59,954][15401] Updated weights for policy 0, policy_version 652351 (0.0027) [2024-06-24 11:39:03,237][15401] Updated weights for policy 0, policy_version 652361 (0.0041) [2024-06-24 11:39:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42931.6). Total num frames: 10688282624. Throughput: 0: 42564.4. Samples: 10688435720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:03,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 11:39:07,557][15401] Updated weights for policy 0, policy_version 652371 (0.0024) [2024-06-24 11:39:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10688479232. Throughput: 0: 42677.8. Samples: 10688568720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 11:39:10,694][15401] Updated weights for policy 0, policy_version 652381 (0.0031) [2024-06-24 11:39:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10688692224. Throughput: 0: 42826.7. Samples: 10688827780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 11:39:15,047][15401] Updated weights for policy 0, policy_version 652391 (0.0031) [2024-06-24 11:39:18,267][15401] Updated weights for policy 0, policy_version 652401 (0.0022) [2024-06-24 11:39:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10688937984. Throughput: 0: 42736.8. Samples: 10689079680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:18,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 11:39:22,582][15401] Updated weights for policy 0, policy_version 652411 (0.0029) [2024-06-24 11:39:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10689134592. Throughput: 0: 42695.7. Samples: 10689213480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 11:39:26,237][15401] Updated weights for policy 0, policy_version 652421 (0.0033) [2024-06-24 11:39:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10689347584. Throughput: 0: 42758.2. Samples: 10689467440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 11:39:30,337][15401] Updated weights for policy 0, policy_version 652431 (0.0040) [2024-06-24 11:39:33,106][15349] Signal inference workers to stop experience collection... (158300 times) [2024-06-24 11:39:33,106][15349] Signal inference workers to resume experience collection... (158300 times) [2024-06-24 11:39:33,119][15401] InferenceWorker_p0-w0: stopping experience collection (158300 times) [2024-06-24 11:39:33,131][15401] InferenceWorker_p0-w0: resuming experience collection (158300 times) [2024-06-24 11:39:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10689560576. Throughput: 0: 42883.6. Samples: 10689725280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 11:39:33,746][15401] Updated weights for policy 0, policy_version 652441 (0.0032) [2024-06-24 11:39:37,964][15401] Updated weights for policy 0, policy_version 652451 (0.0033) [2024-06-24 11:39:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10689757184. Throughput: 0: 42793.3. Samples: 10689847720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 11:39:41,631][15401] Updated weights for policy 0, policy_version 652461 (0.0035) [2024-06-24 11:39:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10689986560. Throughput: 0: 42567.2. Samples: 10690098020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 11:39:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652466_10690002944.pth... [2024-06-24 11:39:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000651839_10679730176.pth [2024-06-24 11:39:46,157][15401] Updated weights for policy 0, policy_version 652471 (0.0033) [2024-06-24 11:39:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10690183168. Throughput: 0: 42793.4. Samples: 10690361320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:39:49,372][15401] Updated weights for policy 0, policy_version 652481 (0.0034) [2024-06-24 11:39:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10690396160. Throughput: 0: 42483.0. Samples: 10690480460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 11:39:53,395][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 11:39:54,001][15401] Updated weights for policy 0, policy_version 652491 (0.0034) [2024-06-24 11:39:56,934][15401] Updated weights for policy 0, policy_version 652501 (0.0028) [2024-06-24 11:39:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 10690625536. Throughput: 0: 42459.7. Samples: 10690738460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:39:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 11:40:01,598][15401] Updated weights for policy 0, policy_version 652511 (0.0039) [2024-06-24 11:40:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 10690838528. Throughput: 0: 42635.1. Samples: 10690998260. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 11:40:04,473][15401] Updated weights for policy 0, policy_version 652521 (0.0037) [2024-06-24 11:40:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10691035136. Throughput: 0: 42392.4. Samples: 10691121140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 11:40:09,089][15401] Updated weights for policy 0, policy_version 652531 (0.0029) [2024-06-24 11:40:11,933][15401] Updated weights for policy 0, policy_version 652541 (0.0034) [2024-06-24 11:40:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10691280896. Throughput: 0: 42495.6. Samples: 10691379740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 11:40:16,639][15401] Updated weights for policy 0, policy_version 652551 (0.0029) [2024-06-24 11:40:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 10691461120. Throughput: 0: 42679.1. Samples: 10691645840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 11:40:19,437][15401] Updated weights for policy 0, policy_version 652561 (0.0037) [2024-06-24 11:40:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10691674112. Throughput: 0: 42592.6. Samples: 10691764380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 11:40:24,245][15401] Updated weights for policy 0, policy_version 652571 (0.0041) [2024-06-24 11:40:27,059][15401] Updated weights for policy 0, policy_version 652581 (0.0033) [2024-06-24 11:40:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10691919872. Throughput: 0: 42740.5. Samples: 10692021340. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 11:40:31,897][15401] Updated weights for policy 0, policy_version 652591 (0.0039) [2024-06-24 11:40:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10692116480. Throughput: 0: 42915.5. Samples: 10692292520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 11:40:34,972][15401] Updated weights for policy 0, policy_version 652601 (0.0027) [2024-06-24 11:40:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10692329472. Throughput: 0: 42880.9. Samples: 10692410100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 11:40:39,530][15401] Updated weights for policy 0, policy_version 652611 (0.0024) [2024-06-24 11:40:42,497][15401] Updated weights for policy 0, policy_version 652621 (0.0037) [2024-06-24 11:40:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10692575232. Throughput: 0: 42866.9. Samples: 10692667480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 11:40:47,116][15401] Updated weights for policy 0, policy_version 652631 (0.0027) [2024-06-24 11:40:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10692739072. Throughput: 0: 42956.2. Samples: 10692931280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 11:40:48,478][15349] Signal inference workers to stop experience collection... (158350 times) [2024-06-24 11:40:48,532][15401] InferenceWorker_p0-w0: stopping experience collection (158350 times) [2024-06-24 11:40:48,540][15349] Signal inference workers to resume experience collection... (158350 times) [2024-06-24 11:40:48,551][15401] InferenceWorker_p0-w0: resuming experience collection (158350 times) [2024-06-24 11:40:50,144][15401] Updated weights for policy 0, policy_version 652641 (0.0030) [2024-06-24 11:40:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10692952064. Throughput: 0: 42796.9. Samples: 10693047000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 11:40:54,759][15401] Updated weights for policy 0, policy_version 652651 (0.0023) [2024-06-24 11:40:58,386][15401] Updated weights for policy 0, policy_version 652661 (0.0023) [2024-06-24 11:40:58,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.2, 300 sec: 42876.1). Total num frames: 10693197824. Throughput: 0: 42755.7. Samples: 10693303760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:40:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 11:41:02,404][15401] Updated weights for policy 0, policy_version 652671 (0.0028) [2024-06-24 11:41:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 10693361664. Throughput: 0: 42636.5. Samples: 10693564480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 11:41:06,054][15401] Updated weights for policy 0, policy_version 652681 (0.0030) [2024-06-24 11:41:08,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10693607424. Throughput: 0: 42657.6. Samples: 10693683980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 11:41:10,354][15401] Updated weights for policy 0, policy_version 652691 (0.0033) [2024-06-24 11:41:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42323.5, 300 sec: 42820.2). Total num frames: 10693820416. Throughput: 0: 42749.6. Samples: 10693945180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:13,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 11:41:13,802][15401] Updated weights for policy 0, policy_version 652701 (0.0044) [2024-06-24 11:41:18,016][15401] Updated weights for policy 0, policy_version 652711 (0.0037) [2024-06-24 11:41:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10694017024. Throughput: 0: 42293.8. Samples: 10694195740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 11:41:21,539][15401] Updated weights for policy 0, policy_version 652721 (0.0034) [2024-06-24 11:41:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10694262784. Throughput: 0: 42596.5. Samples: 10694326940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 11:41:25,825][15401] Updated weights for policy 0, policy_version 652731 (0.0031) [2024-06-24 11:41:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10694459392. Throughput: 0: 42658.7. Samples: 10694587120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-24 11:41:28,394][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 11:41:29,053][15401] Updated weights for policy 0, policy_version 652741 (0.0033) [2024-06-24 11:41:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 10694672384. Throughput: 0: 42478.2. Samples: 10694842800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 11:41:33,391][15401] Updated weights for policy 0, policy_version 652751 (0.0027) [2024-06-24 11:41:36,906][15401] Updated weights for policy 0, policy_version 652761 (0.0032) [2024-06-24 11:41:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10694901760. Throughput: 0: 42681.9. Samples: 10694967680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:38,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-24 11:41:41,022][15401] Updated weights for policy 0, policy_version 652771 (0.0027) [2024-06-24 11:41:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 10695098368. Throughput: 0: 42790.5. Samples: 10695229320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 11:41:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652778_10695114752.pth... [2024-06-24 11:41:43,494][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652153_10684874752.pth [2024-06-24 11:41:44,406][15401] Updated weights for policy 0, policy_version 652781 (0.0051) [2024-06-24 11:41:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10695311360. Throughput: 0: 42801.8. Samples: 10695490560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 11:41:48,967][15401] Updated weights for policy 0, policy_version 652791 (0.0032) [2024-06-24 11:41:52,139][15401] Updated weights for policy 0, policy_version 652801 (0.0038) [2024-06-24 11:41:53,392][15132] Fps is (10 sec: 44225.4, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 10695540736. Throughput: 0: 42897.6. Samples: 10695614480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:53,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 11:41:56,758][15401] Updated weights for policy 0, policy_version 652811 (0.0038) [2024-06-24 11:41:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10695753728. Throughput: 0: 42847.6. Samples: 10695873220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:41:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 11:41:59,759][15401] Updated weights for policy 0, policy_version 652821 (0.0041) [2024-06-24 11:42:03,389][15132] Fps is (10 sec: 39331.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10695933952. Throughput: 0: 42918.7. Samples: 10696127080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 11:42:04,318][15401] Updated weights for policy 0, policy_version 652831 (0.0031) [2024-06-24 11:42:07,457][15401] Updated weights for policy 0, policy_version 652841 (0.0040) [2024-06-24 11:42:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10696179712. Throughput: 0: 42896.8. Samples: 10696257400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:08,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 11:42:11,699][15401] Updated weights for policy 0, policy_version 652851 (0.0039) [2024-06-24 11:42:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 10696376320. Throughput: 0: 42740.4. Samples: 10696510440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:13,395][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 11:42:15,160][15401] Updated weights for policy 0, policy_version 652861 (0.0032) [2024-06-24 11:42:15,963][15349] Signal inference workers to stop experience collection... (158400 times) [2024-06-24 11:42:16,001][15401] InferenceWorker_p0-w0: stopping experience collection (158400 times) [2024-06-24 11:42:16,024][15349] Signal inference workers to resume experience collection... (158400 times) [2024-06-24 11:42:16,024][15401] InferenceWorker_p0-w0: resuming experience collection (158400 times) [2024-06-24 11:42:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 10696589312. Throughput: 0: 42869.8. Samples: 10696771940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 11:42:19,294][15401] Updated weights for policy 0, policy_version 652871 (0.0040) [2024-06-24 11:42:22,861][15401] Updated weights for policy 0, policy_version 652881 (0.0032) [2024-06-24 11:42:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10696835072. Throughput: 0: 42980.9. Samples: 10696901820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 11:42:26,763][15401] Updated weights for policy 0, policy_version 652891 (0.0033) [2024-06-24 11:42:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10697015296. Throughput: 0: 42827.0. Samples: 10697156540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 11:42:30,465][15401] Updated weights for policy 0, policy_version 652901 (0.0034) [2024-06-24 11:42:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10697228288. Throughput: 0: 42685.3. Samples: 10697411400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 11:42:34,273][15401] Updated weights for policy 0, policy_version 652911 (0.0034) [2024-06-24 11:42:38,124][15401] Updated weights for policy 0, policy_version 652921 (0.0049) [2024-06-24 11:42:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10697457664. Throughput: 0: 42763.8. Samples: 10697538740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 11:42:41,897][15401] Updated weights for policy 0, policy_version 652931 (0.0032) [2024-06-24 11:42:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10697654272. Throughput: 0: 42586.7. Samples: 10697789620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 11:42:45,682][15401] Updated weights for policy 0, policy_version 652941 (0.0028) [2024-06-24 11:42:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10697883648. Throughput: 0: 42598.2. Samples: 10698044000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 11:42:49,771][15401] Updated weights for policy 0, policy_version 652951 (0.0038) [2024-06-24 11:42:53,309][15401] Updated weights for policy 0, policy_version 652961 (0.0033) [2024-06-24 11:42:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 10698113024. Throughput: 0: 42704.8. Samples: 10698179020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 11:42:57,310][15401] Updated weights for policy 0, policy_version 652971 (0.0029) [2024-06-24 11:42:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10698309632. Throughput: 0: 42844.9. Samples: 10698438460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:42:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 11:43:01,132][15401] Updated weights for policy 0, policy_version 652981 (0.0032) [2024-06-24 11:43:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 10698539008. Throughput: 0: 42809.8. Samples: 10698698380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:43:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 11:43:04,802][15401] Updated weights for policy 0, policy_version 652991 (0.0023) [2024-06-24 11:43:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10698752000. Throughput: 0: 42779.6. Samples: 10698826900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 11:43:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 11:43:08,641][15401] Updated weights for policy 0, policy_version 653001 (0.0039) [2024-06-24 11:43:12,361][15401] Updated weights for policy 0, policy_version 653011 (0.0030) [2024-06-24 11:43:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 10698964992. Throughput: 0: 42824.6. Samples: 10699083640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 11:43:16,428][15401] Updated weights for policy 0, policy_version 653021 (0.0042) [2024-06-24 11:43:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10699177984. Throughput: 0: 42679.0. Samples: 10699331960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:18,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 11:43:20,388][15401] Updated weights for policy 0, policy_version 653031 (0.0039) [2024-06-24 11:43:23,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10699374592. Throughput: 0: 42805.2. Samples: 10699464980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 11:43:24,270][15401] Updated weights for policy 0, policy_version 653041 (0.0033) [2024-06-24 11:43:27,942][15401] Updated weights for policy 0, policy_version 653051 (0.0024) [2024-06-24 11:43:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10699603968. Throughput: 0: 42840.9. Samples: 10699717460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 11:43:30,600][15349] Signal inference workers to stop experience collection... (158450 times) [2024-06-24 11:43:30,600][15349] Signal inference workers to resume experience collection... (158450 times) [2024-06-24 11:43:30,649][15401] InferenceWorker_p0-w0: stopping experience collection (158450 times) [2024-06-24 11:43:30,649][15401] InferenceWorker_p0-w0: resuming experience collection (158450 times) [2024-06-24 11:43:31,898][15401] Updated weights for policy 0, policy_version 653061 (0.0028) [2024-06-24 11:43:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10699816960. Throughput: 0: 43058.2. Samples: 10699981620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 11:43:35,747][15401] Updated weights for policy 0, policy_version 653071 (0.0038) [2024-06-24 11:43:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10700013568. Throughput: 0: 42770.3. Samples: 10700103680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 11:43:39,585][15401] Updated weights for policy 0, policy_version 653081 (0.0037) [2024-06-24 11:43:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10700226560. Throughput: 0: 42642.8. Samples: 10700357380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 11:43:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653091_10700242944.pth... [2024-06-24 11:43:43,506][15401] Updated weights for policy 0, policy_version 653091 (0.0030) [2024-06-24 11:43:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652466_10690002944.pth [2024-06-24 11:43:47,290][15401] Updated weights for policy 0, policy_version 653101 (0.0029) [2024-06-24 11:43:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10700439552. Throughput: 0: 42655.9. Samples: 10700617900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 11:43:50,998][15401] Updated weights for policy 0, policy_version 653111 (0.0036) [2024-06-24 11:43:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 10700652544. Throughput: 0: 42611.5. Samples: 10700744520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:53,392][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 11:43:54,920][15401] Updated weights for policy 0, policy_version 653121 (0.0039) [2024-06-24 11:43:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10700881920. Throughput: 0: 42517.7. Samples: 10700996940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:43:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 11:43:58,542][15401] Updated weights for policy 0, policy_version 653131 (0.0038) [2024-06-24 11:44:02,787][15401] Updated weights for policy 0, policy_version 653141 (0.0031) [2024-06-24 11:44:03,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10701078528. Throughput: 0: 42772.0. Samples: 10701256700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 11:44:06,037][15401] Updated weights for policy 0, policy_version 653151 (0.0024) [2024-06-24 11:44:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 10701291520. Throughput: 0: 42557.8. Samples: 10701380180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:08,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 11:44:10,335][15401] Updated weights for policy 0, policy_version 653161 (0.0027) [2024-06-24 11:44:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10701537280. Throughput: 0: 42762.7. Samples: 10701641780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 11:44:14,041][15401] Updated weights for policy 0, policy_version 653171 (0.0035) [2024-06-24 11:44:17,808][15401] Updated weights for policy 0, policy_version 653181 (0.0026) [2024-06-24 11:44:18,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10701733888. Throughput: 0: 42647.1. Samples: 10701900740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 11:44:21,561][15401] Updated weights for policy 0, policy_version 653191 (0.0037) [2024-06-24 11:44:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10701930496. Throughput: 0: 42632.5. Samples: 10702022140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 11:44:25,673][15401] Updated weights for policy 0, policy_version 653201 (0.0026) [2024-06-24 11:44:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10702159872. Throughput: 0: 42798.2. Samples: 10702283300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 11:44:29,104][15401] Updated weights for policy 0, policy_version 653211 (0.0033) [2024-06-24 11:44:33,315][15401] Updated weights for policy 0, policy_version 653221 (0.0032) [2024-06-24 11:44:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10702372864. Throughput: 0: 42655.6. Samples: 10702537400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 11:44:36,614][15401] Updated weights for policy 0, policy_version 653231 (0.0031) [2024-06-24 11:44:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10702585856. Throughput: 0: 42657.9. Samples: 10702664020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 11:44:40,975][15401] Updated weights for policy 0, policy_version 653241 (0.0028) [2024-06-24 11:44:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10702798848. Throughput: 0: 42798.6. Samples: 10702922880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 11:44:43,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 11:44:44,505][15401] Updated weights for policy 0, policy_version 653251 (0.0044) [2024-06-24 11:44:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10703011840. Throughput: 0: 42697.9. Samples: 10703178100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:44:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 11:44:48,642][15401] Updated weights for policy 0, policy_version 653261 (0.0031) [2024-06-24 11:44:52,733][15401] Updated weights for policy 0, policy_version 653271 (0.0040) [2024-06-24 11:44:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 10703224832. Throughput: 0: 42759.2. Samples: 10703304240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:44:53,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 11:44:56,173][15401] Updated weights for policy 0, policy_version 653281 (0.0031) [2024-06-24 11:44:56,356][15349] Signal inference workers to stop experience collection... (158500 times) [2024-06-24 11:44:56,400][15401] InferenceWorker_p0-w0: stopping experience collection (158500 times) [2024-06-24 11:44:56,409][15349] Signal inference workers to resume experience collection... (158500 times) [2024-06-24 11:44:56,415][15401] InferenceWorker_p0-w0: resuming experience collection (158500 times) [2024-06-24 11:44:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10703437824. Throughput: 0: 42725.0. Samples: 10703564400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:44:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 11:45:00,248][15401] Updated weights for policy 0, policy_version 653291 (0.0028) [2024-06-24 11:45:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10703650816. Throughput: 0: 42639.1. Samples: 10703819500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 11:45:03,801][15401] Updated weights for policy 0, policy_version 653301 (0.0043) [2024-06-24 11:45:07,691][15401] Updated weights for policy 0, policy_version 653311 (0.0025) [2024-06-24 11:45:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 10703863808. Throughput: 0: 42976.0. Samples: 10703956060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 11:45:11,530][15401] Updated weights for policy 0, policy_version 653321 (0.0029) [2024-06-24 11:45:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10704076800. Throughput: 0: 42692.3. Samples: 10704204460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:45:15,236][15401] Updated weights for policy 0, policy_version 653331 (0.0035) [2024-06-24 11:45:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10704289792. Throughput: 0: 42853.8. Samples: 10704465820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 11:45:19,102][15401] Updated weights for policy 0, policy_version 653341 (0.0033) [2024-06-24 11:45:22,809][15401] Updated weights for policy 0, policy_version 653351 (0.0043) [2024-06-24 11:45:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10704519168. Throughput: 0: 42979.0. Samples: 10704598080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:23,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 11:45:26,714][15401] Updated weights for policy 0, policy_version 653361 (0.0046) [2024-06-24 11:45:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10704715776. Throughput: 0: 42793.4. Samples: 10704848580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 11:45:30,807][15401] Updated weights for policy 0, policy_version 653371 (0.0034) [2024-06-24 11:45:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10704945152. Throughput: 0: 42821.7. Samples: 10705105080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:33,399][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 11:45:34,191][15401] Updated weights for policy 0, policy_version 653381 (0.0038) [2024-06-24 11:45:38,239][15401] Updated weights for policy 0, policy_version 653391 (0.0030) [2024-06-24 11:45:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10705158144. Throughput: 0: 42953.3. Samples: 10705237140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 11:45:42,417][15401] Updated weights for policy 0, policy_version 653401 (0.0048) [2024-06-24 11:45:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10705354752. Throughput: 0: 42905.2. Samples: 10705495140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 11:45:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653404_10705371136.pth... [2024-06-24 11:45:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000652778_10695114752.pth [2024-06-24 11:45:45,713][15401] Updated weights for policy 0, policy_version 653411 (0.0022) [2024-06-24 11:45:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10705600512. Throughput: 0: 42812.0. Samples: 10705746040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 11:45:50,103][15401] Updated weights for policy 0, policy_version 653421 (0.0034) [2024-06-24 11:45:53,234][15401] Updated weights for policy 0, policy_version 653431 (0.0023) [2024-06-24 11:45:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 10705813504. Throughput: 0: 42787.5. Samples: 10705881500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 11:45:57,714][15401] Updated weights for policy 0, policy_version 653441 (0.0029) [2024-06-24 11:45:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10705993728. Throughput: 0: 42951.1. Samples: 10706137260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:45:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:46:00,765][15401] Updated weights for policy 0, policy_version 653451 (0.0024) [2024-06-24 11:46:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10706223104. Throughput: 0: 42705.1. Samples: 10706387560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:46:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 11:46:05,398][15401] Updated weights for policy 0, policy_version 653461 (0.0043) [2024-06-24 11:46:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 10706452480. Throughput: 0: 42661.3. Samples: 10706517840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:46:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 11:46:08,485][15401] Updated weights for policy 0, policy_version 653471 (0.0027) [2024-06-24 11:46:12,993][15401] Updated weights for policy 0, policy_version 653481 (0.0037) [2024-06-24 11:46:13,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10706649088. Throughput: 0: 42696.0. Samples: 10706769900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:46:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 11:46:16,457][15401] Updated weights for policy 0, policy_version 653491 (0.0027) [2024-06-24 11:46:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10706862080. Throughput: 0: 42759.1. Samples: 10707029240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:46:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 11:46:20,719][15401] Updated weights for policy 0, policy_version 653501 (0.0033) [2024-06-24 11:46:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10707075072. Throughput: 0: 42728.9. Samples: 10707159940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 11:46:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 11:46:23,925][15401] Updated weights for policy 0, policy_version 653511 (0.0034) [2024-06-24 11:46:28,151][15401] Updated weights for policy 0, policy_version 653521 (0.0029) [2024-06-24 11:46:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10707304448. Throughput: 0: 42723.2. Samples: 10707417680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:28,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 11:46:28,764][15349] Signal inference workers to stop experience collection... (158550 times) [2024-06-24 11:46:28,765][15349] Signal inference workers to resume experience collection... (158550 times) [2024-06-24 11:46:28,792][15401] InferenceWorker_p0-w0: stopping experience collection (158550 times) [2024-06-24 11:46:28,792][15401] InferenceWorker_p0-w0: resuming experience collection (158550 times) [2024-06-24 11:46:32,082][15401] Updated weights for policy 0, policy_version 653531 (0.0024) [2024-06-24 11:46:33,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10707517440. Throughput: 0: 42875.4. Samples: 10707675540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:33,393][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 11:46:35,601][15401] Updated weights for policy 0, policy_version 653541 (0.0050) [2024-06-24 11:46:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10707714048. Throughput: 0: 42782.2. Samples: 10707806700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:38,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 11:46:39,811][15401] Updated weights for policy 0, policy_version 653551 (0.0024) [2024-06-24 11:46:43,272][15401] Updated weights for policy 0, policy_version 653561 (0.0035) [2024-06-24 11:46:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 10707943424. Throughput: 0: 42760.9. Samples: 10708061500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 11:46:47,159][15401] Updated weights for policy 0, policy_version 653571 (0.0035) [2024-06-24 11:46:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 10708140032. Throughput: 0: 43101.5. Samples: 10708327120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 11:46:50,781][15401] Updated weights for policy 0, policy_version 653581 (0.0029) [2024-06-24 11:46:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10708369408. Throughput: 0: 42893.6. Samples: 10708448060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:46:54,616][15401] Updated weights for policy 0, policy_version 653591 (0.0027) [2024-06-24 11:46:58,363][15401] Updated weights for policy 0, policy_version 653601 (0.0026) [2024-06-24 11:46:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10708598784. Throughput: 0: 43057.3. Samples: 10708707480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:46:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 11:47:02,337][15401] Updated weights for policy 0, policy_version 653611 (0.0027) [2024-06-24 11:47:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 10708779008. Throughput: 0: 43101.9. Samples: 10708968820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 11:47:05,996][15401] Updated weights for policy 0, policy_version 653621 (0.0029) [2024-06-24 11:47:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10709024768. Throughput: 0: 42995.5. Samples: 10709094740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:08,394][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 11:47:09,721][15401] Updated weights for policy 0, policy_version 653631 (0.0030) [2024-06-24 11:47:13,392][15132] Fps is (10 sec: 44225.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 10709221376. Throughput: 0: 43107.9. Samples: 10709357640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:13,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 11:47:13,742][15401] Updated weights for policy 0, policy_version 653641 (0.0022) [2024-06-24 11:47:17,228][15401] Updated weights for policy 0, policy_version 653651 (0.0044) [2024-06-24 11:47:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10709434368. Throughput: 0: 42989.4. Samples: 10709609960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 11:47:21,486][15401] Updated weights for policy 0, policy_version 653661 (0.0038) [2024-06-24 11:47:23,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 10709680128. Throughput: 0: 42979.9. Samples: 10709740800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:47:24,875][15401] Updated weights for policy 0, policy_version 653671 (0.0024) [2024-06-24 11:47:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10709860352. Throughput: 0: 43136.4. Samples: 10710002640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 11:47:28,891][15401] Updated weights for policy 0, policy_version 653681 (0.0034) [2024-06-24 11:47:32,355][15401] Updated weights for policy 0, policy_version 653691 (0.0032) [2024-06-24 11:47:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 10710089728. Throughput: 0: 42804.5. Samples: 10710253320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 11:47:37,115][15401] Updated weights for policy 0, policy_version 653701 (0.0038) [2024-06-24 11:47:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10710319104. Throughput: 0: 43032.0. Samples: 10710384500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:38,399][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 11:47:40,103][15401] Updated weights for policy 0, policy_version 653711 (0.0037) [2024-06-24 11:47:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10710515712. Throughput: 0: 43002.3. Samples: 10710642580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:43,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 11:47:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653719_10710532096.pth... [2024-06-24 11:47:43,586][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653091_10700242944.pth [2024-06-24 11:47:44,678][15401] Updated weights for policy 0, policy_version 653721 (0.0039) [2024-06-24 11:47:47,785][15401] Updated weights for policy 0, policy_version 653731 (0.0045) [2024-06-24 11:47:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10710745088. Throughput: 0: 42761.3. Samples: 10710893080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 11:47:52,191][15401] Updated weights for policy 0, policy_version 653741 (0.0040) [2024-06-24 11:47:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10710941696. Throughput: 0: 42849.0. Samples: 10711022940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:53,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 11:47:55,529][15401] Updated weights for policy 0, policy_version 653751 (0.0046) [2024-06-24 11:47:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10711138304. Throughput: 0: 42690.9. Samples: 10711278620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 11:47:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 11:47:59,759][15401] Updated weights for policy 0, policy_version 653761 (0.0043) [2024-06-24 11:48:03,065][15401] Updated weights for policy 0, policy_version 653771 (0.0039) [2024-06-24 11:48:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 10711384064. Throughput: 0: 42607.6. Samples: 10711527300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 11:48:07,839][15401] Updated weights for policy 0, policy_version 653781 (0.0031) [2024-06-24 11:48:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10711564288. Throughput: 0: 42826.4. Samples: 10711667980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 11:48:09,677][15349] Signal inference workers to stop experience collection... (158600 times) [2024-06-24 11:48:09,678][15349] Signal inference workers to resume experience collection... (158600 times) [2024-06-24 11:48:09,717][15401] InferenceWorker_p0-w0: stopping experience collection (158600 times) [2024-06-24 11:48:09,717][15401] InferenceWorker_p0-w0: resuming experience collection (158600 times) [2024-06-24 11:48:10,663][15401] Updated weights for policy 0, policy_version 653791 (0.0041) [2024-06-24 11:48:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 10711793664. Throughput: 0: 42612.1. Samples: 10711920180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 11:48:15,411][15401] Updated weights for policy 0, policy_version 653801 (0.0051) [2024-06-24 11:48:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10712023040. Throughput: 0: 42605.4. Samples: 10712170560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:18,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 11:48:18,488][15401] Updated weights for policy 0, policy_version 653811 (0.0036) [2024-06-24 11:48:23,132][15401] Updated weights for policy 0, policy_version 653821 (0.0037) [2024-06-24 11:48:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 10712203264. Throughput: 0: 42696.6. Samples: 10712305840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 11:48:26,247][15401] Updated weights for policy 0, policy_version 653831 (0.0035) [2024-06-24 11:48:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10712449024. Throughput: 0: 42614.2. Samples: 10712560220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 11:48:30,829][15401] Updated weights for policy 0, policy_version 653841 (0.0038) [2024-06-24 11:48:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10712662016. Throughput: 0: 42648.3. Samples: 10712812260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 11:48:34,463][15401] Updated weights for policy 0, policy_version 653851 (0.0031) [2024-06-24 11:48:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 10712842240. Throughput: 0: 42588.4. Samples: 10712939420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 11:48:38,495][15401] Updated weights for policy 0, policy_version 653861 (0.0042) [2024-06-24 11:48:41,978][15401] Updated weights for policy 0, policy_version 653871 (0.0032) [2024-06-24 11:48:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10713088000. Throughput: 0: 42613.7. Samples: 10713196240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:43,396][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 11:48:46,093][15401] Updated weights for policy 0, policy_version 653881 (0.0037) [2024-06-24 11:48:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 10713300992. Throughput: 0: 42549.4. Samples: 10713442020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 11:48:49,911][15401] Updated weights for policy 0, policy_version 653891 (0.0032) [2024-06-24 11:48:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10713481216. Throughput: 0: 42311.1. Samples: 10713571980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 11:48:53,882][15401] Updated weights for policy 0, policy_version 653901 (0.0034) [2024-06-24 11:48:57,389][15401] Updated weights for policy 0, policy_version 653911 (0.0038) [2024-06-24 11:48:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10713710592. Throughput: 0: 42546.3. Samples: 10713834760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:48:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 11:49:01,552][15401] Updated weights for policy 0, policy_version 653921 (0.0034) [2024-06-24 11:49:03,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 10713956352. Throughput: 0: 42559.0. Samples: 10714085720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 11:49:04,903][15401] Updated weights for policy 0, policy_version 653931 (0.0039) [2024-06-24 11:49:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10714136576. Throughput: 0: 42521.3. Samples: 10714219300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 11:49:09,068][15401] Updated weights for policy 0, policy_version 653941 (0.0035) [2024-06-24 11:49:13,057][15401] Updated weights for policy 0, policy_version 653951 (0.0039) [2024-06-24 11:49:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10714349568. Throughput: 0: 42659.6. Samples: 10714479900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 11:49:16,827][15401] Updated weights for policy 0, policy_version 653961 (0.0043) [2024-06-24 11:49:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10714578944. Throughput: 0: 42663.2. Samples: 10714732100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:18,395][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 11:49:20,614][15401] Updated weights for policy 0, policy_version 653971 (0.0026) [2024-06-24 11:49:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10714791936. Throughput: 0: 42640.0. Samples: 10714858220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 11:49:24,357][15401] Updated weights for policy 0, policy_version 653981 (0.0043) [2024-06-24 11:49:28,122][15401] Updated weights for policy 0, policy_version 653991 (0.0040) [2024-06-24 11:49:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10714988544. Throughput: 0: 42733.4. Samples: 10715119240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 11:49:31,805][15401] Updated weights for policy 0, policy_version 654001 (0.0030) [2024-06-24 11:49:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 10715217920. Throughput: 0: 42943.1. Samples: 10715374460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 11:49:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 11:49:34,557][15349] Signal inference workers to stop experience collection... (158650 times) [2024-06-24 11:49:34,557][15349] Signal inference workers to resume experience collection... (158650 times) [2024-06-24 11:49:34,605][15401] InferenceWorker_p0-w0: stopping experience collection (158650 times) [2024-06-24 11:49:34,606][15401] InferenceWorker_p0-w0: resuming experience collection (158650 times) [2024-06-24 11:49:35,940][15401] Updated weights for policy 0, policy_version 654011 (0.0037) [2024-06-24 11:49:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10715430912. Throughput: 0: 42935.9. Samples: 10715504100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:49:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 11:49:39,324][15401] Updated weights for policy 0, policy_version 654021 (0.0038) [2024-06-24 11:49:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10715627520. Throughput: 0: 42677.3. Samples: 10715755240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:49:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 11:49:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654030_10715627520.pth... [2024-06-24 11:49:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653404_10705371136.pth [2024-06-24 11:49:43,639][15401] Updated weights for policy 0, policy_version 654031 (0.0034) [2024-06-24 11:49:47,167][15401] Updated weights for policy 0, policy_version 654041 (0.0032) [2024-06-24 11:49:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10715856896. Throughput: 0: 42720.4. Samples: 10716008140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:49:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 11:49:51,167][15401] Updated weights for policy 0, policy_version 654051 (0.0026) [2024-06-24 11:49:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10716053504. Throughput: 0: 42755.2. Samples: 10716143280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:49:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 11:49:54,725][15401] Updated weights for policy 0, policy_version 654061 (0.0023) [2024-06-24 11:49:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10716282880. Throughput: 0: 42631.9. Samples: 10716398340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:49:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 11:49:58,770][15401] Updated weights for policy 0, policy_version 654071 (0.0037) [2024-06-24 11:50:02,375][15401] Updated weights for policy 0, policy_version 654081 (0.0042) [2024-06-24 11:50:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10716495872. Throughput: 0: 42639.7. Samples: 10716650880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 11:50:06,426][15401] Updated weights for policy 0, policy_version 654091 (0.0027) [2024-06-24 11:50:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10716708864. Throughput: 0: 42829.8. Samples: 10716785560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 11:50:10,018][15401] Updated weights for policy 0, policy_version 654101 (0.0030) [2024-06-24 11:50:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10716921856. Throughput: 0: 42641.7. Samples: 10717038120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 11:50:14,713][15401] Updated weights for policy 0, policy_version 654111 (0.0037) [2024-06-24 11:50:17,740][15401] Updated weights for policy 0, policy_version 654121 (0.0029) [2024-06-24 11:50:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10717151232. Throughput: 0: 42571.1. Samples: 10717290160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 11:50:22,553][15401] Updated weights for policy 0, policy_version 654131 (0.0043) [2024-06-24 11:50:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10717347840. Throughput: 0: 42704.8. Samples: 10717425820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 11:50:25,576][15401] Updated weights for policy 0, policy_version 654141 (0.0035) [2024-06-24 11:50:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10717560832. Throughput: 0: 42605.7. Samples: 10717672500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 11:50:30,113][15401] Updated weights for policy 0, policy_version 654151 (0.0039) [2024-06-24 11:50:33,224][15401] Updated weights for policy 0, policy_version 654161 (0.0036) [2024-06-24 11:50:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10717773824. Throughput: 0: 42842.7. Samples: 10717936060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 11:50:37,650][15401] Updated weights for policy 0, policy_version 654171 (0.0037) [2024-06-24 11:50:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10717970432. Throughput: 0: 42680.8. Samples: 10718063920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:38,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 11:50:40,720][15401] Updated weights for policy 0, policy_version 654181 (0.0035) [2024-06-24 11:50:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10718199808. Throughput: 0: 42686.4. Samples: 10718319220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 11:50:45,206][15401] Updated weights for policy 0, policy_version 654191 (0.0031) [2024-06-24 11:50:46,032][15349] Signal inference workers to stop experience collection... (158700 times) [2024-06-24 11:50:46,033][15349] Signal inference workers to resume experience collection... (158700 times) [2024-06-24 11:50:46,051][15401] InferenceWorker_p0-w0: stopping experience collection (158700 times) [2024-06-24 11:50:46,051][15401] InferenceWorker_p0-w0: resuming experience collection (158700 times) [2024-06-24 11:50:48,248][15401] Updated weights for policy 0, policy_version 654201 (0.0031) [2024-06-24 11:50:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10718429184. Throughput: 0: 42903.4. Samples: 10718581540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:48,396][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 11:50:52,853][15401] Updated weights for policy 0, policy_version 654211 (0.0025) [2024-06-24 11:50:53,393][15132] Fps is (10 sec: 40945.0, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 10718609408. Throughput: 0: 42762.3. Samples: 10718710020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:53,394][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 11:50:55,866][15401] Updated weights for policy 0, policy_version 654221 (0.0029) [2024-06-24 11:50:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10718855168. Throughput: 0: 42774.6. Samples: 10718962980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:50:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 11:51:00,481][15401] Updated weights for policy 0, policy_version 654231 (0.0031) [2024-06-24 11:51:03,382][15401] Updated weights for policy 0, policy_version 654241 (0.0030) [2024-06-24 11:51:03,389][15132] Fps is (10 sec: 47530.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10719084544. Throughput: 0: 43060.0. Samples: 10719227860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:51:03,396][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 11:51:08,227][15401] Updated weights for policy 0, policy_version 654251 (0.0056) [2024-06-24 11:51:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10719248384. Throughput: 0: 42786.8. Samples: 10719351220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:51:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 11:51:11,359][15401] Updated weights for policy 0, policy_version 654261 (0.0025) [2024-06-24 11:51:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10719510528. Throughput: 0: 42987.1. Samples: 10719606920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-24 11:51:13,399][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 11:51:15,704][15401] Updated weights for policy 0, policy_version 654271 (0.0033) [2024-06-24 11:51:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10719690752. Throughput: 0: 42924.1. Samples: 10719867640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 11:51:18,838][15401] Updated weights for policy 0, policy_version 654281 (0.0030) [2024-06-24 11:51:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10719887360. Throughput: 0: 42813.3. Samples: 10719990520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 11:51:23,436][15401] Updated weights for policy 0, policy_version 654291 (0.0038) [2024-06-24 11:51:26,436][15401] Updated weights for policy 0, policy_version 654301 (0.0028) [2024-06-24 11:51:28,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42866.9, 300 sec: 42764.4). Total num frames: 10720133120. Throughput: 0: 42709.0. Samples: 10720241400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:28,396][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 11:51:31,220][15401] Updated weights for policy 0, policy_version 654311 (0.0033) [2024-06-24 11:51:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10720346112. Throughput: 0: 42744.8. Samples: 10720505060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 11:51:34,297][15401] Updated weights for policy 0, policy_version 654321 (0.0028) [2024-06-24 11:51:38,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10720526336. Throughput: 0: 42552.4. Samples: 10720624720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 11:51:38,918][15401] Updated weights for policy 0, policy_version 654331 (0.0035) [2024-06-24 11:51:42,062][15401] Updated weights for policy 0, policy_version 654341 (0.0039) [2024-06-24 11:51:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10720788480. Throughput: 0: 42642.7. Samples: 10720881900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 11:51:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654345_10720788480.pth... [2024-06-24 11:51:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000653719_10710532096.pth [2024-06-24 11:51:46,410][15401] Updated weights for policy 0, policy_version 654351 (0.0024) [2024-06-24 11:51:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10720985088. Throughput: 0: 42555.5. Samples: 10721142860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 11:51:49,669][15401] Updated weights for policy 0, policy_version 654361 (0.0030) [2024-06-24 11:51:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42874.0, 300 sec: 42653.9). Total num frames: 10721181696. Throughput: 0: 42617.7. Samples: 10721269020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 11:51:54,292][15401] Updated weights for policy 0, policy_version 654371 (0.0043) [2024-06-24 11:51:57,470][15401] Updated weights for policy 0, policy_version 654381 (0.0029) [2024-06-24 11:51:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 10721411072. Throughput: 0: 42636.0. Samples: 10721525640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:51:58,393][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 11:52:02,006][15401] Updated weights for policy 0, policy_version 654391 (0.0035) [2024-06-24 11:52:03,390][15132] Fps is (10 sec: 45872.9, 60 sec: 42598.0, 300 sec: 42764.9). Total num frames: 10721640448. Throughput: 0: 42539.4. Samples: 10721781940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:03,391][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 11:52:05,145][15401] Updated weights for policy 0, policy_version 654401 (0.0042) [2024-06-24 11:52:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10721820672. Throughput: 0: 42602.3. Samples: 10721907620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 11:52:09,444][15401] Updated weights for policy 0, policy_version 654411 (0.0037) [2024-06-24 11:52:13,060][15401] Updated weights for policy 0, policy_version 654421 (0.0034) [2024-06-24 11:52:13,390][15132] Fps is (10 sec: 40961.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10722050048. Throughput: 0: 42797.1. Samples: 10722167000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 11:52:16,888][15401] Updated weights for policy 0, policy_version 654431 (0.0039) [2024-06-24 11:52:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10722263040. Throughput: 0: 42612.1. Samples: 10722422600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:18,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 11:52:20,541][15401] Updated weights for policy 0, policy_version 654441 (0.0032) [2024-06-24 11:52:21,935][15349] Signal inference workers to stop experience collection... (158750 times) [2024-06-24 11:52:21,938][15349] Signal inference workers to resume experience collection... (158750 times) [2024-06-24 11:52:21,971][15401] InferenceWorker_p0-w0: stopping experience collection (158750 times) [2024-06-24 11:52:21,971][15401] InferenceWorker_p0-w0: resuming experience collection (158750 times) [2024-06-24 11:52:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 10722476032. Throughput: 0: 42806.9. Samples: 10722551140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:23,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 11:52:24,602][15401] Updated weights for policy 0, policy_version 654451 (0.0035) [2024-06-24 11:52:28,115][15401] Updated weights for policy 0, policy_version 654461 (0.0031) [2024-06-24 11:52:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 10722689024. Throughput: 0: 42731.7. Samples: 10722804820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 11:52:32,074][15401] Updated weights for policy 0, policy_version 654471 (0.0033) [2024-06-24 11:52:33,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10722902016. Throughput: 0: 42677.9. Samples: 10723063360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 11:52:36,110][15401] Updated weights for policy 0, policy_version 654481 (0.0030) [2024-06-24 11:52:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10723115008. Throughput: 0: 42725.4. Samples: 10723191660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 11:52:40,075][15401] Updated weights for policy 0, policy_version 654491 (0.0033) [2024-06-24 11:52:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10723328000. Throughput: 0: 42693.8. Samples: 10723446760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 11:52:43,769][15401] Updated weights for policy 0, policy_version 654501 (0.0042) [2024-06-24 11:52:47,761][15401] Updated weights for policy 0, policy_version 654511 (0.0034) [2024-06-24 11:52:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10723540992. Throughput: 0: 42739.6. Samples: 10723705200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 11:52:48,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 11:52:51,538][15401] Updated weights for policy 0, policy_version 654521 (0.0036) [2024-06-24 11:52:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10723753984. Throughput: 0: 42684.3. Samples: 10723828420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:52:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 11:52:55,238][15401] Updated weights for policy 0, policy_version 654531 (0.0028) [2024-06-24 11:52:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 10723983360. Throughput: 0: 42629.5. Samples: 10724085320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:52:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 11:52:58,971][15401] Updated weights for policy 0, policy_version 654541 (0.0034) [2024-06-24 11:53:02,835][15401] Updated weights for policy 0, policy_version 654551 (0.0034) [2024-06-24 11:53:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.8, 300 sec: 42820.6). Total num frames: 10724196352. Throughput: 0: 42734.7. Samples: 10724345660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 11:53:06,378][15401] Updated weights for policy 0, policy_version 654561 (0.0037) [2024-06-24 11:53:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10724392960. Throughput: 0: 42685.8. Samples: 10724471900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 11:53:10,245][15401] Updated weights for policy 0, policy_version 654571 (0.0023) [2024-06-24 11:53:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 10724622336. Throughput: 0: 42877.6. Samples: 10724734320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:13,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 11:53:14,244][15401] Updated weights for policy 0, policy_version 654581 (0.0037) [2024-06-24 11:53:17,789][15401] Updated weights for policy 0, policy_version 654591 (0.0047) [2024-06-24 11:53:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10724835328. Throughput: 0: 42848.0. Samples: 10724991520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 11:53:21,813][15401] Updated weights for policy 0, policy_version 654601 (0.0025) [2024-06-24 11:53:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 10725031936. Throughput: 0: 42772.0. Samples: 10725116400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 11:53:25,694][15401] Updated weights for policy 0, policy_version 654611 (0.0032) [2024-06-24 11:53:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10725261312. Throughput: 0: 42877.8. Samples: 10725376260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 11:53:29,687][15401] Updated weights for policy 0, policy_version 654621 (0.0049) [2024-06-24 11:53:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10725457920. Throughput: 0: 42815.2. Samples: 10725631880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 11:53:33,510][15401] Updated weights for policy 0, policy_version 654631 (0.0036) [2024-06-24 11:53:37,252][15401] Updated weights for policy 0, policy_version 654641 (0.0034) [2024-06-24 11:53:38,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 10725687296. Throughput: 0: 42897.6. Samples: 10725759080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:38,396][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 11:53:40,943][15401] Updated weights for policy 0, policy_version 654651 (0.0035) [2024-06-24 11:53:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10725900288. Throughput: 0: 43087.1. Samples: 10726024240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 11:53:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654658_10725916672.pth... [2024-06-24 11:53:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654030_10715627520.pth [2024-06-24 11:53:44,658][15401] Updated weights for policy 0, policy_version 654661 (0.0036) [2024-06-24 11:53:48,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10726113280. Throughput: 0: 42913.8. Samples: 10726276780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 11:53:48,557][15401] Updated weights for policy 0, policy_version 654671 (0.0036) [2024-06-24 11:53:52,492][15401] Updated weights for policy 0, policy_version 654681 (0.0042) [2024-06-24 11:53:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10726326272. Throughput: 0: 42955.9. Samples: 10726404920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 11:53:56,167][15401] Updated weights for policy 0, policy_version 654691 (0.0024) [2024-06-24 11:53:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10726539264. Throughput: 0: 42802.7. Samples: 10726660440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:53:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 11:54:00,248][15401] Updated weights for policy 0, policy_version 654701 (0.0047) [2024-06-24 11:54:02,530][15349] Signal inference workers to stop experience collection... (158800 times) [2024-06-24 11:54:02,581][15401] InferenceWorker_p0-w0: stopping experience collection (158800 times) [2024-06-24 11:54:02,643][15349] Signal inference workers to resume experience collection... (158800 times) [2024-06-24 11:54:02,643][15401] InferenceWorker_p0-w0: resuming experience collection (158800 times) [2024-06-24 11:54:03,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10726752256. Throughput: 0: 42768.1. Samples: 10726916080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 11:54:03,913][15401] Updated weights for policy 0, policy_version 654711 (0.0032) [2024-06-24 11:54:07,908][15401] Updated weights for policy 0, policy_version 654721 (0.0026) [2024-06-24 11:54:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10726965248. Throughput: 0: 42926.5. Samples: 10727048100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:08,391][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 11:54:11,739][15401] Updated weights for policy 0, policy_version 654731 (0.0028) [2024-06-24 11:54:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10727194624. Throughput: 0: 42808.0. Samples: 10727302620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 11:54:15,519][15401] Updated weights for policy 0, policy_version 654741 (0.0032) [2024-06-24 11:54:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10727391232. Throughput: 0: 42752.9. Samples: 10727555760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 11:54:19,418][15401] Updated weights for policy 0, policy_version 654751 (0.0032) [2024-06-24 11:54:23,109][15401] Updated weights for policy 0, policy_version 654761 (0.0041) [2024-06-24 11:54:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10727604224. Throughput: 0: 42785.2. Samples: 10727684140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:23,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 11:54:26,920][15401] Updated weights for policy 0, policy_version 654771 (0.0029) [2024-06-24 11:54:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10727833600. Throughput: 0: 42715.6. Samples: 10727946440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 11:54:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 11:54:30,549][15401] Updated weights for policy 0, policy_version 654781 (0.0047) [2024-06-24 11:54:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10728046592. Throughput: 0: 42815.0. Samples: 10728203460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:33,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 11:54:34,599][15401] Updated weights for policy 0, policy_version 654791 (0.0026) [2024-06-24 11:54:38,050][15401] Updated weights for policy 0, policy_version 654801 (0.0031) [2024-06-24 11:54:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.0, 300 sec: 42820.6). Total num frames: 10728259584. Throughput: 0: 42799.7. Samples: 10728330900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 11:54:42,337][15401] Updated weights for policy 0, policy_version 654811 (0.0045) [2024-06-24 11:54:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10728488960. Throughput: 0: 42817.8. Samples: 10728587240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:43,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 11:54:46,032][15401] Updated weights for policy 0, policy_version 654821 (0.0032) [2024-06-24 11:54:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10728685568. Throughput: 0: 42830.6. Samples: 10728843460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 11:54:49,865][15401] Updated weights for policy 0, policy_version 654831 (0.0024) [2024-06-24 11:54:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10728898560. Throughput: 0: 42723.1. Samples: 10728970640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 11:54:53,527][15401] Updated weights for policy 0, policy_version 654841 (0.0031) [2024-06-24 11:54:57,419][15401] Updated weights for policy 0, policy_version 654851 (0.0030) [2024-06-24 11:54:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10729111552. Throughput: 0: 43001.3. Samples: 10729237680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:54:58,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 11:55:01,311][15401] Updated weights for policy 0, policy_version 654861 (0.0031) [2024-06-24 11:55:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10729308160. Throughput: 0: 43018.6. Samples: 10729491600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 11:55:04,870][15401] Updated weights for policy 0, policy_version 654871 (0.0022) [2024-06-24 11:55:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10729537536. Throughput: 0: 42937.7. Samples: 10729616340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:08,399][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 11:55:09,099][15401] Updated weights for policy 0, policy_version 654881 (0.0039) [2024-06-24 11:55:12,467][15401] Updated weights for policy 0, policy_version 654891 (0.0029) [2024-06-24 11:55:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10729750528. Throughput: 0: 42836.9. Samples: 10729874100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 11:55:17,023][15401] Updated weights for policy 0, policy_version 654901 (0.0031) [2024-06-24 11:55:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10729963520. Throughput: 0: 42579.5. Samples: 10730119540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 11:55:20,740][15401] Updated weights for policy 0, policy_version 654911 (0.0037) [2024-06-24 11:55:23,390][15132] Fps is (10 sec: 42596.8, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 10730176512. Throughput: 0: 42742.3. Samples: 10730254320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 11:55:24,566][15401] Updated weights for policy 0, policy_version 654921 (0.0028) [2024-06-24 11:55:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10730373120. Throughput: 0: 42596.0. Samples: 10730504060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:28,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 11:55:28,582][15401] Updated weights for policy 0, policy_version 654931 (0.0024) [2024-06-24 11:55:31,981][15401] Updated weights for policy 0, policy_version 654941 (0.0033) [2024-06-24 11:55:33,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10730586112. Throughput: 0: 42709.3. Samples: 10730765380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 11:55:35,923][15401] Updated weights for policy 0, policy_version 654951 (0.0040) [2024-06-24 11:55:36,466][15349] Signal inference workers to stop experience collection... (158850 times) [2024-06-24 11:55:36,466][15349] Signal inference workers to resume experience collection... (158850 times) [2024-06-24 11:55:36,479][15401] InferenceWorker_p0-w0: stopping experience collection (158850 times) [2024-06-24 11:55:36,479][15401] InferenceWorker_p0-w0: resuming experience collection (158850 times) [2024-06-24 11:55:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10730815488. Throughput: 0: 42736.9. Samples: 10730893800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 11:55:39,997][15401] Updated weights for policy 0, policy_version 654961 (0.0040) [2024-06-24 11:55:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10731028480. Throughput: 0: 42477.2. Samples: 10731149160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 11:55:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654971_10731044864.pth... [2024-06-24 11:55:43,433][15401] Updated weights for policy 0, policy_version 654971 (0.0037) [2024-06-24 11:55:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654345_10720788480.pth [2024-06-24 11:55:47,460][15401] Updated weights for policy 0, policy_version 654981 (0.0033) [2024-06-24 11:55:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42821.1). Total num frames: 10731241472. Throughput: 0: 42643.5. Samples: 10731410560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 11:55:51,223][15401] Updated weights for policy 0, policy_version 654991 (0.0023) [2024-06-24 11:55:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10731454464. Throughput: 0: 42759.7. Samples: 10731540520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 11:55:55,090][15401] Updated weights for policy 0, policy_version 655001 (0.0043) [2024-06-24 11:55:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10731683840. Throughput: 0: 42549.2. Samples: 10731788820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:55:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 11:55:58,694][15401] Updated weights for policy 0, policy_version 655011 (0.0038) [2024-06-24 11:56:02,780][15401] Updated weights for policy 0, policy_version 655021 (0.0038) [2024-06-24 11:56:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10731864064. Throughput: 0: 42938.7. Samples: 10732051780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 11:56:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 11:56:06,142][15401] Updated weights for policy 0, policy_version 655031 (0.0034) [2024-06-24 11:56:08,395][15132] Fps is (10 sec: 40939.4, 60 sec: 42594.8, 300 sec: 42653.2). Total num frames: 10732093440. Throughput: 0: 42608.8. Samples: 10732171920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:08,395][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 11:56:10,799][15401] Updated weights for policy 0, policy_version 655041 (0.0032) [2024-06-24 11:56:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10732306432. Throughput: 0: 42775.6. Samples: 10732428960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 11:56:13,928][15401] Updated weights for policy 0, policy_version 655051 (0.0037) [2024-06-24 11:56:18,389][15132] Fps is (10 sec: 40980.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10732503040. Throughput: 0: 42673.7. Samples: 10732685700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:18,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 11:56:18,518][15401] Updated weights for policy 0, policy_version 655061 (0.0030) [2024-06-24 11:56:21,487][15401] Updated weights for policy 0, policy_version 655071 (0.0041) [2024-06-24 11:56:23,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 10732732416. Throughput: 0: 42642.9. Samples: 10732812740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:23,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-24 11:56:26,385][15401] Updated weights for policy 0, policy_version 655081 (0.0034) [2024-06-24 11:56:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10732961792. Throughput: 0: 42565.0. Samples: 10733064580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:28,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 11:56:29,211][15401] Updated weights for policy 0, policy_version 655091 (0.0035) [2024-06-24 11:56:33,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10733142016. Throughput: 0: 42502.3. Samples: 10733323160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 11:56:33,814][15401] Updated weights for policy 0, policy_version 655101 (0.0027) [2024-06-24 11:56:36,813][15401] Updated weights for policy 0, policy_version 655111 (0.0028) [2024-06-24 11:56:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10733387776. Throughput: 0: 42459.5. Samples: 10733451200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 11:56:41,332][15401] Updated weights for policy 0, policy_version 655121 (0.0027) [2024-06-24 11:56:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10733584384. Throughput: 0: 42667.5. Samples: 10733708860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 11:56:44,544][15401] Updated weights for policy 0, policy_version 655131 (0.0043) [2024-06-24 11:56:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10733780992. Throughput: 0: 42624.0. Samples: 10733969860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 11:56:49,076][15401] Updated weights for policy 0, policy_version 655141 (0.0029) [2024-06-24 11:56:52,298][15401] Updated weights for policy 0, policy_version 655151 (0.0039) [2024-06-24 11:56:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 10734026752. Throughput: 0: 42610.9. Samples: 10734089200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 11:56:56,946][15401] Updated weights for policy 0, policy_version 655161 (0.0042) [2024-06-24 11:56:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42709.6). Total num frames: 10734239744. Throughput: 0: 42618.2. Samples: 10734346780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:56:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 11:57:00,222][15401] Updated weights for policy 0, policy_version 655171 (0.0024) [2024-06-24 11:57:03,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10734419968. Throughput: 0: 42667.5. Samples: 10734605740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 11:57:04,497][15401] Updated weights for policy 0, policy_version 655181 (0.0037) [2024-06-24 11:57:07,796][15401] Updated weights for policy 0, policy_version 655191 (0.0034) [2024-06-24 11:57:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.0, 300 sec: 42709.5). Total num frames: 10734649344. Throughput: 0: 42538.0. Samples: 10734726940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 11:57:11,803][15349] Signal inference workers to stop experience collection... (158900 times) [2024-06-24 11:57:11,827][15401] InferenceWorker_p0-w0: stopping experience collection (158900 times) [2024-06-24 11:57:11,865][15349] Signal inference workers to resume experience collection... (158900 times) [2024-06-24 11:57:11,866][15401] InferenceWorker_p0-w0: resuming experience collection (158900 times) [2024-06-24 11:57:12,349][15401] Updated weights for policy 0, policy_version 655201 (0.0029) [2024-06-24 11:57:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10734878720. Throughput: 0: 42636.9. Samples: 10734983240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 11:57:15,397][15401] Updated weights for policy 0, policy_version 655211 (0.0035) [2024-06-24 11:57:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 10735042560. Throughput: 0: 42553.3. Samples: 10735238060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 11:57:19,918][15401] Updated weights for policy 0, policy_version 655221 (0.0039) [2024-06-24 11:57:23,297][15401] Updated weights for policy 0, policy_version 655231 (0.0027) [2024-06-24 11:57:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 10735304704. Throughput: 0: 42483.6. Samples: 10735362960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 11:57:27,482][15401] Updated weights for policy 0, policy_version 655241 (0.0030) [2024-06-24 11:57:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10735501312. Throughput: 0: 42626.6. Samples: 10735627060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 11:57:31,135][15401] Updated weights for policy 0, policy_version 655251 (0.0022) [2024-06-24 11:57:33,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 10735681536. Throughput: 0: 42617.8. Samples: 10735887660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 11:57:35,078][15401] Updated weights for policy 0, policy_version 655261 (0.0044) [2024-06-24 11:57:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10735943680. Throughput: 0: 42513.5. Samples: 10736002300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 11:57:38,936][15401] Updated weights for policy 0, policy_version 655271 (0.0030) [2024-06-24 11:57:42,685][15401] Updated weights for policy 0, policy_version 655281 (0.0039) [2024-06-24 11:57:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10736140288. Throughput: 0: 42611.6. Samples: 10736264300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 11:57:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 11:57:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655282_10736140288.pth... [2024-06-24 11:57:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654658_10725916672.pth [2024-06-24 11:57:46,426][15401] Updated weights for policy 0, policy_version 655291 (0.0045) [2024-06-24 11:57:48,390][15132] Fps is (10 sec: 39319.3, 60 sec: 42598.0, 300 sec: 42653.9). Total num frames: 10736336896. Throughput: 0: 42497.2. Samples: 10736518140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:57:48,391][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:57:50,574][15401] Updated weights for policy 0, policy_version 655301 (0.0038) [2024-06-24 11:57:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10736582656. Throughput: 0: 42521.3. Samples: 10736640400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:57:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 11:57:53,926][15401] Updated weights for policy 0, policy_version 655311 (0.0040) [2024-06-24 11:57:58,390][15132] Fps is (10 sec: 42600.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10736762880. Throughput: 0: 42691.5. Samples: 10736904360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:57:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 11:57:58,398][15401] Updated weights for policy 0, policy_version 655321 (0.0031) [2024-06-24 11:58:01,401][15401] Updated weights for policy 0, policy_version 655331 (0.0039) [2024-06-24 11:58:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10736975872. Throughput: 0: 42715.8. Samples: 10737160280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 11:58:06,057][15401] Updated weights for policy 0, policy_version 655341 (0.0032) [2024-06-24 11:58:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10737238016. Throughput: 0: 42725.3. Samples: 10737285600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 11:58:09,541][15401] Updated weights for policy 0, policy_version 655351 (0.0044) [2024-06-24 11:58:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10737418240. Throughput: 0: 42652.9. Samples: 10737546440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 11:58:13,514][15401] Updated weights for policy 0, policy_version 655361 (0.0037) [2024-06-24 11:58:17,186][15401] Updated weights for policy 0, policy_version 655371 (0.0031) [2024-06-24 11:58:18,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10737598464. Throughput: 0: 42516.5. Samples: 10737800900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 11:58:21,509][15401] Updated weights for policy 0, policy_version 655381 (0.0030) [2024-06-24 11:58:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10737876992. Throughput: 0: 42686.3. Samples: 10737923180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 11:58:24,947][15401] Updated weights for policy 0, policy_version 655391 (0.0033) [2024-06-24 11:58:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10738040832. Throughput: 0: 42694.2. Samples: 10738185540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 11:58:29,106][15401] Updated weights for policy 0, policy_version 655401 (0.0033) [2024-06-24 11:58:30,064][15349] Signal inference workers to stop experience collection... (158950 times) [2024-06-24 11:58:30,065][15349] Signal inference workers to resume experience collection... (158950 times) [2024-06-24 11:58:30,112][15401] InferenceWorker_p0-w0: stopping experience collection (158950 times) [2024-06-24 11:58:30,112][15401] InferenceWorker_p0-w0: resuming experience collection (158950 times) [2024-06-24 11:58:32,330][15401] Updated weights for policy 0, policy_version 655411 (0.0030) [2024-06-24 11:58:33,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 10738253824. Throughput: 0: 42721.5. Samples: 10738440580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 11:58:36,870][15401] Updated weights for policy 0, policy_version 655421 (0.0036) [2024-06-24 11:58:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10738483200. Throughput: 0: 42873.8. Samples: 10738569720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 11:58:39,935][15401] Updated weights for policy 0, policy_version 655431 (0.0041) [2024-06-24 11:58:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10738663424. Throughput: 0: 42583.2. Samples: 10738820600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 11:58:44,677][15401] Updated weights for policy 0, policy_version 655441 (0.0045) [2024-06-24 11:58:47,827][15401] Updated weights for policy 0, policy_version 655451 (0.0034) [2024-06-24 11:58:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42872.0, 300 sec: 42654.0). Total num frames: 10738909184. Throughput: 0: 42416.7. Samples: 10739069020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 11:58:52,361][15401] Updated weights for policy 0, policy_version 655461 (0.0042) [2024-06-24 11:58:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10739122176. Throughput: 0: 42746.3. Samples: 10739209180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:53,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 11:58:55,425][15401] Updated weights for policy 0, policy_version 655471 (0.0038) [2024-06-24 11:58:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 10739302400. Throughput: 0: 42405.4. Samples: 10739454680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:58:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 11:59:00,013][15401] Updated weights for policy 0, policy_version 655481 (0.0026) [2024-06-24 11:59:03,013][15401] Updated weights for policy 0, policy_version 655491 (0.0038) [2024-06-24 11:59:03,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 10739564544. Throughput: 0: 42260.2. Samples: 10739702880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:59:03,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 11:59:07,981][15401] Updated weights for policy 0, policy_version 655501 (0.0035) [2024-06-24 11:59:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 41777.5, 300 sec: 42542.5). Total num frames: 10739744768. Throughput: 0: 42692.8. Samples: 10739844460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:59:08,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 11:59:10,486][15401] Updated weights for policy 0, policy_version 655511 (0.0036) [2024-06-24 11:59:13,396][15132] Fps is (10 sec: 39321.6, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 10739957760. Throughput: 0: 42332.6. Samples: 10740090780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:59:13,397][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 11:59:15,622][15401] Updated weights for policy 0, policy_version 655521 (0.0031) [2024-06-24 11:59:18,177][15401] Updated weights for policy 0, policy_version 655531 (0.0035) [2024-06-24 11:59:18,390][15132] Fps is (10 sec: 47524.7, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 10740219904. Throughput: 0: 42257.7. Samples: 10740342180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-24 11:59:18,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 11:59:23,133][15401] Updated weights for policy 0, policy_version 655541 (0.0043) [2024-06-24 11:59:23,389][15132] Fps is (10 sec: 42625.7, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 10740383744. Throughput: 0: 42539.1. Samples: 10740483980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 11:59:25,899][15401] Updated weights for policy 0, policy_version 655551 (0.0033) [2024-06-24 11:59:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10740613120. Throughput: 0: 42573.8. Samples: 10740736420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 11:59:29,456][15349] Signal inference workers to stop experience collection... (159000 times) [2024-06-24 11:59:29,457][15349] Signal inference workers to resume experience collection... (159000 times) [2024-06-24 11:59:29,496][15401] InferenceWorker_p0-w0: stopping experience collection (159000 times) [2024-06-24 11:59:29,496][15401] InferenceWorker_p0-w0: resuming experience collection (159000 times) [2024-06-24 11:59:30,799][15401] Updated weights for policy 0, policy_version 655561 (0.0032) [2024-06-24 11:59:33,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 10740858880. Throughput: 0: 42695.4. Samples: 10740990320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 11:59:33,559][15401] Updated weights for policy 0, policy_version 655571 (0.0026) [2024-06-24 11:59:38,386][15401] Updated weights for policy 0, policy_version 655581 (0.0042) [2024-06-24 11:59:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10741039104. Throughput: 0: 42691.9. Samples: 10741130420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:38,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 11:59:41,286][15401] Updated weights for policy 0, policy_version 655591 (0.0032) [2024-06-24 11:59:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10741268480. Throughput: 0: 42779.2. Samples: 10741379740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 11:59:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655595_10741268480.pth... [2024-06-24 11:59:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000654971_10731044864.pth [2024-06-24 11:59:45,965][15401] Updated weights for policy 0, policy_version 655601 (0.0027) [2024-06-24 11:59:48,389][15132] Fps is (10 sec: 45886.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10741497856. Throughput: 0: 43012.0. Samples: 10741638140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 11:59:48,843][15401] Updated weights for policy 0, policy_version 655611 (0.0041) [2024-06-24 11:59:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10741678080. Throughput: 0: 42813.5. Samples: 10741770960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 11:59:53,517][15401] Updated weights for policy 0, policy_version 655621 (0.0038) [2024-06-24 11:59:56,684][15401] Updated weights for policy 0, policy_version 655631 (0.0029) [2024-06-24 11:59:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10741907456. Throughput: 0: 42880.3. Samples: 10742020120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 11:59:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 12:00:01,074][15401] Updated weights for policy 0, policy_version 655641 (0.0037) [2024-06-24 12:00:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 10742136832. Throughput: 0: 43050.4. Samples: 10742279440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:03,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 12:00:04,250][15401] Updated weights for policy 0, policy_version 655651 (0.0033) [2024-06-24 12:00:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 10742300672. Throughput: 0: 42894.4. Samples: 10742414220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 12:00:08,963][15401] Updated weights for policy 0, policy_version 655661 (0.0034) [2024-06-24 12:00:11,900][15401] Updated weights for policy 0, policy_version 655671 (0.0032) [2024-06-24 12:00:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43149.1, 300 sec: 42653.9). Total num frames: 10742546432. Throughput: 0: 42791.0. Samples: 10742662020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 12:00:16,387][15401] Updated weights for policy 0, policy_version 655681 (0.0042) [2024-06-24 12:00:18,390][15132] Fps is (10 sec: 49151.2, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 10742792192. Throughput: 0: 42973.8. Samples: 10742924140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 12:00:19,490][15401] Updated weights for policy 0, policy_version 655691 (0.0030) [2024-06-24 12:00:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10742939648. Throughput: 0: 42834.8. Samples: 10743057880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 12:00:24,124][15401] Updated weights for policy 0, policy_version 655701 (0.0044) [2024-06-24 12:00:27,191][15401] Updated weights for policy 0, policy_version 655711 (0.0036) [2024-06-24 12:00:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10743201792. Throughput: 0: 42842.5. Samples: 10743307660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 12:00:31,796][15401] Updated weights for policy 0, policy_version 655721 (0.0043) [2024-06-24 12:00:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10743414784. Throughput: 0: 42907.4. Samples: 10743568980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 12:00:35,179][15401] Updated weights for policy 0, policy_version 655731 (0.0028) [2024-06-24 12:00:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 10743595008. Throughput: 0: 42883.6. Samples: 10743700720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 12:00:39,241][15401] Updated weights for policy 0, policy_version 655741 (0.0042) [2024-06-24 12:00:42,660][15401] Updated weights for policy 0, policy_version 655751 (0.0028) [2024-06-24 12:00:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10743824384. Throughput: 0: 42972.8. Samples: 10743953900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 12:00:43,569][15349] Signal inference workers to stop experience collection... (159050 times) [2024-06-24 12:00:43,569][15349] Signal inference workers to resume experience collection... (159050 times) [2024-06-24 12:00:43,580][15401] InferenceWorker_p0-w0: stopping experience collection (159050 times) [2024-06-24 12:00:43,580][15401] InferenceWorker_p0-w0: resuming experience collection (159050 times) [2024-06-24 12:00:46,955][15401] Updated weights for policy 0, policy_version 655761 (0.0037) [2024-06-24 12:00:48,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10744070144. Throughput: 0: 42982.1. Samples: 10744213640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 12:00:50,177][15401] Updated weights for policy 0, policy_version 655771 (0.0034) [2024-06-24 12:00:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10744250368. Throughput: 0: 43074.1. Samples: 10744352560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 12:00:54,315][15401] Updated weights for policy 0, policy_version 655781 (0.0029) [2024-06-24 12:00:57,987][15401] Updated weights for policy 0, policy_version 655791 (0.0027) [2024-06-24 12:00:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10744479744. Throughput: 0: 43152.1. Samples: 10744603860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 12:00:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 12:01:01,987][15401] Updated weights for policy 0, policy_version 655801 (0.0028) [2024-06-24 12:01:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 10744709120. Throughput: 0: 42997.9. Samples: 10744859040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 12:01:05,973][15401] Updated weights for policy 0, policy_version 655811 (0.0026) [2024-06-24 12:01:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 10744905728. Throughput: 0: 43010.6. Samples: 10744993360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 12:01:09,550][15401] Updated weights for policy 0, policy_version 655821 (0.0036) [2024-06-24 12:01:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10745118720. Throughput: 0: 42990.7. Samples: 10745242240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:13,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 12:01:13,842][15401] Updated weights for policy 0, policy_version 655831 (0.0037) [2024-06-24 12:01:17,270][15401] Updated weights for policy 0, policy_version 655841 (0.0041) [2024-06-24 12:01:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10745364480. Throughput: 0: 42974.2. Samples: 10745502820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 12:01:21,352][15401] Updated weights for policy 0, policy_version 655851 (0.0036) [2024-06-24 12:01:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 10745544704. Throughput: 0: 43039.5. Samples: 10745637500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 12:01:24,822][15401] Updated weights for policy 0, policy_version 655861 (0.0031) [2024-06-24 12:01:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10745774080. Throughput: 0: 43020.1. Samples: 10745889800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 12:01:28,806][15401] Updated weights for policy 0, policy_version 655871 (0.0035) [2024-06-24 12:01:32,299][15401] Updated weights for policy 0, policy_version 655881 (0.0037) [2024-06-24 12:01:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10746003456. Throughput: 0: 42873.4. Samples: 10746142940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 12:01:36,556][15401] Updated weights for policy 0, policy_version 655891 (0.0038) [2024-06-24 12:01:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10746200064. Throughput: 0: 42858.5. Samples: 10746281200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 12:01:39,862][15401] Updated weights for policy 0, policy_version 655901 (0.0043) [2024-06-24 12:01:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10746396672. Throughput: 0: 42893.4. Samples: 10746534060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 12:01:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655909_10746413056.pth... [2024-06-24 12:01:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655282_10736140288.pth [2024-06-24 12:01:44,015][15401] Updated weights for policy 0, policy_version 655911 (0.0047) [2024-06-24 12:01:47,493][15401] Updated weights for policy 0, policy_version 655921 (0.0028) [2024-06-24 12:01:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10746642432. Throughput: 0: 42945.3. Samples: 10746791580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 12:01:52,132][15401] Updated weights for policy 0, policy_version 655931 (0.0042) [2024-06-24 12:01:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10746839040. Throughput: 0: 42791.1. Samples: 10746918960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 12:01:55,140][15401] Updated weights for policy 0, policy_version 655941 (0.0034) [2024-06-24 12:01:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10747052032. Throughput: 0: 42874.7. Samples: 10747171600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:01:58,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-24 12:01:59,601][15401] Updated weights for policy 0, policy_version 655951 (0.0040) [2024-06-24 12:02:02,761][15401] Updated weights for policy 0, policy_version 655961 (0.0026) [2024-06-24 12:02:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10747281408. Throughput: 0: 42792.4. Samples: 10747428480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 12:02:07,130][15401] Updated weights for policy 0, policy_version 655971 (0.0030) [2024-06-24 12:02:08,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 10747445248. Throughput: 0: 42652.8. Samples: 10747556980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:08,392][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 12:02:10,659][15401] Updated weights for policy 0, policy_version 655981 (0.0032) [2024-06-24 12:02:11,257][15349] Signal inference workers to stop experience collection... (159100 times) [2024-06-24 12:02:11,258][15349] Signal inference workers to resume experience collection... (159100 times) [2024-06-24 12:02:11,303][15401] InferenceWorker_p0-w0: stopping experience collection (159100 times) [2024-06-24 12:02:11,303][15401] InferenceWorker_p0-w0: resuming experience collection (159100 times) [2024-06-24 12:02:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10747691008. Throughput: 0: 42658.6. Samples: 10747809440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 12:02:14,861][15401] Updated weights for policy 0, policy_version 655991 (0.0051) [2024-06-24 12:02:18,196][15401] Updated weights for policy 0, policy_version 656001 (0.0021) [2024-06-24 12:02:18,389][15132] Fps is (10 sec: 47525.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10747920384. Throughput: 0: 42776.4. Samples: 10748067880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 12:02:22,734][15401] Updated weights for policy 0, policy_version 656011 (0.0031) [2024-06-24 12:02:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10748084224. Throughput: 0: 42532.5. Samples: 10748195160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 12:02:25,914][15401] Updated weights for policy 0, policy_version 656021 (0.0027) [2024-06-24 12:02:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10748346368. Throughput: 0: 42625.3. Samples: 10748452200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 12:02:30,323][15401] Updated weights for policy 0, policy_version 656031 (0.0041) [2024-06-24 12:02:33,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10748559360. Throughput: 0: 42742.1. Samples: 10748714980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-24 12:02:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 12:02:34,012][15401] Updated weights for policy 0, policy_version 656041 (0.0030) [2024-06-24 12:02:37,934][15401] Updated weights for policy 0, policy_version 656051 (0.0020) [2024-06-24 12:02:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10748739584. Throughput: 0: 42644.1. Samples: 10748837940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:02:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 12:02:41,463][15401] Updated weights for policy 0, policy_version 656061 (0.0032) [2024-06-24 12:02:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.8, 300 sec: 42931.4). Total num frames: 10749001728. Throughput: 0: 42688.7. Samples: 10749092700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:02:43,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 12:02:45,479][15401] Updated weights for policy 0, policy_version 656071 (0.0035) [2024-06-24 12:02:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10749181952. Throughput: 0: 42825.8. Samples: 10749355640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:02:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 12:02:49,391][15401] Updated weights for policy 0, policy_version 656081 (0.0037) [2024-06-24 12:02:53,288][15401] Updated weights for policy 0, policy_version 656091 (0.0036) [2024-06-24 12:02:53,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10749394944. Throughput: 0: 42702.7. Samples: 10749478500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:02:53,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 12:02:56,944][15401] Updated weights for policy 0, policy_version 656101 (0.0034) [2024-06-24 12:02:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 10749640704. Throughput: 0: 42870.7. Samples: 10749738620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:02:58,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 12:03:00,898][15401] Updated weights for policy 0, policy_version 656111 (0.0038) [2024-06-24 12:03:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10749820928. Throughput: 0: 42881.3. Samples: 10749997540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 12:03:04,357][15401] Updated weights for policy 0, policy_version 656121 (0.0022) [2024-06-24 12:03:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 10750033920. Throughput: 0: 42916.5. Samples: 10750126400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 12:03:08,535][15401] Updated weights for policy 0, policy_version 656131 (0.0029) [2024-06-24 12:03:11,953][15401] Updated weights for policy 0, policy_version 656141 (0.0040) [2024-06-24 12:03:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 10750296064. Throughput: 0: 42831.0. Samples: 10750379600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 12:03:15,989][15401] Updated weights for policy 0, policy_version 656151 (0.0031) [2024-06-24 12:03:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10750459904. Throughput: 0: 42839.9. Samples: 10750642780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 12:03:19,503][15401] Updated weights for policy 0, policy_version 656161 (0.0033) [2024-06-24 12:03:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10750689280. Throughput: 0: 42706.9. Samples: 10750759760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 12:03:24,029][15401] Updated weights for policy 0, policy_version 656171 (0.0028) [2024-06-24 12:03:26,541][15349] Signal inference workers to stop experience collection... (159150 times) [2024-06-24 12:03:26,543][15349] Signal inference workers to resume experience collection... (159150 times) [2024-06-24 12:03:26,565][15401] InferenceWorker_p0-w0: stopping experience collection (159150 times) [2024-06-24 12:03:26,565][15401] InferenceWorker_p0-w0: resuming experience collection (159150 times) [2024-06-24 12:03:26,995][15401] Updated weights for policy 0, policy_version 656181 (0.0040) [2024-06-24 12:03:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 10750935040. Throughput: 0: 42974.8. Samples: 10751026460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 12:03:31,588][15401] Updated weights for policy 0, policy_version 656191 (0.0037) [2024-06-24 12:03:33,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10751082496. Throughput: 0: 43084.4. Samples: 10751294440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 12:03:34,815][15401] Updated weights for policy 0, policy_version 656201 (0.0032) [2024-06-24 12:03:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10751328256. Throughput: 0: 42826.3. Samples: 10751405680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 12:03:39,112][15401] Updated weights for policy 0, policy_version 656211 (0.0035) [2024-06-24 12:03:42,448][15401] Updated weights for policy 0, policy_version 656221 (0.0048) [2024-06-24 12:03:43,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 10751574016. Throughput: 0: 42845.7. Samples: 10751666680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 12:03:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656224_10751574016.pth... [2024-06-24 12:03:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655595_10741268480.pth [2024-06-24 12:03:46,966][15401] Updated weights for policy 0, policy_version 656231 (0.0029) [2024-06-24 12:03:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10751737856. Throughput: 0: 42910.7. Samples: 10751928520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 12:03:50,030][15401] Updated weights for policy 0, policy_version 656241 (0.0037) [2024-06-24 12:03:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 10751967232. Throughput: 0: 42725.4. Samples: 10752049040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 12:03:54,659][15401] Updated weights for policy 0, policy_version 656251 (0.0036) [2024-06-24 12:03:57,990][15401] Updated weights for policy 0, policy_version 656261 (0.0029) [2024-06-24 12:03:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 10752212992. Throughput: 0: 42730.3. Samples: 10752302460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:03:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 12:04:02,747][15401] Updated weights for policy 0, policy_version 656271 (0.0042) [2024-06-24 12:04:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 10752376832. Throughput: 0: 42561.8. Samples: 10752558060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:04:03,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 12:04:05,702][15401] Updated weights for policy 0, policy_version 656281 (0.0026) [2024-06-24 12:04:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 10752606208. Throughput: 0: 42567.7. Samples: 10752675300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:04:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 12:04:10,530][15401] Updated weights for policy 0, policy_version 656291 (0.0036) [2024-06-24 12:04:13,247][15401] Updated weights for policy 0, policy_version 656301 (0.0034) [2024-06-24 12:04:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10752835584. Throughput: 0: 42493.7. Samples: 10752938680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 12:04:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 12:04:18,296][15401] Updated weights for policy 0, policy_version 656311 (0.0039) [2024-06-24 12:04:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10752999424. Throughput: 0: 42238.7. Samples: 10753195180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 12:04:21,143][15401] Updated weights for policy 0, policy_version 656321 (0.0040) [2024-06-24 12:04:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10753245184. Throughput: 0: 42472.4. Samples: 10753316940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:04:25,929][15401] Updated weights for policy 0, policy_version 656331 (0.0043) [2024-06-24 12:04:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10753474560. Throughput: 0: 42446.7. Samples: 10753576780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 12:04:28,848][15401] Updated weights for policy 0, policy_version 656341 (0.0043) [2024-06-24 12:04:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 10753638400. Throughput: 0: 42279.1. Samples: 10753831080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 12:04:33,463][15401] Updated weights for policy 0, policy_version 656351 (0.0030) [2024-06-24 12:04:36,615][15401] Updated weights for policy 0, policy_version 656361 (0.0039) [2024-06-24 12:04:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10753884160. Throughput: 0: 42338.2. Samples: 10753954260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 12:04:39,011][15349] Signal inference workers to stop experience collection... (159200 times) [2024-06-24 12:04:39,020][15349] Signal inference workers to resume experience collection... (159200 times) [2024-06-24 12:04:39,039][15401] InferenceWorker_p0-w0: stopping experience collection (159200 times) [2024-06-24 12:04:39,039][15401] InferenceWorker_p0-w0: resuming experience collection (159200 times) [2024-06-24 12:04:41,109][15401] Updated weights for policy 0, policy_version 656371 (0.0040) [2024-06-24 12:04:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 10754080768. Throughput: 0: 42428.5. Samples: 10754211740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 12:04:44,309][15401] Updated weights for policy 0, policy_version 656381 (0.0024) [2024-06-24 12:04:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10754293760. Throughput: 0: 42465.0. Samples: 10754468980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 12:04:48,577][15401] Updated weights for policy 0, policy_version 656391 (0.0030) [2024-06-24 12:04:51,978][15401] Updated weights for policy 0, policy_version 656401 (0.0036) [2024-06-24 12:04:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10754506752. Throughput: 0: 42808.0. Samples: 10754601660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 12:04:56,168][15401] Updated weights for policy 0, policy_version 656411 (0.0033) [2024-06-24 12:04:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 10754719744. Throughput: 0: 42594.3. Samples: 10754855420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:04:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 12:04:59,492][15401] Updated weights for policy 0, policy_version 656421 (0.0038) [2024-06-24 12:05:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 10754932736. Throughput: 0: 42559.1. Samples: 10755110340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 12:05:03,866][15401] Updated weights for policy 0, policy_version 656431 (0.0037) [2024-06-24 12:05:07,239][15401] Updated weights for policy 0, policy_version 656441 (0.0037) [2024-06-24 12:05:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10755162112. Throughput: 0: 42688.4. Samples: 10755237920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 12:05:11,729][15401] Updated weights for policy 0, policy_version 656451 (0.0034) [2024-06-24 12:05:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 10755375104. Throughput: 0: 42605.0. Samples: 10755494000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 12:05:14,766][15401] Updated weights for policy 0, policy_version 656461 (0.0027) [2024-06-24 12:05:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 10755571712. Throughput: 0: 42764.0. Samples: 10755755560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:18,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 12:05:19,289][15401] Updated weights for policy 0, policy_version 656471 (0.0042) [2024-06-24 12:05:22,294][15401] Updated weights for policy 0, policy_version 656481 (0.0025) [2024-06-24 12:05:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10755817472. Throughput: 0: 42759.6. Samples: 10755878440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 12:05:26,923][15401] Updated weights for policy 0, policy_version 656491 (0.0038) [2024-06-24 12:05:28,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10756014080. Throughput: 0: 42935.1. Samples: 10756143820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 12:05:29,794][15401] Updated weights for policy 0, policy_version 656501 (0.0034) [2024-06-24 12:05:33,396][15132] Fps is (10 sec: 40933.2, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 10756227072. Throughput: 0: 42892.0. Samples: 10756399400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:33,397][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 12:05:34,431][15401] Updated weights for policy 0, policy_version 656511 (0.0038) [2024-06-24 12:05:37,831][15401] Updated weights for policy 0, policy_version 656521 (0.0030) [2024-06-24 12:05:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10756456448. Throughput: 0: 42846.2. Samples: 10756529740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 12:05:42,131][15401] Updated weights for policy 0, policy_version 656531 (0.0027) [2024-06-24 12:05:43,390][15132] Fps is (10 sec: 44264.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10756669440. Throughput: 0: 42886.9. Samples: 10756785340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 12:05:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656535_10756669440.pth... [2024-06-24 12:05:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000655909_10746413056.pth [2024-06-24 12:05:45,586][15401] Updated weights for policy 0, policy_version 656541 (0.0036) [2024-06-24 12:05:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10756866048. Throughput: 0: 42853.7. Samples: 10757038760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:05:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 12:05:49,699][15401] Updated weights for policy 0, policy_version 656551 (0.0030) [2024-06-24 12:05:53,389][15401] Updated weights for policy 0, policy_version 656561 (0.0034) [2024-06-24 12:05:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10757095424. Throughput: 0: 42885.0. Samples: 10757167740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:05:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 12:05:57,626][15401] Updated weights for policy 0, policy_version 656571 (0.0025) [2024-06-24 12:05:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10757292032. Throughput: 0: 42895.8. Samples: 10757424320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:05:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 12:06:01,153][15401] Updated weights for policy 0, policy_version 656581 (0.0037) [2024-06-24 12:06:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10757505024. Throughput: 0: 42720.5. Samples: 10757677880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 12:06:05,107][15401] Updated weights for policy 0, policy_version 656591 (0.0029) [2024-06-24 12:06:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10757734400. Throughput: 0: 42867.9. Samples: 10757807500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:08,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 12:06:08,537][15401] Updated weights for policy 0, policy_version 656601 (0.0034) [2024-06-24 12:06:12,796][15401] Updated weights for policy 0, policy_version 656611 (0.0032) [2024-06-24 12:06:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10757947392. Throughput: 0: 42855.2. Samples: 10758072300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:13,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-24 12:06:16,361][15401] Updated weights for policy 0, policy_version 656621 (0.0048) [2024-06-24 12:06:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 10758160384. Throughput: 0: 42690.6. Samples: 10758320200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:18,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-24 12:06:20,692][15401] Updated weights for policy 0, policy_version 656631 (0.0041) [2024-06-24 12:06:21,272][15349] Signal inference workers to stop experience collection... (159250 times) [2024-06-24 12:06:21,274][15349] Signal inference workers to resume experience collection... (159250 times) [2024-06-24 12:06:21,294][15401] InferenceWorker_p0-w0: stopping experience collection (159250 times) [2024-06-24 12:06:21,295][15401] InferenceWorker_p0-w0: resuming experience collection (159250 times) [2024-06-24 12:06:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10758389760. Throughput: 0: 42689.8. Samples: 10758450780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 12:06:23,778][15401] Updated weights for policy 0, policy_version 656641 (0.0034) [2024-06-24 12:06:28,361][15401] Updated weights for policy 0, policy_version 656651 (0.0041) [2024-06-24 12:06:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10758569984. Throughput: 0: 42734.0. Samples: 10758708360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 12:06:31,403][15401] Updated weights for policy 0, policy_version 656661 (0.0029) [2024-06-24 12:06:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 10758799360. Throughput: 0: 42854.3. Samples: 10758967200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 12:06:35,796][15401] Updated weights for policy 0, policy_version 656671 (0.0042) [2024-06-24 12:06:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10759028736. Throughput: 0: 42857.8. Samples: 10759096340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 12:06:38,975][15401] Updated weights for policy 0, policy_version 656681 (0.0029) [2024-06-24 12:06:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10759208960. Throughput: 0: 42766.6. Samples: 10759348820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:43,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 12:06:43,479][15401] Updated weights for policy 0, policy_version 656691 (0.0037) [2024-06-24 12:06:46,867][15401] Updated weights for policy 0, policy_version 656701 (0.0030) [2024-06-24 12:06:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10759421952. Throughput: 0: 42671.1. Samples: 10759598080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 12:06:51,175][15401] Updated weights for policy 0, policy_version 656711 (0.0026) [2024-06-24 12:06:53,390][15132] Fps is (10 sec: 45873.0, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 10759667712. Throughput: 0: 42730.2. Samples: 10759730380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:53,391][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 12:06:54,434][15401] Updated weights for policy 0, policy_version 656721 (0.0037) [2024-06-24 12:06:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10759847936. Throughput: 0: 42530.6. Samples: 10759986180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:06:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 12:06:58,818][15401] Updated weights for policy 0, policy_version 656731 (0.0032) [2024-06-24 12:07:02,030][15401] Updated weights for policy 0, policy_version 656741 (0.0021) [2024-06-24 12:07:03,390][15132] Fps is (10 sec: 40962.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 10760077312. Throughput: 0: 42780.8. Samples: 10760245340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 12:07:06,503][15401] Updated weights for policy 0, policy_version 656751 (0.0027) [2024-06-24 12:07:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10760306688. Throughput: 0: 42695.4. Samples: 10760372080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:07:09,506][15401] Updated weights for policy 0, policy_version 656761 (0.0036) [2024-06-24 12:07:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10760470528. Throughput: 0: 42625.3. Samples: 10760626500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 12:07:14,133][15401] Updated weights for policy 0, policy_version 656771 (0.0040) [2024-06-24 12:07:17,766][15401] Updated weights for policy 0, policy_version 656781 (0.0043) [2024-06-24 12:07:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10760699904. Throughput: 0: 42358.2. Samples: 10760873320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 12:07:22,175][15401] Updated weights for policy 0, policy_version 656791 (0.0027) [2024-06-24 12:07:23,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10760945664. Throughput: 0: 42426.7. Samples: 10761005540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 12:07:25,654][15401] Updated weights for policy 0, policy_version 656801 (0.0029) [2024-06-24 12:07:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10761125888. Throughput: 0: 42437.9. Samples: 10761258520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:07:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 12:07:29,678][15401] Updated weights for policy 0, policy_version 656811 (0.0042) [2024-06-24 12:07:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10761338880. Throughput: 0: 42538.3. Samples: 10761512300. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:33,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 12:07:33,468][15401] Updated weights for policy 0, policy_version 656821 (0.0045) [2024-06-24 12:07:37,278][15401] Updated weights for policy 0, policy_version 656831 (0.0035) [2024-06-24 12:07:38,390][15132] Fps is (10 sec: 45871.3, 60 sec: 42597.8, 300 sec: 42654.2). Total num frames: 10761584640. Throughput: 0: 42567.3. Samples: 10761645920. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:38,391][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 12:07:41,146][15401] Updated weights for policy 0, policy_version 656841 (0.0041) [2024-06-24 12:07:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 10761764864. Throughput: 0: 42394.2. Samples: 10761894020. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:43,392][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 12:07:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656846_10761764864.pth... [2024-06-24 12:07:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656224_10751574016.pth [2024-06-24 12:07:44,917][15401] Updated weights for policy 0, policy_version 656851 (0.0044) [2024-06-24 12:07:45,342][15349] Signal inference workers to stop experience collection... (159300 times) [2024-06-24 12:07:45,378][15401] InferenceWorker_p0-w0: stopping experience collection (159300 times) [2024-06-24 12:07:45,407][15349] Signal inference workers to resume experience collection... (159300 times) [2024-06-24 12:07:45,407][15401] InferenceWorker_p0-w0: resuming experience collection (159300 times) [2024-06-24 12:07:48,389][15132] Fps is (10 sec: 39324.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10761977856. Throughput: 0: 42389.4. Samples: 10762152860. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 12:07:48,784][15401] Updated weights for policy 0, policy_version 656861 (0.0027) [2024-06-24 12:07:52,778][15401] Updated weights for policy 0, policy_version 656871 (0.0029) [2024-06-24 12:07:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.7, 300 sec: 42598.4). Total num frames: 10762207232. Throughput: 0: 42498.7. Samples: 10762284520. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 12:07:56,327][15401] Updated weights for policy 0, policy_version 656881 (0.0038) [2024-06-24 12:07:58,391][15132] Fps is (10 sec: 44228.2, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 10762420224. Throughput: 0: 42495.0. Samples: 10762538860. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:07:58,401][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 12:08:00,313][15401] Updated weights for policy 0, policy_version 656891 (0.0035) [2024-06-24 12:08:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10762633216. Throughput: 0: 42859.6. Samples: 10762802000. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 12:08:03,951][15401] Updated weights for policy 0, policy_version 656901 (0.0029) [2024-06-24 12:08:07,761][15401] Updated weights for policy 0, policy_version 656911 (0.0040) [2024-06-24 12:08:08,389][15132] Fps is (10 sec: 40968.4, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 10762829824. Throughput: 0: 42785.8. Samples: 10762930900. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 12:08:11,595][15401] Updated weights for policy 0, policy_version 656921 (0.0031) [2024-06-24 12:08:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10763059200. Throughput: 0: 42674.7. Samples: 10763178880. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:13,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 12:08:15,775][15401] Updated weights for policy 0, policy_version 656931 (0.0032) [2024-06-24 12:08:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10763272192. Throughput: 0: 42823.5. Samples: 10763439360. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 12:08:19,127][15401] Updated weights for policy 0, policy_version 656941 (0.0034) [2024-06-24 12:08:23,238][15401] Updated weights for policy 0, policy_version 656951 (0.0031) [2024-06-24 12:08:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10763485184. Throughput: 0: 42809.6. Samples: 10763572320. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 12:08:26,596][15401] Updated weights for policy 0, policy_version 656961 (0.0032) [2024-06-24 12:08:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10763714560. Throughput: 0: 42862.3. Samples: 10763822720. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 12:08:31,160][15401] Updated weights for policy 0, policy_version 656971 (0.0035) [2024-06-24 12:08:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10763927552. Throughput: 0: 42830.6. Samples: 10764080240. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 12:08:34,366][15401] Updated weights for policy 0, policy_version 656981 (0.0035) [2024-06-24 12:08:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.9, 300 sec: 42487.3). Total num frames: 10764107776. Throughput: 0: 42733.5. Samples: 10764207520. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 12:08:38,783][15401] Updated weights for policy 0, policy_version 656991 (0.0031) [2024-06-24 12:08:41,876][15401] Updated weights for policy 0, policy_version 657001 (0.0028) [2024-06-24 12:08:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 10764369920. Throughput: 0: 42795.6. Samples: 10764464580. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 12:08:46,338][15401] Updated weights for policy 0, policy_version 657011 (0.0033) [2024-06-24 12:08:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10764566528. Throughput: 0: 42709.3. Samples: 10764723920. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 12:08:49,405][15401] Updated weights for policy 0, policy_version 657021 (0.0032) [2024-06-24 12:08:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10764763136. Throughput: 0: 42590.1. Samples: 10764847460. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 12:08:54,212][15401] Updated weights for policy 0, policy_version 657031 (0.0035) [2024-06-24 12:08:57,207][15401] Updated weights for policy 0, policy_version 657041 (0.0038) [2024-06-24 12:08:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43419.0, 300 sec: 42876.1). Total num frames: 10765025280. Throughput: 0: 42868.8. Samples: 10765107980. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:08:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 12:09:01,902][15401] Updated weights for policy 0, policy_version 657051 (0.0037) [2024-06-24 12:09:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10765205504. Throughput: 0: 42830.3. Samples: 10765366720. Policy #0 lag: (min: 2.0, avg: 11.1, max: 21.0) [2024-06-24 12:09:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 12:09:04,853][15401] Updated weights for policy 0, policy_version 657061 (0.0034) [2024-06-24 12:09:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10765402112. Throughput: 0: 42697.4. Samples: 10765493700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 12:09:09,736][15401] Updated weights for policy 0, policy_version 657071 (0.0036) [2024-06-24 12:09:12,331][15401] Updated weights for policy 0, policy_version 657081 (0.0037) [2024-06-24 12:09:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10765647872. Throughput: 0: 42943.0. Samples: 10765755160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 12:09:17,242][15401] Updated weights for policy 0, policy_version 657091 (0.0034) [2024-06-24 12:09:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10765860864. Throughput: 0: 42896.9. Samples: 10766010600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 12:09:18,801][15349] Signal inference workers to stop experience collection... (159350 times) [2024-06-24 12:09:18,814][15401] InferenceWorker_p0-w0: stopping experience collection (159350 times) [2024-06-24 12:09:18,914][15349] Signal inference workers to resume experience collection... (159350 times) [2024-06-24 12:09:18,915][15401] InferenceWorker_p0-w0: resuming experience collection (159350 times) [2024-06-24 12:09:20,069][15401] Updated weights for policy 0, policy_version 657101 (0.0037) [2024-06-24 12:09:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10766057472. Throughput: 0: 42909.7. Samples: 10766138460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 12:09:24,643][15401] Updated weights for policy 0, policy_version 657111 (0.0039) [2024-06-24 12:09:27,691][15401] Updated weights for policy 0, policy_version 657121 (0.0038) [2024-06-24 12:09:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10766270464. Throughput: 0: 42921.4. Samples: 10766396040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 12:09:32,222][15401] Updated weights for policy 0, policy_version 657131 (0.0038) [2024-06-24 12:09:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10766499840. Throughput: 0: 42758.7. Samples: 10766648060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 12:09:35,371][15401] Updated weights for policy 0, policy_version 657141 (0.0033) [2024-06-24 12:09:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10766696448. Throughput: 0: 42867.9. Samples: 10766776520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 12:09:39,873][15401] Updated weights for policy 0, policy_version 657151 (0.0047) [2024-06-24 12:09:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 10766909440. Throughput: 0: 42981.3. Samples: 10767042240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:43,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 12:09:43,466][15401] Updated weights for policy 0, policy_version 657161 (0.0029) [2024-06-24 12:09:43,580][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657162_10766942208.pth... [2024-06-24 12:09:43,648][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656535_10756669440.pth [2024-06-24 12:09:47,438][15401] Updated weights for policy 0, policy_version 657171 (0.0028) [2024-06-24 12:09:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10767122432. Throughput: 0: 42815.9. Samples: 10767293440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 12:09:50,932][15401] Updated weights for policy 0, policy_version 657181 (0.0030) [2024-06-24 12:09:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10767335424. Throughput: 0: 42814.6. Samples: 10767420360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 12:09:55,013][15401] Updated weights for policy 0, policy_version 657191 (0.0038) [2024-06-24 12:09:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10767564800. Throughput: 0: 42611.2. Samples: 10767672660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:09:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 12:09:58,500][15401] Updated weights for policy 0, policy_version 657201 (0.0024) [2024-06-24 12:10:02,901][15401] Updated weights for policy 0, policy_version 657211 (0.0035) [2024-06-24 12:10:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10767761408. Throughput: 0: 42699.1. Samples: 10767932060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:03,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 12:10:06,563][15401] Updated weights for policy 0, policy_version 657221 (0.0025) [2024-06-24 12:10:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10767974400. Throughput: 0: 42638.7. Samples: 10768057200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:08,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 12:10:10,323][15401] Updated weights for policy 0, policy_version 657231 (0.0029) [2024-06-24 12:10:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 10768220160. Throughput: 0: 42775.4. Samples: 10768320940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:13,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 12:10:14,171][15401] Updated weights for policy 0, policy_version 657241 (0.0029) [2024-06-24 12:10:17,988][15401] Updated weights for policy 0, policy_version 657251 (0.0044) [2024-06-24 12:10:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10768433152. Throughput: 0: 42972.1. Samples: 10768581800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 12:10:21,669][15401] Updated weights for policy 0, policy_version 657261 (0.0031) [2024-06-24 12:10:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10768629760. Throughput: 0: 42899.5. Samples: 10768707000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 12:10:25,440][15401] Updated weights for policy 0, policy_version 657271 (0.0048) [2024-06-24 12:10:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42821.5). Total num frames: 10768859136. Throughput: 0: 42769.3. Samples: 10768966760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:28,391][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 12:10:29,478][15401] Updated weights for policy 0, policy_version 657281 (0.0038) [2024-06-24 12:10:33,055][15401] Updated weights for policy 0, policy_version 657291 (0.0034) [2024-06-24 12:10:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10769055744. Throughput: 0: 42856.4. Samples: 10769221980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:33,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 12:10:37,052][15401] Updated weights for policy 0, policy_version 657301 (0.0034) [2024-06-24 12:10:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10769268736. Throughput: 0: 42755.1. Samples: 10769344340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 12:10:40,650][15401] Updated weights for policy 0, policy_version 657311 (0.0040) [2024-06-24 12:10:43,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43417.6, 300 sec: 42875.8). Total num frames: 10769514496. Throughput: 0: 43034.5. Samples: 10769609320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 12:10:43,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 12:10:44,637][15401] Updated weights for policy 0, policy_version 657321 (0.0031) [2024-06-24 12:10:47,580][15349] Signal inference workers to stop experience collection... (159400 times) [2024-06-24 12:10:47,581][15349] Signal inference workers to resume experience collection... (159400 times) [2024-06-24 12:10:47,632][15401] InferenceWorker_p0-w0: stopping experience collection (159400 times) [2024-06-24 12:10:47,632][15401] InferenceWorker_p0-w0: resuming experience collection (159400 times) [2024-06-24 12:10:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10769694720. Throughput: 0: 42901.4. Samples: 10769862620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:10:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 12:10:48,648][15401] Updated weights for policy 0, policy_version 657331 (0.0039) [2024-06-24 12:10:52,276][15401] Updated weights for policy 0, policy_version 657341 (0.0034) [2024-06-24 12:10:53,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10769907712. Throughput: 0: 42959.5. Samples: 10769990380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:10:53,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 12:10:56,200][15401] Updated weights for policy 0, policy_version 657351 (0.0039) [2024-06-24 12:10:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10770153472. Throughput: 0: 42873.7. Samples: 10770250260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:10:58,399][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 12:10:59,921][15401] Updated weights for policy 0, policy_version 657361 (0.0029) [2024-06-24 12:11:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10770350080. Throughput: 0: 42680.8. Samples: 10770502440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 12:11:04,100][15401] Updated weights for policy 0, policy_version 657371 (0.0037) [2024-06-24 12:11:07,453][15401] Updated weights for policy 0, policy_version 657381 (0.0032) [2024-06-24 12:11:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10770563072. Throughput: 0: 42637.5. Samples: 10770625680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 12:11:11,725][15401] Updated weights for policy 0, policy_version 657391 (0.0033) [2024-06-24 12:11:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10770776064. Throughput: 0: 42749.0. Samples: 10770890460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:13,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 12:11:15,443][15401] Updated weights for policy 0, policy_version 657401 (0.0022) [2024-06-24 12:11:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10770972672. Throughput: 0: 42661.8. Samples: 10771141760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 12:11:19,500][15401] Updated weights for policy 0, policy_version 657411 (0.0045) [2024-06-24 12:11:23,106][15401] Updated weights for policy 0, policy_version 657421 (0.0034) [2024-06-24 12:11:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 10771202048. Throughput: 0: 42706.2. Samples: 10771266220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:23,393][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 12:11:27,019][15401] Updated weights for policy 0, policy_version 657431 (0.0033) [2024-06-24 12:11:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10771398656. Throughput: 0: 42507.2. Samples: 10771522040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 12:11:30,801][15401] Updated weights for policy 0, policy_version 657441 (0.0036) [2024-06-24 12:11:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10771611648. Throughput: 0: 42575.0. Samples: 10771778500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 12:11:34,567][15401] Updated weights for policy 0, policy_version 657451 (0.0024) [2024-06-24 12:11:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10771824640. Throughput: 0: 42624.8. Samples: 10771908500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 12:11:38,551][15401] Updated weights for policy 0, policy_version 657461 (0.0021) [2024-06-24 12:11:42,222][15401] Updated weights for policy 0, policy_version 657471 (0.0033) [2024-06-24 12:11:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.9, 300 sec: 42709.5). Total num frames: 10772021248. Throughput: 0: 42472.5. Samples: 10772161520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 12:11:43,558][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657474_10772054016.pth... [2024-06-24 12:11:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000656846_10761764864.pth [2024-06-24 12:11:46,360][15401] Updated weights for policy 0, policy_version 657481 (0.0030) [2024-06-24 12:11:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10772250624. Throughput: 0: 42615.1. Samples: 10772420120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 12:11:49,824][15401] Updated weights for policy 0, policy_version 657491 (0.0044) [2024-06-24 12:11:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10772463616. Throughput: 0: 42661.3. Samples: 10772545440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 12:11:53,891][15401] Updated weights for policy 0, policy_version 657501 (0.0033) [2024-06-24 12:11:54,725][15349] Signal inference workers to stop experience collection... (159450 times) [2024-06-24 12:11:54,727][15349] Signal inference workers to resume experience collection... (159450 times) [2024-06-24 12:11:54,748][15401] InferenceWorker_p0-w0: stopping experience collection (159450 times) [2024-06-24 12:11:54,748][15401] InferenceWorker_p0-w0: resuming experience collection (159450 times) [2024-06-24 12:11:57,673][15401] Updated weights for policy 0, policy_version 657511 (0.0037) [2024-06-24 12:11:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10772676608. Throughput: 0: 42472.9. Samples: 10772801740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:11:58,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 12:12:01,443][15401] Updated weights for policy 0, policy_version 657521 (0.0042) [2024-06-24 12:12:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10772889600. Throughput: 0: 42704.5. Samples: 10773063460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:12:03,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 12:12:05,070][15401] Updated weights for policy 0, policy_version 657531 (0.0030) [2024-06-24 12:12:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10773102592. Throughput: 0: 42733.4. Samples: 10773189120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:12:08,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 12:12:09,408][15401] Updated weights for policy 0, policy_version 657541 (0.0036) [2024-06-24 12:12:12,809][15401] Updated weights for policy 0, policy_version 657551 (0.0035) [2024-06-24 12:12:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10773315584. Throughput: 0: 42785.3. Samples: 10773447380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:12:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 12:12:17,030][15401] Updated weights for policy 0, policy_version 657561 (0.0034) [2024-06-24 12:12:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10773528576. Throughput: 0: 42719.7. Samples: 10773700880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:12:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 12:12:20,823][15401] Updated weights for policy 0, policy_version 657571 (0.0042) [2024-06-24 12:12:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 10773741568. Throughput: 0: 42526.9. Samples: 10773822200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:12:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 12:12:24,601][15401] Updated weights for policy 0, policy_version 657581 (0.0039) [2024-06-24 12:12:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10773954560. Throughput: 0: 42702.2. Samples: 10774083120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 12:12:28,445][15401] Updated weights for policy 0, policy_version 657591 (0.0027) [2024-06-24 12:12:32,049][15401] Updated weights for policy 0, policy_version 657601 (0.0033) [2024-06-24 12:12:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10774167552. Throughput: 0: 42536.0. Samples: 10774334240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:33,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 12:12:36,429][15401] Updated weights for policy 0, policy_version 657611 (0.0031) [2024-06-24 12:12:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 10774380544. Throughput: 0: 42669.3. Samples: 10774465560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 12:12:39,636][15401] Updated weights for policy 0, policy_version 657621 (0.0041) [2024-06-24 12:12:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10774593536. Throughput: 0: 42821.0. Samples: 10774728680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:43,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 12:12:43,781][15401] Updated weights for policy 0, policy_version 657631 (0.0038) [2024-06-24 12:12:47,723][15401] Updated weights for policy 0, policy_version 657641 (0.0034) [2024-06-24 12:12:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10774822912. Throughput: 0: 42652.4. Samples: 10774982820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:48,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 12:12:51,209][15401] Updated weights for policy 0, policy_version 657651 (0.0037) [2024-06-24 12:12:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 10775035904. Throughput: 0: 42681.4. Samples: 10775109780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 12:12:55,244][15401] Updated weights for policy 0, policy_version 657661 (0.0037) [2024-06-24 12:12:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10775232512. Throughput: 0: 42707.6. Samples: 10775369220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:12:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 12:12:58,848][15401] Updated weights for policy 0, policy_version 657671 (0.0021) [2024-06-24 12:13:03,013][15401] Updated weights for policy 0, policy_version 657681 (0.0037) [2024-06-24 12:13:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10775445504. Throughput: 0: 42662.1. Samples: 10775620680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 12:13:06,106][15349] Signal inference workers to stop experience collection... (159500 times) [2024-06-24 12:13:06,112][15349] Signal inference workers to resume experience collection... (159500 times) [2024-06-24 12:13:06,135][15401] InferenceWorker_p0-w0: stopping experience collection (159500 times) [2024-06-24 12:13:06,140][15401] InferenceWorker_p0-w0: resuming experience collection (159500 times) [2024-06-24 12:13:06,551][15401] Updated weights for policy 0, policy_version 657691 (0.0041) [2024-06-24 12:13:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10775691264. Throughput: 0: 42836.3. Samples: 10775749840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 12:13:10,488][15401] Updated weights for policy 0, policy_version 657701 (0.0031) [2024-06-24 12:13:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10775871488. Throughput: 0: 42875.2. Samples: 10776012500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 12:13:14,208][15401] Updated weights for policy 0, policy_version 657711 (0.0038) [2024-06-24 12:13:18,306][15401] Updated weights for policy 0, policy_version 657721 (0.0026) [2024-06-24 12:13:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10776100864. Throughput: 0: 42962.7. Samples: 10776267560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:18,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 12:13:21,797][15401] Updated weights for policy 0, policy_version 657731 (0.0037) [2024-06-24 12:13:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10776330240. Throughput: 0: 42863.2. Samples: 10776394400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 12:13:25,717][15401] Updated weights for policy 0, policy_version 657741 (0.0039) [2024-06-24 12:13:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.7, 300 sec: 42765.1). Total num frames: 10776543232. Throughput: 0: 42895.7. Samples: 10776658980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 12:13:29,346][15401] Updated weights for policy 0, policy_version 657751 (0.0022) [2024-06-24 12:13:33,126][15401] Updated weights for policy 0, policy_version 657761 (0.0029) [2024-06-24 12:13:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10776756224. Throughput: 0: 43000.0. Samples: 10776917820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:33,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 12:13:37,141][15401] Updated weights for policy 0, policy_version 657771 (0.0030) [2024-06-24 12:13:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 10777001984. Throughput: 0: 43084.7. Samples: 10777048600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 12:13:40,758][15401] Updated weights for policy 0, policy_version 657781 (0.0033) [2024-06-24 12:13:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10777165824. Throughput: 0: 43053.2. Samples: 10777306620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 12:13:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657786_10777165824.pth... [2024-06-24 12:13:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657162_10766942208.pth [2024-06-24 12:13:44,671][15401] Updated weights for policy 0, policy_version 657791 (0.0025) [2024-06-24 12:13:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10777395200. Throughput: 0: 42984.4. Samples: 10777554980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 12:13:48,879][15401] Updated weights for policy 0, policy_version 657801 (0.0029) [2024-06-24 12:13:52,547][15401] Updated weights for policy 0, policy_version 657811 (0.0036) [2024-06-24 12:13:53,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10777640960. Throughput: 0: 42951.2. Samples: 10777682640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 12:13:56,424][15401] Updated weights for policy 0, policy_version 657821 (0.0036) [2024-06-24 12:13:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10777804800. Throughput: 0: 42837.7. Samples: 10777940200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-24 12:13:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 12:14:00,272][15401] Updated weights for policy 0, policy_version 657831 (0.0037) [2024-06-24 12:14:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10778017792. Throughput: 0: 42762.2. Samples: 10778191860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 12:14:04,230][15401] Updated weights for policy 0, policy_version 657841 (0.0026) [2024-06-24 12:14:07,761][15401] Updated weights for policy 0, policy_version 657851 (0.0038) [2024-06-24 12:14:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10778247168. Throughput: 0: 42843.0. Samples: 10778322340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:08,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 12:14:12,166][15401] Updated weights for policy 0, policy_version 657861 (0.0028) [2024-06-24 12:14:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10778443776. Throughput: 0: 42645.6. Samples: 10778578040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:13,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 12:14:15,456][15401] Updated weights for policy 0, policy_version 657871 (0.0038) [2024-06-24 12:14:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10778673152. Throughput: 0: 42407.6. Samples: 10778826160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 12:14:19,661][15401] Updated weights for policy 0, policy_version 657881 (0.0029) [2024-06-24 12:14:23,118][15401] Updated weights for policy 0, policy_version 657891 (0.0029) [2024-06-24 12:14:23,393][15132] Fps is (10 sec: 44221.1, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 10778886144. Throughput: 0: 42351.8. Samples: 10778954580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:23,394][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 12:14:27,630][15401] Updated weights for policy 0, policy_version 657901 (0.0030) [2024-06-24 12:14:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10779082752. Throughput: 0: 42281.4. Samples: 10779209280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 12:14:31,103][15401] Updated weights for policy 0, policy_version 657911 (0.0033) [2024-06-24 12:14:33,390][15132] Fps is (10 sec: 42613.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10779312128. Throughput: 0: 42471.5. Samples: 10779466200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 12:14:35,103][15401] Updated weights for policy 0, policy_version 657921 (0.0045) [2024-06-24 12:14:38,389][15132] Fps is (10 sec: 42599.5, 60 sec: 41779.4, 300 sec: 42709.9). Total num frames: 10779508736. Throughput: 0: 42503.7. Samples: 10779595300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 12:14:38,714][15401] Updated weights for policy 0, policy_version 657931 (0.0037) [2024-06-24 12:14:39,396][15349] Signal inference workers to stop experience collection... (159550 times) [2024-06-24 12:14:39,397][15349] Signal inference workers to resume experience collection... (159550 times) [2024-06-24 12:14:39,427][15401] InferenceWorker_p0-w0: stopping experience collection (159550 times) [2024-06-24 12:14:39,427][15401] InferenceWorker_p0-w0: resuming experience collection (159550 times) [2024-06-24 12:14:42,618][15401] Updated weights for policy 0, policy_version 657941 (0.0046) [2024-06-24 12:14:43,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 10779705344. Throughput: 0: 42545.9. Samples: 10779854760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 12:14:46,192][15401] Updated weights for policy 0, policy_version 657951 (0.0033) [2024-06-24 12:14:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10779934720. Throughput: 0: 42659.6. Samples: 10780111540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 12:14:50,142][15401] Updated weights for policy 0, policy_version 657961 (0.0027) [2024-06-24 12:14:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10780164096. Throughput: 0: 42593.4. Samples: 10780239040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 12:14:54,011][15401] Updated weights for policy 0, policy_version 657971 (0.0040) [2024-06-24 12:14:58,029][15401] Updated weights for policy 0, policy_version 657981 (0.0035) [2024-06-24 12:14:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10780360704. Throughput: 0: 42559.2. Samples: 10780493200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:14:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 12:15:02,137][15401] Updated weights for policy 0, policy_version 657991 (0.0046) [2024-06-24 12:15:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10780557312. Throughput: 0: 42703.1. Samples: 10780747800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:03,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 12:15:05,674][15401] Updated weights for policy 0, policy_version 658001 (0.0040) [2024-06-24 12:15:08,393][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10780819456. Throughput: 0: 42703.3. Samples: 10780876180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:08,393][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 12:15:09,813][15401] Updated weights for policy 0, policy_version 658011 (0.0041) [2024-06-24 12:15:13,298][15401] Updated weights for policy 0, policy_version 658021 (0.0043) [2024-06-24 12:15:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10781016064. Throughput: 0: 42673.4. Samples: 10781129580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 12:15:17,422][15401] Updated weights for policy 0, policy_version 658031 (0.0037) [2024-06-24 12:15:18,390][15132] Fps is (10 sec: 39330.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10781212672. Throughput: 0: 42578.1. Samples: 10781382220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 12:15:20,893][15401] Updated weights for policy 0, policy_version 658041 (0.0034) [2024-06-24 12:15:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.9, 300 sec: 42598.4). Total num frames: 10781425664. Throughput: 0: 42494.6. Samples: 10781507560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 12:15:25,181][15401] Updated weights for policy 0, policy_version 658051 (0.0031) [2024-06-24 12:15:28,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 10781638656. Throughput: 0: 42583.6. Samples: 10781771020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 12:15:28,667][15401] Updated weights for policy 0, policy_version 658061 (0.0029) [2024-06-24 12:15:32,858][15401] Updated weights for policy 0, policy_version 658071 (0.0045) [2024-06-24 12:15:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10781851648. Throughput: 0: 42388.0. Samples: 10782019000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:15:36,457][15401] Updated weights for policy 0, policy_version 658081 (0.0038) [2024-06-24 12:15:38,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.8, 300 sec: 42597.8). Total num frames: 10782081024. Throughput: 0: 42499.7. Samples: 10782151800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-24 12:15:38,396][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 12:15:40,359][15401] Updated weights for policy 0, policy_version 658091 (0.0031) [2024-06-24 12:15:43,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42870.9, 300 sec: 42653.8). Total num frames: 10782277632. Throughput: 0: 42529.1. Samples: 10782407040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:15:43,391][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 12:15:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658098_10782277632.pth... [2024-06-24 12:15:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657474_10772054016.pth [2024-06-24 12:15:44,080][15401] Updated weights for policy 0, policy_version 658101 (0.0037) [2024-06-24 12:15:48,259][15401] Updated weights for policy 0, policy_version 658111 (0.0038) [2024-06-24 12:15:48,390][15132] Fps is (10 sec: 40985.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10782490624. Throughput: 0: 42586.0. Samples: 10782664180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:15:48,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 12:15:51,885][15401] Updated weights for policy 0, policy_version 658121 (0.0040) [2024-06-24 12:15:53,389][15132] Fps is (10 sec: 44240.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10782720000. Throughput: 0: 42602.8. Samples: 10782793200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:15:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 12:15:55,856][15401] Updated weights for policy 0, policy_version 658131 (0.0026) [2024-06-24 12:15:58,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10782916608. Throughput: 0: 42699.7. Samples: 10783051060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:15:58,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-24 12:15:59,569][15401] Updated weights for policy 0, policy_version 658141 (0.0041) [2024-06-24 12:16:00,999][15349] Signal inference workers to stop experience collection... (159600 times) [2024-06-24 12:16:00,999][15349] Signal inference workers to resume experience collection... (159600 times) [2024-06-24 12:16:01,044][15401] InferenceWorker_p0-w0: stopping experience collection (159600 times) [2024-06-24 12:16:01,045][15401] InferenceWorker_p0-w0: resuming experience collection (159600 times) [2024-06-24 12:16:03,271][15401] Updated weights for policy 0, policy_version 658151 (0.0029) [2024-06-24 12:16:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10783145984. Throughput: 0: 42749.0. Samples: 10783305920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 12:16:07,329][15401] Updated weights for policy 0, policy_version 658161 (0.0037) [2024-06-24 12:16:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 10783342592. Throughput: 0: 42719.0. Samples: 10783429920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 12:16:11,328][15401] Updated weights for policy 0, policy_version 658171 (0.0042) [2024-06-24 12:16:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10783571968. Throughput: 0: 42630.9. Samples: 10783689420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:13,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 12:16:14,749][15401] Updated weights for policy 0, policy_version 658181 (0.0039) [2024-06-24 12:16:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42598.8). Total num frames: 10783768576. Throughput: 0: 42893.9. Samples: 10783949220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 12:16:19,123][15401] Updated weights for policy 0, policy_version 658191 (0.0034) [2024-06-24 12:16:22,634][15401] Updated weights for policy 0, policy_version 658201 (0.0025) [2024-06-24 12:16:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10783981568. Throughput: 0: 42745.3. Samples: 10784075060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 12:16:26,554][15401] Updated weights for policy 0, policy_version 658211 (0.0030) [2024-06-24 12:16:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10784210944. Throughput: 0: 42644.8. Samples: 10784326020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 12:16:30,159][15401] Updated weights for policy 0, policy_version 658221 (0.0037) [2024-06-24 12:16:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10784407552. Throughput: 0: 42712.1. Samples: 10784586220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:33,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 12:16:34,455][15401] Updated weights for policy 0, policy_version 658231 (0.0028) [2024-06-24 12:16:37,754][15401] Updated weights for policy 0, policy_version 658241 (0.0037) [2024-06-24 12:16:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 10784636928. Throughput: 0: 42710.1. Samples: 10784715160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 12:16:41,978][15401] Updated weights for policy 0, policy_version 658251 (0.0032) [2024-06-24 12:16:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43145.0, 300 sec: 42765.0). Total num frames: 10784866304. Throughput: 0: 42722.1. Samples: 10784973560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:43,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 12:16:45,623][15401] Updated weights for policy 0, policy_version 658261 (0.0032) [2024-06-24 12:16:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10785062912. Throughput: 0: 42759.9. Samples: 10785230120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:48,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-24 12:16:49,454][15401] Updated weights for policy 0, policy_version 658271 (0.0032) [2024-06-24 12:16:53,162][15401] Updated weights for policy 0, policy_version 658281 (0.0031) [2024-06-24 12:16:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10785275904. Throughput: 0: 42750.6. Samples: 10785353700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:53,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 12:16:57,207][15401] Updated weights for policy 0, policy_version 658291 (0.0031) [2024-06-24 12:16:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10785505280. Throughput: 0: 42793.1. Samples: 10785615100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:16:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 12:17:00,850][15401] Updated weights for policy 0, policy_version 658301 (0.0041) [2024-06-24 12:17:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10785685504. Throughput: 0: 42675.0. Samples: 10785869600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:17:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 12:17:04,935][15401] Updated weights for policy 0, policy_version 658311 (0.0040) [2024-06-24 12:17:08,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 10785914880. Throughput: 0: 42602.7. Samples: 10785992460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:17:08,397][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 12:17:08,457][15401] Updated weights for policy 0, policy_version 658321 (0.0037) [2024-06-24 12:17:12,350][15401] Updated weights for policy 0, policy_version 658331 (0.0039) [2024-06-24 12:17:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 10786127872. Throughput: 0: 42896.9. Samples: 10786256380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:17:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 12:17:15,431][15349] Signal inference workers to stop experience collection... (159650 times) [2024-06-24 12:17:15,475][15401] InferenceWorker_p0-w0: stopping experience collection (159650 times) [2024-06-24 12:17:15,486][15349] Signal inference workers to resume experience collection... (159650 times) [2024-06-24 12:17:15,492][15401] InferenceWorker_p0-w0: resuming experience collection (159650 times) [2024-06-24 12:17:16,066][15401] Updated weights for policy 0, policy_version 658341 (0.0027) [2024-06-24 12:17:18,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 10786340864. Throughput: 0: 42901.2. Samples: 10786516780. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 12:17:19,922][15401] Updated weights for policy 0, policy_version 658351 (0.0035) [2024-06-24 12:17:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10786570240. Throughput: 0: 42743.7. Samples: 10786638620. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 12:17:23,481][15401] Updated weights for policy 0, policy_version 658361 (0.0035) [2024-06-24 12:17:27,534][15401] Updated weights for policy 0, policy_version 658371 (0.0032) [2024-06-24 12:17:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10786783232. Throughput: 0: 42965.0. Samples: 10786906980. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 12:17:30,961][15401] Updated weights for policy 0, policy_version 658381 (0.0028) [2024-06-24 12:17:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10786979840. Throughput: 0: 42951.3. Samples: 10787162920. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 12:17:35,157][15401] Updated weights for policy 0, policy_version 658391 (0.0037) [2024-06-24 12:17:38,390][15132] Fps is (10 sec: 44234.4, 60 sec: 43144.2, 300 sec: 42820.5). Total num frames: 10787225600. Throughput: 0: 43069.4. Samples: 10787291840. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:38,391][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 12:17:38,672][15401] Updated weights for policy 0, policy_version 658401 (0.0035) [2024-06-24 12:17:42,743][15401] Updated weights for policy 0, policy_version 658411 (0.0030) [2024-06-24 12:17:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10787422208. Throughput: 0: 43160.7. Samples: 10787557340. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 12:17:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658413_10787438592.pth... [2024-06-24 12:17:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000657786_10777165824.pth [2024-06-24 12:17:46,334][15401] Updated weights for policy 0, policy_version 658421 (0.0042) [2024-06-24 12:17:48,389][15132] Fps is (10 sec: 40961.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10787635200. Throughput: 0: 42974.7. Samples: 10787803460. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 12:17:50,784][15401] Updated weights for policy 0, policy_version 658431 (0.0047) [2024-06-24 12:17:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10787864576. Throughput: 0: 43113.8. Samples: 10787932300. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 12:17:53,856][15401] Updated weights for policy 0, policy_version 658441 (0.0043) [2024-06-24 12:17:58,088][15401] Updated weights for policy 0, policy_version 658451 (0.0039) [2024-06-24 12:17:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10788061184. Throughput: 0: 43131.0. Samples: 10788197280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:17:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 12:18:01,393][15401] Updated weights for policy 0, policy_version 658461 (0.0036) [2024-06-24 12:18:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10788290560. Throughput: 0: 42960.7. Samples: 10788450000. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 12:18:05,535][15401] Updated weights for policy 0, policy_version 658471 (0.0033) [2024-06-24 12:18:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 10788519936. Throughput: 0: 43100.4. Samples: 10788578140. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 12:18:08,907][15401] Updated weights for policy 0, policy_version 658481 (0.0034) [2024-06-24 12:18:13,106][15401] Updated weights for policy 0, policy_version 658491 (0.0043) [2024-06-24 12:18:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10788716544. Throughput: 0: 42884.4. Samples: 10788836780. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:13,390][15132] Avg episode reward: [(0, '0.184')] [2024-06-24 12:18:17,151][15401] Updated weights for policy 0, policy_version 658501 (0.0028) [2024-06-24 12:18:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10788913152. Throughput: 0: 42594.5. Samples: 10789079680. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 12:18:21,120][15401] Updated weights for policy 0, policy_version 658511 (0.0029) [2024-06-24 12:18:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10789126144. Throughput: 0: 42575.5. Samples: 10789207720. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 12:18:24,714][15401] Updated weights for policy 0, policy_version 658521 (0.0037) [2024-06-24 12:18:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 10789322752. Throughput: 0: 42470.7. Samples: 10789468520. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 12:18:28,883][15401] Updated weights for policy 0, policy_version 658531 (0.0042) [2024-06-24 12:18:32,297][15401] Updated weights for policy 0, policy_version 658541 (0.0044) [2024-06-24 12:18:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10789552128. Throughput: 0: 42618.3. Samples: 10789721280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 12:18:36,477][15401] Updated weights for policy 0, policy_version 658551 (0.0045) [2024-06-24 12:18:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.7, 300 sec: 42765.0). Total num frames: 10789781504. Throughput: 0: 42632.8. Samples: 10789850780. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 12:18:39,865][15401] Updated weights for policy 0, policy_version 658561 (0.0037) [2024-06-24 12:18:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10789978112. Throughput: 0: 42485.3. Samples: 10790109120. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 12:18:44,246][15401] Updated weights for policy 0, policy_version 658571 (0.0027) [2024-06-24 12:18:46,689][15349] Signal inference workers to stop experience collection... (159700 times) [2024-06-24 12:18:46,717][15401] InferenceWorker_p0-w0: stopping experience collection (159700 times) [2024-06-24 12:18:46,755][15349] Signal inference workers to resume experience collection... (159700 times) [2024-06-24 12:18:46,755][15401] InferenceWorker_p0-w0: resuming experience collection (159700 times) [2024-06-24 12:18:47,465][15401] Updated weights for policy 0, policy_version 658581 (0.0033) [2024-06-24 12:18:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 10790191104. Throughput: 0: 42434.1. Samples: 10790359640. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:48,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 12:18:51,726][15401] Updated weights for policy 0, policy_version 658591 (0.0028) [2024-06-24 12:18:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10790420480. Throughput: 0: 42544.9. Samples: 10790492660. Policy #0 lag: (min: 1.0, avg: 11.9, max: 22.0) [2024-06-24 12:18:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 12:18:55,714][15401] Updated weights for policy 0, policy_version 658601 (0.0041) [2024-06-24 12:18:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10790617088. Throughput: 0: 42565.7. Samples: 10790752240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:18:58,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 12:18:59,279][15401] Updated weights for policy 0, policy_version 658611 (0.0023) [2024-06-24 12:19:03,087][15401] Updated weights for policy 0, policy_version 658621 (0.0030) [2024-06-24 12:19:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10790846464. Throughput: 0: 42754.4. Samples: 10791003620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 12:19:06,817][15401] Updated weights for policy 0, policy_version 658631 (0.0027) [2024-06-24 12:19:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10791043072. Throughput: 0: 43030.7. Samples: 10791144100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 12:19:10,553][15401] Updated weights for policy 0, policy_version 658641 (0.0046) [2024-06-24 12:19:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10791256064. Throughput: 0: 42880.1. Samples: 10791398120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 12:19:14,390][15401] Updated weights for policy 0, policy_version 658651 (0.0046) [2024-06-24 12:19:18,101][15401] Updated weights for policy 0, policy_version 658661 (0.0044) [2024-06-24 12:19:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.5). Total num frames: 10791501824. Throughput: 0: 42925.6. Samples: 10791652940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 12:19:21,926][15401] Updated weights for policy 0, policy_version 658671 (0.0030) [2024-06-24 12:19:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10791698432. Throughput: 0: 43117.8. Samples: 10791791080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 12:19:25,597][15401] Updated weights for policy 0, policy_version 658681 (0.0034) [2024-06-24 12:19:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10791911424. Throughput: 0: 43036.5. Samples: 10792045760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 12:19:29,623][15401] Updated weights for policy 0, policy_version 658691 (0.0037) [2024-06-24 12:19:33,174][15401] Updated weights for policy 0, policy_version 658701 (0.0026) [2024-06-24 12:19:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10792157184. Throughput: 0: 43012.9. Samples: 10792295120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 12:19:37,503][15401] Updated weights for policy 0, policy_version 658711 (0.0029) [2024-06-24 12:19:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10792337408. Throughput: 0: 43039.7. Samples: 10792429440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 12:19:40,391][15349] Signal inference workers to stop experience collection... (159750 times) [2024-06-24 12:19:40,439][15401] InferenceWorker_p0-w0: stopping experience collection (159750 times) [2024-06-24 12:19:40,509][15349] Signal inference workers to resume experience collection... (159750 times) [2024-06-24 12:19:40,509][15401] InferenceWorker_p0-w0: resuming experience collection (159750 times) [2024-06-24 12:19:41,118][15401] Updated weights for policy 0, policy_version 658721 (0.0047) [2024-06-24 12:19:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10792566784. Throughput: 0: 42879.1. Samples: 10792681800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 12:19:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658727_10792583168.pth... [2024-06-24 12:19:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658098_10782277632.pth [2024-06-24 12:19:45,284][15401] Updated weights for policy 0, policy_version 658731 (0.0025) [2024-06-24 12:19:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 10792796160. Throughput: 0: 43051.6. Samples: 10792940940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 12:19:48,492][15401] Updated weights for policy 0, policy_version 658741 (0.0034) [2024-06-24 12:19:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10792960000. Throughput: 0: 42738.2. Samples: 10793067320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:53,390][15132] Avg episode reward: [(0, '0.118')] [2024-06-24 12:19:53,543][15401] Updated weights for policy 0, policy_version 658751 (0.0033) [2024-06-24 12:19:56,212][15401] Updated weights for policy 0, policy_version 658761 (0.0043) [2024-06-24 12:19:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 10793222144. Throughput: 0: 42817.4. Samples: 10793324900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:19:58,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 12:20:01,136][15401] Updated weights for policy 0, policy_version 658771 (0.0044) [2024-06-24 12:20:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10793418752. Throughput: 0: 43025.4. Samples: 10793589080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 12:20:04,176][15401] Updated weights for policy 0, policy_version 658781 (0.0029) [2024-06-24 12:20:08,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 10793615360. Throughput: 0: 42645.3. Samples: 10793710220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:08,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 12:20:08,671][15401] Updated weights for policy 0, policy_version 658791 (0.0030) [2024-06-24 12:20:11,883][15401] Updated weights for policy 0, policy_version 658801 (0.0026) [2024-06-24 12:20:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 10793877504. Throughput: 0: 42776.8. Samples: 10793970720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 12:20:16,068][15401] Updated weights for policy 0, policy_version 658811 (0.0037) [2024-06-24 12:20:18,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10794074112. Throughput: 0: 43016.9. Samples: 10794230880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 12:20:19,810][15401] Updated weights for policy 0, policy_version 658821 (0.0038) [2024-06-24 12:20:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10794270720. Throughput: 0: 42714.1. Samples: 10794351580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 12:20:23,497][15401] Updated weights for policy 0, policy_version 658831 (0.0046) [2024-06-24 12:20:27,409][15401] Updated weights for policy 0, policy_version 658841 (0.0034) [2024-06-24 12:20:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10794500096. Throughput: 0: 43081.4. Samples: 10794620460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:20:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 12:20:30,875][15401] Updated weights for policy 0, policy_version 658851 (0.0039) [2024-06-24 12:20:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 10794696704. Throughput: 0: 42984.3. Samples: 10794875240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 12:20:35,151][15401] Updated weights for policy 0, policy_version 658861 (0.0036) [2024-06-24 12:20:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.3, 300 sec: 42876.2). Total num frames: 10794926080. Throughput: 0: 42962.5. Samples: 10795000640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 12:20:38,922][15401] Updated weights for policy 0, policy_version 658871 (0.0034) [2024-06-24 12:20:42,955][15401] Updated weights for policy 0, policy_version 658881 (0.0028) [2024-06-24 12:20:43,393][15132] Fps is (10 sec: 44222.7, 60 sec: 42869.2, 300 sec: 42875.6). Total num frames: 10795139072. Throughput: 0: 42936.3. Samples: 10795257180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:43,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 12:20:46,325][15401] Updated weights for policy 0, policy_version 658891 (0.0034) [2024-06-24 12:20:48,396][15132] Fps is (10 sec: 42571.9, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 10795352064. Throughput: 0: 42777.1. Samples: 10795514320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:48,396][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:20:49,668][15349] Signal inference workers to stop experience collection... (159800 times) [2024-06-24 12:20:49,711][15401] InferenceWorker_p0-w0: stopping experience collection (159800 times) [2024-06-24 12:20:49,718][15349] Signal inference workers to resume experience collection... (159800 times) [2024-06-24 12:20:49,726][15401] InferenceWorker_p0-w0: resuming experience collection (159800 times) [2024-06-24 12:20:50,555][15401] Updated weights for policy 0, policy_version 658901 (0.0033) [2024-06-24 12:20:53,390][15132] Fps is (10 sec: 44250.9, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 10795581440. Throughput: 0: 42923.1. Samples: 10795641660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 12:20:54,380][15401] Updated weights for policy 0, policy_version 658911 (0.0042) [2024-06-24 12:20:57,893][15401] Updated weights for policy 0, policy_version 658921 (0.0034) [2024-06-24 12:20:58,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10795761664. Throughput: 0: 42901.4. Samples: 10795901280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:20:58,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 12:21:01,827][15401] Updated weights for policy 0, policy_version 658931 (0.0035) [2024-06-24 12:21:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10795991040. Throughput: 0: 42849.4. Samples: 10796159100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 12:21:05,339][15401] Updated weights for policy 0, policy_version 658941 (0.0030) [2024-06-24 12:21:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 10796220416. Throughput: 0: 42902.2. Samples: 10796282180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 12:21:09,898][15401] Updated weights for policy 0, policy_version 658951 (0.0037) [2024-06-24 12:21:13,385][15401] Updated weights for policy 0, policy_version 658961 (0.0034) [2024-06-24 12:21:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10796417024. Throughput: 0: 42658.1. Samples: 10796540080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 12:21:17,378][15401] Updated weights for policy 0, policy_version 658971 (0.0034) [2024-06-24 12:21:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 10796613632. Throughput: 0: 42628.5. Samples: 10796793520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:18,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 12:21:20,892][15401] Updated weights for policy 0, policy_version 658981 (0.0034) [2024-06-24 12:21:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10796859392. Throughput: 0: 42629.4. Samples: 10796918960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:23,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-24 12:21:24,840][15401] Updated weights for policy 0, policy_version 658991 (0.0043) [2024-06-24 12:21:28,344][15401] Updated weights for policy 0, policy_version 659001 (0.0046) [2024-06-24 12:21:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10797072384. Throughput: 0: 42896.4. Samples: 10797187380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:28,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 12:21:32,499][15401] Updated weights for policy 0, policy_version 659011 (0.0036) [2024-06-24 12:21:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10797268992. Throughput: 0: 42707.4. Samples: 10797435880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:33,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 12:21:35,967][15401] Updated weights for policy 0, policy_version 659021 (0.0045) [2024-06-24 12:21:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10797498368. Throughput: 0: 42659.2. Samples: 10797561320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 12:21:40,258][15401] Updated weights for policy 0, policy_version 659031 (0.0049) [2024-06-24 12:21:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.6, 300 sec: 42765.0). Total num frames: 10797678592. Throughput: 0: 42799.6. Samples: 10797827260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 12:21:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659039_10797694976.pth... [2024-06-24 12:21:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658413_10787438592.pth [2024-06-24 12:21:43,873][15401] Updated weights for policy 0, policy_version 659041 (0.0034) [2024-06-24 12:21:48,062][15401] Updated weights for policy 0, policy_version 659051 (0.0032) [2024-06-24 12:21:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 10797907968. Throughput: 0: 42542.7. Samples: 10798073520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:48,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 12:21:51,620][15401] Updated weights for policy 0, policy_version 659061 (0.0043) [2024-06-24 12:21:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10798137344. Throughput: 0: 42730.3. Samples: 10798205040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 12:21:55,758][15401] Updated weights for policy 0, policy_version 659071 (0.0036) [2024-06-24 12:21:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10798301184. Throughput: 0: 42737.4. Samples: 10798463260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:21:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 12:21:59,350][15401] Updated weights for policy 0, policy_version 659081 (0.0040) [2024-06-24 12:22:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 10798530560. Throughput: 0: 42615.9. Samples: 10798711240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:22:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 12:22:03,566][15401] Updated weights for policy 0, policy_version 659091 (0.0040) [2024-06-24 12:22:06,942][15401] Updated weights for policy 0, policy_version 659101 (0.0028) [2024-06-24 12:22:07,651][15349] Signal inference workers to stop experience collection... (159850 times) [2024-06-24 12:22:07,652][15349] Signal inference workers to resume experience collection... (159850 times) [2024-06-24 12:22:07,700][15401] InferenceWorker_p0-w0: stopping experience collection (159850 times) [2024-06-24 12:22:07,700][15401] InferenceWorker_p0-w0: resuming experience collection (159850 times) [2024-06-24 12:22:08,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10798776320. Throughput: 0: 42812.0. Samples: 10798845500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 12:22:08,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 12:22:11,269][15401] Updated weights for policy 0, policy_version 659111 (0.0029) [2024-06-24 12:22:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10798956544. Throughput: 0: 42420.5. Samples: 10799096300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 12:22:14,600][15401] Updated weights for policy 0, policy_version 659121 (0.0033) [2024-06-24 12:22:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10799169536. Throughput: 0: 42501.3. Samples: 10799348440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:18,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 12:22:19,024][15401] Updated weights for policy 0, policy_version 659131 (0.0040) [2024-06-24 12:22:22,332][15401] Updated weights for policy 0, policy_version 659141 (0.0041) [2024-06-24 12:22:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10799431680. Throughput: 0: 42637.0. Samples: 10799479980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 12:22:26,607][15401] Updated weights for policy 0, policy_version 659151 (0.0034) [2024-06-24 12:22:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 10799595520. Throughput: 0: 42479.9. Samples: 10799738860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 12:22:30,162][15401] Updated weights for policy 0, policy_version 659161 (0.0035) [2024-06-24 12:22:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10799824896. Throughput: 0: 42491.5. Samples: 10799985640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 12:22:34,170][15401] Updated weights for policy 0, policy_version 659171 (0.0029) [2024-06-24 12:22:37,755][15401] Updated weights for policy 0, policy_version 659181 (0.0037) [2024-06-24 12:22:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10800037888. Throughput: 0: 42535.7. Samples: 10800119140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 12:22:41,772][15401] Updated weights for policy 0, policy_version 659191 (0.0035) [2024-06-24 12:22:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10800234496. Throughput: 0: 42613.4. Samples: 10800380860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 12:22:45,206][15401] Updated weights for policy 0, policy_version 659201 (0.0023) [2024-06-24 12:22:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10800480256. Throughput: 0: 42764.6. Samples: 10800635640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 12:22:49,263][15401] Updated weights for policy 0, policy_version 659211 (0.0031) [2024-06-24 12:22:52,873][15401] Updated weights for policy 0, policy_version 659221 (0.0033) [2024-06-24 12:22:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10800676864. Throughput: 0: 42701.5. Samples: 10800767060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:53,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 12:22:56,828][15401] Updated weights for policy 0, policy_version 659231 (0.0040) [2024-06-24 12:22:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10800889856. Throughput: 0: 42897.7. Samples: 10801026700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:22:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 12:23:00,380][15401] Updated weights for policy 0, policy_version 659241 (0.0038) [2024-06-24 12:23:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10801135616. Throughput: 0: 42847.1. Samples: 10801276560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:03,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 12:23:04,344][15401] Updated weights for policy 0, policy_version 659251 (0.0033) [2024-06-24 12:23:08,106][15401] Updated weights for policy 0, policy_version 659261 (0.0048) [2024-06-24 12:23:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10801332224. Throughput: 0: 42850.5. Samples: 10801408260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:08,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 12:23:12,005][15401] Updated weights for policy 0, policy_version 659271 (0.0025) [2024-06-24 12:23:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10801528832. Throughput: 0: 42714.4. Samples: 10801661000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 12:23:16,106][15401] Updated weights for policy 0, policy_version 659281 (0.0026) [2024-06-24 12:23:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10801774592. Throughput: 0: 42846.6. Samples: 10801913740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 12:23:19,734][15401] Updated weights for policy 0, policy_version 659291 (0.0029) [2024-06-24 12:23:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 10801954816. Throughput: 0: 42849.8. Samples: 10802047380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 12:23:23,867][15401] Updated weights for policy 0, policy_version 659301 (0.0030) [2024-06-24 12:23:27,396][15401] Updated weights for policy 0, policy_version 659311 (0.0037) [2024-06-24 12:23:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10802167808. Throughput: 0: 42612.9. Samples: 10802298440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 12:23:31,583][15401] Updated weights for policy 0, policy_version 659321 (0.0036) [2024-06-24 12:23:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10802397184. Throughput: 0: 42657.6. Samples: 10802555240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:33,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 12:23:35,295][15401] Updated weights for policy 0, policy_version 659331 (0.0034) [2024-06-24 12:23:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10802577408. Throughput: 0: 42660.8. Samples: 10802686800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 12:23:39,235][15401] Updated weights for policy 0, policy_version 659341 (0.0032) [2024-06-24 12:23:43,341][15401] Updated weights for policy 0, policy_version 659351 (0.0041) [2024-06-24 12:23:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 10802806784. Throughput: 0: 42469.7. Samples: 10802937940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 12:23:43,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 12:23:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659351_10802806784.pth... [2024-06-24 12:23:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000658727_10792583168.pth [2024-06-24 12:23:46,899][15401] Updated weights for policy 0, policy_version 659361 (0.0031) [2024-06-24 12:23:47,268][15349] Signal inference workers to stop experience collection... (159900 times) [2024-06-24 12:23:47,269][15349] Signal inference workers to resume experience collection... (159900 times) [2024-06-24 12:23:47,286][15401] InferenceWorker_p0-w0: stopping experience collection (159900 times) [2024-06-24 12:23:47,286][15401] InferenceWorker_p0-w0: resuming experience collection (159900 times) [2024-06-24 12:23:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10803036160. Throughput: 0: 42510.8. Samples: 10803189540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:23:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 12:23:50,852][15401] Updated weights for policy 0, policy_version 659371 (0.0028) [2024-06-24 12:23:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10803216384. Throughput: 0: 42604.2. Samples: 10803325440. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:23:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 12:23:54,450][15401] Updated weights for policy 0, policy_version 659381 (0.0031) [2024-06-24 12:23:58,329][15401] Updated weights for policy 0, policy_version 659391 (0.0036) [2024-06-24 12:23:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10803462144. Throughput: 0: 42719.9. Samples: 10803583400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:23:58,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 12:24:01,998][15401] Updated weights for policy 0, policy_version 659401 (0.0029) [2024-06-24 12:24:03,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 10803707904. Throughput: 0: 42742.0. Samples: 10803837120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:03,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 12:24:05,802][15401] Updated weights for policy 0, policy_version 659411 (0.0034) [2024-06-24 12:24:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10803871744. Throughput: 0: 42832.9. Samples: 10803974860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:08,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 12:24:09,443][15401] Updated weights for policy 0, policy_version 659421 (0.0027) [2024-06-24 12:24:13,319][15401] Updated weights for policy 0, policy_version 659431 (0.0031) [2024-06-24 12:24:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10804117504. Throughput: 0: 42967.1. Samples: 10804231960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 12:24:17,128][15401] Updated weights for policy 0, policy_version 659441 (0.0045) [2024-06-24 12:24:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10804330496. Throughput: 0: 42752.0. Samples: 10804479080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 12:24:21,329][15401] Updated weights for policy 0, policy_version 659451 (0.0034) [2024-06-24 12:24:23,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 10804510720. Throughput: 0: 42717.7. Samples: 10804609200. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:23,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 12:24:24,972][15401] Updated weights for policy 0, policy_version 659461 (0.0032) [2024-06-24 12:24:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10804756480. Throughput: 0: 42874.3. Samples: 10804867180. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 12:24:28,984][15401] Updated weights for policy 0, policy_version 659471 (0.0031) [2024-06-24 12:24:32,750][15401] Updated weights for policy 0, policy_version 659481 (0.0037) [2024-06-24 12:24:33,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 10804969472. Throughput: 0: 43013.3. Samples: 10805125140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 12:24:36,456][15401] Updated weights for policy 0, policy_version 659491 (0.0039) [2024-06-24 12:24:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10805166080. Throughput: 0: 42919.8. Samples: 10805256840. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 12:24:40,516][15401] Updated weights for policy 0, policy_version 659501 (0.0030) [2024-06-24 12:24:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43146.4, 300 sec: 42709.5). Total num frames: 10805395456. Throughput: 0: 42915.3. Samples: 10805514580. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:43,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 12:24:43,801][15401] Updated weights for policy 0, policy_version 659511 (0.0037) [2024-06-24 12:24:47,861][15401] Updated weights for policy 0, policy_version 659521 (0.0030) [2024-06-24 12:24:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10805608448. Throughput: 0: 43149.8. Samples: 10805778860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 12:24:51,264][15401] Updated weights for policy 0, policy_version 659531 (0.0028) [2024-06-24 12:24:53,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43417.5, 300 sec: 42709.4). Total num frames: 10805821440. Throughput: 0: 42858.5. Samples: 10805903500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 12:24:55,351][15401] Updated weights for policy 0, policy_version 659541 (0.0043) [2024-06-24 12:24:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10806050816. Throughput: 0: 42827.6. Samples: 10806159200. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:24:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 12:24:59,101][15401] Updated weights for policy 0, policy_version 659551 (0.0032) [2024-06-24 12:25:03,067][15401] Updated weights for policy 0, policy_version 659561 (0.0044) [2024-06-24 12:25:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 10806247424. Throughput: 0: 43003.6. Samples: 10806414240. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:25:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 12:25:06,667][15401] Updated weights for policy 0, policy_version 659571 (0.0024) [2024-06-24 12:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 10806460416. Throughput: 0: 42914.0. Samples: 10806540220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:25:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 12:25:10,497][15401] Updated weights for policy 0, policy_version 659581 (0.0038) [2024-06-24 12:25:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10806673408. Throughput: 0: 42948.7. Samples: 10806799880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:25:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 12:25:14,248][15401] Updated weights for policy 0, policy_version 659591 (0.0032) [2024-06-24 12:25:18,149][15401] Updated weights for policy 0, policy_version 659601 (0.0035) [2024-06-24 12:25:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10806902784. Throughput: 0: 42869.2. Samples: 10807054260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:25:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 12:25:21,045][15349] Signal inference workers to stop experience collection... (159950 times) [2024-06-24 12:25:21,088][15401] InferenceWorker_p0-w0: stopping experience collection (159950 times) [2024-06-24 12:25:21,111][15349] Signal inference workers to resume experience collection... (159950 times) [2024-06-24 12:25:21,116][15401] InferenceWorker_p0-w0: resuming experience collection (159950 times) [2024-06-24 12:25:22,022][15401] Updated weights for policy 0, policy_version 659611 (0.0039) [2024-06-24 12:25:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 10807115776. Throughput: 0: 42884.6. Samples: 10807186640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 12:25:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 12:25:26,283][15401] Updated weights for policy 0, policy_version 659621 (0.0034) [2024-06-24 12:25:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10807312384. Throughput: 0: 42785.7. Samples: 10807439940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 12:25:29,915][15401] Updated weights for policy 0, policy_version 659631 (0.0038) [2024-06-24 12:25:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10807541760. Throughput: 0: 42682.6. Samples: 10807699580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 12:25:33,965][15401] Updated weights for policy 0, policy_version 659641 (0.0028) [2024-06-24 12:25:37,487][15401] Updated weights for policy 0, policy_version 659651 (0.0037) [2024-06-24 12:25:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42765.5). Total num frames: 10807754752. Throughput: 0: 42902.7. Samples: 10807834120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 12:25:41,527][15401] Updated weights for policy 0, policy_version 659661 (0.0032) [2024-06-24 12:25:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 10807967744. Throughput: 0: 42906.2. Samples: 10808089980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:43,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 12:25:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659667_10807984128.pth... [2024-06-24 12:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659039_10797694976.pth [2024-06-24 12:25:44,947][15401] Updated weights for policy 0, policy_version 659671 (0.0036) [2024-06-24 12:25:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10808180736. Throughput: 0: 42988.5. Samples: 10808348720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:48,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-24 12:25:49,252][15401] Updated weights for policy 0, policy_version 659681 (0.0033) [2024-06-24 12:25:52,629][15401] Updated weights for policy 0, policy_version 659691 (0.0036) [2024-06-24 12:25:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10808393728. Throughput: 0: 42995.5. Samples: 10808475020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:25:56,799][15401] Updated weights for policy 0, policy_version 659701 (0.0039) [2024-06-24 12:25:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10808623104. Throughput: 0: 43009.1. Samples: 10808735280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:25:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 12:26:00,356][15401] Updated weights for policy 0, policy_version 659711 (0.0030) [2024-06-24 12:26:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10808819712. Throughput: 0: 43037.4. Samples: 10808990940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:03,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-24 12:26:04,356][15401] Updated weights for policy 0, policy_version 659721 (0.0029) [2024-06-24 12:26:07,932][15401] Updated weights for policy 0, policy_version 659731 (0.0032) [2024-06-24 12:26:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10809032704. Throughput: 0: 43012.4. Samples: 10809122200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:08,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-24 12:26:11,876][15401] Updated weights for policy 0, policy_version 659741 (0.0031) [2024-06-24 12:26:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10809245696. Throughput: 0: 43047.6. Samples: 10809377080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 12:26:15,990][15401] Updated weights for policy 0, policy_version 659751 (0.0034) [2024-06-24 12:26:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10809475072. Throughput: 0: 43015.6. Samples: 10809635280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:18,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 12:26:19,318][15401] Updated weights for policy 0, policy_version 659761 (0.0036) [2024-06-24 12:26:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10809671680. Throughput: 0: 43021.3. Samples: 10809770080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 12:26:23,741][15401] Updated weights for policy 0, policy_version 659771 (0.0051) [2024-06-24 12:26:26,976][15401] Updated weights for policy 0, policy_version 659781 (0.0031) [2024-06-24 12:26:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10809917440. Throughput: 0: 42970.1. Samples: 10810023640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:28,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 12:26:31,407][15401] Updated weights for policy 0, policy_version 659791 (0.0038) [2024-06-24 12:26:31,763][15349] Signal inference workers to stop experience collection... (160000 times) [2024-06-24 12:26:31,763][15349] Signal inference workers to resume experience collection... (160000 times) [2024-06-24 12:26:31,812][15401] InferenceWorker_p0-w0: stopping experience collection (160000 times) [2024-06-24 12:26:31,812][15401] InferenceWorker_p0-w0: resuming experience collection (160000 times) [2024-06-24 12:26:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10810130432. Throughput: 0: 42886.2. Samples: 10810278600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 12:26:34,627][15401] Updated weights for policy 0, policy_version 659801 (0.0027) [2024-06-24 12:26:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10810310656. Throughput: 0: 43074.1. Samples: 10810413360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 12:26:39,042][15401] Updated weights for policy 0, policy_version 659811 (0.0032) [2024-06-24 12:26:42,384][15401] Updated weights for policy 0, policy_version 659821 (0.0038) [2024-06-24 12:26:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10810556416. Throughput: 0: 43015.1. Samples: 10810670960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:43,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 12:26:46,421][15401] Updated weights for policy 0, policy_version 659831 (0.0033) [2024-06-24 12:26:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10810753024. Throughput: 0: 43033.4. Samples: 10810927440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 12:26:49,940][15401] Updated weights for policy 0, policy_version 659841 (0.0032) [2024-06-24 12:26:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10810966016. Throughput: 0: 42959.2. Samples: 10811055360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 12:26:54,259][15401] Updated weights for policy 0, policy_version 659851 (0.0031) [2024-06-24 12:26:57,328][15401] Updated weights for policy 0, policy_version 659861 (0.0030) [2024-06-24 12:26:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 10811211776. Throughput: 0: 42966.1. Samples: 10811310560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 12:26:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 12:27:02,118][15401] Updated weights for policy 0, policy_version 659871 (0.0035) [2024-06-24 12:27:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10811392000. Throughput: 0: 42927.4. Samples: 10811567020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 12:27:04,753][15401] Updated weights for policy 0, policy_version 659881 (0.0032) [2024-06-24 12:27:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10811621376. Throughput: 0: 42842.7. Samples: 10811698000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:08,393][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 12:27:09,584][15401] Updated weights for policy 0, policy_version 659891 (0.0040) [2024-06-24 12:27:12,402][15401] Updated weights for policy 0, policy_version 659901 (0.0028) [2024-06-24 12:27:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10811834368. Throughput: 0: 42773.0. Samples: 10811948420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:27:17,186][15401] Updated weights for policy 0, policy_version 659911 (0.0029) [2024-06-24 12:27:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10812030976. Throughput: 0: 42933.5. Samples: 10812210600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 12:27:19,973][15401] Updated weights for policy 0, policy_version 659921 (0.0029) [2024-06-24 12:27:23,391][15132] Fps is (10 sec: 42593.5, 60 sec: 43143.8, 300 sec: 42931.5). Total num frames: 10812260352. Throughput: 0: 42787.0. Samples: 10812338820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:23,391][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 12:27:24,857][15401] Updated weights for policy 0, policy_version 659931 (0.0033) [2024-06-24 12:27:27,923][15401] Updated weights for policy 0, policy_version 659941 (0.0042) [2024-06-24 12:27:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10812473344. Throughput: 0: 42699.4. Samples: 10812592440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 12:27:32,290][15401] Updated weights for policy 0, policy_version 659951 (0.0027) [2024-06-24 12:27:33,390][15132] Fps is (10 sec: 40964.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 10812669952. Throughput: 0: 42820.3. Samples: 10812854360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 12:27:35,716][15401] Updated weights for policy 0, policy_version 659961 (0.0022) [2024-06-24 12:27:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10812882944. Throughput: 0: 42771.5. Samples: 10812980080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 12:27:40,060][15401] Updated weights for policy 0, policy_version 659971 (0.0036) [2024-06-24 12:27:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 10813112320. Throughput: 0: 42796.9. Samples: 10813236420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 12:27:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659981_10813128704.pth... [2024-06-24 12:27:43,464][15401] Updated weights for policy 0, policy_version 659981 (0.0038) [2024-06-24 12:27:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659351_10802806784.pth [2024-06-24 12:27:47,503][15401] Updated weights for policy 0, policy_version 659991 (0.0022) [2024-06-24 12:27:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10813325312. Throughput: 0: 42873.0. Samples: 10813496300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:48,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 12:27:51,005][15401] Updated weights for policy 0, policy_version 660001 (0.0034) [2024-06-24 12:27:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10813521920. Throughput: 0: 42741.8. Samples: 10813621380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 12:27:55,095][15401] Updated weights for policy 0, policy_version 660011 (0.0036) [2024-06-24 12:27:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10813767680. Throughput: 0: 43007.2. Samples: 10813883740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:27:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 12:27:58,591][15401] Updated weights for policy 0, policy_version 660021 (0.0035) [2024-06-24 12:28:02,134][15349] Signal inference workers to stop experience collection... (160050 times) [2024-06-24 12:28:02,135][15349] Signal inference workers to resume experience collection... (160050 times) [2024-06-24 12:28:02,156][15401] InferenceWorker_p0-w0: stopping experience collection (160050 times) [2024-06-24 12:28:02,156][15401] InferenceWorker_p0-w0: resuming experience collection (160050 times) [2024-06-24 12:28:02,891][15401] Updated weights for policy 0, policy_version 660031 (0.0026) [2024-06-24 12:28:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10813980672. Throughput: 0: 42821.6. Samples: 10814137580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 12:28:06,082][15401] Updated weights for policy 0, policy_version 660041 (0.0039) [2024-06-24 12:28:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 10814177280. Throughput: 0: 42676.6. Samples: 10814259320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:08,393][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 12:28:10,274][15401] Updated weights for policy 0, policy_version 660051 (0.0030) [2024-06-24 12:28:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10814406656. Throughput: 0: 42928.5. Samples: 10814524220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:13,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 12:28:14,148][15401] Updated weights for policy 0, policy_version 660061 (0.0031) [2024-06-24 12:28:17,734][15401] Updated weights for policy 0, policy_version 660071 (0.0035) [2024-06-24 12:28:18,390][15132] Fps is (10 sec: 45886.3, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 10814636032. Throughput: 0: 42777.4. Samples: 10814779340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 12:28:21,739][15401] Updated weights for policy 0, policy_version 660081 (0.0040) [2024-06-24 12:28:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42599.2, 300 sec: 42876.1). Total num frames: 10814816256. Throughput: 0: 42748.0. Samples: 10814903740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 12:28:25,333][15401] Updated weights for policy 0, policy_version 660091 (0.0023) [2024-06-24 12:28:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10815045632. Throughput: 0: 42864.0. Samples: 10815165300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:28:29,208][15401] Updated weights for policy 0, policy_version 660101 (0.0033) [2024-06-24 12:28:32,885][15401] Updated weights for policy 0, policy_version 660111 (0.0038) [2024-06-24 12:28:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 10815275008. Throughput: 0: 42740.8. Samples: 10815419640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 12:28:36,899][15401] Updated weights for policy 0, policy_version 660121 (0.0034) [2024-06-24 12:28:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 10815471616. Throughput: 0: 42873.9. Samples: 10815550700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-24 12:28:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 12:28:40,496][15401] Updated weights for policy 0, policy_version 660131 (0.0040) [2024-06-24 12:28:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10815684608. Throughput: 0: 42831.9. Samples: 10815811180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:28:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 12:28:44,958][15401] Updated weights for policy 0, policy_version 660141 (0.0041) [2024-06-24 12:28:48,127][15401] Updated weights for policy 0, policy_version 660151 (0.0038) [2024-06-24 12:28:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 10815930368. Throughput: 0: 42814.7. Samples: 10816064240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:28:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 12:28:52,570][15401] Updated weights for policy 0, policy_version 660161 (0.0045) [2024-06-24 12:28:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10816126976. Throughput: 0: 43064.1. Samples: 10816197100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:28:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 12:28:55,735][15401] Updated weights for policy 0, policy_version 660171 (0.0035) [2024-06-24 12:28:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10816323584. Throughput: 0: 42918.1. Samples: 10816455540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:28:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 12:29:00,190][15401] Updated weights for policy 0, policy_version 660181 (0.0042) [2024-06-24 12:29:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 10816552960. Throughput: 0: 42953.9. Samples: 10816712260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 12:29:03,683][15401] Updated weights for policy 0, policy_version 660191 (0.0026) [2024-06-24 12:29:07,776][15401] Updated weights for policy 0, policy_version 660201 (0.0041) [2024-06-24 12:29:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 10816765952. Throughput: 0: 43001.3. Samples: 10816838800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 12:29:11,463][15401] Updated weights for policy 0, policy_version 660211 (0.0025) [2024-06-24 12:29:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10816962560. Throughput: 0: 42684.1. Samples: 10817086080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 12:29:15,355][15401] Updated weights for policy 0, policy_version 660221 (0.0037) [2024-06-24 12:29:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42876.5). Total num frames: 10817159168. Throughput: 0: 42834.4. Samples: 10817347180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 12:29:19,154][15401] Updated weights for policy 0, policy_version 660231 (0.0028) [2024-06-24 12:29:23,065][15401] Updated weights for policy 0, policy_version 660241 (0.0036) [2024-06-24 12:29:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10817404928. Throughput: 0: 42678.2. Samples: 10817471220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 12:29:26,742][15401] Updated weights for policy 0, policy_version 660251 (0.0042) [2024-06-24 12:29:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10817601536. Throughput: 0: 42776.1. Samples: 10817736100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 12:29:30,541][15401] Updated weights for policy 0, policy_version 660261 (0.0027) [2024-06-24 12:29:31,934][15349] Signal inference workers to stop experience collection... (160100 times) [2024-06-24 12:29:31,983][15401] InferenceWorker_p0-w0: stopping experience collection (160100 times) [2024-06-24 12:29:32,051][15349] Signal inference workers to resume experience collection... (160100 times) [2024-06-24 12:29:32,051][15401] InferenceWorker_p0-w0: resuming experience collection (160100 times) [2024-06-24 12:29:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 10817814528. Throughput: 0: 42825.4. Samples: 10817991380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 12:29:34,485][15401] Updated weights for policy 0, policy_version 660271 (0.0030) [2024-06-24 12:29:38,343][15401] Updated weights for policy 0, policy_version 660281 (0.0035) [2024-06-24 12:29:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10818043904. Throughput: 0: 42667.6. Samples: 10818117140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 12:29:42,068][15401] Updated weights for policy 0, policy_version 660291 (0.0039) [2024-06-24 12:29:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10818256896. Throughput: 0: 42666.7. Samples: 10818375540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 12:29:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660294_10818256896.pth... [2024-06-24 12:29:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659667_10807984128.pth [2024-06-24 12:29:46,277][15401] Updated weights for policy 0, policy_version 660301 (0.0040) [2024-06-24 12:29:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10818469888. Throughput: 0: 42561.3. Samples: 10818627520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 12:29:49,622][15401] Updated weights for policy 0, policy_version 660311 (0.0030) [2024-06-24 12:29:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 10818650112. Throughput: 0: 42638.6. Samples: 10818757540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 12:29:54,075][15401] Updated weights for policy 0, policy_version 660321 (0.0028) [2024-06-24 12:29:57,145][15401] Updated weights for policy 0, policy_version 660331 (0.0046) [2024-06-24 12:29:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10818895872. Throughput: 0: 42746.2. Samples: 10819009660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:29:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 12:30:01,810][15401] Updated weights for policy 0, policy_version 660341 (0.0032) [2024-06-24 12:30:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10819108864. Throughput: 0: 42552.4. Samples: 10819262040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:30:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 12:30:05,458][15401] Updated weights for policy 0, policy_version 660351 (0.0027) [2024-06-24 12:30:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10819305472. Throughput: 0: 42736.8. Samples: 10819394380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:30:08,390][15132] Avg episode reward: [(0, '0.182')] [2024-06-24 12:30:09,690][15401] Updated weights for policy 0, policy_version 660361 (0.0043) [2024-06-24 12:30:12,764][15401] Updated weights for policy 0, policy_version 660371 (0.0049) [2024-06-24 12:30:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10819534848. Throughput: 0: 42512.0. Samples: 10819649140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-24 12:30:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 12:30:17,296][15401] Updated weights for policy 0, policy_version 660381 (0.0032) [2024-06-24 12:30:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10819747840. Throughput: 0: 42476.0. Samples: 10819902800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 12:30:20,680][15401] Updated weights for policy 0, policy_version 660391 (0.0041) [2024-06-24 12:30:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10819944448. Throughput: 0: 42516.4. Samples: 10820030380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 12:30:25,257][15401] Updated weights for policy 0, policy_version 660401 (0.0046) [2024-06-24 12:30:28,350][15401] Updated weights for policy 0, policy_version 660411 (0.0032) [2024-06-24 12:30:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10820173824. Throughput: 0: 42367.3. Samples: 10820282060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 12:30:32,852][15401] Updated weights for policy 0, policy_version 660421 (0.0041) [2024-06-24 12:30:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10820370432. Throughput: 0: 42546.7. Samples: 10820542120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:33,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 12:30:35,851][15401] Updated weights for policy 0, policy_version 660431 (0.0036) [2024-06-24 12:30:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10820583424. Throughput: 0: 42449.5. Samples: 10820667760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 12:30:40,271][15401] Updated weights for policy 0, policy_version 660441 (0.0042) [2024-06-24 12:30:43,350][15401] Updated weights for policy 0, policy_version 660451 (0.0025) [2024-06-24 12:30:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10820829184. Throughput: 0: 42652.8. Samples: 10820929040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 12:30:47,942][15401] Updated weights for policy 0, policy_version 660461 (0.0031) [2024-06-24 12:30:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 10821009408. Throughput: 0: 42809.8. Samples: 10821188580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:48,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 12:30:50,891][15401] Updated weights for policy 0, policy_version 660471 (0.0030) [2024-06-24 12:30:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10821238784. Throughput: 0: 42603.0. Samples: 10821311520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 12:30:53,773][15349] Signal inference workers to stop experience collection... (160150 times) [2024-06-24 12:30:53,773][15349] Signal inference workers to resume experience collection... (160150 times) [2024-06-24 12:30:53,816][15401] InferenceWorker_p0-w0: stopping experience collection (160150 times) [2024-06-24 12:30:53,816][15401] InferenceWorker_p0-w0: resuming experience collection (160150 times) [2024-06-24 12:30:55,471][15401] Updated weights for policy 0, policy_version 660481 (0.0041) [2024-06-24 12:30:58,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10821468160. Throughput: 0: 42744.4. Samples: 10821572640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:30:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 12:30:58,589][15401] Updated weights for policy 0, policy_version 660491 (0.0038) [2024-06-24 12:31:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 10821632000. Throughput: 0: 42793.4. Samples: 10821828500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 12:31:03,606][15401] Updated weights for policy 0, policy_version 660501 (0.0038) [2024-06-24 12:31:06,137][15401] Updated weights for policy 0, policy_version 660511 (0.0034) [2024-06-24 12:31:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10821877760. Throughput: 0: 42611.1. Samples: 10821947880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 12:31:11,268][15401] Updated weights for policy 0, policy_version 660521 (0.0031) [2024-06-24 12:31:13,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10822123520. Throughput: 0: 42908.8. Samples: 10822212960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 12:31:13,751][15401] Updated weights for policy 0, policy_version 660531 (0.0026) [2024-06-24 12:31:18,393][15132] Fps is (10 sec: 39308.7, 60 sec: 42049.9, 300 sec: 42709.0). Total num frames: 10822270976. Throughput: 0: 42878.9. Samples: 10822471820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:18,393][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 12:31:18,783][15401] Updated weights for policy 0, policy_version 660541 (0.0025) [2024-06-24 12:31:21,404][15401] Updated weights for policy 0, policy_version 660551 (0.0042) [2024-06-24 12:31:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10822516736. Throughput: 0: 42623.9. Samples: 10822585840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 12:31:26,778][15401] Updated weights for policy 0, policy_version 660561 (0.0028) [2024-06-24 12:31:28,389][15132] Fps is (10 sec: 47530.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10822746112. Throughput: 0: 42728.1. Samples: 10822851800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 12:31:29,269][15401] Updated weights for policy 0, policy_version 660571 (0.0027) [2024-06-24 12:31:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10822909952. Throughput: 0: 42721.2. Samples: 10823110940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 12:31:34,308][15401] Updated weights for policy 0, policy_version 660581 (0.0031) [2024-06-24 12:31:36,761][15401] Updated weights for policy 0, policy_version 660591 (0.0043) [2024-06-24 12:31:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10823155712. Throughput: 0: 42642.4. Samples: 10823230420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 12:31:41,750][15401] Updated weights for policy 0, policy_version 660601 (0.0034) [2024-06-24 12:31:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10823368704. Throughput: 0: 42726.7. Samples: 10823495340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 12:31:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660607_10823385088.pth... [2024-06-24 12:31:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000659981_10813128704.pth [2024-06-24 12:31:44,774][15401] Updated weights for policy 0, policy_version 660611 (0.0031) [2024-06-24 12:31:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 10823565312. Throughput: 0: 42696.8. Samples: 10823749860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:48,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 12:31:49,208][15401] Updated weights for policy 0, policy_version 660621 (0.0041) [2024-06-24 12:31:52,326][15401] Updated weights for policy 0, policy_version 660631 (0.0028) [2024-06-24 12:31:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10823811072. Throughput: 0: 42893.5. Samples: 10823878080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 12:31:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:31:57,029][15401] Updated weights for policy 0, policy_version 660641 (0.0035) [2024-06-24 12:31:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10824024064. Throughput: 0: 42709.9. Samples: 10824134900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:31:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 12:32:00,254][15401] Updated weights for policy 0, policy_version 660651 (0.0022) [2024-06-24 12:32:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10824220672. Throughput: 0: 42621.4. Samples: 10824389640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 12:32:04,664][15401] Updated weights for policy 0, policy_version 660661 (0.0040) [2024-06-24 12:32:05,526][15349] Signal inference workers to stop experience collection... (160200 times) [2024-06-24 12:32:05,574][15401] InferenceWorker_p0-w0: stopping experience collection (160200 times) [2024-06-24 12:32:05,574][15349] Signal inference workers to resume experience collection... (160200 times) [2024-06-24 12:32:05,592][15401] InferenceWorker_p0-w0: resuming experience collection (160200 times) [2024-06-24 12:32:07,657][15401] Updated weights for policy 0, policy_version 660671 (0.0040) [2024-06-24 12:32:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10824450048. Throughput: 0: 42863.0. Samples: 10824514680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 12:32:12,343][15401] Updated weights for policy 0, policy_version 660681 (0.0034) [2024-06-24 12:32:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 10824663040. Throughput: 0: 42915.0. Samples: 10824782980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 12:32:15,391][15401] Updated weights for policy 0, policy_version 660691 (0.0032) [2024-06-24 12:32:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43147.0, 300 sec: 42709.7). Total num frames: 10824859648. Throughput: 0: 42877.5. Samples: 10825040420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 12:32:19,788][15401] Updated weights for policy 0, policy_version 660701 (0.0027) [2024-06-24 12:32:22,994][15401] Updated weights for policy 0, policy_version 660711 (0.0041) [2024-06-24 12:32:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10825105408. Throughput: 0: 42986.2. Samples: 10825164800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 12:32:27,266][15401] Updated weights for policy 0, policy_version 660721 (0.0033) [2024-06-24 12:32:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 10825302016. Throughput: 0: 42871.5. Samples: 10825424560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 12:32:30,663][15401] Updated weights for policy 0, policy_version 660731 (0.0026) [2024-06-24 12:32:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10825498624. Throughput: 0: 42961.8. Samples: 10825683140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 12:32:34,909][15401] Updated weights for policy 0, policy_version 660741 (0.0047) [2024-06-24 12:32:38,333][15401] Updated weights for policy 0, policy_version 660751 (0.0030) [2024-06-24 12:32:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10825744384. Throughput: 0: 42950.7. Samples: 10825810860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 12:32:42,577][15401] Updated weights for policy 0, policy_version 660761 (0.0034) [2024-06-24 12:32:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10825940992. Throughput: 0: 42990.6. Samples: 10826069480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 12:32:45,926][15401] Updated weights for policy 0, policy_version 660771 (0.0039) [2024-06-24 12:32:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10826137600. Throughput: 0: 43140.0. Samples: 10826330940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 12:32:49,993][15401] Updated weights for policy 0, policy_version 660781 (0.0042) [2024-06-24 12:32:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10826383360. Throughput: 0: 43203.1. Samples: 10826458820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 12:32:53,929][15401] Updated weights for policy 0, policy_version 660791 (0.0025) [2024-06-24 12:32:57,550][15401] Updated weights for policy 0, policy_version 660801 (0.0050) [2024-06-24 12:32:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10826596352. Throughput: 0: 43020.9. Samples: 10826718920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:32:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 12:33:01,460][15401] Updated weights for policy 0, policy_version 660811 (0.0036) [2024-06-24 12:33:03,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42866.9, 300 sec: 42764.4). Total num frames: 10826792960. Throughput: 0: 42884.5. Samples: 10826970500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:03,396][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 12:33:05,361][15401] Updated weights for policy 0, policy_version 660821 (0.0033) [2024-06-24 12:33:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 10827038720. Throughput: 0: 43020.5. Samples: 10827100720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 12:33:08,950][15401] Updated weights for policy 0, policy_version 660831 (0.0029) [2024-06-24 12:33:13,107][15401] Updated weights for policy 0, policy_version 660841 (0.0048) [2024-06-24 12:33:13,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10827218944. Throughput: 0: 43062.7. Samples: 10827362380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 12:33:16,590][15401] Updated weights for policy 0, policy_version 660851 (0.0041) [2024-06-24 12:33:18,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 10827448320. Throughput: 0: 42820.4. Samples: 10827610160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:18,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 12:33:20,679][15401] Updated weights for policy 0, policy_version 660861 (0.0034) [2024-06-24 12:33:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10827661312. Throughput: 0: 42963.4. Samples: 10827744220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 12:33:24,117][15401] Updated weights for policy 0, policy_version 660871 (0.0033) [2024-06-24 12:33:28,296][15401] Updated weights for policy 0, policy_version 660881 (0.0044) [2024-06-24 12:33:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10827874304. Throughput: 0: 42807.0. Samples: 10827995800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-24 12:33:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 12:33:31,819][15401] Updated weights for policy 0, policy_version 660891 (0.0040) [2024-06-24 12:33:33,391][15132] Fps is (10 sec: 44232.5, 60 sec: 43416.8, 300 sec: 42820.4). Total num frames: 10828103680. Throughput: 0: 42603.8. Samples: 10828248160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:33,391][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 12:33:36,088][15401] Updated weights for policy 0, policy_version 660901 (0.0048) [2024-06-24 12:33:37,241][15349] Signal inference workers to stop experience collection... (160250 times) [2024-06-24 12:33:37,291][15401] InferenceWorker_p0-w0: stopping experience collection (160250 times) [2024-06-24 12:33:37,297][15349] Signal inference workers to resume experience collection... (160250 times) [2024-06-24 12:33:37,313][15401] InferenceWorker_p0-w0: resuming experience collection (160250 times) [2024-06-24 12:33:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10828300288. Throughput: 0: 42618.8. Samples: 10828376660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 12:33:39,656][15401] Updated weights for policy 0, policy_version 660911 (0.0034) [2024-06-24 12:33:43,390][15132] Fps is (10 sec: 39325.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10828496896. Throughput: 0: 42498.7. Samples: 10828631360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 12:33:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660920_10828513280.pth... [2024-06-24 12:33:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660294_10818256896.pth [2024-06-24 12:33:44,494][15401] Updated weights for policy 0, policy_version 660921 (0.0029) [2024-06-24 12:33:47,294][15401] Updated weights for policy 0, policy_version 660931 (0.0028) [2024-06-24 12:33:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10828726272. Throughput: 0: 42543.4. Samples: 10828884680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 12:33:52,018][15401] Updated weights for policy 0, policy_version 660941 (0.0039) [2024-06-24 12:33:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10828939264. Throughput: 0: 42497.7. Samples: 10829013120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 12:33:54,819][15401] Updated weights for policy 0, policy_version 660951 (0.0039) [2024-06-24 12:33:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10829152256. Throughput: 0: 42404.0. Samples: 10829270560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:33:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 12:33:59,536][15401] Updated weights for policy 0, policy_version 660961 (0.0037) [2024-06-24 12:34:03,134][15401] Updated weights for policy 0, policy_version 660971 (0.0026) [2024-06-24 12:34:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 10829348864. Throughput: 0: 42584.9. Samples: 10829526380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:03,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 12:34:06,998][15401] Updated weights for policy 0, policy_version 660981 (0.0037) [2024-06-24 12:34:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 10829578240. Throughput: 0: 42341.4. Samples: 10829649580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 12:34:10,678][15401] Updated weights for policy 0, policy_version 660991 (0.0034) [2024-06-24 12:34:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10829791232. Throughput: 0: 42564.0. Samples: 10829911180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 12:34:14,471][15401] Updated weights for policy 0, policy_version 661001 (0.0026) [2024-06-24 12:34:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 10829987840. Throughput: 0: 42713.9. Samples: 10830170240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 12:34:18,636][15401] Updated weights for policy 0, policy_version 661011 (0.0052) [2024-06-24 12:34:22,028][15401] Updated weights for policy 0, policy_version 661021 (0.0033) [2024-06-24 12:34:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 10830200832. Throughput: 0: 42584.0. Samples: 10830292940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 12:34:26,147][15401] Updated weights for policy 0, policy_version 661031 (0.0035) [2024-06-24 12:34:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10830430208. Throughput: 0: 42777.7. Samples: 10830556360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 12:34:29,626][15401] Updated weights for policy 0, policy_version 661041 (0.0039) [2024-06-24 12:34:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42326.0, 300 sec: 42709.5). Total num frames: 10830643200. Throughput: 0: 42669.6. Samples: 10830804820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 12:34:34,084][15401] Updated weights for policy 0, policy_version 661051 (0.0049) [2024-06-24 12:34:37,158][15401] Updated weights for policy 0, policy_version 661061 (0.0044) [2024-06-24 12:34:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10830823424. Throughput: 0: 42658.2. Samples: 10830932740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 12:34:41,739][15401] Updated weights for policy 0, policy_version 661071 (0.0038) [2024-06-24 12:34:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10831069184. Throughput: 0: 42762.3. Samples: 10831194860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:43,398][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 12:34:45,003][15401] Updated weights for policy 0, policy_version 661081 (0.0043) [2024-06-24 12:34:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10831298560. Throughput: 0: 42523.5. Samples: 10831439940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 12:34:49,305][15401] Updated weights for policy 0, policy_version 661091 (0.0031) [2024-06-24 12:34:50,710][15349] Signal inference workers to stop experience collection... (160300 times) [2024-06-24 12:34:50,710][15349] Signal inference workers to resume experience collection... (160300 times) [2024-06-24 12:34:50,721][15401] InferenceWorker_p0-w0: stopping experience collection (160300 times) [2024-06-24 12:34:50,752][15401] InferenceWorker_p0-w0: resuming experience collection (160300 times) [2024-06-24 12:34:53,285][15401] Updated weights for policy 0, policy_version 661101 (0.0039) [2024-06-24 12:34:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10831478784. Throughput: 0: 42689.7. Samples: 10831570620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:34:56,959][15401] Updated weights for policy 0, policy_version 661111 (0.0026) [2024-06-24 12:34:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10831691776. Throughput: 0: 42545.3. Samples: 10831825720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:34:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 12:35:00,955][15401] Updated weights for policy 0, policy_version 661121 (0.0036) [2024-06-24 12:35:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10831937536. Throughput: 0: 42330.6. Samples: 10832075120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:35:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 12:35:04,634][15401] Updated weights for policy 0, policy_version 661131 (0.0043) [2024-06-24 12:35:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10832117760. Throughput: 0: 42618.7. Samples: 10832210780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 12:35:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 12:35:08,717][15401] Updated weights for policy 0, policy_version 661141 (0.0037) [2024-06-24 12:35:12,064][15401] Updated weights for policy 0, policy_version 661151 (0.0031) [2024-06-24 12:35:13,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10832314368. Throughput: 0: 42281.9. Samples: 10832459040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:13,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 12:35:16,238][15401] Updated weights for policy 0, policy_version 661161 (0.0029) [2024-06-24 12:35:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10832560128. Throughput: 0: 42462.9. Samples: 10832715640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 12:35:19,677][15401] Updated weights for policy 0, policy_version 661171 (0.0038) [2024-06-24 12:35:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10832756736. Throughput: 0: 42496.9. Samples: 10832845100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:23,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 12:35:23,950][15401] Updated weights for policy 0, policy_version 661181 (0.0026) [2024-06-24 12:35:27,302][15401] Updated weights for policy 0, policy_version 661191 (0.0024) [2024-06-24 12:35:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10832953344. Throughput: 0: 42247.9. Samples: 10833096020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:28,396][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 12:35:31,601][15401] Updated weights for policy 0, policy_version 661201 (0.0032) [2024-06-24 12:35:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 10833199104. Throughput: 0: 42565.0. Samples: 10833355360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 12:35:34,999][15401] Updated weights for policy 0, policy_version 661211 (0.0029) [2024-06-24 12:35:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10833379328. Throughput: 0: 42537.3. Samples: 10833484800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 12:35:39,242][15401] Updated weights for policy 0, policy_version 661221 (0.0035) [2024-06-24 12:35:43,062][15401] Updated weights for policy 0, policy_version 661231 (0.0034) [2024-06-24 12:35:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 10833608704. Throughput: 0: 42441.1. Samples: 10833735560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 12:35:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661232_10833625088.pth... [2024-06-24 12:35:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660607_10823385088.pth [2024-06-24 12:35:46,929][15401] Updated weights for policy 0, policy_version 661241 (0.0045) [2024-06-24 12:35:48,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10833838080. Throughput: 0: 42625.6. Samples: 10833993260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 12:35:50,647][15401] Updated weights for policy 0, policy_version 661251 (0.0038) [2024-06-24 12:35:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10834018304. Throughput: 0: 42439.9. Samples: 10834120580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 12:35:54,911][15401] Updated weights for policy 0, policy_version 661261 (0.0038) [2024-06-24 12:35:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10834247680. Throughput: 0: 42546.7. Samples: 10834373640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:35:58,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 12:35:58,569][15401] Updated weights for policy 0, policy_version 661271 (0.0036) [2024-06-24 12:36:02,709][15401] Updated weights for policy 0, policy_version 661281 (0.0031) [2024-06-24 12:36:02,813][15349] Signal inference workers to stop experience collection... (160350 times) [2024-06-24 12:36:02,852][15401] InferenceWorker_p0-w0: stopping experience collection (160350 times) [2024-06-24 12:36:02,880][15349] Signal inference workers to resume experience collection... (160350 times) [2024-06-24 12:36:02,880][15401] InferenceWorker_p0-w0: resuming experience collection (160350 times) [2024-06-24 12:36:03,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10834493440. Throughput: 0: 42492.2. Samples: 10834627800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 12:36:06,296][15401] Updated weights for policy 0, policy_version 661291 (0.0043) [2024-06-24 12:36:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 10834657280. Throughput: 0: 42513.7. Samples: 10834758320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:08,392][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 12:36:10,262][15401] Updated weights for policy 0, policy_version 661301 (0.0045) [2024-06-24 12:36:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42821.0). Total num frames: 10834903040. Throughput: 0: 42469.7. Samples: 10835007160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 12:36:14,045][15401] Updated weights for policy 0, policy_version 661311 (0.0034) [2024-06-24 12:36:17,846][15401] Updated weights for policy 0, policy_version 661321 (0.0037) [2024-06-24 12:36:18,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 10835116032. Throughput: 0: 42469.1. Samples: 10835266480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 12:36:21,659][15401] Updated weights for policy 0, policy_version 661331 (0.0033) [2024-06-24 12:36:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10835312640. Throughput: 0: 42529.3. Samples: 10835398620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 12:36:25,502][15401] Updated weights for policy 0, policy_version 661341 (0.0029) [2024-06-24 12:36:28,391][15132] Fps is (10 sec: 42593.1, 60 sec: 43143.6, 300 sec: 42820.4). Total num frames: 10835542016. Throughput: 0: 42537.3. Samples: 10835649800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:28,391][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 12:36:29,120][15401] Updated weights for policy 0, policy_version 661351 (0.0047) [2024-06-24 12:36:33,210][15401] Updated weights for policy 0, policy_version 661361 (0.0022) [2024-06-24 12:36:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10835738624. Throughput: 0: 42630.2. Samples: 10835911620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 12:36:36,581][15401] Updated weights for policy 0, policy_version 661371 (0.0047) [2024-06-24 12:36:38,390][15132] Fps is (10 sec: 40965.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10835951616. Throughput: 0: 42626.6. Samples: 10836038780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 12:36:40,613][15401] Updated weights for policy 0, policy_version 661381 (0.0039) [2024-06-24 12:36:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10836197376. Throughput: 0: 42665.6. Samples: 10836293600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:43,400][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 12:36:44,242][15401] Updated weights for policy 0, policy_version 661391 (0.0040) [2024-06-24 12:36:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10836377600. Throughput: 0: 42654.4. Samples: 10836547240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 12:36:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 12:36:48,421][15401] Updated weights for policy 0, policy_version 661401 (0.0031) [2024-06-24 12:36:52,467][15401] Updated weights for policy 0, policy_version 661411 (0.0028) [2024-06-24 12:36:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10836590592. Throughput: 0: 42633.3. Samples: 10836676720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:36:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 12:36:55,975][15401] Updated weights for policy 0, policy_version 661421 (0.0030) [2024-06-24 12:36:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10836836352. Throughput: 0: 42858.8. Samples: 10836935800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:36:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 12:36:59,910][15401] Updated weights for policy 0, policy_version 661431 (0.0045) [2024-06-24 12:37:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10837032960. Throughput: 0: 42737.8. Samples: 10837189680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 12:37:03,656][15401] Updated weights for policy 0, policy_version 661441 (0.0042) [2024-06-24 12:37:07,422][15401] Updated weights for policy 0, policy_version 661451 (0.0035) [2024-06-24 12:37:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10837229568. Throughput: 0: 42576.6. Samples: 10837314560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 12:37:11,092][15401] Updated weights for policy 0, policy_version 661461 (0.0039) [2024-06-24 12:37:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10837475328. Throughput: 0: 42819.8. Samples: 10837576640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:13,391][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 12:37:15,074][15401] Updated weights for policy 0, policy_version 661471 (0.0036) [2024-06-24 12:37:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10837655552. Throughput: 0: 42867.1. Samples: 10837840640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 12:37:19,093][15401] Updated weights for policy 0, policy_version 661481 (0.0033) [2024-06-24 12:37:20,672][15349] Signal inference workers to stop experience collection... (160400 times) [2024-06-24 12:37:20,672][15349] Signal inference workers to resume experience collection... (160400 times) [2024-06-24 12:37:20,714][15401] InferenceWorker_p0-w0: stopping experience collection (160400 times) [2024-06-24 12:37:20,714][15401] InferenceWorker_p0-w0: resuming experience collection (160400 times) [2024-06-24 12:37:22,645][15401] Updated weights for policy 0, policy_version 661491 (0.0035) [2024-06-24 12:37:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10837884928. Throughput: 0: 42768.5. Samples: 10837963360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 12:37:26,719][15401] Updated weights for policy 0, policy_version 661501 (0.0036) [2024-06-24 12:37:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42872.5, 300 sec: 42765.0). Total num frames: 10838114304. Throughput: 0: 42814.8. Samples: 10838220260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:28,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-24 12:37:30,346][15401] Updated weights for policy 0, policy_version 661511 (0.0038) [2024-06-24 12:37:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10838294528. Throughput: 0: 42954.6. Samples: 10838480200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 12:37:34,462][15401] Updated weights for policy 0, policy_version 661521 (0.0034) [2024-06-24 12:37:37,987][15401] Updated weights for policy 0, policy_version 661531 (0.0038) [2024-06-24 12:37:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10838523904. Throughput: 0: 42688.9. Samples: 10838597720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 12:37:42,299][15401] Updated weights for policy 0, policy_version 661541 (0.0029) [2024-06-24 12:37:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10838753280. Throughput: 0: 42802.6. Samples: 10838861920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 12:37:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661546_10838769664.pth... [2024-06-24 12:37:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000660920_10828513280.pth [2024-06-24 12:37:45,675][15401] Updated weights for policy 0, policy_version 661551 (0.0033) [2024-06-24 12:37:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 10838933504. Throughput: 0: 42880.3. Samples: 10839119300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 12:37:49,934][15401] Updated weights for policy 0, policy_version 661561 (0.0033) [2024-06-24 12:37:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10839162880. Throughput: 0: 42690.6. Samples: 10839235640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 12:37:53,454][15401] Updated weights for policy 0, policy_version 661571 (0.0030) [2024-06-24 12:37:57,760][15401] Updated weights for policy 0, policy_version 661581 (0.0037) [2024-06-24 12:37:58,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42323.6, 300 sec: 42654.5). Total num frames: 10839375872. Throughput: 0: 42796.5. Samples: 10839502580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:37:58,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 12:38:00,977][15401] Updated weights for policy 0, policy_version 661591 (0.0034) [2024-06-24 12:38:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10839572480. Throughput: 0: 42645.3. Samples: 10839759680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:38:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 12:38:05,329][15401] Updated weights for policy 0, policy_version 661601 (0.0037) [2024-06-24 12:38:08,396][15132] Fps is (10 sec: 44218.9, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 10839818240. Throughput: 0: 42554.0. Samples: 10839878560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:38:08,397][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 12:38:08,897][15401] Updated weights for policy 0, policy_version 661611 (0.0040) [2024-06-24 12:38:13,172][15401] Updated weights for policy 0, policy_version 661621 (0.0030) [2024-06-24 12:38:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 10840014848. Throughput: 0: 42842.1. Samples: 10840148160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:38:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 12:38:16,725][15401] Updated weights for policy 0, policy_version 661631 (0.0027) [2024-06-24 12:38:18,389][15132] Fps is (10 sec: 39347.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10840211456. Throughput: 0: 42633.9. Samples: 10840398720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:38:18,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 12:38:20,690][15401] Updated weights for policy 0, policy_version 661641 (0.0031) [2024-06-24 12:38:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10840457216. Throughput: 0: 42706.3. Samples: 10840519500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 12:38:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 12:38:24,288][15401] Updated weights for policy 0, policy_version 661651 (0.0031) [2024-06-24 12:38:28,361][15401] Updated weights for policy 0, policy_version 661661 (0.0027) [2024-06-24 12:38:28,391][15132] Fps is (10 sec: 44230.1, 60 sec: 42324.3, 300 sec: 42542.8). Total num frames: 10840653824. Throughput: 0: 42768.4. Samples: 10840786560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:28,392][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 12:38:31,909][15401] Updated weights for policy 0, policy_version 661671 (0.0037) [2024-06-24 12:38:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10840866816. Throughput: 0: 42519.2. Samples: 10841032660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 12:38:36,118][15401] Updated weights for policy 0, policy_version 661681 (0.0042) [2024-06-24 12:38:38,390][15132] Fps is (10 sec: 42604.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10841079808. Throughput: 0: 42735.7. Samples: 10841158740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 12:38:39,566][15401] Updated weights for policy 0, policy_version 661691 (0.0029) [2024-06-24 12:38:40,624][15349] Signal inference workers to stop experience collection... (160450 times) [2024-06-24 12:38:40,624][15349] Signal inference workers to resume experience collection... (160450 times) [2024-06-24 12:38:40,670][15401] InferenceWorker_p0-w0: stopping experience collection (160450 times) [2024-06-24 12:38:40,670][15401] InferenceWorker_p0-w0: resuming experience collection (160450 times) [2024-06-24 12:38:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10841276416. Throughput: 0: 42534.3. Samples: 10841416520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 12:38:43,810][15401] Updated weights for policy 0, policy_version 661701 (0.0033) [2024-06-24 12:38:47,478][15401] Updated weights for policy 0, policy_version 661711 (0.0040) [2024-06-24 12:38:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10841522176. Throughput: 0: 42424.5. Samples: 10841668780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:48,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 12:38:51,300][15401] Updated weights for policy 0, policy_version 661721 (0.0043) [2024-06-24 12:38:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10841735168. Throughput: 0: 42689.6. Samples: 10841799320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 12:38:55,147][15401] Updated weights for policy 0, policy_version 661731 (0.0035) [2024-06-24 12:38:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 10841915392. Throughput: 0: 42517.1. Samples: 10842061420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:38:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 12:38:58,735][15401] Updated weights for policy 0, policy_version 661741 (0.0036) [2024-06-24 12:39:02,660][15401] Updated weights for policy 0, policy_version 661751 (0.0031) [2024-06-24 12:39:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10842161152. Throughput: 0: 42564.8. Samples: 10842314140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 12:39:06,428][15401] Updated weights for policy 0, policy_version 661761 (0.0031) [2024-06-24 12:39:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 10842374144. Throughput: 0: 42787.9. Samples: 10842444960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:39:10,225][15401] Updated weights for policy 0, policy_version 661771 (0.0028) [2024-06-24 12:39:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10842587136. Throughput: 0: 42759.6. Samples: 10842710680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 12:39:14,113][15401] Updated weights for policy 0, policy_version 661781 (0.0028) [2024-06-24 12:39:17,768][15401] Updated weights for policy 0, policy_version 661791 (0.0032) [2024-06-24 12:39:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10842816512. Throughput: 0: 42863.7. Samples: 10842961520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 12:39:21,803][15401] Updated weights for policy 0, policy_version 661801 (0.0029) [2024-06-24 12:39:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10843013120. Throughput: 0: 43032.1. Samples: 10843095180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 12:39:25,320][15401] Updated weights for policy 0, policy_version 661811 (0.0047) [2024-06-24 12:39:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42599.4, 300 sec: 42598.4). Total num frames: 10843209728. Throughput: 0: 42946.6. Samples: 10843349120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 12:39:29,371][15401] Updated weights for policy 0, policy_version 661821 (0.0033) [2024-06-24 12:39:33,309][15401] Updated weights for policy 0, policy_version 661831 (0.0041) [2024-06-24 12:39:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10843439104. Throughput: 0: 43101.0. Samples: 10843608320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:33,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 12:39:37,000][15401] Updated weights for policy 0, policy_version 661841 (0.0037) [2024-06-24 12:39:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10843652096. Throughput: 0: 42983.6. Samples: 10843733580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 12:39:40,975][15401] Updated weights for policy 0, policy_version 661851 (0.0039) [2024-06-24 12:39:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 10843848704. Throughput: 0: 42670.2. Samples: 10843981580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:43,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 12:39:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661857_10843865088.pth... [2024-06-24 12:39:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661232_10833625088.pth [2024-06-24 12:39:44,616][15401] Updated weights for policy 0, policy_version 661861 (0.0029) [2024-06-24 12:39:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10844078080. Throughput: 0: 42770.3. Samples: 10844238800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 12:39:48,566][15401] Updated weights for policy 0, policy_version 661871 (0.0044) [2024-06-24 12:39:52,230][15401] Updated weights for policy 0, policy_version 661881 (0.0034) [2024-06-24 12:39:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10844291072. Throughput: 0: 42842.1. Samples: 10844372860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:53,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 12:39:56,311][15401] Updated weights for policy 0, policy_version 661891 (0.0024) [2024-06-24 12:39:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 10844504064. Throughput: 0: 42503.9. Samples: 10844623360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:39:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 12:39:59,836][15401] Updated weights for policy 0, policy_version 661901 (0.0027) [2024-06-24 12:40:03,174][15349] Signal inference workers to stop experience collection... (160500 times) [2024-06-24 12:40:03,179][15349] Signal inference workers to resume experience collection... (160500 times) [2024-06-24 12:40:03,204][15401] InferenceWorker_p0-w0: stopping experience collection (160500 times) [2024-06-24 12:40:03,204][15401] InferenceWorker_p0-w0: resuming experience collection (160500 times) [2024-06-24 12:40:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10844717056. Throughput: 0: 42770.6. Samples: 10844886200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 12:40:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 12:40:03,907][15401] Updated weights for policy 0, policy_version 661911 (0.0039) [2024-06-24 12:40:07,450][15401] Updated weights for policy 0, policy_version 661921 (0.0035) [2024-06-24 12:40:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10844946432. Throughput: 0: 42531.8. Samples: 10845009120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 12:40:11,392][15401] Updated weights for policy 0, policy_version 661931 (0.0035) [2024-06-24 12:40:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10845143040. Throughput: 0: 42581.4. Samples: 10845265280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 12:40:14,987][15401] Updated weights for policy 0, policy_version 661941 (0.0029) [2024-06-24 12:40:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 10845339648. Throughput: 0: 42728.7. Samples: 10845531120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:18,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 12:40:19,186][15401] Updated weights for policy 0, policy_version 661951 (0.0023) [2024-06-24 12:40:22,773][15401] Updated weights for policy 0, policy_version 661961 (0.0034) [2024-06-24 12:40:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10845585408. Throughput: 0: 42562.6. Samples: 10845648900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 12:40:26,820][15401] Updated weights for policy 0, policy_version 661971 (0.0037) [2024-06-24 12:40:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10845765632. Throughput: 0: 42623.4. Samples: 10845899640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 12:40:30,413][15401] Updated weights for policy 0, policy_version 661981 (0.0039) [2024-06-24 12:40:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10845978624. Throughput: 0: 42780.0. Samples: 10846163900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 12:40:34,545][15401] Updated weights for policy 0, policy_version 661991 (0.0033) [2024-06-24 12:40:38,374][15401] Updated weights for policy 0, policy_version 662001 (0.0039) [2024-06-24 12:40:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10846224384. Throughput: 0: 42540.9. Samples: 10846287200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 12:40:42,230][15401] Updated weights for policy 0, policy_version 662011 (0.0036) [2024-06-24 12:40:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10846420992. Throughput: 0: 42630.7. Samples: 10846541740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:43,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 12:40:46,025][15401] Updated weights for policy 0, policy_version 662021 (0.0026) [2024-06-24 12:40:48,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10846601216. Throughput: 0: 42644.3. Samples: 10846805200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:48,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 12:40:50,133][15401] Updated weights for policy 0, policy_version 662031 (0.0043) [2024-06-24 12:40:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10846863360. Throughput: 0: 42687.6. Samples: 10846930060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:53,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 12:40:53,542][15401] Updated weights for policy 0, policy_version 662041 (0.0041) [2024-06-24 12:40:57,916][15401] Updated weights for policy 0, policy_version 662051 (0.0039) [2024-06-24 12:40:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10847043584. Throughput: 0: 42599.5. Samples: 10847182260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:40:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 12:41:01,257][15401] Updated weights for policy 0, policy_version 662061 (0.0037) [2024-06-24 12:41:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 10847256576. Throughput: 0: 42582.2. Samples: 10847447320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 12:41:05,511][15401] Updated weights for policy 0, policy_version 662071 (0.0028) [2024-06-24 12:41:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10847502336. Throughput: 0: 42684.0. Samples: 10847569680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 12:41:08,779][15401] Updated weights for policy 0, policy_version 662081 (0.0028) [2024-06-24 12:41:13,136][15401] Updated weights for policy 0, policy_version 662091 (0.0034) [2024-06-24 12:41:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10847698944. Throughput: 0: 42872.0. Samples: 10847828880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:13,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 12:41:14,749][15349] Signal inference workers to stop experience collection... (160550 times) [2024-06-24 12:41:14,756][15349] Signal inference workers to resume experience collection... (160550 times) [2024-06-24 12:41:14,794][15401] InferenceWorker_p0-w0: stopping experience collection (160550 times) [2024-06-24 12:41:14,794][15401] InferenceWorker_p0-w0: resuming experience collection (160550 times) [2024-06-24 12:41:16,678][15401] Updated weights for policy 0, policy_version 662101 (0.0031) [2024-06-24 12:41:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10847911936. Throughput: 0: 42748.4. Samples: 10848087580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:41:20,678][15401] Updated weights for policy 0, policy_version 662111 (0.0042) [2024-06-24 12:41:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 10848157696. Throughput: 0: 42956.0. Samples: 10848220220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 12:41:24,049][15401] Updated weights for policy 0, policy_version 662121 (0.0035) [2024-06-24 12:41:28,280][15401] Updated weights for policy 0, policy_version 662131 (0.0034) [2024-06-24 12:41:28,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10848354304. Throughput: 0: 42969.0. Samples: 10848475360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 12:41:31,973][15401] Updated weights for policy 0, policy_version 662141 (0.0027) [2024-06-24 12:41:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10848567296. Throughput: 0: 42715.7. Samples: 10848727400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:33,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 12:41:36,236][15401] Updated weights for policy 0, policy_version 662151 (0.0044) [2024-06-24 12:41:38,390][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10848796672. Throughput: 0: 42948.0. Samples: 10848862720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 12:41:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 12:41:39,486][15401] Updated weights for policy 0, policy_version 662161 (0.0029) [2024-06-24 12:41:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10848960512. Throughput: 0: 42901.7. Samples: 10849112840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:41:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 12:41:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662168_10848960512.pth... [2024-06-24 12:41:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661546_10838769664.pth [2024-06-24 12:41:44,014][15401] Updated weights for policy 0, policy_version 662171 (0.0028) [2024-06-24 12:41:47,034][15401] Updated weights for policy 0, policy_version 662181 (0.0028) [2024-06-24 12:41:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10849206272. Throughput: 0: 42685.8. Samples: 10849368180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:41:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 12:41:51,629][15401] Updated weights for policy 0, policy_version 662191 (0.0036) [2024-06-24 12:41:53,393][15132] Fps is (10 sec: 47498.1, 60 sec: 42869.1, 300 sec: 42709.0). Total num frames: 10849435648. Throughput: 0: 42999.9. Samples: 10849504820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:41:53,393][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 12:41:54,591][15401] Updated weights for policy 0, policy_version 662201 (0.0036) [2024-06-24 12:41:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10849615872. Throughput: 0: 42772.9. Samples: 10849753660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:41:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 12:41:59,447][15401] Updated weights for policy 0, policy_version 662211 (0.0043) [2024-06-24 12:42:02,570][15401] Updated weights for policy 0, policy_version 662221 (0.0023) [2024-06-24 12:42:03,390][15132] Fps is (10 sec: 40973.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10849845248. Throughput: 0: 42812.8. Samples: 10850014160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 12:42:07,115][15401] Updated weights for policy 0, policy_version 662231 (0.0033) [2024-06-24 12:42:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10850074624. Throughput: 0: 42687.2. Samples: 10850141140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 12:42:10,786][15401] Updated weights for policy 0, policy_version 662241 (0.0030) [2024-06-24 12:42:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10850271232. Throughput: 0: 42704.4. Samples: 10850397040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 12:42:14,537][15401] Updated weights for policy 0, policy_version 662251 (0.0033) [2024-06-24 12:42:18,243][15401] Updated weights for policy 0, policy_version 662261 (0.0033) [2024-06-24 12:42:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10850484224. Throughput: 0: 42966.9. Samples: 10850660920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 12:42:21,853][15401] Updated weights for policy 0, policy_version 662271 (0.0035) [2024-06-24 12:42:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10850713600. Throughput: 0: 42856.5. Samples: 10850791260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 12:42:25,792][15401] Updated weights for policy 0, policy_version 662281 (0.0041) [2024-06-24 12:42:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 10850926592. Throughput: 0: 42983.7. Samples: 10851047100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 12:42:29,300][15401] Updated weights for policy 0, policy_version 662291 (0.0038) [2024-06-24 12:42:33,187][15401] Updated weights for policy 0, policy_version 662301 (0.0032) [2024-06-24 12:42:33,391][15132] Fps is (10 sec: 42592.9, 60 sec: 42870.5, 300 sec: 42764.8). Total num frames: 10851139584. Throughput: 0: 43037.1. Samples: 10851304900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:33,391][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 12:42:37,078][15401] Updated weights for policy 0, policy_version 662311 (0.0026) [2024-06-24 12:42:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10851352576. Throughput: 0: 42790.3. Samples: 10851430240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 12:42:40,524][15401] Updated weights for policy 0, policy_version 662321 (0.0032) [2024-06-24 12:42:43,390][15132] Fps is (10 sec: 44242.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 10851581952. Throughput: 0: 43143.1. Samples: 10851695100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:43,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 12:42:44,534][15401] Updated weights for policy 0, policy_version 662331 (0.0026) [2024-06-24 12:42:47,151][15349] Signal inference workers to stop experience collection... (160600 times) [2024-06-24 12:42:47,152][15349] Signal inference workers to resume experience collection... (160600 times) [2024-06-24 12:42:47,199][15401] InferenceWorker_p0-w0: stopping experience collection (160600 times) [2024-06-24 12:42:47,199][15401] InferenceWorker_p0-w0: resuming experience collection (160600 times) [2024-06-24 12:42:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10851778560. Throughput: 0: 43041.4. Samples: 10851951020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 12:42:48,508][15401] Updated weights for policy 0, policy_version 662341 (0.0029) [2024-06-24 12:42:52,080][15401] Updated weights for policy 0, policy_version 662351 (0.0033) [2024-06-24 12:42:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.7, 300 sec: 42765.4). Total num frames: 10851991552. Throughput: 0: 43111.1. Samples: 10852081140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 12:42:56,163][15401] Updated weights for policy 0, policy_version 662361 (0.0024) [2024-06-24 12:42:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10852204544. Throughput: 0: 43175.5. Samples: 10852339940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:42:58,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 12:42:59,841][15401] Updated weights for policy 0, policy_version 662371 (0.0031) [2024-06-24 12:43:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.9). Total num frames: 10852433920. Throughput: 0: 42817.9. Samples: 10852587720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:43:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 12:43:03,802][15401] Updated weights for policy 0, policy_version 662381 (0.0041) [2024-06-24 12:43:07,489][15401] Updated weights for policy 0, policy_version 662391 (0.0038) [2024-06-24 12:43:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10852646912. Throughput: 0: 42910.7. Samples: 10852722240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:43:08,390][15132] Avg episode reward: [(0, '0.123')] [2024-06-24 12:43:11,331][15401] Updated weights for policy 0, policy_version 662401 (0.0040) [2024-06-24 12:43:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10852859904. Throughput: 0: 42995.1. Samples: 10852981880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:43:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 12:43:15,467][15401] Updated weights for policy 0, policy_version 662411 (0.0029) [2024-06-24 12:43:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 10853089280. Throughput: 0: 42880.3. Samples: 10853234460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:43:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 12:43:18,992][15401] Updated weights for policy 0, policy_version 662421 (0.0027) [2024-06-24 12:43:23,105][15401] Updated weights for policy 0, policy_version 662431 (0.0032) [2024-06-24 12:43:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 10853285888. Throughput: 0: 43024.5. Samples: 10853366340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:23,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-24 12:43:26,779][15401] Updated weights for policy 0, policy_version 662441 (0.0034) [2024-06-24 12:43:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10853498880. Throughput: 0: 42709.4. Samples: 10853617020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 12:43:30,515][15401] Updated weights for policy 0, policy_version 662451 (0.0038) [2024-06-24 12:43:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42872.3, 300 sec: 42820.5). Total num frames: 10853711872. Throughput: 0: 42764.4. Samples: 10853875420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 12:43:34,379][15401] Updated weights for policy 0, policy_version 662461 (0.0035) [2024-06-24 12:43:38,080][15401] Updated weights for policy 0, policy_version 662471 (0.0034) [2024-06-24 12:43:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 10853924864. Throughput: 0: 42728.4. Samples: 10854004020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:38,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 12:43:41,870][15401] Updated weights for policy 0, policy_version 662481 (0.0025) [2024-06-24 12:43:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10854137856. Throughput: 0: 42701.5. Samples: 10854261500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:43,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 12:43:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662486_10854170624.pth... [2024-06-24 12:43:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000661857_10843865088.pth [2024-06-24 12:43:45,636][15401] Updated weights for policy 0, policy_version 662491 (0.0028) [2024-06-24 12:43:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10854367232. Throughput: 0: 43048.9. Samples: 10854524920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 12:43:49,402][15401] Updated weights for policy 0, policy_version 662501 (0.0034) [2024-06-24 12:43:53,119][15401] Updated weights for policy 0, policy_version 662511 (0.0027) [2024-06-24 12:43:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10854580224. Throughput: 0: 42818.7. Samples: 10854649080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 12:43:57,025][15401] Updated weights for policy 0, policy_version 662521 (0.0042) [2024-06-24 12:43:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10854809600. Throughput: 0: 42815.0. Samples: 10854908560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:43:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 12:44:00,688][15401] Updated weights for policy 0, policy_version 662531 (0.0028) [2024-06-24 12:44:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10854989824. Throughput: 0: 43093.0. Samples: 10855173640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 12:44:04,723][15401] Updated weights for policy 0, policy_version 662541 (0.0034) [2024-06-24 12:44:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 10855219200. Throughput: 0: 42830.5. Samples: 10855293720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:08,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 12:44:08,538][15401] Updated weights for policy 0, policy_version 662551 (0.0026) [2024-06-24 12:44:12,528][15349] Signal inference workers to stop experience collection... (160650 times) [2024-06-24 12:44:12,532][15349] Signal inference workers to resume experience collection... (160650 times) [2024-06-24 12:44:12,542][15401] InferenceWorker_p0-w0: stopping experience collection (160650 times) [2024-06-24 12:44:12,549][15401] Updated weights for policy 0, policy_version 662561 (0.0043) [2024-06-24 12:44:12,569][15401] InferenceWorker_p0-w0: resuming experience collection (160650 times) [2024-06-24 12:44:13,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10855448576. Throughput: 0: 43193.1. Samples: 10855560720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 12:44:16,139][15401] Updated weights for policy 0, policy_version 662571 (0.0033) [2024-06-24 12:44:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10855645184. Throughput: 0: 43184.5. Samples: 10855818720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 12:44:20,063][15401] Updated weights for policy 0, policy_version 662581 (0.0030) [2024-06-24 12:44:23,396][15132] Fps is (10 sec: 42572.1, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 10855874560. Throughput: 0: 42912.2. Samples: 10855935240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:23,396][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 12:44:24,065][15401] Updated weights for policy 0, policy_version 662591 (0.0041) [2024-06-24 12:44:27,715][15401] Updated weights for policy 0, policy_version 662601 (0.0023) [2024-06-24 12:44:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 10856103936. Throughput: 0: 43271.9. Samples: 10856208740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 12:44:31,761][15401] Updated weights for policy 0, policy_version 662611 (0.0024) [2024-06-24 12:44:33,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10856284160. Throughput: 0: 43198.8. Samples: 10856468860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 12:44:35,295][15401] Updated weights for policy 0, policy_version 662621 (0.0029) [2024-06-24 12:44:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.4, 300 sec: 42987.2). Total num frames: 10856529920. Throughput: 0: 42990.6. Samples: 10856583660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:38,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-24 12:44:39,258][15401] Updated weights for policy 0, policy_version 662631 (0.0022) [2024-06-24 12:44:42,886][15401] Updated weights for policy 0, policy_version 662641 (0.0022) [2024-06-24 12:44:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 10856742912. Throughput: 0: 43223.5. Samples: 10856853620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 12:44:46,922][15401] Updated weights for policy 0, policy_version 662651 (0.0040) [2024-06-24 12:44:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10856939520. Throughput: 0: 43113.7. Samples: 10857113760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:48,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-24 12:44:50,465][15401] Updated weights for policy 0, policy_version 662661 (0.0038) [2024-06-24 12:44:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10857168896. Throughput: 0: 43163.2. Samples: 10857236060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 12:44:53,392][15132] Avg episode reward: [(0, '0.160')] [2024-06-24 12:44:54,462][15401] Updated weights for policy 0, policy_version 662671 (0.0040) [2024-06-24 12:44:57,972][15401] Updated weights for policy 0, policy_version 662681 (0.0037) [2024-06-24 12:44:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10857381888. Throughput: 0: 43065.5. Samples: 10857498660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:44:58,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 12:45:02,070][15401] Updated weights for policy 0, policy_version 662691 (0.0028) [2024-06-24 12:45:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10857562112. Throughput: 0: 43004.9. Samples: 10857753940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:03,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 12:45:05,502][15401] Updated weights for policy 0, policy_version 662701 (0.0037) [2024-06-24 12:45:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42987.1). Total num frames: 10857824256. Throughput: 0: 43157.1. Samples: 10857877040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:08,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 12:45:09,715][15401] Updated weights for policy 0, policy_version 662711 (0.0039) [2024-06-24 12:45:11,865][15349] Signal inference workers to stop experience collection... (160700 times) [2024-06-24 12:45:11,914][15401] InferenceWorker_p0-w0: stopping experience collection (160700 times) [2024-06-24 12:45:11,923][15349] Signal inference workers to resume experience collection... (160700 times) [2024-06-24 12:45:11,927][15401] InferenceWorker_p0-w0: resuming experience collection (160700 times) [2024-06-24 12:45:13,167][15401] Updated weights for policy 0, policy_version 662721 (0.0036) [2024-06-24 12:45:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 10858020864. Throughput: 0: 42823.5. Samples: 10858135800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 12:45:17,484][15401] Updated weights for policy 0, policy_version 662731 (0.0042) [2024-06-24 12:45:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10858217472. Throughput: 0: 42686.1. Samples: 10858389740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 12:45:20,779][15401] Updated weights for policy 0, policy_version 662741 (0.0031) [2024-06-24 12:45:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 43042.7). Total num frames: 10858463232. Throughput: 0: 42937.2. Samples: 10858515840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 12:45:25,065][15401] Updated weights for policy 0, policy_version 662751 (0.0040) [2024-06-24 12:45:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 10858659840. Throughput: 0: 42770.3. Samples: 10858778280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 12:45:29,101][15401] Updated weights for policy 0, policy_version 662761 (0.0028) [2024-06-24 12:45:32,543][15401] Updated weights for policy 0, policy_version 662771 (0.0042) [2024-06-24 12:45:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10858856448. Throughput: 0: 42621.8. Samples: 10859031740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 12:45:36,591][15401] Updated weights for policy 0, policy_version 662781 (0.0027) [2024-06-24 12:45:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 10859102208. Throughput: 0: 42729.8. Samples: 10859158900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 12:45:40,143][15401] Updated weights for policy 0, policy_version 662791 (0.0034) [2024-06-24 12:45:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 10859282432. Throughput: 0: 42799.2. Samples: 10859424620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 12:45:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662799_10859298816.pth... [2024-06-24 12:45:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662168_10848960512.pth [2024-06-24 12:45:44,181][15401] Updated weights for policy 0, policy_version 662801 (0.0040) [2024-06-24 12:45:47,784][15401] Updated weights for policy 0, policy_version 662811 (0.0037) [2024-06-24 12:45:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10859511808. Throughput: 0: 42626.7. Samples: 10859672140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 12:45:51,778][15401] Updated weights for policy 0, policy_version 662821 (0.0032) [2024-06-24 12:45:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 10859724800. Throughput: 0: 42937.5. Samples: 10859809220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 12:45:55,390][15401] Updated weights for policy 0, policy_version 662831 (0.0037) [2024-06-24 12:45:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 10859921408. Throughput: 0: 42863.1. Samples: 10860064640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:45:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 12:45:59,643][15401] Updated weights for policy 0, policy_version 662841 (0.0039) [2024-06-24 12:46:03,005][15401] Updated weights for policy 0, policy_version 662851 (0.0044) [2024-06-24 12:46:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10860150784. Throughput: 0: 42701.5. Samples: 10860311300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 12:46:07,295][15401] Updated weights for policy 0, policy_version 662861 (0.0031) [2024-06-24 12:46:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 10860380160. Throughput: 0: 42905.9. Samples: 10860446600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 12:46:10,829][15401] Updated weights for policy 0, policy_version 662871 (0.0038) [2024-06-24 12:46:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10860576768. Throughput: 0: 42639.5. Samples: 10860697060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 12:46:14,978][15401] Updated weights for policy 0, policy_version 662881 (0.0033) [2024-06-24 12:46:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10860789760. Throughput: 0: 42758.2. Samples: 10860955860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 12:46:18,890][15401] Updated weights for policy 0, policy_version 662891 (0.0030) [2024-06-24 12:46:22,394][15401] Updated weights for policy 0, policy_version 662901 (0.0039) [2024-06-24 12:46:23,390][15132] Fps is (10 sec: 44233.5, 60 sec: 42597.9, 300 sec: 42931.6). Total num frames: 10861019136. Throughput: 0: 42819.8. Samples: 10861085820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:23,391][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 12:46:26,294][15401] Updated weights for policy 0, policy_version 662911 (0.0034) [2024-06-24 12:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10861215744. Throughput: 0: 42665.3. Samples: 10861344560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 12:46:29,948][15401] Updated weights for policy 0, policy_version 662921 (0.0031) [2024-06-24 12:46:33,390][15132] Fps is (10 sec: 40962.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10861428736. Throughput: 0: 42927.1. Samples: 10861603860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 12:46:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 12:46:33,693][15401] Updated weights for policy 0, policy_version 662931 (0.0026) [2024-06-24 12:46:37,542][15401] Updated weights for policy 0, policy_version 662941 (0.0029) [2024-06-24 12:46:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 10861658112. Throughput: 0: 42657.2. Samples: 10861728800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:46:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 12:46:41,495][15401] Updated weights for policy 0, policy_version 662951 (0.0029) [2024-06-24 12:46:42,679][15349] Signal inference workers to stop experience collection... (160750 times) [2024-06-24 12:46:42,680][15349] Signal inference workers to resume experience collection... (160750 times) [2024-06-24 12:46:42,725][15401] InferenceWorker_p0-w0: stopping experience collection (160750 times) [2024-06-24 12:46:42,726][15401] InferenceWorker_p0-w0: resuming experience collection (160750 times) [2024-06-24 12:46:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10861871104. Throughput: 0: 42609.9. Samples: 10861982080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:46:43,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 12:46:45,241][15401] Updated weights for policy 0, policy_version 662961 (0.0034) [2024-06-24 12:46:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42821.0). Total num frames: 10862067712. Throughput: 0: 42971.5. Samples: 10862245020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:46:48,398][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 12:46:48,917][15401] Updated weights for policy 0, policy_version 662971 (0.0031) [2024-06-24 12:46:52,958][15401] Updated weights for policy 0, policy_version 662981 (0.0035) [2024-06-24 12:46:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 10862297088. Throughput: 0: 42753.3. Samples: 10862370600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:46:53,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 12:46:56,476][15401] Updated weights for policy 0, policy_version 662991 (0.0033) [2024-06-24 12:46:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10862510080. Throughput: 0: 42849.3. Samples: 10862625280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:46:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 12:47:00,597][15401] Updated weights for policy 0, policy_version 663001 (0.0037) [2024-06-24 12:47:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10862723072. Throughput: 0: 42945.8. Samples: 10862888420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 12:47:03,885][15401] Updated weights for policy 0, policy_version 663011 (0.0036) [2024-06-24 12:47:08,118][15401] Updated weights for policy 0, policy_version 663021 (0.0033) [2024-06-24 12:47:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10862936064. Throughput: 0: 42917.6. Samples: 10863017080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 12:47:11,929][15401] Updated weights for policy 0, policy_version 663031 (0.0030) [2024-06-24 12:47:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10863132672. Throughput: 0: 42770.3. Samples: 10863269220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 12:47:15,506][15401] Updated weights for policy 0, policy_version 663041 (0.0038) [2024-06-24 12:47:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10863362048. Throughput: 0: 42790.3. Samples: 10863529420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 12:47:19,666][15401] Updated weights for policy 0, policy_version 663051 (0.0028) [2024-06-24 12:47:22,962][15401] Updated weights for policy 0, policy_version 663061 (0.0023) [2024-06-24 12:47:23,394][15132] Fps is (10 sec: 45856.1, 60 sec: 42869.1, 300 sec: 42931.0). Total num frames: 10863591424. Throughput: 0: 42875.7. Samples: 10863658380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:23,394][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 12:47:27,315][15401] Updated weights for policy 0, policy_version 663071 (0.0027) [2024-06-24 12:47:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.3). Total num frames: 10863788032. Throughput: 0: 42894.2. Samples: 10863912320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 12:47:30,522][15401] Updated weights for policy 0, policy_version 663081 (0.0027) [2024-06-24 12:47:33,390][15132] Fps is (10 sec: 40976.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10864001024. Throughput: 0: 42820.4. Samples: 10864171940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 12:47:35,066][15401] Updated weights for policy 0, policy_version 663091 (0.0036) [2024-06-24 12:47:38,329][15401] Updated weights for policy 0, policy_version 663101 (0.0049) [2024-06-24 12:47:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 10864246784. Throughput: 0: 42836.1. Samples: 10864298120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 12:47:42,755][15401] Updated weights for policy 0, policy_version 663111 (0.0037) [2024-06-24 12:47:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 10864410624. Throughput: 0: 42879.1. Samples: 10864554840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 12:47:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663112_10864427008.pth... [2024-06-24 12:47:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662486_10854170624.pth [2024-06-24 12:47:46,169][15401] Updated weights for policy 0, policy_version 663121 (0.0043) [2024-06-24 12:47:48,394][15132] Fps is (10 sec: 39303.6, 60 sec: 42868.2, 300 sec: 42875.4). Total num frames: 10864640000. Throughput: 0: 42481.1. Samples: 10864800260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:48,395][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 12:47:51,006][15401] Updated weights for policy 0, policy_version 663131 (0.0031) [2024-06-24 12:47:53,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 10864885760. Throughput: 0: 42524.5. Samples: 10864930680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 12:47:53,931][15401] Updated weights for policy 0, policy_version 663141 (0.0046) [2024-06-24 12:47:58,390][15132] Fps is (10 sec: 40978.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10865049600. Throughput: 0: 42664.3. Samples: 10865189120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:47:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 12:47:58,540][15401] Updated weights for policy 0, policy_version 663151 (0.0038) [2024-06-24 12:48:02,122][15401] Updated weights for policy 0, policy_version 663161 (0.0034) [2024-06-24 12:48:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10865278976. Throughput: 0: 42512.0. Samples: 10865442460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:48:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 12:48:06,086][15401] Updated weights for policy 0, policy_version 663171 (0.0032) [2024-06-24 12:48:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10865508352. Throughput: 0: 42524.7. Samples: 10865571820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:48:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 12:48:09,975][15401] Updated weights for policy 0, policy_version 663181 (0.0038) [2024-06-24 12:48:10,655][15349] Signal inference workers to stop experience collection... (160800 times) [2024-06-24 12:48:10,656][15349] Signal inference workers to resume experience collection... (160800 times) [2024-06-24 12:48:10,670][15401] InferenceWorker_p0-w0: stopping experience collection (160800 times) [2024-06-24 12:48:10,694][15401] InferenceWorker_p0-w0: resuming experience collection (160800 times) [2024-06-24 12:48:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 10865672192. Throughput: 0: 42519.1. Samples: 10865825680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 12:48:13,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 12:48:14,283][15401] Updated weights for policy 0, policy_version 663191 (0.0037) [2024-06-24 12:48:17,650][15401] Updated weights for policy 0, policy_version 663201 (0.0044) [2024-06-24 12:48:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10865901568. Throughput: 0: 42388.2. Samples: 10866079400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:18,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-24 12:48:21,948][15401] Updated weights for policy 0, policy_version 663211 (0.0040) [2024-06-24 12:48:23,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42601.3, 300 sec: 42876.1). Total num frames: 10866147328. Throughput: 0: 42495.5. Samples: 10866210420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 12:48:25,266][15401] Updated weights for policy 0, policy_version 663221 (0.0023) [2024-06-24 12:48:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10866327552. Throughput: 0: 42371.4. Samples: 10866461560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 12:48:29,572][15401] Updated weights for policy 0, policy_version 663231 (0.0025) [2024-06-24 12:48:32,843][15401] Updated weights for policy 0, policy_version 663241 (0.0038) [2024-06-24 12:48:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 10866556928. Throughput: 0: 42490.6. Samples: 10866712140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 12:48:37,200][15401] Updated weights for policy 0, policy_version 663251 (0.0026) [2024-06-24 12:48:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 10866769920. Throughput: 0: 42532.4. Samples: 10866844640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 12:48:40,438][15401] Updated weights for policy 0, policy_version 663261 (0.0038) [2024-06-24 12:48:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10866966528. Throughput: 0: 42412.5. Samples: 10867097680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 12:48:45,033][15401] Updated weights for policy 0, policy_version 663271 (0.0043) [2024-06-24 12:48:48,106][15401] Updated weights for policy 0, policy_version 663281 (0.0035) [2024-06-24 12:48:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42601.6, 300 sec: 42765.0). Total num frames: 10867195904. Throughput: 0: 42318.2. Samples: 10867346780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 12:48:52,601][15401] Updated weights for policy 0, policy_version 663291 (0.0035) [2024-06-24 12:48:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 10867392512. Throughput: 0: 42445.5. Samples: 10867481860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 12:48:56,239][15401] Updated weights for policy 0, policy_version 663301 (0.0035) [2024-06-24 12:48:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10867605504. Throughput: 0: 42422.2. Samples: 10867734680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:48:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 12:49:00,307][15401] Updated weights for policy 0, policy_version 663311 (0.0041) [2024-06-24 12:49:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10867818496. Throughput: 0: 42376.4. Samples: 10867986340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:03,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 12:49:03,835][15401] Updated weights for policy 0, policy_version 663321 (0.0039) [2024-06-24 12:49:07,945][15401] Updated weights for policy 0, policy_version 663331 (0.0040) [2024-06-24 12:49:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 10868047872. Throughput: 0: 42395.6. Samples: 10868118220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 12:49:11,446][15401] Updated weights for policy 0, policy_version 663341 (0.0037) [2024-06-24 12:49:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10868244480. Throughput: 0: 42361.0. Samples: 10868367800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 12:49:15,551][15401] Updated weights for policy 0, policy_version 663351 (0.0042) [2024-06-24 12:49:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 10868457472. Throughput: 0: 42506.7. Samples: 10868624940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:18,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 12:49:19,359][15401] Updated weights for policy 0, policy_version 663361 (0.0035) [2024-06-24 12:49:23,091][15401] Updated weights for policy 0, policy_version 663371 (0.0047) [2024-06-24 12:49:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10868686848. Throughput: 0: 42579.6. Samples: 10868760720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 12:49:26,947][15401] Updated weights for policy 0, policy_version 663381 (0.0033) [2024-06-24 12:49:26,986][15349] Signal inference workers to stop experience collection... (160850 times) [2024-06-24 12:49:26,987][15349] Signal inference workers to resume experience collection... (160850 times) [2024-06-24 12:49:27,013][15401] InferenceWorker_p0-w0: stopping experience collection (160850 times) [2024-06-24 12:49:27,014][15401] InferenceWorker_p0-w0: resuming experience collection (160850 times) [2024-06-24 12:49:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10868899840. Throughput: 0: 42688.4. Samples: 10869018660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:28,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-24 12:49:30,822][15401] Updated weights for policy 0, policy_version 663391 (0.0029) [2024-06-24 12:49:33,391][15132] Fps is (10 sec: 42592.8, 60 sec: 42597.4, 300 sec: 42653.7). Total num frames: 10869112832. Throughput: 0: 42777.0. Samples: 10869271800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:33,391][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 12:49:34,527][15401] Updated weights for policy 0, policy_version 663401 (0.0030) [2024-06-24 12:49:38,326][15401] Updated weights for policy 0, policy_version 663411 (0.0040) [2024-06-24 12:49:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10869325824. Throughput: 0: 42736.4. Samples: 10869405000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:38,390][15132] Avg episode reward: [(0, '0.916')] [2024-06-24 12:49:42,042][15401] Updated weights for policy 0, policy_version 663421 (0.0028) [2024-06-24 12:49:43,389][15132] Fps is (10 sec: 42604.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10869538816. Throughput: 0: 42795.2. Samples: 10869660460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:43,390][15132] Avg episode reward: [(0, '0.875')] [2024-06-24 12:49:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663425_10869555200.pth... [2024-06-24 12:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000662799_10859298816.pth [2024-06-24 12:49:45,950][15401] Updated weights for policy 0, policy_version 663431 (0.0030) [2024-06-24 12:49:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10869768192. Throughput: 0: 42948.9. Samples: 10869919040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 12:49:48,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 12:49:49,741][15401] Updated weights for policy 0, policy_version 663441 (0.0031) [2024-06-24 12:49:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10869964800. Throughput: 0: 42852.3. Samples: 10870046580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:49:53,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 12:49:53,537][15401] Updated weights for policy 0, policy_version 663451 (0.0043) [2024-06-24 12:49:57,356][15401] Updated weights for policy 0, policy_version 663461 (0.0029) [2024-06-24 12:49:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10870177792. Throughput: 0: 43036.9. Samples: 10870304460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:49:58,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 12:50:01,369][15401] Updated weights for policy 0, policy_version 663471 (0.0036) [2024-06-24 12:50:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10870390784. Throughput: 0: 42936.4. Samples: 10870557080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 12:50:04,927][15401] Updated weights for policy 0, policy_version 663481 (0.0029) [2024-06-24 12:50:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10870587392. Throughput: 0: 42844.9. Samples: 10870688740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 12:50:08,989][15401] Updated weights for policy 0, policy_version 663491 (0.0038) [2024-06-24 12:50:12,608][15401] Updated weights for policy 0, policy_version 663501 (0.0030) [2024-06-24 12:50:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10870816768. Throughput: 0: 42841.3. Samples: 10870946520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 12:50:16,677][15401] Updated weights for policy 0, policy_version 663511 (0.0027) [2024-06-24 12:50:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 10871046144. Throughput: 0: 42778.1. Samples: 10871196760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 12:50:20,519][15401] Updated weights for policy 0, policy_version 663521 (0.0033) [2024-06-24 12:50:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10871242752. Throughput: 0: 42824.4. Samples: 10871332100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 12:50:24,018][15401] Updated weights for policy 0, policy_version 663531 (0.0026) [2024-06-24 12:50:27,899][15401] Updated weights for policy 0, policy_version 663541 (0.0037) [2024-06-24 12:50:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 10871472128. Throughput: 0: 42980.3. Samples: 10871594680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:28,393][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 12:50:31,591][15401] Updated weights for policy 0, policy_version 663551 (0.0035) [2024-06-24 12:50:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43145.4, 300 sec: 42709.5). Total num frames: 10871701504. Throughput: 0: 42894.5. Samples: 10871849300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 12:50:35,479][15401] Updated weights for policy 0, policy_version 663561 (0.0035) [2024-06-24 12:50:38,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10871898112. Throughput: 0: 43132.6. Samples: 10871987540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 12:50:39,302][15401] Updated weights for policy 0, policy_version 663571 (0.0028) [2024-06-24 12:50:42,941][15401] Updated weights for policy 0, policy_version 663581 (0.0035) [2024-06-24 12:50:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10872111104. Throughput: 0: 43103.0. Samples: 10872244100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 12:50:46,084][15349] Signal inference workers to stop experience collection... (160900 times) [2024-06-24 12:50:46,085][15349] Signal inference workers to resume experience collection... (160900 times) [2024-06-24 12:50:46,120][15401] InferenceWorker_p0-w0: stopping experience collection (160900 times) [2024-06-24 12:50:46,120][15401] InferenceWorker_p0-w0: resuming experience collection (160900 times) [2024-06-24 12:50:46,751][15401] Updated weights for policy 0, policy_version 663591 (0.0026) [2024-06-24 12:50:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10872340480. Throughput: 0: 43112.0. Samples: 10872497120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 12:50:50,922][15401] Updated weights for policy 0, policy_version 663601 (0.0035) [2024-06-24 12:50:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 10872553472. Throughput: 0: 43166.8. Samples: 10872631240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 12:50:54,867][15401] Updated weights for policy 0, policy_version 663611 (0.0039) [2024-06-24 12:50:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10872750080. Throughput: 0: 43066.2. Samples: 10872884500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:50:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 12:50:58,535][15401] Updated weights for policy 0, policy_version 663621 (0.0038) [2024-06-24 12:51:02,435][15401] Updated weights for policy 0, policy_version 663631 (0.0026) [2024-06-24 12:51:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10872995840. Throughput: 0: 43135.5. Samples: 10873137860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 12:51:06,039][15401] Updated weights for policy 0, policy_version 663641 (0.0026) [2024-06-24 12:51:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10873176064. Throughput: 0: 43143.1. Samples: 10873273540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 12:51:09,945][15401] Updated weights for policy 0, policy_version 663651 (0.0031) [2024-06-24 12:51:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10873405440. Throughput: 0: 43069.9. Samples: 10873532720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 12:51:13,537][15401] Updated weights for policy 0, policy_version 663661 (0.0033) [2024-06-24 12:51:17,383][15401] Updated weights for policy 0, policy_version 663671 (0.0029) [2024-06-24 12:51:18,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 10873634816. Throughput: 0: 43147.2. Samples: 10873790920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 12:51:21,212][15401] Updated weights for policy 0, policy_version 663681 (0.0029) [2024-06-24 12:51:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10873831424. Throughput: 0: 42891.9. Samples: 10873917680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 12:51:24,888][15401] Updated weights for policy 0, policy_version 663691 (0.0039) [2024-06-24 12:51:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10874028032. Throughput: 0: 42901.5. Samples: 10874174660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 12:51:28,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-24 12:51:29,072][15401] Updated weights for policy 0, policy_version 663701 (0.0028) [2024-06-24 12:51:32,457][15401] Updated weights for policy 0, policy_version 663711 (0.0041) [2024-06-24 12:51:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10874290176. Throughput: 0: 42896.4. Samples: 10874427460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 12:51:36,668][15401] Updated weights for policy 0, policy_version 663721 (0.0033) [2024-06-24 12:51:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10874470400. Throughput: 0: 42950.6. Samples: 10874564020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 12:51:40,100][15401] Updated weights for policy 0, policy_version 663731 (0.0045) [2024-06-24 12:51:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10874683392. Throughput: 0: 42957.8. Samples: 10874817600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 12:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663738_10874683392.pth... [2024-06-24 12:51:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663112_10864427008.pth [2024-06-24 12:51:44,043][15401] Updated weights for policy 0, policy_version 663741 (0.0031) [2024-06-24 12:51:47,753][15401] Updated weights for policy 0, policy_version 663751 (0.0035) [2024-06-24 12:51:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 10874929152. Throughput: 0: 43013.4. Samples: 10875073460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 12:51:51,758][15401] Updated weights for policy 0, policy_version 663761 (0.0035) [2024-06-24 12:51:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10875125760. Throughput: 0: 43041.9. Samples: 10875210420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 12:51:55,355][15401] Updated weights for policy 0, policy_version 663771 (0.0036) [2024-06-24 12:51:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10875338752. Throughput: 0: 42773.8. Samples: 10875457540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:51:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 12:51:59,313][15401] Updated weights for policy 0, policy_version 663781 (0.0036) [2024-06-24 12:52:03,017][15401] Updated weights for policy 0, policy_version 663791 (0.0049) [2024-06-24 12:52:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10875568128. Throughput: 0: 42774.8. Samples: 10875715780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 12:52:06,894][15401] Updated weights for policy 0, policy_version 663801 (0.0033) [2024-06-24 12:52:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10875748352. Throughput: 0: 42893.0. Samples: 10875847860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 12:52:10,625][15401] Updated weights for policy 0, policy_version 663811 (0.0041) [2024-06-24 12:52:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10875961344. Throughput: 0: 42852.9. Samples: 10876103040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 12:52:14,908][15401] Updated weights for policy 0, policy_version 663821 (0.0026) [2024-06-24 12:52:16,641][15349] Signal inference workers to stop experience collection... (160950 times) [2024-06-24 12:52:16,644][15349] Signal inference workers to resume experience collection... (160950 times) [2024-06-24 12:52:16,659][15401] InferenceWorker_p0-w0: stopping experience collection (160950 times) [2024-06-24 12:52:16,659][15401] InferenceWorker_p0-w0: resuming experience collection (160950 times) [2024-06-24 12:52:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 10876190720. Throughput: 0: 42923.0. Samples: 10876359000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:18,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 12:52:18,464][15401] Updated weights for policy 0, policy_version 663831 (0.0032) [2024-06-24 12:52:22,755][15401] Updated weights for policy 0, policy_version 663841 (0.0034) [2024-06-24 12:52:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10876403712. Throughput: 0: 42625.7. Samples: 10876482180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 12:52:25,946][15401] Updated weights for policy 0, policy_version 663851 (0.0038) [2024-06-24 12:52:28,390][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 10876633088. Throughput: 0: 42764.4. Samples: 10876742000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 12:52:30,143][15401] Updated weights for policy 0, policy_version 663861 (0.0036) [2024-06-24 12:52:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10876846080. Throughput: 0: 42900.3. Samples: 10877003980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:33,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 12:52:33,906][15401] Updated weights for policy 0, policy_version 663871 (0.0032) [2024-06-24 12:52:37,539][15401] Updated weights for policy 0, policy_version 663881 (0.0032) [2024-06-24 12:52:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10877042688. Throughput: 0: 42624.4. Samples: 10877128520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 12:52:41,462][15401] Updated weights for policy 0, policy_version 663891 (0.0031) [2024-06-24 12:52:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42821.2). Total num frames: 10877272064. Throughput: 0: 42999.1. Samples: 10877392500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 12:52:44,932][15401] Updated weights for policy 0, policy_version 663901 (0.0027) [2024-06-24 12:52:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10877501440. Throughput: 0: 42963.0. Samples: 10877649120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 12:52:49,062][15401] Updated weights for policy 0, policy_version 663911 (0.0031) [2024-06-24 12:52:52,345][15401] Updated weights for policy 0, policy_version 663921 (0.0034) [2024-06-24 12:52:53,391][15132] Fps is (10 sec: 42593.5, 60 sec: 42870.6, 300 sec: 42875.9). Total num frames: 10877698048. Throughput: 0: 42983.8. Samples: 10877782180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:53,391][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 12:52:56,693][15401] Updated weights for policy 0, policy_version 663931 (0.0040) [2024-06-24 12:52:58,390][15132] Fps is (10 sec: 42596.4, 60 sec: 43144.2, 300 sec: 42876.0). Total num frames: 10877927424. Throughput: 0: 43173.7. Samples: 10878045880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:52:58,391][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 12:52:59,690][15401] Updated weights for policy 0, policy_version 663941 (0.0036) [2024-06-24 12:53:03,390][15132] Fps is (10 sec: 45880.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10878156800. Throughput: 0: 43193.9. Samples: 10878302720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 12:53:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 12:53:04,310][15401] Updated weights for policy 0, policy_version 663951 (0.0037) [2024-06-24 12:53:07,626][15401] Updated weights for policy 0, policy_version 663961 (0.0032) [2024-06-24 12:53:08,390][15132] Fps is (10 sec: 42600.2, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 10878353408. Throughput: 0: 43321.7. Samples: 10878431660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 12:53:11,778][15401] Updated weights for policy 0, policy_version 663971 (0.0033) [2024-06-24 12:53:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 10878582784. Throughput: 0: 43262.3. Samples: 10878688800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 12:53:15,169][15401] Updated weights for policy 0, policy_version 663981 (0.0037) [2024-06-24 12:53:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10878763008. Throughput: 0: 43110.3. Samples: 10878943940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 12:53:19,383][15401] Updated weights for policy 0, policy_version 663991 (0.0039) [2024-06-24 12:53:20,706][15349] Signal inference workers to stop experience collection... (161000 times) [2024-06-24 12:53:20,744][15401] InferenceWorker_p0-w0: stopping experience collection (161000 times) [2024-06-24 12:53:20,775][15349] Signal inference workers to resume experience collection... (161000 times) [2024-06-24 12:53:20,776][15401] InferenceWorker_p0-w0: resuming experience collection (161000 times) [2024-06-24 12:53:22,887][15401] Updated weights for policy 0, policy_version 664001 (0.0030) [2024-06-24 12:53:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10878992384. Throughput: 0: 43129.7. Samples: 10879069360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 12:53:27,179][15401] Updated weights for policy 0, policy_version 664011 (0.0037) [2024-06-24 12:53:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10879205376. Throughput: 0: 43020.0. Samples: 10879328400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 12:53:30,449][15401] Updated weights for policy 0, policy_version 664021 (0.0027) [2024-06-24 12:53:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10879401984. Throughput: 0: 43042.7. Samples: 10879586040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:33,390][15132] Avg episode reward: [(0, '0.135')] [2024-06-24 12:53:35,100][15401] Updated weights for policy 0, policy_version 664031 (0.0039) [2024-06-24 12:53:37,986][15401] Updated weights for policy 0, policy_version 664041 (0.0035) [2024-06-24 12:53:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 10879647744. Throughput: 0: 42886.8. Samples: 10879712140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:38,393][15132] Avg episode reward: [(0, '0.258')] [2024-06-24 12:53:42,831][15401] Updated weights for policy 0, policy_version 664051 (0.0034) [2024-06-24 12:53:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10879844352. Throughput: 0: 42737.4. Samples: 10879969040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:43,390][15132] Avg episode reward: [(0, '0.183')] [2024-06-24 12:53:43,533][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664054_10879860736.pth... [2024-06-24 12:53:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663425_10869555200.pth [2024-06-24 12:53:46,032][15401] Updated weights for policy 0, policy_version 664061 (0.0043) [2024-06-24 12:53:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10880057344. Throughput: 0: 42749.9. Samples: 10880226460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 12:53:50,294][15401] Updated weights for policy 0, policy_version 664071 (0.0030) [2024-06-24 12:53:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43145.3, 300 sec: 42987.2). Total num frames: 10880286720. Throughput: 0: 42640.9. Samples: 10880350500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 12:53:53,418][15401] Updated weights for policy 0, policy_version 664081 (0.0023) [2024-06-24 12:53:57,891][15401] Updated weights for policy 0, policy_version 664091 (0.0038) [2024-06-24 12:53:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.8, 300 sec: 42931.6). Total num frames: 10880483328. Throughput: 0: 42771.2. Samples: 10880613500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:53:58,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 12:54:00,967][15401] Updated weights for policy 0, policy_version 664101 (0.0041) [2024-06-24 12:54:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10880696320. Throughput: 0: 42769.3. Samples: 10880868560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 12:54:05,559][15401] Updated weights for policy 0, policy_version 664111 (0.0036) [2024-06-24 12:54:08,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 10880942080. Throughput: 0: 42841.7. Samples: 10880997240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:08,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 12:54:08,618][15401] Updated weights for policy 0, policy_version 664121 (0.0036) [2024-06-24 12:54:13,106][15401] Updated weights for policy 0, policy_version 664131 (0.0035) [2024-06-24 12:54:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 10881122304. Throughput: 0: 42844.4. Samples: 10881256400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 12:54:16,962][15401] Updated weights for policy 0, policy_version 664141 (0.0033) [2024-06-24 12:54:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10881351680. Throughput: 0: 42748.4. Samples: 10881509720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 12:54:20,555][15401] Updated weights for policy 0, policy_version 664151 (0.0028) [2024-06-24 12:54:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10881564672. Throughput: 0: 42807.0. Samples: 10881638360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 12:54:24,329][15401] Updated weights for policy 0, policy_version 664161 (0.0032) [2024-06-24 12:54:28,227][15401] Updated weights for policy 0, policy_version 664171 (0.0033) [2024-06-24 12:54:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.8). Total num frames: 10881777664. Throughput: 0: 42884.3. Samples: 10881898840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 12:54:32,273][15401] Updated weights for policy 0, policy_version 664181 (0.0040) [2024-06-24 12:54:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10881974272. Throughput: 0: 42800.8. Samples: 10882152500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 12:54:35,789][15401] Updated weights for policy 0, policy_version 664191 (0.0040) [2024-06-24 12:54:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 10882220032. Throughput: 0: 42824.0. Samples: 10882277580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 12:54:39,933][15401] Updated weights for policy 0, policy_version 664201 (0.0033) [2024-06-24 12:54:43,012][15349] Signal inference workers to stop experience collection... (161050 times) [2024-06-24 12:54:43,069][15401] InferenceWorker_p0-w0: stopping experience collection (161050 times) [2024-06-24 12:54:43,071][15349] Signal inference workers to resume experience collection... (161050 times) [2024-06-24 12:54:43,082][15401] InferenceWorker_p0-w0: resuming experience collection (161050 times) [2024-06-24 12:54:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10882416640. Throughput: 0: 42727.9. Samples: 10882536260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-24 12:54:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 12:54:43,553][15401] Updated weights for policy 0, policy_version 664211 (0.0029) [2024-06-24 12:54:47,495][15401] Updated weights for policy 0, policy_version 664221 (0.0034) [2024-06-24 12:54:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10882629632. Throughput: 0: 42686.6. Samples: 10882789460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:54:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 12:54:51,217][15401] Updated weights for policy 0, policy_version 664231 (0.0036) [2024-06-24 12:54:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 10882859008. Throughput: 0: 42641.6. Samples: 10882916100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:54:53,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 12:54:54,928][15401] Updated weights for policy 0, policy_version 664241 (0.0033) [2024-06-24 12:54:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10883039232. Throughput: 0: 42651.6. Samples: 10883175720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:54:58,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 12:54:58,967][15401] Updated weights for policy 0, policy_version 664251 (0.0044) [2024-06-24 12:55:02,440][15401] Updated weights for policy 0, policy_version 664261 (0.0028) [2024-06-24 12:55:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 10883268608. Throughput: 0: 42761.9. Samples: 10883434000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:03,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 12:55:06,500][15401] Updated weights for policy 0, policy_version 664271 (0.0037) [2024-06-24 12:55:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 10883497984. Throughput: 0: 42837.8. Samples: 10883566060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 12:55:10,266][15401] Updated weights for policy 0, policy_version 664281 (0.0025) [2024-06-24 12:55:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10883678208. Throughput: 0: 42661.0. Samples: 10883818580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 12:55:14,053][15401] Updated weights for policy 0, policy_version 664291 (0.0044) [2024-06-24 12:55:17,772][15401] Updated weights for policy 0, policy_version 664301 (0.0040) [2024-06-24 12:55:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10883907584. Throughput: 0: 42640.4. Samples: 10884071320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:18,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 12:55:21,796][15401] Updated weights for policy 0, policy_version 664311 (0.0036) [2024-06-24 12:55:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 10884136960. Throughput: 0: 42847.1. Samples: 10884205700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:23,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-24 12:55:25,728][15401] Updated weights for policy 0, policy_version 664321 (0.0032) [2024-06-24 12:55:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10884333568. Throughput: 0: 42797.8. Samples: 10884462160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:28,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 12:55:29,457][15401] Updated weights for policy 0, policy_version 664331 (0.0036) [2024-06-24 12:55:33,359][15401] Updated weights for policy 0, policy_version 664341 (0.0034) [2024-06-24 12:55:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10884562944. Throughput: 0: 42678.7. Samples: 10884710000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:33,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 12:55:37,225][15401] Updated weights for policy 0, policy_version 664351 (0.0025) [2024-06-24 12:55:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10884775936. Throughput: 0: 42880.2. Samples: 10884845720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 12:55:41,181][15401] Updated weights for policy 0, policy_version 664361 (0.0031) [2024-06-24 12:55:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10884972544. Throughput: 0: 42735.1. Samples: 10885098800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 12:55:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664366_10884972544.pth... [2024-06-24 12:55:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000663738_10874683392.pth [2024-06-24 12:55:44,922][15401] Updated weights for policy 0, policy_version 664371 (0.0023) [2024-06-24 12:55:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10885185536. Throughput: 0: 42779.0. Samples: 10885359060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:48,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 12:55:48,770][15401] Updated weights for policy 0, policy_version 664381 (0.0041) [2024-06-24 12:55:52,409][15401] Updated weights for policy 0, policy_version 664391 (0.0030) [2024-06-24 12:55:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10885414912. Throughput: 0: 42615.6. Samples: 10885483760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 12:55:56,580][15401] Updated weights for policy 0, policy_version 664401 (0.0029) [2024-06-24 12:55:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10885611520. Throughput: 0: 42728.4. Samples: 10885741360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:55:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 12:55:59,168][15349] Signal inference workers to stop experience collection... (161100 times) [2024-06-24 12:55:59,222][15401] InferenceWorker_p0-w0: stopping experience collection (161100 times) [2024-06-24 12:55:59,280][15349] Signal inference workers to resume experience collection... (161100 times) [2024-06-24 12:55:59,280][15401] InferenceWorker_p0-w0: resuming experience collection (161100 times) [2024-06-24 12:55:59,939][15401] Updated weights for policy 0, policy_version 664411 (0.0031) [2024-06-24 12:56:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 10885824512. Throughput: 0: 42869.5. Samples: 10886000440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:56:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 12:56:04,189][15401] Updated weights for policy 0, policy_version 664421 (0.0042) [2024-06-24 12:56:07,721][15401] Updated weights for policy 0, policy_version 664431 (0.0040) [2024-06-24 12:56:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10886037504. Throughput: 0: 42775.7. Samples: 10886130600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:56:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 12:56:11,840][15401] Updated weights for policy 0, policy_version 664441 (0.0031) [2024-06-24 12:56:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10886266880. Throughput: 0: 42711.1. Samples: 10886384160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:56:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 12:56:15,451][15401] Updated weights for policy 0, policy_version 664451 (0.0040) [2024-06-24 12:56:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 10886447104. Throughput: 0: 42802.3. Samples: 10886636100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:56:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 12:56:19,545][15401] Updated weights for policy 0, policy_version 664461 (0.0049) [2024-06-24 12:56:22,978][15401] Updated weights for policy 0, policy_version 664471 (0.0034) [2024-06-24 12:56:23,393][15132] Fps is (10 sec: 42583.3, 60 sec: 42595.9, 300 sec: 42931.1). Total num frames: 10886692864. Throughput: 0: 42524.3. Samples: 10886759460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-24 12:56:23,394][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 12:56:27,314][15401] Updated weights for policy 0, policy_version 664481 (0.0033) [2024-06-24 12:56:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10886905856. Throughput: 0: 42649.7. Samples: 10887018040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:28,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 12:56:30,620][15401] Updated weights for policy 0, policy_version 664491 (0.0028) [2024-06-24 12:56:33,389][15132] Fps is (10 sec: 40975.3, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 10887102464. Throughput: 0: 42577.1. Samples: 10887275020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 12:56:34,885][15401] Updated weights for policy 0, policy_version 664501 (0.0042) [2024-06-24 12:56:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10887331840. Throughput: 0: 42630.3. Samples: 10887402120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 12:56:38,466][15401] Updated weights for policy 0, policy_version 664511 (0.0037) [2024-06-24 12:56:42,294][15401] Updated weights for policy 0, policy_version 664521 (0.0034) [2024-06-24 12:56:43,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10887561216. Throughput: 0: 42719.0. Samples: 10887663720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:43,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-24 12:56:46,412][15401] Updated weights for policy 0, policy_version 664531 (0.0024) [2024-06-24 12:56:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10887757824. Throughput: 0: 42667.2. Samples: 10887920460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 12:56:49,819][15401] Updated weights for policy 0, policy_version 664541 (0.0030) [2024-06-24 12:56:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10887970816. Throughput: 0: 42551.6. Samples: 10888045420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 12:56:54,220][15401] Updated weights for policy 0, policy_version 664551 (0.0028) [2024-06-24 12:56:57,732][15401] Updated weights for policy 0, policy_version 664561 (0.0036) [2024-06-24 12:56:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 10888216576. Throughput: 0: 42709.4. Samples: 10888306080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:56:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 12:57:01,948][15401] Updated weights for policy 0, policy_version 664571 (0.0035) [2024-06-24 12:57:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10888364032. Throughput: 0: 42804.9. Samples: 10888562320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 12:57:05,243][15401] Updated weights for policy 0, policy_version 664581 (0.0023) [2024-06-24 12:57:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10888609792. Throughput: 0: 42696.6. Samples: 10888680660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 12:57:09,645][15401] Updated weights for policy 0, policy_version 664591 (0.0035) [2024-06-24 12:57:12,823][15401] Updated weights for policy 0, policy_version 664601 (0.0025) [2024-06-24 12:57:13,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10888839168. Throughput: 0: 42697.0. Samples: 10888939400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 12:57:17,708][15401] Updated weights for policy 0, policy_version 664611 (0.0029) [2024-06-24 12:57:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10889003008. Throughput: 0: 42731.9. Samples: 10889197960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 12:57:19,736][15349] Signal inference workers to stop experience collection... (161150 times) [2024-06-24 12:57:19,738][15349] Signal inference workers to resume experience collection... (161150 times) [2024-06-24 12:57:19,778][15401] InferenceWorker_p0-w0: stopping experience collection (161150 times) [2024-06-24 12:57:19,779][15401] InferenceWorker_p0-w0: resuming experience collection (161150 times) [2024-06-24 12:57:20,454][15401] Updated weights for policy 0, policy_version 664621 (0.0029) [2024-06-24 12:57:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.9, 300 sec: 42820.5). Total num frames: 10889265152. Throughput: 0: 42592.7. Samples: 10889318800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 12:57:25,213][15401] Updated weights for policy 0, policy_version 664631 (0.0036) [2024-06-24 12:57:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10889461760. Throughput: 0: 42626.7. Samples: 10889581920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 12:57:28,517][15401] Updated weights for policy 0, policy_version 664641 (0.0034) [2024-06-24 12:57:32,793][15401] Updated weights for policy 0, policy_version 664651 (0.0047) [2024-06-24 12:57:33,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10889641984. Throughput: 0: 42511.4. Samples: 10889833480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 12:57:36,451][15401] Updated weights for policy 0, policy_version 664661 (0.0033) [2024-06-24 12:57:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10889871360. Throughput: 0: 42468.9. Samples: 10889956520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 12:57:40,339][15401] Updated weights for policy 0, policy_version 664671 (0.0022) [2024-06-24 12:57:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10890100736. Throughput: 0: 42521.1. Samples: 10890219540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 12:57:43,547][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664680_10890117120.pth... [2024-06-24 12:57:43,612][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664054_10879860736.pth [2024-06-24 12:57:44,178][15401] Updated weights for policy 0, policy_version 664681 (0.0029) [2024-06-24 12:57:48,275][15401] Updated weights for policy 0, policy_version 664691 (0.0040) [2024-06-24 12:57:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 10890297344. Throughput: 0: 42510.2. Samples: 10890475280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 12:57:51,779][15401] Updated weights for policy 0, policy_version 664701 (0.0029) [2024-06-24 12:57:53,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 10890543104. Throughput: 0: 42622.4. Samples: 10890598660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:53,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 12:57:55,874][15401] Updated weights for policy 0, policy_version 664711 (0.0034) [2024-06-24 12:57:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 10890723328. Throughput: 0: 42727.4. Samples: 10890862140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 12:57:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 12:57:59,399][15401] Updated weights for policy 0, policy_version 664721 (0.0040) [2024-06-24 12:58:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10890936320. Throughput: 0: 42640.4. Samples: 10891116780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 12:58:03,590][15401] Updated weights for policy 0, policy_version 664731 (0.0036) [2024-06-24 12:58:07,068][15401] Updated weights for policy 0, policy_version 664741 (0.0042) [2024-06-24 12:58:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10891165696. Throughput: 0: 42681.5. Samples: 10891239460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 12:58:11,153][15401] Updated weights for policy 0, policy_version 664751 (0.0033) [2024-06-24 12:58:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 10891362304. Throughput: 0: 42729.0. Samples: 10891504720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 12:58:14,818][15401] Updated weights for policy 0, policy_version 664761 (0.0023) [2024-06-24 12:58:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 10891591680. Throughput: 0: 42796.4. Samples: 10891759320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:18,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 12:58:18,549][15401] Updated weights for policy 0, policy_version 664771 (0.0033) [2024-06-24 12:58:22,152][15401] Updated weights for policy 0, policy_version 664781 (0.0051) [2024-06-24 12:58:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 10891804672. Throughput: 0: 42841.8. Samples: 10891884400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 12:58:26,057][15401] Updated weights for policy 0, policy_version 664791 (0.0035) [2024-06-24 12:58:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10892017664. Throughput: 0: 42949.1. Samples: 10892152240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:28,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 12:58:29,961][15401] Updated weights for policy 0, policy_version 664801 (0.0037) [2024-06-24 12:58:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 10892230656. Throughput: 0: 42723.5. Samples: 10892397840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 12:58:34,116][15401] Updated weights for policy 0, policy_version 664811 (0.0030) [2024-06-24 12:58:37,857][15401] Updated weights for policy 0, policy_version 664821 (0.0032) [2024-06-24 12:58:37,860][15349] Signal inference workers to stop experience collection... (161200 times) [2024-06-24 12:58:37,860][15349] Signal inference workers to resume experience collection... (161200 times) [2024-06-24 12:58:37,891][15401] InferenceWorker_p0-w0: stopping experience collection (161200 times) [2024-06-24 12:58:37,891][15401] InferenceWorker_p0-w0: resuming experience collection (161200 times) [2024-06-24 12:58:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10892460032. Throughput: 0: 42956.8. Samples: 10892531720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 12:58:41,594][15401] Updated weights for policy 0, policy_version 664831 (0.0046) [2024-06-24 12:58:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10892673024. Throughput: 0: 42871.2. Samples: 10892791340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 12:58:45,472][15401] Updated weights for policy 0, policy_version 664841 (0.0033) [2024-06-24 12:58:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10892869632. Throughput: 0: 42703.9. Samples: 10893038460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 12:58:49,329][15401] Updated weights for policy 0, policy_version 664851 (0.0043) [2024-06-24 12:58:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 10893066240. Throughput: 0: 42817.0. Samples: 10893166220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 12:58:53,423][15401] Updated weights for policy 0, policy_version 664861 (0.0032) [2024-06-24 12:58:56,809][15401] Updated weights for policy 0, policy_version 664871 (0.0026) [2024-06-24 12:58:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10893279232. Throughput: 0: 42555.9. Samples: 10893419740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:58:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 12:59:01,144][15401] Updated weights for policy 0, policy_version 664881 (0.0034) [2024-06-24 12:59:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10893524992. Throughput: 0: 42515.5. Samples: 10893672520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 12:59:04,601][15401] Updated weights for policy 0, policy_version 664891 (0.0027) [2024-06-24 12:59:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10893721600. Throughput: 0: 42750.2. Samples: 10893808160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 12:59:08,679][15401] Updated weights for policy 0, policy_version 664901 (0.0029) [2024-06-24 12:59:12,629][15401] Updated weights for policy 0, policy_version 664911 (0.0038) [2024-06-24 12:59:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 10893934592. Throughput: 0: 42365.8. Samples: 10894058700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 12:59:16,104][15401] Updated weights for policy 0, policy_version 664921 (0.0033) [2024-06-24 12:59:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10894163968. Throughput: 0: 42585.3. Samples: 10894314180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 12:59:20,445][15401] Updated weights for policy 0, policy_version 664931 (0.0033) [2024-06-24 12:59:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10894376960. Throughput: 0: 42540.0. Samples: 10894446020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 12:59:23,588][15401] Updated weights for policy 0, policy_version 664941 (0.0037) [2024-06-24 12:59:27,943][15401] Updated weights for policy 0, policy_version 664951 (0.0032) [2024-06-24 12:59:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 10894557184. Throughput: 0: 42358.7. Samples: 10894697480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 12:59:31,160][15401] Updated weights for policy 0, policy_version 664961 (0.0033) [2024-06-24 12:59:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10894786560. Throughput: 0: 42611.5. Samples: 10894955980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 12:59:35,511][15401] Updated weights for policy 0, policy_version 664971 (0.0032) [2024-06-24 12:59:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10895015936. Throughput: 0: 42665.2. Samples: 10895086160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-24 12:59:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 12:59:39,435][15401] Updated weights for policy 0, policy_version 664981 (0.0028) [2024-06-24 12:59:43,184][15401] Updated weights for policy 0, policy_version 664991 (0.0040) [2024-06-24 12:59:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10895212544. Throughput: 0: 42669.7. Samples: 10895339880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 12:59:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 12:59:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664991_10895212544.pth... [2024-06-24 12:59:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664366_10884972544.pth [2024-06-24 12:59:46,896][15401] Updated weights for policy 0, policy_version 665001 (0.0021) [2024-06-24 12:59:48,391][15132] Fps is (10 sec: 40951.9, 60 sec: 42597.0, 300 sec: 42598.1). Total num frames: 10895425536. Throughput: 0: 42745.3. Samples: 10895596140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 12:59:48,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 12:59:50,743][15401] Updated weights for policy 0, policy_version 665011 (0.0055) [2024-06-24 12:59:53,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.7, 300 sec: 42820.2). Total num frames: 10895671296. Throughput: 0: 42471.9. Samples: 10895719500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 12:59:53,393][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 12:59:54,502][15401] Updated weights for policy 0, policy_version 665021 (0.0036) [2024-06-24 12:59:58,390][15132] Fps is (10 sec: 42606.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10895851520. Throughput: 0: 42679.0. Samples: 10895979260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 12:59:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 12:59:58,451][15401] Updated weights for policy 0, policy_version 665031 (0.0021) [2024-06-24 13:00:01,973][15401] Updated weights for policy 0, policy_version 665041 (0.0040) [2024-06-24 13:00:03,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10896064512. Throughput: 0: 42586.7. Samples: 10896230580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 13:00:05,454][15349] Signal inference workers to stop experience collection... (161250 times) [2024-06-24 13:00:05,455][15349] Signal inference workers to resume experience collection... (161250 times) [2024-06-24 13:00:05,469][15401] InferenceWorker_p0-w0: stopping experience collection (161250 times) [2024-06-24 13:00:05,469][15401] InferenceWorker_p0-w0: resuming experience collection (161250 times) [2024-06-24 13:00:06,102][15401] Updated weights for policy 0, policy_version 665051 (0.0024) [2024-06-24 13:00:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10896277504. Throughput: 0: 42500.3. Samples: 10896358640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:08,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 13:00:09,629][15401] Updated weights for policy 0, policy_version 665061 (0.0039) [2024-06-24 13:00:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10896490496. Throughput: 0: 42846.3. Samples: 10896625560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 13:00:13,710][15401] Updated weights for policy 0, policy_version 665071 (0.0026) [2024-06-24 13:00:17,302][15401] Updated weights for policy 0, policy_version 665081 (0.0029) [2024-06-24 13:00:18,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10896719872. Throughput: 0: 42545.4. Samples: 10896870520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 13:00:21,689][15401] Updated weights for policy 0, policy_version 665091 (0.0042) [2024-06-24 13:00:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10896932864. Throughput: 0: 42662.7. Samples: 10897005980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:23,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 13:00:24,829][15401] Updated weights for policy 0, policy_version 665101 (0.0033) [2024-06-24 13:00:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10897113088. Throughput: 0: 42624.2. Samples: 10897257960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 13:00:29,520][15401] Updated weights for policy 0, policy_version 665111 (0.0033) [2024-06-24 13:00:32,502][15401] Updated weights for policy 0, policy_version 665121 (0.0036) [2024-06-24 13:00:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10897358848. Throughput: 0: 42426.2. Samples: 10897505240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 13:00:36,995][15401] Updated weights for policy 0, policy_version 665131 (0.0041) [2024-06-24 13:00:38,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10897571840. Throughput: 0: 42787.7. Samples: 10897644840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:38,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-24 13:00:40,357][15401] Updated weights for policy 0, policy_version 665141 (0.0030) [2024-06-24 13:00:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10897752064. Throughput: 0: 42660.5. Samples: 10897898980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 13:00:44,700][15401] Updated weights for policy 0, policy_version 665151 (0.0048) [2024-06-24 13:00:47,967][15401] Updated weights for policy 0, policy_version 665161 (0.0029) [2024-06-24 13:00:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42872.8, 300 sec: 42653.9). Total num frames: 10897997824. Throughput: 0: 42489.3. Samples: 10898142600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 13:00:52,288][15401] Updated weights for policy 0, policy_version 665171 (0.0029) [2024-06-24 13:00:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 10898194432. Throughput: 0: 42744.4. Samples: 10898282040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 13:00:55,618][15401] Updated weights for policy 0, policy_version 665181 (0.0027) [2024-06-24 13:00:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 10898374656. Throughput: 0: 42566.9. Samples: 10898541080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:00:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 13:00:59,850][15401] Updated weights for policy 0, policy_version 665191 (0.0040) [2024-06-24 13:01:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10898636800. Throughput: 0: 42519.1. Samples: 10898783880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:01:03,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 13:01:03,722][15401] Updated weights for policy 0, policy_version 665201 (0.0043) [2024-06-24 13:01:07,737][15401] Updated weights for policy 0, policy_version 665211 (0.0036) [2024-06-24 13:01:08,392][15132] Fps is (10 sec: 47502.7, 60 sec: 42871.5, 300 sec: 42653.6). Total num frames: 10898849792. Throughput: 0: 42508.8. Samples: 10898918980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:01:08,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 13:01:11,473][15401] Updated weights for policy 0, policy_version 665221 (0.0036) [2024-06-24 13:01:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10899046400. Throughput: 0: 42609.7. Samples: 10899175400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:01:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 13:01:15,473][15401] Updated weights for policy 0, policy_version 665231 (0.0033) [2024-06-24 13:01:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 10899259392. Throughput: 0: 42860.6. Samples: 10899433960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 13:01:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 13:01:18,945][15401] Updated weights for policy 0, policy_version 665241 (0.0038) [2024-06-24 13:01:23,152][15401] Updated weights for policy 0, policy_version 665251 (0.0060) [2024-06-24 13:01:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10899472384. Throughput: 0: 42556.5. Samples: 10899559880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 13:01:24,982][15349] Signal inference workers to stop experience collection... (161300 times) [2024-06-24 13:01:24,983][15349] Signal inference workers to resume experience collection... (161300 times) [2024-06-24 13:01:25,029][15401] InferenceWorker_p0-w0: stopping experience collection (161300 times) [2024-06-24 13:01:25,029][15401] InferenceWorker_p0-w0: resuming experience collection (161300 times) [2024-06-24 13:01:26,906][15401] Updated weights for policy 0, policy_version 665261 (0.0033) [2024-06-24 13:01:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10899668992. Throughput: 0: 42604.5. Samples: 10899816180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 13:01:30,704][15401] Updated weights for policy 0, policy_version 665271 (0.0041) [2024-06-24 13:01:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10899914752. Throughput: 0: 42841.9. Samples: 10900070480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:33,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-24 13:01:34,542][15401] Updated weights for policy 0, policy_version 665281 (0.0040) [2024-06-24 13:01:38,200][15401] Updated weights for policy 0, policy_version 665291 (0.0026) [2024-06-24 13:01:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10900127744. Throughput: 0: 42720.4. Samples: 10900204460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 13:01:42,680][15401] Updated weights for policy 0, policy_version 665301 (0.0040) [2024-06-24 13:01:43,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 10900307968. Throughput: 0: 42572.9. Samples: 10900456960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:43,393][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 13:01:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665302_10900307968.pth... [2024-06-24 13:01:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664680_10890117120.pth [2024-06-24 13:01:45,661][15401] Updated weights for policy 0, policy_version 665311 (0.0036) [2024-06-24 13:01:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10900553728. Throughput: 0: 42790.7. Samples: 10900709460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 13:01:50,155][15401] Updated weights for policy 0, policy_version 665321 (0.0026) [2024-06-24 13:01:53,392][15132] Fps is (10 sec: 45872.9, 60 sec: 42869.4, 300 sec: 42542.4). Total num frames: 10900766720. Throughput: 0: 42964.8. Samples: 10900852420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:53,393][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 13:01:53,792][15401] Updated weights for policy 0, policy_version 665331 (0.0038) [2024-06-24 13:01:57,611][15401] Updated weights for policy 0, policy_version 665341 (0.0035) [2024-06-24 13:01:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10900963328. Throughput: 0: 42771.6. Samples: 10901100120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:01:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 13:02:01,317][15401] Updated weights for policy 0, policy_version 665351 (0.0035) [2024-06-24 13:02:03,390][15132] Fps is (10 sec: 42610.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10901192704. Throughput: 0: 42703.5. Samples: 10901355620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:03,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 13:02:05,268][15401] Updated weights for policy 0, policy_version 665361 (0.0034) [2024-06-24 13:02:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 10901422080. Throughput: 0: 42930.1. Samples: 10901491740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 13:02:08,706][15401] Updated weights for policy 0, policy_version 665371 (0.0031) [2024-06-24 13:02:13,269][15401] Updated weights for policy 0, policy_version 665381 (0.0029) [2024-06-24 13:02:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10901602304. Throughput: 0: 42942.9. Samples: 10901748620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 13:02:16,186][15401] Updated weights for policy 0, policy_version 665391 (0.0029) [2024-06-24 13:02:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10901831680. Throughput: 0: 43010.7. Samples: 10902005960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 13:02:20,750][15401] Updated weights for policy 0, policy_version 665401 (0.0036) [2024-06-24 13:02:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10902061056. Throughput: 0: 42977.9. Samples: 10902138460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 13:02:23,743][15401] Updated weights for policy 0, policy_version 665411 (0.0026) [2024-06-24 13:02:28,230][15401] Updated weights for policy 0, policy_version 665421 (0.0039) [2024-06-24 13:02:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10902257664. Throughput: 0: 42922.9. Samples: 10902388380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 13:02:31,322][15401] Updated weights for policy 0, policy_version 665431 (0.0045) [2024-06-24 13:02:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10902470656. Throughput: 0: 43143.2. Samples: 10902650900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:33,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 13:02:36,181][15401] Updated weights for policy 0, policy_version 665441 (0.0035) [2024-06-24 13:02:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10902700032. Throughput: 0: 42874.5. Samples: 10902781640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 13:02:38,869][15401] Updated weights for policy 0, policy_version 665451 (0.0030) [2024-06-24 13:02:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 10902896640. Throughput: 0: 42955.4. Samples: 10903033120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 13:02:43,859][15401] Updated weights for policy 0, policy_version 665461 (0.0040) [2024-06-24 13:02:46,635][15401] Updated weights for policy 0, policy_version 665471 (0.0037) [2024-06-24 13:02:47,333][15349] Signal inference workers to stop experience collection... (161350 times) [2024-06-24 13:02:47,335][15349] Signal inference workers to resume experience collection... (161350 times) [2024-06-24 13:02:47,386][15401] InferenceWorker_p0-w0: stopping experience collection (161350 times) [2024-06-24 13:02:47,386][15401] InferenceWorker_p0-w0: resuming experience collection (161350 times) [2024-06-24 13:02:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10903126016. Throughput: 0: 42780.4. Samples: 10903280740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 13:02:51,471][15401] Updated weights for policy 0, policy_version 665481 (0.0039) [2024-06-24 13:02:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.6, 300 sec: 42765.0). Total num frames: 10903339008. Throughput: 0: 42689.4. Samples: 10903412760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 13:02:54,471][15401] Updated weights for policy 0, policy_version 665491 (0.0026) [2024-06-24 13:02:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10903519232. Throughput: 0: 42750.7. Samples: 10903672400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:02:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 13:02:59,169][15401] Updated weights for policy 0, policy_version 665501 (0.0037) [2024-06-24 13:03:02,183][15401] Updated weights for policy 0, policy_version 665511 (0.0033) [2024-06-24 13:03:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10903781376. Throughput: 0: 42443.9. Samples: 10903915940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 13:03:06,842][15401] Updated weights for policy 0, policy_version 665521 (0.0026) [2024-06-24 13:03:08,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10903977984. Throughput: 0: 42627.1. Samples: 10904056680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 13:03:09,643][15401] Updated weights for policy 0, policy_version 665531 (0.0038) [2024-06-24 13:03:13,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 10904158208. Throughput: 0: 42648.9. Samples: 10904307580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 13:03:14,582][15401] Updated weights for policy 0, policy_version 665541 (0.0038) [2024-06-24 13:03:17,230][15401] Updated weights for policy 0, policy_version 665551 (0.0035) [2024-06-24 13:03:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10904420352. Throughput: 0: 42281.2. Samples: 10904553560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 13:03:22,323][15401] Updated weights for policy 0, policy_version 665561 (0.0027) [2024-06-24 13:03:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10904600576. Throughput: 0: 42604.3. Samples: 10904698840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 13:03:24,720][15401] Updated weights for policy 0, policy_version 665571 (0.0035) [2024-06-24 13:03:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10904813568. Throughput: 0: 42522.7. Samples: 10904946640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 13:03:29,995][15401] Updated weights for policy 0, policy_version 665581 (0.0032) [2024-06-24 13:03:32,417][15401] Updated weights for policy 0, policy_version 665591 (0.0038) [2024-06-24 13:03:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 10905075712. Throughput: 0: 42621.3. Samples: 10905198700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:33,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 13:03:37,987][15401] Updated weights for policy 0, policy_version 665601 (0.0032) [2024-06-24 13:03:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 10905223168. Throughput: 0: 42730.2. Samples: 10905335620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 13:03:40,080][15401] Updated weights for policy 0, policy_version 665611 (0.0034) [2024-06-24 13:03:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10905452544. Throughput: 0: 42564.9. Samples: 10905587820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:43,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 13:03:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665616_10905452544.pth... [2024-06-24 13:03:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000664991_10895212544.pth [2024-06-24 13:03:45,656][15401] Updated weights for policy 0, policy_version 665621 (0.0029) [2024-06-24 13:03:47,659][15401] Updated weights for policy 0, policy_version 665631 (0.0037) [2024-06-24 13:03:48,390][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10905714688. Throughput: 0: 42617.8. Samples: 10905833740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 13:03:53,243][15401] Updated weights for policy 0, policy_version 665641 (0.0031) [2024-06-24 13:03:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 10905862144. Throughput: 0: 42596.9. Samples: 10905973540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 13:03:54,139][15349] Signal inference workers to stop experience collection... (161400 times) [2024-06-24 13:03:54,194][15401] InferenceWorker_p0-w0: stopping experience collection (161400 times) [2024-06-24 13:03:54,194][15349] Signal inference workers to resume experience collection... (161400 times) [2024-06-24 13:03:54,216][15401] InferenceWorker_p0-w0: resuming experience collection (161400 times) [2024-06-24 13:03:55,672][15401] Updated weights for policy 0, policy_version 665651 (0.0036) [2024-06-24 13:03:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10906091520. Throughput: 0: 42681.7. Samples: 10906228260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:03:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 13:04:00,873][15401] Updated weights for policy 0, policy_version 665661 (0.0036) [2024-06-24 13:04:03,249][15401] Updated weights for policy 0, policy_version 665671 (0.0029) [2024-06-24 13:04:03,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10906353664. Throughput: 0: 42629.8. Samples: 10906471900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 13:04:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10906501120. Throughput: 0: 42376.1. Samples: 10906605760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:04:08,403][15401] Updated weights for policy 0, policy_version 665681 (0.0046) [2024-06-24 13:04:11,028][15401] Updated weights for policy 0, policy_version 665691 (0.0033) [2024-06-24 13:04:13,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 10906730496. Throughput: 0: 42359.9. Samples: 10906852840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:13,400][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 13:04:16,092][15401] Updated weights for policy 0, policy_version 665701 (0.0033) [2024-06-24 13:04:18,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10906976256. Throughput: 0: 42425.1. Samples: 10907107820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 13:04:18,819][15401] Updated weights for policy 0, policy_version 665711 (0.0046) [2024-06-24 13:04:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10907140096. Throughput: 0: 42314.7. Samples: 10907239780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:23,390][15132] Avg episode reward: [(0, '0.875')] [2024-06-24 13:04:23,650][15401] Updated weights for policy 0, policy_version 665721 (0.0035) [2024-06-24 13:04:27,009][15401] Updated weights for policy 0, policy_version 665731 (0.0033) [2024-06-24 13:04:28,393][15132] Fps is (10 sec: 39307.4, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 10907369472. Throughput: 0: 42211.0. Samples: 10907487460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:28,394][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 13:04:31,295][15401] Updated weights for policy 0, policy_version 665741 (0.0026) [2024-06-24 13:04:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 10907582464. Throughput: 0: 42653.7. Samples: 10907753160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:04:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 13:04:34,481][15401] Updated weights for policy 0, policy_version 665751 (0.0037) [2024-06-24 13:04:38,390][15132] Fps is (10 sec: 42613.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10907795456. Throughput: 0: 42410.2. Samples: 10907882000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:04:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 13:04:38,823][15401] Updated weights for policy 0, policy_version 665761 (0.0038) [2024-06-24 13:04:42,098][15401] Updated weights for policy 0, policy_version 665771 (0.0037) [2024-06-24 13:04:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10908024832. Throughput: 0: 42330.2. Samples: 10908133120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:04:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:04:46,702][15401] Updated weights for policy 0, policy_version 665781 (0.0038) [2024-06-24 13:04:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42543.2). Total num frames: 10908221440. Throughput: 0: 42721.4. Samples: 10908394360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:04:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 13:04:49,829][15401] Updated weights for policy 0, policy_version 665791 (0.0033) [2024-06-24 13:04:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10908418048. Throughput: 0: 42526.6. Samples: 10908519460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:04:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 13:04:54,454][15401] Updated weights for policy 0, policy_version 665801 (0.0048) [2024-06-24 13:04:57,600][15401] Updated weights for policy 0, policy_version 665811 (0.0024) [2024-06-24 13:04:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10908663808. Throughput: 0: 42738.0. Samples: 10908776040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:04:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 13:05:01,912][15401] Updated weights for policy 0, policy_version 665821 (0.0040) [2024-06-24 13:05:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 10908876800. Throughput: 0: 42861.5. Samples: 10909036600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:03,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 13:05:05,515][15401] Updated weights for policy 0, policy_version 665831 (0.0040) [2024-06-24 13:05:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10909073408. Throughput: 0: 42839.2. Samples: 10909167540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 13:05:08,942][15349] Signal inference workers to stop experience collection... (161450 times) [2024-06-24 13:05:08,988][15401] InferenceWorker_p0-w0: stopping experience collection (161450 times) [2024-06-24 13:05:09,008][15349] Signal inference workers to resume experience collection... (161450 times) [2024-06-24 13:05:09,009][15401] InferenceWorker_p0-w0: resuming experience collection (161450 times) [2024-06-24 13:05:09,363][15401] Updated weights for policy 0, policy_version 665841 (0.0030) [2024-06-24 13:05:13,035][15401] Updated weights for policy 0, policy_version 665851 (0.0027) [2024-06-24 13:05:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10909302784. Throughput: 0: 42975.8. Samples: 10909421220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 13:05:16,935][15401] Updated weights for policy 0, policy_version 665861 (0.0029) [2024-06-24 13:05:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10909532160. Throughput: 0: 42881.5. Samples: 10909682820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 13:05:20,761][15401] Updated weights for policy 0, policy_version 665871 (0.0031) [2024-06-24 13:05:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 10909728768. Throughput: 0: 42994.1. Samples: 10909816740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:23,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 13:05:24,539][15401] Updated weights for policy 0, policy_version 665881 (0.0035) [2024-06-24 13:05:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.9, 300 sec: 42653.9). Total num frames: 10909941760. Throughput: 0: 43050.6. Samples: 10910070400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 13:05:28,608][15401] Updated weights for policy 0, policy_version 665891 (0.0026) [2024-06-24 13:05:31,967][15401] Updated weights for policy 0, policy_version 665901 (0.0035) [2024-06-24 13:05:33,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 10910187520. Throughput: 0: 43065.1. Samples: 10910332280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:33,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 13:05:35,996][15401] Updated weights for policy 0, policy_version 665911 (0.0031) [2024-06-24 13:05:38,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10910367744. Throughput: 0: 43256.4. Samples: 10910466100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:38,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 13:05:39,328][15401] Updated weights for policy 0, policy_version 665921 (0.0031) [2024-06-24 13:05:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10910597120. Throughput: 0: 43237.6. Samples: 10910721740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:43,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 13:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665930_10910597120.pth... [2024-06-24 13:05:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665302_10900307968.pth [2024-06-24 13:05:43,822][15401] Updated weights for policy 0, policy_version 665931 (0.0022) [2024-06-24 13:05:47,308][15401] Updated weights for policy 0, policy_version 665941 (0.0037) [2024-06-24 13:05:48,389][15132] Fps is (10 sec: 47524.9, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 10910842880. Throughput: 0: 43213.1. Samples: 10910981180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 13:05:51,391][15401] Updated weights for policy 0, policy_version 665951 (0.0033) [2024-06-24 13:05:53,389][15132] Fps is (10 sec: 40961.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 10911006720. Throughput: 0: 43349.4. Samples: 10911118260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 13:05:54,698][15401] Updated weights for policy 0, policy_version 665961 (0.0028) [2024-06-24 13:05:58,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10911219712. Throughput: 0: 43188.0. Samples: 10911364680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:05:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 13:05:59,007][15401] Updated weights for policy 0, policy_version 665971 (0.0026) [2024-06-24 13:06:02,440][15401] Updated weights for policy 0, policy_version 665981 (0.0030) [2024-06-24 13:06:03,390][15132] Fps is (10 sec: 49147.5, 60 sec: 43690.2, 300 sec: 42876.3). Total num frames: 10911498240. Throughput: 0: 43141.8. Samples: 10911624240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:06:03,391][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 13:06:07,355][15401] Updated weights for policy 0, policy_version 665991 (0.0030) [2024-06-24 13:06:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10911662080. Throughput: 0: 43204.5. Samples: 10911760940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-24 13:06:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 13:06:10,194][15401] Updated weights for policy 0, policy_version 666001 (0.0029) [2024-06-24 13:06:13,390][15132] Fps is (10 sec: 37685.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10911875072. Throughput: 0: 43039.1. Samples: 10912007160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 13:06:14,805][15401] Updated weights for policy 0, policy_version 666011 (0.0048) [2024-06-24 13:06:14,897][15349] Signal inference workers to stop experience collection... (161500 times) [2024-06-24 13:06:14,954][15401] InferenceWorker_p0-w0: stopping experience collection (161500 times) [2024-06-24 13:06:14,957][15349] Signal inference workers to resume experience collection... (161500 times) [2024-06-24 13:06:14,968][15401] InferenceWorker_p0-w0: resuming experience collection (161500 times) [2024-06-24 13:06:17,667][15401] Updated weights for policy 0, policy_version 666021 (0.0040) [2024-06-24 13:06:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 10912137216. Throughput: 0: 43023.9. Samples: 10912268360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 13:06:22,280][15401] Updated weights for policy 0, policy_version 666031 (0.0040) [2024-06-24 13:06:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10912317440. Throughput: 0: 43054.6. Samples: 10912403460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 13:06:25,281][15401] Updated weights for policy 0, policy_version 666041 (0.0042) [2024-06-24 13:06:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10912514048. Throughput: 0: 42926.4. Samples: 10912653420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:28,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 13:06:29,968][15401] Updated weights for policy 0, policy_version 666051 (0.0030) [2024-06-24 13:06:32,958][15401] Updated weights for policy 0, policy_version 666061 (0.0038) [2024-06-24 13:06:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10912759808. Throughput: 0: 42934.3. Samples: 10912913220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:33,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 13:06:37,530][15401] Updated weights for policy 0, policy_version 666071 (0.0037) [2024-06-24 13:06:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42876.5). Total num frames: 10912956416. Throughput: 0: 42809.2. Samples: 10913044680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 13:06:40,551][15401] Updated weights for policy 0, policy_version 666081 (0.0037) [2024-06-24 13:06:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10913169408. Throughput: 0: 42949.8. Samples: 10913297420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 13:06:45,027][15401] Updated weights for policy 0, policy_version 666091 (0.0032) [2024-06-24 13:06:47,961][15401] Updated weights for policy 0, policy_version 666101 (0.0024) [2024-06-24 13:06:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 10913398784. Throughput: 0: 42974.1. Samples: 10913558040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 13:06:52,582][15401] Updated weights for policy 0, policy_version 666111 (0.0031) [2024-06-24 13:06:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10913595392. Throughput: 0: 42946.3. Samples: 10913693520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 13:06:55,714][15401] Updated weights for policy 0, policy_version 666121 (0.0030) [2024-06-24 13:06:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 10913824768. Throughput: 0: 43153.5. Samples: 10913949060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:06:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 13:07:00,104][15401] Updated weights for policy 0, policy_version 666131 (0.0033) [2024-06-24 13:07:03,215][15401] Updated weights for policy 0, policy_version 666141 (0.0028) [2024-06-24 13:07:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.9, 300 sec: 42820.5). Total num frames: 10914054144. Throughput: 0: 43188.3. Samples: 10914211840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 13:07:07,681][15401] Updated weights for policy 0, policy_version 666151 (0.0038) [2024-06-24 13:07:08,396][15132] Fps is (10 sec: 44208.2, 60 sec: 43413.0, 300 sec: 42930.7). Total num frames: 10914267136. Throughput: 0: 43029.1. Samples: 10914340040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:08,396][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 13:07:11,091][15401] Updated weights for policy 0, policy_version 666161 (0.0029) [2024-06-24 13:07:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 10914463744. Throughput: 0: 43214.1. Samples: 10914598060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:13,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 13:07:15,206][15401] Updated weights for policy 0, policy_version 666171 (0.0032) [2024-06-24 13:07:18,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10914676736. Throughput: 0: 43096.3. Samples: 10914852560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 13:07:18,667][15401] Updated weights for policy 0, policy_version 666181 (0.0033) [2024-06-24 13:07:22,740][15401] Updated weights for policy 0, policy_version 666191 (0.0036) [2024-06-24 13:07:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10914906112. Throughput: 0: 43161.7. Samples: 10914986960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 13:07:26,199][15401] Updated weights for policy 0, policy_version 666201 (0.0029) [2024-06-24 13:07:28,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43412.9, 300 sec: 42875.2). Total num frames: 10915119104. Throughput: 0: 43060.5. Samples: 10915235420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:28,397][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 13:07:30,149][15401] Updated weights for policy 0, policy_version 666211 (0.0025) [2024-06-24 13:07:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 10915332096. Throughput: 0: 43115.1. Samples: 10915498220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 13:07:33,765][15401] Updated weights for policy 0, policy_version 666221 (0.0035) [2024-06-24 13:07:37,855][15401] Updated weights for policy 0, policy_version 666231 (0.0035) [2024-06-24 13:07:38,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10915528704. Throughput: 0: 43011.2. Samples: 10915629020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 13:07:38,543][15349] Signal inference workers to stop experience collection... (161550 times) [2024-06-24 13:07:38,543][15349] Signal inference workers to resume experience collection... (161550 times) [2024-06-24 13:07:38,572][15401] InferenceWorker_p0-w0: stopping experience collection (161550 times) [2024-06-24 13:07:38,572][15401] InferenceWorker_p0-w0: resuming experience collection (161550 times) [2024-06-24 13:07:41,490][15401] Updated weights for policy 0, policy_version 666241 (0.0041) [2024-06-24 13:07:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10915774464. Throughput: 0: 42930.2. Samples: 10915880920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 13:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666246_10915774464.pth... [2024-06-24 13:07:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665616_10905452544.pth [2024-06-24 13:07:45,464][15401] Updated weights for policy 0, policy_version 666251 (0.0035) [2024-06-24 13:07:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10915971072. Throughput: 0: 42863.2. Samples: 10916140680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 13:07:48,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 13:07:49,157][15401] Updated weights for policy 0, policy_version 666261 (0.0042) [2024-06-24 13:07:53,318][15401] Updated weights for policy 0, policy_version 666271 (0.0040) [2024-06-24 13:07:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 10916184064. Throughput: 0: 42846.8. Samples: 10916267880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:07:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 13:07:56,645][15401] Updated weights for policy 0, policy_version 666281 (0.0045) [2024-06-24 13:07:58,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 10916413440. Throughput: 0: 42787.7. Samples: 10916523780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:07:58,396][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 13:08:01,167][15401] Updated weights for policy 0, policy_version 666291 (0.0028) [2024-06-24 13:08:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10916626432. Throughput: 0: 42835.5. Samples: 10916780160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 13:08:04,287][15401] Updated weights for policy 0, policy_version 666301 (0.0037) [2024-06-24 13:08:08,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 10916806656. Throughput: 0: 42686.8. Samples: 10916907860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 13:08:08,650][15401] Updated weights for policy 0, policy_version 666311 (0.0028) [2024-06-24 13:08:11,937][15401] Updated weights for policy 0, policy_version 666321 (0.0028) [2024-06-24 13:08:13,395][15132] Fps is (10 sec: 44214.3, 60 sec: 43413.9, 300 sec: 42875.4). Total num frames: 10917068800. Throughput: 0: 42823.8. Samples: 10917162440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:13,395][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 13:08:16,273][15401] Updated weights for policy 0, policy_version 666331 (0.0030) [2024-06-24 13:08:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10917249024. Throughput: 0: 42636.8. Samples: 10917416880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 13:08:19,806][15401] Updated weights for policy 0, policy_version 666341 (0.0036) [2024-06-24 13:08:23,389][15132] Fps is (10 sec: 37702.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10917445632. Throughput: 0: 42419.1. Samples: 10917537880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:08:24,251][15401] Updated weights for policy 0, policy_version 666351 (0.0021) [2024-06-24 13:08:27,423][15401] Updated weights for policy 0, policy_version 666361 (0.0030) [2024-06-24 13:08:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 10917691392. Throughput: 0: 42607.5. Samples: 10917798260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:08:31,939][15401] Updated weights for policy 0, policy_version 666371 (0.0037) [2024-06-24 13:08:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10917888000. Throughput: 0: 42603.9. Samples: 10918057860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 13:08:35,053][15401] Updated weights for policy 0, policy_version 666381 (0.0031) [2024-06-24 13:08:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10918068224. Throughput: 0: 42470.0. Samples: 10918179020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 13:08:39,564][15401] Updated weights for policy 0, policy_version 666391 (0.0030) [2024-06-24 13:08:42,697][15401] Updated weights for policy 0, policy_version 666401 (0.0042) [2024-06-24 13:08:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10918330368. Throughput: 0: 42424.8. Samples: 10918432620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 13:08:47,448][15401] Updated weights for policy 0, policy_version 666411 (0.0042) [2024-06-24 13:08:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10918526976. Throughput: 0: 42574.7. Samples: 10918696020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 13:08:49,508][15349] Signal inference workers to stop experience collection... (161600 times) [2024-06-24 13:08:49,508][15349] Signal inference workers to resume experience collection... (161600 times) [2024-06-24 13:08:49,527][15401] InferenceWorker_p0-w0: stopping experience collection (161600 times) [2024-06-24 13:08:49,528][15401] InferenceWorker_p0-w0: resuming experience collection (161600 times) [2024-06-24 13:08:50,391][15401] Updated weights for policy 0, policy_version 666421 (0.0035) [2024-06-24 13:08:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10918723584. Throughput: 0: 42450.6. Samples: 10918818140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 13:08:55,109][15401] Updated weights for policy 0, policy_version 666431 (0.0032) [2024-06-24 13:08:58,055][15401] Updated weights for policy 0, policy_version 666441 (0.0032) [2024-06-24 13:08:58,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42602.7, 300 sec: 42765.0). Total num frames: 10918969344. Throughput: 0: 42415.3. Samples: 10919070920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:08:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 13:09:02,811][15401] Updated weights for policy 0, policy_version 666451 (0.0033) [2024-06-24 13:09:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 10919149568. Throughput: 0: 42649.4. Samples: 10919336100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 13:09:05,642][15401] Updated weights for policy 0, policy_version 666461 (0.0043) [2024-06-24 13:09:08,390][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10919378944. Throughput: 0: 42666.6. Samples: 10919457880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:09:10,546][15401] Updated weights for policy 0, policy_version 666471 (0.0035) [2024-06-24 13:09:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42329.0, 300 sec: 42820.5). Total num frames: 10919608320. Throughput: 0: 42566.2. Samples: 10919713740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 13:09:13,515][15401] Updated weights for policy 0, policy_version 666481 (0.0032) [2024-06-24 13:09:18,046][15401] Updated weights for policy 0, policy_version 666491 (0.0046) [2024-06-24 13:09:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10919788544. Throughput: 0: 42599.6. Samples: 10919974840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 13:09:21,043][15401] Updated weights for policy 0, policy_version 666501 (0.0024) [2024-06-24 13:09:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42876.3). Total num frames: 10920017920. Throughput: 0: 42586.6. Samples: 10920095520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:23,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 13:09:25,558][15401] Updated weights for policy 0, policy_version 666511 (0.0040) [2024-06-24 13:09:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10920247296. Throughput: 0: 42719.5. Samples: 10920355000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 13:09:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 13:09:29,276][15401] Updated weights for policy 0, policy_version 666521 (0.0041) [2024-06-24 13:09:33,123][15401] Updated weights for policy 0, policy_version 666531 (0.0026) [2024-06-24 13:09:33,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10920460288. Throughput: 0: 42597.0. Samples: 10920612880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 13:09:37,112][15401] Updated weights for policy 0, policy_version 666541 (0.0047) [2024-06-24 13:09:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10920656896. Throughput: 0: 42667.2. Samples: 10920738160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:38,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 13:09:40,731][15401] Updated weights for policy 0, policy_version 666551 (0.0032) [2024-06-24 13:09:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 10920886272. Throughput: 0: 42701.6. Samples: 10920992480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 13:09:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666558_10920886272.pth... [2024-06-24 13:09:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000665930_10910597120.pth [2024-06-24 13:09:44,813][15401] Updated weights for policy 0, policy_version 666561 (0.0032) [2024-06-24 13:09:48,237][15401] Updated weights for policy 0, policy_version 666571 (0.0039) [2024-06-24 13:09:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 10921099264. Throughput: 0: 42596.3. Samples: 10921252940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 13:09:52,345][15401] Updated weights for policy 0, policy_version 666581 (0.0040) [2024-06-24 13:09:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 10921295872. Throughput: 0: 42717.0. Samples: 10921380140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:53,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 13:09:56,223][15401] Updated weights for policy 0, policy_version 666591 (0.0028) [2024-06-24 13:09:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.7, 300 sec: 42931.7). Total num frames: 10921541632. Throughput: 0: 42708.9. Samples: 10921635640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:09:58,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 13:10:00,002][15401] Updated weights for policy 0, policy_version 666601 (0.0031) [2024-06-24 13:10:03,240][15349] Signal inference workers to stop experience collection... (161650 times) [2024-06-24 13:10:03,288][15401] InferenceWorker_p0-w0: stopping experience collection (161650 times) [2024-06-24 13:10:03,291][15349] Signal inference workers to resume experience collection... (161650 times) [2024-06-24 13:10:03,299][15401] InferenceWorker_p0-w0: resuming experience collection (161650 times) [2024-06-24 13:10:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10921721856. Throughput: 0: 42680.1. Samples: 10921895440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 13:10:03,801][15401] Updated weights for policy 0, policy_version 666611 (0.0038) [2024-06-24 13:10:07,577][15401] Updated weights for policy 0, policy_version 666621 (0.0038) [2024-06-24 13:10:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10921934848. Throughput: 0: 42687.2. Samples: 10922016340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 13:10:11,316][15401] Updated weights for policy 0, policy_version 666631 (0.0041) [2024-06-24 13:10:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10922164224. Throughput: 0: 42628.9. Samples: 10922273300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 13:10:15,005][15401] Updated weights for policy 0, policy_version 666641 (0.0030) [2024-06-24 13:10:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10922344448. Throughput: 0: 42937.4. Samples: 10922545060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 13:10:18,874][15401] Updated weights for policy 0, policy_version 666651 (0.0036) [2024-06-24 13:10:22,567][15401] Updated weights for policy 0, policy_version 666661 (0.0043) [2024-06-24 13:10:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 10922590208. Throughput: 0: 42780.7. Samples: 10922663300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 13:10:26,749][15401] Updated weights for policy 0, policy_version 666671 (0.0047) [2024-06-24 13:10:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10922803200. Throughput: 0: 42840.1. Samples: 10922920280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 13:10:30,124][15401] Updated weights for policy 0, policy_version 666681 (0.0034) [2024-06-24 13:10:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 10922983424. Throughput: 0: 43049.9. Samples: 10923190180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 13:10:34,346][15401] Updated weights for policy 0, policy_version 666691 (0.0035) [2024-06-24 13:10:38,178][15401] Updated weights for policy 0, policy_version 666701 (0.0040) [2024-06-24 13:10:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10923229184. Throughput: 0: 42800.9. Samples: 10923306180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 13:10:41,853][15401] Updated weights for policy 0, policy_version 666711 (0.0045) [2024-06-24 13:10:43,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10923458560. Throughput: 0: 42799.6. Samples: 10923561620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 13:10:45,758][15401] Updated weights for policy 0, policy_version 666721 (0.0048) [2024-06-24 13:10:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 10923622400. Throughput: 0: 42975.4. Samples: 10923829340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 13:10:49,815][15401] Updated weights for policy 0, policy_version 666731 (0.0032) [2024-06-24 13:10:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10923868160. Throughput: 0: 42824.4. Samples: 10923943440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 13:10:53,487][15401] Updated weights for policy 0, policy_version 666741 (0.0032) [2024-06-24 13:10:57,596][15401] Updated weights for policy 0, policy_version 666751 (0.0043) [2024-06-24 13:10:58,390][15132] Fps is (10 sec: 49152.2, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 10924113920. Throughput: 0: 42910.6. Samples: 10924204280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:10:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 13:11:01,106][15401] Updated weights for policy 0, policy_version 666761 (0.0042) [2024-06-24 13:11:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10924261376. Throughput: 0: 42648.9. Samples: 10924464260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 13:11:03,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 13:11:05,224][15401] Updated weights for policy 0, policy_version 666771 (0.0032) [2024-06-24 13:11:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 10924523520. Throughput: 0: 42504.4. Samples: 10924576000. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 13:11:08,763][15401] Updated weights for policy 0, policy_version 666781 (0.0030) [2024-06-24 13:11:12,674][15349] Signal inference workers to stop experience collection... (161700 times) [2024-06-24 13:11:12,729][15349] Signal inference workers to resume experience collection... (161700 times) [2024-06-24 13:11:12,730][15401] InferenceWorker_p0-w0: stopping experience collection (161700 times) [2024-06-24 13:11:12,745][15401] InferenceWorker_p0-w0: resuming experience collection (161700 times) [2024-06-24 13:11:12,874][15401] Updated weights for policy 0, policy_version 666791 (0.0033) [2024-06-24 13:11:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10924720128. Throughput: 0: 42741.3. Samples: 10924843640. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 13:11:16,609][15401] Updated weights for policy 0, policy_version 666801 (0.0042) [2024-06-24 13:11:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10924916736. Throughput: 0: 42444.8. Samples: 10925100200. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:18,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-24 13:11:20,481][15401] Updated weights for policy 0, policy_version 666811 (0.0030) [2024-06-24 13:11:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 10925162496. Throughput: 0: 42610.1. Samples: 10925223740. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:23,392][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 13:11:24,647][15401] Updated weights for policy 0, policy_version 666821 (0.0033) [2024-06-24 13:11:28,319][15401] Updated weights for policy 0, policy_version 666831 (0.0034) [2024-06-24 13:11:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10925359104. Throughput: 0: 42898.1. Samples: 10925492040. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 13:11:32,090][15401] Updated weights for policy 0, policy_version 666841 (0.0030) [2024-06-24 13:11:33,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10925555712. Throughput: 0: 42610.8. Samples: 10925746820. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 13:11:35,985][15401] Updated weights for policy 0, policy_version 666851 (0.0030) [2024-06-24 13:11:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10925817856. Throughput: 0: 42949.3. Samples: 10925876160. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:38,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 13:11:39,601][15401] Updated weights for policy 0, policy_version 666861 (0.0035) [2024-06-24 13:11:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10925998080. Throughput: 0: 42876.9. Samples: 10926133740. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 13:11:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666871_10926014464.pth... [2024-06-24 13:11:43,517][15401] Updated weights for policy 0, policy_version 666871 (0.0040) [2024-06-24 13:11:43,570][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666246_10915774464.pth [2024-06-24 13:11:47,129][15401] Updated weights for policy 0, policy_version 666881 (0.0035) [2024-06-24 13:11:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10926211072. Throughput: 0: 42619.9. Samples: 10926382160. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 13:11:51,235][15401] Updated weights for policy 0, policy_version 666891 (0.0027) [2024-06-24 13:11:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10926440448. Throughput: 0: 42957.3. Samples: 10926509080. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:53,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 13:11:54,808][15401] Updated weights for policy 0, policy_version 666901 (0.0042) [2024-06-24 13:11:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 10926637056. Throughput: 0: 42848.9. Samples: 10926771840. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:11:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 13:11:58,839][15401] Updated weights for policy 0, policy_version 666911 (0.0031) [2024-06-24 13:12:02,306][15401] Updated weights for policy 0, policy_version 666921 (0.0032) [2024-06-24 13:12:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42654.9). Total num frames: 10926850048. Throughput: 0: 42791.6. Samples: 10927025820. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 13:12:06,649][15401] Updated weights for policy 0, policy_version 666931 (0.0041) [2024-06-24 13:12:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10927079424. Throughput: 0: 42881.4. Samples: 10927153300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 13:12:09,879][15401] Updated weights for policy 0, policy_version 666941 (0.0028) [2024-06-24 13:12:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10927276032. Throughput: 0: 42588.2. Samples: 10927408500. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 13:12:14,271][15401] Updated weights for policy 0, policy_version 666951 (0.0041) [2024-06-24 13:12:17,534][15401] Updated weights for policy 0, policy_version 666961 (0.0053) [2024-06-24 13:12:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10927489024. Throughput: 0: 42506.2. Samples: 10927659600. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 13:12:20,672][15349] Signal inference workers to stop experience collection... (161750 times) [2024-06-24 13:12:20,672][15349] Signal inference workers to resume experience collection... (161750 times) [2024-06-24 13:12:20,720][15401] InferenceWorker_p0-w0: stopping experience collection (161750 times) [2024-06-24 13:12:20,720][15401] InferenceWorker_p0-w0: resuming experience collection (161750 times) [2024-06-24 13:12:22,147][15401] Updated weights for policy 0, policy_version 666971 (0.0041) [2024-06-24 13:12:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42710.4). Total num frames: 10927718400. Throughput: 0: 42499.6. Samples: 10927788640. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:23,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 13:12:25,432][15401] Updated weights for policy 0, policy_version 666981 (0.0034) [2024-06-24 13:12:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10927898624. Throughput: 0: 42364.1. Samples: 10928040120. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 13:12:30,158][15401] Updated weights for policy 0, policy_version 666991 (0.0040) [2024-06-24 13:12:33,083][15401] Updated weights for policy 0, policy_version 667001 (0.0031) [2024-06-24 13:12:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10928144384. Throughput: 0: 42312.5. Samples: 10928286220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 13:12:37,711][15401] Updated weights for policy 0, policy_version 667011 (0.0034) [2024-06-24 13:12:38,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 10928357376. Throughput: 0: 42568.5. Samples: 10928424760. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:38,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 13:12:40,816][15401] Updated weights for policy 0, policy_version 667021 (0.0042) [2024-06-24 13:12:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10928537600. Throughput: 0: 42286.7. Samples: 10928674740. Policy #0 lag: (min: 1.0, avg: 12.1, max: 22.0) [2024-06-24 13:12:43,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 13:12:45,335][15401] Updated weights for policy 0, policy_version 667031 (0.0027) [2024-06-24 13:12:48,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10928783360. Throughput: 0: 42276.3. Samples: 10928928360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:12:48,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 13:12:48,899][15401] Updated weights for policy 0, policy_version 667041 (0.0041) [2024-06-24 13:12:52,989][15401] Updated weights for policy 0, policy_version 667051 (0.0038) [2024-06-24 13:12:53,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.8, 300 sec: 42654.5). Total num frames: 10928996352. Throughput: 0: 42452.4. Samples: 10929063760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:12:53,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 13:12:56,473][15401] Updated weights for policy 0, policy_version 667061 (0.0045) [2024-06-24 13:12:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10929192960. Throughput: 0: 42499.5. Samples: 10929320980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:12:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 13:13:00,709][15401] Updated weights for policy 0, policy_version 667071 (0.0037) [2024-06-24 13:13:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10929405952. Throughput: 0: 42382.7. Samples: 10929566820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 13:13:04,164][15401] Updated weights for policy 0, policy_version 667081 (0.0025) [2024-06-24 13:13:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42488.1). Total num frames: 10929602560. Throughput: 0: 42449.7. Samples: 10929698880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:08,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 13:13:08,560][15401] Updated weights for policy 0, policy_version 667091 (0.0044) [2024-06-24 13:13:11,985][15401] Updated weights for policy 0, policy_version 667101 (0.0034) [2024-06-24 13:13:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10929815552. Throughput: 0: 42548.0. Samples: 10929954780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 13:13:16,281][15401] Updated weights for policy 0, policy_version 667111 (0.0035) [2024-06-24 13:13:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10930061312. Throughput: 0: 42662.1. Samples: 10930206020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 13:13:19,450][15401] Updated weights for policy 0, policy_version 667121 (0.0046) [2024-06-24 13:13:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10930241536. Throughput: 0: 42489.0. Samples: 10930336660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:23,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 13:13:23,718][15401] Updated weights for policy 0, policy_version 667131 (0.0043) [2024-06-24 13:13:26,993][15401] Updated weights for policy 0, policy_version 667141 (0.0042) [2024-06-24 13:13:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10930470912. Throughput: 0: 42601.7. Samples: 10930591820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 13:13:31,292][15401] Updated weights for policy 0, policy_version 667151 (0.0033) [2024-06-24 13:13:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10930716672. Throughput: 0: 42614.2. Samples: 10930845900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 13:13:35,057][15401] Updated weights for policy 0, policy_version 667161 (0.0031) [2024-06-24 13:13:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 10930896896. Throughput: 0: 42695.1. Samples: 10930984940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 13:13:38,713][15401] Updated weights for policy 0, policy_version 667171 (0.0032) [2024-06-24 13:13:42,951][15401] Updated weights for policy 0, policy_version 667181 (0.0047) [2024-06-24 13:13:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10931093504. Throughput: 0: 42627.5. Samples: 10931239220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 13:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667181_10931093504.pth... [2024-06-24 13:13:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666558_10920886272.pth [2024-06-24 13:13:46,160][15401] Updated weights for policy 0, policy_version 667191 (0.0043) [2024-06-24 13:13:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 10931355648. Throughput: 0: 42663.4. Samples: 10931486680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 13:13:50,590][15401] Updated weights for policy 0, policy_version 667201 (0.0047) [2024-06-24 13:13:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 10931535872. Throughput: 0: 42727.6. Samples: 10931621620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 13:13:54,148][15401] Updated weights for policy 0, policy_version 667211 (0.0054) [2024-06-24 13:13:55,227][15349] Signal inference workers to stop experience collection... (161800 times) [2024-06-24 13:13:55,272][15401] InferenceWorker_p0-w0: stopping experience collection (161800 times) [2024-06-24 13:13:55,280][15349] Signal inference workers to resume experience collection... (161800 times) [2024-06-24 13:13:55,292][15401] InferenceWorker_p0-w0: resuming experience collection (161800 times) [2024-06-24 13:13:58,228][15401] Updated weights for policy 0, policy_version 667221 (0.0033) [2024-06-24 13:13:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 10931748864. Throughput: 0: 42685.6. Samples: 10931875640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:13:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 13:14:01,837][15401] Updated weights for policy 0, policy_version 667231 (0.0026) [2024-06-24 13:14:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10931978240. Throughput: 0: 42690.6. Samples: 10932127100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:14:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 13:14:05,924][15401] Updated weights for policy 0, policy_version 667241 (0.0036) [2024-06-24 13:14:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 10932174848. Throughput: 0: 42772.3. Samples: 10932261520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:14:08,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 13:14:09,354][15401] Updated weights for policy 0, policy_version 667251 (0.0025) [2024-06-24 13:14:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10932387840. Throughput: 0: 42625.8. Samples: 10932509980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:14:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:14:13,583][15401] Updated weights for policy 0, policy_version 667261 (0.0035) [2024-06-24 13:14:17,374][15401] Updated weights for policy 0, policy_version 667271 (0.0038) [2024-06-24 13:14:18,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 10932617216. Throughput: 0: 42777.9. Samples: 10932770900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:14:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 13:14:21,464][15401] Updated weights for policy 0, policy_version 667281 (0.0034) [2024-06-24 13:14:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10932813824. Throughput: 0: 42601.4. Samples: 10932902000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-24 13:14:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 13:14:24,838][15401] Updated weights for policy 0, policy_version 667291 (0.0026) [2024-06-24 13:14:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10933043200. Throughput: 0: 42659.1. Samples: 10933158880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 13:14:28,851][15401] Updated weights for policy 0, policy_version 667301 (0.0039) [2024-06-24 13:14:32,244][15401] Updated weights for policy 0, policy_version 667311 (0.0048) [2024-06-24 13:14:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10933256192. Throughput: 0: 42927.3. Samples: 10933418400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 13:14:36,772][15401] Updated weights for policy 0, policy_version 667321 (0.0034) [2024-06-24 13:14:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10933452800. Throughput: 0: 42777.3. Samples: 10933546600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:38,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-24 13:14:40,213][15401] Updated weights for policy 0, policy_version 667331 (0.0030) [2024-06-24 13:14:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10933682176. Throughput: 0: 42728.0. Samples: 10933798400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 13:14:44,261][15401] Updated weights for policy 0, policy_version 667341 (0.0032) [2024-06-24 13:14:47,685][15401] Updated weights for policy 0, policy_version 667351 (0.0036) [2024-06-24 13:14:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10933895168. Throughput: 0: 43003.2. Samples: 10934062240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:48,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 13:14:51,971][15401] Updated weights for policy 0, policy_version 667361 (0.0033) [2024-06-24 13:14:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10934075392. Throughput: 0: 42852.1. Samples: 10934189760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 13:14:55,314][15401] Updated weights for policy 0, policy_version 667371 (0.0034) [2024-06-24 13:14:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 10934321152. Throughput: 0: 42883.2. Samples: 10934439720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:14:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 13:14:59,772][15401] Updated weights for policy 0, policy_version 667381 (0.0051) [2024-06-24 13:15:02,883][15401] Updated weights for policy 0, policy_version 667391 (0.0042) [2024-06-24 13:15:03,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 10934534144. Throughput: 0: 42778.5. Samples: 10934696040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:03,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 13:15:07,349][15401] Updated weights for policy 0, policy_version 667401 (0.0034) [2024-06-24 13:15:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 10934714368. Throughput: 0: 42785.4. Samples: 10934827340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:08,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 13:15:10,902][15401] Updated weights for policy 0, policy_version 667411 (0.0034) [2024-06-24 13:15:13,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 10934976512. Throughput: 0: 42532.3. Samples: 10935072840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:13,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 13:15:14,945][15401] Updated weights for policy 0, policy_version 667421 (0.0043) [2024-06-24 13:15:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10935173120. Throughput: 0: 42440.5. Samples: 10935328220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 13:15:18,466][15401] Updated weights for policy 0, policy_version 667431 (0.0031) [2024-06-24 13:15:23,001][15401] Updated weights for policy 0, policy_version 667441 (0.0031) [2024-06-24 13:15:23,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 10935353344. Throughput: 0: 42452.1. Samples: 10935456940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 13:15:26,455][15401] Updated weights for policy 0, policy_version 667451 (0.0037) [2024-06-24 13:15:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10935615488. Throughput: 0: 42599.7. Samples: 10935715380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 13:15:30,576][15401] Updated weights for policy 0, policy_version 667461 (0.0041) [2024-06-24 13:15:32,154][15349] Signal inference workers to stop experience collection... (161850 times) [2024-06-24 13:15:32,154][15349] Signal inference workers to resume experience collection... (161850 times) [2024-06-24 13:15:32,177][15401] InferenceWorker_p0-w0: stopping experience collection (161850 times) [2024-06-24 13:15:32,177][15401] InferenceWorker_p0-w0: resuming experience collection (161850 times) [2024-06-24 13:15:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10935828480. Throughput: 0: 42301.2. Samples: 10935965800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:33,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-24 13:15:33,939][15401] Updated weights for policy 0, policy_version 667471 (0.0036) [2024-06-24 13:15:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10935992320. Throughput: 0: 42302.2. Samples: 10936093360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 13:15:38,635][15401] Updated weights for policy 0, policy_version 667481 (0.0028) [2024-06-24 13:15:41,512][15401] Updated weights for policy 0, policy_version 667491 (0.0031) [2024-06-24 13:15:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 10936254464. Throughput: 0: 42341.8. Samples: 10936345100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 13:15:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667496_10936254464.pth... [2024-06-24 13:15:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000666871_10926014464.pth [2024-06-24 13:15:46,414][15401] Updated weights for policy 0, policy_version 667501 (0.0034) [2024-06-24 13:15:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10936434688. Throughput: 0: 42468.5. Samples: 10936607020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 13:15:49,322][15401] Updated weights for policy 0, policy_version 667511 (0.0027) [2024-06-24 13:15:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 10936647680. Throughput: 0: 42220.8. Samples: 10936727280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 13:15:54,046][15401] Updated weights for policy 0, policy_version 667521 (0.0029) [2024-06-24 13:15:56,926][15401] Updated weights for policy 0, policy_version 667531 (0.0028) [2024-06-24 13:15:58,395][15132] Fps is (10 sec: 45851.7, 60 sec: 42867.8, 300 sec: 42819.8). Total num frames: 10936893440. Throughput: 0: 42498.8. Samples: 10936985500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 13:15:58,395][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 13:16:01,945][15401] Updated weights for policy 0, policy_version 667541 (0.0034) [2024-06-24 13:16:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 10937073664. Throughput: 0: 42656.3. Samples: 10937247760. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:03,390][15132] Avg episode reward: [(0, '0.906')] [2024-06-24 13:16:04,555][15401] Updated weights for policy 0, policy_version 667551 (0.0029) [2024-06-24 13:16:08,389][15132] Fps is (10 sec: 39341.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10937286656. Throughput: 0: 42498.2. Samples: 10937369360. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:08,390][15132] Avg episode reward: [(0, '0.886')] [2024-06-24 13:16:09,548][15401] Updated weights for policy 0, policy_version 667561 (0.0031) [2024-06-24 13:16:12,460][15401] Updated weights for policy 0, policy_version 667571 (0.0030) [2024-06-24 13:16:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10937532416. Throughput: 0: 42618.1. Samples: 10937633200. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 13:16:16,987][15401] Updated weights for policy 0, policy_version 667581 (0.0039) [2024-06-24 13:16:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 10937712640. Throughput: 0: 42785.8. Samples: 10937891160. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 13:16:20,115][15401] Updated weights for policy 0, policy_version 667591 (0.0024) [2024-06-24 13:16:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10937942016. Throughput: 0: 42687.9. Samples: 10938014320. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 13:16:24,351][15401] Updated weights for policy 0, policy_version 667601 (0.0036) [2024-06-24 13:16:27,803][15401] Updated weights for policy 0, policy_version 667611 (0.0039) [2024-06-24 13:16:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10938155008. Throughput: 0: 42881.3. Samples: 10938274760. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 13:16:31,805][15401] Updated weights for policy 0, policy_version 667621 (0.0039) [2024-06-24 13:16:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10938368000. Throughput: 0: 42804.3. Samples: 10938533220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:33,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 13:16:35,376][15401] Updated weights for policy 0, policy_version 667631 (0.0033) [2024-06-24 13:16:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 10938580992. Throughput: 0: 42865.3. Samples: 10938656320. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:38,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 13:16:39,692][15401] Updated weights for policy 0, policy_version 667641 (0.0029) [2024-06-24 13:16:43,312][15401] Updated weights for policy 0, policy_version 667651 (0.0042) [2024-06-24 13:16:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 10938793984. Throughput: 0: 43044.0. Samples: 10938922260. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 13:16:47,298][15401] Updated weights for policy 0, policy_version 667661 (0.0043) [2024-06-24 13:16:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10939006976. Throughput: 0: 42891.6. Samples: 10939177880. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 13:16:50,936][15401] Updated weights for policy 0, policy_version 667671 (0.0031) [2024-06-24 13:16:51,622][15349] Signal inference workers to stop experience collection... (161900 times) [2024-06-24 13:16:51,676][15401] InferenceWorker_p0-w0: stopping experience collection (161900 times) [2024-06-24 13:16:51,676][15349] Signal inference workers to resume experience collection... (161900 times) [2024-06-24 13:16:51,694][15401] InferenceWorker_p0-w0: resuming experience collection (161900 times) [2024-06-24 13:16:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 10939219968. Throughput: 0: 42922.7. Samples: 10939300880. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 13:16:54,642][15401] Updated weights for policy 0, policy_version 667681 (0.0041) [2024-06-24 13:16:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42328.9, 300 sec: 42653.9). Total num frames: 10939432960. Throughput: 0: 42854.7. Samples: 10939561660. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:16:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 13:16:58,557][15401] Updated weights for policy 0, policy_version 667691 (0.0049) [2024-06-24 13:17:02,157][15401] Updated weights for policy 0, policy_version 667701 (0.0037) [2024-06-24 13:17:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 10939629568. Throughput: 0: 42944.0. Samples: 10939823640. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:17:06,180][15401] Updated weights for policy 0, policy_version 667711 (0.0036) [2024-06-24 13:17:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10939858944. Throughput: 0: 42954.3. Samples: 10939947260. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 13:17:09,687][15401] Updated weights for policy 0, policy_version 667721 (0.0037) [2024-06-24 13:17:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10940055552. Throughput: 0: 42803.4. Samples: 10940200920. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 13:17:13,909][15401] Updated weights for policy 0, policy_version 667731 (0.0042) [2024-06-24 13:17:17,179][15401] Updated weights for policy 0, policy_version 667741 (0.0053) [2024-06-24 13:17:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 10940268544. Throughput: 0: 42833.9. Samples: 10940460740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 13:17:21,466][15401] Updated weights for policy 0, policy_version 667751 (0.0046) [2024-06-24 13:17:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10940514304. Throughput: 0: 42942.3. Samples: 10940588620. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 13:17:24,759][15401] Updated weights for policy 0, policy_version 667761 (0.0032) [2024-06-24 13:17:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10940710912. Throughput: 0: 42752.8. Samples: 10940846140. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 13:17:29,066][15401] Updated weights for policy 0, policy_version 667771 (0.0020) [2024-06-24 13:17:32,465][15401] Updated weights for policy 0, policy_version 667781 (0.0029) [2024-06-24 13:17:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 10940923904. Throughput: 0: 42753.8. Samples: 10941101800. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 13:17:36,721][15401] Updated weights for policy 0, policy_version 667791 (0.0051) [2024-06-24 13:17:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 10941136896. Throughput: 0: 42970.2. Samples: 10941234540. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-06-24 13:17:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 13:17:40,250][15401] Updated weights for policy 0, policy_version 667801 (0.0027) [2024-06-24 13:17:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 10941349888. Throughput: 0: 42764.9. Samples: 10941486080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:17:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 13:17:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667807_10941349888.pth... [2024-06-24 13:17:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667181_10931093504.pth [2024-06-24 13:17:44,542][15401] Updated weights for policy 0, policy_version 667811 (0.0035) [2024-06-24 13:17:48,106][15401] Updated weights for policy 0, policy_version 667821 (0.0031) [2024-06-24 13:17:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42653.9). Total num frames: 10941579264. Throughput: 0: 42544.0. Samples: 10941738220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:17:48,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 13:17:52,152][15401] Updated weights for policy 0, policy_version 667831 (0.0043) [2024-06-24 13:17:53,391][15132] Fps is (10 sec: 44228.8, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 10941792256. Throughput: 0: 42793.4. Samples: 10941873040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:17:53,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 13:17:56,208][15401] Updated weights for policy 0, policy_version 667841 (0.0038) [2024-06-24 13:17:58,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 10941988864. Throughput: 0: 42768.7. Samples: 10942125500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:17:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 13:17:59,939][15401] Updated weights for policy 0, policy_version 667851 (0.0032) [2024-06-24 13:18:03,396][15132] Fps is (10 sec: 42578.7, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 10942218240. Throughput: 0: 42550.8. Samples: 10942375800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:03,396][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:18:03,969][15401] Updated weights for policy 0, policy_version 667861 (0.0037) [2024-06-24 13:18:07,567][15401] Updated weights for policy 0, policy_version 667871 (0.0035) [2024-06-24 13:18:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10942431232. Throughput: 0: 42712.9. Samples: 10942510700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:18:08,602][15349] Signal inference workers to stop experience collection... (161950 times) [2024-06-24 13:18:08,644][15401] InferenceWorker_p0-w0: stopping experience collection (161950 times) [2024-06-24 13:18:08,660][15349] Signal inference workers to resume experience collection... (161950 times) [2024-06-24 13:18:08,662][15401] InferenceWorker_p0-w0: resuming experience collection (161950 times) [2024-06-24 13:18:11,513][15401] Updated weights for policy 0, policy_version 667881 (0.0052) [2024-06-24 13:18:13,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10942627840. Throughput: 0: 42602.6. Samples: 10942763260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 13:18:15,345][15401] Updated weights for policy 0, policy_version 667891 (0.0033) [2024-06-24 13:18:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10942857216. Throughput: 0: 42525.9. Samples: 10943015460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 13:18:19,090][15401] Updated weights for policy 0, policy_version 667901 (0.0034) [2024-06-24 13:18:23,122][15401] Updated weights for policy 0, policy_version 667911 (0.0042) [2024-06-24 13:18:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10943053824. Throughput: 0: 42504.5. Samples: 10943147240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 13:18:26,642][15401] Updated weights for policy 0, policy_version 667921 (0.0027) [2024-06-24 13:18:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10943283200. Throughput: 0: 42484.5. Samples: 10943397880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:28,390][15132] Avg episode reward: [(0, '0.907')] [2024-06-24 13:18:30,946][15401] Updated weights for policy 0, policy_version 667931 (0.0032) [2024-06-24 13:18:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 10943512576. Throughput: 0: 42459.7. Samples: 10943648800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:18:34,729][15401] Updated weights for policy 0, policy_version 667941 (0.0050) [2024-06-24 13:18:38,393][15132] Fps is (10 sec: 40944.1, 60 sec: 42595.7, 300 sec: 42708.9). Total num frames: 10943692800. Throughput: 0: 42353.6. Samples: 10943779040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:38,394][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 13:18:38,733][15401] Updated weights for policy 0, policy_version 667951 (0.0033) [2024-06-24 13:18:42,165][15401] Updated weights for policy 0, policy_version 667961 (0.0034) [2024-06-24 13:18:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 10943922176. Throughput: 0: 42490.5. Samples: 10944037580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 13:18:46,354][15401] Updated weights for policy 0, policy_version 667971 (0.0047) [2024-06-24 13:18:48,389][15132] Fps is (10 sec: 45893.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 10944151552. Throughput: 0: 42512.4. Samples: 10944288580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 13:18:49,720][15401] Updated weights for policy 0, policy_version 667981 (0.0042) [2024-06-24 13:18:53,391][15132] Fps is (10 sec: 40953.7, 60 sec: 42325.4, 300 sec: 42653.7). Total num frames: 10944331776. Throughput: 0: 42485.2. Samples: 10944422600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:53,392][15132] Avg episode reward: [(0, '0.858')] [2024-06-24 13:18:53,783][15401] Updated weights for policy 0, policy_version 667991 (0.0040) [2024-06-24 13:18:57,452][15401] Updated weights for policy 0, policy_version 668001 (0.0042) [2024-06-24 13:18:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10944561152. Throughput: 0: 42482.4. Samples: 10944674960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:18:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 13:19:01,774][15401] Updated weights for policy 0, policy_version 668011 (0.0028) [2024-06-24 13:19:03,390][15132] Fps is (10 sec: 44243.7, 60 sec: 42602.9, 300 sec: 42709.8). Total num frames: 10944774144. Throughput: 0: 42491.4. Samples: 10944927580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:19:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 13:19:04,994][15401] Updated weights for policy 0, policy_version 668021 (0.0025) [2024-06-24 13:19:08,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 10944970752. Throughput: 0: 42482.5. Samples: 10945059060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:19:08,393][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 13:19:09,295][15401] Updated weights for policy 0, policy_version 668031 (0.0036) [2024-06-24 13:19:12,550][15401] Updated weights for policy 0, policy_version 668041 (0.0032) [2024-06-24 13:19:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10945216512. Throughput: 0: 42704.0. Samples: 10945319560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:19:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 13:19:17,074][15401] Updated weights for policy 0, policy_version 668051 (0.0034) [2024-06-24 13:19:18,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10945413120. Throughput: 0: 42770.5. Samples: 10945573480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 13:19:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 13:19:20,083][15401] Updated weights for policy 0, policy_version 668061 (0.0041) [2024-06-24 13:19:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10945609728. Throughput: 0: 42646.8. Samples: 10945697980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 13:19:24,742][15401] Updated weights for policy 0, policy_version 668071 (0.0040) [2024-06-24 13:19:27,484][15349] Signal inference workers to stop experience collection... (162000 times) [2024-06-24 13:19:27,489][15349] Signal inference workers to resume experience collection... (162000 times) [2024-06-24 13:19:27,520][15401] InferenceWorker_p0-w0: stopping experience collection (162000 times) [2024-06-24 13:19:27,520][15401] InferenceWorker_p0-w0: resuming experience collection (162000 times) [2024-06-24 13:19:27,825][15401] Updated weights for policy 0, policy_version 668081 (0.0039) [2024-06-24 13:19:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10945839104. Throughput: 0: 42625.0. Samples: 10945955700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 13:19:32,567][15401] Updated weights for policy 0, policy_version 668091 (0.0036) [2024-06-24 13:19:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 10946052096. Throughput: 0: 42731.9. Samples: 10946211520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 13:19:35,351][15401] Updated weights for policy 0, policy_version 668101 (0.0032) [2024-06-24 13:19:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42601.1, 300 sec: 42598.4). Total num frames: 10946248704. Throughput: 0: 42603.2. Samples: 10946339680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 13:19:39,930][15401] Updated weights for policy 0, policy_version 668111 (0.0044) [2024-06-24 13:19:43,087][15401] Updated weights for policy 0, policy_version 668121 (0.0024) [2024-06-24 13:19:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10946494464. Throughput: 0: 42815.0. Samples: 10946601640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 13:19:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668122_10946510848.pth... [2024-06-24 13:19:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667496_10936254464.pth [2024-06-24 13:19:47,418][15401] Updated weights for policy 0, policy_version 668131 (0.0029) [2024-06-24 13:19:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10946691072. Throughput: 0: 42762.8. Samples: 10946851900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 13:19:50,729][15401] Updated weights for policy 0, policy_version 668141 (0.0038) [2024-06-24 13:19:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42599.5, 300 sec: 42598.4). Total num frames: 10946887680. Throughput: 0: 42670.3. Samples: 10946979120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:53,393][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 13:19:55,220][15401] Updated weights for policy 0, policy_version 668151 (0.0032) [2024-06-24 13:19:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 10947133440. Throughput: 0: 42763.1. Samples: 10947243900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:19:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 13:19:58,410][15401] Updated weights for policy 0, policy_version 668161 (0.0032) [2024-06-24 13:20:03,082][15401] Updated weights for policy 0, policy_version 668171 (0.0037) [2024-06-24 13:20:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10947313664. Throughput: 0: 42699.1. Samples: 10947494940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 13:20:06,226][15401] Updated weights for policy 0, policy_version 668181 (0.0032) [2024-06-24 13:20:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10947543040. Throughput: 0: 42698.5. Samples: 10947619420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 13:20:10,586][15401] Updated weights for policy 0, policy_version 668191 (0.0035) [2024-06-24 13:20:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 10947739648. Throughput: 0: 42789.8. Samples: 10947881240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 13:20:13,867][15401] Updated weights for policy 0, policy_version 668201 (0.0036) [2024-06-24 13:20:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10947952640. Throughput: 0: 42776.4. Samples: 10948136460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:18,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 13:20:18,452][15401] Updated weights for policy 0, policy_version 668211 (0.0039) [2024-06-24 13:20:21,582][15401] Updated weights for policy 0, policy_version 668221 (0.0029) [2024-06-24 13:20:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10948198400. Throughput: 0: 42765.4. Samples: 10948264120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 13:20:26,045][15401] Updated weights for policy 0, policy_version 668231 (0.0035) [2024-06-24 13:20:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10948395008. Throughput: 0: 42835.6. Samples: 10948529240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 13:20:29,090][15401] Updated weights for policy 0, policy_version 668241 (0.0030) [2024-06-24 13:20:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10948608000. Throughput: 0: 42897.7. Samples: 10948782300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 13:20:33,866][15401] Updated weights for policy 0, policy_version 668251 (0.0043) [2024-06-24 13:20:36,742][15401] Updated weights for policy 0, policy_version 668261 (0.0033) [2024-06-24 13:20:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10948837376. Throughput: 0: 42911.9. Samples: 10948910160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:38,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 13:20:42,006][15401] Updated weights for policy 0, policy_version 668271 (0.0049) [2024-06-24 13:20:42,711][15349] Signal inference workers to stop experience collection... (162050 times) [2024-06-24 13:20:42,760][15401] InferenceWorker_p0-w0: stopping experience collection (162050 times) [2024-06-24 13:20:42,768][15349] Signal inference workers to resume experience collection... (162050 times) [2024-06-24 13:20:42,776][15401] InferenceWorker_p0-w0: resuming experience collection (162050 times) [2024-06-24 13:20:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 10949033984. Throughput: 0: 42780.0. Samples: 10949169000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 13:20:44,529][15401] Updated weights for policy 0, policy_version 668281 (0.0031) [2024-06-24 13:20:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10949263360. Throughput: 0: 42714.8. Samples: 10949417100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 13:20:49,866][15401] Updated weights for policy 0, policy_version 668291 (0.0036) [2024-06-24 13:20:52,130][15401] Updated weights for policy 0, policy_version 668301 (0.0027) [2024-06-24 13:20:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42654.7). Total num frames: 10949476352. Throughput: 0: 42826.3. Samples: 10949546600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 13:20:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 13:20:57,273][15401] Updated weights for policy 0, policy_version 668311 (0.0039) [2024-06-24 13:20:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10949672960. Throughput: 0: 43005.2. Samples: 10949816480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:20:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 13:20:59,819][15401] Updated weights for policy 0, policy_version 668321 (0.0036) [2024-06-24 13:21:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 10949902336. Throughput: 0: 42889.1. Samples: 10950066460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 13:21:05,068][15401] Updated weights for policy 0, policy_version 668331 (0.0051) [2024-06-24 13:21:07,342][15401] Updated weights for policy 0, policy_version 668341 (0.0050) [2024-06-24 13:21:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10950131712. Throughput: 0: 42958.3. Samples: 10950197240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 13:21:12,556][15401] Updated weights for policy 0, policy_version 668351 (0.0034) [2024-06-24 13:21:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10950311936. Throughput: 0: 42950.2. Samples: 10950462000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 13:21:14,839][15401] Updated weights for policy 0, policy_version 668361 (0.0030) [2024-06-24 13:21:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 10950557696. Throughput: 0: 42778.2. Samples: 10950707320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 13:21:20,007][15401] Updated weights for policy 0, policy_version 668371 (0.0040) [2024-06-24 13:21:22,690][15401] Updated weights for policy 0, policy_version 668381 (0.0038) [2024-06-24 13:21:23,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10950770688. Throughput: 0: 42976.0. Samples: 10950844180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:23,392][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 13:21:27,555][15401] Updated weights for policy 0, policy_version 668391 (0.0025) [2024-06-24 13:21:28,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 10950950912. Throughput: 0: 42900.0. Samples: 10951099780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:28,397][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 13:21:30,314][15401] Updated weights for policy 0, policy_version 668401 (0.0022) [2024-06-24 13:21:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 10951180288. Throughput: 0: 43058.6. Samples: 10951354740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:33,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 13:21:34,934][15401] Updated weights for policy 0, policy_version 668411 (0.0040) [2024-06-24 13:21:37,883][15401] Updated weights for policy 0, policy_version 668421 (0.0036) [2024-06-24 13:21:38,389][15132] Fps is (10 sec: 45904.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 10951409664. Throughput: 0: 43205.8. Samples: 10951490860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 13:21:43,233][15401] Updated weights for policy 0, policy_version 668431 (0.0047) [2024-06-24 13:21:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10951573504. Throughput: 0: 42844.5. Samples: 10951744480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 13:21:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668432_10951589888.pth... [2024-06-24 13:21:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000667807_10941349888.pth [2024-06-24 13:21:44,733][15349] Signal inference workers to stop experience collection... (162100 times) [2024-06-24 13:21:44,733][15349] Signal inference workers to resume experience collection... (162100 times) [2024-06-24 13:21:44,767][15401] InferenceWorker_p0-w0: stopping experience collection (162100 times) [2024-06-24 13:21:44,767][15401] InferenceWorker_p0-w0: resuming experience collection (162100 times) [2024-06-24 13:21:45,321][15401] Updated weights for policy 0, policy_version 668441 (0.0031) [2024-06-24 13:21:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10951835648. Throughput: 0: 42987.3. Samples: 10952000900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 13:21:50,987][15401] Updated weights for policy 0, policy_version 668451 (0.0038) [2024-06-24 13:21:53,230][15401] Updated weights for policy 0, policy_version 668461 (0.0034) [2024-06-24 13:21:53,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 10952065024. Throughput: 0: 43193.4. Samples: 10952140940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 13:21:58,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 10952212480. Throughput: 0: 42952.4. Samples: 10952394860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:21:58,393][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 13:21:58,481][15401] Updated weights for policy 0, policy_version 668471 (0.0039) [2024-06-24 13:22:00,709][15401] Updated weights for policy 0, policy_version 668481 (0.0036) [2024-06-24 13:22:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10952491008. Throughput: 0: 43105.3. Samples: 10952647060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:03,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 13:22:06,015][15401] Updated weights for policy 0, policy_version 668491 (0.0033) [2024-06-24 13:22:08,389][15132] Fps is (10 sec: 49152.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10952704000. Throughput: 0: 43130.8. Samples: 10952784960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 13:22:08,504][15401] Updated weights for policy 0, policy_version 668501 (0.0034) [2024-06-24 13:22:13,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10952867840. Throughput: 0: 42930.1. Samples: 10953031360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:13,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 13:22:13,551][15401] Updated weights for policy 0, policy_version 668511 (0.0032) [2024-06-24 13:22:16,212][15401] Updated weights for policy 0, policy_version 668521 (0.0044) [2024-06-24 13:22:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10953129984. Throughput: 0: 42788.9. Samples: 10953280240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 13:22:21,053][15401] Updated weights for policy 0, policy_version 668531 (0.0027) [2024-06-24 13:22:23,394][15132] Fps is (10 sec: 47494.0, 60 sec: 42870.2, 300 sec: 42820.0). Total num frames: 10953342976. Throughput: 0: 42903.1. Samples: 10953421680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:23,394][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 13:22:24,016][15401] Updated weights for policy 0, policy_version 668541 (0.0028) [2024-06-24 13:22:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42603.0, 300 sec: 42654.0). Total num frames: 10953506816. Throughput: 0: 42876.1. Samples: 10953673900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 13:22:28,658][15401] Updated weights for policy 0, policy_version 668551 (0.0036) [2024-06-24 13:22:31,501][15401] Updated weights for policy 0, policy_version 668561 (0.0034) [2024-06-24 13:22:33,389][15132] Fps is (10 sec: 44255.4, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 10953785344. Throughput: 0: 42763.3. Samples: 10953925240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-24 13:22:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 13:22:36,199][15401] Updated weights for policy 0, policy_version 668571 (0.0040) [2024-06-24 13:22:38,390][15132] Fps is (10 sec: 49151.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10953998336. Throughput: 0: 42824.4. Samples: 10954068040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:22:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 13:22:39,418][15401] Updated weights for policy 0, policy_version 668581 (0.0034) [2024-06-24 13:22:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 10954178560. Throughput: 0: 42630.2. Samples: 10954313220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:22:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 13:22:43,855][15401] Updated weights for policy 0, policy_version 668591 (0.0023) [2024-06-24 13:22:45,592][15349] Signal inference workers to stop experience collection... (162150 times) [2024-06-24 13:22:45,592][15349] Signal inference workers to resume experience collection... (162150 times) [2024-06-24 13:22:45,635][15401] InferenceWorker_p0-w0: stopping experience collection (162150 times) [2024-06-24 13:22:45,635][15401] InferenceWorker_p0-w0: resuming experience collection (162150 times) [2024-06-24 13:22:46,896][15401] Updated weights for policy 0, policy_version 668601 (0.0031) [2024-06-24 13:22:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.8). Total num frames: 10954424320. Throughput: 0: 42662.6. Samples: 10954566880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:22:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 13:22:51,682][15401] Updated weights for policy 0, policy_version 668611 (0.0035) [2024-06-24 13:22:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 10954620928. Throughput: 0: 42630.6. Samples: 10954703340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:22:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 13:22:54,717][15401] Updated weights for policy 0, policy_version 668621 (0.0038) [2024-06-24 13:22:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.6, 300 sec: 42710.4). Total num frames: 10954817536. Throughput: 0: 42785.4. Samples: 10954956700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:22:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 13:22:59,031][15401] Updated weights for policy 0, policy_version 668631 (0.0035) [2024-06-24 13:23:02,390][15401] Updated weights for policy 0, policy_version 668641 (0.0038) [2024-06-24 13:23:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10955046912. Throughput: 0: 42800.0. Samples: 10955206240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:03,404][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 13:23:06,674][15401] Updated weights for policy 0, policy_version 668651 (0.0024) [2024-06-24 13:23:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10955243520. Throughput: 0: 42680.9. Samples: 10955342140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:08,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 13:23:10,109][15401] Updated weights for policy 0, policy_version 668661 (0.0033) [2024-06-24 13:23:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 10955456512. Throughput: 0: 42632.9. Samples: 10955592380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 13:23:14,200][15401] Updated weights for policy 0, policy_version 668671 (0.0031) [2024-06-24 13:23:17,719][15401] Updated weights for policy 0, policy_version 668681 (0.0028) [2024-06-24 13:23:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10955702272. Throughput: 0: 42875.0. Samples: 10955854620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:18,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 13:23:21,775][15401] Updated weights for policy 0, policy_version 668691 (0.0041) [2024-06-24 13:23:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42601.3, 300 sec: 42765.0). Total num frames: 10955898880. Throughput: 0: 42532.0. Samples: 10955981980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 13:23:25,261][15401] Updated weights for policy 0, policy_version 668701 (0.0027) [2024-06-24 13:23:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10956111872. Throughput: 0: 42792.5. Samples: 10956238880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 13:23:29,210][15401] Updated weights for policy 0, policy_version 668711 (0.0031) [2024-06-24 13:23:32,914][15401] Updated weights for policy 0, policy_version 668721 (0.0040) [2024-06-24 13:23:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42821.1). Total num frames: 10956324864. Throughput: 0: 42749.9. Samples: 10956490620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 13:23:36,966][15401] Updated weights for policy 0, policy_version 668731 (0.0022) [2024-06-24 13:23:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10956554240. Throughput: 0: 42616.9. Samples: 10956621100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 13:23:40,574][15401] Updated weights for policy 0, policy_version 668741 (0.0028) [2024-06-24 13:23:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10956750848. Throughput: 0: 42672.0. Samples: 10956876940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 13:23:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668748_10956767232.pth... [2024-06-24 13:23:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668122_10946510848.pth [2024-06-24 13:23:44,893][15401] Updated weights for policy 0, policy_version 668751 (0.0032) [2024-06-24 13:23:48,291][15401] Updated weights for policy 0, policy_version 668761 (0.0024) [2024-06-24 13:23:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 10956980224. Throughput: 0: 42643.9. Samples: 10957125220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 13:23:52,558][15401] Updated weights for policy 0, policy_version 668771 (0.0033) [2024-06-24 13:23:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10957176832. Throughput: 0: 42543.9. Samples: 10957256620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 13:23:56,379][15401] Updated weights for policy 0, policy_version 668781 (0.0026) [2024-06-24 13:23:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10957373440. Throughput: 0: 42673.2. Samples: 10957512680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:23:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 13:24:00,121][15401] Updated weights for policy 0, policy_version 668791 (0.0044) [2024-06-24 13:24:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 10957619200. Throughput: 0: 42480.4. Samples: 10957766240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:24:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 13:24:03,851][15401] Updated weights for policy 0, policy_version 668801 (0.0042) [2024-06-24 13:24:07,020][15349] Signal inference workers to stop experience collection... (162200 times) [2024-06-24 13:24:07,066][15401] InferenceWorker_p0-w0: stopping experience collection (162200 times) [2024-06-24 13:24:07,069][15349] Signal inference workers to resume experience collection... (162200 times) [2024-06-24 13:24:07,082][15401] InferenceWorker_p0-w0: resuming experience collection (162200 times) [2024-06-24 13:24:07,644][15401] Updated weights for policy 0, policy_version 668811 (0.0033) [2024-06-24 13:24:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10957815808. Throughput: 0: 42676.6. Samples: 10957902420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:24:08,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 13:24:11,366][15401] Updated weights for policy 0, policy_version 668821 (0.0040) [2024-06-24 13:24:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10958012416. Throughput: 0: 42627.2. Samples: 10958157100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 13:24:13,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 13:24:15,188][15401] Updated weights for policy 0, policy_version 668831 (0.0030) [2024-06-24 13:24:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10958258176. Throughput: 0: 42591.1. Samples: 10958407220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 13:24:18,807][15401] Updated weights for policy 0, policy_version 668841 (0.0029) [2024-06-24 13:24:22,863][15401] Updated weights for policy 0, policy_version 668851 (0.0032) [2024-06-24 13:24:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10958454784. Throughput: 0: 42683.6. Samples: 10958541860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 13:24:27,318][15401] Updated weights for policy 0, policy_version 668861 (0.0032) [2024-06-24 13:24:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10958651392. Throughput: 0: 42628.5. Samples: 10958795220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 13:24:30,857][15401] Updated weights for policy 0, policy_version 668871 (0.0042) [2024-06-24 13:24:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10958880768. Throughput: 0: 42726.3. Samples: 10959047900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 13:24:34,928][15401] Updated weights for policy 0, policy_version 668881 (0.0037) [2024-06-24 13:24:38,382][15401] Updated weights for policy 0, policy_version 668891 (0.0027) [2024-06-24 13:24:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10959110144. Throughput: 0: 42698.8. Samples: 10959178060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 13:24:42,534][15401] Updated weights for policy 0, policy_version 668901 (0.0036) [2024-06-24 13:24:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10959306752. Throughput: 0: 42624.9. Samples: 10959430800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:43,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 13:24:46,161][15401] Updated weights for policy 0, policy_version 668911 (0.0044) [2024-06-24 13:24:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10959536128. Throughput: 0: 42605.1. Samples: 10959683460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:48,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 13:24:50,497][15401] Updated weights for policy 0, policy_version 668921 (0.0044) [2024-06-24 13:24:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10959732736. Throughput: 0: 42490.6. Samples: 10959814500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 13:24:53,736][15401] Updated weights for policy 0, policy_version 668931 (0.0027) [2024-06-24 13:24:58,250][15401] Updated weights for policy 0, policy_version 668941 (0.0027) [2024-06-24 13:24:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10959945728. Throughput: 0: 42463.5. Samples: 10960067960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:24:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 13:25:01,539][15401] Updated weights for policy 0, policy_version 668951 (0.0029) [2024-06-24 13:25:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 10960175104. Throughput: 0: 42653.0. Samples: 10960326600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 13:25:05,674][15401] Updated weights for policy 0, policy_version 668961 (0.0030) [2024-06-24 13:25:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 10960371712. Throughput: 0: 42633.3. Samples: 10960460460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:08,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 13:25:09,125][15401] Updated weights for policy 0, policy_version 668971 (0.0047) [2024-06-24 13:25:13,367][15401] Updated weights for policy 0, policy_version 668981 (0.0045) [2024-06-24 13:25:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10960584704. Throughput: 0: 42567.6. Samples: 10960710760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 13:25:16,768][15401] Updated weights for policy 0, policy_version 668991 (0.0031) [2024-06-24 13:25:18,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10960814080. Throughput: 0: 42512.6. Samples: 10960960960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 13:25:21,258][15401] Updated weights for policy 0, policy_version 669001 (0.0027) [2024-06-24 13:25:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10961010688. Throughput: 0: 42715.6. Samples: 10961100260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 13:25:24,268][15401] Updated weights for policy 0, policy_version 669011 (0.0030) [2024-06-24 13:25:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10961223680. Throughput: 0: 42639.1. Samples: 10961349560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:28,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 13:25:28,802][15401] Updated weights for policy 0, policy_version 669021 (0.0038) [2024-06-24 13:25:31,618][15349] Signal inference workers to stop experience collection... (162250 times) [2024-06-24 13:25:31,619][15349] Signal inference workers to resume experience collection... (162250 times) [2024-06-24 13:25:31,648][15401] InferenceWorker_p0-w0: stopping experience collection (162250 times) [2024-06-24 13:25:31,648][15401] InferenceWorker_p0-w0: resuming experience collection (162250 times) [2024-06-24 13:25:31,781][15401] Updated weights for policy 0, policy_version 669031 (0.0035) [2024-06-24 13:25:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10961453056. Throughput: 0: 42778.6. Samples: 10961608500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 13:25:36,352][15401] Updated weights for policy 0, policy_version 669041 (0.0031) [2024-06-24 13:25:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10961649664. Throughput: 0: 42959.1. Samples: 10961747660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 13:25:39,424][15401] Updated weights for policy 0, policy_version 669051 (0.0038) [2024-06-24 13:25:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 10961879040. Throughput: 0: 43034.6. Samples: 10962004520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 13:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669060_10961879040.pth... [2024-06-24 13:25:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668432_10951589888.pth [2024-06-24 13:25:43,773][15401] Updated weights for policy 0, policy_version 669061 (0.0039) [2024-06-24 13:25:47,227][15401] Updated weights for policy 0, policy_version 669071 (0.0037) [2024-06-24 13:25:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10962108416. Throughput: 0: 42813.3. Samples: 10962253200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:25:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 13:25:51,574][15401] Updated weights for policy 0, policy_version 669081 (0.0034) [2024-06-24 13:25:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 10962288640. Throughput: 0: 42851.5. Samples: 10962388680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:25:53,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 13:25:54,865][15401] Updated weights for policy 0, policy_version 669091 (0.0038) [2024-06-24 13:25:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 10962501632. Throughput: 0: 43071.1. Samples: 10962648960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:25:58,390][15132] Avg episode reward: [(0, '0.125')] [2024-06-24 13:25:59,261][15401] Updated weights for policy 0, policy_version 669101 (0.0041) [2024-06-24 13:26:02,258][15401] Updated weights for policy 0, policy_version 669111 (0.0041) [2024-06-24 13:26:03,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10962763776. Throughput: 0: 43143.0. Samples: 10962902400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 13:26:06,763][15401] Updated weights for policy 0, policy_version 669121 (0.0029) [2024-06-24 13:26:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 10962944000. Throughput: 0: 42992.8. Samples: 10963034940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 13:26:10,066][15401] Updated weights for policy 0, policy_version 669131 (0.0038) [2024-06-24 13:26:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10963156992. Throughput: 0: 43123.1. Samples: 10963290100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 13:26:14,426][15401] Updated weights for policy 0, policy_version 669141 (0.0024) [2024-06-24 13:26:17,414][15401] Updated weights for policy 0, policy_version 669151 (0.0041) [2024-06-24 13:26:18,393][15132] Fps is (10 sec: 47494.7, 60 sec: 43414.6, 300 sec: 42875.9). Total num frames: 10963419136. Throughput: 0: 43061.1. Samples: 10963546420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:18,394][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 13:26:22,141][15401] Updated weights for policy 0, policy_version 669161 (0.0037) [2024-06-24 13:26:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42877.0). Total num frames: 10963599360. Throughput: 0: 43068.4. Samples: 10963685740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 13:26:25,083][15401] Updated weights for policy 0, policy_version 669171 (0.0032) [2024-06-24 13:26:28,390][15132] Fps is (10 sec: 37698.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10963795968. Throughput: 0: 42962.8. Samples: 10963937840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 13:26:29,730][15401] Updated weights for policy 0, policy_version 669181 (0.0034) [2024-06-24 13:26:32,565][15401] Updated weights for policy 0, policy_version 669191 (0.0041) [2024-06-24 13:26:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10964041728. Throughput: 0: 43041.1. Samples: 10964190060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:33,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 13:26:37,441][15401] Updated weights for policy 0, policy_version 669201 (0.0043) [2024-06-24 13:26:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 10964238336. Throughput: 0: 43183.7. Samples: 10964331940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:38,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 13:26:40,372][15401] Updated weights for policy 0, policy_version 669211 (0.0024) [2024-06-24 13:26:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 10964451328. Throughput: 0: 42895.8. Samples: 10964579380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:43,393][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 13:26:45,265][15401] Updated weights for policy 0, policy_version 669221 (0.0034) [2024-06-24 13:26:47,820][15401] Updated weights for policy 0, policy_version 669231 (0.0053) [2024-06-24 13:26:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 10964697088. Throughput: 0: 42877.3. Samples: 10964831880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:26:52,851][15401] Updated weights for policy 0, policy_version 669241 (0.0031) [2024-06-24 13:26:53,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 10964877312. Throughput: 0: 43025.0. Samples: 10964971060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:26:54,064][15349] Signal inference workers to stop experience collection... (162300 times) [2024-06-24 13:26:54,065][15349] Signal inference workers to resume experience collection... (162300 times) [2024-06-24 13:26:54,083][15401] InferenceWorker_p0-w0: stopping experience collection (162300 times) [2024-06-24 13:26:54,084][15401] InferenceWorker_p0-w0: resuming experience collection (162300 times) [2024-06-24 13:26:55,328][15401] Updated weights for policy 0, policy_version 669251 (0.0028) [2024-06-24 13:26:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10965090304. Throughput: 0: 43122.3. Samples: 10965230600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:26:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 13:27:00,361][15401] Updated weights for policy 0, policy_version 669261 (0.0028) [2024-06-24 13:27:02,953][15401] Updated weights for policy 0, policy_version 669271 (0.0036) [2024-06-24 13:27:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10965352448. Throughput: 0: 42934.4. Samples: 10965478300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:03,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 13:27:08,023][15401] Updated weights for policy 0, policy_version 669281 (0.0035) [2024-06-24 13:27:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10965516288. Throughput: 0: 43050.7. Samples: 10965623020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 13:27:10,451][15401] Updated weights for policy 0, policy_version 669291 (0.0033) [2024-06-24 13:27:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 10965745664. Throughput: 0: 42936.4. Samples: 10965869980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 13:27:15,710][15401] Updated weights for policy 0, policy_version 669301 (0.0036) [2024-06-24 13:27:18,240][15401] Updated weights for policy 0, policy_version 669311 (0.0040) [2024-06-24 13:27:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42874.4, 300 sec: 42876.7). Total num frames: 10965991424. Throughput: 0: 42957.1. Samples: 10966123120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 13:27:23,358][15401] Updated weights for policy 0, policy_version 669321 (0.0029) [2024-06-24 13:27:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10966155264. Throughput: 0: 42913.8. Samples: 10966263060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 13:27:25,644][15401] Updated weights for policy 0, policy_version 669331 (0.0042) [2024-06-24 13:27:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 10966384640. Throughput: 0: 43007.2. Samples: 10966514600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-24 13:27:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 13:27:30,763][15401] Updated weights for policy 0, policy_version 669341 (0.0030) [2024-06-24 13:27:33,183][15401] Updated weights for policy 0, policy_version 669351 (0.0029) [2024-06-24 13:27:33,396][15132] Fps is (10 sec: 49120.5, 60 sec: 43413.1, 300 sec: 42875.2). Total num frames: 10966646784. Throughput: 0: 42992.2. Samples: 10966766800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:33,396][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 13:27:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10966794240. Throughput: 0: 43012.0. Samples: 10966906600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 13:27:38,615][15401] Updated weights for policy 0, policy_version 669361 (0.0044) [2024-06-24 13:27:40,988][15401] Updated weights for policy 0, policy_version 669371 (0.0031) [2024-06-24 13:27:43,389][15132] Fps is (10 sec: 37707.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 10967023616. Throughput: 0: 42734.2. Samples: 10967153640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 13:27:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669374_10967023616.pth... [2024-06-24 13:27:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000668748_10956767232.pth [2024-06-24 13:27:46,176][15401] Updated weights for policy 0, policy_version 669381 (0.0025) [2024-06-24 13:27:48,390][15132] Fps is (10 sec: 49151.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10967285760. Throughput: 0: 42927.5. Samples: 10967410040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:27:49,078][15349] Signal inference workers to stop experience collection... (162350 times) [2024-06-24 13:27:49,079][15349] Signal inference workers to resume experience collection... (162350 times) [2024-06-24 13:27:49,095][15401] Updated weights for policy 0, policy_version 669391 (0.0034) [2024-06-24 13:27:49,125][15401] InferenceWorker_p0-w0: stopping experience collection (162350 times) [2024-06-24 13:27:49,126][15401] InferenceWorker_p0-w0: resuming experience collection (162350 times) [2024-06-24 13:27:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 10967449600. Throughput: 0: 42884.0. Samples: 10967552800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 13:27:53,710][15401] Updated weights for policy 0, policy_version 669401 (0.0044) [2024-06-24 13:27:56,570][15401] Updated weights for policy 0, policy_version 669411 (0.0043) [2024-06-24 13:27:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10967678976. Throughput: 0: 42879.6. Samples: 10967799560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:27:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 13:28:01,242][15401] Updated weights for policy 0, policy_version 669421 (0.0039) [2024-06-24 13:28:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 10967908352. Throughput: 0: 42942.7. Samples: 10968055540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 13:28:04,082][15401] Updated weights for policy 0, policy_version 669431 (0.0048) [2024-06-24 13:28:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10968104960. Throughput: 0: 42888.8. Samples: 10968193060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 13:28:08,697][15401] Updated weights for policy 0, policy_version 669441 (0.0037) [2024-06-24 13:28:11,595][15401] Updated weights for policy 0, policy_version 669451 (0.0046) [2024-06-24 13:28:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10968317952. Throughput: 0: 42911.1. Samples: 10968445600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 13:28:16,386][15401] Updated weights for policy 0, policy_version 669461 (0.0043) [2024-06-24 13:28:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 10968563712. Throughput: 0: 42969.3. Samples: 10968700140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:18,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 13:28:19,061][15401] Updated weights for policy 0, policy_version 669471 (0.0025) [2024-06-24 13:28:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10968727552. Throughput: 0: 42986.2. Samples: 10968840980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 13:28:24,030][15401] Updated weights for policy 0, policy_version 669481 (0.0029) [2024-06-24 13:28:26,647][15401] Updated weights for policy 0, policy_version 669491 (0.0039) [2024-06-24 13:28:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10968973312. Throughput: 0: 42842.7. Samples: 10969081560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 13:28:31,690][15401] Updated weights for policy 0, policy_version 669501 (0.0039) [2024-06-24 13:28:33,390][15132] Fps is (10 sec: 47511.9, 60 sec: 42602.7, 300 sec: 42876.1). Total num frames: 10969202688. Throughput: 0: 42855.3. Samples: 10969338540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 13:28:34,763][15401] Updated weights for policy 0, policy_version 669511 (0.0029) [2024-06-24 13:28:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 10969366528. Throughput: 0: 42781.6. Samples: 10969477980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 13:28:39,253][15401] Updated weights for policy 0, policy_version 669521 (0.0033) [2024-06-24 13:28:42,278][15401] Updated weights for policy 0, policy_version 669531 (0.0035) [2024-06-24 13:28:43,390][15132] Fps is (10 sec: 40961.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 10969612288. Throughput: 0: 42884.4. Samples: 10969729360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 13:28:47,025][15401] Updated weights for policy 0, policy_version 669541 (0.0043) [2024-06-24 13:28:48,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10969841664. Throughput: 0: 42755.0. Samples: 10969979520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:48,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 13:28:49,864][15401] Updated weights for policy 0, policy_version 669551 (0.0038) [2024-06-24 13:28:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 10970005504. Throughput: 0: 42696.9. Samples: 10970114420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 13:28:54,738][15401] Updated weights for policy 0, policy_version 669561 (0.0041) [2024-06-24 13:28:55,497][15349] Signal inference workers to stop experience collection... (162400 times) [2024-06-24 13:28:55,498][15349] Signal inference workers to resume experience collection... (162400 times) [2024-06-24 13:28:55,530][15401] InferenceWorker_p0-w0: stopping experience collection (162400 times) [2024-06-24 13:28:55,530][15401] InferenceWorker_p0-w0: resuming experience collection (162400 times) [2024-06-24 13:28:57,372][15401] Updated weights for policy 0, policy_version 669571 (0.0033) [2024-06-24 13:28:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 10970251264. Throughput: 0: 42586.7. Samples: 10970362000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:28:58,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 13:29:02,398][15401] Updated weights for policy 0, policy_version 669581 (0.0049) [2024-06-24 13:29:03,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 10970480640. Throughput: 0: 42801.2. Samples: 10970626200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:29:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 13:29:04,835][15401] Updated weights for policy 0, policy_version 669591 (0.0038) [2024-06-24 13:29:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 10970660864. Throughput: 0: 42603.9. Samples: 10970758260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-24 13:29:08,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:29:09,991][15401] Updated weights for policy 0, policy_version 669601 (0.0035) [2024-06-24 13:29:12,243][15401] Updated weights for policy 0, policy_version 669611 (0.0036) [2024-06-24 13:29:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10970906624. Throughput: 0: 42730.5. Samples: 10971004440. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:13,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 13:29:17,552][15401] Updated weights for policy 0, policy_version 669621 (0.0031) [2024-06-24 13:29:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 10971103232. Throughput: 0: 43009.7. Samples: 10971273960. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 13:29:20,166][15401] Updated weights for policy 0, policy_version 669631 (0.0035) [2024-06-24 13:29:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10971299840. Throughput: 0: 42686.3. Samples: 10971398860. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 13:29:25,268][15401] Updated weights for policy 0, policy_version 669641 (0.0036) [2024-06-24 13:29:27,842][15401] Updated weights for policy 0, policy_version 669651 (0.0037) [2024-06-24 13:29:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 10971561984. Throughput: 0: 42708.9. Samples: 10971651260. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 13:29:32,981][15401] Updated weights for policy 0, policy_version 669661 (0.0031) [2024-06-24 13:29:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.6, 300 sec: 42820.5). Total num frames: 10971742208. Throughput: 0: 43189.3. Samples: 10971923040. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 13:29:35,475][15401] Updated weights for policy 0, policy_version 669671 (0.0033) [2024-06-24 13:29:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10971955200. Throughput: 0: 42851.5. Samples: 10972042740. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 13:29:40,728][15401] Updated weights for policy 0, policy_version 669681 (0.0038) [2024-06-24 13:29:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 10972200960. Throughput: 0: 42984.4. Samples: 10972296300. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 13:29:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669691_10972217344.pth... [2024-06-24 13:29:43,488][15401] Updated weights for policy 0, policy_version 669691 (0.0043) [2024-06-24 13:29:43,523][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669060_10961879040.pth [2024-06-24 13:29:48,347][15401] Updated weights for policy 0, policy_version 669701 (0.0030) [2024-06-24 13:29:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 10972381184. Throughput: 0: 42916.1. Samples: 10972557420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 13:29:51,548][15401] Updated weights for policy 0, policy_version 669711 (0.0034) [2024-06-24 13:29:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 10972594176. Throughput: 0: 42511.1. Samples: 10972671160. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 13:29:56,002][15401] Updated weights for policy 0, policy_version 669721 (0.0031) [2024-06-24 13:29:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 10972823552. Throughput: 0: 42832.4. Samples: 10972931900. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:29:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:29:59,156][15401] Updated weights for policy 0, policy_version 669731 (0.0032) [2024-06-24 13:30:03,316][15349] Signal inference workers to stop experience collection... (162450 times) [2024-06-24 13:30:03,317][15349] Signal inference workers to resume experience collection... (162450 times) [2024-06-24 13:30:03,359][15401] InferenceWorker_p0-w0: stopping experience collection (162450 times) [2024-06-24 13:30:03,359][15401] InferenceWorker_p0-w0: resuming experience collection (162450 times) [2024-06-24 13:30:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42820.9). Total num frames: 10973003776. Throughput: 0: 42649.2. Samples: 10973193180. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:03,392][15132] Avg episode reward: [(0, '0.035')] [2024-06-24 13:30:03,919][15401] Updated weights for policy 0, policy_version 669741 (0.0039) [2024-06-24 13:30:07,213][15401] Updated weights for policy 0, policy_version 669751 (0.0043) [2024-06-24 13:30:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 10973233152. Throughput: 0: 42423.5. Samples: 10973307920. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 13:30:11,554][15401] Updated weights for policy 0, policy_version 669761 (0.0038) [2024-06-24 13:30:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 10973462528. Throughput: 0: 42663.1. Samples: 10973571100. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 13:30:14,603][15401] Updated weights for policy 0, policy_version 669771 (0.0034) [2024-06-24 13:30:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 10973642752. Throughput: 0: 42426.1. Samples: 10973832220. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 13:30:19,205][15401] Updated weights for policy 0, policy_version 669781 (0.0041) [2024-06-24 13:30:22,503][15401] Updated weights for policy 0, policy_version 669791 (0.0029) [2024-06-24 13:30:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 10973888512. Throughput: 0: 42312.0. Samples: 10973946880. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:23,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 13:30:27,210][15401] Updated weights for policy 0, policy_version 669801 (0.0043) [2024-06-24 13:30:28,389][15132] Fps is (10 sec: 47514.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 10974117888. Throughput: 0: 42750.4. Samples: 10974220060. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 13:30:30,146][15401] Updated weights for policy 0, policy_version 669811 (0.0029) [2024-06-24 13:30:33,389][15132] Fps is (10 sec: 39331.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 10974281728. Throughput: 0: 42726.7. Samples: 10974480120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:33,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 13:30:34,791][15401] Updated weights for policy 0, policy_version 669821 (0.0031) [2024-06-24 13:30:37,650][15401] Updated weights for policy 0, policy_version 669831 (0.0033) [2024-06-24 13:30:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 10974527488. Throughput: 0: 42867.6. Samples: 10974600200. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:38,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-24 13:30:42,216][15401] Updated weights for policy 0, policy_version 669841 (0.0044) [2024-06-24 13:30:43,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 10974756864. Throughput: 0: 42972.4. Samples: 10974865660. Policy #0 lag: (min: 1.0, avg: 8.7, max: 20.0) [2024-06-24 13:30:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 13:30:45,295][15401] Updated weights for policy 0, policy_version 669851 (0.0032) [2024-06-24 13:30:48,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42875.8). Total num frames: 10974937088. Throughput: 0: 42797.3. Samples: 10975119160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:30:48,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 13:30:49,738][15401] Updated weights for policy 0, policy_version 669861 (0.0037) [2024-06-24 13:30:53,365][15401] Updated weights for policy 0, policy_version 669871 (0.0039) [2024-06-24 13:30:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 10975166464. Throughput: 0: 42868.6. Samples: 10975237000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:30:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 13:30:57,295][15401] Updated weights for policy 0, policy_version 669881 (0.0025) [2024-06-24 13:30:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 10975379456. Throughput: 0: 42831.2. Samples: 10975498500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:30:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 13:31:00,994][15401] Updated weights for policy 0, policy_version 669891 (0.0027) [2024-06-24 13:31:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 10975592448. Throughput: 0: 42621.6. Samples: 10975750180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 13:31:04,922][15401] Updated weights for policy 0, policy_version 669901 (0.0042) [2024-06-24 13:31:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 10975805440. Throughput: 0: 43019.3. Samples: 10975882640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 13:31:08,471][15401] Updated weights for policy 0, policy_version 669911 (0.0038) [2024-06-24 13:31:12,571][15401] Updated weights for policy 0, policy_version 669921 (0.0036) [2024-06-24 13:31:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 10976018432. Throughput: 0: 42641.2. Samples: 10976138920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 13:31:14,031][15349] Signal inference workers to stop experience collection... (162500 times) [2024-06-24 13:31:14,032][15349] Signal inference workers to resume experience collection... (162500 times) [2024-06-24 13:31:14,058][15401] InferenceWorker_p0-w0: stopping experience collection (162500 times) [2024-06-24 13:31:14,058][15401] InferenceWorker_p0-w0: resuming experience collection (162500 times) [2024-06-24 13:31:16,356][15401] Updated weights for policy 0, policy_version 669931 (0.0045) [2024-06-24 13:31:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 10976231424. Throughput: 0: 42419.1. Samples: 10976388980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 13:31:20,121][15401] Updated weights for policy 0, policy_version 669941 (0.0033) [2024-06-24 13:31:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 10976444416. Throughput: 0: 42589.9. Samples: 10976516740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 13:31:24,137][15401] Updated weights for policy 0, policy_version 669951 (0.0041) [2024-06-24 13:31:27,638][15401] Updated weights for policy 0, policy_version 669961 (0.0032) [2024-06-24 13:31:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10976657408. Throughput: 0: 42419.7. Samples: 10976774540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 13:31:31,735][15401] Updated weights for policy 0, policy_version 669971 (0.0034) [2024-06-24 13:31:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 10976886784. Throughput: 0: 42464.4. Samples: 10977029960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 13:31:35,238][15401] Updated weights for policy 0, policy_version 669981 (0.0041) [2024-06-24 13:31:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 10977067008. Throughput: 0: 42668.1. Samples: 10977157060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 13:31:39,593][15401] Updated weights for policy 0, policy_version 669991 (0.0030) [2024-06-24 13:31:42,779][15401] Updated weights for policy 0, policy_version 670001 (0.0037) [2024-06-24 13:31:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10977312768. Throughput: 0: 42708.3. Samples: 10977420380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:43,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 13:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670002_10977312768.pth... [2024-06-24 13:31:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669374_10967023616.pth [2024-06-24 13:31:47,495][15401] Updated weights for policy 0, policy_version 670011 (0.0026) [2024-06-24 13:31:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 10977525760. Throughput: 0: 42745.7. Samples: 10977673740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 13:31:50,317][15401] Updated weights for policy 0, policy_version 670021 (0.0036) [2024-06-24 13:31:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 10977705984. Throughput: 0: 42660.4. Samples: 10977802360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 13:31:55,246][15401] Updated weights for policy 0, policy_version 670031 (0.0032) [2024-06-24 13:31:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10977935360. Throughput: 0: 42588.4. Samples: 10978055400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:31:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 13:31:58,461][15401] Updated weights for policy 0, policy_version 670041 (0.0029) [2024-06-24 13:32:02,703][15401] Updated weights for policy 0, policy_version 670051 (0.0034) [2024-06-24 13:32:03,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42598.1, 300 sec: 42820.5). Total num frames: 10978148352. Throughput: 0: 42809.9. Samples: 10978315440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:32:03,391][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 13:32:06,399][15401] Updated weights for policy 0, policy_version 670061 (0.0029) [2024-06-24 13:32:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10978344960. Throughput: 0: 42736.4. Samples: 10978439880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:32:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 13:32:10,278][15401] Updated weights for policy 0, policy_version 670071 (0.0031) [2024-06-24 13:32:13,391][15132] Fps is (10 sec: 42593.4, 60 sec: 42597.4, 300 sec: 42653.7). Total num frames: 10978574336. Throughput: 0: 42599.0. Samples: 10978691560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:32:13,391][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 13:32:13,999][15401] Updated weights for policy 0, policy_version 670081 (0.0035) [2024-06-24 13:32:17,944][15401] Updated weights for policy 0, policy_version 670091 (0.0033) [2024-06-24 13:32:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 10978770944. Throughput: 0: 42564.1. Samples: 10978945340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:32:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 13:32:21,622][15401] Updated weights for policy 0, policy_version 670101 (0.0035) [2024-06-24 13:32:23,389][15132] Fps is (10 sec: 42604.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 10979000320. Throughput: 0: 42545.3. Samples: 10979071600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:32:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 13:32:25,566][15401] Updated weights for policy 0, policy_version 670111 (0.0043) [2024-06-24 13:32:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 10979213312. Throughput: 0: 42315.1. Samples: 10979324560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 13:32:29,291][15401] Updated weights for policy 0, policy_version 670121 (0.0033) [2024-06-24 13:32:32,873][15349] Signal inference workers to stop experience collection... (162550 times) [2024-06-24 13:32:32,903][15401] InferenceWorker_p0-w0: stopping experience collection (162550 times) [2024-06-24 13:32:32,922][15349] Signal inference workers to resume experience collection... (162550 times) [2024-06-24 13:32:32,939][15401] InferenceWorker_p0-w0: resuming experience collection (162550 times) [2024-06-24 13:32:33,265][15401] Updated weights for policy 0, policy_version 670131 (0.0045) [2024-06-24 13:32:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 10979426304. Throughput: 0: 42392.4. Samples: 10979581400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 13:32:36,851][15401] Updated weights for policy 0, policy_version 670141 (0.0034) [2024-06-24 13:32:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10979622912. Throughput: 0: 42372.5. Samples: 10979709120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 13:32:41,103][15401] Updated weights for policy 0, policy_version 670151 (0.0025) [2024-06-24 13:32:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10979835904. Throughput: 0: 42443.6. Samples: 10979965360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 13:32:44,287][15401] Updated weights for policy 0, policy_version 670161 (0.0039) [2024-06-24 13:32:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 10980048896. Throughput: 0: 42381.6. Samples: 10980222600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 13:32:48,659][15401] Updated weights for policy 0, policy_version 670171 (0.0027) [2024-06-24 13:32:52,319][15401] Updated weights for policy 0, policy_version 670181 (0.0028) [2024-06-24 13:32:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 10980261888. Throughput: 0: 42361.7. Samples: 10980346260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:53,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 13:32:56,401][15401] Updated weights for policy 0, policy_version 670191 (0.0040) [2024-06-24 13:32:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 10980442112. Throughput: 0: 42511.6. Samples: 10980604520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:32:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 13:32:59,958][15401] Updated weights for policy 0, policy_version 670201 (0.0037) [2024-06-24 13:33:03,393][15132] Fps is (10 sec: 40955.2, 60 sec: 42049.9, 300 sec: 42597.9). Total num frames: 10980671488. Throughput: 0: 42372.1. Samples: 10980852240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:03,394][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 13:33:04,069][15401] Updated weights for policy 0, policy_version 670211 (0.0032) [2024-06-24 13:33:07,956][15401] Updated weights for policy 0, policy_version 670221 (0.0031) [2024-06-24 13:33:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10980900864. Throughput: 0: 42435.1. Samples: 10980981180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 13:33:12,487][15401] Updated weights for policy 0, policy_version 670231 (0.0034) [2024-06-24 13:33:13,389][15132] Fps is (10 sec: 40974.8, 60 sec: 41780.2, 300 sec: 42431.8). Total num frames: 10981081088. Throughput: 0: 42380.5. Samples: 10981231680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:13,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 13:33:15,813][15401] Updated weights for policy 0, policy_version 670241 (0.0034) [2024-06-24 13:33:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10981326848. Throughput: 0: 42248.4. Samples: 10981482580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 13:33:20,147][15401] Updated weights for policy 0, policy_version 670251 (0.0032) [2024-06-24 13:33:23,367][15401] Updated weights for policy 0, policy_version 670261 (0.0039) [2024-06-24 13:33:23,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 10981556224. Throughput: 0: 42355.3. Samples: 10981615120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 13:33:27,695][15401] Updated weights for policy 0, policy_version 670271 (0.0044) [2024-06-24 13:33:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 10981720064. Throughput: 0: 42354.7. Samples: 10981871320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 13:33:30,916][15401] Updated weights for policy 0, policy_version 670281 (0.0026) [2024-06-24 13:33:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 10981965824. Throughput: 0: 42363.9. Samples: 10982128980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 13:33:35,710][15401] Updated weights for policy 0, policy_version 670291 (0.0024) [2024-06-24 13:33:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10982195200. Throughput: 0: 42674.3. Samples: 10982266500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 13:33:38,670][15401] Updated weights for policy 0, policy_version 670301 (0.0044) [2024-06-24 13:33:43,166][15401] Updated weights for policy 0, policy_version 670311 (0.0037) [2024-06-24 13:33:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 10982375424. Throughput: 0: 42439.4. Samples: 10982514300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 13:33:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670311_10982375424.pth... [2024-06-24 13:33:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000669691_10972217344.pth [2024-06-24 13:33:46,654][15401] Updated weights for policy 0, policy_version 670321 (0.0030) [2024-06-24 13:33:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10982604800. Throughput: 0: 42506.0. Samples: 10982764860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 13:33:50,606][15401] Updated weights for policy 0, policy_version 670331 (0.0024) [2024-06-24 13:33:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 10982801408. Throughput: 0: 42545.4. Samples: 10982895720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:53,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 13:33:54,576][15401] Updated weights for policy 0, policy_version 670341 (0.0034) [2024-06-24 13:33:54,855][15349] Signal inference workers to stop experience collection... (162600 times) [2024-06-24 13:33:54,879][15401] InferenceWorker_p0-w0: stopping experience collection (162600 times) [2024-06-24 13:33:54,916][15349] Signal inference workers to resume experience collection... (162600 times) [2024-06-24 13:33:54,916][15401] InferenceWorker_p0-w0: resuming experience collection (162600 times) [2024-06-24 13:33:58,054][15401] Updated weights for policy 0, policy_version 670351 (0.0039) [2024-06-24 13:33:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10983030784. Throughput: 0: 42531.5. Samples: 10983145600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:33:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 13:34:02,073][15401] Updated weights for policy 0, policy_version 670361 (0.0036) [2024-06-24 13:34:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42874.1, 300 sec: 42654.3). Total num frames: 10983243776. Throughput: 0: 42687.6. Samples: 10983403520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 13:34:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 13:34:05,792][15401] Updated weights for policy 0, policy_version 670371 (0.0033) [2024-06-24 13:34:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 10983440384. Throughput: 0: 42593.9. Samples: 10983531840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:08,391][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 13:34:09,675][15401] Updated weights for policy 0, policy_version 670381 (0.0041) [2024-06-24 13:34:13,349][15401] Updated weights for policy 0, policy_version 670391 (0.0037) [2024-06-24 13:34:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10983686144. Throughput: 0: 42563.1. Samples: 10983786660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:13,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 13:34:17,297][15401] Updated weights for policy 0, policy_version 670401 (0.0029) [2024-06-24 13:34:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 10983882752. Throughput: 0: 42673.0. Samples: 10984049260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 13:34:21,040][15401] Updated weights for policy 0, policy_version 670411 (0.0042) [2024-06-24 13:34:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 10984079360. Throughput: 0: 42302.3. Samples: 10984170100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 13:34:25,144][15401] Updated weights for policy 0, policy_version 670421 (0.0042) [2024-06-24 13:34:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 10984325120. Throughput: 0: 42666.7. Samples: 10984434300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:28,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 13:34:28,769][15401] Updated weights for policy 0, policy_version 670431 (0.0033) [2024-06-24 13:34:32,738][15401] Updated weights for policy 0, policy_version 670441 (0.0052) [2024-06-24 13:34:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10984505344. Throughput: 0: 42778.1. Samples: 10984689880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 13:34:36,462][15401] Updated weights for policy 0, policy_version 670451 (0.0047) [2024-06-24 13:34:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 10984718336. Throughput: 0: 42610.6. Samples: 10984813200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 13:34:40,401][15401] Updated weights for policy 0, policy_version 670461 (0.0042) [2024-06-24 13:34:43,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 10984964096. Throughput: 0: 42707.1. Samples: 10985067420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 13:34:44,471][15401] Updated weights for policy 0, policy_version 670471 (0.0035) [2024-06-24 13:34:48,041][15401] Updated weights for policy 0, policy_version 670481 (0.0037) [2024-06-24 13:34:48,395][15132] Fps is (10 sec: 44213.0, 60 sec: 42594.6, 300 sec: 42597.6). Total num frames: 10985160704. Throughput: 0: 42615.2. Samples: 10985321440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:48,396][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 13:34:52,036][15401] Updated weights for policy 0, policy_version 670491 (0.0032) [2024-06-24 13:34:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 10985357312. Throughput: 0: 42587.2. Samples: 10985448260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:34:56,038][15401] Updated weights for policy 0, policy_version 670501 (0.0027) [2024-06-24 13:34:58,389][15132] Fps is (10 sec: 44261.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 10985603072. Throughput: 0: 42711.1. Samples: 10985708660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:34:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 13:34:59,834][15401] Updated weights for policy 0, policy_version 670511 (0.0026) [2024-06-24 13:35:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 10985799680. Throughput: 0: 42556.3. Samples: 10985964300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 13:35:03,867][15401] Updated weights for policy 0, policy_version 670521 (0.0025) [2024-06-24 13:35:07,834][15401] Updated weights for policy 0, policy_version 670531 (0.0037) [2024-06-24 13:35:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 10985996288. Throughput: 0: 42550.7. Samples: 10986084880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 13:35:11,415][15401] Updated weights for policy 0, policy_version 670541 (0.0026) [2024-06-24 13:35:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 10986242048. Throughput: 0: 42416.9. Samples: 10986343060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 13:35:15,513][15401] Updated weights for policy 0, policy_version 670551 (0.0026) [2024-06-24 13:35:17,996][15349] Signal inference workers to stop experience collection... (162650 times) [2024-06-24 13:35:17,996][15349] Signal inference workers to resume experience collection... (162650 times) [2024-06-24 13:35:18,018][15401] InferenceWorker_p0-w0: stopping experience collection (162650 times) [2024-06-24 13:35:18,019][15401] InferenceWorker_p0-w0: resuming experience collection (162650 times) [2024-06-24 13:35:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 10986422272. Throughput: 0: 42470.4. Samples: 10986601040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 13:35:19,029][15401] Updated weights for policy 0, policy_version 670561 (0.0025) [2024-06-24 13:35:23,020][15401] Updated weights for policy 0, policy_version 670571 (0.0041) [2024-06-24 13:35:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 10986635264. Throughput: 0: 42440.4. Samples: 10986723020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 13:35:26,803][15401] Updated weights for policy 0, policy_version 670581 (0.0045) [2024-06-24 13:35:28,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 10986897408. Throughput: 0: 42582.2. Samples: 10986983620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 13:35:30,859][15401] Updated weights for policy 0, policy_version 670591 (0.0042) [2024-06-24 13:35:33,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 10987061248. Throughput: 0: 42618.5. Samples: 10987239140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:33,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 13:35:34,397][15401] Updated weights for policy 0, policy_version 670601 (0.0032) [2024-06-24 13:35:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 10987274240. Throughput: 0: 42417.3. Samples: 10987357040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 13:35:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 13:35:38,550][15401] Updated weights for policy 0, policy_version 670611 (0.0037) [2024-06-24 13:35:42,061][15401] Updated weights for policy 0, policy_version 670621 (0.0027) [2024-06-24 13:35:43,389][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 10987520000. Throughput: 0: 42401.3. Samples: 10987616720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:35:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 13:35:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670625_10987520000.pth... [2024-06-24 13:35:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670002_10977312768.pth [2024-06-24 13:35:46,151][15401] Updated weights for policy 0, policy_version 670631 (0.0034) [2024-06-24 13:35:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42327.5, 300 sec: 42487.0). Total num frames: 10987700224. Throughput: 0: 42536.5. Samples: 10987878540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:35:48,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 13:35:49,671][15401] Updated weights for policy 0, policy_version 670641 (0.0032) [2024-06-24 13:35:53,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42866.8, 300 sec: 42541.9). Total num frames: 10987929600. Throughput: 0: 42565.4. Samples: 10988000600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:35:53,396][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 13:35:53,673][15401] Updated weights for policy 0, policy_version 670651 (0.0037) [2024-06-24 13:35:57,292][15401] Updated weights for policy 0, policy_version 670661 (0.0038) [2024-06-24 13:35:58,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 10988142592. Throughput: 0: 42527.6. Samples: 10988256800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:35:58,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-24 13:36:01,243][15401] Updated weights for policy 0, policy_version 670671 (0.0028) [2024-06-24 13:36:03,390][15132] Fps is (10 sec: 40986.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 10988339200. Throughput: 0: 42646.6. Samples: 10988520140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 13:36:04,911][15401] Updated weights for policy 0, policy_version 670681 (0.0036) [2024-06-24 13:36:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10988584960. Throughput: 0: 42689.8. Samples: 10988644060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 13:36:08,603][15401] Updated weights for policy 0, policy_version 670691 (0.0031) [2024-06-24 13:36:12,401][15401] Updated weights for policy 0, policy_version 670701 (0.0048) [2024-06-24 13:36:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10988797952. Throughput: 0: 42733.4. Samples: 10988906620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 13:36:16,274][15401] Updated weights for policy 0, policy_version 670711 (0.0033) [2024-06-24 13:36:18,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 10988961792. Throughput: 0: 42801.8. Samples: 10989165120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 13:36:20,217][15401] Updated weights for policy 0, policy_version 670721 (0.0045) [2024-06-24 13:36:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 10989223936. Throughput: 0: 42904.4. Samples: 10989287740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 13:36:23,978][15401] Updated weights for policy 0, policy_version 670731 (0.0036) [2024-06-24 13:36:28,079][15401] Updated weights for policy 0, policy_version 670741 (0.0034) [2024-06-24 13:36:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 10989436928. Throughput: 0: 42971.9. Samples: 10989550460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 13:36:31,613][15401] Updated weights for policy 0, policy_version 670751 (0.0038) [2024-06-24 13:36:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 10989617152. Throughput: 0: 42890.6. Samples: 10989808520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 13:36:35,883][15401] Updated weights for policy 0, policy_version 670761 (0.0051) [2024-06-24 13:36:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 10989862912. Throughput: 0: 42802.6. Samples: 10989926440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:38,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 13:36:39,395][15401] Updated weights for policy 0, policy_version 670771 (0.0035) [2024-06-24 13:36:42,342][15349] Signal inference workers to stop experience collection... (162700 times) [2024-06-24 13:36:42,345][15349] Signal inference workers to resume experience collection... (162700 times) [2024-06-24 13:36:42,374][15401] InferenceWorker_p0-w0: stopping experience collection (162700 times) [2024-06-24 13:36:42,374][15401] InferenceWorker_p0-w0: resuming experience collection (162700 times) [2024-06-24 13:36:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 10990043136. Throughput: 0: 42849.8. Samples: 10990185040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 13:36:43,690][15401] Updated weights for policy 0, policy_version 670781 (0.0033) [2024-06-24 13:36:47,083][15401] Updated weights for policy 0, policy_version 670791 (0.0022) [2024-06-24 13:36:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 10990256128. Throughput: 0: 42592.4. Samples: 10990436800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 13:36:51,422][15401] Updated weights for policy 0, policy_version 670801 (0.0033) [2024-06-24 13:36:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 10990485504. Throughput: 0: 42625.7. Samples: 10990562220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:53,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 13:36:54,766][15401] Updated weights for policy 0, policy_version 670811 (0.0040) [2024-06-24 13:36:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 10990682112. Throughput: 0: 42548.1. Samples: 10990821280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:36:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:36:59,058][15401] Updated weights for policy 0, policy_version 670821 (0.0034) [2024-06-24 13:37:03,096][15401] Updated weights for policy 0, policy_version 670831 (0.0030) [2024-06-24 13:37:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 10990895104. Throughput: 0: 42288.8. Samples: 10991068120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:37:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 13:37:06,591][15401] Updated weights for policy 0, policy_version 670841 (0.0031) [2024-06-24 13:37:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 10991124480. Throughput: 0: 42414.2. Samples: 10991196380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:37:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 13:37:10,676][15401] Updated weights for policy 0, policy_version 670851 (0.0034) [2024-06-24 13:37:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 10991321088. Throughput: 0: 42373.5. Samples: 10991457260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:37:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 13:37:14,457][15401] Updated weights for policy 0, policy_version 670861 (0.0025) [2024-06-24 13:37:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 10991534080. Throughput: 0: 42145.0. Samples: 10991705040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 13:37:18,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-24 13:37:18,567][15401] Updated weights for policy 0, policy_version 670871 (0.0028) [2024-06-24 13:37:22,016][15401] Updated weights for policy 0, policy_version 670881 (0.0030) [2024-06-24 13:37:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 10991747072. Throughput: 0: 42381.4. Samples: 10991833600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 13:37:26,075][15401] Updated weights for policy 0, policy_version 670891 (0.0026) [2024-06-24 13:37:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 10991960064. Throughput: 0: 42403.1. Samples: 10992093180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 13:37:29,708][15401] Updated weights for policy 0, policy_version 670901 (0.0040) [2024-06-24 13:37:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 10992189440. Throughput: 0: 42388.9. Samples: 10992344400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:33,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 13:37:33,531][15401] Updated weights for policy 0, policy_version 670911 (0.0037) [2024-06-24 13:37:37,657][15401] Updated weights for policy 0, policy_version 670921 (0.0042) [2024-06-24 13:37:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10992402432. Throughput: 0: 42552.1. Samples: 10992477060. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 13:37:41,276][15401] Updated weights for policy 0, policy_version 670931 (0.0026) [2024-06-24 13:37:43,395][15132] Fps is (10 sec: 40947.8, 60 sec: 42594.6, 300 sec: 42542.1). Total num frames: 10992599040. Throughput: 0: 42501.1. Samples: 10992734060. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:43,395][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 13:37:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670935_10992599040.pth... [2024-06-24 13:37:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670311_10982375424.pth [2024-06-24 13:37:45,477][15401] Updated weights for policy 0, policy_version 670941 (0.0039) [2024-06-24 13:37:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 10992844800. Throughput: 0: 42637.3. Samples: 10992986800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 13:37:48,868][15401] Updated weights for policy 0, policy_version 670951 (0.0045) [2024-06-24 13:37:53,327][15401] Updated weights for policy 0, policy_version 670961 (0.0035) [2024-06-24 13:37:53,389][15132] Fps is (10 sec: 42621.6, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 10993025024. Throughput: 0: 42699.6. Samples: 10993117860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 13:37:56,626][15401] Updated weights for policy 0, policy_version 670971 (0.0033) [2024-06-24 13:37:58,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42654.1). Total num frames: 10993254400. Throughput: 0: 42473.6. Samples: 10993368680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:37:58,393][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 13:38:00,959][15401] Updated weights for policy 0, policy_version 670981 (0.0026) [2024-06-24 13:38:02,196][15349] Signal inference workers to stop experience collection... (162750 times) [2024-06-24 13:38:02,233][15401] InferenceWorker_p0-w0: stopping experience collection (162750 times) [2024-06-24 13:38:02,255][15349] Signal inference workers to resume experience collection... (162750 times) [2024-06-24 13:38:02,260][15401] InferenceWorker_p0-w0: resuming experience collection (162750 times) [2024-06-24 13:38:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10993467392. Throughput: 0: 42695.0. Samples: 10993626320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 13:38:04,165][15401] Updated weights for policy 0, policy_version 670991 (0.0028) [2024-06-24 13:38:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10993664000. Throughput: 0: 42697.3. Samples: 10993754980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 13:38:08,620][15401] Updated weights for policy 0, policy_version 671001 (0.0032) [2024-06-24 13:38:11,795][15401] Updated weights for policy 0, policy_version 671011 (0.0026) [2024-06-24 13:38:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 10993909760. Throughput: 0: 42633.8. Samples: 10994011700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 13:38:16,200][15401] Updated weights for policy 0, policy_version 671021 (0.0029) [2024-06-24 13:38:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 10994106368. Throughput: 0: 42864.0. Samples: 10994273180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 13:38:19,473][15401] Updated weights for policy 0, policy_version 671031 (0.0042) [2024-06-24 13:38:23,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 10994319360. Throughput: 0: 42727.5. Samples: 10994399900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:23,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 13:38:23,792][15401] Updated weights for policy 0, policy_version 671041 (0.0043) [2024-06-24 13:38:27,168][15401] Updated weights for policy 0, policy_version 671051 (0.0044) [2024-06-24 13:38:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 10994565120. Throughput: 0: 42861.6. Samples: 10994662600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 13:38:31,539][15401] Updated weights for policy 0, policy_version 671061 (0.0029) [2024-06-24 13:38:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 10994761728. Throughput: 0: 42975.3. Samples: 10994920680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 13:38:34,718][15401] Updated weights for policy 0, policy_version 671071 (0.0028) [2024-06-24 13:38:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42709.2). Total num frames: 10994974720. Throughput: 0: 42819.9. Samples: 10995044860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:38,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 13:38:39,389][15401] Updated weights for policy 0, policy_version 671081 (0.0041) [2024-06-24 13:38:42,266][15401] Updated weights for policy 0, policy_version 671091 (0.0043) [2024-06-24 13:38:43,396][15132] Fps is (10 sec: 45845.5, 60 sec: 43689.9, 300 sec: 42764.1). Total num frames: 10995220480. Throughput: 0: 43115.3. Samples: 10995309040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:43,397][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 13:38:47,046][15401] Updated weights for policy 0, policy_version 671101 (0.0029) [2024-06-24 13:38:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 10995384320. Throughput: 0: 43067.7. Samples: 10995564360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 13:38:49,973][15401] Updated weights for policy 0, policy_version 671111 (0.0040) [2024-06-24 13:38:53,389][15132] Fps is (10 sec: 37707.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 10995597312. Throughput: 0: 42862.2. Samples: 10995683780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 13:38:54,782][15401] Updated weights for policy 0, policy_version 671121 (0.0035) [2024-06-24 13:38:57,695][15401] Updated weights for policy 0, policy_version 671131 (0.0029) [2024-06-24 13:38:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 10995843072. Throughput: 0: 42999.4. Samples: 10995946680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-24 13:38:58,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 13:39:02,436][15401] Updated weights for policy 0, policy_version 671141 (0.0044) [2024-06-24 13:39:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10996023296. Throughput: 0: 43050.7. Samples: 10996210460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 13:39:05,299][15401] Updated weights for policy 0, policy_version 671151 (0.0028) [2024-06-24 13:39:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 10996252672. Throughput: 0: 42855.2. Samples: 10996328280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 13:39:10,111][15401] Updated weights for policy 0, policy_version 671161 (0.0030) [2024-06-24 13:39:12,919][15401] Updated weights for policy 0, policy_version 671171 (0.0035) [2024-06-24 13:39:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 10996482048. Throughput: 0: 42760.8. Samples: 10996586840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 13:39:14,706][15349] Signal inference workers to stop experience collection... (162800 times) [2024-06-24 13:39:14,734][15401] InferenceWorker_p0-w0: stopping experience collection (162800 times) [2024-06-24 13:39:14,820][15349] Signal inference workers to resume experience collection... (162800 times) [2024-06-24 13:39:14,821][15401] InferenceWorker_p0-w0: resuming experience collection (162800 times) [2024-06-24 13:39:17,941][15401] Updated weights for policy 0, policy_version 671181 (0.0028) [2024-06-24 13:39:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 10996645888. Throughput: 0: 42908.9. Samples: 10996851580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 13:39:20,729][15401] Updated weights for policy 0, policy_version 671191 (0.0034) [2024-06-24 13:39:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 10996908032. Throughput: 0: 42684.0. Samples: 10996965540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 13:39:25,741][15401] Updated weights for policy 0, policy_version 671201 (0.0030) [2024-06-24 13:39:28,286][15401] Updated weights for policy 0, policy_version 671211 (0.0034) [2024-06-24 13:39:28,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 10997121024. Throughput: 0: 42522.0. Samples: 10997222360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:28,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 13:39:33,153][15401] Updated weights for policy 0, policy_version 671221 (0.0036) [2024-06-24 13:39:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 10997284864. Throughput: 0: 42735.0. Samples: 10997487440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 13:39:36,135][15401] Updated weights for policy 0, policy_version 671231 (0.0023) [2024-06-24 13:39:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 10997563392. Throughput: 0: 42735.5. Samples: 10997606880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 13:39:40,840][15401] Updated weights for policy 0, policy_version 671241 (0.0028) [2024-06-24 13:39:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42329.8, 300 sec: 42710.3). Total num frames: 10997760000. Throughput: 0: 42726.2. Samples: 10997869360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 13:39:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671250_10997760000.pth... [2024-06-24 13:39:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670625_10987520000.pth [2024-06-24 13:39:43,841][15401] Updated weights for policy 0, policy_version 671251 (0.0052) [2024-06-24 13:39:48,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 10997923840. Throughput: 0: 42596.9. Samples: 10998127320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 13:39:48,629][15401] Updated weights for policy 0, policy_version 671261 (0.0030) [2024-06-24 13:39:51,398][15401] Updated weights for policy 0, policy_version 671271 (0.0023) [2024-06-24 13:39:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 10998185984. Throughput: 0: 42669.2. Samples: 10998248400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 13:39:56,198][15401] Updated weights for policy 0, policy_version 671281 (0.0043) [2024-06-24 13:39:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 10998382592. Throughput: 0: 42774.3. Samples: 10998511680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:39:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 13:39:59,221][15401] Updated weights for policy 0, policy_version 671291 (0.0029) [2024-06-24 13:40:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 10998579200. Throughput: 0: 42508.4. Samples: 10998764460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:03,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 13:40:03,715][15401] Updated weights for policy 0, policy_version 671301 (0.0028) [2024-06-24 13:40:07,043][15401] Updated weights for policy 0, policy_version 671311 (0.0039) [2024-06-24 13:40:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 10998824960. Throughput: 0: 42661.4. Samples: 10998885300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 13:40:11,212][15401] Updated weights for policy 0, policy_version 671321 (0.0033) [2024-06-24 13:40:13,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.8, 300 sec: 42708.5). Total num frames: 10999021568. Throughput: 0: 42778.0. Samples: 10999147540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:13,396][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 13:40:14,585][15401] Updated weights for policy 0, policy_version 671331 (0.0037) [2024-06-24 13:40:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 10999218176. Throughput: 0: 42547.2. Samples: 10999402060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 13:40:19,066][15401] Updated weights for policy 0, policy_version 671341 (0.0029) [2024-06-24 13:40:20,481][15349] Signal inference workers to stop experience collection... (162850 times) [2024-06-24 13:40:20,529][15401] InferenceWorker_p0-w0: stopping experience collection (162850 times) [2024-06-24 13:40:20,538][15349] Signal inference workers to resume experience collection... (162850 times) [2024-06-24 13:40:20,547][15401] InferenceWorker_p0-w0: resuming experience collection (162850 times) [2024-06-24 13:40:22,321][15401] Updated weights for policy 0, policy_version 671351 (0.0036) [2024-06-24 13:40:23,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 10999463936. Throughput: 0: 42531.0. Samples: 10999520780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 13:40:26,522][15401] Updated weights for policy 0, policy_version 671361 (0.0047) [2024-06-24 13:40:28,391][15132] Fps is (10 sec: 44231.2, 60 sec: 42326.2, 300 sec: 42709.6). Total num frames: 10999660544. Throughput: 0: 42593.1. Samples: 10999786100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:28,391][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 13:40:29,894][15401] Updated weights for policy 0, policy_version 671371 (0.0032) [2024-06-24 13:40:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 10999857152. Throughput: 0: 42379.6. Samples: 11000034400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 13:40:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 13:40:34,602][15401] Updated weights for policy 0, policy_version 671381 (0.0034) [2024-06-24 13:40:37,585][15401] Updated weights for policy 0, policy_version 671391 (0.0033) [2024-06-24 13:40:38,390][15132] Fps is (10 sec: 44242.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11000102912. Throughput: 0: 42526.3. Samples: 11000162080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:40:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 13:40:42,178][15401] Updated weights for policy 0, policy_version 671401 (0.0044) [2024-06-24 13:40:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 11000283136. Throughput: 0: 42394.6. Samples: 11000419440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:40:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 13:40:45,199][15401] Updated weights for policy 0, policy_version 671411 (0.0043) [2024-06-24 13:40:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42599.3). Total num frames: 11000496128. Throughput: 0: 42412.0. Samples: 11000673000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:40:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 13:40:49,821][15401] Updated weights for policy 0, policy_version 671421 (0.0056) [2024-06-24 13:40:52,927][15401] Updated weights for policy 0, policy_version 671431 (0.0031) [2024-06-24 13:40:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11000741888. Throughput: 0: 42565.9. Samples: 11000800760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:40:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:40:57,349][15401] Updated weights for policy 0, policy_version 671441 (0.0038) [2024-06-24 13:40:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11000922112. Throughput: 0: 42467.3. Samples: 11001058300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:40:58,394][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 13:41:00,813][15401] Updated weights for policy 0, policy_version 671451 (0.0026) [2024-06-24 13:41:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11001135104. Throughput: 0: 42435.5. Samples: 11001311660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 13:41:04,926][15401] Updated weights for policy 0, policy_version 671461 (0.0042) [2024-06-24 13:41:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11001364480. Throughput: 0: 42678.3. Samples: 11001441300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 13:41:08,420][15401] Updated weights for policy 0, policy_version 671471 (0.0047) [2024-06-24 13:41:12,585][15401] Updated weights for policy 0, policy_version 671481 (0.0026) [2024-06-24 13:41:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 11001577472. Throughput: 0: 42508.2. Samples: 11001698920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 13:41:16,199][15401] Updated weights for policy 0, policy_version 671491 (0.0030) [2024-06-24 13:41:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11001790464. Throughput: 0: 42453.8. Samples: 11001944820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 13:41:20,641][15349] Signal inference workers to stop experience collection... (162900 times) [2024-06-24 13:41:20,668][15401] InferenceWorker_p0-w0: stopping experience collection (162900 times) [2024-06-24 13:41:20,694][15349] Signal inference workers to resume experience collection... (162900 times) [2024-06-24 13:41:20,700][15401] InferenceWorker_p0-w0: resuming experience collection (162900 times) [2024-06-24 13:41:20,702][15401] Updated weights for policy 0, policy_version 671501 (0.0036) [2024-06-24 13:41:23,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 11001987072. Throughput: 0: 42497.7. Samples: 11002074580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:23,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 13:41:23,831][15401] Updated weights for policy 0, policy_version 671511 (0.0032) [2024-06-24 13:41:28,339][15401] Updated weights for policy 0, policy_version 671521 (0.0035) [2024-06-24 13:41:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42326.2, 300 sec: 42654.0). Total num frames: 11002200064. Throughput: 0: 42542.7. Samples: 11002333860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:28,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 13:41:31,442][15401] Updated weights for policy 0, policy_version 671531 (0.0026) [2024-06-24 13:41:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11002429440. Throughput: 0: 42489.8. Samples: 11002585040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 13:41:35,885][15401] Updated weights for policy 0, policy_version 671541 (0.0045) [2024-06-24 13:41:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11002626048. Throughput: 0: 42460.8. Samples: 11002711500. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:38,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 13:41:39,146][15401] Updated weights for policy 0, policy_version 671551 (0.0032) [2024-06-24 13:41:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11002822656. Throughput: 0: 42567.7. Samples: 11002973840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:41:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671560_11002839040.pth... [2024-06-24 13:41:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000670935_10992599040.pth [2024-06-24 13:41:43,695][15401] Updated weights for policy 0, policy_version 671561 (0.0039) [2024-06-24 13:41:46,676][15401] Updated weights for policy 0, policy_version 671571 (0.0029) [2024-06-24 13:41:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11003084800. Throughput: 0: 42370.6. Samples: 11003218340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 13:41:51,485][15401] Updated weights for policy 0, policy_version 671581 (0.0027) [2024-06-24 13:41:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11003265024. Throughput: 0: 42616.4. Samples: 11003359040. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 13:41:54,350][15401] Updated weights for policy 0, policy_version 671591 (0.0041) [2024-06-24 13:41:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11003461632. Throughput: 0: 42510.8. Samples: 11003611900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:41:58,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 13:41:59,060][15401] Updated weights for policy 0, policy_version 671601 (0.0029) [2024-06-24 13:42:02,211][15401] Updated weights for policy 0, policy_version 671611 (0.0033) [2024-06-24 13:42:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11003723776. Throughput: 0: 42474.6. Samples: 11003856180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:42:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 13:42:06,604][15401] Updated weights for policy 0, policy_version 671621 (0.0031) [2024-06-24 13:42:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11003904000. Throughput: 0: 42809.4. Samples: 11004000900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:42:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 13:42:09,689][15401] Updated weights for policy 0, policy_version 671631 (0.0030) [2024-06-24 13:42:13,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 11004116992. Throughput: 0: 42631.0. Samples: 11004252360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-24 13:42:13,393][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 13:42:14,116][15401] Updated weights for policy 0, policy_version 671641 (0.0033) [2024-06-24 13:42:17,324][15401] Updated weights for policy 0, policy_version 671651 (0.0046) [2024-06-24 13:42:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11004362752. Throughput: 0: 42549.3. Samples: 11004499860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:18,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 13:42:21,669][15401] Updated weights for policy 0, policy_version 671661 (0.0045) [2024-06-24 13:42:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 11004526592. Throughput: 0: 42813.0. Samples: 11004638080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 13:42:25,017][15401] Updated weights for policy 0, policy_version 671671 (0.0030) [2024-06-24 13:42:26,329][15349] Signal inference workers to stop experience collection... (162950 times) [2024-06-24 13:42:26,365][15401] InferenceWorker_p0-w0: stopping experience collection (162950 times) [2024-06-24 13:42:26,448][15349] Signal inference workers to resume experience collection... (162950 times) [2024-06-24 13:42:26,448][15401] InferenceWorker_p0-w0: resuming experience collection (162950 times) [2024-06-24 13:42:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 11004772352. Throughput: 0: 42603.9. Samples: 11004891020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 13:42:29,240][15401] Updated weights for policy 0, policy_version 671681 (0.0033) [2024-06-24 13:42:32,785][15401] Updated weights for policy 0, policy_version 671691 (0.0036) [2024-06-24 13:42:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11005001728. Throughput: 0: 42851.6. Samples: 11005146660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 13:42:36,818][15401] Updated weights for policy 0, policy_version 671701 (0.0040) [2024-06-24 13:42:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42599.2). Total num frames: 11005165568. Throughput: 0: 42637.7. Samples: 11005277740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:38,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-24 13:42:40,687][15401] Updated weights for policy 0, policy_version 671711 (0.0035) [2024-06-24 13:42:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 11005411328. Throughput: 0: 42686.5. Samples: 11005532800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 13:42:44,867][15401] Updated weights for policy 0, policy_version 671721 (0.0036) [2024-06-24 13:42:48,129][15401] Updated weights for policy 0, policy_version 671731 (0.0048) [2024-06-24 13:42:48,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11005640704. Throughput: 0: 42812.6. Samples: 11005782740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 13:42:52,597][15401] Updated weights for policy 0, policy_version 671741 (0.0040) [2024-06-24 13:42:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 11005804544. Throughput: 0: 42484.4. Samples: 11005912700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 13:42:55,977][15401] Updated weights for policy 0, policy_version 671751 (0.0038) [2024-06-24 13:42:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11006066688. Throughput: 0: 42586.3. Samples: 11006168640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:42:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 13:43:00,151][15401] Updated weights for policy 0, policy_version 671761 (0.0028) [2024-06-24 13:43:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 11006263296. Throughput: 0: 42680.8. Samples: 11006420400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 13:43:03,740][15401] Updated weights for policy 0, policy_version 671771 (0.0052) [2024-06-24 13:43:08,046][15401] Updated weights for policy 0, policy_version 671781 (0.0036) [2024-06-24 13:43:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11006459904. Throughput: 0: 42482.6. Samples: 11006549800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 13:43:11,415][15401] Updated weights for policy 0, policy_version 671791 (0.0040) [2024-06-24 13:43:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 11006705664. Throughput: 0: 42624.5. Samples: 11006809120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 13:43:15,457][15401] Updated weights for policy 0, policy_version 671801 (0.0035) [2024-06-24 13:43:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 11006902272. Throughput: 0: 42714.2. Samples: 11007068800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 13:43:19,150][15401] Updated weights for policy 0, policy_version 671811 (0.0033) [2024-06-24 13:43:23,110][15401] Updated weights for policy 0, policy_version 671821 (0.0034) [2024-06-24 13:43:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 11007115264. Throughput: 0: 42583.2. Samples: 11007193980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:23,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 13:43:27,160][15401] Updated weights for policy 0, policy_version 671831 (0.0047) [2024-06-24 13:43:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11007344640. Throughput: 0: 42824.6. Samples: 11007459900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 13:43:30,674][15401] Updated weights for policy 0, policy_version 671841 (0.0036) [2024-06-24 13:43:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 11007557632. Throughput: 0: 42979.0. Samples: 11007716800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 13:43:34,658][15401] Updated weights for policy 0, policy_version 671851 (0.0027) [2024-06-24 13:43:35,478][15349] Signal inference workers to stop experience collection... (163000 times) [2024-06-24 13:43:35,478][15349] Signal inference workers to resume experience collection... (163000 times) [2024-06-24 13:43:35,495][15401] InferenceWorker_p0-w0: stopping experience collection (163000 times) [2024-06-24 13:43:35,495][15401] InferenceWorker_p0-w0: resuming experience collection (163000 times) [2024-06-24 13:43:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42488.2). Total num frames: 11007754240. Throughput: 0: 42797.7. Samples: 11007838600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 13:43:38,818][15401] Updated weights for policy 0, policy_version 671861 (0.0044) [2024-06-24 13:43:42,280][15401] Updated weights for policy 0, policy_version 671871 (0.0041) [2024-06-24 13:43:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11008000000. Throughput: 0: 42982.6. Samples: 11008102860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:43,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 13:43:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671876_11008016384.pth... [2024-06-24 13:43:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671250_10997760000.pth [2024-06-24 13:43:46,385][15401] Updated weights for policy 0, policy_version 671881 (0.0034) [2024-06-24 13:43:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 11008196608. Throughput: 0: 43115.6. Samples: 11008360600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 13:43:49,894][15401] Updated weights for policy 0, policy_version 671891 (0.0032) [2024-06-24 13:43:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 11008409600. Throughput: 0: 42933.8. Samples: 11008481820. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-24 13:43:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 13:43:53,870][15401] Updated weights for policy 0, policy_version 671901 (0.0040) [2024-06-24 13:43:57,808][15401] Updated weights for policy 0, policy_version 671911 (0.0036) [2024-06-24 13:43:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11008638976. Throughput: 0: 42924.8. Samples: 11008740740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:43:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 13:44:01,757][15401] Updated weights for policy 0, policy_version 671921 (0.0031) [2024-06-24 13:44:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11008819200. Throughput: 0: 42966.2. Samples: 11009002280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 13:44:05,349][15401] Updated weights for policy 0, policy_version 671931 (0.0044) [2024-06-24 13:44:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11009048576. Throughput: 0: 42947.0. Samples: 11009126600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 13:44:09,293][15401] Updated weights for policy 0, policy_version 671941 (0.0041) [2024-06-24 13:44:13,032][15401] Updated weights for policy 0, policy_version 671951 (0.0047) [2024-06-24 13:44:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 11009261568. Throughput: 0: 42806.5. Samples: 11009386300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:13,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 13:44:16,820][15401] Updated weights for policy 0, policy_version 671961 (0.0031) [2024-06-24 13:44:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11009474560. Throughput: 0: 42773.8. Samples: 11009641620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 13:44:20,815][15401] Updated weights for policy 0, policy_version 671971 (0.0036) [2024-06-24 13:44:23,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 11009687552. Throughput: 0: 42857.9. Samples: 11009767200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:23,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 13:44:24,377][15401] Updated weights for policy 0, policy_version 671981 (0.0033) [2024-06-24 13:44:28,342][15401] Updated weights for policy 0, policy_version 671991 (0.0047) [2024-06-24 13:44:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11009900544. Throughput: 0: 42700.8. Samples: 11010024400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 13:44:32,065][15401] Updated weights for policy 0, policy_version 672001 (0.0032) [2024-06-24 13:44:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11010113536. Throughput: 0: 42888.6. Samples: 11010290580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 13:44:35,728][15401] Updated weights for policy 0, policy_version 672011 (0.0032) [2024-06-24 13:44:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11010326528. Throughput: 0: 42882.7. Samples: 11010411540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 13:44:39,637][15401] Updated weights for policy 0, policy_version 672021 (0.0038) [2024-06-24 13:44:43,362][15401] Updated weights for policy 0, policy_version 672031 (0.0039) [2024-06-24 13:44:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 11010555904. Throughput: 0: 42892.4. Samples: 11010671000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:43,393][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 13:44:47,225][15401] Updated weights for policy 0, policy_version 672041 (0.0047) [2024-06-24 13:44:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11010752512. Throughput: 0: 42823.1. Samples: 11010929320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 13:44:50,815][15401] Updated weights for policy 0, policy_version 672051 (0.0036) [2024-06-24 13:44:53,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11010965504. Throughput: 0: 42847.5. Samples: 11011054740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 13:44:53,414][15349] Signal inference workers to stop experience collection... (163050 times) [2024-06-24 13:44:53,454][15401] InferenceWorker_p0-w0: stopping experience collection (163050 times) [2024-06-24 13:44:53,484][15349] Signal inference workers to resume experience collection... (163050 times) [2024-06-24 13:44:53,485][15401] InferenceWorker_p0-w0: resuming experience collection (163050 times) [2024-06-24 13:44:54,822][15401] Updated weights for policy 0, policy_version 672061 (0.0038) [2024-06-24 13:44:58,277][15401] Updated weights for policy 0, policy_version 672071 (0.0029) [2024-06-24 13:44:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11011211264. Throughput: 0: 42894.2. Samples: 11011316440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:44:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 13:45:02,400][15401] Updated weights for policy 0, policy_version 672081 (0.0028) [2024-06-24 13:45:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11011391488. Throughput: 0: 43033.3. Samples: 11011578120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 13:45:05,814][15401] Updated weights for policy 0, policy_version 672091 (0.0046) [2024-06-24 13:45:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 11011620864. Throughput: 0: 42920.9. Samples: 11011698640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 13:45:10,013][15401] Updated weights for policy 0, policy_version 672101 (0.0033) [2024-06-24 13:45:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 11011850240. Throughput: 0: 43012.6. Samples: 11011959960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 13:45:13,721][15401] Updated weights for policy 0, policy_version 672111 (0.0031) [2024-06-24 13:45:17,887][15401] Updated weights for policy 0, policy_version 672121 (0.0030) [2024-06-24 13:45:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11012030464. Throughput: 0: 42797.3. Samples: 11012216460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 13:45:21,206][15401] Updated weights for policy 0, policy_version 672131 (0.0035) [2024-06-24 13:45:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.8, 300 sec: 42764.9). Total num frames: 11012276224. Throughput: 0: 42835.5. Samples: 11012339240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:23,393][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 13:45:25,691][15401] Updated weights for policy 0, policy_version 672141 (0.0035) [2024-06-24 13:45:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11012489216. Throughput: 0: 42892.1. Samples: 11012601040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:45:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 13:45:28,777][15401] Updated weights for policy 0, policy_version 672151 (0.0046) [2024-06-24 13:45:33,195][15401] Updated weights for policy 0, policy_version 672161 (0.0029) [2024-06-24 13:45:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11012685824. Throughput: 0: 42881.3. Samples: 11012858980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:33,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 13:45:36,307][15401] Updated weights for policy 0, policy_version 672171 (0.0031) [2024-06-24 13:45:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11012915200. Throughput: 0: 42783.1. Samples: 11012979980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 13:45:40,950][15401] Updated weights for policy 0, policy_version 672181 (0.0030) [2024-06-24 13:45:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11013111808. Throughput: 0: 42813.4. Samples: 11013243040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 13:45:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672188_11013128192.pth... [2024-06-24 13:45:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671560_11002839040.pth [2024-06-24 13:45:44,076][15401] Updated weights for policy 0, policy_version 672191 (0.0033) [2024-06-24 13:45:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11013308416. Throughput: 0: 42619.2. Samples: 11013495980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 13:45:48,676][15401] Updated weights for policy 0, policy_version 672201 (0.0037) [2024-06-24 13:45:51,714][15401] Updated weights for policy 0, policy_version 672211 (0.0030) [2024-06-24 13:45:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11013537792. Throughput: 0: 42652.4. Samples: 11013618000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 13:45:56,733][15401] Updated weights for policy 0, policy_version 672221 (0.0037) [2024-06-24 13:45:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11013734400. Throughput: 0: 42610.1. Samples: 11013877420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:45:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 13:45:59,307][15401] Updated weights for policy 0, policy_version 672231 (0.0038) [2024-06-24 13:46:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11013963776. Throughput: 0: 42499.1. Samples: 11014128920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:03,394][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 13:46:04,399][15401] Updated weights for policy 0, policy_version 672241 (0.0042) [2024-06-24 13:46:07,039][15401] Updated weights for policy 0, policy_version 672251 (0.0027) [2024-06-24 13:46:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11014176768. Throughput: 0: 42703.2. Samples: 11014260780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 13:46:11,997][15401] Updated weights for policy 0, policy_version 672261 (0.0036) [2024-06-24 13:46:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11014373376. Throughput: 0: 42474.7. Samples: 11014512400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 13:46:15,118][15401] Updated weights for policy 0, policy_version 672271 (0.0030) [2024-06-24 13:46:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 11014602752. Throughput: 0: 42358.0. Samples: 11014765100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 13:46:19,555][15401] Updated weights for policy 0, policy_version 672281 (0.0040) [2024-06-24 13:46:22,720][15401] Updated weights for policy 0, policy_version 672291 (0.0032) [2024-06-24 13:46:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11014832128. Throughput: 0: 42564.5. Samples: 11014895380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 13:46:27,255][15401] Updated weights for policy 0, policy_version 672301 (0.0035) [2024-06-24 13:46:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 11015012352. Throughput: 0: 42383.6. Samples: 11015150300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 13:46:29,307][15349] Signal inference workers to stop experience collection... (163100 times) [2024-06-24 13:46:29,310][15349] Signal inference workers to resume experience collection... (163100 times) [2024-06-24 13:46:29,325][15401] InferenceWorker_p0-w0: stopping experience collection (163100 times) [2024-06-24 13:46:29,325][15401] InferenceWorker_p0-w0: resuming experience collection (163100 times) [2024-06-24 13:46:30,482][15401] Updated weights for policy 0, policy_version 672311 (0.0039) [2024-06-24 13:46:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11015225344. Throughput: 0: 42477.8. Samples: 11015407480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 13:46:35,422][15401] Updated weights for policy 0, policy_version 672321 (0.0040) [2024-06-24 13:46:38,045][15401] Updated weights for policy 0, policy_version 672331 (0.0034) [2024-06-24 13:46:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11015471104. Throughput: 0: 42697.7. Samples: 11015539400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:38,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 13:46:43,067][15401] Updated weights for policy 0, policy_version 672341 (0.0038) [2024-06-24 13:46:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11015634944. Throughput: 0: 42458.7. Samples: 11015788060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 13:46:45,963][15401] Updated weights for policy 0, policy_version 672351 (0.0025) [2024-06-24 13:46:48,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11015880704. Throughput: 0: 42312.1. Samples: 11016032960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 13:46:51,030][15401] Updated weights for policy 0, policy_version 672361 (0.0038) [2024-06-24 13:46:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11016077312. Throughput: 0: 42406.3. Samples: 11016169060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:53,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 13:46:54,151][15401] Updated weights for policy 0, policy_version 672371 (0.0049) [2024-06-24 13:46:58,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 11016273920. Throughput: 0: 42394.6. Samples: 11016420260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:46:58,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 13:46:58,726][15401] Updated weights for policy 0, policy_version 672381 (0.0029) [2024-06-24 13:47:01,710][15401] Updated weights for policy 0, policy_version 672391 (0.0035) [2024-06-24 13:47:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11016519680. Throughput: 0: 42225.5. Samples: 11016665240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:47:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 13:47:06,262][15401] Updated weights for policy 0, policy_version 672401 (0.0032) [2024-06-24 13:47:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42052.2, 300 sec: 42654.3). Total num frames: 11016699904. Throughput: 0: 42361.3. Samples: 11016801640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 13:47:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 13:47:09,648][15401] Updated weights for policy 0, policy_version 672411 (0.0044) [2024-06-24 13:47:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 11016912896. Throughput: 0: 42291.4. Samples: 11017053420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 13:47:14,001][15401] Updated weights for policy 0, policy_version 672421 (0.0033) [2024-06-24 13:47:17,180][15401] Updated weights for policy 0, policy_version 672431 (0.0045) [2024-06-24 13:47:18,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11017175040. Throughput: 0: 42019.4. Samples: 11017298360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 13:47:21,798][15401] Updated weights for policy 0, policy_version 672441 (0.0024) [2024-06-24 13:47:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11017355264. Throughput: 0: 42177.0. Samples: 11017437360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 13:47:24,782][15401] Updated weights for policy 0, policy_version 672451 (0.0028) [2024-06-24 13:47:28,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11017551872. Throughput: 0: 42289.7. Samples: 11017691100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 13:47:29,392][15401] Updated weights for policy 0, policy_version 672461 (0.0031) [2024-06-24 13:47:32,427][15401] Updated weights for policy 0, policy_version 672471 (0.0029) [2024-06-24 13:47:33,392][15132] Fps is (10 sec: 45865.5, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 11017814016. Throughput: 0: 42395.2. Samples: 11017940840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:33,392][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 13:47:36,917][15401] Updated weights for policy 0, policy_version 672481 (0.0038) [2024-06-24 13:47:38,391][15132] Fps is (10 sec: 44228.8, 60 sec: 42051.1, 300 sec: 42653.7). Total num frames: 11017994240. Throughput: 0: 42437.7. Samples: 11018078840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:38,392][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 13:47:40,083][15401] Updated weights for policy 0, policy_version 672491 (0.0032) [2024-06-24 13:47:43,390][15132] Fps is (10 sec: 37690.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11018190848. Throughput: 0: 42401.7. Samples: 11018328240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 13:47:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672497_11018190848.pth... [2024-06-24 13:47:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000671876_11008016384.pth [2024-06-24 13:47:44,852][15401] Updated weights for policy 0, policy_version 672501 (0.0050) [2024-06-24 13:47:47,777][15401] Updated weights for policy 0, policy_version 672511 (0.0036) [2024-06-24 13:47:48,390][15132] Fps is (10 sec: 45883.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11018452992. Throughput: 0: 42563.0. Samples: 11018580580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 13:47:52,279][15401] Updated weights for policy 0, policy_version 672521 (0.0041) [2024-06-24 13:47:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11018633216. Throughput: 0: 42558.6. Samples: 11018716780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 13:47:54,731][15349] Signal inference workers to stop experience collection... (163150 times) [2024-06-24 13:47:54,760][15401] InferenceWorker_p0-w0: stopping experience collection (163150 times) [2024-06-24 13:47:54,792][15349] Signal inference workers to resume experience collection... (163150 times) [2024-06-24 13:47:54,792][15401] InferenceWorker_p0-w0: resuming experience collection (163150 times) [2024-06-24 13:47:55,590][15401] Updated weights for policy 0, policy_version 672531 (0.0042) [2024-06-24 13:47:58,390][15132] Fps is (10 sec: 36044.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 11018813440. Throughput: 0: 42421.3. Samples: 11018962380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:47:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 13:47:59,923][15401] Updated weights for policy 0, policy_version 672541 (0.0025) [2024-06-24 13:48:03,249][15401] Updated weights for policy 0, policy_version 672551 (0.0040) [2024-06-24 13:48:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11019075584. Throughput: 0: 42525.4. Samples: 11019212000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 13:48:07,454][15401] Updated weights for policy 0, policy_version 672561 (0.0037) [2024-06-24 13:48:08,391][15132] Fps is (10 sec: 45869.1, 60 sec: 42870.5, 300 sec: 42598.2). Total num frames: 11019272192. Throughput: 0: 42450.7. Samples: 11019347700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:08,391][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 13:48:11,189][15401] Updated weights for policy 0, policy_version 672571 (0.0037) [2024-06-24 13:48:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11019468800. Throughput: 0: 42306.3. Samples: 11019594880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 13:48:15,180][15401] Updated weights for policy 0, policy_version 672581 (0.0027) [2024-06-24 13:48:18,389][15132] Fps is (10 sec: 42604.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11019698176. Throughput: 0: 42641.6. Samples: 11019859620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:18,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 13:48:18,812][15401] Updated weights for policy 0, policy_version 672591 (0.0053) [2024-06-24 13:48:23,014][15401] Updated weights for policy 0, policy_version 672601 (0.0040) [2024-06-24 13:48:23,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 11019911168. Throughput: 0: 42383.9. Samples: 11019986140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:23,404][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 13:48:26,431][15401] Updated weights for policy 0, policy_version 672611 (0.0035) [2024-06-24 13:48:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 11020124160. Throughput: 0: 42393.3. Samples: 11020236040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:28,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 13:48:30,505][15401] Updated weights for policy 0, policy_version 672621 (0.0027) [2024-06-24 13:48:33,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.1, 300 sec: 42653.6). Total num frames: 11020337152. Throughput: 0: 42683.2. Samples: 11020501420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:33,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 13:48:34,034][15401] Updated weights for policy 0, policy_version 672631 (0.0031) [2024-06-24 13:48:38,181][15401] Updated weights for policy 0, policy_version 672641 (0.0037) [2024-06-24 13:48:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42599.7, 300 sec: 42542.8). Total num frames: 11020550144. Throughput: 0: 42550.3. Samples: 11020631540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 13:48:41,745][15401] Updated weights for policy 0, policy_version 672651 (0.0042) [2024-06-24 13:48:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11020763136. Throughput: 0: 42608.9. Samples: 11020879780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 13:48:45,720][15401] Updated weights for policy 0, policy_version 672661 (0.0038) [2024-06-24 13:48:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 11020959744. Throughput: 0: 42827.9. Samples: 11021139260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-24 13:48:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 13:48:49,590][15401] Updated weights for policy 0, policy_version 672671 (0.0031) [2024-06-24 13:48:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11021189120. Throughput: 0: 42578.3. Samples: 11021263660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:48:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 13:48:53,721][15401] Updated weights for policy 0, policy_version 672681 (0.0030) [2024-06-24 13:48:56,011][15349] Signal inference workers to stop experience collection... (163200 times) [2024-06-24 13:48:56,011][15349] Signal inference workers to resume experience collection... (163200 times) [2024-06-24 13:48:56,051][15401] InferenceWorker_p0-w0: stopping experience collection (163200 times) [2024-06-24 13:48:56,052][15401] InferenceWorker_p0-w0: resuming experience collection (163200 times) [2024-06-24 13:48:57,080][15401] Updated weights for policy 0, policy_version 672691 (0.0035) [2024-06-24 13:48:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11021402112. Throughput: 0: 42745.3. Samples: 11021518420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:48:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 13:49:01,394][15401] Updated weights for policy 0, policy_version 672701 (0.0032) [2024-06-24 13:49:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11021615104. Throughput: 0: 42756.0. Samples: 11021783640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 13:49:04,699][15401] Updated weights for policy 0, policy_version 672711 (0.0027) [2024-06-24 13:49:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42599.4, 300 sec: 42598.7). Total num frames: 11021828096. Throughput: 0: 42745.4. Samples: 11021909580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 13:49:08,959][15401] Updated weights for policy 0, policy_version 672721 (0.0032) [2024-06-24 13:49:12,335][15401] Updated weights for policy 0, policy_version 672731 (0.0040) [2024-06-24 13:49:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11022057472. Throughput: 0: 42915.2. Samples: 11022167120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 13:49:16,793][15401] Updated weights for policy 0, policy_version 672741 (0.0038) [2024-06-24 13:49:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11022254080. Throughput: 0: 42789.8. Samples: 11022426860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 13:49:19,880][15401] Updated weights for policy 0, policy_version 672751 (0.0041) [2024-06-24 13:49:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11022467072. Throughput: 0: 42707.6. Samples: 11022553380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 13:49:24,349][15401] Updated weights for policy 0, policy_version 672761 (0.0035) [2024-06-24 13:49:28,188][15401] Updated weights for policy 0, policy_version 672771 (0.0043) [2024-06-24 13:49:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 11022696448. Throughput: 0: 42887.6. Samples: 11022809720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 13:49:32,068][15401] Updated weights for policy 0, policy_version 672781 (0.0041) [2024-06-24 13:49:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 11022909440. Throughput: 0: 42633.5. Samples: 11023057760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 13:49:35,721][15401] Updated weights for policy 0, policy_version 672791 (0.0027) [2024-06-24 13:49:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 11023106048. Throughput: 0: 42766.4. Samples: 11023188160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:38,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 13:49:39,499][15401] Updated weights for policy 0, policy_version 672801 (0.0034) [2024-06-24 13:49:43,274][15401] Updated weights for policy 0, policy_version 672811 (0.0038) [2024-06-24 13:49:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11023335424. Throughput: 0: 43030.6. Samples: 11023454800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 13:49:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672812_11023351808.pth... [2024-06-24 13:49:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672188_11013128192.pth [2024-06-24 13:49:47,123][15401] Updated weights for policy 0, policy_version 672821 (0.0029) [2024-06-24 13:49:48,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 11023564800. Throughput: 0: 42735.1. Samples: 11023706720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 13:49:50,765][15401] Updated weights for policy 0, policy_version 672831 (0.0032) [2024-06-24 13:49:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 11023761408. Throughput: 0: 43012.9. Samples: 11023845160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 13:49:54,633][15401] Updated weights for policy 0, policy_version 672841 (0.0047) [2024-06-24 13:49:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11023974400. Throughput: 0: 43054.2. Samples: 11024104560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:49:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 13:49:58,704][15401] Updated weights for policy 0, policy_version 672851 (0.0027) [2024-06-24 13:50:02,091][15401] Updated weights for policy 0, policy_version 672861 (0.0027) [2024-06-24 13:50:03,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43415.8, 300 sec: 42709.1). Total num frames: 11024220160. Throughput: 0: 43012.0. Samples: 11024362500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:50:03,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 13:50:06,213][15401] Updated weights for policy 0, policy_version 672871 (0.0034) [2024-06-24 13:50:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11024400384. Throughput: 0: 43197.5. Samples: 11024497260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:50:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 13:50:09,847][15401] Updated weights for policy 0, policy_version 672881 (0.0039) [2024-06-24 13:50:13,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11024613376. Throughput: 0: 43085.4. Samples: 11024748560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:50:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 13:50:13,857][15401] Updated weights for policy 0, policy_version 672891 (0.0041) [2024-06-24 13:50:17,350][15401] Updated weights for policy 0, policy_version 672901 (0.0031) [2024-06-24 13:50:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.7, 300 sec: 42654.3). Total num frames: 11024859136. Throughput: 0: 43264.8. Samples: 11025004680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:50:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 13:50:21,578][15401] Updated weights for policy 0, policy_version 672911 (0.0045) [2024-06-24 13:50:22,276][15349] Signal inference workers to stop experience collection... (163250 times) [2024-06-24 13:50:22,281][15349] Signal inference workers to resume experience collection... (163250 times) [2024-06-24 13:50:22,316][15401] InferenceWorker_p0-w0: stopping experience collection (163250 times) [2024-06-24 13:50:22,316][15401] InferenceWorker_p0-w0: resuming experience collection (163250 times) [2024-06-24 13:50:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11025055744. Throughput: 0: 43361.9. Samples: 11025139440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 13:50:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 13:50:24,758][15401] Updated weights for policy 0, policy_version 672921 (0.0039) [2024-06-24 13:50:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11025268736. Throughput: 0: 43064.1. Samples: 11025392680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 13:50:29,285][15401] Updated weights for policy 0, policy_version 672931 (0.0023) [2024-06-24 13:50:32,382][15401] Updated weights for policy 0, policy_version 672941 (0.0032) [2024-06-24 13:50:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 11025498112. Throughput: 0: 43244.1. Samples: 11025652700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:33,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 13:50:36,774][15401] Updated weights for policy 0, policy_version 672951 (0.0045) [2024-06-24 13:50:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 11025711104. Throughput: 0: 43060.4. Samples: 11025782880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 13:50:40,302][15401] Updated weights for policy 0, policy_version 672961 (0.0031) [2024-06-24 13:50:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11025907712. Throughput: 0: 42942.8. Samples: 11026036980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 13:50:44,553][15401] Updated weights for policy 0, policy_version 672971 (0.0044) [2024-06-24 13:50:47,835][15401] Updated weights for policy 0, policy_version 672981 (0.0029) [2024-06-24 13:50:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11026137088. Throughput: 0: 42906.7. Samples: 11026293200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 13:50:51,995][15401] Updated weights for policy 0, policy_version 672991 (0.0042) [2024-06-24 13:50:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11026333696. Throughput: 0: 42880.8. Samples: 11026426900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 13:50:55,364][15401] Updated weights for policy 0, policy_version 673001 (0.0034) [2024-06-24 13:50:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11026563072. Throughput: 0: 42880.4. Samples: 11026678180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:50:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 13:50:59,605][15401] Updated weights for policy 0, policy_version 673011 (0.0034) [2024-06-24 13:51:03,037][15401] Updated weights for policy 0, policy_version 673021 (0.0029) [2024-06-24 13:51:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 11026776064. Throughput: 0: 42900.1. Samples: 11026935180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 13:51:07,147][15401] Updated weights for policy 0, policy_version 673031 (0.0032) [2024-06-24 13:51:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11026972672. Throughput: 0: 42781.8. Samples: 11027064720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:08,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 13:51:11,263][15401] Updated weights for policy 0, policy_version 673041 (0.0037) [2024-06-24 13:51:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11027218432. Throughput: 0: 42731.5. Samples: 11027315600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 13:51:15,201][15401] Updated weights for policy 0, policy_version 673051 (0.0037) [2024-06-24 13:51:18,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 11027382272. Throughput: 0: 42828.9. Samples: 11027580000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 13:51:18,823][15401] Updated weights for policy 0, policy_version 673061 (0.0032) [2024-06-24 13:51:22,673][15401] Updated weights for policy 0, policy_version 673071 (0.0038) [2024-06-24 13:51:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11027628032. Throughput: 0: 42719.6. Samples: 11027705260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 13:51:26,517][15401] Updated weights for policy 0, policy_version 673081 (0.0043) [2024-06-24 13:51:28,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11027841024. Throughput: 0: 42734.6. Samples: 11027960040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 13:51:30,099][15401] Updated weights for policy 0, policy_version 673091 (0.0043) [2024-06-24 13:51:33,394][15132] Fps is (10 sec: 44217.1, 60 sec: 42868.2, 300 sec: 42708.9). Total num frames: 11028070400. Throughput: 0: 42873.1. Samples: 11028222680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:33,394][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 13:51:33,991][15401] Updated weights for policy 0, policy_version 673101 (0.0034) [2024-06-24 13:51:37,569][15401] Updated weights for policy 0, policy_version 673111 (0.0040) [2024-06-24 13:51:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11028250624. Throughput: 0: 42761.8. Samples: 11028351180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 13:51:41,582][15401] Updated weights for policy 0, policy_version 673121 (0.0036) [2024-06-24 13:51:43,389][15132] Fps is (10 sec: 39339.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11028463616. Throughput: 0: 42935.6. Samples: 11028610280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 13:51:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673126_11028496384.pth... [2024-06-24 13:51:43,595][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672497_11018190848.pth [2024-06-24 13:51:45,081][15401] Updated weights for policy 0, policy_version 673131 (0.0028) [2024-06-24 13:51:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11028709376. Throughput: 0: 42789.8. Samples: 11028860720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 13:51:49,690][15401] Updated weights for policy 0, policy_version 673141 (0.0033) [2024-06-24 13:51:50,780][15349] Signal inference workers to stop experience collection... (163300 times) [2024-06-24 13:51:50,780][15349] Signal inference workers to resume experience collection... (163300 times) [2024-06-24 13:51:50,825][15401] InferenceWorker_p0-w0: stopping experience collection (163300 times) [2024-06-24 13:51:50,825][15401] InferenceWorker_p0-w0: resuming experience collection (163300 times) [2024-06-24 13:51:53,231][15401] Updated weights for policy 0, policy_version 673151 (0.0045) [2024-06-24 13:51:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 11028905984. Throughput: 0: 42837.3. Samples: 11028992300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 13:51:57,328][15401] Updated weights for policy 0, policy_version 673161 (0.0033) [2024-06-24 13:51:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11029118976. Throughput: 0: 42969.4. Samples: 11029249220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:51:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 13:52:01,001][15401] Updated weights for policy 0, policy_version 673171 (0.0040) [2024-06-24 13:52:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11029348352. Throughput: 0: 42754.5. Samples: 11029503960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 13:52:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 13:52:04,761][15401] Updated weights for policy 0, policy_version 673181 (0.0041) [2024-06-24 13:52:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42871.4, 300 sec: 42820.2). Total num frames: 11029544960. Throughput: 0: 42883.9. Samples: 11029635140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:08,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 13:52:08,431][15401] Updated weights for policy 0, policy_version 673191 (0.0037) [2024-06-24 13:52:12,248][15401] Updated weights for policy 0, policy_version 673201 (0.0034) [2024-06-24 13:52:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11029757952. Throughput: 0: 43035.1. Samples: 11029896620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:52:15,928][15401] Updated weights for policy 0, policy_version 673211 (0.0033) [2024-06-24 13:52:18,389][15132] Fps is (10 sec: 44248.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11029987328. Throughput: 0: 42798.6. Samples: 11030148420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 13:52:19,771][15401] Updated weights for policy 0, policy_version 673221 (0.0039) [2024-06-24 13:52:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11030200320. Throughput: 0: 42893.3. Samples: 11030281380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 13:52:23,486][15401] Updated weights for policy 0, policy_version 673231 (0.0033) [2024-06-24 13:52:27,442][15401] Updated weights for policy 0, policy_version 673241 (0.0030) [2024-06-24 13:52:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 11030396928. Throughput: 0: 42609.6. Samples: 11030527720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 13:52:31,618][15401] Updated weights for policy 0, policy_version 673251 (0.0023) [2024-06-24 13:52:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42601.6, 300 sec: 42820.8). Total num frames: 11030626304. Throughput: 0: 42684.4. Samples: 11030781520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 13:52:35,263][15401] Updated weights for policy 0, policy_version 673261 (0.0036) [2024-06-24 13:52:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11030822912. Throughput: 0: 42831.6. Samples: 11030919820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:38,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 13:52:39,130][15401] Updated weights for policy 0, policy_version 673271 (0.0023) [2024-06-24 13:52:42,947][15401] Updated weights for policy 0, policy_version 673281 (0.0045) [2024-06-24 13:52:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 11031052288. Throughput: 0: 42731.9. Samples: 11031172260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:43,393][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 13:52:46,839][15401] Updated weights for policy 0, policy_version 673291 (0.0039) [2024-06-24 13:52:48,389][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11031281664. Throughput: 0: 42691.1. Samples: 11031425060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 13:52:50,551][15401] Updated weights for policy 0, policy_version 673301 (0.0029) [2024-06-24 13:52:53,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11031461888. Throughput: 0: 42788.9. Samples: 11031560540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 13:52:54,320][15401] Updated weights for policy 0, policy_version 673311 (0.0041) [2024-06-24 13:52:58,062][15401] Updated weights for policy 0, policy_version 673321 (0.0028) [2024-06-24 13:52:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11031691264. Throughput: 0: 42546.4. Samples: 11031811200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:52:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 13:53:01,859][15401] Updated weights for policy 0, policy_version 673331 (0.0029) [2024-06-24 13:53:03,395][15132] Fps is (10 sec: 45851.6, 60 sec: 42867.8, 300 sec: 42875.5). Total num frames: 11031920640. Throughput: 0: 42627.4. Samples: 11032066880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:03,395][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 13:53:05,750][15401] Updated weights for policy 0, policy_version 673341 (0.0043) [2024-06-24 13:53:08,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 11032084480. Throughput: 0: 42469.7. Samples: 11032192520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:08,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 13:53:09,670][15401] Updated weights for policy 0, policy_version 673351 (0.0024) [2024-06-24 13:53:10,910][15349] Signal inference workers to stop experience collection... (163350 times) [2024-06-24 13:53:10,912][15349] Signal inference workers to resume experience collection... (163350 times) [2024-06-24 13:53:10,942][15401] InferenceWorker_p0-w0: stopping experience collection (163350 times) [2024-06-24 13:53:10,943][15401] InferenceWorker_p0-w0: resuming experience collection (163350 times) [2024-06-24 13:53:13,390][15132] Fps is (10 sec: 40981.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11032330240. Throughput: 0: 42624.0. Samples: 11032445800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 13:53:13,685][15401] Updated weights for policy 0, policy_version 673361 (0.0035) [2024-06-24 13:53:17,353][15401] Updated weights for policy 0, policy_version 673371 (0.0031) [2024-06-24 13:53:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 11032543232. Throughput: 0: 42760.4. Samples: 11032705740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 13:53:21,390][15401] Updated weights for policy 0, policy_version 673381 (0.0031) [2024-06-24 13:53:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 11032739840. Throughput: 0: 42451.6. Samples: 11032830040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:23,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 13:53:24,939][15401] Updated weights for policy 0, policy_version 673391 (0.0026) [2024-06-24 13:53:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11032969216. Throughput: 0: 42552.0. Samples: 11033087000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 13:53:28,996][15401] Updated weights for policy 0, policy_version 673401 (0.0051) [2024-06-24 13:53:32,558][15401] Updated weights for policy 0, policy_version 673411 (0.0040) [2024-06-24 13:53:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11033182208. Throughput: 0: 42685.8. Samples: 11033345920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 13:53:36,920][15401] Updated weights for policy 0, policy_version 673421 (0.0022) [2024-06-24 13:53:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 11033395200. Throughput: 0: 42468.9. Samples: 11033471640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 13:53:40,751][15401] Updated weights for policy 0, policy_version 673431 (0.0037) [2024-06-24 13:53:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 11033608192. Throughput: 0: 42675.5. Samples: 11033731600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 13:53:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 13:53:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673439_11033624576.pth... [2024-06-24 13:53:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000672812_11023351808.pth [2024-06-24 13:53:44,292][15401] Updated weights for policy 0, policy_version 673441 (0.0036) [2024-06-24 13:53:48,290][15401] Updated weights for policy 0, policy_version 673451 (0.0044) [2024-06-24 13:53:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11033821184. Throughput: 0: 42822.3. Samples: 11033993660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:53:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 13:53:51,806][15401] Updated weights for policy 0, policy_version 673461 (0.0033) [2024-06-24 13:53:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11034050560. Throughput: 0: 42823.1. Samples: 11034119560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:53:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 13:53:55,816][15401] Updated weights for policy 0, policy_version 673471 (0.0038) [2024-06-24 13:53:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 11034263552. Throughput: 0: 42925.7. Samples: 11034377460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:53:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 13:53:59,727][15401] Updated weights for policy 0, policy_version 673481 (0.0042) [2024-06-24 13:54:03,357][15401] Updated weights for policy 0, policy_version 673491 (0.0024) [2024-06-24 13:54:03,396][15132] Fps is (10 sec: 42571.7, 60 sec: 42597.5, 300 sec: 42875.2). Total num frames: 11034476544. Throughput: 0: 42853.4. Samples: 11034634420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:03,396][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 13:54:07,425][15401] Updated weights for policy 0, policy_version 673501 (0.0029) [2024-06-24 13:54:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 11034705920. Throughput: 0: 42980.4. Samples: 11034764160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 13:54:10,835][15401] Updated weights for policy 0, policy_version 673511 (0.0030) [2024-06-24 13:54:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11034902528. Throughput: 0: 42942.2. Samples: 11035019400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 13:54:14,919][15401] Updated weights for policy 0, policy_version 673521 (0.0031) [2024-06-24 13:54:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11035115520. Throughput: 0: 43019.5. Samples: 11035281800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 13:54:18,655][15401] Updated weights for policy 0, policy_version 673531 (0.0044) [2024-06-24 13:54:22,606][15401] Updated weights for policy 0, policy_version 673541 (0.0048) [2024-06-24 13:54:23,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 11035328512. Throughput: 0: 42985.9. Samples: 11035406280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:23,397][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 13:54:26,423][15401] Updated weights for policy 0, policy_version 673551 (0.0029) [2024-06-24 13:54:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11035541504. Throughput: 0: 42861.8. Samples: 11035660380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 13:54:30,198][15401] Updated weights for policy 0, policy_version 673561 (0.0045) [2024-06-24 13:54:33,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11035754496. Throughput: 0: 42795.1. Samples: 11035919440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 13:54:34,118][15401] Updated weights for policy 0, policy_version 673571 (0.0042) [2024-06-24 13:54:37,834][15401] Updated weights for policy 0, policy_version 673581 (0.0036) [2024-06-24 13:54:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11035983872. Throughput: 0: 42830.8. Samples: 11036046940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 13:54:40,305][15349] Signal inference workers to stop experience collection... (163400 times) [2024-06-24 13:54:40,309][15349] Signal inference workers to resume experience collection... (163400 times) [2024-06-24 13:54:40,352][15401] InferenceWorker_p0-w0: stopping experience collection (163400 times) [2024-06-24 13:54:40,352][15401] InferenceWorker_p0-w0: resuming experience collection (163400 times) [2024-06-24 13:54:41,969][15401] Updated weights for policy 0, policy_version 673591 (0.0042) [2024-06-24 13:54:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11036196864. Throughput: 0: 42767.6. Samples: 11036302000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:43,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-24 13:54:45,903][15401] Updated weights for policy 0, policy_version 673601 (0.0044) [2024-06-24 13:54:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11036393472. Throughput: 0: 42723.8. Samples: 11036556720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 13:54:49,610][15401] Updated weights for policy 0, policy_version 673611 (0.0037) [2024-06-24 13:54:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11036590080. Throughput: 0: 42681.8. Samples: 11036684840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:53,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 13:54:53,450][15401] Updated weights for policy 0, policy_version 673621 (0.0040) [2024-06-24 13:54:57,378][15401] Updated weights for policy 0, policy_version 673631 (0.0033) [2024-06-24 13:54:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 11036835840. Throughput: 0: 42875.6. Samples: 11036948800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:54:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 13:55:01,066][15401] Updated weights for policy 0, policy_version 673641 (0.0033) [2024-06-24 13:55:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 11037048832. Throughput: 0: 42584.8. Samples: 11037198120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:55:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 13:55:04,975][15401] Updated weights for policy 0, policy_version 673651 (0.0033) [2024-06-24 13:55:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11037229056. Throughput: 0: 42603.8. Samples: 11037323180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:55:08,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 13:55:08,751][15401] Updated weights for policy 0, policy_version 673661 (0.0041) [2024-06-24 13:55:12,895][15401] Updated weights for policy 0, policy_version 673671 (0.0029) [2024-06-24 13:55:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11037442048. Throughput: 0: 42728.9. Samples: 11037583180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:55:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 13:55:16,639][15401] Updated weights for policy 0, policy_version 673681 (0.0037) [2024-06-24 13:55:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11037671424. Throughput: 0: 42384.9. Samples: 11037826760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:55:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 13:55:20,590][15401] Updated weights for policy 0, policy_version 673691 (0.0029) [2024-06-24 13:55:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 11037868032. Throughput: 0: 42428.1. Samples: 11037956200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 13:55:23,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 13:55:24,571][15401] Updated weights for policy 0, policy_version 673701 (0.0044) [2024-06-24 13:55:28,251][15401] Updated weights for policy 0, policy_version 673711 (0.0034) [2024-06-24 13:55:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11038081024. Throughput: 0: 42478.7. Samples: 11038213540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 13:55:32,064][15401] Updated weights for policy 0, policy_version 673721 (0.0038) [2024-06-24 13:55:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 11038310400. Throughput: 0: 42495.9. Samples: 11038469040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 13:55:35,961][15401] Updated weights for policy 0, policy_version 673731 (0.0029) [2024-06-24 13:55:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11038523392. Throughput: 0: 42478.7. Samples: 11038596380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:38,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 13:55:39,539][15401] Updated weights for policy 0, policy_version 673741 (0.0028) [2024-06-24 13:55:43,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 11038720000. Throughput: 0: 42273.4. Samples: 11038851100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 13:55:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673751_11038736384.pth... [2024-06-24 13:55:43,543][15401] Updated weights for policy 0, policy_version 673751 (0.0046) [2024-06-24 13:55:43,598][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673126_11028496384.pth [2024-06-24 13:55:47,305][15401] Updated weights for policy 0, policy_version 673761 (0.0042) [2024-06-24 13:55:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11038949376. Throughput: 0: 42352.1. Samples: 11039103960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 13:55:51,150][15401] Updated weights for policy 0, policy_version 673771 (0.0040) [2024-06-24 13:55:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11039145984. Throughput: 0: 42450.7. Samples: 11039233460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 13:55:55,365][15401] Updated weights for policy 0, policy_version 673781 (0.0046) [2024-06-24 13:55:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11039375360. Throughput: 0: 42350.7. Samples: 11039488960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:55:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 13:55:58,714][15401] Updated weights for policy 0, policy_version 673791 (0.0027) [2024-06-24 13:56:02,962][15401] Updated weights for policy 0, policy_version 673801 (0.0038) [2024-06-24 13:56:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 11039571968. Throughput: 0: 42675.9. Samples: 11039747180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 13:56:06,599][15401] Updated weights for policy 0, policy_version 673811 (0.0028) [2024-06-24 13:56:07,641][15349] Signal inference workers to stop experience collection... (163450 times) [2024-06-24 13:56:07,641][15349] Signal inference workers to resume experience collection... (163450 times) [2024-06-24 13:56:07,659][15401] InferenceWorker_p0-w0: stopping experience collection (163450 times) [2024-06-24 13:56:07,660][15401] InferenceWorker_p0-w0: resuming experience collection (163450 times) [2024-06-24 13:56:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11039801344. Throughput: 0: 42466.6. Samples: 11039867200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 13:56:10,577][15401] Updated weights for policy 0, policy_version 673821 (0.0025) [2024-06-24 13:56:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11039997952. Throughput: 0: 42473.9. Samples: 11040124860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 13:56:14,382][15401] Updated weights for policy 0, policy_version 673831 (0.0021) [2024-06-24 13:56:18,311][15401] Updated weights for policy 0, policy_version 673841 (0.0034) [2024-06-24 13:56:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11040210944. Throughput: 0: 42537.1. Samples: 11040383200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 13:56:21,899][15401] Updated weights for policy 0, policy_version 673851 (0.0041) [2024-06-24 13:56:23,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 11040423936. Throughput: 0: 42462.0. Samples: 11040507440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:23,396][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 13:56:25,867][15401] Updated weights for policy 0, policy_version 673861 (0.0023) [2024-06-24 13:56:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42654.6). Total num frames: 11040653312. Throughput: 0: 42433.8. Samples: 11040760620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 13:56:29,469][15401] Updated weights for policy 0, policy_version 673871 (0.0027) [2024-06-24 13:56:33,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11040849920. Throughput: 0: 42563.5. Samples: 11041019320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 13:56:33,610][15401] Updated weights for policy 0, policy_version 673881 (0.0029) [2024-06-24 13:56:37,148][15401] Updated weights for policy 0, policy_version 673891 (0.0030) [2024-06-24 13:56:38,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11041062912. Throughput: 0: 42447.5. Samples: 11041143600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 13:56:41,238][15401] Updated weights for policy 0, policy_version 673901 (0.0037) [2024-06-24 13:56:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 11041275904. Throughput: 0: 42434.4. Samples: 11041398520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 13:56:44,671][15401] Updated weights for policy 0, policy_version 673911 (0.0030) [2024-06-24 13:56:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11041488896. Throughput: 0: 42442.2. Samples: 11041657080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:48,404][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 13:56:49,161][15401] Updated weights for policy 0, policy_version 673921 (0.0036) [2024-06-24 13:56:52,129][15401] Updated weights for policy 0, policy_version 673931 (0.0035) [2024-06-24 13:56:53,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 11041701888. Throughput: 0: 42684.9. Samples: 11041788120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:53,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 13:56:56,595][15401] Updated weights for policy 0, policy_version 673941 (0.0030) [2024-06-24 13:56:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11041914880. Throughput: 0: 42671.5. Samples: 11042045080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 13:56:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 13:56:59,669][15401] Updated weights for policy 0, policy_version 673951 (0.0032) [2024-06-24 13:57:03,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 11042127872. Throughput: 0: 42681.8. Samples: 11042303880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 13:57:04,040][15401] Updated weights for policy 0, policy_version 673961 (0.0037) [2024-06-24 13:57:07,550][15401] Updated weights for policy 0, policy_version 673971 (0.0040) [2024-06-24 13:57:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11042357248. Throughput: 0: 42852.7. Samples: 11042435540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 13:57:11,969][15401] Updated weights for policy 0, policy_version 673981 (0.0028) [2024-06-24 13:57:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11042570240. Throughput: 0: 42953.6. Samples: 11042693540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 13:57:15,316][15401] Updated weights for policy 0, policy_version 673991 (0.0042) [2024-06-24 13:57:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11042783232. Throughput: 0: 42950.3. Samples: 11042952080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 13:57:19,604][15401] Updated weights for policy 0, policy_version 674001 (0.0023) [2024-06-24 13:57:22,845][15401] Updated weights for policy 0, policy_version 674011 (0.0033) [2024-06-24 13:57:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 11042996224. Throughput: 0: 42940.4. Samples: 11043075920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:23,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 13:57:27,272][15401] Updated weights for policy 0, policy_version 674021 (0.0035) [2024-06-24 13:57:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11043225600. Throughput: 0: 42978.1. Samples: 11043332520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 13:57:30,024][15349] Signal inference workers to stop experience collection... (163500 times) [2024-06-24 13:57:30,077][15401] InferenceWorker_p0-w0: stopping experience collection (163500 times) [2024-06-24 13:57:30,085][15349] Signal inference workers to resume experience collection... (163500 times) [2024-06-24 13:57:30,096][15401] InferenceWorker_p0-w0: resuming experience collection (163500 times) [2024-06-24 13:57:30,390][15401] Updated weights for policy 0, policy_version 674031 (0.0032) [2024-06-24 13:57:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 11043405824. Throughput: 0: 42918.6. Samples: 11043588520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:33,393][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 13:57:34,701][15401] Updated weights for policy 0, policy_version 674041 (0.0033) [2024-06-24 13:57:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 11043635200. Throughput: 0: 42794.7. Samples: 11043713780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:38,390][15132] Avg episode reward: [(0, '0.109')] [2024-06-24 13:57:38,628][15401] Updated weights for policy 0, policy_version 674051 (0.0027) [2024-06-24 13:57:42,193][15401] Updated weights for policy 0, policy_version 674061 (0.0043) [2024-06-24 13:57:43,389][15132] Fps is (10 sec: 45886.7, 60 sec: 43144.8, 300 sec: 42653.9). Total num frames: 11043864576. Throughput: 0: 42993.4. Samples: 11043979780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:43,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-24 13:57:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674065_11043880960.pth... [2024-06-24 13:57:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673439_11033624576.pth [2024-06-24 13:57:46,285][15401] Updated weights for policy 0, policy_version 674071 (0.0026) [2024-06-24 13:57:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11044061184. Throughput: 0: 43008.4. Samples: 11044239260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 13:57:49,996][15401] Updated weights for policy 0, policy_version 674081 (0.0031) [2024-06-24 13:57:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.2, 300 sec: 42709.4). Total num frames: 11044290560. Throughput: 0: 42854.2. Samples: 11044363980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 13:57:53,581][15401] Updated weights for policy 0, policy_version 674091 (0.0029) [2024-06-24 13:57:57,420][15401] Updated weights for policy 0, policy_version 674101 (0.0025) [2024-06-24 13:57:58,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42709.9). Total num frames: 11044519936. Throughput: 0: 42955.5. Samples: 11044626640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:57:58,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 13:58:01,206][15401] Updated weights for policy 0, policy_version 674111 (0.0033) [2024-06-24 13:58:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 11044716544. Throughput: 0: 42803.5. Samples: 11044878340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:03,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 13:58:05,077][15401] Updated weights for policy 0, policy_version 674121 (0.0037) [2024-06-24 13:58:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11044929536. Throughput: 0: 42833.4. Samples: 11045003420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 13:58:08,737][15401] Updated weights for policy 0, policy_version 674131 (0.0033) [2024-06-24 13:58:12,622][15401] Updated weights for policy 0, policy_version 674141 (0.0037) [2024-06-24 13:58:13,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11045142528. Throughput: 0: 42881.2. Samples: 11045262180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 13:58:16,758][15401] Updated weights for policy 0, policy_version 674151 (0.0037) [2024-06-24 13:58:18,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 11045339136. Throughput: 0: 42981.1. Samples: 11045522840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:18,397][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 13:58:20,430][15401] Updated weights for policy 0, policy_version 674161 (0.0032) [2024-06-24 13:58:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11045568512. Throughput: 0: 42991.6. Samples: 11045648400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 13:58:24,135][15401] Updated weights for policy 0, policy_version 674171 (0.0040) [2024-06-24 13:58:27,792][15401] Updated weights for policy 0, policy_version 674181 (0.0030) [2024-06-24 13:58:28,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11045781504. Throughput: 0: 42768.4. Samples: 11045904360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:28,395][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 13:58:31,907][15401] Updated weights for policy 0, policy_version 674191 (0.0023) [2024-06-24 13:58:33,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 11045978112. Throughput: 0: 42823.7. Samples: 11046166340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 13:58:35,867][15401] Updated weights for policy 0, policy_version 674201 (0.0027) [2024-06-24 13:58:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11046191104. Throughput: 0: 42797.3. Samples: 11046289860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 13:58:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 13:58:39,521][15401] Updated weights for policy 0, policy_version 674211 (0.0045) [2024-06-24 13:58:43,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11046420480. Throughput: 0: 42481.5. Samples: 11046538200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:58:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 13:58:43,498][15401] Updated weights for policy 0, policy_version 674221 (0.0034) [2024-06-24 13:58:47,360][15401] Updated weights for policy 0, policy_version 674231 (0.0048) [2024-06-24 13:58:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11046617088. Throughput: 0: 42676.2. Samples: 11046798660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:58:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 13:58:50,566][15349] Signal inference workers to stop experience collection... (163550 times) [2024-06-24 13:58:50,617][15349] Signal inference workers to resume experience collection... (163550 times) [2024-06-24 13:58:50,618][15401] InferenceWorker_p0-w0: stopping experience collection (163550 times) [2024-06-24 13:58:50,633][15401] InferenceWorker_p0-w0: resuming experience collection (163550 times) [2024-06-24 13:58:51,123][15401] Updated weights for policy 0, policy_version 674241 (0.0041) [2024-06-24 13:58:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11046830080. Throughput: 0: 42559.1. Samples: 11046918580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:58:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 13:58:55,234][15401] Updated weights for policy 0, policy_version 674251 (0.0032) [2024-06-24 13:58:58,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42052.3, 300 sec: 42599.0). Total num frames: 11047043072. Throughput: 0: 42468.5. Samples: 11047173360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:58:58,393][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 13:58:58,871][15401] Updated weights for policy 0, policy_version 674261 (0.0023) [2024-06-24 13:59:03,069][15401] Updated weights for policy 0, policy_version 674271 (0.0047) [2024-06-24 13:59:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 11047256064. Throughput: 0: 42209.1. Samples: 11047421980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:03,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 13:59:06,870][15401] Updated weights for policy 0, policy_version 674281 (0.0030) [2024-06-24 13:59:08,391][15132] Fps is (10 sec: 42603.0, 60 sec: 42324.4, 300 sec: 42598.2). Total num frames: 11047469056. Throughput: 0: 42307.1. Samples: 11047552280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:08,391][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 13:59:11,175][15401] Updated weights for policy 0, policy_version 674291 (0.0042) [2024-06-24 13:59:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11047698432. Throughput: 0: 42282.6. Samples: 11047807080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 13:59:14,447][15401] Updated weights for policy 0, policy_version 674301 (0.0032) [2024-06-24 13:59:18,390][15132] Fps is (10 sec: 42604.1, 60 sec: 42602.9, 300 sec: 42599.3). Total num frames: 11047895040. Throughput: 0: 42074.4. Samples: 11048059680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 13:59:18,942][15401] Updated weights for policy 0, policy_version 674311 (0.0024) [2024-06-24 13:59:22,038][15401] Updated weights for policy 0, policy_version 674321 (0.0029) [2024-06-24 13:59:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11048108032. Throughput: 0: 42114.7. Samples: 11048185020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 13:59:26,476][15401] Updated weights for policy 0, policy_version 674331 (0.0035) [2024-06-24 13:59:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11048321024. Throughput: 0: 42379.5. Samples: 11048445280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 13:59:29,907][15401] Updated weights for policy 0, policy_version 674341 (0.0029) [2024-06-24 13:59:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11048517632. Throughput: 0: 42118.9. Samples: 11048694020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:33,390][15132] Avg episode reward: [(0, '0.160')] [2024-06-24 13:59:34,627][15401] Updated weights for policy 0, policy_version 674351 (0.0034) [2024-06-24 13:59:38,056][15401] Updated weights for policy 0, policy_version 674361 (0.0037) [2024-06-24 13:59:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 11048730624. Throughput: 0: 42259.7. Samples: 11048820260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 13:59:42,195][15401] Updated weights for policy 0, policy_version 674371 (0.0030) [2024-06-24 13:59:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11048960000. Throughput: 0: 42495.5. Samples: 11049085560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 13:59:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674375_11048960000.pth... [2024-06-24 13:59:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000673751_11038736384.pth [2024-06-24 13:59:45,501][15401] Updated weights for policy 0, policy_version 674381 (0.0031) [2024-06-24 13:59:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11049156608. Throughput: 0: 42464.0. Samples: 11049332860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 13:59:50,025][15401] Updated weights for policy 0, policy_version 674391 (0.0028) [2024-06-24 13:59:53,154][15401] Updated weights for policy 0, policy_version 674401 (0.0040) [2024-06-24 13:59:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11049385984. Throughput: 0: 42372.0. Samples: 11049458960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:53,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 13:59:57,662][15401] Updated weights for policy 0, policy_version 674411 (0.0035) [2024-06-24 13:59:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 11049598976. Throughput: 0: 42473.7. Samples: 11049718400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 13:59:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 14:00:00,892][15401] Updated weights for policy 0, policy_version 674421 (0.0033) [2024-06-24 14:00:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11049779200. Throughput: 0: 42340.9. Samples: 11049965020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 14:00:03,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 14:00:05,423][15401] Updated weights for policy 0, policy_version 674431 (0.0042) [2024-06-24 14:00:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42599.4, 300 sec: 42653.9). Total num frames: 11050024960. Throughput: 0: 42357.9. Samples: 11050091120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 14:00:08,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 14:00:08,498][15401] Updated weights for policy 0, policy_version 674441 (0.0042) [2024-06-24 14:00:13,162][15401] Updated weights for policy 0, policy_version 674451 (0.0035) [2024-06-24 14:00:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 11050205184. Throughput: 0: 42421.2. Samples: 11050354240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 14:00:13,393][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 14:00:16,405][15401] Updated weights for policy 0, policy_version 674461 (0.0040) [2024-06-24 14:00:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 11050434560. Throughput: 0: 42398.3. Samples: 11050602040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 14:00:18,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 14:00:20,734][15401] Updated weights for policy 0, policy_version 674471 (0.0037) [2024-06-24 14:00:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11050663936. Throughput: 0: 42525.0. Samples: 11050733900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 14:00:24,177][15401] Updated weights for policy 0, policy_version 674481 (0.0027) [2024-06-24 14:00:28,154][15401] Updated weights for policy 0, policy_version 674491 (0.0028) [2024-06-24 14:00:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11050860544. Throughput: 0: 42264.5. Samples: 11050987460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 14:00:28,921][15349] Signal inference workers to stop experience collection... (163600 times) [2024-06-24 14:00:28,960][15401] InferenceWorker_p0-w0: stopping experience collection (163600 times) [2024-06-24 14:00:28,970][15349] Signal inference workers to resume experience collection... (163600 times) [2024-06-24 14:00:28,980][15401] InferenceWorker_p0-w0: resuming experience collection (163600 times) [2024-06-24 14:00:31,901][15401] Updated weights for policy 0, policy_version 674501 (0.0031) [2024-06-24 14:00:33,393][15132] Fps is (10 sec: 42585.5, 60 sec: 42869.2, 300 sec: 42597.9). Total num frames: 11051089920. Throughput: 0: 42467.2. Samples: 11051244020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:33,393][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 14:00:36,191][15401] Updated weights for policy 0, policy_version 674511 (0.0034) [2024-06-24 14:00:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11051286528. Throughput: 0: 42601.4. Samples: 11051376020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 14:00:39,628][15401] Updated weights for policy 0, policy_version 674521 (0.0040) [2024-06-24 14:00:43,390][15132] Fps is (10 sec: 39334.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11051483136. Throughput: 0: 42376.0. Samples: 11051625320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 14:00:44,029][15401] Updated weights for policy 0, policy_version 674531 (0.0039) [2024-06-24 14:00:47,505][15401] Updated weights for policy 0, policy_version 674541 (0.0032) [2024-06-24 14:00:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11051712512. Throughput: 0: 42551.6. Samples: 11051879840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:00:51,481][15401] Updated weights for policy 0, policy_version 674551 (0.0037) [2024-06-24 14:00:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11051925504. Throughput: 0: 42788.9. Samples: 11052016620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 14:00:55,175][15401] Updated weights for policy 0, policy_version 674561 (0.0029) [2024-06-24 14:00:58,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42047.8, 300 sec: 42541.9). Total num frames: 11052122112. Throughput: 0: 42590.4. Samples: 11052271080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:00:58,397][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 14:00:58,911][15401] Updated weights for policy 0, policy_version 674571 (0.0033) [2024-06-24 14:01:02,642][15401] Updated weights for policy 0, policy_version 674581 (0.0034) [2024-06-24 14:01:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 11052351488. Throughput: 0: 42792.5. Samples: 11052527600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:03,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:01:06,522][15401] Updated weights for policy 0, policy_version 674591 (0.0031) [2024-06-24 14:01:08,390][15132] Fps is (10 sec: 45904.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11052580864. Throughput: 0: 42687.7. Samples: 11052654840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 14:01:10,282][15401] Updated weights for policy 0, policy_version 674601 (0.0028) [2024-06-24 14:01:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11052793856. Throughput: 0: 42756.5. Samples: 11052911500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 14:01:14,086][15401] Updated weights for policy 0, policy_version 674611 (0.0037) [2024-06-24 14:01:17,822][15401] Updated weights for policy 0, policy_version 674621 (0.0032) [2024-06-24 14:01:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42599.3). Total num frames: 11052990464. Throughput: 0: 42915.1. Samples: 11053175060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:18,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 14:01:21,563][15401] Updated weights for policy 0, policy_version 674631 (0.0047) [2024-06-24 14:01:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11053219840. Throughput: 0: 42847.5. Samples: 11053304160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 14:01:25,426][15401] Updated weights for policy 0, policy_version 674641 (0.0037) [2024-06-24 14:01:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11053432832. Throughput: 0: 42970.1. Samples: 11053558980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 14:01:28,995][15401] Updated weights for policy 0, policy_version 674651 (0.0035) [2024-06-24 14:01:33,232][15401] Updated weights for policy 0, policy_version 674661 (0.0046) [2024-06-24 14:01:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.6, 300 sec: 42653.9). Total num frames: 11053645824. Throughput: 0: 43141.3. Samples: 11053821200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 14:01:36,699][15401] Updated weights for policy 0, policy_version 674671 (0.0031) [2024-06-24 14:01:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11053875200. Throughput: 0: 42976.7. Samples: 11053950580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 14:01:40,930][15401] Updated weights for policy 0, policy_version 674681 (0.0030) [2024-06-24 14:01:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11054071808. Throughput: 0: 42848.8. Samples: 11054199000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 14:01:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674687_11054071808.pth... [2024-06-24 14:01:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674065_11043880960.pth [2024-06-24 14:01:44,324][15401] Updated weights for policy 0, policy_version 674691 (0.0034) [2024-06-24 14:01:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 11054284800. Throughput: 0: 42852.8. Samples: 11054455980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 14:01:48,631][15401] Updated weights for policy 0, policy_version 674701 (0.0033) [2024-06-24 14:01:52,329][15401] Updated weights for policy 0, policy_version 674711 (0.0026) [2024-06-24 14:01:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11054530560. Throughput: 0: 42888.0. Samples: 11054584800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 14:01:55,416][15349] Signal inference workers to stop experience collection... (163650 times) [2024-06-24 14:01:55,416][15349] Signal inference workers to resume experience collection... (163650 times) [2024-06-24 14:01:55,461][15401] InferenceWorker_p0-w0: stopping experience collection (163650 times) [2024-06-24 14:01:55,461][15401] InferenceWorker_p0-w0: resuming experience collection (163650 times) [2024-06-24 14:01:56,268][15401] Updated weights for policy 0, policy_version 674721 (0.0029) [2024-06-24 14:01:58,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43420.5, 300 sec: 42709.1). Total num frames: 11054727168. Throughput: 0: 42848.8. Samples: 11054839800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 14:01:58,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 14:01:59,906][15401] Updated weights for policy 0, policy_version 674731 (0.0033) [2024-06-24 14:02:03,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 11054940160. Throughput: 0: 42766.7. Samples: 11055099660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:03,392][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 14:02:03,902][15401] Updated weights for policy 0, policy_version 674741 (0.0032) [2024-06-24 14:02:07,650][15401] Updated weights for policy 0, policy_version 674751 (0.0041) [2024-06-24 14:02:08,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11055169536. Throughput: 0: 42659.9. Samples: 11055223860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 14:02:11,813][15401] Updated weights for policy 0, policy_version 674761 (0.0050) [2024-06-24 14:02:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11055366144. Throughput: 0: 42836.9. Samples: 11055486640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 14:02:15,296][15401] Updated weights for policy 0, policy_version 674771 (0.0040) [2024-06-24 14:02:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11055562752. Throughput: 0: 42713.4. Samples: 11055743300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 14:02:19,450][15401] Updated weights for policy 0, policy_version 674781 (0.0030) [2024-06-24 14:02:22,907][15401] Updated weights for policy 0, policy_version 674791 (0.0027) [2024-06-24 14:02:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11055775744. Throughput: 0: 42570.8. Samples: 11055866260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:23,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 14:02:26,866][15401] Updated weights for policy 0, policy_version 674801 (0.0030) [2024-06-24 14:02:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 11056021504. Throughput: 0: 42876.8. Samples: 11056128460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 14:02:30,776][15401] Updated weights for policy 0, policy_version 674811 (0.0039) [2024-06-24 14:02:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11056218112. Throughput: 0: 42923.7. Samples: 11056387540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:33,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 14:02:34,733][15401] Updated weights for policy 0, policy_version 674821 (0.0036) [2024-06-24 14:02:38,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 11056414720. Throughput: 0: 42751.3. Samples: 11056508600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 14:02:38,441][15401] Updated weights for policy 0, policy_version 674831 (0.0039) [2024-06-24 14:02:42,383][15401] Updated weights for policy 0, policy_version 674841 (0.0044) [2024-06-24 14:02:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 11056660480. Throughput: 0: 42881.7. Samples: 11056769380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:43,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 14:02:46,035][15401] Updated weights for policy 0, policy_version 674851 (0.0044) [2024-06-24 14:02:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11056840704. Throughput: 0: 42580.1. Samples: 11057015660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:48,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-24 14:02:49,965][15401] Updated weights for policy 0, policy_version 674861 (0.0040) [2024-06-24 14:02:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 11057053696. Throughput: 0: 42562.4. Samples: 11057139160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 14:02:53,796][15401] Updated weights for policy 0, policy_version 674871 (0.0046) [2024-06-24 14:02:58,157][15401] Updated weights for policy 0, policy_version 674881 (0.0035) [2024-06-24 14:02:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42543.2). Total num frames: 11057266688. Throughput: 0: 42563.2. Samples: 11057401980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:02:58,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 14:03:01,505][15401] Updated weights for policy 0, policy_version 674891 (0.0027) [2024-06-24 14:03:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11057496064. Throughput: 0: 42317.4. Samples: 11057647580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 14:03:04,069][15349] Signal inference workers to stop experience collection... (163700 times) [2024-06-24 14:03:04,115][15401] InferenceWorker_p0-w0: stopping experience collection (163700 times) [2024-06-24 14:03:04,132][15349] Signal inference workers to resume experience collection... (163700 times) [2024-06-24 14:03:04,132][15401] InferenceWorker_p0-w0: resuming experience collection (163700 times) [2024-06-24 14:03:05,923][15401] Updated weights for policy 0, policy_version 674901 (0.0034) [2024-06-24 14:03:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11057692672. Throughput: 0: 42507.1. Samples: 11057779080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:03:09,081][15401] Updated weights for policy 0, policy_version 674911 (0.0031) [2024-06-24 14:03:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42488.3). Total num frames: 11057872896. Throughput: 0: 42314.4. Samples: 11058032600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 14:03:13,555][15401] Updated weights for policy 0, policy_version 674921 (0.0054) [2024-06-24 14:03:16,753][15401] Updated weights for policy 0, policy_version 674931 (0.0036) [2024-06-24 14:03:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11058135040. Throughput: 0: 42050.1. Samples: 11058279800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 14:03:21,477][15401] Updated weights for policy 0, policy_version 674941 (0.0030) [2024-06-24 14:03:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11058331648. Throughput: 0: 42369.6. Samples: 11058415240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:03:24,226][15401] Updated weights for policy 0, policy_version 674951 (0.0042) [2024-06-24 14:03:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 11058528256. Throughput: 0: 42181.9. Samples: 11058667560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 14:03:28,994][15401] Updated weights for policy 0, policy_version 674961 (0.0035) [2024-06-24 14:03:31,855][15401] Updated weights for policy 0, policy_version 674971 (0.0032) [2024-06-24 14:03:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11058774016. Throughput: 0: 42298.2. Samples: 11058919080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:03:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 14:03:36,727][15401] Updated weights for policy 0, policy_version 674981 (0.0041) [2024-06-24 14:03:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11058970624. Throughput: 0: 42575.6. Samples: 11059055060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:03:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 14:03:39,552][15401] Updated weights for policy 0, policy_version 674991 (0.0034) [2024-06-24 14:03:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.4, 300 sec: 42542.9). Total num frames: 11059167232. Throughput: 0: 42388.1. Samples: 11059309440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:03:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 14:03:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674999_11059183616.pth... [2024-06-24 14:03:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674375_11048960000.pth [2024-06-24 14:03:44,267][15401] Updated weights for policy 0, policy_version 675001 (0.0034) [2024-06-24 14:03:47,125][15401] Updated weights for policy 0, policy_version 675011 (0.0041) [2024-06-24 14:03:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11059412992. Throughput: 0: 42537.3. Samples: 11059561760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:03:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 14:03:51,882][15401] Updated weights for policy 0, policy_version 675021 (0.0026) [2024-06-24 14:03:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 11059625984. Throughput: 0: 42651.0. Samples: 11059698380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:03:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 14:03:54,868][15401] Updated weights for policy 0, policy_version 675031 (0.0043) [2024-06-24 14:03:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11059822592. Throughput: 0: 42640.3. Samples: 11059951420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:03:58,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-24 14:03:59,848][15401] Updated weights for policy 0, policy_version 675041 (0.0027) [2024-06-24 14:04:02,519][15401] Updated weights for policy 0, policy_version 675051 (0.0032) [2024-06-24 14:04:03,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42866.9, 300 sec: 42708.7). Total num frames: 11060068352. Throughput: 0: 42573.1. Samples: 11060195860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:03,396][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 14:04:07,412][15401] Updated weights for policy 0, policy_version 675061 (0.0044) [2024-06-24 14:04:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11060248576. Throughput: 0: 42862.8. Samples: 11060344060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 14:04:10,330][15401] Updated weights for policy 0, policy_version 675071 (0.0049) [2024-06-24 14:04:13,390][15132] Fps is (10 sec: 37707.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 11060445184. Throughput: 0: 42652.8. Samples: 11060586940. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:13,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 14:04:15,197][15401] Updated weights for policy 0, policy_version 675081 (0.0037) [2024-06-24 14:04:17,900][15401] Updated weights for policy 0, policy_version 675091 (0.0032) [2024-06-24 14:04:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11060690944. Throughput: 0: 42589.3. Samples: 11060835600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 14:04:22,786][15401] Updated weights for policy 0, policy_version 675101 (0.0037) [2024-06-24 14:04:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11060887552. Throughput: 0: 42696.2. Samples: 11060976400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 14:04:25,081][15349] Signal inference workers to stop experience collection... (163750 times) [2024-06-24 14:04:25,081][15349] Signal inference workers to resume experience collection... (163750 times) [2024-06-24 14:04:25,127][15401] InferenceWorker_p0-w0: stopping experience collection (163750 times) [2024-06-24 14:04:25,127][15401] InferenceWorker_p0-w0: resuming experience collection (163750 times) [2024-06-24 14:04:25,399][15401] Updated weights for policy 0, policy_version 675111 (0.0026) [2024-06-24 14:04:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11061084160. Throughput: 0: 42485.1. Samples: 11061221280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:28,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 14:04:30,449][15401] Updated weights for policy 0, policy_version 675121 (0.0022) [2024-06-24 14:04:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11061329920. Throughput: 0: 42558.3. Samples: 11061476880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 14:04:33,394][15401] Updated weights for policy 0, policy_version 675131 (0.0044) [2024-06-24 14:04:38,298][15401] Updated weights for policy 0, policy_version 675141 (0.0038) [2024-06-24 14:04:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11061510144. Throughput: 0: 42674.8. Samples: 11061618740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 14:04:41,086][15401] Updated weights for policy 0, policy_version 675151 (0.0023) [2024-06-24 14:04:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11061739520. Throughput: 0: 42500.9. Samples: 11061863960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 14:04:45,799][15401] Updated weights for policy 0, policy_version 675161 (0.0034) [2024-06-24 14:04:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11061985280. Throughput: 0: 42566.1. Samples: 11062111060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 14:04:48,666][15401] Updated weights for policy 0, policy_version 675171 (0.0032) [2024-06-24 14:04:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11062149120. Throughput: 0: 42409.8. Samples: 11062252500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 14:04:53,396][15401] Updated weights for policy 0, policy_version 675181 (0.0027) [2024-06-24 14:04:56,102][15401] Updated weights for policy 0, policy_version 675191 (0.0033) [2024-06-24 14:04:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11062378496. Throughput: 0: 42669.0. Samples: 11062507040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:04:58,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 14:05:01,065][15401] Updated weights for policy 0, policy_version 675201 (0.0026) [2024-06-24 14:05:03,390][15132] Fps is (10 sec: 49151.8, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 11062640640. Throughput: 0: 42736.3. Samples: 11062758740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:05:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 14:05:03,586][15401] Updated weights for policy 0, policy_version 675211 (0.0032) [2024-06-24 14:05:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11062788096. Throughput: 0: 42679.5. Samples: 11062896980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:05:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 14:05:08,831][15401] Updated weights for policy 0, policy_version 675221 (0.0034) [2024-06-24 14:05:10,999][15401] Updated weights for policy 0, policy_version 675231 (0.0036) [2024-06-24 14:05:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 11063033856. Throughput: 0: 42938.8. Samples: 11063153520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-24 14:05:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 14:05:16,471][15401] Updated weights for policy 0, policy_version 675241 (0.0035) [2024-06-24 14:05:18,392][15132] Fps is (10 sec: 49140.7, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 11063279616. Throughput: 0: 42790.1. Samples: 11063402540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:18,393][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 14:05:18,861][15401] Updated weights for policy 0, policy_version 675251 (0.0046) [2024-06-24 14:05:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 11063427072. Throughput: 0: 42579.6. Samples: 11063534820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:23,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 14:05:24,032][15401] Updated weights for policy 0, policy_version 675261 (0.0032) [2024-06-24 14:05:26,813][15401] Updated weights for policy 0, policy_version 675271 (0.0042) [2024-06-24 14:05:28,390][15132] Fps is (10 sec: 39330.8, 60 sec: 43144.6, 300 sec: 42654.4). Total num frames: 11063672832. Throughput: 0: 42756.0. Samples: 11063787980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 14:05:31,353][15349] Signal inference workers to stop experience collection... (163800 times) [2024-06-24 14:05:31,354][15349] Signal inference workers to resume experience collection... (163800 times) [2024-06-24 14:05:31,372][15401] InferenceWorker_p0-w0: stopping experience collection (163800 times) [2024-06-24 14:05:31,373][15401] InferenceWorker_p0-w0: resuming experience collection (163800 times) [2024-06-24 14:05:31,659][15401] Updated weights for policy 0, policy_version 675281 (0.0029) [2024-06-24 14:05:33,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11063902208. Throughput: 0: 43046.2. Samples: 11064048140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 14:05:34,395][15401] Updated weights for policy 0, policy_version 675291 (0.0028) [2024-06-24 14:05:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11064082432. Throughput: 0: 42796.9. Samples: 11064178360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 14:05:39,366][15401] Updated weights for policy 0, policy_version 675301 (0.0031) [2024-06-24 14:05:41,829][15401] Updated weights for policy 0, policy_version 675311 (0.0030) [2024-06-24 14:05:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11064328192. Throughput: 0: 42782.5. Samples: 11064432260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 14:05:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675313_11064328192.pth... [2024-06-24 14:05:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674687_11054071808.pth [2024-06-24 14:05:47,015][15401] Updated weights for policy 0, policy_version 675321 (0.0034) [2024-06-24 14:05:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11064541184. Throughput: 0: 42959.7. Samples: 11064691920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 14:05:49,441][15401] Updated weights for policy 0, policy_version 675331 (0.0033) [2024-06-24 14:05:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 11064721408. Throughput: 0: 42767.2. Samples: 11064821500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 14:05:54,501][15401] Updated weights for policy 0, policy_version 675341 (0.0043) [2024-06-24 14:05:57,011][15401] Updated weights for policy 0, policy_version 675351 (0.0040) [2024-06-24 14:05:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11064983552. Throughput: 0: 42615.6. Samples: 11065071220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:05:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 14:06:02,096][15401] Updated weights for policy 0, policy_version 675361 (0.0040) [2024-06-24 14:06:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11065180160. Throughput: 0: 43036.4. Samples: 11065339080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 14:06:05,588][15401] Updated weights for policy 0, policy_version 675371 (0.0041) [2024-06-24 14:06:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 11065376768. Throughput: 0: 42871.1. Samples: 11065464020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 14:06:09,620][15401] Updated weights for policy 0, policy_version 675381 (0.0032) [2024-06-24 14:06:13,365][15401] Updated weights for policy 0, policy_version 675391 (0.0034) [2024-06-24 14:06:13,394][15132] Fps is (10 sec: 42578.5, 60 sec: 42868.0, 300 sec: 42764.3). Total num frames: 11065606144. Throughput: 0: 42844.0. Samples: 11065716160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:13,395][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:06:17,362][15401] Updated weights for policy 0, policy_version 675401 (0.0032) [2024-06-24 14:06:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 11065819136. Throughput: 0: 42946.3. Samples: 11065980720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 14:06:20,896][15401] Updated weights for policy 0, policy_version 675411 (0.0039) [2024-06-24 14:06:23,389][15132] Fps is (10 sec: 40979.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11066015744. Throughput: 0: 42893.9. Samples: 11066108580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 14:06:25,032][15401] Updated weights for policy 0, policy_version 675421 (0.0044) [2024-06-24 14:06:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11066245120. Throughput: 0: 42672.1. Samples: 11066352500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 14:06:28,965][15401] Updated weights for policy 0, policy_version 675431 (0.0030) [2024-06-24 14:06:32,874][15401] Updated weights for policy 0, policy_version 675441 (0.0039) [2024-06-24 14:06:33,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 11066458112. Throughput: 0: 42711.4. Samples: 11066614040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:33,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 14:06:36,568][15401] Updated weights for policy 0, policy_version 675451 (0.0036) [2024-06-24 14:06:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11066671104. Throughput: 0: 42772.5. Samples: 11066746260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:06:39,675][15349] Signal inference workers to stop experience collection... (163850 times) [2024-06-24 14:06:39,675][15349] Signal inference workers to resume experience collection... (163850 times) [2024-06-24 14:06:39,714][15401] InferenceWorker_p0-w0: stopping experience collection (163850 times) [2024-06-24 14:06:39,715][15401] InferenceWorker_p0-w0: resuming experience collection (163850 times) [2024-06-24 14:06:40,362][15401] Updated weights for policy 0, policy_version 675461 (0.0032) [2024-06-24 14:06:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11066884096. Throughput: 0: 42853.6. Samples: 11066999640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 14:06:44,171][15401] Updated weights for policy 0, policy_version 675471 (0.0027) [2024-06-24 14:06:47,806][15401] Updated weights for policy 0, policy_version 675481 (0.0042) [2024-06-24 14:06:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11067097088. Throughput: 0: 42597.3. Samples: 11067255960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 14:06:52,007][15401] Updated weights for policy 0, policy_version 675491 (0.0032) [2024-06-24 14:06:53,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43142.8, 300 sec: 42653.9). Total num frames: 11067310080. Throughput: 0: 42796.7. Samples: 11067389980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-24 14:06:53,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 14:06:55,233][15401] Updated weights for policy 0, policy_version 675501 (0.0027) [2024-06-24 14:06:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 11067539456. Throughput: 0: 42894.7. Samples: 11067646220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:06:58,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 14:06:59,535][15401] Updated weights for policy 0, policy_version 675511 (0.0030) [2024-06-24 14:07:03,158][15401] Updated weights for policy 0, policy_version 675521 (0.0045) [2024-06-24 14:07:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11067736064. Throughput: 0: 42752.0. Samples: 11067904560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 14:07:07,123][15401] Updated weights for policy 0, policy_version 675531 (0.0035) [2024-06-24 14:07:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11067932672. Throughput: 0: 42755.1. Samples: 11068032560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:08,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-24 14:07:10,741][15401] Updated weights for policy 0, policy_version 675541 (0.0049) [2024-06-24 14:07:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42874.9, 300 sec: 42765.0). Total num frames: 11068178432. Throughput: 0: 42944.9. Samples: 11068285020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:13,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 14:07:14,903][15401] Updated weights for policy 0, policy_version 675551 (0.0026) [2024-06-24 14:07:18,375][15401] Updated weights for policy 0, policy_version 675561 (0.0034) [2024-06-24 14:07:18,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11068391424. Throughput: 0: 42905.4. Samples: 11068544780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:18,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 14:07:22,753][15401] Updated weights for policy 0, policy_version 675571 (0.0028) [2024-06-24 14:07:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11068571648. Throughput: 0: 42802.5. Samples: 11068672380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 14:07:25,988][15401] Updated weights for policy 0, policy_version 675581 (0.0040) [2024-06-24 14:07:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11068817408. Throughput: 0: 42856.1. Samples: 11068928160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:28,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 14:07:30,334][15401] Updated weights for policy 0, policy_version 675591 (0.0025) [2024-06-24 14:07:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 11069030400. Throughput: 0: 42969.5. Samples: 11069189580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 14:07:33,488][15401] Updated weights for policy 0, policy_version 675601 (0.0031) [2024-06-24 14:07:37,933][15401] Updated weights for policy 0, policy_version 675611 (0.0036) [2024-06-24 14:07:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11069227008. Throughput: 0: 42766.3. Samples: 11069314360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 14:07:41,253][15401] Updated weights for policy 0, policy_version 675621 (0.0041) [2024-06-24 14:07:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11069456384. Throughput: 0: 42748.1. Samples: 11069569880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 14:07:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675627_11069472768.pth... [2024-06-24 14:07:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000674999_11059183616.pth [2024-06-24 14:07:45,816][15401] Updated weights for policy 0, policy_version 675631 (0.0043) [2024-06-24 14:07:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11069652992. Throughput: 0: 42884.0. Samples: 11069834340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 14:07:48,960][15401] Updated weights for policy 0, policy_version 675641 (0.0028) [2024-06-24 14:07:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 11069849600. Throughput: 0: 42702.6. Samples: 11069954180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 14:07:53,568][15401] Updated weights for policy 0, policy_version 675651 (0.0044) [2024-06-24 14:07:56,732][15401] Updated weights for policy 0, policy_version 675661 (0.0029) [2024-06-24 14:07:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11070095360. Throughput: 0: 42636.9. Samples: 11070203680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:07:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:08:01,132][15401] Updated weights for policy 0, policy_version 675671 (0.0037) [2024-06-24 14:08:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11070275584. Throughput: 0: 42802.6. Samples: 11070470800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 14:08:03,989][15349] Signal inference workers to stop experience collection... (163900 times) [2024-06-24 14:08:03,989][15349] Signal inference workers to resume experience collection... (163900 times) [2024-06-24 14:08:04,034][15401] InferenceWorker_p0-w0: stopping experience collection (163900 times) [2024-06-24 14:08:04,034][15401] InferenceWorker_p0-w0: resuming experience collection (163900 times) [2024-06-24 14:08:04,339][15401] Updated weights for policy 0, policy_version 675681 (0.0038) [2024-06-24 14:08:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11070504960. Throughput: 0: 42633.8. Samples: 11070590900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 14:08:08,683][15401] Updated weights for policy 0, policy_version 675691 (0.0026) [2024-06-24 14:08:12,055][15401] Updated weights for policy 0, policy_version 675701 (0.0030) [2024-06-24 14:08:13,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11070734336. Throughput: 0: 42638.6. Samples: 11070846900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 14:08:16,180][15401] Updated weights for policy 0, policy_version 675711 (0.0043) [2024-06-24 14:08:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 11070930944. Throughput: 0: 42596.2. Samples: 11071106420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 14:08:19,730][15401] Updated weights for policy 0, policy_version 675721 (0.0042) [2024-06-24 14:08:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11071160320. Throughput: 0: 42616.8. Samples: 11071232120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 14:08:23,614][15401] Updated weights for policy 0, policy_version 675731 (0.0045) [2024-06-24 14:08:27,447][15401] Updated weights for policy 0, policy_version 675741 (0.0034) [2024-06-24 14:08:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11071373312. Throughput: 0: 42652.5. Samples: 11071489240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 14:08:31,110][15401] Updated weights for policy 0, policy_version 675751 (0.0024) [2024-06-24 14:08:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11071569920. Throughput: 0: 42760.0. Samples: 11071758540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 14:08:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 14:08:35,145][15401] Updated weights for policy 0, policy_version 675761 (0.0035) [2024-06-24 14:08:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11071815680. Throughput: 0: 42789.8. Samples: 11071879720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:08:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 14:08:38,857][15401] Updated weights for policy 0, policy_version 675771 (0.0024) [2024-06-24 14:08:42,889][15401] Updated weights for policy 0, policy_version 675781 (0.0033) [2024-06-24 14:08:43,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11072028672. Throughput: 0: 43037.5. Samples: 11072140380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:08:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 14:08:46,597][15401] Updated weights for policy 0, policy_version 675791 (0.0036) [2024-06-24 14:08:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11072225280. Throughput: 0: 42833.4. Samples: 11072398300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:08:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 14:08:50,594][15401] Updated weights for policy 0, policy_version 675801 (0.0036) [2024-06-24 14:08:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 11072471040. Throughput: 0: 42824.9. Samples: 11072518020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:08:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 14:08:54,196][15401] Updated weights for policy 0, policy_version 675811 (0.0029) [2024-06-24 14:08:58,071][15401] Updated weights for policy 0, policy_version 675821 (0.0026) [2024-06-24 14:08:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 11072667648. Throughput: 0: 42954.4. Samples: 11072779840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:08:58,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 14:09:01,776][15401] Updated weights for policy 0, policy_version 675831 (0.0038) [2024-06-24 14:09:03,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11072847872. Throughput: 0: 42867.8. Samples: 11073035460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:03,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 14:09:05,699][15401] Updated weights for policy 0, policy_version 675841 (0.0039) [2024-06-24 14:09:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11073077248. Throughput: 0: 42757.8. Samples: 11073156220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 14:09:09,554][15401] Updated weights for policy 0, policy_version 675851 (0.0043) [2024-06-24 14:09:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11073273856. Throughput: 0: 42643.6. Samples: 11073408200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 14:09:13,983][15401] Updated weights for policy 0, policy_version 675861 (0.0038) [2024-06-24 14:09:17,491][15401] Updated weights for policy 0, policy_version 675871 (0.0036) [2024-06-24 14:09:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11073503232. Throughput: 0: 42218.2. Samples: 11073658360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 14:09:21,582][15401] Updated weights for policy 0, policy_version 675881 (0.0032) [2024-06-24 14:09:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11073716224. Throughput: 0: 42525.4. Samples: 11073793360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 14:09:25,230][15401] Updated weights for policy 0, policy_version 675891 (0.0034) [2024-06-24 14:09:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11073912832. Throughput: 0: 42395.7. Samples: 11074048180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 14:09:29,217][15401] Updated weights for policy 0, policy_version 675901 (0.0044) [2024-06-24 14:09:32,828][15401] Updated weights for policy 0, policy_version 675911 (0.0036) [2024-06-24 14:09:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11074125824. Throughput: 0: 42353.9. Samples: 11074304220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 14:09:36,978][15401] Updated weights for policy 0, policy_version 675921 (0.0032) [2024-06-24 14:09:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11074371584. Throughput: 0: 42613.0. Samples: 11074435600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:38,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-24 14:09:40,578][15401] Updated weights for policy 0, policy_version 675931 (0.0039) [2024-06-24 14:09:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11074568192. Throughput: 0: 42359.8. Samples: 11074686040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:43,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-24 14:09:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675938_11074568192.pth... [2024-06-24 14:09:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675313_11064328192.pth [2024-06-24 14:09:44,922][15401] Updated weights for policy 0, policy_version 675941 (0.0042) [2024-06-24 14:09:46,699][15349] Signal inference workers to stop experience collection... (163950 times) [2024-06-24 14:09:46,700][15349] Signal inference workers to resume experience collection... (163950 times) [2024-06-24 14:09:46,744][15401] InferenceWorker_p0-w0: stopping experience collection (163950 times) [2024-06-24 14:09:46,744][15401] InferenceWorker_p0-w0: resuming experience collection (163950 times) [2024-06-24 14:09:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11074764800. Throughput: 0: 42338.7. Samples: 11074940700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 14:09:48,415][15401] Updated weights for policy 0, policy_version 675951 (0.0039) [2024-06-24 14:09:52,620][15401] Updated weights for policy 0, policy_version 675961 (0.0034) [2024-06-24 14:09:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 11074977792. Throughput: 0: 42545.2. Samples: 11075070760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 14:09:55,990][15401] Updated weights for policy 0, policy_version 675971 (0.0032) [2024-06-24 14:09:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11075223552. Throughput: 0: 42586.2. Samples: 11075324580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:09:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 14:10:00,385][15401] Updated weights for policy 0, policy_version 675981 (0.0028) [2024-06-24 14:10:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11075420160. Throughput: 0: 42823.6. Samples: 11075585420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:10:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 14:10:03,429][15401] Updated weights for policy 0, policy_version 675991 (0.0034) [2024-06-24 14:10:08,044][15401] Updated weights for policy 0, policy_version 676001 (0.0049) [2024-06-24 14:10:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11075616768. Throughput: 0: 42577.3. Samples: 11075709340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 14:10:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 14:10:11,326][15401] Updated weights for policy 0, policy_version 676011 (0.0025) [2024-06-24 14:10:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42709.8). Total num frames: 11075878912. Throughput: 0: 42607.1. Samples: 11075965500. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:13,392][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:10:15,572][15401] Updated weights for policy 0, policy_version 676021 (0.0034) [2024-06-24 14:10:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11076075520. Throughput: 0: 42640.3. Samples: 11076223040. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:18,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 14:10:19,113][15401] Updated weights for policy 0, policy_version 676031 (0.0041) [2024-06-24 14:10:23,216][15401] Updated weights for policy 0, policy_version 676041 (0.0044) [2024-06-24 14:10:23,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 11076272128. Throughput: 0: 42615.9. Samples: 11076353420. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:23,392][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 14:10:26,644][15401] Updated weights for policy 0, policy_version 676051 (0.0044) [2024-06-24 14:10:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11076485120. Throughput: 0: 42694.7. Samples: 11076607400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:28,393][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 14:10:30,947][15401] Updated weights for policy 0, policy_version 676061 (0.0034) [2024-06-24 14:10:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11076714496. Throughput: 0: 42713.6. Samples: 11076862820. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 14:10:34,145][15401] Updated weights for policy 0, policy_version 676071 (0.0038) [2024-06-24 14:10:38,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11076894720. Throughput: 0: 42729.4. Samples: 11076993580. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 14:10:38,691][15401] Updated weights for policy 0, policy_version 676081 (0.0037) [2024-06-24 14:10:41,803][15401] Updated weights for policy 0, policy_version 676091 (0.0041) [2024-06-24 14:10:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11077140480. Throughput: 0: 42686.1. Samples: 11077245460. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 14:10:46,701][15401] Updated weights for policy 0, policy_version 676101 (0.0043) [2024-06-24 14:10:48,123][15349] Signal inference workers to stop experience collection... (164000 times) [2024-06-24 14:10:48,123][15349] Signal inference workers to resume experience collection... (164000 times) [2024-06-24 14:10:48,162][15401] InferenceWorker_p0-w0: stopping experience collection (164000 times) [2024-06-24 14:10:48,162][15401] InferenceWorker_p0-w0: resuming experience collection (164000 times) [2024-06-24 14:10:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 11077353472. Throughput: 0: 42594.6. Samples: 11077502280. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:48,393][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 14:10:49,541][15401] Updated weights for policy 0, policy_version 676111 (0.0037) [2024-06-24 14:10:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11077517312. Throughput: 0: 42676.4. Samples: 11077629780. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 14:10:54,287][15401] Updated weights for policy 0, policy_version 676121 (0.0042) [2024-06-24 14:10:57,203][15401] Updated weights for policy 0, policy_version 676131 (0.0049) [2024-06-24 14:10:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11077779456. Throughput: 0: 42799.7. Samples: 11077891480. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:10:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 14:11:01,834][15401] Updated weights for policy 0, policy_version 676141 (0.0043) [2024-06-24 14:11:03,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11077992448. Throughput: 0: 42758.7. Samples: 11078147180. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:03,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 14:11:04,667][15401] Updated weights for policy 0, policy_version 676151 (0.0029) [2024-06-24 14:11:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42599.1). Total num frames: 11078172672. Throughput: 0: 42613.8. Samples: 11078270940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 14:11:09,269][15401] Updated weights for policy 0, policy_version 676161 (0.0040) [2024-06-24 14:11:12,221][15401] Updated weights for policy 0, policy_version 676171 (0.0033) [2024-06-24 14:11:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11078434816. Throughput: 0: 42735.7. Samples: 11078530400. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 14:11:16,771][15401] Updated weights for policy 0, policy_version 676181 (0.0036) [2024-06-24 14:11:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11078631424. Throughput: 0: 42949.8. Samples: 11078795560. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 14:11:19,726][15401] Updated weights for policy 0, policy_version 676191 (0.0027) [2024-06-24 14:11:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 11078828032. Throughput: 0: 42791.9. Samples: 11078919220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 14:11:24,165][15401] Updated weights for policy 0, policy_version 676201 (0.0033) [2024-06-24 14:11:27,663][15401] Updated weights for policy 0, policy_version 676211 (0.0026) [2024-06-24 14:11:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43419.4, 300 sec: 42820.9). Total num frames: 11079090176. Throughput: 0: 43056.9. Samples: 11079183020. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:28,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 14:11:31,651][15401] Updated weights for policy 0, policy_version 676221 (0.0032) [2024-06-24 14:11:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11079254016. Throughput: 0: 43349.9. Samples: 11079452920. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:11:35,240][15401] Updated weights for policy 0, policy_version 676231 (0.0042) [2024-06-24 14:11:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11079483392. Throughput: 0: 42989.4. Samples: 11079564300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 14:11:39,122][15401] Updated weights for policy 0, policy_version 676241 (0.0036) [2024-06-24 14:11:42,775][15401] Updated weights for policy 0, policy_version 676251 (0.0031) [2024-06-24 14:11:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11079729152. Throughput: 0: 43179.4. Samples: 11079834560. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 14:11:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676253_11079729152.pth... [2024-06-24 14:11:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675627_11069472768.pth [2024-06-24 14:11:46,730][15401] Updated weights for policy 0, policy_version 676261 (0.0029) [2024-06-24 14:11:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42654.3). Total num frames: 11079892992. Throughput: 0: 43331.7. Samples: 11080097100. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-24 14:11:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 14:11:50,068][15349] Signal inference workers to stop experience collection... (164050 times) [2024-06-24 14:11:50,068][15349] Signal inference workers to resume experience collection... (164050 times) [2024-06-24 14:11:50,089][15401] InferenceWorker_p0-w0: stopping experience collection (164050 times) [2024-06-24 14:11:50,089][15401] InferenceWorker_p0-w0: resuming experience collection (164050 times) [2024-06-24 14:11:50,515][15401] Updated weights for policy 0, policy_version 676271 (0.0039) [2024-06-24 14:11:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 11080138752. Throughput: 0: 43168.1. Samples: 11080213500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:11:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 14:11:54,439][15401] Updated weights for policy 0, policy_version 676281 (0.0036) [2024-06-24 14:11:57,955][15401] Updated weights for policy 0, policy_version 676291 (0.0039) [2024-06-24 14:11:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11080351744. Throughput: 0: 43364.5. Samples: 11080481800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:11:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 14:12:02,130][15401] Updated weights for policy 0, policy_version 676301 (0.0044) [2024-06-24 14:12:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11080548352. Throughput: 0: 43289.7. Samples: 11080743600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 14:12:05,700][15401] Updated weights for policy 0, policy_version 676311 (0.0046) [2024-06-24 14:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 11080794112. Throughput: 0: 43201.0. Samples: 11080863260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 14:12:09,604][15401] Updated weights for policy 0, policy_version 676321 (0.0045) [2024-06-24 14:12:13,158][15401] Updated weights for policy 0, policy_version 676331 (0.0045) [2024-06-24 14:12:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 11081007104. Throughput: 0: 43253.4. Samples: 11081129420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:13,396][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 14:12:17,517][15401] Updated weights for policy 0, policy_version 676341 (0.0031) [2024-06-24 14:12:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11081203712. Throughput: 0: 42973.4. Samples: 11081386720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 14:12:20,946][15401] Updated weights for policy 0, policy_version 676351 (0.0026) [2024-06-24 14:12:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43689.0, 300 sec: 42820.2). Total num frames: 11081449472. Throughput: 0: 43325.6. Samples: 11081514060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:23,393][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 14:12:24,952][15401] Updated weights for policy 0, policy_version 676361 (0.0038) [2024-06-24 14:12:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11081646080. Throughput: 0: 43034.3. Samples: 11081771100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 14:12:28,737][15401] Updated weights for policy 0, policy_version 676371 (0.0037) [2024-06-24 14:12:32,438][15401] Updated weights for policy 0, policy_version 676381 (0.0038) [2024-06-24 14:12:33,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11081859072. Throughput: 0: 42921.8. Samples: 11082028580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 14:12:36,630][15401] Updated weights for policy 0, policy_version 676391 (0.0029) [2024-06-24 14:12:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11082072064. Throughput: 0: 43086.2. Samples: 11082152380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 14:12:39,918][15401] Updated weights for policy 0, policy_version 676401 (0.0038) [2024-06-24 14:12:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 11082285056. Throughput: 0: 42822.6. Samples: 11082408820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 14:12:44,226][15401] Updated weights for policy 0, policy_version 676411 (0.0027) [2024-06-24 14:12:48,037][15401] Updated weights for policy 0, policy_version 676421 (0.0035) [2024-06-24 14:12:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11082481664. Throughput: 0: 42686.0. Samples: 11082664460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 14:12:51,724][15401] Updated weights for policy 0, policy_version 676431 (0.0033) [2024-06-24 14:12:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11082711040. Throughput: 0: 42883.1. Samples: 11082793000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 14:12:55,496][15401] Updated weights for policy 0, policy_version 676441 (0.0040) [2024-06-24 14:12:58,396][15132] Fps is (10 sec: 42570.6, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 11082907648. Throughput: 0: 42607.6. Samples: 11083047040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:12:58,397][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 14:12:59,289][15401] Updated weights for policy 0, policy_version 676451 (0.0037) [2024-06-24 14:13:02,990][15401] Updated weights for policy 0, policy_version 676461 (0.0038) [2024-06-24 14:13:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 11083137024. Throughput: 0: 42454.1. Samples: 11083297260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:03,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 14:13:07,138][15401] Updated weights for policy 0, policy_version 676471 (0.0037) [2024-06-24 14:13:08,390][15132] Fps is (10 sec: 45904.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11083366400. Throughput: 0: 42568.9. Samples: 11083429560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 14:13:10,650][15401] Updated weights for policy 0, policy_version 676481 (0.0027) [2024-06-24 14:13:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11083546624. Throughput: 0: 42572.9. Samples: 11083686880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:13:14,986][15401] Updated weights for policy 0, policy_version 676491 (0.0036) [2024-06-24 14:13:17,540][15349] Signal inference workers to stop experience collection... (164100 times) [2024-06-24 14:13:17,540][15349] Signal inference workers to resume experience collection... (164100 times) [2024-06-24 14:13:17,587][15401] InferenceWorker_p0-w0: stopping experience collection (164100 times) [2024-06-24 14:13:17,587][15401] InferenceWorker_p0-w0: resuming experience collection (164100 times) [2024-06-24 14:13:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11083776000. Throughput: 0: 42339.1. Samples: 11083933840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:18,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 14:13:18,415][15401] Updated weights for policy 0, policy_version 676501 (0.0031) [2024-06-24 14:13:22,687][15401] Updated weights for policy 0, policy_version 676511 (0.0034) [2024-06-24 14:13:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 11083988992. Throughput: 0: 42504.5. Samples: 11084065080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 14:13:26,248][15401] Updated weights for policy 0, policy_version 676521 (0.0037) [2024-06-24 14:13:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11084201984. Throughput: 0: 42592.4. Samples: 11084325480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-24 14:13:28,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 14:13:30,349][15401] Updated weights for policy 0, policy_version 676531 (0.0028) [2024-06-24 14:13:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11084431360. Throughput: 0: 42534.6. Samples: 11084578520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 14:13:33,849][15401] Updated weights for policy 0, policy_version 676541 (0.0042) [2024-06-24 14:13:37,853][15401] Updated weights for policy 0, policy_version 676551 (0.0031) [2024-06-24 14:13:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11084644352. Throughput: 0: 42709.7. Samples: 11084714940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 14:13:41,612][15401] Updated weights for policy 0, policy_version 676561 (0.0044) [2024-06-24 14:13:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 11084808192. Throughput: 0: 42724.4. Samples: 11084969360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 14:13:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676564_11084824576.pth... [2024-06-24 14:13:43,530][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000675938_11074568192.pth [2024-06-24 14:13:45,569][15401] Updated weights for policy 0, policy_version 676571 (0.0035) [2024-06-24 14:13:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11085053952. Throughput: 0: 42612.6. Samples: 11085214720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 14:13:49,425][15401] Updated weights for policy 0, policy_version 676581 (0.0027) [2024-06-24 14:13:53,221][15401] Updated weights for policy 0, policy_version 676591 (0.0033) [2024-06-24 14:13:53,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 11085266944. Throughput: 0: 42728.4. Samples: 11085352340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 14:13:57,008][15401] Updated weights for policy 0, policy_version 676601 (0.0040) [2024-06-24 14:13:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 11085463552. Throughput: 0: 42530.2. Samples: 11085600740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:13:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 14:14:00,930][15401] Updated weights for policy 0, policy_version 676611 (0.0041) [2024-06-24 14:14:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 11085709312. Throughput: 0: 42589.8. Samples: 11085850380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:03,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 14:14:04,624][15401] Updated weights for policy 0, policy_version 676621 (0.0040) [2024-06-24 14:14:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11085905920. Throughput: 0: 42720.8. Samples: 11085987520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 14:14:08,694][15401] Updated weights for policy 0, policy_version 676631 (0.0037) [2024-06-24 14:14:12,308][15401] Updated weights for policy 0, policy_version 676641 (0.0031) [2024-06-24 14:14:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11086118912. Throughput: 0: 42415.6. Samples: 11086234180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:13,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 14:14:16,736][15401] Updated weights for policy 0, policy_version 676651 (0.0048) [2024-06-24 14:14:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11086348288. Throughput: 0: 42420.0. Samples: 11086487420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:18,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 14:14:20,045][15401] Updated weights for policy 0, policy_version 676661 (0.0040) [2024-06-24 14:14:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11086512128. Throughput: 0: 42327.6. Samples: 11086619680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 14:14:24,213][15401] Updated weights for policy 0, policy_version 676671 (0.0040) [2024-06-24 14:14:27,776][15401] Updated weights for policy 0, policy_version 676681 (0.0046) [2024-06-24 14:14:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11086741504. Throughput: 0: 42333.8. Samples: 11086874380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 14:14:31,905][15401] Updated weights for policy 0, policy_version 676691 (0.0034) [2024-06-24 14:14:31,915][15349] Signal inference workers to stop experience collection... (164150 times) [2024-06-24 14:14:31,920][15349] Signal inference workers to resume experience collection... (164150 times) [2024-06-24 14:14:31,935][15401] InferenceWorker_p0-w0: stopping experience collection (164150 times) [2024-06-24 14:14:31,935][15401] InferenceWorker_p0-w0: resuming experience collection (164150 times) [2024-06-24 14:14:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11086987264. Throughput: 0: 42531.7. Samples: 11087128660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 14:14:35,796][15401] Updated weights for policy 0, policy_version 676701 (0.0030) [2024-06-24 14:14:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11087167488. Throughput: 0: 42474.4. Samples: 11087263680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 14:14:39,627][15401] Updated weights for policy 0, policy_version 676711 (0.0033) [2024-06-24 14:14:43,193][15401] Updated weights for policy 0, policy_version 676721 (0.0046) [2024-06-24 14:14:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11087396864. Throughput: 0: 42350.8. Samples: 11087506520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 14:14:47,394][15401] Updated weights for policy 0, policy_version 676731 (0.0038) [2024-06-24 14:14:48,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42866.8, 300 sec: 42875.2). Total num frames: 11087626240. Throughput: 0: 42536.6. Samples: 11087764800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:48,396][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 14:14:51,293][15401] Updated weights for policy 0, policy_version 676741 (0.0043) [2024-06-24 14:14:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11087806464. Throughput: 0: 42599.0. Samples: 11087904480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 14:14:55,176][15401] Updated weights for policy 0, policy_version 676751 (0.0024) [2024-06-24 14:14:58,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11088035840. Throughput: 0: 42557.4. Samples: 11088149260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:14:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:14:58,902][15401] Updated weights for policy 0, policy_version 676761 (0.0031) [2024-06-24 14:15:02,985][15401] Updated weights for policy 0, policy_version 676771 (0.0040) [2024-06-24 14:15:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 11088232448. Throughput: 0: 42841.8. Samples: 11088415300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:15:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 14:15:06,437][15401] Updated weights for policy 0, policy_version 676781 (0.0043) [2024-06-24 14:15:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11088445440. Throughput: 0: 42648.0. Samples: 11088538840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 14:15:08,391][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 14:15:10,540][15401] Updated weights for policy 0, policy_version 676791 (0.0036) [2024-06-24 14:15:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11088691200. Throughput: 0: 42599.3. Samples: 11088791360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 14:15:13,963][15401] Updated weights for policy 0, policy_version 676801 (0.0030) [2024-06-24 14:15:18,039][15401] Updated weights for policy 0, policy_version 676811 (0.0041) [2024-06-24 14:15:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 11088871424. Throughput: 0: 42672.6. Samples: 11089048920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 14:15:21,719][15401] Updated weights for policy 0, policy_version 676821 (0.0034) [2024-06-24 14:15:23,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 11089100800. Throughput: 0: 42455.5. Samples: 11089174180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 14:15:25,506][15401] Updated weights for policy 0, policy_version 676831 (0.0041) [2024-06-24 14:15:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 11089346560. Throughput: 0: 42875.4. Samples: 11089435920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 14:15:29,417][15401] Updated weights for policy 0, policy_version 676841 (0.0039) [2024-06-24 14:15:33,067][15401] Updated weights for policy 0, policy_version 676851 (0.0033) [2024-06-24 14:15:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11089526784. Throughput: 0: 42885.1. Samples: 11089694360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:15:36,903][15401] Updated weights for policy 0, policy_version 676861 (0.0028) [2024-06-24 14:15:38,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11089739776. Throughput: 0: 42600.6. Samples: 11089821500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 14:15:41,037][15401] Updated weights for policy 0, policy_version 676871 (0.0035) [2024-06-24 14:15:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 11089985536. Throughput: 0: 42952.7. Samples: 11090082140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 14:15:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676879_11089985536.pth... [2024-06-24 14:15:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676253_11079729152.pth [2024-06-24 14:15:44,423][15401] Updated weights for policy 0, policy_version 676881 (0.0032) [2024-06-24 14:15:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42056.8, 300 sec: 42820.6). Total num frames: 11090149376. Throughput: 0: 42863.1. Samples: 11090344140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 14:15:48,638][15401] Updated weights for policy 0, policy_version 676891 (0.0027) [2024-06-24 14:15:52,278][15401] Updated weights for policy 0, policy_version 676901 (0.0035) [2024-06-24 14:15:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11090378752. Throughput: 0: 42742.6. Samples: 11090462260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 14:15:54,960][15349] Signal inference workers to stop experience collection... (164200 times) [2024-06-24 14:15:54,993][15401] InferenceWorker_p0-w0: stopping experience collection (164200 times) [2024-06-24 14:15:55,073][15349] Signal inference workers to resume experience collection... (164200 times) [2024-06-24 14:15:55,074][15401] InferenceWorker_p0-w0: resuming experience collection (164200 times) [2024-06-24 14:15:56,193][15401] Updated weights for policy 0, policy_version 676911 (0.0035) [2024-06-24 14:15:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11090591744. Throughput: 0: 42732.6. Samples: 11090714320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:15:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 14:15:59,869][15401] Updated weights for policy 0, policy_version 676921 (0.0040) [2024-06-24 14:16:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11090788352. Throughput: 0: 42744.1. Samples: 11090972400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 14:16:04,042][15401] Updated weights for policy 0, policy_version 676931 (0.0037) [2024-06-24 14:16:07,434][15401] Updated weights for policy 0, policy_version 676941 (0.0038) [2024-06-24 14:16:08,391][15132] Fps is (10 sec: 42591.4, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 11091017728. Throughput: 0: 42807.8. Samples: 11091100600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:08,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 14:16:11,601][15401] Updated weights for policy 0, policy_version 676951 (0.0034) [2024-06-24 14:16:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 11091230720. Throughput: 0: 42547.7. Samples: 11091350560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 14:16:15,274][15401] Updated weights for policy 0, policy_version 676961 (0.0030) [2024-06-24 14:16:18,389][15132] Fps is (10 sec: 39328.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11091410944. Throughput: 0: 42662.7. Samples: 11091614180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 14:16:19,622][15401] Updated weights for policy 0, policy_version 676971 (0.0043) [2024-06-24 14:16:22,710][15401] Updated weights for policy 0, policy_version 676981 (0.0045) [2024-06-24 14:16:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11091656704. Throughput: 0: 42478.7. Samples: 11091733040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 14:16:27,240][15401] Updated weights for policy 0, policy_version 676991 (0.0024) [2024-06-24 14:16:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 11091853312. Throughput: 0: 42414.3. Samples: 11091990780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 14:16:30,706][15401] Updated weights for policy 0, policy_version 677001 (0.0034) [2024-06-24 14:16:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11092066304. Throughput: 0: 42274.1. Samples: 11092246480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:33,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 14:16:34,898][15401] Updated weights for policy 0, policy_version 677011 (0.0043) [2024-06-24 14:16:38,225][15401] Updated weights for policy 0, policy_version 677021 (0.0043) [2024-06-24 14:16:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11092312064. Throughput: 0: 42409.0. Samples: 11092370660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:38,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 14:16:42,856][15401] Updated weights for policy 0, policy_version 677031 (0.0036) [2024-06-24 14:16:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 11092508672. Throughput: 0: 42688.9. Samples: 11092635320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 14:16:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:16:45,746][15401] Updated weights for policy 0, policy_version 677041 (0.0041) [2024-06-24 14:16:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11092705280. Throughput: 0: 42570.6. Samples: 11092888080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:16:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:16:50,674][15401] Updated weights for policy 0, policy_version 677051 (0.0033) [2024-06-24 14:16:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11092951040. Throughput: 0: 42570.9. Samples: 11093016220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:16:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 14:16:53,416][15401] Updated weights for policy 0, policy_version 677061 (0.0030) [2024-06-24 14:16:58,210][15401] Updated weights for policy 0, policy_version 677071 (0.0029) [2024-06-24 14:16:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11093131264. Throughput: 0: 42648.4. Samples: 11093269740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:16:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 14:17:01,021][15401] Updated weights for policy 0, policy_version 677081 (0.0029) [2024-06-24 14:17:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11093344256. Throughput: 0: 42410.2. Samples: 11093522640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:03,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 14:17:04,539][15349] Signal inference workers to stop experience collection... (164250 times) [2024-06-24 14:17:04,575][15401] InferenceWorker_p0-w0: stopping experience collection (164250 times) [2024-06-24 14:17:04,600][15349] Signal inference workers to resume experience collection... (164250 times) [2024-06-24 14:17:04,600][15401] InferenceWorker_p0-w0: resuming experience collection (164250 times) [2024-06-24 14:17:05,835][15401] Updated weights for policy 0, policy_version 677091 (0.0037) [2024-06-24 14:17:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42872.6, 300 sec: 42653.9). Total num frames: 11093590016. Throughput: 0: 42699.5. Samples: 11093654520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:08,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 14:17:08,911][15401] Updated weights for policy 0, policy_version 677101 (0.0034) [2024-06-24 14:17:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11093770240. Throughput: 0: 42629.0. Samples: 11093909080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 14:17:13,848][15401] Updated weights for policy 0, policy_version 677111 (0.0026) [2024-06-24 14:17:16,639][15401] Updated weights for policy 0, policy_version 677121 (0.0032) [2024-06-24 14:17:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 11093999616. Throughput: 0: 42506.4. Samples: 11094159260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 14:17:21,587][15401] Updated weights for policy 0, policy_version 677131 (0.0047) [2024-06-24 14:17:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11094228992. Throughput: 0: 42786.2. Samples: 11094296040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 14:17:24,400][15401] Updated weights for policy 0, policy_version 677141 (0.0036) [2024-06-24 14:17:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11094409216. Throughput: 0: 42517.8. Samples: 11094548620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 14:17:29,006][15401] Updated weights for policy 0, policy_version 677151 (0.0036) [2024-06-24 14:17:32,525][15401] Updated weights for policy 0, policy_version 677161 (0.0029) [2024-06-24 14:17:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11094654976. Throughput: 0: 42450.6. Samples: 11094798360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 14:17:36,470][15401] Updated weights for policy 0, policy_version 677171 (0.0036) [2024-06-24 14:17:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11094851584. Throughput: 0: 42569.2. Samples: 11094931840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 14:17:40,038][15401] Updated weights for policy 0, policy_version 677181 (0.0033) [2024-06-24 14:17:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11095064576. Throughput: 0: 42552.8. Samples: 11095184620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 14:17:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677189_11095064576.pth... [2024-06-24 14:17:43,444][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676564_11084824576.pth [2024-06-24 14:17:44,423][15401] Updated weights for policy 0, policy_version 677191 (0.0028) [2024-06-24 14:17:47,596][15401] Updated weights for policy 0, policy_version 677201 (0.0038) [2024-06-24 14:17:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 11095293952. Throughput: 0: 42444.0. Samples: 11095432720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:48,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 14:17:52,023][15401] Updated weights for policy 0, policy_version 677211 (0.0030) [2024-06-24 14:17:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42599.3). Total num frames: 11095474176. Throughput: 0: 42477.8. Samples: 11095566020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 14:17:55,169][15401] Updated weights for policy 0, policy_version 677221 (0.0050) [2024-06-24 14:17:58,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 11095703552. Throughput: 0: 42447.0. Samples: 11095819200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:17:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 14:17:59,832][15401] Updated weights for policy 0, policy_version 677231 (0.0039) [2024-06-24 14:18:02,724][15401] Updated weights for policy 0, policy_version 677241 (0.0036) [2024-06-24 14:18:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11095932928. Throughput: 0: 42525.7. Samples: 11096072920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:18:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 14:18:07,579][15401] Updated weights for policy 0, policy_version 677251 (0.0042) [2024-06-24 14:18:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11096113152. Throughput: 0: 42439.1. Samples: 11096205800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:18:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 14:18:10,355][15401] Updated weights for policy 0, policy_version 677261 (0.0032) [2024-06-24 14:18:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11096342528. Throughput: 0: 42473.7. Samples: 11096459940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:18:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 14:18:15,077][15401] Updated weights for policy 0, policy_version 677271 (0.0028) [2024-06-24 14:18:18,159][15401] Updated weights for policy 0, policy_version 677281 (0.0031) [2024-06-24 14:18:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11096571904. Throughput: 0: 42432.2. Samples: 11096707800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:18:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 14:18:22,913][15401] Updated weights for policy 0, policy_version 677291 (0.0033) [2024-06-24 14:18:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11096752128. Throughput: 0: 42463.7. Samples: 11096842700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 14:18:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 14:18:25,759][15401] Updated weights for policy 0, policy_version 677301 (0.0032) [2024-06-24 14:18:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 11096981504. Throughput: 0: 42547.1. Samples: 11097099240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 14:18:30,545][15401] Updated weights for policy 0, policy_version 677311 (0.0028) [2024-06-24 14:18:31,448][15349] Signal inference workers to stop experience collection... (164300 times) [2024-06-24 14:18:31,449][15349] Signal inference workers to resume experience collection... (164300 times) [2024-06-24 14:18:31,504][15401] InferenceWorker_p0-w0: stopping experience collection (164300 times) [2024-06-24 14:18:31,504][15401] InferenceWorker_p0-w0: resuming experience collection (164300 times) [2024-06-24 14:18:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11097210880. Throughput: 0: 42706.2. Samples: 11097354400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 14:18:33,562][15401] Updated weights for policy 0, policy_version 677321 (0.0041) [2024-06-24 14:18:38,127][15401] Updated weights for policy 0, policy_version 677331 (0.0030) [2024-06-24 14:18:38,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 11097391104. Throughput: 0: 42695.6. Samples: 11097487420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:38,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 14:18:41,367][15401] Updated weights for policy 0, policy_version 677341 (0.0033) [2024-06-24 14:18:43,390][15132] Fps is (10 sec: 40956.1, 60 sec: 42597.8, 300 sec: 42598.2). Total num frames: 11097620480. Throughput: 0: 42723.6. Samples: 11097741800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:43,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:18:45,606][15401] Updated weights for policy 0, policy_version 677351 (0.0031) [2024-06-24 14:18:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 11097833472. Throughput: 0: 42756.9. Samples: 11097996980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 14:18:49,071][15401] Updated weights for policy 0, policy_version 677361 (0.0028) [2024-06-24 14:18:53,165][15401] Updated weights for policy 0, policy_version 677371 (0.0034) [2024-06-24 14:18:53,389][15132] Fps is (10 sec: 42602.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11098046464. Throughput: 0: 42635.6. Samples: 11098124400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 14:18:57,024][15401] Updated weights for policy 0, policy_version 677381 (0.0030) [2024-06-24 14:18:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11098259456. Throughput: 0: 42575.2. Samples: 11098375820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:18:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 14:19:00,740][15401] Updated weights for policy 0, policy_version 677391 (0.0040) [2024-06-24 14:19:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11098472448. Throughput: 0: 42811.6. Samples: 11098634320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 14:19:04,459][15401] Updated weights for policy 0, policy_version 677401 (0.0027) [2024-06-24 14:19:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11098685440. Throughput: 0: 42597.8. Samples: 11098759600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:08,396][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 14:19:08,801][15401] Updated weights for policy 0, policy_version 677411 (0.0036) [2024-06-24 14:19:12,124][15401] Updated weights for policy 0, policy_version 677421 (0.0047) [2024-06-24 14:19:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11098898432. Throughput: 0: 42497.4. Samples: 11099011620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 14:19:16,690][15401] Updated weights for policy 0, policy_version 677431 (0.0033) [2024-06-24 14:19:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 11099095040. Throughput: 0: 42590.3. Samples: 11099270960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 14:19:19,780][15401] Updated weights for policy 0, policy_version 677441 (0.0039) [2024-06-24 14:19:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11099308032. Throughput: 0: 42420.5. Samples: 11099396240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 14:19:24,185][15401] Updated weights for policy 0, policy_version 677451 (0.0034) [2024-06-24 14:19:27,284][15401] Updated weights for policy 0, policy_version 677461 (0.0032) [2024-06-24 14:19:28,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11099553792. Throughput: 0: 42444.2. Samples: 11099651760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 14:19:31,954][15401] Updated weights for policy 0, policy_version 677471 (0.0034) [2024-06-24 14:19:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11099717632. Throughput: 0: 42744.9. Samples: 11099920500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 14:19:35,118][15401] Updated weights for policy 0, policy_version 677481 (0.0034) [2024-06-24 14:19:38,392][15132] Fps is (10 sec: 40951.1, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 11099963392. Throughput: 0: 42614.1. Samples: 11100042140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:38,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 14:19:39,418][15401] Updated weights for policy 0, policy_version 677491 (0.0032) [2024-06-24 14:19:42,596][15401] Updated weights for policy 0, policy_version 677501 (0.0036) [2024-06-24 14:19:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42872.2, 300 sec: 42599.3). Total num frames: 11100192768. Throughput: 0: 42749.8. Samples: 11100299560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 14:19:43,460][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677503_11100209152.pth... [2024-06-24 14:19:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000676879_11089985536.pth [2024-06-24 14:19:46,930][15401] Updated weights for policy 0, policy_version 677511 (0.0029) [2024-06-24 14:19:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11100372992. Throughput: 0: 43067.8. Samples: 11100572380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 14:19:50,052][15401] Updated weights for policy 0, policy_version 677521 (0.0034) [2024-06-24 14:19:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11100618752. Throughput: 0: 42970.1. Samples: 11100693260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:53,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 14:19:54,834][15401] Updated weights for policy 0, policy_version 677531 (0.0029) [2024-06-24 14:19:57,735][15401] Updated weights for policy 0, policy_version 677541 (0.0030) [2024-06-24 14:19:58,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11100848128. Throughput: 0: 43073.3. Samples: 11100949920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:19:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 14:20:02,347][15401] Updated weights for policy 0, policy_version 677551 (0.0026) [2024-06-24 14:20:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11101011968. Throughput: 0: 43136.0. Samples: 11101212080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 14:20:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 14:20:04,033][15349] Signal inference workers to stop experience collection... (164350 times) [2024-06-24 14:20:04,033][15349] Signal inference workers to resume experience collection... (164350 times) [2024-06-24 14:20:04,048][15401] InferenceWorker_p0-w0: stopping experience collection (164350 times) [2024-06-24 14:20:04,077][15401] InferenceWorker_p0-w0: resuming experience collection (164350 times) [2024-06-24 14:20:05,372][15401] Updated weights for policy 0, policy_version 677561 (0.0041) [2024-06-24 14:20:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11101257728. Throughput: 0: 43074.7. Samples: 11101334600. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 14:20:09,948][15401] Updated weights for policy 0, policy_version 677571 (0.0044) [2024-06-24 14:20:13,023][15401] Updated weights for policy 0, policy_version 677581 (0.0036) [2024-06-24 14:20:13,390][15132] Fps is (10 sec: 47512.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11101487104. Throughput: 0: 43115.6. Samples: 11101591960. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 14:20:17,518][15401] Updated weights for policy 0, policy_version 677591 (0.0049) [2024-06-24 14:20:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11101650944. Throughput: 0: 42936.9. Samples: 11101852660. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 14:20:20,848][15401] Updated weights for policy 0, policy_version 677601 (0.0039) [2024-06-24 14:20:23,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 11101913088. Throughput: 0: 43027.2. Samples: 11101978260. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 14:20:24,906][15401] Updated weights for policy 0, policy_version 677611 (0.0033) [2024-06-24 14:20:28,166][15401] Updated weights for policy 0, policy_version 677621 (0.0029) [2024-06-24 14:20:28,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 11102142464. Throughput: 0: 43216.3. Samples: 11102244300. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 14:20:32,306][15401] Updated weights for policy 0, policy_version 677631 (0.0028) [2024-06-24 14:20:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11102322688. Throughput: 0: 42989.9. Samples: 11102506920. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 14:20:36,065][15401] Updated weights for policy 0, policy_version 677641 (0.0044) [2024-06-24 14:20:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 11102552064. Throughput: 0: 42985.0. Samples: 11102627580. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 14:20:40,253][15401] Updated weights for policy 0, policy_version 677651 (0.0036) [2024-06-24 14:20:43,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 11102781440. Throughput: 0: 43164.7. Samples: 11102892340. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 14:20:43,569][15401] Updated weights for policy 0, policy_version 677661 (0.0031) [2024-06-24 14:20:47,769][15401] Updated weights for policy 0, policy_version 677671 (0.0031) [2024-06-24 14:20:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 11102978048. Throughput: 0: 43091.6. Samples: 11103151200. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 14:20:51,129][15401] Updated weights for policy 0, policy_version 677681 (0.0031) [2024-06-24 14:20:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11103207424. Throughput: 0: 43086.2. Samples: 11103273480. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 14:20:55,389][15401] Updated weights for policy 0, policy_version 677691 (0.0039) [2024-06-24 14:20:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11103420416. Throughput: 0: 43326.0. Samples: 11103541620. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:20:58,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 14:20:58,752][15401] Updated weights for policy 0, policy_version 677701 (0.0037) [2024-06-24 14:21:02,998][15401] Updated weights for policy 0, policy_version 677711 (0.0031) [2024-06-24 14:21:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 42765.2). Total num frames: 11103633408. Throughput: 0: 43243.1. Samples: 11103798600. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:03,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 14:21:04,157][15349] Signal inference workers to stop experience collection... (164400 times) [2024-06-24 14:21:04,183][15401] InferenceWorker_p0-w0: stopping experience collection (164400 times) [2024-06-24 14:21:04,211][15349] Signal inference workers to resume experience collection... (164400 times) [2024-06-24 14:21:04,216][15401] InferenceWorker_p0-w0: resuming experience collection (164400 times) [2024-06-24 14:21:06,281][15401] Updated weights for policy 0, policy_version 677721 (0.0038) [2024-06-24 14:21:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11103846400. Throughput: 0: 43279.0. Samples: 11103925820. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 14:21:10,591][15401] Updated weights for policy 0, policy_version 677731 (0.0030) [2024-06-24 14:21:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 11104059392. Throughput: 0: 43162.3. Samples: 11104186600. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:21:14,187][15401] Updated weights for policy 0, policy_version 677741 (0.0027) [2024-06-24 14:21:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11104256000. Throughput: 0: 42937.3. Samples: 11104439100. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 14:21:18,751][15401] Updated weights for policy 0, policy_version 677751 (0.0028) [2024-06-24 14:21:21,759][15401] Updated weights for policy 0, policy_version 677761 (0.0036) [2024-06-24 14:21:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11104501760. Throughput: 0: 43057.7. Samples: 11104565180. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:23,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 14:21:26,377][15401] Updated weights for policy 0, policy_version 677771 (0.0034) [2024-06-24 14:21:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11104681984. Throughput: 0: 43013.6. Samples: 11104827940. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 14:21:29,477][15401] Updated weights for policy 0, policy_version 677781 (0.0029) [2024-06-24 14:21:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11104894976. Throughput: 0: 42807.9. Samples: 11105077560. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 14:21:33,841][15401] Updated weights for policy 0, policy_version 677791 (0.0037) [2024-06-24 14:21:37,018][15401] Updated weights for policy 0, policy_version 677801 (0.0026) [2024-06-24 14:21:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11105140736. Throughput: 0: 42893.6. Samples: 11105203700. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:38,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 14:21:41,912][15401] Updated weights for policy 0, policy_version 677811 (0.0040) [2024-06-24 14:21:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11105320960. Throughput: 0: 42798.2. Samples: 11105467540. Policy #0 lag: (min: 2.0, avg: 13.0, max: 27.0) [2024-06-24 14:21:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:21:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677815_11105320960.pth... [2024-06-24 14:21:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677189_11095064576.pth [2024-06-24 14:21:44,589][15401] Updated weights for policy 0, policy_version 677821 (0.0030) [2024-06-24 14:21:48,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11105533952. Throughput: 0: 42694.4. Samples: 11105719840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:21:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 14:21:49,530][15401] Updated weights for policy 0, policy_version 677831 (0.0035) [2024-06-24 14:21:52,215][15401] Updated weights for policy 0, policy_version 677841 (0.0026) [2024-06-24 14:21:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11105779712. Throughput: 0: 42780.0. Samples: 11105850920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:21:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 14:21:57,179][15401] Updated weights for policy 0, policy_version 677851 (0.0037) [2024-06-24 14:21:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11105976320. Throughput: 0: 42699.0. Samples: 11106108060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:21:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 14:22:00,067][15401] Updated weights for policy 0, policy_version 677861 (0.0032) [2024-06-24 14:22:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11106189312. Throughput: 0: 42573.7. Samples: 11106354920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:03,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 14:22:04,676][15401] Updated weights for policy 0, policy_version 677871 (0.0028) [2024-06-24 14:22:07,655][15401] Updated weights for policy 0, policy_version 677881 (0.0032) [2024-06-24 14:22:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11106435072. Throughput: 0: 42681.4. Samples: 11106485840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 14:22:10,428][15349] Signal inference workers to stop experience collection... (164450 times) [2024-06-24 14:22:10,469][15401] InferenceWorker_p0-w0: stopping experience collection (164450 times) [2024-06-24 14:22:10,477][15349] Signal inference workers to resume experience collection... (164450 times) [2024-06-24 14:22:10,500][15401] InferenceWorker_p0-w0: resuming experience collection (164450 times) [2024-06-24 14:22:12,199][15401] Updated weights for policy 0, policy_version 677891 (0.0026) [2024-06-24 14:22:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11106598912. Throughput: 0: 42671.1. Samples: 11106748140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 14:22:15,311][15401] Updated weights for policy 0, policy_version 677901 (0.0041) [2024-06-24 14:22:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11106828288. Throughput: 0: 42831.0. Samples: 11107005060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:18,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 14:22:19,582][15401] Updated weights for policy 0, policy_version 677911 (0.0032) [2024-06-24 14:22:22,700][15401] Updated weights for policy 0, policy_version 677921 (0.0037) [2024-06-24 14:22:23,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11107090432. Throughput: 0: 43032.2. Samples: 11107140140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 14:22:27,055][15401] Updated weights for policy 0, policy_version 677931 (0.0035) [2024-06-24 14:22:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11107254272. Throughput: 0: 42794.3. Samples: 11107393280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 14:22:30,542][15401] Updated weights for policy 0, policy_version 677941 (0.0036) [2024-06-24 14:22:33,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11107467264. Throughput: 0: 42994.1. Samples: 11107654580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:33,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 14:22:34,478][15401] Updated weights for policy 0, policy_version 677951 (0.0038) [2024-06-24 14:22:38,171][15401] Updated weights for policy 0, policy_version 677961 (0.0031) [2024-06-24 14:22:38,396][15132] Fps is (10 sec: 45845.5, 60 sec: 42867.0, 300 sec: 42875.2). Total num frames: 11107713024. Throughput: 0: 42901.5. Samples: 11107781760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:38,396][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 14:22:42,451][15401] Updated weights for policy 0, policy_version 677971 (0.0032) [2024-06-24 14:22:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 11107909632. Throughput: 0: 42902.7. Samples: 11108038680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:43,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 14:22:45,854][15401] Updated weights for policy 0, policy_version 677981 (0.0040) [2024-06-24 14:22:48,389][15132] Fps is (10 sec: 40986.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11108122624. Throughput: 0: 43149.9. Samples: 11108296660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 14:22:50,034][15401] Updated weights for policy 0, policy_version 677991 (0.0038) [2024-06-24 14:22:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11108352000. Throughput: 0: 43088.1. Samples: 11108424800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 14:22:53,398][15401] Updated weights for policy 0, policy_version 678001 (0.0029) [2024-06-24 14:22:57,845][15401] Updated weights for policy 0, policy_version 678011 (0.0034) [2024-06-24 14:22:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11108564992. Throughput: 0: 43139.9. Samples: 11108689540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:22:58,393][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 14:23:01,215][15401] Updated weights for policy 0, policy_version 678021 (0.0032) [2024-06-24 14:23:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11108777984. Throughput: 0: 42916.4. Samples: 11108936200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:23:03,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 14:23:05,407][15401] Updated weights for policy 0, policy_version 678031 (0.0038) [2024-06-24 14:23:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11108990976. Throughput: 0: 42857.3. Samples: 11109068720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:23:08,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 14:23:08,765][15401] Updated weights for policy 0, policy_version 678041 (0.0025) [2024-06-24 14:23:12,958][15401] Updated weights for policy 0, policy_version 678051 (0.0029) [2024-06-24 14:23:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11109187584. Throughput: 0: 43003.9. Samples: 11109328460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:23:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 14:23:16,270][15401] Updated weights for policy 0, policy_version 678061 (0.0023) [2024-06-24 14:23:18,393][15132] Fps is (10 sec: 42582.2, 60 sec: 43143.5, 300 sec: 42931.1). Total num frames: 11109416960. Throughput: 0: 42880.4. Samples: 11109584360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-24 14:23:18,394][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 14:23:20,730][15401] Updated weights for policy 0, policy_version 678071 (0.0038) [2024-06-24 14:23:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 11109646336. Throughput: 0: 43046.9. Samples: 11109718600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 14:23:23,914][15401] Updated weights for policy 0, policy_version 678081 (0.0042) [2024-06-24 14:23:28,312][15401] Updated weights for policy 0, policy_version 678091 (0.0022) [2024-06-24 14:23:28,390][15132] Fps is (10 sec: 42614.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11109842944. Throughput: 0: 43057.7. Samples: 11109976280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:28,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 14:23:31,422][15401] Updated weights for policy 0, policy_version 678101 (0.0034) [2024-06-24 14:23:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 11110072320. Throughput: 0: 43052.0. Samples: 11110234000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:33,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 14:23:35,893][15401] Updated weights for policy 0, policy_version 678111 (0.0039) [2024-06-24 14:23:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42876.1, 300 sec: 42931.8). Total num frames: 11110285312. Throughput: 0: 43037.8. Samples: 11110361500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 14:23:38,895][15401] Updated weights for policy 0, policy_version 678121 (0.0036) [2024-06-24 14:23:43,369][15401] Updated weights for policy 0, policy_version 678131 (0.0043) [2024-06-24 14:23:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11110498304. Throughput: 0: 43111.2. Samples: 11110629440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 14:23:43,531][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678132_11110514688.pth... [2024-06-24 14:23:43,585][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677503_11100209152.pth [2024-06-24 14:23:46,321][15401] Updated weights for policy 0, policy_version 678141 (0.0029) [2024-06-24 14:23:48,145][15349] Signal inference workers to stop experience collection... (164500 times) [2024-06-24 14:23:48,149][15349] Signal inference workers to resume experience collection... (164500 times) [2024-06-24 14:23:48,180][15401] InferenceWorker_p0-w0: stopping experience collection (164500 times) [2024-06-24 14:23:48,180][15401] InferenceWorker_p0-w0: resuming experience collection (164500 times) [2024-06-24 14:23:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 11110727680. Throughput: 0: 43288.2. Samples: 11110884160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:48,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 14:23:50,914][15401] Updated weights for policy 0, policy_version 678151 (0.0035) [2024-06-24 14:23:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 11110957056. Throughput: 0: 43136.4. Samples: 11111009860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:53,391][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 14:23:54,288][15401] Updated weights for policy 0, policy_version 678161 (0.0036) [2024-06-24 14:23:58,333][15401] Updated weights for policy 0, policy_version 678171 (0.0029) [2024-06-24 14:23:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 11111153664. Throughput: 0: 43290.6. Samples: 11111276540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:23:58,392][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 14:24:01,783][15401] Updated weights for policy 0, policy_version 678181 (0.0038) [2024-06-24 14:24:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 11111350272. Throughput: 0: 43168.5. Samples: 11111526780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 14:24:06,342][15401] Updated weights for policy 0, policy_version 678191 (0.0038) [2024-06-24 14:24:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 11111596032. Throughput: 0: 43169.4. Samples: 11111661220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 14:24:09,264][15401] Updated weights for policy 0, policy_version 678201 (0.0036) [2024-06-24 14:24:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11111759872. Throughput: 0: 43237.4. Samples: 11111921960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 14:24:13,885][15401] Updated weights for policy 0, policy_version 678211 (0.0035) [2024-06-24 14:24:16,907][15401] Updated weights for policy 0, policy_version 678221 (0.0025) [2024-06-24 14:24:18,392][15132] Fps is (10 sec: 40951.9, 60 sec: 43145.8, 300 sec: 43042.4). Total num frames: 11112005632. Throughput: 0: 43078.0. Samples: 11112172600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:18,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 14:24:21,452][15401] Updated weights for policy 0, policy_version 678231 (0.0027) [2024-06-24 14:24:23,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11112235008. Throughput: 0: 43254.2. Samples: 11112307940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 14:24:25,024][15401] Updated weights for policy 0, policy_version 678241 (0.0036) [2024-06-24 14:24:28,390][15132] Fps is (10 sec: 40967.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 11112415232. Throughput: 0: 42845.7. Samples: 11112557500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 14:24:29,052][15401] Updated weights for policy 0, policy_version 678251 (0.0028) [2024-06-24 14:24:32,620][15401] Updated weights for policy 0, policy_version 678261 (0.0036) [2024-06-24 14:24:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 43043.0). Total num frames: 11112660992. Throughput: 0: 42686.1. Samples: 11112805040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 14:24:36,660][15401] Updated weights for policy 0, policy_version 678271 (0.0035) [2024-06-24 14:24:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11112873984. Throughput: 0: 43004.4. Samples: 11112945060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:38,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 14:24:40,320][15401] Updated weights for policy 0, policy_version 678281 (0.0037) [2024-06-24 14:24:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 11113054208. Throughput: 0: 42815.3. Samples: 11113203220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 14:24:44,207][15401] Updated weights for policy 0, policy_version 678291 (0.0033) [2024-06-24 14:24:48,020][15401] Updated weights for policy 0, policy_version 678301 (0.0043) [2024-06-24 14:24:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11113316352. Throughput: 0: 42883.1. Samples: 11113456520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 14:24:51,779][15401] Updated weights for policy 0, policy_version 678311 (0.0034) [2024-06-24 14:24:53,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11113512960. Throughput: 0: 42924.5. Samples: 11113592820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 14:24:54,992][15349] Signal inference workers to stop experience collection... (164550 times) [2024-06-24 14:24:55,048][15401] InferenceWorker_p0-w0: stopping experience collection (164550 times) [2024-06-24 14:24:55,103][15349] Signal inference workers to resume experience collection... (164550 times) [2024-06-24 14:24:55,103][15401] InferenceWorker_p0-w0: resuming experience collection (164550 times) [2024-06-24 14:24:55,506][15401] Updated weights for policy 0, policy_version 678321 (0.0041) [2024-06-24 14:24:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 11113709568. Throughput: 0: 42656.5. Samples: 11113841500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 14:24:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 14:24:59,375][15401] Updated weights for policy 0, policy_version 678331 (0.0042) [2024-06-24 14:25:03,018][15401] Updated weights for policy 0, policy_version 678341 (0.0032) [2024-06-24 14:25:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11113938944. Throughput: 0: 42861.9. Samples: 11114101300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 14:25:06,916][15401] Updated weights for policy 0, policy_version 678351 (0.0047) [2024-06-24 14:25:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11114168320. Throughput: 0: 42860.8. Samples: 11114236680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 14:25:10,592][15401] Updated weights for policy 0, policy_version 678361 (0.0020) [2024-06-24 14:25:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11114348544. Throughput: 0: 42872.9. Samples: 11114486780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:13,396][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 14:25:14,642][15401] Updated weights for policy 0, policy_version 678371 (0.0028) [2024-06-24 14:25:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42872.8, 300 sec: 42931.6). Total num frames: 11114577920. Throughput: 0: 43122.2. Samples: 11114745540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 14:25:18,493][15401] Updated weights for policy 0, policy_version 678381 (0.0030) [2024-06-24 14:25:22,205][15401] Updated weights for policy 0, policy_version 678391 (0.0037) [2024-06-24 14:25:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 11114807296. Throughput: 0: 43006.8. Samples: 11114880360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 14:25:25,931][15401] Updated weights for policy 0, policy_version 678401 (0.0021) [2024-06-24 14:25:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11115003904. Throughput: 0: 42890.0. Samples: 11115133280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 14:25:30,159][15401] Updated weights for policy 0, policy_version 678411 (0.0033) [2024-06-24 14:25:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11115233280. Throughput: 0: 42933.3. Samples: 11115388520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 14:25:33,460][15401] Updated weights for policy 0, policy_version 678421 (0.0028) [2024-06-24 14:25:37,743][15401] Updated weights for policy 0, policy_version 678431 (0.0040) [2024-06-24 14:25:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 11115446272. Throughput: 0: 42838.6. Samples: 11115520560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 14:25:41,580][15401] Updated weights for policy 0, policy_version 678441 (0.0038) [2024-06-24 14:25:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11115626496. Throughput: 0: 42714.8. Samples: 11115763660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 14:25:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678446_11115659264.pth... [2024-06-24 14:25:43,590][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000677815_11105320960.pth [2024-06-24 14:25:45,464][15401] Updated weights for policy 0, policy_version 678451 (0.0033) [2024-06-24 14:25:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 11115855872. Throughput: 0: 42826.6. Samples: 11116028500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 14:25:49,172][15401] Updated weights for policy 0, policy_version 678461 (0.0044) [2024-06-24 14:25:53,071][15401] Updated weights for policy 0, policy_version 678471 (0.0029) [2024-06-24 14:25:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11116068864. Throughput: 0: 42585.9. Samples: 11116153040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 14:25:56,959][15401] Updated weights for policy 0, policy_version 678481 (0.0034) [2024-06-24 14:25:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11116265472. Throughput: 0: 42632.9. Samples: 11116405260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:25:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 14:26:00,666][15401] Updated weights for policy 0, policy_version 678491 (0.0025) [2024-06-24 14:26:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11116478464. Throughput: 0: 42774.3. Samples: 11116670380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:03,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 14:26:04,545][15401] Updated weights for policy 0, policy_version 678501 (0.0037) [2024-06-24 14:26:08,345][15401] Updated weights for policy 0, policy_version 678511 (0.0049) [2024-06-24 14:26:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11116724224. Throughput: 0: 42490.5. Samples: 11116792440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 14:26:08,918][15349] Signal inference workers to stop experience collection... (164600 times) [2024-06-24 14:26:08,920][15349] Signal inference workers to resume experience collection... (164600 times) [2024-06-24 14:26:08,955][15401] InferenceWorker_p0-w0: stopping experience collection (164600 times) [2024-06-24 14:26:08,955][15401] InferenceWorker_p0-w0: resuming experience collection (164600 times) [2024-06-24 14:26:12,039][15401] Updated weights for policy 0, policy_version 678521 (0.0035) [2024-06-24 14:26:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11116920832. Throughput: 0: 42420.9. Samples: 11117042220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:26:16,381][15401] Updated weights for policy 0, policy_version 678531 (0.0029) [2024-06-24 14:26:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11117133824. Throughput: 0: 42565.0. Samples: 11117303940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 14:26:20,069][15401] Updated weights for policy 0, policy_version 678541 (0.0041) [2024-06-24 14:26:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.2, 300 sec: 42987.1). Total num frames: 11117363200. Throughput: 0: 42374.1. Samples: 11117427400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 14:26:24,075][15401] Updated weights for policy 0, policy_version 678551 (0.0041) [2024-06-24 14:26:27,736][15401] Updated weights for policy 0, policy_version 678561 (0.0033) [2024-06-24 14:26:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 11117576192. Throughput: 0: 42701.7. Samples: 11117685340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:28,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 14:26:31,651][15401] Updated weights for policy 0, policy_version 678571 (0.0031) [2024-06-24 14:26:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11117772800. Throughput: 0: 42496.9. Samples: 11117940860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:33,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 14:26:35,290][15401] Updated weights for policy 0, policy_version 678581 (0.0032) [2024-06-24 14:26:38,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 11118018560. Throughput: 0: 42612.9. Samples: 11118070620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:26:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 14:26:39,415][15401] Updated weights for policy 0, policy_version 678591 (0.0033) [2024-06-24 14:26:42,966][15401] Updated weights for policy 0, policy_version 678601 (0.0029) [2024-06-24 14:26:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 11118215168. Throughput: 0: 42800.9. Samples: 11118331300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:26:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 14:26:46,949][15401] Updated weights for policy 0, policy_version 678611 (0.0031) [2024-06-24 14:26:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11118411776. Throughput: 0: 42704.9. Samples: 11118592100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:26:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 14:26:51,096][15401] Updated weights for policy 0, policy_version 678621 (0.0027) [2024-06-24 14:26:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 11118657536. Throughput: 0: 42832.1. Samples: 11118719880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:26:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 14:26:54,633][15401] Updated weights for policy 0, policy_version 678631 (0.0035) [2024-06-24 14:26:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 11118837760. Throughput: 0: 42955.9. Samples: 11118975240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:26:58,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 14:26:58,826][15401] Updated weights for policy 0, policy_version 678641 (0.0024) [2024-06-24 14:27:02,293][15401] Updated weights for policy 0, policy_version 678651 (0.0029) [2024-06-24 14:27:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11119067136. Throughput: 0: 42752.4. Samples: 11119227800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:03,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 14:27:06,365][15401] Updated weights for policy 0, policy_version 678661 (0.0037) [2024-06-24 14:27:08,390][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 11119296512. Throughput: 0: 42969.9. Samples: 11119361040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 14:27:09,917][15401] Updated weights for policy 0, policy_version 678671 (0.0045) [2024-06-24 14:27:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 11119493120. Throughput: 0: 43040.9. Samples: 11119622080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:13,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 14:27:13,773][15401] Updated weights for policy 0, policy_version 678681 (0.0035) [2024-06-24 14:27:17,445][15401] Updated weights for policy 0, policy_version 678691 (0.0031) [2024-06-24 14:27:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11119722496. Throughput: 0: 42976.9. Samples: 11119874920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:18,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 14:27:21,173][15401] Updated weights for policy 0, policy_version 678701 (0.0027) [2024-06-24 14:27:23,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 11119951872. Throughput: 0: 43096.0. Samples: 11120009940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 14:27:25,014][15401] Updated weights for policy 0, policy_version 678711 (0.0031) [2024-06-24 14:27:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 11120148480. Throughput: 0: 43205.4. Samples: 11120275540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 14:27:28,474][15401] Updated weights for policy 0, policy_version 678721 (0.0037) [2024-06-24 14:27:32,553][15401] Updated weights for policy 0, policy_version 678731 (0.0041) [2024-06-24 14:27:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42932.6). Total num frames: 11120377856. Throughput: 0: 42938.1. Samples: 11120524320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 14:27:35,700][15349] Signal inference workers to stop experience collection... (164650 times) [2024-06-24 14:27:35,701][15349] Signal inference workers to resume experience collection... (164650 times) [2024-06-24 14:27:35,724][15401] InferenceWorker_p0-w0: stopping experience collection (164650 times) [2024-06-24 14:27:35,724][15401] InferenceWorker_p0-w0: resuming experience collection (164650 times) [2024-06-24 14:27:35,997][15401] Updated weights for policy 0, policy_version 678741 (0.0030) [2024-06-24 14:27:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 11120590848. Throughput: 0: 42971.1. Samples: 11120653580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 14:27:39,955][15401] Updated weights for policy 0, policy_version 678751 (0.0037) [2024-06-24 14:27:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11120787456. Throughput: 0: 43090.0. Samples: 11120914280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 14:27:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678759_11120787456.pth... [2024-06-24 14:27:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678132_11110514688.pth [2024-06-24 14:27:43,969][15401] Updated weights for policy 0, policy_version 678761 (0.0036) [2024-06-24 14:27:47,883][15401] Updated weights for policy 0, policy_version 678771 (0.0035) [2024-06-24 14:27:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11121000448. Throughput: 0: 43158.7. Samples: 11121169940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 14:27:51,606][15401] Updated weights for policy 0, policy_version 678781 (0.0033) [2024-06-24 14:27:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 11121229824. Throughput: 0: 43108.0. Samples: 11121300900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 14:27:55,721][15401] Updated weights for policy 0, policy_version 678791 (0.0026) [2024-06-24 14:27:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 11121426432. Throughput: 0: 42911.3. Samples: 11121553080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:27:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 14:27:59,307][15401] Updated weights for policy 0, policy_version 678801 (0.0033) [2024-06-24 14:28:03,217][15401] Updated weights for policy 0, policy_version 678811 (0.0038) [2024-06-24 14:28:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11121639424. Throughput: 0: 43014.8. Samples: 11121810480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:28:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 14:28:06,759][15401] Updated weights for policy 0, policy_version 678821 (0.0025) [2024-06-24 14:28:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11121868800. Throughput: 0: 42883.0. Samples: 11121939680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:28:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 14:28:10,796][15401] Updated weights for policy 0, policy_version 678831 (0.0032) [2024-06-24 14:28:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.6). Total num frames: 11122065408. Throughput: 0: 42680.4. Samples: 11122196160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:28:13,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 14:28:14,752][15401] Updated weights for policy 0, policy_version 678841 (0.0044) [2024-06-24 14:28:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 11122294784. Throughput: 0: 42701.5. Samples: 11122445880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:28:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 14:28:18,397][15401] Updated weights for policy 0, policy_version 678851 (0.0047) [2024-06-24 14:28:22,135][15401] Updated weights for policy 0, policy_version 678861 (0.0026) [2024-06-24 14:28:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 11122507776. Throughput: 0: 42904.1. Samples: 11122584260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 14:28:25,854][15401] Updated weights for policy 0, policy_version 678871 (0.0031) [2024-06-24 14:28:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11122688000. Throughput: 0: 42702.6. Samples: 11122835900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:28,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 14:28:30,081][15401] Updated weights for policy 0, policy_version 678881 (0.0032) [2024-06-24 14:28:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11122933760. Throughput: 0: 42608.7. Samples: 11123087340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 14:28:33,595][15401] Updated weights for policy 0, policy_version 678891 (0.0023) [2024-06-24 14:28:37,574][15401] Updated weights for policy 0, policy_version 678901 (0.0039) [2024-06-24 14:28:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11123146752. Throughput: 0: 42807.1. Samples: 11123227220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 14:28:41,257][15401] Updated weights for policy 0, policy_version 678911 (0.0040) [2024-06-24 14:28:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11123343360. Throughput: 0: 42780.8. Samples: 11123478220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 14:28:45,350][15401] Updated weights for policy 0, policy_version 678921 (0.0038) [2024-06-24 14:28:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 11123589120. Throughput: 0: 42667.8. Samples: 11123730640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:48,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 14:28:48,969][15401] Updated weights for policy 0, policy_version 678931 (0.0037) [2024-06-24 14:28:52,976][15401] Updated weights for policy 0, policy_version 678941 (0.0036) [2024-06-24 14:28:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 11123802112. Throughput: 0: 42688.7. Samples: 11123860680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 14:28:56,479][15401] Updated weights for policy 0, policy_version 678951 (0.0054) [2024-06-24 14:28:58,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 11123998720. Throughput: 0: 42723.5. Samples: 11124118820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:28:58,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 14:29:00,637][15401] Updated weights for policy 0, policy_version 678961 (0.0031) [2024-06-24 14:29:00,658][15349] Signal inference workers to stop experience collection... (164700 times) [2024-06-24 14:29:00,708][15401] InferenceWorker_p0-w0: stopping experience collection (164700 times) [2024-06-24 14:29:00,768][15349] Signal inference workers to resume experience collection... (164700 times) [2024-06-24 14:29:00,768][15401] InferenceWorker_p0-w0: resuming experience collection (164700 times) [2024-06-24 14:29:03,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11124228096. Throughput: 0: 42911.0. Samples: 11124376880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 14:29:03,970][15401] Updated weights for policy 0, policy_version 678971 (0.0026) [2024-06-24 14:29:08,174][15401] Updated weights for policy 0, policy_version 678981 (0.0047) [2024-06-24 14:29:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11124441088. Throughput: 0: 42716.5. Samples: 11124506500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 14:29:11,566][15401] Updated weights for policy 0, policy_version 678991 (0.0035) [2024-06-24 14:29:13,390][15132] Fps is (10 sec: 40957.4, 60 sec: 42871.1, 300 sec: 42820.8). Total num frames: 11124637696. Throughput: 0: 42879.0. Samples: 11124765480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 14:29:15,766][15401] Updated weights for policy 0, policy_version 679001 (0.0029) [2024-06-24 14:29:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11124867072. Throughput: 0: 42994.0. Samples: 11125022060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 14:29:19,096][15401] Updated weights for policy 0, policy_version 679011 (0.0029) [2024-06-24 14:29:23,390][15132] Fps is (10 sec: 42600.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11125063680. Throughput: 0: 42753.2. Samples: 11125151120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 14:29:23,443][15401] Updated weights for policy 0, policy_version 679021 (0.0047) [2024-06-24 14:29:26,849][15401] Updated weights for policy 0, policy_version 679031 (0.0029) [2024-06-24 14:29:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11125276672. Throughput: 0: 42796.4. Samples: 11125404060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 14:29:30,927][15401] Updated weights for policy 0, policy_version 679041 (0.0035) [2024-06-24 14:29:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11125506048. Throughput: 0: 43000.1. Samples: 11125665540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:33,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 14:29:34,766][15401] Updated weights for policy 0, policy_version 679051 (0.0042) [2024-06-24 14:29:38,392][15401] Updated weights for policy 0, policy_version 679061 (0.0027) [2024-06-24 14:29:38,394][15132] Fps is (10 sec: 45855.2, 60 sec: 43141.4, 300 sec: 42986.5). Total num frames: 11125735424. Throughput: 0: 43088.9. Samples: 11125799860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:38,394][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 14:29:42,611][15401] Updated weights for policy 0, policy_version 679071 (0.0034) [2024-06-24 14:29:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11125899264. Throughput: 0: 42995.6. Samples: 11126053520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:43,391][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 14:29:43,547][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679072_11125915648.pth... [2024-06-24 14:29:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678446_11115659264.pth [2024-06-24 14:29:46,245][15401] Updated weights for policy 0, policy_version 679081 (0.0046) [2024-06-24 14:29:48,390][15132] Fps is (10 sec: 40977.8, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 11126145024. Throughput: 0: 42832.8. Samples: 11126304360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 14:29:50,372][15401] Updated weights for policy 0, policy_version 679091 (0.0029) [2024-06-24 14:29:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11126358016. Throughput: 0: 42993.1. Samples: 11126441200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:53,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 14:29:53,689][15401] Updated weights for policy 0, policy_version 679101 (0.0034) [2024-06-24 14:29:58,088][15401] Updated weights for policy 0, policy_version 679111 (0.0029) [2024-06-24 14:29:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11126554624. Throughput: 0: 42836.5. Samples: 11126693100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 14:29:58,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-24 14:30:01,701][15401] Updated weights for policy 0, policy_version 679121 (0.0042) [2024-06-24 14:30:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11126784000. Throughput: 0: 42811.9. Samples: 11126948600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:03,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 14:30:05,751][15401] Updated weights for policy 0, policy_version 679131 (0.0029) [2024-06-24 14:30:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11126996992. Throughput: 0: 42794.4. Samples: 11127076860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 14:30:09,309][15401] Updated weights for policy 0, policy_version 679141 (0.0046) [2024-06-24 14:30:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 11127193600. Throughput: 0: 42815.5. Samples: 11127330760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 14:30:13,458][15401] Updated weights for policy 0, policy_version 679151 (0.0035) [2024-06-24 14:30:14,644][15349] Signal inference workers to stop experience collection... (164750 times) [2024-06-24 14:30:14,644][15349] Signal inference workers to resume experience collection... (164750 times) [2024-06-24 14:30:14,691][15401] InferenceWorker_p0-w0: stopping experience collection (164750 times) [2024-06-24 14:30:14,691][15401] InferenceWorker_p0-w0: resuming experience collection (164750 times) [2024-06-24 14:30:16,994][15401] Updated weights for policy 0, policy_version 679161 (0.0034) [2024-06-24 14:30:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11127422976. Throughput: 0: 42698.2. Samples: 11127586960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 14:30:21,026][15401] Updated weights for policy 0, policy_version 679171 (0.0028) [2024-06-24 14:30:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11127603200. Throughput: 0: 42518.7. Samples: 11127713020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 14:30:24,730][15401] Updated weights for policy 0, policy_version 679181 (0.0050) [2024-06-24 14:30:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11127848960. Throughput: 0: 42461.5. Samples: 11127964280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 14:30:28,469][15401] Updated weights for policy 0, policy_version 679191 (0.0033) [2024-06-24 14:30:32,497][15401] Updated weights for policy 0, policy_version 679201 (0.0035) [2024-06-24 14:30:33,392][15132] Fps is (10 sec: 47502.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11128078336. Throughput: 0: 42466.3. Samples: 11128215440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:33,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 14:30:36,571][15401] Updated weights for policy 0, policy_version 679211 (0.0024) [2024-06-24 14:30:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42055.4, 300 sec: 42820.6). Total num frames: 11128258560. Throughput: 0: 42360.2. Samples: 11128347400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 14:30:40,106][15401] Updated weights for policy 0, policy_version 679221 (0.0043) [2024-06-24 14:30:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11128487936. Throughput: 0: 42639.7. Samples: 11128611880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 14:30:44,026][15401] Updated weights for policy 0, policy_version 679231 (0.0041) [2024-06-24 14:30:47,603][15401] Updated weights for policy 0, policy_version 679241 (0.0039) [2024-06-24 14:30:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11128717312. Throughput: 0: 42449.4. Samples: 11128858820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 14:30:51,682][15401] Updated weights for policy 0, policy_version 679251 (0.0032) [2024-06-24 14:30:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11128913920. Throughput: 0: 42521.7. Samples: 11128990340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 14:30:55,523][15401] Updated weights for policy 0, policy_version 679261 (0.0047) [2024-06-24 14:30:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11129126912. Throughput: 0: 42610.8. Samples: 11129248240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:30:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 14:30:59,155][15401] Updated weights for policy 0, policy_version 679271 (0.0048) [2024-06-24 14:31:03,117][15401] Updated weights for policy 0, policy_version 679281 (0.0038) [2024-06-24 14:31:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11129339904. Throughput: 0: 42545.3. Samples: 11129501500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:03,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 14:31:06,627][15401] Updated weights for policy 0, policy_version 679291 (0.0033) [2024-06-24 14:31:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11129552896. Throughput: 0: 42622.7. Samples: 11129631040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 14:31:11,203][15401] Updated weights for policy 0, policy_version 679301 (0.0044) [2024-06-24 14:31:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11129765888. Throughput: 0: 42699.4. Samples: 11129885760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 14:31:14,423][15401] Updated weights for policy 0, policy_version 679311 (0.0030) [2024-06-24 14:31:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11129978880. Throughput: 0: 42924.8. Samples: 11130146960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:18,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 14:31:18,663][15401] Updated weights for policy 0, policy_version 679321 (0.0038) [2024-06-24 14:31:22,067][15401] Updated weights for policy 0, policy_version 679331 (0.0038) [2024-06-24 14:31:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 11130191872. Throughput: 0: 42704.8. Samples: 11130269120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:23,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-24 14:31:26,171][15401] Updated weights for policy 0, policy_version 679341 (0.0035) [2024-06-24 14:31:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11130404864. Throughput: 0: 42654.6. Samples: 11130531340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 14:31:29,689][15401] Updated weights for policy 0, policy_version 679351 (0.0036) [2024-06-24 14:31:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11130634240. Throughput: 0: 42860.4. Samples: 11130787540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 14:31:33,671][15401] Updated weights for policy 0, policy_version 679361 (0.0036) [2024-06-24 14:31:37,258][15401] Updated weights for policy 0, policy_version 679371 (0.0036) [2024-06-24 14:31:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11130847232. Throughput: 0: 42934.7. Samples: 11130922400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-06-24 14:31:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 14:31:41,213][15401] Updated weights for policy 0, policy_version 679381 (0.0044) [2024-06-24 14:31:42,962][15349] Signal inference workers to stop experience collection... (164800 times) [2024-06-24 14:31:43,016][15401] InferenceWorker_p0-w0: stopping experience collection (164800 times) [2024-06-24 14:31:43,019][15349] Signal inference workers to resume experience collection... (164800 times) [2024-06-24 14:31:43,029][15401] InferenceWorker_p0-w0: resuming experience collection (164800 times) [2024-06-24 14:31:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11131043840. Throughput: 0: 42927.8. Samples: 11131180000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:31:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 14:31:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679385_11131043840.pth... [2024-06-24 14:31:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000678759_11120787456.pth [2024-06-24 14:31:45,098][15401] Updated weights for policy 0, policy_version 679391 (0.0031) [2024-06-24 14:31:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11131273216. Throughput: 0: 42867.6. Samples: 11131430540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:31:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 14:31:48,839][15401] Updated weights for policy 0, policy_version 679401 (0.0031) [2024-06-24 14:31:53,057][15401] Updated weights for policy 0, policy_version 679411 (0.0036) [2024-06-24 14:31:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11131486208. Throughput: 0: 42904.5. Samples: 11131561740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:31:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 14:31:57,402][15401] Updated weights for policy 0, policy_version 679421 (0.0041) [2024-06-24 14:31:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11131666432. Throughput: 0: 42821.7. Samples: 11131812740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:31:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 14:32:00,523][15401] Updated weights for policy 0, policy_version 679431 (0.0045) [2024-06-24 14:32:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11131895808. Throughput: 0: 42609.5. Samples: 11132064380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 14:32:05,033][15401] Updated weights for policy 0, policy_version 679441 (0.0037) [2024-06-24 14:32:08,213][15401] Updated weights for policy 0, policy_version 679451 (0.0035) [2024-06-24 14:32:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11132125184. Throughput: 0: 42715.0. Samples: 11132191300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 14:32:12,777][15401] Updated weights for policy 0, policy_version 679461 (0.0051) [2024-06-24 14:32:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11132321792. Throughput: 0: 42615.2. Samples: 11132449020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 14:32:16,009][15401] Updated weights for policy 0, policy_version 679471 (0.0047) [2024-06-24 14:32:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11132534784. Throughput: 0: 42378.7. Samples: 11132694580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:18,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 14:32:20,560][15401] Updated weights for policy 0, policy_version 679481 (0.0041) [2024-06-24 14:32:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11132747776. Throughput: 0: 42245.4. Samples: 11132823440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 14:32:23,783][15401] Updated weights for policy 0, policy_version 679491 (0.0030) [2024-06-24 14:32:28,192][15401] Updated weights for policy 0, policy_version 679501 (0.0028) [2024-06-24 14:32:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11132944384. Throughput: 0: 42314.8. Samples: 11133084160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 14:32:31,353][15401] Updated weights for policy 0, policy_version 679511 (0.0023) [2024-06-24 14:32:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11133190144. Throughput: 0: 42358.7. Samples: 11133336680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 14:32:35,922][15401] Updated weights for policy 0, policy_version 679521 (0.0048) [2024-06-24 14:32:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11133386752. Throughput: 0: 42413.7. Samples: 11133470360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 14:32:38,916][15401] Updated weights for policy 0, policy_version 679531 (0.0048) [2024-06-24 14:32:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 11133583360. Throughput: 0: 42434.9. Samples: 11133722300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 14:32:43,468][15401] Updated weights for policy 0, policy_version 679541 (0.0035) [2024-06-24 14:32:46,528][15401] Updated weights for policy 0, policy_version 679551 (0.0048) [2024-06-24 14:32:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11133829120. Throughput: 0: 42470.6. Samples: 11133975560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:48,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 14:32:51,196][15401] Updated weights for policy 0, policy_version 679561 (0.0042) [2024-06-24 14:32:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11134025728. Throughput: 0: 42673.0. Samples: 11134111580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 14:32:54,417][15401] Updated weights for policy 0, policy_version 679571 (0.0037) [2024-06-24 14:32:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11134222336. Throughput: 0: 42431.0. Samples: 11134358420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:32:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 14:32:58,908][15401] Updated weights for policy 0, policy_version 679581 (0.0028) [2024-06-24 14:33:02,265][15401] Updated weights for policy 0, policy_version 679591 (0.0049) [2024-06-24 14:33:03,279][15349] Signal inference workers to stop experience collection... (164850 times) [2024-06-24 14:33:03,332][15401] InferenceWorker_p0-w0: stopping experience collection (164850 times) [2024-06-24 14:33:03,338][15349] Signal inference workers to resume experience collection... (164850 times) [2024-06-24 14:33:03,354][15401] InferenceWorker_p0-w0: resuming experience collection (164850 times) [2024-06-24 14:33:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 11134468096. Throughput: 0: 42568.7. Samples: 11134610180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:33:03,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 14:33:06,955][15401] Updated weights for policy 0, policy_version 679601 (0.0034) [2024-06-24 14:33:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 11134648320. Throughput: 0: 42620.5. Samples: 11134741360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:33:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 14:33:09,889][15401] Updated weights for policy 0, policy_version 679611 (0.0044) [2024-06-24 14:33:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11134861312. Throughput: 0: 42508.9. Samples: 11134997060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:33:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 14:33:14,468][15401] Updated weights for policy 0, policy_version 679621 (0.0035) [2024-06-24 14:33:17,493][15401] Updated weights for policy 0, policy_version 679631 (0.0039) [2024-06-24 14:33:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11135107072. Throughput: 0: 42456.0. Samples: 11135247200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-24 14:33:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 14:33:22,135][15401] Updated weights for policy 0, policy_version 679641 (0.0040) [2024-06-24 14:33:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11135287296. Throughput: 0: 42445.7. Samples: 11135380420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 14:33:25,101][15401] Updated weights for policy 0, policy_version 679651 (0.0040) [2024-06-24 14:33:28,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11135500288. Throughput: 0: 42318.6. Samples: 11135626640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 14:33:30,420][15401] Updated weights for policy 0, policy_version 679661 (0.0033) [2024-06-24 14:33:33,097][15401] Updated weights for policy 0, policy_version 679671 (0.0022) [2024-06-24 14:33:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11135746048. Throughput: 0: 42153.7. Samples: 11135872480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 14:33:38,031][15401] Updated weights for policy 0, policy_version 679681 (0.0025) [2024-06-24 14:33:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11135909888. Throughput: 0: 42064.4. Samples: 11136004480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 14:33:40,870][15401] Updated weights for policy 0, policy_version 679691 (0.0035) [2024-06-24 14:33:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 11136139264. Throughput: 0: 42285.3. Samples: 11136261260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:43,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 14:33:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679697_11136155648.pth... [2024-06-24 14:33:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679072_11125915648.pth [2024-06-24 14:33:45,607][15401] Updated weights for policy 0, policy_version 679701 (0.0037) [2024-06-24 14:33:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11136368640. Throughput: 0: 42208.5. Samples: 11136509560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:48,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 14:33:48,671][15401] Updated weights for policy 0, policy_version 679711 (0.0031) [2024-06-24 14:33:53,231][15401] Updated weights for policy 0, policy_version 679721 (0.0026) [2024-06-24 14:33:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 11136548864. Throughput: 0: 42214.2. Samples: 11136641000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 14:33:56,205][15401] Updated weights for policy 0, policy_version 679731 (0.0033) [2024-06-24 14:33:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11136794624. Throughput: 0: 42303.0. Samples: 11136900700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:33:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 14:34:00,727][15401] Updated weights for policy 0, policy_version 679741 (0.0046) [2024-06-24 14:34:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11137007616. Throughput: 0: 42471.2. Samples: 11137158400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 14:34:03,827][15401] Updated weights for policy 0, policy_version 679751 (0.0038) [2024-06-24 14:34:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 11137187840. Throughput: 0: 42336.8. Samples: 11137285580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:08,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 14:34:08,575][15401] Updated weights for policy 0, policy_version 679761 (0.0024) [2024-06-24 14:34:11,571][15401] Updated weights for policy 0, policy_version 679771 (0.0028) [2024-06-24 14:34:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 11137433600. Throughput: 0: 42599.4. Samples: 11137543620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 14:34:16,062][15401] Updated weights for policy 0, policy_version 679781 (0.0024) [2024-06-24 14:34:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11137646592. Throughput: 0: 42908.5. Samples: 11137803360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 14:34:19,296][15401] Updated weights for policy 0, policy_version 679791 (0.0046) [2024-06-24 14:34:23,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11137843200. Throughput: 0: 42771.6. Samples: 11137929200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 14:34:23,553][15401] Updated weights for policy 0, policy_version 679801 (0.0039) [2024-06-24 14:34:26,974][15401] Updated weights for policy 0, policy_version 679811 (0.0034) [2024-06-24 14:34:28,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 11138088960. Throughput: 0: 42825.1. Samples: 11138188660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:28,397][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 14:34:31,264][15401] Updated weights for policy 0, policy_version 679821 (0.0043) [2024-06-24 14:34:32,869][15349] Signal inference workers to stop experience collection... (164900 times) [2024-06-24 14:34:32,869][15349] Signal inference workers to resume experience collection... (164900 times) [2024-06-24 14:34:32,912][15401] InferenceWorker_p0-w0: stopping experience collection (164900 times) [2024-06-24 14:34:32,912][15401] InferenceWorker_p0-w0: resuming experience collection (164900 times) [2024-06-24 14:34:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.7, 300 sec: 42543.1). Total num frames: 11138285568. Throughput: 0: 42952.4. Samples: 11138442520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:33,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 14:34:34,667][15401] Updated weights for policy 0, policy_version 679831 (0.0040) [2024-06-24 14:34:38,389][15132] Fps is (10 sec: 37707.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11138465792. Throughput: 0: 42921.4. Samples: 11138572460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 14:34:38,842][15401] Updated weights for policy 0, policy_version 679841 (0.0038) [2024-06-24 14:34:42,345][15401] Updated weights for policy 0, policy_version 679851 (0.0030) [2024-06-24 14:34:43,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 11138711552. Throughput: 0: 42746.2. Samples: 11138824380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:43,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 14:34:46,403][15401] Updated weights for policy 0, policy_version 679861 (0.0034) [2024-06-24 14:34:48,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 11138924544. Throughput: 0: 42834.7. Samples: 11139086060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:48,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 14:34:50,106][15401] Updated weights for policy 0, policy_version 679871 (0.0041) [2024-06-24 14:34:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11139121152. Throughput: 0: 42856.6. Samples: 11139214120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 14:34:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 14:34:53,882][15401] Updated weights for policy 0, policy_version 679881 (0.0038) [2024-06-24 14:34:58,137][15401] Updated weights for policy 0, policy_version 679891 (0.0038) [2024-06-24 14:34:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11139350528. Throughput: 0: 42794.9. Samples: 11139469380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:34:58,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 14:35:01,756][15401] Updated weights for policy 0, policy_version 679901 (0.0034) [2024-06-24 14:35:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11139563520. Throughput: 0: 42797.6. Samples: 11139729260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 14:35:05,611][15401] Updated weights for policy 0, policy_version 679911 (0.0034) [2024-06-24 14:35:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 11139776512. Throughput: 0: 42819.2. Samples: 11139856060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 14:35:09,261][15401] Updated weights for policy 0, policy_version 679921 (0.0036) [2024-06-24 14:35:13,057][15401] Updated weights for policy 0, policy_version 679931 (0.0042) [2024-06-24 14:35:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11140005888. Throughput: 0: 42843.4. Samples: 11140116340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 14:35:17,257][15401] Updated weights for policy 0, policy_version 679941 (0.0033) [2024-06-24 14:35:18,395][15132] Fps is (10 sec: 44211.7, 60 sec: 42867.4, 300 sec: 42764.2). Total num frames: 11140218880. Throughput: 0: 42880.5. Samples: 11140372280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:18,396][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 14:35:20,680][15401] Updated weights for policy 0, policy_version 679951 (0.0032) [2024-06-24 14:35:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42598.3). Total num frames: 11140415488. Throughput: 0: 42834.8. Samples: 11140500040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 14:35:25,158][15401] Updated weights for policy 0, policy_version 679961 (0.0041) [2024-06-24 14:35:28,389][15132] Fps is (10 sec: 40983.4, 60 sec: 42329.9, 300 sec: 42543.2). Total num frames: 11140628480. Throughput: 0: 42816.6. Samples: 11140751020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 14:35:28,503][15401] Updated weights for policy 0, policy_version 679971 (0.0026) [2024-06-24 14:35:32,784][15401] Updated weights for policy 0, policy_version 679981 (0.0036) [2024-06-24 14:35:33,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 11140841472. Throughput: 0: 42832.5. Samples: 11141013420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 14:35:36,264][15401] Updated weights for policy 0, policy_version 679991 (0.0028) [2024-06-24 14:35:38,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 11141054464. Throughput: 0: 42837.1. Samples: 11141141800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 14:35:40,310][15401] Updated weights for policy 0, policy_version 680001 (0.0027) [2024-06-24 14:35:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11141283840. Throughput: 0: 42894.2. Samples: 11141399620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 14:35:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680010_11141283840.pth... [2024-06-24 14:35:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679385_11131043840.pth [2024-06-24 14:35:43,884][15401] Updated weights for policy 0, policy_version 680011 (0.0032) [2024-06-24 14:35:47,880][15401] Updated weights for policy 0, policy_version 680021 (0.0033) [2024-06-24 14:35:48,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 11141496832. Throughput: 0: 42837.0. Samples: 11141656920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:48,395][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 14:35:51,747][15401] Updated weights for policy 0, policy_version 680031 (0.0021) [2024-06-24 14:35:53,390][15132] Fps is (10 sec: 42595.0, 60 sec: 43143.9, 300 sec: 42653.8). Total num frames: 11141709824. Throughput: 0: 42880.5. Samples: 11141785720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:53,391][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 14:35:55,421][15401] Updated weights for policy 0, policy_version 680041 (0.0031) [2024-06-24 14:35:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11141922816. Throughput: 0: 42793.4. Samples: 11142042040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:35:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 14:35:59,348][15401] Updated weights for policy 0, policy_version 680051 (0.0034) [2024-06-24 14:36:01,649][15349] Signal inference workers to stop experience collection... (164950 times) [2024-06-24 14:36:01,650][15349] Signal inference workers to resume experience collection... (164950 times) [2024-06-24 14:36:01,690][15401] InferenceWorker_p0-w0: stopping experience collection (164950 times) [2024-06-24 14:36:01,690][15401] InferenceWorker_p0-w0: resuming experience collection (164950 times) [2024-06-24 14:36:02,876][15401] Updated weights for policy 0, policy_version 680061 (0.0036) [2024-06-24 14:36:03,396][15132] Fps is (10 sec: 44212.2, 60 sec: 43140.1, 300 sec: 42708.6). Total num frames: 11142152192. Throughput: 0: 42925.5. Samples: 11142303960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:03,396][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 14:36:06,927][15401] Updated weights for policy 0, policy_version 680071 (0.0038) [2024-06-24 14:36:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11142348800. Throughput: 0: 42944.3. Samples: 11142432520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 14:36:10,285][15401] Updated weights for policy 0, policy_version 680081 (0.0037) [2024-06-24 14:36:13,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11142578176. Throughput: 0: 43164.3. Samples: 11142693420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 14:36:14,480][15401] Updated weights for policy 0, policy_version 680091 (0.0035) [2024-06-24 14:36:17,830][15401] Updated weights for policy 0, policy_version 680101 (0.0036) [2024-06-24 14:36:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42602.4, 300 sec: 42653.9). Total num frames: 11142774784. Throughput: 0: 42972.0. Samples: 11142947160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 14:36:22,056][15401] Updated weights for policy 0, policy_version 680111 (0.0038) [2024-06-24 14:36:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.7, 300 sec: 42598.4). Total num frames: 11142971392. Throughput: 0: 42854.9. Samples: 11143070260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:23,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 14:36:25,857][15401] Updated weights for policy 0, policy_version 680121 (0.0040) [2024-06-24 14:36:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11143217152. Throughput: 0: 42806.2. Samples: 11143325900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 14:36:30,009][15401] Updated weights for policy 0, policy_version 680131 (0.0027) [2024-06-24 14:36:33,341][15401] Updated weights for policy 0, policy_version 680141 (0.0037) [2024-06-24 14:36:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11143430144. Throughput: 0: 42898.6. Samples: 11143587360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 14:36:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 14:36:37,458][15401] Updated weights for policy 0, policy_version 680151 (0.0036) [2024-06-24 14:36:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 11143626752. Throughput: 0: 42866.9. Samples: 11143714700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:36:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 14:36:41,055][15401] Updated weights for policy 0, policy_version 680161 (0.0039) [2024-06-24 14:36:43,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 11143856128. Throughput: 0: 42929.0. Samples: 11143974120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:36:43,396][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 14:36:45,431][15401] Updated weights for policy 0, policy_version 680171 (0.0029) [2024-06-24 14:36:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11144069120. Throughput: 0: 42780.7. Samples: 11144228820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:36:48,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 14:36:48,544][15401] Updated weights for policy 0, policy_version 680181 (0.0038) [2024-06-24 14:36:53,073][15401] Updated weights for policy 0, policy_version 680191 (0.0045) [2024-06-24 14:36:53,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.9, 300 sec: 42709.5). Total num frames: 11144265728. Throughput: 0: 42742.9. Samples: 11144355960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:36:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 14:36:56,009][15401] Updated weights for policy 0, policy_version 680201 (0.0035) [2024-06-24 14:36:58,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 11144511488. Throughput: 0: 42800.5. Samples: 11144619540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:36:58,392][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 14:37:00,600][15401] Updated weights for policy 0, policy_version 680211 (0.0040) [2024-06-24 14:37:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42602.9, 300 sec: 42654.0). Total num frames: 11144708096. Throughput: 0: 42956.0. Samples: 11144880180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 14:37:03,567][15401] Updated weights for policy 0, policy_version 680221 (0.0032) [2024-06-24 14:37:08,153][15401] Updated weights for policy 0, policy_version 680231 (0.0029) [2024-06-24 14:37:08,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11144904704. Throughput: 0: 42909.2. Samples: 11145001180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 14:37:11,667][15401] Updated weights for policy 0, policy_version 680241 (0.0037) [2024-06-24 14:37:13,265][15349] Signal inference workers to stop experience collection... (165000 times) [2024-06-24 14:37:13,316][15401] InferenceWorker_p0-w0: stopping experience collection (165000 times) [2024-06-24 14:37:13,379][15349] Signal inference workers to resume experience collection... (165000 times) [2024-06-24 14:37:13,379][15401] InferenceWorker_p0-w0: resuming experience collection (165000 times) [2024-06-24 14:37:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 11145166848. Throughput: 0: 42937.0. Samples: 11145258060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:13,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 14:37:15,719][15401] Updated weights for policy 0, policy_version 680251 (0.0032) [2024-06-24 14:37:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11145363456. Throughput: 0: 42934.8. Samples: 11145519420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 14:37:19,284][15401] Updated weights for policy 0, policy_version 680261 (0.0034) [2024-06-24 14:37:23,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11145543680. Throughput: 0: 42715.7. Samples: 11145636900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 14:37:23,497][15401] Updated weights for policy 0, policy_version 680271 (0.0029) [2024-06-24 14:37:26,981][15401] Updated weights for policy 0, policy_version 680281 (0.0040) [2024-06-24 14:37:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 11145822208. Throughput: 0: 42982.9. Samples: 11145908080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:28,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 14:37:31,027][15401] Updated weights for policy 0, policy_version 680291 (0.0025) [2024-06-24 14:37:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11145986048. Throughput: 0: 43142.7. Samples: 11146170240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:33,400][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 14:37:34,555][15401] Updated weights for policy 0, policy_version 680301 (0.0030) [2024-06-24 14:37:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11146199040. Throughput: 0: 42954.8. Samples: 11146288920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 14:37:38,623][15401] Updated weights for policy 0, policy_version 680311 (0.0031) [2024-06-24 14:37:42,023][15401] Updated weights for policy 0, policy_version 680321 (0.0033) [2024-06-24 14:37:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 11146444800. Throughput: 0: 42965.0. Samples: 11146552860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 14:37:43,525][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680326_11146461184.pth... [2024-06-24 14:37:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000679697_11136155648.pth [2024-06-24 14:37:46,239][15401] Updated weights for policy 0, policy_version 680331 (0.0030) [2024-06-24 14:37:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11146625024. Throughput: 0: 42909.8. Samples: 11146811120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 14:37:49,551][15401] Updated weights for policy 0, policy_version 680341 (0.0035) [2024-06-24 14:37:53,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11146838016. Throughput: 0: 42741.3. Samples: 11146924540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 14:37:53,999][15401] Updated weights for policy 0, policy_version 680351 (0.0046) [2024-06-24 14:37:57,001][15401] Updated weights for policy 0, policy_version 680361 (0.0041) [2024-06-24 14:37:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11147067392. Throughput: 0: 42791.1. Samples: 11147183660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:37:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 14:38:01,758][15401] Updated weights for policy 0, policy_version 680371 (0.0031) [2024-06-24 14:38:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11147264000. Throughput: 0: 42886.5. Samples: 11147449420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:38:03,392][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 14:38:04,998][15401] Updated weights for policy 0, policy_version 680381 (0.0032) [2024-06-24 14:38:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11147493376. Throughput: 0: 42954.6. Samples: 11147569860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:38:08,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 14:38:09,586][15401] Updated weights for policy 0, policy_version 680391 (0.0025) [2024-06-24 14:38:10,555][15349] Signal inference workers to stop experience collection... (165050 times) [2024-06-24 14:38:10,610][15401] InferenceWorker_p0-w0: stopping experience collection (165050 times) [2024-06-24 14:38:10,673][15349] Signal inference workers to resume experience collection... (165050 times) [2024-06-24 14:38:10,673][15401] InferenceWorker_p0-w0: resuming experience collection (165050 times) [2024-06-24 14:38:12,485][15401] Updated weights for policy 0, policy_version 680401 (0.0036) [2024-06-24 14:38:13,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11147722752. Throughput: 0: 42684.5. Samples: 11147828880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 14:38:13,395][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 14:38:17,053][15401] Updated weights for policy 0, policy_version 680411 (0.0037) [2024-06-24 14:38:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11147886592. Throughput: 0: 42640.4. Samples: 11148089060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 14:38:20,450][15401] Updated weights for policy 0, policy_version 680421 (0.0033) [2024-06-24 14:38:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11148132352. Throughput: 0: 42668.0. Samples: 11148208980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:23,394][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 14:38:24,906][15401] Updated weights for policy 0, policy_version 680431 (0.0041) [2024-06-24 14:38:28,095][15401] Updated weights for policy 0, policy_version 680441 (0.0024) [2024-06-24 14:38:28,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11148361728. Throughput: 0: 42608.0. Samples: 11148470220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 14:38:32,547][15401] Updated weights for policy 0, policy_version 680451 (0.0028) [2024-06-24 14:38:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11148541952. Throughput: 0: 42553.8. Samples: 11148726040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 14:38:35,768][15401] Updated weights for policy 0, policy_version 680461 (0.0045) [2024-06-24 14:38:38,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 11148787712. Throughput: 0: 42601.8. Samples: 11148841720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:38,393][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 14:38:40,347][15401] Updated weights for policy 0, policy_version 680471 (0.0030) [2024-06-24 14:38:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11149000704. Throughput: 0: 42747.9. Samples: 11149107320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 14:38:43,406][15401] Updated weights for policy 0, policy_version 680481 (0.0029) [2024-06-24 14:38:48,109][15401] Updated weights for policy 0, policy_version 680491 (0.0029) [2024-06-24 14:38:48,389][15132] Fps is (10 sec: 37692.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11149164544. Throughput: 0: 42671.7. Samples: 11149369540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 14:38:51,028][15401] Updated weights for policy 0, policy_version 680501 (0.0038) [2024-06-24 14:38:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11149410304. Throughput: 0: 42571.2. Samples: 11149485560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:53,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-24 14:38:55,601][15401] Updated weights for policy 0, policy_version 680511 (0.0032) [2024-06-24 14:38:58,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11149623296. Throughput: 0: 42676.5. Samples: 11149749320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:38:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:38:58,617][15401] Updated weights for policy 0, policy_version 680521 (0.0031) [2024-06-24 14:39:02,971][15401] Updated weights for policy 0, policy_version 680531 (0.0034) [2024-06-24 14:39:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 11149819904. Throughput: 0: 42526.2. Samples: 11150002840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:03,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 14:39:06,677][15401] Updated weights for policy 0, policy_version 680541 (0.0043) [2024-06-24 14:39:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11150065664. Throughput: 0: 42617.7. Samples: 11150126780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 14:39:10,957][15401] Updated weights for policy 0, policy_version 680551 (0.0024) [2024-06-24 14:39:13,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11150278656. Throughput: 0: 42612.4. Samples: 11150387780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 14:39:14,156][15401] Updated weights for policy 0, policy_version 680561 (0.0039) [2024-06-24 14:39:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11150458880. Throughput: 0: 42718.3. Samples: 11150648360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:18,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 14:39:18,479][15401] Updated weights for policy 0, policy_version 680571 (0.0026) [2024-06-24 14:39:21,572][15401] Updated weights for policy 0, policy_version 680581 (0.0047) [2024-06-24 14:39:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 11150704640. Throughput: 0: 43045.1. Samples: 11150778640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:23,396][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 14:39:25,927][15401] Updated weights for policy 0, policy_version 680591 (0.0041) [2024-06-24 14:39:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 11150917632. Throughput: 0: 42796.4. Samples: 11151033160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 14:39:29,242][15401] Updated weights for policy 0, policy_version 680601 (0.0029) [2024-06-24 14:39:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11151114240. Throughput: 0: 42763.9. Samples: 11151293920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:33,398][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:39:33,432][15401] Updated weights for policy 0, policy_version 680611 (0.0042) [2024-06-24 14:39:34,338][15349] Signal inference workers to stop experience collection... (165100 times) [2024-06-24 14:39:34,339][15349] Signal inference workers to resume experience collection... (165100 times) [2024-06-24 14:39:34,368][15401] InferenceWorker_p0-w0: stopping experience collection (165100 times) [2024-06-24 14:39:34,368][15401] InferenceWorker_p0-w0: resuming experience collection (165100 times) [2024-06-24 14:39:36,852][15401] Updated weights for policy 0, policy_version 680621 (0.0053) [2024-06-24 14:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 11151327232. Throughput: 0: 42953.7. Samples: 11151418480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 14:39:40,863][15401] Updated weights for policy 0, policy_version 680631 (0.0024) [2024-06-24 14:39:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 11151556608. Throughput: 0: 42876.0. Samples: 11151678740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 14:39:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680637_11151556608.pth... [2024-06-24 14:39:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680010_11141283840.pth [2024-06-24 14:39:44,624][15401] Updated weights for policy 0, policy_version 680641 (0.0040) [2024-06-24 14:39:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11151769600. Throughput: 0: 42924.1. Samples: 11151934320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 14:39:48,417][15401] Updated weights for policy 0, policy_version 680651 (0.0041) [2024-06-24 14:39:52,289][15401] Updated weights for policy 0, policy_version 680661 (0.0043) [2024-06-24 14:39:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11151982592. Throughput: 0: 43028.1. Samples: 11152063040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:39:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 14:39:56,274][15401] Updated weights for policy 0, policy_version 680671 (0.0039) [2024-06-24 14:39:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11152179200. Throughput: 0: 42898.2. Samples: 11152318200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:39:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 14:39:59,798][15401] Updated weights for policy 0, policy_version 680681 (0.0027) [2024-06-24 14:40:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 11152408576. Throughput: 0: 42854.1. Samples: 11152576800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 14:40:03,818][15401] Updated weights for policy 0, policy_version 680691 (0.0032) [2024-06-24 14:40:07,931][15401] Updated weights for policy 0, policy_version 680701 (0.0040) [2024-06-24 14:40:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11152621568. Throughput: 0: 42829.3. Samples: 11152705960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 14:40:11,398][15401] Updated weights for policy 0, policy_version 680711 (0.0041) [2024-06-24 14:40:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 11152834560. Throughput: 0: 42708.8. Samples: 11152955060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 14:40:15,601][15401] Updated weights for policy 0, policy_version 680721 (0.0032) [2024-06-24 14:40:18,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 11153047552. Throughput: 0: 42768.2. Samples: 11153218500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 14:40:18,981][15401] Updated weights for policy 0, policy_version 680731 (0.0049) [2024-06-24 14:40:23,317][15401] Updated weights for policy 0, policy_version 680741 (0.0022) [2024-06-24 14:40:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11153260544. Throughput: 0: 42841.8. Samples: 11153346360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 14:40:26,515][15401] Updated weights for policy 0, policy_version 680751 (0.0028) [2024-06-24 14:40:28,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11153473536. Throughput: 0: 42616.5. Samples: 11153596480. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 14:40:31,016][15401] Updated weights for policy 0, policy_version 680761 (0.0038) [2024-06-24 14:40:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11153686528. Throughput: 0: 42764.9. Samples: 11153858740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:33,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 14:40:34,321][15401] Updated weights for policy 0, policy_version 680771 (0.0024) [2024-06-24 14:40:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11153899520. Throughput: 0: 42722.6. Samples: 11153985560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 14:40:38,636][15401] Updated weights for policy 0, policy_version 680781 (0.0035) [2024-06-24 14:40:42,150][15401] Updated weights for policy 0, policy_version 680791 (0.0033) [2024-06-24 14:40:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11154128896. Throughput: 0: 42577.7. Samples: 11154234300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:43,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 14:40:46,309][15401] Updated weights for policy 0, policy_version 680801 (0.0035) [2024-06-24 14:40:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.1). Total num frames: 11154325504. Throughput: 0: 42541.8. Samples: 11154491180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 14:40:49,792][15401] Updated weights for policy 0, policy_version 680811 (0.0031) [2024-06-24 14:40:53,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11154522112. Throughput: 0: 42448.3. Samples: 11154616240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:53,401][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 14:40:54,110][15401] Updated weights for policy 0, policy_version 680821 (0.0042) [2024-06-24 14:40:57,441][15401] Updated weights for policy 0, policy_version 680831 (0.0040) [2024-06-24 14:40:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 11154767872. Throughput: 0: 42716.6. Samples: 11154877300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:40:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 14:41:01,683][15401] Updated weights for policy 0, policy_version 680841 (0.0034) [2024-06-24 14:41:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11154948096. Throughput: 0: 42498.4. Samples: 11155130920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 14:41:05,463][15401] Updated weights for policy 0, policy_version 680851 (0.0029) [2024-06-24 14:41:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11155177472. Throughput: 0: 42387.0. Samples: 11155253780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 14:41:09,466][15401] Updated weights for policy 0, policy_version 680861 (0.0027) [2024-06-24 14:41:13,052][15401] Updated weights for policy 0, policy_version 680871 (0.0035) [2024-06-24 14:41:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11155406848. Throughput: 0: 42572.9. Samples: 11155512260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 14:41:15,634][15349] Signal inference workers to stop experience collection... (165150 times) [2024-06-24 14:41:15,688][15401] InferenceWorker_p0-w0: stopping experience collection (165150 times) [2024-06-24 14:41:15,752][15349] Signal inference workers to resume experience collection... (165150 times) [2024-06-24 14:41:15,752][15401] InferenceWorker_p0-w0: resuming experience collection (165150 times) [2024-06-24 14:41:17,373][15401] Updated weights for policy 0, policy_version 680881 (0.0028) [2024-06-24 14:41:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 11155603456. Throughput: 0: 42475.9. Samples: 11155770160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 14:41:20,697][15401] Updated weights for policy 0, policy_version 680891 (0.0042) [2024-06-24 14:41:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11155816448. Throughput: 0: 42479.9. Samples: 11155897160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 14:41:25,031][15401] Updated weights for policy 0, policy_version 680901 (0.0036) [2024-06-24 14:41:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11156029440. Throughput: 0: 42633.0. Samples: 11156152680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 14:41:28,442][15401] Updated weights for policy 0, policy_version 680911 (0.0033) [2024-06-24 14:41:32,729][15401] Updated weights for policy 0, policy_version 680921 (0.0033) [2024-06-24 14:41:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11156242432. Throughput: 0: 42660.5. Samples: 11156410900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 14:41:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 14:41:35,987][15401] Updated weights for policy 0, policy_version 680931 (0.0041) [2024-06-24 14:41:38,390][15132] Fps is (10 sec: 42596.7, 60 sec: 42598.1, 300 sec: 42710.4). Total num frames: 11156455424. Throughput: 0: 42583.7. Samples: 11156532420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:41:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 14:41:40,307][15401] Updated weights for policy 0, policy_version 680941 (0.0049) [2024-06-24 14:41:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 11156668416. Throughput: 0: 42505.7. Samples: 11156790060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:41:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 14:41:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680950_11156684800.pth... [2024-06-24 14:41:43,520][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680326_11146461184.pth [2024-06-24 14:41:44,094][15401] Updated weights for policy 0, policy_version 680951 (0.0033) [2024-06-24 14:41:47,996][15401] Updated weights for policy 0, policy_version 680961 (0.0036) [2024-06-24 14:41:48,389][15132] Fps is (10 sec: 40961.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11156865024. Throughput: 0: 42486.3. Samples: 11157042800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:41:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 14:41:51,717][15401] Updated weights for policy 0, policy_version 680971 (0.0036) [2024-06-24 14:41:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42654.3). Total num frames: 11157094400. Throughput: 0: 42514.3. Samples: 11157166920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:41:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 14:41:55,815][15401] Updated weights for policy 0, policy_version 680981 (0.0023) [2024-06-24 14:41:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11157307392. Throughput: 0: 42568.5. Samples: 11157427840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:41:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 14:41:59,195][15401] Updated weights for policy 0, policy_version 680991 (0.0027) [2024-06-24 14:42:03,354][15401] Updated weights for policy 0, policy_version 681001 (0.0030) [2024-06-24 14:42:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11157520384. Throughput: 0: 42580.5. Samples: 11157686380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:03,392][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 14:42:06,711][15401] Updated weights for policy 0, policy_version 681011 (0.0036) [2024-06-24 14:42:08,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 11157716992. Throughput: 0: 42550.8. Samples: 11157812040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:08,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 14:42:11,082][15401] Updated weights for policy 0, policy_version 681021 (0.0033) [2024-06-24 14:42:13,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 11157962752. Throughput: 0: 42580.7. Samples: 11158068820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 14:42:14,341][15401] Updated weights for policy 0, policy_version 681031 (0.0031) [2024-06-24 14:42:18,396][15132] Fps is (10 sec: 42581.5, 60 sec: 42320.9, 300 sec: 42708.5). Total num frames: 11158142976. Throughput: 0: 42577.5. Samples: 11158327160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:18,396][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 14:42:18,837][15401] Updated weights for policy 0, policy_version 681041 (0.0045) [2024-06-24 14:42:22,296][15401] Updated weights for policy 0, policy_version 681051 (0.0037) [2024-06-24 14:42:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11158372352. Throughput: 0: 42575.4. Samples: 11158448300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 14:42:26,482][15401] Updated weights for policy 0, policy_version 681061 (0.0044) [2024-06-24 14:42:28,392][15132] Fps is (10 sec: 45893.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 11158601728. Throughput: 0: 42649.4. Samples: 11158709380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:28,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 14:42:29,965][15401] Updated weights for policy 0, policy_version 681071 (0.0030) [2024-06-24 14:42:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11158781952. Throughput: 0: 42786.9. Samples: 11158968220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:33,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 14:42:34,052][15401] Updated weights for policy 0, policy_version 681081 (0.0038) [2024-06-24 14:42:37,559][15401] Updated weights for policy 0, policy_version 681091 (0.0041) [2024-06-24 14:42:38,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.7, 300 sec: 42598.4). Total num frames: 11159011328. Throughput: 0: 42714.3. Samples: 11159089060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:38,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-24 14:42:41,771][15401] Updated weights for policy 0, policy_version 681101 (0.0032) [2024-06-24 14:42:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11159240704. Throughput: 0: 42718.5. Samples: 11159350180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:43,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 14:42:45,122][15401] Updated weights for policy 0, policy_version 681111 (0.0034) [2024-06-24 14:42:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11159437312. Throughput: 0: 42575.3. Samples: 11159602160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 14:42:49,690][15401] Updated weights for policy 0, policy_version 681121 (0.0031) [2024-06-24 14:42:52,921][15401] Updated weights for policy 0, policy_version 681131 (0.0025) [2024-06-24 14:42:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11159650304. Throughput: 0: 42522.7. Samples: 11159725460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:53,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 14:42:57,281][15401] Updated weights for policy 0, policy_version 681141 (0.0027) [2024-06-24 14:42:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 11159863296. Throughput: 0: 42529.1. Samples: 11159982620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:42:58,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-24 14:43:01,073][15401] Updated weights for policy 0, policy_version 681151 (0.0028) [2024-06-24 14:43:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 11160076288. Throughput: 0: 42494.5. Samples: 11160239140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:43:03,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 14:43:04,893][15401] Updated weights for policy 0, policy_version 681161 (0.0032) [2024-06-24 14:43:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11160289280. Throughput: 0: 42495.7. Samples: 11160360600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:43:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 14:43:08,678][15401] Updated weights for policy 0, policy_version 681171 (0.0026) [2024-06-24 14:43:12,543][15401] Updated weights for policy 0, policy_version 681181 (0.0032) [2024-06-24 14:43:13,137][15349] Signal inference workers to stop experience collection... (165200 times) [2024-06-24 14:43:13,137][15349] Signal inference workers to resume experience collection... (165200 times) [2024-06-24 14:43:13,168][15401] InferenceWorker_p0-w0: stopping experience collection (165200 times) [2024-06-24 14:43:13,168][15401] InferenceWorker_p0-w0: resuming experience collection (165200 times) [2024-06-24 14:43:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11160502272. Throughput: 0: 42444.1. Samples: 11160619260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-24 14:43:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 14:43:16,649][15401] Updated weights for policy 0, policy_version 681191 (0.0027) [2024-06-24 14:43:18,397][15132] Fps is (10 sec: 42564.4, 60 sec: 42870.4, 300 sec: 42652.8). Total num frames: 11160715264. Throughput: 0: 42435.3. Samples: 11160878140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:18,398][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 14:43:20,098][15401] Updated weights for policy 0, policy_version 681201 (0.0045) [2024-06-24 14:43:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11160928256. Throughput: 0: 42430.3. Samples: 11160998420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:23,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 14:43:24,537][15401] Updated weights for policy 0, policy_version 681211 (0.0037) [2024-06-24 14:43:27,759][15401] Updated weights for policy 0, policy_version 681221 (0.0038) [2024-06-24 14:43:28,389][15132] Fps is (10 sec: 40992.6, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 11161124864. Throughput: 0: 42394.3. Samples: 11161257920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:28,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 14:43:32,158][15401] Updated weights for policy 0, policy_version 681231 (0.0029) [2024-06-24 14:43:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 11161337856. Throughput: 0: 42499.1. Samples: 11161514620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:33,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-24 14:43:35,630][15401] Updated weights for policy 0, policy_version 681241 (0.0034) [2024-06-24 14:43:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11161567232. Throughput: 0: 42464.4. Samples: 11161636360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:38,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 14:43:40,107][15401] Updated weights for policy 0, policy_version 681251 (0.0053) [2024-06-24 14:43:43,330][15401] Updated weights for policy 0, policy_version 681261 (0.0032) [2024-06-24 14:43:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11161780224. Throughput: 0: 42503.0. Samples: 11161895260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 14:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681261_11161780224.pth... [2024-06-24 14:43:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680637_11151556608.pth [2024-06-24 14:43:47,748][15401] Updated weights for policy 0, policy_version 681271 (0.0041) [2024-06-24 14:43:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11161976832. Throughput: 0: 42549.4. Samples: 11162153860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 14:43:51,032][15401] Updated weights for policy 0, policy_version 681281 (0.0055) [2024-06-24 14:43:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11162222592. Throughput: 0: 42623.5. Samples: 11162278660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 14:43:55,235][15401] Updated weights for policy 0, policy_version 681291 (0.0037) [2024-06-24 14:43:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 11162402816. Throughput: 0: 42602.6. Samples: 11162536380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:43:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 14:43:58,812][15401] Updated weights for policy 0, policy_version 681301 (0.0041) [2024-06-24 14:44:02,800][15401] Updated weights for policy 0, policy_version 681311 (0.0037) [2024-06-24 14:44:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11162615808. Throughput: 0: 42483.0. Samples: 11162789540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 14:44:06,754][15401] Updated weights for policy 0, policy_version 681321 (0.0036) [2024-06-24 14:44:08,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11162861568. Throughput: 0: 42709.2. Samples: 11162920440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:08,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 14:44:10,283][15401] Updated weights for policy 0, policy_version 681331 (0.0041) [2024-06-24 14:44:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 11163025408. Throughput: 0: 42519.8. Samples: 11163171320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 14:44:14,563][15401] Updated weights for policy 0, policy_version 681341 (0.0036) [2024-06-24 14:44:17,882][15401] Updated weights for policy 0, policy_version 681351 (0.0030) [2024-06-24 14:44:18,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42330.9, 300 sec: 42542.8). Total num frames: 11163254784. Throughput: 0: 42352.4. Samples: 11163420480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 14:44:22,395][15401] Updated weights for policy 0, policy_version 681361 (0.0022) [2024-06-24 14:44:23,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11163500544. Throughput: 0: 42698.6. Samples: 11163557800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 14:44:25,465][15401] Updated weights for policy 0, policy_version 681371 (0.0035) [2024-06-24 14:44:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11163664384. Throughput: 0: 42587.1. Samples: 11163811680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 14:44:29,798][15401] Updated weights for policy 0, policy_version 681381 (0.0030) [2024-06-24 14:44:33,053][15401] Updated weights for policy 0, policy_version 681391 (0.0022) [2024-06-24 14:44:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11163910144. Throughput: 0: 42531.9. Samples: 11164067800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:33,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 14:44:37,327][15401] Updated weights for policy 0, policy_version 681401 (0.0036) [2024-06-24 14:44:38,045][15349] Signal inference workers to stop experience collection... (165250 times) [2024-06-24 14:44:38,077][15401] InferenceWorker_p0-w0: stopping experience collection (165250 times) [2024-06-24 14:44:38,117][15349] Signal inference workers to resume experience collection... (165250 times) [2024-06-24 14:44:38,117][15401] InferenceWorker_p0-w0: resuming experience collection (165250 times) [2024-06-24 14:44:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11164139520. Throughput: 0: 42740.3. Samples: 11164201980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 14:44:40,531][15401] Updated weights for policy 0, policy_version 681411 (0.0025) [2024-06-24 14:44:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 11164319744. Throughput: 0: 42724.9. Samples: 11164459000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:43,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 14:44:45,263][15401] Updated weights for policy 0, policy_version 681421 (0.0037) [2024-06-24 14:44:48,263][15401] Updated weights for policy 0, policy_version 681431 (0.0031) [2024-06-24 14:44:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11164565504. Throughput: 0: 42725.7. Samples: 11164712200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 14:44:48,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 14:44:52,679][15401] Updated weights for policy 0, policy_version 681441 (0.0032) [2024-06-24 14:44:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11164778496. Throughput: 0: 42892.5. Samples: 11164850500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:44:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 14:44:56,009][15401] Updated weights for policy 0, policy_version 681451 (0.0045) [2024-06-24 14:44:58,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 11164958720. Throughput: 0: 42951.2. Samples: 11165104220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:44:58,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 14:45:00,206][15401] Updated weights for policy 0, policy_version 681461 (0.0036) [2024-06-24 14:45:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 11165204480. Throughput: 0: 42899.9. Samples: 11165351080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:03,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 14:45:04,020][15401] Updated weights for policy 0, policy_version 681471 (0.0041) [2024-06-24 14:45:07,966][15401] Updated weights for policy 0, policy_version 681481 (0.0037) [2024-06-24 14:45:08,392][15132] Fps is (10 sec: 44237.9, 60 sec: 42325.5, 300 sec: 42598.1). Total num frames: 11165401088. Throughput: 0: 42879.3. Samples: 11165487460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:08,393][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 14:45:11,465][15401] Updated weights for policy 0, policy_version 681491 (0.0030) [2024-06-24 14:45:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11165614080. Throughput: 0: 42960.5. Samples: 11165744900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:13,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 14:45:15,555][15401] Updated weights for policy 0, policy_version 681501 (0.0027) [2024-06-24 14:45:18,392][15132] Fps is (10 sec: 44236.6, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 11165843456. Throughput: 0: 42777.1. Samples: 11165992860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:18,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:45:19,003][15401] Updated weights for policy 0, policy_version 681511 (0.0027) [2024-06-24 14:45:23,250][15401] Updated weights for policy 0, policy_version 681521 (0.0035) [2024-06-24 14:45:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11166040064. Throughput: 0: 42790.7. Samples: 11166127560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 14:45:26,596][15401] Updated weights for policy 0, policy_version 681531 (0.0033) [2024-06-24 14:45:28,390][15132] Fps is (10 sec: 40968.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11166253056. Throughput: 0: 42684.4. Samples: 11166379800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 14:45:30,870][15401] Updated weights for policy 0, policy_version 681541 (0.0028) [2024-06-24 14:45:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11166482432. Throughput: 0: 42844.8. Samples: 11166640220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 14:45:34,396][15401] Updated weights for policy 0, policy_version 681551 (0.0027) [2024-06-24 14:45:38,337][15401] Updated weights for policy 0, policy_version 681561 (0.0025) [2024-06-24 14:45:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 11166695424. Throughput: 0: 42774.2. Samples: 11166775340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 14:45:42,128][15401] Updated weights for policy 0, policy_version 681571 (0.0035) [2024-06-24 14:45:43,391][15132] Fps is (10 sec: 42593.6, 60 sec: 43143.7, 300 sec: 42653.8). Total num frames: 11166908416. Throughput: 0: 42821.1. Samples: 11167031120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:43,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 14:45:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681574_11166908416.pth... [2024-06-24 14:45:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000680950_11156684800.pth [2024-06-24 14:45:45,857][15401] Updated weights for policy 0, policy_version 681581 (0.0035) [2024-06-24 14:45:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11167121408. Throughput: 0: 43002.9. Samples: 11167286100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 14:45:50,008][15401] Updated weights for policy 0, policy_version 681591 (0.0032) [2024-06-24 14:45:53,390][15132] Fps is (10 sec: 42603.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11167334400. Throughput: 0: 42842.4. Samples: 11167415280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 14:45:53,698][15401] Updated weights for policy 0, policy_version 681601 (0.0038) [2024-06-24 14:45:54,606][15349] Signal inference workers to stop experience collection... (165300 times) [2024-06-24 14:45:54,664][15401] InferenceWorker_p0-w0: stopping experience collection (165300 times) [2024-06-24 14:45:54,726][15349] Signal inference workers to resume experience collection... (165300 times) [2024-06-24 14:45:54,726][15401] InferenceWorker_p0-w0: resuming experience collection (165300 times) [2024-06-24 14:45:57,822][15401] Updated weights for policy 0, policy_version 681611 (0.0037) [2024-06-24 14:45:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 11167547392. Throughput: 0: 42781.3. Samples: 11167670060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:45:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 14:46:01,327][15401] Updated weights for policy 0, policy_version 681621 (0.0032) [2024-06-24 14:46:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11167776768. Throughput: 0: 42786.5. Samples: 11167918160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 14:46:05,507][15401] Updated weights for policy 0, policy_version 681631 (0.0050) [2024-06-24 14:46:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 11167973376. Throughput: 0: 42804.5. Samples: 11168053760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 14:46:09,161][15401] Updated weights for policy 0, policy_version 681641 (0.0032) [2024-06-24 14:46:13,008][15401] Updated weights for policy 0, policy_version 681651 (0.0041) [2024-06-24 14:46:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11168169984. Throughput: 0: 42705.0. Samples: 11168301520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 14:46:16,818][15401] Updated weights for policy 0, policy_version 681661 (0.0045) [2024-06-24 14:46:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 11168415744. Throughput: 0: 42585.9. Samples: 11168556580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 14:46:20,401][15401] Updated weights for policy 0, policy_version 681671 (0.0040) [2024-06-24 14:46:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11168612352. Throughput: 0: 42613.0. Samples: 11168692920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 14:46:24,361][15401] Updated weights for policy 0, policy_version 681681 (0.0035) [2024-06-24 14:46:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11168808960. Throughput: 0: 42491.0. Samples: 11168943160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-24 14:46:28,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-24 14:46:28,587][15401] Updated weights for policy 0, policy_version 681691 (0.0030) [2024-06-24 14:46:31,977][15401] Updated weights for policy 0, policy_version 681701 (0.0031) [2024-06-24 14:46:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11169038336. Throughput: 0: 42538.1. Samples: 11169200320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 14:46:36,412][15401] Updated weights for policy 0, policy_version 681711 (0.0034) [2024-06-24 14:46:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11169234944. Throughput: 0: 42545.9. Samples: 11169329840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 14:46:39,826][15401] Updated weights for policy 0, policy_version 681721 (0.0041) [2024-06-24 14:46:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42599.2, 300 sec: 42709.5). Total num frames: 11169464320. Throughput: 0: 42387.6. Samples: 11169577500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 14:46:44,021][15401] Updated weights for policy 0, policy_version 681731 (0.0038) [2024-06-24 14:46:47,543][15401] Updated weights for policy 0, policy_version 681741 (0.0033) [2024-06-24 14:46:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11169677312. Throughput: 0: 42530.2. Samples: 11169832020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:48,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-24 14:46:51,560][15401] Updated weights for policy 0, policy_version 681751 (0.0038) [2024-06-24 14:46:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11169890304. Throughput: 0: 42448.0. Samples: 11169963920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 14:46:55,275][15401] Updated weights for policy 0, policy_version 681761 (0.0025) [2024-06-24 14:46:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11170103296. Throughput: 0: 42593.2. Samples: 11170218220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:46:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 14:46:59,073][15401] Updated weights for policy 0, policy_version 681771 (0.0037) [2024-06-24 14:47:02,907][15401] Updated weights for policy 0, policy_version 681781 (0.0033) [2024-06-24 14:47:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 11170332672. Throughput: 0: 42519.4. Samples: 11170469960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:03,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 14:47:06,646][15401] Updated weights for policy 0, policy_version 681791 (0.0040) [2024-06-24 14:47:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11170529280. Throughput: 0: 42333.3. Samples: 11170597920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 14:47:10,335][15349] Signal inference workers to stop experience collection... (165350 times) [2024-06-24 14:47:10,336][15349] Signal inference workers to resume experience collection... (165350 times) [2024-06-24 14:47:10,355][15401] InferenceWorker_p0-w0: stopping experience collection (165350 times) [2024-06-24 14:47:10,355][15401] InferenceWorker_p0-w0: resuming experience collection (165350 times) [2024-06-24 14:47:10,481][15401] Updated weights for policy 0, policy_version 681801 (0.0039) [2024-06-24 14:47:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42710.0). Total num frames: 11170742272. Throughput: 0: 42564.3. Samples: 11170858660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:13,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:47:14,410][15401] Updated weights for policy 0, policy_version 681811 (0.0032) [2024-06-24 14:47:18,235][15401] Updated weights for policy 0, policy_version 681821 (0.0027) [2024-06-24 14:47:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 11170955264. Throughput: 0: 42479.5. Samples: 11171112000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:18,393][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:47:21,894][15401] Updated weights for policy 0, policy_version 681831 (0.0027) [2024-06-24 14:47:23,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42598.2, 300 sec: 42598.7). Total num frames: 11171168256. Throughput: 0: 42464.1. Samples: 11171240740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:23,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 14:47:25,996][15401] Updated weights for policy 0, policy_version 681841 (0.0041) [2024-06-24 14:47:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11171381248. Throughput: 0: 42644.5. Samples: 11171496500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 14:47:29,907][15401] Updated weights for policy 0, policy_version 681851 (0.0036) [2024-06-24 14:47:33,389][15132] Fps is (10 sec: 39322.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11171561472. Throughput: 0: 42655.2. Samples: 11171751500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 14:47:33,879][15401] Updated weights for policy 0, policy_version 681861 (0.0033) [2024-06-24 14:47:37,500][15401] Updated weights for policy 0, policy_version 681871 (0.0030) [2024-06-24 14:47:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11171807232. Throughput: 0: 42505.7. Samples: 11171876680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 14:47:41,763][15401] Updated weights for policy 0, policy_version 681881 (0.0029) [2024-06-24 14:47:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11172036608. Throughput: 0: 42595.1. Samples: 11172135000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 14:47:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681887_11172036608.pth... [2024-06-24 14:47:43,445][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681261_11161780224.pth [2024-06-24 14:47:45,128][15401] Updated weights for policy 0, policy_version 681891 (0.0046) [2024-06-24 14:47:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11172216832. Throughput: 0: 42714.7. Samples: 11172392120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 14:47:49,287][15401] Updated weights for policy 0, policy_version 681901 (0.0038) [2024-06-24 14:47:52,756][15401] Updated weights for policy 0, policy_version 681911 (0.0032) [2024-06-24 14:47:53,393][15132] Fps is (10 sec: 42583.2, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 11172462592. Throughput: 0: 42506.8. Samples: 11172510880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:53,394][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 14:47:56,897][15401] Updated weights for policy 0, policy_version 681921 (0.0036) [2024-06-24 14:47:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11172659200. Throughput: 0: 42458.8. Samples: 11172769200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:47:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 14:48:00,476][15401] Updated weights for policy 0, policy_version 681931 (0.0032) [2024-06-24 14:48:03,389][15132] Fps is (10 sec: 39335.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11172855808. Throughput: 0: 42626.7. Samples: 11173030100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:48:03,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 14:48:04,544][15401] Updated weights for policy 0, policy_version 681941 (0.0033) [2024-06-24 14:48:08,119][15401] Updated weights for policy 0, policy_version 681951 (0.0031) [2024-06-24 14:48:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11173085184. Throughput: 0: 42534.0. Samples: 11173154760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 14:48:08,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 14:48:12,253][15401] Updated weights for policy 0, policy_version 681961 (0.0033) [2024-06-24 14:48:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42710.6). Total num frames: 11173314560. Throughput: 0: 42562.1. Samples: 11173411800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 14:48:15,905][15401] Updated weights for policy 0, policy_version 681971 (0.0044) [2024-06-24 14:48:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 11173494784. Throughput: 0: 42631.5. Samples: 11173669920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 14:48:19,916][15401] Updated weights for policy 0, policy_version 681981 (0.0031) [2024-06-24 14:48:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11173724160. Throughput: 0: 42560.9. Samples: 11173791920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:23,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 14:48:23,483][15401] Updated weights for policy 0, policy_version 681991 (0.0026) [2024-06-24 14:48:27,542][15401] Updated weights for policy 0, policy_version 682001 (0.0039) [2024-06-24 14:48:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11173953536. Throughput: 0: 42768.9. Samples: 11174059600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 14:48:31,033][15349] Signal inference workers to stop experience collection... (165400 times) [2024-06-24 14:48:31,078][15401] InferenceWorker_p0-w0: stopping experience collection (165400 times) [2024-06-24 14:48:31,086][15349] Signal inference workers to resume experience collection... (165400 times) [2024-06-24 14:48:31,093][15401] InferenceWorker_p0-w0: resuming experience collection (165400 times) [2024-06-24 14:48:31,097][15401] Updated weights for policy 0, policy_version 682011 (0.0031) [2024-06-24 14:48:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11174133760. Throughput: 0: 42765.9. Samples: 11174316580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:33,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 14:48:35,111][15401] Updated weights for policy 0, policy_version 682021 (0.0034) [2024-06-24 14:48:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11174379520. Throughput: 0: 42765.6. Samples: 11174435180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 14:48:38,580][15401] Updated weights for policy 0, policy_version 682031 (0.0035) [2024-06-24 14:48:42,660][15401] Updated weights for policy 0, policy_version 682041 (0.0033) [2024-06-24 14:48:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11174592512. Throughput: 0: 43016.3. Samples: 11174704940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 14:48:46,159][15401] Updated weights for policy 0, policy_version 682051 (0.0047) [2024-06-24 14:48:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11174772736. Throughput: 0: 42907.6. Samples: 11174960940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 14:48:50,409][15401] Updated weights for policy 0, policy_version 682061 (0.0022) [2024-06-24 14:48:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 11175018496. Throughput: 0: 42870.7. Samples: 11175083940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:48:53,955][15401] Updated weights for policy 0, policy_version 682071 (0.0029) [2024-06-24 14:48:57,900][15401] Updated weights for policy 0, policy_version 682081 (0.0046) [2024-06-24 14:48:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11175215104. Throughput: 0: 42965.3. Samples: 11175345340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:48:58,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 14:49:01,464][15401] Updated weights for policy 0, policy_version 682091 (0.0036) [2024-06-24 14:49:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 11175428096. Throughput: 0: 43134.6. Samples: 11175610980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 14:49:05,564][15401] Updated weights for policy 0, policy_version 682101 (0.0036) [2024-06-24 14:49:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11175657472. Throughput: 0: 43208.0. Samples: 11175736280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:08,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 14:49:08,881][15401] Updated weights for policy 0, policy_version 682111 (0.0037) [2024-06-24 14:49:13,198][15401] Updated weights for policy 0, policy_version 682121 (0.0033) [2024-06-24 14:49:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11175870464. Throughput: 0: 43051.7. Samples: 11175996920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 14:49:16,365][15401] Updated weights for policy 0, policy_version 682131 (0.0047) [2024-06-24 14:49:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11176083456. Throughput: 0: 43214.6. Samples: 11176261240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 14:49:20,782][15401] Updated weights for policy 0, policy_version 682141 (0.0035) [2024-06-24 14:49:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11176312832. Throughput: 0: 43406.4. Samples: 11176388460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 14:49:23,889][15401] Updated weights for policy 0, policy_version 682151 (0.0025) [2024-06-24 14:49:28,345][15401] Updated weights for policy 0, policy_version 682161 (0.0046) [2024-06-24 14:49:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11176525824. Throughput: 0: 43177.4. Samples: 11176647920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 14:49:32,103][15401] Updated weights for policy 0, policy_version 682171 (0.0036) [2024-06-24 14:49:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11176722432. Throughput: 0: 43108.4. Samples: 11176900820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 14:49:36,469][15401] Updated weights for policy 0, policy_version 682181 (0.0048) [2024-06-24 14:49:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11176968192. Throughput: 0: 43290.2. Samples: 11177032000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 14:49:39,661][15401] Updated weights for policy 0, policy_version 682191 (0.0026) [2024-06-24 14:49:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11177164800. Throughput: 0: 43167.1. Samples: 11177287760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 14:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682200_11177164800.pth... [2024-06-24 14:49:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681574_11166908416.pth [2024-06-24 14:49:44,080][15349] Signal inference workers to stop experience collection... (165450 times) [2024-06-24 14:49:44,140][15401] InferenceWorker_p0-w0: stopping experience collection (165450 times) [2024-06-24 14:49:44,190][15349] Signal inference workers to resume experience collection... (165450 times) [2024-06-24 14:49:44,191][15401] InferenceWorker_p0-w0: resuming experience collection (165450 times) [2024-06-24 14:49:44,192][15401] Updated weights for policy 0, policy_version 682201 (0.0040) [2024-06-24 14:49:47,369][15401] Updated weights for policy 0, policy_version 682211 (0.0037) [2024-06-24 14:49:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 11177377792. Throughput: 0: 42932.4. Samples: 11177542940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 14:49:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:49:51,726][15401] Updated weights for policy 0, policy_version 682221 (0.0030) [2024-06-24 14:49:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11177590784. Throughput: 0: 42888.5. Samples: 11177666260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:49:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 14:49:55,011][15401] Updated weights for policy 0, policy_version 682231 (0.0040) [2024-06-24 14:49:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 11177803776. Throughput: 0: 42817.2. Samples: 11177923700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:49:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 14:49:59,172][15401] Updated weights for policy 0, policy_version 682241 (0.0037) [2024-06-24 14:50:02,670][15401] Updated weights for policy 0, policy_version 682251 (0.0029) [2024-06-24 14:50:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 11178016768. Throughput: 0: 42778.3. Samples: 11178186260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:03,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 14:50:06,598][15401] Updated weights for policy 0, policy_version 682261 (0.0040) [2024-06-24 14:50:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11178229760. Throughput: 0: 42734.2. Samples: 11178311500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 14:50:10,609][15401] Updated weights for policy 0, policy_version 682271 (0.0033) [2024-06-24 14:50:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 11178459136. Throughput: 0: 42740.0. Samples: 11178571220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:13,396][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 14:50:14,058][15401] Updated weights for policy 0, policy_version 682281 (0.0030) [2024-06-24 14:50:17,978][15401] Updated weights for policy 0, policy_version 682291 (0.0031) [2024-06-24 14:50:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11178672128. Throughput: 0: 42864.1. Samples: 11178829700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 14:50:21,569][15401] Updated weights for policy 0, policy_version 682301 (0.0038) [2024-06-24 14:50:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 11178885120. Throughput: 0: 42903.9. Samples: 11178962680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:23,391][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 14:50:25,809][15401] Updated weights for policy 0, policy_version 682311 (0.0041) [2024-06-24 14:50:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11179098112. Throughput: 0: 42817.5. Samples: 11179214540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 14:50:29,117][15401] Updated weights for policy 0, policy_version 682321 (0.0043) [2024-06-24 14:50:33,299][15401] Updated weights for policy 0, policy_version 682331 (0.0032) [2024-06-24 14:50:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11179311104. Throughput: 0: 42966.9. Samples: 11179476440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 14:50:36,855][15401] Updated weights for policy 0, policy_version 682341 (0.0038) [2024-06-24 14:50:38,391][15132] Fps is (10 sec: 44229.1, 60 sec: 42870.2, 300 sec: 42820.5). Total num frames: 11179540480. Throughput: 0: 43142.7. Samples: 11179607760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:38,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 14:50:40,837][15401] Updated weights for policy 0, policy_version 682351 (0.0044) [2024-06-24 14:50:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11179737088. Throughput: 0: 42944.1. Samples: 11179856180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 14:50:44,955][15401] Updated weights for policy 0, policy_version 682361 (0.0034) [2024-06-24 14:50:48,390][15132] Fps is (10 sec: 40967.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11179950080. Throughput: 0: 42774.6. Samples: 11180111120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:48,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-24 14:50:48,707][15401] Updated weights for policy 0, policy_version 682371 (0.0030) [2024-06-24 14:50:52,570][15401] Updated weights for policy 0, policy_version 682381 (0.0033) [2024-06-24 14:50:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11180163072. Throughput: 0: 42843.0. Samples: 11180239440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:53,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-24 14:50:56,350][15401] Updated weights for policy 0, policy_version 682391 (0.0022) [2024-06-24 14:50:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11180376064. Throughput: 0: 42747.4. Samples: 11180494860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:50:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 14:51:00,212][15401] Updated weights for policy 0, policy_version 682401 (0.0035) [2024-06-24 14:51:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11180572672. Throughput: 0: 42661.8. Samples: 11180749480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 14:51:04,206][15401] Updated weights for policy 0, policy_version 682411 (0.0031) [2024-06-24 14:51:07,828][15401] Updated weights for policy 0, policy_version 682421 (0.0036) [2024-06-24 14:51:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11180802048. Throughput: 0: 42565.1. Samples: 11180878100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 14:51:11,796][15401] Updated weights for policy 0, policy_version 682431 (0.0034) [2024-06-24 14:51:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 11181015040. Throughput: 0: 42695.9. Samples: 11181135860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:13,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 14:51:15,426][15401] Updated weights for policy 0, policy_version 682441 (0.0028) [2024-06-24 14:51:16,644][15349] Signal inference workers to stop experience collection... (165500 times) [2024-06-24 14:51:16,645][15349] Signal inference workers to resume experience collection... (165500 times) [2024-06-24 14:51:16,665][15401] InferenceWorker_p0-w0: stopping experience collection (165500 times) [2024-06-24 14:51:16,666][15401] InferenceWorker_p0-w0: resuming experience collection (165500 times) [2024-06-24 14:51:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11181211648. Throughput: 0: 42684.9. Samples: 11181397260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:18,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-24 14:51:19,502][15401] Updated weights for policy 0, policy_version 682451 (0.0038) [2024-06-24 14:51:23,031][15401] Updated weights for policy 0, policy_version 682461 (0.0031) [2024-06-24 14:51:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11181441024. Throughput: 0: 42534.4. Samples: 11181521740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:23,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 14:51:27,046][15401] Updated weights for policy 0, policy_version 682471 (0.0034) [2024-06-24 14:51:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11181670400. Throughput: 0: 42751.6. Samples: 11181780000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 14:51:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 14:51:30,585][15401] Updated weights for policy 0, policy_version 682481 (0.0029) [2024-06-24 14:51:33,396][15132] Fps is (10 sec: 42572.0, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 11181867008. Throughput: 0: 42898.4. Samples: 11182041820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:33,396][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 14:51:34,618][15401] Updated weights for policy 0, policy_version 682491 (0.0035) [2024-06-24 14:51:38,387][15401] Updated weights for policy 0, policy_version 682501 (0.0037) [2024-06-24 14:51:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42597.9, 300 sec: 42820.2). Total num frames: 11182096384. Throughput: 0: 42780.0. Samples: 11182164640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:38,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 14:51:42,264][15401] Updated weights for policy 0, policy_version 682511 (0.0048) [2024-06-24 14:51:43,389][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11182292992. Throughput: 0: 42765.9. Samples: 11182419320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:43,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 14:51:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682514_11182309376.pth... [2024-06-24 14:51:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000681887_11172036608.pth [2024-06-24 14:51:45,930][15401] Updated weights for policy 0, policy_version 682521 (0.0033) [2024-06-24 14:51:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11182505984. Throughput: 0: 42726.5. Samples: 11182672180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 14:51:49,857][15401] Updated weights for policy 0, policy_version 682531 (0.0036) [2024-06-24 14:51:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11182735360. Throughput: 0: 42837.7. Samples: 11182805800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 14:51:53,725][15401] Updated weights for policy 0, policy_version 682541 (0.0032) [2024-06-24 14:51:57,863][15401] Updated weights for policy 0, policy_version 682551 (0.0041) [2024-06-24 14:51:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11182931968. Throughput: 0: 42740.6. Samples: 11183059180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:51:58,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 14:52:01,207][15401] Updated weights for policy 0, policy_version 682561 (0.0041) [2024-06-24 14:52:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 11183161344. Throughput: 0: 42496.8. Samples: 11183309620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 14:52:05,570][15401] Updated weights for policy 0, policy_version 682571 (0.0040) [2024-06-24 14:52:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 11183341568. Throughput: 0: 42772.7. Samples: 11183446500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 14:52:09,085][15401] Updated weights for policy 0, policy_version 682581 (0.0034) [2024-06-24 14:52:13,071][15401] Updated weights for policy 0, policy_version 682591 (0.0030) [2024-06-24 14:52:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 11183570944. Throughput: 0: 42642.9. Samples: 11183698940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:52:16,560][15401] Updated weights for policy 0, policy_version 682601 (0.0041) [2024-06-24 14:52:18,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 11183800320. Throughput: 0: 42452.1. Samples: 11183951900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 14:52:20,968][15401] Updated weights for policy 0, policy_version 682611 (0.0032) [2024-06-24 14:52:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11183996928. Throughput: 0: 42689.8. Samples: 11184085580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 14:52:24,170][15401] Updated weights for policy 0, policy_version 682621 (0.0023) [2024-06-24 14:52:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11184209920. Throughput: 0: 42570.6. Samples: 11184335000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 14:52:28,679][15401] Updated weights for policy 0, policy_version 682631 (0.0036) [2024-06-24 14:52:31,924][15401] Updated weights for policy 0, policy_version 682641 (0.0029) [2024-06-24 14:52:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42876.0, 300 sec: 42820.6). Total num frames: 11184439296. Throughput: 0: 42723.2. Samples: 11184594720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 14:52:36,259][15401] Updated weights for policy 0, policy_version 682651 (0.0039) [2024-06-24 14:52:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 11184635904. Throughput: 0: 42698.3. Samples: 11184727220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 14:52:39,638][15401] Updated weights for policy 0, policy_version 682661 (0.0035) [2024-06-24 14:52:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11184848896. Throughput: 0: 42682.1. Samples: 11184979880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 14:52:43,846][15401] Updated weights for policy 0, policy_version 682671 (0.0037) [2024-06-24 14:52:46,254][15349] Signal inference workers to stop experience collection... (165550 times) [2024-06-24 14:52:46,288][15401] InferenceWorker_p0-w0: stopping experience collection (165550 times) [2024-06-24 14:52:46,310][15349] Signal inference workers to resume experience collection... (165550 times) [2024-06-24 14:52:46,316][15401] InferenceWorker_p0-w0: resuming experience collection (165550 times) [2024-06-24 14:52:47,352][15401] Updated weights for policy 0, policy_version 682681 (0.0029) [2024-06-24 14:52:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.5). Total num frames: 11185078272. Throughput: 0: 42852.5. Samples: 11185237980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:48,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-24 14:52:51,985][15401] Updated weights for policy 0, policy_version 682691 (0.0039) [2024-06-24 14:52:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11185291264. Throughput: 0: 42776.7. Samples: 11185371460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:53,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:52:55,038][15401] Updated weights for policy 0, policy_version 682701 (0.0029) [2024-06-24 14:52:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11185487872. Throughput: 0: 42754.8. Samples: 11185622900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:52:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 14:52:59,582][15401] Updated weights for policy 0, policy_version 682711 (0.0035) [2024-06-24 14:53:02,688][15401] Updated weights for policy 0, policy_version 682721 (0.0030) [2024-06-24 14:53:03,391][15132] Fps is (10 sec: 40953.3, 60 sec: 42324.2, 300 sec: 42764.8). Total num frames: 11185700864. Throughput: 0: 42967.8. Samples: 11185885520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:53:03,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 14:53:06,976][15401] Updated weights for policy 0, policy_version 682731 (0.0023) [2024-06-24 14:53:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11185930240. Throughput: 0: 42920.0. Samples: 11186016980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 14:53:08,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 14:53:10,318][15401] Updated weights for policy 0, policy_version 682741 (0.0032) [2024-06-24 14:53:13,390][15132] Fps is (10 sec: 44244.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11186143232. Throughput: 0: 42853.7. Samples: 11186263420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 14:53:14,537][15401] Updated weights for policy 0, policy_version 682751 (0.0045) [2024-06-24 14:53:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11186339840. Throughput: 0: 42877.4. Samples: 11186524200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:18,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 14:53:18,482][15401] Updated weights for policy 0, policy_version 682761 (0.0036) [2024-06-24 14:53:22,219][15401] Updated weights for policy 0, policy_version 682771 (0.0029) [2024-06-24 14:53:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11186569216. Throughput: 0: 42763.5. Samples: 11186651580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 14:53:25,915][15401] Updated weights for policy 0, policy_version 682781 (0.0033) [2024-06-24 14:53:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11186765824. Throughput: 0: 42812.6. Samples: 11186906440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:28,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 14:53:29,901][15401] Updated weights for policy 0, policy_version 682791 (0.0038) [2024-06-24 14:53:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11186995200. Throughput: 0: 42748.0. Samples: 11187161640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:33,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 14:53:33,563][15401] Updated weights for policy 0, policy_version 682801 (0.0034) [2024-06-24 14:53:37,497][15401] Updated weights for policy 0, policy_version 682811 (0.0034) [2024-06-24 14:53:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11187208192. Throughput: 0: 42758.7. Samples: 11187295600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 14:53:41,236][15401] Updated weights for policy 0, policy_version 682821 (0.0040) [2024-06-24 14:53:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11187421184. Throughput: 0: 42657.3. Samples: 11187542480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 14:53:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682826_11187421184.pth... [2024-06-24 14:53:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682200_11177164800.pth [2024-06-24 14:53:45,227][15401] Updated weights for policy 0, policy_version 682831 (0.0035) [2024-06-24 14:53:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 11187634176. Throughput: 0: 42611.3. Samples: 11187803060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:48,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 14:53:48,811][15401] Updated weights for policy 0, policy_version 682841 (0.0031) [2024-06-24 14:53:52,827][15401] Updated weights for policy 0, policy_version 682851 (0.0046) [2024-06-24 14:53:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 11187863552. Throughput: 0: 42556.0. Samples: 11187932000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 14:53:56,283][15401] Updated weights for policy 0, policy_version 682861 (0.0031) [2024-06-24 14:53:58,390][15132] Fps is (10 sec: 44244.7, 60 sec: 43144.1, 300 sec: 42876.0). Total num frames: 11188076544. Throughput: 0: 42732.8. Samples: 11188186420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:53:58,391][15132] Avg episode reward: [(0, '0.235')] [2024-06-24 14:54:00,751][15401] Updated weights for policy 0, policy_version 682871 (0.0034) [2024-06-24 14:54:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42871.0, 300 sec: 42764.7). Total num frames: 11188273152. Throughput: 0: 42765.2. Samples: 11188448740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:03,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 14:54:04,000][15401] Updated weights for policy 0, policy_version 682881 (0.0025) [2024-06-24 14:54:08,158][15401] Updated weights for policy 0, policy_version 682891 (0.0036) [2024-06-24 14:54:08,392][15132] Fps is (10 sec: 40953.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11188486144. Throughput: 0: 42798.5. Samples: 11188577620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:08,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 14:54:11,563][15401] Updated weights for policy 0, policy_version 682901 (0.0031) [2024-06-24 14:54:13,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11188731904. Throughput: 0: 42636.8. Samples: 11188825100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 14:54:16,050][15401] Updated weights for policy 0, policy_version 682911 (0.0041) [2024-06-24 14:54:16,964][15349] Signal inference workers to stop experience collection... (165600 times) [2024-06-24 14:54:16,964][15349] Signal inference workers to resume experience collection... (165600 times) [2024-06-24 14:54:16,984][15401] InferenceWorker_p0-w0: stopping experience collection (165600 times) [2024-06-24 14:54:16,984][15401] InferenceWorker_p0-w0: resuming experience collection (165600 times) [2024-06-24 14:54:18,392][15132] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 11188928512. Throughput: 0: 42868.4. Samples: 11189090820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:18,393][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 14:54:19,131][15401] Updated weights for policy 0, policy_version 682921 (0.0030) [2024-06-24 14:54:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11189125120. Throughput: 0: 42787.9. Samples: 11189221060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 14:54:23,430][15401] Updated weights for policy 0, policy_version 682931 (0.0046) [2024-06-24 14:54:26,611][15401] Updated weights for policy 0, policy_version 682941 (0.0039) [2024-06-24 14:54:28,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 11189370880. Throughput: 0: 43031.8. Samples: 11189478900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 14:54:30,935][15401] Updated weights for policy 0, policy_version 682951 (0.0023) [2024-06-24 14:54:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11189583872. Throughput: 0: 43066.2. Samples: 11189740940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 14:54:34,338][15401] Updated weights for policy 0, policy_version 682961 (0.0038) [2024-06-24 14:54:38,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11189780480. Throughput: 0: 42936.8. Samples: 11189864160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 14:54:38,570][15401] Updated weights for policy 0, policy_version 682971 (0.0041) [2024-06-24 14:54:42,014][15401] Updated weights for policy 0, policy_version 682981 (0.0026) [2024-06-24 14:54:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11190026240. Throughput: 0: 43075.8. Samples: 11190124800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 14:54:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 14:54:46,587][15401] Updated weights for policy 0, policy_version 682991 (0.0034) [2024-06-24 14:54:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 11190222848. Throughput: 0: 43031.6. Samples: 11190385060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:54:48,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 14:54:49,666][15401] Updated weights for policy 0, policy_version 683001 (0.0031) [2024-06-24 14:54:53,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11190403072. Throughput: 0: 42957.4. Samples: 11190510600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:54:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 14:54:54,030][15401] Updated weights for policy 0, policy_version 683011 (0.0038) [2024-06-24 14:54:57,251][15401] Updated weights for policy 0, policy_version 683021 (0.0028) [2024-06-24 14:54:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43418.1, 300 sec: 42931.6). Total num frames: 11190681600. Throughput: 0: 43216.9. Samples: 11190769860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:54:58,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 14:55:01,573][15401] Updated weights for policy 0, policy_version 683031 (0.0032) [2024-06-24 14:55:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 11190861824. Throughput: 0: 43127.6. Samples: 11191031460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 14:55:04,946][15401] Updated weights for policy 0, policy_version 683041 (0.0046) [2024-06-24 14:55:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11191058432. Throughput: 0: 42868.1. Samples: 11191150120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:08,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 14:55:09,332][15401] Updated weights for policy 0, policy_version 683051 (0.0027) [2024-06-24 14:55:12,451][15401] Updated weights for policy 0, policy_version 683061 (0.0050) [2024-06-24 14:55:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11191304192. Throughput: 0: 42879.5. Samples: 11191408480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 14:55:16,909][15401] Updated weights for policy 0, policy_version 683071 (0.0047) [2024-06-24 14:55:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 11191484416. Throughput: 0: 42904.8. Samples: 11191671660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:18,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 14:55:20,088][15401] Updated weights for policy 0, policy_version 683081 (0.0042) [2024-06-24 14:55:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11191697408. Throughput: 0: 42860.6. Samples: 11191792880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 14:55:24,869][15401] Updated weights for policy 0, policy_version 683091 (0.0039) [2024-06-24 14:55:27,497][15401] Updated weights for policy 0, policy_version 683101 (0.0034) [2024-06-24 14:55:28,392][15132] Fps is (10 sec: 45865.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11191943168. Throughput: 0: 42696.5. Samples: 11192046240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:28,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 14:55:32,340][15401] Updated weights for policy 0, policy_version 683111 (0.0034) [2024-06-24 14:55:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 11192139776. Throughput: 0: 42824.4. Samples: 11192312160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 14:55:35,396][15401] Updated weights for policy 0, policy_version 683121 (0.0029) [2024-06-24 14:55:38,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11192352768. Throughput: 0: 42820.5. Samples: 11192437520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:38,391][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 14:55:40,055][15401] Updated weights for policy 0, policy_version 683131 (0.0048) [2024-06-24 14:55:42,998][15401] Updated weights for policy 0, policy_version 683141 (0.0033) [2024-06-24 14:55:43,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 11192582144. Throughput: 0: 42790.7. Samples: 11192695540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:43,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 14:55:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683141_11192582144.pth... [2024-06-24 14:55:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682514_11182309376.pth [2024-06-24 14:55:47,475][15401] Updated weights for policy 0, policy_version 683151 (0.0028) [2024-06-24 14:55:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11192778752. Throughput: 0: 42889.9. Samples: 11192961500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 14:55:50,494][15401] Updated weights for policy 0, policy_version 683161 (0.0035) [2024-06-24 14:55:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11192991744. Throughput: 0: 43041.8. Samples: 11193087000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 14:55:55,253][15401] Updated weights for policy 0, policy_version 683171 (0.0043) [2024-06-24 14:55:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11193221120. Throughput: 0: 42919.4. Samples: 11193339860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:55:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 14:55:58,547][15401] Updated weights for policy 0, policy_version 683181 (0.0034) [2024-06-24 14:56:02,953][15401] Updated weights for policy 0, policy_version 683191 (0.0031) [2024-06-24 14:56:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11193417728. Throughput: 0: 42813.0. Samples: 11193598240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:56:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 14:56:03,564][15349] Signal inference workers to stop experience collection... (165650 times) [2024-06-24 14:56:03,564][15349] Signal inference workers to resume experience collection... (165650 times) [2024-06-24 14:56:03,619][15401] InferenceWorker_p0-w0: stopping experience collection (165650 times) [2024-06-24 14:56:03,619][15401] InferenceWorker_p0-w0: resuming experience collection (165650 times) [2024-06-24 14:56:06,087][15401] Updated weights for policy 0, policy_version 683201 (0.0032) [2024-06-24 14:56:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11193630720. Throughput: 0: 42857.8. Samples: 11193721480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:56:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 14:56:10,614][15401] Updated weights for policy 0, policy_version 683211 (0.0044) [2024-06-24 14:56:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11193860096. Throughput: 0: 42904.9. Samples: 11193976860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:56:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 14:56:13,818][15401] Updated weights for policy 0, policy_version 683221 (0.0041) [2024-06-24 14:56:18,296][15401] Updated weights for policy 0, policy_version 683231 (0.0038) [2024-06-24 14:56:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11194056704. Throughput: 0: 42811.1. Samples: 11194238660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:56:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 14:56:21,619][15401] Updated weights for policy 0, policy_version 683241 (0.0039) [2024-06-24 14:56:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11194286080. Throughput: 0: 42757.2. Samples: 11194361600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-24 14:56:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 14:56:25,849][15401] Updated weights for policy 0, policy_version 683251 (0.0042) [2024-06-24 14:56:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.0, 300 sec: 42821.5). Total num frames: 11194499072. Throughput: 0: 42642.6. Samples: 11194614360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 14:56:29,365][15401] Updated weights for policy 0, policy_version 683261 (0.0048) [2024-06-24 14:56:33,276][15401] Updated weights for policy 0, policy_version 683271 (0.0021) [2024-06-24 14:56:33,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 11194712064. Throughput: 0: 42479.8. Samples: 11194873200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:33,393][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 14:56:36,803][15401] Updated weights for policy 0, policy_version 683281 (0.0036) [2024-06-24 14:56:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11194941440. Throughput: 0: 42598.3. Samples: 11195003920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 14:56:40,647][15401] Updated weights for policy 0, policy_version 683291 (0.0030) [2024-06-24 14:56:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 11195121664. Throughput: 0: 42645.8. Samples: 11195258920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 14:56:44,453][15401] Updated weights for policy 0, policy_version 683301 (0.0031) [2024-06-24 14:56:48,281][15401] Updated weights for policy 0, policy_version 683311 (0.0043) [2024-06-24 14:56:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 11195367424. Throughput: 0: 42622.6. Samples: 11195516360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:48,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 14:56:52,182][15401] Updated weights for policy 0, policy_version 683321 (0.0024) [2024-06-24 14:56:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11195564032. Throughput: 0: 42867.6. Samples: 11195650520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 14:56:56,138][15401] Updated weights for policy 0, policy_version 683331 (0.0038) [2024-06-24 14:56:58,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11195760640. Throughput: 0: 42719.0. Samples: 11195899220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:56:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 14:56:59,995][15401] Updated weights for policy 0, policy_version 683341 (0.0028) [2024-06-24 14:57:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11195990016. Throughput: 0: 42691.5. Samples: 11196159780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:03,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 14:57:03,575][15401] Updated weights for policy 0, policy_version 683351 (0.0026) [2024-06-24 14:57:07,559][15401] Updated weights for policy 0, policy_version 683361 (0.0040) [2024-06-24 14:57:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11196219392. Throughput: 0: 42959.8. Samples: 11196294780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 14:57:08,432][15349] Signal inference workers to stop experience collection... (165700 times) [2024-06-24 14:57:08,476][15401] InferenceWorker_p0-w0: stopping experience collection (165700 times) [2024-06-24 14:57:08,485][15349] Signal inference workers to resume experience collection... (165700 times) [2024-06-24 14:57:08,495][15401] InferenceWorker_p0-w0: resuming experience collection (165700 times) [2024-06-24 14:57:11,092][15401] Updated weights for policy 0, policy_version 683371 (0.0039) [2024-06-24 14:57:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11196416000. Throughput: 0: 42896.9. Samples: 11196544720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 14:57:15,321][15401] Updated weights for policy 0, policy_version 683381 (0.0038) [2024-06-24 14:57:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11196661760. Throughput: 0: 42820.9. Samples: 11196800040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 14:57:19,024][15401] Updated weights for policy 0, policy_version 683391 (0.0038) [2024-06-24 14:57:23,040][15401] Updated weights for policy 0, policy_version 683401 (0.0028) [2024-06-24 14:57:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.9, 300 sec: 42875.8). Total num frames: 11196858368. Throughput: 0: 42910.6. Samples: 11196935000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:23,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 14:57:26,549][15401] Updated weights for policy 0, policy_version 683411 (0.0030) [2024-06-24 14:57:28,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11197071360. Throughput: 0: 42851.2. Samples: 11197187320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:28,392][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 14:57:30,609][15401] Updated weights for policy 0, policy_version 683421 (0.0035) [2024-06-24 14:57:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 11197284352. Throughput: 0: 42871.7. Samples: 11197445480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 14:57:34,031][15401] Updated weights for policy 0, policy_version 683431 (0.0033) [2024-06-24 14:57:38,287][15401] Updated weights for policy 0, policy_version 683441 (0.0038) [2024-06-24 14:57:38,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11197497344. Throughput: 0: 42740.9. Samples: 11197573860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:38,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 14:57:41,538][15401] Updated weights for policy 0, policy_version 683451 (0.0043) [2024-06-24 14:57:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11197710336. Throughput: 0: 42873.7. Samples: 11197828540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 14:57:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683455_11197726720.pth... [2024-06-24 14:57:43,584][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000682826_11187421184.pth [2024-06-24 14:57:45,994][15401] Updated weights for policy 0, policy_version 683461 (0.0037) [2024-06-24 14:57:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11197923328. Throughput: 0: 42921.3. Samples: 11198091240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 14:57:49,075][15401] Updated weights for policy 0, policy_version 683471 (0.0029) [2024-06-24 14:57:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11198136320. Throughput: 0: 42666.1. Samples: 11198214760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 14:57:53,719][15401] Updated weights for policy 0, policy_version 683481 (0.0038) [2024-06-24 14:57:56,816][15401] Updated weights for policy 0, policy_version 683491 (0.0039) [2024-06-24 14:57:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 11198349312. Throughput: 0: 42710.4. Samples: 11198466680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:57:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 14:58:01,307][15401] Updated weights for policy 0, policy_version 683501 (0.0034) [2024-06-24 14:58:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11198562304. Throughput: 0: 42863.3. Samples: 11198728880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 14:58:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 14:58:04,885][15401] Updated weights for policy 0, policy_version 683511 (0.0032) [2024-06-24 14:58:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 11198775296. Throughput: 0: 42696.4. Samples: 11198856340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:08,393][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 14:58:08,999][15401] Updated weights for policy 0, policy_version 683521 (0.0028) [2024-06-24 14:58:12,425][15401] Updated weights for policy 0, policy_version 683531 (0.0032) [2024-06-24 14:58:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11198988288. Throughput: 0: 42706.5. Samples: 11199109020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 14:58:16,750][15401] Updated weights for policy 0, policy_version 683541 (0.0044) [2024-06-24 14:58:18,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 11199201280. Throughput: 0: 42629.6. Samples: 11199363920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:18,393][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 14:58:20,027][15401] Updated weights for policy 0, policy_version 683551 (0.0031) [2024-06-24 14:58:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 11199414272. Throughput: 0: 42598.7. Samples: 11199490800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 14:58:24,379][15401] Updated weights for policy 0, policy_version 683561 (0.0027) [2024-06-24 14:58:27,779][15401] Updated weights for policy 0, policy_version 683571 (0.0026) [2024-06-24 14:58:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11199627264. Throughput: 0: 42638.4. Samples: 11199747260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 14:58:32,346][15401] Updated weights for policy 0, policy_version 683581 (0.0027) [2024-06-24 14:58:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11199840256. Throughput: 0: 42487.8. Samples: 11200003180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 14:58:35,297][15401] Updated weights for policy 0, policy_version 683591 (0.0030) [2024-06-24 14:58:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11200036864. Throughput: 0: 42716.1. Samples: 11200136980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:38,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 14:58:39,806][15401] Updated weights for policy 0, policy_version 683601 (0.0034) [2024-06-24 14:58:41,246][15349] Signal inference workers to stop experience collection... (165750 times) [2024-06-24 14:58:41,289][15401] InferenceWorker_p0-w0: stopping experience collection (165750 times) [2024-06-24 14:58:41,302][15349] Signal inference workers to resume experience collection... (165750 times) [2024-06-24 14:58:41,306][15401] InferenceWorker_p0-w0: resuming experience collection (165750 times) [2024-06-24 14:58:42,869][15401] Updated weights for policy 0, policy_version 683611 (0.0054) [2024-06-24 14:58:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 11200282624. Throughput: 0: 42565.3. Samples: 11200382120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 14:58:47,419][15401] Updated weights for policy 0, policy_version 683621 (0.0037) [2024-06-24 14:58:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 11200479232. Throughput: 0: 42555.9. Samples: 11200644000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:48,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 14:58:50,536][15401] Updated weights for policy 0, policy_version 683631 (0.0040) [2024-06-24 14:58:53,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 11200675840. Throughput: 0: 42571.0. Samples: 11200771940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 14:58:55,005][15401] Updated weights for policy 0, policy_version 683641 (0.0033) [2024-06-24 14:58:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 11200921600. Throughput: 0: 42477.4. Samples: 11201020500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:58:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 14:58:58,856][15401] Updated weights for policy 0, policy_version 683651 (0.0044) [2024-06-24 14:59:03,037][15401] Updated weights for policy 0, policy_version 683661 (0.0035) [2024-06-24 14:59:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 11201101824. Throughput: 0: 42661.1. Samples: 11201283560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 14:59:06,393][15401] Updated weights for policy 0, policy_version 683671 (0.0044) [2024-06-24 14:59:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11201331200. Throughput: 0: 42613.7. Samples: 11201408420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:08,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 14:59:10,751][15401] Updated weights for policy 0, policy_version 683681 (0.0037) [2024-06-24 14:59:13,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 11201544192. Throughput: 0: 42593.9. Samples: 11201664000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:13,391][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 14:59:14,137][15401] Updated weights for policy 0, policy_version 683691 (0.0030) [2024-06-24 14:59:18,390][15132] Fps is (10 sec: 40956.5, 60 sec: 42326.4, 300 sec: 42764.9). Total num frames: 11201740800. Throughput: 0: 42756.3. Samples: 11201927260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:18,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 14:59:18,462][15401] Updated weights for policy 0, policy_version 683701 (0.0035) [2024-06-24 14:59:21,514][15401] Updated weights for policy 0, policy_version 683711 (0.0027) [2024-06-24 14:59:23,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11201953792. Throughput: 0: 42538.3. Samples: 11202051200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:23,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 14:59:25,977][15401] Updated weights for policy 0, policy_version 683721 (0.0034) [2024-06-24 14:59:28,390][15132] Fps is (10 sec: 45879.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11202199552. Throughput: 0: 42871.5. Samples: 11202311340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 14:59:29,106][15401] Updated weights for policy 0, policy_version 683731 (0.0037) [2024-06-24 14:59:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11202396160. Throughput: 0: 42748.5. Samples: 11202567580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 14:59:33,505][15401] Updated weights for policy 0, policy_version 683741 (0.0036) [2024-06-24 14:59:36,819][15401] Updated weights for policy 0, policy_version 683751 (0.0041) [2024-06-24 14:59:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11202609152. Throughput: 0: 42722.4. Samples: 11202694440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:38,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 14:59:41,446][15401] Updated weights for policy 0, policy_version 683761 (0.0032) [2024-06-24 14:59:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11202822144. Throughput: 0: 42905.8. Samples: 11202951260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 14:59:43,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 14:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683766_11202822144.pth... [2024-06-24 14:59:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683141_11192582144.pth [2024-06-24 14:59:44,429][15401] Updated weights for policy 0, policy_version 683771 (0.0043) [2024-06-24 14:59:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11203035136. Throughput: 0: 42739.1. Samples: 11203206820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 14:59:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 14:59:48,903][15401] Updated weights for policy 0, policy_version 683781 (0.0036) [2024-06-24 14:59:52,854][15401] Updated weights for policy 0, policy_version 683791 (0.0033) [2024-06-24 14:59:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11203248128. Throughput: 0: 42709.9. Samples: 11203330360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 14:59:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 14:59:55,864][15349] Signal inference workers to stop experience collection... (165800 times) [2024-06-24 14:59:55,911][15401] InferenceWorker_p0-w0: stopping experience collection (165800 times) [2024-06-24 14:59:55,920][15349] Signal inference workers to resume experience collection... (165800 times) [2024-06-24 14:59:55,930][15401] InferenceWorker_p0-w0: resuming experience collection (165800 times) [2024-06-24 14:59:56,745][15401] Updated weights for policy 0, policy_version 683801 (0.0045) [2024-06-24 14:59:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11203461120. Throughput: 0: 42685.7. Samples: 11203584840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 14:59:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 15:00:00,691][15401] Updated weights for policy 0, policy_version 683811 (0.0036) [2024-06-24 15:00:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11203674112. Throughput: 0: 42505.3. Samples: 11203839960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 15:00:04,455][15401] Updated weights for policy 0, policy_version 683821 (0.0032) [2024-06-24 15:00:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11203870720. Throughput: 0: 42542.2. Samples: 11203965600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 15:00:08,468][15401] Updated weights for policy 0, policy_version 683831 (0.0043) [2024-06-24 15:00:12,178][15401] Updated weights for policy 0, policy_version 683841 (0.0048) [2024-06-24 15:00:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 11204116480. Throughput: 0: 42601.3. Samples: 11204228400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 15:00:15,911][15401] Updated weights for policy 0, policy_version 683851 (0.0033) [2024-06-24 15:00:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42872.2, 300 sec: 42765.0). Total num frames: 11204313088. Throughput: 0: 42608.4. Samples: 11204484960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 15:00:19,789][15401] Updated weights for policy 0, policy_version 683861 (0.0035) [2024-06-24 15:00:23,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42653.9). Total num frames: 11204526080. Throughput: 0: 42540.4. Samples: 11204608860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:23,392][15132] Avg episode reward: [(0, '0.205')] [2024-06-24 15:00:23,510][15401] Updated weights for policy 0, policy_version 683871 (0.0029) [2024-06-24 15:00:27,756][15401] Updated weights for policy 0, policy_version 683881 (0.0045) [2024-06-24 15:00:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11204739072. Throughput: 0: 42544.0. Samples: 11204865740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 15:00:31,359][15401] Updated weights for policy 0, policy_version 683891 (0.0043) [2024-06-24 15:00:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11204952064. Throughput: 0: 42517.3. Samples: 11205120100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 15:00:35,209][15401] Updated weights for policy 0, policy_version 683901 (0.0024) [2024-06-24 15:00:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11205181440. Throughput: 0: 42714.1. Samples: 11205252500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 15:00:38,858][15401] Updated weights for policy 0, policy_version 683911 (0.0047) [2024-06-24 15:00:42,714][15401] Updated weights for policy 0, policy_version 683921 (0.0045) [2024-06-24 15:00:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11205361664. Throughput: 0: 42714.2. Samples: 11205506980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:43,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 15:00:46,729][15401] Updated weights for policy 0, policy_version 683931 (0.0039) [2024-06-24 15:00:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11205574656. Throughput: 0: 42864.0. Samples: 11205768840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:48,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 15:00:50,227][15401] Updated weights for policy 0, policy_version 683941 (0.0024) [2024-06-24 15:00:53,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11205820416. Throughput: 0: 42885.6. Samples: 11205895560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:53,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 15:00:54,188][15401] Updated weights for policy 0, policy_version 683951 (0.0030) [2024-06-24 15:00:57,834][15401] Updated weights for policy 0, policy_version 683961 (0.0027) [2024-06-24 15:00:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11206017024. Throughput: 0: 42721.4. Samples: 11206150860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:00:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 15:01:01,787][15401] Updated weights for policy 0, policy_version 683971 (0.0028) [2024-06-24 15:01:03,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11206230016. Throughput: 0: 42830.5. Samples: 11206412440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:01:03,393][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 15:01:05,384][15401] Updated weights for policy 0, policy_version 683981 (0.0032) [2024-06-24 15:01:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11206459392. Throughput: 0: 42827.7. Samples: 11206536000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:01:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 15:01:09,422][15401] Updated weights for policy 0, policy_version 683991 (0.0032) [2024-06-24 15:01:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11206656000. Throughput: 0: 42815.7. Samples: 11206792440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:01:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 15:01:13,457][15401] Updated weights for policy 0, policy_version 684001 (0.0036) [2024-06-24 15:01:17,002][15401] Updated weights for policy 0, policy_version 684011 (0.0024) [2024-06-24 15:01:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 11206868992. Throughput: 0: 42956.0. Samples: 11207053120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:01:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 15:01:20,981][15401] Updated weights for policy 0, policy_version 684021 (0.0037) [2024-06-24 15:01:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 11207098368. Throughput: 0: 42851.5. Samples: 11207180820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-24 15:01:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 15:01:24,461][15401] Updated weights for policy 0, policy_version 684031 (0.0032) [2024-06-24 15:01:27,130][15349] Signal inference workers to stop experience collection... (165850 times) [2024-06-24 15:01:27,175][15401] InferenceWorker_p0-w0: stopping experience collection (165850 times) [2024-06-24 15:01:27,183][15349] Signal inference workers to resume experience collection... (165850 times) [2024-06-24 15:01:27,195][15401] InferenceWorker_p0-w0: resuming experience collection (165850 times) [2024-06-24 15:01:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.9). Total num frames: 11207311360. Throughput: 0: 42979.2. Samples: 11207441040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 15:01:28,421][15401] Updated weights for policy 0, policy_version 684041 (0.0034) [2024-06-24 15:01:32,266][15401] Updated weights for policy 0, policy_version 684051 (0.0040) [2024-06-24 15:01:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11207524352. Throughput: 0: 42759.6. Samples: 11207693020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:33,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 15:01:35,979][15401] Updated weights for policy 0, policy_version 684061 (0.0024) [2024-06-24 15:01:38,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11207737344. Throughput: 0: 42712.6. Samples: 11207817520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 15:01:39,859][15401] Updated weights for policy 0, policy_version 684071 (0.0035) [2024-06-24 15:01:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42709.8). Total num frames: 11207966720. Throughput: 0: 42940.3. Samples: 11208083180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 15:01:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684081_11207983104.pth... [2024-06-24 15:01:43,493][15401] Updated weights for policy 0, policy_version 684081 (0.0034) [2024-06-24 15:01:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683455_11197726720.pth [2024-06-24 15:01:47,585][15401] Updated weights for policy 0, policy_version 684091 (0.0035) [2024-06-24 15:01:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11208163328. Throughput: 0: 42666.8. Samples: 11208332340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 15:01:51,344][15401] Updated weights for policy 0, policy_version 684101 (0.0031) [2024-06-24 15:01:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 11208376320. Throughput: 0: 42751.6. Samples: 11208459820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 15:01:55,176][15401] Updated weights for policy 0, policy_version 684111 (0.0032) [2024-06-24 15:01:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11208589312. Throughput: 0: 42891.6. Samples: 11208722560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:01:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:01:59,018][15401] Updated weights for policy 0, policy_version 684121 (0.0044) [2024-06-24 15:02:02,698][15401] Updated weights for policy 0, policy_version 684131 (0.0045) [2024-06-24 15:02:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 11208818688. Throughput: 0: 42693.9. Samples: 11208974340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 15:02:07,186][15401] Updated weights for policy 0, policy_version 684141 (0.0038) [2024-06-24 15:02:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11209031680. Throughput: 0: 42654.3. Samples: 11209100260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 15:02:10,386][15401] Updated weights for policy 0, policy_version 684151 (0.0033) [2024-06-24 15:02:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11209228288. Throughput: 0: 42639.1. Samples: 11209359800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 15:02:14,831][15401] Updated weights for policy 0, policy_version 684161 (0.0031) [2024-06-24 15:02:18,093][15401] Updated weights for policy 0, policy_version 684171 (0.0034) [2024-06-24 15:02:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 11209457664. Throughput: 0: 42707.1. Samples: 11209614840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 15:02:22,318][15401] Updated weights for policy 0, policy_version 684181 (0.0033) [2024-06-24 15:02:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 11209654272. Throughput: 0: 42842.7. Samples: 11209745440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 15:02:25,881][15401] Updated weights for policy 0, policy_version 684191 (0.0042) [2024-06-24 15:02:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11209867264. Throughput: 0: 42634.7. Samples: 11210001740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 15:02:29,957][15401] Updated weights for policy 0, policy_version 684201 (0.0031) [2024-06-24 15:02:33,347][15401] Updated weights for policy 0, policy_version 684211 (0.0032) [2024-06-24 15:02:33,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 11210113024. Throughput: 0: 42759.9. Samples: 11210256640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:33,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 15:02:37,386][15401] Updated weights for policy 0, policy_version 684221 (0.0024) [2024-06-24 15:02:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11210276864. Throughput: 0: 42890.6. Samples: 11210389900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 15:02:40,252][15349] Signal inference workers to stop experience collection... (165900 times) [2024-06-24 15:02:40,252][15349] Signal inference workers to resume experience collection... (165900 times) [2024-06-24 15:02:40,290][15401] InferenceWorker_p0-w0: stopping experience collection (165900 times) [2024-06-24 15:02:40,291][15401] InferenceWorker_p0-w0: resuming experience collection (165900 times) [2024-06-24 15:02:40,735][15401] Updated weights for policy 0, policy_version 684231 (0.0031) [2024-06-24 15:02:43,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11210522624. Throughput: 0: 42770.1. Samples: 11210647220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:43,392][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 15:02:44,791][15401] Updated weights for policy 0, policy_version 684241 (0.0049) [2024-06-24 15:02:48,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11210752000. Throughput: 0: 42867.5. Samples: 11210903380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 15:02:48,866][15401] Updated weights for policy 0, policy_version 684251 (0.0035) [2024-06-24 15:02:52,578][15401] Updated weights for policy 0, policy_version 684261 (0.0035) [2024-06-24 15:02:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11210948608. Throughput: 0: 43050.8. Samples: 11211037540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 15:02:56,282][15401] Updated weights for policy 0, policy_version 684271 (0.0026) [2024-06-24 15:02:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11211177984. Throughput: 0: 42937.2. Samples: 11211291980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:02:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 15:03:00,019][15401] Updated weights for policy 0, policy_version 684281 (0.0025) [2024-06-24 15:03:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 11211407360. Throughput: 0: 43025.3. Samples: 11211550980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:03:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 15:03:03,803][15401] Updated weights for policy 0, policy_version 684291 (0.0037) [2024-06-24 15:03:07,571][15401] Updated weights for policy 0, policy_version 684301 (0.0045) [2024-06-24 15:03:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11211587584. Throughput: 0: 43063.3. Samples: 11211683300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 15:03:11,567][15401] Updated weights for policy 0, policy_version 684311 (0.0036) [2024-06-24 15:03:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 11211816960. Throughput: 0: 42846.6. Samples: 11211929840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 15:03:15,875][15401] Updated weights for policy 0, policy_version 684321 (0.0035) [2024-06-24 15:03:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11212013568. Throughput: 0: 43107.2. Samples: 11212196360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 15:03:19,373][15401] Updated weights for policy 0, policy_version 684331 (0.0036) [2024-06-24 15:03:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11212226560. Throughput: 0: 42849.4. Samples: 11212318120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 15:03:23,450][15401] Updated weights for policy 0, policy_version 684341 (0.0030) [2024-06-24 15:03:26,787][15401] Updated weights for policy 0, policy_version 684351 (0.0025) [2024-06-24 15:03:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11212472320. Throughput: 0: 42857.0. Samples: 11212575780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:28,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 15:03:31,038][15401] Updated weights for policy 0, policy_version 684361 (0.0041) [2024-06-24 15:03:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 11212636160. Throughput: 0: 43019.7. Samples: 11212839260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 15:03:34,491][15401] Updated weights for policy 0, policy_version 684371 (0.0024) [2024-06-24 15:03:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11212881920. Throughput: 0: 42707.5. Samples: 11212959380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 15:03:38,619][15401] Updated weights for policy 0, policy_version 684381 (0.0026) [2024-06-24 15:03:42,259][15401] Updated weights for policy 0, policy_version 684391 (0.0036) [2024-06-24 15:03:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 11213111296. Throughput: 0: 42842.7. Samples: 11213219900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 15:03:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684394_11213111296.pth... [2024-06-24 15:03:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000683766_11202822144.pth [2024-06-24 15:03:46,241][15401] Updated weights for policy 0, policy_version 684401 (0.0050) [2024-06-24 15:03:48,391][15132] Fps is (10 sec: 40953.5, 60 sec: 42324.2, 300 sec: 42764.8). Total num frames: 11213291520. Throughput: 0: 42969.6. Samples: 11213484680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:48,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 15:03:50,005][15401] Updated weights for policy 0, policy_version 684411 (0.0037) [2024-06-24 15:03:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11213537280. Throughput: 0: 42768.6. Samples: 11213607880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 15:03:53,877][15401] Updated weights for policy 0, policy_version 684421 (0.0045) [2024-06-24 15:03:57,532][15401] Updated weights for policy 0, policy_version 684431 (0.0037) [2024-06-24 15:03:58,389][15132] Fps is (10 sec: 44244.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11213733888. Throughput: 0: 43043.2. Samples: 11213866780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:03:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 15:04:01,328][15401] Updated weights for policy 0, policy_version 684441 (0.0028) [2024-06-24 15:04:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11213946880. Throughput: 0: 42825.5. Samples: 11214123520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 15:04:05,500][15401] Updated weights for policy 0, policy_version 684451 (0.0032) [2024-06-24 15:04:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11214192640. Throughput: 0: 42870.9. Samples: 11214247320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 15:04:08,901][15401] Updated weights for policy 0, policy_version 684461 (0.0039) [2024-06-24 15:04:13,035][15401] Updated weights for policy 0, policy_version 684471 (0.0022) [2024-06-24 15:04:13,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42869.8, 300 sec: 42875.9). Total num frames: 11214389248. Throughput: 0: 42989.7. Samples: 11214510420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:13,392][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 15:04:16,407][15401] Updated weights for policy 0, policy_version 684481 (0.0027) [2024-06-24 15:04:18,097][15349] Signal inference workers to stop experience collection... (165950 times) [2024-06-24 15:04:18,097][15349] Signal inference workers to resume experience collection... (165950 times) [2024-06-24 15:04:18,113][15401] InferenceWorker_p0-w0: stopping experience collection (165950 times) [2024-06-24 15:04:18,142][15401] InferenceWorker_p0-w0: resuming experience collection (165950 times) [2024-06-24 15:04:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11214602240. Throughput: 0: 42927.5. Samples: 11214771000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 15:04:20,474][15401] Updated weights for policy 0, policy_version 684491 (0.0032) [2024-06-24 15:04:23,390][15132] Fps is (10 sec: 42607.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11214815232. Throughput: 0: 43089.7. Samples: 11214898420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 15:04:24,643][15401] Updated weights for policy 0, policy_version 684501 (0.0031) [2024-06-24 15:04:28,309][15401] Updated weights for policy 0, policy_version 684511 (0.0026) [2024-06-24 15:04:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11215028224. Throughput: 0: 43121.8. Samples: 11215160380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 15:04:32,226][15401] Updated weights for policy 0, policy_version 684521 (0.0040) [2024-06-24 15:04:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 11215241216. Throughput: 0: 42790.8. Samples: 11215410200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 15:04:35,807][15401] Updated weights for policy 0, policy_version 684531 (0.0029) [2024-06-24 15:04:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11215470592. Throughput: 0: 43027.6. Samples: 11215544120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:04:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 15:04:39,757][15401] Updated weights for policy 0, policy_version 684541 (0.0028) [2024-06-24 15:04:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11215667200. Throughput: 0: 43079.2. Samples: 11215805340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:04:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 15:04:43,523][15401] Updated weights for policy 0, policy_version 684551 (0.0047) [2024-06-24 15:04:47,332][15401] Updated weights for policy 0, policy_version 684561 (0.0044) [2024-06-24 15:04:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43418.8, 300 sec: 42876.1). Total num frames: 11215896576. Throughput: 0: 43003.7. Samples: 11216058680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:04:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 15:04:50,951][15401] Updated weights for policy 0, policy_version 684571 (0.0038) [2024-06-24 15:04:53,393][15132] Fps is (10 sec: 45856.5, 60 sec: 43141.7, 300 sec: 42931.0). Total num frames: 11216125952. Throughput: 0: 43132.8. Samples: 11216188460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:04:53,394][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:04:54,931][15401] Updated weights for policy 0, policy_version 684581 (0.0024) [2024-06-24 15:04:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11216322560. Throughput: 0: 43093.4. Samples: 11216449520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:04:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 15:04:58,574][15401] Updated weights for policy 0, policy_version 684591 (0.0035) [2024-06-24 15:05:02,635][15401] Updated weights for policy 0, policy_version 684601 (0.0029) [2024-06-24 15:05:03,390][15132] Fps is (10 sec: 40976.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11216535552. Throughput: 0: 42971.5. Samples: 11216704720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 15:05:06,491][15401] Updated weights for policy 0, policy_version 684611 (0.0044) [2024-06-24 15:05:08,396][15132] Fps is (10 sec: 45845.8, 60 sec: 43140.1, 300 sec: 42930.7). Total num frames: 11216781312. Throughput: 0: 43038.1. Samples: 11216835400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:08,396][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 15:05:10,245][15401] Updated weights for policy 0, policy_version 684621 (0.0030) [2024-06-24 15:05:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 11216977920. Throughput: 0: 42980.0. Samples: 11217094480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 15:05:13,825][15401] Updated weights for policy 0, policy_version 684631 (0.0036) [2024-06-24 15:05:17,786][15401] Updated weights for policy 0, policy_version 684641 (0.0021) [2024-06-24 15:05:18,390][15132] Fps is (10 sec: 39344.5, 60 sec: 42871.0, 300 sec: 42876.4). Total num frames: 11217174528. Throughput: 0: 43028.9. Samples: 11217346520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:18,391][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 15:05:21,723][15401] Updated weights for policy 0, policy_version 684651 (0.0034) [2024-06-24 15:05:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 11217403904. Throughput: 0: 43011.2. Samples: 11217479620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 15:05:25,323][15401] Updated weights for policy 0, policy_version 684661 (0.0031) [2024-06-24 15:05:28,390][15132] Fps is (10 sec: 44239.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11217616896. Throughput: 0: 42951.5. Samples: 11217738160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:28,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 15:05:29,276][15401] Updated weights for policy 0, policy_version 684671 (0.0031) [2024-06-24 15:05:32,838][15401] Updated weights for policy 0, policy_version 684681 (0.0030) [2024-06-24 15:05:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11217813504. Throughput: 0: 43018.3. Samples: 11217994500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:33,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 15:05:36,788][15401] Updated weights for policy 0, policy_version 684691 (0.0042) [2024-06-24 15:05:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11218059264. Throughput: 0: 42996.6. Samples: 11218123140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 15:05:40,689][15401] Updated weights for policy 0, policy_version 684701 (0.0026) [2024-06-24 15:05:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11218255872. Throughput: 0: 43094.2. Samples: 11218388760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 15:05:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684708_11218255872.pth... [2024-06-24 15:05:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684081_11207983104.pth [2024-06-24 15:05:44,357][15401] Updated weights for policy 0, policy_version 684711 (0.0030) [2024-06-24 15:05:44,515][15349] Signal inference workers to stop experience collection... (166000 times) [2024-06-24 15:05:44,515][15349] Signal inference workers to resume experience collection... (166000 times) [2024-06-24 15:05:44,543][15401] InferenceWorker_p0-w0: stopping experience collection (166000 times) [2024-06-24 15:05:44,543][15401] InferenceWorker_p0-w0: resuming experience collection (166000 times) [2024-06-24 15:05:48,167][15401] Updated weights for policy 0, policy_version 684721 (0.0032) [2024-06-24 15:05:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 11218468864. Throughput: 0: 42972.9. Samples: 11218638500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 15:05:51,890][15401] Updated weights for policy 0, policy_version 684731 (0.0042) [2024-06-24 15:05:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42874.3, 300 sec: 42987.2). Total num frames: 11218698240. Throughput: 0: 42898.0. Samples: 11218765540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 15:05:55,700][15401] Updated weights for policy 0, policy_version 684741 (0.0034) [2024-06-24 15:05:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 11218894848. Throughput: 0: 42890.2. Samples: 11219024540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:05:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 15:05:59,550][15401] Updated weights for policy 0, policy_version 684751 (0.0038) [2024-06-24 15:06:03,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 11219107840. Throughput: 0: 42831.8. Samples: 11219274200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:06:03,396][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 15:06:03,522][15401] Updated weights for policy 0, policy_version 684761 (0.0033) [2024-06-24 15:06:07,506][15401] Updated weights for policy 0, policy_version 684771 (0.0032) [2024-06-24 15:06:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42603.0, 300 sec: 42987.2). Total num frames: 11219337216. Throughput: 0: 42816.1. Samples: 11219406340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:06:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 15:06:11,078][15401] Updated weights for policy 0, policy_version 684781 (0.0031) [2024-06-24 15:06:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11219533824. Throughput: 0: 42940.0. Samples: 11219670460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:06:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 15:06:14,947][15401] Updated weights for policy 0, policy_version 684791 (0.0036) [2024-06-24 15:06:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.9, 300 sec: 42931.7). Total num frames: 11219763200. Throughput: 0: 42886.6. Samples: 11219924400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 15:06:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 15:06:18,790][15401] Updated weights for policy 0, policy_version 684801 (0.0031) [2024-06-24 15:06:22,485][15401] Updated weights for policy 0, policy_version 684811 (0.0041) [2024-06-24 15:06:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11219976192. Throughput: 0: 42971.1. Samples: 11220056840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 15:06:26,562][15401] Updated weights for policy 0, policy_version 684821 (0.0040) [2024-06-24 15:06:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11220172800. Throughput: 0: 42677.7. Samples: 11220309260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:28,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 15:06:29,995][15401] Updated weights for policy 0, policy_version 684831 (0.0023) [2024-06-24 15:06:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11220402176. Throughput: 0: 42784.6. Samples: 11220563800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:33,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 15:06:34,182][15401] Updated weights for policy 0, policy_version 684841 (0.0034) [2024-06-24 15:06:37,692][15401] Updated weights for policy 0, policy_version 684851 (0.0042) [2024-06-24 15:06:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11220615168. Throughput: 0: 42961.8. Samples: 11220698820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 15:06:41,737][15401] Updated weights for policy 0, policy_version 684861 (0.0036) [2024-06-24 15:06:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11220811776. Throughput: 0: 42809.7. Samples: 11220950980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:43,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 15:06:45,306][15401] Updated weights for policy 0, policy_version 684871 (0.0039) [2024-06-24 15:06:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11221057536. Throughput: 0: 42861.2. Samples: 11221202680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 15:06:49,380][15401] Updated weights for policy 0, policy_version 684881 (0.0032) [2024-06-24 15:06:53,241][15401] Updated weights for policy 0, policy_version 684891 (0.0026) [2024-06-24 15:06:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11221254144. Throughput: 0: 42921.2. Samples: 11221337800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:53,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 15:06:56,865][15401] Updated weights for policy 0, policy_version 684901 (0.0032) [2024-06-24 15:06:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11221450752. Throughput: 0: 42529.4. Samples: 11221584280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:06:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 15:07:00,893][15401] Updated weights for policy 0, policy_version 684911 (0.0041) [2024-06-24 15:07:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 11221680128. Throughput: 0: 42608.0. Samples: 11221841760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:03,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 15:07:04,500][15401] Updated weights for policy 0, policy_version 684921 (0.0029) [2024-06-24 15:07:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.2, 300 sec: 42931.6). Total num frames: 11221893120. Throughput: 0: 42484.8. Samples: 11221968660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:08,391][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 15:07:08,546][15401] Updated weights for policy 0, policy_version 684931 (0.0034) [2024-06-24 15:07:09,038][15349] Signal inference workers to stop experience collection... (166050 times) [2024-06-24 15:07:09,092][15401] InferenceWorker_p0-w0: stopping experience collection (166050 times) [2024-06-24 15:07:09,096][15349] Signal inference workers to resume experience collection... (166050 times) [2024-06-24 15:07:09,106][15401] InferenceWorker_p0-w0: resuming experience collection (166050 times) [2024-06-24 15:07:12,241][15401] Updated weights for policy 0, policy_version 684941 (0.0032) [2024-06-24 15:07:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 11222106112. Throughput: 0: 42680.4. Samples: 11222229980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:13,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 15:07:16,135][15401] Updated weights for policy 0, policy_version 684951 (0.0040) [2024-06-24 15:07:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11222319104. Throughput: 0: 42804.8. Samples: 11222490020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 15:07:19,713][15401] Updated weights for policy 0, policy_version 684961 (0.0038) [2024-06-24 15:07:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11222548480. Throughput: 0: 42739.1. Samples: 11222622080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:23,400][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 15:07:23,751][15401] Updated weights for policy 0, policy_version 684971 (0.0038) [2024-06-24 15:07:27,479][15401] Updated weights for policy 0, policy_version 684981 (0.0036) [2024-06-24 15:07:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 11222745088. Throughput: 0: 42795.3. Samples: 11222876760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 15:07:31,393][15401] Updated weights for policy 0, policy_version 684991 (0.0032) [2024-06-24 15:07:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 11222958080. Throughput: 0: 42898.7. Samples: 11223133120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 15:07:35,420][15401] Updated weights for policy 0, policy_version 685001 (0.0046) [2024-06-24 15:07:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11223171072. Throughput: 0: 42699.6. Samples: 11223259280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 15:07:39,056][15401] Updated weights for policy 0, policy_version 685011 (0.0041) [2024-06-24 15:07:43,143][15401] Updated weights for policy 0, policy_version 685021 (0.0036) [2024-06-24 15:07:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11223384064. Throughput: 0: 42906.8. Samples: 11223515080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 15:07:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685022_11223400448.pth... [2024-06-24 15:07:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684394_11213111296.pth [2024-06-24 15:07:46,876][15401] Updated weights for policy 0, policy_version 685031 (0.0043) [2024-06-24 15:07:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11223597056. Throughput: 0: 42799.0. Samples: 11223767720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 15:07:50,845][15401] Updated weights for policy 0, policy_version 685041 (0.0043) [2024-06-24 15:07:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11223810048. Throughput: 0: 42835.7. Samples: 11223896260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 15:07:54,694][15401] Updated weights for policy 0, policy_version 685051 (0.0035) [2024-06-24 15:07:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11224023040. Throughput: 0: 42801.0. Samples: 11224155920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 15:07:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 15:07:58,453][15401] Updated weights for policy 0, policy_version 685061 (0.0039) [2024-06-24 15:08:02,283][15401] Updated weights for policy 0, policy_version 685071 (0.0037) [2024-06-24 15:08:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11224236032. Throughput: 0: 42735.6. Samples: 11224413120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 15:08:06,149][15401] Updated weights for policy 0, policy_version 685081 (0.0040) [2024-06-24 15:08:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 11224449024. Throughput: 0: 42616.6. Samples: 11224539820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 15:08:09,871][15401] Updated weights for policy 0, policy_version 685091 (0.0031) [2024-06-24 15:08:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 11224645632. Throughput: 0: 42614.9. Samples: 11224794440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 15:08:13,773][15401] Updated weights for policy 0, policy_version 685101 (0.0031) [2024-06-24 15:08:17,532][15401] Updated weights for policy 0, policy_version 685111 (0.0028) [2024-06-24 15:08:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11224891392. Throughput: 0: 42847.1. Samples: 11225061240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 15:08:21,280][15401] Updated weights for policy 0, policy_version 685121 (0.0037) [2024-06-24 15:08:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11225104384. Throughput: 0: 42836.9. Samples: 11225186940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 15:08:25,115][15401] Updated weights for policy 0, policy_version 685131 (0.0043) [2024-06-24 15:08:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11225300992. Throughput: 0: 42759.6. Samples: 11225439260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:28,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 15:08:29,259][15401] Updated weights for policy 0, policy_version 685141 (0.0030) [2024-06-24 15:08:32,810][15401] Updated weights for policy 0, policy_version 685151 (0.0026) [2024-06-24 15:08:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11225546752. Throughput: 0: 42905.4. Samples: 11225698460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:33,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 15:08:36,780][15401] Updated weights for policy 0, policy_version 685161 (0.0032) [2024-06-24 15:08:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11225759744. Throughput: 0: 42979.8. Samples: 11225830360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 15:08:40,530][15401] Updated weights for policy 0, policy_version 685171 (0.0037) [2024-06-24 15:08:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.9). Total num frames: 11225956352. Throughput: 0: 42727.5. Samples: 11226078660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 15:08:44,349][15401] Updated weights for policy 0, policy_version 685181 (0.0032) [2024-06-24 15:08:45,632][15349] Signal inference workers to stop experience collection... (166100 times) [2024-06-24 15:08:45,632][15349] Signal inference workers to resume experience collection... (166100 times) [2024-06-24 15:08:45,650][15401] InferenceWorker_p0-w0: stopping experience collection (166100 times) [2024-06-24 15:08:45,651][15401] InferenceWorker_p0-w0: resuming experience collection (166100 times) [2024-06-24 15:08:48,157][15401] Updated weights for policy 0, policy_version 685191 (0.0050) [2024-06-24 15:08:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11226169344. Throughput: 0: 42685.3. Samples: 11226333960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 15:08:51,955][15401] Updated weights for policy 0, policy_version 685201 (0.0024) [2024-06-24 15:08:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11226398720. Throughput: 0: 42703.5. Samples: 11226461480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 15:08:55,813][15401] Updated weights for policy 0, policy_version 685211 (0.0039) [2024-06-24 15:08:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11226595328. Throughput: 0: 42762.3. Samples: 11226718740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:08:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 15:08:59,536][15401] Updated weights for policy 0, policy_version 685221 (0.0026) [2024-06-24 15:09:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 11226808320. Throughput: 0: 42532.4. Samples: 11226975200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 15:09:03,720][15401] Updated weights for policy 0, policy_version 685231 (0.0034) [2024-06-24 15:09:07,447][15401] Updated weights for policy 0, policy_version 685241 (0.0034) [2024-06-24 15:09:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 11227021312. Throughput: 0: 42496.8. Samples: 11227099300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:08,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 15:09:11,589][15401] Updated weights for policy 0, policy_version 685251 (0.0029) [2024-06-24 15:09:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 11227234304. Throughput: 0: 42674.7. Samples: 11227359620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 15:09:15,305][15401] Updated weights for policy 0, policy_version 685261 (0.0037) [2024-06-24 15:09:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11227447296. Throughput: 0: 42476.5. Samples: 11227609900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 15:09:19,331][15401] Updated weights for policy 0, policy_version 685271 (0.0036) [2024-06-24 15:09:22,878][15401] Updated weights for policy 0, policy_version 685281 (0.0027) [2024-06-24 15:09:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 11227676672. Throughput: 0: 42471.6. Samples: 11227741580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 15:09:26,911][15401] Updated weights for policy 0, policy_version 685291 (0.0033) [2024-06-24 15:09:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11227873280. Throughput: 0: 42644.5. Samples: 11227997660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 15:09:30,628][15401] Updated weights for policy 0, policy_version 685301 (0.0028) [2024-06-24 15:09:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11228069888. Throughput: 0: 42507.5. Samples: 11228246800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 15:09:34,770][15401] Updated weights for policy 0, policy_version 685311 (0.0032) [2024-06-24 15:09:38,026][15401] Updated weights for policy 0, policy_version 685321 (0.0029) [2024-06-24 15:09:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 11228299264. Throughput: 0: 42544.8. Samples: 11228376100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 15:09:38,392][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 15:09:42,424][15401] Updated weights for policy 0, policy_version 685331 (0.0034) [2024-06-24 15:09:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11228512256. Throughput: 0: 42583.2. Samples: 11228634980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:09:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 15:09:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685335_11228528640.pth... [2024-06-24 15:09:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000684708_11218255872.pth [2024-06-24 15:09:46,084][15401] Updated weights for policy 0, policy_version 685341 (0.0037) [2024-06-24 15:09:48,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42596.7, 300 sec: 42709.7). Total num frames: 11228725248. Throughput: 0: 42399.0. Samples: 11228883260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:09:48,393][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 15:09:50,126][15401] Updated weights for policy 0, policy_version 685351 (0.0030) [2024-06-24 15:09:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11228938240. Throughput: 0: 42575.5. Samples: 11229015200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:09:53,395][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 15:09:53,702][15401] Updated weights for policy 0, policy_version 685361 (0.0038) [2024-06-24 15:09:55,301][15349] Signal inference workers to stop experience collection... (166150 times) [2024-06-24 15:09:55,307][15349] Signal inference workers to resume experience collection... (166150 times) [2024-06-24 15:09:55,331][15401] InferenceWorker_p0-w0: stopping experience collection (166150 times) [2024-06-24 15:09:55,331][15401] InferenceWorker_p0-w0: resuming experience collection (166150 times) [2024-06-24 15:09:57,643][15401] Updated weights for policy 0, policy_version 685371 (0.0043) [2024-06-24 15:09:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11229151232. Throughput: 0: 42454.7. Samples: 11229270080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:09:58,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-24 15:10:01,475][15401] Updated weights for policy 0, policy_version 685381 (0.0036) [2024-06-24 15:10:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 11229364224. Throughput: 0: 42474.6. Samples: 11229521260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:03,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 15:10:05,266][15401] Updated weights for policy 0, policy_version 685391 (0.0022) [2024-06-24 15:10:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 11229577216. Throughput: 0: 42548.6. Samples: 11229656360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:08,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 15:10:08,978][15401] Updated weights for policy 0, policy_version 685401 (0.0033) [2024-06-24 15:10:12,945][15401] Updated weights for policy 0, policy_version 685411 (0.0043) [2024-06-24 15:10:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 11229773824. Throughput: 0: 42548.0. Samples: 11229912320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 15:10:16,691][15401] Updated weights for policy 0, policy_version 685421 (0.0030) [2024-06-24 15:10:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11230003200. Throughput: 0: 42521.4. Samples: 11230160260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:18,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 15:10:20,586][15401] Updated weights for policy 0, policy_version 685431 (0.0040) [2024-06-24 15:10:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11230232576. Throughput: 0: 42605.9. Samples: 11230293260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 15:10:24,211][15401] Updated weights for policy 0, policy_version 685441 (0.0034) [2024-06-24 15:10:28,107][15401] Updated weights for policy 0, policy_version 685451 (0.0030) [2024-06-24 15:10:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11230429184. Throughput: 0: 42599.0. Samples: 11230552040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:28,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 15:10:32,291][15401] Updated weights for policy 0, policy_version 685461 (0.0037) [2024-06-24 15:10:33,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 11230642176. Throughput: 0: 42642.9. Samples: 11230802360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:33,397][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 15:10:36,322][15401] Updated weights for policy 0, policy_version 685471 (0.0031) [2024-06-24 15:10:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 11230887936. Throughput: 0: 42724.5. Samples: 11230937800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 15:10:39,876][15401] Updated weights for policy 0, policy_version 685481 (0.0034) [2024-06-24 15:10:43,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11231051776. Throughput: 0: 42735.6. Samples: 11231193180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 15:10:43,808][15401] Updated weights for policy 0, policy_version 685491 (0.0028) [2024-06-24 15:10:47,289][15401] Updated weights for policy 0, policy_version 685501 (0.0032) [2024-06-24 15:10:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 11231281152. Throughput: 0: 42784.4. Samples: 11231446660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:48,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 15:10:51,438][15401] Updated weights for policy 0, policy_version 685511 (0.0025) [2024-06-24 15:10:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11231526912. Throughput: 0: 42643.6. Samples: 11231575220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 15:10:54,742][15401] Updated weights for policy 0, policy_version 685521 (0.0034) [2024-06-24 15:10:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 11231690752. Throughput: 0: 42742.7. Samples: 11231835740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:10:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 15:10:59,000][15401] Updated weights for policy 0, policy_version 685531 (0.0035) [2024-06-24 15:11:02,403][15401] Updated weights for policy 0, policy_version 685541 (0.0029) [2024-06-24 15:11:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11231920128. Throughput: 0: 42746.7. Samples: 11232083860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:11:03,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 15:11:06,744][15401] Updated weights for policy 0, policy_version 685551 (0.0039) [2024-06-24 15:11:07,582][15349] Signal inference workers to stop experience collection... (166200 times) [2024-06-24 15:11:07,615][15401] InferenceWorker_p0-w0: stopping experience collection (166200 times) [2024-06-24 15:11:07,642][15349] Signal inference workers to resume experience collection... (166200 times) [2024-06-24 15:11:07,643][15401] InferenceWorker_p0-w0: resuming experience collection (166200 times) [2024-06-24 15:11:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11232149504. Throughput: 0: 42816.8. Samples: 11232220020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:11:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 15:11:09,922][15401] Updated weights for policy 0, policy_version 685561 (0.0036) [2024-06-24 15:11:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11232329728. Throughput: 0: 42669.4. Samples: 11232472060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:11:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 15:11:14,312][15401] Updated weights for policy 0, policy_version 685571 (0.0048) [2024-06-24 15:11:17,547][15401] Updated weights for policy 0, policy_version 685581 (0.0035) [2024-06-24 15:11:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11232575488. Throughput: 0: 42638.6. Samples: 11232720820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:11:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 15:11:22,262][15401] Updated weights for policy 0, policy_version 685591 (0.0029) [2024-06-24 15:11:23,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11232772096. Throughput: 0: 42774.1. Samples: 11232862740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:23,392][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 15:11:25,460][15401] Updated weights for policy 0, policy_version 685601 (0.0042) [2024-06-24 15:11:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 11232968704. Throughput: 0: 42704.3. Samples: 11233114880. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 15:11:29,928][15401] Updated weights for policy 0, policy_version 685611 (0.0037) [2024-06-24 15:11:33,084][15401] Updated weights for policy 0, policy_version 685621 (0.0023) [2024-06-24 15:11:33,389][15132] Fps is (10 sec: 45886.7, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 11233230848. Throughput: 0: 42773.9. Samples: 11233371380. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 15:11:37,443][15401] Updated weights for policy 0, policy_version 685631 (0.0038) [2024-06-24 15:11:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11233411072. Throughput: 0: 42883.4. Samples: 11233504980. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-24 15:11:40,850][15401] Updated weights for policy 0, policy_version 685641 (0.0029) [2024-06-24 15:11:43,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11233607680. Throughput: 0: 42642.3. Samples: 11233754640. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:43,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 15:11:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685646_11233624064.pth... [2024-06-24 15:11:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685022_11223400448.pth [2024-06-24 15:11:45,291][15401] Updated weights for policy 0, policy_version 685651 (0.0042) [2024-06-24 15:11:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11233853440. Throughput: 0: 42714.7. Samples: 11234006020. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 15:11:48,546][15401] Updated weights for policy 0, policy_version 685661 (0.0026) [2024-06-24 15:11:52,942][15401] Updated weights for policy 0, policy_version 685671 (0.0038) [2024-06-24 15:11:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11234050048. Throughput: 0: 42698.3. Samples: 11234141440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 15:11:56,159][15401] Updated weights for policy 0, policy_version 685681 (0.0037) [2024-06-24 15:11:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11234263040. Throughput: 0: 42606.0. Samples: 11234389340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:11:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 15:12:00,621][15401] Updated weights for policy 0, policy_version 685691 (0.0033) [2024-06-24 15:12:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 11234476032. Throughput: 0: 42758.2. Samples: 11234644940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:03,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 15:12:03,918][15401] Updated weights for policy 0, policy_version 685701 (0.0038) [2024-06-24 15:12:08,345][15401] Updated weights for policy 0, policy_version 685711 (0.0039) [2024-06-24 15:12:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 11234689024. Throughput: 0: 42433.3. Samples: 11234772140. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 15:12:11,782][15401] Updated weights for policy 0, policy_version 685721 (0.0036) [2024-06-24 15:12:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11234918400. Throughput: 0: 42534.7. Samples: 11235028940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:13,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 15:12:15,913][15401] Updated weights for policy 0, policy_version 685731 (0.0037) [2024-06-24 15:12:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11235115008. Throughput: 0: 42274.6. Samples: 11235273740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 15:12:19,869][15401] Updated weights for policy 0, policy_version 685741 (0.0032) [2024-06-24 15:12:20,252][15349] Signal inference workers to stop experience collection... (166250 times) [2024-06-24 15:12:20,253][15349] Signal inference workers to resume experience collection... (166250 times) [2024-06-24 15:12:20,263][15401] InferenceWorker_p0-w0: stopping experience collection (166250 times) [2024-06-24 15:12:20,263][15401] InferenceWorker_p0-w0: resuming experience collection (166250 times) [2024-06-24 15:12:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 11235328000. Throughput: 0: 42201.6. Samples: 11235404040. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 15:12:23,435][15401] Updated weights for policy 0, policy_version 685751 (0.0033) [2024-06-24 15:12:27,238][15401] Updated weights for policy 0, policy_version 685761 (0.0032) [2024-06-24 15:12:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11235540992. Throughput: 0: 42333.0. Samples: 11235659640. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:28,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 15:12:31,459][15401] Updated weights for policy 0, policy_version 685771 (0.0027) [2024-06-24 15:12:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11235753984. Throughput: 0: 42464.0. Samples: 11235916900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 15:12:35,380][15401] Updated weights for policy 0, policy_version 685781 (0.0027) [2024-06-24 15:12:38,396][15132] Fps is (10 sec: 42572.0, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 11235966976. Throughput: 0: 42305.9. Samples: 11236045480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:38,397][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 15:12:39,018][15401] Updated weights for policy 0, policy_version 685791 (0.0035) [2024-06-24 15:12:42,831][15401] Updated weights for policy 0, policy_version 685801 (0.0041) [2024-06-24 15:12:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11236179968. Throughput: 0: 42404.6. Samples: 11236297540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 15:12:46,775][15401] Updated weights for policy 0, policy_version 685811 (0.0033) [2024-06-24 15:12:48,392][15132] Fps is (10 sec: 42615.8, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 11236392960. Throughput: 0: 42414.2. Samples: 11236553680. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:48,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 15:12:50,475][15401] Updated weights for policy 0, policy_version 685821 (0.0042) [2024-06-24 15:12:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11236589568. Throughput: 0: 42403.7. Samples: 11236680300. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 15:12:54,387][15401] Updated weights for policy 0, policy_version 685831 (0.0047) [2024-06-24 15:12:58,167][15401] Updated weights for policy 0, policy_version 685841 (0.0028) [2024-06-24 15:12:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11236818944. Throughput: 0: 42337.9. Samples: 11236934140. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-24 15:12:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 15:13:02,102][15401] Updated weights for policy 0, policy_version 685851 (0.0031) [2024-06-24 15:13:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11237031936. Throughput: 0: 42668.9. Samples: 11237193840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:03,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 15:13:05,986][15401] Updated weights for policy 0, policy_version 685861 (0.0033) [2024-06-24 15:13:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11237244928. Throughput: 0: 42572.8. Samples: 11237319820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 15:13:09,798][15401] Updated weights for policy 0, policy_version 685871 (0.0034) [2024-06-24 15:13:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11237457920. Throughput: 0: 42478.0. Samples: 11237571140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 15:13:13,487][15401] Updated weights for policy 0, policy_version 685881 (0.0034) [2024-06-24 15:13:17,648][15401] Updated weights for policy 0, policy_version 685891 (0.0035) [2024-06-24 15:13:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 11237670912. Throughput: 0: 42497.2. Samples: 11237829380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:18,393][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 15:13:21,361][15401] Updated weights for policy 0, policy_version 685901 (0.0035) [2024-06-24 15:13:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11237883904. Throughput: 0: 42574.9. Samples: 11237961080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 15:13:25,141][15401] Updated weights for policy 0, policy_version 685911 (0.0032) [2024-06-24 15:13:28,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 11238113280. Throughput: 0: 42644.9. Samples: 11238216560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 15:13:28,874][15401] Updated weights for policy 0, policy_version 685921 (0.0027) [2024-06-24 15:13:32,935][15401] Updated weights for policy 0, policy_version 685931 (0.0032) [2024-06-24 15:13:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11238309888. Throughput: 0: 42855.6. Samples: 11238482080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 15:13:36,409][15401] Updated weights for policy 0, policy_version 685941 (0.0037) [2024-06-24 15:13:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 11238522880. Throughput: 0: 42804.0. Samples: 11238606480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 15:13:40,477][15401] Updated weights for policy 0, policy_version 685951 (0.0030) [2024-06-24 15:13:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11238752256. Throughput: 0: 42930.1. Samples: 11238866000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:43,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-24 15:13:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685960_11238768640.pth... [2024-06-24 15:13:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685335_11228528640.pth [2024-06-24 15:13:44,362][15401] Updated weights for policy 0, policy_version 685961 (0.0025) [2024-06-24 15:13:44,863][15349] Signal inference workers to stop experience collection... (166300 times) [2024-06-24 15:13:44,910][15401] InferenceWorker_p0-w0: stopping experience collection (166300 times) [2024-06-24 15:13:44,916][15349] Signal inference workers to resume experience collection... (166300 times) [2024-06-24 15:13:44,924][15401] InferenceWorker_p0-w0: resuming experience collection (166300 times) [2024-06-24 15:13:47,960][15401] Updated weights for policy 0, policy_version 685971 (0.0030) [2024-06-24 15:13:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 11238948864. Throughput: 0: 42924.5. Samples: 11239125440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:48,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-24 15:13:51,865][15401] Updated weights for policy 0, policy_version 685981 (0.0027) [2024-06-24 15:13:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11239178240. Throughput: 0: 42927.5. Samples: 11239251560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 15:13:55,520][15401] Updated weights for policy 0, policy_version 685991 (0.0039) [2024-06-24 15:13:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11239407616. Throughput: 0: 43097.9. Samples: 11239510540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:13:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 15:13:59,427][15401] Updated weights for policy 0, policy_version 686001 (0.0032) [2024-06-24 15:14:03,076][15401] Updated weights for policy 0, policy_version 686011 (0.0036) [2024-06-24 15:14:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11239604224. Throughput: 0: 43057.0. Samples: 11239766840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 15:14:06,970][15401] Updated weights for policy 0, policy_version 686021 (0.0043) [2024-06-24 15:14:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11239817216. Throughput: 0: 43062.3. Samples: 11239898880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 15:14:10,693][15401] Updated weights for policy 0, policy_version 686031 (0.0024) [2024-06-24 15:14:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11240046592. Throughput: 0: 43176.9. Samples: 11240159520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:13,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 15:14:14,567][15401] Updated weights for policy 0, policy_version 686041 (0.0027) [2024-06-24 15:14:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11240243200. Throughput: 0: 42903.5. Samples: 11240412740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:18,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 15:14:18,813][15401] Updated weights for policy 0, policy_version 686051 (0.0024) [2024-06-24 15:14:22,212][15401] Updated weights for policy 0, policy_version 686061 (0.0032) [2024-06-24 15:14:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11240456192. Throughput: 0: 42969.2. Samples: 11240540100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 15:14:26,235][15401] Updated weights for policy 0, policy_version 686071 (0.0038) [2024-06-24 15:14:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11240685568. Throughput: 0: 43038.3. Samples: 11240802720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 15:14:29,642][15401] Updated weights for policy 0, policy_version 686081 (0.0038) [2024-06-24 15:14:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 11240882176. Throughput: 0: 42786.6. Samples: 11241050840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 15:14:34,164][15401] Updated weights for policy 0, policy_version 686091 (0.0048) [2024-06-24 15:14:37,605][15401] Updated weights for policy 0, policy_version 686101 (0.0036) [2024-06-24 15:14:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11241095168. Throughput: 0: 42839.7. Samples: 11241179340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:14:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 15:14:41,675][15401] Updated weights for policy 0, policy_version 686111 (0.0039) [2024-06-24 15:14:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11241308160. Throughput: 0: 42770.6. Samples: 11241435220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:14:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 15:14:45,258][15401] Updated weights for policy 0, policy_version 686121 (0.0025) [2024-06-24 15:14:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11241521152. Throughput: 0: 42789.8. Samples: 11241692380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:14:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 15:14:49,359][15401] Updated weights for policy 0, policy_version 686131 (0.0030) [2024-06-24 15:14:53,022][15401] Updated weights for policy 0, policy_version 686141 (0.0047) [2024-06-24 15:14:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11241734144. Throughput: 0: 42653.9. Samples: 11241818300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:14:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 15:14:56,934][15401] Updated weights for policy 0, policy_version 686151 (0.0025) [2024-06-24 15:14:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11241963520. Throughput: 0: 42537.2. Samples: 11242073700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:14:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 15:15:00,931][15401] Updated weights for policy 0, policy_version 686161 (0.0037) [2024-06-24 15:15:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11242176512. Throughput: 0: 42599.1. Samples: 11242329700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:03,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 15:15:05,021][15401] Updated weights for policy 0, policy_version 686171 (0.0045) [2024-06-24 15:15:05,021][15349] Signal inference workers to stop experience collection... (166350 times) [2024-06-24 15:15:05,021][15349] Signal inference workers to resume experience collection... (166350 times) [2024-06-24 15:15:05,038][15401] InferenceWorker_p0-w0: stopping experience collection (166350 times) [2024-06-24 15:15:05,038][15401] InferenceWorker_p0-w0: resuming experience collection (166350 times) [2024-06-24 15:15:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11242373120. Throughput: 0: 42542.1. Samples: 11242454500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 15:15:08,474][15401] Updated weights for policy 0, policy_version 686181 (0.0035) [2024-06-24 15:15:12,471][15401] Updated weights for policy 0, policy_version 686191 (0.0036) [2024-06-24 15:15:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11242602496. Throughput: 0: 42542.7. Samples: 11242717140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 15:15:16,640][15401] Updated weights for policy 0, policy_version 686201 (0.0024) [2024-06-24 15:15:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11242815488. Throughput: 0: 42614.0. Samples: 11242968480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 15:15:19,932][15401] Updated weights for policy 0, policy_version 686211 (0.0036) [2024-06-24 15:15:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 11242995712. Throughput: 0: 42646.1. Samples: 11243098420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 15:15:23,956][15401] Updated weights for policy 0, policy_version 686221 (0.0030) [2024-06-24 15:15:27,341][15401] Updated weights for policy 0, policy_version 686231 (0.0051) [2024-06-24 15:15:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 11243241472. Throughput: 0: 42623.5. Samples: 11243353280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 15:15:31,467][15401] Updated weights for policy 0, policy_version 686241 (0.0026) [2024-06-24 15:15:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11243470848. Throughput: 0: 42691.9. Samples: 11243613520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:33,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 15:15:34,953][15401] Updated weights for policy 0, policy_version 686251 (0.0039) [2024-06-24 15:15:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11243634688. Throughput: 0: 42831.5. Samples: 11243745720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:38,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 15:15:39,022][15401] Updated weights for policy 0, policy_version 686261 (0.0028) [2024-06-24 15:15:42,443][15401] Updated weights for policy 0, policy_version 686271 (0.0037) [2024-06-24 15:15:43,396][15132] Fps is (10 sec: 42571.2, 60 sec: 43139.9, 300 sec: 42764.4). Total num frames: 11243896832. Throughput: 0: 42826.9. Samples: 11244001180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:43,397][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 15:15:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686273_11243896832.pth... [2024-06-24 15:15:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685646_11233624064.pth [2024-06-24 15:15:46,790][15401] Updated weights for policy 0, policy_version 686281 (0.0034) [2024-06-24 15:15:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11244093440. Throughput: 0: 42741.3. Samples: 11244253060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 15:15:50,707][15401] Updated weights for policy 0, policy_version 686291 (0.0037) [2024-06-24 15:15:53,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11244290048. Throughput: 0: 42835.2. Samples: 11244382080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 15:15:54,668][15401] Updated weights for policy 0, policy_version 686301 (0.0027) [2024-06-24 15:15:58,218][15401] Updated weights for policy 0, policy_version 686311 (0.0040) [2024-06-24 15:15:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11244519424. Throughput: 0: 42688.9. Samples: 11244638140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:15:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 15:16:02,334][15401] Updated weights for policy 0, policy_version 686321 (0.0030) [2024-06-24 15:16:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11244748800. Throughput: 0: 42743.2. Samples: 11244891920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:16:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 15:16:05,815][15401] Updated weights for policy 0, policy_version 686331 (0.0025) [2024-06-24 15:16:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11244945408. Throughput: 0: 42801.3. Samples: 11245024480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:16:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 15:16:10,099][15401] Updated weights for policy 0, policy_version 686341 (0.0043) [2024-06-24 15:16:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11245174784. Throughput: 0: 42812.4. Samples: 11245279840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:16:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 15:16:13,401][15401] Updated weights for policy 0, policy_version 686351 (0.0039) [2024-06-24 15:16:18,038][15401] Updated weights for policy 0, policy_version 686361 (0.0041) [2024-06-24 15:16:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 11245355008. Throughput: 0: 42821.8. Samples: 11245540500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 15:16:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 15:16:20,967][15401] Updated weights for policy 0, policy_version 686371 (0.0028) [2024-06-24 15:16:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11245568000. Throughput: 0: 42538.7. Samples: 11245659960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 15:16:25,495][15401] Updated weights for policy 0, policy_version 686381 (0.0029) [2024-06-24 15:16:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11245797376. Throughput: 0: 42734.6. Samples: 11245923960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:28,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 15:16:28,401][15349] Signal inference workers to stop experience collection... (166400 times) [2024-06-24 15:16:28,454][15401] InferenceWorker_p0-w0: stopping experience collection (166400 times) [2024-06-24 15:16:28,458][15349] Signal inference workers to resume experience collection... (166400 times) [2024-06-24 15:16:28,465][15401] InferenceWorker_p0-w0: resuming experience collection (166400 times) [2024-06-24 15:16:28,601][15401] Updated weights for policy 0, policy_version 686391 (0.0033) [2024-06-24 15:16:32,963][15401] Updated weights for policy 0, policy_version 686401 (0.0036) [2024-06-24 15:16:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11246010368. Throughput: 0: 42895.0. Samples: 11246183340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 15:16:36,181][15401] Updated weights for policy 0, policy_version 686411 (0.0027) [2024-06-24 15:16:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11246223360. Throughput: 0: 42818.8. Samples: 11246308920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 15:16:40,402][15401] Updated weights for policy 0, policy_version 686421 (0.0027) [2024-06-24 15:16:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 11246452736. Throughput: 0: 43015.4. Samples: 11246573840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 15:16:43,684][15401] Updated weights for policy 0, policy_version 686431 (0.0042) [2024-06-24 15:16:47,914][15401] Updated weights for policy 0, policy_version 686441 (0.0038) [2024-06-24 15:16:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11246649344. Throughput: 0: 43020.4. Samples: 11246827840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:48,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 15:16:51,414][15401] Updated weights for policy 0, policy_version 686451 (0.0032) [2024-06-24 15:16:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11246878720. Throughput: 0: 42835.1. Samples: 11246952060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:53,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 15:16:55,502][15401] Updated weights for policy 0, policy_version 686461 (0.0025) [2024-06-24 15:16:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11247075328. Throughput: 0: 42903.3. Samples: 11247210480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:16:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 15:16:59,255][15401] Updated weights for policy 0, policy_version 686471 (0.0036) [2024-06-24 15:17:03,113][15401] Updated weights for policy 0, policy_version 686481 (0.0031) [2024-06-24 15:17:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11247304704. Throughput: 0: 42700.8. Samples: 11247462040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:03,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-24 15:17:06,955][15401] Updated weights for policy 0, policy_version 686491 (0.0036) [2024-06-24 15:17:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11247517696. Throughput: 0: 42876.1. Samples: 11247589380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 15:17:11,137][15401] Updated weights for policy 0, policy_version 686501 (0.0041) [2024-06-24 15:17:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11247714304. Throughput: 0: 42741.3. Samples: 11247847320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:13,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 15:17:14,726][15401] Updated weights for policy 0, policy_version 686511 (0.0031) [2024-06-24 15:17:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11247943680. Throughput: 0: 42631.3. Samples: 11248101740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 15:17:18,737][15401] Updated weights for policy 0, policy_version 686521 (0.0031) [2024-06-24 15:17:22,331][15401] Updated weights for policy 0, policy_version 686531 (0.0028) [2024-06-24 15:17:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11248156672. Throughput: 0: 42771.0. Samples: 11248233620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:17:26,477][15401] Updated weights for policy 0, policy_version 686541 (0.0034) [2024-06-24 15:17:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11248353280. Throughput: 0: 42575.3. Samples: 11248489720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 15:17:29,998][15401] Updated weights for policy 0, policy_version 686551 (0.0035) [2024-06-24 15:17:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 11248566272. Throughput: 0: 42541.7. Samples: 11248742220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 15:17:34,053][15401] Updated weights for policy 0, policy_version 686561 (0.0036) [2024-06-24 15:17:37,618][15401] Updated weights for policy 0, policy_version 686571 (0.0024) [2024-06-24 15:17:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11248779264. Throughput: 0: 42686.7. Samples: 11248872960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 15:17:41,778][15401] Updated weights for policy 0, policy_version 686581 (0.0031) [2024-06-24 15:17:42,286][15349] Signal inference workers to stop experience collection... (166450 times) [2024-06-24 15:17:42,313][15401] InferenceWorker_p0-w0: stopping experience collection (166450 times) [2024-06-24 15:17:42,347][15349] Signal inference workers to resume experience collection... (166450 times) [2024-06-24 15:17:42,348][15401] InferenceWorker_p0-w0: resuming experience collection (166450 times) [2024-06-24 15:17:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 11248992256. Throughput: 0: 42546.1. Samples: 11249125060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 15:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686585_11249008640.pth... [2024-06-24 15:17:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000685960_11238768640.pth [2024-06-24 15:17:46,035][15401] Updated weights for policy 0, policy_version 686591 (0.0029) [2024-06-24 15:17:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11249221632. Throughput: 0: 42510.3. Samples: 11249375000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 15:17:49,601][15401] Updated weights for policy 0, policy_version 686601 (0.0038) [2024-06-24 15:17:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 11249418240. Throughput: 0: 42558.0. Samples: 11249504500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:17:53,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 15:17:53,576][15401] Updated weights for policy 0, policy_version 686611 (0.0031) [2024-06-24 15:17:57,175][15401] Updated weights for policy 0, policy_version 686621 (0.0037) [2024-06-24 15:17:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11249647616. Throughput: 0: 42562.2. Samples: 11249762620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:17:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 15:18:01,252][15401] Updated weights for policy 0, policy_version 686631 (0.0041) [2024-06-24 15:18:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11249860608. Throughput: 0: 42577.3. Samples: 11250017720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 15:18:04,902][15401] Updated weights for policy 0, policy_version 686641 (0.0034) [2024-06-24 15:18:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11250057216. Throughput: 0: 42440.5. Samples: 11250143440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 15:18:08,866][15401] Updated weights for policy 0, policy_version 686651 (0.0035) [2024-06-24 15:18:12,471][15401] Updated weights for policy 0, policy_version 686661 (0.0047) [2024-06-24 15:18:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11250286592. Throughput: 0: 42523.9. Samples: 11250403300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 15:18:16,646][15401] Updated weights for policy 0, policy_version 686671 (0.0028) [2024-06-24 15:18:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11250483200. Throughput: 0: 42487.6. Samples: 11250654260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:18,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 15:18:20,666][15401] Updated weights for policy 0, policy_version 686681 (0.0037) [2024-06-24 15:18:23,390][15132] Fps is (10 sec: 40958.4, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 11250696192. Throughput: 0: 42381.8. Samples: 11250780160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:23,391][15132] Avg episode reward: [(0, '0.881')] [2024-06-24 15:18:24,326][15401] Updated weights for policy 0, policy_version 686691 (0.0031) [2024-06-24 15:18:28,207][15401] Updated weights for policy 0, policy_version 686701 (0.0041) [2024-06-24 15:18:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11250925568. Throughput: 0: 42448.5. Samples: 11251035240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 15:18:31,982][15401] Updated weights for policy 0, policy_version 686711 (0.0039) [2024-06-24 15:18:33,390][15132] Fps is (10 sec: 40961.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11251105792. Throughput: 0: 42619.6. Samples: 11251292880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 15:18:35,735][15401] Updated weights for policy 0, policy_version 686721 (0.0040) [2024-06-24 15:18:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11251351552. Throughput: 0: 42449.9. Samples: 11251414740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:38,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 15:18:40,090][15401] Updated weights for policy 0, policy_version 686731 (0.0048) [2024-06-24 15:18:43,263][15401] Updated weights for policy 0, policy_version 686741 (0.0033) [2024-06-24 15:18:43,396][15132] Fps is (10 sec: 45846.0, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 11251564544. Throughput: 0: 42540.6. Samples: 11251677220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:43,396][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 15:18:47,549][15401] Updated weights for policy 0, policy_version 686751 (0.0036) [2024-06-24 15:18:48,396][15132] Fps is (10 sec: 39296.8, 60 sec: 42047.9, 300 sec: 42597.5). Total num frames: 11251744768. Throughput: 0: 42494.5. Samples: 11251930240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:48,396][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 15:18:51,292][15401] Updated weights for policy 0, policy_version 686761 (0.0028) [2024-06-24 15:18:53,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11251990528. Throughput: 0: 42570.6. Samples: 11252059120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 15:18:55,178][15401] Updated weights for policy 0, policy_version 686771 (0.0030) [2024-06-24 15:18:58,390][15132] Fps is (10 sec: 45903.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11252203520. Throughput: 0: 42568.4. Samples: 11252318880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:18:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 15:18:58,865][15401] Updated weights for policy 0, policy_version 686781 (0.0036) [2024-06-24 15:19:02,741][15401] Updated weights for policy 0, policy_version 686791 (0.0035) [2024-06-24 15:19:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11252400128. Throughput: 0: 42601.0. Samples: 11252571200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 15:19:06,698][15401] Updated weights for policy 0, policy_version 686801 (0.0033) [2024-06-24 15:19:07,422][15349] Signal inference workers to stop experience collection... (166500 times) [2024-06-24 15:19:07,422][15349] Signal inference workers to resume experience collection... (166500 times) [2024-06-24 15:19:07,461][15401] InferenceWorker_p0-w0: stopping experience collection (166500 times) [2024-06-24 15:19:07,461][15401] InferenceWorker_p0-w0: resuming experience collection (166500 times) [2024-06-24 15:19:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11252629504. Throughput: 0: 42594.3. Samples: 11252696880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 15:19:10,596][15401] Updated weights for policy 0, policy_version 686811 (0.0036) [2024-06-24 15:19:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11252826112. Throughput: 0: 42720.0. Samples: 11252957640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 15:19:14,342][15401] Updated weights for policy 0, policy_version 686821 (0.0029) [2024-06-24 15:19:18,257][15401] Updated weights for policy 0, policy_version 686831 (0.0041) [2024-06-24 15:19:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 11253039104. Throughput: 0: 42646.6. Samples: 11253211980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 15:19:21,850][15401] Updated weights for policy 0, policy_version 686841 (0.0033) [2024-06-24 15:19:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 11253268480. Throughput: 0: 42769.8. Samples: 11253339380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 15:19:25,950][15401] Updated weights for policy 0, policy_version 686851 (0.0044) [2024-06-24 15:19:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11253481472. Throughput: 0: 42628.3. Samples: 11253595220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 15:19:29,634][15401] Updated weights for policy 0, policy_version 686861 (0.0044) [2024-06-24 15:19:33,393][15132] Fps is (10 sec: 40944.5, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 11253678080. Throughput: 0: 42622.9. Samples: 11253848160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-24 15:19:33,394][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 15:19:33,689][15401] Updated weights for policy 0, policy_version 686871 (0.0034) [2024-06-24 15:19:37,406][15401] Updated weights for policy 0, policy_version 686881 (0.0041) [2024-06-24 15:19:38,390][15132] Fps is (10 sec: 42595.5, 60 sec: 42597.9, 300 sec: 42709.4). Total num frames: 11253907456. Throughput: 0: 42578.1. Samples: 11253975160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:19:38,391][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 15:19:41,512][15401] Updated weights for policy 0, policy_version 686891 (0.0035) [2024-06-24 15:19:43,390][15132] Fps is (10 sec: 42613.6, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 11254104064. Throughput: 0: 42489.3. Samples: 11254230900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:19:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 15:19:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686896_11254104064.pth... [2024-06-24 15:19:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686273_11243896832.pth [2024-06-24 15:19:45,065][15401] Updated weights for policy 0, policy_version 686901 (0.0033) [2024-06-24 15:19:48,389][15132] Fps is (10 sec: 40962.8, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 11254317056. Throughput: 0: 42500.4. Samples: 11254483720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:19:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 15:19:49,153][15401] Updated weights for policy 0, policy_version 686911 (0.0042) [2024-06-24 15:19:52,788][15401] Updated weights for policy 0, policy_version 686921 (0.0034) [2024-06-24 15:19:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11254530048. Throughput: 0: 42570.2. Samples: 11254612540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:19:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 15:19:56,730][15401] Updated weights for policy 0, policy_version 686931 (0.0031) [2024-06-24 15:19:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 11254759424. Throughput: 0: 42551.0. Samples: 11254872540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:19:58,393][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 15:20:00,575][15401] Updated weights for policy 0, policy_version 686941 (0.0035) [2024-06-24 15:20:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11254988800. Throughput: 0: 42504.4. Samples: 11255124680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 15:20:04,311][15401] Updated weights for policy 0, policy_version 686951 (0.0031) [2024-06-24 15:20:08,373][15401] Updated weights for policy 0, policy_version 686961 (0.0039) [2024-06-24 15:20:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11255169024. Throughput: 0: 42688.3. Samples: 11255260360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 15:20:11,806][15401] Updated weights for policy 0, policy_version 686971 (0.0048) [2024-06-24 15:20:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11255398400. Throughput: 0: 42711.0. Samples: 11255517220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:13,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 15:20:15,903][15401] Updated weights for policy 0, policy_version 686981 (0.0030) [2024-06-24 15:20:18,391][15132] Fps is (10 sec: 45870.4, 60 sec: 43143.8, 300 sec: 42820.4). Total num frames: 11255627776. Throughput: 0: 42622.4. Samples: 11255766060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:18,391][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 15:20:19,512][15401] Updated weights for policy 0, policy_version 686991 (0.0024) [2024-06-24 15:20:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11255808000. Throughput: 0: 42853.6. Samples: 11255903540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 15:20:23,423][15401] Updated weights for policy 0, policy_version 687001 (0.0030) [2024-06-24 15:20:27,046][15401] Updated weights for policy 0, policy_version 687011 (0.0043) [2024-06-24 15:20:28,390][15132] Fps is (10 sec: 42602.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11256053760. Throughput: 0: 42916.5. Samples: 11256162140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 15:20:30,875][15401] Updated weights for policy 0, policy_version 687021 (0.0038) [2024-06-24 15:20:33,390][15132] Fps is (10 sec: 45872.1, 60 sec: 43146.8, 300 sec: 42820.5). Total num frames: 11256266752. Throughput: 0: 42835.0. Samples: 11256411320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:33,391][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 15:20:34,721][15401] Updated weights for policy 0, policy_version 687031 (0.0038) [2024-06-24 15:20:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.9, 300 sec: 42599.3). Total num frames: 11256463360. Throughput: 0: 42927.0. Samples: 11256544260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:38,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 15:20:38,776][15401] Updated weights for policy 0, policy_version 687041 (0.0045) [2024-06-24 15:20:42,275][15401] Updated weights for policy 0, policy_version 687051 (0.0034) [2024-06-24 15:20:43,390][15132] Fps is (10 sec: 40962.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11256676352. Throughput: 0: 42877.3. Samples: 11256801920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:43,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 15:20:43,761][15349] Signal inference workers to stop experience collection... (166550 times) [2024-06-24 15:20:43,798][15401] InferenceWorker_p0-w0: stopping experience collection (166550 times) [2024-06-24 15:20:43,811][15349] Signal inference workers to resume experience collection... (166550 times) [2024-06-24 15:20:43,818][15401] InferenceWorker_p0-w0: resuming experience collection (166550 times) [2024-06-24 15:20:46,287][15401] Updated weights for policy 0, policy_version 687061 (0.0035) [2024-06-24 15:20:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11256889344. Throughput: 0: 42954.3. Samples: 11257057620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:48,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 15:20:49,884][15401] Updated weights for policy 0, policy_version 687071 (0.0030) [2024-06-24 15:20:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11257085952. Throughput: 0: 42858.2. Samples: 11257188980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 15:20:53,859][15401] Updated weights for policy 0, policy_version 687081 (0.0026) [2024-06-24 15:20:57,484][15401] Updated weights for policy 0, policy_version 687091 (0.0031) [2024-06-24 15:20:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 11257331712. Throughput: 0: 42820.2. Samples: 11257444120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:20:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 15:21:01,863][15401] Updated weights for policy 0, policy_version 687101 (0.0037) [2024-06-24 15:21:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11257544704. Throughput: 0: 42887.3. Samples: 11257695940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:21:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 15:21:04,862][15401] Updated weights for policy 0, policy_version 687111 (0.0033) [2024-06-24 15:21:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11257741312. Throughput: 0: 42816.9. Samples: 11257830300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:21:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 15:21:09,397][15401] Updated weights for policy 0, policy_version 687121 (0.0035) [2024-06-24 15:21:12,313][15401] Updated weights for policy 0, policy_version 687131 (0.0045) [2024-06-24 15:21:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11257970688. Throughput: 0: 42788.9. Samples: 11258087640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:21:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 15:21:17,090][15401] Updated weights for policy 0, policy_version 687141 (0.0025) [2024-06-24 15:21:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42872.3, 300 sec: 42820.6). Total num frames: 11258200064. Throughput: 0: 42973.5. Samples: 11258345100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 15:21:20,466][15401] Updated weights for policy 0, policy_version 687151 (0.0033) [2024-06-24 15:21:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11258363904. Throughput: 0: 42840.9. Samples: 11258472100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 15:21:24,719][15401] Updated weights for policy 0, policy_version 687161 (0.0032) [2024-06-24 15:21:28,099][15401] Updated weights for policy 0, policy_version 687171 (0.0042) [2024-06-24 15:21:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11258609664. Throughput: 0: 42663.6. Samples: 11258721780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:28,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 15:21:32,284][15401] Updated weights for policy 0, policy_version 687181 (0.0032) [2024-06-24 15:21:33,393][15132] Fps is (10 sec: 47499.4, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 11258839040. Throughput: 0: 42713.2. Samples: 11258979840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:33,393][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 15:21:35,856][15401] Updated weights for policy 0, policy_version 687191 (0.0036) [2024-06-24 15:21:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11259002880. Throughput: 0: 42700.5. Samples: 11259110500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 15:21:40,051][15401] Updated weights for policy 0, policy_version 687201 (0.0029) [2024-06-24 15:21:43,389][15132] Fps is (10 sec: 40972.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11259248640. Throughput: 0: 42527.9. Samples: 11259357880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 15:21:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687211_11259265024.pth... [2024-06-24 15:21:43,513][15401] Updated weights for policy 0, policy_version 687211 (0.0039) [2024-06-24 15:21:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686585_11249008640.pth [2024-06-24 15:21:47,790][15401] Updated weights for policy 0, policy_version 687221 (0.0033) [2024-06-24 15:21:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11259461632. Throughput: 0: 42876.0. Samples: 11259625360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 15:21:51,044][15401] Updated weights for policy 0, policy_version 687231 (0.0048) [2024-06-24 15:21:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11259658240. Throughput: 0: 42544.7. Samples: 11259744820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:53,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 15:21:55,565][15401] Updated weights for policy 0, policy_version 687241 (0.0038) [2024-06-24 15:21:56,313][15349] Signal inference workers to stop experience collection... (166600 times) [2024-06-24 15:21:56,313][15349] Signal inference workers to resume experience collection... (166600 times) [2024-06-24 15:21:56,367][15401] InferenceWorker_p0-w0: stopping experience collection (166600 times) [2024-06-24 15:21:56,367][15401] InferenceWorker_p0-w0: resuming experience collection (166600 times) [2024-06-24 15:21:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11259904000. Throughput: 0: 42511.3. Samples: 11260000640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:21:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 15:21:58,486][15401] Updated weights for policy 0, policy_version 687251 (0.0030) [2024-06-24 15:22:03,174][15401] Updated weights for policy 0, policy_version 687261 (0.0033) [2024-06-24 15:22:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11260100608. Throughput: 0: 42660.8. Samples: 11260264840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:03,391][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 15:22:05,971][15401] Updated weights for policy 0, policy_version 687271 (0.0045) [2024-06-24 15:22:08,393][15132] Fps is (10 sec: 40945.1, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 11260313600. Throughput: 0: 42483.4. Samples: 11260384000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:08,394][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 15:22:10,869][15401] Updated weights for policy 0, policy_version 687281 (0.0039) [2024-06-24 15:22:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11260559360. Throughput: 0: 42752.5. Samples: 11260645640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 15:22:13,474][15401] Updated weights for policy 0, policy_version 687291 (0.0028) [2024-06-24 15:22:18,389][15132] Fps is (10 sec: 40974.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11260723200. Throughput: 0: 42873.6. Samples: 11260909020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 15:22:18,410][15401] Updated weights for policy 0, policy_version 687301 (0.0036) [2024-06-24 15:22:21,365][15401] Updated weights for policy 0, policy_version 687311 (0.0040) [2024-06-24 15:22:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11260968960. Throughput: 0: 42480.0. Samples: 11261022100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 15:22:26,372][15401] Updated weights for policy 0, policy_version 687321 (0.0031) [2024-06-24 15:22:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11261181952. Throughput: 0: 42804.5. Samples: 11261284080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 15:22:29,360][15401] Updated weights for policy 0, policy_version 687331 (0.0042) [2024-06-24 15:22:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42054.3, 300 sec: 42653.9). Total num frames: 11261362176. Throughput: 0: 42763.9. Samples: 11261549740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 15:22:33,937][15401] Updated weights for policy 0, policy_version 687341 (0.0033) [2024-06-24 15:22:36,963][15401] Updated weights for policy 0, policy_version 687351 (0.0035) [2024-06-24 15:22:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11261591552. Throughput: 0: 42816.5. Samples: 11261671560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 15:22:41,634][15401] Updated weights for policy 0, policy_version 687361 (0.0036) [2024-06-24 15:22:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11261804544. Throughput: 0: 42783.4. Samples: 11261925900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 15:22:44,545][15401] Updated weights for policy 0, policy_version 687371 (0.0040) [2024-06-24 15:22:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11262017536. Throughput: 0: 42697.8. Samples: 11262186240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:48,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 15:22:49,279][15401] Updated weights for policy 0, policy_version 687381 (0.0029) [2024-06-24 15:22:52,532][15401] Updated weights for policy 0, policy_version 687391 (0.0033) [2024-06-24 15:22:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11262246912. Throughput: 0: 42807.3. Samples: 11262310180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 15:22:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 15:22:56,757][15401] Updated weights for policy 0, policy_version 687401 (0.0035) [2024-06-24 15:22:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11262443520. Throughput: 0: 42777.3. Samples: 11262570620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:22:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 15:23:00,750][15401] Updated weights for policy 0, policy_version 687411 (0.0048) [2024-06-24 15:23:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11262656512. Throughput: 0: 42591.2. Samples: 11262825620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 15:23:03,754][15349] Signal inference workers to stop experience collection... (166650 times) [2024-06-24 15:23:03,755][15349] Signal inference workers to resume experience collection... (166650 times) [2024-06-24 15:23:03,803][15401] InferenceWorker_p0-w0: stopping experience collection (166650 times) [2024-06-24 15:23:03,803][15401] InferenceWorker_p0-w0: resuming experience collection (166650 times) [2024-06-24 15:23:04,426][15401] Updated weights for policy 0, policy_version 687421 (0.0029) [2024-06-24 15:23:08,131][15401] Updated weights for policy 0, policy_version 687431 (0.0031) [2024-06-24 15:23:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 11262869504. Throughput: 0: 42851.2. Samples: 11262950400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 15:23:12,024][15401] Updated weights for policy 0, policy_version 687441 (0.0038) [2024-06-24 15:23:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42050.6, 300 sec: 42709.5). Total num frames: 11263082496. Throughput: 0: 42668.7. Samples: 11263204280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:13,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 15:23:15,935][15401] Updated weights for policy 0, policy_version 687451 (0.0046) [2024-06-24 15:23:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11263279104. Throughput: 0: 42481.9. Samples: 11263461420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 15:23:20,077][15401] Updated weights for policy 0, policy_version 687461 (0.0050) [2024-06-24 15:23:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11263508480. Throughput: 0: 42577.3. Samples: 11263587540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 15:23:23,629][15401] Updated weights for policy 0, policy_version 687471 (0.0031) [2024-06-24 15:23:27,650][15401] Updated weights for policy 0, policy_version 687481 (0.0024) [2024-06-24 15:23:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 11263721472. Throughput: 0: 42630.7. Samples: 11263844380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:28,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 15:23:31,173][15401] Updated weights for policy 0, policy_version 687491 (0.0034) [2024-06-24 15:23:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11263918080. Throughput: 0: 42571.7. Samples: 11264101960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 15:23:35,065][15401] Updated weights for policy 0, policy_version 687501 (0.0033) [2024-06-24 15:23:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 11264163840. Throughput: 0: 42648.9. Samples: 11264229380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 15:23:38,753][15401] Updated weights for policy 0, policy_version 687511 (0.0033) [2024-06-24 15:23:42,443][15401] Updated weights for policy 0, policy_version 687521 (0.0034) [2024-06-24 15:23:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 11264376832. Throughput: 0: 42516.9. Samples: 11264483880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 15:23:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687523_11264376832.pth... [2024-06-24 15:23:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000686896_11254104064.pth [2024-06-24 15:23:46,376][15401] Updated weights for policy 0, policy_version 687531 (0.0037) [2024-06-24 15:23:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 11264573440. Throughput: 0: 42666.3. Samples: 11264745600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 15:23:50,188][15401] Updated weights for policy 0, policy_version 687541 (0.0047) [2024-06-24 15:23:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11264786432. Throughput: 0: 42657.3. Samples: 11264869980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 15:23:53,870][15401] Updated weights for policy 0, policy_version 687551 (0.0047) [2024-06-24 15:23:58,082][15401] Updated weights for policy 0, policy_version 687561 (0.0032) [2024-06-24 15:23:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11265015808. Throughput: 0: 42647.2. Samples: 11265123300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:23:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:24:01,842][15401] Updated weights for policy 0, policy_version 687571 (0.0032) [2024-06-24 15:24:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11265196032. Throughput: 0: 42901.8. Samples: 11265392000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 15:24:05,502][15401] Updated weights for policy 0, policy_version 687581 (0.0029) [2024-06-24 15:24:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11265441792. Throughput: 0: 42855.3. Samples: 11265516020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 15:24:09,238][15401] Updated weights for policy 0, policy_version 687591 (0.0039) [2024-06-24 15:24:12,866][15401] Updated weights for policy 0, policy_version 687601 (0.0039) [2024-06-24 15:24:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 11265671168. Throughput: 0: 42933.3. Samples: 11265776280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 15:24:15,629][15349] Signal inference workers to stop experience collection... (166700 times) [2024-06-24 15:24:15,630][15349] Signal inference workers to resume experience collection... (166700 times) [2024-06-24 15:24:15,641][15401] InferenceWorker_p0-w0: stopping experience collection (166700 times) [2024-06-24 15:24:15,665][15401] InferenceWorker_p0-w0: resuming experience collection (166700 times) [2024-06-24 15:24:17,126][15401] Updated weights for policy 0, policy_version 687611 (0.0033) [2024-06-24 15:24:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11265867776. Throughput: 0: 43059.1. Samples: 11266039620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 15:24:20,721][15401] Updated weights for policy 0, policy_version 687621 (0.0041) [2024-06-24 15:24:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11266080768. Throughput: 0: 42900.5. Samples: 11266159900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:23,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-24 15:24:24,722][15401] Updated weights for policy 0, policy_version 687631 (0.0033) [2024-06-24 15:24:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 11266293760. Throughput: 0: 43075.1. Samples: 11266422360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:28,393][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 15:24:28,486][15401] Updated weights for policy 0, policy_version 687641 (0.0053) [2024-06-24 15:24:32,360][15401] Updated weights for policy 0, policy_version 687651 (0.0034) [2024-06-24 15:24:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42709.6). Total num frames: 11266506752. Throughput: 0: 42837.5. Samples: 11266673300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 15:24:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 15:24:36,530][15401] Updated weights for policy 0, policy_version 687661 (0.0028) [2024-06-24 15:24:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11266719744. Throughput: 0: 42953.7. Samples: 11266802900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:24:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 15:24:40,231][15401] Updated weights for policy 0, policy_version 687671 (0.0037) [2024-06-24 15:24:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11266932736. Throughput: 0: 42991.6. Samples: 11267057920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:24:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 15:24:44,079][15401] Updated weights for policy 0, policy_version 687681 (0.0031) [2024-06-24 15:24:47,929][15401] Updated weights for policy 0, policy_version 687691 (0.0028) [2024-06-24 15:24:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11267162112. Throughput: 0: 42745.8. Samples: 11267315560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:24:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 15:24:51,587][15401] Updated weights for policy 0, policy_version 687701 (0.0033) [2024-06-24 15:24:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11267358720. Throughput: 0: 42837.7. Samples: 11267443720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:24:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 15:24:55,614][15401] Updated weights for policy 0, policy_version 687711 (0.0041) [2024-06-24 15:24:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11267571712. Throughput: 0: 42635.2. Samples: 11267694860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:24:58,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 15:24:59,209][15401] Updated weights for policy 0, policy_version 687721 (0.0028) [2024-06-24 15:25:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11267768320. Throughput: 0: 42535.6. Samples: 11267953720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 15:25:03,469][15401] Updated weights for policy 0, policy_version 687731 (0.0038) [2024-06-24 15:25:06,976][15401] Updated weights for policy 0, policy_version 687741 (0.0031) [2024-06-24 15:25:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11267981312. Throughput: 0: 42585.4. Samples: 11268076240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:08,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 15:25:11,307][15401] Updated weights for policy 0, policy_version 687751 (0.0032) [2024-06-24 15:25:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 11268210688. Throughput: 0: 42231.6. Samples: 11268322680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 15:25:14,665][15401] Updated weights for policy 0, policy_version 687761 (0.0045) [2024-06-24 15:25:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11268390912. Throughput: 0: 42549.1. Samples: 11268588000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:18,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 15:25:18,955][15401] Updated weights for policy 0, policy_version 687771 (0.0028) [2024-06-24 15:25:22,654][15401] Updated weights for policy 0, policy_version 687781 (0.0039) [2024-06-24 15:25:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11268636672. Throughput: 0: 42320.5. Samples: 11268707320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 15:25:26,851][15401] Updated weights for policy 0, policy_version 687791 (0.0028) [2024-06-24 15:25:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 11268849664. Throughput: 0: 42422.1. Samples: 11268966920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 15:25:30,430][15401] Updated weights for policy 0, policy_version 687801 (0.0036) [2024-06-24 15:25:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 11269046272. Throughput: 0: 42378.7. Samples: 11269222600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 15:25:34,610][15401] Updated weights for policy 0, policy_version 687811 (0.0034) [2024-06-24 15:25:38,017][15401] Updated weights for policy 0, policy_version 687821 (0.0044) [2024-06-24 15:25:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11269259264. Throughput: 0: 42285.0. Samples: 11269346540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 15:25:41,780][15349] Signal inference workers to stop experience collection... (166750 times) [2024-06-24 15:25:41,809][15401] InferenceWorker_p0-w0: stopping experience collection (166750 times) [2024-06-24 15:25:41,839][15349] Signal inference workers to resume experience collection... (166750 times) [2024-06-24 15:25:41,839][15401] InferenceWorker_p0-w0: resuming experience collection (166750 times) [2024-06-24 15:25:42,166][15401] Updated weights for policy 0, policy_version 687831 (0.0033) [2024-06-24 15:25:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11269472256. Throughput: 0: 42354.6. Samples: 11269600820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 15:25:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687834_11269472256.pth... [2024-06-24 15:25:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687211_11259265024.pth [2024-06-24 15:25:45,559][15401] Updated weights for policy 0, policy_version 687841 (0.0039) [2024-06-24 15:25:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11269685248. Throughput: 0: 42158.6. Samples: 11269850860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 15:25:49,835][15401] Updated weights for policy 0, policy_version 687851 (0.0030) [2024-06-24 15:25:53,111][15401] Updated weights for policy 0, policy_version 687861 (0.0028) [2024-06-24 15:25:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11269914624. Throughput: 0: 42350.2. Samples: 11269982000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:53,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 15:25:57,612][15401] Updated weights for policy 0, policy_version 687871 (0.0031) [2024-06-24 15:25:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11270111232. Throughput: 0: 42611.6. Samples: 11270240200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:25:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 15:26:00,621][15401] Updated weights for policy 0, policy_version 687881 (0.0031) [2024-06-24 15:26:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11270324224. Throughput: 0: 42356.9. Samples: 11270494060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:26:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 15:26:05,231][15401] Updated weights for policy 0, policy_version 687891 (0.0042) [2024-06-24 15:26:08,332][15401] Updated weights for policy 0, policy_version 687901 (0.0027) [2024-06-24 15:26:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11270569984. Throughput: 0: 42602.7. Samples: 11270624440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:26:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 15:26:13,282][15401] Updated weights for policy 0, policy_version 687911 (0.0032) [2024-06-24 15:26:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11270733824. Throughput: 0: 42421.8. Samples: 11270875900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 15:26:13,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 15:26:16,037][15401] Updated weights for policy 0, policy_version 687921 (0.0036) [2024-06-24 15:26:18,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 11270963200. Throughput: 0: 42432.2. Samples: 11271132060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:18,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 15:26:20,847][15401] Updated weights for policy 0, policy_version 687931 (0.0030) [2024-06-24 15:26:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11271192576. Throughput: 0: 42589.7. Samples: 11271263080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 15:26:23,706][15401] Updated weights for policy 0, policy_version 687941 (0.0029) [2024-06-24 15:26:28,392][15132] Fps is (10 sec: 40951.1, 60 sec: 42050.6, 300 sec: 42487.4). Total num frames: 11271372800. Throughput: 0: 42552.1. Samples: 11271515760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:28,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 15:26:28,442][15401] Updated weights for policy 0, policy_version 687951 (0.0053) [2024-06-24 15:26:31,550][15401] Updated weights for policy 0, policy_version 687961 (0.0032) [2024-06-24 15:26:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11271618560. Throughput: 0: 42691.2. Samples: 11271771960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 15:26:36,050][15401] Updated weights for policy 0, policy_version 687971 (0.0032) [2024-06-24 15:26:38,390][15132] Fps is (10 sec: 44246.3, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 11271815168. Throughput: 0: 42614.4. Samples: 11271899660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 15:26:39,499][15401] Updated weights for policy 0, policy_version 687981 (0.0033) [2024-06-24 15:26:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11272028160. Throughput: 0: 42631.1. Samples: 11272158600. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 15:26:43,699][15401] Updated weights for policy 0, policy_version 687991 (0.0041) [2024-06-24 15:26:47,133][15401] Updated weights for policy 0, policy_version 688001 (0.0037) [2024-06-24 15:26:48,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11272257536. Throughput: 0: 42443.9. Samples: 11272404040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 15:26:51,286][15401] Updated weights for policy 0, policy_version 688011 (0.0042) [2024-06-24 15:26:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 11272454144. Throughput: 0: 42474.1. Samples: 11272535880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:53,392][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 15:26:54,742][15401] Updated weights for policy 0, policy_version 688021 (0.0026) [2024-06-24 15:26:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11272667136. Throughput: 0: 42628.4. Samples: 11272794180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:26:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 15:26:58,965][15401] Updated weights for policy 0, policy_version 688031 (0.0038) [2024-06-24 15:27:01,869][15349] Signal inference workers to stop experience collection... (166800 times) [2024-06-24 15:27:01,879][15349] Signal inference workers to resume experience collection... (166800 times) [2024-06-24 15:27:01,907][15401] InferenceWorker_p0-w0: stopping experience collection (166800 times) [2024-06-24 15:27:01,907][15401] InferenceWorker_p0-w0: resuming experience collection (166800 times) [2024-06-24 15:27:02,367][15401] Updated weights for policy 0, policy_version 688041 (0.0029) [2024-06-24 15:27:03,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 11272896512. Throughput: 0: 42340.3. Samples: 11273037360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 15:27:07,007][15401] Updated weights for policy 0, policy_version 688051 (0.0035) [2024-06-24 15:27:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 11273093120. Throughput: 0: 42524.9. Samples: 11273176700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 15:27:10,003][15401] Updated weights for policy 0, policy_version 688061 (0.0038) [2024-06-24 15:27:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11273306112. Throughput: 0: 42514.2. Samples: 11273428800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 15:27:14,847][15401] Updated weights for policy 0, policy_version 688071 (0.0026) [2024-06-24 15:27:17,962][15401] Updated weights for policy 0, policy_version 688081 (0.0032) [2024-06-24 15:27:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 11273519104. Throughput: 0: 42345.4. Samples: 11273677500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:18,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-24 15:27:22,615][15401] Updated weights for policy 0, policy_version 688091 (0.0035) [2024-06-24 15:27:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11273748480. Throughput: 0: 42427.8. Samples: 11273808900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 15:27:25,604][15401] Updated weights for policy 0, policy_version 688101 (0.0037) [2024-06-24 15:27:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11273928704. Throughput: 0: 42330.7. Samples: 11274063480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 15:27:30,167][15401] Updated weights for policy 0, policy_version 688111 (0.0036) [2024-06-24 15:27:33,082][15401] Updated weights for policy 0, policy_version 688121 (0.0026) [2024-06-24 15:27:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11274174464. Throughput: 0: 42541.5. Samples: 11274318400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 15:27:37,802][15401] Updated weights for policy 0, policy_version 688131 (0.0032) [2024-06-24 15:27:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 11274371072. Throughput: 0: 42672.5. Samples: 11274456040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 15:27:40,862][15401] Updated weights for policy 0, policy_version 688141 (0.0033) [2024-06-24 15:27:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11274584064. Throughput: 0: 42581.4. Samples: 11274710340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 15:27:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688146_11274584064.pth... [2024-06-24 15:27:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687523_11264376832.pth [2024-06-24 15:27:45,412][15401] Updated weights for policy 0, policy_version 688151 (0.0029) [2024-06-24 15:27:48,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 11274813440. Throughput: 0: 42712.1. Samples: 11274959680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:48,397][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 15:27:48,467][15401] Updated weights for policy 0, policy_version 688161 (0.0028) [2024-06-24 15:27:53,145][15401] Updated weights for policy 0, policy_version 688171 (0.0032) [2024-06-24 15:27:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 11274993664. Throughput: 0: 42599.6. Samples: 11275093680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-24 15:27:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 15:27:56,081][15401] Updated weights for policy 0, policy_version 688181 (0.0031) [2024-06-24 15:27:58,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11275223040. Throughput: 0: 42567.3. Samples: 11275344320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:27:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 15:28:00,667][15401] Updated weights for policy 0, policy_version 688191 (0.0043) [2024-06-24 15:28:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11275436032. Throughput: 0: 42743.5. Samples: 11275600960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 15:28:04,236][15401] Updated weights for policy 0, policy_version 688201 (0.0030) [2024-06-24 15:28:08,305][15401] Updated weights for policy 0, policy_version 688211 (0.0042) [2024-06-24 15:28:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 11275649024. Throughput: 0: 42786.1. Samples: 11275734280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 15:28:11,861][15401] Updated weights for policy 0, policy_version 688221 (0.0038) [2024-06-24 15:28:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11275862016. Throughput: 0: 42738.2. Samples: 11275986700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 15:28:15,900][15401] Updated weights for policy 0, policy_version 688231 (0.0044) [2024-06-24 15:28:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11276091392. Throughput: 0: 42709.6. Samples: 11276240340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 15:28:19,460][15401] Updated weights for policy 0, policy_version 688241 (0.0046) [2024-06-24 15:28:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 11276288000. Throughput: 0: 42644.9. Samples: 11276375060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 15:28:23,423][15401] Updated weights for policy 0, policy_version 688251 (0.0023) [2024-06-24 15:28:27,002][15401] Updated weights for policy 0, policy_version 688261 (0.0030) [2024-06-24 15:28:28,396][15132] Fps is (10 sec: 39296.9, 60 sec: 42593.8, 300 sec: 42597.5). Total num frames: 11276484608. Throughput: 0: 42526.9. Samples: 11276624320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:28,397][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 15:28:31,023][15401] Updated weights for policy 0, policy_version 688271 (0.0029) [2024-06-24 15:28:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 11276713984. Throughput: 0: 42853.1. Samples: 11276887800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 15:28:34,622][15401] Updated weights for policy 0, policy_version 688281 (0.0039) [2024-06-24 15:28:38,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11276926976. Throughput: 0: 42789.8. Samples: 11277019220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:38,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 15:28:38,846][15401] Updated weights for policy 0, policy_version 688291 (0.0039) [2024-06-24 15:28:41,732][15349] Signal inference workers to stop experience collection... (166850 times) [2024-06-24 15:28:41,735][15349] Signal inference workers to resume experience collection... (166850 times) [2024-06-24 15:28:41,751][15401] InferenceWorker_p0-w0: stopping experience collection (166850 times) [2024-06-24 15:28:41,751][15401] InferenceWorker_p0-w0: resuming experience collection (166850 times) [2024-06-24 15:28:42,625][15401] Updated weights for policy 0, policy_version 688301 (0.0042) [2024-06-24 15:28:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11277139968. Throughput: 0: 42637.8. Samples: 11277263020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 15:28:46,375][15401] Updated weights for policy 0, policy_version 688311 (0.0035) [2024-06-24 15:28:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 11277352960. Throughput: 0: 42809.4. Samples: 11277527380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 15:28:50,207][15401] Updated weights for policy 0, policy_version 688321 (0.0047) [2024-06-24 15:28:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11277582336. Throughput: 0: 42605.8. Samples: 11277651540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:28:53,932][15401] Updated weights for policy 0, policy_version 688331 (0.0033) [2024-06-24 15:28:57,855][15401] Updated weights for policy 0, policy_version 688341 (0.0032) [2024-06-24 15:28:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11277795328. Throughput: 0: 42628.0. Samples: 11277904960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:28:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 15:29:02,346][15401] Updated weights for policy 0, policy_version 688351 (0.0023) [2024-06-24 15:29:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11277991936. Throughput: 0: 42725.7. Samples: 11278163000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:03,391][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 15:29:05,384][15401] Updated weights for policy 0, policy_version 688361 (0.0025) [2024-06-24 15:29:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11278221312. Throughput: 0: 42646.7. Samples: 11278294160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 15:29:09,738][15401] Updated weights for policy 0, policy_version 688371 (0.0023) [2024-06-24 15:29:13,220][15401] Updated weights for policy 0, policy_version 688381 (0.0043) [2024-06-24 15:29:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11278434304. Throughput: 0: 42767.5. Samples: 11278548580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:13,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 15:29:17,483][15401] Updated weights for policy 0, policy_version 688391 (0.0041) [2024-06-24 15:29:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11278630912. Throughput: 0: 42581.0. Samples: 11278803940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 15:29:21,203][15401] Updated weights for policy 0, policy_version 688401 (0.0033) [2024-06-24 15:29:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 11278860288. Throughput: 0: 42418.9. Samples: 11278928080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 15:29:25,124][15401] Updated weights for policy 0, policy_version 688411 (0.0033) [2024-06-24 15:29:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42874.3, 300 sec: 42542.5). Total num frames: 11279056896. Throughput: 0: 42771.4. Samples: 11279187840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:28,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 15:29:28,863][15401] Updated weights for policy 0, policy_version 688421 (0.0035) [2024-06-24 15:29:32,647][15401] Updated weights for policy 0, policy_version 688431 (0.0035) [2024-06-24 15:29:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11279286272. Throughput: 0: 42587.0. Samples: 11279443800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 15:29:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 15:29:36,503][15401] Updated weights for policy 0, policy_version 688441 (0.0029) [2024-06-24 15:29:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11279499264. Throughput: 0: 42694.2. Samples: 11279572780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:29:38,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 15:29:40,203][15401] Updated weights for policy 0, policy_version 688451 (0.0029) [2024-06-24 15:29:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11279695872. Throughput: 0: 42759.1. Samples: 11279829120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:29:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 15:29:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688459_11279712256.pth... [2024-06-24 15:29:43,570][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000687834_11269472256.pth [2024-06-24 15:29:44,107][15401] Updated weights for policy 0, policy_version 688461 (0.0035) [2024-06-24 15:29:48,151][15401] Updated weights for policy 0, policy_version 688471 (0.0033) [2024-06-24 15:29:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11279908864. Throughput: 0: 42741.8. Samples: 11280086380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:29:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 15:29:51,658][15401] Updated weights for policy 0, policy_version 688481 (0.0035) [2024-06-24 15:29:53,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11280154624. Throughput: 0: 42702.2. Samples: 11280215860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:29:53,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 15:29:55,596][15401] Updated weights for policy 0, policy_version 688491 (0.0027) [2024-06-24 15:29:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11280351232. Throughput: 0: 42847.0. Samples: 11280476700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:29:58,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 15:29:59,372][15401] Updated weights for policy 0, policy_version 688501 (0.0032) [2024-06-24 15:30:01,285][15349] Signal inference workers to stop experience collection... (166900 times) [2024-06-24 15:30:01,323][15401] InferenceWorker_p0-w0: stopping experience collection (166900 times) [2024-06-24 15:30:01,399][15349] Signal inference workers to resume experience collection... (166900 times) [2024-06-24 15:30:01,400][15401] InferenceWorker_p0-w0: resuming experience collection (166900 times) [2024-06-24 15:30:02,963][15401] Updated weights for policy 0, policy_version 688511 (0.0032) [2024-06-24 15:30:03,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11280564224. Throughput: 0: 42714.5. Samples: 11280726100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 15:30:06,979][15401] Updated weights for policy 0, policy_version 688521 (0.0032) [2024-06-24 15:30:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11280793600. Throughput: 0: 42947.6. Samples: 11280860720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 15:30:10,798][15401] Updated weights for policy 0, policy_version 688531 (0.0044) [2024-06-24 15:30:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11280973824. Throughput: 0: 42925.0. Samples: 11281119360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 15:30:14,469][15401] Updated weights for policy 0, policy_version 688541 (0.0045) [2024-06-24 15:30:18,274][15401] Updated weights for policy 0, policy_version 688551 (0.0033) [2024-06-24 15:30:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11281219584. Throughput: 0: 42907.3. Samples: 11281374620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:18,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 15:30:22,105][15401] Updated weights for policy 0, policy_version 688561 (0.0029) [2024-06-24 15:30:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11281416192. Throughput: 0: 42922.2. Samples: 11281504280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 15:30:25,790][15401] Updated weights for policy 0, policy_version 688571 (0.0030) [2024-06-24 15:30:28,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 11281629184. Throughput: 0: 43043.9. Samples: 11281766100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 15:30:29,858][15401] Updated weights for policy 0, policy_version 688581 (0.0034) [2024-06-24 15:30:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11281874944. Throughput: 0: 42888.6. Samples: 11282016360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 15:30:33,394][15401] Updated weights for policy 0, policy_version 688591 (0.0038) [2024-06-24 15:30:37,435][15401] Updated weights for policy 0, policy_version 688601 (0.0032) [2024-06-24 15:30:38,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11282071552. Throughput: 0: 42939.2. Samples: 11282148020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 15:30:41,091][15401] Updated weights for policy 0, policy_version 688611 (0.0035) [2024-06-24 15:30:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11282284544. Throughput: 0: 42934.6. Samples: 11282408760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:43,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 15:30:44,924][15401] Updated weights for policy 0, policy_version 688621 (0.0022) [2024-06-24 15:30:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11282481152. Throughput: 0: 43075.6. Samples: 11282664500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 15:30:48,876][15401] Updated weights for policy 0, policy_version 688631 (0.0042) [2024-06-24 15:30:52,527][15401] Updated weights for policy 0, policy_version 688641 (0.0028) [2024-06-24 15:30:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11282726912. Throughput: 0: 42899.5. Samples: 11282791200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 15:30:56,472][15401] Updated weights for policy 0, policy_version 688651 (0.0035) [2024-06-24 15:30:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11282939904. Throughput: 0: 43028.8. Samples: 11283055660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:30:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 15:31:00,176][15401] Updated weights for policy 0, policy_version 688661 (0.0035) [2024-06-24 15:31:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11283136512. Throughput: 0: 43070.1. Samples: 11283312780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:31:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 15:31:03,992][15401] Updated weights for policy 0, policy_version 688671 (0.0034) [2024-06-24 15:31:07,879][15401] Updated weights for policy 0, policy_version 688681 (0.0028) [2024-06-24 15:31:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11283365888. Throughput: 0: 42989.9. Samples: 11283438820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:31:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 15:31:11,985][15401] Updated weights for policy 0, policy_version 688691 (0.0033) [2024-06-24 15:31:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 11283578880. Throughput: 0: 42779.6. Samples: 11283691280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-24 15:31:13,392][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 15:31:15,627][15401] Updated weights for policy 0, policy_version 688701 (0.0042) [2024-06-24 15:31:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11283775488. Throughput: 0: 43030.6. Samples: 11283952740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 15:31:19,666][15401] Updated weights for policy 0, policy_version 688711 (0.0035) [2024-06-24 15:31:23,260][15401] Updated weights for policy 0, policy_version 688721 (0.0030) [2024-06-24 15:31:23,278][15349] Signal inference workers to stop experience collection... (166950 times) [2024-06-24 15:31:23,278][15349] Signal inference workers to resume experience collection... (166950 times) [2024-06-24 15:31:23,291][15401] InferenceWorker_p0-w0: stopping experience collection (166950 times) [2024-06-24 15:31:23,291][15401] InferenceWorker_p0-w0: resuming experience collection (166950 times) [2024-06-24 15:31:23,392][15132] Fps is (10 sec: 42598.5, 60 sec: 43142.9, 300 sec: 42820.6). Total num frames: 11284004864. Throughput: 0: 42879.4. Samples: 11284077700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:23,392][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 15:31:27,717][15401] Updated weights for policy 0, policy_version 688731 (0.0037) [2024-06-24 15:31:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11284201472. Throughput: 0: 42868.4. Samples: 11284337840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:28,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 15:31:30,832][15401] Updated weights for policy 0, policy_version 688741 (0.0033) [2024-06-24 15:31:33,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11284414464. Throughput: 0: 42837.7. Samples: 11284592200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 15:31:35,233][15401] Updated weights for policy 0, policy_version 688751 (0.0033) [2024-06-24 15:31:38,315][15401] Updated weights for policy 0, policy_version 688761 (0.0034) [2024-06-24 15:31:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11284660224. Throughput: 0: 42974.7. Samples: 11284725060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:38,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 15:31:42,899][15401] Updated weights for policy 0, policy_version 688771 (0.0025) [2024-06-24 15:31:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11284856832. Throughput: 0: 42999.9. Samples: 11284990660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:43,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 15:31:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688773_11284856832.pth... [2024-06-24 15:31:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688146_11274584064.pth [2024-06-24 15:31:45,824][15401] Updated weights for policy 0, policy_version 688781 (0.0026) [2024-06-24 15:31:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11285053440. Throughput: 0: 42773.9. Samples: 11285237600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 15:31:50,581][15401] Updated weights for policy 0, policy_version 688791 (0.0029) [2024-06-24 15:31:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11285299200. Throughput: 0: 42831.9. Samples: 11285366260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 15:31:53,463][15401] Updated weights for policy 0, policy_version 688801 (0.0035) [2024-06-24 15:31:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 11285479424. Throughput: 0: 43067.6. Samples: 11285629320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:31:58,392][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 15:31:58,398][15401] Updated weights for policy 0, policy_version 688811 (0.0050) [2024-06-24 15:32:01,566][15401] Updated weights for policy 0, policy_version 688821 (0.0048) [2024-06-24 15:32:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11285708800. Throughput: 0: 42675.0. Samples: 11285873120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 15:32:05,923][15401] Updated weights for policy 0, policy_version 688831 (0.0045) [2024-06-24 15:32:08,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11285938176. Throughput: 0: 42757.8. Samples: 11286001700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 15:32:09,233][15401] Updated weights for policy 0, policy_version 688841 (0.0043) [2024-06-24 15:32:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42709.4). Total num frames: 11286118400. Throughput: 0: 42729.3. Samples: 11286260660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:13,390][15132] Avg episode reward: [(0, '0.204')] [2024-06-24 15:32:13,552][15401] Updated weights for policy 0, policy_version 688851 (0.0044) [2024-06-24 15:32:16,922][15401] Updated weights for policy 0, policy_version 688861 (0.0046) [2024-06-24 15:32:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11286364160. Throughput: 0: 42519.1. Samples: 11286505560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 15:32:21,187][15401] Updated weights for policy 0, policy_version 688871 (0.0055) [2024-06-24 15:32:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 11286577152. Throughput: 0: 42659.5. Samples: 11286644740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 15:32:24,569][15401] Updated weights for policy 0, policy_version 688881 (0.0038) [2024-06-24 15:32:28,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11286757376. Throughput: 0: 42323.7. Samples: 11286895220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 15:32:29,184][15401] Updated weights for policy 0, policy_version 688891 (0.0035) [2024-06-24 15:32:32,006][15401] Updated weights for policy 0, policy_version 688901 (0.0036) [2024-06-24 15:32:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 11287019520. Throughput: 0: 42354.2. Samples: 11287143540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 15:32:36,959][15401] Updated weights for policy 0, policy_version 688911 (0.0023) [2024-06-24 15:32:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 11287199744. Throughput: 0: 42693.3. Samples: 11287287560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:38,393][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 15:32:39,839][15401] Updated weights for policy 0, policy_version 688921 (0.0041) [2024-06-24 15:32:40,074][15349] Signal inference workers to stop experience collection... (167000 times) [2024-06-24 15:32:40,099][15401] InferenceWorker_p0-w0: stopping experience collection (167000 times) [2024-06-24 15:32:40,138][15349] Signal inference workers to resume experience collection... (167000 times) [2024-06-24 15:32:40,138][15401] InferenceWorker_p0-w0: resuming experience collection (167000 times) [2024-06-24 15:32:43,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.5, 300 sec: 42654.9). Total num frames: 11287396352. Throughput: 0: 42340.0. Samples: 11287534520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 15:32:44,750][15401] Updated weights for policy 0, policy_version 688931 (0.0032) [2024-06-24 15:32:47,397][15401] Updated weights for policy 0, policy_version 688941 (0.0035) [2024-06-24 15:32:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11287642112. Throughput: 0: 42448.1. Samples: 11287783280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 15:32:52,219][15401] Updated weights for policy 0, policy_version 688951 (0.0031) [2024-06-24 15:32:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11287822336. Throughput: 0: 42650.2. Samples: 11287920960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 15:32:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 15:32:55,075][15401] Updated weights for policy 0, policy_version 688961 (0.0029) [2024-06-24 15:32:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42871.4, 300 sec: 42764.7). Total num frames: 11288051712. Throughput: 0: 42582.3. Samples: 11288176960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:32:58,392][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 15:32:59,766][15401] Updated weights for policy 0, policy_version 688971 (0.0044) [2024-06-24 15:33:02,651][15401] Updated weights for policy 0, policy_version 688981 (0.0038) [2024-06-24 15:33:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11288264704. Throughput: 0: 42807.7. Samples: 11288431900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:03,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 15:33:07,224][15401] Updated weights for policy 0, policy_version 688991 (0.0025) [2024-06-24 15:33:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11288461312. Throughput: 0: 42649.0. Samples: 11288563940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 15:33:10,270][15401] Updated weights for policy 0, policy_version 689001 (0.0046) [2024-06-24 15:33:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11288690688. Throughput: 0: 42537.9. Samples: 11288809420. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 15:33:14,879][15401] Updated weights for policy 0, policy_version 689011 (0.0030) [2024-06-24 15:33:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11288903680. Throughput: 0: 42738.9. Samples: 11289066800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 15:33:18,559][15401] Updated weights for policy 0, policy_version 689021 (0.0034) [2024-06-24 15:33:22,453][15401] Updated weights for policy 0, policy_version 689031 (0.0044) [2024-06-24 15:33:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42766.0). Total num frames: 11289100288. Throughput: 0: 42497.9. Samples: 11289199860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 15:33:26,161][15401] Updated weights for policy 0, policy_version 689041 (0.0035) [2024-06-24 15:33:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11289329664. Throughput: 0: 42626.7. Samples: 11289452720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:28,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-24 15:33:30,154][15401] Updated weights for policy 0, policy_version 689051 (0.0034) [2024-06-24 15:33:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11289542656. Throughput: 0: 42643.6. Samples: 11289702240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 15:33:33,831][15401] Updated weights for policy 0, policy_version 689061 (0.0034) [2024-06-24 15:33:37,714][15401] Updated weights for policy 0, policy_version 689071 (0.0033) [2024-06-24 15:33:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11289755648. Throughput: 0: 42481.2. Samples: 11289832620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 15:33:41,468][15401] Updated weights for policy 0, policy_version 689081 (0.0034) [2024-06-24 15:33:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11289968640. Throughput: 0: 42488.9. Samples: 11290088860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 15:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689085_11289968640.pth... [2024-06-24 15:33:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688459_11279712256.pth [2024-06-24 15:33:45,387][15401] Updated weights for policy 0, policy_version 689091 (0.0028) [2024-06-24 15:33:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11290198016. Throughput: 0: 42384.8. Samples: 11290339220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:48,394][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 15:33:49,610][15401] Updated weights for policy 0, policy_version 689101 (0.0036) [2024-06-24 15:33:52,942][15401] Updated weights for policy 0, policy_version 689111 (0.0038) [2024-06-24 15:33:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11290394624. Throughput: 0: 42259.4. Samples: 11290465620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 15:33:57,194][15401] Updated weights for policy 0, policy_version 689121 (0.0030) [2024-06-24 15:33:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 11290607616. Throughput: 0: 42658.1. Samples: 11290729140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:33:58,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 15:34:00,609][15401] Updated weights for policy 0, policy_version 689131 (0.0033) [2024-06-24 15:34:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11290820608. Throughput: 0: 42624.0. Samples: 11290984880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 15:34:04,805][15401] Updated weights for policy 0, policy_version 689141 (0.0050) [2024-06-24 15:34:07,060][15349] Signal inference workers to stop experience collection... (167050 times) [2024-06-24 15:34:07,060][15349] Signal inference workers to resume experience collection... (167050 times) [2024-06-24 15:34:07,075][15401] InferenceWorker_p0-w0: stopping experience collection (167050 times) [2024-06-24 15:34:07,075][15401] InferenceWorker_p0-w0: resuming experience collection (167050 times) [2024-06-24 15:34:08,179][15401] Updated weights for policy 0, policy_version 689151 (0.0030) [2024-06-24 15:34:08,390][15132] Fps is (10 sec: 44246.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11291049984. Throughput: 0: 42522.5. Samples: 11291113380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 15:34:12,385][15401] Updated weights for policy 0, policy_version 689161 (0.0039) [2024-06-24 15:34:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11291262976. Throughput: 0: 42671.9. Samples: 11291372960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 15:34:15,664][15401] Updated weights for policy 0, policy_version 689171 (0.0039) [2024-06-24 15:34:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11291459584. Throughput: 0: 42898.2. Samples: 11291632660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 15:34:19,959][15401] Updated weights for policy 0, policy_version 689181 (0.0053) [2024-06-24 15:34:23,301][15401] Updated weights for policy 0, policy_version 689191 (0.0038) [2024-06-24 15:34:23,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43412.9, 300 sec: 42875.5). Total num frames: 11291705344. Throughput: 0: 42841.5. Samples: 11291760760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:23,396][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 15:34:28,151][15401] Updated weights for policy 0, policy_version 689201 (0.0042) [2024-06-24 15:34:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11291869184. Throughput: 0: 42774.6. Samples: 11292013720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-24 15:34:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 15:34:31,123][15401] Updated weights for policy 0, policy_version 689211 (0.0059) [2024-06-24 15:34:33,390][15132] Fps is (10 sec: 39346.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11292098560. Throughput: 0: 42774.7. Samples: 11292264080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:33,394][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 15:34:35,882][15401] Updated weights for policy 0, policy_version 689221 (0.0036) [2024-06-24 15:34:38,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 11292311552. Throughput: 0: 42859.3. Samples: 11292394560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:38,397][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 15:34:39,314][15401] Updated weights for policy 0, policy_version 689231 (0.0038) [2024-06-24 15:34:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11292508160. Throughput: 0: 42553.3. Samples: 11292643940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:43,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 15:34:43,556][15401] Updated weights for policy 0, policy_version 689241 (0.0049) [2024-06-24 15:34:46,883][15401] Updated weights for policy 0, policy_version 689251 (0.0023) [2024-06-24 15:34:48,389][15132] Fps is (10 sec: 44265.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11292753920. Throughput: 0: 42530.0. Samples: 11292898720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 15:34:50,985][15401] Updated weights for policy 0, policy_version 689261 (0.0042) [2024-06-24 15:34:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11292934144. Throughput: 0: 42616.6. Samples: 11293031120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 15:34:54,344][15401] Updated weights for policy 0, policy_version 689271 (0.0043) [2024-06-24 15:34:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11293163520. Throughput: 0: 42522.7. Samples: 11293286480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:34:58,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 15:34:58,479][15401] Updated weights for policy 0, policy_version 689281 (0.0041) [2024-06-24 15:35:01,840][15401] Updated weights for policy 0, policy_version 689291 (0.0041) [2024-06-24 15:35:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11293376512. Throughput: 0: 42371.8. Samples: 11293539400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 15:35:06,355][15401] Updated weights for policy 0, policy_version 689301 (0.0030) [2024-06-24 15:35:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11293589504. Throughput: 0: 42394.5. Samples: 11293668240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 15:35:10,081][15401] Updated weights for policy 0, policy_version 689311 (0.0038) [2024-06-24 15:35:13,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11293818880. Throughput: 0: 42487.1. Samples: 11293925640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 15:35:13,841][15401] Updated weights for policy 0, policy_version 689321 (0.0036) [2024-06-24 15:35:17,875][15401] Updated weights for policy 0, policy_version 689331 (0.0024) [2024-06-24 15:35:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11294015488. Throughput: 0: 42581.8. Samples: 11294180260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:18,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 15:35:22,097][15401] Updated weights for policy 0, policy_version 689341 (0.0027) [2024-06-24 15:35:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42056.7, 300 sec: 42709.5). Total num frames: 11294228480. Throughput: 0: 42449.5. Samples: 11294304520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 15:35:25,280][15401] Updated weights for policy 0, policy_version 689351 (0.0033) [2024-06-24 15:35:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11294457856. Throughput: 0: 42806.8. Samples: 11294570240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 15:35:29,652][15401] Updated weights for policy 0, policy_version 689361 (0.0033) [2024-06-24 15:35:29,657][15349] Signal inference workers to stop experience collection... (167100 times) [2024-06-24 15:35:29,657][15349] Signal inference workers to resume experience collection... (167100 times) [2024-06-24 15:35:29,676][15401] InferenceWorker_p0-w0: stopping experience collection (167100 times) [2024-06-24 15:35:29,676][15401] InferenceWorker_p0-w0: resuming experience collection (167100 times) [2024-06-24 15:35:32,775][15401] Updated weights for policy 0, policy_version 689371 (0.0027) [2024-06-24 15:35:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11294670848. Throughput: 0: 42867.4. Samples: 11294827760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 15:35:37,135][15401] Updated weights for policy 0, policy_version 689381 (0.0036) [2024-06-24 15:35:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 11294867456. Throughput: 0: 42716.0. Samples: 11294953340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 15:35:40,788][15401] Updated weights for policy 0, policy_version 689391 (0.0035) [2024-06-24 15:35:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11295080448. Throughput: 0: 42753.6. Samples: 11295210400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:43,404][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 15:35:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689397_11295080448.pth... [2024-06-24 15:35:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000688773_11284856832.pth [2024-06-24 15:35:44,606][15401] Updated weights for policy 0, policy_version 689401 (0.0040) [2024-06-24 15:35:48,276][15401] Updated weights for policy 0, policy_version 689411 (0.0026) [2024-06-24 15:35:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 11295309824. Throughput: 0: 42779.3. Samples: 11295464560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:48,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 15:35:52,327][15401] Updated weights for policy 0, policy_version 689421 (0.0035) [2024-06-24 15:35:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11295506432. Throughput: 0: 42793.3. Samples: 11295593940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:53,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-24 15:35:55,894][15401] Updated weights for policy 0, policy_version 689431 (0.0038) [2024-06-24 15:35:58,390][15132] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11295719424. Throughput: 0: 42819.4. Samples: 11295852520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:35:58,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 15:35:59,973][15401] Updated weights for policy 0, policy_version 689441 (0.0049) [2024-06-24 15:36:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11295948800. Throughput: 0: 42901.6. Samples: 11296110840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:36:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 15:36:03,542][15401] Updated weights for policy 0, policy_version 689451 (0.0036) [2024-06-24 15:36:07,461][15401] Updated weights for policy 0, policy_version 689461 (0.0030) [2024-06-24 15:36:08,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 11296129024. Throughput: 0: 43000.2. Samples: 11296239520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:36:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 15:36:11,146][15401] Updated weights for policy 0, policy_version 689471 (0.0036) [2024-06-24 15:36:13,392][15132] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 11296374784. Throughput: 0: 42715.5. Samples: 11296492540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:13,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 15:36:15,302][15401] Updated weights for policy 0, policy_version 689481 (0.0051) [2024-06-24 15:36:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 11296587776. Throughput: 0: 42717.0. Samples: 11296750020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 15:36:18,890][15401] Updated weights for policy 0, policy_version 689491 (0.0038) [2024-06-24 15:36:22,925][15401] Updated weights for policy 0, policy_version 689501 (0.0039) [2024-06-24 15:36:23,392][15132] Fps is (10 sec: 40959.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 11296784384. Throughput: 0: 42720.8. Samples: 11296875880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:23,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 15:36:26,494][15401] Updated weights for policy 0, policy_version 689511 (0.0030) [2024-06-24 15:36:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11297013760. Throughput: 0: 42763.3. Samples: 11297134740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 15:36:30,404][15401] Updated weights for policy 0, policy_version 689521 (0.0039) [2024-06-24 15:36:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11297226752. Throughput: 0: 42889.8. Samples: 11297394500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 15:36:34,046][15401] Updated weights for policy 0, policy_version 689531 (0.0029) [2024-06-24 15:36:37,962][15401] Updated weights for policy 0, policy_version 689541 (0.0035) [2024-06-24 15:36:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11297439744. Throughput: 0: 42720.9. Samples: 11297516380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 15:36:41,861][15401] Updated weights for policy 0, policy_version 689551 (0.0041) [2024-06-24 15:36:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11297669120. Throughput: 0: 42726.0. Samples: 11297775180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 15:36:45,893][15401] Updated weights for policy 0, policy_version 689561 (0.0038) [2024-06-24 15:36:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 11297849344. Throughput: 0: 42715.3. Samples: 11298033020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 15:36:49,647][15401] Updated weights for policy 0, policy_version 689571 (0.0043) [2024-06-24 15:36:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11298078720. Throughput: 0: 42578.2. Samples: 11298155540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 15:36:53,484][15401] Updated weights for policy 0, policy_version 689581 (0.0036) [2024-06-24 15:36:57,229][15401] Updated weights for policy 0, policy_version 689591 (0.0033) [2024-06-24 15:36:58,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 11298308096. Throughput: 0: 42759.9. Samples: 11298416740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:36:58,392][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 15:37:01,021][15401] Updated weights for policy 0, policy_version 689601 (0.0041) [2024-06-24 15:37:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11298504704. Throughput: 0: 42723.3. Samples: 11298672580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 15:37:05,186][15401] Updated weights for policy 0, policy_version 689611 (0.0032) [2024-06-24 15:37:05,635][15349] Signal inference workers to stop experience collection... (167150 times) [2024-06-24 15:37:05,635][15349] Signal inference workers to resume experience collection... (167150 times) [2024-06-24 15:37:05,652][15401] InferenceWorker_p0-w0: stopping experience collection (167150 times) [2024-06-24 15:37:05,652][15401] InferenceWorker_p0-w0: resuming experience collection (167150 times) [2024-06-24 15:37:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11298734080. Throughput: 0: 42777.5. Samples: 11298800760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:08,399][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 15:37:09,038][15401] Updated weights for policy 0, policy_version 689621 (0.0031) [2024-06-24 15:37:12,759][15401] Updated weights for policy 0, policy_version 689631 (0.0032) [2024-06-24 15:37:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 11298947072. Throughput: 0: 42808.8. Samples: 11299061140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:13,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 15:37:16,614][15401] Updated weights for policy 0, policy_version 689641 (0.0046) [2024-06-24 15:37:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11299143680. Throughput: 0: 42717.3. Samples: 11299316780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 15:37:20,368][15401] Updated weights for policy 0, policy_version 689651 (0.0031) [2024-06-24 15:37:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 11299373056. Throughput: 0: 42789.2. Samples: 11299441900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 15:37:24,037][15401] Updated weights for policy 0, policy_version 689661 (0.0035) [2024-06-24 15:37:28,065][15401] Updated weights for policy 0, policy_version 689671 (0.0034) [2024-06-24 15:37:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11299569664. Throughput: 0: 42775.6. Samples: 11299700080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:28,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-24 15:37:31,607][15401] Updated weights for policy 0, policy_version 689681 (0.0036) [2024-06-24 15:37:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11299782656. Throughput: 0: 42704.8. Samples: 11299954740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 15:37:36,086][15401] Updated weights for policy 0, policy_version 689691 (0.0026) [2024-06-24 15:37:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11300012032. Throughput: 0: 42869.8. Samples: 11300084680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 15:37:39,143][15401] Updated weights for policy 0, policy_version 689701 (0.0035) [2024-06-24 15:37:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11300208640. Throughput: 0: 42830.2. Samples: 11300344000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 15:37:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689710_11300208640.pth... [2024-06-24 15:37:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689085_11289968640.pth [2024-06-24 15:37:43,670][15401] Updated weights for policy 0, policy_version 689711 (0.0036) [2024-06-24 15:37:47,250][15401] Updated weights for policy 0, policy_version 689721 (0.0028) [2024-06-24 15:37:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11300421632. Throughput: 0: 42777.9. Samples: 11300597580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 15:37:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 15:37:51,293][15401] Updated weights for policy 0, policy_version 689731 (0.0038) [2024-06-24 15:37:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11300651008. Throughput: 0: 42781.4. Samples: 11300725920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:37:53,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-24 15:37:54,719][15401] Updated weights for policy 0, policy_version 689741 (0.0037) [2024-06-24 15:37:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 11300847616. Throughput: 0: 42858.3. Samples: 11300989760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:37:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 15:37:58,911][15401] Updated weights for policy 0, policy_version 689751 (0.0036) [2024-06-24 15:38:02,233][15401] Updated weights for policy 0, policy_version 689761 (0.0029) [2024-06-24 15:38:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11301076992. Throughput: 0: 42766.1. Samples: 11301241260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 15:38:06,257][15401] Updated weights for policy 0, policy_version 689771 (0.0039) [2024-06-24 15:38:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11301306368. Throughput: 0: 42895.3. Samples: 11301372180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 15:38:09,739][15401] Updated weights for policy 0, policy_version 689781 (0.0034) [2024-06-24 15:38:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11301470208. Throughput: 0: 42913.7. Samples: 11301631200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:13,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 15:38:13,926][15401] Updated weights for policy 0, policy_version 689791 (0.0034) [2024-06-24 15:38:17,282][15401] Updated weights for policy 0, policy_version 689801 (0.0033) [2024-06-24 15:38:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11301715968. Throughput: 0: 42909.2. Samples: 11301885660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 15:38:21,678][15401] Updated weights for policy 0, policy_version 689811 (0.0043) [2024-06-24 15:38:23,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 11301945344. Throughput: 0: 43010.7. Samples: 11302020160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 15:38:24,834][15401] Updated weights for policy 0, policy_version 689821 (0.0024) [2024-06-24 15:38:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11302125568. Throughput: 0: 43006.8. Samples: 11302279300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 15:38:29,067][15401] Updated weights for policy 0, policy_version 689831 (0.0033) [2024-06-24 15:38:32,956][15401] Updated weights for policy 0, policy_version 689841 (0.0039) [2024-06-24 15:38:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11302354944. Throughput: 0: 43034.3. Samples: 11302534120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:33,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-24 15:38:35,838][15349] Signal inference workers to stop experience collection... (167200 times) [2024-06-24 15:38:35,877][15401] InferenceWorker_p0-w0: stopping experience collection (167200 times) [2024-06-24 15:38:35,890][15349] Signal inference workers to resume experience collection... (167200 times) [2024-06-24 15:38:35,892][15401] InferenceWorker_p0-w0: resuming experience collection (167200 times) [2024-06-24 15:38:36,715][15401] Updated weights for policy 0, policy_version 689851 (0.0037) [2024-06-24 15:38:38,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11302600704. Throughput: 0: 43175.0. Samples: 11302668800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:38,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 15:38:40,742][15401] Updated weights for policy 0, policy_version 689861 (0.0033) [2024-06-24 15:38:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11302780928. Throughput: 0: 42998.2. Samples: 11302924680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 15:38:44,222][15401] Updated weights for policy 0, policy_version 689871 (0.0029) [2024-06-24 15:38:48,256][15401] Updated weights for policy 0, policy_version 689881 (0.0039) [2024-06-24 15:38:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 11303010304. Throughput: 0: 43213.5. Samples: 11303185860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 15:38:51,983][15401] Updated weights for policy 0, policy_version 689891 (0.0036) [2024-06-24 15:38:53,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43415.8, 300 sec: 42876.1). Total num frames: 11303256064. Throughput: 0: 43137.2. Samples: 11303313460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:53,392][15132] Avg episode reward: [(0, '0.234')] [2024-06-24 15:38:55,762][15401] Updated weights for policy 0, policy_version 689901 (0.0030) [2024-06-24 15:38:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11303419904. Throughput: 0: 42945.9. Samples: 11303563760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:38:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 15:38:59,603][15401] Updated weights for policy 0, policy_version 689911 (0.0039) [2024-06-24 15:39:03,349][15401] Updated weights for policy 0, policy_version 689921 (0.0031) [2024-06-24 15:39:03,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 11303665664. Throughput: 0: 43126.9. Samples: 11303826360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 15:39:07,226][15401] Updated weights for policy 0, policy_version 689931 (0.0035) [2024-06-24 15:39:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11303895040. Throughput: 0: 43093.3. Samples: 11303959360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 15:39:10,982][15401] Updated weights for policy 0, policy_version 689941 (0.0045) [2024-06-24 15:39:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11304058880. Throughput: 0: 42860.4. Samples: 11304208020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 15:39:14,814][15401] Updated weights for policy 0, policy_version 689951 (0.0032) [2024-06-24 15:39:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42710.4). Total num frames: 11304304640. Throughput: 0: 42937.1. Samples: 11304466300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:18,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 15:39:18,598][15401] Updated weights for policy 0, policy_version 689961 (0.0026) [2024-06-24 15:39:22,447][15401] Updated weights for policy 0, policy_version 689971 (0.0039) [2024-06-24 15:39:23,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 11304534016. Throughput: 0: 42797.3. Samples: 11304594780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:23,392][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 15:39:26,064][15401] Updated weights for policy 0, policy_version 689981 (0.0035) [2024-06-24 15:39:28,396][15132] Fps is (10 sec: 40934.2, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 11304714240. Throughput: 0: 42756.1. Samples: 11304848980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-24 15:39:28,396][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 15:39:30,107][15401] Updated weights for policy 0, policy_version 689991 (0.0037) [2024-06-24 15:39:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 11304943616. Throughput: 0: 42670.2. Samples: 11305106020. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 15:39:33,642][15401] Updated weights for policy 0, policy_version 690001 (0.0026) [2024-06-24 15:39:37,760][15401] Updated weights for policy 0, policy_version 690011 (0.0035) [2024-06-24 15:39:38,389][15132] Fps is (10 sec: 45904.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11305172992. Throughput: 0: 42784.1. Samples: 11305238640. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 15:39:41,170][15401] Updated weights for policy 0, policy_version 690021 (0.0045) [2024-06-24 15:39:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11305353216. Throughput: 0: 42865.7. Samples: 11305492720. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:43,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 15:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690024_11305353216.pth... [2024-06-24 15:39:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689397_11295080448.pth [2024-06-24 15:39:45,681][15401] Updated weights for policy 0, policy_version 690031 (0.0033) [2024-06-24 15:39:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 11305582592. Throughput: 0: 42658.5. Samples: 11305746000. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 15:39:49,025][15401] Updated weights for policy 0, policy_version 690041 (0.0037) [2024-06-24 15:39:52,454][15349] Signal inference workers to stop experience collection... (167250 times) [2024-06-24 15:39:52,455][15349] Signal inference workers to resume experience collection... (167250 times) [2024-06-24 15:39:52,496][15401] InferenceWorker_p0-w0: stopping experience collection (167250 times) [2024-06-24 15:39:52,496][15401] InferenceWorker_p0-w0: resuming experience collection (167250 times) [2024-06-24 15:39:53,146][15401] Updated weights for policy 0, policy_version 690051 (0.0033) [2024-06-24 15:39:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 11305795584. Throughput: 0: 42684.0. Samples: 11305880140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 15:39:56,884][15401] Updated weights for policy 0, policy_version 690061 (0.0038) [2024-06-24 15:39:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11305992192. Throughput: 0: 42818.2. Samples: 11306134840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:39:58,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 15:40:00,570][15401] Updated weights for policy 0, policy_version 690071 (0.0030) [2024-06-24 15:40:03,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.8, 300 sec: 42875.2). Total num frames: 11306237952. Throughput: 0: 42675.8. Samples: 11306386980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:03,397][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 15:40:04,621][15401] Updated weights for policy 0, policy_version 690081 (0.0032) [2024-06-24 15:40:08,020][15401] Updated weights for policy 0, policy_version 690091 (0.0043) [2024-06-24 15:40:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11306450944. Throughput: 0: 42812.4. Samples: 11306521240. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 15:40:12,333][15401] Updated weights for policy 0, policy_version 690101 (0.0029) [2024-06-24 15:40:13,390][15132] Fps is (10 sec: 40986.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11306647552. Throughput: 0: 42833.6. Samples: 11306776220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 15:40:15,918][15401] Updated weights for policy 0, policy_version 690111 (0.0037) [2024-06-24 15:40:18,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11306860544. Throughput: 0: 42810.7. Samples: 11307032500. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 15:40:20,396][15401] Updated weights for policy 0, policy_version 690121 (0.0034) [2024-06-24 15:40:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 11307089920. Throughput: 0: 42836.4. Samples: 11307166280. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:23,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-24 15:40:23,406][15401] Updated weights for policy 0, policy_version 690131 (0.0020) [2024-06-24 15:40:27,899][15401] Updated weights for policy 0, policy_version 690141 (0.0029) [2024-06-24 15:40:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 11307302912. Throughput: 0: 42771.9. Samples: 11307417460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:28,394][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 15:40:31,104][15401] Updated weights for policy 0, policy_version 690151 (0.0032) [2024-06-24 15:40:33,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 11307515904. Throughput: 0: 42822.7. Samples: 11307673120. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:33,392][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 15:40:35,533][15401] Updated weights for policy 0, policy_version 690161 (0.0039) [2024-06-24 15:40:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 11307745280. Throughput: 0: 42891.9. Samples: 11307810280. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 15:40:38,831][15401] Updated weights for policy 0, policy_version 690171 (0.0028) [2024-06-24 15:40:43,035][15401] Updated weights for policy 0, policy_version 690181 (0.0028) [2024-06-24 15:40:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 11307941888. Throughput: 0: 42914.4. Samples: 11308065980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 15:40:46,587][15401] Updated weights for policy 0, policy_version 690191 (0.0029) [2024-06-24 15:40:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11308154880. Throughput: 0: 42850.5. Samples: 11308314980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 15:40:49,590][15349] Signal inference workers to stop experience collection... (167300 times) [2024-06-24 15:40:49,590][15349] Signal inference workers to resume experience collection... (167300 times) [2024-06-24 15:40:49,608][15401] InferenceWorker_p0-w0: stopping experience collection (167300 times) [2024-06-24 15:40:49,612][15401] InferenceWorker_p0-w0: resuming experience collection (167300 times) [2024-06-24 15:40:50,943][15401] Updated weights for policy 0, policy_version 690201 (0.0030) [2024-06-24 15:40:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 11308384256. Throughput: 0: 42768.2. Samples: 11308445800. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:53,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-24 15:40:53,989][15401] Updated weights for policy 0, policy_version 690211 (0.0029) [2024-06-24 15:40:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11308564480. Throughput: 0: 42831.5. Samples: 11308703640. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:40:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 15:40:58,695][15401] Updated weights for policy 0, policy_version 690221 (0.0035) [2024-06-24 15:41:02,115][15401] Updated weights for policy 0, policy_version 690231 (0.0038) [2024-06-24 15:41:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42876.0, 300 sec: 42987.1). Total num frames: 11308810240. Throughput: 0: 42648.7. Samples: 11308951700. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:41:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 15:41:06,253][15401] Updated weights for policy 0, policy_version 690241 (0.0038) [2024-06-24 15:41:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 11309023232. Throughput: 0: 42840.5. Samples: 11309094100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 26.0) [2024-06-24 15:41:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 15:41:09,573][15401] Updated weights for policy 0, policy_version 690251 (0.0025) [2024-06-24 15:41:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11309203456. Throughput: 0: 42783.7. Samples: 11309342720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:41:14,231][15401] Updated weights for policy 0, policy_version 690261 (0.0033) [2024-06-24 15:41:17,451][15401] Updated weights for policy 0, policy_version 690271 (0.0029) [2024-06-24 15:41:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 11309449216. Throughput: 0: 42674.7. Samples: 11309593380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:18,400][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 15:41:21,826][15401] Updated weights for policy 0, policy_version 690281 (0.0030) [2024-06-24 15:41:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11309662208. Throughput: 0: 42622.2. Samples: 11309728280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 15:41:24,924][15401] Updated weights for policy 0, policy_version 690291 (0.0032) [2024-06-24 15:41:28,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 11309842432. Throughput: 0: 42491.9. Samples: 11309978220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:28,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 15:41:29,501][15401] Updated weights for policy 0, policy_version 690301 (0.0042) [2024-06-24 15:41:32,867][15401] Updated weights for policy 0, policy_version 690311 (0.0040) [2024-06-24 15:41:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 11310088192. Throughput: 0: 42628.5. Samples: 11310233260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:33,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 15:41:37,057][15401] Updated weights for policy 0, policy_version 690321 (0.0038) [2024-06-24 15:41:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11310284800. Throughput: 0: 42696.4. Samples: 11310367140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 15:41:40,431][15401] Updated weights for policy 0, policy_version 690331 (0.0034) [2024-06-24 15:41:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11310497792. Throughput: 0: 42572.1. Samples: 11310619380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 15:41:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690338_11310497792.pth... [2024-06-24 15:41:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000689710_11300208640.pth [2024-06-24 15:41:44,641][15401] Updated weights for policy 0, policy_version 690341 (0.0032) [2024-06-24 15:41:47,995][15401] Updated weights for policy 0, policy_version 690351 (0.0033) [2024-06-24 15:41:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11310727168. Throughput: 0: 42766.7. Samples: 11310876200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 15:41:52,590][15401] Updated weights for policy 0, policy_version 690361 (0.0038) [2024-06-24 15:41:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 11310907392. Throughput: 0: 42496.0. Samples: 11311006420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 15:41:55,554][15401] Updated weights for policy 0, policy_version 690371 (0.0025) [2024-06-24 15:41:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 11311120384. Throughput: 0: 42575.5. Samples: 11311258720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:41:58,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 15:41:59,955][15401] Updated weights for policy 0, policy_version 690381 (0.0034) [2024-06-24 15:42:03,156][15401] Updated weights for policy 0, policy_version 690391 (0.0036) [2024-06-24 15:42:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11311366144. Throughput: 0: 42709.4. Samples: 11311515300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 15:42:07,667][15401] Updated weights for policy 0, policy_version 690401 (0.0028) [2024-06-24 15:42:08,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11311562752. Throughput: 0: 42669.4. Samples: 11311648400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 15:42:11,254][15401] Updated weights for policy 0, policy_version 690411 (0.0029) [2024-06-24 15:42:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11311775744. Throughput: 0: 42684.2. Samples: 11311898900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 15:42:15,525][15401] Updated weights for policy 0, policy_version 690421 (0.0038) [2024-06-24 15:42:16,840][15349] Signal inference workers to stop experience collection... (167350 times) [2024-06-24 15:42:16,841][15349] Signal inference workers to resume experience collection... (167350 times) [2024-06-24 15:42:16,862][15401] InferenceWorker_p0-w0: stopping experience collection (167350 times) [2024-06-24 15:42:16,862][15401] InferenceWorker_p0-w0: resuming experience collection (167350 times) [2024-06-24 15:42:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11312005120. Throughput: 0: 42790.3. Samples: 11312158820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 15:42:18,689][15401] Updated weights for policy 0, policy_version 690431 (0.0033) [2024-06-24 15:42:22,977][15401] Updated weights for policy 0, policy_version 690441 (0.0024) [2024-06-24 15:42:23,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 11312201728. Throughput: 0: 42764.7. Samples: 11312291560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 15:42:26,335][15401] Updated weights for policy 0, policy_version 690451 (0.0027) [2024-06-24 15:42:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 11312431104. Throughput: 0: 42753.3. Samples: 11312543280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 15:42:30,601][15401] Updated weights for policy 0, policy_version 690461 (0.0040) [2024-06-24 15:42:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11312644096. Throughput: 0: 42836.1. Samples: 11312803820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 15:42:33,916][15401] Updated weights for policy 0, policy_version 690471 (0.0043) [2024-06-24 15:42:38,129][15401] Updated weights for policy 0, policy_version 690481 (0.0039) [2024-06-24 15:42:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 11312840704. Throughput: 0: 42790.5. Samples: 11312932100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:38,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 15:42:41,358][15401] Updated weights for policy 0, policy_version 690491 (0.0033) [2024-06-24 15:42:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11313053696. Throughput: 0: 42819.6. Samples: 11313185500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 15:42:45,803][15401] Updated weights for policy 0, policy_version 690501 (0.0032) [2024-06-24 15:42:48,389][15132] Fps is (10 sec: 44248.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11313283072. Throughput: 0: 42836.5. Samples: 11313442940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 15:42:48,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 15:42:49,274][15401] Updated weights for policy 0, policy_version 690511 (0.0040) [2024-06-24 15:42:53,228][15401] Updated weights for policy 0, policy_version 690521 (0.0033) [2024-06-24 15:42:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11313496064. Throughput: 0: 42870.3. Samples: 11313577560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:42:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 15:42:56,831][15401] Updated weights for policy 0, policy_version 690531 (0.0028) [2024-06-24 15:42:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 11313709056. Throughput: 0: 42974.6. Samples: 11313832760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:42:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 15:43:00,710][15401] Updated weights for policy 0, policy_version 690541 (0.0033) [2024-06-24 15:43:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11313922048. Throughput: 0: 42907.0. Samples: 11314089640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:03,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 15:43:04,370][15401] Updated weights for policy 0, policy_version 690551 (0.0034) [2024-06-24 15:43:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 11314135040. Throughput: 0: 42883.6. Samples: 11314221420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:08,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 15:43:08,609][15401] Updated weights for policy 0, policy_version 690561 (0.0029) [2024-06-24 15:43:11,855][15401] Updated weights for policy 0, policy_version 690571 (0.0042) [2024-06-24 15:43:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11314364416. Throughput: 0: 42997.7. Samples: 11314478180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 15:43:16,144][15401] Updated weights for policy 0, policy_version 690581 (0.0029) [2024-06-24 15:43:18,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11314577408. Throughput: 0: 43100.4. Samples: 11314743340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:18,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 15:43:19,854][15401] Updated weights for policy 0, policy_version 690591 (0.0041) [2024-06-24 15:43:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11314790400. Throughput: 0: 43091.9. Samples: 11314871140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:23,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-24 15:43:23,543][15401] Updated weights for policy 0, policy_version 690601 (0.0036) [2024-06-24 15:43:27,278][15401] Updated weights for policy 0, policy_version 690611 (0.0022) [2024-06-24 15:43:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11315019776. Throughput: 0: 43255.1. Samples: 11315131980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 15:43:31,264][15401] Updated weights for policy 0, policy_version 690621 (0.0039) [2024-06-24 15:43:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11315216384. Throughput: 0: 43164.3. Samples: 11315385340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 15:43:33,711][15349] Signal inference workers to stop experience collection... (167400 times) [2024-06-24 15:43:33,757][15401] InferenceWorker_p0-w0: stopping experience collection (167400 times) [2024-06-24 15:43:33,824][15349] Signal inference workers to resume experience collection... (167400 times) [2024-06-24 15:43:33,824][15401] InferenceWorker_p0-w0: resuming experience collection (167400 times) [2024-06-24 15:43:34,858][15401] Updated weights for policy 0, policy_version 690631 (0.0031) [2024-06-24 15:43:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 11315445760. Throughput: 0: 42959.4. Samples: 11315510740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 15:43:38,877][15401] Updated weights for policy 0, policy_version 690641 (0.0050) [2024-06-24 15:43:42,595][15401] Updated weights for policy 0, policy_version 690651 (0.0032) [2024-06-24 15:43:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11315658752. Throughput: 0: 43061.2. Samples: 11315770520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 15:43:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690653_11315658752.pth... [2024-06-24 15:43:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690024_11305353216.pth [2024-06-24 15:43:46,490][15401] Updated weights for policy 0, policy_version 690661 (0.0029) [2024-06-24 15:43:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 11315871744. Throughput: 0: 42956.0. Samples: 11316022660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:48,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 15:43:50,288][15401] Updated weights for policy 0, policy_version 690671 (0.0041) [2024-06-24 15:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11316068352. Throughput: 0: 42855.6. Samples: 11316149820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 15:43:54,103][15401] Updated weights for policy 0, policy_version 690681 (0.0035) [2024-06-24 15:43:57,834][15401] Updated weights for policy 0, policy_version 690691 (0.0029) [2024-06-24 15:43:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11316314112. Throughput: 0: 42895.6. Samples: 11316408480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:43:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:44:01,861][15401] Updated weights for policy 0, policy_version 690701 (0.0034) [2024-06-24 15:44:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11316477952. Throughput: 0: 42618.6. Samples: 11316661180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 15:44:05,487][15401] Updated weights for policy 0, policy_version 690711 (0.0032) [2024-06-24 15:44:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 11316707328. Throughput: 0: 42627.7. Samples: 11316789380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 15:44:09,550][15401] Updated weights for policy 0, policy_version 690721 (0.0031) [2024-06-24 15:44:13,105][15401] Updated weights for policy 0, policy_version 690731 (0.0041) [2024-06-24 15:44:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11316936704. Throughput: 0: 42601.4. Samples: 11317049040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 15:44:17,069][15401] Updated weights for policy 0, policy_version 690741 (0.0033) [2024-06-24 15:44:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 11317133312. Throughput: 0: 42666.2. Samples: 11317305320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 15:44:20,908][15401] Updated weights for policy 0, policy_version 690751 (0.0027) [2024-06-24 15:44:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42766.0). Total num frames: 11317329920. Throughput: 0: 42602.4. Samples: 11317427840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 15:44:24,869][15401] Updated weights for policy 0, policy_version 690761 (0.0044) [2024-06-24 15:44:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11317575680. Throughput: 0: 42524.0. Samples: 11317684100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 15:44:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 15:44:28,679][15401] Updated weights for policy 0, policy_version 690771 (0.0027) [2024-06-24 15:44:32,918][15401] Updated weights for policy 0, policy_version 690781 (0.0049) [2024-06-24 15:44:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11317772288. Throughput: 0: 42636.4. Samples: 11317941300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:33,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 15:44:36,397][15401] Updated weights for policy 0, policy_version 690791 (0.0047) [2024-06-24 15:44:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 11317985280. Throughput: 0: 42577.2. Samples: 11318065800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 15:44:40,499][15401] Updated weights for policy 0, policy_version 690801 (0.0028) [2024-06-24 15:44:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11318198272. Throughput: 0: 42433.4. Samples: 11318317980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 15:44:44,405][15401] Updated weights for policy 0, policy_version 690811 (0.0040) [2024-06-24 15:44:47,990][15401] Updated weights for policy 0, policy_version 690821 (0.0038) [2024-06-24 15:44:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11318411264. Throughput: 0: 42494.3. Samples: 11318573420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:48,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 15:44:51,910][15401] Updated weights for policy 0, policy_version 690831 (0.0031) [2024-06-24 15:44:53,393][15132] Fps is (10 sec: 42583.9, 60 sec: 42596.0, 300 sec: 42820.1). Total num frames: 11318624256. Throughput: 0: 42538.6. Samples: 11318703760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:53,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 15:44:54,333][15349] Signal inference workers to stop experience collection... (167450 times) [2024-06-24 15:44:54,379][15401] InferenceWorker_p0-w0: stopping experience collection (167450 times) [2024-06-24 15:44:54,388][15349] Signal inference workers to resume experience collection... (167450 times) [2024-06-24 15:44:54,397][15401] InferenceWorker_p0-w0: resuming experience collection (167450 times) [2024-06-24 15:44:55,435][15401] Updated weights for policy 0, policy_version 690841 (0.0025) [2024-06-24 15:44:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 11318853632. Throughput: 0: 42613.2. Samples: 11318966640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:44:58,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 15:44:59,427][15401] Updated weights for policy 0, policy_version 690851 (0.0033) [2024-06-24 15:45:03,389][15132] Fps is (10 sec: 42613.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11319050240. Throughput: 0: 42606.3. Samples: 11319222600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:45:03,399][15401] Updated weights for policy 0, policy_version 690861 (0.0036) [2024-06-24 15:45:07,111][15401] Updated weights for policy 0, policy_version 690871 (0.0030) [2024-06-24 15:45:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11319263232. Throughput: 0: 42747.9. Samples: 11319351500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 15:45:11,031][15401] Updated weights for policy 0, policy_version 690881 (0.0032) [2024-06-24 15:45:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11319476224. Throughput: 0: 42713.8. Samples: 11319606220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 15:45:14,902][15401] Updated weights for policy 0, policy_version 690891 (0.0041) [2024-06-24 15:45:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11319705600. Throughput: 0: 42487.5. Samples: 11319853240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 15:45:18,509][15401] Updated weights for policy 0, policy_version 690901 (0.0041) [2024-06-24 15:45:22,609][15401] Updated weights for policy 0, policy_version 690911 (0.0035) [2024-06-24 15:45:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11319918592. Throughput: 0: 42852.5. Samples: 11319994160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 15:45:26,063][15401] Updated weights for policy 0, policy_version 690921 (0.0035) [2024-06-24 15:45:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 11320098816. Throughput: 0: 42809.3. Samples: 11320244400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:28,391][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 15:45:30,000][15401] Updated weights for policy 0, policy_version 690931 (0.0030) [2024-06-24 15:45:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11320360960. Throughput: 0: 42844.8. Samples: 11320501440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:45:33,601][15401] Updated weights for policy 0, policy_version 690941 (0.0029) [2024-06-24 15:45:37,720][15401] Updated weights for policy 0, policy_version 690951 (0.0025) [2024-06-24 15:45:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11320557568. Throughput: 0: 42990.0. Samples: 11320638160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 15:45:41,406][15401] Updated weights for policy 0, policy_version 690961 (0.0031) [2024-06-24 15:45:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 11320754176. Throughput: 0: 42766.6. Samples: 11320891140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 15:45:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690964_11320754176.pth... [2024-06-24 15:45:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690338_11310497792.pth [2024-06-24 15:45:45,433][15401] Updated weights for policy 0, policy_version 690971 (0.0040) [2024-06-24 15:45:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11320983552. Throughput: 0: 42823.9. Samples: 11321149680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 15:45:48,958][15401] Updated weights for policy 0, policy_version 690981 (0.0036) [2024-06-24 15:45:53,092][15401] Updated weights for policy 0, policy_version 690991 (0.0028) [2024-06-24 15:45:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.8, 300 sec: 42820.6). Total num frames: 11321196544. Throughput: 0: 42791.9. Samples: 11321277140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 15:45:56,927][15401] Updated weights for policy 0, policy_version 691001 (0.0024) [2024-06-24 15:45:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11321393152. Throughput: 0: 42704.1. Samples: 11321527900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:45:58,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 15:46:01,118][15401] Updated weights for policy 0, policy_version 691011 (0.0036) [2024-06-24 15:46:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 11321606144. Throughput: 0: 42905.3. Samples: 11321783980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:46:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 15:46:03,546][15349] Signal inference workers to stop experience collection... (167500 times) [2024-06-24 15:46:03,547][15349] Signal inference workers to resume experience collection... (167500 times) [2024-06-24 15:46:03,596][15401] InferenceWorker_p0-w0: stopping experience collection (167500 times) [2024-06-24 15:46:03,597][15401] InferenceWorker_p0-w0: resuming experience collection (167500 times) [2024-06-24 15:46:04,515][15401] Updated weights for policy 0, policy_version 691021 (0.0039) [2024-06-24 15:46:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11321835520. Throughput: 0: 42599.3. Samples: 11321911120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 15:46:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 15:46:08,496][15401] Updated weights for policy 0, policy_version 691031 (0.0027) [2024-06-24 15:46:12,161][15401] Updated weights for policy 0, policy_version 691041 (0.0044) [2024-06-24 15:46:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11322048512. Throughput: 0: 42727.0. Samples: 11322167120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:13,399][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 15:46:16,071][15401] Updated weights for policy 0, policy_version 691051 (0.0039) [2024-06-24 15:46:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 11322261504. Throughput: 0: 42903.2. Samples: 11322432080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 15:46:19,712][15401] Updated weights for policy 0, policy_version 691061 (0.0032) [2024-06-24 15:46:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 11322490880. Throughput: 0: 42771.9. Samples: 11322562900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 15:46:23,640][15401] Updated weights for policy 0, policy_version 691071 (0.0032) [2024-06-24 15:46:27,220][15401] Updated weights for policy 0, policy_version 691081 (0.0043) [2024-06-24 15:46:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11322687488. Throughput: 0: 42828.8. Samples: 11322818420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:28,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 15:46:31,177][15401] Updated weights for policy 0, policy_version 691091 (0.0037) [2024-06-24 15:46:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11322933248. Throughput: 0: 42854.7. Samples: 11323078140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 15:46:35,191][15401] Updated weights for policy 0, policy_version 691101 (0.0028) [2024-06-24 15:46:38,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11323146240. Throughput: 0: 42790.7. Samples: 11323202720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 15:46:38,789][15401] Updated weights for policy 0, policy_version 691111 (0.0041) [2024-06-24 15:46:42,912][15401] Updated weights for policy 0, policy_version 691121 (0.0044) [2024-06-24 15:46:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 11323326464. Throughput: 0: 43089.0. Samples: 11323466900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 15:46:46,310][15401] Updated weights for policy 0, policy_version 691131 (0.0028) [2024-06-24 15:46:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11323572224. Throughput: 0: 43107.3. Samples: 11323723800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 15:46:50,433][15401] Updated weights for policy 0, policy_version 691141 (0.0029) [2024-06-24 15:46:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 11323768832. Throughput: 0: 43181.8. Samples: 11323854300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 15:46:53,943][15401] Updated weights for policy 0, policy_version 691151 (0.0029) [2024-06-24 15:46:57,962][15401] Updated weights for policy 0, policy_version 691161 (0.0031) [2024-06-24 15:46:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11323981824. Throughput: 0: 43196.9. Samples: 11324110980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:46:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 15:47:01,768][15401] Updated weights for policy 0, policy_version 691171 (0.0040) [2024-06-24 15:47:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.8, 300 sec: 42931.6). Total num frames: 11324227584. Throughput: 0: 42941.2. Samples: 11324364440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 15:47:05,491][15401] Updated weights for policy 0, policy_version 691181 (0.0027) [2024-06-24 15:47:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 11324407808. Throughput: 0: 42903.9. Samples: 11324493580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 15:47:09,037][15349] Signal inference workers to stop experience collection... (167550 times) [2024-06-24 15:47:09,037][15349] Signal inference workers to resume experience collection... (167550 times) [2024-06-24 15:47:09,091][15401] InferenceWorker_p0-w0: stopping experience collection (167550 times) [2024-06-24 15:47:09,092][15401] InferenceWorker_p0-w0: resuming experience collection (167550 times) [2024-06-24 15:47:09,380][15401] Updated weights for policy 0, policy_version 691191 (0.0052) [2024-06-24 15:47:13,384][15401] Updated weights for policy 0, policy_version 691201 (0.0039) [2024-06-24 15:47:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11324637184. Throughput: 0: 42890.6. Samples: 11324748500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 15:47:17,542][15401] Updated weights for policy 0, policy_version 691211 (0.0034) [2024-06-24 15:47:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 11324850176. Throughput: 0: 42798.6. Samples: 11325004080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 15:47:21,136][15401] Updated weights for policy 0, policy_version 691221 (0.0037) [2024-06-24 15:47:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11325063168. Throughput: 0: 42948.7. Samples: 11325135400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 15:47:25,126][15401] Updated weights for policy 0, policy_version 691231 (0.0025) [2024-06-24 15:47:28,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11325259776. Throughput: 0: 42680.3. Samples: 11325387520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 15:47:28,891][15401] Updated weights for policy 0, policy_version 691241 (0.0028) [2024-06-24 15:47:32,763][15401] Updated weights for policy 0, policy_version 691251 (0.0037) [2024-06-24 15:47:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 11325472768. Throughput: 0: 42693.8. Samples: 11325645020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 15:47:36,450][15401] Updated weights for policy 0, policy_version 691261 (0.0037) [2024-06-24 15:47:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11325702144. Throughput: 0: 42597.2. Samples: 11325771180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 15:47:40,340][15401] Updated weights for policy 0, policy_version 691271 (0.0037) [2024-06-24 15:47:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11325898752. Throughput: 0: 42522.1. Samples: 11326024480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 15:47:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691278_11325898752.pth... [2024-06-24 15:47:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690653_11315658752.pth [2024-06-24 15:47:43,967][15401] Updated weights for policy 0, policy_version 691281 (0.0032) [2024-06-24 15:47:48,057][15401] Updated weights for policy 0, policy_version 691291 (0.0026) [2024-06-24 15:47:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11326128128. Throughput: 0: 42689.2. Samples: 11326285460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-24 15:47:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 15:47:51,557][15401] Updated weights for policy 0, policy_version 691301 (0.0028) [2024-06-24 15:47:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11326341120. Throughput: 0: 42739.7. Samples: 11326416860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:47:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 15:47:55,715][15401] Updated weights for policy 0, policy_version 691311 (0.0032) [2024-06-24 15:47:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11326554112. Throughput: 0: 42721.7. Samples: 11326670980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:47:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 15:47:59,875][15401] Updated weights for policy 0, policy_version 691321 (0.0037) [2024-06-24 15:48:03,164][15401] Updated weights for policy 0, policy_version 691331 (0.0034) [2024-06-24 15:48:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 11326767104. Throughput: 0: 42860.0. Samples: 11326932780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 15:48:07,743][15401] Updated weights for policy 0, policy_version 691341 (0.0034) [2024-06-24 15:48:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11326996480. Throughput: 0: 42860.3. Samples: 11327064120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:08,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 15:48:10,984][15401] Updated weights for policy 0, policy_version 691351 (0.0038) [2024-06-24 15:48:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11327176704. Throughput: 0: 42855.6. Samples: 11327316020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:13,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-24 15:48:15,155][15401] Updated weights for policy 0, policy_version 691361 (0.0031) [2024-06-24 15:48:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11327406080. Throughput: 0: 42961.7. Samples: 11327578300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 15:48:18,494][15401] Updated weights for policy 0, policy_version 691371 (0.0036) [2024-06-24 15:48:22,911][15401] Updated weights for policy 0, policy_version 691381 (0.0048) [2024-06-24 15:48:22,930][15349] Signal inference workers to stop experience collection... (167600 times) [2024-06-24 15:48:22,957][15401] InferenceWorker_p0-w0: stopping experience collection (167600 times) [2024-06-24 15:48:23,048][15349] Signal inference workers to resume experience collection... (167600 times) [2024-06-24 15:48:23,048][15401] InferenceWorker_p0-w0: resuming experience collection (167600 times) [2024-06-24 15:48:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 11327619072. Throughput: 0: 43090.2. Samples: 11327710240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 15:48:26,341][15401] Updated weights for policy 0, policy_version 691391 (0.0031) [2024-06-24 15:48:28,393][15132] Fps is (10 sec: 42584.4, 60 sec: 42869.1, 300 sec: 42764.5). Total num frames: 11327832064. Throughput: 0: 42974.3. Samples: 11327958460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:28,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 15:48:30,324][15401] Updated weights for policy 0, policy_version 691401 (0.0022) [2024-06-24 15:48:33,393][15132] Fps is (10 sec: 42583.5, 60 sec: 42868.8, 300 sec: 42709.0). Total num frames: 11328045056. Throughput: 0: 43072.6. Samples: 11328223880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:33,394][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 15:48:33,915][15401] Updated weights for policy 0, policy_version 691411 (0.0046) [2024-06-24 15:48:38,028][15401] Updated weights for policy 0, policy_version 691421 (0.0024) [2024-06-24 15:48:38,392][15132] Fps is (10 sec: 42602.2, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 11328258048. Throughput: 0: 42994.6. Samples: 11328351720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:38,392][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 15:48:41,398][15401] Updated weights for policy 0, policy_version 691431 (0.0026) [2024-06-24 15:48:43,389][15132] Fps is (10 sec: 44253.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 11328487424. Throughput: 0: 42969.0. Samples: 11328604580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:43,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 15:48:45,520][15401] Updated weights for policy 0, policy_version 691441 (0.0030) [2024-06-24 15:48:48,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11328684032. Throughput: 0: 43068.5. Samples: 11328870860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 15:48:48,961][15401] Updated weights for policy 0, policy_version 691451 (0.0028) [2024-06-24 15:48:53,011][15401] Updated weights for policy 0, policy_version 691461 (0.0037) [2024-06-24 15:48:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11328929792. Throughput: 0: 42919.1. Samples: 11328995480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:53,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 15:48:56,518][15401] Updated weights for policy 0, policy_version 691471 (0.0036) [2024-06-24 15:48:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11329126400. Throughput: 0: 43017.4. Samples: 11329251800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:48:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 15:49:00,545][15401] Updated weights for policy 0, policy_version 691481 (0.0036) [2024-06-24 15:49:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 11329323008. Throughput: 0: 43059.5. Samples: 11329516080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:03,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 15:49:04,057][15401] Updated weights for policy 0, policy_version 691491 (0.0031) [2024-06-24 15:49:07,949][15401] Updated weights for policy 0, policy_version 691501 (0.0039) [2024-06-24 15:49:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11329552384. Throughput: 0: 42881.4. Samples: 11329640000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:08,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 15:49:11,692][15401] Updated weights for policy 0, policy_version 691511 (0.0035) [2024-06-24 15:49:13,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11329765376. Throughput: 0: 43155.5. Samples: 11329900320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 15:49:15,694][15401] Updated weights for policy 0, policy_version 691521 (0.0048) [2024-06-24 15:49:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11329994752. Throughput: 0: 43043.9. Samples: 11330160700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 15:49:19,695][15401] Updated weights for policy 0, policy_version 691531 (0.0033) [2024-06-24 15:49:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11330191360. Throughput: 0: 42870.2. Samples: 11330280780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 15:49:23,409][15401] Updated weights for policy 0, policy_version 691541 (0.0032) [2024-06-24 15:49:27,476][15401] Updated weights for policy 0, policy_version 691551 (0.0037) [2024-06-24 15:49:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.9, 300 sec: 42820.6). Total num frames: 11330404352. Throughput: 0: 43048.0. Samples: 11330541740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 15:49:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 15:49:30,966][15401] Updated weights for policy 0, policy_version 691561 (0.0027) [2024-06-24 15:49:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43147.1, 300 sec: 42876.1). Total num frames: 11330633728. Throughput: 0: 42759.9. Samples: 11330795060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 15:49:34,885][15401] Updated weights for policy 0, policy_version 691571 (0.0036) [2024-06-24 15:49:36,091][15349] Signal inference workers to stop experience collection... (167650 times) [2024-06-24 15:49:36,092][15349] Signal inference workers to resume experience collection... (167650 times) [2024-06-24 15:49:36,128][15401] InferenceWorker_p0-w0: stopping experience collection (167650 times) [2024-06-24 15:49:36,128][15401] InferenceWorker_p0-w0: resuming experience collection (167650 times) [2024-06-24 15:49:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 11330846720. Throughput: 0: 42860.0. Samples: 11330924180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:38,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-24 15:49:38,690][15401] Updated weights for policy 0, policy_version 691581 (0.0028) [2024-06-24 15:49:42,703][15401] Updated weights for policy 0, policy_version 691591 (0.0031) [2024-06-24 15:49:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 11331043328. Throughput: 0: 42954.5. Samples: 11331184860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:43,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 15:49:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691592_11331043328.pth... [2024-06-24 15:49:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000690964_11320754176.pth [2024-06-24 15:49:46,346][15401] Updated weights for policy 0, policy_version 691601 (0.0028) [2024-06-24 15:49:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42876.2). Total num frames: 11331272704. Throughput: 0: 42531.6. Samples: 11331430000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:48,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 15:49:50,393][15401] Updated weights for policy 0, policy_version 691611 (0.0030) [2024-06-24 15:49:53,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11331485696. Throughput: 0: 42734.7. Samples: 11331562960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 15:49:54,048][15401] Updated weights for policy 0, policy_version 691621 (0.0029) [2024-06-24 15:49:58,002][15401] Updated weights for policy 0, policy_version 691631 (0.0035) [2024-06-24 15:49:58,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11331682304. Throughput: 0: 42659.6. Samples: 11331820000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:49:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 15:50:01,553][15401] Updated weights for policy 0, policy_version 691641 (0.0033) [2024-06-24 15:50:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 11331911680. Throughput: 0: 42513.4. Samples: 11332073800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 15:50:05,861][15401] Updated weights for policy 0, policy_version 691651 (0.0039) [2024-06-24 15:50:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 11332108288. Throughput: 0: 42662.7. Samples: 11332200600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 15:50:09,206][15401] Updated weights for policy 0, policy_version 691661 (0.0035) [2024-06-24 15:50:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11332321280. Throughput: 0: 42415.5. Samples: 11332450440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:13,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 15:50:13,714][15401] Updated weights for policy 0, policy_version 691671 (0.0040) [2024-06-24 15:50:16,961][15401] Updated weights for policy 0, policy_version 691681 (0.0045) [2024-06-24 15:50:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11332550656. Throughput: 0: 42460.9. Samples: 11332705800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 15:50:21,438][15401] Updated weights for policy 0, policy_version 691691 (0.0043) [2024-06-24 15:50:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 11332730880. Throughput: 0: 42429.4. Samples: 11332833500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 15:50:24,803][15401] Updated weights for policy 0, policy_version 691701 (0.0028) [2024-06-24 15:50:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11332960256. Throughput: 0: 42226.7. Samples: 11333084960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 15:50:29,337][15401] Updated weights for policy 0, policy_version 691711 (0.0034) [2024-06-24 15:50:32,349][15401] Updated weights for policy 0, policy_version 691721 (0.0032) [2024-06-24 15:50:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 11333189632. Throughput: 0: 42299.2. Samples: 11333333360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 15:50:36,917][15401] Updated weights for policy 0, policy_version 691731 (0.0042) [2024-06-24 15:50:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11333386240. Throughput: 0: 42331.1. Samples: 11333467860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 15:50:40,102][15401] Updated weights for policy 0, policy_version 691741 (0.0036) [2024-06-24 15:50:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 11333615616. Throughput: 0: 42214.2. Samples: 11333719640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:43,391][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:50:44,851][15401] Updated weights for policy 0, policy_version 691751 (0.0035) [2024-06-24 15:50:48,250][15401] Updated weights for policy 0, policy_version 691762 (0.0037) [2024-06-24 15:50:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11333828608. Throughput: 0: 42199.1. Samples: 11333972760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 15:50:50,724][15349] Signal inference workers to stop experience collection... (167700 times) [2024-06-24 15:50:50,726][15349] Signal inference workers to resume experience collection... (167700 times) [2024-06-24 15:50:50,764][15401] InferenceWorker_p0-w0: stopping experience collection (167700 times) [2024-06-24 15:50:50,764][15401] InferenceWorker_p0-w0: resuming experience collection (167700 times) [2024-06-24 15:50:53,114][15401] Updated weights for policy 0, policy_version 691772 (0.0049) [2024-06-24 15:50:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 11334008832. Throughput: 0: 42245.1. Samples: 11334101620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 15:50:56,211][15401] Updated weights for policy 0, policy_version 691782 (0.0042) [2024-06-24 15:50:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11334238208. Throughput: 0: 42412.3. Samples: 11334359000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:50:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 15:51:00,549][15401] Updated weights for policy 0, policy_version 691792 (0.0041) [2024-06-24 15:51:03,393][15132] Fps is (10 sec: 45860.1, 60 sec: 42596.1, 300 sec: 42820.1). Total num frames: 11334467584. Throughput: 0: 42402.8. Samples: 11334614060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:51:03,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 15:51:03,735][15401] Updated weights for policy 0, policy_version 691802 (0.0029) [2024-06-24 15:51:08,041][15401] Updated weights for policy 0, policy_version 691812 (0.0033) [2024-06-24 15:51:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11334664192. Throughput: 0: 42599.4. Samples: 11334750480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-24 15:51:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 15:51:11,242][15401] Updated weights for policy 0, policy_version 691822 (0.0026) [2024-06-24 15:51:13,390][15132] Fps is (10 sec: 42611.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11334893568. Throughput: 0: 42790.7. Samples: 11335010540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 15:51:15,528][15401] Updated weights for policy 0, policy_version 691832 (0.0037) [2024-06-24 15:51:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11335106560. Throughput: 0: 42952.4. Samples: 11335266220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 15:51:18,798][15401] Updated weights for policy 0, policy_version 691842 (0.0034) [2024-06-24 15:51:22,906][15401] Updated weights for policy 0, policy_version 691852 (0.0046) [2024-06-24 15:51:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11335319552. Throughput: 0: 42889.3. Samples: 11335397880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 15:51:26,464][15401] Updated weights for policy 0, policy_version 691862 (0.0036) [2024-06-24 15:51:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11335532544. Throughput: 0: 43018.4. Samples: 11335655460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 15:51:30,403][15401] Updated weights for policy 0, policy_version 691872 (0.0031) [2024-06-24 15:51:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11335761920. Throughput: 0: 43241.3. Samples: 11335918620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 15:51:33,882][15401] Updated weights for policy 0, policy_version 691882 (0.0028) [2024-06-24 15:51:37,851][15401] Updated weights for policy 0, policy_version 691892 (0.0045) [2024-06-24 15:51:38,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 11335974912. Throughput: 0: 43378.5. Samples: 11336053760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:38,392][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 15:51:41,706][15401] Updated weights for policy 0, policy_version 691902 (0.0040) [2024-06-24 15:51:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11336187904. Throughput: 0: 43310.3. Samples: 11336307960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:43,391][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 15:51:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691906_11336187904.pth... [2024-06-24 15:51:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691278_11325898752.pth [2024-06-24 15:51:45,490][15401] Updated weights for policy 0, policy_version 691912 (0.0032) [2024-06-24 15:51:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11336400896. Throughput: 0: 43470.3. Samples: 11336570080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 15:51:49,148][15401] Updated weights for policy 0, policy_version 691922 (0.0040) [2024-06-24 15:51:53,090][15401] Updated weights for policy 0, policy_version 691932 (0.0045) [2024-06-24 15:51:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 11336613888. Throughput: 0: 43322.7. Samples: 11336700000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 15:51:56,807][15401] Updated weights for policy 0, policy_version 691942 (0.0032) [2024-06-24 15:51:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11336826880. Throughput: 0: 43207.6. Samples: 11336954880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:51:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 15:52:00,940][15401] Updated weights for policy 0, policy_version 691952 (0.0040) [2024-06-24 15:52:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.8, 300 sec: 42820.6). Total num frames: 11337039872. Throughput: 0: 43284.9. Samples: 11337214040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 15:52:04,494][15401] Updated weights for policy 0, policy_version 691962 (0.0033) [2024-06-24 15:52:08,363][15401] Updated weights for policy 0, policy_version 691972 (0.0029) [2024-06-24 15:52:08,364][15349] Signal inference workers to stop experience collection... (167750 times) [2024-06-24 15:52:08,364][15349] Signal inference workers to resume experience collection... (167750 times) [2024-06-24 15:52:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 11337269248. Throughput: 0: 43202.4. Samples: 11337341980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 15:52:08,409][15401] InferenceWorker_p0-w0: stopping experience collection (167750 times) [2024-06-24 15:52:08,409][15401] InferenceWorker_p0-w0: resuming experience collection (167750 times) [2024-06-24 15:52:12,319][15401] Updated weights for policy 0, policy_version 691982 (0.0038) [2024-06-24 15:52:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11337482240. Throughput: 0: 43251.5. Samples: 11337601780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 15:52:15,906][15401] Updated weights for policy 0, policy_version 691992 (0.0041) [2024-06-24 15:52:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11337695232. Throughput: 0: 43005.6. Samples: 11337853880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 15:52:19,923][15401] Updated weights for policy 0, policy_version 692002 (0.0042) [2024-06-24 15:52:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11337908224. Throughput: 0: 42792.5. Samples: 11337979320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 15:52:23,530][15401] Updated weights for policy 0, policy_version 692012 (0.0033) [2024-06-24 15:52:27,446][15401] Updated weights for policy 0, policy_version 692022 (0.0044) [2024-06-24 15:52:28,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11338137600. Throughput: 0: 42901.9. Samples: 11338238540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:28,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 15:52:31,557][15401] Updated weights for policy 0, policy_version 692032 (0.0035) [2024-06-24 15:52:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11338334208. Throughput: 0: 42865.7. Samples: 11338499040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 15:52:35,056][15401] Updated weights for policy 0, policy_version 692042 (0.0042) [2024-06-24 15:52:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 11338547200. Throughput: 0: 42709.4. Samples: 11338621920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 15:52:39,191][15401] Updated weights for policy 0, policy_version 692052 (0.0031) [2024-06-24 15:52:42,543][15401] Updated weights for policy 0, policy_version 692062 (0.0036) [2024-06-24 15:52:43,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 11338792960. Throughput: 0: 42829.3. Samples: 11338882300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:43,392][15132] Avg episode reward: [(0, '0.172')] [2024-06-24 15:52:46,787][15401] Updated weights for policy 0, policy_version 692072 (0.0037) [2024-06-24 15:52:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11338973184. Throughput: 0: 42876.8. Samples: 11339143500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-24 15:52:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 15:52:50,096][15401] Updated weights for policy 0, policy_version 692082 (0.0031) [2024-06-24 15:52:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11339186176. Throughput: 0: 42849.7. Samples: 11339270220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:52:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 15:52:54,422][15401] Updated weights for policy 0, policy_version 692092 (0.0029) [2024-06-24 15:52:58,061][15401] Updated weights for policy 0, policy_version 692102 (0.0031) [2024-06-24 15:52:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11339415552. Throughput: 0: 42684.7. Samples: 11339522600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:52:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-24 15:53:02,262][15401] Updated weights for policy 0, policy_version 692112 (0.0024) [2024-06-24 15:53:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11339612160. Throughput: 0: 42706.8. Samples: 11339775680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 15:53:06,141][15401] Updated weights for policy 0, policy_version 692122 (0.0038) [2024-06-24 15:53:08,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11339825152. Throughput: 0: 42951.7. Samples: 11339912140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:08,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 15:53:09,756][15401] Updated weights for policy 0, policy_version 692132 (0.0040) [2024-06-24 15:53:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11340038144. Throughput: 0: 42667.9. Samples: 11340158600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 15:53:13,541][15401] Updated weights for policy 0, policy_version 692142 (0.0038) [2024-06-24 15:53:17,252][15401] Updated weights for policy 0, policy_version 692152 (0.0033) [2024-06-24 15:53:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11340267520. Throughput: 0: 42853.8. Samples: 11340427460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 15:53:21,117][15401] Updated weights for policy 0, policy_version 692162 (0.0036) [2024-06-24 15:53:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 11340464128. Throughput: 0: 43028.5. Samples: 11340558200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:23,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 15:53:24,842][15401] Updated weights for policy 0, policy_version 692172 (0.0032) [2024-06-24 15:53:27,511][15349] Signal inference workers to stop experience collection... (167800 times) [2024-06-24 15:53:27,511][15349] Signal inference workers to resume experience collection... (167800 times) [2024-06-24 15:53:27,525][15401] InferenceWorker_p0-w0: stopping experience collection (167800 times) [2024-06-24 15:53:27,525][15401] InferenceWorker_p0-w0: resuming experience collection (167800 times) [2024-06-24 15:53:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42876.6). Total num frames: 11340693504. Throughput: 0: 42884.0. Samples: 11340811980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:28,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 15:53:28,585][15401] Updated weights for policy 0, policy_version 692182 (0.0035) [2024-06-24 15:53:32,490][15401] Updated weights for policy 0, policy_version 692192 (0.0035) [2024-06-24 15:53:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 11340922880. Throughput: 0: 42812.9. Samples: 11341070080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 15:53:36,301][15401] Updated weights for policy 0, policy_version 692202 (0.0044) [2024-06-24 15:53:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11341119488. Throughput: 0: 42951.1. Samples: 11341203020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 15:53:40,226][15401] Updated weights for policy 0, policy_version 692212 (0.0027) [2024-06-24 15:53:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 11341332480. Throughput: 0: 43023.5. Samples: 11341458660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 15:53:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692220_11341332480.pth... [2024-06-24 15:53:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691592_11331043328.pth [2024-06-24 15:53:43,973][15401] Updated weights for policy 0, policy_version 692222 (0.0034) [2024-06-24 15:53:48,118][15401] Updated weights for policy 0, policy_version 692232 (0.0030) [2024-06-24 15:53:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11341545472. Throughput: 0: 43128.8. Samples: 11341716480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 15:53:51,636][15401] Updated weights for policy 0, policy_version 692242 (0.0029) [2024-06-24 15:53:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11341758464. Throughput: 0: 42861.2. Samples: 11341840900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 15:53:55,728][15401] Updated weights for policy 0, policy_version 692252 (0.0023) [2024-06-24 15:53:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 11341971456. Throughput: 0: 43097.8. Samples: 11342098000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:53:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 15:53:59,171][15401] Updated weights for policy 0, policy_version 692262 (0.0033) [2024-06-24 15:54:03,351][15401] Updated weights for policy 0, policy_version 692272 (0.0029) [2024-06-24 15:54:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11342184448. Throughput: 0: 42901.3. Samples: 11342358020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 15:54:06,990][15401] Updated weights for policy 0, policy_version 692282 (0.0037) [2024-06-24 15:54:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11342413824. Throughput: 0: 42760.9. Samples: 11342482440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 15:54:10,826][15401] Updated weights for policy 0, policy_version 692292 (0.0039) [2024-06-24 15:54:13,396][15132] Fps is (10 sec: 44207.8, 60 sec: 43139.8, 300 sec: 42819.6). Total num frames: 11342626816. Throughput: 0: 42899.1. Samples: 11342742720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:13,397][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 15:54:14,580][15401] Updated weights for policy 0, policy_version 692302 (0.0037) [2024-06-24 15:54:18,310][15401] Updated weights for policy 0, policy_version 692312 (0.0038) [2024-06-24 15:54:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11342839808. Throughput: 0: 42717.0. Samples: 11342992340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 15:54:22,295][15401] Updated weights for policy 0, policy_version 692322 (0.0030) [2024-06-24 15:54:23,389][15132] Fps is (10 sec: 40986.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11343036416. Throughput: 0: 42588.0. Samples: 11343119480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 15:54:26,725][15401] Updated weights for policy 0, policy_version 692332 (0.0031) [2024-06-24 15:54:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11343249408. Throughput: 0: 42612.2. Samples: 11343376200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-24 15:54:28,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 15:54:30,080][15401] Updated weights for policy 0, policy_version 692342 (0.0031) [2024-06-24 15:54:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11343462400. Throughput: 0: 42481.4. Samples: 11343628140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:33,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 15:54:34,337][15401] Updated weights for policy 0, policy_version 692352 (0.0027) [2024-06-24 15:54:37,851][15401] Updated weights for policy 0, policy_version 692362 (0.0051) [2024-06-24 15:54:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 11343691776. Throughput: 0: 42629.8. Samples: 11343759240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 15:54:41,916][15401] Updated weights for policy 0, policy_version 692372 (0.0042) [2024-06-24 15:54:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 11343872000. Throughput: 0: 42499.4. Samples: 11344010480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 15:54:45,457][15401] Updated weights for policy 0, policy_version 692382 (0.0035) [2024-06-24 15:54:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11344117760. Throughput: 0: 42330.5. Samples: 11344262900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 15:54:49,387][15401] Updated weights for policy 0, policy_version 692392 (0.0032) [2024-06-24 15:54:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11344297984. Throughput: 0: 42579.8. Samples: 11344398540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:53,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 15:54:53,416][15349] Signal inference workers to stop experience collection... (167850 times) [2024-06-24 15:54:53,416][15349] Signal inference workers to resume experience collection... (167850 times) [2024-06-24 15:54:53,421][15401] Updated weights for policy 0, policy_version 692402 (0.0041) [2024-06-24 15:54:53,431][15401] InferenceWorker_p0-w0: stopping experience collection (167850 times) [2024-06-24 15:54:53,431][15401] InferenceWorker_p0-w0: resuming experience collection (167850 times) [2024-06-24 15:54:57,132][15401] Updated weights for policy 0, policy_version 692412 (0.0030) [2024-06-24 15:54:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11344527360. Throughput: 0: 42414.9. Samples: 11344651120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:54:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 15:55:01,087][15401] Updated weights for policy 0, policy_version 692422 (0.0028) [2024-06-24 15:55:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 11344756736. Throughput: 0: 42483.4. Samples: 11344904200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:03,393][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 15:55:04,709][15401] Updated weights for policy 0, policy_version 692432 (0.0051) [2024-06-24 15:55:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11344936960. Throughput: 0: 42625.8. Samples: 11345037640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:08,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 15:55:08,567][15401] Updated weights for policy 0, policy_version 692442 (0.0032) [2024-06-24 15:55:12,553][15401] Updated weights for policy 0, policy_version 692452 (0.0043) [2024-06-24 15:55:13,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42330.0, 300 sec: 42765.0). Total num frames: 11345166336. Throughput: 0: 42693.8. Samples: 11345297420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:13,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 15:55:16,189][15401] Updated weights for policy 0, policy_version 692462 (0.0037) [2024-06-24 15:55:18,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42869.7, 300 sec: 42986.8). Total num frames: 11345412096. Throughput: 0: 42728.4. Samples: 11345551020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:18,392][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 15:55:20,155][15401] Updated weights for policy 0, policy_version 692472 (0.0034) [2024-06-24 15:55:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11345592320. Throughput: 0: 42729.7. Samples: 11345682080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:23,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 15:55:23,712][15401] Updated weights for policy 0, policy_version 692482 (0.0032) [2024-06-24 15:55:27,534][15401] Updated weights for policy 0, policy_version 692492 (0.0030) [2024-06-24 15:55:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11345821696. Throughput: 0: 42797.9. Samples: 11345936380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 15:55:31,579][15401] Updated weights for policy 0, policy_version 692502 (0.0036) [2024-06-24 15:55:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 11346051072. Throughput: 0: 42872.2. Samples: 11346192140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:33,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 15:55:35,071][15401] Updated weights for policy 0, policy_version 692512 (0.0034) [2024-06-24 15:55:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11346214912. Throughput: 0: 42815.3. Samples: 11346325220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-24 15:55:39,110][15401] Updated weights for policy 0, policy_version 692522 (0.0040) [2024-06-24 15:55:42,488][15401] Updated weights for policy 0, policy_version 692532 (0.0024) [2024-06-24 15:55:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11346460672. Throughput: 0: 42910.8. Samples: 11346582100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 15:55:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692534_11346477056.pth... [2024-06-24 15:55:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000691906_11336187904.pth [2024-06-24 15:55:46,734][15401] Updated weights for policy 0, policy_version 692542 (0.0030) [2024-06-24 15:55:48,392][15132] Fps is (10 sec: 49140.0, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 11346706432. Throughput: 0: 43048.5. Samples: 11346841380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:48,392][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 15:55:50,051][15401] Updated weights for policy 0, policy_version 692552 (0.0036) [2024-06-24 15:55:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11346870272. Throughput: 0: 43096.4. Samples: 11346976980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 15:55:54,429][15401] Updated weights for policy 0, policy_version 692562 (0.0043) [2024-06-24 15:55:57,667][15401] Updated weights for policy 0, policy_version 692572 (0.0032) [2024-06-24 15:55:58,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.7, 300 sec: 42876.6). Total num frames: 11347116032. Throughput: 0: 42825.3. Samples: 11347224560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:55:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 15:56:01,978][15401] Updated weights for policy 0, policy_version 692582 (0.0042) [2024-06-24 15:56:03,394][15132] Fps is (10 sec: 49129.8, 60 sec: 43416.1, 300 sec: 43042.1). Total num frames: 11347361792. Throughput: 0: 42957.9. Samples: 11347484220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:56:03,395][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:56:05,452][15401] Updated weights for policy 0, policy_version 692592 (0.0031) [2024-06-24 15:56:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11347525632. Throughput: 0: 42962.3. Samples: 11347615380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 15:56:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 15:56:09,551][15349] Signal inference workers to stop experience collection... (167900 times) [2024-06-24 15:56:09,556][15349] Signal inference workers to resume experience collection... (167900 times) [2024-06-24 15:56:09,605][15401] InferenceWorker_p0-w0: stopping experience collection (167900 times) [2024-06-24 15:56:09,605][15401] InferenceWorker_p0-w0: resuming experience collection (167900 times) [2024-06-24 15:56:09,697][15401] Updated weights for policy 0, policy_version 692602 (0.0048) [2024-06-24 15:56:13,067][15401] Updated weights for policy 0, policy_version 692612 (0.0035) [2024-06-24 15:56:13,390][15132] Fps is (10 sec: 40978.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11347771392. Throughput: 0: 42979.0. Samples: 11347870440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 15:56:17,517][15401] Updated weights for policy 0, policy_version 692622 (0.0035) [2024-06-24 15:56:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 11347984384. Throughput: 0: 43080.7. Samples: 11348130780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 15:56:20,817][15401] Updated weights for policy 0, policy_version 692632 (0.0044) [2024-06-24 15:56:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11348164608. Throughput: 0: 42963.1. Samples: 11348258560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 15:56:25,075][15401] Updated weights for policy 0, policy_version 692642 (0.0031) [2024-06-24 15:56:28,347][15401] Updated weights for policy 0, policy_version 692652 (0.0040) [2024-06-24 15:56:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11348410368. Throughput: 0: 42806.6. Samples: 11348508400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 15:56:32,556][15401] Updated weights for policy 0, policy_version 692662 (0.0030) [2024-06-24 15:56:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 11348623360. Throughput: 0: 42991.2. Samples: 11348775880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 15:56:35,862][15401] Updated weights for policy 0, policy_version 692672 (0.0042) [2024-06-24 15:56:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11348803584. Throughput: 0: 42755.1. Samples: 11348900960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 15:56:40,091][15401] Updated weights for policy 0, policy_version 692682 (0.0037) [2024-06-24 15:56:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11349049344. Throughput: 0: 43011.6. Samples: 11349160080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 15:56:43,456][15401] Updated weights for policy 0, policy_version 692692 (0.0045) [2024-06-24 15:56:47,726][15401] Updated weights for policy 0, policy_version 692702 (0.0028) [2024-06-24 15:56:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 11349262336. Throughput: 0: 42981.2. Samples: 11349418180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:48,394][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 15:56:51,503][15401] Updated weights for policy 0, policy_version 692712 (0.0037) [2024-06-24 15:56:53,390][15132] Fps is (10 sec: 40956.9, 60 sec: 43144.1, 300 sec: 42820.5). Total num frames: 11349458944. Throughput: 0: 42901.6. Samples: 11349545980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:53,391][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 15:56:55,172][15401] Updated weights for policy 0, policy_version 692722 (0.0033) [2024-06-24 15:56:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11349704704. Throughput: 0: 42890.7. Samples: 11349800520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:56:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 15:56:59,002][15401] Updated weights for policy 0, policy_version 692732 (0.0042) [2024-06-24 15:57:02,871][15401] Updated weights for policy 0, policy_version 692742 (0.0038) [2024-06-24 15:57:03,390][15132] Fps is (10 sec: 44239.6, 60 sec: 42328.5, 300 sec: 42820.5). Total num frames: 11349901312. Throughput: 0: 42941.9. Samples: 11350063160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 15:57:06,516][15401] Updated weights for policy 0, policy_version 692752 (0.0029) [2024-06-24 15:57:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11350097920. Throughput: 0: 42994.7. Samples: 11350193320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 15:57:10,623][15401] Updated weights for policy 0, policy_version 692762 (0.0024) [2024-06-24 15:57:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11350343680. Throughput: 0: 43112.9. Samples: 11350448480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:13,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 15:57:14,057][15401] Updated weights for policy 0, policy_version 692772 (0.0034) [2024-06-24 15:57:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11350523904. Throughput: 0: 42992.1. Samples: 11350710520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 15:57:18,469][15401] Updated weights for policy 0, policy_version 692782 (0.0037) [2024-06-24 15:57:19,829][15349] Signal inference workers to stop experience collection... (167950 times) [2024-06-24 15:57:19,830][15349] Signal inference workers to resume experience collection... (167950 times) [2024-06-24 15:57:19,871][15401] InferenceWorker_p0-w0: stopping experience collection (167950 times) [2024-06-24 15:57:19,871][15401] InferenceWorker_p0-w0: resuming experience collection (167950 times) [2024-06-24 15:57:21,598][15401] Updated weights for policy 0, policy_version 692792 (0.0028) [2024-06-24 15:57:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11350736896. Throughput: 0: 42990.7. Samples: 11350835540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 15:57:25,943][15401] Updated weights for policy 0, policy_version 692802 (0.0031) [2024-06-24 15:57:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11350982656. Throughput: 0: 43047.5. Samples: 11351097220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 15:57:29,136][15401] Updated weights for policy 0, policy_version 692812 (0.0047) [2024-06-24 15:57:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11351179264. Throughput: 0: 43050.2. Samples: 11351355440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 15:57:33,573][15401] Updated weights for policy 0, policy_version 692822 (0.0038) [2024-06-24 15:57:36,719][15401] Updated weights for policy 0, policy_version 692832 (0.0037) [2024-06-24 15:57:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 11351392256. Throughput: 0: 43118.9. Samples: 11351486300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 15:57:41,135][15401] Updated weights for policy 0, policy_version 692842 (0.0029) [2024-06-24 15:57:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11351621632. Throughput: 0: 43092.9. Samples: 11351739700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 15:57:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692848_11351621632.pth... [2024-06-24 15:57:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692220_11341332480.pth [2024-06-24 15:57:44,397][15401] Updated weights for policy 0, policy_version 692852 (0.0033) [2024-06-24 15:57:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11351834624. Throughput: 0: 42975.6. Samples: 11351997060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-24 15:57:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 15:57:48,675][15401] Updated weights for policy 0, policy_version 692862 (0.0033) [2024-06-24 15:57:52,144][15401] Updated weights for policy 0, policy_version 692872 (0.0037) [2024-06-24 15:57:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43145.0, 300 sec: 42820.6). Total num frames: 11352047616. Throughput: 0: 43038.5. Samples: 11352130060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:57:53,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 15:57:56,407][15401] Updated weights for policy 0, policy_version 692882 (0.0047) [2024-06-24 15:57:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 11352244224. Throughput: 0: 42991.8. Samples: 11352383120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:57:58,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-24 15:57:59,807][15401] Updated weights for policy 0, policy_version 692892 (0.0040) [2024-06-24 15:58:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11352473600. Throughput: 0: 42921.7. Samples: 11352642000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:03,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 15:58:04,035][15401] Updated weights for policy 0, policy_version 692902 (0.0038) [2024-06-24 15:58:07,897][15401] Updated weights for policy 0, policy_version 692912 (0.0032) [2024-06-24 15:58:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11352686592. Throughput: 0: 43078.7. Samples: 11352774080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:08,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 15:58:11,487][15401] Updated weights for policy 0, policy_version 692922 (0.0034) [2024-06-24 15:58:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11352899584. Throughput: 0: 42964.9. Samples: 11353030640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 15:58:15,381][15401] Updated weights for policy 0, policy_version 692932 (0.0028) [2024-06-24 15:58:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11353096192. Throughput: 0: 42986.7. Samples: 11353289840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 15:58:19,392][15401] Updated weights for policy 0, policy_version 692942 (0.0036) [2024-06-24 15:58:22,949][15401] Updated weights for policy 0, policy_version 692952 (0.0036) [2024-06-24 15:58:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11353341952. Throughput: 0: 42754.6. Samples: 11353410260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:23,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 15:58:26,906][15401] Updated weights for policy 0, policy_version 692962 (0.0037) [2024-06-24 15:58:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11353554944. Throughput: 0: 42954.2. Samples: 11353672640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 15:58:30,962][15401] Updated weights for policy 0, policy_version 692972 (0.0046) [2024-06-24 15:58:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11353751552. Throughput: 0: 42983.6. Samples: 11353931320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 15:58:34,380][15401] Updated weights for policy 0, policy_version 692982 (0.0040) [2024-06-24 15:58:38,299][15401] Updated weights for policy 0, policy_version 692992 (0.0035) [2024-06-24 15:58:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11353980928. Throughput: 0: 42886.2. Samples: 11354059940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 15:58:40,540][15349] Signal inference workers to stop experience collection... (168000 times) [2024-06-24 15:58:40,593][15401] InferenceWorker_p0-w0: stopping experience collection (168000 times) [2024-06-24 15:58:40,602][15349] Signal inference workers to resume experience collection... (168000 times) [2024-06-24 15:58:40,612][15401] InferenceWorker_p0-w0: resuming experience collection (168000 times) [2024-06-24 15:58:42,133][15401] Updated weights for policy 0, policy_version 693002 (0.0028) [2024-06-24 15:58:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11354210304. Throughput: 0: 43050.7. Samples: 11354320400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 15:58:45,791][15401] Updated weights for policy 0, policy_version 693012 (0.0033) [2024-06-24 15:58:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11354406912. Throughput: 0: 43118.1. Samples: 11354582320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 15:58:49,794][15401] Updated weights for policy 0, policy_version 693022 (0.0032) [2024-06-24 15:58:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11354619904. Throughput: 0: 42873.6. Samples: 11354703400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 15:58:53,512][15401] Updated weights for policy 0, policy_version 693032 (0.0036) [2024-06-24 15:58:57,487][15401] Updated weights for policy 0, policy_version 693042 (0.0032) [2024-06-24 15:58:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11354849280. Throughput: 0: 42951.4. Samples: 11354963460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:58:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 15:59:01,058][15401] Updated weights for policy 0, policy_version 693052 (0.0036) [2024-06-24 15:59:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11355029504. Throughput: 0: 42812.0. Samples: 11355216380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:03,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 15:59:05,267][15401] Updated weights for policy 0, policy_version 693062 (0.0030) [2024-06-24 15:59:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42877.0). Total num frames: 11355275264. Throughput: 0: 42851.5. Samples: 11355338580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 15:59:08,919][15401] Updated weights for policy 0, policy_version 693072 (0.0036) [2024-06-24 15:59:12,839][15401] Updated weights for policy 0, policy_version 693082 (0.0026) [2024-06-24 15:59:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 11355471872. Throughput: 0: 42798.6. Samples: 11355598580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 15:59:16,338][15401] Updated weights for policy 0, policy_version 693092 (0.0040) [2024-06-24 15:59:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11355684864. Throughput: 0: 42726.6. Samples: 11355854020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 15:59:20,832][15401] Updated weights for policy 0, policy_version 693102 (0.0051) [2024-06-24 15:59:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11355914240. Throughput: 0: 42623.2. Samples: 11355977980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 15:59:24,374][15401] Updated weights for policy 0, policy_version 693112 (0.0031) [2024-06-24 15:59:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11356094464. Throughput: 0: 42675.7. Samples: 11356240800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-24 15:59:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 15:59:28,505][15401] Updated weights for policy 0, policy_version 693122 (0.0027) [2024-06-24 15:59:31,724][15401] Updated weights for policy 0, policy_version 693132 (0.0037) [2024-06-24 15:59:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 11356323840. Throughput: 0: 42526.6. Samples: 11356496020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:33,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-24 15:59:35,877][15401] Updated weights for policy 0, policy_version 693142 (0.0042) [2024-06-24 15:59:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11356569600. Throughput: 0: 42791.2. Samples: 11356629000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:38,390][15132] Avg episode reward: [(0, '0.120')] [2024-06-24 15:59:39,223][15401] Updated weights for policy 0, policy_version 693152 (0.0030) [2024-06-24 15:59:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11356749824. Throughput: 0: 42781.5. Samples: 11356888620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 15:59:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693162_11356766208.pth... [2024-06-24 15:59:43,427][15401] Updated weights for policy 0, policy_version 693162 (0.0036) [2024-06-24 15:59:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692534_11346477056.pth [2024-06-24 15:59:46,910][15401] Updated weights for policy 0, policy_version 693172 (0.0029) [2024-06-24 15:59:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11356979200. Throughput: 0: 42688.9. Samples: 11357137380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 15:59:51,428][15401] Updated weights for policy 0, policy_version 693182 (0.0027) [2024-06-24 15:59:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 11357192192. Throughput: 0: 43021.9. Samples: 11357274560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 15:59:54,663][15401] Updated weights for policy 0, policy_version 693192 (0.0029) [2024-06-24 15:59:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 11357388800. Throughput: 0: 42968.6. Samples: 11357532160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 15:59:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 15:59:58,853][15401] Updated weights for policy 0, policy_version 693202 (0.0040) [2024-06-24 16:00:02,148][15401] Updated weights for policy 0, policy_version 693212 (0.0037) [2024-06-24 16:00:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11357618176. Throughput: 0: 42844.2. Samples: 11357782000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 16:00:06,759][15401] Updated weights for policy 0, policy_version 693222 (0.0030) [2024-06-24 16:00:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 11357831168. Throughput: 0: 43069.8. Samples: 11357916120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 16:00:10,073][15401] Updated weights for policy 0, policy_version 693232 (0.0040) [2024-06-24 16:00:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11358044160. Throughput: 0: 42905.6. Samples: 11358171560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 16:00:14,313][15401] Updated weights for policy 0, policy_version 693242 (0.0033) [2024-06-24 16:00:16,127][15349] Signal inference workers to stop experience collection... (168050 times) [2024-06-24 16:00:16,128][15349] Signal inference workers to resume experience collection... (168050 times) [2024-06-24 16:00:16,155][15401] InferenceWorker_p0-w0: stopping experience collection (168050 times) [2024-06-24 16:00:16,155][15401] InferenceWorker_p0-w0: resuming experience collection (168050 times) [2024-06-24 16:00:17,551][15401] Updated weights for policy 0, policy_version 693252 (0.0040) [2024-06-24 16:00:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 11358257152. Throughput: 0: 42867.3. Samples: 11358425040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 16:00:21,809][15401] Updated weights for policy 0, policy_version 693262 (0.0028) [2024-06-24 16:00:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11358486528. Throughput: 0: 42859.0. Samples: 11358557660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 16:00:25,020][15401] Updated weights for policy 0, policy_version 693272 (0.0033) [2024-06-24 16:00:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11358683136. Throughput: 0: 42875.1. Samples: 11358818000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 16:00:29,352][15401] Updated weights for policy 0, policy_version 693282 (0.0040) [2024-06-24 16:00:32,741][15401] Updated weights for policy 0, policy_version 693292 (0.0030) [2024-06-24 16:00:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11358912512. Throughput: 0: 42942.6. Samples: 11359069800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:33,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 16:00:36,753][15401] Updated weights for policy 0, policy_version 693302 (0.0036) [2024-06-24 16:00:38,391][15132] Fps is (10 sec: 44230.2, 60 sec: 42597.4, 300 sec: 42931.4). Total num frames: 11359125504. Throughput: 0: 42889.7. Samples: 11359204660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:38,391][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 16:00:40,194][15401] Updated weights for policy 0, policy_version 693312 (0.0042) [2024-06-24 16:00:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11359322112. Throughput: 0: 42862.6. Samples: 11359460980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 16:00:44,658][15401] Updated weights for policy 0, policy_version 693322 (0.0034) [2024-06-24 16:00:47,752][15401] Updated weights for policy 0, policy_version 693332 (0.0045) [2024-06-24 16:00:48,389][15132] Fps is (10 sec: 44243.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11359567872. Throughput: 0: 42934.2. Samples: 11359714040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 16:00:52,137][15401] Updated weights for policy 0, policy_version 693342 (0.0024) [2024-06-24 16:00:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11359764480. Throughput: 0: 42904.9. Samples: 11359846840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 16:00:55,579][15401] Updated weights for policy 0, policy_version 693352 (0.0042) [2024-06-24 16:00:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.7). Total num frames: 11359977472. Throughput: 0: 42855.2. Samples: 11360100040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:00:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 16:00:59,816][15401] Updated weights for policy 0, policy_version 693362 (0.0028) [2024-06-24 16:01:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11360190464. Throughput: 0: 42932.8. Samples: 11360357020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:01:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 16:01:03,423][15401] Updated weights for policy 0, policy_version 693372 (0.0031) [2024-06-24 16:01:07,728][15401] Updated weights for policy 0, policy_version 693382 (0.0040) [2024-06-24 16:01:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11360387072. Throughput: 0: 42752.2. Samples: 11360481500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 16:01:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 16:01:11,481][15401] Updated weights for policy 0, policy_version 693392 (0.0028) [2024-06-24 16:01:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11360616448. Throughput: 0: 42744.7. Samples: 11360741620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:13,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 16:01:15,275][15401] Updated weights for policy 0, policy_version 693402 (0.0031) [2024-06-24 16:01:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 11360829440. Throughput: 0: 42794.7. Samples: 11360995560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 16:01:18,987][15401] Updated weights for policy 0, policy_version 693412 (0.0038) [2024-06-24 16:01:22,908][15401] Updated weights for policy 0, policy_version 693422 (0.0031) [2024-06-24 16:01:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11361042432. Throughput: 0: 42628.9. Samples: 11361122900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 16:01:26,378][15401] Updated weights for policy 0, policy_version 693432 (0.0029) [2024-06-24 16:01:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11361255424. Throughput: 0: 42687.7. Samples: 11361381920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 16:01:30,818][15401] Updated weights for policy 0, policy_version 693442 (0.0047) [2024-06-24 16:01:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 11361484800. Throughput: 0: 42805.4. Samples: 11361640280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 16:01:34,028][15401] Updated weights for policy 0, policy_version 693452 (0.0042) [2024-06-24 16:01:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42326.3, 300 sec: 42765.0). Total num frames: 11361665024. Throughput: 0: 42634.6. Samples: 11361765400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 16:01:38,500][15401] Updated weights for policy 0, policy_version 693462 (0.0033) [2024-06-24 16:01:41,681][15401] Updated weights for policy 0, policy_version 693472 (0.0043) [2024-06-24 16:01:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11361878016. Throughput: 0: 42601.3. Samples: 11362017100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 16:01:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693475_11361894400.pth... [2024-06-24 16:01:43,525][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000692848_11351621632.pth [2024-06-24 16:01:46,029][15401] Updated weights for policy 0, policy_version 693482 (0.0037) [2024-06-24 16:01:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42876.2). Total num frames: 11362107392. Throughput: 0: 42568.8. Samples: 11362272620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 16:01:50,045][15401] Updated weights for policy 0, policy_version 693492 (0.0043) [2024-06-24 16:01:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11362304000. Throughput: 0: 42608.5. Samples: 11362398880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 16:01:53,781][15401] Updated weights for policy 0, policy_version 693502 (0.0035) [2024-06-24 16:01:57,585][15401] Updated weights for policy 0, policy_version 693512 (0.0032) [2024-06-24 16:01:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11362533376. Throughput: 0: 42524.5. Samples: 11362655120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:01:58,393][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 16:02:01,730][15401] Updated weights for policy 0, policy_version 693522 (0.0025) [2024-06-24 16:02:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11362762752. Throughput: 0: 42514.3. Samples: 11362908700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 16:02:05,327][15401] Updated weights for policy 0, policy_version 693532 (0.0037) [2024-06-24 16:02:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11362959360. Throughput: 0: 42570.2. Samples: 11363038560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 16:02:09,281][15401] Updated weights for policy 0, policy_version 693542 (0.0036) [2024-06-24 16:02:10,822][15349] Signal inference workers to stop experience collection... (168100 times) [2024-06-24 16:02:10,871][15401] InferenceWorker_p0-w0: stopping experience collection (168100 times) [2024-06-24 16:02:10,878][15349] Signal inference workers to resume experience collection... (168100 times) [2024-06-24 16:02:10,891][15401] InferenceWorker_p0-w0: resuming experience collection (168100 times) [2024-06-24 16:02:12,987][15401] Updated weights for policy 0, policy_version 693552 (0.0042) [2024-06-24 16:02:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 11363172352. Throughput: 0: 42474.6. Samples: 11363293280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 16:02:16,976][15401] Updated weights for policy 0, policy_version 693562 (0.0034) [2024-06-24 16:02:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11363385344. Throughput: 0: 42317.2. Samples: 11363544560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:18,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 16:02:20,770][15401] Updated weights for policy 0, policy_version 693572 (0.0031) [2024-06-24 16:02:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11363581952. Throughput: 0: 42329.7. Samples: 11363670240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 16:02:24,457][15401] Updated weights for policy 0, policy_version 693582 (0.0028) [2024-06-24 16:02:28,337][15401] Updated weights for policy 0, policy_version 693592 (0.0033) [2024-06-24 16:02:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11363811328. Throughput: 0: 42493.8. Samples: 11363929320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 16:02:32,034][15401] Updated weights for policy 0, policy_version 693602 (0.0033) [2024-06-24 16:02:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11364024320. Throughput: 0: 42401.5. Samples: 11364180680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:33,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 16:02:35,891][15401] Updated weights for policy 0, policy_version 693612 (0.0038) [2024-06-24 16:02:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11364204544. Throughput: 0: 42260.4. Samples: 11364300600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 16:02:40,059][15401] Updated weights for policy 0, policy_version 693622 (0.0043) [2024-06-24 16:02:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11364450304. Throughput: 0: 42311.7. Samples: 11364559140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:43,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 16:02:43,638][15401] Updated weights for policy 0, policy_version 693632 (0.0048) [2024-06-24 16:02:48,179][15401] Updated weights for policy 0, policy_version 693642 (0.0037) [2024-06-24 16:02:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 11364630528. Throughput: 0: 42204.1. Samples: 11364807880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:02:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 16:02:51,747][15401] Updated weights for policy 0, policy_version 693652 (0.0033) [2024-06-24 16:02:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11364843520. Throughput: 0: 42047.6. Samples: 11364930700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:02:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 16:02:55,882][15401] Updated weights for policy 0, policy_version 693662 (0.0026) [2024-06-24 16:02:58,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11365089280. Throughput: 0: 42132.8. Samples: 11365189260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:02:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 16:02:59,273][15401] Updated weights for policy 0, policy_version 693672 (0.0026) [2024-06-24 16:03:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 11365269504. Throughput: 0: 42379.1. Samples: 11365451620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 16:03:03,630][15401] Updated weights for policy 0, policy_version 693682 (0.0032) [2024-06-24 16:03:07,124][15401] Updated weights for policy 0, policy_version 693692 (0.0035) [2024-06-24 16:03:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11365498880. Throughput: 0: 42225.8. Samples: 11365570400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 16:03:11,285][15401] Updated weights for policy 0, policy_version 693702 (0.0034) [2024-06-24 16:03:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11365728256. Throughput: 0: 42283.0. Samples: 11365832060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 16:03:14,876][15401] Updated weights for policy 0, policy_version 693712 (0.0032) [2024-06-24 16:03:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11365908480. Throughput: 0: 42536.4. Samples: 11366094820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 16:03:19,068][15401] Updated weights for policy 0, policy_version 693722 (0.0037) [2024-06-24 16:03:22,270][15401] Updated weights for policy 0, policy_version 693732 (0.0032) [2024-06-24 16:03:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11366154240. Throughput: 0: 42492.8. Samples: 11366212780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 16:03:26,776][15401] Updated weights for policy 0, policy_version 693742 (0.0046) [2024-06-24 16:03:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11366383616. Throughput: 0: 42715.1. Samples: 11366481320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 16:03:29,812][15401] Updated weights for policy 0, policy_version 693752 (0.0028) [2024-06-24 16:03:33,190][15349] Signal inference workers to stop experience collection... (168150 times) [2024-06-24 16:03:33,191][15349] Signal inference workers to resume experience collection... (168150 times) [2024-06-24 16:03:33,208][15401] InferenceWorker_p0-w0: stopping experience collection (168150 times) [2024-06-24 16:03:33,208][15401] InferenceWorker_p0-w0: resuming experience collection (168150 times) [2024-06-24 16:03:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11366563840. Throughput: 0: 42917.2. Samples: 11366739160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:33,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 16:03:34,239][15401] Updated weights for policy 0, policy_version 693762 (0.0030) [2024-06-24 16:03:37,573][15401] Updated weights for policy 0, policy_version 693772 (0.0031) [2024-06-24 16:03:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11366793216. Throughput: 0: 42931.0. Samples: 11366862600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 16:03:41,990][15401] Updated weights for policy 0, policy_version 693782 (0.0032) [2024-06-24 16:03:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11367022592. Throughput: 0: 43093.4. Samples: 11367128460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 16:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693788_11367022592.pth... [2024-06-24 16:03:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693162_11356766208.pth [2024-06-24 16:03:45,019][15401] Updated weights for policy 0, policy_version 693792 (0.0028) [2024-06-24 16:03:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11367186432. Throughput: 0: 42937.8. Samples: 11367383820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 16:03:49,635][15401] Updated weights for policy 0, policy_version 693802 (0.0035) [2024-06-24 16:03:52,918][15401] Updated weights for policy 0, policy_version 693812 (0.0022) [2024-06-24 16:03:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 11367448576. Throughput: 0: 42969.5. Samples: 11367504020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 16:03:57,087][15401] Updated weights for policy 0, policy_version 693822 (0.0042) [2024-06-24 16:03:58,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 11367645184. Throughput: 0: 43076.2. Samples: 11367770480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:03:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 16:04:00,458][15401] Updated weights for policy 0, policy_version 693832 (0.0026) [2024-06-24 16:04:03,394][15132] Fps is (10 sec: 39302.5, 60 sec: 42868.2, 300 sec: 42597.7). Total num frames: 11367841792. Throughput: 0: 42865.3. Samples: 11368023960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:03,395][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 16:04:04,717][15401] Updated weights for policy 0, policy_version 693842 (0.0042) [2024-06-24 16:04:07,867][15401] Updated weights for policy 0, policy_version 693852 (0.0032) [2024-06-24 16:04:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11368087552. Throughput: 0: 43062.2. Samples: 11368150580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 16:04:12,340][15401] Updated weights for policy 0, policy_version 693862 (0.0049) [2024-06-24 16:04:13,390][15132] Fps is (10 sec: 45896.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11368300544. Throughput: 0: 42972.3. Samples: 11368415080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 16:04:15,372][15401] Updated weights for policy 0, policy_version 693872 (0.0029) [2024-06-24 16:04:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11368480768. Throughput: 0: 42932.6. Samples: 11368671120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 16:04:19,908][15401] Updated weights for policy 0, policy_version 693882 (0.0029) [2024-06-24 16:04:23,154][15401] Updated weights for policy 0, policy_version 693892 (0.0030) [2024-06-24 16:04:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11368742912. Throughput: 0: 42913.7. Samples: 11368793720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 16:04:27,894][15401] Updated weights for policy 0, policy_version 693902 (0.0034) [2024-06-24 16:04:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 11368906752. Throughput: 0: 42842.0. Samples: 11369056340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 16:04:28,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-24 16:04:30,930][15401] Updated weights for policy 0, policy_version 693912 (0.0028) [2024-06-24 16:04:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11369136128. Throughput: 0: 42728.1. Samples: 11369306580. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 16:04:35,450][15401] Updated weights for policy 0, policy_version 693922 (0.0043) [2024-06-24 16:04:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11369365504. Throughput: 0: 42935.1. Samples: 11369436100. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 16:04:38,475][15401] Updated weights for policy 0, policy_version 693932 (0.0030) [2024-06-24 16:04:43,040][15401] Updated weights for policy 0, policy_version 693942 (0.0042) [2024-06-24 16:04:43,064][15349] Signal inference workers to stop experience collection... (168200 times) [2024-06-24 16:04:43,064][15349] Signal inference workers to resume experience collection... (168200 times) [2024-06-24 16:04:43,078][15401] InferenceWorker_p0-w0: stopping experience collection (168200 times) [2024-06-24 16:04:43,099][15401] InferenceWorker_p0-w0: resuming experience collection (168200 times) [2024-06-24 16:04:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 11369562112. Throughput: 0: 42722.0. Samples: 11369693080. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:43,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 16:04:46,045][15401] Updated weights for policy 0, policy_version 693952 (0.0039) [2024-06-24 16:04:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11369758720. Throughput: 0: 42763.6. Samples: 11369948120. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 16:04:50,648][15401] Updated weights for policy 0, policy_version 693962 (0.0033) [2024-06-24 16:04:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11370004480. Throughput: 0: 42776.1. Samples: 11370075500. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 16:04:53,781][15401] Updated weights for policy 0, policy_version 693972 (0.0035) [2024-06-24 16:04:58,150][15401] Updated weights for policy 0, policy_version 693982 (0.0033) [2024-06-24 16:04:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11370217472. Throughput: 0: 42601.9. Samples: 11370332160. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:04:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 16:05:01,554][15401] Updated weights for policy 0, policy_version 693992 (0.0019) [2024-06-24 16:05:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42601.8, 300 sec: 42598.4). Total num frames: 11370397696. Throughput: 0: 42467.1. Samples: 11370582140. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 16:05:06,263][15401] Updated weights for policy 0, policy_version 694002 (0.0033) [2024-06-24 16:05:08,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 11370643456. Throughput: 0: 42564.5. Samples: 11370709140. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:08,391][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 16:05:09,461][15401] Updated weights for policy 0, policy_version 694012 (0.0038) [2024-06-24 16:05:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 11370823680. Throughput: 0: 42434.2. Samples: 11370965880. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 16:05:13,884][15401] Updated weights for policy 0, policy_version 694022 (0.0038) [2024-06-24 16:05:17,159][15401] Updated weights for policy 0, policy_version 694032 (0.0027) [2024-06-24 16:05:18,390][15132] Fps is (10 sec: 40962.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11371053056. Throughput: 0: 42415.1. Samples: 11371215260. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 16:05:21,555][15401] Updated weights for policy 0, policy_version 694042 (0.0029) [2024-06-24 16:05:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11371282432. Throughput: 0: 42531.0. Samples: 11371350000. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 16:05:24,765][15401] Updated weights for policy 0, policy_version 694052 (0.0035) [2024-06-24 16:05:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11371462656. Throughput: 0: 42513.8. Samples: 11371606100. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 16:05:29,058][15401] Updated weights for policy 0, policy_version 694062 (0.0035) [2024-06-24 16:05:32,869][15401] Updated weights for policy 0, policy_version 694072 (0.0034) [2024-06-24 16:05:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 11371692032. Throughput: 0: 42539.5. Samples: 11371862400. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 16:05:36,929][15401] Updated weights for policy 0, policy_version 694082 (0.0031) [2024-06-24 16:05:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11371921408. Throughput: 0: 42624.1. Samples: 11371993580. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 16:05:40,388][15401] Updated weights for policy 0, policy_version 694092 (0.0029) [2024-06-24 16:05:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 11372101632. Throughput: 0: 42520.9. Samples: 11372245600. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 16:05:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694099_11372118016.pth... [2024-06-24 16:05:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693475_11361894400.pth [2024-06-24 16:05:44,500][15401] Updated weights for policy 0, policy_version 694102 (0.0034) [2024-06-24 16:05:47,934][15401] Updated weights for policy 0, policy_version 694112 (0.0042) [2024-06-24 16:05:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11372331008. Throughput: 0: 42543.8. Samples: 11372496620. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 16:05:52,103][15401] Updated weights for policy 0, policy_version 694122 (0.0039) [2024-06-24 16:05:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11372560384. Throughput: 0: 42703.5. Samples: 11372630780. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:53,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 16:05:55,567][15401] Updated weights for policy 0, policy_version 694132 (0.0040) [2024-06-24 16:05:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11372740608. Throughput: 0: 42630.3. Samples: 11372884240. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:05:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 16:05:59,822][15401] Updated weights for policy 0, policy_version 694142 (0.0036) [2024-06-24 16:06:03,203][15401] Updated weights for policy 0, policy_version 694152 (0.0042) [2024-06-24 16:06:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11372986368. Throughput: 0: 42775.1. Samples: 11373140140. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:06:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 16:06:06,455][15349] Signal inference workers to stop experience collection... (168250 times) [2024-06-24 16:06:06,492][15401] InferenceWorker_p0-w0: stopping experience collection (168250 times) [2024-06-24 16:06:06,501][15349] Signal inference workers to resume experience collection... (168250 times) [2024-06-24 16:06:06,514][15401] InferenceWorker_p0-w0: resuming experience collection (168250 times) [2024-06-24 16:06:07,461][15401] Updated weights for policy 0, policy_version 694162 (0.0037) [2024-06-24 16:06:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.8, 300 sec: 42654.3). Total num frames: 11373199360. Throughput: 0: 42688.5. Samples: 11373270980. Policy #0 lag: (min: 2.0, avg: 11.1, max: 25.0) [2024-06-24 16:06:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 16:06:10,869][15401] Updated weights for policy 0, policy_version 694172 (0.0037) [2024-06-24 16:06:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11373395968. Throughput: 0: 42743.2. Samples: 11373529540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 16:06:15,100][15401] Updated weights for policy 0, policy_version 694182 (0.0025) [2024-06-24 16:06:18,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 11373625344. Throughput: 0: 42574.4. Samples: 11373778520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:18,397][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 16:06:19,072][15401] Updated weights for policy 0, policy_version 694192 (0.0035) [2024-06-24 16:06:22,759][15401] Updated weights for policy 0, policy_version 694202 (0.0041) [2024-06-24 16:06:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11373821952. Throughput: 0: 42556.4. Samples: 11373908620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:23,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 16:06:26,628][15401] Updated weights for policy 0, policy_version 694212 (0.0036) [2024-06-24 16:06:28,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 11374034944. Throughput: 0: 42676.8. Samples: 11374166060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 16:06:30,642][15401] Updated weights for policy 0, policy_version 694222 (0.0032) [2024-06-24 16:06:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11374264320. Throughput: 0: 42610.2. Samples: 11374414080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 16:06:34,660][15401] Updated weights for policy 0, policy_version 694232 (0.0046) [2024-06-24 16:06:38,246][15401] Updated weights for policy 0, policy_version 694242 (0.0036) [2024-06-24 16:06:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11374477312. Throughput: 0: 42709.9. Samples: 11374552720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 16:06:42,144][15401] Updated weights for policy 0, policy_version 694252 (0.0024) [2024-06-24 16:06:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11374690304. Throughput: 0: 42887.8. Samples: 11374814200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 16:06:45,823][15401] Updated weights for policy 0, policy_version 694262 (0.0025) [2024-06-24 16:06:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11374919680. Throughput: 0: 42679.5. Samples: 11375060720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:48,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 16:06:49,619][15401] Updated weights for policy 0, policy_version 694272 (0.0033) [2024-06-24 16:06:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11375099904. Throughput: 0: 42695.9. Samples: 11375192300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 16:06:53,424][15401] Updated weights for policy 0, policy_version 694282 (0.0037) [2024-06-24 16:06:57,099][15401] Updated weights for policy 0, policy_version 694292 (0.0029) [2024-06-24 16:06:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11375329280. Throughput: 0: 42683.3. Samples: 11375450280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:06:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 16:07:00,917][15401] Updated weights for policy 0, policy_version 694302 (0.0038) [2024-06-24 16:07:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11375558656. Throughput: 0: 42783.5. Samples: 11375703500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 16:07:05,064][15401] Updated weights for policy 0, policy_version 694312 (0.0041) [2024-06-24 16:07:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11375738880. Throughput: 0: 42726.9. Samples: 11375831340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:08,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 16:07:09,029][15401] Updated weights for policy 0, policy_version 694322 (0.0028) [2024-06-24 16:07:12,520][15401] Updated weights for policy 0, policy_version 694332 (0.0032) [2024-06-24 16:07:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11375984640. Throughput: 0: 42875.1. Samples: 11376095440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 16:07:16,278][15349] Signal inference workers to stop experience collection... (168300 times) [2024-06-24 16:07:16,280][15349] Signal inference workers to resume experience collection... (168300 times) [2024-06-24 16:07:16,332][15401] InferenceWorker_p0-w0: stopping experience collection (168300 times) [2024-06-24 16:07:16,332][15401] InferenceWorker_p0-w0: resuming experience collection (168300 times) [2024-06-24 16:07:16,427][15401] Updated weights for policy 0, policy_version 694342 (0.0031) [2024-06-24 16:07:18,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42874.3, 300 sec: 42764.7). Total num frames: 11376197632. Throughput: 0: 43010.3. Samples: 11376349640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:18,392][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 16:07:20,049][15401] Updated weights for policy 0, policy_version 694352 (0.0040) [2024-06-24 16:07:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11376394240. Throughput: 0: 42925.3. Samples: 11376484360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 16:07:23,858][15401] Updated weights for policy 0, policy_version 694362 (0.0035) [2024-06-24 16:07:27,433][15401] Updated weights for policy 0, policy_version 694372 (0.0044) [2024-06-24 16:07:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11376623616. Throughput: 0: 42836.2. Samples: 11376741820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 16:07:31,613][15401] Updated weights for policy 0, policy_version 694382 (0.0044) [2024-06-24 16:07:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 11376836608. Throughput: 0: 43100.5. Samples: 11377000240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 16:07:34,806][15401] Updated weights for policy 0, policy_version 694392 (0.0027) [2024-06-24 16:07:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11377033216. Throughput: 0: 43001.0. Samples: 11377127340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 16:07:39,433][15401] Updated weights for policy 0, policy_version 694402 (0.0026) [2024-06-24 16:07:42,372][15401] Updated weights for policy 0, policy_version 694412 (0.0030) [2024-06-24 16:07:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 11377295360. Throughput: 0: 43066.5. Samples: 11377388280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 16:07:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694415_11377295360.pth... [2024-06-24 16:07:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000693788_11367022592.pth [2024-06-24 16:07:47,060][15401] Updated weights for policy 0, policy_version 694422 (0.0030) [2024-06-24 16:07:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11377475584. Throughput: 0: 43281.7. Samples: 11377651180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 16:07:48,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 16:07:50,153][15401] Updated weights for policy 0, policy_version 694432 (0.0042) [2024-06-24 16:07:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11377688576. Throughput: 0: 43185.9. Samples: 11377774700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:07:53,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-24 16:07:54,547][15401] Updated weights for policy 0, policy_version 694442 (0.0029) [2024-06-24 16:07:57,729][15401] Updated weights for policy 0, policy_version 694452 (0.0047) [2024-06-24 16:07:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11377934336. Throughput: 0: 43114.3. Samples: 11378035580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:07:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 16:08:01,936][15401] Updated weights for policy 0, policy_version 694462 (0.0038) [2024-06-24 16:08:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11378130944. Throughput: 0: 43383.7. Samples: 11378301800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 16:08:05,325][15401] Updated weights for policy 0, policy_version 694472 (0.0054) [2024-06-24 16:08:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 11378360320. Throughput: 0: 43095.4. Samples: 11378423660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 16:08:09,857][15401] Updated weights for policy 0, policy_version 694482 (0.0029) [2024-06-24 16:08:12,952][15401] Updated weights for policy 0, policy_version 694492 (0.0041) [2024-06-24 16:08:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 11378573312. Throughput: 0: 43166.2. Samples: 11378684300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 16:08:17,543][15401] Updated weights for policy 0, policy_version 694502 (0.0040) [2024-06-24 16:08:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11378769920. Throughput: 0: 43175.5. Samples: 11378943140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 16:08:20,638][15401] Updated weights for policy 0, policy_version 694512 (0.0029) [2024-06-24 16:08:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11378982912. Throughput: 0: 43022.3. Samples: 11379063340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 16:08:25,207][15401] Updated weights for policy 0, policy_version 694522 (0.0025) [2024-06-24 16:08:25,516][15349] Signal inference workers to stop experience collection... (168350 times) [2024-06-24 16:08:25,560][15401] InferenceWorker_p0-w0: stopping experience collection (168350 times) [2024-06-24 16:08:25,573][15349] Signal inference workers to resume experience collection... (168350 times) [2024-06-24 16:08:25,582][15401] InferenceWorker_p0-w0: resuming experience collection (168350 times) [2024-06-24 16:08:28,183][15401] Updated weights for policy 0, policy_version 694532 (0.0033) [2024-06-24 16:08:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11379228672. Throughput: 0: 43069.4. Samples: 11379326400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 16:08:32,733][15401] Updated weights for policy 0, policy_version 694542 (0.0040) [2024-06-24 16:08:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11379425280. Throughput: 0: 43111.1. Samples: 11379591180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 16:08:35,674][15401] Updated weights for policy 0, policy_version 694552 (0.0029) [2024-06-24 16:08:38,393][15132] Fps is (10 sec: 40944.3, 60 sec: 43414.8, 300 sec: 42764.5). Total num frames: 11379638272. Throughput: 0: 42979.1. Samples: 11379708920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:38,394][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 16:08:40,423][15401] Updated weights for policy 0, policy_version 694562 (0.0046) [2024-06-24 16:08:43,143][15401] Updated weights for policy 0, policy_version 694572 (0.0033) [2024-06-24 16:08:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11379884032. Throughput: 0: 43031.1. Samples: 11379971980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 16:08:47,974][15401] Updated weights for policy 0, policy_version 694582 (0.0042) [2024-06-24 16:08:48,389][15132] Fps is (10 sec: 42614.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11380064256. Throughput: 0: 42942.3. Samples: 11380234200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 16:08:50,725][15401] Updated weights for policy 0, policy_version 694592 (0.0030) [2024-06-24 16:08:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11380277248. Throughput: 0: 42957.4. Samples: 11380356740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 16:08:55,839][15401] Updated weights for policy 0, policy_version 694602 (0.0033) [2024-06-24 16:08:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42932.3). Total num frames: 11380506624. Throughput: 0: 42982.7. Samples: 11380618520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:08:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 16:08:58,418][15401] Updated weights for policy 0, policy_version 694612 (0.0031) [2024-06-24 16:09:03,388][15401] Updated weights for policy 0, policy_version 694622 (0.0042) [2024-06-24 16:09:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11380686848. Throughput: 0: 43101.8. Samples: 11380882720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:09:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 16:09:06,016][15401] Updated weights for policy 0, policy_version 694632 (0.0043) [2024-06-24 16:09:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11380916224. Throughput: 0: 42959.5. Samples: 11380996520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:09:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 16:09:10,968][15401] Updated weights for policy 0, policy_version 694642 (0.0028) [2024-06-24 16:09:13,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11381161984. Throughput: 0: 42979.0. Samples: 11381260460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:09:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 16:09:13,677][15401] Updated weights for policy 0, policy_version 694652 (0.0039) [2024-06-24 16:09:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11381325824. Throughput: 0: 42866.8. Samples: 11381520180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:09:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 16:09:18,625][15401] Updated weights for policy 0, policy_version 694662 (0.0042) [2024-06-24 16:09:21,416][15401] Updated weights for policy 0, policy_version 694672 (0.0033) [2024-06-24 16:09:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11381571584. Throughput: 0: 42933.8. Samples: 11381640780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 16:09:23,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 16:09:26,336][15401] Updated weights for policy 0, policy_version 694682 (0.0034) [2024-06-24 16:09:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11381784576. Throughput: 0: 42945.9. Samples: 11381904540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 16:09:29,298][15401] Updated weights for policy 0, policy_version 694692 (0.0031) [2024-06-24 16:09:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11381981184. Throughput: 0: 42740.3. Samples: 11382157520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 16:09:34,081][15401] Updated weights for policy 0, policy_version 694702 (0.0030) [2024-06-24 16:09:36,861][15401] Updated weights for policy 0, policy_version 694712 (0.0031) [2024-06-24 16:09:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.2, 300 sec: 42876.5). Total num frames: 11382210560. Throughput: 0: 42724.5. Samples: 11382279340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 16:09:41,794][15401] Updated weights for policy 0, policy_version 694722 (0.0030) [2024-06-24 16:09:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 11382423552. Throughput: 0: 42865.2. Samples: 11382547460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 16:09:43,441][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694729_11382439936.pth... [2024-06-24 16:09:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694099_11372118016.pth [2024-06-24 16:09:44,358][15401] Updated weights for policy 0, policy_version 694732 (0.0028) [2024-06-24 16:09:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11382620160. Throughput: 0: 42601.8. Samples: 11382799800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 16:09:49,661][15401] Updated weights for policy 0, policy_version 694742 (0.0051) [2024-06-24 16:09:50,653][15349] Signal inference workers to stop experience collection... (168400 times) [2024-06-24 16:09:50,654][15349] Signal inference workers to resume experience collection... (168400 times) [2024-06-24 16:09:50,686][15401] InferenceWorker_p0-w0: stopping experience collection (168400 times) [2024-06-24 16:09:50,686][15401] InferenceWorker_p0-w0: resuming experience collection (168400 times) [2024-06-24 16:09:52,199][15401] Updated weights for policy 0, policy_version 694752 (0.0036) [2024-06-24 16:09:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11382882304. Throughput: 0: 42828.9. Samples: 11382923820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 16:09:57,270][15401] Updated weights for policy 0, policy_version 694762 (0.0033) [2024-06-24 16:09:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11383046144. Throughput: 0: 43009.4. Samples: 11383195880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:09:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 16:09:59,808][15401] Updated weights for policy 0, policy_version 694772 (0.0038) [2024-06-24 16:10:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11383275520. Throughput: 0: 42807.9. Samples: 11383446540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 16:10:04,823][15401] Updated weights for policy 0, policy_version 694782 (0.0031) [2024-06-24 16:10:07,373][15401] Updated weights for policy 0, policy_version 694792 (0.0032) [2024-06-24 16:10:08,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 11383537664. Throughput: 0: 42955.1. Samples: 11383573760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 16:10:12,347][15401] Updated weights for policy 0, policy_version 694802 (0.0037) [2024-06-24 16:10:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 11383701504. Throughput: 0: 43030.6. Samples: 11383841020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:13,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 16:10:14,964][15401] Updated weights for policy 0, policy_version 694812 (0.0033) [2024-06-24 16:10:18,390][15132] Fps is (10 sec: 37682.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11383914496. Throughput: 0: 42922.2. Samples: 11384089020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 16:10:19,871][15401] Updated weights for policy 0, policy_version 694822 (0.0034) [2024-06-24 16:10:22,705][15401] Updated weights for policy 0, policy_version 694832 (0.0035) [2024-06-24 16:10:23,390][15132] Fps is (10 sec: 47524.8, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 11384176640. Throughput: 0: 43083.5. Samples: 11384218100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 16:10:27,380][15401] Updated weights for policy 0, policy_version 694842 (0.0049) [2024-06-24 16:10:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11384340480. Throughput: 0: 42889.0. Samples: 11384477460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:28,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 16:10:30,221][15401] Updated weights for policy 0, policy_version 694852 (0.0038) [2024-06-24 16:10:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 11384553472. Throughput: 0: 42918.3. Samples: 11384731120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 16:10:35,018][15401] Updated weights for policy 0, policy_version 694862 (0.0040) [2024-06-24 16:10:38,003][15401] Updated weights for policy 0, policy_version 694872 (0.0036) [2024-06-24 16:10:38,390][15132] Fps is (10 sec: 47512.3, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 11384815616. Throughput: 0: 43046.9. Samples: 11384860940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 16:10:42,753][15401] Updated weights for policy 0, policy_version 694882 (0.0038) [2024-06-24 16:10:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11384963072. Throughput: 0: 42759.5. Samples: 11385120060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 16:10:45,595][15401] Updated weights for policy 0, policy_version 694892 (0.0034) [2024-06-24 16:10:48,390][15132] Fps is (10 sec: 39322.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11385208832. Throughput: 0: 42632.9. Samples: 11385365020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 16:10:50,304][15401] Updated weights for policy 0, policy_version 694902 (0.0037) [2024-06-24 16:10:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 11385421824. Throughput: 0: 42792.9. Samples: 11385499440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 16:10:53,440][15401] Updated weights for policy 0, policy_version 694912 (0.0027) [2024-06-24 16:10:57,905][15401] Updated weights for policy 0, policy_version 694922 (0.0040) [2024-06-24 16:10:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11385602048. Throughput: 0: 42548.5. Samples: 11385755600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:10:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 16:11:01,271][15401] Updated weights for policy 0, policy_version 694932 (0.0030) [2024-06-24 16:11:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11385831424. Throughput: 0: 42610.8. Samples: 11386006500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-24 16:11:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 16:11:05,604][15401] Updated weights for policy 0, policy_version 694942 (0.0041) [2024-06-24 16:11:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 11386060800. Throughput: 0: 42639.6. Samples: 11386136880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 16:11:08,960][15401] Updated weights for policy 0, policy_version 694952 (0.0046) [2024-06-24 16:11:13,326][15401] Updated weights for policy 0, policy_version 694962 (0.0034) [2024-06-24 16:11:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42821.5). Total num frames: 11386257408. Throughput: 0: 42493.1. Samples: 11386389660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 16:11:15,798][15349] Signal inference workers to stop experience collection... (168450 times) [2024-06-24 16:11:15,804][15349] Signal inference workers to resume experience collection... (168450 times) [2024-06-24 16:11:15,811][15401] InferenceWorker_p0-w0: stopping experience collection (168450 times) [2024-06-24 16:11:15,836][15401] InferenceWorker_p0-w0: resuming experience collection (168450 times) [2024-06-24 16:11:16,679][15401] Updated weights for policy 0, policy_version 694972 (0.0042) [2024-06-24 16:11:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 11386486784. Throughput: 0: 42467.1. Samples: 11386642140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:18,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 16:11:20,930][15401] Updated weights for policy 0, policy_version 694982 (0.0036) [2024-06-24 16:11:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42931.6). Total num frames: 11386699776. Throughput: 0: 42515.7. Samples: 11386774140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 16:11:24,619][15401] Updated weights for policy 0, policy_version 694992 (0.0031) [2024-06-24 16:11:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11386896384. Throughput: 0: 42590.3. Samples: 11387036620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 16:11:28,412][15401] Updated weights for policy 0, policy_version 695002 (0.0037) [2024-06-24 16:11:32,206][15401] Updated weights for policy 0, policy_version 695012 (0.0029) [2024-06-24 16:11:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11387142144. Throughput: 0: 42666.6. Samples: 11387285020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 16:11:36,220][15401] Updated weights for policy 0, policy_version 695022 (0.0032) [2024-06-24 16:11:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42323.8, 300 sec: 42931.3). Total num frames: 11387355136. Throughput: 0: 42675.5. Samples: 11387419940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:38,392][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 16:11:39,750][15401] Updated weights for policy 0, policy_version 695032 (0.0028) [2024-06-24 16:11:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11387551744. Throughput: 0: 42841.7. Samples: 11387683480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 16:11:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695041_11387551744.pth... [2024-06-24 16:11:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694415_11377295360.pth [2024-06-24 16:11:43,858][15401] Updated weights for policy 0, policy_version 695042 (0.0038) [2024-06-24 16:11:47,182][15401] Updated weights for policy 0, policy_version 695052 (0.0050) [2024-06-24 16:11:48,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 11387781120. Throughput: 0: 42716.8. Samples: 11387928760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 16:11:51,459][15401] Updated weights for policy 0, policy_version 695062 (0.0037) [2024-06-24 16:11:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11387994112. Throughput: 0: 42823.9. Samples: 11388063960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 16:11:55,090][15401] Updated weights for policy 0, policy_version 695072 (0.0032) [2024-06-24 16:11:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11388190720. Throughput: 0: 42765.9. Samples: 11388314120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:11:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 16:11:59,477][15401] Updated weights for policy 0, policy_version 695082 (0.0031) [2024-06-24 16:12:03,178][15401] Updated weights for policy 0, policy_version 695092 (0.0036) [2024-06-24 16:12:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11388403712. Throughput: 0: 42907.1. Samples: 11388572960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:03,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 16:12:06,838][15401] Updated weights for policy 0, policy_version 695102 (0.0039) [2024-06-24 16:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11388633088. Throughput: 0: 42829.8. Samples: 11388701480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 16:12:10,702][15401] Updated weights for policy 0, policy_version 695112 (0.0028) [2024-06-24 16:12:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 11388813312. Throughput: 0: 42747.5. Samples: 11388960260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 16:12:14,530][15401] Updated weights for policy 0, policy_version 695122 (0.0042) [2024-06-24 16:12:18,223][15401] Updated weights for policy 0, policy_version 695132 (0.0025) [2024-06-24 16:12:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11389059072. Throughput: 0: 42876.5. Samples: 11389214460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:18,396][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 16:12:22,023][15401] Updated weights for policy 0, policy_version 695142 (0.0035) [2024-06-24 16:12:22,043][15349] Signal inference workers to stop experience collection... (168500 times) [2024-06-24 16:12:22,048][15349] Signal inference workers to resume experience collection... (168500 times) [2024-06-24 16:12:22,082][15401] InferenceWorker_p0-w0: stopping experience collection (168500 times) [2024-06-24 16:12:22,082][15401] InferenceWorker_p0-w0: resuming experience collection (168500 times) [2024-06-24 16:12:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11389272064. Throughput: 0: 42785.8. Samples: 11389345200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 16:12:25,707][15401] Updated weights for policy 0, policy_version 695152 (0.0036) [2024-06-24 16:12:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11389468672. Throughput: 0: 42595.7. Samples: 11389600280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 16:12:29,575][15401] Updated weights for policy 0, policy_version 695162 (0.0031) [2024-06-24 16:12:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11389681664. Throughput: 0: 42825.8. Samples: 11389855920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 16:12:33,538][15401] Updated weights for policy 0, policy_version 695172 (0.0039) [2024-06-24 16:12:37,135][15401] Updated weights for policy 0, policy_version 695182 (0.0032) [2024-06-24 16:12:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 11389927424. Throughput: 0: 42793.0. Samples: 11389989640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 16:12:41,148][15401] Updated weights for policy 0, policy_version 695192 (0.0040) [2024-06-24 16:12:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11390091264. Throughput: 0: 42855.1. Samples: 11390242600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 16:12:44,934][15401] Updated weights for policy 0, policy_version 695202 (0.0033) [2024-06-24 16:12:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11390337024. Throughput: 0: 42705.0. Samples: 11390494680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 16:12:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 16:12:48,709][15401] Updated weights for policy 0, policy_version 695212 (0.0052) [2024-06-24 16:12:52,415][15401] Updated weights for policy 0, policy_version 695222 (0.0034) [2024-06-24 16:12:53,392][15132] Fps is (10 sec: 49140.2, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 11390582784. Throughput: 0: 42940.9. Samples: 11390633920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:12:53,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 16:12:56,457][15401] Updated weights for policy 0, policy_version 695232 (0.0029) [2024-06-24 16:12:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11390746624. Throughput: 0: 42823.1. Samples: 11390887300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:12:58,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 16:13:00,009][15401] Updated weights for policy 0, policy_version 695242 (0.0046) [2024-06-24 16:13:03,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11390976000. Throughput: 0: 42659.1. Samples: 11391134120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 16:13:04,770][15401] Updated weights for policy 0, policy_version 695252 (0.0040) [2024-06-24 16:13:07,664][15401] Updated weights for policy 0, policy_version 695262 (0.0034) [2024-06-24 16:13:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11391221760. Throughput: 0: 42768.4. Samples: 11391269780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 16:13:12,364][15401] Updated weights for policy 0, policy_version 695272 (0.0037) [2024-06-24 16:13:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11391385600. Throughput: 0: 42946.1. Samples: 11391532860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 16:13:15,243][15401] Updated weights for policy 0, policy_version 695282 (0.0031) [2024-06-24 16:13:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11391614976. Throughput: 0: 42767.2. Samples: 11391780440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:18,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 16:13:19,993][15401] Updated weights for policy 0, policy_version 695292 (0.0034) [2024-06-24 16:13:22,950][15401] Updated weights for policy 0, policy_version 695302 (0.0038) [2024-06-24 16:13:23,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11391844352. Throughput: 0: 42656.0. Samples: 11391909160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:23,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 16:13:27,538][15401] Updated weights for policy 0, policy_version 695312 (0.0038) [2024-06-24 16:13:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11392040960. Throughput: 0: 42757.3. Samples: 11392166780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:28,393][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 16:13:30,719][15401] Updated weights for policy 0, policy_version 695322 (0.0035) [2024-06-24 16:13:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.9, 300 sec: 42765.2). Total num frames: 11392253952. Throughput: 0: 42646.7. Samples: 11392413880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:33,401][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 16:13:35,142][15401] Updated weights for policy 0, policy_version 695332 (0.0035) [2024-06-24 16:13:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11392450560. Throughput: 0: 42401.4. Samples: 11392541880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 16:13:38,727][15401] Updated weights for policy 0, policy_version 695342 (0.0035) [2024-06-24 16:13:41,491][15349] Signal inference workers to stop experience collection... (168550 times) [2024-06-24 16:13:41,491][15349] Signal inference workers to resume experience collection... (168550 times) [2024-06-24 16:13:41,540][15401] InferenceWorker_p0-w0: stopping experience collection (168550 times) [2024-06-24 16:13:41,541][15401] InferenceWorker_p0-w0: resuming experience collection (168550 times) [2024-06-24 16:13:42,797][15401] Updated weights for policy 0, policy_version 695352 (0.0042) [2024-06-24 16:13:43,389][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11392679936. Throughput: 0: 42508.1. Samples: 11392800160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 16:13:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695354_11392679936.pth... [2024-06-24 16:13:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000694729_11382439936.pth [2024-06-24 16:13:46,457][15401] Updated weights for policy 0, policy_version 695362 (0.0048) [2024-06-24 16:13:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11392892928. Throughput: 0: 42564.0. Samples: 11393049500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 16:13:50,564][15401] Updated weights for policy 0, policy_version 695372 (0.0037) [2024-06-24 16:13:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41780.9, 300 sec: 42653.9). Total num frames: 11393089536. Throughput: 0: 42422.7. Samples: 11393178800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 16:13:54,120][15401] Updated weights for policy 0, policy_version 695382 (0.0037) [2024-06-24 16:13:58,103][15401] Updated weights for policy 0, policy_version 695392 (0.0042) [2024-06-24 16:13:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11393302528. Throughput: 0: 42232.1. Samples: 11393433300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:13:58,391][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 16:14:01,947][15401] Updated weights for policy 0, policy_version 695402 (0.0031) [2024-06-24 16:14:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11393531904. Throughput: 0: 42340.3. Samples: 11393685760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 16:14:06,052][15401] Updated weights for policy 0, policy_version 695412 (0.0035) [2024-06-24 16:14:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 11393728512. Throughput: 0: 42503.5. Samples: 11393821820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 16:14:09,865][15401] Updated weights for policy 0, policy_version 695422 (0.0032) [2024-06-24 16:14:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11393941504. Throughput: 0: 42418.6. Samples: 11394075520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:13,400][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 16:14:13,554][15401] Updated weights for policy 0, policy_version 695432 (0.0031) [2024-06-24 16:14:17,454][15401] Updated weights for policy 0, policy_version 695442 (0.0040) [2024-06-24 16:14:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11394170880. Throughput: 0: 42718.7. Samples: 11394336120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 16:14:21,020][15401] Updated weights for policy 0, policy_version 695452 (0.0025) [2024-06-24 16:14:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11394383872. Throughput: 0: 42746.6. Samples: 11394465480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 16:14:25,148][15401] Updated weights for policy 0, policy_version 695462 (0.0035) [2024-06-24 16:14:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11394596864. Throughput: 0: 42639.0. Samples: 11394718920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:14:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 16:14:28,834][15401] Updated weights for policy 0, policy_version 695472 (0.0036) [2024-06-24 16:14:32,703][15401] Updated weights for policy 0, policy_version 695482 (0.0032) [2024-06-24 16:14:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11394826240. Throughput: 0: 42923.2. Samples: 11394981040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 16:14:36,608][15401] Updated weights for policy 0, policy_version 695492 (0.0023) [2024-06-24 16:14:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11395039232. Throughput: 0: 43022.7. Samples: 11395114820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 16:14:40,576][15401] Updated weights for policy 0, policy_version 695502 (0.0023) [2024-06-24 16:14:43,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 11395235840. Throughput: 0: 42964.3. Samples: 11395366700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:43,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 16:14:44,077][15401] Updated weights for policy 0, policy_version 695512 (0.0034) [2024-06-24 16:14:48,299][15401] Updated weights for policy 0, policy_version 695522 (0.0036) [2024-06-24 16:14:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11395432448. Throughput: 0: 43135.6. Samples: 11395626860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 16:14:51,794][15401] Updated weights for policy 0, policy_version 695532 (0.0023) [2024-06-24 16:14:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11395694592. Throughput: 0: 42849.2. Samples: 11395750040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 16:14:55,976][15401] Updated weights for policy 0, policy_version 695542 (0.0037) [2024-06-24 16:14:56,160][15349] Signal inference workers to stop experience collection... (168600 times) [2024-06-24 16:14:56,192][15401] InferenceWorker_p0-w0: stopping experience collection (168600 times) [2024-06-24 16:14:56,268][15349] Signal inference workers to resume experience collection... (168600 times) [2024-06-24 16:14:56,268][15401] InferenceWorker_p0-w0: resuming experience collection (168600 times) [2024-06-24 16:14:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11395891200. Throughput: 0: 42937.9. Samples: 11396007720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:14:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 16:14:59,275][15401] Updated weights for policy 0, policy_version 695552 (0.0034) [2024-06-24 16:15:03,358][15401] Updated weights for policy 0, policy_version 695562 (0.0035) [2024-06-24 16:15:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11396087808. Throughput: 0: 43105.3. Samples: 11396275860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 16:15:07,051][15401] Updated weights for policy 0, policy_version 695572 (0.0041) [2024-06-24 16:15:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 42876.5). Total num frames: 11396349952. Throughput: 0: 42936.4. Samples: 11396397620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 16:15:10,883][15401] Updated weights for policy 0, policy_version 695582 (0.0032) [2024-06-24 16:15:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11396530176. Throughput: 0: 43124.0. Samples: 11396659500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 16:15:14,485][15401] Updated weights for policy 0, policy_version 695592 (0.0047) [2024-06-24 16:15:18,259][15401] Updated weights for policy 0, policy_version 695602 (0.0038) [2024-06-24 16:15:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11396743168. Throughput: 0: 43202.2. Samples: 11396925140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:18,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 16:15:22,023][15401] Updated weights for policy 0, policy_version 695612 (0.0034) [2024-06-24 16:15:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11396988928. Throughput: 0: 43054.2. Samples: 11397052260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 16:15:25,791][15401] Updated weights for policy 0, policy_version 695622 (0.0031) [2024-06-24 16:15:28,391][15132] Fps is (10 sec: 42591.0, 60 sec: 42870.3, 300 sec: 42764.8). Total num frames: 11397169152. Throughput: 0: 43127.0. Samples: 11397307480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:28,392][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 16:15:29,674][15401] Updated weights for policy 0, policy_version 695632 (0.0036) [2024-06-24 16:15:33,300][15401] Updated weights for policy 0, policy_version 695642 (0.0024) [2024-06-24 16:15:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11397398528. Throughput: 0: 43066.7. Samples: 11397564860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 16:15:37,280][15401] Updated weights for policy 0, policy_version 695652 (0.0027) [2024-06-24 16:15:38,389][15132] Fps is (10 sec: 44244.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11397611520. Throughput: 0: 43265.1. Samples: 11397696960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 16:15:40,930][15401] Updated weights for policy 0, policy_version 695662 (0.0038) [2024-06-24 16:15:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11397808128. Throughput: 0: 43060.9. Samples: 11397945460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 16:15:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695668_11397824512.pth... [2024-06-24 16:15:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695041_11387551744.pth [2024-06-24 16:15:44,846][15401] Updated weights for policy 0, policy_version 695672 (0.0023) [2024-06-24 16:15:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11398037504. Throughput: 0: 42798.6. Samples: 11398201800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 16:15:48,738][15401] Updated weights for policy 0, policy_version 695682 (0.0040) [2024-06-24 16:15:52,486][15401] Updated weights for policy 0, policy_version 695692 (0.0028) [2024-06-24 16:15:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11398250496. Throughput: 0: 43112.8. Samples: 11398337700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:53,394][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 16:15:56,189][15401] Updated weights for policy 0, policy_version 695702 (0.0041) [2024-06-24 16:15:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11398447104. Throughput: 0: 42961.8. Samples: 11398592780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:15:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 16:15:59,885][15401] Updated weights for policy 0, policy_version 695712 (0.0027) [2024-06-24 16:16:01,848][15349] Signal inference workers to stop experience collection... (168650 times) [2024-06-24 16:16:01,850][15349] Signal inference workers to resume experience collection... (168650 times) [2024-06-24 16:16:01,886][15401] InferenceWorker_p0-w0: stopping experience collection (168650 times) [2024-06-24 16:16:01,886][15401] InferenceWorker_p0-w0: resuming experience collection (168650 times) [2024-06-24 16:16:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11398692864. Throughput: 0: 42704.4. Samples: 11398846840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:16:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 16:16:03,757][15401] Updated weights for policy 0, policy_version 695722 (0.0030) [2024-06-24 16:16:07,852][15401] Updated weights for policy 0, policy_version 695732 (0.0043) [2024-06-24 16:16:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11398889472. Throughput: 0: 42972.3. Samples: 11398986020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 16:16:08,396][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 16:16:11,698][15401] Updated weights for policy 0, policy_version 695742 (0.0039) [2024-06-24 16:16:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 11399102464. Throughput: 0: 42817.1. Samples: 11399234280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:13,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 16:16:15,447][15401] Updated weights for policy 0, policy_version 695752 (0.0034) [2024-06-24 16:16:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 11399331840. Throughput: 0: 42713.2. Samples: 11399486960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 16:16:19,203][15401] Updated weights for policy 0, policy_version 695762 (0.0023) [2024-06-24 16:16:22,980][15401] Updated weights for policy 0, policy_version 695772 (0.0032) [2024-06-24 16:16:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 11399528448. Throughput: 0: 42774.5. Samples: 11399621820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 16:16:26,830][15401] Updated weights for policy 0, policy_version 695782 (0.0022) [2024-06-24 16:16:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42872.5, 300 sec: 42709.5). Total num frames: 11399741440. Throughput: 0: 42870.9. Samples: 11399874660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:28,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 16:16:30,542][15401] Updated weights for policy 0, policy_version 695792 (0.0033) [2024-06-24 16:16:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11399954432. Throughput: 0: 42840.6. Samples: 11400129620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 16:16:34,278][15401] Updated weights for policy 0, policy_version 695802 (0.0023) [2024-06-24 16:16:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11400167424. Throughput: 0: 42703.2. Samples: 11400259340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 16:16:38,446][15401] Updated weights for policy 0, policy_version 695812 (0.0035) [2024-06-24 16:16:41,929][15401] Updated weights for policy 0, policy_version 695822 (0.0042) [2024-06-24 16:16:43,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11400380416. Throughput: 0: 42606.2. Samples: 11400510160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:43,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 16:16:46,123][15401] Updated weights for policy 0, policy_version 695832 (0.0040) [2024-06-24 16:16:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11400609792. Throughput: 0: 42608.0. Samples: 11400764200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 16:16:49,686][15401] Updated weights for policy 0, policy_version 695842 (0.0042) [2024-06-24 16:16:53,396][15132] Fps is (10 sec: 44219.0, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 11400822784. Throughput: 0: 42530.8. Samples: 11400900180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:53,397][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 16:16:53,738][15401] Updated weights for policy 0, policy_version 695852 (0.0037) [2024-06-24 16:16:57,228][15401] Updated weights for policy 0, policy_version 695862 (0.0038) [2024-06-24 16:16:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11401035776. Throughput: 0: 42573.7. Samples: 11401150100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:16:58,393][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 16:17:01,365][15401] Updated weights for policy 0, policy_version 695872 (0.0036) [2024-06-24 16:17:03,394][15132] Fps is (10 sec: 42605.2, 60 sec: 42595.0, 300 sec: 42764.3). Total num frames: 11401248768. Throughput: 0: 42686.6. Samples: 11401408060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:03,395][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 16:17:04,988][15401] Updated weights for policy 0, policy_version 695882 (0.0042) [2024-06-24 16:17:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11401461760. Throughput: 0: 42747.2. Samples: 11401545440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 16:17:09,040][15401] Updated weights for policy 0, policy_version 695892 (0.0034) [2024-06-24 16:17:12,635][15401] Updated weights for policy 0, policy_version 695902 (0.0043) [2024-06-24 16:17:13,390][15132] Fps is (10 sec: 42618.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11401674752. Throughput: 0: 42700.5. Samples: 11401796180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 16:17:16,770][15401] Updated weights for policy 0, policy_version 695912 (0.0026) [2024-06-24 16:17:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 11401887744. Throughput: 0: 42731.4. Samples: 11402052640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:18,393][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 16:17:20,514][15401] Updated weights for policy 0, policy_version 695922 (0.0040) [2024-06-24 16:17:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11402117120. Throughput: 0: 42721.7. Samples: 11402181820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 16:17:24,368][15401] Updated weights for policy 0, policy_version 695932 (0.0042) [2024-06-24 16:17:28,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11402313728. Throughput: 0: 42757.3. Samples: 11402434240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:28,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 16:17:28,411][15401] Updated weights for policy 0, policy_version 695942 (0.0023) [2024-06-24 16:17:31,860][15401] Updated weights for policy 0, policy_version 695952 (0.0022) [2024-06-24 16:17:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11402543104. Throughput: 0: 42877.4. Samples: 11402693680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:33,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 16:17:35,942][15401] Updated weights for policy 0, policy_version 695962 (0.0025) [2024-06-24 16:17:38,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11402723328. Throughput: 0: 42674.6. Samples: 11402820260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:38,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 16:17:39,429][15349] Signal inference workers to stop experience collection... (168700 times) [2024-06-24 16:17:39,457][15401] InferenceWorker_p0-w0: stopping experience collection (168700 times) [2024-06-24 16:17:39,485][15349] Signal inference workers to resume experience collection... (168700 times) [2024-06-24 16:17:39,488][15401] InferenceWorker_p0-w0: resuming experience collection (168700 times) [2024-06-24 16:17:39,491][15401] Updated weights for policy 0, policy_version 695972 (0.0027) [2024-06-24 16:17:43,365][15401] Updated weights for policy 0, policy_version 695982 (0.0038) [2024-06-24 16:17:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 11402969088. Throughput: 0: 42983.2. Samples: 11403084240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 16:17:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695982_11402969088.pth... [2024-06-24 16:17:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695354_11392679936.pth [2024-06-24 16:17:47,018][15401] Updated weights for policy 0, policy_version 695992 (0.0031) [2024-06-24 16:17:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11403165696. Throughput: 0: 42864.2. Samples: 11403336740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-24 16:17:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 16:17:51,873][15401] Updated weights for policy 0, policy_version 696002 (0.0046) [2024-06-24 16:17:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 11403362304. Throughput: 0: 42639.9. Samples: 11403464240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:17:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 16:17:54,510][15401] Updated weights for policy 0, policy_version 696012 (0.0035) [2024-06-24 16:17:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11403591680. Throughput: 0: 42693.0. Samples: 11403717360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:17:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 16:17:59,504][15401] Updated weights for policy 0, policy_version 696022 (0.0034) [2024-06-24 16:18:02,413][15401] Updated weights for policy 0, policy_version 696032 (0.0042) [2024-06-24 16:18:03,394][15132] Fps is (10 sec: 45855.4, 60 sec: 42871.8, 300 sec: 42708.9). Total num frames: 11403821056. Throughput: 0: 42621.3. Samples: 11403970680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:03,394][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 16:18:07,180][15401] Updated weights for policy 0, policy_version 696042 (0.0036) [2024-06-24 16:18:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11404001280. Throughput: 0: 42725.8. Samples: 11404104480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 16:18:10,048][15401] Updated weights for policy 0, policy_version 696052 (0.0024) [2024-06-24 16:18:13,389][15132] Fps is (10 sec: 40978.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11404230656. Throughput: 0: 42687.3. Samples: 11404355060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 16:18:15,036][15401] Updated weights for policy 0, policy_version 696062 (0.0042) [2024-06-24 16:18:17,685][15401] Updated weights for policy 0, policy_version 696072 (0.0045) [2024-06-24 16:18:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11404460032. Throughput: 0: 42490.2. Samples: 11404605740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 16:18:22,650][15401] Updated weights for policy 0, policy_version 696082 (0.0032) [2024-06-24 16:18:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 11404656640. Throughput: 0: 42695.1. Samples: 11404741540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:23,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-24 16:18:25,209][15401] Updated weights for policy 0, policy_version 696092 (0.0032) [2024-06-24 16:18:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.2, 300 sec: 42765.4). Total num frames: 11404869632. Throughput: 0: 42460.1. Samples: 11404994940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:28,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 16:18:30,239][15401] Updated weights for policy 0, policy_version 696102 (0.0032) [2024-06-24 16:18:32,836][15401] Updated weights for policy 0, policy_version 696112 (0.0042) [2024-06-24 16:18:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11405115392. Throughput: 0: 42474.1. Samples: 11405248080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 16:18:38,045][15401] Updated weights for policy 0, policy_version 696122 (0.0037) [2024-06-24 16:18:38,392][15132] Fps is (10 sec: 40951.3, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 11405279232. Throughput: 0: 42666.1. Samples: 11405384300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:38,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 16:18:39,685][15349] Signal inference workers to stop experience collection... (168750 times) [2024-06-24 16:18:39,705][15401] InferenceWorker_p0-w0: stopping experience collection (168750 times) [2024-06-24 16:18:39,743][15349] Signal inference workers to resume experience collection... (168750 times) [2024-06-24 16:18:39,743][15401] InferenceWorker_p0-w0: resuming experience collection (168750 times) [2024-06-24 16:18:40,532][15401] Updated weights for policy 0, policy_version 696132 (0.0037) [2024-06-24 16:18:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11405524992. Throughput: 0: 42763.6. Samples: 11405641720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 16:18:45,531][15401] Updated weights for policy 0, policy_version 696142 (0.0037) [2024-06-24 16:18:48,223][15401] Updated weights for policy 0, policy_version 696152 (0.0030) [2024-06-24 16:18:48,390][15132] Fps is (10 sec: 47523.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11405754368. Throughput: 0: 42835.7. Samples: 11405898100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:48,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 16:18:52,989][15401] Updated weights for policy 0, policy_version 696162 (0.0037) [2024-06-24 16:18:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11405934592. Throughput: 0: 42848.8. Samples: 11406032680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:53,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 16:18:56,098][15401] Updated weights for policy 0, policy_version 696172 (0.0032) [2024-06-24 16:18:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11406163968. Throughput: 0: 42842.2. Samples: 11406282960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:18:58,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 16:19:00,707][15401] Updated weights for policy 0, policy_version 696182 (0.0035) [2024-06-24 16:19:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42874.5, 300 sec: 42931.6). Total num frames: 11406393344. Throughput: 0: 42988.8. Samples: 11406540240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:03,391][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 16:19:03,638][15401] Updated weights for policy 0, policy_version 696192 (0.0042) [2024-06-24 16:19:08,205][15401] Updated weights for policy 0, policy_version 696202 (0.0030) [2024-06-24 16:19:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11406589952. Throughput: 0: 43100.3. Samples: 11406681060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 16:19:11,274][15401] Updated weights for policy 0, policy_version 696212 (0.0045) [2024-06-24 16:19:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11406802944. Throughput: 0: 43005.3. Samples: 11406930180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 16:19:15,860][15401] Updated weights for policy 0, policy_version 696222 (0.0043) [2024-06-24 16:19:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11407048704. Throughput: 0: 42939.3. Samples: 11407180340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 16:19:18,988][15401] Updated weights for policy 0, policy_version 696232 (0.0031) [2024-06-24 16:19:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11407212544. Throughput: 0: 42957.1. Samples: 11407317280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 16:19:23,585][15401] Updated weights for policy 0, policy_version 696242 (0.0039) [2024-06-24 16:19:26,550][15401] Updated weights for policy 0, policy_version 696252 (0.0029) [2024-06-24 16:19:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11407474688. Throughput: 0: 43033.2. Samples: 11407578220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-24 16:19:28,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 16:19:30,976][15401] Updated weights for policy 0, policy_version 696262 (0.0038) [2024-06-24 16:19:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11407687680. Throughput: 0: 42864.9. Samples: 11407827020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:33,391][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 16:19:34,029][15401] Updated weights for policy 0, policy_version 696272 (0.0036) [2024-06-24 16:19:38,395][15132] Fps is (10 sec: 39300.1, 60 sec: 43142.0, 300 sec: 42819.8). Total num frames: 11407867904. Throughput: 0: 42917.9. Samples: 11407964220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:38,396][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 16:19:38,728][15401] Updated weights for policy 0, policy_version 696282 (0.0036) [2024-06-24 16:19:42,038][15401] Updated weights for policy 0, policy_version 696292 (0.0036) [2024-06-24 16:19:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 11408113664. Throughput: 0: 43076.8. Samples: 11408221420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:43,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 16:19:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696296_11408113664.pth... [2024-06-24 16:19:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695668_11397824512.pth [2024-06-24 16:19:46,399][15401] Updated weights for policy 0, policy_version 696302 (0.0040) [2024-06-24 16:19:48,390][15132] Fps is (10 sec: 47539.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11408343040. Throughput: 0: 42939.6. Samples: 11408472520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 16:19:49,594][15401] Updated weights for policy 0, policy_version 696312 (0.0035) [2024-06-24 16:19:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11408506880. Throughput: 0: 42890.7. Samples: 11408611140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 16:19:53,940][15401] Updated weights for policy 0, policy_version 696322 (0.0028) [2024-06-24 16:19:54,581][15349] Signal inference workers to stop experience collection... (168800 times) [2024-06-24 16:19:54,581][15349] Signal inference workers to resume experience collection... (168800 times) [2024-06-24 16:19:54,625][15401] InferenceWorker_p0-w0: stopping experience collection (168800 times) [2024-06-24 16:19:54,626][15401] InferenceWorker_p0-w0: resuming experience collection (168800 times) [2024-06-24 16:19:57,082][15401] Updated weights for policy 0, policy_version 696332 (0.0023) [2024-06-24 16:19:58,396][15132] Fps is (10 sec: 39296.7, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 11408736256. Throughput: 0: 43017.9. Samples: 11408866260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:19:58,396][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 16:20:01,437][15401] Updated weights for policy 0, policy_version 696342 (0.0034) [2024-06-24 16:20:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11408982016. Throughput: 0: 43057.5. Samples: 11409117940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 16:20:05,278][15401] Updated weights for policy 0, policy_version 696352 (0.0038) [2024-06-24 16:20:08,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11409162240. Throughput: 0: 43145.3. Samples: 11409258820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 16:20:08,781][15401] Updated weights for policy 0, policy_version 696362 (0.0030) [2024-06-24 16:20:12,914][15401] Updated weights for policy 0, policy_version 696372 (0.0033) [2024-06-24 16:20:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11409375232. Throughput: 0: 42933.9. Samples: 11409510240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 16:20:16,345][15401] Updated weights for policy 0, policy_version 696382 (0.0040) [2024-06-24 16:20:18,394][15132] Fps is (10 sec: 45854.9, 60 sec: 42868.3, 300 sec: 42819.9). Total num frames: 11409620992. Throughput: 0: 43000.3. Samples: 11409762220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:18,394][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 16:20:20,512][15401] Updated weights for policy 0, policy_version 696392 (0.0047) [2024-06-24 16:20:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.4, 300 sec: 42876.3). Total num frames: 11409817600. Throughput: 0: 42960.3. Samples: 11409897200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:20:24,000][15401] Updated weights for policy 0, policy_version 696402 (0.0034) [2024-06-24 16:20:28,207][15401] Updated weights for policy 0, policy_version 696412 (0.0037) [2024-06-24 16:20:28,389][15132] Fps is (10 sec: 40978.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11410030592. Throughput: 0: 42961.8. Samples: 11410154700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 16:20:31,837][15401] Updated weights for policy 0, policy_version 696422 (0.0040) [2024-06-24 16:20:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11410276352. Throughput: 0: 43151.1. Samples: 11410414320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 16:20:35,592][15401] Updated weights for policy 0, policy_version 696432 (0.0042) [2024-06-24 16:20:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43148.5, 300 sec: 42876.1). Total num frames: 11410456576. Throughput: 0: 42998.2. Samples: 11410546060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 16:20:39,329][15401] Updated weights for policy 0, policy_version 696442 (0.0054) [2024-06-24 16:20:43,020][15401] Updated weights for policy 0, policy_version 696452 (0.0043) [2024-06-24 16:20:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11410685952. Throughput: 0: 43102.9. Samples: 11410805620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:43,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 16:20:46,920][15401] Updated weights for policy 0, policy_version 696462 (0.0035) [2024-06-24 16:20:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11410915328. Throughput: 0: 43168.6. Samples: 11411060520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:48,394][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 16:20:50,641][15401] Updated weights for policy 0, policy_version 696472 (0.0027) [2024-06-24 16:20:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 11411095552. Throughput: 0: 42927.0. Samples: 11411190640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:53,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 16:20:54,613][15401] Updated weights for policy 0, policy_version 696482 (0.0033) [2024-06-24 16:20:58,014][15401] Updated weights for policy 0, policy_version 696492 (0.0033) [2024-06-24 16:20:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 11411324928. Throughput: 0: 43105.4. Samples: 11411449980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:20:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 16:21:02,196][15401] Updated weights for policy 0, policy_version 696502 (0.0025) [2024-06-24 16:21:02,816][15349] Signal inference workers to stop experience collection... (168850 times) [2024-06-24 16:21:02,816][15349] Signal inference workers to resume experience collection... (168850 times) [2024-06-24 16:21:02,853][15401] InferenceWorker_p0-w0: stopping experience collection (168850 times) [2024-06-24 16:21:02,853][15401] InferenceWorker_p0-w0: resuming experience collection (168850 times) [2024-06-24 16:21:03,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11411554304. Throughput: 0: 43214.4. Samples: 11411706680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:21:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 16:21:05,763][15401] Updated weights for policy 0, policy_version 696512 (0.0033) [2024-06-24 16:21:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 11411767296. Throughput: 0: 43144.6. Samples: 11411838700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 16:21:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 16:21:09,752][15401] Updated weights for policy 0, policy_version 696522 (0.0037) [2024-06-24 16:21:13,161][15401] Updated weights for policy 0, policy_version 696532 (0.0032) [2024-06-24 16:21:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11411980288. Throughput: 0: 43138.7. Samples: 11412095940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 16:21:17,438][15401] Updated weights for policy 0, policy_version 696542 (0.0034) [2024-06-24 16:21:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42874.6, 300 sec: 42931.6). Total num frames: 11412193280. Throughput: 0: 43156.4. Samples: 11412356360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 16:21:20,956][15401] Updated weights for policy 0, policy_version 696552 (0.0033) [2024-06-24 16:21:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11412389888. Throughput: 0: 43076.5. Samples: 11412484500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 16:21:24,989][15401] Updated weights for policy 0, policy_version 696562 (0.0035) [2024-06-24 16:21:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11412619264. Throughput: 0: 42968.9. Samples: 11412739220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 16:21:29,106][15401] Updated weights for policy 0, policy_version 696572 (0.0029) [2024-06-24 16:21:32,594][15401] Updated weights for policy 0, policy_version 696582 (0.0042) [2024-06-24 16:21:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 11412832256. Throughput: 0: 42932.1. Samples: 11412992460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 16:21:36,655][15401] Updated weights for policy 0, policy_version 696592 (0.0027) [2024-06-24 16:21:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 11413045248. Throughput: 0: 42939.7. Samples: 11413122820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 16:21:40,062][15401] Updated weights for policy 0, policy_version 696602 (0.0037) [2024-06-24 16:21:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11413258240. Throughput: 0: 42825.3. Samples: 11413377120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 16:21:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696611_11413274624.pth... [2024-06-24 16:21:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000695982_11402969088.pth [2024-06-24 16:21:44,286][15401] Updated weights for policy 0, policy_version 696612 (0.0021) [2024-06-24 16:21:47,906][15401] Updated weights for policy 0, policy_version 696622 (0.0032) [2024-06-24 16:21:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 11413471232. Throughput: 0: 42722.3. Samples: 11413629180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 16:21:52,313][15401] Updated weights for policy 0, policy_version 696632 (0.0029) [2024-06-24 16:21:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.3, 300 sec: 42876.4). Total num frames: 11413684224. Throughput: 0: 42662.7. Samples: 11413758520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:53,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 16:21:55,537][15401] Updated weights for policy 0, policy_version 696642 (0.0035) [2024-06-24 16:21:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42932.3). Total num frames: 11413913600. Throughput: 0: 42702.2. Samples: 11414017540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:21:58,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-24 16:21:59,698][15401] Updated weights for policy 0, policy_version 696652 (0.0037) [2024-06-24 16:22:03,058][15401] Updated weights for policy 0, policy_version 696662 (0.0026) [2024-06-24 16:22:03,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 11414126592. Throughput: 0: 42633.8. Samples: 11414274980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:03,392][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 16:22:07,216][15401] Updated weights for policy 0, policy_version 696672 (0.0033) [2024-06-24 16:22:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11414306816. Throughput: 0: 42644.9. Samples: 11414403520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 16:22:10,770][15401] Updated weights for policy 0, policy_version 696682 (0.0031) [2024-06-24 16:22:13,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 11414552576. Throughput: 0: 42680.8. Samples: 11414659860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 16:22:14,907][15401] Updated weights for policy 0, policy_version 696692 (0.0038) [2024-06-24 16:22:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11414749184. Throughput: 0: 42854.6. Samples: 11414920920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 16:22:18,439][15401] Updated weights for policy 0, policy_version 696702 (0.0036) [2024-06-24 16:22:22,408][15401] Updated weights for policy 0, policy_version 696712 (0.0041) [2024-06-24 16:22:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 11414945792. Throughput: 0: 42659.1. Samples: 11415042480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 16:22:25,976][15401] Updated weights for policy 0, policy_version 696722 (0.0042) [2024-06-24 16:22:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11415191552. Throughput: 0: 42719.0. Samples: 11415299480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 16:22:29,957][15401] Updated weights for policy 0, policy_version 696732 (0.0035) [2024-06-24 16:22:32,013][15349] Signal inference workers to stop experience collection... (168900 times) [2024-06-24 16:22:32,061][15401] InferenceWorker_p0-w0: stopping experience collection (168900 times) [2024-06-24 16:22:32,069][15349] Signal inference workers to resume experience collection... (168900 times) [2024-06-24 16:22:32,076][15401] InferenceWorker_p0-w0: resuming experience collection (168900 times) [2024-06-24 16:22:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11415388160. Throughput: 0: 42952.4. Samples: 11415562040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 16:22:33,845][15401] Updated weights for policy 0, policy_version 696742 (0.0040) [2024-06-24 16:22:38,152][15401] Updated weights for policy 0, policy_version 696752 (0.0030) [2024-06-24 16:22:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11415601152. Throughput: 0: 42756.9. Samples: 11415682580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 16:22:41,516][15401] Updated weights for policy 0, policy_version 696762 (0.0042) [2024-06-24 16:22:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11415830528. Throughput: 0: 42620.9. Samples: 11415935480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 16:22:45,890][15401] Updated weights for policy 0, policy_version 696772 (0.0036) [2024-06-24 16:22:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42931.3). Total num frames: 11416027136. Throughput: 0: 42666.5. Samples: 11416194980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 16:22:48,393][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 16:22:49,193][15401] Updated weights for policy 0, policy_version 696782 (0.0038) [2024-06-24 16:22:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11416223744. Throughput: 0: 42584.9. Samples: 11416319840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:22:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 16:22:53,465][15401] Updated weights for policy 0, policy_version 696792 (0.0029) [2024-06-24 16:22:56,963][15401] Updated weights for policy 0, policy_version 696802 (0.0035) [2024-06-24 16:22:58,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.4, 300 sec: 42876.7). Total num frames: 11416469504. Throughput: 0: 42643.3. Samples: 11416578800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:22:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 16:23:01,309][15401] Updated weights for policy 0, policy_version 696812 (0.0029) [2024-06-24 16:23:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 11416666112. Throughput: 0: 42612.5. Samples: 11416838480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 16:23:04,539][15401] Updated weights for policy 0, policy_version 696822 (0.0042) [2024-06-24 16:23:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11416862720. Throughput: 0: 42708.1. Samples: 11416964340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:08,394][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 16:23:09,330][15401] Updated weights for policy 0, policy_version 696832 (0.0033) [2024-06-24 16:23:12,389][15401] Updated weights for policy 0, policy_version 696842 (0.0052) [2024-06-24 16:23:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11417108480. Throughput: 0: 42620.9. Samples: 11417217420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 16:23:16,871][15401] Updated weights for policy 0, policy_version 696852 (0.0033) [2024-06-24 16:23:18,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42597.5, 300 sec: 42875.9). Total num frames: 11417305088. Throughput: 0: 42416.5. Samples: 11417470840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:18,391][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 16:23:20,002][15401] Updated weights for policy 0, policy_version 696862 (0.0032) [2024-06-24 16:23:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11417518080. Throughput: 0: 42503.0. Samples: 11417595220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 16:23:24,685][15401] Updated weights for policy 0, policy_version 696872 (0.0044) [2024-06-24 16:23:27,515][15401] Updated weights for policy 0, policy_version 696882 (0.0023) [2024-06-24 16:23:28,390][15132] Fps is (10 sec: 44242.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11417747456. Throughput: 0: 42635.0. Samples: 11417854060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 16:23:32,304][15401] Updated weights for policy 0, policy_version 696892 (0.0035) [2024-06-24 16:23:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 11417927680. Throughput: 0: 42560.9. Samples: 11418110120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 16:23:35,317][15401] Updated weights for policy 0, policy_version 696902 (0.0043) [2024-06-24 16:23:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11418157056. Throughput: 0: 42516.0. Samples: 11418233060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 16:23:40,009][15401] Updated weights for policy 0, policy_version 696912 (0.0034) [2024-06-24 16:23:42,946][15401] Updated weights for policy 0, policy_version 696922 (0.0046) [2024-06-24 16:23:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11418386432. Throughput: 0: 42585.7. Samples: 11418495160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 16:23:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696924_11418402816.pth... [2024-06-24 16:23:43,609][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696296_11408113664.pth [2024-06-24 16:23:47,626][15401] Updated weights for policy 0, policy_version 696932 (0.0027) [2024-06-24 16:23:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 11418583040. Throughput: 0: 42560.9. Samples: 11418753720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:48,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 16:23:50,646][15401] Updated weights for policy 0, policy_version 696942 (0.0033) [2024-06-24 16:23:51,170][15349] Signal inference workers to stop experience collection... (168950 times) [2024-06-24 16:23:51,177][15349] Signal inference workers to resume experience collection... (168950 times) [2024-06-24 16:23:51,207][15401] InferenceWorker_p0-w0: stopping experience collection (168950 times) [2024-06-24 16:23:51,208][15401] InferenceWorker_p0-w0: resuming experience collection (168950 times) [2024-06-24 16:23:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11418796032. Throughput: 0: 42381.4. Samples: 11418871500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 16:23:55,369][15401] Updated weights for policy 0, policy_version 696952 (0.0030) [2024-06-24 16:23:58,249][15401] Updated weights for policy 0, policy_version 696962 (0.0034) [2024-06-24 16:23:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11419041792. Throughput: 0: 42567.9. Samples: 11419132980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:23:58,394][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 16:24:02,973][15401] Updated weights for policy 0, policy_version 696972 (0.0046) [2024-06-24 16:24:03,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 11419205632. Throughput: 0: 42540.7. Samples: 11419385220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:03,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 16:24:06,054][15401] Updated weights for policy 0, policy_version 696982 (0.0023) [2024-06-24 16:24:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11419451392. Throughput: 0: 42495.2. Samples: 11419507500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 16:24:10,594][15401] Updated weights for policy 0, policy_version 696992 (0.0040) [2024-06-24 16:24:13,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11419664384. Throughput: 0: 42615.7. Samples: 11419771760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 16:24:13,521][15401] Updated weights for policy 0, policy_version 697002 (0.0046) [2024-06-24 16:24:18,366][15401] Updated weights for policy 0, policy_version 697012 (0.0038) [2024-06-24 16:24:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42326.2, 300 sec: 42820.5). Total num frames: 11419844608. Throughput: 0: 42684.1. Samples: 11420030900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:18,393][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 16:24:21,172][15401] Updated weights for policy 0, policy_version 697022 (0.0033) [2024-06-24 16:24:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11420090368. Throughput: 0: 42549.7. Samples: 11420147800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 16:24:25,999][15401] Updated weights for policy 0, policy_version 697032 (0.0028) [2024-06-24 16:24:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11420303360. Throughput: 0: 42612.0. Samples: 11420412700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 16:24:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 16:24:29,037][15401] Updated weights for policy 0, policy_version 697042 (0.0026) [2024-06-24 16:24:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.8). Total num frames: 11420483584. Throughput: 0: 42672.0. Samples: 11420673960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 16:24:33,462][15401] Updated weights for policy 0, policy_version 697052 (0.0037) [2024-06-24 16:24:36,738][15401] Updated weights for policy 0, policy_version 697062 (0.0028) [2024-06-24 16:24:38,394][15132] Fps is (10 sec: 44217.6, 60 sec: 43141.4, 300 sec: 42819.9). Total num frames: 11420745728. Throughput: 0: 42738.1. Samples: 11420794900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:38,394][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 16:24:41,039][15401] Updated weights for policy 0, policy_version 697072 (0.0039) [2024-06-24 16:24:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11420942336. Throughput: 0: 42698.3. Samples: 11421054400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 16:24:44,621][15401] Updated weights for policy 0, policy_version 697082 (0.0033) [2024-06-24 16:24:48,389][15132] Fps is (10 sec: 37699.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11421122560. Throughput: 0: 42881.1. Samples: 11421314760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:48,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-24 16:24:48,745][15401] Updated weights for policy 0, policy_version 697092 (0.0027) [2024-06-24 16:24:52,085][15401] Updated weights for policy 0, policy_version 697102 (0.0036) [2024-06-24 16:24:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 11421384704. Throughput: 0: 42851.6. Samples: 11421435820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 16:24:56,198][15401] Updated weights for policy 0, policy_version 697112 (0.0046) [2024-06-24 16:24:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11421597696. Throughput: 0: 42985.7. Samples: 11421706120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:24:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 16:24:59,653][15401] Updated weights for policy 0, policy_version 697122 (0.0027) [2024-06-24 16:25:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11421777920. Throughput: 0: 42930.3. Samples: 11421962760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:03,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 16:25:03,691][15401] Updated weights for policy 0, policy_version 697132 (0.0026) [2024-06-24 16:25:07,193][15401] Updated weights for policy 0, policy_version 697142 (0.0037) [2024-06-24 16:25:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11422023680. Throughput: 0: 43079.1. Samples: 11422086360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:08,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-24 16:25:11,603][15401] Updated weights for policy 0, policy_version 697152 (0.0035) [2024-06-24 16:25:12,331][15349] Signal inference workers to stop experience collection... (169000 times) [2024-06-24 16:25:12,385][15401] InferenceWorker_p0-w0: stopping experience collection (169000 times) [2024-06-24 16:25:12,385][15349] Signal inference workers to resume experience collection... (169000 times) [2024-06-24 16:25:12,403][15401] InferenceWorker_p0-w0: resuming experience collection (169000 times) [2024-06-24 16:25:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42654.6). Total num frames: 11422203904. Throughput: 0: 42932.7. Samples: 11422344680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 16:25:14,650][15401] Updated weights for policy 0, policy_version 697162 (0.0036) [2024-06-24 16:25:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 11422433280. Throughput: 0: 42818.6. Samples: 11422600900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:18,393][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 16:25:18,970][15401] Updated weights for policy 0, policy_version 697172 (0.0039) [2024-06-24 16:25:22,286][15401] Updated weights for policy 0, policy_version 697182 (0.0031) [2024-06-24 16:25:23,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11422679040. Throughput: 0: 43027.2. Samples: 11422730940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 16:25:27,104][15401] Updated weights for policy 0, policy_version 697192 (0.0029) [2024-06-24 16:25:28,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11422859264. Throughput: 0: 42952.3. Samples: 11422987260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:25:29,923][15401] Updated weights for policy 0, policy_version 697202 (0.0042) [2024-06-24 16:25:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11423055872. Throughput: 0: 42838.6. Samples: 11423242500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 16:25:34,631][15401] Updated weights for policy 0, policy_version 697212 (0.0042) [2024-06-24 16:25:37,683][15401] Updated weights for policy 0, policy_version 697222 (0.0027) [2024-06-24 16:25:38,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42874.6, 300 sec: 42820.6). Total num frames: 11423318016. Throughput: 0: 42955.6. Samples: 11423368820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:25:42,294][15401] Updated weights for policy 0, policy_version 697232 (0.0041) [2024-06-24 16:25:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11423498240. Throughput: 0: 42745.9. Samples: 11423629680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 16:25:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697236_11423514624.pth... [2024-06-24 16:25:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696611_11413274624.pth [2024-06-24 16:25:45,301][15401] Updated weights for policy 0, policy_version 697242 (0.0043) [2024-06-24 16:25:48,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 11423694848. Throughput: 0: 42813.6. Samples: 11423889380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 16:25:49,813][15401] Updated weights for policy 0, policy_version 697252 (0.0035) [2024-06-24 16:25:53,062][15401] Updated weights for policy 0, policy_version 697262 (0.0041) [2024-06-24 16:25:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11423956992. Throughput: 0: 42782.7. Samples: 11424011580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 16:25:57,455][15401] Updated weights for policy 0, policy_version 697272 (0.0039) [2024-06-24 16:25:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11424137216. Throughput: 0: 42784.6. Samples: 11424269980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:25:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 16:26:00,643][15401] Updated weights for policy 0, policy_version 697282 (0.0037) [2024-06-24 16:26:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11424333824. Throughput: 0: 42841.8. Samples: 11424528680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:26:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 16:26:04,946][15401] Updated weights for policy 0, policy_version 697292 (0.0031) [2024-06-24 16:26:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11424595968. Throughput: 0: 42716.0. Samples: 11424653160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 16:26:08,392][15401] Updated weights for policy 0, policy_version 697302 (0.0031) [2024-06-24 16:26:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 16:26:12,761][15401] Updated weights for policy 0, policy_version 697312 (0.0035) [2024-06-24 16:26:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11424792576. Throughput: 0: 42788.0. Samples: 11424912720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 16:26:16,100][15401] Updated weights for policy 0, policy_version 697322 (0.0038) [2024-06-24 16:26:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11424989184. Throughput: 0: 42650.2. Samples: 11425161760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 16:26:20,652][15401] Updated weights for policy 0, policy_version 697332 (0.0035) [2024-06-24 16:26:23,392][15132] Fps is (10 sec: 44227.5, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 11425234944. Throughput: 0: 42624.9. Samples: 11425287040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:23,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 16:26:23,636][15401] Updated weights for policy 0, policy_version 697342 (0.0042) [2024-06-24 16:26:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 11425398784. Throughput: 0: 42609.2. Samples: 11425547200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:28,393][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 16:26:28,461][15401] Updated weights for policy 0, policy_version 697352 (0.0046) [2024-06-24 16:26:31,720][15401] Updated weights for policy 0, policy_version 697362 (0.0038) [2024-06-24 16:26:33,389][15132] Fps is (10 sec: 39330.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11425628160. Throughput: 0: 42407.3. Samples: 11425797700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 16:26:35,998][15401] Updated weights for policy 0, policy_version 697372 (0.0039) [2024-06-24 16:26:36,029][15349] Signal inference workers to stop experience collection... (169050 times) [2024-06-24 16:26:36,029][15349] Signal inference workers to resume experience collection... (169050 times) [2024-06-24 16:26:36,043][15401] InferenceWorker_p0-w0: stopping experience collection (169050 times) [2024-06-24 16:26:36,044][15401] InferenceWorker_p0-w0: resuming experience collection (169050 times) [2024-06-24 16:26:38,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11425857536. Throughput: 0: 42476.9. Samples: 11425923040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:38,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 16:26:39,655][15401] Updated weights for policy 0, policy_version 697382 (0.0039) [2024-06-24 16:26:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11426054144. Throughput: 0: 42459.8. Samples: 11426180680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:43,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 16:26:43,517][15401] Updated weights for policy 0, policy_version 697392 (0.0037) [2024-06-24 16:26:47,182][15401] Updated weights for policy 0, policy_version 697402 (0.0037) [2024-06-24 16:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11426283520. Throughput: 0: 42350.2. Samples: 11426434440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 16:26:51,480][15401] Updated weights for policy 0, policy_version 697412 (0.0028) [2024-06-24 16:26:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11426496512. Throughput: 0: 42531.0. Samples: 11426567060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 16:26:54,685][15401] Updated weights for policy 0, policy_version 697422 (0.0032) [2024-06-24 16:26:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 11426693120. Throughput: 0: 42521.9. Samples: 11426826200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:26:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 16:26:59,034][15401] Updated weights for policy 0, policy_version 697432 (0.0033) [2024-06-24 16:27:02,169][15401] Updated weights for policy 0, policy_version 697442 (0.0044) [2024-06-24 16:27:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11426938880. Throughput: 0: 42633.7. Samples: 11427080280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:03,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 16:27:06,629][15401] Updated weights for policy 0, policy_version 697452 (0.0034) [2024-06-24 16:27:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11427135488. Throughput: 0: 42802.1. Samples: 11427213040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:27:09,707][15401] Updated weights for policy 0, policy_version 697462 (0.0032) [2024-06-24 16:27:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 11427332096. Throughput: 0: 42717.1. Samples: 11427469360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 16:27:14,466][15401] Updated weights for policy 0, policy_version 697472 (0.0044) [2024-06-24 16:27:17,219][15401] Updated weights for policy 0, policy_version 697482 (0.0027) [2024-06-24 16:27:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11427577856. Throughput: 0: 42580.8. Samples: 11427713940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:18,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 16:27:22,191][15401] Updated weights for policy 0, policy_version 697492 (0.0035) [2024-06-24 16:27:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 11427774464. Throughput: 0: 42964.0. Samples: 11427856420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:23,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 16:27:24,814][15401] Updated weights for policy 0, policy_version 697502 (0.0031) [2024-06-24 16:27:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 11427971072. Throughput: 0: 42853.0. Samples: 11428109060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 16:27:29,847][15401] Updated weights for policy 0, policy_version 697512 (0.0036) [2024-06-24 16:27:32,405][15401] Updated weights for policy 0, policy_version 697522 (0.0027) [2024-06-24 16:27:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11428216832. Throughput: 0: 42716.4. Samples: 11428356680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 16:27:37,522][15401] Updated weights for policy 0, policy_version 697532 (0.0044) [2024-06-24 16:27:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11428397056. Throughput: 0: 42892.1. Samples: 11428497200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:38,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 16:27:40,462][15401] Updated weights for policy 0, policy_version 697542 (0.0040) [2024-06-24 16:27:43,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 11428593664. Throughput: 0: 42674.5. Samples: 11428746560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 16:27:43,445][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697547_11428610048.pth... [2024-06-24 16:27:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000696924_11418402816.pth [2024-06-24 16:27:45,373][15401] Updated weights for policy 0, policy_version 697552 (0.0029) [2024-06-24 16:27:46,756][15349] Signal inference workers to stop experience collection... (169100 times) [2024-06-24 16:27:46,804][15401] InferenceWorker_p0-w0: stopping experience collection (169100 times) [2024-06-24 16:27:46,804][15349] Signal inference workers to resume experience collection... (169100 times) [2024-06-24 16:27:46,824][15401] InferenceWorker_p0-w0: resuming experience collection (169100 times) [2024-06-24 16:27:48,082][15401] Updated weights for policy 0, policy_version 697562 (0.0024) [2024-06-24 16:27:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11428872192. Throughput: 0: 42491.6. Samples: 11428992400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:27:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 16:27:52,957][15401] Updated weights for policy 0, policy_version 697572 (0.0036) [2024-06-24 16:27:53,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 11429036032. Throughput: 0: 42655.7. Samples: 11429132540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:27:53,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 16:27:56,042][15401] Updated weights for policy 0, policy_version 697582 (0.0046) [2024-06-24 16:27:58,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11429232640. Throughput: 0: 42533.7. Samples: 11429383380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:27:58,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 16:28:00,540][15401] Updated weights for policy 0, policy_version 697592 (0.0034) [2024-06-24 16:28:03,392][15132] Fps is (10 sec: 45863.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 11429494784. Throughput: 0: 42726.2. Samples: 11429636620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:03,393][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 16:28:03,626][15401] Updated weights for policy 0, policy_version 697602 (0.0033) [2024-06-24 16:28:08,281][15401] Updated weights for policy 0, policy_version 697612 (0.0036) [2024-06-24 16:28:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11429675008. Throughput: 0: 42602.7. Samples: 11429773540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:08,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 16:28:11,285][15401] Updated weights for policy 0, policy_version 697622 (0.0031) [2024-06-24 16:28:13,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.3, 300 sec: 42654.1). Total num frames: 11429888000. Throughput: 0: 42603.6. Samples: 11430026220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 16:28:15,839][15401] Updated weights for policy 0, policy_version 697632 (0.0036) [2024-06-24 16:28:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11430133760. Throughput: 0: 42725.8. Samples: 11430279340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 16:28:18,744][15401] Updated weights for policy 0, policy_version 697642 (0.0046) [2024-06-24 16:28:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11430313984. Throughput: 0: 42579.5. Samples: 11430413280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 16:28:23,755][15401] Updated weights for policy 0, policy_version 697652 (0.0050) [2024-06-24 16:28:26,415][15401] Updated weights for policy 0, policy_version 697662 (0.0040) [2024-06-24 16:28:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11430543360. Throughput: 0: 42600.1. Samples: 11430663560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:28,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 16:28:31,260][15401] Updated weights for policy 0, policy_version 697672 (0.0033) [2024-06-24 16:28:33,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11430789120. Throughput: 0: 42799.2. Samples: 11430918360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:33,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 16:28:33,961][15401] Updated weights for policy 0, policy_version 697682 (0.0047) [2024-06-24 16:28:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11430952960. Throughput: 0: 42723.1. Samples: 11431055080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:38,390][15132] Avg episode reward: [(0, '0.927')] [2024-06-24 16:28:38,907][15401] Updated weights for policy 0, policy_version 697692 (0.0033) [2024-06-24 16:28:41,636][15401] Updated weights for policy 0, policy_version 697702 (0.0037) [2024-06-24 16:28:43,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11431165952. Throughput: 0: 42528.8. Samples: 11431297180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:43,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 16:28:46,558][15401] Updated weights for policy 0, policy_version 697712 (0.0033) [2024-06-24 16:28:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11431411712. Throughput: 0: 42659.1. Samples: 11431556180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:48,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-24 16:28:49,412][15401] Updated weights for policy 0, policy_version 697722 (0.0032) [2024-06-24 16:28:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11431591936. Throughput: 0: 42625.8. Samples: 11431691700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 16:28:54,379][15401] Updated weights for policy 0, policy_version 697732 (0.0032) [2024-06-24 16:28:57,452][15401] Updated weights for policy 0, policy_version 697742 (0.0036) [2024-06-24 16:28:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 11431821312. Throughput: 0: 42527.1. Samples: 11431939940. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:28:58,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 16:29:02,128][15401] Updated weights for policy 0, policy_version 697752 (0.0036) [2024-06-24 16:29:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 11432050688. Throughput: 0: 42565.9. Samples: 11432194800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:03,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-24 16:29:04,918][15401] Updated weights for policy 0, policy_version 697762 (0.0031) [2024-06-24 16:29:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11432230912. Throughput: 0: 42461.9. Samples: 11432324060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 16:29:09,688][15401] Updated weights for policy 0, policy_version 697772 (0.0039) [2024-06-24 16:29:11,668][15349] Signal inference workers to stop experience collection... (169150 times) [2024-06-24 16:29:11,668][15349] Signal inference workers to resume experience collection... (169150 times) [2024-06-24 16:29:11,687][15401] InferenceWorker_p0-w0: stopping experience collection (169150 times) [2024-06-24 16:29:11,687][15401] InferenceWorker_p0-w0: resuming experience collection (169150 times) [2024-06-24 16:29:12,866][15401] Updated weights for policy 0, policy_version 697782 (0.0036) [2024-06-24 16:29:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11432476672. Throughput: 0: 42535.7. Samples: 11432577660. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:13,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 16:29:17,361][15401] Updated weights for policy 0, policy_version 697792 (0.0038) [2024-06-24 16:29:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11432689664. Throughput: 0: 42634.6. Samples: 11432836920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-24 16:29:20,510][15401] Updated weights for policy 0, policy_version 697802 (0.0038) [2024-06-24 16:29:23,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11432853504. Throughput: 0: 42471.6. Samples: 11432966300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 16:29:24,957][15401] Updated weights for policy 0, policy_version 697812 (0.0028) [2024-06-24 16:29:28,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11433099264. Throughput: 0: 42614.7. Samples: 11433214940. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-24 16:29:28,393][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 16:29:28,585][15401] Updated weights for policy 0, policy_version 697822 (0.0033) [2024-06-24 16:29:32,629][15401] Updated weights for policy 0, policy_version 697832 (0.0033) [2024-06-24 16:29:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 11433328640. Throughput: 0: 42612.5. Samples: 11433473740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:33,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 16:29:36,259][15401] Updated weights for policy 0, policy_version 697842 (0.0032) [2024-06-24 16:29:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11433508864. Throughput: 0: 42461.3. Samples: 11433602460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:38,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 16:29:40,410][15401] Updated weights for policy 0, policy_version 697852 (0.0031) [2024-06-24 16:29:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11433754624. Throughput: 0: 42440.5. Samples: 11433849760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 16:29:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697861_11433754624.pth... [2024-06-24 16:29:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697236_11423514624.pth [2024-06-24 16:29:43,818][15401] Updated weights for policy 0, policy_version 697862 (0.0044) [2024-06-24 16:29:48,325][15401] Updated weights for policy 0, policy_version 697872 (0.0028) [2024-06-24 16:29:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 11433934848. Throughput: 0: 42674.7. Samples: 11434115160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 16:29:51,396][15401] Updated weights for policy 0, policy_version 697882 (0.0022) [2024-06-24 16:29:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11434147840. Throughput: 0: 42488.0. Samples: 11434236020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 16:29:55,926][15401] Updated weights for policy 0, policy_version 697892 (0.0027) [2024-06-24 16:29:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11434393600. Throughput: 0: 42591.4. Samples: 11434494280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:29:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 16:29:58,889][15401] Updated weights for policy 0, policy_version 697902 (0.0034) [2024-06-24 16:30:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 11434573824. Throughput: 0: 42710.6. Samples: 11434758900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 16:30:03,611][15401] Updated weights for policy 0, policy_version 697912 (0.0030) [2024-06-24 16:30:06,498][15401] Updated weights for policy 0, policy_version 697922 (0.0027) [2024-06-24 16:30:08,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11434803200. Throughput: 0: 42551.0. Samples: 11434881200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:08,392][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 16:30:11,312][15401] Updated weights for policy 0, policy_version 697932 (0.0037) [2024-06-24 16:30:13,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11435048960. Throughput: 0: 42892.1. Samples: 11435144980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 16:30:14,179][15401] Updated weights for policy 0, policy_version 697942 (0.0029) [2024-06-24 16:30:15,160][15349] Signal inference workers to stop experience collection... (169200 times) [2024-06-24 16:30:15,161][15349] Signal inference workers to resume experience collection... (169200 times) [2024-06-24 16:30:15,201][15401] InferenceWorker_p0-w0: stopping experience collection (169200 times) [2024-06-24 16:30:15,201][15401] InferenceWorker_p0-w0: resuming experience collection (169200 times) [2024-06-24 16:30:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11435229184. Throughput: 0: 42868.9. Samples: 11435402840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 16:30:18,784][15401] Updated weights for policy 0, policy_version 697952 (0.0032) [2024-06-24 16:30:21,922][15401] Updated weights for policy 0, policy_version 697962 (0.0041) [2024-06-24 16:30:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11435442176. Throughput: 0: 42590.7. Samples: 11435519040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:23,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 16:30:26,327][15401] Updated weights for policy 0, policy_version 697972 (0.0028) [2024-06-24 16:30:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11435671552. Throughput: 0: 42915.1. Samples: 11435780940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 16:30:29,702][15401] Updated weights for policy 0, policy_version 697982 (0.0029) [2024-06-24 16:30:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11435868160. Throughput: 0: 42828.0. Samples: 11436042420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 16:30:34,246][15401] Updated weights for policy 0, policy_version 697992 (0.0031) [2024-06-24 16:30:37,171][15401] Updated weights for policy 0, policy_version 698002 (0.0042) [2024-06-24 16:30:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11436113920. Throughput: 0: 42836.9. Samples: 11436163680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 16:30:42,173][15401] Updated weights for policy 0, policy_version 698012 (0.0035) [2024-06-24 16:30:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11436310528. Throughput: 0: 43021.4. Samples: 11436430240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 16:30:45,038][15401] Updated weights for policy 0, policy_version 698022 (0.0037) [2024-06-24 16:30:48,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 11436507136. Throughput: 0: 42761.4. Samples: 11436683260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:48,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 16:30:49,672][15401] Updated weights for policy 0, policy_version 698032 (0.0022) [2024-06-24 16:30:52,575][15401] Updated weights for policy 0, policy_version 698042 (0.0041) [2024-06-24 16:30:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11436736512. Throughput: 0: 42711.6. Samples: 11436803120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 16:30:57,204][15401] Updated weights for policy 0, policy_version 698052 (0.0037) [2024-06-24 16:30:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11436949504. Throughput: 0: 42798.2. Samples: 11437070900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:30:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 16:31:00,203][15401] Updated weights for policy 0, policy_version 698062 (0.0033) [2024-06-24 16:31:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11437146112. Throughput: 0: 42807.6. Samples: 11437329180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:31:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 16:31:04,711][15401] Updated weights for policy 0, policy_version 698072 (0.0035) [2024-06-24 16:31:07,999][15401] Updated weights for policy 0, policy_version 698082 (0.0029) [2024-06-24 16:31:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43144.6, 300 sec: 42709.2). Total num frames: 11437391872. Throughput: 0: 43021.2. Samples: 11437455100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 16:31:08,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 16:31:12,415][15401] Updated weights for policy 0, policy_version 698092 (0.0031) [2024-06-24 16:31:13,390][15132] Fps is (10 sec: 45871.2, 60 sec: 42597.8, 300 sec: 42764.9). Total num frames: 11437604864. Throughput: 0: 42959.6. Samples: 11437714160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:13,391][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 16:31:15,617][15401] Updated weights for policy 0, policy_version 698102 (0.0032) [2024-06-24 16:31:18,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 11437801472. Throughput: 0: 42851.5. Samples: 11437970740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 16:31:20,030][15401] Updated weights for policy 0, policy_version 698112 (0.0046) [2024-06-24 16:31:23,392][15132] Fps is (10 sec: 40953.6, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 11438014464. Throughput: 0: 42903.8. Samples: 11438094460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:23,393][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 16:31:23,655][15401] Updated weights for policy 0, policy_version 698122 (0.0043) [2024-06-24 16:31:27,670][15401] Updated weights for policy 0, policy_version 698132 (0.0027) [2024-06-24 16:31:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11438227456. Throughput: 0: 42732.9. Samples: 11438353220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 16:31:31,403][15401] Updated weights for policy 0, policy_version 698142 (0.0028) [2024-06-24 16:31:33,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11438473216. Throughput: 0: 42858.2. Samples: 11438611780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 16:31:35,081][15401] Updated weights for policy 0, policy_version 698152 (0.0039) [2024-06-24 16:31:35,739][15349] Signal inference workers to stop experience collection... (169250 times) [2024-06-24 16:31:35,779][15401] InferenceWorker_p0-w0: stopping experience collection (169250 times) [2024-06-24 16:31:35,788][15349] Signal inference workers to resume experience collection... (169250 times) [2024-06-24 16:31:35,795][15401] InferenceWorker_p0-w0: resuming experience collection (169250 times) [2024-06-24 16:31:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11438669824. Throughput: 0: 43011.0. Samples: 11438738620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:38,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 16:31:39,183][15401] Updated weights for policy 0, policy_version 698162 (0.0034) [2024-06-24 16:31:42,482][15401] Updated weights for policy 0, policy_version 698172 (0.0032) [2024-06-24 16:31:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11438882816. Throughput: 0: 42846.2. Samples: 11438998980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:43,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 16:31:43,443][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698175_11438899200.pth... [2024-06-24 16:31:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697547_11428610048.pth [2024-06-24 16:31:46,722][15401] Updated weights for policy 0, policy_version 698182 (0.0027) [2024-06-24 16:31:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 11439095808. Throughput: 0: 42921.8. Samples: 11439260660. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:48,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 16:31:50,117][15401] Updated weights for policy 0, policy_version 698192 (0.0037) [2024-06-24 16:31:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11439308800. Throughput: 0: 42949.0. Samples: 11439387700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 16:31:54,253][15401] Updated weights for policy 0, policy_version 698202 (0.0039) [2024-06-24 16:31:57,700][15401] Updated weights for policy 0, policy_version 698212 (0.0042) [2024-06-24 16:31:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11439521792. Throughput: 0: 42959.5. Samples: 11439647300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:31:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 16:32:01,905][15401] Updated weights for policy 0, policy_version 698222 (0.0045) [2024-06-24 16:32:03,390][15132] Fps is (10 sec: 42595.3, 60 sec: 43144.0, 300 sec: 42709.4). Total num frames: 11439734784. Throughput: 0: 42926.4. Samples: 11439902460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:03,391][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 16:32:05,574][15401] Updated weights for policy 0, policy_version 698232 (0.0025) [2024-06-24 16:32:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 11439964160. Throughput: 0: 43073.0. Samples: 11440032640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 16:32:09,499][15401] Updated weights for policy 0, policy_version 698242 (0.0032) [2024-06-24 16:32:13,159][15401] Updated weights for policy 0, policy_version 698252 (0.0046) [2024-06-24 16:32:13,389][15132] Fps is (10 sec: 42601.5, 60 sec: 42599.1, 300 sec: 42654.3). Total num frames: 11440160768. Throughput: 0: 43102.3. Samples: 11440292820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 16:32:17,107][15401] Updated weights for policy 0, policy_version 698262 (0.0030) [2024-06-24 16:32:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11440373760. Throughput: 0: 43044.9. Samples: 11440548800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:18,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 16:32:20,567][15401] Updated weights for policy 0, policy_version 698272 (0.0030) [2024-06-24 16:32:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 11440619520. Throughput: 0: 42990.6. Samples: 11440673200. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 16:32:25,287][15401] Updated weights for policy 0, policy_version 698282 (0.0029) [2024-06-24 16:32:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11440799744. Throughput: 0: 43013.9. Samples: 11440934600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 16:32:28,677][15401] Updated weights for policy 0, policy_version 698292 (0.0030) [2024-06-24 16:32:32,831][15401] Updated weights for policy 0, policy_version 698302 (0.0036) [2024-06-24 16:32:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11441029120. Throughput: 0: 42892.4. Samples: 11441190820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:33,399][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 16:32:36,226][15401] Updated weights for policy 0, policy_version 698312 (0.0037) [2024-06-24 16:32:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11441242112. Throughput: 0: 42904.4. Samples: 11441318400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 16:32:40,481][15401] Updated weights for policy 0, policy_version 698322 (0.0027) [2024-06-24 16:32:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11441438720. Throughput: 0: 42833.8. Samples: 11441574820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 16:32:43,766][15401] Updated weights for policy 0, policy_version 698332 (0.0043) [2024-06-24 16:32:48,259][15401] Updated weights for policy 0, policy_version 698342 (0.0036) [2024-06-24 16:32:48,392][15132] Fps is (10 sec: 39313.0, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 11441635328. Throughput: 0: 42799.9. Samples: 11441828520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-24 16:32:48,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 16:32:51,453][15401] Updated weights for policy 0, policy_version 698352 (0.0036) [2024-06-24 16:32:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11441881088. Throughput: 0: 42622.6. Samples: 11441950660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:32:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 16:32:55,976][15401] Updated weights for policy 0, policy_version 698362 (0.0027) [2024-06-24 16:32:58,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 11442061312. Throughput: 0: 42531.7. Samples: 11442206740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:32:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 16:32:58,729][15349] Signal inference workers to stop experience collection... (169300 times) [2024-06-24 16:32:58,730][15349] Signal inference workers to resume experience collection... (169300 times) [2024-06-24 16:32:58,752][15401] InferenceWorker_p0-w0: stopping experience collection (169300 times) [2024-06-24 16:32:58,752][15401] InferenceWorker_p0-w0: resuming experience collection (169300 times) [2024-06-24 16:32:59,172][15401] Updated weights for policy 0, policy_version 698372 (0.0041) [2024-06-24 16:33:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.9, 300 sec: 42709.5). Total num frames: 11442274304. Throughput: 0: 42529.0. Samples: 11442462600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 16:33:03,470][15401] Updated weights for policy 0, policy_version 698382 (0.0038) [2024-06-24 16:33:06,800][15401] Updated weights for policy 0, policy_version 698392 (0.0028) [2024-06-24 16:33:08,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11442520064. Throughput: 0: 42688.2. Samples: 11442594160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 16:33:11,006][15401] Updated weights for policy 0, policy_version 698402 (0.0033) [2024-06-24 16:33:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11442683904. Throughput: 0: 42565.4. Samples: 11442850040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 16:33:14,396][15401] Updated weights for policy 0, policy_version 698412 (0.0032) [2024-06-24 16:33:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11442929664. Throughput: 0: 42409.4. Samples: 11443099240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 16:33:18,514][15401] Updated weights for policy 0, policy_version 698422 (0.0032) [2024-06-24 16:33:21,979][15401] Updated weights for policy 0, policy_version 698432 (0.0037) [2024-06-24 16:33:23,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11443159040. Throughput: 0: 42700.4. Samples: 11443239920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 16:33:25,938][15401] Updated weights for policy 0, policy_version 698442 (0.0037) [2024-06-24 16:33:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11443339264. Throughput: 0: 42645.3. Samples: 11443493860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 16:33:30,141][15401] Updated weights for policy 0, policy_version 698452 (0.0032) [2024-06-24 16:33:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11443585024. Throughput: 0: 42540.8. Samples: 11443742760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 16:33:33,399][15401] Updated weights for policy 0, policy_version 698462 (0.0031) [2024-06-24 16:33:37,742][15401] Updated weights for policy 0, policy_version 698472 (0.0036) [2024-06-24 16:33:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11443798016. Throughput: 0: 42785.3. Samples: 11443876000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:38,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-24 16:33:41,488][15401] Updated weights for policy 0, policy_version 698482 (0.0033) [2024-06-24 16:33:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11443978240. Throughput: 0: 42679.8. Samples: 11444127340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 16:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698486_11443994624.pth... [2024-06-24 16:33:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000697861_11433754624.pth [2024-06-24 16:33:45,328][15401] Updated weights for policy 0, policy_version 698492 (0.0038) [2024-06-24 16:33:48,394][15132] Fps is (10 sec: 44217.5, 60 sec: 43416.0, 300 sec: 42875.4). Total num frames: 11444240384. Throughput: 0: 42571.3. Samples: 11444378500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:48,394][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 16:33:49,045][15401] Updated weights for policy 0, policy_version 698502 (0.0040) [2024-06-24 16:33:52,895][15401] Updated weights for policy 0, policy_version 698512 (0.0030) [2024-06-24 16:33:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11444436992. Throughput: 0: 42741.1. Samples: 11444517520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:53,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 16:33:56,520][15401] Updated weights for policy 0, policy_version 698522 (0.0025) [2024-06-24 16:33:58,389][15132] Fps is (10 sec: 39339.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11444633600. Throughput: 0: 42456.4. Samples: 11444760580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:33:58,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 16:34:00,405][15401] Updated weights for policy 0, policy_version 698532 (0.0043) [2024-06-24 16:34:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11444862976. Throughput: 0: 42736.9. Samples: 11445022400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 16:34:04,089][15401] Updated weights for policy 0, policy_version 698542 (0.0034) [2024-06-24 16:34:07,915][15401] Updated weights for policy 0, policy_version 698552 (0.0040) [2024-06-24 16:34:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11445092352. Throughput: 0: 42583.7. Samples: 11445156180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 16:34:11,034][15349] Signal inference workers to stop experience collection... (169350 times) [2024-06-24 16:34:11,034][15349] Signal inference workers to resume experience collection... (169350 times) [2024-06-24 16:34:11,062][15401] InferenceWorker_p0-w0: stopping experience collection (169350 times) [2024-06-24 16:34:11,062][15401] InferenceWorker_p0-w0: resuming experience collection (169350 times) [2024-06-24 16:34:11,691][15401] Updated weights for policy 0, policy_version 698562 (0.0027) [2024-06-24 16:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 11445288960. Throughput: 0: 42487.5. Samples: 11445405800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 16:34:15,556][15401] Updated weights for policy 0, policy_version 698572 (0.0045) [2024-06-24 16:34:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11445485568. Throughput: 0: 42878.7. Samples: 11445672300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 16:34:19,663][15401] Updated weights for policy 0, policy_version 698582 (0.0025) [2024-06-24 16:34:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 11445714944. Throughput: 0: 42683.6. Samples: 11445796760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 16:34:23,733][15401] Updated weights for policy 0, policy_version 698592 (0.0032) [2024-06-24 16:34:27,145][15401] Updated weights for policy 0, policy_version 698602 (0.0027) [2024-06-24 16:34:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11445911552. Throughput: 0: 42713.8. Samples: 11446049460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 16:34:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 16:34:31,421][15401] Updated weights for policy 0, policy_version 698612 (0.0033) [2024-06-24 16:34:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11446124544. Throughput: 0: 42905.6. Samples: 11446309060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 16:34:34,925][15401] Updated weights for policy 0, policy_version 698622 (0.0045) [2024-06-24 16:34:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11446370304. Throughput: 0: 42633.5. Samples: 11446436020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 16:34:39,077][15401] Updated weights for policy 0, policy_version 698632 (0.0031) [2024-06-24 16:34:42,799][15401] Updated weights for policy 0, policy_version 698642 (0.0034) [2024-06-24 16:34:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11446566912. Throughput: 0: 42870.2. Samples: 11446689740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 16:34:46,647][15401] Updated weights for policy 0, policy_version 698652 (0.0033) [2024-06-24 16:34:48,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42323.9, 300 sec: 42819.6). Total num frames: 11446779904. Throughput: 0: 42869.0. Samples: 11446951780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:48,397][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 16:34:50,577][15401] Updated weights for policy 0, policy_version 698662 (0.0027) [2024-06-24 16:34:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11447025664. Throughput: 0: 42757.6. Samples: 11447080280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 16:34:54,111][15401] Updated weights for policy 0, policy_version 698672 (0.0037) [2024-06-24 16:34:58,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11447189504. Throughput: 0: 42796.0. Samples: 11447331620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:34:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 16:34:58,565][15401] Updated weights for policy 0, policy_version 698682 (0.0030) [2024-06-24 16:35:01,963][15401] Updated weights for policy 0, policy_version 698692 (0.0042) [2024-06-24 16:35:03,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 11447418880. Throughput: 0: 42612.7. Samples: 11447589980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:03,392][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 16:35:06,084][15401] Updated weights for policy 0, policy_version 698702 (0.0034) [2024-06-24 16:35:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11447631872. Throughput: 0: 42741.8. Samples: 11447720140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:08,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 16:35:09,541][15401] Updated weights for policy 0, policy_version 698712 (0.0039) [2024-06-24 16:35:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11447844864. Throughput: 0: 42762.6. Samples: 11447973780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 16:35:13,894][15401] Updated weights for policy 0, policy_version 698722 (0.0042) [2024-06-24 16:35:17,125][15401] Updated weights for policy 0, policy_version 698732 (0.0045) [2024-06-24 16:35:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11448074240. Throughput: 0: 42532.8. Samples: 11448223040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 16:35:21,515][15401] Updated weights for policy 0, policy_version 698742 (0.0039) [2024-06-24 16:35:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11448254464. Throughput: 0: 42646.6. Samples: 11448355120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 16:35:24,870][15401] Updated weights for policy 0, policy_version 698752 (0.0028) [2024-06-24 16:35:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11448483840. Throughput: 0: 42599.1. Samples: 11448606800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:28,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 16:35:29,306][15401] Updated weights for policy 0, policy_version 698762 (0.0040) [2024-06-24 16:35:32,552][15401] Updated weights for policy 0, policy_version 698772 (0.0035) [2024-06-24 16:35:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11448729600. Throughput: 0: 42323.7. Samples: 11448856080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 16:35:36,760][15401] Updated weights for policy 0, policy_version 698782 (0.0027) [2024-06-24 16:35:36,776][15349] Signal inference workers to stop experience collection... (169400 times) [2024-06-24 16:35:36,776][15349] Signal inference workers to resume experience collection... (169400 times) [2024-06-24 16:35:36,832][15401] InferenceWorker_p0-w0: stopping experience collection (169400 times) [2024-06-24 16:35:36,832][15401] InferenceWorker_p0-w0: resuming experience collection (169400 times) [2024-06-24 16:35:38,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 11448893440. Throughput: 0: 42514.1. Samples: 11448993400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 16:35:40,251][15401] Updated weights for policy 0, policy_version 698792 (0.0044) [2024-06-24 16:35:43,390][15132] Fps is (10 sec: 37682.0, 60 sec: 42325.0, 300 sec: 42709.8). Total num frames: 11449106432. Throughput: 0: 42553.0. Samples: 11449246520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:43,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 16:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698799_11449122816.pth... [2024-06-24 16:35:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698175_11438899200.pth [2024-06-24 16:35:44,670][15401] Updated weights for policy 0, policy_version 698802 (0.0039) [2024-06-24 16:35:47,796][15401] Updated weights for policy 0, policy_version 698812 (0.0027) [2024-06-24 16:35:48,390][15132] Fps is (10 sec: 47511.8, 60 sec: 43149.0, 300 sec: 42820.5). Total num frames: 11449368576. Throughput: 0: 42356.3. Samples: 11449495920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:48,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 16:35:52,163][15401] Updated weights for policy 0, policy_version 698822 (0.0036) [2024-06-24 16:35:53,390][15132] Fps is (10 sec: 44238.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11449548800. Throughput: 0: 42509.7. Samples: 11449633080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:53,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 16:35:55,458][15401] Updated weights for policy 0, policy_version 698832 (0.0045) [2024-06-24 16:35:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11449761792. Throughput: 0: 42643.4. Samples: 11449892740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:35:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 16:36:00,105][15401] Updated weights for policy 0, policy_version 698842 (0.0024) [2024-06-24 16:36:03,062][15401] Updated weights for policy 0, policy_version 698852 (0.0040) [2024-06-24 16:36:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11450007552. Throughput: 0: 42665.3. Samples: 11450143080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:36:03,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 16:36:07,494][15401] Updated weights for policy 0, policy_version 698862 (0.0025) [2024-06-24 16:36:08,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42709.3). Total num frames: 11450204160. Throughput: 0: 42751.1. Samples: 11450279020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 16:36:08,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 16:36:10,839][15401] Updated weights for policy 0, policy_version 698872 (0.0025) [2024-06-24 16:36:13,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11450384384. Throughput: 0: 42776.4. Samples: 11450531640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 16:36:15,260][15401] Updated weights for policy 0, policy_version 698882 (0.0040) [2024-06-24 16:36:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 11450630144. Throughput: 0: 42900.6. Samples: 11450786600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 16:36:18,491][15401] Updated weights for policy 0, policy_version 698892 (0.0038) [2024-06-24 16:36:22,845][15401] Updated weights for policy 0, policy_version 698902 (0.0023) [2024-06-24 16:36:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11450826752. Throughput: 0: 42743.8. Samples: 11450916880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 16:36:26,043][15401] Updated weights for policy 0, policy_version 698912 (0.0029) [2024-06-24 16:36:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11451039744. Throughput: 0: 42613.3. Samples: 11451164100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:28,394][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 16:36:30,349][15401] Updated weights for policy 0, policy_version 698922 (0.0034) [2024-06-24 16:36:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11451269120. Throughput: 0: 42910.8. Samples: 11451426900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 16:36:33,891][15401] Updated weights for policy 0, policy_version 698932 (0.0034) [2024-06-24 16:36:37,904][15401] Updated weights for policy 0, policy_version 698942 (0.0039) [2024-06-24 16:36:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11451465728. Throughput: 0: 42794.3. Samples: 11451558820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 16:36:41,471][15401] Updated weights for policy 0, policy_version 698952 (0.0034) [2024-06-24 16:36:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 11451678720. Throughput: 0: 42519.2. Samples: 11451806100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 16:36:46,196][15401] Updated weights for policy 0, policy_version 698962 (0.0042) [2024-06-24 16:36:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11451908096. Throughput: 0: 42724.5. Samples: 11452065580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 16:36:49,285][15401] Updated weights for policy 0, policy_version 698972 (0.0032) [2024-06-24 16:36:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11452104704. Throughput: 0: 42666.6. Samples: 11452198920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 16:36:53,767][15401] Updated weights for policy 0, policy_version 698982 (0.0046) [2024-06-24 16:36:57,061][15401] Updated weights for policy 0, policy_version 698992 (0.0033) [2024-06-24 16:36:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.7, 300 sec: 42709.6). Total num frames: 11452334080. Throughput: 0: 42551.2. Samples: 11452446440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:36:58,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 16:36:59,687][15349] Signal inference workers to stop experience collection... (169450 times) [2024-06-24 16:36:59,692][15349] Signal inference workers to resume experience collection... (169450 times) [2024-06-24 16:36:59,709][15401] InferenceWorker_p0-w0: stopping experience collection (169450 times) [2024-06-24 16:36:59,743][15401] InferenceWorker_p0-w0: resuming experience collection (169450 times) [2024-06-24 16:37:01,322][15401] Updated weights for policy 0, policy_version 699002 (0.0038) [2024-06-24 16:37:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 11452547072. Throughput: 0: 42676.0. Samples: 11452707020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 16:37:04,511][15401] Updated weights for policy 0, policy_version 699012 (0.0032) [2024-06-24 16:37:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 11452760064. Throughput: 0: 42570.8. Samples: 11452832560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 16:37:08,830][15401] Updated weights for policy 0, policy_version 699022 (0.0033) [2024-06-24 16:37:12,482][15401] Updated weights for policy 0, policy_version 699032 (0.0033) [2024-06-24 16:37:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11452989440. Throughput: 0: 42726.7. Samples: 11453086800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 16:37:17,010][15401] Updated weights for policy 0, policy_version 699042 (0.0029) [2024-06-24 16:37:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11453169664. Throughput: 0: 42479.2. Samples: 11453338460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 16:37:20,241][15401] Updated weights for policy 0, policy_version 699052 (0.0025) [2024-06-24 16:37:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11453382656. Throughput: 0: 42348.5. Samples: 11453464500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 16:37:24,463][15401] Updated weights for policy 0, policy_version 699062 (0.0040) [2024-06-24 16:37:28,259][15401] Updated weights for policy 0, policy_version 699072 (0.0032) [2024-06-24 16:37:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11453595648. Throughput: 0: 42655.7. Samples: 11453725600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 16:37:32,042][15401] Updated weights for policy 0, policy_version 699082 (0.0032) [2024-06-24 16:37:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11453808640. Throughput: 0: 42506.4. Samples: 11453978360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 16:37:35,874][15401] Updated weights for policy 0, policy_version 699092 (0.0045) [2024-06-24 16:37:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11454038016. Throughput: 0: 42410.7. Samples: 11454107400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 16:37:39,649][15401] Updated weights for policy 0, policy_version 699102 (0.0041) [2024-06-24 16:37:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 11454251008. Throughput: 0: 42584.8. Samples: 11454362760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 16:37:43,398][15401] Updated weights for policy 0, policy_version 699112 (0.0033) [2024-06-24 16:37:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699112_11454251008.pth... [2024-06-24 16:37:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698486_11443994624.pth [2024-06-24 16:37:47,146][15401] Updated weights for policy 0, policy_version 699122 (0.0040) [2024-06-24 16:37:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11454464000. Throughput: 0: 42467.5. Samples: 11454618060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 16:37:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 16:37:51,156][15401] Updated weights for policy 0, policy_version 699132 (0.0036) [2024-06-24 16:37:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11454660608. Throughput: 0: 42597.3. Samples: 11454749440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:37:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 16:37:54,669][15401] Updated weights for policy 0, policy_version 699142 (0.0033) [2024-06-24 16:37:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11454873600. Throughput: 0: 42558.3. Samples: 11455001920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:37:58,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 16:37:58,878][15401] Updated weights for policy 0, policy_version 699152 (0.0037) [2024-06-24 16:38:02,890][15401] Updated weights for policy 0, policy_version 699162 (0.0036) [2024-06-24 16:38:03,394][15132] Fps is (10 sec: 42580.1, 60 sec: 42322.3, 300 sec: 42597.8). Total num frames: 11455086592. Throughput: 0: 42664.8. Samples: 11455258560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:03,394][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 16:38:06,511][15401] Updated weights for policy 0, policy_version 699172 (0.0031) [2024-06-24 16:38:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42709.4). Total num frames: 11455283200. Throughput: 0: 42749.6. Samples: 11455388240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 16:38:10,295][15401] Updated weights for policy 0, policy_version 699182 (0.0044) [2024-06-24 16:38:13,389][15132] Fps is (10 sec: 44255.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11455528960. Throughput: 0: 42563.5. Samples: 11455640960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 16:38:14,719][15401] Updated weights for policy 0, policy_version 699192 (0.0027) [2024-06-24 16:38:16,983][15349] Signal inference workers to stop experience collection... (169500 times) [2024-06-24 16:38:16,983][15349] Signal inference workers to resume experience collection... (169500 times) [2024-06-24 16:38:17,022][15401] InferenceWorker_p0-w0: stopping experience collection (169500 times) [2024-06-24 16:38:17,023][15401] InferenceWorker_p0-w0: resuming experience collection (169500 times) [2024-06-24 16:38:17,780][15401] Updated weights for policy 0, policy_version 699202 (0.0037) [2024-06-24 16:38:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11455725568. Throughput: 0: 42622.0. Samples: 11455896360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 16:38:22,427][15401] Updated weights for policy 0, policy_version 699212 (0.0038) [2024-06-24 16:38:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11455938560. Throughput: 0: 42569.9. Samples: 11456023040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 16:38:25,912][15401] Updated weights for policy 0, policy_version 699222 (0.0034) [2024-06-24 16:38:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11456184320. Throughput: 0: 42663.0. Samples: 11456282600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 16:38:30,048][15401] Updated weights for policy 0, policy_version 699232 (0.0038) [2024-06-24 16:38:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11456364544. Throughput: 0: 42686.2. Samples: 11456538940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 16:38:33,522][15401] Updated weights for policy 0, policy_version 699242 (0.0035) [2024-06-24 16:38:38,006][15401] Updated weights for policy 0, policy_version 699252 (0.0031) [2024-06-24 16:38:38,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 11456561152. Throughput: 0: 42435.4. Samples: 11456659140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:38,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 16:38:41,057][15401] Updated weights for policy 0, policy_version 699262 (0.0027) [2024-06-24 16:38:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42654.6). Total num frames: 11456823296. Throughput: 0: 42584.5. Samples: 11456918220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 16:38:45,425][15401] Updated weights for policy 0, policy_version 699272 (0.0032) [2024-06-24 16:38:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11457003520. Throughput: 0: 42684.0. Samples: 11457179160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 16:38:48,981][15401] Updated weights for policy 0, policy_version 699282 (0.0031) [2024-06-24 16:38:53,022][15401] Updated weights for policy 0, policy_version 699292 (0.0028) [2024-06-24 16:38:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11457216512. Throughput: 0: 42465.8. Samples: 11457299200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 16:38:56,611][15401] Updated weights for policy 0, policy_version 699302 (0.0030) [2024-06-24 16:38:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11457462272. Throughput: 0: 42643.6. Samples: 11457559920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:38:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 16:39:00,702][15401] Updated weights for policy 0, policy_version 699312 (0.0030) [2024-06-24 16:39:03,392][15132] Fps is (10 sec: 42590.1, 60 sec: 42600.0, 300 sec: 42542.6). Total num frames: 11457642496. Throughput: 0: 42739.1. Samples: 11457819700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:03,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 16:39:04,185][15401] Updated weights for policy 0, policy_version 699322 (0.0041) [2024-06-24 16:39:08,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11457839104. Throughput: 0: 42546.5. Samples: 11457937640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 16:39:08,615][15401] Updated weights for policy 0, policy_version 699332 (0.0029) [2024-06-24 16:39:12,203][15401] Updated weights for policy 0, policy_version 699342 (0.0035) [2024-06-24 16:39:13,390][15132] Fps is (10 sec: 45884.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11458101248. Throughput: 0: 42755.2. Samples: 11458206580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 16:39:16,703][15401] Updated weights for policy 0, policy_version 699352 (0.0039) [2024-06-24 16:39:18,391][15132] Fps is (10 sec: 45871.1, 60 sec: 42870.8, 300 sec: 42653.8). Total num frames: 11458297856. Throughput: 0: 42510.1. Samples: 11458451940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:18,391][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 16:39:19,624][15401] Updated weights for policy 0, policy_version 699362 (0.0029) [2024-06-24 16:39:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11458478080. Throughput: 0: 42586.8. Samples: 11458575440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 16:39:24,227][15401] Updated weights for policy 0, policy_version 699372 (0.0027) [2024-06-24 16:39:27,060][15401] Updated weights for policy 0, policy_version 699382 (0.0029) [2024-06-24 16:39:28,390][15132] Fps is (10 sec: 44241.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11458740224. Throughput: 0: 42617.2. Samples: 11458836000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-24 16:39:28,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 16:39:31,793][15349] Signal inference workers to stop experience collection... (169550 times) [2024-06-24 16:39:31,796][15349] Signal inference workers to resume experience collection... (169550 times) [2024-06-24 16:39:31,808][15401] Updated weights for policy 0, policy_version 699392 (0.0034) [2024-06-24 16:39:31,824][15401] InferenceWorker_p0-w0: stopping experience collection (169550 times) [2024-06-24 16:39:31,825][15401] InferenceWorker_p0-w0: resuming experience collection (169550 times) [2024-06-24 16:39:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11458953216. Throughput: 0: 42477.8. Samples: 11459090660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 16:39:34,968][15401] Updated weights for policy 0, policy_version 699402 (0.0051) [2024-06-24 16:39:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11459133440. Throughput: 0: 42630.3. Samples: 11459217560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 16:39:39,361][15401] Updated weights for policy 0, policy_version 699412 (0.0048) [2024-06-24 16:39:42,441][15401] Updated weights for policy 0, policy_version 699422 (0.0043) [2024-06-24 16:39:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 11459379200. Throughput: 0: 42663.9. Samples: 11459479800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:43,394][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 16:39:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699425_11459379200.pth... [2024-06-24 16:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000698799_11449122816.pth [2024-06-24 16:39:46,898][15401] Updated weights for policy 0, policy_version 699432 (0.0036) [2024-06-24 16:39:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11459559424. Throughput: 0: 42573.9. Samples: 11459735440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 16:39:50,116][15401] Updated weights for policy 0, policy_version 699442 (0.0046) [2024-06-24 16:39:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11459788800. Throughput: 0: 42631.8. Samples: 11459856060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 16:39:54,421][15401] Updated weights for policy 0, policy_version 699452 (0.0027) [2024-06-24 16:39:57,615][15401] Updated weights for policy 0, policy_version 699462 (0.0024) [2024-06-24 16:39:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 11460001792. Throughput: 0: 42283.2. Samples: 11460109320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:39:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 16:40:02,523][15401] Updated weights for policy 0, policy_version 699472 (0.0023) [2024-06-24 16:40:03,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42325.1, 300 sec: 42542.5). Total num frames: 11460182016. Throughput: 0: 42827.6. Samples: 11460379240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:03,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 16:40:04,975][15401] Updated weights for policy 0, policy_version 699482 (0.0035) [2024-06-24 16:40:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11460411392. Throughput: 0: 42683.5. Samples: 11460496200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 16:40:10,039][15401] Updated weights for policy 0, policy_version 699492 (0.0032) [2024-06-24 16:40:12,618][15401] Updated weights for policy 0, policy_version 699502 (0.0035) [2024-06-24 16:40:13,389][15132] Fps is (10 sec: 49164.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11460673536. Throughput: 0: 42757.0. Samples: 11460760060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 16:40:17,510][15401] Updated weights for policy 0, policy_version 699512 (0.0033) [2024-06-24 16:40:18,390][15132] Fps is (10 sec: 40958.2, 60 sec: 42052.7, 300 sec: 42598.3). Total num frames: 11460820992. Throughput: 0: 42855.1. Samples: 11461019160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:18,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 16:40:20,616][15401] Updated weights for policy 0, policy_version 699522 (0.0044) [2024-06-24 16:40:23,396][15132] Fps is (10 sec: 39295.9, 60 sec: 43139.8, 300 sec: 42653.3). Total num frames: 11461066752. Throughput: 0: 42573.4. Samples: 11461133640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:23,397][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 16:40:25,167][15401] Updated weights for policy 0, policy_version 699532 (0.0023) [2024-06-24 16:40:28,208][15401] Updated weights for policy 0, policy_version 699542 (0.0031) [2024-06-24 16:40:28,389][15132] Fps is (10 sec: 47515.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11461296128. Throughput: 0: 42626.7. Samples: 11461398000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 16:40:32,714][15401] Updated weights for policy 0, policy_version 699552 (0.0036) [2024-06-24 16:40:33,396][15132] Fps is (10 sec: 40960.3, 60 sec: 42047.8, 300 sec: 42653.0). Total num frames: 11461476352. Throughput: 0: 42660.6. Samples: 11461655440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:33,396][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 16:40:35,928][15401] Updated weights for policy 0, policy_version 699562 (0.0033) [2024-06-24 16:40:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11461705728. Throughput: 0: 42669.2. Samples: 11461776180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 16:40:40,330][15401] Updated weights for policy 0, policy_version 699572 (0.0033) [2024-06-24 16:40:43,390][15132] Fps is (10 sec: 45904.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11461935104. Throughput: 0: 43038.2. Samples: 11462046040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 16:40:43,406][15401] Updated weights for policy 0, policy_version 699582 (0.0041) [2024-06-24 16:40:44,249][15349] Signal inference workers to stop experience collection... (169600 times) [2024-06-24 16:40:44,283][15401] InferenceWorker_p0-w0: stopping experience collection (169600 times) [2024-06-24 16:40:44,318][15349] Signal inference workers to resume experience collection... (169600 times) [2024-06-24 16:40:44,318][15401] InferenceWorker_p0-w0: resuming experience collection (169600 times) [2024-06-24 16:40:47,933][15401] Updated weights for policy 0, policy_version 699592 (0.0048) [2024-06-24 16:40:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11462131712. Throughput: 0: 42683.1. Samples: 11462299880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 16:40:51,531][15401] Updated weights for policy 0, policy_version 699602 (0.0037) [2024-06-24 16:40:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11462361088. Throughput: 0: 43032.4. Samples: 11462432660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 16:40:55,314][15401] Updated weights for policy 0, policy_version 699612 (0.0028) [2024-06-24 16:40:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 11462557696. Throughput: 0: 42889.2. Samples: 11462690080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:40:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 16:40:59,119][15401] Updated weights for policy 0, policy_version 699622 (0.0058) [2024-06-24 16:41:02,910][15401] Updated weights for policy 0, policy_version 699632 (0.0037) [2024-06-24 16:41:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43419.3, 300 sec: 42654.3). Total num frames: 11462787072. Throughput: 0: 42821.8. Samples: 11462946120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:41:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 16:41:06,753][15401] Updated weights for policy 0, policy_version 699642 (0.0027) [2024-06-24 16:41:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11463000064. Throughput: 0: 43244.4. Samples: 11463079360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:41:08,399][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 16:41:10,430][15401] Updated weights for policy 0, policy_version 699652 (0.0034) [2024-06-24 16:41:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11463196672. Throughput: 0: 43088.4. Samples: 11463336980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 16:41:13,400][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 16:41:14,279][15401] Updated weights for policy 0, policy_version 699662 (0.0026) [2024-06-24 16:41:18,059][15401] Updated weights for policy 0, policy_version 699672 (0.0032) [2024-06-24 16:41:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.9, 300 sec: 42709.5). Total num frames: 11463426048. Throughput: 0: 42969.6. Samples: 11463588800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:18,404][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 16:41:21,858][15401] Updated weights for policy 0, policy_version 699682 (0.0025) [2024-06-24 16:41:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 11463639040. Throughput: 0: 43384.0. Samples: 11463728460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 16:41:25,742][15401] Updated weights for policy 0, policy_version 699692 (0.0031) [2024-06-24 16:41:28,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42597.3, 300 sec: 42653.7). Total num frames: 11463852032. Throughput: 0: 43038.1. Samples: 11463982820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:28,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 16:41:29,475][15401] Updated weights for policy 0, policy_version 699702 (0.0039) [2024-06-24 16:41:33,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43144.5, 300 sec: 42708.6). Total num frames: 11464065024. Throughput: 0: 43086.7. Samples: 11464239060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:33,396][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 16:41:33,633][15401] Updated weights for policy 0, policy_version 699712 (0.0035) [2024-06-24 16:41:37,213][15401] Updated weights for policy 0, policy_version 699722 (0.0030) [2024-06-24 16:41:38,390][15132] Fps is (10 sec: 44243.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11464294400. Throughput: 0: 42969.7. Samples: 11464366300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 16:41:41,268][15401] Updated weights for policy 0, policy_version 699732 (0.0030) [2024-06-24 16:41:43,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11464491008. Throughput: 0: 42862.3. Samples: 11464618880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 16:41:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699738_11464507392.pth... [2024-06-24 16:41:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699112_11454251008.pth [2024-06-24 16:41:44,954][15401] Updated weights for policy 0, policy_version 699742 (0.0036) [2024-06-24 16:41:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11464704000. Throughput: 0: 42754.3. Samples: 11464870060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 16:41:49,212][15401] Updated weights for policy 0, policy_version 699752 (0.0047) [2024-06-24 16:41:52,661][15401] Updated weights for policy 0, policy_version 699762 (0.0045) [2024-06-24 16:41:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11464916992. Throughput: 0: 42642.3. Samples: 11464998260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 16:41:57,005][15401] Updated weights for policy 0, policy_version 699772 (0.0028) [2024-06-24 16:41:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11465129984. Throughput: 0: 42693.3. Samples: 11465258180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:41:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 16:42:00,408][15401] Updated weights for policy 0, policy_version 699782 (0.0036) [2024-06-24 16:42:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11465326592. Throughput: 0: 42726.8. Samples: 11465511500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 16:42:04,607][15401] Updated weights for policy 0, policy_version 699792 (0.0040) [2024-06-24 16:42:08,251][15401] Updated weights for policy 0, policy_version 699802 (0.0034) [2024-06-24 16:42:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11465572352. Throughput: 0: 42496.5. Samples: 11465640800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 16:42:08,907][15349] Signal inference workers to stop experience collection... (169650 times) [2024-06-24 16:42:08,908][15349] Signal inference workers to resume experience collection... (169650 times) [2024-06-24 16:42:08,946][15401] InferenceWorker_p0-w0: stopping experience collection (169650 times) [2024-06-24 16:42:08,946][15401] InferenceWorker_p0-w0: resuming experience collection (169650 times) [2024-06-24 16:42:12,312][15401] Updated weights for policy 0, policy_version 699812 (0.0040) [2024-06-24 16:42:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11465768960. Throughput: 0: 42538.0. Samples: 11465896960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 16:42:15,718][15401] Updated weights for policy 0, policy_version 699822 (0.0035) [2024-06-24 16:42:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11465981952. Throughput: 0: 42432.5. Samples: 11466148260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 16:42:20,051][15401] Updated weights for policy 0, policy_version 699832 (0.0041) [2024-06-24 16:42:23,346][15401] Updated weights for policy 0, policy_version 699842 (0.0039) [2024-06-24 16:42:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11466211328. Throughput: 0: 42503.6. Samples: 11466278960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 16:42:27,673][15401] Updated weights for policy 0, policy_version 699852 (0.0049) [2024-06-24 16:42:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42599.5, 300 sec: 42709.5). Total num frames: 11466407936. Throughput: 0: 42573.4. Samples: 11466534680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 16:42:30,923][15401] Updated weights for policy 0, policy_version 699862 (0.0047) [2024-06-24 16:42:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 11466637312. Throughput: 0: 42539.9. Samples: 11466784360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 16:42:35,584][15401] Updated weights for policy 0, policy_version 699872 (0.0033) [2024-06-24 16:42:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11466850304. Throughput: 0: 42670.6. Samples: 11466918440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 16:42:38,542][15401] Updated weights for policy 0, policy_version 699882 (0.0043) [2024-06-24 16:42:43,082][15401] Updated weights for policy 0, policy_version 699892 (0.0038) [2024-06-24 16:42:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11467046912. Throughput: 0: 42562.3. Samples: 11467173480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 16:42:46,071][15401] Updated weights for policy 0, policy_version 699902 (0.0046) [2024-06-24 16:42:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11467259904. Throughput: 0: 42580.4. Samples: 11467427620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 16:42:50,587][15401] Updated weights for policy 0, policy_version 699912 (0.0030) [2024-06-24 16:42:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11467489280. Throughput: 0: 42540.8. Samples: 11467555140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 16:42:53,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 16:42:54,483][15401] Updated weights for policy 0, policy_version 699922 (0.0029) [2024-06-24 16:42:58,309][15401] Updated weights for policy 0, policy_version 699932 (0.0033) [2024-06-24 16:42:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.7). Total num frames: 11467685888. Throughput: 0: 42641.2. Samples: 11467815920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:42:58,393][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 16:43:02,088][15401] Updated weights for policy 0, policy_version 699942 (0.0034) [2024-06-24 16:43:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11467898880. Throughput: 0: 42579.3. Samples: 11468064320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:03,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 16:43:05,786][15401] Updated weights for policy 0, policy_version 699952 (0.0038) [2024-06-24 16:43:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11468128256. Throughput: 0: 42613.4. Samples: 11468196560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 16:43:09,690][15401] Updated weights for policy 0, policy_version 699962 (0.0049) [2024-06-24 16:43:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11468324864. Throughput: 0: 42474.6. Samples: 11468446040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 16:43:13,783][15401] Updated weights for policy 0, policy_version 699972 (0.0030) [2024-06-24 16:43:17,294][15401] Updated weights for policy 0, policy_version 699982 (0.0034) [2024-06-24 16:43:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11468537856. Throughput: 0: 42672.9. Samples: 11468704640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 16:43:21,394][15401] Updated weights for policy 0, policy_version 699992 (0.0039) [2024-06-24 16:43:22,573][15349] Signal inference workers to stop experience collection... (169700 times) [2024-06-24 16:43:22,597][15401] InferenceWorker_p0-w0: stopping experience collection (169700 times) [2024-06-24 16:43:22,630][15349] Signal inference workers to resume experience collection... (169700 times) [2024-06-24 16:43:22,631][15401] InferenceWorker_p0-w0: resuming experience collection (169700 times) [2024-06-24 16:43:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11468750848. Throughput: 0: 42468.1. Samples: 11468829500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 16:43:25,530][15401] Updated weights for policy 0, policy_version 700002 (0.0031) [2024-06-24 16:43:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11468963840. Throughput: 0: 42504.4. Samples: 11469086180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 16:43:29,371][15401] Updated weights for policy 0, policy_version 700012 (0.0025) [2024-06-24 16:43:33,043][15401] Updated weights for policy 0, policy_version 700022 (0.0040) [2024-06-24 16:43:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 11469193216. Throughput: 0: 42626.6. Samples: 11469345820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:33,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 16:43:36,911][15401] Updated weights for policy 0, policy_version 700032 (0.0047) [2024-06-24 16:43:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11469389824. Throughput: 0: 42544.9. Samples: 11469469660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:38,395][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 16:43:40,739][15401] Updated weights for policy 0, policy_version 700042 (0.0030) [2024-06-24 16:43:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11469619200. Throughput: 0: 42475.2. Samples: 11469727200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:43,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 16:43:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700050_11469619200.pth... [2024-06-24 16:43:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699425_11459379200.pth [2024-06-24 16:43:44,324][15401] Updated weights for policy 0, policy_version 700052 (0.0043) [2024-06-24 16:43:48,278][15401] Updated weights for policy 0, policy_version 700062 (0.0029) [2024-06-24 16:43:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11469815808. Throughput: 0: 42742.3. Samples: 11469987720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 16:43:51,719][15401] Updated weights for policy 0, policy_version 700072 (0.0040) [2024-06-24 16:43:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11470045184. Throughput: 0: 42623.0. Samples: 11470114600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:53,391][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 16:43:55,978][15401] Updated weights for policy 0, policy_version 700082 (0.0029) [2024-06-24 16:43:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42765.3). Total num frames: 11470258176. Throughput: 0: 42849.0. Samples: 11470374240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:43:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 16:43:59,292][15401] Updated weights for policy 0, policy_version 700092 (0.0024) [2024-06-24 16:44:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11470454784. Throughput: 0: 42875.6. Samples: 11470634040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 16:44:03,675][15401] Updated weights for policy 0, policy_version 700102 (0.0034) [2024-06-24 16:44:06,996][15401] Updated weights for policy 0, policy_version 700112 (0.0036) [2024-06-24 16:44:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11470684160. Throughput: 0: 42834.6. Samples: 11470757060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 16:44:11,438][15401] Updated weights for policy 0, policy_version 700122 (0.0022) [2024-06-24 16:44:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 11470880768. Throughput: 0: 42799.6. Samples: 11471012160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 16:44:14,784][15401] Updated weights for policy 0, policy_version 700132 (0.0035) [2024-06-24 16:44:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11471093760. Throughput: 0: 42574.6. Samples: 11471261680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 16:44:19,193][15401] Updated weights for policy 0, policy_version 700142 (0.0035) [2024-06-24 16:44:22,307][15401] Updated weights for policy 0, policy_version 700152 (0.0047) [2024-06-24 16:44:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11471339520. Throughput: 0: 42778.6. Samples: 11471394700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 16:44:26,939][15401] Updated weights for policy 0, policy_version 700162 (0.0048) [2024-06-24 16:44:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11471503360. Throughput: 0: 42725.7. Samples: 11471649860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 16:44:29,843][15401] Updated weights for policy 0, policy_version 700172 (0.0038) [2024-06-24 16:44:29,853][15349] Signal inference workers to stop experience collection... (169750 times) [2024-06-24 16:44:29,854][15349] Signal inference workers to resume experience collection... (169750 times) [2024-06-24 16:44:29,893][15401] InferenceWorker_p0-w0: stopping experience collection (169750 times) [2024-06-24 16:44:29,900][15401] InferenceWorker_p0-w0: resuming experience collection (169750 times) [2024-06-24 16:44:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11471749120. Throughput: 0: 42585.5. Samples: 11471904080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 16:44:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 16:44:34,495][15401] Updated weights for policy 0, policy_version 700182 (0.0036) [2024-06-24 16:44:37,635][15401] Updated weights for policy 0, policy_version 700192 (0.0036) [2024-06-24 16:44:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11471962112. Throughput: 0: 42761.8. Samples: 11472038880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:44:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 16:44:42,458][15401] Updated weights for policy 0, policy_version 700202 (0.0035) [2024-06-24 16:44:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11472158720. Throughput: 0: 42694.7. Samples: 11472295500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:44:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 16:44:45,199][15401] Updated weights for policy 0, policy_version 700212 (0.0032) [2024-06-24 16:44:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 11472404480. Throughput: 0: 42410.7. Samples: 11472542620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:44:48,392][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 16:44:50,142][15401] Updated weights for policy 0, policy_version 700222 (0.0038) [2024-06-24 16:44:52,885][15401] Updated weights for policy 0, policy_version 700232 (0.0033) [2024-06-24 16:44:53,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11472617472. Throughput: 0: 42824.8. Samples: 11472684180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:44:53,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 16:44:57,782][15401] Updated weights for policy 0, policy_version 700242 (0.0032) [2024-06-24 16:44:58,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 11472797696. Throughput: 0: 42759.0. Samples: 11472936320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:44:58,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 16:45:00,531][15401] Updated weights for policy 0, policy_version 700252 (0.0033) [2024-06-24 16:45:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11473043456. Throughput: 0: 42849.0. Samples: 11473189880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 16:45:05,369][15401] Updated weights for policy 0, policy_version 700262 (0.0046) [2024-06-24 16:45:08,279][15401] Updated weights for policy 0, policy_version 700272 (0.0030) [2024-06-24 16:45:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11473256448. Throughput: 0: 42945.0. Samples: 11473327220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 16:45:12,992][15401] Updated weights for policy 0, policy_version 700282 (0.0036) [2024-06-24 16:45:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11473453056. Throughput: 0: 42963.6. Samples: 11473583220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 16:45:15,982][15401] Updated weights for policy 0, policy_version 700292 (0.0034) [2024-06-24 16:45:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.9, 300 sec: 42821.1). Total num frames: 11473698816. Throughput: 0: 42868.0. Samples: 11473833240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:18,393][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 16:45:20,602][15401] Updated weights for policy 0, policy_version 700302 (0.0037) [2024-06-24 16:45:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11473895424. Throughput: 0: 43010.1. Samples: 11473974340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:23,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 16:45:23,485][15401] Updated weights for policy 0, policy_version 700312 (0.0038) [2024-06-24 16:45:28,389][15132] Fps is (10 sec: 36053.7, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 11474059264. Throughput: 0: 42825.8. Samples: 11474222660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:28,392][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 16:45:28,690][15401] Updated weights for policy 0, policy_version 700322 (0.0032) [2024-06-24 16:45:31,302][15401] Updated weights for policy 0, policy_version 700332 (0.0031) [2024-06-24 16:45:33,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 11474321408. Throughput: 0: 42783.5. Samples: 11474467880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:33,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 16:45:36,468][15401] Updated weights for policy 0, policy_version 700342 (0.0036) [2024-06-24 16:45:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11474518016. Throughput: 0: 42851.8. Samples: 11474612500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 16:45:38,899][15401] Updated weights for policy 0, policy_version 700352 (0.0029) [2024-06-24 16:45:43,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11474698240. Throughput: 0: 42655.2. Samples: 11474855800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 16:45:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700361_11474714624.pth... [2024-06-24 16:45:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000699738_11464507392.pth [2024-06-24 16:45:44,103][15401] Updated weights for policy 0, policy_version 700362 (0.0037) [2024-06-24 16:45:45,830][15349] Signal inference workers to stop experience collection... (169800 times) [2024-06-24 16:45:45,864][15401] InferenceWorker_p0-w0: stopping experience collection (169800 times) [2024-06-24 16:45:45,884][15349] Signal inference workers to resume experience collection... (169800 times) [2024-06-24 16:45:45,885][15401] InferenceWorker_p0-w0: resuming experience collection (169800 times) [2024-06-24 16:45:46,694][15401] Updated weights for policy 0, policy_version 700372 (0.0030) [2024-06-24 16:45:48,392][15132] Fps is (10 sec: 45865.6, 60 sec: 42871.7, 300 sec: 42764.7). Total num frames: 11474976768. Throughput: 0: 42570.0. Samples: 11475105620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:48,392][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 16:45:51,834][15401] Updated weights for policy 0, policy_version 700382 (0.0039) [2024-06-24 16:45:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11475140608. Throughput: 0: 42600.3. Samples: 11475244240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 16:45:54,372][15401] Updated weights for policy 0, policy_version 700392 (0.0039) [2024-06-24 16:45:58,390][15132] Fps is (10 sec: 37690.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11475353600. Throughput: 0: 42453.4. Samples: 11475493620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:45:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 16:45:59,469][15401] Updated weights for policy 0, policy_version 700402 (0.0033) [2024-06-24 16:46:01,957][15401] Updated weights for policy 0, policy_version 700412 (0.0031) [2024-06-24 16:46:03,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11475615744. Throughput: 0: 42437.4. Samples: 11475742820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:46:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 16:46:07,352][15401] Updated weights for policy 0, policy_version 700422 (0.0038) [2024-06-24 16:46:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11475779584. Throughput: 0: 42381.0. Samples: 11475881480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:46:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 16:46:09,623][15401] Updated weights for policy 0, policy_version 700432 (0.0034) [2024-06-24 16:46:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11476008960. Throughput: 0: 42484.9. Samples: 11476134480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 16:46:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 16:46:14,861][15401] Updated weights for policy 0, policy_version 700442 (0.0032) [2024-06-24 16:46:17,254][15401] Updated weights for policy 0, policy_version 700452 (0.0043) [2024-06-24 16:46:18,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 11476254720. Throughput: 0: 42586.4. Samples: 11476384160. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 16:46:22,275][15401] Updated weights for policy 0, policy_version 700462 (0.0035) [2024-06-24 16:46:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42543.1). Total num frames: 11476402176. Throughput: 0: 42367.1. Samples: 11476519020. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 16:46:24,923][15401] Updated weights for policy 0, policy_version 700472 (0.0042) [2024-06-24 16:46:28,389][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42654.9). Total num frames: 11476647936. Throughput: 0: 42546.2. Samples: 11476770380. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:28,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 16:46:29,857][15401] Updated weights for policy 0, policy_version 700482 (0.0048) [2024-06-24 16:46:32,478][15401] Updated weights for policy 0, policy_version 700492 (0.0046) [2024-06-24 16:46:33,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 11476893696. Throughput: 0: 42612.2. Samples: 11477023080. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 16:46:37,519][15401] Updated weights for policy 0, policy_version 700502 (0.0026) [2024-06-24 16:46:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11477057536. Throughput: 0: 42545.9. Samples: 11477158800. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 16:46:40,206][15401] Updated weights for policy 0, policy_version 700512 (0.0024) [2024-06-24 16:46:43,392][15132] Fps is (10 sec: 39311.7, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 11477286912. Throughput: 0: 42604.4. Samples: 11477410920. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:43,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 16:46:45,475][15401] Updated weights for policy 0, policy_version 700522 (0.0035) [2024-06-24 16:46:46,674][15349] Signal inference workers to stop experience collection... (169850 times) [2024-06-24 16:46:46,676][15349] Signal inference workers to resume experience collection... (169850 times) [2024-06-24 16:46:46,698][15401] InferenceWorker_p0-w0: stopping experience collection (169850 times) [2024-06-24 16:46:46,698][15401] InferenceWorker_p0-w0: resuming experience collection (169850 times) [2024-06-24 16:46:47,835][15401] Updated weights for policy 0, policy_version 700532 (0.0031) [2024-06-24 16:46:48,396][15132] Fps is (10 sec: 47483.3, 60 sec: 42595.3, 300 sec: 42764.1). Total num frames: 11477532672. Throughput: 0: 42582.4. Samples: 11477659300. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:48,397][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 16:46:53,095][15401] Updated weights for policy 0, policy_version 700542 (0.0025) [2024-06-24 16:46:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11477696512. Throughput: 0: 42578.2. Samples: 11477797500. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 16:46:55,985][15401] Updated weights for policy 0, policy_version 700552 (0.0040) [2024-06-24 16:46:58,390][15132] Fps is (10 sec: 40986.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11477942272. Throughput: 0: 42521.8. Samples: 11478047960. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:46:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 16:47:00,997][15401] Updated weights for policy 0, policy_version 700562 (0.0039) [2024-06-24 16:47:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11478155264. Throughput: 0: 42635.5. Samples: 11478302760. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 16:47:03,823][15401] Updated weights for policy 0, policy_version 700572 (0.0033) [2024-06-24 16:47:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11478319104. Throughput: 0: 42556.7. Samples: 11478434080. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 16:47:08,792][15401] Updated weights for policy 0, policy_version 700582 (0.0022) [2024-06-24 16:47:11,296][15401] Updated weights for policy 0, policy_version 700592 (0.0030) [2024-06-24 16:47:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11478564864. Throughput: 0: 42588.0. Samples: 11478686840. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 16:47:16,211][15401] Updated weights for policy 0, policy_version 700602 (0.0034) [2024-06-24 16:47:18,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11478794240. Throughput: 0: 42849.3. Samples: 11478951300. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 16:47:18,902][15401] Updated weights for policy 0, policy_version 700612 (0.0027) [2024-06-24 16:47:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11478974464. Throughput: 0: 42787.1. Samples: 11479084220. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 16:47:23,714][15401] Updated weights for policy 0, policy_version 700622 (0.0042) [2024-06-24 16:47:26,585][15401] Updated weights for policy 0, policy_version 700632 (0.0032) [2024-06-24 16:47:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11479220224. Throughput: 0: 42676.5. Samples: 11479331260. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 16:47:31,204][15401] Updated weights for policy 0, policy_version 700642 (0.0037) [2024-06-24 16:47:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11479433216. Throughput: 0: 42909.7. Samples: 11479589960. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 16:47:34,427][15401] Updated weights for policy 0, policy_version 700652 (0.0030) [2024-06-24 16:47:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11479613440. Throughput: 0: 42714.3. Samples: 11479719640. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 16:47:38,649][15401] Updated weights for policy 0, policy_version 700662 (0.0034) [2024-06-24 16:47:41,890][15401] Updated weights for policy 0, policy_version 700672 (0.0035) [2024-06-24 16:47:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11479859200. Throughput: 0: 42752.9. Samples: 11479971840. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 16:47:43,550][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700676_11479875584.pth... [2024-06-24 16:47:43,603][15349] Signal inference workers to stop experience collection... (169900 times) [2024-06-24 16:47:43,603][15349] Signal inference workers to resume experience collection... (169900 times) [2024-06-24 16:47:43,622][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700050_11469619200.pth [2024-06-24 16:47:43,624][15401] InferenceWorker_p0-w0: stopping experience collection (169900 times) [2024-06-24 16:47:43,624][15401] InferenceWorker_p0-w0: resuming experience collection (169900 times) [2024-06-24 16:47:46,738][15401] Updated weights for policy 0, policy_version 700682 (0.0036) [2024-06-24 16:47:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 11480072192. Throughput: 0: 42835.5. Samples: 11480230360. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 16:47:49,500][15401] Updated weights for policy 0, policy_version 700692 (0.0026) [2024-06-24 16:47:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 11480268800. Throughput: 0: 42723.6. Samples: 11480356640. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-24 16:47:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 16:47:54,135][15401] Updated weights for policy 0, policy_version 700702 (0.0034) [2024-06-24 16:47:57,008][15401] Updated weights for policy 0, policy_version 700712 (0.0043) [2024-06-24 16:47:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11480514560. Throughput: 0: 42843.8. Samples: 11480614820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:47:58,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-24 16:48:01,686][15401] Updated weights for policy 0, policy_version 700722 (0.0027) [2024-06-24 16:48:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11480727552. Throughput: 0: 42877.7. Samples: 11480880800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:03,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-24 16:48:04,654][15401] Updated weights for policy 0, policy_version 700732 (0.0034) [2024-06-24 16:48:08,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11480907776. Throughput: 0: 42708.9. Samples: 11481006120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:08,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 16:48:09,439][15401] Updated weights for policy 0, policy_version 700742 (0.0040) [2024-06-24 16:48:12,194][15401] Updated weights for policy 0, policy_version 700752 (0.0032) [2024-06-24 16:48:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11481169920. Throughput: 0: 43014.7. Samples: 11481266920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 16:48:17,103][15401] Updated weights for policy 0, policy_version 700762 (0.0040) [2024-06-24 16:48:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11481350144. Throughput: 0: 43065.8. Samples: 11481527920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 16:48:19,720][15401] Updated weights for policy 0, policy_version 700772 (0.0040) [2024-06-24 16:48:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11481546752. Throughput: 0: 42855.9. Samples: 11481648160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:23,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 16:48:24,471][15401] Updated weights for policy 0, policy_version 700782 (0.0029) [2024-06-24 16:48:27,520][15401] Updated weights for policy 0, policy_version 700792 (0.0033) [2024-06-24 16:48:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11481808896. Throughput: 0: 43018.6. Samples: 11481907680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 16:48:32,501][15401] Updated weights for policy 0, policy_version 700802 (0.0023) [2024-06-24 16:48:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11482005504. Throughput: 0: 43048.9. Samples: 11482167560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 16:48:35,229][15401] Updated weights for policy 0, policy_version 700812 (0.0040) [2024-06-24 16:48:38,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11482202112. Throughput: 0: 42940.4. Samples: 11482288960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 16:48:40,094][15401] Updated weights for policy 0, policy_version 700822 (0.0032) [2024-06-24 16:48:42,756][15401] Updated weights for policy 0, policy_version 700832 (0.0049) [2024-06-24 16:48:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11482447872. Throughput: 0: 43086.4. Samples: 11482553700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 16:48:47,779][15401] Updated weights for policy 0, policy_version 700842 (0.0028) [2024-06-24 16:48:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11482644480. Throughput: 0: 42839.9. Samples: 11482808600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 16:48:50,654][15401] Updated weights for policy 0, policy_version 700852 (0.0039) [2024-06-24 16:48:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11482857472. Throughput: 0: 42684.0. Samples: 11482926900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 16:48:55,280][15401] Updated weights for policy 0, policy_version 700862 (0.0031) [2024-06-24 16:48:55,624][15349] Signal inference workers to stop experience collection... (169950 times) [2024-06-24 16:48:55,668][15401] InferenceWorker_p0-w0: stopping experience collection (169950 times) [2024-06-24 16:48:55,676][15349] Signal inference workers to resume experience collection... (169950 times) [2024-06-24 16:48:55,681][15401] InferenceWorker_p0-w0: resuming experience collection (169950 times) [2024-06-24 16:48:58,258][15401] Updated weights for policy 0, policy_version 700872 (0.0026) [2024-06-24 16:48:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11483086848. Throughput: 0: 42804.4. Samples: 11483193120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:48:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 16:49:02,943][15401] Updated weights for policy 0, policy_version 700882 (0.0046) [2024-06-24 16:49:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11483283456. Throughput: 0: 42826.2. Samples: 11483455100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 16:49:05,752][15401] Updated weights for policy 0, policy_version 700892 (0.0038) [2024-06-24 16:49:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11483496448. Throughput: 0: 42782.7. Samples: 11483573380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 16:49:10,536][15401] Updated weights for policy 0, policy_version 700902 (0.0041) [2024-06-24 16:49:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11483725824. Throughput: 0: 42896.1. Samples: 11483838000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 16:49:13,541][15401] Updated weights for policy 0, policy_version 700912 (0.0034) [2024-06-24 16:49:18,202][15401] Updated weights for policy 0, policy_version 700922 (0.0035) [2024-06-24 16:49:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11483922432. Throughput: 0: 42984.1. Samples: 11484101840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:18,396][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 16:49:21,404][15401] Updated weights for policy 0, policy_version 700932 (0.0042) [2024-06-24 16:49:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11484135424. Throughput: 0: 42986.3. Samples: 11484223340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 16:49:25,809][15401] Updated weights for policy 0, policy_version 700942 (0.0036) [2024-06-24 16:49:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11484364800. Throughput: 0: 42870.7. Samples: 11484482880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 16:49:28,930][15401] Updated weights for policy 0, policy_version 700952 (0.0040) [2024-06-24 16:49:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11484545024. Throughput: 0: 43016.4. Samples: 11484744340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-24 16:49:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 16:49:33,398][15401] Updated weights for policy 0, policy_version 700962 (0.0041) [2024-06-24 16:49:36,740][15401] Updated weights for policy 0, policy_version 700972 (0.0042) [2024-06-24 16:49:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11484790784. Throughput: 0: 43068.4. Samples: 11484864980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:49:38,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 16:49:40,976][15401] Updated weights for policy 0, policy_version 700982 (0.0025) [2024-06-24 16:49:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 11484987392. Throughput: 0: 42906.7. Samples: 11485123920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:49:43,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 16:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700989_11485003776.pth... [2024-06-24 16:49:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700361_11474714624.pth [2024-06-24 16:49:44,369][15401] Updated weights for policy 0, policy_version 700992 (0.0049) [2024-06-24 16:49:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11485200384. Throughput: 0: 42986.3. Samples: 11485389480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:49:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 16:49:48,554][15401] Updated weights for policy 0, policy_version 701002 (0.0044) [2024-06-24 16:49:51,984][15401] Updated weights for policy 0, policy_version 701012 (0.0040) [2024-06-24 16:49:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11485429760. Throughput: 0: 43066.7. Samples: 11485511380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:49:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 16:49:56,037][15401] Updated weights for policy 0, policy_version 701022 (0.0042) [2024-06-24 16:49:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11485642752. Throughput: 0: 42900.0. Samples: 11485768500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:49:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 16:49:59,557][15401] Updated weights for policy 0, policy_version 701032 (0.0027) [2024-06-24 16:50:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11485855744. Throughput: 0: 42813.3. Samples: 11486028440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 16:50:03,506][15401] Updated weights for policy 0, policy_version 701042 (0.0022) [2024-06-24 16:50:07,247][15401] Updated weights for policy 0, policy_version 701052 (0.0046) [2024-06-24 16:50:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11486068736. Throughput: 0: 42954.1. Samples: 11486156280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 16:50:11,089][15401] Updated weights for policy 0, policy_version 701062 (0.0039) [2024-06-24 16:50:11,604][15349] Signal inference workers to stop experience collection... (170000 times) [2024-06-24 16:50:11,631][15401] InferenceWorker_p0-w0: stopping experience collection (170000 times) [2024-06-24 16:50:11,669][15349] Signal inference workers to resume experience collection... (170000 times) [2024-06-24 16:50:11,671][15401] InferenceWorker_p0-w0: resuming experience collection (170000 times) [2024-06-24 16:50:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 11486281728. Throughput: 0: 42835.4. Samples: 11486410480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 16:50:14,930][15401] Updated weights for policy 0, policy_version 701072 (0.0027) [2024-06-24 16:50:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11486494720. Throughput: 0: 42813.9. Samples: 11486670960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:18,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 16:50:18,817][15401] Updated weights for policy 0, policy_version 701082 (0.0034) [2024-06-24 16:50:22,947][15401] Updated weights for policy 0, policy_version 701092 (0.0033) [2024-06-24 16:50:23,396][15132] Fps is (10 sec: 44208.9, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 11486724096. Throughput: 0: 42929.5. Samples: 11486797080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:23,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 16:50:26,491][15401] Updated weights for policy 0, policy_version 701102 (0.0035) [2024-06-24 16:50:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 11486937088. Throughput: 0: 42914.2. Samples: 11487055060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 16:50:30,393][15401] Updated weights for policy 0, policy_version 701112 (0.0048) [2024-06-24 16:50:33,390][15132] Fps is (10 sec: 42625.5, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11487150080. Throughput: 0: 42596.4. Samples: 11487306320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:33,391][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 16:50:34,009][15401] Updated weights for policy 0, policy_version 701122 (0.0041) [2024-06-24 16:50:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11487330304. Throughput: 0: 42768.8. Samples: 11487435980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 16:50:38,517][15401] Updated weights for policy 0, policy_version 701132 (0.0036) [2024-06-24 16:50:41,522][15401] Updated weights for policy 0, policy_version 701142 (0.0035) [2024-06-24 16:50:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42765.0). Total num frames: 11487592448. Throughput: 0: 42787.9. Samples: 11487694060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:43,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 16:50:46,203][15401] Updated weights for policy 0, policy_version 701152 (0.0033) [2024-06-24 16:50:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11487789056. Throughput: 0: 42848.3. Samples: 11487956620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:50:49,362][15401] Updated weights for policy 0, policy_version 701162 (0.0037) [2024-06-24 16:50:53,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11487985664. Throughput: 0: 42748.0. Samples: 11488079940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 16:50:53,957][15401] Updated weights for policy 0, policy_version 701172 (0.0035) [2024-06-24 16:50:56,862][15401] Updated weights for policy 0, policy_version 701182 (0.0034) [2024-06-24 16:50:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11488231424. Throughput: 0: 42717.3. Samples: 11488332760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:50:58,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 16:51:01,663][15401] Updated weights for policy 0, policy_version 701192 (0.0024) [2024-06-24 16:51:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 11488428032. Throughput: 0: 42690.1. Samples: 11488592120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:51:03,393][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 16:51:04,691][15401] Updated weights for policy 0, policy_version 701202 (0.0045) [2024-06-24 16:51:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11488624640. Throughput: 0: 42591.3. Samples: 11488713420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:51:08,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 16:51:09,342][15401] Updated weights for policy 0, policy_version 701212 (0.0036) [2024-06-24 16:51:12,197][15401] Updated weights for policy 0, policy_version 701222 (0.0035) [2024-06-24 16:51:13,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11488886784. Throughput: 0: 42727.4. Samples: 11488977800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 16:51:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 16:51:17,063][15401] Updated weights for policy 0, policy_version 701232 (0.0023) [2024-06-24 16:51:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11489067008. Throughput: 0: 42851.6. Samples: 11489234640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 16:51:19,904][15401] Updated weights for policy 0, policy_version 701242 (0.0038) [2024-06-24 16:51:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 11489280000. Throughput: 0: 42630.7. Samples: 11489354360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 16:51:25,023][15401] Updated weights for policy 0, policy_version 701252 (0.0038) [2024-06-24 16:51:27,423][15401] Updated weights for policy 0, policy_version 701262 (0.0036) [2024-06-24 16:51:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11489542144. Throughput: 0: 42756.4. Samples: 11489618000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 16:51:32,425][15401] Updated weights for policy 0, policy_version 701272 (0.0031) [2024-06-24 16:51:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11489705984. Throughput: 0: 42866.4. Samples: 11489885600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 16:51:34,935][15349] Signal inference workers to stop experience collection... (170050 times) [2024-06-24 16:51:34,957][15401] InferenceWorker_p0-w0: stopping experience collection (170050 times) [2024-06-24 16:51:34,995][15349] Signal inference workers to resume experience collection... (170050 times) [2024-06-24 16:51:34,995][15401] InferenceWorker_p0-w0: resuming experience collection (170050 times) [2024-06-24 16:51:34,997][15401] Updated weights for policy 0, policy_version 701282 (0.0025) [2024-06-24 16:51:38,390][15132] Fps is (10 sec: 39322.0, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 11489935360. Throughput: 0: 42692.1. Samples: 11490001080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:38,394][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 16:51:39,829][15401] Updated weights for policy 0, policy_version 701292 (0.0038) [2024-06-24 16:51:42,325][15401] Updated weights for policy 0, policy_version 701302 (0.0036) [2024-06-24 16:51:43,390][15132] Fps is (10 sec: 49147.8, 60 sec: 43418.8, 300 sec: 42932.5). Total num frames: 11490197504. Throughput: 0: 42955.8. Samples: 11490265800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:43,391][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 16:51:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701306_11490197504.pth... [2024-06-24 16:51:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700676_11479875584.pth [2024-06-24 16:51:47,246][15401] Updated weights for policy 0, policy_version 701312 (0.0029) [2024-06-24 16:51:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11490344960. Throughput: 0: 43331.2. Samples: 11490541920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 16:51:49,839][15401] Updated weights for policy 0, policy_version 701322 (0.0023) [2024-06-24 16:51:53,389][15132] Fps is (10 sec: 39324.9, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 11490590720. Throughput: 0: 43147.7. Samples: 11490655060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:53,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-24 16:51:55,365][15401] Updated weights for policy 0, policy_version 701332 (0.0035) [2024-06-24 16:51:57,931][15401] Updated weights for policy 0, policy_version 701342 (0.0045) [2024-06-24 16:51:58,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11490820096. Throughput: 0: 42883.2. Samples: 11490907540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:51:58,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-24 16:52:02,880][15401] Updated weights for policy 0, policy_version 701352 (0.0032) [2024-06-24 16:52:03,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 11490967552. Throughput: 0: 43067.1. Samples: 11491172660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 16:52:05,499][15401] Updated weights for policy 0, policy_version 701362 (0.0034) [2024-06-24 16:52:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11491213312. Throughput: 0: 42959.5. Samples: 11491287540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 16:52:10,399][15401] Updated weights for policy 0, policy_version 701372 (0.0035) [2024-06-24 16:52:13,329][15401] Updated weights for policy 0, policy_version 701382 (0.0039) [2024-06-24 16:52:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11491442688. Throughput: 0: 42991.7. Samples: 11491552620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:13,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 16:52:17,941][15401] Updated weights for policy 0, policy_version 701392 (0.0035) [2024-06-24 16:52:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11491606528. Throughput: 0: 42725.7. Samples: 11491808260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 16:52:20,883][15401] Updated weights for policy 0, policy_version 701402 (0.0036) [2024-06-24 16:52:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 11491868672. Throughput: 0: 42736.4. Samples: 11491924320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:23,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 16:52:26,183][15401] Updated weights for policy 0, policy_version 701412 (0.0047) [2024-06-24 16:52:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 11492081664. Throughput: 0: 42845.3. Samples: 11492193800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:28,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 16:52:28,443][15401] Updated weights for policy 0, policy_version 701422 (0.0043) [2024-06-24 16:52:33,390][15132] Fps is (10 sec: 37691.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 11492245504. Throughput: 0: 42341.7. Samples: 11492447300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:33,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-24 16:52:33,720][15401] Updated weights for policy 0, policy_version 701432 (0.0035) [2024-06-24 16:52:35,076][15349] Signal inference workers to stop experience collection... (170100 times) [2024-06-24 16:52:35,125][15401] InferenceWorker_p0-w0: stopping experience collection (170100 times) [2024-06-24 16:52:35,189][15349] Signal inference workers to resume experience collection... (170100 times) [2024-06-24 16:52:35,190][15401] InferenceWorker_p0-w0: resuming experience collection (170100 times) [2024-06-24 16:52:36,640][15401] Updated weights for policy 0, policy_version 701442 (0.0036) [2024-06-24 16:52:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11492491264. Throughput: 0: 42428.4. Samples: 11492564340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:38,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 16:52:41,306][15401] Updated weights for policy 0, policy_version 701452 (0.0033) [2024-06-24 16:52:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42052.7, 300 sec: 42876.1). Total num frames: 11492720640. Throughput: 0: 42702.1. Samples: 11492829140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:43,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 16:52:44,258][15401] Updated weights for policy 0, policy_version 701462 (0.0043) [2024-06-24 16:52:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11492884480. Throughput: 0: 42515.2. Samples: 11493085840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:48,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 16:52:48,873][15401] Updated weights for policy 0, policy_version 701472 (0.0031) [2024-06-24 16:52:52,093][15401] Updated weights for policy 0, policy_version 701482 (0.0037) [2024-06-24 16:52:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11493146624. Throughput: 0: 42593.4. Samples: 11493204240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 16:52:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 16:52:56,545][15401] Updated weights for policy 0, policy_version 701492 (0.0033) [2024-06-24 16:52:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11493343232. Throughput: 0: 42614.5. Samples: 11493470280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:52:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 16:52:59,704][15401] Updated weights for policy 0, policy_version 701502 (0.0035) [2024-06-24 16:53:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11493539840. Throughput: 0: 42524.0. Samples: 11493721840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 16:53:04,288][15401] Updated weights for policy 0, policy_version 701512 (0.0034) [2024-06-24 16:53:07,389][15401] Updated weights for policy 0, policy_version 701522 (0.0028) [2024-06-24 16:53:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11493785600. Throughput: 0: 42768.8. Samples: 11493848820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:08,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 16:53:11,913][15401] Updated weights for policy 0, policy_version 701532 (0.0037) [2024-06-24 16:53:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11493982208. Throughput: 0: 42643.1. Samples: 11494112740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 16:53:14,969][15401] Updated weights for policy 0, policy_version 701542 (0.0045) [2024-06-24 16:53:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11494195200. Throughput: 0: 42429.4. Samples: 11494356620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 16:53:19,425][15401] Updated weights for policy 0, policy_version 701552 (0.0038) [2024-06-24 16:53:22,822][15401] Updated weights for policy 0, policy_version 701562 (0.0031) [2024-06-24 16:53:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11494424576. Throughput: 0: 42799.1. Samples: 11494490300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 16:53:26,937][15401] Updated weights for policy 0, policy_version 701572 (0.0037) [2024-06-24 16:53:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 11494588416. Throughput: 0: 42529.0. Samples: 11494742940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 16:53:30,620][15401] Updated weights for policy 0, policy_version 701582 (0.0029) [2024-06-24 16:53:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 11494834176. Throughput: 0: 42508.9. Samples: 11494998740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 16:53:34,466][15401] Updated weights for policy 0, policy_version 701592 (0.0029) [2024-06-24 16:53:38,075][15401] Updated weights for policy 0, policy_version 701602 (0.0026) [2024-06-24 16:53:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11495063552. Throughput: 0: 42844.3. Samples: 11495132240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 16:53:42,332][15401] Updated weights for policy 0, policy_version 701612 (0.0025) [2024-06-24 16:53:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 11495227392. Throughput: 0: 42713.4. Samples: 11495392380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 16:53:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701614_11495243776.pth... [2024-06-24 16:53:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000700989_11485003776.pth [2024-06-24 16:53:45,621][15401] Updated weights for policy 0, policy_version 701622 (0.0031) [2024-06-24 16:53:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11495473152. Throughput: 0: 42595.5. Samples: 11495638640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 16:53:49,947][15401] Updated weights for policy 0, policy_version 701632 (0.0036) [2024-06-24 16:53:51,151][15349] Signal inference workers to stop experience collection... (170150 times) [2024-06-24 16:53:51,191][15401] InferenceWorker_p0-w0: stopping experience collection (170150 times) [2024-06-24 16:53:51,210][15349] Signal inference workers to resume experience collection... (170150 times) [2024-06-24 16:53:51,212][15401] InferenceWorker_p0-w0: resuming experience collection (170150 times) [2024-06-24 16:53:53,390][15132] Fps is (10 sec: 45872.0, 60 sec: 42324.8, 300 sec: 42709.4). Total num frames: 11495686144. Throughput: 0: 42799.0. Samples: 11495774800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:53,391][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 16:53:53,595][15401] Updated weights for policy 0, policy_version 701642 (0.0026) [2024-06-24 16:53:57,410][15401] Updated weights for policy 0, policy_version 701652 (0.0032) [2024-06-24 16:53:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11495882752. Throughput: 0: 42433.2. Samples: 11496022240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:53:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 16:54:01,101][15401] Updated weights for policy 0, policy_version 701662 (0.0035) [2024-06-24 16:54:03,389][15132] Fps is (10 sec: 42601.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11496112128. Throughput: 0: 42663.2. Samples: 11496276460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 16:54:05,190][15401] Updated weights for policy 0, policy_version 701672 (0.0030) [2024-06-24 16:54:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 11496325120. Throughput: 0: 42707.0. Samples: 11496412220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:08,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 16:54:08,715][15401] Updated weights for policy 0, policy_version 701682 (0.0028) [2024-06-24 16:54:13,233][15401] Updated weights for policy 0, policy_version 701692 (0.0030) [2024-06-24 16:54:13,394][15132] Fps is (10 sec: 42579.2, 60 sec: 42595.2, 300 sec: 42764.4). Total num frames: 11496538112. Throughput: 0: 42743.7. Samples: 11496666600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:13,394][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 16:54:16,336][15401] Updated weights for policy 0, policy_version 701702 (0.0036) [2024-06-24 16:54:18,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11496783872. Throughput: 0: 42715.4. Samples: 11496920940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 16:54:20,577][15401] Updated weights for policy 0, policy_version 701712 (0.0036) [2024-06-24 16:54:23,389][15132] Fps is (10 sec: 44256.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11496980480. Throughput: 0: 42788.1. Samples: 11497057700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:23,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 16:54:23,874][15401] Updated weights for policy 0, policy_version 701722 (0.0035) [2024-06-24 16:54:28,049][15401] Updated weights for policy 0, policy_version 701732 (0.0044) [2024-06-24 16:54:28,390][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11497177088. Throughput: 0: 42785.3. Samples: 11497317720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:28,390][15132] Avg episode reward: [(0, '0.127')] [2024-06-24 16:54:31,456][15401] Updated weights for policy 0, policy_version 701742 (0.0035) [2024-06-24 16:54:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11497390080. Throughput: 0: 42791.1. Samples: 11497564340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 23.0) [2024-06-24 16:54:33,393][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 16:54:36,147][15401] Updated weights for policy 0, policy_version 701752 (0.0036) [2024-06-24 16:54:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 11497619456. Throughput: 0: 42702.9. Samples: 11497696400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:54:38,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-24 16:54:39,339][15401] Updated weights for policy 0, policy_version 701762 (0.0033) [2024-06-24 16:54:43,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11497799680. Throughput: 0: 42893.8. Samples: 11497952460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:54:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 16:54:43,955][15401] Updated weights for policy 0, policy_version 701772 (0.0033) [2024-06-24 16:54:46,978][15401] Updated weights for policy 0, policy_version 701782 (0.0024) [2024-06-24 16:54:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11498029056. Throughput: 0: 42792.4. Samples: 11498202120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:54:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 16:54:51,774][15401] Updated weights for policy 0, policy_version 701792 (0.0026) [2024-06-24 16:54:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43145.0, 300 sec: 42820.6). Total num frames: 11498274816. Throughput: 0: 42628.2. Samples: 11498330380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:54:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 16:54:54,599][15401] Updated weights for policy 0, policy_version 701802 (0.0026) [2024-06-24 16:54:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11498438656. Throughput: 0: 42498.1. Samples: 11498578820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:54:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 16:54:59,426][15401] Updated weights for policy 0, policy_version 701812 (0.0040) [2024-06-24 16:55:02,881][15401] Updated weights for policy 0, policy_version 701822 (0.0038) [2024-06-24 16:55:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11498668032. Throughput: 0: 42378.3. Samples: 11498827960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 16:55:07,001][15401] Updated weights for policy 0, policy_version 701832 (0.0045) [2024-06-24 16:55:08,330][15349] Signal inference workers to stop experience collection... (170200 times) [2024-06-24 16:55:08,331][15349] Signal inference workers to resume experience collection... (170200 times) [2024-06-24 16:55:08,370][15401] InferenceWorker_p0-w0: stopping experience collection (170200 times) [2024-06-24 16:55:08,370][15401] InferenceWorker_p0-w0: resuming experience collection (170200 times) [2024-06-24 16:55:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 11498897408. Throughput: 0: 42347.2. Samples: 11498963320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 16:55:10,471][15401] Updated weights for policy 0, policy_version 701842 (0.0034) [2024-06-24 16:55:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42328.4, 300 sec: 42653.9). Total num frames: 11499077632. Throughput: 0: 42174.5. Samples: 11499215580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 16:55:14,740][15401] Updated weights for policy 0, policy_version 701852 (0.0035) [2024-06-24 16:55:18,051][15401] Updated weights for policy 0, policy_version 701862 (0.0040) [2024-06-24 16:55:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42050.7, 300 sec: 42654.5). Total num frames: 11499307008. Throughput: 0: 42282.7. Samples: 11499467060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:18,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 16:55:22,364][15401] Updated weights for policy 0, policy_version 701872 (0.0043) [2024-06-24 16:55:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11499520000. Throughput: 0: 42332.9. Samples: 11499601380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 16:55:26,130][15401] Updated weights for policy 0, policy_version 701882 (0.0045) [2024-06-24 16:55:28,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11499716608. Throughput: 0: 42250.7. Samples: 11499853740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 16:55:30,161][15401] Updated weights for policy 0, policy_version 701892 (0.0048) [2024-06-24 16:55:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11499945984. Throughput: 0: 42253.3. Samples: 11500103520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 16:55:34,055][15401] Updated weights for policy 0, policy_version 701902 (0.0043) [2024-06-24 16:55:37,720][15401] Updated weights for policy 0, policy_version 701912 (0.0037) [2024-06-24 16:55:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 11500142592. Throughput: 0: 42323.4. Samples: 11500234940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:38,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 16:55:41,576][15401] Updated weights for policy 0, policy_version 701922 (0.0037) [2024-06-24 16:55:43,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 11500355584. Throughput: 0: 42466.3. Samples: 11500490080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:43,396][15132] Avg episode reward: [(0, '0.208')] [2024-06-24 16:55:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701926_11500355584.pth... [2024-06-24 16:55:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701306_11490197504.pth [2024-06-24 16:55:45,581][15401] Updated weights for policy 0, policy_version 701932 (0.0032) [2024-06-24 16:55:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11500601344. Throughput: 0: 42586.6. Samples: 11500744360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:48,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-24 16:55:49,066][15401] Updated weights for policy 0, policy_version 701942 (0.0041) [2024-06-24 16:55:53,139][15401] Updated weights for policy 0, policy_version 701952 (0.0032) [2024-06-24 16:55:53,389][15132] Fps is (10 sec: 42626.2, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11500781568. Throughput: 0: 42604.0. Samples: 11500880500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 16:55:56,568][15401] Updated weights for policy 0, policy_version 701962 (0.0034) [2024-06-24 16:55:58,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.6, 300 sec: 42598.4). Total num frames: 11500994560. Throughput: 0: 42524.1. Samples: 11501129260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:55:58,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 16:56:00,779][15401] Updated weights for policy 0, policy_version 701972 (0.0031) [2024-06-24 16:56:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11501240320. Throughput: 0: 42631.9. Samples: 11501385400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:56:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 16:56:04,064][15401] Updated weights for policy 0, policy_version 701982 (0.0037) [2024-06-24 16:56:08,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 11501420544. Throughput: 0: 42582.1. Samples: 11501517580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:56:08,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-24 16:56:08,457][15401] Updated weights for policy 0, policy_version 701992 (0.0037) [2024-06-24 16:56:11,884][15401] Updated weights for policy 0, policy_version 702002 (0.0035) [2024-06-24 16:56:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11501649920. Throughput: 0: 42688.0. Samples: 11501774700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 16:56:13,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-24 16:56:15,968][15401] Updated weights for policy 0, policy_version 702012 (0.0039) [2024-06-24 16:56:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 11501862912. Throughput: 0: 42802.1. Samples: 11502029620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 16:56:19,440][15401] Updated weights for policy 0, policy_version 702022 (0.0037) [2024-06-24 16:56:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 11502075904. Throughput: 0: 42884.0. Samples: 11502164720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 16:56:23,660][15401] Updated weights for policy 0, policy_version 702032 (0.0045) [2024-06-24 16:56:27,226][15401] Updated weights for policy 0, policy_version 702042 (0.0031) [2024-06-24 16:56:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11502272512. Throughput: 0: 42914.2. Samples: 11502420940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 16:56:31,374][15349] Signal inference workers to stop experience collection... (170250 times) [2024-06-24 16:56:31,374][15349] Signal inference workers to resume experience collection... (170250 times) [2024-06-24 16:56:31,382][15401] Updated weights for policy 0, policy_version 702052 (0.0035) [2024-06-24 16:56:31,407][15401] InferenceWorker_p0-w0: stopping experience collection (170250 times) [2024-06-24 16:56:31,407][15401] InferenceWorker_p0-w0: resuming experience collection (170250 times) [2024-06-24 16:56:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11502518272. Throughput: 0: 42945.0. Samples: 11502676880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:33,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 16:56:34,771][15401] Updated weights for policy 0, policy_version 702062 (0.0041) [2024-06-24 16:56:38,390][15132] Fps is (10 sec: 44234.8, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 11502714880. Throughput: 0: 42767.6. Samples: 11502805060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 16:56:38,921][15401] Updated weights for policy 0, policy_version 702072 (0.0045) [2024-06-24 16:56:42,586][15401] Updated weights for policy 0, policy_version 702082 (0.0035) [2024-06-24 16:56:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42874.3, 300 sec: 42653.6). Total num frames: 11502927872. Throughput: 0: 42882.7. Samples: 11503058980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:43,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 16:56:46,670][15401] Updated weights for policy 0, policy_version 702092 (0.0044) [2024-06-24 16:56:48,390][15132] Fps is (10 sec: 42599.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11503140864. Throughput: 0: 42877.8. Samples: 11503314900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:48,402][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 16:56:50,078][15401] Updated weights for policy 0, policy_version 702102 (0.0033) [2024-06-24 16:56:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 11503370240. Throughput: 0: 42868.4. Samples: 11503446660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 16:56:54,225][15401] Updated weights for policy 0, policy_version 702112 (0.0040) [2024-06-24 16:56:57,794][15401] Updated weights for policy 0, policy_version 702122 (0.0038) [2024-06-24 16:56:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11503566848. Throughput: 0: 42768.0. Samples: 11503699260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:56:58,394][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 16:57:01,992][15401] Updated weights for policy 0, policy_version 702132 (0.0040) [2024-06-24 16:57:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11503796224. Throughput: 0: 42772.6. Samples: 11503954380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 16:57:05,482][15401] Updated weights for policy 0, policy_version 702142 (0.0038) [2024-06-24 16:57:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11504009216. Throughput: 0: 42700.9. Samples: 11504086260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 16:57:09,515][15401] Updated weights for policy 0, policy_version 702152 (0.0028) [2024-06-24 16:57:12,986][15401] Updated weights for policy 0, policy_version 702162 (0.0034) [2024-06-24 16:57:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11504222208. Throughput: 0: 42749.3. Samples: 11504344660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 16:57:17,050][15401] Updated weights for policy 0, policy_version 702172 (0.0032) [2024-06-24 16:57:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 11504435200. Throughput: 0: 42783.0. Samples: 11504602120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 16:57:20,551][15401] Updated weights for policy 0, policy_version 702182 (0.0030) [2024-06-24 16:57:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11504648192. Throughput: 0: 42783.4. Samples: 11504730300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 16:57:24,639][15401] Updated weights for policy 0, policy_version 702192 (0.0041) [2024-06-24 16:57:28,297][15401] Updated weights for policy 0, policy_version 702202 (0.0029) [2024-06-24 16:57:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 11504877568. Throughput: 0: 42901.4. Samples: 11504989440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 16:57:32,192][15401] Updated weights for policy 0, policy_version 702212 (0.0034) [2024-06-24 16:57:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11505090560. Throughput: 0: 42777.0. Samples: 11505239860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 16:57:36,463][15401] Updated weights for policy 0, policy_version 702222 (0.0026) [2024-06-24 16:57:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.7, 300 sec: 42542.9). Total num frames: 11505270784. Throughput: 0: 42773.5. Samples: 11505371460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 16:57:40,221][15401] Updated weights for policy 0, policy_version 702232 (0.0042) [2024-06-24 16:57:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11505500160. Throughput: 0: 42766.7. Samples: 11505623760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 16:57:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702240_11505500160.pth... [2024-06-24 16:57:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701614_11495243776.pth [2024-06-24 16:57:44,220][15401] Updated weights for policy 0, policy_version 702242 (0.0023) [2024-06-24 16:57:47,770][15401] Updated weights for policy 0, policy_version 702252 (0.0043) [2024-06-24 16:57:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11505713152. Throughput: 0: 42841.7. Samples: 11505882260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 16:57:51,725][15401] Updated weights for policy 0, policy_version 702262 (0.0032) [2024-06-24 16:57:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11505926144. Throughput: 0: 42787.1. Samples: 11506011680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 16:57:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 16:57:55,234][15401] Updated weights for policy 0, policy_version 702272 (0.0043) [2024-06-24 16:57:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 11506155520. Throughput: 0: 42738.5. Samples: 11506268000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:57:58,392][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 16:57:59,466][15401] Updated weights for policy 0, policy_version 702282 (0.0036) [2024-06-24 16:58:02,860][15401] Updated weights for policy 0, policy_version 702292 (0.0029) [2024-06-24 16:58:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11506352128. Throughput: 0: 42611.2. Samples: 11506519620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:03,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 16:58:07,080][15401] Updated weights for policy 0, policy_version 702302 (0.0039) [2024-06-24 16:58:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11506581504. Throughput: 0: 42618.3. Samples: 11506648120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 16:58:10,886][15401] Updated weights for policy 0, policy_version 702312 (0.0034) [2024-06-24 16:58:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11506794496. Throughput: 0: 42630.3. Samples: 11506907800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 16:58:14,933][15401] Updated weights for policy 0, policy_version 702322 (0.0039) [2024-06-24 16:58:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11506991104. Throughput: 0: 42590.1. Samples: 11507156420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 16:58:18,637][15401] Updated weights for policy 0, policy_version 702332 (0.0035) [2024-06-24 16:58:22,506][15401] Updated weights for policy 0, policy_version 702342 (0.0048) [2024-06-24 16:58:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11507220480. Throughput: 0: 42363.1. Samples: 11507277800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:23,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 16:58:26,514][15401] Updated weights for policy 0, policy_version 702352 (0.0043) [2024-06-24 16:58:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11507433472. Throughput: 0: 42523.1. Samples: 11507537300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:28,390][15132] Avg episode reward: [(0, '0.105')] [2024-06-24 16:58:30,414][15401] Updated weights for policy 0, policy_version 702362 (0.0034) [2024-06-24 16:58:31,313][15349] Signal inference workers to stop experience collection... (170300 times) [2024-06-24 16:58:31,320][15349] Signal inference workers to resume experience collection... (170300 times) [2024-06-24 16:58:31,365][15401] InferenceWorker_p0-w0: stopping experience collection (170300 times) [2024-06-24 16:58:31,365][15401] InferenceWorker_p0-w0: resuming experience collection (170300 times) [2024-06-24 16:58:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11507630080. Throughput: 0: 42417.5. Samples: 11507791040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:33,396][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 16:58:34,293][15401] Updated weights for policy 0, policy_version 702372 (0.0049) [2024-06-24 16:58:38,170][15401] Updated weights for policy 0, policy_version 702382 (0.0034) [2024-06-24 16:58:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11507826688. Throughput: 0: 42248.1. Samples: 11507912840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 16:58:41,922][15401] Updated weights for policy 0, policy_version 702392 (0.0035) [2024-06-24 16:58:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11508056064. Throughput: 0: 42277.5. Samples: 11508170380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 16:58:46,047][15401] Updated weights for policy 0, policy_version 702402 (0.0037) [2024-06-24 16:58:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 11508252672. Throughput: 0: 42363.9. Samples: 11508426000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 16:58:49,564][15401] Updated weights for policy 0, policy_version 702412 (0.0032) [2024-06-24 16:58:53,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 11508449280. Throughput: 0: 42288.4. Samples: 11508551200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:53,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 16:58:53,920][15401] Updated weights for policy 0, policy_version 702422 (0.0042) [2024-06-24 16:58:57,147][15401] Updated weights for policy 0, policy_version 702432 (0.0022) [2024-06-24 16:58:58,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11508711424. Throughput: 0: 42298.7. Samples: 11508811240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:58:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 16:59:01,600][15401] Updated weights for policy 0, policy_version 702442 (0.0033) [2024-06-24 16:59:03,390][15132] Fps is (10 sec: 45883.9, 60 sec: 42598.1, 300 sec: 42654.2). Total num frames: 11508908032. Throughput: 0: 42430.7. Samples: 11509065820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:03,391][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 16:59:04,802][15401] Updated weights for policy 0, policy_version 702452 (0.0031) [2024-06-24 16:59:08,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 42543.5). Total num frames: 11509088256. Throughput: 0: 42629.8. Samples: 11509196140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 16:59:09,068][15401] Updated weights for policy 0, policy_version 702462 (0.0042) [2024-06-24 16:59:12,331][15401] Updated weights for policy 0, policy_version 702472 (0.0028) [2024-06-24 16:59:13,389][15132] Fps is (10 sec: 44239.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11509350400. Throughput: 0: 42607.6. Samples: 11509454640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 16:59:16,721][15401] Updated weights for policy 0, policy_version 702482 (0.0025) [2024-06-24 16:59:18,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11509547008. Throughput: 0: 42678.0. Samples: 11509711560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 16:59:19,931][15401] Updated weights for policy 0, policy_version 702492 (0.0046) [2024-06-24 16:59:23,398][15132] Fps is (10 sec: 40925.7, 60 sec: 42319.4, 300 sec: 42652.7). Total num frames: 11509760000. Throughput: 0: 42690.3. Samples: 11509834260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:23,398][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 16:59:24,345][15401] Updated weights for policy 0, policy_version 702502 (0.0036) [2024-06-24 16:59:27,443][15401] Updated weights for policy 0, policy_version 702512 (0.0035) [2024-06-24 16:59:28,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11509989376. Throughput: 0: 42803.1. Samples: 11510096520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:28,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 16:59:31,973][15401] Updated weights for policy 0, policy_version 702522 (0.0036) [2024-06-24 16:59:33,390][15132] Fps is (10 sec: 42633.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11510185984. Throughput: 0: 42911.6. Samples: 11510357020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 16:59:35,027][15401] Updated weights for policy 0, policy_version 702532 (0.0025) [2024-06-24 16:59:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11510398976. Throughput: 0: 42969.8. Samples: 11510484740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 16:59:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 16:59:39,627][15401] Updated weights for policy 0, policy_version 702542 (0.0034) [2024-06-24 16:59:42,828][15401] Updated weights for policy 0, policy_version 702552 (0.0038) [2024-06-24 16:59:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 11510628352. Throughput: 0: 42916.4. Samples: 11510742480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 16:59:43,392][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 16:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702553_11510628352.pth... [2024-06-24 16:59:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000701926_11500355584.pth [2024-06-24 16:59:47,063][15401] Updated weights for policy 0, policy_version 702562 (0.0029) [2024-06-24 16:59:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 11510857728. Throughput: 0: 43088.4. Samples: 11511004780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 16:59:48,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 16:59:50,237][15401] Updated weights for policy 0, policy_version 702572 (0.0031) [2024-06-24 16:59:51,284][15349] Signal inference workers to stop experience collection... (170350 times) [2024-06-24 16:59:51,308][15401] InferenceWorker_p0-w0: stopping experience collection (170350 times) [2024-06-24 16:59:51,398][15349] Signal inference workers to resume experience collection... (170350 times) [2024-06-24 16:59:51,399][15401] InferenceWorker_p0-w0: resuming experience collection (170350 times) [2024-06-24 16:59:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 11511054336. Throughput: 0: 43064.9. Samples: 11511134060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 16:59:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 16:59:54,574][15401] Updated weights for policy 0, policy_version 702582 (0.0037) [2024-06-24 16:59:58,122][15401] Updated weights for policy 0, policy_version 702592 (0.0034) [2024-06-24 16:59:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11511267328. Throughput: 0: 42933.4. Samples: 11511386640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 16:59:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 17:00:02,395][15401] Updated weights for policy 0, policy_version 702602 (0.0032) [2024-06-24 17:00:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.9, 300 sec: 42653.9). Total num frames: 11511480320. Throughput: 0: 42911.8. Samples: 11511642580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 17:00:05,822][15401] Updated weights for policy 0, policy_version 702612 (0.0027) [2024-06-24 17:00:08,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11511693312. Throughput: 0: 43108.1. Samples: 11511773760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 17:00:09,812][15401] Updated weights for policy 0, policy_version 702622 (0.0045) [2024-06-24 17:00:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 11511906304. Throughput: 0: 42905.6. Samples: 11512027280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:13,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 17:00:13,806][15401] Updated weights for policy 0, policy_version 702632 (0.0039) [2024-06-24 17:00:17,293][15401] Updated weights for policy 0, policy_version 702642 (0.0043) [2024-06-24 17:00:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11512102912. Throughput: 0: 42932.5. Samples: 11512288980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:18,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 17:00:21,422][15401] Updated weights for policy 0, policy_version 702652 (0.0032) [2024-06-24 17:00:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42877.5, 300 sec: 42765.0). Total num frames: 11512332288. Throughput: 0: 42984.1. Samples: 11512419020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 17:00:25,041][15401] Updated weights for policy 0, policy_version 702662 (0.0039) [2024-06-24 17:00:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11512561664. Throughput: 0: 42825.9. Samples: 11512669640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 17:00:28,960][15401] Updated weights for policy 0, policy_version 702672 (0.0025) [2024-06-24 17:00:32,684][15401] Updated weights for policy 0, policy_version 702682 (0.0031) [2024-06-24 17:00:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11512758272. Throughput: 0: 42903.6. Samples: 11512935440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:33,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 17:00:36,452][15401] Updated weights for policy 0, policy_version 702692 (0.0040) [2024-06-24 17:00:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 11512954880. Throughput: 0: 42843.6. Samples: 11513062020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:38,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 17:00:40,347][15401] Updated weights for policy 0, policy_version 702702 (0.0029) [2024-06-24 17:00:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11513200640. Throughput: 0: 42767.8. Samples: 11513311200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 17:00:44,403][15401] Updated weights for policy 0, policy_version 702712 (0.0023) [2024-06-24 17:00:47,895][15401] Updated weights for policy 0, policy_version 702722 (0.0029) [2024-06-24 17:00:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11513413632. Throughput: 0: 42927.9. Samples: 11513574340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 17:00:52,495][15401] Updated weights for policy 0, policy_version 702732 (0.0035) [2024-06-24 17:00:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42765.0). Total num frames: 11513610240. Throughput: 0: 42904.8. Samples: 11513704580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:53,393][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 17:00:55,581][15401] Updated weights for policy 0, policy_version 702742 (0.0046) [2024-06-24 17:00:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11513856000. Throughput: 0: 42841.4. Samples: 11513955140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:00:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 17:01:00,143][15401] Updated weights for policy 0, policy_version 702752 (0.0046) [2024-06-24 17:01:03,221][15401] Updated weights for policy 0, policy_version 702762 (0.0033) [2024-06-24 17:01:03,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11514052608. Throughput: 0: 42818.7. Samples: 11514215920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:01:03,393][15132] Avg episode reward: [(0, '0.834')] [2024-06-24 17:01:07,632][15401] Updated weights for policy 0, policy_version 702772 (0.0051) [2024-06-24 17:01:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11514249216. Throughput: 0: 42674.1. Samples: 11514339360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:01:08,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 17:01:10,940][15401] Updated weights for policy 0, policy_version 702782 (0.0037) [2024-06-24 17:01:13,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11514494976. Throughput: 0: 42850.5. Samples: 11514597920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:01:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 17:01:15,523][15401] Updated weights for policy 0, policy_version 702792 (0.0029) [2024-06-24 17:01:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42765.1). Total num frames: 11514691584. Throughput: 0: 42547.7. Samples: 11514850080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 17:01:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 17:01:18,506][15401] Updated weights for policy 0, policy_version 702802 (0.0034) [2024-06-24 17:01:23,076][15401] Updated weights for policy 0, policy_version 702812 (0.0033) [2024-06-24 17:01:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11514888192. Throughput: 0: 42677.2. Samples: 11514982500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 17:01:26,088][15401] Updated weights for policy 0, policy_version 702822 (0.0025) [2024-06-24 17:01:27,697][15349] Signal inference workers to stop experience collection... (170400 times) [2024-06-24 17:01:27,706][15349] Signal inference workers to resume experience collection... (170400 times) [2024-06-24 17:01:27,711][15401] InferenceWorker_p0-w0: stopping experience collection (170400 times) [2024-06-24 17:01:27,740][15401] InferenceWorker_p0-w0: resuming experience collection (170400 times) [2024-06-24 17:01:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11515133952. Throughput: 0: 42912.0. Samples: 11515242240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:28,394][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 17:01:30,633][15401] Updated weights for policy 0, policy_version 702832 (0.0030) [2024-06-24 17:01:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11515346944. Throughput: 0: 42772.1. Samples: 11515499080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:33,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 17:01:34,158][15401] Updated weights for policy 0, policy_version 702842 (0.0035) [2024-06-24 17:01:38,263][15401] Updated weights for policy 0, policy_version 702852 (0.0038) [2024-06-24 17:01:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11515527168. Throughput: 0: 42835.7. Samples: 11515632080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 17:01:41,565][15401] Updated weights for policy 0, policy_version 702862 (0.0032) [2024-06-24 17:01:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11515772928. Throughput: 0: 42926.3. Samples: 11515886820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 17:01:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702868_11515789312.pth... [2024-06-24 17:01:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702240_11505500160.pth [2024-06-24 17:01:45,787][15401] Updated weights for policy 0, policy_version 702872 (0.0029) [2024-06-24 17:01:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11515985920. Throughput: 0: 42894.4. Samples: 11516146060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 17:01:49,079][15401] Updated weights for policy 0, policy_version 702882 (0.0032) [2024-06-24 17:01:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11516166144. Throughput: 0: 42854.7. Samples: 11516267820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:53,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-24 17:01:53,549][15401] Updated weights for policy 0, policy_version 702892 (0.0046) [2024-06-24 17:01:56,657][15401] Updated weights for policy 0, policy_version 702902 (0.0059) [2024-06-24 17:01:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11516411904. Throughput: 0: 42781.4. Samples: 11516523080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:01:58,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 17:02:01,317][15401] Updated weights for policy 0, policy_version 702912 (0.0041) [2024-06-24 17:02:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11516624896. Throughput: 0: 42984.8. Samples: 11516784400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:03,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 17:02:04,273][15401] Updated weights for policy 0, policy_version 702922 (0.0028) [2024-06-24 17:02:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11516821504. Throughput: 0: 42789.0. Samples: 11516908000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 17:02:09,029][15401] Updated weights for policy 0, policy_version 702932 (0.0040) [2024-06-24 17:02:12,062][15401] Updated weights for policy 0, policy_version 702942 (0.0038) [2024-06-24 17:02:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11517034496. Throughput: 0: 42568.9. Samples: 11517157840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 17:02:16,730][15401] Updated weights for policy 0, policy_version 702952 (0.0037) [2024-06-24 17:02:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11517231104. Throughput: 0: 42660.0. Samples: 11517418780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 17:02:20,045][15401] Updated weights for policy 0, policy_version 702962 (0.0035) [2024-06-24 17:02:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11517476864. Throughput: 0: 42477.7. Samples: 11517543580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 17:02:24,285][15401] Updated weights for policy 0, policy_version 702972 (0.0044) [2024-06-24 17:02:27,927][15401] Updated weights for policy 0, policy_version 702982 (0.0035) [2024-06-24 17:02:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11517673472. Throughput: 0: 42443.4. Samples: 11517796780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 17:02:31,634][15401] Updated weights for policy 0, policy_version 702992 (0.0030) [2024-06-24 17:02:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11517886464. Throughput: 0: 42550.0. Samples: 11518060820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:02:35,310][15401] Updated weights for policy 0, policy_version 703002 (0.0029) [2024-06-24 17:02:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11518115840. Throughput: 0: 42730.6. Samples: 11518190700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 17:02:38,949][15401] Updated weights for policy 0, policy_version 703012 (0.0035) [2024-06-24 17:02:42,674][15401] Updated weights for policy 0, policy_version 703022 (0.0044) [2024-06-24 17:02:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11518312448. Throughput: 0: 42775.8. Samples: 11518448000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 17:02:46,340][15401] Updated weights for policy 0, policy_version 703032 (0.0026) [2024-06-24 17:02:47,944][15349] Signal inference workers to stop experience collection... (170450 times) [2024-06-24 17:02:47,944][15349] Signal inference workers to resume experience collection... (170450 times) [2024-06-24 17:02:47,991][15401] InferenceWorker_p0-w0: stopping experience collection (170450 times) [2024-06-24 17:02:47,991][15401] InferenceWorker_p0-w0: resuming experience collection (170450 times) [2024-06-24 17:02:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 11518541824. Throughput: 0: 42651.7. Samples: 11518703720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 17:02:50,369][15401] Updated weights for policy 0, policy_version 703042 (0.0034) [2024-06-24 17:02:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 11518771200. Throughput: 0: 42781.7. Samples: 11518833180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 17:02:54,434][15401] Updated weights for policy 0, policy_version 703052 (0.0034) [2024-06-24 17:02:58,182][15401] Updated weights for policy 0, policy_version 703062 (0.0028) [2024-06-24 17:02:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11518967808. Throughput: 0: 42915.2. Samples: 11519089020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 17:02:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 17:03:01,993][15401] Updated weights for policy 0, policy_version 703072 (0.0029) [2024-06-24 17:03:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11519180800. Throughput: 0: 42941.6. Samples: 11519351160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:03,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 17:03:05,907][15401] Updated weights for policy 0, policy_version 703082 (0.0037) [2024-06-24 17:03:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11519393792. Throughput: 0: 42804.5. Samples: 11519469780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:08,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 17:03:09,564][15401] Updated weights for policy 0, policy_version 703092 (0.0034) [2024-06-24 17:03:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11519606784. Throughput: 0: 42989.4. Samples: 11519731300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 17:03:13,658][15401] Updated weights for policy 0, policy_version 703102 (0.0029) [2024-06-24 17:03:17,028][15401] Updated weights for policy 0, policy_version 703112 (0.0036) [2024-06-24 17:03:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11519819776. Throughput: 0: 42863.7. Samples: 11519989680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:18,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 17:03:21,315][15401] Updated weights for policy 0, policy_version 703122 (0.0036) [2024-06-24 17:03:23,390][15132] Fps is (10 sec: 42595.4, 60 sec: 42597.9, 300 sec: 42709.4). Total num frames: 11520032768. Throughput: 0: 42850.1. Samples: 11520118980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:23,391][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 17:03:24,530][15401] Updated weights for policy 0, policy_version 703132 (0.0031) [2024-06-24 17:03:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11520245760. Throughput: 0: 42745.9. Samples: 11520371560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 17:03:28,868][15401] Updated weights for policy 0, policy_version 703142 (0.0032) [2024-06-24 17:03:32,707][15401] Updated weights for policy 0, policy_version 703152 (0.0031) [2024-06-24 17:03:33,389][15132] Fps is (10 sec: 42601.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11520458752. Throughput: 0: 42754.6. Samples: 11520627680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 17:03:36,503][15401] Updated weights for policy 0, policy_version 703162 (0.0023) [2024-06-24 17:03:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11520655360. Throughput: 0: 42863.1. Samples: 11520762020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 17:03:40,535][15401] Updated weights for policy 0, policy_version 703172 (0.0025) [2024-06-24 17:03:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11520901120. Throughput: 0: 42771.4. Samples: 11521013740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 17:03:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703180_11520901120.pth... [2024-06-24 17:03:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702553_11510628352.pth [2024-06-24 17:03:43,902][15401] Updated weights for policy 0, policy_version 703182 (0.0031) [2024-06-24 17:03:48,258][15401] Updated weights for policy 0, policy_version 703192 (0.0031) [2024-06-24 17:03:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42876.4). Total num frames: 11521097728. Throughput: 0: 42864.4. Samples: 11521280060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 17:03:51,674][15401] Updated weights for policy 0, policy_version 703202 (0.0033) [2024-06-24 17:03:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11521310720. Throughput: 0: 43011.4. Samples: 11521405300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 17:03:56,086][15401] Updated weights for policy 0, policy_version 703212 (0.0026) [2024-06-24 17:03:56,928][15349] Signal inference workers to stop experience collection... (170500 times) [2024-06-24 17:03:56,928][15349] Signal inference workers to resume experience collection... (170500 times) [2024-06-24 17:03:56,946][15401] InferenceWorker_p0-w0: stopping experience collection (170500 times) [2024-06-24 17:03:56,946][15401] InferenceWorker_p0-w0: resuming experience collection (170500 times) [2024-06-24 17:03:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.2, 300 sec: 42820.6). Total num frames: 11521540096. Throughput: 0: 42854.0. Samples: 11521659740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:03:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 17:03:59,195][15401] Updated weights for policy 0, policy_version 703222 (0.0043) [2024-06-24 17:04:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11521736704. Throughput: 0: 42956.4. Samples: 11521922720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 17:04:03,531][15401] Updated weights for policy 0, policy_version 703232 (0.0034) [2024-06-24 17:04:06,802][15401] Updated weights for policy 0, policy_version 703242 (0.0031) [2024-06-24 17:04:08,389][15132] Fps is (10 sec: 44238.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11521982464. Throughput: 0: 42995.0. Samples: 11522053720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 17:04:11,217][15401] Updated weights for policy 0, policy_version 703252 (0.0043) [2024-06-24 17:04:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11522195456. Throughput: 0: 42943.1. Samples: 11522304000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 17:04:14,482][15401] Updated weights for policy 0, policy_version 703262 (0.0035) [2024-06-24 17:04:18,391][15132] Fps is (10 sec: 39314.7, 60 sec: 42597.2, 300 sec: 42766.0). Total num frames: 11522375680. Throughput: 0: 42994.4. Samples: 11522562500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:18,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 17:04:18,900][15401] Updated weights for policy 0, policy_version 703272 (0.0040) [2024-06-24 17:04:22,366][15401] Updated weights for policy 0, policy_version 703282 (0.0038) [2024-06-24 17:04:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42872.0, 300 sec: 42765.0). Total num frames: 11522605056. Throughput: 0: 42756.1. Samples: 11522686040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 17:04:26,402][15401] Updated weights for policy 0, policy_version 703292 (0.0031) [2024-06-24 17:04:28,389][15132] Fps is (10 sec: 44244.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11522818048. Throughput: 0: 42936.1. Samples: 11522945860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 17:04:29,873][15401] Updated weights for policy 0, policy_version 703302 (0.0041) [2024-06-24 17:04:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11523031040. Throughput: 0: 42670.3. Samples: 11523200220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 17:04:34,506][15401] Updated weights for policy 0, policy_version 703312 (0.0032) [2024-06-24 17:04:37,725][15401] Updated weights for policy 0, policy_version 703322 (0.0050) [2024-06-24 17:04:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11523260416. Throughput: 0: 42662.3. Samples: 11523325100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 17:04:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 17:04:42,073][15401] Updated weights for policy 0, policy_version 703332 (0.0043) [2024-06-24 17:04:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11523473408. Throughput: 0: 42769.2. Samples: 11523584340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:04:43,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-24 17:04:45,824][15401] Updated weights for policy 0, policy_version 703342 (0.0047) [2024-06-24 17:04:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11523670016. Throughput: 0: 42361.8. Samples: 11523829000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:04:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 17:04:49,916][15401] Updated weights for policy 0, policy_version 703352 (0.0035) [2024-06-24 17:04:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11523866624. Throughput: 0: 42212.3. Samples: 11523953280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:04:53,395][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 17:04:53,754][15401] Updated weights for policy 0, policy_version 703362 (0.0036) [2024-06-24 17:04:57,478][15401] Updated weights for policy 0, policy_version 703372 (0.0029) [2024-06-24 17:04:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 11524096000. Throughput: 0: 42521.0. Samples: 11524217440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:04:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 17:05:01,563][15401] Updated weights for policy 0, policy_version 703382 (0.0037) [2024-06-24 17:05:03,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 11524308992. Throughput: 0: 42368.0. Samples: 11524469260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:03,396][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 17:05:05,085][15401] Updated weights for policy 0, policy_version 703392 (0.0036) [2024-06-24 17:05:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11524521984. Throughput: 0: 42442.2. Samples: 11524595940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:08,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 17:05:09,104][15401] Updated weights for policy 0, policy_version 703402 (0.0034) [2024-06-24 17:05:12,670][15401] Updated weights for policy 0, policy_version 703412 (0.0028) [2024-06-24 17:05:13,389][15132] Fps is (10 sec: 42626.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11524734976. Throughput: 0: 42429.0. Samples: 11524855160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:13,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 17:05:16,756][15401] Updated weights for policy 0, policy_version 703422 (0.0041) [2024-06-24 17:05:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42872.7, 300 sec: 42765.0). Total num frames: 11524947968. Throughput: 0: 42423.2. Samples: 11525109260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 17:05:20,540][15401] Updated weights for policy 0, policy_version 703432 (0.0027) [2024-06-24 17:05:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 11525160960. Throughput: 0: 42371.4. Samples: 11525231820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 17:05:24,284][15401] Updated weights for policy 0, policy_version 703442 (0.0031) [2024-06-24 17:05:26,381][15349] Signal inference workers to stop experience collection... (170550 times) [2024-06-24 17:05:26,390][15349] Signal inference workers to resume experience collection... (170550 times) [2024-06-24 17:05:26,436][15401] InferenceWorker_p0-w0: stopping experience collection (170550 times) [2024-06-24 17:05:26,436][15401] InferenceWorker_p0-w0: resuming experience collection (170550 times) [2024-06-24 17:05:28,233][15401] Updated weights for policy 0, policy_version 703452 (0.0035) [2024-06-24 17:05:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 11525373952. Throughput: 0: 42317.6. Samples: 11525488740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:28,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 17:05:31,857][15401] Updated weights for policy 0, policy_version 703462 (0.0031) [2024-06-24 17:05:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11525570560. Throughput: 0: 42592.4. Samples: 11525745660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 17:05:35,733][15401] Updated weights for policy 0, policy_version 703472 (0.0038) [2024-06-24 17:05:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11525799936. Throughput: 0: 42609.8. Samples: 11525870720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:38,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 17:05:39,472][15401] Updated weights for policy 0, policy_version 703482 (0.0026) [2024-06-24 17:05:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 11525996544. Throughput: 0: 42485.2. Samples: 11526129280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 17:05:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703491_11525996544.pth... [2024-06-24 17:05:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000702868_11515789312.pth [2024-06-24 17:05:43,679][15401] Updated weights for policy 0, policy_version 703492 (0.0031) [2024-06-24 17:05:47,649][15401] Updated weights for policy 0, policy_version 703502 (0.0033) [2024-06-24 17:05:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 11526225920. Throughput: 0: 42540.7. Samples: 11526383320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 17:05:51,418][15401] Updated weights for policy 0, policy_version 703512 (0.0026) [2024-06-24 17:05:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11526438912. Throughput: 0: 42567.5. Samples: 11526511480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:53,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 17:05:55,335][15401] Updated weights for policy 0, policy_version 703522 (0.0028) [2024-06-24 17:05:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 11526635520. Throughput: 0: 42427.5. Samples: 11526764400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:05:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 17:05:59,030][15401] Updated weights for policy 0, policy_version 703532 (0.0039) [2024-06-24 17:06:02,982][15401] Updated weights for policy 0, policy_version 703542 (0.0040) [2024-06-24 17:06:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 11526848512. Throughput: 0: 42492.0. Samples: 11527021400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:06:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 17:06:06,640][15401] Updated weights for policy 0, policy_version 703552 (0.0041) [2024-06-24 17:06:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11527077888. Throughput: 0: 42562.8. Samples: 11527147140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:06:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 17:06:10,613][15401] Updated weights for policy 0, policy_version 703562 (0.0034) [2024-06-24 17:06:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11527274496. Throughput: 0: 42573.4. Samples: 11527404440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:06:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 17:06:14,549][15401] Updated weights for policy 0, policy_version 703572 (0.0035) [2024-06-24 17:06:18,209][15401] Updated weights for policy 0, policy_version 703582 (0.0029) [2024-06-24 17:06:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11527487488. Throughput: 0: 42590.5. Samples: 11527662340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-24 17:06:18,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 17:06:22,298][15401] Updated weights for policy 0, policy_version 703592 (0.0034) [2024-06-24 17:06:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11527716864. Throughput: 0: 42530.3. Samples: 11527784580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 17:06:25,810][15401] Updated weights for policy 0, policy_version 703602 (0.0036) [2024-06-24 17:06:28,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 11527929856. Throughput: 0: 42575.6. Samples: 11528045180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 17:06:29,822][15401] Updated weights for policy 0, policy_version 703612 (0.0026) [2024-06-24 17:06:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11528126464. Throughput: 0: 42524.9. Samples: 11528296940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 17:06:33,407][15401] Updated weights for policy 0, policy_version 703622 (0.0033) [2024-06-24 17:06:37,532][15401] Updated weights for policy 0, policy_version 703632 (0.0031) [2024-06-24 17:06:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11528355840. Throughput: 0: 42385.0. Samples: 11528418800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:38,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 17:06:41,606][15401] Updated weights for policy 0, policy_version 703642 (0.0024) [2024-06-24 17:06:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11528552448. Throughput: 0: 42617.9. Samples: 11528682200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:06:45,109][15401] Updated weights for policy 0, policy_version 703652 (0.0026) [2024-06-24 17:06:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11528765440. Throughput: 0: 42524.3. Samples: 11528935000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 17:06:49,142][15401] Updated weights for policy 0, policy_version 703662 (0.0035) [2024-06-24 17:06:52,734][15401] Updated weights for policy 0, policy_version 703672 (0.0027) [2024-06-24 17:06:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 11528978432. Throughput: 0: 42482.7. Samples: 11529058960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:53,392][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 17:06:56,814][15401] Updated weights for policy 0, policy_version 703682 (0.0028) [2024-06-24 17:06:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11529175040. Throughput: 0: 42449.7. Samples: 11529314680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:06:58,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 17:06:59,318][15349] Signal inference workers to stop experience collection... (170600 times) [2024-06-24 17:06:59,322][15349] Signal inference workers to resume experience collection... (170600 times) [2024-06-24 17:06:59,338][15401] InferenceWorker_p0-w0: stopping experience collection (170600 times) [2024-06-24 17:06:59,338][15401] InferenceWorker_p0-w0: resuming experience collection (170600 times) [2024-06-24 17:07:00,876][15401] Updated weights for policy 0, policy_version 703692 (0.0037) [2024-06-24 17:07:03,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11529404416. Throughput: 0: 42333.2. Samples: 11529567240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:03,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 17:07:04,445][15401] Updated weights for policy 0, policy_version 703702 (0.0040) [2024-06-24 17:07:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11529601024. Throughput: 0: 42442.1. Samples: 11529694480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 17:07:08,555][15401] Updated weights for policy 0, policy_version 703712 (0.0032) [2024-06-24 17:07:11,902][15401] Updated weights for policy 0, policy_version 703722 (0.0040) [2024-06-24 17:07:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11529830400. Throughput: 0: 42339.1. Samples: 11529950440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 17:07:16,085][15401] Updated weights for policy 0, policy_version 703732 (0.0027) [2024-06-24 17:07:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11530043392. Throughput: 0: 42348.8. Samples: 11530202640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 17:07:19,755][15401] Updated weights for policy 0, policy_version 703742 (0.0036) [2024-06-24 17:07:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11530240000. Throughput: 0: 42517.4. Samples: 11530332080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 17:07:23,623][15401] Updated weights for policy 0, policy_version 703752 (0.0035) [2024-06-24 17:07:27,552][15401] Updated weights for policy 0, policy_version 703762 (0.0044) [2024-06-24 17:07:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 11530452992. Throughput: 0: 42350.6. Samples: 11530587980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 17:07:31,383][15401] Updated weights for policy 0, policy_version 703772 (0.0032) [2024-06-24 17:07:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11530682368. Throughput: 0: 42272.4. Samples: 11530837260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 17:07:35,433][15401] Updated weights for policy 0, policy_version 703782 (0.0043) [2024-06-24 17:07:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11530862592. Throughput: 0: 42420.4. Samples: 11530967780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 17:07:38,916][15401] Updated weights for policy 0, policy_version 703792 (0.0037) [2024-06-24 17:07:43,023][15401] Updated weights for policy 0, policy_version 703802 (0.0031) [2024-06-24 17:07:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11531091968. Throughput: 0: 42372.5. Samples: 11531221440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 17:07:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703803_11531108352.pth... [2024-06-24 17:07:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703180_11520901120.pth [2024-06-24 17:07:46,774][15401] Updated weights for policy 0, policy_version 703812 (0.0038) [2024-06-24 17:07:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11531304960. Throughput: 0: 42496.6. Samples: 11531479580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:48,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 17:07:50,864][15401] Updated weights for policy 0, policy_version 703822 (0.0034) [2024-06-24 17:07:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 11531517952. Throughput: 0: 42579.6. Samples: 11531610660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:53,393][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 17:07:54,295][15401] Updated weights for policy 0, policy_version 703832 (0.0044) [2024-06-24 17:07:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11531730944. Throughput: 0: 42289.9. Samples: 11531853480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 17:07:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 17:07:58,510][15401] Updated weights for policy 0, policy_version 703842 (0.0034) [2024-06-24 17:08:02,464][15401] Updated weights for policy 0, policy_version 703852 (0.0027) [2024-06-24 17:08:03,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11531960320. Throughput: 0: 42613.3. Samples: 11532120240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 17:08:06,050][15401] Updated weights for policy 0, policy_version 703862 (0.0044) [2024-06-24 17:08:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11532156928. Throughput: 0: 42521.3. Samples: 11532245540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 17:08:10,061][15401] Updated weights for policy 0, policy_version 703872 (0.0032) [2024-06-24 17:08:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11532369920. Throughput: 0: 42389.4. Samples: 11532495500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 17:08:13,863][15401] Updated weights for policy 0, policy_version 703882 (0.0033) [2024-06-24 17:08:17,729][15401] Updated weights for policy 0, policy_version 703892 (0.0050) [2024-06-24 17:08:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42598.2). Total num frames: 11532599296. Throughput: 0: 42688.5. Samples: 11532758340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:18,393][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 17:08:21,420][15401] Updated weights for policy 0, policy_version 703902 (0.0035) [2024-06-24 17:08:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11532795904. Throughput: 0: 42626.1. Samples: 11532885960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 17:08:25,374][15349] Signal inference workers to stop experience collection... (170650 times) [2024-06-24 17:08:25,379][15401] Updated weights for policy 0, policy_version 703912 (0.0038) [2024-06-24 17:08:25,383][15349] Signal inference workers to resume experience collection... (170650 times) [2024-06-24 17:08:25,400][15401] InferenceWorker_p0-w0: stopping experience collection (170650 times) [2024-06-24 17:08:25,432][15401] InferenceWorker_p0-w0: resuming experience collection (170650 times) [2024-06-24 17:08:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11533025280. Throughput: 0: 42585.0. Samples: 11533137760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 17:08:28,928][15401] Updated weights for policy 0, policy_version 703922 (0.0036) [2024-06-24 17:08:32,968][15401] Updated weights for policy 0, policy_version 703932 (0.0035) [2024-06-24 17:08:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11533238272. Throughput: 0: 42616.4. Samples: 11533397320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 17:08:37,276][15401] Updated weights for policy 0, policy_version 703942 (0.0036) [2024-06-24 17:08:38,392][15132] Fps is (10 sec: 40951.2, 60 sec: 42870.0, 300 sec: 42487.0). Total num frames: 11533434880. Throughput: 0: 42467.9. Samples: 11533521700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:38,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 17:08:40,599][15401] Updated weights for policy 0, policy_version 703952 (0.0026) [2024-06-24 17:08:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11533664256. Throughput: 0: 42803.1. Samples: 11533779620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:08:44,719][15401] Updated weights for policy 0, policy_version 703962 (0.0032) [2024-06-24 17:08:48,125][15401] Updated weights for policy 0, policy_version 703972 (0.0046) [2024-06-24 17:08:48,390][15132] Fps is (10 sec: 44245.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11533877248. Throughput: 0: 42635.6. Samples: 11534038840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 17:08:52,427][15401] Updated weights for policy 0, policy_version 703982 (0.0029) [2024-06-24 17:08:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 11534090240. Throughput: 0: 42715.5. Samples: 11534167740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 17:08:56,176][15401] Updated weights for policy 0, policy_version 703992 (0.0049) [2024-06-24 17:08:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11534319616. Throughput: 0: 42829.7. Samples: 11534422840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:08:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 17:09:00,147][15401] Updated weights for policy 0, policy_version 704002 (0.0040) [2024-06-24 17:09:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11534516224. Throughput: 0: 42726.6. Samples: 11534680940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 17:09:03,581][15401] Updated weights for policy 0, policy_version 704012 (0.0038) [2024-06-24 17:09:07,708][15401] Updated weights for policy 0, policy_version 704022 (0.0042) [2024-06-24 17:09:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 11534729216. Throughput: 0: 42733.4. Samples: 11534808960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 17:09:11,073][15401] Updated weights for policy 0, policy_version 704032 (0.0031) [2024-06-24 17:09:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 11534958592. Throughput: 0: 42962.1. Samples: 11535071060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 17:09:15,191][15401] Updated weights for policy 0, policy_version 704042 (0.0030) [2024-06-24 17:09:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11535171584. Throughput: 0: 42849.8. Samples: 11535325560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 17:09:18,652][15401] Updated weights for policy 0, policy_version 704052 (0.0035) [2024-06-24 17:09:23,114][15401] Updated weights for policy 0, policy_version 704062 (0.0036) [2024-06-24 17:09:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 11535351808. Throughput: 0: 42958.4. Samples: 11535454740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:09:26,323][15401] Updated weights for policy 0, policy_version 704072 (0.0030) [2024-06-24 17:09:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11535597568. Throughput: 0: 42784.0. Samples: 11535704900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 17:09:31,000][15401] Updated weights for policy 0, policy_version 704082 (0.0031) [2024-06-24 17:09:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11535794176. Throughput: 0: 42852.9. Samples: 11535967220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:09:33,848][15401] Updated weights for policy 0, policy_version 704092 (0.0033) [2024-06-24 17:09:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42599.8, 300 sec: 42431.8). Total num frames: 11535990784. Throughput: 0: 42698.7. Samples: 11536089180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:38,394][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:09:38,456][15401] Updated weights for policy 0, policy_version 704102 (0.0033) [2024-06-24 17:09:41,449][15401] Updated weights for policy 0, policy_version 704112 (0.0038) [2024-06-24 17:09:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11536236544. Throughput: 0: 42908.9. Samples: 11536353740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 17:09:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 17:09:43,440][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704117_11536252928.pth... [2024-06-24 17:09:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703491_11525996544.pth [2024-06-24 17:09:45,928][15401] Updated weights for policy 0, policy_version 704122 (0.0035) [2024-06-24 17:09:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 11536449536. Throughput: 0: 42930.8. Samples: 11536612820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:09:48,396][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 17:09:49,110][15401] Updated weights for policy 0, policy_version 704132 (0.0034) [2024-06-24 17:09:53,396][15132] Fps is (10 sec: 40934.5, 60 sec: 42593.9, 300 sec: 42541.9). Total num frames: 11536646144. Throughput: 0: 42858.5. Samples: 11536737860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:09:53,396][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 17:09:53,925][15401] Updated weights for policy 0, policy_version 704142 (0.0044) [2024-06-24 17:09:56,186][15349] Signal inference workers to stop experience collection... (170700 times) [2024-06-24 17:09:56,187][15349] Signal inference workers to resume experience collection... (170700 times) [2024-06-24 17:09:56,208][15401] InferenceWorker_p0-w0: stopping experience collection (170700 times) [2024-06-24 17:09:56,208][15401] InferenceWorker_p0-w0: resuming experience collection (170700 times) [2024-06-24 17:09:56,935][15401] Updated weights for policy 0, policy_version 704152 (0.0028) [2024-06-24 17:09:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 11536875520. Throughput: 0: 42576.0. Samples: 11536986980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:09:58,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-24 17:10:01,819][15401] Updated weights for policy 0, policy_version 704162 (0.0030) [2024-06-24 17:10:03,389][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11537088512. Throughput: 0: 42668.5. Samples: 11537245640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:03,390][15132] Avg episode reward: [(0, '0.909')] [2024-06-24 17:10:04,785][15401] Updated weights for policy 0, policy_version 704172 (0.0037) [2024-06-24 17:10:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11537285120. Throughput: 0: 42504.5. Samples: 11537367440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 17:10:09,232][15401] Updated weights for policy 0, policy_version 704182 (0.0025) [2024-06-24 17:10:12,673][15401] Updated weights for policy 0, policy_version 704192 (0.0037) [2024-06-24 17:10:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11537514496. Throughput: 0: 42780.9. Samples: 11537630040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 17:10:16,835][15401] Updated weights for policy 0, policy_version 704202 (0.0036) [2024-06-24 17:10:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11537711104. Throughput: 0: 42583.3. Samples: 11537883460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 17:10:20,407][15401] Updated weights for policy 0, policy_version 704212 (0.0027) [2024-06-24 17:10:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 11537940480. Throughput: 0: 42718.6. Samples: 11538011520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 17:10:24,625][15401] Updated weights for policy 0, policy_version 704222 (0.0031) [2024-06-24 17:10:28,227][15401] Updated weights for policy 0, policy_version 704232 (0.0034) [2024-06-24 17:10:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11538137088. Throughput: 0: 42403.2. Samples: 11538261880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 17:10:32,269][15401] Updated weights for policy 0, policy_version 704242 (0.0037) [2024-06-24 17:10:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 11538350080. Throughput: 0: 42163.9. Samples: 11538510200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:33,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 17:10:35,731][15401] Updated weights for policy 0, policy_version 704252 (0.0033) [2024-06-24 17:10:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11538563072. Throughput: 0: 42308.1. Samples: 11538641460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 17:10:40,026][15401] Updated weights for policy 0, policy_version 704262 (0.0033) [2024-06-24 17:10:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11538776064. Throughput: 0: 42433.7. Samples: 11538896500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 17:10:43,991][15401] Updated weights for policy 0, policy_version 704272 (0.0028) [2024-06-24 17:10:47,599][15401] Updated weights for policy 0, policy_version 704282 (0.0034) [2024-06-24 17:10:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11539005440. Throughput: 0: 42185.2. Samples: 11539143980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 17:10:51,592][15401] Updated weights for policy 0, policy_version 704292 (0.0035) [2024-06-24 17:10:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 11539202048. Throughput: 0: 42482.5. Samples: 11539279160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 17:10:55,175][15401] Updated weights for policy 0, policy_version 704302 (0.0028) [2024-06-24 17:10:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11539398656. Throughput: 0: 42229.8. Samples: 11539530380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:10:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 17:10:59,107][15401] Updated weights for policy 0, policy_version 704312 (0.0031) [2024-06-24 17:11:02,720][15401] Updated weights for policy 0, policy_version 704322 (0.0043) [2024-06-24 17:11:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 11539628032. Throughput: 0: 42259.0. Samples: 11539785120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:11:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 17:11:06,597][15401] Updated weights for policy 0, policy_version 704332 (0.0037) [2024-06-24 17:11:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11539824640. Throughput: 0: 42351.3. Samples: 11539917320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:11:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 17:11:10,306][15401] Updated weights for policy 0, policy_version 704342 (0.0030) [2024-06-24 17:11:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 11540054016. Throughput: 0: 42486.7. Samples: 11540173780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:11:13,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 17:11:14,717][15401] Updated weights for policy 0, policy_version 704352 (0.0035) [2024-06-24 17:11:17,863][15401] Updated weights for policy 0, policy_version 704362 (0.0031) [2024-06-24 17:11:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11540283392. Throughput: 0: 42685.4. Samples: 11540431040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:11:18,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 17:11:22,088][15349] Signal inference workers to stop experience collection... (170750 times) [2024-06-24 17:11:22,088][15349] Signal inference workers to resume experience collection... (170750 times) [2024-06-24 17:11:22,122][15401] InferenceWorker_p0-w0: stopping experience collection (170750 times) [2024-06-24 17:11:22,122][15401] InferenceWorker_p0-w0: resuming experience collection (170750 times) [2024-06-24 17:11:22,234][15401] Updated weights for policy 0, policy_version 704372 (0.0038) [2024-06-24 17:11:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11540463616. Throughput: 0: 42698.7. Samples: 11540562900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:11:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 17:11:25,391][15401] Updated weights for policy 0, policy_version 704382 (0.0046) [2024-06-24 17:11:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11540692992. Throughput: 0: 42684.9. Samples: 11540817320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 17:11:30,015][15401] Updated weights for policy 0, policy_version 704392 (0.0030) [2024-06-24 17:11:33,126][15401] Updated weights for policy 0, policy_version 704402 (0.0038) [2024-06-24 17:11:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 11540938752. Throughput: 0: 42766.8. Samples: 11541068480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 17:11:37,691][15401] Updated weights for policy 0, policy_version 704412 (0.0037) [2024-06-24 17:11:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 11541102592. Throughput: 0: 42688.1. Samples: 11541200120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 17:11:40,767][15401] Updated weights for policy 0, policy_version 704422 (0.0039) [2024-06-24 17:11:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11541331968. Throughput: 0: 42651.4. Samples: 11541449700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:43,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 17:11:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704427_11541331968.pth... [2024-06-24 17:11:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000703803_11531108352.pth [2024-06-24 17:11:45,285][15401] Updated weights for policy 0, policy_version 704432 (0.0027) [2024-06-24 17:11:48,386][15401] Updated weights for policy 0, policy_version 704442 (0.0033) [2024-06-24 17:11:48,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 11541577728. Throughput: 0: 42733.5. Samples: 11541708120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 17:11:53,007][15401] Updated weights for policy 0, policy_version 704452 (0.0028) [2024-06-24 17:11:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 11541757952. Throughput: 0: 42667.9. Samples: 11541837480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:53,393][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 17:11:56,148][15401] Updated weights for policy 0, policy_version 704462 (0.0021) [2024-06-24 17:11:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11541970944. Throughput: 0: 42490.6. Samples: 11542085860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:11:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 17:12:00,723][15401] Updated weights for policy 0, policy_version 704472 (0.0028) [2024-06-24 17:12:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11542200320. Throughput: 0: 42601.4. Samples: 11542348100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 17:12:03,782][15401] Updated weights for policy 0, policy_version 704482 (0.0039) [2024-06-24 17:12:08,370][15401] Updated weights for policy 0, policy_version 704492 (0.0043) [2024-06-24 17:12:08,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 11542396928. Throughput: 0: 42543.5. Samples: 11542477460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:08,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 17:12:11,410][15401] Updated weights for policy 0, policy_version 704502 (0.0031) [2024-06-24 17:12:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11542626304. Throughput: 0: 42512.5. Samples: 11542730380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 17:12:16,226][15401] Updated weights for policy 0, policy_version 704512 (0.0042) [2024-06-24 17:12:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11542822912. Throughput: 0: 42573.7. Samples: 11542984300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:18,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 17:12:19,261][15401] Updated weights for policy 0, policy_version 704522 (0.0035) [2024-06-24 17:12:23,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 11543019520. Throughput: 0: 42378.6. Samples: 11543107260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:23,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 17:12:23,841][15401] Updated weights for policy 0, policy_version 704532 (0.0028) [2024-06-24 17:12:26,974][15401] Updated weights for policy 0, policy_version 704542 (0.0040) [2024-06-24 17:12:28,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 11543265280. Throughput: 0: 42371.4. Samples: 11543356680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:28,396][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 17:12:30,976][15349] Signal inference workers to stop experience collection... (170800 times) [2024-06-24 17:12:30,976][15349] Signal inference workers to resume experience collection... (170800 times) [2024-06-24 17:12:31,002][15401] InferenceWorker_p0-w0: stopping experience collection (170800 times) [2024-06-24 17:12:31,003][15401] InferenceWorker_p0-w0: resuming experience collection (170800 times) [2024-06-24 17:12:32,321][15401] Updated weights for policy 0, policy_version 704552 (0.0037) [2024-06-24 17:12:33,389][15132] Fps is (10 sec: 42609.2, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 11543445504. Throughput: 0: 42478.6. Samples: 11543619660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:33,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-24 17:12:34,709][15401] Updated weights for policy 0, policy_version 704562 (0.0038) [2024-06-24 17:12:38,392][15132] Fps is (10 sec: 37698.2, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 11543642112. Throughput: 0: 42320.9. Samples: 11543741920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:38,392][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 17:12:39,827][15401] Updated weights for policy 0, policy_version 704572 (0.0032) [2024-06-24 17:12:42,672][15401] Updated weights for policy 0, policy_version 704582 (0.0031) [2024-06-24 17:12:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11543904256. Throughput: 0: 42564.6. Samples: 11544001260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 17:12:47,378][15401] Updated weights for policy 0, policy_version 704592 (0.0030) [2024-06-24 17:12:48,389][15132] Fps is (10 sec: 44248.0, 60 sec: 41779.2, 300 sec: 42598.8). Total num frames: 11544084480. Throughput: 0: 42526.7. Samples: 11544261800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 17:12:50,251][15401] Updated weights for policy 0, policy_version 704602 (0.0041) [2024-06-24 17:12:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 11544297472. Throughput: 0: 42293.5. Samples: 11544380560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 17:12:55,165][15401] Updated weights for policy 0, policy_version 704612 (0.0037) [2024-06-24 17:12:58,244][15401] Updated weights for policy 0, policy_version 704622 (0.0040) [2024-06-24 17:12:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11544526848. Throughput: 0: 42481.4. Samples: 11544642040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:12:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 17:13:02,823][15401] Updated weights for policy 0, policy_version 704632 (0.0033) [2024-06-24 17:13:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11544723456. Throughput: 0: 42546.6. Samples: 11544898900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-24 17:13:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 17:13:05,978][15401] Updated weights for policy 0, policy_version 704642 (0.0022) [2024-06-24 17:13:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 11544920064. Throughput: 0: 42495.2. Samples: 11545019440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 17:13:10,642][15401] Updated weights for policy 0, policy_version 704652 (0.0030) [2024-06-24 17:13:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 11545165824. Throughput: 0: 42625.6. Samples: 11545274560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 17:13:13,535][15401] Updated weights for policy 0, policy_version 704662 (0.0051) [2024-06-24 17:13:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 11545329664. Throughput: 0: 42648.8. Samples: 11545538860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 17:13:18,502][15401] Updated weights for policy 0, policy_version 704672 (0.0028) [2024-06-24 17:13:21,157][15401] Updated weights for policy 0, policy_version 704682 (0.0039) [2024-06-24 17:13:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 11545575424. Throughput: 0: 42586.8. Samples: 11545658220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 17:13:25,895][15401] Updated weights for policy 0, policy_version 704692 (0.0027) [2024-06-24 17:13:28,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 11545804800. Throughput: 0: 42553.3. Samples: 11545916160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 17:13:29,057][15401] Updated weights for policy 0, policy_version 704702 (0.0029) [2024-06-24 17:13:33,281][15401] Updated weights for policy 0, policy_version 704712 (0.0040) [2024-06-24 17:13:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 11546001408. Throughput: 0: 42735.4. Samples: 11546184900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:33,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 17:13:36,525][15401] Updated weights for policy 0, policy_version 704722 (0.0037) [2024-06-24 17:13:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 11546230784. Throughput: 0: 42852.9. Samples: 11546308940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 17:13:41,095][15401] Updated weights for policy 0, policy_version 704732 (0.0037) [2024-06-24 17:13:42,326][15349] Signal inference workers to stop experience collection... (170850 times) [2024-06-24 17:13:42,371][15401] InferenceWorker_p0-w0: stopping experience collection (170850 times) [2024-06-24 17:13:42,381][15349] Signal inference workers to resume experience collection... (170850 times) [2024-06-24 17:13:42,393][15401] InferenceWorker_p0-w0: resuming experience collection (170850 times) [2024-06-24 17:13:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11546460160. Throughput: 0: 42751.9. Samples: 11546565880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 17:13:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704740_11546460160.pth... [2024-06-24 17:13:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704117_11536252928.pth [2024-06-24 17:13:44,177][15401] Updated weights for policy 0, policy_version 704742 (0.0040) [2024-06-24 17:13:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11546640384. Throughput: 0: 42830.7. Samples: 11546826280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 17:13:48,555][15401] Updated weights for policy 0, policy_version 704752 (0.0036) [2024-06-24 17:13:51,681][15401] Updated weights for policy 0, policy_version 704762 (0.0029) [2024-06-24 17:13:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 11546886144. Throughput: 0: 42945.7. Samples: 11546952000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 17:13:55,917][15401] Updated weights for policy 0, policy_version 704772 (0.0029) [2024-06-24 17:13:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11547099136. Throughput: 0: 43019.6. Samples: 11547210440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:13:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 17:13:59,254][15401] Updated weights for policy 0, policy_version 704782 (0.0038) [2024-06-24 17:14:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11547295744. Throughput: 0: 43019.6. Samples: 11547474740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 17:14:03,771][15401] Updated weights for policy 0, policy_version 704792 (0.0035) [2024-06-24 17:14:06,797][15401] Updated weights for policy 0, policy_version 704802 (0.0024) [2024-06-24 17:14:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 11547525120. Throughput: 0: 43153.4. Samples: 11547600120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 17:14:11,367][15401] Updated weights for policy 0, policy_version 704812 (0.0036) [2024-06-24 17:14:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11547738112. Throughput: 0: 43079.6. Samples: 11547854740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 17:14:14,994][15401] Updated weights for policy 0, policy_version 704822 (0.0025) [2024-06-24 17:14:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 11547934720. Throughput: 0: 42824.6. Samples: 11548112000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:18,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 17:14:19,027][15401] Updated weights for policy 0, policy_version 704832 (0.0030) [2024-06-24 17:14:22,658][15401] Updated weights for policy 0, policy_version 704842 (0.0032) [2024-06-24 17:14:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11548164096. Throughput: 0: 42800.8. Samples: 11548234980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:23,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 17:14:26,660][15401] Updated weights for policy 0, policy_version 704852 (0.0033) [2024-06-24 17:14:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11548377088. Throughput: 0: 42816.6. Samples: 11548492620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 17:14:30,312][15401] Updated weights for policy 0, policy_version 704862 (0.0035) [2024-06-24 17:14:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11548590080. Throughput: 0: 42876.8. Samples: 11548755740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 17:14:34,377][15401] Updated weights for policy 0, policy_version 704872 (0.0026) [2024-06-24 17:14:38,010][15401] Updated weights for policy 0, policy_version 704882 (0.0036) [2024-06-24 17:14:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11548786688. Throughput: 0: 42765.8. Samples: 11548876460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 17:14:42,034][15401] Updated weights for policy 0, policy_version 704892 (0.0027) [2024-06-24 17:14:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11549016064. Throughput: 0: 42788.4. Samples: 11549135920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-24 17:14:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 17:14:46,043][15401] Updated weights for policy 0, policy_version 704902 (0.0041) [2024-06-24 17:14:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 11549212672. Throughput: 0: 42510.6. Samples: 11549387720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:14:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 17:14:49,570][15401] Updated weights for policy 0, policy_version 704912 (0.0033) [2024-06-24 17:14:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11549425664. Throughput: 0: 42435.5. Samples: 11549509720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:14:53,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 17:14:53,674][15401] Updated weights for policy 0, policy_version 704922 (0.0039) [2024-06-24 17:14:57,678][15401] Updated weights for policy 0, policy_version 704932 (0.0037) [2024-06-24 17:14:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11549655040. Throughput: 0: 42676.4. Samples: 11549775180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:14:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 17:14:59,293][15349] Signal inference workers to stop experience collection... (170900 times) [2024-06-24 17:14:59,294][15349] Signal inference workers to resume experience collection... (170900 times) [2024-06-24 17:14:59,321][15401] InferenceWorker_p0-w0: stopping experience collection (170900 times) [2024-06-24 17:14:59,321][15401] InferenceWorker_p0-w0: resuming experience collection (170900 times) [2024-06-24 17:15:01,182][15401] Updated weights for policy 0, policy_version 704942 (0.0046) [2024-06-24 17:15:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11549851648. Throughput: 0: 42611.5. Samples: 11550029520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 17:15:05,131][15401] Updated weights for policy 0, policy_version 704952 (0.0044) [2024-06-24 17:15:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11550064640. Throughput: 0: 42666.1. Samples: 11550154960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 17:15:08,990][15401] Updated weights for policy 0, policy_version 704962 (0.0035) [2024-06-24 17:15:12,812][15401] Updated weights for policy 0, policy_version 704972 (0.0033) [2024-06-24 17:15:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11550294016. Throughput: 0: 42693.2. Samples: 11550413820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 17:15:16,454][15401] Updated weights for policy 0, policy_version 704982 (0.0046) [2024-06-24 17:15:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11550490624. Throughput: 0: 42596.6. Samples: 11550672580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 17:15:20,417][15401] Updated weights for policy 0, policy_version 704992 (0.0032) [2024-06-24 17:15:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11550720000. Throughput: 0: 42654.6. Samples: 11550795920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 17:15:24,474][15401] Updated weights for policy 0, policy_version 705002 (0.0031) [2024-06-24 17:15:27,994][15401] Updated weights for policy 0, policy_version 705012 (0.0040) [2024-06-24 17:15:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11550932992. Throughput: 0: 42531.0. Samples: 11551049820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:28,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 17:15:32,395][15401] Updated weights for policy 0, policy_version 705022 (0.0036) [2024-06-24 17:15:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11551129600. Throughput: 0: 42607.6. Samples: 11551305060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 17:15:35,806][15401] Updated weights for policy 0, policy_version 705032 (0.0038) [2024-06-24 17:15:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11551375360. Throughput: 0: 42805.7. Samples: 11551435980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:38,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 17:15:40,069][15401] Updated weights for policy 0, policy_version 705042 (0.0034) [2024-06-24 17:15:43,308][15401] Updated weights for policy 0, policy_version 705052 (0.0032) [2024-06-24 17:15:43,391][15132] Fps is (10 sec: 44229.2, 60 sec: 42597.1, 300 sec: 42598.1). Total num frames: 11551571968. Throughput: 0: 42440.6. Samples: 11551685080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:43,392][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 17:15:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705052_11551571968.pth... [2024-06-24 17:15:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704427_11541331968.pth [2024-06-24 17:15:47,716][15401] Updated weights for policy 0, policy_version 705062 (0.0051) [2024-06-24 17:15:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11551768576. Throughput: 0: 42589.7. Samples: 11551946060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 17:15:51,369][15401] Updated weights for policy 0, policy_version 705072 (0.0037) [2024-06-24 17:15:53,390][15132] Fps is (10 sec: 42605.3, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 11551997952. Throughput: 0: 42665.3. Samples: 11552074900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 17:15:55,200][15401] Updated weights for policy 0, policy_version 705082 (0.0022) [2024-06-24 17:15:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11552210944. Throughput: 0: 42437.8. Samples: 11552323520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:15:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 17:15:58,868][15401] Updated weights for policy 0, policy_version 705092 (0.0047) [2024-06-24 17:16:02,774][15401] Updated weights for policy 0, policy_version 705102 (0.0035) [2024-06-24 17:16:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11552423936. Throughput: 0: 42400.0. Samples: 11552580580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:16:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 17:16:06,604][15401] Updated weights for policy 0, policy_version 705112 (0.0041) [2024-06-24 17:16:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11552636928. Throughput: 0: 42505.0. Samples: 11552708640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:16:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 17:16:10,410][15401] Updated weights for policy 0, policy_version 705122 (0.0037) [2024-06-24 17:16:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11552833536. Throughput: 0: 42412.9. Samples: 11552958400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:16:13,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 17:16:14,425][15401] Updated weights for policy 0, policy_version 705132 (0.0039) [2024-06-24 17:16:18,261][15401] Updated weights for policy 0, policy_version 705142 (0.0030) [2024-06-24 17:16:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11553046528. Throughput: 0: 42667.2. Samples: 11553225080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:16:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 17:16:22,045][15401] Updated weights for policy 0, policy_version 705152 (0.0027) [2024-06-24 17:16:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11553275904. Throughput: 0: 42605.3. Samples: 11553353220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 17:16:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 17:16:25,621][15401] Updated weights for policy 0, policy_version 705162 (0.0039) [2024-06-24 17:16:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11553488896. Throughput: 0: 42721.7. Samples: 11553607480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 17:16:29,904][15401] Updated weights for policy 0, policy_version 705172 (0.0046) [2024-06-24 17:16:33,266][15401] Updated weights for policy 0, policy_version 705182 (0.0032) [2024-06-24 17:16:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11553701888. Throughput: 0: 42769.0. Samples: 11553870660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 17:16:33,704][15349] Signal inference workers to stop experience collection... (170950 times) [2024-06-24 17:16:33,753][15401] InferenceWorker_p0-w0: stopping experience collection (170950 times) [2024-06-24 17:16:33,820][15349] Signal inference workers to resume experience collection... (170950 times) [2024-06-24 17:16:33,821][15401] InferenceWorker_p0-w0: resuming experience collection (170950 times) [2024-06-24 17:16:37,432][15401] Updated weights for policy 0, policy_version 705192 (0.0025) [2024-06-24 17:16:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11553914880. Throughput: 0: 42722.8. Samples: 11553997420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:38,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 17:16:41,129][15401] Updated weights for policy 0, policy_version 705202 (0.0037) [2024-06-24 17:16:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.7, 300 sec: 42542.8). Total num frames: 11554127872. Throughput: 0: 42823.1. Samples: 11554250560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 17:16:44,939][15401] Updated weights for policy 0, policy_version 705212 (0.0033) [2024-06-24 17:16:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 11554324480. Throughput: 0: 42880.3. Samples: 11554510200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 17:16:48,974][15401] Updated weights for policy 0, policy_version 705222 (0.0031) [2024-06-24 17:16:52,377][15401] Updated weights for policy 0, policy_version 705232 (0.0037) [2024-06-24 17:16:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11554570240. Throughput: 0: 42820.5. Samples: 11554635560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 17:16:56,276][15401] Updated weights for policy 0, policy_version 705242 (0.0037) [2024-06-24 17:16:58,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11554783232. Throughput: 0: 43239.6. Samples: 11554904180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:16:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 17:16:59,777][15401] Updated weights for policy 0, policy_version 705252 (0.0034) [2024-06-24 17:17:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11554979840. Throughput: 0: 43044.9. Samples: 11555162100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 17:17:03,880][15401] Updated weights for policy 0, policy_version 705262 (0.0030) [2024-06-24 17:17:07,458][15401] Updated weights for policy 0, policy_version 705272 (0.0040) [2024-06-24 17:17:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11555209216. Throughput: 0: 42964.1. Samples: 11555286600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 17:17:11,466][15401] Updated weights for policy 0, policy_version 705282 (0.0038) [2024-06-24 17:17:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11555422208. Throughput: 0: 43167.0. Samples: 11555550000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 17:17:14,990][15401] Updated weights for policy 0, policy_version 705292 (0.0032) [2024-06-24 17:17:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 11555635200. Throughput: 0: 42921.3. Samples: 11555802120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:18,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 17:17:18,900][15401] Updated weights for policy 0, policy_version 705302 (0.0036) [2024-06-24 17:17:22,604][15401] Updated weights for policy 0, policy_version 705312 (0.0042) [2024-06-24 17:17:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42710.4). Total num frames: 11555864576. Throughput: 0: 42960.9. Samples: 11555930660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 17:17:26,461][15401] Updated weights for policy 0, policy_version 705322 (0.0028) [2024-06-24 17:17:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11556061184. Throughput: 0: 43020.9. Samples: 11556186500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 17:17:30,484][15401] Updated weights for policy 0, policy_version 705332 (0.0030) [2024-06-24 17:17:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11556274176. Throughput: 0: 42957.1. Samples: 11556443260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 17:17:34,052][15401] Updated weights for policy 0, policy_version 705342 (0.0032) [2024-06-24 17:17:38,082][15401] Updated weights for policy 0, policy_version 705352 (0.0036) [2024-06-24 17:17:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11556503552. Throughput: 0: 43040.3. Samples: 11556572380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 17:17:41,763][15401] Updated weights for policy 0, policy_version 705362 (0.0042) [2024-06-24 17:17:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11556700160. Throughput: 0: 42684.0. Samples: 11556824960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 17:17:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705365_11556700160.pth... [2024-06-24 17:17:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000704740_11546460160.pth [2024-06-24 17:17:45,795][15401] Updated weights for policy 0, policy_version 705372 (0.0042) [2024-06-24 17:17:48,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11556896768. Throughput: 0: 42711.5. Samples: 11557084120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 17:17:49,514][15401] Updated weights for policy 0, policy_version 705382 (0.0032) [2024-06-24 17:17:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11557126144. Throughput: 0: 42609.4. Samples: 11557204020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 17:17:53,513][15401] Updated weights for policy 0, policy_version 705392 (0.0032) [2024-06-24 17:17:56,622][15349] Signal inference workers to stop experience collection... (171000 times) [2024-06-24 17:17:56,669][15401] InferenceWorker_p0-w0: stopping experience collection (171000 times) [2024-06-24 17:17:56,695][15349] Signal inference workers to resume experience collection... (171000 times) [2024-06-24 17:17:56,700][15401] InferenceWorker_p0-w0: resuming experience collection (171000 times) [2024-06-24 17:17:57,525][15401] Updated weights for policy 0, policy_version 705402 (0.0033) [2024-06-24 17:17:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11557339136. Throughput: 0: 42313.9. Samples: 11557454120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:17:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 17:18:01,121][15401] Updated weights for policy 0, policy_version 705412 (0.0029) [2024-06-24 17:18:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11557535744. Throughput: 0: 42611.5. Samples: 11557719640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 17:18:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 17:18:05,051][15401] Updated weights for policy 0, policy_version 705422 (0.0032) [2024-06-24 17:18:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11557765120. Throughput: 0: 42479.9. Samples: 11557842260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 17:18:09,107][15401] Updated weights for policy 0, policy_version 705432 (0.0039) [2024-06-24 17:18:12,884][15401] Updated weights for policy 0, policy_version 705442 (0.0035) [2024-06-24 17:18:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11557994496. Throughput: 0: 42533.4. Samples: 11558100500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 17:18:16,797][15401] Updated weights for policy 0, policy_version 705452 (0.0047) [2024-06-24 17:18:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11558174720. Throughput: 0: 42454.2. Samples: 11558353700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 17:18:20,470][15401] Updated weights for policy 0, policy_version 705462 (0.0030) [2024-06-24 17:18:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11558404096. Throughput: 0: 42369.0. Samples: 11558478980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:23,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 17:18:24,769][15401] Updated weights for policy 0, policy_version 705472 (0.0037) [2024-06-24 17:18:28,161][15401] Updated weights for policy 0, policy_version 705482 (0.0038) [2024-06-24 17:18:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11558633472. Throughput: 0: 42604.0. Samples: 11558742140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 17:18:32,263][15401] Updated weights for policy 0, policy_version 705492 (0.0031) [2024-06-24 17:18:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11558830080. Throughput: 0: 42464.8. Samples: 11558995040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:33,393][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 17:18:35,709][15401] Updated weights for policy 0, policy_version 705502 (0.0024) [2024-06-24 17:18:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11559059456. Throughput: 0: 42669.7. Samples: 11559124160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 17:18:39,720][15401] Updated weights for policy 0, policy_version 705512 (0.0038) [2024-06-24 17:18:43,361][15401] Updated weights for policy 0, policy_version 705522 (0.0030) [2024-06-24 17:18:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11559272448. Throughput: 0: 42979.6. Samples: 11559388200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:18:47,495][15401] Updated weights for policy 0, policy_version 705532 (0.0041) [2024-06-24 17:18:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11559469056. Throughput: 0: 42602.2. Samples: 11559636740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:48,392][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 17:18:51,453][15401] Updated weights for policy 0, policy_version 705542 (0.0043) [2024-06-24 17:18:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 11559698432. Throughput: 0: 42664.4. Samples: 11559762160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 17:18:55,174][15401] Updated weights for policy 0, policy_version 705552 (0.0055) [2024-06-24 17:18:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11559895040. Throughput: 0: 42571.1. Samples: 11560016200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:18:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 17:18:59,152][15401] Updated weights for policy 0, policy_version 705562 (0.0040) [2024-06-24 17:19:02,661][15401] Updated weights for policy 0, policy_version 705572 (0.0042) [2024-06-24 17:19:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11560108032. Throughput: 0: 42678.2. Samples: 11560274220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 17:19:06,725][15401] Updated weights for policy 0, policy_version 705582 (0.0041) [2024-06-24 17:19:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11560353792. Throughput: 0: 42783.5. Samples: 11560404240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 17:19:10,181][15401] Updated weights for policy 0, policy_version 705592 (0.0025) [2024-06-24 17:19:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11560550400. Throughput: 0: 42779.5. Samples: 11560667220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:19:14,372][15401] Updated weights for policy 0, policy_version 705602 (0.0029) [2024-06-24 17:19:17,786][15401] Updated weights for policy 0, policy_version 705612 (0.0036) [2024-06-24 17:19:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11560763392. Throughput: 0: 42598.3. Samples: 11560911960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 17:19:21,927][15401] Updated weights for policy 0, policy_version 705622 (0.0036) [2024-06-24 17:19:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11560976384. Throughput: 0: 42709.3. Samples: 11561046080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 17:19:25,606][15401] Updated weights for policy 0, policy_version 705632 (0.0032) [2024-06-24 17:19:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 11561140224. Throughput: 0: 42540.9. Samples: 11561302540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 17:19:29,595][15401] Updated weights for policy 0, policy_version 705642 (0.0036) [2024-06-24 17:19:33,389][15401] Updated weights for policy 0, policy_version 705652 (0.0033) [2024-06-24 17:19:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11561402368. Throughput: 0: 42684.4. Samples: 11561557540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 17:19:36,201][15349] Signal inference workers to stop experience collection... (171050 times) [2024-06-24 17:19:36,250][15401] InferenceWorker_p0-w0: stopping experience collection (171050 times) [2024-06-24 17:19:36,259][15349] Signal inference workers to resume experience collection... (171050 times) [2024-06-24 17:19:36,274][15401] InferenceWorker_p0-w0: resuming experience collection (171050 times) [2024-06-24 17:19:37,256][15401] Updated weights for policy 0, policy_version 705662 (0.0045) [2024-06-24 17:19:38,389][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11561631744. Throughput: 0: 42911.7. Samples: 11561693180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 17:19:40,952][15401] Updated weights for policy 0, policy_version 705672 (0.0033) [2024-06-24 17:19:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 11561795584. Throughput: 0: 42907.5. Samples: 11561947040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 17:19:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 17:19:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705676_11561795584.pth... [2024-06-24 17:19:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705052_11551571968.pth [2024-06-24 17:19:44,941][15401] Updated weights for policy 0, policy_version 705682 (0.0036) [2024-06-24 17:19:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11562024960. Throughput: 0: 42704.9. Samples: 11562195940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:19:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 17:19:48,669][15401] Updated weights for policy 0, policy_version 705692 (0.0031) [2024-06-24 17:19:52,601][15401] Updated weights for policy 0, policy_version 705702 (0.0031) [2024-06-24 17:19:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11562254336. Throughput: 0: 42823.5. Samples: 11562331300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:19:53,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 17:19:56,282][15401] Updated weights for policy 0, policy_version 705712 (0.0023) [2024-06-24 17:19:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11562450944. Throughput: 0: 42497.0. Samples: 11562579680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:19:58,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 17:20:00,149][15401] Updated weights for policy 0, policy_version 705722 (0.0038) [2024-06-24 17:20:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11562680320. Throughput: 0: 42744.3. Samples: 11562835460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 17:20:04,167][15401] Updated weights for policy 0, policy_version 705732 (0.0028) [2024-06-24 17:20:07,695][15401] Updated weights for policy 0, policy_version 705742 (0.0026) [2024-06-24 17:20:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11562893312. Throughput: 0: 42748.0. Samples: 11562969740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 17:20:11,907][15401] Updated weights for policy 0, policy_version 705752 (0.0031) [2024-06-24 17:20:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11563106304. Throughput: 0: 42624.7. Samples: 11563220660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 17:20:15,541][15401] Updated weights for policy 0, policy_version 705762 (0.0030) [2024-06-24 17:20:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11563302912. Throughput: 0: 42696.0. Samples: 11563478860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 17:20:19,538][15401] Updated weights for policy 0, policy_version 705772 (0.0028) [2024-06-24 17:20:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11563515904. Throughput: 0: 42451.9. Samples: 11563603520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 17:20:23,539][15401] Updated weights for policy 0, policy_version 705782 (0.0029) [2024-06-24 17:20:27,011][15401] Updated weights for policy 0, policy_version 705792 (0.0028) [2024-06-24 17:20:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11563745280. Throughput: 0: 42468.5. Samples: 11563858120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 17:20:31,007][15401] Updated weights for policy 0, policy_version 705802 (0.0032) [2024-06-24 17:20:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11563941888. Throughput: 0: 42864.4. Samples: 11564124840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 17:20:34,524][15401] Updated weights for policy 0, policy_version 705812 (0.0028) [2024-06-24 17:20:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.7). Total num frames: 11564171264. Throughput: 0: 42637.1. Samples: 11564249960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 17:20:38,525][15401] Updated weights for policy 0, policy_version 705822 (0.0032) [2024-06-24 17:20:42,252][15401] Updated weights for policy 0, policy_version 705832 (0.0031) [2024-06-24 17:20:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 11564400640. Throughput: 0: 42823.5. Samples: 11564506740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:43,393][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 17:20:46,674][15401] Updated weights for policy 0, policy_version 705842 (0.0037) [2024-06-24 17:20:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11564580864. Throughput: 0: 42792.4. Samples: 11564761120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 17:20:50,153][15401] Updated weights for policy 0, policy_version 705852 (0.0038) [2024-06-24 17:20:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11564810240. Throughput: 0: 42618.2. Samples: 11564887560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:53,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-24 17:20:54,111][15401] Updated weights for policy 0, policy_version 705862 (0.0031) [2024-06-24 17:20:57,719][15401] Updated weights for policy 0, policy_version 705872 (0.0039) [2024-06-24 17:20:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 11565039616. Throughput: 0: 42954.8. Samples: 11565153620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:20:58,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 17:21:00,426][15349] Signal inference workers to stop experience collection... (171100 times) [2024-06-24 17:21:00,427][15349] Signal inference workers to resume experience collection... (171100 times) [2024-06-24 17:21:00,451][15401] InferenceWorker_p0-w0: stopping experience collection (171100 times) [2024-06-24 17:21:00,452][15401] InferenceWorker_p0-w0: resuming experience collection (171100 times) [2024-06-24 17:21:01,613][15401] Updated weights for policy 0, policy_version 705882 (0.0034) [2024-06-24 17:21:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11565236224. Throughput: 0: 42888.9. Samples: 11565408860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 17:21:05,399][15401] Updated weights for policy 0, policy_version 705892 (0.0040) [2024-06-24 17:21:08,390][15132] Fps is (10 sec: 42596.3, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 11565465600. Throughput: 0: 42885.0. Samples: 11565533360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 17:21:09,221][15401] Updated weights for policy 0, policy_version 705902 (0.0034) [2024-06-24 17:21:13,216][15401] Updated weights for policy 0, policy_version 705912 (0.0043) [2024-06-24 17:21:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 11565678592. Throughput: 0: 43056.0. Samples: 11565795640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 17:21:16,775][15401] Updated weights for policy 0, policy_version 705922 (0.0029) [2024-06-24 17:21:18,390][15132] Fps is (10 sec: 40961.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11565875200. Throughput: 0: 42814.1. Samples: 11566051480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 17:21:20,762][15401] Updated weights for policy 0, policy_version 705932 (0.0032) [2024-06-24 17:21:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11566104576. Throughput: 0: 42868.8. Samples: 11566179060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 17:21:24,129][15401] Updated weights for policy 0, policy_version 705942 (0.0045) [2024-06-24 17:21:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11566301184. Throughput: 0: 42996.1. Samples: 11566441460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-24 17:21:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 17:21:28,494][15401] Updated weights for policy 0, policy_version 705952 (0.0030) [2024-06-24 17:21:31,863][15401] Updated weights for policy 0, policy_version 705962 (0.0035) [2024-06-24 17:21:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11566530560. Throughput: 0: 42872.0. Samples: 11566690360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 17:21:36,170][15401] Updated weights for policy 0, policy_version 705972 (0.0052) [2024-06-24 17:21:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11566743552. Throughput: 0: 43065.9. Samples: 11566825520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 17:21:39,525][15401] Updated weights for policy 0, policy_version 705982 (0.0038) [2024-06-24 17:21:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 11566940160. Throughput: 0: 42878.6. Samples: 11567083160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 17:21:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705991_11566956544.pth... [2024-06-24 17:21:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705365_11556700160.pth [2024-06-24 17:21:43,909][15401] Updated weights for policy 0, policy_version 705992 (0.0033) [2024-06-24 17:21:46,907][15401] Updated weights for policy 0, policy_version 706002 (0.0035) [2024-06-24 17:21:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 11567169536. Throughput: 0: 42707.2. Samples: 11567330680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 17:21:51,546][15401] Updated weights for policy 0, policy_version 706012 (0.0028) [2024-06-24 17:21:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11567382528. Throughput: 0: 43015.4. Samples: 11567469040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 17:21:54,397][15401] Updated weights for policy 0, policy_version 706022 (0.0035) [2024-06-24 17:21:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11567562752. Throughput: 0: 42963.1. Samples: 11567728980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:21:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 17:21:59,033][15401] Updated weights for policy 0, policy_version 706032 (0.0031) [2024-06-24 17:22:02,017][15401] Updated weights for policy 0, policy_version 706042 (0.0032) [2024-06-24 17:22:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11567841280. Throughput: 0: 42894.3. Samples: 11567981720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 17:22:06,668][15401] Updated weights for policy 0, policy_version 706052 (0.0037) [2024-06-24 17:22:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 11568037888. Throughput: 0: 43070.7. Samples: 11568117240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 17:22:09,756][15401] Updated weights for policy 0, policy_version 706062 (0.0026) [2024-06-24 17:22:13,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11568218112. Throughput: 0: 42972.9. Samples: 11568375240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 17:22:14,326][15401] Updated weights for policy 0, policy_version 706072 (0.0038) [2024-06-24 17:22:17,350][15401] Updated weights for policy 0, policy_version 706082 (0.0039) [2024-06-24 17:22:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 42820.5). Total num frames: 11568496640. Throughput: 0: 42881.4. Samples: 11568620020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 17:22:21,821][15401] Updated weights for policy 0, policy_version 706092 (0.0031) [2024-06-24 17:22:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11568660480. Throughput: 0: 43058.6. Samples: 11568763160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 17:22:24,928][15349] Signal inference workers to stop experience collection... (171150 times) [2024-06-24 17:22:24,961][15401] InferenceWorker_p0-w0: stopping experience collection (171150 times) [2024-06-24 17:22:24,987][15349] Signal inference workers to resume experience collection... (171150 times) [2024-06-24 17:22:24,995][15401] InferenceWorker_p0-w0: resuming experience collection (171150 times) [2024-06-24 17:22:24,997][15401] Updated weights for policy 0, policy_version 706102 (0.0030) [2024-06-24 17:22:28,396][15132] Fps is (10 sec: 37659.2, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 11568873472. Throughput: 0: 43023.7. Samples: 11569019500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:28,396][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 17:22:29,390][15401] Updated weights for policy 0, policy_version 706112 (0.0029) [2024-06-24 17:22:32,537][15401] Updated weights for policy 0, policy_version 706122 (0.0023) [2024-06-24 17:22:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11569119232. Throughput: 0: 43078.1. Samples: 11569269200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 17:22:37,195][15401] Updated weights for policy 0, policy_version 706132 (0.0030) [2024-06-24 17:22:38,390][15132] Fps is (10 sec: 45904.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11569332224. Throughput: 0: 43150.3. Samples: 11569410800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:38,396][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 17:22:40,402][15401] Updated weights for policy 0, policy_version 706142 (0.0030) [2024-06-24 17:22:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11569512448. Throughput: 0: 42885.3. Samples: 11569658820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 17:22:44,816][15401] Updated weights for policy 0, policy_version 706152 (0.0049) [2024-06-24 17:22:48,106][15401] Updated weights for policy 0, policy_version 706162 (0.0039) [2024-06-24 17:22:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11569774592. Throughput: 0: 42926.3. Samples: 11569913400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 17:22:52,375][15401] Updated weights for policy 0, policy_version 706172 (0.0031) [2024-06-24 17:22:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11569971200. Throughput: 0: 42975.1. Samples: 11570051120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 17:22:55,611][15401] Updated weights for policy 0, policy_version 706182 (0.0028) [2024-06-24 17:22:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11570151424. Throughput: 0: 42967.6. Samples: 11570308780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:22:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 17:23:00,011][15401] Updated weights for policy 0, policy_version 706192 (0.0033) [2024-06-24 17:23:03,223][15401] Updated weights for policy 0, policy_version 706202 (0.0029) [2024-06-24 17:23:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11570413568. Throughput: 0: 43081.3. Samples: 11570558680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:23:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 17:23:07,612][15401] Updated weights for policy 0, policy_version 706212 (0.0046) [2024-06-24 17:23:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11570593792. Throughput: 0: 42900.9. Samples: 11570693700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 17:23:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 17:23:11,127][15401] Updated weights for policy 0, policy_version 706222 (0.0030) [2024-06-24 17:23:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11570806784. Throughput: 0: 42821.2. Samples: 11570946180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 17:23:15,182][15401] Updated weights for policy 0, policy_version 706232 (0.0037) [2024-06-24 17:23:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11571036160. Throughput: 0: 42827.6. Samples: 11571196440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:18,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 17:23:18,797][15401] Updated weights for policy 0, policy_version 706242 (0.0032) [2024-06-24 17:23:23,155][15401] Updated weights for policy 0, policy_version 706252 (0.0044) [2024-06-24 17:23:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11571249152. Throughput: 0: 42648.0. Samples: 11571329960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 17:23:26,708][15401] Updated weights for policy 0, policy_version 706262 (0.0022) [2024-06-24 17:23:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 11571462144. Throughput: 0: 42610.5. Samples: 11571576300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 17:23:30,710][15401] Updated weights for policy 0, policy_version 706272 (0.0052) [2024-06-24 17:23:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11571675136. Throughput: 0: 42617.7. Samples: 11571831200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 17:23:34,585][15401] Updated weights for policy 0, policy_version 706282 (0.0045) [2024-06-24 17:23:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11571871744. Throughput: 0: 42438.3. Samples: 11571960840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 17:23:38,683][15401] Updated weights for policy 0, policy_version 706292 (0.0044) [2024-06-24 17:23:42,612][15401] Updated weights for policy 0, policy_version 706302 (0.0036) [2024-06-24 17:23:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11572101120. Throughput: 0: 42331.9. Samples: 11572213720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:43,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 17:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706305_11572101120.pth... [2024-06-24 17:23:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705676_11561795584.pth [2024-06-24 17:23:45,349][15349] Signal inference workers to stop experience collection... (171200 times) [2024-06-24 17:23:45,389][15401] InferenceWorker_p0-w0: stopping experience collection (171200 times) [2024-06-24 17:23:45,408][15349] Signal inference workers to resume experience collection... (171200 times) [2024-06-24 17:23:45,410][15401] InferenceWorker_p0-w0: resuming experience collection (171200 times) [2024-06-24 17:23:46,207][15401] Updated weights for policy 0, policy_version 706312 (0.0037) [2024-06-24 17:23:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11572314112. Throughput: 0: 42465.2. Samples: 11572469620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 17:23:50,256][15401] Updated weights for policy 0, policy_version 706322 (0.0032) [2024-06-24 17:23:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11572510720. Throughput: 0: 42297.1. Samples: 11572597080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 17:23:53,754][15401] Updated weights for policy 0, policy_version 706332 (0.0041) [2024-06-24 17:23:57,989][15401] Updated weights for policy 0, policy_version 706342 (0.0052) [2024-06-24 17:23:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11572723712. Throughput: 0: 42451.1. Samples: 11572856480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:23:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 17:24:01,346][15401] Updated weights for policy 0, policy_version 706352 (0.0035) [2024-06-24 17:24:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11572953088. Throughput: 0: 42518.2. Samples: 11573109760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 17:24:05,593][15401] Updated weights for policy 0, policy_version 706362 (0.0023) [2024-06-24 17:24:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11573166080. Throughput: 0: 42427.7. Samples: 11573239200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 17:24:08,829][15401] Updated weights for policy 0, policy_version 706372 (0.0028) [2024-06-24 17:24:13,319][15401] Updated weights for policy 0, policy_version 706382 (0.0028) [2024-06-24 17:24:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11573362688. Throughput: 0: 42705.0. Samples: 11573498020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 17:24:16,702][15401] Updated weights for policy 0, policy_version 706392 (0.0042) [2024-06-24 17:24:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11573592064. Throughput: 0: 42613.3. Samples: 11573748800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 17:24:20,897][15401] Updated weights for policy 0, policy_version 706402 (0.0036) [2024-06-24 17:24:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11573821440. Throughput: 0: 42662.2. Samples: 11573880640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 17:24:24,271][15401] Updated weights for policy 0, policy_version 706412 (0.0023) [2024-06-24 17:24:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11574001664. Throughput: 0: 42890.3. Samples: 11574143780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 17:24:28,446][15401] Updated weights for policy 0, policy_version 706422 (0.0028) [2024-06-24 17:24:31,730][15401] Updated weights for policy 0, policy_version 706432 (0.0042) [2024-06-24 17:24:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11574231040. Throughput: 0: 42775.7. Samples: 11574394520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 17:24:36,195][15401] Updated weights for policy 0, policy_version 706442 (0.0026) [2024-06-24 17:24:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11574444032. Throughput: 0: 42985.5. Samples: 11574531420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 17:24:39,647][15401] Updated weights for policy 0, policy_version 706452 (0.0038) [2024-06-24 17:24:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11574657024. Throughput: 0: 42891.5. Samples: 11574786600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 17:24:43,676][15401] Updated weights for policy 0, policy_version 706462 (0.0040) [2024-06-24 17:24:47,163][15401] Updated weights for policy 0, policy_version 706472 (0.0039) [2024-06-24 17:24:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11574886400. Throughput: 0: 42872.0. Samples: 11575039000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-06-24 17:24:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 17:24:51,238][15349] Signal inference workers to stop experience collection... (171250 times) [2024-06-24 17:24:51,238][15349] Signal inference workers to resume experience collection... (171250 times) [2024-06-24 17:24:51,253][15401] Updated weights for policy 0, policy_version 706482 (0.0038) [2024-06-24 17:24:51,281][15401] InferenceWorker_p0-w0: stopping experience collection (171250 times) [2024-06-24 17:24:51,281][15401] InferenceWorker_p0-w0: resuming experience collection (171250 times) [2024-06-24 17:24:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11575083008. Throughput: 0: 42998.6. Samples: 11575174140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:24:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 17:24:54,718][15401] Updated weights for policy 0, policy_version 706492 (0.0026) [2024-06-24 17:24:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11575312384. Throughput: 0: 42967.2. Samples: 11575431540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:24:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 17:24:58,762][15401] Updated weights for policy 0, policy_version 706502 (0.0033) [2024-06-24 17:25:02,507][15401] Updated weights for policy 0, policy_version 706512 (0.0027) [2024-06-24 17:25:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11575541760. Throughput: 0: 43056.0. Samples: 11575686320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 17:25:06,284][15401] Updated weights for policy 0, policy_version 706522 (0.0039) [2024-06-24 17:25:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11575721984. Throughput: 0: 43053.3. Samples: 11575818040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 17:25:10,058][15401] Updated weights for policy 0, policy_version 706532 (0.0031) [2024-06-24 17:25:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11575951360. Throughput: 0: 42836.4. Samples: 11576071420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 17:25:14,509][15401] Updated weights for policy 0, policy_version 706542 (0.0032) [2024-06-24 17:25:17,666][15401] Updated weights for policy 0, policy_version 706552 (0.0044) [2024-06-24 17:25:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 11576180736. Throughput: 0: 42863.2. Samples: 11576323360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 17:25:22,094][15401] Updated weights for policy 0, policy_version 706562 (0.0032) [2024-06-24 17:25:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11576360960. Throughput: 0: 42768.4. Samples: 11576456000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:23,395][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 17:25:25,334][15401] Updated weights for policy 0, policy_version 706572 (0.0035) [2024-06-24 17:25:28,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11576590336. Throughput: 0: 42725.7. Samples: 11576709260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 17:25:29,738][15401] Updated weights for policy 0, policy_version 706582 (0.0032) [2024-06-24 17:25:32,916][15401] Updated weights for policy 0, policy_version 706592 (0.0050) [2024-06-24 17:25:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11576819712. Throughput: 0: 42779.1. Samples: 11576964060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 17:25:37,539][15401] Updated weights for policy 0, policy_version 706602 (0.0044) [2024-06-24 17:25:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 11576999936. Throughput: 0: 42610.6. Samples: 11577091620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 17:25:41,100][15401] Updated weights for policy 0, policy_version 706612 (0.0028) [2024-06-24 17:25:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 11577245696. Throughput: 0: 42601.6. Samples: 11577348620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 17:25:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706620_11577262080.pth... [2024-06-24 17:25:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000705991_11566956544.pth [2024-06-24 17:25:45,422][15401] Updated weights for policy 0, policy_version 706622 (0.0039) [2024-06-24 17:25:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11577442304. Throughput: 0: 42747.2. Samples: 11577609940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 17:25:48,845][15401] Updated weights for policy 0, policy_version 706632 (0.0035) [2024-06-24 17:25:52,994][15401] Updated weights for policy 0, policy_version 706642 (0.0033) [2024-06-24 17:25:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11577638912. Throughput: 0: 42509.8. Samples: 11577730980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 17:25:56,330][15401] Updated weights for policy 0, policy_version 706652 (0.0031) [2024-06-24 17:25:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11577884672. Throughput: 0: 42657.7. Samples: 11577991020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:25:58,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 17:26:00,530][15401] Updated weights for policy 0, policy_version 706662 (0.0031) [2024-06-24 17:26:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11578097664. Throughput: 0: 42845.1. Samples: 11578251400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 17:26:03,763][15401] Updated weights for policy 0, policy_version 706672 (0.0036) [2024-06-24 17:26:08,050][15401] Updated weights for policy 0, policy_version 706682 (0.0034) [2024-06-24 17:26:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11578294272. Throughput: 0: 42786.8. Samples: 11578381400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 17:26:11,743][15401] Updated weights for policy 0, policy_version 706692 (0.0029) [2024-06-24 17:26:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11578540032. Throughput: 0: 42889.0. Samples: 11578639260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 17:26:15,617][15401] Updated weights for policy 0, policy_version 706702 (0.0041) [2024-06-24 17:26:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11578736640. Throughput: 0: 42928.4. Samples: 11578895840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 17:26:19,195][15401] Updated weights for policy 0, policy_version 706712 (0.0037) [2024-06-24 17:26:23,103][15401] Updated weights for policy 0, policy_version 706722 (0.0033) [2024-06-24 17:26:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11578949632. Throughput: 0: 42871.7. Samples: 11579020840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 17:26:26,757][15401] Updated weights for policy 0, policy_version 706732 (0.0024) [2024-06-24 17:26:27,393][15349] Signal inference workers to stop experience collection... (171300 times) [2024-06-24 17:26:27,414][15401] InferenceWorker_p0-w0: stopping experience collection (171300 times) [2024-06-24 17:26:27,453][15349] Signal inference workers to resume experience collection... (171300 times) [2024-06-24 17:26:27,455][15401] InferenceWorker_p0-w0: resuming experience collection (171300 times) [2024-06-24 17:26:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11579162624. Throughput: 0: 42962.0. Samples: 11579281900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-24 17:26:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 17:26:30,547][15401] Updated weights for policy 0, policy_version 706742 (0.0035) [2024-06-24 17:26:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11579359232. Throughput: 0: 42818.9. Samples: 11579536800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 17:26:34,686][15401] Updated weights for policy 0, policy_version 706752 (0.0040) [2024-06-24 17:26:38,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42870.0, 300 sec: 42820.2). Total num frames: 11579572224. Throughput: 0: 42885.4. Samples: 11579660920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:38,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 17:26:38,553][15401] Updated weights for policy 0, policy_version 706762 (0.0045) [2024-06-24 17:26:42,147][15401] Updated weights for policy 0, policy_version 706772 (0.0028) [2024-06-24 17:26:43,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11579801600. Throughput: 0: 42911.7. Samples: 11579922040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 17:26:46,132][15401] Updated weights for policy 0, policy_version 706782 (0.0036) [2024-06-24 17:26:48,390][15132] Fps is (10 sec: 44246.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11580014592. Throughput: 0: 42933.0. Samples: 11580183380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 17:26:49,701][15401] Updated weights for policy 0, policy_version 706792 (0.0033) [2024-06-24 17:26:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11580227584. Throughput: 0: 42762.1. Samples: 11580305700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 17:26:53,603][15401] Updated weights for policy 0, policy_version 706802 (0.0039) [2024-06-24 17:26:57,316][15401] Updated weights for policy 0, policy_version 706812 (0.0031) [2024-06-24 17:26:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11580456960. Throughput: 0: 42890.8. Samples: 11580569340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:26:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 17:27:01,665][15401] Updated weights for policy 0, policy_version 706822 (0.0033) [2024-06-24 17:27:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11580653568. Throughput: 0: 42829.6. Samples: 11580823180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 17:27:05,134][15401] Updated weights for policy 0, policy_version 706832 (0.0024) [2024-06-24 17:27:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11580866560. Throughput: 0: 42830.6. Samples: 11580948220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:27:09,118][15401] Updated weights for policy 0, policy_version 706842 (0.0023) [2024-06-24 17:27:12,564][15401] Updated weights for policy 0, policy_version 706852 (0.0037) [2024-06-24 17:27:13,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 11581095936. Throughput: 0: 42883.6. Samples: 11581211940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:13,396][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:27:16,673][15401] Updated weights for policy 0, policy_version 706862 (0.0034) [2024-06-24 17:27:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11581292544. Throughput: 0: 42878.8. Samples: 11581466340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 17:27:20,220][15401] Updated weights for policy 0, policy_version 706872 (0.0040) [2024-06-24 17:27:23,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 11581521920. Throughput: 0: 42974.2. Samples: 11581594660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 17:27:24,301][15401] Updated weights for policy 0, policy_version 706882 (0.0037) [2024-06-24 17:27:27,863][15401] Updated weights for policy 0, policy_version 706892 (0.0035) [2024-06-24 17:27:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11581734912. Throughput: 0: 42953.8. Samples: 11581854960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 17:27:31,881][15401] Updated weights for policy 0, policy_version 706902 (0.0035) [2024-06-24 17:27:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 11581947904. Throughput: 0: 42906.7. Samples: 11582114180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 17:27:35,673][15401] Updated weights for policy 0, policy_version 706912 (0.0023) [2024-06-24 17:27:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 11582144512. Throughput: 0: 43002.4. Samples: 11582240800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 17:27:39,543][15401] Updated weights for policy 0, policy_version 706922 (0.0036) [2024-06-24 17:27:43,143][15401] Updated weights for policy 0, policy_version 706932 (0.0032) [2024-06-24 17:27:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11582373888. Throughput: 0: 42841.7. Samples: 11582497220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 17:27:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706933_11582390272.pth... [2024-06-24 17:27:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706305_11572101120.pth [2024-06-24 17:27:47,131][15401] Updated weights for policy 0, policy_version 706942 (0.0024) [2024-06-24 17:27:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11582603264. Throughput: 0: 42896.5. Samples: 11582753520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 17:27:50,870][15401] Updated weights for policy 0, policy_version 706952 (0.0037) [2024-06-24 17:27:53,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42866.9, 300 sec: 42875.1). Total num frames: 11582799872. Throughput: 0: 42964.1. Samples: 11582881880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:53,397][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 17:27:54,979][15401] Updated weights for policy 0, policy_version 706962 (0.0025) [2024-06-24 17:27:58,121][15349] Signal inference workers to stop experience collection... (171350 times) [2024-06-24 17:27:58,122][15349] Signal inference workers to resume experience collection... (171350 times) [2024-06-24 17:27:58,164][15401] InferenceWorker_p0-w0: stopping experience collection (171350 times) [2024-06-24 17:27:58,164][15401] InferenceWorker_p0-w0: resuming experience collection (171350 times) [2024-06-24 17:27:58,264][15401] Updated weights for policy 0, policy_version 706972 (0.0047) [2024-06-24 17:27:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11583029248. Throughput: 0: 42812.4. Samples: 11583138220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:27:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 17:28:02,556][15401] Updated weights for policy 0, policy_version 706982 (0.0028) [2024-06-24 17:28:03,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11583242240. Throughput: 0: 42840.4. Samples: 11583394160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:28:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 17:28:06,098][15401] Updated weights for policy 0, policy_version 706992 (0.0032) [2024-06-24 17:28:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11583438848. Throughput: 0: 42834.1. Samples: 11583522200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 17:28:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 17:28:10,205][15401] Updated weights for policy 0, policy_version 707002 (0.0033) [2024-06-24 17:28:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 11583651840. Throughput: 0: 42860.4. Samples: 11583783680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 17:28:14,059][15401] Updated weights for policy 0, policy_version 707012 (0.0037) [2024-06-24 17:28:17,665][15401] Updated weights for policy 0, policy_version 707022 (0.0031) [2024-06-24 17:28:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11583881216. Throughput: 0: 42806.2. Samples: 11584040460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 17:28:21,668][15401] Updated weights for policy 0, policy_version 707032 (0.0037) [2024-06-24 17:28:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11584077824. Throughput: 0: 42829.3. Samples: 11584168120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 17:28:25,287][15401] Updated weights for policy 0, policy_version 707042 (0.0038) [2024-06-24 17:28:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11584290816. Throughput: 0: 42761.7. Samples: 11584421500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 17:28:29,441][15401] Updated weights for policy 0, policy_version 707052 (0.0042) [2024-06-24 17:28:33,258][15401] Updated weights for policy 0, policy_version 707062 (0.0032) [2024-06-24 17:28:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11584503808. Throughput: 0: 42664.1. Samples: 11584673400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 17:28:37,331][15401] Updated weights for policy 0, policy_version 707072 (0.0033) [2024-06-24 17:28:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11584733184. Throughput: 0: 42626.7. Samples: 11584799800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:38,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 17:28:40,664][15401] Updated weights for policy 0, policy_version 707082 (0.0036) [2024-06-24 17:28:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11584929792. Throughput: 0: 42581.3. Samples: 11585054380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:43,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 17:28:45,017][15401] Updated weights for policy 0, policy_version 707092 (0.0032) [2024-06-24 17:28:48,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 11585142784. Throughput: 0: 42613.8. Samples: 11585311880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:48,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 17:28:48,458][15401] Updated weights for policy 0, policy_version 707102 (0.0033) [2024-06-24 17:28:52,687][15401] Updated weights for policy 0, policy_version 707112 (0.0031) [2024-06-24 17:28:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 11585355776. Throughput: 0: 42615.3. Samples: 11585439880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:53,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 17:28:56,065][15401] Updated weights for policy 0, policy_version 707122 (0.0039) [2024-06-24 17:28:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11585585152. Throughput: 0: 42488.4. Samples: 11585695660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:28:58,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 17:29:00,418][15401] Updated weights for policy 0, policy_version 707132 (0.0035) [2024-06-24 17:29:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11585781760. Throughput: 0: 42369.7. Samples: 11585947100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 17:29:03,678][15401] Updated weights for policy 0, policy_version 707142 (0.0027) [2024-06-24 17:29:08,050][15401] Updated weights for policy 0, policy_version 707152 (0.0037) [2024-06-24 17:29:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11585994752. Throughput: 0: 42462.6. Samples: 11586078940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 17:29:11,532][15401] Updated weights for policy 0, policy_version 707162 (0.0036) [2024-06-24 17:29:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11586207744. Throughput: 0: 42480.5. Samples: 11586333120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 17:29:15,575][15401] Updated weights for policy 0, policy_version 707172 (0.0032) [2024-06-24 17:29:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11586420736. Throughput: 0: 42754.2. Samples: 11586597340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 17:29:19,109][15401] Updated weights for policy 0, policy_version 707182 (0.0038) [2024-06-24 17:29:23,341][15401] Updated weights for policy 0, policy_version 707192 (0.0042) [2024-06-24 17:29:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11586633728. Throughput: 0: 42791.0. Samples: 11586725400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 17:29:25,159][15349] Signal inference workers to stop experience collection... (171400 times) [2024-06-24 17:29:25,160][15349] Signal inference workers to resume experience collection... (171400 times) [2024-06-24 17:29:25,217][15401] InferenceWorker_p0-w0: stopping experience collection (171400 times) [2024-06-24 17:29:25,217][15401] InferenceWorker_p0-w0: resuming experience collection (171400 times) [2024-06-24 17:29:26,638][15401] Updated weights for policy 0, policy_version 707202 (0.0031) [2024-06-24 17:29:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11586863104. Throughput: 0: 42632.0. Samples: 11586972820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:28,392][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:29:30,854][15401] Updated weights for policy 0, policy_version 707212 (0.0031) [2024-06-24 17:29:33,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 11587076096. Throughput: 0: 42741.6. Samples: 11587235420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:33,396][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 17:29:34,256][15401] Updated weights for policy 0, policy_version 707222 (0.0031) [2024-06-24 17:29:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11587272704. Throughput: 0: 42748.9. Samples: 11587363580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:38,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 17:29:38,477][15401] Updated weights for policy 0, policy_version 707232 (0.0035) [2024-06-24 17:29:41,921][15401] Updated weights for policy 0, policy_version 707242 (0.0028) [2024-06-24 17:29:43,390][15132] Fps is (10 sec: 44264.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11587518464. Throughput: 0: 42675.0. Samples: 11587616040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 17:29:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707246_11587518464.pth... [2024-06-24 17:29:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706620_11577262080.pth [2024-06-24 17:29:45,962][15401] Updated weights for policy 0, policy_version 707252 (0.0041) [2024-06-24 17:29:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 11587715072. Throughput: 0: 42877.4. Samples: 11587876580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 17:29:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 17:29:49,643][15401] Updated weights for policy 0, policy_version 707262 (0.0046) [2024-06-24 17:29:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11587928064. Throughput: 0: 42696.9. Samples: 11588000300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:29:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 17:29:53,993][15401] Updated weights for policy 0, policy_version 707272 (0.0027) [2024-06-24 17:29:57,067][15401] Updated weights for policy 0, policy_version 707282 (0.0035) [2024-06-24 17:29:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11588173824. Throughput: 0: 42856.1. Samples: 11588261640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:29:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 17:30:01,429][15401] Updated weights for policy 0, policy_version 707292 (0.0025) [2024-06-24 17:30:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11588354048. Throughput: 0: 42749.4. Samples: 11588521060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 17:30:04,587][15401] Updated weights for policy 0, policy_version 707302 (0.0026) [2024-06-24 17:30:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11588567040. Throughput: 0: 42633.4. Samples: 11588643900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 17:30:09,029][15401] Updated weights for policy 0, policy_version 707312 (0.0037) [2024-06-24 17:30:12,213][15401] Updated weights for policy 0, policy_version 707322 (0.0029) [2024-06-24 17:30:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11588812800. Throughput: 0: 42948.4. Samples: 11588905500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 17:30:16,884][15401] Updated weights for policy 0, policy_version 707332 (0.0038) [2024-06-24 17:30:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11588993024. Throughput: 0: 43018.2. Samples: 11589170960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 17:30:19,990][15401] Updated weights for policy 0, policy_version 707342 (0.0039) [2024-06-24 17:30:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11589189632. Throughput: 0: 42764.4. Samples: 11589287980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 17:30:24,922][15401] Updated weights for policy 0, policy_version 707352 (0.0043) [2024-06-24 17:30:27,448][15401] Updated weights for policy 0, policy_version 707362 (0.0038) [2024-06-24 17:30:28,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11589468160. Throughput: 0: 42914.6. Samples: 11589547200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 17:30:32,536][15401] Updated weights for policy 0, policy_version 707372 (0.0041) [2024-06-24 17:30:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 11589632000. Throughput: 0: 43145.4. Samples: 11589818120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 17:30:34,940][15401] Updated weights for policy 0, policy_version 707382 (0.0039) [2024-06-24 17:30:38,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11589844992. Throughput: 0: 42948.1. Samples: 11589932960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 17:30:40,014][15401] Updated weights for policy 0, policy_version 707392 (0.0027) [2024-06-24 17:30:42,592][15349] Signal inference workers to stop experience collection... (171450 times) [2024-06-24 17:30:42,626][15401] InferenceWorker_p0-w0: stopping experience collection (171450 times) [2024-06-24 17:30:42,657][15349] Signal inference workers to resume experience collection... (171450 times) [2024-06-24 17:30:42,657][15401] InferenceWorker_p0-w0: resuming experience collection (171450 times) [2024-06-24 17:30:42,661][15401] Updated weights for policy 0, policy_version 707402 (0.0024) [2024-06-24 17:30:43,389][15132] Fps is (10 sec: 49151.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 11590123520. Throughput: 0: 42921.7. Samples: 11590193120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 17:30:47,544][15401] Updated weights for policy 0, policy_version 707412 (0.0030) [2024-06-24 17:30:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11590287360. Throughput: 0: 43029.7. Samples: 11590457400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:48,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 17:30:50,135][15401] Updated weights for policy 0, policy_version 707422 (0.0023) [2024-06-24 17:30:53,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11590483968. Throughput: 0: 42948.8. Samples: 11590576600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 17:30:55,006][15401] Updated weights for policy 0, policy_version 707432 (0.0027) [2024-06-24 17:30:57,925][15401] Updated weights for policy 0, policy_version 707442 (0.0027) [2024-06-24 17:30:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11590746112. Throughput: 0: 42992.1. Samples: 11590840140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:30:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 17:31:02,635][15401] Updated weights for policy 0, policy_version 707452 (0.0037) [2024-06-24 17:31:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11590909952. Throughput: 0: 42890.5. Samples: 11591101040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 17:31:05,566][15401] Updated weights for policy 0, policy_version 707462 (0.0031) [2024-06-24 17:31:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11591139328. Throughput: 0: 42866.7. Samples: 11591216980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:08,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 17:31:10,062][15401] Updated weights for policy 0, policy_version 707472 (0.0034) [2024-06-24 17:31:13,306][15401] Updated weights for policy 0, policy_version 707482 (0.0040) [2024-06-24 17:31:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11591385088. Throughput: 0: 43078.9. Samples: 11591485740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:13,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 17:31:18,105][15401] Updated weights for policy 0, policy_version 707492 (0.0032) [2024-06-24 17:31:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11591565312. Throughput: 0: 42896.3. Samples: 11591748460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 17:31:20,895][15401] Updated weights for policy 0, policy_version 707502 (0.0023) [2024-06-24 17:31:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11591778304. Throughput: 0: 42994.1. Samples: 11591867700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:23,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-24 17:31:25,576][15401] Updated weights for policy 0, policy_version 707512 (0.0036) [2024-06-24 17:31:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.6, 300 sec: 42931.7). Total num frames: 11592024064. Throughput: 0: 43043.2. Samples: 11592130060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:31:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 17:31:28,436][15401] Updated weights for policy 0, policy_version 707522 (0.0039) [2024-06-24 17:31:33,089][15401] Updated weights for policy 0, policy_version 707532 (0.0041) [2024-06-24 17:31:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 11592220672. Throughput: 0: 42954.2. Samples: 11592390340. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 17:31:35,886][15401] Updated weights for policy 0, policy_version 707542 (0.0027) [2024-06-24 17:31:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11592433664. Throughput: 0: 43021.3. Samples: 11592512560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 17:31:40,617][15401] Updated weights for policy 0, policy_version 707552 (0.0038) [2024-06-24 17:31:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 11592679424. Throughput: 0: 43089.6. Samples: 11592779180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:43,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 17:31:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707562_11592695808.pth... [2024-06-24 17:31:43,458][15401] Updated weights for policy 0, policy_version 707562 (0.0039) [2024-06-24 17:31:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000706933_11582390272.pth [2024-06-24 17:31:46,990][15349] Signal inference workers to stop experience collection... (171500 times) [2024-06-24 17:31:46,991][15349] Signal inference workers to resume experience collection... (171500 times) [2024-06-24 17:31:47,025][15401] InferenceWorker_p0-w0: stopping experience collection (171500 times) [2024-06-24 17:31:47,025][15401] InferenceWorker_p0-w0: resuming experience collection (171500 times) [2024-06-24 17:31:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11592843264. Throughput: 0: 43038.7. Samples: 11593037780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 17:31:48,453][15401] Updated weights for policy 0, policy_version 707572 (0.0038) [2024-06-24 17:31:51,013][15401] Updated weights for policy 0, policy_version 707582 (0.0026) [2024-06-24 17:31:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11593072640. Throughput: 0: 42926.7. Samples: 11593148680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 17:31:56,493][15401] Updated weights for policy 0, policy_version 707592 (0.0041) [2024-06-24 17:31:58,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 11593318400. Throughput: 0: 42947.9. Samples: 11593418400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:31:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 17:31:58,709][15401] Updated weights for policy 0, policy_version 707602 (0.0035) [2024-06-24 17:32:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11593482240. Throughput: 0: 42845.9. Samples: 11593676520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 17:32:04,068][15401] Updated weights for policy 0, policy_version 707612 (0.0037) [2024-06-24 17:32:06,770][15401] Updated weights for policy 0, policy_version 707622 (0.0028) [2024-06-24 17:32:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 11593728000. Throughput: 0: 42768.1. Samples: 11593792260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:08,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 17:32:11,610][15401] Updated weights for policy 0, policy_version 707632 (0.0034) [2024-06-24 17:32:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11593957376. Throughput: 0: 42859.9. Samples: 11594058760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 17:32:14,741][15401] Updated weights for policy 0, policy_version 707642 (0.0032) [2024-06-24 17:32:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11594137600. Throughput: 0: 42774.8. Samples: 11594315200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 17:32:19,253][15401] Updated weights for policy 0, policy_version 707652 (0.0030) [2024-06-24 17:32:22,498][15401] Updated weights for policy 0, policy_version 707662 (0.0022) [2024-06-24 17:32:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11594366976. Throughput: 0: 42710.2. Samples: 11594434520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 17:32:27,245][15401] Updated weights for policy 0, policy_version 707672 (0.0039) [2024-06-24 17:32:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11594612736. Throughput: 0: 42766.4. Samples: 11594703660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:28,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 17:32:30,261][15401] Updated weights for policy 0, policy_version 707682 (0.0039) [2024-06-24 17:32:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 11594776576. Throughput: 0: 42545.8. Samples: 11594952340. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 17:32:34,817][15401] Updated weights for policy 0, policy_version 707692 (0.0034) [2024-06-24 17:32:37,829][15401] Updated weights for policy 0, policy_version 707702 (0.0027) [2024-06-24 17:32:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11594989568. Throughput: 0: 42654.7. Samples: 11595068140. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 17:32:42,313][15401] Updated weights for policy 0, policy_version 707712 (0.0044) [2024-06-24 17:32:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11595251712. Throughput: 0: 42600.3. Samples: 11595335420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 17:32:45,586][15401] Updated weights for policy 0, policy_version 707722 (0.0041) [2024-06-24 17:32:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 11595382784. Throughput: 0: 42515.5. Samples: 11595589720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 17:32:49,944][15401] Updated weights for policy 0, policy_version 707732 (0.0034) [2024-06-24 17:32:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11595628544. Throughput: 0: 42584.8. Samples: 11595708580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 17:32:53,547][15401] Updated weights for policy 0, policy_version 707742 (0.0040) [2024-06-24 17:32:57,694][15349] Signal inference workers to stop experience collection... (171550 times) [2024-06-24 17:32:57,709][15401] InferenceWorker_p0-w0: stopping experience collection (171550 times) [2024-06-24 17:32:57,761][15349] Signal inference workers to resume experience collection... (171550 times) [2024-06-24 17:32:57,761][15401] InferenceWorker_p0-w0: resuming experience collection (171550 times) [2024-06-24 17:32:57,763][15401] Updated weights for policy 0, policy_version 707752 (0.0023) [2024-06-24 17:32:58,390][15132] Fps is (10 sec: 49151.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11595874304. Throughput: 0: 42459.6. Samples: 11595969440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:32:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 17:33:01,074][15401] Updated weights for policy 0, policy_version 707762 (0.0035) [2024-06-24 17:33:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11596038144. Throughput: 0: 42608.4. Samples: 11596232580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:33:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 17:33:05,282][15401] Updated weights for policy 0, policy_version 707772 (0.0038) [2024-06-24 17:33:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11596283904. Throughput: 0: 42455.1. Samples: 11596345000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:33:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 17:33:08,701][15401] Updated weights for policy 0, policy_version 707782 (0.0034) [2024-06-24 17:33:12,794][15401] Updated weights for policy 0, policy_version 707792 (0.0026) [2024-06-24 17:33:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11596513280. Throughput: 0: 42581.7. Samples: 11596619840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-24 17:33:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 17:33:16,714][15401] Updated weights for policy 0, policy_version 707802 (0.0027) [2024-06-24 17:33:18,391][15132] Fps is (10 sec: 39315.2, 60 sec: 42324.2, 300 sec: 42709.2). Total num frames: 11596677120. Throughput: 0: 42704.2. Samples: 11596874100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:18,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 17:33:20,413][15401] Updated weights for policy 0, policy_version 707812 (0.0037) [2024-06-24 17:33:23,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 11596922880. Throughput: 0: 42799.0. Samples: 11596994200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:23,401][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 17:33:24,195][15401] Updated weights for policy 0, policy_version 707822 (0.0035) [2024-06-24 17:33:28,390][15132] Fps is (10 sec: 42605.0, 60 sec: 41506.1, 300 sec: 42709.5). Total num frames: 11597103104. Throughput: 0: 42708.9. Samples: 11597257320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 17:33:28,417][15401] Updated weights for policy 0, policy_version 707832 (0.0041) [2024-06-24 17:33:31,801][15401] Updated weights for policy 0, policy_version 707842 (0.0026) [2024-06-24 17:33:33,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 11597332480. Throughput: 0: 42665.1. Samples: 11597509660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 17:33:35,961][15401] Updated weights for policy 0, policy_version 707852 (0.0036) [2024-06-24 17:33:38,392][15132] Fps is (10 sec: 47502.5, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 11597578240. Throughput: 0: 42871.1. Samples: 11597637880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:38,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 17:33:39,481][15401] Updated weights for policy 0, policy_version 707862 (0.0031) [2024-06-24 17:33:43,390][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42765.4). Total num frames: 11597758464. Throughput: 0: 42843.6. Samples: 11597897400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 17:33:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707871_11597758464.pth... [2024-06-24 17:33:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707246_11587518464.pth [2024-06-24 17:33:43,625][15401] Updated weights for policy 0, policy_version 707872 (0.0030) [2024-06-24 17:33:47,101][15401] Updated weights for policy 0, policy_version 707882 (0.0037) [2024-06-24 17:33:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 11597987840. Throughput: 0: 42613.3. Samples: 11598150180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 17:33:50,262][15349] Signal inference workers to stop experience collection... (171600 times) [2024-06-24 17:33:50,263][15349] Signal inference workers to resume experience collection... (171600 times) [2024-06-24 17:33:50,282][15401] InferenceWorker_p0-w0: stopping experience collection (171600 times) [2024-06-24 17:33:50,283][15401] InferenceWorker_p0-w0: resuming experience collection (171600 times) [2024-06-24 17:33:51,276][15401] Updated weights for policy 0, policy_version 707892 (0.0039) [2024-06-24 17:33:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11598217216. Throughput: 0: 42866.5. Samples: 11598274000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 17:33:55,247][15401] Updated weights for policy 0, policy_version 707902 (0.0028) [2024-06-24 17:33:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11598397440. Throughput: 0: 42548.5. Samples: 11598534520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:33:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 17:33:58,787][15401] Updated weights for policy 0, policy_version 707912 (0.0045) [2024-06-24 17:34:02,853][15401] Updated weights for policy 0, policy_version 707922 (0.0030) [2024-06-24 17:34:03,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11598610432. Throughput: 0: 42344.3. Samples: 11598779520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 17:34:06,717][15401] Updated weights for policy 0, policy_version 707932 (0.0039) [2024-06-24 17:34:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11598856192. Throughput: 0: 42700.5. Samples: 11598915620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 17:34:10,469][15401] Updated weights for policy 0, policy_version 707942 (0.0037) [2024-06-24 17:34:13,393][15132] Fps is (10 sec: 40945.0, 60 sec: 41776.8, 300 sec: 42709.0). Total num frames: 11599020032. Throughput: 0: 42378.1. Samples: 11599164480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:13,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 17:34:14,414][15401] Updated weights for policy 0, policy_version 707952 (0.0034) [2024-06-24 17:34:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42599.5, 300 sec: 42709.5). Total num frames: 11599233024. Throughput: 0: 42381.0. Samples: 11599416800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 17:34:18,544][15401] Updated weights for policy 0, policy_version 707962 (0.0046) [2024-06-24 17:34:22,110][15401] Updated weights for policy 0, policy_version 707972 (0.0028) [2024-06-24 17:34:23,390][15132] Fps is (10 sec: 45891.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11599478784. Throughput: 0: 42468.4. Samples: 11599548860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 17:34:25,960][15401] Updated weights for policy 0, policy_version 707982 (0.0047) [2024-06-24 17:34:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 11599642624. Throughput: 0: 42171.6. Samples: 11599795120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:28,391][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 17:34:29,813][15401] Updated weights for policy 0, policy_version 707992 (0.0036) [2024-06-24 17:34:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 11599872000. Throughput: 0: 42207.7. Samples: 11600049520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 17:34:33,697][15401] Updated weights for policy 0, policy_version 708002 (0.0032) [2024-06-24 17:34:37,655][15401] Updated weights for policy 0, policy_version 708012 (0.0036) [2024-06-24 17:34:38,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 11600117760. Throughput: 0: 42461.8. Samples: 11600184780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 17:34:41,198][15401] Updated weights for policy 0, policy_version 708022 (0.0030) [2024-06-24 17:34:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11600297984. Throughput: 0: 42086.8. Samples: 11600428420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 17:34:45,336][15401] Updated weights for policy 0, policy_version 708032 (0.0042) [2024-06-24 17:34:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11600527360. Throughput: 0: 42381.1. Samples: 11600686680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:48,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 17:34:48,756][15401] Updated weights for policy 0, policy_version 708042 (0.0026) [2024-06-24 17:34:53,277][15401] Updated weights for policy 0, policy_version 708052 (0.0031) [2024-06-24 17:34:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 11600723968. Throughput: 0: 42251.6. Samples: 11600816940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 17:34:53,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-24 17:34:54,852][15349] Signal inference workers to stop experience collection... (171650 times) [2024-06-24 17:34:54,860][15349] Signal inference workers to resume experience collection... (171650 times) [2024-06-24 17:34:54,880][15401] InferenceWorker_p0-w0: stopping experience collection (171650 times) [2024-06-24 17:34:54,880][15401] InferenceWorker_p0-w0: resuming experience collection (171650 times) [2024-06-24 17:34:56,585][15401] Updated weights for policy 0, policy_version 708062 (0.0022) [2024-06-24 17:34:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11600953344. Throughput: 0: 42248.6. Samples: 11601065620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:34:58,392][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 17:35:00,983][15401] Updated weights for policy 0, policy_version 708072 (0.0032) [2024-06-24 17:35:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11601166336. Throughput: 0: 42532.9. Samples: 11601330780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 17:35:04,220][15401] Updated weights for policy 0, policy_version 708082 (0.0037) [2024-06-24 17:35:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11601362944. Throughput: 0: 42564.5. Samples: 11601464260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:08,390][15132] Avg episode reward: [(0, '0.211')] [2024-06-24 17:35:08,753][15401] Updated weights for policy 0, policy_version 708092 (0.0028) [2024-06-24 17:35:11,669][15401] Updated weights for policy 0, policy_version 708102 (0.0038) [2024-06-24 17:35:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43147.1, 300 sec: 42765.0). Total num frames: 11601608704. Throughput: 0: 42586.3. Samples: 11601711500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 17:35:16,227][15401] Updated weights for policy 0, policy_version 708112 (0.0022) [2024-06-24 17:35:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11601821696. Throughput: 0: 42883.0. Samples: 11601979260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:18,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 17:35:19,043][15401] Updated weights for policy 0, policy_version 708122 (0.0027) [2024-06-24 17:35:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11602001920. Throughput: 0: 42647.6. Samples: 11602103920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 17:35:23,686][15401] Updated weights for policy 0, policy_version 708132 (0.0032) [2024-06-24 17:35:26,536][15401] Updated weights for policy 0, policy_version 708142 (0.0027) [2024-06-24 17:35:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11602247680. Throughput: 0: 42892.0. Samples: 11602358560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:28,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-24 17:35:31,255][15401] Updated weights for policy 0, policy_version 708152 (0.0037) [2024-06-24 17:35:33,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 11602477056. Throughput: 0: 43044.9. Samples: 11602623800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:33,393][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 17:35:34,361][15401] Updated weights for policy 0, policy_version 708162 (0.0035) [2024-06-24 17:35:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11602657280. Throughput: 0: 42977.7. Samples: 11602750940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 17:35:38,834][15401] Updated weights for policy 0, policy_version 708172 (0.0035) [2024-06-24 17:35:41,854][15401] Updated weights for policy 0, policy_version 708182 (0.0036) [2024-06-24 17:35:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11602903040. Throughput: 0: 43065.0. Samples: 11603003440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 17:35:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708185_11602903040.pth... [2024-06-24 17:35:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707562_11592695808.pth [2024-06-24 17:35:46,409][15401] Updated weights for policy 0, policy_version 708192 (0.0035) [2024-06-24 17:35:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11603116032. Throughput: 0: 43146.6. Samples: 11603272380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 17:35:49,694][15401] Updated weights for policy 0, policy_version 708202 (0.0029) [2024-06-24 17:35:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 11603312640. Throughput: 0: 42974.1. Samples: 11603398100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 17:35:54,328][15401] Updated weights for policy 0, policy_version 708212 (0.0045) [2024-06-24 17:35:57,342][15401] Updated weights for policy 0, policy_version 708222 (0.0034) [2024-06-24 17:35:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 11603558400. Throughput: 0: 43319.8. Samples: 11603660900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:35:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 17:36:01,851][15401] Updated weights for policy 0, policy_version 708232 (0.0021) [2024-06-24 17:36:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11603738624. Throughput: 0: 43168.0. Samples: 11603921820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:03,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 17:36:04,888][15401] Updated weights for policy 0, policy_version 708242 (0.0044) [2024-06-24 17:36:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11603968000. Throughput: 0: 43042.2. Samples: 11604040820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:08,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 17:36:09,447][15401] Updated weights for policy 0, policy_version 708252 (0.0033) [2024-06-24 17:36:12,419][15401] Updated weights for policy 0, policy_version 708262 (0.0043) [2024-06-24 17:36:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11604180992. Throughput: 0: 43098.2. Samples: 11604297980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 17:36:15,599][15349] Signal inference workers to stop experience collection... (171700 times) [2024-06-24 17:36:15,599][15349] Signal inference workers to resume experience collection... (171700 times) [2024-06-24 17:36:15,633][15401] InferenceWorker_p0-w0: stopping experience collection (171700 times) [2024-06-24 17:36:15,634][15401] InferenceWorker_p0-w0: resuming experience collection (171700 times) [2024-06-24 17:36:17,290][15401] Updated weights for policy 0, policy_version 708272 (0.0026) [2024-06-24 17:36:18,391][15132] Fps is (10 sec: 39314.4, 60 sec: 42324.1, 300 sec: 42653.7). Total num frames: 11604361216. Throughput: 0: 43026.3. Samples: 11604559960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:18,392][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 17:36:20,275][15401] Updated weights for policy 0, policy_version 708282 (0.0034) [2024-06-24 17:36:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11604606976. Throughput: 0: 42841.4. Samples: 11604678800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 17:36:24,902][15401] Updated weights for policy 0, policy_version 708292 (0.0048) [2024-06-24 17:36:28,389][15132] Fps is (10 sec: 44245.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11604803584. Throughput: 0: 42921.8. Samples: 11604934920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 17:36:28,419][15401] Updated weights for policy 0, policy_version 708302 (0.0042) [2024-06-24 17:36:32,718][15401] Updated weights for policy 0, policy_version 708312 (0.0032) [2024-06-24 17:36:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 11605016576. Throughput: 0: 42695.2. Samples: 11605193660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-24 17:36:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 17:36:35,902][15401] Updated weights for policy 0, policy_version 708322 (0.0030) [2024-06-24 17:36:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 11605262336. Throughput: 0: 42620.9. Samples: 11605316040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:36:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 17:36:40,169][15401] Updated weights for policy 0, policy_version 708332 (0.0030) [2024-06-24 17:36:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11605458944. Throughput: 0: 42632.1. Samples: 11605579340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:36:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 17:36:43,422][15401] Updated weights for policy 0, policy_version 708342 (0.0034) [2024-06-24 17:36:47,703][15401] Updated weights for policy 0, policy_version 708352 (0.0036) [2024-06-24 17:36:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11605655552. Throughput: 0: 42493.7. Samples: 11605834040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:36:48,399][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 17:36:51,220][15401] Updated weights for policy 0, policy_version 708362 (0.0032) [2024-06-24 17:36:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11605901312. Throughput: 0: 42643.9. Samples: 11605959800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:36:53,399][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 17:36:55,223][15401] Updated weights for policy 0, policy_version 708372 (0.0037) [2024-06-24 17:36:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11606097920. Throughput: 0: 42704.4. Samples: 11606219680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:36:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 17:36:59,347][15401] Updated weights for policy 0, policy_version 708382 (0.0027) [2024-06-24 17:37:02,923][15401] Updated weights for policy 0, policy_version 708392 (0.0038) [2024-06-24 17:37:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11606310912. Throughput: 0: 42604.0. Samples: 11606477060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 17:37:06,821][15401] Updated weights for policy 0, policy_version 708402 (0.0038) [2024-06-24 17:37:08,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11606540288. Throughput: 0: 42748.8. Samples: 11606602600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:08,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 17:37:10,654][15401] Updated weights for policy 0, policy_version 708412 (0.0029) [2024-06-24 17:37:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 11606753280. Throughput: 0: 42818.1. Samples: 11606861840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:13,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 17:37:14,604][15401] Updated weights for policy 0, policy_version 708422 (0.0032) [2024-06-24 17:37:18,217][15401] Updated weights for policy 0, policy_version 708432 (0.0027) [2024-06-24 17:37:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43145.9, 300 sec: 42653.9). Total num frames: 11606949888. Throughput: 0: 42649.3. Samples: 11607112880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:18,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 17:37:22,144][15401] Updated weights for policy 0, policy_version 708442 (0.0030) [2024-06-24 17:37:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11607179264. Throughput: 0: 42773.9. Samples: 11607240860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:23,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 17:37:25,943][15401] Updated weights for policy 0, policy_version 708452 (0.0033) [2024-06-24 17:37:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11607375872. Throughput: 0: 42750.7. Samples: 11607503120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:28,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 17:37:29,698][15401] Updated weights for policy 0, policy_version 708462 (0.0045) [2024-06-24 17:37:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11607588864. Throughput: 0: 42845.0. Samples: 11607762060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 17:37:33,715][15401] Updated weights for policy 0, policy_version 708472 (0.0026) [2024-06-24 17:37:37,261][15401] Updated weights for policy 0, policy_version 708482 (0.0029) [2024-06-24 17:37:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11607801856. Throughput: 0: 42824.1. Samples: 11607886880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:38,394][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:37:38,591][15349] Signal inference workers to stop experience collection... (171750 times) [2024-06-24 17:37:38,591][15349] Signal inference workers to resume experience collection... (171750 times) [2024-06-24 17:37:38,613][15401] InferenceWorker_p0-w0: stopping experience collection (171750 times) [2024-06-24 17:37:38,614][15401] InferenceWorker_p0-w0: resuming experience collection (171750 times) [2024-06-24 17:37:41,499][15401] Updated weights for policy 0, policy_version 708492 (0.0031) [2024-06-24 17:37:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11608031232. Throughput: 0: 42769.8. Samples: 11608144320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 17:37:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708498_11608031232.pth... [2024-06-24 17:37:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000707871_11597758464.pth [2024-06-24 17:37:44,841][15401] Updated weights for policy 0, policy_version 708502 (0.0038) [2024-06-24 17:37:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11608244224. Throughput: 0: 42761.3. Samples: 11608401320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 17:37:48,867][15401] Updated weights for policy 0, policy_version 708512 (0.0037) [2024-06-24 17:37:52,640][15401] Updated weights for policy 0, policy_version 708522 (0.0039) [2024-06-24 17:37:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 11608440832. Throughput: 0: 42805.9. Samples: 11608528760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 17:37:56,527][15401] Updated weights for policy 0, policy_version 708532 (0.0032) [2024-06-24 17:37:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11608653824. Throughput: 0: 42727.5. Samples: 11608784480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:37:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:38:00,660][15401] Updated weights for policy 0, policy_version 708542 (0.0034) [2024-06-24 17:38:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11608866816. Throughput: 0: 42944.9. Samples: 11609045400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:38:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:38:04,275][15401] Updated weights for policy 0, policy_version 708552 (0.0032) [2024-06-24 17:38:08,216][15401] Updated weights for policy 0, policy_version 708562 (0.0037) [2024-06-24 17:38:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 11609079808. Throughput: 0: 42833.3. Samples: 11609168360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:38:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 17:38:12,032][15401] Updated weights for policy 0, policy_version 708572 (0.0031) [2024-06-24 17:38:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42600.0, 300 sec: 42820.8). Total num frames: 11609309184. Throughput: 0: 42651.4. Samples: 11609422440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-24 17:38:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 17:38:15,846][15401] Updated weights for policy 0, policy_version 708582 (0.0038) [2024-06-24 17:38:18,390][15132] Fps is (10 sec: 42595.7, 60 sec: 42598.0, 300 sec: 42654.2). Total num frames: 11609505792. Throughput: 0: 42524.7. Samples: 11609675700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:18,391][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 17:38:20,082][15401] Updated weights for policy 0, policy_version 708592 (0.0037) [2024-06-24 17:38:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11609718784. Throughput: 0: 42644.9. Samples: 11609805900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 17:38:23,499][15401] Updated weights for policy 0, policy_version 708602 (0.0054) [2024-06-24 17:38:27,706][15401] Updated weights for policy 0, policy_version 708612 (0.0042) [2024-06-24 17:38:28,389][15132] Fps is (10 sec: 42601.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11609931776. Throughput: 0: 42651.7. Samples: 11610063640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 17:38:31,075][15401] Updated weights for policy 0, policy_version 708622 (0.0035) [2024-06-24 17:38:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 11610161152. Throughput: 0: 42521.8. Samples: 11610314800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 17:38:35,257][15401] Updated weights for policy 0, policy_version 708632 (0.0052) [2024-06-24 17:38:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11610374144. Throughput: 0: 42602.6. Samples: 11610445880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:38,399][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 17:38:38,749][15401] Updated weights for policy 0, policy_version 708642 (0.0033) [2024-06-24 17:38:42,804][15401] Updated weights for policy 0, policy_version 708652 (0.0043) [2024-06-24 17:38:43,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42708.5). Total num frames: 11610587136. Throughput: 0: 42641.5. Samples: 11610703620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:43,405][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 17:38:46,454][15401] Updated weights for policy 0, policy_version 708662 (0.0038) [2024-06-24 17:38:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11610783744. Throughput: 0: 42443.3. Samples: 11610955360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 17:38:50,533][15401] Updated weights for policy 0, policy_version 708672 (0.0043) [2024-06-24 17:38:53,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11610996736. Throughput: 0: 42550.7. Samples: 11611083140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 17:38:54,115][15401] Updated weights for policy 0, policy_version 708682 (0.0042) [2024-06-24 17:38:58,019][15401] Updated weights for policy 0, policy_version 708692 (0.0029) [2024-06-24 17:38:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11611226112. Throughput: 0: 42600.9. Samples: 11611339480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:38:58,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 17:39:01,799][15401] Updated weights for policy 0, policy_version 708702 (0.0040) [2024-06-24 17:39:02,045][15349] Signal inference workers to stop experience collection... (171800 times) [2024-06-24 17:39:02,045][15349] Signal inference workers to resume experience collection... (171800 times) [2024-06-24 17:39:02,092][15401] InferenceWorker_p0-w0: stopping experience collection (171800 times) [2024-06-24 17:39:02,093][15401] InferenceWorker_p0-w0: resuming experience collection (171800 times) [2024-06-24 17:39:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11611439104. Throughput: 0: 42600.0. Samples: 11611592680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 17:39:05,711][15401] Updated weights for policy 0, policy_version 708712 (0.0027) [2024-06-24 17:39:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42710.0). Total num frames: 11611619328. Throughput: 0: 42651.1. Samples: 11611725200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 17:39:09,577][15401] Updated weights for policy 0, policy_version 708722 (0.0039) [2024-06-24 17:39:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11611848704. Throughput: 0: 42332.8. Samples: 11611968620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 17:39:13,820][15401] Updated weights for policy 0, policy_version 708732 (0.0027) [2024-06-24 17:39:17,149][15401] Updated weights for policy 0, policy_version 708742 (0.0034) [2024-06-24 17:39:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.9, 300 sec: 42709.5). Total num frames: 11612078080. Throughput: 0: 42364.5. Samples: 11612221200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 17:39:21,817][15401] Updated weights for policy 0, policy_version 708752 (0.0032) [2024-06-24 17:39:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11612258304. Throughput: 0: 42404.5. Samples: 11612354080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 17:39:24,683][15401] Updated weights for policy 0, policy_version 708762 (0.0035) [2024-06-24 17:39:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11612487680. Throughput: 0: 42326.0. Samples: 11612608020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 17:39:29,417][15401] Updated weights for policy 0, policy_version 708772 (0.0039) [2024-06-24 17:39:32,625][15401] Updated weights for policy 0, policy_version 708782 (0.0042) [2024-06-24 17:39:33,393][15132] Fps is (10 sec: 44219.4, 60 sec: 42322.6, 300 sec: 42653.4). Total num frames: 11612700672. Throughput: 0: 42474.7. Samples: 11612866880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:33,394][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 17:39:36,818][15401] Updated weights for policy 0, policy_version 708792 (0.0030) [2024-06-24 17:39:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11612913664. Throughput: 0: 42553.7. Samples: 11612998060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:38,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 17:39:40,426][15401] Updated weights for policy 0, policy_version 708802 (0.0029) [2024-06-24 17:39:43,389][15132] Fps is (10 sec: 40976.4, 60 sec: 42056.8, 300 sec: 42654.0). Total num frames: 11613110272. Throughput: 0: 42352.6. Samples: 11613245340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 17:39:43,511][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708809_11613126656.pth... [2024-06-24 17:39:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708185_11602903040.pth [2024-06-24 17:39:44,389][15401] Updated weights for policy 0, policy_version 708812 (0.0048) [2024-06-24 17:39:48,138][15401] Updated weights for policy 0, policy_version 708822 (0.0030) [2024-06-24 17:39:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 11613339648. Throughput: 0: 42508.1. Samples: 11613505540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:48,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 17:39:52,089][15401] Updated weights for policy 0, policy_version 708832 (0.0033) [2024-06-24 17:39:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 11613552640. Throughput: 0: 42487.5. Samples: 11613637140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 19.0) [2024-06-24 17:39:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 17:39:56,026][15401] Updated weights for policy 0, policy_version 708842 (0.0029) [2024-06-24 17:39:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11613782016. Throughput: 0: 42744.0. Samples: 11613892100. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:39:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 17:39:59,700][15401] Updated weights for policy 0, policy_version 708852 (0.0048) [2024-06-24 17:40:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11613978624. Throughput: 0: 42862.7. Samples: 11614150020. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 17:40:03,635][15401] Updated weights for policy 0, policy_version 708862 (0.0032) [2024-06-24 17:40:07,381][15401] Updated weights for policy 0, policy_version 708872 (0.0042) [2024-06-24 17:40:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11614191616. Throughput: 0: 42700.0. Samples: 11614275580. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 17:40:11,272][15401] Updated weights for policy 0, policy_version 708882 (0.0029) [2024-06-24 17:40:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 11614420992. Throughput: 0: 42841.3. Samples: 11614535980. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:13,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:40:15,034][15401] Updated weights for policy 0, policy_version 708892 (0.0043) [2024-06-24 17:40:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11614633984. Throughput: 0: 42733.5. Samples: 11614789720. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 17:40:18,881][15401] Updated weights for policy 0, policy_version 708902 (0.0039) [2024-06-24 17:40:22,648][15401] Updated weights for policy 0, policy_version 708912 (0.0037) [2024-06-24 17:40:23,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11614830592. Throughput: 0: 42605.3. Samples: 11614915400. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:23,393][15132] Avg episode reward: [(0, '0.189')] [2024-06-24 17:40:26,505][15401] Updated weights for policy 0, policy_version 708922 (0.0030) [2024-06-24 17:40:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 11615043584. Throughput: 0: 42845.7. Samples: 11615173400. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 17:40:30,583][15401] Updated weights for policy 0, policy_version 708932 (0.0045) [2024-06-24 17:40:32,101][15349] Signal inference workers to stop experience collection... (171850 times) [2024-06-24 17:40:32,101][15349] Signal inference workers to resume experience collection... (171850 times) [2024-06-24 17:40:32,112][15401] InferenceWorker_p0-w0: stopping experience collection (171850 times) [2024-06-24 17:40:32,124][15401] InferenceWorker_p0-w0: resuming experience collection (171850 times) [2024-06-24 17:40:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42874.3, 300 sec: 42765.0). Total num frames: 11615272960. Throughput: 0: 42716.0. Samples: 11615427760. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 17:40:34,129][15401] Updated weights for policy 0, policy_version 708942 (0.0055) [2024-06-24 17:40:38,162][15401] Updated weights for policy 0, policy_version 708952 (0.0032) [2024-06-24 17:40:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11615485952. Throughput: 0: 42722.7. Samples: 11615559660. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 17:40:42,120][15401] Updated weights for policy 0, policy_version 708962 (0.0039) [2024-06-24 17:40:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 11615682560. Throughput: 0: 42861.3. Samples: 11615820960. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:43,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 17:40:45,545][15401] Updated weights for policy 0, policy_version 708972 (0.0039) [2024-06-24 17:40:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11615911936. Throughput: 0: 42688.9. Samples: 11616071020. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 17:40:49,665][15401] Updated weights for policy 0, policy_version 708982 (0.0035) [2024-06-24 17:40:53,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11616108544. Throughput: 0: 42703.9. Samples: 11616197260. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:53,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 17:40:53,916][15401] Updated weights for policy 0, policy_version 708992 (0.0041) [2024-06-24 17:40:57,272][15401] Updated weights for policy 0, policy_version 709002 (0.0030) [2024-06-24 17:40:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11616321536. Throughput: 0: 42622.8. Samples: 11616453900. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:40:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 17:41:01,433][15401] Updated weights for policy 0, policy_version 709012 (0.0036) [2024-06-24 17:41:03,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11616534528. Throughput: 0: 42724.1. Samples: 11616712300. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:03,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 17:41:04,755][15401] Updated weights for policy 0, policy_version 709022 (0.0037) [2024-06-24 17:41:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11616747520. Throughput: 0: 42683.3. Samples: 11616836040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 17:41:09,087][15401] Updated weights for policy 0, policy_version 709032 (0.0035) [2024-06-24 17:41:12,440][15401] Updated weights for policy 0, policy_version 709042 (0.0033) [2024-06-24 17:41:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42600.1, 300 sec: 42765.3). Total num frames: 11616976896. Throughput: 0: 42781.3. Samples: 11617098560. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 17:41:16,686][15401] Updated weights for policy 0, policy_version 709052 (0.0028) [2024-06-24 17:41:18,390][15132] Fps is (10 sec: 44233.4, 60 sec: 42597.9, 300 sec: 42653.8). Total num frames: 11617189888. Throughput: 0: 42859.7. Samples: 11617356480. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:18,391][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 17:41:20,111][15401] Updated weights for policy 0, policy_version 709062 (0.0050) [2024-06-24 17:41:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 11617386496. Throughput: 0: 42698.7. Samples: 11617481100. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 17:41:24,163][15401] Updated weights for policy 0, policy_version 709072 (0.0047) [2024-06-24 17:41:27,799][15401] Updated weights for policy 0, policy_version 709082 (0.0032) [2024-06-24 17:41:28,389][15132] Fps is (10 sec: 42601.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11617615872. Throughput: 0: 42519.2. Samples: 11617734220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 17:41:31,621][15401] Updated weights for policy 0, policy_version 709092 (0.0033) [2024-06-24 17:41:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 11617828864. Throughput: 0: 42679.0. Samples: 11617991680. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:33,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 17:41:35,909][15401] Updated weights for policy 0, policy_version 709102 (0.0032) [2024-06-24 17:41:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11618025472. Throughput: 0: 42643.8. Samples: 11618116220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 22.0) [2024-06-24 17:41:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 17:41:39,255][15401] Updated weights for policy 0, policy_version 709112 (0.0039) [2024-06-24 17:41:43,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 11618238464. Throughput: 0: 42572.7. Samples: 11618369780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:41:43,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 17:41:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709122_11618254848.pth... [2024-06-24 17:41:43,424][15349] Signal inference workers to stop experience collection... (171900 times) [2024-06-24 17:41:43,425][15349] Signal inference workers to resume experience collection... (171900 times) [2024-06-24 17:41:43,429][15401] Updated weights for policy 0, policy_version 709122 (0.0047) [2024-06-24 17:41:43,445][15401] InferenceWorker_p0-w0: stopping experience collection (171900 times) [2024-06-24 17:41:43,445][15401] InferenceWorker_p0-w0: resuming experience collection (171900 times) [2024-06-24 17:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708498_11608031232.pth [2024-06-24 17:41:47,326][15401] Updated weights for policy 0, policy_version 709132 (0.0036) [2024-06-24 17:41:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 11618467840. Throughput: 0: 42543.4. Samples: 11618626860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:41:48,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 17:41:50,802][15401] Updated weights for policy 0, policy_version 709142 (0.0027) [2024-06-24 17:41:53,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11618664448. Throughput: 0: 42660.4. Samples: 11618755760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:41:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 17:41:54,784][15401] Updated weights for policy 0, policy_version 709152 (0.0034) [2024-06-24 17:41:58,362][15401] Updated weights for policy 0, policy_version 709162 (0.0045) [2024-06-24 17:41:58,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11618910208. Throughput: 0: 42650.3. Samples: 11619017820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:41:58,391][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 17:42:02,340][15401] Updated weights for policy 0, policy_version 709172 (0.0033) [2024-06-24 17:42:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 11619106816. Throughput: 0: 42479.7. Samples: 11619268040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 17:42:06,040][15401] Updated weights for policy 0, policy_version 709182 (0.0028) [2024-06-24 17:42:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 11619303424. Throughput: 0: 42529.3. Samples: 11619394920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 17:42:09,931][15401] Updated weights for policy 0, policy_version 709192 (0.0043) [2024-06-24 17:42:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11619549184. Throughput: 0: 42618.7. Samples: 11619652060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 17:42:13,757][15401] Updated weights for policy 0, policy_version 709202 (0.0037) [2024-06-24 17:42:18,307][15401] Updated weights for policy 0, policy_version 709212 (0.0036) [2024-06-24 17:42:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.8, 300 sec: 42542.8). Total num frames: 11619729408. Throughput: 0: 42620.0. Samples: 11619909480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:18,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 17:42:21,983][15401] Updated weights for policy 0, policy_version 709222 (0.0028) [2024-06-24 17:42:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11619942400. Throughput: 0: 42620.4. Samples: 11620034140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 17:42:25,779][15401] Updated weights for policy 0, policy_version 709232 (0.0035) [2024-06-24 17:42:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 11620188160. Throughput: 0: 42739.5. Samples: 11620292960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:28,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 17:42:29,877][15401] Updated weights for policy 0, policy_version 709242 (0.0029) [2024-06-24 17:42:33,339][15401] Updated weights for policy 0, policy_version 709252 (0.0031) [2024-06-24 17:42:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 11620384768. Throughput: 0: 42663.2. Samples: 11620546600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:33,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 17:42:37,519][15401] Updated weights for policy 0, policy_version 709262 (0.0038) [2024-06-24 17:42:38,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 11620581376. Throughput: 0: 42690.2. Samples: 11620676920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:38,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 17:42:40,754][15401] Updated weights for policy 0, policy_version 709272 (0.0028) [2024-06-24 17:42:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 11620827136. Throughput: 0: 42562.6. Samples: 11620933140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:43,396][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 17:42:44,937][15401] Updated weights for policy 0, policy_version 709282 (0.0030) [2024-06-24 17:42:48,170][15401] Updated weights for policy 0, policy_version 709292 (0.0033) [2024-06-24 17:42:48,390][15132] Fps is (10 sec: 45884.9, 60 sec: 42873.0, 300 sec: 42709.4). Total num frames: 11621040128. Throughput: 0: 42641.2. Samples: 11621186900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 17:42:52,706][15401] Updated weights for policy 0, policy_version 709302 (0.0043) [2024-06-24 17:42:53,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11621220352. Throughput: 0: 42678.7. Samples: 11621315460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 17:42:55,951][15401] Updated weights for policy 0, policy_version 709312 (0.0032) [2024-06-24 17:42:58,392][15132] Fps is (10 sec: 42589.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 11621466112. Throughput: 0: 42752.3. Samples: 11621576020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:42:58,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 17:43:00,678][15401] Updated weights for policy 0, policy_version 709322 (0.0021) [2024-06-24 17:43:02,119][15349] Signal inference workers to stop experience collection... (171950 times) [2024-06-24 17:43:02,120][15349] Signal inference workers to resume experience collection... (171950 times) [2024-06-24 17:43:02,170][15401] InferenceWorker_p0-w0: stopping experience collection (171950 times) [2024-06-24 17:43:02,170][15401] InferenceWorker_p0-w0: resuming experience collection (171950 times) [2024-06-24 17:43:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11621679104. Throughput: 0: 42708.1. Samples: 11621831340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:43:03,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 17:43:03,460][15401] Updated weights for policy 0, policy_version 709332 (0.0035) [2024-06-24 17:43:08,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11621842944. Throughput: 0: 42806.7. Samples: 11621960440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:43:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 17:43:08,401][15401] Updated weights for policy 0, policy_version 709342 (0.0031) [2024-06-24 17:43:11,055][15401] Updated weights for policy 0, policy_version 709352 (0.0033) [2024-06-24 17:43:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.1). Total num frames: 11622121472. Throughput: 0: 42877.3. Samples: 11622222440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:43:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 17:43:15,818][15401] Updated weights for policy 0, policy_version 709362 (0.0036) [2024-06-24 17:43:18,390][15132] Fps is (10 sec: 49151.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11622334464. Throughput: 0: 42915.8. Samples: 11622477820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 17:43:18,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 17:43:18,585][15401] Updated weights for policy 0, policy_version 709372 (0.0046) [2024-06-24 17:43:23,256][15401] Updated weights for policy 0, policy_version 709382 (0.0030) [2024-06-24 17:43:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11622514688. Throughput: 0: 42868.1. Samples: 11622605880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 17:43:26,122][15401] Updated weights for policy 0, policy_version 709392 (0.0051) [2024-06-24 17:43:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11622776832. Throughput: 0: 43049.0. Samples: 11622870340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 17:43:31,210][15401] Updated weights for policy 0, policy_version 709402 (0.0026) [2024-06-24 17:43:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11622973440. Throughput: 0: 43171.3. Samples: 11623129600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 17:43:33,788][15401] Updated weights for policy 0, policy_version 709412 (0.0030) [2024-06-24 17:43:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42873.2, 300 sec: 42599.3). Total num frames: 11623153664. Throughput: 0: 43064.0. Samples: 11623253340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 17:43:39,075][15401] Updated weights for policy 0, policy_version 709422 (0.0038) [2024-06-24 17:43:41,516][15401] Updated weights for policy 0, policy_version 709432 (0.0030) [2024-06-24 17:43:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 11623432192. Throughput: 0: 43011.7. Samples: 11623511440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 17:43:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709438_11623432192.pth... [2024-06-24 17:43:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000708809_11613126656.pth [2024-06-24 17:43:46,854][15401] Updated weights for policy 0, policy_version 709442 (0.0029) [2024-06-24 17:43:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11623596032. Throughput: 0: 43028.4. Samples: 11623767620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 17:43:49,271][15401] Updated weights for policy 0, policy_version 709452 (0.0033) [2024-06-24 17:43:53,390][15132] Fps is (10 sec: 36044.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11623792640. Throughput: 0: 42901.8. Samples: 11623891020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 17:43:54,423][15401] Updated weights for policy 0, policy_version 709462 (0.0034) [2024-06-24 17:43:56,915][15401] Updated weights for policy 0, policy_version 709472 (0.0032) [2024-06-24 17:43:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 11624054784. Throughput: 0: 42779.7. Samples: 11624147520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:43:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 17:44:02,067][15401] Updated weights for policy 0, policy_version 709482 (0.0042) [2024-06-24 17:44:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11624251392. Throughput: 0: 42896.2. Samples: 11624408140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:03,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 17:44:04,606][15401] Updated weights for policy 0, policy_version 709492 (0.0050) [2024-06-24 17:44:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11624448000. Throughput: 0: 42830.6. Samples: 11624533260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 17:44:09,439][15401] Updated weights for policy 0, policy_version 709502 (0.0029) [2024-06-24 17:44:12,384][15349] Signal inference workers to stop experience collection... (172000 times) [2024-06-24 17:44:12,386][15349] Signal inference workers to resume experience collection... (172000 times) [2024-06-24 17:44:12,400][15401] Updated weights for policy 0, policy_version 709512 (0.0042) [2024-06-24 17:44:12,432][15401] InferenceWorker_p0-w0: stopping experience collection (172000 times) [2024-06-24 17:44:12,432][15401] InferenceWorker_p0-w0: resuming experience collection (172000 times) [2024-06-24 17:44:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11624693760. Throughput: 0: 42775.2. Samples: 11624795220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 17:44:17,002][15401] Updated weights for policy 0, policy_version 709522 (0.0035) [2024-06-24 17:44:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11624873984. Throughput: 0: 42755.3. Samples: 11625053580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 17:44:20,310][15401] Updated weights for policy 0, policy_version 709532 (0.0031) [2024-06-24 17:44:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11625103360. Throughput: 0: 42760.8. Samples: 11625177580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:23,395][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 17:44:24,719][15401] Updated weights for policy 0, policy_version 709542 (0.0037) [2024-06-24 17:44:27,763][15401] Updated weights for policy 0, policy_version 709552 (0.0035) [2024-06-24 17:44:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42876.7). Total num frames: 11625349120. Throughput: 0: 42799.1. Samples: 11625437400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 17:44:32,312][15401] Updated weights for policy 0, policy_version 709562 (0.0042) [2024-06-24 17:44:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11625512960. Throughput: 0: 42683.7. Samples: 11625688380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:33,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 17:44:35,351][15401] Updated weights for policy 0, policy_version 709572 (0.0032) [2024-06-24 17:44:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11625758720. Throughput: 0: 42522.7. Samples: 11625804540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 17:44:40,251][15401] Updated weights for policy 0, policy_version 709582 (0.0033) [2024-06-24 17:44:43,041][15401] Updated weights for policy 0, policy_version 709592 (0.0037) [2024-06-24 17:44:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 11625971712. Throughput: 0: 42855.1. Samples: 11626076000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:43,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 17:44:47,906][15401] Updated weights for policy 0, policy_version 709602 (0.0025) [2024-06-24 17:44:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11626168320. Throughput: 0: 42719.5. Samples: 11626330520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 17:44:50,735][15401] Updated weights for policy 0, policy_version 709612 (0.0031) [2024-06-24 17:44:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11626381312. Throughput: 0: 42622.3. Samples: 11626451260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:53,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 17:44:55,395][15401] Updated weights for policy 0, policy_version 709622 (0.0041) [2024-06-24 17:44:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11626594304. Throughput: 0: 42590.9. Samples: 11626711820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 17:44:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 17:44:58,711][15401] Updated weights for policy 0, policy_version 709632 (0.0026) [2024-06-24 17:45:02,884][15401] Updated weights for policy 0, policy_version 709642 (0.0031) [2024-06-24 17:45:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11626790912. Throughput: 0: 42546.1. Samples: 11626968160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 17:45:06,502][15401] Updated weights for policy 0, policy_version 709652 (0.0027) [2024-06-24 17:45:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 11627036672. Throughput: 0: 42564.5. Samples: 11627092980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 17:45:10,485][15401] Updated weights for policy 0, policy_version 709662 (0.0038) [2024-06-24 17:45:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 11627216896. Throughput: 0: 42473.8. Samples: 11627348720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 17:45:14,360][15401] Updated weights for policy 0, policy_version 709672 (0.0036) [2024-06-24 17:45:18,024][15401] Updated weights for policy 0, policy_version 709682 (0.0048) [2024-06-24 17:45:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11627446272. Throughput: 0: 42511.1. Samples: 11627601380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:18,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 17:45:21,919][15401] Updated weights for policy 0, policy_version 709692 (0.0032) [2024-06-24 17:45:23,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11627659264. Throughput: 0: 42778.6. Samples: 11627729580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:45:25,965][15401] Updated weights for policy 0, policy_version 709702 (0.0034) [2024-06-24 17:45:28,392][15132] Fps is (10 sec: 40949.9, 60 sec: 41777.5, 300 sec: 42653.6). Total num frames: 11627855872. Throughput: 0: 42460.0. Samples: 11627986800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:28,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 17:45:29,772][15401] Updated weights for policy 0, policy_version 709712 (0.0030) [2024-06-24 17:45:31,319][15349] Signal inference workers to stop experience collection... (172050 times) [2024-06-24 17:45:31,319][15349] Signal inference workers to resume experience collection... (172050 times) [2024-06-24 17:45:31,346][15401] InferenceWorker_p0-w0: stopping experience collection (172050 times) [2024-06-24 17:45:31,346][15401] InferenceWorker_p0-w0: resuming experience collection (172050 times) [2024-06-24 17:45:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11628068864. Throughput: 0: 42485.2. Samples: 11628242360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:45:33,555][15401] Updated weights for policy 0, policy_version 709722 (0.0033) [2024-06-24 17:45:37,547][15401] Updated weights for policy 0, policy_version 709732 (0.0044) [2024-06-24 17:45:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 11628298240. Throughput: 0: 42717.8. Samples: 11628373560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 17:45:41,155][15401] Updated weights for policy 0, policy_version 709742 (0.0034) [2024-06-24 17:45:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11628494848. Throughput: 0: 42525.4. Samples: 11628625460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:43,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-24 17:45:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709747_11628494848.pth... [2024-06-24 17:45:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709122_11618254848.pth [2024-06-24 17:45:45,254][15401] Updated weights for policy 0, policy_version 709752 (0.0040) [2024-06-24 17:45:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11628724224. Throughput: 0: 42451.6. Samples: 11628878480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 17:45:49,109][15401] Updated weights for policy 0, policy_version 709762 (0.0030) [2024-06-24 17:45:52,770][15401] Updated weights for policy 0, policy_version 709772 (0.0034) [2024-06-24 17:45:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11628937216. Throughput: 0: 42710.3. Samples: 11629014940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 17:45:56,855][15401] Updated weights for policy 0, policy_version 709782 (0.0038) [2024-06-24 17:45:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 11629133824. Throughput: 0: 42517.2. Samples: 11629262100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:45:58,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 17:46:00,284][15401] Updated weights for policy 0, policy_version 709792 (0.0027) [2024-06-24 17:46:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11629363200. Throughput: 0: 42666.1. Samples: 11629521360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 17:46:04,310][15401] Updated weights for policy 0, policy_version 709802 (0.0037) [2024-06-24 17:46:07,831][15401] Updated weights for policy 0, policy_version 709812 (0.0038) [2024-06-24 17:46:08,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11629576192. Throughput: 0: 42696.5. Samples: 11629650920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 17:46:11,756][15401] Updated weights for policy 0, policy_version 709822 (0.0036) [2024-06-24 17:46:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.6). Total num frames: 11629789184. Throughput: 0: 42652.6. Samples: 11629906060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 17:46:15,499][15401] Updated weights for policy 0, policy_version 709832 (0.0035) [2024-06-24 17:46:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11630002176. Throughput: 0: 42784.2. Samples: 11630167640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 17:46:19,278][15401] Updated weights for policy 0, policy_version 709842 (0.0032) [2024-06-24 17:46:23,031][15401] Updated weights for policy 0, policy_version 709852 (0.0033) [2024-06-24 17:46:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11630231552. Throughput: 0: 42610.9. Samples: 11630291060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 17:46:27,028][15401] Updated weights for policy 0, policy_version 709862 (0.0033) [2024-06-24 17:46:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43146.2, 300 sec: 42765.4). Total num frames: 11630444544. Throughput: 0: 42671.5. Samples: 11630545680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 17:46:30,977][15401] Updated weights for policy 0, policy_version 709872 (0.0046) [2024-06-24 17:46:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11630624768. Throughput: 0: 42926.8. Samples: 11630810180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 17:46:34,601][15401] Updated weights for policy 0, policy_version 709882 (0.0040) [2024-06-24 17:46:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 11630854144. Throughput: 0: 42534.7. Samples: 11630929000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:46:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:46:38,603][15401] Updated weights for policy 0, policy_version 709892 (0.0040) [2024-06-24 17:46:41,089][15349] Signal inference workers to stop experience collection... (172100 times) [2024-06-24 17:46:41,132][15401] InferenceWorker_p0-w0: stopping experience collection (172100 times) [2024-06-24 17:46:41,149][15349] Signal inference workers to resume experience collection... (172100 times) [2024-06-24 17:46:41,154][15401] InferenceWorker_p0-w0: resuming experience collection (172100 times) [2024-06-24 17:46:42,359][15401] Updated weights for policy 0, policy_version 709902 (0.0026) [2024-06-24 17:46:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 11631083520. Throughput: 0: 42727.1. Samples: 11631184720. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:46:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 17:46:46,301][15401] Updated weights for policy 0, policy_version 709912 (0.0027) [2024-06-24 17:46:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11631247360. Throughput: 0: 42772.8. Samples: 11631446140. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:46:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 17:46:49,941][15401] Updated weights for policy 0, policy_version 709922 (0.0023) [2024-06-24 17:46:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11631460352. Throughput: 0: 42513.0. Samples: 11631564000. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:46:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 17:46:54,263][15401] Updated weights for policy 0, policy_version 709932 (0.0027) [2024-06-24 17:46:57,806][15401] Updated weights for policy 0, policy_version 709942 (0.0035) [2024-06-24 17:46:58,392][15132] Fps is (10 sec: 49140.6, 60 sec: 43417.6, 300 sec: 42820.2). Total num frames: 11631738880. Throughput: 0: 42632.3. Samples: 11631824620. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:46:58,393][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 17:47:01,809][15401] Updated weights for policy 0, policy_version 709952 (0.0038) [2024-06-24 17:47:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11631902720. Throughput: 0: 42560.3. Samples: 11632082860. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:03,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 17:47:05,285][15401] Updated weights for policy 0, policy_version 709962 (0.0030) [2024-06-24 17:47:08,390][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11632115712. Throughput: 0: 42585.4. Samples: 11632207400. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 17:47:09,672][15401] Updated weights for policy 0, policy_version 709972 (0.0033) [2024-06-24 17:47:13,305][15401] Updated weights for policy 0, policy_version 709982 (0.0039) [2024-06-24 17:47:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11632345088. Throughput: 0: 42661.5. Samples: 11632465440. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 17:47:17,312][15401] Updated weights for policy 0, policy_version 709992 (0.0027) [2024-06-24 17:47:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11632558080. Throughput: 0: 42414.9. Samples: 11632718860. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:18,391][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 17:47:20,999][15401] Updated weights for policy 0, policy_version 710002 (0.0041) [2024-06-24 17:47:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11632771072. Throughput: 0: 42648.8. Samples: 11632848200. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 17:47:24,813][15401] Updated weights for policy 0, policy_version 710012 (0.0031) [2024-06-24 17:47:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11632984064. Throughput: 0: 42780.9. Samples: 11633109860. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 17:47:28,565][15401] Updated weights for policy 0, policy_version 710022 (0.0033) [2024-06-24 17:47:32,451][15401] Updated weights for policy 0, policy_version 710032 (0.0035) [2024-06-24 17:47:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 11633197056. Throughput: 0: 42584.4. Samples: 11633362440. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:33,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 17:47:36,032][15401] Updated weights for policy 0, policy_version 710042 (0.0036) [2024-06-24 17:47:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11633410048. Throughput: 0: 42794.3. Samples: 11633489740. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 17:47:40,268][15401] Updated weights for policy 0, policy_version 710052 (0.0031) [2024-06-24 17:47:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11633623040. Throughput: 0: 42885.8. Samples: 11633754380. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 17:47:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710061_11633639424.pth... [2024-06-24 17:47:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709438_11623432192.pth [2024-06-24 17:47:43,751][15401] Updated weights for policy 0, policy_version 710062 (0.0031) [2024-06-24 17:47:47,715][15401] Updated weights for policy 0, policy_version 710072 (0.0034) [2024-06-24 17:47:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 11633836032. Throughput: 0: 42727.7. Samples: 11634005600. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 17:47:51,354][15401] Updated weights for policy 0, policy_version 710082 (0.0035) [2024-06-24 17:47:52,150][15349] Signal inference workers to stop experience collection... (172150 times) [2024-06-24 17:47:52,151][15349] Signal inference workers to resume experience collection... (172150 times) [2024-06-24 17:47:52,191][15401] InferenceWorker_p0-w0: stopping experience collection (172150 times) [2024-06-24 17:47:52,196][15401] InferenceWorker_p0-w0: resuming experience collection (172150 times) [2024-06-24 17:47:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 11634065408. Throughput: 0: 42864.5. Samples: 11634136300. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:53,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 17:47:55,286][15401] Updated weights for policy 0, policy_version 710092 (0.0030) [2024-06-24 17:47:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 11634262016. Throughput: 0: 42903.1. Samples: 11634396080. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:47:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 17:47:59,054][15401] Updated weights for policy 0, policy_version 710102 (0.0024) [2024-06-24 17:48:02,930][15401] Updated weights for policy 0, policy_version 710112 (0.0034) [2024-06-24 17:48:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11634491392. Throughput: 0: 42944.5. Samples: 11634651360. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:48:03,394][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 17:48:06,639][15401] Updated weights for policy 0, policy_version 710122 (0.0028) [2024-06-24 17:48:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11634704384. Throughput: 0: 42931.5. Samples: 11634780120. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:48:08,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 17:48:10,628][15401] Updated weights for policy 0, policy_version 710132 (0.0039) [2024-06-24 17:48:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11634900992. Throughput: 0: 42889.2. Samples: 11635039880. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:48:13,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 17:48:14,147][15401] Updated weights for policy 0, policy_version 710142 (0.0033) [2024-06-24 17:48:18,271][15401] Updated weights for policy 0, policy_version 710152 (0.0029) [2024-06-24 17:48:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11635130368. Throughput: 0: 42811.2. Samples: 11635288940. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 17:48:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 17:48:22,229][15401] Updated weights for policy 0, policy_version 710162 (0.0029) [2024-06-24 17:48:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11635343360. Throughput: 0: 42851.5. Samples: 11635418060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:23,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 17:48:25,838][15401] Updated weights for policy 0, policy_version 710172 (0.0031) [2024-06-24 17:48:28,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11635507200. Throughput: 0: 42577.1. Samples: 11635670340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:28,396][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 17:48:30,064][15401] Updated weights for policy 0, policy_version 710182 (0.0037) [2024-06-24 17:48:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11635769344. Throughput: 0: 42550.1. Samples: 11635920360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 17:48:33,559][15401] Updated weights for policy 0, policy_version 710192 (0.0035) [2024-06-24 17:48:37,637][15401] Updated weights for policy 0, policy_version 710202 (0.0037) [2024-06-24 17:48:38,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11635965952. Throughput: 0: 42612.4. Samples: 11636053860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 17:48:41,052][15401] Updated weights for policy 0, policy_version 710212 (0.0040) [2024-06-24 17:48:43,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 11636162560. Throughput: 0: 42421.7. Samples: 11636305160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:43,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 17:48:45,553][15401] Updated weights for policy 0, policy_version 710222 (0.0048) [2024-06-24 17:48:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11636391936. Throughput: 0: 42357.3. Samples: 11636557440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 17:48:48,986][15401] Updated weights for policy 0, policy_version 710232 (0.0043) [2024-06-24 17:48:53,094][15401] Updated weights for policy 0, policy_version 710242 (0.0030) [2024-06-24 17:48:53,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11636621312. Throughput: 0: 42558.3. Samples: 11636695240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:53,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 17:48:56,960][15401] Updated weights for policy 0, policy_version 710252 (0.0042) [2024-06-24 17:48:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11636785152. Throughput: 0: 42235.3. Samples: 11636940460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:48:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 17:49:00,830][15401] Updated weights for policy 0, policy_version 710262 (0.0046) [2024-06-24 17:49:01,787][15349] Signal inference workers to stop experience collection... (172200 times) [2024-06-24 17:49:01,787][15349] Signal inference workers to resume experience collection... (172200 times) [2024-06-24 17:49:01,826][15401] InferenceWorker_p0-w0: stopping experience collection (172200 times) [2024-06-24 17:49:01,832][15401] InferenceWorker_p0-w0: resuming experience collection (172200 times) [2024-06-24 17:49:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11637047296. Throughput: 0: 42371.1. Samples: 11637195640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 17:49:04,445][15401] Updated weights for policy 0, policy_version 710272 (0.0027) [2024-06-24 17:49:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 11637243904. Throughput: 0: 42540.8. Samples: 11637332400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:08,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 17:49:08,899][15401] Updated weights for policy 0, policy_version 710282 (0.0026) [2024-06-24 17:49:11,903][15401] Updated weights for policy 0, policy_version 710292 (0.0031) [2024-06-24 17:49:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11637440512. Throughput: 0: 42359.4. Samples: 11637576520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 17:49:16,445][15401] Updated weights for policy 0, policy_version 710302 (0.0049) [2024-06-24 17:49:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11637686272. Throughput: 0: 42497.8. Samples: 11637832760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 17:49:20,249][15401] Updated weights for policy 0, policy_version 710312 (0.0043) [2024-06-24 17:49:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 11637866496. Throughput: 0: 42495.2. Samples: 11637966140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 17:49:24,097][15401] Updated weights for policy 0, policy_version 710322 (0.0037) [2024-06-24 17:49:27,951][15401] Updated weights for policy 0, policy_version 710332 (0.0027) [2024-06-24 17:49:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11638095872. Throughput: 0: 42367.2. Samples: 11638211580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 17:49:31,840][15401] Updated weights for policy 0, policy_version 710342 (0.0028) [2024-06-24 17:49:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 11638325248. Throughput: 0: 42490.8. Samples: 11638469520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 17:49:35,814][15401] Updated weights for policy 0, policy_version 710352 (0.0036) [2024-06-24 17:49:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11638505472. Throughput: 0: 42293.3. Samples: 11638598440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 17:49:39,623][15401] Updated weights for policy 0, policy_version 710362 (0.0038) [2024-06-24 17:49:43,374][15401] Updated weights for policy 0, policy_version 710372 (0.0040) [2024-06-24 17:49:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11638734848. Throughput: 0: 42375.1. Samples: 11638847340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 17:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710372_11638734848.pth... [2024-06-24 17:49:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000709747_11628494848.pth [2024-06-24 17:49:47,415][15401] Updated weights for policy 0, policy_version 710382 (0.0025) [2024-06-24 17:49:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11638947840. Throughput: 0: 42523.2. Samples: 11639109180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 17:49:50,879][15401] Updated weights for policy 0, policy_version 710392 (0.0028) [2024-06-24 17:49:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11639160832. Throughput: 0: 42331.9. Samples: 11639237340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 17:49:55,003][15401] Updated weights for policy 0, policy_version 710402 (0.0023) [2024-06-24 17:49:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11639373824. Throughput: 0: 42550.8. Samples: 11639491300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-24 17:49:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 17:49:58,397][15401] Updated weights for policy 0, policy_version 710412 (0.0033) [2024-06-24 17:50:03,010][15401] Updated weights for policy 0, policy_version 710422 (0.0027) [2024-06-24 17:50:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11639570432. Throughput: 0: 42715.6. Samples: 11639754960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:03,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 17:50:05,863][15401] Updated weights for policy 0, policy_version 710432 (0.0027) [2024-06-24 17:50:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11639783424. Throughput: 0: 42458.7. Samples: 11639876780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 17:50:10,745][15401] Updated weights for policy 0, policy_version 710442 (0.0040) [2024-06-24 17:50:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11640029184. Throughput: 0: 42714.2. Samples: 11640133720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 17:50:13,454][15401] Updated weights for policy 0, policy_version 710452 (0.0044) [2024-06-24 17:50:18,262][15401] Updated weights for policy 0, policy_version 710462 (0.0048) [2024-06-24 17:50:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11640209408. Throughput: 0: 42732.8. Samples: 11640392500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 17:50:19,807][15349] Signal inference workers to stop experience collection... (172250 times) [2024-06-24 17:50:19,834][15401] InferenceWorker_p0-w0: stopping experience collection (172250 times) [2024-06-24 17:50:19,922][15349] Signal inference workers to resume experience collection... (172250 times) [2024-06-24 17:50:19,923][15401] InferenceWorker_p0-w0: resuming experience collection (172250 times) [2024-06-24 17:50:21,244][15401] Updated weights for policy 0, policy_version 710472 (0.0030) [2024-06-24 17:50:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 11640438784. Throughput: 0: 42547.2. Samples: 11640513060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 17:50:25,903][15401] Updated weights for policy 0, policy_version 710482 (0.0040) [2024-06-24 17:50:28,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 11640668160. Throughput: 0: 42911.8. Samples: 11640778480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:28,393][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 17:50:29,282][15401] Updated weights for policy 0, policy_version 710492 (0.0030) [2024-06-24 17:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 11640848384. Throughput: 0: 42855.0. Samples: 11641037660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:33,391][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 17:50:33,469][15401] Updated weights for policy 0, policy_version 710502 (0.0049) [2024-06-24 17:50:36,913][15401] Updated weights for policy 0, policy_version 710512 (0.0037) [2024-06-24 17:50:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11641094144. Throughput: 0: 42788.5. Samples: 11641162820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 17:50:40,821][15401] Updated weights for policy 0, policy_version 710522 (0.0043) [2024-06-24 17:50:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11641307136. Throughput: 0: 42948.9. Samples: 11641424000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 17:50:44,624][15401] Updated weights for policy 0, policy_version 710532 (0.0027) [2024-06-24 17:50:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11641503744. Throughput: 0: 42895.3. Samples: 11641685240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 17:50:48,393][15401] Updated weights for policy 0, policy_version 710542 (0.0024) [2024-06-24 17:50:52,298][15401] Updated weights for policy 0, policy_version 710552 (0.0036) [2024-06-24 17:50:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11641733120. Throughput: 0: 42896.3. Samples: 11641807120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 17:50:55,986][15401] Updated weights for policy 0, policy_version 710562 (0.0036) [2024-06-24 17:50:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11641929728. Throughput: 0: 42910.5. Samples: 11642064700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:50:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 17:50:59,896][15401] Updated weights for policy 0, policy_version 710572 (0.0040) [2024-06-24 17:51:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11642142720. Throughput: 0: 42971.6. Samples: 11642326220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 17:51:03,965][15401] Updated weights for policy 0, policy_version 710582 (0.0052) [2024-06-24 17:51:07,668][15401] Updated weights for policy 0, policy_version 710592 (0.0035) [2024-06-24 17:51:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11642372096. Throughput: 0: 43071.5. Samples: 11642451280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 17:51:11,858][15401] Updated weights for policy 0, policy_version 710602 (0.0036) [2024-06-24 17:51:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11642568704. Throughput: 0: 42858.7. Samples: 11642707020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 17:51:15,531][15401] Updated weights for policy 0, policy_version 710612 (0.0029) [2024-06-24 17:51:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11642798080. Throughput: 0: 42609.4. Samples: 11642955080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:18,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 17:51:19,463][15401] Updated weights for policy 0, policy_version 710622 (0.0029) [2024-06-24 17:51:23,319][15401] Updated weights for policy 0, policy_version 710632 (0.0040) [2024-06-24 17:51:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11642994688. Throughput: 0: 42698.7. Samples: 11643084260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 17:51:27,112][15401] Updated weights for policy 0, policy_version 710642 (0.0038) [2024-06-24 17:51:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11643224064. Throughput: 0: 42644.3. Samples: 11643343000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:28,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 17:51:30,900][15401] Updated weights for policy 0, policy_version 710652 (0.0027) [2024-06-24 17:51:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11643420672. Throughput: 0: 42487.9. Samples: 11643597200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:33,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 17:51:34,699][15349] Signal inference workers to stop experience collection... (172300 times) [2024-06-24 17:51:34,700][15349] Signal inference workers to resume experience collection... (172300 times) [2024-06-24 17:51:34,705][15401] Updated weights for policy 0, policy_version 710662 (0.0036) [2024-06-24 17:51:34,749][15401] InferenceWorker_p0-w0: stopping experience collection (172300 times) [2024-06-24 17:51:34,749][15401] InferenceWorker_p0-w0: resuming experience collection (172300 times) [2024-06-24 17:51:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11643633664. Throughput: 0: 42590.1. Samples: 11643723680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 17:51:38,456][15401] Updated weights for policy 0, policy_version 710672 (0.0040) [2024-06-24 17:51:42,264][15401] Updated weights for policy 0, policy_version 710682 (0.0029) [2024-06-24 17:51:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11643846656. Throughput: 0: 42586.4. Samples: 11643981080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-24 17:51:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 17:51:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710685_11643863040.pth... [2024-06-24 17:51:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710061_11633639424.pth [2024-06-24 17:51:46,086][15401] Updated weights for policy 0, policy_version 710692 (0.0029) [2024-06-24 17:51:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11644076032. Throughput: 0: 42320.8. Samples: 11644230660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:51:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 17:51:49,853][15401] Updated weights for policy 0, policy_version 710702 (0.0039) [2024-06-24 17:51:53,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 11644289024. Throughput: 0: 42488.7. Samples: 11644363280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:51:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 17:51:53,782][15401] Updated weights for policy 0, policy_version 710712 (0.0037) [2024-06-24 17:51:57,838][15401] Updated weights for policy 0, policy_version 710722 (0.0036) [2024-06-24 17:51:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11644485632. Throughput: 0: 42519.2. Samples: 11644620380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:51:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 17:52:01,468][15401] Updated weights for policy 0, policy_version 710732 (0.0035) [2024-06-24 17:52:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11644715008. Throughput: 0: 42635.6. Samples: 11644873680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:03,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 17:52:05,571][15401] Updated weights for policy 0, policy_version 710742 (0.0037) [2024-06-24 17:52:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11644928000. Throughput: 0: 42704.7. Samples: 11645005980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:08,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 17:52:09,190][15401] Updated weights for policy 0, policy_version 710752 (0.0035) [2024-06-24 17:52:13,326][15401] Updated weights for policy 0, policy_version 710762 (0.0031) [2024-06-24 17:52:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11645124608. Throughput: 0: 42551.2. Samples: 11645257800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:13,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 17:52:16,930][15401] Updated weights for policy 0, policy_version 710772 (0.0035) [2024-06-24 17:52:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11645353984. Throughput: 0: 42513.4. Samples: 11645510300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:18,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 17:52:21,084][15401] Updated weights for policy 0, policy_version 710782 (0.0033) [2024-06-24 17:52:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11645566976. Throughput: 0: 42758.8. Samples: 11645647820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:23,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-24 17:52:24,433][15401] Updated weights for policy 0, policy_version 710792 (0.0043) [2024-06-24 17:52:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11645763584. Throughput: 0: 42572.7. Samples: 11645896860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 17:52:28,941][15401] Updated weights for policy 0, policy_version 710802 (0.0037) [2024-06-24 17:52:32,019][15401] Updated weights for policy 0, policy_version 710812 (0.0041) [2024-06-24 17:52:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11646009344. Throughput: 0: 42700.9. Samples: 11646152200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 17:52:36,359][15401] Updated weights for policy 0, policy_version 710822 (0.0033) [2024-06-24 17:52:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11646205952. Throughput: 0: 42848.2. Samples: 11646291440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 17:52:39,771][15401] Updated weights for policy 0, policy_version 710832 (0.0029) [2024-06-24 17:52:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11646418944. Throughput: 0: 42643.5. Samples: 11646539340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 17:52:43,801][15401] Updated weights for policy 0, policy_version 710842 (0.0032) [2024-06-24 17:52:47,351][15401] Updated weights for policy 0, policy_version 710852 (0.0028) [2024-06-24 17:52:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 11646648320. Throughput: 0: 42699.2. Samples: 11646795140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 17:52:51,486][15401] Updated weights for policy 0, policy_version 710862 (0.0028) [2024-06-24 17:52:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 11646828544. Throughput: 0: 42777.0. Samples: 11646930940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:53,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 17:52:54,996][15401] Updated weights for policy 0, policy_version 710872 (0.0034) [2024-06-24 17:52:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 11647074304. Throughput: 0: 42745.4. Samples: 11647181340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:52:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 17:52:59,058][15401] Updated weights for policy 0, policy_version 710882 (0.0022) [2024-06-24 17:53:02,652][15401] Updated weights for policy 0, policy_version 710892 (0.0030) [2024-06-24 17:53:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11647287296. Throughput: 0: 42808.8. Samples: 11647436700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:53:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 17:53:05,593][15349] Signal inference workers to stop experience collection... (172350 times) [2024-06-24 17:53:05,636][15401] InferenceWorker_p0-w0: stopping experience collection (172350 times) [2024-06-24 17:53:05,645][15349] Signal inference workers to resume experience collection... (172350 times) [2024-06-24 17:53:05,646][15401] InferenceWorker_p0-w0: resuming experience collection (172350 times) [2024-06-24 17:53:06,496][15401] Updated weights for policy 0, policy_version 710902 (0.0047) [2024-06-24 17:53:08,392][15132] Fps is (10 sec: 40951.4, 60 sec: 42597.0, 300 sec: 42653.7). Total num frames: 11647483904. Throughput: 0: 42714.5. Samples: 11647570060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:53:08,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 17:53:10,230][15401] Updated weights for policy 0, policy_version 710912 (0.0031) [2024-06-24 17:53:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11647713280. Throughput: 0: 42938.8. Samples: 11647829100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:53:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 17:53:14,189][15401] Updated weights for policy 0, policy_version 710922 (0.0030) [2024-06-24 17:53:17,716][15401] Updated weights for policy 0, policy_version 710932 (0.0029) [2024-06-24 17:53:18,389][15132] Fps is (10 sec: 45884.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11647942656. Throughput: 0: 42957.4. Samples: 11648085280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:53:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 17:53:21,795][15401] Updated weights for policy 0, policy_version 710942 (0.0039) [2024-06-24 17:53:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11648122880. Throughput: 0: 42730.7. Samples: 11648214320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-24 17:53:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 17:53:25,733][15401] Updated weights for policy 0, policy_version 710952 (0.0038) [2024-06-24 17:53:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11648335872. Throughput: 0: 42902.4. Samples: 11648469940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:28,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 17:53:29,671][15401] Updated weights for policy 0, policy_version 710962 (0.0038) [2024-06-24 17:53:33,243][15401] Updated weights for policy 0, policy_version 710972 (0.0029) [2024-06-24 17:53:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11648581632. Throughput: 0: 42951.4. Samples: 11648727960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 17:53:37,410][15401] Updated weights for policy 0, policy_version 710982 (0.0029) [2024-06-24 17:53:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11648778240. Throughput: 0: 42736.4. Samples: 11648854080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 17:53:40,850][15401] Updated weights for policy 0, policy_version 710992 (0.0027) [2024-06-24 17:53:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11648974848. Throughput: 0: 42687.4. Samples: 11649102280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 17:53:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710997_11648974848.pth... [2024-06-24 17:53:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710372_11638734848.pth [2024-06-24 17:53:45,243][15401] Updated weights for policy 0, policy_version 711002 (0.0030) [2024-06-24 17:53:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11649204224. Throughput: 0: 42607.6. Samples: 11649354040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 17:53:48,521][15401] Updated weights for policy 0, policy_version 711012 (0.0036) [2024-06-24 17:53:53,049][15401] Updated weights for policy 0, policy_version 711022 (0.0046) [2024-06-24 17:53:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11649417216. Throughput: 0: 42631.7. Samples: 11649488400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:53:56,311][15401] Updated weights for policy 0, policy_version 711032 (0.0033) [2024-06-24 17:53:58,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 11649613824. Throughput: 0: 42489.9. Samples: 11649741420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:53:58,396][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 17:54:00,541][15401] Updated weights for policy 0, policy_version 711042 (0.0034) [2024-06-24 17:54:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11649843200. Throughput: 0: 42522.1. Samples: 11649998780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 17:54:04,048][15401] Updated weights for policy 0, policy_version 711052 (0.0042) [2024-06-24 17:54:08,100][15401] Updated weights for policy 0, policy_version 711062 (0.0034) [2024-06-24 17:54:08,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 11650056192. Throughput: 0: 42567.6. Samples: 11650129860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:54:11,761][15401] Updated weights for policy 0, policy_version 711072 (0.0037) [2024-06-24 17:54:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11650252800. Throughput: 0: 42601.8. Samples: 11650387020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 17:54:15,984][15401] Updated weights for policy 0, policy_version 711082 (0.0033) [2024-06-24 17:54:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11650482176. Throughput: 0: 42335.1. Samples: 11650633040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:18,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-24 17:54:19,672][15401] Updated weights for policy 0, policy_version 711092 (0.0038) [2024-06-24 17:54:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11650678784. Throughput: 0: 42476.3. Samples: 11650765520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 17:54:23,809][15401] Updated weights for policy 0, policy_version 711102 (0.0043) [2024-06-24 17:54:27,270][15401] Updated weights for policy 0, policy_version 711112 (0.0033) [2024-06-24 17:54:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11650891776. Throughput: 0: 42500.9. Samples: 11651014820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 17:54:31,813][15401] Updated weights for policy 0, policy_version 711122 (0.0032) [2024-06-24 17:54:32,158][15349] Signal inference workers to stop experience collection... (172400 times) [2024-06-24 17:54:32,181][15401] InferenceWorker_p0-w0: stopping experience collection (172400 times) [2024-06-24 17:54:32,219][15349] Signal inference workers to resume experience collection... (172400 times) [2024-06-24 17:54:32,220][15401] InferenceWorker_p0-w0: resuming experience collection (172400 times) [2024-06-24 17:54:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 11651088384. Throughput: 0: 42612.8. Samples: 11651271620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 17:54:35,118][15401] Updated weights for policy 0, policy_version 711132 (0.0028) [2024-06-24 17:54:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11651317760. Throughput: 0: 42367.6. Samples: 11651394940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:38,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 17:54:39,414][15401] Updated weights for policy 0, policy_version 711142 (0.0039) [2024-06-24 17:54:43,038][15401] Updated weights for policy 0, policy_version 711152 (0.0054) [2024-06-24 17:54:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11651530752. Throughput: 0: 42353.6. Samples: 11651647060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 17:54:47,123][15401] Updated weights for policy 0, policy_version 711162 (0.0034) [2024-06-24 17:54:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11651727360. Throughput: 0: 42303.6. Samples: 11651902440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:48,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 17:54:50,582][15401] Updated weights for policy 0, policy_version 711172 (0.0039) [2024-06-24 17:54:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11651956736. Throughput: 0: 42259.1. Samples: 11652031520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 17:54:54,678][15401] Updated weights for policy 0, policy_version 711182 (0.0035) [2024-06-24 17:54:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42329.7, 300 sec: 42653.9). Total num frames: 11652153344. Throughput: 0: 42134.9. Samples: 11652283100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:54:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 17:54:58,668][15401] Updated weights for policy 0, policy_version 711192 (0.0033) [2024-06-24 17:55:02,257][15401] Updated weights for policy 0, policy_version 711202 (0.0040) [2024-06-24 17:55:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11652366336. Throughput: 0: 42439.9. Samples: 11652542840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 17:55:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 17:55:06,397][15401] Updated weights for policy 0, policy_version 711212 (0.0036) [2024-06-24 17:55:08,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 11652595712. Throughput: 0: 42311.6. Samples: 11652669640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:08,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 17:55:09,891][15401] Updated weights for policy 0, policy_version 711222 (0.0034) [2024-06-24 17:55:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11652792320. Throughput: 0: 42308.6. Samples: 11652918700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 17:55:14,064][15401] Updated weights for policy 0, policy_version 711232 (0.0033) [2024-06-24 17:55:17,682][15401] Updated weights for policy 0, policy_version 711242 (0.0031) [2024-06-24 17:55:18,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 11653005312. Throughput: 0: 42315.1. Samples: 11653175900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:18,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 17:55:21,829][15401] Updated weights for policy 0, policy_version 711252 (0.0037) [2024-06-24 17:55:23,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.8, 300 sec: 42598.4). Total num frames: 11653234688. Throughput: 0: 42384.4. Samples: 11653302340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:23,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 17:55:25,315][15401] Updated weights for policy 0, policy_version 711262 (0.0036) [2024-06-24 17:55:28,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11653447680. Throughput: 0: 42608.6. Samples: 11653564440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 17:55:29,381][15401] Updated weights for policy 0, policy_version 711272 (0.0035) [2024-06-24 17:55:32,984][15401] Updated weights for policy 0, policy_version 711282 (0.0043) [2024-06-24 17:55:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11653644288. Throughput: 0: 42540.4. Samples: 11653816760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 17:55:36,993][15401] Updated weights for policy 0, policy_version 711292 (0.0032) [2024-06-24 17:55:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11653873664. Throughput: 0: 42471.0. Samples: 11653942720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 17:55:40,978][15401] Updated weights for policy 0, policy_version 711302 (0.0042) [2024-06-24 17:55:42,819][15349] Signal inference workers to stop experience collection... (172450 times) [2024-06-24 17:55:42,862][15401] InferenceWorker_p0-w0: stopping experience collection (172450 times) [2024-06-24 17:55:42,928][15349] Signal inference workers to resume experience collection... (172450 times) [2024-06-24 17:55:42,928][15401] InferenceWorker_p0-w0: resuming experience collection (172450 times) [2024-06-24 17:55:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11654086656. Throughput: 0: 42672.2. Samples: 11654203340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 17:55:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711310_11654103040.pth... [2024-06-24 17:55:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710685_11643863040.pth [2024-06-24 17:55:44,907][15401] Updated weights for policy 0, policy_version 711312 (0.0027) [2024-06-24 17:55:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 11654283264. Throughput: 0: 42523.6. Samples: 11654456500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:48,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 17:55:48,634][15401] Updated weights for policy 0, policy_version 711322 (0.0037) [2024-06-24 17:55:52,517][15401] Updated weights for policy 0, policy_version 711332 (0.0039) [2024-06-24 17:55:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11654529024. Throughput: 0: 42546.2. Samples: 11654584120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:53,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-24 17:55:56,339][15401] Updated weights for policy 0, policy_version 711342 (0.0039) [2024-06-24 17:55:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11654709248. Throughput: 0: 42736.8. Samples: 11654841860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:55:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 17:56:00,106][15401] Updated weights for policy 0, policy_version 711352 (0.0042) [2024-06-24 17:56:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11654922240. Throughput: 0: 42691.7. Samples: 11655096920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 17:56:04,084][15401] Updated weights for policy 0, policy_version 711362 (0.0039) [2024-06-24 17:56:07,844][15401] Updated weights for policy 0, policy_version 711372 (0.0030) [2024-06-24 17:56:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 11655135232. Throughput: 0: 42806.4. Samples: 11655228520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 17:56:11,661][15401] Updated weights for policy 0, policy_version 711382 (0.0034) [2024-06-24 17:56:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11655348224. Throughput: 0: 42589.7. Samples: 11655480980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 17:56:15,519][15401] Updated weights for policy 0, policy_version 711392 (0.0028) [2024-06-24 17:56:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 11655577600. Throughput: 0: 42543.1. Samples: 11655731200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 17:56:19,646][15401] Updated weights for policy 0, policy_version 711402 (0.0035) [2024-06-24 17:56:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 11655757824. Throughput: 0: 42620.1. Samples: 11655860620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:23,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 17:56:23,561][15401] Updated weights for policy 0, policy_version 711412 (0.0023) [2024-06-24 17:56:27,338][15401] Updated weights for policy 0, policy_version 711422 (0.0027) [2024-06-24 17:56:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11655987200. Throughput: 0: 42635.6. Samples: 11656121940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 17:56:31,159][15401] Updated weights for policy 0, policy_version 711432 (0.0025) [2024-06-24 17:56:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11656232960. Throughput: 0: 42297.8. Samples: 11656359800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 17:56:35,092][15401] Updated weights for policy 0, policy_version 711442 (0.0036) [2024-06-24 17:56:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 11656396800. Throughput: 0: 42435.5. Samples: 11656493720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:38,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 17:56:38,807][15401] Updated weights for policy 0, policy_version 711452 (0.0035) [2024-06-24 17:56:42,485][15401] Updated weights for policy 0, policy_version 711462 (0.0036) [2024-06-24 17:56:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11656642560. Throughput: 0: 42654.3. Samples: 11656761300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 17:56:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 17:56:46,591][15401] Updated weights for policy 0, policy_version 711472 (0.0034) [2024-06-24 17:56:48,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43146.2, 300 sec: 42654.0). Total num frames: 11656871936. Throughput: 0: 42403.0. Samples: 11657005060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:56:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 17:56:50,143][15401] Updated weights for policy 0, policy_version 711482 (0.0031) [2024-06-24 17:56:53,390][15132] Fps is (10 sec: 37682.7, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 11657019392. Throughput: 0: 42427.4. Samples: 11657137760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:56:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 17:56:53,666][15349] Signal inference workers to stop experience collection... (172500 times) [2024-06-24 17:56:53,667][15349] Signal inference workers to resume experience collection... (172500 times) [2024-06-24 17:56:53,708][15401] InferenceWorker_p0-w0: stopping experience collection (172500 times) [2024-06-24 17:56:53,709][15401] InferenceWorker_p0-w0: resuming experience collection (172500 times) [2024-06-24 17:56:54,201][15401] Updated weights for policy 0, policy_version 711492 (0.0034) [2024-06-24 17:56:57,816][15401] Updated weights for policy 0, policy_version 711502 (0.0038) [2024-06-24 17:56:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11657265152. Throughput: 0: 42406.0. Samples: 11657389260. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:56:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 17:57:02,041][15401] Updated weights for policy 0, policy_version 711512 (0.0040) [2024-06-24 17:57:03,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11657510912. Throughput: 0: 42374.6. Samples: 11657638060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 17:57:05,319][15401] Updated weights for policy 0, policy_version 711522 (0.0027) [2024-06-24 17:57:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 11657674752. Throughput: 0: 42566.6. Samples: 11657776120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 17:57:09,474][15401] Updated weights for policy 0, policy_version 711532 (0.0038) [2024-06-24 17:57:12,954][15401] Updated weights for policy 0, policy_version 711542 (0.0027) [2024-06-24 17:57:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 11657904128. Throughput: 0: 42383.9. Samples: 11658029220. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 17:57:17,098][15401] Updated weights for policy 0, policy_version 711552 (0.0031) [2024-06-24 17:57:18,390][15132] Fps is (10 sec: 49148.0, 60 sec: 43144.0, 300 sec: 42709.4). Total num frames: 11658166272. Throughput: 0: 42689.4. Samples: 11658280860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:18,391][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 17:57:20,505][15401] Updated weights for policy 0, policy_version 711562 (0.0038) [2024-06-24 17:57:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 11658313728. Throughput: 0: 42697.8. Samples: 11658415220. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:23,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 17:57:24,942][15401] Updated weights for policy 0, policy_version 711572 (0.0051) [2024-06-24 17:57:28,390][15132] Fps is (10 sec: 37686.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11658543104. Throughput: 0: 42343.9. Samples: 11658666780. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 17:57:28,612][15401] Updated weights for policy 0, policy_version 711582 (0.0037) [2024-06-24 17:57:32,552][15401] Updated weights for policy 0, policy_version 711592 (0.0038) [2024-06-24 17:57:33,390][15132] Fps is (10 sec: 47524.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11658788864. Throughput: 0: 42603.5. Samples: 11658922220. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 17:57:36,156][15401] Updated weights for policy 0, policy_version 711602 (0.0032) [2024-06-24 17:57:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 11658952704. Throughput: 0: 42605.9. Samples: 11659055020. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 17:57:40,253][15401] Updated weights for policy 0, policy_version 711612 (0.0035) [2024-06-24 17:57:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11659198464. Throughput: 0: 42645.4. Samples: 11659308300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 17:57:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711621_11659198464.pth... [2024-06-24 17:57:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000710997_11648974848.pth [2024-06-24 17:57:43,844][15401] Updated weights for policy 0, policy_version 711622 (0.0024) [2024-06-24 17:57:47,872][15401] Updated weights for policy 0, policy_version 711632 (0.0039) [2024-06-24 17:57:48,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11659427840. Throughput: 0: 42816.5. Samples: 11659564800. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 17:57:51,547][15401] Updated weights for policy 0, policy_version 711642 (0.0029) [2024-06-24 17:57:53,396][15132] Fps is (10 sec: 40934.0, 60 sec: 43140.0, 300 sec: 42486.4). Total num frames: 11659608064. Throughput: 0: 42590.0. Samples: 11659692940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:53,396][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 17:57:55,554][15401] Updated weights for policy 0, policy_version 711652 (0.0028) [2024-06-24 17:57:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 11659853824. Throughput: 0: 42655.2. Samples: 11659948700. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:57:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 17:57:59,116][15401] Updated weights for policy 0, policy_version 711662 (0.0046) [2024-06-24 17:58:02,274][15349] Signal inference workers to stop experience collection... (172550 times) [2024-06-24 17:58:02,322][15401] InferenceWorker_p0-w0: stopping experience collection (172550 times) [2024-06-24 17:58:02,333][15349] Signal inference workers to resume experience collection... (172550 times) [2024-06-24 17:58:02,345][15401] InferenceWorker_p0-w0: resuming experience collection (172550 times) [2024-06-24 17:58:03,179][15401] Updated weights for policy 0, policy_version 711672 (0.0030) [2024-06-24 17:58:03,392][15132] Fps is (10 sec: 44254.3, 60 sec: 42323.7, 300 sec: 42598.3). Total num frames: 11660050432. Throughput: 0: 42887.4. Samples: 11660210860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:03,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 17:58:06,567][15401] Updated weights for policy 0, policy_version 711682 (0.0031) [2024-06-24 17:58:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 11660263424. Throughput: 0: 42563.1. Samples: 11660330460. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 17:58:10,913][15401] Updated weights for policy 0, policy_version 711692 (0.0036) [2024-06-24 17:58:13,392][15132] Fps is (10 sec: 44237.0, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 11660492800. Throughput: 0: 42694.2. Samples: 11660588120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:13,393][15132] Avg episode reward: [(0, '0.201')] [2024-06-24 17:58:14,389][15401] Updated weights for policy 0, policy_version 711702 (0.0034) [2024-06-24 17:58:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.7, 300 sec: 42487.3). Total num frames: 11660656640. Throughput: 0: 43014.8. Samples: 11660857880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:18,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-24 17:58:18,752][15401] Updated weights for policy 0, policy_version 711712 (0.0040) [2024-06-24 17:58:21,880][15401] Updated weights for policy 0, policy_version 711722 (0.0029) [2024-06-24 17:58:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43419.3, 300 sec: 42653.9). Total num frames: 11660918784. Throughput: 0: 42630.6. Samples: 11660973400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 17:58:26,387][15401] Updated weights for policy 0, policy_version 711732 (0.0035) [2024-06-24 17:58:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 11661131776. Throughput: 0: 42724.9. Samples: 11661230920. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-24 17:58:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 17:58:29,519][15401] Updated weights for policy 0, policy_version 711742 (0.0033) [2024-06-24 17:58:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 11661312000. Throughput: 0: 42995.1. Samples: 11661499580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 17:58:34,052][15401] Updated weights for policy 0, policy_version 711752 (0.0032) [2024-06-24 17:58:37,017][15401] Updated weights for policy 0, policy_version 711762 (0.0041) [2024-06-24 17:58:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11661557760. Throughput: 0: 42837.2. Samples: 11661620340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:38,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 17:58:41,839][15401] Updated weights for policy 0, policy_version 711772 (0.0029) [2024-06-24 17:58:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11661770752. Throughput: 0: 43083.5. Samples: 11661887460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 17:58:45,106][15401] Updated weights for policy 0, policy_version 711782 (0.0030) [2024-06-24 17:58:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11661967360. Throughput: 0: 42946.3. Samples: 11662143340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 17:58:49,539][15401] Updated weights for policy 0, policy_version 711792 (0.0032) [2024-06-24 17:58:52,823][15401] Updated weights for policy 0, policy_version 711802 (0.0036) [2024-06-24 17:58:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42876.1, 300 sec: 42599.3). Total num frames: 11662180352. Throughput: 0: 43009.4. Samples: 11662265880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 17:58:57,112][15401] Updated weights for policy 0, policy_version 711812 (0.0024) [2024-06-24 17:58:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11662409728. Throughput: 0: 43223.6. Samples: 11662533080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:58:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 17:59:00,240][15401] Updated weights for policy 0, policy_version 711822 (0.0030) [2024-06-24 17:59:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 11662606336. Throughput: 0: 42919.1. Samples: 11662789240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 17:59:04,630][15401] Updated weights for policy 0, policy_version 711832 (0.0035) [2024-06-24 17:59:07,787][15401] Updated weights for policy 0, policy_version 711842 (0.0025) [2024-06-24 17:59:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 11662835712. Throughput: 0: 43033.0. Samples: 11662909880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 17:59:12,135][15401] Updated weights for policy 0, policy_version 711852 (0.0028) [2024-06-24 17:59:13,082][15349] Signal inference workers to stop experience collection... (172600 times) [2024-06-24 17:59:13,109][15401] InferenceWorker_p0-w0: stopping experience collection (172600 times) [2024-06-24 17:59:13,131][15349] Signal inference workers to resume experience collection... (172600 times) [2024-06-24 17:59:13,136][15401] InferenceWorker_p0-w0: resuming experience collection (172600 times) [2024-06-24 17:59:13,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42595.5, 300 sec: 42597.5). Total num frames: 11663048704. Throughput: 0: 43263.7. Samples: 11663178060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:13,396][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 17:59:15,304][15401] Updated weights for policy 0, policy_version 711862 (0.0046) [2024-06-24 17:59:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11663261696. Throughput: 0: 42879.1. Samples: 11663429140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 17:59:19,664][15401] Updated weights for policy 0, policy_version 711872 (0.0040) [2024-06-24 17:59:22,997][15401] Updated weights for policy 0, policy_version 711882 (0.0037) [2024-06-24 17:59:23,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11663491072. Throughput: 0: 43106.6. Samples: 11663560140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:23,395][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 17:59:27,517][15401] Updated weights for policy 0, policy_version 711892 (0.0034) [2024-06-24 17:59:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 11663687680. Throughput: 0: 42970.9. Samples: 11663821160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 17:59:30,709][15401] Updated weights for policy 0, policy_version 711902 (0.0041) [2024-06-24 17:59:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11663917056. Throughput: 0: 42822.2. Samples: 11664070340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:33,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-24 17:59:35,035][15401] Updated weights for policy 0, policy_version 711912 (0.0035) [2024-06-24 17:59:38,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11664113664. Throughput: 0: 43064.8. Samples: 11664203800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 17:59:38,439][15401] Updated weights for policy 0, policy_version 711922 (0.0027) [2024-06-24 17:59:42,414][15401] Updated weights for policy 0, policy_version 711932 (0.0037) [2024-06-24 17:59:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11664326656. Throughput: 0: 42806.3. Samples: 11664459360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 17:59:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711935_11664343040.pth... [2024-06-24 17:59:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711310_11654103040.pth [2024-06-24 17:59:45,901][15401] Updated weights for policy 0, policy_version 711942 (0.0029) [2024-06-24 17:59:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 11664556032. Throughput: 0: 42781.3. Samples: 11664714500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:48,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 17:59:50,052][15401] Updated weights for policy 0, policy_version 711952 (0.0041) [2024-06-24 17:59:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11664769024. Throughput: 0: 43012.0. Samples: 11664845420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 17:59:53,461][15401] Updated weights for policy 0, policy_version 711962 (0.0036) [2024-06-24 17:59:58,129][15401] Updated weights for policy 0, policy_version 711972 (0.0031) [2024-06-24 17:59:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11664965632. Throughput: 0: 42808.4. Samples: 11665104160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 17:59:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 18:00:01,175][15401] Updated weights for policy 0, policy_version 711982 (0.0030) [2024-06-24 18:00:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 11665195008. Throughput: 0: 42894.3. Samples: 11665359380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 18:00:03,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 18:00:05,622][15401] Updated weights for policy 0, policy_version 711992 (0.0037) [2024-06-24 18:00:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11665408000. Throughput: 0: 42892.4. Samples: 11665490300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 18:00:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 18:00:08,740][15401] Updated weights for policy 0, policy_version 712002 (0.0037) [2024-06-24 18:00:13,268][15401] Updated weights for policy 0, policy_version 712012 (0.0036) [2024-06-24 18:00:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42709.8). Total num frames: 11665604608. Throughput: 0: 42843.9. Samples: 11665749120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:13,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 18:00:16,654][15401] Updated weights for policy 0, policy_version 712022 (0.0040) [2024-06-24 18:00:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11665833984. Throughput: 0: 42725.0. Samples: 11665992960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:18,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-24 18:00:20,831][15401] Updated weights for policy 0, policy_version 712032 (0.0025) [2024-06-24 18:00:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11666030592. Throughput: 0: 42762.8. Samples: 11666128120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:23,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 18:00:24,063][15401] Updated weights for policy 0, policy_version 712042 (0.0034) [2024-06-24 18:00:28,210][15401] Updated weights for policy 0, policy_version 712052 (0.0034) [2024-06-24 18:00:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11666259968. Throughput: 0: 42765.6. Samples: 11666383820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 18:00:32,122][15401] Updated weights for policy 0, policy_version 712062 (0.0040) [2024-06-24 18:00:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11666489344. Throughput: 0: 42707.6. Samples: 11666636240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:00:35,984][15401] Updated weights for policy 0, policy_version 712072 (0.0031) [2024-06-24 18:00:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11666685952. Throughput: 0: 42900.9. Samples: 11666775960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 18:00:39,651][15349] Signal inference workers to stop experience collection... (172650 times) [2024-06-24 18:00:39,655][15349] Signal inference workers to resume experience collection... (172650 times) [2024-06-24 18:00:39,665][15401] Updated weights for policy 0, policy_version 712082 (0.0035) [2024-06-24 18:00:39,673][15401] InferenceWorker_p0-w0: stopping experience collection (172650 times) [2024-06-24 18:00:39,673][15401] InferenceWorker_p0-w0: resuming experience collection (172650 times) [2024-06-24 18:00:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11666898944. Throughput: 0: 42900.3. Samples: 11667034680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 18:00:43,495][15401] Updated weights for policy 0, policy_version 712092 (0.0028) [2024-06-24 18:00:47,203][15401] Updated weights for policy 0, policy_version 712102 (0.0029) [2024-06-24 18:00:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 11667144704. Throughput: 0: 42786.6. Samples: 11667284780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 18:00:51,113][15401] Updated weights for policy 0, policy_version 712112 (0.0036) [2024-06-24 18:00:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11667341312. Throughput: 0: 42986.7. Samples: 11667424800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:53,401][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:00:54,750][15401] Updated weights for policy 0, policy_version 712122 (0.0037) [2024-06-24 18:00:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11667537920. Throughput: 0: 42944.0. Samples: 11667681600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:00:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:00:58,615][15401] Updated weights for policy 0, policy_version 712132 (0.0029) [2024-06-24 18:01:02,270][15401] Updated weights for policy 0, policy_version 712142 (0.0025) [2024-06-24 18:01:03,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11667783680. Throughput: 0: 43164.8. Samples: 11667935380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 18:01:06,117][15401] Updated weights for policy 0, policy_version 712152 (0.0037) [2024-06-24 18:01:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11667980288. Throughput: 0: 43098.2. Samples: 11668067540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 18:01:09,976][15401] Updated weights for policy 0, policy_version 712162 (0.0032) [2024-06-24 18:01:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 11668209664. Throughput: 0: 43133.8. Samples: 11668324840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 18:01:13,700][15401] Updated weights for policy 0, policy_version 712172 (0.0036) [2024-06-24 18:01:17,573][15401] Updated weights for policy 0, policy_version 712182 (0.0027) [2024-06-24 18:01:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 11668439040. Throughput: 0: 43122.7. Samples: 11668576760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 18:01:21,226][15401] Updated weights for policy 0, policy_version 712192 (0.0023) [2024-06-24 18:01:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11668635648. Throughput: 0: 43008.0. Samples: 11668711320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:23,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 18:01:25,022][15401] Updated weights for policy 0, policy_version 712202 (0.0027) [2024-06-24 18:01:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11668848640. Throughput: 0: 43102.7. Samples: 11668974300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:28,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 18:01:28,880][15401] Updated weights for policy 0, policy_version 712212 (0.0045) [2024-06-24 18:01:32,689][15401] Updated weights for policy 0, policy_version 712222 (0.0039) [2024-06-24 18:01:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 11669061632. Throughput: 0: 43176.1. Samples: 11669227700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 18:01:36,357][15401] Updated weights for policy 0, policy_version 712232 (0.0039) [2024-06-24 18:01:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11669291008. Throughput: 0: 42938.3. Samples: 11669356920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 18:01:40,435][15401] Updated weights for policy 0, policy_version 712242 (0.0035) [2024-06-24 18:01:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 11669504000. Throughput: 0: 43020.3. Samples: 11669617620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:43,393][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 18:01:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712250_11669504000.pth... [2024-06-24 18:01:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711621_11659198464.pth [2024-06-24 18:01:44,225][15401] Updated weights for policy 0, policy_version 712252 (0.0029) [2024-06-24 18:01:48,050][15401] Updated weights for policy 0, policy_version 712262 (0.0043) [2024-06-24 18:01:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 11669716992. Throughput: 0: 42972.4. Samples: 11669869140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 18:01:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 18:01:52,211][15401] Updated weights for policy 0, policy_version 712272 (0.0023) [2024-06-24 18:01:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 11669913600. Throughput: 0: 42791.1. Samples: 11669993140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:01:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 18:01:55,839][15401] Updated weights for policy 0, policy_version 712282 (0.0030) [2024-06-24 18:01:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 11670142976. Throughput: 0: 42922.6. Samples: 11670256360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:01:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 18:01:59,555][15401] Updated weights for policy 0, policy_version 712292 (0.0036) [2024-06-24 18:02:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11670339584. Throughput: 0: 42996.9. Samples: 11670511620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:03,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 18:02:03,463][15349] Signal inference workers to stop experience collection... (172700 times) [2024-06-24 18:02:03,523][15349] Signal inference workers to resume experience collection... (172700 times) [2024-06-24 18:02:03,526][15401] InferenceWorker_p0-w0: stopping experience collection (172700 times) [2024-06-24 18:02:03,530][15401] Updated weights for policy 0, policy_version 712302 (0.0033) [2024-06-24 18:02:03,561][15401] InferenceWorker_p0-w0: resuming experience collection (172700 times) [2024-06-24 18:02:07,264][15401] Updated weights for policy 0, policy_version 712312 (0.0031) [2024-06-24 18:02:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11670552576. Throughput: 0: 43013.3. Samples: 11670646920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 18:02:10,791][15401] Updated weights for policy 0, policy_version 712322 (0.0044) [2024-06-24 18:02:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 11670765568. Throughput: 0: 42921.8. Samples: 11670905780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 18:02:14,841][15401] Updated weights for policy 0, policy_version 712332 (0.0035) [2024-06-24 18:02:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.5). Total num frames: 11670994944. Throughput: 0: 42824.5. Samples: 11671154800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:18,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 18:02:18,473][15401] Updated weights for policy 0, policy_version 712342 (0.0034) [2024-06-24 18:02:22,526][15401] Updated weights for policy 0, policy_version 712352 (0.0038) [2024-06-24 18:02:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11671207936. Throughput: 0: 42944.5. Samples: 11671289420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 18:02:26,036][15401] Updated weights for policy 0, policy_version 712362 (0.0048) [2024-06-24 18:02:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11671404544. Throughput: 0: 42847.7. Samples: 11671545660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 18:02:30,262][15401] Updated weights for policy 0, policy_version 712372 (0.0034) [2024-06-24 18:02:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11671650304. Throughput: 0: 42876.2. Samples: 11671798560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 18:02:33,442][15401] Updated weights for policy 0, policy_version 712382 (0.0034) [2024-06-24 18:02:37,776][15401] Updated weights for policy 0, policy_version 712392 (0.0043) [2024-06-24 18:02:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11671846912. Throughput: 0: 43146.2. Samples: 11671934720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 18:02:41,285][15401] Updated weights for policy 0, policy_version 712402 (0.0023) [2024-06-24 18:02:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.3, 300 sec: 42820.6). Total num frames: 11672059904. Throughput: 0: 42909.6. Samples: 11672187280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:43,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 18:02:45,349][15401] Updated weights for policy 0, policy_version 712412 (0.0036) [2024-06-24 18:02:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42988.1). Total num frames: 11672289280. Throughput: 0: 42819.9. Samples: 11672438520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:48,392][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 18:02:49,067][15401] Updated weights for policy 0, policy_version 712422 (0.0039) [2024-06-24 18:02:52,987][15401] Updated weights for policy 0, policy_version 712432 (0.0042) [2024-06-24 18:02:53,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 11672485888. Throughput: 0: 42886.5. Samples: 11672576820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 18:02:56,846][15401] Updated weights for policy 0, policy_version 712442 (0.0026) [2024-06-24 18:02:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.4). Total num frames: 11672698880. Throughput: 0: 42751.1. Samples: 11672829580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:02:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 18:03:00,604][15401] Updated weights for policy 0, policy_version 712452 (0.0028) [2024-06-24 18:03:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11672928256. Throughput: 0: 42960.7. Samples: 11673088040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 18:03:04,495][15401] Updated weights for policy 0, policy_version 712462 (0.0030) [2024-06-24 18:03:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 11673124864. Throughput: 0: 42902.7. Samples: 11673220040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 18:03:08,449][15401] Updated weights for policy 0, policy_version 712472 (0.0025) [2024-06-24 18:03:12,672][15401] Updated weights for policy 0, policy_version 712482 (0.0031) [2024-06-24 18:03:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11673321472. Throughput: 0: 42810.7. Samples: 11673472140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 18:03:16,634][15401] Updated weights for policy 0, policy_version 712492 (0.0039) [2024-06-24 18:03:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11673567232. Throughput: 0: 42778.7. Samples: 11673723600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:18,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 18:03:20,234][15401] Updated weights for policy 0, policy_version 712502 (0.0032) [2024-06-24 18:03:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11673763840. Throughput: 0: 42717.3. Samples: 11673857000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:23,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 18:03:24,102][15401] Updated weights for policy 0, policy_version 712512 (0.0028) [2024-06-24 18:03:27,742][15401] Updated weights for policy 0, policy_version 712522 (0.0032) [2024-06-24 18:03:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11673976832. Throughput: 0: 42760.3. Samples: 11674111500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-24 18:03:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 18:03:29,346][15349] Signal inference workers to stop experience collection... (172750 times) [2024-06-24 18:03:29,346][15349] Signal inference workers to resume experience collection... (172750 times) [2024-06-24 18:03:29,389][15401] InferenceWorker_p0-w0: stopping experience collection (172750 times) [2024-06-24 18:03:29,389][15401] InferenceWorker_p0-w0: resuming experience collection (172750 times) [2024-06-24 18:03:31,910][15401] Updated weights for policy 0, policy_version 712532 (0.0037) [2024-06-24 18:03:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 11674222592. Throughput: 0: 42824.8. Samples: 11674365640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:33,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 18:03:35,240][15401] Updated weights for policy 0, policy_version 712542 (0.0027) [2024-06-24 18:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11674402816. Throughput: 0: 42762.8. Samples: 11674501140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 18:03:39,451][15401] Updated weights for policy 0, policy_version 712552 (0.0025) [2024-06-24 18:03:42,893][15401] Updated weights for policy 0, policy_version 712562 (0.0033) [2024-06-24 18:03:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.2, 300 sec: 42931.6). Total num frames: 11674632192. Throughput: 0: 42867.0. Samples: 11674758600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 18:03:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712563_11674632192.pth... [2024-06-24 18:03:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000711935_11664343040.pth [2024-06-24 18:03:47,170][15401] Updated weights for policy 0, policy_version 712572 (0.0035) [2024-06-24 18:03:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11674861568. Throughput: 0: 42722.3. Samples: 11675010540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 18:03:50,511][15401] Updated weights for policy 0, policy_version 712582 (0.0035) [2024-06-24 18:03:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11675041792. Throughput: 0: 42646.8. Samples: 11675139160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:53,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 18:03:54,635][15401] Updated weights for policy 0, policy_version 712592 (0.0046) [2024-06-24 18:03:58,183][15401] Updated weights for policy 0, policy_version 712602 (0.0031) [2024-06-24 18:03:58,396][15132] Fps is (10 sec: 42570.7, 60 sec: 43139.9, 300 sec: 42986.2). Total num frames: 11675287552. Throughput: 0: 42570.3. Samples: 11675388080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:03:58,397][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 18:04:02,278][15401] Updated weights for policy 0, policy_version 712612 (0.0029) [2024-06-24 18:04:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11675484160. Throughput: 0: 42908.9. Samples: 11675654500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 18:04:05,601][15401] Updated weights for policy 0, policy_version 712622 (0.0037) [2024-06-24 18:04:08,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.3, 300 sec: 42877.0). Total num frames: 11675697152. Throughput: 0: 42668.0. Samples: 11675777060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:08,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 18:04:10,307][15401] Updated weights for policy 0, policy_version 712632 (0.0036) [2024-06-24 18:04:13,041][15401] Updated weights for policy 0, policy_version 712642 (0.0027) [2024-06-24 18:04:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 11675926528. Throughput: 0: 42785.4. Samples: 11676036840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 18:04:17,848][15401] Updated weights for policy 0, policy_version 712652 (0.0031) [2024-06-24 18:04:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11676123136. Throughput: 0: 43022.0. Samples: 11676301620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 18:04:20,731][15401] Updated weights for policy 0, policy_version 712662 (0.0027) [2024-06-24 18:04:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11676336128. Throughput: 0: 42662.2. Samples: 11676420940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:23,404][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 18:04:25,401][15401] Updated weights for policy 0, policy_version 712672 (0.0036) [2024-06-24 18:04:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11676565504. Throughput: 0: 42677.6. Samples: 11676679080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 18:04:28,473][15401] Updated weights for policy 0, policy_version 712682 (0.0036) [2024-06-24 18:04:32,918][15401] Updated weights for policy 0, policy_version 712692 (0.0035) [2024-06-24 18:04:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 11676762112. Throughput: 0: 42942.6. Samples: 11676942960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 18:04:35,959][15401] Updated weights for policy 0, policy_version 712702 (0.0035) [2024-06-24 18:04:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11676991488. Throughput: 0: 42743.2. Samples: 11677062600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:04:40,473][15401] Updated weights for policy 0, policy_version 712712 (0.0029) [2024-06-24 18:04:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 11677220864. Throughput: 0: 43000.4. Samples: 11677322820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 18:04:44,150][15401] Updated weights for policy 0, policy_version 712722 (0.0039) [2024-06-24 18:04:48,225][15401] Updated weights for policy 0, policy_version 712732 (0.0036) [2024-06-24 18:04:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 11677401088. Throughput: 0: 42851.0. Samples: 11677582800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 18:04:51,744][15401] Updated weights for policy 0, policy_version 712742 (0.0024) [2024-06-24 18:04:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11677630464. Throughput: 0: 42825.8. Samples: 11677704220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 18:04:55,816][15401] Updated weights for policy 0, policy_version 712752 (0.0043) [2024-06-24 18:04:57,208][15349] Signal inference workers to stop experience collection... (172800 times) [2024-06-24 18:04:57,211][15349] Signal inference workers to resume experience collection... (172800 times) [2024-06-24 18:04:57,226][15401] InferenceWorker_p0-w0: stopping experience collection (172800 times) [2024-06-24 18:04:57,227][15401] InferenceWorker_p0-w0: resuming experience collection (172800 times) [2024-06-24 18:04:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 11677859840. Throughput: 0: 42871.4. Samples: 11677966060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:04:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 18:04:59,217][15401] Updated weights for policy 0, policy_version 712762 (0.0024) [2024-06-24 18:05:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11678040064. Throughput: 0: 42710.1. Samples: 11678223580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:05:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 18:05:03,585][15401] Updated weights for policy 0, policy_version 712772 (0.0028) [2024-06-24 18:05:07,290][15401] Updated weights for policy 0, policy_version 712782 (0.0032) [2024-06-24 18:05:08,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 11678285824. Throughput: 0: 42701.7. Samples: 11678342620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:05:08,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 18:05:11,381][15401] Updated weights for policy 0, policy_version 712792 (0.0038) [2024-06-24 18:05:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11678482432. Throughput: 0: 42754.6. Samples: 11678603040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-24 18:05:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 18:05:14,797][15401] Updated weights for policy 0, policy_version 712802 (0.0024) [2024-06-24 18:05:18,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 11678679040. Throughput: 0: 42773.3. Samples: 11678867760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 18:05:18,911][15401] Updated weights for policy 0, policy_version 712812 (0.0038) [2024-06-24 18:05:22,403][15401] Updated weights for policy 0, policy_version 712822 (0.0035) [2024-06-24 18:05:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11678924800. Throughput: 0: 42729.4. Samples: 11678985420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 18:05:26,573][15401] Updated weights for policy 0, policy_version 712832 (0.0033) [2024-06-24 18:05:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11679137792. Throughput: 0: 42649.4. Samples: 11679242040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 18:05:29,879][15401] Updated weights for policy 0, policy_version 712842 (0.0029) [2024-06-24 18:05:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11679334400. Throughput: 0: 42709.4. Samples: 11679504720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 18:05:34,134][15401] Updated weights for policy 0, policy_version 712852 (0.0036) [2024-06-24 18:05:37,553][15401] Updated weights for policy 0, policy_version 712862 (0.0031) [2024-06-24 18:05:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11679563776. Throughput: 0: 42764.4. Samples: 11679628620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 18:05:41,618][15401] Updated weights for policy 0, policy_version 712872 (0.0029) [2024-06-24 18:05:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11679760384. Throughput: 0: 42709.7. Samples: 11679888000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:43,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-24 18:05:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712876_11679760384.pth... [2024-06-24 18:05:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712250_11669504000.pth [2024-06-24 18:05:45,471][15401] Updated weights for policy 0, policy_version 712882 (0.0037) [2024-06-24 18:05:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 11679956992. Throughput: 0: 42752.9. Samples: 11680147460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 18:05:49,468][15401] Updated weights for policy 0, policy_version 712892 (0.0034) [2024-06-24 18:05:53,035][15401] Updated weights for policy 0, policy_version 712902 (0.0024) [2024-06-24 18:05:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11680219136. Throughput: 0: 42872.2. Samples: 11680271760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:53,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 18:05:56,982][15401] Updated weights for policy 0, policy_version 712912 (0.0028) [2024-06-24 18:05:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11680415744. Throughput: 0: 42877.6. Samples: 11680532540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:05:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 18:06:00,682][15401] Updated weights for policy 0, policy_version 712922 (0.0030) [2024-06-24 18:06:03,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11680612352. Throughput: 0: 42708.5. Samples: 11680789640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 18:06:04,486][15401] Updated weights for policy 0, policy_version 712932 (0.0033) [2024-06-24 18:06:08,341][15401] Updated weights for policy 0, policy_version 712942 (0.0041) [2024-06-24 18:06:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11680841728. Throughput: 0: 42863.6. Samples: 11680914280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:08,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 18:06:12,805][15401] Updated weights for policy 0, policy_version 712952 (0.0039) [2024-06-24 18:06:13,343][15349] Signal inference workers to stop experience collection... (172850 times) [2024-06-24 18:06:13,348][15349] Signal inference workers to resume experience collection... (172850 times) [2024-06-24 18:06:13,356][15401] InferenceWorker_p0-w0: stopping experience collection (172850 times) [2024-06-24 18:06:13,384][15401] InferenceWorker_p0-w0: resuming experience collection (172850 times) [2024-06-24 18:06:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11681054720. Throughput: 0: 43008.0. Samples: 11681177400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 18:06:15,765][15401] Updated weights for policy 0, policy_version 712962 (0.0026) [2024-06-24 18:06:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11681251328. Throughput: 0: 42827.2. Samples: 11681431940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 18:06:20,420][15401] Updated weights for policy 0, policy_version 712972 (0.0031) [2024-06-24 18:06:23,310][15401] Updated weights for policy 0, policy_version 712982 (0.0034) [2024-06-24 18:06:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11681497088. Throughput: 0: 42876.0. Samples: 11681558040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 18:06:27,799][15401] Updated weights for policy 0, policy_version 712992 (0.0025) [2024-06-24 18:06:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11681693696. Throughput: 0: 43021.9. Samples: 11681823980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 18:06:31,092][15401] Updated weights for policy 0, policy_version 713002 (0.0030) [2024-06-24 18:06:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11681906688. Throughput: 0: 42987.6. Samples: 11682081900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 18:06:35,332][15401] Updated weights for policy 0, policy_version 713012 (0.0044) [2024-06-24 18:06:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 11682136064. Throughput: 0: 42916.3. Samples: 11682203000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 18:06:38,787][15401] Updated weights for policy 0, policy_version 713022 (0.0031) [2024-06-24 18:06:43,033][15401] Updated weights for policy 0, policy_version 713032 (0.0030) [2024-06-24 18:06:43,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 11682349056. Throughput: 0: 43052.5. Samples: 11682470000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:43,392][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 18:06:46,311][15401] Updated weights for policy 0, policy_version 713042 (0.0038) [2024-06-24 18:06:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11682545664. Throughput: 0: 42928.2. Samples: 11682721400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:48,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-24 18:06:50,531][15401] Updated weights for policy 0, policy_version 713052 (0.0032) [2024-06-24 18:06:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11682775040. Throughput: 0: 42988.5. Samples: 11682848760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:06:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 18:06:54,123][15401] Updated weights for policy 0, policy_version 713062 (0.0038) [2024-06-24 18:06:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11682955264. Throughput: 0: 42894.3. Samples: 11683107640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:06:58,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 18:06:58,456][15401] Updated weights for policy 0, policy_version 713072 (0.0030) [2024-06-24 18:07:01,618][15401] Updated weights for policy 0, policy_version 713082 (0.0033) [2024-06-24 18:07:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11683184640. Throughput: 0: 42940.4. Samples: 11683364260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 18:07:05,979][15401] Updated weights for policy 0, policy_version 713092 (0.0031) [2024-06-24 18:07:08,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11683430400. Throughput: 0: 42972.5. Samples: 11683491800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 18:07:09,995][15401] Updated weights for policy 0, policy_version 713102 (0.0032) [2024-06-24 18:07:13,294][15401] Updated weights for policy 0, policy_version 713112 (0.0038) [2024-06-24 18:07:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11683627008. Throughput: 0: 42811.1. Samples: 11683750480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 18:07:17,492][15401] Updated weights for policy 0, policy_version 713122 (0.0030) [2024-06-24 18:07:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11683840000. Throughput: 0: 42818.6. Samples: 11684008740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 18:07:21,012][15401] Updated weights for policy 0, policy_version 713132 (0.0037) [2024-06-24 18:07:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11684052992. Throughput: 0: 42950.7. Samples: 11684135780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 18:07:24,912][15401] Updated weights for policy 0, policy_version 713142 (0.0030) [2024-06-24 18:07:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11684265984. Throughput: 0: 42612.1. Samples: 11684387440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 18:07:28,453][15401] Updated weights for policy 0, policy_version 713152 (0.0035) [2024-06-24 18:07:32,298][15401] Updated weights for policy 0, policy_version 713162 (0.0034) [2024-06-24 18:07:33,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11684478976. Throughput: 0: 42851.9. Samples: 11684649840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:33,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 18:07:36,178][15401] Updated weights for policy 0, policy_version 713172 (0.0032) [2024-06-24 18:07:38,392][15132] Fps is (10 sec: 44224.7, 60 sec: 42869.6, 300 sec: 42875.7). Total num frames: 11684708352. Throughput: 0: 42791.6. Samples: 11684774500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:38,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 18:07:39,953][15401] Updated weights for policy 0, policy_version 713182 (0.0037) [2024-06-24 18:07:43,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 11684921344. Throughput: 0: 42819.1. Samples: 11685034500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 18:07:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713191_11684921344.pth... [2024-06-24 18:07:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712563_11674632192.pth [2024-06-24 18:07:43,789][15401] Updated weights for policy 0, policy_version 713192 (0.0036) [2024-06-24 18:07:47,503][15401] Updated weights for policy 0, policy_version 713202 (0.0031) [2024-06-24 18:07:48,390][15132] Fps is (10 sec: 40970.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11685117952. Throughput: 0: 42843.0. Samples: 11685292200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 18:07:51,482][15401] Updated weights for policy 0, policy_version 713212 (0.0032) [2024-06-24 18:07:52,460][15349] Signal inference workers to stop experience collection... (172900 times) [2024-06-24 18:07:52,460][15349] Signal inference workers to resume experience collection... (172900 times) [2024-06-24 18:07:52,484][15401] InferenceWorker_p0-w0: stopping experience collection (172900 times) [2024-06-24 18:07:52,484][15401] InferenceWorker_p0-w0: resuming experience collection (172900 times) [2024-06-24 18:07:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11685347328. Throughput: 0: 42784.0. Samples: 11685417080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 18:07:55,221][15401] Updated weights for policy 0, policy_version 713222 (0.0030) [2024-06-24 18:07:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11685560320. Throughput: 0: 42807.5. Samples: 11685676820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:07:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 18:07:59,187][15401] Updated weights for policy 0, policy_version 713232 (0.0025) [2024-06-24 18:08:03,259][15401] Updated weights for policy 0, policy_version 713242 (0.0027) [2024-06-24 18:08:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11685756928. Throughput: 0: 42823.6. Samples: 11685935800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 18:08:06,792][15401] Updated weights for policy 0, policy_version 713252 (0.0038) [2024-06-24 18:08:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11685986304. Throughput: 0: 42782.8. Samples: 11686061000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 18:08:10,723][15401] Updated weights for policy 0, policy_version 713262 (0.0042) [2024-06-24 18:08:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11686182912. Throughput: 0: 42916.4. Samples: 11686318680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 18:08:14,459][15401] Updated weights for policy 0, policy_version 713272 (0.0037) [2024-06-24 18:08:18,200][15401] Updated weights for policy 0, policy_version 713282 (0.0034) [2024-06-24 18:08:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11686412288. Throughput: 0: 42791.2. Samples: 11686575340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 18:08:22,111][15401] Updated weights for policy 0, policy_version 713292 (0.0036) [2024-06-24 18:08:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11686625280. Throughput: 0: 42817.7. Samples: 11686701180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 18:08:25,777][15401] Updated weights for policy 0, policy_version 713302 (0.0042) [2024-06-24 18:08:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11686821888. Throughput: 0: 42705.8. Samples: 11686956260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 18:08:29,740][15401] Updated weights for policy 0, policy_version 713312 (0.0039) [2024-06-24 18:08:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 11687051264. Throughput: 0: 42501.3. Samples: 11687204760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 18:08:33,398][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 18:08:33,962][15401] Updated weights for policy 0, policy_version 713322 (0.0040) [2024-06-24 18:08:37,216][15401] Updated weights for policy 0, policy_version 713332 (0.0036) [2024-06-24 18:08:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.2, 300 sec: 42765.0). Total num frames: 11687247872. Throughput: 0: 42651.4. Samples: 11687336400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:08:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 18:08:41,495][15401] Updated weights for policy 0, policy_version 713342 (0.0025) [2024-06-24 18:08:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11687477248. Throughput: 0: 42666.9. Samples: 11687596840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:08:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 18:08:45,022][15401] Updated weights for policy 0, policy_version 713352 (0.0038) [2024-06-24 18:08:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11687673856. Throughput: 0: 42592.8. Samples: 11687852480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:08:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 18:08:49,259][15401] Updated weights for policy 0, policy_version 713362 (0.0030) [2024-06-24 18:08:52,802][15401] Updated weights for policy 0, policy_version 713372 (0.0036) [2024-06-24 18:08:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 11687903232. Throughput: 0: 42775.4. Samples: 11687985900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:08:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 18:08:57,120][15401] Updated weights for policy 0, policy_version 713382 (0.0034) [2024-06-24 18:08:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11688116224. Throughput: 0: 42726.7. Samples: 11688241380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:08:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 18:09:00,638][15401] Updated weights for policy 0, policy_version 713392 (0.0042) [2024-06-24 18:09:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11688312832. Throughput: 0: 42666.5. Samples: 11688495340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 18:09:04,759][15401] Updated weights for policy 0, policy_version 713402 (0.0032) [2024-06-24 18:09:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11688542208. Throughput: 0: 42697.4. Samples: 11688622560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 18:09:08,392][15401] Updated weights for policy 0, policy_version 713412 (0.0024) [2024-06-24 18:09:12,330][15401] Updated weights for policy 0, policy_version 713422 (0.0033) [2024-06-24 18:09:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11688755200. Throughput: 0: 42760.9. Samples: 11688880500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:13,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 18:09:16,087][15401] Updated weights for policy 0, policy_version 713432 (0.0029) [2024-06-24 18:09:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11688968192. Throughput: 0: 42917.8. Samples: 11689136060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 18:09:19,967][15401] Updated weights for policy 0, policy_version 713442 (0.0045) [2024-06-24 18:09:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11689181184. Throughput: 0: 42901.8. Samples: 11689266980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 18:09:23,536][15401] Updated weights for policy 0, policy_version 713452 (0.0039) [2024-06-24 18:09:27,558][15401] Updated weights for policy 0, policy_version 713462 (0.0027) [2024-06-24 18:09:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11689410560. Throughput: 0: 42883.8. Samples: 11689526600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 18:09:31,071][15401] Updated weights for policy 0, policy_version 713472 (0.0029) [2024-06-24 18:09:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11689623552. Throughput: 0: 42911.5. Samples: 11689783500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:33,396][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 18:09:35,196][15401] Updated weights for policy 0, policy_version 713482 (0.0030) [2024-06-24 18:09:36,252][15349] Signal inference workers to stop experience collection... (172950 times) [2024-06-24 18:09:36,252][15349] Signal inference workers to resume experience collection... (172950 times) [2024-06-24 18:09:36,267][15401] InferenceWorker_p0-w0: stopping experience collection (172950 times) [2024-06-24 18:09:36,268][15401] InferenceWorker_p0-w0: resuming experience collection (172950 times) [2024-06-24 18:09:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11689820160. Throughput: 0: 42786.3. Samples: 11689911280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:38,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 18:09:38,785][15401] Updated weights for policy 0, policy_version 713492 (0.0033) [2024-06-24 18:09:42,835][15401] Updated weights for policy 0, policy_version 713502 (0.0036) [2024-06-24 18:09:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11690033152. Throughput: 0: 42731.9. Samples: 11690164320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:43,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 18:09:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713503_11690033152.pth... [2024-06-24 18:09:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000712876_11679760384.pth [2024-06-24 18:09:46,745][15401] Updated weights for policy 0, policy_version 713512 (0.0045) [2024-06-24 18:09:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11690246144. Throughput: 0: 42767.3. Samples: 11690419860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:48,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 18:09:50,412][15401] Updated weights for policy 0, policy_version 713522 (0.0037) [2024-06-24 18:09:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11690475520. Throughput: 0: 42783.8. Samples: 11690547840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 18:09:54,481][15401] Updated weights for policy 0, policy_version 713532 (0.0039) [2024-06-24 18:09:57,905][15401] Updated weights for policy 0, policy_version 713542 (0.0034) [2024-06-24 18:09:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11690688512. Throughput: 0: 42860.3. Samples: 11690809220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:09:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 18:10:02,028][15401] Updated weights for policy 0, policy_version 713552 (0.0030) [2024-06-24 18:10:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 11690901504. Throughput: 0: 42927.9. Samples: 11691067820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:10:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 18:10:05,642][15401] Updated weights for policy 0, policy_version 713562 (0.0034) [2024-06-24 18:10:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11691130880. Throughput: 0: 42885.0. Samples: 11691196800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:10:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 18:10:09,472][15401] Updated weights for policy 0, policy_version 713572 (0.0033) [2024-06-24 18:10:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11691327488. Throughput: 0: 42852.8. Samples: 11691454980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-24 18:10:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 18:10:13,391][15401] Updated weights for policy 0, policy_version 713582 (0.0042) [2024-06-24 18:10:16,993][15401] Updated weights for policy 0, policy_version 713592 (0.0033) [2024-06-24 18:10:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 11691540480. Throughput: 0: 42888.5. Samples: 11691713580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:18,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 18:10:21,019][15401] Updated weights for policy 0, policy_version 713602 (0.0036) [2024-06-24 18:10:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11691769856. Throughput: 0: 42820.4. Samples: 11691838200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 18:10:24,541][15401] Updated weights for policy 0, policy_version 713612 (0.0029) [2024-06-24 18:10:28,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 11691950080. Throughput: 0: 42957.8. Samples: 11692097520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:28,393][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 18:10:28,730][15401] Updated weights for policy 0, policy_version 713622 (0.0023) [2024-06-24 18:10:32,237][15401] Updated weights for policy 0, policy_version 713632 (0.0042) [2024-06-24 18:10:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11692163072. Throughput: 0: 42926.5. Samples: 11692351560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 18:10:36,487][15401] Updated weights for policy 0, policy_version 713642 (0.0033) [2024-06-24 18:10:38,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11692408832. Throughput: 0: 42939.3. Samples: 11692480100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 18:10:39,871][15401] Updated weights for policy 0, policy_version 713652 (0.0037) [2024-06-24 18:10:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11692605440. Throughput: 0: 42888.0. Samples: 11692739180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:10:44,170][15401] Updated weights for policy 0, policy_version 713662 (0.0036) [2024-06-24 18:10:44,178][15349] Signal inference workers to stop experience collection... (173000 times) [2024-06-24 18:10:44,178][15349] Signal inference workers to resume experience collection... (173000 times) [2024-06-24 18:10:44,224][15401] InferenceWorker_p0-w0: stopping experience collection (173000 times) [2024-06-24 18:10:44,224][15401] InferenceWorker_p0-w0: resuming experience collection (173000 times) [2024-06-24 18:10:47,973][15401] Updated weights for policy 0, policy_version 713672 (0.0029) [2024-06-24 18:10:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11692818432. Throughput: 0: 42850.2. Samples: 11692996080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 18:10:51,700][15401] Updated weights for policy 0, policy_version 713682 (0.0036) [2024-06-24 18:10:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11693064192. Throughput: 0: 42876.0. Samples: 11693126220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 18:10:55,552][15401] Updated weights for policy 0, policy_version 713692 (0.0026) [2024-06-24 18:10:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11693260800. Throughput: 0: 42840.4. Samples: 11693382800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:10:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 18:10:59,248][15401] Updated weights for policy 0, policy_version 713702 (0.0028) [2024-06-24 18:11:03,178][15401] Updated weights for policy 0, policy_version 713712 (0.0030) [2024-06-24 18:11:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11693457408. Throughput: 0: 42771.1. Samples: 11693638180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 18:11:06,780][15401] Updated weights for policy 0, policy_version 713722 (0.0031) [2024-06-24 18:11:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11693703168. Throughput: 0: 42722.2. Samples: 11693760700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 18:11:10,659][15401] Updated weights for policy 0, policy_version 713732 (0.0030) [2024-06-24 18:11:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11693883392. Throughput: 0: 42678.3. Samples: 11694017940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 18:11:14,431][15401] Updated weights for policy 0, policy_version 713742 (0.0053) [2024-06-24 18:11:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11694096384. Throughput: 0: 42731.8. Samples: 11694274480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 18:11:18,412][15401] Updated weights for policy 0, policy_version 713752 (0.0037) [2024-06-24 18:11:22,540][15401] Updated weights for policy 0, policy_version 713762 (0.0038) [2024-06-24 18:11:23,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 11694342144. Throughput: 0: 42736.7. Samples: 11694403360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:23,393][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 18:11:25,932][15401] Updated weights for policy 0, policy_version 713772 (0.0027) [2024-06-24 18:11:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 11694505984. Throughput: 0: 42818.8. Samples: 11694666020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 18:11:30,158][15401] Updated weights for policy 0, policy_version 713782 (0.0037) [2024-06-24 18:11:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11694751744. Throughput: 0: 42664.9. Samples: 11694916000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 18:11:33,492][15401] Updated weights for policy 0, policy_version 713792 (0.0042) [2024-06-24 18:11:37,771][15401] Updated weights for policy 0, policy_version 713802 (0.0029) [2024-06-24 18:11:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 11694981120. Throughput: 0: 42631.0. Samples: 11695044620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:38,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 18:11:40,957][15401] Updated weights for policy 0, policy_version 713812 (0.0048) [2024-06-24 18:11:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11695161344. Throughput: 0: 42788.9. Samples: 11695308300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 18:11:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713817_11695177728.pth... [2024-06-24 18:11:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713191_11684921344.pth [2024-06-24 18:11:45,360][15401] Updated weights for policy 0, policy_version 713822 (0.0030) [2024-06-24 18:11:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11695407104. Throughput: 0: 42633.6. Samples: 11695556700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 18:11:48,569][15401] Updated weights for policy 0, policy_version 713832 (0.0030) [2024-06-24 18:11:52,925][15401] Updated weights for policy 0, policy_version 713842 (0.0035) [2024-06-24 18:11:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.6, 300 sec: 42875.7). Total num frames: 11695603712. Throughput: 0: 42859.0. Samples: 11695689460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:53,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 18:11:56,199][15401] Updated weights for policy 0, policy_version 713852 (0.0031) [2024-06-24 18:11:58,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11695800320. Throughput: 0: 42894.7. Samples: 11695948200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 18:11:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 18:12:00,704][15401] Updated weights for policy 0, policy_version 713862 (0.0029) [2024-06-24 18:12:00,732][15349] Signal inference workers to stop experience collection... (173050 times) [2024-06-24 18:12:00,783][15401] InferenceWorker_p0-w0: stopping experience collection (173050 times) [2024-06-24 18:12:00,848][15349] Signal inference workers to resume experience collection... (173050 times) [2024-06-24 18:12:00,848][15401] InferenceWorker_p0-w0: resuming experience collection (173050 times) [2024-06-24 18:12:03,390][15132] Fps is (10 sec: 45886.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11696062464. Throughput: 0: 42701.2. Samples: 11696196040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 18:12:04,260][15401] Updated weights for policy 0, policy_version 713872 (0.0045) [2024-06-24 18:12:08,185][15401] Updated weights for policy 0, policy_version 713882 (0.0037) [2024-06-24 18:12:08,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11696259072. Throughput: 0: 42862.8. Samples: 11696332080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:12:11,821][15401] Updated weights for policy 0, policy_version 713892 (0.0043) [2024-06-24 18:12:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11696455680. Throughput: 0: 42564.8. Samples: 11696581440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 18:12:16,179][15401] Updated weights for policy 0, policy_version 713902 (0.0033) [2024-06-24 18:12:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11696685056. Throughput: 0: 42514.3. Samples: 11696829140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:18,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 18:12:19,826][15401] Updated weights for policy 0, policy_version 713912 (0.0033) [2024-06-24 18:12:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 11696881664. Throughput: 0: 42557.8. Samples: 11696959720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 18:12:23,787][15401] Updated weights for policy 0, policy_version 713922 (0.0028) [2024-06-24 18:12:27,501][15401] Updated weights for policy 0, policy_version 713932 (0.0029) [2024-06-24 18:12:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11697078272. Throughput: 0: 42272.5. Samples: 11697210560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 18:12:31,331][15401] Updated weights for policy 0, policy_version 713942 (0.0029) [2024-06-24 18:12:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.9). Total num frames: 11697307648. Throughput: 0: 42586.4. Samples: 11697473080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:33,394][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 18:12:35,096][15401] Updated weights for policy 0, policy_version 713952 (0.0039) [2024-06-24 18:12:38,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 11697520640. Throughput: 0: 42429.2. Samples: 11697598940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:38,396][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 18:12:39,429][15401] Updated weights for policy 0, policy_version 713962 (0.0038) [2024-06-24 18:12:43,232][15401] Updated weights for policy 0, policy_version 713972 (0.0036) [2024-06-24 18:12:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11697733632. Throughput: 0: 42308.4. Samples: 11697852080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 18:12:46,985][15401] Updated weights for policy 0, policy_version 713982 (0.0031) [2024-06-24 18:12:48,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 11697946624. Throughput: 0: 42405.8. Samples: 11698104300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 18:12:50,730][15401] Updated weights for policy 0, policy_version 713992 (0.0032) [2024-06-24 18:12:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 11698143232. Throughput: 0: 42308.4. Samples: 11698235960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 18:12:54,537][15401] Updated weights for policy 0, policy_version 714002 (0.0034) [2024-06-24 18:12:58,279][15401] Updated weights for policy 0, policy_version 714012 (0.0042) [2024-06-24 18:12:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11698372608. Throughput: 0: 42382.1. Samples: 11698488640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:12:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 18:13:02,062][15401] Updated weights for policy 0, policy_version 714022 (0.0029) [2024-06-24 18:13:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 11698569216. Throughput: 0: 42548.1. Samples: 11698743800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 18:13:05,724][15401] Updated weights for policy 0, policy_version 714032 (0.0031) [2024-06-24 18:13:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11698798592. Throughput: 0: 42417.0. Samples: 11698868480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:08,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 18:13:09,793][15401] Updated weights for policy 0, policy_version 714042 (0.0047) [2024-06-24 18:13:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11699011584. Throughput: 0: 42625.6. Samples: 11699128720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:13,400][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 18:13:13,697][15401] Updated weights for policy 0, policy_version 714052 (0.0036) [2024-06-24 18:13:15,814][15349] Signal inference workers to stop experience collection... (173100 times) [2024-06-24 18:13:15,816][15349] Signal inference workers to resume experience collection... (173100 times) [2024-06-24 18:13:15,841][15401] InferenceWorker_p0-w0: stopping experience collection (173100 times) [2024-06-24 18:13:15,841][15401] InferenceWorker_p0-w0: resuming experience collection (173100 times) [2024-06-24 18:13:17,338][15401] Updated weights for policy 0, policy_version 714062 (0.0032) [2024-06-24 18:13:18,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42320.8, 300 sec: 42708.5). Total num frames: 11699224576. Throughput: 0: 42455.3. Samples: 11699383840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:18,396][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 18:13:21,224][15401] Updated weights for policy 0, policy_version 714072 (0.0043) [2024-06-24 18:13:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11699421184. Throughput: 0: 42500.3. Samples: 11699511180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:23,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 18:13:24,923][15401] Updated weights for policy 0, policy_version 714082 (0.0034) [2024-06-24 18:13:28,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11699650560. Throughput: 0: 42578.6. Samples: 11699768120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 18:13:28,924][15401] Updated weights for policy 0, policy_version 714092 (0.0022) [2024-06-24 18:13:33,215][15401] Updated weights for policy 0, policy_version 714102 (0.0035) [2024-06-24 18:13:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 11699847168. Throughput: 0: 42619.9. Samples: 11700022300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:33,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 18:13:36,774][15401] Updated weights for policy 0, policy_version 714112 (0.0034) [2024-06-24 18:13:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 11700060160. Throughput: 0: 42557.0. Samples: 11700151020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-24 18:13:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 18:13:40,973][15401] Updated weights for policy 0, policy_version 714122 (0.0034) [2024-06-24 18:13:43,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11700273152. Throughput: 0: 42448.4. Samples: 11700398820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:13:43,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 18:13:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714129_11700289536.pth... [2024-06-24 18:13:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713503_11690033152.pth [2024-06-24 18:13:44,425][15401] Updated weights for policy 0, policy_version 714132 (0.0040) [2024-06-24 18:13:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11700469760. Throughput: 0: 42450.2. Samples: 11700654060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:13:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 18:13:48,682][15401] Updated weights for policy 0, policy_version 714142 (0.0037) [2024-06-24 18:13:52,272][15401] Updated weights for policy 0, policy_version 714152 (0.0028) [2024-06-24 18:13:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11700682752. Throughput: 0: 42431.3. Samples: 11700777900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:13:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 18:13:56,317][15401] Updated weights for policy 0, policy_version 714162 (0.0032) [2024-06-24 18:13:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11700928512. Throughput: 0: 42350.4. Samples: 11701034480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:13:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 18:14:00,146][15401] Updated weights for policy 0, policy_version 714172 (0.0030) [2024-06-24 18:14:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11701125120. Throughput: 0: 42383.3. Samples: 11701290820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 18:14:04,172][15401] Updated weights for policy 0, policy_version 714182 (0.0037) [2024-06-24 18:14:08,120][15401] Updated weights for policy 0, policy_version 714192 (0.0031) [2024-06-24 18:14:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11701321728. Throughput: 0: 42324.0. Samples: 11701415760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:08,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 18:14:12,182][15401] Updated weights for policy 0, policy_version 714202 (0.0028) [2024-06-24 18:14:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11701551104. Throughput: 0: 42481.8. Samples: 11701679800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:13,390][15132] Avg episode reward: [(0, '0.140')] [2024-06-24 18:14:15,894][15401] Updated weights for policy 0, policy_version 714212 (0.0029) [2024-06-24 18:14:17,349][15349] Signal inference workers to stop experience collection... (173150 times) [2024-06-24 18:14:17,388][15401] InferenceWorker_p0-w0: stopping experience collection (173150 times) [2024-06-24 18:14:17,397][15349] Signal inference workers to resume experience collection... (173150 times) [2024-06-24 18:14:17,407][15401] InferenceWorker_p0-w0: resuming experience collection (173150 times) [2024-06-24 18:14:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 11701780480. Throughput: 0: 42325.9. Samples: 11701926860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 18:14:19,690][15401] Updated weights for policy 0, policy_version 714222 (0.0035) [2024-06-24 18:14:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11701960704. Throughput: 0: 42300.8. Samples: 11702054560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 18:14:23,592][15401] Updated weights for policy 0, policy_version 714232 (0.0044) [2024-06-24 18:14:27,314][15401] Updated weights for policy 0, policy_version 714242 (0.0037) [2024-06-24 18:14:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11702190080. Throughput: 0: 42614.4. Samples: 11702316460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 18:14:31,259][15401] Updated weights for policy 0, policy_version 714252 (0.0036) [2024-06-24 18:14:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11702419456. Throughput: 0: 42332.8. Samples: 11702559040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 18:14:35,080][15401] Updated weights for policy 0, policy_version 714262 (0.0037) [2024-06-24 18:14:38,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11702599680. Throughput: 0: 42604.0. Samples: 11702695080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 18:14:38,824][15401] Updated weights for policy 0, policy_version 714272 (0.0041) [2024-06-24 18:14:42,667][15401] Updated weights for policy 0, policy_version 714282 (0.0025) [2024-06-24 18:14:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11702812672. Throughput: 0: 42520.3. Samples: 11702947900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 18:14:46,712][15401] Updated weights for policy 0, policy_version 714292 (0.0024) [2024-06-24 18:14:48,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11703058432. Throughput: 0: 42381.4. Samples: 11703197980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 18:14:50,165][15401] Updated weights for policy 0, policy_version 714302 (0.0031) [2024-06-24 18:14:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11703238656. Throughput: 0: 42603.5. Samples: 11703332920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 18:14:54,528][15401] Updated weights for policy 0, policy_version 714312 (0.0038) [2024-06-24 18:14:57,967][15401] Updated weights for policy 0, policy_version 714322 (0.0028) [2024-06-24 18:14:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 11703451648. Throughput: 0: 42309.2. Samples: 11703583720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:14:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 18:15:02,085][15401] Updated weights for policy 0, policy_version 714332 (0.0031) [2024-06-24 18:15:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11703697408. Throughput: 0: 42431.5. Samples: 11703836280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:15:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 18:15:06,022][15401] Updated weights for policy 0, policy_version 714342 (0.0028) [2024-06-24 18:15:08,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11703877632. Throughput: 0: 42680.4. Samples: 11703975180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:15:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 18:15:09,565][15401] Updated weights for policy 0, policy_version 714352 (0.0047) [2024-06-24 18:15:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 11704090624. Throughput: 0: 42365.7. Samples: 11704222920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:15:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 18:15:13,652][15401] Updated weights for policy 0, policy_version 714362 (0.0036) [2024-06-24 18:15:17,453][15401] Updated weights for policy 0, policy_version 714372 (0.0040) [2024-06-24 18:15:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11704320000. Throughput: 0: 42710.3. Samples: 11704481000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 18:15:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 18:15:21,240][15401] Updated weights for policy 0, policy_version 714382 (0.0034) [2024-06-24 18:15:23,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.6, 300 sec: 42653.9). Total num frames: 11704532992. Throughput: 0: 42687.5. Samples: 11704616120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:23,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 18:15:24,965][15401] Updated weights for policy 0, policy_version 714392 (0.0041) [2024-06-24 18:15:28,391][15132] Fps is (10 sec: 42593.0, 60 sec: 42597.5, 300 sec: 42653.8). Total num frames: 11704745984. Throughput: 0: 42719.8. Samples: 11704870340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:28,391][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 18:15:28,880][15401] Updated weights for policy 0, policy_version 714402 (0.0033) [2024-06-24 18:15:32,546][15401] Updated weights for policy 0, policy_version 714412 (0.0045) [2024-06-24 18:15:33,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11704975360. Throughput: 0: 42900.7. Samples: 11705128520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:33,391][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 18:15:36,498][15401] Updated weights for policy 0, policy_version 714422 (0.0039) [2024-06-24 18:15:37,884][15349] Signal inference workers to stop experience collection... (173200 times) [2024-06-24 18:15:37,884][15349] Signal inference workers to resume experience collection... (173200 times) [2024-06-24 18:15:37,909][15401] InferenceWorker_p0-w0: stopping experience collection (173200 times) [2024-06-24 18:15:37,909][15401] InferenceWorker_p0-w0: resuming experience collection (173200 times) [2024-06-24 18:15:38,390][15132] Fps is (10 sec: 42603.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11705171968. Throughput: 0: 42833.8. Samples: 11705260440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 18:15:40,003][15401] Updated weights for policy 0, policy_version 714432 (0.0034) [2024-06-24 18:15:43,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 11705401344. Throughput: 0: 42843.4. Samples: 11705511660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 18:15:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714441_11705401344.pth... [2024-06-24 18:15:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000713817_11695177728.pth [2024-06-24 18:15:44,369][15401] Updated weights for policy 0, policy_version 714442 (0.0041) [2024-06-24 18:15:47,499][15401] Updated weights for policy 0, policy_version 714452 (0.0027) [2024-06-24 18:15:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11705597952. Throughput: 0: 43017.4. Samples: 11705772060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 18:15:52,071][15401] Updated weights for policy 0, policy_version 714462 (0.0027) [2024-06-24 18:15:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11705827328. Throughput: 0: 42810.7. Samples: 11705901660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 18:15:54,968][15401] Updated weights for policy 0, policy_version 714472 (0.0033) [2024-06-24 18:15:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 11706040320. Throughput: 0: 43015.0. Samples: 11706158700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:15:58,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 18:15:59,753][15401] Updated weights for policy 0, policy_version 714482 (0.0026) [2024-06-24 18:16:03,104][15401] Updated weights for policy 0, policy_version 714492 (0.0023) [2024-06-24 18:16:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11706253312. Throughput: 0: 42949.3. Samples: 11706413720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 18:16:07,311][15401] Updated weights for policy 0, policy_version 714502 (0.0029) [2024-06-24 18:16:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11706466304. Throughput: 0: 42712.6. Samples: 11706538080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 18:16:10,667][15401] Updated weights for policy 0, policy_version 714512 (0.0040) [2024-06-24 18:16:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11706695680. Throughput: 0: 42910.9. Samples: 11706801280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 18:16:14,640][15401] Updated weights for policy 0, policy_version 714522 (0.0050) [2024-06-24 18:16:18,051][15401] Updated weights for policy 0, policy_version 714532 (0.0030) [2024-06-24 18:16:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 11706892288. Throughput: 0: 42823.2. Samples: 11707055560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 18:16:22,133][15401] Updated weights for policy 0, policy_version 714542 (0.0030) [2024-06-24 18:16:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.4, 300 sec: 42709.5). Total num frames: 11707105280. Throughput: 0: 42788.6. Samples: 11707185920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 18:16:26,023][15401] Updated weights for policy 0, policy_version 714552 (0.0033) [2024-06-24 18:16:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43145.4, 300 sec: 42653.9). Total num frames: 11707334656. Throughput: 0: 43046.5. Samples: 11707448760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:28,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 18:16:29,860][15401] Updated weights for policy 0, policy_version 714562 (0.0041) [2024-06-24 18:16:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11707531264. Throughput: 0: 42889.3. Samples: 11707702080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:33,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 18:16:33,554][15401] Updated weights for policy 0, policy_version 714572 (0.0023) [2024-06-24 18:16:37,784][15401] Updated weights for policy 0, policy_version 714582 (0.0036) [2024-06-24 18:16:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11707727872. Throughput: 0: 42985.3. Samples: 11707836000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:38,396][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 18:16:41,073][15401] Updated weights for policy 0, policy_version 714592 (0.0029) [2024-06-24 18:16:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11707957248. Throughput: 0: 42862.3. Samples: 11708087400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 18:16:45,222][15401] Updated weights for policy 0, policy_version 714602 (0.0029) [2024-06-24 18:16:47,095][15349] Signal inference workers to stop experience collection... (173250 times) [2024-06-24 18:16:47,098][15349] Signal inference workers to resume experience collection... (173250 times) [2024-06-24 18:16:47,129][15401] InferenceWorker_p0-w0: stopping experience collection (173250 times) [2024-06-24 18:16:47,130][15401] InferenceWorker_p0-w0: resuming experience collection (173250 times) [2024-06-24 18:16:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 11708186624. Throughput: 0: 42931.2. Samples: 11708345620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 18:16:48,619][15401] Updated weights for policy 0, policy_version 714612 (0.0034) [2024-06-24 18:16:52,928][15401] Updated weights for policy 0, policy_version 714622 (0.0028) [2024-06-24 18:16:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11708366848. Throughput: 0: 43137.0. Samples: 11708479240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 18:16:56,024][15401] Updated weights for policy 0, policy_version 714632 (0.0034) [2024-06-24 18:16:58,392][15132] Fps is (10 sec: 42586.7, 60 sec: 42871.4, 300 sec: 42542.5). Total num frames: 11708612608. Throughput: 0: 42957.1. Samples: 11708734460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:16:58,393][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 18:17:00,659][15401] Updated weights for policy 0, policy_version 714642 (0.0046) [2024-06-24 18:17:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11708825600. Throughput: 0: 42902.2. Samples: 11708986160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 18:17:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 18:17:03,742][15401] Updated weights for policy 0, policy_version 714652 (0.0031) [2024-06-24 18:17:08,173][15401] Updated weights for policy 0, policy_version 714662 (0.0043) [2024-06-24 18:17:08,389][15132] Fps is (10 sec: 40970.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11709022208. Throughput: 0: 42892.8. Samples: 11709116100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:08,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 18:17:11,629][15401] Updated weights for policy 0, policy_version 714672 (0.0030) [2024-06-24 18:17:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11709251584. Throughput: 0: 42762.6. Samples: 11709373080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 18:17:15,727][15401] Updated weights for policy 0, policy_version 714682 (0.0023) [2024-06-24 18:17:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11709464576. Throughput: 0: 42923.2. Samples: 11709633620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:18,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-24 18:17:19,296][15401] Updated weights for policy 0, policy_version 714692 (0.0044) [2024-06-24 18:17:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11709661184. Throughput: 0: 42766.6. Samples: 11709760500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:23,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 18:17:23,673][15401] Updated weights for policy 0, policy_version 714702 (0.0047) [2024-06-24 18:17:26,894][15401] Updated weights for policy 0, policy_version 714712 (0.0030) [2024-06-24 18:17:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11709906944. Throughput: 0: 42915.6. Samples: 11710018600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 18:17:31,289][15401] Updated weights for policy 0, policy_version 714722 (0.0042) [2024-06-24 18:17:33,390][15132] Fps is (10 sec: 45872.7, 60 sec: 43144.1, 300 sec: 42710.3). Total num frames: 11710119936. Throughput: 0: 42821.5. Samples: 11710272620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:33,391][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 18:17:34,569][15401] Updated weights for policy 0, policy_version 714732 (0.0048) [2024-06-24 18:17:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 11710316544. Throughput: 0: 42692.8. Samples: 11710400520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:38,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 18:17:38,789][15401] Updated weights for policy 0, policy_version 714742 (0.0034) [2024-06-24 18:17:42,037][15401] Updated weights for policy 0, policy_version 714752 (0.0042) [2024-06-24 18:17:43,390][15132] Fps is (10 sec: 44239.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11710562304. Throughput: 0: 42689.9. Samples: 11710655400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:43,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 18:17:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714756_11710562304.pth... [2024-06-24 18:17:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714129_11700289536.pth [2024-06-24 18:17:46,322][15401] Updated weights for policy 0, policy_version 714762 (0.0034) [2024-06-24 18:17:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11710726144. Throughput: 0: 42983.7. Samples: 11710920420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 18:17:49,569][15401] Updated weights for policy 0, policy_version 714772 (0.0038) [2024-06-24 18:17:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11710971904. Throughput: 0: 42761.3. Samples: 11711040360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 18:17:53,894][15401] Updated weights for policy 0, policy_version 714782 (0.0044) [2024-06-24 18:17:57,150][15401] Updated weights for policy 0, policy_version 714792 (0.0033) [2024-06-24 18:17:58,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 11711217664. Throughput: 0: 42850.2. Samples: 11711301340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:17:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 18:18:01,389][15401] Updated weights for policy 0, policy_version 714802 (0.0037) [2024-06-24 18:18:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11711381504. Throughput: 0: 43087.0. Samples: 11711572540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 18:18:04,714][15401] Updated weights for policy 0, policy_version 714812 (0.0040) [2024-06-24 18:18:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11711610880. Throughput: 0: 42829.4. Samples: 11711687820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 18:18:09,147][15401] Updated weights for policy 0, policy_version 714822 (0.0040) [2024-06-24 18:18:12,354][15401] Updated weights for policy 0, policy_version 714832 (0.0037) [2024-06-24 18:18:13,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.7, 300 sec: 42821.5). Total num frames: 11711856640. Throughput: 0: 42930.2. Samples: 11711950460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:13,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 18:18:16,739][15401] Updated weights for policy 0, policy_version 714842 (0.0044) [2024-06-24 18:18:16,761][15349] Signal inference workers to stop experience collection... (173300 times) [2024-06-24 18:18:16,762][15349] Signal inference workers to resume experience collection... (173300 times) [2024-06-24 18:18:16,781][15401] InferenceWorker_p0-w0: stopping experience collection (173300 times) [2024-06-24 18:18:16,781][15401] InferenceWorker_p0-w0: resuming experience collection (173300 times) [2024-06-24 18:18:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11712020480. Throughput: 0: 43225.1. Samples: 11712217720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 18:18:19,825][15401] Updated weights for policy 0, policy_version 714852 (0.0042) [2024-06-24 18:18:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11712266240. Throughput: 0: 42944.4. Samples: 11712332920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 18:18:24,196][15401] Updated weights for policy 0, policy_version 714862 (0.0049) [2024-06-24 18:18:27,592][15401] Updated weights for policy 0, policy_version 714872 (0.0039) [2024-06-24 18:18:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 11712495616. Throughput: 0: 43215.2. Samples: 11712600080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 18:18:31,987][15401] Updated weights for policy 0, policy_version 714882 (0.0036) [2024-06-24 18:18:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 11712675840. Throughput: 0: 43119.0. Samples: 11712860780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:33,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 18:18:35,319][15401] Updated weights for policy 0, policy_version 714892 (0.0030) [2024-06-24 18:18:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11712888832. Throughput: 0: 43056.4. Samples: 11712977900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 18:18:39,445][15401] Updated weights for policy 0, policy_version 714902 (0.0034) [2024-06-24 18:18:43,172][15401] Updated weights for policy 0, policy_version 714912 (0.0029) [2024-06-24 18:18:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11713134592. Throughput: 0: 43202.8. Samples: 11713245460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-24 18:18:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 18:18:47,010][15401] Updated weights for policy 0, policy_version 714922 (0.0041) [2024-06-24 18:18:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11713331200. Throughput: 0: 42798.2. Samples: 11713498460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:18:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 18:18:50,805][15401] Updated weights for policy 0, policy_version 714932 (0.0040) [2024-06-24 18:18:53,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11713544192. Throughput: 0: 43116.3. Samples: 11713628160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:18:53,392][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 18:18:54,650][15401] Updated weights for policy 0, policy_version 714942 (0.0031) [2024-06-24 18:18:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11713757184. Throughput: 0: 43077.8. Samples: 11713888960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:18:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 18:18:58,487][15401] Updated weights for policy 0, policy_version 714952 (0.0028) [2024-06-24 18:19:02,080][15401] Updated weights for policy 0, policy_version 714962 (0.0029) [2024-06-24 18:19:03,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11713986560. Throughput: 0: 42839.9. Samples: 11714145520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 18:19:06,299][15401] Updated weights for policy 0, policy_version 714972 (0.0032) [2024-06-24 18:19:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11714183168. Throughput: 0: 43280.1. Samples: 11714280520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 18:19:09,547][15401] Updated weights for policy 0, policy_version 714982 (0.0024) [2024-06-24 18:19:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11714396160. Throughput: 0: 42851.9. Samples: 11714528420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 18:19:13,999][15401] Updated weights for policy 0, policy_version 714992 (0.0032) [2024-06-24 18:19:16,613][15349] Signal inference workers to stop experience collection... (173350 times) [2024-06-24 18:19:16,650][15401] InferenceWorker_p0-w0: stopping experience collection (173350 times) [2024-06-24 18:19:16,664][15349] Signal inference workers to resume experience collection... (173350 times) [2024-06-24 18:19:16,670][15401] InferenceWorker_p0-w0: resuming experience collection (173350 times) [2024-06-24 18:19:17,156][15401] Updated weights for policy 0, policy_version 715002 (0.0026) [2024-06-24 18:19:18,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43688.8, 300 sec: 42986.8). Total num frames: 11714641920. Throughput: 0: 42757.2. Samples: 11714784960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:18,393][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 18:19:21,704][15401] Updated weights for policy 0, policy_version 715012 (0.0034) [2024-06-24 18:19:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 11714822144. Throughput: 0: 43087.1. Samples: 11714916820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 18:19:24,899][15401] Updated weights for policy 0, policy_version 715022 (0.0030) [2024-06-24 18:19:28,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11715035136. Throughput: 0: 42706.1. Samples: 11715167240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 18:19:29,567][15401] Updated weights for policy 0, policy_version 715032 (0.0033) [2024-06-24 18:19:32,478][15401] Updated weights for policy 0, policy_version 715042 (0.0040) [2024-06-24 18:19:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 11715280896. Throughput: 0: 42766.7. Samples: 11715422960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 18:19:37,290][15401] Updated weights for policy 0, policy_version 715052 (0.0031) [2024-06-24 18:19:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11715444736. Throughput: 0: 42724.8. Samples: 11715550680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:38,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 18:19:40,404][15401] Updated weights for policy 0, policy_version 715062 (0.0036) [2024-06-24 18:19:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11715674112. Throughput: 0: 42608.4. Samples: 11715806340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 18:19:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715069_11715690496.pth... [2024-06-24 18:19:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714441_11705401344.pth [2024-06-24 18:19:44,967][15401] Updated weights for policy 0, policy_version 715072 (0.0029) [2024-06-24 18:19:47,974][15401] Updated weights for policy 0, policy_version 715082 (0.0041) [2024-06-24 18:19:48,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11715919872. Throughput: 0: 42445.4. Samples: 11716055560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:48,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 18:19:52,711][15401] Updated weights for policy 0, policy_version 715092 (0.0039) [2024-06-24 18:19:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 11716100096. Throughput: 0: 42547.5. Samples: 11716195160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 18:19:55,523][15401] Updated weights for policy 0, policy_version 715102 (0.0036) [2024-06-24 18:19:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11716329472. Throughput: 0: 42522.7. Samples: 11716441940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:19:58,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 18:20:00,283][15401] Updated weights for policy 0, policy_version 715112 (0.0042) [2024-06-24 18:20:03,302][15401] Updated weights for policy 0, policy_version 715122 (0.0030) [2024-06-24 18:20:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 11716558848. Throughput: 0: 42584.9. Samples: 11716701180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:20:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 18:20:07,914][15401] Updated weights for policy 0, policy_version 715132 (0.0029) [2024-06-24 18:20:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11716739072. Throughput: 0: 42521.8. Samples: 11716830300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:20:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 18:20:10,949][15401] Updated weights for policy 0, policy_version 715142 (0.0042) [2024-06-24 18:20:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 11716984832. Throughput: 0: 42591.5. Samples: 11717083960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:20:13,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 18:20:15,454][15401] Updated weights for policy 0, policy_version 715152 (0.0031) [2024-06-24 18:20:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42932.0). Total num frames: 11717197824. Throughput: 0: 42695.1. Samples: 11717344240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:20:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 18:20:18,766][15401] Updated weights for policy 0, policy_version 715162 (0.0041) [2024-06-24 18:20:23,087][15401] Updated weights for policy 0, policy_version 715172 (0.0027) [2024-06-24 18:20:23,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.5, 300 sec: 42876.3). Total num frames: 11717394432. Throughput: 0: 42654.4. Samples: 11717470120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-24 18:20:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 18:20:26,697][15401] Updated weights for policy 0, policy_version 715182 (0.0036) [2024-06-24 18:20:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 11717640192. Throughput: 0: 42697.2. Samples: 11717727820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:28,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 18:20:31,278][15401] Updated weights for policy 0, policy_version 715192 (0.0040) [2024-06-24 18:20:32,313][15349] Signal inference workers to stop experience collection... (173400 times) [2024-06-24 18:20:32,314][15349] Signal inference workers to resume experience collection... (173400 times) [2024-06-24 18:20:32,336][15401] InferenceWorker_p0-w0: stopping experience collection (173400 times) [2024-06-24 18:20:32,336][15401] InferenceWorker_p0-w0: resuming experience collection (173400 times) [2024-06-24 18:20:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11717853184. Throughput: 0: 42943.5. Samples: 11717988020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 18:20:34,110][15401] Updated weights for policy 0, policy_version 715202 (0.0040) [2024-06-24 18:20:38,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11718017024. Throughput: 0: 42702.6. Samples: 11718116780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:38,396][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 18:20:38,876][15401] Updated weights for policy 0, policy_version 715212 (0.0031) [2024-06-24 18:20:41,798][15401] Updated weights for policy 0, policy_version 715222 (0.0033) [2024-06-24 18:20:43,390][15132] Fps is (10 sec: 40958.1, 60 sec: 43144.2, 300 sec: 42931.6). Total num frames: 11718262784. Throughput: 0: 42956.5. Samples: 11718375000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:43,391][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 18:20:46,388][15401] Updated weights for policy 0, policy_version 715232 (0.0037) [2024-06-24 18:20:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11718475776. Throughput: 0: 43003.6. Samples: 11718636340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:48,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 18:20:49,145][15401] Updated weights for policy 0, policy_version 715242 (0.0022) [2024-06-24 18:20:53,389][15132] Fps is (10 sec: 39323.6, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 11718656000. Throughput: 0: 42903.2. Samples: 11718760940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 18:20:53,882][15401] Updated weights for policy 0, policy_version 715252 (0.0032) [2024-06-24 18:20:56,847][15401] Updated weights for policy 0, policy_version 715262 (0.0042) [2024-06-24 18:20:58,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 11718918144. Throughput: 0: 42994.3. Samples: 11719018700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:20:58,392][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 18:21:01,749][15401] Updated weights for policy 0, policy_version 715272 (0.0034) [2024-06-24 18:21:03,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11719114752. Throughput: 0: 42929.6. Samples: 11719276080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 18:21:04,679][15401] Updated weights for policy 0, policy_version 715282 (0.0032) [2024-06-24 18:21:08,390][15132] Fps is (10 sec: 39330.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11719311360. Throughput: 0: 42927.8. Samples: 11719401880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 18:21:09,273][15401] Updated weights for policy 0, policy_version 715292 (0.0041) [2024-06-24 18:21:12,253][15401] Updated weights for policy 0, policy_version 715302 (0.0028) [2024-06-24 18:21:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 11719557120. Throughput: 0: 42965.8. Samples: 11719661180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 18:21:16,966][15401] Updated weights for policy 0, policy_version 715312 (0.0041) [2024-06-24 18:21:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11719753728. Throughput: 0: 42926.2. Samples: 11719919700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 18:21:19,775][15401] Updated weights for policy 0, policy_version 715322 (0.0023) [2024-06-24 18:21:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11719966720. Throughput: 0: 42711.6. Samples: 11720038800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:23,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 18:21:24,616][15401] Updated weights for policy 0, policy_version 715332 (0.0029) [2024-06-24 18:21:27,924][15401] Updated weights for policy 0, policy_version 715342 (0.0030) [2024-06-24 18:21:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 11720179712. Throughput: 0: 42669.4. Samples: 11720295100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:28,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 18:21:32,356][15401] Updated weights for policy 0, policy_version 715352 (0.0035) [2024-06-24 18:21:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 11720376320. Throughput: 0: 42673.8. Samples: 11720556660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:33,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 18:21:35,700][15401] Updated weights for policy 0, policy_version 715362 (0.0041) [2024-06-24 18:21:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11720605696. Throughput: 0: 42673.7. Samples: 11720681260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 18:21:40,240][15401] Updated weights for policy 0, policy_version 715372 (0.0038) [2024-06-24 18:21:43,354][15401] Updated weights for policy 0, policy_version 715382 (0.0029) [2024-06-24 18:21:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.7, 300 sec: 42820.5). Total num frames: 11720818688. Throughput: 0: 42528.9. Samples: 11720932400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 18:21:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715382_11720818688.pth... [2024-06-24 18:21:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000714756_11710562304.pth [2024-06-24 18:21:47,766][15401] Updated weights for policy 0, policy_version 715392 (0.0032) [2024-06-24 18:21:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 11721031680. Throughput: 0: 42670.0. Samples: 11721196220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:48,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 18:21:48,920][15349] Signal inference workers to stop experience collection... (173450 times) [2024-06-24 18:21:48,920][15349] Signal inference workers to resume experience collection... (173450 times) [2024-06-24 18:21:48,962][15401] InferenceWorker_p0-w0: stopping experience collection (173450 times) [2024-06-24 18:21:48,962][15401] InferenceWorker_p0-w0: resuming experience collection (173450 times) [2024-06-24 18:21:50,882][15401] Updated weights for policy 0, policy_version 715402 (0.0036) [2024-06-24 18:21:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.7, 300 sec: 42820.6). Total num frames: 11721244672. Throughput: 0: 42667.6. Samples: 11721322020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:53,392][15132] Avg episode reward: [(0, '0.153')] [2024-06-24 18:21:55,315][15401] Updated weights for policy 0, policy_version 715412 (0.0024) [2024-06-24 18:21:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 11721457664. Throughput: 0: 42584.6. Samples: 11721577480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:21:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 18:21:58,509][15401] Updated weights for policy 0, policy_version 715422 (0.0036) [2024-06-24 18:22:02,731][15401] Updated weights for policy 0, policy_version 715432 (0.0034) [2024-06-24 18:22:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11721670656. Throughput: 0: 42603.5. Samples: 11721836860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:22:03,395][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 18:22:06,133][15401] Updated weights for policy 0, policy_version 715442 (0.0038) [2024-06-24 18:22:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11721867264. Throughput: 0: 42751.5. Samples: 11721962620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-24 18:22:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 18:22:10,422][15401] Updated weights for policy 0, policy_version 715452 (0.0029) [2024-06-24 18:22:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11722096640. Throughput: 0: 42745.2. Samples: 11722218640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 18:22:13,681][15401] Updated weights for policy 0, policy_version 715462 (0.0031) [2024-06-24 18:22:18,136][15401] Updated weights for policy 0, policy_version 715472 (0.0029) [2024-06-24 18:22:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11722309632. Throughput: 0: 42659.1. Samples: 11722476320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 18:22:21,368][15401] Updated weights for policy 0, policy_version 715482 (0.0032) [2024-06-24 18:22:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11722522624. Throughput: 0: 42552.0. Samples: 11722596100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 18:22:26,023][15401] Updated weights for policy 0, policy_version 715492 (0.0024) [2024-06-24 18:22:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42820.3). Total num frames: 11722752000. Throughput: 0: 42845.3. Samples: 11722860540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:28,392][15132] Avg episode reward: [(0, '0.170')] [2024-06-24 18:22:28,930][15401] Updated weights for policy 0, policy_version 715502 (0.0041) [2024-06-24 18:22:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 11722932224. Throughput: 0: 42736.4. Samples: 11723119360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 18:22:33,613][15401] Updated weights for policy 0, policy_version 715512 (0.0026) [2024-06-24 18:22:36,854][15401] Updated weights for policy 0, policy_version 715522 (0.0026) [2024-06-24 18:22:38,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11723177984. Throughput: 0: 42587.7. Samples: 11723238360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:38,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 18:22:41,349][15401] Updated weights for policy 0, policy_version 715532 (0.0040) [2024-06-24 18:22:43,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 11723407360. Throughput: 0: 42776.3. Samples: 11723502420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:43,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 18:22:44,383][15401] Updated weights for policy 0, policy_version 715542 (0.0024) [2024-06-24 18:22:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11723571200. Throughput: 0: 42733.8. Samples: 11723759880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 18:22:49,051][15401] Updated weights for policy 0, policy_version 715552 (0.0041) [2024-06-24 18:22:51,998][15401] Updated weights for policy 0, policy_version 715562 (0.0037) [2024-06-24 18:22:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 11723816960. Throughput: 0: 42581.1. Samples: 11723878760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 18:22:56,777][15401] Updated weights for policy 0, policy_version 715572 (0.0032) [2024-06-24 18:22:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 11723997184. Throughput: 0: 42584.8. Samples: 11724135060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:22:58,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 18:22:59,929][15401] Updated weights for policy 0, policy_version 715582 (0.0025) [2024-06-24 18:23:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11724210176. Throughput: 0: 42474.7. Samples: 11724387680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:03,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 18:23:04,820][15401] Updated weights for policy 0, policy_version 715592 (0.0033) [2024-06-24 18:23:07,482][15401] Updated weights for policy 0, policy_version 715602 (0.0043) [2024-06-24 18:23:08,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11724439552. Throughput: 0: 42606.5. Samples: 11724513400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 18:23:12,461][15401] Updated weights for policy 0, policy_version 715612 (0.0036) [2024-06-24 18:23:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11724652544. Throughput: 0: 42467.7. Samples: 11724771480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 18:23:15,160][15401] Updated weights for policy 0, policy_version 715622 (0.0030) [2024-06-24 18:23:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11724832768. Throughput: 0: 42347.0. Samples: 11725024980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 18:23:20,058][15401] Updated weights for policy 0, policy_version 715632 (0.0053) [2024-06-24 18:23:23,273][15401] Updated weights for policy 0, policy_version 715642 (0.0038) [2024-06-24 18:23:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11725078528. Throughput: 0: 42333.2. Samples: 11725143360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 18:23:27,869][15401] Updated weights for policy 0, policy_version 715652 (0.0029) [2024-06-24 18:23:28,320][15349] Signal inference workers to stop experience collection... (173500 times) [2024-06-24 18:23:28,363][15401] InferenceWorker_p0-w0: stopping experience collection (173500 times) [2024-06-24 18:23:28,374][15349] Signal inference workers to resume experience collection... (173500 times) [2024-06-24 18:23:28,383][15401] InferenceWorker_p0-w0: resuming experience collection (173500 times) [2024-06-24 18:23:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 11725275136. Throughput: 0: 42128.5. Samples: 11725398200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:28,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 18:23:31,315][15401] Updated weights for policy 0, policy_version 715662 (0.0032) [2024-06-24 18:23:33,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 11725471744. Throughput: 0: 41975.1. Samples: 11725648860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:33,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 18:23:35,380][15401] Updated weights for policy 0, policy_version 715672 (0.0037) [2024-06-24 18:23:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11725717504. Throughput: 0: 42181.2. Samples: 11725776920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 18:23:38,779][15401] Updated weights for policy 0, policy_version 715682 (0.0045) [2024-06-24 18:23:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 41233.1, 300 sec: 42542.9). Total num frames: 11725881344. Throughput: 0: 42224.9. Samples: 11726035080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 18:23:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715692_11725897728.pth... [2024-06-24 18:23:43,427][15401] Updated weights for policy 0, policy_version 715692 (0.0033) [2024-06-24 18:23:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715069_11715690496.pth [2024-06-24 18:23:46,362][15401] Updated weights for policy 0, policy_version 715702 (0.0027) [2024-06-24 18:23:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 11726110720. Throughput: 0: 42171.5. Samples: 11726285400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 18:23:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 18:23:51,226][15401] Updated weights for policy 0, policy_version 715712 (0.0034) [2024-06-24 18:23:53,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11726356480. Throughput: 0: 42393.4. Samples: 11726421100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:23:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 18:23:53,989][15401] Updated weights for policy 0, policy_version 715722 (0.0028) [2024-06-24 18:23:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 11726536704. Throughput: 0: 42446.6. Samples: 11726681580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:23:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 18:23:58,803][15401] Updated weights for policy 0, policy_version 715732 (0.0021) [2024-06-24 18:24:01,608][15401] Updated weights for policy 0, policy_version 715742 (0.0043) [2024-06-24 18:24:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11726782464. Throughput: 0: 42293.8. Samples: 11726928200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 18:24:06,289][15401] Updated weights for policy 0, policy_version 715752 (0.0029) [2024-06-24 18:24:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11726979072. Throughput: 0: 42709.8. Samples: 11727065300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 18:24:09,223][15401] Updated weights for policy 0, policy_version 715762 (0.0037) [2024-06-24 18:24:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 11727175680. Throughput: 0: 42775.5. Samples: 11727323100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 18:24:13,898][15401] Updated weights for policy 0, policy_version 715772 (0.0040) [2024-06-24 18:24:17,118][15401] Updated weights for policy 0, policy_version 715782 (0.0035) [2024-06-24 18:24:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 11727437824. Throughput: 0: 42566.4. Samples: 11727564240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 18:24:21,462][15401] Updated weights for policy 0, policy_version 715792 (0.0023) [2024-06-24 18:24:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11727634432. Throughput: 0: 42776.1. Samples: 11727701840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 18:24:24,817][15401] Updated weights for policy 0, policy_version 715802 (0.0036) [2024-06-24 18:24:24,826][15349] Signal inference workers to stop experience collection... (173550 times) [2024-06-24 18:24:24,832][15349] Signal inference workers to resume experience collection... (173550 times) [2024-06-24 18:24:24,860][15401] InferenceWorker_p0-w0: stopping experience collection (173550 times) [2024-06-24 18:24:24,861][15401] InferenceWorker_p0-w0: resuming experience collection (173550 times) [2024-06-24 18:24:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11727831040. Throughput: 0: 42816.9. Samples: 11727961840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 18:24:29,105][15401] Updated weights for policy 0, policy_version 715812 (0.0034) [2024-06-24 18:24:32,463][15401] Updated weights for policy 0, policy_version 715822 (0.0031) [2024-06-24 18:24:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11728044032. Throughput: 0: 42762.7. Samples: 11728209720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 18:24:36,912][15401] Updated weights for policy 0, policy_version 715832 (0.0050) [2024-06-24 18:24:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11728273408. Throughput: 0: 42718.8. Samples: 11728343440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 18:24:40,086][15401] Updated weights for policy 0, policy_version 715842 (0.0038) [2024-06-24 18:24:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 11728470016. Throughput: 0: 42611.2. Samples: 11728599080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 18:24:44,504][15401] Updated weights for policy 0, policy_version 715852 (0.0030) [2024-06-24 18:24:48,303][15401] Updated weights for policy 0, policy_version 715862 (0.0029) [2024-06-24 18:24:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11728683008. Throughput: 0: 42886.3. Samples: 11728858080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:48,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-24 18:24:52,120][15401] Updated weights for policy 0, policy_version 715872 (0.0029) [2024-06-24 18:24:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11728896000. Throughput: 0: 42607.6. Samples: 11728982640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 18:24:55,832][15401] Updated weights for policy 0, policy_version 715882 (0.0031) [2024-06-24 18:24:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 11729108992. Throughput: 0: 42597.8. Samples: 11729240000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:24:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 18:24:59,727][15401] Updated weights for policy 0, policy_version 715892 (0.0037) [2024-06-24 18:25:03,379][15401] Updated weights for policy 0, policy_version 715902 (0.0030) [2024-06-24 18:25:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11729338368. Throughput: 0: 42941.2. Samples: 11729496700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:03,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 18:25:07,372][15401] Updated weights for policy 0, policy_version 715912 (0.0028) [2024-06-24 18:25:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 11729534976. Throughput: 0: 42786.1. Samples: 11729627220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 18:25:10,914][15401] Updated weights for policy 0, policy_version 715922 (0.0033) [2024-06-24 18:25:13,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11729764352. Throughput: 0: 42604.7. Samples: 11729879060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 18:25:15,149][15401] Updated weights for policy 0, policy_version 715932 (0.0039) [2024-06-24 18:25:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11729977344. Throughput: 0: 42707.5. Samples: 11730131560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 18:25:18,999][15401] Updated weights for policy 0, policy_version 715942 (0.0030) [2024-06-24 18:25:22,994][15401] Updated weights for policy 0, policy_version 715952 (0.0037) [2024-06-24 18:25:23,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 11730173952. Throughput: 0: 42506.6. Samples: 11730256240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 18:25:26,991][15401] Updated weights for policy 0, policy_version 715962 (0.0033) [2024-06-24 18:25:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11730386944. Throughput: 0: 42592.7. Samples: 11730515760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 18:25:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 18:25:30,786][15401] Updated weights for policy 0, policy_version 715972 (0.0029) [2024-06-24 18:25:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11730599936. Throughput: 0: 42341.3. Samples: 11730763440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 18:25:34,533][15401] Updated weights for policy 0, policy_version 715982 (0.0026) [2024-06-24 18:25:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42487.4). Total num frames: 11730796544. Throughput: 0: 42417.4. Samples: 11730891420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 18:25:38,524][15401] Updated weights for policy 0, policy_version 715992 (0.0029) [2024-06-24 18:25:42,129][15401] Updated weights for policy 0, policy_version 716002 (0.0042) [2024-06-24 18:25:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42542.9). Total num frames: 11731025920. Throughput: 0: 42412.8. Samples: 11731148580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 18:25:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716005_11731025920.pth... [2024-06-24 18:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715382_11720818688.pth [2024-06-24 18:25:46,291][15401] Updated weights for policy 0, policy_version 716012 (0.0048) [2024-06-24 18:25:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11731255296. Throughput: 0: 42331.7. Samples: 11731401520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 18:25:49,697][15401] Updated weights for policy 0, policy_version 716022 (0.0032) [2024-06-24 18:25:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 11731435520. Throughput: 0: 42312.0. Samples: 11731531260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 18:25:54,020][15401] Updated weights for policy 0, policy_version 716032 (0.0037) [2024-06-24 18:25:57,542][15401] Updated weights for policy 0, policy_version 716042 (0.0035) [2024-06-24 18:25:58,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42593.9, 300 sec: 42542.0). Total num frames: 11731664896. Throughput: 0: 42408.4. Samples: 11731787700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:25:58,396][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 18:26:01,866][15401] Updated weights for policy 0, policy_version 716052 (0.0029) [2024-06-24 18:26:02,567][15349] Signal inference workers to stop experience collection... (173600 times) [2024-06-24 18:26:02,595][15401] InferenceWorker_p0-w0: stopping experience collection (173600 times) [2024-06-24 18:26:02,628][15349] Signal inference workers to resume experience collection... (173600 times) [2024-06-24 18:26:02,631][15401] InferenceWorker_p0-w0: resuming experience collection (173600 times) [2024-06-24 18:26:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 11731894272. Throughput: 0: 42351.9. Samples: 11732037400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 18:26:05,194][15401] Updated weights for policy 0, policy_version 716062 (0.0026) [2024-06-24 18:26:08,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11732090880. Throughput: 0: 42424.9. Samples: 11732165360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 18:26:09,874][15401] Updated weights for policy 0, policy_version 716072 (0.0029) [2024-06-24 18:26:13,036][15401] Updated weights for policy 0, policy_version 716082 (0.0032) [2024-06-24 18:26:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 11732303872. Throughput: 0: 42434.4. Samples: 11732425300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 18:26:17,355][15401] Updated weights for policy 0, policy_version 716092 (0.0032) [2024-06-24 18:26:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11732533248. Throughput: 0: 42542.5. Samples: 11732677860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 18:26:20,507][15401] Updated weights for policy 0, policy_version 716102 (0.0037) [2024-06-24 18:26:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 11732729856. Throughput: 0: 42580.5. Samples: 11732807540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 18:26:24,898][15401] Updated weights for policy 0, policy_version 716112 (0.0043) [2024-06-24 18:26:28,039][15401] Updated weights for policy 0, policy_version 716122 (0.0028) [2024-06-24 18:26:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11732959232. Throughput: 0: 42572.2. Samples: 11733064320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 18:26:32,371][15401] Updated weights for policy 0, policy_version 716132 (0.0039) [2024-06-24 18:26:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11733155840. Throughput: 0: 42606.0. Samples: 11733318800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 18:26:35,886][15401] Updated weights for policy 0, policy_version 716142 (0.0030) [2024-06-24 18:26:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11733368832. Throughput: 0: 42513.9. Samples: 11733444380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 18:26:39,839][15401] Updated weights for policy 0, policy_version 716152 (0.0038) [2024-06-24 18:26:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11733598208. Throughput: 0: 42683.4. Samples: 11733708180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 18:26:43,393][15401] Updated weights for policy 0, policy_version 716162 (0.0041) [2024-06-24 18:26:47,381][15401] Updated weights for policy 0, policy_version 716172 (0.0036) [2024-06-24 18:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 11733794816. Throughput: 0: 42809.1. Samples: 11733963800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 18:26:51,207][15401] Updated weights for policy 0, policy_version 716182 (0.0054) [2024-06-24 18:26:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 11734007808. Throughput: 0: 42716.4. Samples: 11734087600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 18:26:54,957][15401] Updated weights for policy 0, policy_version 716192 (0.0027) [2024-06-24 18:26:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 11734204416. Throughput: 0: 42590.6. Samples: 11734341880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:26:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 18:26:59,059][15401] Updated weights for policy 0, policy_version 716202 (0.0042) [2024-06-24 18:27:02,599][15401] Updated weights for policy 0, policy_version 716212 (0.0034) [2024-06-24 18:27:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11734433792. Throughput: 0: 42778.3. Samples: 11734602880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:27:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 18:27:06,685][15401] Updated weights for policy 0, policy_version 716222 (0.0025) [2024-06-24 18:27:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11734646784. Throughput: 0: 42892.1. Samples: 11734737680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-24 18:27:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 18:27:10,416][15401] Updated weights for policy 0, policy_version 716232 (0.0025) [2024-06-24 18:27:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11734843392. Throughput: 0: 42681.4. Samples: 11734984980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 18:27:14,545][15401] Updated weights for policy 0, policy_version 716242 (0.0026) [2024-06-24 18:27:17,979][15401] Updated weights for policy 0, policy_version 716252 (0.0026) [2024-06-24 18:27:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11735089152. Throughput: 0: 42683.7. Samples: 11735239560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 18:27:22,181][15401] Updated weights for policy 0, policy_version 716262 (0.0038) [2024-06-24 18:27:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 11735285760. Throughput: 0: 42934.2. Samples: 11735376420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 18:27:25,418][15401] Updated weights for policy 0, policy_version 716272 (0.0038) [2024-06-24 18:27:26,201][15349] Signal inference workers to stop experience collection... (173650 times) [2024-06-24 18:27:26,201][15349] Signal inference workers to resume experience collection... (173650 times) [2024-06-24 18:27:26,244][15401] InferenceWorker_p0-w0: stopping experience collection (173650 times) [2024-06-24 18:27:26,244][15401] InferenceWorker_p0-w0: resuming experience collection (173650 times) [2024-06-24 18:27:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11735498752. Throughput: 0: 42692.9. Samples: 11735629360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:28,392][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 18:27:29,925][15401] Updated weights for policy 0, policy_version 716282 (0.0048) [2024-06-24 18:27:33,128][15401] Updated weights for policy 0, policy_version 716292 (0.0032) [2024-06-24 18:27:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11735744512. Throughput: 0: 42527.0. Samples: 11735877520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 18:27:37,621][15401] Updated weights for policy 0, policy_version 716302 (0.0031) [2024-06-24 18:27:38,394][15132] Fps is (10 sec: 44215.8, 60 sec: 42868.0, 300 sec: 42486.7). Total num frames: 11735941120. Throughput: 0: 42758.7. Samples: 11736011940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:38,395][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 18:27:41,021][15401] Updated weights for policy 0, policy_version 716312 (0.0036) [2024-06-24 18:27:43,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 11736121344. Throughput: 0: 42720.8. Samples: 11736264320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 18:27:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716316_11736121344.pth... [2024-06-24 18:27:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000715692_11725897728.pth [2024-06-24 18:27:45,446][15401] Updated weights for policy 0, policy_version 716322 (0.0040) [2024-06-24 18:27:48,390][15132] Fps is (10 sec: 42618.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 11736367104. Throughput: 0: 42491.1. Samples: 11736514980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 18:27:48,892][15401] Updated weights for policy 0, policy_version 716332 (0.0034) [2024-06-24 18:27:53,109][15401] Updated weights for policy 0, policy_version 716342 (0.0033) [2024-06-24 18:27:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 11736563712. Throughput: 0: 42465.8. Samples: 11736648640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:53,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 18:27:56,476][15401] Updated weights for policy 0, policy_version 716352 (0.0045) [2024-06-24 18:27:58,396][15132] Fps is (10 sec: 40932.3, 60 sec: 42866.6, 300 sec: 42597.4). Total num frames: 11736776704. Throughput: 0: 42586.8. Samples: 11736901680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:27:58,397][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 18:28:00,801][15401] Updated weights for policy 0, policy_version 716362 (0.0033) [2024-06-24 18:28:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11736989696. Throughput: 0: 42535.0. Samples: 11737153640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 18:28:04,543][15401] Updated weights for policy 0, policy_version 716372 (0.0034) [2024-06-24 18:28:08,390][15132] Fps is (10 sec: 40987.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 11737186304. Throughput: 0: 42403.0. Samples: 11737284560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 18:28:08,400][15401] Updated weights for policy 0, policy_version 716382 (0.0024) [2024-06-24 18:28:12,065][15401] Updated weights for policy 0, policy_version 716392 (0.0033) [2024-06-24 18:28:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11737415680. Throughput: 0: 42471.5. Samples: 11737540580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 18:28:16,195][15401] Updated weights for policy 0, policy_version 716402 (0.0024) [2024-06-24 18:28:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11737628672. Throughput: 0: 42621.0. Samples: 11737795460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 18:28:19,719][15401] Updated weights for policy 0, policy_version 716412 (0.0036) [2024-06-24 18:28:23,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 11737825280. Throughput: 0: 42558.2. Samples: 11737926960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:23,392][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 18:28:23,855][15401] Updated weights for policy 0, policy_version 716422 (0.0027) [2024-06-24 18:28:27,338][15401] Updated weights for policy 0, policy_version 716432 (0.0028) [2024-06-24 18:28:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11738054656. Throughput: 0: 42637.5. Samples: 11738183000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 18:28:31,565][15401] Updated weights for policy 0, policy_version 716442 (0.0039) [2024-06-24 18:28:33,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11738284032. Throughput: 0: 42554.3. Samples: 11738429920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 18:28:35,096][15401] Updated weights for policy 0, policy_version 716452 (0.0032) [2024-06-24 18:28:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42328.7, 300 sec: 42709.5). Total num frames: 11738480640. Throughput: 0: 42611.5. Samples: 11738566160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 18:28:39,393][15401] Updated weights for policy 0, policy_version 716462 (0.0031) [2024-06-24 18:28:42,880][15401] Updated weights for policy 0, policy_version 716472 (0.0034) [2024-06-24 18:28:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 11738693632. Throughput: 0: 42602.4. Samples: 11738818500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 18:28:46,934][15401] Updated weights for policy 0, policy_version 716482 (0.0044) [2024-06-24 18:28:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11738923008. Throughput: 0: 42675.2. Samples: 11739074020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 18:28:50,535][15401] Updated weights for policy 0, policy_version 716492 (0.0023) [2024-06-24 18:28:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11739119616. Throughput: 0: 42727.7. Samples: 11739207300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 18:28:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 18:28:54,570][15401] Updated weights for policy 0, policy_version 716502 (0.0037) [2024-06-24 18:28:55,547][15349] Signal inference workers to stop experience collection... (173700 times) [2024-06-24 18:28:55,586][15401] InferenceWorker_p0-w0: stopping experience collection (173700 times) [2024-06-24 18:28:55,667][15349] Signal inference workers to resume experience collection... (173700 times) [2024-06-24 18:28:55,667][15401] InferenceWorker_p0-w0: resuming experience collection (173700 times) [2024-06-24 18:28:58,301][15401] Updated weights for policy 0, policy_version 716512 (0.0032) [2024-06-24 18:28:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42603.2, 300 sec: 42542.9). Total num frames: 11739332608. Throughput: 0: 42622.8. Samples: 11739458600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:28:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 18:29:02,187][15401] Updated weights for policy 0, policy_version 716522 (0.0040) [2024-06-24 18:29:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 11739545600. Throughput: 0: 42826.0. Samples: 11739722740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:03,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 18:29:06,034][15401] Updated weights for policy 0, policy_version 716532 (0.0034) [2024-06-24 18:29:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11739774976. Throughput: 0: 42785.8. Samples: 11739852220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 18:29:09,868][15401] Updated weights for policy 0, policy_version 716542 (0.0030) [2024-06-24 18:29:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 11739971584. Throughput: 0: 42633.3. Samples: 11740101500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:29:13,648][15401] Updated weights for policy 0, policy_version 716552 (0.0042) [2024-06-24 18:29:17,623][15401] Updated weights for policy 0, policy_version 716562 (0.0031) [2024-06-24 18:29:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11740184576. Throughput: 0: 42960.0. Samples: 11740363120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:18,392][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 18:29:21,369][15401] Updated weights for policy 0, policy_version 716572 (0.0037) [2024-06-24 18:29:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11740397568. Throughput: 0: 42712.4. Samples: 11740488220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 18:29:25,313][15401] Updated weights for policy 0, policy_version 716582 (0.0045) [2024-06-24 18:29:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11740610560. Throughput: 0: 42546.7. Samples: 11740733100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 18:29:28,914][15401] Updated weights for policy 0, policy_version 716592 (0.0035) [2024-06-24 18:29:32,870][15401] Updated weights for policy 0, policy_version 716602 (0.0042) [2024-06-24 18:29:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11740823552. Throughput: 0: 42664.0. Samples: 11740993900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 18:29:36,875][15401] Updated weights for policy 0, policy_version 716612 (0.0036) [2024-06-24 18:29:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11741036544. Throughput: 0: 42528.4. Samples: 11741121080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 18:29:40,437][15401] Updated weights for policy 0, policy_version 716622 (0.0041) [2024-06-24 18:29:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11741265920. Throughput: 0: 42519.1. Samples: 11741371960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:43,396][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 18:29:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716630_11741265920.pth... [2024-06-24 18:29:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716005_11731025920.pth [2024-06-24 18:29:44,399][15401] Updated weights for policy 0, policy_version 716632 (0.0030) [2024-06-24 18:29:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11741446144. Throughput: 0: 42453.4. Samples: 11741633040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 18:29:48,509][15401] Updated weights for policy 0, policy_version 716642 (0.0040) [2024-06-24 18:29:51,982][15401] Updated weights for policy 0, policy_version 716652 (0.0035) [2024-06-24 18:29:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11741691904. Throughput: 0: 42285.4. Samples: 11741755160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:53,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 18:29:56,335][15401] Updated weights for policy 0, policy_version 716662 (0.0025) [2024-06-24 18:29:58,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42598.4). Total num frames: 11741904896. Throughput: 0: 42499.5. Samples: 11742014080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:29:58,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 18:29:59,505][15401] Updated weights for policy 0, policy_version 716672 (0.0037) [2024-06-24 18:30:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11742101504. Throughput: 0: 42313.8. Samples: 11742267240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 18:30:03,902][15401] Updated weights for policy 0, policy_version 716682 (0.0027) [2024-06-24 18:30:07,175][15401] Updated weights for policy 0, policy_version 716692 (0.0036) [2024-06-24 18:30:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11742314496. Throughput: 0: 42392.9. Samples: 11742395900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 18:30:11,402][15401] Updated weights for policy 0, policy_version 716702 (0.0042) [2024-06-24 18:30:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11742511104. Throughput: 0: 42659.5. Samples: 11742652780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 18:30:13,711][15349] Signal inference workers to stop experience collection... (173750 times) [2024-06-24 18:30:13,711][15349] Signal inference workers to resume experience collection... (173750 times) [2024-06-24 18:30:13,760][15401] InferenceWorker_p0-w0: stopping experience collection (173750 times) [2024-06-24 18:30:13,760][15401] InferenceWorker_p0-w0: resuming experience collection (173750 times) [2024-06-24 18:30:15,075][15401] Updated weights for policy 0, policy_version 716712 (0.0029) [2024-06-24 18:30:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11742740480. Throughput: 0: 42469.4. Samples: 11742905020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 18:30:19,140][15401] Updated weights for policy 0, policy_version 716722 (0.0028) [2024-06-24 18:30:22,675][15401] Updated weights for policy 0, policy_version 716732 (0.0035) [2024-06-24 18:30:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11742953472. Throughput: 0: 42594.3. Samples: 11743037820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:23,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 18:30:26,874][15401] Updated weights for policy 0, policy_version 716742 (0.0037) [2024-06-24 18:30:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11743150080. Throughput: 0: 42581.8. Samples: 11743288140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 18:30:30,283][15401] Updated weights for policy 0, policy_version 716752 (0.0024) [2024-06-24 18:30:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11743395840. Throughput: 0: 42463.0. Samples: 11743543880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 18:30:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 18:30:34,455][15401] Updated weights for policy 0, policy_version 716762 (0.0034) [2024-06-24 18:30:37,897][15401] Updated weights for policy 0, policy_version 716772 (0.0031) [2024-06-24 18:30:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11743608832. Throughput: 0: 42720.4. Samples: 11743677480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:30:38,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 18:30:42,142][15401] Updated weights for policy 0, policy_version 716782 (0.0044) [2024-06-24 18:30:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11743805440. Throughput: 0: 42648.5. Samples: 11743933160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:30:43,396][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 18:30:45,765][15401] Updated weights for policy 0, policy_version 716792 (0.0027) [2024-06-24 18:30:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11744018432. Throughput: 0: 42588.1. Samples: 11744183700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:30:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 18:30:49,870][15401] Updated weights for policy 0, policy_version 716802 (0.0029) [2024-06-24 18:30:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42599.3). Total num frames: 11744231424. Throughput: 0: 42608.0. Samples: 11744313260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:30:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 18:30:53,668][15401] Updated weights for policy 0, policy_version 716812 (0.0038) [2024-06-24 18:30:57,563][15401] Updated weights for policy 0, policy_version 716822 (0.0028) [2024-06-24 18:30:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 11744460800. Throughput: 0: 42517.9. Samples: 11744566080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:30:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 18:31:01,669][15401] Updated weights for policy 0, policy_version 716832 (0.0037) [2024-06-24 18:31:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11744641024. Throughput: 0: 42647.0. Samples: 11744824140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 18:31:05,067][15401] Updated weights for policy 0, policy_version 716842 (0.0029) [2024-06-24 18:31:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11744870400. Throughput: 0: 42428.3. Samples: 11744947100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 18:31:09,128][15401] Updated weights for policy 0, policy_version 716852 (0.0048) [2024-06-24 18:31:12,561][15401] Updated weights for policy 0, policy_version 716862 (0.0040) [2024-06-24 18:31:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11745099776. Throughput: 0: 42607.9. Samples: 11745205500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 18:31:17,063][15401] Updated weights for policy 0, policy_version 716872 (0.0044) [2024-06-24 18:31:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.5, 300 sec: 42542.5). Total num frames: 11745280000. Throughput: 0: 42715.1. Samples: 11745466160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:18,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 18:31:20,146][15401] Updated weights for policy 0, policy_version 716882 (0.0044) [2024-06-24 18:31:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11745509376. Throughput: 0: 42309.1. Samples: 11745581380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 18:31:24,647][15401] Updated weights for policy 0, policy_version 716892 (0.0035) [2024-06-24 18:31:28,131][15401] Updated weights for policy 0, policy_version 716902 (0.0046) [2024-06-24 18:31:28,389][15132] Fps is (10 sec: 45886.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 11745738752. Throughput: 0: 42520.1. Samples: 11745846560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 18:31:32,462][15401] Updated weights for policy 0, policy_version 716912 (0.0033) [2024-06-24 18:31:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 11745902592. Throughput: 0: 42654.2. Samples: 11746103140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 18:31:35,752][15401] Updated weights for policy 0, policy_version 716922 (0.0034) [2024-06-24 18:31:36,391][15349] Signal inference workers to stop experience collection... (173800 times) [2024-06-24 18:31:36,398][15349] Signal inference workers to resume experience collection... (173800 times) [2024-06-24 18:31:36,414][15401] InferenceWorker_p0-w0: stopping experience collection (173800 times) [2024-06-24 18:31:36,414][15401] InferenceWorker_p0-w0: resuming experience collection (173800 times) [2024-06-24 18:31:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11746148352. Throughput: 0: 42442.1. Samples: 11746223160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 18:31:40,140][15401] Updated weights for policy 0, policy_version 716932 (0.0043) [2024-06-24 18:31:43,306][15401] Updated weights for policy 0, policy_version 716942 (0.0037) [2024-06-24 18:31:43,392][15132] Fps is (10 sec: 47502.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11746377728. Throughput: 0: 42671.4. Samples: 11746486400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:43,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 18:31:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716942_11746377728.pth... [2024-06-24 18:31:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716316_11736121344.pth [2024-06-24 18:31:47,972][15401] Updated weights for policy 0, policy_version 716952 (0.0041) [2024-06-24 18:31:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11746557952. Throughput: 0: 42442.6. Samples: 11746734060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:48,395][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 18:31:51,358][15401] Updated weights for policy 0, policy_version 716962 (0.0039) [2024-06-24 18:31:53,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11746787328. Throughput: 0: 42475.1. Samples: 11746858480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 18:31:55,676][15401] Updated weights for policy 0, policy_version 716972 (0.0047) [2024-06-24 18:31:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11746983936. Throughput: 0: 42392.1. Samples: 11747113140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:31:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 18:31:58,977][15401] Updated weights for policy 0, policy_version 716982 (0.0031) [2024-06-24 18:32:03,147][15401] Updated weights for policy 0, policy_version 716992 (0.0034) [2024-06-24 18:32:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 11747196928. Throughput: 0: 42285.4. Samples: 11747368900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:32:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 18:32:06,590][15401] Updated weights for policy 0, policy_version 717002 (0.0028) [2024-06-24 18:32:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11747426304. Throughput: 0: 42675.6. Samples: 11747501780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:32:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 18:32:10,560][15401] Updated weights for policy 0, policy_version 717012 (0.0031) [2024-06-24 18:32:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42050.7, 300 sec: 42487.0). Total num frames: 11747622912. Throughput: 0: 42438.1. Samples: 11747756380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 18:32:13,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 18:32:14,506][15401] Updated weights for policy 0, policy_version 717022 (0.0032) [2024-06-24 18:32:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 11747835904. Throughput: 0: 42432.8. Samples: 11748012620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:18,396][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 18:32:18,594][15401] Updated weights for policy 0, policy_version 717032 (0.0040) [2024-06-24 18:32:22,227][15401] Updated weights for policy 0, policy_version 717042 (0.0029) [2024-06-24 18:32:23,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11748081664. Throughput: 0: 42761.4. Samples: 11748147420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 18:32:26,097][15401] Updated weights for policy 0, policy_version 717052 (0.0033) [2024-06-24 18:32:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 11748261888. Throughput: 0: 42448.9. Samples: 11748396500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 18:32:29,804][15401] Updated weights for policy 0, policy_version 717062 (0.0030) [2024-06-24 18:32:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42543.5). Total num frames: 11748491264. Throughput: 0: 42660.0. Samples: 11748653760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:33,390][15132] Avg episode reward: [(0, '0.144')] [2024-06-24 18:32:33,823][15401] Updated weights for policy 0, policy_version 717072 (0.0035) [2024-06-24 18:32:37,580][15401] Updated weights for policy 0, policy_version 717082 (0.0041) [2024-06-24 18:32:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11748704256. Throughput: 0: 42847.6. Samples: 11748786620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:38,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 18:32:41,507][15401] Updated weights for policy 0, policy_version 717092 (0.0033) [2024-06-24 18:32:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 11748900864. Throughput: 0: 42804.0. Samples: 11749039320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 18:32:45,320][15401] Updated weights for policy 0, policy_version 717102 (0.0027) [2024-06-24 18:32:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11749130240. Throughput: 0: 42712.4. Samples: 11749290960. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 18:32:49,050][15401] Updated weights for policy 0, policy_version 717112 (0.0037) [2024-06-24 18:32:52,943][15401] Updated weights for policy 0, policy_version 717122 (0.0032) [2024-06-24 18:32:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 11749359616. Throughput: 0: 42751.8. Samples: 11749425620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:53,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 18:32:54,306][15349] Signal inference workers to stop experience collection... (173850 times) [2024-06-24 18:32:54,355][15401] InferenceWorker_p0-w0: stopping experience collection (173850 times) [2024-06-24 18:32:54,366][15349] Signal inference workers to resume experience collection... (173850 times) [2024-06-24 18:32:54,380][15401] InferenceWorker_p0-w0: resuming experience collection (173850 times) [2024-06-24 18:32:56,759][15401] Updated weights for policy 0, policy_version 717132 (0.0030) [2024-06-24 18:32:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11749556224. Throughput: 0: 42849.3. Samples: 11749684500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:32:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 18:33:00,653][15401] Updated weights for policy 0, policy_version 717142 (0.0027) [2024-06-24 18:33:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11749785600. Throughput: 0: 42721.9. Samples: 11749935100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:03,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 18:33:04,543][15401] Updated weights for policy 0, policy_version 717152 (0.0027) [2024-06-24 18:33:08,337][15401] Updated weights for policy 0, policy_version 717162 (0.0033) [2024-06-24 18:33:08,394][15132] Fps is (10 sec: 42580.5, 60 sec: 42595.3, 300 sec: 42597.8). Total num frames: 11749982208. Throughput: 0: 42777.3. Samples: 11750072580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:08,394][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 18:33:12,436][15401] Updated weights for policy 0, policy_version 717172 (0.0034) [2024-06-24 18:33:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11750195200. Throughput: 0: 43059.6. Samples: 11750334180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 18:33:15,843][15401] Updated weights for policy 0, policy_version 717182 (0.0036) [2024-06-24 18:33:18,389][15132] Fps is (10 sec: 45894.9, 60 sec: 43417.7, 300 sec: 42765.4). Total num frames: 11750440960. Throughput: 0: 42859.1. Samples: 11750582420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 18:33:19,961][15401] Updated weights for policy 0, policy_version 717192 (0.0030) [2024-06-24 18:33:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 11750621184. Throughput: 0: 42891.5. Samples: 11750716840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:23,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 18:33:23,505][15401] Updated weights for policy 0, policy_version 717202 (0.0042) [2024-06-24 18:33:27,351][15401] Updated weights for policy 0, policy_version 717212 (0.0029) [2024-06-24 18:33:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 11750850560. Throughput: 0: 43015.6. Samples: 11750975020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 18:33:31,044][15401] Updated weights for policy 0, policy_version 717222 (0.0036) [2024-06-24 18:33:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11751063552. Throughput: 0: 43108.0. Samples: 11751230820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 18:33:34,755][15401] Updated weights for policy 0, policy_version 717232 (0.0044) [2024-06-24 18:33:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11751276544. Throughput: 0: 43071.7. Samples: 11751363840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 18:33:38,413][15401] Updated weights for policy 0, policy_version 717242 (0.0037) [2024-06-24 18:33:42,542][15401] Updated weights for policy 0, policy_version 717252 (0.0029) [2024-06-24 18:33:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42598.0). Total num frames: 11751489536. Throughput: 0: 43003.5. Samples: 11751619760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:43,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 18:33:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717255_11751505920.pth... [2024-06-24 18:33:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716630_11741265920.pth [2024-06-24 18:33:45,915][15401] Updated weights for policy 0, policy_version 717262 (0.0029) [2024-06-24 18:33:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11751702528. Throughput: 0: 43097.0. Samples: 11751874460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 18:33:50,211][15401] Updated weights for policy 0, policy_version 717272 (0.0033) [2024-06-24 18:33:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11751931904. Throughput: 0: 42885.4. Samples: 11752002240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:53,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-24 18:33:54,058][15401] Updated weights for policy 0, policy_version 717282 (0.0034) [2024-06-24 18:33:57,793][15401] Updated weights for policy 0, policy_version 717292 (0.0040) [2024-06-24 18:33:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 11752112128. Throughput: 0: 42694.1. Samples: 11752255420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 18:33:58,391][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 18:34:01,572][15401] Updated weights for policy 0, policy_version 717302 (0.0038) [2024-06-24 18:34:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11752357888. Throughput: 0: 42804.1. Samples: 11752508600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 18:34:05,508][15401] Updated weights for policy 0, policy_version 717312 (0.0031) [2024-06-24 18:34:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42601.5, 300 sec: 42598.4). Total num frames: 11752538112. Throughput: 0: 42839.7. Samples: 11752644520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 18:34:08,999][15401] Updated weights for policy 0, policy_version 717322 (0.0034) [2024-06-24 18:34:13,066][15401] Updated weights for policy 0, policy_version 717332 (0.0039) [2024-06-24 18:34:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11752767488. Throughput: 0: 42815.5. Samples: 11752901720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 18:34:16,688][15401] Updated weights for policy 0, policy_version 717342 (0.0035) [2024-06-24 18:34:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11752996864. Throughput: 0: 42682.3. Samples: 11753151520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 18:34:20,768][15401] Updated weights for policy 0, policy_version 717352 (0.0035) [2024-06-24 18:34:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 11753193472. Throughput: 0: 42659.9. Samples: 11753283540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:23,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 18:34:24,481][15401] Updated weights for policy 0, policy_version 717362 (0.0028) [2024-06-24 18:34:24,737][15349] Signal inference workers to stop experience collection... (173900 times) [2024-06-24 18:34:24,738][15349] Signal inference workers to resume experience collection... (173900 times) [2024-06-24 18:34:24,759][15401] InferenceWorker_p0-w0: stopping experience collection (173900 times) [2024-06-24 18:34:24,760][15401] InferenceWorker_p0-w0: resuming experience collection (173900 times) [2024-06-24 18:34:28,333][15401] Updated weights for policy 0, policy_version 717372 (0.0042) [2024-06-24 18:34:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11753422848. Throughput: 0: 42833.0. Samples: 11753547140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 18:34:32,040][15401] Updated weights for policy 0, policy_version 717382 (0.0035) [2024-06-24 18:34:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11753652224. Throughput: 0: 42688.2. Samples: 11753795440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 18:34:36,079][15401] Updated weights for policy 0, policy_version 717392 (0.0037) [2024-06-24 18:34:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11753848832. Throughput: 0: 42868.8. Samples: 11753931340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 18:34:39,557][15401] Updated weights for policy 0, policy_version 717402 (0.0030) [2024-06-24 18:34:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11754061824. Throughput: 0: 42971.6. Samples: 11754189140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 18:34:43,683][15401] Updated weights for policy 0, policy_version 717412 (0.0028) [2024-06-24 18:34:47,348][15401] Updated weights for policy 0, policy_version 717422 (0.0028) [2024-06-24 18:34:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 11754307584. Throughput: 0: 42832.9. Samples: 11754436080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:48,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 18:34:51,251][15401] Updated weights for policy 0, policy_version 717432 (0.0028) [2024-06-24 18:34:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11754487808. Throughput: 0: 42848.3. Samples: 11754572700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:53,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 18:34:54,972][15401] Updated weights for policy 0, policy_version 717442 (0.0033) [2024-06-24 18:34:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11754700800. Throughput: 0: 42922.5. Samples: 11754833240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:34:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 18:34:58,865][15401] Updated weights for policy 0, policy_version 717452 (0.0039) [2024-06-24 18:35:02,586][15401] Updated weights for policy 0, policy_version 717462 (0.0031) [2024-06-24 18:35:03,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 11754946560. Throughput: 0: 42994.5. Samples: 11755086380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:03,392][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 18:35:06,479][15401] Updated weights for policy 0, policy_version 717472 (0.0031) [2024-06-24 18:35:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11755126784. Throughput: 0: 43053.9. Samples: 11755220960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 18:35:10,134][15401] Updated weights for policy 0, policy_version 717482 (0.0034) [2024-06-24 18:35:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11755356160. Throughput: 0: 42852.0. Samples: 11755475480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 18:35:14,399][15401] Updated weights for policy 0, policy_version 717492 (0.0038) [2024-06-24 18:35:17,823][15401] Updated weights for policy 0, policy_version 717502 (0.0033) [2024-06-24 18:35:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11755585536. Throughput: 0: 43058.3. Samples: 11755733060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 18:35:21,805][15401] Updated weights for policy 0, policy_version 717512 (0.0030) [2024-06-24 18:35:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11755782144. Throughput: 0: 42992.5. Samples: 11755866000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 18:35:25,329][15401] Updated weights for policy 0, policy_version 717522 (0.0029) [2024-06-24 18:35:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11755995136. Throughput: 0: 42932.1. Samples: 11756121080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 18:35:29,358][15401] Updated weights for policy 0, policy_version 717532 (0.0033) [2024-06-24 18:35:32,833][15401] Updated weights for policy 0, policy_version 717542 (0.0042) [2024-06-24 18:35:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 11756240896. Throughput: 0: 43182.2. Samples: 11756379280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:33,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 18:35:36,873][15401] Updated weights for policy 0, policy_version 717552 (0.0030) [2024-06-24 18:35:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11756421120. Throughput: 0: 42992.8. Samples: 11756507380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 18:35:38,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 18:35:40,375][15401] Updated weights for policy 0, policy_version 717562 (0.0031) [2024-06-24 18:35:41,295][15349] Signal inference workers to stop experience collection... (173950 times) [2024-06-24 18:35:41,300][15349] Signal inference workers to resume experience collection... (173950 times) [2024-06-24 18:35:41,331][15401] InferenceWorker_p0-w0: stopping experience collection (173950 times) [2024-06-24 18:35:41,331][15401] InferenceWorker_p0-w0: resuming experience collection (173950 times) [2024-06-24 18:35:43,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11756634112. Throughput: 0: 43026.8. Samples: 11756769440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:35:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 18:35:43,439][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717569_11756650496.pth... [2024-06-24 18:35:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000716942_11746377728.pth [2024-06-24 18:35:44,829][15401] Updated weights for policy 0, policy_version 717572 (0.0037) [2024-06-24 18:35:48,035][15401] Updated weights for policy 0, policy_version 717582 (0.0043) [2024-06-24 18:35:48,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11756879872. Throughput: 0: 43087.6. Samples: 11757025220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:35:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 18:35:52,332][15401] Updated weights for policy 0, policy_version 717592 (0.0030) [2024-06-24 18:35:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11757060096. Throughput: 0: 43053.7. Samples: 11757158380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:35:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 18:35:55,461][15401] Updated weights for policy 0, policy_version 717602 (0.0022) [2024-06-24 18:35:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11757273088. Throughput: 0: 43015.9. Samples: 11757411200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:35:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 18:35:59,735][15401] Updated weights for policy 0, policy_version 717612 (0.0035) [2024-06-24 18:36:03,393][15132] Fps is (10 sec: 44222.0, 60 sec: 42597.7, 300 sec: 42820.1). Total num frames: 11757502464. Throughput: 0: 42968.4. Samples: 11757666780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:03,393][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 18:36:03,609][15401] Updated weights for policy 0, policy_version 717622 (0.0026) [2024-06-24 18:36:07,352][15401] Updated weights for policy 0, policy_version 717632 (0.0029) [2024-06-24 18:36:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11757699072. Throughput: 0: 42973.5. Samples: 11757799800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 18:36:10,977][15401] Updated weights for policy 0, policy_version 717642 (0.0033) [2024-06-24 18:36:13,390][15132] Fps is (10 sec: 42612.1, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 11757928448. Throughput: 0: 42796.7. Samples: 11758046940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:13,391][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 18:36:15,259][15401] Updated weights for policy 0, policy_version 717652 (0.0036) [2024-06-24 18:36:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11758157824. Throughput: 0: 42793.3. Samples: 11758304980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 18:36:18,815][15401] Updated weights for policy 0, policy_version 717662 (0.0030) [2024-06-24 18:36:23,063][15401] Updated weights for policy 0, policy_version 717672 (0.0031) [2024-06-24 18:36:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11758354432. Throughput: 0: 42871.6. Samples: 11758436600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 18:36:26,602][15401] Updated weights for policy 0, policy_version 717682 (0.0038) [2024-06-24 18:36:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 11758583808. Throughput: 0: 42597.3. Samples: 11758686320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 18:36:30,678][15401] Updated weights for policy 0, policy_version 717692 (0.0041) [2024-06-24 18:36:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 11758764032. Throughput: 0: 42850.2. Samples: 11758953480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 18:36:34,221][15401] Updated weights for policy 0, policy_version 717702 (0.0039) [2024-06-24 18:36:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 11758977024. Throughput: 0: 42537.4. Samples: 11759072560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 18:36:38,534][15401] Updated weights for policy 0, policy_version 717712 (0.0036) [2024-06-24 18:36:41,775][15401] Updated weights for policy 0, policy_version 717722 (0.0033) [2024-06-24 18:36:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11759222784. Throughput: 0: 42573.8. Samples: 11759327020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 18:36:46,224][15401] Updated weights for policy 0, policy_version 717732 (0.0038) [2024-06-24 18:36:47,193][15349] Signal inference workers to stop experience collection... (174000 times) [2024-06-24 18:36:47,248][15401] InferenceWorker_p0-w0: stopping experience collection (174000 times) [2024-06-24 18:36:47,257][15349] Signal inference workers to resume experience collection... (174000 times) [2024-06-24 18:36:47,263][15401] InferenceWorker_p0-w0: resuming experience collection (174000 times) [2024-06-24 18:36:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 11759386624. Throughput: 0: 42794.4. Samples: 11759592380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 18:36:49,694][15401] Updated weights for policy 0, policy_version 717742 (0.0034) [2024-06-24 18:36:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11759616000. Throughput: 0: 42355.0. Samples: 11759705780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:53,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 18:36:53,923][15401] Updated weights for policy 0, policy_version 717752 (0.0026) [2024-06-24 18:36:57,382][15401] Updated weights for policy 0, policy_version 717762 (0.0040) [2024-06-24 18:36:58,389][15132] Fps is (10 sec: 50790.2, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 11759894528. Throughput: 0: 42677.1. Samples: 11759967400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:36:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 18:37:01,404][15401] Updated weights for policy 0, policy_version 717772 (0.0023) [2024-06-24 18:37:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.8, 300 sec: 42765.0). Total num frames: 11760041984. Throughput: 0: 42868.0. Samples: 11760234040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:37:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 18:37:04,871][15401] Updated weights for policy 0, policy_version 717782 (0.0043) [2024-06-24 18:37:08,391][15132] Fps is (10 sec: 37677.2, 60 sec: 42870.3, 300 sec: 42876.2). Total num frames: 11760271360. Throughput: 0: 42564.0. Samples: 11760352040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:37:08,392][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 18:37:08,958][15401] Updated weights for policy 0, policy_version 717792 (0.0050) [2024-06-24 18:37:12,422][15401] Updated weights for policy 0, policy_version 717802 (0.0035) [2024-06-24 18:37:13,389][15132] Fps is (10 sec: 49151.7, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 11760533504. Throughput: 0: 42937.8. Samples: 11760618520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:37:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 18:37:16,673][15401] Updated weights for policy 0, policy_version 717812 (0.0036) [2024-06-24 18:37:18,390][15132] Fps is (10 sec: 42604.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11760697344. Throughput: 0: 42811.1. Samples: 11760879980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-24 18:37:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 18:37:19,925][15401] Updated weights for policy 0, policy_version 717822 (0.0047) [2024-06-24 18:37:23,389][15132] Fps is (10 sec: 36044.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11760893952. Throughput: 0: 42812.4. Samples: 11760999120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:23,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 18:37:24,126][15401] Updated weights for policy 0, policy_version 717832 (0.0039) [2024-06-24 18:37:27,665][15401] Updated weights for policy 0, policy_version 717842 (0.0038) [2024-06-24 18:37:28,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11761172480. Throughput: 0: 42988.0. Samples: 11761261480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 18:37:32,041][15401] Updated weights for policy 0, policy_version 717852 (0.0034) [2024-06-24 18:37:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11761352704. Throughput: 0: 42882.2. Samples: 11761522080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 18:37:35,281][15401] Updated weights for policy 0, policy_version 717862 (0.0043) [2024-06-24 18:37:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11761565696. Throughput: 0: 43041.0. Samples: 11761642620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 18:37:39,601][15401] Updated weights for policy 0, policy_version 717872 (0.0036) [2024-06-24 18:37:42,751][15401] Updated weights for policy 0, policy_version 717882 (0.0052) [2024-06-24 18:37:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11761811456. Throughput: 0: 43019.9. Samples: 11761903300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 18:37:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717884_11761811456.pth... [2024-06-24 18:37:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717255_11751505920.pth [2024-06-24 18:37:47,435][15401] Updated weights for policy 0, policy_version 717892 (0.0023) [2024-06-24 18:37:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 11761991680. Throughput: 0: 42927.0. Samples: 11762165760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 18:37:50,370][15401] Updated weights for policy 0, policy_version 717902 (0.0032) [2024-06-24 18:37:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11762188288. Throughput: 0: 43004.1. Samples: 11762287160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 18:37:55,389][15401] Updated weights for policy 0, policy_version 717912 (0.0037) [2024-06-24 18:37:55,556][15349] Signal inference workers to stop experience collection... (174050 times) [2024-06-24 18:37:55,564][15349] Signal inference workers to resume experience collection... (174050 times) [2024-06-24 18:37:55,575][15401] InferenceWorker_p0-w0: stopping experience collection (174050 times) [2024-06-24 18:37:55,600][15401] InferenceWorker_p0-w0: resuming experience collection (174050 times) [2024-06-24 18:37:58,108][15401] Updated weights for policy 0, policy_version 717922 (0.0033) [2024-06-24 18:37:58,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 11762450432. Throughput: 0: 42908.8. Samples: 11762549520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:37:58,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 18:38:02,795][15401] Updated weights for policy 0, policy_version 717932 (0.0043) [2024-06-24 18:38:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42932.3). Total num frames: 11762647040. Throughput: 0: 42908.0. Samples: 11762810840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 18:38:05,605][15401] Updated weights for policy 0, policy_version 717942 (0.0035) [2024-06-24 18:38:08,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42872.5, 300 sec: 42876.1). Total num frames: 11762843648. Throughput: 0: 42946.1. Samples: 11762931700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 18:38:10,251][15401] Updated weights for policy 0, policy_version 717952 (0.0042) [2024-06-24 18:38:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11763073024. Throughput: 0: 42941.8. Samples: 11763193860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 18:38:13,654][15401] Updated weights for policy 0, policy_version 717962 (0.0044) [2024-06-24 18:38:17,838][15401] Updated weights for policy 0, policy_version 717972 (0.0038) [2024-06-24 18:38:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 11763302400. Throughput: 0: 42829.7. Samples: 11763449420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 18:38:21,150][15401] Updated weights for policy 0, policy_version 717982 (0.0032) [2024-06-24 18:38:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11763482624. Throughput: 0: 42998.6. Samples: 11763577560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 18:38:25,373][15401] Updated weights for policy 0, policy_version 717992 (0.0024) [2024-06-24 18:38:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11763728384. Throughput: 0: 42972.4. Samples: 11763837060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:28,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 18:38:28,689][15401] Updated weights for policy 0, policy_version 718002 (0.0040) [2024-06-24 18:38:33,055][15401] Updated weights for policy 0, policy_version 718012 (0.0027) [2024-06-24 18:38:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11763924992. Throughput: 0: 42828.9. Samples: 11764093060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 18:38:36,165][15401] Updated weights for policy 0, policy_version 718022 (0.0033) [2024-06-24 18:38:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 11764137984. Throughput: 0: 42891.5. Samples: 11764217280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 18:38:40,623][15401] Updated weights for policy 0, policy_version 718032 (0.0025) [2024-06-24 18:38:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11764383744. Throughput: 0: 42930.2. Samples: 11764481280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 18:38:43,934][15401] Updated weights for policy 0, policy_version 718042 (0.0032) [2024-06-24 18:38:48,181][15401] Updated weights for policy 0, policy_version 718052 (0.0040) [2024-06-24 18:38:48,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 11764580352. Throughput: 0: 42737.4. Samples: 11764734120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:48,401][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 18:38:51,459][15401] Updated weights for policy 0, policy_version 718062 (0.0029) [2024-06-24 18:38:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11764776960. Throughput: 0: 42756.5. Samples: 11764855740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 18:38:56,075][15401] Updated weights for policy 0, policy_version 718072 (0.0030) [2024-06-24 18:38:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 11765006336. Throughput: 0: 42797.9. Samples: 11765119760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:38:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 18:38:58,923][15401] Updated weights for policy 0, policy_version 718082 (0.0023) [2024-06-24 18:39:02,604][15349] Signal inference workers to stop experience collection... (174100 times) [2024-06-24 18:39:02,604][15349] Signal inference workers to resume experience collection... (174100 times) [2024-06-24 18:39:02,653][15401] InferenceWorker_p0-w0: stopping experience collection (174100 times) [2024-06-24 18:39:02,653][15401] InferenceWorker_p0-w0: resuming experience collection (174100 times) [2024-06-24 18:39:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 11765202944. Throughput: 0: 42696.0. Samples: 11765370840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-24 18:39:03,393][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 18:39:03,555][15401] Updated weights for policy 0, policy_version 718092 (0.0033) [2024-06-24 18:39:06,538][15401] Updated weights for policy 0, policy_version 718102 (0.0030) [2024-06-24 18:39:08,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11765432320. Throughput: 0: 42675.9. Samples: 11765497980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 18:39:11,431][15401] Updated weights for policy 0, policy_version 718112 (0.0040) [2024-06-24 18:39:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11765628928. Throughput: 0: 42635.2. Samples: 11765755640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 18:39:14,902][15401] Updated weights for policy 0, policy_version 718122 (0.0048) [2024-06-24 18:39:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11765841920. Throughput: 0: 42542.6. Samples: 11766007480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:18,399][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 18:39:19,230][15401] Updated weights for policy 0, policy_version 718132 (0.0037) [2024-06-24 18:39:22,574][15401] Updated weights for policy 0, policy_version 718142 (0.0042) [2024-06-24 18:39:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11766054912. Throughput: 0: 42619.1. Samples: 11766135140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 18:39:26,913][15401] Updated weights for policy 0, policy_version 718152 (0.0039) [2024-06-24 18:39:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11766251520. Throughput: 0: 42345.3. Samples: 11766386820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:28,399][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 18:39:30,563][15401] Updated weights for policy 0, policy_version 718162 (0.0042) [2024-06-24 18:39:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11766480896. Throughput: 0: 42366.6. Samples: 11766640520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:33,396][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 18:39:34,779][15401] Updated weights for policy 0, policy_version 718172 (0.0034) [2024-06-24 18:39:38,144][15401] Updated weights for policy 0, policy_version 718182 (0.0027) [2024-06-24 18:39:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11766693888. Throughput: 0: 42512.0. Samples: 11766768780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 18:39:42,278][15401] Updated weights for policy 0, policy_version 718192 (0.0026) [2024-06-24 18:39:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 11766890496. Throughput: 0: 42412.3. Samples: 11767028320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 18:39:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718195_11766906880.pth... [2024-06-24 18:39:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717569_11756650496.pth [2024-06-24 18:39:45,741][15401] Updated weights for policy 0, policy_version 718202 (0.0038) [2024-06-24 18:39:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 11767119872. Throughput: 0: 42529.0. Samples: 11767284540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 18:39:50,012][15401] Updated weights for policy 0, policy_version 718212 (0.0038) [2024-06-24 18:39:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11767332864. Throughput: 0: 42502.3. Samples: 11767410580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 18:39:53,560][15401] Updated weights for policy 0, policy_version 718222 (0.0024) [2024-06-24 18:39:57,878][15401] Updated weights for policy 0, policy_version 718232 (0.0036) [2024-06-24 18:39:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42598.7). Total num frames: 11767513088. Throughput: 0: 42327.9. Samples: 11767660400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:39:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 18:40:01,741][15401] Updated weights for policy 0, policy_version 718242 (0.0046) [2024-06-24 18:40:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 11767742464. Throughput: 0: 42437.5. Samples: 11767917160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:03,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 18:40:05,520][15401] Updated weights for policy 0, policy_version 718252 (0.0038) [2024-06-24 18:40:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 11767971840. Throughput: 0: 42431.2. Samples: 11768044540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 18:40:09,266][15401] Updated weights for policy 0, policy_version 718262 (0.0039) [2024-06-24 18:40:13,391][15132] Fps is (10 sec: 40955.4, 60 sec: 42051.4, 300 sec: 42598.2). Total num frames: 11768152064. Throughput: 0: 42426.6. Samples: 11768296060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:13,391][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 18:40:13,543][15401] Updated weights for policy 0, policy_version 718272 (0.0038) [2024-06-24 18:40:17,091][15401] Updated weights for policy 0, policy_version 718282 (0.0037) [2024-06-24 18:40:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11768397824. Throughput: 0: 42478.6. Samples: 11768552060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 18:40:21,195][15401] Updated weights for policy 0, policy_version 718292 (0.0031) [2024-06-24 18:40:23,390][15132] Fps is (10 sec: 47518.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11768627200. Throughput: 0: 42610.6. Samples: 11768686260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 18:40:24,915][15401] Updated weights for policy 0, policy_version 718302 (0.0040) [2024-06-24 18:40:27,000][15349] Signal inference workers to stop experience collection... (174150 times) [2024-06-24 18:40:27,001][15349] Signal inference workers to resume experience collection... (174150 times) [2024-06-24 18:40:27,040][15401] InferenceWorker_p0-w0: stopping experience collection (174150 times) [2024-06-24 18:40:27,040][15401] InferenceWorker_p0-w0: resuming experience collection (174150 times) [2024-06-24 18:40:28,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 11768807424. Throughput: 0: 42367.6. Samples: 11768934960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:28,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 18:40:28,842][15401] Updated weights for policy 0, policy_version 718312 (0.0040) [2024-06-24 18:40:32,493][15401] Updated weights for policy 0, policy_version 718322 (0.0034) [2024-06-24 18:40:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11769020416. Throughput: 0: 42233.8. Samples: 11769185060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 18:40:36,397][15401] Updated weights for policy 0, policy_version 718332 (0.0025) [2024-06-24 18:40:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11769249792. Throughput: 0: 42284.9. Samples: 11769313400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 18:40:40,473][15401] Updated weights for policy 0, policy_version 718342 (0.0044) [2024-06-24 18:40:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11769446400. Throughput: 0: 42459.7. Samples: 11769571080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-24 18:40:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 18:40:44,014][15401] Updated weights for policy 0, policy_version 718352 (0.0033) [2024-06-24 18:40:48,115][15401] Updated weights for policy 0, policy_version 718362 (0.0037) [2024-06-24 18:40:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11769659392. Throughput: 0: 42296.4. Samples: 11769820500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:40:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 18:40:51,901][15401] Updated weights for policy 0, policy_version 718372 (0.0043) [2024-06-24 18:40:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11769888768. Throughput: 0: 42293.8. Samples: 11769947760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:40:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 18:40:55,722][15401] Updated weights for policy 0, policy_version 718382 (0.0030) [2024-06-24 18:40:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42654.4). Total num frames: 11770085376. Throughput: 0: 42449.5. Samples: 11770206240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:40:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 18:40:59,616][15401] Updated weights for policy 0, policy_version 718392 (0.0040) [2024-06-24 18:41:03,342][15401] Updated weights for policy 0, policy_version 718402 (0.0047) [2024-06-24 18:41:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11770298368. Throughput: 0: 42472.1. Samples: 11770463300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:03,396][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 18:41:07,054][15401] Updated weights for policy 0, policy_version 718412 (0.0046) [2024-06-24 18:41:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11770511360. Throughput: 0: 42208.6. Samples: 11770585640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 18:41:11,011][15401] Updated weights for policy 0, policy_version 718422 (0.0046) [2024-06-24 18:41:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42872.3, 300 sec: 42598.4). Total num frames: 11770724352. Throughput: 0: 42597.0. Samples: 11770851720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 18:41:14,570][15401] Updated weights for policy 0, policy_version 718432 (0.0031) [2024-06-24 18:41:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 11770920960. Throughput: 0: 42703.1. Samples: 11771106700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:18,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 18:41:18,842][15401] Updated weights for policy 0, policy_version 718442 (0.0029) [2024-06-24 18:41:22,668][15401] Updated weights for policy 0, policy_version 718452 (0.0046) [2024-06-24 18:41:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11771166720. Throughput: 0: 42669.3. Samples: 11771233520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 18:41:26,399][15401] Updated weights for policy 0, policy_version 718462 (0.0046) [2024-06-24 18:41:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11771379712. Throughput: 0: 42714.1. Samples: 11771493220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 18:41:30,264][15401] Updated weights for policy 0, policy_version 718472 (0.0037) [2024-06-24 18:41:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11771559936. Throughput: 0: 42644.5. Samples: 11771739500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:33,399][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 18:41:34,166][15401] Updated weights for policy 0, policy_version 718482 (0.0032) [2024-06-24 18:41:37,889][15401] Updated weights for policy 0, policy_version 718492 (0.0029) [2024-06-24 18:41:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11771789312. Throughput: 0: 42653.4. Samples: 11771867160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:38,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 18:41:41,858][15401] Updated weights for policy 0, policy_version 718502 (0.0039) [2024-06-24 18:41:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11771985920. Throughput: 0: 42575.0. Samples: 11772122120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 18:41:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718506_11772002304.pth... [2024-06-24 18:41:43,619][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000717884_11761811456.pth [2024-06-24 18:41:45,499][15401] Updated weights for policy 0, policy_version 718512 (0.0024) [2024-06-24 18:41:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11772198912. Throughput: 0: 42505.8. Samples: 11772376060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 18:41:49,512][15401] Updated weights for policy 0, policy_version 718522 (0.0033) [2024-06-24 18:41:53,303][15401] Updated weights for policy 0, policy_version 718532 (0.0036) [2024-06-24 18:41:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.1, 300 sec: 42487.3). Total num frames: 11772428288. Throughput: 0: 42572.2. Samples: 11772501400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 18:41:57,446][15401] Updated weights for policy 0, policy_version 718542 (0.0035) [2024-06-24 18:41:58,303][15349] Signal inference workers to stop experience collection... (174200 times) [2024-06-24 18:41:58,349][15401] InferenceWorker_p0-w0: stopping experience collection (174200 times) [2024-06-24 18:41:58,358][15349] Signal inference workers to resume experience collection... (174200 times) [2024-06-24 18:41:58,367][15401] InferenceWorker_p0-w0: resuming experience collection (174200 times) [2024-06-24 18:41:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11772624896. Throughput: 0: 42411.1. Samples: 11772760220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:41:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 18:42:01,016][15401] Updated weights for policy 0, policy_version 718552 (0.0034) [2024-06-24 18:42:03,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42598.6). Total num frames: 11772837888. Throughput: 0: 42395.1. Samples: 11773014480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:42:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 18:42:05,242][15401] Updated weights for policy 0, policy_version 718562 (0.0035) [2024-06-24 18:42:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 11773050880. Throughput: 0: 42417.7. Samples: 11773142320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:42:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 18:42:08,823][15401] Updated weights for policy 0, policy_version 718572 (0.0028) [2024-06-24 18:42:12,847][15401] Updated weights for policy 0, policy_version 718582 (0.0036) [2024-06-24 18:42:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11773263872. Throughput: 0: 42260.9. Samples: 11773394960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:42:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 18:42:16,305][15401] Updated weights for policy 0, policy_version 718592 (0.0042) [2024-06-24 18:42:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11773493248. Throughput: 0: 42509.3. Samples: 11773652420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:42:18,401][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 18:42:20,498][15401] Updated weights for policy 0, policy_version 718602 (0.0037) [2024-06-24 18:42:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11773706240. Throughput: 0: 42600.4. Samples: 11773784180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:42:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 18:42:24,124][15401] Updated weights for policy 0, policy_version 718612 (0.0040) [2024-06-24 18:42:28,352][15401] Updated weights for policy 0, policy_version 718622 (0.0024) [2024-06-24 18:42:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11773902848. Throughput: 0: 42456.0. Samples: 11774032640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 18:42:31,683][15401] Updated weights for policy 0, policy_version 718632 (0.0044) [2024-06-24 18:42:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11774132224. Throughput: 0: 42457.6. Samples: 11774286660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:33,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 18:42:36,409][15401] Updated weights for policy 0, policy_version 718642 (0.0040) [2024-06-24 18:42:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 11774328832. Throughput: 0: 42571.8. Samples: 11774417120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:38,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 18:42:39,336][15401] Updated weights for policy 0, policy_version 718652 (0.0033) [2024-06-24 18:42:43,392][15132] Fps is (10 sec: 39313.1, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 11774525440. Throughput: 0: 42284.9. Samples: 11774663140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:43,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 18:42:44,077][15401] Updated weights for policy 0, policy_version 718662 (0.0036) [2024-06-24 18:42:47,086][15401] Updated weights for policy 0, policy_version 718672 (0.0032) [2024-06-24 18:42:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11774771200. Throughput: 0: 42386.7. Samples: 11774921880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 18:42:51,737][15401] Updated weights for policy 0, policy_version 718682 (0.0044) [2024-06-24 18:42:53,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.5, 300 sec: 42432.1). Total num frames: 11774967808. Throughput: 0: 42454.4. Samples: 11775052760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:53,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 18:42:54,564][15401] Updated weights for policy 0, policy_version 718692 (0.0036) [2024-06-24 18:42:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11775180800. Throughput: 0: 42413.4. Samples: 11775303560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:42:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 18:42:59,350][15401] Updated weights for policy 0, policy_version 718702 (0.0028) [2024-06-24 18:43:02,343][15401] Updated weights for policy 0, policy_version 718712 (0.0044) [2024-06-24 18:43:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11775393792. Throughput: 0: 42303.7. Samples: 11775556080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 18:43:07,059][15401] Updated weights for policy 0, policy_version 718722 (0.0033) [2024-06-24 18:43:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 11775590400. Throughput: 0: 42213.4. Samples: 11775683780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 18:43:10,119][15401] Updated weights for policy 0, policy_version 718732 (0.0033) [2024-06-24 18:43:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 11775819776. Throughput: 0: 42339.2. Samples: 11775937900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 18:43:14,732][15401] Updated weights for policy 0, policy_version 718742 (0.0035) [2024-06-24 18:43:16,637][15349] Signal inference workers to stop experience collection... (174250 times) [2024-06-24 18:43:16,637][15349] Signal inference workers to resume experience collection... (174250 times) [2024-06-24 18:43:16,653][15401] InferenceWorker_p0-w0: stopping experience collection (174250 times) [2024-06-24 18:43:16,653][15401] InferenceWorker_p0-w0: resuming experience collection (174250 times) [2024-06-24 18:43:17,682][15401] Updated weights for policy 0, policy_version 718752 (0.0042) [2024-06-24 18:43:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11776049152. Throughput: 0: 42344.0. Samples: 11776192140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:18,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 18:43:22,241][15401] Updated weights for policy 0, policy_version 718762 (0.0037) [2024-06-24 18:43:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42050.6, 300 sec: 42375.9). Total num frames: 11776229376. Throughput: 0: 42494.2. Samples: 11776329460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:23,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 18:43:25,194][15401] Updated weights for policy 0, policy_version 718772 (0.0039) [2024-06-24 18:43:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 11776458752. Throughput: 0: 42654.5. Samples: 11776582500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 18:43:29,950][15401] Updated weights for policy 0, policy_version 718782 (0.0030) [2024-06-24 18:43:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 11776671744. Throughput: 0: 42562.2. Samples: 11776837180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 18:43:33,415][15401] Updated weights for policy 0, policy_version 718792 (0.0040) [2024-06-24 18:43:37,698][15401] Updated weights for policy 0, policy_version 718802 (0.0038) [2024-06-24 18:43:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 11776868352. Throughput: 0: 42563.9. Samples: 11776968140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:38,392][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 18:43:40,919][15401] Updated weights for policy 0, policy_version 718812 (0.0031) [2024-06-24 18:43:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42432.1). Total num frames: 11777097728. Throughput: 0: 42774.2. Samples: 11777228400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:43,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 18:43:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718818_11777114112.pth... [2024-06-24 18:43:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718195_11766906880.pth [2024-06-24 18:43:45,605][15401] Updated weights for policy 0, policy_version 718822 (0.0047) [2024-06-24 18:43:48,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11777327104. Throughput: 0: 42703.6. Samples: 11777477740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 18:43:48,487][15401] Updated weights for policy 0, policy_version 718832 (0.0028) [2024-06-24 18:43:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 11777490944. Throughput: 0: 42904.4. Samples: 11777614480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 18:43:53,523][15401] Updated weights for policy 0, policy_version 718842 (0.0027) [2024-06-24 18:43:55,937][15401] Updated weights for policy 0, policy_version 718852 (0.0024) [2024-06-24 18:43:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 11777736704. Throughput: 0: 42927.0. Samples: 11777869620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:43:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 18:44:01,163][15401] Updated weights for policy 0, policy_version 718862 (0.0038) [2024-06-24 18:44:03,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 11777982464. Throughput: 0: 42852.2. Samples: 11778120480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:44:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 18:44:03,418][15401] Updated weights for policy 0, policy_version 718872 (0.0034) [2024-06-24 18:44:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 11778146304. Throughput: 0: 42784.9. Samples: 11778254680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 18:44:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 18:44:08,966][15401] Updated weights for policy 0, policy_version 718882 (0.0033) [2024-06-24 18:44:11,057][15401] Updated weights for policy 0, policy_version 718892 (0.0038) [2024-06-24 18:44:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11778392064. Throughput: 0: 42760.2. Samples: 11778506700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 18:44:16,656][15349] Signal inference workers to stop experience collection... (174300 times) [2024-06-24 18:44:16,658][15349] Signal inference workers to resume experience collection... (174300 times) [2024-06-24 18:44:16,666][15401] Updated weights for policy 0, policy_version 718902 (0.0030) [2024-06-24 18:44:16,676][15401] InferenceWorker_p0-w0: stopping experience collection (174300 times) [2024-06-24 18:44:16,676][15401] InferenceWorker_p0-w0: resuming experience collection (174300 times) [2024-06-24 18:44:18,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11778621440. Throughput: 0: 42750.1. Samples: 11778760940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:18,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 18:44:18,927][15401] Updated weights for policy 0, policy_version 718912 (0.0029) [2024-06-24 18:44:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 11778785280. Throughput: 0: 42711.5. Samples: 11778890160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 18:44:24,277][15401] Updated weights for policy 0, policy_version 718922 (0.0032) [2024-06-24 18:44:26,444][15401] Updated weights for policy 0, policy_version 718932 (0.0036) [2024-06-24 18:44:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 11779031040. Throughput: 0: 42500.5. Samples: 11779140920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 18:44:31,775][15401] Updated weights for policy 0, policy_version 718942 (0.0039) [2024-06-24 18:44:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 11779244032. Throughput: 0: 42762.5. Samples: 11779402060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 18:44:34,013][15401] Updated weights for policy 0, policy_version 718952 (0.0035) [2024-06-24 18:44:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11779424256. Throughput: 0: 42398.6. Samples: 11779522420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 18:44:39,371][15401] Updated weights for policy 0, policy_version 718962 (0.0028) [2024-06-24 18:44:42,043][15401] Updated weights for policy 0, policy_version 718972 (0.0035) [2024-06-24 18:44:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11779686400. Throughput: 0: 42495.1. Samples: 11779781900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 18:44:46,997][15401] Updated weights for policy 0, policy_version 718982 (0.0040) [2024-06-24 18:44:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11779883008. Throughput: 0: 42798.2. Samples: 11780046400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 18:44:49,794][15401] Updated weights for policy 0, policy_version 718992 (0.0045) [2024-06-24 18:44:53,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11780046848. Throughput: 0: 42490.8. Samples: 11780166760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 18:44:54,483][15401] Updated weights for policy 0, policy_version 719002 (0.0034) [2024-06-24 18:44:57,242][15401] Updated weights for policy 0, policy_version 719012 (0.0030) [2024-06-24 18:44:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 11780341760. Throughput: 0: 42631.8. Samples: 11780425140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:44:58,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 18:45:02,380][15401] Updated weights for policy 0, policy_version 719022 (0.0027) [2024-06-24 18:45:03,392][15132] Fps is (10 sec: 47501.8, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 11780521984. Throughput: 0: 42892.4. Samples: 11780691200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:03,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:45:05,211][15401] Updated weights for policy 0, policy_version 719032 (0.0040) [2024-06-24 18:45:08,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42598.5, 300 sec: 42543.0). Total num frames: 11780702208. Throughput: 0: 42619.3. Samples: 11780808020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 18:45:09,882][15401] Updated weights for policy 0, policy_version 719042 (0.0035) [2024-06-24 18:45:12,637][15401] Updated weights for policy 0, policy_version 719052 (0.0036) [2024-06-24 18:45:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11780980736. Throughput: 0: 42831.9. Samples: 11781068360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 18:45:17,488][15401] Updated weights for policy 0, policy_version 719062 (0.0040) [2024-06-24 18:45:18,018][15349] Signal inference workers to stop experience collection... (174350 times) [2024-06-24 18:45:18,025][15349] Signal inference workers to resume experience collection... (174350 times) [2024-06-24 18:45:18,030][15401] InferenceWorker_p0-w0: stopping experience collection (174350 times) [2024-06-24 18:45:18,065][15401] InferenceWorker_p0-w0: resuming experience collection (174350 times) [2024-06-24 18:45:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11781160960. Throughput: 0: 42904.5. Samples: 11781332760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 18:45:20,414][15401] Updated weights for policy 0, policy_version 719072 (0.0042) [2024-06-24 18:45:23,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 11781357568. Throughput: 0: 42840.9. Samples: 11781450260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 18:45:25,336][15401] Updated weights for policy 0, policy_version 719082 (0.0032) [2024-06-24 18:45:28,355][15401] Updated weights for policy 0, policy_version 719092 (0.0044) [2024-06-24 18:45:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11781603328. Throughput: 0: 42769.3. Samples: 11781706520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 18:45:32,942][15401] Updated weights for policy 0, policy_version 719102 (0.0040) [2024-06-24 18:45:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11781783552. Throughput: 0: 42838.7. Samples: 11781974140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 18:45:35,891][15401] Updated weights for policy 0, policy_version 719112 (0.0030) [2024-06-24 18:45:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 11781996544. Throughput: 0: 42691.4. Samples: 11782087880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 18:45:40,854][15401] Updated weights for policy 0, policy_version 719122 (0.0026) [2024-06-24 18:45:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11782242304. Throughput: 0: 42737.5. Samples: 11782348320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 18:45:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719132_11782258688.pth... [2024-06-24 18:45:43,466][15401] Updated weights for policy 0, policy_version 719132 (0.0033) [2024-06-24 18:45:43,510][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718506_11772002304.pth [2024-06-24 18:45:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 11782389760. Throughput: 0: 42799.2. Samples: 11782617060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 18:45:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 18:45:48,559][15401] Updated weights for policy 0, policy_version 719142 (0.0025) [2024-06-24 18:45:51,022][15401] Updated weights for policy 0, policy_version 719152 (0.0033) [2024-06-24 18:45:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 11782651904. Throughput: 0: 42737.3. Samples: 11782731200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:45:53,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-24 18:45:56,124][15401] Updated weights for policy 0, policy_version 719162 (0.0035) [2024-06-24 18:45:58,389][15132] Fps is (10 sec: 49152.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11782881280. Throughput: 0: 42855.2. Samples: 11782996840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:45:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 18:45:58,916][15401] Updated weights for policy 0, policy_version 719172 (0.0032) [2024-06-24 18:46:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 11783045120. Throughput: 0: 42841.8. Samples: 11783260640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 18:46:03,638][15401] Updated weights for policy 0, policy_version 719182 (0.0032) [2024-06-24 18:46:06,638][15401] Updated weights for policy 0, policy_version 719192 (0.0029) [2024-06-24 18:46:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 11783307264. Throughput: 0: 42914.2. Samples: 11783381400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 18:46:11,226][15401] Updated weights for policy 0, policy_version 719202 (0.0035) [2024-06-24 18:46:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11783503872. Throughput: 0: 43119.1. Samples: 11783646880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:13,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 18:46:14,191][15401] Updated weights for policy 0, policy_version 719212 (0.0027) [2024-06-24 18:46:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11783700480. Throughput: 0: 42950.2. Samples: 11783906900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 18:46:18,746][15401] Updated weights for policy 0, policy_version 719222 (0.0027) [2024-06-24 18:46:19,968][15349] Signal inference workers to stop experience collection... (174400 times) [2024-06-24 18:46:19,968][15349] Signal inference workers to resume experience collection... (174400 times) [2024-06-24 18:46:19,984][15401] InferenceWorker_p0-w0: stopping experience collection (174400 times) [2024-06-24 18:46:19,985][15401] InferenceWorker_p0-w0: resuming experience collection (174400 times) [2024-06-24 18:46:21,852][15401] Updated weights for policy 0, policy_version 719232 (0.0028) [2024-06-24 18:46:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 11783962624. Throughput: 0: 43203.7. Samples: 11784032040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:23,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-24 18:46:26,198][15401] Updated weights for policy 0, policy_version 719242 (0.0044) [2024-06-24 18:46:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11784159232. Throughput: 0: 43191.3. Samples: 11784291940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 18:46:29,368][15401] Updated weights for policy 0, policy_version 719252 (0.0034) [2024-06-24 18:46:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11784355840. Throughput: 0: 43033.8. Samples: 11784553580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 18:46:33,697][15401] Updated weights for policy 0, policy_version 719262 (0.0038) [2024-06-24 18:46:36,962][15401] Updated weights for policy 0, policy_version 719272 (0.0034) [2024-06-24 18:46:38,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 11784617984. Throughput: 0: 43290.2. Samples: 11784679260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 18:46:41,414][15401] Updated weights for policy 0, policy_version 719282 (0.0040) [2024-06-24 18:46:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11784814592. Throughput: 0: 43212.3. Samples: 11784941400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 18:46:44,283][15401] Updated weights for policy 0, policy_version 719292 (0.0032) [2024-06-24 18:46:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 11785011200. Throughput: 0: 43358.3. Samples: 11785211760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 18:46:48,757][15401] Updated weights for policy 0, policy_version 719302 (0.0025) [2024-06-24 18:46:51,967][15401] Updated weights for policy 0, policy_version 719312 (0.0027) [2024-06-24 18:46:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 11785273344. Throughput: 0: 43216.0. Samples: 11785326120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 18:46:56,509][15401] Updated weights for policy 0, policy_version 719322 (0.0042) [2024-06-24 18:46:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11785469952. Throughput: 0: 43217.4. Samples: 11785591660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:46:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 18:46:59,411][15401] Updated weights for policy 0, policy_version 719332 (0.0025) [2024-06-24 18:47:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 11785666560. Throughput: 0: 43255.5. Samples: 11785853400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:03,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 18:47:04,026][15401] Updated weights for policy 0, policy_version 719342 (0.0033) [2024-06-24 18:47:06,889][15349] Signal inference workers to stop experience collection... (174450 times) [2024-06-24 18:47:06,921][15401] InferenceWorker_p0-w0: stopping experience collection (174450 times) [2024-06-24 18:47:06,947][15349] Signal inference workers to resume experience collection... (174450 times) [2024-06-24 18:47:06,952][15401] InferenceWorker_p0-w0: resuming experience collection (174450 times) [2024-06-24 18:47:06,955][15401] Updated weights for policy 0, policy_version 719352 (0.0031) [2024-06-24 18:47:08,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42931.3). Total num frames: 11785928704. Throughput: 0: 43110.1. Samples: 11785972100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:08,392][15132] Avg episode reward: [(0, '0.046')] [2024-06-24 18:47:11,985][15401] Updated weights for policy 0, policy_version 719362 (0.0032) [2024-06-24 18:47:13,390][15132] Fps is (10 sec: 44234.7, 60 sec: 43417.3, 300 sec: 42765.0). Total num frames: 11786108928. Throughput: 0: 43216.5. Samples: 11786236700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:13,393][15132] Avg episode reward: [(0, '0.234')] [2024-06-24 18:47:14,478][15401] Updated weights for policy 0, policy_version 719372 (0.0050) [2024-06-24 18:47:18,392][15132] Fps is (10 sec: 36044.3, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 11786289152. Throughput: 0: 43159.3. Samples: 11786495860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:18,393][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 18:47:19,520][15401] Updated weights for policy 0, policy_version 719382 (0.0027) [2024-06-24 18:47:22,305][15401] Updated weights for policy 0, policy_version 719392 (0.0026) [2024-06-24 18:47:23,390][15132] Fps is (10 sec: 45877.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11786567680. Throughput: 0: 43155.9. Samples: 11786621280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 18:47:27,664][15401] Updated weights for policy 0, policy_version 719402 (0.0038) [2024-06-24 18:47:28,389][15132] Fps is (10 sec: 42609.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 11786715136. Throughput: 0: 43091.2. Samples: 11786880500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 18:47:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 18:47:29,731][15401] Updated weights for policy 0, policy_version 719412 (0.0023) [2024-06-24 18:47:33,389][15132] Fps is (10 sec: 37683.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11786944512. Throughput: 0: 42679.5. Samples: 11787132340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:33,396][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 18:47:35,256][15401] Updated weights for policy 0, policy_version 719422 (0.0034) [2024-06-24 18:47:37,395][15401] Updated weights for policy 0, policy_version 719432 (0.0035) [2024-06-24 18:47:38,389][15132] Fps is (10 sec: 50790.2, 60 sec: 43417.6, 300 sec: 43043.1). Total num frames: 11787223040. Throughput: 0: 43009.4. Samples: 11787261540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 18:47:42,815][15401] Updated weights for policy 0, policy_version 719442 (0.0026) [2024-06-24 18:47:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11787370496. Throughput: 0: 42927.5. Samples: 11787523400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:43,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 18:47:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719444_11787370496.pth... [2024-06-24 18:47:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000718818_11777114112.pth [2024-06-24 18:47:45,406][15401] Updated weights for policy 0, policy_version 719452 (0.0026) [2024-06-24 18:47:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11787599872. Throughput: 0: 42499.2. Samples: 11787765860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 18:47:50,650][15401] Updated weights for policy 0, policy_version 719462 (0.0030) [2024-06-24 18:47:53,177][15401] Updated weights for policy 0, policy_version 719472 (0.0040) [2024-06-24 18:47:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11787845632. Throughput: 0: 42821.3. Samples: 11787898960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 18:47:58,102][15401] Updated weights for policy 0, policy_version 719482 (0.0033) [2024-06-24 18:47:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11788009472. Throughput: 0: 42720.5. Samples: 11788159100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:47:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 18:48:00,622][15401] Updated weights for policy 0, policy_version 719492 (0.0044) [2024-06-24 18:48:02,957][15349] Signal inference workers to stop experience collection... (174500 times) [2024-06-24 18:48:03,007][15401] InferenceWorker_p0-w0: stopping experience collection (174500 times) [2024-06-24 18:48:03,070][15349] Signal inference workers to resume experience collection... (174500 times) [2024-06-24 18:48:03,071][15401] InferenceWorker_p0-w0: resuming experience collection (174500 times) [2024-06-24 18:48:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11788255232. Throughput: 0: 42607.8. Samples: 11788413100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:03,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 18:48:05,545][15401] Updated weights for policy 0, policy_version 719502 (0.0041) [2024-06-24 18:48:08,378][15401] Updated weights for policy 0, policy_version 719512 (0.0033) [2024-06-24 18:48:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 11788484608. Throughput: 0: 42740.0. Samples: 11788544580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:08,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 18:48:12,955][15401] Updated weights for policy 0, policy_version 719522 (0.0032) [2024-06-24 18:48:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.8, 300 sec: 42820.6). Total num frames: 11788681216. Throughput: 0: 42792.8. Samples: 11788806180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:13,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 18:48:15,881][15401] Updated weights for policy 0, policy_version 719532 (0.0031) [2024-06-24 18:48:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43692.6, 300 sec: 42987.5). Total num frames: 11788910592. Throughput: 0: 42881.4. Samples: 11789062000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 18:48:20,439][15401] Updated weights for policy 0, policy_version 719542 (0.0040) [2024-06-24 18:48:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 11789123584. Throughput: 0: 42923.1. Samples: 11789193080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 18:48:23,465][15401] Updated weights for policy 0, policy_version 719552 (0.0036) [2024-06-24 18:48:27,914][15401] Updated weights for policy 0, policy_version 719562 (0.0029) [2024-06-24 18:48:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11789320192. Throughput: 0: 42879.2. Samples: 11789452960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 18:48:31,335][15401] Updated weights for policy 0, policy_version 719572 (0.0026) [2024-06-24 18:48:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 11789565952. Throughput: 0: 43111.0. Samples: 11789705860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 18:48:35,313][15401] Updated weights for policy 0, policy_version 719582 (0.0035) [2024-06-24 18:48:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 11789762560. Throughput: 0: 43193.0. Samples: 11789842640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 18:48:38,820][15401] Updated weights for policy 0, policy_version 719592 (0.0044) [2024-06-24 18:48:42,785][15401] Updated weights for policy 0, policy_version 719602 (0.0034) [2024-06-24 18:48:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11789975552. Throughput: 0: 43167.1. Samples: 11790101620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 18:48:46,623][15401] Updated weights for policy 0, policy_version 719612 (0.0030) [2024-06-24 18:48:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 43098.2). Total num frames: 11790204928. Throughput: 0: 43249.3. Samples: 11790359320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 18:48:50,229][15401] Updated weights for policy 0, policy_version 719622 (0.0034) [2024-06-24 18:48:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11790417920. Throughput: 0: 43197.3. Samples: 11790488460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 18:48:54,235][15401] Updated weights for policy 0, policy_version 719632 (0.0031) [2024-06-24 18:48:58,155][15401] Updated weights for policy 0, policy_version 719642 (0.0034) [2024-06-24 18:48:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 11790630912. Throughput: 0: 43206.2. Samples: 11790750460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:48:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 18:49:02,138][15401] Updated weights for policy 0, policy_version 719652 (0.0035) [2024-06-24 18:49:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11790843904. Throughput: 0: 43082.1. Samples: 11791000700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:49:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 18:49:05,612][15401] Updated weights for policy 0, policy_version 719662 (0.0032) [2024-06-24 18:49:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 11791056896. Throughput: 0: 43191.6. Samples: 11791136700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:49:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 18:49:09,639][15401] Updated weights for policy 0, policy_version 719672 (0.0033) [2024-06-24 18:49:13,331][15401] Updated weights for policy 0, policy_version 719682 (0.0031) [2024-06-24 18:49:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11791269888. Throughput: 0: 42992.9. Samples: 11791387640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 18:49:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 18:49:17,275][15401] Updated weights for policy 0, policy_version 719692 (0.0035) [2024-06-24 18:49:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 11791482880. Throughput: 0: 43168.4. Samples: 11791648440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 18:49:20,866][15401] Updated weights for policy 0, policy_version 719702 (0.0038) [2024-06-24 18:49:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11791695872. Throughput: 0: 42968.8. Samples: 11791776240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:23,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 18:49:24,948][15401] Updated weights for policy 0, policy_version 719712 (0.0031) [2024-06-24 18:49:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 11791908864. Throughput: 0: 43039.1. Samples: 11792038480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:28,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 18:49:28,471][15401] Updated weights for policy 0, policy_version 719722 (0.0029) [2024-06-24 18:49:32,916][15401] Updated weights for policy 0, policy_version 719732 (0.0024) [2024-06-24 18:49:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 11792105472. Throughput: 0: 42882.7. Samples: 11792289040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 18:49:33,645][15349] Signal inference workers to stop experience collection... (174550 times) [2024-06-24 18:49:33,645][15349] Signal inference workers to resume experience collection... (174550 times) [2024-06-24 18:49:33,678][15401] InferenceWorker_p0-w0: stopping experience collection (174550 times) [2024-06-24 18:49:33,678][15401] InferenceWorker_p0-w0: resuming experience collection (174550 times) [2024-06-24 18:49:35,994][15401] Updated weights for policy 0, policy_version 719742 (0.0038) [2024-06-24 18:49:38,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11792334848. Throughput: 0: 42716.0. Samples: 11792410680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 18:49:40,466][15401] Updated weights for policy 0, policy_version 719752 (0.0031) [2024-06-24 18:49:43,394][15132] Fps is (10 sec: 45852.7, 60 sec: 43141.0, 300 sec: 42986.4). Total num frames: 11792564224. Throughput: 0: 42728.6. Samples: 11792673460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:43,395][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 18:49:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719762_11792580608.pth... [2024-06-24 18:49:43,533][15401] Updated weights for policy 0, policy_version 719762 (0.0039) [2024-06-24 18:49:43,571][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719132_11782258688.pth [2024-06-24 18:49:48,000][15401] Updated weights for policy 0, policy_version 719772 (0.0023) [2024-06-24 18:49:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 43098.2). Total num frames: 11792760832. Throughput: 0: 42947.1. Samples: 11792933320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:48,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 18:49:51,138][15401] Updated weights for policy 0, policy_version 719782 (0.0037) [2024-06-24 18:49:53,389][15132] Fps is (10 sec: 42619.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11792990208. Throughput: 0: 42648.9. Samples: 11793055900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:53,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-24 18:49:55,707][15401] Updated weights for policy 0, policy_version 719792 (0.0036) [2024-06-24 18:49:58,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.8, 300 sec: 43042.7). Total num frames: 11793219584. Throughput: 0: 42894.6. Samples: 11793318000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:49:58,393][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 18:49:58,745][15401] Updated weights for policy 0, policy_version 719802 (0.0035) [2024-06-24 18:50:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42987.1). Total num frames: 11793383424. Throughput: 0: 42908.4. Samples: 11793579320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 18:50:03,909][15401] Updated weights for policy 0, policy_version 719812 (0.0034) [2024-06-24 18:50:06,407][15401] Updated weights for policy 0, policy_version 719822 (0.0045) [2024-06-24 18:50:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11793629184. Throughput: 0: 42620.6. Samples: 11793694160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 18:50:11,387][15401] Updated weights for policy 0, policy_version 719832 (0.0031) [2024-06-24 18:50:13,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11793858560. Throughput: 0: 42779.6. Samples: 11793963460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:13,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 18:50:14,325][15401] Updated weights for policy 0, policy_version 719842 (0.0026) [2024-06-24 18:50:18,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 11794006016. Throughput: 0: 42891.6. Samples: 11794219160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 18:50:19,162][15401] Updated weights for policy 0, policy_version 719852 (0.0032) [2024-06-24 18:50:21,803][15401] Updated weights for policy 0, policy_version 719862 (0.0040) [2024-06-24 18:50:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11794268160. Throughput: 0: 42774.7. Samples: 11794335540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 18:50:26,578][15401] Updated weights for policy 0, policy_version 719872 (0.0037) [2024-06-24 18:50:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 11794464768. Throughput: 0: 42701.6. Samples: 11794594820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:28,395][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 18:50:29,865][15401] Updated weights for policy 0, policy_version 719882 (0.0031) [2024-06-24 18:50:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 11794661376. Throughput: 0: 42585.0. Samples: 11794849640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:33,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 18:50:34,322][15401] Updated weights for policy 0, policy_version 719892 (0.0026) [2024-06-24 18:50:37,449][15401] Updated weights for policy 0, policy_version 719902 (0.0040) [2024-06-24 18:50:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11794907136. Throughput: 0: 42697.7. Samples: 11794977300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:38,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 18:50:41,755][15401] Updated weights for policy 0, policy_version 719912 (0.0026) [2024-06-24 18:50:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42328.8, 300 sec: 43098.2). Total num frames: 11795103744. Throughput: 0: 42572.9. Samples: 11795233680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 18:50:45,035][15401] Updated weights for policy 0, policy_version 719922 (0.0033) [2024-06-24 18:50:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 11795316736. Throughput: 0: 42588.2. Samples: 11795495780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 18:50:49,356][15401] Updated weights for policy 0, policy_version 719932 (0.0042) [2024-06-24 18:50:52,547][15401] Updated weights for policy 0, policy_version 719942 (0.0032) [2024-06-24 18:50:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11795562496. Throughput: 0: 42912.4. Samples: 11795625220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 18:50:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 18:50:56,992][15401] Updated weights for policy 0, policy_version 719952 (0.0028) [2024-06-24 18:50:58,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42325.3, 300 sec: 43097.9). Total num frames: 11795759104. Throughput: 0: 42635.5. Samples: 11795882160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:50:58,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 18:51:00,441][15401] Updated weights for policy 0, policy_version 719962 (0.0045) [2024-06-24 18:51:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11795955712. Throughput: 0: 42679.1. Samples: 11796139720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:03,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-24 18:51:04,470][15401] Updated weights for policy 0, policy_version 719972 (0.0039) [2024-06-24 18:51:07,982][15401] Updated weights for policy 0, policy_version 719982 (0.0031) [2024-06-24 18:51:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 11796201472. Throughput: 0: 42956.1. Samples: 11796268560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 18:51:12,181][15401] Updated weights for policy 0, policy_version 719992 (0.0045) [2024-06-24 18:51:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 11796398080. Throughput: 0: 42927.2. Samples: 11796526540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 18:51:13,982][15349] Signal inference workers to stop experience collection... (174600 times) [2024-06-24 18:51:14,016][15401] InferenceWorker_p0-w0: stopping experience collection (174600 times) [2024-06-24 18:51:14,038][15349] Signal inference workers to resume experience collection... (174600 times) [2024-06-24 18:51:14,039][15401] InferenceWorker_p0-w0: resuming experience collection (174600 times) [2024-06-24 18:51:15,653][15401] Updated weights for policy 0, policy_version 720002 (0.0033) [2024-06-24 18:51:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11796611072. Throughput: 0: 42907.9. Samples: 11796780500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:18,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 18:51:20,015][15401] Updated weights for policy 0, policy_version 720012 (0.0028) [2024-06-24 18:51:23,242][15401] Updated weights for policy 0, policy_version 720022 (0.0043) [2024-06-24 18:51:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11796856832. Throughput: 0: 42967.2. Samples: 11796910820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 18:51:27,563][15401] Updated weights for policy 0, policy_version 720032 (0.0037) [2024-06-24 18:51:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 11797053440. Throughput: 0: 43195.1. Samples: 11797177460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 18:51:30,670][15401] Updated weights for policy 0, policy_version 720042 (0.0024) [2024-06-24 18:51:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 11797282816. Throughput: 0: 42926.6. Samples: 11797427480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 18:51:35,129][15401] Updated weights for policy 0, policy_version 720052 (0.0038) [2024-06-24 18:51:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11797479424. Throughput: 0: 42968.8. Samples: 11797558820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:38,395][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 18:51:38,764][15401] Updated weights for policy 0, policy_version 720062 (0.0037) [2024-06-24 18:51:42,754][15401] Updated weights for policy 0, policy_version 720072 (0.0029) [2024-06-24 18:51:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 11797692416. Throughput: 0: 43018.3. Samples: 11797817880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:51:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720074_11797692416.pth... [2024-06-24 18:51:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719444_11787370496.pth [2024-06-24 18:51:46,690][15401] Updated weights for policy 0, policy_version 720082 (0.0033) [2024-06-24 18:51:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 11797938176. Throughput: 0: 42785.3. Samples: 11798065060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 18:51:50,292][15401] Updated weights for policy 0, policy_version 720092 (0.0043) [2024-06-24 18:51:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11798118400. Throughput: 0: 42975.9. Samples: 11798202480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 18:51:54,250][15401] Updated weights for policy 0, policy_version 720102 (0.0026) [2024-06-24 18:51:57,836][15401] Updated weights for policy 0, policy_version 720112 (0.0037) [2024-06-24 18:51:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 11798331392. Throughput: 0: 42886.6. Samples: 11798456440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:51:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 18:52:02,087][15401] Updated weights for policy 0, policy_version 720122 (0.0029) [2024-06-24 18:52:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42820.9). Total num frames: 11798560768. Throughput: 0: 42785.3. Samples: 11798705840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 18:52:05,353][15401] Updated weights for policy 0, policy_version 720132 (0.0026) [2024-06-24 18:52:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11798740992. Throughput: 0: 42859.6. Samples: 11798839500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:08,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 18:52:09,548][15401] Updated weights for policy 0, policy_version 720142 (0.0032) [2024-06-24 18:52:13,048][15401] Updated weights for policy 0, policy_version 720152 (0.0029) [2024-06-24 18:52:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 11798986752. Throughput: 0: 42603.9. Samples: 11799094640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 18:52:17,170][15401] Updated weights for policy 0, policy_version 720162 (0.0039) [2024-06-24 18:52:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11799199744. Throughput: 0: 42747.6. Samples: 11799351120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 18:52:20,579][15401] Updated weights for policy 0, policy_version 720172 (0.0036) [2024-06-24 18:52:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 11799396352. Throughput: 0: 42757.0. Samples: 11799482880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 18:52:24,703][15401] Updated weights for policy 0, policy_version 720182 (0.0033) [2024-06-24 18:52:28,122][15401] Updated weights for policy 0, policy_version 720192 (0.0032) [2024-06-24 18:52:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 11799642112. Throughput: 0: 42723.6. Samples: 11799740440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 18:52:31,227][15349] Signal inference workers to stop experience collection... (174650 times) [2024-06-24 18:52:31,229][15349] Signal inference workers to resume experience collection... (174650 times) [2024-06-24 18:52:31,246][15401] InferenceWorker_p0-w0: stopping experience collection (174650 times) [2024-06-24 18:52:31,280][15401] InferenceWorker_p0-w0: resuming experience collection (174650 times) [2024-06-24 18:52:32,211][15401] Updated weights for policy 0, policy_version 720202 (0.0040) [2024-06-24 18:52:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11799855104. Throughput: 0: 42958.6. Samples: 11799998200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-24 18:52:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 18:52:35,541][15401] Updated weights for policy 0, policy_version 720212 (0.0042) [2024-06-24 18:52:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 11800018944. Throughput: 0: 42890.8. Samples: 11800132560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:52:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 18:52:40,027][15401] Updated weights for policy 0, policy_version 720222 (0.0029) [2024-06-24 18:52:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11800264704. Throughput: 0: 42835.1. Samples: 11800384020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:52:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 18:52:43,789][15401] Updated weights for policy 0, policy_version 720232 (0.0035) [2024-06-24 18:52:47,929][15401] Updated weights for policy 0, policy_version 720242 (0.0031) [2024-06-24 18:52:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11800477696. Throughput: 0: 42997.9. Samples: 11800640740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:52:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 18:52:51,318][15401] Updated weights for policy 0, policy_version 720252 (0.0032) [2024-06-24 18:52:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11800657920. Throughput: 0: 42833.6. Samples: 11800767020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:52:53,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 18:52:55,593][15401] Updated weights for policy 0, policy_version 720262 (0.0026) [2024-06-24 18:52:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11800903680. Throughput: 0: 42766.3. Samples: 11801019120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:52:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 18:52:58,786][15401] Updated weights for policy 0, policy_version 720272 (0.0039) [2024-06-24 18:53:03,219][15401] Updated weights for policy 0, policy_version 720282 (0.0040) [2024-06-24 18:53:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11801116672. Throughput: 0: 43022.2. Samples: 11801287120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 18:53:06,243][15401] Updated weights for policy 0, policy_version 720292 (0.0026) [2024-06-24 18:53:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11801313280. Throughput: 0: 42780.8. Samples: 11801408120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:08,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 18:53:10,823][15401] Updated weights for policy 0, policy_version 720302 (0.0053) [2024-06-24 18:53:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11801559040. Throughput: 0: 42682.1. Samples: 11801661140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 18:53:13,744][15401] Updated weights for policy 0, policy_version 720312 (0.0025) [2024-06-24 18:53:18,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11801739264. Throughput: 0: 42872.9. Samples: 11801927480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 18:53:18,553][15401] Updated weights for policy 0, policy_version 720322 (0.0042) [2024-06-24 18:53:21,449][15401] Updated weights for policy 0, policy_version 720332 (0.0036) [2024-06-24 18:53:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11801968640. Throughput: 0: 42554.6. Samples: 11802047520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 18:53:26,120][15401] Updated weights for policy 0, policy_version 720342 (0.0045) [2024-06-24 18:53:28,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11802214400. Throughput: 0: 42680.1. Samples: 11802304620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 18:53:28,833][15401] Updated weights for policy 0, policy_version 720352 (0.0033) [2024-06-24 18:53:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 11802378240. Throughput: 0: 42884.5. Samples: 11802570540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 18:53:33,696][15401] Updated weights for policy 0, policy_version 720362 (0.0028) [2024-06-24 18:53:33,698][15349] Signal inference workers to stop experience collection... (174700 times) [2024-06-24 18:53:33,699][15349] Signal inference workers to resume experience collection... (174700 times) [2024-06-24 18:53:33,738][15401] InferenceWorker_p0-w0: stopping experience collection (174700 times) [2024-06-24 18:53:33,739][15401] InferenceWorker_p0-w0: resuming experience collection (174700 times) [2024-06-24 18:53:36,735][15401] Updated weights for policy 0, policy_version 720372 (0.0024) [2024-06-24 18:53:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 11802607616. Throughput: 0: 42595.9. Samples: 11802683840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 18:53:41,293][15401] Updated weights for policy 0, policy_version 720382 (0.0033) [2024-06-24 18:53:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11802836992. Throughput: 0: 42820.3. Samples: 11802946040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 18:53:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720388_11802836992.pth... [2024-06-24 18:53:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000719762_11792580608.pth [2024-06-24 18:53:44,280][15401] Updated weights for policy 0, policy_version 720392 (0.0036) [2024-06-24 18:53:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11803017216. Throughput: 0: 42754.1. Samples: 11803211060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 18:53:48,773][15401] Updated weights for policy 0, policy_version 720402 (0.0037) [2024-06-24 18:53:51,797][15401] Updated weights for policy 0, policy_version 720412 (0.0037) [2024-06-24 18:53:53,393][15132] Fps is (10 sec: 42583.3, 60 sec: 43415.1, 300 sec: 42820.0). Total num frames: 11803262976. Throughput: 0: 42688.7. Samples: 11803329160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:53,394][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 18:53:56,855][15401] Updated weights for policy 0, policy_version 720422 (0.0043) [2024-06-24 18:53:58,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11803492352. Throughput: 0: 42996.6. Samples: 11803595980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:53:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 18:53:59,711][15401] Updated weights for policy 0, policy_version 720432 (0.0029) [2024-06-24 18:54:03,392][15132] Fps is (10 sec: 39326.2, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11803656192. Throughput: 0: 42921.2. Samples: 11803859040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:54:03,393][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 18:54:04,471][15401] Updated weights for policy 0, policy_version 720442 (0.0031) [2024-06-24 18:54:07,527][15401] Updated weights for policy 0, policy_version 720452 (0.0036) [2024-06-24 18:54:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 11803918336. Throughput: 0: 42751.9. Samples: 11803971360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:54:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 18:54:12,177][15401] Updated weights for policy 0, policy_version 720462 (0.0033) [2024-06-24 18:54:13,392][15132] Fps is (10 sec: 45875.3, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 11804114944. Throughput: 0: 42720.8. Samples: 11804227160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:54:13,392][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 18:54:15,479][15401] Updated weights for policy 0, policy_version 720472 (0.0037) [2024-06-24 18:54:18,389][15132] Fps is (10 sec: 36045.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11804278784. Throughput: 0: 42639.5. Samples: 11804489320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-24 18:54:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 18:54:19,876][15401] Updated weights for policy 0, policy_version 720482 (0.0039) [2024-06-24 18:54:22,862][15401] Updated weights for policy 0, policy_version 720492 (0.0030) [2024-06-24 18:54:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 11804557312. Throughput: 0: 42821.8. Samples: 11804610820. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 18:54:27,579][15401] Updated weights for policy 0, policy_version 720502 (0.0041) [2024-06-24 18:54:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 11804753920. Throughput: 0: 42777.5. Samples: 11804871020. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 18:54:30,593][15401] Updated weights for policy 0, policy_version 720512 (0.0029) [2024-06-24 18:54:33,389][15132] Fps is (10 sec: 36045.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11804917760. Throughput: 0: 42617.0. Samples: 11805128820. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 18:54:35,410][15401] Updated weights for policy 0, policy_version 720522 (0.0035) [2024-06-24 18:54:38,171][15401] Updated weights for policy 0, policy_version 720532 (0.0029) [2024-06-24 18:54:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.6, 300 sec: 42821.3). Total num frames: 11805196288. Throughput: 0: 42697.5. Samples: 11805250400. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:38,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 18:54:42,967][15401] Updated weights for policy 0, policy_version 720542 (0.0048) [2024-06-24 18:54:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11805376512. Throughput: 0: 42526.7. Samples: 11805509680. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 18:54:46,118][15401] Updated weights for policy 0, policy_version 720552 (0.0034) [2024-06-24 18:54:48,391][15132] Fps is (10 sec: 37677.2, 60 sec: 42597.2, 300 sec: 42653.7). Total num frames: 11805573120. Throughput: 0: 42483.8. Samples: 11805770780. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:48,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 18:54:49,188][15349] Signal inference workers to stop experience collection... (174750 times) [2024-06-24 18:54:49,189][15349] Signal inference workers to resume experience collection... (174750 times) [2024-06-24 18:54:49,225][15401] InferenceWorker_p0-w0: stopping experience collection (174750 times) [2024-06-24 18:54:49,225][15401] InferenceWorker_p0-w0: resuming experience collection (174750 times) [2024-06-24 18:54:50,590][15401] Updated weights for policy 0, policy_version 720562 (0.0039) [2024-06-24 18:54:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42874.0, 300 sec: 42765.4). Total num frames: 11805835264. Throughput: 0: 42745.3. Samples: 11805894900. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 18:54:53,548][15401] Updated weights for policy 0, policy_version 720572 (0.0032) [2024-06-24 18:54:58,344][15401] Updated weights for policy 0, policy_version 720582 (0.0038) [2024-06-24 18:54:58,390][15132] Fps is (10 sec: 44243.9, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 11806015488. Throughput: 0: 42848.0. Samples: 11806155220. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:54:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 18:55:01,122][15401] Updated weights for policy 0, policy_version 720592 (0.0036) [2024-06-24 18:55:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11806228480. Throughput: 0: 42693.3. Samples: 11806410520. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 18:55:05,881][15401] Updated weights for policy 0, policy_version 720602 (0.0038) [2024-06-24 18:55:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11806474240. Throughput: 0: 42828.0. Samples: 11806538080. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 18:55:09,010][15401] Updated weights for policy 0, policy_version 720612 (0.0033) [2024-06-24 18:55:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 11806654464. Throughput: 0: 42849.6. Samples: 11806799260. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 18:55:13,635][15401] Updated weights for policy 0, policy_version 720622 (0.0022) [2024-06-24 18:55:16,997][15401] Updated weights for policy 0, policy_version 720632 (0.0047) [2024-06-24 18:55:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11806883840. Throughput: 0: 42620.8. Samples: 11807046760. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 18:55:21,262][15401] Updated weights for policy 0, policy_version 720642 (0.0043) [2024-06-24 18:55:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11807113216. Throughput: 0: 42839.1. Samples: 11807178160. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 18:55:24,886][15401] Updated weights for policy 0, policy_version 720652 (0.0041) [2024-06-24 18:55:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11807309824. Throughput: 0: 42799.5. Samples: 11807435660. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:28,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 18:55:28,765][15401] Updated weights for policy 0, policy_version 720662 (0.0042) [2024-06-24 18:55:32,715][15401] Updated weights for policy 0, policy_version 720672 (0.0022) [2024-06-24 18:55:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11807522816. Throughput: 0: 42606.1. Samples: 11807687980. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:33,398][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 18:55:36,241][15401] Updated weights for policy 0, policy_version 720682 (0.0041) [2024-06-24 18:55:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 11807752192. Throughput: 0: 42698.7. Samples: 11807816440. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:38,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 18:55:40,727][15401] Updated weights for policy 0, policy_version 720692 (0.0031) [2024-06-24 18:55:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11807932416. Throughput: 0: 42627.2. Samples: 11808073440. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 18:55:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720700_11807948800.pth... [2024-06-24 18:55:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720074_11797692416.pth [2024-06-24 18:55:44,043][15401] Updated weights for policy 0, policy_version 720702 (0.0032) [2024-06-24 18:55:48,358][15401] Updated weights for policy 0, policy_version 720712 (0.0039) [2024-06-24 18:55:48,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42872.7, 300 sec: 42653.9). Total num frames: 11808145408. Throughput: 0: 42757.3. Samples: 11808334600. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 18:55:51,685][15401] Updated weights for policy 0, policy_version 720722 (0.0028) [2024-06-24 18:55:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 11808391168. Throughput: 0: 42683.3. Samples: 11808458820. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:53,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 18:55:55,981][15401] Updated weights for policy 0, policy_version 720732 (0.0041) [2024-06-24 18:55:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11808587776. Throughput: 0: 42582.8. Samples: 11808715480. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-24 18:55:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 18:55:59,357][15401] Updated weights for policy 0, policy_version 720742 (0.0028) [2024-06-24 18:56:00,768][15349] Signal inference workers to stop experience collection... (174800 times) [2024-06-24 18:56:00,769][15349] Signal inference workers to resume experience collection... (174800 times) [2024-06-24 18:56:00,781][15401] InferenceWorker_p0-w0: stopping experience collection (174800 times) [2024-06-24 18:56:00,790][15401] InferenceWorker_p0-w0: resuming experience collection (174800 times) [2024-06-24 18:56:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11808784384. Throughput: 0: 42723.5. Samples: 11808969320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 18:56:03,943][15401] Updated weights for policy 0, policy_version 720752 (0.0038) [2024-06-24 18:56:06,915][15401] Updated weights for policy 0, policy_version 720762 (0.0041) [2024-06-24 18:56:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 11809013760. Throughput: 0: 42636.0. Samples: 11809096880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:08,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 18:56:11,617][15401] Updated weights for policy 0, policy_version 720772 (0.0046) [2024-06-24 18:56:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11809226752. Throughput: 0: 42743.0. Samples: 11809359100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:13,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-24 18:56:14,599][15401] Updated weights for policy 0, policy_version 720782 (0.0034) [2024-06-24 18:56:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11809439744. Throughput: 0: 42665.3. Samples: 11809607920. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:18,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 18:56:19,501][15401] Updated weights for policy 0, policy_version 720792 (0.0034) [2024-06-24 18:56:22,118][15401] Updated weights for policy 0, policy_version 720802 (0.0026) [2024-06-24 18:56:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11809685504. Throughput: 0: 42791.6. Samples: 11809741960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 18:56:26,970][15401] Updated weights for policy 0, policy_version 720812 (0.0030) [2024-06-24 18:56:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11809865728. Throughput: 0: 42879.1. Samples: 11810003000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 18:56:29,687][15401] Updated weights for policy 0, policy_version 720822 (0.0029) [2024-06-24 18:56:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11810095104. Throughput: 0: 42694.6. Samples: 11810255860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 18:56:34,298][15401] Updated weights for policy 0, policy_version 720832 (0.0043) [2024-06-24 18:56:37,809][15401] Updated weights for policy 0, policy_version 720842 (0.0028) [2024-06-24 18:56:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11810308096. Throughput: 0: 42862.7. Samples: 11810387640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:38,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 18:56:41,736][15401] Updated weights for policy 0, policy_version 720852 (0.0028) [2024-06-24 18:56:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11810504704. Throughput: 0: 43003.9. Samples: 11810650660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 18:56:45,505][15401] Updated weights for policy 0, policy_version 720862 (0.0038) [2024-06-24 18:56:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11810734080. Throughput: 0: 42844.1. Samples: 11810897300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 18:56:49,251][15401] Updated weights for policy 0, policy_version 720872 (0.0034) [2024-06-24 18:56:53,087][15401] Updated weights for policy 0, policy_version 720882 (0.0041) [2024-06-24 18:56:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11810947072. Throughput: 0: 42949.9. Samples: 11811029520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 18:56:56,694][15401] Updated weights for policy 0, policy_version 720892 (0.0055) [2024-06-24 18:56:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11811143680. Throughput: 0: 42780.5. Samples: 11811284220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:56:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 18:57:00,650][15401] Updated weights for policy 0, policy_version 720902 (0.0027) [2024-06-24 18:57:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 11811389440. Throughput: 0: 42967.1. Samples: 11811541540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:03,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 18:57:04,153][15401] Updated weights for policy 0, policy_version 720912 (0.0032) [2024-06-24 18:57:08,252][15401] Updated weights for policy 0, policy_version 720922 (0.0033) [2024-06-24 18:57:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 11811602432. Throughput: 0: 42949.9. Samples: 11811674700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 18:57:11,938][15401] Updated weights for policy 0, policy_version 720932 (0.0030) [2024-06-24 18:57:13,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11811815424. Throughput: 0: 42799.4. Samples: 11811928980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 18:57:15,729][15401] Updated weights for policy 0, policy_version 720942 (0.0039) [2024-06-24 18:57:16,480][15349] Signal inference workers to stop experience collection... (174850 times) [2024-06-24 18:57:16,480][15349] Signal inference workers to resume experience collection... (174850 times) [2024-06-24 18:57:16,532][15401] InferenceWorker_p0-w0: stopping experience collection (174850 times) [2024-06-24 18:57:16,532][15401] InferenceWorker_p0-w0: resuming experience collection (174850 times) [2024-06-24 18:57:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11812012032. Throughput: 0: 42925.4. Samples: 11812187500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 18:57:19,685][15401] Updated weights for policy 0, policy_version 720952 (0.0029) [2024-06-24 18:57:23,372][15401] Updated weights for policy 0, policy_version 720962 (0.0033) [2024-06-24 18:57:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11812241408. Throughput: 0: 42876.7. Samples: 11812317200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:23,393][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 18:57:27,113][15401] Updated weights for policy 0, policy_version 720972 (0.0029) [2024-06-24 18:57:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11812454400. Throughput: 0: 42748.0. Samples: 11812574320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:28,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 18:57:30,999][15401] Updated weights for policy 0, policy_version 720982 (0.0036) [2024-06-24 18:57:33,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11812667392. Throughput: 0: 42988.0. Samples: 11812831760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 18:57:34,639][15401] Updated weights for policy 0, policy_version 720992 (0.0033) [2024-06-24 18:57:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11812880384. Throughput: 0: 42895.0. Samples: 11812959800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 18:57:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 18:57:38,593][15401] Updated weights for policy 0, policy_version 721002 (0.0035) [2024-06-24 18:57:42,591][15401] Updated weights for policy 0, policy_version 721012 (0.0034) [2024-06-24 18:57:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 11813109760. Throughput: 0: 42996.0. Samples: 11813219040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:57:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 18:57:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721015_11813109760.pth... [2024-06-24 18:57:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720388_11802836992.pth [2024-06-24 18:57:46,324][15401] Updated weights for policy 0, policy_version 721022 (0.0023) [2024-06-24 18:57:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11813306368. Throughput: 0: 42884.1. Samples: 11813471220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:57:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 18:57:50,323][15401] Updated weights for policy 0, policy_version 721032 (0.0042) [2024-06-24 18:57:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11813502976. Throughput: 0: 42663.5. Samples: 11813594560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:57:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 18:57:54,364][15401] Updated weights for policy 0, policy_version 721042 (0.0025) [2024-06-24 18:57:57,821][15401] Updated weights for policy 0, policy_version 721052 (0.0046) [2024-06-24 18:57:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11813732352. Throughput: 0: 42842.7. Samples: 11813856900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:57:58,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-24 18:58:02,182][15401] Updated weights for policy 0, policy_version 721062 (0.0039) [2024-06-24 18:58:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42873.1, 300 sec: 42876.4). Total num frames: 11813961728. Throughput: 0: 42708.7. Samples: 11814109400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 18:58:05,358][15401] Updated weights for policy 0, policy_version 721072 (0.0040) [2024-06-24 18:58:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11814141952. Throughput: 0: 42654.3. Samples: 11814236540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 18:58:09,756][15401] Updated weights for policy 0, policy_version 721082 (0.0029) [2024-06-24 18:58:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11814354944. Throughput: 0: 42645.7. Samples: 11814493380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:13,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-24 18:58:13,400][15401] Updated weights for policy 0, policy_version 721092 (0.0046) [2024-06-24 18:58:17,475][15401] Updated weights for policy 0, policy_version 721102 (0.0054) [2024-06-24 18:58:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11814600704. Throughput: 0: 42512.9. Samples: 11814744840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 18:58:20,987][15401] Updated weights for policy 0, policy_version 721112 (0.0037) [2024-06-24 18:58:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 11814797312. Throughput: 0: 42481.2. Samples: 11814871460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 18:58:25,164][15401] Updated weights for policy 0, policy_version 721122 (0.0049) [2024-06-24 18:58:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 11815010304. Throughput: 0: 42504.8. Samples: 11815131760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 18:58:28,501][15401] Updated weights for policy 0, policy_version 721132 (0.0029) [2024-06-24 18:58:32,640][15401] Updated weights for policy 0, policy_version 721142 (0.0029) [2024-06-24 18:58:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 11815239680. Throughput: 0: 42685.2. Samples: 11815392060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:33,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 18:58:36,244][15401] Updated weights for policy 0, policy_version 721152 (0.0032) [2024-06-24 18:58:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11815436288. Throughput: 0: 42794.0. Samples: 11815520300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:38,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-24 18:58:40,488][15401] Updated weights for policy 0, policy_version 721162 (0.0023) [2024-06-24 18:58:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 11815649280. Throughput: 0: 42534.2. Samples: 11815770940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:43,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 18:58:44,091][15401] Updated weights for policy 0, policy_version 721172 (0.0031) [2024-06-24 18:58:44,108][15349] Signal inference workers to stop experience collection... (174900 times) [2024-06-24 18:58:44,108][15349] Signal inference workers to resume experience collection... (174900 times) [2024-06-24 18:58:44,144][15401] InferenceWorker_p0-w0: stopping experience collection (174900 times) [2024-06-24 18:58:44,144][15401] InferenceWorker_p0-w0: resuming experience collection (174900 times) [2024-06-24 18:58:48,143][15401] Updated weights for policy 0, policy_version 721182 (0.0033) [2024-06-24 18:58:48,392][15132] Fps is (10 sec: 42590.0, 60 sec: 42596.8, 300 sec: 42709.7). Total num frames: 11815862272. Throughput: 0: 42855.8. Samples: 11816038000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:48,392][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 18:58:51,538][15401] Updated weights for policy 0, policy_version 721192 (0.0037) [2024-06-24 18:58:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11816058880. Throughput: 0: 42792.4. Samples: 11816162200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 18:58:55,534][15401] Updated weights for policy 0, policy_version 721202 (0.0027) [2024-06-24 18:58:58,391][15132] Fps is (10 sec: 44241.5, 60 sec: 42870.8, 300 sec: 42876.3). Total num frames: 11816304640. Throughput: 0: 42667.1. Samples: 11816413440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:58:58,391][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 18:58:59,547][15401] Updated weights for policy 0, policy_version 721212 (0.0042) [2024-06-24 18:59:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11816484864. Throughput: 0: 42868.4. Samples: 11816673920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:59:03,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-24 18:59:03,637][15401] Updated weights for policy 0, policy_version 721222 (0.0037) [2024-06-24 18:59:07,235][15401] Updated weights for policy 0, policy_version 721232 (0.0042) [2024-06-24 18:59:08,390][15132] Fps is (10 sec: 40964.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11816714240. Throughput: 0: 42761.4. Samples: 11816795720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:59:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 18:59:11,199][15401] Updated weights for policy 0, policy_version 721242 (0.0041) [2024-06-24 18:59:13,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43416.0, 300 sec: 42986.8). Total num frames: 11816960000. Throughput: 0: 42662.2. Samples: 11817051660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:59:13,393][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 18:59:14,898][15401] Updated weights for policy 0, policy_version 721252 (0.0046) [2024-06-24 18:59:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 11817123840. Throughput: 0: 42708.4. Samples: 11817313940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:59:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 18:59:18,806][15401] Updated weights for policy 0, policy_version 721262 (0.0042) [2024-06-24 18:59:22,419][15401] Updated weights for policy 0, policy_version 721272 (0.0042) [2024-06-24 18:59:23,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11817353216. Throughput: 0: 42486.3. Samples: 11817432180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 18:59:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:59:26,391][15401] Updated weights for policy 0, policy_version 721282 (0.0029) [2024-06-24 18:59:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11817566208. Throughput: 0: 42704.9. Samples: 11817692660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:28,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 18:59:29,933][15401] Updated weights for policy 0, policy_version 721292 (0.0041) [2024-06-24 18:59:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11817779200. Throughput: 0: 42503.2. Samples: 11817950560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 18:59:33,972][15401] Updated weights for policy 0, policy_version 721302 (0.0033) [2024-06-24 18:59:37,848][15401] Updated weights for policy 0, policy_version 721312 (0.0030) [2024-06-24 18:59:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42764.6). Total num frames: 11817992192. Throughput: 0: 42535.0. Samples: 11818076380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:38,393][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 18:59:41,575][15401] Updated weights for policy 0, policy_version 721322 (0.0029) [2024-06-24 18:59:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 11818188800. Throughput: 0: 42595.2. Samples: 11818330180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 18:59:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721326_11818205184.pth... [2024-06-24 18:59:43,524][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000720700_11807948800.pth [2024-06-24 18:59:45,522][15401] Updated weights for policy 0, policy_version 721332 (0.0035) [2024-06-24 18:59:48,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 11818401792. Throughput: 0: 42255.1. Samples: 11818575400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:48,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 18:59:49,788][15401] Updated weights for policy 0, policy_version 721342 (0.0043) [2024-06-24 18:59:53,339][15401] Updated weights for policy 0, policy_version 721352 (0.0026) [2024-06-24 18:59:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11818631168. Throughput: 0: 42390.2. Samples: 11818703280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 18:59:57,311][15401] Updated weights for policy 0, policy_version 721362 (0.0036) [2024-06-24 18:59:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.9, 300 sec: 42709.5). Total num frames: 11818827776. Throughput: 0: 42390.1. Samples: 11818959120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 18:59:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 19:00:00,957][15401] Updated weights for policy 0, policy_version 721372 (0.0026) [2024-06-24 19:00:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11819057152. Throughput: 0: 42087.3. Samples: 11819207860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:03,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 19:00:05,230][15401] Updated weights for policy 0, policy_version 721382 (0.0032) [2024-06-24 19:00:07,284][15349] Signal inference workers to stop experience collection... (174950 times) [2024-06-24 19:00:07,322][15401] InferenceWorker_p0-w0: stopping experience collection (174950 times) [2024-06-24 19:00:07,335][15349] Signal inference workers to resume experience collection... (174950 times) [2024-06-24 19:00:07,344][15401] InferenceWorker_p0-w0: resuming experience collection (174950 times) [2024-06-24 19:00:08,378][15401] Updated weights for policy 0, policy_version 721392 (0.0033) [2024-06-24 19:00:08,392][15132] Fps is (10 sec: 45865.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11819286528. Throughput: 0: 42419.6. Samples: 11819341160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:08,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 19:00:12,767][15401] Updated weights for policy 0, policy_version 721402 (0.0046) [2024-06-24 19:00:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41780.8, 300 sec: 42653.9). Total num frames: 11819466752. Throughput: 0: 42451.5. Samples: 11819602980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 19:00:16,428][15401] Updated weights for policy 0, policy_version 721412 (0.0035) [2024-06-24 19:00:18,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 11819696128. Throughput: 0: 42285.5. Samples: 11819853400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 19:00:20,486][15401] Updated weights for policy 0, policy_version 721422 (0.0036) [2024-06-24 19:00:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 11819909120. Throughput: 0: 42292.9. Samples: 11819979560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:23,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 19:00:24,031][15401] Updated weights for policy 0, policy_version 721432 (0.0038) [2024-06-24 19:00:28,127][15401] Updated weights for policy 0, policy_version 721442 (0.0034) [2024-06-24 19:00:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11820122112. Throughput: 0: 42371.5. Samples: 11820236900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:28,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 19:00:31,590][15401] Updated weights for policy 0, policy_version 721452 (0.0032) [2024-06-24 19:00:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 11820335104. Throughput: 0: 42611.9. Samples: 11820492940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:33,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 19:00:35,925][15401] Updated weights for policy 0, policy_version 721462 (0.0037) [2024-06-24 19:00:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 11820548096. Throughput: 0: 42621.9. Samples: 11820621260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 19:00:39,533][15401] Updated weights for policy 0, policy_version 721472 (0.0044) [2024-06-24 19:00:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11820744704. Throughput: 0: 42646.7. Samples: 11820878220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:43,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 19:00:43,396][15401] Updated weights for policy 0, policy_version 721482 (0.0040) [2024-06-24 19:00:47,215][15401] Updated weights for policy 0, policy_version 721492 (0.0037) [2024-06-24 19:00:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11820974080. Throughput: 0: 42718.5. Samples: 11821130200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 19:00:51,290][15401] Updated weights for policy 0, policy_version 721502 (0.0040) [2024-06-24 19:00:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11821154304. Throughput: 0: 42647.0. Samples: 11821260180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 19:00:54,705][15401] Updated weights for policy 0, policy_version 721512 (0.0039) [2024-06-24 19:00:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11821383680. Throughput: 0: 42465.0. Samples: 11821513900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:00:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 19:00:58,825][15401] Updated weights for policy 0, policy_version 721522 (0.0041) [2024-06-24 19:01:02,568][15401] Updated weights for policy 0, policy_version 721532 (0.0039) [2024-06-24 19:01:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 11821613056. Throughput: 0: 42511.5. Samples: 11821766420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 19:01:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 19:01:06,385][15401] Updated weights for policy 0, policy_version 721542 (0.0037) [2024-06-24 19:01:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 11821809664. Throughput: 0: 42501.9. Samples: 11821892040. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 19:01:10,281][15401] Updated weights for policy 0, policy_version 721552 (0.0025) [2024-06-24 19:01:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11822039040. Throughput: 0: 42428.4. Samples: 11822146180. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:13,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 19:01:13,884][15401] Updated weights for policy 0, policy_version 721562 (0.0033) [2024-06-24 19:01:17,847][15401] Updated weights for policy 0, policy_version 721572 (0.0040) [2024-06-24 19:01:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 11822252032. Throughput: 0: 42363.1. Samples: 11822399380. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:18,393][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 19:01:21,658][15401] Updated weights for policy 0, policy_version 721582 (0.0047) [2024-06-24 19:01:23,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42322.5, 300 sec: 42653.0). Total num frames: 11822448640. Throughput: 0: 42394.3. Samples: 11822529280. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:23,397][15132] Avg episode reward: [(0, '0.820')] [2024-06-24 19:01:25,840][15401] Updated weights for policy 0, policy_version 721592 (0.0041) [2024-06-24 19:01:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11822678016. Throughput: 0: 42473.4. Samples: 11822789520. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 19:01:29,158][15401] Updated weights for policy 0, policy_version 721602 (0.0032) [2024-06-24 19:01:33,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11822874624. Throughput: 0: 42538.7. Samples: 11823044440. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 19:01:33,849][15401] Updated weights for policy 0, policy_version 721612 (0.0030) [2024-06-24 19:01:37,039][15401] Updated weights for policy 0, policy_version 721622 (0.0045) [2024-06-24 19:01:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11823087616. Throughput: 0: 42413.9. Samples: 11823168800. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 19:01:39,989][15349] Signal inference workers to stop experience collection... (175000 times) [2024-06-24 19:01:40,010][15401] InferenceWorker_p0-w0: stopping experience collection (175000 times) [2024-06-24 19:01:40,100][15349] Signal inference workers to resume experience collection... (175000 times) [2024-06-24 19:01:40,100][15401] InferenceWorker_p0-w0: resuming experience collection (175000 times) [2024-06-24 19:01:41,499][15401] Updated weights for policy 0, policy_version 721632 (0.0031) [2024-06-24 19:01:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11823316992. Throughput: 0: 42659.0. Samples: 11823433660. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:43,393][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 19:01:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721639_11823333376.pth... [2024-06-24 19:01:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721015_11813109760.pth [2024-06-24 19:01:44,544][15401] Updated weights for policy 0, policy_version 721642 (0.0033) [2024-06-24 19:01:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11823513600. Throughput: 0: 42663.5. Samples: 11823686280. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:48,395][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 19:01:49,111][15401] Updated weights for policy 0, policy_version 721652 (0.0028) [2024-06-24 19:01:51,979][15401] Updated weights for policy 0, policy_version 721662 (0.0037) [2024-06-24 19:01:53,390][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 11823742976. Throughput: 0: 42711.1. Samples: 11823814040. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:53,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 19:01:56,857][15401] Updated weights for policy 0, policy_version 721672 (0.0036) [2024-06-24 19:01:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 11823955968. Throughput: 0: 42938.7. Samples: 11824078420. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:01:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 19:01:59,504][15401] Updated weights for policy 0, policy_version 721682 (0.0041) [2024-06-24 19:02:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11824152576. Throughput: 0: 42975.2. Samples: 11824333160. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:02:04,411][15401] Updated weights for policy 0, policy_version 721692 (0.0028) [2024-06-24 19:02:07,770][15401] Updated weights for policy 0, policy_version 721702 (0.0043) [2024-06-24 19:02:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11824381952. Throughput: 0: 42801.2. Samples: 11824455060. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 19:02:11,987][15401] Updated weights for policy 0, policy_version 721712 (0.0032) [2024-06-24 19:02:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 11824578560. Throughput: 0: 42762.2. Samples: 11824713820. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 19:02:15,437][15401] Updated weights for policy 0, policy_version 721722 (0.0038) [2024-06-24 19:02:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42543.2). Total num frames: 11824791552. Throughput: 0: 42764.6. Samples: 11824968840. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 19:02:19,566][15401] Updated weights for policy 0, policy_version 721732 (0.0033) [2024-06-24 19:02:22,973][15401] Updated weights for policy 0, policy_version 721742 (0.0036) [2024-06-24 19:02:23,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43147.4, 300 sec: 42653.6). Total num frames: 11825037312. Throughput: 0: 42841.3. Samples: 11825096760. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:23,392][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 19:02:27,675][15401] Updated weights for policy 0, policy_version 721752 (0.0029) [2024-06-24 19:02:28,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 11825201152. Throughput: 0: 42645.8. Samples: 11825352720. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:28,393][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 19:02:30,470][15401] Updated weights for policy 0, policy_version 721762 (0.0034) [2024-06-24 19:02:33,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11825430528. Throughput: 0: 42851.2. Samples: 11825614580. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:33,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 19:02:35,076][15401] Updated weights for policy 0, policy_version 721772 (0.0044) [2024-06-24 19:02:37,049][15349] Signal inference workers to stop experience collection... (175050 times) [2024-06-24 19:02:37,081][15401] InferenceWorker_p0-w0: stopping experience collection (175050 times) [2024-06-24 19:02:37,102][15349] Signal inference workers to resume experience collection... (175050 times) [2024-06-24 19:02:37,103][15401] InferenceWorker_p0-w0: resuming experience collection (175050 times) [2024-06-24 19:02:38,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11825659904. Throughput: 0: 42944.1. Samples: 11825746520. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:38,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 19:02:38,722][15401] Updated weights for policy 0, policy_version 721782 (0.0034) [2024-06-24 19:02:42,565][15401] Updated weights for policy 0, policy_version 721792 (0.0024) [2024-06-24 19:02:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 11825856512. Throughput: 0: 42460.8. Samples: 11825989160. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-06-24 19:02:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 19:02:46,292][15401] Updated weights for policy 0, policy_version 721802 (0.0039) [2024-06-24 19:02:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11826085888. Throughput: 0: 42651.6. Samples: 11826252480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:02:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 19:02:49,962][15401] Updated weights for policy 0, policy_version 721812 (0.0032) [2024-06-24 19:02:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11826315264. Throughput: 0: 42877.3. Samples: 11826384540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:02:53,396][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 19:02:53,846][15401] Updated weights for policy 0, policy_version 721822 (0.0043) [2024-06-24 19:02:57,867][15401] Updated weights for policy 0, policy_version 721832 (0.0045) [2024-06-24 19:02:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 11826495488. Throughput: 0: 42780.5. Samples: 11826638940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:02:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 19:03:01,324][15401] Updated weights for policy 0, policy_version 721842 (0.0035) [2024-06-24 19:03:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11826724864. Throughput: 0: 42802.2. Samples: 11826894940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 19:03:05,707][15401] Updated weights for policy 0, policy_version 721852 (0.0037) [2024-06-24 19:03:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11826954240. Throughput: 0: 42939.7. Samples: 11827028940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:08,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 19:03:08,983][15401] Updated weights for policy 0, policy_version 721862 (0.0025) [2024-06-24 19:03:13,287][15401] Updated weights for policy 0, policy_version 721872 (0.0032) [2024-06-24 19:03:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 11827150848. Throughput: 0: 42718.3. Samples: 11827274940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:13,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 19:03:17,171][15401] Updated weights for policy 0, policy_version 721882 (0.0038) [2024-06-24 19:03:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 11827363840. Throughput: 0: 42506.0. Samples: 11827527360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:18,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-24 19:03:20,973][15401] Updated weights for policy 0, policy_version 721892 (0.0046) [2024-06-24 19:03:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 11827576832. Throughput: 0: 42460.8. Samples: 11827657260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:23,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 19:03:24,732][15401] Updated weights for policy 0, policy_version 721902 (0.0045) [2024-06-24 19:03:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43144.5, 300 sec: 42542.5). Total num frames: 11827789824. Throughput: 0: 42926.6. Samples: 11827920960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:28,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 19:03:28,777][15401] Updated weights for policy 0, policy_version 721912 (0.0036) [2024-06-24 19:03:32,088][15401] Updated weights for policy 0, policy_version 721922 (0.0025) [2024-06-24 19:03:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11828019200. Throughput: 0: 42683.5. Samples: 11828173240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 19:03:36,632][15401] Updated weights for policy 0, policy_version 721932 (0.0034) [2024-06-24 19:03:38,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11828232192. Throughput: 0: 42626.8. Samples: 11828302740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 19:03:39,898][15401] Updated weights for policy 0, policy_version 721942 (0.0028) [2024-06-24 19:03:43,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42866.9, 300 sec: 42597.8). Total num frames: 11828428800. Throughput: 0: 42588.6. Samples: 11828555700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:43,396][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 19:03:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721950_11828428800.pth... [2024-06-24 19:03:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721326_11818205184.pth [2024-06-24 19:03:44,222][15401] Updated weights for policy 0, policy_version 721952 (0.0032) [2024-06-24 19:03:47,713][15401] Updated weights for policy 0, policy_version 721962 (0.0033) [2024-06-24 19:03:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11828641792. Throughput: 0: 42380.8. Samples: 11828802080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:48,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 19:03:51,841][15401] Updated weights for policy 0, policy_version 721972 (0.0043) [2024-06-24 19:03:53,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.4, 300 sec: 42543.0). Total num frames: 11828854784. Throughput: 0: 42334.2. Samples: 11828933980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:53,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:03:55,390][15401] Updated weights for policy 0, policy_version 721982 (0.0041) [2024-06-24 19:03:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11829051392. Throughput: 0: 42587.0. Samples: 11829191360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:03:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 19:03:59,736][15401] Updated weights for policy 0, policy_version 721992 (0.0044) [2024-06-24 19:04:03,337][15401] Updated weights for policy 0, policy_version 722002 (0.0032) [2024-06-24 19:04:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 11829280768. Throughput: 0: 42571.1. Samples: 11829443160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:03,393][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 19:04:07,350][15401] Updated weights for policy 0, policy_version 722012 (0.0032) [2024-06-24 19:04:07,853][15349] Signal inference workers to stop experience collection... (175100 times) [2024-06-24 19:04:07,903][15401] InferenceWorker_p0-w0: stopping experience collection (175100 times) [2024-06-24 19:04:07,912][15349] Signal inference workers to resume experience collection... (175100 times) [2024-06-24 19:04:07,916][15401] InferenceWorker_p0-w0: resuming experience collection (175100 times) [2024-06-24 19:04:08,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 11829493760. Throughput: 0: 42673.9. Samples: 11829577580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 19:04:10,996][15401] Updated weights for policy 0, policy_version 722022 (0.0030) [2024-06-24 19:04:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11829706752. Throughput: 0: 42397.3. Samples: 11829828740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 19:04:15,333][15401] Updated weights for policy 0, policy_version 722032 (0.0028) [2024-06-24 19:04:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11829903360. Throughput: 0: 42395.1. Samples: 11830081020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 19:04:18,961][15401] Updated weights for policy 0, policy_version 722042 (0.0048) [2024-06-24 19:04:22,940][15401] Updated weights for policy 0, policy_version 722052 (0.0043) [2024-06-24 19:04:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11830132736. Throughput: 0: 42348.4. Samples: 11830208420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:23,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 19:04:26,394][15401] Updated weights for policy 0, policy_version 722062 (0.0039) [2024-06-24 19:04:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 11830345728. Throughput: 0: 42359.3. Samples: 11830461600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:04:28,403][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 19:04:30,485][15401] Updated weights for policy 0, policy_version 722072 (0.0041) [2024-06-24 19:04:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 11830558720. Throughput: 0: 42731.6. Samples: 11830725000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 19:04:33,994][15401] Updated weights for policy 0, policy_version 722082 (0.0036) [2024-06-24 19:04:38,102][15401] Updated weights for policy 0, policy_version 722092 (0.0033) [2024-06-24 19:04:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 11830771712. Throughput: 0: 42580.3. Samples: 11830850200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:38,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 19:04:41,638][15401] Updated weights for policy 0, policy_version 722102 (0.0035) [2024-06-24 19:04:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42602.8, 300 sec: 42653.9). Total num frames: 11830984704. Throughput: 0: 42596.9. Samples: 11831108220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 19:04:46,040][15401] Updated weights for policy 0, policy_version 722112 (0.0040) [2024-06-24 19:04:48,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11831214080. Throughput: 0: 42538.7. Samples: 11831357400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:48,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 19:04:49,131][15401] Updated weights for policy 0, policy_version 722122 (0.0032) [2024-06-24 19:04:53,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11831377920. Throughput: 0: 42454.5. Samples: 11831488040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:53,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 19:04:53,583][15401] Updated weights for policy 0, policy_version 722132 (0.0031) [2024-06-24 19:04:56,853][15401] Updated weights for policy 0, policy_version 722142 (0.0040) [2024-06-24 19:04:58,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11831607296. Throughput: 0: 42393.0. Samples: 11831736420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:04:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 19:05:01,339][15401] Updated weights for policy 0, policy_version 722152 (0.0047) [2024-06-24 19:05:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42600.2, 300 sec: 42543.2). Total num frames: 11831836672. Throughput: 0: 42395.1. Samples: 11831988800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:03,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 19:05:04,522][15401] Updated weights for policy 0, policy_version 722162 (0.0039) [2024-06-24 19:05:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11832016896. Throughput: 0: 42492.4. Samples: 11832120580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 19:05:09,147][15401] Updated weights for policy 0, policy_version 722172 (0.0040) [2024-06-24 19:05:12,420][15401] Updated weights for policy 0, policy_version 722182 (0.0030) [2024-06-24 19:05:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11832246272. Throughput: 0: 42375.1. Samples: 11832368480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 19:05:16,708][15401] Updated weights for policy 0, policy_version 722192 (0.0036) [2024-06-24 19:05:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 11832459264. Throughput: 0: 42247.9. Samples: 11832626160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 19:05:19,513][15349] Signal inference workers to stop experience collection... (175150 times) [2024-06-24 19:05:19,521][15349] Signal inference workers to resume experience collection... (175150 times) [2024-06-24 19:05:19,529][15401] InferenceWorker_p0-w0: stopping experience collection (175150 times) [2024-06-24 19:05:19,547][15401] InferenceWorker_p0-w0: resuming experience collection (175150 times) [2024-06-24 19:05:20,159][15401] Updated weights for policy 0, policy_version 722202 (0.0029) [2024-06-24 19:05:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 11832655872. Throughput: 0: 42297.8. Samples: 11832753500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 19:05:24,303][15401] Updated weights for policy 0, policy_version 722212 (0.0027) [2024-06-24 19:05:28,187][15401] Updated weights for policy 0, policy_version 722222 (0.0027) [2024-06-24 19:05:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11832885248. Throughput: 0: 42262.4. Samples: 11833010020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:28,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 19:05:32,009][15401] Updated weights for policy 0, policy_version 722232 (0.0032) [2024-06-24 19:05:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 11833098240. Throughput: 0: 42448.0. Samples: 11833267460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 19:05:35,810][15401] Updated weights for policy 0, policy_version 722242 (0.0023) [2024-06-24 19:05:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 11833311232. Throughput: 0: 42267.6. Samples: 11833390080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 19:05:39,912][15401] Updated weights for policy 0, policy_version 722252 (0.0044) [2024-06-24 19:05:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11833524224. Throughput: 0: 42516.8. Samples: 11833649680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 19:05:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722262_11833540608.pth... [2024-06-24 19:05:43,461][15401] Updated weights for policy 0, policy_version 722262 (0.0036) [2024-06-24 19:05:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721639_11823333376.pth [2024-06-24 19:05:47,867][15401] Updated weights for policy 0, policy_version 722272 (0.0030) [2024-06-24 19:05:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 11833720832. Throughput: 0: 42518.7. Samples: 11833902140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 19:05:51,030][15401] Updated weights for policy 0, policy_version 722282 (0.0034) [2024-06-24 19:05:53,392][15132] Fps is (10 sec: 44223.8, 60 sec: 43142.5, 300 sec: 42653.5). Total num frames: 11833966592. Throughput: 0: 42350.6. Samples: 11834026480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:53,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 19:05:55,499][15401] Updated weights for policy 0, policy_version 722292 (0.0044) [2024-06-24 19:05:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 11834163200. Throughput: 0: 42587.6. Samples: 11834284920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:05:58,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 19:05:59,111][15401] Updated weights for policy 0, policy_version 722302 (0.0037) [2024-06-24 19:06:03,363][15401] Updated weights for policy 0, policy_version 722312 (0.0044) [2024-06-24 19:06:03,392][15132] Fps is (10 sec: 39323.7, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 11834359808. Throughput: 0: 42475.1. Samples: 11834537640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:06:03,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 19:06:06,703][15401] Updated weights for policy 0, policy_version 722322 (0.0030) [2024-06-24 19:06:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11834605568. Throughput: 0: 42429.3. Samples: 11834662820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 19:06:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 19:06:10,870][15401] Updated weights for policy 0, policy_version 722332 (0.0036) [2024-06-24 19:06:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 11834785792. Throughput: 0: 42432.3. Samples: 11834919480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:13,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 19:06:14,407][15401] Updated weights for policy 0, policy_version 722342 (0.0032) [2024-06-24 19:06:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 11834998784. Throughput: 0: 42399.7. Samples: 11835175440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 19:06:19,040][15401] Updated weights for policy 0, policy_version 722352 (0.0035) [2024-06-24 19:06:21,914][15401] Updated weights for policy 0, policy_version 722362 (0.0031) [2024-06-24 19:06:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11835228160. Throughput: 0: 42538.7. Samples: 11835304320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 19:06:26,483][15401] Updated weights for policy 0, policy_version 722372 (0.0045) [2024-06-24 19:06:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 11835424768. Throughput: 0: 42507.1. Samples: 11835562500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 19:06:29,677][15401] Updated weights for policy 0, policy_version 722382 (0.0030) [2024-06-24 19:06:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11835654144. Throughput: 0: 42426.2. Samples: 11835811320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 19:06:33,998][15401] Updated weights for policy 0, policy_version 722392 (0.0033) [2024-06-24 19:06:37,318][15401] Updated weights for policy 0, policy_version 722402 (0.0039) [2024-06-24 19:06:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 11835850752. Throughput: 0: 42545.9. Samples: 11835940920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 19:06:41,541][15401] Updated weights for policy 0, policy_version 722412 (0.0040) [2024-06-24 19:06:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11836063744. Throughput: 0: 42524.3. Samples: 11836198520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 19:06:44,980][15401] Updated weights for policy 0, policy_version 722422 (0.0031) [2024-06-24 19:06:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 11836276736. Throughput: 0: 42509.8. Samples: 11836450580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:48,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 19:06:49,383][15401] Updated weights for policy 0, policy_version 722432 (0.0034) [2024-06-24 19:06:50,270][15349] Signal inference workers to stop experience collection... (175200 times) [2024-06-24 19:06:50,270][15349] Signal inference workers to resume experience collection... (175200 times) [2024-06-24 19:06:50,316][15401] InferenceWorker_p0-w0: stopping experience collection (175200 times) [2024-06-24 19:06:50,316][15401] InferenceWorker_p0-w0: resuming experience collection (175200 times) [2024-06-24 19:06:52,658][15401] Updated weights for policy 0, policy_version 722442 (0.0036) [2024-06-24 19:06:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.4, 300 sec: 42542.9). Total num frames: 11836506112. Throughput: 0: 42713.4. Samples: 11836584920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 19:06:56,927][15401] Updated weights for policy 0, policy_version 722452 (0.0025) [2024-06-24 19:06:58,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 11836702720. Throughput: 0: 42700.8. Samples: 11836841020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:06:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 19:07:00,583][15401] Updated weights for policy 0, policy_version 722462 (0.0031) [2024-06-24 19:07:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 11836932096. Throughput: 0: 42600.8. Samples: 11837092480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 19:07:04,449][15401] Updated weights for policy 0, policy_version 722472 (0.0039) [2024-06-24 19:07:08,059][15401] Updated weights for policy 0, policy_version 722482 (0.0029) [2024-06-24 19:07:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11837161472. Throughput: 0: 42693.2. Samples: 11837225520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 19:07:11,989][15401] Updated weights for policy 0, policy_version 722492 (0.0034) [2024-06-24 19:07:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11837358080. Throughput: 0: 42574.2. Samples: 11837478340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 19:07:15,633][15401] Updated weights for policy 0, policy_version 722502 (0.0025) [2024-06-24 19:07:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 11837587456. Throughput: 0: 42704.9. Samples: 11837733040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 19:07:19,661][15401] Updated weights for policy 0, policy_version 722512 (0.0045) [2024-06-24 19:07:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 11837784064. Throughput: 0: 42780.3. Samples: 11837866040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:23,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 19:07:23,653][15401] Updated weights for policy 0, policy_version 722522 (0.0035) [2024-06-24 19:07:27,325][15401] Updated weights for policy 0, policy_version 722532 (0.0034) [2024-06-24 19:07:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11837980672. Throughput: 0: 42733.1. Samples: 11838121500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 19:07:31,164][15401] Updated weights for policy 0, policy_version 722542 (0.0040) [2024-06-24 19:07:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 11838226432. Throughput: 0: 42790.1. Samples: 11838376040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:33,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 19:07:34,983][15401] Updated weights for policy 0, policy_version 722552 (0.0024) [2024-06-24 19:07:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11838423040. Throughput: 0: 42791.5. Samples: 11838510540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 19:07:38,706][15401] Updated weights for policy 0, policy_version 722562 (0.0037) [2024-06-24 19:07:42,882][15401] Updated weights for policy 0, policy_version 722572 (0.0037) [2024-06-24 19:07:43,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 11838636032. Throughput: 0: 42739.7. Samples: 11838764300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 19:07:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722573_11838636032.pth... [2024-06-24 19:07:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000721950_11828428800.pth [2024-06-24 19:07:46,319][15401] Updated weights for policy 0, policy_version 722582 (0.0036) [2024-06-24 19:07:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 11838865408. Throughput: 0: 42882.2. Samples: 11839022180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:48,396][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 19:07:50,630][15401] Updated weights for policy 0, policy_version 722592 (0.0030) [2024-06-24 19:07:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11839078400. Throughput: 0: 42882.2. Samples: 11839155220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-24 19:07:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 19:07:53,918][15401] Updated weights for policy 0, policy_version 722602 (0.0032) [2024-06-24 19:07:58,383][15401] Updated weights for policy 0, policy_version 722612 (0.0027) [2024-06-24 19:07:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.7, 300 sec: 42542.9). Total num frames: 11839275008. Throughput: 0: 42842.8. Samples: 11839406260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:07:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 19:08:01,518][15401] Updated weights for policy 0, policy_version 722622 (0.0026) [2024-06-24 19:08:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 11839504384. Throughput: 0: 42924.1. Samples: 11839664620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 19:08:06,263][15401] Updated weights for policy 0, policy_version 722632 (0.0041) [2024-06-24 19:08:08,205][15349] Signal inference workers to stop experience collection... (175250 times) [2024-06-24 19:08:08,205][15349] Signal inference workers to resume experience collection... (175250 times) [2024-06-24 19:08:08,250][15401] InferenceWorker_p0-w0: stopping experience collection (175250 times) [2024-06-24 19:08:08,250][15401] InferenceWorker_p0-w0: resuming experience collection (175250 times) [2024-06-24 19:08:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11839717376. Throughput: 0: 42977.5. Samples: 11839800020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:08,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 19:08:09,342][15401] Updated weights for policy 0, policy_version 722642 (0.0022) [2024-06-24 19:08:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11839913984. Throughput: 0: 42897.3. Samples: 11840051880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 19:08:13,746][15401] Updated weights for policy 0, policy_version 722652 (0.0029) [2024-06-24 19:08:17,047][15401] Updated weights for policy 0, policy_version 722662 (0.0042) [2024-06-24 19:08:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11840159744. Throughput: 0: 42821.9. Samples: 11840303020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 19:08:21,261][15401] Updated weights for policy 0, policy_version 722672 (0.0048) [2024-06-24 19:08:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 11840356352. Throughput: 0: 42861.7. Samples: 11840439320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:23,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-24 19:08:24,540][15401] Updated weights for policy 0, policy_version 722682 (0.0031) [2024-06-24 19:08:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 11840569344. Throughput: 0: 42812.5. Samples: 11840690860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 19:08:29,199][15401] Updated weights for policy 0, policy_version 722692 (0.0029) [2024-06-24 19:08:32,114][15401] Updated weights for policy 0, policy_version 722702 (0.0023) [2024-06-24 19:08:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 11840815104. Throughput: 0: 42799.7. Samples: 11840948160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 19:08:36,714][15401] Updated weights for policy 0, policy_version 722712 (0.0031) [2024-06-24 19:08:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 11840978944. Throughput: 0: 42769.0. Samples: 11841079820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:38,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 19:08:39,934][15401] Updated weights for policy 0, policy_version 722722 (0.0031) [2024-06-24 19:08:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11841208320. Throughput: 0: 42771.1. Samples: 11841330960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 19:08:44,174][15401] Updated weights for policy 0, policy_version 722732 (0.0031) [2024-06-24 19:08:47,623][15401] Updated weights for policy 0, policy_version 722742 (0.0034) [2024-06-24 19:08:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11841437696. Throughput: 0: 42774.7. Samples: 11841589480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 19:08:51,806][15401] Updated weights for policy 0, policy_version 722752 (0.0034) [2024-06-24 19:08:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11841634304. Throughput: 0: 42661.2. Samples: 11841719780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 19:08:55,385][15401] Updated weights for policy 0, policy_version 722762 (0.0043) [2024-06-24 19:08:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.8). Total num frames: 11841847296. Throughput: 0: 42744.0. Samples: 11841975360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:08:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 19:08:59,490][15401] Updated weights for policy 0, policy_version 722772 (0.0034) [2024-06-24 19:09:02,971][15401] Updated weights for policy 0, policy_version 722782 (0.0033) [2024-06-24 19:09:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11842076672. Throughput: 0: 42839.5. Samples: 11842230900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:03,393][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 19:09:07,077][15401] Updated weights for policy 0, policy_version 722792 (0.0039) [2024-06-24 19:09:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11842273280. Throughput: 0: 42708.1. Samples: 11842361180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 19:09:10,891][15401] Updated weights for policy 0, policy_version 722802 (0.0046) [2024-06-24 19:09:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11842502656. Throughput: 0: 42804.9. Samples: 11842617080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 19:09:14,592][15401] Updated weights for policy 0, policy_version 722812 (0.0036) [2024-06-24 19:09:18,386][15401] Updated weights for policy 0, policy_version 722822 (0.0040) [2024-06-24 19:09:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11842715648. Throughput: 0: 42786.2. Samples: 11842873540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 19:09:22,404][15401] Updated weights for policy 0, policy_version 722832 (0.0034) [2024-06-24 19:09:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11842912256. Throughput: 0: 42840.5. Samples: 11843007640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 19:09:25,837][15401] Updated weights for policy 0, policy_version 722842 (0.0043) [2024-06-24 19:09:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11843125248. Throughput: 0: 42813.5. Samples: 11843257580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:28,391][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 19:09:30,162][15401] Updated weights for policy 0, policy_version 722852 (0.0032) [2024-06-24 19:09:33,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 11843354624. Throughput: 0: 42851.4. Samples: 11843517900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-24 19:09:33,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 19:09:33,534][15401] Updated weights for policy 0, policy_version 722862 (0.0037) [2024-06-24 19:09:34,804][15349] Signal inference workers to stop experience collection... (175300 times) [2024-06-24 19:09:34,854][15401] InferenceWorker_p0-w0: stopping experience collection (175300 times) [2024-06-24 19:09:34,926][15349] Signal inference workers to resume experience collection... (175300 times) [2024-06-24 19:09:34,926][15401] InferenceWorker_p0-w0: resuming experience collection (175300 times) [2024-06-24 19:09:37,815][15401] Updated weights for policy 0, policy_version 722872 (0.0042) [2024-06-24 19:09:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 11843567616. Throughput: 0: 42896.1. Samples: 11843650100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:09:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 19:09:41,102][15401] Updated weights for policy 0, policy_version 722882 (0.0022) [2024-06-24 19:09:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 11843780608. Throughput: 0: 42874.9. Samples: 11843904740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:09:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 19:09:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722887_11843780608.pth... [2024-06-24 19:09:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722262_11833540608.pth [2024-06-24 19:09:45,349][15401] Updated weights for policy 0, policy_version 722892 (0.0032) [2024-06-24 19:09:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11844009984. Throughput: 0: 42989.8. Samples: 11844165340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:09:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 19:09:48,558][15401] Updated weights for policy 0, policy_version 722902 (0.0036) [2024-06-24 19:09:52,880][15401] Updated weights for policy 0, policy_version 722912 (0.0040) [2024-06-24 19:09:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11844222976. Throughput: 0: 43005.3. Samples: 11844296420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:09:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 19:09:56,295][15401] Updated weights for policy 0, policy_version 722922 (0.0025) [2024-06-24 19:09:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 11844435968. Throughput: 0: 43082.2. Samples: 11844555780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:09:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 19:10:00,521][15401] Updated weights for policy 0, policy_version 722932 (0.0033) [2024-06-24 19:10:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 11844648960. Throughput: 0: 42992.8. Samples: 11844808320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:03,393][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 19:10:04,111][15401] Updated weights for policy 0, policy_version 722942 (0.0034) [2024-06-24 19:10:08,179][15401] Updated weights for policy 0, policy_version 722952 (0.0034) [2024-06-24 19:10:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11844861952. Throughput: 0: 42943.4. Samples: 11844940100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 19:10:11,846][15401] Updated weights for policy 0, policy_version 722962 (0.0035) [2024-06-24 19:10:13,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11845091328. Throughput: 0: 43091.4. Samples: 11845196680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 19:10:15,746][15401] Updated weights for policy 0, policy_version 722972 (0.0036) [2024-06-24 19:10:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11845287936. Throughput: 0: 43184.1. Samples: 11845461080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 19:10:19,241][15401] Updated weights for policy 0, policy_version 722982 (0.0035) [2024-06-24 19:10:23,265][15401] Updated weights for policy 0, policy_version 722992 (0.0030) [2024-06-24 19:10:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11845500928. Throughput: 0: 42994.3. Samples: 11845584840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 19:10:26,805][15401] Updated weights for policy 0, policy_version 723002 (0.0035) [2024-06-24 19:10:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 11845730304. Throughput: 0: 43079.2. Samples: 11845843300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 19:10:30,914][15401] Updated weights for policy 0, policy_version 723012 (0.0027) [2024-06-24 19:10:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 11845943296. Throughput: 0: 43096.9. Samples: 11846104800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:33,392][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 19:10:34,345][15401] Updated weights for policy 0, policy_version 723022 (0.0054) [2024-06-24 19:10:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11846156288. Throughput: 0: 42975.5. Samples: 11846230320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 19:10:38,392][15401] Updated weights for policy 0, policy_version 723032 (0.0044) [2024-06-24 19:10:42,063][15401] Updated weights for policy 0, policy_version 723042 (0.0032) [2024-06-24 19:10:43,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 11846369280. Throughput: 0: 42979.3. Samples: 11846489840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:43,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 19:10:45,839][15401] Updated weights for policy 0, policy_version 723052 (0.0030) [2024-06-24 19:10:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 11846582272. Throughput: 0: 43252.5. Samples: 11846754580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:48,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 19:10:49,893][15401] Updated weights for policy 0, policy_version 723062 (0.0023) [2024-06-24 19:10:52,523][15349] Signal inference workers to stop experience collection... (175350 times) [2024-06-24 19:10:52,525][15349] Signal inference workers to resume experience collection... (175350 times) [2024-06-24 19:10:52,580][15401] InferenceWorker_p0-w0: stopping experience collection (175350 times) [2024-06-24 19:10:52,580][15401] InferenceWorker_p0-w0: resuming experience collection (175350 times) [2024-06-24 19:10:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11846795264. Throughput: 0: 42967.2. Samples: 11846873620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:53,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 19:10:53,503][15401] Updated weights for policy 0, policy_version 723072 (0.0032) [2024-06-24 19:10:57,283][15401] Updated weights for policy 0, policy_version 723082 (0.0040) [2024-06-24 19:10:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 11847024640. Throughput: 0: 43105.6. Samples: 11847136440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:10:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 19:11:01,430][15401] Updated weights for policy 0, policy_version 723092 (0.0032) [2024-06-24 19:11:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 11847221248. Throughput: 0: 42964.1. Samples: 11847394460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:11:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 19:11:04,737][15401] Updated weights for policy 0, policy_version 723102 (0.0028) [2024-06-24 19:11:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11847434240. Throughput: 0: 43065.8. Samples: 11847522800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:11:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 19:11:08,982][15401] Updated weights for policy 0, policy_version 723112 (0.0038) [2024-06-24 19:11:12,419][15401] Updated weights for policy 0, policy_version 723122 (0.0033) [2024-06-24 19:11:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11847647232. Throughput: 0: 43130.7. Samples: 11847784180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:11:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 19:11:16,700][15401] Updated weights for policy 0, policy_version 723132 (0.0034) [2024-06-24 19:11:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11847876608. Throughput: 0: 43061.0. Samples: 11848042440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 19:11:19,953][15401] Updated weights for policy 0, policy_version 723142 (0.0039) [2024-06-24 19:11:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11848073216. Throughput: 0: 43034.7. Samples: 11848166880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 19:11:24,306][15401] Updated weights for policy 0, policy_version 723152 (0.0043) [2024-06-24 19:11:27,563][15401] Updated weights for policy 0, policy_version 723162 (0.0034) [2024-06-24 19:11:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11848302592. Throughput: 0: 42867.9. Samples: 11848418900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:11:31,893][15401] Updated weights for policy 0, policy_version 723172 (0.0037) [2024-06-24 19:11:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 11848515584. Throughput: 0: 42742.7. Samples: 11848678000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 19:11:34,966][15401] Updated weights for policy 0, policy_version 723182 (0.0030) [2024-06-24 19:11:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11848712192. Throughput: 0: 42908.0. Samples: 11848804480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 19:11:39,672][15401] Updated weights for policy 0, policy_version 723192 (0.0026) [2024-06-24 19:11:42,670][15401] Updated weights for policy 0, policy_version 723202 (0.0024) [2024-06-24 19:11:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42987.5). Total num frames: 11848957952. Throughput: 0: 42773.4. Samples: 11849061240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 19:11:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723203_11848957952.pth... [2024-06-24 19:11:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722573_11838636032.pth [2024-06-24 19:11:47,809][15401] Updated weights for policy 0, policy_version 723212 (0.0043) [2024-06-24 19:11:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11849138176. Throughput: 0: 42748.8. Samples: 11849318160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:48,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 19:11:50,534][15401] Updated weights for policy 0, policy_version 723222 (0.0027) [2024-06-24 19:11:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11849351168. Throughput: 0: 42588.8. Samples: 11849439300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 19:11:55,615][15401] Updated weights for policy 0, policy_version 723232 (0.0042) [2024-06-24 19:11:58,215][15401] Updated weights for policy 0, policy_version 723242 (0.0039) [2024-06-24 19:11:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11849596928. Throughput: 0: 42488.4. Samples: 11849696160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:11:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 19:12:03,107][15401] Updated weights for policy 0, policy_version 723252 (0.0035) [2024-06-24 19:12:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11849777152. Throughput: 0: 42580.4. Samples: 11849958560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 19:12:05,128][15349] Signal inference workers to stop experience collection... (175400 times) [2024-06-24 19:12:05,129][15349] Signal inference workers to resume experience collection... (175400 times) [2024-06-24 19:12:05,173][15401] InferenceWorker_p0-w0: stopping experience collection (175400 times) [2024-06-24 19:12:05,173][15401] InferenceWorker_p0-w0: resuming experience collection (175400 times) [2024-06-24 19:12:06,049][15401] Updated weights for policy 0, policy_version 723262 (0.0030) [2024-06-24 19:12:08,395][15132] Fps is (10 sec: 40937.6, 60 sec: 42867.5, 300 sec: 42875.3). Total num frames: 11850006528. Throughput: 0: 42463.2. Samples: 11850077960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:08,395][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 19:12:10,550][15401] Updated weights for policy 0, policy_version 723272 (0.0035) [2024-06-24 19:12:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11850219520. Throughput: 0: 42692.1. Samples: 11850340040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:13,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 19:12:13,923][15401] Updated weights for policy 0, policy_version 723282 (0.0034) [2024-06-24 19:12:18,392][15132] Fps is (10 sec: 39333.8, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 11850399744. Throughput: 0: 42595.5. Samples: 11850594900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:18,393][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 19:12:18,694][15401] Updated weights for policy 0, policy_version 723292 (0.0042) [2024-06-24 19:12:21,406][15401] Updated weights for policy 0, policy_version 723302 (0.0029) [2024-06-24 19:12:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11850645504. Throughput: 0: 42463.5. Samples: 11850715340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 19:12:26,355][15401] Updated weights for policy 0, policy_version 723312 (0.0031) [2024-06-24 19:12:28,390][15132] Fps is (10 sec: 47524.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11850874880. Throughput: 0: 42668.3. Samples: 11850981320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 19:12:28,837][15401] Updated weights for policy 0, policy_version 723322 (0.0032) [2024-06-24 19:12:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11851055104. Throughput: 0: 42572.0. Samples: 11851233900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 19:12:33,927][15401] Updated weights for policy 0, policy_version 723332 (0.0035) [2024-06-24 19:12:36,409][15401] Updated weights for policy 0, policy_version 723342 (0.0035) [2024-06-24 19:12:38,392][15132] Fps is (10 sec: 40948.5, 60 sec: 42869.4, 300 sec: 42875.7). Total num frames: 11851284480. Throughput: 0: 42600.4. Samples: 11851356440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:38,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 19:12:41,408][15401] Updated weights for policy 0, policy_version 723352 (0.0036) [2024-06-24 19:12:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11851513856. Throughput: 0: 42905.5. Samples: 11851626900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 19:12:44,328][15401] Updated weights for policy 0, policy_version 723362 (0.0033) [2024-06-24 19:12:48,389][15132] Fps is (10 sec: 40972.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11851694080. Throughput: 0: 42682.3. Samples: 11851879260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 19:12:49,306][15401] Updated weights for policy 0, policy_version 723372 (0.0028) [2024-06-24 19:12:51,971][15401] Updated weights for policy 0, policy_version 723382 (0.0033) [2024-06-24 19:12:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11851939840. Throughput: 0: 42784.8. Samples: 11852003040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 19:12:57,174][15401] Updated weights for policy 0, policy_version 723392 (0.0038) [2024-06-24 19:12:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 11852136448. Throughput: 0: 42863.1. Samples: 11852268880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-24 19:12:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 19:12:59,694][15401] Updated weights for policy 0, policy_version 723402 (0.0040) [2024-06-24 19:13:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11852349440. Throughput: 0: 42617.5. Samples: 11852512580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 19:13:04,726][15401] Updated weights for policy 0, policy_version 723412 (0.0028) [2024-06-24 19:13:07,297][15401] Updated weights for policy 0, policy_version 723422 (0.0039) [2024-06-24 19:13:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42875.4, 300 sec: 42931.6). Total num frames: 11852578816. Throughput: 0: 42975.6. Samples: 11852649240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 19:13:12,262][15401] Updated weights for policy 0, policy_version 723432 (0.0040) [2024-06-24 19:13:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11852759040. Throughput: 0: 42759.6. Samples: 11852905500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 19:13:14,306][15349] Signal inference workers to stop experience collection... (175450 times) [2024-06-24 19:13:14,338][15401] InferenceWorker_p0-w0: stopping experience collection (175450 times) [2024-06-24 19:13:14,366][15349] Signal inference workers to resume experience collection... (175450 times) [2024-06-24 19:13:14,372][15401] InferenceWorker_p0-w0: resuming experience collection (175450 times) [2024-06-24 19:13:14,857][15401] Updated weights for policy 0, policy_version 723442 (0.0039) [2024-06-24 19:13:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11852972032. Throughput: 0: 42633.4. Samples: 11853152400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 19:13:20,315][15401] Updated weights for policy 0, policy_version 723452 (0.0025) [2024-06-24 19:13:22,678][15401] Updated weights for policy 0, policy_version 723462 (0.0033) [2024-06-24 19:13:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11853217792. Throughput: 0: 42892.0. Samples: 11853286460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:23,391][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:13:27,814][15401] Updated weights for policy 0, policy_version 723472 (0.0034) [2024-06-24 19:13:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11853414400. Throughput: 0: 42596.8. Samples: 11853543760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 19:13:30,127][15401] Updated weights for policy 0, policy_version 723482 (0.0033) [2024-06-24 19:13:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11853627392. Throughput: 0: 42740.4. Samples: 11853802580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 19:13:35,290][15401] Updated weights for policy 0, policy_version 723492 (0.0035) [2024-06-24 19:13:38,076][15401] Updated weights for policy 0, policy_version 723502 (0.0032) [2024-06-24 19:13:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.5, 300 sec: 42876.1). Total num frames: 11853856768. Throughput: 0: 42798.6. Samples: 11853928980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 19:13:42,741][15401] Updated weights for policy 0, policy_version 723512 (0.0038) [2024-06-24 19:13:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11854053376. Throughput: 0: 42685.4. Samples: 11854189720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 19:13:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723515_11854069760.pth... [2024-06-24 19:13:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000722887_11843780608.pth [2024-06-24 19:13:45,524][15401] Updated weights for policy 0, policy_version 723522 (0.0030) [2024-06-24 19:13:48,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11854266368. Throughput: 0: 42878.7. Samples: 11854442120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 19:13:50,223][15401] Updated weights for policy 0, policy_version 723532 (0.0027) [2024-06-24 19:13:53,356][15401] Updated weights for policy 0, policy_version 723542 (0.0033) [2024-06-24 19:13:53,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11854512128. Throughput: 0: 42779.1. Samples: 11854574300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 19:13:57,851][15401] Updated weights for policy 0, policy_version 723552 (0.0031) [2024-06-24 19:13:58,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 11854692352. Throughput: 0: 42897.9. Samples: 11854835900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:13:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:14:00,939][15401] Updated weights for policy 0, policy_version 723562 (0.0033) [2024-06-24 19:14:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 11854921728. Throughput: 0: 43084.9. Samples: 11855091500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:03,396][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 19:14:05,408][15401] Updated weights for policy 0, policy_version 723572 (0.0033) [2024-06-24 19:14:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11855151104. Throughput: 0: 43071.8. Samples: 11855224680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 19:14:08,501][15401] Updated weights for policy 0, policy_version 723582 (0.0032) [2024-06-24 19:14:12,987][15401] Updated weights for policy 0, policy_version 723592 (0.0026) [2024-06-24 19:14:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11855347712. Throughput: 0: 42949.2. Samples: 11855476480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 19:14:16,348][15401] Updated weights for policy 0, policy_version 723602 (0.0037) [2024-06-24 19:14:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11855560704. Throughput: 0: 42795.1. Samples: 11855728360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 19:14:20,474][15401] Updated weights for policy 0, policy_version 723612 (0.0042) [2024-06-24 19:14:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 11855790080. Throughput: 0: 42953.5. Samples: 11855861880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 19:14:24,037][15401] Updated weights for policy 0, policy_version 723622 (0.0034) [2024-06-24 19:14:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 11855970304. Throughput: 0: 42784.3. Samples: 11856115020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 19:14:28,537][15401] Updated weights for policy 0, policy_version 723632 (0.0030) [2024-06-24 19:14:31,910][15401] Updated weights for policy 0, policy_version 723642 (0.0027) [2024-06-24 19:14:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11856199680. Throughput: 0: 42737.2. Samples: 11856365300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:33,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 19:14:36,050][15401] Updated weights for policy 0, policy_version 723652 (0.0043) [2024-06-24 19:14:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11856412672. Throughput: 0: 42731.2. Samples: 11856497200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-24 19:14:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 19:14:39,577][15401] Updated weights for policy 0, policy_version 723662 (0.0048) [2024-06-24 19:14:41,748][15349] Signal inference workers to stop experience collection... (175500 times) [2024-06-24 19:14:41,748][15349] Signal inference workers to resume experience collection... (175500 times) [2024-06-24 19:14:41,784][15401] InferenceWorker_p0-w0: stopping experience collection (175500 times) [2024-06-24 19:14:41,784][15401] InferenceWorker_p0-w0: resuming experience collection (175500 times) [2024-06-24 19:14:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11856625664. Throughput: 0: 42497.3. Samples: 11856748280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:14:43,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 19:14:43,511][15401] Updated weights for policy 0, policy_version 723672 (0.0026) [2024-06-24 19:14:47,196][15401] Updated weights for policy 0, policy_version 723682 (0.0033) [2024-06-24 19:14:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11856855040. Throughput: 0: 42503.4. Samples: 11857003880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:14:48,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 19:14:51,449][15401] Updated weights for policy 0, policy_version 723692 (0.0035) [2024-06-24 19:14:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11857051648. Throughput: 0: 42522.6. Samples: 11857138200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:14:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 19:14:54,711][15401] Updated weights for policy 0, policy_version 723702 (0.0044) [2024-06-24 19:14:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 11857281024. Throughput: 0: 42598.2. Samples: 11857393400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:14:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 19:14:59,216][15401] Updated weights for policy 0, policy_version 723712 (0.0035) [2024-06-24 19:15:02,324][15401] Updated weights for policy 0, policy_version 723722 (0.0036) [2024-06-24 19:15:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42876.0, 300 sec: 42820.6). Total num frames: 11857494016. Throughput: 0: 42682.2. Samples: 11857649060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:03,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 19:15:06,731][15401] Updated weights for policy 0, policy_version 723732 (0.0044) [2024-06-24 19:15:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 11857690624. Throughput: 0: 42639.4. Samples: 11857780660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:08,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 19:15:10,051][15401] Updated weights for policy 0, policy_version 723742 (0.0028) [2024-06-24 19:15:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11857936384. Throughput: 0: 42674.2. Samples: 11858035360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 19:15:14,252][15401] Updated weights for policy 0, policy_version 723752 (0.0042) [2024-06-24 19:15:17,622][15401] Updated weights for policy 0, policy_version 723762 (0.0022) [2024-06-24 19:15:18,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 11858149376. Throughput: 0: 42671.6. Samples: 11858285620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:18,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 19:15:22,091][15401] Updated weights for policy 0, policy_version 723772 (0.0027) [2024-06-24 19:15:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11858329600. Throughput: 0: 42715.5. Samples: 11858419400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 19:15:25,194][15401] Updated weights for policy 0, policy_version 723782 (0.0030) [2024-06-24 19:15:28,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 11858575360. Throughput: 0: 42848.4. Samples: 11858676460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:28,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 19:15:29,642][15401] Updated weights for policy 0, policy_version 723792 (0.0030) [2024-06-24 19:15:33,217][15401] Updated weights for policy 0, policy_version 723802 (0.0029) [2024-06-24 19:15:33,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11858788352. Throughput: 0: 42919.9. Samples: 11858935380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:33,392][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 19:15:37,239][15401] Updated weights for policy 0, policy_version 723812 (0.0049) [2024-06-24 19:15:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11858984960. Throughput: 0: 42690.2. Samples: 11859059260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 19:15:40,791][15401] Updated weights for policy 0, policy_version 723822 (0.0044) [2024-06-24 19:15:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11859197952. Throughput: 0: 42663.2. Samples: 11859313240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 19:15:43,521][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723829_11859214336.pth... [2024-06-24 19:15:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723203_11848957952.pth [2024-06-24 19:15:44,692][15401] Updated weights for policy 0, policy_version 723832 (0.0047) [2024-06-24 19:15:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11859394560. Throughput: 0: 42796.5. Samples: 11859574900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 19:15:48,709][15401] Updated weights for policy 0, policy_version 723842 (0.0029) [2024-06-24 19:15:52,143][15401] Updated weights for policy 0, policy_version 723852 (0.0037) [2024-06-24 19:15:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11859640320. Throughput: 0: 42588.6. Samples: 11859697140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:53,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-24 19:15:56,535][15401] Updated weights for policy 0, policy_version 723862 (0.0039) [2024-06-24 19:15:58,393][15132] Fps is (10 sec: 45857.3, 60 sec: 42868.7, 300 sec: 42820.0). Total num frames: 11859853312. Throughput: 0: 42737.3. Samples: 11859958700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:15:58,394][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 19:16:00,019][15401] Updated weights for policy 0, policy_version 723872 (0.0040) [2024-06-24 19:16:02,946][15349] Signal inference workers to stop experience collection... (175550 times) [2024-06-24 19:16:02,952][15349] Signal inference workers to resume experience collection... (175550 times) [2024-06-24 19:16:02,981][15401] InferenceWorker_p0-w0: stopping experience collection (175550 times) [2024-06-24 19:16:02,982][15401] InferenceWorker_p0-w0: resuming experience collection (175550 times) [2024-06-24 19:16:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11860049920. Throughput: 0: 42971.5. Samples: 11860219240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:16:03,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 19:16:04,079][15401] Updated weights for policy 0, policy_version 723882 (0.0025) [2024-06-24 19:16:07,703][15401] Updated weights for policy 0, policy_version 723892 (0.0031) [2024-06-24 19:16:08,389][15132] Fps is (10 sec: 42615.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 11860279296. Throughput: 0: 42708.5. Samples: 11860341280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:16:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 19:16:11,631][15401] Updated weights for policy 0, policy_version 723902 (0.0027) [2024-06-24 19:16:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11860475904. Throughput: 0: 42856.6. Samples: 11860605000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:16:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 19:16:15,039][15401] Updated weights for policy 0, policy_version 723912 (0.0033) [2024-06-24 19:16:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 11860705280. Throughput: 0: 42841.5. Samples: 11860863140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:16:18,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 19:16:19,218][15401] Updated weights for policy 0, policy_version 723922 (0.0028) [2024-06-24 19:16:22,991][15401] Updated weights for policy 0, policy_version 723932 (0.0041) [2024-06-24 19:16:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11860934656. Throughput: 0: 42927.6. Samples: 11860991000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 19:16:23,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 19:16:27,007][15401] Updated weights for policy 0, policy_version 723942 (0.0042) [2024-06-24 19:16:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11861114880. Throughput: 0: 43048.9. Samples: 11861250440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 19:16:30,636][15401] Updated weights for policy 0, policy_version 723952 (0.0034) [2024-06-24 19:16:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 11861327872. Throughput: 0: 42912.0. Samples: 11861505940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 19:16:34,663][15401] Updated weights for policy 0, policy_version 723962 (0.0036) [2024-06-24 19:16:38,161][15401] Updated weights for policy 0, policy_version 723972 (0.0029) [2024-06-24 19:16:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11861573632. Throughput: 0: 43019.5. Samples: 11861633020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:38,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 19:16:42,251][15401] Updated weights for policy 0, policy_version 723982 (0.0029) [2024-06-24 19:16:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 11861770240. Throughput: 0: 43057.8. Samples: 11861896140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 19:16:46,125][15401] Updated weights for policy 0, policy_version 723992 (0.0036) [2024-06-24 19:16:48,393][15132] Fps is (10 sec: 40945.0, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 11861983232. Throughput: 0: 42750.4. Samples: 11862143160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:48,394][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 19:16:49,795][15401] Updated weights for policy 0, policy_version 724002 (0.0042) [2024-06-24 19:16:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11862196224. Throughput: 0: 42990.2. Samples: 11862275840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 19:16:53,738][15401] Updated weights for policy 0, policy_version 724012 (0.0043) [2024-06-24 19:16:57,325][15401] Updated weights for policy 0, policy_version 724022 (0.0037) [2024-06-24 19:16:58,390][15132] Fps is (10 sec: 42613.6, 60 sec: 42601.1, 300 sec: 42820.5). Total num frames: 11862409216. Throughput: 0: 42852.7. Samples: 11862533380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:16:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 19:17:01,360][15401] Updated weights for policy 0, policy_version 724032 (0.0035) [2024-06-24 19:17:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.8). Total num frames: 11862622208. Throughput: 0: 42773.8. Samples: 11862787960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 19:17:04,959][15401] Updated weights for policy 0, policy_version 724042 (0.0037) [2024-06-24 19:17:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11862835200. Throughput: 0: 42792.8. Samples: 11862916680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 19:17:08,934][15401] Updated weights for policy 0, policy_version 724052 (0.0035) [2024-06-24 19:17:12,475][15401] Updated weights for policy 0, policy_version 724062 (0.0032) [2024-06-24 19:17:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 11863064576. Throughput: 0: 42767.6. Samples: 11863174980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 19:17:16,728][15401] Updated weights for policy 0, policy_version 724072 (0.0047) [2024-06-24 19:17:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11863261184. Throughput: 0: 42862.3. Samples: 11863434740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:18,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 19:17:20,079][15401] Updated weights for policy 0, policy_version 724082 (0.0040) [2024-06-24 19:17:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11863474176. Throughput: 0: 42812.5. Samples: 11863559580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 19:17:24,111][15401] Updated weights for policy 0, policy_version 724092 (0.0039) [2024-06-24 19:17:28,158][15401] Updated weights for policy 0, policy_version 724102 (0.0031) [2024-06-24 19:17:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11863703552. Throughput: 0: 42751.3. Samples: 11863819940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 19:17:31,829][15401] Updated weights for policy 0, policy_version 724112 (0.0035) [2024-06-24 19:17:32,621][15349] Signal inference workers to stop experience collection... (175600 times) [2024-06-24 19:17:32,621][15349] Signal inference workers to resume experience collection... (175600 times) [2024-06-24 19:17:32,648][15401] InferenceWorker_p0-w0: stopping experience collection (175600 times) [2024-06-24 19:17:32,648][15401] InferenceWorker_p0-w0: resuming experience collection (175600 times) [2024-06-24 19:17:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42821.0). Total num frames: 11863916544. Throughput: 0: 42924.8. Samples: 11864074620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 19:17:35,647][15401] Updated weights for policy 0, policy_version 724122 (0.0026) [2024-06-24 19:17:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11864113152. Throughput: 0: 42810.7. Samples: 11864202320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 19:17:39,817][15401] Updated weights for policy 0, policy_version 724132 (0.0026) [2024-06-24 19:17:43,283][15401] Updated weights for policy 0, policy_version 724142 (0.0040) [2024-06-24 19:17:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11864342528. Throughput: 0: 42877.9. Samples: 11864462880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 19:17:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724143_11864358912.pth... [2024-06-24 19:17:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723515_11854069760.pth [2024-06-24 19:17:47,413][15401] Updated weights for policy 0, policy_version 724152 (0.0026) [2024-06-24 19:17:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 11864555520. Throughput: 0: 42783.1. Samples: 11864713200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 19:17:51,177][15401] Updated weights for policy 0, policy_version 724162 (0.0035) [2024-06-24 19:17:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11864768512. Throughput: 0: 42820.8. Samples: 11864843620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:53,394][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 19:17:54,889][15401] Updated weights for policy 0, policy_version 724172 (0.0036) [2024-06-24 19:17:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 11864981504. Throughput: 0: 42713.3. Samples: 11865097080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:17:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 19:17:58,799][15401] Updated weights for policy 0, policy_version 724182 (0.0025) [2024-06-24 19:18:02,335][15401] Updated weights for policy 0, policy_version 724192 (0.0029) [2024-06-24 19:18:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11865194496. Throughput: 0: 42686.2. Samples: 11865355620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:18:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:18:06,290][15401] Updated weights for policy 0, policy_version 724202 (0.0029) [2024-06-24 19:18:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11865407488. Throughput: 0: 42760.4. Samples: 11865483800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 19:18:10,121][15401] Updated weights for policy 0, policy_version 724212 (0.0042) [2024-06-24 19:18:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11865604096. Throughput: 0: 42651.5. Samples: 11865739260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 19:18:14,037][15401] Updated weights for policy 0, policy_version 724222 (0.0046) [2024-06-24 19:18:17,859][15401] Updated weights for policy 0, policy_version 724232 (0.0033) [2024-06-24 19:18:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11865817088. Throughput: 0: 42563.2. Samples: 11865989960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 19:18:21,910][15401] Updated weights for policy 0, policy_version 724242 (0.0037) [2024-06-24 19:18:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11866030080. Throughput: 0: 42616.0. Samples: 11866120040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 19:18:25,525][15401] Updated weights for policy 0, policy_version 724252 (0.0031) [2024-06-24 19:18:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11866243072. Throughput: 0: 42501.8. Samples: 11866375460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 19:18:29,904][15401] Updated weights for policy 0, policy_version 724262 (0.0038) [2024-06-24 19:18:33,146][15401] Updated weights for policy 0, policy_version 724272 (0.0028) [2024-06-24 19:18:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11866472448. Throughput: 0: 42536.9. Samples: 11866627360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 19:18:37,781][15401] Updated weights for policy 0, policy_version 724282 (0.0040) [2024-06-24 19:18:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11866652672. Throughput: 0: 42488.5. Samples: 11866755600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:38,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:18:40,911][15401] Updated weights for policy 0, policy_version 724292 (0.0035) [2024-06-24 19:18:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11866865664. Throughput: 0: 42406.3. Samples: 11867005360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:43,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 19:18:45,507][15401] Updated weights for policy 0, policy_version 724302 (0.0027) [2024-06-24 19:18:45,969][15349] Signal inference workers to stop experience collection... (175650 times) [2024-06-24 19:18:45,977][15349] Signal inference workers to resume experience collection... (175650 times) [2024-06-24 19:18:45,999][15401] InferenceWorker_p0-w0: stopping experience collection (175650 times) [2024-06-24 19:18:45,999][15401] InferenceWorker_p0-w0: resuming experience collection (175650 times) [2024-06-24 19:18:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11867111424. Throughput: 0: 42408.0. Samples: 11867263980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:48,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 19:18:48,483][15401] Updated weights for policy 0, policy_version 724312 (0.0028) [2024-06-24 19:18:53,304][15401] Updated weights for policy 0, policy_version 724322 (0.0032) [2024-06-24 19:18:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11867291648. Throughput: 0: 42431.1. Samples: 11867393200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 19:18:56,060][15401] Updated weights for policy 0, policy_version 724332 (0.0033) [2024-06-24 19:18:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 11867521024. Throughput: 0: 42195.1. Samples: 11867638040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:18:58,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 19:19:00,911][15401] Updated weights for policy 0, policy_version 724342 (0.0049) [2024-06-24 19:19:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11867750400. Throughput: 0: 42429.8. Samples: 11867899300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:03,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 19:19:03,676][15401] Updated weights for policy 0, policy_version 724352 (0.0044) [2024-06-24 19:19:08,392][15132] Fps is (10 sec: 39311.8, 60 sec: 41777.5, 300 sec: 42598.1). Total num frames: 11867914240. Throughput: 0: 42451.9. Samples: 11868030480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:08,393][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 19:19:08,923][15401] Updated weights for policy 0, policy_version 724362 (0.0037) [2024-06-24 19:19:11,345][15401] Updated weights for policy 0, policy_version 724372 (0.0038) [2024-06-24 19:19:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11868176384. Throughput: 0: 42221.7. Samples: 11868275540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:13,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 19:19:16,477][15401] Updated weights for policy 0, policy_version 724382 (0.0049) [2024-06-24 19:19:18,390][15132] Fps is (10 sec: 47525.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11868389376. Throughput: 0: 42508.4. Samples: 11868540240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 19:19:19,065][15401] Updated weights for policy 0, policy_version 724392 (0.0035) [2024-06-24 19:19:23,391][15132] Fps is (10 sec: 39325.7, 60 sec: 42324.3, 300 sec: 42709.3). Total num frames: 11868569600. Throughput: 0: 42536.0. Samples: 11868669780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:23,391][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 19:19:23,943][15401] Updated weights for policy 0, policy_version 724402 (0.0032) [2024-06-24 19:19:26,902][15401] Updated weights for policy 0, policy_version 724412 (0.0033) [2024-06-24 19:19:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11868815360. Throughput: 0: 42387.9. Samples: 11868912820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:28,399][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 19:19:31,700][15401] Updated weights for policy 0, policy_version 724422 (0.0043) [2024-06-24 19:19:33,389][15132] Fps is (10 sec: 44242.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11869011968. Throughput: 0: 42606.6. Samples: 11869181280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 19:19:34,373][15401] Updated weights for policy 0, policy_version 724432 (0.0032) [2024-06-24 19:19:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11869208576. Throughput: 0: 42560.9. Samples: 11869308440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 19:19:39,337][15401] Updated weights for policy 0, policy_version 724442 (0.0039) [2024-06-24 19:19:42,018][15401] Updated weights for policy 0, policy_version 724452 (0.0031) [2024-06-24 19:19:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11869454336. Throughput: 0: 42728.9. Samples: 11869560840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 19:19:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724454_11869454336.pth... [2024-06-24 19:19:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000723829_11859214336.pth [2024-06-24 19:19:47,031][15401] Updated weights for policy 0, policy_version 724462 (0.0037) [2024-06-24 19:19:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11869650944. Throughput: 0: 42724.9. Samples: 11869821920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-24 19:19:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 19:19:49,860][15401] Updated weights for policy 0, policy_version 724472 (0.0031) [2024-06-24 19:19:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11869863936. Throughput: 0: 42592.6. Samples: 11869947040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:19:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 19:19:54,437][15401] Updated weights for policy 0, policy_version 724482 (0.0036) [2024-06-24 19:19:57,492][15401] Updated weights for policy 0, policy_version 724492 (0.0046) [2024-06-24 19:19:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11870109696. Throughput: 0: 42914.7. Samples: 11870206600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:19:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 19:20:02,102][15401] Updated weights for policy 0, policy_version 724502 (0.0035) [2024-06-24 19:20:03,391][15132] Fps is (10 sec: 42589.7, 60 sec: 42323.9, 300 sec: 42709.2). Total num frames: 11870289920. Throughput: 0: 42853.7. Samples: 11870468740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:03,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 19:20:05,244][15401] Updated weights for policy 0, policy_version 724512 (0.0026) [2024-06-24 19:20:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 11870502912. Throughput: 0: 42565.8. Samples: 11870585180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:20:09,693][15401] Updated weights for policy 0, policy_version 724522 (0.0040) [2024-06-24 19:20:12,951][15401] Updated weights for policy 0, policy_version 724532 (0.0031) [2024-06-24 19:20:13,390][15132] Fps is (10 sec: 47522.5, 60 sec: 43146.2, 300 sec: 42765.3). Total num frames: 11870765056. Throughput: 0: 43013.7. Samples: 11870848440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 19:20:17,157][15401] Updated weights for policy 0, policy_version 724542 (0.0035) [2024-06-24 19:20:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11870928896. Throughput: 0: 42835.6. Samples: 11871108880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 19:20:19,312][15349] Signal inference workers to stop experience collection... (175700 times) [2024-06-24 19:20:19,347][15401] InferenceWorker_p0-w0: stopping experience collection (175700 times) [2024-06-24 19:20:19,371][15349] Signal inference workers to resume experience collection... (175700 times) [2024-06-24 19:20:19,371][15401] InferenceWorker_p0-w0: resuming experience collection (175700 times) [2024-06-24 19:20:20,505][15401] Updated weights for policy 0, policy_version 724552 (0.0041) [2024-06-24 19:20:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42872.4, 300 sec: 42598.4). Total num frames: 11871141888. Throughput: 0: 42632.9. Samples: 11871226920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 19:20:24,528][15401] Updated weights for policy 0, policy_version 724562 (0.0031) [2024-06-24 19:20:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 11871371264. Throughput: 0: 42959.1. Samples: 11871494000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:28,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 19:20:28,403][15401] Updated weights for policy 0, policy_version 724572 (0.0040) [2024-06-24 19:20:32,494][15401] Updated weights for policy 0, policy_version 724582 (0.0043) [2024-06-24 19:20:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11871567872. Throughput: 0: 42787.1. Samples: 11871747340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 19:20:36,302][15401] Updated weights for policy 0, policy_version 724592 (0.0030) [2024-06-24 19:20:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11871780864. Throughput: 0: 42769.2. Samples: 11871871660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 19:20:40,164][15401] Updated weights for policy 0, policy_version 724602 (0.0056) [2024-06-24 19:20:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11871993856. Throughput: 0: 42824.4. Samples: 11872133700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 19:20:43,847][15401] Updated weights for policy 0, policy_version 724612 (0.0037) [2024-06-24 19:20:47,686][15401] Updated weights for policy 0, policy_version 724622 (0.0031) [2024-06-24 19:20:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11872223232. Throughput: 0: 42645.8. Samples: 11872387720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 19:20:51,645][15401] Updated weights for policy 0, policy_version 724632 (0.0028) [2024-06-24 19:20:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 11872436224. Throughput: 0: 42962.6. Samples: 11872518500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 19:20:55,285][15401] Updated weights for policy 0, policy_version 724642 (0.0043) [2024-06-24 19:20:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 11872616448. Throughput: 0: 42661.8. Samples: 11872768220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:20:58,391][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 19:20:59,481][15401] Updated weights for policy 0, policy_version 724652 (0.0031) [2024-06-24 19:21:02,970][15401] Updated weights for policy 0, policy_version 724662 (0.0040) [2024-06-24 19:21:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.0, 300 sec: 42709.5). Total num frames: 11872878592. Throughput: 0: 42517.9. Samples: 11873022180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 19:21:07,318][15401] Updated weights for policy 0, policy_version 724672 (0.0034) [2024-06-24 19:21:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11873075200. Throughput: 0: 42838.6. Samples: 11873154660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 19:21:10,504][15401] Updated weights for policy 0, policy_version 724682 (0.0025) [2024-06-24 19:21:13,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.2, 300 sec: 42542.9). Total num frames: 11873255424. Throughput: 0: 42702.8. Samples: 11873415620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 19:21:14,829][15401] Updated weights for policy 0, policy_version 724692 (0.0032) [2024-06-24 19:21:18,028][15401] Updated weights for policy 0, policy_version 724702 (0.0036) [2024-06-24 19:21:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11873533952. Throughput: 0: 42381.8. Samples: 11873654520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 19:21:22,367][15401] Updated weights for policy 0, policy_version 724712 (0.0045) [2024-06-24 19:21:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11873697792. Throughput: 0: 42829.5. Samples: 11873798980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:23,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 19:21:25,795][15401] Updated weights for policy 0, policy_version 724722 (0.0034) [2024-06-24 19:21:28,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11873910784. Throughput: 0: 42534.2. Samples: 11874047740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:21:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 19:21:28,845][15349] Signal inference workers to stop experience collection... (175750 times) [2024-06-24 19:21:28,845][15349] Signal inference workers to resume experience collection... (175750 times) [2024-06-24 19:21:28,872][15401] InferenceWorker_p0-w0: stopping experience collection (175750 times) [2024-06-24 19:21:28,873][15401] InferenceWorker_p0-w0: resuming experience collection (175750 times) [2024-06-24 19:21:30,339][15401] Updated weights for policy 0, policy_version 724732 (0.0030) [2024-06-24 19:21:33,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 11874156544. Throughput: 0: 42480.4. Samples: 11874299340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:33,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 19:21:33,522][15401] Updated weights for policy 0, policy_version 724742 (0.0037) [2024-06-24 19:21:38,042][15401] Updated weights for policy 0, policy_version 724752 (0.0046) [2024-06-24 19:21:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 11874353152. Throughput: 0: 42630.2. Samples: 11874436960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:38,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 19:21:41,495][15401] Updated weights for policy 0, policy_version 724762 (0.0026) [2024-06-24 19:21:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 11874549760. Throughput: 0: 42598.7. Samples: 11874685160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 19:21:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724765_11874549760.pth... [2024-06-24 19:21:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724143_11864358912.pth [2024-06-24 19:21:45,858][15401] Updated weights for policy 0, policy_version 724772 (0.0039) [2024-06-24 19:21:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11874795520. Throughput: 0: 42575.0. Samples: 11874938060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 19:21:48,982][15401] Updated weights for policy 0, policy_version 724782 (0.0031) [2024-06-24 19:21:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 11874975744. Throughput: 0: 42609.6. Samples: 11875072100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 19:21:53,411][15401] Updated weights for policy 0, policy_version 724792 (0.0032) [2024-06-24 19:21:56,714][15401] Updated weights for policy 0, policy_version 724802 (0.0034) [2024-06-24 19:21:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11875188736. Throughput: 0: 42197.7. Samples: 11875314520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:21:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:22:00,940][15401] Updated weights for policy 0, policy_version 724812 (0.0031) [2024-06-24 19:22:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.1, 300 sec: 42709.4). Total num frames: 11875434496. Throughput: 0: 42754.4. Samples: 11875578480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 19:22:04,153][15401] Updated weights for policy 0, policy_version 724822 (0.0030) [2024-06-24 19:22:08,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 11875631104. Throughput: 0: 42451.0. Samples: 11875709380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:08,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 19:22:08,582][15401] Updated weights for policy 0, policy_version 724832 (0.0042) [2024-06-24 19:22:11,609][15401] Updated weights for policy 0, policy_version 724842 (0.0038) [2024-06-24 19:22:13,390][15132] Fps is (10 sec: 42599.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 11875860480. Throughput: 0: 42528.9. Samples: 11875961540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 19:22:16,416][15401] Updated weights for policy 0, policy_version 724852 (0.0034) [2024-06-24 19:22:18,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11876073472. Throughput: 0: 42742.3. Samples: 11876222740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 19:22:19,182][15401] Updated weights for policy 0, policy_version 724862 (0.0034) [2024-06-24 19:22:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11876270080. Throughput: 0: 42507.1. Samples: 11876349680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 19:22:24,208][15401] Updated weights for policy 0, policy_version 724872 (0.0039) [2024-06-24 19:22:26,803][15401] Updated weights for policy 0, policy_version 724882 (0.0035) [2024-06-24 19:22:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11876499456. Throughput: 0: 42549.4. Samples: 11876599880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:28,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-24 19:22:31,746][15401] Updated weights for policy 0, policy_version 724892 (0.0042) [2024-06-24 19:22:33,392][15132] Fps is (10 sec: 44228.1, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 11876712448. Throughput: 0: 42787.8. Samples: 11876863600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:33,392][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 19:22:34,973][15401] Updated weights for policy 0, policy_version 724902 (0.0035) [2024-06-24 19:22:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 11876892672. Throughput: 0: 42692.8. Samples: 11876993260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 19:22:39,378][15401] Updated weights for policy 0, policy_version 724912 (0.0028) [2024-06-24 19:22:42,457][15401] Updated weights for policy 0, policy_version 724922 (0.0037) [2024-06-24 19:22:43,390][15132] Fps is (10 sec: 42606.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11877138432. Throughput: 0: 43048.4. Samples: 11877251700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:43,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 19:22:47,352][15401] Updated weights for policy 0, policy_version 724932 (0.0032) [2024-06-24 19:22:47,380][15349] Signal inference workers to stop experience collection... (175800 times) [2024-06-24 19:22:47,380][15349] Signal inference workers to resume experience collection... (175800 times) [2024-06-24 19:22:47,424][15401] InferenceWorker_p0-w0: stopping experience collection (175800 times) [2024-06-24 19:22:47,424][15401] InferenceWorker_p0-w0: resuming experience collection (175800 times) [2024-06-24 19:22:48,391][15132] Fps is (10 sec: 45866.9, 60 sec: 42597.2, 300 sec: 42653.7). Total num frames: 11877351424. Throughput: 0: 42883.6. Samples: 11877508300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:48,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 19:22:50,064][15401] Updated weights for policy 0, policy_version 724942 (0.0026) [2024-06-24 19:22:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11877548032. Throughput: 0: 42875.6. Samples: 11877638680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 19:22:54,742][15401] Updated weights for policy 0, policy_version 724952 (0.0028) [2024-06-24 19:22:57,733][15401] Updated weights for policy 0, policy_version 724962 (0.0033) [2024-06-24 19:22:58,389][15132] Fps is (10 sec: 44244.5, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 11877793792. Throughput: 0: 42929.0. Samples: 11877893340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:22:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 19:23:02,779][15401] Updated weights for policy 0, policy_version 724972 (0.0037) [2024-06-24 19:23:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 11877990400. Throughput: 0: 42996.9. Samples: 11878157700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:23:03,393][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 19:23:05,512][15401] Updated weights for policy 0, policy_version 724982 (0.0032) [2024-06-24 19:23:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 11878187008. Throughput: 0: 42972.9. Samples: 11878283460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 19:23:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 19:23:10,321][15401] Updated weights for policy 0, policy_version 724992 (0.0036) [2024-06-24 19:23:13,356][15401] Updated weights for policy 0, policy_version 725002 (0.0045) [2024-06-24 19:23:13,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11878432768. Throughput: 0: 43076.1. Samples: 11878538300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 19:23:17,899][15401] Updated weights for policy 0, policy_version 725012 (0.0035) [2024-06-24 19:23:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11878629376. Throughput: 0: 42999.8. Samples: 11878798500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 19:23:21,014][15401] Updated weights for policy 0, policy_version 725022 (0.0025) [2024-06-24 19:23:23,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11878842368. Throughput: 0: 42836.6. Samples: 11878920920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:23,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 19:23:25,383][15401] Updated weights for policy 0, policy_version 725032 (0.0035) [2024-06-24 19:23:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11879071744. Throughput: 0: 42948.5. Samples: 11879184380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 19:23:28,787][15401] Updated weights for policy 0, policy_version 725042 (0.0027) [2024-06-24 19:23:33,041][15401] Updated weights for policy 0, policy_version 725052 (0.0035) [2024-06-24 19:23:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42599.8, 300 sec: 42765.0). Total num frames: 11879268352. Throughput: 0: 43066.0. Samples: 11879446200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 19:23:36,379][15401] Updated weights for policy 0, policy_version 725062 (0.0029) [2024-06-24 19:23:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11879497728. Throughput: 0: 42947.3. Samples: 11879571300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 19:23:40,539][15401] Updated weights for policy 0, policy_version 725072 (0.0042) [2024-06-24 19:23:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11879710720. Throughput: 0: 43087.5. Samples: 11879832280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 19:23:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725081_11879727104.pth... [2024-06-24 19:23:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724454_11869454336.pth [2024-06-24 19:23:43,943][15401] Updated weights for policy 0, policy_version 725082 (0.0037) [2024-06-24 19:23:48,113][15401] Updated weights for policy 0, policy_version 725092 (0.0030) [2024-06-24 19:23:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42599.6, 300 sec: 42765.0). Total num frames: 11879907328. Throughput: 0: 42821.4. Samples: 11880084560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:48,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 19:23:51,510][15401] Updated weights for policy 0, policy_version 725102 (0.0032) [2024-06-24 19:23:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11880136704. Throughput: 0: 42834.8. Samples: 11880211020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:53,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 19:23:55,777][15401] Updated weights for policy 0, policy_version 725112 (0.0045) [2024-06-24 19:23:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11880333312. Throughput: 0: 42936.5. Samples: 11880470440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:23:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:23:59,307][15401] Updated weights for policy 0, policy_version 725122 (0.0047) [2024-06-24 19:24:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.3, 300 sec: 42876.5). Total num frames: 11880562688. Throughput: 0: 42903.6. Samples: 11880729160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 19:24:03,395][15401] Updated weights for policy 0, policy_version 725132 (0.0029) [2024-06-24 19:24:06,850][15401] Updated weights for policy 0, policy_version 725142 (0.0029) [2024-06-24 19:24:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 11880775680. Throughput: 0: 43099.7. Samples: 11880860400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 19:24:10,960][15401] Updated weights for policy 0, policy_version 725152 (0.0047) [2024-06-24 19:24:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11880988672. Throughput: 0: 42821.2. Samples: 11881111340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 19:24:13,640][15349] Signal inference workers to stop experience collection... (175850 times) [2024-06-24 19:24:13,640][15349] Signal inference workers to resume experience collection... (175850 times) [2024-06-24 19:24:13,684][15401] InferenceWorker_p0-w0: stopping experience collection (175850 times) [2024-06-24 19:24:13,684][15401] InferenceWorker_p0-w0: resuming experience collection (175850 times) [2024-06-24 19:24:15,175][15401] Updated weights for policy 0, policy_version 725162 (0.0027) [2024-06-24 19:24:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.7). Total num frames: 11881201664. Throughput: 0: 42789.7. Samples: 11881371740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 19:24:18,538][15401] Updated weights for policy 0, policy_version 725172 (0.0027) [2024-06-24 19:24:22,769][15401] Updated weights for policy 0, policy_version 725182 (0.0035) [2024-06-24 19:24:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11881431040. Throughput: 0: 42874.6. Samples: 11881500660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:23,396][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 19:24:26,200][15401] Updated weights for policy 0, policy_version 725192 (0.0042) [2024-06-24 19:24:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11881644032. Throughput: 0: 42923.1. Samples: 11881763820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 19:24:30,303][15401] Updated weights for policy 0, policy_version 725202 (0.0036) [2024-06-24 19:24:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11881857024. Throughput: 0: 42992.9. Samples: 11882019240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 19:24:33,850][15401] Updated weights for policy 0, policy_version 725212 (0.0038) [2024-06-24 19:24:38,027][15401] Updated weights for policy 0, policy_version 725222 (0.0032) [2024-06-24 19:24:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 11882053632. Throughput: 0: 43100.7. Samples: 11882150660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:38,397][15132] Avg episode reward: [(0, '0.260')] [2024-06-24 19:24:41,316][15401] Updated weights for policy 0, policy_version 725232 (0.0036) [2024-06-24 19:24:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11882283008. Throughput: 0: 42999.0. Samples: 11882405400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 19:24:45,724][15401] Updated weights for policy 0, policy_version 725242 (0.0035) [2024-06-24 19:24:48,392][15132] Fps is (10 sec: 44237.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11882496000. Throughput: 0: 42903.8. Samples: 11882659940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:48,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 19:24:48,844][15401] Updated weights for policy 0, policy_version 725252 (0.0048) [2024-06-24 19:24:53,143][15401] Updated weights for policy 0, policy_version 725262 (0.0026) [2024-06-24 19:24:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11882708992. Throughput: 0: 42820.3. Samples: 11882787320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 19:24:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 19:24:56,546][15401] Updated weights for policy 0, policy_version 725272 (0.0060) [2024-06-24 19:24:58,392][15132] Fps is (10 sec: 42598.5, 60 sec: 43142.7, 300 sec: 42820.5). Total num frames: 11882921984. Throughput: 0: 42871.1. Samples: 11883040640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:24:58,401][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 19:25:00,870][15401] Updated weights for policy 0, policy_version 725282 (0.0031) [2024-06-24 19:25:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.3, 300 sec: 42876.1). Total num frames: 11883151360. Throughput: 0: 42826.5. Samples: 11883298940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 19:25:04,346][15401] Updated weights for policy 0, policy_version 725292 (0.0045) [2024-06-24 19:25:08,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11883331584. Throughput: 0: 42697.8. Samples: 11883422060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 19:25:08,566][15401] Updated weights for policy 0, policy_version 725302 (0.0028) [2024-06-24 19:25:12,371][15401] Updated weights for policy 0, policy_version 725312 (0.0029) [2024-06-24 19:25:13,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 11883577344. Throughput: 0: 42712.3. Samples: 11883685980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:13,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 19:25:16,093][15401] Updated weights for policy 0, policy_version 725322 (0.0030) [2024-06-24 19:25:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11883773952. Throughput: 0: 42618.3. Samples: 11883937060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 19:25:20,264][15401] Updated weights for policy 0, policy_version 725332 (0.0034) [2024-06-24 19:25:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11883986944. Throughput: 0: 42557.8. Samples: 11884065660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 19:25:23,888][15401] Updated weights for policy 0, policy_version 725342 (0.0025) [2024-06-24 19:25:27,844][15401] Updated weights for policy 0, policy_version 725352 (0.0032) [2024-06-24 19:25:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 11884216320. Throughput: 0: 42575.1. Samples: 11884321380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:28,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 19:25:31,379][15401] Updated weights for policy 0, policy_version 725362 (0.0040) [2024-06-24 19:25:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 11884429312. Throughput: 0: 42495.5. Samples: 11884572240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:33,393][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 19:25:35,455][15401] Updated weights for policy 0, policy_version 725372 (0.0026) [2024-06-24 19:25:38,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 11884609536. Throughput: 0: 42589.1. Samples: 11884703820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 19:25:38,927][15401] Updated weights for policy 0, policy_version 725382 (0.0025) [2024-06-24 19:25:39,745][15349] Signal inference workers to stop experience collection... (175900 times) [2024-06-24 19:25:39,752][15349] Signal inference workers to resume experience collection... (175900 times) [2024-06-24 19:25:39,754][15401] InferenceWorker_p0-w0: stopping experience collection (175900 times) [2024-06-24 19:25:39,783][15401] InferenceWorker_p0-w0: resuming experience collection (175900 times) [2024-06-24 19:25:43,032][15401] Updated weights for policy 0, policy_version 725392 (0.0042) [2024-06-24 19:25:43,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11884838912. Throughput: 0: 42668.6. Samples: 11884960620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 19:25:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725394_11884855296.pth... [2024-06-24 19:25:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000724765_11874549760.pth [2024-06-24 19:25:46,785][15401] Updated weights for policy 0, policy_version 725402 (0.0038) [2024-06-24 19:25:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11885051904. Throughput: 0: 42561.5. Samples: 11885214200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 19:25:50,730][15401] Updated weights for policy 0, policy_version 725412 (0.0025) [2024-06-24 19:25:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 11885248512. Throughput: 0: 42626.9. Samples: 11885340280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:53,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 19:25:54,376][15401] Updated weights for policy 0, policy_version 725422 (0.0032) [2024-06-24 19:25:58,259][15401] Updated weights for policy 0, policy_version 725432 (0.0035) [2024-06-24 19:25:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 11885477888. Throughput: 0: 42479.7. Samples: 11885597460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:25:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 19:26:02,011][15401] Updated weights for policy 0, policy_version 725442 (0.0032) [2024-06-24 19:26:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11885707264. Throughput: 0: 42571.5. Samples: 11885852780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 19:26:05,884][15401] Updated weights for policy 0, policy_version 725452 (0.0022) [2024-06-24 19:26:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11885903872. Throughput: 0: 42645.1. Samples: 11885984680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:08,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-24 19:26:09,538][15401] Updated weights for policy 0, policy_version 725462 (0.0037) [2024-06-24 19:26:13,338][15401] Updated weights for policy 0, policy_version 725472 (0.0032) [2024-06-24 19:26:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 11886133248. Throughput: 0: 42746.2. Samples: 11886244860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 19:26:16,986][15401] Updated weights for policy 0, policy_version 725482 (0.0043) [2024-06-24 19:26:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11886346240. Throughput: 0: 42768.1. Samples: 11886496700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 19:26:21,400][15401] Updated weights for policy 0, policy_version 725492 (0.0029) [2024-06-24 19:26:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11886559232. Throughput: 0: 42908.4. Samples: 11886634700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 19:26:24,698][15401] Updated weights for policy 0, policy_version 725502 (0.0043) [2024-06-24 19:26:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 11886755840. Throughput: 0: 42910.6. Samples: 11886891600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 19:26:28,914][15401] Updated weights for policy 0, policy_version 725512 (0.0040) [2024-06-24 19:26:32,237][15401] Updated weights for policy 0, policy_version 725522 (0.0038) [2024-06-24 19:26:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11886985216. Throughput: 0: 42916.3. Samples: 11887145540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:26:33,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 19:26:36,414][15401] Updated weights for policy 0, policy_version 725532 (0.0024) [2024-06-24 19:26:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11887181824. Throughput: 0: 42961.4. Samples: 11887273640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:26:38,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 19:26:40,106][15401] Updated weights for policy 0, policy_version 725542 (0.0024) [2024-06-24 19:26:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11887411200. Throughput: 0: 42931.0. Samples: 11887529360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:26:43,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 19:26:44,025][15401] Updated weights for policy 0, policy_version 725552 (0.0039) [2024-06-24 19:26:47,492][15349] Signal inference workers to stop experience collection... (175950 times) [2024-06-24 19:26:47,493][15349] Signal inference workers to resume experience collection... (175950 times) [2024-06-24 19:26:47,505][15401] InferenceWorker_p0-w0: stopping experience collection (175950 times) [2024-06-24 19:26:47,506][15401] InferenceWorker_p0-w0: resuming experience collection (175950 times) [2024-06-24 19:26:47,657][15401] Updated weights for policy 0, policy_version 725562 (0.0029) [2024-06-24 19:26:48,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 11887624192. Throughput: 0: 42973.7. Samples: 11887786700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:26:48,393][15132] Avg episode reward: [(0, '0.197')] [2024-06-24 19:26:51,576][15401] Updated weights for policy 0, policy_version 725572 (0.0030) [2024-06-24 19:26:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11887837184. Throughput: 0: 43014.0. Samples: 11887920320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:26:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 19:26:55,556][15401] Updated weights for policy 0, policy_version 725582 (0.0038) [2024-06-24 19:26:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 11888050176. Throughput: 0: 42826.4. Samples: 11888172040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:26:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 19:26:59,646][15401] Updated weights for policy 0, policy_version 725592 (0.0031) [2024-06-24 19:27:03,326][15401] Updated weights for policy 0, policy_version 725602 (0.0043) [2024-06-24 19:27:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 11888263168. Throughput: 0: 42997.8. Samples: 11888431600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:03,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-24 19:27:07,178][15401] Updated weights for policy 0, policy_version 725612 (0.0032) [2024-06-24 19:27:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11888476160. Throughput: 0: 42673.4. Samples: 11888555000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 19:27:10,839][15401] Updated weights for policy 0, policy_version 725622 (0.0039) [2024-06-24 19:27:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 11888672768. Throughput: 0: 42632.0. Samples: 11888810140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:13,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 19:27:14,798][15401] Updated weights for policy 0, policy_version 725632 (0.0025) [2024-06-24 19:27:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11888902144. Throughput: 0: 42687.2. Samples: 11889066360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 19:27:18,405][15401] Updated weights for policy 0, policy_version 725642 (0.0028) [2024-06-24 19:27:22,410][15401] Updated weights for policy 0, policy_version 725652 (0.0035) [2024-06-24 19:27:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11889115136. Throughput: 0: 42806.7. Samples: 11889199840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 19:27:25,875][15401] Updated weights for policy 0, policy_version 725662 (0.0023) [2024-06-24 19:27:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 11889328128. Throughput: 0: 42809.8. Samples: 11889455800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 19:27:30,110][15401] Updated weights for policy 0, policy_version 725672 (0.0026) [2024-06-24 19:27:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 11889557504. Throughput: 0: 42821.8. Samples: 11889713580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 19:27:33,782][15401] Updated weights for policy 0, policy_version 725682 (0.0029) [2024-06-24 19:27:37,652][15401] Updated weights for policy 0, policy_version 725692 (0.0035) [2024-06-24 19:27:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11889754112. Throughput: 0: 42809.8. Samples: 11889846760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 19:27:41,354][15401] Updated weights for policy 0, policy_version 725702 (0.0023) [2024-06-24 19:27:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 11889983488. Throughput: 0: 42932.3. Samples: 11890104000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 19:27:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725707_11889983488.pth... [2024-06-24 19:27:43,441][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725081_11879727104.pth [2024-06-24 19:27:45,225][15401] Updated weights for policy 0, policy_version 725712 (0.0037) [2024-06-24 19:27:48,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42871.5, 300 sec: 42875.8). Total num frames: 11890196480. Throughput: 0: 42748.8. Samples: 11890355400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:48,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 19:27:49,292][15401] Updated weights for policy 0, policy_version 725722 (0.0033) [2024-06-24 19:27:53,240][15401] Updated weights for policy 0, policy_version 725732 (0.0035) [2024-06-24 19:27:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11890393088. Throughput: 0: 42976.4. Samples: 11890488940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 19:27:53,728][15349] Signal inference workers to stop experience collection... (176000 times) [2024-06-24 19:27:53,732][15349] Signal inference workers to resume experience collection... (176000 times) [2024-06-24 19:27:53,775][15401] InferenceWorker_p0-w0: stopping experience collection (176000 times) [2024-06-24 19:27:53,775][15401] InferenceWorker_p0-w0: resuming experience collection (176000 times) [2024-06-24 19:27:56,961][15401] Updated weights for policy 0, policy_version 725742 (0.0044) [2024-06-24 19:27:58,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 11890622464. Throughput: 0: 42980.5. Samples: 11890744160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:27:58,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-24 19:28:00,929][15401] Updated weights for policy 0, policy_version 725752 (0.0031) [2024-06-24 19:28:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11890851840. Throughput: 0: 42880.8. Samples: 11890996000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:28:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 19:28:04,509][15401] Updated weights for policy 0, policy_version 725762 (0.0024) [2024-06-24 19:28:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11891032064. Throughput: 0: 42901.3. Samples: 11891130400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:28:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 19:28:08,562][15401] Updated weights for policy 0, policy_version 725772 (0.0031) [2024-06-24 19:28:12,111][15401] Updated weights for policy 0, policy_version 725782 (0.0030) [2024-06-24 19:28:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 11891261440. Throughput: 0: 42906.2. Samples: 11891386680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:28:13,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 19:28:16,239][15401] Updated weights for policy 0, policy_version 725792 (0.0024) [2024-06-24 19:28:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11891490816. Throughput: 0: 42620.1. Samples: 11891631480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-24 19:28:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 19:28:19,784][15401] Updated weights for policy 0, policy_version 725802 (0.0036) [2024-06-24 19:28:23,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11891654656. Throughput: 0: 42774.9. Samples: 11891771620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:23,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 19:28:23,906][15401] Updated weights for policy 0, policy_version 725812 (0.0052) [2024-06-24 19:28:27,232][15401] Updated weights for policy 0, policy_version 725822 (0.0033) [2024-06-24 19:28:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11891916800. Throughput: 0: 42814.2. Samples: 11892030640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:28,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-24 19:28:31,423][15401] Updated weights for policy 0, policy_version 725832 (0.0030) [2024-06-24 19:28:33,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11892146176. Throughput: 0: 42753.8. Samples: 11892279220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 19:28:35,022][15401] Updated weights for policy 0, policy_version 725842 (0.0038) [2024-06-24 19:28:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11892310016. Throughput: 0: 42696.4. Samples: 11892410280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 19:28:39,091][15401] Updated weights for policy 0, policy_version 725852 (0.0028) [2024-06-24 19:28:42,597][15401] Updated weights for policy 0, policy_version 725862 (0.0038) [2024-06-24 19:28:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11892555776. Throughput: 0: 42710.7. Samples: 11892666140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 19:28:47,118][15401] Updated weights for policy 0, policy_version 725872 (0.0035) [2024-06-24 19:28:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11892752384. Throughput: 0: 42727.2. Samples: 11892918720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 19:28:50,212][15401] Updated weights for policy 0, policy_version 725882 (0.0032) [2024-06-24 19:28:53,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42764.6). Total num frames: 11892948992. Throughput: 0: 42467.1. Samples: 11893041520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:53,393][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 19:28:54,606][15401] Updated weights for policy 0, policy_version 725892 (0.0030) [2024-06-24 19:28:57,805][15401] Updated weights for policy 0, policy_version 725902 (0.0036) [2024-06-24 19:28:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11893178368. Throughput: 0: 42489.0. Samples: 11893298580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:28:58,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 19:29:02,275][15401] Updated weights for policy 0, policy_version 725912 (0.0031) [2024-06-24 19:29:03,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 11893391360. Throughput: 0: 42731.5. Samples: 11893554500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:03,393][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 19:29:05,845][15401] Updated weights for policy 0, policy_version 725922 (0.0032) [2024-06-24 19:29:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11893587968. Throughput: 0: 42492.3. Samples: 11893683780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 19:29:09,811][15401] Updated weights for policy 0, policy_version 725932 (0.0029) [2024-06-24 19:29:13,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 11893817344. Throughput: 0: 42428.4. Samples: 11893939920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 19:29:13,838][15401] Updated weights for policy 0, policy_version 725942 (0.0043) [2024-06-24 19:29:17,321][15401] Updated weights for policy 0, policy_version 725952 (0.0028) [2024-06-24 19:29:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11894030336. Throughput: 0: 42734.3. Samples: 11894202260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 19:29:18,501][15349] Signal inference workers to stop experience collection... (176050 times) [2024-06-24 19:29:18,507][15349] Signal inference workers to resume experience collection... (176050 times) [2024-06-24 19:29:18,540][15401] InferenceWorker_p0-w0: stopping experience collection (176050 times) [2024-06-24 19:29:18,540][15401] InferenceWorker_p0-w0: resuming experience collection (176050 times) [2024-06-24 19:29:21,277][15401] Updated weights for policy 0, policy_version 725962 (0.0038) [2024-06-24 19:29:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11894226944. Throughput: 0: 42769.8. Samples: 11894334920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 19:29:25,166][15401] Updated weights for policy 0, policy_version 725972 (0.0028) [2024-06-24 19:29:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11894456320. Throughput: 0: 42672.6. Samples: 11894586400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 19:29:28,851][15401] Updated weights for policy 0, policy_version 725982 (0.0032) [2024-06-24 19:29:32,909][15401] Updated weights for policy 0, policy_version 725992 (0.0029) [2024-06-24 19:29:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 11894685696. Throughput: 0: 42802.6. Samples: 11894844840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 19:29:36,414][15401] Updated weights for policy 0, policy_version 726002 (0.0033) [2024-06-24 19:29:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11894882304. Throughput: 0: 42952.4. Samples: 11894974280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 19:29:40,457][15401] Updated weights for policy 0, policy_version 726012 (0.0036) [2024-06-24 19:29:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 11895111680. Throughput: 0: 42980.2. Samples: 11895232800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:43,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 19:29:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726020_11895111680.pth... [2024-06-24 19:29:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725394_11884855296.pth [2024-06-24 19:29:44,118][15401] Updated weights for policy 0, policy_version 726022 (0.0023) [2024-06-24 19:29:48,059][15401] Updated weights for policy 0, policy_version 726032 (0.0028) [2024-06-24 19:29:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11895341056. Throughput: 0: 42960.0. Samples: 11895487600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 19:29:51,678][15401] Updated weights for policy 0, policy_version 726042 (0.0038) [2024-06-24 19:29:53,389][15132] Fps is (10 sec: 42609.4, 60 sec: 43146.4, 300 sec: 42765.4). Total num frames: 11895537664. Throughput: 0: 42939.3. Samples: 11895616040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 19:29:55,538][15401] Updated weights for policy 0, policy_version 726052 (0.0023) [2024-06-24 19:29:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11895750656. Throughput: 0: 43028.9. Samples: 11895876220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 19:29:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 19:29:59,173][15401] Updated weights for policy 0, policy_version 726062 (0.0043) [2024-06-24 19:30:03,316][15401] Updated weights for policy 0, policy_version 726072 (0.0038) [2024-06-24 19:30:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 11895963648. Throughput: 0: 42906.2. Samples: 11896133040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 19:30:06,735][15401] Updated weights for policy 0, policy_version 726082 (0.0032) [2024-06-24 19:30:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 11896176640. Throughput: 0: 42797.0. Samples: 11896260780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 19:30:10,810][15401] Updated weights for policy 0, policy_version 726092 (0.0047) [2024-06-24 19:30:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11896389632. Throughput: 0: 42909.1. Samples: 11896517320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 19:30:14,480][15401] Updated weights for policy 0, policy_version 726102 (0.0044) [2024-06-24 19:30:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11896602624. Throughput: 0: 42883.1. Samples: 11896774580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:18,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-24 19:30:18,655][15401] Updated weights for policy 0, policy_version 726112 (0.0028) [2024-06-24 19:30:22,356][15401] Updated weights for policy 0, policy_version 726122 (0.0032) [2024-06-24 19:30:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.3). Total num frames: 11896832000. Throughput: 0: 42750.7. Samples: 11896898060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:23,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 19:30:26,323][15401] Updated weights for policy 0, policy_version 726132 (0.0033) [2024-06-24 19:30:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11897028608. Throughput: 0: 42791.3. Samples: 11897158300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 19:30:29,920][15401] Updated weights for policy 0, policy_version 726142 (0.0037) [2024-06-24 19:30:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11897241600. Throughput: 0: 42824.0. Samples: 11897414680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 19:30:33,920][15401] Updated weights for policy 0, policy_version 726152 (0.0032) [2024-06-24 19:30:37,873][15401] Updated weights for policy 0, policy_version 726162 (0.0034) [2024-06-24 19:30:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11897454592. Throughput: 0: 42703.5. Samples: 11897537700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 19:30:41,454][15401] Updated weights for policy 0, policy_version 726172 (0.0034) [2024-06-24 19:30:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 11897667584. Throughput: 0: 42721.9. Samples: 11897798700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 19:30:45,429][15401] Updated weights for policy 0, policy_version 726182 (0.0035) [2024-06-24 19:30:48,076][15349] Signal inference workers to stop experience collection... (176100 times) [2024-06-24 19:30:48,094][15401] InferenceWorker_p0-w0: stopping experience collection (176100 times) [2024-06-24 19:30:48,188][15349] Signal inference workers to resume experience collection... (176100 times) [2024-06-24 19:30:48,188][15401] InferenceWorker_p0-w0: resuming experience collection (176100 times) [2024-06-24 19:30:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11897896960. Throughput: 0: 42727.0. Samples: 11898055760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 19:30:48,973][15401] Updated weights for policy 0, policy_version 726192 (0.0038) [2024-06-24 19:30:53,123][15401] Updated weights for policy 0, policy_version 726202 (0.0038) [2024-06-24 19:30:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11898109952. Throughput: 0: 42723.0. Samples: 11898183320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:53,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 19:30:56,849][15401] Updated weights for policy 0, policy_version 726212 (0.0047) [2024-06-24 19:30:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11898273792. Throughput: 0: 42564.6. Samples: 11898432720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:30:58,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 19:31:00,925][15401] Updated weights for policy 0, policy_version 726222 (0.0044) [2024-06-24 19:31:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11898535936. Throughput: 0: 42442.7. Samples: 11898684500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 19:31:04,777][15401] Updated weights for policy 0, policy_version 726232 (0.0036) [2024-06-24 19:31:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11898732544. Throughput: 0: 42708.4. Samples: 11898819940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 19:31:08,842][15401] Updated weights for policy 0, policy_version 726242 (0.0022) [2024-06-24 19:31:12,621][15401] Updated weights for policy 0, policy_version 726252 (0.0042) [2024-06-24 19:31:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11898929152. Throughput: 0: 42543.1. Samples: 11899072740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 19:31:16,421][15401] Updated weights for policy 0, policy_version 726262 (0.0036) [2024-06-24 19:31:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11899174912. Throughput: 0: 42460.5. Samples: 11899325400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 19:31:20,143][15401] Updated weights for policy 0, policy_version 726272 (0.0032) [2024-06-24 19:31:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11899371520. Throughput: 0: 42719.1. Samples: 11899460060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 19:31:24,004][15401] Updated weights for policy 0, policy_version 726282 (0.0037) [2024-06-24 19:31:27,711][15401] Updated weights for policy 0, policy_version 726292 (0.0036) [2024-06-24 19:31:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 11899584512. Throughput: 0: 42490.9. Samples: 11899710800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 19:31:31,552][15401] Updated weights for policy 0, policy_version 726302 (0.0033) [2024-06-24 19:31:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 11899797504. Throughput: 0: 42598.3. Samples: 11899972680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 19:31:35,241][15401] Updated weights for policy 0, policy_version 726312 (0.0031) [2024-06-24 19:31:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11899994112. Throughput: 0: 42517.8. Samples: 11900096620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 19:31:39,158][15401] Updated weights for policy 0, policy_version 726322 (0.0032) [2024-06-24 19:31:42,724][15401] Updated weights for policy 0, policy_version 726332 (0.0036) [2024-06-24 19:31:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 11900239872. Throughput: 0: 42730.2. Samples: 11900355580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 19:31:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 19:31:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726334_11900256256.pth... [2024-06-24 19:31:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000725707_11889983488.pth [2024-06-24 19:31:46,740][15401] Updated weights for policy 0, policy_version 726342 (0.0039) [2024-06-24 19:31:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 11900436480. Throughput: 0: 42757.8. Samples: 11900608600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:31:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 19:31:50,566][15401] Updated weights for policy 0, policy_version 726352 (0.0028) [2024-06-24 19:31:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11900649472. Throughput: 0: 42470.8. Samples: 11900731120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:31:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 19:31:54,554][15401] Updated weights for policy 0, policy_version 726362 (0.0031) [2024-06-24 19:31:58,021][15401] Updated weights for policy 0, policy_version 726372 (0.0036) [2024-06-24 19:31:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 11900895232. Throughput: 0: 42727.5. Samples: 11900995480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:31:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 19:32:02,254][15401] Updated weights for policy 0, policy_version 726382 (0.0041) [2024-06-24 19:32:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11901091840. Throughput: 0: 42705.3. Samples: 11901247140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:03,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-24 19:32:05,654][15401] Updated weights for policy 0, policy_version 726392 (0.0043) [2024-06-24 19:32:07,965][15349] Signal inference workers to stop experience collection... (176150 times) [2024-06-24 19:32:07,967][15349] Signal inference workers to resume experience collection... (176150 times) [2024-06-24 19:32:08,019][15401] InferenceWorker_p0-w0: stopping experience collection (176150 times) [2024-06-24 19:32:08,020][15401] InferenceWorker_p0-w0: resuming experience collection (176150 times) [2024-06-24 19:32:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 11901272064. Throughput: 0: 42493.8. Samples: 11901372280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 19:32:10,128][15401] Updated weights for policy 0, policy_version 726402 (0.0045) [2024-06-24 19:32:13,364][15401] Updated weights for policy 0, policy_version 726412 (0.0029) [2024-06-24 19:32:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11901534208. Throughput: 0: 42743.8. Samples: 11901634260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 19:32:17,804][15401] Updated weights for policy 0, policy_version 726422 (0.0041) [2024-06-24 19:32:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11901730816. Throughput: 0: 42497.8. Samples: 11901885080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:32:21,095][15401] Updated weights for policy 0, policy_version 726432 (0.0030) [2024-06-24 19:32:23,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11901911040. Throughput: 0: 42557.7. Samples: 11902011720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 19:32:25,417][15401] Updated weights for policy 0, policy_version 726442 (0.0028) [2024-06-24 19:32:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11902156800. Throughput: 0: 42631.5. Samples: 11902274000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 19:32:28,599][15401] Updated weights for policy 0, policy_version 726452 (0.0041) [2024-06-24 19:32:32,977][15401] Updated weights for policy 0, policy_version 726462 (0.0047) [2024-06-24 19:32:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11902369792. Throughput: 0: 42662.5. Samples: 11902528420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:33,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 19:32:36,138][15401] Updated weights for policy 0, policy_version 726472 (0.0047) [2024-06-24 19:32:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11902566400. Throughput: 0: 42706.6. Samples: 11902653020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:38,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 19:32:40,878][15401] Updated weights for policy 0, policy_version 726482 (0.0038) [2024-06-24 19:32:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 11902795776. Throughput: 0: 42606.7. Samples: 11902912780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 19:32:44,243][15401] Updated weights for policy 0, policy_version 726492 (0.0029) [2024-06-24 19:32:48,390][15401] Updated weights for policy 0, policy_version 726502 (0.0035) [2024-06-24 19:32:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11903008768. Throughput: 0: 42784.1. Samples: 11903172420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 19:32:51,817][15401] Updated weights for policy 0, policy_version 726512 (0.0032) [2024-06-24 19:32:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11903221760. Throughput: 0: 42822.3. Samples: 11903299280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 19:32:55,851][15401] Updated weights for policy 0, policy_version 726522 (0.0030) [2024-06-24 19:32:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11903434752. Throughput: 0: 42794.9. Samples: 11903560040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:32:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 19:33:00,047][15401] Updated weights for policy 0, policy_version 726532 (0.0027) [2024-06-24 19:33:03,389][15401] Updated weights for policy 0, policy_version 726542 (0.0045) [2024-06-24 19:33:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11903664128. Throughput: 0: 43047.1. Samples: 11903822200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:33:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 19:33:07,487][15401] Updated weights for policy 0, policy_version 726552 (0.0040) [2024-06-24 19:33:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 11903877120. Throughput: 0: 42985.4. Samples: 11903946060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:33:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 19:33:11,246][15401] Updated weights for policy 0, policy_version 726562 (0.0040) [2024-06-24 19:33:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11904090112. Throughput: 0: 42882.4. Samples: 11904203700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:33:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 19:33:14,922][15401] Updated weights for policy 0, policy_version 726572 (0.0025) [2024-06-24 19:33:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11904286720. Throughput: 0: 43114.8. Samples: 11904468580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:33:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 19:33:18,775][15401] Updated weights for policy 0, policy_version 726582 (0.0029) [2024-06-24 19:33:22,869][15401] Updated weights for policy 0, policy_version 726592 (0.0034) [2024-06-24 19:33:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 11904516096. Throughput: 0: 42939.6. Samples: 11904585200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 19:33:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 19:33:25,728][15349] Signal inference workers to stop experience collection... (176200 times) [2024-06-24 19:33:25,729][15349] Signal inference workers to resume experience collection... (176200 times) [2024-06-24 19:33:25,756][15401] InferenceWorker_p0-w0: stopping experience collection (176200 times) [2024-06-24 19:33:25,756][15401] InferenceWorker_p0-w0: resuming experience collection (176200 times) [2024-06-24 19:33:26,350][15401] Updated weights for policy 0, policy_version 726602 (0.0032) [2024-06-24 19:33:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 11904761856. Throughput: 0: 42953.9. Samples: 11904845700. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 19:33:30,322][15401] Updated weights for policy 0, policy_version 726612 (0.0029) [2024-06-24 19:33:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11904925696. Throughput: 0: 43207.1. Samples: 11905116740. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 19:33:33,919][15401] Updated weights for policy 0, policy_version 726622 (0.0048) [2024-06-24 19:33:38,182][15401] Updated weights for policy 0, policy_version 726632 (0.0034) [2024-06-24 19:33:38,392][15132] Fps is (10 sec: 39311.7, 60 sec: 43144.5, 300 sec: 42709.1). Total num frames: 11905155072. Throughput: 0: 43094.9. Samples: 11905238660. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:38,393][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 19:33:41,777][15401] Updated weights for policy 0, policy_version 726642 (0.0031) [2024-06-24 19:33:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11905400832. Throughput: 0: 43061.0. Samples: 11905497780. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 19:33:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726648_11905400832.pth... [2024-06-24 19:33:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726020_11895111680.pth [2024-06-24 19:33:45,675][15401] Updated weights for policy 0, policy_version 726652 (0.0033) [2024-06-24 19:33:48,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 11905581056. Throughput: 0: 42989.0. Samples: 11905756700. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 19:33:49,344][15401] Updated weights for policy 0, policy_version 726662 (0.0024) [2024-06-24 19:33:53,231][15401] Updated weights for policy 0, policy_version 726672 (0.0033) [2024-06-24 19:33:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11905810432. Throughput: 0: 42984.6. Samples: 11905880360. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 19:33:56,879][15401] Updated weights for policy 0, policy_version 726682 (0.0044) [2024-06-24 19:33:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 11906023424. Throughput: 0: 42972.4. Samples: 11906137460. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:33:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 19:34:00,775][15401] Updated weights for policy 0, policy_version 726692 (0.0043) [2024-06-24 19:34:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11906220032. Throughput: 0: 42880.0. Samples: 11906398180. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 19:34:04,601][15401] Updated weights for policy 0, policy_version 726702 (0.0044) [2024-06-24 19:34:08,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11906433024. Throughput: 0: 43065.7. Samples: 11906523160. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 19:34:08,596][15401] Updated weights for policy 0, policy_version 726712 (0.0036) [2024-06-24 19:34:12,241][15401] Updated weights for policy 0, policy_version 726722 (0.0031) [2024-06-24 19:34:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11906662400. Throughput: 0: 43045.3. Samples: 11906782740. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 19:34:16,109][15401] Updated weights for policy 0, policy_version 726732 (0.0036) [2024-06-24 19:34:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11906859008. Throughput: 0: 42828.0. Samples: 11907044000. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 19:34:19,919][15401] Updated weights for policy 0, policy_version 726742 (0.0031) [2024-06-24 19:34:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11907072000. Throughput: 0: 42888.1. Samples: 11907168520. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 19:34:23,431][15349] Signal inference workers to stop experience collection... (176250 times) [2024-06-24 19:34:23,432][15349] Signal inference workers to resume experience collection... (176250 times) [2024-06-24 19:34:23,478][15401] InferenceWorker_p0-w0: stopping experience collection (176250 times) [2024-06-24 19:34:23,479][15401] InferenceWorker_p0-w0: resuming experience collection (176250 times) [2024-06-24 19:34:23,565][15401] Updated weights for policy 0, policy_version 726752 (0.0036) [2024-06-24 19:34:27,566][15401] Updated weights for policy 0, policy_version 726762 (0.0032) [2024-06-24 19:34:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 11907317760. Throughput: 0: 42819.1. Samples: 11907424640. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 19:34:31,703][15401] Updated weights for policy 0, policy_version 726772 (0.0035) [2024-06-24 19:34:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 11907497984. Throughput: 0: 42632.3. Samples: 11907675260. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:33,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 19:34:35,621][15401] Updated weights for policy 0, policy_version 726782 (0.0042) [2024-06-24 19:34:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 11907727360. Throughput: 0: 42512.8. Samples: 11907793440. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 19:34:39,322][15401] Updated weights for policy 0, policy_version 726792 (0.0025) [2024-06-24 19:34:43,139][15401] Updated weights for policy 0, policy_version 726802 (0.0038) [2024-06-24 19:34:43,396][15132] Fps is (10 sec: 44219.2, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 11907940352. Throughput: 0: 42604.6. Samples: 11908054940. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:43,396][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 19:34:46,886][15401] Updated weights for policy 0, policy_version 726812 (0.0036) [2024-06-24 19:34:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11908136960. Throughput: 0: 42548.7. Samples: 11908312880. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 19:34:50,766][15401] Updated weights for policy 0, policy_version 726822 (0.0032) [2024-06-24 19:34:53,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11908382720. Throughput: 0: 42555.7. Samples: 11908438160. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 19:34:54,501][15401] Updated weights for policy 0, policy_version 726832 (0.0039) [2024-06-24 19:34:58,382][15401] Updated weights for policy 0, policy_version 726842 (0.0032) [2024-06-24 19:34:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11908579328. Throughput: 0: 42589.4. Samples: 11908699260. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:34:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 19:35:02,070][15401] Updated weights for policy 0, policy_version 726852 (0.0036) [2024-06-24 19:35:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11908775936. Throughput: 0: 42378.7. Samples: 11908951040. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:35:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 19:35:06,144][15401] Updated weights for policy 0, policy_version 726862 (0.0030) [2024-06-24 19:35:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11909005312. Throughput: 0: 42452.5. Samples: 11909078880. Policy #0 lag: (min: 2.0, avg: 10.8, max: 23.0) [2024-06-24 19:35:08,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 19:35:09,748][15401] Updated weights for policy 0, policy_version 726872 (0.0037) [2024-06-24 19:35:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11909201920. Throughput: 0: 42487.1. Samples: 11909336560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 19:35:14,227][15401] Updated weights for policy 0, policy_version 726882 (0.0035) [2024-06-24 19:35:17,607][15401] Updated weights for policy 0, policy_version 726892 (0.0036) [2024-06-24 19:35:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11909431296. Throughput: 0: 42514.3. Samples: 11909588300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 19:35:21,887][15401] Updated weights for policy 0, policy_version 726902 (0.0036) [2024-06-24 19:35:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11909644288. Throughput: 0: 42794.7. Samples: 11909719200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 19:35:25,049][15401] Updated weights for policy 0, policy_version 726912 (0.0034) [2024-06-24 19:35:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 11909824512. Throughput: 0: 42637.9. Samples: 11909973380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 19:35:29,391][15401] Updated weights for policy 0, policy_version 726922 (0.0052) [2024-06-24 19:35:32,863][15401] Updated weights for policy 0, policy_version 726932 (0.0028) [2024-06-24 19:35:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 11910070272. Throughput: 0: 42472.0. Samples: 11910224120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 19:35:37,184][15401] Updated weights for policy 0, policy_version 726942 (0.0037) [2024-06-24 19:35:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11910266880. Throughput: 0: 42627.2. Samples: 11910356380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 19:35:40,635][15401] Updated weights for policy 0, policy_version 726952 (0.0034) [2024-06-24 19:35:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42056.6, 300 sec: 42598.4). Total num frames: 11910463488. Throughput: 0: 42330.9. Samples: 11910604160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 19:35:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726958_11910479872.pth... [2024-06-24 19:35:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726334_11900256256.pth [2024-06-24 19:35:44,842][15401] Updated weights for policy 0, policy_version 726962 (0.0028) [2024-06-24 19:35:46,837][15349] Signal inference workers to stop experience collection... (176300 times) [2024-06-24 19:35:46,862][15401] InferenceWorker_p0-w0: stopping experience collection (176300 times) [2024-06-24 19:35:46,895][15349] Signal inference workers to resume experience collection... (176300 times) [2024-06-24 19:35:46,900][15401] InferenceWorker_p0-w0: resuming experience collection (176300 times) [2024-06-24 19:35:48,226][15401] Updated weights for policy 0, policy_version 726972 (0.0033) [2024-06-24 19:35:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11910709248. Throughput: 0: 42176.8. Samples: 11910849000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 19:35:52,610][15401] Updated weights for policy 0, policy_version 726982 (0.0027) [2024-06-24 19:35:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 11910889472. Throughput: 0: 42361.7. Samples: 11910985160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 19:35:56,010][15401] Updated weights for policy 0, policy_version 726992 (0.0031) [2024-06-24 19:35:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 11911102464. Throughput: 0: 42137.8. Samples: 11911232760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:35:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 19:36:00,361][15401] Updated weights for policy 0, policy_version 727002 (0.0027) [2024-06-24 19:36:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11911331840. Throughput: 0: 42206.7. Samples: 11911487600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 19:36:03,727][15401] Updated weights for policy 0, policy_version 727012 (0.0034) [2024-06-24 19:36:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 11911512064. Throughput: 0: 42183.2. Samples: 11911617440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 19:36:08,490][15401] Updated weights for policy 0, policy_version 727022 (0.0025) [2024-06-24 19:36:11,912][15401] Updated weights for policy 0, policy_version 727032 (0.0034) [2024-06-24 19:36:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11911741440. Throughput: 0: 42059.1. Samples: 11911866040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 19:36:16,040][15401] Updated weights for policy 0, policy_version 727042 (0.0037) [2024-06-24 19:36:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11911954432. Throughput: 0: 42214.8. Samples: 11912123780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 19:36:19,547][15401] Updated weights for policy 0, policy_version 727052 (0.0031) [2024-06-24 19:36:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 11912167424. Throughput: 0: 42146.6. Samples: 11912253080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:23,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 19:36:24,154][15401] Updated weights for policy 0, policy_version 727062 (0.0027) [2024-06-24 19:36:27,124][15401] Updated weights for policy 0, policy_version 727072 (0.0036) [2024-06-24 19:36:28,390][15132] Fps is (10 sec: 44233.2, 60 sec: 42871.0, 300 sec: 42709.4). Total num frames: 11912396800. Throughput: 0: 42276.3. Samples: 11912506620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:28,391][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 19:36:31,728][15401] Updated weights for policy 0, policy_version 727082 (0.0041) [2024-06-24 19:36:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11912593408. Throughput: 0: 42604.1. Samples: 11912766180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:33,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 19:36:34,916][15401] Updated weights for policy 0, policy_version 727092 (0.0038) [2024-06-24 19:36:38,390][15132] Fps is (10 sec: 42601.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11912822784. Throughput: 0: 42352.0. Samples: 11912891000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 19:36:39,063][15401] Updated weights for policy 0, policy_version 727102 (0.0048) [2024-06-24 19:36:42,631][15401] Updated weights for policy 0, policy_version 727112 (0.0032) [2024-06-24 19:36:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11913052160. Throughput: 0: 42642.6. Samples: 11913151680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:43,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-24 19:36:46,592][15401] Updated weights for policy 0, policy_version 727122 (0.0038) [2024-06-24 19:36:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11913232384. Throughput: 0: 42768.9. Samples: 11913412200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 19:36:48,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 19:36:50,177][15401] Updated weights for policy 0, policy_version 727132 (0.0036) [2024-06-24 19:36:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11913478144. Throughput: 0: 42594.6. Samples: 11913534200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:36:53,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 19:36:54,459][15401] Updated weights for policy 0, policy_version 727142 (0.0032) [2024-06-24 19:36:57,624][15401] Updated weights for policy 0, policy_version 727152 (0.0029) [2024-06-24 19:36:58,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 11913707520. Throughput: 0: 42935.9. Samples: 11913798160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:36:58,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 19:37:02,269][15401] Updated weights for policy 0, policy_version 727162 (0.0032) [2024-06-24 19:37:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11913871360. Throughput: 0: 42833.2. Samples: 11914051280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 19:37:05,307][15349] Signal inference workers to stop experience collection... (176350 times) [2024-06-24 19:37:05,308][15349] Signal inference workers to resume experience collection... (176350 times) [2024-06-24 19:37:05,352][15401] InferenceWorker_p0-w0: stopping experience collection (176350 times) [2024-06-24 19:37:05,353][15401] InferenceWorker_p0-w0: resuming experience collection (176350 times) [2024-06-24 19:37:05,446][15401] Updated weights for policy 0, policy_version 727172 (0.0038) [2024-06-24 19:37:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 11914117120. Throughput: 0: 42645.9. Samples: 11914172040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 19:37:09,671][15401] Updated weights for policy 0, policy_version 727182 (0.0029) [2024-06-24 19:37:13,054][15401] Updated weights for policy 0, policy_version 727192 (0.0043) [2024-06-24 19:37:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 11914330112. Throughput: 0: 42876.6. Samples: 11914436140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:13,393][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 19:37:17,181][15401] Updated weights for policy 0, policy_version 727202 (0.0026) [2024-06-24 19:37:18,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11914510336. Throughput: 0: 42859.4. Samples: 11914694860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:18,396][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 19:37:20,862][15401] Updated weights for policy 0, policy_version 727212 (0.0025) [2024-06-24 19:37:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 11914756096. Throughput: 0: 42768.1. Samples: 11914815560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:23,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-24 19:37:24,729][15401] Updated weights for policy 0, policy_version 727222 (0.0053) [2024-06-24 19:37:28,300][15401] Updated weights for policy 0, policy_version 727232 (0.0036) [2024-06-24 19:37:28,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42872.1, 300 sec: 42709.5). Total num frames: 11914969088. Throughput: 0: 42978.8. Samples: 11915085720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 19:37:32,242][15401] Updated weights for policy 0, policy_version 727242 (0.0036) [2024-06-24 19:37:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11915165696. Throughput: 0: 42777.4. Samples: 11915337180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 19:37:36,142][15401] Updated weights for policy 0, policy_version 727252 (0.0039) [2024-06-24 19:37:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11915395072. Throughput: 0: 42928.0. Samples: 11915465960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 19:37:39,740][15401] Updated weights for policy 0, policy_version 727262 (0.0030) [2024-06-24 19:37:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11915575296. Throughput: 0: 42773.1. Samples: 11915722940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 19:37:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727270_11915591680.pth... [2024-06-24 19:37:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726648_11905400832.pth [2024-06-24 19:37:43,887][15401] Updated weights for policy 0, policy_version 727272 (0.0029) [2024-06-24 19:37:47,315][15401] Updated weights for policy 0, policy_version 727282 (0.0029) [2024-06-24 19:37:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11915804672. Throughput: 0: 42712.3. Samples: 11915973340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:48,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 19:37:51,527][15401] Updated weights for policy 0, policy_version 727292 (0.0028) [2024-06-24 19:37:53,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11916050432. Throughput: 0: 42999.6. Samples: 11916107020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 19:37:54,889][15401] Updated weights for policy 0, policy_version 727302 (0.0031) [2024-06-24 19:37:58,390][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 11916214272. Throughput: 0: 42857.0. Samples: 11916364600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:37:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 19:37:59,149][15401] Updated weights for policy 0, policy_version 727312 (0.0033) [2024-06-24 19:38:02,433][15401] Updated weights for policy 0, policy_version 727322 (0.0033) [2024-06-24 19:38:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11916460032. Throughput: 0: 42665.0. Samples: 11916614780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:03,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 19:38:06,826][15401] Updated weights for policy 0, policy_version 727332 (0.0041) [2024-06-24 19:38:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11916673024. Throughput: 0: 42930.1. Samples: 11916747420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 19:38:10,162][15401] Updated weights for policy 0, policy_version 727342 (0.0034) [2024-06-24 19:38:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 11916853248. Throughput: 0: 42454.2. Samples: 11916996160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 19:38:13,787][15349] Signal inference workers to stop experience collection... (176400 times) [2024-06-24 19:38:13,787][15349] Signal inference workers to resume experience collection... (176400 times) [2024-06-24 19:38:13,799][15401] InferenceWorker_p0-w0: stopping experience collection (176400 times) [2024-06-24 19:38:13,811][15401] InferenceWorker_p0-w0: resuming experience collection (176400 times) [2024-06-24 19:38:14,479][15401] Updated weights for policy 0, policy_version 727352 (0.0037) [2024-06-24 19:38:17,788][15401] Updated weights for policy 0, policy_version 727362 (0.0032) [2024-06-24 19:38:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 11917115392. Throughput: 0: 42312.5. Samples: 11917241240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:18,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 19:38:22,609][15401] Updated weights for policy 0, policy_version 727372 (0.0044) [2024-06-24 19:38:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11917312000. Throughput: 0: 42639.7. Samples: 11917384740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 19:38:25,307][15401] Updated weights for policy 0, policy_version 727382 (0.0036) [2024-06-24 19:38:28,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 11917492224. Throughput: 0: 42582.5. Samples: 11917639160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-24 19:38:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 19:38:30,232][15401] Updated weights for policy 0, policy_version 727392 (0.0027) [2024-06-24 19:38:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 11917737984. Throughput: 0: 42473.9. Samples: 11917884660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 19:38:33,651][15401] Updated weights for policy 0, policy_version 727402 (0.0036) [2024-06-24 19:38:38,002][15401] Updated weights for policy 0, policy_version 727412 (0.0049) [2024-06-24 19:38:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 11917934592. Throughput: 0: 42560.9. Samples: 11918022260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 19:38:41,554][15401] Updated weights for policy 0, policy_version 727422 (0.0023) [2024-06-24 19:38:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 11918131200. Throughput: 0: 42228.9. Samples: 11918264900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 19:38:45,939][15401] Updated weights for policy 0, policy_version 727432 (0.0030) [2024-06-24 19:38:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 11918376960. Throughput: 0: 42280.9. Samples: 11918517420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 19:38:49,259][15401] Updated weights for policy 0, policy_version 727442 (0.0038) [2024-06-24 19:38:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 11918557184. Throughput: 0: 42344.5. Samples: 11918652920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:38:53,498][15401] Updated weights for policy 0, policy_version 727452 (0.0024) [2024-06-24 19:38:56,916][15401] Updated weights for policy 0, policy_version 727462 (0.0036) [2024-06-24 19:38:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11918786560. Throughput: 0: 42290.1. Samples: 11918899220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:38:58,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:39:01,247][15401] Updated weights for policy 0, policy_version 727472 (0.0037) [2024-06-24 19:39:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11919032320. Throughput: 0: 42551.0. Samples: 11919156040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 19:39:04,415][15401] Updated weights for policy 0, policy_version 727482 (0.0043) [2024-06-24 19:39:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 11919179776. Throughput: 0: 42418.9. Samples: 11919293600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:39:08,861][15401] Updated weights for policy 0, policy_version 727492 (0.0039) [2024-06-24 19:39:11,811][15401] Updated weights for policy 0, policy_version 727502 (0.0041) [2024-06-24 19:39:13,391][15132] Fps is (10 sec: 39315.3, 60 sec: 42870.3, 300 sec: 42598.2). Total num frames: 11919425536. Throughput: 0: 42279.1. Samples: 11919541780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:13,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 19:39:16,651][15401] Updated weights for policy 0, policy_version 727512 (0.0044) [2024-06-24 19:39:18,134][15349] Signal inference workers to stop experience collection... (176450 times) [2024-06-24 19:39:18,176][15401] InferenceWorker_p0-w0: stopping experience collection (176450 times) [2024-06-24 19:39:18,182][15349] Signal inference workers to resume experience collection... (176450 times) [2024-06-24 19:39:18,187][15401] InferenceWorker_p0-w0: resuming experience collection (176450 times) [2024-06-24 19:39:18,390][15132] Fps is (10 sec: 50790.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11919687680. Throughput: 0: 42508.9. Samples: 11919797560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 19:39:19,564][15401] Updated weights for policy 0, policy_version 727522 (0.0040) [2024-06-24 19:39:23,392][15132] Fps is (10 sec: 37679.3, 60 sec: 41504.3, 300 sec: 42320.3). Total num frames: 11919802368. Throughput: 0: 42421.0. Samples: 11919931320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:23,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 19:39:24,445][15401] Updated weights for policy 0, policy_version 727532 (0.0041) [2024-06-24 19:39:27,402][15401] Updated weights for policy 0, policy_version 727542 (0.0030) [2024-06-24 19:39:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 11920080896. Throughput: 0: 42454.2. Samples: 11920175340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 19:39:31,937][15401] Updated weights for policy 0, policy_version 727552 (0.0045) [2024-06-24 19:39:33,389][15132] Fps is (10 sec: 49165.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11920293888. Throughput: 0: 42598.7. Samples: 11920434360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 19:39:35,409][15401] Updated weights for policy 0, policy_version 727562 (0.0026) [2024-06-24 19:39:38,390][15132] Fps is (10 sec: 36044.7, 60 sec: 41779.2, 300 sec: 42377.2). Total num frames: 11920441344. Throughput: 0: 42456.5. Samples: 11920563460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 19:39:39,725][15401] Updated weights for policy 0, policy_version 727572 (0.0036) [2024-06-24 19:39:43,146][15401] Updated weights for policy 0, policy_version 727582 (0.0029) [2024-06-24 19:39:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11920719872. Throughput: 0: 42660.0. Samples: 11920818920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 19:39:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727583_11920719872.pth... [2024-06-24 19:39:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000726958_11910479872.pth [2024-06-24 19:39:47,368][15401] Updated weights for policy 0, policy_version 727592 (0.0040) [2024-06-24 19:39:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 11920916480. Throughput: 0: 42641.6. Samples: 11921074920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 19:39:50,648][15401] Updated weights for policy 0, policy_version 727602 (0.0037) [2024-06-24 19:39:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42431.7). Total num frames: 11921096704. Throughput: 0: 42431.6. Samples: 11921203020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:53,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 19:39:54,960][15401] Updated weights for policy 0, policy_version 727612 (0.0027) [2024-06-24 19:39:58,295][15401] Updated weights for policy 0, policy_version 727622 (0.0028) [2024-06-24 19:39:58,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11921358848. Throughput: 0: 42575.2. Samples: 11921457600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:39:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 19:40:02,579][15401] Updated weights for policy 0, policy_version 727632 (0.0026) [2024-06-24 19:40:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 11921555456. Throughput: 0: 42602.2. Samples: 11921714660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:40:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 19:40:05,949][15401] Updated weights for policy 0, policy_version 727642 (0.0037) [2024-06-24 19:40:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11921735680. Throughput: 0: 42410.4. Samples: 11921839680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:40:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 19:40:10,203][15401] Updated weights for policy 0, policy_version 727652 (0.0033) [2024-06-24 19:40:11,692][15349] Signal inference workers to stop experience collection... (176500 times) [2024-06-24 19:40:11,693][15349] Signal inference workers to resume experience collection... (176500 times) [2024-06-24 19:40:11,711][15401] InferenceWorker_p0-w0: stopping experience collection (176500 times) [2024-06-24 19:40:11,711][15401] InferenceWorker_p0-w0: resuming experience collection (176500 times) [2024-06-24 19:40:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42870.9, 300 sec: 42598.0). Total num frames: 11921997824. Throughput: 0: 42788.3. Samples: 11922100920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-24 19:40:13,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 19:40:13,486][15401] Updated weights for policy 0, policy_version 727662 (0.0026) [2024-06-24 19:40:17,898][15401] Updated weights for policy 0, policy_version 727672 (0.0035) [2024-06-24 19:40:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11922194432. Throughput: 0: 42831.9. Samples: 11922361800. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 19:40:21,206][15401] Updated weights for policy 0, policy_version 727682 (0.0032) [2024-06-24 19:40:23,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 11922391040. Throughput: 0: 42703.4. Samples: 11922485120. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 19:40:25,604][15401] Updated weights for policy 0, policy_version 727692 (0.0023) [2024-06-24 19:40:28,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 11922653184. Throughput: 0: 42817.7. Samples: 11922745820. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:28,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 19:40:28,815][15401] Updated weights for policy 0, policy_version 727702 (0.0027) [2024-06-24 19:40:33,278][15401] Updated weights for policy 0, policy_version 727712 (0.0029) [2024-06-24 19:40:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 11922833408. Throughput: 0: 42955.8. Samples: 11923007920. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:33,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 19:40:36,670][15401] Updated weights for policy 0, policy_version 727722 (0.0026) [2024-06-24 19:40:38,389][15132] Fps is (10 sec: 39331.1, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 11923046400. Throughput: 0: 42899.7. Samples: 11923133500. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 19:40:40,776][15401] Updated weights for policy 0, policy_version 727732 (0.0037) [2024-06-24 19:40:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11923275776. Throughput: 0: 43047.7. Samples: 11923394740. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 19:40:44,278][15401] Updated weights for policy 0, policy_version 727742 (0.0031) [2024-06-24 19:40:48,238][15401] Updated weights for policy 0, policy_version 727752 (0.0030) [2024-06-24 19:40:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11923488768. Throughput: 0: 43087.0. Samples: 11923653580. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 19:40:51,875][15401] Updated weights for policy 0, policy_version 727762 (0.0027) [2024-06-24 19:40:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 11923701760. Throughput: 0: 43038.7. Samples: 11923776420. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:40:55,915][15401] Updated weights for policy 0, policy_version 727772 (0.0034) [2024-06-24 19:40:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11923914752. Throughput: 0: 43110.8. Samples: 11924040800. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:40:58,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 19:40:59,553][15401] Updated weights for policy 0, policy_version 727782 (0.0035) [2024-06-24 19:41:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11924127744. Throughput: 0: 43003.1. Samples: 11924296940. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:03,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 19:41:03,982][15401] Updated weights for policy 0, policy_version 727792 (0.0036) [2024-06-24 19:41:07,113][15401] Updated weights for policy 0, policy_version 727802 (0.0033) [2024-06-24 19:41:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 11924357120. Throughput: 0: 42865.9. Samples: 11924414080. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 19:41:11,502][15401] Updated weights for policy 0, policy_version 727812 (0.0028) [2024-06-24 19:41:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 11924570112. Throughput: 0: 42958.3. Samples: 11924678840. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 19:41:14,698][15401] Updated weights for policy 0, policy_version 727822 (0.0038) [2024-06-24 19:41:18,183][15349] Signal inference workers to stop experience collection... (176550 times) [2024-06-24 19:41:18,239][15401] InferenceWorker_p0-w0: stopping experience collection (176550 times) [2024-06-24 19:41:18,248][15349] Signal inference workers to resume experience collection... (176550 times) [2024-06-24 19:41:18,256][15401] InferenceWorker_p0-w0: resuming experience collection (176550 times) [2024-06-24 19:41:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11924766720. Throughput: 0: 42776.8. Samples: 11924932880. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 19:41:19,158][15401] Updated weights for policy 0, policy_version 727832 (0.0040) [2024-06-24 19:41:23,066][15401] Updated weights for policy 0, policy_version 727842 (0.0035) [2024-06-24 19:41:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 11924979712. Throughput: 0: 42674.7. Samples: 11925053860. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 19:41:27,003][15401] Updated weights for policy 0, policy_version 727852 (0.0027) [2024-06-24 19:41:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 11925192704. Throughput: 0: 42506.6. Samples: 11925307540. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 19:41:30,751][15401] Updated weights for policy 0, policy_version 727862 (0.0047) [2024-06-24 19:41:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 11925389312. Throughput: 0: 42461.9. Samples: 11925564360. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 19:41:34,874][15401] Updated weights for policy 0, policy_version 727872 (0.0038) [2024-06-24 19:41:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11925602304. Throughput: 0: 42395.9. Samples: 11925684240. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 19:41:38,462][15401] Updated weights for policy 0, policy_version 727882 (0.0048) [2024-06-24 19:41:42,603][15401] Updated weights for policy 0, policy_version 727892 (0.0038) [2024-06-24 19:41:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11925848064. Throughput: 0: 42310.1. Samples: 11925944760. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 19:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727896_11925848064.pth... [2024-06-24 19:41:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727270_11915591680.pth [2024-06-24 19:41:46,656][15401] Updated weights for policy 0, policy_version 727902 (0.0035) [2024-06-24 19:41:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 11926011904. Throughput: 0: 42239.1. Samples: 11926197700. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 19:41:50,147][15401] Updated weights for policy 0, policy_version 727912 (0.0031) [2024-06-24 19:41:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11926257664. Throughput: 0: 42290.7. Samples: 11926317160. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-24 19:41:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:41:54,167][15401] Updated weights for policy 0, policy_version 727922 (0.0040) [2024-06-24 19:41:57,651][15401] Updated weights for policy 0, policy_version 727932 (0.0023) [2024-06-24 19:41:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11926470656. Throughput: 0: 42290.1. Samples: 11926581900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:41:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 19:42:01,821][15401] Updated weights for policy 0, policy_version 727942 (0.0031) [2024-06-24 19:42:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11926667264. Throughput: 0: 42422.8. Samples: 11926841900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 19:42:05,459][15401] Updated weights for policy 0, policy_version 727952 (0.0034) [2024-06-24 19:42:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 11926913024. Throughput: 0: 42527.4. Samples: 11926967600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 19:42:09,469][15401] Updated weights for policy 0, policy_version 727962 (0.0043) [2024-06-24 19:42:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 11927076864. Throughput: 0: 42605.7. Samples: 11927224800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 19:42:13,571][15401] Updated weights for policy 0, policy_version 727972 (0.0032) [2024-06-24 19:42:16,938][15401] Updated weights for policy 0, policy_version 727982 (0.0043) [2024-06-24 19:42:18,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11927322624. Throughput: 0: 42678.6. Samples: 11927484900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 19:42:21,023][15401] Updated weights for policy 0, policy_version 727992 (0.0031) [2024-06-24 19:42:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11927535616. Throughput: 0: 42856.1. Samples: 11927612760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 19:42:24,675][15401] Updated weights for policy 0, policy_version 728002 (0.0026) [2024-06-24 19:42:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 11927732224. Throughput: 0: 42660.9. Samples: 11927864600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:28,392][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 19:42:28,753][15401] Updated weights for policy 0, policy_version 728012 (0.0039) [2024-06-24 19:42:32,383][15401] Updated weights for policy 0, policy_version 728022 (0.0041) [2024-06-24 19:42:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 11927945216. Throughput: 0: 42907.4. Samples: 11928128540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:33,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 19:42:36,093][15401] Updated weights for policy 0, policy_version 728032 (0.0035) [2024-06-24 19:42:36,643][15349] Signal inference workers to stop experience collection... (176600 times) [2024-06-24 19:42:36,651][15349] Signal inference workers to resume experience collection... (176600 times) [2024-06-24 19:42:36,657][15401] InferenceWorker_p0-w0: stopping experience collection (176600 times) [2024-06-24 19:42:36,674][15401] InferenceWorker_p0-w0: resuming experience collection (176600 times) [2024-06-24 19:42:38,396][15132] Fps is (10 sec: 44218.8, 60 sec: 42867.0, 300 sec: 42708.5). Total num frames: 11928174592. Throughput: 0: 43026.7. Samples: 11928253640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:38,396][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 19:42:40,088][15401] Updated weights for policy 0, policy_version 728042 (0.0025) [2024-06-24 19:42:43,390][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 11928387584. Throughput: 0: 42844.5. Samples: 11928509900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 19:42:43,543][15401] Updated weights for policy 0, policy_version 728052 (0.0036) [2024-06-24 19:42:47,675][15401] Updated weights for policy 0, policy_version 728062 (0.0034) [2024-06-24 19:42:48,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 11928584192. Throughput: 0: 42834.1. Samples: 11928769440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:48,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 19:42:51,339][15401] Updated weights for policy 0, policy_version 728072 (0.0036) [2024-06-24 19:42:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11928813568. Throughput: 0: 42936.1. Samples: 11928899720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:53,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 19:42:55,449][15401] Updated weights for policy 0, policy_version 728082 (0.0032) [2024-06-24 19:42:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11929026560. Throughput: 0: 42896.1. Samples: 11929155120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:42:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-24 19:42:58,922][15401] Updated weights for policy 0, policy_version 728092 (0.0034) [2024-06-24 19:43:03,181][15401] Updated weights for policy 0, policy_version 728102 (0.0034) [2024-06-24 19:43:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 11929239552. Throughput: 0: 42965.0. Samples: 11929418320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 19:43:06,557][15401] Updated weights for policy 0, policy_version 728112 (0.0036) [2024-06-24 19:43:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11929452544. Throughput: 0: 42881.7. Samples: 11929542440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 19:43:11,107][15401] Updated weights for policy 0, policy_version 728122 (0.0035) [2024-06-24 19:43:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 11929665536. Throughput: 0: 42948.8. Samples: 11929797200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 19:43:14,046][15401] Updated weights for policy 0, policy_version 728132 (0.0037) [2024-06-24 19:43:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11929862144. Throughput: 0: 42822.4. Samples: 11930055540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 19:43:18,836][15401] Updated weights for policy 0, policy_version 728142 (0.0032) [2024-06-24 19:43:21,643][15401] Updated weights for policy 0, policy_version 728152 (0.0029) [2024-06-24 19:43:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11930107904. Throughput: 0: 42909.3. Samples: 11930184280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 19:43:26,307][15401] Updated weights for policy 0, policy_version 728162 (0.0036) [2024-06-24 19:43:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 11930320896. Throughput: 0: 42796.9. Samples: 11930435760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 19:43:29,453][15401] Updated weights for policy 0, policy_version 728172 (0.0027) [2024-06-24 19:43:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 11930501120. Throughput: 0: 42785.4. Samples: 11930694780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 19:43:33,888][15401] Updated weights for policy 0, policy_version 728182 (0.0027) [2024-06-24 19:43:37,089][15401] Updated weights for policy 0, policy_version 728192 (0.0024) [2024-06-24 19:43:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 11930763264. Throughput: 0: 42609.9. Samples: 11930817160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 19:43:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 19:43:41,528][15401] Updated weights for policy 0, policy_version 728202 (0.0038) [2024-06-24 19:43:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11930959872. Throughput: 0: 42821.4. Samples: 11931082080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:43:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 19:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728209_11930976256.pth... [2024-06-24 19:43:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727583_11920719872.pth [2024-06-24 19:43:44,790][15401] Updated weights for policy 0, policy_version 728212 (0.0035) [2024-06-24 19:43:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 11931156480. Throughput: 0: 42718.5. Samples: 11931340660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:43:48,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 19:43:48,934][15401] Updated weights for policy 0, policy_version 728222 (0.0032) [2024-06-24 19:43:52,221][15401] Updated weights for policy 0, policy_version 728232 (0.0031) [2024-06-24 19:43:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11931402240. Throughput: 0: 42706.8. Samples: 11931464240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:43:53,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-24 19:43:56,865][15349] Signal inference workers to stop experience collection... (176650 times) [2024-06-24 19:43:56,874][15349] Signal inference workers to resume experience collection... (176650 times) [2024-06-24 19:43:56,885][15401] Updated weights for policy 0, policy_version 728242 (0.0038) [2024-06-24 19:43:56,920][15401] InferenceWorker_p0-w0: stopping experience collection (176650 times) [2024-06-24 19:43:56,920][15401] InferenceWorker_p0-w0: resuming experience collection (176650 times) [2024-06-24 19:43:58,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11931615232. Throughput: 0: 42848.5. Samples: 11931725380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:43:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 19:44:00,329][15401] Updated weights for policy 0, policy_version 728252 (0.0039) [2024-06-24 19:44:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11931811840. Throughput: 0: 42780.4. Samples: 11931980660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 19:44:04,442][15401] Updated weights for policy 0, policy_version 728262 (0.0042) [2024-06-24 19:44:07,798][15401] Updated weights for policy 0, policy_version 728272 (0.0038) [2024-06-24 19:44:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.7). Total num frames: 11932024832. Throughput: 0: 42655.9. Samples: 11932103800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 19:44:11,859][15401] Updated weights for policy 0, policy_version 728282 (0.0041) [2024-06-24 19:44:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 11932237824. Throughput: 0: 42857.7. Samples: 11932364360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:13,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 19:44:15,270][15401] Updated weights for policy 0, policy_version 728292 (0.0031) [2024-06-24 19:44:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 11932450816. Throughput: 0: 42577.8. Samples: 11932610780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 19:44:19,575][15401] Updated weights for policy 0, policy_version 728302 (0.0040) [2024-06-24 19:44:23,069][15401] Updated weights for policy 0, policy_version 728312 (0.0037) [2024-06-24 19:44:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11932680192. Throughput: 0: 42730.2. Samples: 11932740020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 19:44:27,180][15401] Updated weights for policy 0, policy_version 728322 (0.0041) [2024-06-24 19:44:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 11932876800. Throughput: 0: 42600.3. Samples: 11932999100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:44:30,748][15401] Updated weights for policy 0, policy_version 728332 (0.0032) [2024-06-24 19:44:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11933073408. Throughput: 0: 42461.9. Samples: 11933251440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 19:44:34,951][15401] Updated weights for policy 0, policy_version 728342 (0.0036) [2024-06-24 19:44:38,357][15401] Updated weights for policy 0, policy_version 728352 (0.0045) [2024-06-24 19:44:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11933319168. Throughput: 0: 42533.3. Samples: 11933378240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 19:44:42,814][15401] Updated weights for policy 0, policy_version 728362 (0.0039) [2024-06-24 19:44:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 11933499392. Throughput: 0: 42503.7. Samples: 11933638040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:43,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 19:44:46,656][15401] Updated weights for policy 0, policy_version 728372 (0.0035) [2024-06-24 19:44:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11933712384. Throughput: 0: 42502.4. Samples: 11933893260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 19:44:50,224][15401] Updated weights for policy 0, policy_version 728382 (0.0045) [2024-06-24 19:44:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11933941760. Throughput: 0: 42584.4. Samples: 11934020100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 19:44:54,494][15401] Updated weights for policy 0, policy_version 728392 (0.0039) [2024-06-24 19:44:57,791][15401] Updated weights for policy 0, policy_version 728402 (0.0042) [2024-06-24 19:44:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11934154752. Throughput: 0: 42571.2. Samples: 11934280060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:44:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 19:45:02,069][15401] Updated weights for policy 0, policy_version 728412 (0.0035) [2024-06-24 19:45:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11934367744. Throughput: 0: 42951.1. Samples: 11934543580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:45:03,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 19:45:05,348][15401] Updated weights for policy 0, policy_version 728422 (0.0038) [2024-06-24 19:45:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 11934580736. Throughput: 0: 42941.8. Samples: 11934672400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:45:08,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-24 19:45:09,364][15401] Updated weights for policy 0, policy_version 728432 (0.0029) [2024-06-24 19:45:13,330][15401] Updated weights for policy 0, policy_version 728442 (0.0033) [2024-06-24 19:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11934793728. Throughput: 0: 42818.7. Samples: 11934925940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:45:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 19:45:17,296][15401] Updated weights for policy 0, policy_version 728452 (0.0036) [2024-06-24 19:45:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11935023104. Throughput: 0: 43137.3. Samples: 11935192620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 19:45:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 19:45:20,675][15401] Updated weights for policy 0, policy_version 728462 (0.0029) [2024-06-24 19:45:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 11935252480. Throughput: 0: 43047.5. Samples: 11935315380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 19:45:24,923][15401] Updated weights for policy 0, policy_version 728472 (0.0036) [2024-06-24 19:45:26,046][15349] Signal inference workers to stop experience collection... (176700 times) [2024-06-24 19:45:26,100][15401] InferenceWorker_p0-w0: stopping experience collection (176700 times) [2024-06-24 19:45:26,167][15349] Signal inference workers to resume experience collection... (176700 times) [2024-06-24 19:45:26,168][15401] InferenceWorker_p0-w0: resuming experience collection (176700 times) [2024-06-24 19:45:28,125][15401] Updated weights for policy 0, policy_version 728482 (0.0047) [2024-06-24 19:45:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11935449088. Throughput: 0: 43050.6. Samples: 11935575320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:28,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 19:45:32,493][15401] Updated weights for policy 0, policy_version 728492 (0.0026) [2024-06-24 19:45:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11935662080. Throughput: 0: 43221.7. Samples: 11935838240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 19:45:35,587][15401] Updated weights for policy 0, policy_version 728502 (0.0023) [2024-06-24 19:45:38,394][15132] Fps is (10 sec: 44218.8, 60 sec: 42868.6, 300 sec: 42764.4). Total num frames: 11935891456. Throughput: 0: 43173.5. Samples: 11935963080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:38,394][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 19:45:39,914][15401] Updated weights for policy 0, policy_version 728512 (0.0025) [2024-06-24 19:45:43,256][15401] Updated weights for policy 0, policy_version 728522 (0.0043) [2024-06-24 19:45:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 11936104448. Throughput: 0: 43087.0. Samples: 11936219080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:43,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 19:45:43,553][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728523_11936120832.pth... [2024-06-24 19:45:43,612][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000727896_11925848064.pth [2024-06-24 19:45:47,521][15401] Updated weights for policy 0, policy_version 728532 (0.0032) [2024-06-24 19:45:48,389][15132] Fps is (10 sec: 39337.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11936284672. Throughput: 0: 43048.0. Samples: 11936480740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 19:45:51,319][15401] Updated weights for policy 0, policy_version 728542 (0.0024) [2024-06-24 19:45:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11936546816. Throughput: 0: 42917.6. Samples: 11936603700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:53,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 19:45:55,067][15401] Updated weights for policy 0, policy_version 728552 (0.0039) [2024-06-24 19:45:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11936727040. Throughput: 0: 42969.4. Samples: 11936859560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:45:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:45:58,761][15401] Updated weights for policy 0, policy_version 728562 (0.0042) [2024-06-24 19:46:03,080][15401] Updated weights for policy 0, policy_version 728572 (0.0030) [2024-06-24 19:46:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 11936940032. Throughput: 0: 42857.6. Samples: 11937121220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 19:46:06,432][15401] Updated weights for policy 0, policy_version 728582 (0.0035) [2024-06-24 19:46:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11937185792. Throughput: 0: 42942.7. Samples: 11937247800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 19:46:10,715][15401] Updated weights for policy 0, policy_version 728592 (0.0027) [2024-06-24 19:46:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11937366016. Throughput: 0: 42921.3. Samples: 11937506780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 19:46:13,979][15401] Updated weights for policy 0, policy_version 728602 (0.0033) [2024-06-24 19:46:18,349][15401] Updated weights for policy 0, policy_version 728612 (0.0042) [2024-06-24 19:46:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11937579008. Throughput: 0: 42808.9. Samples: 11937764640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 19:46:21,853][15401] Updated weights for policy 0, policy_version 728622 (0.0037) [2024-06-24 19:46:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11937841152. Throughput: 0: 42797.2. Samples: 11937888780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 19:46:26,181][15401] Updated weights for policy 0, policy_version 728632 (0.0028) [2024-06-24 19:46:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11938004992. Throughput: 0: 42817.5. Samples: 11938145760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 19:46:29,369][15401] Updated weights for policy 0, policy_version 728642 (0.0030) [2024-06-24 19:46:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11938217984. Throughput: 0: 42755.6. Samples: 11938404740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:33,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 19:46:33,694][15401] Updated weights for policy 0, policy_version 728652 (0.0034) [2024-06-24 19:46:36,952][15401] Updated weights for policy 0, policy_version 728662 (0.0041) [2024-06-24 19:46:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42874.4, 300 sec: 42765.0). Total num frames: 11938463744. Throughput: 0: 42878.8. Samples: 11938533240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 19:46:41,137][15401] Updated weights for policy 0, policy_version 728672 (0.0044) [2024-06-24 19:46:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 11938643968. Throughput: 0: 42773.0. Samples: 11938784340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 19:46:44,540][15401] Updated weights for policy 0, policy_version 728682 (0.0033) [2024-06-24 19:46:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11938856960. Throughput: 0: 42649.5. Samples: 11939040440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 19:46:48,949][15401] Updated weights for policy 0, policy_version 728692 (0.0042) [2024-06-24 19:46:48,966][15349] Signal inference workers to stop experience collection... (176750 times) [2024-06-24 19:46:48,966][15349] Signal inference workers to resume experience collection... (176750 times) [2024-06-24 19:46:49,006][15401] InferenceWorker_p0-w0: stopping experience collection (176750 times) [2024-06-24 19:46:49,007][15401] InferenceWorker_p0-w0: resuming experience collection (176750 times) [2024-06-24 19:46:52,319][15401] Updated weights for policy 0, policy_version 728702 (0.0025) [2024-06-24 19:46:53,390][15132] Fps is (10 sec: 44233.7, 60 sec: 42325.0, 300 sec: 42764.9). Total num frames: 11939086336. Throughput: 0: 42639.0. Samples: 11939166580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:53,391][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 19:46:56,529][15401] Updated weights for policy 0, policy_version 728712 (0.0034) [2024-06-24 19:46:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11939282944. Throughput: 0: 42544.0. Samples: 11939421260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:46:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 19:47:00,099][15401] Updated weights for policy 0, policy_version 728722 (0.0039) [2024-06-24 19:47:03,389][15132] Fps is (10 sec: 42601.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 11939512320. Throughput: 0: 42491.1. Samples: 11939676740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 19:47:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 19:47:04,270][15401] Updated weights for policy 0, policy_version 728732 (0.0034) [2024-06-24 19:47:07,671][15401] Updated weights for policy 0, policy_version 728742 (0.0034) [2024-06-24 19:47:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 11939741696. Throughput: 0: 42702.6. Samples: 11939810400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 19:47:12,082][15401] Updated weights for policy 0, policy_version 728752 (0.0034) [2024-06-24 19:47:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11939921920. Throughput: 0: 42616.8. Samples: 11940063520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 19:47:15,088][15401] Updated weights for policy 0, policy_version 728762 (0.0035) [2024-06-24 19:47:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11940151296. Throughput: 0: 42518.6. Samples: 11940318080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 19:47:19,536][15401] Updated weights for policy 0, policy_version 728772 (0.0036) [2024-06-24 19:47:23,087][15401] Updated weights for policy 0, policy_version 728782 (0.0034) [2024-06-24 19:47:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 11940380672. Throughput: 0: 42585.7. Samples: 11940449600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 19:47:27,055][15401] Updated weights for policy 0, policy_version 728792 (0.0027) [2024-06-24 19:47:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11940560896. Throughput: 0: 42606.6. Samples: 11940701640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 19:47:30,730][15401] Updated weights for policy 0, policy_version 728802 (0.0040) [2024-06-24 19:47:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 11940790272. Throughput: 0: 42512.4. Samples: 11940953500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 19:47:34,610][15401] Updated weights for policy 0, policy_version 728812 (0.0037) [2024-06-24 19:47:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11941003264. Throughput: 0: 42741.1. Samples: 11941089900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 19:47:38,584][15401] Updated weights for policy 0, policy_version 728822 (0.0035) [2024-06-24 19:47:42,426][15401] Updated weights for policy 0, policy_version 728832 (0.0035) [2024-06-24 19:47:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11941199872. Throughput: 0: 42669.8. Samples: 11941341400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 19:47:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728834_11941216256.pth... [2024-06-24 19:47:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728209_11930976256.pth [2024-06-24 19:47:46,262][15401] Updated weights for policy 0, policy_version 728842 (0.0035) [2024-06-24 19:47:48,394][15132] Fps is (10 sec: 42578.3, 60 sec: 42868.1, 300 sec: 42764.3). Total num frames: 11941429248. Throughput: 0: 42676.5. Samples: 11941597380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:48,395][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 19:47:49,939][15401] Updated weights for policy 0, policy_version 728852 (0.0041) [2024-06-24 19:47:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 11941642240. Throughput: 0: 42631.1. Samples: 11941728800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:53,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 19:47:53,875][15401] Updated weights for policy 0, policy_version 728862 (0.0032) [2024-06-24 19:47:57,459][15401] Updated weights for policy 0, policy_version 728872 (0.0032) [2024-06-24 19:47:58,390][15132] Fps is (10 sec: 42618.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11941855232. Throughput: 0: 42702.2. Samples: 11941985120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:47:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 19:48:01,527][15401] Updated weights for policy 0, policy_version 728882 (0.0034) [2024-06-24 19:48:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 11942084608. Throughput: 0: 42765.3. Samples: 11942242520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:03,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 19:48:05,021][15401] Updated weights for policy 0, policy_version 728892 (0.0024) [2024-06-24 19:48:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11942264832. Throughput: 0: 42769.4. Samples: 11942374220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 19:48:09,091][15401] Updated weights for policy 0, policy_version 728902 (0.0038) [2024-06-24 19:48:12,417][15401] Updated weights for policy 0, policy_version 728912 (0.0033) [2024-06-24 19:48:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11942510592. Throughput: 0: 42837.3. Samples: 11942629320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 19:48:15,474][15349] Signal inference workers to stop experience collection... (176800 times) [2024-06-24 19:48:15,480][15349] Signal inference workers to resume experience collection... (176800 times) [2024-06-24 19:48:15,499][15401] InferenceWorker_p0-w0: stopping experience collection (176800 times) [2024-06-24 19:48:15,500][15401] InferenceWorker_p0-w0: resuming experience collection (176800 times) [2024-06-24 19:48:16,881][15401] Updated weights for policy 0, policy_version 728922 (0.0050) [2024-06-24 19:48:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11942723584. Throughput: 0: 42956.4. Samples: 11942886540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 19:48:20,229][15401] Updated weights for policy 0, policy_version 728932 (0.0044) [2024-06-24 19:48:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11942920192. Throughput: 0: 42911.9. Samples: 11943020940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 19:48:24,560][15401] Updated weights for policy 0, policy_version 728942 (0.0042) [2024-06-24 19:48:27,919][15401] Updated weights for policy 0, policy_version 728952 (0.0029) [2024-06-24 19:48:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11943165952. Throughput: 0: 42894.1. Samples: 11943271640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 19:48:32,149][15401] Updated weights for policy 0, policy_version 728962 (0.0034) [2024-06-24 19:48:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 11943378944. Throughput: 0: 42846.7. Samples: 11943525280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 19:48:35,435][15401] Updated weights for policy 0, policy_version 728972 (0.0023) [2024-06-24 19:48:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11943559168. Throughput: 0: 42851.2. Samples: 11943657100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 19:48:39,779][15401] Updated weights for policy 0, policy_version 728982 (0.0029) [2024-06-24 19:48:43,353][15401] Updated weights for policy 0, policy_version 728992 (0.0050) [2024-06-24 19:48:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 11943804928. Throughput: 0: 42792.9. Samples: 11943910800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 19:48:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 19:48:47,325][15401] Updated weights for policy 0, policy_version 729002 (0.0037) [2024-06-24 19:48:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43148.0, 300 sec: 42765.0). Total num frames: 11944017920. Throughput: 0: 42966.8. Samples: 11944176020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:48:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 19:48:50,955][15401] Updated weights for policy 0, policy_version 729012 (0.0034) [2024-06-24 19:48:53,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 11944198144. Throughput: 0: 42782.4. Samples: 11944299440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:48:53,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-24 19:48:55,178][15401] Updated weights for policy 0, policy_version 729022 (0.0026) [2024-06-24 19:48:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11944443904. Throughput: 0: 42649.7. Samples: 11944548560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:48:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 19:48:58,586][15401] Updated weights for policy 0, policy_version 729032 (0.0035) [2024-06-24 19:49:02,808][15401] Updated weights for policy 0, policy_version 729042 (0.0033) [2024-06-24 19:49:03,389][15132] Fps is (10 sec: 44238.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11944640512. Throughput: 0: 42842.3. Samples: 11944814440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 19:49:06,237][15401] Updated weights for policy 0, policy_version 729052 (0.0041) [2024-06-24 19:49:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11944853504. Throughput: 0: 42567.5. Samples: 11944936480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:08,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 19:49:10,661][15401] Updated weights for policy 0, policy_version 729062 (0.0029) [2024-06-24 19:49:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11945099264. Throughput: 0: 42699.6. Samples: 11945193120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 19:49:14,032][15401] Updated weights for policy 0, policy_version 729072 (0.0031) [2024-06-24 19:49:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11945263104. Throughput: 0: 42963.5. Samples: 11945458640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 19:49:18,561][15401] Updated weights for policy 0, policy_version 729082 (0.0031) [2024-06-24 19:49:18,656][15349] Signal inference workers to stop experience collection... (176850 times) [2024-06-24 19:49:18,704][15401] InferenceWorker_p0-w0: stopping experience collection (176850 times) [2024-06-24 19:49:18,712][15349] Signal inference workers to resume experience collection... (176850 times) [2024-06-24 19:49:18,722][15401] InferenceWorker_p0-w0: resuming experience collection (176850 times) [2024-06-24 19:49:21,860][15401] Updated weights for policy 0, policy_version 729092 (0.0040) [2024-06-24 19:49:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11945508864. Throughput: 0: 42740.3. Samples: 11945580520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:23,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 19:49:26,018][15401] Updated weights for policy 0, policy_version 729102 (0.0037) [2024-06-24 19:49:28,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11945738240. Throughput: 0: 42760.9. Samples: 11945835040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:49:29,746][15401] Updated weights for policy 0, policy_version 729112 (0.0038) [2024-06-24 19:49:33,392][15132] Fps is (10 sec: 40959.8, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 11945918464. Throughput: 0: 42859.3. Samples: 11946104800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:33,393][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 19:49:33,737][15401] Updated weights for policy 0, policy_version 729122 (0.0036) [2024-06-24 19:49:37,150][15401] Updated weights for policy 0, policy_version 729132 (0.0036) [2024-06-24 19:49:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11946131456. Throughput: 0: 42775.8. Samples: 11946224340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 19:49:41,441][15401] Updated weights for policy 0, policy_version 729142 (0.0038) [2024-06-24 19:49:43,389][15132] Fps is (10 sec: 47525.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 11946393600. Throughput: 0: 42920.5. Samples: 11946479980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 19:49:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729150_11946393600.pth... [2024-06-24 19:49:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728523_11936120832.pth [2024-06-24 19:49:44,725][15401] Updated weights for policy 0, policy_version 729152 (0.0033) [2024-06-24 19:49:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11946557440. Throughput: 0: 42708.4. Samples: 11946736320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 19:49:49,167][15401] Updated weights for policy 0, policy_version 729162 (0.0041) [2024-06-24 19:49:52,202][15401] Updated weights for policy 0, policy_version 729172 (0.0030) [2024-06-24 19:49:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 11946770432. Throughput: 0: 42708.5. Samples: 11946858360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 19:49:56,802][15401] Updated weights for policy 0, policy_version 729182 (0.0035) [2024-06-24 19:49:58,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 11946999808. Throughput: 0: 42865.6. Samples: 11947122080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:49:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 19:49:59,744][15401] Updated weights for policy 0, policy_version 729192 (0.0030) [2024-06-24 19:50:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 11947180032. Throughput: 0: 42654.6. Samples: 11947378100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:03,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 19:50:04,489][15401] Updated weights for policy 0, policy_version 729202 (0.0035) [2024-06-24 19:50:07,775][15401] Updated weights for policy 0, policy_version 729212 (0.0042) [2024-06-24 19:50:08,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 11947425792. Throughput: 0: 42613.4. Samples: 11947498020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:08,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 19:50:11,881][15401] Updated weights for policy 0, policy_version 729222 (0.0037) [2024-06-24 19:50:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11947655168. Throughput: 0: 42792.9. Samples: 11947760720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 19:50:15,363][15401] Updated weights for policy 0, policy_version 729232 (0.0038) [2024-06-24 19:50:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11947835392. Throughput: 0: 42637.8. Samples: 11948023400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:18,396][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 19:50:19,473][15401] Updated weights for policy 0, policy_version 729242 (0.0042) [2024-06-24 19:50:23,163][15401] Updated weights for policy 0, policy_version 729252 (0.0034) [2024-06-24 19:50:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 11948064768. Throughput: 0: 42658.6. Samples: 11948143980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 19:50:26,940][15401] Updated weights for policy 0, policy_version 729262 (0.0035) [2024-06-24 19:50:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11948277760. Throughput: 0: 42805.9. Samples: 11948406240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-24 19:50:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 19:50:30,763][15401] Updated weights for policy 0, policy_version 729272 (0.0025) [2024-06-24 19:50:33,030][15349] Signal inference workers to stop experience collection... (176900 times) [2024-06-24 19:50:33,030][15349] Signal inference workers to resume experience collection... (176900 times) [2024-06-24 19:50:33,075][15401] InferenceWorker_p0-w0: stopping experience collection (176900 times) [2024-06-24 19:50:33,080][15401] InferenceWorker_p0-w0: resuming experience collection (176900 times) [2024-06-24 19:50:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42710.0). Total num frames: 11948490752. Throughput: 0: 42860.7. Samples: 11948665060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-24 19:50:34,490][15401] Updated weights for policy 0, policy_version 729282 (0.0037) [2024-06-24 19:50:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11948703744. Throughput: 0: 42881.0. Samples: 11948788000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 19:50:38,451][15401] Updated weights for policy 0, policy_version 729292 (0.0037) [2024-06-24 19:50:42,179][15401] Updated weights for policy 0, policy_version 729302 (0.0039) [2024-06-24 19:50:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 11948916736. Throughput: 0: 42755.3. Samples: 11949046060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:43,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 19:50:46,645][15401] Updated weights for policy 0, policy_version 729312 (0.0030) [2024-06-24 19:50:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 11949129728. Throughput: 0: 42863.7. Samples: 11949306960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 19:50:49,941][15401] Updated weights for policy 0, policy_version 729322 (0.0038) [2024-06-24 19:50:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11949359104. Throughput: 0: 42849.2. Samples: 11949426240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:53,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 19:50:54,217][15401] Updated weights for policy 0, policy_version 729332 (0.0037) [2024-06-24 19:50:57,515][15401] Updated weights for policy 0, policy_version 729342 (0.0037) [2024-06-24 19:50:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 11949588480. Throughput: 0: 42876.0. Samples: 11949690140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:50:58,401][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 19:51:01,692][15401] Updated weights for policy 0, policy_version 729352 (0.0022) [2024-06-24 19:51:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 11949768704. Throughput: 0: 42758.6. Samples: 11949947540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 19:51:05,138][15401] Updated weights for policy 0, policy_version 729362 (0.0032) [2024-06-24 19:51:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 11949998080. Throughput: 0: 42789.3. Samples: 11950069600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:08,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 19:51:09,002][15401] Updated weights for policy 0, policy_version 729372 (0.0026) [2024-06-24 19:51:12,691][15401] Updated weights for policy 0, policy_version 729382 (0.0029) [2024-06-24 19:51:13,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11950243840. Throughput: 0: 42855.1. Samples: 11950334720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 19:51:16,751][15401] Updated weights for policy 0, policy_version 729392 (0.0026) [2024-06-24 19:51:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11950424064. Throughput: 0: 42899.7. Samples: 11950595540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 19:51:20,288][15401] Updated weights for policy 0, policy_version 729402 (0.0033) [2024-06-24 19:51:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11950653440. Throughput: 0: 42893.6. Samples: 11950718220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 19:51:24,377][15401] Updated weights for policy 0, policy_version 729412 (0.0042) [2024-06-24 19:51:27,841][15401] Updated weights for policy 0, policy_version 729422 (0.0033) [2024-06-24 19:51:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11950882816. Throughput: 0: 43078.2. Samples: 11950984580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 19:51:32,167][15401] Updated weights for policy 0, policy_version 729432 (0.0042) [2024-06-24 19:51:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11951063040. Throughput: 0: 42914.1. Samples: 11951238100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 19:51:35,619][15401] Updated weights for policy 0, policy_version 729442 (0.0039) [2024-06-24 19:51:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11951292416. Throughput: 0: 42975.1. Samples: 11951360120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 19:51:39,822][15401] Updated weights for policy 0, policy_version 729452 (0.0039) [2024-06-24 19:51:43,205][15401] Updated weights for policy 0, policy_version 729462 (0.0028) [2024-06-24 19:51:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 11951521792. Throughput: 0: 42997.8. Samples: 11951625040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 19:51:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729463_11951521792.pth... [2024-06-24 19:51:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000728834_11941216256.pth [2024-06-24 19:51:47,315][15401] Updated weights for policy 0, policy_version 729472 (0.0032) [2024-06-24 19:51:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 11951702016. Throughput: 0: 43062.0. Samples: 11951885320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:48,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 19:51:50,769][15401] Updated weights for policy 0, policy_version 729482 (0.0030) [2024-06-24 19:51:53,396][15132] Fps is (10 sec: 42572.9, 60 sec: 43140.3, 300 sec: 42930.8). Total num frames: 11951947776. Throughput: 0: 42900.6. Samples: 11952000280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:53,396][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 19:51:54,780][15401] Updated weights for policy 0, policy_version 729492 (0.0030) [2024-06-24 19:51:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11952144384. Throughput: 0: 42901.8. Samples: 11952265300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:51:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 19:51:58,425][15401] Updated weights for policy 0, policy_version 729502 (0.0031) [2024-06-24 19:52:02,449][15401] Updated weights for policy 0, policy_version 729512 (0.0032) [2024-06-24 19:52:03,390][15132] Fps is (10 sec: 39345.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11952340992. Throughput: 0: 42838.2. Samples: 11952523260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:52:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 19:52:06,295][15401] Updated weights for policy 0, policy_version 729522 (0.0034) [2024-06-24 19:52:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 11952553984. Throughput: 0: 42798.8. Samples: 11952644160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-24 19:52:08,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-24 19:52:10,460][15401] Updated weights for policy 0, policy_version 729532 (0.0029) [2024-06-24 19:52:11,551][15349] Signal inference workers to stop experience collection... (176950 times) [2024-06-24 19:52:11,555][15349] Signal inference workers to resume experience collection... (176950 times) [2024-06-24 19:52:11,593][15401] InferenceWorker_p0-w0: stopping experience collection (176950 times) [2024-06-24 19:52:11,628][15401] InferenceWorker_p0-w0: resuming experience collection (176950 times) [2024-06-24 19:52:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11952783360. Throughput: 0: 42639.6. Samples: 11952903360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 19:52:14,062][15401] Updated weights for policy 0, policy_version 729542 (0.0034) [2024-06-24 19:52:18,106][15401] Updated weights for policy 0, policy_version 729552 (0.0027) [2024-06-24 19:52:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11952979968. Throughput: 0: 42820.5. Samples: 11953165020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:18,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-24 19:52:21,355][15401] Updated weights for policy 0, policy_version 729562 (0.0026) [2024-06-24 19:52:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11953209344. Throughput: 0: 42948.2. Samples: 11953292780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 19:52:25,763][15401] Updated weights for policy 0, policy_version 729572 (0.0029) [2024-06-24 19:52:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11953438720. Throughput: 0: 43011.7. Samples: 11953560560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:28,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-24 19:52:28,846][15401] Updated weights for policy 0, policy_version 729582 (0.0034) [2024-06-24 19:52:33,307][15401] Updated weights for policy 0, policy_version 729592 (0.0041) [2024-06-24 19:52:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11953635328. Throughput: 0: 43005.8. Samples: 11953820580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 19:52:36,626][15401] Updated weights for policy 0, policy_version 729602 (0.0038) [2024-06-24 19:52:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11953848320. Throughput: 0: 43143.2. Samples: 11953941460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 19:52:40,941][15401] Updated weights for policy 0, policy_version 729612 (0.0038) [2024-06-24 19:52:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42932.3). Total num frames: 11954094080. Throughput: 0: 43048.8. Samples: 11954202500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 19:52:44,118][15401] Updated weights for policy 0, policy_version 729622 (0.0045) [2024-06-24 19:52:48,375][15401] Updated weights for policy 0, policy_version 729632 (0.0028) [2024-06-24 19:52:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11954290688. Throughput: 0: 43006.2. Samples: 11954458540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 19:52:51,651][15401] Updated weights for policy 0, policy_version 729642 (0.0032) [2024-06-24 19:52:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.6, 300 sec: 42876.1). Total num frames: 11954503680. Throughput: 0: 43094.1. Samples: 11954583400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:53,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-24 19:52:55,840][15401] Updated weights for policy 0, policy_version 729652 (0.0039) [2024-06-24 19:52:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11954700288. Throughput: 0: 43096.9. Samples: 11954842720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:52:58,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 19:52:59,581][15401] Updated weights for policy 0, policy_version 729662 (0.0034) [2024-06-24 19:53:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 11954929664. Throughput: 0: 42880.1. Samples: 11955094620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 19:53:03,879][15401] Updated weights for policy 0, policy_version 729672 (0.0034) [2024-06-24 19:53:07,440][15401] Updated weights for policy 0, policy_version 729682 (0.0041) [2024-06-24 19:53:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 11955142656. Throughput: 0: 42959.6. Samples: 11955225960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 19:53:11,365][15401] Updated weights for policy 0, policy_version 729692 (0.0034) [2024-06-24 19:53:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11955339264. Throughput: 0: 42787.9. Samples: 11955486020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 19:53:15,123][15401] Updated weights for policy 0, policy_version 729702 (0.0033) [2024-06-24 19:53:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 11955585024. Throughput: 0: 42635.5. Samples: 11955739180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:18,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 19:53:18,838][15401] Updated weights for policy 0, policy_version 729712 (0.0029) [2024-06-24 19:53:22,676][15401] Updated weights for policy 0, policy_version 729722 (0.0034) [2024-06-24 19:53:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11955781632. Throughput: 0: 42802.1. Samples: 11955867560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 19:53:26,331][15401] Updated weights for policy 0, policy_version 729732 (0.0032) [2024-06-24 19:53:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11955994624. Throughput: 0: 42790.3. Samples: 11956128060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:28,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 19:53:29,510][15349] Signal inference workers to stop experience collection... (177000 times) [2024-06-24 19:53:29,540][15401] InferenceWorker_p0-w0: stopping experience collection (177000 times) [2024-06-24 19:53:29,566][15349] Signal inference workers to resume experience collection... (177000 times) [2024-06-24 19:53:29,572][15401] InferenceWorker_p0-w0: resuming experience collection (177000 times) [2024-06-24 19:53:30,388][15401] Updated weights for policy 0, policy_version 729742 (0.0036) [2024-06-24 19:53:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 11956224000. Throughput: 0: 42626.2. Samples: 11956376720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 19:53:34,034][15401] Updated weights for policy 0, policy_version 729752 (0.0033) [2024-06-24 19:53:38,217][15401] Updated weights for policy 0, policy_version 729762 (0.0042) [2024-06-24 19:53:38,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 11956420608. Throughput: 0: 42708.7. Samples: 11956505300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:38,391][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 19:53:41,849][15401] Updated weights for policy 0, policy_version 729772 (0.0037) [2024-06-24 19:53:43,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 11956617216. Throughput: 0: 42583.0. Samples: 11956759060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:43,392][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 19:53:43,442][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729775_11956633600.pth... [2024-06-24 19:53:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729150_11946393600.pth [2024-06-24 19:53:45,852][15401] Updated weights for policy 0, policy_version 729782 (0.0034) [2024-06-24 19:53:48,390][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 11956862976. Throughput: 0: 42759.1. Samples: 11957018780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 19:53:49,248][15401] Updated weights for policy 0, policy_version 729792 (0.0032) [2024-06-24 19:53:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 11957059584. Throughput: 0: 42722.6. Samples: 11957148480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 19:53:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 19:53:53,645][15401] Updated weights for policy 0, policy_version 729802 (0.0029) [2024-06-24 19:53:57,107][15401] Updated weights for policy 0, policy_version 729812 (0.0044) [2024-06-24 19:53:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11957272576. Throughput: 0: 42567.1. Samples: 11957401540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:53:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 19:54:01,338][15401] Updated weights for policy 0, policy_version 729822 (0.0027) [2024-06-24 19:54:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11957501952. Throughput: 0: 42538.2. Samples: 11957653400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 19:54:05,108][15401] Updated weights for policy 0, policy_version 729832 (0.0032) [2024-06-24 19:54:08,396][15132] Fps is (10 sec: 42569.5, 60 sec: 42593.5, 300 sec: 42708.5). Total num frames: 11957698560. Throughput: 0: 42611.8. Samples: 11957785380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:08,397][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 19:54:08,884][15401] Updated weights for policy 0, policy_version 729842 (0.0040) [2024-06-24 19:54:12,505][15401] Updated weights for policy 0, policy_version 729852 (0.0030) [2024-06-24 19:54:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11957911552. Throughput: 0: 42497.8. Samples: 11958040460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 19:54:16,549][15401] Updated weights for policy 0, policy_version 729862 (0.0036) [2024-06-24 19:54:18,390][15132] Fps is (10 sec: 45905.7, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 11958157312. Throughput: 0: 42726.1. Samples: 11958299400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 19:54:20,057][15401] Updated weights for policy 0, policy_version 729872 (0.0036) [2024-06-24 19:54:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11958321152. Throughput: 0: 42805.1. Samples: 11958431520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 19:54:24,446][15401] Updated weights for policy 0, policy_version 729882 (0.0028) [2024-06-24 19:54:28,240][15401] Updated weights for policy 0, policy_version 729892 (0.0031) [2024-06-24 19:54:28,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 11958550528. Throughput: 0: 42744.6. Samples: 11958682460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 19:54:31,886][15401] Updated weights for policy 0, policy_version 729902 (0.0036) [2024-06-24 19:54:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11958779904. Throughput: 0: 42868.8. Samples: 11958947880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 19:54:35,594][15401] Updated weights for policy 0, policy_version 729912 (0.0031) [2024-06-24 19:54:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 11958976512. Throughput: 0: 42852.0. Samples: 11959076820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 19:54:39,513][15401] Updated weights for policy 0, policy_version 729922 (0.0044) [2024-06-24 19:54:41,151][15349] Signal inference workers to stop experience collection... (177050 times) [2024-06-24 19:54:41,204][15349] Signal inference workers to resume experience collection... (177050 times) [2024-06-24 19:54:41,205][15401] InferenceWorker_p0-w0: stopping experience collection (177050 times) [2024-06-24 19:54:41,218][15401] InferenceWorker_p0-w0: resuming experience collection (177050 times) [2024-06-24 19:54:43,074][15401] Updated weights for policy 0, policy_version 729932 (0.0026) [2024-06-24 19:54:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 11959222272. Throughput: 0: 42810.8. Samples: 11959328020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 19:54:47,209][15401] Updated weights for policy 0, policy_version 729942 (0.0041) [2024-06-24 19:54:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11959418880. Throughput: 0: 43121.0. Samples: 11959593840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 19:54:50,513][15401] Updated weights for policy 0, policy_version 729952 (0.0031) [2024-06-24 19:54:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11959615488. Throughput: 0: 42762.0. Samples: 11959709380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:53,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 19:54:54,982][15401] Updated weights for policy 0, policy_version 729962 (0.0041) [2024-06-24 19:54:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11959844864. Throughput: 0: 42855.9. Samples: 11959968980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:54:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 19:54:58,465][15401] Updated weights for policy 0, policy_version 729972 (0.0030) [2024-06-24 19:55:02,625][15401] Updated weights for policy 0, policy_version 729982 (0.0038) [2024-06-24 19:55:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 11960041472. Throughput: 0: 42891.6. Samples: 11960229520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 19:55:05,977][15401] Updated weights for policy 0, policy_version 729992 (0.0031) [2024-06-24 19:55:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.3, 300 sec: 42709.5). Total num frames: 11960254464. Throughput: 0: 42689.0. Samples: 11960352520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 19:55:10,336][15401] Updated weights for policy 0, policy_version 730002 (0.0041) [2024-06-24 19:55:13,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 11960500224. Throughput: 0: 42857.2. Samples: 11960611040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 19:55:13,870][15401] Updated weights for policy 0, policy_version 730012 (0.0031) [2024-06-24 19:55:17,838][15401] Updated weights for policy 0, policy_version 730022 (0.0045) [2024-06-24 19:55:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 11960696832. Throughput: 0: 42608.0. Samples: 11960865240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 19:55:21,570][15401] Updated weights for policy 0, policy_version 730032 (0.0036) [2024-06-24 19:55:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11960909824. Throughput: 0: 42475.0. Samples: 11960988300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:23,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 19:55:25,585][15401] Updated weights for policy 0, policy_version 730042 (0.0032) [2024-06-24 19:55:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11961139200. Throughput: 0: 42619.8. Samples: 11961245920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 19:55:29,197][15401] Updated weights for policy 0, policy_version 730052 (0.0038) [2024-06-24 19:55:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11961319424. Throughput: 0: 42548.0. Samples: 11961508500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-24 19:55:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 19:55:33,439][15401] Updated weights for policy 0, policy_version 730062 (0.0035) [2024-06-24 19:55:36,928][15401] Updated weights for policy 0, policy_version 730072 (0.0029) [2024-06-24 19:55:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 11961565184. Throughput: 0: 42729.3. Samples: 11961632200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:55:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 19:55:40,948][15401] Updated weights for policy 0, policy_version 730082 (0.0036) [2024-06-24 19:55:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11961794560. Throughput: 0: 42818.2. Samples: 11961895800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:55:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 19:55:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730090_11961794560.pth... [2024-06-24 19:55:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729463_11951521792.pth [2024-06-24 19:55:44,422][15401] Updated weights for policy 0, policy_version 730092 (0.0031) [2024-06-24 19:55:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 11961974784. Throughput: 0: 42673.4. Samples: 11962149820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:55:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 19:55:48,575][15401] Updated weights for policy 0, policy_version 730102 (0.0031) [2024-06-24 19:55:52,173][15401] Updated weights for policy 0, policy_version 730112 (0.0048) [2024-06-24 19:55:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 11962220544. Throughput: 0: 42660.8. Samples: 11962272260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:55:53,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 19:55:56,151][15401] Updated weights for policy 0, policy_version 730122 (0.0029) [2024-06-24 19:55:58,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 11962417152. Throughput: 0: 42853.8. Samples: 11962539560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:55:58,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 19:55:59,605][15401] Updated weights for policy 0, policy_version 730132 (0.0033) [2024-06-24 19:56:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 11962630144. Throughput: 0: 42768.9. Samples: 11962789840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 19:56:03,767][15401] Updated weights for policy 0, policy_version 730142 (0.0041) [2024-06-24 19:56:07,148][15349] Signal inference workers to stop experience collection... (177100 times) [2024-06-24 19:56:07,148][15349] Signal inference workers to resume experience collection... (177100 times) [2024-06-24 19:56:07,193][15401] InferenceWorker_p0-w0: stopping experience collection (177100 times) [2024-06-24 19:56:07,193][15401] InferenceWorker_p0-w0: resuming experience collection (177100 times) [2024-06-24 19:56:07,295][15401] Updated weights for policy 0, policy_version 730152 (0.0034) [2024-06-24 19:56:08,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 11962859520. Throughput: 0: 42908.1. Samples: 11962919060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 19:56:11,482][15401] Updated weights for policy 0, policy_version 730162 (0.0034) [2024-06-24 19:56:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11963039744. Throughput: 0: 42900.4. Samples: 11963176440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 19:56:14,965][15401] Updated weights for policy 0, policy_version 730172 (0.0044) [2024-06-24 19:56:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11963269120. Throughput: 0: 42632.0. Samples: 11963426940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 19:56:19,092][15401] Updated weights for policy 0, policy_version 730182 (0.0035) [2024-06-24 19:56:22,859][15401] Updated weights for policy 0, policy_version 730192 (0.0037) [2024-06-24 19:56:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 11963498496. Throughput: 0: 42891.6. Samples: 11963562320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 19:56:26,726][15401] Updated weights for policy 0, policy_version 730202 (0.0027) [2024-06-24 19:56:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11963678720. Throughput: 0: 42715.0. Samples: 11963817980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:28,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 19:56:30,400][15401] Updated weights for policy 0, policy_version 730212 (0.0024) [2024-06-24 19:56:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11963924480. Throughput: 0: 42742.8. Samples: 11964073240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 19:56:34,789][15401] Updated weights for policy 0, policy_version 730222 (0.0021) [2024-06-24 19:56:37,874][15401] Updated weights for policy 0, policy_version 730232 (0.0027) [2024-06-24 19:56:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 11964153856. Throughput: 0: 43052.1. Samples: 11964209600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 19:56:42,435][15401] Updated weights for policy 0, policy_version 730242 (0.0043) [2024-06-24 19:56:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 11964334080. Throughput: 0: 42961.7. Samples: 11964472740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 19:56:45,343][15401] Updated weights for policy 0, policy_version 730252 (0.0037) [2024-06-24 19:56:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42821.4). Total num frames: 11964579840. Throughput: 0: 43000.9. Samples: 11964724880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 19:56:49,807][15401] Updated weights for policy 0, policy_version 730262 (0.0033) [2024-06-24 19:56:52,818][15401] Updated weights for policy 0, policy_version 730272 (0.0033) [2024-06-24 19:56:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11964792832. Throughput: 0: 43162.2. Samples: 11964861360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 19:56:57,526][15401] Updated weights for policy 0, policy_version 730282 (0.0026) [2024-06-24 19:56:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 11964989440. Throughput: 0: 43316.6. Samples: 11965125680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:56:58,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 19:57:00,393][15401] Updated weights for policy 0, policy_version 730292 (0.0033) [2024-06-24 19:57:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11965218816. Throughput: 0: 43382.5. Samples: 11965379160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:57:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 19:57:05,004][15401] Updated weights for policy 0, policy_version 730302 (0.0049) [2024-06-24 19:57:07,926][15401] Updated weights for policy 0, policy_version 730312 (0.0036) [2024-06-24 19:57:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11965448192. Throughput: 0: 43279.1. Samples: 11965509880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:57:08,390][15132] Avg episode reward: [(0, '0.130')] [2024-06-24 19:57:12,490][15401] Updated weights for policy 0, policy_version 730322 (0.0043) [2024-06-24 19:57:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 11965612032. Throughput: 0: 43233.1. Samples: 11965763460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:57:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 19:57:15,778][15401] Updated weights for policy 0, policy_version 730332 (0.0039) [2024-06-24 19:57:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 11965857792. Throughput: 0: 43116.5. Samples: 11966013480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-24 19:57:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 19:57:19,964][15401] Updated weights for policy 0, policy_version 730342 (0.0038) [2024-06-24 19:57:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11966070784. Throughput: 0: 43020.4. Samples: 11966145520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 19:57:23,526][15401] Updated weights for policy 0, policy_version 730352 (0.0031) [2024-06-24 19:57:28,138][15401] Updated weights for policy 0, policy_version 730362 (0.0026) [2024-06-24 19:57:28,391][15132] Fps is (10 sec: 40952.5, 60 sec: 43143.3, 300 sec: 42820.3). Total num frames: 11966267392. Throughput: 0: 42889.0. Samples: 11966402820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:28,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 19:57:31,141][15401] Updated weights for policy 0, policy_version 730372 (0.0036) [2024-06-24 19:57:33,243][15349] Signal inference workers to stop experience collection... (177150 times) [2024-06-24 19:57:33,290][15401] InferenceWorker_p0-w0: stopping experience collection (177150 times) [2024-06-24 19:57:33,297][15349] Signal inference workers to resume experience collection... (177150 times) [2024-06-24 19:57:33,308][15401] InferenceWorker_p0-w0: resuming experience collection (177150 times) [2024-06-24 19:57:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 11966496768. Throughput: 0: 42900.0. Samples: 11966655380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 19:57:35,703][15401] Updated weights for policy 0, policy_version 730382 (0.0024) [2024-06-24 19:57:38,389][15132] Fps is (10 sec: 42606.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11966693376. Throughput: 0: 42821.5. Samples: 11966788320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 19:57:38,713][15401] Updated weights for policy 0, policy_version 730392 (0.0032) [2024-06-24 19:57:43,385][15401] Updated weights for policy 0, policy_version 730402 (0.0043) [2024-06-24 19:57:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 11966906368. Throughput: 0: 42569.8. Samples: 11967041320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 19:57:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730403_11966922752.pth... [2024-06-24 19:57:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000729775_11956633600.pth [2024-06-24 19:57:46,622][15401] Updated weights for policy 0, policy_version 730412 (0.0042) [2024-06-24 19:57:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 11967135744. Throughput: 0: 42574.4. Samples: 11967295000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 19:57:50,908][15401] Updated weights for policy 0, policy_version 730422 (0.0029) [2024-06-24 19:57:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 11967348736. Throughput: 0: 42567.2. Samples: 11967425400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 19:57:54,254][15401] Updated weights for policy 0, policy_version 730432 (0.0037) [2024-06-24 19:57:58,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 11967545344. Throughput: 0: 42671.5. Samples: 11967683680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:57:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 19:57:58,972][15401] Updated weights for policy 0, policy_version 730442 (0.0026) [2024-06-24 19:58:01,926][15401] Updated weights for policy 0, policy_version 730452 (0.0034) [2024-06-24 19:58:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 11967807488. Throughput: 0: 42693.2. Samples: 11967934680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:58:06,417][15401] Updated weights for policy 0, policy_version 730462 (0.0032) [2024-06-24 19:58:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 11967971328. Throughput: 0: 42869.4. Samples: 11968074640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 19:58:09,547][15401] Updated weights for policy 0, policy_version 730472 (0.0028) [2024-06-24 19:58:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11968200704. Throughput: 0: 42719.1. Samples: 11968325100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 19:58:13,935][15401] Updated weights for policy 0, policy_version 730482 (0.0036) [2024-06-24 19:58:17,418][15401] Updated weights for policy 0, policy_version 730492 (0.0033) [2024-06-24 19:58:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11968430080. Throughput: 0: 42705.4. Samples: 11968577120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 19:58:21,280][15401] Updated weights for policy 0, policy_version 730502 (0.0032) [2024-06-24 19:58:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 11968610304. Throughput: 0: 42635.1. Samples: 11968706900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 19:58:24,879][15401] Updated weights for policy 0, policy_version 730512 (0.0040) [2024-06-24 19:58:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42326.6, 300 sec: 42654.0). Total num frames: 11968806912. Throughput: 0: 42662.1. Samples: 11968961120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 19:58:29,103][15401] Updated weights for policy 0, policy_version 730522 (0.0026) [2024-06-24 19:58:32,410][15401] Updated weights for policy 0, policy_version 730532 (0.0034) [2024-06-24 19:58:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11969069056. Throughput: 0: 42847.6. Samples: 11969223140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 19:58:36,631][15401] Updated weights for policy 0, policy_version 730542 (0.0040) [2024-06-24 19:58:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 11969265664. Throughput: 0: 42936.2. Samples: 11969357540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 19:58:40,000][15401] Updated weights for policy 0, policy_version 730552 (0.0033) [2024-06-24 19:58:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 11969462272. Throughput: 0: 42756.3. Samples: 11969607720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 19:58:44,627][15401] Updated weights for policy 0, policy_version 730562 (0.0022) [2024-06-24 19:58:45,040][15349] Signal inference workers to stop experience collection... (177200 times) [2024-06-24 19:58:45,040][15349] Signal inference workers to resume experience collection... (177200 times) [2024-06-24 19:58:45,054][15401] InferenceWorker_p0-w0: stopping experience collection (177200 times) [2024-06-24 19:58:45,055][15401] InferenceWorker_p0-w0: resuming experience collection (177200 times) [2024-06-24 19:58:47,676][15401] Updated weights for policy 0, policy_version 730572 (0.0035) [2024-06-24 19:58:48,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 11969708032. Throughput: 0: 42709.1. Samples: 11969856580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 19:58:52,298][15401] Updated weights for policy 0, policy_version 730582 (0.0025) [2024-06-24 19:58:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 11969904640. Throughput: 0: 42656.4. Samples: 11969994180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:53,392][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 19:58:55,310][15401] Updated weights for policy 0, policy_version 730592 (0.0041) [2024-06-24 19:58:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11970101248. Throughput: 0: 42656.4. Samples: 11970244640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 19:58:58,391][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 19:58:59,874][15401] Updated weights for policy 0, policy_version 730602 (0.0044) [2024-06-24 19:59:03,136][15401] Updated weights for policy 0, policy_version 730612 (0.0040) [2024-06-24 19:59:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42932.6). Total num frames: 11970363392. Throughput: 0: 42913.3. Samples: 11970508220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:03,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 19:59:07,520][15401] Updated weights for policy 0, policy_version 730622 (0.0039) [2024-06-24 19:59:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11970560000. Throughput: 0: 42989.3. Samples: 11970641420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 19:59:10,734][15401] Updated weights for policy 0, policy_version 730632 (0.0034) [2024-06-24 19:59:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 11970756608. Throughput: 0: 42820.0. Samples: 11970888020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 19:59:14,971][15401] Updated weights for policy 0, policy_version 730642 (0.0044) [2024-06-24 19:59:18,207][15401] Updated weights for policy 0, policy_version 730652 (0.0039) [2024-06-24 19:59:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 11971002368. Throughput: 0: 42810.6. Samples: 11971149620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 19:59:22,856][15401] Updated weights for policy 0, policy_version 730662 (0.0037) [2024-06-24 19:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11971182592. Throughput: 0: 42821.9. Samples: 11971284520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:23,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 19:59:25,747][15401] Updated weights for policy 0, policy_version 730672 (0.0034) [2024-06-24 19:59:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 11971411968. Throughput: 0: 42708.2. Samples: 11971529580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:28,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 19:59:30,940][15401] Updated weights for policy 0, policy_version 730682 (0.0029) [2024-06-24 19:59:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 11971641344. Throughput: 0: 42836.4. Samples: 11971784220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 19:59:33,784][15401] Updated weights for policy 0, policy_version 730692 (0.0037) [2024-06-24 19:59:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11971805184. Throughput: 0: 42728.4. Samples: 11971916960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 19:59:38,563][15401] Updated weights for policy 0, policy_version 730702 (0.0022) [2024-06-24 19:59:41,343][15401] Updated weights for policy 0, policy_version 730712 (0.0044) [2024-06-24 19:59:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 11972067328. Throughput: 0: 42730.6. Samples: 11972167520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 19:59:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730717_11972067328.pth... [2024-06-24 19:59:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730090_11961794560.pth [2024-06-24 19:59:46,232][15401] Updated weights for policy 0, policy_version 730722 (0.0040) [2024-06-24 19:59:48,392][15132] Fps is (10 sec: 47502.5, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 11972280320. Throughput: 0: 42535.1. Samples: 11972422400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:48,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 19:59:49,274][15401] Updated weights for policy 0, policy_version 730732 (0.0035) [2024-06-24 19:59:53,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11972444160. Throughput: 0: 42466.2. Samples: 11972552400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 19:59:53,975][15401] Updated weights for policy 0, policy_version 730742 (0.0037) [2024-06-24 19:59:54,628][15349] Signal inference workers to stop experience collection... (177250 times) [2024-06-24 19:59:54,632][15349] Signal inference workers to resume experience collection... (177250 times) [2024-06-24 19:59:54,638][15401] InferenceWorker_p0-w0: stopping experience collection (177250 times) [2024-06-24 19:59:54,657][15401] InferenceWorker_p0-w0: resuming experience collection (177250 times) [2024-06-24 19:59:56,964][15401] Updated weights for policy 0, policy_version 730752 (0.0031) [2024-06-24 19:59:58,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 11972706304. Throughput: 0: 42666.9. Samples: 11972808040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 19:59:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 20:00:01,628][15401] Updated weights for policy 0, policy_version 730762 (0.0038) [2024-06-24 20:00:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 11972902912. Throughput: 0: 42345.1. Samples: 11973055160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 20:00:04,789][15401] Updated weights for policy 0, policy_version 730772 (0.0039) [2024-06-24 20:00:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 11973083136. Throughput: 0: 42104.5. Samples: 11973179220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 20:00:09,627][15401] Updated weights for policy 0, policy_version 730782 (0.0031) [2024-06-24 20:00:12,288][15401] Updated weights for policy 0, policy_version 730792 (0.0028) [2024-06-24 20:00:13,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 11973345280. Throughput: 0: 42468.9. Samples: 11973440680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 20:00:17,254][15401] Updated weights for policy 0, policy_version 730802 (0.0023) [2024-06-24 20:00:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42050.5, 300 sec: 42765.0). Total num frames: 11973525504. Throughput: 0: 42626.6. Samples: 11973702520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:18,393][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 20:00:20,194][15401] Updated weights for policy 0, policy_version 730812 (0.0032) [2024-06-24 20:00:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11973738496. Throughput: 0: 42459.5. Samples: 11973827640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 20:00:24,754][15401] Updated weights for policy 0, policy_version 730822 (0.0039) [2024-06-24 20:00:27,567][15401] Updated weights for policy 0, policy_version 730832 (0.0038) [2024-06-24 20:00:28,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 11973967872. Throughput: 0: 42510.8. Samples: 11974080500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 20:00:32,102][15401] Updated weights for policy 0, policy_version 730842 (0.0034) [2024-06-24 20:00:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 11974164480. Throughput: 0: 42744.5. Samples: 11974345800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:33,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 20:00:35,586][15401] Updated weights for policy 0, policy_version 730852 (0.0043) [2024-06-24 20:00:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11974377472. Throughput: 0: 42619.9. Samples: 11974470300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 20:00:40,051][15401] Updated weights for policy 0, policy_version 730862 (0.0029) [2024-06-24 20:00:43,107][15401] Updated weights for policy 0, policy_version 730872 (0.0042) [2024-06-24 20:00:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 11974606848. Throughput: 0: 42615.6. Samples: 11974725840. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-24 20:00:43,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 20:00:47,717][15401] Updated weights for policy 0, policy_version 730882 (0.0042) [2024-06-24 20:00:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 11974803456. Throughput: 0: 42883.3. Samples: 11974984900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:00:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 20:00:50,686][15401] Updated weights for policy 0, policy_version 730892 (0.0043) [2024-06-24 20:00:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 11975016448. Throughput: 0: 42906.7. Samples: 11975110020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:00:53,390][15132] Avg episode reward: [(0, '0.097')] [2024-06-24 20:00:55,219][15401] Updated weights for policy 0, policy_version 730902 (0.0030) [2024-06-24 20:00:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11975245824. Throughput: 0: 42649.6. Samples: 11975359920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:00:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 20:00:58,802][15401] Updated weights for policy 0, policy_version 730912 (0.0029) [2024-06-24 20:01:02,735][15401] Updated weights for policy 0, policy_version 730922 (0.0023) [2024-06-24 20:01:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11975442432. Throughput: 0: 42524.0. Samples: 11975616000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 20:01:06,255][15401] Updated weights for policy 0, policy_version 730932 (0.0025) [2024-06-24 20:01:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 11975655424. Throughput: 0: 42582.3. Samples: 11975743840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:08,399][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 20:01:10,374][15401] Updated weights for policy 0, policy_version 730942 (0.0025) [2024-06-24 20:01:11,885][15349] Signal inference workers to stop experience collection... (177300 times) [2024-06-24 20:01:11,885][15349] Signal inference workers to resume experience collection... (177300 times) [2024-06-24 20:01:11,924][15401] InferenceWorker_p0-w0: stopping experience collection (177300 times) [2024-06-24 20:01:11,924][15401] InferenceWorker_p0-w0: resuming experience collection (177300 times) [2024-06-24 20:01:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11975868416. Throughput: 0: 42670.6. Samples: 11976000680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 20:01:13,992][15401] Updated weights for policy 0, policy_version 730952 (0.0044) [2024-06-24 20:01:18,035][15401] Updated weights for policy 0, policy_version 730962 (0.0021) [2024-06-24 20:01:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11976097792. Throughput: 0: 42354.3. Samples: 11976251740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 20:01:21,866][15401] Updated weights for policy 0, policy_version 730972 (0.0033) [2024-06-24 20:01:23,391][15132] Fps is (10 sec: 44230.7, 60 sec: 42870.5, 300 sec: 42820.4). Total num frames: 11976310784. Throughput: 0: 42492.0. Samples: 11976382500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:23,391][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 20:01:25,677][15401] Updated weights for policy 0, policy_version 730982 (0.0032) [2024-06-24 20:01:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 11976523776. Throughput: 0: 42636.0. Samples: 11976644360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:28,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-24 20:01:29,515][15401] Updated weights for policy 0, policy_version 730992 (0.0032) [2024-06-24 20:01:33,314][15401] Updated weights for policy 0, policy_version 731002 (0.0022) [2024-06-24 20:01:33,390][15132] Fps is (10 sec: 42604.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11976736768. Throughput: 0: 42553.3. Samples: 11976899800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:33,390][15132] Avg episode reward: [(0, '0.137')] [2024-06-24 20:01:37,092][15401] Updated weights for policy 0, policy_version 731012 (0.0039) [2024-06-24 20:01:38,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 11976966144. Throughput: 0: 42560.8. Samples: 11977025360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:38,392][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 20:01:40,964][15401] Updated weights for policy 0, policy_version 731022 (0.0030) [2024-06-24 20:01:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 11977162752. Throughput: 0: 42798.4. Samples: 11977285840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 20:01:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731028_11977162752.pth... [2024-06-24 20:01:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730403_11966922752.pth [2024-06-24 20:01:44,592][15401] Updated weights for policy 0, policy_version 731032 (0.0041) [2024-06-24 20:01:48,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11977359360. Throughput: 0: 42577.8. Samples: 11977532000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 20:01:48,821][15401] Updated weights for policy 0, policy_version 731042 (0.0033) [2024-06-24 20:01:52,905][15401] Updated weights for policy 0, policy_version 731052 (0.0044) [2024-06-24 20:01:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 11977605120. Throughput: 0: 42517.4. Samples: 11977657120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 20:01:56,618][15401] Updated weights for policy 0, policy_version 731062 (0.0029) [2024-06-24 20:01:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11977768960. Throughput: 0: 42583.1. Samples: 11977916920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:01:58,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 20:02:00,523][15401] Updated weights for policy 0, policy_version 731072 (0.0025) [2024-06-24 20:02:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11978014720. Throughput: 0: 42640.9. Samples: 11978170580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:02:03,391][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 20:02:04,462][15401] Updated weights for policy 0, policy_version 731082 (0.0044) [2024-06-24 20:02:05,147][15349] Signal inference workers to stop experience collection... (177350 times) [2024-06-24 20:02:05,158][15401] InferenceWorker_p0-w0: stopping experience collection (177350 times) [2024-06-24 20:02:05,208][15349] Signal inference workers to resume experience collection... (177350 times) [2024-06-24 20:02:05,208][15401] InferenceWorker_p0-w0: resuming experience collection (177350 times) [2024-06-24 20:02:08,013][15401] Updated weights for policy 0, policy_version 731092 (0.0029) [2024-06-24 20:02:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 11978244096. Throughput: 0: 42700.9. Samples: 11978303980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:02:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 20:02:12,374][15401] Updated weights for policy 0, policy_version 731102 (0.0042) [2024-06-24 20:02:13,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42593.8, 300 sec: 42597.5). Total num frames: 11978424320. Throughput: 0: 42585.9. Samples: 11978561000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:02:13,397][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:02:15,583][15401] Updated weights for policy 0, policy_version 731112 (0.0033) [2024-06-24 20:02:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 11978653696. Throughput: 0: 42431.2. Samples: 11978809200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:02:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 20:02:20,191][15401] Updated weights for policy 0, policy_version 731122 (0.0033) [2024-06-24 20:02:23,176][15401] Updated weights for policy 0, policy_version 731132 (0.0030) [2024-06-24 20:02:23,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42599.4, 300 sec: 42709.7). Total num frames: 11978866688. Throughput: 0: 42567.1. Samples: 11978940780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-24 20:02:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 20:02:27,731][15401] Updated weights for policy 0, policy_version 731142 (0.0039) [2024-06-24 20:02:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11979046912. Throughput: 0: 42425.3. Samples: 11979194980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 20:02:30,910][15401] Updated weights for policy 0, policy_version 731152 (0.0030) [2024-06-24 20:02:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11979276288. Throughput: 0: 42411.1. Samples: 11979440500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 20:02:35,673][15401] Updated weights for policy 0, policy_version 731162 (0.0039) [2024-06-24 20:02:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 11979505664. Throughput: 0: 42501.4. Samples: 11979569680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 20:02:38,696][15401] Updated weights for policy 0, policy_version 731172 (0.0042) [2024-06-24 20:02:43,283][15401] Updated weights for policy 0, policy_version 731182 (0.0036) [2024-06-24 20:02:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 11979685888. Throughput: 0: 42420.0. Samples: 11979825820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 20:02:46,619][15401] Updated weights for policy 0, policy_version 731192 (0.0044) [2024-06-24 20:02:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11979915264. Throughput: 0: 42317.3. Samples: 11980074860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 20:02:50,940][15401] Updated weights for policy 0, policy_version 731202 (0.0042) [2024-06-24 20:02:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 11980144640. Throughput: 0: 42344.0. Samples: 11980209460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 20:02:54,210][15401] Updated weights for policy 0, policy_version 731212 (0.0038) [2024-06-24 20:02:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 11980308480. Throughput: 0: 42180.3. Samples: 11980458840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:02:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 20:02:59,011][15401] Updated weights for policy 0, policy_version 731222 (0.0032) [2024-06-24 20:03:01,894][15401] Updated weights for policy 0, policy_version 731232 (0.0042) [2024-06-24 20:03:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 11980554240. Throughput: 0: 42128.7. Samples: 11980705000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 20:03:07,142][15401] Updated weights for policy 0, policy_version 731242 (0.0032) [2024-06-24 20:03:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 11980750848. Throughput: 0: 42225.8. Samples: 11980840940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:08,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-24 20:03:09,503][15401] Updated weights for policy 0, policy_version 731252 (0.0035) [2024-06-24 20:03:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42329.9, 300 sec: 42487.3). Total num frames: 11980963840. Throughput: 0: 42209.3. Samples: 11981094400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:13,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-24 20:03:14,534][15401] Updated weights for policy 0, policy_version 731262 (0.0036) [2024-06-24 20:03:17,121][15401] Updated weights for policy 0, policy_version 731272 (0.0035) [2024-06-24 20:03:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 11981209600. Throughput: 0: 42287.5. Samples: 11981343440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:18,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:03:22,351][15401] Updated weights for policy 0, policy_version 731282 (0.0044) [2024-06-24 20:03:22,560][15349] Signal inference workers to stop experience collection... (177400 times) [2024-06-24 20:03:22,560][15349] Signal inference workers to resume experience collection... (177400 times) [2024-06-24 20:03:22,603][15401] InferenceWorker_p0-w0: stopping experience collection (177400 times) [2024-06-24 20:03:22,604][15401] InferenceWorker_p0-w0: resuming experience collection (177400 times) [2024-06-24 20:03:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 11981389824. Throughput: 0: 42376.5. Samples: 11981476620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:03:24,762][15401] Updated weights for policy 0, policy_version 731292 (0.0028) [2024-06-24 20:03:28,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 11981586432. Throughput: 0: 42338.8. Samples: 11981731060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 20:03:30,017][15401] Updated weights for policy 0, policy_version 731302 (0.0040) [2024-06-24 20:03:32,822][15401] Updated weights for policy 0, policy_version 731312 (0.0044) [2024-06-24 20:03:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11981848576. Throughput: 0: 42367.1. Samples: 11981981380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 20:03:37,510][15401] Updated weights for policy 0, policy_version 731322 (0.0029) [2024-06-24 20:03:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 11982028800. Throughput: 0: 42505.0. Samples: 11982122180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 20:03:40,288][15401] Updated weights for policy 0, policy_version 731332 (0.0046) [2024-06-24 20:03:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 11982241792. Throughput: 0: 42619.5. Samples: 11982376720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:43,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 20:03:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731338_11982241792.pth... [2024-06-24 20:03:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000730717_11972067328.pth [2024-06-24 20:03:44,946][15401] Updated weights for policy 0, policy_version 731342 (0.0024) [2024-06-24 20:03:47,955][15401] Updated weights for policy 0, policy_version 731352 (0.0026) [2024-06-24 20:03:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 11982487552. Throughput: 0: 42733.1. Samples: 11982627980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 20:03:52,574][15401] Updated weights for policy 0, policy_version 731362 (0.0044) [2024-06-24 20:03:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 11982684160. Throughput: 0: 42718.2. Samples: 11982763260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:53,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 20:03:55,819][15401] Updated weights for policy 0, policy_version 731372 (0.0028) [2024-06-24 20:03:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 11982880768. Throughput: 0: 42591.5. Samples: 11983011020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:03:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 20:04:00,391][15401] Updated weights for policy 0, policy_version 731382 (0.0027) [2024-06-24 20:04:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 11983110144. Throughput: 0: 42713.5. Samples: 11983265540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:04:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 20:04:03,525][15401] Updated weights for policy 0, policy_version 731392 (0.0039) [2024-06-24 20:04:07,980][15401] Updated weights for policy 0, policy_version 731402 (0.0034) [2024-06-24 20:04:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 11983323136. Throughput: 0: 42728.0. Samples: 11983399380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:04:08,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 20:04:11,305][15401] Updated weights for policy 0, policy_version 731412 (0.0039) [2024-06-24 20:04:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 11983536128. Throughput: 0: 42660.0. Samples: 11983650760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:13,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 20:04:15,615][15401] Updated weights for policy 0, policy_version 731422 (0.0035) [2024-06-24 20:04:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 11983765504. Throughput: 0: 42768.4. Samples: 11983905960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 20:04:18,960][15401] Updated weights for policy 0, policy_version 731432 (0.0024) [2024-06-24 20:04:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 11983929344. Throughput: 0: 42596.4. Samples: 11984039020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 20:04:23,412][15401] Updated weights for policy 0, policy_version 731442 (0.0037) [2024-06-24 20:04:26,595][15401] Updated weights for policy 0, policy_version 731452 (0.0034) [2024-06-24 20:04:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.7, 300 sec: 42487.0). Total num frames: 11984175104. Throughput: 0: 42436.4. Samples: 11984286460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:28,393][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 20:04:31,549][15401] Updated weights for policy 0, policy_version 731462 (0.0032) [2024-06-24 20:04:33,097][15349] Signal inference workers to stop experience collection... (177450 times) [2024-06-24 20:04:33,097][15349] Signal inference workers to resume experience collection... (177450 times) [2024-06-24 20:04:33,113][15401] InferenceWorker_p0-w0: stopping experience collection (177450 times) [2024-06-24 20:04:33,124][15401] InferenceWorker_p0-w0: resuming experience collection (177450 times) [2024-06-24 20:04:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 11984388096. Throughput: 0: 42670.9. Samples: 11984548180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 20:04:34,283][15401] Updated weights for policy 0, policy_version 731472 (0.0038) [2024-06-24 20:04:38,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.2, 300 sec: 42376.3). Total num frames: 11984568320. Throughput: 0: 42540.4. Samples: 11984677580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 20:04:39,050][15401] Updated weights for policy 0, policy_version 731482 (0.0034) [2024-06-24 20:04:41,738][15401] Updated weights for policy 0, policy_version 731492 (0.0044) [2024-06-24 20:04:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 11984830464. Throughput: 0: 42692.5. Samples: 11984932180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 20:04:46,760][15401] Updated weights for policy 0, policy_version 731502 (0.0028) [2024-06-24 20:04:48,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11985043456. Throughput: 0: 42745.3. Samples: 11985189080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:48,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 20:04:49,335][15401] Updated weights for policy 0, policy_version 731512 (0.0039) [2024-06-24 20:04:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 11985207296. Throughput: 0: 42539.0. Samples: 11985313640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 20:04:54,645][15401] Updated weights for policy 0, policy_version 731522 (0.0034) [2024-06-24 20:04:56,927][15401] Updated weights for policy 0, policy_version 731532 (0.0031) [2024-06-24 20:04:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43415.9, 300 sec: 42653.6). Total num frames: 11985485824. Throughput: 0: 42615.9. Samples: 11985568580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:04:58,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 20:05:02,084][15401] Updated weights for policy 0, policy_version 731542 (0.0034) [2024-06-24 20:05:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 11985666048. Throughput: 0: 42869.8. Samples: 11985835100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 20:05:04,723][15401] Updated weights for policy 0, policy_version 731552 (0.0030) [2024-06-24 20:05:08,389][15132] Fps is (10 sec: 36053.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 11985846272. Throughput: 0: 42538.7. Samples: 11985953260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 20:05:09,941][15401] Updated weights for policy 0, policy_version 731562 (0.0032) [2024-06-24 20:05:12,434][15401] Updated weights for policy 0, policy_version 731572 (0.0032) [2024-06-24 20:05:13,392][15132] Fps is (10 sec: 47502.6, 60 sec: 43415.8, 300 sec: 42765.0). Total num frames: 11986141184. Throughput: 0: 42822.3. Samples: 11986213460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:13,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 20:05:17,496][15401] Updated weights for policy 0, policy_version 731582 (0.0040) [2024-06-24 20:05:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11986288640. Throughput: 0: 42989.1. Samples: 11986482680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:05:20,011][15401] Updated weights for policy 0, policy_version 731592 (0.0031) [2024-06-24 20:05:23,389][15132] Fps is (10 sec: 36053.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 11986501632. Throughput: 0: 42550.9. Samples: 11986592360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 20:05:25,059][15401] Updated weights for policy 0, policy_version 731602 (0.0046) [2024-06-24 20:05:27,699][15401] Updated weights for policy 0, policy_version 731612 (0.0024) [2024-06-24 20:05:28,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 11986763776. Throughput: 0: 42864.5. Samples: 11986861080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:28,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 20:05:32,479][15401] Updated weights for policy 0, policy_version 731622 (0.0041) [2024-06-24 20:05:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 11986927616. Throughput: 0: 42969.3. Samples: 11987122700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 20:05:35,515][15401] Updated weights for policy 0, policy_version 731632 (0.0027) [2024-06-24 20:05:38,392][15132] Fps is (10 sec: 39311.8, 60 sec: 43142.8, 300 sec: 42542.9). Total num frames: 11987156992. Throughput: 0: 42699.9. Samples: 11987235240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:38,393][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 20:05:39,980][15401] Updated weights for policy 0, policy_version 731642 (0.0041) [2024-06-24 20:05:41,761][15349] Signal inference workers to stop experience collection... (177500 times) [2024-06-24 20:05:41,763][15349] Signal inference workers to resume experience collection... (177500 times) [2024-06-24 20:05:41,805][15401] InferenceWorker_p0-w0: stopping experience collection (177500 times) [2024-06-24 20:05:41,805][15401] InferenceWorker_p0-w0: resuming experience collection (177500 times) [2024-06-24 20:05:43,122][15401] Updated weights for policy 0, policy_version 731652 (0.0031) [2024-06-24 20:05:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11987402752. Throughput: 0: 42839.1. Samples: 11987496240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 20:05:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731653_11987402752.pth... [2024-06-24 20:05:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731028_11977162752.pth [2024-06-24 20:05:47,670][15401] Updated weights for policy 0, policy_version 731662 (0.0033) [2024-06-24 20:05:48,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 11987566592. Throughput: 0: 42617.0. Samples: 11987752860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-24 20:05:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:05:50,805][15401] Updated weights for policy 0, policy_version 731672 (0.0032) [2024-06-24 20:05:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 11987812352. Throughput: 0: 42693.6. Samples: 11987874480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:05:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 20:05:55,419][15401] Updated weights for policy 0, policy_version 731682 (0.0039) [2024-06-24 20:05:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 11988025344. Throughput: 0: 42786.8. Samples: 11988138760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:05:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 20:05:58,666][15401] Updated weights for policy 0, policy_version 731692 (0.0036) [2024-06-24 20:06:03,049][15401] Updated weights for policy 0, policy_version 731702 (0.0031) [2024-06-24 20:06:03,393][15132] Fps is (10 sec: 40947.2, 60 sec: 42596.2, 300 sec: 42597.9). Total num frames: 11988221952. Throughput: 0: 42565.8. Samples: 11988398280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:03,393][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 20:06:06,346][15401] Updated weights for policy 0, policy_version 731712 (0.0041) [2024-06-24 20:06:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 11988434944. Throughput: 0: 42912.4. Samples: 11988523420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:08,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-24 20:06:10,779][15401] Updated weights for policy 0, policy_version 731722 (0.0026) [2024-06-24 20:06:13,390][15132] Fps is (10 sec: 44250.8, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 11988664320. Throughput: 0: 42659.5. Samples: 11988780760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 20:06:14,245][15401] Updated weights for policy 0, policy_version 731732 (0.0037) [2024-06-24 20:06:18,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42487.5). Total num frames: 11988844544. Throughput: 0: 42514.5. Samples: 11989035860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 20:06:18,442][15401] Updated weights for policy 0, policy_version 731742 (0.0027) [2024-06-24 20:06:21,716][15401] Updated weights for policy 0, policy_version 731752 (0.0036) [2024-06-24 20:06:23,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42866.7, 300 sec: 42541.9). Total num frames: 11989073920. Throughput: 0: 42759.3. Samples: 11989159580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:23,397][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 20:06:26,248][15401] Updated weights for policy 0, policy_version 731762 (0.0027) [2024-06-24 20:06:28,390][15132] Fps is (10 sec: 44237.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 11989286912. Throughput: 0: 42588.0. Samples: 11989412700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 20:06:29,161][15401] Updated weights for policy 0, policy_version 731772 (0.0036) [2024-06-24 20:06:33,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.4, 300 sec: 42487.7). Total num frames: 11989499904. Throughput: 0: 42723.4. Samples: 11989675420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 20:06:33,866][15401] Updated weights for policy 0, policy_version 731782 (0.0028) [2024-06-24 20:06:36,673][15401] Updated weights for policy 0, policy_version 731792 (0.0032) [2024-06-24 20:06:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 11989729280. Throughput: 0: 42872.1. Samples: 11989803720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:38,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 20:06:41,285][15401] Updated weights for policy 0, policy_version 731802 (0.0027) [2024-06-24 20:06:43,393][15132] Fps is (10 sec: 44223.2, 60 sec: 42323.2, 300 sec: 42653.5). Total num frames: 11989942272. Throughput: 0: 42787.2. Samples: 11990064320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:43,393][15132] Avg episode reward: [(0, '0.134')] [2024-06-24 20:06:44,281][15401] Updated weights for policy 0, policy_version 731812 (0.0032) [2024-06-24 20:06:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 11990155264. Throughput: 0: 42825.7. Samples: 11990325300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:48,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-24 20:06:48,820][15401] Updated weights for policy 0, policy_version 731822 (0.0022) [2024-06-24 20:06:52,119][15401] Updated weights for policy 0, policy_version 731832 (0.0030) [2024-06-24 20:06:53,390][15132] Fps is (10 sec: 44250.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11990384640. Throughput: 0: 42831.0. Samples: 11990450820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 20:06:55,168][15349] Signal inference workers to stop experience collection... (177550 times) [2024-06-24 20:06:55,168][15349] Signal inference workers to resume experience collection... (177550 times) [2024-06-24 20:06:55,198][15401] InferenceWorker_p0-w0: stopping experience collection (177550 times) [2024-06-24 20:06:55,198][15401] InferenceWorker_p0-w0: resuming experience collection (177550 times) [2024-06-24 20:06:56,333][15401] Updated weights for policy 0, policy_version 731842 (0.0047) [2024-06-24 20:06:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11990597632. Throughput: 0: 42952.5. Samples: 11990713620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:06:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 20:07:00,026][15401] Updated weights for policy 0, policy_version 731852 (0.0021) [2024-06-24 20:07:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.7, 300 sec: 42542.8). Total num frames: 11990794240. Throughput: 0: 42878.8. Samples: 11990965400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 20:07:04,256][15401] Updated weights for policy 0, policy_version 731862 (0.0034) [2024-06-24 20:07:07,558][15401] Updated weights for policy 0, policy_version 731872 (0.0034) [2024-06-24 20:07:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42654.9). Total num frames: 11991007232. Throughput: 0: 42984.3. Samples: 11991093600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:08,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 20:07:11,860][15401] Updated weights for policy 0, policy_version 731882 (0.0025) [2024-06-24 20:07:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 11991220224. Throughput: 0: 43053.7. Samples: 11991350120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 20:07:15,545][15401] Updated weights for policy 0, policy_version 731892 (0.0038) [2024-06-24 20:07:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 11991449600. Throughput: 0: 42846.7. Samples: 11991603520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 20:07:19,472][15401] Updated weights for policy 0, policy_version 731902 (0.0038) [2024-06-24 20:07:23,005][15401] Updated weights for policy 0, policy_version 731912 (0.0028) [2024-06-24 20:07:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 11991662592. Throughput: 0: 42967.2. Samples: 11991737240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 20:07:27,027][15401] Updated weights for policy 0, policy_version 731922 (0.0027) [2024-06-24 20:07:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 11991859200. Throughput: 0: 42942.9. Samples: 11991996620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:28,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 20:07:30,380][15401] Updated weights for policy 0, policy_version 731932 (0.0043) [2024-06-24 20:07:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 11992088576. Throughput: 0: 42847.6. Samples: 11992253440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-24 20:07:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 20:07:34,524][15401] Updated weights for policy 0, policy_version 731942 (0.0032) [2024-06-24 20:07:38,179][15401] Updated weights for policy 0, policy_version 731952 (0.0039) [2024-06-24 20:07:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 11992301568. Throughput: 0: 42942.4. Samples: 11992383220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:07:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 20:07:42,157][15401] Updated weights for policy 0, policy_version 731962 (0.0037) [2024-06-24 20:07:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.6, 300 sec: 42709.5). Total num frames: 11992514560. Throughput: 0: 42769.3. Samples: 11992638240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:07:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:07:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731965_11992514560.pth... [2024-06-24 20:07:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731338_11982241792.pth [2024-06-24 20:07:45,715][15401] Updated weights for policy 0, policy_version 731972 (0.0029) [2024-06-24 20:07:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11992743936. Throughput: 0: 42777.8. Samples: 11992890400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:07:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 20:07:49,667][15401] Updated weights for policy 0, policy_version 731982 (0.0036) [2024-06-24 20:07:53,298][15401] Updated weights for policy 0, policy_version 731992 (0.0042) [2024-06-24 20:07:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 11992956928. Throughput: 0: 42904.6. Samples: 11993024300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:07:53,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-24 20:07:57,231][15401] Updated weights for policy 0, policy_version 732002 (0.0036) [2024-06-24 20:07:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 11993153536. Throughput: 0: 42877.3. Samples: 11993279600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:07:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 20:08:00,914][15401] Updated weights for policy 0, policy_version 732012 (0.0033) [2024-06-24 20:08:03,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11993382912. Throughput: 0: 42848.4. Samples: 11993531700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 20:08:04,908][15401] Updated weights for policy 0, policy_version 732022 (0.0037) [2024-06-24 20:08:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 11993595904. Throughput: 0: 42809.7. Samples: 11993663680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 20:08:08,737][15401] Updated weights for policy 0, policy_version 732032 (0.0041) [2024-06-24 20:08:12,491][15401] Updated weights for policy 0, policy_version 732042 (0.0032) [2024-06-24 20:08:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 11993792512. Throughput: 0: 42824.1. Samples: 11993923700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 20:08:16,482][15401] Updated weights for policy 0, policy_version 732052 (0.0032) [2024-06-24 20:08:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11994021888. Throughput: 0: 42702.6. Samples: 11994175060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:18,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 20:08:20,206][15401] Updated weights for policy 0, policy_version 732062 (0.0029) [2024-06-24 20:08:22,978][15349] Signal inference workers to stop experience collection... (177600 times) [2024-06-24 20:08:22,984][15349] Signal inference workers to resume experience collection... (177600 times) [2024-06-24 20:08:22,989][15401] InferenceWorker_p0-w0: stopping experience collection (177600 times) [2024-06-24 20:08:23,014][15401] InferenceWorker_p0-w0: resuming experience collection (177600 times) [2024-06-24 20:08:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 11994202112. Throughput: 0: 42721.3. Samples: 11994305680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 20:08:24,201][15401] Updated weights for policy 0, policy_version 732072 (0.0039) [2024-06-24 20:08:27,815][15401] Updated weights for policy 0, policy_version 732082 (0.0033) [2024-06-24 20:08:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 11994447872. Throughput: 0: 42773.3. Samples: 11994563040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 20:08:31,714][15401] Updated weights for policy 0, policy_version 732092 (0.0030) [2024-06-24 20:08:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11994660864. Throughput: 0: 42835.1. Samples: 11994817980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 20:08:35,631][15401] Updated weights for policy 0, policy_version 732102 (0.0035) [2024-06-24 20:08:38,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 11994857472. Throughput: 0: 42740.3. Samples: 11994947720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:38,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 20:08:39,522][15401] Updated weights for policy 0, policy_version 732112 (0.0044) [2024-06-24 20:08:43,241][15401] Updated weights for policy 0, policy_version 732122 (0.0031) [2024-06-24 20:08:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 11995086848. Throughput: 0: 42804.4. Samples: 11995205800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 20:08:47,003][15401] Updated weights for policy 0, policy_version 732132 (0.0039) [2024-06-24 20:08:48,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 11995316224. Throughput: 0: 42906.1. Samples: 11995462480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 20:08:50,818][15401] Updated weights for policy 0, policy_version 732142 (0.0031) [2024-06-24 20:08:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 11995480064. Throughput: 0: 42836.5. Samples: 11995591320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 20:08:54,722][15401] Updated weights for policy 0, policy_version 732152 (0.0038) [2024-06-24 20:08:58,391][15132] Fps is (10 sec: 40956.3, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 11995725824. Throughput: 0: 42783.4. Samples: 11995849000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:08:58,391][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 20:08:58,692][15401] Updated weights for policy 0, policy_version 732162 (0.0032) [2024-06-24 20:09:02,358][15401] Updated weights for policy 0, policy_version 732172 (0.0035) [2024-06-24 20:09:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 11995955200. Throughput: 0: 42808.0. Samples: 11996101420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:09:03,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-24 20:09:06,162][15401] Updated weights for policy 0, policy_version 732182 (0.0032) [2024-06-24 20:09:08,389][15132] Fps is (10 sec: 40964.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 11996135424. Throughput: 0: 42835.1. Samples: 11996233260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:09:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 20:09:10,057][15401] Updated weights for policy 0, policy_version 732192 (0.0032) [2024-06-24 20:09:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 11996381184. Throughput: 0: 42720.0. Samples: 11996485440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-24 20:09:13,396][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 20:09:13,833][15401] Updated weights for policy 0, policy_version 732202 (0.0051) [2024-06-24 20:09:17,905][15401] Updated weights for policy 0, policy_version 732212 (0.0027) [2024-06-24 20:09:18,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11996577792. Throughput: 0: 42605.2. Samples: 11996735220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 20:09:21,449][15401] Updated weights for policy 0, policy_version 732222 (0.0043) [2024-06-24 20:09:23,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 11996774400. Throughput: 0: 42718.2. Samples: 11996870040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:23,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 20:09:25,456][15401] Updated weights for policy 0, policy_version 732232 (0.0029) [2024-06-24 20:09:28,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 11997020160. Throughput: 0: 42730.3. Samples: 11997128760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:28,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 20:09:29,212][15401] Updated weights for policy 0, policy_version 732242 (0.0037) [2024-06-24 20:09:32,519][15349] Signal inference workers to stop experience collection... (177650 times) [2024-06-24 20:09:32,519][15349] Signal inference workers to resume experience collection... (177650 times) [2024-06-24 20:09:32,557][15401] InferenceWorker_p0-w0: stopping experience collection (177650 times) [2024-06-24 20:09:32,557][15401] InferenceWorker_p0-w0: resuming experience collection (177650 times) [2024-06-24 20:09:33,115][15401] Updated weights for policy 0, policy_version 732252 (0.0038) [2024-06-24 20:09:33,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 11997233152. Throughput: 0: 42767.6. Samples: 11997387020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:33,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 20:09:36,787][15401] Updated weights for policy 0, policy_version 732262 (0.0024) [2024-06-24 20:09:38,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 11997429760. Throughput: 0: 42741.3. Samples: 11997514680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 20:09:40,615][15401] Updated weights for policy 0, policy_version 732272 (0.0032) [2024-06-24 20:09:43,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 11997642752. Throughput: 0: 42722.0. Samples: 11997771460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:43,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 20:09:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732278_11997642752.pth... [2024-06-24 20:09:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731653_11987402752.pth [2024-06-24 20:09:44,523][15401] Updated weights for policy 0, policy_version 732282 (0.0039) [2024-06-24 20:09:48,278][15401] Updated weights for policy 0, policy_version 732292 (0.0029) [2024-06-24 20:09:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11997872128. Throughput: 0: 42906.1. Samples: 11998032200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 20:09:51,945][15401] Updated weights for policy 0, policy_version 732302 (0.0037) [2024-06-24 20:09:53,389][15132] Fps is (10 sec: 44238.2, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 11998085120. Throughput: 0: 42879.5. Samples: 11998162840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 20:09:55,750][15401] Updated weights for policy 0, policy_version 732312 (0.0033) [2024-06-24 20:09:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42872.2, 300 sec: 42820.6). Total num frames: 11998298112. Throughput: 0: 42960.9. Samples: 11998418680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:09:58,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 20:09:59,870][15401] Updated weights for policy 0, policy_version 732322 (0.0038) [2024-06-24 20:10:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 11998511104. Throughput: 0: 43006.4. Samples: 11998670500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 20:10:03,514][15401] Updated weights for policy 0, policy_version 732332 (0.0028) [2024-06-24 20:10:07,571][15401] Updated weights for policy 0, policy_version 732342 (0.0032) [2024-06-24 20:10:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 11998724096. Throughput: 0: 43010.7. Samples: 11998805420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:08,395][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 20:10:11,580][15401] Updated weights for policy 0, policy_version 732352 (0.0042) [2024-06-24 20:10:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 11998937088. Throughput: 0: 42919.1. Samples: 11999060020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:13,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 20:10:15,293][15401] Updated weights for policy 0, policy_version 732362 (0.0033) [2024-06-24 20:10:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 11999150080. Throughput: 0: 42835.6. Samples: 11999314620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 20:10:19,100][15401] Updated weights for policy 0, policy_version 732372 (0.0042) [2024-06-24 20:10:22,893][15401] Updated weights for policy 0, policy_version 732382 (0.0030) [2024-06-24 20:10:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 11999363072. Throughput: 0: 42889.7. Samples: 11999444720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 20:10:26,559][15401] Updated weights for policy 0, policy_version 732392 (0.0032) [2024-06-24 20:10:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 11999592448. Throughput: 0: 43015.8. Samples: 11999707160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 20:10:30,476][15401] Updated weights for policy 0, policy_version 732402 (0.0036) [2024-06-24 20:10:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 11999821824. Throughput: 0: 42959.3. Samples: 11999965360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 20:10:34,015][15401] Updated weights for policy 0, policy_version 732412 (0.0034) [2024-06-24 20:10:38,283][15401] Updated weights for policy 0, policy_version 732422 (0.0040) [2024-06-24 20:10:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12000002048. Throughput: 0: 43135.5. Samples: 12000104040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:38,393][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 20:10:41,445][15401] Updated weights for policy 0, policy_version 732432 (0.0041) [2024-06-24 20:10:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43143.0, 300 sec: 42931.3). Total num frames: 12000231424. Throughput: 0: 43023.4. Samples: 12000354840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:43,393][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 20:10:45,911][15401] Updated weights for policy 0, policy_version 732442 (0.0029) [2024-06-24 20:10:48,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12000460800. Throughput: 0: 43029.8. Samples: 12000606840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 20:10:48,977][15401] Updated weights for policy 0, policy_version 732452 (0.0026) [2024-06-24 20:10:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12000641024. Throughput: 0: 43080.6. Samples: 12000744040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 20:10:53,458][15401] Updated weights for policy 0, policy_version 732462 (0.0028) [2024-06-24 20:10:56,528][15401] Updated weights for policy 0, policy_version 732472 (0.0032) [2024-06-24 20:10:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.6). Total num frames: 12000870400. Throughput: 0: 43086.4. Samples: 12000998900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 20:10:58,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 20:11:01,396][15401] Updated weights for policy 0, policy_version 732482 (0.0033) [2024-06-24 20:11:01,747][15349] Signal inference workers to stop experience collection... (177700 times) [2024-06-24 20:11:01,747][15349] Signal inference workers to resume experience collection... (177700 times) [2024-06-24 20:11:01,787][15401] InferenceWorker_p0-w0: stopping experience collection (177700 times) [2024-06-24 20:11:01,787][15401] InferenceWorker_p0-w0: resuming experience collection (177700 times) [2024-06-24 20:11:03,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 12001116160. Throughput: 0: 43062.5. Samples: 12001252440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:03,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 20:11:04,566][15401] Updated weights for policy 0, policy_version 732492 (0.0030) [2024-06-24 20:11:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12001296384. Throughput: 0: 43113.3. Samples: 12001384920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:08,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 20:11:08,784][15401] Updated weights for policy 0, policy_version 732502 (0.0042) [2024-06-24 20:11:12,021][15401] Updated weights for policy 0, policy_version 732512 (0.0054) [2024-06-24 20:11:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 12001509376. Throughput: 0: 42941.0. Samples: 12001639500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 20:11:16,606][15401] Updated weights for policy 0, policy_version 732522 (0.0031) [2024-06-24 20:11:18,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43417.6, 300 sec: 42988.1). Total num frames: 12001755136. Throughput: 0: 42843.1. Samples: 12001893300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 20:11:19,830][15401] Updated weights for policy 0, policy_version 732532 (0.0028) [2024-06-24 20:11:23,390][15132] Fps is (10 sec: 42595.5, 60 sec: 42871.1, 300 sec: 42876.0). Total num frames: 12001935360. Throughput: 0: 42720.4. Samples: 12002026380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:23,391][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 20:11:23,962][15401] Updated weights for policy 0, policy_version 732542 (0.0031) [2024-06-24 20:11:27,634][15401] Updated weights for policy 0, policy_version 732552 (0.0028) [2024-06-24 20:11:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12002148352. Throughput: 0: 42925.0. Samples: 12002286360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:28,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 20:11:31,627][15401] Updated weights for policy 0, policy_version 732562 (0.0032) [2024-06-24 20:11:33,390][15132] Fps is (10 sec: 45878.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12002394112. Throughput: 0: 42974.7. Samples: 12002540700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 20:11:35,165][15401] Updated weights for policy 0, policy_version 732572 (0.0032) [2024-06-24 20:11:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.3, 300 sec: 42876.6). Total num frames: 12002590720. Throughput: 0: 43053.8. Samples: 12002681460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 20:11:39,089][15401] Updated weights for policy 0, policy_version 732582 (0.0028) [2024-06-24 20:11:42,557][15401] Updated weights for policy 0, policy_version 732592 (0.0036) [2024-06-24 20:11:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 12002803712. Throughput: 0: 42951.1. Samples: 12002931700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 20:11:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732593_12002803712.pth... [2024-06-24 20:11:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000731965_11992514560.pth [2024-06-24 20:11:47,094][15401] Updated weights for policy 0, policy_version 732602 (0.0032) [2024-06-24 20:11:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12003049472. Throughput: 0: 42995.6. Samples: 12003187240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 20:11:50,076][15401] Updated weights for policy 0, policy_version 732612 (0.0045) [2024-06-24 20:11:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12003246080. Throughput: 0: 43023.1. Samples: 12003320860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 20:11:54,522][15401] Updated weights for policy 0, policy_version 732622 (0.0029) [2024-06-24 20:11:58,088][15401] Updated weights for policy 0, policy_version 732632 (0.0032) [2024-06-24 20:11:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 12003459072. Throughput: 0: 42996.0. Samples: 12003574320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:11:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 20:12:01,900][15401] Updated weights for policy 0, policy_version 732642 (0.0035) [2024-06-24 20:12:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 12003688448. Throughput: 0: 43091.6. Samples: 12003832420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 20:12:05,513][15401] Updated weights for policy 0, policy_version 732652 (0.0023) [2024-06-24 20:12:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 12003868672. Throughput: 0: 43048.1. Samples: 12003963520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 20:12:09,372][15401] Updated weights for policy 0, policy_version 732662 (0.0043) [2024-06-24 20:12:13,092][15401] Updated weights for policy 0, policy_version 732672 (0.0042) [2024-06-24 20:12:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12004098048. Throughput: 0: 42707.9. Samples: 12004208220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 20:12:16,744][15349] Signal inference workers to stop experience collection... (177750 times) [2024-06-24 20:12:16,795][15401] InferenceWorker_p0-w0: stopping experience collection (177750 times) [2024-06-24 20:12:16,858][15349] Signal inference workers to resume experience collection... (177750 times) [2024-06-24 20:12:16,859][15401] InferenceWorker_p0-w0: resuming experience collection (177750 times) [2024-06-24 20:12:17,315][15401] Updated weights for policy 0, policy_version 732682 (0.0024) [2024-06-24 20:12:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12004311040. Throughput: 0: 42855.6. Samples: 12004469200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 20:12:20,849][15401] Updated weights for policy 0, policy_version 732692 (0.0034) [2024-06-24 20:12:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.9, 300 sec: 42876.1). Total num frames: 12004507648. Throughput: 0: 42614.6. Samples: 12004599120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 20:12:25,046][15401] Updated weights for policy 0, policy_version 732702 (0.0023) [2024-06-24 20:12:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12004737024. Throughput: 0: 42657.7. Samples: 12004851300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 20:12:28,627][15401] Updated weights for policy 0, policy_version 732712 (0.0032) [2024-06-24 20:12:32,648][15401] Updated weights for policy 0, policy_version 732722 (0.0025) [2024-06-24 20:12:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12004950016. Throughput: 0: 42813.2. Samples: 12005113840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 20:12:36,242][15401] Updated weights for policy 0, policy_version 732732 (0.0032) [2024-06-24 20:12:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12005146624. Throughput: 0: 42703.1. Samples: 12005242500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 20:12:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 20:12:40,374][15401] Updated weights for policy 0, policy_version 732742 (0.0038) [2024-06-24 20:12:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12005392384. Throughput: 0: 42626.6. Samples: 12005492520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:12:43,395][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 20:12:43,887][15401] Updated weights for policy 0, policy_version 732752 (0.0035) [2024-06-24 20:12:48,357][15401] Updated weights for policy 0, policy_version 732762 (0.0043) [2024-06-24 20:12:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 12005572608. Throughput: 0: 42582.4. Samples: 12005748640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:12:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 20:12:51,474][15401] Updated weights for policy 0, policy_version 732772 (0.0027) [2024-06-24 20:12:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12005801984. Throughput: 0: 42279.5. Samples: 12005866100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:12:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 20:12:55,967][15401] Updated weights for policy 0, policy_version 732782 (0.0032) [2024-06-24 20:12:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12006031360. Throughput: 0: 42555.6. Samples: 12006123220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:12:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 20:12:59,462][15401] Updated weights for policy 0, policy_version 732792 (0.0031) [2024-06-24 20:13:03,389][15132] Fps is (10 sec: 39322.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 12006195200. Throughput: 0: 42667.6. Samples: 12006389240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:03,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 20:13:03,666][15401] Updated weights for policy 0, policy_version 732802 (0.0041) [2024-06-24 20:13:07,062][15401] Updated weights for policy 0, policy_version 732812 (0.0041) [2024-06-24 20:13:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12006440960. Throughput: 0: 42364.0. Samples: 12006505500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:08,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 20:13:11,466][15401] Updated weights for policy 0, policy_version 732822 (0.0035) [2024-06-24 20:13:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12006653952. Throughput: 0: 42523.2. Samples: 12006764840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:13,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 20:13:15,017][15401] Updated weights for policy 0, policy_version 732832 (0.0037) [2024-06-24 20:13:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 12006834176. Throughput: 0: 42590.4. Samples: 12007030400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 20:13:19,065][15401] Updated weights for policy 0, policy_version 732842 (0.0043) [2024-06-24 20:13:22,583][15401] Updated weights for policy 0, policy_version 732852 (0.0041) [2024-06-24 20:13:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12007079936. Throughput: 0: 42286.3. Samples: 12007145480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:23,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 20:13:26,842][15401] Updated weights for policy 0, policy_version 732862 (0.0033) [2024-06-24 20:13:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12007292928. Throughput: 0: 42601.9. Samples: 12007409600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 20:13:30,147][15401] Updated weights for policy 0, policy_version 732872 (0.0036) [2024-06-24 20:13:33,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42052.4, 300 sec: 42765.4). Total num frames: 12007473152. Throughput: 0: 42564.2. Samples: 12007664020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:33,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 20:13:34,827][15401] Updated weights for policy 0, policy_version 732882 (0.0037) [2024-06-24 20:13:37,593][15349] Signal inference workers to stop experience collection... (177800 times) [2024-06-24 20:13:37,593][15349] Signal inference workers to resume experience collection... (177800 times) [2024-06-24 20:13:37,594][15401] Updated weights for policy 0, policy_version 732892 (0.0031) [2024-06-24 20:13:37,633][15401] InferenceWorker_p0-w0: stopping experience collection (177800 times) [2024-06-24 20:13:37,633][15401] InferenceWorker_p0-w0: resuming experience collection (177800 times) [2024-06-24 20:13:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12007718912. Throughput: 0: 42638.3. Samples: 12007784820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 20:13:42,663][15401] Updated weights for policy 0, policy_version 732902 (0.0038) [2024-06-24 20:13:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12007915520. Throughput: 0: 42694.7. Samples: 12008044480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 20:13:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732905_12007915520.pth... [2024-06-24 20:13:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732278_11997642752.pth [2024-06-24 20:13:45,237][15401] Updated weights for policy 0, policy_version 732912 (0.0038) [2024-06-24 20:13:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12008112128. Throughput: 0: 42424.8. Samples: 12008298360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 20:13:50,287][15401] Updated weights for policy 0, policy_version 732922 (0.0028) [2024-06-24 20:13:52,768][15401] Updated weights for policy 0, policy_version 732932 (0.0049) [2024-06-24 20:13:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42876.2). Total num frames: 12008374272. Throughput: 0: 42621.9. Samples: 12008423480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:53,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 20:13:58,161][15401] Updated weights for policy 0, policy_version 732942 (0.0045) [2024-06-24 20:13:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 41774.8, 300 sec: 42653.0). Total num frames: 12008538112. Throughput: 0: 42517.0. Samples: 12008678380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:13:58,396][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 20:14:00,407][15401] Updated weights for policy 0, policy_version 732952 (0.0037) [2024-06-24 20:14:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12008751104. Throughput: 0: 42309.2. Samples: 12008934320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:14:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 20:14:05,739][15401] Updated weights for policy 0, policy_version 732962 (0.0040) [2024-06-24 20:14:08,389][15132] Fps is (10 sec: 45905.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12008996864. Throughput: 0: 42676.6. Samples: 12009065820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:14:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-24 20:14:08,412][15401] Updated weights for policy 0, policy_version 732972 (0.0034) [2024-06-24 20:14:13,280][15401] Updated weights for policy 0, policy_version 732982 (0.0047) [2024-06-24 20:14:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12009177088. Throughput: 0: 42382.2. Samples: 12009316800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:14:13,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 20:14:16,335][15401] Updated weights for policy 0, policy_version 732992 (0.0034) [2024-06-24 20:14:18,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 12009390080. Throughput: 0: 42411.0. Samples: 12009572520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:14:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 20:14:20,865][15401] Updated weights for policy 0, policy_version 733002 (0.0029) [2024-06-24 20:14:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 12009635840. Throughput: 0: 42544.0. Samples: 12009699300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-24 20:14:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 20:14:23,837][15401] Updated weights for policy 0, policy_version 733012 (0.0040) [2024-06-24 20:14:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12009816064. Throughput: 0: 42543.2. Samples: 12009958920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:14:28,413][15401] Updated weights for policy 0, policy_version 733022 (0.0033) [2024-06-24 20:14:31,310][15401] Updated weights for policy 0, policy_version 733032 (0.0034) [2024-06-24 20:14:33,391][15132] Fps is (10 sec: 40953.4, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 12010045440. Throughput: 0: 42620.3. Samples: 12010216340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:33,391][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 20:14:35,975][15401] Updated weights for policy 0, policy_version 733042 (0.0027) [2024-06-24 20:14:38,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.7, 300 sec: 42820.3). Total num frames: 12010274816. Throughput: 0: 42636.4. Samples: 12010342220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:38,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 20:14:38,977][15401] Updated weights for policy 0, policy_version 733052 (0.0028) [2024-06-24 20:14:43,375][15401] Updated weights for policy 0, policy_version 733062 (0.0034) [2024-06-24 20:14:43,390][15132] Fps is (10 sec: 44243.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12010487808. Throughput: 0: 42760.7. Samples: 12010602340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 20:14:45,872][15349] Signal inference workers to stop experience collection... (177850 times) [2024-06-24 20:14:45,928][15401] InferenceWorker_p0-w0: stopping experience collection (177850 times) [2024-06-24 20:14:45,931][15349] Signal inference workers to resume experience collection... (177850 times) [2024-06-24 20:14:45,944][15401] InferenceWorker_p0-w0: resuming experience collection (177850 times) [2024-06-24 20:14:46,813][15401] Updated weights for policy 0, policy_version 733072 (0.0034) [2024-06-24 20:14:48,391][15132] Fps is (10 sec: 40964.0, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 12010684416. Throughput: 0: 42669.4. Samples: 12010854500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:48,391][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 20:14:51,529][15401] Updated weights for policy 0, policy_version 733082 (0.0036) [2024-06-24 20:14:53,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 12010897408. Throughput: 0: 42717.2. Samples: 12010988200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:53,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 20:14:54,633][15401] Updated weights for policy 0, policy_version 733092 (0.0054) [2024-06-24 20:14:58,392][15132] Fps is (10 sec: 42594.3, 60 sec: 42874.4, 300 sec: 42709.1). Total num frames: 12011110400. Throughput: 0: 42565.2. Samples: 12011232340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:14:58,392][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 20:14:59,506][15401] Updated weights for policy 0, policy_version 733102 (0.0030) [2024-06-24 20:15:02,224][15401] Updated weights for policy 0, policy_version 733112 (0.0042) [2024-06-24 20:15:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12011323392. Throughput: 0: 42696.1. Samples: 12011493840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 20:15:06,994][15401] Updated weights for policy 0, policy_version 733122 (0.0037) [2024-06-24 20:15:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12011536384. Throughput: 0: 42812.4. Samples: 12011625860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 20:15:09,862][15401] Updated weights for policy 0, policy_version 733132 (0.0042) [2024-06-24 20:15:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12011765760. Throughput: 0: 42677.7. Samples: 12011879420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 20:15:14,699][15401] Updated weights for policy 0, policy_version 733142 (0.0031) [2024-06-24 20:15:17,497][15401] Updated weights for policy 0, policy_version 733152 (0.0043) [2024-06-24 20:15:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12011978752. Throughput: 0: 42593.6. Samples: 12012132980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 20:15:22,308][15401] Updated weights for policy 0, policy_version 733162 (0.0037) [2024-06-24 20:15:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12012175360. Throughput: 0: 42709.7. Samples: 12012264060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:23,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 20:15:25,331][15401] Updated weights for policy 0, policy_version 733172 (0.0029) [2024-06-24 20:15:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12012388352. Throughput: 0: 42601.4. Samples: 12012519400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 20:15:29,908][15401] Updated weights for policy 0, policy_version 733182 (0.0032) [2024-06-24 20:15:33,104][15401] Updated weights for policy 0, policy_version 733192 (0.0031) [2024-06-24 20:15:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42872.6, 300 sec: 42765.4). Total num frames: 12012617728. Throughput: 0: 42734.2. Samples: 12012777480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 20:15:37,508][15401] Updated weights for policy 0, policy_version 733202 (0.0040) [2024-06-24 20:15:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42709.8). Total num frames: 12012830720. Throughput: 0: 42719.9. Samples: 12012910500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 20:15:41,136][15401] Updated weights for policy 0, policy_version 733212 (0.0038) [2024-06-24 20:15:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12013043712. Throughput: 0: 42770.2. Samples: 12013156900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 20:15:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733218_12013043712.pth... [2024-06-24 20:15:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732593_12002803712.pth [2024-06-24 20:15:44,966][15401] Updated weights for policy 0, policy_version 733222 (0.0043) [2024-06-24 20:15:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 12013240320. Throughput: 0: 42594.1. Samples: 12013410580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 20:15:48,854][15401] Updated weights for policy 0, policy_version 733232 (0.0033) [2024-06-24 20:15:52,961][15401] Updated weights for policy 0, policy_version 733242 (0.0024) [2024-06-24 20:15:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12013453312. Throughput: 0: 42457.0. Samples: 12013536420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 20:15:56,723][15401] Updated weights for policy 0, policy_version 733252 (0.0033) [2024-06-24 20:15:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 12013682688. Throughput: 0: 42512.5. Samples: 12013792480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:15:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 20:16:00,557][15401] Updated weights for policy 0, policy_version 733262 (0.0053) [2024-06-24 20:16:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12013879296. Throughput: 0: 42605.8. Samples: 12014050240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 20:16:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 20:16:04,529][15401] Updated weights for policy 0, policy_version 733272 (0.0034) [2024-06-24 20:16:08,154][15401] Updated weights for policy 0, policy_version 733282 (0.0024) [2024-06-24 20:16:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 12014092288. Throughput: 0: 42459.8. Samples: 12014174740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 20:16:12,253][15401] Updated weights for policy 0, policy_version 733292 (0.0030) [2024-06-24 20:16:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12014338048. Throughput: 0: 42593.7. Samples: 12014436120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 20:16:15,880][15401] Updated weights for policy 0, policy_version 733302 (0.0042) [2024-06-24 20:16:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 12014534656. Throughput: 0: 42481.4. Samples: 12014689140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 20:16:19,837][15401] Updated weights for policy 0, policy_version 733312 (0.0036) [2024-06-24 20:16:23,232][15349] Signal inference workers to stop experience collection... (177900 times) [2024-06-24 20:16:23,232][15349] Signal inference workers to resume experience collection... (177900 times) [2024-06-24 20:16:23,263][15401] InferenceWorker_p0-w0: stopping experience collection (177900 times) [2024-06-24 20:16:23,264][15401] InferenceWorker_p0-w0: resuming experience collection (177900 times) [2024-06-24 20:16:23,375][15401] Updated weights for policy 0, policy_version 733322 (0.0043) [2024-06-24 20:16:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12014747648. Throughput: 0: 42301.8. Samples: 12014814180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:23,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 20:16:27,467][15401] Updated weights for policy 0, policy_version 733332 (0.0044) [2024-06-24 20:16:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12014960640. Throughput: 0: 42620.9. Samples: 12015074840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 20:16:30,938][15401] Updated weights for policy 0, policy_version 733342 (0.0034) [2024-06-24 20:16:33,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12015173632. Throughput: 0: 42533.8. Samples: 12015324700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:33,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:16:35,219][15401] Updated weights for policy 0, policy_version 733352 (0.0032) [2024-06-24 20:16:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12015370240. Throughput: 0: 42631.1. Samples: 12015454820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:38,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 20:16:38,947][15401] Updated weights for policy 0, policy_version 733362 (0.0029) [2024-06-24 20:16:42,783][15401] Updated weights for policy 0, policy_version 733372 (0.0038) [2024-06-24 20:16:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12015599616. Throughput: 0: 42671.1. Samples: 12015712680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 20:16:46,745][15401] Updated weights for policy 0, policy_version 733382 (0.0032) [2024-06-24 20:16:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12015812608. Throughput: 0: 42583.4. Samples: 12015966500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 20:16:50,299][15401] Updated weights for policy 0, policy_version 733392 (0.0044) [2024-06-24 20:16:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12016025600. Throughput: 0: 42713.6. Samples: 12016096860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 20:16:54,133][15401] Updated weights for policy 0, policy_version 733402 (0.0040) [2024-06-24 20:16:58,044][15401] Updated weights for policy 0, policy_version 733412 (0.0037) [2024-06-24 20:16:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 12016238592. Throughput: 0: 42650.2. Samples: 12016355380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:16:58,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 20:17:01,611][15401] Updated weights for policy 0, policy_version 733422 (0.0047) [2024-06-24 20:17:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12016451584. Throughput: 0: 42762.6. Samples: 12016613460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 20:17:05,531][15401] Updated weights for policy 0, policy_version 733432 (0.0032) [2024-06-24 20:17:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12016680960. Throughput: 0: 42833.9. Samples: 12016741600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 20:17:09,159][15401] Updated weights for policy 0, policy_version 733442 (0.0032) [2024-06-24 20:17:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12016861184. Throughput: 0: 42808.5. Samples: 12017001220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 20:17:13,498][15401] Updated weights for policy 0, policy_version 733452 (0.0033) [2024-06-24 20:17:17,186][15401] Updated weights for policy 0, policy_version 733462 (0.0026) [2024-06-24 20:17:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12017090560. Throughput: 0: 42820.9. Samples: 12017251540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 20:17:21,121][15401] Updated weights for policy 0, policy_version 733472 (0.0043) [2024-06-24 20:17:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 12017303552. Throughput: 0: 42939.0. Samples: 12017387080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:23,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 20:17:24,654][15401] Updated weights for policy 0, policy_version 733482 (0.0030) [2024-06-24 20:17:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12017516544. Throughput: 0: 42952.0. Samples: 12017645520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-24 20:17:28,475][15401] Updated weights for policy 0, policy_version 733492 (0.0054) [2024-06-24 20:17:32,115][15401] Updated weights for policy 0, policy_version 733502 (0.0034) [2024-06-24 20:17:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 12017745920. Throughput: 0: 42916.4. Samples: 12017897740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 20:17:36,107][15401] Updated weights for policy 0, policy_version 733512 (0.0040) [2024-06-24 20:17:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12017958912. Throughput: 0: 42863.6. Samples: 12018025720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:17:39,789][15401] Updated weights for policy 0, policy_version 733522 (0.0027) [2024-06-24 20:17:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12018155520. Throughput: 0: 42959.7. Samples: 12018288560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 20:17:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733531_12018171904.pth... [2024-06-24 20:17:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000732905_12007915520.pth [2024-06-24 20:17:43,622][15401] Updated weights for policy 0, policy_version 733532 (0.0027) [2024-06-24 20:17:47,323][15401] Updated weights for policy 0, policy_version 733542 (0.0039) [2024-06-24 20:17:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12018384896. Throughput: 0: 42759.1. Samples: 12018537620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:17:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 20:17:51,418][15401] Updated weights for policy 0, policy_version 733552 (0.0031) [2024-06-24 20:17:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12018581504. Throughput: 0: 42922.6. Samples: 12018673120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:17:53,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-24 20:17:55,132][15401] Updated weights for policy 0, policy_version 733562 (0.0034) [2024-06-24 20:17:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12018810880. Throughput: 0: 42840.8. Samples: 12018929060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:17:58,390][15132] Avg episode reward: [(0, '0.120')] [2024-06-24 20:17:58,888][15401] Updated weights for policy 0, policy_version 733572 (0.0037) [2024-06-24 20:18:02,581][15401] Updated weights for policy 0, policy_version 733582 (0.0031) [2024-06-24 20:18:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12019023872. Throughput: 0: 42971.0. Samples: 12019185340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:03,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 20:18:06,478][15401] Updated weights for policy 0, policy_version 733592 (0.0042) [2024-06-24 20:18:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12019220480. Throughput: 0: 42873.4. Samples: 12019316380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 20:18:10,119][15401] Updated weights for policy 0, policy_version 733602 (0.0025) [2024-06-24 20:18:11,106][15349] Signal inference workers to stop experience collection... (177950 times) [2024-06-24 20:18:11,148][15401] InferenceWorker_p0-w0: stopping experience collection (177950 times) [2024-06-24 20:18:11,231][15349] Signal inference workers to resume experience collection... (177950 times) [2024-06-24 20:18:11,231][15401] InferenceWorker_p0-w0: resuming experience collection (177950 times) [2024-06-24 20:18:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12019449856. Throughput: 0: 42831.6. Samples: 12019572940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 20:18:14,190][15401] Updated weights for policy 0, policy_version 733612 (0.0032) [2024-06-24 20:18:17,853][15401] Updated weights for policy 0, policy_version 733622 (0.0036) [2024-06-24 20:18:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 12019679232. Throughput: 0: 42862.3. Samples: 12019826540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:18,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-24 20:18:21,677][15401] Updated weights for policy 0, policy_version 733632 (0.0037) [2024-06-24 20:18:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12019859456. Throughput: 0: 43027.2. Samples: 12019961940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 20:18:25,363][15401] Updated weights for policy 0, policy_version 733642 (0.0034) [2024-06-24 20:18:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12020088832. Throughput: 0: 42866.6. Samples: 12020217560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 20:18:29,083][15401] Updated weights for policy 0, policy_version 733652 (0.0044) [2024-06-24 20:18:33,166][15401] Updated weights for policy 0, policy_version 733662 (0.0032) [2024-06-24 20:18:33,394][15132] Fps is (10 sec: 47491.2, 60 sec: 43141.3, 300 sec: 42764.3). Total num frames: 12020334592. Throughput: 0: 43080.9. Samples: 12020476460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:33,395][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 20:18:36,687][15401] Updated weights for policy 0, policy_version 733672 (0.0030) [2024-06-24 20:18:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12020531200. Throughput: 0: 42967.4. Samples: 12020606660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 20:18:40,801][15401] Updated weights for policy 0, policy_version 733682 (0.0030) [2024-06-24 20:18:43,394][15132] Fps is (10 sec: 40958.6, 60 sec: 43140.9, 300 sec: 42819.8). Total num frames: 12020744192. Throughput: 0: 42921.6. Samples: 12020860740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:43,395][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 20:18:44,597][15401] Updated weights for policy 0, policy_version 733692 (0.0041) [2024-06-24 20:18:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12020957184. Throughput: 0: 42975.6. Samples: 12021119140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 20:18:48,411][15401] Updated weights for policy 0, policy_version 733702 (0.0027) [2024-06-24 20:18:52,534][15401] Updated weights for policy 0, policy_version 733712 (0.0036) [2024-06-24 20:18:53,390][15132] Fps is (10 sec: 42619.6, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 12021170176. Throughput: 0: 42925.3. Samples: 12021248020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 20:18:56,091][15401] Updated weights for policy 0, policy_version 733722 (0.0038) [2024-06-24 20:18:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12021399552. Throughput: 0: 42943.1. Samples: 12021505380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:18:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 20:19:00,297][15401] Updated weights for policy 0, policy_version 733732 (0.0033) [2024-06-24 20:19:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 12021596160. Throughput: 0: 43028.9. Samples: 12021762940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:03,392][15132] Avg episode reward: [(0, '0.838')] [2024-06-24 20:19:03,759][15401] Updated weights for policy 0, policy_version 733742 (0.0032) [2024-06-24 20:19:07,784][15401] Updated weights for policy 0, policy_version 733752 (0.0040) [2024-06-24 20:19:08,392][15132] Fps is (10 sec: 40949.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 12021809152. Throughput: 0: 42812.7. Samples: 12021888620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:08,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 20:19:11,316][15401] Updated weights for policy 0, policy_version 733762 (0.0031) [2024-06-24 20:19:13,389][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12022038528. Throughput: 0: 42917.9. Samples: 12022148860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 20:19:15,609][15401] Updated weights for policy 0, policy_version 733772 (0.0036) [2024-06-24 20:19:18,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12022235136. Throughput: 0: 42964.3. Samples: 12022409660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 20:19:19,137][15401] Updated weights for policy 0, policy_version 733782 (0.0029) [2024-06-24 20:19:23,201][15401] Updated weights for policy 0, policy_version 733792 (0.0027) [2024-06-24 20:19:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12022464512. Throughput: 0: 42875.2. Samples: 12022536040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 20:19:26,866][15401] Updated weights for policy 0, policy_version 733802 (0.0028) [2024-06-24 20:19:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 12022677504. Throughput: 0: 42945.1. Samples: 12022793060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:19:28,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 20:19:30,670][15401] Updated weights for policy 0, policy_version 733812 (0.0033) [2024-06-24 20:19:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42328.6, 300 sec: 42709.8). Total num frames: 12022874112. Throughput: 0: 42914.2. Samples: 12023050280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 20:19:34,666][15401] Updated weights for policy 0, policy_version 733822 (0.0044) [2024-06-24 20:19:38,153][15401] Updated weights for policy 0, policy_version 733832 (0.0037) [2024-06-24 20:19:38,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 12023103488. Throughput: 0: 42849.8. Samples: 12023176360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:38,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 20:19:42,436][15401] Updated weights for policy 0, policy_version 733842 (0.0022) [2024-06-24 20:19:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42875.0, 300 sec: 42820.8). Total num frames: 12023316480. Throughput: 0: 42935.0. Samples: 12023437460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 20:19:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733845_12023316480.pth... [2024-06-24 20:19:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733218_12013043712.pth [2024-06-24 20:19:44,028][15349] Signal inference workers to stop experience collection... (178000 times) [2024-06-24 20:19:44,029][15349] Signal inference workers to resume experience collection... (178000 times) [2024-06-24 20:19:44,043][15401] InferenceWorker_p0-w0: stopping experience collection (178000 times) [2024-06-24 20:19:44,043][15401] InferenceWorker_p0-w0: resuming experience collection (178000 times) [2024-06-24 20:19:45,609][15401] Updated weights for policy 0, policy_version 733852 (0.0035) [2024-06-24 20:19:48,396][15132] Fps is (10 sec: 40943.5, 60 sec: 42593.8, 300 sec: 42764.4). Total num frames: 12023513088. Throughput: 0: 42755.7. Samples: 12023687120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:48,396][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 20:19:50,247][15401] Updated weights for policy 0, policy_version 733862 (0.0027) [2024-06-24 20:19:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 12023726080. Throughput: 0: 42789.0. Samples: 12023814020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 20:19:53,701][15401] Updated weights for policy 0, policy_version 733872 (0.0030) [2024-06-24 20:19:57,857][15401] Updated weights for policy 0, policy_version 733882 (0.0036) [2024-06-24 20:19:58,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 12023939072. Throughput: 0: 42868.8. Samples: 12024077960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:19:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 20:20:01,296][15401] Updated weights for policy 0, policy_version 733892 (0.0032) [2024-06-24 20:20:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 12024168448. Throughput: 0: 42547.7. Samples: 12024324300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:03,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 20:20:05,850][15401] Updated weights for policy 0, policy_version 733902 (0.0043) [2024-06-24 20:20:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12024381440. Throughput: 0: 42645.3. Samples: 12024455080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 20:20:08,874][15401] Updated weights for policy 0, policy_version 733912 (0.0038) [2024-06-24 20:20:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12024561664. Throughput: 0: 42754.7. Samples: 12024717020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 20:20:13,449][15401] Updated weights for policy 0, policy_version 733922 (0.0040) [2024-06-24 20:20:16,637][15401] Updated weights for policy 0, policy_version 733932 (0.0033) [2024-06-24 20:20:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12024807424. Throughput: 0: 42579.9. Samples: 12024966380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 20:20:21,358][15401] Updated weights for policy 0, policy_version 733942 (0.0036) [2024-06-24 20:20:23,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12025036800. Throughput: 0: 42889.3. Samples: 12025106280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 20:20:24,273][15401] Updated weights for policy 0, policy_version 733952 (0.0041) [2024-06-24 20:20:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12025200640. Throughput: 0: 42673.9. Samples: 12025357780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 20:20:28,912][15401] Updated weights for policy 0, policy_version 733962 (0.0031) [2024-06-24 20:20:32,211][15401] Updated weights for policy 0, policy_version 733972 (0.0041) [2024-06-24 20:20:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12025446400. Throughput: 0: 42693.1. Samples: 12025608040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 20:20:36,498][15401] Updated weights for policy 0, policy_version 733982 (0.0037) [2024-06-24 20:20:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 12025675776. Throughput: 0: 42972.9. Samples: 12025747800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:20:39,650][15401] Updated weights for policy 0, policy_version 733992 (0.0037) [2024-06-24 20:20:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12025839616. Throughput: 0: 42593.7. Samples: 12025994680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:43,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 20:20:44,297][15401] Updated weights for policy 0, policy_version 734002 (0.0039) [2024-06-24 20:20:47,103][15401] Updated weights for policy 0, policy_version 734012 (0.0030) [2024-06-24 20:20:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 12026101760. Throughput: 0: 42697.9. Samples: 12026245700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 20:20:51,957][15401] Updated weights for policy 0, policy_version 734022 (0.0037) [2024-06-24 20:20:53,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12026314752. Throughput: 0: 42995.9. Samples: 12026389900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 20:20:54,741][15401] Updated weights for policy 0, policy_version 734032 (0.0036) [2024-06-24 20:20:58,391][15132] Fps is (10 sec: 39316.5, 60 sec: 42597.6, 300 sec: 42764.8). Total num frames: 12026494976. Throughput: 0: 42734.3. Samples: 12026640120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:20:58,391][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 20:20:59,553][15401] Updated weights for policy 0, policy_version 734042 (0.0032) [2024-06-24 20:21:02,273][15401] Updated weights for policy 0, policy_version 734052 (0.0038) [2024-06-24 20:21:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12026757120. Throughput: 0: 42701.9. Samples: 12026887960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:21:03,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-24 20:21:06,115][15349] Signal inference workers to stop experience collection... (178050 times) [2024-06-24 20:21:06,116][15349] Signal inference workers to resume experience collection... (178050 times) [2024-06-24 20:21:06,136][15401] InferenceWorker_p0-w0: stopping experience collection (178050 times) [2024-06-24 20:21:06,136][15401] InferenceWorker_p0-w0: resuming experience collection (178050 times) [2024-06-24 20:21:07,174][15401] Updated weights for policy 0, policy_version 734062 (0.0023) [2024-06-24 20:21:08,389][15132] Fps is (10 sec: 44242.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12026937344. Throughput: 0: 42702.3. Samples: 12027027880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:21:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 20:21:09,917][15401] Updated weights for policy 0, policy_version 734072 (0.0044) [2024-06-24 20:21:13,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12027133952. Throughput: 0: 42642.0. Samples: 12027276680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-24 20:21:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 20:21:14,683][15401] Updated weights for policy 0, policy_version 734082 (0.0041) [2024-06-24 20:21:17,810][15401] Updated weights for policy 0, policy_version 734092 (0.0042) [2024-06-24 20:21:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42876.5). Total num frames: 12027396096. Throughput: 0: 42739.3. Samples: 12027531300. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 20:21:22,257][15401] Updated weights for policy 0, policy_version 734102 (0.0038) [2024-06-24 20:21:23,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12027592704. Throughput: 0: 42768.5. Samples: 12027672380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 20:21:25,318][15401] Updated weights for policy 0, policy_version 734112 (0.0024) [2024-06-24 20:21:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 12027789312. Throughput: 0: 42795.6. Samples: 12027920480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 20:21:29,998][15401] Updated weights for policy 0, policy_version 734122 (0.0047) [2024-06-24 20:21:32,991][15401] Updated weights for policy 0, policy_version 734132 (0.0034) [2024-06-24 20:21:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12028035072. Throughput: 0: 42775.5. Samples: 12028170600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 20:21:37,977][15401] Updated weights for policy 0, policy_version 734142 (0.0039) [2024-06-24 20:21:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12028198912. Throughput: 0: 42594.6. Samples: 12028306660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:38,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 20:21:40,769][15401] Updated weights for policy 0, policy_version 734152 (0.0034) [2024-06-24 20:21:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12028444672. Throughput: 0: 42608.2. Samples: 12028557440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 20:21:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734158_12028444672.pth... [2024-06-24 20:21:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733531_12018171904.pth [2024-06-24 20:21:45,357][15401] Updated weights for policy 0, policy_version 734162 (0.0040) [2024-06-24 20:21:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12028657664. Throughput: 0: 42746.2. Samples: 12028811540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 20:21:48,717][15401] Updated weights for policy 0, policy_version 734172 (0.0038) [2024-06-24 20:21:52,998][15401] Updated weights for policy 0, policy_version 734182 (0.0027) [2024-06-24 20:21:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12028854272. Throughput: 0: 42540.9. Samples: 12028942220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 20:21:56,352][15401] Updated weights for policy 0, policy_version 734192 (0.0035) [2024-06-24 20:21:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43418.5, 300 sec: 42876.1). Total num frames: 12029100032. Throughput: 0: 42585.0. Samples: 12029193000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:21:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 20:22:00,493][15401] Updated weights for policy 0, policy_version 734202 (0.0043) [2024-06-24 20:22:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12029296640. Throughput: 0: 42706.5. Samples: 12029453100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 20:22:03,858][15349] Signal inference workers to stop experience collection... (178100 times) [2024-06-24 20:22:03,908][15401] InferenceWorker_p0-w0: stopping experience collection (178100 times) [2024-06-24 20:22:03,913][15349] Signal inference workers to resume experience collection... (178100 times) [2024-06-24 20:22:03,919][15401] InferenceWorker_p0-w0: resuming experience collection (178100 times) [2024-06-24 20:22:03,925][15401] Updated weights for policy 0, policy_version 734212 (0.0044) [2024-06-24 20:22:07,970][15401] Updated weights for policy 0, policy_version 734222 (0.0032) [2024-06-24 20:22:08,390][15132] Fps is (10 sec: 40957.9, 60 sec: 42871.1, 300 sec: 42876.0). Total num frames: 12029509632. Throughput: 0: 42449.2. Samples: 12029582620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:08,391][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:22:11,346][15401] Updated weights for policy 0, policy_version 734232 (0.0029) [2024-06-24 20:22:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 12029755392. Throughput: 0: 42790.7. Samples: 12029846060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 20:22:15,375][15401] Updated weights for policy 0, policy_version 734242 (0.0028) [2024-06-24 20:22:18,389][15132] Fps is (10 sec: 42601.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12029935616. Throughput: 0: 43079.7. Samples: 12030109180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 20:22:18,839][15401] Updated weights for policy 0, policy_version 734252 (0.0039) [2024-06-24 20:22:22,889][15401] Updated weights for policy 0, policy_version 734262 (0.0029) [2024-06-24 20:22:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12030164992. Throughput: 0: 42756.1. Samples: 12030230680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 20:22:26,686][15401] Updated weights for policy 0, policy_version 734272 (0.0024) [2024-06-24 20:22:28,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43413.0, 300 sec: 42875.2). Total num frames: 12030394368. Throughput: 0: 43009.5. Samples: 12030493140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:28,397][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 20:22:30,519][15401] Updated weights for policy 0, policy_version 734282 (0.0028) [2024-06-24 20:22:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12030590976. Throughput: 0: 43217.0. Samples: 12030756300. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 20:22:34,381][15401] Updated weights for policy 0, policy_version 734292 (0.0053) [2024-06-24 20:22:38,236][15401] Updated weights for policy 0, policy_version 734302 (0.0025) [2024-06-24 20:22:38,392][15132] Fps is (10 sec: 42615.3, 60 sec: 43689.0, 300 sec: 42931.3). Total num frames: 12030820352. Throughput: 0: 43039.4. Samples: 12030879100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:38,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 20:22:42,055][15401] Updated weights for policy 0, policy_version 734312 (0.0032) [2024-06-24 20:22:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12031033344. Throughput: 0: 43339.5. Samples: 12031143280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:43,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 20:22:46,193][15401] Updated weights for policy 0, policy_version 734322 (0.0030) [2024-06-24 20:22:48,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12031229952. Throughput: 0: 43214.4. Samples: 12031397740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 20:22:49,724][15401] Updated weights for policy 0, policy_version 734332 (0.0030) [2024-06-24 20:22:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12031442944. Throughput: 0: 43074.2. Samples: 12031520940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-24 20:22:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 20:22:53,829][15401] Updated weights for policy 0, policy_version 734342 (0.0036) [2024-06-24 20:22:57,455][15401] Updated weights for policy 0, policy_version 734352 (0.0024) [2024-06-24 20:22:58,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42876.1). Total num frames: 12031672320. Throughput: 0: 42992.9. Samples: 12031780840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:22:58,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 20:23:01,383][15401] Updated weights for policy 0, policy_version 734362 (0.0047) [2024-06-24 20:23:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12031868928. Throughput: 0: 42898.2. Samples: 12032039600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:03,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 20:23:05,187][15401] Updated weights for policy 0, policy_version 734372 (0.0029) [2024-06-24 20:23:08,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.9, 300 sec: 42820.6). Total num frames: 12032081920. Throughput: 0: 43051.6. Samples: 12032168000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:08,390][15132] Avg episode reward: [(0, '0.911')] [2024-06-24 20:23:08,774][15401] Updated weights for policy 0, policy_version 734382 (0.0024) [2024-06-24 20:23:12,966][15401] Updated weights for policy 0, policy_version 734392 (0.0042) [2024-06-24 20:23:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12032311296. Throughput: 0: 42927.0. Samples: 12032424580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:13,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-24 20:23:15,601][15349] Signal inference workers to stop experience collection... (178150 times) [2024-06-24 20:23:15,601][15349] Signal inference workers to resume experience collection... (178150 times) [2024-06-24 20:23:15,618][15401] InferenceWorker_p0-w0: stopping experience collection (178150 times) [2024-06-24 20:23:15,651][15401] InferenceWorker_p0-w0: resuming experience collection (178150 times) [2024-06-24 20:23:16,883][15401] Updated weights for policy 0, policy_version 734402 (0.0029) [2024-06-24 20:23:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 12032507904. Throughput: 0: 42714.0. Samples: 12032678440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 20:23:20,499][15401] Updated weights for policy 0, policy_version 734412 (0.0029) [2024-06-24 20:23:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12032720896. Throughput: 0: 42790.8. Samples: 12032804580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 20:23:24,380][15401] Updated weights for policy 0, policy_version 734422 (0.0041) [2024-06-24 20:23:28,326][15401] Updated weights for policy 0, policy_version 734432 (0.0046) [2024-06-24 20:23:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42329.8, 300 sec: 42710.1). Total num frames: 12032933888. Throughput: 0: 42535.5. Samples: 12033057380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 20:23:32,036][15401] Updated weights for policy 0, policy_version 734442 (0.0043) [2024-06-24 20:23:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12033163264. Throughput: 0: 42546.1. Samples: 12033312320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 20:23:36,000][15401] Updated weights for policy 0, policy_version 734452 (0.0033) [2024-06-24 20:23:38,395][15132] Fps is (10 sec: 39301.3, 60 sec: 41777.3, 300 sec: 42653.9). Total num frames: 12033327104. Throughput: 0: 42662.7. Samples: 12033440980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:38,395][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 20:23:39,700][15401] Updated weights for policy 0, policy_version 734462 (0.0031) [2024-06-24 20:23:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12033556480. Throughput: 0: 42475.6. Samples: 12033692140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 20:23:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734471_12033572864.pth... [2024-06-24 20:23:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000733845_12023316480.pth [2024-06-24 20:23:43,660][15401] Updated weights for policy 0, policy_version 734472 (0.0036) [2024-06-24 20:23:47,697][15401] Updated weights for policy 0, policy_version 734482 (0.0037) [2024-06-24 20:23:48,390][15132] Fps is (10 sec: 47537.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12033802240. Throughput: 0: 42306.4. Samples: 12033943400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 20:23:51,904][15401] Updated weights for policy 0, policy_version 734492 (0.0040) [2024-06-24 20:23:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12033982464. Throughput: 0: 42385.7. Samples: 12034075360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 20:23:55,197][15401] Updated weights for policy 0, policy_version 734502 (0.0048) [2024-06-24 20:23:58,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 12034211840. Throughput: 0: 42233.3. Samples: 12034325080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:23:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 20:23:59,511][15401] Updated weights for policy 0, policy_version 734512 (0.0037) [2024-06-24 20:24:02,838][15401] Updated weights for policy 0, policy_version 734522 (0.0024) [2024-06-24 20:24:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12034424832. Throughput: 0: 42400.6. Samples: 12034586460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 20:24:07,096][15401] Updated weights for policy 0, policy_version 734532 (0.0023) [2024-06-24 20:24:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12034621440. Throughput: 0: 42415.9. Samples: 12034713300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 20:24:10,654][15401] Updated weights for policy 0, policy_version 734542 (0.0040) [2024-06-24 20:24:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12034867200. Throughput: 0: 42451.1. Samples: 12034967680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 20:24:14,674][15401] Updated weights for policy 0, policy_version 734552 (0.0030) [2024-06-24 20:24:18,222][15401] Updated weights for policy 0, policy_version 734562 (0.0027) [2024-06-24 20:24:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12035063808. Throughput: 0: 42664.1. Samples: 12035232200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:18,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 20:24:22,179][15401] Updated weights for policy 0, policy_version 734572 (0.0042) [2024-06-24 20:24:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12035276800. Throughput: 0: 42538.2. Samples: 12035354980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 20:24:25,950][15401] Updated weights for policy 0, policy_version 734582 (0.0047) [2024-06-24 20:24:28,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12035506176. Throughput: 0: 42731.5. Samples: 12035615160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:28,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 20:24:30,014][15401] Updated weights for policy 0, policy_version 734592 (0.0034) [2024-06-24 20:24:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12035702784. Throughput: 0: 43042.2. Samples: 12035880300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:33,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 20:24:33,651][15401] Updated weights for policy 0, policy_version 734602 (0.0028) [2024-06-24 20:24:34,609][15349] Signal inference workers to stop experience collection... (178200 times) [2024-06-24 20:24:34,609][15349] Signal inference workers to resume experience collection... (178200 times) [2024-06-24 20:24:34,653][15401] InferenceWorker_p0-w0: stopping experience collection (178200 times) [2024-06-24 20:24:34,653][15401] InferenceWorker_p0-w0: resuming experience collection (178200 times) [2024-06-24 20:24:37,668][15401] Updated weights for policy 0, policy_version 734612 (0.0029) [2024-06-24 20:24:38,390][15132] Fps is (10 sec: 40969.2, 60 sec: 43148.2, 300 sec: 42709.5). Total num frames: 12035915776. Throughput: 0: 42817.7. Samples: 12036002160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 20:24:38,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 20:24:41,248][15401] Updated weights for policy 0, policy_version 734622 (0.0036) [2024-06-24 20:24:43,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43417.7, 300 sec: 42877.0). Total num frames: 12036161536. Throughput: 0: 42965.9. Samples: 12036258540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:24:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 20:24:45,131][15401] Updated weights for policy 0, policy_version 734632 (0.0034) [2024-06-24 20:24:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 12036341760. Throughput: 0: 42969.7. Samples: 12036520200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:24:48,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 20:24:49,038][15401] Updated weights for policy 0, policy_version 734642 (0.0045) [2024-06-24 20:24:52,689][15401] Updated weights for policy 0, policy_version 734652 (0.0028) [2024-06-24 20:24:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12036571136. Throughput: 0: 42705.9. Samples: 12036635060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:24:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 20:24:56,715][15401] Updated weights for policy 0, policy_version 734662 (0.0030) [2024-06-24 20:24:58,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12036800512. Throughput: 0: 42972.4. Samples: 12036901440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:24:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 20:25:00,271][15401] Updated weights for policy 0, policy_version 734672 (0.0037) [2024-06-24 20:25:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12036964352. Throughput: 0: 42811.4. Samples: 12037158720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 20:25:04,441][15401] Updated weights for policy 0, policy_version 734682 (0.0038) [2024-06-24 20:25:07,723][15401] Updated weights for policy 0, policy_version 734692 (0.0031) [2024-06-24 20:25:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12037210112. Throughput: 0: 42718.3. Samples: 12037277300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 20:25:12,072][15401] Updated weights for policy 0, policy_version 734702 (0.0043) [2024-06-24 20:25:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12037439488. Throughput: 0: 42804.9. Samples: 12037541280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 20:25:15,370][15401] Updated weights for policy 0, policy_version 734712 (0.0039) [2024-06-24 20:25:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12037619712. Throughput: 0: 42653.9. Samples: 12037799720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 20:25:19,818][15401] Updated weights for policy 0, policy_version 734722 (0.0033) [2024-06-24 20:25:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12037832704. Throughput: 0: 42618.8. Samples: 12037920000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 20:25:23,522][15401] Updated weights for policy 0, policy_version 734732 (0.0042) [2024-06-24 20:25:27,393][15401] Updated weights for policy 0, policy_version 734742 (0.0044) [2024-06-24 20:25:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 12038078464. Throughput: 0: 42718.0. Samples: 12038180860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 20:25:31,123][15401] Updated weights for policy 0, policy_version 734752 (0.0035) [2024-06-24 20:25:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12038258688. Throughput: 0: 42586.8. Samples: 12038436500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 20:25:34,969][15401] Updated weights for policy 0, policy_version 734762 (0.0035) [2024-06-24 20:25:38,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 12038471680. Throughput: 0: 42746.7. Samples: 12038558660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:38,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 20:25:38,676][15401] Updated weights for policy 0, policy_version 734772 (0.0042) [2024-06-24 20:25:42,535][15401] Updated weights for policy 0, policy_version 734782 (0.0025) [2024-06-24 20:25:43,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 12038717440. Throughput: 0: 42591.9. Samples: 12038818080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 20:25:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734785_12038717440.pth... [2024-06-24 20:25:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734158_12028444672.pth [2024-06-24 20:25:46,374][15401] Updated weights for policy 0, policy_version 734792 (0.0031) [2024-06-24 20:25:48,394][15132] Fps is (10 sec: 44215.2, 60 sec: 42869.8, 300 sec: 42708.8). Total num frames: 12038914048. Throughput: 0: 42460.0. Samples: 12039069620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:48,395][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 20:25:50,481][15401] Updated weights for policy 0, policy_version 734802 (0.0031) [2024-06-24 20:25:53,390][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42765.2). Total num frames: 12039110656. Throughput: 0: 42540.9. Samples: 12039191640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:53,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 20:25:54,199][15401] Updated weights for policy 0, policy_version 734812 (0.0032) [2024-06-24 20:25:57,992][15349] Signal inference workers to stop experience collection... (178250 times) [2024-06-24 20:25:57,993][15349] Signal inference workers to resume experience collection... (178250 times) [2024-06-24 20:25:58,035][15401] InferenceWorker_p0-w0: stopping experience collection (178250 times) [2024-06-24 20:25:58,035][15401] InferenceWorker_p0-w0: resuming experience collection (178250 times) [2024-06-24 20:25:58,156][15401] Updated weights for policy 0, policy_version 734822 (0.0034) [2024-06-24 20:25:58,389][15132] Fps is (10 sec: 42619.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12039340032. Throughput: 0: 42489.0. Samples: 12039453280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:25:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 20:26:01,783][15401] Updated weights for policy 0, policy_version 734832 (0.0033) [2024-06-24 20:26:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12039553024. Throughput: 0: 42265.7. Samples: 12039701680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:26:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 20:26:06,173][15401] Updated weights for policy 0, policy_version 734842 (0.0044) [2024-06-24 20:26:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12039733248. Throughput: 0: 42557.3. Samples: 12039835080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:26:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 20:26:09,874][15401] Updated weights for policy 0, policy_version 734852 (0.0036) [2024-06-24 20:26:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12039962624. Throughput: 0: 42228.0. Samples: 12040081120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:26:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 20:26:13,826][15401] Updated weights for policy 0, policy_version 734862 (0.0035) [2024-06-24 20:26:17,606][15401] Updated weights for policy 0, policy_version 734872 (0.0038) [2024-06-24 20:26:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12040192000. Throughput: 0: 42071.9. Samples: 12040329740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 20:26:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 20:26:21,541][15401] Updated weights for policy 0, policy_version 734882 (0.0026) [2024-06-24 20:26:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12040372224. Throughput: 0: 42261.6. Samples: 12040460440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 20:26:25,558][15401] Updated weights for policy 0, policy_version 734892 (0.0037) [2024-06-24 20:26:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12040601600. Throughput: 0: 42044.1. Samples: 12040710060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-24 20:26:29,590][15401] Updated weights for policy 0, policy_version 734902 (0.0027) [2024-06-24 20:26:33,379][15401] Updated weights for policy 0, policy_version 734912 (0.0028) [2024-06-24 20:26:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12040798208. Throughput: 0: 42189.4. Samples: 12040967940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 20:26:37,243][15401] Updated weights for policy 0, policy_version 734922 (0.0047) [2024-06-24 20:26:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.5, 300 sec: 42598.1). Total num frames: 12041011200. Throughput: 0: 42165.7. Samples: 12041089200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:38,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 20:26:41,006][15401] Updated weights for policy 0, policy_version 734932 (0.0027) [2024-06-24 20:26:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12041256960. Throughput: 0: 42153.3. Samples: 12041350180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-24 20:26:44,713][15401] Updated weights for policy 0, policy_version 734942 (0.0029) [2024-06-24 20:26:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42055.5, 300 sec: 42653.9). Total num frames: 12041437184. Throughput: 0: 42310.2. Samples: 12041605640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 20:26:48,683][15401] Updated weights for policy 0, policy_version 734952 (0.0031) [2024-06-24 20:26:52,227][15401] Updated weights for policy 0, policy_version 734962 (0.0045) [2024-06-24 20:26:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12041633792. Throughput: 0: 42110.7. Samples: 12041730060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:53,399][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 20:26:56,352][15401] Updated weights for policy 0, policy_version 734972 (0.0036) [2024-06-24 20:26:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12041879552. Throughput: 0: 42449.9. Samples: 12041991360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:26:58,390][15132] Avg episode reward: [(0, '0.221')] [2024-06-24 20:26:59,779][15401] Updated weights for policy 0, policy_version 734982 (0.0034) [2024-06-24 20:27:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42598.5). Total num frames: 12042076160. Throughput: 0: 42603.5. Samples: 12042246900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:03,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 20:27:04,393][15401] Updated weights for policy 0, policy_version 734992 (0.0031) [2024-06-24 20:27:07,234][15401] Updated weights for policy 0, policy_version 735002 (0.0033) [2024-06-24 20:27:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12042289152. Throughput: 0: 42356.9. Samples: 12042366500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-24 20:27:12,095][15401] Updated weights for policy 0, policy_version 735012 (0.0037) [2024-06-24 20:27:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12042534912. Throughput: 0: 42816.4. Samples: 12042636800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 20:27:15,093][15401] Updated weights for policy 0, policy_version 735022 (0.0042) [2024-06-24 20:27:17,604][15349] Signal inference workers to stop experience collection... (178300 times) [2024-06-24 20:27:17,605][15349] Signal inference workers to resume experience collection... (178300 times) [2024-06-24 20:27:17,618][15401] InferenceWorker_p0-w0: stopping experience collection (178300 times) [2024-06-24 20:27:17,619][15401] InferenceWorker_p0-w0: resuming experience collection (178300 times) [2024-06-24 20:27:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12042731520. Throughput: 0: 42713.2. Samples: 12042890040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 20:27:19,472][15401] Updated weights for policy 0, policy_version 735032 (0.0031) [2024-06-24 20:27:22,942][15401] Updated weights for policy 0, policy_version 735042 (0.0041) [2024-06-24 20:27:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 12042944512. Throughput: 0: 42858.2. Samples: 12043017720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:23,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 20:27:26,923][15401] Updated weights for policy 0, policy_version 735052 (0.0032) [2024-06-24 20:27:28,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12043173888. Throughput: 0: 42767.0. Samples: 12043274800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:28,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 20:27:30,522][15401] Updated weights for policy 0, policy_version 735062 (0.0033) [2024-06-24 20:27:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42487.6). Total num frames: 12043354112. Throughput: 0: 42796.3. Samples: 12043531480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 20:27:34,764][15401] Updated weights for policy 0, policy_version 735072 (0.0027) [2024-06-24 20:27:38,010][15401] Updated weights for policy 0, policy_version 735082 (0.0032) [2024-06-24 20:27:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 12043599872. Throughput: 0: 42763.6. Samples: 12043654420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:38,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-24 20:27:42,349][15401] Updated weights for policy 0, policy_version 735092 (0.0040) [2024-06-24 20:27:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12043796480. Throughput: 0: 42862.7. Samples: 12043920180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 20:27:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735096_12043812864.pth... [2024-06-24 20:27:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734471_12033572864.pth [2024-06-24 20:27:45,482][15401] Updated weights for policy 0, policy_version 735102 (0.0022) [2024-06-24 20:27:48,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42593.9, 300 sec: 42541.9). Total num frames: 12043993088. Throughput: 0: 42937.5. Samples: 12044179360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:48,396][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 20:27:49,966][15401] Updated weights for policy 0, policy_version 735112 (0.0026) [2024-06-24 20:27:53,074][15401] Updated weights for policy 0, policy_version 735122 (0.0040) [2024-06-24 20:27:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 42654.3). Total num frames: 12044255232. Throughput: 0: 43053.3. Samples: 12044303900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 20:27:57,612][15401] Updated weights for policy 0, policy_version 735132 (0.0043) [2024-06-24 20:27:58,389][15132] Fps is (10 sec: 45905.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12044451840. Throughput: 0: 42884.6. Samples: 12044566600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:27:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 20:28:00,710][15401] Updated weights for policy 0, policy_version 735142 (0.0045) [2024-06-24 20:28:03,390][15132] Fps is (10 sec: 37680.8, 60 sec: 42598.0, 300 sec: 42542.8). Total num frames: 12044632064. Throughput: 0: 42879.9. Samples: 12044819660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-24 20:28:03,391][15132] Avg episode reward: [(0, '0.227')] [2024-06-24 20:28:05,310][15401] Updated weights for policy 0, policy_version 735152 (0.0037) [2024-06-24 20:28:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12044877824. Throughput: 0: 42900.0. Samples: 12044948220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:08,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 20:28:08,535][15401] Updated weights for policy 0, policy_version 735162 (0.0033) [2024-06-24 20:28:13,086][15401] Updated weights for policy 0, policy_version 735172 (0.0041) [2024-06-24 20:28:13,390][15132] Fps is (10 sec: 44239.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12045074432. Throughput: 0: 42917.4. Samples: 12045205980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:13,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 20:28:16,105][15401] Updated weights for policy 0, policy_version 735182 (0.0039) [2024-06-24 20:28:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12045271040. Throughput: 0: 42952.2. Samples: 12045464320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 20:28:20,651][15401] Updated weights for policy 0, policy_version 735192 (0.0038) [2024-06-24 20:28:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12045533184. Throughput: 0: 43033.4. Samples: 12045590920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 20:28:23,674][15401] Updated weights for policy 0, policy_version 735202 (0.0028) [2024-06-24 20:28:28,267][15401] Updated weights for policy 0, policy_version 735212 (0.0030) [2024-06-24 20:28:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 12045713408. Throughput: 0: 42922.5. Samples: 12045851700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 20:28:31,479][15401] Updated weights for policy 0, policy_version 735222 (0.0029) [2024-06-24 20:28:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42710.2). Total num frames: 12045926400. Throughput: 0: 42796.7. Samples: 12046104940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 20:28:35,829][15401] Updated weights for policy 0, policy_version 735232 (0.0037) [2024-06-24 20:28:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12046172160. Throughput: 0: 42842.7. Samples: 12046231820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:38,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 20:28:39,498][15401] Updated weights for policy 0, policy_version 735242 (0.0031) [2024-06-24 20:28:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12046352384. Throughput: 0: 42834.2. Samples: 12046494140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 20:28:43,495][15401] Updated weights for policy 0, policy_version 735252 (0.0033) [2024-06-24 20:28:47,548][15401] Updated weights for policy 0, policy_version 735262 (0.0038) [2024-06-24 20:28:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 12046581760. Throughput: 0: 42641.0. Samples: 12046738480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 20:28:51,150][15401] Updated weights for policy 0, policy_version 735272 (0.0050) [2024-06-24 20:28:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12046794752. Throughput: 0: 42637.5. Samples: 12046866900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 20:28:55,018][15401] Updated weights for policy 0, policy_version 735282 (0.0036) [2024-06-24 20:28:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12047007744. Throughput: 0: 42749.7. Samples: 12047129720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:28:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 20:28:58,588][15401] Updated weights for policy 0, policy_version 735292 (0.0026) [2024-06-24 20:28:59,288][15349] Signal inference workers to stop experience collection... (178350 times) [2024-06-24 20:28:59,340][15401] InferenceWorker_p0-w0: stopping experience collection (178350 times) [2024-06-24 20:28:59,340][15349] Signal inference workers to resume experience collection... (178350 times) [2024-06-24 20:28:59,357][15401] InferenceWorker_p0-w0: resuming experience collection (178350 times) [2024-06-24 20:29:02,696][15401] Updated weights for policy 0, policy_version 735302 (0.0036) [2024-06-24 20:29:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43145.0, 300 sec: 42709.5). Total num frames: 12047220736. Throughput: 0: 42626.1. Samples: 12047382500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:03,395][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 20:29:06,206][15401] Updated weights for policy 0, policy_version 735312 (0.0033) [2024-06-24 20:29:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12047450112. Throughput: 0: 42661.8. Samples: 12047510700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 20:29:10,233][15401] Updated weights for policy 0, policy_version 735322 (0.0021) [2024-06-24 20:29:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12047630336. Throughput: 0: 42655.3. Samples: 12047771180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 20:29:14,128][15401] Updated weights for policy 0, policy_version 735332 (0.0027) [2024-06-24 20:29:17,829][15401] Updated weights for policy 0, policy_version 735342 (0.0033) [2024-06-24 20:29:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12047859712. Throughput: 0: 42565.7. Samples: 12048020400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:18,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 20:29:21,665][15401] Updated weights for policy 0, policy_version 735352 (0.0039) [2024-06-24 20:29:23,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42598.2, 300 sec: 42654.2). Total num frames: 12048089088. Throughput: 0: 42700.2. Samples: 12048153340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:23,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 20:29:25,888][15401] Updated weights for policy 0, policy_version 735362 (0.0044) [2024-06-24 20:29:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12048269312. Throughput: 0: 42557.8. Samples: 12048409240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 20:29:29,109][15401] Updated weights for policy 0, policy_version 735372 (0.0042) [2024-06-24 20:29:33,389][15132] Fps is (10 sec: 39322.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12048482304. Throughput: 0: 42824.5. Samples: 12048665580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 20:29:33,478][15401] Updated weights for policy 0, policy_version 735382 (0.0024) [2024-06-24 20:29:36,816][15401] Updated weights for policy 0, policy_version 735392 (0.0040) [2024-06-24 20:29:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12048728064. Throughput: 0: 42809.6. Samples: 12048793340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 20:29:40,947][15401] Updated weights for policy 0, policy_version 735402 (0.0025) [2024-06-24 20:29:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 12048924672. Throughput: 0: 42800.1. Samples: 12049055720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 20:29:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735409_12048941056.pth... [2024-06-24 20:29:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000734785_12038717440.pth [2024-06-24 20:29:44,380][15401] Updated weights for policy 0, policy_version 735412 (0.0027) [2024-06-24 20:29:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12049137664. Throughput: 0: 42701.0. Samples: 12049304040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-24 20:29:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 20:29:48,433][15401] Updated weights for policy 0, policy_version 735422 (0.0028) [2024-06-24 20:29:52,322][15401] Updated weights for policy 0, policy_version 735432 (0.0032) [2024-06-24 20:29:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12049367040. Throughput: 0: 42786.2. Samples: 12049436080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:29:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 20:29:56,031][15401] Updated weights for policy 0, policy_version 735442 (0.0033) [2024-06-24 20:29:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 12049547264. Throughput: 0: 42741.8. Samples: 12049694560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:29:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 20:29:59,775][15401] Updated weights for policy 0, policy_version 735452 (0.0032) [2024-06-24 20:30:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12049793024. Throughput: 0: 42868.9. Samples: 12049949500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 20:30:03,440][15401] Updated weights for policy 0, policy_version 735462 (0.0043) [2024-06-24 20:30:07,358][15401] Updated weights for policy 0, policy_version 735472 (0.0034) [2024-06-24 20:30:08,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12050022400. Throughput: 0: 42914.0. Samples: 12050084460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 20:30:11,197][15401] Updated weights for policy 0, policy_version 735482 (0.0053) [2024-06-24 20:30:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12050202624. Throughput: 0: 43066.3. Samples: 12050347220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 20:30:15,082][15401] Updated weights for policy 0, policy_version 735492 (0.0032) [2024-06-24 20:30:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12050448384. Throughput: 0: 42909.7. Samples: 12050596520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 20:30:18,890][15401] Updated weights for policy 0, policy_version 735502 (0.0028) [2024-06-24 20:30:22,657][15401] Updated weights for policy 0, policy_version 735512 (0.0030) [2024-06-24 20:30:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 12050661376. Throughput: 0: 43057.5. Samples: 12050730920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:23,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 20:30:26,485][15401] Updated weights for policy 0, policy_version 735522 (0.0040) [2024-06-24 20:30:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12050841600. Throughput: 0: 42831.2. Samples: 12050983120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 20:30:30,298][15401] Updated weights for policy 0, policy_version 735532 (0.0034) [2024-06-24 20:30:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12051070976. Throughput: 0: 43024.5. Samples: 12051240140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 20:30:33,395][15349] Signal inference workers to stop experience collection... (178400 times) [2024-06-24 20:30:33,396][15349] Signal inference workers to resume experience collection... (178400 times) [2024-06-24 20:30:33,413][15401] InferenceWorker_p0-w0: stopping experience collection (178400 times) [2024-06-24 20:30:33,413][15401] InferenceWorker_p0-w0: resuming experience collection (178400 times) [2024-06-24 20:30:34,118][15401] Updated weights for policy 0, policy_version 735542 (0.0041) [2024-06-24 20:30:37,754][15401] Updated weights for policy 0, policy_version 735552 (0.0042) [2024-06-24 20:30:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12051300352. Throughput: 0: 43003.2. Samples: 12051371220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 20:30:41,847][15401] Updated weights for policy 0, policy_version 735562 (0.0027) [2024-06-24 20:30:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42654.6). Total num frames: 12051496960. Throughput: 0: 43008.8. Samples: 12051629960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 20:30:45,513][15401] Updated weights for policy 0, policy_version 735572 (0.0043) [2024-06-24 20:30:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12051709952. Throughput: 0: 43116.0. Samples: 12051889720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 20:30:49,624][15401] Updated weights for policy 0, policy_version 735582 (0.0029) [2024-06-24 20:30:53,215][15401] Updated weights for policy 0, policy_version 735592 (0.0034) [2024-06-24 20:30:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12051939328. Throughput: 0: 42892.3. Samples: 12052014620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 20:30:57,333][15401] Updated weights for policy 0, policy_version 735602 (0.0031) [2024-06-24 20:30:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12052152320. Throughput: 0: 42712.4. Samples: 12052269280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:30:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:31:01,035][15401] Updated weights for policy 0, policy_version 735612 (0.0035) [2024-06-24 20:31:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12052348928. Throughput: 0: 42983.9. Samples: 12052530800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 20:31:04,963][15401] Updated weights for policy 0, policy_version 735622 (0.0037) [2024-06-24 20:31:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12052578304. Throughput: 0: 42683.5. Samples: 12052651680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 20:31:08,659][15401] Updated weights for policy 0, policy_version 735632 (0.0030) [2024-06-24 20:31:12,665][15401] Updated weights for policy 0, policy_version 735642 (0.0035) [2024-06-24 20:31:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12052807680. Throughput: 0: 42768.8. Samples: 12052907720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:13,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 20:31:16,171][15401] Updated weights for policy 0, policy_version 735652 (0.0035) [2024-06-24 20:31:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12052971520. Throughput: 0: 42761.7. Samples: 12053164420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 20:31:20,260][15401] Updated weights for policy 0, policy_version 735662 (0.0043) [2024-06-24 20:31:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12053233664. Throughput: 0: 42616.4. Samples: 12053288960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 20:31:23,697][15401] Updated weights for policy 0, policy_version 735672 (0.0037) [2024-06-24 20:31:27,911][15401] Updated weights for policy 0, policy_version 735682 (0.0034) [2024-06-24 20:31:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12053430272. Throughput: 0: 42675.1. Samples: 12053550340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:31:28,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-24 20:31:31,870][15401] Updated weights for policy 0, policy_version 735692 (0.0024) [2024-06-24 20:31:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.2, 300 sec: 42765.4). Total num frames: 12053626880. Throughput: 0: 42760.8. Samples: 12053813960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 20:31:35,302][15401] Updated weights for policy 0, policy_version 735702 (0.0023) [2024-06-24 20:31:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12053872640. Throughput: 0: 42640.1. Samples: 12053933420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:38,394][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 20:31:39,464][15401] Updated weights for policy 0, policy_version 735712 (0.0037) [2024-06-24 20:31:43,142][15349] Signal inference workers to stop experience collection... (178450 times) [2024-06-24 20:31:43,142][15349] Signal inference workers to resume experience collection... (178450 times) [2024-06-24 20:31:43,164][15401] InferenceWorker_p0-w0: stopping experience collection (178450 times) [2024-06-24 20:31:43,164][15401] InferenceWorker_p0-w0: resuming experience collection (178450 times) [2024-06-24 20:31:43,282][15401] Updated weights for policy 0, policy_version 735722 (0.0030) [2024-06-24 20:31:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12054069248. Throughput: 0: 42800.3. Samples: 12054195300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 20:31:43,531][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735723_12054085632.pth... [2024-06-24 20:31:43,587][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735096_12043812864.pth [2024-06-24 20:31:46,950][15401] Updated weights for policy 0, policy_version 735732 (0.0037) [2024-06-24 20:31:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12054282240. Throughput: 0: 42822.7. Samples: 12054457820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 20:31:51,003][15401] Updated weights for policy 0, policy_version 735742 (0.0039) [2024-06-24 20:31:53,392][15132] Fps is (10 sec: 45864.7, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 12054528000. Throughput: 0: 43000.4. Samples: 12054586800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:53,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 20:31:54,517][15401] Updated weights for policy 0, policy_version 735752 (0.0037) [2024-06-24 20:31:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12054691840. Throughput: 0: 43000.5. Samples: 12054842740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:31:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 20:31:58,746][15401] Updated weights for policy 0, policy_version 735762 (0.0037) [2024-06-24 20:32:02,063][15401] Updated weights for policy 0, policy_version 735772 (0.0027) [2024-06-24 20:32:03,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12054921216. Throughput: 0: 42955.1. Samples: 12055097400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:03,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:32:06,385][15401] Updated weights for policy 0, policy_version 735782 (0.0031) [2024-06-24 20:32:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12055150592. Throughput: 0: 43091.6. Samples: 12055228080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 20:32:10,299][15401] Updated weights for policy 0, policy_version 735792 (0.0022) [2024-06-24 20:32:13,394][15132] Fps is (10 sec: 42579.7, 60 sec: 42322.3, 300 sec: 42764.4). Total num frames: 12055347200. Throughput: 0: 42851.4. Samples: 12055478840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:13,394][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 20:32:13,938][15401] Updated weights for policy 0, policy_version 735802 (0.0033) [2024-06-24 20:32:18,157][15401] Updated weights for policy 0, policy_version 735812 (0.0035) [2024-06-24 20:32:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12055560192. Throughput: 0: 42786.3. Samples: 12055739340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 20:32:21,435][15401] Updated weights for policy 0, policy_version 735822 (0.0033) [2024-06-24 20:32:23,389][15132] Fps is (10 sec: 44255.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12055789568. Throughput: 0: 42863.6. Samples: 12055862280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 20:32:25,618][15401] Updated weights for policy 0, policy_version 735832 (0.0022) [2024-06-24 20:32:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12056002560. Throughput: 0: 42832.1. Samples: 12056122740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 20:32:29,187][15401] Updated weights for policy 0, policy_version 735842 (0.0037) [2024-06-24 20:32:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12056182784. Throughput: 0: 42755.7. Samples: 12056381820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:32:33,487][15401] Updated weights for policy 0, policy_version 735852 (0.0033) [2024-06-24 20:32:36,738][15401] Updated weights for policy 0, policy_version 735862 (0.0029) [2024-06-24 20:32:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12056428544. Throughput: 0: 42560.0. Samples: 12056501900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 20:32:40,903][15401] Updated weights for policy 0, policy_version 735872 (0.0032) [2024-06-24 20:32:43,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.9, 300 sec: 42932.2). Total num frames: 12056657920. Throughput: 0: 42756.3. Samples: 12056766880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:43,392][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 20:32:44,263][15401] Updated weights for policy 0, policy_version 735882 (0.0042) [2024-06-24 20:32:48,394][15132] Fps is (10 sec: 40942.4, 60 sec: 42595.4, 300 sec: 42653.3). Total num frames: 12056838144. Throughput: 0: 42746.5. Samples: 12057021180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:48,394][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 20:32:48,570][15401] Updated weights for policy 0, policy_version 735892 (0.0035) [2024-06-24 20:32:52,339][15401] Updated weights for policy 0, policy_version 735902 (0.0031) [2024-06-24 20:32:53,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 12057067520. Throughput: 0: 42606.1. Samples: 12057145360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:53,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 20:32:56,211][15401] Updated weights for policy 0, policy_version 735912 (0.0032) [2024-06-24 20:32:58,390][15132] Fps is (10 sec: 45894.5, 60 sec: 43417.5, 300 sec: 42931.7). Total num frames: 12057296896. Throughput: 0: 42884.0. Samples: 12057408440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:32:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 20:32:59,833][15401] Updated weights for policy 0, policy_version 735922 (0.0028) [2024-06-24 20:33:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12057493504. Throughput: 0: 42696.0. Samples: 12057660660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:33:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 20:33:04,047][15401] Updated weights for policy 0, policy_version 735932 (0.0032) [2024-06-24 20:33:07,227][15401] Updated weights for policy 0, policy_version 735942 (0.0030) [2024-06-24 20:33:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12057722880. Throughput: 0: 42749.8. Samples: 12057786020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:33:08,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 20:33:11,801][15401] Updated weights for policy 0, policy_version 735952 (0.0027) [2024-06-24 20:33:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43147.6, 300 sec: 42931.6). Total num frames: 12057935872. Throughput: 0: 42904.4. Samples: 12058053440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-24 20:33:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 20:33:14,559][15401] Updated weights for policy 0, policy_version 735962 (0.0027) [2024-06-24 20:33:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12058132480. Throughput: 0: 42832.4. Samples: 12058309280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 20:33:19,313][15401] Updated weights for policy 0, policy_version 735972 (0.0031) [2024-06-24 20:33:22,123][15401] Updated weights for policy 0, policy_version 735982 (0.0043) [2024-06-24 20:33:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12058361856. Throughput: 0: 42934.2. Samples: 12058433940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 20:33:26,858][15401] Updated weights for policy 0, policy_version 735992 (0.0024) [2024-06-24 20:33:27,819][15349] Signal inference workers to stop experience collection... (178500 times) [2024-06-24 20:33:27,872][15349] Signal inference workers to resume experience collection... (178500 times) [2024-06-24 20:33:27,872][15401] InferenceWorker_p0-w0: stopping experience collection (178500 times) [2024-06-24 20:33:27,889][15401] InferenceWorker_p0-w0: resuming experience collection (178500 times) [2024-06-24 20:33:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12058574848. Throughput: 0: 42930.4. Samples: 12058698640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 20:33:29,685][15401] Updated weights for policy 0, policy_version 736002 (0.0033) [2024-06-24 20:33:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12058771456. Throughput: 0: 42941.0. Samples: 12058953340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:33,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 20:33:34,331][15401] Updated weights for policy 0, policy_version 736012 (0.0034) [2024-06-24 20:33:37,587][15401] Updated weights for policy 0, policy_version 736022 (0.0048) [2024-06-24 20:33:38,392][15132] Fps is (10 sec: 44225.5, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 12059017216. Throughput: 0: 43071.1. Samples: 12059083660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:38,393][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 20:33:41,987][15401] Updated weights for policy 0, policy_version 736032 (0.0035) [2024-06-24 20:33:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 12059213824. Throughput: 0: 42947.6. Samples: 12059341080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 20:33:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736036_12059213824.pth... [2024-06-24 20:33:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735409_12048941056.pth [2024-06-24 20:33:45,171][15401] Updated weights for policy 0, policy_version 736042 (0.0031) [2024-06-24 20:33:48,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42874.4, 300 sec: 42765.0). Total num frames: 12059410432. Throughput: 0: 42957.7. Samples: 12059593760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 20:33:49,492][15401] Updated weights for policy 0, policy_version 736052 (0.0026) [2024-06-24 20:33:53,123][15401] Updated weights for policy 0, policy_version 736062 (0.0024) [2024-06-24 20:33:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12059656192. Throughput: 0: 42926.5. Samples: 12059717720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:53,394][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 20:33:57,407][15401] Updated weights for policy 0, policy_version 736072 (0.0028) [2024-06-24 20:33:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12059852800. Throughput: 0: 42757.8. Samples: 12059977540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:33:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 20:34:00,652][15401] Updated weights for policy 0, policy_version 736082 (0.0047) [2024-06-24 20:34:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12060065792. Throughput: 0: 42799.1. Samples: 12060235240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 20:34:05,271][15401] Updated weights for policy 0, policy_version 736092 (0.0031) [2024-06-24 20:34:08,347][15401] Updated weights for policy 0, policy_version 736102 (0.0037) [2024-06-24 20:34:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12060295168. Throughput: 0: 42841.7. Samples: 12060361820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 20:34:12,826][15401] Updated weights for policy 0, policy_version 736112 (0.0031) [2024-06-24 20:34:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12060491776. Throughput: 0: 42791.4. Samples: 12060624260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 20:34:16,048][15401] Updated weights for policy 0, policy_version 736122 (0.0047) [2024-06-24 20:34:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 12060704768. Throughput: 0: 42885.0. Samples: 12060883160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 20:34:20,397][15401] Updated weights for policy 0, policy_version 736132 (0.0044) [2024-06-24 20:34:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12060934144. Throughput: 0: 42772.5. Samples: 12061008320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 20:34:23,545][15401] Updated weights for policy 0, policy_version 736142 (0.0028) [2024-06-24 20:34:27,995][15401] Updated weights for policy 0, policy_version 736152 (0.0050) [2024-06-24 20:34:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 12061114368. Throughput: 0: 42680.4. Samples: 12061261700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 20:34:31,249][15401] Updated weights for policy 0, policy_version 736162 (0.0023) [2024-06-24 20:34:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12061343744. Throughput: 0: 42598.3. Samples: 12061510680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 20:34:35,562][15401] Updated weights for policy 0, policy_version 736172 (0.0029) [2024-06-24 20:34:38,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 12061573120. Throughput: 0: 42793.8. Samples: 12061643440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 20:34:38,836][15401] Updated weights for policy 0, policy_version 736182 (0.0025) [2024-06-24 20:34:43,215][15401] Updated weights for policy 0, policy_version 736192 (0.0025) [2024-06-24 20:34:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 12061769728. Throughput: 0: 42701.4. Samples: 12061899100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 20:34:45,645][15349] Signal inference workers to stop experience collection... (178550 times) [2024-06-24 20:34:45,646][15349] Signal inference workers to resume experience collection... (178550 times) [2024-06-24 20:34:45,682][15401] InferenceWorker_p0-w0: stopping experience collection (178550 times) [2024-06-24 20:34:45,683][15401] InferenceWorker_p0-w0: resuming experience collection (178550 times) [2024-06-24 20:34:46,389][15401] Updated weights for policy 0, policy_version 736202 (0.0034) [2024-06-24 20:34:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12061982720. Throughput: 0: 42517.3. Samples: 12062148520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 20:34:50,751][15401] Updated weights for policy 0, policy_version 736212 (0.0040) [2024-06-24 20:34:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42820.5). Total num frames: 12062179328. Throughput: 0: 42594.8. Samples: 12062278580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-24 20:34:53,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 20:34:54,189][15401] Updated weights for policy 0, policy_version 736222 (0.0044) [2024-06-24 20:34:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12062408704. Throughput: 0: 42406.3. Samples: 12062532540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:34:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 20:34:58,612][15401] Updated weights for policy 0, policy_version 736232 (0.0035) [2024-06-24 20:35:01,828][15401] Updated weights for policy 0, policy_version 736242 (0.0038) [2024-06-24 20:35:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12062638080. Throughput: 0: 42253.2. Samples: 12062784560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 20:35:06,792][15401] Updated weights for policy 0, policy_version 736252 (0.0033) [2024-06-24 20:35:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12062818304. Throughput: 0: 42402.7. Samples: 12062916440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 20:35:09,782][15401] Updated weights for policy 0, policy_version 736262 (0.0043) [2024-06-24 20:35:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12063064064. Throughput: 0: 42318.0. Samples: 12063166000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 20:35:14,422][15401] Updated weights for policy 0, policy_version 736272 (0.0024) [2024-06-24 20:35:17,473][15401] Updated weights for policy 0, policy_version 736282 (0.0032) [2024-06-24 20:35:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12063277056. Throughput: 0: 42454.8. Samples: 12063421140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 20:35:22,000][15401] Updated weights for policy 0, policy_version 736292 (0.0040) [2024-06-24 20:35:23,389][15132] Fps is (10 sec: 37682.9, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 12063440896. Throughput: 0: 42464.0. Samples: 12063554320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 20:35:25,468][15401] Updated weights for policy 0, policy_version 736302 (0.0036) [2024-06-24 20:35:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12063686656. Throughput: 0: 42382.2. Samples: 12063806300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 20:35:29,983][15401] Updated weights for policy 0, policy_version 736312 (0.0043) [2024-06-24 20:35:32,977][15401] Updated weights for policy 0, policy_version 736322 (0.0031) [2024-06-24 20:35:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12063916032. Throughput: 0: 42592.0. Samples: 12064065160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:33,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 20:35:37,449][15401] Updated weights for policy 0, policy_version 736332 (0.0029) [2024-06-24 20:35:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 12064079872. Throughput: 0: 42574.3. Samples: 12064194420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 20:35:40,498][15401] Updated weights for policy 0, policy_version 736342 (0.0036) [2024-06-24 20:35:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12064325632. Throughput: 0: 42603.5. Samples: 12064449700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 20:35:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736349_12064342016.pth... [2024-06-24 20:35:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000735723_12054085632.pth [2024-06-24 20:35:44,911][15401] Updated weights for policy 0, policy_version 736352 (0.0037) [2024-06-24 20:35:48,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12064538624. Throughput: 0: 42618.7. Samples: 12064702400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 20:35:48,410][15401] Updated weights for policy 0, policy_version 736362 (0.0044) [2024-06-24 20:35:52,339][15401] Updated weights for policy 0, policy_version 736372 (0.0036) [2024-06-24 20:35:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12064735232. Throughput: 0: 42587.2. Samples: 12064832860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 20:35:55,949][15401] Updated weights for policy 0, policy_version 736382 (0.0043) [2024-06-24 20:35:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12064997376. Throughput: 0: 42784.4. Samples: 12065091300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:35:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 20:35:59,787][15401] Updated weights for policy 0, policy_version 736392 (0.0039) [2024-06-24 20:36:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12065193984. Throughput: 0: 42857.7. Samples: 12065349740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:03,399][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 20:36:03,635][15401] Updated weights for policy 0, policy_version 736402 (0.0044) [2024-06-24 20:36:07,271][15401] Updated weights for policy 0, policy_version 736412 (0.0033) [2024-06-24 20:36:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12065390592. Throughput: 0: 42692.9. Samples: 12065475500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 20:36:09,027][15349] Signal inference workers to stop experience collection... (178600 times) [2024-06-24 20:36:09,027][15349] Signal inference workers to resume experience collection... (178600 times) [2024-06-24 20:36:09,041][15401] InferenceWorker_p0-w0: stopping experience collection (178600 times) [2024-06-24 20:36:09,041][15401] InferenceWorker_p0-w0: resuming experience collection (178600 times) [2024-06-24 20:36:11,612][15401] Updated weights for policy 0, policy_version 736422 (0.0028) [2024-06-24 20:36:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12065619968. Throughput: 0: 42927.5. Samples: 12065738040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 20:36:14,749][15401] Updated weights for policy 0, policy_version 736432 (0.0032) [2024-06-24 20:36:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12065816576. Throughput: 0: 42801.0. Samples: 12065991200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 20:36:19,444][15401] Updated weights for policy 0, policy_version 736442 (0.0036) [2024-06-24 20:36:22,863][15401] Updated weights for policy 0, policy_version 736452 (0.0034) [2024-06-24 20:36:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12066029568. Throughput: 0: 42666.6. Samples: 12066114420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:23,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 20:36:27,372][15401] Updated weights for policy 0, policy_version 736462 (0.0026) [2024-06-24 20:36:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12066242560. Throughput: 0: 42712.5. Samples: 12066371760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-24 20:36:30,567][15401] Updated weights for policy 0, policy_version 736472 (0.0029) [2024-06-24 20:36:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12066439168. Throughput: 0: 42856.0. Samples: 12066630920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 20:36:34,847][15401] Updated weights for policy 0, policy_version 736482 (0.0032) [2024-06-24 20:36:38,368][15401] Updated weights for policy 0, policy_version 736492 (0.0048) [2024-06-24 20:36:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12066684928. Throughput: 0: 42607.1. Samples: 12066750180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 20:36:38,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 20:36:42,429][15401] Updated weights for policy 0, policy_version 736502 (0.0040) [2024-06-24 20:36:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12066881536. Throughput: 0: 42628.1. Samples: 12067009560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:36:43,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 20:36:45,990][15401] Updated weights for policy 0, policy_version 736512 (0.0025) [2024-06-24 20:36:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 12067078144. Throughput: 0: 42574.3. Samples: 12067265580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:36:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 20:36:50,460][15401] Updated weights for policy 0, policy_version 736522 (0.0030) [2024-06-24 20:36:53,390][15132] Fps is (10 sec: 44235.3, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 12067323904. Throughput: 0: 42575.3. Samples: 12067391400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:36:53,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-24 20:36:53,550][15401] Updated weights for policy 0, policy_version 736532 (0.0042) [2024-06-24 20:36:58,033][15401] Updated weights for policy 0, policy_version 736542 (0.0033) [2024-06-24 20:36:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12067520512. Throughput: 0: 42560.1. Samples: 12067653240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:36:58,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 20:37:01,599][15401] Updated weights for policy 0, policy_version 736552 (0.0037) [2024-06-24 20:37:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12067733504. Throughput: 0: 42765.6. Samples: 12067915660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:03,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-24 20:37:05,660][15401] Updated weights for policy 0, policy_version 736562 (0.0035) [2024-06-24 20:37:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 12067962880. Throughput: 0: 42723.9. Samples: 12068037000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:08,390][15132] Avg episode reward: [(0, '0.079')] [2024-06-24 20:37:09,085][15401] Updated weights for policy 0, policy_version 736572 (0.0033) [2024-06-24 20:37:13,244][15401] Updated weights for policy 0, policy_version 736582 (0.0028) [2024-06-24 20:37:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12068159488. Throughput: 0: 42768.9. Samples: 12068296360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 20:37:16,737][15401] Updated weights for policy 0, policy_version 736592 (0.0039) [2024-06-24 20:37:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12068372480. Throughput: 0: 42777.3. Samples: 12068555900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 20:37:20,886][15349] Signal inference workers to stop experience collection... (178650 times) [2024-06-24 20:37:20,892][15349] Signal inference workers to resume experience collection... (178650 times) [2024-06-24 20:37:20,893][15401] Updated weights for policy 0, policy_version 736602 (0.0041) [2024-06-24 20:37:20,914][15401] InferenceWorker_p0-w0: stopping experience collection (178650 times) [2024-06-24 20:37:20,915][15401] InferenceWorker_p0-w0: resuming experience collection (178650 times) [2024-06-24 20:37:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12068585472. Throughput: 0: 42858.6. Samples: 12068678820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 20:37:24,450][15401] Updated weights for policy 0, policy_version 736612 (0.0035) [2024-06-24 20:37:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12068798464. Throughput: 0: 42744.3. Samples: 12068933060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:28,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 20:37:28,562][15401] Updated weights for policy 0, policy_version 736622 (0.0045) [2024-06-24 20:37:32,158][15401] Updated weights for policy 0, policy_version 736632 (0.0036) [2024-06-24 20:37:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12068995072. Throughput: 0: 42681.8. Samples: 12069186260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 20:37:36,403][15401] Updated weights for policy 0, policy_version 736642 (0.0033) [2024-06-24 20:37:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 12069240832. Throughput: 0: 42671.7. Samples: 12069311620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 20:37:39,621][15401] Updated weights for policy 0, policy_version 736652 (0.0046) [2024-06-24 20:37:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42710.1). Total num frames: 12069437440. Throughput: 0: 42642.2. Samples: 12069572140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:43,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-24 20:37:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736660_12069437440.pth... [2024-06-24 20:37:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736036_12059213824.pth [2024-06-24 20:37:44,018][15401] Updated weights for policy 0, policy_version 736662 (0.0024) [2024-06-24 20:37:47,633][15401] Updated weights for policy 0, policy_version 736672 (0.0040) [2024-06-24 20:37:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12069650432. Throughput: 0: 42501.1. Samples: 12069828200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:48,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-24 20:37:51,721][15401] Updated weights for policy 0, policy_version 736682 (0.0043) [2024-06-24 20:37:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 12069879808. Throughput: 0: 42642.3. Samples: 12069955900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 20:37:55,513][15401] Updated weights for policy 0, policy_version 736692 (0.0040) [2024-06-24 20:37:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12070092800. Throughput: 0: 42460.8. Samples: 12070207100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:37:58,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 20:37:59,310][15401] Updated weights for policy 0, policy_version 736702 (0.0029) [2024-06-24 20:38:03,129][15401] Updated weights for policy 0, policy_version 736712 (0.0039) [2024-06-24 20:38:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12070289408. Throughput: 0: 42407.2. Samples: 12070464220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:38:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 20:38:07,167][15401] Updated weights for policy 0, policy_version 736722 (0.0044) [2024-06-24 20:38:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12070518784. Throughput: 0: 42404.5. Samples: 12070587020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:38:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 20:38:10,816][15401] Updated weights for policy 0, policy_version 736732 (0.0035) [2024-06-24 20:38:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12070715392. Throughput: 0: 42544.9. Samples: 12070847580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:38:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 20:38:14,887][15401] Updated weights for policy 0, policy_version 736742 (0.0032) [2024-06-24 20:38:18,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42597.2, 300 sec: 42598.1). Total num frames: 12070928384. Throughput: 0: 42489.5. Samples: 12071098360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 20:38:18,392][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 20:38:18,605][15401] Updated weights for policy 0, policy_version 736752 (0.0038) [2024-06-24 20:38:22,593][15401] Updated weights for policy 0, policy_version 736762 (0.0035) [2024-06-24 20:38:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12071157760. Throughput: 0: 42615.6. Samples: 12071229320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:23,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 20:38:26,443][15401] Updated weights for policy 0, policy_version 736772 (0.0040) [2024-06-24 20:38:28,389][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12071354368. Throughput: 0: 42664.5. Samples: 12071492040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:28,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 20:38:30,226][15401] Updated weights for policy 0, policy_version 736782 (0.0032) [2024-06-24 20:38:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.8). Total num frames: 12071583744. Throughput: 0: 42547.5. Samples: 12071742840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 20:38:33,915][15401] Updated weights for policy 0, policy_version 736792 (0.0039) [2024-06-24 20:38:36,495][15349] Signal inference workers to stop experience collection... (178700 times) [2024-06-24 20:38:36,496][15349] Signal inference workers to resume experience collection... (178700 times) [2024-06-24 20:38:36,542][15401] InferenceWorker_p0-w0: stopping experience collection (178700 times) [2024-06-24 20:38:36,543][15401] InferenceWorker_p0-w0: resuming experience collection (178700 times) [2024-06-24 20:38:37,758][15401] Updated weights for policy 0, policy_version 736802 (0.0043) [2024-06-24 20:38:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12071796736. Throughput: 0: 42680.7. Samples: 12071876540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:38:41,515][15401] Updated weights for policy 0, policy_version 736812 (0.0024) [2024-06-24 20:38:43,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12071993344. Throughput: 0: 42766.1. Samples: 12072131680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:43,393][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 20:38:45,371][15401] Updated weights for policy 0, policy_version 736822 (0.0039) [2024-06-24 20:38:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12072239104. Throughput: 0: 42602.6. Samples: 12072381340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 20:38:49,556][15401] Updated weights for policy 0, policy_version 736832 (0.0043) [2024-06-24 20:38:52,886][15401] Updated weights for policy 0, policy_version 736842 (0.0034) [2024-06-24 20:38:53,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12072435712. Throughput: 0: 42941.3. Samples: 12072519380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 20:38:57,165][15401] Updated weights for policy 0, policy_version 736852 (0.0036) [2024-06-24 20:38:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12072632320. Throughput: 0: 42940.5. Samples: 12072779900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:38:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 20:39:00,358][15401] Updated weights for policy 0, policy_version 736862 (0.0029) [2024-06-24 20:39:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12072894464. Throughput: 0: 42775.4. Samples: 12073023180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 20:39:04,976][15401] Updated weights for policy 0, policy_version 736872 (0.0030) [2024-06-24 20:39:07,970][15401] Updated weights for policy 0, policy_version 736882 (0.0036) [2024-06-24 20:39:08,390][15132] Fps is (10 sec: 45873.2, 60 sec: 42871.1, 300 sec: 42709.4). Total num frames: 12073091072. Throughput: 0: 43077.3. Samples: 12073167820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:08,394][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 20:39:12,451][15401] Updated weights for policy 0, policy_version 736892 (0.0027) [2024-06-24 20:39:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12073287680. Throughput: 0: 42945.3. Samples: 12073424580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 20:39:15,553][15401] Updated weights for policy 0, policy_version 736902 (0.0053) [2024-06-24 20:39:18,390][15132] Fps is (10 sec: 44238.6, 60 sec: 43418.8, 300 sec: 42709.5). Total num frames: 12073533440. Throughput: 0: 42744.3. Samples: 12073666340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 20:39:20,031][15401] Updated weights for policy 0, policy_version 736912 (0.0039) [2024-06-24 20:39:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12073713664. Throughput: 0: 42823.1. Samples: 12073803580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:23,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 20:39:23,639][15401] Updated weights for policy 0, policy_version 736922 (0.0039) [2024-06-24 20:39:27,525][15401] Updated weights for policy 0, policy_version 736932 (0.0034) [2024-06-24 20:39:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12073926656. Throughput: 0: 42810.3. Samples: 12074058040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:28,390][15132] Avg episode reward: [(0, '0.161')] [2024-06-24 20:39:31,298][15401] Updated weights for policy 0, policy_version 736942 (0.0042) [2024-06-24 20:39:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12074156032. Throughput: 0: 42771.1. Samples: 12074306040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 20:39:35,066][15401] Updated weights for policy 0, policy_version 736952 (0.0034) [2024-06-24 20:39:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12074352640. Throughput: 0: 42623.9. Samples: 12074437460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 20:39:38,907][15401] Updated weights for policy 0, policy_version 736962 (0.0036) [2024-06-24 20:39:42,710][15401] Updated weights for policy 0, policy_version 736972 (0.0028) [2024-06-24 20:39:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 12074565632. Throughput: 0: 42427.5. Samples: 12074689140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 20:39:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736973_12074565632.pth... [2024-06-24 20:39:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736349_12064342016.pth [2024-06-24 20:39:46,651][15401] Updated weights for policy 0, policy_version 736982 (0.0045) [2024-06-24 20:39:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 12074795008. Throughput: 0: 42732.9. Samples: 12074946260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:48,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 20:39:50,147][15401] Updated weights for policy 0, policy_version 736992 (0.0033) [2024-06-24 20:39:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12074991616. Throughput: 0: 42446.3. Samples: 12075077880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 20:39:54,080][15401] Updated weights for policy 0, policy_version 737002 (0.0025) [2024-06-24 20:39:57,720][15401] Updated weights for policy 0, policy_version 737012 (0.0033) [2024-06-24 20:39:58,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12075204608. Throughput: 0: 42300.9. Samples: 12075328120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:39:58,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 20:40:02,035][15401] Updated weights for policy 0, policy_version 737022 (0.0046) [2024-06-24 20:40:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12075433984. Throughput: 0: 42590.3. Samples: 12075582900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-24 20:40:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 20:40:05,922][15401] Updated weights for policy 0, policy_version 737032 (0.0037) [2024-06-24 20:40:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.7, 300 sec: 42598.4). Total num frames: 12075630592. Throughput: 0: 42498.7. Samples: 12075716020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 20:40:09,770][15401] Updated weights for policy 0, policy_version 737042 (0.0029) [2024-06-24 20:40:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12075843584. Throughput: 0: 42422.7. Samples: 12075967060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 20:40:13,956][15401] Updated weights for policy 0, policy_version 737052 (0.0039) [2024-06-24 20:40:14,800][15349] Signal inference workers to stop experience collection... (178750 times) [2024-06-24 20:40:14,800][15349] Signal inference workers to resume experience collection... (178750 times) [2024-06-24 20:40:14,846][15401] InferenceWorker_p0-w0: stopping experience collection (178750 times) [2024-06-24 20:40:14,846][15401] InferenceWorker_p0-w0: resuming experience collection (178750 times) [2024-06-24 20:40:17,534][15401] Updated weights for policy 0, policy_version 737062 (0.0027) [2024-06-24 20:40:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12076072960. Throughput: 0: 42621.8. Samples: 12076224020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 20:40:21,475][15401] Updated weights for policy 0, policy_version 737072 (0.0033) [2024-06-24 20:40:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12076253184. Throughput: 0: 42569.3. Samples: 12076353080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:23,392][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 20:40:25,142][15401] Updated weights for policy 0, policy_version 737082 (0.0028) [2024-06-24 20:40:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12076498944. Throughput: 0: 42508.9. Samples: 12076602040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 20:40:28,967][15401] Updated weights for policy 0, policy_version 737092 (0.0041) [2024-06-24 20:40:32,851][15401] Updated weights for policy 0, policy_version 737102 (0.0027) [2024-06-24 20:40:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12076711936. Throughput: 0: 42613.5. Samples: 12076863760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 20:40:36,500][15401] Updated weights for policy 0, policy_version 737112 (0.0039) [2024-06-24 20:40:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12076908544. Throughput: 0: 42591.6. Samples: 12076994500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:38,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 20:40:40,794][15401] Updated weights for policy 0, policy_version 737122 (0.0027) [2024-06-24 20:40:43,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12077121536. Throughput: 0: 42520.0. Samples: 12077241520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 20:40:44,610][15401] Updated weights for policy 0, policy_version 737132 (0.0030) [2024-06-24 20:40:48,289][15401] Updated weights for policy 0, policy_version 737142 (0.0027) [2024-06-24 20:40:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12077334528. Throughput: 0: 42698.7. Samples: 12077504340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 20:40:52,206][15401] Updated weights for policy 0, policy_version 737152 (0.0037) [2024-06-24 20:40:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12077547520. Throughput: 0: 42653.7. Samples: 12077635440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 20:40:55,771][15401] Updated weights for policy 0, policy_version 737162 (0.0042) [2024-06-24 20:40:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12077760512. Throughput: 0: 42650.3. Samples: 12077886320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:40:58,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 20:40:59,962][15401] Updated weights for policy 0, policy_version 737172 (0.0033) [2024-06-24 20:41:03,393][15132] Fps is (10 sec: 42585.0, 60 sec: 42323.1, 300 sec: 42653.5). Total num frames: 12077973504. Throughput: 0: 42723.7. Samples: 12078146720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:03,393][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 20:41:03,435][15401] Updated weights for policy 0, policy_version 737182 (0.0038) [2024-06-24 20:41:07,977][15401] Updated weights for policy 0, policy_version 737192 (0.0025) [2024-06-24 20:41:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12078186496. Throughput: 0: 42677.0. Samples: 12078273540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 20:41:11,035][15401] Updated weights for policy 0, policy_version 737202 (0.0034) [2024-06-24 20:41:13,389][15132] Fps is (10 sec: 44251.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12078415872. Throughput: 0: 42731.6. Samples: 12078524960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 20:41:15,254][15401] Updated weights for policy 0, policy_version 737212 (0.0035) [2024-06-24 20:41:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12078612480. Throughput: 0: 42751.8. Samples: 12078787600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 20:41:18,973][15401] Updated weights for policy 0, policy_version 737222 (0.0025) [2024-06-24 20:41:22,881][15349] Signal inference workers to stop experience collection... (178800 times) [2024-06-24 20:41:22,883][15349] Signal inference workers to resume experience collection... (178800 times) [2024-06-24 20:41:22,929][15401] InferenceWorker_p0-w0: stopping experience collection (178800 times) [2024-06-24 20:41:22,930][15401] InferenceWorker_p0-w0: resuming experience collection (178800 times) [2024-06-24 20:41:23,049][15401] Updated weights for policy 0, policy_version 737232 (0.0032) [2024-06-24 20:41:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12078841856. Throughput: 0: 42735.8. Samples: 12078917620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 20:41:26,442][15401] Updated weights for policy 0, policy_version 737242 (0.0024) [2024-06-24 20:41:28,396][15132] Fps is (10 sec: 45846.1, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 12079071232. Throughput: 0: 42989.9. Samples: 12079176340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:28,397][15132] Avg episode reward: [(0, '0.418')] [2024-06-24 20:41:30,557][15401] Updated weights for policy 0, policy_version 737252 (0.0041) [2024-06-24 20:41:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12079267840. Throughput: 0: 43036.9. Samples: 12079441000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:33,398][15132] Avg episode reward: [(0, '0.845')] [2024-06-24 20:41:33,853][15401] Updated weights for policy 0, policy_version 737262 (0.0036) [2024-06-24 20:41:38,213][15401] Updated weights for policy 0, policy_version 737272 (0.0039) [2024-06-24 20:41:38,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12079464448. Throughput: 0: 42893.0. Samples: 12079565620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:41:41,631][15401] Updated weights for policy 0, policy_version 737282 (0.0027) [2024-06-24 20:41:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12079710208. Throughput: 0: 42927.1. Samples: 12079818040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 20:41:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737288_12079726592.pth... [2024-06-24 20:41:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736660_12069437440.pth [2024-06-24 20:41:45,946][15401] Updated weights for policy 0, policy_version 737292 (0.0050) [2024-06-24 20:41:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 12079906816. Throughput: 0: 43009.2. Samples: 12080082000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 20:41:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 20:41:49,162][15401] Updated weights for policy 0, policy_version 737302 (0.0034) [2024-06-24 20:41:53,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12080103424. Throughput: 0: 42995.8. Samples: 12080208460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:41:53,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 20:41:53,560][15401] Updated weights for policy 0, policy_version 737312 (0.0032) [2024-06-24 20:41:56,529][15401] Updated weights for policy 0, policy_version 737322 (0.0039) [2024-06-24 20:41:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 12080365568. Throughput: 0: 43220.3. Samples: 12080469880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:41:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 20:42:01,051][15401] Updated weights for policy 0, policy_version 737332 (0.0043) [2024-06-24 20:42:03,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43146.9, 300 sec: 42709.5). Total num frames: 12080562176. Throughput: 0: 43187.7. Samples: 12080731040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:03,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 20:42:04,393][15401] Updated weights for policy 0, policy_version 737342 (0.0030) [2024-06-24 20:42:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12080758784. Throughput: 0: 43100.5. Samples: 12080857140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 20:42:08,402][15401] Updated weights for policy 0, policy_version 737352 (0.0042) [2024-06-24 20:42:12,134][15401] Updated weights for policy 0, policy_version 737362 (0.0033) [2024-06-24 20:42:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12080988160. Throughput: 0: 43092.0. Samples: 12081115200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 20:42:16,317][15401] Updated weights for policy 0, policy_version 737372 (0.0032) [2024-06-24 20:42:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12081201152. Throughput: 0: 42779.2. Samples: 12081366060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 20:42:19,763][15401] Updated weights for policy 0, policy_version 737382 (0.0033) [2024-06-24 20:42:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12081397760. Throughput: 0: 42842.0. Samples: 12081493520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 20:42:23,917][15401] Updated weights for policy 0, policy_version 737392 (0.0027) [2024-06-24 20:42:27,516][15401] Updated weights for policy 0, policy_version 737402 (0.0026) [2024-06-24 20:42:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 12081627136. Throughput: 0: 42977.9. Samples: 12081752040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 20:42:31,487][15401] Updated weights for policy 0, policy_version 737412 (0.0060) [2024-06-24 20:42:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12081823744. Throughput: 0: 42762.3. Samples: 12082006300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 20:42:35,506][15401] Updated weights for policy 0, policy_version 737422 (0.0035) [2024-06-24 20:42:35,809][15349] Signal inference workers to stop experience collection... (178850 times) [2024-06-24 20:42:35,851][15401] InferenceWorker_p0-w0: stopping experience collection (178850 times) [2024-06-24 20:42:35,860][15349] Signal inference workers to resume experience collection... (178850 times) [2024-06-24 20:42:35,873][15401] InferenceWorker_p0-w0: resuming experience collection (178850 times) [2024-06-24 20:42:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12082053120. Throughput: 0: 42671.1. Samples: 12082128560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 20:42:39,154][15401] Updated weights for policy 0, policy_version 737432 (0.0036) [2024-06-24 20:42:43,102][15401] Updated weights for policy 0, policy_version 737442 (0.0032) [2024-06-24 20:42:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12082266112. Throughput: 0: 42629.6. Samples: 12082388200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 20:42:46,829][15401] Updated weights for policy 0, policy_version 737452 (0.0043) [2024-06-24 20:42:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12082479104. Throughput: 0: 42403.9. Samples: 12082639220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 20:42:50,736][15401] Updated weights for policy 0, policy_version 737462 (0.0034) [2024-06-24 20:42:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 12082692096. Throughput: 0: 42363.1. Samples: 12082763480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 20:42:54,850][15401] Updated weights for policy 0, policy_version 737472 (0.0050) [2024-06-24 20:42:58,314][15401] Updated weights for policy 0, policy_version 737482 (0.0038) [2024-06-24 20:42:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12082905088. Throughput: 0: 42495.1. Samples: 12083027480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:42:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 20:43:02,304][15401] Updated weights for policy 0, policy_version 737492 (0.0034) [2024-06-24 20:43:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12083118080. Throughput: 0: 42546.6. Samples: 12083280660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 20:43:06,011][15401] Updated weights for policy 0, policy_version 737502 (0.0027) [2024-06-24 20:43:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12083331072. Throughput: 0: 42533.1. Samples: 12083407500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 20:43:09,819][15401] Updated weights for policy 0, policy_version 737512 (0.0029) [2024-06-24 20:43:13,391][15132] Fps is (10 sec: 40954.7, 60 sec: 42324.3, 300 sec: 42709.5). Total num frames: 12083527680. Throughput: 0: 42504.5. Samples: 12083664800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:13,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 20:43:13,617][15401] Updated weights for policy 0, policy_version 737522 (0.0035) [2024-06-24 20:43:17,265][15401] Updated weights for policy 0, policy_version 737532 (0.0033) [2024-06-24 20:43:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 12083757056. Throughput: 0: 42636.0. Samples: 12083925020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:18,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 20:43:21,351][15401] Updated weights for policy 0, policy_version 737542 (0.0036) [2024-06-24 20:43:23,390][15132] Fps is (10 sec: 45881.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12083986432. Throughput: 0: 42872.1. Samples: 12084057800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 20:43:24,939][15401] Updated weights for policy 0, policy_version 737552 (0.0033) [2024-06-24 20:43:28,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 12084183040. Throughput: 0: 42673.6. Samples: 12084308620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 20:43:28,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 20:43:29,083][15401] Updated weights for policy 0, policy_version 737562 (0.0029) [2024-06-24 20:43:32,674][15401] Updated weights for policy 0, policy_version 737572 (0.0043) [2024-06-24 20:43:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12084412416. Throughput: 0: 42729.3. Samples: 12084562040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-24 20:43:36,843][15401] Updated weights for policy 0, policy_version 737582 (0.0033) [2024-06-24 20:43:38,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42596.8, 300 sec: 42765.0). Total num frames: 12084609024. Throughput: 0: 42901.2. Samples: 12084694140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:38,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 20:43:40,340][15401] Updated weights for policy 0, policy_version 737592 (0.0030) [2024-06-24 20:43:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12084838400. Throughput: 0: 42712.0. Samples: 12084949520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 20:43:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737600_12084838400.pth... [2024-06-24 20:43:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000736973_12074565632.pth [2024-06-24 20:43:44,816][15401] Updated weights for policy 0, policy_version 737602 (0.0033) [2024-06-24 20:43:47,935][15401] Updated weights for policy 0, policy_version 737612 (0.0040) [2024-06-24 20:43:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12085051392. Throughput: 0: 42724.5. Samples: 12085203260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 20:43:52,336][15401] Updated weights for policy 0, policy_version 737622 (0.0040) [2024-06-24 20:43:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12085248000. Throughput: 0: 42860.7. Samples: 12085336240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 20:43:55,398][15401] Updated weights for policy 0, policy_version 737632 (0.0028) [2024-06-24 20:43:58,391][15132] Fps is (10 sec: 40954.2, 60 sec: 42597.4, 300 sec: 42598.2). Total num frames: 12085460992. Throughput: 0: 42798.6. Samples: 12085590740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:43:58,391][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 20:43:59,930][15401] Updated weights for policy 0, policy_version 737642 (0.0039) [2024-06-24 20:44:01,439][15349] Signal inference workers to stop experience collection... (178900 times) [2024-06-24 20:44:01,439][15349] Signal inference workers to resume experience collection... (178900 times) [2024-06-24 20:44:01,480][15401] InferenceWorker_p0-w0: stopping experience collection (178900 times) [2024-06-24 20:44:01,481][15401] InferenceWorker_p0-w0: resuming experience collection (178900 times) [2024-06-24 20:44:03,207][15401] Updated weights for policy 0, policy_version 737652 (0.0031) [2024-06-24 20:44:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.6). Total num frames: 12085690368. Throughput: 0: 42665.9. Samples: 12085844880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 20:44:07,570][15401] Updated weights for policy 0, policy_version 737662 (0.0041) [2024-06-24 20:44:08,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12085870592. Throughput: 0: 42573.7. Samples: 12085973620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:08,395][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 20:44:11,026][15401] Updated weights for policy 0, policy_version 737672 (0.0050) [2024-06-24 20:44:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43145.4, 300 sec: 42653.9). Total num frames: 12086116352. Throughput: 0: 42573.7. Samples: 12086224340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 20:44:15,265][15401] Updated weights for policy 0, policy_version 737682 (0.0033) [2024-06-24 20:44:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12086329344. Throughput: 0: 42564.0. Samples: 12086477420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 20:44:18,647][15401] Updated weights for policy 0, policy_version 737692 (0.0028) [2024-06-24 20:44:22,839][15401] Updated weights for policy 0, policy_version 737702 (0.0026) [2024-06-24 20:44:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12086525952. Throughput: 0: 42694.7. Samples: 12086615300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 20:44:26,061][15401] Updated weights for policy 0, policy_version 737712 (0.0029) [2024-06-24 20:44:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12086738944. Throughput: 0: 42481.3. Samples: 12086861180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 20:44:30,561][15401] Updated weights for policy 0, policy_version 737722 (0.0037) [2024-06-24 20:44:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12086984704. Throughput: 0: 42580.0. Samples: 12087119360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 20:44:33,686][15401] Updated weights for policy 0, policy_version 737732 (0.0029) [2024-06-24 20:44:38,280][15401] Updated weights for policy 0, policy_version 737742 (0.0028) [2024-06-24 20:44:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12087164928. Throughput: 0: 42517.4. Samples: 12087249520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 20:44:41,375][15401] Updated weights for policy 0, policy_version 737752 (0.0032) [2024-06-24 20:44:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12087377920. Throughput: 0: 42420.8. Samples: 12087499620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:43,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 20:44:46,011][15401] Updated weights for policy 0, policy_version 737762 (0.0038) [2024-06-24 20:44:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12087590912. Throughput: 0: 42700.0. Samples: 12087766380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 20:44:48,986][15401] Updated weights for policy 0, policy_version 737772 (0.0039) [2024-06-24 20:44:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12087803904. Throughput: 0: 42625.0. Samples: 12087891740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 20:44:53,517][15401] Updated weights for policy 0, policy_version 737782 (0.0039) [2024-06-24 20:44:56,513][15401] Updated weights for policy 0, policy_version 737792 (0.0047) [2024-06-24 20:44:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42872.4, 300 sec: 42709.5). Total num frames: 12088033280. Throughput: 0: 42569.8. Samples: 12088139980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:44:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 20:45:01,281][15401] Updated weights for policy 0, policy_version 737802 (0.0033) [2024-06-24 20:45:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 12088246272. Throughput: 0: 42758.2. Samples: 12088401640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:45:03,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 20:45:04,480][15401] Updated weights for policy 0, policy_version 737812 (0.0043) [2024-06-24 20:45:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12088426496. Throughput: 0: 42493.3. Samples: 12088527500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:45:08,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 20:45:09,026][15401] Updated weights for policy 0, policy_version 737822 (0.0039) [2024-06-24 20:45:12,116][15401] Updated weights for policy 0, policy_version 737832 (0.0049) [2024-06-24 20:45:13,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12088672256. Throughput: 0: 42630.1. Samples: 12088779540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 20:45:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-24 20:45:16,796][15401] Updated weights for policy 0, policy_version 737842 (0.0033) [2024-06-24 20:45:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12088868864. Throughput: 0: 42733.3. Samples: 12089042360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 20:45:19,693][15401] Updated weights for policy 0, policy_version 737852 (0.0037) [2024-06-24 20:45:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12089065472. Throughput: 0: 42556.5. Samples: 12089164560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 20:45:23,441][15349] Signal inference workers to stop experience collection... (178950 times) [2024-06-24 20:45:23,441][15349] Signal inference workers to resume experience collection... (178950 times) [2024-06-24 20:45:23,457][15401] InferenceWorker_p0-w0: stopping experience collection (178950 times) [2024-06-24 20:45:23,457][15401] InferenceWorker_p0-w0: resuming experience collection (178950 times) [2024-06-24 20:45:24,630][15401] Updated weights for policy 0, policy_version 737862 (0.0033) [2024-06-24 20:45:27,577][15401] Updated weights for policy 0, policy_version 737872 (0.0033) [2024-06-24 20:45:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12089311232. Throughput: 0: 42620.9. Samples: 12089417560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 20:45:32,021][15401] Updated weights for policy 0, policy_version 737882 (0.0027) [2024-06-24 20:45:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12089507840. Throughput: 0: 42619.6. Samples: 12089684260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:33,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-24 20:45:35,297][15401] Updated weights for policy 0, policy_version 737892 (0.0043) [2024-06-24 20:45:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 12089720832. Throughput: 0: 42531.9. Samples: 12089805780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:38,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 20:45:39,462][15401] Updated weights for policy 0, policy_version 737902 (0.0041) [2024-06-24 20:45:43,078][15401] Updated weights for policy 0, policy_version 737912 (0.0031) [2024-06-24 20:45:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12089966592. Throughput: 0: 42644.5. Samples: 12090058980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:43,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 20:45:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737913_12089966592.pth... [2024-06-24 20:45:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737288_12079726592.pth [2024-06-24 20:45:47,488][15401] Updated weights for policy 0, policy_version 737922 (0.0042) [2024-06-24 20:45:48,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12090163200. Throughput: 0: 42620.6. Samples: 12090319460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 20:45:50,824][15401] Updated weights for policy 0, policy_version 737932 (0.0032) [2024-06-24 20:45:53,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12090343424. Throughput: 0: 42668.9. Samples: 12090447600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 20:45:55,125][15401] Updated weights for policy 0, policy_version 737942 (0.0033) [2024-06-24 20:45:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 12090589184. Throughput: 0: 42810.2. Samples: 12090706000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:45:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:45:58,720][15401] Updated weights for policy 0, policy_version 737952 (0.0033) [2024-06-24 20:46:02,592][15401] Updated weights for policy 0, policy_version 737962 (0.0033) [2024-06-24 20:46:03,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 12090818560. Throughput: 0: 42828.0. Samples: 12090969620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 20:46:06,520][15401] Updated weights for policy 0, policy_version 737972 (0.0036) [2024-06-24 20:46:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12090998784. Throughput: 0: 43033.7. Samples: 12091101080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 20:46:10,231][15401] Updated weights for policy 0, policy_version 737982 (0.0042) [2024-06-24 20:46:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12091244544. Throughput: 0: 42906.6. Samples: 12091348460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:13,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 20:46:14,286][15401] Updated weights for policy 0, policy_version 737992 (0.0048) [2024-06-24 20:46:17,818][15401] Updated weights for policy 0, policy_version 738002 (0.0026) [2024-06-24 20:46:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12091457536. Throughput: 0: 42861.8. Samples: 12091613040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 20:46:21,682][15401] Updated weights for policy 0, policy_version 738012 (0.0035) [2024-06-24 20:46:23,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 12091637760. Throughput: 0: 42929.4. Samples: 12091737500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 20:46:25,309][15401] Updated weights for policy 0, policy_version 738022 (0.0037) [2024-06-24 20:46:25,776][15349] Signal inference workers to stop experience collection... (179000 times) [2024-06-24 20:46:25,831][15401] InferenceWorker_p0-w0: stopping experience collection (179000 times) [2024-06-24 20:46:25,887][15349] Signal inference workers to resume experience collection... (179000 times) [2024-06-24 20:46:25,888][15401] InferenceWorker_p0-w0: resuming experience collection (179000 times) [2024-06-24 20:46:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12091883520. Throughput: 0: 42952.0. Samples: 12091991820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:28,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 20:46:29,579][15401] Updated weights for policy 0, policy_version 738032 (0.0026) [2024-06-24 20:46:32,918][15401] Updated weights for policy 0, policy_version 738042 (0.0028) [2024-06-24 20:46:33,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 12092096512. Throughput: 0: 42906.9. Samples: 12092250380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:33,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:46:37,055][15401] Updated weights for policy 0, policy_version 738052 (0.0027) [2024-06-24 20:46:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 12092293120. Throughput: 0: 42854.2. Samples: 12092376040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:38,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 20:46:40,825][15401] Updated weights for policy 0, policy_version 738062 (0.0037) [2024-06-24 20:46:43,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12092538880. Throughput: 0: 42798.7. Samples: 12092631940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:43,396][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 20:46:44,525][15401] Updated weights for policy 0, policy_version 738072 (0.0029) [2024-06-24 20:46:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12092719104. Throughput: 0: 42717.9. Samples: 12092891920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 20:46:48,487][15401] Updated weights for policy 0, policy_version 738082 (0.0033) [2024-06-24 20:46:52,046][15401] Updated weights for policy 0, policy_version 738092 (0.0041) [2024-06-24 20:46:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 12092932096. Throughput: 0: 42508.4. Samples: 12093013960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-24 20:46:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 20:46:56,196][15401] Updated weights for policy 0, policy_version 738102 (0.0049) [2024-06-24 20:46:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12093177856. Throughput: 0: 42760.0. Samples: 12093272560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:46:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 20:46:59,524][15401] Updated weights for policy 0, policy_version 738112 (0.0028) [2024-06-24 20:47:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12093358080. Throughput: 0: 42671.0. Samples: 12093533240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 20:47:03,951][15401] Updated weights for policy 0, policy_version 738122 (0.0036) [2024-06-24 20:47:07,188][15401] Updated weights for policy 0, policy_version 738132 (0.0036) [2024-06-24 20:47:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12093571072. Throughput: 0: 42593.3. Samples: 12093654200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:08,394][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 20:47:11,524][15401] Updated weights for policy 0, policy_version 738142 (0.0030) [2024-06-24 20:47:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12093816832. Throughput: 0: 42737.3. Samples: 12093915000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:13,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-24 20:47:15,218][15401] Updated weights for policy 0, policy_version 738152 (0.0034) [2024-06-24 20:47:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12094013440. Throughput: 0: 42846.4. Samples: 12094178360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 20:47:19,130][15401] Updated weights for policy 0, policy_version 738162 (0.0034) [2024-06-24 20:47:22,771][15401] Updated weights for policy 0, policy_version 738172 (0.0030) [2024-06-24 20:47:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12094226432. Throughput: 0: 42753.4. Samples: 12094299940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:23,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 20:47:26,755][15401] Updated weights for policy 0, policy_version 738182 (0.0035) [2024-06-24 20:47:28,394][15132] Fps is (10 sec: 42577.0, 60 sec: 42594.9, 300 sec: 42764.3). Total num frames: 12094439424. Throughput: 0: 42922.4. Samples: 12094563660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:28,395][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 20:47:30,516][15401] Updated weights for policy 0, policy_version 738192 (0.0032) [2024-06-24 20:47:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 12094636032. Throughput: 0: 42900.0. Samples: 12094822420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:33,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 20:47:34,368][15401] Updated weights for policy 0, policy_version 738202 (0.0046) [2024-06-24 20:47:37,967][15401] Updated weights for policy 0, policy_version 738212 (0.0035) [2024-06-24 20:47:38,396][15132] Fps is (10 sec: 42592.4, 60 sec: 42867.0, 300 sec: 42708.5). Total num frames: 12094865408. Throughput: 0: 42996.2. Samples: 12094949060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:38,396][15132] Avg episode reward: [(0, '0.818')] [2024-06-24 20:47:41,885][15401] Updated weights for policy 0, policy_version 738222 (0.0023) [2024-06-24 20:47:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12095078400. Throughput: 0: 42859.6. Samples: 12095201240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 20:47:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738226_12095094784.pth... [2024-06-24 20:47:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737600_12084838400.pth [2024-06-24 20:47:45,879][15401] Updated weights for policy 0, policy_version 738232 (0.0039) [2024-06-24 20:47:48,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12095291392. Throughput: 0: 42785.3. Samples: 12095458580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 20:47:49,410][15401] Updated weights for policy 0, policy_version 738242 (0.0038) [2024-06-24 20:47:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12095504384. Throughput: 0: 42910.6. Samples: 12095585180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 20:47:53,629][15401] Updated weights for policy 0, policy_version 738252 (0.0045) [2024-06-24 20:47:57,010][15401] Updated weights for policy 0, policy_version 738262 (0.0026) [2024-06-24 20:47:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12095717376. Throughput: 0: 42736.3. Samples: 12095838140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:47:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 20:48:01,549][15401] Updated weights for policy 0, policy_version 738272 (0.0031) [2024-06-24 20:48:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12095930368. Throughput: 0: 42855.6. Samples: 12096106860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 20:48:03,825][15349] Signal inference workers to stop experience collection... (179050 times) [2024-06-24 20:48:03,884][15401] InferenceWorker_p0-w0: stopping experience collection (179050 times) [2024-06-24 20:48:03,942][15349] Signal inference workers to resume experience collection... (179050 times) [2024-06-24 20:48:03,942][15401] InferenceWorker_p0-w0: resuming experience collection (179050 times) [2024-06-24 20:48:04,964][15401] Updated weights for policy 0, policy_version 738282 (0.0037) [2024-06-24 20:48:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.8). Total num frames: 12096159744. Throughput: 0: 42798.7. Samples: 12096225880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 20:48:09,008][15401] Updated weights for policy 0, policy_version 738292 (0.0041) [2024-06-24 20:48:12,457][15401] Updated weights for policy 0, policy_version 738302 (0.0047) [2024-06-24 20:48:13,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12096356352. Throughput: 0: 42634.9. Samples: 12096482020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 20:48:17,159][15401] Updated weights for policy 0, policy_version 738312 (0.0031) [2024-06-24 20:48:18,394][15132] Fps is (10 sec: 40941.5, 60 sec: 42595.2, 300 sec: 42653.3). Total num frames: 12096569344. Throughput: 0: 42704.2. Samples: 12096744300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:18,394][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 20:48:20,384][15401] Updated weights for policy 0, policy_version 738322 (0.0030) [2024-06-24 20:48:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12096798720. Throughput: 0: 42630.5. Samples: 12096867160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 20:48:24,692][15401] Updated weights for policy 0, policy_version 738332 (0.0029) [2024-06-24 20:48:28,048][15401] Updated weights for policy 0, policy_version 738342 (0.0034) [2024-06-24 20:48:28,389][15132] Fps is (10 sec: 44257.0, 60 sec: 42875.1, 300 sec: 42709.5). Total num frames: 12097011712. Throughput: 0: 42620.1. Samples: 12097119140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 20:48:32,086][15401] Updated weights for policy 0, policy_version 738352 (0.0049) [2024-06-24 20:48:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 12097208320. Throughput: 0: 42709.4. Samples: 12097380500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 20:48:35,659][15401] Updated weights for policy 0, policy_version 738362 (0.0036) [2024-06-24 20:48:38,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 12097437696. Throughput: 0: 42728.5. Samples: 12097507960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 20:48:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 20:48:39,493][15401] Updated weights for policy 0, policy_version 738372 (0.0036) [2024-06-24 20:48:43,183][15401] Updated weights for policy 0, policy_version 738382 (0.0032) [2024-06-24 20:48:43,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 12097667072. Throughput: 0: 42942.7. Samples: 12097770660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:48:43,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 20:48:47,016][15401] Updated weights for policy 0, policy_version 738392 (0.0034) [2024-06-24 20:48:48,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 12097863680. Throughput: 0: 42770.7. Samples: 12098031820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:48:48,396][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 20:48:50,750][15401] Updated weights for policy 0, policy_version 738402 (0.0038) [2024-06-24 20:48:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.6, 300 sec: 42765.2). Total num frames: 12098076672. Throughput: 0: 42822.2. Samples: 12098152880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:48:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 20:48:54,547][15401] Updated weights for policy 0, policy_version 738412 (0.0032) [2024-06-24 20:48:58,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12098289664. Throughput: 0: 42898.3. Samples: 12098412440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:48:58,396][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 20:48:58,625][15401] Updated weights for policy 0, policy_version 738422 (0.0037) [2024-06-24 20:49:02,880][15401] Updated weights for policy 0, policy_version 738432 (0.0034) [2024-06-24 20:49:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12098486272. Throughput: 0: 42733.5. Samples: 12098667120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 20:49:06,206][15401] Updated weights for policy 0, policy_version 738442 (0.0032) [2024-06-24 20:49:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12098732032. Throughput: 0: 42842.3. Samples: 12098795060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:08,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 20:49:10,300][15401] Updated weights for policy 0, policy_version 738452 (0.0042) [2024-06-24 20:49:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12098928640. Throughput: 0: 43187.3. Samples: 12099062580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 20:49:13,728][15401] Updated weights for policy 0, policy_version 738462 (0.0036) [2024-06-24 20:49:18,168][15401] Updated weights for policy 0, policy_version 738472 (0.0027) [2024-06-24 20:49:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42874.7, 300 sec: 42765.0). Total num frames: 12099141632. Throughput: 0: 42951.1. Samples: 12099313300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 20:49:21,682][15401] Updated weights for policy 0, policy_version 738482 (0.0027) [2024-06-24 20:49:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12099387392. Throughput: 0: 42842.2. Samples: 12099435860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 20:49:25,679][15401] Updated weights for policy 0, policy_version 738492 (0.0037) [2024-06-24 20:49:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12099551232. Throughput: 0: 42797.4. Samples: 12099696440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 20:49:28,404][15349] Signal inference workers to stop experience collection... (179100 times) [2024-06-24 20:49:28,448][15401] InferenceWorker_p0-w0: stopping experience collection (179100 times) [2024-06-24 20:49:28,456][15349] Signal inference workers to resume experience collection... (179100 times) [2024-06-24 20:49:28,461][15401] InferenceWorker_p0-w0: resuming experience collection (179100 times) [2024-06-24 20:49:29,232][15401] Updated weights for policy 0, policy_version 738502 (0.0040) [2024-06-24 20:49:33,288][15401] Updated weights for policy 0, policy_version 738512 (0.0032) [2024-06-24 20:49:33,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 12099780608. Throughput: 0: 42740.5. Samples: 12099954880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 20:49:36,874][15401] Updated weights for policy 0, policy_version 738522 (0.0033) [2024-06-24 20:49:38,390][15132] Fps is (10 sec: 45871.3, 60 sec: 42870.9, 300 sec: 42820.4). Total num frames: 12100009984. Throughput: 0: 42835.1. Samples: 12100080500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:38,391][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 20:49:40,881][15401] Updated weights for policy 0, policy_version 738532 (0.0042) [2024-06-24 20:49:43,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42054.0, 300 sec: 42709.5). Total num frames: 12100190208. Throughput: 0: 42814.6. Samples: 12100339100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 20:49:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738538_12100206592.pth... [2024-06-24 20:49:43,574][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000737913_12089966592.pth [2024-06-24 20:49:44,711][15401] Updated weights for policy 0, policy_version 738542 (0.0032) [2024-06-24 20:49:48,389][15132] Fps is (10 sec: 40963.7, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 12100419584. Throughput: 0: 42748.6. Samples: 12100590800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 20:49:48,418][15401] Updated weights for policy 0, policy_version 738552 (0.0029) [2024-06-24 20:49:52,313][15401] Updated weights for policy 0, policy_version 738562 (0.0029) [2024-06-24 20:49:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 12100648960. Throughput: 0: 42755.4. Samples: 12100719060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:53,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 20:49:56,137][15401] Updated weights for policy 0, policy_version 738572 (0.0033) [2024-06-24 20:49:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12100845568. Throughput: 0: 42625.5. Samples: 12100980720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:49:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 20:49:59,835][15401] Updated weights for policy 0, policy_version 738582 (0.0032) [2024-06-24 20:50:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12101074944. Throughput: 0: 42610.1. Samples: 12101230760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:50:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 20:50:03,782][15401] Updated weights for policy 0, policy_version 738592 (0.0033) [2024-06-24 20:50:07,439][15401] Updated weights for policy 0, policy_version 738602 (0.0032) [2024-06-24 20:50:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12101287936. Throughput: 0: 42924.9. Samples: 12101367480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:50:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 20:50:11,339][15401] Updated weights for policy 0, policy_version 738612 (0.0026) [2024-06-24 20:50:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12101468160. Throughput: 0: 42691.2. Samples: 12101617540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:50:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 20:50:15,232][15401] Updated weights for policy 0, policy_version 738622 (0.0028) [2024-06-24 20:50:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12101713920. Throughput: 0: 42699.3. Samples: 12101876340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 20:50:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 20:50:18,995][15401] Updated weights for policy 0, policy_version 738632 (0.0040) [2024-06-24 20:50:22,874][15401] Updated weights for policy 0, policy_version 738642 (0.0032) [2024-06-24 20:50:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12101926912. Throughput: 0: 42747.1. Samples: 12102004080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 20:50:26,599][15401] Updated weights for policy 0, policy_version 738652 (0.0029) [2024-06-24 20:50:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12102107136. Throughput: 0: 42591.2. Samples: 12102255700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 20:50:30,468][15401] Updated weights for policy 0, policy_version 738662 (0.0023) [2024-06-24 20:50:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 12102336512. Throughput: 0: 42714.2. Samples: 12102512940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 20:50:34,192][15401] Updated weights for policy 0, policy_version 738672 (0.0030) [2024-06-24 20:50:37,204][15349] Signal inference workers to stop experience collection... (179150 times) [2024-06-24 20:50:37,205][15349] Signal inference workers to resume experience collection... (179150 times) [2024-06-24 20:50:37,236][15401] InferenceWorker_p0-w0: stopping experience collection (179150 times) [2024-06-24 20:50:37,236][15401] InferenceWorker_p0-w0: resuming experience collection (179150 times) [2024-06-24 20:50:38,097][15401] Updated weights for policy 0, policy_version 738682 (0.0038) [2024-06-24 20:50:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.9, 300 sec: 42709.5). Total num frames: 12102565888. Throughput: 0: 42766.3. Samples: 12102643540. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 20:50:41,821][15401] Updated weights for policy 0, policy_version 738692 (0.0030) [2024-06-24 20:50:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12102746112. Throughput: 0: 42531.1. Samples: 12102894620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 20:50:46,010][15401] Updated weights for policy 0, policy_version 738702 (0.0033) [2024-06-24 20:50:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12102975488. Throughput: 0: 42724.0. Samples: 12103153340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:48,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 20:50:49,490][15401] Updated weights for policy 0, policy_version 738712 (0.0038) [2024-06-24 20:50:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12103204864. Throughput: 0: 42548.0. Samples: 12103282140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 20:50:53,615][15401] Updated weights for policy 0, policy_version 738722 (0.0031) [2024-06-24 20:50:57,448][15401] Updated weights for policy 0, policy_version 738732 (0.0038) [2024-06-24 20:50:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12103417856. Throughput: 0: 42812.4. Samples: 12103544100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:50:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 20:51:01,271][15401] Updated weights for policy 0, policy_version 738742 (0.0034) [2024-06-24 20:51:03,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 12103630848. Throughput: 0: 42741.4. Samples: 12103799800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:03,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 20:51:04,897][15401] Updated weights for policy 0, policy_version 738752 (0.0040) [2024-06-24 20:51:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12103843840. Throughput: 0: 42764.3. Samples: 12103928480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 20:51:08,900][15401] Updated weights for policy 0, policy_version 738762 (0.0032) [2024-06-24 20:51:12,242][15401] Updated weights for policy 0, policy_version 738772 (0.0039) [2024-06-24 20:51:13,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 12104089600. Throughput: 0: 42956.4. Samples: 12104188740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 20:51:16,442][15401] Updated weights for policy 0, policy_version 738782 (0.0027) [2024-06-24 20:51:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12104286208. Throughput: 0: 43078.3. Samples: 12104451460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 20:51:19,750][15401] Updated weights for policy 0, policy_version 738792 (0.0033) [2024-06-24 20:51:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12104499200. Throughput: 0: 42970.8. Samples: 12104577220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 20:51:24,065][15401] Updated weights for policy 0, policy_version 738802 (0.0024) [2024-06-24 20:51:27,286][15401] Updated weights for policy 0, policy_version 738812 (0.0047) [2024-06-24 20:51:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43690.6, 300 sec: 42820.9). Total num frames: 12104728576. Throughput: 0: 43031.0. Samples: 12104831020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 20:51:31,701][15401] Updated weights for policy 0, policy_version 738822 (0.0043) [2024-06-24 20:51:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12104925184. Throughput: 0: 43119.3. Samples: 12105093700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:33,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 20:51:35,214][15401] Updated weights for policy 0, policy_version 738832 (0.0034) [2024-06-24 20:51:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12105121792. Throughput: 0: 43221.4. Samples: 12105227100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 20:51:39,158][15401] Updated weights for policy 0, policy_version 738842 (0.0050) [2024-06-24 20:51:43,203][15401] Updated weights for policy 0, policy_version 738852 (0.0036) [2024-06-24 20:51:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 12105367552. Throughput: 0: 43048.8. Samples: 12105481300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 20:51:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738853_12105367552.pth... [2024-06-24 20:51:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738226_12095094784.pth [2024-06-24 20:51:46,855][15401] Updated weights for policy 0, policy_version 738862 (0.0032) [2024-06-24 20:51:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 12105580544. Throughput: 0: 43137.5. Samples: 12105740880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 20:51:49,203][15349] Signal inference workers to stop experience collection... (179200 times) [2024-06-24 20:51:49,252][15401] InferenceWorker_p0-w0: stopping experience collection (179200 times) [2024-06-24 20:51:49,262][15349] Signal inference workers to resume experience collection... (179200 times) [2024-06-24 20:51:49,264][15401] InferenceWorker_p0-w0: resuming experience collection (179200 times) [2024-06-24 20:51:50,653][15401] Updated weights for policy 0, policy_version 738872 (0.0039) [2024-06-24 20:51:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12105777152. Throughput: 0: 43205.0. Samples: 12105872700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 20:51:54,714][15401] Updated weights for policy 0, policy_version 738882 (0.0039) [2024-06-24 20:51:58,104][15401] Updated weights for policy 0, policy_version 738892 (0.0035) [2024-06-24 20:51:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12106006528. Throughput: 0: 43035.7. Samples: 12106125340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:51:58,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 20:52:02,339][15401] Updated weights for policy 0, policy_version 738902 (0.0041) [2024-06-24 20:52:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12106219520. Throughput: 0: 42913.3. Samples: 12106382560. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-24 20:52:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 20:52:05,819][15401] Updated weights for policy 0, policy_version 738912 (0.0039) [2024-06-24 20:52:08,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12106416128. Throughput: 0: 42936.0. Samples: 12106509340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 20:52:10,141][15401] Updated weights for policy 0, policy_version 738922 (0.0037) [2024-06-24 20:52:13,332][15401] Updated weights for policy 0, policy_version 738932 (0.0037) [2024-06-24 20:52:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12106661888. Throughput: 0: 42946.8. Samples: 12106763620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 20:52:17,868][15401] Updated weights for policy 0, policy_version 738942 (0.0045) [2024-06-24 20:52:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12106842112. Throughput: 0: 42914.2. Samples: 12107024840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:18,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 20:52:20,774][15401] Updated weights for policy 0, policy_version 738952 (0.0034) [2024-06-24 20:52:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42765.7). Total num frames: 12107055104. Throughput: 0: 42716.3. Samples: 12107149340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 20:52:25,473][15401] Updated weights for policy 0, policy_version 738962 (0.0034) [2024-06-24 20:52:28,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 12107300864. Throughput: 0: 42751.9. Samples: 12107405240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:28,393][15132] Avg episode reward: [(0, '0.359')] [2024-06-24 20:52:28,728][15401] Updated weights for policy 0, policy_version 738972 (0.0043) [2024-06-24 20:52:32,969][15401] Updated weights for policy 0, policy_version 738982 (0.0044) [2024-06-24 20:52:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 12107497472. Throughput: 0: 42819.5. Samples: 12107667760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 20:52:36,342][15401] Updated weights for policy 0, policy_version 738992 (0.0039) [2024-06-24 20:52:38,389][15132] Fps is (10 sec: 39332.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12107694080. Throughput: 0: 42702.8. Samples: 12107794320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 20:52:40,613][15401] Updated weights for policy 0, policy_version 739002 (0.0037) [2024-06-24 20:52:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 12107956224. Throughput: 0: 42812.8. Samples: 12108051920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:43,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 20:52:43,973][15401] Updated weights for policy 0, policy_version 739012 (0.0022) [2024-06-24 20:52:48,195][15401] Updated weights for policy 0, policy_version 739022 (0.0029) [2024-06-24 20:52:48,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12108152832. Throughput: 0: 42831.1. Samples: 12108309960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 20:52:51,823][15401] Updated weights for policy 0, policy_version 739032 (0.0039) [2024-06-24 20:52:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 12108333056. Throughput: 0: 42596.6. Samples: 12108426180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 20:52:56,171][15401] Updated weights for policy 0, policy_version 739042 (0.0029) [2024-06-24 20:52:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 12108595200. Throughput: 0: 42715.0. Samples: 12108685900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:52:58,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 20:53:00,017][15401] Updated weights for policy 0, policy_version 739053 (0.0034) [2024-06-24 20:53:03,389][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12108759040. Throughput: 0: 42533.3. Samples: 12108938840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 20:53:04,456][15401] Updated weights for policy 0, policy_version 739063 (0.0032) [2024-06-24 20:53:07,541][15401] Updated weights for policy 0, policy_version 739073 (0.0044) [2024-06-24 20:53:08,389][15132] Fps is (10 sec: 37692.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12108972032. Throughput: 0: 42443.7. Samples: 12109059300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:08,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-24 20:53:09,608][15349] Signal inference workers to stop experience collection... (179250 times) [2024-06-24 20:53:09,635][15401] InferenceWorker_p0-w0: stopping experience collection (179250 times) [2024-06-24 20:53:09,667][15349] Signal inference workers to resume experience collection... (179250 times) [2024-06-24 20:53:09,667][15401] InferenceWorker_p0-w0: resuming experience collection (179250 times) [2024-06-24 20:53:12,090][15401] Updated weights for policy 0, policy_version 739083 (0.0035) [2024-06-24 20:53:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42821.2). Total num frames: 12109201408. Throughput: 0: 42693.1. Samples: 12109326320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:13,390][15132] Avg episode reward: [(0, '0.187')] [2024-06-24 20:53:14,963][15401] Updated weights for policy 0, policy_version 739093 (0.0030) [2024-06-24 20:53:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12109414400. Throughput: 0: 42583.1. Samples: 12109584000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:18,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 20:53:19,865][15401] Updated weights for policy 0, policy_version 739103 (0.0039) [2024-06-24 20:53:22,415][15401] Updated weights for policy 0, policy_version 739113 (0.0029) [2024-06-24 20:53:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12109643776. Throughput: 0: 42458.1. Samples: 12109704940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 20:53:27,551][15401] Updated weights for policy 0, policy_version 739123 (0.0031) [2024-06-24 20:53:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.1, 300 sec: 42765.0). Total num frames: 12109824000. Throughput: 0: 42525.0. Samples: 12109965540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 20:53:30,610][15401] Updated weights for policy 0, policy_version 739133 (0.0024) [2024-06-24 20:53:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12110053376. Throughput: 0: 42397.9. Samples: 12110217860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 20:53:35,171][15401] Updated weights for policy 0, policy_version 739143 (0.0029) [2024-06-24 20:53:38,357][15401] Updated weights for policy 0, policy_version 739153 (0.0047) [2024-06-24 20:53:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 12110282752. Throughput: 0: 42683.5. Samples: 12110346940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 20:53:42,722][15401] Updated weights for policy 0, policy_version 739163 (0.0033) [2024-06-24 20:53:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42766.0). Total num frames: 12110479360. Throughput: 0: 42717.1. Samples: 12110608060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:43,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 20:53:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739166_12110495744.pth... [2024-06-24 20:53:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738538_12100206592.pth [2024-06-24 20:53:45,898][15401] Updated weights for policy 0, policy_version 739173 (0.0024) [2024-06-24 20:53:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12110692352. Throughput: 0: 42767.7. Samples: 12110863380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 20:53:48,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 20:53:50,290][15401] Updated weights for policy 0, policy_version 739183 (0.0039) [2024-06-24 20:53:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12110921728. Throughput: 0: 42987.2. Samples: 12110993720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:53:53,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-24 20:53:53,494][15401] Updated weights for policy 0, policy_version 739193 (0.0029) [2024-06-24 20:53:57,897][15401] Updated weights for policy 0, policy_version 739203 (0.0037) [2024-06-24 20:53:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 12111118336. Throughput: 0: 42918.2. Samples: 12111257640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:53:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 20:54:01,107][15401] Updated weights for policy 0, policy_version 739213 (0.0030) [2024-06-24 20:54:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12111347712. Throughput: 0: 42678.2. Samples: 12111504520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 20:54:05,320][15401] Updated weights for policy 0, policy_version 739223 (0.0021) [2024-06-24 20:54:08,394][15132] Fps is (10 sec: 44217.8, 60 sec: 43141.5, 300 sec: 42820.0). Total num frames: 12111560704. Throughput: 0: 42973.7. Samples: 12111638940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:08,394][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 20:54:08,888][15401] Updated weights for policy 0, policy_version 739233 (0.0028) [2024-06-24 20:54:12,911][15401] Updated weights for policy 0, policy_version 739243 (0.0042) [2024-06-24 20:54:13,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 12111757312. Throughput: 0: 42826.0. Samples: 12111892820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:13,392][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 20:54:16,472][15401] Updated weights for policy 0, policy_version 739253 (0.0045) [2024-06-24 20:54:18,389][15132] Fps is (10 sec: 40977.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12111970304. Throughput: 0: 42664.4. Samples: 12112137760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 20:54:20,519][15401] Updated weights for policy 0, policy_version 739263 (0.0043) [2024-06-24 20:54:23,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12112183296. Throughput: 0: 42679.6. Samples: 12112267520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 20:54:24,591][15401] Updated weights for policy 0, policy_version 739273 (0.0033) [2024-06-24 20:54:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 12112396288. Throughput: 0: 42568.9. Samples: 12112523660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 20:54:28,494][15401] Updated weights for policy 0, policy_version 739283 (0.0049) [2024-06-24 20:54:32,182][15401] Updated weights for policy 0, policy_version 739293 (0.0042) [2024-06-24 20:54:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 12112609280. Throughput: 0: 42509.3. Samples: 12112776300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 20:54:36,061][15401] Updated weights for policy 0, policy_version 739303 (0.0032) [2024-06-24 20:54:38,090][15349] Signal inference workers to stop experience collection... (179300 times) [2024-06-24 20:54:38,090][15349] Signal inference workers to resume experience collection... (179300 times) [2024-06-24 20:54:38,099][15401] InferenceWorker_p0-w0: stopping experience collection (179300 times) [2024-06-24 20:54:38,100][15401] InferenceWorker_p0-w0: resuming experience collection (179300 times) [2024-06-24 20:54:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12112822272. Throughput: 0: 42474.6. Samples: 12112905080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:38,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 20:54:39,805][15401] Updated weights for policy 0, policy_version 739313 (0.0024) [2024-06-24 20:54:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12113035264. Throughput: 0: 42366.7. Samples: 12113164140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 20:54:43,598][15401] Updated weights for policy 0, policy_version 739323 (0.0029) [2024-06-24 20:54:47,755][15401] Updated weights for policy 0, policy_version 739333 (0.0026) [2024-06-24 20:54:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 12113264640. Throughput: 0: 42552.9. Samples: 12113419400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 20:54:51,198][15401] Updated weights for policy 0, policy_version 739343 (0.0045) [2024-06-24 20:54:53,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 12113461248. Throughput: 0: 42432.4. Samples: 12113548220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 20:54:55,219][15401] Updated weights for policy 0, policy_version 739353 (0.0037) [2024-06-24 20:54:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12113674240. Throughput: 0: 42563.7. Samples: 12113808080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:54:58,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-24 20:54:58,969][15401] Updated weights for policy 0, policy_version 739363 (0.0034) [2024-06-24 20:55:03,046][15401] Updated weights for policy 0, policy_version 739373 (0.0030) [2024-06-24 20:55:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12113903616. Throughput: 0: 42856.5. Samples: 12114066300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:03,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-24 20:55:06,788][15401] Updated weights for policy 0, policy_version 739383 (0.0034) [2024-06-24 20:55:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42328.4, 300 sec: 42820.6). Total num frames: 12114100224. Throughput: 0: 42864.8. Samples: 12114196440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 20:55:10,511][15401] Updated weights for policy 0, policy_version 739393 (0.0035) [2024-06-24 20:55:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 12114313216. Throughput: 0: 42847.1. Samples: 12114451780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:13,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 20:55:14,445][15401] Updated weights for policy 0, policy_version 739403 (0.0027) [2024-06-24 20:55:18,024][15401] Updated weights for policy 0, policy_version 739413 (0.0033) [2024-06-24 20:55:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12114542592. Throughput: 0: 42866.2. Samples: 12114705280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 20:55:22,074][15401] Updated weights for policy 0, policy_version 739423 (0.0039) [2024-06-24 20:55:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.6, 300 sec: 42875.7). Total num frames: 12114755584. Throughput: 0: 42893.1. Samples: 12114835380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:23,392][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 20:55:25,727][15401] Updated weights for policy 0, policy_version 739433 (0.0028) [2024-06-24 20:55:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12114952192. Throughput: 0: 42812.9. Samples: 12115090720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-24 20:55:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 20:55:29,698][15401] Updated weights for policy 0, policy_version 739443 (0.0039) [2024-06-24 20:55:33,274][15401] Updated weights for policy 0, policy_version 739453 (0.0049) [2024-06-24 20:55:33,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12115197952. Throughput: 0: 42898.3. Samples: 12115349820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 20:55:37,364][15401] Updated weights for policy 0, policy_version 739463 (0.0031) [2024-06-24 20:55:38,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 12115394560. Throughput: 0: 42916.3. Samples: 12115479560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:38,392][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 20:55:40,976][15401] Updated weights for policy 0, policy_version 739473 (0.0038) [2024-06-24 20:55:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12115591168. Throughput: 0: 42747.6. Samples: 12115731720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:43,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 20:55:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739478_12115607552.pth... [2024-06-24 20:55:43,563][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000738853_12105367552.pth [2024-06-24 20:55:45,392][15401] Updated weights for policy 0, policy_version 739483 (0.0044) [2024-06-24 20:55:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12115820544. Throughput: 0: 42735.0. Samples: 12115989380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 20:55:48,781][15401] Updated weights for policy 0, policy_version 739493 (0.0032) [2024-06-24 20:55:49,002][15349] Signal inference workers to stop experience collection... (179350 times) [2024-06-24 20:55:49,003][15349] Signal inference workers to resume experience collection... (179350 times) [2024-06-24 20:55:49,044][15401] InferenceWorker_p0-w0: stopping experience collection (179350 times) [2024-06-24 20:55:49,045][15401] InferenceWorker_p0-w0: resuming experience collection (179350 times) [2024-06-24 20:55:53,006][15401] Updated weights for policy 0, policy_version 739503 (0.0035) [2024-06-24 20:55:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12116017152. Throughput: 0: 42753.4. Samples: 12116120340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 20:55:56,319][15401] Updated weights for policy 0, policy_version 739513 (0.0031) [2024-06-24 20:55:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 12116246528. Throughput: 0: 42668.5. Samples: 12116371860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:55:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 20:56:00,772][15401] Updated weights for policy 0, policy_version 739523 (0.0037) [2024-06-24 20:56:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12116459520. Throughput: 0: 42863.0. Samples: 12116634120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 20:56:03,952][15401] Updated weights for policy 0, policy_version 739533 (0.0031) [2024-06-24 20:56:08,237][15401] Updated weights for policy 0, policy_version 739543 (0.0037) [2024-06-24 20:56:08,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 12116672512. Throughput: 0: 42932.2. Samples: 12116767500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:08,396][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 20:56:11,407][15401] Updated weights for policy 0, policy_version 739553 (0.0038) [2024-06-24 20:56:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12116885504. Throughput: 0: 42773.8. Samples: 12117015540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 20:56:15,897][15401] Updated weights for policy 0, policy_version 739563 (0.0032) [2024-06-24 20:56:18,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12117114880. Throughput: 0: 42903.9. Samples: 12117280500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:18,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 20:56:19,081][15401] Updated weights for policy 0, policy_version 739573 (0.0030) [2024-06-24 20:56:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 12117311488. Throughput: 0: 42857.1. Samples: 12117408020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 20:56:23,428][15401] Updated weights for policy 0, policy_version 739583 (0.0028) [2024-06-24 20:56:26,904][15401] Updated weights for policy 0, policy_version 739593 (0.0038) [2024-06-24 20:56:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12117540864. Throughput: 0: 42960.8. Samples: 12117664960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:28,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 20:56:30,882][15401] Updated weights for policy 0, policy_version 739603 (0.0044) [2024-06-24 20:56:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12117770240. Throughput: 0: 42996.9. Samples: 12117924240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:33,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 20:56:34,506][15401] Updated weights for policy 0, policy_version 739613 (0.0041) [2024-06-24 20:56:38,345][15401] Updated weights for policy 0, policy_version 739623 (0.0029) [2024-06-24 20:56:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 12117983232. Throughput: 0: 42934.5. Samples: 12118052400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 20:56:42,397][15401] Updated weights for policy 0, policy_version 739633 (0.0027) [2024-06-24 20:56:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12118179840. Throughput: 0: 43015.0. Samples: 12118307540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 20:56:46,213][15401] Updated weights for policy 0, policy_version 739643 (0.0034) [2024-06-24 20:56:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12118409216. Throughput: 0: 42812.5. Samples: 12118560680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 20:56:50,020][15401] Updated weights for policy 0, policy_version 739653 (0.0028) [2024-06-24 20:56:53,391][15132] Fps is (10 sec: 42592.4, 60 sec: 43143.4, 300 sec: 42709.3). Total num frames: 12118605824. Throughput: 0: 42691.4. Samples: 12118688400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:53,391][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 20:56:54,218][15401] Updated weights for policy 0, policy_version 739663 (0.0044) [2024-06-24 20:56:55,300][15349] Signal inference workers to stop experience collection... (179400 times) [2024-06-24 20:56:55,340][15401] InferenceWorker_p0-w0: stopping experience collection (179400 times) [2024-06-24 20:56:55,347][15349] Signal inference workers to resume experience collection... (179400 times) [2024-06-24 20:56:55,354][15401] InferenceWorker_p0-w0: resuming experience collection (179400 times) [2024-06-24 20:56:57,502][15401] Updated weights for policy 0, policy_version 739673 (0.0035) [2024-06-24 20:56:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12118851584. Throughput: 0: 42940.8. Samples: 12118947880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:56:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 20:57:01,561][15401] Updated weights for policy 0, policy_version 739683 (0.0033) [2024-06-24 20:57:03,389][15132] Fps is (10 sec: 45882.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12119064576. Throughput: 0: 42790.2. Samples: 12119206060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:57:03,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 20:57:04,899][15401] Updated weights for policy 0, policy_version 739693 (0.0037) [2024-06-24 20:57:08,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42874.3, 300 sec: 42653.6). Total num frames: 12119244800. Throughput: 0: 42790.5. Samples: 12119333700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:57:08,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 20:57:09,074][15401] Updated weights for policy 0, policy_version 739703 (0.0030) [2024-06-24 20:57:12,368][15401] Updated weights for policy 0, policy_version 739713 (0.0045) [2024-06-24 20:57:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12119490560. Throughput: 0: 42926.6. Samples: 12119596660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 20:57:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 20:57:16,886][15401] Updated weights for policy 0, policy_version 739723 (0.0039) [2024-06-24 20:57:18,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12119687168. Throughput: 0: 42990.3. Samples: 12119858800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 20:57:19,867][15401] Updated weights for policy 0, policy_version 739733 (0.0033) [2024-06-24 20:57:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12119883776. Throughput: 0: 42945.3. Samples: 12119984940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 20:57:24,432][15401] Updated weights for policy 0, policy_version 739743 (0.0041) [2024-06-24 20:57:27,396][15401] Updated weights for policy 0, policy_version 739753 (0.0039) [2024-06-24 20:57:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12120145920. Throughput: 0: 42981.0. Samples: 12120241680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:28,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 20:57:31,894][15401] Updated weights for policy 0, policy_version 739763 (0.0036) [2024-06-24 20:57:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12120326144. Throughput: 0: 43305.8. Samples: 12120509440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 20:57:34,852][15401] Updated weights for policy 0, policy_version 739773 (0.0030) [2024-06-24 20:57:38,391][15132] Fps is (10 sec: 40952.1, 60 sec: 42870.2, 300 sec: 42709.2). Total num frames: 12120555520. Throughput: 0: 43305.0. Samples: 12120637140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:38,400][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 20:57:39,804][15401] Updated weights for policy 0, policy_version 739783 (0.0032) [2024-06-24 20:57:42,360][15401] Updated weights for policy 0, policy_version 739793 (0.0026) [2024-06-24 20:57:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12120768512. Throughput: 0: 43128.9. Samples: 12120888680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 20:57:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739794_12120784896.pth... [2024-06-24 20:57:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739166_12110495744.pth [2024-06-24 20:57:47,313][15401] Updated weights for policy 0, policy_version 739803 (0.0034) [2024-06-24 20:57:48,389][15132] Fps is (10 sec: 42606.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12120981504. Throughput: 0: 43366.7. Samples: 12121157560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:48,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 20:57:49,949][15401] Updated weights for policy 0, policy_version 739813 (0.0037) [2024-06-24 20:57:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43418.7, 300 sec: 42765.4). Total num frames: 12121210880. Throughput: 0: 43366.4. Samples: 12121285080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 20:57:54,733][15401] Updated weights for policy 0, policy_version 739823 (0.0030) [2024-06-24 20:57:57,379][15401] Updated weights for policy 0, policy_version 739833 (0.0037) [2024-06-24 20:57:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12121440256. Throughput: 0: 43238.7. Samples: 12121542400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:57:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 20:58:02,329][15401] Updated weights for policy 0, policy_version 739843 (0.0034) [2024-06-24 20:58:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12121636864. Throughput: 0: 43306.3. Samples: 12121807580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 20:58:05,498][15401] Updated weights for policy 0, policy_version 739853 (0.0032) [2024-06-24 20:58:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 12121833472. Throughput: 0: 43130.2. Samples: 12121925800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 20:58:08,587][15349] Signal inference workers to stop experience collection... (179450 times) [2024-06-24 20:58:08,639][15401] InferenceWorker_p0-w0: stopping experience collection (179450 times) [2024-06-24 20:58:08,647][15349] Signal inference workers to resume experience collection... (179450 times) [2024-06-24 20:58:08,660][15401] InferenceWorker_p0-w0: resuming experience collection (179450 times) [2024-06-24 20:58:10,336][15401] Updated weights for policy 0, policy_version 739863 (0.0038) [2024-06-24 20:58:12,969][15401] Updated weights for policy 0, policy_version 739873 (0.0046) [2024-06-24 20:58:13,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 12122095616. Throughput: 0: 43178.6. Samples: 12122184720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 20:58:17,882][15401] Updated weights for policy 0, policy_version 739883 (0.0038) [2024-06-24 20:58:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12122259456. Throughput: 0: 43197.3. Samples: 12122453320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 20:58:20,421][15401] Updated weights for policy 0, policy_version 739893 (0.0048) [2024-06-24 20:58:23,390][15132] Fps is (10 sec: 37679.1, 60 sec: 43143.8, 300 sec: 42875.9). Total num frames: 12122472448. Throughput: 0: 42957.7. Samples: 12122570200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:23,391][15132] Avg episode reward: [(0, '0.419')] [2024-06-24 20:58:25,378][15401] Updated weights for policy 0, policy_version 739903 (0.0028) [2024-06-24 20:58:28,055][15401] Updated weights for policy 0, policy_version 739913 (0.0031) [2024-06-24 20:58:28,392][15132] Fps is (10 sec: 49140.2, 60 sec: 43415.8, 300 sec: 43042.4). Total num frames: 12122750976. Throughput: 0: 43161.7. Samples: 12122831060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:28,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 20:58:32,614][15401] Updated weights for policy 0, policy_version 739923 (0.0040) [2024-06-24 20:58:33,389][15132] Fps is (10 sec: 44241.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12122914816. Throughput: 0: 43210.7. Samples: 12123102040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 20:58:35,755][15401] Updated weights for policy 0, policy_version 739933 (0.0032) [2024-06-24 20:58:38,389][15132] Fps is (10 sec: 36053.8, 60 sec: 42599.8, 300 sec: 42820.6). Total num frames: 12123111424. Throughput: 0: 43019.7. Samples: 12123220960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 20:58:40,007][15401] Updated weights for policy 0, policy_version 739943 (0.0036) [2024-06-24 20:58:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 12123373568. Throughput: 0: 43158.6. Samples: 12123484540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 20:58:43,553][15401] Updated weights for policy 0, policy_version 739953 (0.0036) [2024-06-24 20:58:47,764][15401] Updated weights for policy 0, policy_version 739963 (0.0045) [2024-06-24 20:58:48,389][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12123570176. Throughput: 0: 42969.2. Samples: 12123741200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 20:58:51,081][15401] Updated weights for policy 0, policy_version 739973 (0.0029) [2024-06-24 20:58:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12123766784. Throughput: 0: 43086.7. Samples: 12123864700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 20:58:55,609][15401] Updated weights for policy 0, policy_version 739983 (0.0039) [2024-06-24 20:58:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 12124028928. Throughput: 0: 43146.7. Samples: 12124126320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 20:58:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 20:58:58,523][15401] Updated weights for policy 0, policy_version 739993 (0.0036) [2024-06-24 20:59:03,284][15401] Updated weights for policy 0, policy_version 740003 (0.0028) [2024-06-24 20:59:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.7). Total num frames: 12124209152. Throughput: 0: 43113.8. Samples: 12124393440. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:03,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 20:59:06,070][15401] Updated weights for policy 0, policy_version 740013 (0.0031) [2024-06-24 20:59:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 12124422144. Throughput: 0: 43155.2. Samples: 12124512140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:08,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-24 20:59:10,693][15401] Updated weights for policy 0, policy_version 740023 (0.0032) [2024-06-24 20:59:10,714][15349] Signal inference workers to stop experience collection... (179500 times) [2024-06-24 20:59:10,714][15349] Signal inference workers to resume experience collection... (179500 times) [2024-06-24 20:59:10,746][15401] InferenceWorker_p0-w0: stopping experience collection (179500 times) [2024-06-24 20:59:10,746][15401] InferenceWorker_p0-w0: resuming experience collection (179500 times) [2024-06-24 20:59:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 12124684288. Throughput: 0: 43323.2. Samples: 12124780500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 20:59:13,859][15401] Updated weights for policy 0, policy_version 740033 (0.0032) [2024-06-24 20:59:18,159][15401] Updated weights for policy 0, policy_version 740043 (0.0029) [2024-06-24 20:59:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 12124864512. Throughput: 0: 43112.9. Samples: 12125042120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 20:59:21,626][15401] Updated weights for policy 0, policy_version 740053 (0.0034) [2024-06-24 20:59:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43418.4, 300 sec: 42987.2). Total num frames: 12125077504. Throughput: 0: 43208.4. Samples: 12125165340. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 20:59:25,991][15401] Updated weights for policy 0, policy_version 740063 (0.0041) [2024-06-24 20:59:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.2, 300 sec: 43098.2). Total num frames: 12125323264. Throughput: 0: 43015.6. Samples: 12125420240. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 20:59:29,316][15401] Updated weights for policy 0, policy_version 740073 (0.0033) [2024-06-24 20:59:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12125503488. Throughput: 0: 43322.7. Samples: 12125690720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 20:59:33,510][15401] Updated weights for policy 0, policy_version 740083 (0.0021) [2024-06-24 20:59:36,943][15401] Updated weights for policy 0, policy_version 740093 (0.0030) [2024-06-24 20:59:38,390][15132] Fps is (10 sec: 42595.0, 60 sec: 43963.1, 300 sec: 43098.1). Total num frames: 12125749248. Throughput: 0: 43284.1. Samples: 12125812520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:38,391][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 20:59:41,075][15401] Updated weights for policy 0, policy_version 740103 (0.0031) [2024-06-24 20:59:43,391][15132] Fps is (10 sec: 45869.1, 60 sec: 43143.7, 300 sec: 43042.5). Total num frames: 12125962240. Throughput: 0: 43268.9. Samples: 12126073480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:43,391][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 20:59:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740110_12125962240.pth... [2024-06-24 20:59:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739478_12115607552.pth [2024-06-24 20:59:44,412][15401] Updated weights for policy 0, policy_version 740113 (0.0046) [2024-06-24 20:59:48,389][15132] Fps is (10 sec: 39324.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 12126142464. Throughput: 0: 43057.3. Samples: 12126331020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 20:59:48,755][15401] Updated weights for policy 0, policy_version 740123 (0.0030) [2024-06-24 20:59:52,014][15401] Updated weights for policy 0, policy_version 740133 (0.0040) [2024-06-24 20:59:53,389][15132] Fps is (10 sec: 42604.1, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 12126388224. Throughput: 0: 43152.9. Samples: 12126454020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:53,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 20:59:56,330][15401] Updated weights for policy 0, policy_version 740143 (0.0044) [2024-06-24 20:59:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 12126601216. Throughput: 0: 42995.5. Samples: 12126715300. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 20:59:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 20:59:59,671][15401] Updated weights for policy 0, policy_version 740153 (0.0036) [2024-06-24 21:00:03,392][15132] Fps is (10 sec: 39311.2, 60 sec: 42869.6, 300 sec: 42986.8). Total num frames: 12126781440. Throughput: 0: 42943.7. Samples: 12126974700. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:03,393][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 21:00:04,457][15401] Updated weights for policy 0, policy_version 740163 (0.0023) [2024-06-24 21:00:07,352][15401] Updated weights for policy 0, policy_version 740173 (0.0037) [2024-06-24 21:00:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 12127027200. Throughput: 0: 42856.8. Samples: 12127093900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 21:00:11,886][15401] Updated weights for policy 0, policy_version 740183 (0.0033) [2024-06-24 21:00:13,389][15132] Fps is (10 sec: 45887.2, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 12127240192. Throughput: 0: 43190.7. Samples: 12127363820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 21:00:14,777][15401] Updated weights for policy 0, policy_version 740193 (0.0029) [2024-06-24 21:00:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42987.5). Total num frames: 12127436800. Throughput: 0: 42809.6. Samples: 12127617160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 21:00:19,395][15401] Updated weights for policy 0, policy_version 740203 (0.0032) [2024-06-24 21:00:22,608][15401] Updated weights for policy 0, policy_version 740213 (0.0039) [2024-06-24 21:00:23,391][15132] Fps is (10 sec: 44229.8, 60 sec: 43416.4, 300 sec: 43153.5). Total num frames: 12127682560. Throughput: 0: 42799.3. Samples: 12127738520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:23,392][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 21:00:27,258][15401] Updated weights for policy 0, policy_version 740223 (0.0044) [2024-06-24 21:00:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 12127862784. Throughput: 0: 42882.2. Samples: 12128003120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 21:00:30,502][15401] Updated weights for policy 0, policy_version 740233 (0.0034) [2024-06-24 21:00:33,389][15132] Fps is (10 sec: 37689.0, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 12128059392. Throughput: 0: 42827.5. Samples: 12128258260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:33,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 21:00:34,708][15401] Updated weights for policy 0, policy_version 740243 (0.0037) [2024-06-24 21:00:37,917][15349] Signal inference workers to stop experience collection... (179550 times) [2024-06-24 21:00:37,957][15401] InferenceWorker_p0-w0: stopping experience collection (179550 times) [2024-06-24 21:00:37,979][15349] Signal inference workers to resume experience collection... (179550 times) [2024-06-24 21:00:37,984][15401] InferenceWorker_p0-w0: resuming experience collection (179550 times) [2024-06-24 21:00:38,160][15401] Updated weights for policy 0, policy_version 740253 (0.0025) [2024-06-24 21:00:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42872.1, 300 sec: 43153.8). Total num frames: 12128321536. Throughput: 0: 42956.5. Samples: 12128387060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-06-24 21:00:38,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-24 21:00:42,382][15401] Updated weights for policy 0, policy_version 740263 (0.0031) [2024-06-24 21:00:43,392][15132] Fps is (10 sec: 42589.6, 60 sec: 42051.7, 300 sec: 42931.3). Total num frames: 12128485376. Throughput: 0: 42720.7. Samples: 12128637820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:00:43,392][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 21:00:45,675][15401] Updated weights for policy 0, policy_version 740273 (0.0027) [2024-06-24 21:00:48,390][15132] Fps is (10 sec: 39320.5, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 12128714752. Throughput: 0: 42666.7. Samples: 12128894600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:00:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 21:00:50,011][15401] Updated weights for policy 0, policy_version 740283 (0.0022) [2024-06-24 21:00:53,254][15401] Updated weights for policy 0, policy_version 740293 (0.0033) [2024-06-24 21:00:53,389][15132] Fps is (10 sec: 49162.0, 60 sec: 43144.5, 300 sec: 43153.8). Total num frames: 12128976896. Throughput: 0: 42980.4. Samples: 12129028020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:00:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 21:00:57,700][15401] Updated weights for policy 0, policy_version 740303 (0.0034) [2024-06-24 21:00:58,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42323.7, 300 sec: 42986.8). Total num frames: 12129140736. Throughput: 0: 42697.3. Samples: 12129285300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:00:58,392][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 21:01:00,704][15401] Updated weights for policy 0, policy_version 740313 (0.0025) [2024-06-24 21:01:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43146.4, 300 sec: 43043.7). Total num frames: 12129370112. Throughput: 0: 42771.7. Samples: 12129541880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 21:01:05,592][15401] Updated weights for policy 0, policy_version 740323 (0.0031) [2024-06-24 21:01:08,155][15401] Updated weights for policy 0, policy_version 740333 (0.0027) [2024-06-24 21:01:08,393][15132] Fps is (10 sec: 49146.2, 60 sec: 43415.0, 300 sec: 43208.8). Total num frames: 12129632256. Throughput: 0: 43123.0. Samples: 12129679140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:08,394][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 21:01:13,053][15401] Updated weights for policy 0, policy_version 740343 (0.0027) [2024-06-24 21:01:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 12129779712. Throughput: 0: 42931.6. Samples: 12129935040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 21:01:15,797][15401] Updated weights for policy 0, policy_version 740353 (0.0042) [2024-06-24 21:01:18,389][15132] Fps is (10 sec: 39335.8, 60 sec: 43144.7, 300 sec: 43098.2). Total num frames: 12130025472. Throughput: 0: 42789.8. Samples: 12130183800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:18,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 21:01:20,727][15401] Updated weights for policy 0, policy_version 740363 (0.0037) [2024-06-24 21:01:23,389][15132] Fps is (10 sec: 45874.4, 60 sec: 42599.5, 300 sec: 43042.7). Total num frames: 12130238464. Throughput: 0: 42971.0. Samples: 12130320760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 21:01:23,661][15401] Updated weights for policy 0, policy_version 740373 (0.0030) [2024-06-24 21:01:28,191][15401] Updated weights for policy 0, policy_version 740383 (0.0040) [2024-06-24 21:01:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.6, 300 sec: 42931.3). Total num frames: 12130435072. Throughput: 0: 43095.2. Samples: 12130577120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:28,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 21:01:31,326][15401] Updated weights for policy 0, policy_version 740393 (0.0032) [2024-06-24 21:01:33,390][15132] Fps is (10 sec: 44235.1, 60 sec: 43690.4, 300 sec: 43042.6). Total num frames: 12130680832. Throughput: 0: 42921.6. Samples: 12130826080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 21:01:36,223][15401] Updated weights for policy 0, policy_version 740403 (0.0028) [2024-06-24 21:01:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 43042.7). Total num frames: 12130877440. Throughput: 0: 43088.5. Samples: 12130967000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:38,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 21:01:38,898][15401] Updated weights for policy 0, policy_version 740413 (0.0030) [2024-06-24 21:01:41,051][15349] Signal inference workers to stop experience collection... (179600 times) [2024-06-24 21:01:41,056][15349] Signal inference workers to resume experience collection... (179600 times) [2024-06-24 21:01:41,070][15401] InferenceWorker_p0-w0: stopping experience collection (179600 times) [2024-06-24 21:01:41,071][15401] InferenceWorker_p0-w0: resuming experience collection (179600 times) [2024-06-24 21:01:43,390][15132] Fps is (10 sec: 37684.7, 60 sec: 42872.9, 300 sec: 42876.1). Total num frames: 12131057664. Throughput: 0: 42866.7. Samples: 12131214200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 21:01:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740421_12131057664.pth... [2024-06-24 21:01:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000739794_12120784896.pth [2024-06-24 21:01:43,991][15401] Updated weights for policy 0, policy_version 740423 (0.0035) [2024-06-24 21:01:46,383][15401] Updated weights for policy 0, policy_version 740433 (0.0026) [2024-06-24 21:01:48,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43689.0, 300 sec: 43153.6). Total num frames: 12131336192. Throughput: 0: 42774.1. Samples: 12131466820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:48,393][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 21:01:51,662][15401] Updated weights for policy 0, policy_version 740443 (0.0034) [2024-06-24 21:01:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 12131516416. Throughput: 0: 42909.5. Samples: 12131609920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 21:01:53,959][15401] Updated weights for policy 0, policy_version 740453 (0.0034) [2024-06-24 21:01:58,389][15132] Fps is (10 sec: 36053.8, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 12131696640. Throughput: 0: 42628.4. Samples: 12131853320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:01:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 21:01:59,454][15401] Updated weights for policy 0, policy_version 740463 (0.0035) [2024-06-24 21:02:01,723][15401] Updated weights for policy 0, policy_version 740473 (0.0027) [2024-06-24 21:02:03,392][15132] Fps is (10 sec: 45865.8, 60 sec: 43416.0, 300 sec: 43153.8). Total num frames: 12131975168. Throughput: 0: 42781.9. Samples: 12132109080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:02:03,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 21:02:07,274][15401] Updated weights for policy 0, policy_version 740483 (0.0034) [2024-06-24 21:02:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42054.7, 300 sec: 42931.6). Total num frames: 12132155392. Throughput: 0: 42886.2. Samples: 12132250640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:02:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 21:02:09,408][15401] Updated weights for policy 0, policy_version 740493 (0.0028) [2024-06-24 21:02:13,389][15132] Fps is (10 sec: 37691.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12132352000. Throughput: 0: 42588.6. Samples: 12132493500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:02:13,390][15132] Avg episode reward: [(0, '0.127')] [2024-06-24 21:02:15,078][15401] Updated weights for policy 0, policy_version 740503 (0.0040) [2024-06-24 21:02:17,174][15401] Updated weights for policy 0, policy_version 740513 (0.0024) [2024-06-24 21:02:18,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 43209.3). Total num frames: 12132630528. Throughput: 0: 42580.9. Samples: 12132742200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:02:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 21:02:22,729][15401] Updated weights for policy 0, policy_version 740523 (0.0026) [2024-06-24 21:02:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12132794368. Throughput: 0: 42652.9. Samples: 12132886380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-24 21:02:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 21:02:24,855][15401] Updated weights for policy 0, policy_version 740533 (0.0033) [2024-06-24 21:02:28,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42873.2, 300 sec: 42987.1). Total num frames: 12133007360. Throughput: 0: 42672.0. Samples: 12133134440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 21:02:30,182][15401] Updated weights for policy 0, policy_version 740543 (0.0027) [2024-06-24 21:02:32,578][15401] Updated weights for policy 0, policy_version 740553 (0.0037) [2024-06-24 21:02:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.8, 300 sec: 43098.5). Total num frames: 12133269504. Throughput: 0: 42681.0. Samples: 12133387360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:33,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 21:02:37,779][15401] Updated weights for policy 0, policy_version 740563 (0.0025) [2024-06-24 21:02:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12133433344. Throughput: 0: 42656.1. Samples: 12133529440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 21:02:38,469][15349] Signal inference workers to stop experience collection... (179650 times) [2024-06-24 21:02:38,470][15349] Signal inference workers to resume experience collection... (179650 times) [2024-06-24 21:02:38,484][15401] InferenceWorker_p0-w0: stopping experience collection (179650 times) [2024-06-24 21:02:38,484][15401] InferenceWorker_p0-w0: resuming experience collection (179650 times) [2024-06-24 21:02:40,094][15401] Updated weights for policy 0, policy_version 740573 (0.0039) [2024-06-24 21:02:43,389][15132] Fps is (10 sec: 37683.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12133646336. Throughput: 0: 42889.8. Samples: 12133783360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:43,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 21:02:45,447][15401] Updated weights for policy 0, policy_version 740583 (0.0025) [2024-06-24 21:02:47,663][15401] Updated weights for policy 0, policy_version 740593 (0.0038) [2024-06-24 21:02:48,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42873.3, 300 sec: 43042.7). Total num frames: 12133908480. Throughput: 0: 42559.4. Samples: 12134024160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 21:02:53,098][15401] Updated weights for policy 0, policy_version 740603 (0.0026) [2024-06-24 21:02:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12134055936. Throughput: 0: 42650.2. Samples: 12134169900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 21:02:55,171][15401] Updated weights for policy 0, policy_version 740613 (0.0032) [2024-06-24 21:02:58,389][15132] Fps is (10 sec: 39321.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12134301696. Throughput: 0: 42733.7. Samples: 12134416520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:02:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 21:03:00,687][15401] Updated weights for policy 0, policy_version 740623 (0.0042) [2024-06-24 21:03:02,762][15401] Updated weights for policy 0, policy_version 740633 (0.0025) [2024-06-24 21:03:03,389][15132] Fps is (10 sec: 50790.4, 60 sec: 43146.1, 300 sec: 43153.8). Total num frames: 12134563840. Throughput: 0: 42935.5. Samples: 12134674300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 21:03:08,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12134678528. Throughput: 0: 42744.9. Samples: 12134809900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 21:03:08,464][15401] Updated weights for policy 0, policy_version 740643 (0.0038) [2024-06-24 21:03:10,555][15401] Updated weights for policy 0, policy_version 740653 (0.0041) [2024-06-24 21:03:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 12134957056. Throughput: 0: 42751.6. Samples: 12135058260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 21:03:16,022][15401] Updated weights for policy 0, policy_version 740663 (0.0027) [2024-06-24 21:03:18,374][15401] Updated weights for policy 0, policy_version 740673 (0.0036) [2024-06-24 21:03:18,390][15132] Fps is (10 sec: 50790.0, 60 sec: 42598.3, 300 sec: 43098.4). Total num frames: 12135186432. Throughput: 0: 42811.0. Samples: 12135313860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 21:03:23,389][15132] Fps is (10 sec: 36044.9, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 12135317504. Throughput: 0: 42513.3. Samples: 12135442540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 21:03:23,646][15401] Updated weights for policy 0, policy_version 740683 (0.0021) [2024-06-24 21:03:24,842][15349] Signal inference workers to stop experience collection... (179700 times) [2024-06-24 21:03:24,869][15401] InferenceWorker_p0-w0: stopping experience collection (179700 times) [2024-06-24 21:03:24,955][15349] Signal inference workers to resume experience collection... (179700 times) [2024-06-24 21:03:24,955][15401] InferenceWorker_p0-w0: resuming experience collection (179700 times) [2024-06-24 21:03:26,183][15401] Updated weights for policy 0, policy_version 740693 (0.0029) [2024-06-24 21:03:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12135596032. Throughput: 0: 42395.9. Samples: 12135691180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 21:03:31,633][15401] Updated weights for policy 0, policy_version 740703 (0.0037) [2024-06-24 21:03:33,389][15132] Fps is (10 sec: 49152.0, 60 sec: 42325.4, 300 sec: 43042.7). Total num frames: 12135809024. Throughput: 0: 42785.7. Samples: 12135949520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 21:03:33,959][15401] Updated weights for policy 0, policy_version 740713 (0.0030) [2024-06-24 21:03:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12135972864. Throughput: 0: 42350.7. Samples: 12136075680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 21:03:39,321][15401] Updated weights for policy 0, policy_version 740723 (0.0042) [2024-06-24 21:03:41,643][15401] Updated weights for policy 0, policy_version 740733 (0.0028) [2024-06-24 21:03:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 12136251392. Throughput: 0: 42554.3. Samples: 12136331460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 21:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740738_12136251392.pth... [2024-06-24 21:03:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740110_12125962240.pth [2024-06-24 21:03:46,896][15401] Updated weights for policy 0, policy_version 740743 (0.0029) [2024-06-24 21:03:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 42876.1). Total num frames: 12136415232. Throughput: 0: 42681.0. Samples: 12136594940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:48,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 21:03:49,198][15401] Updated weights for policy 0, policy_version 740753 (0.0027) [2024-06-24 21:03:53,392][15132] Fps is (10 sec: 36036.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12136611840. Throughput: 0: 42314.2. Samples: 12136714140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:53,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 21:03:54,431][15401] Updated weights for policy 0, policy_version 740763 (0.0033) [2024-06-24 21:03:56,992][15401] Updated weights for policy 0, policy_version 740773 (0.0043) [2024-06-24 21:03:58,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12136890368. Throughput: 0: 42568.0. Samples: 12136973820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:03:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 21:04:02,043][15401] Updated weights for policy 0, policy_version 740783 (0.0031) [2024-06-24 21:04:03,392][15132] Fps is (10 sec: 44236.6, 60 sec: 41504.4, 300 sec: 42820.2). Total num frames: 12137054208. Throughput: 0: 42863.5. Samples: 12137242820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:04:03,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 21:04:04,625][15401] Updated weights for policy 0, policy_version 740793 (0.0025) [2024-06-24 21:04:08,392][15132] Fps is (10 sec: 37674.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 12137267200. Throughput: 0: 42501.2. Samples: 12137355200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-24 21:04:08,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 21:04:09,834][15401] Updated weights for policy 0, policy_version 740803 (0.0034) [2024-06-24 21:04:12,639][15401] Updated weights for policy 0, policy_version 740813 (0.0051) [2024-06-24 21:04:13,396][15132] Fps is (10 sec: 47494.8, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 12137529344. Throughput: 0: 42755.2. Samples: 12137615440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:13,396][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 21:04:17,350][15401] Updated weights for policy 0, policy_version 740823 (0.0038) [2024-06-24 21:04:18,390][15132] Fps is (10 sec: 40969.8, 60 sec: 41506.2, 300 sec: 42709.5). Total num frames: 12137676800. Throughput: 0: 42924.4. Samples: 12137881120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 21:04:18,876][15349] Signal inference workers to stop experience collection... (179750 times) [2024-06-24 21:04:18,876][15349] Signal inference workers to resume experience collection... (179750 times) [2024-06-24 21:04:18,929][15401] InferenceWorker_p0-w0: stopping experience collection (179750 times) [2024-06-24 21:04:18,929][15401] InferenceWorker_p0-w0: resuming experience collection (179750 times) [2024-06-24 21:04:20,228][15401] Updated weights for policy 0, policy_version 740833 (0.0049) [2024-06-24 21:04:23,390][15132] Fps is (10 sec: 39346.8, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12137922560. Throughput: 0: 42570.2. Samples: 12137991340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:23,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 21:04:25,028][15401] Updated weights for policy 0, policy_version 740843 (0.0043) [2024-06-24 21:04:28,230][15401] Updated weights for policy 0, policy_version 740853 (0.0033) [2024-06-24 21:04:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 12138151936. Throughput: 0: 42691.1. Samples: 12138252560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:28,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-24 21:04:32,981][15401] Updated weights for policy 0, policy_version 740863 (0.0027) [2024-06-24 21:04:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42598.5). Total num frames: 12138315776. Throughput: 0: 42711.5. Samples: 12138516960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 21:04:35,941][15401] Updated weights for policy 0, policy_version 740873 (0.0035) [2024-06-24 21:04:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42765.2). Total num frames: 12138577920. Throughput: 0: 42579.7. Samples: 12138630120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:38,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 21:04:40,794][15401] Updated weights for policy 0, policy_version 740883 (0.0032) [2024-06-24 21:04:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 12138774528. Throughput: 0: 42682.6. Samples: 12138894540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 21:04:43,454][15401] Updated weights for policy 0, policy_version 740893 (0.0026) [2024-06-24 21:04:48,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12138938368. Throughput: 0: 42533.9. Samples: 12139156740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 21:04:48,486][15401] Updated weights for policy 0, policy_version 740903 (0.0039) [2024-06-24 21:04:51,216][15401] Updated weights for policy 0, policy_version 740913 (0.0031) [2024-06-24 21:04:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 12139216896. Throughput: 0: 42563.3. Samples: 12139270440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 21:04:56,235][15401] Updated weights for policy 0, policy_version 740923 (0.0035) [2024-06-24 21:04:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 42709.9). Total num frames: 12139380736. Throughput: 0: 42612.4. Samples: 12139532720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:04:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 21:04:59,074][15401] Updated weights for policy 0, policy_version 740933 (0.0047) [2024-06-24 21:05:03,392][15132] Fps is (10 sec: 36035.6, 60 sec: 42052.3, 300 sec: 42542.5). Total num frames: 12139577344. Throughput: 0: 42323.5. Samples: 12139785780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:03,392][15132] Avg episode reward: [(0, '0.319')] [2024-06-24 21:05:03,741][15401] Updated weights for policy 0, policy_version 740943 (0.0037) [2024-06-24 21:05:06,578][15401] Updated weights for policy 0, policy_version 740953 (0.0037) [2024-06-24 21:05:08,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 12139855872. Throughput: 0: 42618.7. Samples: 12139909180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 21:05:11,276][15401] Updated weights for policy 0, policy_version 740963 (0.0025) [2024-06-24 21:05:13,389][15132] Fps is (10 sec: 45886.7, 60 sec: 41783.7, 300 sec: 42709.5). Total num frames: 12140036096. Throughput: 0: 42692.5. Samples: 12140173720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 21:05:14,385][15401] Updated weights for policy 0, policy_version 740973 (0.0037) [2024-06-24 21:05:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42543.1). Total num frames: 12140232704. Throughput: 0: 42398.2. Samples: 12140424880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 21:05:18,858][15401] Updated weights for policy 0, policy_version 740983 (0.0031) [2024-06-24 21:05:21,974][15401] Updated weights for policy 0, policy_version 740993 (0.0033) [2024-06-24 21:05:23,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12140511232. Throughput: 0: 42683.8. Samples: 12140550900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:23,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 21:05:26,487][15401] Updated weights for policy 0, policy_version 741003 (0.0028) [2024-06-24 21:05:27,863][15349] Signal inference workers to stop experience collection... (179800 times) [2024-06-24 21:05:27,892][15401] InferenceWorker_p0-w0: stopping experience collection (179800 times) [2024-06-24 21:05:27,928][15349] Signal inference workers to resume experience collection... (179800 times) [2024-06-24 21:05:27,929][15401] InferenceWorker_p0-w0: resuming experience collection (179800 times) [2024-06-24 21:05:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 12140658688. Throughput: 0: 42763.9. Samples: 12140818920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-24 21:05:29,381][15401] Updated weights for policy 0, policy_version 741013 (0.0034) [2024-06-24 21:05:33,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12140888064. Throughput: 0: 42455.4. Samples: 12141067240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 21:05:33,977][15401] Updated weights for policy 0, policy_version 741023 (0.0024) [2024-06-24 21:05:37,145][15401] Updated weights for policy 0, policy_version 741033 (0.0040) [2024-06-24 21:05:38,389][15132] Fps is (10 sec: 50790.9, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 12141166592. Throughput: 0: 42908.3. Samples: 12141201320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 21:05:42,122][15401] Updated weights for policy 0, policy_version 741043 (0.0038) [2024-06-24 21:05:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 12141297664. Throughput: 0: 42700.7. Samples: 12141454360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:43,392][15132] Avg episode reward: [(0, '0.446')] [2024-06-24 21:05:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741046_12141297664.pth... [2024-06-24 21:05:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740421_12131057664.pth [2024-06-24 21:05:44,858][15401] Updated weights for policy 0, policy_version 741053 (0.0031) [2024-06-24 21:05:48,391][15132] Fps is (10 sec: 37676.4, 60 sec: 43416.3, 300 sec: 42598.1). Total num frames: 12141543424. Throughput: 0: 42568.6. Samples: 12141701340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-24 21:05:48,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 21:05:49,614][15401] Updated weights for policy 0, policy_version 741063 (0.0035) [2024-06-24 21:05:52,569][15401] Updated weights for policy 0, policy_version 741073 (0.0033) [2024-06-24 21:05:53,392][15132] Fps is (10 sec: 49151.9, 60 sec: 42869.6, 300 sec: 42876.1). Total num frames: 12141789184. Throughput: 0: 42870.6. Samples: 12141838460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:05:53,392][15132] Avg episode reward: [(0, '0.179')] [2024-06-24 21:05:57,244][15401] Updated weights for policy 0, policy_version 741083 (0.0028) [2024-06-24 21:05:58,389][15132] Fps is (10 sec: 39329.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12141936640. Throughput: 0: 42574.2. Samples: 12142089560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:05:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 21:06:00,278][15401] Updated weights for policy 0, policy_version 741093 (0.0038) [2024-06-24 21:06:03,389][15132] Fps is (10 sec: 39331.5, 60 sec: 43419.4, 300 sec: 42543.4). Total num frames: 12142182400. Throughput: 0: 42613.4. Samples: 12142342480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:03,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 21:06:04,767][15401] Updated weights for policy 0, policy_version 741103 (0.0035) [2024-06-24 21:06:07,911][15401] Updated weights for policy 0, policy_version 741113 (0.0026) [2024-06-24 21:06:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12142395392. Throughput: 0: 42860.3. Samples: 12142479600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 21:06:12,354][15401] Updated weights for policy 0, policy_version 741123 (0.0035) [2024-06-24 21:06:13,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12142559232. Throughput: 0: 42397.9. Samples: 12142726820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:13,396][15132] Avg episode reward: [(0, '0.861')] [2024-06-24 21:06:15,629][15401] Updated weights for policy 0, policy_version 741133 (0.0028) [2024-06-24 21:06:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12142804992. Throughput: 0: 42519.6. Samples: 12142980620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-24 21:06:20,429][15401] Updated weights for policy 0, policy_version 741143 (0.0024) [2024-06-24 21:06:23,359][15401] Updated weights for policy 0, policy_version 741153 (0.0034) [2024-06-24 21:06:23,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42325.5, 300 sec: 42765.4). Total num frames: 12143050752. Throughput: 0: 42460.5. Samples: 12143112040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 21:06:27,991][15401] Updated weights for policy 0, policy_version 741163 (0.0032) [2024-06-24 21:06:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 12143214592. Throughput: 0: 42408.0. Samples: 12143362620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 21:06:31,007][15401] Updated weights for policy 0, policy_version 741173 (0.0041) [2024-06-24 21:06:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12143443968. Throughput: 0: 42618.6. Samples: 12143619100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 21:06:35,478][15401] Updated weights for policy 0, policy_version 741183 (0.0029) [2024-06-24 21:06:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 12143673344. Throughput: 0: 42485.9. Samples: 12143750220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 21:06:38,585][15401] Updated weights for policy 0, policy_version 741193 (0.0043) [2024-06-24 21:06:39,396][15349] Signal inference workers to stop experience collection... (179850 times) [2024-06-24 21:06:39,430][15401] InferenceWorker_p0-w0: stopping experience collection (179850 times) [2024-06-24 21:06:39,454][15349] Signal inference workers to resume experience collection... (179850 times) [2024-06-24 21:06:39,454][15401] InferenceWorker_p0-w0: resuming experience collection (179850 times) [2024-06-24 21:06:43,055][15401] Updated weights for policy 0, policy_version 741203 (0.0035) [2024-06-24 21:06:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.2, 300 sec: 42487.7). Total num frames: 12143869952. Throughput: 0: 42508.8. Samples: 12144002460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 21:06:46,729][15401] Updated weights for policy 0, policy_version 741213 (0.0037) [2024-06-24 21:06:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42599.7, 300 sec: 42653.9). Total num frames: 12144099328. Throughput: 0: 42487.9. Samples: 12144254440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 21:06:51,559][15401] Updated weights for policy 0, policy_version 741223 (0.0026) [2024-06-24 21:06:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 41779.2, 300 sec: 42709.1). Total num frames: 12144295936. Throughput: 0: 42333.6. Samples: 12144384720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:53,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 21:06:54,397][15401] Updated weights for policy 0, policy_version 741233 (0.0024) [2024-06-24 21:06:58,392][15132] Fps is (10 sec: 39313.7, 60 sec: 42596.9, 300 sec: 42431.8). Total num frames: 12144492544. Throughput: 0: 42386.0. Samples: 12144634280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:06:58,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 21:06:59,068][15401] Updated weights for policy 0, policy_version 741243 (0.0053) [2024-06-24 21:07:02,007][15401] Updated weights for policy 0, policy_version 741253 (0.0036) [2024-06-24 21:07:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12144738304. Throughput: 0: 42309.3. Samples: 12144884540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 21:07:06,640][15401] Updated weights for policy 0, policy_version 741263 (0.0026) [2024-06-24 21:07:08,389][15132] Fps is (10 sec: 44245.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12144934912. Throughput: 0: 42488.9. Samples: 12145024040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 21:07:09,678][15401] Updated weights for policy 0, policy_version 741273 (0.0030) [2024-06-24 21:07:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 12145131520. Throughput: 0: 42346.2. Samples: 12145268200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 21:07:14,267][15401] Updated weights for policy 0, policy_version 741283 (0.0034) [2024-06-24 21:07:17,327][15401] Updated weights for policy 0, policy_version 741293 (0.0037) [2024-06-24 21:07:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12145377280. Throughput: 0: 42289.7. Samples: 12145522140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 21:07:21,976][15401] Updated weights for policy 0, policy_version 741303 (0.0038) [2024-06-24 21:07:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12145573888. Throughput: 0: 42410.7. Samples: 12145658700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 21:07:24,970][15401] Updated weights for policy 0, policy_version 741313 (0.0034) [2024-06-24 21:07:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 12145786880. Throughput: 0: 42379.5. Samples: 12145909540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 21:07:29,422][15401] Updated weights for policy 0, policy_version 741323 (0.0036) [2024-06-24 21:07:32,903][15401] Updated weights for policy 0, policy_version 741333 (0.0033) [2024-06-24 21:07:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12146032640. Throughput: 0: 42447.5. Samples: 12146164580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-24 21:07:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 21:07:37,026][15401] Updated weights for policy 0, policy_version 741343 (0.0039) [2024-06-24 21:07:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 12146196480. Throughput: 0: 42449.0. Samples: 12146294820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:07:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 21:07:40,654][15401] Updated weights for policy 0, policy_version 741353 (0.0031) [2024-06-24 21:07:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 12146425856. Throughput: 0: 42601.9. Samples: 12146551280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:07:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 21:07:43,533][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741360_12146442240.pth... [2024-06-24 21:07:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000740738_12136251392.pth [2024-06-24 21:07:44,827][15401] Updated weights for policy 0, policy_version 741363 (0.0035) [2024-06-24 21:07:48,289][15401] Updated weights for policy 0, policy_version 741373 (0.0027) [2024-06-24 21:07:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12146655232. Throughput: 0: 42744.0. Samples: 12146808020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:07:48,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 21:07:52,154][15401] Updated weights for policy 0, policy_version 741383 (0.0036) [2024-06-24 21:07:53,390][15132] Fps is (10 sec: 42594.2, 60 sec: 42599.5, 300 sec: 42542.7). Total num frames: 12146851840. Throughput: 0: 42605.7. Samples: 12146941340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:07:53,391][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 21:07:55,795][15401] Updated weights for policy 0, policy_version 741393 (0.0046) [2024-06-24 21:07:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.0, 300 sec: 42376.3). Total num frames: 12147064832. Throughput: 0: 42745.9. Samples: 12147191760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:07:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 21:07:59,982][15401] Updated weights for policy 0, policy_version 741403 (0.0024) [2024-06-24 21:08:03,379][15401] Updated weights for policy 0, policy_version 741413 (0.0041) [2024-06-24 21:08:03,389][15132] Fps is (10 sec: 45879.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12147310592. Throughput: 0: 42870.7. Samples: 12147451320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 21:08:07,404][15401] Updated weights for policy 0, policy_version 741423 (0.0032) [2024-06-24 21:08:08,136][15349] Signal inference workers to stop experience collection... (179900 times) [2024-06-24 21:08:08,140][15349] Signal inference workers to resume experience collection... (179900 times) [2024-06-24 21:08:08,171][15401] InferenceWorker_p0-w0: stopping experience collection (179900 times) [2024-06-24 21:08:08,171][15401] InferenceWorker_p0-w0: resuming experience collection (179900 times) [2024-06-24 21:08:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 12147507200. Throughput: 0: 42761.6. Samples: 12147583080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:08,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 21:08:11,203][15401] Updated weights for policy 0, policy_version 741433 (0.0022) [2024-06-24 21:08:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42431.4). Total num frames: 12147703808. Throughput: 0: 42857.3. Samples: 12147838220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:13,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 21:08:14,853][15401] Updated weights for policy 0, policy_version 741443 (0.0030) [2024-06-24 21:08:18,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12147916800. Throughput: 0: 42972.5. Samples: 12148098340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:18,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 21:08:18,783][15401] Updated weights for policy 0, policy_version 741453 (0.0038) [2024-06-24 21:08:22,405][15401] Updated weights for policy 0, policy_version 741463 (0.0050) [2024-06-24 21:08:23,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 12148146176. Throughput: 0: 42931.0. Samples: 12148226820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:23,392][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 21:08:26,423][15401] Updated weights for policy 0, policy_version 741473 (0.0034) [2024-06-24 21:08:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12148359168. Throughput: 0: 43066.3. Samples: 12148489260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:28,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 21:08:30,061][15401] Updated weights for policy 0, policy_version 741483 (0.0035) [2024-06-24 21:08:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12148572160. Throughput: 0: 42964.8. Samples: 12148741440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:33,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 21:08:34,643][15401] Updated weights for policy 0, policy_version 741493 (0.0037) [2024-06-24 21:08:37,858][15401] Updated weights for policy 0, policy_version 741503 (0.0031) [2024-06-24 21:08:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 12148801536. Throughput: 0: 42876.1. Samples: 12148870720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 21:08:42,331][15401] Updated weights for policy 0, policy_version 741513 (0.0036) [2024-06-24 21:08:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12148981760. Throughput: 0: 42977.4. Samples: 12149125740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 21:08:45,499][15401] Updated weights for policy 0, policy_version 741523 (0.0034) [2024-06-24 21:08:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12149211136. Throughput: 0: 42748.9. Samples: 12149375020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 21:08:49,758][15401] Updated weights for policy 0, policy_version 741533 (0.0030) [2024-06-24 21:08:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42599.0, 300 sec: 42431.8). Total num frames: 12149407744. Throughput: 0: 42667.1. Samples: 12149503000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 21:08:53,610][15401] Updated weights for policy 0, policy_version 741543 (0.0037) [2024-06-24 21:08:57,848][15401] Updated weights for policy 0, policy_version 741553 (0.0038) [2024-06-24 21:08:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 12149620736. Throughput: 0: 42604.9. Samples: 12149755340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:08:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 21:09:01,463][15401] Updated weights for policy 0, policy_version 741563 (0.0037) [2024-06-24 21:09:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 12149833728. Throughput: 0: 42537.6. Samples: 12150012540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:09:03,399][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 21:09:05,306][15401] Updated weights for policy 0, policy_version 741573 (0.0031) [2024-06-24 21:09:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.2, 300 sec: 42543.8). Total num frames: 12150079488. Throughput: 0: 42513.4. Samples: 12150139820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:09:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 21:09:08,999][15401] Updated weights for policy 0, policy_version 741583 (0.0036) [2024-06-24 21:09:13,079][15401] Updated weights for policy 0, policy_version 741593 (0.0036) [2024-06-24 21:09:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 12150292480. Throughput: 0: 42376.4. Samples: 12150396200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:09:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 21:09:16,652][15401] Updated weights for policy 0, policy_version 741603 (0.0033) [2024-06-24 21:09:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 12150489088. Throughput: 0: 42541.3. Samples: 12150655900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 21:09:18,393][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 21:09:20,822][15401] Updated weights for policy 0, policy_version 741613 (0.0033) [2024-06-24 21:09:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 12150702080. Throughput: 0: 42495.1. Samples: 12150783000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:23,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 21:09:23,839][15349] Signal inference workers to stop experience collection... (179950 times) [2024-06-24 21:09:23,890][15401] InferenceWorker_p0-w0: stopping experience collection (179950 times) [2024-06-24 21:09:23,890][15349] Signal inference workers to resume experience collection... (179950 times) [2024-06-24 21:09:23,914][15401] InferenceWorker_p0-w0: resuming experience collection (179950 times) [2024-06-24 21:09:24,033][15401] Updated weights for policy 0, policy_version 741623 (0.0033) [2024-06-24 21:09:28,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12150898688. Throughput: 0: 42451.9. Samples: 12151036080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 21:09:28,531][15401] Updated weights for policy 0, policy_version 741633 (0.0033) [2024-06-24 21:09:31,629][15401] Updated weights for policy 0, policy_version 741643 (0.0028) [2024-06-24 21:09:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 12151128064. Throughput: 0: 42500.8. Samples: 12151287660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:33,393][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 21:09:36,079][15401] Updated weights for policy 0, policy_version 741653 (0.0032) [2024-06-24 21:09:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12151324672. Throughput: 0: 42574.0. Samples: 12151418820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 21:09:39,170][15401] Updated weights for policy 0, policy_version 741663 (0.0035) [2024-06-24 21:09:43,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12151554048. Throughput: 0: 42794.2. Samples: 12151681080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 21:09:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741673_12151570432.pth... [2024-06-24 21:09:43,509][15401] Updated weights for policy 0, policy_version 741673 (0.0031) [2024-06-24 21:09:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741046_12141297664.pth [2024-06-24 21:09:46,888][15401] Updated weights for policy 0, policy_version 741683 (0.0030) [2024-06-24 21:09:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12151783424. Throughput: 0: 42614.0. Samples: 12151930160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 21:09:51,550][15401] Updated weights for policy 0, policy_version 741693 (0.0033) [2024-06-24 21:09:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12151963648. Throughput: 0: 42762.1. Samples: 12152064120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 21:09:54,665][15401] Updated weights for policy 0, policy_version 741703 (0.0036) [2024-06-24 21:09:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 12152193024. Throughput: 0: 42801.2. Samples: 12152322360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:09:58,392][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 21:09:59,039][15401] Updated weights for policy 0, policy_version 741713 (0.0042) [2024-06-24 21:10:02,455][15401] Updated weights for policy 0, policy_version 741723 (0.0032) [2024-06-24 21:10:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 12152422400. Throughput: 0: 42594.4. Samples: 12152572540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 21:10:06,489][15401] Updated weights for policy 0, policy_version 741733 (0.0036) [2024-06-24 21:10:08,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12152619008. Throughput: 0: 42815.0. Samples: 12152709680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 21:10:10,073][15401] Updated weights for policy 0, policy_version 741743 (0.0044) [2024-06-24 21:10:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12152832000. Throughput: 0: 42752.9. Samples: 12152959960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 21:10:14,073][15401] Updated weights for policy 0, policy_version 741753 (0.0030) [2024-06-24 21:10:17,769][15401] Updated weights for policy 0, policy_version 741763 (0.0041) [2024-06-24 21:10:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 12153077760. Throughput: 0: 42825.3. Samples: 12153214700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:18,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 21:10:21,829][15401] Updated weights for policy 0, policy_version 741773 (0.0027) [2024-06-24 21:10:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12153257984. Throughput: 0: 42841.7. Samples: 12153346700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:23,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 21:10:25,662][15401] Updated weights for policy 0, policy_version 741783 (0.0040) [2024-06-24 21:10:28,396][15132] Fps is (10 sec: 37659.4, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 12153454592. Throughput: 0: 42614.0. Samples: 12153598980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:28,396][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 21:10:29,591][15401] Updated weights for policy 0, policy_version 741793 (0.0027) [2024-06-24 21:10:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 12153683968. Throughput: 0: 42800.0. Samples: 12153856160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:33,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 21:10:33,527][15401] Updated weights for policy 0, policy_version 741803 (0.0037) [2024-06-24 21:10:37,175][15401] Updated weights for policy 0, policy_version 741813 (0.0037) [2024-06-24 21:10:38,392][15132] Fps is (10 sec: 44254.4, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 12153896960. Throughput: 0: 42705.4. Samples: 12153985960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:38,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 21:10:39,102][15349] Signal inference workers to stop experience collection... (180000 times) [2024-06-24 21:10:39,102][15349] Signal inference workers to resume experience collection... (180000 times) [2024-06-24 21:10:39,123][15401] InferenceWorker_p0-w0: stopping experience collection (180000 times) [2024-06-24 21:10:39,123][15401] InferenceWorker_p0-w0: resuming experience collection (180000 times) [2024-06-24 21:10:41,021][15401] Updated weights for policy 0, policy_version 741823 (0.0024) [2024-06-24 21:10:43,390][15132] Fps is (10 sec: 42594.5, 60 sec: 42597.8, 300 sec: 42598.5). Total num frames: 12154109952. Throughput: 0: 42535.7. Samples: 12154236400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:43,391][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 21:10:44,734][15401] Updated weights for policy 0, policy_version 741833 (0.0037) [2024-06-24 21:10:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 12154322944. Throughput: 0: 42747.0. Samples: 12154496160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 21:10:48,967][15401] Updated weights for policy 0, policy_version 741843 (0.0031) [2024-06-24 21:10:52,495][15401] Updated weights for policy 0, policy_version 741853 (0.0031) [2024-06-24 21:10:53,389][15132] Fps is (10 sec: 44240.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12154552320. Throughput: 0: 42521.0. Samples: 12154623120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 21:10:56,573][15401] Updated weights for policy 0, policy_version 741863 (0.0042) [2024-06-24 21:10:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 12154765312. Throughput: 0: 42584.6. Samples: 12154876260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:10:58,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 21:11:00,086][15401] Updated weights for policy 0, policy_version 741873 (0.0035) [2024-06-24 21:11:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12154961920. Throughput: 0: 42600.6. Samples: 12155131720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 21:11:04,211][15401] Updated weights for policy 0, policy_version 741883 (0.0031) [2024-06-24 21:11:08,209][15401] Updated weights for policy 0, policy_version 741893 (0.0040) [2024-06-24 21:11:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12155174912. Throughput: 0: 42450.2. Samples: 12155256960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 21:11:12,033][15401] Updated weights for policy 0, policy_version 741903 (0.0035) [2024-06-24 21:11:13,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12155387904. Throughput: 0: 42573.5. Samples: 12155514620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:13,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 21:11:15,699][15401] Updated weights for policy 0, policy_version 741913 (0.0027) [2024-06-24 21:11:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12155600896. Throughput: 0: 42481.3. Samples: 12155767820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:18,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-24 21:11:19,738][15401] Updated weights for policy 0, policy_version 741923 (0.0039) [2024-06-24 21:11:23,234][15401] Updated weights for policy 0, policy_version 741933 (0.0030) [2024-06-24 21:11:23,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12155830272. Throughput: 0: 42374.2. Samples: 12155892800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:23,393][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 21:11:27,561][15401] Updated weights for policy 0, policy_version 741943 (0.0029) [2024-06-24 21:11:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43149.2, 300 sec: 42709.5). Total num frames: 12156043264. Throughput: 0: 42604.0. Samples: 12156153540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 21:11:30,804][15401] Updated weights for policy 0, policy_version 741953 (0.0038) [2024-06-24 21:11:33,392][15132] Fps is (10 sec: 40961.6, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 12156239872. Throughput: 0: 42342.1. Samples: 12156401640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:33,392][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 21:11:35,404][15401] Updated weights for policy 0, policy_version 741963 (0.0032) [2024-06-24 21:11:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12156452864. Throughput: 0: 42366.7. Samples: 12156529620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 21:11:38,662][15401] Updated weights for policy 0, policy_version 741973 (0.0043) [2024-06-24 21:11:42,933][15401] Updated weights for policy 0, policy_version 741983 (0.0035) [2024-06-24 21:11:43,390][15132] Fps is (10 sec: 40968.3, 60 sec: 42325.9, 300 sec: 42542.9). Total num frames: 12156649472. Throughput: 0: 42397.2. Samples: 12156784140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 21:11:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741984_12156665856.pth... [2024-06-24 21:11:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741360_12146442240.pth [2024-06-24 21:11:46,538][15401] Updated weights for policy 0, policy_version 741993 (0.0031) [2024-06-24 21:11:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12156895232. Throughput: 0: 42235.5. Samples: 12157032320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 21:11:50,572][15401] Updated weights for policy 0, policy_version 742003 (0.0033) [2024-06-24 21:11:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12157091840. Throughput: 0: 42384.8. Samples: 12157164280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:11:54,640][15401] Updated weights for policy 0, policy_version 742013 (0.0036) [2024-06-24 21:11:58,106][15401] Updated weights for policy 0, policy_version 742023 (0.0034) [2024-06-24 21:11:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12157304832. Throughput: 0: 42322.8. Samples: 12157419040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:11:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 21:12:02,246][15401] Updated weights for policy 0, policy_version 742033 (0.0027) [2024-06-24 21:12:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12157550592. Throughput: 0: 42424.0. Samples: 12157676900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 21:12:05,793][15401] Updated weights for policy 0, policy_version 742043 (0.0041) [2024-06-24 21:12:08,391][15132] Fps is (10 sec: 42592.8, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 12157730816. Throughput: 0: 42567.3. Samples: 12157808280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:08,391][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 21:12:09,872][15401] Updated weights for policy 0, policy_version 742053 (0.0029) [2024-06-24 21:12:10,424][15349] Signal inference workers to stop experience collection... (180050 times) [2024-06-24 21:12:10,435][15401] InferenceWorker_p0-w0: stopping experience collection (180050 times) [2024-06-24 21:12:10,487][15349] Signal inference workers to resume experience collection... (180050 times) [2024-06-24 21:12:10,488][15401] InferenceWorker_p0-w0: resuming experience collection (180050 times) [2024-06-24 21:12:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12157943808. Throughput: 0: 42560.8. Samples: 12158068780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 21:12:13,847][15401] Updated weights for policy 0, policy_version 742063 (0.0038) [2024-06-24 21:12:17,501][15401] Updated weights for policy 0, policy_version 742073 (0.0045) [2024-06-24 21:12:18,392][15132] Fps is (10 sec: 44231.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 12158173184. Throughput: 0: 42768.1. Samples: 12158326220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:18,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 21:12:21,454][15401] Updated weights for policy 0, policy_version 742083 (0.0035) [2024-06-24 21:12:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12158369792. Throughput: 0: 42864.8. Samples: 12158458540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 21:12:25,295][15401] Updated weights for policy 0, policy_version 742093 (0.0026) [2024-06-24 21:12:28,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 12158582784. Throughput: 0: 42684.0. Samples: 12158704920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 21:12:29,021][15401] Updated weights for policy 0, policy_version 742103 (0.0033) [2024-06-24 21:12:32,876][15401] Updated weights for policy 0, policy_version 742113 (0.0032) [2024-06-24 21:12:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.9, 300 sec: 42765.0). Total num frames: 12158812160. Throughput: 0: 42938.2. Samples: 12158964540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:33,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 21:12:36,644][15401] Updated weights for policy 0, policy_version 742123 (0.0043) [2024-06-24 21:12:38,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 12158992384. Throughput: 0: 42882.7. Samples: 12159094100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:38,392][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 21:12:40,344][15401] Updated weights for policy 0, policy_version 742133 (0.0026) [2024-06-24 21:12:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12159221760. Throughput: 0: 42721.8. Samples: 12159341520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:12:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 21:12:44,498][15401] Updated weights for policy 0, policy_version 742143 (0.0037) [2024-06-24 21:12:48,203][15401] Updated weights for policy 0, policy_version 742153 (0.0045) [2024-06-24 21:12:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42654.1). Total num frames: 12159434752. Throughput: 0: 42737.7. Samples: 12159600100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:12:48,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 21:12:52,005][15401] Updated weights for policy 0, policy_version 742163 (0.0030) [2024-06-24 21:12:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12159647744. Throughput: 0: 42643.4. Samples: 12159727180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:12:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:12:56,026][15401] Updated weights for policy 0, policy_version 742173 (0.0025) [2024-06-24 21:12:58,392][15132] Fps is (10 sec: 44227.6, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 12159877120. Throughput: 0: 42433.6. Samples: 12159978380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:12:58,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:12:59,568][15401] Updated weights for policy 0, policy_version 742183 (0.0031) [2024-06-24 21:13:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 42542.9). Total num frames: 12160057344. Throughput: 0: 42631.5. Samples: 12160244640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:03,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 21:13:03,607][15401] Updated weights for policy 0, policy_version 742193 (0.0034) [2024-06-24 21:13:07,134][15401] Updated weights for policy 0, policy_version 742203 (0.0028) [2024-06-24 21:13:08,390][15132] Fps is (10 sec: 40968.6, 60 sec: 42599.3, 300 sec: 42654.3). Total num frames: 12160286720. Throughput: 0: 42388.9. Samples: 12160366040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 21:13:11,734][15401] Updated weights for policy 0, policy_version 742213 (0.0052) [2024-06-24 21:13:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12160516096. Throughput: 0: 42501.3. Samples: 12160617480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 21:13:14,643][15401] Updated weights for policy 0, policy_version 742223 (0.0038) [2024-06-24 21:13:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42543.2). Total num frames: 12160696320. Throughput: 0: 42608.9. Samples: 12160881940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 21:13:19,204][15401] Updated weights for policy 0, policy_version 742233 (0.0036) [2024-06-24 21:13:22,085][15401] Updated weights for policy 0, policy_version 742243 (0.0043) [2024-06-24 21:13:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12160925696. Throughput: 0: 42473.4. Samples: 12161005300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 21:13:26,803][15401] Updated weights for policy 0, policy_version 742253 (0.0036) [2024-06-24 21:13:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12161155072. Throughput: 0: 42713.3. Samples: 12161263620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 21:13:30,451][15401] Updated weights for policy 0, policy_version 742263 (0.0040) [2024-06-24 21:13:33,293][15349] Signal inference workers to stop experience collection... (180100 times) [2024-06-24 21:13:33,294][15349] Signal inference workers to resume experience collection... (180100 times) [2024-06-24 21:13:33,344][15401] InferenceWorker_p0-w0: stopping experience collection (180100 times) [2024-06-24 21:13:33,344][15401] InferenceWorker_p0-w0: resuming experience collection (180100 times) [2024-06-24 21:13:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12161335296. Throughput: 0: 42695.7. Samples: 12161521400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 21:13:34,369][15401] Updated weights for policy 0, policy_version 742273 (0.0036) [2024-06-24 21:13:38,020][15401] Updated weights for policy 0, policy_version 742283 (0.0028) [2024-06-24 21:13:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 12161564672. Throughput: 0: 42508.9. Samples: 12161640080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 21:13:42,067][15401] Updated weights for policy 0, policy_version 742293 (0.0049) [2024-06-24 21:13:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12161777664. Throughput: 0: 42647.4. Samples: 12161897420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 21:13:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742296_12161777664.pth... [2024-06-24 21:13:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741673_12151570432.pth [2024-06-24 21:13:45,771][15401] Updated weights for policy 0, policy_version 742303 (0.0036) [2024-06-24 21:13:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12161974272. Throughput: 0: 42370.0. Samples: 12162151180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:48,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-24 21:13:50,135][15401] Updated weights for policy 0, policy_version 742313 (0.0025) [2024-06-24 21:13:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12162187264. Throughput: 0: 42477.3. Samples: 12162277520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 21:13:53,662][15401] Updated weights for policy 0, policy_version 742323 (0.0035) [2024-06-24 21:13:57,802][15401] Updated weights for policy 0, policy_version 742333 (0.0027) [2024-06-24 21:13:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 12162416640. Throughput: 0: 42578.4. Samples: 12162533500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:13:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 21:14:01,540][15401] Updated weights for policy 0, policy_version 742343 (0.0033) [2024-06-24 21:14:03,391][15132] Fps is (10 sec: 44231.6, 60 sec: 42872.4, 300 sec: 42542.7). Total num frames: 12162629632. Throughput: 0: 42241.0. Samples: 12162782840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:14:03,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 21:14:05,662][15401] Updated weights for policy 0, policy_version 742353 (0.0031) [2024-06-24 21:14:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12162826240. Throughput: 0: 42312.0. Samples: 12162909340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:14:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 21:14:09,114][15401] Updated weights for policy 0, policy_version 742363 (0.0040) [2024-06-24 21:14:13,354][15401] Updated weights for policy 0, policy_version 742373 (0.0036) [2024-06-24 21:14:13,389][15132] Fps is (10 sec: 40965.1, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 12163039232. Throughput: 0: 42426.2. Samples: 12163172800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:14:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 21:14:16,859][15401] Updated weights for policy 0, policy_version 742383 (0.0034) [2024-06-24 21:14:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12163252224. Throughput: 0: 42235.0. Samples: 12163421980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:14:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:14:21,107][15401] Updated weights for policy 0, policy_version 742393 (0.0032) [2024-06-24 21:14:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12163465216. Throughput: 0: 42365.4. Samples: 12163546520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:14:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 21:14:24,807][15401] Updated weights for policy 0, policy_version 742403 (0.0035) [2024-06-24 21:14:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 12163678208. Throughput: 0: 42298.7. Samples: 12163800860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 21:14:28,629][15401] Updated weights for policy 0, policy_version 742413 (0.0032) [2024-06-24 21:14:32,506][15401] Updated weights for policy 0, policy_version 742423 (0.0037) [2024-06-24 21:14:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12163907584. Throughput: 0: 42323.5. Samples: 12164055740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 21:14:36,596][15401] Updated weights for policy 0, policy_version 742433 (0.0036) [2024-06-24 21:14:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 12164104192. Throughput: 0: 42419.4. Samples: 12164186400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:38,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 21:14:40,126][15401] Updated weights for policy 0, policy_version 742443 (0.0028) [2024-06-24 21:14:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12164317184. Throughput: 0: 42455.9. Samples: 12164444020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 21:14:44,006][15401] Updated weights for policy 0, policy_version 742453 (0.0035) [2024-06-24 21:14:47,667][15401] Updated weights for policy 0, policy_version 742463 (0.0030) [2024-06-24 21:14:48,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12164546560. Throughput: 0: 42497.2. Samples: 12164695160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-24 21:14:51,578][15401] Updated weights for policy 0, policy_version 742473 (0.0032) [2024-06-24 21:14:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 12164726784. Throughput: 0: 42695.7. Samples: 12164830640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:53,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 21:14:55,316][15401] Updated weights for policy 0, policy_version 742483 (0.0038) [2024-06-24 21:14:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 12164939776. Throughput: 0: 42532.0. Samples: 12165086740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:14:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 21:14:59,305][15401] Updated weights for policy 0, policy_version 742493 (0.0024) [2024-06-24 21:15:02,922][15401] Updated weights for policy 0, policy_version 742503 (0.0036) [2024-06-24 21:15:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42599.2, 300 sec: 42598.4). Total num frames: 12165185536. Throughput: 0: 42471.0. Samples: 12165333180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 21:15:06,995][15401] Updated weights for policy 0, policy_version 742513 (0.0030) [2024-06-24 21:15:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12165382144. Throughput: 0: 42779.2. Samples: 12165471580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 21:15:10,094][15349] Signal inference workers to stop experience collection... (180150 times) [2024-06-24 21:15:10,152][15401] InferenceWorker_p0-w0: stopping experience collection (180150 times) [2024-06-24 21:15:10,207][15349] Signal inference workers to resume experience collection... (180150 times) [2024-06-24 21:15:10,207][15401] InferenceWorker_p0-w0: resuming experience collection (180150 times) [2024-06-24 21:15:10,345][15401] Updated weights for policy 0, policy_version 742523 (0.0032) [2024-06-24 21:15:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 12165578752. Throughput: 0: 42686.1. Samples: 12165721740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 21:15:14,674][15401] Updated weights for policy 0, policy_version 742533 (0.0034) [2024-06-24 21:15:18,145][15401] Updated weights for policy 0, policy_version 742543 (0.0032) [2024-06-24 21:15:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12165824512. Throughput: 0: 42614.6. Samples: 12165973400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 21:15:22,539][15401] Updated weights for policy 0, policy_version 742553 (0.0025) [2024-06-24 21:15:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42654.9). Total num frames: 12166037504. Throughput: 0: 42763.3. Samples: 12166110740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 21:15:26,053][15401] Updated weights for policy 0, policy_version 742563 (0.0044) [2024-06-24 21:15:28,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 12166217728. Throughput: 0: 42546.2. Samples: 12166358700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:28,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 21:15:30,091][15401] Updated weights for policy 0, policy_version 742573 (0.0039) [2024-06-24 21:15:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 12166463488. Throughput: 0: 42693.6. Samples: 12166616380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:33,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 21:15:33,638][15401] Updated weights for policy 0, policy_version 742583 (0.0038) [2024-06-24 21:15:37,614][15401] Updated weights for policy 0, policy_version 742593 (0.0029) [2024-06-24 21:15:38,392][15132] Fps is (10 sec: 45875.3, 60 sec: 42869.8, 300 sec: 42598.2). Total num frames: 12166676480. Throughput: 0: 42754.0. Samples: 12166754680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:38,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 21:15:41,161][15401] Updated weights for policy 0, policy_version 742603 (0.0037) [2024-06-24 21:15:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12166873088. Throughput: 0: 42655.0. Samples: 12167006220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 21:15:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742607_12166873088.pth... [2024-06-24 21:15:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000741984_12156665856.pth [2024-06-24 21:15:45,261][15401] Updated weights for policy 0, policy_version 742613 (0.0042) [2024-06-24 21:15:48,392][15132] Fps is (10 sec: 42598.7, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 12167102464. Throughput: 0: 42814.3. Samples: 12167259920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:48,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 21:15:48,984][15401] Updated weights for policy 0, policy_version 742623 (0.0040) [2024-06-24 21:15:52,889][15401] Updated weights for policy 0, policy_version 742633 (0.0043) [2024-06-24 21:15:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 12167315456. Throughput: 0: 42826.5. Samples: 12167398780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 21:15:56,545][15401] Updated weights for policy 0, policy_version 742643 (0.0031) [2024-06-24 21:15:58,392][15132] Fps is (10 sec: 40959.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 12167512064. Throughput: 0: 42761.4. Samples: 12167646100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:15:58,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 21:16:00,874][15401] Updated weights for policy 0, policy_version 742653 (0.0041) [2024-06-24 21:16:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12167741440. Throughput: 0: 42876.9. Samples: 12167902860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:16:03,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 21:16:04,261][15401] Updated weights for policy 0, policy_version 742663 (0.0040) [2024-06-24 21:16:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 12167938048. Throughput: 0: 42715.0. Samples: 12168032920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-24 21:16:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 21:16:08,467][15401] Updated weights for policy 0, policy_version 742673 (0.0036) [2024-06-24 21:16:11,914][15401] Updated weights for policy 0, policy_version 742683 (0.0038) [2024-06-24 21:16:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42598.0). Total num frames: 12168167424. Throughput: 0: 42851.5. Samples: 12168287020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:13,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 21:16:15,954][15401] Updated weights for policy 0, policy_version 742693 (0.0038) [2024-06-24 21:16:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 12168380416. Throughput: 0: 42745.9. Samples: 12168539940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:18,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-24 21:16:19,785][15401] Updated weights for policy 0, policy_version 742703 (0.0040) [2024-06-24 21:16:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12168593408. Throughput: 0: 42549.4. Samples: 12168669300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 21:16:23,592][15401] Updated weights for policy 0, policy_version 742713 (0.0037) [2024-06-24 21:16:27,303][15401] Updated weights for policy 0, policy_version 742723 (0.0030) [2024-06-24 21:16:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.3, 300 sec: 42598.7). Total num frames: 12168806400. Throughput: 0: 42672.9. Samples: 12168926500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 21:16:31,197][15401] Updated weights for policy 0, policy_version 742733 (0.0037) [2024-06-24 21:16:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12169035776. Throughput: 0: 42825.8. Samples: 12169186980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 21:16:34,772][15401] Updated weights for policy 0, policy_version 742743 (0.0037) [2024-06-24 21:16:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 12169232384. Throughput: 0: 42547.1. Samples: 12169313500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:38,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 21:16:38,946][15401] Updated weights for policy 0, policy_version 742753 (0.0044) [2024-06-24 21:16:42,459][15401] Updated weights for policy 0, policy_version 742763 (0.0032) [2024-06-24 21:16:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12169445376. Throughput: 0: 42696.5. Samples: 12169567340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:43,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 21:16:46,775][15401] Updated weights for policy 0, policy_version 742773 (0.0034) [2024-06-24 21:16:48,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 12169658368. Throughput: 0: 42758.3. Samples: 12169826980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 21:16:49,957][15401] Updated weights for policy 0, policy_version 742783 (0.0039) [2024-06-24 21:16:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12169871360. Throughput: 0: 42716.3. Samples: 12169955160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 21:16:54,195][15401] Updated weights for policy 0, policy_version 742793 (0.0036) [2024-06-24 21:16:57,777][15401] Updated weights for policy 0, policy_version 742803 (0.0041) [2024-06-24 21:16:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43144.5, 300 sec: 42542.5). Total num frames: 12170100736. Throughput: 0: 42729.8. Samples: 12170209860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:16:58,393][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 21:17:00,956][15349] Signal inference workers to stop experience collection... (180200 times) [2024-06-24 21:17:00,996][15401] InferenceWorker_p0-w0: stopping experience collection (180200 times) [2024-06-24 21:17:01,018][15349] Signal inference workers to resume experience collection... (180200 times) [2024-06-24 21:17:01,024][15401] InferenceWorker_p0-w0: resuming experience collection (180200 times) [2024-06-24 21:17:01,797][15401] Updated weights for policy 0, policy_version 742813 (0.0031) [2024-06-24 21:17:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 12170297344. Throughput: 0: 42918.1. Samples: 12170471260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 21:17:05,586][15401] Updated weights for policy 0, policy_version 742823 (0.0028) [2024-06-24 21:17:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12170493952. Throughput: 0: 42892.9. Samples: 12170599480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 21:17:09,911][15401] Updated weights for policy 0, policy_version 742833 (0.0037) [2024-06-24 21:17:13,076][15401] Updated weights for policy 0, policy_version 742843 (0.0029) [2024-06-24 21:17:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12170739712. Throughput: 0: 42732.4. Samples: 12170849560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:13,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 21:17:17,715][15401] Updated weights for policy 0, policy_version 742853 (0.0044) [2024-06-24 21:17:18,396][15132] Fps is (10 sec: 45845.7, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 12170952704. Throughput: 0: 42560.1. Samples: 12171102460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:18,397][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 21:17:21,207][15401] Updated weights for policy 0, policy_version 742863 (0.0023) [2024-06-24 21:17:23,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12171149312. Throughput: 0: 42640.1. Samples: 12171232200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:23,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 21:17:25,355][15401] Updated weights for policy 0, policy_version 742873 (0.0035) [2024-06-24 21:17:28,390][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12171378688. Throughput: 0: 42670.6. Samples: 12171487520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 21:17:28,787][15401] Updated weights for policy 0, policy_version 742883 (0.0037) [2024-06-24 21:17:33,082][15401] Updated weights for policy 0, policy_version 742893 (0.0038) [2024-06-24 21:17:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12171575296. Throughput: 0: 42607.9. Samples: 12171744340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 21:17:36,295][15401] Updated weights for policy 0, policy_version 742903 (0.0027) [2024-06-24 21:17:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12171788288. Throughput: 0: 42502.9. Samples: 12171867780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 21:17:40,594][15401] Updated weights for policy 0, policy_version 742913 (0.0042) [2024-06-24 21:17:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12172001280. Throughput: 0: 42437.3. Samples: 12172119440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 21:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742920_12172001280.pth... [2024-06-24 21:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742296_12161777664.pth [2024-06-24 21:17:44,321][15401] Updated weights for policy 0, policy_version 742923 (0.0032) [2024-06-24 21:17:48,195][15401] Updated weights for policy 0, policy_version 742933 (0.0034) [2024-06-24 21:17:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12172214272. Throughput: 0: 42363.3. Samples: 12172377600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 21:17:52,154][15401] Updated weights for policy 0, policy_version 742943 (0.0051) [2024-06-24 21:17:53,391][15132] Fps is (10 sec: 39315.9, 60 sec: 42051.3, 300 sec: 42431.9). Total num frames: 12172394496. Throughput: 0: 42394.1. Samples: 12172507280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-24 21:17:53,391][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 21:17:55,688][15401] Updated weights for policy 0, policy_version 742953 (0.0036) [2024-06-24 21:17:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 12172640256. Throughput: 0: 42453.3. Samples: 12172759860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:17:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 21:17:59,888][15401] Updated weights for policy 0, policy_version 742963 (0.0034) [2024-06-24 21:18:03,390][15132] Fps is (10 sec: 45881.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12172853248. Throughput: 0: 42652.7. Samples: 12173021560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 21:18:03,442][15401] Updated weights for policy 0, policy_version 742973 (0.0023) [2024-06-24 21:18:07,477][15401] Updated weights for policy 0, policy_version 742983 (0.0040) [2024-06-24 21:18:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 12173049856. Throughput: 0: 42626.3. Samples: 12173150380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 21:18:11,245][15401] Updated weights for policy 0, policy_version 742993 (0.0041) [2024-06-24 21:18:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42598.3, 300 sec: 42709.1). Total num frames: 12173295616. Throughput: 0: 42619.5. Samples: 12173405500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:13,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 21:18:15,193][15401] Updated weights for policy 0, policy_version 743003 (0.0030) [2024-06-24 21:18:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 12173492224. Throughput: 0: 42744.1. Samples: 12173667820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 21:18:18,648][15401] Updated weights for policy 0, policy_version 743013 (0.0048) [2024-06-24 21:18:21,916][15349] Signal inference workers to stop experience collection... (180250 times) [2024-06-24 21:18:21,916][15349] Signal inference workers to resume experience collection... (180250 times) [2024-06-24 21:18:21,956][15401] InferenceWorker_p0-w0: stopping experience collection (180250 times) [2024-06-24 21:18:21,956][15401] InferenceWorker_p0-w0: resuming experience collection (180250 times) [2024-06-24 21:18:23,022][15401] Updated weights for policy 0, policy_version 743023 (0.0029) [2024-06-24 21:18:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12173721600. Throughput: 0: 42907.0. Samples: 12173798600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:23,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 21:18:26,157][15401] Updated weights for policy 0, policy_version 743033 (0.0036) [2024-06-24 21:18:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12173934592. Throughput: 0: 42875.2. Samples: 12174048820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:28,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-24 21:18:30,438][15401] Updated weights for policy 0, policy_version 743043 (0.0027) [2024-06-24 21:18:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12174131200. Throughput: 0: 43134.2. Samples: 12174318640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:33,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-24 21:18:33,758][15401] Updated weights for policy 0, policy_version 743053 (0.0028) [2024-06-24 21:18:37,851][15401] Updated weights for policy 0, policy_version 743063 (0.0031) [2024-06-24 21:18:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12174360576. Throughput: 0: 43127.2. Samples: 12174447940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 21:18:41,366][15401] Updated weights for policy 0, policy_version 743073 (0.0031) [2024-06-24 21:18:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12174589952. Throughput: 0: 43112.0. Samples: 12174699900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 21:18:45,664][15401] Updated weights for policy 0, policy_version 743083 (0.0049) [2024-06-24 21:18:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12174802944. Throughput: 0: 43041.4. Samples: 12174958420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:48,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 21:18:48,947][15401] Updated weights for policy 0, policy_version 743093 (0.0033) [2024-06-24 21:18:53,220][15401] Updated weights for policy 0, policy_version 743103 (0.0044) [2024-06-24 21:18:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43418.6, 300 sec: 42653.9). Total num frames: 12174999552. Throughput: 0: 43004.7. Samples: 12175085600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 21:18:56,617][15401] Updated weights for policy 0, policy_version 743113 (0.0032) [2024-06-24 21:18:58,391][15132] Fps is (10 sec: 42591.7, 60 sec: 43143.4, 300 sec: 42709.4). Total num frames: 12175228928. Throughput: 0: 42965.7. Samples: 12175338920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:18:58,392][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 21:19:00,745][15401] Updated weights for policy 0, policy_version 743123 (0.0033) [2024-06-24 21:19:03,390][15132] Fps is (10 sec: 42595.6, 60 sec: 42871.0, 300 sec: 42709.4). Total num frames: 12175425536. Throughput: 0: 42965.4. Samples: 12175601300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:03,391][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 21:19:04,252][15401] Updated weights for policy 0, policy_version 743133 (0.0031) [2024-06-24 21:19:08,388][15401] Updated weights for policy 0, policy_version 743143 (0.0023) [2024-06-24 21:19:08,389][15132] Fps is (10 sec: 42605.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12175654912. Throughput: 0: 42925.4. Samples: 12175730240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 21:19:11,879][15401] Updated weights for policy 0, policy_version 743153 (0.0033) [2024-06-24 21:19:13,392][15132] Fps is (10 sec: 44229.2, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 12175867904. Throughput: 0: 42994.9. Samples: 12175983700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:13,393][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 21:19:15,890][15401] Updated weights for policy 0, policy_version 743163 (0.0031) [2024-06-24 21:19:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12176080896. Throughput: 0: 42707.1. Samples: 12176240460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:19:19,542][15401] Updated weights for policy 0, policy_version 743173 (0.0023) [2024-06-24 21:19:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12176293888. Throughput: 0: 42739.0. Samples: 12176371200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 21:19:23,819][15401] Updated weights for policy 0, policy_version 743183 (0.0039) [2024-06-24 21:19:27,145][15401] Updated weights for policy 0, policy_version 743193 (0.0040) [2024-06-24 21:19:28,391][15132] Fps is (10 sec: 44229.1, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 12176523264. Throughput: 0: 42844.2. Samples: 12176627960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:28,392][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 21:19:31,269][15401] Updated weights for policy 0, policy_version 743203 (0.0042) [2024-06-24 21:19:33,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 12176719872. Throughput: 0: 42894.8. Samples: 12176888680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-24 21:19:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 21:19:34,782][15401] Updated weights for policy 0, policy_version 743213 (0.0035) [2024-06-24 21:19:38,389][15132] Fps is (10 sec: 39328.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12176916480. Throughput: 0: 42847.3. Samples: 12177013720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:19:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 21:19:38,796][15401] Updated weights for policy 0, policy_version 743223 (0.0041) [2024-06-24 21:19:42,501][15401] Updated weights for policy 0, policy_version 743233 (0.0054) [2024-06-24 21:19:43,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12177162240. Throughput: 0: 43018.0. Samples: 12177274660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:19:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 21:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743236_12177178624.pth... [2024-06-24 21:19:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742607_12166873088.pth [2024-06-24 21:19:46,849][15401] Updated weights for policy 0, policy_version 743243 (0.0039) [2024-06-24 21:19:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12177375232. Throughput: 0: 42842.9. Samples: 12177529200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:19:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 21:19:50,074][15401] Updated weights for policy 0, policy_version 743253 (0.0043) [2024-06-24 21:19:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12177571840. Throughput: 0: 42778.6. Samples: 12177655280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:19:53,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-24 21:19:53,917][15349] Signal inference workers to stop experience collection... (180300 times) [2024-06-24 21:19:53,938][15401] InferenceWorker_p0-w0: stopping experience collection (180300 times) [2024-06-24 21:19:54,031][15349] Signal inference workers to resume experience collection... (180300 times) [2024-06-24 21:19:54,031][15401] InferenceWorker_p0-w0: resuming experience collection (180300 times) [2024-06-24 21:19:54,347][15401] Updated weights for policy 0, policy_version 743263 (0.0029) [2024-06-24 21:19:57,736][15401] Updated weights for policy 0, policy_version 743273 (0.0031) [2024-06-24 21:19:58,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43141.1, 300 sec: 42819.6). Total num frames: 12177817600. Throughput: 0: 42844.2. Samples: 12177911860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:19:58,397][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 21:20:02,028][15401] Updated weights for policy 0, policy_version 743283 (0.0042) [2024-06-24 21:20:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43145.1, 300 sec: 42820.6). Total num frames: 12178014208. Throughput: 0: 43041.8. Samples: 12178177340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 21:20:05,164][15401] Updated weights for policy 0, policy_version 743293 (0.0032) [2024-06-24 21:20:08,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12178210816. Throughput: 0: 42894.4. Samples: 12178301440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 21:20:09,422][15401] Updated weights for policy 0, policy_version 743303 (0.0034) [2024-06-24 21:20:13,128][15401] Updated weights for policy 0, policy_version 743313 (0.0036) [2024-06-24 21:20:13,396][15132] Fps is (10 sec: 44207.6, 60 sec: 43141.6, 300 sec: 42819.6). Total num frames: 12178456576. Throughput: 0: 42857.6. Samples: 12178556760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:13,397][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 21:20:17,234][15401] Updated weights for policy 0, policy_version 743323 (0.0035) [2024-06-24 21:20:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12178636800. Throughput: 0: 42882.7. Samples: 12178818400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 21:20:20,837][15401] Updated weights for policy 0, policy_version 743333 (0.0031) [2024-06-24 21:20:23,389][15132] Fps is (10 sec: 39347.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 12178849792. Throughput: 0: 42874.1. Samples: 12178943060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 21:20:24,969][15401] Updated weights for policy 0, policy_version 743343 (0.0037) [2024-06-24 21:20:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42599.6, 300 sec: 42765.0). Total num frames: 12179079168. Throughput: 0: 42599.1. Samples: 12179191620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 21:20:28,737][15401] Updated weights for policy 0, policy_version 743353 (0.0023) [2024-06-24 21:20:32,680][15401] Updated weights for policy 0, policy_version 743363 (0.0029) [2024-06-24 21:20:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 12179292160. Throughput: 0: 42806.7. Samples: 12179455500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 21:20:36,313][15401] Updated weights for policy 0, policy_version 743373 (0.0031) [2024-06-24 21:20:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12179472384. Throughput: 0: 42839.6. Samples: 12179583060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:38,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 21:20:40,441][15401] Updated weights for policy 0, policy_version 743383 (0.0033) [2024-06-24 21:20:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12179734528. Throughput: 0: 42742.5. Samples: 12179835000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:43,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 21:20:43,891][15401] Updated weights for policy 0, policy_version 743393 (0.0032) [2024-06-24 21:20:48,320][15401] Updated weights for policy 0, policy_version 743403 (0.0036) [2024-06-24 21:20:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12179914752. Throughput: 0: 42734.1. Samples: 12180100380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:48,396][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 21:20:51,362][15401] Updated weights for policy 0, policy_version 743413 (0.0033) [2024-06-24 21:20:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 12180111360. Throughput: 0: 42516.4. Samples: 12180214680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:53,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 21:20:56,107][15401] Updated weights for policy 0, policy_version 743423 (0.0035) [2024-06-24 21:20:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 12180373504. Throughput: 0: 42710.1. Samples: 12180478440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:20:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 21:20:58,813][15401] Updated weights for policy 0, policy_version 743433 (0.0039) [2024-06-24 21:21:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 12180553728. Throughput: 0: 42668.7. Samples: 12180738500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:21:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 21:21:03,588][15401] Updated weights for policy 0, policy_version 743443 (0.0032) [2024-06-24 21:21:06,716][15401] Updated weights for policy 0, policy_version 743453 (0.0033) [2024-06-24 21:21:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12180766720. Throughput: 0: 42515.9. Samples: 12180856280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:21:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 21:21:09,613][15349] Signal inference workers to stop experience collection... (180350 times) [2024-06-24 21:21:09,614][15349] Signal inference workers to resume experience collection... (180350 times) [2024-06-24 21:21:09,633][15401] InferenceWorker_p0-w0: stopping experience collection (180350 times) [2024-06-24 21:21:09,633][15401] InferenceWorker_p0-w0: resuming experience collection (180350 times) [2024-06-24 21:21:11,155][15401] Updated weights for policy 0, policy_version 743463 (0.0035) [2024-06-24 21:21:13,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42601.3, 300 sec: 42820.2). Total num frames: 12181012480. Throughput: 0: 42762.1. Samples: 12181116020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:21:13,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 21:21:14,404][15401] Updated weights for policy 0, policy_version 743473 (0.0034) [2024-06-24 21:21:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12181192704. Throughput: 0: 42742.7. Samples: 12181378920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-24 21:21:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-24 21:21:18,744][15401] Updated weights for policy 0, policy_version 743483 (0.0035) [2024-06-24 21:21:22,097][15401] Updated weights for policy 0, policy_version 743493 (0.0041) [2024-06-24 21:21:23,390][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12181422080. Throughput: 0: 42603.1. Samples: 12181500200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 21:21:26,536][15401] Updated weights for policy 0, policy_version 743503 (0.0037) [2024-06-24 21:21:28,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12181667840. Throughput: 0: 42729.4. Samples: 12181757820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 21:21:29,895][15401] Updated weights for policy 0, policy_version 743513 (0.0027) [2024-06-24 21:21:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12181831680. Throughput: 0: 42692.9. Samples: 12182021560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 21:21:34,373][15401] Updated weights for policy 0, policy_version 743523 (0.0031) [2024-06-24 21:21:37,591][15401] Updated weights for policy 0, policy_version 743533 (0.0043) [2024-06-24 21:21:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12182061056. Throughput: 0: 42710.3. Samples: 12182136640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:38,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-24 21:21:42,047][15401] Updated weights for policy 0, policy_version 743543 (0.0029) [2024-06-24 21:21:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12182306816. Throughput: 0: 42675.7. Samples: 12182398840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:21:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743550_12182323200.pth... [2024-06-24 21:21:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000742920_12172001280.pth [2024-06-24 21:21:45,275][15401] Updated weights for policy 0, policy_version 743553 (0.0037) [2024-06-24 21:21:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12182454272. Throughput: 0: 42572.6. Samples: 12182654260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 21:21:49,771][15401] Updated weights for policy 0, policy_version 743563 (0.0028) [2024-06-24 21:21:52,898][15401] Updated weights for policy 0, policy_version 743573 (0.0032) [2024-06-24 21:21:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 12182700032. Throughput: 0: 42471.6. Samples: 12182767500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:53,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-24 21:21:57,493][15401] Updated weights for policy 0, policy_version 743583 (0.0047) [2024-06-24 21:21:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12182929408. Throughput: 0: 42633.0. Samples: 12183034400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:21:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 21:22:00,571][15401] Updated weights for policy 0, policy_version 743593 (0.0034) [2024-06-24 21:22:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12183093248. Throughput: 0: 42365.7. Samples: 12183285380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 21:22:05,191][15401] Updated weights for policy 0, policy_version 743603 (0.0038) [2024-06-24 21:22:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 12183339008. Throughput: 0: 42338.7. Samples: 12183405540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:08,392][15132] Avg episode reward: [(0, '0.154')] [2024-06-24 21:22:08,521][15401] Updated weights for policy 0, policy_version 743613 (0.0040) [2024-06-24 21:22:12,714][15401] Updated weights for policy 0, policy_version 743623 (0.0032) [2024-06-24 21:22:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42327.0, 300 sec: 42710.4). Total num frames: 12183552000. Throughput: 0: 42566.6. Samples: 12183673320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 21:22:16,018][15401] Updated weights for policy 0, policy_version 743633 (0.0031) [2024-06-24 21:22:18,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12183732224. Throughput: 0: 42306.6. Samples: 12183925360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:18,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-24 21:22:20,423][15401] Updated weights for policy 0, policy_version 743643 (0.0030) [2024-06-24 21:22:21,304][15349] Signal inference workers to stop experience collection... (180400 times) [2024-06-24 21:22:21,340][15401] InferenceWorker_p0-w0: stopping experience collection (180400 times) [2024-06-24 21:22:21,357][15349] Signal inference workers to resume experience collection... (180400 times) [2024-06-24 21:22:21,358][15401] InferenceWorker_p0-w0: resuming experience collection (180400 times) [2024-06-24 21:22:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12183977984. Throughput: 0: 42349.7. Samples: 12184042380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 21:22:23,861][15401] Updated weights for policy 0, policy_version 743653 (0.0052) [2024-06-24 21:22:28,133][15401] Updated weights for policy 0, policy_version 743663 (0.0037) [2024-06-24 21:22:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12184190976. Throughput: 0: 42458.7. Samples: 12184309480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:28,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 21:22:31,651][15401] Updated weights for policy 0, policy_version 743673 (0.0023) [2024-06-24 21:22:33,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 12184371200. Throughput: 0: 42288.3. Samples: 12184557340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:33,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 21:22:36,019][15401] Updated weights for policy 0, policy_version 743683 (0.0031) [2024-06-24 21:22:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12184616960. Throughput: 0: 42571.2. Samples: 12184683200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 21:22:39,289][15401] Updated weights for policy 0, policy_version 743693 (0.0033) [2024-06-24 21:22:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41506.2, 300 sec: 42653.9). Total num frames: 12184797184. Throughput: 0: 42383.2. Samples: 12184941640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 21:22:43,639][15401] Updated weights for policy 0, policy_version 743703 (0.0023) [2024-06-24 21:22:46,924][15401] Updated weights for policy 0, policy_version 743713 (0.0045) [2024-06-24 21:22:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.2). Total num frames: 12185010176. Throughput: 0: 42248.4. Samples: 12185186560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 21:22:51,303][15401] Updated weights for policy 0, policy_version 743723 (0.0041) [2024-06-24 21:22:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12185239552. Throughput: 0: 42467.0. Samples: 12185316460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:53,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-24 21:22:55,000][15401] Updated weights for policy 0, policy_version 743733 (0.0037) [2024-06-24 21:22:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 12185419776. Throughput: 0: 42302.4. Samples: 12185576920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:22:58,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 21:22:59,073][15401] Updated weights for policy 0, policy_version 743743 (0.0034) [2024-06-24 21:23:02,683][15401] Updated weights for policy 0, policy_version 743753 (0.0029) [2024-06-24 21:23:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12185665536. Throughput: 0: 42327.0. Samples: 12185830080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-24 21:23:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 21:23:06,714][15401] Updated weights for policy 0, policy_version 743763 (0.0030) [2024-06-24 21:23:08,391][15132] Fps is (10 sec: 47505.2, 60 sec: 42598.9, 300 sec: 42709.6). Total num frames: 12185894912. Throughput: 0: 42616.7. Samples: 12185960200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:08,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 21:23:10,333][15401] Updated weights for policy 0, policy_version 743773 (0.0033) [2024-06-24 21:23:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 12186058752. Throughput: 0: 42394.6. Samples: 12186217240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 21:23:14,232][15401] Updated weights for policy 0, policy_version 743783 (0.0030) [2024-06-24 21:23:17,854][15401] Updated weights for policy 0, policy_version 743793 (0.0038) [2024-06-24 21:23:18,390][15132] Fps is (10 sec: 40966.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12186304512. Throughput: 0: 42502.2. Samples: 12186469840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 21:23:21,981][15401] Updated weights for policy 0, policy_version 743803 (0.0031) [2024-06-24 21:23:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12186517504. Throughput: 0: 42753.7. Samples: 12186607120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:23,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 21:23:26,097][15401] Updated weights for policy 0, policy_version 743813 (0.0037) [2024-06-24 21:23:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12186714112. Throughput: 0: 42495.5. Samples: 12186853940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 21:23:28,718][15349] Signal inference workers to stop experience collection... (180450 times) [2024-06-24 21:23:28,719][15349] Signal inference workers to resume experience collection... (180450 times) [2024-06-24 21:23:28,738][15401] InferenceWorker_p0-w0: stopping experience collection (180450 times) [2024-06-24 21:23:28,739][15401] InferenceWorker_p0-w0: resuming experience collection (180450 times) [2024-06-24 21:23:29,771][15401] Updated weights for policy 0, policy_version 743823 (0.0034) [2024-06-24 21:23:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 12186943488. Throughput: 0: 42705.4. Samples: 12187108300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 21:23:33,639][15401] Updated weights for policy 0, policy_version 743833 (0.0031) [2024-06-24 21:23:37,560][15401] Updated weights for policy 0, policy_version 743843 (0.0037) [2024-06-24 21:23:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12187156480. Throughput: 0: 42731.1. Samples: 12187239360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 21:23:41,169][15401] Updated weights for policy 0, policy_version 743853 (0.0040) [2024-06-24 21:23:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12187353088. Throughput: 0: 42565.7. Samples: 12187492380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 21:23:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743857_12187353088.pth... [2024-06-24 21:23:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743236_12177178624.pth [2024-06-24 21:23:45,144][15401] Updated weights for policy 0, policy_version 743863 (0.0031) [2024-06-24 21:23:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12187598848. Throughput: 0: 42504.2. Samples: 12187742760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 21:23:49,229][15401] Updated weights for policy 0, policy_version 743873 (0.0032) [2024-06-24 21:23:52,797][15401] Updated weights for policy 0, policy_version 743883 (0.0045) [2024-06-24 21:23:53,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42869.8, 300 sec: 42653.8). Total num frames: 12187811840. Throughput: 0: 42626.8. Samples: 12187878440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:53,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 21:23:56,750][15401] Updated weights for policy 0, policy_version 743893 (0.0041) [2024-06-24 21:23:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.3, 300 sec: 42598.5). Total num frames: 12187992064. Throughput: 0: 42569.2. Samples: 12188132860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:23:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 21:24:00,336][15401] Updated weights for policy 0, policy_version 743903 (0.0026) [2024-06-24 21:24:03,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12188237824. Throughput: 0: 42705.8. Samples: 12188391700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:03,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 21:24:04,302][15401] Updated weights for policy 0, policy_version 743913 (0.0043) [2024-06-24 21:24:07,902][15401] Updated weights for policy 0, policy_version 743923 (0.0031) [2024-06-24 21:24:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42599.5, 300 sec: 42654.3). Total num frames: 12188450816. Throughput: 0: 42613.3. Samples: 12188524720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:24:11,958][15401] Updated weights for policy 0, policy_version 743933 (0.0033) [2024-06-24 21:24:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12188647424. Throughput: 0: 42827.6. Samples: 12188781180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 21:24:15,653][15401] Updated weights for policy 0, policy_version 743943 (0.0041) [2024-06-24 21:24:18,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12188876800. Throughput: 0: 42683.5. Samples: 12189029160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:18,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 21:24:19,527][15401] Updated weights for policy 0, policy_version 743953 (0.0033) [2024-06-24 21:24:23,190][15401] Updated weights for policy 0, policy_version 743963 (0.0029) [2024-06-24 21:24:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.6). Total num frames: 12189089792. Throughput: 0: 42782.2. Samples: 12189164560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:23,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 21:24:26,994][15401] Updated weights for policy 0, policy_version 743973 (0.0032) [2024-06-24 21:24:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12189302784. Throughput: 0: 42723.0. Samples: 12189414920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 21:24:30,896][15401] Updated weights for policy 0, policy_version 743983 (0.0039) [2024-06-24 21:24:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12189499392. Throughput: 0: 42788.1. Samples: 12189668220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 21:24:34,968][15401] Updated weights for policy 0, policy_version 743993 (0.0053) [2024-06-24 21:24:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12189728768. Throughput: 0: 42611.7. Samples: 12189795860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 21:24:38,536][15401] Updated weights for policy 0, policy_version 744003 (0.0033) [2024-06-24 21:24:42,990][15401] Updated weights for policy 0, policy_version 744013 (0.0035) [2024-06-24 21:24:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12189908992. Throughput: 0: 42641.4. Samples: 12190051720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-24 21:24:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 21:24:46,659][15401] Updated weights for policy 0, policy_version 744023 (0.0029) [2024-06-24 21:24:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12190154752. Throughput: 0: 42499.6. Samples: 12190304080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:24:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-24 21:24:50,563][15401] Updated weights for policy 0, policy_version 744033 (0.0032) [2024-06-24 21:24:53,392][15132] Fps is (10 sec: 45863.5, 60 sec: 42598.3, 300 sec: 42543.4). Total num frames: 12190367744. Throughput: 0: 42515.0. Samples: 12190438000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:24:53,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 21:24:54,148][15401] Updated weights for policy 0, policy_version 744043 (0.0035) [2024-06-24 21:24:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 12190547968. Throughput: 0: 42461.2. Samples: 12190691940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:24:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 21:24:58,639][15401] Updated weights for policy 0, policy_version 744053 (0.0031) [2024-06-24 21:25:01,932][15401] Updated weights for policy 0, policy_version 744063 (0.0043) [2024-06-24 21:25:03,228][15349] Signal inference workers to stop experience collection... (180500 times) [2024-06-24 21:25:03,253][15401] InferenceWorker_p0-w0: stopping experience collection (180500 times) [2024-06-24 21:25:03,343][15349] Signal inference workers to resume experience collection... (180500 times) [2024-06-24 21:25:03,343][15401] InferenceWorker_p0-w0: resuming experience collection (180500 times) [2024-06-24 21:25:03,392][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 12190793728. Throughput: 0: 42548.4. Samples: 12190943840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:03,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:25:06,249][15401] Updated weights for policy 0, policy_version 744073 (0.0039) [2024-06-24 21:25:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.6, 300 sec: 42543.8). Total num frames: 12191006720. Throughput: 0: 42503.7. Samples: 12191077220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 21:25:09,707][15401] Updated weights for policy 0, policy_version 744083 (0.0038) [2024-06-24 21:25:13,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 12191186944. Throughput: 0: 42439.6. Samples: 12191324700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 21:25:13,850][15401] Updated weights for policy 0, policy_version 744093 (0.0033) [2024-06-24 21:25:17,291][15401] Updated weights for policy 0, policy_version 744103 (0.0041) [2024-06-24 21:25:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12191416320. Throughput: 0: 42481.7. Samples: 12191579900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:18,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 21:25:21,780][15401] Updated weights for policy 0, policy_version 744113 (0.0026) [2024-06-24 21:25:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12191645696. Throughput: 0: 42611.2. Samples: 12191713360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 21:25:24,702][15401] Updated weights for policy 0, policy_version 744123 (0.0038) [2024-06-24 21:25:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 12191825920. Throughput: 0: 42509.9. Samples: 12191964660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 21:25:29,234][15401] Updated weights for policy 0, policy_version 744133 (0.0039) [2024-06-24 21:25:33,026][15401] Updated weights for policy 0, policy_version 744143 (0.0033) [2024-06-24 21:25:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12192055296. Throughput: 0: 42742.3. Samples: 12192227480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 21:25:36,678][15401] Updated weights for policy 0, policy_version 744153 (0.0032) [2024-06-24 21:25:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12192284672. Throughput: 0: 42671.9. Samples: 12192358120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 21:25:40,521][15401] Updated weights for policy 0, policy_version 744163 (0.0030) [2024-06-24 21:25:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12192497664. Throughput: 0: 42798.3. Samples: 12192617860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 21:25:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744171_12192497664.pth... [2024-06-24 21:25:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743550_12182323200.pth [2024-06-24 21:25:44,059][15401] Updated weights for policy 0, policy_version 744173 (0.0031) [2024-06-24 21:25:48,014][15401] Updated weights for policy 0, policy_version 744183 (0.0033) [2024-06-24 21:25:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12192694272. Throughput: 0: 42867.7. Samples: 12192872780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 21:25:51,848][15401] Updated weights for policy 0, policy_version 744193 (0.0032) [2024-06-24 21:25:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.3, 300 sec: 42542.9). Total num frames: 12192923648. Throughput: 0: 42789.7. Samples: 12193002760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:25:55,606][15401] Updated weights for policy 0, policy_version 744203 (0.0031) [2024-06-24 21:25:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12193136640. Throughput: 0: 43000.8. Samples: 12193259740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:25:58,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 21:25:59,559][15401] Updated weights for policy 0, policy_version 744213 (0.0035) [2024-06-24 21:26:03,247][15401] Updated weights for policy 0, policy_version 744223 (0.0034) [2024-06-24 21:26:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12193349632. Throughput: 0: 43020.3. Samples: 12193515820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:03,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 21:26:07,164][15401] Updated weights for policy 0, policy_version 744233 (0.0036) [2024-06-24 21:26:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 12193562624. Throughput: 0: 42846.6. Samples: 12193641460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 21:26:10,919][15401] Updated weights for policy 0, policy_version 744243 (0.0033) [2024-06-24 21:26:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12193759232. Throughput: 0: 42801.7. Samples: 12193890740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:13,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 21:26:15,116][15401] Updated weights for policy 0, policy_version 744253 (0.0059) [2024-06-24 21:26:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12193988608. Throughput: 0: 42681.4. Samples: 12194148140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 21:26:18,413][15401] Updated weights for policy 0, policy_version 744263 (0.0031) [2024-06-24 21:26:22,705][15401] Updated weights for policy 0, policy_version 744273 (0.0044) [2024-06-24 21:26:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 12194201600. Throughput: 0: 42768.8. Samples: 12194282820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:23,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 21:26:25,223][15349] Signal inference workers to stop experience collection... (180550 times) [2024-06-24 21:26:25,226][15349] Signal inference workers to resume experience collection... (180550 times) [2024-06-24 21:26:25,272][15401] InferenceWorker_p0-w0: stopping experience collection (180550 times) [2024-06-24 21:26:25,272][15401] InferenceWorker_p0-w0: resuming experience collection (180550 times) [2024-06-24 21:26:25,946][15401] Updated weights for policy 0, policy_version 744283 (0.0033) [2024-06-24 21:26:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12194414592. Throughput: 0: 42577.3. Samples: 12194533840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-24 21:26:28,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 21:26:30,144][15401] Updated weights for policy 0, policy_version 744293 (0.0029) [2024-06-24 21:26:33,392][15132] Fps is (10 sec: 42598.1, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 12194627584. Throughput: 0: 42691.9. Samples: 12194794020. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:33,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 21:26:33,937][15401] Updated weights for policy 0, policy_version 744303 (0.0039) [2024-06-24 21:26:37,687][15401] Updated weights for policy 0, policy_version 744313 (0.0037) [2024-06-24 21:26:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12194840576. Throughput: 0: 42703.5. Samples: 12194924420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:38,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 21:26:41,519][15401] Updated weights for policy 0, policy_version 744323 (0.0032) [2024-06-24 21:26:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12195053568. Throughput: 0: 42644.9. Samples: 12195178760. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 21:26:45,556][15401] Updated weights for policy 0, policy_version 744333 (0.0034) [2024-06-24 21:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12195266560. Throughput: 0: 42700.9. Samples: 12195437360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:48,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-24 21:26:49,121][15401] Updated weights for policy 0, policy_version 744343 (0.0039) [2024-06-24 21:26:53,382][15401] Updated weights for policy 0, policy_version 744353 (0.0026) [2024-06-24 21:26:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 12195479552. Throughput: 0: 42726.5. Samples: 12195564160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:53,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 21:26:56,894][15401] Updated weights for policy 0, policy_version 744363 (0.0041) [2024-06-24 21:26:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 12195692544. Throughput: 0: 42754.5. Samples: 12195814800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:26:58,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 21:27:00,913][15401] Updated weights for policy 0, policy_version 744373 (0.0029) [2024-06-24 21:27:03,391][15132] Fps is (10 sec: 44229.2, 60 sec: 42870.2, 300 sec: 42654.0). Total num frames: 12195921920. Throughput: 0: 42813.3. Samples: 12196074820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:03,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 21:27:04,500][15401] Updated weights for policy 0, policy_version 744383 (0.0045) [2024-06-24 21:27:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12196118528. Throughput: 0: 42734.5. Samples: 12196205780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:08,390][15132] Avg episode reward: [(0, '0.060')] [2024-06-24 21:27:08,715][15401] Updated weights for policy 0, policy_version 744393 (0.0032) [2024-06-24 21:27:12,051][15401] Updated weights for policy 0, policy_version 744403 (0.0034) [2024-06-24 21:27:13,390][15132] Fps is (10 sec: 40967.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12196331520. Throughput: 0: 42746.2. Samples: 12196457420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 21:27:16,298][15401] Updated weights for policy 0, policy_version 744413 (0.0034) [2024-06-24 21:27:18,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 12196577280. Throughput: 0: 42649.7. Samples: 12196713260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:18,393][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 21:27:19,894][15401] Updated weights for policy 0, policy_version 744423 (0.0031) [2024-06-24 21:27:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 12196757504. Throughput: 0: 42811.0. Samples: 12196850920. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 21:27:23,858][15401] Updated weights for policy 0, policy_version 744433 (0.0034) [2024-06-24 21:27:27,569][15401] Updated weights for policy 0, policy_version 744443 (0.0037) [2024-06-24 21:27:28,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 12196986880. Throughput: 0: 42944.8. Samples: 12197111280. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:28,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-24 21:27:31,335][15401] Updated weights for policy 0, policy_version 744453 (0.0023) [2024-06-24 21:27:32,711][15349] Signal inference workers to stop experience collection... (180600 times) [2024-06-24 21:27:32,716][15349] Signal inference workers to resume experience collection... (180600 times) [2024-06-24 21:27:32,757][15401] InferenceWorker_p0-w0: stopping experience collection (180600 times) [2024-06-24 21:27:32,757][15401] InferenceWorker_p0-w0: resuming experience collection (180600 times) [2024-06-24 21:27:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 12197232640. Throughput: 0: 42805.8. Samples: 12197363620. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 21:27:35,414][15401] Updated weights for policy 0, policy_version 744463 (0.0037) [2024-06-24 21:27:38,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 12197412864. Throughput: 0: 43066.2. Samples: 12197502240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:38,393][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 21:27:38,932][15401] Updated weights for policy 0, policy_version 744473 (0.0038) [2024-06-24 21:27:43,069][15401] Updated weights for policy 0, policy_version 744483 (0.0023) [2024-06-24 21:27:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12197625856. Throughput: 0: 43296.9. Samples: 12197763060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:43,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 21:27:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744484_12197625856.pth... [2024-06-24 21:27:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000743857_12187353088.pth [2024-06-24 21:27:46,380][15401] Updated weights for policy 0, policy_version 744493 (0.0037) [2024-06-24 21:27:48,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12197871616. Throughput: 0: 43055.9. Samples: 12198012260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:48,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 21:27:50,720][15401] Updated weights for policy 0, policy_version 744503 (0.0024) [2024-06-24 21:27:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12198068224. Throughput: 0: 43208.1. Samples: 12198150140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 21:27:54,017][15401] Updated weights for policy 0, policy_version 744513 (0.0039) [2024-06-24 21:27:58,303][15401] Updated weights for policy 0, policy_version 744523 (0.0024) [2024-06-24 21:27:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42871.5, 300 sec: 42709.2). Total num frames: 12198264832. Throughput: 0: 43284.4. Samples: 12198405320. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:27:58,393][15132] Avg episode reward: [(0, '0.821')] [2024-06-24 21:28:01,540][15401] Updated weights for policy 0, policy_version 744533 (0.0035) [2024-06-24 21:28:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43145.9, 300 sec: 42765.3). Total num frames: 12198510592. Throughput: 0: 43200.6. Samples: 12198657180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:28:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 21:28:06,003][15401] Updated weights for policy 0, policy_version 744543 (0.0037) [2024-06-24 21:28:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12198707200. Throughput: 0: 43268.6. Samples: 12198798000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:28:08,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 21:28:09,124][15401] Updated weights for policy 0, policy_version 744553 (0.0037) [2024-06-24 21:28:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12198903808. Throughput: 0: 42954.8. Samples: 12199044240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-24 21:28:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 21:28:13,477][15401] Updated weights for policy 0, policy_version 744563 (0.0036) [2024-06-24 21:28:16,846][15401] Updated weights for policy 0, policy_version 744573 (0.0022) [2024-06-24 21:28:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 12199165952. Throughput: 0: 43020.3. Samples: 12199299540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 21:28:20,913][15401] Updated weights for policy 0, policy_version 744583 (0.0037) [2024-06-24 21:28:23,391][15132] Fps is (10 sec: 42592.8, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 12199329792. Throughput: 0: 42975.8. Samples: 12199436100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:23,391][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 21:28:24,480][15401] Updated weights for policy 0, policy_version 744593 (0.0035) [2024-06-24 21:28:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 12199559168. Throughput: 0: 42746.9. Samples: 12199686660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:28,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 21:28:28,434][15401] Updated weights for policy 0, policy_version 744603 (0.0035) [2024-06-24 21:28:32,080][15401] Updated weights for policy 0, policy_version 744613 (0.0028) [2024-06-24 21:28:33,390][15132] Fps is (10 sec: 49158.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12199821312. Throughput: 0: 42901.8. Samples: 12199942840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 21:28:36,367][15401] Updated weights for policy 0, policy_version 744623 (0.0029) [2024-06-24 21:28:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.4, 300 sec: 42876.1). Total num frames: 12200001536. Throughput: 0: 42873.4. Samples: 12200079440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 21:28:39,815][15401] Updated weights for policy 0, policy_version 744633 (0.0034) [2024-06-24 21:28:43,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12200198144. Throughput: 0: 42844.9. Samples: 12200333240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 21:28:43,903][15401] Updated weights for policy 0, policy_version 744643 (0.0025) [2024-06-24 21:28:47,332][15401] Updated weights for policy 0, policy_version 744653 (0.0024) [2024-06-24 21:28:47,958][15349] Signal inference workers to stop experience collection... (180650 times) [2024-06-24 21:28:48,007][15401] InferenceWorker_p0-w0: stopping experience collection (180650 times) [2024-06-24 21:28:48,015][15349] Signal inference workers to resume experience collection... (180650 times) [2024-06-24 21:28:48,023][15401] InferenceWorker_p0-w0: resuming experience collection (180650 times) [2024-06-24 21:28:48,391][15132] Fps is (10 sec: 45868.2, 60 sec: 43143.5, 300 sec: 42876.2). Total num frames: 12200460288. Throughput: 0: 42964.4. Samples: 12200590640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:48,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 21:28:51,561][15401] Updated weights for policy 0, policy_version 744663 (0.0032) [2024-06-24 21:28:53,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12200624128. Throughput: 0: 42810.5. Samples: 12200724580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:53,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 21:28:54,819][15401] Updated weights for policy 0, policy_version 744673 (0.0023) [2024-06-24 21:28:58,389][15132] Fps is (10 sec: 39327.4, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 12200853504. Throughput: 0: 42883.1. Samples: 12200973980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:28:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 21:28:59,405][15401] Updated weights for policy 0, policy_version 744683 (0.0047) [2024-06-24 21:29:02,383][15401] Updated weights for policy 0, policy_version 744693 (0.0036) [2024-06-24 21:29:03,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12201082880. Throughput: 0: 43103.7. Samples: 12201239200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 21:29:07,018][15401] Updated weights for policy 0, policy_version 744703 (0.0033) [2024-06-24 21:29:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12201279488. Throughput: 0: 42961.2. Samples: 12201369300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:08,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-24 21:29:10,235][15401] Updated weights for policy 0, policy_version 744713 (0.0031) [2024-06-24 21:29:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 12201508864. Throughput: 0: 42927.9. Samples: 12201618420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:13,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 21:29:14,455][15401] Updated weights for policy 0, policy_version 744723 (0.0035) [2024-06-24 21:29:17,700][15401] Updated weights for policy 0, policy_version 744733 (0.0027) [2024-06-24 21:29:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12201721856. Throughput: 0: 43070.2. Samples: 12201881000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:18,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 21:29:22,050][15401] Updated weights for policy 0, policy_version 744743 (0.0032) [2024-06-24 21:29:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42872.4, 300 sec: 42709.5). Total num frames: 12201902080. Throughput: 0: 42900.8. Samples: 12202009980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 21:29:25,229][15401] Updated weights for policy 0, policy_version 744753 (0.0021) [2024-06-24 21:29:28,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.8, 300 sec: 42875.1). Total num frames: 12202147840. Throughput: 0: 42925.9. Samples: 12202265180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:28,397][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 21:29:29,535][15401] Updated weights for policy 0, policy_version 744763 (0.0036) [2024-06-24 21:29:33,050][15401] Updated weights for policy 0, policy_version 744773 (0.0029) [2024-06-24 21:29:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12202360832. Throughput: 0: 42966.2. Samples: 12202524060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 21:29:37,480][15401] Updated weights for policy 0, policy_version 744783 (0.0035) [2024-06-24 21:29:38,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12202557440. Throughput: 0: 42883.7. Samples: 12202654240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 21:29:40,599][15401] Updated weights for policy 0, policy_version 744793 (0.0035) [2024-06-24 21:29:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12202803200. Throughput: 0: 42983.1. Samples: 12202908220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 21:29:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744800_12202803200.pth... [2024-06-24 21:29:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744171_12192497664.pth [2024-06-24 21:29:44,965][15401] Updated weights for policy 0, policy_version 744803 (0.0039) [2024-06-24 21:29:48,327][15401] Updated weights for policy 0, policy_version 744813 (0.0032) [2024-06-24 21:29:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42599.4, 300 sec: 42876.5). Total num frames: 12203016192. Throughput: 0: 42992.4. Samples: 12203173860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 21:29:52,556][15401] Updated weights for policy 0, policy_version 744823 (0.0034) [2024-06-24 21:29:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 12203212800. Throughput: 0: 42909.3. Samples: 12203300220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 21:29:55,978][15401] Updated weights for policy 0, policy_version 744833 (0.0038) [2024-06-24 21:29:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 12203458560. Throughput: 0: 42880.4. Samples: 12203548040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-24 21:29:58,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-24 21:30:00,324][15401] Updated weights for policy 0, policy_version 744843 (0.0034) [2024-06-24 21:30:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12203638784. Throughput: 0: 42991.2. Samples: 12203815600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 21:30:03,718][15401] Updated weights for policy 0, policy_version 744853 (0.0030) [2024-06-24 21:30:07,924][15401] Updated weights for policy 0, policy_version 744863 (0.0037) [2024-06-24 21:30:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12203851776. Throughput: 0: 42827.1. Samples: 12203937200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 21:30:11,283][15401] Updated weights for policy 0, policy_version 744873 (0.0024) [2024-06-24 21:30:11,366][15349] Signal inference workers to stop experience collection... (180700 times) [2024-06-24 21:30:11,408][15401] InferenceWorker_p0-w0: stopping experience collection (180700 times) [2024-06-24 21:30:11,426][15349] Signal inference workers to resume experience collection... (180700 times) [2024-06-24 21:30:11,427][15401] InferenceWorker_p0-w0: resuming experience collection (180700 times) [2024-06-24 21:30:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12204097536. Throughput: 0: 42779.5. Samples: 12204189980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 21:30:15,677][15401] Updated weights for policy 0, policy_version 744883 (0.0033) [2024-06-24 21:30:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12204277760. Throughput: 0: 42930.3. Samples: 12204455920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:18,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 21:30:18,862][15401] Updated weights for policy 0, policy_version 744893 (0.0039) [2024-06-24 21:30:23,110][15401] Updated weights for policy 0, policy_version 744903 (0.0044) [2024-06-24 21:30:23,396][15132] Fps is (10 sec: 39296.1, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 12204490752. Throughput: 0: 42792.9. Samples: 12204580200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:23,397][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 21:30:26,526][15401] Updated weights for policy 0, policy_version 744913 (0.0039) [2024-06-24 21:30:28,390][15132] Fps is (10 sec: 45871.5, 60 sec: 43148.6, 300 sec: 42987.1). Total num frames: 12204736512. Throughput: 0: 42729.1. Samples: 12204831060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:28,391][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 21:30:30,645][15401] Updated weights for policy 0, policy_version 744923 (0.0037) [2024-06-24 21:30:33,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12204933120. Throughput: 0: 42715.5. Samples: 12205096060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 21:30:34,233][15401] Updated weights for policy 0, policy_version 744933 (0.0030) [2024-06-24 21:30:38,389][15132] Fps is (10 sec: 39324.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12205129728. Throughput: 0: 42510.0. Samples: 12205213160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 21:30:38,813][15401] Updated weights for policy 0, policy_version 744943 (0.0032) [2024-06-24 21:30:42,188][15401] Updated weights for policy 0, policy_version 744953 (0.0028) [2024-06-24 21:30:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 12205375488. Throughput: 0: 42759.0. Samples: 12205472200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 21:30:46,268][15401] Updated weights for policy 0, policy_version 744963 (0.0030) [2024-06-24 21:30:48,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 12205555712. Throughput: 0: 42513.5. Samples: 12205728980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:48,396][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 21:30:49,959][15401] Updated weights for policy 0, policy_version 744973 (0.0038) [2024-06-24 21:30:53,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12205768704. Throughput: 0: 42533.3. Samples: 12205851300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:53,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 21:30:53,789][15401] Updated weights for policy 0, policy_version 744983 (0.0048) [2024-06-24 21:30:57,495][15401] Updated weights for policy 0, policy_version 744993 (0.0039) [2024-06-24 21:30:58,389][15132] Fps is (10 sec: 45904.8, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 12206014464. Throughput: 0: 42813.3. Samples: 12206116580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:30:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 21:31:01,277][15401] Updated weights for policy 0, policy_version 745003 (0.0050) [2024-06-24 21:31:03,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12206194688. Throughput: 0: 42495.9. Samples: 12206368340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:03,392][15132] Avg episode reward: [(0, '0.635')] [2024-06-24 21:31:05,205][15401] Updated weights for policy 0, policy_version 745013 (0.0029) [2024-06-24 21:31:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12206407680. Throughput: 0: 42371.4. Samples: 12206486640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 21:31:08,981][15401] Updated weights for policy 0, policy_version 745023 (0.0031) [2024-06-24 21:31:13,026][15401] Updated weights for policy 0, policy_version 745033 (0.0027) [2024-06-24 21:31:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 12206653440. Throughput: 0: 42689.1. Samples: 12206752040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 21:31:16,744][15401] Updated weights for policy 0, policy_version 745043 (0.0031) [2024-06-24 21:31:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 12206817280. Throughput: 0: 42394.7. Samples: 12207003820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:18,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 21:31:20,741][15401] Updated weights for policy 0, policy_version 745053 (0.0037) [2024-06-24 21:31:23,391][15132] Fps is (10 sec: 40956.1, 60 sec: 42875.3, 300 sec: 42875.9). Total num frames: 12207063040. Throughput: 0: 42493.6. Samples: 12207125420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:23,391][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 21:31:24,351][15401] Updated weights for policy 0, policy_version 745063 (0.0031) [2024-06-24 21:31:28,344][15401] Updated weights for policy 0, policy_version 745073 (0.0052) [2024-06-24 21:31:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.8, 300 sec: 42876.4). Total num frames: 12207276032. Throughput: 0: 42640.6. Samples: 12207391020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 21:31:28,764][15349] Signal inference workers to stop experience collection... (180750 times) [2024-06-24 21:31:28,764][15349] Signal inference workers to resume experience collection... (180750 times) [2024-06-24 21:31:28,808][15401] InferenceWorker_p0-w0: stopping experience collection (180750 times) [2024-06-24 21:31:28,808][15401] InferenceWorker_p0-w0: resuming experience collection (180750 times) [2024-06-24 21:31:32,350][15401] Updated weights for policy 0, policy_version 745083 (0.0030) [2024-06-24 21:31:33,392][15132] Fps is (10 sec: 39316.1, 60 sec: 42050.6, 300 sec: 42764.7). Total num frames: 12207456256. Throughput: 0: 42503.3. Samples: 12207641460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:33,392][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 21:31:36,202][15401] Updated weights for policy 0, policy_version 745093 (0.0041) [2024-06-24 21:31:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 12207702016. Throughput: 0: 42555.6. Samples: 12207766300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-24 21:31:38,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 21:31:40,269][15401] Updated weights for policy 0, policy_version 745103 (0.0034) [2024-06-24 21:31:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 12207882240. Throughput: 0: 42455.6. Samples: 12208027080. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:31:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 21:31:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745111_12207898624.pth... [2024-06-24 21:31:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744484_12197625856.pth [2024-06-24 21:31:43,850][15401] Updated weights for policy 0, policy_version 745113 (0.0028) [2024-06-24 21:31:48,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42056.7, 300 sec: 42709.5). Total num frames: 12208078848. Throughput: 0: 42512.0. Samples: 12208281280. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:31:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 21:31:48,572][15401] Updated weights for policy 0, policy_version 745123 (0.0026) [2024-06-24 21:31:51,642][15401] Updated weights for policy 0, policy_version 745133 (0.0029) [2024-06-24 21:31:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.3, 300 sec: 42876.5). Total num frames: 12208340992. Throughput: 0: 42557.9. Samples: 12208401740. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:31:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 21:31:56,412][15401] Updated weights for policy 0, policy_version 745143 (0.0041) [2024-06-24 21:31:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 42654.2). Total num frames: 12208504832. Throughput: 0: 42380.1. Samples: 12208659140. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:31:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 21:31:59,337][15401] Updated weights for policy 0, policy_version 745153 (0.0032) [2024-06-24 21:32:03,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 12208717824. Throughput: 0: 42498.2. Samples: 12208916240. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 21:32:04,023][15401] Updated weights for policy 0, policy_version 745163 (0.0032) [2024-06-24 21:32:06,968][15401] Updated weights for policy 0, policy_version 745173 (0.0039) [2024-06-24 21:32:08,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12208996352. Throughput: 0: 42700.0. Samples: 12209046880. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:08,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 21:32:11,659][15401] Updated weights for policy 0, policy_version 745183 (0.0037) [2024-06-24 21:32:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 41233.2, 300 sec: 42543.2). Total num frames: 12209127424. Throughput: 0: 42393.9. Samples: 12209298740. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 21:32:14,758][15401] Updated weights for policy 0, policy_version 745193 (0.0044) [2024-06-24 21:32:18,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12209373184. Throughput: 0: 42410.7. Samples: 12209549840. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:18,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-24 21:32:19,163][15401] Updated weights for policy 0, policy_version 745203 (0.0031) [2024-06-24 21:32:22,459][15401] Updated weights for policy 0, policy_version 745213 (0.0028) [2024-06-24 21:32:23,392][15132] Fps is (10 sec: 49139.8, 60 sec: 42597.4, 300 sec: 42820.2). Total num frames: 12209618944. Throughput: 0: 42530.2. Samples: 12209680160. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:23,393][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 21:32:26,649][15401] Updated weights for policy 0, policy_version 745223 (0.0043) [2024-06-24 21:32:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 12209766400. Throughput: 0: 42352.0. Samples: 12209932920. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:28,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 21:32:30,264][15401] Updated weights for policy 0, policy_version 745233 (0.0042) [2024-06-24 21:32:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12210028544. Throughput: 0: 42268.5. Samples: 12210183460. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:33,392][15132] Avg episode reward: [(0, '0.343')] [2024-06-24 21:32:34,600][15401] Updated weights for policy 0, policy_version 745243 (0.0034) [2024-06-24 21:32:36,899][15349] Signal inference workers to stop experience collection... (180800 times) [2024-06-24 21:32:36,900][15349] Signal inference workers to resume experience collection... (180800 times) [2024-06-24 21:32:36,931][15401] InferenceWorker_p0-w0: stopping experience collection (180800 times) [2024-06-24 21:32:36,931][15401] InferenceWorker_p0-w0: resuming experience collection (180800 times) [2024-06-24 21:32:37,926][15401] Updated weights for policy 0, policy_version 745253 (0.0028) [2024-06-24 21:32:38,392][15132] Fps is (10 sec: 49140.1, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 12210257920. Throughput: 0: 42583.0. Samples: 12210318080. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:38,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-24 21:32:42,211][15401] Updated weights for policy 0, policy_version 745263 (0.0028) [2024-06-24 21:32:43,391][15132] Fps is (10 sec: 39323.3, 60 sec: 42323.9, 300 sec: 42542.6). Total num frames: 12210421760. Throughput: 0: 42547.0. Samples: 12210573840. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:43,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 21:32:45,741][15401] Updated weights for policy 0, policy_version 745273 (0.0034) [2024-06-24 21:32:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12210683904. Throughput: 0: 42224.2. Samples: 12210816320. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:48,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-24 21:32:49,713][15401] Updated weights for policy 0, policy_version 745283 (0.0030) [2024-06-24 21:32:53,202][15401] Updated weights for policy 0, policy_version 745293 (0.0043) [2024-06-24 21:32:53,389][15132] Fps is (10 sec: 45884.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 12210880512. Throughput: 0: 42454.0. Samples: 12210957300. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:53,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 21:32:57,050][15401] Updated weights for policy 0, policy_version 745303 (0.0036) [2024-06-24 21:32:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12211077120. Throughput: 0: 42573.8. Samples: 12211214560. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:32:58,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 21:33:00,914][15401] Updated weights for policy 0, policy_version 745313 (0.0042) [2024-06-24 21:33:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12211322880. Throughput: 0: 42512.4. Samples: 12211462900. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:33:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 21:33:05,010][15401] Updated weights for policy 0, policy_version 745323 (0.0034) [2024-06-24 21:33:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12211519488. Throughput: 0: 42650.7. Samples: 12211599340. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:33:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 21:33:08,502][15401] Updated weights for policy 0, policy_version 745333 (0.0027) [2024-06-24 21:33:12,502][15401] Updated weights for policy 0, policy_version 745343 (0.0037) [2024-06-24 21:33:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12211716096. Throughput: 0: 42630.6. Samples: 12211851300. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:33:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 21:33:16,268][15401] Updated weights for policy 0, policy_version 745353 (0.0026) [2024-06-24 21:33:18,390][15132] Fps is (10 sec: 42597.0, 60 sec: 42871.2, 300 sec: 42765.2). Total num frames: 12211945472. Throughput: 0: 42780.2. Samples: 12212108480. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:33:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 21:33:19,981][15401] Updated weights for policy 0, policy_version 745363 (0.0035) [2024-06-24 21:33:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42709.4). Total num frames: 12212158464. Throughput: 0: 42794.2. Samples: 12212243720. Policy #0 lag: (min: 1.0, avg: 8.1, max: 22.0) [2024-06-24 21:33:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 21:33:23,823][15401] Updated weights for policy 0, policy_version 745373 (0.0029) [2024-06-24 21:33:27,433][15401] Updated weights for policy 0, policy_version 745383 (0.0032) [2024-06-24 21:33:28,389][15132] Fps is (10 sec: 42600.0, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 12212371456. Throughput: 0: 42737.0. Samples: 12212496920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-24 21:33:31,206][15401] Updated weights for policy 0, policy_version 745393 (0.0025) [2024-06-24 21:33:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 12212600832. Throughput: 0: 43080.7. Samples: 12212754960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 21:33:34,922][15401] Updated weights for policy 0, policy_version 745403 (0.0033) [2024-06-24 21:33:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 12212813824. Throughput: 0: 42823.8. Samples: 12212884480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:38,393][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 21:33:38,863][15401] Updated weights for policy 0, policy_version 745413 (0.0033) [2024-06-24 21:33:42,484][15401] Updated weights for policy 0, policy_version 745423 (0.0044) [2024-06-24 21:33:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43145.9, 300 sec: 42543.1). Total num frames: 12213010432. Throughput: 0: 42731.9. Samples: 12213137500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:43,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-24 21:33:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745423_12213010432.pth... [2024-06-24 21:33:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000744800_12202803200.pth [2024-06-24 21:33:47,022][15401] Updated weights for policy 0, policy_version 745433 (0.0039) [2024-06-24 21:33:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 12213239808. Throughput: 0: 42887.1. Samples: 12213392820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 21:33:50,106][15401] Updated weights for policy 0, policy_version 745443 (0.0035) [2024-06-24 21:33:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12213452800. Throughput: 0: 42783.6. Samples: 12213524600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 21:33:54,609][15401] Updated weights for policy 0, policy_version 745453 (0.0034) [2024-06-24 21:33:58,007][15401] Updated weights for policy 0, policy_version 745463 (0.0038) [2024-06-24 21:33:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 12213665792. Throughput: 0: 42847.9. Samples: 12213779460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:33:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 21:34:02,218][15401] Updated weights for policy 0, policy_version 745473 (0.0039) [2024-06-24 21:34:03,392][15132] Fps is (10 sec: 40951.6, 60 sec: 42324.0, 300 sec: 42653.7). Total num frames: 12213862400. Throughput: 0: 42902.9. Samples: 12214039180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:03,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 21:34:03,850][15349] Signal inference workers to stop experience collection... (180850 times) [2024-06-24 21:34:03,851][15349] Signal inference workers to resume experience collection... (180850 times) [2024-06-24 21:34:03,891][15401] InferenceWorker_p0-w0: stopping experience collection (180850 times) [2024-06-24 21:34:03,891][15401] InferenceWorker_p0-w0: resuming experience collection (180850 times) [2024-06-24 21:34:05,403][15401] Updated weights for policy 0, policy_version 745483 (0.0044) [2024-06-24 21:34:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12214091776. Throughput: 0: 42797.1. Samples: 12214169580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-24 21:34:09,876][15401] Updated weights for policy 0, policy_version 745493 (0.0042) [2024-06-24 21:34:13,389][15132] Fps is (10 sec: 44245.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12214304768. Throughput: 0: 42812.9. Samples: 12214423500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:13,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 21:34:13,496][15401] Updated weights for policy 0, policy_version 745503 (0.0046) [2024-06-24 21:34:17,443][15401] Updated weights for policy 0, policy_version 745513 (0.0038) [2024-06-24 21:34:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 12214517760. Throughput: 0: 42780.9. Samples: 12214680100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 21:34:21,123][15401] Updated weights for policy 0, policy_version 745523 (0.0038) [2024-06-24 21:34:23,390][15132] Fps is (10 sec: 44234.4, 60 sec: 43144.2, 300 sec: 42710.3). Total num frames: 12214747136. Throughput: 0: 42743.1. Samples: 12214807840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:23,391][15132] Avg episode reward: [(0, '0.561')] [2024-06-24 21:34:24,960][15401] Updated weights for policy 0, policy_version 745533 (0.0050) [2024-06-24 21:34:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12214943744. Throughput: 0: 42874.7. Samples: 12215066860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 21:34:29,022][15401] Updated weights for policy 0, policy_version 745543 (0.0036) [2024-06-24 21:34:32,699][15401] Updated weights for policy 0, policy_version 745553 (0.0026) [2024-06-24 21:34:33,389][15132] Fps is (10 sec: 40962.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12215156736. Throughput: 0: 42965.8. Samples: 12215326280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 21:34:36,558][15401] Updated weights for policy 0, policy_version 745563 (0.0041) [2024-06-24 21:34:38,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42868.6, 300 sec: 42653.0). Total num frames: 12215386112. Throughput: 0: 42858.7. Samples: 12215453520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:38,396][15132] Avg episode reward: [(0, '0.848')] [2024-06-24 21:34:40,399][15401] Updated weights for policy 0, policy_version 745573 (0.0040) [2024-06-24 21:34:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12215582720. Throughput: 0: 42883.2. Samples: 12215709200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:43,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-24 21:34:44,053][15401] Updated weights for policy 0, policy_version 745583 (0.0036) [2024-06-24 21:34:48,328][15401] Updated weights for policy 0, policy_version 745593 (0.0039) [2024-06-24 21:34:48,392][15132] Fps is (10 sec: 40976.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12215795712. Throughput: 0: 42973.4. Samples: 12215973000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:48,393][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 21:34:51,779][15401] Updated weights for policy 0, policy_version 745603 (0.0031) [2024-06-24 21:34:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12216008704. Throughput: 0: 42833.7. Samples: 12216097100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 21:34:55,845][15401] Updated weights for policy 0, policy_version 745613 (0.0047) [2024-06-24 21:34:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12216221696. Throughput: 0: 42897.8. Samples: 12216353900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:34:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 21:34:59,564][15401] Updated weights for policy 0, policy_version 745623 (0.0058) [2024-06-24 21:35:03,364][15401] Updated weights for policy 0, policy_version 745633 (0.0037) [2024-06-24 21:35:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43145.9, 300 sec: 42709.5). Total num frames: 12216451072. Throughput: 0: 42974.5. Samples: 12216613960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:35:03,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 21:35:07,119][15401] Updated weights for policy 0, policy_version 745643 (0.0032) [2024-06-24 21:35:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 12216664064. Throughput: 0: 43019.9. Samples: 12216743720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 21:35:08,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 21:35:10,869][15401] Updated weights for policy 0, policy_version 745653 (0.0033) [2024-06-24 21:35:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12216877056. Throughput: 0: 42995.1. Samples: 12217001640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:13,390][15132] Avg episode reward: [(0, '0.147')] [2024-06-24 21:35:14,564][15401] Updated weights for policy 0, policy_version 745663 (0.0028) [2024-06-24 21:35:18,389][15132] Fps is (10 sec: 44238.1, 60 sec: 43144.6, 300 sec: 42766.0). Total num frames: 12217106432. Throughput: 0: 42889.9. Samples: 12217256320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:18,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 21:35:18,390][15401] Updated weights for policy 0, policy_version 745673 (0.0031) [2024-06-24 21:35:21,901][15349] Signal inference workers to stop experience collection... (180900 times) [2024-06-24 21:35:21,947][15401] InferenceWorker_p0-w0: stopping experience collection (180900 times) [2024-06-24 21:35:22,012][15349] Signal inference workers to resume experience collection... (180900 times) [2024-06-24 21:35:22,013][15401] InferenceWorker_p0-w0: resuming experience collection (180900 times) [2024-06-24 21:35:22,181][15401] Updated weights for policy 0, policy_version 745683 (0.0031) [2024-06-24 21:35:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.8, 300 sec: 42654.0). Total num frames: 12217319424. Throughput: 0: 42952.7. Samples: 12217386120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 21:35:26,332][15401] Updated weights for policy 0, policy_version 745693 (0.0036) [2024-06-24 21:35:28,392][15132] Fps is (10 sec: 39311.5, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 12217499648. Throughput: 0: 42796.8. Samples: 12217635160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:28,393][15132] Avg episode reward: [(0, '0.721')] [2024-06-24 21:35:29,721][15401] Updated weights for policy 0, policy_version 745703 (0.0034) [2024-06-24 21:35:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12217712640. Throughput: 0: 42799.2. Samples: 12217898860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:33,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 21:35:33,986][15401] Updated weights for policy 0, policy_version 745713 (0.0027) [2024-06-24 21:35:37,336][15401] Updated weights for policy 0, policy_version 745723 (0.0042) [2024-06-24 21:35:38,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42876.0, 300 sec: 42654.0). Total num frames: 12217958400. Throughput: 0: 42829.3. Samples: 12218024420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 21:35:41,764][15401] Updated weights for policy 0, policy_version 745733 (0.0038) [2024-06-24 21:35:43,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42654.5). Total num frames: 12218138624. Throughput: 0: 42734.5. Samples: 12218277060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:43,392][15132] Avg episode reward: [(0, '0.225')] [2024-06-24 21:35:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745736_12218138624.pth... [2024-06-24 21:35:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745111_12207898624.pth [2024-06-24 21:35:45,309][15401] Updated weights for policy 0, policy_version 745743 (0.0035) [2024-06-24 21:35:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.1, 300 sec: 42654.3). Total num frames: 12218351616. Throughput: 0: 42700.9. Samples: 12218535500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-24 21:35:49,487][15401] Updated weights for policy 0, policy_version 745753 (0.0030) [2024-06-24 21:35:53,018][15401] Updated weights for policy 0, policy_version 745763 (0.0041) [2024-06-24 21:35:53,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12218597376. Throughput: 0: 42683.3. Samples: 12218664460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 21:35:57,326][15401] Updated weights for policy 0, policy_version 745773 (0.0031) [2024-06-24 21:35:58,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 12218793984. Throughput: 0: 42570.7. Samples: 12218917420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:35:58,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 21:36:00,474][15401] Updated weights for policy 0, policy_version 745783 (0.0035) [2024-06-24 21:36:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12219006976. Throughput: 0: 42644.3. Samples: 12219175320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 21:36:05,151][15401] Updated weights for policy 0, policy_version 745793 (0.0032) [2024-06-24 21:36:08,242][15401] Updated weights for policy 0, policy_version 745803 (0.0028) [2024-06-24 21:36:08,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 12219236352. Throughput: 0: 42529.8. Samples: 12219300060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:08,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 21:36:12,796][15401] Updated weights for policy 0, policy_version 745813 (0.0032) [2024-06-24 21:36:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12219432960. Throughput: 0: 42905.5. Samples: 12219565800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 21:36:15,794][15401] Updated weights for policy 0, policy_version 745823 (0.0033) [2024-06-24 21:36:18,396][15132] Fps is (10 sec: 42581.2, 60 sec: 42593.8, 300 sec: 42708.7). Total num frames: 12219662336. Throughput: 0: 42508.1. Samples: 12219812000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:18,396][15132] Avg episode reward: [(0, '0.402')] [2024-06-24 21:36:20,325][15401] Updated weights for policy 0, policy_version 745833 (0.0040) [2024-06-24 21:36:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12219875328. Throughput: 0: 42663.9. Samples: 12219944300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:23,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 21:36:23,551][15401] Updated weights for policy 0, policy_version 745843 (0.0026) [2024-06-24 21:36:27,740][15401] Updated weights for policy 0, policy_version 745853 (0.0033) [2024-06-24 21:36:28,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 12220071936. Throughput: 0: 42786.7. Samples: 12220202360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:28,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-24 21:36:31,106][15401] Updated weights for policy 0, policy_version 745863 (0.0029) [2024-06-24 21:36:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 12220284928. Throughput: 0: 42733.1. Samples: 12220458480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 21:36:35,375][15401] Updated weights for policy 0, policy_version 745873 (0.0038) [2024-06-24 21:36:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12220514304. Throughput: 0: 42631.0. Samples: 12220582960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:38,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 21:36:38,802][15401] Updated weights for policy 0, policy_version 745883 (0.0034) [2024-06-24 21:36:43,039][15401] Updated weights for policy 0, policy_version 745893 (0.0032) [2024-06-24 21:36:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 12220710912. Throughput: 0: 42754.7. Samples: 12220841280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 21:36:46,832][15401] Updated weights for policy 0, policy_version 745903 (0.0046) [2024-06-24 21:36:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12220923904. Throughput: 0: 42644.9. Samples: 12221094340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-24 21:36:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 21:36:50,670][15401] Updated weights for policy 0, policy_version 745913 (0.0022) [2024-06-24 21:36:51,681][15349] Signal inference workers to stop experience collection... (180950 times) [2024-06-24 21:36:51,712][15401] InferenceWorker_p0-w0: stopping experience collection (180950 times) [2024-06-24 21:36:51,730][15349] Signal inference workers to resume experience collection... (180950 times) [2024-06-24 21:36:51,731][15401] InferenceWorker_p0-w0: resuming experience collection (180950 times) [2024-06-24 21:36:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12221136896. Throughput: 0: 42711.6. Samples: 12221221980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:36:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 21:36:54,229][15401] Updated weights for policy 0, policy_version 745923 (0.0032) [2024-06-24 21:36:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 12221349888. Throughput: 0: 42648.4. Samples: 12221484980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:36:58,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-24 21:36:58,539][15401] Updated weights for policy 0, policy_version 745933 (0.0029) [2024-06-24 21:37:01,713][15401] Updated weights for policy 0, policy_version 745943 (0.0030) [2024-06-24 21:37:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12221579264. Throughput: 0: 42795.0. Samples: 12221737500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 21:37:06,332][15401] Updated weights for policy 0, policy_version 745953 (0.0032) [2024-06-24 21:37:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 12221775872. Throughput: 0: 42809.0. Samples: 12221870700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 21:37:09,307][15401] Updated weights for policy 0, policy_version 745963 (0.0034) [2024-06-24 21:37:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12221988864. Throughput: 0: 42656.1. Samples: 12222121880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 21:37:13,931][15401] Updated weights for policy 0, policy_version 745973 (0.0034) [2024-06-24 21:37:17,011][15401] Updated weights for policy 0, policy_version 745983 (0.0030) [2024-06-24 21:37:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42602.9, 300 sec: 42709.8). Total num frames: 12222218240. Throughput: 0: 42571.4. Samples: 12222374200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 21:37:21,581][15401] Updated weights for policy 0, policy_version 745993 (0.0039) [2024-06-24 21:37:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 12222431232. Throughput: 0: 42749.9. Samples: 12222506600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 21:37:24,470][15401] Updated weights for policy 0, policy_version 746003 (0.0040) [2024-06-24 21:37:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12222627840. Throughput: 0: 42720.4. Samples: 12222763700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 21:37:29,326][15401] Updated weights for policy 0, policy_version 746013 (0.0040) [2024-06-24 21:37:32,371][15401] Updated weights for policy 0, policy_version 746023 (0.0028) [2024-06-24 21:37:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 12222873600. Throughput: 0: 42726.6. Samples: 12223017140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:33,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 21:37:36,898][15401] Updated weights for policy 0, policy_version 746033 (0.0032) [2024-06-24 21:37:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.2, 300 sec: 42931.9). Total num frames: 12223086592. Throughput: 0: 42884.0. Samples: 12223151760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 21:37:39,947][15401] Updated weights for policy 0, policy_version 746043 (0.0033) [2024-06-24 21:37:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12223283200. Throughput: 0: 42760.0. Samples: 12223409180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 21:37:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746050_12223283200.pth... [2024-06-24 21:37:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745423_12213010432.pth [2024-06-24 21:37:44,808][15401] Updated weights for policy 0, policy_version 746053 (0.0036) [2024-06-24 21:37:47,505][15401] Updated weights for policy 0, policy_version 746063 (0.0032) [2024-06-24 21:37:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12223496192. Throughput: 0: 42719.7. Samples: 12223659880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 21:37:52,500][15401] Updated weights for policy 0, policy_version 746073 (0.0032) [2024-06-24 21:37:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12223725568. Throughput: 0: 42691.1. Samples: 12223791800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:53,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 21:37:55,071][15401] Updated weights for policy 0, policy_version 746083 (0.0037) [2024-06-24 21:37:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12223905792. Throughput: 0: 42702.1. Samples: 12224043480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:37:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 21:37:59,966][15401] Updated weights for policy 0, policy_version 746093 (0.0040) [2024-06-24 21:38:03,104][15401] Updated weights for policy 0, policy_version 746103 (0.0038) [2024-06-24 21:38:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12224151552. Throughput: 0: 42792.5. Samples: 12224299860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 21:38:07,522][15401] Updated weights for policy 0, policy_version 746113 (0.0038) [2024-06-24 21:38:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12224364544. Throughput: 0: 42985.8. Samples: 12224440960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 21:38:10,522][15349] Signal inference workers to stop experience collection... (181000 times) [2024-06-24 21:38:10,522][15349] Signal inference workers to resume experience collection... (181000 times) [2024-06-24 21:38:10,555][15401] InferenceWorker_p0-w0: stopping experience collection (181000 times) [2024-06-24 21:38:10,555][15401] InferenceWorker_p0-w0: resuming experience collection (181000 times) [2024-06-24 21:38:10,665][15401] Updated weights for policy 0, policy_version 746123 (0.0033) [2024-06-24 21:38:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12224561152. Throughput: 0: 42889.4. Samples: 12224693820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:13,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 21:38:15,305][15401] Updated weights for policy 0, policy_version 746133 (0.0034) [2024-06-24 21:38:18,177][15401] Updated weights for policy 0, policy_version 746143 (0.0041) [2024-06-24 21:38:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12224806912. Throughput: 0: 42724.1. Samples: 12224939620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 21:38:22,967][15401] Updated weights for policy 0, policy_version 746153 (0.0036) [2024-06-24 21:38:23,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 12225003520. Throughput: 0: 42792.4. Samples: 12225077520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:23,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 21:38:25,831][15401] Updated weights for policy 0, policy_version 746163 (0.0037) [2024-06-24 21:38:28,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12225200128. Throughput: 0: 42575.1. Samples: 12225325160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:28,393][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 21:38:30,645][15401] Updated weights for policy 0, policy_version 746173 (0.0035) [2024-06-24 21:38:33,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42873.3, 300 sec: 42820.9). Total num frames: 12225445888. Throughput: 0: 42679.5. Samples: 12225580460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 21:38:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 21:38:33,689][15401] Updated weights for policy 0, policy_version 746183 (0.0037) [2024-06-24 21:38:38,223][15401] Updated weights for policy 0, policy_version 746193 (0.0040) [2024-06-24 21:38:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12225626112. Throughput: 0: 42634.2. Samples: 12225710340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:38:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 21:38:41,198][15401] Updated weights for policy 0, policy_version 746203 (0.0035) [2024-06-24 21:38:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12225855488. Throughput: 0: 42740.5. Samples: 12225966800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:38:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 21:38:46,177][15401] Updated weights for policy 0, policy_version 746213 (0.0036) [2024-06-24 21:38:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12226068480. Throughput: 0: 42681.5. Samples: 12226220520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:38:48,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 21:38:48,783][15401] Updated weights for policy 0, policy_version 746223 (0.0033) [2024-06-24 21:38:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12226265088. Throughput: 0: 42491.3. Samples: 12226353080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:38:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 21:38:53,640][15401] Updated weights for policy 0, policy_version 746233 (0.0033) [2024-06-24 21:38:57,026][15401] Updated weights for policy 0, policy_version 746243 (0.0031) [2024-06-24 21:38:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42820.9). Total num frames: 12226494464. Throughput: 0: 42519.7. Samples: 12226607100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:38:58,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 21:39:01,201][15401] Updated weights for policy 0, policy_version 746253 (0.0038) [2024-06-24 21:39:03,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12226723840. Throughput: 0: 42691.1. Samples: 12226860720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 21:39:04,871][15401] Updated weights for policy 0, policy_version 746263 (0.0028) [2024-06-24 21:39:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 12226904064. Throughput: 0: 42558.7. Samples: 12226992660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:08,392][15132] Avg episode reward: [(0, '0.243')] [2024-06-24 21:39:08,739][15401] Updated weights for policy 0, policy_version 746273 (0.0036) [2024-06-24 21:39:12,438][15401] Updated weights for policy 0, policy_version 746283 (0.0022) [2024-06-24 21:39:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12227133440. Throughput: 0: 42727.6. Samples: 12227247800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 21:39:16,263][15401] Updated weights for policy 0, policy_version 746293 (0.0043) [2024-06-24 21:39:18,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 12227346432. Throughput: 0: 42682.6. Samples: 12227501180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 21:39:20,099][15401] Updated weights for policy 0, policy_version 746303 (0.0034) [2024-06-24 21:39:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12227543040. Throughput: 0: 42691.1. Samples: 12227631440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:23,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 21:39:23,762][15401] Updated weights for policy 0, policy_version 746313 (0.0041) [2024-06-24 21:39:24,640][15349] Signal inference workers to stop experience collection... (181050 times) [2024-06-24 21:39:24,640][15349] Signal inference workers to resume experience collection... (181050 times) [2024-06-24 21:39:24,654][15401] InferenceWorker_p0-w0: stopping experience collection (181050 times) [2024-06-24 21:39:24,654][15401] InferenceWorker_p0-w0: resuming experience collection (181050 times) [2024-06-24 21:39:27,786][15401] Updated weights for policy 0, policy_version 746323 (0.0025) [2024-06-24 21:39:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 12227772416. Throughput: 0: 42806.7. Samples: 12227893100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-24 21:39:31,792][15401] Updated weights for policy 0, policy_version 746333 (0.0040) [2024-06-24 21:39:33,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42710.1). Total num frames: 12227985408. Throughput: 0: 42794.0. Samples: 12228146360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:33,393][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 21:39:35,664][15401] Updated weights for policy 0, policy_version 746343 (0.0041) [2024-06-24 21:39:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12228182016. Throughput: 0: 42465.9. Samples: 12228264040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 21:39:40,112][15401] Updated weights for policy 0, policy_version 746353 (0.0034) [2024-06-24 21:39:43,286][15401] Updated weights for policy 0, policy_version 746363 (0.0028) [2024-06-24 21:39:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12228411392. Throughput: 0: 42463.0. Samples: 12228517940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 21:39:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746363_12228411392.pth... [2024-06-24 21:39:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000745736_12218138624.pth [2024-06-24 21:39:47,750][15401] Updated weights for policy 0, policy_version 746373 (0.0041) [2024-06-24 21:39:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12228624384. Throughput: 0: 42495.5. Samples: 12228773020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 21:39:51,179][15401] Updated weights for policy 0, policy_version 746383 (0.0038) [2024-06-24 21:39:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.6, 300 sec: 42653.9). Total num frames: 12228804608. Throughput: 0: 42328.6. Samples: 12228897340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 21:39:55,342][15401] Updated weights for policy 0, policy_version 746393 (0.0041) [2024-06-24 21:39:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42654.0). Total num frames: 12229033984. Throughput: 0: 42253.7. Samples: 12229149220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:39:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 21:39:59,114][15401] Updated weights for policy 0, policy_version 746403 (0.0028) [2024-06-24 21:40:02,943][15401] Updated weights for policy 0, policy_version 746413 (0.0029) [2024-06-24 21:40:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12229246976. Throughput: 0: 42426.2. Samples: 12229410360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:40:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 21:40:06,843][15401] Updated weights for policy 0, policy_version 746423 (0.0030) [2024-06-24 21:40:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12229443584. Throughput: 0: 42336.1. Samples: 12229536560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:40:08,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 21:40:10,545][15401] Updated weights for policy 0, policy_version 746433 (0.0024) [2024-06-24 21:40:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12229672960. Throughput: 0: 42184.4. Samples: 12229791400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:40:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-24 21:40:14,287][15401] Updated weights for policy 0, policy_version 746443 (0.0044) [2024-06-24 21:40:18,306][15401] Updated weights for policy 0, policy_version 746453 (0.0033) [2024-06-24 21:40:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12229885952. Throughput: 0: 42366.7. Samples: 12230052760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-24 21:40:18,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 21:40:21,994][15401] Updated weights for policy 0, policy_version 746463 (0.0042) [2024-06-24 21:40:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12230082560. Throughput: 0: 42531.6. Samples: 12230177960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 21:40:25,919][15401] Updated weights for policy 0, policy_version 746473 (0.0029) [2024-06-24 21:40:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12230311936. Throughput: 0: 42440.0. Samples: 12230427740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 21:40:29,526][15401] Updated weights for policy 0, policy_version 746483 (0.0039) [2024-06-24 21:40:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 12230508544. Throughput: 0: 42609.9. Samples: 12230690460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-24 21:40:33,606][15401] Updated weights for policy 0, policy_version 746493 (0.0032) [2024-06-24 21:40:34,526][15349] Signal inference workers to stop experience collection... (181100 times) [2024-06-24 21:40:34,580][15401] InferenceWorker_p0-w0: stopping experience collection (181100 times) [2024-06-24 21:40:34,583][15349] Signal inference workers to resume experience collection... (181100 times) [2024-06-24 21:40:34,594][15401] InferenceWorker_p0-w0: resuming experience collection (181100 times) [2024-06-24 21:40:37,504][15401] Updated weights for policy 0, policy_version 746503 (0.0044) [2024-06-24 21:40:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12230721536. Throughput: 0: 42593.6. Samples: 12230814060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 21:40:41,747][15401] Updated weights for policy 0, policy_version 746513 (0.0033) [2024-06-24 21:40:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12230950912. Throughput: 0: 42540.8. Samples: 12231063560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 21:40:45,214][15401] Updated weights for policy 0, policy_version 746523 (0.0029) [2024-06-24 21:40:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12231147520. Throughput: 0: 42571.5. Samples: 12231326080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 21:40:49,458][15401] Updated weights for policy 0, policy_version 746533 (0.0026) [2024-06-24 21:40:52,814][15401] Updated weights for policy 0, policy_version 746543 (0.0043) [2024-06-24 21:40:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 12231360512. Throughput: 0: 42467.6. Samples: 12231447600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 21:40:57,043][15401] Updated weights for policy 0, policy_version 746553 (0.0028) [2024-06-24 21:40:58,391][15132] Fps is (10 sec: 45867.8, 60 sec: 42870.3, 300 sec: 42709.2). Total num frames: 12231606272. Throughput: 0: 42577.0. Samples: 12231707440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:40:58,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 21:41:00,454][15401] Updated weights for policy 0, policy_version 746563 (0.0030) [2024-06-24 21:41:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42487.7). Total num frames: 12231770112. Throughput: 0: 42488.5. Samples: 12231964740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 21:41:04,822][15401] Updated weights for policy 0, policy_version 746573 (0.0023) [2024-06-24 21:41:08,011][15401] Updated weights for policy 0, policy_version 746583 (0.0024) [2024-06-24 21:41:08,392][15132] Fps is (10 sec: 40956.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12232015872. Throughput: 0: 42407.1. Samples: 12232086380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:08,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 21:41:12,532][15401] Updated weights for policy 0, policy_version 746593 (0.0043) [2024-06-24 21:41:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 12232212480. Throughput: 0: 42685.4. Samples: 12232348580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 21:41:15,968][15401] Updated weights for policy 0, policy_version 746603 (0.0039) [2024-06-24 21:41:18,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12232409088. Throughput: 0: 42466.6. Samples: 12232601460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 21:41:20,247][15401] Updated weights for policy 0, policy_version 746613 (0.0028) [2024-06-24 21:41:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12232654848. Throughput: 0: 42453.4. Samples: 12232724460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 21:41:23,553][15401] Updated weights for policy 0, policy_version 746623 (0.0035) [2024-06-24 21:41:28,059][15401] Updated weights for policy 0, policy_version 746633 (0.0028) [2024-06-24 21:41:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12232851456. Throughput: 0: 42618.0. Samples: 12232981360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 21:41:31,153][15401] Updated weights for policy 0, policy_version 746643 (0.0028) [2024-06-24 21:41:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42487.7). Total num frames: 12233048064. Throughput: 0: 42469.7. Samples: 12233237220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 21:41:35,690][15401] Updated weights for policy 0, policy_version 746653 (0.0041) [2024-06-24 21:41:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12233310208. Throughput: 0: 42473.7. Samples: 12233358920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:38,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 21:41:38,766][15401] Updated weights for policy 0, policy_version 746663 (0.0038) [2024-06-24 21:41:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12233474048. Throughput: 0: 42465.1. Samples: 12233618300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:43,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 21:41:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746673_12233490432.pth... [2024-06-24 21:41:43,433][15401] Updated weights for policy 0, policy_version 746673 (0.0022) [2024-06-24 21:41:43,500][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746050_12223283200.pth [2024-06-24 21:41:46,255][15401] Updated weights for policy 0, policy_version 746683 (0.0024) [2024-06-24 21:41:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12233703424. Throughput: 0: 42349.8. Samples: 12233870480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:48,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-24 21:41:51,125][15401] Updated weights for policy 0, policy_version 746693 (0.0031) [2024-06-24 21:41:53,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12233949184. Throughput: 0: 42532.1. Samples: 12234000220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 21:41:53,853][15401] Updated weights for policy 0, policy_version 746703 (0.0037) [2024-06-24 21:41:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41507.3, 300 sec: 42431.8). Total num frames: 12234096640. Throughput: 0: 42247.5. Samples: 12234249720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-24 21:41:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 21:41:58,974][15401] Updated weights for policy 0, policy_version 746713 (0.0032) [2024-06-24 21:42:00,821][15349] Signal inference workers to stop experience collection... (181150 times) [2024-06-24 21:42:00,821][15349] Signal inference workers to resume experience collection... (181150 times) [2024-06-24 21:42:00,868][15401] InferenceWorker_p0-w0: stopping experience collection (181150 times) [2024-06-24 21:42:00,868][15401] InferenceWorker_p0-w0: resuming experience collection (181150 times) [2024-06-24 21:42:01,498][15401] Updated weights for policy 0, policy_version 746723 (0.0057) [2024-06-24 21:42:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12234342400. Throughput: 0: 42140.4. Samples: 12234497780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:03,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 21:42:06,663][15401] Updated weights for policy 0, policy_version 746733 (0.0044) [2024-06-24 21:42:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 12234555392. Throughput: 0: 42392.8. Samples: 12234632140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 21:42:09,195][15401] Updated weights for policy 0, policy_version 746743 (0.0039) [2024-06-24 21:42:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 12234719232. Throughput: 0: 42260.9. Samples: 12234883100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 21:42:14,241][15401] Updated weights for policy 0, policy_version 746753 (0.0041) [2024-06-24 21:42:17,014][15401] Updated weights for policy 0, policy_version 746763 (0.0034) [2024-06-24 21:42:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12234981376. Throughput: 0: 42168.5. Samples: 12235134800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 21:42:22,327][15401] Updated weights for policy 0, policy_version 746773 (0.0036) [2024-06-24 21:42:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12235177984. Throughput: 0: 42449.7. Samples: 12235269160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 21:42:25,304][15401] Updated weights for policy 0, policy_version 746783 (0.0027) [2024-06-24 21:42:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42376.6). Total num frames: 12235374592. Throughput: 0: 42163.5. Samples: 12235515660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 21:42:30,007][15401] Updated weights for policy 0, policy_version 746793 (0.0034) [2024-06-24 21:42:32,882][15401] Updated weights for policy 0, policy_version 746803 (0.0036) [2024-06-24 21:42:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12235620352. Throughput: 0: 42164.4. Samples: 12235767880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:33,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 21:42:37,721][15401] Updated weights for policy 0, policy_version 746813 (0.0057) [2024-06-24 21:42:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 42376.2). Total num frames: 12235784192. Throughput: 0: 42274.6. Samples: 12235902580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 21:42:40,462][15401] Updated weights for policy 0, policy_version 746823 (0.0035) [2024-06-24 21:42:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 12236013568. Throughput: 0: 42295.5. Samples: 12236153020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:43,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 21:42:45,341][15401] Updated weights for policy 0, policy_version 746833 (0.0036) [2024-06-24 21:42:48,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12236259328. Throughput: 0: 42484.0. Samples: 12236409560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 21:42:48,566][15401] Updated weights for policy 0, policy_version 746843 (0.0035) [2024-06-24 21:42:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 41232.9, 300 sec: 42431.8). Total num frames: 12236423168. Throughput: 0: 42443.9. Samples: 12236542120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 21:42:53,539][15401] Updated weights for policy 0, policy_version 746853 (0.0030) [2024-06-24 21:42:56,249][15401] Updated weights for policy 0, policy_version 746863 (0.0036) [2024-06-24 21:42:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 12236652544. Throughput: 0: 42397.3. Samples: 12236790980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:42:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 21:43:01,163][15401] Updated weights for policy 0, policy_version 746873 (0.0028) [2024-06-24 21:43:03,389][15132] Fps is (10 sec: 47514.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12236898304. Throughput: 0: 42512.5. Samples: 12237047860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:03,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 21:43:03,796][15401] Updated weights for policy 0, policy_version 746883 (0.0045) [2024-06-24 21:43:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42432.1). Total num frames: 12237078528. Throughput: 0: 42492.0. Samples: 12237181300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 21:43:08,664][15401] Updated weights for policy 0, policy_version 746893 (0.0033) [2024-06-24 21:43:11,101][15349] Signal inference workers to stop experience collection... (181200 times) [2024-06-24 21:43:11,101][15349] Signal inference workers to resume experience collection... (181200 times) [2024-06-24 21:43:11,139][15401] InferenceWorker_p0-w0: stopping experience collection (181200 times) [2024-06-24 21:43:11,139][15401] InferenceWorker_p0-w0: resuming experience collection (181200 times) [2024-06-24 21:43:11,449][15401] Updated weights for policy 0, policy_version 746903 (0.0040) [2024-06-24 21:43:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42376.2). Total num frames: 12237307904. Throughput: 0: 42559.6. Samples: 12237430840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 21:43:16,167][15401] Updated weights for policy 0, policy_version 746913 (0.0036) [2024-06-24 21:43:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42432.2). Total num frames: 12237520896. Throughput: 0: 42760.6. Samples: 12237692100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 21:43:19,039][15401] Updated weights for policy 0, policy_version 746923 (0.0036) [2024-06-24 21:43:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 12237733888. Throughput: 0: 42510.7. Samples: 12237815560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:23,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 21:43:23,525][15401] Updated weights for policy 0, policy_version 746933 (0.0036) [2024-06-24 21:43:26,744][15401] Updated weights for policy 0, policy_version 746943 (0.0042) [2024-06-24 21:43:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 12237963264. Throughput: 0: 42666.8. Samples: 12238073020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:28,390][15132] Avg episode reward: [(0, '0.079')] [2024-06-24 21:43:31,077][15401] Updated weights for policy 0, policy_version 746953 (0.0040) [2024-06-24 21:43:33,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.8, 300 sec: 42487.0). Total num frames: 12238159872. Throughput: 0: 42824.5. Samples: 12238336760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:33,392][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 21:43:34,587][15401] Updated weights for policy 0, policy_version 746963 (0.0037) [2024-06-24 21:43:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 12238372864. Throughput: 0: 42534.5. Samples: 12238456160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 21:43:38,706][15401] Updated weights for policy 0, policy_version 746973 (0.0046) [2024-06-24 21:43:42,342][15401] Updated weights for policy 0, policy_version 746983 (0.0038) [2024-06-24 21:43:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 12238602240. Throughput: 0: 42668.1. Samples: 12238711040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-24 21:43:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 21:43:43,436][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746986_12238618624.pth... [2024-06-24 21:43:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746363_12228411392.pth [2024-06-24 21:43:46,636][15401] Updated weights for policy 0, policy_version 746993 (0.0036) [2024-06-24 21:43:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 12238798848. Throughput: 0: 42687.2. Samples: 12238968780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:43:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 21:43:50,254][15401] Updated weights for policy 0, policy_version 747003 (0.0035) [2024-06-24 21:43:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.7, 300 sec: 42431.8). Total num frames: 12239011840. Throughput: 0: 42292.4. Samples: 12239084460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:43:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 21:43:54,066][15401] Updated weights for policy 0, policy_version 747013 (0.0036) [2024-06-24 21:43:57,962][15401] Updated weights for policy 0, policy_version 747023 (0.0030) [2024-06-24 21:43:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 12239257600. Throughput: 0: 42707.5. Samples: 12239352680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:43:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 21:44:01,864][15401] Updated weights for policy 0, policy_version 747033 (0.0040) [2024-06-24 21:44:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 12239437824. Throughput: 0: 42612.0. Samples: 12239609640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:03,398][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 21:44:05,772][15401] Updated weights for policy 0, policy_version 747043 (0.0032) [2024-06-24 21:44:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 12239667200. Throughput: 0: 42600.8. Samples: 12239732600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:08,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-24 21:44:09,801][15401] Updated weights for policy 0, policy_version 747053 (0.0042) [2024-06-24 21:44:13,362][15401] Updated weights for policy 0, policy_version 747063 (0.0030) [2024-06-24 21:44:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12239880192. Throughput: 0: 42666.1. Samples: 12239993000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:13,390][15132] Avg episode reward: [(0, '0.193')] [2024-06-24 21:44:17,544][15401] Updated weights for policy 0, policy_version 747073 (0.0032) [2024-06-24 21:44:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 12240093184. Throughput: 0: 42565.7. Samples: 12240252120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 21:44:20,840][15401] Updated weights for policy 0, policy_version 747083 (0.0034) [2024-06-24 21:44:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.7, 300 sec: 42542.5). Total num frames: 12240322560. Throughput: 0: 42720.7. Samples: 12240378700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:23,393][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 21:44:25,367][15401] Updated weights for policy 0, policy_version 747093 (0.0032) [2024-06-24 21:44:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 12240519168. Throughput: 0: 42834.6. Samples: 12240638600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-24 21:44:28,950][15401] Updated weights for policy 0, policy_version 747103 (0.0034) [2024-06-24 21:44:32,891][15401] Updated weights for policy 0, policy_version 747113 (0.0032) [2024-06-24 21:44:33,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42542.5). Total num frames: 12240732160. Throughput: 0: 42724.3. Samples: 12240891480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:33,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 21:44:36,546][15401] Updated weights for policy 0, policy_version 747123 (0.0027) [2024-06-24 21:44:38,390][15132] Fps is (10 sec: 42595.7, 60 sec: 42871.0, 300 sec: 42487.2). Total num frames: 12240945152. Throughput: 0: 42973.7. Samples: 12241018300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:38,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-24 21:44:40,456][15401] Updated weights for policy 0, policy_version 747133 (0.0031) [2024-06-24 21:44:42,647][15349] Signal inference workers to stop experience collection... (181250 times) [2024-06-24 21:44:42,654][15349] Signal inference workers to resume experience collection... (181250 times) [2024-06-24 21:44:42,696][15401] InferenceWorker_p0-w0: stopping experience collection (181250 times) [2024-06-24 21:44:42,696][15401] InferenceWorker_p0-w0: resuming experience collection (181250 times) [2024-06-24 21:44:43,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42596.5, 300 sec: 42487.0). Total num frames: 12241158144. Throughput: 0: 42671.4. Samples: 12241273000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:43,393][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 21:44:44,395][15401] Updated weights for policy 0, policy_version 747143 (0.0040) [2024-06-24 21:44:47,964][15401] Updated weights for policy 0, policy_version 747153 (0.0028) [2024-06-24 21:44:48,389][15132] Fps is (10 sec: 42600.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12241371136. Throughput: 0: 42672.4. Samples: 12241529900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:48,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 21:44:52,438][15401] Updated weights for policy 0, policy_version 747163 (0.0039) [2024-06-24 21:44:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12241584128. Throughput: 0: 42667.9. Samples: 12241652660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:53,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-24 21:44:55,742][15401] Updated weights for policy 0, policy_version 747173 (0.0030) [2024-06-24 21:44:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12241780736. Throughput: 0: 42676.9. Samples: 12241913460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:44:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 21:44:59,906][15401] Updated weights for policy 0, policy_version 747183 (0.0042) [2024-06-24 21:45:03,303][15401] Updated weights for policy 0, policy_version 747193 (0.0035) [2024-06-24 21:45:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 12242010112. Throughput: 0: 42642.6. Samples: 12242171140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:03,392][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 21:45:07,348][15401] Updated weights for policy 0, policy_version 747203 (0.0043) [2024-06-24 21:45:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 12242223104. Throughput: 0: 42660.1. Samples: 12242298400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:08,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 21:45:10,880][15401] Updated weights for policy 0, policy_version 747213 (0.0041) [2024-06-24 21:45:13,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 12242419712. Throughput: 0: 42618.1. Samples: 12242556520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:13,393][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 21:45:14,753][15401] Updated weights for policy 0, policy_version 747223 (0.0035) [2024-06-24 21:45:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12242632704. Throughput: 0: 42776.5. Samples: 12242816320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-24 21:45:18,651][15401] Updated weights for policy 0, policy_version 747233 (0.0029) [2024-06-24 21:45:22,532][15401] Updated weights for policy 0, policy_version 747243 (0.0038) [2024-06-24 21:45:23,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 12242862080. Throughput: 0: 42681.0. Samples: 12242938920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 21:45:26,702][15401] Updated weights for policy 0, policy_version 747253 (0.0038) [2024-06-24 21:45:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 12243075072. Throughput: 0: 42692.9. Samples: 12243194180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-24 21:45:28,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-24 21:45:30,235][15401] Updated weights for policy 0, policy_version 747263 (0.0026) [2024-06-24 21:45:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42326.9, 300 sec: 42542.9). Total num frames: 12243271680. Throughput: 0: 42840.3. Samples: 12243457720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 21:45:34,129][15401] Updated weights for policy 0, policy_version 747273 (0.0047) [2024-06-24 21:45:37,582][15401] Updated weights for policy 0, policy_version 747283 (0.0042) [2024-06-24 21:45:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.8, 300 sec: 42598.4). Total num frames: 12243517440. Throughput: 0: 42895.5. Samples: 12243582960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 21:45:41,739][15401] Updated weights for policy 0, policy_version 747293 (0.0032) [2024-06-24 21:45:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12243714048. Throughput: 0: 42813.8. Samples: 12243840080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 21:45:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747297_12243714048.pth... [2024-06-24 21:45:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746673_12233490432.pth [2024-06-24 21:45:45,657][15401] Updated weights for policy 0, policy_version 747303 (0.0047) [2024-06-24 21:45:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12243927040. Throughput: 0: 42715.0. Samples: 12244093220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 21:45:49,569][15401] Updated weights for policy 0, policy_version 747313 (0.0035) [2024-06-24 21:45:53,233][15401] Updated weights for policy 0, policy_version 747323 (0.0035) [2024-06-24 21:45:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42487.6). Total num frames: 12244140032. Throughput: 0: 42638.8. Samples: 12244217040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 21:45:57,621][15401] Updated weights for policy 0, policy_version 747333 (0.0040) [2024-06-24 21:45:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12244353024. Throughput: 0: 42737.4. Samples: 12244479600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:45:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 21:46:00,876][15401] Updated weights for policy 0, policy_version 747343 (0.0046) [2024-06-24 21:46:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.3, 300 sec: 42598.8). Total num frames: 12244582400. Throughput: 0: 42319.6. Samples: 12244720700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 21:46:04,852][15349] Signal inference workers to stop experience collection... (181300 times) [2024-06-24 21:46:04,884][15401] InferenceWorker_p0-w0: stopping experience collection (181300 times) [2024-06-24 21:46:04,907][15349] Signal inference workers to resume experience collection... (181300 times) [2024-06-24 21:46:04,912][15401] InferenceWorker_p0-w0: resuming experience collection (181300 times) [2024-06-24 21:46:05,205][15401] Updated weights for policy 0, policy_version 747353 (0.0033) [2024-06-24 21:46:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 12244779008. Throughput: 0: 42466.5. Samples: 12244849920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 21:46:08,771][15401] Updated weights for policy 0, policy_version 747363 (0.0043) [2024-06-24 21:46:13,004][15401] Updated weights for policy 0, policy_version 747373 (0.0030) [2024-06-24 21:46:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 12244975616. Throughput: 0: 42625.8. Samples: 12245112240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:13,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 21:46:16,487][15401] Updated weights for policy 0, policy_version 747383 (0.0032) [2024-06-24 21:46:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12245204992. Throughput: 0: 42268.5. Samples: 12245359800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 21:46:20,523][15401] Updated weights for policy 0, policy_version 747393 (0.0031) [2024-06-24 21:46:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12245417984. Throughput: 0: 42556.4. Samples: 12245498000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 21:46:23,977][15401] Updated weights for policy 0, policy_version 747403 (0.0036) [2024-06-24 21:46:28,061][15401] Updated weights for policy 0, policy_version 747413 (0.0033) [2024-06-24 21:46:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12245614592. Throughput: 0: 42627.6. Samples: 12245758320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-24 21:46:31,659][15401] Updated weights for policy 0, policy_version 747423 (0.0033) [2024-06-24 21:46:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 12245860352. Throughput: 0: 42447.4. Samples: 12246003340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 21:46:36,067][15401] Updated weights for policy 0, policy_version 747433 (0.0032) [2024-06-24 21:46:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12246056960. Throughput: 0: 42721.6. Samples: 12246139520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:46:39,491][15401] Updated weights for policy 0, policy_version 747443 (0.0028) [2024-06-24 21:46:43,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12246237184. Throughput: 0: 42351.3. Samples: 12246385400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 21:46:43,658][15401] Updated weights for policy 0, policy_version 747453 (0.0039) [2024-06-24 21:46:47,029][15401] Updated weights for policy 0, policy_version 747463 (0.0037) [2024-06-24 21:46:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 12246499328. Throughput: 0: 42650.1. Samples: 12246639960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-24 21:46:51,290][15401] Updated weights for policy 0, policy_version 747473 (0.0041) [2024-06-24 21:46:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12246695936. Throughput: 0: 42835.2. Samples: 12246777500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 21:46:54,574][15401] Updated weights for policy 0, policy_version 747483 (0.0038) [2024-06-24 21:46:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12246876160. Throughput: 0: 42589.7. Samples: 12247028780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:46:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 21:46:59,064][15401] Updated weights for policy 0, policy_version 747493 (0.0038) [2024-06-24 21:47:02,228][15401] Updated weights for policy 0, policy_version 747503 (0.0033) [2024-06-24 21:47:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12247138304. Throughput: 0: 42729.0. Samples: 12247282600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:47:03,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 21:47:06,637][15401] Updated weights for policy 0, policy_version 747513 (0.0023) [2024-06-24 21:47:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12247334912. Throughput: 0: 42734.8. Samples: 12247421060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-24 21:47:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 21:47:09,781][15401] Updated weights for policy 0, policy_version 747523 (0.0029) [2024-06-24 21:47:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12247531520. Throughput: 0: 42423.4. Samples: 12247667380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 21:47:14,476][15401] Updated weights for policy 0, policy_version 747533 (0.0045) [2024-06-24 21:47:17,661][15401] Updated weights for policy 0, policy_version 747543 (0.0039) [2024-06-24 21:47:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12247760896. Throughput: 0: 42690.1. Samples: 12247924400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 21:47:22,018][15401] Updated weights for policy 0, policy_version 747553 (0.0030) [2024-06-24 21:47:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 12247973888. Throughput: 0: 42580.4. Samples: 12248055740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:23,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 21:47:25,438][15401] Updated weights for policy 0, policy_version 747563 (0.0025) [2024-06-24 21:47:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12248170496. Throughput: 0: 42497.3. Samples: 12248297780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 21:47:28,614][15349] Signal inference workers to stop experience collection... (181350 times) [2024-06-24 21:47:28,645][15401] InferenceWorker_p0-w0: stopping experience collection (181350 times) [2024-06-24 21:47:28,680][15349] Signal inference workers to resume experience collection... (181350 times) [2024-06-24 21:47:28,684][15401] InferenceWorker_p0-w0: resuming experience collection (181350 times) [2024-06-24 21:47:29,974][15401] Updated weights for policy 0, policy_version 747573 (0.0041) [2024-06-24 21:47:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12248383488. Throughput: 0: 42615.1. Samples: 12248557640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 21:47:33,408][15401] Updated weights for policy 0, policy_version 747583 (0.0035) [2024-06-24 21:47:37,470][15401] Updated weights for policy 0, policy_version 747593 (0.0033) [2024-06-24 21:47:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12248596480. Throughput: 0: 42426.2. Samples: 12248686680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 21:47:40,983][15401] Updated weights for policy 0, policy_version 747603 (0.0041) [2024-06-24 21:47:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12248825856. Throughput: 0: 42465.8. Samples: 12248939740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 21:47:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747610_12248842240.pth... [2024-06-24 21:47:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000746986_12238618624.pth [2024-06-24 21:47:45,050][15401] Updated weights for policy 0, policy_version 747613 (0.0031) [2024-06-24 21:47:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 12249038848. Throughput: 0: 42472.0. Samples: 12249193840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 21:47:48,567][15401] Updated weights for policy 0, policy_version 747623 (0.0037) [2024-06-24 21:47:53,015][15401] Updated weights for policy 0, policy_version 747633 (0.0032) [2024-06-24 21:47:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12249219072. Throughput: 0: 42105.7. Samples: 12249315820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 21:47:56,349][15401] Updated weights for policy 0, policy_version 747643 (0.0028) [2024-06-24 21:47:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 12249464832. Throughput: 0: 42375.6. Samples: 12249574280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:47:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 21:48:00,651][15401] Updated weights for policy 0, policy_version 747653 (0.0033) [2024-06-24 21:48:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12249677824. Throughput: 0: 42333.3. Samples: 12249829400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 21:48:03,852][15401] Updated weights for policy 0, policy_version 747663 (0.0038) [2024-06-24 21:48:08,080][15401] Updated weights for policy 0, policy_version 747673 (0.0033) [2024-06-24 21:48:08,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 12249874432. Throughput: 0: 42229.8. Samples: 12249956080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:08,393][15132] Avg episode reward: [(0, '0.271')] [2024-06-24 21:48:11,542][15401] Updated weights for policy 0, policy_version 747683 (0.0032) [2024-06-24 21:48:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12250103808. Throughput: 0: 42575.9. Samples: 12250213800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:13,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 21:48:15,971][15401] Updated weights for policy 0, policy_version 747693 (0.0035) [2024-06-24 21:48:18,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12250300416. Throughput: 0: 42642.7. Samples: 12250476560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 21:48:19,477][15401] Updated weights for policy 0, policy_version 747703 (0.0041) [2024-06-24 21:48:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 12250497024. Throughput: 0: 42560.5. Samples: 12250601900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 21:48:23,761][15401] Updated weights for policy 0, policy_version 747713 (0.0034) [2024-06-24 21:48:27,043][15401] Updated weights for policy 0, policy_version 747723 (0.0038) [2024-06-24 21:48:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12250742784. Throughput: 0: 42532.4. Samples: 12250853700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:28,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 21:48:31,358][15401] Updated weights for policy 0, policy_version 747733 (0.0030) [2024-06-24 21:48:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12250939392. Throughput: 0: 42653.2. Samples: 12251113240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 21:48:34,622][15401] Updated weights for policy 0, policy_version 747743 (0.0029) [2024-06-24 21:48:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12251152384. Throughput: 0: 42770.3. Samples: 12251240480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 21:48:39,202][15401] Updated weights for policy 0, policy_version 747753 (0.0048) [2024-06-24 21:48:42,277][15401] Updated weights for policy 0, policy_version 747763 (0.0031) [2024-06-24 21:48:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12251381760. Throughput: 0: 42626.6. Samples: 12251492580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:43,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 21:48:46,881][15401] Updated weights for policy 0, policy_version 747773 (0.0032) [2024-06-24 21:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12251578368. Throughput: 0: 42762.4. Samples: 12251753700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 21:48:50,253][15401] Updated weights for policy 0, policy_version 747783 (0.0029) [2024-06-24 21:48:53,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 12251807744. Throughput: 0: 42651.5. Samples: 12251875300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 21:48:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 21:48:54,821][15401] Updated weights for policy 0, policy_version 747793 (0.0041) [2024-06-24 21:48:57,812][15401] Updated weights for policy 0, policy_version 747803 (0.0036) [2024-06-24 21:48:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12252004352. Throughput: 0: 42665.9. Samples: 12252133660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:48:58,390][15132] Avg episode reward: [(0, '0.228')] [2024-06-24 21:48:58,480][15349] Signal inference workers to stop experience collection... (181400 times) [2024-06-24 21:48:58,480][15349] Signal inference workers to resume experience collection... (181400 times) [2024-06-24 21:48:58,503][15401] InferenceWorker_p0-w0: stopping experience collection (181400 times) [2024-06-24 21:48:58,504][15401] InferenceWorker_p0-w0: resuming experience collection (181400 times) [2024-06-24 21:49:02,367][15401] Updated weights for policy 0, policy_version 747813 (0.0039) [2024-06-24 21:49:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12252217344. Throughput: 0: 42437.8. Samples: 12252386260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 21:49:05,287][15401] Updated weights for policy 0, policy_version 747823 (0.0032) [2024-06-24 21:49:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 12252446720. Throughput: 0: 42486.7. Samples: 12252513800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 21:49:09,917][15401] Updated weights for policy 0, policy_version 747833 (0.0032) [2024-06-24 21:49:13,141][15401] Updated weights for policy 0, policy_version 747843 (0.0038) [2024-06-24 21:49:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12252659712. Throughput: 0: 42678.8. Samples: 12252774240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:13,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 21:49:17,542][15401] Updated weights for policy 0, policy_version 747853 (0.0035) [2024-06-24 21:49:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 12252872704. Throughput: 0: 42501.8. Samples: 12253025820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 21:49:21,042][15401] Updated weights for policy 0, policy_version 747863 (0.0030) [2024-06-24 21:49:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12253085696. Throughput: 0: 42486.2. Samples: 12253152360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 21:49:25,141][15401] Updated weights for policy 0, policy_version 747873 (0.0035) [2024-06-24 21:49:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 12253282304. Throughput: 0: 42665.3. Samples: 12253412420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 21:49:28,563][15401] Updated weights for policy 0, policy_version 747883 (0.0033) [2024-06-24 21:49:32,719][15401] Updated weights for policy 0, policy_version 747893 (0.0028) [2024-06-24 21:49:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 12253495296. Throughput: 0: 42618.5. Samples: 12253671640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:33,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 21:49:36,645][15401] Updated weights for policy 0, policy_version 747903 (0.0023) [2024-06-24 21:49:38,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42598.4). Total num frames: 12253724672. Throughput: 0: 42608.5. Samples: 12253792780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:38,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 21:49:40,475][15401] Updated weights for policy 0, policy_version 747913 (0.0041) [2024-06-24 21:49:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 12253921280. Throughput: 0: 42661.7. Samples: 12254053440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 21:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747921_12253937664.pth... [2024-06-24 21:49:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747297_12243714048.pth [2024-06-24 21:49:44,134][15401] Updated weights for policy 0, policy_version 747923 (0.0039) [2024-06-24 21:49:47,989][15401] Updated weights for policy 0, policy_version 747933 (0.0030) [2024-06-24 21:49:48,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12254134272. Throughput: 0: 42563.5. Samples: 12254301620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 21:49:51,813][15401] Updated weights for policy 0, policy_version 747943 (0.0033) [2024-06-24 21:49:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 12254347264. Throughput: 0: 42560.4. Samples: 12254429120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:53,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 21:49:56,101][15401] Updated weights for policy 0, policy_version 747953 (0.0046) [2024-06-24 21:49:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 12254560256. Throughput: 0: 42511.0. Samples: 12254687240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:49:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 21:49:59,378][15401] Updated weights for policy 0, policy_version 747963 (0.0027) [2024-06-24 21:50:03,396][15132] Fps is (10 sec: 42581.4, 60 sec: 42593.8, 300 sec: 42542.3). Total num frames: 12254773248. Throughput: 0: 42517.1. Samples: 12254939360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:03,396][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:50:03,533][15401] Updated weights for policy 0, policy_version 747973 (0.0031) [2024-06-24 21:50:07,329][15401] Updated weights for policy 0, policy_version 747983 (0.0034) [2024-06-24 21:50:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 12254986240. Throughput: 0: 42585.9. Samples: 12255068720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 21:50:11,775][15401] Updated weights for policy 0, policy_version 747993 (0.0042) [2024-06-24 21:50:13,390][15132] Fps is (10 sec: 42624.8, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 12255199232. Throughput: 0: 42415.0. Samples: 12255321100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 21:50:14,943][15401] Updated weights for policy 0, policy_version 748003 (0.0025) [2024-06-24 21:50:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 12255412224. Throughput: 0: 42316.5. Samples: 12255575780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 21:50:19,369][15401] Updated weights for policy 0, policy_version 748013 (0.0032) [2024-06-24 21:50:22,570][15401] Updated weights for policy 0, policy_version 748023 (0.0041) [2024-06-24 21:50:23,390][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 12255641600. Throughput: 0: 42485.3. Samples: 12255704520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 21:50:25,228][15349] Signal inference workers to stop experience collection... (181450 times) [2024-06-24 21:50:25,256][15401] InferenceWorker_p0-w0: stopping experience collection (181450 times) [2024-06-24 21:50:25,286][15349] Signal inference workers to resume experience collection... (181450 times) [2024-06-24 21:50:25,287][15401] InferenceWorker_p0-w0: resuming experience collection (181450 times) [2024-06-24 21:50:26,920][15401] Updated weights for policy 0, policy_version 748033 (0.0042) [2024-06-24 21:50:28,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 12255821824. Throughput: 0: 42466.1. Samples: 12255964520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:28,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 21:50:30,015][15401] Updated weights for policy 0, policy_version 748043 (0.0041) [2024-06-24 21:50:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 12256067584. Throughput: 0: 42524.6. Samples: 12256215220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 21:50:34,361][15401] Updated weights for policy 0, policy_version 748053 (0.0026) [2024-06-24 21:50:37,692][15401] Updated weights for policy 0, policy_version 748063 (0.0029) [2024-06-24 21:50:38,390][15132] Fps is (10 sec: 47524.5, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 12256296960. Throughput: 0: 42725.7. Samples: 12256351680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 21:50:38,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 21:50:41,891][15401] Updated weights for policy 0, policy_version 748073 (0.0035) [2024-06-24 21:50:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 12256460800. Throughput: 0: 42630.6. Samples: 12256605620. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:50:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 21:50:45,546][15401] Updated weights for policy 0, policy_version 748083 (0.0035) [2024-06-24 21:50:48,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 12256722944. Throughput: 0: 42737.6. Samples: 12256862380. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:50:48,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 21:50:49,346][15401] Updated weights for policy 0, policy_version 748093 (0.0029) [2024-06-24 21:50:53,207][15401] Updated weights for policy 0, policy_version 748103 (0.0032) [2024-06-24 21:50:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 12256919552. Throughput: 0: 42784.3. Samples: 12256994020. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:50:53,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 21:50:57,225][15401] Updated weights for policy 0, policy_version 748113 (0.0034) [2024-06-24 21:50:58,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12257116160. Throughput: 0: 42823.8. Samples: 12257248160. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:50:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 21:51:00,946][15401] Updated weights for policy 0, policy_version 748123 (0.0041) [2024-06-24 21:51:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 12257345536. Throughput: 0: 42801.2. Samples: 12257501840. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:03,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 21:51:05,107][15401] Updated weights for policy 0, policy_version 748133 (0.0035) [2024-06-24 21:51:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12257558528. Throughput: 0: 42978.3. Samples: 12257638540. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 21:51:08,497][15401] Updated weights for policy 0, policy_version 748143 (0.0030) [2024-06-24 21:51:12,587][15401] Updated weights for policy 0, policy_version 748153 (0.0045) [2024-06-24 21:51:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 12257755136. Throughput: 0: 42892.6. Samples: 12257894580. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:13,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 21:51:16,108][15401] Updated weights for policy 0, policy_version 748163 (0.0024) [2024-06-24 21:51:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12257984512. Throughput: 0: 42848.4. Samples: 12258143400. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 21:51:20,135][15401] Updated weights for policy 0, policy_version 748173 (0.0035) [2024-06-24 21:51:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12258197504. Throughput: 0: 42811.2. Samples: 12258278180. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 21:51:23,700][15401] Updated weights for policy 0, policy_version 748183 (0.0040) [2024-06-24 21:51:27,831][15401] Updated weights for policy 0, policy_version 748193 (0.0035) [2024-06-24 21:51:28,390][15132] Fps is (10 sec: 40957.8, 60 sec: 42872.8, 300 sec: 42487.2). Total num frames: 12258394112. Throughput: 0: 42745.8. Samples: 12258529200. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:28,391][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 21:51:31,424][15401] Updated weights for policy 0, policy_version 748203 (0.0030) [2024-06-24 21:51:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12258623488. Throughput: 0: 42813.4. Samples: 12258788880. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 21:51:35,396][15401] Updated weights for policy 0, policy_version 748213 (0.0037) [2024-06-24 21:51:38,389][15132] Fps is (10 sec: 44239.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12258836480. Throughput: 0: 42817.0. Samples: 12258920780. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 21:51:39,069][15401] Updated weights for policy 0, policy_version 748223 (0.0045) [2024-06-24 21:51:43,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 12259033088. Throughput: 0: 42606.1. Samples: 12259165540. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:43,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 21:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748232_12259033088.pth... [2024-06-24 21:51:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747610_12248842240.pth [2024-06-24 21:51:43,667][15401] Updated weights for policy 0, policy_version 748233 (0.0039) [2024-06-24 21:51:46,844][15401] Updated weights for policy 0, policy_version 748243 (0.0035) [2024-06-24 21:51:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 12259246080. Throughput: 0: 42682.8. Samples: 12259422560. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 21:51:51,420][15401] Updated weights for policy 0, policy_version 748253 (0.0029) [2024-06-24 21:51:51,793][15349] Signal inference workers to stop experience collection... (181500 times) [2024-06-24 21:51:51,793][15349] Signal inference workers to resume experience collection... (181500 times) [2024-06-24 21:51:51,831][15401] InferenceWorker_p0-w0: stopping experience collection (181500 times) [2024-06-24 21:51:51,831][15401] InferenceWorker_p0-w0: resuming experience collection (181500 times) [2024-06-24 21:51:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12259459072. Throughput: 0: 42593.7. Samples: 12259555260. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:53,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 21:51:54,354][15401] Updated weights for policy 0, policy_version 748263 (0.0041) [2024-06-24 21:51:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12259672064. Throughput: 0: 42392.3. Samples: 12259802240. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:51:58,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 21:51:59,066][15401] Updated weights for policy 0, policy_version 748273 (0.0033) [2024-06-24 21:52:01,902][15401] Updated weights for policy 0, policy_version 748283 (0.0036) [2024-06-24 21:52:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12259901440. Throughput: 0: 42450.7. Samples: 12260053680. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:52:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 21:52:06,762][15401] Updated weights for policy 0, policy_version 748293 (0.0035) [2024-06-24 21:52:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12260114432. Throughput: 0: 42407.1. Samples: 12260186500. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:52:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 21:52:10,018][15401] Updated weights for policy 0, policy_version 748303 (0.0027) [2024-06-24 21:52:13,391][15132] Fps is (10 sec: 40955.5, 60 sec: 42597.5, 300 sec: 42542.7). Total num frames: 12260311040. Throughput: 0: 42595.4. Samples: 12260446020. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:52:13,391][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 21:52:14,418][15401] Updated weights for policy 0, policy_version 748313 (0.0036) [2024-06-24 21:52:17,738][15401] Updated weights for policy 0, policy_version 748323 (0.0033) [2024-06-24 21:52:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 12260540416. Throughput: 0: 42267.0. Samples: 12260690900. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:52:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-24 21:52:22,255][15401] Updated weights for policy 0, policy_version 748333 (0.0034) [2024-06-24 21:52:23,390][15132] Fps is (10 sec: 45880.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12260769792. Throughput: 0: 42311.4. Samples: 12260824800. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-24 21:52:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 21:52:25,598][15401] Updated weights for policy 0, policy_version 748343 (0.0031) [2024-06-24 21:52:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.8, 300 sec: 42598.4). Total num frames: 12260950016. Throughput: 0: 42547.6. Samples: 12261080080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:28,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 21:52:30,009][15401] Updated weights for policy 0, policy_version 748353 (0.0037) [2024-06-24 21:52:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12261163008. Throughput: 0: 42454.2. Samples: 12261333000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 21:52:33,427][15401] Updated weights for policy 0, policy_version 748363 (0.0025) [2024-06-24 21:52:37,519][15401] Updated weights for policy 0, policy_version 748373 (0.0032) [2024-06-24 21:52:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12261376000. Throughput: 0: 42392.4. Samples: 12261462920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 21:52:40,945][15401] Updated weights for policy 0, policy_version 748383 (0.0028) [2024-06-24 21:52:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 12261588992. Throughput: 0: 42649.8. Samples: 12261721480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 21:52:45,156][15401] Updated weights for policy 0, policy_version 748393 (0.0034) [2024-06-24 21:52:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12261801984. Throughput: 0: 42760.1. Samples: 12261977880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 21:52:48,672][15401] Updated weights for policy 0, policy_version 748403 (0.0031) [2024-06-24 21:52:52,731][15401] Updated weights for policy 0, policy_version 748413 (0.0036) [2024-06-24 21:52:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12262014976. Throughput: 0: 42680.1. Samples: 12262107100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 21:52:56,150][15401] Updated weights for policy 0, policy_version 748423 (0.0046) [2024-06-24 21:52:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12262244352. Throughput: 0: 42697.5. Samples: 12262367360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:52:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 21:53:00,095][15401] Updated weights for policy 0, policy_version 748433 (0.0036) [2024-06-24 21:53:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 12262457344. Throughput: 0: 42936.0. Samples: 12262623120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:03,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-24 21:53:03,968][15401] Updated weights for policy 0, policy_version 748443 (0.0034) [2024-06-24 21:53:08,119][15401] Updated weights for policy 0, policy_version 748453 (0.0045) [2024-06-24 21:53:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42542.9). Total num frames: 12262653952. Throughput: 0: 42908.9. Samples: 12262755800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:08,393][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 21:53:11,648][15401] Updated weights for policy 0, policy_version 748463 (0.0041) [2024-06-24 21:53:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42872.2, 300 sec: 42653.9). Total num frames: 12262883328. Throughput: 0: 42758.6. Samples: 12263004220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 21:53:15,773][15401] Updated weights for policy 0, policy_version 748473 (0.0033) [2024-06-24 21:53:17,472][15349] Signal inference workers to stop experience collection... (181550 times) [2024-06-24 21:53:17,472][15349] Signal inference workers to resume experience collection... (181550 times) [2024-06-24 21:53:17,490][15401] InferenceWorker_p0-w0: stopping experience collection (181550 times) [2024-06-24 21:53:17,518][15401] InferenceWorker_p0-w0: resuming experience collection (181550 times) [2024-06-24 21:53:18,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12263112704. Throughput: 0: 42763.5. Samples: 12263257360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 21:53:19,350][15401] Updated weights for policy 0, policy_version 748483 (0.0035) [2024-06-24 21:53:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 12263276544. Throughput: 0: 42815.7. Samples: 12263389620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 21:53:23,593][15401] Updated weights for policy 0, policy_version 748493 (0.0040) [2024-06-24 21:53:27,334][15401] Updated weights for policy 0, policy_version 748503 (0.0039) [2024-06-24 21:53:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12263505920. Throughput: 0: 42642.9. Samples: 12263640400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 21:53:31,100][15401] Updated weights for policy 0, policy_version 748513 (0.0033) [2024-06-24 21:53:33,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 12263751680. Throughput: 0: 42518.2. Samples: 12263891300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:33,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 21:53:34,763][15401] Updated weights for policy 0, policy_version 748523 (0.0032) [2024-06-24 21:53:38,391][15132] Fps is (10 sec: 44229.3, 60 sec: 42870.4, 300 sec: 42598.5). Total num frames: 12263948288. Throughput: 0: 42676.6. Samples: 12264027620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:38,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 21:53:38,607][15401] Updated weights for policy 0, policy_version 748533 (0.0041) [2024-06-24 21:53:42,326][15401] Updated weights for policy 0, policy_version 748543 (0.0041) [2024-06-24 21:53:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12264161280. Throughput: 0: 42584.1. Samples: 12264283640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 21:53:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748546_12264177664.pth... [2024-06-24 21:53:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000747921_12253937664.pth [2024-06-24 21:53:46,102][15401] Updated weights for policy 0, policy_version 748553 (0.0036) [2024-06-24 21:53:48,389][15132] Fps is (10 sec: 42605.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12264374272. Throughput: 0: 42585.5. Samples: 12264539360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 21:53:49,937][15401] Updated weights for policy 0, policy_version 748563 (0.0039) [2024-06-24 21:53:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12264587264. Throughput: 0: 42552.1. Samples: 12264670540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:53,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 21:53:53,570][15401] Updated weights for policy 0, policy_version 748573 (0.0033) [2024-06-24 21:53:57,437][15401] Updated weights for policy 0, policy_version 748583 (0.0035) [2024-06-24 21:53:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12264816640. Throughput: 0: 42729.4. Samples: 12264927140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:53:58,392][15132] Avg episode reward: [(0, '0.278')] [2024-06-24 21:54:02,154][15401] Updated weights for policy 0, policy_version 748593 (0.0038) [2024-06-24 21:54:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42598.3, 300 sec: 42598.0). Total num frames: 12265013248. Throughput: 0: 42879.0. Samples: 12265187020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 21:54:03,393][15132] Avg episode reward: [(0, '0.829')] [2024-06-24 21:54:05,272][15401] Updated weights for policy 0, policy_version 748603 (0.0042) [2024-06-24 21:54:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 12265242624. Throughput: 0: 42740.9. Samples: 12265312960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 21:54:09,655][15401] Updated weights for policy 0, policy_version 748613 (0.0039) [2024-06-24 21:54:13,048][15401] Updated weights for policy 0, policy_version 748623 (0.0025) [2024-06-24 21:54:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12265439232. Throughput: 0: 42748.3. Samples: 12265564080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:13,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 21:54:17,317][15401] Updated weights for policy 0, policy_version 748633 (0.0029) [2024-06-24 21:54:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12265652224. Throughput: 0: 43115.3. Samples: 12265831380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 21:54:21,080][15401] Updated weights for policy 0, policy_version 748643 (0.0025) [2024-06-24 21:54:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12265881600. Throughput: 0: 42923.8. Samples: 12265959120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 21:54:25,015][15401] Updated weights for policy 0, policy_version 748653 (0.0040) [2024-06-24 21:54:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12266078208. Throughput: 0: 42767.0. Samples: 12266208160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:28,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-24 21:54:28,651][15401] Updated weights for policy 0, policy_version 748663 (0.0029) [2024-06-24 21:54:32,609][15401] Updated weights for policy 0, policy_version 748673 (0.0029) [2024-06-24 21:54:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42326.9, 300 sec: 42598.7). Total num frames: 12266291200. Throughput: 0: 42984.2. Samples: 12266473660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 21:54:36,153][15401] Updated weights for policy 0, policy_version 748683 (0.0040) [2024-06-24 21:54:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.6, 300 sec: 42709.5). Total num frames: 12266520576. Throughput: 0: 42915.5. Samples: 12266601740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 21:54:40,516][15401] Updated weights for policy 0, policy_version 748693 (0.0033) [2024-06-24 21:54:40,903][15349] Signal inference workers to stop experience collection... (181600 times) [2024-06-24 21:54:40,929][15401] InferenceWorker_p0-w0: stopping experience collection (181600 times) [2024-06-24 21:54:40,959][15349] Signal inference workers to resume experience collection... (181600 times) [2024-06-24 21:54:40,960][15401] InferenceWorker_p0-w0: resuming experience collection (181600 times) [2024-06-24 21:54:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12266733568. Throughput: 0: 42807.2. Samples: 12266853360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:43,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 21:54:43,671][15401] Updated weights for policy 0, policy_version 748703 (0.0032) [2024-06-24 21:54:48,239][15401] Updated weights for policy 0, policy_version 748713 (0.0038) [2024-06-24 21:54:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 12266913792. Throughput: 0: 42851.8. Samples: 12267115240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 21:54:51,181][15401] Updated weights for policy 0, policy_version 748723 (0.0038) [2024-06-24 21:54:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12267159552. Throughput: 0: 42762.1. Samples: 12267237260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 21:54:55,731][15401] Updated weights for policy 0, policy_version 748733 (0.0039) [2024-06-24 21:54:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42600.1, 300 sec: 42710.4). Total num frames: 12267372544. Throughput: 0: 42854.2. Samples: 12267492520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:54:58,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 21:54:58,636][15401] Updated weights for policy 0, policy_version 748743 (0.0040) [2024-06-24 21:55:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 12267552768. Throughput: 0: 42636.7. Samples: 12267750040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-24 21:55:03,744][15401] Updated weights for policy 0, policy_version 748753 (0.0031) [2024-06-24 21:55:06,117][15401] Updated weights for policy 0, policy_version 748763 (0.0042) [2024-06-24 21:55:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 12267814912. Throughput: 0: 42443.1. Samples: 12267869060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 21:55:11,217][15401] Updated weights for policy 0, policy_version 748773 (0.0027) [2024-06-24 21:55:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12268027904. Throughput: 0: 42877.2. Samples: 12268137640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 21:55:14,165][15401] Updated weights for policy 0, policy_version 748783 (0.0052) [2024-06-24 21:55:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12268208128. Throughput: 0: 42488.6. Samples: 12268385640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 21:55:18,809][15401] Updated weights for policy 0, policy_version 748793 (0.0044) [2024-06-24 21:55:21,997][15401] Updated weights for policy 0, policy_version 748803 (0.0022) [2024-06-24 21:55:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12268437504. Throughput: 0: 42420.0. Samples: 12268510640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:23,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 21:55:26,250][15401] Updated weights for policy 0, policy_version 748813 (0.0047) [2024-06-24 21:55:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12268634112. Throughput: 0: 42653.2. Samples: 12268772760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:28,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 21:55:30,036][15401] Updated weights for policy 0, policy_version 748823 (0.0027) [2024-06-24 21:55:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 12268847104. Throughput: 0: 42180.7. Samples: 12269013480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:33,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 21:55:34,113][15401] Updated weights for policy 0, policy_version 748833 (0.0039) [2024-06-24 21:55:37,777][15401] Updated weights for policy 0, policy_version 748843 (0.0034) [2024-06-24 21:55:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12269076480. Throughput: 0: 42378.2. Samples: 12269144280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 21:55:41,784][15401] Updated weights for policy 0, policy_version 748853 (0.0044) [2024-06-24 21:55:43,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 12269256704. Throughput: 0: 42396.0. Samples: 12269400340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:43,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 21:55:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748856_12269256704.pth... [2024-06-24 21:55:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748232_12259033088.pth [2024-06-24 21:55:45,400][15401] Updated weights for policy 0, policy_version 748863 (0.0028) [2024-06-24 21:55:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12269486080. Throughput: 0: 42276.1. Samples: 12269652460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-24 21:55:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 21:55:49,393][15401] Updated weights for policy 0, policy_version 748873 (0.0035) [2024-06-24 21:55:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12269682688. Throughput: 0: 42639.1. Samples: 12269787820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:55:53,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-24 21:55:53,418][15401] Updated weights for policy 0, policy_version 748883 (0.0035) [2024-06-24 21:55:53,432][15349] Signal inference workers to stop experience collection... (181650 times) [2024-06-24 21:55:53,432][15349] Signal inference workers to resume experience collection... (181650 times) [2024-06-24 21:55:53,448][15401] InferenceWorker_p0-w0: stopping experience collection (181650 times) [2024-06-24 21:55:53,448][15401] InferenceWorker_p0-w0: resuming experience collection (181650 times) [2024-06-24 21:55:56,837][15401] Updated weights for policy 0, policy_version 748893 (0.0037) [2024-06-24 21:55:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12269912064. Throughput: 0: 42280.0. Samples: 12270040240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:55:58,395][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 21:56:01,096][15401] Updated weights for policy 0, policy_version 748903 (0.0032) [2024-06-24 21:56:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12270141440. Throughput: 0: 42280.4. Samples: 12270288260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 21:56:04,799][15401] Updated weights for policy 0, policy_version 748913 (0.0036) [2024-06-24 21:56:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12270338048. Throughput: 0: 42474.7. Samples: 12270422000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 21:56:08,703][15401] Updated weights for policy 0, policy_version 748923 (0.0028) [2024-06-24 21:56:12,294][15401] Updated weights for policy 0, policy_version 748933 (0.0044) [2024-06-24 21:56:13,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 12270567424. Throughput: 0: 42341.8. Samples: 12270678240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:13,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 21:56:16,281][15401] Updated weights for policy 0, policy_version 748943 (0.0027) [2024-06-24 21:56:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12270764032. Throughput: 0: 42454.0. Samples: 12270923800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 21:56:19,944][15401] Updated weights for policy 0, policy_version 748953 (0.0036) [2024-06-24 21:56:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12270977024. Throughput: 0: 42372.5. Samples: 12271051040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 21:56:23,961][15401] Updated weights for policy 0, policy_version 748963 (0.0026) [2024-06-24 21:56:27,575][15401] Updated weights for policy 0, policy_version 748973 (0.0031) [2024-06-24 21:56:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12271190016. Throughput: 0: 42389.0. Samples: 12271307840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-24 21:56:31,755][15401] Updated weights for policy 0, policy_version 748983 (0.0028) [2024-06-24 21:56:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 12271403008. Throughput: 0: 42372.3. Samples: 12271559220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 21:56:35,227][15401] Updated weights for policy 0, policy_version 748993 (0.0030) [2024-06-24 21:56:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 12271599616. Throughput: 0: 42257.8. Samples: 12271689420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 21:56:39,448][15401] Updated weights for policy 0, policy_version 749003 (0.0052) [2024-06-24 21:56:42,886][15401] Updated weights for policy 0, policy_version 749013 (0.0021) [2024-06-24 21:56:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12271828992. Throughput: 0: 42286.9. Samples: 12271943140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 21:56:47,059][15401] Updated weights for policy 0, policy_version 749023 (0.0030) [2024-06-24 21:56:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12272041984. Throughput: 0: 42476.0. Samples: 12272199680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 21:56:50,533][15401] Updated weights for policy 0, policy_version 749033 (0.0028) [2024-06-24 21:56:53,395][15132] Fps is (10 sec: 40935.4, 60 sec: 42594.2, 300 sec: 42597.6). Total num frames: 12272238592. Throughput: 0: 42278.5. Samples: 12272324780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:53,396][15132] Avg episode reward: [(0, '0.801')] [2024-06-24 21:56:54,718][15401] Updated weights for policy 0, policy_version 749043 (0.0042) [2024-06-24 21:56:57,903][15349] Signal inference workers to stop experience collection... (181700 times) [2024-06-24 21:56:57,904][15349] Signal inference workers to resume experience collection... (181700 times) [2024-06-24 21:56:57,932][15401] InferenceWorker_p0-w0: stopping experience collection (181700 times) [2024-06-24 21:56:57,932][15401] InferenceWorker_p0-w0: resuming experience collection (181700 times) [2024-06-24 21:56:58,205][15401] Updated weights for policy 0, policy_version 749053 (0.0027) [2024-06-24 21:56:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12272484352. Throughput: 0: 42436.0. Samples: 12272587860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:56:58,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 21:57:02,439][15401] Updated weights for policy 0, policy_version 749063 (0.0038) [2024-06-24 21:57:03,389][15132] Fps is (10 sec: 42624.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12272664576. Throughput: 0: 42562.2. Samples: 12272839100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:03,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-24 21:57:06,029][15401] Updated weights for policy 0, policy_version 749073 (0.0024) [2024-06-24 21:57:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42598.6). Total num frames: 12272877568. Throughput: 0: 42501.8. Samples: 12272963620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:08,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 21:57:09,962][15401] Updated weights for policy 0, policy_version 749083 (0.0029) [2024-06-24 21:57:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 12273123328. Throughput: 0: 42703.6. Samples: 12273229500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 21:57:13,816][15401] Updated weights for policy 0, policy_version 749093 (0.0023) [2024-06-24 21:57:17,458][15401] Updated weights for policy 0, policy_version 749103 (0.0040) [2024-06-24 21:57:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 12273303552. Throughput: 0: 42748.1. Samples: 12273482880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 21:57:21,573][15401] Updated weights for policy 0, policy_version 749113 (0.0046) [2024-06-24 21:57:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12273516544. Throughput: 0: 42685.2. Samples: 12273610260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 21:57:24,978][15401] Updated weights for policy 0, policy_version 749123 (0.0028) [2024-06-24 21:57:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12273762304. Throughput: 0: 42788.7. Samples: 12273868640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:28,399][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 21:57:29,094][15401] Updated weights for policy 0, policy_version 749133 (0.0035) [2024-06-24 21:57:32,978][15401] Updated weights for policy 0, policy_version 749143 (0.0039) [2024-06-24 21:57:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12273958912. Throughput: 0: 42770.3. Samples: 12274124340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-24 21:57:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 21:57:36,523][15401] Updated weights for policy 0, policy_version 749153 (0.0036) [2024-06-24 21:57:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12274155520. Throughput: 0: 42889.6. Samples: 12274254560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:57:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 21:57:40,430][15401] Updated weights for policy 0, policy_version 749163 (0.0033) [2024-06-24 21:57:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12274401280. Throughput: 0: 42852.1. Samples: 12274516100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:57:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 21:57:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749171_12274417664.pth... [2024-06-24 21:57:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748546_12264177664.pth [2024-06-24 21:57:44,406][15401] Updated weights for policy 0, policy_version 749173 (0.0022) [2024-06-24 21:57:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12274597888. Throughput: 0: 43014.1. Samples: 12274774740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:57:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 21:57:48,433][15401] Updated weights for policy 0, policy_version 749183 (0.0028) [2024-06-24 21:57:51,906][15401] Updated weights for policy 0, policy_version 749193 (0.0032) [2024-06-24 21:57:53,391][15132] Fps is (10 sec: 40955.4, 60 sec: 42874.9, 300 sec: 42598.2). Total num frames: 12274810880. Throughput: 0: 43001.1. Samples: 12274898720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:57:53,391][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 21:57:55,895][15401] Updated weights for policy 0, policy_version 749203 (0.0039) [2024-06-24 21:57:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42598.8). Total num frames: 12275023872. Throughput: 0: 42928.9. Samples: 12275161300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:57:58,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 21:57:59,527][15401] Updated weights for policy 0, policy_version 749213 (0.0041) [2024-06-24 21:58:03,364][15401] Updated weights for policy 0, policy_version 749223 (0.0038) [2024-06-24 21:58:03,389][15132] Fps is (10 sec: 45880.3, 60 sec: 43417.5, 300 sec: 42765.4). Total num frames: 12275269632. Throughput: 0: 42998.7. Samples: 12275417820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-24 21:58:07,099][15401] Updated weights for policy 0, policy_version 749233 (0.0033) [2024-06-24 21:58:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12275466240. Throughput: 0: 42925.0. Samples: 12275541880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 21:58:11,061][15401] Updated weights for policy 0, policy_version 749243 (0.0024) [2024-06-24 21:58:13,390][15132] Fps is (10 sec: 40956.0, 60 sec: 42597.7, 300 sec: 42598.3). Total num frames: 12275679232. Throughput: 0: 42950.7. Samples: 12275801460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:13,391][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 21:58:14,833][15401] Updated weights for policy 0, policy_version 749253 (0.0040) [2024-06-24 21:58:17,133][15349] Signal inference workers to stop experience collection... (181750 times) [2024-06-24 21:58:17,133][15349] Signal inference workers to resume experience collection... (181750 times) [2024-06-24 21:58:17,175][15401] InferenceWorker_p0-w0: stopping experience collection (181750 times) [2024-06-24 21:58:17,175][15401] InferenceWorker_p0-w0: resuming experience collection (181750 times) [2024-06-24 21:58:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 12275892224. Throughput: 0: 42982.1. Samples: 12276058640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:18,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 21:58:18,792][15401] Updated weights for policy 0, policy_version 749263 (0.0037) [2024-06-24 21:58:22,481][15401] Updated weights for policy 0, policy_version 749273 (0.0031) [2024-06-24 21:58:23,389][15132] Fps is (10 sec: 44241.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12276121600. Throughput: 0: 42842.3. Samples: 12276182460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:23,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-24 21:58:26,604][15401] Updated weights for policy 0, policy_version 749283 (0.0028) [2024-06-24 21:58:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 12276318208. Throughput: 0: 42611.6. Samples: 12276433620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:28,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 21:58:30,306][15401] Updated weights for policy 0, policy_version 749293 (0.0036) [2024-06-24 21:58:33,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.7, 300 sec: 42653.8). Total num frames: 12276531200. Throughput: 0: 42773.3. Samples: 12276699640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:33,392][15132] Avg episode reward: [(0, '0.265')] [2024-06-24 21:58:34,035][15401] Updated weights for policy 0, policy_version 749303 (0.0033) [2024-06-24 21:58:37,981][15401] Updated weights for policy 0, policy_version 749313 (0.0049) [2024-06-24 21:58:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12276744192. Throughput: 0: 42762.0. Samples: 12276822960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:38,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 21:58:41,576][15401] Updated weights for policy 0, policy_version 749323 (0.0030) [2024-06-24 21:58:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12276973568. Throughput: 0: 42720.0. Samples: 12277083700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 21:58:45,617][15401] Updated weights for policy 0, policy_version 749333 (0.0034) [2024-06-24 21:58:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12277153792. Throughput: 0: 42761.0. Samples: 12277342060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-24 21:58:49,448][15401] Updated weights for policy 0, policy_version 749343 (0.0032) [2024-06-24 21:58:53,190][15401] Updated weights for policy 0, policy_version 749353 (0.0050) [2024-06-24 21:58:53,392][15132] Fps is (10 sec: 42587.5, 60 sec: 43143.6, 300 sec: 42653.9). Total num frames: 12277399552. Throughput: 0: 42739.4. Samples: 12277465260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:53,401][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 21:58:57,230][15401] Updated weights for policy 0, policy_version 749363 (0.0039) [2024-06-24 21:58:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12277596160. Throughput: 0: 42817.4. Samples: 12277728200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:58:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 21:59:00,792][15401] Updated weights for policy 0, policy_version 749373 (0.0032) [2024-06-24 21:59:03,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12277809152. Throughput: 0: 42973.4. Samples: 12277992340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:59:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 21:59:04,648][15401] Updated weights for policy 0, policy_version 749383 (0.0037) [2024-06-24 21:59:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12278038528. Throughput: 0: 42993.7. Samples: 12278117180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:59:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 21:59:08,445][15401] Updated weights for policy 0, policy_version 749393 (0.0035) [2024-06-24 21:59:12,239][15401] Updated weights for policy 0, policy_version 749403 (0.0038) [2024-06-24 21:59:13,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42870.4, 300 sec: 42709.1). Total num frames: 12278251520. Throughput: 0: 43190.5. Samples: 12278377300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:59:13,393][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 21:59:16,034][15401] Updated weights for policy 0, policy_version 749413 (0.0023) [2024-06-24 21:59:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 12278464512. Throughput: 0: 43063.2. Samples: 12278637380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 21:59:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 21:59:19,778][15401] Updated weights for policy 0, policy_version 749423 (0.0036) [2024-06-24 21:59:20,244][15349] Signal inference workers to stop experience collection... (181800 times) [2024-06-24 21:59:20,288][15401] InferenceWorker_p0-w0: stopping experience collection (181800 times) [2024-06-24 21:59:20,364][15349] Signal inference workers to resume experience collection... (181800 times) [2024-06-24 21:59:20,364][15401] InferenceWorker_p0-w0: resuming experience collection (181800 times) [2024-06-24 21:59:23,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12278661120. Throughput: 0: 43187.1. Samples: 12278766380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 21:59:23,782][15401] Updated weights for policy 0, policy_version 749433 (0.0036) [2024-06-24 21:59:27,390][15401] Updated weights for policy 0, policy_version 749443 (0.0033) [2024-06-24 21:59:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12278906880. Throughput: 0: 43091.5. Samples: 12279022820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 21:59:31,547][15401] Updated weights for policy 0, policy_version 749453 (0.0038) [2024-06-24 21:59:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 12279103488. Throughput: 0: 43118.2. Samples: 12279282380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 21:59:34,947][15401] Updated weights for policy 0, policy_version 749463 (0.0029) [2024-06-24 21:59:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12279316480. Throughput: 0: 43113.9. Samples: 12279405280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 21:59:39,345][15401] Updated weights for policy 0, policy_version 749473 (0.0041) [2024-06-24 21:59:42,826][15401] Updated weights for policy 0, policy_version 749483 (0.0029) [2024-06-24 21:59:43,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12279562240. Throughput: 0: 43075.8. Samples: 12279666620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 21:59:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749485_12279562240.pth... [2024-06-24 21:59:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000748856_12269256704.pth [2024-06-24 21:59:46,956][15401] Updated weights for policy 0, policy_version 749493 (0.0039) [2024-06-24 21:59:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12279758848. Throughput: 0: 42922.2. Samples: 12279923840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 21:59:50,319][15401] Updated weights for policy 0, policy_version 749503 (0.0040) [2024-06-24 21:59:53,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 12279971840. Throughput: 0: 42866.3. Samples: 12280046160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 21:59:54,563][15401] Updated weights for policy 0, policy_version 749513 (0.0040) [2024-06-24 21:59:57,759][15401] Updated weights for policy 0, policy_version 749523 (0.0037) [2024-06-24 21:59:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12280184832. Throughput: 0: 42848.2. Samples: 12280305360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 21:59:58,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 22:00:02,103][15401] Updated weights for policy 0, policy_version 749533 (0.0025) [2024-06-24 22:00:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12280381440. Throughput: 0: 42929.0. Samples: 12280569180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 22:00:05,729][15401] Updated weights for policy 0, policy_version 749543 (0.0044) [2024-06-24 22:00:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12280610816. Throughput: 0: 42743.6. Samples: 12280689840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:08,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 22:00:09,731][15401] Updated weights for policy 0, policy_version 749553 (0.0031) [2024-06-24 22:00:13,275][15401] Updated weights for policy 0, policy_version 749563 (0.0032) [2024-06-24 22:00:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12280840192. Throughput: 0: 42864.0. Samples: 12280951700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-24 22:00:17,274][15401] Updated weights for policy 0, policy_version 749573 (0.0046) [2024-06-24 22:00:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12281036800. Throughput: 0: 42833.2. Samples: 12281209880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:18,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-24 22:00:20,729][15401] Updated weights for policy 0, policy_version 749583 (0.0041) [2024-06-24 22:00:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 12281266176. Throughput: 0: 42868.3. Samples: 12281334360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 22:00:24,946][15401] Updated weights for policy 0, policy_version 749593 (0.0033) [2024-06-24 22:00:28,184][15401] Updated weights for policy 0, policy_version 749603 (0.0033) [2024-06-24 22:00:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 12281495552. Throughput: 0: 42911.2. Samples: 12281597620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 22:00:32,798][15401] Updated weights for policy 0, policy_version 749613 (0.0033) [2024-06-24 22:00:33,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 12281675776. Throughput: 0: 43038.7. Samples: 12281860680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:33,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:00:35,727][15401] Updated weights for policy 0, policy_version 749623 (0.0043) [2024-06-24 22:00:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12281921536. Throughput: 0: 42967.4. Samples: 12281979700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 22:00:40,441][15401] Updated weights for policy 0, policy_version 749633 (0.0028) [2024-06-24 22:00:43,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12282134528. Throughput: 0: 42967.9. Samples: 12282238920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 22:00:43,585][15401] Updated weights for policy 0, policy_version 749643 (0.0041) [2024-06-24 22:00:45,965][15349] Signal inference workers to stop experience collection... (181850 times) [2024-06-24 22:00:45,990][15401] InferenceWorker_p0-w0: stopping experience collection (181850 times) [2024-06-24 22:00:46,028][15349] Signal inference workers to resume experience collection... (181850 times) [2024-06-24 22:00:46,028][15401] InferenceWorker_p0-w0: resuming experience collection (181850 times) [2024-06-24 22:00:47,907][15401] Updated weights for policy 0, policy_version 749653 (0.0039) [2024-06-24 22:00:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12282331136. Throughput: 0: 42850.6. Samples: 12282497460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 22:00:51,300][15401] Updated weights for policy 0, policy_version 749663 (0.0042) [2024-06-24 22:00:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12282527744. Throughput: 0: 42945.8. Samples: 12282622400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 22:00:55,630][15401] Updated weights for policy 0, policy_version 749673 (0.0036) [2024-06-24 22:00:58,391][15132] Fps is (10 sec: 45867.9, 60 sec: 43416.3, 300 sec: 42875.9). Total num frames: 12282789888. Throughput: 0: 42921.9. Samples: 12282883260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-24 22:00:58,391][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 22:00:59,019][15401] Updated weights for policy 0, policy_version 749683 (0.0029) [2024-06-24 22:01:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12282953728. Throughput: 0: 42913.9. Samples: 12283141000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 22:01:03,511][15401] Updated weights for policy 0, policy_version 749693 (0.0028) [2024-06-24 22:01:06,754][15401] Updated weights for policy 0, policy_version 749703 (0.0033) [2024-06-24 22:01:08,389][15132] Fps is (10 sec: 39328.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12283183104. Throughput: 0: 42769.6. Samples: 12283258980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 22:01:10,998][15401] Updated weights for policy 0, policy_version 749713 (0.0028) [2024-06-24 22:01:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12283396096. Throughput: 0: 42666.6. Samples: 12283517620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 22:01:14,449][15401] Updated weights for policy 0, policy_version 749723 (0.0032) [2024-06-24 22:01:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12283609088. Throughput: 0: 42597.9. Samples: 12283777480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:01:18,851][15401] Updated weights for policy 0, policy_version 749733 (0.0038) [2024-06-24 22:01:22,140][15401] Updated weights for policy 0, policy_version 749743 (0.0026) [2024-06-24 22:01:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12283838464. Throughput: 0: 42729.4. Samples: 12283902520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 22:01:26,355][15401] Updated weights for policy 0, policy_version 749753 (0.0042) [2024-06-24 22:01:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12284035072. Throughput: 0: 42711.7. Samples: 12284160940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:28,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:01:29,852][15401] Updated weights for policy 0, policy_version 749763 (0.0036) [2024-06-24 22:01:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42871.5, 300 sec: 42875.7). Total num frames: 12284248064. Throughput: 0: 42673.3. Samples: 12284417860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:33,392][15132] Avg episode reward: [(0, '0.219')] [2024-06-24 22:01:33,822][15401] Updated weights for policy 0, policy_version 749773 (0.0026) [2024-06-24 22:01:37,671][15401] Updated weights for policy 0, policy_version 749783 (0.0034) [2024-06-24 22:01:38,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 12284477440. Throughput: 0: 42743.4. Samples: 12284545960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:38,392][15132] Avg episode reward: [(0, '0.287')] [2024-06-24 22:01:41,362][15401] Updated weights for policy 0, policy_version 749793 (0.0040) [2024-06-24 22:01:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12284674048. Throughput: 0: 42711.9. Samples: 12284805220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:01:43,560][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749799_12284706816.pth... [2024-06-24 22:01:43,613][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749171_12274417664.pth [2024-06-24 22:01:45,515][15401] Updated weights for policy 0, policy_version 749803 (0.0028) [2024-06-24 22:01:48,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.5, 300 sec: 42877.0). Total num frames: 12284887040. Throughput: 0: 42621.4. Samples: 12285058960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-24 22:01:49,363][15401] Updated weights for policy 0, policy_version 749813 (0.0045) [2024-06-24 22:01:53,176][15401] Updated weights for policy 0, policy_version 749823 (0.0043) [2024-06-24 22:01:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12285100032. Throughput: 0: 42775.6. Samples: 12285183880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:53,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-24 22:01:56,886][15401] Updated weights for policy 0, policy_version 749833 (0.0036) [2024-06-24 22:01:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42053.4, 300 sec: 42876.1). Total num frames: 12285313024. Throughput: 0: 42749.9. Samples: 12285441360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:01:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 22:02:00,812][15401] Updated weights for policy 0, policy_version 749843 (0.0032) [2024-06-24 22:02:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12285526016. Throughput: 0: 42814.7. Samples: 12285704140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:03,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 22:02:04,348][15401] Updated weights for policy 0, policy_version 749853 (0.0041) [2024-06-24 22:02:08,275][15401] Updated weights for policy 0, policy_version 749863 (0.0035) [2024-06-24 22:02:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12285755392. Throughput: 0: 42688.3. Samples: 12285823500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-24 22:02:10,685][15349] Signal inference workers to stop experience collection... (181900 times) [2024-06-24 22:02:10,686][15349] Signal inference workers to resume experience collection... (181900 times) [2024-06-24 22:02:10,709][15401] InferenceWorker_p0-w0: stopping experience collection (181900 times) [2024-06-24 22:02:10,716][15401] InferenceWorker_p0-w0: resuming experience collection (181900 times) [2024-06-24 22:02:12,220][15401] Updated weights for policy 0, policy_version 749873 (0.0031) [2024-06-24 22:02:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12285952000. Throughput: 0: 42663.3. Samples: 12286080800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 22:02:15,811][15401] Updated weights for policy 0, policy_version 749883 (0.0027) [2024-06-24 22:02:18,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12286148608. Throughput: 0: 42746.4. Samples: 12286341340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 22:02:19,923][15401] Updated weights for policy 0, policy_version 749893 (0.0041) [2024-06-24 22:02:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12286361600. Throughput: 0: 42581.3. Samples: 12286462020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 22:02:23,910][15401] Updated weights for policy 0, policy_version 749903 (0.0035) [2024-06-24 22:02:27,530][15401] Updated weights for policy 0, policy_version 749913 (0.0030) [2024-06-24 22:02:28,389][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12286590976. Throughput: 0: 42435.5. Samples: 12286714820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:28,396][15132] Avg episode reward: [(0, '0.438')] [2024-06-24 22:02:31,563][15401] Updated weights for policy 0, policy_version 749923 (0.0028) [2024-06-24 22:02:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 12286803968. Throughput: 0: 42622.6. Samples: 12286976980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 22:02:35,159][15401] Updated weights for policy 0, policy_version 749933 (0.0039) [2024-06-24 22:02:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 12287016960. Throughput: 0: 42530.6. Samples: 12287097760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:38,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-24 22:02:39,143][15401] Updated weights for policy 0, policy_version 749943 (0.0035) [2024-06-24 22:02:42,768][15401] Updated weights for policy 0, policy_version 749953 (0.0040) [2024-06-24 22:02:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 12287246336. Throughput: 0: 42553.2. Samples: 12287356360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 22:02:43,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 22:02:46,767][15401] Updated weights for policy 0, policy_version 749963 (0.0023) [2024-06-24 22:02:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.7). Total num frames: 12287442944. Throughput: 0: 42347.0. Samples: 12287609760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:02:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 22:02:50,451][15401] Updated weights for policy 0, policy_version 749973 (0.0041) [2024-06-24 22:02:53,396][15132] Fps is (10 sec: 40943.4, 60 sec: 42593.7, 300 sec: 42819.6). Total num frames: 12287655936. Throughput: 0: 42470.4. Samples: 12287734940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:02:53,397][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 22:02:54,814][15401] Updated weights for policy 0, policy_version 749983 (0.0027) [2024-06-24 22:02:58,038][15401] Updated weights for policy 0, policy_version 749993 (0.0028) [2024-06-24 22:02:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12287901696. Throughput: 0: 42534.3. Samples: 12287994840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:02:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 22:03:02,303][15401] Updated weights for policy 0, policy_version 750003 (0.0034) [2024-06-24 22:03:03,396][15132] Fps is (10 sec: 44237.2, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 12288098304. Throughput: 0: 42419.1. Samples: 12288250480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:03,396][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 22:03:05,581][15401] Updated weights for policy 0, policy_version 750013 (0.0023) [2024-06-24 22:03:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.2). Total num frames: 12288294912. Throughput: 0: 42585.4. Samples: 12288378360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:08,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-24 22:03:09,934][15401] Updated weights for policy 0, policy_version 750023 (0.0039) [2024-06-24 22:03:13,213][15401] Updated weights for policy 0, policy_version 750033 (0.0034) [2024-06-24 22:03:13,390][15132] Fps is (10 sec: 44265.0, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 12288540672. Throughput: 0: 42784.4. Samples: 12288640120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 22:03:17,951][15401] Updated weights for policy 0, policy_version 750043 (0.0030) [2024-06-24 22:03:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12288720896. Throughput: 0: 42662.6. Samples: 12288896800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:18,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-24 22:03:21,178][15401] Updated weights for policy 0, policy_version 750053 (0.0042) [2024-06-24 22:03:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12288933888. Throughput: 0: 42767.5. Samples: 12289022300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 22:03:25,762][15401] Updated weights for policy 0, policy_version 750063 (0.0032) [2024-06-24 22:03:26,526][15349] Signal inference workers to stop experience collection... (181950 times) [2024-06-24 22:03:26,527][15349] Signal inference workers to resume experience collection... (181950 times) [2024-06-24 22:03:26,545][15401] InferenceWorker_p0-w0: stopping experience collection (181950 times) [2024-06-24 22:03:26,581][15401] InferenceWorker_p0-w0: resuming experience collection (181950 times) [2024-06-24 22:03:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12289163264. Throughput: 0: 42713.9. Samples: 12289278380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-24 22:03:28,785][15401] Updated weights for policy 0, policy_version 750073 (0.0050) [2024-06-24 22:03:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12289343488. Throughput: 0: 42807.6. Samples: 12289536100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:33,402][15132] Avg episode reward: [(0, '0.356')] [2024-06-24 22:03:33,449][15401] Updated weights for policy 0, policy_version 750083 (0.0034) [2024-06-24 22:03:36,487][15401] Updated weights for policy 0, policy_version 750093 (0.0026) [2024-06-24 22:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12289589248. Throughput: 0: 42734.6. Samples: 12289657720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 22:03:41,016][15401] Updated weights for policy 0, policy_version 750103 (0.0031) [2024-06-24 22:03:43,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 12289802240. Throughput: 0: 42776.4. Samples: 12289919880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:43,392][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 22:03:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750110_12289802240.pth... [2024-06-24 22:03:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749485_12279562240.pth [2024-06-24 22:03:44,059][15401] Updated weights for policy 0, policy_version 750113 (0.0037) [2024-06-24 22:03:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12289982464. Throughput: 0: 42838.9. Samples: 12290177960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:48,394][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 22:03:48,642][15401] Updated weights for policy 0, policy_version 750123 (0.0026) [2024-06-24 22:03:51,726][15401] Updated weights for policy 0, policy_version 750133 (0.0046) [2024-06-24 22:03:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42876.1, 300 sec: 42820.5). Total num frames: 12290228224. Throughput: 0: 42769.3. Samples: 12290302980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 22:03:56,155][15401] Updated weights for policy 0, policy_version 750143 (0.0026) [2024-06-24 22:03:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12290441216. Throughput: 0: 42802.2. Samples: 12290566220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:03:58,395][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 22:03:59,201][15401] Updated weights for policy 0, policy_version 750153 (0.0032) [2024-06-24 22:04:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 12290637824. Throughput: 0: 42788.9. Samples: 12290822300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-24 22:04:03,635][15401] Updated weights for policy 0, policy_version 750163 (0.0035) [2024-06-24 22:04:07,250][15401] Updated weights for policy 0, policy_version 750173 (0.0027) [2024-06-24 22:04:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 12290883584. Throughput: 0: 42731.6. Samples: 12290945220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:08,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-24 22:04:11,152][15401] Updated weights for policy 0, policy_version 750183 (0.0031) [2024-06-24 22:04:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12291080192. Throughput: 0: 42833.7. Samples: 12291205900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 22:04:14,744][15401] Updated weights for policy 0, policy_version 750193 (0.0033) [2024-06-24 22:04:18,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12291293184. Throughput: 0: 42944.2. Samples: 12291468600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 22:04:19,075][15401] Updated weights for policy 0, policy_version 750203 (0.0043) [2024-06-24 22:04:22,666][15401] Updated weights for policy 0, policy_version 750213 (0.0037) [2024-06-24 22:04:23,389][15132] Fps is (10 sec: 44237.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12291522560. Throughput: 0: 43025.5. Samples: 12291593860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 22:04:26,685][15401] Updated weights for policy 0, policy_version 750223 (0.0046) [2024-06-24 22:04:28,389][15132] Fps is (10 sec: 44238.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12291735552. Throughput: 0: 42940.5. Samples: 12291852100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 22:04:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 22:04:30,223][15401] Updated weights for policy 0, policy_version 750233 (0.0035) [2024-06-24 22:04:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12291932160. Throughput: 0: 42924.9. Samples: 12292109580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 22:04:34,234][15401] Updated weights for policy 0, policy_version 750243 (0.0038) [2024-06-24 22:04:37,570][15401] Updated weights for policy 0, policy_version 750253 (0.0028) [2024-06-24 22:04:38,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 12292177920. Throughput: 0: 43022.3. Samples: 12292239260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:38,397][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 22:04:42,024][15401] Updated weights for policy 0, policy_version 750263 (0.0034) [2024-06-24 22:04:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12292374528. Throughput: 0: 43037.3. Samples: 12292502900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 22:04:45,286][15401] Updated weights for policy 0, policy_version 750273 (0.0032) [2024-06-24 22:04:48,389][15132] Fps is (10 sec: 40986.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12292587520. Throughput: 0: 42999.7. Samples: 12292757280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:48,390][15132] Avg episode reward: [(0, '0.005')] [2024-06-24 22:04:49,775][15401] Updated weights for policy 0, policy_version 750283 (0.0030) [2024-06-24 22:04:52,780][15401] Updated weights for policy 0, policy_version 750293 (0.0036) [2024-06-24 22:04:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12292833280. Throughput: 0: 43105.3. Samples: 12292884960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 22:04:57,103][15401] Updated weights for policy 0, policy_version 750303 (0.0037) [2024-06-24 22:04:58,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12293029888. Throughput: 0: 43272.6. Samples: 12293153160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:04:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-24 22:05:00,434][15401] Updated weights for policy 0, policy_version 750313 (0.0031) [2024-06-24 22:05:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 12293242880. Throughput: 0: 43026.0. Samples: 12293404760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 22:05:04,645][15401] Updated weights for policy 0, policy_version 750323 (0.0030) [2024-06-24 22:05:07,944][15401] Updated weights for policy 0, policy_version 750333 (0.0029) [2024-06-24 22:05:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12293455872. Throughput: 0: 43094.9. Samples: 12293533240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:08,392][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 22:05:12,057][15401] Updated weights for policy 0, policy_version 750343 (0.0041) [2024-06-24 22:05:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 12293668864. Throughput: 0: 43175.6. Samples: 12293795000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 22:05:15,304][15349] Signal inference workers to stop experience collection... (182000 times) [2024-06-24 22:05:15,304][15349] Signal inference workers to resume experience collection... (182000 times) [2024-06-24 22:05:15,348][15401] InferenceWorker_p0-w0: stopping experience collection (182000 times) [2024-06-24 22:05:15,348][15401] InferenceWorker_p0-w0: resuming experience collection (182000 times) [2024-06-24 22:05:15,441][15401] Updated weights for policy 0, policy_version 750353 (0.0046) [2024-06-24 22:05:18,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 12293881856. Throughput: 0: 43109.8. Samples: 12294049520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 22:05:19,816][15401] Updated weights for policy 0, policy_version 750363 (0.0042) [2024-06-24 22:05:23,240][15401] Updated weights for policy 0, policy_version 750373 (0.0046) [2024-06-24 22:05:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12294111232. Throughput: 0: 43071.0. Samples: 12294177180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 22:05:27,355][15401] Updated weights for policy 0, policy_version 750383 (0.0033) [2024-06-24 22:05:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12294307840. Throughput: 0: 42948.1. Samples: 12294435560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 22:05:30,976][15401] Updated weights for policy 0, policy_version 750393 (0.0038) [2024-06-24 22:05:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12294537216. Throughput: 0: 42903.3. Samples: 12294687940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 22:05:34,982][15401] Updated weights for policy 0, policy_version 750403 (0.0045) [2024-06-24 22:05:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.1, 300 sec: 42709.5). Total num frames: 12294733824. Throughput: 0: 42893.0. Samples: 12294815140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 22:05:38,853][15401] Updated weights for policy 0, policy_version 750413 (0.0025) [2024-06-24 22:05:42,833][15401] Updated weights for policy 0, policy_version 750423 (0.0037) [2024-06-24 22:05:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12294946816. Throughput: 0: 42715.2. Samples: 12295075340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 22:05:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750425_12294963200.pth... [2024-06-24 22:05:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000749799_12284706816.pth [2024-06-24 22:05:46,461][15401] Updated weights for policy 0, policy_version 750433 (0.0034) [2024-06-24 22:05:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12295176192. Throughput: 0: 42699.5. Samples: 12295326240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 22:05:50,722][15401] Updated weights for policy 0, policy_version 750443 (0.0035) [2024-06-24 22:05:53,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.6, 300 sec: 42653.8). Total num frames: 12295372800. Throughput: 0: 42783.1. Samples: 12295458480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:53,393][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 22:05:54,182][15401] Updated weights for policy 0, policy_version 750453 (0.0041) [2024-06-24 22:05:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12295569408. Throughput: 0: 42608.0. Samples: 12295712360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:05:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 22:05:58,433][15401] Updated weights for policy 0, policy_version 750463 (0.0037) [2024-06-24 22:06:01,780][15401] Updated weights for policy 0, policy_version 750473 (0.0030) [2024-06-24 22:06:03,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12295815168. Throughput: 0: 42829.8. Samples: 12295976860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:06:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 22:06:05,898][15401] Updated weights for policy 0, policy_version 750483 (0.0036) [2024-06-24 22:06:08,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 12296028160. Throughput: 0: 42920.0. Samples: 12296108580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:06:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 22:06:09,530][15401] Updated weights for policy 0, policy_version 750493 (0.0043) [2024-06-24 22:06:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12296224768. Throughput: 0: 42746.6. Samples: 12296359160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-24 22:06:13,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 22:06:13,548][15401] Updated weights for policy 0, policy_version 750503 (0.0028) [2024-06-24 22:06:17,355][15401] Updated weights for policy 0, policy_version 750513 (0.0031) [2024-06-24 22:06:18,391][15132] Fps is (10 sec: 42593.5, 60 sec: 42870.7, 300 sec: 42764.8). Total num frames: 12296454144. Throughput: 0: 42743.0. Samples: 12296611420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:18,391][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 22:06:21,063][15401] Updated weights for policy 0, policy_version 750523 (0.0037) [2024-06-24 22:06:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12296667136. Throughput: 0: 42852.2. Samples: 12296743500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 22:06:24,990][15401] Updated weights for policy 0, policy_version 750533 (0.0025) [2024-06-24 22:06:28,389][15132] Fps is (10 sec: 42603.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 12296880128. Throughput: 0: 42799.1. Samples: 12297001300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 22:06:29,061][15401] Updated weights for policy 0, policy_version 750543 (0.0037) [2024-06-24 22:06:32,513][15401] Updated weights for policy 0, policy_version 750553 (0.0036) [2024-06-24 22:06:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 12297060352. Throughput: 0: 43083.1. Samples: 12297264980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:33,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 22:06:36,527][15401] Updated weights for policy 0, policy_version 750563 (0.0041) [2024-06-24 22:06:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12297306112. Throughput: 0: 42997.1. Samples: 12297393240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 22:06:40,037][15401] Updated weights for policy 0, policy_version 750573 (0.0040) [2024-06-24 22:06:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12297519104. Throughput: 0: 42797.7. Samples: 12297638260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 22:06:43,975][15401] Updated weights for policy 0, policy_version 750583 (0.0023) [2024-06-24 22:06:47,786][15401] Updated weights for policy 0, policy_version 750593 (0.0040) [2024-06-24 22:06:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12297715712. Throughput: 0: 42782.7. Samples: 12297902080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:48,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 22:06:48,467][15349] Signal inference workers to stop experience collection... (182050 times) [2024-06-24 22:06:48,467][15349] Signal inference workers to resume experience collection... (182050 times) [2024-06-24 22:06:48,514][15401] InferenceWorker_p0-w0: stopping experience collection (182050 times) [2024-06-24 22:06:48,514][15401] InferenceWorker_p0-w0: resuming experience collection (182050 times) [2024-06-24 22:06:51,385][15401] Updated weights for policy 0, policy_version 750603 (0.0033) [2024-06-24 22:06:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 12297945088. Throughput: 0: 42647.6. Samples: 12298027720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:53,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 22:06:55,514][15401] Updated weights for policy 0, policy_version 750613 (0.0035) [2024-06-24 22:06:58,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43415.8, 300 sec: 42875.7). Total num frames: 12298174464. Throughput: 0: 42736.3. Samples: 12298282400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:06:58,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 22:06:58,962][15401] Updated weights for policy 0, policy_version 750623 (0.0033) [2024-06-24 22:07:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12298354688. Throughput: 0: 42904.2. Samples: 12298542060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:03,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 22:07:03,406][15401] Updated weights for policy 0, policy_version 750633 (0.0049) [2024-06-24 22:07:06,678][15401] Updated weights for policy 0, policy_version 750643 (0.0038) [2024-06-24 22:07:08,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12298567680. Throughput: 0: 42673.8. Samples: 12298663820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 22:07:11,137][15401] Updated weights for policy 0, policy_version 750653 (0.0039) [2024-06-24 22:07:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12298797056. Throughput: 0: 42503.9. Samples: 12298913980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:13,394][15132] Avg episode reward: [(0, '0.723')] [2024-06-24 22:07:14,677][15401] Updated weights for policy 0, policy_version 750663 (0.0032) [2024-06-24 22:07:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42326.1, 300 sec: 42820.6). Total num frames: 12298993664. Throughput: 0: 42444.9. Samples: 12299175000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 22:07:18,683][15401] Updated weights for policy 0, policy_version 750673 (0.0031) [2024-06-24 22:07:22,420][15401] Updated weights for policy 0, policy_version 750683 (0.0035) [2024-06-24 22:07:23,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 12299206656. Throughput: 0: 42445.2. Samples: 12299303380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:23,393][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 22:07:26,178][15401] Updated weights for policy 0, policy_version 750693 (0.0025) [2024-06-24 22:07:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12299452416. Throughput: 0: 42678.7. Samples: 12299558800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 22:07:29,860][15401] Updated weights for policy 0, policy_version 750703 (0.0027) [2024-06-24 22:07:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12299649024. Throughput: 0: 42632.0. Samples: 12299820520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-24 22:07:33,713][15401] Updated weights for policy 0, policy_version 750713 (0.0025) [2024-06-24 22:07:38,151][15401] Updated weights for policy 0, policy_version 750723 (0.0041) [2024-06-24 22:07:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12299862016. Throughput: 0: 42549.7. Samples: 12299942460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 22:07:41,314][15401] Updated weights for policy 0, policy_version 750733 (0.0039) [2024-06-24 22:07:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12300091392. Throughput: 0: 42663.2. Samples: 12300202140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-24 22:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750739_12300107776.pth... [2024-06-24 22:07:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750110_12289802240.pth [2024-06-24 22:07:45,538][15401] Updated weights for policy 0, policy_version 750743 (0.0025) [2024-06-24 22:07:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 12300288000. Throughput: 0: 42631.5. Samples: 12300460480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 22:07:49,085][15401] Updated weights for policy 0, policy_version 750753 (0.0038) [2024-06-24 22:07:53,149][15401] Updated weights for policy 0, policy_version 750763 (0.0033) [2024-06-24 22:07:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12300500992. Throughput: 0: 42739.6. Samples: 12300587100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-24 22:07:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 22:07:56,646][15401] Updated weights for policy 0, policy_version 750773 (0.0031) [2024-06-24 22:07:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42877.0). Total num frames: 12300746752. Throughput: 0: 42950.7. Samples: 12300846760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:07:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 22:08:00,654][15401] Updated weights for policy 0, policy_version 750783 (0.0025) [2024-06-24 22:08:02,338][15349] Signal inference workers to stop experience collection... (182100 times) [2024-06-24 22:08:02,338][15349] Signal inference workers to resume experience collection... (182100 times) [2024-06-24 22:08:02,356][15401] InferenceWorker_p0-w0: stopping experience collection (182100 times) [2024-06-24 22:08:02,356][15401] InferenceWorker_p0-w0: resuming experience collection (182100 times) [2024-06-24 22:08:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12300943360. Throughput: 0: 42877.8. Samples: 12301104500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 22:08:04,163][15401] Updated weights for policy 0, policy_version 750793 (0.0028) [2024-06-24 22:08:08,160][15401] Updated weights for policy 0, policy_version 750803 (0.0035) [2024-06-24 22:08:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12301156352. Throughput: 0: 42889.0. Samples: 12301233280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 22:08:11,655][15401] Updated weights for policy 0, policy_version 750813 (0.0026) [2024-06-24 22:08:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12301369344. Throughput: 0: 42946.0. Samples: 12301491360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 22:08:15,757][15401] Updated weights for policy 0, policy_version 750823 (0.0033) [2024-06-24 22:08:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 12301598720. Throughput: 0: 42884.3. Samples: 12301750320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 22:08:19,257][15401] Updated weights for policy 0, policy_version 750833 (0.0047) [2024-06-24 22:08:23,392][15132] Fps is (10 sec: 42587.3, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 12301795328. Throughput: 0: 43019.8. Samples: 12301878460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:23,393][15132] Avg episode reward: [(0, '0.708')] [2024-06-24 22:08:23,550][15401] Updated weights for policy 0, policy_version 750843 (0.0033) [2024-06-24 22:08:26,978][15401] Updated weights for policy 0, policy_version 750853 (0.0048) [2024-06-24 22:08:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.6, 300 sec: 42931.7). Total num frames: 12302008320. Throughput: 0: 42808.0. Samples: 12302128500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 22:08:31,076][15401] Updated weights for policy 0, policy_version 750863 (0.0032) [2024-06-24 22:08:33,390][15132] Fps is (10 sec: 42609.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12302221312. Throughput: 0: 43018.7. Samples: 12302396320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:33,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-24 22:08:34,557][15401] Updated weights for policy 0, policy_version 750873 (0.0034) [2024-06-24 22:08:38,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 12302417920. Throughput: 0: 42893.1. Samples: 12302517300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-24 22:08:38,934][15401] Updated weights for policy 0, policy_version 750883 (0.0036) [2024-06-24 22:08:42,569][15401] Updated weights for policy 0, policy_version 750893 (0.0027) [2024-06-24 22:08:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 12302663680. Throughput: 0: 42776.8. Samples: 12302771720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:08:46,667][15401] Updated weights for policy 0, policy_version 750903 (0.0035) [2024-06-24 22:08:48,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12302843904. Throughput: 0: 42808.0. Samples: 12303030860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 22:08:50,172][15401] Updated weights for policy 0, policy_version 750913 (0.0039) [2024-06-24 22:08:53,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12303056896. Throughput: 0: 42729.8. Samples: 12303156120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 22:08:54,407][15401] Updated weights for policy 0, policy_version 750923 (0.0036) [2024-06-24 22:08:57,844][15401] Updated weights for policy 0, policy_version 750933 (0.0034) [2024-06-24 22:08:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 12303302656. Throughput: 0: 42742.7. Samples: 12303414780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:08:58,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 22:09:02,056][15401] Updated weights for policy 0, policy_version 750943 (0.0026) [2024-06-24 22:09:03,392][15132] Fps is (10 sec: 44223.9, 60 sec: 42596.3, 300 sec: 42764.6). Total num frames: 12303499264. Throughput: 0: 42776.4. Samples: 12303675380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:03,393][15132] Avg episode reward: [(0, '0.372')] [2024-06-24 22:09:05,658][15401] Updated weights for policy 0, policy_version 750953 (0.0029) [2024-06-24 22:09:08,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12303712256. Throughput: 0: 42675.3. Samples: 12303798740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 22:09:09,578][15401] Updated weights for policy 0, policy_version 750963 (0.0033) [2024-06-24 22:09:13,152][15401] Updated weights for policy 0, policy_version 750973 (0.0039) [2024-06-24 22:09:13,389][15132] Fps is (10 sec: 44249.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12303941632. Throughput: 0: 42853.2. Samples: 12304056900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 22:09:17,718][15401] Updated weights for policy 0, policy_version 750983 (0.0026) [2024-06-24 22:09:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.7, 300 sec: 42764.6). Total num frames: 12304138240. Throughput: 0: 42489.2. Samples: 12304308440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:18,393][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 22:09:20,794][15349] Signal inference workers to stop experience collection... (182150 times) [2024-06-24 22:09:20,844][15401] InferenceWorker_p0-w0: stopping experience collection (182150 times) [2024-06-24 22:09:20,853][15349] Signal inference workers to resume experience collection... (182150 times) [2024-06-24 22:09:20,861][15401] InferenceWorker_p0-w0: resuming experience collection (182150 times) [2024-06-24 22:09:20,994][15401] Updated weights for policy 0, policy_version 750993 (0.0029) [2024-06-24 22:09:23,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 12304367616. Throughput: 0: 42601.8. Samples: 12304434480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:23,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 22:09:25,331][15401] Updated weights for policy 0, policy_version 751003 (0.0039) [2024-06-24 22:09:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12304564224. Throughput: 0: 42808.2. Samples: 12304698080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:28,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 22:09:28,702][15401] Updated weights for policy 0, policy_version 751013 (0.0031) [2024-06-24 22:09:32,954][15401] Updated weights for policy 0, policy_version 751023 (0.0036) [2024-06-24 22:09:33,392][15132] Fps is (10 sec: 40960.3, 60 sec: 42596.7, 300 sec: 42710.1). Total num frames: 12304777216. Throughput: 0: 42566.1. Samples: 12304946440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:33,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 22:09:36,302][15401] Updated weights for policy 0, policy_version 751033 (0.0038) [2024-06-24 22:09:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12304990208. Throughput: 0: 42580.9. Samples: 12305072260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-24 22:09:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 22:09:40,959][15401] Updated weights for policy 0, policy_version 751043 (0.0029) [2024-06-24 22:09:43,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12305219584. Throughput: 0: 42544.7. Samples: 12305329300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:09:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 22:09:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751051_12305219584.pth... [2024-06-24 22:09:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750425_12294963200.pth [2024-06-24 22:09:44,032][15401] Updated weights for policy 0, policy_version 751053 (0.0034) [2024-06-24 22:09:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 12305383424. Throughput: 0: 42477.4. Samples: 12305586740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:09:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 22:09:48,797][15401] Updated weights for policy 0, policy_version 751063 (0.0032) [2024-06-24 22:09:51,865][15401] Updated weights for policy 0, policy_version 751073 (0.0048) [2024-06-24 22:09:53,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12305629184. Throughput: 0: 42356.4. Samples: 12305704880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:09:53,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 22:09:56,324][15401] Updated weights for policy 0, policy_version 751083 (0.0040) [2024-06-24 22:09:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 12305858560. Throughput: 0: 42519.4. Samples: 12305970280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:09:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 22:09:59,457][15401] Updated weights for policy 0, policy_version 751093 (0.0044) [2024-06-24 22:10:03,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42054.4, 300 sec: 42598.8). Total num frames: 12306022400. Throughput: 0: 42670.4. Samples: 12306228500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 22:10:04,010][15401] Updated weights for policy 0, policy_version 751103 (0.0038) [2024-06-24 22:10:06,972][15401] Updated weights for policy 0, policy_version 751113 (0.0040) [2024-06-24 22:10:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12306284544. Throughput: 0: 42460.5. Samples: 12306345100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 22:10:11,641][15401] Updated weights for policy 0, policy_version 751123 (0.0026) [2024-06-24 22:10:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12306481152. Throughput: 0: 42454.2. Samples: 12306608520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 22:10:14,597][15401] Updated weights for policy 0, policy_version 751133 (0.0023) [2024-06-24 22:10:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 12306677760. Throughput: 0: 42722.2. Samples: 12306868840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 22:10:19,264][15401] Updated weights for policy 0, policy_version 751143 (0.0036) [2024-06-24 22:10:22,235][15401] Updated weights for policy 0, policy_version 751153 (0.0029) [2024-06-24 22:10:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 12306939904. Throughput: 0: 42658.2. Samples: 12306991880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:23,396][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 22:10:26,886][15401] Updated weights for policy 0, policy_version 751163 (0.0041) [2024-06-24 22:10:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12307136512. Throughput: 0: 42777.4. Samples: 12307254280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 22:10:29,940][15401] Updated weights for policy 0, policy_version 751173 (0.0026) [2024-06-24 22:10:33,391][15132] Fps is (10 sec: 36040.1, 60 sec: 42053.0, 300 sec: 42598.2). Total num frames: 12307300352. Throughput: 0: 42890.9. Samples: 12307516880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:33,391][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 22:10:34,694][15401] Updated weights for policy 0, policy_version 751183 (0.0034) [2024-06-24 22:10:35,727][15349] Signal inference workers to stop experience collection... (182200 times) [2024-06-24 22:10:35,772][15401] InferenceWorker_p0-w0: stopping experience collection (182200 times) [2024-06-24 22:10:35,782][15349] Signal inference workers to resume experience collection... (182200 times) [2024-06-24 22:10:35,795][15401] InferenceWorker_p0-w0: resuming experience collection (182200 times) [2024-06-24 22:10:37,537][15401] Updated weights for policy 0, policy_version 751193 (0.0035) [2024-06-24 22:10:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 12307578880. Throughput: 0: 42815.5. Samples: 12307631580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:38,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 22:10:42,109][15401] Updated weights for policy 0, policy_version 751203 (0.0045) [2024-06-24 22:10:43,389][15132] Fps is (10 sec: 47519.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12307775488. Throughput: 0: 42853.0. Samples: 12307898660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 22:10:45,044][15401] Updated weights for policy 0, policy_version 751213 (0.0037) [2024-06-24 22:10:48,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 12307955712. Throughput: 0: 42838.2. Samples: 12308156220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 22:10:50,071][15401] Updated weights for policy 0, policy_version 751223 (0.0027) [2024-06-24 22:10:52,589][15401] Updated weights for policy 0, policy_version 751233 (0.0031) [2024-06-24 22:10:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12308217856. Throughput: 0: 42939.6. Samples: 12308277380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 22:10:57,587][15401] Updated weights for policy 0, policy_version 751243 (0.0022) [2024-06-24 22:10:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 12308398080. Throughput: 0: 43049.4. Samples: 12308545740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:10:58,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-24 22:11:00,123][15401] Updated weights for policy 0, policy_version 751253 (0.0038) [2024-06-24 22:11:03,396][15132] Fps is (10 sec: 39296.3, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 12308611072. Throughput: 0: 42911.7. Samples: 12308800140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:11:03,396][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 22:11:05,125][15401] Updated weights for policy 0, policy_version 751263 (0.0030) [2024-06-24 22:11:07,858][15401] Updated weights for policy 0, policy_version 751273 (0.0034) [2024-06-24 22:11:08,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12308873216. Throughput: 0: 43112.0. Samples: 12308931920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:11:08,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 22:11:12,723][15401] Updated weights for policy 0, policy_version 751283 (0.0029) [2024-06-24 22:11:13,392][15132] Fps is (10 sec: 42615.6, 60 sec: 42596.7, 300 sec: 42653.8). Total num frames: 12309037056. Throughput: 0: 43102.1. Samples: 12309193980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:11:13,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-24 22:11:15,512][15401] Updated weights for policy 0, policy_version 751293 (0.0040) [2024-06-24 22:11:18,390][15132] Fps is (10 sec: 39317.6, 60 sec: 43143.9, 300 sec: 42709.4). Total num frames: 12309266432. Throughput: 0: 42899.8. Samples: 12309447360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:11:18,391][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 22:11:20,369][15401] Updated weights for policy 0, policy_version 751303 (0.0046) [2024-06-24 22:11:23,274][15401] Updated weights for policy 0, policy_version 751313 (0.0030) [2024-06-24 22:11:23,390][15132] Fps is (10 sec: 47524.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12309512192. Throughput: 0: 43191.6. Samples: 12309575100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-24 22:11:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 22:11:27,907][15401] Updated weights for policy 0, policy_version 751323 (0.0044) [2024-06-24 22:11:28,389][15132] Fps is (10 sec: 42602.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12309692416. Throughput: 0: 43122.6. Samples: 12309839180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 22:11:30,922][15401] Updated weights for policy 0, policy_version 751333 (0.0039) [2024-06-24 22:11:33,390][15132] Fps is (10 sec: 40956.8, 60 sec: 43691.0, 300 sec: 42764.9). Total num frames: 12309921792. Throughput: 0: 43036.1. Samples: 12310092880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:33,391][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:11:35,283][15401] Updated weights for policy 0, policy_version 751343 (0.0031) [2024-06-24 22:11:38,338][15401] Updated weights for policy 0, policy_version 751353 (0.0042) [2024-06-24 22:11:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12310167552. Throughput: 0: 43194.7. Samples: 12310221140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:11:42,899][15401] Updated weights for policy 0, policy_version 751363 (0.0045) [2024-06-24 22:11:43,390][15132] Fps is (10 sec: 42601.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12310347776. Throughput: 0: 43079.8. Samples: 12310484340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 22:11:43,581][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751365_12310364160.pth... [2024-06-24 22:11:43,645][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000750739_12300107776.pth [2024-06-24 22:11:45,822][15401] Updated weights for policy 0, policy_version 751373 (0.0034) [2024-06-24 22:11:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12310560768. Throughput: 0: 42996.0. Samples: 12310734680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-24 22:11:50,702][15401] Updated weights for policy 0, policy_version 751383 (0.0040) [2024-06-24 22:11:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12310790144. Throughput: 0: 42993.8. Samples: 12310866640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 22:11:53,626][15401] Updated weights for policy 0, policy_version 751393 (0.0046) [2024-06-24 22:11:56,976][15349] Signal inference workers to stop experience collection... (182250 times) [2024-06-24 22:11:56,976][15349] Signal inference workers to resume experience collection... (182250 times) [2024-06-24 22:11:57,022][15401] InferenceWorker_p0-w0: stopping experience collection (182250 times) [2024-06-24 22:11:57,022][15401] InferenceWorker_p0-w0: resuming experience collection (182250 times) [2024-06-24 22:11:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12310970368. Throughput: 0: 42929.5. Samples: 12311125700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:11:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 22:11:58,410][15401] Updated weights for policy 0, policy_version 751403 (0.0026) [2024-06-24 22:12:01,529][15401] Updated weights for policy 0, policy_version 751413 (0.0032) [2024-06-24 22:12:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43422.3, 300 sec: 42876.1). Total num frames: 12311216128. Throughput: 0: 42856.9. Samples: 12311375880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 22:12:06,029][15401] Updated weights for policy 0, policy_version 751423 (0.0034) [2024-06-24 22:12:08,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12311445504. Throughput: 0: 42962.6. Samples: 12311508420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 22:12:09,106][15401] Updated weights for policy 0, policy_version 751433 (0.0039) [2024-06-24 22:12:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12311625728. Throughput: 0: 42811.1. Samples: 12311765680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 22:12:13,449][15401] Updated weights for policy 0, policy_version 751443 (0.0043) [2024-06-24 22:12:16,765][15401] Updated weights for policy 0, policy_version 751453 (0.0038) [2024-06-24 22:12:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43145.3, 300 sec: 42876.5). Total num frames: 12311855104. Throughput: 0: 42816.0. Samples: 12312019560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 22:12:21,467][15401] Updated weights for policy 0, policy_version 751463 (0.0033) [2024-06-24 22:12:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12312084480. Throughput: 0: 42915.5. Samples: 12312152340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 22:12:24,248][15401] Updated weights for policy 0, policy_version 751473 (0.0032) [2024-06-24 22:12:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12312264704. Throughput: 0: 42628.0. Samples: 12312402600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:28,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 22:12:29,165][15401] Updated weights for policy 0, policy_version 751483 (0.0030) [2024-06-24 22:12:31,841][15401] Updated weights for policy 0, policy_version 751493 (0.0053) [2024-06-24 22:12:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43145.2, 300 sec: 42876.1). Total num frames: 12312510464. Throughput: 0: 42777.4. Samples: 12312659660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 22:12:36,978][15401] Updated weights for policy 0, policy_version 751503 (0.0044) [2024-06-24 22:12:38,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 12312723456. Throughput: 0: 42828.3. Samples: 12312794020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:38,401][15132] Avg episode reward: [(0, '0.603')] [2024-06-24 22:12:39,533][15401] Updated weights for policy 0, policy_version 751513 (0.0042) [2024-06-24 22:12:43,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12312903680. Throughput: 0: 42574.6. Samples: 12313041560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 22:12:44,566][15401] Updated weights for policy 0, policy_version 751523 (0.0040) [2024-06-24 22:12:47,425][15401] Updated weights for policy 0, policy_version 751533 (0.0043) [2024-06-24 22:12:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12313133056. Throughput: 0: 42780.5. Samples: 12313301000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:12:52,105][15401] Updated weights for policy 0, policy_version 751543 (0.0034) [2024-06-24 22:12:53,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12313362432. Throughput: 0: 42881.4. Samples: 12313438180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:53,392][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 22:12:54,949][15401] Updated weights for policy 0, policy_version 751553 (0.0042) [2024-06-24 22:12:58,393][15132] Fps is (10 sec: 42584.0, 60 sec: 43142.1, 300 sec: 42764.5). Total num frames: 12313559040. Throughput: 0: 42738.2. Samples: 12313689040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:12:58,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 22:12:59,785][15401] Updated weights for policy 0, policy_version 751563 (0.0043) [2024-06-24 22:13:02,570][15401] Updated weights for policy 0, policy_version 751573 (0.0034) [2024-06-24 22:13:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12313772032. Throughput: 0: 42649.2. Samples: 12313938780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 22:13:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 22:13:07,452][15401] Updated weights for policy 0, policy_version 751583 (0.0043) [2024-06-24 22:13:08,389][15132] Fps is (10 sec: 42613.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 12313985024. Throughput: 0: 42585.0. Samples: 12314068660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:08,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 22:13:10,400][15401] Updated weights for policy 0, policy_version 751593 (0.0030) [2024-06-24 22:13:13,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 12314214400. Throughput: 0: 42736.1. Samples: 12314326000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:13,397][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 22:13:14,898][15401] Updated weights for policy 0, policy_version 751603 (0.0036) [2024-06-24 22:13:18,164][15401] Updated weights for policy 0, policy_version 751613 (0.0042) [2024-06-24 22:13:18,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42867.0, 300 sec: 42820.0). Total num frames: 12314427392. Throughput: 0: 42658.9. Samples: 12314579580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:18,397][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 22:13:22,388][15401] Updated weights for policy 0, policy_version 751623 (0.0035) [2024-06-24 22:13:22,897][15349] Signal inference workers to stop experience collection... (182300 times) [2024-06-24 22:13:22,898][15349] Signal inference workers to resume experience collection... (182300 times) [2024-06-24 22:13:22,907][15401] InferenceWorker_p0-w0: stopping experience collection (182300 times) [2024-06-24 22:13:22,936][15401] InferenceWorker_p0-w0: resuming experience collection (182300 times) [2024-06-24 22:13:23,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12314640384. Throughput: 0: 42616.0. Samples: 12314711640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 22:13:25,827][15401] Updated weights for policy 0, policy_version 751633 (0.0041) [2024-06-24 22:13:28,389][15132] Fps is (10 sec: 40985.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12314836992. Throughput: 0: 42917.4. Samples: 12314972840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:28,396][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 22:13:30,176][15401] Updated weights for policy 0, policy_version 751643 (0.0039) [2024-06-24 22:13:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12315066368. Throughput: 0: 42719.0. Samples: 12315223360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 22:13:33,478][15401] Updated weights for policy 0, policy_version 751653 (0.0041) [2024-06-24 22:13:37,770][15401] Updated weights for policy 0, policy_version 751663 (0.0032) [2024-06-24 22:13:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 12315262976. Throughput: 0: 42692.6. Samples: 12315359240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 22:13:41,010][15401] Updated weights for policy 0, policy_version 751673 (0.0046) [2024-06-24 22:13:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12315492352. Throughput: 0: 42772.9. Samples: 12315613680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-24 22:13:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751678_12315492352.pth... [2024-06-24 22:13:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751051_12305219584.pth [2024-06-24 22:13:45,263][15401] Updated weights for policy 0, policy_version 751683 (0.0035) [2024-06-24 22:13:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12315705344. Throughput: 0: 42883.6. Samples: 12315868540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 22:13:48,733][15401] Updated weights for policy 0, policy_version 751693 (0.0043) [2024-06-24 22:13:53,115][15401] Updated weights for policy 0, policy_version 751703 (0.0040) [2024-06-24 22:13:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 12315901952. Throughput: 0: 42944.0. Samples: 12316001140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 22:13:56,355][15401] Updated weights for policy 0, policy_version 751713 (0.0025) [2024-06-24 22:13:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.8, 300 sec: 42821.0). Total num frames: 12316131328. Throughput: 0: 42877.2. Samples: 12316255200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:13:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:14:00,592][15401] Updated weights for policy 0, policy_version 751723 (0.0026) [2024-06-24 22:14:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12316344320. Throughput: 0: 42952.2. Samples: 12316512160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:03,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 22:14:03,985][15401] Updated weights for policy 0, policy_version 751733 (0.0037) [2024-06-24 22:14:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12316540928. Throughput: 0: 42881.4. Samples: 12316641300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:08,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 22:14:08,517][15401] Updated weights for policy 0, policy_version 751743 (0.0036) [2024-06-24 22:14:11,608][15401] Updated weights for policy 0, policy_version 751753 (0.0033) [2024-06-24 22:14:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42876.0, 300 sec: 42876.4). Total num frames: 12316786688. Throughput: 0: 42669.3. Samples: 12316892960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:13,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-24 22:14:16,079][15401] Updated weights for policy 0, policy_version 751763 (0.0025) [2024-06-24 22:14:18,391][15132] Fps is (10 sec: 45870.4, 60 sec: 42875.2, 300 sec: 42820.8). Total num frames: 12316999680. Throughput: 0: 42873.7. Samples: 12317152720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:18,391][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 22:14:19,521][15401] Updated weights for policy 0, policy_version 751773 (0.0039) [2024-06-24 22:14:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 12317163520. Throughput: 0: 42645.4. Samples: 12317278280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 22:14:23,792][15401] Updated weights for policy 0, policy_version 751783 (0.0044) [2024-06-24 22:14:27,313][15401] Updated weights for policy 0, policy_version 751793 (0.0031) [2024-06-24 22:14:28,389][15132] Fps is (10 sec: 42602.7, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 12317425664. Throughput: 0: 42689.9. Samples: 12317534720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:14:31,595][15401] Updated weights for policy 0, policy_version 751803 (0.0034) [2024-06-24 22:14:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12317622272. Throughput: 0: 42520.5. Samples: 12317782060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:33,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 22:14:34,985][15401] Updated weights for policy 0, policy_version 751813 (0.0024) [2024-06-24 22:14:38,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12317802496. Throughput: 0: 42431.8. Samples: 12317910580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-24 22:14:39,435][15401] Updated weights for policy 0, policy_version 751823 (0.0040) [2024-06-24 22:14:42,682][15401] Updated weights for policy 0, policy_version 751833 (0.0043) [2024-06-24 22:14:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 12318048256. Throughput: 0: 42537.9. Samples: 12318169400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 22:14:46,833][15349] Signal inference workers to stop experience collection... (182350 times) [2024-06-24 22:14:46,834][15349] Signal inference workers to resume experience collection... (182350 times) [2024-06-24 22:14:46,881][15401] InferenceWorker_p0-w0: stopping experience collection (182350 times) [2024-06-24 22:14:46,881][15401] InferenceWorker_p0-w0: resuming experience collection (182350 times) [2024-06-24 22:14:46,977][15401] Updated weights for policy 0, policy_version 751843 (0.0050) [2024-06-24 22:14:48,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 12318277632. Throughput: 0: 42376.3. Samples: 12318419100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-24 22:14:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 22:14:50,582][15401] Updated weights for policy 0, policy_version 751853 (0.0041) [2024-06-24 22:14:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12318457856. Throughput: 0: 42322.6. Samples: 12318545820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:14:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 22:14:54,513][15401] Updated weights for policy 0, policy_version 751863 (0.0033) [2024-06-24 22:14:58,234][15401] Updated weights for policy 0, policy_version 751873 (0.0039) [2024-06-24 22:14:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 12318687232. Throughput: 0: 42367.6. Samples: 12318799500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:14:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 22:15:02,356][15401] Updated weights for policy 0, policy_version 751883 (0.0034) [2024-06-24 22:15:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12318883840. Throughput: 0: 42346.2. Samples: 12319058260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 22:15:06,153][15401] Updated weights for policy 0, policy_version 751893 (0.0034) [2024-06-24 22:15:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12319113216. Throughput: 0: 42480.3. Samples: 12319189900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:08,396][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 22:15:10,245][15401] Updated weights for policy 0, policy_version 751903 (0.0027) [2024-06-24 22:15:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 12319309824. Throughput: 0: 42431.1. Samples: 12319444120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 22:15:13,588][15401] Updated weights for policy 0, policy_version 751913 (0.0024) [2024-06-24 22:15:17,636][15401] Updated weights for policy 0, policy_version 751923 (0.0038) [2024-06-24 22:15:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42053.0, 300 sec: 42653.9). Total num frames: 12319522816. Throughput: 0: 42747.2. Samples: 12319705580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:15:21,111][15401] Updated weights for policy 0, policy_version 751933 (0.0030) [2024-06-24 22:15:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 12319768576. Throughput: 0: 42701.9. Samples: 12319832160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 22:15:25,305][15401] Updated weights for policy 0, policy_version 751943 (0.0047) [2024-06-24 22:15:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42931.8). Total num frames: 12319965184. Throughput: 0: 42651.9. Samples: 12320088740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-24 22:15:28,673][15401] Updated weights for policy 0, policy_version 751953 (0.0043) [2024-06-24 22:15:33,205][15401] Updated weights for policy 0, policy_version 751963 (0.0038) [2024-06-24 22:15:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.0, 300 sec: 42654.3). Total num frames: 12320161792. Throughput: 0: 42978.7. Samples: 12320353140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 22:15:36,167][15401] Updated weights for policy 0, policy_version 751973 (0.0031) [2024-06-24 22:15:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 12320407552. Throughput: 0: 42729.8. Samples: 12320468660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 22:15:40,809][15401] Updated weights for policy 0, policy_version 751983 (0.0025) [2024-06-24 22:15:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.7, 300 sec: 42875.2). Total num frames: 12320604160. Throughput: 0: 42817.8. Samples: 12320726580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:43,396][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 22:15:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751990_12320604160.pth... [2024-06-24 22:15:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751365_12310364160.pth [2024-06-24 22:15:44,270][15401] Updated weights for policy 0, policy_version 751993 (0.0032) [2024-06-24 22:15:48,295][15401] Updated weights for policy 0, policy_version 752003 (0.0039) [2024-06-24 22:15:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12320817152. Throughput: 0: 42726.7. Samples: 12320980960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:48,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-24 22:15:51,910][15401] Updated weights for policy 0, policy_version 752013 (0.0033) [2024-06-24 22:15:53,390][15132] Fps is (10 sec: 44265.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12321046528. Throughput: 0: 42648.0. Samples: 12321109060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 22:15:55,891][15401] Updated weights for policy 0, policy_version 752023 (0.0032) [2024-06-24 22:15:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42766.0). Total num frames: 12321226752. Throughput: 0: 42660.9. Samples: 12321363860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:15:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 22:15:59,526][15401] Updated weights for policy 0, policy_version 752033 (0.0027) [2024-06-24 22:16:03,366][15401] Updated weights for policy 0, policy_version 752043 (0.0037) [2024-06-24 22:16:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12321472512. Throughput: 0: 42624.3. Samples: 12321623680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 22:16:07,046][15401] Updated weights for policy 0, policy_version 752053 (0.0032) [2024-06-24 22:16:08,396][15132] Fps is (10 sec: 47483.4, 60 sec: 43140.1, 300 sec: 42931.1). Total num frames: 12321701888. Throughput: 0: 42739.8. Samples: 12321755720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:08,396][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 22:16:11,150][15401] Updated weights for policy 0, policy_version 752063 (0.0038) [2024-06-24 22:16:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 12321865728. Throughput: 0: 42638.6. Samples: 12322007480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 22:16:15,053][15401] Updated weights for policy 0, policy_version 752073 (0.0034) [2024-06-24 22:16:18,389][15132] Fps is (10 sec: 39346.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 12322095104. Throughput: 0: 42254.2. Samples: 12322254580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 22:16:19,143][15401] Updated weights for policy 0, policy_version 752083 (0.0032) [2024-06-24 22:16:21,661][15349] Signal inference workers to stop experience collection... (182400 times) [2024-06-24 22:16:21,662][15349] Signal inference workers to resume experience collection... (182400 times) [2024-06-24 22:16:21,684][15401] InferenceWorker_p0-w0: stopping experience collection (182400 times) [2024-06-24 22:16:21,684][15401] InferenceWorker_p0-w0: resuming experience collection (182400 times) [2024-06-24 22:16:22,760][15401] Updated weights for policy 0, policy_version 752093 (0.0037) [2024-06-24 22:16:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12322324480. Throughput: 0: 42625.1. Samples: 12322386800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 22:16:26,856][15401] Updated weights for policy 0, policy_version 752103 (0.0037) [2024-06-24 22:16:28,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.7). Total num frames: 12322504704. Throughput: 0: 42493.2. Samples: 12322638600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:28,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-24 22:16:30,367][15401] Updated weights for policy 0, policy_version 752113 (0.0036) [2024-06-24 22:16:33,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12322734080. Throughput: 0: 42527.5. Samples: 12322894700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-24 22:16:33,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 22:16:34,348][15401] Updated weights for policy 0, policy_version 752123 (0.0031) [2024-06-24 22:16:37,986][15401] Updated weights for policy 0, policy_version 752133 (0.0037) [2024-06-24 22:16:38,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12322947072. Throughput: 0: 42647.6. Samples: 12323028200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:16:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 22:16:41,974][15401] Updated weights for policy 0, policy_version 752143 (0.0027) [2024-06-24 22:16:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 12323160064. Throughput: 0: 42628.9. Samples: 12323282160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:16:43,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-24 22:16:45,545][15401] Updated weights for policy 0, policy_version 752153 (0.0028) [2024-06-24 22:16:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12323389440. Throughput: 0: 42503.2. Samples: 12323536320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:16:48,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 22:16:49,612][15401] Updated weights for policy 0, policy_version 752163 (0.0028) [2024-06-24 22:16:53,275][15401] Updated weights for policy 0, policy_version 752173 (0.0036) [2024-06-24 22:16:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12323602432. Throughput: 0: 42545.9. Samples: 12323670020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:16:53,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 22:16:57,168][15401] Updated weights for policy 0, policy_version 752183 (0.0037) [2024-06-24 22:16:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12323799040. Throughput: 0: 42582.2. Samples: 12323923680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:16:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 22:17:00,902][15401] Updated weights for policy 0, policy_version 752193 (0.0046) [2024-06-24 22:17:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12324012032. Throughput: 0: 42785.3. Samples: 12324179920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 22:17:04,739][15401] Updated weights for policy 0, policy_version 752203 (0.0028) [2024-06-24 22:17:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42056.6, 300 sec: 42709.5). Total num frames: 12324225024. Throughput: 0: 42637.4. Samples: 12324305480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 22:17:08,918][15401] Updated weights for policy 0, policy_version 752213 (0.0045) [2024-06-24 22:17:12,344][15401] Updated weights for policy 0, policy_version 752223 (0.0047) [2024-06-24 22:17:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12324421632. Throughput: 0: 42651.6. Samples: 12324557820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 22:17:16,691][15401] Updated weights for policy 0, policy_version 752233 (0.0028) [2024-06-24 22:17:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12324634624. Throughput: 0: 42707.2. Samples: 12324816520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-24 22:17:20,539][15401] Updated weights for policy 0, policy_version 752243 (0.0026) [2024-06-24 22:17:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12324864000. Throughput: 0: 42581.7. Samples: 12324944380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 22:17:24,217][15401] Updated weights for policy 0, policy_version 752253 (0.0037) [2024-06-24 22:17:28,099][15401] Updated weights for policy 0, policy_version 752263 (0.0033) [2024-06-24 22:17:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 12325076992. Throughput: 0: 42480.3. Samples: 12325193780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 22:17:31,814][15401] Updated weights for policy 0, policy_version 752273 (0.0034) [2024-06-24 22:17:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42487.6). Total num frames: 12325257216. Throughput: 0: 42661.6. Samples: 12325456100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:33,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 22:17:35,625][15401] Updated weights for policy 0, policy_version 752283 (0.0036) [2024-06-24 22:17:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12325519360. Throughput: 0: 42471.6. Samples: 12325581240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 22:17:40,182][15401] Updated weights for policy 0, policy_version 752293 (0.0030) [2024-06-24 22:17:41,880][15349] Signal inference workers to stop experience collection... (182450 times) [2024-06-24 22:17:41,881][15349] Signal inference workers to resume experience collection... (182450 times) [2024-06-24 22:17:41,932][15401] InferenceWorker_p0-w0: stopping experience collection (182450 times) [2024-06-24 22:17:41,932][15401] InferenceWorker_p0-w0: resuming experience collection (182450 times) [2024-06-24 22:17:43,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12325715968. Throughput: 0: 42412.5. Samples: 12325832240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 22:17:43,423][15401] Updated weights for policy 0, policy_version 752303 (0.0046) [2024-06-24 22:17:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752303_12325732352.pth... [2024-06-24 22:17:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751678_12315492352.pth [2024-06-24 22:17:47,821][15401] Updated weights for policy 0, policy_version 752313 (0.0050) [2024-06-24 22:17:48,392][15132] Fps is (10 sec: 37674.2, 60 sec: 41777.5, 300 sec: 42487.3). Total num frames: 12325896192. Throughput: 0: 42523.1. Samples: 12326093560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:48,393][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 22:17:51,274][15401] Updated weights for policy 0, policy_version 752323 (0.0052) [2024-06-24 22:17:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42596.7, 300 sec: 42709.6). Total num frames: 12326158336. Throughput: 0: 42466.7. Samples: 12326216580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:53,401][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 22:17:55,528][15401] Updated weights for policy 0, policy_version 752333 (0.0042) [2024-06-24 22:17:58,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12326354944. Throughput: 0: 42543.5. Samples: 12326472280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:17:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 22:17:59,227][15401] Updated weights for policy 0, policy_version 752343 (0.0024) [2024-06-24 22:18:03,106][15401] Updated weights for policy 0, policy_version 752353 (0.0036) [2024-06-24 22:18:03,396][15132] Fps is (10 sec: 39305.8, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 12326551552. Throughput: 0: 42555.6. Samples: 12326731800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:18:03,397][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 22:18:06,782][15401] Updated weights for policy 0, policy_version 752363 (0.0040) [2024-06-24 22:18:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 12326780928. Throughput: 0: 42559.6. Samples: 12326859560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:18:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 22:18:10,724][15401] Updated weights for policy 0, policy_version 752373 (0.0040) [2024-06-24 22:18:13,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 12326993920. Throughput: 0: 42739.5. Samples: 12327117060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:18:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 22:18:14,257][15401] Updated weights for policy 0, policy_version 752383 (0.0035) [2024-06-24 22:18:18,323][15401] Updated weights for policy 0, policy_version 752393 (0.0038) [2024-06-24 22:18:18,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 12327206912. Throughput: 0: 42515.7. Samples: 12327369400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-24 22:18:18,392][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 22:18:21,864][15401] Updated weights for policy 0, policy_version 752403 (0.0039) [2024-06-24 22:18:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12327419904. Throughput: 0: 42641.4. Samples: 12327500100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:23,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 22:18:25,900][15401] Updated weights for policy 0, policy_version 752413 (0.0034) [2024-06-24 22:18:28,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12327649280. Throughput: 0: 42745.7. Samples: 12327755800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 22:18:29,895][15401] Updated weights for policy 0, policy_version 752423 (0.0028) [2024-06-24 22:18:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12327845888. Throughput: 0: 42648.5. Samples: 12328012640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:33,399][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:18:33,754][15401] Updated weights for policy 0, policy_version 752433 (0.0036) [2024-06-24 22:18:37,516][15401] Updated weights for policy 0, policy_version 752443 (0.0034) [2024-06-24 22:18:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12328058880. Throughput: 0: 42713.7. Samples: 12328138600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 22:18:41,502][15401] Updated weights for policy 0, policy_version 752453 (0.0034) [2024-06-24 22:18:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12328288256. Throughput: 0: 42705.8. Samples: 12328394040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 22:18:45,035][15401] Updated weights for policy 0, policy_version 752463 (0.0031) [2024-06-24 22:18:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 12328484864. Throughput: 0: 42773.2. Samples: 12328656320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-24 22:18:49,029][15401] Updated weights for policy 0, policy_version 752473 (0.0041) [2024-06-24 22:18:52,623][15401] Updated weights for policy 0, policy_version 752483 (0.0029) [2024-06-24 22:18:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 12328681472. Throughput: 0: 42659.8. Samples: 12328779260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 22:18:56,506][15401] Updated weights for policy 0, policy_version 752493 (0.0031) [2024-06-24 22:18:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 12328894464. Throughput: 0: 42759.1. Samples: 12329041220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:18:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 22:19:00,214][15401] Updated weights for policy 0, policy_version 752503 (0.0027) [2024-06-24 22:19:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42874.4, 300 sec: 42653.6). Total num frames: 12329123840. Throughput: 0: 42734.2. Samples: 12329292440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:03,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 22:19:04,316][15401] Updated weights for policy 0, policy_version 752513 (0.0039) [2024-06-24 22:19:05,062][15349] Signal inference workers to stop experience collection... (182500 times) [2024-06-24 22:19:05,063][15349] Signal inference workers to resume experience collection... (182500 times) [2024-06-24 22:19:05,087][15401] InferenceWorker_p0-w0: stopping experience collection (182500 times) [2024-06-24 22:19:05,087][15401] InferenceWorker_p0-w0: resuming experience collection (182500 times) [2024-06-24 22:19:08,376][15401] Updated weights for policy 0, policy_version 752523 (0.0033) [2024-06-24 22:19:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12329336832. Throughput: 0: 42655.5. Samples: 12329419600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 22:19:11,894][15401] Updated weights for policy 0, policy_version 752533 (0.0036) [2024-06-24 22:19:13,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42543.0). Total num frames: 12329549824. Throughput: 0: 42712.1. Samples: 12329677840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 22:19:15,847][15401] Updated weights for policy 0, policy_version 752543 (0.0038) [2024-06-24 22:19:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12329746432. Throughput: 0: 42673.5. Samples: 12329932940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:18,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 22:19:19,862][15401] Updated weights for policy 0, policy_version 752553 (0.0050) [2024-06-24 22:19:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12329975808. Throughput: 0: 42685.5. Samples: 12330059440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 22:19:23,419][15401] Updated weights for policy 0, policy_version 752563 (0.0047) [2024-06-24 22:19:27,483][15401] Updated weights for policy 0, policy_version 752573 (0.0038) [2024-06-24 22:19:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 12330205184. Throughput: 0: 42728.1. Samples: 12330316800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 22:19:31,213][15401] Updated weights for policy 0, policy_version 752583 (0.0032) [2024-06-24 22:19:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12330401792. Throughput: 0: 42611.6. Samples: 12330573840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 22:19:35,112][15401] Updated weights for policy 0, policy_version 752593 (0.0042) [2024-06-24 22:19:38,390][15132] Fps is (10 sec: 40957.9, 60 sec: 42598.2, 300 sec: 42598.3). Total num frames: 12330614784. Throughput: 0: 42661.0. Samples: 12330699020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:38,391][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 22:19:38,823][15401] Updated weights for policy 0, policy_version 752603 (0.0027) [2024-06-24 22:19:42,676][15401] Updated weights for policy 0, policy_version 752613 (0.0040) [2024-06-24 22:19:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12330827776. Throughput: 0: 42584.6. Samples: 12330957520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:43,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 22:19:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752615_12330844160.pth... [2024-06-24 22:19:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000751990_12320604160.pth [2024-06-24 22:19:46,378][15401] Updated weights for policy 0, policy_version 752623 (0.0043) [2024-06-24 22:19:48,390][15132] Fps is (10 sec: 42600.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12331040768. Throughput: 0: 42728.9. Samples: 12331215140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 22:19:50,251][15401] Updated weights for policy 0, policy_version 752633 (0.0031) [2024-06-24 22:19:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12331253760. Throughput: 0: 42633.3. Samples: 12331338100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 22:19:54,138][15401] Updated weights for policy 0, policy_version 752643 (0.0031) [2024-06-24 22:19:58,293][15401] Updated weights for policy 0, policy_version 752653 (0.0028) [2024-06-24 22:19:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12331466752. Throughput: 0: 42546.6. Samples: 12331592440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:19:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 22:20:02,366][15401] Updated weights for policy 0, policy_version 752663 (0.0034) [2024-06-24 22:20:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12331679744. Throughput: 0: 42634.2. Samples: 12331851480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-24 22:20:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 22:20:05,815][15401] Updated weights for policy 0, policy_version 752673 (0.0031) [2024-06-24 22:20:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12331892736. Throughput: 0: 42668.4. Samples: 12331979520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 22:20:09,979][15401] Updated weights for policy 0, policy_version 752683 (0.0035) [2024-06-24 22:20:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12332105728. Throughput: 0: 42472.3. Samples: 12332228060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 22:20:13,738][15401] Updated weights for policy 0, policy_version 752693 (0.0041) [2024-06-24 22:20:17,698][15401] Updated weights for policy 0, policy_version 752703 (0.0041) [2024-06-24 22:20:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12332318720. Throughput: 0: 42586.3. Samples: 12332490220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 22:20:21,337][15401] Updated weights for policy 0, policy_version 752713 (0.0029) [2024-06-24 22:20:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12332548096. Throughput: 0: 42630.1. Samples: 12332617360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 22:20:25,259][15401] Updated weights for policy 0, policy_version 752723 (0.0040) [2024-06-24 22:20:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12332728320. Throughput: 0: 42507.5. Samples: 12332870360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 22:20:29,026][15401] Updated weights for policy 0, policy_version 752733 (0.0033) [2024-06-24 22:20:32,789][15401] Updated weights for policy 0, policy_version 752743 (0.0029) [2024-06-24 22:20:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12332941312. Throughput: 0: 42522.6. Samples: 12333128660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:20:36,667][15401] Updated weights for policy 0, policy_version 752753 (0.0033) [2024-06-24 22:20:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.7, 300 sec: 42654.9). Total num frames: 12333187072. Throughput: 0: 42563.1. Samples: 12333253440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:38,396][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 22:20:40,172][15349] Signal inference workers to stop experience collection... (182550 times) [2024-06-24 22:20:40,172][15349] Signal inference workers to resume experience collection... (182550 times) [2024-06-24 22:20:40,216][15401] InferenceWorker_p0-w0: stopping experience collection (182550 times) [2024-06-24 22:20:40,216][15401] InferenceWorker_p0-w0: resuming experience collection (182550 times) [2024-06-24 22:20:40,314][15401] Updated weights for policy 0, policy_version 752763 (0.0042) [2024-06-24 22:20:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42598.1, 300 sec: 42598.3). Total num frames: 12333383680. Throughput: 0: 42649.5. Samples: 12333511680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:43,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:20:44,332][15401] Updated weights for policy 0, policy_version 752773 (0.0029) [2024-06-24 22:20:47,884][15401] Updated weights for policy 0, policy_version 752783 (0.0022) [2024-06-24 22:20:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12333596672. Throughput: 0: 42404.3. Samples: 12333759680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 22:20:52,282][15401] Updated weights for policy 0, policy_version 752793 (0.0047) [2024-06-24 22:20:53,390][15132] Fps is (10 sec: 44238.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12333826048. Throughput: 0: 42489.8. Samples: 12333891560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 22:20:55,551][15401] Updated weights for policy 0, policy_version 752803 (0.0026) [2024-06-24 22:20:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12334022656. Throughput: 0: 42601.0. Samples: 12334145100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:20:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 22:20:59,981][15401] Updated weights for policy 0, policy_version 752813 (0.0036) [2024-06-24 22:21:03,069][15401] Updated weights for policy 0, policy_version 752823 (0.0040) [2024-06-24 22:21:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42543.4). Total num frames: 12334252032. Throughput: 0: 42289.3. Samples: 12334393340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:03,393][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 22:21:07,719][15401] Updated weights for policy 0, policy_version 752833 (0.0030) [2024-06-24 22:21:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12334448640. Throughput: 0: 42448.9. Samples: 12334527560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 22:21:10,693][15401] Updated weights for policy 0, policy_version 752843 (0.0051) [2024-06-24 22:21:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12334661632. Throughput: 0: 42733.7. Samples: 12334793380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 22:21:15,147][15401] Updated weights for policy 0, policy_version 752853 (0.0042) [2024-06-24 22:21:18,230][15401] Updated weights for policy 0, policy_version 752863 (0.0026) [2024-06-24 22:21:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12334907392. Throughput: 0: 42509.4. Samples: 12335041580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 22:21:22,642][15401] Updated weights for policy 0, policy_version 752873 (0.0030) [2024-06-24 22:21:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42598.8). Total num frames: 12335071232. Throughput: 0: 42658.8. Samples: 12335173080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:23,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 22:21:25,850][15401] Updated weights for policy 0, policy_version 752883 (0.0038) [2024-06-24 22:21:28,396][15132] Fps is (10 sec: 40933.7, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 12335316992. Throughput: 0: 42777.8. Samples: 12335436940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:28,396][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 22:21:30,175][15401] Updated weights for policy 0, policy_version 752893 (0.0033) [2024-06-24 22:21:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12335529984. Throughput: 0: 42902.6. Samples: 12335690300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:33,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 22:21:33,739][15401] Updated weights for policy 0, policy_version 752903 (0.0047) [2024-06-24 22:21:37,748][15401] Updated weights for policy 0, policy_version 752913 (0.0038) [2024-06-24 22:21:38,389][15132] Fps is (10 sec: 40986.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12335726592. Throughput: 0: 42740.1. Samples: 12335814860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:38,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 22:21:40,005][15349] Signal inference workers to stop experience collection... (182600 times) [2024-06-24 22:21:40,056][15401] InferenceWorker_p0-w0: stopping experience collection (182600 times) [2024-06-24 22:21:40,063][15349] Signal inference workers to resume experience collection... (182600 times) [2024-06-24 22:21:40,070][15401] InferenceWorker_p0-w0: resuming experience collection (182600 times) [2024-06-24 22:21:41,791][15401] Updated weights for policy 0, policy_version 752923 (0.0029) [2024-06-24 22:21:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.7, 300 sec: 42542.9). Total num frames: 12335939584. Throughput: 0: 42880.4. Samples: 12336074720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-24 22:21:43,390][15132] Avg episode reward: [(0, '0.886')] [2024-06-24 22:21:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752927_12335955968.pth... [2024-06-24 22:21:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752303_12325732352.pth [2024-06-24 22:21:45,829][15401] Updated weights for policy 0, policy_version 752933 (0.0032) [2024-06-24 22:21:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12336168960. Throughput: 0: 42873.5. Samples: 12336322540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:21:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 22:21:49,374][15401] Updated weights for policy 0, policy_version 752943 (0.0039) [2024-06-24 22:21:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12336349184. Throughput: 0: 42828.7. Samples: 12336454840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:21:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 22:21:53,664][15401] Updated weights for policy 0, policy_version 752953 (0.0039) [2024-06-24 22:21:56,832][15401] Updated weights for policy 0, policy_version 752963 (0.0034) [2024-06-24 22:21:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12336578560. Throughput: 0: 42506.1. Samples: 12336706160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:21:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 22:22:01,459][15401] Updated weights for policy 0, policy_version 752973 (0.0038) [2024-06-24 22:22:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 12336807936. Throughput: 0: 42728.1. Samples: 12336964340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-24 22:22:04,309][15401] Updated weights for policy 0, policy_version 752983 (0.0039) [2024-06-24 22:22:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12337004544. Throughput: 0: 42695.9. Samples: 12337094400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 22:22:09,296][15401] Updated weights for policy 0, policy_version 752993 (0.0036) [2024-06-24 22:22:11,849][15401] Updated weights for policy 0, policy_version 753003 (0.0032) [2024-06-24 22:22:13,396][15132] Fps is (10 sec: 40933.2, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 12337217536. Throughput: 0: 42233.7. Samples: 12337337460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:13,397][15132] Avg episode reward: [(0, '0.272')] [2024-06-24 22:22:16,919][15401] Updated weights for policy 0, policy_version 753013 (0.0040) [2024-06-24 22:22:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 12337414144. Throughput: 0: 42403.7. Samples: 12337598460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 22:22:19,968][15401] Updated weights for policy 0, policy_version 753023 (0.0039) [2024-06-24 22:22:23,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12337643520. Throughput: 0: 42418.6. Samples: 12337723700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 22:22:24,445][15401] Updated weights for policy 0, policy_version 753033 (0.0036) [2024-06-24 22:22:27,576][15401] Updated weights for policy 0, policy_version 753043 (0.0030) [2024-06-24 22:22:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42328.2, 300 sec: 42709.2). Total num frames: 12337856512. Throughput: 0: 41970.7. Samples: 12337963500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:28,392][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 22:22:32,693][15401] Updated weights for policy 0, policy_version 753053 (0.0029) [2024-06-24 22:22:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12338053120. Throughput: 0: 42366.4. Samples: 12338229040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 22:22:35,348][15401] Updated weights for policy 0, policy_version 753063 (0.0043) [2024-06-24 22:22:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12338266112. Throughput: 0: 42135.9. Samples: 12338350960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-24 22:22:40,468][15401] Updated weights for policy 0, policy_version 753073 (0.0035) [2024-06-24 22:22:43,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12338495488. Throughput: 0: 42120.2. Samples: 12338601560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 22:22:43,464][15401] Updated weights for policy 0, policy_version 753083 (0.0033) [2024-06-24 22:22:48,142][15401] Updated weights for policy 0, policy_version 753093 (0.0034) [2024-06-24 22:22:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 12338692096. Throughput: 0: 42277.6. Samples: 12338866840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:48,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 22:22:51,176][15401] Updated weights for policy 0, policy_version 753103 (0.0038) [2024-06-24 22:22:53,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12338888704. Throughput: 0: 42028.0. Samples: 12338985660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 22:22:55,792][15401] Updated weights for policy 0, policy_version 753113 (0.0037) [2024-06-24 22:22:58,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42593.9, 300 sec: 42653.9). Total num frames: 12339134464. Throughput: 0: 42228.9. Samples: 12339237760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:22:58,397][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 22:22:58,721][15401] Updated weights for policy 0, policy_version 753123 (0.0032) [2024-06-24 22:23:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 12339314688. Throughput: 0: 42280.8. Samples: 12339501100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:03,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:23:03,407][15401] Updated weights for policy 0, policy_version 753133 (0.0028) [2024-06-24 22:23:04,627][15349] Signal inference workers to stop experience collection... (182650 times) [2024-06-24 22:23:04,627][15349] Signal inference workers to resume experience collection... (182650 times) [2024-06-24 22:23:04,643][15401] InferenceWorker_p0-w0: stopping experience collection (182650 times) [2024-06-24 22:23:04,644][15401] InferenceWorker_p0-w0: resuming experience collection (182650 times) [2024-06-24 22:23:06,344][15401] Updated weights for policy 0, policy_version 753143 (0.0051) [2024-06-24 22:23:08,392][15132] Fps is (10 sec: 40976.4, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 12339544064. Throughput: 0: 42147.1. Samples: 12339620420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:08,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:23:11,231][15401] Updated weights for policy 0, policy_version 753153 (0.0033) [2024-06-24 22:23:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42603.0, 300 sec: 42598.7). Total num frames: 12339773440. Throughput: 0: 42611.1. Samples: 12339880900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 22:23:14,317][15401] Updated weights for policy 0, policy_version 753163 (0.0027) [2024-06-24 22:23:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12339970048. Throughput: 0: 42497.5. Samples: 12340141420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 22:23:18,766][15401] Updated weights for policy 0, policy_version 753173 (0.0042) [2024-06-24 22:23:22,141][15401] Updated weights for policy 0, policy_version 753183 (0.0027) [2024-06-24 22:23:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 12340183040. Throughput: 0: 42413.4. Samples: 12340259560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:23:26,485][15401] Updated weights for policy 0, policy_version 753193 (0.0040) [2024-06-24 22:23:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 12340428800. Throughput: 0: 42675.4. Samples: 12340521960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:23:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 22:23:29,922][15401] Updated weights for policy 0, policy_version 753203 (0.0032) [2024-06-24 22:23:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 12340609024. Throughput: 0: 42373.0. Samples: 12340773620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 22:23:34,215][15401] Updated weights for policy 0, policy_version 753213 (0.0024) [2024-06-24 22:23:37,748][15401] Updated weights for policy 0, policy_version 753223 (0.0036) [2024-06-24 22:23:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 12340805632. Throughput: 0: 42394.2. Samples: 12340893400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 22:23:42,042][15401] Updated weights for policy 0, policy_version 753233 (0.0039) [2024-06-24 22:23:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12341051392. Throughput: 0: 42734.6. Samples: 12341160540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:43,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 22:23:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753239_12341067776.pth... [2024-06-24 22:23:43,593][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752615_12330844160.pth [2024-06-24 22:23:45,751][15401] Updated weights for policy 0, policy_version 753243 (0.0051) [2024-06-24 22:23:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12341231616. Throughput: 0: 42544.0. Samples: 12341415580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 22:23:49,736][15401] Updated weights for policy 0, policy_version 753253 (0.0038) [2024-06-24 22:23:53,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12341444608. Throughput: 0: 42517.9. Samples: 12341533620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:53,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 22:23:53,423][15401] Updated weights for policy 0, policy_version 753263 (0.0038) [2024-06-24 22:23:57,484][15401] Updated weights for policy 0, policy_version 753273 (0.0045) [2024-06-24 22:23:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42329.8, 300 sec: 42543.2). Total num frames: 12341673984. Throughput: 0: 42573.3. Samples: 12341796700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:23:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 22:24:00,983][15401] Updated weights for policy 0, policy_version 753283 (0.0027) [2024-06-24 22:24:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12341870592. Throughput: 0: 42382.2. Samples: 12342048620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 22:24:05,195][15401] Updated weights for policy 0, policy_version 753293 (0.0036) [2024-06-24 22:24:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 12342099968. Throughput: 0: 42500.0. Samples: 12342172060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 22:24:08,517][15401] Updated weights for policy 0, policy_version 753303 (0.0034) [2024-06-24 22:24:12,775][15401] Updated weights for policy 0, policy_version 753313 (0.0035) [2024-06-24 22:24:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 12342296576. Throughput: 0: 42382.2. Samples: 12342429160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 22:24:16,304][15401] Updated weights for policy 0, policy_version 753323 (0.0044) [2024-06-24 22:24:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 12342493184. Throughput: 0: 42580.4. Samples: 12342689740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 22:24:18,882][15349] Signal inference workers to stop experience collection... (182700 times) [2024-06-24 22:24:18,934][15349] Signal inference workers to resume experience collection... (182700 times) [2024-06-24 22:24:18,935][15401] InferenceWorker_p0-w0: stopping experience collection (182700 times) [2024-06-24 22:24:18,955][15401] InferenceWorker_p0-w0: resuming experience collection (182700 times) [2024-06-24 22:24:20,879][15401] Updated weights for policy 0, policy_version 753333 (0.0034) [2024-06-24 22:24:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12342755328. Throughput: 0: 42708.0. Samples: 12342815260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 22:24:24,080][15401] Updated weights for policy 0, policy_version 753343 (0.0031) [2024-06-24 22:24:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 12342919168. Throughput: 0: 42408.4. Samples: 12343068920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 22:24:28,438][15401] Updated weights for policy 0, policy_version 753353 (0.0037) [2024-06-24 22:24:31,568][15401] Updated weights for policy 0, policy_version 753363 (0.0031) [2024-06-24 22:24:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42431.9). Total num frames: 12343132160. Throughput: 0: 42610.7. Samples: 12343333060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:24:35,799][15401] Updated weights for policy 0, policy_version 753373 (0.0033) [2024-06-24 22:24:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12343394304. Throughput: 0: 42866.6. Samples: 12343462620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:38,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 22:24:38,953][15401] Updated weights for policy 0, policy_version 753383 (0.0029) [2024-06-24 22:24:43,261][15401] Updated weights for policy 0, policy_version 753393 (0.0033) [2024-06-24 22:24:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 12343590912. Throughput: 0: 42623.1. Samples: 12343714740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 22:24:46,383][15401] Updated weights for policy 0, policy_version 753403 (0.0042) [2024-06-24 22:24:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12343787520. Throughput: 0: 42869.7. Samples: 12343977760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 22:24:50,643][15401] Updated weights for policy 0, policy_version 753413 (0.0043) [2024-06-24 22:24:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12344033280. Throughput: 0: 42915.5. Samples: 12344103260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 22:24:54,239][15401] Updated weights for policy 0, policy_version 753423 (0.0026) [2024-06-24 22:24:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 12344213504. Throughput: 0: 42884.1. Samples: 12344358940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:24:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 22:24:58,685][15401] Updated weights for policy 0, policy_version 753433 (0.0043) [2024-06-24 22:25:01,763][15401] Updated weights for policy 0, policy_version 753443 (0.0037) [2024-06-24 22:25:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12344442880. Throughput: 0: 42744.9. Samples: 12344613260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:25:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 22:25:06,145][15401] Updated weights for policy 0, policy_version 753453 (0.0041) [2024-06-24 22:25:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12344672256. Throughput: 0: 42729.0. Samples: 12344738060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:25:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 22:25:09,622][15401] Updated weights for policy 0, policy_version 753463 (0.0033) [2024-06-24 22:25:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 12344868864. Throughput: 0: 42908.3. Samples: 12344999800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-24 22:25:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 22:25:14,084][15401] Updated weights for policy 0, policy_version 753473 (0.0040) [2024-06-24 22:25:17,148][15401] Updated weights for policy 0, policy_version 753483 (0.0037) [2024-06-24 22:25:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 12345081856. Throughput: 0: 42646.1. Samples: 12345252140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:18,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-24 22:25:21,498][15401] Updated weights for policy 0, policy_version 753493 (0.0027) [2024-06-24 22:25:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12345294848. Throughput: 0: 42524.3. Samples: 12345376220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 22:25:24,713][15401] Updated weights for policy 0, policy_version 753503 (0.0035) [2024-06-24 22:25:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 12345524224. Throughput: 0: 42770.4. Samples: 12345639400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:28,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 22:25:29,251][15401] Updated weights for policy 0, policy_version 753513 (0.0035) [2024-06-24 22:25:32,250][15401] Updated weights for policy 0, policy_version 753523 (0.0031) [2024-06-24 22:25:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 12345737216. Throughput: 0: 42663.6. Samples: 12345897620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:25:36,937][15401] Updated weights for policy 0, policy_version 753533 (0.0043) [2024-06-24 22:25:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.5). Total num frames: 12345950208. Throughput: 0: 42685.9. Samples: 12346024120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:38,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 22:25:40,526][15401] Updated weights for policy 0, policy_version 753543 (0.0040) [2024-06-24 22:25:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12346163200. Throughput: 0: 42840.9. Samples: 12346286780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 22:25:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753550_12346163200.pth... [2024-06-24 22:25:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000752927_12335955968.pth [2024-06-24 22:25:44,414][15401] Updated weights for policy 0, policy_version 753553 (0.0033) [2024-06-24 22:25:48,192][15401] Updated weights for policy 0, policy_version 753563 (0.0035) [2024-06-24 22:25:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12346376192. Throughput: 0: 42835.5. Samples: 12346540860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 22:25:48,658][15349] Signal inference workers to stop experience collection... (182750 times) [2024-06-24 22:25:48,700][15401] InferenceWorker_p0-w0: stopping experience collection (182750 times) [2024-06-24 22:25:48,707][15349] Signal inference workers to resume experience collection... (182750 times) [2024-06-24 22:25:48,719][15401] InferenceWorker_p0-w0: resuming experience collection (182750 times) [2024-06-24 22:25:51,990][15401] Updated weights for policy 0, policy_version 753573 (0.0040) [2024-06-24 22:25:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12346589184. Throughput: 0: 42907.5. Samples: 12346668900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:53,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-24 22:25:55,737][15401] Updated weights for policy 0, policy_version 753583 (0.0035) [2024-06-24 22:25:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 12346802176. Throughput: 0: 42925.6. Samples: 12346931440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:25:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 22:25:59,431][15401] Updated weights for policy 0, policy_version 753593 (0.0046) [2024-06-24 22:26:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12347015168. Throughput: 0: 43024.8. Samples: 12347188260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 22:26:03,570][15401] Updated weights for policy 0, policy_version 753603 (0.0025) [2024-06-24 22:26:07,005][15401] Updated weights for policy 0, policy_version 753613 (0.0041) [2024-06-24 22:26:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12347228160. Throughput: 0: 43058.9. Samples: 12347313860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:08,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 22:26:11,274][15401] Updated weights for policy 0, policy_version 753623 (0.0035) [2024-06-24 22:26:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12347441152. Throughput: 0: 42906.1. Samples: 12347570180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-24 22:26:14,714][15401] Updated weights for policy 0, policy_version 753633 (0.0039) [2024-06-24 22:26:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12347637760. Throughput: 0: 42807.1. Samples: 12347823940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 22:26:19,024][15401] Updated weights for policy 0, policy_version 753643 (0.0051) [2024-06-24 22:26:22,341][15401] Updated weights for policy 0, policy_version 753653 (0.0035) [2024-06-24 22:26:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42543.8). Total num frames: 12347867136. Throughput: 0: 42827.4. Samples: 12347951360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:23,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 22:26:26,730][15401] Updated weights for policy 0, policy_version 753663 (0.0043) [2024-06-24 22:26:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12348080128. Throughput: 0: 42698.2. Samples: 12348208200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:28,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-24 22:26:30,284][15401] Updated weights for policy 0, policy_version 753673 (0.0045) [2024-06-24 22:26:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12348293120. Throughput: 0: 42536.1. Samples: 12348454980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-24 22:26:34,591][15401] Updated weights for policy 0, policy_version 753683 (0.0039) [2024-06-24 22:26:37,817][15401] Updated weights for policy 0, policy_version 753693 (0.0033) [2024-06-24 22:26:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12348522496. Throughput: 0: 42647.0. Samples: 12348588020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:38,392][15132] Avg episode reward: [(0, '0.507')] [2024-06-24 22:26:42,091][15401] Updated weights for policy 0, policy_version 753703 (0.0032) [2024-06-24 22:26:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12348702720. Throughput: 0: 42476.8. Samples: 12348842900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:43,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 22:26:45,466][15401] Updated weights for policy 0, policy_version 753713 (0.0038) [2024-06-24 22:26:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12348948480. Throughput: 0: 42394.3. Samples: 12349096000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:48,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 22:26:49,897][15401] Updated weights for policy 0, policy_version 753723 (0.0026) [2024-06-24 22:26:53,209][15401] Updated weights for policy 0, policy_version 753733 (0.0035) [2024-06-24 22:26:53,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12349161472. Throughput: 0: 42586.4. Samples: 12349230260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 22:26:57,549][15401] Updated weights for policy 0, policy_version 753743 (0.0043) [2024-06-24 22:26:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12349341696. Throughput: 0: 42750.8. Samples: 12349493960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-24 22:26:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 22:27:00,899][15401] Updated weights for policy 0, policy_version 753753 (0.0039) [2024-06-24 22:27:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12349603840. Throughput: 0: 42612.7. Samples: 12349741520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:03,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 22:27:05,042][15401] Updated weights for policy 0, policy_version 753763 (0.0032) [2024-06-24 22:27:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.2, 300 sec: 42599.3). Total num frames: 12349784064. Throughput: 0: 42759.0. Samples: 12349875520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 22:27:08,749][15401] Updated weights for policy 0, policy_version 753773 (0.0031) [2024-06-24 22:27:12,545][15401] Updated weights for policy 0, policy_version 753783 (0.0043) [2024-06-24 22:27:13,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 12349980672. Throughput: 0: 42708.3. Samples: 12350130180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:13,393][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 22:27:14,895][15349] Signal inference workers to stop experience collection... (182800 times) [2024-06-24 22:27:14,900][15349] Signal inference workers to resume experience collection... (182800 times) [2024-06-24 22:27:14,909][15401] InferenceWorker_p0-w0: stopping experience collection (182800 times) [2024-06-24 22:27:14,921][15401] InferenceWorker_p0-w0: resuming experience collection (182800 times) [2024-06-24 22:27:16,658][15401] Updated weights for policy 0, policy_version 753793 (0.0038) [2024-06-24 22:27:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12350242816. Throughput: 0: 42652.0. Samples: 12350374320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:18,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-24 22:27:20,349][15401] Updated weights for policy 0, policy_version 753803 (0.0044) [2024-06-24 22:27:23,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 12350423040. Throughput: 0: 42766.2. Samples: 12350512500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:23,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-24 22:27:24,254][15401] Updated weights for policy 0, policy_version 753813 (0.0052) [2024-06-24 22:27:28,038][15401] Updated weights for policy 0, policy_version 753823 (0.0032) [2024-06-24 22:27:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12350636032. Throughput: 0: 42674.7. Samples: 12350763260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 22:27:32,443][15401] Updated weights for policy 0, policy_version 753833 (0.0034) [2024-06-24 22:27:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12350881792. Throughput: 0: 42700.3. Samples: 12351017520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 22:27:35,663][15401] Updated weights for policy 0, policy_version 753843 (0.0032) [2024-06-24 22:27:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12351078400. Throughput: 0: 42662.0. Samples: 12351150040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 22:27:40,045][15401] Updated weights for policy 0, policy_version 753853 (0.0037) [2024-06-24 22:27:43,056][15401] Updated weights for policy 0, policy_version 753863 (0.0046) [2024-06-24 22:27:43,392][15132] Fps is (10 sec: 40951.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 12351291392. Throughput: 0: 42474.2. Samples: 12351405400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:43,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 22:27:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753863_12351291392.pth... [2024-06-24 22:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753239_12341067776.pth [2024-06-24 22:27:47,632][15401] Updated weights for policy 0, policy_version 753873 (0.0041) [2024-06-24 22:27:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12351504384. Throughput: 0: 42772.2. Samples: 12351666260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 22:27:50,562][15401] Updated weights for policy 0, policy_version 753883 (0.0034) [2024-06-24 22:27:53,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 12351700992. Throughput: 0: 42674.3. Samples: 12351795860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 22:27:55,077][15401] Updated weights for policy 0, policy_version 753893 (0.0033) [2024-06-24 22:27:57,987][15401] Updated weights for policy 0, policy_version 753903 (0.0032) [2024-06-24 22:27:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12351946752. Throughput: 0: 42678.4. Samples: 12352050600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:27:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 22:28:02,561][15401] Updated weights for policy 0, policy_version 753913 (0.0029) [2024-06-24 22:28:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 12352143360. Throughput: 0: 42955.1. Samples: 12352307300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:03,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-24 22:28:05,935][15401] Updated weights for policy 0, policy_version 753923 (0.0040) [2024-06-24 22:28:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12352356352. Throughput: 0: 42649.5. Samples: 12352431720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 22:28:10,059][15401] Updated weights for policy 0, policy_version 753933 (0.0038) [2024-06-24 22:28:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 12352585728. Throughput: 0: 42891.4. Samples: 12352693380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 22:28:13,403][15401] Updated weights for policy 0, policy_version 753943 (0.0034) [2024-06-24 22:28:18,031][15401] Updated weights for policy 0, policy_version 753953 (0.0044) [2024-06-24 22:28:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12352765952. Throughput: 0: 42873.0. Samples: 12352946800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:28:21,401][15401] Updated weights for policy 0, policy_version 753963 (0.0027) [2024-06-24 22:28:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12352995328. Throughput: 0: 42589.3. Samples: 12353066560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:23,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-24 22:28:25,926][15401] Updated weights for policy 0, policy_version 753973 (0.0036) [2024-06-24 22:28:28,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 12353224704. Throughput: 0: 42728.0. Samples: 12353328160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:28,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:28:28,841][15401] Updated weights for policy 0, policy_version 753983 (0.0046) [2024-06-24 22:28:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 12353388544. Throughput: 0: 42598.1. Samples: 12353583180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 22:28:34,077][15401] Updated weights for policy 0, policy_version 753993 (0.0050) [2024-06-24 22:28:36,493][15349] Signal inference workers to stop experience collection... (182850 times) [2024-06-24 22:28:36,543][15401] InferenceWorker_p0-w0: stopping experience collection (182850 times) [2024-06-24 22:28:36,548][15349] Signal inference workers to resume experience collection... (182850 times) [2024-06-24 22:28:36,558][15401] InferenceWorker_p0-w0: resuming experience collection (182850 times) [2024-06-24 22:28:36,689][15401] Updated weights for policy 0, policy_version 754003 (0.0037) [2024-06-24 22:28:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12353634304. Throughput: 0: 42467.2. Samples: 12353706880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-24 22:28:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 22:28:41,499][15401] Updated weights for policy 0, policy_version 754013 (0.0032) [2024-06-24 22:28:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12353847296. Throughput: 0: 42618.6. Samples: 12353968440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:28:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 22:28:44,278][15401] Updated weights for policy 0, policy_version 754023 (0.0040) [2024-06-24 22:28:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12354043904. Throughput: 0: 42694.7. Samples: 12354228560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:28:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-24 22:28:49,138][15401] Updated weights for policy 0, policy_version 754033 (0.0041) [2024-06-24 22:28:52,380][15401] Updated weights for policy 0, policy_version 754043 (0.0036) [2024-06-24 22:28:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12354289664. Throughput: 0: 42596.0. Samples: 12354348540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:28:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 22:28:56,723][15401] Updated weights for policy 0, policy_version 754053 (0.0032) [2024-06-24 22:28:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12354502656. Throughput: 0: 42579.7. Samples: 12354609460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:28:58,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 22:29:00,107][15401] Updated weights for policy 0, policy_version 754063 (0.0024) [2024-06-24 22:29:03,391][15132] Fps is (10 sec: 36039.5, 60 sec: 41778.2, 300 sec: 42542.6). Total num frames: 12354650112. Throughput: 0: 42735.5. Samples: 12354869960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:03,391][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 22:29:04,458][15401] Updated weights for policy 0, policy_version 754073 (0.0036) [2024-06-24 22:29:07,713][15401] Updated weights for policy 0, policy_version 754083 (0.0043) [2024-06-24 22:29:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12354928640. Throughput: 0: 42639.1. Samples: 12354985420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:08,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 22:29:12,107][15401] Updated weights for policy 0, policy_version 754093 (0.0029) [2024-06-24 22:29:13,389][15132] Fps is (10 sec: 47520.7, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 12355125248. Throughput: 0: 42690.3. Samples: 12355249120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 22:29:15,220][15401] Updated weights for policy 0, policy_version 754103 (0.0042) [2024-06-24 22:29:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12355338240. Throughput: 0: 42783.1. Samples: 12355508420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:29:19,680][15401] Updated weights for policy 0, policy_version 754113 (0.0029) [2024-06-24 22:29:22,825][15401] Updated weights for policy 0, policy_version 754123 (0.0038) [2024-06-24 22:29:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12355551232. Throughput: 0: 42709.7. Samples: 12355628820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 22:29:27,231][15401] Updated weights for policy 0, policy_version 754133 (0.0023) [2024-06-24 22:29:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42326.9, 300 sec: 42820.5). Total num frames: 12355764224. Throughput: 0: 42679.9. Samples: 12355889040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 22:29:30,817][15401] Updated weights for policy 0, policy_version 754143 (0.0030) [2024-06-24 22:29:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12355977216. Throughput: 0: 42643.7. Samples: 12356147520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 22:29:34,841][15401] Updated weights for policy 0, policy_version 754153 (0.0038) [2024-06-24 22:29:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12356190208. Throughput: 0: 42796.9. Samples: 12356274400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 22:29:38,659][15401] Updated weights for policy 0, policy_version 754163 (0.0036) [2024-06-24 22:29:42,341][15401] Updated weights for policy 0, policy_version 754173 (0.0035) [2024-06-24 22:29:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12356419584. Throughput: 0: 42779.3. Samples: 12356534540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 22:29:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754176_12356419584.pth... [2024-06-24 22:29:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753550_12346163200.pth [2024-06-24 22:29:43,630][15349] Signal inference workers to stop experience collection... (182900 times) [2024-06-24 22:29:43,630][15349] Signal inference workers to resume experience collection... (182900 times) [2024-06-24 22:29:43,675][15401] InferenceWorker_p0-w0: stopping experience collection (182900 times) [2024-06-24 22:29:43,680][15401] InferenceWorker_p0-w0: resuming experience collection (182900 times) [2024-06-24 22:29:46,110][15401] Updated weights for policy 0, policy_version 754183 (0.0045) [2024-06-24 22:29:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12356616192. Throughput: 0: 42610.2. Samples: 12356787360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-24 22:29:49,867][15401] Updated weights for policy 0, policy_version 754193 (0.0027) [2024-06-24 22:29:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12356829184. Throughput: 0: 42828.5. Samples: 12356912600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 22:29:53,658][15401] Updated weights for policy 0, policy_version 754203 (0.0031) [2024-06-24 22:29:58,091][15401] Updated weights for policy 0, policy_version 754213 (0.0040) [2024-06-24 22:29:58,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12357058560. Throughput: 0: 42739.3. Samples: 12357172380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:29:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 22:30:01,612][15401] Updated weights for policy 0, policy_version 754223 (0.0027) [2024-06-24 22:30:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43691.8, 300 sec: 42709.5). Total num frames: 12357271552. Throughput: 0: 42756.1. Samples: 12357432440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:30:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 22:30:05,381][15401] Updated weights for policy 0, policy_version 754233 (0.0034) [2024-06-24 22:30:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12357468160. Throughput: 0: 42892.5. Samples: 12357558980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:30:08,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-24 22:30:09,118][15401] Updated weights for policy 0, policy_version 754243 (0.0026) [2024-06-24 22:30:13,102][15401] Updated weights for policy 0, policy_version 754253 (0.0031) [2024-06-24 22:30:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12357697536. Throughput: 0: 42901.5. Samples: 12357819600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:30:13,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-24 22:30:16,915][15401] Updated weights for policy 0, policy_version 754263 (0.0027) [2024-06-24 22:30:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12357910528. Throughput: 0: 42846.0. Samples: 12358075600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:30:18,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 22:30:20,670][15401] Updated weights for policy 0, policy_version 754273 (0.0031) [2024-06-24 22:30:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12358107136. Throughput: 0: 42736.0. Samples: 12358197520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-24 22:30:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 22:30:24,736][15401] Updated weights for policy 0, policy_version 754283 (0.0032) [2024-06-24 22:30:28,204][15401] Updated weights for policy 0, policy_version 754293 (0.0041) [2024-06-24 22:30:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12358336512. Throughput: 0: 42676.9. Samples: 12358455000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 22:30:32,342][15401] Updated weights for policy 0, policy_version 754303 (0.0037) [2024-06-24 22:30:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12358533120. Throughput: 0: 42724.6. Samples: 12358709960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 22:30:35,874][15401] Updated weights for policy 0, policy_version 754313 (0.0032) [2024-06-24 22:30:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12358778880. Throughput: 0: 42799.0. Samples: 12358838560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 22:30:40,306][15401] Updated weights for policy 0, policy_version 754323 (0.0031) [2024-06-24 22:30:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12358975488. Throughput: 0: 42831.9. Samples: 12359099820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 22:30:43,402][15401] Updated weights for policy 0, policy_version 754333 (0.0032) [2024-06-24 22:30:47,854][15401] Updated weights for policy 0, policy_version 754343 (0.0042) [2024-06-24 22:30:48,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 12359172096. Throughput: 0: 42717.7. Samples: 12359354840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:48,392][15132] Avg episode reward: [(0, '0.181')] [2024-06-24 22:30:51,289][15401] Updated weights for policy 0, policy_version 754353 (0.0020) [2024-06-24 22:30:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12359401472. Throughput: 0: 42706.7. Samples: 12359480780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-24 22:30:55,369][15401] Updated weights for policy 0, policy_version 754363 (0.0036) [2024-06-24 22:30:57,642][15349] Signal inference workers to stop experience collection... (182950 times) [2024-06-24 22:30:57,642][15349] Signal inference workers to resume experience collection... (182950 times) [2024-06-24 22:30:57,652][15401] InferenceWorker_p0-w0: stopping experience collection (182950 times) [2024-06-24 22:30:57,653][15401] InferenceWorker_p0-w0: resuming experience collection (182950 times) [2024-06-24 22:30:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12359614464. Throughput: 0: 42519.6. Samples: 12359732980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:30:58,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-24 22:30:58,945][15401] Updated weights for policy 0, policy_version 754373 (0.0030) [2024-06-24 22:31:02,802][15401] Updated weights for policy 0, policy_version 754383 (0.0026) [2024-06-24 22:31:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12359811072. Throughput: 0: 42549.4. Samples: 12359990320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:03,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 22:31:06,510][15401] Updated weights for policy 0, policy_version 754393 (0.0035) [2024-06-24 22:31:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12360024064. Throughput: 0: 42785.8. Samples: 12360122880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:08,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-24 22:31:10,257][15401] Updated weights for policy 0, policy_version 754403 (0.0029) [2024-06-24 22:31:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12360253440. Throughput: 0: 42689.8. Samples: 12360376040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 22:31:14,157][15401] Updated weights for policy 0, policy_version 754413 (0.0038) [2024-06-24 22:31:17,799][15401] Updated weights for policy 0, policy_version 754423 (0.0030) [2024-06-24 22:31:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12360466432. Throughput: 0: 42703.0. Samples: 12360631600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 22:31:21,825][15401] Updated weights for policy 0, policy_version 754433 (0.0034) [2024-06-24 22:31:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12360679424. Throughput: 0: 42779.6. Samples: 12360763640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 22:31:25,742][15401] Updated weights for policy 0, policy_version 754443 (0.0027) [2024-06-24 22:31:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12360892416. Throughput: 0: 42645.7. Samples: 12361018880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 22:31:29,484][15401] Updated weights for policy 0, policy_version 754453 (0.0054) [2024-06-24 22:31:33,329][15401] Updated weights for policy 0, policy_version 754463 (0.0044) [2024-06-24 22:31:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12361121792. Throughput: 0: 42713.3. Samples: 12361276840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:33,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-24 22:31:37,085][15401] Updated weights for policy 0, policy_version 754473 (0.0034) [2024-06-24 22:31:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12361318400. Throughput: 0: 42792.5. Samples: 12361406440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:38,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-24 22:31:41,162][15401] Updated weights for policy 0, policy_version 754483 (0.0035) [2024-06-24 22:31:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12361531392. Throughput: 0: 42889.3. Samples: 12361663000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 22:31:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754488_12361531392.pth... [2024-06-24 22:31:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000753863_12351291392.pth [2024-06-24 22:31:44,602][15401] Updated weights for policy 0, policy_version 754493 (0.0037) [2024-06-24 22:31:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 12361760768. Throughput: 0: 42842.1. Samples: 12361918220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 22:31:48,722][15401] Updated weights for policy 0, policy_version 754503 (0.0046) [2024-06-24 22:31:52,470][15401] Updated weights for policy 0, policy_version 754513 (0.0035) [2024-06-24 22:31:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12361973760. Throughput: 0: 42727.5. Samples: 12362045620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 22:31:56,416][15401] Updated weights for policy 0, policy_version 754523 (0.0041) [2024-06-24 22:31:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12362186752. Throughput: 0: 42697.1. Samples: 12362297420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:31:58,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-24 22:31:59,999][15401] Updated weights for policy 0, policy_version 754533 (0.0026) [2024-06-24 22:32:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12362399744. Throughput: 0: 42696.6. Samples: 12362552940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:32:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 22:32:04,009][15401] Updated weights for policy 0, policy_version 754543 (0.0029) [2024-06-24 22:32:08,197][15401] Updated weights for policy 0, policy_version 754553 (0.0028) [2024-06-24 22:32:08,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12362596352. Throughput: 0: 42640.9. Samples: 12362682480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-24 22:32:08,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 22:32:11,519][15401] Updated weights for policy 0, policy_version 754563 (0.0039) [2024-06-24 22:32:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12362809344. Throughput: 0: 42661.4. Samples: 12362938640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 22:32:15,962][15401] Updated weights for policy 0, policy_version 754573 (0.0037) [2024-06-24 22:32:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12363038720. Throughput: 0: 42447.2. Samples: 12363186960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 22:32:19,514][15401] Updated weights for policy 0, policy_version 754583 (0.0033) [2024-06-24 22:32:22,626][15349] Signal inference workers to stop experience collection... (183000 times) [2024-06-24 22:32:22,626][15349] Signal inference workers to resume experience collection... (183000 times) [2024-06-24 22:32:22,649][15401] InferenceWorker_p0-w0: stopping experience collection (183000 times) [2024-06-24 22:32:22,649][15401] InferenceWorker_p0-w0: resuming experience collection (183000 times) [2024-06-24 22:32:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12363218944. Throughput: 0: 42507.6. Samples: 12363319280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 22:32:23,894][15401] Updated weights for policy 0, policy_version 754593 (0.0034) [2024-06-24 22:32:27,061][15401] Updated weights for policy 0, policy_version 754603 (0.0035) [2024-06-24 22:32:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12363448320. Throughput: 0: 42242.3. Samples: 12363563900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 22:32:31,680][15401] Updated weights for policy 0, policy_version 754613 (0.0034) [2024-06-24 22:32:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12363644928. Throughput: 0: 42477.9. Samples: 12363829720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 22:32:34,668][15401] Updated weights for policy 0, policy_version 754623 (0.0042) [2024-06-24 22:32:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 12363874304. Throughput: 0: 42377.3. Samples: 12363952700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:38,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 22:32:39,475][15401] Updated weights for policy 0, policy_version 754633 (0.0046) [2024-06-24 22:32:43,015][15401] Updated weights for policy 0, policy_version 754643 (0.0028) [2024-06-24 22:32:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12364087296. Throughput: 0: 42291.3. Samples: 12364200520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:43,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:32:47,030][15401] Updated weights for policy 0, policy_version 754653 (0.0032) [2024-06-24 22:32:48,389][15132] Fps is (10 sec: 39331.4, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 12364267520. Throughput: 0: 42384.0. Samples: 12364460220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-24 22:32:50,525][15401] Updated weights for policy 0, policy_version 754663 (0.0025) [2024-06-24 22:32:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12364529664. Throughput: 0: 42178.3. Samples: 12364580500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 22:32:54,614][15401] Updated weights for policy 0, policy_version 754673 (0.0036) [2024-06-24 22:32:58,045][15401] Updated weights for policy 0, policy_version 754683 (0.0031) [2024-06-24 22:32:58,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 12364726272. Throughput: 0: 42221.2. Samples: 12364838700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:32:58,393][15132] Avg episode reward: [(0, '0.735')] [2024-06-24 22:33:02,764][15401] Updated weights for policy 0, policy_version 754693 (0.0037) [2024-06-24 22:33:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 12364906496. Throughput: 0: 42575.6. Samples: 12365102860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:03,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 22:33:05,534][15401] Updated weights for policy 0, policy_version 754703 (0.0026) [2024-06-24 22:33:08,392][15132] Fps is (10 sec: 40960.5, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 12365135872. Throughput: 0: 42230.7. Samples: 12365219760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:08,392][15132] Avg episode reward: [(0, '0.883')] [2024-06-24 22:33:10,337][15401] Updated weights for policy 0, policy_version 754713 (0.0031) [2024-06-24 22:33:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12365365248. Throughput: 0: 42623.2. Samples: 12365481940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 22:33:13,397][15401] Updated weights for policy 0, policy_version 754723 (0.0037) [2024-06-24 22:33:17,914][15401] Updated weights for policy 0, policy_version 754733 (0.0036) [2024-06-24 22:33:18,392][15132] Fps is (10 sec: 42598.9, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 12365561856. Throughput: 0: 42514.8. Samples: 12365742980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:18,392][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 22:33:20,874][15401] Updated weights for policy 0, policy_version 754743 (0.0029) [2024-06-24 22:33:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 12365791232. Throughput: 0: 42457.3. Samples: 12365863180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 22:33:25,496][15401] Updated weights for policy 0, policy_version 754753 (0.0028) [2024-06-24 22:33:25,963][15349] Signal inference workers to stop experience collection... (183050 times) [2024-06-24 22:33:25,965][15349] Signal inference workers to resume experience collection... (183050 times) [2024-06-24 22:33:25,995][15401] InferenceWorker_p0-w0: stopping experience collection (183050 times) [2024-06-24 22:33:25,996][15401] InferenceWorker_p0-w0: resuming experience collection (183050 times) [2024-06-24 22:33:28,391][15132] Fps is (10 sec: 45876.3, 60 sec: 42870.1, 300 sec: 42820.3). Total num frames: 12366020608. Throughput: 0: 42748.0. Samples: 12366124260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:28,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 22:33:28,835][15401] Updated weights for policy 0, policy_version 754763 (0.0036) [2024-06-24 22:33:33,273][15401] Updated weights for policy 0, policy_version 754773 (0.0047) [2024-06-24 22:33:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12366200832. Throughput: 0: 42560.4. Samples: 12366375440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 22:33:36,762][15401] Updated weights for policy 0, policy_version 754783 (0.0040) [2024-06-24 22:33:38,389][15132] Fps is (10 sec: 39329.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12366413824. Throughput: 0: 42622.3. Samples: 12366498500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 22:33:40,866][15401] Updated weights for policy 0, policy_version 754793 (0.0023) [2024-06-24 22:33:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12366643200. Throughput: 0: 42708.9. Samples: 12366760500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-24 22:33:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754800_12366643200.pth... [2024-06-24 22:33:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754176_12356419584.pth [2024-06-24 22:33:44,364][15401] Updated weights for policy 0, policy_version 754803 (0.0038) [2024-06-24 22:33:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12366839808. Throughput: 0: 42546.1. Samples: 12367017440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 22:33:48,663][15401] Updated weights for policy 0, policy_version 754813 (0.0032) [2024-06-24 22:33:51,896][15401] Updated weights for policy 0, policy_version 754823 (0.0030) [2024-06-24 22:33:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12367069184. Throughput: 0: 42819.8. Samples: 12367146560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 22:33:56,077][15401] Updated weights for policy 0, policy_version 754833 (0.0040) [2024-06-24 22:33:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42820.8). Total num frames: 12367282176. Throughput: 0: 42757.3. Samples: 12367406020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:33:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 22:33:59,447][15401] Updated weights for policy 0, policy_version 754843 (0.0029) [2024-06-24 22:34:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 12367495168. Throughput: 0: 42734.5. Samples: 12367665940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:03,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 22:34:03,506][15401] Updated weights for policy 0, policy_version 754853 (0.0033) [2024-06-24 22:34:07,014][15401] Updated weights for policy 0, policy_version 754863 (0.0034) [2024-06-24 22:34:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 12367724544. Throughput: 0: 42917.0. Samples: 12367794440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:08,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-24 22:34:10,910][15401] Updated weights for policy 0, policy_version 754873 (0.0050) [2024-06-24 22:34:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12367921152. Throughput: 0: 42883.2. Samples: 12368053920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 22:34:14,531][15401] Updated weights for policy 0, policy_version 754883 (0.0037) [2024-06-24 22:34:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42872.9, 300 sec: 42653.9). Total num frames: 12368134144. Throughput: 0: 43026.5. Samples: 12368311640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 22:34:18,689][15401] Updated weights for policy 0, policy_version 754893 (0.0036) [2024-06-24 22:34:22,121][15401] Updated weights for policy 0, policy_version 754903 (0.0043) [2024-06-24 22:34:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12368363520. Throughput: 0: 43130.7. Samples: 12368439380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 22:34:26,474][15401] Updated weights for policy 0, policy_version 754913 (0.0037) [2024-06-24 22:34:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 12368560128. Throughput: 0: 42963.2. Samples: 12368693840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 22:34:29,735][15401] Updated weights for policy 0, policy_version 754923 (0.0023) [2024-06-24 22:34:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12368789504. Throughput: 0: 43120.6. Samples: 12368957860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:33,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-24 22:34:33,974][15401] Updated weights for policy 0, policy_version 754933 (0.0038) [2024-06-24 22:34:37,369][15401] Updated weights for policy 0, policy_version 754943 (0.0034) [2024-06-24 22:34:38,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43417.4, 300 sec: 42709.5). Total num frames: 12369018880. Throughput: 0: 42985.2. Samples: 12369080900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 22:34:41,472][15401] Updated weights for policy 0, policy_version 754953 (0.0039) [2024-06-24 22:34:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12369215488. Throughput: 0: 43037.3. Samples: 12369342700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:43,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-24 22:34:44,940][15401] Updated weights for policy 0, policy_version 754963 (0.0035) [2024-06-24 22:34:48,390][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12369428480. Throughput: 0: 42992.4. Samples: 12369600600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 22:34:49,186][15401] Updated weights for policy 0, policy_version 754973 (0.0029) [2024-06-24 22:34:52,846][15401] Updated weights for policy 0, policy_version 754983 (0.0036) [2024-06-24 22:34:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12369641472. Throughput: 0: 42950.5. Samples: 12369727220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-24 22:34:56,732][15349] Signal inference workers to stop experience collection... (183100 times) [2024-06-24 22:34:56,784][15401] InferenceWorker_p0-w0: stopping experience collection (183100 times) [2024-06-24 22:34:56,786][15349] Signal inference workers to resume experience collection... (183100 times) [2024-06-24 22:34:56,803][15401] InferenceWorker_p0-w0: resuming experience collection (183100 times) [2024-06-24 22:34:56,920][15401] Updated weights for policy 0, policy_version 754993 (0.0044) [2024-06-24 22:34:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12369854464. Throughput: 0: 42808.9. Samples: 12369980320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:34:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 22:35:00,403][15401] Updated weights for policy 0, policy_version 755003 (0.0030) [2024-06-24 22:35:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12370051072. Throughput: 0: 42829.4. Samples: 12370238960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 22:35:04,569][15401] Updated weights for policy 0, policy_version 755013 (0.0039) [2024-06-24 22:35:08,039][15401] Updated weights for policy 0, policy_version 755023 (0.0036) [2024-06-24 22:35:08,391][15132] Fps is (10 sec: 44229.1, 60 sec: 42870.3, 300 sec: 42709.2). Total num frames: 12370296832. Throughput: 0: 42776.1. Samples: 12370364380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:08,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 22:35:12,263][15401] Updated weights for policy 0, policy_version 755033 (0.0023) [2024-06-24 22:35:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12370509824. Throughput: 0: 42865.4. Samples: 12370622780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:13,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 22:35:15,719][15401] Updated weights for policy 0, policy_version 755043 (0.0037) [2024-06-24 22:35:18,390][15132] Fps is (10 sec: 40966.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12370706432. Throughput: 0: 42784.8. Samples: 12370883180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 22:35:20,022][15401] Updated weights for policy 0, policy_version 755053 (0.0034) [2024-06-24 22:35:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12370935808. Throughput: 0: 42752.6. Samples: 12371004760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-24 22:35:23,578][15401] Updated weights for policy 0, policy_version 755063 (0.0036) [2024-06-24 22:35:27,653][15401] Updated weights for policy 0, policy_version 755073 (0.0039) [2024-06-24 22:35:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12371148800. Throughput: 0: 42721.3. Samples: 12371265160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:28,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 22:35:31,440][15401] Updated weights for policy 0, policy_version 755083 (0.0029) [2024-06-24 22:35:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12371361792. Throughput: 0: 42623.6. Samples: 12371518660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 22:35:35,142][15401] Updated weights for policy 0, policy_version 755093 (0.0029) [2024-06-24 22:35:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12371574784. Throughput: 0: 42639.2. Samples: 12371645980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:35:38,390][15132] Avg episode reward: [(0, '0.926')] [2024-06-24 22:35:39,171][15401] Updated weights for policy 0, policy_version 755103 (0.0046) [2024-06-24 22:35:42,864][15401] Updated weights for policy 0, policy_version 755113 (0.0033) [2024-06-24 22:35:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 12371787776. Throughput: 0: 42719.8. Samples: 12371902720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:35:43,390][15132] Avg episode reward: [(0, '0.948')] [2024-06-24 22:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755114_12371787776.pth... [2024-06-24 22:35:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754488_12361531392.pth [2024-06-24 22:35:43,478][15349] Saving new best policy, reward=0.948! [2024-06-24 22:35:46,754][15401] Updated weights for policy 0, policy_version 755123 (0.0030) [2024-06-24 22:35:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12372000768. Throughput: 0: 42806.8. Samples: 12372165260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:35:48,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-24 22:35:50,478][15401] Updated weights for policy 0, policy_version 755133 (0.0041) [2024-06-24 22:35:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12372230144. Throughput: 0: 42827.7. Samples: 12372291560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:35:53,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 22:35:54,242][15401] Updated weights for policy 0, policy_version 755143 (0.0028) [2024-06-24 22:35:58,102][15401] Updated weights for policy 0, policy_version 755153 (0.0031) [2024-06-24 22:35:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12372426752. Throughput: 0: 42779.4. Samples: 12372547860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:35:58,390][15132] Avg episode reward: [(0, '0.134')] [2024-06-24 22:36:01,727][15401] Updated weights for policy 0, policy_version 755163 (0.0023) [2024-06-24 22:36:03,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12372623360. Throughput: 0: 42598.3. Samples: 12372800100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:03,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 22:36:05,766][15401] Updated weights for policy 0, policy_version 755173 (0.0027) [2024-06-24 22:36:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42870.9, 300 sec: 42764.7). Total num frames: 12372869120. Throughput: 0: 42677.7. Samples: 12372925360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:08,393][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 22:36:09,966][15401] Updated weights for policy 0, policy_version 755183 (0.0023) [2024-06-24 22:36:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12373065728. Throughput: 0: 42567.9. Samples: 12373180720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-24 22:36:13,541][15401] Updated weights for policy 0, policy_version 755193 (0.0031) [2024-06-24 22:36:17,577][15401] Updated weights for policy 0, policy_version 755203 (0.0032) [2024-06-24 22:36:18,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12373262336. Throughput: 0: 42629.8. Samples: 12373437000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 22:36:21,360][15401] Updated weights for policy 0, policy_version 755213 (0.0041) [2024-06-24 22:36:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12373491712. Throughput: 0: 42728.6. Samples: 12373568760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 22:36:23,529][15349] Signal inference workers to stop experience collection... (183150 times) [2024-06-24 22:36:23,529][15349] Signal inference workers to resume experience collection... (183150 times) [2024-06-24 22:36:23,549][15401] InferenceWorker_p0-w0: stopping experience collection (183150 times) [2024-06-24 22:36:23,549][15401] InferenceWorker_p0-w0: resuming experience collection (183150 times) [2024-06-24 22:36:25,123][15401] Updated weights for policy 0, policy_version 755223 (0.0042) [2024-06-24 22:36:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12373704704. Throughput: 0: 42698.7. Samples: 12373824160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 22:36:29,179][15401] Updated weights for policy 0, policy_version 755233 (0.0029) [2024-06-24 22:36:33,155][15401] Updated weights for policy 0, policy_version 755243 (0.0035) [2024-06-24 22:36:33,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 12373901312. Throughput: 0: 42479.0. Samples: 12374076920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:33,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 22:36:36,757][15401] Updated weights for policy 0, policy_version 755253 (0.0035) [2024-06-24 22:36:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12374130688. Throughput: 0: 42477.5. Samples: 12374203040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:38,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-24 22:36:40,975][15401] Updated weights for policy 0, policy_version 755263 (0.0035) [2024-06-24 22:36:43,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12374343680. Throughput: 0: 42579.2. Samples: 12374463920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-24 22:36:44,252][15401] Updated weights for policy 0, policy_version 755273 (0.0053) [2024-06-24 22:36:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12374540288. Throughput: 0: 42726.7. Samples: 12374722800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 22:36:48,450][15401] Updated weights for policy 0, policy_version 755283 (0.0041) [2024-06-24 22:36:51,866][15401] Updated weights for policy 0, policy_version 755293 (0.0040) [2024-06-24 22:36:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12374769664. Throughput: 0: 42764.0. Samples: 12374849640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:53,396][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 22:36:55,970][15401] Updated weights for policy 0, policy_version 755303 (0.0031) [2024-06-24 22:36:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12374999040. Throughput: 0: 42857.0. Samples: 12375109280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:36:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 22:36:59,433][15401] Updated weights for policy 0, policy_version 755313 (0.0040) [2024-06-24 22:37:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 12375179264. Throughput: 0: 42852.8. Samples: 12375365480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:37:03,393][15132] Avg episode reward: [(0, '0.334')] [2024-06-24 22:37:03,603][15401] Updated weights for policy 0, policy_version 755323 (0.0038) [2024-06-24 22:37:06,928][15401] Updated weights for policy 0, policy_version 755333 (0.0041) [2024-06-24 22:37:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 12375408640. Throughput: 0: 42729.8. Samples: 12375491600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:37:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 22:37:11,036][15401] Updated weights for policy 0, policy_version 755343 (0.0047) [2024-06-24 22:37:13,390][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12375621632. Throughput: 0: 42904.0. Samples: 12375754840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:37:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 22:37:14,561][15401] Updated weights for policy 0, policy_version 755353 (0.0033) [2024-06-24 22:37:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12375851008. Throughput: 0: 42930.7. Samples: 12376008700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:37:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 22:37:18,662][15401] Updated weights for policy 0, policy_version 755363 (0.0040) [2024-06-24 22:37:22,342][15401] Updated weights for policy 0, policy_version 755373 (0.0027) [2024-06-24 22:37:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 12376047616. Throughput: 0: 42970.1. Samples: 12376136800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-24 22:37:23,392][15132] Avg episode reward: [(0, '0.825')] [2024-06-24 22:37:26,320][15401] Updated weights for policy 0, policy_version 755383 (0.0035) [2024-06-24 22:37:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12376276992. Throughput: 0: 42973.2. Samples: 12376397720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 22:37:30,205][15401] Updated weights for policy 0, policy_version 755393 (0.0030) [2024-06-24 22:37:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 12376489984. Throughput: 0: 42615.0. Samples: 12376640480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 22:37:34,118][15401] Updated weights for policy 0, policy_version 755403 (0.0037) [2024-06-24 22:37:37,935][15401] Updated weights for policy 0, policy_version 755413 (0.0028) [2024-06-24 22:37:38,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12376686592. Throughput: 0: 42910.0. Samples: 12376780580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 22:37:41,855][15401] Updated weights for policy 0, policy_version 755423 (0.0028) [2024-06-24 22:37:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12376883200. Throughput: 0: 42848.0. Samples: 12377037440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:43,393][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 22:37:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755426_12376899584.pth... [2024-06-24 22:37:43,557][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000754800_12366643200.pth [2024-06-24 22:37:45,378][15401] Updated weights for policy 0, policy_version 755433 (0.0037) [2024-06-24 22:37:47,647][15349] Signal inference workers to stop experience collection... (183200 times) [2024-06-24 22:37:47,677][15401] InferenceWorker_p0-w0: stopping experience collection (183200 times) [2024-06-24 22:37:47,707][15349] Signal inference workers to resume experience collection... (183200 times) [2024-06-24 22:37:47,712][15401] InferenceWorker_p0-w0: resuming experience collection (183200 times) [2024-06-24 22:37:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12377145344. Throughput: 0: 42709.1. Samples: 12377287280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:37:49,448][15401] Updated weights for policy 0, policy_version 755443 (0.0029) [2024-06-24 22:37:52,906][15401] Updated weights for policy 0, policy_version 755453 (0.0028) [2024-06-24 22:37:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12377341952. Throughput: 0: 42922.1. Samples: 12377423100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:53,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-24 22:37:57,151][15401] Updated weights for policy 0, policy_version 755463 (0.0032) [2024-06-24 22:37:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12377538560. Throughput: 0: 42736.4. Samples: 12377677980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:37:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 22:38:00,624][15401] Updated weights for policy 0, policy_version 755473 (0.0030) [2024-06-24 22:38:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12377784320. Throughput: 0: 42724.9. Samples: 12377931420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:03,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 22:38:04,773][15401] Updated weights for policy 0, policy_version 755483 (0.0039) [2024-06-24 22:38:08,353][15401] Updated weights for policy 0, policy_version 755493 (0.0048) [2024-06-24 22:38:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12377997312. Throughput: 0: 42827.2. Samples: 12378063920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 22:38:12,346][15401] Updated weights for policy 0, policy_version 755503 (0.0040) [2024-06-24 22:38:13,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12378193920. Throughput: 0: 42593.0. Samples: 12378314400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 22:38:15,974][15401] Updated weights for policy 0, policy_version 755513 (0.0032) [2024-06-24 22:38:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12378390528. Throughput: 0: 42840.4. Samples: 12378568300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:18,396][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 22:38:20,022][15401] Updated weights for policy 0, policy_version 755523 (0.0031) [2024-06-24 22:38:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42709.7). Total num frames: 12378619904. Throughput: 0: 42528.7. Samples: 12378694380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 22:38:23,910][15401] Updated weights for policy 0, policy_version 755533 (0.0026) [2024-06-24 22:38:27,907][15401] Updated weights for policy 0, policy_version 755543 (0.0030) [2024-06-24 22:38:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12378832896. Throughput: 0: 42541.3. Samples: 12378951800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:28,394][15132] Avg episode reward: [(0, '0.691')] [2024-06-24 22:38:31,584][15401] Updated weights for policy 0, policy_version 755553 (0.0041) [2024-06-24 22:38:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12379029504. Throughput: 0: 42518.2. Samples: 12379200600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 22:38:35,558][15401] Updated weights for policy 0, policy_version 755563 (0.0043) [2024-06-24 22:38:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12379258880. Throughput: 0: 42321.4. Samples: 12379327560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:38,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 22:38:39,295][15401] Updated weights for policy 0, policy_version 755573 (0.0033) [2024-06-24 22:38:43,342][15401] Updated weights for policy 0, policy_version 755583 (0.0033) [2024-06-24 22:38:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12379471872. Throughput: 0: 42462.8. Samples: 12379588800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 22:38:46,918][15401] Updated weights for policy 0, policy_version 755593 (0.0036) [2024-06-24 22:38:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 12379684864. Throughput: 0: 42453.8. Samples: 12379841840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:48,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 22:38:50,847][15401] Updated weights for policy 0, policy_version 755603 (0.0032) [2024-06-24 22:38:53,394][15132] Fps is (10 sec: 42578.2, 60 sec: 42595.1, 300 sec: 42764.3). Total num frames: 12379897856. Throughput: 0: 42452.4. Samples: 12379974480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:53,395][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 22:38:54,384][15401] Updated weights for policy 0, policy_version 755613 (0.0030) [2024-06-24 22:38:58,384][15401] Updated weights for policy 0, policy_version 755623 (0.0034) [2024-06-24 22:38:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12380127232. Throughput: 0: 42774.2. Samples: 12380239240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:38:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 22:39:02,067][15401] Updated weights for policy 0, policy_version 755633 (0.0044) [2024-06-24 22:39:03,389][15132] Fps is (10 sec: 42618.8, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 12380323840. Throughput: 0: 42750.0. Samples: 12380492040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:39:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 22:39:05,645][15349] Signal inference workers to stop experience collection... (183250 times) [2024-06-24 22:39:05,695][15401] InferenceWorker_p0-w0: stopping experience collection (183250 times) [2024-06-24 22:39:05,703][15349] Signal inference workers to resume experience collection... (183250 times) [2024-06-24 22:39:05,709][15401] InferenceWorker_p0-w0: resuming experience collection (183250 times) [2024-06-24 22:39:06,213][15401] Updated weights for policy 0, policy_version 755643 (0.0039) [2024-06-24 22:39:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 12380536832. Throughput: 0: 42858.2. Samples: 12380623000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-24 22:39:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 22:39:09,588][15401] Updated weights for policy 0, policy_version 755653 (0.0029) [2024-06-24 22:39:13,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12380749824. Throughput: 0: 42915.6. Samples: 12380883000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 22:39:13,911][15401] Updated weights for policy 0, policy_version 755663 (0.0041) [2024-06-24 22:39:17,187][15401] Updated weights for policy 0, policy_version 755673 (0.0024) [2024-06-24 22:39:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 12380995584. Throughput: 0: 43008.8. Samples: 12381136000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 22:39:21,450][15401] Updated weights for policy 0, policy_version 755683 (0.0039) [2024-06-24 22:39:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12381175808. Throughput: 0: 43195.1. Samples: 12381271340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:23,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-24 22:39:25,324][15401] Updated weights for policy 0, policy_version 755693 (0.0025) [2024-06-24 22:39:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12381388800. Throughput: 0: 42922.5. Samples: 12381520320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 22:39:28,949][15401] Updated weights for policy 0, policy_version 755703 (0.0036) [2024-06-24 22:39:32,886][15401] Updated weights for policy 0, policy_version 755713 (0.0037) [2024-06-24 22:39:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12381618176. Throughput: 0: 42868.0. Samples: 12381770800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 22:39:36,709][15401] Updated weights for policy 0, policy_version 755723 (0.0037) [2024-06-24 22:39:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12381814784. Throughput: 0: 42879.6. Samples: 12381903860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 22:39:40,432][15401] Updated weights for policy 0, policy_version 755733 (0.0034) [2024-06-24 22:39:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12382044160. Throughput: 0: 42594.2. Samples: 12382155980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-24 22:39:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755740_12382044160.pth... [2024-06-24 22:39:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755114_12371787776.pth [2024-06-24 22:39:44,301][15401] Updated weights for policy 0, policy_version 755743 (0.0039) [2024-06-24 22:39:48,111][15401] Updated weights for policy 0, policy_version 755753 (0.0035) [2024-06-24 22:39:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12382257152. Throughput: 0: 42619.9. Samples: 12382409940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 22:39:51,906][15401] Updated weights for policy 0, policy_version 755763 (0.0031) [2024-06-24 22:39:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42874.8, 300 sec: 42765.0). Total num frames: 12382470144. Throughput: 0: 42711.2. Samples: 12382545000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 22:39:56,041][15401] Updated weights for policy 0, policy_version 755773 (0.0029) [2024-06-24 22:39:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12382666752. Throughput: 0: 42392.0. Samples: 12382790640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:39:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 22:39:59,451][15401] Updated weights for policy 0, policy_version 755783 (0.0032) [2024-06-24 22:40:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 12382879744. Throughput: 0: 42650.8. Samples: 12383055280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-24 22:40:03,574][15401] Updated weights for policy 0, policy_version 755793 (0.0043) [2024-06-24 22:40:06,998][15401] Updated weights for policy 0, policy_version 755803 (0.0025) [2024-06-24 22:40:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12383109120. Throughput: 0: 42485.4. Samples: 12383183180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 22:40:11,632][15401] Updated weights for policy 0, policy_version 755813 (0.0028) [2024-06-24 22:40:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12383305728. Throughput: 0: 42621.1. Samples: 12383438260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 22:40:15,184][15401] Updated weights for policy 0, policy_version 755823 (0.0038) [2024-06-24 22:40:17,817][15349] Signal inference workers to stop experience collection... (183300 times) [2024-06-24 22:40:17,871][15401] InferenceWorker_p0-w0: stopping experience collection (183300 times) [2024-06-24 22:40:17,934][15349] Signal inference workers to resume experience collection... (183300 times) [2024-06-24 22:40:17,934][15401] InferenceWorker_p0-w0: resuming experience collection (183300 times) [2024-06-24 22:40:18,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12383535104. Throughput: 0: 42827.5. Samples: 12383698040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 22:40:19,067][15401] Updated weights for policy 0, policy_version 755833 (0.0026) [2024-06-24 22:40:23,103][15401] Updated weights for policy 0, policy_version 755843 (0.0029) [2024-06-24 22:40:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12383764480. Throughput: 0: 42661.8. Samples: 12383823640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 22:40:26,702][15401] Updated weights for policy 0, policy_version 755853 (0.0036) [2024-06-24 22:40:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12383961088. Throughput: 0: 42896.9. Samples: 12384086340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:28,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 22:40:30,566][15401] Updated weights for policy 0, policy_version 755863 (0.0026) [2024-06-24 22:40:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12384174080. Throughput: 0: 43004.9. Samples: 12384345160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 22:40:34,255][15401] Updated weights for policy 0, policy_version 755873 (0.0035) [2024-06-24 22:40:38,071][15401] Updated weights for policy 0, policy_version 755883 (0.0036) [2024-06-24 22:40:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12384387072. Throughput: 0: 42769.8. Samples: 12384469740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:38,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 22:40:41,967][15401] Updated weights for policy 0, policy_version 755893 (0.0034) [2024-06-24 22:40:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12384616448. Throughput: 0: 43157.7. Samples: 12384732740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-24 22:40:45,544][15401] Updated weights for policy 0, policy_version 755903 (0.0030) [2024-06-24 22:40:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12384829440. Throughput: 0: 42932.8. Samples: 12384987260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 22:40:48,394][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:40:49,507][15401] Updated weights for policy 0, policy_version 755913 (0.0026) [2024-06-24 22:40:53,254][15401] Updated weights for policy 0, policy_version 755923 (0.0037) [2024-06-24 22:40:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12385042432. Throughput: 0: 42892.4. Samples: 12385113340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:40:53,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-24 22:40:57,086][15401] Updated weights for policy 0, policy_version 755933 (0.0039) [2024-06-24 22:40:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12385255424. Throughput: 0: 43007.5. Samples: 12385373600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:40:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 22:41:00,928][15401] Updated weights for policy 0, policy_version 755943 (0.0030) [2024-06-24 22:41:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 12385484800. Throughput: 0: 42928.6. Samples: 12385629820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 22:41:04,704][15401] Updated weights for policy 0, policy_version 755953 (0.0036) [2024-06-24 22:41:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12385681408. Throughput: 0: 43040.5. Samples: 12385760460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:08,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-24 22:41:08,492][15401] Updated weights for policy 0, policy_version 755963 (0.0034) [2024-06-24 22:41:12,253][15401] Updated weights for policy 0, policy_version 755973 (0.0037) [2024-06-24 22:41:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12385894400. Throughput: 0: 42868.1. Samples: 12386015400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:13,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-24 22:41:16,357][15401] Updated weights for policy 0, policy_version 755983 (0.0041) [2024-06-24 22:41:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.7, 300 sec: 42820.5). Total num frames: 12386123776. Throughput: 0: 42832.5. Samples: 12386272620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 22:41:19,893][15401] Updated weights for policy 0, policy_version 755993 (0.0040) [2024-06-24 22:41:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12386320384. Throughput: 0: 43032.1. Samples: 12386406080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 22:41:23,881][15401] Updated weights for policy 0, policy_version 756003 (0.0040) [2024-06-24 22:41:27,814][15401] Updated weights for policy 0, policy_version 756013 (0.0032) [2024-06-24 22:41:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12386533376. Throughput: 0: 42806.8. Samples: 12386659040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-24 22:41:31,642][15401] Updated weights for policy 0, policy_version 756023 (0.0035) [2024-06-24 22:41:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12386762752. Throughput: 0: 42752.8. Samples: 12386911140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:41:35,599][15401] Updated weights for policy 0, policy_version 756033 (0.0034) [2024-06-24 22:41:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12386959360. Throughput: 0: 42807.4. Samples: 12387039680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 22:41:39,482][15401] Updated weights for policy 0, policy_version 756043 (0.0032) [2024-06-24 22:41:43,164][15401] Updated weights for policy 0, policy_version 756053 (0.0033) [2024-06-24 22:41:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12387172352. Throughput: 0: 42717.0. Samples: 12387295860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:43,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 22:41:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756054_12387188736.pth... [2024-06-24 22:41:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755426_12376899584.pth [2024-06-24 22:41:47,230][15401] Updated weights for policy 0, policy_version 756063 (0.0041) [2024-06-24 22:41:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12387385344. Throughput: 0: 42801.3. Samples: 12387555880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:48,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-24 22:41:49,226][15349] Signal inference workers to stop experience collection... (183350 times) [2024-06-24 22:41:49,274][15401] InferenceWorker_p0-w0: stopping experience collection (183350 times) [2024-06-24 22:41:49,279][15349] Signal inference workers to resume experience collection... (183350 times) [2024-06-24 22:41:49,290][15401] InferenceWorker_p0-w0: resuming experience collection (183350 times) [2024-06-24 22:41:50,645][15401] Updated weights for policy 0, policy_version 756073 (0.0047) [2024-06-24 22:41:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12387598336. Throughput: 0: 42719.9. Samples: 12387682860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:53,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-24 22:41:54,676][15401] Updated weights for policy 0, policy_version 756083 (0.0029) [2024-06-24 22:41:58,351][15401] Updated weights for policy 0, policy_version 756093 (0.0036) [2024-06-24 22:41:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 12387827712. Throughput: 0: 42760.8. Samples: 12387939640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:41:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 22:42:02,237][15401] Updated weights for policy 0, policy_version 756103 (0.0043) [2024-06-24 22:42:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12388024320. Throughput: 0: 42774.7. Samples: 12388197480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 22:42:05,795][15401] Updated weights for policy 0, policy_version 756113 (0.0040) [2024-06-24 22:42:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12388253696. Throughput: 0: 42684.4. Samples: 12388326880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 22:42:09,743][15401] Updated weights for policy 0, policy_version 756123 (0.0027) [2024-06-24 22:42:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12388466688. Throughput: 0: 42841.3. Samples: 12388586900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 22:42:13,632][15401] Updated weights for policy 0, policy_version 756133 (0.0038) [2024-06-24 22:42:17,386][15401] Updated weights for policy 0, policy_version 756143 (0.0026) [2024-06-24 22:42:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 12388663296. Throughput: 0: 42935.2. Samples: 12388843220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 22:42:21,228][15401] Updated weights for policy 0, policy_version 756153 (0.0050) [2024-06-24 22:42:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12388892672. Throughput: 0: 42944.4. Samples: 12388972180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:23,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 22:42:25,268][15401] Updated weights for policy 0, policy_version 756163 (0.0027) [2024-06-24 22:42:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12389105664. Throughput: 0: 42867.1. Samples: 12389224880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 22:42:29,051][15401] Updated weights for policy 0, policy_version 756173 (0.0034) [2024-06-24 22:42:32,897][15401] Updated weights for policy 0, policy_version 756183 (0.0028) [2024-06-24 22:42:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12389302272. Throughput: 0: 42760.9. Samples: 12389480120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-24 22:42:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 22:42:36,611][15401] Updated weights for policy 0, policy_version 756193 (0.0043) [2024-06-24 22:42:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12389548032. Throughput: 0: 42822.2. Samples: 12389609860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:42:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 22:42:40,731][15401] Updated weights for policy 0, policy_version 756203 (0.0030) [2024-06-24 22:42:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12389728256. Throughput: 0: 42833.9. Samples: 12389867160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:42:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 22:42:44,227][15401] Updated weights for policy 0, policy_version 756213 (0.0032) [2024-06-24 22:42:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12389941248. Throughput: 0: 42843.1. Samples: 12390125420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:42:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 22:42:48,429][15401] Updated weights for policy 0, policy_version 756223 (0.0029) [2024-06-24 22:42:51,923][15401] Updated weights for policy 0, policy_version 756233 (0.0047) [2024-06-24 22:42:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12390187008. Throughput: 0: 42839.7. Samples: 12390254660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:42:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-24 22:42:56,034][15401] Updated weights for policy 0, policy_version 756243 (0.0030) [2024-06-24 22:42:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12390383616. Throughput: 0: 42751.1. Samples: 12390510700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:42:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 22:42:59,494][15401] Updated weights for policy 0, policy_version 756253 (0.0040) [2024-06-24 22:43:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12390596608. Throughput: 0: 42754.3. Samples: 12390767160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-24 22:43:03,637][15401] Updated weights for policy 0, policy_version 756263 (0.0039) [2024-06-24 22:43:06,895][15401] Updated weights for policy 0, policy_version 756273 (0.0029) [2024-06-24 22:43:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12390842368. Throughput: 0: 42840.5. Samples: 12390900000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-24 22:43:11,173][15401] Updated weights for policy 0, policy_version 756283 (0.0031) [2024-06-24 22:43:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12391006208. Throughput: 0: 42902.1. Samples: 12391155480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 22:43:14,813][15401] Updated weights for policy 0, policy_version 756293 (0.0041) [2024-06-24 22:43:14,814][15349] Signal inference workers to stop experience collection... (183400 times) [2024-06-24 22:43:14,815][15349] Signal inference workers to resume experience collection... (183400 times) [2024-06-24 22:43:14,836][15401] InferenceWorker_p0-w0: stopping experience collection (183400 times) [2024-06-24 22:43:14,840][15401] InferenceWorker_p0-w0: resuming experience collection (183400 times) [2024-06-24 22:43:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12391235584. Throughput: 0: 42914.6. Samples: 12391411280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 22:43:19,070][15401] Updated weights for policy 0, policy_version 756303 (0.0028) [2024-06-24 22:43:22,419][15401] Updated weights for policy 0, policy_version 756313 (0.0030) [2024-06-24 22:43:23,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12391481344. Throughput: 0: 42845.5. Samples: 12391537900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 22:43:26,668][15401] Updated weights for policy 0, policy_version 756323 (0.0033) [2024-06-24 22:43:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12391661568. Throughput: 0: 42787.9. Samples: 12391792620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 22:43:30,231][15401] Updated weights for policy 0, policy_version 756333 (0.0036) [2024-06-24 22:43:33,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12391874560. Throughput: 0: 42661.6. Samples: 12392045200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 22:43:34,363][15401] Updated weights for policy 0, policy_version 756343 (0.0047) [2024-06-24 22:43:37,868][15401] Updated weights for policy 0, policy_version 756353 (0.0028) [2024-06-24 22:43:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12392087552. Throughput: 0: 42607.1. Samples: 12392171980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 22:43:42,366][15401] Updated weights for policy 0, policy_version 756363 (0.0026) [2024-06-24 22:43:43,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.6, 300 sec: 42709.5). Total num frames: 12392284160. Throughput: 0: 42660.0. Samples: 12392430500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:43,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 22:43:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756366_12392300544.pth... [2024-06-24 22:43:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000755740_12382044160.pth [2024-06-24 22:43:45,592][15401] Updated weights for policy 0, policy_version 756373 (0.0028) [2024-06-24 22:43:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.8, 300 sec: 42765.4). Total num frames: 12392513536. Throughput: 0: 42558.2. Samples: 12392682380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:48,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 22:43:49,846][15401] Updated weights for policy 0, policy_version 756383 (0.0034) [2024-06-24 22:43:53,356][15401] Updated weights for policy 0, policy_version 756393 (0.0041) [2024-06-24 22:43:53,395][15132] Fps is (10 sec: 45861.0, 60 sec: 42594.4, 300 sec: 42764.2). Total num frames: 12392742912. Throughput: 0: 42483.2. Samples: 12392811980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:53,395][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 22:43:57,731][15401] Updated weights for policy 0, policy_version 756403 (0.0032) [2024-06-24 22:43:58,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12392923136. Throughput: 0: 42545.0. Samples: 12393070000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:43:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 22:44:00,867][15401] Updated weights for policy 0, policy_version 756413 (0.0046) [2024-06-24 22:44:03,390][15132] Fps is (10 sec: 42621.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12393168896. Throughput: 0: 42468.0. Samples: 12393322340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:44:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 22:44:05,168][15401] Updated weights for policy 0, policy_version 756423 (0.0038) [2024-06-24 22:44:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 12393365504. Throughput: 0: 42682.5. Samples: 12393458620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:44:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 22:44:08,710][15401] Updated weights for policy 0, policy_version 756433 (0.0031) [2024-06-24 22:44:13,302][15401] Updated weights for policy 0, policy_version 756443 (0.0033) [2024-06-24 22:44:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12393562112. Throughput: 0: 42591.2. Samples: 12393709220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:44:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 22:44:16,338][15401] Updated weights for policy 0, policy_version 756453 (0.0022) [2024-06-24 22:44:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12393807872. Throughput: 0: 42732.2. Samples: 12393968140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-24 22:44:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 22:44:20,836][15401] Updated weights for policy 0, policy_version 756463 (0.0040) [2024-06-24 22:44:23,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12394020864. Throughput: 0: 42720.9. Samples: 12394094420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:23,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 22:44:23,605][15349] Signal inference workers to stop experience collection... (183450 times) [2024-06-24 22:44:23,651][15401] InferenceWorker_p0-w0: stopping experience collection (183450 times) [2024-06-24 22:44:23,651][15349] Signal inference workers to resume experience collection... (183450 times) [2024-06-24 22:44:23,674][15401] InferenceWorker_p0-w0: resuming experience collection (183450 times) [2024-06-24 22:44:23,788][15401] Updated weights for policy 0, policy_version 756473 (0.0053) [2024-06-24 22:44:28,244][15401] Updated weights for policy 0, policy_version 756483 (0.0041) [2024-06-24 22:44:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12394217472. Throughput: 0: 42644.5. Samples: 12394349400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 22:44:31,655][15401] Updated weights for policy 0, policy_version 756493 (0.0030) [2024-06-24 22:44:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12394446848. Throughput: 0: 42879.2. Samples: 12394611840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 22:44:35,700][15401] Updated weights for policy 0, policy_version 756503 (0.0031) [2024-06-24 22:44:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12394676224. Throughput: 0: 43017.2. Samples: 12394747520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:38,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 22:44:39,213][15401] Updated weights for policy 0, policy_version 756513 (0.0032) [2024-06-24 22:44:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12394856448. Throughput: 0: 42750.1. Samples: 12394993760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 22:44:43,458][15401] Updated weights for policy 0, policy_version 756523 (0.0039) [2024-06-24 22:44:46,897][15401] Updated weights for policy 0, policy_version 756533 (0.0044) [2024-06-24 22:44:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12395085824. Throughput: 0: 42943.5. Samples: 12395254800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:44:51,321][15401] Updated weights for policy 0, policy_version 756543 (0.0037) [2024-06-24 22:44:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42329.3, 300 sec: 42765.0). Total num frames: 12395282432. Throughput: 0: 42821.9. Samples: 12395385600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 22:44:54,395][15401] Updated weights for policy 0, policy_version 756553 (0.0032) [2024-06-24 22:44:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12395511808. Throughput: 0: 42821.6. Samples: 12395636200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:44:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 22:44:58,727][15401] Updated weights for policy 0, policy_version 756563 (0.0037) [2024-06-24 22:45:02,017][15401] Updated weights for policy 0, policy_version 756573 (0.0034) [2024-06-24 22:45:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12395724800. Throughput: 0: 42851.1. Samples: 12395896440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 22:45:06,107][15401] Updated weights for policy 0, policy_version 756583 (0.0022) [2024-06-24 22:45:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12395937792. Throughput: 0: 42961.3. Samples: 12396027680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 22:45:09,790][15401] Updated weights for policy 0, policy_version 756593 (0.0023) [2024-06-24 22:45:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 12396167168. Throughput: 0: 43025.3. Samples: 12396285540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:45:13,519][15401] Updated weights for policy 0, policy_version 756603 (0.0032) [2024-06-24 22:45:17,335][15401] Updated weights for policy 0, policy_version 756613 (0.0036) [2024-06-24 22:45:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12396380160. Throughput: 0: 42803.5. Samples: 12396538000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:18,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 22:45:21,103][15401] Updated weights for policy 0, policy_version 756623 (0.0029) [2024-06-24 22:45:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12396593152. Throughput: 0: 42660.9. Samples: 12396667260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 22:45:24,970][15401] Updated weights for policy 0, policy_version 756633 (0.0040) [2024-06-24 22:45:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12396789760. Throughput: 0: 42928.2. Samples: 12396925520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:45:28,750][15401] Updated weights for policy 0, policy_version 756643 (0.0033) [2024-06-24 22:45:32,639][15401] Updated weights for policy 0, policy_version 756653 (0.0036) [2024-06-24 22:45:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 12397019136. Throughput: 0: 42799.6. Samples: 12397180780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:33,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-24 22:45:36,491][15401] Updated weights for policy 0, policy_version 756663 (0.0027) [2024-06-24 22:45:38,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12397215744. Throughput: 0: 42794.6. Samples: 12397311360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 22:45:40,255][15401] Updated weights for policy 0, policy_version 756673 (0.0028) [2024-06-24 22:45:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12397445120. Throughput: 0: 42988.8. Samples: 12397570700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 22:45:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756681_12397461504.pth... [2024-06-24 22:45:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756054_12387188736.pth [2024-06-24 22:45:44,321][15401] Updated weights for policy 0, policy_version 756683 (0.0026) [2024-06-24 22:45:46,612][15349] Signal inference workers to stop experience collection... (183500 times) [2024-06-24 22:45:46,662][15401] InferenceWorker_p0-w0: stopping experience collection (183500 times) [2024-06-24 22:45:46,662][15349] Signal inference workers to resume experience collection... (183500 times) [2024-06-24 22:45:46,678][15401] InferenceWorker_p0-w0: resuming experience collection (183500 times) [2024-06-24 22:45:47,887][15401] Updated weights for policy 0, policy_version 756693 (0.0038) [2024-06-24 22:45:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12397658112. Throughput: 0: 42887.2. Samples: 12397826360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:48,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 22:45:51,968][15401] Updated weights for policy 0, policy_version 756703 (0.0026) [2024-06-24 22:45:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12397871104. Throughput: 0: 42918.6. Samples: 12397959020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 22:45:55,683][15401] Updated weights for policy 0, policy_version 756713 (0.0034) [2024-06-24 22:45:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12398067712. Throughput: 0: 42733.5. Samples: 12398208540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:45:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-24 22:45:59,628][15401] Updated weights for policy 0, policy_version 756723 (0.0041) [2024-06-24 22:46:03,330][15401] Updated weights for policy 0, policy_version 756733 (0.0028) [2024-06-24 22:46:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12398313472. Throughput: 0: 42891.0. Samples: 12398468100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 22:46:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 22:46:07,266][15401] Updated weights for policy 0, policy_version 756743 (0.0043) [2024-06-24 22:46:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12398510080. Throughput: 0: 42789.8. Samples: 12398592800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 22:46:10,926][15401] Updated weights for policy 0, policy_version 756753 (0.0032) [2024-06-24 22:46:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12398723072. Throughput: 0: 42776.8. Samples: 12398850480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 22:46:14,929][15401] Updated weights for policy 0, policy_version 756763 (0.0036) [2024-06-24 22:46:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12398952448. Throughput: 0: 42850.3. Samples: 12399109040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 22:46:18,547][15401] Updated weights for policy 0, policy_version 756773 (0.0032) [2024-06-24 22:46:22,629][15401] Updated weights for policy 0, policy_version 756783 (0.0035) [2024-06-24 22:46:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12399149056. Throughput: 0: 42793.8. Samples: 12399237080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 22:46:26,422][15401] Updated weights for policy 0, policy_version 756793 (0.0034) [2024-06-24 22:46:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12399378432. Throughput: 0: 42582.3. Samples: 12399486900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 22:46:30,249][15401] Updated weights for policy 0, policy_version 756803 (0.0026) [2024-06-24 22:46:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12399575040. Throughput: 0: 42688.3. Samples: 12399747340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 22:46:34,296][15401] Updated weights for policy 0, policy_version 756813 (0.0022) [2024-06-24 22:46:38,173][15401] Updated weights for policy 0, policy_version 756823 (0.0041) [2024-06-24 22:46:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12399788032. Throughput: 0: 42512.9. Samples: 12399872100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 22:46:41,763][15401] Updated weights for policy 0, policy_version 756833 (0.0029) [2024-06-24 22:46:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12400033792. Throughput: 0: 42702.5. Samples: 12400130160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-24 22:46:45,714][15401] Updated weights for policy 0, policy_version 756843 (0.0028) [2024-06-24 22:46:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12400230400. Throughput: 0: 42685.8. Samples: 12400388960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 22:46:49,358][15401] Updated weights for policy 0, policy_version 756853 (0.0025) [2024-06-24 22:46:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12400427008. Throughput: 0: 42679.5. Samples: 12400513380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 22:46:53,912][15401] Updated weights for policy 0, policy_version 756863 (0.0042) [2024-06-24 22:46:57,257][15401] Updated weights for policy 0, policy_version 756873 (0.0036) [2024-06-24 22:46:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12400656384. Throughput: 0: 42713.2. Samples: 12400772580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:46:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 22:47:01,551][15401] Updated weights for policy 0, policy_version 756883 (0.0036) [2024-06-24 22:47:02,038][15349] Signal inference workers to stop experience collection... (183550 times) [2024-06-24 22:47:02,039][15349] Signal inference workers to resume experience collection... (183550 times) [2024-06-24 22:47:02,058][15401] InferenceWorker_p0-w0: stopping experience collection (183550 times) [2024-06-24 22:47:02,059][15401] InferenceWorker_p0-w0: resuming experience collection (183550 times) [2024-06-24 22:47:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12400869376. Throughput: 0: 42621.8. Samples: 12401027020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 22:47:05,043][15401] Updated weights for policy 0, policy_version 756893 (0.0033) [2024-06-24 22:47:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12401065984. Throughput: 0: 42585.3. Samples: 12401153420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 22:47:09,245][15401] Updated weights for policy 0, policy_version 756903 (0.0031) [2024-06-24 22:47:12,639][15401] Updated weights for policy 0, policy_version 756913 (0.0038) [2024-06-24 22:47:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 12401295360. Throughput: 0: 42798.1. Samples: 12401412920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:13,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 22:47:16,957][15401] Updated weights for policy 0, policy_version 756923 (0.0037) [2024-06-24 22:47:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12401491968. Throughput: 0: 42690.6. Samples: 12401668420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:18,391][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 22:47:20,209][15401] Updated weights for policy 0, policy_version 756933 (0.0029) [2024-06-24 22:47:23,390][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12401721344. Throughput: 0: 42659.1. Samples: 12401791760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:23,390][15132] Avg episode reward: [(0, '0.890')] [2024-06-24 22:47:24,606][15401] Updated weights for policy 0, policy_version 756943 (0.0035) [2024-06-24 22:47:27,673][15401] Updated weights for policy 0, policy_version 756953 (0.0040) [2024-06-24 22:47:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12401934336. Throughput: 0: 42805.9. Samples: 12402056420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-24 22:47:32,130][15401] Updated weights for policy 0, policy_version 756963 (0.0038) [2024-06-24 22:47:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12402130944. Throughput: 0: 42636.9. Samples: 12402307620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 22:47:36,004][15401] Updated weights for policy 0, policy_version 756973 (0.0038) [2024-06-24 22:47:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12402376704. Throughput: 0: 42689.4. Samples: 12402434400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 22:47:39,770][15401] Updated weights for policy 0, policy_version 756983 (0.0035) [2024-06-24 22:47:43,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42047.8, 300 sec: 42764.1). Total num frames: 12402556928. Throughput: 0: 42705.5. Samples: 12402694600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:43,396][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:47:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756993_12402573312.pth... [2024-06-24 22:47:43,523][15401] Updated weights for policy 0, policy_version 756993 (0.0029) [2024-06-24 22:47:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756366_12392300544.pth [2024-06-24 22:47:47,325][15401] Updated weights for policy 0, policy_version 757003 (0.0034) [2024-06-24 22:47:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12402786304. Throughput: 0: 42709.0. Samples: 12402948920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 22:47:48,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 22:47:51,020][15401] Updated weights for policy 0, policy_version 757013 (0.0036) [2024-06-24 22:47:53,392][15132] Fps is (10 sec: 45893.9, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 12403015680. Throughput: 0: 42653.3. Samples: 12403072920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:47:53,392][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 22:47:55,006][15401] Updated weights for policy 0, policy_version 757023 (0.0027) [2024-06-24 22:47:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12403195904. Throughput: 0: 42703.2. Samples: 12403334460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:47:58,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-24 22:47:58,753][15401] Updated weights for policy 0, policy_version 757033 (0.0040) [2024-06-24 22:48:02,539][15401] Updated weights for policy 0, policy_version 757043 (0.0030) [2024-06-24 22:48:03,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12403425280. Throughput: 0: 42605.4. Samples: 12403585660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:48:06,476][15401] Updated weights for policy 0, policy_version 757053 (0.0033) [2024-06-24 22:48:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12403638272. Throughput: 0: 42755.6. Samples: 12403715760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 22:48:10,100][15401] Updated weights for policy 0, policy_version 757063 (0.0038) [2024-06-24 22:48:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.2, 300 sec: 42709.5). Total num frames: 12403834880. Throughput: 0: 42549.9. Samples: 12403971160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-24 22:48:14,191][15401] Updated weights for policy 0, policy_version 757073 (0.0031) [2024-06-24 22:48:17,674][15401] Updated weights for policy 0, policy_version 757083 (0.0032) [2024-06-24 22:48:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12404064256. Throughput: 0: 42539.1. Samples: 12404221880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 22:48:21,982][15401] Updated weights for policy 0, policy_version 757093 (0.0038) [2024-06-24 22:48:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12404277248. Throughput: 0: 42603.1. Samples: 12404351540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-24 22:48:25,308][15401] Updated weights for policy 0, policy_version 757103 (0.0032) [2024-06-24 22:48:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12404457472. Throughput: 0: 42370.0. Samples: 12404600980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:48:29,885][15401] Updated weights for policy 0, policy_version 757113 (0.0034) [2024-06-24 22:48:33,011][15401] Updated weights for policy 0, policy_version 757123 (0.0030) [2024-06-24 22:48:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12404703232. Throughput: 0: 42280.8. Samples: 12404851560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 22:48:33,524][15349] Signal inference workers to stop experience collection... (183600 times) [2024-06-24 22:48:33,524][15349] Signal inference workers to resume experience collection... (183600 times) [2024-06-24 22:48:33,543][15401] InferenceWorker_p0-w0: stopping experience collection (183600 times) [2024-06-24 22:48:33,543][15401] InferenceWorker_p0-w0: resuming experience collection (183600 times) [2024-06-24 22:48:37,764][15401] Updated weights for policy 0, policy_version 757133 (0.0037) [2024-06-24 22:48:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42765.3). Total num frames: 12404899840. Throughput: 0: 42498.5. Samples: 12404985260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 22:48:40,900][15401] Updated weights for policy 0, policy_version 757143 (0.0031) [2024-06-24 22:48:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42602.9, 300 sec: 42709.8). Total num frames: 12405112832. Throughput: 0: 42293.4. Samples: 12405237660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:43,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-24 22:48:45,399][15401] Updated weights for policy 0, policy_version 757153 (0.0038) [2024-06-24 22:48:48,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.5, 300 sec: 42654.4). Total num frames: 12405325824. Throughput: 0: 42294.2. Samples: 12405489000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:48,393][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 22:48:48,705][15401] Updated weights for policy 0, policy_version 757163 (0.0038) [2024-06-24 22:48:53,126][15401] Updated weights for policy 0, policy_version 757173 (0.0029) [2024-06-24 22:48:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41780.9, 300 sec: 42709.5). Total num frames: 12405522432. Throughput: 0: 42243.6. Samples: 12405616720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 22:48:56,366][15401] Updated weights for policy 0, policy_version 757183 (0.0031) [2024-06-24 22:48:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12405735424. Throughput: 0: 42211.4. Samples: 12405870680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:48:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-24 22:49:00,870][15401] Updated weights for policy 0, policy_version 757193 (0.0030) [2024-06-24 22:49:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12405964800. Throughput: 0: 42257.0. Samples: 12406123440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 22:49:04,089][15401] Updated weights for policy 0, policy_version 757203 (0.0046) [2024-06-24 22:49:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12406161408. Throughput: 0: 42199.6. Samples: 12406250520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 22:49:08,738][15401] Updated weights for policy 0, policy_version 757213 (0.0039) [2024-06-24 22:49:12,041][15401] Updated weights for policy 0, policy_version 757223 (0.0029) [2024-06-24 22:49:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12406374400. Throughput: 0: 42217.3. Samples: 12406500760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 22:49:16,417][15401] Updated weights for policy 0, policy_version 757233 (0.0030) [2024-06-24 22:49:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 12406587392. Throughput: 0: 42381.4. Samples: 12406758720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:18,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 22:49:19,599][15401] Updated weights for policy 0, policy_version 757243 (0.0038) [2024-06-24 22:49:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 12406784000. Throughput: 0: 42252.6. Samples: 12406886620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 22:49:24,208][15401] Updated weights for policy 0, policy_version 757253 (0.0047) [2024-06-24 22:49:27,270][15401] Updated weights for policy 0, policy_version 757263 (0.0023) [2024-06-24 22:49:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12407029760. Throughput: 0: 42211.1. Samples: 12407137160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 22:49:32,062][15401] Updated weights for policy 0, policy_version 757273 (0.0029) [2024-06-24 22:49:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12407226368. Throughput: 0: 42308.6. Samples: 12407392780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 22:49:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 22:49:34,853][15401] Updated weights for policy 0, policy_version 757283 (0.0041) [2024-06-24 22:49:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 12407422976. Throughput: 0: 42300.0. Samples: 12407520220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:49:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 22:49:39,540][15401] Updated weights for policy 0, policy_version 757293 (0.0033) [2024-06-24 22:49:42,810][15401] Updated weights for policy 0, policy_version 757303 (0.0036) [2024-06-24 22:49:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12407685120. Throughput: 0: 42365.4. Samples: 12407777120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:49:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 22:49:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757305_12407685120.pth... [2024-06-24 22:49:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756681_12397461504.pth [2024-06-24 22:49:47,150][15401] Updated weights for policy 0, policy_version 757313 (0.0033) [2024-06-24 22:49:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12407865344. Throughput: 0: 42394.7. Samples: 12408031200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:49:48,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 22:49:50,838][15401] Updated weights for policy 0, policy_version 757323 (0.0034) [2024-06-24 22:49:53,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12408061952. Throughput: 0: 42292.9. Samples: 12408153700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:49:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 22:49:55,002][15401] Updated weights for policy 0, policy_version 757333 (0.0066) [2024-06-24 22:49:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12408291328. Throughput: 0: 42357.9. Samples: 12408406860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:49:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 22:49:58,490][15401] Updated weights for policy 0, policy_version 757343 (0.0029) [2024-06-24 22:50:02,638][15401] Updated weights for policy 0, policy_version 757353 (0.0034) [2024-06-24 22:50:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12408504320. Throughput: 0: 42335.0. Samples: 12408663800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:03,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-24 22:50:06,106][15401] Updated weights for policy 0, policy_version 757363 (0.0045) [2024-06-24 22:50:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12408717312. Throughput: 0: 42275.9. Samples: 12408789040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:08,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 22:50:10,338][15401] Updated weights for policy 0, policy_version 757373 (0.0035) [2024-06-24 22:50:11,888][15349] Signal inference workers to stop experience collection... (183650 times) [2024-06-24 22:50:11,936][15401] InferenceWorker_p0-w0: stopping experience collection (183650 times) [2024-06-24 22:50:11,945][15349] Signal inference workers to resume experience collection... (183650 times) [2024-06-24 22:50:11,955][15401] InferenceWorker_p0-w0: resuming experience collection (183650 times) [2024-06-24 22:50:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12408930304. Throughput: 0: 42420.1. Samples: 12409046060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:13,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-24 22:50:13,867][15401] Updated weights for policy 0, policy_version 757383 (0.0029) [2024-06-24 22:50:18,241][15401] Updated weights for policy 0, policy_version 757393 (0.0037) [2024-06-24 22:50:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12409126912. Throughput: 0: 42537.7. Samples: 12409306980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:18,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-24 22:50:21,582][15401] Updated weights for policy 0, policy_version 757403 (0.0027) [2024-06-24 22:50:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12409372672. Throughput: 0: 42398.6. Samples: 12409428160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:23,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 22:50:26,325][15401] Updated weights for policy 0, policy_version 757413 (0.0030) [2024-06-24 22:50:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12409569280. Throughput: 0: 42509.3. Samples: 12409690040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 22:50:29,213][15401] Updated weights for policy 0, policy_version 757423 (0.0039) [2024-06-24 22:50:33,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42050.5, 300 sec: 42487.0). Total num frames: 12409749504. Throughput: 0: 42629.7. Samples: 12409949640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:33,392][15132] Avg episode reward: [(0, '0.867')] [2024-06-24 22:50:33,955][15401] Updated weights for policy 0, policy_version 757433 (0.0044) [2024-06-24 22:50:36,757][15401] Updated weights for policy 0, policy_version 757443 (0.0033) [2024-06-24 22:50:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12410011648. Throughput: 0: 42710.2. Samples: 12410075660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:38,390][15132] Avg episode reward: [(0, '0.902')] [2024-06-24 22:50:41,474][15401] Updated weights for policy 0, policy_version 757453 (0.0031) [2024-06-24 22:50:43,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12410208256. Throughput: 0: 42828.9. Samples: 12410334160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:43,390][15132] Avg episode reward: [(0, '0.890')] [2024-06-24 22:50:44,402][15401] Updated weights for policy 0, policy_version 757463 (0.0040) [2024-06-24 22:50:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12410404864. Throughput: 0: 43058.7. Samples: 12410601440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:48,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-24 22:50:48,928][15401] Updated weights for policy 0, policy_version 757473 (0.0030) [2024-06-24 22:50:51,769][15401] Updated weights for policy 0, policy_version 757483 (0.0037) [2024-06-24 22:50:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 12410667008. Throughput: 0: 43085.0. Samples: 12410727860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-24 22:50:56,505][15401] Updated weights for policy 0, policy_version 757493 (0.0030) [2024-06-24 22:50:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 12410863616. Throughput: 0: 43106.5. Samples: 12410985860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:50:58,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-24 22:50:59,737][15401] Updated weights for policy 0, policy_version 757503 (0.0033) [2024-06-24 22:51:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12411060224. Throughput: 0: 42987.1. Samples: 12411241400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:51:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 22:51:04,149][15401] Updated weights for policy 0, policy_version 757513 (0.0023) [2024-06-24 22:51:07,220][15401] Updated weights for policy 0, policy_version 757523 (0.0023) [2024-06-24 22:51:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12411305984. Throughput: 0: 43092.8. Samples: 12411367340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:51:08,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 22:51:11,733][15401] Updated weights for policy 0, policy_version 757533 (0.0028) [2024-06-24 22:51:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12411502592. Throughput: 0: 43141.8. Samples: 12411631420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-24 22:51:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 22:51:14,885][15401] Updated weights for policy 0, policy_version 757543 (0.0034) [2024-06-24 22:51:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 12411715584. Throughput: 0: 42933.1. Samples: 12411881520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 22:51:19,494][15401] Updated weights for policy 0, policy_version 757553 (0.0040) [2024-06-24 22:51:19,692][15349] Signal inference workers to stop experience collection... (183700 times) [2024-06-24 22:51:19,736][15401] InferenceWorker_p0-w0: stopping experience collection (183700 times) [2024-06-24 22:51:19,805][15349] Signal inference workers to resume experience collection... (183700 times) [2024-06-24 22:51:19,806][15401] InferenceWorker_p0-w0: resuming experience collection (183700 times) [2024-06-24 22:51:22,478][15401] Updated weights for policy 0, policy_version 757563 (0.0046) [2024-06-24 22:51:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12411944960. Throughput: 0: 43000.9. Samples: 12412010700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 22:51:27,108][15401] Updated weights for policy 0, policy_version 757573 (0.0044) [2024-06-24 22:51:28,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12412157952. Throughput: 0: 43029.7. Samples: 12412270500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:28,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 22:51:30,176][15401] Updated weights for policy 0, policy_version 757583 (0.0039) [2024-06-24 22:51:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 43417.6, 300 sec: 42598.1). Total num frames: 12412354560. Throughput: 0: 42724.8. Samples: 12412524160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:33,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 22:51:34,577][15401] Updated weights for policy 0, policy_version 757593 (0.0040) [2024-06-24 22:51:37,933][15401] Updated weights for policy 0, policy_version 757603 (0.0036) [2024-06-24 22:51:38,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 12412567552. Throughput: 0: 42746.0. Samples: 12412651540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:38,392][15132] Avg episode reward: [(0, '0.565')] [2024-06-24 22:51:42,107][15401] Updated weights for policy 0, policy_version 757613 (0.0033) [2024-06-24 22:51:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12412780544. Throughput: 0: 42772.1. Samples: 12412910600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 22:51:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757616_12412780544.pth... [2024-06-24 22:51:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000756993_12402573312.pth [2024-06-24 22:51:46,080][15401] Updated weights for policy 0, policy_version 757623 (0.0041) [2024-06-24 22:51:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 12413009920. Throughput: 0: 42620.5. Samples: 12413159320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 22:51:50,222][15401] Updated weights for policy 0, policy_version 757633 (0.0026) [2024-06-24 22:51:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 12413206528. Throughput: 0: 42758.2. Samples: 12413291460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:53,391][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 22:51:53,562][15401] Updated weights for policy 0, policy_version 757643 (0.0036) [2024-06-24 22:51:57,608][15401] Updated weights for policy 0, policy_version 757653 (0.0023) [2024-06-24 22:51:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12413419520. Throughput: 0: 42665.3. Samples: 12413551360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:51:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-24 22:52:01,205][15401] Updated weights for policy 0, policy_version 757663 (0.0031) [2024-06-24 22:52:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12413665280. Throughput: 0: 42628.3. Samples: 12413799800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-24 22:52:05,503][15401] Updated weights for policy 0, policy_version 757673 (0.0033) [2024-06-24 22:52:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 12413845504. Throughput: 0: 42725.2. Samples: 12413933340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-24 22:52:08,917][15401] Updated weights for policy 0, policy_version 757683 (0.0026) [2024-06-24 22:52:13,173][15401] Updated weights for policy 0, policy_version 757693 (0.0033) [2024-06-24 22:52:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12414058496. Throughput: 0: 42681.8. Samples: 12414191180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-24 22:52:16,660][15401] Updated weights for policy 0, policy_version 757703 (0.0043) [2024-06-24 22:52:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12414287872. Throughput: 0: 42553.5. Samples: 12414438960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:18,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-24 22:52:20,729][15401] Updated weights for policy 0, policy_version 757713 (0.0036) [2024-06-24 22:52:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12414500864. Throughput: 0: 42625.8. Samples: 12414569600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-24 22:52:24,344][15401] Updated weights for policy 0, policy_version 757723 (0.0041) [2024-06-24 22:52:28,332][15401] Updated weights for policy 0, policy_version 757733 (0.0034) [2024-06-24 22:52:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12414697472. Throughput: 0: 42573.9. Samples: 12414826420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:28,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 22:52:31,819][15401] Updated weights for policy 0, policy_version 757743 (0.0032) [2024-06-24 22:52:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 12414926848. Throughput: 0: 42733.7. Samples: 12415082340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 22:52:35,994][15401] Updated weights for policy 0, policy_version 757753 (0.0032) [2024-06-24 22:52:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42599.3). Total num frames: 12415123456. Throughput: 0: 42713.1. Samples: 12415213540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 22:52:39,311][15401] Updated weights for policy 0, policy_version 757763 (0.0041) [2024-06-24 22:52:41,088][15349] Signal inference workers to stop experience collection... (183750 times) [2024-06-24 22:52:41,144][15401] InferenceWorker_p0-w0: stopping experience collection (183750 times) [2024-06-24 22:52:41,146][15349] Signal inference workers to resume experience collection... (183750 times) [2024-06-24 22:52:41,162][15401] InferenceWorker_p0-w0: resuming experience collection (183750 times) [2024-06-24 22:52:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12415336448. Throughput: 0: 42526.7. Samples: 12415465060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 22:52:43,553][15401] Updated weights for policy 0, policy_version 757773 (0.0047) [2024-06-24 22:52:47,174][15401] Updated weights for policy 0, policy_version 757783 (0.0035) [2024-06-24 22:52:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42542.8). Total num frames: 12415565824. Throughput: 0: 42604.4. Samples: 12415717100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:48,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-24 22:52:50,981][15401] Updated weights for policy 0, policy_version 757793 (0.0043) [2024-06-24 22:52:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12415762432. Throughput: 0: 42613.9. Samples: 12415850960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:53,394][15132] Avg episode reward: [(0, '0.452')] [2024-06-24 22:52:54,810][15401] Updated weights for policy 0, policy_version 757803 (0.0028) [2024-06-24 22:52:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12416008192. Throughput: 0: 42624.4. Samples: 12416109280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-24 22:52:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:52:58,397][15401] Updated weights for policy 0, policy_version 757813 (0.0032) [2024-06-24 22:53:02,610][15401] Updated weights for policy 0, policy_version 757823 (0.0039) [2024-06-24 22:53:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12416221184. Throughput: 0: 42679.0. Samples: 12416359520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:03,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 22:53:05,981][15401] Updated weights for policy 0, policy_version 757833 (0.0032) [2024-06-24 22:53:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12416401408. Throughput: 0: 42632.9. Samples: 12416488080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 22:53:10,293][15401] Updated weights for policy 0, policy_version 757843 (0.0025) [2024-06-24 22:53:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12416647168. Throughput: 0: 42674.7. Samples: 12416746780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:13,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 22:53:13,620][15401] Updated weights for policy 0, policy_version 757853 (0.0031) [2024-06-24 22:53:17,878][15401] Updated weights for policy 0, policy_version 757863 (0.0036) [2024-06-24 22:53:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12416843776. Throughput: 0: 42713.6. Samples: 12417004460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 22:53:21,319][15401] Updated weights for policy 0, policy_version 757873 (0.0046) [2024-06-24 22:53:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12417056768. Throughput: 0: 42603.0. Samples: 12417130680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:23,394][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 22:53:25,517][15401] Updated weights for policy 0, policy_version 757883 (0.0036) [2024-06-24 22:53:28,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 12417269760. Throughput: 0: 42778.3. Samples: 12417390180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:28,392][15132] Avg episode reward: [(0, '0.768')] [2024-06-24 22:53:29,067][15401] Updated weights for policy 0, policy_version 757893 (0.0039) [2024-06-24 22:53:33,053][15401] Updated weights for policy 0, policy_version 757903 (0.0042) [2024-06-24 22:53:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12417482752. Throughput: 0: 42722.3. Samples: 12417639500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 22:53:36,942][15401] Updated weights for policy 0, policy_version 757913 (0.0035) [2024-06-24 22:53:38,392][15132] Fps is (10 sec: 40959.7, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 12417679360. Throughput: 0: 42676.4. Samples: 12417771500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:38,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 22:53:40,581][15401] Updated weights for policy 0, policy_version 757923 (0.0029) [2024-06-24 22:53:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 12417925120. Throughput: 0: 42739.5. Samples: 12418032560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 22:53:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757930_12417925120.pth... [2024-06-24 22:53:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757305_12407685120.pth [2024-06-24 22:53:44,644][15401] Updated weights for policy 0, policy_version 757933 (0.0041) [2024-06-24 22:53:48,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12418138112. Throughput: 0: 42799.5. Samples: 12418285500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 22:53:48,393][15401] Updated weights for policy 0, policy_version 757943 (0.0038) [2024-06-24 22:53:52,130][15401] Updated weights for policy 0, policy_version 757953 (0.0039) [2024-06-24 22:53:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12418334720. Throughput: 0: 42832.4. Samples: 12418415540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-24 22:53:56,030][15401] Updated weights for policy 0, policy_version 757963 (0.0029) [2024-06-24 22:53:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12418547712. Throughput: 0: 42850.6. Samples: 12418675060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:53:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 22:53:59,807][15401] Updated weights for policy 0, policy_version 757973 (0.0035) [2024-06-24 22:54:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12418777088. Throughput: 0: 42847.6. Samples: 12418932600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-24 22:54:03,514][15401] Updated weights for policy 0, policy_version 757983 (0.0028) [2024-06-24 22:54:07,467][15401] Updated weights for policy 0, policy_version 757993 (0.0032) [2024-06-24 22:54:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 12418990080. Throughput: 0: 42974.7. Samples: 12419064640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:08,392][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 22:54:10,804][15349] Signal inference workers to stop experience collection... (183800 times) [2024-06-24 22:54:10,805][15349] Signal inference workers to resume experience collection... (183800 times) [2024-06-24 22:54:10,822][15401] InferenceWorker_p0-w0: stopping experience collection (183800 times) [2024-06-24 22:54:10,822][15401] InferenceWorker_p0-w0: resuming experience collection (183800 times) [2024-06-24 22:54:10,951][15401] Updated weights for policy 0, policy_version 758003 (0.0028) [2024-06-24 22:54:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12419203072. Throughput: 0: 42966.7. Samples: 12419323580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 22:54:14,868][15401] Updated weights for policy 0, policy_version 758013 (0.0037) [2024-06-24 22:54:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 12419416064. Throughput: 0: 43129.8. Samples: 12419580340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 22:54:19,133][15401] Updated weights for policy 0, policy_version 758023 (0.0045) [2024-06-24 22:54:22,456][15401] Updated weights for policy 0, policy_version 758033 (0.0042) [2024-06-24 22:54:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12419629056. Throughput: 0: 43042.2. Samples: 12419708400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:23,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 22:54:26,947][15401] Updated weights for policy 0, policy_version 758043 (0.0034) [2024-06-24 22:54:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12419842048. Throughput: 0: 42914.4. Samples: 12419963700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 22:54:30,099][15401] Updated weights for policy 0, policy_version 758053 (0.0032) [2024-06-24 22:54:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12420055040. Throughput: 0: 43003.6. Samples: 12420220660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 22:54:34,485][15401] Updated weights for policy 0, policy_version 758063 (0.0035) [2024-06-24 22:54:37,963][15401] Updated weights for policy 0, policy_version 758073 (0.0025) [2024-06-24 22:54:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 12420268032. Throughput: 0: 42821.0. Samples: 12420342480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-24 22:54:42,086][15401] Updated weights for policy 0, policy_version 758083 (0.0033) [2024-06-24 22:54:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 12420481024. Throughput: 0: 42880.8. Samples: 12420604800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-24 22:54:43,392][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 22:54:45,570][15401] Updated weights for policy 0, policy_version 758093 (0.0037) [2024-06-24 22:54:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12420694016. Throughput: 0: 42806.8. Samples: 12420858900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:54:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-24 22:54:49,888][15401] Updated weights for policy 0, policy_version 758103 (0.0035) [2024-06-24 22:54:53,155][15401] Updated weights for policy 0, policy_version 758113 (0.0045) [2024-06-24 22:54:53,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12420923392. Throughput: 0: 42787.7. Samples: 12420989980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:54:53,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-24 22:54:57,439][15401] Updated weights for policy 0, policy_version 758123 (0.0026) [2024-06-24 22:54:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12421120000. Throughput: 0: 42785.8. Samples: 12421248940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:54:58,391][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 22:55:00,589][15401] Updated weights for policy 0, policy_version 758133 (0.0029) [2024-06-24 22:55:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12421316608. Throughput: 0: 42720.5. Samples: 12421502760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 22:55:05,330][15401] Updated weights for policy 0, policy_version 758143 (0.0028) [2024-06-24 22:55:07,963][15401] Updated weights for policy 0, policy_version 758153 (0.0025) [2024-06-24 22:55:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 12421578752. Throughput: 0: 42841.8. Samples: 12421636180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 22:55:12,769][15401] Updated weights for policy 0, policy_version 758163 (0.0032) [2024-06-24 22:55:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12421775360. Throughput: 0: 42956.3. Samples: 12421896740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:13,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 22:55:15,778][15401] Updated weights for policy 0, policy_version 758173 (0.0027) [2024-06-24 22:55:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12421971968. Throughput: 0: 43097.8. Samples: 12422160060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 22:55:20,372][15401] Updated weights for policy 0, policy_version 758183 (0.0031) [2024-06-24 22:55:23,256][15401] Updated weights for policy 0, policy_version 758193 (0.0043) [2024-06-24 22:55:23,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43419.2, 300 sec: 42931.6). Total num frames: 12422234112. Throughput: 0: 43200.6. Samples: 12422286520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 22:55:28,351][15401] Updated weights for policy 0, policy_version 758203 (0.0045) [2024-06-24 22:55:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42876.4). Total num frames: 12422397952. Throughput: 0: 42948.8. Samples: 12422537400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-24 22:55:30,892][15401] Updated weights for policy 0, policy_version 758213 (0.0046) [2024-06-24 22:55:33,390][15132] Fps is (10 sec: 37684.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12422610944. Throughput: 0: 42953.3. Samples: 12422791800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-24 22:55:35,809][15401] Updated weights for policy 0, policy_version 758223 (0.0034) [2024-06-24 22:55:38,396][15132] Fps is (10 sec: 45846.3, 60 sec: 43139.9, 300 sec: 42875.1). Total num frames: 12422856704. Throughput: 0: 42960.0. Samples: 12422923460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:38,397][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 22:55:38,661][15401] Updated weights for policy 0, policy_version 758233 (0.0033) [2024-06-24 22:55:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 12423020544. Throughput: 0: 42828.0. Samples: 12423176200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:43,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 22:55:43,445][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758242_12423036928.pth... [2024-06-24 22:55:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757616_12412780544.pth [2024-06-24 22:55:43,677][15401] Updated weights for policy 0, policy_version 758243 (0.0045) [2024-06-24 22:55:43,696][15349] Signal inference workers to stop experience collection... (183850 times) [2024-06-24 22:55:43,697][15349] Signal inference workers to resume experience collection... (183850 times) [2024-06-24 22:55:43,739][15401] InferenceWorker_p0-w0: stopping experience collection (183850 times) [2024-06-24 22:55:43,739][15401] InferenceWorker_p0-w0: resuming experience collection (183850 times) [2024-06-24 22:55:46,478][15401] Updated weights for policy 0, policy_version 758253 (0.0033) [2024-06-24 22:55:48,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12423249920. Throughput: 0: 42902.2. Samples: 12423433360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:48,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-24 22:55:51,231][15401] Updated weights for policy 0, policy_version 758263 (0.0029) [2024-06-24 22:55:53,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12423495680. Throughput: 0: 42864.9. Samples: 12423565100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:55:54,049][15401] Updated weights for policy 0, policy_version 758273 (0.0033) [2024-06-24 22:55:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12423675904. Throughput: 0: 42699.6. Samples: 12423818220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:55:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 22:55:58,745][15401] Updated weights for policy 0, policy_version 758283 (0.0034) [2024-06-24 22:56:01,956][15401] Updated weights for policy 0, policy_version 758293 (0.0031) [2024-06-24 22:56:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12423921664. Throughput: 0: 42527.1. Samples: 12424073780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:03,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-24 22:56:06,187][15401] Updated weights for policy 0, policy_version 758303 (0.0029) [2024-06-24 22:56:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12424118272. Throughput: 0: 42646.0. Samples: 12424205580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 22:56:09,869][15401] Updated weights for policy 0, policy_version 758313 (0.0027) [2024-06-24 22:56:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12424314880. Throughput: 0: 42700.6. Samples: 12424458920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 22:56:13,989][15401] Updated weights for policy 0, policy_version 758323 (0.0036) [2024-06-24 22:56:17,363][15401] Updated weights for policy 0, policy_version 758333 (0.0037) [2024-06-24 22:56:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12424560640. Throughput: 0: 42698.7. Samples: 12424713240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 22:56:21,437][15401] Updated weights for policy 0, policy_version 758343 (0.0047) [2024-06-24 22:56:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.4, 300 sec: 42654.0). Total num frames: 12424740864. Throughput: 0: 42713.3. Samples: 12424845280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 22:56:24,812][15401] Updated weights for policy 0, policy_version 758353 (0.0036) [2024-06-24 22:56:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12424970240. Throughput: 0: 42857.7. Samples: 12425104800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 22:56:28,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 22:56:29,082][15401] Updated weights for policy 0, policy_version 758363 (0.0048) [2024-06-24 22:56:32,582][15401] Updated weights for policy 0, policy_version 758373 (0.0037) [2024-06-24 22:56:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 12425199616. Throughput: 0: 42631.9. Samples: 12425351800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:33,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 22:56:36,514][15401] Updated weights for policy 0, policy_version 758383 (0.0029) [2024-06-24 22:56:38,391][15132] Fps is (10 sec: 42592.6, 60 sec: 42328.9, 300 sec: 42764.8). Total num frames: 12425396224. Throughput: 0: 42743.1. Samples: 12425488600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:38,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-24 22:56:40,128][15401] Updated weights for policy 0, policy_version 758393 (0.0031) [2024-06-24 22:56:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12425592832. Throughput: 0: 42813.7. Samples: 12425744840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:43,392][15132] Avg episode reward: [(0, '0.403')] [2024-06-24 22:56:44,681][15401] Updated weights for policy 0, policy_version 758403 (0.0029) [2024-06-24 22:56:47,681][15401] Updated weights for policy 0, policy_version 758413 (0.0031) [2024-06-24 22:56:48,390][15132] Fps is (10 sec: 45881.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12425854976. Throughput: 0: 42570.2. Samples: 12425989440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-24 22:56:52,327][15401] Updated weights for policy 0, policy_version 758423 (0.0039) [2024-06-24 22:56:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 12426035200. Throughput: 0: 42755.2. Samples: 12426129560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 22:56:55,602][15401] Updated weights for policy 0, policy_version 758433 (0.0028) [2024-06-24 22:56:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12426231808. Throughput: 0: 42692.3. Samples: 12426380080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:56:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 22:56:59,996][15401] Updated weights for policy 0, policy_version 758443 (0.0048) [2024-06-24 22:57:03,040][15401] Updated weights for policy 0, policy_version 758453 (0.0044) [2024-06-24 22:57:03,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12426493952. Throughput: 0: 42569.8. Samples: 12426628880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 22:57:07,792][15349] Signal inference workers to stop experience collection... (183900 times) [2024-06-24 22:57:07,792][15349] Signal inference workers to resume experience collection... (183900 times) [2024-06-24 22:57:07,840][15401] InferenceWorker_p0-w0: stopping experience collection (183900 times) [2024-06-24 22:57:07,840][15401] InferenceWorker_p0-w0: resuming experience collection (183900 times) [2024-06-24 22:57:07,921][15401] Updated weights for policy 0, policy_version 758463 (0.0028) [2024-06-24 22:57:08,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12426674176. Throughput: 0: 42790.7. Samples: 12426770860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-24 22:57:10,473][15401] Updated weights for policy 0, policy_version 758473 (0.0042) [2024-06-24 22:57:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12426887168. Throughput: 0: 42590.8. Samples: 12427021380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:13,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 22:57:15,550][15401] Updated weights for policy 0, policy_version 758483 (0.0046) [2024-06-24 22:57:18,271][15401] Updated weights for policy 0, policy_version 758493 (0.0029) [2024-06-24 22:57:18,392][15132] Fps is (10 sec: 47501.6, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 12427149312. Throughput: 0: 42772.0. Samples: 12427276640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:18,393][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 22:57:23,087][15401] Updated weights for policy 0, policy_version 758503 (0.0023) [2024-06-24 22:57:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12427345920. Throughput: 0: 42898.2. Samples: 12427418960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 22:57:25,893][15401] Updated weights for policy 0, policy_version 758513 (0.0042) [2024-06-24 22:57:28,392][15132] Fps is (10 sec: 37683.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 12427526144. Throughput: 0: 42787.1. Samples: 12427670360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:28,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 22:57:30,479][15401] Updated weights for policy 0, policy_version 758523 (0.0042) [2024-06-24 22:57:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12427788288. Throughput: 0: 43051.6. Samples: 12427926760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:33,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 22:57:33,451][15401] Updated weights for policy 0, policy_version 758533 (0.0034) [2024-06-24 22:57:38,022][15401] Updated weights for policy 0, policy_version 758543 (0.0034) [2024-06-24 22:57:38,390][15132] Fps is (10 sec: 45885.4, 60 sec: 43145.5, 300 sec: 42876.1). Total num frames: 12427984896. Throughput: 0: 42957.9. Samples: 12428062680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 22:57:41,038][15401] Updated weights for policy 0, policy_version 758553 (0.0038) [2024-06-24 22:57:43,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 12428181504. Throughput: 0: 42876.5. Samples: 12428309620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:43,392][15132] Avg episode reward: [(0, '0.811')] [2024-06-24 22:57:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758557_12428197888.pth... [2024-06-24 22:57:43,597][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000757930_12417925120.pth [2024-06-24 22:57:45,798][15401] Updated weights for policy 0, policy_version 758563 (0.0038) [2024-06-24 22:57:48,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 12428427264. Throughput: 0: 43115.5. Samples: 12428569180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:48,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 22:57:48,759][15401] Updated weights for policy 0, policy_version 758573 (0.0025) [2024-06-24 22:57:53,370][15401] Updated weights for policy 0, policy_version 758583 (0.0030) [2024-06-24 22:57:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12428623872. Throughput: 0: 42879.0. Samples: 12428700420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 22:57:56,586][15401] Updated weights for policy 0, policy_version 758593 (0.0035) [2024-06-24 22:57:58,390][15132] Fps is (10 sec: 39330.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12428820480. Throughput: 0: 42758.1. Samples: 12428945500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:57:58,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 22:58:01,104][15401] Updated weights for policy 0, policy_version 758603 (0.0031) [2024-06-24 22:58:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12429066240. Throughput: 0: 42974.4. Samples: 12429210380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:58:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 22:58:04,170][15401] Updated weights for policy 0, policy_version 758613 (0.0041) [2024-06-24 22:58:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12429246464. Throughput: 0: 42743.1. Samples: 12429342400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:58:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 22:58:08,738][15401] Updated weights for policy 0, policy_version 758623 (0.0036) [2024-06-24 22:58:11,848][15401] Updated weights for policy 0, policy_version 758633 (0.0044) [2024-06-24 22:58:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12429492224. Throughput: 0: 42634.6. Samples: 12429588820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-24 22:58:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 22:58:16,485][15401] Updated weights for policy 0, policy_version 758643 (0.0031) [2024-06-24 22:58:18,001][15349] Signal inference workers to stop experience collection... (183950 times) [2024-06-24 22:58:18,033][15401] InferenceWorker_p0-w0: stopping experience collection (183950 times) [2024-06-24 22:58:18,050][15349] Signal inference workers to resume experience collection... (183950 times) [2024-06-24 22:58:18,056][15401] InferenceWorker_p0-w0: resuming experience collection (183950 times) [2024-06-24 22:58:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 12429688832. Throughput: 0: 42792.4. Samples: 12429852420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:18,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 22:58:19,618][15401] Updated weights for policy 0, policy_version 758653 (0.0030) [2024-06-24 22:58:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 12429885440. Throughput: 0: 42629.0. Samples: 12429980980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-24 22:58:24,086][15401] Updated weights for policy 0, policy_version 758663 (0.0031) [2024-06-24 22:58:27,139][15401] Updated weights for policy 0, policy_version 758673 (0.0024) [2024-06-24 22:58:28,392][15132] Fps is (10 sec: 44227.9, 60 sec: 43417.8, 300 sec: 42875.8). Total num frames: 12430131200. Throughput: 0: 42692.4. Samples: 12430230760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:28,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 22:58:31,672][15401] Updated weights for policy 0, policy_version 758683 (0.0039) [2024-06-24 22:58:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42876.5). Total num frames: 12430327808. Throughput: 0: 42824.1. Samples: 12430496160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:33,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-24 22:58:35,026][15401] Updated weights for policy 0, policy_version 758693 (0.0027) [2024-06-24 22:58:38,390][15132] Fps is (10 sec: 39329.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12430524416. Throughput: 0: 42673.7. Samples: 12430620740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 22:58:39,624][15401] Updated weights for policy 0, policy_version 758703 (0.0039) [2024-06-24 22:58:42,667][15401] Updated weights for policy 0, policy_version 758713 (0.0035) [2024-06-24 22:58:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 12430786560. Throughput: 0: 42851.2. Samples: 12430873800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 22:58:47,257][15401] Updated weights for policy 0, policy_version 758723 (0.0042) [2024-06-24 22:58:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 12430966784. Throughput: 0: 42801.2. Samples: 12431136540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:48,392][15132] Avg episode reward: [(0, '0.777')] [2024-06-24 22:58:50,415][15401] Updated weights for policy 0, policy_version 758733 (0.0039) [2024-06-24 22:58:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12431179776. Throughput: 0: 42581.3. Samples: 12431258560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 22:58:54,807][15401] Updated weights for policy 0, policy_version 758743 (0.0036) [2024-06-24 22:58:58,100][15401] Updated weights for policy 0, policy_version 758753 (0.0032) [2024-06-24 22:58:58,392][15132] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 12431409152. Throughput: 0: 42908.5. Samples: 12431519800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:58:58,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 22:59:02,669][15401] Updated weights for policy 0, policy_version 758763 (0.0035) [2024-06-24 22:59:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 12431605760. Throughput: 0: 42567.1. Samples: 12431767940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:03,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 22:59:06,002][15401] Updated weights for policy 0, policy_version 758773 (0.0045) [2024-06-24 22:59:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12431818752. Throughput: 0: 42500.0. Samples: 12431893480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:08,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 22:59:10,253][15401] Updated weights for policy 0, policy_version 758783 (0.0032) [2024-06-24 22:59:13,391][15132] Fps is (10 sec: 42593.0, 60 sec: 42324.5, 300 sec: 42764.8). Total num frames: 12432031744. Throughput: 0: 42695.8. Samples: 12432152040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:13,391][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 22:59:13,646][15401] Updated weights for policy 0, policy_version 758793 (0.0028) [2024-06-24 22:59:17,981][15401] Updated weights for policy 0, policy_version 758803 (0.0031) [2024-06-24 22:59:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12432261120. Throughput: 0: 42644.4. Samples: 12432415160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 22:59:21,150][15401] Updated weights for policy 0, policy_version 758813 (0.0036) [2024-06-24 22:59:23,390][15132] Fps is (10 sec: 44242.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12432474112. Throughput: 0: 42640.5. Samples: 12432539560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 22:59:25,362][15401] Updated weights for policy 0, policy_version 758823 (0.0037) [2024-06-24 22:59:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42599.9, 300 sec: 42820.5). Total num frames: 12432687104. Throughput: 0: 42778.6. Samples: 12432798840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 22:59:28,890][15401] Updated weights for policy 0, policy_version 758833 (0.0025) [2024-06-24 22:59:32,847][15401] Updated weights for policy 0, policy_version 758843 (0.0040) [2024-06-24 22:59:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12432883712. Throughput: 0: 42737.5. Samples: 12433059620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 22:59:36,653][15401] Updated weights for policy 0, policy_version 758853 (0.0037) [2024-06-24 22:59:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 12433113088. Throughput: 0: 42783.2. Samples: 12433183800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 22:59:40,820][15401] Updated weights for policy 0, policy_version 758863 (0.0038) [2024-06-24 22:59:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12433326080. Throughput: 0: 42722.7. Samples: 12433442220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 22:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758870_12433326080.pth... [2024-06-24 22:59:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758242_12423036928.pth [2024-06-24 22:59:44,566][15401] Updated weights for policy 0, policy_version 758873 (0.0033) [2024-06-24 22:59:48,214][15401] Updated weights for policy 0, policy_version 758883 (0.0034) [2024-06-24 22:59:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12433539072. Throughput: 0: 42912.0. Samples: 12433698980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:48,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 22:59:52,151][15401] Updated weights for policy 0, policy_version 758893 (0.0033) [2024-06-24 22:59:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12433768448. Throughput: 0: 43043.6. Samples: 12433830440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:53,398][15132] Avg episode reward: [(0, '0.555')] [2024-06-24 22:59:53,955][15349] Signal inference workers to stop experience collection... (184000 times) [2024-06-24 22:59:54,005][15401] InferenceWorker_p0-w0: stopping experience collection (184000 times) [2024-06-24 22:59:54,013][15349] Signal inference workers to resume experience collection... (184000 times) [2024-06-24 22:59:54,023][15401] InferenceWorker_p0-w0: resuming experience collection (184000 times) [2024-06-24 22:59:56,121][15401] Updated weights for policy 0, policy_version 758903 (0.0031) [2024-06-24 22:59:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42820.5). Total num frames: 12433948672. Throughput: 0: 43121.6. Samples: 12434092460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 22:59:58,399][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 22:59:59,467][15401] Updated weights for policy 0, policy_version 758913 (0.0033) [2024-06-24 23:00:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12434178048. Throughput: 0: 43038.7. Samples: 12434351900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:03,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 23:00:03,946][15401] Updated weights for policy 0, policy_version 758923 (0.0036) [2024-06-24 23:00:06,874][15401] Updated weights for policy 0, policy_version 758933 (0.0027) [2024-06-24 23:00:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12434407424. Throughput: 0: 43133.4. Samples: 12434480560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 23:00:11,500][15401] Updated weights for policy 0, policy_version 758943 (0.0033) [2024-06-24 23:00:13,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42870.7, 300 sec: 42820.2). Total num frames: 12434604032. Throughput: 0: 43033.7. Samples: 12434735460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:13,393][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 23:00:14,890][15401] Updated weights for policy 0, policy_version 758953 (0.0029) [2024-06-24 23:00:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 12434817024. Throughput: 0: 42934.9. Samples: 12434991700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 23:00:19,050][15401] Updated weights for policy 0, policy_version 758963 (0.0022) [2024-06-24 23:00:22,455][15401] Updated weights for policy 0, policy_version 758973 (0.0041) [2024-06-24 23:00:23,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12435046400. Throughput: 0: 43094.5. Samples: 12435123060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 23:00:26,530][15401] Updated weights for policy 0, policy_version 758983 (0.0041) [2024-06-24 23:00:28,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 12435259392. Throughput: 0: 43126.6. Samples: 12435383020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:28,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 23:00:29,976][15401] Updated weights for policy 0, policy_version 758993 (0.0032) [2024-06-24 23:00:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 12435456000. Throughput: 0: 43165.9. Samples: 12435641440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-24 23:00:34,368][15401] Updated weights for policy 0, policy_version 759003 (0.0036) [2024-06-24 23:00:38,071][15401] Updated weights for policy 0, policy_version 759013 (0.0033) [2024-06-24 23:00:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12435701760. Throughput: 0: 43024.8. Samples: 12435766560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-24 23:00:41,875][15401] Updated weights for policy 0, policy_version 759023 (0.0036) [2024-06-24 23:00:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12435898368. Throughput: 0: 42995.6. Samples: 12436027260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 23:00:45,514][15401] Updated weights for policy 0, policy_version 759033 (0.0042) [2024-06-24 23:00:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12436127744. Throughput: 0: 42868.9. Samples: 12436281000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-24 23:00:49,375][15401] Updated weights for policy 0, policy_version 759043 (0.0029) [2024-06-24 23:00:52,963][15401] Updated weights for policy 0, policy_version 759053 (0.0045) [2024-06-24 23:00:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12436324352. Throughput: 0: 42951.4. Samples: 12436413380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 23:00:56,956][15401] Updated weights for policy 0, policy_version 759063 (0.0033) [2024-06-24 23:00:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12436537344. Throughput: 0: 42973.4. Samples: 12436669160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:00:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 23:01:00,745][15401] Updated weights for policy 0, policy_version 759073 (0.0036) [2024-06-24 23:01:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12436783104. Throughput: 0: 42902.7. Samples: 12436922320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 23:01:04,467][15401] Updated weights for policy 0, policy_version 759083 (0.0030) [2024-06-24 23:01:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12436963328. Throughput: 0: 42997.5. Samples: 12437057940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 23:01:08,403][15401] Updated weights for policy 0, policy_version 759093 (0.0023) [2024-06-24 23:01:12,164][15401] Updated weights for policy 0, policy_version 759103 (0.0046) [2024-06-24 23:01:13,396][15132] Fps is (10 sec: 40934.0, 60 sec: 43141.7, 300 sec: 42819.6). Total num frames: 12437192704. Throughput: 0: 42912.6. Samples: 12437314260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:13,396][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 23:01:16,027][15401] Updated weights for policy 0, policy_version 759113 (0.0025) [2024-06-24 23:01:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 12437422080. Throughput: 0: 42832.8. Samples: 12437568920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:18,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 23:01:19,702][15401] Updated weights for policy 0, policy_version 759123 (0.0024) [2024-06-24 23:01:23,392][15132] Fps is (10 sec: 40976.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12437602304. Throughput: 0: 42842.2. Samples: 12437694560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:23,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 23:01:23,819][15401] Updated weights for policy 0, policy_version 759133 (0.0029) [2024-06-24 23:01:27,286][15401] Updated weights for policy 0, policy_version 759143 (0.0036) [2024-06-24 23:01:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12437815296. Throughput: 0: 42538.6. Samples: 12437941500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 23:01:31,572][15401] Updated weights for policy 0, policy_version 759153 (0.0029) [2024-06-24 23:01:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 12438044672. Throughput: 0: 42679.2. Samples: 12438201560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:33,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-24 23:01:34,812][15401] Updated weights for policy 0, policy_version 759163 (0.0031) [2024-06-24 23:01:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 12438241280. Throughput: 0: 42631.5. Samples: 12438331800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:01:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-24 23:01:39,735][15401] Updated weights for policy 0, policy_version 759173 (0.0036) [2024-06-24 23:01:40,821][15349] Signal inference workers to stop experience collection... (184050 times) [2024-06-24 23:01:40,822][15349] Signal inference workers to resume experience collection... (184050 times) [2024-06-24 23:01:40,869][15401] InferenceWorker_p0-w0: stopping experience collection (184050 times) [2024-06-24 23:01:40,870][15401] InferenceWorker_p0-w0: resuming experience collection (184050 times) [2024-06-24 23:01:42,315][15401] Updated weights for policy 0, policy_version 759183 (0.0039) [2024-06-24 23:01:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12438470656. Throughput: 0: 42385.4. Samples: 12438576500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:01:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-24 23:01:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759184_12438470656.pth... [2024-06-24 23:01:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758557_12428197888.pth [2024-06-24 23:01:47,417][15401] Updated weights for policy 0, policy_version 759193 (0.0029) [2024-06-24 23:01:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 12438667264. Throughput: 0: 42561.0. Samples: 12438837560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:01:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 23:01:50,011][15401] Updated weights for policy 0, policy_version 759203 (0.0033) [2024-06-24 23:01:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12438880256. Throughput: 0: 42454.9. Samples: 12438968420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:01:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 23:01:55,077][15401] Updated weights for policy 0, policy_version 759213 (0.0039) [2024-06-24 23:01:58,027][15401] Updated weights for policy 0, policy_version 759223 (0.0030) [2024-06-24 23:01:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12439109632. Throughput: 0: 42257.5. Samples: 12439215580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:01:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 23:02:02,733][15401] Updated weights for policy 0, policy_version 759233 (0.0038) [2024-06-24 23:02:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 12439289856. Throughput: 0: 42376.4. Samples: 12439475860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 23:02:05,879][15401] Updated weights for policy 0, policy_version 759243 (0.0030) [2024-06-24 23:02:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12439519232. Throughput: 0: 42302.7. Samples: 12439598080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 23:02:10,479][15401] Updated weights for policy 0, policy_version 759253 (0.0035) [2024-06-24 23:02:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42603.0, 300 sec: 42709.8). Total num frames: 12439748608. Throughput: 0: 42495.2. Samples: 12439853780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:13,404][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 23:02:13,513][15401] Updated weights for policy 0, policy_version 759263 (0.0034) [2024-06-24 23:02:18,107][15401] Updated weights for policy 0, policy_version 759273 (0.0025) [2024-06-24 23:02:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 12439928832. Throughput: 0: 42525.7. Samples: 12440115220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-24 23:02:21,045][15401] Updated weights for policy 0, policy_version 759283 (0.0031) [2024-06-24 23:02:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 12440174592. Throughput: 0: 42237.8. Samples: 12440232500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 23:02:25,879][15401] Updated weights for policy 0, policy_version 759293 (0.0032) [2024-06-24 23:02:28,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12440403968. Throughput: 0: 42669.7. Samples: 12440496640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:28,392][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 23:02:28,607][15401] Updated weights for policy 0, policy_version 759303 (0.0040) [2024-06-24 23:02:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12440567808. Throughput: 0: 42773.3. Samples: 12440762360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 23:02:33,443][15401] Updated weights for policy 0, policy_version 759313 (0.0024) [2024-06-24 23:02:36,487][15401] Updated weights for policy 0, policy_version 759323 (0.0027) [2024-06-24 23:02:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 12440813568. Throughput: 0: 42518.3. Samples: 12440881740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-24 23:02:41,068][15401] Updated weights for policy 0, policy_version 759333 (0.0045) [2024-06-24 23:02:43,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.2, 300 sec: 42709.8). Total num frames: 12441026560. Throughput: 0: 42815.0. Samples: 12441142260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:43,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-24 23:02:44,088][15401] Updated weights for policy 0, policy_version 759343 (0.0039) [2024-06-24 23:02:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12441223168. Throughput: 0: 42807.6. Samples: 12441402200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:48,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-24 23:02:48,694][15401] Updated weights for policy 0, policy_version 759353 (0.0048) [2024-06-24 23:02:51,590][15401] Updated weights for policy 0, policy_version 759363 (0.0043) [2024-06-24 23:02:53,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 12441452544. Throughput: 0: 42858.2. Samples: 12441526800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:53,392][15132] Avg episode reward: [(0, '0.338')] [2024-06-24 23:02:56,316][15401] Updated weights for policy 0, policy_version 759373 (0.0042) [2024-06-24 23:02:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 12441649152. Throughput: 0: 42840.8. Samples: 12441781720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:02:58,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 23:02:58,680][15349] Signal inference workers to stop experience collection... (184100 times) [2024-06-24 23:02:58,730][15401] InferenceWorker_p0-w0: stopping experience collection (184100 times) [2024-06-24 23:02:58,730][15349] Signal inference workers to resume experience collection... (184100 times) [2024-06-24 23:02:58,743][15401] InferenceWorker_p0-w0: resuming experience collection (184100 times) [2024-06-24 23:02:59,383][15401] Updated weights for policy 0, policy_version 759383 (0.0036) [2024-06-24 23:03:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12441862144. Throughput: 0: 42708.4. Samples: 12442037100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:03:03,395][15132] Avg episode reward: [(0, '0.569')] [2024-06-24 23:03:04,057][15401] Updated weights for policy 0, policy_version 759393 (0.0032) [2024-06-24 23:03:06,816][15401] Updated weights for policy 0, policy_version 759403 (0.0035) [2024-06-24 23:03:08,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12442107904. Throughput: 0: 42939.1. Samples: 12442164760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:03:08,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 23:03:11,579][15401] Updated weights for policy 0, policy_version 759413 (0.0034) [2024-06-24 23:03:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12442288128. Throughput: 0: 42907.5. Samples: 12442427480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:03:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 23:03:14,528][15401] Updated weights for policy 0, policy_version 759423 (0.0044) [2024-06-24 23:03:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12442517504. Throughput: 0: 42611.1. Samples: 12442679860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:03:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-24 23:03:19,147][15401] Updated weights for policy 0, policy_version 759433 (0.0038) [2024-06-24 23:03:22,523][15401] Updated weights for policy 0, policy_version 759443 (0.0038) [2024-06-24 23:03:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 12442746880. Throughput: 0: 42744.4. Samples: 12442805240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-24 23:03:23,391][15132] Avg episode reward: [(0, '0.428')] [2024-06-24 23:03:26,659][15401] Updated weights for policy 0, policy_version 759453 (0.0041) [2024-06-24 23:03:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12442927104. Throughput: 0: 42740.7. Samples: 12443065580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 23:03:30,061][15401] Updated weights for policy 0, policy_version 759463 (0.0029) [2024-06-24 23:03:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12443140096. Throughput: 0: 42821.3. Samples: 12443329160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:33,395][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 23:03:34,166][15401] Updated weights for policy 0, policy_version 759473 (0.0029) [2024-06-24 23:03:37,693][15401] Updated weights for policy 0, policy_version 759483 (0.0051) [2024-06-24 23:03:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12443385856. Throughput: 0: 42872.4. Samples: 12443455960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:38,395][15132] Avg episode reward: [(0, '0.390')] [2024-06-24 23:03:41,735][15401] Updated weights for policy 0, policy_version 759493 (0.0032) [2024-06-24 23:03:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 12443582464. Throughput: 0: 42806.3. Samples: 12443707900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:43,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-24 23:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759496_12443582464.pth... [2024-06-24 23:03:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000758870_12433326080.pth [2024-06-24 23:03:45,321][15401] Updated weights for policy 0, policy_version 759503 (0.0040) [2024-06-24 23:03:48,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 12443746304. Throughput: 0: 43081.9. Samples: 12443975780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:48,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-24 23:03:49,434][15401] Updated weights for policy 0, policy_version 759513 (0.0041) [2024-06-24 23:03:53,069][15401] Updated weights for policy 0, policy_version 759523 (0.0041) [2024-06-24 23:03:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 12444024832. Throughput: 0: 42938.7. Samples: 12444097000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 23:03:57,243][15401] Updated weights for policy 0, policy_version 759533 (0.0038) [2024-06-24 23:03:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12444221440. Throughput: 0: 42678.7. Samples: 12444348020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:03:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-24 23:03:59,057][15349] Signal inference workers to stop experience collection... (184150 times) [2024-06-24 23:03:59,090][15401] InferenceWorker_p0-w0: stopping experience collection (184150 times) [2024-06-24 23:03:59,104][15349] Signal inference workers to resume experience collection... (184150 times) [2024-06-24 23:03:59,113][15401] InferenceWorker_p0-w0: resuming experience collection (184150 times) [2024-06-24 23:04:00,759][15401] Updated weights for policy 0, policy_version 759543 (0.0041) [2024-06-24 23:04:03,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 12444418048. Throughput: 0: 42890.2. Samples: 12444610200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:03,397][15132] Avg episode reward: [(0, '0.571')] [2024-06-24 23:04:05,102][15401] Updated weights for policy 0, policy_version 759553 (0.0030) [2024-06-24 23:04:08,288][15401] Updated weights for policy 0, policy_version 759563 (0.0032) [2024-06-24 23:04:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.3). Total num frames: 12444680192. Throughput: 0: 42915.7. Samples: 12444736440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 23:04:12,561][15401] Updated weights for policy 0, policy_version 759573 (0.0040) [2024-06-24 23:04:13,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12444860416. Throughput: 0: 42852.9. Samples: 12444993960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 23:04:15,985][15401] Updated weights for policy 0, policy_version 759583 (0.0035) [2024-06-24 23:04:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12445073408. Throughput: 0: 42684.9. Samples: 12445249980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:18,390][15132] Avg episode reward: [(0, '0.094')] [2024-06-24 23:04:20,161][15401] Updated weights for policy 0, policy_version 759593 (0.0030) [2024-06-24 23:04:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12445319168. Throughput: 0: 42735.5. Samples: 12445379060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:23,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 23:04:23,809][15401] Updated weights for policy 0, policy_version 759603 (0.0027) [2024-06-24 23:04:28,075][15401] Updated weights for policy 0, policy_version 759613 (0.0024) [2024-06-24 23:04:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12445499392. Throughput: 0: 42875.9. Samples: 12445637320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:28,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-24 23:04:31,396][15401] Updated weights for policy 0, policy_version 759623 (0.0033) [2024-06-24 23:04:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12445728768. Throughput: 0: 42688.3. Samples: 12445896760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 23:04:35,600][15401] Updated weights for policy 0, policy_version 759633 (0.0042) [2024-06-24 23:04:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12445958144. Throughput: 0: 42844.5. Samples: 12446025000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 23:04:38,908][15401] Updated weights for policy 0, policy_version 759643 (0.0021) [2024-06-24 23:04:43,342][15401] Updated weights for policy 0, policy_version 759653 (0.0031) [2024-06-24 23:04:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12446154752. Throughput: 0: 42886.0. Samples: 12446277880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:43,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 23:04:46,761][15401] Updated weights for policy 0, policy_version 759663 (0.0041) [2024-06-24 23:04:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 12446351360. Throughput: 0: 42818.2. Samples: 12446536740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-24 23:04:50,901][15401] Updated weights for policy 0, policy_version 759673 (0.0034) [2024-06-24 23:04:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12446597120. Throughput: 0: 42756.8. Samples: 12446660500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 23:04:54,651][15401] Updated weights for policy 0, policy_version 759683 (0.0040) [2024-06-24 23:04:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12446777344. Throughput: 0: 42849.3. Samples: 12446922180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:04:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 23:04:58,579][15401] Updated weights for policy 0, policy_version 759693 (0.0027) [2024-06-24 23:05:02,154][15401] Updated weights for policy 0, policy_version 759703 (0.0031) [2024-06-24 23:05:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 12447006720. Throughput: 0: 42806.2. Samples: 12447176260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:05:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 23:05:06,057][15401] Updated weights for policy 0, policy_version 759713 (0.0037) [2024-06-24 23:05:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 12447236096. Throughput: 0: 42733.9. Samples: 12447302080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-24 23:05:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 23:05:09,654][15401] Updated weights for policy 0, policy_version 759723 (0.0032) [2024-06-24 23:05:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12447432704. Throughput: 0: 42851.1. Samples: 12447565620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:13,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 23:05:13,815][15401] Updated weights for policy 0, policy_version 759733 (0.0029) [2024-06-24 23:05:15,939][15349] Signal inference workers to stop experience collection... (184200 times) [2024-06-24 23:05:15,977][15401] InferenceWorker_p0-w0: stopping experience collection (184200 times) [2024-06-24 23:05:16,000][15349] Signal inference workers to resume experience collection... (184200 times) [2024-06-24 23:05:16,004][15401] InferenceWorker_p0-w0: resuming experience collection (184200 times) [2024-06-24 23:05:17,196][15401] Updated weights for policy 0, policy_version 759743 (0.0035) [2024-06-24 23:05:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12447662080. Throughput: 0: 42680.9. Samples: 12447817400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 23:05:21,487][15401] Updated weights for policy 0, policy_version 759753 (0.0024) [2024-06-24 23:05:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12447875072. Throughput: 0: 42766.5. Samples: 12447949500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 23:05:24,802][15401] Updated weights for policy 0, policy_version 759763 (0.0046) [2024-06-24 23:05:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12448055296. Throughput: 0: 42719.1. Samples: 12448200240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 23:05:29,384][15401] Updated weights for policy 0, policy_version 759773 (0.0044) [2024-06-24 23:05:32,778][15401] Updated weights for policy 0, policy_version 759783 (0.0032) [2024-06-24 23:05:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12448284672. Throughput: 0: 42534.2. Samples: 12448450780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-24 23:05:37,087][15401] Updated weights for policy 0, policy_version 759793 (0.0035) [2024-06-24 23:05:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12448497664. Throughput: 0: 42728.6. Samples: 12448583280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 23:05:40,773][15401] Updated weights for policy 0, policy_version 759803 (0.0033) [2024-06-24 23:05:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12448694272. Throughput: 0: 42396.7. Samples: 12448830040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 23:05:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759808_12448694272.pth... [2024-06-24 23:05:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759184_12438470656.pth [2024-06-24 23:05:44,836][15401] Updated weights for policy 0, policy_version 759813 (0.0030) [2024-06-24 23:05:48,352][15401] Updated weights for policy 0, policy_version 759823 (0.0039) [2024-06-24 23:05:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12448940032. Throughput: 0: 42374.3. Samples: 12449083100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 23:05:52,582][15401] Updated weights for policy 0, policy_version 759833 (0.0036) [2024-06-24 23:05:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12449136640. Throughput: 0: 42507.0. Samples: 12449214900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 23:05:56,121][15401] Updated weights for policy 0, policy_version 759843 (0.0041) [2024-06-24 23:05:58,390][15132] Fps is (10 sec: 40957.2, 60 sec: 42870.9, 300 sec: 42598.3). Total num frames: 12449349632. Throughput: 0: 42345.6. Samples: 12449471200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:05:58,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 23:06:00,174][15401] Updated weights for policy 0, policy_version 759853 (0.0033) [2024-06-24 23:06:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12449579008. Throughput: 0: 42500.1. Samples: 12449729900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 23:06:03,683][15401] Updated weights for policy 0, policy_version 759863 (0.0026) [2024-06-24 23:06:08,023][15401] Updated weights for policy 0, policy_version 759873 (0.0027) [2024-06-24 23:06:08,389][15132] Fps is (10 sec: 42601.6, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 12449775616. Throughput: 0: 42540.5. Samples: 12449863820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:08,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-24 23:06:11,429][15401] Updated weights for policy 0, policy_version 759883 (0.0027) [2024-06-24 23:06:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12450004992. Throughput: 0: 42668.3. Samples: 12450120320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:13,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-24 23:06:15,725][15401] Updated weights for policy 0, policy_version 759893 (0.0035) [2024-06-24 23:06:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12450217984. Throughput: 0: 42553.7. Samples: 12450365700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 23:06:19,068][15401] Updated weights for policy 0, policy_version 759903 (0.0033) [2024-06-24 23:06:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12450398208. Throughput: 0: 42652.0. Samples: 12450502620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-24 23:06:23,398][15401] Updated weights for policy 0, policy_version 759913 (0.0030) [2024-06-24 23:06:26,661][15401] Updated weights for policy 0, policy_version 759923 (0.0024) [2024-06-24 23:06:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12450611200. Throughput: 0: 42777.8. Samples: 12450755040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 23:06:31,324][15401] Updated weights for policy 0, policy_version 759933 (0.0040) [2024-06-24 23:06:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12450873344. Throughput: 0: 42796.4. Samples: 12451008940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-24 23:06:34,129][15401] Updated weights for policy 0, policy_version 759943 (0.0031) [2024-06-24 23:06:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12451053568. Throughput: 0: 42964.1. Samples: 12451148280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 23:06:38,747][15401] Updated weights for policy 0, policy_version 759953 (0.0036) [2024-06-24 23:06:39,321][15349] Signal inference workers to stop experience collection... (184250 times) [2024-06-24 23:06:39,369][15401] InferenceWorker_p0-w0: stopping experience collection (184250 times) [2024-06-24 23:06:39,378][15349] Signal inference workers to resume experience collection... (184250 times) [2024-06-24 23:06:39,383][15401] InferenceWorker_p0-w0: resuming experience collection (184250 times) [2024-06-24 23:06:41,914][15401] Updated weights for policy 0, policy_version 759963 (0.0027) [2024-06-24 23:06:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12451266560. Throughput: 0: 42829.6. Samples: 12451398500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-24 23:06:46,309][15401] Updated weights for policy 0, policy_version 759973 (0.0039) [2024-06-24 23:06:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12451512320. Throughput: 0: 42707.9. Samples: 12451651760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 23:06:49,406][15401] Updated weights for policy 0, policy_version 759983 (0.0026) [2024-06-24 23:06:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12451692544. Throughput: 0: 42769.8. Samples: 12451788460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-24 23:06:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 23:06:53,820][15401] Updated weights for policy 0, policy_version 759993 (0.0045) [2024-06-24 23:06:57,534][15401] Updated weights for policy 0, policy_version 760003 (0.0038) [2024-06-24 23:06:58,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.9, 300 sec: 42765.0). Total num frames: 12451905536. Throughput: 0: 42589.3. Samples: 12452036840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:06:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 23:07:01,615][15401] Updated weights for policy 0, policy_version 760013 (0.0027) [2024-06-24 23:07:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12452151296. Throughput: 0: 42832.5. Samples: 12452293160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 23:07:05,178][15401] Updated weights for policy 0, policy_version 760023 (0.0023) [2024-06-24 23:07:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12452347904. Throughput: 0: 42698.1. Samples: 12452424040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 23:07:09,287][15401] Updated weights for policy 0, policy_version 760033 (0.0039) [2024-06-24 23:07:12,834][15401] Updated weights for policy 0, policy_version 760043 (0.0034) [2024-06-24 23:07:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12452544512. Throughput: 0: 42619.2. Samples: 12452672900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 23:07:17,001][15401] Updated weights for policy 0, policy_version 760053 (0.0041) [2024-06-24 23:07:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 12452773888. Throughput: 0: 42626.1. Samples: 12452927120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 23:07:20,335][15401] Updated weights for policy 0, policy_version 760063 (0.0030) [2024-06-24 23:07:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12452970496. Throughput: 0: 42474.3. Samples: 12453059620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 23:07:24,491][15401] Updated weights for policy 0, policy_version 760073 (0.0032) [2024-06-24 23:07:27,982][15401] Updated weights for policy 0, policy_version 760083 (0.0044) [2024-06-24 23:07:28,396][15132] Fps is (10 sec: 42572.1, 60 sec: 43140.0, 300 sec: 42819.6). Total num frames: 12453199872. Throughput: 0: 42564.1. Samples: 12453314160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:28,396][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 23:07:32,268][15401] Updated weights for policy 0, policy_version 760093 (0.0036) [2024-06-24 23:07:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12453412864. Throughput: 0: 42667.6. Samples: 12453571800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:33,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 23:07:35,741][15401] Updated weights for policy 0, policy_version 760103 (0.0039) [2024-06-24 23:07:38,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 12453609472. Throughput: 0: 42535.4. Samples: 12453702560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:38,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 23:07:39,843][15401] Updated weights for policy 0, policy_version 760113 (0.0035) [2024-06-24 23:07:43,275][15401] Updated weights for policy 0, policy_version 760123 (0.0033) [2024-06-24 23:07:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12453855232. Throughput: 0: 42734.7. Samples: 12453959900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-24 23:07:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760123_12453855232.pth... [2024-06-24 23:07:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759496_12443582464.pth [2024-06-24 23:07:47,416][15401] Updated weights for policy 0, policy_version 760133 (0.0037) [2024-06-24 23:07:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12454068224. Throughput: 0: 42730.2. Samples: 12454216020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:48,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-24 23:07:50,986][15401] Updated weights for policy 0, policy_version 760143 (0.0033) [2024-06-24 23:07:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12454264832. Throughput: 0: 42595.7. Samples: 12454340840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 23:07:55,158][15401] Updated weights for policy 0, policy_version 760153 (0.0028) [2024-06-24 23:07:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12454477824. Throughput: 0: 42902.7. Samples: 12454603520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:07:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 23:07:58,738][15401] Updated weights for policy 0, policy_version 760163 (0.0032) [2024-06-24 23:08:02,863][15401] Updated weights for policy 0, policy_version 760173 (0.0033) [2024-06-24 23:08:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12454707200. Throughput: 0: 42948.6. Samples: 12454859800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 23:08:06,555][15401] Updated weights for policy 0, policy_version 760183 (0.0041) [2024-06-24 23:08:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12454903808. Throughput: 0: 42875.8. Samples: 12454989040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-24 23:08:10,519][15401] Updated weights for policy 0, policy_version 760193 (0.0041) [2024-06-24 23:08:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12455133184. Throughput: 0: 42921.2. Samples: 12455245340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:13,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 23:08:14,287][15401] Updated weights for policy 0, policy_version 760203 (0.0027) [2024-06-24 23:08:18,108][15401] Updated weights for policy 0, policy_version 760213 (0.0034) [2024-06-24 23:08:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12455329792. Throughput: 0: 42912.9. Samples: 12455502880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:18,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 23:08:21,862][15401] Updated weights for policy 0, policy_version 760223 (0.0040) [2024-06-24 23:08:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12455542784. Throughput: 0: 42789.9. Samples: 12455628100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-24 23:08:24,200][15349] Signal inference workers to stop experience collection... (184300 times) [2024-06-24 23:08:24,200][15349] Signal inference workers to resume experience collection... (184300 times) [2024-06-24 23:08:24,252][15401] InferenceWorker_p0-w0: stopping experience collection (184300 times) [2024-06-24 23:08:24,252][15401] InferenceWorker_p0-w0: resuming experience collection (184300 times) [2024-06-24 23:08:25,664][15401] Updated weights for policy 0, policy_version 760233 (0.0030) [2024-06-24 23:08:28,394][15132] Fps is (10 sec: 44215.6, 60 sec: 42872.6, 300 sec: 42819.9). Total num frames: 12455772160. Throughput: 0: 42824.3. Samples: 12455887200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:28,395][15132] Avg episode reward: [(0, '0.822')] [2024-06-24 23:08:29,539][15401] Updated weights for policy 0, policy_version 760243 (0.0039) [2024-06-24 23:08:33,195][15401] Updated weights for policy 0, policy_version 760253 (0.0032) [2024-06-24 23:08:33,391][15132] Fps is (10 sec: 44228.0, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 12455985152. Throughput: 0: 42847.1. Samples: 12456144220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:33,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 23:08:37,349][15401] Updated weights for policy 0, policy_version 760263 (0.0033) [2024-06-24 23:08:38,389][15132] Fps is (10 sec: 40979.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12456181760. Throughput: 0: 42894.2. Samples: 12456271080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-24 23:08:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 23:08:40,839][15401] Updated weights for policy 0, policy_version 760273 (0.0032) [2024-06-24 23:08:43,390][15132] Fps is (10 sec: 42606.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 12456411136. Throughput: 0: 42726.5. Samples: 12456526220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:08:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 23:08:44,727][15401] Updated weights for policy 0, policy_version 760283 (0.0042) [2024-06-24 23:08:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12456607744. Throughput: 0: 42880.9. Samples: 12456789440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:08:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 23:08:48,732][15401] Updated weights for policy 0, policy_version 760293 (0.0045) [2024-06-24 23:08:52,674][15401] Updated weights for policy 0, policy_version 760303 (0.0026) [2024-06-24 23:08:53,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12456837120. Throughput: 0: 42767.1. Samples: 12456913660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:08:53,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 23:08:56,444][15401] Updated weights for policy 0, policy_version 760313 (0.0032) [2024-06-24 23:08:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 12457050112. Throughput: 0: 42666.2. Samples: 12457165320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:08:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 23:09:00,306][15401] Updated weights for policy 0, policy_version 760323 (0.0030) [2024-06-24 23:09:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12457263104. Throughput: 0: 42711.5. Samples: 12457424900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 23:09:04,211][15401] Updated weights for policy 0, policy_version 760333 (0.0039) [2024-06-24 23:09:07,926][15401] Updated weights for policy 0, policy_version 760343 (0.0037) [2024-06-24 23:09:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12457476096. Throughput: 0: 42836.1. Samples: 12457555720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 23:09:11,878][15401] Updated weights for policy 0, policy_version 760353 (0.0040) [2024-06-24 23:09:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12457689088. Throughput: 0: 42598.2. Samples: 12457803920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:13,396][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 23:09:15,427][15401] Updated weights for policy 0, policy_version 760363 (0.0031) [2024-06-24 23:09:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12457902080. Throughput: 0: 42661.3. Samples: 12458063900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 23:09:19,394][15401] Updated weights for policy 0, policy_version 760373 (0.0034) [2024-06-24 23:09:23,057][15401] Updated weights for policy 0, policy_version 760383 (0.0034) [2024-06-24 23:09:23,394][15132] Fps is (10 sec: 44217.3, 60 sec: 43141.2, 300 sec: 42819.9). Total num frames: 12458131456. Throughput: 0: 42792.1. Samples: 12458196920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:23,395][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 23:09:26,810][15401] Updated weights for policy 0, policy_version 760393 (0.0021) [2024-06-24 23:09:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42600.1, 300 sec: 42709.1). Total num frames: 12458328064. Throughput: 0: 42760.5. Samples: 12458450540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:28,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-24 23:09:30,788][15401] Updated weights for policy 0, policy_version 760403 (0.0034) [2024-06-24 23:09:33,389][15132] Fps is (10 sec: 40979.0, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 12458541056. Throughput: 0: 42658.4. Samples: 12458709060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 23:09:34,665][15401] Updated weights for policy 0, policy_version 760413 (0.0040) [2024-06-24 23:09:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12458754048. Throughput: 0: 42766.2. Samples: 12458838040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 23:09:38,596][15401] Updated weights for policy 0, policy_version 760423 (0.0026) [2024-06-24 23:09:42,211][15401] Updated weights for policy 0, policy_version 760433 (0.0027) [2024-06-24 23:09:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12458983424. Throughput: 0: 42860.0. Samples: 12459094020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:43,396][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 23:09:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760436_12458983424.pth... [2024-06-24 23:09:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000759808_12448694272.pth [2024-06-24 23:09:46,182][15401] Updated weights for policy 0, policy_version 760443 (0.0033) [2024-06-24 23:09:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12459180032. Throughput: 0: 42785.4. Samples: 12459350240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:48,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-24 23:09:49,714][15401] Updated weights for policy 0, policy_version 760453 (0.0037) [2024-06-24 23:09:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 12459393024. Throughput: 0: 42664.7. Samples: 12459475640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:53,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 23:09:53,930][15401] Updated weights for policy 0, policy_version 760463 (0.0031) [2024-06-24 23:09:57,238][15349] Signal inference workers to stop experience collection... (184350 times) [2024-06-24 23:09:57,239][15349] Signal inference workers to resume experience collection... (184350 times) [2024-06-24 23:09:57,280][15401] InferenceWorker_p0-w0: stopping experience collection (184350 times) [2024-06-24 23:09:57,281][15401] InferenceWorker_p0-w0: resuming experience collection (184350 times) [2024-06-24 23:09:57,387][15401] Updated weights for policy 0, policy_version 760473 (0.0030) [2024-06-24 23:09:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12459638784. Throughput: 0: 42775.7. Samples: 12459728820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:09:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 23:10:01,520][15401] Updated weights for policy 0, policy_version 760483 (0.0032) [2024-06-24 23:10:03,391][15132] Fps is (10 sec: 44231.0, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 12459835392. Throughput: 0: 42722.3. Samples: 12459986460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:10:03,391][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 23:10:04,979][15401] Updated weights for policy 0, policy_version 760493 (0.0038) [2024-06-24 23:10:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12460015616. Throughput: 0: 42539.4. Samples: 12460111000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:10:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 23:10:09,132][15401] Updated weights for policy 0, policy_version 760503 (0.0023) [2024-06-24 23:10:12,766][15401] Updated weights for policy 0, policy_version 760513 (0.0028) [2024-06-24 23:10:13,389][15132] Fps is (10 sec: 42604.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12460261376. Throughput: 0: 42769.9. Samples: 12460375080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:10:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 23:10:16,995][15401] Updated weights for policy 0, policy_version 760523 (0.0045) [2024-06-24 23:10:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12460457984. Throughput: 0: 42619.0. Samples: 12460626920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:10:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 23:10:20,472][15401] Updated weights for policy 0, policy_version 760533 (0.0039) [2024-06-24 23:10:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42328.5, 300 sec: 42765.0). Total num frames: 12460670976. Throughput: 0: 42505.0. Samples: 12460750760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-24 23:10:23,390][15132] Avg episode reward: [(0, '0.225')] [2024-06-24 23:10:24,894][15401] Updated weights for policy 0, policy_version 760543 (0.0036) [2024-06-24 23:10:27,942][15401] Updated weights for policy 0, policy_version 760553 (0.0032) [2024-06-24 23:10:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12460916736. Throughput: 0: 42771.7. Samples: 12461018740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:28,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 23:10:32,293][15401] Updated weights for policy 0, policy_version 760563 (0.0032) [2024-06-24 23:10:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12461113344. Throughput: 0: 42782.2. Samples: 12461275440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:33,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-24 23:10:35,468][15401] Updated weights for policy 0, policy_version 760573 (0.0034) [2024-06-24 23:10:38,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12461309952. Throughput: 0: 42776.5. Samples: 12461400580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:38,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-24 23:10:39,796][15401] Updated weights for policy 0, policy_version 760583 (0.0036) [2024-06-24 23:10:43,184][15401] Updated weights for policy 0, policy_version 760593 (0.0040) [2024-06-24 23:10:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12461555712. Throughput: 0: 43125.8. Samples: 12461669480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-24 23:10:47,354][15401] Updated weights for policy 0, policy_version 760603 (0.0043) [2024-06-24 23:10:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12461752320. Throughput: 0: 43045.7. Samples: 12461923460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 23:10:50,842][15401] Updated weights for policy 0, policy_version 760613 (0.0022) [2024-06-24 23:10:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 12461965312. Throughput: 0: 43025.3. Samples: 12462047140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 23:10:55,271][15401] Updated weights for policy 0, policy_version 760623 (0.0040) [2024-06-24 23:10:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12462194688. Throughput: 0: 42950.2. Samples: 12462307840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:10:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 23:10:58,737][15401] Updated weights for policy 0, policy_version 760633 (0.0042) [2024-06-24 23:11:02,709][15401] Updated weights for policy 0, policy_version 760643 (0.0027) [2024-06-24 23:11:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42872.4, 300 sec: 42820.5). Total num frames: 12462407680. Throughput: 0: 43124.3. Samples: 12462567520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 23:11:06,149][15401] Updated weights for policy 0, policy_version 760653 (0.0040) [2024-06-24 23:11:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12462620672. Throughput: 0: 43225.7. Samples: 12462695920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:08,392][15132] Avg episode reward: [(0, '0.205')] [2024-06-24 23:11:10,263][15401] Updated weights for policy 0, policy_version 760663 (0.0036) [2024-06-24 23:11:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12462833664. Throughput: 0: 43163.0. Samples: 12462961080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-24 23:11:13,475][15349] Signal inference workers to stop experience collection... (184400 times) [2024-06-24 23:11:13,475][15349] Signal inference workers to resume experience collection... (184400 times) [2024-06-24 23:11:13,524][15401] InferenceWorker_p0-w0: stopping experience collection (184400 times) [2024-06-24 23:11:13,524][15401] InferenceWorker_p0-w0: resuming experience collection (184400 times) [2024-06-24 23:11:13,616][15401] Updated weights for policy 0, policy_version 760673 (0.0041) [2024-06-24 23:11:17,850][15401] Updated weights for policy 0, policy_version 760683 (0.0026) [2024-06-24 23:11:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12463063040. Throughput: 0: 42991.1. Samples: 12463210040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-24 23:11:21,592][15401] Updated weights for policy 0, policy_version 760693 (0.0029) [2024-06-24 23:11:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12463259648. Throughput: 0: 43121.4. Samples: 12463341040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 23:11:25,426][15401] Updated weights for policy 0, policy_version 760703 (0.0028) [2024-06-24 23:11:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12463472640. Throughput: 0: 42900.2. Samples: 12463600000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 23:11:29,314][15401] Updated weights for policy 0, policy_version 760713 (0.0024) [2024-06-24 23:11:33,241][15401] Updated weights for policy 0, policy_version 760723 (0.0041) [2024-06-24 23:11:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12463702016. Throughput: 0: 42818.7. Samples: 12463850300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-24 23:11:36,919][15401] Updated weights for policy 0, policy_version 760733 (0.0028) [2024-06-24 23:11:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12463898624. Throughput: 0: 42843.9. Samples: 12463975120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:38,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 23:11:40,738][15401] Updated weights for policy 0, policy_version 760743 (0.0043) [2024-06-24 23:11:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12464095232. Throughput: 0: 42766.2. Samples: 12464232320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-24 23:11:43,433][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760749_12464111616.pth... [2024-06-24 23:11:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760123_12453855232.pth [2024-06-24 23:11:44,919][15401] Updated weights for policy 0, policy_version 760753 (0.0032) [2024-06-24 23:11:48,325][15401] Updated weights for policy 0, policy_version 760763 (0.0032) [2024-06-24 23:11:48,394][15132] Fps is (10 sec: 44218.1, 60 sec: 43141.5, 300 sec: 42875.5). Total num frames: 12464340992. Throughput: 0: 42562.3. Samples: 12464483000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:48,394][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 23:11:52,596][15401] Updated weights for policy 0, policy_version 760773 (0.0047) [2024-06-24 23:11:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12464537600. Throughput: 0: 42666.6. Samples: 12464615920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 23:11:56,523][15401] Updated weights for policy 0, policy_version 760783 (0.0026) [2024-06-24 23:11:58,390][15132] Fps is (10 sec: 39338.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12464734208. Throughput: 0: 42318.2. Samples: 12464865400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:11:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 23:12:00,575][15401] Updated weights for policy 0, policy_version 760793 (0.0044) [2024-06-24 23:12:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12464963584. Throughput: 0: 42368.4. Samples: 12465116620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:12:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 23:12:04,118][15401] Updated weights for policy 0, policy_version 760803 (0.0026) [2024-06-24 23:12:08,200][15401] Updated weights for policy 0, policy_version 760813 (0.0031) [2024-06-24 23:12:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12465176576. Throughput: 0: 42497.3. Samples: 12465253420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-24 23:12:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 23:12:11,582][15401] Updated weights for policy 0, policy_version 760823 (0.0039) [2024-06-24 23:12:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12465373184. Throughput: 0: 42234.8. Samples: 12465500560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-24 23:12:15,882][15401] Updated weights for policy 0, policy_version 760833 (0.0028) [2024-06-24 23:12:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12465618944. Throughput: 0: 42387.5. Samples: 12465757740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:18,394][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 23:12:19,187][15401] Updated weights for policy 0, policy_version 760843 (0.0039) [2024-06-24 23:12:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 12465799168. Throughput: 0: 42524.6. Samples: 12465888720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-24 23:12:23,486][15401] Updated weights for policy 0, policy_version 760853 (0.0033) [2024-06-24 23:12:26,781][15401] Updated weights for policy 0, policy_version 760863 (0.0028) [2024-06-24 23:12:28,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 12466028544. Throughput: 0: 42326.8. Samples: 12466137300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:28,396][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 23:12:31,351][15401] Updated weights for policy 0, policy_version 760873 (0.0050) [2024-06-24 23:12:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 12466225152. Throughput: 0: 42543.9. Samples: 12466397300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-24 23:12:34,641][15401] Updated weights for policy 0, policy_version 760883 (0.0031) [2024-06-24 23:12:38,390][15132] Fps is (10 sec: 39346.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12466421760. Throughput: 0: 42414.8. Samples: 12466524580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 23:12:38,830][15401] Updated weights for policy 0, policy_version 760893 (0.0028) [2024-06-24 23:12:41,353][15349] Signal inference workers to stop experience collection... (184450 times) [2024-06-24 23:12:41,354][15349] Signal inference workers to resume experience collection... (184450 times) [2024-06-24 23:12:41,381][15401] InferenceWorker_p0-w0: stopping experience collection (184450 times) [2024-06-24 23:12:41,382][15401] InferenceWorker_p0-w0: resuming experience collection (184450 times) [2024-06-24 23:12:42,441][15401] Updated weights for policy 0, policy_version 760903 (0.0034) [2024-06-24 23:12:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12466667520. Throughput: 0: 42581.7. Samples: 12466781580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 23:12:46,721][15401] Updated weights for policy 0, policy_version 760913 (0.0051) [2024-06-24 23:12:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42055.4, 300 sec: 42709.5). Total num frames: 12466864128. Throughput: 0: 42707.6. Samples: 12467038460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:48,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 23:12:50,209][15401] Updated weights for policy 0, policy_version 760923 (0.0042) [2024-06-24 23:12:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12467077120. Throughput: 0: 42393.0. Samples: 12467161100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 23:12:54,326][15401] Updated weights for policy 0, policy_version 760933 (0.0030) [2024-06-24 23:12:57,810][15401] Updated weights for policy 0, policy_version 760943 (0.0034) [2024-06-24 23:12:58,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 12467306496. Throughput: 0: 42700.0. Samples: 12467422160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:12:58,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 23:13:01,919][15401] Updated weights for policy 0, policy_version 760953 (0.0028) [2024-06-24 23:13:03,393][15132] Fps is (10 sec: 42584.1, 60 sec: 42322.9, 300 sec: 42709.0). Total num frames: 12467503104. Throughput: 0: 42621.3. Samples: 12467675840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:03,393][15132] Avg episode reward: [(0, '0.127')] [2024-06-24 23:13:05,495][15401] Updated weights for policy 0, policy_version 760963 (0.0021) [2024-06-24 23:13:08,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12467732480. Throughput: 0: 42451.4. Samples: 12467799040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:08,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-24 23:13:09,632][15401] Updated weights for policy 0, policy_version 760973 (0.0044) [2024-06-24 23:13:13,241][15401] Updated weights for policy 0, policy_version 760983 (0.0043) [2024-06-24 23:13:13,390][15132] Fps is (10 sec: 44251.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12467945472. Throughput: 0: 42730.1. Samples: 12468059880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-24 23:13:17,272][15401] Updated weights for policy 0, policy_version 760993 (0.0023) [2024-06-24 23:13:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12468158464. Throughput: 0: 42598.9. Samples: 12468314240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 23:13:20,735][15401] Updated weights for policy 0, policy_version 761003 (0.0033) [2024-06-24 23:13:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42710.2). Total num frames: 12468371456. Throughput: 0: 42665.3. Samples: 12468444520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:23,399][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 23:13:24,866][15401] Updated weights for policy 0, policy_version 761013 (0.0036) [2024-06-24 23:13:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42601.3, 300 sec: 42709.4). Total num frames: 12468584448. Throughput: 0: 42828.1. Samples: 12468708940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:28,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-24 23:13:28,458][15401] Updated weights for policy 0, policy_version 761023 (0.0031) [2024-06-24 23:13:32,756][15401] Updated weights for policy 0, policy_version 761033 (0.0037) [2024-06-24 23:13:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12468797440. Throughput: 0: 42704.3. Samples: 12468960160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-24 23:13:36,359][15401] Updated weights for policy 0, policy_version 761043 (0.0038) [2024-06-24 23:13:38,390][15132] Fps is (10 sec: 44246.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12469026816. Throughput: 0: 42883.0. Samples: 12469090840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:38,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-24 23:13:40,204][15401] Updated weights for policy 0, policy_version 761053 (0.0044) [2024-06-24 23:13:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12469223424. Throughput: 0: 42773.7. Samples: 12469346880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:13:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761061_12469223424.pth... [2024-06-24 23:13:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760436_12458983424.pth [2024-06-24 23:13:43,819][15401] Updated weights for policy 0, policy_version 761063 (0.0030) [2024-06-24 23:13:47,628][15401] Updated weights for policy 0, policy_version 761073 (0.0036) [2024-06-24 23:13:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 12469436416. Throughput: 0: 42862.6. Samples: 12469604520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:48,392][15132] Avg episode reward: [(0, '0.164')] [2024-06-24 23:13:51,590][15401] Updated weights for policy 0, policy_version 761083 (0.0035) [2024-06-24 23:13:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12469649408. Throughput: 0: 43026.3. Samples: 12469735220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-24 23:13:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 23:13:55,365][15401] Updated weights for policy 0, policy_version 761093 (0.0039) [2024-06-24 23:13:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12469862400. Throughput: 0: 42805.4. Samples: 12469986120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:13:58,396][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:13:59,217][15401] Updated weights for policy 0, policy_version 761103 (0.0034) [2024-06-24 23:14:02,977][15401] Updated weights for policy 0, policy_version 761113 (0.0046) [2024-06-24 23:14:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.9, 300 sec: 42709.5). Total num frames: 12470075392. Throughput: 0: 42857.2. Samples: 12470242820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-24 23:14:07,062][15401] Updated weights for policy 0, policy_version 761123 (0.0024) [2024-06-24 23:14:08,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 12470304768. Throughput: 0: 42789.1. Samples: 12470370300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:08,396][15132] Avg episode reward: [(0, '0.448')] [2024-06-24 23:14:10,966][15401] Updated weights for policy 0, policy_version 761133 (0.0034) [2024-06-24 23:14:12,369][15349] Signal inference workers to stop experience collection... (184500 times) [2024-06-24 23:14:12,417][15401] InferenceWorker_p0-w0: stopping experience collection (184500 times) [2024-06-24 23:14:12,418][15349] Signal inference workers to resume experience collection... (184500 times) [2024-06-24 23:14:12,435][15401] InferenceWorker_p0-w0: resuming experience collection (184500 times) [2024-06-24 23:14:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 12470517760. Throughput: 0: 42629.7. Samples: 12470627180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 23:14:14,671][15401] Updated weights for policy 0, policy_version 761143 (0.0027) [2024-06-24 23:14:18,390][15132] Fps is (10 sec: 39346.4, 60 sec: 42325.2, 300 sec: 42599.0). Total num frames: 12470697984. Throughput: 0: 42666.1. Samples: 12470880140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 23:14:18,635][15401] Updated weights for policy 0, policy_version 761153 (0.0032) [2024-06-24 23:14:22,263][15401] Updated weights for policy 0, policy_version 761163 (0.0036) [2024-06-24 23:14:23,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12470927360. Throughput: 0: 42510.5. Samples: 12471003800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:23,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-24 23:14:26,347][15401] Updated weights for policy 0, policy_version 761173 (0.0045) [2024-06-24 23:14:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12471140352. Throughput: 0: 42599.2. Samples: 12471263840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 23:14:30,135][15401] Updated weights for policy 0, policy_version 761183 (0.0043) [2024-06-24 23:14:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12471336960. Throughput: 0: 42448.2. Samples: 12471514680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-24 23:14:34,042][15401] Updated weights for policy 0, policy_version 761193 (0.0032) [2024-06-24 23:14:37,757][15401] Updated weights for policy 0, policy_version 761203 (0.0047) [2024-06-24 23:14:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 12471582720. Throughput: 0: 42331.1. Samples: 12471640120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-24 23:14:41,670][15401] Updated weights for policy 0, policy_version 761213 (0.0034) [2024-06-24 23:14:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12471795712. Throughput: 0: 42580.8. Samples: 12471902260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 23:14:45,407][15401] Updated weights for policy 0, policy_version 761223 (0.0029) [2024-06-24 23:14:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12471992320. Throughput: 0: 42406.7. Samples: 12472151120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 23:14:49,670][15401] Updated weights for policy 0, policy_version 761233 (0.0038) [2024-06-24 23:14:53,271][15401] Updated weights for policy 0, policy_version 761243 (0.0040) [2024-06-24 23:14:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12472205312. Throughput: 0: 42463.9. Samples: 12472280900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-24 23:14:57,284][15401] Updated weights for policy 0, policy_version 761253 (0.0036) [2024-06-24 23:14:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 12472418304. Throughput: 0: 42607.4. Samples: 12472544500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:14:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 23:15:00,831][15401] Updated weights for policy 0, policy_version 761263 (0.0038) [2024-06-24 23:15:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12472631296. Throughput: 0: 42652.5. Samples: 12472799500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 23:15:04,783][15401] Updated weights for policy 0, policy_version 761273 (0.0032) [2024-06-24 23:15:08,286][15401] Updated weights for policy 0, policy_version 761283 (0.0033) [2024-06-24 23:15:08,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42601.3, 300 sec: 42709.1). Total num frames: 12472860672. Throughput: 0: 42742.1. Samples: 12472927300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:08,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 23:15:12,280][15401] Updated weights for policy 0, policy_version 761293 (0.0036) [2024-06-24 23:15:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 12473073664. Throughput: 0: 42735.6. Samples: 12473186940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 23:15:15,939][15401] Updated weights for policy 0, policy_version 761303 (0.0025) [2024-06-24 23:15:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12473270272. Throughput: 0: 42828.4. Samples: 12473441960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-24 23:15:20,224][15401] Updated weights for policy 0, policy_version 761313 (0.0026) [2024-06-24 23:15:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12473499648. Throughput: 0: 42955.9. Samples: 12473573240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:23,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 23:15:23,477][15401] Updated weights for policy 0, policy_version 761323 (0.0028) [2024-06-24 23:15:28,007][15401] Updated weights for policy 0, policy_version 761333 (0.0040) [2024-06-24 23:15:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12473696256. Throughput: 0: 42730.7. Samples: 12473825140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 23:15:30,988][15401] Updated weights for policy 0, policy_version 761343 (0.0038) [2024-06-24 23:15:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12473909248. Throughput: 0: 42955.9. Samples: 12474084140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:33,400][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 23:15:35,541][15401] Updated weights for policy 0, policy_version 761353 (0.0035) [2024-06-24 23:15:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12474138624. Throughput: 0: 42860.0. Samples: 12474209600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-24 23:15:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 23:15:39,126][15401] Updated weights for policy 0, policy_version 761363 (0.0023) [2024-06-24 23:15:40,906][15349] Signal inference workers to stop experience collection... (184550 times) [2024-06-24 23:15:40,957][15401] InferenceWorker_p0-w0: stopping experience collection (184550 times) [2024-06-24 23:15:40,962][15349] Signal inference workers to resume experience collection... (184550 times) [2024-06-24 23:15:40,979][15401] InferenceWorker_p0-w0: resuming experience collection (184550 times) [2024-06-24 23:15:43,078][15401] Updated weights for policy 0, policy_version 761373 (0.0042) [2024-06-24 23:15:43,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12474351616. Throughput: 0: 42716.0. Samples: 12474466720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:15:43,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 23:15:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761374_12474351616.pth... [2024-06-24 23:15:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000760749_12464111616.pth [2024-06-24 23:15:46,812][15401] Updated weights for policy 0, policy_version 761383 (0.0035) [2024-06-24 23:15:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12474548224. Throughput: 0: 42774.2. Samples: 12474724340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:15:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 23:15:50,649][15401] Updated weights for policy 0, policy_version 761393 (0.0040) [2024-06-24 23:15:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12474777600. Throughput: 0: 42791.6. Samples: 12474852820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:15:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 23:15:54,443][15401] Updated weights for policy 0, policy_version 761403 (0.0029) [2024-06-24 23:15:58,197][15401] Updated weights for policy 0, policy_version 761413 (0.0032) [2024-06-24 23:15:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12474990592. Throughput: 0: 42655.4. Samples: 12475106440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:15:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 23:16:01,957][15401] Updated weights for policy 0, policy_version 761423 (0.0028) [2024-06-24 23:16:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12475187200. Throughput: 0: 42828.0. Samples: 12475369220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 23:16:05,685][15401] Updated weights for policy 0, policy_version 761433 (0.0027) [2024-06-24 23:16:08,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 12475432960. Throughput: 0: 42688.2. Samples: 12475494100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-24 23:16:09,777][15401] Updated weights for policy 0, policy_version 761443 (0.0038) [2024-06-24 23:16:13,306][15401] Updated weights for policy 0, policy_version 761453 (0.0032) [2024-06-24 23:16:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12475645952. Throughput: 0: 42774.2. Samples: 12475749980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-24 23:16:17,687][15401] Updated weights for policy 0, policy_version 761463 (0.0034) [2024-06-24 23:16:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12475826176. Throughput: 0: 42694.4. Samples: 12476005380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:18,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-24 23:16:20,920][15401] Updated weights for policy 0, policy_version 761473 (0.0052) [2024-06-24 23:16:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12476071936. Throughput: 0: 42626.6. Samples: 12476127800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 23:16:25,518][15401] Updated weights for policy 0, policy_version 761483 (0.0034) [2024-06-24 23:16:28,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12476268544. Throughput: 0: 42764.8. Samples: 12476391140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 23:16:29,049][15401] Updated weights for policy 0, policy_version 761493 (0.0040) [2024-06-24 23:16:33,064][15401] Updated weights for policy 0, policy_version 761503 (0.0042) [2024-06-24 23:16:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12476465152. Throughput: 0: 42584.9. Samples: 12476640660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 23:16:36,731][15401] Updated weights for policy 0, policy_version 761513 (0.0030) [2024-06-24 23:16:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12476710912. Throughput: 0: 42621.7. Samples: 12476770800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 23:16:40,613][15401] Updated weights for policy 0, policy_version 761523 (0.0043) [2024-06-24 23:16:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.2, 300 sec: 42487.9). Total num frames: 12476874752. Throughput: 0: 42617.9. Samples: 12477024240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-24 23:16:44,561][15401] Updated weights for policy 0, policy_version 761533 (0.0033) [2024-06-24 23:16:48,204][15401] Updated weights for policy 0, policy_version 761543 (0.0032) [2024-06-24 23:16:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12477120512. Throughput: 0: 42436.1. Samples: 12477278840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-24 23:16:52,080][15401] Updated weights for policy 0, policy_version 761553 (0.0032) [2024-06-24 23:16:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12477349888. Throughput: 0: 42662.6. Samples: 12477413920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 23:16:56,254][15401] Updated weights for policy 0, policy_version 761563 (0.0030) [2024-06-24 23:16:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 12477497344. Throughput: 0: 42394.4. Samples: 12477657720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:16:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 23:17:00,076][15401] Updated weights for policy 0, policy_version 761573 (0.0045) [2024-06-24 23:17:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12477743104. Throughput: 0: 42482.1. Samples: 12477917080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:17:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 23:17:03,822][15401] Updated weights for policy 0, policy_version 761583 (0.0036) [2024-06-24 23:17:05,999][15349] Signal inference workers to stop experience collection... (184600 times) [2024-06-24 23:17:05,999][15349] Signal inference workers to resume experience collection... (184600 times) [2024-06-24 23:17:06,028][15401] InferenceWorker_p0-w0: stopping experience collection (184600 times) [2024-06-24 23:17:06,028][15401] InferenceWorker_p0-w0: resuming experience collection (184600 times) [2024-06-24 23:17:07,601][15401] Updated weights for policy 0, policy_version 761593 (0.0036) [2024-06-24 23:17:08,389][15132] Fps is (10 sec: 49151.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12477988864. Throughput: 0: 42649.8. Samples: 12478047040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:17:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-24 23:17:11,388][15401] Updated weights for policy 0, policy_version 761603 (0.0027) [2024-06-24 23:17:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 12478152704. Throughput: 0: 42506.7. Samples: 12478303940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:17:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 23:17:15,039][15401] Updated weights for policy 0, policy_version 761613 (0.0034) [2024-06-24 23:17:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12478382080. Throughput: 0: 42733.5. Samples: 12478563660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:17:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-24 23:17:19,029][15401] Updated weights for policy 0, policy_version 761623 (0.0035) [2024-06-24 23:17:22,929][15401] Updated weights for policy 0, policy_version 761633 (0.0030) [2024-06-24 23:17:23,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 12478627840. Throughput: 0: 42645.8. Samples: 12478689860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-24 23:17:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 23:17:26,703][15401] Updated weights for policy 0, policy_version 761643 (0.0024) [2024-06-24 23:17:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12478808064. Throughput: 0: 42639.9. Samples: 12478943040. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 23:17:30,517][15401] Updated weights for policy 0, policy_version 761653 (0.0034) [2024-06-24 23:17:33,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 12479021056. Throughput: 0: 42652.8. Samples: 12479198320. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:33,393][15132] Avg episode reward: [(0, '0.395')] [2024-06-24 23:17:34,354][15401] Updated weights for policy 0, policy_version 761663 (0.0024) [2024-06-24 23:17:38,275][15401] Updated weights for policy 0, policy_version 761673 (0.0025) [2024-06-24 23:17:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12479250432. Throughput: 0: 42590.3. Samples: 12479330480. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-24 23:17:41,955][15401] Updated weights for policy 0, policy_version 761683 (0.0035) [2024-06-24 23:17:43,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12479463424. Throughput: 0: 42785.3. Samples: 12479583060. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 23:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761686_12479463424.pth... [2024-06-24 23:17:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761061_12469223424.pth [2024-06-24 23:17:46,016][15401] Updated weights for policy 0, policy_version 761693 (0.0032) [2024-06-24 23:17:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12479692800. Throughput: 0: 42792.4. Samples: 12479842740. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:48,396][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 23:17:49,561][15401] Updated weights for policy 0, policy_version 761703 (0.0043) [2024-06-24 23:17:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 12479889408. Throughput: 0: 42844.1. Samples: 12479975020. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 23:17:53,399][15401] Updated weights for policy 0, policy_version 761713 (0.0038) [2024-06-24 23:17:57,058][15401] Updated weights for policy 0, policy_version 761723 (0.0026) [2024-06-24 23:17:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42710.0). Total num frames: 12480102400. Throughput: 0: 42766.2. Samples: 12480228420. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:17:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-24 23:18:00,855][15401] Updated weights for policy 0, policy_version 761733 (0.0040) [2024-06-24 23:18:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12480348160. Throughput: 0: 42644.9. Samples: 12480482680. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:03,391][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 23:18:05,068][15401] Updated weights for policy 0, policy_version 761743 (0.0032) [2024-06-24 23:18:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12480544768. Throughput: 0: 42866.3. Samples: 12480618840. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 23:18:08,404][15401] Updated weights for policy 0, policy_version 761753 (0.0036) [2024-06-24 23:18:12,449][15401] Updated weights for policy 0, policy_version 761763 (0.0030) [2024-06-24 23:18:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12480741376. Throughput: 0: 42936.2. Samples: 12480875160. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-24 23:18:14,727][15349] Signal inference workers to stop experience collection... (184650 times) [2024-06-24 23:18:14,727][15349] Signal inference workers to resume experience collection... (184650 times) [2024-06-24 23:18:14,747][15401] InferenceWorker_p0-w0: stopping experience collection (184650 times) [2024-06-24 23:18:14,747][15401] InferenceWorker_p0-w0: resuming experience collection (184650 times) [2024-06-24 23:18:15,895][15401] Updated weights for policy 0, policy_version 761773 (0.0023) [2024-06-24 23:18:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12480987136. Throughput: 0: 42954.3. Samples: 12481131160. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:18,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 23:18:20,351][15401] Updated weights for policy 0, policy_version 761783 (0.0035) [2024-06-24 23:18:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12481183744. Throughput: 0: 42961.3. Samples: 12481263740. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:23,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-24 23:18:23,581][15401] Updated weights for policy 0, policy_version 761793 (0.0034) [2024-06-24 23:18:27,835][15401] Updated weights for policy 0, policy_version 761803 (0.0023) [2024-06-24 23:18:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12481413120. Throughput: 0: 42989.7. Samples: 12481517600. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 23:18:31,951][15401] Updated weights for policy 0, policy_version 761813 (0.0037) [2024-06-24 23:18:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 12481609728. Throughput: 0: 42916.5. Samples: 12481773980. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 23:18:35,301][15401] Updated weights for policy 0, policy_version 761823 (0.0035) [2024-06-24 23:18:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12481822720. Throughput: 0: 42899.0. Samples: 12481905480. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-24 23:18:39,521][15401] Updated weights for policy 0, policy_version 761833 (0.0033) [2024-06-24 23:18:42,943][15401] Updated weights for policy 0, policy_version 761843 (0.0033) [2024-06-24 23:18:43,390][15132] Fps is (10 sec: 42595.7, 60 sec: 42871.0, 300 sec: 42709.4). Total num frames: 12482035712. Throughput: 0: 42816.4. Samples: 12482155180. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:43,391][15132] Avg episode reward: [(0, '0.766')] [2024-06-24 23:18:47,083][15401] Updated weights for policy 0, policy_version 761853 (0.0052) [2024-06-24 23:18:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12482265088. Throughput: 0: 42795.1. Samples: 12482408460. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-24 23:18:50,729][15401] Updated weights for policy 0, policy_version 761863 (0.0032) [2024-06-24 23:18:53,390][15132] Fps is (10 sec: 40962.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12482445312. Throughput: 0: 42743.9. Samples: 12482542320. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:18:54,817][15401] Updated weights for policy 0, policy_version 761873 (0.0035) [2024-06-24 23:18:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12482674688. Throughput: 0: 42745.1. Samples: 12482798700. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:18:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 23:18:58,601][15401] Updated weights for policy 0, policy_version 761883 (0.0032) [2024-06-24 23:19:02,335][15401] Updated weights for policy 0, policy_version 761893 (0.0029) [2024-06-24 23:19:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 12482887680. Throughput: 0: 42665.7. Samples: 12483051120. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-06-24 23:19:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 23:19:06,138][15401] Updated weights for policy 0, policy_version 761903 (0.0042) [2024-06-24 23:19:08,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12483100672. Throughput: 0: 42587.4. Samples: 12483180280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:08,393][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 23:19:09,876][15401] Updated weights for policy 0, policy_version 761913 (0.0038) [2024-06-24 23:19:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12483313664. Throughput: 0: 42572.6. Samples: 12483433360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 23:19:13,787][15401] Updated weights for policy 0, policy_version 761923 (0.0034) [2024-06-24 23:19:17,434][15401] Updated weights for policy 0, policy_version 761933 (0.0035) [2024-06-24 23:19:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12483526656. Throughput: 0: 42690.2. Samples: 12483695040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-24 23:19:21,511][15401] Updated weights for policy 0, policy_version 761943 (0.0027) [2024-06-24 23:19:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12483723264. Throughput: 0: 42573.7. Samples: 12483821300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 23:19:24,623][15349] Signal inference workers to stop experience collection... (184700 times) [2024-06-24 23:19:24,676][15401] InferenceWorker_p0-w0: stopping experience collection (184700 times) [2024-06-24 23:19:24,680][15349] Signal inference workers to resume experience collection... (184700 times) [2024-06-24 23:19:24,685][15401] InferenceWorker_p0-w0: resuming experience collection (184700 times) [2024-06-24 23:19:25,067][15401] Updated weights for policy 0, policy_version 761953 (0.0046) [2024-06-24 23:19:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12483952640. Throughput: 0: 42683.6. Samples: 12484075920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 23:19:29,194][15401] Updated weights for policy 0, policy_version 761963 (0.0035) [2024-06-24 23:19:32,691][15401] Updated weights for policy 0, policy_version 761973 (0.0034) [2024-06-24 23:19:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12484165632. Throughput: 0: 42630.7. Samples: 12484326840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:33,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-24 23:19:37,536][15401] Updated weights for policy 0, policy_version 761983 (0.0044) [2024-06-24 23:19:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12484362240. Throughput: 0: 42466.8. Samples: 12484453320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:38,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-24 23:19:40,392][15401] Updated weights for policy 0, policy_version 761993 (0.0033) [2024-06-24 23:19:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 12484608000. Throughput: 0: 42568.9. Samples: 12484714300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 23:19:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762000_12484608000.pth... [2024-06-24 23:19:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761374_12474351616.pth [2024-06-24 23:19:45,228][15401] Updated weights for policy 0, policy_version 762003 (0.0035) [2024-06-24 23:19:48,162][15401] Updated weights for policy 0, policy_version 762013 (0.0034) [2024-06-24 23:19:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 12484820992. Throughput: 0: 42463.6. Samples: 12484962080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:48,393][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 23:19:53,061][15401] Updated weights for policy 0, policy_version 762023 (0.0026) [2024-06-24 23:19:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12485001216. Throughput: 0: 42394.1. Samples: 12485087920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:53,395][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 23:19:55,631][15401] Updated weights for policy 0, policy_version 762033 (0.0030) [2024-06-24 23:19:58,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12485214208. Throughput: 0: 42569.3. Samples: 12485348980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:19:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 23:20:00,837][15401] Updated weights for policy 0, policy_version 762043 (0.0040) [2024-06-24 23:20:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 12485443584. Throughput: 0: 42330.2. Samples: 12485599900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:03,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-24 23:20:03,573][15401] Updated weights for policy 0, policy_version 762053 (0.0041) [2024-06-24 23:20:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42052.3, 300 sec: 42542.5). Total num frames: 12485623808. Throughput: 0: 42513.4. Samples: 12485734500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:08,392][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 23:20:08,550][15401] Updated weights for policy 0, policy_version 762063 (0.0030) [2024-06-24 23:20:11,295][15401] Updated weights for policy 0, policy_version 762073 (0.0040) [2024-06-24 23:20:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12485853184. Throughput: 0: 42535.7. Samples: 12485990020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:13,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-24 23:20:16,013][15401] Updated weights for policy 0, policy_version 762083 (0.0029) [2024-06-24 23:20:18,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12486098944. Throughput: 0: 42589.4. Samples: 12486243360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:18,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-24 23:20:18,899][15401] Updated weights for policy 0, policy_version 762093 (0.0039) [2024-06-24 23:20:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12486279168. Throughput: 0: 42803.4. Samples: 12486379480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:23,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-24 23:20:23,584][15401] Updated weights for policy 0, policy_version 762103 (0.0042) [2024-06-24 23:20:26,450][15401] Updated weights for policy 0, policy_version 762113 (0.0039) [2024-06-24 23:20:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12486508544. Throughput: 0: 42623.8. Samples: 12486632360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-24 23:20:31,399][15401] Updated weights for policy 0, policy_version 762123 (0.0044) [2024-06-24 23:20:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12486737920. Throughput: 0: 42812.0. Samples: 12486888520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:33,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-24 23:20:34,500][15401] Updated weights for policy 0, policy_version 762133 (0.0038) [2024-06-24 23:20:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12486918144. Throughput: 0: 42832.6. Samples: 12487015380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-24 23:20:39,029][15401] Updated weights for policy 0, policy_version 762143 (0.0032) [2024-06-24 23:20:41,888][15401] Updated weights for policy 0, policy_version 762153 (0.0037) [2024-06-24 23:20:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12487163904. Throughput: 0: 42786.6. Samples: 12487274380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 23:20:46,727][15401] Updated weights for policy 0, policy_version 762163 (0.0032) [2024-06-24 23:20:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 12487360512. Throughput: 0: 43046.3. Samples: 12487536980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-24 23:20:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-24 23:20:49,449][15401] Updated weights for policy 0, policy_version 762173 (0.0039) [2024-06-24 23:20:51,087][15349] Signal inference workers to stop experience collection... (184750 times) [2024-06-24 23:20:51,088][15349] Signal inference workers to resume experience collection... (184750 times) [2024-06-24 23:20:51,128][15401] InferenceWorker_p0-w0: stopping experience collection (184750 times) [2024-06-24 23:20:51,128][15401] InferenceWorker_p0-w0: resuming experience collection (184750 times) [2024-06-24 23:20:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12487557120. Throughput: 0: 42770.2. Samples: 12487659060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:20:53,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-24 23:20:54,294][15401] Updated weights for policy 0, policy_version 762183 (0.0022) [2024-06-24 23:20:56,789][15401] Updated weights for policy 0, policy_version 762193 (0.0041) [2024-06-24 23:20:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 12487819264. Throughput: 0: 42765.7. Samples: 12487914480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:20:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-24 23:21:01,899][15401] Updated weights for policy 0, policy_version 762203 (0.0037) [2024-06-24 23:21:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12487999488. Throughput: 0: 43096.3. Samples: 12488182700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 23:21:04,518][15401] Updated weights for policy 0, policy_version 762213 (0.0030) [2024-06-24 23:21:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 12488212480. Throughput: 0: 42784.6. Samples: 12488304780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 23:21:09,525][15401] Updated weights for policy 0, policy_version 762223 (0.0044) [2024-06-24 23:21:12,168][15401] Updated weights for policy 0, policy_version 762233 (0.0041) [2024-06-24 23:21:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 12488474624. Throughput: 0: 42886.6. Samples: 12488562260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 23:21:16,971][15401] Updated weights for policy 0, policy_version 762243 (0.0039) [2024-06-24 23:21:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12488654848. Throughput: 0: 43097.0. Samples: 12488827880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-24 23:21:19,782][15401] Updated weights for policy 0, policy_version 762253 (0.0034) [2024-06-24 23:21:23,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12488851456. Throughput: 0: 42838.3. Samples: 12488943100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 23:21:24,657][15401] Updated weights for policy 0, policy_version 762263 (0.0029) [2024-06-24 23:21:27,489][15401] Updated weights for policy 0, policy_version 762273 (0.0042) [2024-06-24 23:21:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12489113600. Throughput: 0: 42811.5. Samples: 12489200900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 23:21:32,597][15401] Updated weights for policy 0, policy_version 762283 (0.0030) [2024-06-24 23:21:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12489277440. Throughput: 0: 42829.7. Samples: 12489464320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-24 23:21:35,423][15401] Updated weights for policy 0, policy_version 762293 (0.0035) [2024-06-24 23:21:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12489506816. Throughput: 0: 42682.8. Samples: 12489579780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 23:21:40,166][15401] Updated weights for policy 0, policy_version 762303 (0.0029) [2024-06-24 23:21:43,183][15401] Updated weights for policy 0, policy_version 762313 (0.0029) [2024-06-24 23:21:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12489752576. Throughput: 0: 42741.8. Samples: 12489837860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-24 23:21:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762314_12489752576.pth... [2024-06-24 23:21:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000761686_12479463424.pth [2024-06-24 23:21:47,700][15401] Updated weights for policy 0, policy_version 762323 (0.0031) [2024-06-24 23:21:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 12489900032. Throughput: 0: 42607.0. Samples: 12490100020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-24 23:21:50,681][15401] Updated weights for policy 0, policy_version 762333 (0.0027) [2024-06-24 23:21:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12490145792. Throughput: 0: 42483.1. Samples: 12490216520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 23:21:55,686][15401] Updated weights for policy 0, policy_version 762343 (0.0034) [2024-06-24 23:21:58,389][15132] Fps is (10 sec: 49153.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12490391552. Throughput: 0: 42669.4. Samples: 12490482380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:21:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 23:21:58,391][15401] Updated weights for policy 0, policy_version 762353 (0.0038) [2024-06-24 23:22:03,267][15401] Updated weights for policy 0, policy_version 762363 (0.0033) [2024-06-24 23:22:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 12490555392. Throughput: 0: 42446.6. Samples: 12490738080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:03,392][15132] Avg episode reward: [(0, '0.633')] [2024-06-24 23:22:06,169][15401] Updated weights for policy 0, policy_version 762373 (0.0043) [2024-06-24 23:22:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12490768384. Throughput: 0: 42535.9. Samples: 12490857220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:08,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 23:22:11,523][15401] Updated weights for policy 0, policy_version 762383 (0.0039) [2024-06-24 23:22:11,720][15349] Signal inference workers to stop experience collection... (184800 times) [2024-06-24 23:22:11,749][15401] InferenceWorker_p0-w0: stopping experience collection (184800 times) [2024-06-24 23:22:11,779][15349] Signal inference workers to resume experience collection... (184800 times) [2024-06-24 23:22:11,784][15401] InferenceWorker_p0-w0: resuming experience collection (184800 times) [2024-06-24 23:22:13,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12490997760. Throughput: 0: 42572.9. Samples: 12491116680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 23:22:13,975][15401] Updated weights for policy 0, policy_version 762393 (0.0038) [2024-06-24 23:22:18,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42047.7, 300 sec: 42541.9). Total num frames: 12491177984. Throughput: 0: 42276.2. Samples: 12491367020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:18,396][15132] Avg episode reward: [(0, '0.664')] [2024-06-24 23:22:19,177][15401] Updated weights for policy 0, policy_version 762403 (0.0028) [2024-06-24 23:22:21,795][15401] Updated weights for policy 0, policy_version 762413 (0.0038) [2024-06-24 23:22:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12491423744. Throughput: 0: 42450.1. Samples: 12491490040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 23:22:26,750][15401] Updated weights for policy 0, policy_version 762423 (0.0041) [2024-06-24 23:22:28,390][15132] Fps is (10 sec: 44264.9, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 12491620352. Throughput: 0: 42489.3. Samples: 12491749880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:28,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 23:22:29,510][15401] Updated weights for policy 0, policy_version 762433 (0.0038) [2024-06-24 23:22:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12491833344. Throughput: 0: 42245.0. Samples: 12492001040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-24 23:22:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 23:22:34,270][15401] Updated weights for policy 0, policy_version 762443 (0.0031) [2024-06-24 23:22:37,289][15401] Updated weights for policy 0, policy_version 762453 (0.0043) [2024-06-24 23:22:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12492062720. Throughput: 0: 42512.4. Samples: 12492129580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:22:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-24 23:22:41,909][15401] Updated weights for policy 0, policy_version 762463 (0.0035) [2024-06-24 23:22:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 12492259328. Throughput: 0: 42283.4. Samples: 12492385140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:22:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-24 23:22:45,139][15401] Updated weights for policy 0, policy_version 762473 (0.0035) [2024-06-24 23:22:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12492472320. Throughput: 0: 42252.0. Samples: 12492639320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:22:48,392][15132] Avg episode reward: [(0, '0.711')] [2024-06-24 23:22:49,566][15401] Updated weights for policy 0, policy_version 762483 (0.0042) [2024-06-24 23:22:52,746][15401] Updated weights for policy 0, policy_version 762493 (0.0036) [2024-06-24 23:22:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12492718080. Throughput: 0: 42479.6. Samples: 12492768800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:22:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 23:22:57,381][15401] Updated weights for policy 0, policy_version 762503 (0.0034) [2024-06-24 23:22:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 12492898304. Throughput: 0: 42458.2. Samples: 12493027300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:22:58,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 23:23:00,712][15401] Updated weights for policy 0, policy_version 762513 (0.0044) [2024-06-24 23:23:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 12493111296. Throughput: 0: 42379.4. Samples: 12493273820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-24 23:23:04,986][15401] Updated weights for policy 0, policy_version 762523 (0.0037) [2024-06-24 23:23:08,374][15401] Updated weights for policy 0, policy_version 762533 (0.0041) [2024-06-24 23:23:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12493340672. Throughput: 0: 42475.5. Samples: 12493401440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 23:23:12,606][15401] Updated weights for policy 0, policy_version 762543 (0.0030) [2024-06-24 23:23:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12493553664. Throughput: 0: 42423.1. Samples: 12493658920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:13,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-24 23:23:16,071][15401] Updated weights for policy 0, policy_version 762553 (0.0039) [2024-06-24 23:23:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42602.9, 300 sec: 42542.8). Total num frames: 12493733888. Throughput: 0: 42490.2. Samples: 12493913100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-24 23:23:20,318][15401] Updated weights for policy 0, policy_version 762563 (0.0026) [2024-06-24 23:23:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12493963264. Throughput: 0: 42443.2. Samples: 12494039520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 23:23:23,732][15401] Updated weights for policy 0, policy_version 762573 (0.0032) [2024-06-24 23:23:27,855][15401] Updated weights for policy 0, policy_version 762583 (0.0033) [2024-06-24 23:23:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12494159872. Throughput: 0: 42487.2. Samples: 12494297060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-24 23:23:31,334][15401] Updated weights for policy 0, policy_version 762593 (0.0028) [2024-06-24 23:23:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 12494356480. Throughput: 0: 42681.9. Samples: 12494560000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-24 23:23:33,660][15349] Signal inference workers to stop experience collection... (184850 times) [2024-06-24 23:23:33,660][15349] Signal inference workers to resume experience collection... (184850 times) [2024-06-24 23:23:33,680][15401] InferenceWorker_p0-w0: stopping experience collection (184850 times) [2024-06-24 23:23:33,681][15401] InferenceWorker_p0-w0: resuming experience collection (184850 times) [2024-06-24 23:23:35,422][15401] Updated weights for policy 0, policy_version 762603 (0.0038) [2024-06-24 23:23:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 12494602240. Throughput: 0: 42485.6. Samples: 12494680660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:38,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 23:23:38,915][15401] Updated weights for policy 0, policy_version 762613 (0.0035) [2024-06-24 23:23:43,207][15401] Updated weights for policy 0, policy_version 762623 (0.0031) [2024-06-24 23:23:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12494815232. Throughput: 0: 42387.5. Samples: 12494934740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 23:23:43,523][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762624_12494831616.pth... [2024-06-24 23:23:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762000_12484608000.pth [2024-06-24 23:23:46,856][15401] Updated weights for policy 0, policy_version 762633 (0.0031) [2024-06-24 23:23:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12495011840. Throughput: 0: 42658.2. Samples: 12495193440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:48,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 23:23:50,791][15401] Updated weights for policy 0, policy_version 762643 (0.0037) [2024-06-24 23:23:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12495241216. Throughput: 0: 42668.6. Samples: 12495321520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-24 23:23:54,245][15401] Updated weights for policy 0, policy_version 762653 (0.0033) [2024-06-24 23:23:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12495454208. Throughput: 0: 42685.7. Samples: 12495579780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:23:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 23:23:58,552][15401] Updated weights for policy 0, policy_version 762663 (0.0032) [2024-06-24 23:24:01,813][15401] Updated weights for policy 0, policy_version 762673 (0.0032) [2024-06-24 23:24:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 12495667200. Throughput: 0: 42836.5. Samples: 12495840740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:24:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 23:24:06,304][15401] Updated weights for policy 0, policy_version 762683 (0.0030) [2024-06-24 23:24:08,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12495880192. Throughput: 0: 42903.6. Samples: 12495970180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:24:08,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-24 23:24:09,392][15401] Updated weights for policy 0, policy_version 762693 (0.0039) [2024-06-24 23:24:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12496093184. Throughput: 0: 42870.2. Samples: 12496226220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:24:13,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-24 23:24:13,806][15401] Updated weights for policy 0, policy_version 762703 (0.0026) [2024-06-24 23:24:17,169][15401] Updated weights for policy 0, policy_version 762713 (0.0032) [2024-06-24 23:24:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12496322560. Throughput: 0: 42765.6. Samples: 12496484460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-24 23:24:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 23:24:21,209][15401] Updated weights for policy 0, policy_version 762723 (0.0033) [2024-06-24 23:24:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 12496519168. Throughput: 0: 42953.8. Samples: 12496613680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:23,393][15132] Avg episode reward: [(0, '0.342')] [2024-06-24 23:24:24,786][15401] Updated weights for policy 0, policy_version 762733 (0.0044) [2024-06-24 23:24:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12496748544. Throughput: 0: 43125.0. Samples: 12496875360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:28,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-24 23:24:28,609][15401] Updated weights for policy 0, policy_version 762743 (0.0033) [2024-06-24 23:24:32,426][15401] Updated weights for policy 0, policy_version 762753 (0.0032) [2024-06-24 23:24:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12496961536. Throughput: 0: 43074.8. Samples: 12497131800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-24 23:24:36,065][15401] Updated weights for policy 0, policy_version 762763 (0.0028) [2024-06-24 23:24:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 12497158144. Throughput: 0: 42943.2. Samples: 12497253960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-24 23:24:39,838][15401] Updated weights for policy 0, policy_version 762773 (0.0039) [2024-06-24 23:24:42,877][15349] Signal inference workers to stop experience collection... (184900 times) [2024-06-24 23:24:42,877][15349] Signal inference workers to resume experience collection... (184900 times) [2024-06-24 23:24:42,911][15401] InferenceWorker_p0-w0: stopping experience collection (184900 times) [2024-06-24 23:24:42,911][15401] InferenceWorker_p0-w0: resuming experience collection (184900 times) [2024-06-24 23:24:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42709.8). Total num frames: 12497420288. Throughput: 0: 43108.2. Samples: 12497519640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 23:24:44,346][15401] Updated weights for policy 0, policy_version 762783 (0.0037) [2024-06-24 23:24:48,009][15401] Updated weights for policy 0, policy_version 762793 (0.0035) [2024-06-24 23:24:48,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12497600512. Throughput: 0: 43047.9. Samples: 12497777900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:48,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-24 23:24:51,876][15401] Updated weights for policy 0, policy_version 762803 (0.0033) [2024-06-24 23:24:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12497813504. Throughput: 0: 43036.0. Samples: 12497906800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 23:24:55,510][15401] Updated weights for policy 0, policy_version 762813 (0.0023) [2024-06-24 23:24:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 12498059264. Throughput: 0: 43102.9. Samples: 12498165840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:24:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 23:24:59,247][15401] Updated weights for policy 0, policy_version 762823 (0.0033) [2024-06-24 23:25:03,068][15401] Updated weights for policy 0, policy_version 762833 (0.0035) [2024-06-24 23:25:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 12498255872. Throughput: 0: 43105.5. Samples: 12498424200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-24 23:25:06,736][15401] Updated weights for policy 0, policy_version 762843 (0.0024) [2024-06-24 23:25:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12498468864. Throughput: 0: 43133.5. Samples: 12498554580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-24 23:25:10,605][15401] Updated weights for policy 0, policy_version 762853 (0.0035) [2024-06-24 23:25:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12498698240. Throughput: 0: 43091.5. Samples: 12498814480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-24 23:25:14,202][15401] Updated weights for policy 0, policy_version 762863 (0.0024) [2024-06-24 23:25:18,337][15401] Updated weights for policy 0, policy_version 762873 (0.0044) [2024-06-24 23:25:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12498911232. Throughput: 0: 43269.7. Samples: 12499078940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:18,391][15132] Avg episode reward: [(0, '0.494')] [2024-06-24 23:25:21,879][15401] Updated weights for policy 0, policy_version 762883 (0.0035) [2024-06-24 23:25:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 12499124224. Throughput: 0: 43380.2. Samples: 12499206080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 23:25:25,886][15401] Updated weights for policy 0, policy_version 762893 (0.0041) [2024-06-24 23:25:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12499337216. Throughput: 0: 43024.4. Samples: 12499455740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:28,394][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 23:25:29,831][15401] Updated weights for policy 0, policy_version 762903 (0.0029) [2024-06-24 23:25:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12499533824. Throughput: 0: 43025.7. Samples: 12499714060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:33,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-24 23:25:33,727][15401] Updated weights for policy 0, policy_version 762913 (0.0043) [2024-06-24 23:25:37,196][15401] Updated weights for policy 0, policy_version 762923 (0.0033) [2024-06-24 23:25:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 12499779584. Throughput: 0: 42917.2. Samples: 12499838080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 23:25:41,390][15401] Updated weights for policy 0, policy_version 762933 (0.0022) [2024-06-24 23:25:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12499992576. Throughput: 0: 42954.5. Samples: 12500098800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:43,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-24 23:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762939_12499992576.pth... [2024-06-24 23:25:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762314_12489752576.pth [2024-06-24 23:25:44,845][15401] Updated weights for policy 0, policy_version 762943 (0.0032) [2024-06-24 23:25:48,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 12500156416. Throughput: 0: 42927.4. Samples: 12500356040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:48,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 23:25:49,204][15401] Updated weights for policy 0, policy_version 762953 (0.0040) [2024-06-24 23:25:52,431][15401] Updated weights for policy 0, policy_version 762963 (0.0028) [2024-06-24 23:25:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12500418560. Throughput: 0: 42610.5. Samples: 12500472060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:53,394][15132] Avg episode reward: [(0, '0.830')] [2024-06-24 23:25:55,917][15349] Signal inference workers to stop experience collection... (184950 times) [2024-06-24 23:25:55,918][15349] Signal inference workers to resume experience collection... (184950 times) [2024-06-24 23:25:55,962][15401] InferenceWorker_p0-w0: stopping experience collection (184950 times) [2024-06-24 23:25:55,962][15401] InferenceWorker_p0-w0: resuming experience collection (184950 times) [2024-06-24 23:25:56,951][15401] Updated weights for policy 0, policy_version 762973 (0.0032) [2024-06-24 23:25:58,389][15132] Fps is (10 sec: 47525.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12500631552. Throughput: 0: 42736.1. Samples: 12500737600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:25:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:26:00,218][15401] Updated weights for policy 0, policy_version 762983 (0.0035) [2024-06-24 23:26:03,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 12500811776. Throughput: 0: 42707.5. Samples: 12501000880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-24 23:26:03,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-24 23:26:04,908][15401] Updated weights for policy 0, policy_version 762993 (0.0034) [2024-06-24 23:26:07,918][15401] Updated weights for policy 0, policy_version 763003 (0.0029) [2024-06-24 23:26:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 12501057536. Throughput: 0: 42469.8. Samples: 12501117320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:08,392][15132] Avg episode reward: [(0, '0.436')] [2024-06-24 23:26:12,487][15401] Updated weights for policy 0, policy_version 763013 (0.0030) [2024-06-24 23:26:13,390][15132] Fps is (10 sec: 47524.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12501286912. Throughput: 0: 42924.5. Samples: 12501387340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-24 23:26:15,439][15401] Updated weights for policy 0, policy_version 763023 (0.0032) [2024-06-24 23:26:18,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12501434368. Throughput: 0: 42852.5. Samples: 12501642420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-24 23:26:20,161][15401] Updated weights for policy 0, policy_version 763033 (0.0033) [2024-06-24 23:26:23,034][15401] Updated weights for policy 0, policy_version 763043 (0.0034) [2024-06-24 23:26:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12501696512. Throughput: 0: 42731.6. Samples: 12501761000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:23,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-24 23:26:27,703][15401] Updated weights for policy 0, policy_version 763053 (0.0038) [2024-06-24 23:26:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12501909504. Throughput: 0: 42810.8. Samples: 12502025280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-24 23:26:30,589][15401] Updated weights for policy 0, policy_version 763063 (0.0037) [2024-06-24 23:26:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12502089728. Throughput: 0: 42828.9. Samples: 12502283240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:26:35,190][15401] Updated weights for policy 0, policy_version 763073 (0.0031) [2024-06-24 23:26:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12502335488. Throughput: 0: 42931.6. Samples: 12502403980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:38,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-24 23:26:38,668][15401] Updated weights for policy 0, policy_version 763083 (0.0039) [2024-06-24 23:26:42,754][15401] Updated weights for policy 0, policy_version 763093 (0.0029) [2024-06-24 23:26:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12502548480. Throughput: 0: 42904.4. Samples: 12502668300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:43,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 23:26:46,259][15401] Updated weights for policy 0, policy_version 763103 (0.0036) [2024-06-24 23:26:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 12502745088. Throughput: 0: 42595.1. Samples: 12502917560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-24 23:26:50,730][15401] Updated weights for policy 0, policy_version 763113 (0.0031) [2024-06-24 23:26:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12502974464. Throughput: 0: 42782.7. Samples: 12503042440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 23:26:53,946][15401] Updated weights for policy 0, policy_version 763123 (0.0038) [2024-06-24 23:26:58,258][15401] Updated weights for policy 0, policy_version 763133 (0.0030) [2024-06-24 23:26:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 12503171072. Throughput: 0: 42483.0. Samples: 12503299080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:26:58,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-24 23:26:58,636][15349] Signal inference workers to stop experience collection... (185000 times) [2024-06-24 23:26:58,673][15401] InferenceWorker_p0-w0: stopping experience collection (185000 times) [2024-06-24 23:26:58,699][15349] Signal inference workers to resume experience collection... (185000 times) [2024-06-24 23:26:58,699][15401] InferenceWorker_p0-w0: resuming experience collection (185000 times) [2024-06-24 23:27:01,641][15401] Updated weights for policy 0, policy_version 763143 (0.0030) [2024-06-24 23:27:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 12503400448. Throughput: 0: 42623.5. Samples: 12503560480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:03,396][15132] Avg episode reward: [(0, '0.778')] [2024-06-24 23:27:05,981][15401] Updated weights for policy 0, policy_version 763153 (0.0041) [2024-06-24 23:27:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 12503629824. Throughput: 0: 42904.7. Samples: 12503691720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 23:27:09,251][15401] Updated weights for policy 0, policy_version 763163 (0.0024) [2024-06-24 23:27:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42821.5). Total num frames: 12503810048. Throughput: 0: 42662.9. Samples: 12503945120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:13,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 23:27:13,565][15401] Updated weights for policy 0, policy_version 763173 (0.0046) [2024-06-24 23:27:16,969][15401] Updated weights for policy 0, policy_version 763183 (0.0048) [2024-06-24 23:27:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12504039424. Throughput: 0: 42626.2. Samples: 12504201420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-24 23:27:21,111][15401] Updated weights for policy 0, policy_version 763193 (0.0032) [2024-06-24 23:27:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12504268800. Throughput: 0: 42829.7. Samples: 12504331320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:23,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-24 23:27:24,635][15401] Updated weights for policy 0, policy_version 763203 (0.0032) [2024-06-24 23:27:28,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42323.5, 300 sec: 42764.7). Total num frames: 12504449024. Throughput: 0: 42642.5. Samples: 12504587320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:28,393][15132] Avg episode reward: [(0, '0.362')] [2024-06-24 23:27:28,678][15401] Updated weights for policy 0, policy_version 763213 (0.0038) [2024-06-24 23:27:32,211][15401] Updated weights for policy 0, policy_version 763223 (0.0035) [2024-06-24 23:27:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12504662016. Throughput: 0: 42705.9. Samples: 12504839320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-24 23:27:36,282][15401] Updated weights for policy 0, policy_version 763233 (0.0034) [2024-06-24 23:27:38,392][15132] Fps is (10 sec: 44237.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12504891392. Throughput: 0: 42785.3. Samples: 12504967880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:38,392][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 23:27:39,816][15401] Updated weights for policy 0, policy_version 763243 (0.0041) [2024-06-24 23:27:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12505088000. Throughput: 0: 42595.6. Samples: 12505215880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:43,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-24 23:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763250_12505088000.pth... [2024-06-24 23:27:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762624_12494831616.pth [2024-06-24 23:27:43,968][15401] Updated weights for policy 0, policy_version 763253 (0.0031) [2024-06-24 23:27:48,025][15401] Updated weights for policy 0, policy_version 763263 (0.0035) [2024-06-24 23:27:48,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12505300992. Throughput: 0: 42441.9. Samples: 12505470360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-24 23:27:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-24 23:27:51,521][15401] Updated weights for policy 0, policy_version 763273 (0.0039) [2024-06-24 23:27:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12505513984. Throughput: 0: 42383.6. Samples: 12505598980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:27:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-24 23:27:56,021][15401] Updated weights for policy 0, policy_version 763283 (0.0037) [2024-06-24 23:27:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12505726976. Throughput: 0: 42458.9. Samples: 12505855760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:27:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-24 23:27:59,090][15401] Updated weights for policy 0, policy_version 763293 (0.0029) [2024-06-24 23:28:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 12505923584. Throughput: 0: 42529.0. Samples: 12506115220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:03,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-24 23:28:03,674][15401] Updated weights for policy 0, policy_version 763303 (0.0030) [2024-06-24 23:28:07,169][15401] Updated weights for policy 0, policy_version 763313 (0.0041) [2024-06-24 23:28:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12506169344. Throughput: 0: 42337.7. Samples: 12506236520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-24 23:28:11,424][15401] Updated weights for policy 0, policy_version 763323 (0.0028) [2024-06-24 23:28:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 12506365952. Throughput: 0: 42406.0. Samples: 12506495480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-24 23:28:14,759][15401] Updated weights for policy 0, policy_version 763333 (0.0029) [2024-06-24 23:28:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12506562560. Throughput: 0: 42430.1. Samples: 12506748680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:18,394][15132] Avg episode reward: [(0, '0.399')] [2024-06-24 23:28:19,205][15401] Updated weights for policy 0, policy_version 763343 (0.0039) [2024-06-24 23:28:22,451][15401] Updated weights for policy 0, policy_version 763353 (0.0037) [2024-06-24 23:28:23,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 12506808320. Throughput: 0: 42361.8. Samples: 12506874160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:23,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-24 23:28:26,889][15401] Updated weights for policy 0, policy_version 763363 (0.0030) [2024-06-24 23:28:26,898][15349] Signal inference workers to stop experience collection... (185050 times) [2024-06-24 23:28:26,904][15349] Signal inference workers to resume experience collection... (185050 times) [2024-06-24 23:28:26,944][15401] InferenceWorker_p0-w0: stopping experience collection (185050 times) [2024-06-24 23:28:26,948][15401] InferenceWorker_p0-w0: resuming experience collection (185050 times) [2024-06-24 23:28:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.3, 300 sec: 42931.6). Total num frames: 12507021312. Throughput: 0: 42582.3. Samples: 12507132080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-24 23:28:30,166][15401] Updated weights for policy 0, policy_version 763373 (0.0035) [2024-06-24 23:28:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12507217920. Throughput: 0: 42589.8. Samples: 12507386900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 23:28:34,683][15401] Updated weights for policy 0, policy_version 763383 (0.0032) [2024-06-24 23:28:37,996][15401] Updated weights for policy 0, policy_version 763393 (0.0043) [2024-06-24 23:28:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 12507447296. Throughput: 0: 42535.1. Samples: 12507513060. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 23:28:42,501][15401] Updated weights for policy 0, policy_version 763403 (0.0040) [2024-06-24 23:28:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12507643904. Throughput: 0: 42666.5. Samples: 12507775860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:43,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 23:28:45,667][15401] Updated weights for policy 0, policy_version 763413 (0.0033) [2024-06-24 23:28:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12507873280. Throughput: 0: 42395.0. Samples: 12508023000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 23:28:50,101][15401] Updated weights for policy 0, policy_version 763423 (0.0028) [2024-06-24 23:28:53,379][15401] Updated weights for policy 0, policy_version 763433 (0.0039) [2024-06-24 23:28:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12508086272. Throughput: 0: 42680.1. Samples: 12508157120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 23:28:57,635][15401] Updated weights for policy 0, policy_version 763443 (0.0033) [2024-06-24 23:28:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 12508282880. Throughput: 0: 42693.5. Samples: 12508416700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:28:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 23:29:00,985][15401] Updated weights for policy 0, policy_version 763453 (0.0029) [2024-06-24 23:29:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12508495872. Throughput: 0: 42497.9. Samples: 12508661080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-24 23:29:05,663][15401] Updated weights for policy 0, policy_version 763463 (0.0025) [2024-06-24 23:29:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12508708864. Throughput: 0: 42721.7. Samples: 12508796540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-24 23:29:08,825][15401] Updated weights for policy 0, policy_version 763473 (0.0030) [2024-06-24 23:29:13,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12508889088. Throughput: 0: 42470.6. Samples: 12509043260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-24 23:29:13,479][15401] Updated weights for policy 0, policy_version 763483 (0.0026) [2024-06-24 23:29:16,561][15401] Updated weights for policy 0, policy_version 763493 (0.0027) [2024-06-24 23:29:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12509118464. Throughput: 0: 42467.9. Samples: 12509297960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:18,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-24 23:29:21,006][15401] Updated weights for policy 0, policy_version 763503 (0.0027) [2024-06-24 23:29:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12509364224. Throughput: 0: 42584.5. Samples: 12509429360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-24 23:29:24,046][15401] Updated weights for policy 0, policy_version 763513 (0.0039) [2024-06-24 23:29:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 12509528064. Throughput: 0: 42366.4. Samples: 12509682240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-24 23:29:28,721][15401] Updated weights for policy 0, policy_version 763523 (0.0043) [2024-06-24 23:29:31,527][15401] Updated weights for policy 0, policy_version 763533 (0.0035) [2024-06-24 23:29:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12509773824. Throughput: 0: 42678.7. Samples: 12509943540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-24 23:29:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 23:29:36,075][15401] Updated weights for policy 0, policy_version 763543 (0.0053) [2024-06-24 23:29:37,174][15349] Signal inference workers to stop experience collection... (185100 times) [2024-06-24 23:29:37,224][15401] InferenceWorker_p0-w0: stopping experience collection (185100 times) [2024-06-24 23:29:37,232][15349] Signal inference workers to resume experience collection... (185100 times) [2024-06-24 23:29:37,243][15401] InferenceWorker_p0-w0: resuming experience collection (185100 times) [2024-06-24 23:29:38,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12510003200. Throughput: 0: 42607.6. Samples: 12510074460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:29:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 23:29:39,097][15401] Updated weights for policy 0, policy_version 763553 (0.0036) [2024-06-24 23:29:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12510183424. Throughput: 0: 42510.9. Samples: 12510329680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:29:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-24 23:29:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763562_12510199808.pth... [2024-06-24 23:29:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000762939_12499992576.pth [2024-06-24 23:29:44,023][15401] Updated weights for policy 0, policy_version 763563 (0.0023) [2024-06-24 23:29:46,740][15401] Updated weights for policy 0, policy_version 763573 (0.0034) [2024-06-24 23:29:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12510412800. Throughput: 0: 42611.0. Samples: 12510578580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:29:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-24 23:29:51,436][15401] Updated weights for policy 0, policy_version 763583 (0.0030) [2024-06-24 23:29:53,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 12510658560. Throughput: 0: 42687.0. Samples: 12510717460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:29:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 23:29:54,418][15401] Updated weights for policy 0, policy_version 763593 (0.0033) [2024-06-24 23:29:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 12510806016. Throughput: 0: 42896.4. Samples: 12510973600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:29:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 23:29:58,987][15401] Updated weights for policy 0, policy_version 763603 (0.0032) [2024-06-24 23:30:01,928][15401] Updated weights for policy 0, policy_version 763613 (0.0043) [2024-06-24 23:30:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12511068160. Throughput: 0: 42809.3. Samples: 12511224380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 23:30:06,614][15401] Updated weights for policy 0, policy_version 763623 (0.0027) [2024-06-24 23:30:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12511281152. Throughput: 0: 42939.1. Samples: 12511361620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:08,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-24 23:30:09,666][15401] Updated weights for policy 0, policy_version 763633 (0.0037) [2024-06-24 23:30:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12511461376. Throughput: 0: 42927.9. Samples: 12511614000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-24 23:30:14,292][15401] Updated weights for policy 0, policy_version 763643 (0.0026) [2024-06-24 23:30:17,389][15401] Updated weights for policy 0, policy_version 763653 (0.0037) [2024-06-24 23:30:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12511707136. Throughput: 0: 42577.3. Samples: 12511859520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 23:30:21,794][15401] Updated weights for policy 0, policy_version 763663 (0.0032) [2024-06-24 23:30:23,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12511936512. Throughput: 0: 42772.0. Samples: 12511999200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-24 23:30:25,320][15401] Updated weights for policy 0, policy_version 763673 (0.0049) [2024-06-24 23:30:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12512100352. Throughput: 0: 42773.3. Samples: 12512254480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-24 23:30:29,418][15401] Updated weights for policy 0, policy_version 763683 (0.0041) [2024-06-24 23:30:33,035][15401] Updated weights for policy 0, policy_version 763693 (0.0047) [2024-06-24 23:30:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12512362496. Throughput: 0: 42894.2. Samples: 12512508820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 23:30:37,198][15401] Updated weights for policy 0, policy_version 763703 (0.0035) [2024-06-24 23:30:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12512559104. Throughput: 0: 42821.1. Samples: 12512644400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:38,390][15132] Avg episode reward: [(0, '0.107')] [2024-06-24 23:30:40,579][15401] Updated weights for policy 0, policy_version 763713 (0.0029) [2024-06-24 23:30:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 12512755712. Throughput: 0: 42777.4. Samples: 12512898580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:43,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-24 23:30:44,691][15401] Updated weights for policy 0, policy_version 763723 (0.0028) [2024-06-24 23:30:48,198][15401] Updated weights for policy 0, policy_version 763733 (0.0033) [2024-06-24 23:30:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 12513001472. Throughput: 0: 42859.1. Samples: 12513153140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:48,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 23:30:52,335][15401] Updated weights for policy 0, policy_version 763743 (0.0042) [2024-06-24 23:30:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12513198080. Throughput: 0: 42820.8. Samples: 12513288560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 23:30:56,126][15401] Updated weights for policy 0, policy_version 763753 (0.0044) [2024-06-24 23:30:58,392][15132] Fps is (10 sec: 39321.7, 60 sec: 43142.9, 300 sec: 42653.9). Total num frames: 12513394688. Throughput: 0: 42603.0. Samples: 12513531240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:30:58,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 23:31:00,402][15401] Updated weights for policy 0, policy_version 763763 (0.0023) [2024-06-24 23:31:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 12513624064. Throughput: 0: 42892.0. Samples: 12513789660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:31:03,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 23:31:04,151][15401] Updated weights for policy 0, policy_version 763773 (0.0038) [2024-06-24 23:31:08,189][15401] Updated weights for policy 0, policy_version 763783 (0.0038) [2024-06-24 23:31:08,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12513837056. Throughput: 0: 42646.2. Samples: 12513918280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:31:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 23:31:08,547][15349] Signal inference workers to stop experience collection... (185150 times) [2024-06-24 23:31:08,548][15349] Signal inference workers to resume experience collection... (185150 times) [2024-06-24 23:31:08,569][15401] InferenceWorker_p0-w0: stopping experience collection (185150 times) [2024-06-24 23:31:08,569][15401] InferenceWorker_p0-w0: resuming experience collection (185150 times) [2024-06-24 23:31:11,686][15401] Updated weights for policy 0, policy_version 763793 (0.0037) [2024-06-24 23:31:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12514050048. Throughput: 0: 42573.7. Samples: 12514170300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:31:13,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 23:31:15,809][15401] Updated weights for policy 0, policy_version 763803 (0.0041) [2024-06-24 23:31:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12514263040. Throughput: 0: 42551.6. Samples: 12514423640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-24 23:31:18,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 23:31:19,639][15401] Updated weights for policy 0, policy_version 763813 (0.0042) [2024-06-24 23:31:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 12514459648. Throughput: 0: 42413.7. Samples: 12514553020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 23:31:23,917][15401] Updated weights for policy 0, policy_version 763823 (0.0035) [2024-06-24 23:31:27,126][15401] Updated weights for policy 0, policy_version 763833 (0.0039) [2024-06-24 23:31:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12514705408. Throughput: 0: 42485.0. Samples: 12514810400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-24 23:31:31,519][15401] Updated weights for policy 0, policy_version 763843 (0.0033) [2024-06-24 23:31:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12514918400. Throughput: 0: 42531.7. Samples: 12515066960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 23:31:34,774][15401] Updated weights for policy 0, policy_version 763853 (0.0043) [2024-06-24 23:31:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12515115008. Throughput: 0: 42382.2. Samples: 12515195760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-24 23:31:39,057][15401] Updated weights for policy 0, policy_version 763863 (0.0033) [2024-06-24 23:31:42,420][15401] Updated weights for policy 0, policy_version 763873 (0.0035) [2024-06-24 23:31:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12515344384. Throughput: 0: 42643.2. Samples: 12515450080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:43,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-24 23:31:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763876_12515344384.pth... [2024-06-24 23:31:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763250_12505088000.pth [2024-06-24 23:31:46,649][15401] Updated weights for policy 0, policy_version 763883 (0.0042) [2024-06-24 23:31:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12515540992. Throughput: 0: 42591.6. Samples: 12515706280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:48,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-24 23:31:50,671][15401] Updated weights for policy 0, policy_version 763893 (0.0025) [2024-06-24 23:31:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12515753984. Throughput: 0: 42426.7. Samples: 12515827480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-24 23:31:54,241][15401] Updated weights for policy 0, policy_version 763903 (0.0029) [2024-06-24 23:31:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 12515934208. Throughput: 0: 42545.9. Samples: 12516084860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:31:58,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-24 23:31:58,484][15401] Updated weights for policy 0, policy_version 763913 (0.0042) [2024-06-24 23:32:01,835][15401] Updated weights for policy 0, policy_version 763923 (0.0023) [2024-06-24 23:32:03,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12516163584. Throughput: 0: 42594.6. Samples: 12516340400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:03,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-24 23:32:06,148][15401] Updated weights for policy 0, policy_version 763933 (0.0041) [2024-06-24 23:32:08,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12516409344. Throughput: 0: 42676.4. Samples: 12516473460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:08,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-24 23:32:09,390][15401] Updated weights for policy 0, policy_version 763943 (0.0032) [2024-06-24 23:32:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12516589568. Throughput: 0: 42648.4. Samples: 12516729580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-24 23:32:13,664][15401] Updated weights for policy 0, policy_version 763953 (0.0025) [2024-06-24 23:32:16,968][15401] Updated weights for policy 0, policy_version 763963 (0.0047) [2024-06-24 23:32:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12516835328. Throughput: 0: 42580.1. Samples: 12516983060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-24 23:32:21,149][15401] Updated weights for policy 0, policy_version 763973 (0.0038) [2024-06-24 23:32:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 12517048320. Throughput: 0: 42560.0. Samples: 12517110960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 23:32:24,612][15401] Updated weights for policy 0, policy_version 763983 (0.0038) [2024-06-24 23:32:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12517244928. Throughput: 0: 42644.8. Samples: 12517369100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 23:32:29,063][15401] Updated weights for policy 0, policy_version 763993 (0.0045) [2024-06-24 23:32:32,123][15401] Updated weights for policy 0, policy_version 764003 (0.0029) [2024-06-24 23:32:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 12517474304. Throughput: 0: 42695.9. Samples: 12517627600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-24 23:32:36,411][15401] Updated weights for policy 0, policy_version 764013 (0.0024) [2024-06-24 23:32:37,586][15349] Signal inference workers to stop experience collection... (185200 times) [2024-06-24 23:32:37,634][15401] InferenceWorker_p0-w0: stopping experience collection (185200 times) [2024-06-24 23:32:37,641][15349] Signal inference workers to resume experience collection... (185200 times) [2024-06-24 23:32:37,648][15401] InferenceWorker_p0-w0: resuming experience collection (185200 times) [2024-06-24 23:32:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12517687296. Throughput: 0: 42810.5. Samples: 12517753960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:38,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 23:32:39,999][15401] Updated weights for policy 0, policy_version 764023 (0.0032) [2024-06-24 23:32:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12517883904. Throughput: 0: 42818.1. Samples: 12518011680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 23:32:43,910][15401] Updated weights for policy 0, policy_version 764033 (0.0030) [2024-06-24 23:32:47,861][15401] Updated weights for policy 0, policy_version 764043 (0.0032) [2024-06-24 23:32:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12518096896. Throughput: 0: 42826.0. Samples: 12518267560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 23:32:51,755][15401] Updated weights for policy 0, policy_version 764053 (0.0031) [2024-06-24 23:32:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12518326272. Throughput: 0: 42759.1. Samples: 12518397620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-24 23:32:55,345][15401] Updated weights for policy 0, policy_version 764063 (0.0031) [2024-06-24 23:32:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12518522880. Throughput: 0: 42692.0. Samples: 12518650720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:32:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 23:32:59,463][15401] Updated weights for policy 0, policy_version 764073 (0.0026) [2024-06-24 23:33:02,753][15401] Updated weights for policy 0, policy_version 764083 (0.0039) [2024-06-24 23:33:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12518768640. Throughput: 0: 42863.7. Samples: 12518911940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-24 23:33:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 23:33:06,902][15401] Updated weights for policy 0, policy_version 764093 (0.0027) [2024-06-24 23:33:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12518948864. Throughput: 0: 43036.8. Samples: 12519047620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-24 23:33:10,289][15401] Updated weights for policy 0, policy_version 764103 (0.0044) [2024-06-24 23:33:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12519178240. Throughput: 0: 42964.9. Samples: 12519302520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:13,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-24 23:33:14,485][15401] Updated weights for policy 0, policy_version 764113 (0.0030) [2024-06-24 23:33:17,922][15401] Updated weights for policy 0, policy_version 764123 (0.0031) [2024-06-24 23:33:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.2, 300 sec: 42654.3). Total num frames: 12519391232. Throughput: 0: 42800.9. Samples: 12519553640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 23:33:21,985][15401] Updated weights for policy 0, policy_version 764133 (0.0038) [2024-06-24 23:33:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12519604224. Throughput: 0: 42987.7. Samples: 12519688400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 23:33:25,558][15401] Updated weights for policy 0, policy_version 764143 (0.0039) [2024-06-24 23:33:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12519817216. Throughput: 0: 42887.1. Samples: 12519941600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:28,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-24 23:33:29,554][15401] Updated weights for policy 0, policy_version 764153 (0.0036) [2024-06-24 23:33:33,330][15401] Updated weights for policy 0, policy_version 764163 (0.0024) [2024-06-24 23:33:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12520046592. Throughput: 0: 42902.2. Samples: 12520198160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:33,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 23:33:37,386][15401] Updated weights for policy 0, policy_version 764173 (0.0036) [2024-06-24 23:33:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12520243200. Throughput: 0: 42926.8. Samples: 12520329320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 23:33:41,106][15401] Updated weights for policy 0, policy_version 764183 (0.0036) [2024-06-24 23:33:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12520439808. Throughput: 0: 42865.3. Samples: 12520579660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:43,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-24 23:33:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764188_12520456192.pth... [2024-06-24 23:33:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763562_12510199808.pth [2024-06-24 23:33:44,945][15401] Updated weights for policy 0, policy_version 764193 (0.0034) [2024-06-24 23:33:48,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12520669184. Throughput: 0: 42783.1. Samples: 12520837280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:48,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-24 23:33:49,002][15401] Updated weights for policy 0, policy_version 764203 (0.0037) [2024-06-24 23:33:52,813][15401] Updated weights for policy 0, policy_version 764213 (0.0037) [2024-06-24 23:33:53,393][15132] Fps is (10 sec: 44223.3, 60 sec: 42596.2, 300 sec: 42709.0). Total num frames: 12520882176. Throughput: 0: 42629.9. Samples: 12520966100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:53,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-24 23:33:56,763][15401] Updated weights for policy 0, policy_version 764223 (0.0034) [2024-06-24 23:33:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12521095168. Throughput: 0: 42694.3. Samples: 12521223760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:33:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-24 23:34:00,422][15401] Updated weights for policy 0, policy_version 764233 (0.0035) [2024-06-24 23:34:03,389][15132] Fps is (10 sec: 42612.2, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12521308160. Throughput: 0: 42583.2. Samples: 12521469880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-24 23:34:04,594][15401] Updated weights for policy 0, policy_version 764243 (0.0044) [2024-06-24 23:34:07,954][15401] Updated weights for policy 0, policy_version 764253 (0.0038) [2024-06-24 23:34:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12521537536. Throughput: 0: 42562.3. Samples: 12521603700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:08,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 23:34:12,263][15401] Updated weights for policy 0, policy_version 764263 (0.0034) [2024-06-24 23:34:12,888][15349] Signal inference workers to stop experience collection... (185250 times) [2024-06-24 23:34:12,889][15349] Signal inference workers to resume experience collection... (185250 times) [2024-06-24 23:34:12,907][15401] InferenceWorker_p0-w0: stopping experience collection (185250 times) [2024-06-24 23:34:12,942][15401] InferenceWorker_p0-w0: resuming experience collection (185250 times) [2024-06-24 23:34:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12521734144. Throughput: 0: 42639.1. Samples: 12521860360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 23:34:15,533][15401] Updated weights for policy 0, policy_version 764273 (0.0027) [2024-06-24 23:34:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12521947136. Throughput: 0: 42575.6. Samples: 12522114060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-24 23:34:19,900][15401] Updated weights for policy 0, policy_version 764283 (0.0026) [2024-06-24 23:34:23,163][15401] Updated weights for policy 0, policy_version 764293 (0.0033) [2024-06-24 23:34:23,393][15132] Fps is (10 sec: 44222.1, 60 sec: 42869.0, 300 sec: 42875.6). Total num frames: 12522176512. Throughput: 0: 42577.2. Samples: 12522245440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:23,393][15132] Avg episode reward: [(0, '0.483')] [2024-06-24 23:34:27,548][15401] Updated weights for policy 0, policy_version 764303 (0.0041) [2024-06-24 23:34:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12522373120. Throughput: 0: 42678.4. Samples: 12522500180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:28,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-24 23:34:30,769][15401] Updated weights for policy 0, policy_version 764313 (0.0044) [2024-06-24 23:34:33,389][15132] Fps is (10 sec: 42612.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12522602496. Throughput: 0: 42620.1. Samples: 12522755080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-24 23:34:35,583][15401] Updated weights for policy 0, policy_version 764323 (0.0035) [2024-06-24 23:34:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12522815488. Throughput: 0: 42667.8. Samples: 12522886020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:38,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 23:34:38,398][15401] Updated weights for policy 0, policy_version 764333 (0.0049) [2024-06-24 23:34:43,125][15401] Updated weights for policy 0, policy_version 764343 (0.0031) [2024-06-24 23:34:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12523012096. Throughput: 0: 42697.8. Samples: 12523145160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:43,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-24 23:34:46,215][15401] Updated weights for policy 0, policy_version 764353 (0.0043) [2024-06-24 23:34:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 12523241472. Throughput: 0: 42806.5. Samples: 12523396180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-24 23:34:48,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-24 23:34:50,787][15401] Updated weights for policy 0, policy_version 764363 (0.0040) [2024-06-24 23:34:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.7, 300 sec: 42876.1). Total num frames: 12523454464. Throughput: 0: 42737.3. Samples: 12523526880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:34:53,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-24 23:34:53,861][15401] Updated weights for policy 0, policy_version 764373 (0.0029) [2024-06-24 23:34:58,384][15401] Updated weights for policy 0, policy_version 764383 (0.0027) [2024-06-24 23:34:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12523651072. Throughput: 0: 42608.5. Samples: 12523777740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:34:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-24 23:35:01,562][15401] Updated weights for policy 0, policy_version 764393 (0.0032) [2024-06-24 23:35:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12523864064. Throughput: 0: 42569.6. Samples: 12524029700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:03,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-24 23:35:06,116][15401] Updated weights for policy 0, policy_version 764403 (0.0044) [2024-06-24 23:35:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12524077056. Throughput: 0: 42493.1. Samples: 12524157480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-24 23:35:09,041][15401] Updated weights for policy 0, policy_version 764413 (0.0034) [2024-06-24 23:35:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.6, 300 sec: 42598.0). Total num frames: 12524273664. Throughput: 0: 42485.1. Samples: 12524412120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:13,393][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 23:35:13,781][15401] Updated weights for policy 0, policy_version 764423 (0.0029) [2024-06-24 23:35:17,235][15401] Updated weights for policy 0, policy_version 764433 (0.0029) [2024-06-24 23:35:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12524519424. Throughput: 0: 42320.0. Samples: 12524659480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:18,396][15132] Avg episode reward: [(0, '0.468')] [2024-06-24 23:35:21,496][15401] Updated weights for policy 0, policy_version 764443 (0.0033) [2024-06-24 23:35:23,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42054.6, 300 sec: 42709.5). Total num frames: 12524699648. Throughput: 0: 42356.5. Samples: 12524792060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-24 23:35:25,243][15401] Updated weights for policy 0, policy_version 764453 (0.0031) [2024-06-24 23:35:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12524929024. Throughput: 0: 42451.1. Samples: 12525055460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-24 23:35:28,997][15401] Updated weights for policy 0, policy_version 764463 (0.0035) [2024-06-24 23:35:31,176][15349] Signal inference workers to stop experience collection... (185300 times) [2024-06-24 23:35:31,222][15401] InferenceWorker_p0-w0: stopping experience collection (185300 times) [2024-06-24 23:35:31,289][15349] Signal inference workers to resume experience collection... (185300 times) [2024-06-24 23:35:31,289][15401] InferenceWorker_p0-w0: resuming experience collection (185300 times) [2024-06-24 23:35:32,727][15401] Updated weights for policy 0, policy_version 764473 (0.0027) [2024-06-24 23:35:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 12525158400. Throughput: 0: 42358.1. Samples: 12525302300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-24 23:35:36,838][15401] Updated weights for policy 0, policy_version 764483 (0.0037) [2024-06-24 23:35:38,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12525338624. Throughput: 0: 42442.0. Samples: 12525436780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 23:35:40,233][15401] Updated weights for policy 0, policy_version 764493 (0.0035) [2024-06-24 23:35:43,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 12525584384. Throughput: 0: 42596.1. Samples: 12525694560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 23:35:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764501_12525584384.pth... [2024-06-24 23:35:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000763876_12515344384.pth [2024-06-24 23:35:44,579][15401] Updated weights for policy 0, policy_version 764503 (0.0039) [2024-06-24 23:35:47,835][15401] Updated weights for policy 0, policy_version 764513 (0.0024) [2024-06-24 23:35:48,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12525797376. Throughput: 0: 42615.1. Samples: 12525947380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:48,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-24 23:35:52,184][15401] Updated weights for policy 0, policy_version 764523 (0.0037) [2024-06-24 23:35:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 12525977600. Throughput: 0: 42663.5. Samples: 12526077340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 23:35:55,674][15401] Updated weights for policy 0, policy_version 764533 (0.0038) [2024-06-24 23:35:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12526223360. Throughput: 0: 42771.2. Samples: 12526336720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:35:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-24 23:35:59,962][15401] Updated weights for policy 0, policy_version 764543 (0.0033) [2024-06-24 23:36:03,314][15401] Updated weights for policy 0, policy_version 764553 (0.0041) [2024-06-24 23:36:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12526436352. Throughput: 0: 43001.3. Samples: 12526594540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 23:36:07,511][15401] Updated weights for policy 0, policy_version 764563 (0.0026) [2024-06-24 23:36:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12526632960. Throughput: 0: 42870.7. Samples: 12526721240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 23:36:10,893][15401] Updated weights for policy 0, policy_version 764573 (0.0029) [2024-06-24 23:36:13,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43414.8, 300 sec: 42764.1). Total num frames: 12526878720. Throughput: 0: 42575.7. Samples: 12526971640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:13,396][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:36:15,456][15401] Updated weights for policy 0, policy_version 764583 (0.0025) [2024-06-24 23:36:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12527075328. Throughput: 0: 42777.5. Samples: 12527227280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-24 23:36:18,563][15401] Updated weights for policy 0, policy_version 764593 (0.0044) [2024-06-24 23:36:23,109][15401] Updated weights for policy 0, policy_version 764603 (0.0028) [2024-06-24 23:36:23,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12527271936. Throughput: 0: 42674.5. Samples: 12527357120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 23:36:26,084][15401] Updated weights for policy 0, policy_version 764613 (0.0049) [2024-06-24 23:36:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12527484928. Throughput: 0: 42593.3. Samples: 12527611260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-24 23:36:30,686][15401] Updated weights for policy 0, policy_version 764623 (0.0036) [2024-06-24 23:36:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 12527714304. Throughput: 0: 42828.6. Samples: 12527874660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-24 23:36:33,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-24 23:36:33,583][15401] Updated weights for policy 0, policy_version 764633 (0.0042) [2024-06-24 23:36:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 12527894528. Throughput: 0: 42799.9. Samples: 12528003340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:36:38,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 23:36:38,497][15401] Updated weights for policy 0, policy_version 764643 (0.0034) [2024-06-24 23:36:41,275][15401] Updated weights for policy 0, policy_version 764653 (0.0036) [2024-06-24 23:36:41,804][15349] Signal inference workers to stop experience collection... (185350 times) [2024-06-24 23:36:41,804][15349] Signal inference workers to resume experience collection... (185350 times) [2024-06-24 23:36:41,847][15401] InferenceWorker_p0-w0: stopping experience collection (185350 times) [2024-06-24 23:36:41,848][15401] InferenceWorker_p0-w0: resuming experience collection (185350 times) [2024-06-24 23:36:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12528140288. Throughput: 0: 42620.4. Samples: 12528254640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:36:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 23:36:46,279][15401] Updated weights for policy 0, policy_version 764663 (0.0026) [2024-06-24 23:36:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12528353280. Throughput: 0: 42731.6. Samples: 12528517460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:36:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 23:36:49,192][15401] Updated weights for policy 0, policy_version 764673 (0.0028) [2024-06-24 23:36:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12528533504. Throughput: 0: 42610.2. Samples: 12528638700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:36:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 23:36:53,907][15401] Updated weights for policy 0, policy_version 764683 (0.0028) [2024-06-24 23:36:56,793][15401] Updated weights for policy 0, policy_version 764693 (0.0039) [2024-06-24 23:36:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12528812032. Throughput: 0: 42882.1. Samples: 12528901060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:36:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-24 23:37:01,594][15401] Updated weights for policy 0, policy_version 764703 (0.0021) [2024-06-24 23:37:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12528992256. Throughput: 0: 43065.8. Samples: 12529165240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 23:37:04,318][15401] Updated weights for policy 0, policy_version 764713 (0.0028) [2024-06-24 23:37:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12529188864. Throughput: 0: 42800.3. Samples: 12529283140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:08,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-24 23:37:09,194][15401] Updated weights for policy 0, policy_version 764723 (0.0032) [2024-06-24 23:37:11,984][15401] Updated weights for policy 0, policy_version 764733 (0.0032) [2024-06-24 23:37:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 12529451008. Throughput: 0: 42996.3. Samples: 12529546100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-24 23:37:16,722][15401] Updated weights for policy 0, policy_version 764743 (0.0029) [2024-06-24 23:37:18,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12529647616. Throughput: 0: 42955.4. Samples: 12529807760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:18,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 23:37:19,735][15401] Updated weights for policy 0, policy_version 764753 (0.0038) [2024-06-24 23:37:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12529844224. Throughput: 0: 42886.6. Samples: 12529933240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 23:37:24,171][15401] Updated weights for policy 0, policy_version 764763 (0.0033) [2024-06-24 23:37:27,236][15401] Updated weights for policy 0, policy_version 764773 (0.0046) [2024-06-24 23:37:28,389][15132] Fps is (10 sec: 42609.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12530073600. Throughput: 0: 42961.6. Samples: 12530187900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-24 23:37:31,782][15401] Updated weights for policy 0, policy_version 764783 (0.0028) [2024-06-24 23:37:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12530270208. Throughput: 0: 42945.2. Samples: 12530450000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:33,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-24 23:37:34,850][15401] Updated weights for policy 0, policy_version 764793 (0.0024) [2024-06-24 23:37:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12530499584. Throughput: 0: 42975.7. Samples: 12530572600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-24 23:37:39,300][15401] Updated weights for policy 0, policy_version 764803 (0.0041) [2024-06-24 23:37:42,579][15401] Updated weights for policy 0, policy_version 764813 (0.0030) [2024-06-24 23:37:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12530728960. Throughput: 0: 42701.8. Samples: 12530822640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-24 23:37:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764815_12530728960.pth... [2024-06-24 23:37:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764188_12520456192.pth [2024-06-24 23:37:47,614][15401] Updated weights for policy 0, policy_version 764823 (0.0041) [2024-06-24 23:37:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12530892800. Throughput: 0: 42597.4. Samples: 12531082120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 23:37:50,246][15401] Updated weights for policy 0, policy_version 764833 (0.0030) [2024-06-24 23:37:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12531138560. Throughput: 0: 42617.4. Samples: 12531200920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-24 23:37:55,143][15401] Updated weights for policy 0, policy_version 764843 (0.0041) [2024-06-24 23:37:58,267][15401] Updated weights for policy 0, policy_version 764853 (0.0028) [2024-06-24 23:37:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12531351552. Throughput: 0: 42541.5. Samples: 12531460460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:37:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 23:38:02,702][15401] Updated weights for policy 0, policy_version 764863 (0.0040) [2024-06-24 23:38:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12531548160. Throughput: 0: 42488.0. Samples: 12531719620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:38:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 23:38:06,031][15401] Updated weights for policy 0, policy_version 764873 (0.0028) [2024-06-24 23:38:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12531777536. Throughput: 0: 42482.3. Samples: 12531844940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:38:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-24 23:38:10,543][15401] Updated weights for policy 0, policy_version 764883 (0.0024) [2024-06-24 23:38:10,791][15349] Signal inference workers to stop experience collection... (185400 times) [2024-06-24 23:38:10,791][15349] Signal inference workers to resume experience collection... (185400 times) [2024-06-24 23:38:10,836][15401] InferenceWorker_p0-w0: stopping experience collection (185400 times) [2024-06-24 23:38:10,836][15401] InferenceWorker_p0-w0: resuming experience collection (185400 times) [2024-06-24 23:38:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12531974144. Throughput: 0: 42458.0. Samples: 12532098520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:38:13,390][15132] Avg episode reward: [(0, '0.114')] [2024-06-24 23:38:13,877][15401] Updated weights for policy 0, policy_version 764893 (0.0033) [2024-06-24 23:38:17,953][15401] Updated weights for policy 0, policy_version 764903 (0.0029) [2024-06-24 23:38:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 12532203520. Throughput: 0: 42470.3. Samples: 12532361160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-24 23:38:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-24 23:38:21,568][15401] Updated weights for policy 0, policy_version 764913 (0.0031) [2024-06-24 23:38:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12532416512. Throughput: 0: 42585.6. Samples: 12532489060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:23,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 23:38:25,692][15401] Updated weights for policy 0, policy_version 764923 (0.0037) [2024-06-24 23:38:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12532629504. Throughput: 0: 42627.1. Samples: 12532740860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:28,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-24 23:38:29,205][15401] Updated weights for policy 0, policy_version 764933 (0.0043) [2024-06-24 23:38:33,133][15401] Updated weights for policy 0, policy_version 764943 (0.0041) [2024-06-24 23:38:33,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12532842496. Throughput: 0: 42682.4. Samples: 12533002840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-24 23:38:36,793][15401] Updated weights for policy 0, policy_version 764953 (0.0041) [2024-06-24 23:38:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12533071872. Throughput: 0: 42988.9. Samples: 12533135420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-24 23:38:40,615][15401] Updated weights for policy 0, policy_version 764963 (0.0038) [2024-06-24 23:38:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12533268480. Throughput: 0: 42838.5. Samples: 12533388200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-24 23:38:44,410][15401] Updated weights for policy 0, policy_version 764973 (0.0024) [2024-06-24 23:38:48,102][15401] Updated weights for policy 0, policy_version 764983 (0.0036) [2024-06-24 23:38:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42765.5). Total num frames: 12533497856. Throughput: 0: 42797.9. Samples: 12533645520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:48,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 23:38:51,960][15401] Updated weights for policy 0, policy_version 764993 (0.0033) [2024-06-24 23:38:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12533694464. Throughput: 0: 42882.6. Samples: 12533774660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-24 23:38:55,833][15401] Updated weights for policy 0, policy_version 765003 (0.0024) [2024-06-24 23:38:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12533923840. Throughput: 0: 42985.8. Samples: 12534032880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:38:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-24 23:38:59,835][15401] Updated weights for policy 0, policy_version 765013 (0.0023) [2024-06-24 23:39:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12534120448. Throughput: 0: 42887.5. Samples: 12534291100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-24 23:39:03,508][15401] Updated weights for policy 0, policy_version 765023 (0.0029) [2024-06-24 23:39:07,480][15401] Updated weights for policy 0, policy_version 765033 (0.0028) [2024-06-24 23:39:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12534333440. Throughput: 0: 42788.4. Samples: 12534414440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-24 23:39:11,198][15401] Updated weights for policy 0, policy_version 765043 (0.0032) [2024-06-24 23:39:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12534562816. Throughput: 0: 42919.5. Samples: 12534672240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:13,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-24 23:39:15,247][15401] Updated weights for policy 0, policy_version 765053 (0.0034) [2024-06-24 23:39:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.4). Total num frames: 12534759424. Throughput: 0: 42856.6. Samples: 12534931380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 23:39:19,072][15401] Updated weights for policy 0, policy_version 765063 (0.0041) [2024-06-24 23:39:22,778][15401] Updated weights for policy 0, policy_version 765073 (0.0033) [2024-06-24 23:39:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 12534972416. Throughput: 0: 42491.6. Samples: 12535047540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 23:39:26,683][15401] Updated weights for policy 0, policy_version 765083 (0.0031) [2024-06-24 23:39:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12535201792. Throughput: 0: 42708.6. Samples: 12535310080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:28,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 23:39:30,706][15401] Updated weights for policy 0, policy_version 765093 (0.0033) [2024-06-24 23:39:30,706][15349] Signal inference workers to stop experience collection... (185450 times) [2024-06-24 23:39:30,707][15349] Signal inference workers to resume experience collection... (185450 times) [2024-06-24 23:39:30,733][15401] InferenceWorker_p0-w0: stopping experience collection (185450 times) [2024-06-24 23:39:30,733][15401] InferenceWorker_p0-w0: resuming experience collection (185450 times) [2024-06-24 23:39:33,394][15132] Fps is (10 sec: 42579.5, 60 sec: 42595.4, 300 sec: 42653.3). Total num frames: 12535398400. Throughput: 0: 42891.4. Samples: 12535575820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:33,394][15132] Avg episode reward: [(0, '0.393')] [2024-06-24 23:39:34,240][15401] Updated weights for policy 0, policy_version 765103 (0.0037) [2024-06-24 23:39:38,121][15401] Updated weights for policy 0, policy_version 765113 (0.0030) [2024-06-24 23:39:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12535611392. Throughput: 0: 42723.5. Samples: 12535697220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:38,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 23:39:42,007][15401] Updated weights for policy 0, policy_version 765123 (0.0043) [2024-06-24 23:39:43,392][15132] Fps is (10 sec: 47522.7, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 12535873536. Throughput: 0: 42744.0. Samples: 12535956460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:43,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765129_12535873536.pth... [2024-06-24 23:39:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764501_12525584384.pth [2024-06-24 23:39:46,277][15401] Updated weights for policy 0, policy_version 765133 (0.0038) [2024-06-24 23:39:48,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42050.6, 300 sec: 42598.0). Total num frames: 12536020992. Throughput: 0: 42767.0. Samples: 12536215720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:48,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-24 23:39:49,543][15401] Updated weights for policy 0, policy_version 765143 (0.0031) [2024-06-24 23:39:53,390][15132] Fps is (10 sec: 37692.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12536250368. Throughput: 0: 42597.4. Samples: 12536331320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 23:39:53,826][15401] Updated weights for policy 0, policy_version 765153 (0.0027) [2024-06-24 23:39:57,155][15401] Updated weights for policy 0, policy_version 765163 (0.0038) [2024-06-24 23:39:58,389][15132] Fps is (10 sec: 47525.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12536496128. Throughput: 0: 42810.8. Samples: 12536598720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:39:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-24 23:40:01,388][15401] Updated weights for policy 0, policy_version 765173 (0.0034) [2024-06-24 23:40:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12536659968. Throughput: 0: 42796.0. Samples: 12536857200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-24 23:40:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-24 23:40:04,911][15401] Updated weights for policy 0, policy_version 765183 (0.0034) [2024-06-24 23:40:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12536905728. Throughput: 0: 42813.6. Samples: 12536974160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-24 23:40:08,891][15401] Updated weights for policy 0, policy_version 765193 (0.0026) [2024-06-24 23:40:12,394][15401] Updated weights for policy 0, policy_version 765203 (0.0035) [2024-06-24 23:40:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12537135104. Throughput: 0: 42843.5. Samples: 12537238040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 23:40:16,275][15401] Updated weights for policy 0, policy_version 765213 (0.0038) [2024-06-24 23:40:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12537298944. Throughput: 0: 42905.9. Samples: 12537506400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 23:40:19,810][15401] Updated weights for policy 0, policy_version 765223 (0.0037) [2024-06-24 23:40:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12537561088. Throughput: 0: 42807.6. Samples: 12537623560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 23:40:23,947][15401] Updated weights for policy 0, policy_version 765233 (0.0048) [2024-06-24 23:40:27,303][15401] Updated weights for policy 0, policy_version 765243 (0.0026) [2024-06-24 23:40:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12537774080. Throughput: 0: 42832.1. Samples: 12537883800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 23:40:31,403][15401] Updated weights for policy 0, policy_version 765253 (0.0033) [2024-06-24 23:40:33,390][15132] Fps is (10 sec: 37682.2, 60 sec: 42328.2, 300 sec: 42709.5). Total num frames: 12537937920. Throughput: 0: 42921.6. Samples: 12538147100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-24 23:40:35,180][15401] Updated weights for policy 0, policy_version 765263 (0.0024) [2024-06-24 23:40:38,109][15349] Signal inference workers to stop experience collection... (185500 times) [2024-06-24 23:40:38,118][15349] Signal inference workers to resume experience collection... (185500 times) [2024-06-24 23:40:38,158][15401] InferenceWorker_p0-w0: stopping experience collection (185500 times) [2024-06-24 23:40:38,158][15401] InferenceWorker_p0-w0: resuming experience collection (185500 times) [2024-06-24 23:40:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 12538216448. Throughput: 0: 42975.5. Samples: 12538265220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 23:40:38,909][15401] Updated weights for policy 0, policy_version 765273 (0.0034) [2024-06-24 23:40:43,092][15401] Updated weights for policy 0, policy_version 765283 (0.0028) [2024-06-24 23:40:43,389][15132] Fps is (10 sec: 47515.0, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 12538413056. Throughput: 0: 42813.7. Samples: 12538525340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 23:40:46,378][15401] Updated weights for policy 0, policy_version 765293 (0.0032) [2024-06-24 23:40:48,392][15132] Fps is (10 sec: 37674.7, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 12538593280. Throughput: 0: 42789.3. Samples: 12538782820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:48,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:40:50,658][15401] Updated weights for policy 0, policy_version 765303 (0.0036) [2024-06-24 23:40:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 12538855424. Throughput: 0: 42900.9. Samples: 12538904700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-24 23:40:54,733][15401] Updated weights for policy 0, policy_version 765313 (0.0032) [2024-06-24 23:40:58,203][15401] Updated weights for policy 0, policy_version 765323 (0.0038) [2024-06-24 23:40:58,389][15132] Fps is (10 sec: 47525.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12539068416. Throughput: 0: 42970.3. Samples: 12539171700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:40:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 23:41:02,341][15401] Updated weights for policy 0, policy_version 765333 (0.0049) [2024-06-24 23:41:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12539248640. Throughput: 0: 42738.7. Samples: 12539429640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-24 23:41:05,907][15401] Updated weights for policy 0, policy_version 765343 (0.0038) [2024-06-24 23:41:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42765.9). Total num frames: 12539494400. Throughput: 0: 42833.2. Samples: 12539551060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 23:41:10,005][15401] Updated weights for policy 0, policy_version 765353 (0.0038) [2024-06-24 23:41:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12539691008. Throughput: 0: 42737.3. Samples: 12539806980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 23:41:13,506][15401] Updated weights for policy 0, policy_version 765363 (0.0049) [2024-06-24 23:41:17,801][15401] Updated weights for policy 0, policy_version 765373 (0.0033) [2024-06-24 23:41:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 12539904000. Throughput: 0: 42565.5. Samples: 12540062540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 23:41:21,383][15401] Updated weights for policy 0, policy_version 765383 (0.0030) [2024-06-24 23:41:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12540116992. Throughput: 0: 42818.3. Samples: 12540192040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 23:41:25,563][15401] Updated weights for policy 0, policy_version 765393 (0.0042) [2024-06-24 23:41:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12540313600. Throughput: 0: 42710.1. Samples: 12540447300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 23:41:28,921][15401] Updated weights for policy 0, policy_version 765403 (0.0046) [2024-06-24 23:41:33,067][15401] Updated weights for policy 0, policy_version 765413 (0.0025) [2024-06-24 23:41:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.8, 300 sec: 42820.6). Total num frames: 12540526592. Throughput: 0: 42658.7. Samples: 12540702360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-24 23:41:36,636][15401] Updated weights for policy 0, policy_version 765423 (0.0045) [2024-06-24 23:41:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 12540739584. Throughput: 0: 42806.4. Samples: 12540830980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-24 23:41:40,545][15401] Updated weights for policy 0, policy_version 765433 (0.0036) [2024-06-24 23:41:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12540952576. Throughput: 0: 42574.7. Samples: 12541087560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-24 23:41:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-24 23:41:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765440_12540968960.pth... [2024-06-24 23:41:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000764815_12530728960.pth [2024-06-24 23:41:44,318][15401] Updated weights for policy 0, policy_version 765443 (0.0042) [2024-06-24 23:41:48,346][15401] Updated weights for policy 0, policy_version 765453 (0.0043) [2024-06-24 23:41:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 12541181952. Throughput: 0: 42464.0. Samples: 12541340520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:41:48,394][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:41:51,897][15349] Signal inference workers to stop experience collection... (185550 times) [2024-06-24 23:41:51,897][15349] Signal inference workers to resume experience collection... (185550 times) [2024-06-24 23:41:51,920][15401] InferenceWorker_p0-w0: stopping experience collection (185550 times) [2024-06-24 23:41:51,920][15401] InferenceWorker_p0-w0: resuming experience collection (185550 times) [2024-06-24 23:41:52,066][15401] Updated weights for policy 0, policy_version 765463 (0.0038) [2024-06-24 23:41:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 12541378560. Throughput: 0: 42708.6. Samples: 12541472940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:41:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-24 23:41:55,942][15401] Updated weights for policy 0, policy_version 765473 (0.0028) [2024-06-24 23:41:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 12541591552. Throughput: 0: 42670.6. Samples: 12541727260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:41:58,392][15132] Avg episode reward: [(0, '0.397')] [2024-06-24 23:41:59,679][15401] Updated weights for policy 0, policy_version 765483 (0.0037) [2024-06-24 23:42:03,395][15132] Fps is (10 sec: 42573.1, 60 sec: 42594.3, 300 sec: 42764.2). Total num frames: 12541804544. Throughput: 0: 42694.0. Samples: 12541984020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:03,396][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 23:42:03,879][15401] Updated weights for policy 0, policy_version 765493 (0.0044) [2024-06-24 23:42:07,487][15401] Updated weights for policy 0, policy_version 765503 (0.0037) [2024-06-24 23:42:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 12542033920. Throughput: 0: 42533.8. Samples: 12542106060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-24 23:42:11,407][15401] Updated weights for policy 0, policy_version 765513 (0.0040) [2024-06-24 23:42:13,390][15132] Fps is (10 sec: 42623.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12542230528. Throughput: 0: 42544.0. Samples: 12542361780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-24 23:42:15,098][15401] Updated weights for policy 0, policy_version 765523 (0.0039) [2024-06-24 23:42:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12542443520. Throughput: 0: 42494.6. Samples: 12542614620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-24 23:42:19,238][15401] Updated weights for policy 0, policy_version 765533 (0.0032) [2024-06-24 23:42:22,895][15401] Updated weights for policy 0, policy_version 765543 (0.0026) [2024-06-24 23:42:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12542656512. Throughput: 0: 42477.8. Samples: 12542742480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 23:42:27,055][15401] Updated weights for policy 0, policy_version 765553 (0.0028) [2024-06-24 23:42:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12542869504. Throughput: 0: 42416.5. Samples: 12542996300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-24 23:42:30,522][15401] Updated weights for policy 0, policy_version 765563 (0.0031) [2024-06-24 23:42:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12543098880. Throughput: 0: 42450.7. Samples: 12543250800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 23:42:34,740][15401] Updated weights for policy 0, policy_version 765573 (0.0036) [2024-06-24 23:42:38,260][15401] Updated weights for policy 0, policy_version 765583 (0.0037) [2024-06-24 23:42:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12543311872. Throughput: 0: 42417.6. Samples: 12543381740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-24 23:42:42,561][15401] Updated weights for policy 0, policy_version 765593 (0.0041) [2024-06-24 23:42:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12543524864. Throughput: 0: 42464.8. Samples: 12543638080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-24 23:42:46,040][15401] Updated weights for policy 0, policy_version 765603 (0.0030) [2024-06-24 23:42:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12543737856. Throughput: 0: 42334.4. Samples: 12543888820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 23:42:50,366][15401] Updated weights for policy 0, policy_version 765613 (0.0038) [2024-06-24 23:42:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12543934464. Throughput: 0: 42458.1. Samples: 12544016680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-24 23:42:53,786][15401] Updated weights for policy 0, policy_version 765623 (0.0032) [2024-06-24 23:42:57,930][15401] Updated weights for policy 0, policy_version 765633 (0.0040) [2024-06-24 23:42:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12544163840. Throughput: 0: 42529.4. Samples: 12544275600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:42:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-24 23:43:02,037][15401] Updated weights for policy 0, policy_version 765643 (0.0036) [2024-06-24 23:43:03,395][15132] Fps is (10 sec: 42577.2, 60 sec: 42599.0, 300 sec: 42653.2). Total num frames: 12544360448. Throughput: 0: 42592.2. Samples: 12544531480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:03,395][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 23:43:05,447][15401] Updated weights for policy 0, policy_version 765653 (0.0028) [2024-06-24 23:43:06,957][15349] Signal inference workers to stop experience collection... (185600 times) [2024-06-24 23:43:06,961][15349] Signal inference workers to resume experience collection... (185600 times) [2024-06-24 23:43:06,999][15401] InferenceWorker_p0-w0: stopping experience collection (185600 times) [2024-06-24 23:43:06,999][15401] InferenceWorker_p0-w0: resuming experience collection (185600 times) [2024-06-24 23:43:08,394][15132] Fps is (10 sec: 40942.4, 60 sec: 42322.3, 300 sec: 42708.9). Total num frames: 12544573440. Throughput: 0: 42485.7. Samples: 12544654520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:08,394][15132] Avg episode reward: [(0, '0.595')] [2024-06-24 23:43:09,800][15401] Updated weights for policy 0, policy_version 765663 (0.0058) [2024-06-24 23:43:12,898][15401] Updated weights for policy 0, policy_version 765673 (0.0047) [2024-06-24 23:43:13,390][15132] Fps is (10 sec: 42619.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12544786432. Throughput: 0: 42589.7. Samples: 12544912840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-24 23:43:17,310][15401] Updated weights for policy 0, policy_version 765683 (0.0025) [2024-06-24 23:43:18,392][15132] Fps is (10 sec: 44245.1, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 12545015808. Throughput: 0: 42618.6. Samples: 12545168740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:18,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 23:43:20,326][15401] Updated weights for policy 0, policy_version 765693 (0.0030) [2024-06-24 23:43:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12545228800. Throughput: 0: 42605.3. Samples: 12545298980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:23,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-24 23:43:24,915][15401] Updated weights for policy 0, policy_version 765703 (0.0031) [2024-06-24 23:43:27,863][15401] Updated weights for policy 0, policy_version 765713 (0.0036) [2024-06-24 23:43:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12545441792. Throughput: 0: 42662.3. Samples: 12545557880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-24 23:43:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-24 23:43:32,398][15401] Updated weights for policy 0, policy_version 765723 (0.0034) [2024-06-24 23:43:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12545654784. Throughput: 0: 42748.1. Samples: 12545812480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-24 23:43:35,674][15401] Updated weights for policy 0, policy_version 765733 (0.0043) [2024-06-24 23:43:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12545867776. Throughput: 0: 42815.6. Samples: 12545943380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:38,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-24 23:43:40,093][15401] Updated weights for policy 0, policy_version 765743 (0.0036) [2024-06-24 23:43:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12546080768. Throughput: 0: 42774.1. Samples: 12546200440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 23:43:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765752_12546080768.pth... [2024-06-24 23:43:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765129_12535873536.pth [2024-06-24 23:43:43,733][15401] Updated weights for policy 0, policy_version 765753 (0.0027) [2024-06-24 23:43:47,751][15401] Updated weights for policy 0, policy_version 765763 (0.0030) [2024-06-24 23:43:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12546293760. Throughput: 0: 42884.3. Samples: 12546461060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-24 23:43:51,403][15401] Updated weights for policy 0, policy_version 765773 (0.0027) [2024-06-24 23:43:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12546506752. Throughput: 0: 42927.2. Samples: 12546586060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:53,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-24 23:43:55,554][15401] Updated weights for policy 0, policy_version 765783 (0.0030) [2024-06-24 23:43:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12546719744. Throughput: 0: 42886.7. Samples: 12546842740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:43:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 23:43:59,033][15401] Updated weights for policy 0, policy_version 765793 (0.0033) [2024-06-24 23:44:03,286][15401] Updated weights for policy 0, policy_version 765803 (0.0035) [2024-06-24 23:44:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42600.2, 300 sec: 42653.6). Total num frames: 12546916352. Throughput: 0: 43033.3. Samples: 12547105240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:03,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 23:44:06,459][15401] Updated weights for policy 0, policy_version 765813 (0.0035) [2024-06-24 23:44:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43147.5, 300 sec: 42709.5). Total num frames: 12547162112. Throughput: 0: 42939.1. Samples: 12547231240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-24 23:44:10,614][15401] Updated weights for policy 0, policy_version 765823 (0.0039) [2024-06-24 23:44:13,393][15132] Fps is (10 sec: 44232.7, 60 sec: 42869.1, 300 sec: 42709.0). Total num frames: 12547358720. Throughput: 0: 42948.7. Samples: 12547490720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:13,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-24 23:44:13,597][15349] Signal inference workers to stop experience collection... (185650 times) [2024-06-24 23:44:13,628][15401] InferenceWorker_p0-w0: stopping experience collection (185650 times) [2024-06-24 23:44:13,650][15349] Signal inference workers to resume experience collection... (185650 times) [2024-06-24 23:44:13,651][15401] InferenceWorker_p0-w0: resuming experience collection (185650 times) [2024-06-24 23:44:14,125][15401] Updated weights for policy 0, policy_version 765833 (0.0026) [2024-06-24 23:44:18,127][15401] Updated weights for policy 0, policy_version 765843 (0.0037) [2024-06-24 23:44:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12547571712. Throughput: 0: 42792.8. Samples: 12547738160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-24 23:44:21,785][15401] Updated weights for policy 0, policy_version 765853 (0.0021) [2024-06-24 23:44:23,390][15132] Fps is (10 sec: 42612.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12547784704. Throughput: 0: 42750.2. Samples: 12547867140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-24 23:44:25,933][15401] Updated weights for policy 0, policy_version 765863 (0.0031) [2024-06-24 23:44:28,394][15132] Fps is (10 sec: 42580.0, 60 sec: 42595.3, 300 sec: 42709.5). Total num frames: 12547997696. Throughput: 0: 42771.5. Samples: 12548125340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:28,394][15132] Avg episode reward: [(0, '0.745')] [2024-06-24 23:44:29,327][15401] Updated weights for policy 0, policy_version 765873 (0.0028) [2024-06-24 23:44:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12548210688. Throughput: 0: 42571.7. Samples: 12548376780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-24 23:44:33,477][15401] Updated weights for policy 0, policy_version 765883 (0.0037) [2024-06-24 23:44:36,925][15401] Updated weights for policy 0, policy_version 765893 (0.0034) [2024-06-24 23:44:38,389][15132] Fps is (10 sec: 44256.2, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 12548440064. Throughput: 0: 42716.5. Samples: 12548508300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-24 23:44:41,346][15401] Updated weights for policy 0, policy_version 765903 (0.0040) [2024-06-24 23:44:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 12548620288. Throughput: 0: 42787.0. Samples: 12548768160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 23:44:44,771][15401] Updated weights for policy 0, policy_version 765913 (0.0032) [2024-06-24 23:44:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12548849664. Throughput: 0: 42570.3. Samples: 12549020800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-24 23:44:48,869][15401] Updated weights for policy 0, policy_version 765923 (0.0023) [2024-06-24 23:44:52,316][15401] Updated weights for policy 0, policy_version 765933 (0.0038) [2024-06-24 23:44:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 12549095424. Throughput: 0: 42744.0. Samples: 12549154720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-24 23:44:56,326][15401] Updated weights for policy 0, policy_version 765943 (0.0023) [2024-06-24 23:44:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12549259264. Throughput: 0: 42764.6. Samples: 12549414980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:44:58,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-24 23:44:59,786][15401] Updated weights for policy 0, policy_version 765953 (0.0036) [2024-06-24 23:45:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 12549505024. Throughput: 0: 42780.8. Samples: 12549663300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:45:03,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-24 23:45:03,947][15401] Updated weights for policy 0, policy_version 765963 (0.0032) [2024-06-24 23:45:07,813][15401] Updated weights for policy 0, policy_version 765973 (0.0033) [2024-06-24 23:45:08,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12549734400. Throughput: 0: 42857.8. Samples: 12549795740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:45:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 23:45:11,534][15401] Updated weights for policy 0, policy_version 765983 (0.0041) [2024-06-24 23:45:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42327.8, 300 sec: 42709.5). Total num frames: 12549898240. Throughput: 0: 42809.1. Samples: 12550051560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-24 23:45:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-24 23:45:15,214][15401] Updated weights for policy 0, policy_version 765993 (0.0029) [2024-06-24 23:45:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12550144000. Throughput: 0: 42932.7. Samples: 12550308760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:18,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-24 23:45:19,498][15401] Updated weights for policy 0, policy_version 766003 (0.0040) [2024-06-24 23:45:22,426][15349] Signal inference workers to stop experience collection... (185700 times) [2024-06-24 23:45:22,426][15349] Signal inference workers to resume experience collection... (185700 times) [2024-06-24 23:45:22,438][15401] InferenceWorker_p0-w0: stopping experience collection (185700 times) [2024-06-24 23:45:22,438][15401] InferenceWorker_p0-w0: resuming experience collection (185700 times) [2024-06-24 23:45:22,752][15401] Updated weights for policy 0, policy_version 766013 (0.0035) [2024-06-24 23:45:23,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12550389760. Throughput: 0: 42978.2. Samples: 12550442320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-24 23:45:27,053][15401] Updated weights for policy 0, policy_version 766023 (0.0035) [2024-06-24 23:45:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42874.6, 300 sec: 42820.6). Total num frames: 12550569984. Throughput: 0: 42796.6. Samples: 12550694000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:28,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-24 23:45:30,551][15401] Updated weights for policy 0, policy_version 766033 (0.0037) [2024-06-24 23:45:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 12550799360. Throughput: 0: 42935.4. Samples: 12550952900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-24 23:45:34,711][15401] Updated weights for policy 0, policy_version 766043 (0.0038) [2024-06-24 23:45:38,203][15401] Updated weights for policy 0, policy_version 766053 (0.0034) [2024-06-24 23:45:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12551012352. Throughput: 0: 42852.1. Samples: 12551083060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 23:45:42,389][15401] Updated weights for policy 0, policy_version 766063 (0.0044) [2024-06-24 23:45:43,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12551192576. Throughput: 0: 42787.8. Samples: 12551340440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 23:45:43,474][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766065_12551208960.pth... [2024-06-24 23:45:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765440_12540968960.pth [2024-06-24 23:45:45,695][15401] Updated weights for policy 0, policy_version 766073 (0.0043) [2024-06-24 23:45:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12551438336. Throughput: 0: 42969.0. Samples: 12551596900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:48,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 23:45:49,895][15401] Updated weights for policy 0, policy_version 766083 (0.0032) [2024-06-24 23:45:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12551651328. Throughput: 0: 42958.7. Samples: 12551728880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-24 23:45:53,551][15401] Updated weights for policy 0, policy_version 766093 (0.0030) [2024-06-24 23:45:57,528][15401] Updated weights for policy 0, policy_version 766103 (0.0030) [2024-06-24 23:45:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12551847936. Throughput: 0: 42930.2. Samples: 12551983420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:45:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-24 23:46:01,100][15401] Updated weights for policy 0, policy_version 766113 (0.0028) [2024-06-24 23:46:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12552093696. Throughput: 0: 42838.8. Samples: 12552236500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-24 23:46:05,261][15401] Updated weights for policy 0, policy_version 766123 (0.0024) [2024-06-24 23:46:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12552290304. Throughput: 0: 42818.8. Samples: 12552369160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-24 23:46:08,673][15401] Updated weights for policy 0, policy_version 766133 (0.0029) [2024-06-24 23:46:13,099][15401] Updated weights for policy 0, policy_version 766143 (0.0037) [2024-06-24 23:46:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 12552486912. Throughput: 0: 42814.1. Samples: 12552620640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:13,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-24 23:46:16,281][15401] Updated weights for policy 0, policy_version 766153 (0.0030) [2024-06-24 23:46:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12552732672. Throughput: 0: 42731.7. Samples: 12552875820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:18,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-24 23:46:21,056][15401] Updated weights for policy 0, policy_version 766163 (0.0037) [2024-06-24 23:46:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12552929280. Throughput: 0: 42735.6. Samples: 12553006160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-24 23:46:24,738][15401] Updated weights for policy 0, policy_version 766173 (0.0031) [2024-06-24 23:46:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12553125888. Throughput: 0: 42506.8. Samples: 12553253240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 23:46:28,596][15401] Updated weights for policy 0, policy_version 766183 (0.0035) [2024-06-24 23:46:32,334][15401] Updated weights for policy 0, policy_version 766193 (0.0044) [2024-06-24 23:46:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12553338880. Throughput: 0: 42785.8. Samples: 12553522260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:33,400][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 23:46:36,019][15401] Updated weights for policy 0, policy_version 766203 (0.0030) [2024-06-24 23:46:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12553584640. Throughput: 0: 42623.6. Samples: 12553646940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 23:46:40,007][15401] Updated weights for policy 0, policy_version 766213 (0.0047) [2024-06-24 23:46:41,762][15349] Signal inference workers to stop experience collection... (185750 times) [2024-06-24 23:46:41,804][15401] InferenceWorker_p0-w0: stopping experience collection (185750 times) [2024-06-24 23:46:41,819][15349] Signal inference workers to resume experience collection... (185750 times) [2024-06-24 23:46:41,823][15401] InferenceWorker_p0-w0: resuming experience collection (185750 times) [2024-06-24 23:46:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12553781248. Throughput: 0: 42581.8. Samples: 12553899600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-24 23:46:43,571][15401] Updated weights for policy 0, policy_version 766223 (0.0048) [2024-06-24 23:46:47,577][15401] Updated weights for policy 0, policy_version 766233 (0.0029) [2024-06-24 23:46:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12553977856. Throughput: 0: 42798.2. Samples: 12554162420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 23:46:51,155][15401] Updated weights for policy 0, policy_version 766243 (0.0033) [2024-06-24 23:46:53,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 12554207232. Throughput: 0: 42608.3. Samples: 12554286640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:53,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-24 23:46:55,167][15401] Updated weights for policy 0, policy_version 766253 (0.0030) [2024-06-24 23:46:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42710.3). Total num frames: 12554403840. Throughput: 0: 42706.8. Samples: 12554542440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-24 23:46:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-24 23:46:58,990][15401] Updated weights for policy 0, policy_version 766263 (0.0034) [2024-06-24 23:47:03,255][15401] Updated weights for policy 0, policy_version 766273 (0.0033) [2024-06-24 23:47:03,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12554616832. Throughput: 0: 42827.2. Samples: 12554803040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-24 23:47:06,594][15401] Updated weights for policy 0, policy_version 766283 (0.0052) [2024-06-24 23:47:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12554829824. Throughput: 0: 42659.5. Samples: 12554925840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 23:47:10,724][15401] Updated weights for policy 0, policy_version 766293 (0.0028) [2024-06-24 23:47:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12555059200. Throughput: 0: 42827.4. Samples: 12555180480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-24 23:47:14,377][15401] Updated weights for policy 0, policy_version 766303 (0.0038) [2024-06-24 23:47:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12555272192. Throughput: 0: 42485.0. Samples: 12555434080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-24 23:47:18,398][15401] Updated weights for policy 0, policy_version 766313 (0.0039) [2024-06-24 23:47:22,074][15401] Updated weights for policy 0, policy_version 766323 (0.0027) [2024-06-24 23:47:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 12555468800. Throughput: 0: 42565.7. Samples: 12555562500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:23,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-24 23:47:25,867][15401] Updated weights for policy 0, policy_version 766333 (0.0028) [2024-06-24 23:47:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12555681792. Throughput: 0: 42643.6. Samples: 12555818560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-24 23:47:29,663][15401] Updated weights for policy 0, policy_version 766343 (0.0043) [2024-06-24 23:47:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12555911168. Throughput: 0: 42448.4. Samples: 12556072600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-24 23:47:33,478][15401] Updated weights for policy 0, policy_version 766353 (0.0028) [2024-06-24 23:47:37,814][15401] Updated weights for policy 0, policy_version 766363 (0.0035) [2024-06-24 23:47:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 12556091392. Throughput: 0: 42629.0. Samples: 12556204840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-24 23:47:41,089][15401] Updated weights for policy 0, policy_version 766373 (0.0023) [2024-06-24 23:47:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12556337152. Throughput: 0: 42491.9. Samples: 12556454580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-24 23:47:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766378_12556337152.pth... [2024-06-24 23:47:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000765752_12546080768.pth [2024-06-24 23:47:45,383][15401] Updated weights for policy 0, policy_version 766383 (0.0028) [2024-06-24 23:47:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12556550144. Throughput: 0: 42328.4. Samples: 12556707820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 23:47:49,028][15401] Updated weights for policy 0, policy_version 766393 (0.0032) [2024-06-24 23:47:53,388][15401] Updated weights for policy 0, policy_version 766403 (0.0035) [2024-06-24 23:47:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 12556746752. Throughput: 0: 42627.0. Samples: 12556844060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-24 23:47:56,482][15401] Updated weights for policy 0, policy_version 766413 (0.0032) [2024-06-24 23:47:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 12556976128. Throughput: 0: 42501.0. Samples: 12557093020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:47:58,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-24 23:48:00,938][15401] Updated weights for policy 0, policy_version 766423 (0.0024) [2024-06-24 23:48:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42821.2). Total num frames: 12557205504. Throughput: 0: 42578.9. Samples: 12557350140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:03,395][15132] Avg episode reward: [(0, '0.623')] [2024-06-24 23:48:04,123][15401] Updated weights for policy 0, policy_version 766433 (0.0033) [2024-06-24 23:48:06,503][15349] Signal inference workers to stop experience collection... (185800 times) [2024-06-24 23:48:06,503][15349] Signal inference workers to resume experience collection... (185800 times) [2024-06-24 23:48:06,536][15401] InferenceWorker_p0-w0: stopping experience collection (185800 times) [2024-06-24 23:48:06,536][15401] InferenceWorker_p0-w0: resuming experience collection (185800 times) [2024-06-24 23:48:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12557369344. Throughput: 0: 42626.8. Samples: 12557480600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-24 23:48:08,764][15401] Updated weights for policy 0, policy_version 766443 (0.0023) [2024-06-24 23:48:12,065][15401] Updated weights for policy 0, policy_version 766453 (0.0023) [2024-06-24 23:48:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12557615104. Throughput: 0: 42561.2. Samples: 12557733820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-24 23:48:16,430][15401] Updated weights for policy 0, policy_version 766463 (0.0034) [2024-06-24 23:48:18,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 12557844480. Throughput: 0: 42551.5. Samples: 12557987420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:48:19,711][15401] Updated weights for policy 0, policy_version 766473 (0.0036) [2024-06-24 23:48:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12558008320. Throughput: 0: 42568.5. Samples: 12558120420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-24 23:48:24,047][15401] Updated weights for policy 0, policy_version 766483 (0.0043) [2024-06-24 23:48:27,191][15401] Updated weights for policy 0, policy_version 766493 (0.0029) [2024-06-24 23:48:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12558254080. Throughput: 0: 42593.9. Samples: 12558371300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:28,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-24 23:48:31,508][15401] Updated weights for policy 0, policy_version 766503 (0.0030) [2024-06-24 23:48:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12558467072. Throughput: 0: 42787.0. Samples: 12558633240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-24 23:48:34,631][15401] Updated weights for policy 0, policy_version 766513 (0.0032) [2024-06-24 23:48:38,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12558663680. Throughput: 0: 42482.2. Samples: 12558755860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:38,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-24 23:48:38,970][15401] Updated weights for policy 0, policy_version 766523 (0.0037) [2024-06-24 23:48:42,134][15401] Updated weights for policy 0, policy_version 766533 (0.0035) [2024-06-24 23:48:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12558893056. Throughput: 0: 42529.4. Samples: 12559006840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 27.0) [2024-06-24 23:48:43,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-24 23:48:46,787][15401] Updated weights for policy 0, policy_version 766543 (0.0032) [2024-06-24 23:48:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12559106048. Throughput: 0: 42724.5. Samples: 12559272740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:48:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 23:48:49,783][15401] Updated weights for policy 0, policy_version 766553 (0.0042) [2024-06-24 23:48:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12559302656. Throughput: 0: 42718.6. Samples: 12559402940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:48:53,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 23:48:54,183][15401] Updated weights for policy 0, policy_version 766563 (0.0034) [2024-06-24 23:48:57,600][15401] Updated weights for policy 0, policy_version 766573 (0.0025) [2024-06-24 23:48:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12559548416. Throughput: 0: 42702.9. Samples: 12559655440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:48:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 23:49:01,871][15401] Updated weights for policy 0, policy_version 766583 (0.0037) [2024-06-24 23:49:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12559745024. Throughput: 0: 42932.6. Samples: 12559919380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-24 23:49:05,178][15401] Updated weights for policy 0, policy_version 766593 (0.0029) [2024-06-24 23:49:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 43144.4, 300 sec: 42709.9). Total num frames: 12559958016. Throughput: 0: 42739.8. Samples: 12560043720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-24 23:49:09,607][15401] Updated weights for policy 0, policy_version 766603 (0.0031) [2024-06-24 23:49:12,803][15401] Updated weights for policy 0, policy_version 766613 (0.0027) [2024-06-24 23:49:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 12560187392. Throughput: 0: 42871.0. Samples: 12560300600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:13,392][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 23:49:17,791][15401] Updated weights for policy 0, policy_version 766623 (0.0023) [2024-06-24 23:49:18,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12560384000. Throughput: 0: 42724.5. Samples: 12560555840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:18,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 23:49:20,394][15401] Updated weights for policy 0, policy_version 766633 (0.0041) [2024-06-24 23:49:23,392][15132] Fps is (10 sec: 40960.0, 60 sec: 43142.8, 300 sec: 42709.8). Total num frames: 12560596992. Throughput: 0: 42718.7. Samples: 12560678200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:23,393][15132] Avg episode reward: [(0, '0.744')] [2024-06-24 23:49:25,305][15401] Updated weights for policy 0, policy_version 766643 (0.0033) [2024-06-24 23:49:27,237][15349] Signal inference workers to stop experience collection... (185850 times) [2024-06-24 23:49:27,238][15349] Signal inference workers to resume experience collection... (185850 times) [2024-06-24 23:49:27,271][15401] InferenceWorker_p0-w0: stopping experience collection (185850 times) [2024-06-24 23:49:27,271][15401] InferenceWorker_p0-w0: resuming experience collection (185850 times) [2024-06-24 23:49:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 12560826368. Throughput: 0: 42870.1. Samples: 12560936100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:28,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-24 23:49:28,459][15401] Updated weights for policy 0, policy_version 766653 (0.0027) [2024-06-24 23:49:32,989][15401] Updated weights for policy 0, policy_version 766663 (0.0034) [2024-06-24 23:49:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12561022976. Throughput: 0: 42643.5. Samples: 12561191700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 23:49:36,196][15401] Updated weights for policy 0, policy_version 766673 (0.0032) [2024-06-24 23:49:38,389][15132] Fps is (10 sec: 39331.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 12561219584. Throughput: 0: 42447.7. Samples: 12561313080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-24 23:49:40,991][15401] Updated weights for policy 0, policy_version 766683 (0.0041) [2024-06-24 23:49:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12561448960. Throughput: 0: 42659.0. Samples: 12561575100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-24 23:49:43,479][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766691_12561465344.pth... [2024-06-24 23:49:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766065_12551208960.pth [2024-06-24 23:49:43,875][15401] Updated weights for policy 0, policy_version 766693 (0.0038) [2024-06-24 23:49:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12561645568. Throughput: 0: 42441.3. Samples: 12561829240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-24 23:49:48,686][15401] Updated weights for policy 0, policy_version 766703 (0.0038) [2024-06-24 23:49:51,610][15401] Updated weights for policy 0, policy_version 766713 (0.0038) [2024-06-24 23:49:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12561874944. Throughput: 0: 42458.8. Samples: 12561954360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:53,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 23:49:56,351][15401] Updated weights for policy 0, policy_version 766723 (0.0054) [2024-06-24 23:49:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12562071552. Throughput: 0: 42401.4. Samples: 12562208560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:49:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-24 23:49:59,403][15401] Updated weights for policy 0, policy_version 766733 (0.0034) [2024-06-24 23:50:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12562284544. Throughput: 0: 42473.8. Samples: 12562467160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-24 23:50:04,038][15401] Updated weights for policy 0, policy_version 766743 (0.0031) [2024-06-24 23:50:06,845][15401] Updated weights for policy 0, policy_version 766753 (0.0028) [2024-06-24 23:50:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12562513920. Throughput: 0: 42637.8. Samples: 12562596800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-24 23:50:11,601][15401] Updated weights for policy 0, policy_version 766763 (0.0042) [2024-06-24 23:50:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 12562726912. Throughput: 0: 42537.8. Samples: 12562850300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:13,392][15132] Avg episode reward: [(0, '0.791')] [2024-06-24 23:50:15,325][15401] Updated weights for policy 0, policy_version 766773 (0.0025) [2024-06-24 23:50:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12562939904. Throughput: 0: 42490.2. Samples: 12563103760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-24 23:50:19,342][15401] Updated weights for policy 0, policy_version 766783 (0.0037) [2024-06-24 23:50:22,983][15401] Updated weights for policy 0, policy_version 766793 (0.0031) [2024-06-24 23:50:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 12563152896. Throughput: 0: 42644.3. Samples: 12563232080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-24 23:50:26,905][15401] Updated weights for policy 0, policy_version 766803 (0.0030) [2024-06-24 23:50:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 12563365888. Throughput: 0: 42600.0. Samples: 12563492100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-24 23:50:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-24 23:50:30,451][15401] Updated weights for policy 0, policy_version 766813 (0.0041) [2024-06-24 23:50:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12563578880. Throughput: 0: 42590.7. Samples: 12563745820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-24 23:50:34,524][15401] Updated weights for policy 0, policy_version 766823 (0.0024) [2024-06-24 23:50:38,033][15401] Updated weights for policy 0, policy_version 766833 (0.0030) [2024-06-24 23:50:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12563791872. Throughput: 0: 42712.1. Samples: 12563876400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-24 23:50:40,276][15349] Signal inference workers to stop experience collection... (185900 times) [2024-06-24 23:50:40,279][15349] Signal inference workers to resume experience collection... (185900 times) [2024-06-24 23:50:40,308][15401] InferenceWorker_p0-w0: stopping experience collection (185900 times) [2024-06-24 23:50:40,308][15401] InferenceWorker_p0-w0: resuming experience collection (185900 times) [2024-06-24 23:50:42,029][15401] Updated weights for policy 0, policy_version 766843 (0.0039) [2024-06-24 23:50:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 12564004864. Throughput: 0: 42518.1. Samples: 12564121980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:43,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-24 23:50:45,652][15401] Updated weights for policy 0, policy_version 766853 (0.0044) [2024-06-24 23:50:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12564217856. Throughput: 0: 42305.3. Samples: 12564370900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-24 23:50:49,863][15401] Updated weights for policy 0, policy_version 766863 (0.0031) [2024-06-24 23:50:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12564430848. Throughput: 0: 42395.1. Samples: 12564504580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:53,394][15132] Avg episode reward: [(0, '0.717')] [2024-06-24 23:50:53,610][15401] Updated weights for policy 0, policy_version 766873 (0.0047) [2024-06-24 23:50:57,462][15401] Updated weights for policy 0, policy_version 766883 (0.0036) [2024-06-24 23:50:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12564627456. Throughput: 0: 42300.5. Samples: 12564753720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:50:58,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-24 23:51:01,669][15401] Updated weights for policy 0, policy_version 766893 (0.0034) [2024-06-24 23:51:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12564856832. Throughput: 0: 42263.2. Samples: 12565005600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-24 23:51:05,226][15401] Updated weights for policy 0, policy_version 766903 (0.0045) [2024-06-24 23:51:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 12565053440. Throughput: 0: 42424.9. Samples: 12565141300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:08,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-24 23:51:09,187][15401] Updated weights for policy 0, policy_version 766913 (0.0029) [2024-06-24 23:51:12,906][15401] Updated weights for policy 0, policy_version 766923 (0.0029) [2024-06-24 23:51:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 12565266432. Throughput: 0: 42285.3. Samples: 12565394940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-24 23:51:17,172][15401] Updated weights for policy 0, policy_version 766933 (0.0050) [2024-06-24 23:51:18,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12565512192. Throughput: 0: 42316.0. Samples: 12565650040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 23:51:20,822][15401] Updated weights for policy 0, policy_version 766943 (0.0033) [2024-06-24 23:51:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12565692416. Throughput: 0: 42252.7. Samples: 12565777780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-24 23:51:24,855][15401] Updated weights for policy 0, policy_version 766953 (0.0035) [2024-06-24 23:51:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12565905408. Throughput: 0: 42380.5. Samples: 12566029000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 23:51:28,462][15401] Updated weights for policy 0, policy_version 766963 (0.0040) [2024-06-24 23:51:32,335][15401] Updated weights for policy 0, policy_version 766973 (0.0029) [2024-06-24 23:51:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12566118400. Throughput: 0: 42736.4. Samples: 12566294040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-24 23:51:35,981][15401] Updated weights for policy 0, policy_version 766983 (0.0036) [2024-06-24 23:51:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12566347776. Throughput: 0: 42561.5. Samples: 12566419840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-24 23:51:39,859][15401] Updated weights for policy 0, policy_version 766993 (0.0028) [2024-06-24 23:51:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42595.5, 300 sec: 42653.0). Total num frames: 12566560768. Throughput: 0: 42704.1. Samples: 12566675680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:43,397][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 23:51:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767002_12566560768.pth... [2024-06-24 23:51:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766378_12556337152.pth [2024-06-24 23:51:43,740][15401] Updated weights for policy 0, policy_version 767003 (0.0022) [2024-06-24 23:51:45,566][15349] Signal inference workers to stop experience collection... (185950 times) [2024-06-24 23:51:45,566][15349] Signal inference workers to resume experience collection... (185950 times) [2024-06-24 23:51:45,578][15401] InferenceWorker_p0-w0: stopping experience collection (185950 times) [2024-06-24 23:51:45,579][15401] InferenceWorker_p0-w0: resuming experience collection (185950 times) [2024-06-24 23:51:47,389][15401] Updated weights for policy 0, policy_version 767013 (0.0034) [2024-06-24 23:51:48,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 12566757376. Throughput: 0: 42962.0. Samples: 12566938900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-24 23:51:51,422][15401] Updated weights for policy 0, policy_version 767023 (0.0029) [2024-06-24 23:51:53,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12566986752. Throughput: 0: 42807.7. Samples: 12567067540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 23:51:54,870][15401] Updated weights for policy 0, policy_version 767033 (0.0036) [2024-06-24 23:51:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12567199744. Throughput: 0: 42810.7. Samples: 12567321420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:51:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 23:51:58,970][15401] Updated weights for policy 0, policy_version 767043 (0.0029) [2024-06-24 23:52:02,478][15401] Updated weights for policy 0, policy_version 767053 (0.0036) [2024-06-24 23:52:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12567412736. Throughput: 0: 43013.2. Samples: 12567585640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:52:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 23:52:06,399][15401] Updated weights for policy 0, policy_version 767063 (0.0035) [2024-06-24 23:52:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 12567625728. Throughput: 0: 42960.6. Samples: 12567711000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:52:08,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-24 23:52:10,346][15401] Updated weights for policy 0, policy_version 767073 (0.0035) [2024-06-24 23:52:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 12567855104. Throughput: 0: 43002.6. Samples: 12567964220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-24 23:52:13,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-24 23:52:14,307][15401] Updated weights for policy 0, policy_version 767083 (0.0024) [2024-06-24 23:52:18,235][15401] Updated weights for policy 0, policy_version 767093 (0.0028) [2024-06-24 23:52:18,390][15132] Fps is (10 sec: 44232.9, 60 sec: 42597.8, 300 sec: 42709.7). Total num frames: 12568068096. Throughput: 0: 42928.1. Samples: 12568225840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:18,391][15132] Avg episode reward: [(0, '0.615')] [2024-06-24 23:52:22,515][15401] Updated weights for policy 0, policy_version 767103 (0.0034) [2024-06-24 23:52:23,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12568264704. Throughput: 0: 42891.9. Samples: 12568349980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:23,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 23:52:25,733][15401] Updated weights for policy 0, policy_version 767113 (0.0041) [2024-06-24 23:52:28,390][15132] Fps is (10 sec: 42601.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12568494080. Throughput: 0: 42797.7. Samples: 12568601300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-24 23:52:30,175][15401] Updated weights for policy 0, policy_version 767123 (0.0036) [2024-06-24 23:52:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12568690688. Throughput: 0: 42706.9. Samples: 12568860700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-24 23:52:33,662][15401] Updated weights for policy 0, policy_version 767133 (0.0043) [2024-06-24 23:52:37,837][15401] Updated weights for policy 0, policy_version 767143 (0.0037) [2024-06-24 23:52:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12568870912. Throughput: 0: 42536.4. Samples: 12568981680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:38,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-24 23:52:41,509][15401] Updated weights for policy 0, policy_version 767153 (0.0041) [2024-06-24 23:52:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43149.2, 300 sec: 42709.5). Total num frames: 12569149440. Throughput: 0: 42636.1. Samples: 12569240040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-24 23:52:45,331][15401] Updated weights for policy 0, policy_version 767163 (0.0026) [2024-06-24 23:52:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12569329664. Throughput: 0: 42542.6. Samples: 12569500060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-24 23:52:49,109][15401] Updated weights for policy 0, policy_version 767173 (0.0027) [2024-06-24 23:52:52,930][15401] Updated weights for policy 0, policy_version 767183 (0.0032) [2024-06-24 23:52:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12569526272. Throughput: 0: 42464.5. Samples: 12569621900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-24 23:52:56,736][15401] Updated weights for policy 0, policy_version 767193 (0.0029) [2024-06-24 23:52:58,389][15132] Fps is (10 sec: 45876.6, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 12569788416. Throughput: 0: 42640.7. Samples: 12569882940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:52:58,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-24 23:53:00,421][15401] Updated weights for policy 0, policy_version 767203 (0.0029) [2024-06-24 23:53:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12569968640. Throughput: 0: 42651.0. Samples: 12570145100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-24 23:53:04,169][15401] Updated weights for policy 0, policy_version 767213 (0.0029) [2024-06-24 23:53:07,186][15349] Signal inference workers to stop experience collection... (186000 times) [2024-06-24 23:53:07,187][15349] Signal inference workers to resume experience collection... (186000 times) [2024-06-24 23:53:07,235][15401] InferenceWorker_p0-w0: stopping experience collection (186000 times) [2024-06-24 23:53:07,235][15401] InferenceWorker_p0-w0: resuming experience collection (186000 times) [2024-06-24 23:53:07,872][15401] Updated weights for policy 0, policy_version 767223 (0.0032) [2024-06-24 23:53:08,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12570181632. Throughput: 0: 42461.7. Samples: 12570260760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 23:53:12,123][15401] Updated weights for policy 0, policy_version 767233 (0.0041) [2024-06-24 23:53:13,391][15132] Fps is (10 sec: 44228.6, 60 sec: 42598.8, 300 sec: 42598.1). Total num frames: 12570411008. Throughput: 0: 42672.9. Samples: 12570521660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:13,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-24 23:53:15,934][15401] Updated weights for policy 0, policy_version 767243 (0.0032) [2024-06-24 23:53:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.8, 300 sec: 42653.9). Total num frames: 12570591232. Throughput: 0: 42626.5. Samples: 12570778900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-24 23:53:20,057][15401] Updated weights for policy 0, policy_version 767253 (0.0034) [2024-06-24 23:53:23,389][15132] Fps is (10 sec: 40967.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12570820608. Throughput: 0: 42535.1. Samples: 12570895760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-24 23:53:23,457][15401] Updated weights for policy 0, policy_version 767263 (0.0026) [2024-06-24 23:53:27,611][15401] Updated weights for policy 0, policy_version 767273 (0.0033) [2024-06-24 23:53:28,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12571049984. Throughput: 0: 42692.8. Samples: 12571161220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-24 23:53:31,215][15401] Updated weights for policy 0, policy_version 767283 (0.0038) [2024-06-24 23:53:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12571246592. Throughput: 0: 42616.2. Samples: 12571417780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:33,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-24 23:53:35,299][15401] Updated weights for policy 0, policy_version 767293 (0.0039) [2024-06-24 23:53:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 12571475968. Throughput: 0: 42523.5. Samples: 12571535460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-24 23:53:38,751][15401] Updated weights for policy 0, policy_version 767303 (0.0035) [2024-06-24 23:53:42,898][15401] Updated weights for policy 0, policy_version 767313 (0.0034) [2024-06-24 23:53:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12571688960. Throughput: 0: 42608.6. Samples: 12571800340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 23:53:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767315_12571688960.pth... [2024-06-24 23:53:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000766691_12561465344.pth [2024-06-24 23:53:46,417][15401] Updated weights for policy 0, policy_version 767323 (0.0049) [2024-06-24 23:53:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12571869184. Throughput: 0: 42531.6. Samples: 12572059020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-24 23:53:50,518][15401] Updated weights for policy 0, policy_version 767333 (0.0032) [2024-06-24 23:53:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 12572114944. Throughput: 0: 42613.8. Samples: 12572178380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-24 23:53:53,834][15401] Updated weights for policy 0, policy_version 767343 (0.0037) [2024-06-24 23:53:58,108][15401] Updated weights for policy 0, policy_version 767353 (0.0037) [2024-06-24 23:53:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12572327936. Throughput: 0: 42674.7. Samples: 12572441940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-24 23:53:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-24 23:54:01,475][15401] Updated weights for policy 0, policy_version 767363 (0.0035) [2024-06-24 23:54:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12572524544. Throughput: 0: 42681.5. Samples: 12572699560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-24 23:54:05,728][15401] Updated weights for policy 0, policy_version 767373 (0.0036) [2024-06-24 23:54:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 12572770304. Throughput: 0: 42819.5. Samples: 12572822640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:08,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-24 23:54:09,207][15401] Updated weights for policy 0, policy_version 767383 (0.0030) [2024-06-24 23:54:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42326.7, 300 sec: 42598.4). Total num frames: 12572950528. Throughput: 0: 42853.0. Samples: 12573089600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:13,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-24 23:54:13,439][15401] Updated weights for policy 0, policy_version 767393 (0.0036) [2024-06-24 23:54:17,264][15401] Updated weights for policy 0, policy_version 767403 (0.0043) [2024-06-24 23:54:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 12573163520. Throughput: 0: 42692.1. Samples: 12573338920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:18,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-24 23:54:21,050][15401] Updated weights for policy 0, policy_version 767413 (0.0042) [2024-06-24 23:54:23,008][15349] Signal inference workers to stop experience collection... (186050 times) [2024-06-24 23:54:23,050][15401] InferenceWorker_p0-w0: stopping experience collection (186050 times) [2024-06-24 23:54:23,064][15349] Signal inference workers to resume experience collection... (186050 times) [2024-06-24 23:54:23,073][15401] InferenceWorker_p0-w0: resuming experience collection (186050 times) [2024-06-24 23:54:23,391][15132] Fps is (10 sec: 45865.7, 60 sec: 43143.1, 300 sec: 42654.0). Total num frames: 12573409280. Throughput: 0: 42904.7. Samples: 12573466260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:23,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-24 23:54:24,736][15401] Updated weights for policy 0, policy_version 767423 (0.0031) [2024-06-24 23:54:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12573589504. Throughput: 0: 42996.1. Samples: 12573735160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-24 23:54:28,897][15401] Updated weights for policy 0, policy_version 767433 (0.0043) [2024-06-24 23:54:32,180][15401] Updated weights for policy 0, policy_version 767443 (0.0041) [2024-06-24 23:54:33,389][15132] Fps is (10 sec: 42606.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12573835264. Throughput: 0: 42794.2. Samples: 12573984760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-24 23:54:36,419][15401] Updated weights for policy 0, policy_version 767453 (0.0033) [2024-06-24 23:54:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12574048256. Throughput: 0: 43032.5. Samples: 12574114840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-24 23:54:39,782][15401] Updated weights for policy 0, policy_version 767463 (0.0029) [2024-06-24 23:54:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12574228480. Throughput: 0: 42871.9. Samples: 12574371180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:43,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 23:54:44,206][15401] Updated weights for policy 0, policy_version 767473 (0.0029) [2024-06-24 23:54:47,437][15401] Updated weights for policy 0, policy_version 767483 (0.0041) [2024-06-24 23:54:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12574457856. Throughput: 0: 42723.1. Samples: 12574622100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-24 23:54:51,980][15401] Updated weights for policy 0, policy_version 767493 (0.0033) [2024-06-24 23:54:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12574687232. Throughput: 0: 42875.2. Samples: 12574752020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:53,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-24 23:54:55,039][15401] Updated weights for policy 0, policy_version 767503 (0.0028) [2024-06-24 23:54:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12574867456. Throughput: 0: 42687.4. Samples: 12575010540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:54:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-24 23:54:59,523][15401] Updated weights for policy 0, policy_version 767513 (0.0028) [2024-06-24 23:55:02,643][15401] Updated weights for policy 0, policy_version 767523 (0.0042) [2024-06-24 23:55:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12575113216. Throughput: 0: 42677.1. Samples: 12575259400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:03,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-24 23:55:07,383][15401] Updated weights for policy 0, policy_version 767533 (0.0036) [2024-06-24 23:55:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42598.8). Total num frames: 12575293440. Throughput: 0: 42702.8. Samples: 12575387800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 23:55:10,345][15401] Updated weights for policy 0, policy_version 767543 (0.0036) [2024-06-24 23:55:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12575539200. Throughput: 0: 42559.1. Samples: 12575650320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-24 23:55:14,850][15401] Updated weights for policy 0, policy_version 767553 (0.0028) [2024-06-24 23:55:18,037][15401] Updated weights for policy 0, policy_version 767563 (0.0035) [2024-06-24 23:55:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12575752192. Throughput: 0: 42505.7. Samples: 12575897520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-24 23:55:22,490][15401] Updated weights for policy 0, policy_version 767573 (0.0044) [2024-06-24 23:55:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42053.7, 300 sec: 42598.4). Total num frames: 12575932416. Throughput: 0: 42617.9. Samples: 12576032640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-24 23:55:25,821][15401] Updated weights for policy 0, policy_version 767583 (0.0036) [2024-06-24 23:55:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12576161792. Throughput: 0: 42403.2. Samples: 12576279320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-24 23:55:30,365][15401] Updated weights for policy 0, policy_version 767593 (0.0033) [2024-06-24 23:55:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12576391168. Throughput: 0: 42541.4. Samples: 12576536460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-24 23:55:33,699][15401] Updated weights for policy 0, policy_version 767603 (0.0037) [2024-06-24 23:55:38,157][15401] Updated weights for policy 0, policy_version 767613 (0.0035) [2024-06-24 23:55:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 12576571392. Throughput: 0: 42516.0. Samples: 12576665240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 23:55:41,246][15401] Updated weights for policy 0, policy_version 767623 (0.0029) [2024-06-24 23:55:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12576800768. Throughput: 0: 42391.0. Samples: 12576918140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-24 23:55:43,391][15132] Avg episode reward: [(0, '0.466')] [2024-06-24 23:55:43,519][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767628_12576817152.pth... [2024-06-24 23:55:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767002_12566560768.pth [2024-06-24 23:55:45,626][15401] Updated weights for policy 0, policy_version 767633 (0.0046) [2024-06-24 23:55:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12577013760. Throughput: 0: 42639.3. Samples: 12577178160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:55:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-24 23:55:49,242][15401] Updated weights for policy 0, policy_version 767643 (0.0038) [2024-06-24 23:55:53,205][15401] Updated weights for policy 0, policy_version 767653 (0.0047) [2024-06-24 23:55:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12577226752. Throughput: 0: 42584.4. Samples: 12577304100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:55:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-24 23:55:57,279][15401] Updated weights for policy 0, policy_version 767663 (0.0032) [2024-06-24 23:55:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12577439744. Throughput: 0: 42354.6. Samples: 12577556280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:55:58,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-24 23:56:00,808][15401] Updated weights for policy 0, policy_version 767673 (0.0027) [2024-06-24 23:56:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 12577652736. Throughput: 0: 42567.1. Samples: 12577813040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:03,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 23:56:05,086][15401] Updated weights for policy 0, policy_version 767683 (0.0033) [2024-06-24 23:56:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12577865728. Throughput: 0: 42430.1. Samples: 12577942000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-24 23:56:08,463][15401] Updated weights for policy 0, policy_version 767693 (0.0030) [2024-06-24 23:56:09,782][15349] Signal inference workers to stop experience collection... (186100 times) [2024-06-24 23:56:09,782][15349] Signal inference workers to resume experience collection... (186100 times) [2024-06-24 23:56:09,809][15401] InferenceWorker_p0-w0: stopping experience collection (186100 times) [2024-06-24 23:56:09,809][15401] InferenceWorker_p0-w0: resuming experience collection (186100 times) [2024-06-24 23:56:12,908][15401] Updated weights for policy 0, policy_version 767703 (0.0040) [2024-06-24 23:56:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12578062336. Throughput: 0: 42522.3. Samples: 12578192820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-24 23:56:16,624][15401] Updated weights for policy 0, policy_version 767713 (0.0039) [2024-06-24 23:56:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12578275328. Throughput: 0: 42436.4. Samples: 12578446100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:18,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-24 23:56:20,784][15401] Updated weights for policy 0, policy_version 767723 (0.0035) [2024-06-24 23:56:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12578504704. Throughput: 0: 42391.1. Samples: 12578572840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:23,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-24 23:56:24,640][15401] Updated weights for policy 0, policy_version 767733 (0.0040) [2024-06-24 23:56:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12578684928. Throughput: 0: 42389.1. Samples: 12578825640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-24 23:56:28,479][15401] Updated weights for policy 0, policy_version 767743 (0.0036) [2024-06-24 23:56:32,226][15401] Updated weights for policy 0, policy_version 767753 (0.0030) [2024-06-24 23:56:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12578914304. Throughput: 0: 42297.1. Samples: 12579081540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-24 23:56:36,082][15401] Updated weights for policy 0, policy_version 767763 (0.0028) [2024-06-24 23:56:38,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 12579160064. Throughput: 0: 42292.9. Samples: 12579207280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-24 23:56:39,727][15401] Updated weights for policy 0, policy_version 767773 (0.0024) [2024-06-24 23:56:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12579340288. Throughput: 0: 42366.7. Samples: 12579462780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-24 23:56:43,620][15401] Updated weights for policy 0, policy_version 767783 (0.0034) [2024-06-24 23:56:47,548][15401] Updated weights for policy 0, policy_version 767793 (0.0025) [2024-06-24 23:56:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12579553280. Throughput: 0: 42248.4. Samples: 12579714220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-24 23:56:51,259][15401] Updated weights for policy 0, policy_version 767803 (0.0032) [2024-06-24 23:56:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12579766272. Throughput: 0: 42241.8. Samples: 12579842880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:53,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-24 23:56:55,325][15401] Updated weights for policy 0, policy_version 767813 (0.0037) [2024-06-24 23:56:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12579979264. Throughput: 0: 42327.1. Samples: 12580097540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:56:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-24 23:56:59,418][15401] Updated weights for policy 0, policy_version 767823 (0.0037) [2024-06-24 23:57:03,109][15401] Updated weights for policy 0, policy_version 767833 (0.0031) [2024-06-24 23:57:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12580192256. Throughput: 0: 42434.1. Samples: 12580355640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:03,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-24 23:57:07,098][15401] Updated weights for policy 0, policy_version 767843 (0.0033) [2024-06-24 23:57:08,393][15132] Fps is (10 sec: 40945.0, 60 sec: 42049.8, 300 sec: 42487.2). Total num frames: 12580388864. Throughput: 0: 42355.7. Samples: 12580479000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:08,394][15132] Avg episode reward: [(0, '0.206')] [2024-06-24 23:57:10,807][15401] Updated weights for policy 0, policy_version 767853 (0.0040) [2024-06-24 23:57:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42543.0). Total num frames: 12580618240. Throughput: 0: 42543.4. Samples: 12580740100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:13,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-24 23:57:14,699][15401] Updated weights for policy 0, policy_version 767863 (0.0036) [2024-06-24 23:57:18,392][15132] Fps is (10 sec: 42603.5, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 12580814848. Throughput: 0: 42611.6. Samples: 12580999160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:18,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-24 23:57:18,540][15401] Updated weights for policy 0, policy_version 767873 (0.0045) [2024-06-24 23:57:22,315][15401] Updated weights for policy 0, policy_version 767883 (0.0031) [2024-06-24 23:57:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 12581044224. Throughput: 0: 42484.4. Samples: 12581119180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:23,393][15132] Avg episode reward: [(0, '0.710')] [2024-06-24 23:57:26,127][15401] Updated weights for policy 0, policy_version 767893 (0.0036) [2024-06-24 23:57:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12581240832. Throughput: 0: 42425.4. Samples: 12581371920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-24 23:57:28,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-24 23:57:29,920][15401] Updated weights for policy 0, policy_version 767903 (0.0030) [2024-06-24 23:57:33,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12581453824. Throughput: 0: 42657.0. Samples: 12581633780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:33,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-24 23:57:33,681][15401] Updated weights for policy 0, policy_version 767913 (0.0032) [2024-06-24 23:57:34,291][15349] Signal inference workers to stop experience collection... (186150 times) [2024-06-24 23:57:34,291][15349] Signal inference workers to resume experience collection... (186150 times) [2024-06-24 23:57:34,312][15401] InferenceWorker_p0-w0: stopping experience collection (186150 times) [2024-06-24 23:57:34,312][15401] InferenceWorker_p0-w0: resuming experience collection (186150 times) [2024-06-24 23:57:37,508][15401] Updated weights for policy 0, policy_version 767923 (0.0034) [2024-06-24 23:57:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12581683200. Throughput: 0: 42561.7. Samples: 12581758160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:38,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-24 23:57:41,231][15401] Updated weights for policy 0, policy_version 767933 (0.0038) [2024-06-24 23:57:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12581896192. Throughput: 0: 42503.4. Samples: 12582010200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 23:57:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767938_12581896192.pth... [2024-06-24 23:57:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767315_12571688960.pth [2024-06-24 23:57:45,386][15401] Updated weights for policy 0, policy_version 767943 (0.0027) [2024-06-24 23:57:48,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 12582092800. Throughput: 0: 42562.7. Samples: 12582271060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:48,392][15132] Avg episode reward: [(0, '0.429')] [2024-06-24 23:57:48,901][15401] Updated weights for policy 0, policy_version 767953 (0.0047) [2024-06-24 23:57:53,030][15401] Updated weights for policy 0, policy_version 767963 (0.0027) [2024-06-24 23:57:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12582338560. Throughput: 0: 42598.9. Samples: 12582395800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:53,390][15132] Avg episode reward: [(0, '0.159')] [2024-06-24 23:57:57,010][15401] Updated weights for policy 0, policy_version 767973 (0.0038) [2024-06-24 23:57:58,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12582535168. Throughput: 0: 42484.6. Samples: 12582651900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:57:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-24 23:58:00,602][15401] Updated weights for policy 0, policy_version 767983 (0.0031) [2024-06-24 23:58:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12582748160. Throughput: 0: 42369.8. Samples: 12582905700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-24 23:58:04,410][15401] Updated weights for policy 0, policy_version 767993 (0.0035) [2024-06-24 23:58:08,301][15401] Updated weights for policy 0, policy_version 768003 (0.0034) [2024-06-24 23:58:08,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42874.1, 300 sec: 42543.1). Total num frames: 12582961152. Throughput: 0: 42525.4. Samples: 12583032720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-24 23:58:11,798][15401] Updated weights for policy 0, policy_version 768013 (0.0041) [2024-06-24 23:58:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12583190528. Throughput: 0: 42747.0. Samples: 12583295540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-24 23:58:16,104][15401] Updated weights for policy 0, policy_version 768023 (0.0034) [2024-06-24 23:58:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 12583403520. Throughput: 0: 42604.9. Samples: 12583551000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:18,396][15132] Avg episode reward: [(0, '0.543')] [2024-06-24 23:58:19,491][15401] Updated weights for policy 0, policy_version 768033 (0.0034) [2024-06-24 23:58:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 12583567360. Throughput: 0: 42726.2. Samples: 12583680840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-24 23:58:23,711][15401] Updated weights for policy 0, policy_version 768043 (0.0027) [2024-06-24 23:58:27,174][15401] Updated weights for policy 0, policy_version 768053 (0.0033) [2024-06-24 23:58:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12583829504. Throughput: 0: 42837.9. Samples: 12583937900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-24 23:58:31,423][15401] Updated weights for policy 0, policy_version 768063 (0.0034) [2024-06-24 23:58:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 12584026112. Throughput: 0: 42617.6. Samples: 12584188760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-24 23:58:34,753][15401] Updated weights for policy 0, policy_version 768073 (0.0046) [2024-06-24 23:58:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 12584206336. Throughput: 0: 42700.5. Samples: 12584317320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-24 23:58:39,186][15401] Updated weights for policy 0, policy_version 768083 (0.0026) [2024-06-24 23:58:42,468][15401] Updated weights for policy 0, policy_version 768093 (0.0041) [2024-06-24 23:58:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12584468480. Throughput: 0: 42620.7. Samples: 12584569840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-24 23:58:46,982][15401] Updated weights for policy 0, policy_version 768103 (0.0036) [2024-06-24 23:58:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 12584648704. Throughput: 0: 42625.4. Samples: 12584823840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-24 23:58:50,375][15401] Updated weights for policy 0, policy_version 768113 (0.0034) [2024-06-24 23:58:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12584861696. Throughput: 0: 42508.8. Samples: 12584945620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-24 23:58:54,621][15401] Updated weights for policy 0, policy_version 768123 (0.0028) [2024-06-24 23:58:54,952][15349] Signal inference workers to stop experience collection... (186200 times) [2024-06-24 23:58:54,960][15349] Signal inference workers to resume experience collection... (186200 times) [2024-06-24 23:58:54,978][15401] InferenceWorker_p0-w0: stopping experience collection (186200 times) [2024-06-24 23:58:54,978][15401] InferenceWorker_p0-w0: resuming experience collection (186200 times) [2024-06-24 23:58:58,045][15401] Updated weights for policy 0, policy_version 768133 (0.0044) [2024-06-24 23:58:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12585091072. Throughput: 0: 42498.2. Samples: 12585207960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:58:58,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 23:59:02,271][15401] Updated weights for policy 0, policy_version 768143 (0.0029) [2024-06-24 23:59:03,394][15132] Fps is (10 sec: 44215.3, 60 sec: 42594.9, 300 sec: 42486.6). Total num frames: 12585304064. Throughput: 0: 42405.0. Samples: 12585459440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:59:03,395][15132] Avg episode reward: [(0, '0.660')] [2024-06-24 23:59:05,887][15401] Updated weights for policy 0, policy_version 768153 (0.0027) [2024-06-24 23:59:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12585500672. Throughput: 0: 42401.9. Samples: 12585588920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:59:08,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-24 23:59:09,828][15401] Updated weights for policy 0, policy_version 768163 (0.0037) [2024-06-24 23:59:13,389][15132] Fps is (10 sec: 42619.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12585730048. Throughput: 0: 42596.0. Samples: 12585854720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-24 23:59:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-24 23:59:13,501][15401] Updated weights for policy 0, policy_version 768173 (0.0034) [2024-06-24 23:59:17,419][15401] Updated weights for policy 0, policy_version 768183 (0.0027) [2024-06-24 23:59:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 12585943040. Throughput: 0: 42552.2. Samples: 12586103600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-24 23:59:21,366][15401] Updated weights for policy 0, policy_version 768193 (0.0034) [2024-06-24 23:59:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 12586156032. Throughput: 0: 42705.9. Samples: 12586239080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-24 23:59:25,046][15401] Updated weights for policy 0, policy_version 768203 (0.0045) [2024-06-24 23:59:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 12586352640. Throughput: 0: 42664.4. Samples: 12586489740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-24 23:59:29,039][15401] Updated weights for policy 0, policy_version 768213 (0.0036) [2024-06-24 23:59:33,162][15401] Updated weights for policy 0, policy_version 768223 (0.0028) [2024-06-24 23:59:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 12586582016. Throughput: 0: 42760.9. Samples: 12586748080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:33,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-24 23:59:36,715][15401] Updated weights for policy 0, policy_version 768233 (0.0038) [2024-06-24 23:59:38,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 12586795008. Throughput: 0: 42922.3. Samples: 12586877220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:38,392][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 23:59:40,587][15401] Updated weights for policy 0, policy_version 768243 (0.0030) [2024-06-24 23:59:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 12587008000. Throughput: 0: 42898.2. Samples: 12587138380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:43,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-24 23:59:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768250_12587008000.pth... [2024-06-24 23:59:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767628_12576817152.pth [2024-06-24 23:59:44,122][15401] Updated weights for policy 0, policy_version 768253 (0.0036) [2024-06-24 23:59:48,061][15401] Updated weights for policy 0, policy_version 768263 (0.0041) [2024-06-24 23:59:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 12587220992. Throughput: 0: 42988.8. Samples: 12587393720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:48,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-24 23:59:52,106][15401] Updated weights for policy 0, policy_version 768273 (0.0035) [2024-06-24 23:59:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12587450368. Throughput: 0: 42946.1. Samples: 12587521500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-24 23:59:55,818][15401] Updated weights for policy 0, policy_version 768283 (0.0023) [2024-06-24 23:59:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 12587663360. Throughput: 0: 42790.7. Samples: 12587780300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-24 23:59:58,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-24 23:59:59,731][15401] Updated weights for policy 0, policy_version 768293 (0.0025) [2024-06-25 00:00:03,256][15401] Updated weights for policy 0, policy_version 768303 (0.0046) [2024-06-25 00:00:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42874.9, 300 sec: 42653.9). Total num frames: 12587876352. Throughput: 0: 42889.2. Samples: 12588033620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 00:00:07,329][15401] Updated weights for policy 0, policy_version 768313 (0.0033) [2024-06-25 00:00:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 12588089344. Throughput: 0: 42666.2. Samples: 12588159060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:08,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 00:00:08,543][15349] Signal inference workers to stop experience collection... (186250 times) [2024-06-25 00:00:08,544][15349] Signal inference workers to resume experience collection... (186250 times) [2024-06-25 00:00:08,565][15401] InferenceWorker_p0-w0: stopping experience collection (186250 times) [2024-06-25 00:00:08,565][15401] InferenceWorker_p0-w0: resuming experience collection (186250 times) [2024-06-25 00:00:10,867][15401] Updated weights for policy 0, policy_version 768323 (0.0032) [2024-06-25 00:00:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12588285952. Throughput: 0: 42753.3. Samples: 12588413640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:00:14,901][15401] Updated weights for policy 0, policy_version 768333 (0.0037) [2024-06-25 00:00:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12588498944. Throughput: 0: 42616.0. Samples: 12588665800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:18,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 00:00:18,720][15401] Updated weights for policy 0, policy_version 768343 (0.0037) [2024-06-25 00:00:22,562][15401] Updated weights for policy 0, policy_version 768353 (0.0048) [2024-06-25 00:00:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12588728320. Throughput: 0: 42700.9. Samples: 12588798660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:23,394][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 00:00:26,155][15401] Updated weights for policy 0, policy_version 768363 (0.0028) [2024-06-25 00:00:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 12588908544. Throughput: 0: 42456.9. Samples: 12589048940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 00:00:30,219][15401] Updated weights for policy 0, policy_version 768373 (0.0033) [2024-06-25 00:00:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12589154304. Throughput: 0: 42590.6. Samples: 12589310300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:33,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 00:00:33,684][15401] Updated weights for policy 0, policy_version 768383 (0.0040) [2024-06-25 00:00:37,826][15401] Updated weights for policy 0, policy_version 768393 (0.0039) [2024-06-25 00:00:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 12589350912. Throughput: 0: 42631.1. Samples: 12589439900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 00:00:41,702][15401] Updated weights for policy 0, policy_version 768403 (0.0029) [2024-06-25 00:00:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12589563904. Throughput: 0: 42533.7. Samples: 12589694320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 00:00:45,353][15401] Updated weights for policy 0, policy_version 768413 (0.0046) [2024-06-25 00:00:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12589793280. Throughput: 0: 42590.2. Samples: 12589950180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 00:00:49,462][15401] Updated weights for policy 0, policy_version 768423 (0.0035) [2024-06-25 00:00:53,017][15401] Updated weights for policy 0, policy_version 768433 (0.0045) [2024-06-25 00:00:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12590006272. Throughput: 0: 42800.4. Samples: 12590085080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 00:00:56,985][15401] Updated weights for policy 0, policy_version 768443 (0.0038) [2024-06-25 00:00:58,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12590219264. Throughput: 0: 42776.1. Samples: 12590338560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:00:58,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 00:01:01,031][15401] Updated weights for policy 0, policy_version 768453 (0.0030) [2024-06-25 00:01:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12590448640. Throughput: 0: 42788.8. Samples: 12590591300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:03,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 00:01:04,405][15401] Updated weights for policy 0, policy_version 768463 (0.0043) [2024-06-25 00:01:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12590628864. Throughput: 0: 42828.4. Samples: 12590725940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:08,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 00:01:08,741][15401] Updated weights for policy 0, policy_version 768473 (0.0036) [2024-06-25 00:01:12,300][15401] Updated weights for policy 0, policy_version 768483 (0.0032) [2024-06-25 00:01:13,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 12590858240. Throughput: 0: 42936.0. Samples: 12590981160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:13,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 00:01:16,376][15401] Updated weights for policy 0, policy_version 768493 (0.0034) [2024-06-25 00:01:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 12591087616. Throughput: 0: 42565.5. Samples: 12591225740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 00:01:19,949][15401] Updated weights for policy 0, policy_version 768503 (0.0033) [2024-06-25 00:01:23,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12591267840. Throughput: 0: 42672.8. Samples: 12591360180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 00:01:24,547][15401] Updated weights for policy 0, policy_version 768513 (0.0035) [2024-06-25 00:01:27,525][15401] Updated weights for policy 0, policy_version 768523 (0.0043) [2024-06-25 00:01:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12591497216. Throughput: 0: 42584.8. Samples: 12591610640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 00:01:32,052][15401] Updated weights for policy 0, policy_version 768533 (0.0028) [2024-06-25 00:01:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12591710208. Throughput: 0: 42601.9. Samples: 12591867260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 00:01:33,964][15349] Signal inference workers to stop experience collection... (186300 times) [2024-06-25 00:01:33,964][15349] Signal inference workers to resume experience collection... (186300 times) [2024-06-25 00:01:34,008][15401] InferenceWorker_p0-w0: stopping experience collection (186300 times) [2024-06-25 00:01:34,008][15401] InferenceWorker_p0-w0: resuming experience collection (186300 times) [2024-06-25 00:01:35,131][15401] Updated weights for policy 0, policy_version 768543 (0.0028) [2024-06-25 00:01:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12591890432. Throughput: 0: 42509.3. Samples: 12591998000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 00:01:39,502][15401] Updated weights for policy 0, policy_version 768553 (0.0029) [2024-06-25 00:01:42,872][15401] Updated weights for policy 0, policy_version 768563 (0.0034) [2024-06-25 00:01:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12592152576. Throughput: 0: 42558.1. Samples: 12592253680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 00:01:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768564_12592152576.pth... [2024-06-25 00:01:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000767938_12581896192.pth [2024-06-25 00:01:47,345][15401] Updated weights for policy 0, policy_version 768573 (0.0040) [2024-06-25 00:01:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12592349184. Throughput: 0: 42543.7. Samples: 12592505760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 00:01:50,633][15401] Updated weights for policy 0, policy_version 768583 (0.0032) [2024-06-25 00:01:53,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12592545792. Throughput: 0: 42413.0. Samples: 12592634520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 00:01:54,752][15401] Updated weights for policy 0, policy_version 768593 (0.0035) [2024-06-25 00:01:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12592775168. Throughput: 0: 42551.3. Samples: 12592895860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:01:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 00:01:58,400][15401] Updated weights for policy 0, policy_version 768603 (0.0046) [2024-06-25 00:02:02,346][15401] Updated weights for policy 0, policy_version 768613 (0.0044) [2024-06-25 00:02:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42710.0). Total num frames: 12592988160. Throughput: 0: 42801.3. Samples: 12593151800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 00:02:06,141][15401] Updated weights for policy 0, policy_version 768623 (0.0046) [2024-06-25 00:02:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12593184768. Throughput: 0: 42638.7. Samples: 12593278920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 00:02:09,926][15401] Updated weights for policy 0, policy_version 768633 (0.0031) [2024-06-25 00:02:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42600.0, 300 sec: 42709.8). Total num frames: 12593414144. Throughput: 0: 42881.2. Samples: 12593540300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 00:02:13,605][15401] Updated weights for policy 0, policy_version 768643 (0.0041) [2024-06-25 00:02:17,887][15401] Updated weights for policy 0, policy_version 768653 (0.0030) [2024-06-25 00:02:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12593643520. Throughput: 0: 42930.6. Samples: 12593799140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:18,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 00:02:21,633][15401] Updated weights for policy 0, policy_version 768663 (0.0038) [2024-06-25 00:02:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12593840128. Throughput: 0: 42812.9. Samples: 12593924580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:23,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 00:02:25,452][15401] Updated weights for policy 0, policy_version 768673 (0.0033) [2024-06-25 00:02:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12594069504. Throughput: 0: 42839.8. Samples: 12594181460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 00:02:29,058][15401] Updated weights for policy 0, policy_version 768683 (0.0032) [2024-06-25 00:02:32,950][15401] Updated weights for policy 0, policy_version 768693 (0.0035) [2024-06-25 00:02:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12594282496. Throughput: 0: 42823.4. Samples: 12594432820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 00:02:36,794][15401] Updated weights for policy 0, policy_version 768703 (0.0027) [2024-06-25 00:02:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12594479104. Throughput: 0: 42963.0. Samples: 12594567860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 00:02:40,417][15401] Updated weights for policy 0, policy_version 768713 (0.0031) [2024-06-25 00:02:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 12594708480. Throughput: 0: 42899.9. Samples: 12594826360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 00:02:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 00:02:44,269][15401] Updated weights for policy 0, policy_version 768723 (0.0027) [2024-06-25 00:02:47,995][15401] Updated weights for policy 0, policy_version 768733 (0.0036) [2024-06-25 00:02:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12594937856. Throughput: 0: 42930.7. Samples: 12595083680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:02:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 00:02:51,919][15401] Updated weights for policy 0, policy_version 768743 (0.0024) [2024-06-25 00:02:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12595118080. Throughput: 0: 43092.2. Samples: 12595218060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:02:53,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 00:02:55,803][15401] Updated weights for policy 0, policy_version 768753 (0.0030) [2024-06-25 00:02:55,896][15349] Signal inference workers to stop experience collection... (186350 times) [2024-06-25 00:02:55,934][15401] InferenceWorker_p0-w0: stopping experience collection (186350 times) [2024-06-25 00:02:55,960][15349] Signal inference workers to resume experience collection... (186350 times) [2024-06-25 00:02:55,964][15401] InferenceWorker_p0-w0: resuming experience collection (186350 times) [2024-06-25 00:02:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12595363840. Throughput: 0: 42959.2. Samples: 12595473460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:02:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 00:02:59,659][15401] Updated weights for policy 0, policy_version 768763 (0.0028) [2024-06-25 00:03:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12595560448. Throughput: 0: 43005.4. Samples: 12595734380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:03,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 00:03:03,466][15401] Updated weights for policy 0, policy_version 768773 (0.0037) [2024-06-25 00:03:07,818][15401] Updated weights for policy 0, policy_version 768783 (0.0026) [2024-06-25 00:03:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12595773440. Throughput: 0: 42975.9. Samples: 12595858500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:08,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-25 00:03:11,075][15401] Updated weights for policy 0, policy_version 768793 (0.0028) [2024-06-25 00:03:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.8, 300 sec: 42709.5). Total num frames: 12596002816. Throughput: 0: 42969.8. Samples: 12596115100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 00:03:15,334][15401] Updated weights for policy 0, policy_version 768803 (0.0042) [2024-06-25 00:03:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 12596215808. Throughput: 0: 43156.0. Samples: 12596374940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:18,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 00:03:18,651][15401] Updated weights for policy 0, policy_version 768813 (0.0048) [2024-06-25 00:03:22,895][15401] Updated weights for policy 0, policy_version 768823 (0.0029) [2024-06-25 00:03:23,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12596428800. Throughput: 0: 42859.6. Samples: 12596496540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 00:03:26,634][15401] Updated weights for policy 0, policy_version 768833 (0.0038) [2024-06-25 00:03:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 12596641792. Throughput: 0: 42862.3. Samples: 12596755160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 00:03:30,392][15401] Updated weights for policy 0, policy_version 768843 (0.0031) [2024-06-25 00:03:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12596838400. Throughput: 0: 42858.6. Samples: 12597012320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:33,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 00:03:34,059][15401] Updated weights for policy 0, policy_version 768853 (0.0036) [2024-06-25 00:03:37,956][15401] Updated weights for policy 0, policy_version 768863 (0.0032) [2024-06-25 00:03:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12597067776. Throughput: 0: 42694.7. Samples: 12597139320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 00:03:41,552][15401] Updated weights for policy 0, policy_version 768873 (0.0031) [2024-06-25 00:03:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12597280768. Throughput: 0: 42784.0. Samples: 12597398740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 00:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768877_12597280768.pth... [2024-06-25 00:03:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768250_12587008000.pth [2024-06-25 00:03:45,584][15401] Updated weights for policy 0, policy_version 768883 (0.0039) [2024-06-25 00:03:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12597493760. Throughput: 0: 42567.4. Samples: 12597649920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:48,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 00:03:49,414][15401] Updated weights for policy 0, policy_version 768893 (0.0038) [2024-06-25 00:03:53,285][15401] Updated weights for policy 0, policy_version 768903 (0.0033) [2024-06-25 00:03:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12597706752. Throughput: 0: 42757.4. Samples: 12597782580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 00:03:57,144][15401] Updated weights for policy 0, policy_version 768913 (0.0046) [2024-06-25 00:03:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.7). Total num frames: 12597919744. Throughput: 0: 42678.9. Samples: 12598035660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:03:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 00:04:01,210][15401] Updated weights for policy 0, policy_version 768923 (0.0029) [2024-06-25 00:04:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12598149120. Throughput: 0: 42615.7. Samples: 12598292540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 00:04:04,895][15401] Updated weights for policy 0, policy_version 768933 (0.0034) [2024-06-25 00:04:08,394][15132] Fps is (10 sec: 40941.5, 60 sec: 42595.2, 300 sec: 42708.8). Total num frames: 12598329344. Throughput: 0: 42808.1. Samples: 12598423100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:08,395][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 00:04:08,698][15401] Updated weights for policy 0, policy_version 768943 (0.0035) [2024-06-25 00:04:12,379][15401] Updated weights for policy 0, policy_version 768953 (0.0027) [2024-06-25 00:04:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12598575104. Throughput: 0: 42973.7. Samples: 12598688980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 00:04:16,298][15401] Updated weights for policy 0, policy_version 768963 (0.0031) [2024-06-25 00:04:18,389][15132] Fps is (10 sec: 47535.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12598804480. Throughput: 0: 42900.5. Samples: 12598942840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:18,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 00:04:19,972][15401] Updated weights for policy 0, policy_version 768973 (0.0045) [2024-06-25 00:04:20,296][15349] Signal inference workers to stop experience collection... (186400 times) [2024-06-25 00:04:20,297][15349] Signal inference workers to resume experience collection... (186400 times) [2024-06-25 00:04:20,344][15401] InferenceWorker_p0-w0: stopping experience collection (186400 times) [2024-06-25 00:04:20,344][15401] InferenceWorker_p0-w0: resuming experience collection (186400 times) [2024-06-25 00:04:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12598968320. Throughput: 0: 42948.3. Samples: 12599072000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:23,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 00:04:23,779][15401] Updated weights for policy 0, policy_version 768983 (0.0032) [2024-06-25 00:04:27,823][15401] Updated weights for policy 0, policy_version 768993 (0.0033) [2024-06-25 00:04:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12599214080. Throughput: 0: 42896.2. Samples: 12599329060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 00:04:28,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 00:04:31,808][15401] Updated weights for policy 0, policy_version 769003 (0.0035) [2024-06-25 00:04:33,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.7, 300 sec: 42876.5). Total num frames: 12599443456. Throughput: 0: 43025.5. Samples: 12599586060. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 00:04:35,313][15401] Updated weights for policy 0, policy_version 769013 (0.0029) [2024-06-25 00:04:38,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12599623680. Throughput: 0: 42931.5. Samples: 12599714500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 00:04:39,563][15401] Updated weights for policy 0, policy_version 769023 (0.0028) [2024-06-25 00:04:42,740][15401] Updated weights for policy 0, policy_version 769033 (0.0036) [2024-06-25 00:04:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 12599885824. Throughput: 0: 43123.2. Samples: 12599976200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 00:04:47,367][15401] Updated weights for policy 0, policy_version 769043 (0.0035) [2024-06-25 00:04:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12600082432. Throughput: 0: 43099.0. Samples: 12600232000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 00:04:50,432][15401] Updated weights for policy 0, policy_version 769053 (0.0030) [2024-06-25 00:04:53,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12600246272. Throughput: 0: 42903.1. Samples: 12600353540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:53,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 00:04:55,013][15401] Updated weights for policy 0, policy_version 769063 (0.0042) [2024-06-25 00:04:57,882][15401] Updated weights for policy 0, policy_version 769073 (0.0034) [2024-06-25 00:04:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12600508416. Throughput: 0: 42753.8. Samples: 12600612900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:04:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 00:05:02,555][15401] Updated weights for policy 0, policy_version 769083 (0.0033) [2024-06-25 00:05:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12600705024. Throughput: 0: 42951.0. Samples: 12600875640. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 00:05:05,342][15401] Updated weights for policy 0, policy_version 769093 (0.0033) [2024-06-25 00:05:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42874.7, 300 sec: 42765.0). Total num frames: 12600901632. Throughput: 0: 42603.2. Samples: 12600989140. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 00:05:10,222][15401] Updated weights for policy 0, policy_version 769103 (0.0023) [2024-06-25 00:05:12,788][15349] Signal inference workers to stop experience collection... (186450 times) [2024-06-25 00:05:12,789][15349] Signal inference workers to resume experience collection... (186450 times) [2024-06-25 00:05:12,813][15401] InferenceWorker_p0-w0: stopping experience collection (186450 times) [2024-06-25 00:05:12,813][15401] InferenceWorker_p0-w0: resuming experience collection (186450 times) [2024-06-25 00:05:12,932][15401] Updated weights for policy 0, policy_version 769113 (0.0036) [2024-06-25 00:05:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12601147392. Throughput: 0: 42798.6. Samples: 12601255000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 00:05:17,971][15401] Updated weights for policy 0, policy_version 769123 (0.0027) [2024-06-25 00:05:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12601344000. Throughput: 0: 42988.0. Samples: 12601520520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 00:05:20,571][15401] Updated weights for policy 0, policy_version 769133 (0.0032) [2024-06-25 00:05:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12601573376. Throughput: 0: 42873.3. Samples: 12601643800. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 00:05:25,454][15401] Updated weights for policy 0, policy_version 769143 (0.0031) [2024-06-25 00:05:28,125][15401] Updated weights for policy 0, policy_version 769153 (0.0038) [2024-06-25 00:05:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12601802752. Throughput: 0: 42841.2. Samples: 12601904060. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:28,391][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 00:05:32,926][15401] Updated weights for policy 0, policy_version 769163 (0.0029) [2024-06-25 00:05:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 12601982976. Throughput: 0: 42983.5. Samples: 12602166260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 00:05:36,334][15401] Updated weights for policy 0, policy_version 769173 (0.0024) [2024-06-25 00:05:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12602228736. Throughput: 0: 43032.0. Samples: 12602289980. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 00:05:40,633][15401] Updated weights for policy 0, policy_version 769183 (0.0036) [2024-06-25 00:05:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12602425344. Throughput: 0: 42935.4. Samples: 12602545000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 00:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769191_12602425344.pth... [2024-06-25 00:05:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768564_12592152576.pth [2024-06-25 00:05:43,950][15401] Updated weights for policy 0, policy_version 769193 (0.0028) [2024-06-25 00:05:48,197][15401] Updated weights for policy 0, policy_version 769203 (0.0043) [2024-06-25 00:05:48,390][15132] Fps is (10 sec: 39320.3, 60 sec: 42325.1, 300 sec: 42765.0). Total num frames: 12602621952. Throughput: 0: 42820.6. Samples: 12602802580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 00:05:51,671][15401] Updated weights for policy 0, policy_version 769213 (0.0029) [2024-06-25 00:05:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 12602867712. Throughput: 0: 43083.5. Samples: 12602927900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 00:05:55,882][15401] Updated weights for policy 0, policy_version 769223 (0.0041) [2024-06-25 00:05:58,390][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12603064320. Throughput: 0: 42804.4. Samples: 12603181200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:05:58,394][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 00:05:59,276][15401] Updated weights for policy 0, policy_version 769233 (0.0027) [2024-06-25 00:06:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12603260928. Throughput: 0: 42671.4. Samples: 12603440740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:06:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 00:06:03,646][15401] Updated weights for policy 0, policy_version 769243 (0.0032) [2024-06-25 00:06:07,193][15401] Updated weights for policy 0, policy_version 769253 (0.0032) [2024-06-25 00:06:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 12603473920. Throughput: 0: 42783.1. Samples: 12603569040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:06:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 00:06:11,203][15401] Updated weights for policy 0, policy_version 769263 (0.0029) [2024-06-25 00:06:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12603686912. Throughput: 0: 42646.8. Samples: 12603823160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-25 00:06:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 00:06:14,915][15401] Updated weights for policy 0, policy_version 769273 (0.0035) [2024-06-25 00:06:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12603916288. Throughput: 0: 42574.2. Samples: 12604082100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 00:06:18,598][15401] Updated weights for policy 0, policy_version 769283 (0.0036) [2024-06-25 00:06:22,642][15401] Updated weights for policy 0, policy_version 769293 (0.0042) [2024-06-25 00:06:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12604129280. Throughput: 0: 42726.2. Samples: 12604212660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 00:06:26,205][15401] Updated weights for policy 0, policy_version 769303 (0.0027) [2024-06-25 00:06:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 12604342272. Throughput: 0: 42677.4. Samples: 12604465480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:28,390][15132] Avg episode reward: [(0, '0.215')] [2024-06-25 00:06:30,146][15401] Updated weights for policy 0, policy_version 769313 (0.0028) [2024-06-25 00:06:32,503][15349] Signal inference workers to stop experience collection... (186500 times) [2024-06-25 00:06:32,503][15349] Signal inference workers to resume experience collection... (186500 times) [2024-06-25 00:06:32,540][15401] InferenceWorker_p0-w0: stopping experience collection (186500 times) [2024-06-25 00:06:32,540][15401] InferenceWorker_p0-w0: resuming experience collection (186500 times) [2024-06-25 00:06:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12604555264. Throughput: 0: 42662.4. Samples: 12604722380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 00:06:33,854][15401] Updated weights for policy 0, policy_version 769323 (0.0031) [2024-06-25 00:06:37,850][15401] Updated weights for policy 0, policy_version 769333 (0.0033) [2024-06-25 00:06:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 12604768256. Throughput: 0: 42791.9. Samples: 12604853640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:38,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 00:06:41,448][15401] Updated weights for policy 0, policy_version 769343 (0.0043) [2024-06-25 00:06:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12604997632. Throughput: 0: 42824.0. Samples: 12605108280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 00:06:45,963][15401] Updated weights for policy 0, policy_version 769353 (0.0033) [2024-06-25 00:06:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 12605194240. Throughput: 0: 42630.7. Samples: 12605359120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 00:06:49,102][15401] Updated weights for policy 0, policy_version 769363 (0.0036) [2024-06-25 00:06:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12605390848. Throughput: 0: 42554.3. Samples: 12605483980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 00:06:53,502][15401] Updated weights for policy 0, policy_version 769373 (0.0029) [2024-06-25 00:06:56,828][15401] Updated weights for policy 0, policy_version 769383 (0.0032) [2024-06-25 00:06:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12605636608. Throughput: 0: 42705.7. Samples: 12605744920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:06:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 00:07:01,483][15401] Updated weights for policy 0, policy_version 769393 (0.0036) [2024-06-25 00:07:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12605833216. Throughput: 0: 42697.4. Samples: 12606003480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 00:07:04,593][15401] Updated weights for policy 0, policy_version 769403 (0.0033) [2024-06-25 00:07:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 12606029824. Throughput: 0: 42579.6. Samples: 12606128740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 00:07:09,047][15401] Updated weights for policy 0, policy_version 769413 (0.0037) [2024-06-25 00:07:12,141][15401] Updated weights for policy 0, policy_version 769423 (0.0037) [2024-06-25 00:07:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 12606275584. Throughput: 0: 42603.1. Samples: 12606382720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:13,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 00:07:16,854][15401] Updated weights for policy 0, policy_version 769433 (0.0033) [2024-06-25 00:07:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12606472192. Throughput: 0: 42649.8. Samples: 12606641620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 00:07:19,708][15401] Updated weights for policy 0, policy_version 769443 (0.0022) [2024-06-25 00:07:23,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 12606668800. Throughput: 0: 42441.0. Samples: 12606763480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:23,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 00:07:24,819][15401] Updated weights for policy 0, policy_version 769453 (0.0034) [2024-06-25 00:07:27,547][15401] Updated weights for policy 0, policy_version 769463 (0.0041) [2024-06-25 00:07:28,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12606914560. Throughput: 0: 42358.7. Samples: 12607014520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:28,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 00:07:32,639][15401] Updated weights for policy 0, policy_version 769473 (0.0034) [2024-06-25 00:07:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12607111168. Throughput: 0: 42589.0. Samples: 12607275620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 00:07:35,400][15401] Updated weights for policy 0, policy_version 769483 (0.0027) [2024-06-25 00:07:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12607324160. Throughput: 0: 42479.9. Samples: 12607395580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 00:07:40,044][15401] Updated weights for policy 0, policy_version 769493 (0.0037) [2024-06-25 00:07:43,009][15401] Updated weights for policy 0, policy_version 769503 (0.0035) [2024-06-25 00:07:43,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12607569920. Throughput: 0: 42592.5. Samples: 12607661580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 00:07:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769505_12607569920.pth... [2024-06-25 00:07:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000768877_12597280768.pth [2024-06-25 00:07:47,381][15401] Updated weights for policy 0, policy_version 769513 (0.0036) [2024-06-25 00:07:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12607750144. Throughput: 0: 42636.5. Samples: 12607922120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 00:07:50,601][15401] Updated weights for policy 0, policy_version 769523 (0.0030) [2024-06-25 00:07:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12607963136. Throughput: 0: 42503.4. Samples: 12608041400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:53,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 00:07:55,326][15401] Updated weights for policy 0, policy_version 769533 (0.0038) [2024-06-25 00:07:56,315][15349] Signal inference workers to stop experience collection... (186550 times) [2024-06-25 00:07:56,316][15349] Signal inference workers to resume experience collection... (186550 times) [2024-06-25 00:07:56,332][15401] InferenceWorker_p0-w0: stopping experience collection (186550 times) [2024-06-25 00:07:56,333][15401] InferenceWorker_p0-w0: resuming experience collection (186550 times) [2024-06-25 00:07:58,041][15401] Updated weights for policy 0, policy_version 769543 (0.0025) [2024-06-25 00:07:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12608208896. Throughput: 0: 42729.0. Samples: 12608305420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 00:07:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 00:08:02,934][15401] Updated weights for policy 0, policy_version 769553 (0.0028) [2024-06-25 00:08:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12608372736. Throughput: 0: 42816.5. Samples: 12608568360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:03,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 00:08:05,743][15401] Updated weights for policy 0, policy_version 769563 (0.0034) [2024-06-25 00:08:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12608602112. Throughput: 0: 42728.9. Samples: 12608686180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:08,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 00:08:10,296][15401] Updated weights for policy 0, policy_version 769573 (0.0040) [2024-06-25 00:08:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42765.4). Total num frames: 12608831488. Throughput: 0: 42924.0. Samples: 12608946000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 00:08:13,444][15401] Updated weights for policy 0, policy_version 769583 (0.0046) [2024-06-25 00:08:17,763][15401] Updated weights for policy 0, policy_version 769593 (0.0021) [2024-06-25 00:08:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12609011712. Throughput: 0: 42931.5. Samples: 12609207540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 00:08:21,074][15401] Updated weights for policy 0, policy_version 769603 (0.0022) [2024-06-25 00:08:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 12609257472. Throughput: 0: 42881.8. Samples: 12609325260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 00:08:25,217][15401] Updated weights for policy 0, policy_version 769613 (0.0032) [2024-06-25 00:08:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 12609470464. Throughput: 0: 42852.4. Samples: 12609589940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:28,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 00:08:28,686][15401] Updated weights for policy 0, policy_version 769623 (0.0041) [2024-06-25 00:08:33,157][15401] Updated weights for policy 0, policy_version 769633 (0.0029) [2024-06-25 00:08:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12609667072. Throughput: 0: 42804.3. Samples: 12609848320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 00:08:36,387][15401] Updated weights for policy 0, policy_version 769643 (0.0035) [2024-06-25 00:08:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12609912832. Throughput: 0: 42936.4. Samples: 12609973540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 00:08:40,641][15401] Updated weights for policy 0, policy_version 769653 (0.0029) [2024-06-25 00:08:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12610109440. Throughput: 0: 42776.8. Samples: 12610230380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 00:08:44,012][15401] Updated weights for policy 0, policy_version 769663 (0.0024) [2024-06-25 00:08:48,118][15401] Updated weights for policy 0, policy_version 769673 (0.0023) [2024-06-25 00:08:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12610322432. Throughput: 0: 42624.9. Samples: 12610486480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 00:08:51,595][15401] Updated weights for policy 0, policy_version 769683 (0.0035) [2024-06-25 00:08:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12610535424. Throughput: 0: 42870.2. Samples: 12610615340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 00:08:55,926][15401] Updated weights for policy 0, policy_version 769693 (0.0036) [2024-06-25 00:08:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12610732032. Throughput: 0: 42848.0. Samples: 12610874160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:08:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 00:08:59,394][15401] Updated weights for policy 0, policy_version 769703 (0.0028) [2024-06-25 00:09:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42821.2). Total num frames: 12610961408. Throughput: 0: 42683.1. Samples: 12611128280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 00:09:03,476][15401] Updated weights for policy 0, policy_version 769713 (0.0031) [2024-06-25 00:09:06,930][15401] Updated weights for policy 0, policy_version 769723 (0.0038) [2024-06-25 00:09:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12611174400. Throughput: 0: 42919.6. Samples: 12611256640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 00:09:10,907][15401] Updated weights for policy 0, policy_version 769733 (0.0026) [2024-06-25 00:09:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12611371008. Throughput: 0: 42784.5. Samples: 12611515240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 00:09:15,015][15349] Signal inference workers to stop experience collection... (186600 times) [2024-06-25 00:09:15,069][15401] InferenceWorker_p0-w0: stopping experience collection (186600 times) [2024-06-25 00:09:15,077][15349] Signal inference workers to resume experience collection... (186600 times) [2024-06-25 00:09:15,084][15401] InferenceWorker_p0-w0: resuming experience collection (186600 times) [2024-06-25 00:09:15,087][15401] Updated weights for policy 0, policy_version 769743 (0.0023) [2024-06-25 00:09:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12611616768. Throughput: 0: 42584.5. Samples: 12611764620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 00:09:18,692][15401] Updated weights for policy 0, policy_version 769753 (0.0040) [2024-06-25 00:09:22,595][15401] Updated weights for policy 0, policy_version 769763 (0.0029) [2024-06-25 00:09:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12611813376. Throughput: 0: 42736.5. Samples: 12611896680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:23,392][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 00:09:26,191][15401] Updated weights for policy 0, policy_version 769773 (0.0026) [2024-06-25 00:09:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12612009984. Throughput: 0: 42699.6. Samples: 12612151860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 00:09:30,086][15401] Updated weights for policy 0, policy_version 769783 (0.0024) [2024-06-25 00:09:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12612255744. Throughput: 0: 42562.1. Samples: 12612401780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 00:09:34,175][15401] Updated weights for policy 0, policy_version 769793 (0.0037) [2024-06-25 00:09:37,892][15401] Updated weights for policy 0, policy_version 769803 (0.0040) [2024-06-25 00:09:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12612452352. Throughput: 0: 42579.6. Samples: 12612531420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 00:09:42,631][15401] Updated weights for policy 0, policy_version 769813 (0.0028) [2024-06-25 00:09:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12612648960. Throughput: 0: 42404.5. Samples: 12612782360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:09:43,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 00:09:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769816_12612665344.pth... [2024-06-25 00:09:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769191_12602425344.pth [2024-06-25 00:09:46,152][15401] Updated weights for policy 0, policy_version 769823 (0.0052) [2024-06-25 00:09:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12612861952. Throughput: 0: 42392.4. Samples: 12613035940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:09:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 00:09:50,459][15401] Updated weights for policy 0, policy_version 769833 (0.0033) [2024-06-25 00:09:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12613074944. Throughput: 0: 42328.4. Samples: 12613161420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:09:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 00:09:53,810][15401] Updated weights for policy 0, policy_version 769843 (0.0034) [2024-06-25 00:09:57,968][15401] Updated weights for policy 0, policy_version 769853 (0.0030) [2024-06-25 00:09:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12613287936. Throughput: 0: 42280.4. Samples: 12613417860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:09:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 00:10:01,431][15401] Updated weights for policy 0, policy_version 769863 (0.0033) [2024-06-25 00:10:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12613500928. Throughput: 0: 42484.0. Samples: 12613676400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 00:10:05,567][15401] Updated weights for policy 0, policy_version 769873 (0.0027) [2024-06-25 00:10:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12613730304. Throughput: 0: 42337.3. Samples: 12613801860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 00:10:09,115][15401] Updated weights for policy 0, policy_version 769883 (0.0024) [2024-06-25 00:10:13,100][15401] Updated weights for policy 0, policy_version 769893 (0.0038) [2024-06-25 00:10:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12613926912. Throughput: 0: 42333.3. Samples: 12614056860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 00:10:16,841][15401] Updated weights for policy 0, policy_version 769903 (0.0039) [2024-06-25 00:10:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12614139904. Throughput: 0: 42471.1. Samples: 12614312980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 00:10:20,776][15401] Updated weights for policy 0, policy_version 769913 (0.0036) [2024-06-25 00:10:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12614369280. Throughput: 0: 42554.2. Samples: 12614446360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 00:10:24,491][15401] Updated weights for policy 0, policy_version 769923 (0.0028) [2024-06-25 00:10:28,361][15401] Updated weights for policy 0, policy_version 769933 (0.0030) [2024-06-25 00:10:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12614582272. Throughput: 0: 42686.7. Samples: 12614703260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 00:10:32,090][15401] Updated weights for policy 0, policy_version 769943 (0.0033) [2024-06-25 00:10:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12614795264. Throughput: 0: 42779.6. Samples: 12614961020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:33,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 00:10:35,990][15401] Updated weights for policy 0, policy_version 769953 (0.0034) [2024-06-25 00:10:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 12615024640. Throughput: 0: 42829.7. Samples: 12615088860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:38,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 00:10:39,704][15401] Updated weights for policy 0, policy_version 769963 (0.0037) [2024-06-25 00:10:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12615221248. Throughput: 0: 42881.0. Samples: 12615347500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 00:10:43,483][15401] Updated weights for policy 0, policy_version 769973 (0.0023) [2024-06-25 00:10:47,331][15401] Updated weights for policy 0, policy_version 769983 (0.0033) [2024-06-25 00:10:48,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 12615434240. Throughput: 0: 42793.3. Samples: 12615602200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:48,393][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 00:10:51,315][15401] Updated weights for policy 0, policy_version 769993 (0.0028) [2024-06-25 00:10:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12615663616. Throughput: 0: 42855.2. Samples: 12615730340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 00:10:54,853][15401] Updated weights for policy 0, policy_version 770003 (0.0033) [2024-06-25 00:10:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12615860224. Throughput: 0: 42839.2. Samples: 12615984620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:10:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 00:10:59,085][15401] Updated weights for policy 0, policy_version 770013 (0.0032) [2024-06-25 00:11:00,887][15349] Signal inference workers to stop experience collection... (186650 times) [2024-06-25 00:11:00,920][15401] InferenceWorker_p0-w0: stopping experience collection (186650 times) [2024-06-25 00:11:01,002][15349] Signal inference workers to resume experience collection... (186650 times) [2024-06-25 00:11:01,003][15401] InferenceWorker_p0-w0: resuming experience collection (186650 times) [2024-06-25 00:11:02,602][15401] Updated weights for policy 0, policy_version 770023 (0.0037) [2024-06-25 00:11:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 12616089600. Throughput: 0: 42928.9. Samples: 12616244880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:03,392][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 00:11:06,739][15401] Updated weights for policy 0, policy_version 770033 (0.0035) [2024-06-25 00:11:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12616302592. Throughput: 0: 42842.6. Samples: 12616374280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 00:11:10,662][15401] Updated weights for policy 0, policy_version 770043 (0.0032) [2024-06-25 00:11:13,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12616499200. Throughput: 0: 42856.7. Samples: 12616631820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 00:11:14,271][15401] Updated weights for policy 0, policy_version 770053 (0.0037) [2024-06-25 00:11:18,086][15401] Updated weights for policy 0, policy_version 770063 (0.0034) [2024-06-25 00:11:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12616712192. Throughput: 0: 42743.7. Samples: 12616884480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:18,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 00:11:21,839][15401] Updated weights for policy 0, policy_version 770073 (0.0029) [2024-06-25 00:11:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12616941568. Throughput: 0: 42811.6. Samples: 12617015280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 00:11:25,948][15401] Updated weights for policy 0, policy_version 770083 (0.0040) [2024-06-25 00:11:28,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12617138176. Throughput: 0: 42659.0. Samples: 12617267160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 00:11:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 00:11:29,776][15401] Updated weights for policy 0, policy_version 770093 (0.0034) [2024-06-25 00:11:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 12617351168. Throughput: 0: 42712.3. Samples: 12617524160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:33,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 00:11:33,650][15401] Updated weights for policy 0, policy_version 770103 (0.0030) [2024-06-25 00:11:37,392][15401] Updated weights for policy 0, policy_version 770113 (0.0022) [2024-06-25 00:11:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12617580544. Throughput: 0: 42684.8. Samples: 12617651160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:38,394][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 00:11:41,194][15401] Updated weights for policy 0, policy_version 770123 (0.0039) [2024-06-25 00:11:43,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12617777152. Throughput: 0: 42728.8. Samples: 12617907420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 00:11:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770129_12617793536.pth... [2024-06-25 00:11:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769505_12607569920.pth [2024-06-25 00:11:45,023][15401] Updated weights for policy 0, policy_version 770133 (0.0026) [2024-06-25 00:11:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 12617973760. Throughput: 0: 42587.1. Samples: 12618161300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:48,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 00:11:49,112][15401] Updated weights for policy 0, policy_version 770143 (0.0028) [2024-06-25 00:11:52,790][15401] Updated weights for policy 0, policy_version 770153 (0.0035) [2024-06-25 00:11:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12618219520. Throughput: 0: 42486.7. Samples: 12618286180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 00:11:56,901][15401] Updated weights for policy 0, policy_version 770163 (0.0028) [2024-06-25 00:11:58,390][15132] Fps is (10 sec: 45886.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12618432512. Throughput: 0: 42347.2. Samples: 12618537440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:11:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 00:12:00,579][15401] Updated weights for policy 0, policy_version 770173 (0.0034) [2024-06-25 00:12:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12618629120. Throughput: 0: 42458.5. Samples: 12618795120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 00:12:04,510][15401] Updated weights for policy 0, policy_version 770183 (0.0027) [2024-06-25 00:12:08,125][15401] Updated weights for policy 0, policy_version 770193 (0.0045) [2024-06-25 00:12:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12618858496. Throughput: 0: 42386.3. Samples: 12618922660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:08,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 00:12:11,759][15349] Signal inference workers to stop experience collection... (186700 times) [2024-06-25 00:12:11,759][15349] Signal inference workers to resume experience collection... (186700 times) [2024-06-25 00:12:11,793][15401] InferenceWorker_p0-w0: stopping experience collection (186700 times) [2024-06-25 00:12:11,794][15401] InferenceWorker_p0-w0: resuming experience collection (186700 times) [2024-06-25 00:12:12,368][15401] Updated weights for policy 0, policy_version 770203 (0.0035) [2024-06-25 00:12:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12619055104. Throughput: 0: 42611.1. Samples: 12619184660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:13,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 00:12:16,133][15401] Updated weights for policy 0, policy_version 770213 (0.0045) [2024-06-25 00:12:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12619251712. Throughput: 0: 42526.9. Samples: 12619437860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 00:12:19,879][15401] Updated weights for policy 0, policy_version 770223 (0.0040) [2024-06-25 00:12:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 12619481088. Throughput: 0: 42477.8. Samples: 12619562660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 00:12:23,714][15401] Updated weights for policy 0, policy_version 770233 (0.0033) [2024-06-25 00:12:27,615][15401] Updated weights for policy 0, policy_version 770243 (0.0030) [2024-06-25 00:12:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12619677696. Throughput: 0: 42539.6. Samples: 12619821700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:28,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 00:12:31,182][15401] Updated weights for policy 0, policy_version 770253 (0.0042) [2024-06-25 00:12:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12619907072. Throughput: 0: 42480.8. Samples: 12620072840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 00:12:35,247][15401] Updated weights for policy 0, policy_version 770263 (0.0029) [2024-06-25 00:12:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12620120064. Throughput: 0: 42720.4. Samples: 12620208600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 00:12:38,719][15401] Updated weights for policy 0, policy_version 770273 (0.0041) [2024-06-25 00:12:42,856][15401] Updated weights for policy 0, policy_version 770283 (0.0042) [2024-06-25 00:12:43,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12620316672. Throughput: 0: 42786.8. Samples: 12620462840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 00:12:46,667][15401] Updated weights for policy 0, policy_version 770293 (0.0038) [2024-06-25 00:12:48,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43144.6, 300 sec: 42709.1). Total num frames: 12620562432. Throughput: 0: 42473.8. Samples: 12620706540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:48,392][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 00:12:50,770][15401] Updated weights for policy 0, policy_version 770303 (0.0041) [2024-06-25 00:12:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12620775424. Throughput: 0: 42732.5. Samples: 12620845620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 00:12:54,221][15401] Updated weights for policy 0, policy_version 770313 (0.0030) [2024-06-25 00:12:58,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12620955648. Throughput: 0: 42453.7. Samples: 12621095080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:12:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 00:12:59,117][15401] Updated weights for policy 0, policy_version 770323 (0.0043) [2024-06-25 00:13:01,744][15401] Updated weights for policy 0, policy_version 770333 (0.0036) [2024-06-25 00:13:03,392][15132] Fps is (10 sec: 42589.1, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 12621201408. Throughput: 0: 42482.0. Samples: 12621349640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:13:03,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 00:13:06,705][15401] Updated weights for policy 0, policy_version 770343 (0.0028) [2024-06-25 00:13:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12621381632. Throughput: 0: 42628.1. Samples: 12621480920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:13:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 00:13:09,369][15401] Updated weights for policy 0, policy_version 770353 (0.0027) [2024-06-25 00:13:13,390][15132] Fps is (10 sec: 39329.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12621594624. Throughput: 0: 42418.5. Samples: 12621730540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 00:13:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 00:13:14,205][15401] Updated weights for policy 0, policy_version 770363 (0.0046) [2024-06-25 00:13:17,140][15401] Updated weights for policy 0, policy_version 770373 (0.0041) [2024-06-25 00:13:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12621840384. Throughput: 0: 42449.0. Samples: 12621983040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:13:21,949][15401] Updated weights for policy 0, policy_version 770383 (0.0028) [2024-06-25 00:13:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 12622020608. Throughput: 0: 42449.7. Samples: 12622118840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 00:13:25,050][15349] Signal inference workers to stop experience collection... (186750 times) [2024-06-25 00:13:25,081][15401] InferenceWorker_p0-w0: stopping experience collection (186750 times) [2024-06-25 00:13:25,107][15349] Signal inference workers to resume experience collection... (186750 times) [2024-06-25 00:13:25,107][15401] InferenceWorker_p0-w0: resuming experience collection (186750 times) [2024-06-25 00:13:25,110][15401] Updated weights for policy 0, policy_version 770393 (0.0026) [2024-06-25 00:13:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12622249984. Throughput: 0: 42320.9. Samples: 12622367280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:28,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 00:13:29,870][15401] Updated weights for policy 0, policy_version 770403 (0.0031) [2024-06-25 00:13:32,732][15401] Updated weights for policy 0, policy_version 770413 (0.0034) [2024-06-25 00:13:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12622462976. Throughput: 0: 42598.3. Samples: 12622623360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 00:13:37,476][15401] Updated weights for policy 0, policy_version 770423 (0.0034) [2024-06-25 00:13:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12622643200. Throughput: 0: 42296.8. Samples: 12622748980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 00:13:40,531][15401] Updated weights for policy 0, policy_version 770433 (0.0038) [2024-06-25 00:13:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12622888960. Throughput: 0: 42309.3. Samples: 12622999000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 00:13:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770440_12622888960.pth... [2024-06-25 00:13:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000769816_12612665344.pth [2024-06-25 00:13:45,142][15401] Updated weights for policy 0, policy_version 770443 (0.0038) [2024-06-25 00:13:48,178][15401] Updated weights for policy 0, policy_version 770453 (0.0036) [2024-06-25 00:13:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 12623101952. Throughput: 0: 42398.4. Samples: 12623257480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 00:13:52,784][15401] Updated weights for policy 0, policy_version 770463 (0.0028) [2024-06-25 00:13:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 12623298560. Throughput: 0: 42353.6. Samples: 12623386840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 00:13:55,779][15401] Updated weights for policy 0, policy_version 770473 (0.0023) [2024-06-25 00:13:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 12623544320. Throughput: 0: 42485.7. Samples: 12623642400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:13:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 00:14:00,410][15401] Updated weights for policy 0, policy_version 770483 (0.0046) [2024-06-25 00:14:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42326.7, 300 sec: 42598.4). Total num frames: 12623740928. Throughput: 0: 42577.7. Samples: 12623899040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 00:14:03,410][15401] Updated weights for policy 0, policy_version 770493 (0.0031) [2024-06-25 00:14:08,041][15401] Updated weights for policy 0, policy_version 770503 (0.0042) [2024-06-25 00:14:08,392][15132] Fps is (10 sec: 37674.7, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 12623921152. Throughput: 0: 42355.6. Samples: 12624024940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:08,393][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 00:14:10,948][15401] Updated weights for policy 0, policy_version 770513 (0.0035) [2024-06-25 00:14:13,393][15132] Fps is (10 sec: 44221.6, 60 sec: 43142.1, 300 sec: 42597.9). Total num frames: 12624183296. Throughput: 0: 42549.4. Samples: 12624282160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:13,394][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 00:14:15,707][15401] Updated weights for policy 0, policy_version 770523 (0.0039) [2024-06-25 00:14:18,389][15132] Fps is (10 sec: 45886.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12624379904. Throughput: 0: 42629.0. Samples: 12624541660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 00:14:18,631][15401] Updated weights for policy 0, policy_version 770533 (0.0034) [2024-06-25 00:14:23,392][15132] Fps is (10 sec: 37687.3, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 12624560128. Throughput: 0: 42669.7. Samples: 12624669220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:23,393][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 00:14:23,747][15401] Updated weights for policy 0, policy_version 770543 (0.0041) [2024-06-25 00:14:25,999][15349] Signal inference workers to stop experience collection... (186800 times) [2024-06-25 00:14:26,048][15349] Signal inference workers to resume experience collection... (186800 times) [2024-06-25 00:14:26,049][15401] InferenceWorker_p0-w0: stopping experience collection (186800 times) [2024-06-25 00:14:26,071][15401] InferenceWorker_p0-w0: resuming experience collection (186800 times) [2024-06-25 00:14:26,197][15401] Updated weights for policy 0, policy_version 770553 (0.0042) [2024-06-25 00:14:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 12624822272. Throughput: 0: 42624.8. Samples: 12624917120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 00:14:31,371][15401] Updated weights for policy 0, policy_version 770563 (0.0040) [2024-06-25 00:14:33,396][15132] Fps is (10 sec: 47494.7, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 12625035264. Throughput: 0: 42767.7. Samples: 12625182300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:33,397][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 00:14:33,864][15401] Updated weights for policy 0, policy_version 770573 (0.0032) [2024-06-25 00:14:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12625199104. Throughput: 0: 42608.6. Samples: 12625304220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 00:14:38,993][15401] Updated weights for policy 0, policy_version 770583 (0.0037) [2024-06-25 00:14:41,626][15401] Updated weights for policy 0, policy_version 770593 (0.0034) [2024-06-25 00:14:43,390][15132] Fps is (10 sec: 42625.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12625461248. Throughput: 0: 42472.9. Samples: 12625553680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 00:14:46,710][15401] Updated weights for policy 0, policy_version 770603 (0.0032) [2024-06-25 00:14:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12625641472. Throughput: 0: 42685.8. Samples: 12625819900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 00:14:49,410][15401] Updated weights for policy 0, policy_version 770613 (0.0032) [2024-06-25 00:14:53,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12625838080. Throughput: 0: 42555.6. Samples: 12625939840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 00:14:54,279][15401] Updated weights for policy 0, policy_version 770623 (0.0039) [2024-06-25 00:14:57,102][15401] Updated weights for policy 0, policy_version 770633 (0.0031) [2024-06-25 00:14:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12626116608. Throughput: 0: 42602.1. Samples: 12626199100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 00:14:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 00:15:01,969][15401] Updated weights for policy 0, policy_version 770643 (0.0028) [2024-06-25 00:15:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 12626280448. Throughput: 0: 42720.3. Samples: 12626464180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:03,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 00:15:04,890][15401] Updated weights for policy 0, policy_version 770653 (0.0031) [2024-06-25 00:15:08,389][15132] Fps is (10 sec: 36044.9, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 12626477056. Throughput: 0: 42431.7. Samples: 12626578540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 00:15:09,618][15401] Updated weights for policy 0, policy_version 770663 (0.0031) [2024-06-25 00:15:12,497][15401] Updated weights for policy 0, policy_version 770673 (0.0031) [2024-06-25 00:15:13,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42600.8, 300 sec: 42709.5). Total num frames: 12626739200. Throughput: 0: 42806.2. Samples: 12626843400. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 00:15:17,406][15401] Updated weights for policy 0, policy_version 770683 (0.0023) [2024-06-25 00:15:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12626935808. Throughput: 0: 42661.7. Samples: 12627101800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 00:15:20,175][15401] Updated weights for policy 0, policy_version 770693 (0.0034) [2024-06-25 00:15:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 12627132416. Throughput: 0: 42617.4. Samples: 12627222000. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:23,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 00:15:25,013][15401] Updated weights for policy 0, policy_version 770703 (0.0027) [2024-06-25 00:15:27,831][15401] Updated weights for policy 0, policy_version 770713 (0.0031) [2024-06-25 00:15:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12627394560. Throughput: 0: 42874.3. Samples: 12627483020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 00:15:32,662][15401] Updated weights for policy 0, policy_version 770723 (0.0031) [2024-06-25 00:15:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42056.7, 300 sec: 42487.7). Total num frames: 12627558400. Throughput: 0: 42691.9. Samples: 12627741040. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 00:15:35,656][15401] Updated weights for policy 0, policy_version 770733 (0.0035) [2024-06-25 00:15:38,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 12627771392. Throughput: 0: 42707.1. Samples: 12627861760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:38,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 00:15:40,220][15401] Updated weights for policy 0, policy_version 770743 (0.0029) [2024-06-25 00:15:41,348][15349] Signal inference workers to stop experience collection... (186850 times) [2024-06-25 00:15:41,349][15349] Signal inference workers to resume experience collection... (186850 times) [2024-06-25 00:15:41,368][15401] InferenceWorker_p0-w0: stopping experience collection (186850 times) [2024-06-25 00:15:41,368][15401] InferenceWorker_p0-w0: resuming experience collection (186850 times) [2024-06-25 00:15:43,370][15401] Updated weights for policy 0, policy_version 770753 (0.0036) [2024-06-25 00:15:43,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42598.6, 300 sec: 42654.3). Total num frames: 12628017152. Throughput: 0: 42692.1. Samples: 12628120240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 00:15:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770754_12628033536.pth... [2024-06-25 00:15:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770129_12617793536.pth [2024-06-25 00:15:48,024][15401] Updated weights for policy 0, policy_version 770763 (0.0041) [2024-06-25 00:15:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 12628180992. Throughput: 0: 42546.8. Samples: 12628378680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:48,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 00:15:51,077][15401] Updated weights for policy 0, policy_version 770773 (0.0027) [2024-06-25 00:15:53,392][15132] Fps is (10 sec: 40951.0, 60 sec: 43143.1, 300 sec: 42598.1). Total num frames: 12628426752. Throughput: 0: 42694.9. Samples: 12628499900. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:53,392][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 00:15:55,806][15401] Updated weights for policy 0, policy_version 770783 (0.0037) [2024-06-25 00:15:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 12628639744. Throughput: 0: 42572.1. Samples: 12628759140. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:15:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 00:15:58,692][15401] Updated weights for policy 0, policy_version 770793 (0.0037) [2024-06-25 00:16:03,389][15132] Fps is (10 sec: 39329.8, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 12628819968. Throughput: 0: 42778.3. Samples: 12629026820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:03,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 00:16:03,435][15401] Updated weights for policy 0, policy_version 770803 (0.0031) [2024-06-25 00:16:06,393][15401] Updated weights for policy 0, policy_version 770813 (0.0036) [2024-06-25 00:16:08,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43415.8, 300 sec: 42653.6). Total num frames: 12629082112. Throughput: 0: 42747.5. Samples: 12629145740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:08,392][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 00:16:11,086][15401] Updated weights for policy 0, policy_version 770823 (0.0023) [2024-06-25 00:16:13,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12629295104. Throughput: 0: 42670.1. Samples: 12629403180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 00:16:13,930][15401] Updated weights for policy 0, policy_version 770833 (0.0045) [2024-06-25 00:16:18,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 12629458944. Throughput: 0: 42839.7. Samples: 12629668820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 00:16:18,685][15401] Updated weights for policy 0, policy_version 770843 (0.0027) [2024-06-25 00:16:21,477][15401] Updated weights for policy 0, policy_version 770853 (0.0039) [2024-06-25 00:16:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 12629721088. Throughput: 0: 42769.8. Samples: 12629786300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 00:16:26,212][15401] Updated weights for policy 0, policy_version 770863 (0.0028) [2024-06-25 00:16:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12629934080. Throughput: 0: 42839.3. Samples: 12630048020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 00:16:28,943][15401] Updated weights for policy 0, policy_version 770873 (0.0045) [2024-06-25 00:16:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 12630097920. Throughput: 0: 42951.5. Samples: 12630311500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 00:16:34,080][15401] Updated weights for policy 0, policy_version 770883 (0.0036) [2024-06-25 00:16:36,594][15401] Updated weights for policy 0, policy_version 770893 (0.0035) [2024-06-25 00:16:36,962][15349] Signal inference workers to stop experience collection... (186900 times) [2024-06-25 00:16:36,997][15401] InferenceWorker_p0-w0: stopping experience collection (186900 times) [2024-06-25 00:16:37,021][15349] Signal inference workers to resume experience collection... (186900 times) [2024-06-25 00:16:37,021][15401] InferenceWorker_p0-w0: resuming experience collection (186900 times) [2024-06-25 00:16:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43146.4, 300 sec: 42654.0). Total num frames: 12630360064. Throughput: 0: 42895.8. Samples: 12630430120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:38,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 00:16:41,504][15401] Updated weights for policy 0, policy_version 770903 (0.0033) [2024-06-25 00:16:43,395][15132] Fps is (10 sec: 47488.2, 60 sec: 42594.5, 300 sec: 42709.1). Total num frames: 12630573056. Throughput: 0: 42947.9. Samples: 12630692020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-25 00:16:43,395][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:16:44,287][15401] Updated weights for policy 0, policy_version 770913 (0.0029) [2024-06-25 00:16:48,392][15132] Fps is (10 sec: 39313.1, 60 sec: 42870.0, 300 sec: 42487.0). Total num frames: 12630753280. Throughput: 0: 42793.1. Samples: 12630952600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:16:48,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 00:16:49,367][15401] Updated weights for policy 0, policy_version 770923 (0.0031) [2024-06-25 00:16:52,190][15401] Updated weights for policy 0, policy_version 770933 (0.0030) [2024-06-25 00:16:53,390][15132] Fps is (10 sec: 42620.7, 60 sec: 42872.9, 300 sec: 42598.4). Total num frames: 12630999040. Throughput: 0: 42745.3. Samples: 12631069180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:16:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 00:16:57,031][15401] Updated weights for policy 0, policy_version 770943 (0.0033) [2024-06-25 00:16:58,389][15132] Fps is (10 sec: 45884.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12631212032. Throughput: 0: 42884.6. Samples: 12631332980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:16:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 00:16:59,962][15401] Updated weights for policy 0, policy_version 770953 (0.0031) [2024-06-25 00:17:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12631392256. Throughput: 0: 42620.4. Samples: 12631586740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 00:17:04,725][15401] Updated weights for policy 0, policy_version 770963 (0.0026) [2024-06-25 00:17:07,590][15401] Updated weights for policy 0, policy_version 770973 (0.0032) [2024-06-25 00:17:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12631638016. Throughput: 0: 42673.9. Samples: 12631706620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 00:17:12,382][15401] Updated weights for policy 0, policy_version 770983 (0.0040) [2024-06-25 00:17:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12631834624. Throughput: 0: 42650.2. Samples: 12631967280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 00:17:15,720][15401] Updated weights for policy 0, policy_version 770993 (0.0034) [2024-06-25 00:17:18,394][15132] Fps is (10 sec: 39305.5, 60 sec: 42868.5, 300 sec: 42542.3). Total num frames: 12632031232. Throughput: 0: 42424.1. Samples: 12632220760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:18,394][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 00:17:20,392][15401] Updated weights for policy 0, policy_version 771003 (0.0032) [2024-06-25 00:17:23,269][15401] Updated weights for policy 0, policy_version 771013 (0.0033) [2024-06-25 00:17:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12632276992. Throughput: 0: 42465.6. Samples: 12632341080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:23,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 00:17:27,957][15401] Updated weights for policy 0, policy_version 771023 (0.0048) [2024-06-25 00:17:28,390][15132] Fps is (10 sec: 44254.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12632473600. Throughput: 0: 42386.7. Samples: 12632599200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 00:17:30,839][15401] Updated weights for policy 0, policy_version 771033 (0.0044) [2024-06-25 00:17:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12632670208. Throughput: 0: 42306.4. Samples: 12632856300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 00:17:35,244][15401] Updated weights for policy 0, policy_version 771043 (0.0033) [2024-06-25 00:17:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12632915968. Throughput: 0: 42610.3. Samples: 12632986640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 00:17:38,494][15401] Updated weights for policy 0, policy_version 771053 (0.0045) [2024-06-25 00:17:43,037][15401] Updated weights for policy 0, policy_version 771063 (0.0030) [2024-06-25 00:17:43,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42056.0, 300 sec: 42487.7). Total num frames: 12633096192. Throughput: 0: 42365.3. Samples: 12633239420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:43,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 00:17:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771064_12633112576.pth... [2024-06-25 00:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770440_12622888960.pth [2024-06-25 00:17:46,648][15401] Updated weights for policy 0, policy_version 771073 (0.0037) [2024-06-25 00:17:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42872.9, 300 sec: 42542.8). Total num frames: 12633325568. Throughput: 0: 42355.5. Samples: 12633492740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:48,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 00:17:50,625][15401] Updated weights for policy 0, policy_version 771083 (0.0034) [2024-06-25 00:17:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12633554944. Throughput: 0: 42489.2. Samples: 12633618640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:53,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 00:17:54,319][15401] Updated weights for policy 0, policy_version 771093 (0.0034) [2024-06-25 00:17:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42432.1). Total num frames: 12633718784. Throughput: 0: 42456.2. Samples: 12633877800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:17:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 00:17:58,678][15401] Updated weights for policy 0, policy_version 771103 (0.0029) [2024-06-25 00:18:01,775][15401] Updated weights for policy 0, policy_version 771113 (0.0034) [2024-06-25 00:18:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12633948160. Throughput: 0: 42559.3. Samples: 12634135760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 00:18:06,082][15401] Updated weights for policy 0, policy_version 771123 (0.0028) [2024-06-25 00:18:07,676][15349] Signal inference workers to stop experience collection... (186950 times) [2024-06-25 00:18:07,677][15349] Signal inference workers to resume experience collection... (186950 times) [2024-06-25 00:18:07,692][15401] InferenceWorker_p0-w0: stopping experience collection (186950 times) [2024-06-25 00:18:07,692][15401] InferenceWorker_p0-w0: resuming experience collection (186950 times) [2024-06-25 00:18:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12634193920. Throughput: 0: 42760.0. Samples: 12634265280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 00:18:09,321][15401] Updated weights for policy 0, policy_version 771133 (0.0036) [2024-06-25 00:18:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12634390528. Throughput: 0: 42625.3. Samples: 12634517340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 00:18:13,568][15401] Updated weights for policy 0, policy_version 771143 (0.0046) [2024-06-25 00:18:17,494][15401] Updated weights for policy 0, policy_version 771153 (0.0039) [2024-06-25 00:18:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42874.3, 300 sec: 42653.9). Total num frames: 12634603520. Throughput: 0: 42628.3. Samples: 12634774580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:18,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 00:18:21,482][15401] Updated weights for policy 0, policy_version 771163 (0.0028) [2024-06-25 00:18:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12634832896. Throughput: 0: 42668.8. Samples: 12634906740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 00:18:25,198][15401] Updated weights for policy 0, policy_version 771173 (0.0034) [2024-06-25 00:18:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12635029504. Throughput: 0: 42615.0. Samples: 12635157100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:18:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 00:18:28,957][15401] Updated weights for policy 0, policy_version 771183 (0.0034) [2024-06-25 00:18:32,893][15401] Updated weights for policy 0, policy_version 771193 (0.0043) [2024-06-25 00:18:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12635258880. Throughput: 0: 42777.4. Samples: 12635417720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:33,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 00:18:36,511][15401] Updated weights for policy 0, policy_version 771203 (0.0033) [2024-06-25 00:18:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12635471872. Throughput: 0: 42838.6. Samples: 12635546380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 00:18:40,570][15401] Updated weights for policy 0, policy_version 771213 (0.0042) [2024-06-25 00:18:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12635668480. Throughput: 0: 42687.4. Samples: 12635798740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 00:18:44,211][15401] Updated weights for policy 0, policy_version 771223 (0.0027) [2024-06-25 00:18:48,382][15401] Updated weights for policy 0, policy_version 771233 (0.0032) [2024-06-25 00:18:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12635881472. Throughput: 0: 42851.1. Samples: 12636064060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 00:18:51,787][15401] Updated weights for policy 0, policy_version 771243 (0.0043) [2024-06-25 00:18:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12636127232. Throughput: 0: 42637.2. Samples: 12636183960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:53,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-25 00:18:55,999][15401] Updated weights for policy 0, policy_version 771253 (0.0033) [2024-06-25 00:18:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 12636323840. Throughput: 0: 42798.2. Samples: 12636443260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:18:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 00:18:59,263][15401] Updated weights for policy 0, policy_version 771263 (0.0030) [2024-06-25 00:19:03,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12636504064. Throughput: 0: 42835.2. Samples: 12636702160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 00:19:03,647][15401] Updated weights for policy 0, policy_version 771273 (0.0036) [2024-06-25 00:19:06,890][15401] Updated weights for policy 0, policy_version 771283 (0.0032) [2024-06-25 00:19:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.9). Total num frames: 12636749824. Throughput: 0: 42571.6. Samples: 12636822460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 00:19:11,574][15401] Updated weights for policy 0, policy_version 771293 (0.0041) [2024-06-25 00:19:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12636962816. Throughput: 0: 42651.7. Samples: 12637076420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 00:19:14,616][15401] Updated weights for policy 0, policy_version 771303 (0.0036) [2024-06-25 00:19:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12637159424. Throughput: 0: 42561.3. Samples: 12637332980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 00:19:19,314][15401] Updated weights for policy 0, policy_version 771313 (0.0025) [2024-06-25 00:19:22,405][15401] Updated weights for policy 0, policy_version 771323 (0.0047) [2024-06-25 00:19:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 12637388800. Throughput: 0: 42492.9. Samples: 12637458660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:23,392][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 00:19:26,754][15401] Updated weights for policy 0, policy_version 771333 (0.0028) [2024-06-25 00:19:28,256][15349] Signal inference workers to stop experience collection... (187000 times) [2024-06-25 00:19:28,257][15349] Signal inference workers to resume experience collection... (187000 times) [2024-06-25 00:19:28,299][15401] InferenceWorker_p0-w0: stopping experience collection (187000 times) [2024-06-25 00:19:28,299][15401] InferenceWorker_p0-w0: resuming experience collection (187000 times) [2024-06-25 00:19:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42599.3). Total num frames: 12637601792. Throughput: 0: 42678.4. Samples: 12637719260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 00:19:30,023][15401] Updated weights for policy 0, policy_version 771343 (0.0036) [2024-06-25 00:19:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12637798400. Throughput: 0: 42416.8. Samples: 12637972820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 00:19:34,447][15401] Updated weights for policy 0, policy_version 771353 (0.0034) [2024-06-25 00:19:37,727][15401] Updated weights for policy 0, policy_version 771363 (0.0041) [2024-06-25 00:19:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12638044160. Throughput: 0: 42689.0. Samples: 12638104960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 00:19:42,148][15401] Updated weights for policy 0, policy_version 771373 (0.0025) [2024-06-25 00:19:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12638224384. Throughput: 0: 42581.4. Samples: 12638359420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 00:19:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771376_12638224384.pth... [2024-06-25 00:19:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000770754_12628033536.pth [2024-06-25 00:19:45,529][15401] Updated weights for policy 0, policy_version 771383 (0.0038) [2024-06-25 00:19:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12638453760. Throughput: 0: 42359.9. Samples: 12638608360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 00:19:49,845][15401] Updated weights for policy 0, policy_version 771393 (0.0043) [2024-06-25 00:19:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 12638650368. Throughput: 0: 42507.1. Samples: 12638735280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 00:19:53,455][15401] Updated weights for policy 0, policy_version 771403 (0.0035) [2024-06-25 00:19:57,612][15401] Updated weights for policy 0, policy_version 771413 (0.0028) [2024-06-25 00:19:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 12638863360. Throughput: 0: 42556.7. Samples: 12638991580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:19:58,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 00:20:01,399][15401] Updated weights for policy 0, policy_version 771423 (0.0052) [2024-06-25 00:20:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12639076352. Throughput: 0: 42355.2. Samples: 12639238960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:20:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 00:20:05,198][15401] Updated weights for policy 0, policy_version 771433 (0.0027) [2024-06-25 00:20:08,396][15132] Fps is (10 sec: 42581.4, 60 sec: 42320.7, 300 sec: 42542.0). Total num frames: 12639289344. Throughput: 0: 42503.3. Samples: 12639371480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:20:08,397][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 00:20:09,069][15401] Updated weights for policy 0, policy_version 771443 (0.0045) [2024-06-25 00:20:13,076][15401] Updated weights for policy 0, policy_version 771453 (0.0028) [2024-06-25 00:20:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12639502336. Throughput: 0: 42472.7. Samples: 12639630540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 00:20:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 00:20:16,902][15401] Updated weights for policy 0, policy_version 771463 (0.0047) [2024-06-25 00:20:18,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12639731712. Throughput: 0: 42335.6. Samples: 12639877920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:18,391][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 00:20:20,669][15401] Updated weights for policy 0, policy_version 771473 (0.0036) [2024-06-25 00:20:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 12639928320. Throughput: 0: 42377.4. Samples: 12640011940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:23,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 00:20:24,303][15401] Updated weights for policy 0, policy_version 771483 (0.0032) [2024-06-25 00:20:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12640124928. Throughput: 0: 42428.4. Samples: 12640268700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 00:20:28,423][15401] Updated weights for policy 0, policy_version 771493 (0.0047) [2024-06-25 00:20:31,891][15401] Updated weights for policy 0, policy_version 771503 (0.0041) [2024-06-25 00:20:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 12640387072. Throughput: 0: 42398.3. Samples: 12640516280. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 00:20:36,015][15401] Updated weights for policy 0, policy_version 771513 (0.0042) [2024-06-25 00:20:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 12640550912. Throughput: 0: 42516.0. Samples: 12640648500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 00:20:39,680][15401] Updated weights for policy 0, policy_version 771523 (0.0026) [2024-06-25 00:20:43,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12640763904. Throughput: 0: 42453.2. Samples: 12640901880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 00:20:43,707][15401] Updated weights for policy 0, policy_version 771533 (0.0028) [2024-06-25 00:20:47,402][15401] Updated weights for policy 0, policy_version 771543 (0.0024) [2024-06-25 00:20:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 12641009664. Throughput: 0: 42549.6. Samples: 12641153700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:48,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 00:20:51,691][15401] Updated weights for policy 0, policy_version 771553 (0.0031) [2024-06-25 00:20:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12641189888. Throughput: 0: 42611.0. Samples: 12641288700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 00:20:54,953][15401] Updated weights for policy 0, policy_version 771563 (0.0037) [2024-06-25 00:20:57,948][15349] Signal inference workers to stop experience collection... (187050 times) [2024-06-25 00:20:57,948][15349] Signal inference workers to resume experience collection... (187050 times) [2024-06-25 00:20:57,962][15401] InferenceWorker_p0-w0: stopping experience collection (187050 times) [2024-06-25 00:20:57,963][15401] InferenceWorker_p0-w0: resuming experience collection (187050 times) [2024-06-25 00:20:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12641402880. Throughput: 0: 42389.9. Samples: 12641538080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:20:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 00:20:59,457][15401] Updated weights for policy 0, policy_version 771573 (0.0028) [2024-06-25 00:21:02,995][15401] Updated weights for policy 0, policy_version 771583 (0.0041) [2024-06-25 00:21:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 12641648640. Throughput: 0: 42512.5. Samples: 12641790980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 00:21:07,224][15401] Updated weights for policy 0, policy_version 771593 (0.0037) [2024-06-25 00:21:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 12641845248. Throughput: 0: 42377.3. Samples: 12641918920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 00:21:10,492][15401] Updated weights for policy 0, policy_version 771603 (0.0035) [2024-06-25 00:21:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12642058240. Throughput: 0: 42423.1. Samples: 12642177740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 00:21:14,739][15401] Updated weights for policy 0, policy_version 771613 (0.0033) [2024-06-25 00:21:18,165][15401] Updated weights for policy 0, policy_version 771623 (0.0032) [2024-06-25 00:21:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12642287616. Throughput: 0: 42506.6. Samples: 12642429080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 00:21:22,434][15401] Updated weights for policy 0, policy_version 771633 (0.0040) [2024-06-25 00:21:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12642484224. Throughput: 0: 42375.5. Samples: 12642555400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 00:21:25,868][15401] Updated weights for policy 0, policy_version 771643 (0.0037) [2024-06-25 00:21:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12642697216. Throughput: 0: 42447.7. Samples: 12642812020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 00:21:29,942][15401] Updated weights for policy 0, policy_version 771653 (0.0027) [2024-06-25 00:21:33,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 12642910208. Throughput: 0: 42515.1. Samples: 12643066980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:33,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 00:21:33,687][15401] Updated weights for policy 0, policy_version 771663 (0.0040) [2024-06-25 00:21:37,552][15401] Updated weights for policy 0, policy_version 771673 (0.0038) [2024-06-25 00:21:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42543.6). Total num frames: 12643123200. Throughput: 0: 42290.2. Samples: 12643191760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:38,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 00:21:41,477][15401] Updated weights for policy 0, policy_version 771683 (0.0023) [2024-06-25 00:21:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.6, 300 sec: 42654.2). Total num frames: 12643336192. Throughput: 0: 42460.4. Samples: 12643448800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 00:21:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771689_12643352576.pth... [2024-06-25 00:21:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771064_12633112576.pth [2024-06-25 00:21:45,103][15401] Updated weights for policy 0, policy_version 771693 (0.0026) [2024-06-25 00:21:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 12643516416. Throughput: 0: 42637.0. Samples: 12643709640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:48,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 00:21:49,089][15401] Updated weights for policy 0, policy_version 771703 (0.0039) [2024-06-25 00:21:52,649][15401] Updated weights for policy 0, policy_version 771713 (0.0035) [2024-06-25 00:21:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12643762176. Throughput: 0: 42405.4. Samples: 12643827160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 00:21:56,965][15401] Updated weights for policy 0, policy_version 771723 (0.0034) [2024-06-25 00:21:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12643975168. Throughput: 0: 42355.5. Samples: 12644083840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-25 00:21:58,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 00:21:59,923][15349] Signal inference workers to stop experience collection... (187100 times) [2024-06-25 00:21:59,923][15349] Signal inference workers to resume experience collection... (187100 times) [2024-06-25 00:21:59,971][15401] InferenceWorker_p0-w0: stopping experience collection (187100 times) [2024-06-25 00:21:59,972][15401] InferenceWorker_p0-w0: resuming experience collection (187100 times) [2024-06-25 00:22:00,557][15401] Updated weights for policy 0, policy_version 771733 (0.0029) [2024-06-25 00:22:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 12644155392. Throughput: 0: 42502.2. Samples: 12644341680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 00:22:04,575][15401] Updated weights for policy 0, policy_version 771743 (0.0032) [2024-06-25 00:22:08,055][15401] Updated weights for policy 0, policy_version 771753 (0.0034) [2024-06-25 00:22:08,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12644401152. Throughput: 0: 42457.2. Samples: 12644465980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 00:22:12,597][15401] Updated weights for policy 0, policy_version 771763 (0.0043) [2024-06-25 00:22:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42599.0). Total num frames: 12644597760. Throughput: 0: 42540.8. Samples: 12644726360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 00:22:15,542][15401] Updated weights for policy 0, policy_version 771773 (0.0029) [2024-06-25 00:22:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12644810752. Throughput: 0: 42483.7. Samples: 12644978640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 00:22:20,247][15401] Updated weights for policy 0, policy_version 771783 (0.0031) [2024-06-25 00:22:23,377][15401] Updated weights for policy 0, policy_version 771793 (0.0039) [2024-06-25 00:22:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12645056512. Throughput: 0: 42579.0. Samples: 12645107820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 00:22:28,205][15401] Updated weights for policy 0, policy_version 771803 (0.0038) [2024-06-25 00:22:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 12645220352. Throughput: 0: 42528.4. Samples: 12645362580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:28,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 00:22:30,938][15401] Updated weights for policy 0, policy_version 771813 (0.0047) [2024-06-25 00:22:33,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 12645433344. Throughput: 0: 42430.1. Samples: 12645619000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:33,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 00:22:36,020][15401] Updated weights for policy 0, policy_version 771823 (0.0028) [2024-06-25 00:22:38,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12645695488. Throughput: 0: 42659.6. Samples: 12645746840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 00:22:38,627][15401] Updated weights for policy 0, policy_version 771833 (0.0043) [2024-06-25 00:22:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 12645859328. Throughput: 0: 42683.5. Samples: 12646004600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:43,393][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 00:22:43,734][15401] Updated weights for policy 0, policy_version 771843 (0.0042) [2024-06-25 00:22:46,667][15401] Updated weights for policy 0, policy_version 771853 (0.0047) [2024-06-25 00:22:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12646088704. Throughput: 0: 42500.0. Samples: 12646254180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 00:22:51,450][15401] Updated weights for policy 0, policy_version 771863 (0.0028) [2024-06-25 00:22:53,389][15132] Fps is (10 sec: 45886.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12646318080. Throughput: 0: 42800.2. Samples: 12646391980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 00:22:54,030][15401] Updated weights for policy 0, policy_version 771873 (0.0025) [2024-06-25 00:22:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 12646498304. Throughput: 0: 42643.7. Samples: 12646645320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:22:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 00:22:58,935][15401] Updated weights for policy 0, policy_version 771883 (0.0023) [2024-06-25 00:23:01,864][15401] Updated weights for policy 0, policy_version 771893 (0.0035) [2024-06-25 00:23:03,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 12646744064. Throughput: 0: 42503.3. Samples: 12646891300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 00:23:06,608][15401] Updated weights for policy 0, policy_version 771903 (0.0031) [2024-06-25 00:23:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12646957056. Throughput: 0: 42795.3. Samples: 12647033600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 00:23:09,711][15401] Updated weights for policy 0, policy_version 771913 (0.0029) [2024-06-25 00:23:13,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 12647137280. Throughput: 0: 42703.7. Samples: 12647284240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 00:23:14,346][15401] Updated weights for policy 0, policy_version 771923 (0.0036) [2024-06-25 00:23:14,828][15349] Signal inference workers to stop experience collection... (187150 times) [2024-06-25 00:23:14,828][15349] Signal inference workers to resume experience collection... (187150 times) [2024-06-25 00:23:14,846][15401] InferenceWorker_p0-w0: stopping experience collection (187150 times) [2024-06-25 00:23:14,846][15401] InferenceWorker_p0-w0: resuming experience collection (187150 times) [2024-06-25 00:23:17,655][15401] Updated weights for policy 0, policy_version 771933 (0.0037) [2024-06-25 00:23:18,394][15132] Fps is (10 sec: 42578.7, 60 sec: 42868.1, 300 sec: 42542.2). Total num frames: 12647383040. Throughput: 0: 42551.2. Samples: 12647534000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:18,395][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 00:23:21,903][15401] Updated weights for policy 0, policy_version 771943 (0.0035) [2024-06-25 00:23:23,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 12647596032. Throughput: 0: 42634.6. Samples: 12647665500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:23,393][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 00:23:25,281][15401] Updated weights for policy 0, policy_version 771953 (0.0036) [2024-06-25 00:23:28,390][15132] Fps is (10 sec: 40978.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 12647792640. Throughput: 0: 42570.8. Samples: 12647920180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 00:23:29,445][15401] Updated weights for policy 0, policy_version 771963 (0.0032) [2024-06-25 00:23:33,078][15401] Updated weights for policy 0, policy_version 771973 (0.0036) [2024-06-25 00:23:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12648022016. Throughput: 0: 42610.7. Samples: 12648171660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 00:23:37,282][15401] Updated weights for policy 0, policy_version 771983 (0.0026) [2024-06-25 00:23:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12648218624. Throughput: 0: 42403.1. Samples: 12648300120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 00:23:40,733][15401] Updated weights for policy 0, policy_version 771993 (0.0051) [2024-06-25 00:23:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 12648431616. Throughput: 0: 42346.1. Samples: 12648550900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 00:23:43,394][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 00:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771999_12648431616.pth... [2024-06-25 00:23:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771376_12638224384.pth [2024-06-25 00:23:45,236][15401] Updated weights for policy 0, policy_version 772003 (0.0039) [2024-06-25 00:23:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 12648644608. Throughput: 0: 42529.0. Samples: 12648805100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:23:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 00:23:48,473][15401] Updated weights for policy 0, policy_version 772013 (0.0036) [2024-06-25 00:23:52,833][15401] Updated weights for policy 0, policy_version 772023 (0.0038) [2024-06-25 00:23:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12648857600. Throughput: 0: 42291.1. Samples: 12648936700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:23:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 00:23:56,361][15401] Updated weights for policy 0, policy_version 772033 (0.0035) [2024-06-25 00:23:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12649070592. Throughput: 0: 42320.8. Samples: 12649188680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:23:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 00:24:00,406][15401] Updated weights for policy 0, policy_version 772043 (0.0026) [2024-06-25 00:24:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 12649283584. Throughput: 0: 42432.4. Samples: 12649443260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 00:24:04,087][15401] Updated weights for policy 0, policy_version 772053 (0.0036) [2024-06-25 00:24:08,115][15401] Updated weights for policy 0, policy_version 772063 (0.0034) [2024-06-25 00:24:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 12649496576. Throughput: 0: 42284.2. Samples: 12649568180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 00:24:11,712][15401] Updated weights for policy 0, policy_version 772073 (0.0024) [2024-06-25 00:24:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12649693184. Throughput: 0: 42170.6. Samples: 12649817860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 00:24:15,881][15401] Updated weights for policy 0, policy_version 772083 (0.0045) [2024-06-25 00:24:18,391][15132] Fps is (10 sec: 42592.4, 60 sec: 42327.7, 300 sec: 42487.5). Total num frames: 12649922560. Throughput: 0: 42393.4. Samples: 12650079420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:18,391][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 00:24:19,370][15401] Updated weights for policy 0, policy_version 772093 (0.0033) [2024-06-25 00:24:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42054.1, 300 sec: 42431.8). Total num frames: 12650119168. Throughput: 0: 42484.4. Samples: 12650211920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 00:24:23,500][15401] Updated weights for policy 0, policy_version 772103 (0.0047) [2024-06-25 00:24:27,003][15401] Updated weights for policy 0, policy_version 772113 (0.0033) [2024-06-25 00:24:28,389][15132] Fps is (10 sec: 40965.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 12650332160. Throughput: 0: 42449.0. Samples: 12650461100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 00:24:31,052][15401] Updated weights for policy 0, policy_version 772123 (0.0030) [2024-06-25 00:24:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.6, 300 sec: 42431.4). Total num frames: 12650561536. Throughput: 0: 42485.8. Samples: 12650717060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:33,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 00:24:34,587][15401] Updated weights for policy 0, policy_version 772133 (0.0031) [2024-06-25 00:24:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 12650758144. Throughput: 0: 42554.2. Samples: 12650851640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 00:24:38,843][15401] Updated weights for policy 0, policy_version 772143 (0.0040) [2024-06-25 00:24:42,288][15401] Updated weights for policy 0, policy_version 772153 (0.0050) [2024-06-25 00:24:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 12650987520. Throughput: 0: 42537.0. Samples: 12651102840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 00:24:46,717][15401] Updated weights for policy 0, policy_version 772163 (0.0033) [2024-06-25 00:24:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12651200512. Throughput: 0: 42613.3. Samples: 12651360860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:48,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 00:24:48,415][15349] Signal inference workers to stop experience collection... (187200 times) [2024-06-25 00:24:48,416][15349] Signal inference workers to resume experience collection... (187200 times) [2024-06-25 00:24:48,438][15401] InferenceWorker_p0-w0: stopping experience collection (187200 times) [2024-06-25 00:24:48,438][15401] InferenceWorker_p0-w0: resuming experience collection (187200 times) [2024-06-25 00:24:49,805][15401] Updated weights for policy 0, policy_version 772173 (0.0036) [2024-06-25 00:24:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 12651397120. Throughput: 0: 42700.3. Samples: 12651489700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 00:24:54,275][15401] Updated weights for policy 0, policy_version 772183 (0.0030) [2024-06-25 00:24:57,493][15401] Updated weights for policy 0, policy_version 772193 (0.0044) [2024-06-25 00:24:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 12651626496. Throughput: 0: 42725.4. Samples: 12651740500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:24:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 00:25:02,015][15401] Updated weights for policy 0, policy_version 772203 (0.0030) [2024-06-25 00:25:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42543.4). Total num frames: 12651839488. Throughput: 0: 42627.9. Samples: 12651997720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:03,392][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 00:25:05,585][15401] Updated weights for policy 0, policy_version 772213 (0.0035) [2024-06-25 00:25:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12652052480. Throughput: 0: 42532.9. Samples: 12652125900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 00:25:09,550][15401] Updated weights for policy 0, policy_version 772223 (0.0026) [2024-06-25 00:25:13,088][15401] Updated weights for policy 0, policy_version 772233 (0.0028) [2024-06-25 00:25:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12652281856. Throughput: 0: 42671.0. Samples: 12652381300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:13,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 00:25:17,092][15401] Updated weights for policy 0, policy_version 772243 (0.0048) [2024-06-25 00:25:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42326.2, 300 sec: 42487.3). Total num frames: 12652462080. Throughput: 0: 42806.2. Samples: 12652643240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 00:25:20,685][15401] Updated weights for policy 0, policy_version 772253 (0.0029) [2024-06-25 00:25:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12652691456. Throughput: 0: 42506.3. Samples: 12652764420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 00:25:24,959][15401] Updated weights for policy 0, policy_version 772263 (0.0029) [2024-06-25 00:25:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 12652904448. Throughput: 0: 42626.2. Samples: 12653021020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:25:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 00:25:28,659][15401] Updated weights for policy 0, policy_version 772273 (0.0035) [2024-06-25 00:25:32,398][15401] Updated weights for policy 0, policy_version 772283 (0.0034) [2024-06-25 00:25:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 12653101056. Throughput: 0: 42713.6. Samples: 12653282980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 00:25:36,328][15401] Updated weights for policy 0, policy_version 772293 (0.0033) [2024-06-25 00:25:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12653330432. Throughput: 0: 42753.8. Samples: 12653413620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 00:25:39,863][15401] Updated weights for policy 0, policy_version 772303 (0.0031) [2024-06-25 00:25:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12653543424. Throughput: 0: 42815.5. Samples: 12653667200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 00:25:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772311_12653543424.pth... [2024-06-25 00:25:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771689_12643352576.pth [2024-06-25 00:25:43,942][15401] Updated weights for policy 0, policy_version 772313 (0.0034) [2024-06-25 00:25:48,018][15401] Updated weights for policy 0, policy_version 772323 (0.0047) [2024-06-25 00:25:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12653756416. Throughput: 0: 42939.0. Samples: 12653929880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 00:25:51,500][15401] Updated weights for policy 0, policy_version 772333 (0.0026) [2024-06-25 00:25:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12653985792. Throughput: 0: 42866.5. Samples: 12654054900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 00:25:55,909][15401] Updated weights for policy 0, policy_version 772343 (0.0041) [2024-06-25 00:25:57,369][15349] Signal inference workers to stop experience collection... (187250 times) [2024-06-25 00:25:57,419][15349] Signal inference workers to resume experience collection... (187250 times) [2024-06-25 00:25:57,429][15401] InferenceWorker_p0-w0: stopping experience collection (187250 times) [2024-06-25 00:25:57,463][15401] InferenceWorker_p0-w0: resuming experience collection (187250 times) [2024-06-25 00:25:58,394][15132] Fps is (10 sec: 44217.3, 60 sec: 42868.2, 300 sec: 42542.2). Total num frames: 12654198784. Throughput: 0: 42949.4. Samples: 12654314220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:25:58,395][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 00:25:59,164][15401] Updated weights for policy 0, policy_version 772353 (0.0025) [2024-06-25 00:26:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 12654379008. Throughput: 0: 42999.1. Samples: 12654578200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 00:26:03,621][15401] Updated weights for policy 0, policy_version 772363 (0.0031) [2024-06-25 00:26:06,612][15401] Updated weights for policy 0, policy_version 772373 (0.0048) [2024-06-25 00:26:08,389][15132] Fps is (10 sec: 40978.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12654608384. Throughput: 0: 43062.6. Samples: 12654702240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 00:26:11,194][15401] Updated weights for policy 0, policy_version 772383 (0.0026) [2024-06-25 00:26:13,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12654854144. Throughput: 0: 43046.7. Samples: 12654958120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 00:26:14,332][15401] Updated weights for policy 0, policy_version 772393 (0.0039) [2024-06-25 00:26:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12655034368. Throughput: 0: 43000.1. Samples: 12655217980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:18,396][15132] Avg episode reward: [(0, '0.878')] [2024-06-25 00:26:18,910][15401] Updated weights for policy 0, policy_version 772403 (0.0030) [2024-06-25 00:26:21,821][15401] Updated weights for policy 0, policy_version 772413 (0.0035) [2024-06-25 00:26:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12655247360. Throughput: 0: 42836.5. Samples: 12655341260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 00:26:26,416][15401] Updated weights for policy 0, policy_version 772423 (0.0032) [2024-06-25 00:26:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 12655493120. Throughput: 0: 43004.8. Samples: 12655602420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 00:26:29,249][15401] Updated weights for policy 0, policy_version 772433 (0.0025) [2024-06-25 00:26:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 12655673344. Throughput: 0: 42850.7. Samples: 12655858160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:33,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 00:26:33,894][15401] Updated weights for policy 0, policy_version 772443 (0.0037) [2024-06-25 00:26:36,816][15401] Updated weights for policy 0, policy_version 772453 (0.0032) [2024-06-25 00:26:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12655919104. Throughput: 0: 42829.5. Samples: 12655982220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 00:26:41,680][15401] Updated weights for policy 0, policy_version 772463 (0.0038) [2024-06-25 00:26:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12656132096. Throughput: 0: 43031.0. Samples: 12656250420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:43,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 00:26:44,521][15401] Updated weights for policy 0, policy_version 772473 (0.0042) [2024-06-25 00:26:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12656312320. Throughput: 0: 42744.1. Samples: 12656501680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 00:26:49,230][15401] Updated weights for policy 0, policy_version 772483 (0.0041) [2024-06-25 00:26:52,331][15401] Updated weights for policy 0, policy_version 772493 (0.0031) [2024-06-25 00:26:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 12656574464. Throughput: 0: 42778.7. Samples: 12656627280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 00:26:56,834][15401] Updated weights for policy 0, policy_version 772503 (0.0023) [2024-06-25 00:26:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42601.6, 300 sec: 42709.5). Total num frames: 12656754688. Throughput: 0: 42899.0. Samples: 12656888580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:26:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 00:27:00,077][15401] Updated weights for policy 0, policy_version 772513 (0.0043) [2024-06-25 00:27:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 12656967680. Throughput: 0: 42774.3. Samples: 12657142820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:27:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 00:27:04,224][15401] Updated weights for policy 0, policy_version 772523 (0.0035) [2024-06-25 00:27:06,276][15349] Signal inference workers to stop experience collection... (187300 times) [2024-06-25 00:27:06,276][15349] Signal inference workers to resume experience collection... (187300 times) [2024-06-25 00:27:06,313][15401] InferenceWorker_p0-w0: stopping experience collection (187300 times) [2024-06-25 00:27:06,313][15401] InferenceWorker_p0-w0: resuming experience collection (187300 times) [2024-06-25 00:27:07,506][15401] Updated weights for policy 0, policy_version 772533 (0.0039) [2024-06-25 00:27:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12657213440. Throughput: 0: 42921.3. Samples: 12657272720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:27:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 00:27:12,157][15401] Updated weights for policy 0, policy_version 772543 (0.0039) [2024-06-25 00:27:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12657377280. Throughput: 0: 42966.8. Samples: 12657535920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:27:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 00:27:15,217][15401] Updated weights for policy 0, policy_version 772553 (0.0028) [2024-06-25 00:27:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 12657623040. Throughput: 0: 42880.9. Samples: 12657787900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 00:27:18,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 00:27:19,611][15401] Updated weights for policy 0, policy_version 772563 (0.0035) [2024-06-25 00:27:23,000][15401] Updated weights for policy 0, policy_version 772573 (0.0038) [2024-06-25 00:27:23,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12657852416. Throughput: 0: 43180.8. Samples: 12657925360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:23,393][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 00:27:27,139][15401] Updated weights for policy 0, policy_version 772583 (0.0035) [2024-06-25 00:27:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12658032640. Throughput: 0: 42821.8. Samples: 12658177400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:28,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 00:27:30,717][15401] Updated weights for policy 0, policy_version 772593 (0.0044) [2024-06-25 00:27:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 12658278400. Throughput: 0: 42762.7. Samples: 12658426000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:33,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 00:27:34,821][15401] Updated weights for policy 0, policy_version 772603 (0.0031) [2024-06-25 00:27:38,335][15401] Updated weights for policy 0, policy_version 772613 (0.0027) [2024-06-25 00:27:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 12658491392. Throughput: 0: 43064.3. Samples: 12658565180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 00:27:42,290][15401] Updated weights for policy 0, policy_version 772623 (0.0036) [2024-06-25 00:27:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12658671616. Throughput: 0: 43010.8. Samples: 12658824060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:43,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 00:27:43,464][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772625_12658688000.pth... [2024-06-25 00:27:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000771999_12648431616.pth [2024-06-25 00:27:46,284][15401] Updated weights for policy 0, policy_version 772633 (0.0029) [2024-06-25 00:27:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 12658933760. Throughput: 0: 42815.5. Samples: 12659069520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 00:27:50,386][15401] Updated weights for policy 0, policy_version 772643 (0.0037) [2024-06-25 00:27:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12659130368. Throughput: 0: 43055.1. Samples: 12659210200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 00:27:53,843][15401] Updated weights for policy 0, policy_version 772653 (0.0030) [2024-06-25 00:27:57,985][15401] Updated weights for policy 0, policy_version 772663 (0.0042) [2024-06-25 00:27:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 12659326976. Throughput: 0: 42839.4. Samples: 12659463700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:27:58,394][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 00:28:01,411][15401] Updated weights for policy 0, policy_version 772673 (0.0034) [2024-06-25 00:28:03,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 12659589120. Throughput: 0: 42832.4. Samples: 12659715360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:03,393][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 00:28:05,470][15401] Updated weights for policy 0, policy_version 772683 (0.0036) [2024-06-25 00:28:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12659769344. Throughput: 0: 42944.0. Samples: 12659857840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:08,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 00:28:08,993][15401] Updated weights for policy 0, policy_version 772693 (0.0036) [2024-06-25 00:28:13,028][15401] Updated weights for policy 0, policy_version 772703 (0.0042) [2024-06-25 00:28:13,389][15132] Fps is (10 sec: 37692.6, 60 sec: 43144.5, 300 sec: 42654.6). Total num frames: 12659965952. Throughput: 0: 42936.5. Samples: 12660109540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:13,396][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 00:28:13,863][15349] Signal inference workers to stop experience collection... (187350 times) [2024-06-25 00:28:13,863][15349] Signal inference workers to resume experience collection... (187350 times) [2024-06-25 00:28:13,881][15401] InferenceWorker_p0-w0: stopping experience collection (187350 times) [2024-06-25 00:28:13,917][15401] InferenceWorker_p0-w0: resuming experience collection (187350 times) [2024-06-25 00:28:16,542][15401] Updated weights for policy 0, policy_version 772713 (0.0033) [2024-06-25 00:28:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43419.3, 300 sec: 42820.9). Total num frames: 12660228096. Throughput: 0: 42895.0. Samples: 12660356280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 00:28:21,006][15401] Updated weights for policy 0, policy_version 772723 (0.0032) [2024-06-25 00:28:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12660408320. Throughput: 0: 42957.4. Samples: 12660498260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 00:28:24,146][15401] Updated weights for policy 0, policy_version 772733 (0.0037) [2024-06-25 00:28:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12660604928. Throughput: 0: 42802.1. Samples: 12660750160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 00:28:28,528][15401] Updated weights for policy 0, policy_version 772743 (0.0039) [2024-06-25 00:28:31,656][15401] Updated weights for policy 0, policy_version 772753 (0.0032) [2024-06-25 00:28:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12660867072. Throughput: 0: 42955.2. Samples: 12661002500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 00:28:36,032][15401] Updated weights for policy 0, policy_version 772763 (0.0045) [2024-06-25 00:28:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12661047296. Throughput: 0: 42875.5. Samples: 12661139600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 00:28:39,311][15401] Updated weights for policy 0, policy_version 772773 (0.0047) [2024-06-25 00:28:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12661260288. Throughput: 0: 42816.4. Samples: 12661390440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:43,399][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 00:28:43,543][15401] Updated weights for policy 0, policy_version 772783 (0.0034) [2024-06-25 00:28:47,086][15401] Updated weights for policy 0, policy_version 772793 (0.0039) [2024-06-25 00:28:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12661489664. Throughput: 0: 42894.8. Samples: 12661645520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:48,395][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 00:28:51,087][15401] Updated weights for policy 0, policy_version 772803 (0.0044) [2024-06-25 00:28:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12661702656. Throughput: 0: 42666.2. Samples: 12661777820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 00:28:54,808][15401] Updated weights for policy 0, policy_version 772813 (0.0035) [2024-06-25 00:28:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12661899264. Throughput: 0: 42711.0. Samples: 12662031540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:28:58,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 00:28:58,765][15401] Updated weights for policy 0, policy_version 772823 (0.0035) [2024-06-25 00:29:02,407][15401] Updated weights for policy 0, policy_version 772833 (0.0031) [2024-06-25 00:29:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 12662145024. Throughput: 0: 42865.8. Samples: 12662285240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 00:29:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 00:29:06,468][15401] Updated weights for policy 0, policy_version 772843 (0.0033) [2024-06-25 00:29:08,391][15132] Fps is (10 sec: 42591.7, 60 sec: 42597.3, 300 sec: 42820.3). Total num frames: 12662325248. Throughput: 0: 42618.0. Samples: 12662416140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:08,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 00:29:10,052][15401] Updated weights for policy 0, policy_version 772853 (0.0043) [2024-06-25 00:29:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 12662554624. Throughput: 0: 42768.9. Samples: 12662674760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 00:29:14,131][15401] Updated weights for policy 0, policy_version 772863 (0.0035) [2024-06-25 00:29:17,724][15401] Updated weights for policy 0, policy_version 772873 (0.0031) [2024-06-25 00:29:18,390][15132] Fps is (10 sec: 45882.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12662784000. Throughput: 0: 42643.0. Samples: 12662921440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 00:29:21,622][15401] Updated weights for policy 0, policy_version 772883 (0.0035) [2024-06-25 00:29:23,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 12662964224. Throughput: 0: 42660.7. Samples: 12663059600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:23,396][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 00:29:25,222][15401] Updated weights for policy 0, policy_version 772893 (0.0031) [2024-06-25 00:29:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 12663193600. Throughput: 0: 42782.7. Samples: 12663315660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 00:29:29,729][15401] Updated weights for policy 0, policy_version 772903 (0.0039) [2024-06-25 00:29:32,577][15349] Signal inference workers to stop experience collection... (187400 times) [2024-06-25 00:29:32,630][15401] InferenceWorker_p0-w0: stopping experience collection (187400 times) [2024-06-25 00:29:32,691][15349] Signal inference workers to resume experience collection... (187400 times) [2024-06-25 00:29:32,691][15401] InferenceWorker_p0-w0: resuming experience collection (187400 times) [2024-06-25 00:29:32,830][15401] Updated weights for policy 0, policy_version 772913 (0.0036) [2024-06-25 00:29:33,390][15132] Fps is (10 sec: 45904.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 12663422976. Throughput: 0: 42718.2. Samples: 12663567840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 00:29:37,170][15401] Updated weights for policy 0, policy_version 772923 (0.0041) [2024-06-25 00:29:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12663619584. Throughput: 0: 42724.9. Samples: 12663700440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 00:29:40,653][15401] Updated weights for policy 0, policy_version 772933 (0.0032) [2024-06-25 00:29:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12663848960. Throughput: 0: 42773.9. Samples: 12663956360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:43,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 00:29:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772940_12663848960.pth... [2024-06-25 00:29:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772311_12653543424.pth [2024-06-25 00:29:44,853][15401] Updated weights for policy 0, policy_version 772943 (0.0027) [2024-06-25 00:29:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 12664029184. Throughput: 0: 42904.4. Samples: 12664216040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:48,393][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 00:29:48,649][15401] Updated weights for policy 0, policy_version 772953 (0.0045) [2024-06-25 00:29:52,515][15401] Updated weights for policy 0, policy_version 772963 (0.0035) [2024-06-25 00:29:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12664258560. Throughput: 0: 42714.9. Samples: 12664338240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 00:29:56,249][15401] Updated weights for policy 0, policy_version 772973 (0.0040) [2024-06-25 00:29:58,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12664471552. Throughput: 0: 42593.0. Samples: 12664591440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:29:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 00:30:00,117][15401] Updated weights for policy 0, policy_version 772983 (0.0029) [2024-06-25 00:30:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12664668160. Throughput: 0: 42889.0. Samples: 12664851440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:03,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 00:30:03,795][15401] Updated weights for policy 0, policy_version 772993 (0.0031) [2024-06-25 00:30:07,511][15401] Updated weights for policy 0, policy_version 773003 (0.0047) [2024-06-25 00:30:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43145.8, 300 sec: 42820.6). Total num frames: 12664913920. Throughput: 0: 42730.2. Samples: 12664982180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 00:30:11,413][15401] Updated weights for policy 0, policy_version 773013 (0.0038) [2024-06-25 00:30:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12665110528. Throughput: 0: 42607.9. Samples: 12665233020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 00:30:15,081][15401] Updated weights for policy 0, policy_version 773023 (0.0027) [2024-06-25 00:30:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 12665323520. Throughput: 0: 42861.0. Samples: 12665496580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:18,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 00:30:19,088][15401] Updated weights for policy 0, policy_version 773033 (0.0038) [2024-06-25 00:30:22,787][15401] Updated weights for policy 0, policy_version 773043 (0.0037) [2024-06-25 00:30:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42874.3, 300 sec: 42820.2). Total num frames: 12665536512. Throughput: 0: 42623.0. Samples: 12665618580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:23,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 00:30:26,768][15401] Updated weights for policy 0, policy_version 773053 (0.0023) [2024-06-25 00:30:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12665749504. Throughput: 0: 42675.4. Samples: 12665876760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 00:30:30,563][15401] Updated weights for policy 0, policy_version 773063 (0.0039) [2024-06-25 00:30:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12665962496. Throughput: 0: 42625.5. Samples: 12666134080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 00:30:34,499][15401] Updated weights for policy 0, policy_version 773073 (0.0038) [2024-06-25 00:30:38,077][15401] Updated weights for policy 0, policy_version 773083 (0.0042) [2024-06-25 00:30:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12666191872. Throughput: 0: 42789.2. Samples: 12666263760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 00:30:42,262][15401] Updated weights for policy 0, policy_version 773093 (0.0045) [2024-06-25 00:30:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12666388480. Throughput: 0: 42824.8. Samples: 12666518560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:43,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 00:30:46,078][15401] Updated weights for policy 0, policy_version 773103 (0.0027) [2024-06-25 00:30:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12666617856. Throughput: 0: 42701.7. Samples: 12666773020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 00:30:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 00:30:50,022][15401] Updated weights for policy 0, policy_version 773113 (0.0041) [2024-06-25 00:30:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 12666830848. Throughput: 0: 42673.3. Samples: 12666902480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:30:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 00:30:53,996][15401] Updated weights for policy 0, policy_version 773123 (0.0035) [2024-06-25 00:30:55,776][15349] Signal inference workers to stop experience collection... (187450 times) [2024-06-25 00:30:55,776][15349] Signal inference workers to resume experience collection... (187450 times) [2024-06-25 00:30:55,812][15401] InferenceWorker_p0-w0: stopping experience collection (187450 times) [2024-06-25 00:30:55,812][15401] InferenceWorker_p0-w0: resuming experience collection (187450 times) [2024-06-25 00:30:57,537][15401] Updated weights for policy 0, policy_version 773133 (0.0044) [2024-06-25 00:30:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12667027456. Throughput: 0: 42845.3. Samples: 12667161060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:30:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 00:31:01,534][15401] Updated weights for policy 0, policy_version 773143 (0.0046) [2024-06-25 00:31:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12667273216. Throughput: 0: 42671.1. Samples: 12667416780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 00:31:05,037][15401] Updated weights for policy 0, policy_version 773153 (0.0033) [2024-06-25 00:31:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12667469824. Throughput: 0: 42895.6. Samples: 12667548780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 00:31:08,950][15401] Updated weights for policy 0, policy_version 773163 (0.0042) [2024-06-25 00:31:12,715][15401] Updated weights for policy 0, policy_version 773173 (0.0027) [2024-06-25 00:31:13,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12667682816. Throughput: 0: 42837.7. Samples: 12667804460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 00:31:16,390][15401] Updated weights for policy 0, policy_version 773183 (0.0027) [2024-06-25 00:31:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12667895808. Throughput: 0: 42846.1. Samples: 12668062160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 00:31:20,635][15401] Updated weights for policy 0, policy_version 773193 (0.0036) [2024-06-25 00:31:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12668108800. Throughput: 0: 42798.4. Samples: 12668189680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 00:31:23,933][15401] Updated weights for policy 0, policy_version 773203 (0.0032) [2024-06-25 00:31:28,128][15401] Updated weights for policy 0, policy_version 773213 (0.0033) [2024-06-25 00:31:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12668321792. Throughput: 0: 42766.6. Samples: 12668443060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 00:31:31,927][15401] Updated weights for policy 0, policy_version 773223 (0.0028) [2024-06-25 00:31:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12668551168. Throughput: 0: 42906.7. Samples: 12668703820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 00:31:35,611][15401] Updated weights for policy 0, policy_version 773233 (0.0030) [2024-06-25 00:31:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12668731392. Throughput: 0: 42918.5. Samples: 12668833820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 00:31:39,404][15401] Updated weights for policy 0, policy_version 773243 (0.0034) [2024-06-25 00:31:43,388][15401] Updated weights for policy 0, policy_version 773253 (0.0040) [2024-06-25 00:31:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12668977152. Throughput: 0: 42687.7. Samples: 12669082000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 00:31:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773253_12668977152.pth... [2024-06-25 00:31:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772625_12658688000.pth [2024-06-25 00:31:47,056][15401] Updated weights for policy 0, policy_version 773263 (0.0036) [2024-06-25 00:31:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12669190144. Throughput: 0: 42839.0. Samples: 12669344540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 00:31:50,934][15401] Updated weights for policy 0, policy_version 773273 (0.0042) [2024-06-25 00:31:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12669370368. Throughput: 0: 42724.6. Samples: 12669471380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 00:31:54,968][15401] Updated weights for policy 0, policy_version 773283 (0.0022) [2024-06-25 00:31:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12669599744. Throughput: 0: 42596.1. Samples: 12669721280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:31:58,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 00:31:58,916][15401] Updated weights for policy 0, policy_version 773293 (0.0036) [2024-06-25 00:32:02,683][15401] Updated weights for policy 0, policy_version 773303 (0.0024) [2024-06-25 00:32:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12669829120. Throughput: 0: 42664.0. Samples: 12669982040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:03,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 00:32:06,566][15401] Updated weights for policy 0, policy_version 773313 (0.0039) [2024-06-25 00:32:08,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42320.9, 300 sec: 42819.6). Total num frames: 12670009344. Throughput: 0: 42669.0. Samples: 12670110060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:08,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 00:32:10,279][15401] Updated weights for policy 0, policy_version 773323 (0.0029) [2024-06-25 00:32:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 12670238720. Throughput: 0: 42576.2. Samples: 12670358980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 00:32:14,311][15401] Updated weights for policy 0, policy_version 773333 (0.0038) [2024-06-25 00:32:14,901][15349] Signal inference workers to stop experience collection... (187500 times) [2024-06-25 00:32:14,954][15401] InferenceWorker_p0-w0: stopping experience collection (187500 times) [2024-06-25 00:32:14,960][15349] Signal inference workers to resume experience collection... (187500 times) [2024-06-25 00:32:14,988][15401] InferenceWorker_p0-w0: resuming experience collection (187500 times) [2024-06-25 00:32:18,226][15401] Updated weights for policy 0, policy_version 773343 (0.0038) [2024-06-25 00:32:18,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12670451712. Throughput: 0: 42581.3. Samples: 12670619980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:18,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 00:32:21,885][15401] Updated weights for policy 0, policy_version 773353 (0.0036) [2024-06-25 00:32:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12670664704. Throughput: 0: 42574.6. Samples: 12670749680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 00:32:25,942][15401] Updated weights for policy 0, policy_version 773363 (0.0032) [2024-06-25 00:32:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12670894080. Throughput: 0: 42680.5. Samples: 12671002620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 00:32:29,975][15401] Updated weights for policy 0, policy_version 773373 (0.0030) [2024-06-25 00:32:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12671074304. Throughput: 0: 42516.9. Samples: 12671257800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 00:32:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 00:32:33,673][15401] Updated weights for policy 0, policy_version 773383 (0.0038) [2024-06-25 00:32:37,597][15401] Updated weights for policy 0, policy_version 773393 (0.0033) [2024-06-25 00:32:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12671303680. Throughput: 0: 42474.2. Samples: 12671382720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:32:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 00:32:41,227][15401] Updated weights for policy 0, policy_version 773403 (0.0043) [2024-06-25 00:32:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12671516672. Throughput: 0: 42628.4. Samples: 12671639560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:32:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 00:32:45,159][15401] Updated weights for policy 0, policy_version 773413 (0.0039) [2024-06-25 00:32:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12671729664. Throughput: 0: 42746.2. Samples: 12671905620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:32:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 00:32:48,821][15401] Updated weights for policy 0, policy_version 773423 (0.0044) [2024-06-25 00:32:52,636][15401] Updated weights for policy 0, policy_version 773433 (0.0027) [2024-06-25 00:32:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12671942656. Throughput: 0: 42698.1. Samples: 12672031200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:32:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 00:32:56,611][15401] Updated weights for policy 0, policy_version 773443 (0.0041) [2024-06-25 00:32:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12672172032. Throughput: 0: 42655.8. Samples: 12672278500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:32:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 00:33:00,970][15401] Updated weights for policy 0, policy_version 773453 (0.0052) [2024-06-25 00:33:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12672368640. Throughput: 0: 42530.3. Samples: 12672533840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:03,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 00:33:04,414][15401] Updated weights for policy 0, policy_version 773463 (0.0038) [2024-06-25 00:33:08,391][15132] Fps is (10 sec: 39316.9, 60 sec: 42602.0, 300 sec: 42709.3). Total num frames: 12672565248. Throughput: 0: 42447.8. Samples: 12672659880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:08,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 00:33:08,541][15401] Updated weights for policy 0, policy_version 773473 (0.0047) [2024-06-25 00:33:11,930][15401] Updated weights for policy 0, policy_version 773483 (0.0034) [2024-06-25 00:33:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12672794624. Throughput: 0: 42480.1. Samples: 12672914220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:13,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 00:33:16,037][15401] Updated weights for policy 0, policy_version 773493 (0.0036) [2024-06-25 00:33:18,390][15132] Fps is (10 sec: 44242.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12673007616. Throughput: 0: 42506.6. Samples: 12673170600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 00:33:19,501][15401] Updated weights for policy 0, policy_version 773503 (0.0036) [2024-06-25 00:33:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12673204224. Throughput: 0: 42524.9. Samples: 12673296340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 00:33:23,926][15401] Updated weights for policy 0, policy_version 773513 (0.0034) [2024-06-25 00:33:27,297][15401] Updated weights for policy 0, policy_version 773523 (0.0032) [2024-06-25 00:33:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12673449984. Throughput: 0: 42673.8. Samples: 12673559880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 00:33:31,459][15401] Updated weights for policy 0, policy_version 773533 (0.0027) [2024-06-25 00:33:33,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12673662976. Throughput: 0: 42424.4. Samples: 12673814720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 00:33:34,749][15401] Updated weights for policy 0, policy_version 773543 (0.0032) [2024-06-25 00:33:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12673859584. Throughput: 0: 42426.7. Samples: 12673940400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 00:33:39,329][15401] Updated weights for policy 0, policy_version 773553 (0.0039) [2024-06-25 00:33:42,473][15401] Updated weights for policy 0, policy_version 773563 (0.0031) [2024-06-25 00:33:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12674072576. Throughput: 0: 42617.1. Samples: 12674196260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 00:33:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773564_12674072576.pth... [2024-06-25 00:33:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000772940_12663848960.pth [2024-06-25 00:33:47,088][15401] Updated weights for policy 0, policy_version 773573 (0.0033) [2024-06-25 00:33:47,264][15349] Signal inference workers to stop experience collection... (187550 times) [2024-06-25 00:33:47,264][15349] Signal inference workers to resume experience collection... (187550 times) [2024-06-25 00:33:47,299][15401] InferenceWorker_p0-w0: stopping experience collection (187550 times) [2024-06-25 00:33:47,299][15401] InferenceWorker_p0-w0: resuming experience collection (187550 times) [2024-06-25 00:33:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12674301952. Throughput: 0: 42645.3. Samples: 12674452880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 00:33:50,344][15401] Updated weights for policy 0, policy_version 773583 (0.0041) [2024-06-25 00:33:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12674498560. Throughput: 0: 42718.6. Samples: 12674582160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:53,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 00:33:54,493][15401] Updated weights for policy 0, policy_version 773593 (0.0031) [2024-06-25 00:33:57,903][15401] Updated weights for policy 0, policy_version 773603 (0.0034) [2024-06-25 00:33:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12674711552. Throughput: 0: 42843.6. Samples: 12674842180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:33:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 00:34:01,864][15401] Updated weights for policy 0, policy_version 773613 (0.0031) [2024-06-25 00:34:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 12674940928. Throughput: 0: 42852.1. Samples: 12675098940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:34:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 00:34:05,623][15401] Updated weights for policy 0, policy_version 773623 (0.0029) [2024-06-25 00:34:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42872.5, 300 sec: 42654.0). Total num frames: 12675137536. Throughput: 0: 42842.2. Samples: 12675224240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:34:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 00:34:09,359][15401] Updated weights for policy 0, policy_version 773633 (0.0032) [2024-06-25 00:34:13,136][15401] Updated weights for policy 0, policy_version 773643 (0.0033) [2024-06-25 00:34:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12675366912. Throughput: 0: 42836.5. Samples: 12675487520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:34:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 00:34:17,037][15401] Updated weights for policy 0, policy_version 773653 (0.0023) [2024-06-25 00:34:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 12675579904. Throughput: 0: 42832.1. Samples: 12675742160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 00:34:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 00:34:20,726][15401] Updated weights for policy 0, policy_version 773663 (0.0026) [2024-06-25 00:34:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12675776512. Throughput: 0: 42722.6. Samples: 12675862920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 00:34:24,660][15401] Updated weights for policy 0, policy_version 773673 (0.0041) [2024-06-25 00:34:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12675989504. Throughput: 0: 42819.0. Samples: 12676123120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 00:34:28,539][15401] Updated weights for policy 0, policy_version 773683 (0.0037) [2024-06-25 00:34:32,386][15401] Updated weights for policy 0, policy_version 773693 (0.0027) [2024-06-25 00:34:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12676218880. Throughput: 0: 42699.1. Samples: 12676374340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 00:34:36,365][15401] Updated weights for policy 0, policy_version 773703 (0.0033) [2024-06-25 00:34:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12676415488. Throughput: 0: 42780.5. Samples: 12676507280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:38,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 00:34:40,056][15401] Updated weights for policy 0, policy_version 773713 (0.0028) [2024-06-25 00:34:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12676628480. Throughput: 0: 42571.9. Samples: 12676757920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 00:34:44,048][15401] Updated weights for policy 0, policy_version 773723 (0.0044) [2024-06-25 00:34:47,768][15401] Updated weights for policy 0, policy_version 773733 (0.0033) [2024-06-25 00:34:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12676857856. Throughput: 0: 42564.0. Samples: 12677014320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 00:34:52,005][15401] Updated weights for policy 0, policy_version 773743 (0.0038) [2024-06-25 00:34:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12677070848. Throughput: 0: 42666.1. Samples: 12677144220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:53,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-25 00:34:55,401][15401] Updated weights for policy 0, policy_version 773753 (0.0043) [2024-06-25 00:34:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12677267456. Throughput: 0: 42379.2. Samples: 12677394580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:34:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 00:34:59,607][15401] Updated weights for policy 0, policy_version 773763 (0.0032) [2024-06-25 00:35:03,343][15401] Updated weights for policy 0, policy_version 773773 (0.0026) [2024-06-25 00:35:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12677496832. Throughput: 0: 42577.5. Samples: 12677658140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 00:35:07,068][15401] Updated weights for policy 0, policy_version 773783 (0.0038) [2024-06-25 00:35:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12677726208. Throughput: 0: 42781.8. Samples: 12677788100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 00:35:11,047][15401] Updated weights for policy 0, policy_version 773793 (0.0040) [2024-06-25 00:35:13,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12677906432. Throughput: 0: 42616.9. Samples: 12678040880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:13,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 00:35:14,712][15401] Updated weights for policy 0, policy_version 773803 (0.0029) [2024-06-25 00:35:15,449][15349] Signal inference workers to stop experience collection... (187600 times) [2024-06-25 00:35:15,449][15349] Signal inference workers to resume experience collection... (187600 times) [2024-06-25 00:35:15,480][15401] InferenceWorker_p0-w0: stopping experience collection (187600 times) [2024-06-25 00:35:15,480][15401] InferenceWorker_p0-w0: resuming experience collection (187600 times) [2024-06-25 00:35:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12678119424. Throughput: 0: 42869.3. Samples: 12678303460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 00:35:18,765][15401] Updated weights for policy 0, policy_version 773813 (0.0037) [2024-06-25 00:35:22,516][15401] Updated weights for policy 0, policy_version 773823 (0.0031) [2024-06-25 00:35:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12678365184. Throughput: 0: 42598.1. Samples: 12678424200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 00:35:26,394][15401] Updated weights for policy 0, policy_version 773833 (0.0033) [2024-06-25 00:35:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12678578176. Throughput: 0: 42697.7. Samples: 12678679320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:28,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 00:35:30,138][15401] Updated weights for policy 0, policy_version 773843 (0.0034) [2024-06-25 00:35:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12678758400. Throughput: 0: 42882.2. Samples: 12678944020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 00:35:34,034][15401] Updated weights for policy 0, policy_version 773853 (0.0036) [2024-06-25 00:35:37,636][15401] Updated weights for policy 0, policy_version 773863 (0.0022) [2024-06-25 00:35:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12679004160. Throughput: 0: 42751.6. Samples: 12679068040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 00:35:41,585][15401] Updated weights for policy 0, policy_version 773873 (0.0030) [2024-06-25 00:35:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12679217152. Throughput: 0: 42898.1. Samples: 12679325000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 00:35:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773878_12679217152.pth... [2024-06-25 00:35:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773253_12668977152.pth [2024-06-25 00:35:45,123][15401] Updated weights for policy 0, policy_version 773883 (0.0037) [2024-06-25 00:35:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12679397376. Throughput: 0: 42878.2. Samples: 12679587660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:48,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 00:35:49,344][15401] Updated weights for policy 0, policy_version 773893 (0.0038) [2024-06-25 00:35:52,934][15401] Updated weights for policy 0, policy_version 773903 (0.0030) [2024-06-25 00:35:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12679643136. Throughput: 0: 42734.7. Samples: 12679711160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 00:35:56,967][15401] Updated weights for policy 0, policy_version 773913 (0.0036) [2024-06-25 00:35:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12679856128. Throughput: 0: 42812.0. Samples: 12679967420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:35:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 00:36:00,461][15401] Updated weights for policy 0, policy_version 773923 (0.0038) [2024-06-25 00:36:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12680036352. Throughput: 0: 42943.8. Samples: 12680235920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:36:03,390][15132] Avg episode reward: [(0, '0.922')] [2024-06-25 00:36:04,440][15401] Updated weights for policy 0, policy_version 773933 (0.0037) [2024-06-25 00:36:07,856][15401] Updated weights for policy 0, policy_version 773943 (0.0030) [2024-06-25 00:36:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 12680282112. Throughput: 0: 42888.9. Samples: 12680354300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:08,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 00:36:12,082][15401] Updated weights for policy 0, policy_version 773953 (0.0029) [2024-06-25 00:36:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12680495104. Throughput: 0: 43021.0. Samples: 12680615260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 00:36:15,376][15401] Updated weights for policy 0, policy_version 773963 (0.0039) [2024-06-25 00:36:16,470][15349] Signal inference workers to stop experience collection... (187650 times) [2024-06-25 00:36:16,471][15349] Signal inference workers to resume experience collection... (187650 times) [2024-06-25 00:36:16,499][15401] InferenceWorker_p0-w0: stopping experience collection (187650 times) [2024-06-25 00:36:16,500][15401] InferenceWorker_p0-w0: resuming experience collection (187650 times) [2024-06-25 00:36:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12680691712. Throughput: 0: 42915.6. Samples: 12680875220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 00:36:20,131][15401] Updated weights for policy 0, policy_version 773973 (0.0043) [2024-06-25 00:36:23,020][15401] Updated weights for policy 0, policy_version 773983 (0.0034) [2024-06-25 00:36:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12680937472. Throughput: 0: 42847.5. Samples: 12680996180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 00:36:28,009][15401] Updated weights for policy 0, policy_version 773993 (0.0032) [2024-06-25 00:36:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 12681134080. Throughput: 0: 42940.2. Samples: 12681257300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:28,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 00:36:30,649][15401] Updated weights for policy 0, policy_version 774003 (0.0026) [2024-06-25 00:36:33,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12681314304. Throughput: 0: 42714.0. Samples: 12681509900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:33,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 00:36:35,765][15401] Updated weights for policy 0, policy_version 774013 (0.0032) [2024-06-25 00:36:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12681576448. Throughput: 0: 42770.7. Samples: 12681635840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 00:36:38,600][15401] Updated weights for policy 0, policy_version 774023 (0.0032) [2024-06-25 00:36:43,289][15401] Updated weights for policy 0, policy_version 774033 (0.0026) [2024-06-25 00:36:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12681756672. Throughput: 0: 42915.5. Samples: 12681898620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 00:36:46,265][15401] Updated weights for policy 0, policy_version 774043 (0.0039) [2024-06-25 00:36:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12681969664. Throughput: 0: 42484.3. Samples: 12682147720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 00:36:50,807][15401] Updated weights for policy 0, policy_version 774053 (0.0043) [2024-06-25 00:36:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12682215424. Throughput: 0: 42784.9. Samples: 12682279520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 00:36:53,867][15401] Updated weights for policy 0, policy_version 774063 (0.0038) [2024-06-25 00:36:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12682379264. Throughput: 0: 42821.9. Samples: 12682542240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:36:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 00:36:58,650][15401] Updated weights for policy 0, policy_version 774073 (0.0037) [2024-06-25 00:37:01,790][15401] Updated weights for policy 0, policy_version 774083 (0.0025) [2024-06-25 00:37:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42766.0). Total num frames: 12682625024. Throughput: 0: 42411.6. Samples: 12682783740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:03,390][15132] Avg episode reward: [(0, '0.880')] [2024-06-25 00:37:06,276][15401] Updated weights for policy 0, policy_version 774093 (0.0043) [2024-06-25 00:37:08,392][15132] Fps is (10 sec: 47501.6, 60 sec: 42871.4, 300 sec: 42764.6). Total num frames: 12682854400. Throughput: 0: 42738.2. Samples: 12682919500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:08,393][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 00:37:09,305][15401] Updated weights for policy 0, policy_version 774103 (0.0038) [2024-06-25 00:37:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12683018240. Throughput: 0: 42591.9. Samples: 12683173940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 00:37:13,991][15401] Updated weights for policy 0, policy_version 774113 (0.0026) [2024-06-25 00:37:17,039][15401] Updated weights for policy 0, policy_version 774123 (0.0030) [2024-06-25 00:37:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12683280384. Throughput: 0: 42510.2. Samples: 12683422760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 00:37:21,823][15401] Updated weights for policy 0, policy_version 774133 (0.0044) [2024-06-25 00:37:23,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 12683493376. Throughput: 0: 42765.3. Samples: 12683560380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:23,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 00:37:24,930][15401] Updated weights for policy 0, policy_version 774143 (0.0043) [2024-06-25 00:37:28,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12683657216. Throughput: 0: 42393.3. Samples: 12683806320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 00:37:29,570][15401] Updated weights for policy 0, policy_version 774153 (0.0025) [2024-06-25 00:37:30,486][15349] Signal inference workers to stop experience collection... (187700 times) [2024-06-25 00:37:30,486][15349] Signal inference workers to resume experience collection... (187700 times) [2024-06-25 00:37:30,511][15401] InferenceWorker_p0-w0: stopping experience collection (187700 times) [2024-06-25 00:37:30,511][15401] InferenceWorker_p0-w0: resuming experience collection (187700 times) [2024-06-25 00:37:32,544][15401] Updated weights for policy 0, policy_version 774163 (0.0034) [2024-06-25 00:37:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 12683919360. Throughput: 0: 42347.0. Samples: 12684053340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 00:37:37,264][15401] Updated weights for policy 0, policy_version 774173 (0.0030) [2024-06-25 00:37:38,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12684132352. Throughput: 0: 42505.7. Samples: 12684192280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 00:37:40,166][15401] Updated weights for policy 0, policy_version 774183 (0.0040) [2024-06-25 00:37:43,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12684296192. Throughput: 0: 42259.0. Samples: 12684443900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 00:37:43,443][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774189_12684312576.pth... [2024-06-25 00:37:43,515][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773564_12674072576.pth [2024-06-25 00:37:45,033][15401] Updated weights for policy 0, policy_version 774193 (0.0035) [2024-06-25 00:37:47,852][15401] Updated weights for policy 0, policy_version 774203 (0.0047) [2024-06-25 00:37:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12684541952. Throughput: 0: 42492.5. Samples: 12684695900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-25 00:37:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 00:37:52,561][15401] Updated weights for policy 0, policy_version 774213 (0.0034) [2024-06-25 00:37:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12684738560. Throughput: 0: 42458.8. Samples: 12684830040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:37:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 00:37:55,479][15401] Updated weights for policy 0, policy_version 774223 (0.0030) [2024-06-25 00:37:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12684951552. Throughput: 0: 42364.8. Samples: 12685080360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:37:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:38:00,232][15401] Updated weights for policy 0, policy_version 774233 (0.0026) [2024-06-25 00:38:03,275][15401] Updated weights for policy 0, policy_version 774243 (0.0040) [2024-06-25 00:38:03,393][15132] Fps is (10 sec: 45858.6, 60 sec: 42868.9, 300 sec: 42820.2). Total num frames: 12685197312. Throughput: 0: 42467.8. Samples: 12685333960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:03,394][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 00:38:07,710][15401] Updated weights for policy 0, policy_version 774253 (0.0032) [2024-06-25 00:38:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41781.0, 300 sec: 42598.4). Total num frames: 12685361152. Throughput: 0: 42351.3. Samples: 12685466080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 00:38:10,730][15401] Updated weights for policy 0, policy_version 774263 (0.0044) [2024-06-25 00:38:13,390][15132] Fps is (10 sec: 40974.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12685606912. Throughput: 0: 42471.0. Samples: 12685717520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:13,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 00:38:15,542][15401] Updated weights for policy 0, policy_version 774273 (0.0040) [2024-06-25 00:38:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12685836288. Throughput: 0: 42715.2. Samples: 12685975520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:18,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 00:38:18,418][15401] Updated weights for policy 0, policy_version 774283 (0.0034) [2024-06-25 00:38:23,184][15401] Updated weights for policy 0, policy_version 774293 (0.0026) [2024-06-25 00:38:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 12686016512. Throughput: 0: 42543.5. Samples: 12686106740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 00:38:26,247][15401] Updated weights for policy 0, policy_version 774303 (0.0029) [2024-06-25 00:38:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12686245888. Throughput: 0: 42633.4. Samples: 12686362400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 00:38:30,682][15401] Updated weights for policy 0, policy_version 774313 (0.0035) [2024-06-25 00:38:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12686458880. Throughput: 0: 42682.2. Samples: 12686616600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 00:38:34,172][15401] Updated weights for policy 0, policy_version 774323 (0.0045) [2024-06-25 00:38:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12686655488. Throughput: 0: 42544.4. Samples: 12686744540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 00:38:38,684][15401] Updated weights for policy 0, policy_version 774333 (0.0035) [2024-06-25 00:38:41,855][15401] Updated weights for policy 0, policy_version 774343 (0.0030) [2024-06-25 00:38:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12686868480. Throughput: 0: 42586.7. Samples: 12686996760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 00:38:46,157][15401] Updated weights for policy 0, policy_version 774353 (0.0035) [2024-06-25 00:38:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12687081472. Throughput: 0: 42725.1. Samples: 12687256440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:48,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 00:38:49,703][15401] Updated weights for policy 0, policy_version 774363 (0.0038) [2024-06-25 00:38:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12687294464. Throughput: 0: 42651.5. Samples: 12687385400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 00:38:53,989][15401] Updated weights for policy 0, policy_version 774373 (0.0036) [2024-06-25 00:38:57,074][15349] Signal inference workers to stop experience collection... (187750 times) [2024-06-25 00:38:57,076][15349] Signal inference workers to resume experience collection... (187750 times) [2024-06-25 00:38:57,096][15401] InferenceWorker_p0-w0: stopping experience collection (187750 times) [2024-06-25 00:38:57,096][15401] InferenceWorker_p0-w0: resuming experience collection (187750 times) [2024-06-25 00:38:57,229][15401] Updated weights for policy 0, policy_version 774383 (0.0046) [2024-06-25 00:38:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12687523840. Throughput: 0: 42649.0. Samples: 12687636720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:38:58,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 00:39:01,786][15401] Updated weights for policy 0, policy_version 774393 (0.0024) [2024-06-25 00:39:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42054.8, 300 sec: 42653.9). Total num frames: 12687720448. Throughput: 0: 42661.9. Samples: 12687895300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 00:39:05,195][15401] Updated weights for policy 0, policy_version 774403 (0.0031) [2024-06-25 00:39:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 12687949824. Throughput: 0: 42530.9. Samples: 12688020620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 00:39:09,283][15401] Updated weights for policy 0, policy_version 774413 (0.0036) [2024-06-25 00:39:12,691][15401] Updated weights for policy 0, policy_version 774423 (0.0035) [2024-06-25 00:39:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12688179200. Throughput: 0: 42631.4. Samples: 12688280920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:13,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 00:39:17,031][15401] Updated weights for policy 0, policy_version 774433 (0.0041) [2024-06-25 00:39:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 12688343040. Throughput: 0: 42714.6. Samples: 12688538760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:18,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 00:39:20,333][15401] Updated weights for policy 0, policy_version 774443 (0.0043) [2024-06-25 00:39:23,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12688588800. Throughput: 0: 42518.2. Samples: 12688657860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 00:39:24,473][15401] Updated weights for policy 0, policy_version 774453 (0.0031) [2024-06-25 00:39:27,950][15401] Updated weights for policy 0, policy_version 774463 (0.0037) [2024-06-25 00:39:28,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12688834560. Throughput: 0: 42761.8. Samples: 12688921040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 00:39:32,316][15401] Updated weights for policy 0, policy_version 774473 (0.0040) [2024-06-25 00:39:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12688998400. Throughput: 0: 42701.9. Samples: 12689178020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 00:39:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 00:39:35,667][15401] Updated weights for policy 0, policy_version 774483 (0.0031) [2024-06-25 00:39:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12689244160. Throughput: 0: 42463.1. Samples: 12689296240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:39:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 00:39:39,749][15401] Updated weights for policy 0, policy_version 774493 (0.0050) [2024-06-25 00:39:43,312][15401] Updated weights for policy 0, policy_version 774503 (0.0039) [2024-06-25 00:39:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12689457152. Throughput: 0: 42831.2. Samples: 12689564120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:39:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 00:39:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774504_12689473536.pth... [2024-06-25 00:39:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000773878_12679217152.pth [2024-06-25 00:39:47,421][15401] Updated weights for policy 0, policy_version 774513 (0.0047) [2024-06-25 00:39:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12689637376. Throughput: 0: 42834.6. Samples: 12689822860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:39:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 00:39:50,853][15401] Updated weights for policy 0, policy_version 774523 (0.0034) [2024-06-25 00:39:53,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 12689866752. Throughput: 0: 42693.1. Samples: 12689941820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:39:53,390][15132] Avg episode reward: [(0, '0.144')] [2024-06-25 00:39:55,027][15401] Updated weights for policy 0, policy_version 774533 (0.0039) [2024-06-25 00:39:58,384][15401] Updated weights for policy 0, policy_version 774543 (0.0034) [2024-06-25 00:39:58,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12690112512. Throughput: 0: 42860.2. Samples: 12690209520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:39:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 00:40:03,039][15401] Updated weights for policy 0, policy_version 774553 (0.0043) [2024-06-25 00:40:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12690292736. Throughput: 0: 42780.0. Samples: 12690463860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:03,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 00:40:06,053][15401] Updated weights for policy 0, policy_version 774563 (0.0035) [2024-06-25 00:40:08,392][15132] Fps is (10 sec: 37673.6, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 12690489344. Throughput: 0: 42837.6. Samples: 12690585660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:08,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 00:40:10,508][15401] Updated weights for policy 0, policy_version 774573 (0.0036) [2024-06-25 00:40:13,118][15349] Signal inference workers to stop experience collection... (187800 times) [2024-06-25 00:40:13,118][15349] Signal inference workers to resume experience collection... (187800 times) [2024-06-25 00:40:13,139][15401] InferenceWorker_p0-w0: stopping experience collection (187800 times) [2024-06-25 00:40:13,139][15401] InferenceWorker_p0-w0: resuming experience collection (187800 times) [2024-06-25 00:40:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 12690735104. Throughput: 0: 42961.0. Samples: 12690854280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 00:40:13,598][15401] Updated weights for policy 0, policy_version 774583 (0.0037) [2024-06-25 00:40:18,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12690915328. Throughput: 0: 42927.5. Samples: 12691109760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 00:40:18,535][15401] Updated weights for policy 0, policy_version 774593 (0.0023) [2024-06-25 00:40:21,404][15401] Updated weights for policy 0, policy_version 774603 (0.0035) [2024-06-25 00:40:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12691144704. Throughput: 0: 42976.4. Samples: 12691230180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 00:40:25,942][15401] Updated weights for policy 0, policy_version 774613 (0.0046) [2024-06-25 00:40:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12691374080. Throughput: 0: 42905.7. Samples: 12691494880. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 00:40:29,158][15401] Updated weights for policy 0, policy_version 774623 (0.0029) [2024-06-25 00:40:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12691554304. Throughput: 0: 42694.2. Samples: 12691744100. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:33,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 00:40:33,899][15401] Updated weights for policy 0, policy_version 774633 (0.0046) [2024-06-25 00:40:36,938][15401] Updated weights for policy 0, policy_version 774643 (0.0035) [2024-06-25 00:40:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12691800064. Throughput: 0: 42737.9. Samples: 12691865020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:38,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 00:40:41,455][15401] Updated weights for policy 0, policy_version 774653 (0.0027) [2024-06-25 00:40:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12691996672. Throughput: 0: 42534.0. Samples: 12692123560. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 00:40:44,443][15401] Updated weights for policy 0, policy_version 774663 (0.0031) [2024-06-25 00:40:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12692209664. Throughput: 0: 42535.5. Samples: 12692377960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 00:40:49,159][15401] Updated weights for policy 0, policy_version 774673 (0.0037) [2024-06-25 00:40:52,328][15401] Updated weights for policy 0, policy_version 774683 (0.0041) [2024-06-25 00:40:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12692439040. Throughput: 0: 42763.1. Samples: 12692509900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 00:40:56,722][15401] Updated weights for policy 0, policy_version 774693 (0.0032) [2024-06-25 00:40:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 12692652032. Throughput: 0: 42589.2. Samples: 12692770800. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:40:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 00:41:00,232][15401] Updated weights for policy 0, policy_version 774703 (0.0021) [2024-06-25 00:41:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 12692848640. Throughput: 0: 42559.0. Samples: 12693024920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:41:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 00:41:04,241][15401] Updated weights for policy 0, policy_version 774713 (0.0033) [2024-06-25 00:41:07,681][15401] Updated weights for policy 0, policy_version 774723 (0.0043) [2024-06-25 00:41:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43417.6, 300 sec: 42709.1). Total num frames: 12693094400. Throughput: 0: 42731.1. Samples: 12693153180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:41:08,393][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 00:41:11,834][15401] Updated weights for policy 0, policy_version 774733 (0.0035) [2024-06-25 00:41:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12693307392. Throughput: 0: 42636.0. Samples: 12693413500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:41:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 00:41:15,194][15401] Updated weights for policy 0, policy_version 774743 (0.0029) [2024-06-25 00:41:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12693487616. Throughput: 0: 42908.4. Samples: 12693674980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-25 00:41:18,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 00:41:19,397][15401] Updated weights for policy 0, policy_version 774753 (0.0043) [2024-06-25 00:41:22,882][15401] Updated weights for policy 0, policy_version 774763 (0.0047) [2024-06-25 00:41:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 12693733376. Throughput: 0: 42949.7. Samples: 12693797760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 00:41:27,365][15401] Updated weights for policy 0, policy_version 774773 (0.0035) [2024-06-25 00:41:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12693929984. Throughput: 0: 43060.9. Samples: 12694061300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 00:41:29,462][15349] Signal inference workers to stop experience collection... (187850 times) [2024-06-25 00:41:29,463][15349] Signal inference workers to resume experience collection... (187850 times) [2024-06-25 00:41:29,486][15401] InferenceWorker_p0-w0: stopping experience collection (187850 times) [2024-06-25 00:41:29,486][15401] InferenceWorker_p0-w0: resuming experience collection (187850 times) [2024-06-25 00:41:30,610][15401] Updated weights for policy 0, policy_version 774783 (0.0030) [2024-06-25 00:41:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 12694126592. Throughput: 0: 43078.6. Samples: 12694316500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 00:41:34,866][15401] Updated weights for policy 0, policy_version 774793 (0.0029) [2024-06-25 00:41:38,135][15401] Updated weights for policy 0, policy_version 774803 (0.0031) [2024-06-25 00:41:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12694372352. Throughput: 0: 42885.0. Samples: 12694439720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 00:41:42,519][15401] Updated weights for policy 0, policy_version 774813 (0.0038) [2024-06-25 00:41:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12694585344. Throughput: 0: 42884.0. Samples: 12694700580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 00:41:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774816_12694585344.pth... [2024-06-25 00:41:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774189_12684312576.pth [2024-06-25 00:41:45,886][15401] Updated weights for policy 0, policy_version 774823 (0.0028) [2024-06-25 00:41:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12694781952. Throughput: 0: 42952.1. Samples: 12694957760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 00:41:50,424][15401] Updated weights for policy 0, policy_version 774833 (0.0035) [2024-06-25 00:41:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12695011328. Throughput: 0: 42936.0. Samples: 12695085200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 00:41:53,448][15401] Updated weights for policy 0, policy_version 774843 (0.0021) [2024-06-25 00:41:57,852][15401] Updated weights for policy 0, policy_version 774853 (0.0033) [2024-06-25 00:41:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12695224320. Throughput: 0: 43043.6. Samples: 12695350460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:41:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 00:42:01,201][15401] Updated weights for policy 0, policy_version 774863 (0.0036) [2024-06-25 00:42:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 12695437312. Throughput: 0: 42876.8. Samples: 12695604440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 00:42:05,304][15401] Updated weights for policy 0, policy_version 774873 (0.0033) [2024-06-25 00:42:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 12695650304. Throughput: 0: 43013.1. Samples: 12695733340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 00:42:08,754][15401] Updated weights for policy 0, policy_version 774883 (0.0037) [2024-06-25 00:42:13,152][15401] Updated weights for policy 0, policy_version 774893 (0.0032) [2024-06-25 00:42:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12695846912. Throughput: 0: 42839.0. Samples: 12695989060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 00:42:16,317][15401] Updated weights for policy 0, policy_version 774903 (0.0028) [2024-06-25 00:42:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 12696092672. Throughput: 0: 42875.2. Samples: 12696245880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 00:42:20,960][15401] Updated weights for policy 0, policy_version 774913 (0.0038) [2024-06-25 00:42:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12696305664. Throughput: 0: 43159.5. Samples: 12696381900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 00:42:23,899][15401] Updated weights for policy 0, policy_version 774923 (0.0030) [2024-06-25 00:42:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12696485888. Throughput: 0: 42914.9. Samples: 12696631740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 00:42:28,497][15401] Updated weights for policy 0, policy_version 774933 (0.0035) [2024-06-25 00:42:31,420][15401] Updated weights for policy 0, policy_version 774943 (0.0038) [2024-06-25 00:42:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 12696731648. Throughput: 0: 42781.7. Samples: 12696882940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 00:42:36,012][15401] Updated weights for policy 0, policy_version 774953 (0.0032) [2024-06-25 00:42:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12696911872. Throughput: 0: 42796.5. Samples: 12697011040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:38,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 00:42:39,569][15401] Updated weights for policy 0, policy_version 774963 (0.0036) [2024-06-25 00:42:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12697141248. Throughput: 0: 42648.9. Samples: 12697269660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 00:42:43,491][15401] Updated weights for policy 0, policy_version 774973 (0.0041) [2024-06-25 00:42:47,186][15401] Updated weights for policy 0, policy_version 774983 (0.0034) [2024-06-25 00:42:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12697370624. Throughput: 0: 42588.2. Samples: 12697520900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:48,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 00:42:51,096][15401] Updated weights for policy 0, policy_version 774993 (0.0031) [2024-06-25 00:42:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12697550848. Throughput: 0: 42583.4. Samples: 12697649600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 00:42:54,881][15401] Updated weights for policy 0, policy_version 775003 (0.0026) [2024-06-25 00:42:55,589][15349] Signal inference workers to stop experience collection... (187900 times) [2024-06-25 00:42:55,617][15401] InferenceWorker_p0-w0: stopping experience collection (187900 times) [2024-06-25 00:42:55,650][15349] Signal inference workers to resume experience collection... (187900 times) [2024-06-25 00:42:55,652][15401] InferenceWorker_p0-w0: resuming experience collection (187900 times) [2024-06-25 00:42:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42654.4). Total num frames: 12697780224. Throughput: 0: 42645.5. Samples: 12697908100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:42:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 00:42:58,865][15401] Updated weights for policy 0, policy_version 775013 (0.0045) [2024-06-25 00:43:02,632][15401] Updated weights for policy 0, policy_version 775023 (0.0032) [2024-06-25 00:43:03,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42875.7). Total num frames: 12698009600. Throughput: 0: 42624.0. Samples: 12698164060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 00:43:03,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 00:43:06,463][15401] Updated weights for policy 0, policy_version 775033 (0.0032) [2024-06-25 00:43:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12698206208. Throughput: 0: 42447.6. Samples: 12698292040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 00:43:10,379][15401] Updated weights for policy 0, policy_version 775043 (0.0028) [2024-06-25 00:43:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12698435584. Throughput: 0: 42497.6. Samples: 12698544140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 00:43:14,190][15401] Updated weights for policy 0, policy_version 775053 (0.0046) [2024-06-25 00:43:18,169][15401] Updated weights for policy 0, policy_version 775063 (0.0034) [2024-06-25 00:43:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 12698648576. Throughput: 0: 42644.8. Samples: 12698802060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:18,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 00:43:21,960][15401] Updated weights for policy 0, policy_version 775073 (0.0033) [2024-06-25 00:43:23,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 12698845184. Throughput: 0: 42566.7. Samples: 12698926640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:23,393][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 00:43:25,721][15401] Updated weights for policy 0, policy_version 775083 (0.0030) [2024-06-25 00:43:28,392][15132] Fps is (10 sec: 42598.5, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 12699074560. Throughput: 0: 42494.6. Samples: 12699182020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:28,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 00:43:29,700][15401] Updated weights for policy 0, policy_version 775093 (0.0043) [2024-06-25 00:43:33,380][15401] Updated weights for policy 0, policy_version 775103 (0.0031) [2024-06-25 00:43:33,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12699287552. Throughput: 0: 42700.4. Samples: 12699442420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 00:43:37,187][15401] Updated weights for policy 0, policy_version 775113 (0.0033) [2024-06-25 00:43:38,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12699484160. Throughput: 0: 42556.2. Samples: 12699564620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 00:43:41,227][15401] Updated weights for policy 0, policy_version 775123 (0.0025) [2024-06-25 00:43:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12699729920. Throughput: 0: 42559.6. Samples: 12699823280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 00:43:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775130_12699729920.pth... [2024-06-25 00:43:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774504_12689473536.pth [2024-06-25 00:43:44,679][15401] Updated weights for policy 0, policy_version 775133 (0.0040) [2024-06-25 00:43:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12699910144. Throughput: 0: 42622.7. Samples: 12700081980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 00:43:48,894][15401] Updated weights for policy 0, policy_version 775143 (0.0038) [2024-06-25 00:43:52,667][15401] Updated weights for policy 0, policy_version 775153 (0.0026) [2024-06-25 00:43:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12700123136. Throughput: 0: 42599.1. Samples: 12700209000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 00:43:56,430][15401] Updated weights for policy 0, policy_version 775163 (0.0040) [2024-06-25 00:43:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12700352512. Throughput: 0: 42595.7. Samples: 12700460940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:43:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 00:44:00,549][15401] Updated weights for policy 0, policy_version 775173 (0.0037) [2024-06-25 00:44:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12700565504. Throughput: 0: 42722.8. Samples: 12700724480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 00:44:03,850][15401] Updated weights for policy 0, policy_version 775183 (0.0024) [2024-06-25 00:44:07,921][15401] Updated weights for policy 0, policy_version 775193 (0.0036) [2024-06-25 00:44:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12700778496. Throughput: 0: 42904.5. Samples: 12700857240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 00:44:11,637][15401] Updated weights for policy 0, policy_version 775203 (0.0041) [2024-06-25 00:44:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12701007872. Throughput: 0: 42836.9. Samples: 12701109580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 00:44:15,366][15401] Updated weights for policy 0, policy_version 775213 (0.0037) [2024-06-25 00:44:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12701204480. Throughput: 0: 42772.4. Samples: 12701367180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 00:44:19,510][15401] Updated weights for policy 0, policy_version 775223 (0.0042) [2024-06-25 00:44:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 12701401088. Throughput: 0: 42839.1. Samples: 12701492380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 00:44:23,461][15401] Updated weights for policy 0, policy_version 775233 (0.0038) [2024-06-25 00:44:27,018][15401] Updated weights for policy 0, policy_version 775243 (0.0033) [2024-06-25 00:44:27,221][15349] Signal inference workers to stop experience collection... (187950 times) [2024-06-25 00:44:27,221][15349] Signal inference workers to resume experience collection... (187950 times) [2024-06-25 00:44:27,260][15401] InferenceWorker_p0-w0: stopping experience collection (187950 times) [2024-06-25 00:44:27,260][15401] InferenceWorker_p0-w0: resuming experience collection (187950 times) [2024-06-25 00:44:28,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43144.5, 300 sec: 42931.3). Total num frames: 12701663232. Throughput: 0: 42927.9. Samples: 12701755140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:28,393][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 00:44:31,199][15401] Updated weights for policy 0, policy_version 775253 (0.0033) [2024-06-25 00:44:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12701843456. Throughput: 0: 42940.9. Samples: 12702014320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 00:44:34,600][15401] Updated weights for policy 0, policy_version 775263 (0.0037) [2024-06-25 00:44:38,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 12702040064. Throughput: 0: 42736.0. Samples: 12702132120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 00:44:39,000][15401] Updated weights for policy 0, policy_version 775273 (0.0037) [2024-06-25 00:44:42,485][15401] Updated weights for policy 0, policy_version 775283 (0.0037) [2024-06-25 00:44:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12702285824. Throughput: 0: 42951.5. Samples: 12702393760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 00:44:46,770][15401] Updated weights for policy 0, policy_version 775293 (0.0035) [2024-06-25 00:44:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12702482432. Throughput: 0: 42796.8. Samples: 12702650440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 00:44:48,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 00:44:49,953][15401] Updated weights for policy 0, policy_version 775303 (0.0034) [2024-06-25 00:44:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12702695424. Throughput: 0: 42753.7. Samples: 12702781160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:44:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 00:44:54,160][15401] Updated weights for policy 0, policy_version 775313 (0.0033) [2024-06-25 00:44:57,499][15401] Updated weights for policy 0, policy_version 775323 (0.0038) [2024-06-25 00:44:58,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12702924800. Throughput: 0: 42881.3. Samples: 12703039240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:44:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 00:45:01,659][15401] Updated weights for policy 0, policy_version 775333 (0.0031) [2024-06-25 00:45:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 12703121408. Throughput: 0: 42987.7. Samples: 12703301620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 00:45:04,948][15401] Updated weights for policy 0, policy_version 775343 (0.0031) [2024-06-25 00:45:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12703350784. Throughput: 0: 42868.8. Samples: 12703421480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:08,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 00:45:09,414][15401] Updated weights for policy 0, policy_version 775353 (0.0036) [2024-06-25 00:45:12,925][15401] Updated weights for policy 0, policy_version 775363 (0.0030) [2024-06-25 00:45:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12703580160. Throughput: 0: 42807.6. Samples: 12703681380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:13,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 00:45:16,897][15401] Updated weights for policy 0, policy_version 775373 (0.0027) [2024-06-25 00:45:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12703776768. Throughput: 0: 42837.7. Samples: 12703942020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 00:45:20,385][15401] Updated weights for policy 0, policy_version 775383 (0.0043) [2024-06-25 00:45:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12703989760. Throughput: 0: 43046.8. Samples: 12704069220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 00:45:24,561][15401] Updated weights for policy 0, policy_version 775393 (0.0031) [2024-06-25 00:45:27,738][15401] Updated weights for policy 0, policy_version 775403 (0.0031) [2024-06-25 00:45:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 12704219136. Throughput: 0: 42971.0. Samples: 12704327460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:28,391][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:45:32,063][15401] Updated weights for policy 0, policy_version 775413 (0.0028) [2024-06-25 00:45:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12704432128. Throughput: 0: 43069.1. Samples: 12704588440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 00:45:35,231][15401] Updated weights for policy 0, policy_version 775423 (0.0031) [2024-06-25 00:45:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 12704645120. Throughput: 0: 42899.6. Samples: 12704711640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 00:45:39,950][15401] Updated weights for policy 0, policy_version 775433 (0.0034) [2024-06-25 00:45:43,312][15401] Updated weights for policy 0, policy_version 775443 (0.0032) [2024-06-25 00:45:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12704858112. Throughput: 0: 42976.8. Samples: 12704973200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 00:45:43,490][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775444_12704874496.pth... [2024-06-25 00:45:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000774816_12694585344.pth [2024-06-25 00:45:47,496][15401] Updated weights for policy 0, policy_version 775453 (0.0034) [2024-06-25 00:45:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 12705054720. Throughput: 0: 42813.4. Samples: 12705228220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 00:45:50,894][15401] Updated weights for policy 0, policy_version 775463 (0.0044) [2024-06-25 00:45:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12705267712. Throughput: 0: 42954.2. Samples: 12705354420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 00:45:55,120][15401] Updated weights for policy 0, policy_version 775473 (0.0045) [2024-06-25 00:45:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12705480704. Throughput: 0: 42815.6. Samples: 12705608080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:45:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 00:45:58,811][15401] Updated weights for policy 0, policy_version 775483 (0.0037) [2024-06-25 00:46:00,453][15349] Signal inference workers to stop experience collection... (188000 times) [2024-06-25 00:46:00,453][15349] Signal inference workers to resume experience collection... (188000 times) [2024-06-25 00:46:00,507][15401] InferenceWorker_p0-w0: stopping experience collection (188000 times) [2024-06-25 00:46:00,507][15401] InferenceWorker_p0-w0: resuming experience collection (188000 times) [2024-06-25 00:46:02,584][15401] Updated weights for policy 0, policy_version 775493 (0.0031) [2024-06-25 00:46:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12705693696. Throughput: 0: 42902.8. Samples: 12705872640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 00:46:06,293][15401] Updated weights for policy 0, policy_version 775503 (0.0038) [2024-06-25 00:46:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12705923072. Throughput: 0: 42880.8. Samples: 12705998860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:08,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 00:46:10,130][15401] Updated weights for policy 0, policy_version 775513 (0.0043) [2024-06-25 00:46:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 12706136064. Throughput: 0: 42811.6. Samples: 12706253980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:13,390][15132] Avg episode reward: [(0, '0.147')] [2024-06-25 00:46:13,810][15401] Updated weights for policy 0, policy_version 775523 (0.0027) [2024-06-25 00:46:17,683][15401] Updated weights for policy 0, policy_version 775533 (0.0029) [2024-06-25 00:46:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12706332672. Throughput: 0: 42703.0. Samples: 12706510080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 00:46:21,347][15401] Updated weights for policy 0, policy_version 775543 (0.0036) [2024-06-25 00:46:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12706545664. Throughput: 0: 42812.1. Samples: 12706638180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 00:46:25,187][15401] Updated weights for policy 0, policy_version 775553 (0.0032) [2024-06-25 00:46:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12706791424. Throughput: 0: 42812.0. Samples: 12706899740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:28,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-25 00:46:28,846][15401] Updated weights for policy 0, policy_version 775563 (0.0022) [2024-06-25 00:46:32,815][15401] Updated weights for policy 0, policy_version 775573 (0.0032) [2024-06-25 00:46:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12706988032. Throughput: 0: 42836.4. Samples: 12707155860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:46:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 00:46:36,723][15401] Updated weights for policy 0, policy_version 775583 (0.0025) [2024-06-25 00:46:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12707201024. Throughput: 0: 42921.4. Samples: 12707285880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:46:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 00:46:40,534][15401] Updated weights for policy 0, policy_version 775593 (0.0032) [2024-06-25 00:46:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12707430400. Throughput: 0: 43063.0. Samples: 12707545920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:46:43,400][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 00:46:44,345][15401] Updated weights for policy 0, policy_version 775603 (0.0046) [2024-06-25 00:46:48,258][15401] Updated weights for policy 0, policy_version 775613 (0.0043) [2024-06-25 00:46:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 12707643392. Throughput: 0: 42885.2. Samples: 12707802480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:46:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 00:46:52,141][15401] Updated weights for policy 0, policy_version 775623 (0.0044) [2024-06-25 00:46:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 12707840000. Throughput: 0: 42948.4. Samples: 12707931640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:46:53,393][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 00:46:55,972][15401] Updated weights for policy 0, policy_version 775633 (0.0034) [2024-06-25 00:46:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12708052992. Throughput: 0: 42873.4. Samples: 12708183280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:46:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 00:46:59,789][15401] Updated weights for policy 0, policy_version 775643 (0.0028) [2024-06-25 00:47:03,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12708265984. Throughput: 0: 43089.8. Samples: 12708449120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 00:47:03,741][15401] Updated weights for policy 0, policy_version 775653 (0.0037) [2024-06-25 00:47:07,370][15401] Updated weights for policy 0, policy_version 775663 (0.0032) [2024-06-25 00:47:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12708495360. Throughput: 0: 43029.3. Samples: 12708574500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 00:47:11,659][15401] Updated weights for policy 0, policy_version 775673 (0.0025) [2024-06-25 00:47:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12708724736. Throughput: 0: 42992.5. Samples: 12708834400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 00:47:14,949][15401] Updated weights for policy 0, policy_version 775683 (0.0031) [2024-06-25 00:47:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12708921344. Throughput: 0: 43061.7. Samples: 12709093640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 00:47:19,111][15401] Updated weights for policy 0, policy_version 775693 (0.0032) [2024-06-25 00:47:22,364][15349] Signal inference workers to stop experience collection... (188050 times) [2024-06-25 00:47:22,364][15349] Signal inference workers to resume experience collection... (188050 times) [2024-06-25 00:47:22,403][15401] InferenceWorker_p0-w0: stopping experience collection (188050 times) [2024-06-25 00:47:22,403][15401] InferenceWorker_p0-w0: resuming experience collection (188050 times) [2024-06-25 00:47:22,495][15401] Updated weights for policy 0, policy_version 775703 (0.0034) [2024-06-25 00:47:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 12709150720. Throughput: 0: 42892.8. Samples: 12709216060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 00:47:26,850][15401] Updated weights for policy 0, policy_version 775713 (0.0033) [2024-06-25 00:47:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12709363712. Throughput: 0: 42796.5. Samples: 12709471760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 00:47:30,207][15401] Updated weights for policy 0, policy_version 775723 (0.0041) [2024-06-25 00:47:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 12709560320. Throughput: 0: 42825.2. Samples: 12709729620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 00:47:34,279][15401] Updated weights for policy 0, policy_version 775733 (0.0031) [2024-06-25 00:47:37,721][15401] Updated weights for policy 0, policy_version 775743 (0.0029) [2024-06-25 00:47:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12709789696. Throughput: 0: 42858.6. Samples: 12709860180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 00:47:41,803][15401] Updated weights for policy 0, policy_version 775753 (0.0035) [2024-06-25 00:47:43,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12709986304. Throughput: 0: 42844.4. Samples: 12710111280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 00:47:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775756_12709986304.pth... [2024-06-25 00:47:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775130_12699729920.pth [2024-06-25 00:47:45,759][15401] Updated weights for policy 0, policy_version 775763 (0.0023) [2024-06-25 00:47:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12710199296. Throughput: 0: 42753.3. Samples: 12710373020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 00:47:49,427][15401] Updated weights for policy 0, policy_version 775773 (0.0043) [2024-06-25 00:47:53,343][15401] Updated weights for policy 0, policy_version 775783 (0.0033) [2024-06-25 00:47:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 12710428672. Throughput: 0: 42795.0. Samples: 12710500280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 00:47:57,225][15401] Updated weights for policy 0, policy_version 775793 (0.0043) [2024-06-25 00:47:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 12710625280. Throughput: 0: 42756.9. Samples: 12710758460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:47:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 00:48:00,971][15401] Updated weights for policy 0, policy_version 775803 (0.0036) [2024-06-25 00:48:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12710821888. Throughput: 0: 42704.5. Samples: 12711015340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:48:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 00:48:05,145][15401] Updated weights for policy 0, policy_version 775813 (0.0030) [2024-06-25 00:48:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12711051264. Throughput: 0: 42785.4. Samples: 12711141400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:48:08,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 00:48:08,554][15401] Updated weights for policy 0, policy_version 775823 (0.0047) [2024-06-25 00:48:12,965][15401] Updated weights for policy 0, policy_version 775833 (0.0045) [2024-06-25 00:48:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 12711264256. Throughput: 0: 42803.6. Samples: 12711397920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:48:13,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-25 00:48:16,043][15401] Updated weights for policy 0, policy_version 775843 (0.0027) [2024-06-25 00:48:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 12711460864. Throughput: 0: 42757.6. Samples: 12711653700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 00:48:18,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 00:48:20,601][15401] Updated weights for policy 0, policy_version 775853 (0.0049) [2024-06-25 00:48:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 12711706624. Throughput: 0: 42680.1. Samples: 12711780780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 00:48:23,809][15401] Updated weights for policy 0, policy_version 775863 (0.0034) [2024-06-25 00:48:28,252][15401] Updated weights for policy 0, policy_version 775873 (0.0033) [2024-06-25 00:48:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12711919616. Throughput: 0: 42789.7. Samples: 12712036820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 00:48:31,474][15401] Updated weights for policy 0, policy_version 775883 (0.0048) [2024-06-25 00:48:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 12712116224. Throughput: 0: 42549.7. Samples: 12712287760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 00:48:35,835][15401] Updated weights for policy 0, policy_version 775893 (0.0032) [2024-06-25 00:48:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12712329216. Throughput: 0: 42606.1. Samples: 12712417560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:38,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 00:48:39,251][15401] Updated weights for policy 0, policy_version 775903 (0.0044) [2024-06-25 00:48:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12712542208. Throughput: 0: 42590.3. Samples: 12712675020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 00:48:43,402][15401] Updated weights for policy 0, policy_version 775913 (0.0027) [2024-06-25 00:48:47,092][15349] Signal inference workers to stop experience collection... (188100 times) [2024-06-25 00:48:47,144][15401] InferenceWorker_p0-w0: stopping experience collection (188100 times) [2024-06-25 00:48:47,152][15349] Signal inference workers to resume experience collection... (188100 times) [2024-06-25 00:48:47,163][15401] InferenceWorker_p0-w0: resuming experience collection (188100 times) [2024-06-25 00:48:47,296][15401] Updated weights for policy 0, policy_version 775923 (0.0035) [2024-06-25 00:48:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12712771584. Throughput: 0: 42244.4. Samples: 12712916340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 00:48:51,670][15401] Updated weights for policy 0, policy_version 775933 (0.0030) [2024-06-25 00:48:53,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 12712968192. Throughput: 0: 42376.4. Samples: 12713048440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:53,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 00:48:55,111][15401] Updated weights for policy 0, policy_version 775943 (0.0042) [2024-06-25 00:48:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 12713164800. Throughput: 0: 42251.8. Samples: 12713299260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:48:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 00:48:59,399][15401] Updated weights for policy 0, policy_version 775953 (0.0035) [2024-06-25 00:49:02,664][15401] Updated weights for policy 0, policy_version 775963 (0.0031) [2024-06-25 00:49:03,390][15132] Fps is (10 sec: 44246.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12713410560. Throughput: 0: 42130.0. Samples: 12713549560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:03,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 00:49:06,921][15401] Updated weights for policy 0, policy_version 775973 (0.0035) [2024-06-25 00:49:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12713607168. Throughput: 0: 42327.4. Samples: 12713685520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:08,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 00:49:10,375][15401] Updated weights for policy 0, policy_version 775983 (0.0037) [2024-06-25 00:49:13,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 12713787392. Throughput: 0: 42255.3. Samples: 12713938300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:13,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 00:49:14,511][15401] Updated weights for policy 0, policy_version 775993 (0.0042) [2024-06-25 00:49:18,049][15401] Updated weights for policy 0, policy_version 776003 (0.0034) [2024-06-25 00:49:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 12714065920. Throughput: 0: 42357.8. Samples: 12714193860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 00:49:22,099][15401] Updated weights for policy 0, policy_version 776013 (0.0034) [2024-06-25 00:49:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 12714246144. Throughput: 0: 42541.4. Samples: 12714331920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:23,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 00:49:25,525][15401] Updated weights for policy 0, policy_version 776023 (0.0039) [2024-06-25 00:49:28,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 12714442752. Throughput: 0: 42351.0. Samples: 12714580820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 00:49:29,830][15401] Updated weights for policy 0, policy_version 776033 (0.0031) [2024-06-25 00:49:33,081][15401] Updated weights for policy 0, policy_version 776043 (0.0029) [2024-06-25 00:49:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12714704896. Throughput: 0: 42703.5. Samples: 12714838000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 00:49:37,487][15401] Updated weights for policy 0, policy_version 776053 (0.0040) [2024-06-25 00:49:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 12714868736. Throughput: 0: 42845.4. Samples: 12714976380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 00:49:40,693][15401] Updated weights for policy 0, policy_version 776063 (0.0036) [2024-06-25 00:49:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 12715098112. Throughput: 0: 42780.5. Samples: 12715224380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 00:49:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776068_12715098112.pth... [2024-06-25 00:49:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775444_12704874496.pth [2024-06-25 00:49:45,604][15401] Updated weights for policy 0, policy_version 776073 (0.0047) [2024-06-25 00:49:48,306][15401] Updated weights for policy 0, policy_version 776083 (0.0042) [2024-06-25 00:49:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12715343872. Throughput: 0: 42880.2. Samples: 12715479160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 00:49:53,173][15401] Updated weights for policy 0, policy_version 776093 (0.0030) [2024-06-25 00:49:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 12715507712. Throughput: 0: 42782.3. Samples: 12715610720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 00:49:55,988][15401] Updated weights for policy 0, policy_version 776103 (0.0038) [2024-06-25 00:49:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 12715753472. Throughput: 0: 42820.0. Samples: 12715865200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:49:58,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 00:50:00,824][15401] Updated weights for policy 0, policy_version 776113 (0.0056) [2024-06-25 00:50:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12715966464. Throughput: 0: 42836.0. Samples: 12716121480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-25 00:50:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 00:50:03,876][15349] Signal inference workers to stop experience collection... (188150 times) [2024-06-25 00:50:03,931][15401] InferenceWorker_p0-w0: stopping experience collection (188150 times) [2024-06-25 00:50:03,931][15349] Signal inference workers to resume experience collection... (188150 times) [2024-06-25 00:50:03,939][15401] Updated weights for policy 0, policy_version 776123 (0.0028) [2024-06-25 00:50:03,952][15401] InferenceWorker_p0-w0: resuming experience collection (188150 times) [2024-06-25 00:50:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12716146688. Throughput: 0: 42674.4. Samples: 12716252260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 00:50:08,517][15401] Updated weights for policy 0, policy_version 776133 (0.0034) [2024-06-25 00:50:11,482][15401] Updated weights for policy 0, policy_version 776143 (0.0027) [2024-06-25 00:50:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 12716408832. Throughput: 0: 42825.2. Samples: 12716508060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:13,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 00:50:15,937][15401] Updated weights for policy 0, policy_version 776153 (0.0035) [2024-06-25 00:50:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12716605440. Throughput: 0: 42852.1. Samples: 12716766340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:18,396][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 00:50:19,265][15401] Updated weights for policy 0, policy_version 776163 (0.0047) [2024-06-25 00:50:23,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12716785664. Throughput: 0: 42582.7. Samples: 12716892600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 00:50:23,617][15401] Updated weights for policy 0, policy_version 776173 (0.0035) [2024-06-25 00:50:26,782][15401] Updated weights for policy 0, policy_version 776183 (0.0035) [2024-06-25 00:50:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12717047808. Throughput: 0: 42803.7. Samples: 12717150540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:28,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 00:50:31,249][15401] Updated weights for policy 0, policy_version 776193 (0.0033) [2024-06-25 00:50:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12717260800. Throughput: 0: 42973.3. Samples: 12717412960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 00:50:34,302][15401] Updated weights for policy 0, policy_version 776203 (0.0028) [2024-06-25 00:50:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12717441024. Throughput: 0: 42860.1. Samples: 12717539420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 00:50:38,785][15401] Updated weights for policy 0, policy_version 776213 (0.0041) [2024-06-25 00:50:41,782][15401] Updated weights for policy 0, policy_version 776223 (0.0026) [2024-06-25 00:50:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12717686784. Throughput: 0: 42870.1. Samples: 12717794360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:43,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 00:50:46,793][15401] Updated weights for policy 0, policy_version 776233 (0.0032) [2024-06-25 00:50:48,396][15132] Fps is (10 sec: 45845.3, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 12717899776. Throughput: 0: 42857.8. Samples: 12718050360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:48,396][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 00:50:49,662][15401] Updated weights for policy 0, policy_version 776243 (0.0039) [2024-06-25 00:50:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12718063616. Throughput: 0: 42778.2. Samples: 12718177280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:53,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 00:50:54,264][15401] Updated weights for policy 0, policy_version 776253 (0.0040) [2024-06-25 00:50:57,167][15401] Updated weights for policy 0, policy_version 776263 (0.0033) [2024-06-25 00:50:58,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12718325760. Throughput: 0: 42853.7. Samples: 12718436380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:50:58,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 00:51:01,793][15401] Updated weights for policy 0, policy_version 776273 (0.0038) [2024-06-25 00:51:03,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12718555136. Throughput: 0: 42867.6. Samples: 12718695380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:03,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 00:51:05,322][15401] Updated weights for policy 0, policy_version 776283 (0.0037) [2024-06-25 00:51:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12718718976. Throughput: 0: 42826.2. Samples: 12718819780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 00:51:09,448][15401] Updated weights for policy 0, policy_version 776293 (0.0041) [2024-06-25 00:51:09,652][15349] Signal inference workers to stop experience collection... (188200 times) [2024-06-25 00:51:09,653][15349] Signal inference workers to resume experience collection... (188200 times) [2024-06-25 00:51:09,704][15401] InferenceWorker_p0-w0: stopping experience collection (188200 times) [2024-06-25 00:51:09,704][15401] InferenceWorker_p0-w0: resuming experience collection (188200 times) [2024-06-25 00:51:12,743][15401] Updated weights for policy 0, policy_version 776303 (0.0040) [2024-06-25 00:51:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 12718964736. Throughput: 0: 42796.4. Samples: 12719076380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 00:51:17,247][15401] Updated weights for policy 0, policy_version 776313 (0.0040) [2024-06-25 00:51:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12719161344. Throughput: 0: 42622.7. Samples: 12719330980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 00:51:20,543][15401] Updated weights for policy 0, policy_version 776323 (0.0028) [2024-06-25 00:51:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12719357952. Throughput: 0: 42471.5. Samples: 12719450640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:23,395][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 00:51:24,944][15401] Updated weights for policy 0, policy_version 776333 (0.0029) [2024-06-25 00:51:27,999][15401] Updated weights for policy 0, policy_version 776343 (0.0039) [2024-06-25 00:51:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12719620096. Throughput: 0: 42675.8. Samples: 12719714760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 00:51:32,597][15401] Updated weights for policy 0, policy_version 776353 (0.0041) [2024-06-25 00:51:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12719816704. Throughput: 0: 42798.5. Samples: 12719976020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 00:51:35,625][15401] Updated weights for policy 0, policy_version 776363 (0.0034) [2024-06-25 00:51:38,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12720013312. Throughput: 0: 42820.9. Samples: 12720104220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 00:51:40,303][15401] Updated weights for policy 0, policy_version 776373 (0.0026) [2024-06-25 00:51:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 12720242688. Throughput: 0: 42662.7. Samples: 12720356300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:43,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 00:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776382_12720242688.pth... [2024-06-25 00:51:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000775756_12709986304.pth [2024-06-25 00:51:43,739][15401] Updated weights for policy 0, policy_version 776383 (0.0045) [2024-06-25 00:51:47,809][15401] Updated weights for policy 0, policy_version 776393 (0.0028) [2024-06-25 00:51:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42603.0, 300 sec: 42765.4). Total num frames: 12720455680. Throughput: 0: 42876.0. Samples: 12720624800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 00:51:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 00:51:51,217][15401] Updated weights for policy 0, policy_version 776403 (0.0026) [2024-06-25 00:51:53,390][15132] Fps is (10 sec: 40969.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12720652288. Throughput: 0: 42887.8. Samples: 12720749740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:51:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 00:51:55,671][15401] Updated weights for policy 0, policy_version 776413 (0.0042) [2024-06-25 00:51:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12720881664. Throughput: 0: 42701.2. Samples: 12720997940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:51:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 00:51:58,881][15401] Updated weights for policy 0, policy_version 776423 (0.0044) [2024-06-25 00:52:03,011][15401] Updated weights for policy 0, policy_version 776433 (0.0046) [2024-06-25 00:52:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12721078272. Throughput: 0: 42930.1. Samples: 12721262840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:03,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-25 00:52:06,587][15401] Updated weights for policy 0, policy_version 776443 (0.0029) [2024-06-25 00:52:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12721307648. Throughput: 0: 43100.9. Samples: 12721390180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 00:52:10,515][15401] Updated weights for policy 0, policy_version 776453 (0.0033) [2024-06-25 00:52:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 12721520640. Throughput: 0: 42905.1. Samples: 12721645500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 00:52:14,267][15401] Updated weights for policy 0, policy_version 776463 (0.0033) [2024-06-25 00:52:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12721717248. Throughput: 0: 42855.2. Samples: 12721904500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:18,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 00:52:18,484][15401] Updated weights for policy 0, policy_version 776473 (0.0031) [2024-06-25 00:52:18,816][15349] Signal inference workers to stop experience collection... (188250 times) [2024-06-25 00:52:18,861][15401] InferenceWorker_p0-w0: stopping experience collection (188250 times) [2024-06-25 00:52:18,931][15349] Signal inference workers to resume experience collection... (188250 times) [2024-06-25 00:52:18,931][15401] InferenceWorker_p0-w0: resuming experience collection (188250 times) [2024-06-25 00:52:22,109][15401] Updated weights for policy 0, policy_version 776483 (0.0036) [2024-06-25 00:52:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12721946624. Throughput: 0: 42811.5. Samples: 12722030740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 00:52:25,997][15401] Updated weights for policy 0, policy_version 776493 (0.0035) [2024-06-25 00:52:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12722176000. Throughput: 0: 43029.8. Samples: 12722292540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 00:52:29,540][15401] Updated weights for policy 0, policy_version 776503 (0.0034) [2024-06-25 00:52:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12722372608. Throughput: 0: 42872.5. Samples: 12722554060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 00:52:33,542][15401] Updated weights for policy 0, policy_version 776513 (0.0040) [2024-06-25 00:52:37,378][15401] Updated weights for policy 0, policy_version 776523 (0.0038) [2024-06-25 00:52:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12722569216. Throughput: 0: 42754.4. Samples: 12722673680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 00:52:41,150][15401] Updated weights for policy 0, policy_version 776533 (0.0037) [2024-06-25 00:52:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43146.1, 300 sec: 42820.5). Total num frames: 12722831360. Throughput: 0: 42925.7. Samples: 12722929600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 00:52:45,040][15401] Updated weights for policy 0, policy_version 776543 (0.0033) [2024-06-25 00:52:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12723011584. Throughput: 0: 42756.5. Samples: 12723186880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 00:52:48,807][15401] Updated weights for policy 0, policy_version 776553 (0.0034) [2024-06-25 00:52:52,597][15401] Updated weights for policy 0, policy_version 776563 (0.0047) [2024-06-25 00:52:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12723224576. Throughput: 0: 42698.7. Samples: 12723311620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 00:52:56,652][15401] Updated weights for policy 0, policy_version 776573 (0.0039) [2024-06-25 00:52:58,393][15132] Fps is (10 sec: 44219.6, 60 sec: 42868.7, 300 sec: 42820.0). Total num frames: 12723453952. Throughput: 0: 42614.6. Samples: 12723563320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:52:58,394][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 00:53:00,252][15401] Updated weights for policy 0, policy_version 776583 (0.0038) [2024-06-25 00:53:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12723650560. Throughput: 0: 42602.5. Samples: 12723821620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 00:53:04,449][15401] Updated weights for policy 0, policy_version 776593 (0.0032) [2024-06-25 00:53:07,816][15401] Updated weights for policy 0, policy_version 776603 (0.0026) [2024-06-25 00:53:08,390][15132] Fps is (10 sec: 40976.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12723863552. Throughput: 0: 42531.1. Samples: 12723944640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 00:53:12,146][15401] Updated weights for policy 0, policy_version 776613 (0.0045) [2024-06-25 00:53:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12724076544. Throughput: 0: 42416.4. Samples: 12724201280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 00:53:16,002][15401] Updated weights for policy 0, policy_version 776623 (0.0048) [2024-06-25 00:53:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12724289536. Throughput: 0: 42300.4. Samples: 12724457580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 00:53:19,595][15401] Updated weights for policy 0, policy_version 776633 (0.0032) [2024-06-25 00:53:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12724502528. Throughput: 0: 42480.3. Samples: 12724585300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:23,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 00:53:23,531][15401] Updated weights for policy 0, policy_version 776643 (0.0026) [2024-06-25 00:53:27,750][15401] Updated weights for policy 0, policy_version 776653 (0.0026) [2024-06-25 00:53:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12724715520. Throughput: 0: 42442.0. Samples: 12724839480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 00:53:31,034][15401] Updated weights for policy 0, policy_version 776663 (0.0033) [2024-06-25 00:53:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12724928512. Throughput: 0: 42523.1. Samples: 12725100420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 00:53:35,283][15401] Updated weights for policy 0, policy_version 776673 (0.0031) [2024-06-25 00:53:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12725157888. Throughput: 0: 42546.2. Samples: 12725226200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 00:53:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 00:53:39,066][15401] Updated weights for policy 0, policy_version 776683 (0.0038) [2024-06-25 00:53:42,770][15401] Updated weights for policy 0, policy_version 776693 (0.0035) [2024-06-25 00:53:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 12725338112. Throughput: 0: 42646.0. Samples: 12725482220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:53:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 00:53:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776694_12725354496.pth... [2024-06-25 00:53:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776068_12715098112.pth [2024-06-25 00:53:46,419][15401] Updated weights for policy 0, policy_version 776703 (0.0035) [2024-06-25 00:53:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12725567488. Throughput: 0: 42585.4. Samples: 12725737960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:53:48,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 00:53:50,616][15401] Updated weights for policy 0, policy_version 776713 (0.0046) [2024-06-25 00:53:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12725796864. Throughput: 0: 42646.7. Samples: 12725863740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:53:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 00:53:53,836][15401] Updated weights for policy 0, policy_version 776723 (0.0034) [2024-06-25 00:53:55,995][15349] Signal inference workers to stop experience collection... (188300 times) [2024-06-25 00:53:55,999][15349] Signal inference workers to resume experience collection... (188300 times) [2024-06-25 00:53:56,055][15401] InferenceWorker_p0-w0: stopping experience collection (188300 times) [2024-06-25 00:53:56,056][15401] InferenceWorker_p0-w0: resuming experience collection (188300 times) [2024-06-25 00:53:58,279][15401] Updated weights for policy 0, policy_version 776733 (0.0033) [2024-06-25 00:53:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42328.2, 300 sec: 42654.0). Total num frames: 12725993472. Throughput: 0: 42663.2. Samples: 12726121120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:53:58,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 00:54:01,581][15401] Updated weights for policy 0, policy_version 776743 (0.0037) [2024-06-25 00:54:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12726206464. Throughput: 0: 42792.1. Samples: 12726383220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 00:54:05,822][15401] Updated weights for policy 0, policy_version 776753 (0.0034) [2024-06-25 00:54:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12726435840. Throughput: 0: 42676.6. Samples: 12726505740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 00:54:09,669][15401] Updated weights for policy 0, policy_version 776763 (0.0029) [2024-06-25 00:54:13,303][15401] Updated weights for policy 0, policy_version 776773 (0.0035) [2024-06-25 00:54:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12726648832. Throughput: 0: 42793.8. Samples: 12726765200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:13,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 00:54:17,109][15401] Updated weights for policy 0, policy_version 776783 (0.0030) [2024-06-25 00:54:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12726845440. Throughput: 0: 42744.0. Samples: 12727023900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 00:54:20,826][15401] Updated weights for policy 0, policy_version 776793 (0.0039) [2024-06-25 00:54:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12727074816. Throughput: 0: 42798.7. Samples: 12727152140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 00:54:24,812][15401] Updated weights for policy 0, policy_version 776803 (0.0034) [2024-06-25 00:54:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12727287808. Throughput: 0: 42891.6. Samples: 12727412340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 00:54:28,444][15401] Updated weights for policy 0, policy_version 776813 (0.0034) [2024-06-25 00:54:32,402][15401] Updated weights for policy 0, policy_version 776823 (0.0036) [2024-06-25 00:54:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12727500800. Throughput: 0: 42961.3. Samples: 12727671220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 00:54:35,951][15401] Updated weights for policy 0, policy_version 776833 (0.0039) [2024-06-25 00:54:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12727730176. Throughput: 0: 42895.2. Samples: 12727794020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 00:54:40,095][15401] Updated weights for policy 0, policy_version 776843 (0.0040) [2024-06-25 00:54:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 12727959552. Throughput: 0: 42964.4. Samples: 12728054520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:43,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 00:54:43,398][15401] Updated weights for policy 0, policy_version 776853 (0.0030) [2024-06-25 00:54:48,175][15401] Updated weights for policy 0, policy_version 776863 (0.0037) [2024-06-25 00:54:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12728123392. Throughput: 0: 42937.0. Samples: 12728315380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:48,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-25 00:54:50,882][15401] Updated weights for policy 0, policy_version 776873 (0.0034) [2024-06-25 00:54:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12728369152. Throughput: 0: 42824.3. Samples: 12728432840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:53,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 00:54:55,786][15401] Updated weights for policy 0, policy_version 776883 (0.0026) [2024-06-25 00:54:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12728582144. Throughput: 0: 42956.0. Samples: 12728698220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:54:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 00:54:58,851][15401] Updated weights for policy 0, policy_version 776893 (0.0030) [2024-06-25 00:55:03,192][15401] Updated weights for policy 0, policy_version 776903 (0.0036) [2024-06-25 00:55:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12728778752. Throughput: 0: 43038.2. Samples: 12728960620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:55:03,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-25 00:55:06,448][15401] Updated weights for policy 0, policy_version 776913 (0.0035) [2024-06-25 00:55:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 12729024512. Throughput: 0: 42844.0. Samples: 12729080120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:55:08,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-25 00:55:10,650][15401] Updated weights for policy 0, policy_version 776923 (0.0037) [2024-06-25 00:55:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12729221120. Throughput: 0: 43084.4. Samples: 12729351140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:55:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 00:55:13,930][15401] Updated weights for policy 0, policy_version 776933 (0.0033) [2024-06-25 00:55:18,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12729417728. Throughput: 0: 42987.9. Samples: 12729605680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:55:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 00:55:18,694][15401] Updated weights for policy 0, policy_version 776943 (0.0022) [2024-06-25 00:55:21,564][15349] Signal inference workers to stop experience collection... (188350 times) [2024-06-25 00:55:21,565][15349] Signal inference workers to resume experience collection... (188350 times) [2024-06-25 00:55:21,615][15401] InferenceWorker_p0-w0: stopping experience collection (188350 times) [2024-06-25 00:55:21,615][15401] InferenceWorker_p0-w0: resuming experience collection (188350 times) [2024-06-25 00:55:21,730][15401] Updated weights for policy 0, policy_version 776953 (0.0045) [2024-06-25 00:55:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12729663488. Throughput: 0: 42971.4. Samples: 12729727740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 00:55:23,391][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 00:55:26,188][15401] Updated weights for policy 0, policy_version 776963 (0.0027) [2024-06-25 00:55:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12729860096. Throughput: 0: 43078.2. Samples: 12729993040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:28,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 00:55:29,163][15401] Updated weights for policy 0, policy_version 776973 (0.0025) [2024-06-25 00:55:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12730056704. Throughput: 0: 42909.6. Samples: 12730246320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:33,396][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 00:55:33,905][15401] Updated weights for policy 0, policy_version 776983 (0.0039) [2024-06-25 00:55:36,862][15401] Updated weights for policy 0, policy_version 776993 (0.0042) [2024-06-25 00:55:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 12730318848. Throughput: 0: 43031.2. Samples: 12730369240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 00:55:41,374][15401] Updated weights for policy 0, policy_version 777003 (0.0045) [2024-06-25 00:55:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 12730515456. Throughput: 0: 43025.3. Samples: 12730634360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 00:55:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777010_12730531840.pth... [2024-06-25 00:55:43,577][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776382_12720242688.pth [2024-06-25 00:55:44,382][15401] Updated weights for policy 0, policy_version 777013 (0.0029) [2024-06-25 00:55:48,396][15132] Fps is (10 sec: 39296.6, 60 sec: 43139.8, 300 sec: 42875.2). Total num frames: 12730712064. Throughput: 0: 42921.0. Samples: 12730892340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:48,397][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 00:55:48,913][15401] Updated weights for policy 0, policy_version 777023 (0.0041) [2024-06-25 00:55:52,202][15401] Updated weights for policy 0, policy_version 777033 (0.0031) [2024-06-25 00:55:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12730957824. Throughput: 0: 43045.2. Samples: 12731017160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 00:55:56,585][15401] Updated weights for policy 0, policy_version 777043 (0.0032) [2024-06-25 00:55:58,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12731154432. Throughput: 0: 42972.9. Samples: 12731284920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:55:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 00:55:59,754][15401] Updated weights for policy 0, policy_version 777053 (0.0027) [2024-06-25 00:56:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12731367424. Throughput: 0: 42936.5. Samples: 12731537820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 00:56:04,353][15401] Updated weights for policy 0, policy_version 777063 (0.0040) [2024-06-25 00:56:07,489][15401] Updated weights for policy 0, policy_version 777073 (0.0032) [2024-06-25 00:56:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12731613184. Throughput: 0: 43082.8. Samples: 12731666460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 00:56:11,968][15401] Updated weights for policy 0, policy_version 777083 (0.0038) [2024-06-25 00:56:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12731809792. Throughput: 0: 43031.6. Samples: 12731929460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 00:56:14,901][15401] Updated weights for policy 0, policy_version 777093 (0.0038) [2024-06-25 00:56:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12732006400. Throughput: 0: 43064.5. Samples: 12732184220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:18,390][15132] Avg episode reward: [(0, '0.866')] [2024-06-25 00:56:19,692][15401] Updated weights for policy 0, policy_version 777103 (0.0029) [2024-06-25 00:56:22,480][15401] Updated weights for policy 0, policy_version 777113 (0.0038) [2024-06-25 00:56:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12732268544. Throughput: 0: 43189.8. Samples: 12732312780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 00:56:27,335][15401] Updated weights for policy 0, policy_version 777123 (0.0027) [2024-06-25 00:56:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12732448768. Throughput: 0: 43228.4. Samples: 12732579640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 00:56:30,054][15401] Updated weights for policy 0, policy_version 777133 (0.0028) [2024-06-25 00:56:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12732661760. Throughput: 0: 42991.4. Samples: 12732826680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:33,392][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 00:56:34,783][15401] Updated weights for policy 0, policy_version 777143 (0.0039) [2024-06-25 00:56:37,994][15349] Signal inference workers to stop experience collection... (188400 times) [2024-06-25 00:56:38,018][15401] InferenceWorker_p0-w0: stopping experience collection (188400 times) [2024-06-25 00:56:38,048][15349] Signal inference workers to resume experience collection... (188400 times) [2024-06-25 00:56:38,052][15401] InferenceWorker_p0-w0: resuming experience collection (188400 times) [2024-06-25 00:56:38,058][15401] Updated weights for policy 0, policy_version 777153 (0.0046) [2024-06-25 00:56:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 12732907520. Throughput: 0: 43133.8. Samples: 12732958180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:38,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-25 00:56:42,378][15401] Updated weights for policy 0, policy_version 777163 (0.0022) [2024-06-25 00:56:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12733071360. Throughput: 0: 42736.5. Samples: 12733208060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 00:56:45,758][15401] Updated weights for policy 0, policy_version 777173 (0.0027) [2024-06-25 00:56:48,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 12733300736. Throughput: 0: 42613.8. Samples: 12733455440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 00:56:50,426][15401] Updated weights for policy 0, policy_version 777183 (0.0035) [2024-06-25 00:56:53,351][15401] Updated weights for policy 0, policy_version 777193 (0.0029) [2024-06-25 00:56:53,391][15132] Fps is (10 sec: 45866.5, 60 sec: 42870.2, 300 sec: 42875.8). Total num frames: 12733530112. Throughput: 0: 42721.7. Samples: 12733589020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:53,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 00:56:57,930][15401] Updated weights for policy 0, policy_version 777203 (0.0037) [2024-06-25 00:56:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12733693952. Throughput: 0: 42577.7. Samples: 12733845460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:56:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 00:57:01,129][15401] Updated weights for policy 0, policy_version 777213 (0.0022) [2024-06-25 00:57:03,389][15132] Fps is (10 sec: 40968.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12733939712. Throughput: 0: 42420.5. Samples: 12734093140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:57:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 00:57:05,727][15401] Updated weights for policy 0, policy_version 777223 (0.0033) [2024-06-25 00:57:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12734136320. Throughput: 0: 42590.8. Samples: 12734229360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 00:57:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 00:57:08,747][15401] Updated weights for policy 0, policy_version 777233 (0.0032) [2024-06-25 00:57:13,156][15401] Updated weights for policy 0, policy_version 777243 (0.0040) [2024-06-25 00:57:13,396][15132] Fps is (10 sec: 40933.5, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 12734349312. Throughput: 0: 42317.1. Samples: 12734484180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:13,397][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 00:57:16,551][15401] Updated weights for policy 0, policy_version 777253 (0.0038) [2024-06-25 00:57:18,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12734595072. Throughput: 0: 42445.6. Samples: 12734736740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 00:57:20,667][15401] Updated weights for policy 0, policy_version 777263 (0.0027) [2024-06-25 00:57:23,389][15132] Fps is (10 sec: 44265.5, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 12734791680. Throughput: 0: 42502.8. Samples: 12734870800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 00:57:24,151][15401] Updated weights for policy 0, policy_version 777273 (0.0026) [2024-06-25 00:57:28,175][15401] Updated weights for policy 0, policy_version 777283 (0.0024) [2024-06-25 00:57:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12735004672. Throughput: 0: 42631.0. Samples: 12735126460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 00:57:31,862][15401] Updated weights for policy 0, policy_version 777293 (0.0036) [2024-06-25 00:57:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12735234048. Throughput: 0: 42751.2. Samples: 12735379240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:33,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 00:57:35,740][15401] Updated weights for policy 0, policy_version 777303 (0.0030) [2024-06-25 00:57:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 12735414272. Throughput: 0: 42665.4. Samples: 12735508880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:38,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 00:57:39,562][15401] Updated weights for policy 0, policy_version 777313 (0.0033) [2024-06-25 00:57:43,308][15401] Updated weights for policy 0, policy_version 777323 (0.0028) [2024-06-25 00:57:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 12735660032. Throughput: 0: 42607.5. Samples: 12735762900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:43,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 00:57:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777323_12735660032.pth... [2024-06-25 00:57:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000776694_12725354496.pth [2024-06-25 00:57:47,651][15401] Updated weights for policy 0, policy_version 777333 (0.0030) [2024-06-25 00:57:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12735873024. Throughput: 0: 42638.2. Samples: 12736011860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 00:57:51,378][15349] Signal inference workers to stop experience collection... (188450 times) [2024-06-25 00:57:51,430][15401] InferenceWorker_p0-w0: stopping experience collection (188450 times) [2024-06-25 00:57:51,437][15349] Signal inference workers to resume experience collection... (188450 times) [2024-06-25 00:57:51,444][15401] InferenceWorker_p0-w0: resuming experience collection (188450 times) [2024-06-25 00:57:51,447][15401] Updated weights for policy 0, policy_version 777343 (0.0031) [2024-06-25 00:57:53,389][15132] Fps is (10 sec: 37692.5, 60 sec: 41780.6, 300 sec: 42654.5). Total num frames: 12736036864. Throughput: 0: 42574.6. Samples: 12736145220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:53,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 00:57:55,272][15401] Updated weights for policy 0, policy_version 777353 (0.0042) [2024-06-25 00:57:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12736266240. Throughput: 0: 42347.7. Samples: 12736389560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:57:58,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 00:57:59,059][15401] Updated weights for policy 0, policy_version 777363 (0.0029) [2024-06-25 00:58:03,102][15401] Updated weights for policy 0, policy_version 777373 (0.0035) [2024-06-25 00:58:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12736495616. Throughput: 0: 42502.4. Samples: 12736649340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:03,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-25 00:58:06,522][15401] Updated weights for policy 0, policy_version 777383 (0.0039) [2024-06-25 00:58:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12736692224. Throughput: 0: 42464.0. Samples: 12736781680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:08,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 00:58:10,691][15401] Updated weights for policy 0, policy_version 777393 (0.0033) [2024-06-25 00:58:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42876.0, 300 sec: 42820.5). Total num frames: 12736921600. Throughput: 0: 42387.6. Samples: 12737033900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 00:58:14,062][15401] Updated weights for policy 0, policy_version 777403 (0.0034) [2024-06-25 00:58:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.4, 300 sec: 42709.5). Total num frames: 12737101824. Throughput: 0: 42525.8. Samples: 12737292900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 00:58:18,545][15401] Updated weights for policy 0, policy_version 777413 (0.0029) [2024-06-25 00:58:21,971][15401] Updated weights for policy 0, policy_version 777423 (0.0040) [2024-06-25 00:58:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12737347584. Throughput: 0: 42455.1. Samples: 12737419360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 00:58:26,173][15401] Updated weights for policy 0, policy_version 777433 (0.0039) [2024-06-25 00:58:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12737560576. Throughput: 0: 42543.6. Samples: 12737677260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 00:58:29,500][15401] Updated weights for policy 0, policy_version 777443 (0.0033) [2024-06-25 00:58:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 12737740800. Throughput: 0: 42702.3. Samples: 12737933460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 00:58:33,975][15401] Updated weights for policy 0, policy_version 777453 (0.0034) [2024-06-25 00:58:37,109][15401] Updated weights for policy 0, policy_version 777463 (0.0035) [2024-06-25 00:58:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12738002944. Throughput: 0: 42486.5. Samples: 12738057120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 00:58:41,672][15401] Updated weights for policy 0, policy_version 777473 (0.0037) [2024-06-25 00:58:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 12738199552. Throughput: 0: 42849.5. Samples: 12738317780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 00:58:44,906][15401] Updated weights for policy 0, policy_version 777483 (0.0043) [2024-06-25 00:58:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12738396160. Throughput: 0: 42875.8. Samples: 12738578760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:48,396][15132] Avg episode reward: [(0, '0.848')] [2024-06-25 00:58:49,349][15401] Updated weights for policy 0, policy_version 777493 (0.0028) [2024-06-25 00:58:52,650][15401] Updated weights for policy 0, policy_version 777503 (0.0028) [2024-06-25 00:58:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12738641920. Throughput: 0: 42588.8. Samples: 12738698180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 00:58:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 00:58:56,951][15401] Updated weights for policy 0, policy_version 777513 (0.0029) [2024-06-25 00:58:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12738822144. Throughput: 0: 42740.0. Samples: 12738957200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:58:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 00:59:00,037][15349] Signal inference workers to stop experience collection... (188500 times) [2024-06-25 00:59:00,070][15401] InferenceWorker_p0-w0: stopping experience collection (188500 times) [2024-06-25 00:59:00,104][15349] Signal inference workers to resume experience collection... (188500 times) [2024-06-25 00:59:00,105][15401] InferenceWorker_p0-w0: resuming experience collection (188500 times) [2024-06-25 00:59:00,293][15401] Updated weights for policy 0, policy_version 777523 (0.0047) [2024-06-25 00:59:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12739018752. Throughput: 0: 42650.6. Samples: 12739212180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 00:59:04,650][15401] Updated weights for policy 0, policy_version 777533 (0.0042) [2024-06-25 00:59:07,848][15401] Updated weights for policy 0, policy_version 777543 (0.0023) [2024-06-25 00:59:08,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12739264512. Throughput: 0: 42648.0. Samples: 12739338520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:08,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 00:59:12,379][15401] Updated weights for policy 0, policy_version 777553 (0.0041) [2024-06-25 00:59:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12739461120. Throughput: 0: 42620.8. Samples: 12739595200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:13,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 00:59:15,579][15401] Updated weights for policy 0, policy_version 777563 (0.0029) [2024-06-25 00:59:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12739674112. Throughput: 0: 42558.5. Samples: 12739848600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 00:59:20,089][15401] Updated weights for policy 0, policy_version 777573 (0.0038) [2024-06-25 00:59:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12739903488. Throughput: 0: 42717.3. Samples: 12739979400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:23,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 00:59:23,526][15401] Updated weights for policy 0, policy_version 777583 (0.0035) [2024-06-25 00:59:27,669][15401] Updated weights for policy 0, policy_version 777593 (0.0043) [2024-06-25 00:59:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12740100096. Throughput: 0: 42582.5. Samples: 12740234000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:28,396][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 00:59:31,193][15401] Updated weights for policy 0, policy_version 777603 (0.0034) [2024-06-25 00:59:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12740329472. Throughput: 0: 42360.6. Samples: 12740484980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 00:59:35,396][15401] Updated weights for policy 0, policy_version 777613 (0.0039) [2024-06-25 00:59:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12740542464. Throughput: 0: 42741.8. Samples: 12740621560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 00:59:38,766][15401] Updated weights for policy 0, policy_version 777623 (0.0021) [2024-06-25 00:59:42,889][15401] Updated weights for policy 0, policy_version 777633 (0.0043) [2024-06-25 00:59:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12740739072. Throughput: 0: 42621.0. Samples: 12740875140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:43,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 00:59:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777633_12740739072.pth... [2024-06-25 00:59:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777010_12730531840.pth [2024-06-25 00:59:46,751][15401] Updated weights for policy 0, policy_version 777643 (0.0042) [2024-06-25 00:59:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12740968448. Throughput: 0: 42475.1. Samples: 12741123660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:48,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 00:59:50,835][15401] Updated weights for policy 0, policy_version 777653 (0.0039) [2024-06-25 00:59:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12741165056. Throughput: 0: 42581.8. Samples: 12741254700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 00:59:54,315][15401] Updated weights for policy 0, policy_version 777663 (0.0032) [2024-06-25 00:59:58,301][15401] Updated weights for policy 0, policy_version 777673 (0.0026) [2024-06-25 00:59:58,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12741394432. Throughput: 0: 42548.1. Samples: 12741509860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 00:59:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 01:00:01,881][15401] Updated weights for policy 0, policy_version 777683 (0.0035) [2024-06-25 01:00:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12741607424. Throughput: 0: 42568.0. Samples: 12741764160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 01:00:05,814][15401] Updated weights for policy 0, policy_version 777693 (0.0036) [2024-06-25 01:00:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12741820416. Throughput: 0: 42528.5. Samples: 12741893180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 01:00:09,874][15401] Updated weights for policy 0, policy_version 777703 (0.0040) [2024-06-25 01:00:13,299][15401] Updated weights for policy 0, policy_version 777713 (0.0030) [2024-06-25 01:00:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12742049792. Throughput: 0: 42633.4. Samples: 12742152500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:13,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 01:00:17,455][15401] Updated weights for policy 0, policy_version 777723 (0.0036) [2024-06-25 01:00:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 12742246400. Throughput: 0: 42634.6. Samples: 12742403640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:18,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 01:00:21,342][15401] Updated weights for policy 0, policy_version 777733 (0.0039) [2024-06-25 01:00:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12742443008. Throughput: 0: 42559.6. Samples: 12742536740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 01:00:25,014][15401] Updated weights for policy 0, policy_version 777743 (0.0030) [2024-06-25 01:00:28,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12742639616. Throughput: 0: 42496.3. Samples: 12742787480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 01:00:28,789][15349] Signal inference workers to stop experience collection... (188550 times) [2024-06-25 01:00:28,791][15349] Signal inference workers to resume experience collection... (188550 times) [2024-06-25 01:00:28,832][15401] InferenceWorker_p0-w0: stopping experience collection (188550 times) [2024-06-25 01:00:28,832][15401] InferenceWorker_p0-w0: resuming experience collection (188550 times) [2024-06-25 01:00:29,077][15401] Updated weights for policy 0, policy_version 777753 (0.0032) [2024-06-25 01:00:32,514][15401] Updated weights for policy 0, policy_version 777763 (0.0028) [2024-06-25 01:00:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12742885376. Throughput: 0: 42660.5. Samples: 12743043280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 01:00:36,876][15401] Updated weights for policy 0, policy_version 777773 (0.0023) [2024-06-25 01:00:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12743081984. Throughput: 0: 42787.0. Samples: 12743180120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:00:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 01:00:40,365][15401] Updated weights for policy 0, policy_version 777783 (0.0037) [2024-06-25 01:00:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 12743311360. Throughput: 0: 42777.7. Samples: 12743434860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:00:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 01:00:44,450][15401] Updated weights for policy 0, policy_version 777793 (0.0038) [2024-06-25 01:00:47,897][15401] Updated weights for policy 0, policy_version 777803 (0.0028) [2024-06-25 01:00:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 12743540736. Throughput: 0: 42723.5. Samples: 12743686720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:00:48,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 01:00:52,062][15401] Updated weights for policy 0, policy_version 777813 (0.0035) [2024-06-25 01:00:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12743720960. Throughput: 0: 42894.6. Samples: 12743823440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:00:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 01:00:55,382][15401] Updated weights for policy 0, policy_version 777823 (0.0041) [2024-06-25 01:00:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12743933952. Throughput: 0: 42743.7. Samples: 12744075960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:00:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 01:00:59,642][15401] Updated weights for policy 0, policy_version 777833 (0.0033) [2024-06-25 01:01:03,310][15401] Updated weights for policy 0, policy_version 777843 (0.0030) [2024-06-25 01:01:03,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12744179712. Throughput: 0: 43004.1. Samples: 12744338720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 01:01:07,259][15401] Updated weights for policy 0, policy_version 777853 (0.0029) [2024-06-25 01:01:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12744359936. Throughput: 0: 42943.2. Samples: 12744469180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 01:01:10,822][15401] Updated weights for policy 0, policy_version 777863 (0.0039) [2024-06-25 01:01:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12744589312. Throughput: 0: 42932.0. Samples: 12744719420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 01:01:14,886][15401] Updated weights for policy 0, policy_version 777873 (0.0028) [2024-06-25 01:01:18,235][15401] Updated weights for policy 0, policy_version 777883 (0.0023) [2024-06-25 01:01:18,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 12744835072. Throughput: 0: 42979.4. Samples: 12744977360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 01:01:22,315][15401] Updated weights for policy 0, policy_version 777893 (0.0048) [2024-06-25 01:01:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12744998912. Throughput: 0: 42929.5. Samples: 12745111940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 01:01:25,853][15401] Updated weights for policy 0, policy_version 777903 (0.0035) [2024-06-25 01:01:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 12745244672. Throughput: 0: 42906.7. Samples: 12745365660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 01:01:30,151][15401] Updated weights for policy 0, policy_version 777913 (0.0031) [2024-06-25 01:01:33,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12745474048. Throughput: 0: 43099.2. Samples: 12745626180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 01:01:33,649][15401] Updated weights for policy 0, policy_version 777923 (0.0029) [2024-06-25 01:01:37,612][15401] Updated weights for policy 0, policy_version 777933 (0.0032) [2024-06-25 01:01:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12745670656. Throughput: 0: 43028.9. Samples: 12745759740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 01:01:41,381][15401] Updated weights for policy 0, policy_version 777943 (0.0033) [2024-06-25 01:01:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12745900032. Throughput: 0: 42945.6. Samples: 12746008520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 01:01:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777948_12745900032.pth... [2024-06-25 01:01:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777323_12735660032.pth [2024-06-25 01:01:45,387][15401] Updated weights for policy 0, policy_version 777953 (0.0033) [2024-06-25 01:01:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 12746113024. Throughput: 0: 43026.6. Samples: 12746274920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 01:01:48,876][15401] Updated weights for policy 0, policy_version 777963 (0.0031) [2024-06-25 01:01:52,815][15401] Updated weights for policy 0, policy_version 777973 (0.0042) [2024-06-25 01:01:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12746309632. Throughput: 0: 43022.6. Samples: 12746405200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 01:01:56,328][15401] Updated weights for policy 0, policy_version 777983 (0.0047) [2024-06-25 01:01:58,246][15349] Signal inference workers to stop experience collection... (188600 times) [2024-06-25 01:01:58,247][15349] Signal inference workers to resume experience collection... (188600 times) [2024-06-25 01:01:58,271][15401] InferenceWorker_p0-w0: stopping experience collection (188600 times) [2024-06-25 01:01:58,299][15401] InferenceWorker_p0-w0: resuming experience collection (188600 times) [2024-06-25 01:01:58,389][15132] Fps is (10 sec: 44238.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 12746555392. Throughput: 0: 43144.3. Samples: 12746660900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:01:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 01:02:00,274][15401] Updated weights for policy 0, policy_version 777993 (0.0030) [2024-06-25 01:02:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12746752000. Throughput: 0: 43126.8. Samples: 12746918060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:02:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 01:02:03,847][15401] Updated weights for policy 0, policy_version 778003 (0.0037) [2024-06-25 01:02:07,705][15401] Updated weights for policy 0, policy_version 778013 (0.0034) [2024-06-25 01:02:08,389][15132] Fps is (10 sec: 40959.3, 60 sec: 43417.5, 300 sec: 42766.0). Total num frames: 12746964992. Throughput: 0: 43055.5. Samples: 12747049440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:02:08,399][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 01:02:11,282][15401] Updated weights for policy 0, policy_version 778023 (0.0027) [2024-06-25 01:02:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12747194368. Throughput: 0: 43349.7. Samples: 12747316400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:02:13,399][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 01:02:15,759][15401] Updated weights for policy 0, policy_version 778033 (0.0033) [2024-06-25 01:02:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12747407360. Throughput: 0: 43088.9. Samples: 12747565180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:02:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 01:02:19,461][15401] Updated weights for policy 0, policy_version 778043 (0.0033) [2024-06-25 01:02:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 12747603968. Throughput: 0: 43057.4. Samples: 12747697320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 01:02:23,399][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 01:02:23,735][15401] Updated weights for policy 0, policy_version 778053 (0.0030) [2024-06-25 01:02:26,993][15401] Updated weights for policy 0, policy_version 778063 (0.0033) [2024-06-25 01:02:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12747833344. Throughput: 0: 43216.5. Samples: 12747953260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 01:02:31,279][15401] Updated weights for policy 0, policy_version 778073 (0.0024) [2024-06-25 01:02:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12748062720. Throughput: 0: 42835.5. Samples: 12748202520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 01:02:34,470][15401] Updated weights for policy 0, policy_version 778083 (0.0033) [2024-06-25 01:02:38,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43140.0, 300 sec: 42708.9). Total num frames: 12748259328. Throughput: 0: 42924.5. Samples: 12748337080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:38,397][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 01:02:38,687][15401] Updated weights for policy 0, policy_version 778093 (0.0037) [2024-06-25 01:02:42,171][15401] Updated weights for policy 0, policy_version 778103 (0.0034) [2024-06-25 01:02:43,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 12748472320. Throughput: 0: 43026.4. Samples: 12748597200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:43,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 01:02:46,595][15401] Updated weights for policy 0, policy_version 778113 (0.0034) [2024-06-25 01:02:48,390][15132] Fps is (10 sec: 45904.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 12748718080. Throughput: 0: 42782.2. Samples: 12748843260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:48,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 01:02:50,016][15401] Updated weights for policy 0, policy_version 778123 (0.0041) [2024-06-25 01:02:53,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 12748881920. Throughput: 0: 42923.5. Samples: 12748981100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:53,393][15132] Avg episode reward: [(0, '0.162')] [2024-06-25 01:02:54,125][15401] Updated weights for policy 0, policy_version 778133 (0.0034) [2024-06-25 01:02:57,810][15401] Updated weights for policy 0, policy_version 778143 (0.0037) [2024-06-25 01:02:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12749127680. Throughput: 0: 42699.7. Samples: 12749237880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:02:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 01:03:01,943][15401] Updated weights for policy 0, policy_version 778153 (0.0029) [2024-06-25 01:03:03,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12749340672. Throughput: 0: 42666.2. Samples: 12749485160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:03,399][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 01:03:05,367][15401] Updated weights for policy 0, policy_version 778163 (0.0034) [2024-06-25 01:03:08,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12749520896. Throughput: 0: 42671.1. Samples: 12749617520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 01:03:09,410][15401] Updated weights for policy 0, policy_version 778173 (0.0036) [2024-06-25 01:03:12,771][15349] Signal inference workers to stop experience collection... (188650 times) [2024-06-25 01:03:12,771][15349] Signal inference workers to resume experience collection... (188650 times) [2024-06-25 01:03:12,784][15401] InferenceWorker_p0-w0: stopping experience collection (188650 times) [2024-06-25 01:03:12,784][15401] InferenceWorker_p0-w0: resuming experience collection (188650 times) [2024-06-25 01:03:12,923][15401] Updated weights for policy 0, policy_version 778183 (0.0022) [2024-06-25 01:03:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 12749766656. Throughput: 0: 42705.4. Samples: 12749875000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 01:03:17,048][15401] Updated weights for policy 0, policy_version 778193 (0.0032) [2024-06-25 01:03:18,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12749996032. Throughput: 0: 42796.6. Samples: 12750128360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 01:03:20,356][15401] Updated weights for policy 0, policy_version 778203 (0.0039) [2024-06-25 01:03:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12750159872. Throughput: 0: 42828.8. Samples: 12750264100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 01:03:24,546][15401] Updated weights for policy 0, policy_version 778213 (0.0032) [2024-06-25 01:03:28,339][15401] Updated weights for policy 0, policy_version 778223 (0.0023) [2024-06-25 01:03:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12750405632. Throughput: 0: 42722.6. Samples: 12750519620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 01:03:32,466][15401] Updated weights for policy 0, policy_version 778233 (0.0032) [2024-06-25 01:03:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12750618624. Throughput: 0: 42815.6. Samples: 12750769960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 01:03:35,958][15401] Updated weights for policy 0, policy_version 778243 (0.0031) [2024-06-25 01:03:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 12750798848. Throughput: 0: 42676.5. Samples: 12750901440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:38,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 01:03:39,843][15401] Updated weights for policy 0, policy_version 778253 (0.0026) [2024-06-25 01:03:43,239][15401] Updated weights for policy 0, policy_version 778263 (0.0032) [2024-06-25 01:03:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 12751060992. Throughput: 0: 42716.7. Samples: 12751160140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 01:03:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778263_12751060992.pth... [2024-06-25 01:03:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777633_12740739072.pth [2024-06-25 01:03:47,319][15401] Updated weights for policy 0, policy_version 778273 (0.0033) [2024-06-25 01:03:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12751273984. Throughput: 0: 43091.9. Samples: 12751424300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:03:50,720][15401] Updated weights for policy 0, policy_version 778283 (0.0047) [2024-06-25 01:03:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 12751437824. Throughput: 0: 42886.7. Samples: 12751547420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 01:03:55,358][15401] Updated weights for policy 0, policy_version 778293 (0.0035) [2024-06-25 01:03:58,326][15401] Updated weights for policy 0, policy_version 778303 (0.0034) [2024-06-25 01:03:58,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 12751716352. Throughput: 0: 42842.2. Samples: 12751802900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:03:58,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 01:04:03,041][15401] Updated weights for policy 0, policy_version 778313 (0.0034) [2024-06-25 01:04:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12751896576. Throughput: 0: 43078.3. Samples: 12752066880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:04:03,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-25 01:04:06,105][15401] Updated weights for policy 0, policy_version 778323 (0.0040) [2024-06-25 01:04:08,390][15132] Fps is (10 sec: 36044.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12752076800. Throughput: 0: 42682.6. Samples: 12752184820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 01:04:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 01:04:10,772][15401] Updated weights for policy 0, policy_version 778333 (0.0035) [2024-06-25 01:04:13,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 12752338944. Throughput: 0: 42570.2. Samples: 12752435380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:13,393][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 01:04:14,211][15401] Updated weights for policy 0, policy_version 778343 (0.0031) [2024-06-25 01:04:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 12752519168. Throughput: 0: 42958.8. Samples: 12752703100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:18,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 01:04:18,472][15401] Updated weights for policy 0, policy_version 778353 (0.0033) [2024-06-25 01:04:19,343][15349] Signal inference workers to stop experience collection... (188700 times) [2024-06-25 01:04:19,391][15401] InferenceWorker_p0-w0: stopping experience collection (188700 times) [2024-06-25 01:04:19,398][15349] Signal inference workers to resume experience collection... (188700 times) [2024-06-25 01:04:19,411][15401] InferenceWorker_p0-w0: resuming experience collection (188700 times) [2024-06-25 01:04:21,711][15401] Updated weights for policy 0, policy_version 778363 (0.0029) [2024-06-25 01:04:23,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12752732160. Throughput: 0: 42720.4. Samples: 12752823860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 01:04:25,976][15401] Updated weights for policy 0, policy_version 778373 (0.0039) [2024-06-25 01:04:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12752977920. Throughput: 0: 42748.4. Samples: 12753083820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 01:04:29,283][15401] Updated weights for policy 0, policy_version 778383 (0.0047) [2024-06-25 01:04:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12753158144. Throughput: 0: 42695.3. Samples: 12753345580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:33,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 01:04:33,658][15401] Updated weights for policy 0, policy_version 778393 (0.0028) [2024-06-25 01:04:37,105][15401] Updated weights for policy 0, policy_version 778403 (0.0029) [2024-06-25 01:04:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12753387520. Throughput: 0: 42505.2. Samples: 12753460160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:04:41,490][15401] Updated weights for policy 0, policy_version 778413 (0.0042) [2024-06-25 01:04:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 12753600512. Throughput: 0: 42595.5. Samples: 12753719700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 01:04:44,690][15401] Updated weights for policy 0, policy_version 778423 (0.0029) [2024-06-25 01:04:48,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41779.4, 300 sec: 42765.0). Total num frames: 12753780736. Throughput: 0: 42564.4. Samples: 12753982280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 01:04:49,270][15401] Updated weights for policy 0, policy_version 778433 (0.0040) [2024-06-25 01:04:52,350][15401] Updated weights for policy 0, policy_version 778443 (0.0031) [2024-06-25 01:04:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12754026496. Throughput: 0: 42537.5. Samples: 12754099000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 01:04:56,930][15401] Updated weights for policy 0, policy_version 778453 (0.0034) [2024-06-25 01:04:58,392][15132] Fps is (10 sec: 45865.3, 60 sec: 42050.8, 300 sec: 42820.3). Total num frames: 12754239488. Throughput: 0: 42727.9. Samples: 12754358120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:04:58,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 01:05:00,142][15401] Updated weights for policy 0, policy_version 778463 (0.0039) [2024-06-25 01:05:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 12754419712. Throughput: 0: 42535.1. Samples: 12754617180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 01:05:04,824][15401] Updated weights for policy 0, policy_version 778473 (0.0037) [2024-06-25 01:05:07,813][15401] Updated weights for policy 0, policy_version 778483 (0.0024) [2024-06-25 01:05:08,390][15132] Fps is (10 sec: 42607.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12754665472. Throughput: 0: 42556.5. Samples: 12754738900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:08,391][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 01:05:12,597][15401] Updated weights for policy 0, policy_version 778493 (0.0041) [2024-06-25 01:05:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42327.1, 300 sec: 42820.9). Total num frames: 12754878464. Throughput: 0: 42642.4. Samples: 12755002720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 01:05:15,418][15401] Updated weights for policy 0, policy_version 778503 (0.0035) [2024-06-25 01:05:18,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42325.1, 300 sec: 42765.0). Total num frames: 12755058688. Throughput: 0: 42282.8. Samples: 12755248320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 01:05:20,357][15401] Updated weights for policy 0, policy_version 778513 (0.0048) [2024-06-25 01:05:23,084][15401] Updated weights for policy 0, policy_version 778523 (0.0052) [2024-06-25 01:05:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 12755320832. Throughput: 0: 42567.5. Samples: 12755375700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 01:05:28,119][15401] Updated weights for policy 0, policy_version 778533 (0.0037) [2024-06-25 01:05:28,389][15132] Fps is (10 sec: 42599.7, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 12755484672. Throughput: 0: 42528.9. Samples: 12755633500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 01:05:29,431][15349] Signal inference workers to stop experience collection... (188750 times) [2024-06-25 01:05:29,432][15349] Signal inference workers to resume experience collection... (188750 times) [2024-06-25 01:05:29,460][15401] InferenceWorker_p0-w0: stopping experience collection (188750 times) [2024-06-25 01:05:29,460][15401] InferenceWorker_p0-w0: resuming experience collection (188750 times) [2024-06-25 01:05:31,119][15401] Updated weights for policy 0, policy_version 778543 (0.0022) [2024-06-25 01:05:33,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 12755714048. Throughput: 0: 42212.3. Samples: 12755881940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:33,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 01:05:35,741][15401] Updated weights for policy 0, policy_version 778553 (0.0031) [2024-06-25 01:05:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12755959808. Throughput: 0: 42533.2. Samples: 12756013000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 01:05:38,767][15401] Updated weights for policy 0, policy_version 778563 (0.0034) [2024-06-25 01:05:43,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12756123648. Throughput: 0: 42638.5. Samples: 12756276760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 01:05:43,396][15401] Updated weights for policy 0, policy_version 778573 (0.0038) [2024-06-25 01:05:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778573_12756140032.pth... [2024-06-25 01:05:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000777948_12745900032.pth [2024-06-25 01:05:46,304][15401] Updated weights for policy 0, policy_version 778583 (0.0036) [2024-06-25 01:05:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12756369408. Throughput: 0: 42428.0. Samples: 12756526440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 01:05:51,061][15401] Updated weights for policy 0, policy_version 778593 (0.0038) [2024-06-25 01:05:53,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12756598784. Throughput: 0: 42556.1. Samples: 12756653920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 01:05:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 01:05:54,277][15401] Updated weights for policy 0, policy_version 778603 (0.0039) [2024-06-25 01:05:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42053.7, 300 sec: 42653.9). Total num frames: 12756762624. Throughput: 0: 42467.5. Samples: 12756913760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:05:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 01:05:58,845][15401] Updated weights for policy 0, policy_version 778613 (0.0037) [2024-06-25 01:06:01,698][15401] Updated weights for policy 0, policy_version 778623 (0.0048) [2024-06-25 01:06:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12757008384. Throughput: 0: 42670.1. Samples: 12757168460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 01:06:06,436][15401] Updated weights for policy 0, policy_version 778633 (0.0028) [2024-06-25 01:06:08,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12757237760. Throughput: 0: 42871.7. Samples: 12757304920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 01:06:09,325][15401] Updated weights for policy 0, policy_version 778643 (0.0042) [2024-06-25 01:06:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12757401600. Throughput: 0: 42752.4. Samples: 12757557360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 01:06:14,212][15401] Updated weights for policy 0, policy_version 778653 (0.0038) [2024-06-25 01:06:16,977][15401] Updated weights for policy 0, policy_version 778663 (0.0043) [2024-06-25 01:06:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12757647360. Throughput: 0: 42807.2. Samples: 12757808160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 01:06:21,909][15401] Updated weights for policy 0, policy_version 778673 (0.0034) [2024-06-25 01:06:23,390][15132] Fps is (10 sec: 47511.7, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 12757876736. Throughput: 0: 42970.4. Samples: 12757946680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:23,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 01:06:24,592][15401] Updated weights for policy 0, policy_version 778683 (0.0034) [2024-06-25 01:06:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12758056960. Throughput: 0: 42625.7. Samples: 12758194920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 01:06:29,426][15401] Updated weights for policy 0, policy_version 778693 (0.0034) [2024-06-25 01:06:32,367][15401] Updated weights for policy 0, policy_version 778703 (0.0036) [2024-06-25 01:06:33,390][15132] Fps is (10 sec: 42600.0, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12758302720. Throughput: 0: 42655.0. Samples: 12758445920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 01:06:36,930][15401] Updated weights for policy 0, policy_version 778713 (0.0024) [2024-06-25 01:06:38,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 12758515712. Throughput: 0: 42901.7. Samples: 12758584600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:38,393][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 01:06:39,960][15401] Updated weights for policy 0, policy_version 778723 (0.0041) [2024-06-25 01:06:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12758695936. Throughput: 0: 42742.7. Samples: 12758837180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:43,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 01:06:44,178][15349] Signal inference workers to stop experience collection... (188800 times) [2024-06-25 01:06:44,230][15401] InferenceWorker_p0-w0: stopping experience collection (188800 times) [2024-06-25 01:06:44,233][15349] Signal inference workers to resume experience collection... (188800 times) [2024-06-25 01:06:44,243][15401] InferenceWorker_p0-w0: resuming experience collection (188800 times) [2024-06-25 01:06:44,373][15401] Updated weights for policy 0, policy_version 778733 (0.0038) [2024-06-25 01:06:47,613][15401] Updated weights for policy 0, policy_version 778743 (0.0036) [2024-06-25 01:06:48,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12758958080. Throughput: 0: 42869.2. Samples: 12759097580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 01:06:52,228][15401] Updated weights for policy 0, policy_version 778753 (0.0036) [2024-06-25 01:06:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12759138304. Throughput: 0: 42882.1. Samples: 12759234620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 01:06:55,309][15401] Updated weights for policy 0, policy_version 778763 (0.0033) [2024-06-25 01:06:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12759351296. Throughput: 0: 42843.9. Samples: 12759485340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:06:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 01:06:59,746][15401] Updated weights for policy 0, policy_version 778773 (0.0030) [2024-06-25 01:07:03,075][15401] Updated weights for policy 0, policy_version 778783 (0.0030) [2024-06-25 01:07:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12759580672. Throughput: 0: 42878.8. Samples: 12759737700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 01:07:07,433][15401] Updated weights for policy 0, policy_version 778793 (0.0033) [2024-06-25 01:07:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12759793664. Throughput: 0: 42811.0. Samples: 12759873160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:08,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-25 01:07:10,662][15401] Updated weights for policy 0, policy_version 778803 (0.0024) [2024-06-25 01:07:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12760006656. Throughput: 0: 42890.7. Samples: 12760125000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 01:07:15,008][15401] Updated weights for policy 0, policy_version 778813 (0.0034) [2024-06-25 01:07:18,174][15401] Updated weights for policy 0, policy_version 778823 (0.0023) [2024-06-25 01:07:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12760236032. Throughput: 0: 43035.6. Samples: 12760382520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 01:07:22,806][15401] Updated weights for policy 0, policy_version 778833 (0.0031) [2024-06-25 01:07:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 12760432640. Throughput: 0: 42988.5. Samples: 12760518980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 01:07:25,933][15401] Updated weights for policy 0, policy_version 778843 (0.0045) [2024-06-25 01:07:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12760645632. Throughput: 0: 42927.2. Samples: 12760768900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 01:07:30,483][15401] Updated weights for policy 0, policy_version 778853 (0.0033) [2024-06-25 01:07:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 12760875008. Throughput: 0: 42898.6. Samples: 12761028020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 01:07:33,506][15401] Updated weights for policy 0, policy_version 778863 (0.0033) [2024-06-25 01:07:38,175][15401] Updated weights for policy 0, policy_version 778873 (0.0023) [2024-06-25 01:07:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42654.3). Total num frames: 12761055232. Throughput: 0: 42777.9. Samples: 12761159620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 01:07:41,024][15401] Updated weights for policy 0, policy_version 778883 (0.0032) [2024-06-25 01:07:43,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43413.1, 300 sec: 42653.0). Total num frames: 12761300992. Throughput: 0: 42785.7. Samples: 12761410960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 01:07:43,396][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 01:07:43,533][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778889_12761317376.pth... [2024-06-25 01:07:43,583][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778263_12751060992.pth [2024-06-25 01:07:45,707][15401] Updated weights for policy 0, policy_version 778893 (0.0034) [2024-06-25 01:07:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 12761513984. Throughput: 0: 43010.7. Samples: 12761673180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:07:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 01:07:48,619][15401] Updated weights for policy 0, policy_version 778903 (0.0032) [2024-06-25 01:07:53,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12761694208. Throughput: 0: 42744.2. Samples: 12761796640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:07:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 01:07:53,422][15401] Updated weights for policy 0, policy_version 778913 (0.0029) [2024-06-25 01:07:56,397][15401] Updated weights for policy 0, policy_version 778923 (0.0046) [2024-06-25 01:07:57,669][15349] Signal inference workers to stop experience collection... (188850 times) [2024-06-25 01:07:57,715][15401] InferenceWorker_p0-w0: stopping experience collection (188850 times) [2024-06-25 01:07:57,727][15349] Signal inference workers to resume experience collection... (188850 times) [2024-06-25 01:07:57,731][15401] InferenceWorker_p0-w0: resuming experience collection (188850 times) [2024-06-25 01:07:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12761956352. Throughput: 0: 42918.6. Samples: 12762056340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:07:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 01:08:01,080][15401] Updated weights for policy 0, policy_version 778933 (0.0026) [2024-06-25 01:08:03,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 12762136576. Throughput: 0: 42835.4. Samples: 12762310120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 01:08:04,663][15401] Updated weights for policy 0, policy_version 778943 (0.0030) [2024-06-25 01:08:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12762333184. Throughput: 0: 42459.5. Samples: 12762429660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 01:08:08,857][15401] Updated weights for policy 0, policy_version 778953 (0.0026) [2024-06-25 01:08:12,304][15401] Updated weights for policy 0, policy_version 778963 (0.0039) [2024-06-25 01:08:13,392][15132] Fps is (10 sec: 45865.0, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 12762595328. Throughput: 0: 42751.5. Samples: 12762692820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:13,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 01:08:16,331][15401] Updated weights for policy 0, policy_version 778973 (0.0034) [2024-06-25 01:08:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12762775552. Throughput: 0: 42628.8. Samples: 12762946320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:18,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-25 01:08:19,787][15401] Updated weights for policy 0, policy_version 778983 (0.0034) [2024-06-25 01:08:23,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12762988544. Throughput: 0: 42585.2. Samples: 12763075960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:23,390][15132] Avg episode reward: [(0, '0.200')] [2024-06-25 01:08:24,103][15401] Updated weights for policy 0, policy_version 778993 (0.0043) [2024-06-25 01:08:27,686][15401] Updated weights for policy 0, policy_version 779003 (0.0038) [2024-06-25 01:08:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12763217920. Throughput: 0: 42781.1. Samples: 12763335840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:28,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 01:08:31,585][15401] Updated weights for policy 0, policy_version 779013 (0.0045) [2024-06-25 01:08:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12763414528. Throughput: 0: 42530.6. Samples: 12763587060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 01:08:35,206][15401] Updated weights for policy 0, policy_version 779023 (0.0034) [2024-06-25 01:08:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 12763643904. Throughput: 0: 42633.8. Samples: 12763715180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:38,390][15132] Avg episode reward: [(0, '0.202')] [2024-06-25 01:08:39,156][15401] Updated weights for policy 0, policy_version 779033 (0.0042) [2024-06-25 01:08:43,318][15401] Updated weights for policy 0, policy_version 779043 (0.0034) [2024-06-25 01:08:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 12763840512. Throughput: 0: 42668.0. Samples: 12763976400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 01:08:46,851][15401] Updated weights for policy 0, policy_version 779053 (0.0022) [2024-06-25 01:08:48,389][15132] Fps is (10 sec: 42599.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12764069888. Throughput: 0: 42670.9. Samples: 12764230300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 01:08:50,829][15401] Updated weights for policy 0, policy_version 779063 (0.0027) [2024-06-25 01:08:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12764282880. Throughput: 0: 42768.5. Samples: 12764354240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 01:08:54,822][15401] Updated weights for policy 0, policy_version 779073 (0.0035) [2024-06-25 01:08:58,310][15401] Updated weights for policy 0, policy_version 779083 (0.0026) [2024-06-25 01:08:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12764495872. Throughput: 0: 42729.8. Samples: 12764615560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:08:58,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 01:09:02,337][15401] Updated weights for policy 0, policy_version 779093 (0.0042) [2024-06-25 01:09:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12764708864. Throughput: 0: 42777.9. Samples: 12764871320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 01:09:05,835][15401] Updated weights for policy 0, policy_version 779103 (0.0033) [2024-06-25 01:09:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42709.8). Total num frames: 12764938240. Throughput: 0: 42653.5. Samples: 12764995360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 01:09:10,263][15401] Updated weights for policy 0, policy_version 779113 (0.0030) [2024-06-25 01:09:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42325.3, 300 sec: 42764.7). Total num frames: 12765134848. Throughput: 0: 42804.9. Samples: 12765262160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:13,392][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 01:09:13,522][15401] Updated weights for policy 0, policy_version 779123 (0.0042) [2024-06-25 01:09:13,873][15349] Signal inference workers to stop experience collection... (188900 times) [2024-06-25 01:09:13,912][15401] InferenceWorker_p0-w0: stopping experience collection (188900 times) [2024-06-25 01:09:13,923][15349] Signal inference workers to resume experience collection... (188900 times) [2024-06-25 01:09:13,939][15401] InferenceWorker_p0-w0: resuming experience collection (188900 times) [2024-06-25 01:09:17,670][15401] Updated weights for policy 0, policy_version 779133 (0.0033) [2024-06-25 01:09:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12765364224. Throughput: 0: 42880.8. Samples: 12765516700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 01:09:21,280][15401] Updated weights for policy 0, policy_version 779143 (0.0039) [2024-06-25 01:09:23,389][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12765577216. Throughput: 0: 43026.1. Samples: 12765651340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 01:09:25,082][15401] Updated weights for policy 0, policy_version 779153 (0.0037) [2024-06-25 01:09:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12765773824. Throughput: 0: 42866.9. Samples: 12765905400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 01:09:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 01:09:28,976][15401] Updated weights for policy 0, policy_version 779163 (0.0034) [2024-06-25 01:09:32,809][15401] Updated weights for policy 0, policy_version 779173 (0.0035) [2024-06-25 01:09:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12765986816. Throughput: 0: 42858.1. Samples: 12766158920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:33,391][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 01:09:36,539][15401] Updated weights for policy 0, policy_version 779183 (0.0031) [2024-06-25 01:09:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 12766216192. Throughput: 0: 43005.5. Samples: 12766289480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:38,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 01:09:40,262][15401] Updated weights for policy 0, policy_version 779193 (0.0024) [2024-06-25 01:09:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12766412800. Throughput: 0: 42867.0. Samples: 12766544680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:43,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 01:09:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779201_12766429184.pth... [2024-06-25 01:09:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778573_12756140032.pth [2024-06-25 01:09:44,150][15401] Updated weights for policy 0, policy_version 779203 (0.0036) [2024-06-25 01:09:48,219][15401] Updated weights for policy 0, policy_version 779213 (0.0028) [2024-06-25 01:09:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12766625792. Throughput: 0: 42959.5. Samples: 12766804500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 01:09:52,001][15401] Updated weights for policy 0, policy_version 779223 (0.0041) [2024-06-25 01:09:53,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12766838784. Throughput: 0: 43070.7. Samples: 12766933540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 01:09:55,573][15401] Updated weights for policy 0, policy_version 779233 (0.0041) [2024-06-25 01:09:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12767051776. Throughput: 0: 42773.4. Samples: 12767186860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:09:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 01:09:59,493][15401] Updated weights for policy 0, policy_version 779243 (0.0033) [2024-06-25 01:10:03,045][15401] Updated weights for policy 0, policy_version 779253 (0.0031) [2024-06-25 01:10:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12767281152. Throughput: 0: 42874.2. Samples: 12767446040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 01:10:07,114][15401] Updated weights for policy 0, policy_version 779263 (0.0026) [2024-06-25 01:10:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12767494144. Throughput: 0: 42885.8. Samples: 12767581200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:08,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 01:10:10,503][15401] Updated weights for policy 0, policy_version 779273 (0.0039) [2024-06-25 01:10:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 12767690752. Throughput: 0: 42743.4. Samples: 12767828860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 01:10:14,645][15401] Updated weights for policy 0, policy_version 779283 (0.0032) [2024-06-25 01:10:18,026][15401] Updated weights for policy 0, policy_version 779293 (0.0042) [2024-06-25 01:10:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12767936512. Throughput: 0: 42787.1. Samples: 12768084340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 01:10:22,572][15401] Updated weights for policy 0, policy_version 779303 (0.0037) [2024-06-25 01:10:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12768133120. Throughput: 0: 42843.0. Samples: 12768217420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:23,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 01:10:25,633][15401] Updated weights for policy 0, policy_version 779313 (0.0027) [2024-06-25 01:10:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12768346112. Throughput: 0: 42754.9. Samples: 12768468540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 01:10:30,660][15401] Updated weights for policy 0, policy_version 779323 (0.0047) [2024-06-25 01:10:33,302][15401] Updated weights for policy 0, policy_version 779333 (0.0032) [2024-06-25 01:10:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 12768591872. Throughput: 0: 42534.2. Samples: 12768718540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 01:10:38,198][15401] Updated weights for policy 0, policy_version 779343 (0.0032) [2024-06-25 01:10:38,206][15349] Signal inference workers to stop experience collection... (188950 times) [2024-06-25 01:10:38,206][15349] Signal inference workers to resume experience collection... (188950 times) [2024-06-25 01:10:38,226][15401] InferenceWorker_p0-w0: stopping experience collection (188950 times) [2024-06-25 01:10:38,226][15401] InferenceWorker_p0-w0: resuming experience collection (188950 times) [2024-06-25 01:10:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12768772096. Throughput: 0: 42614.1. Samples: 12768851180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 01:10:41,309][15401] Updated weights for policy 0, policy_version 779353 (0.0049) [2024-06-25 01:10:43,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12768968704. Throughput: 0: 42587.5. Samples: 12769103300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 01:10:45,841][15401] Updated weights for policy 0, policy_version 779363 (0.0032) [2024-06-25 01:10:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12769214464. Throughput: 0: 42501.0. Samples: 12769358580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:48,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 01:10:48,918][15401] Updated weights for policy 0, policy_version 779373 (0.0036) [2024-06-25 01:10:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12769394688. Throughput: 0: 42513.7. Samples: 12769494320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:53,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 01:10:53,488][15401] Updated weights for policy 0, policy_version 779383 (0.0032) [2024-06-25 01:10:56,530][15401] Updated weights for policy 0, policy_version 779393 (0.0046) [2024-06-25 01:10:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12769607680. Throughput: 0: 42608.9. Samples: 12769746260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:10:58,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 01:11:01,074][15401] Updated weights for policy 0, policy_version 779403 (0.0043) [2024-06-25 01:11:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12769853440. Throughput: 0: 42592.4. Samples: 12770001000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:11:03,392][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 01:11:04,103][15401] Updated weights for policy 0, policy_version 779413 (0.0046) [2024-06-25 01:11:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12770033664. Throughput: 0: 42550.7. Samples: 12770132200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:11:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 01:11:08,656][15401] Updated weights for policy 0, policy_version 779423 (0.0040) [2024-06-25 01:11:12,306][15401] Updated weights for policy 0, policy_version 779433 (0.0040) [2024-06-25 01:11:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12770263040. Throughput: 0: 42631.8. Samples: 12770386980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 01:11:13,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 01:11:16,247][15401] Updated weights for policy 0, policy_version 779443 (0.0031) [2024-06-25 01:11:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12770476032. Throughput: 0: 42785.4. Samples: 12770643880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 01:11:20,010][15401] Updated weights for policy 0, policy_version 779453 (0.0034) [2024-06-25 01:11:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12770689024. Throughput: 0: 42708.7. Samples: 12770773080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:23,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 01:11:23,840][15401] Updated weights for policy 0, policy_version 779463 (0.0029) [2024-06-25 01:11:27,559][15401] Updated weights for policy 0, policy_version 779473 (0.0036) [2024-06-25 01:11:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12770885632. Throughput: 0: 42675.7. Samples: 12771023700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 01:11:31,682][15401] Updated weights for policy 0, policy_version 779483 (0.0025) [2024-06-25 01:11:33,389][15132] Fps is (10 sec: 40961.1, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 12771098624. Throughput: 0: 42709.4. Samples: 12771280500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:33,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 01:11:35,293][15401] Updated weights for policy 0, policy_version 779493 (0.0033) [2024-06-25 01:11:38,155][15349] Signal inference workers to stop experience collection... (189000 times) [2024-06-25 01:11:38,206][15401] InferenceWorker_p0-w0: stopping experience collection (189000 times) [2024-06-25 01:11:38,213][15349] Signal inference workers to resume experience collection... (189000 times) [2024-06-25 01:11:38,228][15401] InferenceWorker_p0-w0: resuming experience collection (189000 times) [2024-06-25 01:11:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12771344384. Throughput: 0: 42554.4. Samples: 12771409260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:38,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 01:11:39,452][15401] Updated weights for policy 0, policy_version 779503 (0.0034) [2024-06-25 01:11:42,925][15401] Updated weights for policy 0, policy_version 779513 (0.0038) [2024-06-25 01:11:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12771540992. Throughput: 0: 42612.4. Samples: 12771663820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 01:11:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779513_12771540992.pth... [2024-06-25 01:11:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000778889_12761317376.pth [2024-06-25 01:11:47,157][15401] Updated weights for policy 0, policy_version 779523 (0.0045) [2024-06-25 01:11:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12771753984. Throughput: 0: 42553.4. Samples: 12771915900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 01:11:50,686][15401] Updated weights for policy 0, policy_version 779533 (0.0040) [2024-06-25 01:11:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12771966976. Throughput: 0: 42576.3. Samples: 12772048140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 01:11:54,679][15401] Updated weights for policy 0, policy_version 779543 (0.0033) [2024-06-25 01:11:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12772179968. Throughput: 0: 42601.1. Samples: 12772304020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:11:58,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 01:11:58,437][15401] Updated weights for policy 0, policy_version 779553 (0.0031) [2024-06-25 01:12:02,226][15401] Updated weights for policy 0, policy_version 779563 (0.0027) [2024-06-25 01:12:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12772392960. Throughput: 0: 42381.4. Samples: 12772551040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 01:12:06,293][15401] Updated weights for policy 0, policy_version 779573 (0.0032) [2024-06-25 01:12:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12772589568. Throughput: 0: 42391.4. Samples: 12772680680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 01:12:10,183][15401] Updated weights for policy 0, policy_version 779583 (0.0033) [2024-06-25 01:12:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 12772818944. Throughput: 0: 42520.8. Samples: 12772937240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:13,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 01:12:14,456][15401] Updated weights for policy 0, policy_version 779593 (0.0030) [2024-06-25 01:12:18,042][15401] Updated weights for policy 0, policy_version 779603 (0.0038) [2024-06-25 01:12:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12773015552. Throughput: 0: 42207.9. Samples: 12773179860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 01:12:22,086][15401] Updated weights for policy 0, policy_version 779613 (0.0036) [2024-06-25 01:12:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 12773228544. Throughput: 0: 42251.0. Samples: 12773310560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 01:12:25,705][15401] Updated weights for policy 0, policy_version 779623 (0.0024) [2024-06-25 01:12:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12773441536. Throughput: 0: 42224.1. Samples: 12773563900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 01:12:30,047][15401] Updated weights for policy 0, policy_version 779633 (0.0041) [2024-06-25 01:12:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12773638144. Throughput: 0: 42164.9. Samples: 12773813320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 01:12:33,972][15401] Updated weights for policy 0, policy_version 779643 (0.0048) [2024-06-25 01:12:37,789][15401] Updated weights for policy 0, policy_version 779653 (0.0036) [2024-06-25 01:12:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42599.3). Total num frames: 12773867520. Throughput: 0: 41990.3. Samples: 12773937700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 01:12:41,707][15401] Updated weights for policy 0, policy_version 779663 (0.0050) [2024-06-25 01:12:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 12774064128. Throughput: 0: 41944.3. Samples: 12774191520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 01:12:45,337][15401] Updated weights for policy 0, policy_version 779673 (0.0030) [2024-06-25 01:12:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12774277120. Throughput: 0: 42087.0. Samples: 12774444960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:48,396][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 01:12:49,342][15401] Updated weights for policy 0, policy_version 779683 (0.0038) [2024-06-25 01:12:51,826][15349] Signal inference workers to stop experience collection... (189050 times) [2024-06-25 01:12:51,827][15349] Signal inference workers to resume experience collection... (189050 times) [2024-06-25 01:12:51,845][15401] InferenceWorker_p0-w0: stopping experience collection (189050 times) [2024-06-25 01:12:51,846][15401] InferenceWorker_p0-w0: resuming experience collection (189050 times) [2024-06-25 01:12:52,962][15401] Updated weights for policy 0, policy_version 779693 (0.0037) [2024-06-25 01:12:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12774490112. Throughput: 0: 42078.8. Samples: 12774574240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:53,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 01:12:57,003][15401] Updated weights for policy 0, policy_version 779703 (0.0036) [2024-06-25 01:12:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 12774703104. Throughput: 0: 41872.9. Samples: 12774821420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 01:12:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 01:13:00,883][15401] Updated weights for policy 0, policy_version 779713 (0.0027) [2024-06-25 01:13:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 12774899712. Throughput: 0: 42285.7. Samples: 12775082720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 01:13:04,591][15401] Updated weights for policy 0, policy_version 779723 (0.0034) [2024-06-25 01:13:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 12775129088. Throughput: 0: 42146.3. Samples: 12775207140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 01:13:08,661][15401] Updated weights for policy 0, policy_version 779733 (0.0037) [2024-06-25 01:13:12,411][15401] Updated weights for policy 0, policy_version 779743 (0.0037) [2024-06-25 01:13:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 12775342080. Throughput: 0: 42232.9. Samples: 12775464380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 01:13:16,485][15401] Updated weights for policy 0, policy_version 779753 (0.0025) [2024-06-25 01:13:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12775555072. Throughput: 0: 42352.0. Samples: 12775719160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 01:13:19,922][15401] Updated weights for policy 0, policy_version 779763 (0.0038) [2024-06-25 01:13:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 12775768064. Throughput: 0: 42376.8. Samples: 12775844760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:23,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 01:13:24,098][15401] Updated weights for policy 0, policy_version 779773 (0.0031) [2024-06-25 01:13:27,914][15401] Updated weights for policy 0, policy_version 779783 (0.0044) [2024-06-25 01:13:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12775981056. Throughput: 0: 42406.9. Samples: 12776099820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:13:31,576][15401] Updated weights for policy 0, policy_version 779793 (0.0036) [2024-06-25 01:13:33,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12776194048. Throughput: 0: 42460.5. Samples: 12776355680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 01:13:35,526][15401] Updated weights for policy 0, policy_version 779803 (0.0039) [2024-06-25 01:13:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12776407040. Throughput: 0: 42413.5. Samples: 12776482840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 01:13:39,043][15401] Updated weights for policy 0, policy_version 779813 (0.0035) [2024-06-25 01:13:43,109][15401] Updated weights for policy 0, policy_version 779823 (0.0026) [2024-06-25 01:13:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 12776620032. Throughput: 0: 42605.3. Samples: 12776738660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:43,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 01:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779823_12776620032.pth... [2024-06-25 01:13:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779201_12766429184.pth [2024-06-25 01:13:46,818][15401] Updated weights for policy 0, policy_version 779833 (0.0037) [2024-06-25 01:13:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12776816640. Throughput: 0: 42245.8. Samples: 12776983780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 01:13:51,317][15401] Updated weights for policy 0, policy_version 779843 (0.0034) [2024-06-25 01:13:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 12777029632. Throughput: 0: 42332.4. Samples: 12777112100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 01:13:54,654][15401] Updated weights for policy 0, policy_version 779853 (0.0027) [2024-06-25 01:13:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12777242624. Throughput: 0: 42278.6. Samples: 12777366920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:13:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 01:13:59,030][15401] Updated weights for policy 0, policy_version 779863 (0.0028) [2024-06-25 01:14:02,326][15401] Updated weights for policy 0, policy_version 779873 (0.0026) [2024-06-25 01:14:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 12777472000. Throughput: 0: 42205.8. Samples: 12777618420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 01:14:05,267][15349] Signal inference workers to stop experience collection... (189100 times) [2024-06-25 01:14:05,324][15401] InferenceWorker_p0-w0: stopping experience collection (189100 times) [2024-06-25 01:14:05,324][15349] Signal inference workers to resume experience collection... (189100 times) [2024-06-25 01:14:05,337][15401] InferenceWorker_p0-w0: resuming experience collection (189100 times) [2024-06-25 01:14:06,822][15401] Updated weights for policy 0, policy_version 779883 (0.0042) [2024-06-25 01:14:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42432.1). Total num frames: 12777652224. Throughput: 0: 42371.6. Samples: 12777751380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:08,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 01:14:10,268][15401] Updated weights for policy 0, policy_version 779893 (0.0028) [2024-06-25 01:14:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 12777881600. Throughput: 0: 42351.4. Samples: 12778005640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 01:14:14,427][15401] Updated weights for policy 0, policy_version 779903 (0.0036) [2024-06-25 01:14:17,915][15401] Updated weights for policy 0, policy_version 779913 (0.0027) [2024-06-25 01:14:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12778110976. Throughput: 0: 42386.3. Samples: 12778263060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 01:14:22,390][15401] Updated weights for policy 0, policy_version 779923 (0.0031) [2024-06-25 01:14:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 12778291200. Throughput: 0: 42541.8. Samples: 12778397220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 01:14:25,338][15401] Updated weights for policy 0, policy_version 779933 (0.0038) [2024-06-25 01:14:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12778536960. Throughput: 0: 42403.2. Samples: 12778646800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:28,392][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 01:14:30,359][15401] Updated weights for policy 0, policy_version 779943 (0.0038) [2024-06-25 01:14:32,941][15401] Updated weights for policy 0, policy_version 779953 (0.0042) [2024-06-25 01:14:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12778749952. Throughput: 0: 42715.2. Samples: 12778905960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:33,393][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 01:14:37,841][15401] Updated weights for policy 0, policy_version 779963 (0.0042) [2024-06-25 01:14:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 42376.6). Total num frames: 12778913792. Throughput: 0: 42801.8. Samples: 12779038180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:38,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-25 01:14:40,507][15401] Updated weights for policy 0, policy_version 779973 (0.0022) [2024-06-25 01:14:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12779192320. Throughput: 0: 42747.1. Samples: 12779290540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 01:14:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 01:14:45,345][15401] Updated weights for policy 0, policy_version 779983 (0.0039) [2024-06-25 01:14:48,121][15401] Updated weights for policy 0, policy_version 779993 (0.0034) [2024-06-25 01:14:48,389][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 12779405312. Throughput: 0: 42696.4. Samples: 12779539760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:14:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 01:14:53,004][15401] Updated weights for policy 0, policy_version 780003 (0.0038) [2024-06-25 01:14:53,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 12779569152. Throughput: 0: 42628.4. Samples: 12779669660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:14:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 01:14:55,865][15401] Updated weights for policy 0, policy_version 780013 (0.0039) [2024-06-25 01:14:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 12779814912. Throughput: 0: 42679.7. Samples: 12779926220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:14:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 01:15:00,959][15401] Updated weights for policy 0, policy_version 780023 (0.0032) [2024-06-25 01:15:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12780044288. Throughput: 0: 42718.9. Samples: 12780185420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 01:15:03,825][15401] Updated weights for policy 0, policy_version 780033 (0.0036) [2024-06-25 01:15:08,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 12780191744. Throughput: 0: 42474.7. Samples: 12780308580. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 01:15:08,666][15401] Updated weights for policy 0, policy_version 780043 (0.0033) [2024-06-25 01:15:11,366][15401] Updated weights for policy 0, policy_version 780053 (0.0036) [2024-06-25 01:15:12,742][15349] Signal inference workers to stop experience collection... (189150 times) [2024-06-25 01:15:12,743][15349] Signal inference workers to resume experience collection... (189150 times) [2024-06-25 01:15:12,783][15401] InferenceWorker_p0-w0: stopping experience collection (189150 times) [2024-06-25 01:15:12,783][15401] InferenceWorker_p0-w0: resuming experience collection (189150 times) [2024-06-25 01:15:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.9, 300 sec: 42487.0). Total num frames: 12780470272. Throughput: 0: 42731.1. Samples: 12780569800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:13,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 01:15:16,302][15401] Updated weights for policy 0, policy_version 780063 (0.0033) [2024-06-25 01:15:18,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12780666880. Throughput: 0: 42561.4. Samples: 12780821220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 01:15:19,182][15401] Updated weights for policy 0, policy_version 780073 (0.0031) [2024-06-25 01:15:23,394][15132] Fps is (10 sec: 37676.2, 60 sec: 42595.4, 300 sec: 42375.6). Total num frames: 12780847104. Throughput: 0: 42413.7. Samples: 12780946980. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:23,394][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 01:15:24,142][15401] Updated weights for policy 0, policy_version 780083 (0.0036) [2024-06-25 01:15:27,083][15401] Updated weights for policy 0, policy_version 780093 (0.0030) [2024-06-25 01:15:28,390][15132] Fps is (10 sec: 42596.9, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 12781092864. Throughput: 0: 42472.7. Samples: 12781201820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 01:15:31,625][15401] Updated weights for policy 0, policy_version 780103 (0.0030) [2024-06-25 01:15:33,390][15132] Fps is (10 sec: 44255.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 12781289472. Throughput: 0: 42644.3. Samples: 12781458760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 01:15:34,977][15401] Updated weights for policy 0, policy_version 780113 (0.0024) [2024-06-25 01:15:38,390][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 12781502464. Throughput: 0: 42543.0. Samples: 12781584100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:38,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 01:15:39,198][15401] Updated weights for policy 0, policy_version 780123 (0.0042) [2024-06-25 01:15:42,693][15401] Updated weights for policy 0, policy_version 780133 (0.0028) [2024-06-25 01:15:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 42375.9). Total num frames: 12781715456. Throughput: 0: 42557.2. Samples: 12781841400. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:43,393][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 01:15:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780135_12781731840.pth... [2024-06-25 01:15:43,609][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779513_12771540992.pth [2024-06-25 01:15:47,205][15401] Updated weights for policy 0, policy_version 780143 (0.0038) [2024-06-25 01:15:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 12781912064. Throughput: 0: 42558.7. Samples: 12782100560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:48,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 01:15:50,471][15401] Updated weights for policy 0, policy_version 780153 (0.0032) [2024-06-25 01:15:53,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 12782141440. Throughput: 0: 42499.1. Samples: 12782221140. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:53,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 01:15:54,694][15401] Updated weights for policy 0, policy_version 780163 (0.0032) [2024-06-25 01:15:58,193][15401] Updated weights for policy 0, policy_version 780173 (0.0032) [2024-06-25 01:15:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 12782354432. Throughput: 0: 42437.9. Samples: 12782479400. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:15:58,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 01:16:02,254][15401] Updated weights for policy 0, policy_version 780183 (0.0030) [2024-06-25 01:16:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 12782551040. Throughput: 0: 42505.7. Samples: 12782733980. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 01:16:05,986][15401] Updated weights for policy 0, policy_version 780193 (0.0049) [2024-06-25 01:16:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 12782796800. Throughput: 0: 42410.2. Samples: 12782855260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 01:16:09,927][15401] Updated weights for policy 0, policy_version 780203 (0.0029) [2024-06-25 01:16:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 41779.2, 300 sec: 42375.9). Total num frames: 12782977024. Throughput: 0: 42515.8. Samples: 12783115120. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:13,393][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 01:16:13,599][15401] Updated weights for policy 0, policy_version 780213 (0.0026) [2024-06-25 01:16:17,723][15401] Updated weights for policy 0, policy_version 780223 (0.0036) [2024-06-25 01:16:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 12783173632. Throughput: 0: 42333.8. Samples: 12783363780. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:18,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 01:16:21,373][15401] Updated weights for policy 0, policy_version 780233 (0.0040) [2024-06-25 01:16:22,436][15349] Signal inference workers to stop experience collection... (189200 times) [2024-06-25 01:16:22,437][15349] Signal inference workers to resume experience collection... (189200 times) [2024-06-25 01:16:22,479][15401] InferenceWorker_p0-w0: stopping experience collection (189200 times) [2024-06-25 01:16:22,479][15401] InferenceWorker_p0-w0: resuming experience collection (189200 times) [2024-06-25 01:16:23,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43147.6, 300 sec: 42542.8). Total num frames: 12783435776. Throughput: 0: 42351.6. Samples: 12783489920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:23,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-25 01:16:25,118][15401] Updated weights for policy 0, policy_version 780243 (0.0028) [2024-06-25 01:16:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.4, 300 sec: 42431.7). Total num frames: 12783616000. Throughput: 0: 42565.3. Samples: 12783756740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 01:16:28,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 01:16:28,966][15401] Updated weights for policy 0, policy_version 780253 (0.0024) [2024-06-25 01:16:32,903][15401] Updated weights for policy 0, policy_version 780263 (0.0036) [2024-06-25 01:16:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 12783828992. Throughput: 0: 42339.2. Samples: 12784005820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 01:16:36,604][15401] Updated weights for policy 0, policy_version 780273 (0.0042) [2024-06-25 01:16:38,389][15132] Fps is (10 sec: 45876.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 12784074752. Throughput: 0: 42509.4. Samples: 12784133960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 01:16:40,730][15401] Updated weights for policy 0, policy_version 780283 (0.0032) [2024-06-25 01:16:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42598.4, 300 sec: 42431.4). Total num frames: 12784271360. Throughput: 0: 42599.9. Samples: 12784396500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:43,392][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 01:16:44,475][15401] Updated weights for policy 0, policy_version 780293 (0.0027) [2024-06-25 01:16:48,170][15401] Updated weights for policy 0, policy_version 780303 (0.0031) [2024-06-25 01:16:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 12784484352. Throughput: 0: 42487.1. Samples: 12784645900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 01:16:51,970][15401] Updated weights for policy 0, policy_version 780313 (0.0040) [2024-06-25 01:16:53,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43146.2, 300 sec: 42542.8). Total num frames: 12784730112. Throughput: 0: 42814.2. Samples: 12784781900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 01:16:55,646][15401] Updated weights for policy 0, policy_version 780323 (0.0028) [2024-06-25 01:16:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 12784893952. Throughput: 0: 42705.9. Samples: 12785036780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:16:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 01:16:59,526][15401] Updated weights for policy 0, policy_version 780333 (0.0024) [2024-06-25 01:17:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 12785123328. Throughput: 0: 42809.5. Samples: 12785290200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 01:17:03,470][15401] Updated weights for policy 0, policy_version 780343 (0.0033) [2024-06-25 01:17:07,278][15401] Updated weights for policy 0, policy_version 780353 (0.0027) [2024-06-25 01:17:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.6, 300 sec: 42487.7). Total num frames: 12785352704. Throughput: 0: 43027.8. Samples: 12785426160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:17:10,880][15401] Updated weights for policy 0, policy_version 780363 (0.0031) [2024-06-25 01:17:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 12785532928. Throughput: 0: 42765.5. Samples: 12785681180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 01:17:14,796][15401] Updated weights for policy 0, policy_version 780373 (0.0031) [2024-06-25 01:17:18,290][15401] Updated weights for policy 0, policy_version 780383 (0.0029) [2024-06-25 01:17:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.8, 300 sec: 42598.4). Total num frames: 12785795072. Throughput: 0: 42801.4. Samples: 12785931880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 01:17:22,565][15401] Updated weights for policy 0, policy_version 780393 (0.0026) [2024-06-25 01:17:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12785991680. Throughput: 0: 42931.5. Samples: 12786065880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 01:17:26,171][15401] Updated weights for policy 0, policy_version 780403 (0.0034) [2024-06-25 01:17:28,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 12786171904. Throughput: 0: 42888.4. Samples: 12786326380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 01:17:30,257][15401] Updated weights for policy 0, policy_version 780413 (0.0028) [2024-06-25 01:17:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 12786417664. Throughput: 0: 42784.1. Samples: 12786571180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:33,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 01:17:33,598][15401] Updated weights for policy 0, policy_version 780423 (0.0034) [2024-06-25 01:17:38,159][15401] Updated weights for policy 0, policy_version 780433 (0.0032) [2024-06-25 01:17:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12786630656. Throughput: 0: 42790.3. Samples: 12786707460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 01:17:41,736][15401] Updated weights for policy 0, policy_version 780443 (0.0036) [2024-06-25 01:17:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 12786827264. Throughput: 0: 42773.4. Samples: 12786961580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 01:17:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780447_12786843648.pth... [2024-06-25 01:17:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000779823_12776620032.pth [2024-06-25 01:17:45,669][15349] Signal inference workers to stop experience collection... (189250 times) [2024-06-25 01:17:45,669][15349] Signal inference workers to resume experience collection... (189250 times) [2024-06-25 01:17:45,698][15401] InferenceWorker_p0-w0: stopping experience collection (189250 times) [2024-06-25 01:17:45,699][15401] InferenceWorker_p0-w0: resuming experience collection (189250 times) [2024-06-25 01:17:45,844][15401] Updated weights for policy 0, policy_version 780453 (0.0030) [2024-06-25 01:17:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12787056640. Throughput: 0: 42710.2. Samples: 12787212160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 01:17:49,295][15401] Updated weights for policy 0, policy_version 780463 (0.0033) [2024-06-25 01:17:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12787269632. Throughput: 0: 42723.0. Samples: 12787348700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 01:17:53,394][15401] Updated weights for policy 0, policy_version 780473 (0.0032) [2024-06-25 01:17:56,827][15401] Updated weights for policy 0, policy_version 780483 (0.0034) [2024-06-25 01:17:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12787466240. Throughput: 0: 42576.9. Samples: 12787597140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:17:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 01:18:01,044][15401] Updated weights for policy 0, policy_version 780493 (0.0033) [2024-06-25 01:18:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12787695616. Throughput: 0: 42708.3. Samples: 12787853760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:18:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 01:18:04,596][15401] Updated weights for policy 0, policy_version 780503 (0.0028) [2024-06-25 01:18:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12787892224. Throughput: 0: 42484.1. Samples: 12787977660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:18:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 01:18:08,954][15401] Updated weights for policy 0, policy_version 780513 (0.0023) [2024-06-25 01:18:12,463][15401] Updated weights for policy 0, policy_version 780523 (0.0037) [2024-06-25 01:18:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12788088832. Throughput: 0: 42331.5. Samples: 12788231300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 01:18:13,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-25 01:18:16,574][15401] Updated weights for policy 0, policy_version 780533 (0.0021) [2024-06-25 01:18:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42543.2). Total num frames: 12788318208. Throughput: 0: 42666.5. Samples: 12788491180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:18,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 01:18:20,107][15401] Updated weights for policy 0, policy_version 780543 (0.0031) [2024-06-25 01:18:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 12788514816. Throughput: 0: 42464.5. Samples: 12788618360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:23,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 01:18:24,135][15401] Updated weights for policy 0, policy_version 780553 (0.0033) [2024-06-25 01:18:27,685][15401] Updated weights for policy 0, policy_version 780563 (0.0048) [2024-06-25 01:18:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12788744192. Throughput: 0: 42429.2. Samples: 12788870900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 01:18:31,671][15401] Updated weights for policy 0, policy_version 780573 (0.0030) [2024-06-25 01:18:33,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 12788973568. Throughput: 0: 42665.2. Samples: 12789132100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 01:18:35,273][15401] Updated weights for policy 0, policy_version 780583 (0.0036) [2024-06-25 01:18:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12789153792. Throughput: 0: 42445.2. Samples: 12789258740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 01:18:39,709][15401] Updated weights for policy 0, policy_version 780593 (0.0032) [2024-06-25 01:18:43,308][15401] Updated weights for policy 0, policy_version 780603 (0.0030) [2024-06-25 01:18:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12789399552. Throughput: 0: 42583.0. Samples: 12789513380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:18:47,419][15401] Updated weights for policy 0, policy_version 780613 (0.0038) [2024-06-25 01:18:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12789596160. Throughput: 0: 42563.1. Samples: 12789769100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 01:18:51,147][15401] Updated weights for policy 0, policy_version 780623 (0.0044) [2024-06-25 01:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12789809152. Throughput: 0: 42684.8. Samples: 12789898480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 01:18:55,052][15401] Updated weights for policy 0, policy_version 780633 (0.0031) [2024-06-25 01:18:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12790022144. Throughput: 0: 42779.4. Samples: 12790156360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:18:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 01:18:58,619][15401] Updated weights for policy 0, policy_version 780643 (0.0034) [2024-06-25 01:19:02,682][15401] Updated weights for policy 0, policy_version 780653 (0.0039) [2024-06-25 01:19:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12790251520. Throughput: 0: 42731.6. Samples: 12790414100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 01:19:06,086][15349] Signal inference workers to stop experience collection... (189300 times) [2024-06-25 01:19:06,086][15349] Signal inference workers to resume experience collection... (189300 times) [2024-06-25 01:19:06,105][15401] InferenceWorker_p0-w0: stopping experience collection (189300 times) [2024-06-25 01:19:06,105][15401] InferenceWorker_p0-w0: resuming experience collection (189300 times) [2024-06-25 01:19:06,233][15401] Updated weights for policy 0, policy_version 780663 (0.0036) [2024-06-25 01:19:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12790464512. Throughput: 0: 42710.9. Samples: 12790540360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:08,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 01:19:10,585][15401] Updated weights for policy 0, policy_version 780673 (0.0041) [2024-06-25 01:19:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 12790677504. Throughput: 0: 42680.5. Samples: 12790791520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 01:19:13,845][15401] Updated weights for policy 0, policy_version 780683 (0.0038) [2024-06-25 01:19:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12790857728. Throughput: 0: 42669.1. Samples: 12791052200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 01:19:18,449][15401] Updated weights for policy 0, policy_version 780693 (0.0040) [2024-06-25 01:19:21,456][15401] Updated weights for policy 0, policy_version 780703 (0.0029) [2024-06-25 01:19:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12791103488. Throughput: 0: 42546.7. Samples: 12791173340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 01:19:25,949][15401] Updated weights for policy 0, policy_version 780713 (0.0035) [2024-06-25 01:19:28,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 12791316480. Throughput: 0: 42672.1. Samples: 12791433720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:28,393][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 01:19:29,510][15401] Updated weights for policy 0, policy_version 780723 (0.0036) [2024-06-25 01:19:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12791496704. Throughput: 0: 42684.9. Samples: 12791689920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:33,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 01:19:33,583][15401] Updated weights for policy 0, policy_version 780733 (0.0044) [2024-06-25 01:19:37,117][15401] Updated weights for policy 0, policy_version 780743 (0.0035) [2024-06-25 01:19:38,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12791742464. Throughput: 0: 42499.1. Samples: 12791810940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:38,396][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 01:19:41,183][15401] Updated weights for policy 0, policy_version 780753 (0.0031) [2024-06-25 01:19:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 12791939072. Throughput: 0: 42476.4. Samples: 12792067800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 01:19:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780759_12791955456.pth... [2024-06-25 01:19:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780135_12781731840.pth [2024-06-25 01:19:44,769][15401] Updated weights for policy 0, policy_version 780763 (0.0038) [2024-06-25 01:19:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12792119296. Throughput: 0: 42370.8. Samples: 12792320780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:48,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 01:19:49,192][15401] Updated weights for policy 0, policy_version 780773 (0.0037) [2024-06-25 01:19:52,376][15401] Updated weights for policy 0, policy_version 780783 (0.0033) [2024-06-25 01:19:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12792381440. Throughput: 0: 42298.3. Samples: 12792443780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 01:19:56,856][15401] Updated weights for policy 0, policy_version 780793 (0.0033) [2024-06-25 01:19:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 12792561664. Throughput: 0: 42439.2. Samples: 12792701280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:19:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 01:20:00,190][15401] Updated weights for policy 0, policy_version 780803 (0.0031) [2024-06-25 01:20:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 12792774656. Throughput: 0: 42358.7. Samples: 12792958340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 01:20:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 01:20:04,437][15401] Updated weights for policy 0, policy_version 780813 (0.0029) [2024-06-25 01:20:07,850][15401] Updated weights for policy 0, policy_version 780823 (0.0044) [2024-06-25 01:20:08,396][15132] Fps is (10 sec: 47482.3, 60 sec: 42866.9, 300 sec: 42597.8). Total num frames: 12793036800. Throughput: 0: 42503.2. Samples: 12793086260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:08,396][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 01:20:12,105][15401] Updated weights for policy 0, policy_version 780833 (0.0033) [2024-06-25 01:20:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12793200640. Throughput: 0: 42286.6. Samples: 12793336520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 01:20:15,520][15349] Signal inference workers to stop experience collection... (189350 times) [2024-06-25 01:20:15,521][15349] Signal inference workers to resume experience collection... (189350 times) [2024-06-25 01:20:15,531][15401] Updated weights for policy 0, policy_version 780843 (0.0043) [2024-06-25 01:20:15,554][15401] InferenceWorker_p0-w0: stopping experience collection (189350 times) [2024-06-25 01:20:15,554][15401] InferenceWorker_p0-w0: resuming experience collection (189350 times) [2024-06-25 01:20:18,396][15132] Fps is (10 sec: 39322.1, 60 sec: 42866.9, 300 sec: 42653.6). Total num frames: 12793430016. Throughput: 0: 42334.0. Samples: 12793595220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:18,396][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 01:20:19,789][15401] Updated weights for policy 0, policy_version 780853 (0.0040) [2024-06-25 01:20:23,159][15401] Updated weights for policy 0, policy_version 780863 (0.0033) [2024-06-25 01:20:23,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 12793675776. Throughput: 0: 42512.8. Samples: 12793724020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 01:20:27,377][15401] Updated weights for policy 0, policy_version 780873 (0.0027) [2024-06-25 01:20:28,389][15132] Fps is (10 sec: 42626.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 12793856000. Throughput: 0: 42558.7. Samples: 12793982940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 01:20:30,878][15401] Updated weights for policy 0, policy_version 780883 (0.0022) [2024-06-25 01:20:33,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12794052608. Throughput: 0: 42679.9. Samples: 12794241380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 01:20:34,955][15401] Updated weights for policy 0, policy_version 780893 (0.0037) [2024-06-25 01:20:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12794298368. Throughput: 0: 42575.5. Samples: 12794359680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 01:20:38,419][15401] Updated weights for policy 0, policy_version 780903 (0.0032) [2024-06-25 01:20:42,825][15401] Updated weights for policy 0, policy_version 780913 (0.0047) [2024-06-25 01:20:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12794511360. Throughput: 0: 42648.3. Samples: 12794620460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 01:20:46,186][15401] Updated weights for policy 0, policy_version 780923 (0.0033) [2024-06-25 01:20:48,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42542.9). Total num frames: 12794691584. Throughput: 0: 42643.9. Samples: 12794877420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:48,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 01:20:50,397][15401] Updated weights for policy 0, policy_version 780933 (0.0030) [2024-06-25 01:20:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 12794920960. Throughput: 0: 42451.9. Samples: 12794996420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:53,393][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:20:54,043][15401] Updated weights for policy 0, policy_version 780943 (0.0027) [2024-06-25 01:20:58,103][15401] Updated weights for policy 0, policy_version 780953 (0.0034) [2024-06-25 01:20:58,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12795150336. Throughput: 0: 42762.7. Samples: 12795260840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:20:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 01:21:01,958][15401] Updated weights for policy 0, policy_version 780963 (0.0034) [2024-06-25 01:21:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 12795330560. Throughput: 0: 42665.1. Samples: 12795514880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 01:21:05,648][15401] Updated weights for policy 0, policy_version 780973 (0.0061) [2024-06-25 01:21:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42056.9, 300 sec: 42654.3). Total num frames: 12795559936. Throughput: 0: 42528.6. Samples: 12795637800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 01:21:09,469][15401] Updated weights for policy 0, policy_version 780983 (0.0040) [2024-06-25 01:21:13,075][15401] Updated weights for policy 0, policy_version 780993 (0.0032) [2024-06-25 01:21:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12795789312. Throughput: 0: 42645.2. Samples: 12795901980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:21:16,983][15401] Updated weights for policy 0, policy_version 781003 (0.0030) [2024-06-25 01:21:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 12795985920. Throughput: 0: 42588.4. Samples: 12796157860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 01:21:21,049][15401] Updated weights for policy 0, policy_version 781013 (0.0042) [2024-06-25 01:21:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 12796198912. Throughput: 0: 42633.9. Samples: 12796278200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 01:21:25,060][15401] Updated weights for policy 0, policy_version 781023 (0.0031) [2024-06-25 01:21:27,689][15349] Signal inference workers to stop experience collection... (189400 times) [2024-06-25 01:21:27,690][15349] Signal inference workers to resume experience collection... (189400 times) [2024-06-25 01:21:27,711][15401] InferenceWorker_p0-w0: stopping experience collection (189400 times) [2024-06-25 01:21:27,711][15401] InferenceWorker_p0-w0: resuming experience collection (189400 times) [2024-06-25 01:21:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12796428288. Throughput: 0: 42633.3. Samples: 12796538960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:28,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 01:21:28,539][15401] Updated weights for policy 0, policy_version 781033 (0.0028) [2024-06-25 01:21:32,597][15401] Updated weights for policy 0, policy_version 781043 (0.0033) [2024-06-25 01:21:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12796624896. Throughput: 0: 42565.3. Samples: 12796792760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 01:21:36,312][15401] Updated weights for policy 0, policy_version 781053 (0.0035) [2024-06-25 01:21:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12796854272. Throughput: 0: 42669.0. Samples: 12796916420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:38,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 01:21:40,153][15401] Updated weights for policy 0, policy_version 781063 (0.0034) [2024-06-25 01:21:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12797050880. Throughput: 0: 42695.5. Samples: 12797182140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 01:21:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781070_12797050880.pth... [2024-06-25 01:21:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780447_12786843648.pth [2024-06-25 01:21:44,201][15401] Updated weights for policy 0, policy_version 781073 (0.0032) [2024-06-25 01:21:48,041][15401] Updated weights for policy 0, policy_version 781083 (0.0030) [2024-06-25 01:21:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 12797263872. Throughput: 0: 42582.3. Samples: 12797431080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 01:21:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 01:21:51,741][15401] Updated weights for policy 0, policy_version 781093 (0.0042) [2024-06-25 01:21:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 12797493248. Throughput: 0: 42737.2. Samples: 12797560980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:21:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 01:21:55,733][15401] Updated weights for policy 0, policy_version 781103 (0.0027) [2024-06-25 01:21:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 12797673472. Throughput: 0: 42592.4. Samples: 12797818640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:21:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 01:21:59,486][15401] Updated weights for policy 0, policy_version 781113 (0.0045) [2024-06-25 01:22:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 12797902848. Throughput: 0: 42557.4. Samples: 12798072940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 01:22:03,750][15401] Updated weights for policy 0, policy_version 781123 (0.0032) [2024-06-25 01:22:07,389][15401] Updated weights for policy 0, policy_version 781133 (0.0028) [2024-06-25 01:22:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12798132224. Throughput: 0: 42880.4. Samples: 12798207820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 01:22:11,266][15401] Updated weights for policy 0, policy_version 781143 (0.0039) [2024-06-25 01:22:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 12798312448. Throughput: 0: 42785.8. Samples: 12798464320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 01:22:14,926][15401] Updated weights for policy 0, policy_version 781153 (0.0042) [2024-06-25 01:22:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12798541824. Throughput: 0: 42664.5. Samples: 12798712660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 01:22:18,902][15401] Updated weights for policy 0, policy_version 781163 (0.0037) [2024-06-25 01:22:22,423][15401] Updated weights for policy 0, policy_version 781173 (0.0047) [2024-06-25 01:22:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12798787584. Throughput: 0: 42964.9. Samples: 12798849840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 01:22:26,421][15401] Updated weights for policy 0, policy_version 781183 (0.0032) [2024-06-25 01:22:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 12798951424. Throughput: 0: 42775.3. Samples: 12799107020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 01:22:29,931][15401] Updated weights for policy 0, policy_version 781193 (0.0023) [2024-06-25 01:22:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12799197184. Throughput: 0: 42839.6. Samples: 12799358860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 01:22:33,934][15401] Updated weights for policy 0, policy_version 781203 (0.0036) [2024-06-25 01:22:37,708][15401] Updated weights for policy 0, policy_version 781213 (0.0033) [2024-06-25 01:22:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12799426560. Throughput: 0: 42981.5. Samples: 12799495140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 01:22:41,434][15401] Updated weights for policy 0, policy_version 781223 (0.0023) [2024-06-25 01:22:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 12799590400. Throughput: 0: 42992.9. Samples: 12799753320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 01:22:45,348][15401] Updated weights for policy 0, policy_version 781233 (0.0037) [2024-06-25 01:22:45,812][15349] Signal inference workers to stop experience collection... (189450 times) [2024-06-25 01:22:45,813][15349] Signal inference workers to resume experience collection... (189450 times) [2024-06-25 01:22:45,837][15401] InferenceWorker_p0-w0: stopping experience collection (189450 times) [2024-06-25 01:22:45,837][15401] InferenceWorker_p0-w0: resuming experience collection (189450 times) [2024-06-25 01:22:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12799852544. Throughput: 0: 42895.9. Samples: 12800003260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 01:22:49,271][15401] Updated weights for policy 0, policy_version 781243 (0.0026) [2024-06-25 01:22:52,997][15401] Updated weights for policy 0, policy_version 781253 (0.0031) [2024-06-25 01:22:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12800049152. Throughput: 0: 43072.0. Samples: 12800146060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 01:22:56,888][15401] Updated weights for policy 0, policy_version 781263 (0.0031) [2024-06-25 01:22:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12800245760. Throughput: 0: 42869.7. Samples: 12800393460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:22:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 01:23:00,823][15401] Updated weights for policy 0, policy_version 781273 (0.0038) [2024-06-25 01:23:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 12800507904. Throughput: 0: 42884.5. Samples: 12800642460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 01:23:04,398][15401] Updated weights for policy 0, policy_version 781283 (0.0028) [2024-06-25 01:23:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12800688128. Throughput: 0: 42844.1. Samples: 12800777820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 01:23:08,472][15401] Updated weights for policy 0, policy_version 781293 (0.0029) [2024-06-25 01:23:12,612][15401] Updated weights for policy 0, policy_version 781303 (0.0030) [2024-06-25 01:23:13,391][15132] Fps is (10 sec: 37678.7, 60 sec: 42870.6, 300 sec: 42598.2). Total num frames: 12800884736. Throughput: 0: 42677.3. Samples: 12801027560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:13,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 01:23:16,390][15401] Updated weights for policy 0, policy_version 781313 (0.0038) [2024-06-25 01:23:18,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 12801146880. Throughput: 0: 42489.2. Samples: 12801270880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 01:23:20,203][15401] Updated weights for policy 0, policy_version 781323 (0.0027) [2024-06-25 01:23:23,389][15132] Fps is (10 sec: 40965.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 12801294336. Throughput: 0: 42581.4. Samples: 12801411300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 01:23:24,350][15401] Updated weights for policy 0, policy_version 781333 (0.0035) [2024-06-25 01:23:27,837][15401] Updated weights for policy 0, policy_version 781343 (0.0030) [2024-06-25 01:23:28,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12801523712. Throughput: 0: 42357.7. Samples: 12801659420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 01:23:31,913][15401] Updated weights for policy 0, policy_version 781353 (0.0034) [2024-06-25 01:23:33,390][15132] Fps is (10 sec: 50789.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12801802240. Throughput: 0: 42368.9. Samples: 12801909860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 01:23:33,394][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 01:23:35,511][15401] Updated weights for policy 0, policy_version 781363 (0.0038) [2024-06-25 01:23:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12801966080. Throughput: 0: 42402.6. Samples: 12802054180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:23:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 01:23:39,338][15401] Updated weights for policy 0, policy_version 781373 (0.0042) [2024-06-25 01:23:42,994][15401] Updated weights for policy 0, policy_version 781383 (0.0027) [2024-06-25 01:23:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12802179072. Throughput: 0: 42578.7. Samples: 12802309500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:23:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:23:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781383_12802179072.pth... [2024-06-25 01:23:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000780759_12791955456.pth [2024-06-25 01:23:47,032][15401] Updated weights for policy 0, policy_version 781393 (0.0045) [2024-06-25 01:23:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12802424832. Throughput: 0: 42568.5. Samples: 12802558040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:23:48,394][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 01:23:50,518][15401] Updated weights for policy 0, policy_version 781403 (0.0027) [2024-06-25 01:23:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12802621440. Throughput: 0: 42638.2. Samples: 12802696540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:23:53,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 01:23:54,352][15401] Updated weights for policy 0, policy_version 781413 (0.0046) [2024-06-25 01:23:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12802818048. Throughput: 0: 42744.7. Samples: 12802951020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:23:58,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 01:23:58,894][15401] Updated weights for policy 0, policy_version 781423 (0.0028) [2024-06-25 01:24:01,244][15349] Signal inference workers to stop experience collection... (189500 times) [2024-06-25 01:24:01,245][15349] Signal inference workers to resume experience collection... (189500 times) [2024-06-25 01:24:01,271][15401] InferenceWorker_p0-w0: stopping experience collection (189500 times) [2024-06-25 01:24:01,271][15401] InferenceWorker_p0-w0: resuming experience collection (189500 times) [2024-06-25 01:24:01,930][15401] Updated weights for policy 0, policy_version 781433 (0.0040) [2024-06-25 01:24:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12803080192. Throughput: 0: 42909.0. Samples: 12803201780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 01:24:06,453][15401] Updated weights for policy 0, policy_version 781443 (0.0045) [2024-06-25 01:24:08,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 12803244032. Throughput: 0: 42917.7. Samples: 12803342700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:08,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 01:24:09,494][15401] Updated weights for policy 0, policy_version 781453 (0.0031) [2024-06-25 01:24:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43145.4, 300 sec: 42765.0). Total num frames: 12803473408. Throughput: 0: 42992.8. Samples: 12803594100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 01:24:14,054][15401] Updated weights for policy 0, policy_version 781463 (0.0028) [2024-06-25 01:24:17,042][15401] Updated weights for policy 0, policy_version 781473 (0.0030) [2024-06-25 01:24:18,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12803702784. Throughput: 0: 42997.9. Samples: 12803844760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 01:24:21,634][15401] Updated weights for policy 0, policy_version 781483 (0.0038) [2024-06-25 01:24:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 12803883008. Throughput: 0: 42721.3. Samples: 12803976640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:23,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 01:24:24,797][15401] Updated weights for policy 0, policy_version 781493 (0.0040) [2024-06-25 01:24:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12804096000. Throughput: 0: 42776.2. Samples: 12804234420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 01:24:29,285][15401] Updated weights for policy 0, policy_version 781503 (0.0039) [2024-06-25 01:24:32,401][15401] Updated weights for policy 0, policy_version 781513 (0.0025) [2024-06-25 01:24:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12804341760. Throughput: 0: 42760.0. Samples: 12804482240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 01:24:36,904][15401] Updated weights for policy 0, policy_version 781523 (0.0034) [2024-06-25 01:24:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12804521984. Throughput: 0: 42724.8. Samples: 12804619160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 01:24:40,156][15401] Updated weights for policy 0, policy_version 781533 (0.0035) [2024-06-25 01:24:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12804751360. Throughput: 0: 42631.2. Samples: 12804869420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 01:24:44,559][15401] Updated weights for policy 0, policy_version 781543 (0.0039) [2024-06-25 01:24:47,777][15401] Updated weights for policy 0, policy_version 781553 (0.0037) [2024-06-25 01:24:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12804997120. Throughput: 0: 42535.5. Samples: 12805115880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 01:24:52,561][15401] Updated weights for policy 0, policy_version 781563 (0.0029) [2024-06-25 01:24:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12805144576. Throughput: 0: 42512.0. Samples: 12805255640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 01:24:55,387][15401] Updated weights for policy 0, policy_version 781573 (0.0037) [2024-06-25 01:24:58,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12805373952. Throughput: 0: 42455.1. Samples: 12805504580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:24:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 01:25:00,197][15401] Updated weights for policy 0, policy_version 781583 (0.0037) [2024-06-25 01:25:03,114][15401] Updated weights for policy 0, policy_version 781593 (0.0028) [2024-06-25 01:25:03,389][15132] Fps is (10 sec: 49152.3, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 12805636096. Throughput: 0: 42481.3. Samples: 12805756420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:25:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 01:25:08,244][15401] Updated weights for policy 0, policy_version 781603 (0.0025) [2024-06-25 01:25:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 12805783552. Throughput: 0: 42626.8. Samples: 12805894840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:25:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 01:25:10,709][15401] Updated weights for policy 0, policy_version 781613 (0.0032) [2024-06-25 01:25:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 12806029312. Throughput: 0: 42418.0. Samples: 12806143240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:25:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 01:25:15,788][15401] Updated weights for policy 0, policy_version 781623 (0.0039) [2024-06-25 01:25:18,389][15401] Updated weights for policy 0, policy_version 781633 (0.0023) [2024-06-25 01:25:18,390][15132] Fps is (10 sec: 49151.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12806275072. Throughput: 0: 42564.0. Samples: 12806397620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 01:25:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 01:25:18,413][15349] Signal inference workers to stop experience collection... (189550 times) [2024-06-25 01:25:18,414][15349] Signal inference workers to resume experience collection... (189550 times) [2024-06-25 01:25:18,453][15401] InferenceWorker_p0-w0: stopping experience collection (189550 times) [2024-06-25 01:25:18,453][15401] InferenceWorker_p0-w0: resuming experience collection (189550 times) [2024-06-25 01:25:23,320][15401] Updated weights for policy 0, policy_version 781643 (0.0031) [2024-06-25 01:25:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12806438912. Throughput: 0: 42430.8. Samples: 12806528540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 01:25:26,162][15401] Updated weights for policy 0, policy_version 781653 (0.0034) [2024-06-25 01:25:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 12806684672. Throughput: 0: 42446.3. Samples: 12806779500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 01:25:30,942][15401] Updated weights for policy 0, policy_version 781663 (0.0029) [2024-06-25 01:25:33,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12806897664. Throughput: 0: 42688.1. Samples: 12807036840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 01:25:33,809][15401] Updated weights for policy 0, policy_version 781673 (0.0025) [2024-06-25 01:25:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12807077888. Throughput: 0: 42419.2. Samples: 12807164500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 01:25:38,518][15401] Updated weights for policy 0, policy_version 781683 (0.0038) [2024-06-25 01:25:41,475][15401] Updated weights for policy 0, policy_version 781693 (0.0024) [2024-06-25 01:25:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12807323648. Throughput: 0: 42578.2. Samples: 12807420600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 01:25:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781698_12807340032.pth... [2024-06-25 01:25:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781070_12797050880.pth [2024-06-25 01:25:46,201][15401] Updated weights for policy 0, policy_version 781703 (0.0038) [2024-06-25 01:25:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 12807520256. Throughput: 0: 42651.0. Samples: 12807675720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 01:25:49,260][15401] Updated weights for policy 0, policy_version 781713 (0.0038) [2024-06-25 01:25:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12807700480. Throughput: 0: 42476.4. Samples: 12807806280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:53,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 01:25:54,120][15401] Updated weights for policy 0, policy_version 781723 (0.0035) [2024-06-25 01:25:56,890][15401] Updated weights for policy 0, policy_version 781733 (0.0031) [2024-06-25 01:25:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 12807946240. Throughput: 0: 42674.3. Samples: 12808063680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:25:58,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 01:26:01,951][15401] Updated weights for policy 0, policy_version 781743 (0.0035) [2024-06-25 01:26:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12808159232. Throughput: 0: 42657.5. Samples: 12808317200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:03,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 01:26:04,599][15401] Updated weights for policy 0, policy_version 781753 (0.0035) [2024-06-25 01:26:08,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12808355840. Throughput: 0: 42621.2. Samples: 12808446500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:08,396][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 01:26:09,517][15401] Updated weights for policy 0, policy_version 781763 (0.0023) [2024-06-25 01:26:12,518][15401] Updated weights for policy 0, policy_version 781773 (0.0027) [2024-06-25 01:26:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12808585216. Throughput: 0: 42588.9. Samples: 12808696000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 01:26:17,070][15401] Updated weights for policy 0, policy_version 781783 (0.0034) [2024-06-25 01:26:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12808798208. Throughput: 0: 42743.4. Samples: 12808960300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 01:26:20,094][15401] Updated weights for policy 0, policy_version 781793 (0.0032) [2024-06-25 01:26:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12809011200. Throughput: 0: 42701.1. Samples: 12809086060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 01:26:24,790][15401] Updated weights for policy 0, policy_version 781803 (0.0031) [2024-06-25 01:26:27,715][15401] Updated weights for policy 0, policy_version 781813 (0.0038) [2024-06-25 01:26:28,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 12809224192. Throughput: 0: 42594.7. Samples: 12809337460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:28,392][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 01:26:32,312][15401] Updated weights for policy 0, policy_version 781823 (0.0050) [2024-06-25 01:26:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 12809437184. Throughput: 0: 42747.5. Samples: 12809599360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:33,396][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 01:26:35,238][15401] Updated weights for policy 0, policy_version 781833 (0.0032) [2024-06-25 01:26:38,396][15132] Fps is (10 sec: 42581.1, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 12809650176. Throughput: 0: 42700.5. Samples: 12809728080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:38,397][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 01:26:39,822][15401] Updated weights for policy 0, policy_version 781843 (0.0031) [2024-06-25 01:26:42,674][15349] Signal inference workers to stop experience collection... (189600 times) [2024-06-25 01:26:42,724][15401] InferenceWorker_p0-w0: stopping experience collection (189600 times) [2024-06-25 01:26:42,789][15349] Signal inference workers to resume experience collection... (189600 times) [2024-06-25 01:26:42,789][15401] InferenceWorker_p0-w0: resuming experience collection (189600 times) [2024-06-25 01:26:42,925][15401] Updated weights for policy 0, policy_version 781853 (0.0023) [2024-06-25 01:26:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12809879552. Throughput: 0: 42650.2. Samples: 12809982840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 01:26:47,581][15401] Updated weights for policy 0, policy_version 781863 (0.0031) [2024-06-25 01:26:48,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12810076160. Throughput: 0: 42769.6. Samples: 12810241840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 01:26:50,892][15401] Updated weights for policy 0, policy_version 781873 (0.0034) [2024-06-25 01:26:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12810256384. Throughput: 0: 42603.2. Samples: 12810363640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 01:26:55,101][15401] Updated weights for policy 0, policy_version 781883 (0.0033) [2024-06-25 01:26:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12810518528. Throughput: 0: 42735.5. Samples: 12810619100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:26:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 01:26:58,741][15401] Updated weights for policy 0, policy_version 781893 (0.0028) [2024-06-25 01:27:02,780][15401] Updated weights for policy 0, policy_version 781903 (0.0043) [2024-06-25 01:27:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12810698752. Throughput: 0: 42631.1. Samples: 12810878700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:27:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 01:27:06,490][15401] Updated weights for policy 0, policy_version 781913 (0.0037) [2024-06-25 01:27:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12810911744. Throughput: 0: 42675.7. Samples: 12811006460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 01:27:10,484][15401] Updated weights for policy 0, policy_version 781923 (0.0030) [2024-06-25 01:27:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12811157504. Throughput: 0: 42801.3. Samples: 12811263420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 01:27:14,083][15401] Updated weights for policy 0, policy_version 781933 (0.0032) [2024-06-25 01:27:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 12811337728. Throughput: 0: 42633.8. Samples: 12811517980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:18,393][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 01:27:18,634][15401] Updated weights for policy 0, policy_version 781943 (0.0043) [2024-06-25 01:27:21,861][15401] Updated weights for policy 0, policy_version 781953 (0.0045) [2024-06-25 01:27:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12811567104. Throughput: 0: 42520.3. Samples: 12811641220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:23,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 01:27:26,147][15401] Updated weights for policy 0, policy_version 781963 (0.0035) [2024-06-25 01:27:28,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12811796480. Throughput: 0: 42656.1. Samples: 12811902360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:28,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-25 01:27:29,351][15401] Updated weights for policy 0, policy_version 781973 (0.0033) [2024-06-25 01:27:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12811976704. Throughput: 0: 42634.7. Samples: 12812160400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:33,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 01:27:33,898][15401] Updated weights for policy 0, policy_version 781983 (0.0025) [2024-06-25 01:27:36,790][15401] Updated weights for policy 0, policy_version 781993 (0.0039) [2024-06-25 01:27:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 12812206080. Throughput: 0: 42666.1. Samples: 12812283620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 01:27:41,429][15401] Updated weights for policy 0, policy_version 782003 (0.0032) [2024-06-25 01:27:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 12812435456. Throughput: 0: 42787.6. Samples: 12812544540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 01:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782009_12812435456.pth... [2024-06-25 01:27:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781383_12802179072.pth [2024-06-25 01:27:44,444][15401] Updated weights for policy 0, policy_version 782013 (0.0030) [2024-06-25 01:27:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12812615680. Throughput: 0: 42725.4. Samples: 12812801340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 01:27:49,010][15401] Updated weights for policy 0, policy_version 782023 (0.0031) [2024-06-25 01:27:52,323][15401] Updated weights for policy 0, policy_version 782033 (0.0033) [2024-06-25 01:27:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12812861440. Throughput: 0: 42714.3. Samples: 12812928600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 01:27:56,683][15401] Updated weights for policy 0, policy_version 782043 (0.0033) [2024-06-25 01:27:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12813058048. Throughput: 0: 42711.6. Samples: 12813185440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:27:58,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 01:28:00,307][15401] Updated weights for policy 0, policy_version 782053 (0.0025) [2024-06-25 01:28:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 12813271040. Throughput: 0: 42754.3. Samples: 12813441820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:03,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 01:28:04,504][15401] Updated weights for policy 0, policy_version 782063 (0.0037) [2024-06-25 01:28:06,527][15349] Signal inference workers to stop experience collection... (189650 times) [2024-06-25 01:28:06,568][15401] InferenceWorker_p0-w0: stopping experience collection (189650 times) [2024-06-25 01:28:06,579][15349] Signal inference workers to resume experience collection... (189650 times) [2024-06-25 01:28:06,585][15401] InferenceWorker_p0-w0: resuming experience collection (189650 times) [2024-06-25 01:28:08,071][15401] Updated weights for policy 0, policy_version 782073 (0.0039) [2024-06-25 01:28:08,396][15132] Fps is (10 sec: 44208.6, 60 sec: 43140.0, 300 sec: 42764.3). Total num frames: 12813500416. Throughput: 0: 42804.7. Samples: 12813567700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:08,396][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 01:28:12,176][15401] Updated weights for policy 0, policy_version 782083 (0.0028) [2024-06-25 01:28:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 12813697024. Throughput: 0: 42790.2. Samples: 12813827920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 01:28:15,527][15401] Updated weights for policy 0, policy_version 782093 (0.0021) [2024-06-25 01:28:18,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 12813910016. Throughput: 0: 42796.8. Samples: 12814086260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:18,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 01:28:19,674][15401] Updated weights for policy 0, policy_version 782103 (0.0029) [2024-06-25 01:28:22,970][15401] Updated weights for policy 0, policy_version 782113 (0.0033) [2024-06-25 01:28:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 12814139392. Throughput: 0: 42868.0. Samples: 12814212780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:23,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 01:28:27,171][15401] Updated weights for policy 0, policy_version 782123 (0.0040) [2024-06-25 01:28:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 12814336000. Throughput: 0: 42787.9. Samples: 12814470000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 01:28:30,446][15401] Updated weights for policy 0, policy_version 782133 (0.0029) [2024-06-25 01:28:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12814565376. Throughput: 0: 42723.4. Samples: 12814723900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 01:28:34,715][15401] Updated weights for policy 0, policy_version 782143 (0.0032) [2024-06-25 01:28:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12814778368. Throughput: 0: 42743.6. Samples: 12814852060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 01:28:38,572][15401] Updated weights for policy 0, policy_version 782153 (0.0040) [2024-06-25 01:28:42,681][15401] Updated weights for policy 0, policy_version 782163 (0.0044) [2024-06-25 01:28:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12814991360. Throughput: 0: 42702.2. Samples: 12815107040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 01:28:46,185][15401] Updated weights for policy 0, policy_version 782173 (0.0035) [2024-06-25 01:28:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12815204352. Throughput: 0: 42690.2. Samples: 12815362880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:48,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 01:28:50,301][15401] Updated weights for policy 0, policy_version 782183 (0.0029) [2024-06-25 01:28:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 12815400960. Throughput: 0: 42754.5. Samples: 12815491480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 01:28:53,393][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 01:28:53,837][15401] Updated weights for policy 0, policy_version 782193 (0.0033) [2024-06-25 01:28:57,984][15401] Updated weights for policy 0, policy_version 782203 (0.0026) [2024-06-25 01:28:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12815630336. Throughput: 0: 42580.3. Samples: 12815744040. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:28:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 01:29:01,437][15401] Updated weights for policy 0, policy_version 782213 (0.0039) [2024-06-25 01:29:03,392][15132] Fps is (10 sec: 42598.1, 60 sec: 42596.6, 300 sec: 42653.9). Total num frames: 12815826944. Throughput: 0: 42497.3. Samples: 12815998740. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:03,393][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 01:29:05,538][15401] Updated weights for policy 0, policy_version 782223 (0.0055) [2024-06-25 01:29:08,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42325.3, 300 sec: 42597.5). Total num frames: 12816039936. Throughput: 0: 42536.2. Samples: 12816127080. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:08,397][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 01:29:09,125][15401] Updated weights for policy 0, policy_version 782233 (0.0035) [2024-06-25 01:29:13,169][15401] Updated weights for policy 0, policy_version 782243 (0.0038) [2024-06-25 01:29:13,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12816285696. Throughput: 0: 42627.1. Samples: 12816388220. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 01:29:17,306][15401] Updated weights for policy 0, policy_version 782253 (0.0038) [2024-06-25 01:29:18,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12816482304. Throughput: 0: 42545.9. Samples: 12816638460. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 01:29:20,843][15401] Updated weights for policy 0, policy_version 782263 (0.0035) [2024-06-25 01:29:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 12816678912. Throughput: 0: 42473.3. Samples: 12816763360. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 01:29:23,466][15349] Signal inference workers to stop experience collection... (189700 times) [2024-06-25 01:29:23,467][15349] Signal inference workers to resume experience collection... (189700 times) [2024-06-25 01:29:23,492][15401] InferenceWorker_p0-w0: stopping experience collection (189700 times) [2024-06-25 01:29:23,493][15401] InferenceWorker_p0-w0: resuming experience collection (189700 times) [2024-06-25 01:29:25,304][15401] Updated weights for policy 0, policy_version 782273 (0.0032) [2024-06-25 01:29:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12816908288. Throughput: 0: 42544.9. Samples: 12817021560. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 01:29:28,529][15401] Updated weights for policy 0, policy_version 782283 (0.0044) [2024-06-25 01:29:32,949][15401] Updated weights for policy 0, policy_version 782293 (0.0039) [2024-06-25 01:29:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12817104896. Throughput: 0: 42698.2. Samples: 12817284300. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 01:29:36,172][15401] Updated weights for policy 0, policy_version 782303 (0.0038) [2024-06-25 01:29:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12817334272. Throughput: 0: 42594.7. Samples: 12817408140. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 01:29:40,612][15401] Updated weights for policy 0, policy_version 782313 (0.0033) [2024-06-25 01:29:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12817530880. Throughput: 0: 42591.2. Samples: 12817660640. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 01:29:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782321_12817547264.pth... [2024-06-25 01:29:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000781698_12807340032.pth [2024-06-25 01:29:43,815][15401] Updated weights for policy 0, policy_version 782323 (0.0031) [2024-06-25 01:29:48,275][15401] Updated weights for policy 0, policy_version 782333 (0.0033) [2024-06-25 01:29:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12817743872. Throughput: 0: 42732.2. Samples: 12817921580. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 01:29:51,658][15401] Updated weights for policy 0, policy_version 782343 (0.0031) [2024-06-25 01:29:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12817973248. Throughput: 0: 42622.6. Samples: 12818044820. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:53,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 01:29:56,035][15401] Updated weights for policy 0, policy_version 782353 (0.0036) [2024-06-25 01:29:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12818169856. Throughput: 0: 42447.5. Samples: 12818298360. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:29:58,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 01:29:59,696][15401] Updated weights for policy 0, policy_version 782363 (0.0039) [2024-06-25 01:30:03,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 12818366464. Throughput: 0: 42661.7. Samples: 12818558240. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 01:30:03,636][15401] Updated weights for policy 0, policy_version 782373 (0.0030) [2024-06-25 01:30:07,422][15401] Updated weights for policy 0, policy_version 782383 (0.0030) [2024-06-25 01:30:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.0, 300 sec: 42654.0). Total num frames: 12818612224. Throughput: 0: 42586.6. Samples: 12818679760. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 01:30:11,135][15401] Updated weights for policy 0, policy_version 782393 (0.0031) [2024-06-25 01:30:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12818825216. Throughput: 0: 42679.5. Samples: 12818942140. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 01:30:15,069][15401] Updated weights for policy 0, policy_version 782403 (0.0028) [2024-06-25 01:30:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12819005440. Throughput: 0: 42580.1. Samples: 12819200400. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 01:30:18,796][15401] Updated weights for policy 0, policy_version 782413 (0.0038) [2024-06-25 01:30:22,583][15401] Updated weights for policy 0, policy_version 782423 (0.0024) [2024-06-25 01:30:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12819251200. Throughput: 0: 42557.8. Samples: 12819323240. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:23,400][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 01:30:26,448][15401] Updated weights for policy 0, policy_version 782433 (0.0042) [2024-06-25 01:30:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12819464192. Throughput: 0: 42608.9. Samples: 12819578040. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:28,399][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 01:30:30,177][15401] Updated weights for policy 0, policy_version 782443 (0.0034) [2024-06-25 01:30:33,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42597.6, 300 sec: 42653.8). Total num frames: 12819660800. Throughput: 0: 42557.5. Samples: 12819836720. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:33,391][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 01:30:34,074][15401] Updated weights for policy 0, policy_version 782453 (0.0036) [2024-06-25 01:30:36,984][15349] Signal inference workers to stop experience collection... (189750 times) [2024-06-25 01:30:36,984][15349] Signal inference workers to resume experience collection... (189750 times) [2024-06-25 01:30:37,012][15401] InferenceWorker_p0-w0: stopping experience collection (189750 times) [2024-06-25 01:30:37,012][15401] InferenceWorker_p0-w0: resuming experience collection (189750 times) [2024-06-25 01:30:37,975][15401] Updated weights for policy 0, policy_version 782463 (0.0032) [2024-06-25 01:30:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12819890176. Throughput: 0: 42520.9. Samples: 12819958260. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-25 01:30:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 01:30:41,833][15401] Updated weights for policy 0, policy_version 782473 (0.0024) [2024-06-25 01:30:43,392][15132] Fps is (10 sec: 44231.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12820103168. Throughput: 0: 42599.2. Samples: 12820215420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:30:43,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 01:30:45,614][15401] Updated weights for policy 0, policy_version 782483 (0.0035) [2024-06-25 01:30:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12820283392. Throughput: 0: 42563.6. Samples: 12820473600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:30:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 01:30:49,550][15401] Updated weights for policy 0, policy_version 782493 (0.0024) [2024-06-25 01:30:53,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.5, 300 sec: 42598.4). Total num frames: 12820512768. Throughput: 0: 42487.9. Samples: 12820591820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:30:53,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 01:30:53,721][15401] Updated weights for policy 0, policy_version 782503 (0.0036) [2024-06-25 01:30:57,324][15401] Updated weights for policy 0, policy_version 782513 (0.0036) [2024-06-25 01:30:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12820742144. Throughput: 0: 42331.2. Samples: 12820847040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:30:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 01:31:01,309][15401] Updated weights for policy 0, policy_version 782523 (0.0041) [2024-06-25 01:31:03,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12820905984. Throughput: 0: 42402.9. Samples: 12821108540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:03,391][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 01:31:05,274][15401] Updated weights for policy 0, policy_version 782533 (0.0040) [2024-06-25 01:31:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12821151744. Throughput: 0: 42228.8. Samples: 12821223540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 01:31:08,774][15401] Updated weights for policy 0, policy_version 782543 (0.0024) [2024-06-25 01:31:12,854][15401] Updated weights for policy 0, policy_version 782553 (0.0026) [2024-06-25 01:31:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12821364736. Throughput: 0: 42428.5. Samples: 12821487320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 01:31:16,444][15401] Updated weights for policy 0, policy_version 782563 (0.0034) [2024-06-25 01:31:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12821561344. Throughput: 0: 42358.4. Samples: 12821742800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 01:31:20,707][15401] Updated weights for policy 0, policy_version 782573 (0.0025) [2024-06-25 01:31:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 12821790720. Throughput: 0: 42488.7. Samples: 12821870260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:23,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-25 01:31:24,286][15401] Updated weights for policy 0, policy_version 782583 (0.0024) [2024-06-25 01:31:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12821987328. Throughput: 0: 42457.9. Samples: 12822125920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 01:31:28,500][15401] Updated weights for policy 0, policy_version 782593 (0.0027) [2024-06-25 01:31:32,074][15401] Updated weights for policy 0, policy_version 782603 (0.0042) [2024-06-25 01:31:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42599.2, 300 sec: 42599.3). Total num frames: 12822216704. Throughput: 0: 42334.1. Samples: 12822378640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:33,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 01:31:36,250][15401] Updated weights for policy 0, policy_version 782613 (0.0032) [2024-06-25 01:31:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12822429696. Throughput: 0: 42715.6. Samples: 12822513920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 01:31:39,811][15401] Updated weights for policy 0, policy_version 782623 (0.0038) [2024-06-25 01:31:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41780.9, 300 sec: 42487.3). Total num frames: 12822609920. Throughput: 0: 42740.4. Samples: 12822770360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:43,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 01:31:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782631_12822626304.pth... [2024-06-25 01:31:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782009_12812435456.pth [2024-06-25 01:31:43,743][15401] Updated weights for policy 0, policy_version 782633 (0.0023) [2024-06-25 01:31:47,445][15401] Updated weights for policy 0, policy_version 782643 (0.0029) [2024-06-25 01:31:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12822872064. Throughput: 0: 42423.2. Samples: 12823017580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:48,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 01:31:51,170][15349] Signal inference workers to stop experience collection... (189800 times) [2024-06-25 01:31:51,227][15401] InferenceWorker_p0-w0: stopping experience collection (189800 times) [2024-06-25 01:31:51,227][15349] Signal inference workers to resume experience collection... (189800 times) [2024-06-25 01:31:51,239][15401] InferenceWorker_p0-w0: resuming experience collection (189800 times) [2024-06-25 01:31:51,405][15401] Updated weights for policy 0, policy_version 782653 (0.0040) [2024-06-25 01:31:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 12823068672. Throughput: 0: 42908.5. Samples: 12823154420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 01:31:55,010][15401] Updated weights for policy 0, policy_version 782663 (0.0040) [2024-06-25 01:31:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12823265280. Throughput: 0: 42760.8. Samples: 12823411560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:31:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 01:31:59,002][15401] Updated weights for policy 0, policy_version 782673 (0.0034) [2024-06-25 01:32:02,574][15401] Updated weights for policy 0, policy_version 782683 (0.0040) [2024-06-25 01:32:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12823511040. Throughput: 0: 42646.6. Samples: 12823661900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:32:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 01:32:06,558][15401] Updated weights for policy 0, policy_version 782693 (0.0035) [2024-06-25 01:32:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12823707648. Throughput: 0: 42838.8. Samples: 12823798000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:32:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 01:32:10,240][15401] Updated weights for policy 0, policy_version 782703 (0.0030) [2024-06-25 01:32:13,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 12823887872. Throughput: 0: 42694.6. Samples: 12824047180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:32:13,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 01:32:14,247][15401] Updated weights for policy 0, policy_version 782713 (0.0041) [2024-06-25 01:32:17,745][15401] Updated weights for policy 0, policy_version 782723 (0.0032) [2024-06-25 01:32:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12824150016. Throughput: 0: 42699.7. Samples: 12824300120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:32:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 01:32:22,263][15401] Updated weights for policy 0, policy_version 782733 (0.0040) [2024-06-25 01:32:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12824346624. Throughput: 0: 42653.8. Samples: 12824433340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 01:32:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 01:32:25,566][15401] Updated weights for policy 0, policy_version 782743 (0.0044) [2024-06-25 01:32:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12824543232. Throughput: 0: 42600.0. Samples: 12824687360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 01:32:29,713][15401] Updated weights for policy 0, policy_version 782753 (0.0037) [2024-06-25 01:32:33,343][15401] Updated weights for policy 0, policy_version 782763 (0.0022) [2024-06-25 01:32:33,397][15132] Fps is (10 sec: 44202.4, 60 sec: 42866.0, 300 sec: 42652.8). Total num frames: 12824788992. Throughput: 0: 42835.2. Samples: 12824945500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:33,398][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 01:32:37,251][15401] Updated weights for policy 0, policy_version 782773 (0.0031) [2024-06-25 01:32:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12825001984. Throughput: 0: 42695.0. Samples: 12825075700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 01:32:40,915][15401] Updated weights for policy 0, policy_version 782783 (0.0031) [2024-06-25 01:32:43,390][15132] Fps is (10 sec: 40991.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12825198592. Throughput: 0: 42736.0. Samples: 12825334680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 01:32:44,857][15401] Updated weights for policy 0, policy_version 782793 (0.0032) [2024-06-25 01:32:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12825427968. Throughput: 0: 42850.2. Samples: 12825590160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:48,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 01:32:48,664][15401] Updated weights for policy 0, policy_version 782803 (0.0044) [2024-06-25 01:32:52,822][15401] Updated weights for policy 0, policy_version 782813 (0.0030) [2024-06-25 01:32:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12825640960. Throughput: 0: 42708.8. Samples: 12825719900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 01:32:56,315][15401] Updated weights for policy 0, policy_version 782823 (0.0037) [2024-06-25 01:32:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12825853952. Throughput: 0: 42838.3. Samples: 12825974900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:32:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 01:33:00,386][15401] Updated weights for policy 0, policy_version 782833 (0.0034) [2024-06-25 01:33:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 12826050560. Throughput: 0: 42919.4. Samples: 12826231500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 01:33:04,041][15401] Updated weights for policy 0, policy_version 782843 (0.0031) [2024-06-25 01:33:07,963][15401] Updated weights for policy 0, policy_version 782853 (0.0032) [2024-06-25 01:33:07,978][15349] Signal inference workers to stop experience collection... (189850 times) [2024-06-25 01:33:07,979][15349] Signal inference workers to resume experience collection... (189850 times) [2024-06-25 01:33:07,999][15401] InferenceWorker_p0-w0: stopping experience collection (189850 times) [2024-06-25 01:33:07,999][15401] InferenceWorker_p0-w0: resuming experience collection (189850 times) [2024-06-25 01:33:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12826279936. Throughput: 0: 42749.3. Samples: 12826357060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 01:33:11,621][15401] Updated weights for policy 0, policy_version 782863 (0.0032) [2024-06-25 01:33:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 12826492928. Throughput: 0: 42873.3. Samples: 12826616660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 01:33:15,468][15401] Updated weights for policy 0, policy_version 782873 (0.0037) [2024-06-25 01:33:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.4). Total num frames: 12826705920. Throughput: 0: 42856.6. Samples: 12826873820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:18,401][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 01:33:19,691][15401] Updated weights for policy 0, policy_version 782883 (0.0040) [2024-06-25 01:33:22,955][15401] Updated weights for policy 0, policy_version 782893 (0.0025) [2024-06-25 01:33:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12826935296. Throughput: 0: 42797.5. Samples: 12827001580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:23,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 01:33:27,394][15401] Updated weights for policy 0, policy_version 782903 (0.0033) [2024-06-25 01:33:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12827131904. Throughput: 0: 42751.1. Samples: 12827258480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:33:30,603][15401] Updated weights for policy 0, policy_version 782913 (0.0027) [2024-06-25 01:33:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42602.2, 300 sec: 42598.0). Total num frames: 12827344896. Throughput: 0: 42742.3. Samples: 12827513660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:33,392][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 01:33:35,162][15401] Updated weights for policy 0, policy_version 782923 (0.0034) [2024-06-25 01:33:38,336][15401] Updated weights for policy 0, policy_version 782933 (0.0037) [2024-06-25 01:33:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12827574272. Throughput: 0: 42695.0. Samples: 12827641180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 01:33:42,970][15401] Updated weights for policy 0, policy_version 782943 (0.0027) [2024-06-25 01:33:43,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12827754496. Throughput: 0: 42700.0. Samples: 12827896400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:43,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 01:33:43,526][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782945_12827770880.pth... [2024-06-25 01:33:43,591][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782321_12817547264.pth [2024-06-25 01:33:46,049][15401] Updated weights for policy 0, policy_version 782953 (0.0031) [2024-06-25 01:33:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 12827967488. Throughput: 0: 42625.4. Samples: 12828149640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:48,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 01:33:50,749][15401] Updated weights for policy 0, policy_version 782963 (0.0032) [2024-06-25 01:33:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12828196864. Throughput: 0: 42650.6. Samples: 12828276340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:53,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 01:33:54,185][15401] Updated weights for policy 0, policy_version 782973 (0.0042) [2024-06-25 01:33:58,364][15401] Updated weights for policy 0, policy_version 782983 (0.0027) [2024-06-25 01:33:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 12828393472. Throughput: 0: 42566.2. Samples: 12828532140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:33:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 01:34:01,725][15401] Updated weights for policy 0, policy_version 782993 (0.0041) [2024-06-25 01:34:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 12828606464. Throughput: 0: 42612.4. Samples: 12828791280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:34:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 01:34:05,759][15401] Updated weights for policy 0, policy_version 783003 (0.0028) [2024-06-25 01:34:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12828852224. Throughput: 0: 42554.0. Samples: 12828916520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 01:34:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 01:34:09,223][15401] Updated weights for policy 0, policy_version 783013 (0.0029) [2024-06-25 01:34:13,318][15401] Updated weights for policy 0, policy_version 783023 (0.0032) [2024-06-25 01:34:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12829048832. Throughput: 0: 42501.4. Samples: 12829171040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 01:34:17,154][15401] Updated weights for policy 0, policy_version 783033 (0.0042) [2024-06-25 01:34:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 12829245440. Throughput: 0: 42659.1. Samples: 12829433220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 01:34:20,661][15401] Updated weights for policy 0, policy_version 783043 (0.0029) [2024-06-25 01:34:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12829507584. Throughput: 0: 42648.4. Samples: 12829560360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:23,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 01:34:24,651][15401] Updated weights for policy 0, policy_version 783053 (0.0029) [2024-06-25 01:34:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12829687808. Throughput: 0: 42555.6. Samples: 12829811400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 01:34:28,418][15401] Updated weights for policy 0, policy_version 783063 (0.0038) [2024-06-25 01:34:32,734][15401] Updated weights for policy 0, policy_version 783073 (0.0038) [2024-06-25 01:34:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 12829900800. Throughput: 0: 42794.3. Samples: 12830075380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 01:34:36,031][15401] Updated weights for policy 0, policy_version 783083 (0.0033) [2024-06-25 01:34:38,294][15349] Signal inference workers to stop experience collection... (189900 times) [2024-06-25 01:34:38,300][15349] Signal inference workers to resume experience collection... (189900 times) [2024-06-25 01:34:38,347][15401] InferenceWorker_p0-w0: stopping experience collection (189900 times) [2024-06-25 01:34:38,347][15401] InferenceWorker_p0-w0: resuming experience collection (189900 times) [2024-06-25 01:34:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12830130176. Throughput: 0: 42697.9. Samples: 12830197740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 01:34:40,199][15401] Updated weights for policy 0, policy_version 783093 (0.0029) [2024-06-25 01:34:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12830326784. Throughput: 0: 42750.4. Samples: 12830455900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:43,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 01:34:43,749][15401] Updated weights for policy 0, policy_version 783103 (0.0033) [2024-06-25 01:34:47,713][15401] Updated weights for policy 0, policy_version 783113 (0.0032) [2024-06-25 01:34:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 12830523392. Throughput: 0: 42672.1. Samples: 12830711520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 01:34:51,470][15401] Updated weights for policy 0, policy_version 783123 (0.0041) [2024-06-25 01:34:53,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12830769152. Throughput: 0: 42696.1. Samples: 12830837840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 01:34:55,191][15401] Updated weights for policy 0, policy_version 783133 (0.0047) [2024-06-25 01:34:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12830982144. Throughput: 0: 42950.1. Samples: 12831103800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:34:58,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 01:34:58,932][15401] Updated weights for policy 0, policy_version 783143 (0.0027) [2024-06-25 01:35:02,565][15401] Updated weights for policy 0, policy_version 783153 (0.0036) [2024-06-25 01:35:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12831178752. Throughput: 0: 42798.2. Samples: 12831359140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:03,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 01:35:06,428][15401] Updated weights for policy 0, policy_version 783163 (0.0049) [2024-06-25 01:35:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12831408128. Throughput: 0: 42782.8. Samples: 12831485580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 01:35:10,163][15401] Updated weights for policy 0, policy_version 783173 (0.0021) [2024-06-25 01:35:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12831621120. Throughput: 0: 43057.1. Samples: 12831748980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 01:35:14,139][15401] Updated weights for policy 0, policy_version 783183 (0.0040) [2024-06-25 01:35:17,918][15401] Updated weights for policy 0, policy_version 783193 (0.0040) [2024-06-25 01:35:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 12831834112. Throughput: 0: 42784.5. Samples: 12832000680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 01:35:21,607][15401] Updated weights for policy 0, policy_version 783203 (0.0030) [2024-06-25 01:35:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12832047104. Throughput: 0: 42998.5. Samples: 12832132680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 01:35:25,518][15401] Updated weights for policy 0, policy_version 783213 (0.0037) [2024-06-25 01:35:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 12832276480. Throughput: 0: 43054.9. Samples: 12832393380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 01:35:29,266][15401] Updated weights for policy 0, policy_version 783223 (0.0044) [2024-06-25 01:35:33,086][15401] Updated weights for policy 0, policy_version 783233 (0.0031) [2024-06-25 01:35:33,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 12832489472. Throughput: 0: 42895.9. Samples: 12832641940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:33,393][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 01:35:36,923][15401] Updated weights for policy 0, policy_version 783243 (0.0035) [2024-06-25 01:35:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12832702464. Throughput: 0: 43064.5. Samples: 12832775740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 01:35:40,815][15401] Updated weights for policy 0, policy_version 783253 (0.0026) [2024-06-25 01:35:43,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12832899072. Throughput: 0: 42952.2. Samples: 12833036640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 01:35:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783259_12832915456.pth... [2024-06-25 01:35:43,531][15349] Signal inference workers to stop experience collection... (189950 times) [2024-06-25 01:35:43,536][15349] Signal inference workers to resume experience collection... (189950 times) [2024-06-25 01:35:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782631_12822626304.pth [2024-06-25 01:35:43,551][15401] InferenceWorker_p0-w0: stopping experience collection (189950 times) [2024-06-25 01:35:43,551][15401] InferenceWorker_p0-w0: resuming experience collection (189950 times) [2024-06-25 01:35:44,827][15401] Updated weights for policy 0, policy_version 783263 (0.0053) [2024-06-25 01:35:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 12833128448. Throughput: 0: 42826.8. Samples: 12833286340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 01:35:48,570][15401] Updated weights for policy 0, policy_version 783273 (0.0041) [2024-06-25 01:35:52,344][15401] Updated weights for policy 0, policy_version 783283 (0.0038) [2024-06-25 01:35:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12833325056. Throughput: 0: 42910.7. Samples: 12833416560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 01:35:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 01:35:56,645][15401] Updated weights for policy 0, policy_version 783293 (0.0025) [2024-06-25 01:35:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12833554432. Throughput: 0: 42781.9. Samples: 12833674160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:35:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 01:36:00,235][15401] Updated weights for policy 0, policy_version 783303 (0.0029) [2024-06-25 01:36:03,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 12833767424. Throughput: 0: 42885.2. Samples: 12833930620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:03,393][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 01:36:04,123][15401] Updated weights for policy 0, policy_version 783313 (0.0037) [2024-06-25 01:36:07,708][15401] Updated weights for policy 0, policy_version 783323 (0.0026) [2024-06-25 01:36:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12833964032. Throughput: 0: 42832.0. Samples: 12834060120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 01:36:11,889][15401] Updated weights for policy 0, policy_version 783333 (0.0029) [2024-06-25 01:36:13,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12834209792. Throughput: 0: 42771.3. Samples: 12834318080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:13,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 01:36:15,167][15401] Updated weights for policy 0, policy_version 783343 (0.0024) [2024-06-25 01:36:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12834406400. Throughput: 0: 43005.9. Samples: 12834577100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 01:36:19,460][15401] Updated weights for policy 0, policy_version 783353 (0.0035) [2024-06-25 01:36:22,724][15401] Updated weights for policy 0, policy_version 783363 (0.0036) [2024-06-25 01:36:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12834619392. Throughput: 0: 42775.6. Samples: 12834700640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 01:36:26,959][15401] Updated weights for policy 0, policy_version 783373 (0.0034) [2024-06-25 01:36:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12834848768. Throughput: 0: 42741.7. Samples: 12834960020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:28,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 01:36:30,834][15401] Updated weights for policy 0, policy_version 783383 (0.0031) [2024-06-25 01:36:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 12835045376. Throughput: 0: 42903.2. Samples: 12835216980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 01:36:34,639][15401] Updated weights for policy 0, policy_version 783393 (0.0023) [2024-06-25 01:36:38,379][15401] Updated weights for policy 0, policy_version 783403 (0.0030) [2024-06-25 01:36:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12835274752. Throughput: 0: 42788.4. Samples: 12835342040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:38,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 01:36:42,185][15401] Updated weights for policy 0, policy_version 783413 (0.0040) [2024-06-25 01:36:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12835487744. Throughput: 0: 42738.3. Samples: 12835597380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:43,399][15132] Avg episode reward: [(0, '0.288')] [2024-06-25 01:36:46,042][15401] Updated weights for policy 0, policy_version 783423 (0.0030) [2024-06-25 01:36:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12835667968. Throughput: 0: 42800.4. Samples: 12835856540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:48,400][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 01:36:49,950][15401] Updated weights for policy 0, policy_version 783433 (0.0031) [2024-06-25 01:36:53,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 12835897344. Throughput: 0: 42663.1. Samples: 12835980060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:53,403][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 01:36:53,568][15401] Updated weights for policy 0, policy_version 783443 (0.0040) [2024-06-25 01:36:57,885][15401] Updated weights for policy 0, policy_version 783453 (0.0024) [2024-06-25 01:36:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12836110336. Throughput: 0: 42614.6. Samples: 12836235740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:36:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 01:37:01,209][15401] Updated weights for policy 0, policy_version 783463 (0.0039) [2024-06-25 01:37:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12836306944. Throughput: 0: 42571.0. Samples: 12836492800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 01:37:05,520][15401] Updated weights for policy 0, policy_version 783473 (0.0034) [2024-06-25 01:37:06,501][15349] Signal inference workers to stop experience collection... (190000 times) [2024-06-25 01:37:06,501][15349] Signal inference workers to resume experience collection... (190000 times) [2024-06-25 01:37:06,523][15401] InferenceWorker_p0-w0: stopping experience collection (190000 times) [2024-06-25 01:37:06,524][15401] InferenceWorker_p0-w0: resuming experience collection (190000 times) [2024-06-25 01:37:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12836552704. Throughput: 0: 42611.3. Samples: 12836618160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 01:37:08,763][15401] Updated weights for policy 0, policy_version 783483 (0.0028) [2024-06-25 01:37:12,918][15401] Updated weights for policy 0, policy_version 783493 (0.0037) [2024-06-25 01:37:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12836749312. Throughput: 0: 42697.3. Samples: 12836881400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 01:37:16,620][15401] Updated weights for policy 0, policy_version 783503 (0.0034) [2024-06-25 01:37:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12836945920. Throughput: 0: 42693.3. Samples: 12837138180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 01:37:20,927][15401] Updated weights for policy 0, policy_version 783513 (0.0035) [2024-06-25 01:37:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 12837208064. Throughput: 0: 42705.2. Samples: 12837263780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 01:37:24,224][15401] Updated weights for policy 0, policy_version 783523 (0.0040) [2024-06-25 01:37:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42655.0). Total num frames: 12837371904. Throughput: 0: 42686.2. Samples: 12837518260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 01:37:28,604][15401] Updated weights for policy 0, policy_version 783533 (0.0033) [2024-06-25 01:37:32,161][15401] Updated weights for policy 0, policy_version 783543 (0.0035) [2024-06-25 01:37:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12837601280. Throughput: 0: 42527.1. Samples: 12837770260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:33,394][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:37:36,370][15401] Updated weights for policy 0, policy_version 783553 (0.0029) [2024-06-25 01:37:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12837847040. Throughput: 0: 42748.6. Samples: 12837903640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 01:37:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 01:37:39,706][15401] Updated weights for policy 0, policy_version 783563 (0.0040) [2024-06-25 01:37:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 12838027264. Throughput: 0: 42802.1. Samples: 12838161940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:37:43,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 01:37:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783572_12838043648.pth... [2024-06-25 01:37:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000782945_12827770880.pth [2024-06-25 01:37:43,846][15401] Updated weights for policy 0, policy_version 783573 (0.0032) [2024-06-25 01:37:47,339][15401] Updated weights for policy 0, policy_version 783583 (0.0026) [2024-06-25 01:37:48,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12838223872. Throughput: 0: 42833.9. Samples: 12838420320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:37:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 01:37:51,384][15401] Updated weights for policy 0, policy_version 783593 (0.0036) [2024-06-25 01:37:53,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 12838469632. Throughput: 0: 42854.4. Samples: 12838546600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:37:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 01:37:54,895][15401] Updated weights for policy 0, policy_version 783603 (0.0042) [2024-06-25 01:37:58,394][15132] Fps is (10 sec: 44216.8, 60 sec: 42595.2, 300 sec: 42764.4). Total num frames: 12838666240. Throughput: 0: 42685.0. Samples: 12838802420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:37:58,395][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 01:37:59,308][15401] Updated weights for policy 0, policy_version 783613 (0.0038) [2024-06-25 01:38:02,519][15401] Updated weights for policy 0, policy_version 783623 (0.0032) [2024-06-25 01:38:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 12838879232. Throughput: 0: 42581.1. Samples: 12839054340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 01:38:06,866][15401] Updated weights for policy 0, policy_version 783633 (0.0036) [2024-06-25 01:38:08,390][15132] Fps is (10 sec: 45895.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12839124992. Throughput: 0: 42686.8. Samples: 12839184680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:08,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 01:38:10,239][15401] Updated weights for policy 0, policy_version 783643 (0.0025) [2024-06-25 01:38:13,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 12839305216. Throughput: 0: 42788.6. Samples: 12839443740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 01:38:14,417][15401] Updated weights for policy 0, policy_version 783653 (0.0033) [2024-06-25 01:38:18,349][15401] Updated weights for policy 0, policy_version 783663 (0.0042) [2024-06-25 01:38:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12839534592. Throughput: 0: 42778.7. Samples: 12839695300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 01:38:21,859][15401] Updated weights for policy 0, policy_version 783673 (0.0025) [2024-06-25 01:38:22,157][15349] Signal inference workers to stop experience collection... (190050 times) [2024-06-25 01:38:22,164][15349] Signal inference workers to resume experience collection... (190050 times) [2024-06-25 01:38:22,191][15401] InferenceWorker_p0-w0: stopping experience collection (190050 times) [2024-06-25 01:38:22,191][15401] InferenceWorker_p0-w0: resuming experience collection (190050 times) [2024-06-25 01:38:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12839747584. Throughput: 0: 42791.8. Samples: 12839829280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 01:38:26,111][15401] Updated weights for policy 0, policy_version 783683 (0.0039) [2024-06-25 01:38:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 12839927808. Throughput: 0: 42669.9. Samples: 12840081980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 01:38:29,813][15401] Updated weights for policy 0, policy_version 783693 (0.0043) [2024-06-25 01:38:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12840173568. Throughput: 0: 42484.0. Samples: 12840332100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:33,396][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 01:38:33,799][15401] Updated weights for policy 0, policy_version 783703 (0.0037) [2024-06-25 01:38:37,559][15401] Updated weights for policy 0, policy_version 783713 (0.0022) [2024-06-25 01:38:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 12840386560. Throughput: 0: 42647.6. Samples: 12840465740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 01:38:41,719][15401] Updated weights for policy 0, policy_version 783723 (0.0030) [2024-06-25 01:38:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12840566784. Throughput: 0: 42733.6. Samples: 12840725240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:43,391][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 01:38:45,049][15401] Updated weights for policy 0, policy_version 783733 (0.0036) [2024-06-25 01:38:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 12840828928. Throughput: 0: 42654.7. Samples: 12840973800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 01:38:49,216][15401] Updated weights for policy 0, policy_version 783743 (0.0031) [2024-06-25 01:38:52,700][15401] Updated weights for policy 0, policy_version 783753 (0.0049) [2024-06-25 01:38:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12841041920. Throughput: 0: 42712.5. Samples: 12841106740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:53,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 01:38:56,879][15401] Updated weights for policy 0, policy_version 783763 (0.0034) [2024-06-25 01:38:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42601.6, 300 sec: 42765.0). Total num frames: 12841222144. Throughput: 0: 42666.6. Samples: 12841363740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:38:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 01:39:00,420][15401] Updated weights for policy 0, policy_version 783773 (0.0036) [2024-06-25 01:39:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12841451520. Throughput: 0: 42645.4. Samples: 12841614340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 01:39:04,417][15401] Updated weights for policy 0, policy_version 783783 (0.0044) [2024-06-25 01:39:08,263][15401] Updated weights for policy 0, policy_version 783793 (0.0024) [2024-06-25 01:39:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12841664512. Throughput: 0: 42529.3. Samples: 12841743100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 01:39:11,990][15401] Updated weights for policy 0, policy_version 783803 (0.0029) [2024-06-25 01:39:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12841861120. Throughput: 0: 42469.7. Samples: 12841993120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 01:39:15,786][15401] Updated weights for policy 0, policy_version 783813 (0.0032) [2024-06-25 01:39:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12842090496. Throughput: 0: 42678.3. Samples: 12842252620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 01:39:19,556][15401] Updated weights for policy 0, policy_version 783823 (0.0035) [2024-06-25 01:39:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 12842303488. Throughput: 0: 42665.9. Samples: 12842385700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 01:39:23,424][15401] Updated weights for policy 0, policy_version 783833 (0.0040) [2024-06-25 01:39:27,681][15401] Updated weights for policy 0, policy_version 783843 (0.0027) [2024-06-25 01:39:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12842516480. Throughput: 0: 42480.5. Samples: 12842636860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 01:39:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 01:39:31,066][15401] Updated weights for policy 0, policy_version 783853 (0.0037) [2024-06-25 01:39:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 12842729472. Throughput: 0: 42545.3. Samples: 12842888340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 01:39:35,403][15401] Updated weights for policy 0, policy_version 783863 (0.0042) [2024-06-25 01:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12842926080. Throughput: 0: 42366.2. Samples: 12843013220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 01:39:39,143][15401] Updated weights for policy 0, policy_version 783873 (0.0038) [2024-06-25 01:39:42,665][15349] Signal inference workers to stop experience collection... (190100 times) [2024-06-25 01:39:42,667][15349] Signal inference workers to resume experience collection... (190100 times) [2024-06-25 01:39:42,701][15401] InferenceWorker_p0-w0: stopping experience collection (190100 times) [2024-06-25 01:39:42,701][15401] InferenceWorker_p0-w0: resuming experience collection (190100 times) [2024-06-25 01:39:42,981][15401] Updated weights for policy 0, policy_version 783883 (0.0036) [2024-06-25 01:39:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12843139072. Throughput: 0: 42345.8. Samples: 12843269300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 01:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783883_12843139072.pth... [2024-06-25 01:39:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783259_12832915456.pth [2024-06-25 01:39:46,890][15401] Updated weights for policy 0, policy_version 783893 (0.0032) [2024-06-25 01:39:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 12843368448. Throughput: 0: 42377.8. Samples: 12843521340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 01:39:50,551][15401] Updated weights for policy 0, policy_version 783903 (0.0030) [2024-06-25 01:39:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 12843565056. Throughput: 0: 42471.1. Samples: 12843654300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:53,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 01:39:54,578][15401] Updated weights for policy 0, policy_version 783913 (0.0032) [2024-06-25 01:39:58,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12843761664. Throughput: 0: 42504.9. Samples: 12843905840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:39:58,391][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 01:39:58,694][15401] Updated weights for policy 0, policy_version 783923 (0.0024) [2024-06-25 01:40:02,290][15401] Updated weights for policy 0, policy_version 783933 (0.0028) [2024-06-25 01:40:03,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 12843991040. Throughput: 0: 42413.6. Samples: 12844161340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:03,393][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 01:40:06,246][15401] Updated weights for policy 0, policy_version 783943 (0.0042) [2024-06-25 01:40:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12844204032. Throughput: 0: 42417.6. Samples: 12844294500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 01:40:09,770][15401] Updated weights for policy 0, policy_version 783953 (0.0029) [2024-06-25 01:40:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12844433408. Throughput: 0: 42584.4. Samples: 12844553160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:13,393][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 01:40:13,633][15401] Updated weights for policy 0, policy_version 783963 (0.0036) [2024-06-25 01:40:17,324][15401] Updated weights for policy 0, policy_version 783973 (0.0022) [2024-06-25 01:40:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12844646400. Throughput: 0: 42562.7. Samples: 12844803660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 01:40:21,503][15401] Updated weights for policy 0, policy_version 783983 (0.0032) [2024-06-25 01:40:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 12844859392. Throughput: 0: 42702.2. Samples: 12844934820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:40:25,591][15401] Updated weights for policy 0, policy_version 783993 (0.0030) [2024-06-25 01:40:28,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 12845072384. Throughput: 0: 42705.3. Samples: 12845191140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:28,393][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:40:29,154][15401] Updated weights for policy 0, policy_version 784003 (0.0031) [2024-06-25 01:40:33,375][15401] Updated weights for policy 0, policy_version 784013 (0.0032) [2024-06-25 01:40:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12845268992. Throughput: 0: 42979.0. Samples: 12845455400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 01:40:36,729][15401] Updated weights for policy 0, policy_version 784023 (0.0035) [2024-06-25 01:40:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12845514752. Throughput: 0: 42725.5. Samples: 12845576940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:38,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 01:40:40,925][15401] Updated weights for policy 0, policy_version 784033 (0.0037) [2024-06-25 01:40:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12845711360. Throughput: 0: 42850.6. Samples: 12845834120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:43,396][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 01:40:44,432][15401] Updated weights for policy 0, policy_version 784043 (0.0043) [2024-06-25 01:40:48,333][15401] Updated weights for policy 0, policy_version 784053 (0.0039) [2024-06-25 01:40:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12845924352. Throughput: 0: 42826.8. Samples: 12846088440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:48,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 01:40:52,233][15401] Updated weights for policy 0, policy_version 784063 (0.0040) [2024-06-25 01:40:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 12846170112. Throughput: 0: 42679.6. Samples: 12846215080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:53,392][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 01:40:55,898][15401] Updated weights for policy 0, policy_version 784073 (0.0035) [2024-06-25 01:40:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 12846317568. Throughput: 0: 42558.3. Samples: 12846468280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:40:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 01:40:59,921][15401] Updated weights for policy 0, policy_version 784083 (0.0038) [2024-06-25 01:41:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12846563328. Throughput: 0: 42631.2. Samples: 12846722060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:41:03,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 01:41:03,540][15401] Updated weights for policy 0, policy_version 784093 (0.0040) [2024-06-25 01:41:07,466][15401] Updated weights for policy 0, policy_version 784103 (0.0039) [2024-06-25 01:41:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12846792704. Throughput: 0: 42568.8. Samples: 12846850420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:41:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 01:41:11,556][15401] Updated weights for policy 0, policy_version 784113 (0.0027) [2024-06-25 01:41:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 12846956544. Throughput: 0: 42693.4. Samples: 12847112240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 01:41:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 01:41:15,166][15349] Signal inference workers to stop experience collection... (190150 times) [2024-06-25 01:41:15,169][15349] Signal inference workers to resume experience collection... (190150 times) [2024-06-25 01:41:15,181][15401] Updated weights for policy 0, policy_version 784123 (0.0033) [2024-06-25 01:41:15,209][15401] InferenceWorker_p0-w0: stopping experience collection (190150 times) [2024-06-25 01:41:15,216][15401] InferenceWorker_p0-w0: resuming experience collection (190150 times) [2024-06-25 01:41:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12847185920. Throughput: 0: 42360.6. Samples: 12847361620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 01:41:19,203][15401] Updated weights for policy 0, policy_version 784133 (0.0027) [2024-06-25 01:41:22,827][15401] Updated weights for policy 0, policy_version 784143 (0.0031) [2024-06-25 01:41:23,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12847431680. Throughput: 0: 42513.8. Samples: 12847490060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 01:41:26,669][15401] Updated weights for policy 0, policy_version 784153 (0.0036) [2024-06-25 01:41:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 12847595520. Throughput: 0: 42325.5. Samples: 12847738760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 01:41:30,391][15401] Updated weights for policy 0, policy_version 784163 (0.0035) [2024-06-25 01:41:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12847824896. Throughput: 0: 42451.5. Samples: 12847998760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:33,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 01:41:34,595][15401] Updated weights for policy 0, policy_version 784173 (0.0036) [2024-06-25 01:41:38,175][15401] Updated weights for policy 0, policy_version 784183 (0.0034) [2024-06-25 01:41:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12848070656. Throughput: 0: 42521.9. Samples: 12848128560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 01:41:42,917][15401] Updated weights for policy 0, policy_version 784193 (0.0027) [2024-06-25 01:41:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12848250880. Throughput: 0: 42523.6. Samples: 12848381840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 01:41:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784195_12848250880.pth... [2024-06-25 01:41:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783572_12838043648.pth [2024-06-25 01:41:45,773][15401] Updated weights for policy 0, policy_version 784203 (0.0037) [2024-06-25 01:41:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 12848463872. Throughput: 0: 42494.3. Samples: 12848634300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 01:41:50,579][15401] Updated weights for policy 0, policy_version 784213 (0.0024) [2024-06-25 01:41:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12848693248. Throughput: 0: 42772.5. Samples: 12848775180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 01:41:53,474][15401] Updated weights for policy 0, policy_version 784223 (0.0049) [2024-06-25 01:41:58,205][15401] Updated weights for policy 0, policy_version 784233 (0.0039) [2024-06-25 01:41:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12848873472. Throughput: 0: 42351.1. Samples: 12849018040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:41:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 01:42:01,312][15401] Updated weights for policy 0, policy_version 784243 (0.0036) [2024-06-25 01:42:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12849119232. Throughput: 0: 42351.4. Samples: 12849267440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:03,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 01:42:05,741][15401] Updated weights for policy 0, policy_version 784253 (0.0031) [2024-06-25 01:42:06,266][15349] Signal inference workers to stop experience collection... (190200 times) [2024-06-25 01:42:06,266][15349] Signal inference workers to resume experience collection... (190200 times) [2024-06-25 01:42:06,320][15401] InferenceWorker_p0-w0: stopping experience collection (190200 times) [2024-06-25 01:42:06,320][15401] InferenceWorker_p0-w0: resuming experience collection (190200 times) [2024-06-25 01:42:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12849315840. Throughput: 0: 42678.7. Samples: 12849410600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:08,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 01:42:08,805][15401] Updated weights for policy 0, policy_version 784263 (0.0034) [2024-06-25 01:42:13,380][15401] Updated weights for policy 0, policy_version 784273 (0.0031) [2024-06-25 01:42:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12849528832. Throughput: 0: 42773.4. Samples: 12849663560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:13,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 01:42:16,679][15401] Updated weights for policy 0, policy_version 784283 (0.0042) [2024-06-25 01:42:18,391][15132] Fps is (10 sec: 45866.8, 60 sec: 43143.2, 300 sec: 42598.2). Total num frames: 12849774592. Throughput: 0: 42444.1. Samples: 12849908820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:18,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 01:42:21,000][15401] Updated weights for policy 0, policy_version 784293 (0.0042) [2024-06-25 01:42:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 41777.5, 300 sec: 42598.1). Total num frames: 12849938432. Throughput: 0: 42614.2. Samples: 12850046300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:23,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 01:42:24,359][15401] Updated weights for policy 0, policy_version 784303 (0.0050) [2024-06-25 01:42:28,392][15132] Fps is (10 sec: 37678.9, 60 sec: 42596.3, 300 sec: 42542.5). Total num frames: 12850151424. Throughput: 0: 42535.9. Samples: 12850296080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:28,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 01:42:28,952][15401] Updated weights for policy 0, policy_version 784313 (0.0036) [2024-06-25 01:42:31,917][15401] Updated weights for policy 0, policy_version 784323 (0.0044) [2024-06-25 01:42:33,390][15132] Fps is (10 sec: 49163.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 12850429952. Throughput: 0: 42498.1. Samples: 12850546720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 01:42:36,493][15401] Updated weights for policy 0, policy_version 784333 (0.0033) [2024-06-25 01:42:38,389][15132] Fps is (10 sec: 42611.1, 60 sec: 41779.2, 300 sec: 42543.2). Total num frames: 12850577408. Throughput: 0: 42467.6. Samples: 12850686220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 01:42:39,619][15401] Updated weights for policy 0, policy_version 784343 (0.0038) [2024-06-25 01:42:43,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 12850790400. Throughput: 0: 42610.1. Samples: 12850935500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 01:42:44,148][15401] Updated weights for policy 0, policy_version 784353 (0.0045) [2024-06-25 01:42:47,259][15401] Updated weights for policy 0, policy_version 784363 (0.0033) [2024-06-25 01:42:48,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 12851052544. Throughput: 0: 42495.7. Samples: 12851179740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 01:42:51,942][15401] Updated weights for policy 0, policy_version 784373 (0.0032) [2024-06-25 01:42:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42543.5). Total num frames: 12851216384. Throughput: 0: 42380.9. Samples: 12851317740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 01:42:55,348][15401] Updated weights for policy 0, policy_version 784383 (0.0029) [2024-06-25 01:42:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12851445760. Throughput: 0: 42322.1. Samples: 12851568060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-25 01:42:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 01:42:59,613][15401] Updated weights for policy 0, policy_version 784393 (0.0043) [2024-06-25 01:43:00,113][15349] Signal inference workers to stop experience collection... (190250 times) [2024-06-25 01:43:00,113][15349] Signal inference workers to resume experience collection... (190250 times) [2024-06-25 01:43:00,141][15401] InferenceWorker_p0-w0: stopping experience collection (190250 times) [2024-06-25 01:43:00,141][15401] InferenceWorker_p0-w0: resuming experience collection (190250 times) [2024-06-25 01:43:03,010][15401] Updated weights for policy 0, policy_version 784403 (0.0040) [2024-06-25 01:43:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12851691520. Throughput: 0: 42525.2. Samples: 12851822380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 01:43:07,241][15401] Updated weights for policy 0, policy_version 784413 (0.0049) [2024-06-25 01:43:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12851855360. Throughput: 0: 42334.3. Samples: 12851951240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 01:43:10,665][15401] Updated weights for policy 0, policy_version 784423 (0.0034) [2024-06-25 01:43:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12852101120. Throughput: 0: 42317.7. Samples: 12852200260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:13,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 01:43:15,113][15401] Updated weights for policy 0, policy_version 784433 (0.0024) [2024-06-25 01:43:18,326][15401] Updated weights for policy 0, policy_version 784443 (0.0044) [2024-06-25 01:43:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 12852314112. Throughput: 0: 42501.4. Samples: 12852459280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 01:43:23,015][15401] Updated weights for policy 0, policy_version 784453 (0.0025) [2024-06-25 01:43:23,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 12852477952. Throughput: 0: 42292.8. Samples: 12852589400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 01:43:26,226][15401] Updated weights for policy 0, policy_version 784463 (0.0046) [2024-06-25 01:43:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.6, 300 sec: 42542.9). Total num frames: 12852723712. Throughput: 0: 42252.6. Samples: 12852836860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 01:43:30,730][15401] Updated weights for policy 0, policy_version 784473 (0.0032) [2024-06-25 01:43:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 42487.3). Total num frames: 12852920320. Throughput: 0: 42514.7. Samples: 12853092900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 01:43:33,735][15401] Updated weights for policy 0, policy_version 784483 (0.0029) [2024-06-25 01:43:38,373][15401] Updated weights for policy 0, policy_version 784493 (0.0029) [2024-06-25 01:43:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12853133312. Throughput: 0: 42311.9. Samples: 12853221780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 01:43:41,300][15401] Updated weights for policy 0, policy_version 784503 (0.0031) [2024-06-25 01:43:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 12853379072. Throughput: 0: 42390.7. Samples: 12853475640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 01:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784508_12853379072.pth... [2024-06-25 01:43:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000783883_12843139072.pth [2024-06-25 01:43:45,907][15401] Updated weights for policy 0, policy_version 784513 (0.0039) [2024-06-25 01:43:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 12853559296. Throughput: 0: 42399.2. Samples: 12853730340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 01:43:49,241][15401] Updated weights for policy 0, policy_version 784523 (0.0030) [2024-06-25 01:43:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 12853772288. Throughput: 0: 42287.9. Samples: 12853854200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:53,391][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 01:43:53,469][15401] Updated weights for policy 0, policy_version 784533 (0.0044) [2024-06-25 01:43:57,214][15401] Updated weights for policy 0, policy_version 784543 (0.0045) [2024-06-25 01:43:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12854001664. Throughput: 0: 42499.3. Samples: 12854112720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:43:58,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 01:44:01,057][15401] Updated weights for policy 0, policy_version 784553 (0.0032) [2024-06-25 01:44:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 12854181888. Throughput: 0: 42479.2. Samples: 12854370840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:03,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 01:44:05,013][15401] Updated weights for policy 0, policy_version 784563 (0.0029) [2024-06-25 01:44:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 12854411264. Throughput: 0: 42225.4. Samples: 12854489640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:08,392][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 01:44:08,753][15401] Updated weights for policy 0, policy_version 784573 (0.0030) [2024-06-25 01:44:12,655][15401] Updated weights for policy 0, policy_version 784583 (0.0029) [2024-06-25 01:44:13,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 12854640640. Throughput: 0: 42538.9. Samples: 12854751120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 01:44:16,714][15401] Updated weights for policy 0, policy_version 784593 (0.0037) [2024-06-25 01:44:18,389][15132] Fps is (10 sec: 39331.3, 60 sec: 41506.2, 300 sec: 42376.2). Total num frames: 12854804480. Throughput: 0: 42653.9. Samples: 12855012320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 01:44:19,056][15349] Signal inference workers to stop experience collection... (190300 times) [2024-06-25 01:44:19,106][15401] InferenceWorker_p0-w0: stopping experience collection (190300 times) [2024-06-25 01:44:19,113][15349] Signal inference workers to resume experience collection... (190300 times) [2024-06-25 01:44:19,118][15401] InferenceWorker_p0-w0: resuming experience collection (190300 times) [2024-06-25 01:44:20,360][15401] Updated weights for policy 0, policy_version 784603 (0.0028) [2024-06-25 01:44:23,390][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 12855066624. Throughput: 0: 42459.6. Samples: 12855132460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:23,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 01:44:24,275][15401] Updated weights for policy 0, policy_version 784613 (0.0032) [2024-06-25 01:44:27,926][15401] Updated weights for policy 0, policy_version 784623 (0.0034) [2024-06-25 01:44:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12855279616. Throughput: 0: 42694.3. Samples: 12855396880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 01:44:31,753][15401] Updated weights for policy 0, policy_version 784633 (0.0044) [2024-06-25 01:44:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 12855459840. Throughput: 0: 42651.4. Samples: 12855649660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 01:44:35,774][15401] Updated weights for policy 0, policy_version 784643 (0.0026) [2024-06-25 01:44:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12855705600. Throughput: 0: 42603.2. Samples: 12855771340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:38,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 01:44:39,187][15401] Updated weights for policy 0, policy_version 784653 (0.0038) [2024-06-25 01:44:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 12855902208. Throughput: 0: 42736.9. Samples: 12856035880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-25 01:44:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 01:44:43,498][15401] Updated weights for policy 0, policy_version 784663 (0.0040) [2024-06-25 01:44:46,835][15401] Updated weights for policy 0, policy_version 784673 (0.0032) [2024-06-25 01:44:48,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 12856115200. Throughput: 0: 42558.6. Samples: 12856286080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:44:48,393][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 01:44:51,173][15401] Updated weights for policy 0, policy_version 784683 (0.0031) [2024-06-25 01:44:53,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 12856328192. Throughput: 0: 42885.3. Samples: 12856419480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:44:53,392][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 01:44:54,492][15401] Updated weights for policy 0, policy_version 784693 (0.0024) [2024-06-25 01:44:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 12856541184. Throughput: 0: 42734.0. Samples: 12856674140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:44:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:44:58,720][15401] Updated weights for policy 0, policy_version 784703 (0.0045) [2024-06-25 01:45:02,583][15401] Updated weights for policy 0, policy_version 784713 (0.0031) [2024-06-25 01:45:03,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12856754176. Throughput: 0: 42685.7. Samples: 12856933180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 01:45:06,349][15401] Updated weights for policy 0, policy_version 784723 (0.0036) [2024-06-25 01:45:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 12856983552. Throughput: 0: 42769.4. Samples: 12857057080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 01:45:10,090][15401] Updated weights for policy 0, policy_version 784733 (0.0039) [2024-06-25 01:45:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 12857196544. Throughput: 0: 42692.9. Samples: 12857318060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 01:45:13,935][15401] Updated weights for policy 0, policy_version 784743 (0.0036) [2024-06-25 01:45:17,582][15401] Updated weights for policy 0, policy_version 784753 (0.0028) [2024-06-25 01:45:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 12857393152. Throughput: 0: 42775.7. Samples: 12857574560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:18,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 01:45:21,473][15401] Updated weights for policy 0, policy_version 784763 (0.0029) [2024-06-25 01:45:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 12857622528. Throughput: 0: 42904.9. Samples: 12857702060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 01:45:25,306][15401] Updated weights for policy 0, policy_version 784773 (0.0028) [2024-06-25 01:45:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12857835520. Throughput: 0: 42789.2. Samples: 12857961400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 01:45:29,074][15401] Updated weights for policy 0, policy_version 784783 (0.0026) [2024-06-25 01:45:33,350][15401] Updated weights for policy 0, policy_version 784793 (0.0030) [2024-06-25 01:45:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 12858048512. Throughput: 0: 43014.6. Samples: 12858221640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:33,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-25 01:45:36,921][15401] Updated weights for policy 0, policy_version 784803 (0.0034) [2024-06-25 01:45:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 12858277888. Throughput: 0: 42721.4. Samples: 12858341840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 01:45:41,152][15401] Updated weights for policy 0, policy_version 784813 (0.0040) [2024-06-25 01:45:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12858490880. Throughput: 0: 42819.1. Samples: 12858601000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 01:45:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784820_12858490880.pth... [2024-06-25 01:45:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784195_12848250880.pth [2024-06-25 01:45:44,703][15401] Updated weights for policy 0, policy_version 784823 (0.0031) [2024-06-25 01:45:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 12858671104. Throughput: 0: 42708.7. Samples: 12858855080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:48,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 01:45:49,153][15401] Updated weights for policy 0, policy_version 784833 (0.0035) [2024-06-25 01:45:52,142][15401] Updated weights for policy 0, policy_version 784843 (0.0030) [2024-06-25 01:45:53,050][15349] Signal inference workers to stop experience collection... (190350 times) [2024-06-25 01:45:53,051][15349] Signal inference workers to resume experience collection... (190350 times) [2024-06-25 01:45:53,091][15401] InferenceWorker_p0-w0: stopping experience collection (190350 times) [2024-06-25 01:45:53,091][15401] InferenceWorker_p0-w0: resuming experience collection (190350 times) [2024-06-25 01:45:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 12858916864. Throughput: 0: 42721.8. Samples: 12858979560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:53,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 01:45:56,735][15401] Updated weights for policy 0, policy_version 784853 (0.0032) [2024-06-25 01:45:58,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 12859113472. Throughput: 0: 42833.2. Samples: 12859245560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:45:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 01:45:59,751][15401] Updated weights for policy 0, policy_version 784863 (0.0031) [2024-06-25 01:46:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 12859326464. Throughput: 0: 42861.6. Samples: 12859503340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:03,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 01:46:04,293][15401] Updated weights for policy 0, policy_version 784873 (0.0030) [2024-06-25 01:46:07,276][15401] Updated weights for policy 0, policy_version 784883 (0.0039) [2024-06-25 01:46:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12859555840. Throughput: 0: 42841.2. Samples: 12859629920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 01:46:11,958][15401] Updated weights for policy 0, policy_version 784893 (0.0042) [2024-06-25 01:46:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12859752448. Throughput: 0: 42841.5. Samples: 12859889260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 01:46:14,874][15401] Updated weights for policy 0, policy_version 784903 (0.0041) [2024-06-25 01:46:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12859965440. Throughput: 0: 42754.8. Samples: 12860145600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:18,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 01:46:19,690][15401] Updated weights for policy 0, policy_version 784913 (0.0034) [2024-06-25 01:46:22,399][15401] Updated weights for policy 0, policy_version 784923 (0.0031) [2024-06-25 01:46:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12860211200. Throughput: 0: 42888.9. Samples: 12860271840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 01:46:27,581][15401] Updated weights for policy 0, policy_version 784933 (0.0043) [2024-06-25 01:46:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 12860391424. Throughput: 0: 43044.1. Samples: 12860537980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 01:46:30,029][15401] Updated weights for policy 0, policy_version 784943 (0.0026) [2024-06-25 01:46:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12860620800. Throughput: 0: 42815.7. Samples: 12860781780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 01:46:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 01:46:35,268][15401] Updated weights for policy 0, policy_version 784953 (0.0025) [2024-06-25 01:46:37,748][15401] Updated weights for policy 0, policy_version 784963 (0.0036) [2024-06-25 01:46:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12860850176. Throughput: 0: 43027.9. Samples: 12860915820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:46:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 01:46:42,620][15401] Updated weights for policy 0, policy_version 784973 (0.0047) [2024-06-25 01:46:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12861030400. Throughput: 0: 42921.0. Samples: 12861177000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:46:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 01:46:45,372][15401] Updated weights for policy 0, policy_version 784983 (0.0049) [2024-06-25 01:46:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.8, 300 sec: 42598.4). Total num frames: 12861259776. Throughput: 0: 42673.9. Samples: 12861423660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:46:48,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 01:46:50,157][15401] Updated weights for policy 0, policy_version 784993 (0.0033) [2024-06-25 01:46:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12861472768. Throughput: 0: 42782.7. Samples: 12861555140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:46:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 01:46:53,406][15401] Updated weights for policy 0, policy_version 785003 (0.0034) [2024-06-25 01:46:57,703][15401] Updated weights for policy 0, policy_version 785013 (0.0039) [2024-06-25 01:46:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12861669376. Throughput: 0: 42680.9. Samples: 12861809900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:46:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 01:47:00,880][15401] Updated weights for policy 0, policy_version 785023 (0.0028) [2024-06-25 01:47:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12861915136. Throughput: 0: 42618.6. Samples: 12862063440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 01:47:05,183][15401] Updated weights for policy 0, policy_version 785033 (0.0028) [2024-06-25 01:47:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12862111744. Throughput: 0: 42684.9. Samples: 12862192660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 01:47:08,582][15401] Updated weights for policy 0, policy_version 785043 (0.0030) [2024-06-25 01:47:13,005][15401] Updated weights for policy 0, policy_version 785053 (0.0041) [2024-06-25 01:47:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42543.1). Total num frames: 12862324736. Throughput: 0: 42581.7. Samples: 12862454160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:13,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 01:47:16,220][15401] Updated weights for policy 0, policy_version 785063 (0.0030) [2024-06-25 01:47:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 12862554112. Throughput: 0: 42634.1. Samples: 12862700320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:18,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 01:47:20,673][15401] Updated weights for policy 0, policy_version 785073 (0.0037) [2024-06-25 01:47:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.9). Total num frames: 12862750720. Throughput: 0: 42600.9. Samples: 12862832860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 01:47:24,286][15401] Updated weights for policy 0, policy_version 785083 (0.0046) [2024-06-25 01:47:28,275][15401] Updated weights for policy 0, policy_version 785093 (0.0036) [2024-06-25 01:47:28,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 12862963712. Throughput: 0: 42559.1. Samples: 12863092160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 01:47:31,931][15401] Updated weights for policy 0, policy_version 785103 (0.0028) [2024-06-25 01:47:32,141][15349] Signal inference workers to stop experience collection... (190400 times) [2024-06-25 01:47:32,142][15349] Signal inference workers to resume experience collection... (190400 times) [2024-06-25 01:47:32,150][15401] InferenceWorker_p0-w0: stopping experience collection (190400 times) [2024-06-25 01:47:32,165][15401] InferenceWorker_p0-w0: resuming experience collection (190400 times) [2024-06-25 01:47:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12863209472. Throughput: 0: 42594.0. Samples: 12863340400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:47:36,200][15401] Updated weights for policy 0, policy_version 785113 (0.0036) [2024-06-25 01:47:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12863406080. Throughput: 0: 42758.6. Samples: 12863479280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 01:47:39,522][15401] Updated weights for policy 0, policy_version 785123 (0.0023) [2024-06-25 01:47:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 12863586304. Throughput: 0: 42702.6. Samples: 12863731520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:47:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785131_12863586304.pth... [2024-06-25 01:47:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784508_12853379072.pth [2024-06-25 01:47:43,807][15401] Updated weights for policy 0, policy_version 785133 (0.0022) [2024-06-25 01:47:47,010][15401] Updated weights for policy 0, policy_version 785143 (0.0051) [2024-06-25 01:47:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12863848448. Throughput: 0: 42678.3. Samples: 12863983960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:48,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 01:47:51,363][15401] Updated weights for policy 0, policy_version 785153 (0.0034) [2024-06-25 01:47:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12864045056. Throughput: 0: 42858.7. Samples: 12864121300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 01:47:54,653][15401] Updated weights for policy 0, policy_version 785163 (0.0037) [2024-06-25 01:47:58,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 12864225280. Throughput: 0: 42584.6. Samples: 12864370460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:47:58,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-25 01:47:58,945][15401] Updated weights for policy 0, policy_version 785173 (0.0040) [2024-06-25 01:48:02,378][15401] Updated weights for policy 0, policy_version 785183 (0.0027) [2024-06-25 01:48:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12864471040. Throughput: 0: 42794.7. Samples: 12864626080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:48:03,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 01:48:06,619][15401] Updated weights for policy 0, policy_version 785193 (0.0045) [2024-06-25 01:48:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12864667648. Throughput: 0: 42780.8. Samples: 12864758000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:48:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 01:48:10,360][15401] Updated weights for policy 0, policy_version 785203 (0.0031) [2024-06-25 01:48:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12864864256. Throughput: 0: 42507.0. Samples: 12865004980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:48:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 01:48:14,117][15401] Updated weights for policy 0, policy_version 785213 (0.0043) [2024-06-25 01:48:17,859][15401] Updated weights for policy 0, policy_version 785223 (0.0035) [2024-06-25 01:48:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 12865093632. Throughput: 0: 42709.8. Samples: 12865262440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 01:48:18,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 01:48:21,950][15401] Updated weights for policy 0, policy_version 785233 (0.0049) [2024-06-25 01:48:23,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 12865306624. Throughput: 0: 42615.3. Samples: 12865397240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:23,397][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 01:48:25,442][15401] Updated weights for policy 0, policy_version 785243 (0.0033) [2024-06-25 01:48:28,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12865519616. Throughput: 0: 42702.1. Samples: 12865653120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:28,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 01:48:29,525][15401] Updated weights for policy 0, policy_version 785253 (0.0029) [2024-06-25 01:48:33,052][15401] Updated weights for policy 0, policy_version 785263 (0.0038) [2024-06-25 01:48:33,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12865748992. Throughput: 0: 42730.6. Samples: 12865906840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 01:48:37,153][15401] Updated weights for policy 0, policy_version 785273 (0.0034) [2024-06-25 01:48:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12865961984. Throughput: 0: 42667.0. Samples: 12866041320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:38,396][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 01:48:40,771][15401] Updated weights for policy 0, policy_version 785283 (0.0038) [2024-06-25 01:48:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12866158592. Throughput: 0: 42830.5. Samples: 12866297840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 01:48:44,715][15401] Updated weights for policy 0, policy_version 785293 (0.0025) [2024-06-25 01:48:47,118][15349] Signal inference workers to stop experience collection... (190450 times) [2024-06-25 01:48:47,119][15349] Signal inference workers to resume experience collection... (190450 times) [2024-06-25 01:48:47,136][15401] InferenceWorker_p0-w0: stopping experience collection (190450 times) [2024-06-25 01:48:47,158][15401] InferenceWorker_p0-w0: resuming experience collection (190450 times) [2024-06-25 01:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12866387968. Throughput: 0: 42848.0. Samples: 12866554240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 01:48:48,465][15401] Updated weights for policy 0, policy_version 785303 (0.0046) [2024-06-25 01:48:52,158][15401] Updated weights for policy 0, policy_version 785313 (0.0022) [2024-06-25 01:48:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12866600960. Throughput: 0: 42835.7. Samples: 12866685600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 01:48:56,118][15401] Updated weights for policy 0, policy_version 785323 (0.0035) [2024-06-25 01:48:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12866797568. Throughput: 0: 42932.2. Samples: 12866936920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:48:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 01:48:59,866][15401] Updated weights for policy 0, policy_version 785333 (0.0029) [2024-06-25 01:49:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 12867026944. Throughput: 0: 42856.8. Samples: 12867190900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:03,391][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 01:49:03,743][15401] Updated weights for policy 0, policy_version 785343 (0.0040) [2024-06-25 01:49:07,468][15401] Updated weights for policy 0, policy_version 785353 (0.0035) [2024-06-25 01:49:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 12867256320. Throughput: 0: 42741.7. Samples: 12867320340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:08,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-25 01:49:11,827][15401] Updated weights for policy 0, policy_version 785363 (0.0031) [2024-06-25 01:49:13,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12867436544. Throughput: 0: 42748.1. Samples: 12867576780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:13,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 01:49:15,075][15401] Updated weights for policy 0, policy_version 785373 (0.0030) [2024-06-25 01:49:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 12867665920. Throughput: 0: 42900.8. Samples: 12867837380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 01:49:19,450][15401] Updated weights for policy 0, policy_version 785383 (0.0038) [2024-06-25 01:49:22,627][15401] Updated weights for policy 0, policy_version 785393 (0.0033) [2024-06-25 01:49:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43422.3, 300 sec: 42820.6). Total num frames: 12867911680. Throughput: 0: 42809.5. Samples: 12867967740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 01:49:26,983][15401] Updated weights for policy 0, policy_version 785403 (0.0036) [2024-06-25 01:49:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12868075520. Throughput: 0: 42576.6. Samples: 12868213780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:28,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 01:49:30,673][15401] Updated weights for policy 0, policy_version 785413 (0.0044) [2024-06-25 01:49:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12868304896. Throughput: 0: 42732.4. Samples: 12868477200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:33,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 01:49:34,520][15401] Updated weights for policy 0, policy_version 785423 (0.0045) [2024-06-25 01:49:38,383][15401] Updated weights for policy 0, policy_version 785433 (0.0034) [2024-06-25 01:49:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12868534272. Throughput: 0: 42640.0. Samples: 12868604400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 01:49:42,046][15401] Updated weights for policy 0, policy_version 785443 (0.0033) [2024-06-25 01:49:43,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 12868730880. Throughput: 0: 42482.0. Samples: 12868848720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:43,393][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 01:49:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785445_12868730880.pth... [2024-06-25 01:49:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000784820_12858490880.pth [2024-06-25 01:49:46,009][15401] Updated weights for policy 0, policy_version 785453 (0.0042) [2024-06-25 01:49:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 12868943872. Throughput: 0: 42667.3. Samples: 12869110920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 01:49:49,834][15401] Updated weights for policy 0, policy_version 785463 (0.0044) [2024-06-25 01:49:52,040][15349] Signal inference workers to stop experience collection... (190500 times) [2024-06-25 01:49:52,084][15401] InferenceWorker_p0-w0: stopping experience collection (190500 times) [2024-06-25 01:49:52,093][15349] Signal inference workers to resume experience collection... (190500 times) [2024-06-25 01:49:52,098][15401] InferenceWorker_p0-w0: resuming experience collection (190500 times) [2024-06-25 01:49:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12869156864. Throughput: 0: 42552.0. Samples: 12869235180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 01:49:54,153][15401] Updated weights for policy 0, policy_version 785473 (0.0039) [2024-06-25 01:49:57,499][15401] Updated weights for policy 0, policy_version 785483 (0.0029) [2024-06-25 01:49:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12869369856. Throughput: 0: 42598.7. Samples: 12869493720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:49:58,392][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 01:50:01,650][15401] Updated weights for policy 0, policy_version 785493 (0.0032) [2024-06-25 01:50:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12869582848. Throughput: 0: 42569.9. Samples: 12869753020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 01:50:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 01:50:05,304][15401] Updated weights for policy 0, policy_version 785503 (0.0038) [2024-06-25 01:50:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12869812224. Throughput: 0: 42452.4. Samples: 12869878100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 01:50:09,178][15401] Updated weights for policy 0, policy_version 785513 (0.0037) [2024-06-25 01:50:13,264][15401] Updated weights for policy 0, policy_version 785523 (0.0034) [2024-06-25 01:50:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12870008832. Throughput: 0: 42692.8. Samples: 12870134960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:13,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 01:50:16,937][15401] Updated weights for policy 0, policy_version 785533 (0.0031) [2024-06-25 01:50:18,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 12870221824. Throughput: 0: 42617.2. Samples: 12870394980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:18,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 01:50:20,860][15401] Updated weights for policy 0, policy_version 785543 (0.0028) [2024-06-25 01:50:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 12870451200. Throughput: 0: 42617.9. Samples: 12870522200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 01:50:24,366][15401] Updated weights for policy 0, policy_version 785553 (0.0035) [2024-06-25 01:50:28,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12870647808. Throughput: 0: 42992.5. Samples: 12870783280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:28,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-25 01:50:28,404][15401] Updated weights for policy 0, policy_version 785563 (0.0039) [2024-06-25 01:50:31,837][15401] Updated weights for policy 0, policy_version 785573 (0.0032) [2024-06-25 01:50:33,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12870860800. Throughput: 0: 42808.1. Samples: 12871037280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 01:50:35,871][15401] Updated weights for policy 0, policy_version 785583 (0.0038) [2024-06-25 01:50:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12871106560. Throughput: 0: 42919.6. Samples: 12871166560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:38,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 01:50:39,449][15401] Updated weights for policy 0, policy_version 785593 (0.0033) [2024-06-25 01:50:43,355][15401] Updated weights for policy 0, policy_version 785603 (0.0031) [2024-06-25 01:50:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12871319552. Throughput: 0: 43038.3. Samples: 12871430440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:43,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 01:50:47,034][15401] Updated weights for policy 0, policy_version 785613 (0.0036) [2024-06-25 01:50:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12871516160. Throughput: 0: 42906.3. Samples: 12871683800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 01:50:51,165][15401] Updated weights for policy 0, policy_version 785623 (0.0034) [2024-06-25 01:50:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12871745536. Throughput: 0: 43025.0. Samples: 12871814220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:53,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-25 01:50:54,899][15401] Updated weights for policy 0, policy_version 785633 (0.0034) [2024-06-25 01:50:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12871942144. Throughput: 0: 43179.5. Samples: 12872078040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:50:58,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-25 01:50:58,925][15401] Updated weights for policy 0, policy_version 785643 (0.0038) [2024-06-25 01:51:02,551][15401] Updated weights for policy 0, policy_version 785653 (0.0039) [2024-06-25 01:51:03,391][15132] Fps is (10 sec: 42591.9, 60 sec: 43143.5, 300 sec: 42764.8). Total num frames: 12872171520. Throughput: 0: 42823.3. Samples: 12872322080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:03,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 01:51:04,908][15349] Signal inference workers to stop experience collection... (190550 times) [2024-06-25 01:51:04,909][15349] Signal inference workers to resume experience collection... (190550 times) [2024-06-25 01:51:04,955][15401] InferenceWorker_p0-w0: stopping experience collection (190550 times) [2024-06-25 01:51:04,956][15401] InferenceWorker_p0-w0: resuming experience collection (190550 times) [2024-06-25 01:51:06,351][15401] Updated weights for policy 0, policy_version 785663 (0.0038) [2024-06-25 01:51:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12872384512. Throughput: 0: 43017.1. Samples: 12872457980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 01:51:09,984][15401] Updated weights for policy 0, policy_version 785673 (0.0037) [2024-06-25 01:51:13,389][15132] Fps is (10 sec: 39327.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12872564736. Throughput: 0: 43012.1. Samples: 12872718820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 01:51:14,290][15401] Updated weights for policy 0, policy_version 785683 (0.0038) [2024-06-25 01:51:17,650][15401] Updated weights for policy 0, policy_version 785693 (0.0039) [2024-06-25 01:51:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12872826880. Throughput: 0: 42762.9. Samples: 12872961620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 01:51:21,956][15401] Updated weights for policy 0, policy_version 785703 (0.0041) [2024-06-25 01:51:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12873023488. Throughput: 0: 42947.4. Samples: 12873099200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 01:51:25,316][15401] Updated weights for policy 0, policy_version 785713 (0.0029) [2024-06-25 01:51:28,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12873203712. Throughput: 0: 42824.0. Samples: 12873357520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:51:29,498][15401] Updated weights for policy 0, policy_version 785723 (0.0038) [2024-06-25 01:51:32,951][15401] Updated weights for policy 0, policy_version 785733 (0.0030) [2024-06-25 01:51:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12873449472. Throughput: 0: 42754.1. Samples: 12873607740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:33,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-25 01:51:37,233][15401] Updated weights for policy 0, policy_version 785743 (0.0050) [2024-06-25 01:51:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12873662464. Throughput: 0: 42771.0. Samples: 12873738920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 01:51:40,823][15401] Updated weights for policy 0, policy_version 785753 (0.0037) [2024-06-25 01:51:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12873859072. Throughput: 0: 42571.6. Samples: 12873993760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:43,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 01:51:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785758_12873859072.pth... [2024-06-25 01:51:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785131_12863586304.pth [2024-06-25 01:51:44,696][15401] Updated weights for policy 0, policy_version 785763 (0.0032) [2024-06-25 01:51:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 12874088448. Throughput: 0: 42858.1. Samples: 12874250640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 01:51:48,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 01:51:48,561][15401] Updated weights for policy 0, policy_version 785773 (0.0027) [2024-06-25 01:51:52,643][15401] Updated weights for policy 0, policy_version 785783 (0.0038) [2024-06-25 01:51:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 12874317824. Throughput: 0: 42803.6. Samples: 12874384140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:51:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 01:51:56,112][15401] Updated weights for policy 0, policy_version 785793 (0.0039) [2024-06-25 01:51:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12874514432. Throughput: 0: 42599.4. Samples: 12874635800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:51:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 01:52:00,399][15401] Updated weights for policy 0, policy_version 785803 (0.0038) [2024-06-25 01:52:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42599.4, 300 sec: 42765.0). Total num frames: 12874727424. Throughput: 0: 42966.8. Samples: 12874895120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:03,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 01:52:03,641][15401] Updated weights for policy 0, policy_version 785813 (0.0043) [2024-06-25 01:52:08,038][15401] Updated weights for policy 0, policy_version 785823 (0.0027) [2024-06-25 01:52:08,234][15349] Signal inference workers to stop experience collection... (190600 times) [2024-06-25 01:52:08,234][15349] Signal inference workers to resume experience collection... (190600 times) [2024-06-25 01:52:08,282][15401] InferenceWorker_p0-w0: stopping experience collection (190600 times) [2024-06-25 01:52:08,282][15401] InferenceWorker_p0-w0: resuming experience collection (190600 times) [2024-06-25 01:52:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12874956800. Throughput: 0: 42787.2. Samples: 12875024620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 01:52:11,189][15401] Updated weights for policy 0, policy_version 785833 (0.0034) [2024-06-25 01:52:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12875153408. Throughput: 0: 42675.6. Samples: 12875277920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:13,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 01:52:15,632][15401] Updated weights for policy 0, policy_version 785843 (0.0042) [2024-06-25 01:52:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12875366400. Throughput: 0: 42887.4. Samples: 12875537680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:18,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 01:52:19,162][15401] Updated weights for policy 0, policy_version 785853 (0.0042) [2024-06-25 01:52:23,141][15401] Updated weights for policy 0, policy_version 785863 (0.0028) [2024-06-25 01:52:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12875595776. Throughput: 0: 42818.6. Samples: 12875665760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:23,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 01:52:26,994][15401] Updated weights for policy 0, policy_version 785873 (0.0036) [2024-06-25 01:52:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12875808768. Throughput: 0: 42996.9. Samples: 12875928620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:28,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 01:52:30,582][15401] Updated weights for policy 0, policy_version 785883 (0.0025) [2024-06-25 01:52:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12876021760. Throughput: 0: 43206.3. Samples: 12876194920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 01:52:34,574][15401] Updated weights for policy 0, policy_version 785893 (0.0036) [2024-06-25 01:52:38,341][15401] Updated weights for policy 0, policy_version 785903 (0.0026) [2024-06-25 01:52:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12876234752. Throughput: 0: 43008.4. Samples: 12876319520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 01:52:42,022][15401] Updated weights for policy 0, policy_version 785913 (0.0028) [2024-06-25 01:52:43,396][15132] Fps is (10 sec: 44208.8, 60 sec: 43413.0, 300 sec: 42764.1). Total num frames: 12876464128. Throughput: 0: 43177.1. Samples: 12876579040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:43,396][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 01:52:45,815][15401] Updated weights for policy 0, policy_version 785923 (0.0038) [2024-06-25 01:52:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12876677120. Throughput: 0: 43250.5. Samples: 12876841400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:48,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 01:52:49,552][15401] Updated weights for policy 0, policy_version 785933 (0.0031) [2024-06-25 01:52:53,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12876873728. Throughput: 0: 43150.9. Samples: 12876966420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:53,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 01:52:53,457][15401] Updated weights for policy 0, policy_version 785943 (0.0044) [2024-06-25 01:52:57,134][15401] Updated weights for policy 0, policy_version 785953 (0.0036) [2024-06-25 01:52:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12877103104. Throughput: 0: 43295.8. Samples: 12877226240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:52:58,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 01:53:01,022][15401] Updated weights for policy 0, policy_version 785963 (0.0027) [2024-06-25 01:53:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12877299712. Throughput: 0: 43227.7. Samples: 12877482920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:03,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 01:53:04,673][15401] Updated weights for policy 0, policy_version 785973 (0.0032) [2024-06-25 01:53:08,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12877529088. Throughput: 0: 43240.0. Samples: 12877611560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 01:53:08,559][15401] Updated weights for policy 0, policy_version 785983 (0.0029) [2024-06-25 01:53:12,294][15401] Updated weights for policy 0, policy_version 785993 (0.0031) [2024-06-25 01:53:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42932.0). Total num frames: 12877758464. Throughput: 0: 43258.2. Samples: 12877875240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 01:53:16,102][15401] Updated weights for policy 0, policy_version 786003 (0.0033) [2024-06-25 01:53:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 12877955072. Throughput: 0: 43039.2. Samples: 12878131680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 01:53:20,026][15401] Updated weights for policy 0, policy_version 786013 (0.0039) [2024-06-25 01:53:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 12878184448. Throughput: 0: 43084.6. Samples: 12878258320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 01:53:23,603][15401] Updated weights for policy 0, policy_version 786023 (0.0031) [2024-06-25 01:53:27,674][15401] Updated weights for policy 0, policy_version 786033 (0.0029) [2024-06-25 01:53:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12878397440. Throughput: 0: 43115.9. Samples: 12878518980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:28,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 01:53:28,747][15349] Signal inference workers to stop experience collection... (190650 times) [2024-06-25 01:53:28,747][15349] Signal inference workers to resume experience collection... (190650 times) [2024-06-25 01:53:28,788][15401] InferenceWorker_p0-w0: stopping experience collection (190650 times) [2024-06-25 01:53:28,788][15401] InferenceWorker_p0-w0: resuming experience collection (190650 times) [2024-06-25 01:53:31,201][15401] Updated weights for policy 0, policy_version 786043 (0.0035) [2024-06-25 01:53:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12878610432. Throughput: 0: 42903.2. Samples: 12878772040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 01:53:33,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 01:53:35,336][15401] Updated weights for policy 0, policy_version 786053 (0.0027) [2024-06-25 01:53:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12878823424. Throughput: 0: 42976.1. Samples: 12878900340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:53:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 01:53:38,631][15401] Updated weights for policy 0, policy_version 786063 (0.0034) [2024-06-25 01:53:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 12879003648. Throughput: 0: 42897.1. Samples: 12879156600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:53:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 01:53:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786073_12879020032.pth... [2024-06-25 01:53:43,502][15401] Updated weights for policy 0, policy_version 786073 (0.0034) [2024-06-25 01:53:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785445_12868730880.pth [2024-06-25 01:53:46,123][15401] Updated weights for policy 0, policy_version 786083 (0.0037) [2024-06-25 01:53:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12879249408. Throughput: 0: 42851.1. Samples: 12879411220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:53:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 01:53:51,018][15401] Updated weights for policy 0, policy_version 786093 (0.0031) [2024-06-25 01:53:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 12879462400. Throughput: 0: 42894.3. Samples: 12879541800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:53:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 01:53:53,791][15401] Updated weights for policy 0, policy_version 786103 (0.0027) [2024-06-25 01:53:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 42765.1). Total num frames: 12879642624. Throughput: 0: 42759.2. Samples: 12879799400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:53:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 01:53:58,564][15401] Updated weights for policy 0, policy_version 786113 (0.0032) [2024-06-25 01:54:01,791][15401] Updated weights for policy 0, policy_version 786123 (0.0034) [2024-06-25 01:54:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12879888384. Throughput: 0: 42702.3. Samples: 12880053280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 01:54:06,344][15401] Updated weights for policy 0, policy_version 786133 (0.0035) [2024-06-25 01:54:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 12880117760. Throughput: 0: 42949.3. Samples: 12880191040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 01:54:09,300][15401] Updated weights for policy 0, policy_version 786143 (0.0031) [2024-06-25 01:54:13,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 12880297984. Throughput: 0: 42830.3. Samples: 12880446620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:13,397][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 01:54:13,902][15401] Updated weights for policy 0, policy_version 786153 (0.0029) [2024-06-25 01:54:16,869][15401] Updated weights for policy 0, policy_version 786163 (0.0029) [2024-06-25 01:54:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12880527360. Throughput: 0: 42886.6. Samples: 12880701940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 01:54:21,705][15401] Updated weights for policy 0, policy_version 786173 (0.0051) [2024-06-25 01:54:23,390][15132] Fps is (10 sec: 47543.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 12880773120. Throughput: 0: 42953.8. Samples: 12880833260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:23,395][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 01:54:24,459][15401] Updated weights for policy 0, policy_version 786183 (0.0027) [2024-06-25 01:54:28,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 12880936960. Throughput: 0: 42797.2. Samples: 12881082580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:28,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 01:54:29,318][15401] Updated weights for policy 0, policy_version 786193 (0.0030) [2024-06-25 01:54:32,028][15401] Updated weights for policy 0, policy_version 786203 (0.0039) [2024-06-25 01:54:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12881182720. Throughput: 0: 42876.0. Samples: 12881340640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:33,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-25 01:54:36,757][15401] Updated weights for policy 0, policy_version 786213 (0.0042) [2024-06-25 01:54:38,332][15349] Signal inference workers to stop experience collection... (190700 times) [2024-06-25 01:54:38,385][15349] Signal inference workers to resume experience collection... (190700 times) [2024-06-25 01:54:38,386][15401] InferenceWorker_p0-w0: stopping experience collection (190700 times) [2024-06-25 01:54:38,390][15132] Fps is (10 sec: 47524.6, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 12881412096. Throughput: 0: 43035.4. Samples: 12881478400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:38,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 01:54:38,411][15401] InferenceWorker_p0-w0: resuming experience collection (190700 times) [2024-06-25 01:54:39,931][15401] Updated weights for policy 0, policy_version 786223 (0.0032) [2024-06-25 01:54:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12881592320. Throughput: 0: 42990.6. Samples: 12881733980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 01:54:44,351][15401] Updated weights for policy 0, policy_version 786233 (0.0030) [2024-06-25 01:54:47,374][15401] Updated weights for policy 0, policy_version 786243 (0.0041) [2024-06-25 01:54:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 12881838080. Throughput: 0: 42937.6. Samples: 12881985480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 01:54:52,173][15401] Updated weights for policy 0, policy_version 786253 (0.0054) [2024-06-25 01:54:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 12882051072. Throughput: 0: 42818.6. Samples: 12882117880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 01:54:54,854][15401] Updated weights for policy 0, policy_version 786263 (0.0044) [2024-06-25 01:54:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 12882247680. Throughput: 0: 42910.1. Samples: 12882377300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:54:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 01:54:59,810][15401] Updated weights for policy 0, policy_version 786273 (0.0033) [2024-06-25 01:55:02,451][15401] Updated weights for policy 0, policy_version 786283 (0.0039) [2024-06-25 01:55:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12882460672. Throughput: 0: 42768.2. Samples: 12882626500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:55:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 01:55:07,641][15401] Updated weights for policy 0, policy_version 786293 (0.0042) [2024-06-25 01:55:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12882673664. Throughput: 0: 42816.6. Samples: 12882760000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:55:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 01:55:10,085][15401] Updated weights for policy 0, policy_version 786303 (0.0039) [2024-06-25 01:55:13,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43147.4, 300 sec: 42931.3). Total num frames: 12882886656. Throughput: 0: 42961.3. Samples: 12883015840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:55:13,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 01:55:15,205][15401] Updated weights for policy 0, policy_version 786313 (0.0042) [2024-06-25 01:55:18,190][15401] Updated weights for policy 0, policy_version 786323 (0.0038) [2024-06-25 01:55:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12883116032. Throughput: 0: 42850.6. Samples: 12883268920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:55:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 01:55:22,737][15401] Updated weights for policy 0, policy_version 786333 (0.0032) [2024-06-25 01:55:23,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 12883329024. Throughput: 0: 42834.2. Samples: 12883405940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 01:55:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 01:55:26,047][15401] Updated weights for policy 0, policy_version 786343 (0.0031) [2024-06-25 01:55:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.2, 300 sec: 42987.1). Total num frames: 12883542016. Throughput: 0: 42641.7. Samples: 12883652860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:28,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 01:55:30,326][15401] Updated weights for policy 0, policy_version 786353 (0.0030) [2024-06-25 01:55:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12883738624. Throughput: 0: 42815.7. Samples: 12883912180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 01:55:33,742][15401] Updated weights for policy 0, policy_version 786363 (0.0033) [2024-06-25 01:55:37,870][15401] Updated weights for policy 0, policy_version 786373 (0.0040) [2024-06-25 01:55:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12883951616. Throughput: 0: 42708.9. Samples: 12884039780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 01:55:41,543][15401] Updated weights for policy 0, policy_version 786383 (0.0030) [2024-06-25 01:55:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 12884197376. Throughput: 0: 42595.2. Samples: 12884294080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 01:55:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786389_12884197376.pth... [2024-06-25 01:55:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000785758_12873859072.pth [2024-06-25 01:55:45,654][15401] Updated weights for policy 0, policy_version 786393 (0.0035) [2024-06-25 01:55:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12884393984. Throughput: 0: 42787.4. Samples: 12884551940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 01:55:49,074][15401] Updated weights for policy 0, policy_version 786403 (0.0033) [2024-06-25 01:55:50,193][15349] Signal inference workers to stop experience collection... (190750 times) [2024-06-25 01:55:50,219][15401] InferenceWorker_p0-w0: stopping experience collection (190750 times) [2024-06-25 01:55:50,255][15349] Signal inference workers to resume experience collection... (190750 times) [2024-06-25 01:55:50,256][15401] InferenceWorker_p0-w0: resuming experience collection (190750 times) [2024-06-25 01:55:53,268][15401] Updated weights for policy 0, policy_version 786413 (0.0025) [2024-06-25 01:55:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 12884590592. Throughput: 0: 42620.4. Samples: 12884677920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 01:55:56,922][15401] Updated weights for policy 0, policy_version 786423 (0.0049) [2024-06-25 01:55:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.8). Total num frames: 12884836352. Throughput: 0: 42660.1. Samples: 12884935440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:55:58,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 01:56:01,000][15401] Updated weights for policy 0, policy_version 786433 (0.0051) [2024-06-25 01:56:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12885032960. Throughput: 0: 42710.7. Samples: 12885190900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:03,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 01:56:04,541][15401] Updated weights for policy 0, policy_version 786443 (0.0037) [2024-06-25 01:56:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12885229568. Throughput: 0: 42419.2. Samples: 12885314800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 01:56:08,642][15401] Updated weights for policy 0, policy_version 786453 (0.0033) [2024-06-25 01:56:12,190][15401] Updated weights for policy 0, policy_version 786463 (0.0025) [2024-06-25 01:56:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 12885458944. Throughput: 0: 42724.9. Samples: 12885575480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 01:56:16,411][15401] Updated weights for policy 0, policy_version 786473 (0.0030) [2024-06-25 01:56:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 12885688320. Throughput: 0: 42552.5. Samples: 12885827040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 01:56:19,701][15401] Updated weights for policy 0, policy_version 786483 (0.0031) [2024-06-25 01:56:23,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42050.7, 300 sec: 42875.7). Total num frames: 12885852160. Throughput: 0: 42521.8. Samples: 12885953360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:23,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 01:56:24,142][15401] Updated weights for policy 0, policy_version 786493 (0.0027) [2024-06-25 01:56:27,555][15401] Updated weights for policy 0, policy_version 786503 (0.0026) [2024-06-25 01:56:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 12886097920. Throughput: 0: 42451.5. Samples: 12886204400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 01:56:32,142][15401] Updated weights for policy 0, policy_version 786513 (0.0037) [2024-06-25 01:56:33,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12886310912. Throughput: 0: 42362.7. Samples: 12886458260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 01:56:35,109][15401] Updated weights for policy 0, policy_version 786523 (0.0027) [2024-06-25 01:56:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 12886507520. Throughput: 0: 42520.5. Samples: 12886591340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 01:56:39,637][15401] Updated weights for policy 0, policy_version 786533 (0.0031) [2024-06-25 01:56:42,644][15401] Updated weights for policy 0, policy_version 786543 (0.0033) [2024-06-25 01:56:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 12886736896. Throughput: 0: 42520.0. Samples: 12886848840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:43,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 01:56:47,296][15401] Updated weights for policy 0, policy_version 786553 (0.0035) [2024-06-25 01:56:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12886949888. Throughput: 0: 42497.8. Samples: 12887103300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:48,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 01:56:50,247][15401] Updated weights for policy 0, policy_version 786563 (0.0045) [2024-06-25 01:56:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12887146496. Throughput: 0: 42639.9. Samples: 12887233600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 01:56:54,766][15401] Updated weights for policy 0, policy_version 786573 (0.0034) [2024-06-25 01:56:58,160][15401] Updated weights for policy 0, policy_version 786583 (0.0035) [2024-06-25 01:56:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 12887375872. Throughput: 0: 42502.8. Samples: 12887488100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:56:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 01:57:02,712][15401] Updated weights for policy 0, policy_version 786593 (0.0047) [2024-06-25 01:57:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12887572480. Throughput: 0: 42598.6. Samples: 12887743980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:57:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 01:57:05,998][15401] Updated weights for policy 0, policy_version 786603 (0.0035) [2024-06-25 01:57:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12887785472. Throughput: 0: 42620.9. Samples: 12887871200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 01:57:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 01:57:08,720][15349] Signal inference workers to stop experience collection... (190800 times) [2024-06-25 01:57:08,720][15349] Signal inference workers to resume experience collection... (190800 times) [2024-06-25 01:57:08,768][15401] InferenceWorker_p0-w0: stopping experience collection (190800 times) [2024-06-25 01:57:08,768][15401] InferenceWorker_p0-w0: resuming experience collection (190800 times) [2024-06-25 01:57:10,363][15401] Updated weights for policy 0, policy_version 786613 (0.0028) [2024-06-25 01:57:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12887998464. Throughput: 0: 42671.9. Samples: 12888124640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 01:57:13,647][15401] Updated weights for policy 0, policy_version 786623 (0.0054) [2024-06-25 01:57:17,957][15401] Updated weights for policy 0, policy_version 786633 (0.0028) [2024-06-25 01:57:18,393][15132] Fps is (10 sec: 42585.3, 60 sec: 42050.0, 300 sec: 42764.6). Total num frames: 12888211456. Throughput: 0: 42742.9. Samples: 12888381820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:18,393][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 01:57:21,142][15401] Updated weights for policy 0, policy_version 786643 (0.0031) [2024-06-25 01:57:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 12888408064. Throughput: 0: 42528.9. Samples: 12888505140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 01:57:25,469][15401] Updated weights for policy 0, policy_version 786653 (0.0037) [2024-06-25 01:57:28,389][15132] Fps is (10 sec: 44250.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12888653824. Throughput: 0: 42605.8. Samples: 12888766100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 01:57:29,015][15401] Updated weights for policy 0, policy_version 786663 (0.0037) [2024-06-25 01:57:33,045][15401] Updated weights for policy 0, policy_version 786673 (0.0028) [2024-06-25 01:57:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12888850432. Throughput: 0: 42645.3. Samples: 12889022340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 01:57:36,504][15401] Updated weights for policy 0, policy_version 786683 (0.0031) [2024-06-25 01:57:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.6, 300 sec: 42710.1). Total num frames: 12889063424. Throughput: 0: 42477.8. Samples: 12889145200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:38,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 01:57:40,581][15401] Updated weights for policy 0, policy_version 786693 (0.0042) [2024-06-25 01:57:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12889292800. Throughput: 0: 42504.4. Samples: 12889400800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 01:57:43,517][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786701_12889309184.pth... [2024-06-25 01:57:43,568][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786073_12879020032.pth [2024-06-25 01:57:44,282][15401] Updated weights for policy 0, policy_version 786703 (0.0023) [2024-06-25 01:57:48,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 12889473024. Throughput: 0: 42573.4. Samples: 12889659780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:48,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 01:57:48,851][15401] Updated weights for policy 0, policy_version 786713 (0.0024) [2024-06-25 01:57:51,929][15401] Updated weights for policy 0, policy_version 786723 (0.0053) [2024-06-25 01:57:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 12889702400. Throughput: 0: 42397.3. Samples: 12889779180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:53,392][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 01:57:56,510][15401] Updated weights for policy 0, policy_version 786733 (0.0029) [2024-06-25 01:57:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12889915392. Throughput: 0: 42533.3. Samples: 12890038640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:57:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 01:57:59,504][15401] Updated weights for policy 0, policy_version 786743 (0.0042) [2024-06-25 01:58:03,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12890112000. Throughput: 0: 42620.3. Samples: 12890299600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 01:58:04,168][15401] Updated weights for policy 0, policy_version 786753 (0.0041) [2024-06-25 01:58:07,558][15401] Updated weights for policy 0, policy_version 786763 (0.0050) [2024-06-25 01:58:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12890341376. Throughput: 0: 42505.7. Samples: 12890417900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 01:58:11,676][15401] Updated weights for policy 0, policy_version 786773 (0.0041) [2024-06-25 01:58:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12890554368. Throughput: 0: 42389.3. Samples: 12890673620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 01:58:15,307][15401] Updated weights for policy 0, policy_version 786783 (0.0030) [2024-06-25 01:58:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.5, 300 sec: 42653.9). Total num frames: 12890767360. Throughput: 0: 42558.2. Samples: 12890937460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 01:58:19,419][15401] Updated weights for policy 0, policy_version 786793 (0.0038) [2024-06-25 01:58:22,888][15401] Updated weights for policy 0, policy_version 786803 (0.0033) [2024-06-25 01:58:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 12890980352. Throughput: 0: 42515.0. Samples: 12891058280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 01:58:27,037][15401] Updated weights for policy 0, policy_version 786813 (0.0027) [2024-06-25 01:58:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12891193344. Throughput: 0: 42614.3. Samples: 12891318440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:28,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 01:58:30,235][15401] Updated weights for policy 0, policy_version 786823 (0.0037) [2024-06-25 01:58:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12891406336. Throughput: 0: 42675.1. Samples: 12891580160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 01:58:34,901][15401] Updated weights for policy 0, policy_version 786833 (0.0041) [2024-06-25 01:58:37,954][15401] Updated weights for policy 0, policy_version 786843 (0.0040) [2024-06-25 01:58:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 12891635712. Throughput: 0: 42799.6. Samples: 12891705060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 01:58:42,387][15401] Updated weights for policy 0, policy_version 786853 (0.0036) [2024-06-25 01:58:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12891848704. Throughput: 0: 42854.3. Samples: 12891967080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 01:58:45,679][15401] Updated weights for policy 0, policy_version 786863 (0.0032) [2024-06-25 01:58:48,260][15349] Signal inference workers to stop experience collection... (190850 times) [2024-06-25 01:58:48,260][15349] Signal inference workers to resume experience collection... (190850 times) [2024-06-25 01:58:48,312][15401] InferenceWorker_p0-w0: stopping experience collection (190850 times) [2024-06-25 01:58:48,312][15401] InferenceWorker_p0-w0: resuming experience collection (190850 times) [2024-06-25 01:58:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12892028928. Throughput: 0: 42649.4. Samples: 12892218820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:48,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 01:58:50,593][15401] Updated weights for policy 0, policy_version 786873 (0.0032) [2024-06-25 01:58:53,331][15401] Updated weights for policy 0, policy_version 786883 (0.0037) [2024-06-25 01:58:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 12892291072. Throughput: 0: 42824.4. Samples: 12892345000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 01:58:53,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 01:58:58,142][15401] Updated weights for policy 0, policy_version 786893 (0.0037) [2024-06-25 01:58:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12892471296. Throughput: 0: 42814.2. Samples: 12892600260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:58:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 01:59:00,897][15401] Updated weights for policy 0, policy_version 786903 (0.0028) [2024-06-25 01:59:03,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12892667904. Throughput: 0: 42727.6. Samples: 12892860200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 01:59:05,703][15401] Updated weights for policy 0, policy_version 786913 (0.0026) [2024-06-25 01:59:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.9). Total num frames: 12892913664. Throughput: 0: 42935.6. Samples: 12892990380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 01:59:09,068][15401] Updated weights for policy 0, policy_version 786923 (0.0050) [2024-06-25 01:59:13,290][15401] Updated weights for policy 0, policy_version 786933 (0.0035) [2024-06-25 01:59:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12893110272. Throughput: 0: 42615.6. Samples: 12893236140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:13,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 01:59:16,962][15401] Updated weights for policy 0, policy_version 786943 (0.0040) [2024-06-25 01:59:18,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 12893323264. Throughput: 0: 42443.9. Samples: 12893490240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:18,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 01:59:20,878][15401] Updated weights for policy 0, policy_version 786953 (0.0044) [2024-06-25 01:59:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 12893552640. Throughput: 0: 42457.8. Samples: 12893615660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 01:59:24,658][15401] Updated weights for policy 0, policy_version 786963 (0.0036) [2024-06-25 01:59:28,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12893716480. Throughput: 0: 42261.3. Samples: 12893868840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:28,394][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 01:59:29,001][15401] Updated weights for policy 0, policy_version 786973 (0.0035) [2024-06-25 01:59:32,165][15401] Updated weights for policy 0, policy_version 786983 (0.0034) [2024-06-25 01:59:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12893962240. Throughput: 0: 42293.8. Samples: 12894122040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 01:59:36,602][15401] Updated weights for policy 0, policy_version 786993 (0.0024) [2024-06-25 01:59:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12894175232. Throughput: 0: 42594.2. Samples: 12894261740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 01:59:39,818][15401] Updated weights for policy 0, policy_version 787003 (0.0031) [2024-06-25 01:59:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 12894371840. Throughput: 0: 42445.6. Samples: 12894510320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 01:59:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787010_12894371840.pth... [2024-06-25 01:59:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786389_12884197376.pth [2024-06-25 01:59:44,139][15401] Updated weights for policy 0, policy_version 787013 (0.0041) [2024-06-25 01:59:47,515][15401] Updated weights for policy 0, policy_version 787023 (0.0024) [2024-06-25 01:59:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 12894601216. Throughput: 0: 42443.5. Samples: 12894770160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 01:59:51,684][15401] Updated weights for policy 0, policy_version 787033 (0.0046) [2024-06-25 01:59:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 12894797824. Throughput: 0: 42352.6. Samples: 12894896240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 01:59:55,681][15401] Updated weights for policy 0, policy_version 787043 (0.0042) [2024-06-25 01:59:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12895027200. Throughput: 0: 42507.9. Samples: 12895149000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 01:59:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 01:59:59,439][15401] Updated weights for policy 0, policy_version 787053 (0.0025) [2024-06-25 02:00:03,254][15401] Updated weights for policy 0, policy_version 787063 (0.0037) [2024-06-25 02:00:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12895240192. Throughput: 0: 42739.2. Samples: 12895413400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:00:06,977][15401] Updated weights for policy 0, policy_version 787073 (0.0033) [2024-06-25 02:00:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 12895453184. Throughput: 0: 42734.2. Samples: 12895538700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 02:00:10,860][15401] Updated weights for policy 0, policy_version 787083 (0.0028) [2024-06-25 02:00:11,144][15349] Signal inference workers to stop experience collection... (190900 times) [2024-06-25 02:00:11,184][15401] InferenceWorker_p0-w0: stopping experience collection (190900 times) [2024-06-25 02:00:11,204][15349] Signal inference workers to resume experience collection... (190900 times) [2024-06-25 02:00:11,206][15401] InferenceWorker_p0-w0: resuming experience collection (190900 times) [2024-06-25 02:00:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 12895682560. Throughput: 0: 42836.4. Samples: 12895796480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:13,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-25 02:00:14,346][15401] Updated weights for policy 0, policy_version 787093 (0.0028) [2024-06-25 02:00:18,382][15401] Updated weights for policy 0, policy_version 787103 (0.0026) [2024-06-25 02:00:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 12895895552. Throughput: 0: 43112.0. Samples: 12896062080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 02:00:21,793][15401] Updated weights for policy 0, policy_version 787113 (0.0025) [2024-06-25 02:00:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 12896092160. Throughput: 0: 42781.2. Samples: 12896186900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 02:00:25,895][15401] Updated weights for policy 0, policy_version 787123 (0.0037) [2024-06-25 02:00:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 12896337920. Throughput: 0: 42913.4. Samples: 12896441420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 02:00:29,768][15401] Updated weights for policy 0, policy_version 787133 (0.0033) [2024-06-25 02:00:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12896518144. Throughput: 0: 42921.5. Samples: 12896701620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 02:00:33,857][15401] Updated weights for policy 0, policy_version 787143 (0.0036) [2024-06-25 02:00:37,533][15401] Updated weights for policy 0, policy_version 787153 (0.0032) [2024-06-25 02:00:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 12896747520. Throughput: 0: 42830.2. Samples: 12896823600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 02:00:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 02:00:41,475][15401] Updated weights for policy 0, policy_version 787163 (0.0038) [2024-06-25 02:00:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 12896976896. Throughput: 0: 42927.1. Samples: 12897080720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:00:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 02:00:45,313][15401] Updated weights for policy 0, policy_version 787173 (0.0037) [2024-06-25 02:00:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12897157120. Throughput: 0: 42794.1. Samples: 12897339140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:00:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 02:00:49,100][15401] Updated weights for policy 0, policy_version 787183 (0.0029) [2024-06-25 02:00:53,042][15401] Updated weights for policy 0, policy_version 787193 (0.0030) [2024-06-25 02:00:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 12897386496. Throughput: 0: 42903.5. Samples: 12897469360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:00:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 02:00:56,645][15401] Updated weights for policy 0, policy_version 787203 (0.0038) [2024-06-25 02:00:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12897599488. Throughput: 0: 42720.5. Samples: 12897718900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:00:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 02:01:00,641][15401] Updated weights for policy 0, policy_version 787213 (0.0032) [2024-06-25 02:01:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12897812480. Throughput: 0: 42597.6. Samples: 12897978980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 02:01:04,393][15401] Updated weights for policy 0, policy_version 787223 (0.0027) [2024-06-25 02:01:08,373][15401] Updated weights for policy 0, policy_version 787233 (0.0034) [2024-06-25 02:01:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12898025472. Throughput: 0: 42702.7. Samples: 12898108520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 02:01:11,956][15401] Updated weights for policy 0, policy_version 787243 (0.0040) [2024-06-25 02:01:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 12898238464. Throughput: 0: 42615.6. Samples: 12898359120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 02:01:15,955][15401] Updated weights for policy 0, policy_version 787253 (0.0031) [2024-06-25 02:01:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 12898435072. Throughput: 0: 42606.2. Samples: 12898618900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 02:01:19,615][15401] Updated weights for policy 0, policy_version 787263 (0.0040) [2024-06-25 02:01:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 12898664448. Throughput: 0: 42713.8. Samples: 12898745720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 02:01:23,526][15401] Updated weights for policy 0, policy_version 787273 (0.0051) [2024-06-25 02:01:24,002][15349] Signal inference workers to stop experience collection... (190950 times) [2024-06-25 02:01:24,003][15349] Signal inference workers to resume experience collection... (190950 times) [2024-06-25 02:01:24,022][15401] InferenceWorker_p0-w0: stopping experience collection (190950 times) [2024-06-25 02:01:24,022][15401] InferenceWorker_p0-w0: resuming experience collection (190950 times) [2024-06-25 02:01:27,199][15401] Updated weights for policy 0, policy_version 787283 (0.0031) [2024-06-25 02:01:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12898877440. Throughput: 0: 42592.0. Samples: 12898997360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 02:01:31,084][15401] Updated weights for policy 0, policy_version 787293 (0.0032) [2024-06-25 02:01:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12899074048. Throughput: 0: 42798.4. Samples: 12899265060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 02:01:34,767][15401] Updated weights for policy 0, policy_version 787303 (0.0030) [2024-06-25 02:01:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12899303424. Throughput: 0: 42608.1. Samples: 12899386720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 02:01:38,784][15401] Updated weights for policy 0, policy_version 787313 (0.0040) [2024-06-25 02:01:42,876][15401] Updated weights for policy 0, policy_version 787323 (0.0047) [2024-06-25 02:01:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12899532800. Throughput: 0: 42801.8. Samples: 12899644980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:43,396][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 02:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787325_12899532800.pth... [2024-06-25 02:01:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000786701_12889309184.pth [2024-06-25 02:01:46,767][15401] Updated weights for policy 0, policy_version 787333 (0.0029) [2024-06-25 02:01:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 12899729408. Throughput: 0: 42595.7. Samples: 12899895780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 02:01:50,484][15401] Updated weights for policy 0, policy_version 787343 (0.0033) [2024-06-25 02:01:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12899942400. Throughput: 0: 42487.6. Samples: 12900020460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:53,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 02:01:54,529][15401] Updated weights for policy 0, policy_version 787353 (0.0027) [2024-06-25 02:01:58,241][15401] Updated weights for policy 0, policy_version 787363 (0.0030) [2024-06-25 02:01:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12900155392. Throughput: 0: 42773.8. Samples: 12900283940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:01:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 02:02:02,094][15401] Updated weights for policy 0, policy_version 787373 (0.0025) [2024-06-25 02:02:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12900368384. Throughput: 0: 42416.4. Samples: 12900527640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 02:02:06,135][15401] Updated weights for policy 0, policy_version 787383 (0.0028) [2024-06-25 02:02:08,390][15132] Fps is (10 sec: 44235.0, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 12900597760. Throughput: 0: 42524.5. Samples: 12900659340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:08,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 02:02:09,958][15401] Updated weights for policy 0, policy_version 787393 (0.0038) [2024-06-25 02:02:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.9). Total num frames: 12900777984. Throughput: 0: 42675.6. Samples: 12900917760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 02:02:13,627][15401] Updated weights for policy 0, policy_version 787403 (0.0040) [2024-06-25 02:02:17,602][15401] Updated weights for policy 0, policy_version 787413 (0.0043) [2024-06-25 02:02:18,389][15132] Fps is (10 sec: 40961.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12901007360. Throughput: 0: 42441.3. Samples: 12901174920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 02:02:21,373][15401] Updated weights for policy 0, policy_version 787423 (0.0039) [2024-06-25 02:02:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12901253120. Throughput: 0: 42635.6. Samples: 12901305320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 02:02:25,130][15401] Updated weights for policy 0, policy_version 787433 (0.0030) [2024-06-25 02:02:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12901433344. Throughput: 0: 42586.7. Samples: 12901561380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 02:02:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 02:02:28,878][15401] Updated weights for policy 0, policy_version 787443 (0.0036) [2024-06-25 02:02:32,679][15401] Updated weights for policy 0, policy_version 787453 (0.0033) [2024-06-25 02:02:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12901646336. Throughput: 0: 42635.1. Samples: 12901814360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 02:02:36,701][15401] Updated weights for policy 0, policy_version 787463 (0.0038) [2024-06-25 02:02:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12901875712. Throughput: 0: 42676.5. Samples: 12901940900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 02:02:40,294][15401] Updated weights for policy 0, policy_version 787473 (0.0047) [2024-06-25 02:02:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12902072320. Throughput: 0: 42568.4. Samples: 12902199520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 02:02:44,408][15401] Updated weights for policy 0, policy_version 787483 (0.0039) [2024-06-25 02:02:45,137][15349] Signal inference workers to stop experience collection... (191000 times) [2024-06-25 02:02:45,195][15401] InferenceWorker_p0-w0: stopping experience collection (191000 times) [2024-06-25 02:02:45,195][15349] Signal inference workers to resume experience collection... (191000 times) [2024-06-25 02:02:45,212][15401] InferenceWorker_p0-w0: resuming experience collection (191000 times) [2024-06-25 02:02:47,840][15401] Updated weights for policy 0, policy_version 787493 (0.0047) [2024-06-25 02:02:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12902285312. Throughput: 0: 42724.4. Samples: 12902450240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 02:02:52,043][15401] Updated weights for policy 0, policy_version 787503 (0.0033) [2024-06-25 02:02:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 12902531072. Throughput: 0: 42620.2. Samples: 12902577240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:53,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 02:02:55,939][15401] Updated weights for policy 0, policy_version 787513 (0.0031) [2024-06-25 02:02:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12902727680. Throughput: 0: 42688.0. Samples: 12902838720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:02:58,396][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 02:03:00,023][15401] Updated weights for policy 0, policy_version 787523 (0.0037) [2024-06-25 02:03:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12902924288. Throughput: 0: 42679.8. Samples: 12903095520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 02:03:03,522][15401] Updated weights for policy 0, policy_version 787533 (0.0035) [2024-06-25 02:03:07,613][15401] Updated weights for policy 0, policy_version 787543 (0.0037) [2024-06-25 02:03:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 12903170048. Throughput: 0: 42509.3. Samples: 12903218240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 02:03:11,334][15401] Updated weights for policy 0, policy_version 787553 (0.0033) [2024-06-25 02:03:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12903350272. Throughput: 0: 42562.6. Samples: 12903476700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 02:03:15,237][15401] Updated weights for policy 0, policy_version 787563 (0.0033) [2024-06-25 02:03:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12903546880. Throughput: 0: 42573.7. Samples: 12903730180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 02:03:18,885][15401] Updated weights for policy 0, policy_version 787573 (0.0038) [2024-06-25 02:03:22,716][15401] Updated weights for policy 0, policy_version 787583 (0.0037) [2024-06-25 02:03:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12903792640. Throughput: 0: 42626.2. Samples: 12903859080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:23,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 02:03:26,769][15401] Updated weights for policy 0, policy_version 787593 (0.0041) [2024-06-25 02:03:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 12904005632. Throughput: 0: 42542.1. Samples: 12904114020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:28,393][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 02:03:30,216][15401] Updated weights for policy 0, policy_version 787603 (0.0029) [2024-06-25 02:03:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12904202240. Throughput: 0: 42791.1. Samples: 12904375840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:33,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 02:03:34,199][15401] Updated weights for policy 0, policy_version 787613 (0.0047) [2024-06-25 02:03:37,858][15401] Updated weights for policy 0, policy_version 787623 (0.0036) [2024-06-25 02:03:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 12904415232. Throughput: 0: 42663.7. Samples: 12904497100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 02:03:39,305][15349] Signal inference workers to stop experience collection... (191050 times) [2024-06-25 02:03:39,305][15349] Signal inference workers to resume experience collection... (191050 times) [2024-06-25 02:03:39,327][15401] InferenceWorker_p0-w0: stopping experience collection (191050 times) [2024-06-25 02:03:39,328][15401] InferenceWorker_p0-w0: resuming experience collection (191050 times) [2024-06-25 02:03:42,007][15401] Updated weights for policy 0, policy_version 787633 (0.0034) [2024-06-25 02:03:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12904628224. Throughput: 0: 42592.3. Samples: 12904755380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 02:03:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787636_12904628224.pth... [2024-06-25 02:03:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787010_12894371840.pth [2024-06-25 02:03:45,318][15401] Updated weights for policy 0, policy_version 787643 (0.0036) [2024-06-25 02:03:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 12904824832. Throughput: 0: 42651.8. Samples: 12905014840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:48,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-25 02:03:49,728][15401] Updated weights for policy 0, policy_version 787653 (0.0037) [2024-06-25 02:03:52,971][15401] Updated weights for policy 0, policy_version 787663 (0.0028) [2024-06-25 02:03:53,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12905086976. Throughput: 0: 42751.0. Samples: 12905142040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:53,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 02:03:57,344][15401] Updated weights for policy 0, policy_version 787673 (0.0034) [2024-06-25 02:03:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12905267200. Throughput: 0: 42673.4. Samples: 12905397000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:03:58,394][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 02:04:00,754][15401] Updated weights for policy 0, policy_version 787683 (0.0021) [2024-06-25 02:04:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 12905480192. Throughput: 0: 42832.6. Samples: 12905657640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:04:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 02:04:04,934][15401] Updated weights for policy 0, policy_version 787693 (0.0040) [2024-06-25 02:04:08,287][15401] Updated weights for policy 0, policy_version 787703 (0.0024) [2024-06-25 02:04:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12905725952. Throughput: 0: 42757.3. Samples: 12905783160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:04:08,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-25 02:04:12,723][15401] Updated weights for policy 0, policy_version 787713 (0.0026) [2024-06-25 02:04:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12905906176. Throughput: 0: 42815.6. Samples: 12906040620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 02:04:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 02:04:16,391][15401] Updated weights for policy 0, policy_version 787723 (0.0029) [2024-06-25 02:04:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12906119168. Throughput: 0: 42508.9. Samples: 12906288740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 02:04:20,430][15401] Updated weights for policy 0, policy_version 787733 (0.0040) [2024-06-25 02:04:23,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42598.1, 300 sec: 42820.5). Total num frames: 12906348544. Throughput: 0: 42613.3. Samples: 12906414720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:23,391][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 02:04:24,045][15401] Updated weights for policy 0, policy_version 787743 (0.0032) [2024-06-25 02:04:28,094][15401] Updated weights for policy 0, policy_version 787753 (0.0036) [2024-06-25 02:04:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42325.4, 300 sec: 42653.6). Total num frames: 12906545152. Throughput: 0: 42696.5. Samples: 12906676820. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:28,392][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 02:04:31,735][15401] Updated weights for policy 0, policy_version 787763 (0.0029) [2024-06-25 02:04:33,390][15132] Fps is (10 sec: 40961.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12906758144. Throughput: 0: 42510.1. Samples: 12906927800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 02:04:35,763][15401] Updated weights for policy 0, policy_version 787773 (0.0042) [2024-06-25 02:04:38,396][15132] Fps is (10 sec: 44218.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 12906987520. Throughput: 0: 42526.0. Samples: 12907055980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:38,397][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 02:04:39,453][15401] Updated weights for policy 0, policy_version 787783 (0.0031) [2024-06-25 02:04:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12907184128. Throughput: 0: 42616.5. Samples: 12907314740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 02:04:43,436][15401] Updated weights for policy 0, policy_version 787793 (0.0027) [2024-06-25 02:04:47,273][15401] Updated weights for policy 0, policy_version 787803 (0.0034) [2024-06-25 02:04:48,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12907413504. Throughput: 0: 42466.5. Samples: 12907568640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 02:04:51,088][15401] Updated weights for policy 0, policy_version 787813 (0.0032) [2024-06-25 02:04:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12907642880. Throughput: 0: 42536.9. Samples: 12907697320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:04:54,944][15401] Updated weights for policy 0, policy_version 787823 (0.0031) [2024-06-25 02:04:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12907823104. Throughput: 0: 42528.9. Samples: 12907954420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:04:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 02:04:58,674][15401] Updated weights for policy 0, policy_version 787833 (0.0031) [2024-06-25 02:05:02,455][15401] Updated weights for policy 0, policy_version 787843 (0.0032) [2024-06-25 02:05:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12908052480. Throughput: 0: 42815.0. Samples: 12908215420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 02:05:06,001][15349] Signal inference workers to stop experience collection... (191100 times) [2024-06-25 02:05:06,051][15401] InferenceWorker_p0-w0: stopping experience collection (191100 times) [2024-06-25 02:05:06,052][15349] Signal inference workers to resume experience collection... (191100 times) [2024-06-25 02:05:06,060][15401] InferenceWorker_p0-w0: resuming experience collection (191100 times) [2024-06-25 02:05:06,193][15401] Updated weights for policy 0, policy_version 787853 (0.0038) [2024-06-25 02:05:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12908281856. Throughput: 0: 42831.1. Samples: 12908342100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:08,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 02:05:10,086][15401] Updated weights for policy 0, policy_version 787863 (0.0036) [2024-06-25 02:05:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 12908462080. Throughput: 0: 42729.3. Samples: 12908599540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 02:05:13,709][15401] Updated weights for policy 0, policy_version 787873 (0.0034) [2024-06-25 02:05:18,233][15401] Updated weights for policy 0, policy_version 787883 (0.0032) [2024-06-25 02:05:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12908675072. Throughput: 0: 42807.1. Samples: 12908854120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 02:05:21,347][15401] Updated weights for policy 0, policy_version 787893 (0.0047) [2024-06-25 02:05:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 12908920832. Throughput: 0: 42682.1. Samples: 12908976400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 02:05:25,771][15401] Updated weights for policy 0, policy_version 787903 (0.0027) [2024-06-25 02:05:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 12909101056. Throughput: 0: 42808.9. Samples: 12909241140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 02:05:29,225][15401] Updated weights for policy 0, policy_version 787913 (0.0042) [2024-06-25 02:05:33,301][15401] Updated weights for policy 0, policy_version 787923 (0.0037) [2024-06-25 02:05:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 12909330432. Throughput: 0: 42815.9. Samples: 12909495360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:33,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 02:05:36,883][15401] Updated weights for policy 0, policy_version 787933 (0.0045) [2024-06-25 02:05:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 12909559808. Throughput: 0: 42749.2. Samples: 12909621040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 02:05:41,844][15401] Updated weights for policy 0, policy_version 787943 (0.0038) [2024-06-25 02:05:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12909740032. Throughput: 0: 42771.1. Samples: 12909879120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 02:05:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787949_12909756416.pth... [2024-06-25 02:05:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787325_12899532800.pth [2024-06-25 02:05:44,772][15401] Updated weights for policy 0, policy_version 787953 (0.0029) [2024-06-25 02:05:48,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 12909953024. Throughput: 0: 42466.8. Samples: 12910126420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:48,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 02:05:49,427][15401] Updated weights for policy 0, policy_version 787963 (0.0032) [2024-06-25 02:05:52,553][15401] Updated weights for policy 0, policy_version 787973 (0.0027) [2024-06-25 02:05:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12910198784. Throughput: 0: 42506.3. Samples: 12910254880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 02:05:56,963][15401] Updated weights for policy 0, policy_version 787983 (0.0036) [2024-06-25 02:05:58,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 12910362624. Throughput: 0: 42564.0. Samples: 12910514920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-25 02:05:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 02:06:00,259][15401] Updated weights for policy 0, policy_version 787993 (0.0049) [2024-06-25 02:06:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 12910608384. Throughput: 0: 42362.7. Samples: 12910760540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:03,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 02:06:04,609][15401] Updated weights for policy 0, policy_version 788003 (0.0041) [2024-06-25 02:06:07,951][15401] Updated weights for policy 0, policy_version 788013 (0.0040) [2024-06-25 02:06:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12910821376. Throughput: 0: 42676.0. Samples: 12910896820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 02:06:12,080][15401] Updated weights for policy 0, policy_version 788023 (0.0032) [2024-06-25 02:06:13,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12911001600. Throughput: 0: 42359.9. Samples: 12911147340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 02:06:15,491][15401] Updated weights for policy 0, policy_version 788033 (0.0035) [2024-06-25 02:06:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12911247360. Throughput: 0: 42287.1. Samples: 12911398280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 02:06:19,701][15401] Updated weights for policy 0, policy_version 788043 (0.0039) [2024-06-25 02:06:23,314][15401] Updated weights for policy 0, policy_version 788053 (0.0037) [2024-06-25 02:06:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12911460352. Throughput: 0: 42531.3. Samples: 12911534940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 02:06:27,159][15401] Updated weights for policy 0, policy_version 788063 (0.0035) [2024-06-25 02:06:28,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 12911624192. Throughput: 0: 42365.8. Samples: 12911785580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 02:06:29,221][15349] Signal inference workers to stop experience collection... (191150 times) [2024-06-25 02:06:29,264][15401] InferenceWorker_p0-w0: stopping experience collection (191150 times) [2024-06-25 02:06:29,290][15349] Signal inference workers to resume experience collection... (191150 times) [2024-06-25 02:06:29,296][15401] InferenceWorker_p0-w0: resuming experience collection (191150 times) [2024-06-25 02:06:31,100][15401] Updated weights for policy 0, policy_version 788073 (0.0050) [2024-06-25 02:06:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12911902720. Throughput: 0: 42472.2. Samples: 12912037680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:33,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 02:06:35,366][15401] Updated weights for policy 0, policy_version 788083 (0.0039) [2024-06-25 02:06:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 12912082944. Throughput: 0: 42696.0. Samples: 12912176200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:38,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 02:06:38,731][15401] Updated weights for policy 0, policy_version 788093 (0.0028) [2024-06-25 02:06:42,903][15401] Updated weights for policy 0, policy_version 788103 (0.0054) [2024-06-25 02:06:43,389][15132] Fps is (10 sec: 37684.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 12912279552. Throughput: 0: 42447.3. Samples: 12912425040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 02:06:46,345][15401] Updated weights for policy 0, policy_version 788113 (0.0042) [2024-06-25 02:06:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12912541696. Throughput: 0: 42622.4. Samples: 12912678440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 02:06:50,446][15401] Updated weights for policy 0, policy_version 788123 (0.0035) [2024-06-25 02:06:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12912721920. Throughput: 0: 42613.0. Samples: 12912814400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 02:06:54,335][15401] Updated weights for policy 0, policy_version 788133 (0.0033) [2024-06-25 02:06:57,940][15401] Updated weights for policy 0, policy_version 788143 (0.0034) [2024-06-25 02:06:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 12912934912. Throughput: 0: 42584.8. Samples: 12913063660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:06:58,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 02:07:02,079][15401] Updated weights for policy 0, policy_version 788153 (0.0037) [2024-06-25 02:07:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 12913180672. Throughput: 0: 42565.8. Samples: 12913313740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 02:07:05,491][15401] Updated weights for policy 0, policy_version 788163 (0.0039) [2024-06-25 02:07:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 12913344512. Throughput: 0: 42550.7. Samples: 12913449720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:08,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 02:07:09,728][15401] Updated weights for policy 0, policy_version 788173 (0.0043) [2024-06-25 02:07:13,021][15401] Updated weights for policy 0, policy_version 788183 (0.0039) [2024-06-25 02:07:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12913590272. Throughput: 0: 42388.0. Samples: 12913693040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 02:07:17,452][15401] Updated weights for policy 0, policy_version 788193 (0.0040) [2024-06-25 02:07:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 12913803264. Throughput: 0: 42578.3. Samples: 12913953700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:18,393][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 02:07:21,076][15401] Updated weights for policy 0, policy_version 788203 (0.0032) [2024-06-25 02:07:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 12913983488. Throughput: 0: 42444.8. Samples: 12914086220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:23,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 02:07:25,003][15401] Updated weights for policy 0, policy_version 788213 (0.0035) [2024-06-25 02:07:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 12914212864. Throughput: 0: 42477.6. Samples: 12914336540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 02:07:28,967][15401] Updated weights for policy 0, policy_version 788223 (0.0035) [2024-06-25 02:07:32,540][15401] Updated weights for policy 0, policy_version 788233 (0.0031) [2024-06-25 02:07:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 12914442240. Throughput: 0: 42640.0. Samples: 12914597240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 02:07:34,352][15349] Signal inference workers to stop experience collection... (191200 times) [2024-06-25 02:07:34,352][15349] Signal inference workers to resume experience collection... (191200 times) [2024-06-25 02:07:34,398][15401] InferenceWorker_p0-w0: stopping experience collection (191200 times) [2024-06-25 02:07:34,398][15401] InferenceWorker_p0-w0: resuming experience collection (191200 times) [2024-06-25 02:07:36,456][15401] Updated weights for policy 0, policy_version 788243 (0.0036) [2024-06-25 02:07:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12914638848. Throughput: 0: 42752.7. Samples: 12914738280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:38,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 02:07:40,170][15401] Updated weights for policy 0, policy_version 788253 (0.0036) [2024-06-25 02:07:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 12914868224. Throughput: 0: 42882.3. Samples: 12914993360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 02:07:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 02:07:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788261_12914868224.pth... [2024-06-25 02:07:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787636_12904628224.pth [2024-06-25 02:07:43,858][15401] Updated weights for policy 0, policy_version 788263 (0.0025) [2024-06-25 02:07:47,619][15401] Updated weights for policy 0, policy_version 788273 (0.0025) [2024-06-25 02:07:48,392][15132] Fps is (10 sec: 47503.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 12915113984. Throughput: 0: 42996.5. Samples: 12915248680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:07:48,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 02:07:51,426][15401] Updated weights for policy 0, policy_version 788283 (0.0032) [2024-06-25 02:07:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 12915277824. Throughput: 0: 42964.9. Samples: 12915383140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:07:53,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 02:07:55,257][15401] Updated weights for policy 0, policy_version 788293 (0.0039) [2024-06-25 02:07:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12915523584. Throughput: 0: 43178.7. Samples: 12915636080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:07:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 02:07:58,893][15401] Updated weights for policy 0, policy_version 788303 (0.0031) [2024-06-25 02:08:02,718][15401] Updated weights for policy 0, policy_version 788313 (0.0030) [2024-06-25 02:08:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12915752960. Throughput: 0: 43227.6. Samples: 12915898940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 02:08:07,044][15401] Updated weights for policy 0, policy_version 788323 (0.0047) [2024-06-25 02:08:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 12915933184. Throughput: 0: 43168.0. Samples: 12916028880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:08,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 02:08:10,289][15401] Updated weights for policy 0, policy_version 788333 (0.0035) [2024-06-25 02:08:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12916178944. Throughput: 0: 43137.3. Samples: 12916277720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:13,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 02:08:14,525][15401] Updated weights for policy 0, policy_version 788343 (0.0028) [2024-06-25 02:08:18,179][15401] Updated weights for policy 0, policy_version 788353 (0.0043) [2024-06-25 02:08:18,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 12916375552. Throughput: 0: 43194.2. Samples: 12916540980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 02:08:22,163][15401] Updated weights for policy 0, policy_version 788363 (0.0041) [2024-06-25 02:08:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.7, 300 sec: 42654.3). Total num frames: 12916588544. Throughput: 0: 42948.6. Samples: 12916670960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:23,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 02:08:25,764][15401] Updated weights for policy 0, policy_version 788373 (0.0035) [2024-06-25 02:08:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 12916817920. Throughput: 0: 42803.5. Samples: 12916919520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 02:08:29,994][15401] Updated weights for policy 0, policy_version 788383 (0.0041) [2024-06-25 02:08:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12917014528. Throughput: 0: 43068.5. Samples: 12917186660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 02:08:33,423][15401] Updated weights for policy 0, policy_version 788393 (0.0033) [2024-06-25 02:08:37,585][15401] Updated weights for policy 0, policy_version 788403 (0.0034) [2024-06-25 02:08:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12917227520. Throughput: 0: 42774.1. Samples: 12917307980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:38,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 02:08:41,041][15401] Updated weights for policy 0, policy_version 788413 (0.0030) [2024-06-25 02:08:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 12917473280. Throughput: 0: 42794.6. Samples: 12917561940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:43,393][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 02:08:45,548][15401] Updated weights for policy 0, policy_version 788423 (0.0034) [2024-06-25 02:08:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 12917637120. Throughput: 0: 42716.5. Samples: 12917821180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 02:08:48,868][15401] Updated weights for policy 0, policy_version 788433 (0.0037) [2024-06-25 02:08:53,145][15401] Updated weights for policy 0, policy_version 788443 (0.0037) [2024-06-25 02:08:53,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 12917850112. Throughput: 0: 42378.8. Samples: 12917935820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 02:08:56,486][15401] Updated weights for policy 0, policy_version 788453 (0.0029) [2024-06-25 02:08:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12918095872. Throughput: 0: 42658.3. Samples: 12918197340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:08:58,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 02:09:00,636][15401] Updated weights for policy 0, policy_version 788463 (0.0040) [2024-06-25 02:09:01,387][15349] Signal inference workers to stop experience collection... (191250 times) [2024-06-25 02:09:01,387][15349] Signal inference workers to resume experience collection... (191250 times) [2024-06-25 02:09:01,409][15401] InferenceWorker_p0-w0: stopping experience collection (191250 times) [2024-06-25 02:09:01,411][15401] InferenceWorker_p0-w0: resuming experience collection (191250 times) [2024-06-25 02:09:03,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 12918276096. Throughput: 0: 42666.6. Samples: 12918461080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:03,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 02:09:04,116][15401] Updated weights for policy 0, policy_version 788473 (0.0041) [2024-06-25 02:09:08,316][15401] Updated weights for policy 0, policy_version 788483 (0.0038) [2024-06-25 02:09:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 12918505472. Throughput: 0: 42380.0. Samples: 12918578060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 02:09:11,760][15401] Updated weights for policy 0, policy_version 788493 (0.0029) [2024-06-25 02:09:13,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12918734848. Throughput: 0: 42615.2. Samples: 12918837200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 02:09:16,234][15401] Updated weights for policy 0, policy_version 788503 (0.0041) [2024-06-25 02:09:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 12918915072. Throughput: 0: 42447.0. Samples: 12919096780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:18,394][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 02:09:19,551][15401] Updated weights for policy 0, policy_version 788513 (0.0029) [2024-06-25 02:09:23,396][15132] Fps is (10 sec: 39296.2, 60 sec: 42320.8, 300 sec: 42653.4). Total num frames: 12919128064. Throughput: 0: 42504.7. Samples: 12919220960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:23,396][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 02:09:23,880][15401] Updated weights for policy 0, policy_version 788523 (0.0032) [2024-06-25 02:09:27,190][15401] Updated weights for policy 0, policy_version 788533 (0.0024) [2024-06-25 02:09:28,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12919373824. Throughput: 0: 42588.9. Samples: 12919478340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-25 02:09:28,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 02:09:31,542][15401] Updated weights for policy 0, policy_version 788543 (0.0033) [2024-06-25 02:09:33,392][15132] Fps is (10 sec: 44254.7, 60 sec: 42596.7, 300 sec: 42654.5). Total num frames: 12919570432. Throughput: 0: 42679.5. Samples: 12919741860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:33,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 02:09:35,082][15401] Updated weights for policy 0, policy_version 788553 (0.0023) [2024-06-25 02:09:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12919767040. Throughput: 0: 42782.2. Samples: 12919861020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 02:09:39,067][15401] Updated weights for policy 0, policy_version 788563 (0.0033) [2024-06-25 02:09:42,531][15401] Updated weights for policy 0, policy_version 788573 (0.0030) [2024-06-25 02:09:43,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 12920012800. Throughput: 0: 42866.2. Samples: 12920126320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:43,394][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 02:09:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788575_12920012800.pth... [2024-06-25 02:09:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000787949_12909756416.pth [2024-06-25 02:09:46,604][15401] Updated weights for policy 0, policy_version 788583 (0.0032) [2024-06-25 02:09:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12920225792. Throughput: 0: 42803.7. Samples: 12920387140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 02:09:50,530][15401] Updated weights for policy 0, policy_version 788593 (0.0043) [2024-06-25 02:09:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 12920438784. Throughput: 0: 42999.0. Samples: 12920513020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:53,391][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 02:09:53,917][15401] Updated weights for policy 0, policy_version 788603 (0.0028) [2024-06-25 02:09:58,289][15401] Updated weights for policy 0, policy_version 788613 (0.0033) [2024-06-25 02:09:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 12920635392. Throughput: 0: 42987.6. Samples: 12920771640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:09:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 02:10:01,814][15401] Updated weights for policy 0, policy_version 788623 (0.0052) [2024-06-25 02:10:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 12920848384. Throughput: 0: 42949.8. Samples: 12921029520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:03,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 02:10:05,772][15401] Updated weights for policy 0, policy_version 788633 (0.0041) [2024-06-25 02:10:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12921077760. Throughput: 0: 43044.3. Samples: 12921157680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 02:10:09,251][15401] Updated weights for policy 0, policy_version 788643 (0.0026) [2024-06-25 02:10:13,326][15401] Updated weights for policy 0, policy_version 788653 (0.0036) [2024-06-25 02:10:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12921290752. Throughput: 0: 43079.9. Samples: 12921416940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 02:10:16,801][15401] Updated weights for policy 0, policy_version 788663 (0.0034) [2024-06-25 02:10:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12921503744. Throughput: 0: 42929.4. Samples: 12921673580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 02:10:20,798][15401] Updated weights for policy 0, policy_version 788673 (0.0043) [2024-06-25 02:10:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 12921716736. Throughput: 0: 43252.1. Samples: 12921807360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 02:10:24,382][15401] Updated weights for policy 0, policy_version 788683 (0.0023) [2024-06-25 02:10:27,017][15349] Signal inference workers to stop experience collection... (191300 times) [2024-06-25 02:10:27,063][15401] InferenceWorker_p0-w0: stopping experience collection (191300 times) [2024-06-25 02:10:27,073][15349] Signal inference workers to resume experience collection... (191300 times) [2024-06-25 02:10:27,081][15401] InferenceWorker_p0-w0: resuming experience collection (191300 times) [2024-06-25 02:10:28,370][15401] Updated weights for policy 0, policy_version 788693 (0.0041) [2024-06-25 02:10:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12921946112. Throughput: 0: 43009.4. Samples: 12922061740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 02:10:31,867][15401] Updated weights for policy 0, policy_version 788703 (0.0036) [2024-06-25 02:10:33,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43144.5, 300 sec: 42709.1). Total num frames: 12922159104. Throughput: 0: 42889.6. Samples: 12922317280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:33,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 02:10:36,430][15401] Updated weights for policy 0, policy_version 788713 (0.0033) [2024-06-25 02:10:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12922355712. Throughput: 0: 43053.9. Samples: 12922450440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 02:10:39,717][15401] Updated weights for policy 0, policy_version 788723 (0.0042) [2024-06-25 02:10:43,392][15132] Fps is (10 sec: 42598.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12922585088. Throughput: 0: 42997.2. Samples: 12922706620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:43,392][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 02:10:43,876][15401] Updated weights for policy 0, policy_version 788733 (0.0040) [2024-06-25 02:10:47,383][15401] Updated weights for policy 0, policy_version 788743 (0.0026) [2024-06-25 02:10:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12922798080. Throughput: 0: 42993.7. Samples: 12922964240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 02:10:51,358][15401] Updated weights for policy 0, policy_version 788753 (0.0033) [2024-06-25 02:10:53,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 12923011072. Throughput: 0: 43070.2. Samples: 12923095940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:53,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 02:10:55,337][15401] Updated weights for policy 0, policy_version 788763 (0.0039) [2024-06-25 02:10:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 12923224064. Throughput: 0: 42963.6. Samples: 12923350300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:10:58,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 02:10:58,863][15401] Updated weights for policy 0, policy_version 788773 (0.0035) [2024-06-25 02:11:02,799][15401] Updated weights for policy 0, policy_version 788783 (0.0025) [2024-06-25 02:11:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12923437056. Throughput: 0: 43053.4. Samples: 12923610980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:11:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:11:06,481][15401] Updated weights for policy 0, policy_version 788793 (0.0040) [2024-06-25 02:11:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12923666432. Throughput: 0: 42947.5. Samples: 12923740000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:11:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 02:11:10,646][15401] Updated weights for policy 0, policy_version 788803 (0.0024) [2024-06-25 02:11:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12923879424. Throughput: 0: 42881.8. Samples: 12923991420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:11:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 02:11:13,954][15401] Updated weights for policy 0, policy_version 788813 (0.0027) [2024-06-25 02:11:18,193][15401] Updated weights for policy 0, policy_version 788823 (0.0038) [2024-06-25 02:11:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 12924076032. Throughput: 0: 43099.2. Samples: 12924256740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 02:11:18,392][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 02:11:21,673][15401] Updated weights for policy 0, policy_version 788833 (0.0039) [2024-06-25 02:11:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12924289024. Throughput: 0: 42956.0. Samples: 12924383460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 02:11:25,726][15401] Updated weights for policy 0, policy_version 788843 (0.0038) [2024-06-25 02:11:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12924518400. Throughput: 0: 42866.3. Samples: 12924635500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 02:11:29,166][15401] Updated weights for policy 0, policy_version 788853 (0.0043) [2024-06-25 02:11:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 12924715008. Throughput: 0: 42901.7. Samples: 12924894820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:33,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:11:33,408][15401] Updated weights for policy 0, policy_version 788863 (0.0029) [2024-06-25 02:11:36,912][15401] Updated weights for policy 0, policy_version 788873 (0.0034) [2024-06-25 02:11:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12924928000. Throughput: 0: 42729.3. Samples: 12925018660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 02:11:40,998][15401] Updated weights for policy 0, policy_version 788883 (0.0043) [2024-06-25 02:11:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12925140992. Throughput: 0: 42669.8. Samples: 12925270440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 02:11:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788889_12925157376.pth... [2024-06-25 02:11:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788261_12914868224.pth [2024-06-25 02:11:44,566][15401] Updated weights for policy 0, policy_version 788893 (0.0035) [2024-06-25 02:11:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12925353984. Throughput: 0: 42501.6. Samples: 12925523560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 02:11:48,939][15401] Updated weights for policy 0, policy_version 788903 (0.0039) [2024-06-25 02:11:50,206][15349] Signal inference workers to stop experience collection... (191350 times) [2024-06-25 02:11:50,256][15401] InferenceWorker_p0-w0: stopping experience collection (191350 times) [2024-06-25 02:11:50,263][15349] Signal inference workers to resume experience collection... (191350 times) [2024-06-25 02:11:50,266][15401] InferenceWorker_p0-w0: resuming experience collection (191350 times) [2024-06-25 02:11:52,611][15401] Updated weights for policy 0, policy_version 788913 (0.0049) [2024-06-25 02:11:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 12925566976. Throughput: 0: 42487.5. Samples: 12925651940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:53,395][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 02:11:56,590][15401] Updated weights for policy 0, policy_version 788923 (0.0028) [2024-06-25 02:11:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12925796352. Throughput: 0: 42631.0. Samples: 12925909820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:11:58,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 02:12:00,098][15401] Updated weights for policy 0, policy_version 788933 (0.0044) [2024-06-25 02:12:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 12926009344. Throughput: 0: 42482.6. Samples: 12926168360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:03,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-25 02:12:04,221][15401] Updated weights for policy 0, policy_version 788943 (0.0044) [2024-06-25 02:12:08,015][15401] Updated weights for policy 0, policy_version 788953 (0.0039) [2024-06-25 02:12:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12926205952. Throughput: 0: 42458.6. Samples: 12926294100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 02:12:11,763][15401] Updated weights for policy 0, policy_version 788963 (0.0031) [2024-06-25 02:12:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12926451712. Throughput: 0: 42514.5. Samples: 12926548660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 02:12:16,253][15401] Updated weights for policy 0, policy_version 788973 (0.0034) [2024-06-25 02:12:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 12926648320. Throughput: 0: 42390.4. Samples: 12926802380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 02:12:19,399][15401] Updated weights for policy 0, policy_version 788983 (0.0024) [2024-06-25 02:12:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12926828544. Throughput: 0: 42344.1. Samples: 12926924140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 02:12:23,816][15401] Updated weights for policy 0, policy_version 788993 (0.0041) [2024-06-25 02:12:27,597][15401] Updated weights for policy 0, policy_version 789003 (0.0030) [2024-06-25 02:12:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12927074304. Throughput: 0: 42469.8. Samples: 12927181580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:28,391][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 02:12:31,531][15401] Updated weights for policy 0, policy_version 789013 (0.0041) [2024-06-25 02:12:33,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 12927254528. Throughput: 0: 42518.0. Samples: 12927437140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:33,397][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 02:12:35,220][15401] Updated weights for policy 0, policy_version 789023 (0.0027) [2024-06-25 02:12:38,393][15132] Fps is (10 sec: 39309.0, 60 sec: 42323.1, 300 sec: 42709.0). Total num frames: 12927467520. Throughput: 0: 42418.4. Samples: 12927560900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:38,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 02:12:39,096][15401] Updated weights for policy 0, policy_version 789033 (0.0036) [2024-06-25 02:12:42,764][15401] Updated weights for policy 0, policy_version 789043 (0.0032) [2024-06-25 02:12:43,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 12927696896. Throughput: 0: 42415.6. Samples: 12927818520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 02:12:46,783][15401] Updated weights for policy 0, policy_version 789053 (0.0042) [2024-06-25 02:12:48,390][15132] Fps is (10 sec: 44250.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12927909888. Throughput: 0: 42386.7. Samples: 12928075760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:48,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 02:12:50,792][15401] Updated weights for policy 0, policy_version 789063 (0.0036) [2024-06-25 02:12:52,041][15349] Signal inference workers to stop experience collection... (191400 times) [2024-06-25 02:12:52,041][15349] Signal inference workers to resume experience collection... (191400 times) [2024-06-25 02:12:52,088][15401] InferenceWorker_p0-w0: stopping experience collection (191400 times) [2024-06-25 02:12:52,088][15401] InferenceWorker_p0-w0: resuming experience collection (191400 times) [2024-06-25 02:12:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 12928106496. Throughput: 0: 42402.7. Samples: 12928202220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 02:12:54,311][15401] Updated weights for policy 0, policy_version 789073 (0.0046) [2024-06-25 02:12:58,240][15401] Updated weights for policy 0, policy_version 789083 (0.0024) [2024-06-25 02:12:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12928335872. Throughput: 0: 42459.5. Samples: 12928459340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:12:58,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-25 02:13:02,045][15401] Updated weights for policy 0, policy_version 789093 (0.0036) [2024-06-25 02:13:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 12928548864. Throughput: 0: 42565.3. Samples: 12928717820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 02:13:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 02:13:05,757][15401] Updated weights for policy 0, policy_version 789103 (0.0042) [2024-06-25 02:13:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12928778240. Throughput: 0: 42702.2. Samples: 12928845740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 02:13:09,498][15401] Updated weights for policy 0, policy_version 789113 (0.0028) [2024-06-25 02:13:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 12928974848. Throughput: 0: 42710.7. Samples: 12929103560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 02:13:13,471][15401] Updated weights for policy 0, policy_version 789123 (0.0028) [2024-06-25 02:13:17,213][15401] Updated weights for policy 0, policy_version 789133 (0.0025) [2024-06-25 02:13:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12929187840. Throughput: 0: 42975.5. Samples: 12929370760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 02:13:21,039][15401] Updated weights for policy 0, policy_version 789143 (0.0044) [2024-06-25 02:13:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12929417216. Throughput: 0: 42953.4. Samples: 12929493660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 02:13:24,970][15401] Updated weights for policy 0, policy_version 789153 (0.0040) [2024-06-25 02:13:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12929630208. Throughput: 0: 42919.1. Samples: 12929749880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:28,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 02:13:28,600][15401] Updated weights for policy 0, policy_version 789163 (0.0038) [2024-06-25 02:13:32,608][15401] Updated weights for policy 0, policy_version 789173 (0.0035) [2024-06-25 02:13:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43149.2, 300 sec: 42765.0). Total num frames: 12929843200. Throughput: 0: 43128.1. Samples: 12930016520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 02:13:36,094][15401] Updated weights for policy 0, policy_version 789183 (0.0041) [2024-06-25 02:13:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.9, 300 sec: 42654.3). Total num frames: 12930056192. Throughput: 0: 42970.7. Samples: 12930135900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 02:13:40,135][15401] Updated weights for policy 0, policy_version 789193 (0.0029) [2024-06-25 02:13:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12930269184. Throughput: 0: 43000.2. Samples: 12930394340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 02:13:43,468][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789202_12930285568.pth... [2024-06-25 02:13:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788575_12920012800.pth [2024-06-25 02:13:43,690][15401] Updated weights for policy 0, policy_version 789203 (0.0041) [2024-06-25 02:13:47,946][15401] Updated weights for policy 0, policy_version 789213 (0.0032) [2024-06-25 02:13:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12930482176. Throughput: 0: 42997.9. Samples: 12930652720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 02:13:51,162][15401] Updated weights for policy 0, policy_version 789223 (0.0046) [2024-06-25 02:13:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12930695168. Throughput: 0: 42947.2. Samples: 12930778360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:53,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 02:13:55,577][15401] Updated weights for policy 0, policy_version 789233 (0.0037) [2024-06-25 02:13:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42876.5). Total num frames: 12930924544. Throughput: 0: 43000.1. Samples: 12931038560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:13:58,398][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 02:13:59,379][15401] Updated weights for policy 0, policy_version 789243 (0.0034) [2024-06-25 02:14:03,093][15401] Updated weights for policy 0, policy_version 789253 (0.0041) [2024-06-25 02:14:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12931137536. Throughput: 0: 42859.0. Samples: 12931299420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:03,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 02:14:06,845][15401] Updated weights for policy 0, policy_version 789263 (0.0037) [2024-06-25 02:14:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12931350528. Throughput: 0: 42911.4. Samples: 12931424680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 02:14:10,781][15401] Updated weights for policy 0, policy_version 789273 (0.0027) [2024-06-25 02:14:13,396][15132] Fps is (10 sec: 42571.4, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 12931563520. Throughput: 0: 42837.0. Samples: 12931677820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:13,396][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 02:14:14,958][15401] Updated weights for policy 0, policy_version 789283 (0.0042) [2024-06-25 02:14:14,985][15349] Signal inference workers to stop experience collection... (191450 times) [2024-06-25 02:14:14,986][15349] Signal inference workers to resume experience collection... (191450 times) [2024-06-25 02:14:15,003][15401] InferenceWorker_p0-w0: stopping experience collection (191450 times) [2024-06-25 02:14:15,003][15401] InferenceWorker_p0-w0: resuming experience collection (191450 times) [2024-06-25 02:14:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 12931760128. Throughput: 0: 42621.3. Samples: 12931934480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 02:14:18,514][15401] Updated weights for policy 0, policy_version 789293 (0.0032) [2024-06-25 02:14:22,562][15401] Updated weights for policy 0, policy_version 789303 (0.0037) [2024-06-25 02:14:23,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12931973120. Throughput: 0: 42705.6. Samples: 12932057660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:23,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 02:14:25,996][15401] Updated weights for policy 0, policy_version 789313 (0.0042) [2024-06-25 02:14:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 12932202496. Throughput: 0: 42656.0. Samples: 12932313860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 02:14:30,130][15401] Updated weights for policy 0, policy_version 789323 (0.0041) [2024-06-25 02:14:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12932399104. Throughput: 0: 42649.8. Samples: 12932571960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 02:14:33,765][15401] Updated weights for policy 0, policy_version 789333 (0.0035) [2024-06-25 02:14:37,764][15401] Updated weights for policy 0, policy_version 789343 (0.0037) [2024-06-25 02:14:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12932612096. Throughput: 0: 42631.0. Samples: 12932696760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 02:14:41,535][15401] Updated weights for policy 0, policy_version 789353 (0.0030) [2024-06-25 02:14:43,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 12932841472. Throughput: 0: 42532.9. Samples: 12932952820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:43,397][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 02:14:45,287][15401] Updated weights for policy 0, policy_version 789363 (0.0041) [2024-06-25 02:14:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 12933021696. Throughput: 0: 42490.8. Samples: 12933211500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 02:14:48,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 02:14:49,087][15401] Updated weights for policy 0, policy_version 789373 (0.0029) [2024-06-25 02:14:53,040][15401] Updated weights for policy 0, policy_version 789383 (0.0036) [2024-06-25 02:14:53,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12933251072. Throughput: 0: 42472.0. Samples: 12933335920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:14:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:14:56,629][15401] Updated weights for policy 0, policy_version 789393 (0.0031) [2024-06-25 02:14:58,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12933496832. Throughput: 0: 42622.1. Samples: 12933595540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:14:58,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 02:15:00,611][15401] Updated weights for policy 0, policy_version 789403 (0.0035) [2024-06-25 02:15:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12933677056. Throughput: 0: 42701.4. Samples: 12933856040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:03,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 02:15:04,112][15401] Updated weights for policy 0, policy_version 789413 (0.0040) [2024-06-25 02:15:08,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 12933890048. Throughput: 0: 42666.2. Samples: 12933977740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:08,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 02:15:08,973][15401] Updated weights for policy 0, policy_version 789423 (0.0043) [2024-06-25 02:15:12,083][15401] Updated weights for policy 0, policy_version 789433 (0.0029) [2024-06-25 02:15:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42601.2, 300 sec: 42764.7). Total num frames: 12934119424. Throughput: 0: 42523.0. Samples: 12934227500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:13,401][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 02:15:16,625][15401] Updated weights for policy 0, policy_version 789443 (0.0040) [2024-06-25 02:15:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12934316032. Throughput: 0: 42543.9. Samples: 12934486440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:18,392][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 02:15:19,617][15401] Updated weights for policy 0, policy_version 789453 (0.0037) [2024-06-25 02:15:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 12934529024. Throughput: 0: 42632.1. Samples: 12934615200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 02:15:24,267][15401] Updated weights for policy 0, policy_version 789463 (0.0033) [2024-06-25 02:15:27,008][15401] Updated weights for policy 0, policy_version 789473 (0.0034) [2024-06-25 02:15:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 12934758400. Throughput: 0: 42565.6. Samples: 12934868000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 02:15:31,907][15401] Updated weights for policy 0, policy_version 789483 (0.0035) [2024-06-25 02:15:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12934955008. Throughput: 0: 42849.7. Samples: 12935139740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:33,390][15132] Avg episode reward: [(0, '0.043')] [2024-06-25 02:15:33,476][15349] Signal inference workers to stop experience collection... (191500 times) [2024-06-25 02:15:33,476][15349] Signal inference workers to resume experience collection... (191500 times) [2024-06-25 02:15:33,527][15401] InferenceWorker_p0-w0: stopping experience collection (191500 times) [2024-06-25 02:15:33,528][15401] InferenceWorker_p0-w0: resuming experience collection (191500 times) [2024-06-25 02:15:34,382][15401] Updated weights for policy 0, policy_version 789493 (0.0028) [2024-06-25 02:15:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 12935184384. Throughput: 0: 42896.0. Samples: 12935266240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:38,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 02:15:39,279][15401] Updated weights for policy 0, policy_version 789503 (0.0032) [2024-06-25 02:15:41,975][15401] Updated weights for policy 0, policy_version 789513 (0.0026) [2024-06-25 02:15:43,396][15132] Fps is (10 sec: 45845.6, 60 sec: 42871.4, 300 sec: 42764.1). Total num frames: 12935413760. Throughput: 0: 42723.7. Samples: 12935518380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:43,397][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 02:15:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789515_12935413760.pth... [2024-06-25 02:15:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000788889_12925157376.pth [2024-06-25 02:15:46,839][15401] Updated weights for policy 0, policy_version 789523 (0.0040) [2024-06-25 02:15:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 12935593984. Throughput: 0: 42804.4. Samples: 12935782240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 02:15:49,694][15401] Updated weights for policy 0, policy_version 789533 (0.0031) [2024-06-25 02:15:53,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12935823360. Throughput: 0: 42792.1. Samples: 12935903280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 02:15:54,380][15401] Updated weights for policy 0, policy_version 789543 (0.0051) [2024-06-25 02:15:57,232][15401] Updated weights for policy 0, policy_version 789553 (0.0028) [2024-06-25 02:15:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12936052736. Throughput: 0: 42857.4. Samples: 12936155980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:15:58,390][15132] Avg episode reward: [(0, '0.021')] [2024-06-25 02:16:02,288][15401] Updated weights for policy 0, policy_version 789563 (0.0035) [2024-06-25 02:16:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 12936232960. Throughput: 0: 42935.1. Samples: 12936418520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 02:16:05,178][15401] Updated weights for policy 0, policy_version 789573 (0.0037) [2024-06-25 02:16:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 12936478720. Throughput: 0: 42793.8. Samples: 12936540920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 02:16:09,704][15401] Updated weights for policy 0, policy_version 789583 (0.0041) [2024-06-25 02:16:12,907][15401] Updated weights for policy 0, policy_version 789593 (0.0032) [2024-06-25 02:16:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43146.3, 300 sec: 42820.9). Total num frames: 12936708096. Throughput: 0: 42955.6. Samples: 12936801000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:13,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 02:16:17,705][15401] Updated weights for policy 0, policy_version 789603 (0.0046) [2024-06-25 02:16:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12936871936. Throughput: 0: 42865.7. Samples: 12937068700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 02:16:20,519][15401] Updated weights for policy 0, policy_version 789613 (0.0039) [2024-06-25 02:16:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12937117696. Throughput: 0: 42552.0. Samples: 12937181080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 02:16:25,239][15401] Updated weights for policy 0, policy_version 789623 (0.0032) [2024-06-25 02:16:28,208][15401] Updated weights for policy 0, policy_version 789633 (0.0038) [2024-06-25 02:16:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12937347072. Throughput: 0: 42778.1. Samples: 12937443120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 02:16:32,747][15401] Updated weights for policy 0, policy_version 789643 (0.0055) [2024-06-25 02:16:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 12937510912. Throughput: 0: 42659.1. Samples: 12937701900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:33,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 02:16:33,415][15349] Signal inference workers to stop experience collection... (191550 times) [2024-06-25 02:16:33,455][15401] InferenceWorker_p0-w0: stopping experience collection (191550 times) [2024-06-25 02:16:33,474][15349] Signal inference workers to resume experience collection... (191550 times) [2024-06-25 02:16:33,476][15401] InferenceWorker_p0-w0: resuming experience collection (191550 times) [2024-06-25 02:16:35,981][15401] Updated weights for policy 0, policy_version 789653 (0.0037) [2024-06-25 02:16:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 12937723904. Throughput: 0: 42659.1. Samples: 12937822940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 02:16:38,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 02:16:40,441][15401] Updated weights for policy 0, policy_version 789663 (0.0033) [2024-06-25 02:16:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42603.1, 300 sec: 42765.0). Total num frames: 12937969664. Throughput: 0: 42774.3. Samples: 12938080820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:16:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 02:16:43,636][15401] Updated weights for policy 0, policy_version 789673 (0.0037) [2024-06-25 02:16:48,230][15401] Updated weights for policy 0, policy_version 789683 (0.0034) [2024-06-25 02:16:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12938166272. Throughput: 0: 42724.0. Samples: 12938341100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:16:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 02:16:51,406][15401] Updated weights for policy 0, policy_version 789693 (0.0038) [2024-06-25 02:16:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12938379264. Throughput: 0: 42715.4. Samples: 12938463120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:16:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 02:16:56,189][15401] Updated weights for policy 0, policy_version 789703 (0.0045) [2024-06-25 02:16:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12938608640. Throughput: 0: 42650.6. Samples: 12938720280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:16:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 02:16:59,039][15401] Updated weights for policy 0, policy_version 789713 (0.0034) [2024-06-25 02:17:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 12938788864. Throughput: 0: 42459.7. Samples: 12938979380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 02:17:03,697][15401] Updated weights for policy 0, policy_version 789723 (0.0030) [2024-06-25 02:17:06,887][15401] Updated weights for policy 0, policy_version 789733 (0.0038) [2024-06-25 02:17:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 12939034624. Throughput: 0: 42663.5. Samples: 12939100940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 02:17:11,239][15401] Updated weights for policy 0, policy_version 789743 (0.0036) [2024-06-25 02:17:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12939247616. Throughput: 0: 42754.8. Samples: 12939367080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 02:17:14,350][15401] Updated weights for policy 0, policy_version 789753 (0.0031) [2024-06-25 02:17:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12939427840. Throughput: 0: 42690.3. Samples: 12939622960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 02:17:19,123][15401] Updated weights for policy 0, policy_version 789763 (0.0037) [2024-06-25 02:17:22,253][15401] Updated weights for policy 0, policy_version 789773 (0.0045) [2024-06-25 02:17:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12939673600. Throughput: 0: 42660.4. Samples: 12939742660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 02:17:26,624][15401] Updated weights for policy 0, policy_version 789783 (0.0029) [2024-06-25 02:17:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42821.5). Total num frames: 12939886592. Throughput: 0: 42749.2. Samples: 12940004540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 02:17:29,982][15401] Updated weights for policy 0, policy_version 789793 (0.0038) [2024-06-25 02:17:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.5). Total num frames: 12940083200. Throughput: 0: 42659.1. Samples: 12940260760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 02:17:34,147][15401] Updated weights for policy 0, policy_version 789803 (0.0039) [2024-06-25 02:17:37,403][15401] Updated weights for policy 0, policy_version 789813 (0.0028) [2024-06-25 02:17:38,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 12940328960. Throughput: 0: 42715.2. Samples: 12940385400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:38,392][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 02:17:42,013][15401] Updated weights for policy 0, policy_version 789823 (0.0027) [2024-06-25 02:17:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12940525568. Throughput: 0: 42888.9. Samples: 12940650280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 02:17:43,565][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789829_12940558336.pth... [2024-06-25 02:17:43,617][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789202_12930285568.pth [2024-06-25 02:17:45,002][15401] Updated weights for policy 0, policy_version 789833 (0.0033) [2024-06-25 02:17:48,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12940738560. Throughput: 0: 42830.5. Samples: 12940906760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 02:17:49,820][15401] Updated weights for policy 0, policy_version 789843 (0.0034) [2024-06-25 02:17:52,965][15401] Updated weights for policy 0, policy_version 789853 (0.0029) [2024-06-25 02:17:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 12940967936. Throughput: 0: 42854.4. Samples: 12941029380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 02:17:57,387][15401] Updated weights for policy 0, policy_version 789863 (0.0030) [2024-06-25 02:17:58,120][15349] Signal inference workers to stop experience collection... (191600 times) [2024-06-25 02:17:58,122][15349] Signal inference workers to resume experience collection... (191600 times) [2024-06-25 02:17:58,140][15401] InferenceWorker_p0-w0: stopping experience collection (191600 times) [2024-06-25 02:17:58,171][15401] InferenceWorker_p0-w0: resuming experience collection (191600 times) [2024-06-25 02:17:58,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 12941164544. Throughput: 0: 42827.2. Samples: 12941294580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:17:58,397][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 02:18:00,447][15401] Updated weights for policy 0, policy_version 789873 (0.0034) [2024-06-25 02:18:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 12941377536. Throughput: 0: 42774.1. Samples: 12941547800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:18:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 02:18:05,250][15401] Updated weights for policy 0, policy_version 789883 (0.0034) [2024-06-25 02:18:08,028][15401] Updated weights for policy 0, policy_version 789893 (0.0038) [2024-06-25 02:18:08,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12941606912. Throughput: 0: 42944.2. Samples: 12941675140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:18:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 02:18:12,771][15401] Updated weights for policy 0, policy_version 789903 (0.0044) [2024-06-25 02:18:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12941819904. Throughput: 0: 42913.4. Samples: 12941935640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:18:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 02:18:15,520][15401] Updated weights for policy 0, policy_version 789913 (0.0033) [2024-06-25 02:18:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12942016512. Throughput: 0: 42916.1. Samples: 12942191980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:18:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 02:18:20,145][15401] Updated weights for policy 0, policy_version 789923 (0.0045) [2024-06-25 02:18:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12942245888. Throughput: 0: 42982.3. Samples: 12942319500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:18:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 02:18:23,441][15401] Updated weights for policy 0, policy_version 789933 (0.0037) [2024-06-25 02:18:27,923][15401] Updated weights for policy 0, policy_version 789943 (0.0036) [2024-06-25 02:18:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12942458880. Throughput: 0: 42920.0. Samples: 12942581680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 02:18:30,977][15401] Updated weights for policy 0, policy_version 789953 (0.0038) [2024-06-25 02:18:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 12942671872. Throughput: 0: 42933.8. Samples: 12942838880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:33,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 02:18:35,410][15401] Updated weights for policy 0, policy_version 789963 (0.0030) [2024-06-25 02:18:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 12942901248. Throughput: 0: 42982.7. Samples: 12942963600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 02:18:38,486][15401] Updated weights for policy 0, policy_version 789973 (0.0032) [2024-06-25 02:18:42,937][15401] Updated weights for policy 0, policy_version 789983 (0.0032) [2024-06-25 02:18:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12943114240. Throughput: 0: 42987.9. Samples: 12943228760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 02:18:46,138][15401] Updated weights for policy 0, policy_version 789993 (0.0034) [2024-06-25 02:18:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12943310848. Throughput: 0: 43034.3. Samples: 12943484340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:48,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 02:18:50,566][15401] Updated weights for policy 0, policy_version 790003 (0.0045) [2024-06-25 02:18:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 12943523840. Throughput: 0: 42875.0. Samples: 12943604520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:53,393][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 02:18:53,937][15401] Updated weights for policy 0, policy_version 790013 (0.0042) [2024-06-25 02:18:57,970][15401] Updated weights for policy 0, policy_version 790023 (0.0036) [2024-06-25 02:18:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43422.2, 300 sec: 42820.6). Total num frames: 12943769600. Throughput: 0: 43062.1. Samples: 12943873440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:18:58,404][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 02:19:01,467][15401] Updated weights for policy 0, policy_version 790033 (0.0042) [2024-06-25 02:19:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12943949824. Throughput: 0: 42929.2. Samples: 12944123800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 02:19:05,913][15401] Updated weights for policy 0, policy_version 790043 (0.0032) [2024-06-25 02:19:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 12944195584. Throughput: 0: 43000.4. Samples: 12944254520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 02:19:09,038][15401] Updated weights for policy 0, policy_version 790053 (0.0024) [2024-06-25 02:19:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12944359424. Throughput: 0: 43009.4. Samples: 12944517100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 02:19:13,755][15401] Updated weights for policy 0, policy_version 790063 (0.0029) [2024-06-25 02:19:16,653][15401] Updated weights for policy 0, policy_version 790073 (0.0036) [2024-06-25 02:19:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12944605184. Throughput: 0: 42819.6. Samples: 12944765660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 02:19:21,368][15401] Updated weights for policy 0, policy_version 790083 (0.0024) [2024-06-25 02:19:21,459][15349] Signal inference workers to stop experience collection... (191650 times) [2024-06-25 02:19:21,492][15401] InferenceWorker_p0-w0: stopping experience collection (191650 times) [2024-06-25 02:19:21,515][15349] Signal inference workers to resume experience collection... (191650 times) [2024-06-25 02:19:21,520][15401] InferenceWorker_p0-w0: resuming experience collection (191650 times) [2024-06-25 02:19:23,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12944834560. Throughput: 0: 43157.7. Samples: 12944905700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 02:19:24,266][15401] Updated weights for policy 0, policy_version 790093 (0.0022) [2024-06-25 02:19:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12945014784. Throughput: 0: 42948.3. Samples: 12945161440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:28,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 02:19:28,841][15401] Updated weights for policy 0, policy_version 790103 (0.0030) [2024-06-25 02:19:31,800][15401] Updated weights for policy 0, policy_version 790113 (0.0026) [2024-06-25 02:19:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 12945244160. Throughput: 0: 42828.4. Samples: 12945411620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 02:19:36,526][15401] Updated weights for policy 0, policy_version 790123 (0.0036) [2024-06-25 02:19:38,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 12945489920. Throughput: 0: 43206.3. Samples: 12945548800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 02:19:39,578][15401] Updated weights for policy 0, policy_version 790133 (0.0031) [2024-06-25 02:19:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12945653760. Throughput: 0: 42752.1. Samples: 12945797280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:43,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 02:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790140_12945653760.pth... [2024-06-25 02:19:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789515_12935413760.pth [2024-06-25 02:19:44,192][15401] Updated weights for policy 0, policy_version 790143 (0.0030) [2024-06-25 02:19:47,206][15401] Updated weights for policy 0, policy_version 790153 (0.0034) [2024-06-25 02:19:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12945899520. Throughput: 0: 42759.2. Samples: 12946047960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 02:19:51,611][15401] Updated weights for policy 0, policy_version 790163 (0.0031) [2024-06-25 02:19:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12946096128. Throughput: 0: 42852.8. Samples: 12946182900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 02:19:54,973][15401] Updated weights for policy 0, policy_version 790173 (0.0035) [2024-06-25 02:19:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 12946292736. Throughput: 0: 42585.4. Samples: 12946433440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:19:58,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 02:19:59,535][15401] Updated weights for policy 0, policy_version 790183 (0.0027) [2024-06-25 02:20:02,493][15401] Updated weights for policy 0, policy_version 790193 (0.0042) [2024-06-25 02:20:03,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 12946554880. Throughput: 0: 42757.0. Samples: 12946689720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:20:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 02:20:07,127][15401] Updated weights for policy 0, policy_version 790203 (0.0035) [2024-06-25 02:20:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 12946735104. Throughput: 0: 42657.0. Samples: 12946825260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 02:20:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 02:20:10,349][15401] Updated weights for policy 0, policy_version 790213 (0.0038) [2024-06-25 02:20:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12946948096. Throughput: 0: 42595.3. Samples: 12947078220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 02:20:14,941][15401] Updated weights for policy 0, policy_version 790223 (0.0037) [2024-06-25 02:20:18,150][15401] Updated weights for policy 0, policy_version 790233 (0.0038) [2024-06-25 02:20:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12947193856. Throughput: 0: 42641.8. Samples: 12947330500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:18,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 02:20:22,569][15401] Updated weights for policy 0, policy_version 790243 (0.0032) [2024-06-25 02:20:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12947374080. Throughput: 0: 42494.1. Samples: 12947461040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 02:20:25,795][15401] Updated weights for policy 0, policy_version 790253 (0.0039) [2024-06-25 02:20:28,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12947587072. Throughput: 0: 42564.9. Samples: 12947712700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:28,391][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 02:20:30,226][15401] Updated weights for policy 0, policy_version 790263 (0.0036) [2024-06-25 02:20:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12947816448. Throughput: 0: 42677.9. Samples: 12947968460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:33,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 02:20:33,655][15401] Updated weights for policy 0, policy_version 790273 (0.0025) [2024-06-25 02:20:37,894][15401] Updated weights for policy 0, policy_version 790283 (0.0027) [2024-06-25 02:20:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42710.4). Total num frames: 12948013056. Throughput: 0: 42642.7. Samples: 12948101820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 02:20:39,796][15349] Signal inference workers to stop experience collection... (191700 times) [2024-06-25 02:20:39,797][15349] Signal inference workers to resume experience collection... (191700 times) [2024-06-25 02:20:39,835][15401] InferenceWorker_p0-w0: stopping experience collection (191700 times) [2024-06-25 02:20:39,835][15401] InferenceWorker_p0-w0: resuming experience collection (191700 times) [2024-06-25 02:20:41,086][15401] Updated weights for policy 0, policy_version 790293 (0.0029) [2024-06-25 02:20:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12948242432. Throughput: 0: 42861.7. Samples: 12948362220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 02:20:45,270][15401] Updated weights for policy 0, policy_version 790303 (0.0042) [2024-06-25 02:20:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12948455424. Throughput: 0: 42846.1. Samples: 12948617800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 02:20:48,604][15401] Updated weights for policy 0, policy_version 790313 (0.0038) [2024-06-25 02:20:52,738][15401] Updated weights for policy 0, policy_version 790323 (0.0033) [2024-06-25 02:20:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12948652032. Throughput: 0: 42661.2. Samples: 12948745020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:53,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 02:20:56,547][15401] Updated weights for policy 0, policy_version 790333 (0.0025) [2024-06-25 02:20:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12948881408. Throughput: 0: 42812.9. Samples: 12949004800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:20:58,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 02:21:00,362][15401] Updated weights for policy 0, policy_version 790343 (0.0037) [2024-06-25 02:21:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12949094400. Throughput: 0: 42835.9. Samples: 12949258120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:03,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 02:21:04,180][15401] Updated weights for policy 0, policy_version 790353 (0.0027) [2024-06-25 02:21:08,223][15401] Updated weights for policy 0, policy_version 790363 (0.0038) [2024-06-25 02:21:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 12949307392. Throughput: 0: 42739.1. Samples: 12949384300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:08,390][15132] Avg episode reward: [(0, '0.191')] [2024-06-25 02:21:11,680][15401] Updated weights for policy 0, policy_version 790373 (0.0029) [2024-06-25 02:21:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 12949536768. Throughput: 0: 42978.2. Samples: 12949646720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 02:21:15,837][15401] Updated weights for policy 0, policy_version 790383 (0.0038) [2024-06-25 02:21:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12949749760. Throughput: 0: 42917.8. Samples: 12949899760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 02:21:20,085][15401] Updated weights for policy 0, policy_version 790393 (0.0030) [2024-06-25 02:21:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12949946368. Throughput: 0: 42774.2. Samples: 12950026660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:23,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 02:21:23,603][15401] Updated weights for policy 0, policy_version 790403 (0.0037) [2024-06-25 02:21:27,736][15401] Updated weights for policy 0, policy_version 790413 (0.0037) [2024-06-25 02:21:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 12950175744. Throughput: 0: 42801.0. Samples: 12950288260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 02:21:31,270][15401] Updated weights for policy 0, policy_version 790423 (0.0034) [2024-06-25 02:21:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12950372352. Throughput: 0: 42712.9. Samples: 12950539880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 02:21:35,425][15401] Updated weights for policy 0, policy_version 790433 (0.0039) [2024-06-25 02:21:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12950585344. Throughput: 0: 42766.7. Samples: 12950669520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:38,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-25 02:21:38,879][15401] Updated weights for policy 0, policy_version 790443 (0.0031) [2024-06-25 02:21:42,815][15401] Updated weights for policy 0, policy_version 790453 (0.0034) [2024-06-25 02:21:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12950814720. Throughput: 0: 42811.4. Samples: 12950931320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:43,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 02:21:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790455_12950814720.pth... [2024-06-25 02:21:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000789829_12940558336.pth [2024-06-25 02:21:46,491][15401] Updated weights for policy 0, policy_version 790463 (0.0033) [2024-06-25 02:21:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12951011328. Throughput: 0: 42924.1. Samples: 12951189700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 02:21:50,612][15401] Updated weights for policy 0, policy_version 790473 (0.0028) [2024-06-25 02:21:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 12951240704. Throughput: 0: 42917.8. Samples: 12951315600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:53,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 02:21:54,081][15401] Updated weights for policy 0, policy_version 790483 (0.0026) [2024-06-25 02:21:58,072][15401] Updated weights for policy 0, policy_version 790493 (0.0041) [2024-06-25 02:21:58,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 12951437312. Throughput: 0: 42931.8. Samples: 12951578920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 02:21:58,396][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 02:22:01,598][15401] Updated weights for policy 0, policy_version 790503 (0.0039) [2024-06-25 02:22:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12951666688. Throughput: 0: 43040.0. Samples: 12951836560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 02:22:03,520][15349] Signal inference workers to stop experience collection... (191750 times) [2024-06-25 02:22:03,521][15349] Signal inference workers to resume experience collection... (191750 times) [2024-06-25 02:22:03,560][15401] InferenceWorker_p0-w0: stopping experience collection (191750 times) [2024-06-25 02:22:03,561][15401] InferenceWorker_p0-w0: resuming experience collection (191750 times) [2024-06-25 02:22:05,596][15401] Updated weights for policy 0, policy_version 790513 (0.0035) [2024-06-25 02:22:08,390][15132] Fps is (10 sec: 45904.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12951896064. Throughput: 0: 43038.2. Samples: 12951963380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:08,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 02:22:09,242][15401] Updated weights for policy 0, policy_version 790523 (0.0033) [2024-06-25 02:22:13,191][15401] Updated weights for policy 0, policy_version 790533 (0.0044) [2024-06-25 02:22:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12952092672. Throughput: 0: 42691.8. Samples: 12952209400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 02:22:16,852][15401] Updated weights for policy 0, policy_version 790543 (0.0031) [2024-06-25 02:22:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12952305664. Throughput: 0: 42996.5. Samples: 12952474720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:18,391][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 02:22:20,727][15401] Updated weights for policy 0, policy_version 790553 (0.0046) [2024-06-25 02:22:23,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 12952535040. Throughput: 0: 42959.5. Samples: 12952602800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:23,392][15132] Avg episode reward: [(0, '0.890')] [2024-06-25 02:22:24,404][15401] Updated weights for policy 0, policy_version 790563 (0.0038) [2024-06-25 02:22:28,342][15401] Updated weights for policy 0, policy_version 790573 (0.0027) [2024-06-25 02:22:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 12952748032. Throughput: 0: 42842.6. Samples: 12952859240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:28,390][15132] Avg episode reward: [(0, '0.884')] [2024-06-25 02:22:32,027][15401] Updated weights for policy 0, policy_version 790583 (0.0039) [2024-06-25 02:22:33,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 12952944640. Throughput: 0: 42862.9. Samples: 12953118640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:33,401][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 02:22:36,219][15401] Updated weights for policy 0, policy_version 790593 (0.0034) [2024-06-25 02:22:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12953157632. Throughput: 0: 42959.5. Samples: 12953248780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 02:22:39,822][15401] Updated weights for policy 0, policy_version 790603 (0.0025) [2024-06-25 02:22:43,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12953370624. Throughput: 0: 42713.5. Samples: 12953500760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 02:22:43,707][15401] Updated weights for policy 0, policy_version 790613 (0.0040) [2024-06-25 02:22:47,722][15401] Updated weights for policy 0, policy_version 790623 (0.0034) [2024-06-25 02:22:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12953583616. Throughput: 0: 42751.6. Samples: 12953760380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 02:22:51,231][15401] Updated weights for policy 0, policy_version 790633 (0.0031) [2024-06-25 02:22:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.9). Total num frames: 12953780224. Throughput: 0: 42717.9. Samples: 12953885680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 02:22:55,477][15401] Updated weights for policy 0, policy_version 790643 (0.0033) [2024-06-25 02:22:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 12954009600. Throughput: 0: 43002.8. Samples: 12954144520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:22:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 02:22:59,482][15401] Updated weights for policy 0, policy_version 790653 (0.0028) [2024-06-25 02:23:03,116][15401] Updated weights for policy 0, policy_version 790663 (0.0034) [2024-06-25 02:23:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12954222592. Throughput: 0: 42793.2. Samples: 12954400420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 02:23:07,010][15401] Updated weights for policy 0, policy_version 790673 (0.0038) [2024-06-25 02:23:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 12954435584. Throughput: 0: 42736.9. Samples: 12954525860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 02:23:10,718][15401] Updated weights for policy 0, policy_version 790683 (0.0038) [2024-06-25 02:23:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12954664960. Throughput: 0: 42680.3. Samples: 12954779860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:13,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 02:23:14,717][15401] Updated weights for policy 0, policy_version 790693 (0.0037) [2024-06-25 02:23:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12954861568. Throughput: 0: 42637.9. Samples: 12955037240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 02:23:18,441][15401] Updated weights for policy 0, policy_version 790703 (0.0025) [2024-06-25 02:23:22,214][15401] Updated weights for policy 0, policy_version 790713 (0.0032) [2024-06-25 02:23:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 12955074560. Throughput: 0: 42422.6. Samples: 12955157800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 02:23:26,062][15401] Updated weights for policy 0, policy_version 790723 (0.0036) [2024-06-25 02:23:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 12955303936. Throughput: 0: 42637.9. Samples: 12955419460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:28,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 02:23:29,665][15401] Updated weights for policy 0, policy_version 790733 (0.0030) [2024-06-25 02:23:33,385][15349] Signal inference workers to stop experience collection... (191800 times) [2024-06-25 02:23:33,385][15349] Signal inference workers to resume experience collection... (191800 times) [2024-06-25 02:23:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 12955500544. Throughput: 0: 42647.5. Samples: 12955679520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:33,393][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 02:23:33,401][15401] InferenceWorker_p0-w0: stopping experience collection (191800 times) [2024-06-25 02:23:33,413][15401] InferenceWorker_p0-w0: resuming experience collection (191800 times) [2024-06-25 02:23:33,692][15401] Updated weights for policy 0, policy_version 790743 (0.0036) [2024-06-25 02:23:37,185][15401] Updated weights for policy 0, policy_version 790753 (0.0043) [2024-06-25 02:23:38,391][15132] Fps is (10 sec: 42590.0, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 12955729920. Throughput: 0: 42591.9. Samples: 12955802400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:38,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 02:23:41,465][15401] Updated weights for policy 0, policy_version 790763 (0.0041) [2024-06-25 02:23:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 12955942912. Throughput: 0: 42528.3. Samples: 12956058400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 02:23:43,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 02:23:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790768_12955942912.pth... [2024-06-25 02:23:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790140_12945653760.pth [2024-06-25 02:23:44,909][15401] Updated weights for policy 0, policy_version 790773 (0.0030) [2024-06-25 02:23:48,392][15132] Fps is (10 sec: 40958.1, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 12956139520. Throughput: 0: 42579.6. Samples: 12956316600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:23:48,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 02:23:48,961][15401] Updated weights for policy 0, policy_version 790783 (0.0035) [2024-06-25 02:23:52,280][15401] Updated weights for policy 0, policy_version 790793 (0.0031) [2024-06-25 02:23:53,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 12956368896. Throughput: 0: 42740.4. Samples: 12956449180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:23:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 02:23:56,993][15401] Updated weights for policy 0, policy_version 790803 (0.0032) [2024-06-25 02:23:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12956549120. Throughput: 0: 42725.5. Samples: 12956702500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:23:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 02:23:59,776][15401] Updated weights for policy 0, policy_version 790813 (0.0032) [2024-06-25 02:24:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12956794880. Throughput: 0: 42557.1. Samples: 12956952320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 02:24:04,679][15401] Updated weights for policy 0, policy_version 790823 (0.0042) [2024-06-25 02:24:07,890][15401] Updated weights for policy 0, policy_version 790833 (0.0037) [2024-06-25 02:24:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12957007872. Throughput: 0: 42991.1. Samples: 12957092400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 02:24:12,377][15401] Updated weights for policy 0, policy_version 790843 (0.0027) [2024-06-25 02:24:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 12957188096. Throughput: 0: 42752.9. Samples: 12957343340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:13,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 02:24:15,652][15401] Updated weights for policy 0, policy_version 790853 (0.0041) [2024-06-25 02:24:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12957433856. Throughput: 0: 42452.0. Samples: 12957589860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 02:24:19,901][15401] Updated weights for policy 0, policy_version 790863 (0.0035) [2024-06-25 02:24:23,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12957646848. Throughput: 0: 42829.5. Samples: 12957729640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 02:24:23,423][15401] Updated weights for policy 0, policy_version 790873 (0.0029) [2024-06-25 02:24:27,433][15401] Updated weights for policy 0, policy_version 790883 (0.0040) [2024-06-25 02:24:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 12957827072. Throughput: 0: 42801.0. Samples: 12957984340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 02:24:31,039][15401] Updated weights for policy 0, policy_version 790893 (0.0033) [2024-06-25 02:24:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 12958089216. Throughput: 0: 42780.1. Samples: 12958241600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 02:24:34,806][15401] Updated weights for policy 0, policy_version 790903 (0.0031) [2024-06-25 02:24:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42599.8, 300 sec: 42820.6). Total num frames: 12958285824. Throughput: 0: 42859.6. Samples: 12958377860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 02:24:38,936][15401] Updated weights for policy 0, policy_version 790913 (0.0035) [2024-06-25 02:24:42,090][15349] Signal inference workers to stop experience collection... (191850 times) [2024-06-25 02:24:42,139][15401] InferenceWorker_p0-w0: stopping experience collection (191850 times) [2024-06-25 02:24:42,145][15349] Signal inference workers to resume experience collection... (191850 times) [2024-06-25 02:24:42,147][15401] InferenceWorker_p0-w0: resuming experience collection (191850 times) [2024-06-25 02:24:42,297][15401] Updated weights for policy 0, policy_version 790923 (0.0039) [2024-06-25 02:24:43,394][15132] Fps is (10 sec: 39302.3, 60 sec: 42323.6, 300 sec: 42653.2). Total num frames: 12958482432. Throughput: 0: 42673.6. Samples: 12958623020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:43,395][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 02:24:46,419][15401] Updated weights for policy 0, policy_version 790933 (0.0034) [2024-06-25 02:24:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 12958728192. Throughput: 0: 42623.3. Samples: 12958870360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 02:24:50,010][15401] Updated weights for policy 0, policy_version 790943 (0.0031) [2024-06-25 02:24:53,389][15132] Fps is (10 sec: 42619.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 12958908416. Throughput: 0: 42569.5. Samples: 12959008020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 02:24:54,052][15401] Updated weights for policy 0, policy_version 790953 (0.0027) [2024-06-25 02:24:57,738][15401] Updated weights for policy 0, policy_version 790963 (0.0033) [2024-06-25 02:24:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 12959137792. Throughput: 0: 42630.7. Samples: 12959261720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:24:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 02:25:01,765][15401] Updated weights for policy 0, policy_version 790973 (0.0034) [2024-06-25 02:25:03,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 12959383552. Throughput: 0: 42767.7. Samples: 12959514400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 02:25:05,631][15401] Updated weights for policy 0, policy_version 790983 (0.0036) [2024-06-25 02:25:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12959563776. Throughput: 0: 42675.9. Samples: 12959650060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 02:25:09,366][15401] Updated weights for policy 0, policy_version 790993 (0.0028) [2024-06-25 02:25:13,387][15401] Updated weights for policy 0, policy_version 791003 (0.0033) [2024-06-25 02:25:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 12959793152. Throughput: 0: 42581.4. Samples: 12959900500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 02:25:17,287][15401] Updated weights for policy 0, policy_version 791013 (0.0036) [2024-06-25 02:25:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12960006144. Throughput: 0: 42599.1. Samples: 12960158560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 02:25:21,098][15401] Updated weights for policy 0, policy_version 791023 (0.0028) [2024-06-25 02:25:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12960202752. Throughput: 0: 42501.4. Samples: 12960290420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:23,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 02:25:24,717][15401] Updated weights for policy 0, policy_version 791033 (0.0035) [2024-06-25 02:25:28,395][15132] Fps is (10 sec: 42573.4, 60 sec: 43413.4, 300 sec: 42764.2). Total num frames: 12960432128. Throughput: 0: 42765.8. Samples: 12960547520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 02:25:28,396][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 02:25:28,776][15401] Updated weights for policy 0, policy_version 791043 (0.0029) [2024-06-25 02:25:32,206][15401] Updated weights for policy 0, policy_version 791053 (0.0026) [2024-06-25 02:25:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12960645120. Throughput: 0: 43039.0. Samples: 12960807120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:33,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 02:25:36,378][15401] Updated weights for policy 0, policy_version 791063 (0.0037) [2024-06-25 02:25:38,389][15132] Fps is (10 sec: 40984.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12960841728. Throughput: 0: 42900.9. Samples: 12960938560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:38,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 02:25:40,298][15401] Updated weights for policy 0, policy_version 791073 (0.0046) [2024-06-25 02:25:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43148.0, 300 sec: 42765.0). Total num frames: 12961071104. Throughput: 0: 42823.8. Samples: 12961188800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 02:25:43,450][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791082_12961087488.pth... [2024-06-25 02:25:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790455_12950814720.pth [2024-06-25 02:25:44,271][15401] Updated weights for policy 0, policy_version 791083 (0.0027) [2024-06-25 02:25:47,795][15401] Updated weights for policy 0, policy_version 791093 (0.0029) [2024-06-25 02:25:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12961284096. Throughput: 0: 42874.5. Samples: 12961443760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 02:25:50,072][15349] Signal inference workers to stop experience collection... (191900 times) [2024-06-25 02:25:50,127][15401] InferenceWorker_p0-w0: stopping experience collection (191900 times) [2024-06-25 02:25:50,130][15349] Signal inference workers to resume experience collection... (191900 times) [2024-06-25 02:25:50,143][15401] InferenceWorker_p0-w0: resuming experience collection (191900 times) [2024-06-25 02:25:51,777][15401] Updated weights for policy 0, policy_version 791103 (0.0032) [2024-06-25 02:25:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 12961480704. Throughput: 0: 42626.1. Samples: 12961568240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 02:25:55,398][15401] Updated weights for policy 0, policy_version 791113 (0.0038) [2024-06-25 02:25:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 12961726464. Throughput: 0: 42951.0. Samples: 12961833300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:25:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 02:25:59,266][15401] Updated weights for policy 0, policy_version 791123 (0.0039) [2024-06-25 02:26:02,866][15401] Updated weights for policy 0, policy_version 791133 (0.0022) [2024-06-25 02:26:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12961923072. Throughput: 0: 42916.0. Samples: 12962089780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 02:26:06,739][15401] Updated weights for policy 0, policy_version 791143 (0.0043) [2024-06-25 02:26:08,393][15132] Fps is (10 sec: 40945.5, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 12962136064. Throughput: 0: 42896.0. Samples: 12962220900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:08,394][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 02:26:10,416][15401] Updated weights for policy 0, policy_version 791153 (0.0022) [2024-06-25 02:26:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 12962381824. Throughput: 0: 43023.8. Samples: 12962483340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:13,391][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 02:26:14,298][15401] Updated weights for policy 0, policy_version 791163 (0.0028) [2024-06-25 02:26:17,933][15401] Updated weights for policy 0, policy_version 791173 (0.0035) [2024-06-25 02:26:18,389][15132] Fps is (10 sec: 44253.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12962578432. Throughput: 0: 42941.9. Samples: 12962739500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 02:26:21,964][15401] Updated weights for policy 0, policy_version 791183 (0.0032) [2024-06-25 02:26:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12962775040. Throughput: 0: 42859.0. Samples: 12962867220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 02:26:25,824][15401] Updated weights for policy 0, policy_version 791193 (0.0037) [2024-06-25 02:26:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43148.8, 300 sec: 42876.1). Total num frames: 12963020800. Throughput: 0: 42945.0. Samples: 12963121320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 02:26:29,785][15401] Updated weights for policy 0, policy_version 791203 (0.0038) [2024-06-25 02:26:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12963217408. Throughput: 0: 43046.4. Samples: 12963380840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 02:26:33,462][15401] Updated weights for policy 0, policy_version 791213 (0.0038) [2024-06-25 02:26:37,563][15401] Updated weights for policy 0, policy_version 791223 (0.0030) [2024-06-25 02:26:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12963414016. Throughput: 0: 43073.9. Samples: 12963506560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:38,396][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 02:26:41,226][15401] Updated weights for policy 0, policy_version 791233 (0.0035) [2024-06-25 02:26:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12963643392. Throughput: 0: 42808.9. Samples: 12963759700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 02:26:45,291][15401] Updated weights for policy 0, policy_version 791243 (0.0039) [2024-06-25 02:26:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12963840000. Throughput: 0: 42800.0. Samples: 12964015780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 02:26:49,244][15401] Updated weights for policy 0, policy_version 791253 (0.0024) [2024-06-25 02:26:52,860][15401] Updated weights for policy 0, policy_version 791263 (0.0032) [2024-06-25 02:26:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42821.5). Total num frames: 12964069376. Throughput: 0: 42782.6. Samples: 12964145960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 02:26:56,846][15401] Updated weights for policy 0, policy_version 791273 (0.0035) [2024-06-25 02:26:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12964298752. Throughput: 0: 42731.5. Samples: 12964406260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:26:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 02:27:00,649][15401] Updated weights for policy 0, policy_version 791283 (0.0027) [2024-06-25 02:27:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 12964495360. Throughput: 0: 42773.6. Samples: 12964664320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:27:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 02:27:04,431][15401] Updated weights for policy 0, policy_version 791293 (0.0029) [2024-06-25 02:27:08,086][15401] Updated weights for policy 0, policy_version 791303 (0.0048) [2024-06-25 02:27:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 12964708352. Throughput: 0: 42754.7. Samples: 12964791180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:27:08,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 02:27:11,902][15401] Updated weights for policy 0, policy_version 791313 (0.0038) [2024-06-25 02:27:13,392][15132] Fps is (10 sec: 44227.7, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 12964937728. Throughput: 0: 42871.7. Samples: 12965050640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 02:27:13,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 02:27:15,768][15401] Updated weights for policy 0, policy_version 791323 (0.0038) [2024-06-25 02:27:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 12965150720. Throughput: 0: 42746.5. Samples: 12965304440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 02:27:19,390][15401] Updated weights for policy 0, policy_version 791333 (0.0039) [2024-06-25 02:27:20,004][15349] Signal inference workers to stop experience collection... (191950 times) [2024-06-25 02:27:20,038][15401] InferenceWorker_p0-w0: stopping experience collection (191950 times) [2024-06-25 02:27:20,071][15349] Signal inference workers to resume experience collection... (191950 times) [2024-06-25 02:27:20,071][15401] InferenceWorker_p0-w0: resuming experience collection (191950 times) [2024-06-25 02:27:23,389][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 12965347328. Throughput: 0: 42830.3. Samples: 12965433920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 02:27:23,425][15401] Updated weights for policy 0, policy_version 791343 (0.0027) [2024-06-25 02:27:27,058][15401] Updated weights for policy 0, policy_version 791353 (0.0036) [2024-06-25 02:27:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 12965560320. Throughput: 0: 42831.7. Samples: 12965687120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 02:27:30,943][15401] Updated weights for policy 0, policy_version 791363 (0.0037) [2024-06-25 02:27:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12965773312. Throughput: 0: 42996.8. Samples: 12965950640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 02:27:34,591][15401] Updated weights for policy 0, policy_version 791373 (0.0028) [2024-06-25 02:27:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12965986304. Throughput: 0: 43015.0. Samples: 12966081640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 02:27:38,745][15401] Updated weights for policy 0, policy_version 791383 (0.0030) [2024-06-25 02:27:42,149][15401] Updated weights for policy 0, policy_version 791393 (0.0032) [2024-06-25 02:27:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 12966215680. Throughput: 0: 42818.7. Samples: 12966333100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 02:27:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791395_12966215680.pth... [2024-06-25 02:27:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000790768_12955942912.pth [2024-06-25 02:27:46,418][15401] Updated weights for policy 0, policy_version 791403 (0.0032) [2024-06-25 02:27:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12966428672. Throughput: 0: 42793.4. Samples: 12966590020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 02:27:49,784][15401] Updated weights for policy 0, policy_version 791413 (0.0050) [2024-06-25 02:27:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12966625280. Throughput: 0: 42905.8. Samples: 12966721940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:53,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 02:27:54,244][15401] Updated weights for policy 0, policy_version 791423 (0.0032) [2024-06-25 02:27:57,305][15401] Updated weights for policy 0, policy_version 791433 (0.0042) [2024-06-25 02:27:58,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42324.6, 300 sec: 42764.9). Total num frames: 12966838272. Throughput: 0: 42743.6. Samples: 12966974060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:27:58,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 02:28:01,714][15401] Updated weights for policy 0, policy_version 791443 (0.0039) [2024-06-25 02:28:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12967051264. Throughput: 0: 43008.0. Samples: 12967239800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 02:28:04,873][15401] Updated weights for policy 0, policy_version 791453 (0.0036) [2024-06-25 02:28:08,389][15132] Fps is (10 sec: 42603.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 12967264256. Throughput: 0: 42879.1. Samples: 12967363480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 02:28:09,350][15401] Updated weights for policy 0, policy_version 791463 (0.0036) [2024-06-25 02:28:12,624][15401] Updated weights for policy 0, policy_version 791473 (0.0032) [2024-06-25 02:28:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 12967510016. Throughput: 0: 42999.1. Samples: 12967622080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 02:28:16,962][15401] Updated weights for policy 0, policy_version 791483 (0.0032) [2024-06-25 02:28:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12967690240. Throughput: 0: 42834.6. Samples: 12967878200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 02:28:20,557][15401] Updated weights for policy 0, policy_version 791493 (0.0034) [2024-06-25 02:28:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12967919616. Throughput: 0: 42628.4. Samples: 12967999920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:23,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 02:28:24,406][15401] Updated weights for policy 0, policy_version 791503 (0.0034) [2024-06-25 02:28:28,074][15401] Updated weights for policy 0, policy_version 791513 (0.0042) [2024-06-25 02:28:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12968148992. Throughput: 0: 42778.6. Samples: 12968258140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:28,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 02:28:31,963][15401] Updated weights for policy 0, policy_version 791523 (0.0038) [2024-06-25 02:28:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 12968345600. Throughput: 0: 42871.0. Samples: 12968519220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 02:28:35,882][15401] Updated weights for policy 0, policy_version 791533 (0.0033) [2024-06-25 02:28:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 12968558592. Throughput: 0: 42747.4. Samples: 12968645580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 02:28:39,739][15401] Updated weights for policy 0, policy_version 791543 (0.0028) [2024-06-25 02:28:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 12968787968. Throughput: 0: 42877.6. Samples: 12968903500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 02:28:43,478][15401] Updated weights for policy 0, policy_version 791553 (0.0032) [2024-06-25 02:28:47,315][15401] Updated weights for policy 0, policy_version 791563 (0.0033) [2024-06-25 02:28:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12968984576. Throughput: 0: 42827.4. Samples: 12969167040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:28:49,124][15349] Signal inference workers to stop experience collection... (192000 times) [2024-06-25 02:28:49,161][15401] InferenceWorker_p0-w0: stopping experience collection (192000 times) [2024-06-25 02:28:49,182][15349] Signal inference workers to resume experience collection... (192000 times) [2024-06-25 02:28:49,184][15401] InferenceWorker_p0-w0: resuming experience collection (192000 times) [2024-06-25 02:28:51,093][15401] Updated weights for policy 0, policy_version 791573 (0.0033) [2024-06-25 02:28:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 12969181184. Throughput: 0: 42796.3. Samples: 12969289320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 02:28:55,092][15401] Updated weights for policy 0, policy_version 791583 (0.0031) [2024-06-25 02:28:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43418.5, 300 sec: 42876.1). Total num frames: 12969443328. Throughput: 0: 42804.9. Samples: 12969548300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:28:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 02:28:58,667][15401] Updated weights for policy 0, policy_version 791593 (0.0028) [2024-06-25 02:29:02,688][15401] Updated weights for policy 0, policy_version 791603 (0.0025) [2024-06-25 02:29:03,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12969656320. Throughput: 0: 42949.1. Samples: 12969810900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 02:29:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 02:29:06,146][15401] Updated weights for policy 0, policy_version 791613 (0.0030) [2024-06-25 02:29:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12969836544. Throughput: 0: 43036.9. Samples: 12969936580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:08,391][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 02:29:10,179][15401] Updated weights for policy 0, policy_version 791623 (0.0041) [2024-06-25 02:29:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12970065920. Throughput: 0: 43226.3. Samples: 12970203320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 02:29:13,850][15401] Updated weights for policy 0, policy_version 791633 (0.0033) [2024-06-25 02:29:17,807][15401] Updated weights for policy 0, policy_version 791643 (0.0046) [2024-06-25 02:29:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 12970295296. Throughput: 0: 43048.1. Samples: 12970456380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 02:29:21,463][15401] Updated weights for policy 0, policy_version 791653 (0.0033) [2024-06-25 02:29:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12970491904. Throughput: 0: 43028.0. Samples: 12970581840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:23,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 02:29:25,589][15401] Updated weights for policy 0, policy_version 791663 (0.0023) [2024-06-25 02:29:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12970704896. Throughput: 0: 43012.0. Samples: 12970839040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 02:29:29,131][15401] Updated weights for policy 0, policy_version 791673 (0.0044) [2024-06-25 02:29:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 12970934272. Throughput: 0: 42921.0. Samples: 12971098480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 02:29:33,394][15401] Updated weights for policy 0, policy_version 791683 (0.0028) [2024-06-25 02:29:37,119][15401] Updated weights for policy 0, policy_version 791693 (0.0032) [2024-06-25 02:29:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42932.3). Total num frames: 12971147264. Throughput: 0: 42930.3. Samples: 12971221180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 02:29:41,377][15401] Updated weights for policy 0, policy_version 791703 (0.0023) [2024-06-25 02:29:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12971343872. Throughput: 0: 42803.6. Samples: 12971474460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:43,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 02:29:43,536][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791709_12971360256.pth... [2024-06-25 02:29:43,605][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791082_12961087488.pth [2024-06-25 02:29:44,655][15401] Updated weights for policy 0, policy_version 791713 (0.0034) [2024-06-25 02:29:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12971556864. Throughput: 0: 42740.3. Samples: 12971734220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 02:29:48,866][15401] Updated weights for policy 0, policy_version 791723 (0.0037) [2024-06-25 02:29:52,594][15401] Updated weights for policy 0, policy_version 791733 (0.0023) [2024-06-25 02:29:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 12971786240. Throughput: 0: 42823.6. Samples: 12971863640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 02:29:56,373][15401] Updated weights for policy 0, policy_version 791743 (0.0032) [2024-06-25 02:29:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12971999232. Throughput: 0: 42621.4. Samples: 12972121280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:29:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 02:30:00,211][15401] Updated weights for policy 0, policy_version 791753 (0.0034) [2024-06-25 02:30:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 12972212224. Throughput: 0: 42768.4. Samples: 12972381060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:03,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 02:30:03,908][15401] Updated weights for policy 0, policy_version 791763 (0.0035) [2024-06-25 02:30:07,753][15401] Updated weights for policy 0, policy_version 791773 (0.0046) [2024-06-25 02:30:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 12972441600. Throughput: 0: 42849.4. Samples: 12972510060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 02:30:11,369][15401] Updated weights for policy 0, policy_version 791783 (0.0038) [2024-06-25 02:30:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12972638208. Throughput: 0: 42890.2. Samples: 12972769100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:13,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 02:30:15,243][15401] Updated weights for policy 0, policy_version 791793 (0.0029) [2024-06-25 02:30:16,893][15349] Signal inference workers to stop experience collection... (192050 times) [2024-06-25 02:30:16,893][15349] Signal inference workers to resume experience collection... (192050 times) [2024-06-25 02:30:16,939][15401] InferenceWorker_p0-w0: stopping experience collection (192050 times) [2024-06-25 02:30:16,939][15401] InferenceWorker_p0-w0: resuming experience collection (192050 times) [2024-06-25 02:30:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12972867584. Throughput: 0: 42752.1. Samples: 12973022320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:18,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 02:30:19,431][15401] Updated weights for policy 0, policy_version 791803 (0.0026) [2024-06-25 02:30:22,885][15401] Updated weights for policy 0, policy_version 791813 (0.0037) [2024-06-25 02:30:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42877.0). Total num frames: 12973080576. Throughput: 0: 42979.7. Samples: 12973155260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 02:30:26,949][15401] Updated weights for policy 0, policy_version 791823 (0.0036) [2024-06-25 02:30:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12973277184. Throughput: 0: 43005.8. Samples: 12973409720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:28,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 02:30:30,446][15401] Updated weights for policy 0, policy_version 791833 (0.0042) [2024-06-25 02:30:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12973506560. Throughput: 0: 42739.7. Samples: 12973657500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:33,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 02:30:34,525][15401] Updated weights for policy 0, policy_version 791843 (0.0042) [2024-06-25 02:30:38,154][15401] Updated weights for policy 0, policy_version 791853 (0.0030) [2024-06-25 02:30:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12973719552. Throughput: 0: 42862.1. Samples: 12973792440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:38,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 02:30:42,459][15401] Updated weights for policy 0, policy_version 791863 (0.0028) [2024-06-25 02:30:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12973916160. Throughput: 0: 42748.0. Samples: 12974044940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 02:30:46,258][15401] Updated weights for policy 0, policy_version 791873 (0.0025) [2024-06-25 02:30:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 12974161920. Throughput: 0: 42526.4. Samples: 12974294640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-25 02:30:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 02:30:50,082][15401] Updated weights for policy 0, policy_version 791883 (0.0041) [2024-06-25 02:30:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12974342144. Throughput: 0: 42680.4. Samples: 12974430680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:30:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 02:30:53,810][15401] Updated weights for policy 0, policy_version 791893 (0.0030) [2024-06-25 02:30:57,596][15401] Updated weights for policy 0, policy_version 791903 (0.0035) [2024-06-25 02:30:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12974555136. Throughput: 0: 42563.1. Samples: 12974684440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:30:58,393][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 02:31:01,399][15401] Updated weights for policy 0, policy_version 791913 (0.0029) [2024-06-25 02:31:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42876.6). Total num frames: 12974784512. Throughput: 0: 42688.9. Samples: 12974943320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 02:31:05,430][15401] Updated weights for policy 0, policy_version 791923 (0.0032) [2024-06-25 02:31:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 12974981120. Throughput: 0: 42610.2. Samples: 12975072720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:08,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 02:31:09,149][15401] Updated weights for policy 0, policy_version 791933 (0.0031) [2024-06-25 02:31:12,872][15401] Updated weights for policy 0, policy_version 791943 (0.0036) [2024-06-25 02:31:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 12975226880. Throughput: 0: 42648.7. Samples: 12975328920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 02:31:16,926][15401] Updated weights for policy 0, policy_version 791953 (0.0033) [2024-06-25 02:31:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12975439872. Throughput: 0: 42817.7. Samples: 12975584300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 02:31:20,290][15401] Updated weights for policy 0, policy_version 791963 (0.0026) [2024-06-25 02:31:23,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 12975603712. Throughput: 0: 42746.8. Samples: 12975716040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 02:31:24,359][15401] Updated weights for policy 0, policy_version 791973 (0.0030) [2024-06-25 02:31:27,857][15401] Updated weights for policy 0, policy_version 791983 (0.0034) [2024-06-25 02:31:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 12975882240. Throughput: 0: 42970.2. Samples: 12975978600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 02:31:31,886][15401] Updated weights for policy 0, policy_version 791993 (0.0036) [2024-06-25 02:31:33,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 12976095232. Throughput: 0: 42988.7. Samples: 12976229140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:33,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 02:31:35,538][15401] Updated weights for policy 0, policy_version 792003 (0.0034) [2024-06-25 02:31:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 12976259072. Throughput: 0: 42890.9. Samples: 12976360760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 02:31:39,357][15401] Updated weights for policy 0, policy_version 792013 (0.0029) [2024-06-25 02:31:43,179][15401] Updated weights for policy 0, policy_version 792023 (0.0039) [2024-06-25 02:31:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 12976521216. Throughput: 0: 43024.5. Samples: 12976620540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:43,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 02:31:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792024_12976521216.pth... [2024-06-25 02:31:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791395_12966215680.pth [2024-06-25 02:31:46,012][15349] Signal inference workers to stop experience collection... (192100 times) [2024-06-25 02:31:46,069][15401] InferenceWorker_p0-w0: stopping experience collection (192100 times) [2024-06-25 02:31:46,072][15349] Signal inference workers to resume experience collection... (192100 times) [2024-06-25 02:31:46,082][15401] InferenceWorker_p0-w0: resuming experience collection (192100 times) [2024-06-25 02:31:47,044][15401] Updated weights for policy 0, policy_version 792033 (0.0046) [2024-06-25 02:31:48,392][15132] Fps is (10 sec: 49138.8, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 12976750592. Throughput: 0: 42930.4. Samples: 12976875300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:48,392][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 02:31:50,637][15401] Updated weights for policy 0, policy_version 792043 (0.0039) [2024-06-25 02:31:53,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12976898048. Throughput: 0: 43060.4. Samples: 12977010440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:53,390][15132] Avg episode reward: [(0, '0.859')] [2024-06-25 02:31:54,991][15401] Updated weights for policy 0, policy_version 792053 (0.0045) [2024-06-25 02:31:58,155][15401] Updated weights for policy 0, policy_version 792063 (0.0028) [2024-06-25 02:31:58,389][15132] Fps is (10 sec: 40970.6, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 12977160192. Throughput: 0: 42863.3. Samples: 12977257760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:31:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 02:32:02,759][15401] Updated weights for policy 0, policy_version 792073 (0.0040) [2024-06-25 02:32:03,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12977373184. Throughput: 0: 42905.5. Samples: 12977515040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 02:32:05,729][15401] Updated weights for policy 0, policy_version 792083 (0.0039) [2024-06-25 02:32:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 12977537024. Throughput: 0: 42731.5. Samples: 12977638960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 02:32:10,470][15401] Updated weights for policy 0, policy_version 792093 (0.0032) [2024-06-25 02:32:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 12977782784. Throughput: 0: 42595.2. Samples: 12977895380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:32:13,876][15401] Updated weights for policy 0, policy_version 792103 (0.0036) [2024-06-25 02:32:18,210][15401] Updated weights for policy 0, policy_version 792113 (0.0039) [2024-06-25 02:32:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 12977979392. Throughput: 0: 42901.1. Samples: 12978159680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:18,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 02:32:21,258][15401] Updated weights for policy 0, policy_version 792123 (0.0026) [2024-06-25 02:32:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12978176000. Throughput: 0: 42708.4. Samples: 12978282640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:23,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 02:32:25,802][15401] Updated weights for policy 0, policy_version 792133 (0.0042) [2024-06-25 02:32:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 12978438144. Throughput: 0: 42619.5. Samples: 12978538420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 02:32:29,050][15401] Updated weights for policy 0, policy_version 792143 (0.0025) [2024-06-25 02:32:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 12978618368. Throughput: 0: 42877.5. Samples: 12978804680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 02:32:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 02:32:33,413][15401] Updated weights for policy 0, policy_version 792153 (0.0041) [2024-06-25 02:32:36,503][15401] Updated weights for policy 0, policy_version 792163 (0.0036) [2024-06-25 02:32:38,390][15132] Fps is (10 sec: 39320.0, 60 sec: 42871.1, 300 sec: 42765.0). Total num frames: 12978831360. Throughput: 0: 42487.6. Samples: 12978922400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:32:38,391][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 02:32:40,966][15401] Updated weights for policy 0, policy_version 792173 (0.0023) [2024-06-25 02:32:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12979093504. Throughput: 0: 42768.8. Samples: 12979182360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:32:43,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 02:32:43,906][15401] Updated weights for policy 0, policy_version 792183 (0.0038) [2024-06-25 02:32:48,389][15132] Fps is (10 sec: 42600.5, 60 sec: 41781.0, 300 sec: 42820.6). Total num frames: 12979257344. Throughput: 0: 42974.6. Samples: 12979448900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:32:48,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 02:32:48,846][15401] Updated weights for policy 0, policy_version 792193 (0.0032) [2024-06-25 02:32:51,540][15401] Updated weights for policy 0, policy_version 792203 (0.0035) [2024-06-25 02:32:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 12979470336. Throughput: 0: 42805.9. Samples: 12979565220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:32:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 02:32:55,951][15349] Signal inference workers to stop experience collection... (192150 times) [2024-06-25 02:32:55,956][15349] Signal inference workers to resume experience collection... (192150 times) [2024-06-25 02:32:55,995][15401] InferenceWorker_p0-w0: stopping experience collection (192150 times) [2024-06-25 02:32:55,995][15401] InferenceWorker_p0-w0: resuming experience collection (192150 times) [2024-06-25 02:32:56,092][15401] Updated weights for policy 0, policy_version 792213 (0.0047) [2024-06-25 02:32:58,393][15132] Fps is (10 sec: 47495.1, 60 sec: 42868.7, 300 sec: 42986.6). Total num frames: 12979732480. Throughput: 0: 42933.2. Samples: 12979827540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:32:58,394][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 02:32:58,995][15401] Updated weights for policy 0, policy_version 792223 (0.0023) [2024-06-25 02:33:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 12979896320. Throughput: 0: 43142.7. Samples: 12980101100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:03,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 02:33:03,825][15401] Updated weights for policy 0, policy_version 792233 (0.0033) [2024-06-25 02:33:06,414][15401] Updated weights for policy 0, policy_version 792243 (0.0041) [2024-06-25 02:33:08,389][15132] Fps is (10 sec: 37698.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 12980109312. Throughput: 0: 42877.8. Samples: 12980212140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:08,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 02:33:11,315][15401] Updated weights for policy 0, policy_version 792253 (0.0036) [2024-06-25 02:33:13,389][15132] Fps is (10 sec: 49151.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 12980387840. Throughput: 0: 43076.1. Samples: 12980476840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 02:33:14,031][15401] Updated weights for policy 0, policy_version 792263 (0.0034) [2024-06-25 02:33:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12980551680. Throughput: 0: 43074.7. Samples: 12980743040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:18,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 02:33:19,211][15401] Updated weights for policy 0, policy_version 792273 (0.0037) [2024-06-25 02:33:21,892][15401] Updated weights for policy 0, policy_version 792283 (0.0040) [2024-06-25 02:33:23,390][15132] Fps is (10 sec: 39319.9, 60 sec: 43417.3, 300 sec: 42820.5). Total num frames: 12980781056. Throughput: 0: 42871.2. Samples: 12980851600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 02:33:26,864][15401] Updated weights for policy 0, policy_version 792293 (0.0026) [2024-06-25 02:33:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 12981010432. Throughput: 0: 43043.7. Samples: 12981119320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 02:33:29,737][15401] Updated weights for policy 0, policy_version 792303 (0.0038) [2024-06-25 02:33:33,390][15132] Fps is (10 sec: 39322.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12981174272. Throughput: 0: 43060.3. Samples: 12981386620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 02:33:34,424][15401] Updated weights for policy 0, policy_version 792313 (0.0038) [2024-06-25 02:33:37,839][15401] Updated weights for policy 0, policy_version 792323 (0.0035) [2024-06-25 02:33:38,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 12981436416. Throughput: 0: 42934.9. Samples: 12981497300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 02:33:41,987][15401] Updated weights for policy 0, policy_version 792333 (0.0035) [2024-06-25 02:33:43,389][15132] Fps is (10 sec: 49152.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 12981665792. Throughput: 0: 42924.2. Samples: 12981758960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:43,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 02:33:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792338_12981665792.pth... [2024-06-25 02:33:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000791709_12971360256.pth [2024-06-25 02:33:45,781][15401] Updated weights for policy 0, policy_version 792343 (0.0040) [2024-06-25 02:33:48,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12981813248. Throughput: 0: 42708.7. Samples: 12982023000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 02:33:49,844][15401] Updated weights for policy 0, policy_version 792353 (0.0036) [2024-06-25 02:33:53,277][15401] Updated weights for policy 0, policy_version 792363 (0.0049) [2024-06-25 02:33:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 12982075392. Throughput: 0: 42801.2. Samples: 12982138200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 02:33:57,308][15349] Signal inference workers to stop experience collection... (192200 times) [2024-06-25 02:33:57,350][15401] InferenceWorker_p0-w0: stopping experience collection (192200 times) [2024-06-25 02:33:57,359][15349] Signal inference workers to resume experience collection... (192200 times) [2024-06-25 02:33:57,367][15401] InferenceWorker_p0-w0: resuming experience collection (192200 times) [2024-06-25 02:33:57,369][15401] Updated weights for policy 0, policy_version 792373 (0.0037) [2024-06-25 02:33:58,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42874.2, 300 sec: 42876.1). Total num frames: 12982304768. Throughput: 0: 42908.4. Samples: 12982407720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:33:58,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-25 02:34:01,172][15401] Updated weights for policy 0, policy_version 792383 (0.0035) [2024-06-25 02:34:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 12982468608. Throughput: 0: 42712.5. Samples: 12982665100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:34:03,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 02:34:04,934][15401] Updated weights for policy 0, policy_version 792393 (0.0031) [2024-06-25 02:34:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 12982714368. Throughput: 0: 42970.1. Samples: 12982785240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:34:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 02:34:08,547][15401] Updated weights for policy 0, policy_version 792403 (0.0030) [2024-06-25 02:34:12,611][15401] Updated weights for policy 0, policy_version 792413 (0.0023) [2024-06-25 02:34:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12982943744. Throughput: 0: 43000.2. Samples: 12983054340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:34:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 02:34:16,119][15401] Updated weights for policy 0, policy_version 792423 (0.0035) [2024-06-25 02:34:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 12983123968. Throughput: 0: 42601.9. Samples: 12983303700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:34:18,390][15132] Avg episode reward: [(0, '0.845')] [2024-06-25 02:34:20,304][15401] Updated weights for policy 0, policy_version 792433 (0.0036) [2024-06-25 02:34:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 12983369728. Throughput: 0: 42922.2. Samples: 12983428800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 02:34:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 02:34:23,625][15401] Updated weights for policy 0, policy_version 792443 (0.0034) [2024-06-25 02:34:27,907][15401] Updated weights for policy 0, policy_version 792453 (0.0032) [2024-06-25 02:34:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 12983582720. Throughput: 0: 43002.5. Samples: 12983694080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 02:34:31,094][15401] Updated weights for policy 0, policy_version 792463 (0.0037) [2024-06-25 02:34:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12983762944. Throughput: 0: 42885.3. Samples: 12983952840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 02:34:35,503][15401] Updated weights for policy 0, policy_version 792473 (0.0032) [2024-06-25 02:34:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 12984008704. Throughput: 0: 43061.4. Samples: 12984075960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:38,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-25 02:34:39,151][15401] Updated weights for policy 0, policy_version 792483 (0.0033) [2024-06-25 02:34:43,025][15401] Updated weights for policy 0, policy_version 792493 (0.0033) [2024-06-25 02:34:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 12984205312. Throughput: 0: 42874.1. Samples: 12984337060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:43,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 02:34:46,707][15401] Updated weights for policy 0, policy_version 792503 (0.0046) [2024-06-25 02:34:48,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 12984401920. Throughput: 0: 42864.8. Samples: 12984594020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 02:34:50,740][15401] Updated weights for policy 0, policy_version 792513 (0.0041) [2024-06-25 02:34:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12984647680. Throughput: 0: 42917.0. Samples: 12984716500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 02:34:54,284][15401] Updated weights for policy 0, policy_version 792523 (0.0030) [2024-06-25 02:34:58,338][15401] Updated weights for policy 0, policy_version 792533 (0.0032) [2024-06-25 02:34:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 12984860672. Throughput: 0: 42783.7. Samples: 12984979600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:34:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 02:35:02,029][15401] Updated weights for policy 0, policy_version 792543 (0.0040) [2024-06-25 02:35:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 12985057280. Throughput: 0: 42899.4. Samples: 12985234280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:03,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 02:35:06,218][15401] Updated weights for policy 0, policy_version 792553 (0.0043) [2024-06-25 02:35:08,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 12985303040. Throughput: 0: 42914.0. Samples: 12985360200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:08,396][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 02:35:09,590][15401] Updated weights for policy 0, policy_version 792563 (0.0036) [2024-06-25 02:35:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 12985483264. Throughput: 0: 42695.5. Samples: 12985615380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:13,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 02:35:13,797][15401] Updated weights for policy 0, policy_version 792573 (0.0038) [2024-06-25 02:35:17,108][15401] Updated weights for policy 0, policy_version 792583 (0.0035) [2024-06-25 02:35:18,390][15132] Fps is (10 sec: 39346.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12985696256. Throughput: 0: 42684.4. Samples: 12985873640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 02:35:21,554][15401] Updated weights for policy 0, policy_version 792593 (0.0030) [2024-06-25 02:35:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12985942016. Throughput: 0: 42892.2. Samples: 12986006120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 02:35:24,931][15401] Updated weights for policy 0, policy_version 792603 (0.0026) [2024-06-25 02:35:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12986138624. Throughput: 0: 42867.2. Samples: 12986266080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 02:35:29,152][15401] Updated weights for policy 0, policy_version 792613 (0.0041) [2024-06-25 02:35:32,326][15401] Updated weights for policy 0, policy_version 792623 (0.0036) [2024-06-25 02:35:33,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12986351616. Throughput: 0: 43002.6. Samples: 12986529140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 02:35:36,539][15401] Updated weights for policy 0, policy_version 792633 (0.0022) [2024-06-25 02:35:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12986580992. Throughput: 0: 43216.5. Samples: 12986661240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 02:35:39,790][15401] Updated weights for policy 0, policy_version 792643 (0.0041) [2024-06-25 02:35:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 12986777600. Throughput: 0: 43110.6. Samples: 12986919580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:43,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-25 02:35:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792650_12986777600.pth... [2024-06-25 02:35:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792024_12976521216.pth [2024-06-25 02:35:43,768][15349] Signal inference workers to stop experience collection... (192250 times) [2024-06-25 02:35:43,778][15401] InferenceWorker_p0-w0: stopping experience collection (192250 times) [2024-06-25 02:35:43,826][15349] Signal inference workers to resume experience collection... (192250 times) [2024-06-25 02:35:43,826][15401] InferenceWorker_p0-w0: resuming experience collection (192250 times) [2024-06-25 02:35:43,989][15401] Updated weights for policy 0, policy_version 792653 (0.0036) [2024-06-25 02:35:47,610][15401] Updated weights for policy 0, policy_version 792663 (0.0036) [2024-06-25 02:35:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 12987006976. Throughput: 0: 43110.7. Samples: 12987174160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:48,395][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 02:35:51,606][15401] Updated weights for policy 0, policy_version 792673 (0.0033) [2024-06-25 02:35:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12987219968. Throughput: 0: 43319.9. Samples: 12987309320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 02:35:55,178][15401] Updated weights for policy 0, policy_version 792683 (0.0024) [2024-06-25 02:35:58,393][15132] Fps is (10 sec: 40945.2, 60 sec: 42595.7, 300 sec: 42820.0). Total num frames: 12987416576. Throughput: 0: 43261.0. Samples: 12987562280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:35:58,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 02:35:59,126][15401] Updated weights for policy 0, policy_version 792693 (0.0033) [2024-06-25 02:36:03,139][15401] Updated weights for policy 0, policy_version 792703 (0.0041) [2024-06-25 02:36:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 12987645952. Throughput: 0: 43162.6. Samples: 12987815960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:36:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 02:36:06,893][15401] Updated weights for policy 0, policy_version 792713 (0.0033) [2024-06-25 02:36:08,390][15132] Fps is (10 sec: 44253.0, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 12987858944. Throughput: 0: 43123.7. Samples: 12987946680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 02:36:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 02:36:10,988][15401] Updated weights for policy 0, policy_version 792723 (0.0038) [2024-06-25 02:36:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12988071936. Throughput: 0: 42966.2. Samples: 12988199560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:13,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 02:36:14,639][15401] Updated weights for policy 0, policy_version 792733 (0.0030) [2024-06-25 02:36:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 12988284928. Throughput: 0: 42859.3. Samples: 12988457800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 02:36:18,490][15401] Updated weights for policy 0, policy_version 792743 (0.0025) [2024-06-25 02:36:22,186][15401] Updated weights for policy 0, policy_version 792753 (0.0046) [2024-06-25 02:36:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 12988497920. Throughput: 0: 42758.2. Samples: 12988585360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 02:36:25,974][15401] Updated weights for policy 0, policy_version 792763 (0.0028) [2024-06-25 02:36:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12988710912. Throughput: 0: 42723.1. Samples: 12988842120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 02:36:29,820][15401] Updated weights for policy 0, policy_version 792773 (0.0033) [2024-06-25 02:36:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42987.1). Total num frames: 12988940288. Throughput: 0: 42712.5. Samples: 12989096220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:33,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:36:33,549][15401] Updated weights for policy 0, policy_version 792783 (0.0042) [2024-06-25 02:36:37,812][15401] Updated weights for policy 0, policy_version 792793 (0.0048) [2024-06-25 02:36:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 12989153280. Throughput: 0: 42599.8. Samples: 12989226320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:38,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 02:36:41,583][15401] Updated weights for policy 0, policy_version 792803 (0.0030) [2024-06-25 02:36:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.9). Total num frames: 12989349888. Throughput: 0: 42647.5. Samples: 12989481260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 02:36:45,279][15401] Updated weights for policy 0, policy_version 792813 (0.0033) [2024-06-25 02:36:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 12989562880. Throughput: 0: 42563.7. Samples: 12989731320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 02:36:49,291][15401] Updated weights for policy 0, policy_version 792823 (0.0029) [2024-06-25 02:36:53,008][15401] Updated weights for policy 0, policy_version 792833 (0.0036) [2024-06-25 02:36:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 12989775872. Throughput: 0: 42554.7. Samples: 12989861640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 02:36:57,005][15401] Updated weights for policy 0, policy_version 792843 (0.0030) [2024-06-25 02:36:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 12989988864. Throughput: 0: 42570.3. Samples: 12990115220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:36:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 02:37:00,606][15401] Updated weights for policy 0, policy_version 792853 (0.0037) [2024-06-25 02:37:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 12990218240. Throughput: 0: 42551.8. Samples: 12990372740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:03,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 02:37:04,555][15401] Updated weights for policy 0, policy_version 792863 (0.0032) [2024-06-25 02:37:08,340][15401] Updated weights for policy 0, policy_version 792873 (0.0027) [2024-06-25 02:37:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 12990431232. Throughput: 0: 42679.5. Samples: 12990505940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:08,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 02:37:12,067][15401] Updated weights for policy 0, policy_version 792883 (0.0039) [2024-06-25 02:37:13,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12990627840. Throughput: 0: 42614.9. Samples: 12990759800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 02:37:15,954][15401] Updated weights for policy 0, policy_version 792893 (0.0035) [2024-06-25 02:37:18,233][15349] Signal inference workers to stop experience collection... (192300 times) [2024-06-25 02:37:18,262][15401] InferenceWorker_p0-w0: stopping experience collection (192300 times) [2024-06-25 02:37:18,293][15349] Signal inference workers to resume experience collection... (192300 times) [2024-06-25 02:37:18,293][15401] InferenceWorker_p0-w0: resuming experience collection (192300 times) [2024-06-25 02:37:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 12990857216. Throughput: 0: 42740.6. Samples: 12991019540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 02:37:19,773][15401] Updated weights for policy 0, policy_version 792903 (0.0024) [2024-06-25 02:37:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 12991053824. Throughput: 0: 42714.8. Samples: 12991148480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 02:37:23,627][15401] Updated weights for policy 0, policy_version 792913 (0.0030) [2024-06-25 02:37:27,307][15401] Updated weights for policy 0, policy_version 792923 (0.0023) [2024-06-25 02:37:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12991283200. Throughput: 0: 42679.4. Samples: 12991401840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 02:37:31,095][15401] Updated weights for policy 0, policy_version 792933 (0.0032) [2024-06-25 02:37:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 12991496192. Throughput: 0: 43047.4. Samples: 12991668560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:33,392][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 02:37:34,769][15401] Updated weights for policy 0, policy_version 792943 (0.0037) [2024-06-25 02:37:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 12991709184. Throughput: 0: 42943.1. Samples: 12991794080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:38,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 02:37:38,732][15401] Updated weights for policy 0, policy_version 792953 (0.0029) [2024-06-25 02:37:42,314][15401] Updated weights for policy 0, policy_version 792963 (0.0038) [2024-06-25 02:37:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 12991938560. Throughput: 0: 42902.4. Samples: 12992045820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:43,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 02:37:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792965_12991938560.pth... [2024-06-25 02:37:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792338_12981665792.pth [2024-06-25 02:37:46,439][15401] Updated weights for policy 0, policy_version 792973 (0.0035) [2024-06-25 02:37:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12992135168. Throughput: 0: 42981.8. Samples: 12992306820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 02:37:50,357][15401] Updated weights for policy 0, policy_version 792983 (0.0023) [2024-06-25 02:37:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 12992348160. Throughput: 0: 42770.2. Samples: 12992430600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 02:37:53,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 02:37:54,127][15401] Updated weights for policy 0, policy_version 792993 (0.0042) [2024-06-25 02:37:58,088][15401] Updated weights for policy 0, policy_version 793003 (0.0039) [2024-06-25 02:37:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 12992561152. Throughput: 0: 42868.5. Samples: 12992688880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:37:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 02:38:01,731][15401] Updated weights for policy 0, policy_version 793013 (0.0034) [2024-06-25 02:38:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 12992790528. Throughput: 0: 42799.0. Samples: 12992945500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 02:38:05,751][15401] Updated weights for policy 0, policy_version 793023 (0.0026) [2024-06-25 02:38:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12993003520. Throughput: 0: 42797.3. Samples: 12993074360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 02:38:09,451][15401] Updated weights for policy 0, policy_version 793033 (0.0028) [2024-06-25 02:38:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.7, 300 sec: 42876.1). Total num frames: 12993200128. Throughput: 0: 42989.5. Samples: 12993336360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:13,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-25 02:38:13,446][15401] Updated weights for policy 0, policy_version 793043 (0.0035) [2024-06-25 02:38:17,073][15401] Updated weights for policy 0, policy_version 793053 (0.0037) [2024-06-25 02:38:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42931.7). Total num frames: 12993445888. Throughput: 0: 42642.3. Samples: 12993587360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:18,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 02:38:21,006][15401] Updated weights for policy 0, policy_version 793063 (0.0042) [2024-06-25 02:38:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 12993626112. Throughput: 0: 42805.4. Samples: 12993720320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 02:38:24,634][15401] Updated weights for policy 0, policy_version 793073 (0.0028) [2024-06-25 02:38:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 12993855488. Throughput: 0: 43036.4. Samples: 12993982460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 02:38:28,504][15401] Updated weights for policy 0, policy_version 793083 (0.0032) [2024-06-25 02:38:31,118][15349] Signal inference workers to stop experience collection... (192350 times) [2024-06-25 02:38:31,169][15401] InferenceWorker_p0-w0: stopping experience collection (192350 times) [2024-06-25 02:38:31,172][15349] Signal inference workers to resume experience collection... (192350 times) [2024-06-25 02:38:31,179][15401] InferenceWorker_p0-w0: resuming experience collection (192350 times) [2024-06-25 02:38:32,156][15401] Updated weights for policy 0, policy_version 793093 (0.0035) [2024-06-25 02:38:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 12994084864. Throughput: 0: 42790.9. Samples: 12994232400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:33,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 02:38:35,958][15401] Updated weights for policy 0, policy_version 793103 (0.0022) [2024-06-25 02:38:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 12994265088. Throughput: 0: 43015.7. Samples: 12994366300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:38,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 02:38:39,855][15401] Updated weights for policy 0, policy_version 793113 (0.0039) [2024-06-25 02:38:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.3, 300 sec: 43042.7). Total num frames: 12994510848. Throughput: 0: 43056.8. Samples: 12994626440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:43,390][15132] Avg episode reward: [(0, '0.875')] [2024-06-25 02:38:43,454][15401] Updated weights for policy 0, policy_version 793123 (0.0035) [2024-06-25 02:38:47,483][15401] Updated weights for policy 0, policy_version 793133 (0.0026) [2024-06-25 02:38:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 12994740224. Throughput: 0: 42877.4. Samples: 12994874980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 02:38:51,290][15401] Updated weights for policy 0, policy_version 793143 (0.0028) [2024-06-25 02:38:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 12994920448. Throughput: 0: 43010.7. Samples: 12995009840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:53,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 02:38:55,122][15401] Updated weights for policy 0, policy_version 793153 (0.0029) [2024-06-25 02:38:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 12995133440. Throughput: 0: 42887.2. Samples: 12995266280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:38:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 02:38:58,772][15401] Updated weights for policy 0, policy_version 793163 (0.0040) [2024-06-25 02:39:02,822][15401] Updated weights for policy 0, policy_version 793173 (0.0036) [2024-06-25 02:39:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 12995379200. Throughput: 0: 42948.8. Samples: 12995520060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 02:39:06,254][15401] Updated weights for policy 0, policy_version 793183 (0.0028) [2024-06-25 02:39:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 12995575808. Throughput: 0: 42935.4. Samples: 12995652420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:08,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 02:39:10,308][15401] Updated weights for policy 0, policy_version 793193 (0.0030) [2024-06-25 02:39:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 12995788800. Throughput: 0: 42818.7. Samples: 12995909300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 02:39:14,501][15401] Updated weights for policy 0, policy_version 793203 (0.0039) [2024-06-25 02:39:17,937][15401] Updated weights for policy 0, policy_version 793213 (0.0029) [2024-06-25 02:39:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 12996018176. Throughput: 0: 43000.8. Samples: 12996167440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:18,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 02:39:21,985][15401] Updated weights for policy 0, policy_version 793223 (0.0030) [2024-06-25 02:39:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 12996214784. Throughput: 0: 42922.4. Samples: 12996297820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 02:39:25,725][15401] Updated weights for policy 0, policy_version 793233 (0.0030) [2024-06-25 02:39:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 12996427776. Throughput: 0: 42775.7. Samples: 12996551340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 02:39:29,612][15401] Updated weights for policy 0, policy_version 793243 (0.0035) [2024-06-25 02:39:33,198][15349] Signal inference workers to stop experience collection... (192400 times) [2024-06-25 02:39:33,198][15349] Signal inference workers to resume experience collection... (192400 times) [2024-06-25 02:39:33,245][15401] InferenceWorker_p0-w0: stopping experience collection (192400 times) [2024-06-25 02:39:33,245][15401] InferenceWorker_p0-w0: resuming experience collection (192400 times) [2024-06-25 02:39:33,348][15401] Updated weights for policy 0, policy_version 793253 (0.0035) [2024-06-25 02:39:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 12996657152. Throughput: 0: 42944.0. Samples: 12996807460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 02:39:37,351][15401] Updated weights for policy 0, policy_version 793263 (0.0025) [2024-06-25 02:39:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42931.7). Total num frames: 12996870144. Throughput: 0: 42759.9. Samples: 12996934040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 02:39:41,239][15401] Updated weights for policy 0, policy_version 793273 (0.0033) [2024-06-25 02:39:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42931.6). Total num frames: 12997066752. Throughput: 0: 42713.7. Samples: 12997188400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 02:39:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 02:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793278_12997066752.pth... [2024-06-25 02:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792650_12986777600.pth [2024-06-25 02:39:45,301][15401] Updated weights for policy 0, policy_version 793283 (0.0026) [2024-06-25 02:39:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 12997279744. Throughput: 0: 42886.7. Samples: 12997449960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:39:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 02:39:48,808][15401] Updated weights for policy 0, policy_version 793293 (0.0037) [2024-06-25 02:39:52,741][15401] Updated weights for policy 0, policy_version 793303 (0.0039) [2024-06-25 02:39:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 12997509120. Throughput: 0: 42837.0. Samples: 12997580080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:39:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 02:39:56,361][15401] Updated weights for policy 0, policy_version 793313 (0.0054) [2024-06-25 02:39:58,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 12997705728. Throughput: 0: 42708.9. Samples: 12997831200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:39:58,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 02:40:00,280][15401] Updated weights for policy 0, policy_version 793323 (0.0039) [2024-06-25 02:40:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42766.0). Total num frames: 12997918720. Throughput: 0: 42717.0. Samples: 12998089700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 02:40:04,074][15401] Updated weights for policy 0, policy_version 793333 (0.0032) [2024-06-25 02:40:07,786][15401] Updated weights for policy 0, policy_version 793343 (0.0024) [2024-06-25 02:40:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 12998131712. Throughput: 0: 42760.9. Samples: 12998222060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 02:40:11,665][15401] Updated weights for policy 0, policy_version 793353 (0.0048) [2024-06-25 02:40:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 12998344704. Throughput: 0: 42680.0. Samples: 12998471940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 02:40:15,434][15401] Updated weights for policy 0, policy_version 793363 (0.0030) [2024-06-25 02:40:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12998574080. Throughput: 0: 42694.7. Samples: 12998728720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:18,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 02:40:19,404][15401] Updated weights for policy 0, policy_version 793373 (0.0029) [2024-06-25 02:40:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 12998770688. Throughput: 0: 42832.0. Samples: 12998861480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 02:40:23,655][15401] Updated weights for policy 0, policy_version 793383 (0.0026) [2024-06-25 02:40:27,413][15401] Updated weights for policy 0, policy_version 793393 (0.0048) [2024-06-25 02:40:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 12998983680. Throughput: 0: 42688.7. Samples: 12999109400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 02:40:31,479][15401] Updated weights for policy 0, policy_version 793403 (0.0036) [2024-06-25 02:40:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 12999213056. Throughput: 0: 42549.9. Samples: 12999364700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:33,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-25 02:40:34,935][15401] Updated weights for policy 0, policy_version 793413 (0.0041) [2024-06-25 02:40:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 12999409664. Throughput: 0: 42710.7. Samples: 12999502060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:38,390][15132] Avg episode reward: [(0, '0.181')] [2024-06-25 02:40:38,905][15401] Updated weights for policy 0, policy_version 793423 (0.0036) [2024-06-25 02:40:42,371][15401] Updated weights for policy 0, policy_version 793433 (0.0033) [2024-06-25 02:40:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 12999606272. Throughput: 0: 42658.6. Samples: 12999750840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 02:40:46,567][15401] Updated weights for policy 0, policy_version 793443 (0.0041) [2024-06-25 02:40:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 12999852032. Throughput: 0: 42504.9. Samples: 13000002420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 02:40:50,058][15401] Updated weights for policy 0, policy_version 793453 (0.0036) [2024-06-25 02:40:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42821.1). Total num frames: 13000048640. Throughput: 0: 42573.8. Samples: 13000137880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 02:40:54,330][15401] Updated weights for policy 0, policy_version 793463 (0.0033) [2024-06-25 02:40:57,483][15401] Updated weights for policy 0, policy_version 793473 (0.0032) [2024-06-25 02:40:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13000261632. Throughput: 0: 42598.2. Samples: 13000388860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:40:58,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 02:41:02,026][15401] Updated weights for policy 0, policy_version 793483 (0.0034) [2024-06-25 02:41:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13000507392. Throughput: 0: 42685.4. Samples: 13000649560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:03,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 02:41:05,048][15401] Updated weights for policy 0, policy_version 793493 (0.0034) [2024-06-25 02:41:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13000687616. Throughput: 0: 42657.8. Samples: 13000781080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 02:41:09,680][15401] Updated weights for policy 0, policy_version 793503 (0.0033) [2024-06-25 02:41:12,841][15401] Updated weights for policy 0, policy_version 793513 (0.0046) [2024-06-25 02:41:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13000916992. Throughput: 0: 42698.7. Samples: 13001030840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 02:41:13,860][15349] Signal inference workers to stop experience collection... (192450 times) [2024-06-25 02:41:13,867][15349] Signal inference workers to resume experience collection... (192450 times) [2024-06-25 02:41:13,906][15401] InferenceWorker_p0-w0: stopping experience collection (192450 times) [2024-06-25 02:41:13,907][15401] InferenceWorker_p0-w0: resuming experience collection (192450 times) [2024-06-25 02:41:17,410][15401] Updated weights for policy 0, policy_version 793523 (0.0038) [2024-06-25 02:41:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13001129984. Throughput: 0: 42732.1. Samples: 13001287640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 02:41:20,556][15401] Updated weights for policy 0, policy_version 793533 (0.0039) [2024-06-25 02:41:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13001310208. Throughput: 0: 42448.9. Samples: 13001412260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 02:41:25,067][15401] Updated weights for policy 0, policy_version 793543 (0.0036) [2024-06-25 02:41:28,120][15401] Updated weights for policy 0, policy_version 793553 (0.0032) [2024-06-25 02:41:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13001572352. Throughput: 0: 42530.5. Samples: 13001664720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:41:28,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 02:41:33,026][15401] Updated weights for policy 0, policy_version 793563 (0.0038) [2024-06-25 02:41:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 13001736192. Throughput: 0: 42835.0. Samples: 13001930000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 02:41:36,033][15401] Updated weights for policy 0, policy_version 793573 (0.0047) [2024-06-25 02:41:38,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13001949184. Throughput: 0: 42443.1. Samples: 13002047820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 02:41:40,627][15401] Updated weights for policy 0, policy_version 793583 (0.0034) [2024-06-25 02:41:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13002211328. Throughput: 0: 42590.5. Samples: 13002305440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:41:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793593_13002227712.pth... [2024-06-25 02:41:43,485][15401] Updated weights for policy 0, policy_version 793593 (0.0033) [2024-06-25 02:41:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000792965_12991938560.pth [2024-06-25 02:41:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13002375168. Throughput: 0: 42417.8. Samples: 13002558360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 02:41:48,731][15401] Updated weights for policy 0, policy_version 793603 (0.0049) [2024-06-25 02:41:51,346][15401] Updated weights for policy 0, policy_version 793613 (0.0041) [2024-06-25 02:41:53,392][15132] Fps is (10 sec: 37674.9, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 13002588160. Throughput: 0: 42200.9. Samples: 13002680220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:53,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 02:41:56,251][15401] Updated weights for policy 0, policy_version 793623 (0.0044) [2024-06-25 02:41:58,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 13002850304. Throughput: 0: 42453.8. Samples: 13002941260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:41:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 02:41:59,119][15401] Updated weights for policy 0, policy_version 793633 (0.0046) [2024-06-25 02:42:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 13003014144. Throughput: 0: 42554.5. Samples: 13003202600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:03,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 02:42:03,766][15401] Updated weights for policy 0, policy_version 793643 (0.0031) [2024-06-25 02:42:06,702][15401] Updated weights for policy 0, policy_version 793653 (0.0035) [2024-06-25 02:42:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13003243520. Throughput: 0: 42390.6. Samples: 13003319840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 02:42:11,668][15401] Updated weights for policy 0, policy_version 793663 (0.0033) [2024-06-25 02:42:13,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 13003472896. Throughput: 0: 42612.0. Samples: 13003582360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:13,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 02:42:14,441][15401] Updated weights for policy 0, policy_version 793673 (0.0041) [2024-06-25 02:42:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13003653120. Throughput: 0: 42397.8. Samples: 13003837900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:18,396][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 02:42:19,419][15401] Updated weights for policy 0, policy_version 793683 (0.0036) [2024-06-25 02:42:22,008][15401] Updated weights for policy 0, policy_version 793693 (0.0026) [2024-06-25 02:42:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13003882496. Throughput: 0: 42461.3. Samples: 13003958580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 02:42:27,026][15401] Updated weights for policy 0, policy_version 793703 (0.0053) [2024-06-25 02:42:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 13004111872. Throughput: 0: 42545.8. Samples: 13004220000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:28,396][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 02:42:29,773][15401] Updated weights for policy 0, policy_version 793713 (0.0029) [2024-06-25 02:42:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13004292096. Throughput: 0: 42683.6. Samples: 13004479120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 02:42:34,723][15401] Updated weights for policy 0, policy_version 793723 (0.0038) [2024-06-25 02:42:36,735][15349] Signal inference workers to stop experience collection... (192500 times) [2024-06-25 02:42:36,736][15349] Signal inference workers to resume experience collection... (192500 times) [2024-06-25 02:42:36,778][15401] InferenceWorker_p0-w0: stopping experience collection (192500 times) [2024-06-25 02:42:36,778][15401] InferenceWorker_p0-w0: resuming experience collection (192500 times) [2024-06-25 02:42:37,242][15401] Updated weights for policy 0, policy_version 793733 (0.0037) [2024-06-25 02:42:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13004537856. Throughput: 0: 42765.3. Samples: 13004604560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 02:42:42,295][15401] Updated weights for policy 0, policy_version 793743 (0.0031) [2024-06-25 02:42:43,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42050.7, 300 sec: 42709.1). Total num frames: 13004734464. Throughput: 0: 42884.0. Samples: 13004871140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:43,392][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 02:42:44,974][15401] Updated weights for policy 0, policy_version 793753 (0.0028) [2024-06-25 02:42:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13004947456. Throughput: 0: 42751.1. Samples: 13005126400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:48,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 02:42:49,795][15401] Updated weights for policy 0, policy_version 793763 (0.0041) [2024-06-25 02:42:52,757][15401] Updated weights for policy 0, policy_version 793773 (0.0036) [2024-06-25 02:42:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13005176832. Throughput: 0: 42892.5. Samples: 13005250000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 02:42:57,382][15401] Updated weights for policy 0, policy_version 793783 (0.0032) [2024-06-25 02:42:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 13005373440. Throughput: 0: 42942.4. Samples: 13005514660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:42:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 02:43:00,347][15401] Updated weights for policy 0, policy_version 793793 (0.0039) [2024-06-25 02:43:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13005586432. Throughput: 0: 42941.2. Samples: 13005770260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:43:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 02:43:04,988][15401] Updated weights for policy 0, policy_version 793803 (0.0035) [2024-06-25 02:43:07,958][15401] Updated weights for policy 0, policy_version 793813 (0.0028) [2024-06-25 02:43:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13005832192. Throughput: 0: 43031.7. Samples: 13005895000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:43:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 02:43:12,496][15401] Updated weights for policy 0, policy_version 793823 (0.0033) [2024-06-25 02:43:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 13006028800. Throughput: 0: 43201.9. Samples: 13006164080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 02:43:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 02:43:15,515][15401] Updated weights for policy 0, policy_version 793833 (0.0034) [2024-06-25 02:43:18,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13006225408. Throughput: 0: 43070.0. Samples: 13006417280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 02:43:20,076][15401] Updated weights for policy 0, policy_version 793843 (0.0025) [2024-06-25 02:43:23,231][15401] Updated weights for policy 0, policy_version 793853 (0.0038) [2024-06-25 02:43:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13006487552. Throughput: 0: 43132.4. Samples: 13006545520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 02:43:27,625][15401] Updated weights for policy 0, policy_version 793863 (0.0029) [2024-06-25 02:43:28,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13006684160. Throughput: 0: 43123.3. Samples: 13006811580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 02:43:31,010][15401] Updated weights for policy 0, policy_version 793873 (0.0031) [2024-06-25 02:43:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13006880768. Throughput: 0: 43111.9. Samples: 13007066440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:33,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 02:43:35,428][15401] Updated weights for policy 0, policy_version 793883 (0.0045) [2024-06-25 02:43:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13007110144. Throughput: 0: 42983.1. Samples: 13007184240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 02:43:38,690][15401] Updated weights for policy 0, policy_version 793893 (0.0035) [2024-06-25 02:43:43,082][15401] Updated weights for policy 0, policy_version 793903 (0.0039) [2024-06-25 02:43:43,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 13007323136. Throughput: 0: 43102.6. Samples: 13007454280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:43,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 02:43:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793904_13007323136.pth... [2024-06-25 02:43:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793278_12997066752.pth [2024-06-25 02:43:46,235][15401] Updated weights for policy 0, policy_version 793913 (0.0038) [2024-06-25 02:43:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13007536128. Throughput: 0: 42916.5. Samples: 13007701500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 02:43:50,849][15401] Updated weights for policy 0, policy_version 793923 (0.0029) [2024-06-25 02:43:53,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 13007765504. Throughput: 0: 43116.2. Samples: 13007835340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:53,393][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 02:43:53,668][15349] Signal inference workers to stop experience collection... (192550 times) [2024-06-25 02:43:53,694][15401] InferenceWorker_p0-w0: stopping experience collection (192550 times) [2024-06-25 02:43:53,723][15349] Signal inference workers to resume experience collection... (192550 times) [2024-06-25 02:43:53,724][15401] InferenceWorker_p0-w0: resuming experience collection (192550 times) [2024-06-25 02:43:53,896][15401] Updated weights for policy 0, policy_version 793933 (0.0038) [2024-06-25 02:43:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13007945728. Throughput: 0: 42996.9. Samples: 13008098940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:43:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 02:43:58,521][15401] Updated weights for policy 0, policy_version 793943 (0.0033) [2024-06-25 02:44:01,303][15401] Updated weights for policy 0, policy_version 793953 (0.0024) [2024-06-25 02:44:03,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13008191488. Throughput: 0: 42808.1. Samples: 13008343640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:03,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 02:44:06,102][15401] Updated weights for policy 0, policy_version 793963 (0.0037) [2024-06-25 02:44:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13008404480. Throughput: 0: 43007.2. Samples: 13008480840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 02:44:09,301][15401] Updated weights for policy 0, policy_version 793973 (0.0028) [2024-06-25 02:44:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13008584704. Throughput: 0: 42751.9. Samples: 13008735420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 02:44:13,624][15401] Updated weights for policy 0, policy_version 793983 (0.0037) [2024-06-25 02:44:16,957][15401] Updated weights for policy 0, policy_version 793993 (0.0032) [2024-06-25 02:44:18,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43686.1, 300 sec: 42819.6). Total num frames: 13008846848. Throughput: 0: 42529.7. Samples: 13008980540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:18,396][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 02:44:21,226][15401] Updated weights for policy 0, policy_version 794003 (0.0032) [2024-06-25 02:44:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13009027072. Throughput: 0: 42988.8. Samples: 13009118740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 02:44:24,640][15401] Updated weights for policy 0, policy_version 794013 (0.0024) [2024-06-25 02:44:28,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13009240064. Throughput: 0: 42692.5. Samples: 13009375440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 02:44:28,660][15401] Updated weights for policy 0, policy_version 794023 (0.0035) [2024-06-25 02:44:32,097][15401] Updated weights for policy 0, policy_version 794033 (0.0029) [2024-06-25 02:44:33,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 13009502208. Throughput: 0: 42868.7. Samples: 13009630600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:33,391][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 02:44:36,093][15401] Updated weights for policy 0, policy_version 794043 (0.0049) [2024-06-25 02:44:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13009682432. Throughput: 0: 42857.1. Samples: 13009763800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 02:44:40,039][15401] Updated weights for policy 0, policy_version 794053 (0.0046) [2024-06-25 02:44:43,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13009895424. Throughput: 0: 42749.9. Samples: 13010022680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 02:44:43,558][15401] Updated weights for policy 0, policy_version 794063 (0.0038) [2024-06-25 02:44:47,535][15401] Updated weights for policy 0, policy_version 794073 (0.0033) [2024-06-25 02:44:48,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 13010141184. Throughput: 0: 42927.5. Samples: 13010275480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:48,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 02:44:51,333][15401] Updated weights for policy 0, policy_version 794083 (0.0035) [2024-06-25 02:44:53,394][15132] Fps is (10 sec: 40943.2, 60 sec: 42324.2, 300 sec: 42708.9). Total num frames: 13010305024. Throughput: 0: 42933.9. Samples: 13010413040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:53,394][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 02:44:55,035][15401] Updated weights for policy 0, policy_version 794093 (0.0033) [2024-06-25 02:44:58,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13010550784. Throughput: 0: 43090.2. Samples: 13010674480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:44:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 02:44:58,731][15401] Updated weights for policy 0, policy_version 794103 (0.0027) [2024-06-25 02:45:02,610][15401] Updated weights for policy 0, policy_version 794113 (0.0030) [2024-06-25 02:45:03,390][15132] Fps is (10 sec: 49171.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 13010796544. Throughput: 0: 43363.5. Samples: 13010931620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 02:45:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 02:45:06,206][15401] Updated weights for policy 0, policy_version 794123 (0.0028) [2024-06-25 02:45:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13010960384. Throughput: 0: 43251.6. Samples: 13011065060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:08,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 02:45:10,133][15349] Signal inference workers to stop experience collection... (192600 times) [2024-06-25 02:45:10,171][15401] InferenceWorker_p0-w0: stopping experience collection (192600 times) [2024-06-25 02:45:10,181][15349] Signal inference workers to resume experience collection... (192600 times) [2024-06-25 02:45:10,194][15401] InferenceWorker_p0-w0: resuming experience collection (192600 times) [2024-06-25 02:45:10,203][15401] Updated weights for policy 0, policy_version 794133 (0.0037) [2024-06-25 02:45:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 13011206144. Throughput: 0: 43285.1. Samples: 13011323280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 02:45:13,710][15401] Updated weights for policy 0, policy_version 794143 (0.0035) [2024-06-25 02:45:17,715][15401] Updated weights for policy 0, policy_version 794153 (0.0029) [2024-06-25 02:45:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 13011419136. Throughput: 0: 43279.0. Samples: 13011578140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 02:45:21,318][15401] Updated weights for policy 0, policy_version 794163 (0.0021) [2024-06-25 02:45:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13011615744. Throughput: 0: 43216.8. Samples: 13011708560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 02:45:25,106][15401] Updated weights for policy 0, policy_version 794173 (0.0037) [2024-06-25 02:45:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 13011861504. Throughput: 0: 43276.7. Samples: 13011970140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 02:45:29,251][15401] Updated weights for policy 0, policy_version 794183 (0.0035) [2024-06-25 02:45:32,947][15401] Updated weights for policy 0, policy_version 794193 (0.0034) [2024-06-25 02:45:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 13012058112. Throughput: 0: 43291.3. Samples: 13012223480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 02:45:36,725][15401] Updated weights for policy 0, policy_version 794203 (0.0043) [2024-06-25 02:45:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13012271104. Throughput: 0: 42971.0. Samples: 13012346560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 02:45:40,802][15401] Updated weights for policy 0, policy_version 794213 (0.0024) [2024-06-25 02:45:43,389][15132] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 13012516864. Throughput: 0: 43127.6. Samples: 13012615220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 02:45:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794221_13012516864.pth... [2024-06-25 02:45:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793593_13002227712.pth [2024-06-25 02:45:44,358][15401] Updated weights for policy 0, policy_version 794223 (0.0038) [2024-06-25 02:45:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 13012697088. Throughput: 0: 43015.1. Samples: 13012867300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 02:45:48,481][15401] Updated weights for policy 0, policy_version 794233 (0.0031) [2024-06-25 02:45:52,226][15401] Updated weights for policy 0, policy_version 794243 (0.0040) [2024-06-25 02:45:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43420.4, 300 sec: 42876.1). Total num frames: 13012910080. Throughput: 0: 42838.5. Samples: 13012992800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 02:45:55,992][15401] Updated weights for policy 0, policy_version 794253 (0.0037) [2024-06-25 02:45:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 13013139456. Throughput: 0: 42927.0. Samples: 13013254980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:45:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 02:45:59,801][15401] Updated weights for policy 0, policy_version 794263 (0.0037) [2024-06-25 02:46:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 13013336064. Throughput: 0: 42851.3. Samples: 13013506460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 02:46:03,585][15401] Updated weights for policy 0, policy_version 794273 (0.0028) [2024-06-25 02:46:07,203][15401] Updated weights for policy 0, policy_version 794283 (0.0032) [2024-06-25 02:46:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13013565440. Throughput: 0: 42890.2. Samples: 13013638620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:08,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 02:46:11,001][15401] Updated weights for policy 0, policy_version 794293 (0.0034) [2024-06-25 02:46:13,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 13013745664. Throughput: 0: 42878.4. Samples: 13013899660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:13,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 02:46:14,815][15401] Updated weights for policy 0, policy_version 794303 (0.0026) [2024-06-25 02:46:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13014007808. Throughput: 0: 42948.8. Samples: 13014156180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 02:46:18,692][15401] Updated weights for policy 0, policy_version 794313 (0.0037) [2024-06-25 02:46:22,620][15401] Updated weights for policy 0, policy_version 794323 (0.0039) [2024-06-25 02:46:23,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13014220800. Throughput: 0: 43040.8. Samples: 13014283400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:23,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 02:46:26,562][15401] Updated weights for policy 0, policy_version 794333 (0.0028) [2024-06-25 02:46:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13014417408. Throughput: 0: 42759.9. Samples: 13014539420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 02:46:30,185][15401] Updated weights for policy 0, policy_version 794343 (0.0038) [2024-06-25 02:46:32,853][15349] Signal inference workers to stop experience collection... (192650 times) [2024-06-25 02:46:32,874][15401] InferenceWorker_p0-w0: stopping experience collection (192650 times) [2024-06-25 02:46:32,915][15349] Signal inference workers to resume experience collection... (192650 times) [2024-06-25 02:46:32,915][15401] InferenceWorker_p0-w0: resuming experience collection (192650 times) [2024-06-25 02:46:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13014630400. Throughput: 0: 42821.3. Samples: 13014794260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 02:46:34,566][15401] Updated weights for policy 0, policy_version 794353 (0.0036) [2024-06-25 02:46:37,893][15401] Updated weights for policy 0, policy_version 794363 (0.0042) [2024-06-25 02:46:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13014843392. Throughput: 0: 42924.9. Samples: 13014924520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:38,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 02:46:42,258][15401] Updated weights for policy 0, policy_version 794373 (0.0033) [2024-06-25 02:46:43,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 42931.3). Total num frames: 13015040000. Throughput: 0: 42808.3. Samples: 13015181460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:43,392][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 02:46:45,944][15401] Updated weights for policy 0, policy_version 794383 (0.0043) [2024-06-25 02:46:48,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 13015285760. Throughput: 0: 42790.0. Samples: 13015432000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 02:46:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 02:46:49,839][15401] Updated weights for policy 0, policy_version 794393 (0.0040) [2024-06-25 02:46:53,389][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13015482368. Throughput: 0: 42863.6. Samples: 13015567480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:46:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 02:46:53,407][15401] Updated weights for policy 0, policy_version 794403 (0.0031) [2024-06-25 02:46:57,485][15401] Updated weights for policy 0, policy_version 794413 (0.0041) [2024-06-25 02:46:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13015695360. Throughput: 0: 42527.5. Samples: 13015813400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:46:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 02:47:01,222][15401] Updated weights for policy 0, policy_version 794423 (0.0029) [2024-06-25 02:47:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 13015908352. Throughput: 0: 42509.7. Samples: 13016069120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 02:47:05,139][15401] Updated weights for policy 0, policy_version 794433 (0.0023) [2024-06-25 02:47:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 13016104960. Throughput: 0: 42590.4. Samples: 13016199960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 02:47:08,977][15401] Updated weights for policy 0, policy_version 794443 (0.0033) [2024-06-25 02:47:12,777][15401] Updated weights for policy 0, policy_version 794453 (0.0038) [2024-06-25 02:47:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13016334336. Throughput: 0: 42493.4. Samples: 13016451620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 02:47:16,651][15401] Updated weights for policy 0, policy_version 794463 (0.0046) [2024-06-25 02:47:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 13016547328. Throughput: 0: 42530.3. Samples: 13016708120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 02:47:20,373][15401] Updated weights for policy 0, policy_version 794473 (0.0039) [2024-06-25 02:47:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13016760320. Throughput: 0: 42471.6. Samples: 13016835640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 02:47:24,230][15401] Updated weights for policy 0, policy_version 794483 (0.0030) [2024-06-25 02:47:27,956][15401] Updated weights for policy 0, policy_version 794493 (0.0031) [2024-06-25 02:47:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 13016973312. Throughput: 0: 42422.2. Samples: 13017090360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:28,394][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 02:47:31,876][15401] Updated weights for policy 0, policy_version 794503 (0.0037) [2024-06-25 02:47:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13017186304. Throughput: 0: 42508.9. Samples: 13017344900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:33,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 02:47:35,690][15401] Updated weights for policy 0, policy_version 794513 (0.0033) [2024-06-25 02:47:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42876.4). Total num frames: 13017382912. Throughput: 0: 42370.6. Samples: 13017474160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 02:47:39,802][15401] Updated weights for policy 0, policy_version 794523 (0.0032) [2024-06-25 02:47:43,245][15401] Updated weights for policy 0, policy_version 794533 (0.0033) [2024-06-25 02:47:43,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 13017628672. Throughput: 0: 42797.8. Samples: 13017739300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 02:47:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794533_13017628672.pth... [2024-06-25 02:47:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000793904_13007323136.pth [2024-06-25 02:47:47,311][15401] Updated weights for policy 0, policy_version 794543 (0.0036) [2024-06-25 02:47:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13017841664. Throughput: 0: 42699.6. Samples: 13017990600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 02:47:51,197][15401] Updated weights for policy 0, policy_version 794553 (0.0034) [2024-06-25 02:47:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13018038272. Throughput: 0: 42709.6. Samples: 13018121900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 02:47:54,839][15401] Updated weights for policy 0, policy_version 794563 (0.0038) [2024-06-25 02:47:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13018251264. Throughput: 0: 42847.6. Samples: 13018379760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:47:58,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 02:47:58,744][15401] Updated weights for policy 0, policy_version 794573 (0.0032) [2024-06-25 02:48:02,606][15401] Updated weights for policy 0, policy_version 794583 (0.0047) [2024-06-25 02:48:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13018480640. Throughput: 0: 42809.7. Samples: 13018634560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 02:48:06,506][15401] Updated weights for policy 0, policy_version 794593 (0.0040) [2024-06-25 02:48:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13018693632. Throughput: 0: 42989.0. Samples: 13018770140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 02:48:09,102][15349] Signal inference workers to stop experience collection... (192700 times) [2024-06-25 02:48:09,103][15349] Signal inference workers to resume experience collection... (192700 times) [2024-06-25 02:48:09,118][15401] InferenceWorker_p0-w0: stopping experience collection (192700 times) [2024-06-25 02:48:09,118][15401] InferenceWorker_p0-w0: resuming experience collection (192700 times) [2024-06-25 02:48:10,032][15401] Updated weights for policy 0, policy_version 794603 (0.0031) [2024-06-25 02:48:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 13018890240. Throughput: 0: 43117.3. Samples: 13019030740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:13,392][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 02:48:14,163][15401] Updated weights for policy 0, policy_version 794613 (0.0041) [2024-06-25 02:48:17,536][15401] Updated weights for policy 0, policy_version 794623 (0.0037) [2024-06-25 02:48:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13019136000. Throughput: 0: 43050.9. Samples: 13019282200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 02:48:21,842][15401] Updated weights for policy 0, policy_version 794633 (0.0041) [2024-06-25 02:48:23,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13019332608. Throughput: 0: 43238.2. Samples: 13019419880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 02:48:25,218][15401] Updated weights for policy 0, policy_version 794643 (0.0039) [2024-06-25 02:48:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13019545600. Throughput: 0: 42938.6. Samples: 13019671540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 02:48:29,406][15401] Updated weights for policy 0, policy_version 794653 (0.0042) [2024-06-25 02:48:32,729][15401] Updated weights for policy 0, policy_version 794663 (0.0036) [2024-06-25 02:48:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 13019774976. Throughput: 0: 43059.0. Samples: 13019928360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 02:48:33,393][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 02:48:36,992][15401] Updated weights for policy 0, policy_version 794673 (0.0034) [2024-06-25 02:48:38,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 13019987968. Throughput: 0: 43187.2. Samples: 13020065420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:48:38,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 02:48:40,194][15401] Updated weights for policy 0, policy_version 794683 (0.0024) [2024-06-25 02:48:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13020184576. Throughput: 0: 43185.8. Samples: 13020323120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:48:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 02:48:44,508][15401] Updated weights for policy 0, policy_version 794693 (0.0032) [2024-06-25 02:48:47,867][15401] Updated weights for policy 0, policy_version 794703 (0.0030) [2024-06-25 02:48:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13020430336. Throughput: 0: 43081.7. Samples: 13020573240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:48:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 02:48:52,192][15401] Updated weights for policy 0, policy_version 794713 (0.0036) [2024-06-25 02:48:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13020626944. Throughput: 0: 43145.3. Samples: 13020711680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:48:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 02:48:55,699][15401] Updated weights for policy 0, policy_version 794723 (0.0045) [2024-06-25 02:48:58,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13020823552. Throughput: 0: 42909.8. Samples: 13020961580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:48:58,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 02:49:00,015][15401] Updated weights for policy 0, policy_version 794733 (0.0033) [2024-06-25 02:49:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13021052928. Throughput: 0: 42985.3. Samples: 13021216540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 02:49:03,616][15401] Updated weights for policy 0, policy_version 794743 (0.0041) [2024-06-25 02:49:07,755][15401] Updated weights for policy 0, policy_version 794753 (0.0038) [2024-06-25 02:49:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 13021282304. Throughput: 0: 42907.1. Samples: 13021350700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 02:49:11,177][15401] Updated weights for policy 0, policy_version 794763 (0.0035) [2024-06-25 02:49:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42873.2, 300 sec: 42765.9). Total num frames: 13021462528. Throughput: 0: 42789.4. Samples: 13021597060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:13,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 02:49:15,508][15401] Updated weights for policy 0, policy_version 794773 (0.0034) [2024-06-25 02:49:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13021691904. Throughput: 0: 42889.5. Samples: 13021858280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 02:49:18,648][15401] Updated weights for policy 0, policy_version 794783 (0.0032) [2024-06-25 02:49:22,981][15401] Updated weights for policy 0, policy_version 794793 (0.0027) [2024-06-25 02:49:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13021904896. Throughput: 0: 42814.7. Samples: 13021991980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 02:49:26,264][15401] Updated weights for policy 0, policy_version 794803 (0.0025) [2024-06-25 02:49:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13022134272. Throughput: 0: 42681.7. Samples: 13022243800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:28,392][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 02:49:30,334][15401] Updated weights for policy 0, policy_version 794813 (0.0026) [2024-06-25 02:49:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 13022347264. Throughput: 0: 42986.2. Samples: 13022507720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:33,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 02:49:33,711][15401] Updated weights for policy 0, policy_version 794823 (0.0030) [2024-06-25 02:49:38,301][15401] Updated weights for policy 0, policy_version 794833 (0.0044) [2024-06-25 02:49:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 13022543872. Throughput: 0: 42745.0. Samples: 13022635200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:38,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 02:49:41,560][15401] Updated weights for policy 0, policy_version 794843 (0.0023) [2024-06-25 02:49:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 13022789632. Throughput: 0: 42793.0. Samples: 13022887260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 02:49:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794848_13022789632.pth... [2024-06-25 02:49:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794221_13012516864.pth [2024-06-25 02:49:45,911][15401] Updated weights for policy 0, policy_version 794853 (0.0047) [2024-06-25 02:49:46,485][15349] Signal inference workers to stop experience collection... (192750 times) [2024-06-25 02:49:46,492][15349] Signal inference workers to resume experience collection... (192750 times) [2024-06-25 02:49:46,498][15401] InferenceWorker_p0-w0: stopping experience collection (192750 times) [2024-06-25 02:49:46,511][15401] InferenceWorker_p0-w0: resuming experience collection (192750 times) [2024-06-25 02:49:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42987.8). Total num frames: 13022986240. Throughput: 0: 43053.5. Samples: 13023153940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 02:49:49,084][15401] Updated weights for policy 0, policy_version 794863 (0.0029) [2024-06-25 02:49:53,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13023182848. Throughput: 0: 42761.8. Samples: 13023275080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:53,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 02:49:53,711][15401] Updated weights for policy 0, policy_version 794873 (0.0044) [2024-06-25 02:49:56,849][15401] Updated weights for policy 0, policy_version 794883 (0.0043) [2024-06-25 02:49:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 13023428608. Throughput: 0: 42960.9. Samples: 13023530300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:49:58,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 02:50:01,222][15401] Updated weights for policy 0, policy_version 794893 (0.0050) [2024-06-25 02:50:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13023625216. Throughput: 0: 43090.5. Samples: 13023797360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:50:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 02:50:04,426][15401] Updated weights for policy 0, policy_version 794903 (0.0038) [2024-06-25 02:50:08,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 13023821824. Throughput: 0: 42788.0. Samples: 13023917540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:50:08,392][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 02:50:08,684][15401] Updated weights for policy 0, policy_version 794913 (0.0045) [2024-06-25 02:50:12,102][15401] Updated weights for policy 0, policy_version 794923 (0.0043) [2024-06-25 02:50:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13024067584. Throughput: 0: 42916.9. Samples: 13024175060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:50:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 02:50:16,380][15401] Updated weights for policy 0, policy_version 794933 (0.0036) [2024-06-25 02:50:18,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13024264192. Throughput: 0: 42831.7. Samples: 13024435040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:50:18,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-25 02:50:19,718][15401] Updated weights for policy 0, policy_version 794943 (0.0030) [2024-06-25 02:50:23,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 13024477184. Throughput: 0: 42803.2. Samples: 13024561620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 02:50:23,396][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 02:50:24,174][15401] Updated weights for policy 0, policy_version 794953 (0.0040) [2024-06-25 02:50:27,387][15401] Updated weights for policy 0, policy_version 794963 (0.0028) [2024-06-25 02:50:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13024706560. Throughput: 0: 42920.3. Samples: 13024818680. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:28,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-25 02:50:31,540][15401] Updated weights for policy 0, policy_version 794973 (0.0039) [2024-06-25 02:50:33,389][15132] Fps is (10 sec: 44265.1, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 13024919552. Throughput: 0: 42929.3. Samples: 13025085760. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:33,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-25 02:50:35,029][15401] Updated weights for policy 0, policy_version 794983 (0.0028) [2024-06-25 02:50:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13025116160. Throughput: 0: 42925.0. Samples: 13025206600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 02:50:39,324][15401] Updated weights for policy 0, policy_version 794993 (0.0035) [2024-06-25 02:50:42,430][15401] Updated weights for policy 0, policy_version 795003 (0.0038) [2024-06-25 02:50:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13025361920. Throughput: 0: 43058.0. Samples: 13025467920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 02:50:46,794][15401] Updated weights for policy 0, policy_version 795013 (0.0041) [2024-06-25 02:50:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13025542144. Throughput: 0: 42916.2. Samples: 13025728580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:50:50,050][15401] Updated weights for policy 0, policy_version 795023 (0.0027) [2024-06-25 02:50:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 13025771520. Throughput: 0: 43114.3. Samples: 13025857580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 02:50:54,250][15401] Updated weights for policy 0, policy_version 795033 (0.0031) [2024-06-25 02:50:57,612][15401] Updated weights for policy 0, policy_version 795043 (0.0041) [2024-06-25 02:50:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13026000896. Throughput: 0: 43239.2. Samples: 13026120820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:50:58,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 02:51:01,823][15401] Updated weights for policy 0, policy_version 795053 (0.0035) [2024-06-25 02:51:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13026197504. Throughput: 0: 43282.2. Samples: 13026382740. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 02:51:05,418][15401] Updated weights for policy 0, policy_version 795063 (0.0033) [2024-06-25 02:51:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.4, 300 sec: 42987.2). Total num frames: 13026426880. Throughput: 0: 43255.5. Samples: 13026507840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 02:51:09,419][15401] Updated weights for policy 0, policy_version 795073 (0.0042) [2024-06-25 02:51:13,042][15401] Updated weights for policy 0, policy_version 795083 (0.0033) [2024-06-25 02:51:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13026639872. Throughput: 0: 43281.0. Samples: 13026766320. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:13,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 02:51:16,655][15349] Signal inference workers to stop experience collection... (192800 times) [2024-06-25 02:51:16,657][15349] Signal inference workers to resume experience collection... (192800 times) [2024-06-25 02:51:16,712][15401] InferenceWorker_p0-w0: stopping experience collection (192800 times) [2024-06-25 02:51:16,712][15401] InferenceWorker_p0-w0: resuming experience collection (192800 times) [2024-06-25 02:51:16,803][15401] Updated weights for policy 0, policy_version 795093 (0.0036) [2024-06-25 02:51:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13026852864. Throughput: 0: 43193.3. Samples: 13027029460. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:18,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 02:51:20,697][15401] Updated weights for policy 0, policy_version 795103 (0.0044) [2024-06-25 02:51:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43422.1, 300 sec: 42931.6). Total num frames: 13027082240. Throughput: 0: 43370.9. Samples: 13027158300. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:23,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 02:51:24,383][15401] Updated weights for policy 0, policy_version 795113 (0.0036) [2024-06-25 02:51:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13027278848. Throughput: 0: 43256.5. Samples: 13027414460. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:28,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 02:51:28,715][15401] Updated weights for policy 0, policy_version 795123 (0.0033) [2024-06-25 02:51:31,921][15401] Updated weights for policy 0, policy_version 795133 (0.0034) [2024-06-25 02:51:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 13027491840. Throughput: 0: 43124.9. Samples: 13027669200. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 02:51:36,070][15401] Updated weights for policy 0, policy_version 795143 (0.0029) [2024-06-25 02:51:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 13027721216. Throughput: 0: 43160.5. Samples: 13027799800. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 02:51:39,482][15401] Updated weights for policy 0, policy_version 795153 (0.0047) [2024-06-25 02:51:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13027917824. Throughput: 0: 42967.0. Samples: 13028054340. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 02:51:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795162_13027934208.pth... [2024-06-25 02:51:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794533_13017628672.pth [2024-06-25 02:51:43,648][15401] Updated weights for policy 0, policy_version 795163 (0.0031) [2024-06-25 02:51:46,958][15401] Updated weights for policy 0, policy_version 795173 (0.0032) [2024-06-25 02:51:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43690.5, 300 sec: 42987.2). Total num frames: 13028163584. Throughput: 0: 42955.9. Samples: 13028315760. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 02:51:51,235][15401] Updated weights for policy 0, policy_version 795183 (0.0033) [2024-06-25 02:51:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13028376576. Throughput: 0: 43063.0. Samples: 13028445680. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:53,394][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 02:51:54,874][15401] Updated weights for policy 0, policy_version 795193 (0.0032) [2024-06-25 02:51:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13028556800. Throughput: 0: 43040.5. Samples: 13028703140. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:51:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 02:51:59,192][15401] Updated weights for policy 0, policy_version 795203 (0.0029) [2024-06-25 02:52:02,716][15401] Updated weights for policy 0, policy_version 795213 (0.0032) [2024-06-25 02:52:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13028802560. Throughput: 0: 42776.1. Samples: 13028954380. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:52:03,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 02:52:06,635][15401] Updated weights for policy 0, policy_version 795223 (0.0029) [2024-06-25 02:52:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13028999168. Throughput: 0: 43057.9. Samples: 13029095900. Policy #0 lag: (min: 1.0, avg: 8.5, max: 19.0) [2024-06-25 02:52:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 02:52:10,160][15401] Updated weights for policy 0, policy_version 795233 (0.0031) [2024-06-25 02:52:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13029212160. Throughput: 0: 43092.1. Samples: 13029353600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 02:52:13,992][15401] Updated weights for policy 0, policy_version 795243 (0.0032) [2024-06-25 02:52:17,518][15401] Updated weights for policy 0, policy_version 795253 (0.0039) [2024-06-25 02:52:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 13029457920. Throughput: 0: 42990.2. Samples: 13029603760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:18,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 02:52:21,810][15401] Updated weights for policy 0, policy_version 795263 (0.0031) [2024-06-25 02:52:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13029638144. Throughput: 0: 43199.5. Samples: 13029743780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 02:52:25,228][15401] Updated weights for policy 0, policy_version 795273 (0.0040) [2024-06-25 02:52:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13029851136. Throughput: 0: 43175.5. Samples: 13029997240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:28,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 02:52:29,366][15401] Updated weights for policy 0, policy_version 795283 (0.0031) [2024-06-25 02:52:32,721][15401] Updated weights for policy 0, policy_version 795293 (0.0043) [2024-06-25 02:52:33,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43688.8, 300 sec: 43153.4). Total num frames: 13030113280. Throughput: 0: 43019.1. Samples: 13030251720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:33,393][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 02:52:35,880][15349] Signal inference workers to stop experience collection... (192850 times) [2024-06-25 02:52:35,881][15349] Signal inference workers to resume experience collection... (192850 times) [2024-06-25 02:52:35,923][15401] InferenceWorker_p0-w0: stopping experience collection (192850 times) [2024-06-25 02:52:35,923][15401] InferenceWorker_p0-w0: resuming experience collection (192850 times) [2024-06-25 02:52:37,121][15401] Updated weights for policy 0, policy_version 795303 (0.0034) [2024-06-25 02:52:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13030293504. Throughput: 0: 43121.4. Samples: 13030386140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 02:52:40,558][15401] Updated weights for policy 0, policy_version 795313 (0.0032) [2024-06-25 02:52:43,391][15132] Fps is (10 sec: 39327.0, 60 sec: 43143.8, 300 sec: 42931.5). Total num frames: 13030506496. Throughput: 0: 42976.7. Samples: 13030637140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:43,391][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 02:52:44,749][15401] Updated weights for policy 0, policy_version 795323 (0.0030) [2024-06-25 02:52:48,199][15401] Updated weights for policy 0, policy_version 795333 (0.0032) [2024-06-25 02:52:48,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.9, 300 sec: 43097.9). Total num frames: 13030752256. Throughput: 0: 43177.2. Samples: 13030897460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:48,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 02:52:52,305][15401] Updated weights for policy 0, policy_version 795343 (0.0037) [2024-06-25 02:52:53,389][15132] Fps is (10 sec: 42603.0, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13030932480. Throughput: 0: 42975.1. Samples: 13031029780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:53,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 02:52:55,803][15401] Updated weights for policy 0, policy_version 795353 (0.0036) [2024-06-25 02:52:58,389][15132] Fps is (10 sec: 40970.2, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13031161856. Throughput: 0: 42822.3. Samples: 13031280600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:52:58,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 02:53:00,190][15401] Updated weights for policy 0, policy_version 795363 (0.0035) [2024-06-25 02:53:03,334][15401] Updated weights for policy 0, policy_version 795373 (0.0042) [2024-06-25 02:53:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13031391232. Throughput: 0: 42984.4. Samples: 13031538060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 02:53:07,669][15401] Updated weights for policy 0, policy_version 795383 (0.0039) [2024-06-25 02:53:08,390][15132] Fps is (10 sec: 42597.0, 60 sec: 43144.4, 300 sec: 43043.0). Total num frames: 13031587840. Throughput: 0: 42745.6. Samples: 13031667340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 02:53:10,997][15401] Updated weights for policy 0, policy_version 795393 (0.0035) [2024-06-25 02:53:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13031800832. Throughput: 0: 42886.4. Samples: 13031927120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:13,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-25 02:53:15,419][15401] Updated weights for policy 0, policy_version 795403 (0.0044) [2024-06-25 02:53:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.1, 300 sec: 42987.1). Total num frames: 13032013824. Throughput: 0: 42856.2. Samples: 13032180160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 02:53:18,650][15401] Updated weights for policy 0, policy_version 795413 (0.0034) [2024-06-25 02:53:23,053][15401] Updated weights for policy 0, policy_version 795423 (0.0033) [2024-06-25 02:53:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13032210432. Throughput: 0: 42763.1. Samples: 13032310480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:23,399][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 02:53:26,174][15401] Updated weights for policy 0, policy_version 795433 (0.0027) [2024-06-25 02:53:28,390][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13032439808. Throughput: 0: 42993.8. Samples: 13032571820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 02:53:30,526][15401] Updated weights for policy 0, policy_version 795443 (0.0028) [2024-06-25 02:53:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42987.5). Total num frames: 13032669184. Throughput: 0: 42845.8. Samples: 13032825420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:33,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 02:53:33,763][15401] Updated weights for policy 0, policy_version 795453 (0.0039) [2024-06-25 02:53:38,306][15401] Updated weights for policy 0, policy_version 795463 (0.0031) [2024-06-25 02:53:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13032865792. Throughput: 0: 42854.6. Samples: 13032958240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 02:53:41,467][15401] Updated weights for policy 0, policy_version 795473 (0.0032) [2024-06-25 02:53:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42872.2, 300 sec: 42876.1). Total num frames: 13033078784. Throughput: 0: 42963.4. Samples: 13033213960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 02:53:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795477_13033095168.pth... [2024-06-25 02:53:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000794848_13022789632.pth [2024-06-25 02:53:45,743][15401] Updated weights for policy 0, policy_version 795483 (0.0031) [2024-06-25 02:53:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 43042.7). Total num frames: 13033324544. Throughput: 0: 43015.5. Samples: 13033473760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 02:53:49,346][15401] Updated weights for policy 0, policy_version 795493 (0.0033) [2024-06-25 02:53:53,346][15401] Updated weights for policy 0, policy_version 795503 (0.0032) [2024-06-25 02:53:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13033521152. Throughput: 0: 43099.3. Samples: 13033606800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 02:53:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 02:53:56,846][15401] Updated weights for policy 0, policy_version 795513 (0.0034) [2024-06-25 02:53:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13033734144. Throughput: 0: 42907.6. Samples: 13033857960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:53:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 02:54:00,809][15401] Updated weights for policy 0, policy_version 795523 (0.0033) [2024-06-25 02:54:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13033963520. Throughput: 0: 42874.5. Samples: 13034109500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 02:54:04,575][15401] Updated weights for policy 0, policy_version 795533 (0.0033) [2024-06-25 02:54:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 13034160128. Throughput: 0: 42865.3. Samples: 13034239420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:08,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-25 02:54:08,461][15401] Updated weights for policy 0, policy_version 795543 (0.0039) [2024-06-25 02:54:12,203][15401] Updated weights for policy 0, policy_version 795553 (0.0037) [2024-06-25 02:54:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13034356736. Throughput: 0: 42719.2. Samples: 13034494180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:13,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 02:54:14,466][15349] Signal inference workers to stop experience collection... (192900 times) [2024-06-25 02:54:14,521][15349] Signal inference workers to resume experience collection... (192900 times) [2024-06-25 02:54:14,524][15401] InferenceWorker_p0-w0: stopping experience collection (192900 times) [2024-06-25 02:54:14,544][15401] InferenceWorker_p0-w0: resuming experience collection (192900 times) [2024-06-25 02:54:16,349][15401] Updated weights for policy 0, policy_version 795563 (0.0030) [2024-06-25 02:54:18,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43416.1, 300 sec: 43097.9). Total num frames: 13034618880. Throughput: 0: 42736.4. Samples: 13034748660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:18,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 02:54:19,796][15401] Updated weights for policy 0, policy_version 795573 (0.0036) [2024-06-25 02:54:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13034782720. Throughput: 0: 42670.6. Samples: 13034878520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:23,393][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 02:54:23,967][15401] Updated weights for policy 0, policy_version 795583 (0.0025) [2024-06-25 02:54:27,370][15401] Updated weights for policy 0, policy_version 795593 (0.0031) [2024-06-25 02:54:28,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 13034995712. Throughput: 0: 42550.8. Samples: 13035128740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 02:54:31,488][15401] Updated weights for policy 0, policy_version 795603 (0.0038) [2024-06-25 02:54:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 13035208704. Throughput: 0: 42525.5. Samples: 13035387400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 02:54:35,051][15401] Updated weights for policy 0, policy_version 795613 (0.0033) [2024-06-25 02:54:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13035421696. Throughput: 0: 42421.9. Samples: 13035515780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 02:54:39,288][15401] Updated weights for policy 0, policy_version 795623 (0.0038) [2024-06-25 02:54:42,912][15401] Updated weights for policy 0, policy_version 795633 (0.0028) [2024-06-25 02:54:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13035651072. Throughput: 0: 42509.7. Samples: 13035770900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 02:54:47,207][15401] Updated weights for policy 0, policy_version 795643 (0.0030) [2024-06-25 02:54:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 13035864064. Throughput: 0: 42424.8. Samples: 13036018620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 02:54:50,784][15401] Updated weights for policy 0, policy_version 795653 (0.0031) [2024-06-25 02:54:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13036060672. Throughput: 0: 42383.5. Samples: 13036146680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 02:54:54,655][15401] Updated weights for policy 0, policy_version 795663 (0.0035) [2024-06-25 02:54:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 13036290048. Throughput: 0: 42550.8. Samples: 13036408960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:54:58,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 02:54:58,499][15401] Updated weights for policy 0, policy_version 795673 (0.0034) [2024-06-25 02:55:02,438][15401] Updated weights for policy 0, policy_version 795683 (0.0031) [2024-06-25 02:55:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 13036503040. Throughput: 0: 42514.3. Samples: 13036661700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 02:55:06,215][15401] Updated weights for policy 0, policy_version 795693 (0.0034) [2024-06-25 02:55:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13036716032. Throughput: 0: 42562.7. Samples: 13036793740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:08,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 02:55:10,110][15401] Updated weights for policy 0, policy_version 795703 (0.0034) [2024-06-25 02:55:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13036912640. Throughput: 0: 42737.8. Samples: 13037051940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 02:55:14,024][15401] Updated weights for policy 0, policy_version 795713 (0.0031) [2024-06-25 02:55:17,666][15401] Updated weights for policy 0, policy_version 795723 (0.0029) [2024-06-25 02:55:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41780.9, 300 sec: 42877.0). Total num frames: 13037125632. Throughput: 0: 42687.0. Samples: 13037308320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:18,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 02:55:21,606][15401] Updated weights for policy 0, policy_version 795733 (0.0037) [2024-06-25 02:55:23,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 13037355008. Throughput: 0: 42738.5. Samples: 13037439020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 02:55:25,378][15401] Updated weights for policy 0, policy_version 795743 (0.0041) [2024-06-25 02:55:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13037568000. Throughput: 0: 42758.2. Samples: 13037695020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:28,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 02:55:28,904][15349] Signal inference workers to stop experience collection... (192950 times) [2024-06-25 02:55:28,952][15349] Signal inference workers to resume experience collection... (192950 times) [2024-06-25 02:55:28,953][15401] InferenceWorker_p0-w0: stopping experience collection (192950 times) [2024-06-25 02:55:28,988][15401] InferenceWorker_p0-w0: resuming experience collection (192950 times) [2024-06-25 02:55:29,087][15401] Updated weights for policy 0, policy_version 795753 (0.0029) [2024-06-25 02:55:32,943][15401] Updated weights for policy 0, policy_version 795763 (0.0028) [2024-06-25 02:55:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13037780992. Throughput: 0: 42937.5. Samples: 13037950800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 02:55:36,644][15401] Updated weights for policy 0, policy_version 795773 (0.0038) [2024-06-25 02:55:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13038010368. Throughput: 0: 43065.4. Samples: 13038084620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 02:55:40,799][15401] Updated weights for policy 0, policy_version 795783 (0.0043) [2024-06-25 02:55:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13038206976. Throughput: 0: 42878.7. Samples: 13038338500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 02:55:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 02:55:43,509][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795790_13038223360.pth... [2024-06-25 02:55:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795162_13027934208.pth [2024-06-25 02:55:44,621][15401] Updated weights for policy 0, policy_version 795793 (0.0034) [2024-06-25 02:55:48,389][15401] Updated weights for policy 0, policy_version 795803 (0.0035) [2024-06-25 02:55:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 13038436352. Throughput: 0: 43042.6. Samples: 13038598720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:55:48,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 02:55:52,307][15401] Updated weights for policy 0, policy_version 795813 (0.0028) [2024-06-25 02:55:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13038649344. Throughput: 0: 42912.1. Samples: 13038724780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:55:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 02:55:55,975][15401] Updated weights for policy 0, policy_version 795823 (0.0036) [2024-06-25 02:55:58,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13038829568. Throughput: 0: 42720.3. Samples: 13038974360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:55:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 02:55:59,928][15401] Updated weights for policy 0, policy_version 795833 (0.0035) [2024-06-25 02:56:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13039058944. Throughput: 0: 42688.0. Samples: 13039229280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:03,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 02:56:03,555][15401] Updated weights for policy 0, policy_version 795843 (0.0026) [2024-06-25 02:56:07,507][15401] Updated weights for policy 0, policy_version 795853 (0.0031) [2024-06-25 02:56:08,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 13039288320. Throughput: 0: 42765.1. Samples: 13039363720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:08,397][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 02:56:11,010][15401] Updated weights for policy 0, policy_version 795863 (0.0043) [2024-06-25 02:56:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13039468544. Throughput: 0: 42636.0. Samples: 13039613640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:13,390][15132] Avg episode reward: [(0, '0.195')] [2024-06-25 02:56:15,476][15401] Updated weights for policy 0, policy_version 795873 (0.0026) [2024-06-25 02:56:18,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13039714304. Throughput: 0: 42565.7. Samples: 13039866260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 02:56:18,726][15401] Updated weights for policy 0, policy_version 795883 (0.0033) [2024-06-25 02:56:23,203][15401] Updated weights for policy 0, policy_version 795893 (0.0029) [2024-06-25 02:56:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13039910912. Throughput: 0: 42681.3. Samples: 13040005280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 02:56:26,558][15401] Updated weights for policy 0, policy_version 795903 (0.0032) [2024-06-25 02:56:28,391][15132] Fps is (10 sec: 40954.9, 60 sec: 42597.5, 300 sec: 42820.4). Total num frames: 13040123904. Throughput: 0: 42552.9. Samples: 13040253440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:28,391][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 02:56:31,024][15401] Updated weights for policy 0, policy_version 795913 (0.0033) [2024-06-25 02:56:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13040353280. Throughput: 0: 42395.5. Samples: 13040506420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:33,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 02:56:34,441][15401] Updated weights for policy 0, policy_version 795923 (0.0038) [2024-06-25 02:56:38,390][15132] Fps is (10 sec: 40965.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 13040533504. Throughput: 0: 42582.6. Samples: 13040641000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:38,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 02:56:38,710][15401] Updated weights for policy 0, policy_version 795933 (0.0043) [2024-06-25 02:56:39,051][15349] Signal inference workers to stop experience collection... (193000 times) [2024-06-25 02:56:39,051][15349] Signal inference workers to resume experience collection... (193000 times) [2024-06-25 02:56:39,088][15401] InferenceWorker_p0-w0: stopping experience collection (193000 times) [2024-06-25 02:56:39,088][15401] InferenceWorker_p0-w0: resuming experience collection (193000 times) [2024-06-25 02:56:41,956][15401] Updated weights for policy 0, policy_version 795943 (0.0036) [2024-06-25 02:56:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 13040762880. Throughput: 0: 42661.2. Samples: 13040894220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:43,393][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 02:56:46,370][15401] Updated weights for policy 0, policy_version 795953 (0.0038) [2024-06-25 02:56:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 13041008640. Throughput: 0: 42774.2. Samples: 13041154120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 02:56:49,493][15401] Updated weights for policy 0, policy_version 795963 (0.0044) [2024-06-25 02:56:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13041172480. Throughput: 0: 42735.5. Samples: 13041286540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 02:56:53,986][15401] Updated weights for policy 0, policy_version 795973 (0.0036) [2024-06-25 02:56:57,092][15401] Updated weights for policy 0, policy_version 795983 (0.0033) [2024-06-25 02:56:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13041418240. Throughput: 0: 42803.2. Samples: 13041539780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:56:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 02:57:01,618][15401] Updated weights for policy 0, policy_version 795993 (0.0025) [2024-06-25 02:57:03,390][15132] Fps is (10 sec: 47512.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13041647616. Throughput: 0: 42839.4. Samples: 13041794040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 02:57:05,019][15401] Updated weights for policy 0, policy_version 796003 (0.0032) [2024-06-25 02:57:08,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42328.2, 300 sec: 42764.7). Total num frames: 13041827840. Throughput: 0: 42817.3. Samples: 13041932160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:08,392][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 02:57:09,160][15401] Updated weights for policy 0, policy_version 796013 (0.0031) [2024-06-25 02:57:12,454][15401] Updated weights for policy 0, policy_version 796023 (0.0042) [2024-06-25 02:57:13,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13042040832. Throughput: 0: 42941.3. Samples: 13042185740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:13,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-25 02:57:16,608][15401] Updated weights for policy 0, policy_version 796033 (0.0035) [2024-06-25 02:57:18,389][15132] Fps is (10 sec: 47525.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13042302976. Throughput: 0: 43017.0. Samples: 13042442180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 02:57:20,153][15401] Updated weights for policy 0, policy_version 796043 (0.0037) [2024-06-25 02:57:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13042483200. Throughput: 0: 43088.4. Samples: 13042579980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 02:57:24,267][15401] Updated weights for policy 0, policy_version 796053 (0.0033) [2024-06-25 02:57:27,640][15401] Updated weights for policy 0, policy_version 796063 (0.0045) [2024-06-25 02:57:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43145.5, 300 sec: 42709.8). Total num frames: 13042712576. Throughput: 0: 42959.2. Samples: 13042827280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 02:57:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 02:57:32,000][15401] Updated weights for policy 0, policy_version 796073 (0.0023) [2024-06-25 02:57:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13042941952. Throughput: 0: 43046.2. Samples: 13043091200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 02:57:35,141][15401] Updated weights for policy 0, policy_version 796083 (0.0040) [2024-06-25 02:57:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42820.7). Total num frames: 13043138560. Throughput: 0: 42821.7. Samples: 13043213520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 02:57:39,663][15401] Updated weights for policy 0, policy_version 796093 (0.0038) [2024-06-25 02:57:42,706][15401] Updated weights for policy 0, policy_version 796103 (0.0028) [2024-06-25 02:57:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43419.4, 300 sec: 42765.4). Total num frames: 13043367936. Throughput: 0: 42932.4. Samples: 13043471740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 02:57:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796104_13043367936.pth... [2024-06-25 02:57:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795477_13033095168.pth [2024-06-25 02:57:47,267][15401] Updated weights for policy 0, policy_version 796113 (0.0034) [2024-06-25 02:57:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13043564544. Throughput: 0: 43082.0. Samples: 13043732720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:48,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 02:57:50,741][15401] Updated weights for policy 0, policy_version 796123 (0.0033) [2024-06-25 02:57:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13043777536. Throughput: 0: 42726.4. Samples: 13043854740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:53,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 02:57:55,160][15401] Updated weights for policy 0, policy_version 796133 (0.0052) [2024-06-25 02:57:55,751][15349] Signal inference workers to stop experience collection... (193050 times) [2024-06-25 02:57:55,760][15349] Signal inference workers to resume experience collection... (193050 times) [2024-06-25 02:57:55,786][15401] InferenceWorker_p0-w0: stopping experience collection (193050 times) [2024-06-25 02:57:55,786][15401] InferenceWorker_p0-w0: resuming experience collection (193050 times) [2024-06-25 02:57:58,285][15401] Updated weights for policy 0, policy_version 796143 (0.0030) [2024-06-25 02:57:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13044006912. Throughput: 0: 42812.8. Samples: 13044112320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:57:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 02:58:02,626][15401] Updated weights for policy 0, policy_version 796153 (0.0037) [2024-06-25 02:58:03,391][15132] Fps is (10 sec: 42592.0, 60 sec: 42597.6, 300 sec: 42764.9). Total num frames: 13044203520. Throughput: 0: 42993.8. Samples: 13044376960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:03,391][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 02:58:05,946][15401] Updated weights for policy 0, policy_version 796163 (0.0032) [2024-06-25 02:58:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13044416512. Throughput: 0: 42629.0. Samples: 13044498280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 02:58:10,465][15401] Updated weights for policy 0, policy_version 796173 (0.0037) [2024-06-25 02:58:13,390][15132] Fps is (10 sec: 42603.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13044629504. Throughput: 0: 42795.4. Samples: 13044753080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 02:58:13,714][15401] Updated weights for policy 0, policy_version 796183 (0.0027) [2024-06-25 02:58:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 13044826112. Throughput: 0: 42718.6. Samples: 13045013540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 02:58:18,396][15401] Updated weights for policy 0, policy_version 796193 (0.0040) [2024-06-25 02:58:21,416][15401] Updated weights for policy 0, policy_version 796203 (0.0032) [2024-06-25 02:58:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13045055488. Throughput: 0: 42692.9. Samples: 13045134700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:23,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 02:58:26,091][15401] Updated weights for policy 0, policy_version 796213 (0.0033) [2024-06-25 02:58:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13045268480. Throughput: 0: 42742.6. Samples: 13045395160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:28,392][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 02:58:28,943][15401] Updated weights for policy 0, policy_version 796223 (0.0036) [2024-06-25 02:58:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13045465088. Throughput: 0: 42718.1. Samples: 13045655040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 02:58:33,569][15401] Updated weights for policy 0, policy_version 796233 (0.0043) [2024-06-25 02:58:36,815][15401] Updated weights for policy 0, policy_version 796243 (0.0038) [2024-06-25 02:58:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13045710848. Throughput: 0: 42731.4. Samples: 13045777660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 02:58:40,967][15401] Updated weights for policy 0, policy_version 796253 (0.0029) [2024-06-25 02:58:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13045907456. Throughput: 0: 42806.7. Samples: 13046038620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 02:58:44,414][15401] Updated weights for policy 0, policy_version 796263 (0.0025) [2024-06-25 02:58:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13046120448. Throughput: 0: 42587.0. Samples: 13046293320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:48,390][15132] Avg episode reward: [(0, '0.154')] [2024-06-25 02:58:48,548][15401] Updated weights for policy 0, policy_version 796273 (0.0044) [2024-06-25 02:58:52,344][15401] Updated weights for policy 0, policy_version 796283 (0.0041) [2024-06-25 02:58:53,393][15132] Fps is (10 sec: 45858.9, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 13046366208. Throughput: 0: 42750.3. Samples: 13046422200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:53,393][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 02:58:56,139][15401] Updated weights for policy 0, policy_version 796293 (0.0042) [2024-06-25 02:58:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13046546432. Throughput: 0: 42837.5. Samples: 13046680760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:58:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 02:59:00,025][15401] Updated weights for policy 0, policy_version 796303 (0.0045) [2024-06-25 02:59:03,390][15132] Fps is (10 sec: 40974.6, 60 sec: 42872.4, 300 sec: 42765.0). Total num frames: 13046775808. Throughput: 0: 42526.3. Samples: 13046927220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:59:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 02:59:04,154][15401] Updated weights for policy 0, policy_version 796313 (0.0040) [2024-06-25 02:59:07,635][15401] Updated weights for policy 0, policy_version 796323 (0.0044) [2024-06-25 02:59:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13046988800. Throughput: 0: 42755.5. Samples: 13047058700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:59:08,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 02:59:11,641][15401] Updated weights for policy 0, policy_version 796333 (0.0037) [2024-06-25 02:59:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13047185408. Throughput: 0: 42695.9. Samples: 13047316480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:59:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 02:59:15,253][15401] Updated weights for policy 0, policy_version 796343 (0.0026) [2024-06-25 02:59:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 13047414784. Throughput: 0: 42502.3. Samples: 13047567640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 02:59:18,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 02:59:19,250][15401] Updated weights for policy 0, policy_version 796353 (0.0040) [2024-06-25 02:59:23,208][15401] Updated weights for policy 0, policy_version 796363 (0.0044) [2024-06-25 02:59:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13047627776. Throughput: 0: 42816.5. Samples: 13047704400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 02:59:26,981][15401] Updated weights for policy 0, policy_version 796373 (0.0041) [2024-06-25 02:59:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13047824384. Throughput: 0: 42646.2. Samples: 13047957700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:28,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 02:59:30,391][15349] Signal inference workers to stop experience collection... (193100 times) [2024-06-25 02:59:30,396][15349] Signal inference workers to resume experience collection... (193100 times) [2024-06-25 02:59:30,439][15401] InferenceWorker_p0-w0: stopping experience collection (193100 times) [2024-06-25 02:59:30,439][15401] InferenceWorker_p0-w0: resuming experience collection (193100 times) [2024-06-25 02:59:30,678][15401] Updated weights for policy 0, policy_version 796383 (0.0041) [2024-06-25 02:59:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13048053760. Throughput: 0: 42612.5. Samples: 13048210880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:33,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 02:59:34,439][15401] Updated weights for policy 0, policy_version 796393 (0.0037) [2024-06-25 02:59:38,295][15401] Updated weights for policy 0, policy_version 796403 (0.0036) [2024-06-25 02:59:38,393][15132] Fps is (10 sec: 44220.8, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 13048266752. Throughput: 0: 42830.1. Samples: 13048349560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:38,394][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 02:59:41,968][15401] Updated weights for policy 0, policy_version 796413 (0.0045) [2024-06-25 02:59:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13048479744. Throughput: 0: 42596.1. Samples: 13048597580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 02:59:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796416_13048479744.pth... [2024-06-25 02:59:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000795790_13038223360.pth [2024-06-25 02:59:46,294][15401] Updated weights for policy 0, policy_version 796423 (0.0028) [2024-06-25 02:59:48,389][15132] Fps is (10 sec: 44253.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13048709120. Throughput: 0: 42713.8. Samples: 13048849340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:48,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 02:59:49,575][15401] Updated weights for policy 0, policy_version 796433 (0.0038) [2024-06-25 02:59:53,392][15132] Fps is (10 sec: 37673.7, 60 sec: 41506.9, 300 sec: 42598.0). Total num frames: 13048856576. Throughput: 0: 42791.9. Samples: 13048984440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:53,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 02:59:53,980][15401] Updated weights for policy 0, policy_version 796443 (0.0051) [2024-06-25 02:59:57,290][15401] Updated weights for policy 0, policy_version 796453 (0.0033) [2024-06-25 02:59:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13049135104. Throughput: 0: 42704.0. Samples: 13049238160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 02:59:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 03:00:01,621][15401] Updated weights for policy 0, policy_version 796463 (0.0033) [2024-06-25 03:00:03,392][15132] Fps is (10 sec: 49152.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13049348096. Throughput: 0: 42720.9. Samples: 13049490180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:03,392][15132] Avg episode reward: [(0, '0.852')] [2024-06-25 03:00:04,782][15401] Updated weights for policy 0, policy_version 796473 (0.0041) [2024-06-25 03:00:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13049544704. Throughput: 0: 42701.4. Samples: 13049625960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 03:00:09,070][15401] Updated weights for policy 0, policy_version 796483 (0.0042) [2024-06-25 03:00:12,645][15401] Updated weights for policy 0, policy_version 796493 (0.0042) [2024-06-25 03:00:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13049774080. Throughput: 0: 42743.7. Samples: 13049881160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:13,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 03:00:16,787][15401] Updated weights for policy 0, policy_version 796503 (0.0022) [2024-06-25 03:00:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13049987072. Throughput: 0: 42714.3. Samples: 13050133020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 03:00:20,076][15401] Updated weights for policy 0, policy_version 796513 (0.0027) [2024-06-25 03:00:21,944][15349] Signal inference workers to stop experience collection... (193150 times) [2024-06-25 03:00:21,978][15401] InferenceWorker_p0-w0: stopping experience collection (193150 times) [2024-06-25 03:00:22,067][15349] Signal inference workers to resume experience collection... (193150 times) [2024-06-25 03:00:22,067][15401] InferenceWorker_p0-w0: resuming experience collection (193150 times) [2024-06-25 03:00:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 13050183680. Throughput: 0: 42613.6. Samples: 13050267120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:23,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:00:24,531][15401] Updated weights for policy 0, policy_version 796523 (0.0028) [2024-06-25 03:00:27,641][15401] Updated weights for policy 0, policy_version 796533 (0.0029) [2024-06-25 03:00:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13050413056. Throughput: 0: 42822.7. Samples: 13050524600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 03:00:32,018][15401] Updated weights for policy 0, policy_version 796543 (0.0024) [2024-06-25 03:00:33,395][15132] Fps is (10 sec: 45862.5, 60 sec: 43140.8, 300 sec: 42819.8). Total num frames: 13050642432. Throughput: 0: 42937.7. Samples: 13050781760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:33,395][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 03:00:35,449][15401] Updated weights for policy 0, policy_version 796553 (0.0040) [2024-06-25 03:00:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 13050822656. Throughput: 0: 42744.5. Samples: 13050907840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 03:00:39,641][15401] Updated weights for policy 0, policy_version 796563 (0.0043) [2024-06-25 03:00:43,033][15401] Updated weights for policy 0, policy_version 796573 (0.0037) [2024-06-25 03:00:43,389][15132] Fps is (10 sec: 40981.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13051052032. Throughput: 0: 42878.0. Samples: 13051167660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 03:00:47,456][15401] Updated weights for policy 0, policy_version 796583 (0.0033) [2024-06-25 03:00:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13051265024. Throughput: 0: 42841.4. Samples: 13051417940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 03:00:50,711][15401] Updated weights for policy 0, policy_version 796593 (0.0035) [2024-06-25 03:00:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 13051461632. Throughput: 0: 42754.6. Samples: 13051549920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:53,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 03:00:55,059][15401] Updated weights for policy 0, policy_version 796603 (0.0037) [2024-06-25 03:00:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13051691008. Throughput: 0: 42759.5. Samples: 13051805340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:00:58,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 03:00:58,488][15401] Updated weights for policy 0, policy_version 796613 (0.0034) [2024-06-25 03:01:02,710][15401] Updated weights for policy 0, policy_version 796623 (0.0041) [2024-06-25 03:01:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42654.9). Total num frames: 13051871232. Throughput: 0: 42991.5. Samples: 13052067640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 03:01:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 03:01:06,387][15401] Updated weights for policy 0, policy_version 796633 (0.0037) [2024-06-25 03:01:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13052100608. Throughput: 0: 42768.9. Samples: 13052191620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 03:01:10,370][15401] Updated weights for policy 0, policy_version 796643 (0.0035) [2024-06-25 03:01:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13052329984. Throughput: 0: 42637.3. Samples: 13052443280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 03:01:14,101][15401] Updated weights for policy 0, policy_version 796653 (0.0036) [2024-06-25 03:01:17,995][15401] Updated weights for policy 0, policy_version 796663 (0.0032) [2024-06-25 03:01:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13052526592. Throughput: 0: 42607.2. Samples: 13052698860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:18,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 03:01:21,871][15401] Updated weights for policy 0, policy_version 796673 (0.0034) [2024-06-25 03:01:23,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42325.3, 300 sec: 42709.3). Total num frames: 13052723200. Throughput: 0: 42716.0. Samples: 13052830160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:23,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 03:01:25,690][15401] Updated weights for policy 0, policy_version 796683 (0.0036) [2024-06-25 03:01:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13052968960. Throughput: 0: 42524.0. Samples: 13053081240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 03:01:29,495][15401] Updated weights for policy 0, policy_version 796693 (0.0033) [2024-06-25 03:01:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42055.9, 300 sec: 42820.6). Total num frames: 13053165568. Throughput: 0: 42674.3. Samples: 13053338280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:33,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 03:01:33,399][15401] Updated weights for policy 0, policy_version 796703 (0.0036) [2024-06-25 03:01:37,147][15401] Updated weights for policy 0, policy_version 796713 (0.0049) [2024-06-25 03:01:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 13053378560. Throughput: 0: 42508.5. Samples: 13053462800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:38,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-25 03:01:41,176][15401] Updated weights for policy 0, policy_version 796723 (0.0057) [2024-06-25 03:01:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13053591552. Throughput: 0: 42581.3. Samples: 13053721500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:43,393][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:01:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796728_13053591552.pth... [2024-06-25 03:01:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796104_13043367936.pth [2024-06-25 03:01:45,317][15401] Updated weights for policy 0, policy_version 796733 (0.0036) [2024-06-25 03:01:46,629][15349] Signal inference workers to stop experience collection... (193200 times) [2024-06-25 03:01:46,668][15401] InferenceWorker_p0-w0: stopping experience collection (193200 times) [2024-06-25 03:01:46,675][15349] Signal inference workers to resume experience collection... (193200 times) [2024-06-25 03:01:46,682][15401] InferenceWorker_p0-w0: resuming experience collection (193200 times) [2024-06-25 03:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13053820928. Throughput: 0: 42372.4. Samples: 13053974400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 03:01:48,756][15401] Updated weights for policy 0, policy_version 796743 (0.0042) [2024-06-25 03:01:53,022][15401] Updated weights for policy 0, policy_version 796753 (0.0025) [2024-06-25 03:01:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 13054001152. Throughput: 0: 42486.7. Samples: 13054103620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:53,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:01:56,419][15401] Updated weights for policy 0, policy_version 796763 (0.0037) [2024-06-25 03:01:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13054246912. Throughput: 0: 42554.6. Samples: 13054358240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:01:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 03:02:00,981][15401] Updated weights for policy 0, policy_version 796773 (0.0042) [2024-06-25 03:02:03,389][15132] Fps is (10 sec: 45886.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 13054459904. Throughput: 0: 42569.3. Samples: 13054614480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 03:02:04,079][15401] Updated weights for policy 0, policy_version 796783 (0.0040) [2024-06-25 03:02:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 13054623744. Throughput: 0: 42489.9. Samples: 13054742100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 03:02:08,631][15401] Updated weights for policy 0, policy_version 796793 (0.0035) [2024-06-25 03:02:11,849][15401] Updated weights for policy 0, policy_version 796803 (0.0027) [2024-06-25 03:02:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13054869504. Throughput: 0: 42558.1. Samples: 13054996360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:13,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 03:02:16,301][15401] Updated weights for policy 0, policy_version 796813 (0.0033) [2024-06-25 03:02:18,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13055098880. Throughput: 0: 42440.4. Samples: 13055248100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 03:02:19,365][15401] Updated weights for policy 0, policy_version 796823 (0.0040) [2024-06-25 03:02:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 13055279104. Throughput: 0: 42647.1. Samples: 13055381920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:02:24,029][15401] Updated weights for policy 0, policy_version 796833 (0.0042) [2024-06-25 03:02:27,356][15401] Updated weights for policy 0, policy_version 796843 (0.0037) [2024-06-25 03:02:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13055524864. Throughput: 0: 42528.8. Samples: 13055635300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 03:02:31,784][15401] Updated weights for policy 0, policy_version 796853 (0.0029) [2024-06-25 03:02:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13055737856. Throughput: 0: 42490.3. Samples: 13055886460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 03:02:34,993][15401] Updated weights for policy 0, policy_version 796863 (0.0047) [2024-06-25 03:02:38,394][15132] Fps is (10 sec: 39304.3, 60 sec: 42322.1, 300 sec: 42542.2). Total num frames: 13055918080. Throughput: 0: 42478.4. Samples: 13056015240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:38,394][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 03:02:39,421][15401] Updated weights for policy 0, policy_version 796873 (0.0025) [2024-06-25 03:02:42,468][15401] Updated weights for policy 0, policy_version 796883 (0.0033) [2024-06-25 03:02:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13056163840. Throughput: 0: 42525.4. Samples: 13056271880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 03:02:47,181][15401] Updated weights for policy 0, policy_version 796893 (0.0034) [2024-06-25 03:02:48,389][15132] Fps is (10 sec: 45896.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13056376832. Throughput: 0: 42493.7. Samples: 13056526700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 03:02:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 03:02:50,673][15401] Updated weights for policy 0, policy_version 796903 (0.0032) [2024-06-25 03:02:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 13056573440. Throughput: 0: 42577.1. Samples: 13056658080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:02:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 03:02:54,744][15401] Updated weights for policy 0, policy_version 796913 (0.0029) [2024-06-25 03:02:56,576][15349] Signal inference workers to stop experience collection... (193250 times) [2024-06-25 03:02:56,576][15349] Signal inference workers to resume experience collection... (193250 times) [2024-06-25 03:02:56,612][15401] InferenceWorker_p0-w0: stopping experience collection (193250 times) [2024-06-25 03:02:56,612][15401] InferenceWorker_p0-w0: resuming experience collection (193250 times) [2024-06-25 03:02:58,330][15401] Updated weights for policy 0, policy_version 796923 (0.0041) [2024-06-25 03:02:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 13056786432. Throughput: 0: 42521.5. Samples: 13056909820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:02:58,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 03:03:02,307][15401] Updated weights for policy 0, policy_version 796933 (0.0037) [2024-06-25 03:03:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13056983040. Throughput: 0: 42753.7. Samples: 13057172020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 03:03:05,908][15401] Updated weights for policy 0, policy_version 796943 (0.0039) [2024-06-25 03:03:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43415.8, 300 sec: 42709.2). Total num frames: 13057228800. Throughput: 0: 42572.8. Samples: 13057297800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:08,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 03:03:10,042][15401] Updated weights for policy 0, policy_version 796953 (0.0061) [2024-06-25 03:03:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13057425408. Throughput: 0: 42669.9. Samples: 13057555440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 03:03:13,468][15401] Updated weights for policy 0, policy_version 796963 (0.0043) [2024-06-25 03:03:17,700][15401] Updated weights for policy 0, policy_version 796973 (0.0031) [2024-06-25 03:03:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13057638400. Throughput: 0: 42900.5. Samples: 13057816980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 03:03:21,024][15401] Updated weights for policy 0, policy_version 796983 (0.0036) [2024-06-25 03:03:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13057884160. Throughput: 0: 42887.5. Samples: 13057944980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 03:03:25,179][15401] Updated weights for policy 0, policy_version 796993 (0.0041) [2024-06-25 03:03:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13058080768. Throughput: 0: 42921.8. Samples: 13058203360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:28,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 03:03:28,502][15401] Updated weights for policy 0, policy_version 797003 (0.0028) [2024-06-25 03:03:32,952][15401] Updated weights for policy 0, policy_version 797013 (0.0029) [2024-06-25 03:03:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13058293760. Throughput: 0: 42957.7. Samples: 13058459800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 03:03:36,190][15401] Updated weights for policy 0, policy_version 797023 (0.0039) [2024-06-25 03:03:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43419.2, 300 sec: 42764.7). Total num frames: 13058523136. Throughput: 0: 42813.0. Samples: 13058584760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:38,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 03:03:40,494][15401] Updated weights for policy 0, policy_version 797033 (0.0032) [2024-06-25 03:03:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13058703360. Throughput: 0: 43020.0. Samples: 13058845720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 03:03:43,606][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797042_13058736128.pth... [2024-06-25 03:03:43,659][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796416_13048479744.pth [2024-06-25 03:03:43,838][15401] Updated weights for policy 0, policy_version 797043 (0.0026) [2024-06-25 03:03:48,011][15401] Updated weights for policy 0, policy_version 797053 (0.0039) [2024-06-25 03:03:48,389][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 13058916352. Throughput: 0: 42849.0. Samples: 13059100220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:48,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 03:03:51,542][15401] Updated weights for policy 0, policy_version 797063 (0.0047) [2024-06-25 03:03:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13059162112. Throughput: 0: 42888.1. Samples: 13059227660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:53,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 03:03:55,788][15401] Updated weights for policy 0, policy_version 797073 (0.0033) [2024-06-25 03:03:58,391][15132] Fps is (10 sec: 44232.1, 60 sec: 42870.7, 300 sec: 42653.8). Total num frames: 13059358720. Throughput: 0: 42882.1. Samples: 13059485180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:03:58,391][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 03:03:59,236][15401] Updated weights for policy 0, policy_version 797083 (0.0028) [2024-06-25 03:04:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13059555328. Throughput: 0: 42887.8. Samples: 13059746940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 03:04:03,492][15401] Updated weights for policy 0, policy_version 797093 (0.0045) [2024-06-25 03:04:07,137][15401] Updated weights for policy 0, policy_version 797103 (0.0037) [2024-06-25 03:04:08,389][15132] Fps is (10 sec: 45880.6, 60 sec: 43146.4, 300 sec: 42820.6). Total num frames: 13059817472. Throughput: 0: 42768.5. Samples: 13059869560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 03:04:11,103][15401] Updated weights for policy 0, policy_version 797113 (0.0045) [2024-06-25 03:04:13,052][15349] Signal inference workers to stop experience collection... (193300 times) [2024-06-25 03:04:13,054][15349] Signal inference workers to resume experience collection... (193300 times) [2024-06-25 03:04:13,069][15401] InferenceWorker_p0-w0: stopping experience collection (193300 times) [2024-06-25 03:04:13,104][15401] InferenceWorker_p0-w0: resuming experience collection (193300 times) [2024-06-25 03:04:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13060014080. Throughput: 0: 42810.2. Samples: 13060129820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 03:04:14,875][15401] Updated weights for policy 0, policy_version 797123 (0.0042) [2024-06-25 03:04:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13060210688. Throughput: 0: 42828.1. Samples: 13060387060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 03:04:18,570][15401] Updated weights for policy 0, policy_version 797133 (0.0043) [2024-06-25 03:04:22,276][15401] Updated weights for policy 0, policy_version 797143 (0.0051) [2024-06-25 03:04:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13060440064. Throughput: 0: 42915.9. Samples: 13060515880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 03:04:26,073][15401] Updated weights for policy 0, policy_version 797153 (0.0025) [2024-06-25 03:04:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13060636672. Throughput: 0: 42873.3. Samples: 13060775020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 03:04:29,662][15401] Updated weights for policy 0, policy_version 797163 (0.0026) [2024-06-25 03:04:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 13060849664. Throughput: 0: 42921.2. Samples: 13061031680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:33,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 03:04:33,718][15401] Updated weights for policy 0, policy_version 797173 (0.0033) [2024-06-25 03:04:37,261][15401] Updated weights for policy 0, policy_version 797183 (0.0036) [2024-06-25 03:04:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 13061095424. Throughput: 0: 42959.1. Samples: 13061160820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 03:04:41,442][15401] Updated weights for policy 0, policy_version 797193 (0.0026) [2024-06-25 03:04:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 13061275648. Throughput: 0: 42782.6. Samples: 13061410360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 03:04:44,879][15401] Updated weights for policy 0, policy_version 797203 (0.0037) [2024-06-25 03:04:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 13061488640. Throughput: 0: 42703.5. Samples: 13061668600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:48,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 03:04:49,280][15401] Updated weights for policy 0, policy_version 797213 (0.0031) [2024-06-25 03:04:52,359][15401] Updated weights for policy 0, policy_version 797223 (0.0032) [2024-06-25 03:04:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13061718016. Throughput: 0: 42776.7. Samples: 13061794520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 03:04:56,871][15401] Updated weights for policy 0, policy_version 797233 (0.0033) [2024-06-25 03:04:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42872.2, 300 sec: 42654.3). Total num frames: 13061931008. Throughput: 0: 42720.8. Samples: 13062052260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:04:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 03:05:00,586][15401] Updated weights for policy 0, policy_version 797243 (0.0043) [2024-06-25 03:05:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 13062127616. Throughput: 0: 42602.2. Samples: 13062304160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:03,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 03:05:04,812][15401] Updated weights for policy 0, policy_version 797253 (0.0044) [2024-06-25 03:05:08,170][15401] Updated weights for policy 0, policy_version 797263 (0.0037) [2024-06-25 03:05:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13062356992. Throughput: 0: 42454.2. Samples: 13062426320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 03:05:12,707][15401] Updated weights for policy 0, policy_version 797273 (0.0046) [2024-06-25 03:05:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13062553600. Throughput: 0: 42423.6. Samples: 13062684080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 03:05:15,978][15401] Updated weights for policy 0, policy_version 797283 (0.0043) [2024-06-25 03:05:18,393][15132] Fps is (10 sec: 40944.3, 60 sec: 42595.6, 300 sec: 42653.7). Total num frames: 13062766592. Throughput: 0: 42280.0. Samples: 13062934440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:18,394][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 03:05:20,258][15401] Updated weights for policy 0, policy_version 797293 (0.0032) [2024-06-25 03:05:23,370][15401] Updated weights for policy 0, policy_version 797303 (0.0035) [2024-06-25 03:05:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13063012352. Throughput: 0: 42398.7. Samples: 13063068760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:23,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 03:05:28,051][15401] Updated weights for policy 0, policy_version 797313 (0.0043) [2024-06-25 03:05:28,128][15349] Signal inference workers to stop experience collection... (193350 times) [2024-06-25 03:05:28,162][15401] InferenceWorker_p0-w0: stopping experience collection (193350 times) [2024-06-25 03:05:28,196][15349] Signal inference workers to resume experience collection... (193350 times) [2024-06-25 03:05:28,196][15401] InferenceWorker_p0-w0: resuming experience collection (193350 times) [2024-06-25 03:05:28,389][15132] Fps is (10 sec: 44254.5, 60 sec: 42871.5, 300 sec: 42599.2). Total num frames: 13063208960. Throughput: 0: 42684.2. Samples: 13063331140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 03:05:31,171][15401] Updated weights for policy 0, policy_version 797323 (0.0031) [2024-06-25 03:05:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13063421952. Throughput: 0: 42484.7. Samples: 13063580400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 03:05:35,578][15401] Updated weights for policy 0, policy_version 797333 (0.0043) [2024-06-25 03:05:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13063634944. Throughput: 0: 42565.9. Samples: 13063709980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 03:05:38,753][15401] Updated weights for policy 0, policy_version 797343 (0.0039) [2024-06-25 03:05:43,115][15401] Updated weights for policy 0, policy_version 797353 (0.0035) [2024-06-25 03:05:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13063831552. Throughput: 0: 42740.5. Samples: 13063975580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 03:05:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797354_13063847936.pth... [2024-06-25 03:05:43,561][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000796728_13053591552.pth [2024-06-25 03:05:46,395][15401] Updated weights for policy 0, policy_version 797363 (0.0027) [2024-06-25 03:05:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13064044544. Throughput: 0: 42694.6. Samples: 13064225420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 03:05:50,990][15401] Updated weights for policy 0, policy_version 797373 (0.0041) [2024-06-25 03:05:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13064290304. Throughput: 0: 42931.1. Samples: 13064358220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 03:05:53,952][15401] Updated weights for policy 0, policy_version 797383 (0.0029) [2024-06-25 03:05:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13064454144. Throughput: 0: 42924.4. Samples: 13064615680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:05:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 03:05:58,704][15401] Updated weights for policy 0, policy_version 797393 (0.0032) [2024-06-25 03:06:01,644][15401] Updated weights for policy 0, policy_version 797403 (0.0046) [2024-06-25 03:06:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13064699904. Throughput: 0: 42913.5. Samples: 13064865380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:06:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 03:06:06,307][15401] Updated weights for policy 0, policy_version 797413 (0.0038) [2024-06-25 03:06:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13064929280. Throughput: 0: 42978.1. Samples: 13065002780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:06:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 03:06:09,438][15401] Updated weights for policy 0, policy_version 797423 (0.0038) [2024-06-25 03:06:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13065093120. Throughput: 0: 42647.9. Samples: 13065250300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:06:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 03:06:13,997][15401] Updated weights for policy 0, policy_version 797433 (0.0023) [2024-06-25 03:06:17,046][15401] Updated weights for policy 0, policy_version 797443 (0.0040) [2024-06-25 03:06:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42874.4, 300 sec: 42765.4). Total num frames: 13065338880. Throughput: 0: 42715.1. Samples: 13065502580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:06:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 03:06:21,583][15401] Updated weights for policy 0, policy_version 797453 (0.0041) [2024-06-25 03:06:23,390][15132] Fps is (10 sec: 49151.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13065584640. Throughput: 0: 42951.4. Samples: 13065642800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 03:06:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 03:06:24,683][15401] Updated weights for policy 0, policy_version 797463 (0.0031) [2024-06-25 03:06:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13065732096. Throughput: 0: 42546.7. Samples: 13065890180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 03:06:29,346][15401] Updated weights for policy 0, policy_version 797473 (0.0039) [2024-06-25 03:06:32,369][15401] Updated weights for policy 0, policy_version 797483 (0.0027) [2024-06-25 03:06:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13065994240. Throughput: 0: 42667.0. Samples: 13066145440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 03:06:36,955][15401] Updated weights for policy 0, policy_version 797493 (0.0025) [2024-06-25 03:06:38,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13066207232. Throughput: 0: 42839.6. Samples: 13066286000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:38,390][15132] Avg episode reward: [(0, '0.179')] [2024-06-25 03:06:39,977][15401] Updated weights for policy 0, policy_version 797503 (0.0040) [2024-06-25 03:06:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13066387456. Throughput: 0: 42548.5. Samples: 13066530360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:43,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 03:06:44,856][15401] Updated weights for policy 0, policy_version 797513 (0.0044) [2024-06-25 03:06:47,619][15401] Updated weights for policy 0, policy_version 797523 (0.0042) [2024-06-25 03:06:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 13066633216. Throughput: 0: 42655.0. Samples: 13066784860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:48,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 03:06:52,372][15401] Updated weights for policy 0, policy_version 797533 (0.0045) [2024-06-25 03:06:53,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13066846208. Throughput: 0: 42542.1. Samples: 13066917280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:53,393][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 03:06:55,257][15401] Updated weights for policy 0, policy_version 797543 (0.0035) [2024-06-25 03:06:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13067026432. Throughput: 0: 42696.8. Samples: 13067171660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:06:58,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 03:06:59,959][15401] Updated weights for policy 0, policy_version 797553 (0.0045) [2024-06-25 03:07:00,647][15349] Signal inference workers to stop experience collection... (193400 times) [2024-06-25 03:07:00,678][15401] InferenceWorker_p0-w0: stopping experience collection (193400 times) [2024-06-25 03:07:00,695][15349] Signal inference workers to resume experience collection... (193400 times) [2024-06-25 03:07:00,697][15401] InferenceWorker_p0-w0: resuming experience collection (193400 times) [2024-06-25 03:07:03,357][15401] Updated weights for policy 0, policy_version 797563 (0.0039) [2024-06-25 03:07:03,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13067272192. Throughput: 0: 42743.5. Samples: 13067426040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 03:07:07,485][15401] Updated weights for policy 0, policy_version 797573 (0.0037) [2024-06-25 03:07:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13067468800. Throughput: 0: 42526.3. Samples: 13067556480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 03:07:10,813][15401] Updated weights for policy 0, policy_version 797583 (0.0036) [2024-06-25 03:07:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13067665408. Throughput: 0: 42513.7. Samples: 13067803300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 03:07:15,181][15401] Updated weights for policy 0, policy_version 797593 (0.0047) [2024-06-25 03:07:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13067911168. Throughput: 0: 42570.3. Samples: 13068061100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 03:07:18,586][15401] Updated weights for policy 0, policy_version 797603 (0.0033) [2024-06-25 03:07:22,922][15401] Updated weights for policy 0, policy_version 797613 (0.0032) [2024-06-25 03:07:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 13068107776. Throughput: 0: 42434.2. Samples: 13068195540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 03:07:26,165][15401] Updated weights for policy 0, policy_version 797623 (0.0035) [2024-06-25 03:07:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 13068320768. Throughput: 0: 42436.3. Samples: 13068440000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 03:07:30,561][15401] Updated weights for policy 0, policy_version 797633 (0.0026) [2024-06-25 03:07:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42821.2). Total num frames: 13068550144. Throughput: 0: 42505.4. Samples: 13068697600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 03:07:33,801][15401] Updated weights for policy 0, policy_version 797643 (0.0033) [2024-06-25 03:07:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13068730368. Throughput: 0: 42505.5. Samples: 13068829920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:38,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 03:07:38,419][15401] Updated weights for policy 0, policy_version 797653 (0.0037) [2024-06-25 03:07:41,854][15401] Updated weights for policy 0, policy_version 797663 (0.0032) [2024-06-25 03:07:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13068976128. Throughput: 0: 42422.4. Samples: 13069080660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 03:07:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797667_13068976128.pth... [2024-06-25 03:07:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797042_13058736128.pth [2024-06-25 03:07:45,911][15401] Updated weights for policy 0, policy_version 797673 (0.0027) [2024-06-25 03:07:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13069189120. Throughput: 0: 42612.0. Samples: 13069343580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 03:07:49,534][15401] Updated weights for policy 0, policy_version 797683 (0.0040) [2024-06-25 03:07:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 13069369344. Throughput: 0: 42484.5. Samples: 13069468280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 03:07:53,584][15401] Updated weights for policy 0, policy_version 797693 (0.0046) [2024-06-25 03:07:57,051][15401] Updated weights for policy 0, policy_version 797703 (0.0035) [2024-06-25 03:07:58,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 13069598720. Throughput: 0: 42697.8. Samples: 13069724800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:07:58,393][15132] Avg episode reward: [(0, '0.858')] [2024-06-25 03:08:01,489][15401] Updated weights for policy 0, policy_version 797713 (0.0028) [2024-06-25 03:08:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 13069844480. Throughput: 0: 42676.0. Samples: 13069981520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:08:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 03:08:04,222][15349] Signal inference workers to stop experience collection... (193450 times) [2024-06-25 03:08:04,227][15349] Signal inference workers to resume experience collection... (193450 times) [2024-06-25 03:08:04,270][15401] InferenceWorker_p0-w0: stopping experience collection (193450 times) [2024-06-25 03:08:04,270][15401] InferenceWorker_p0-w0: resuming experience collection (193450 times) [2024-06-25 03:08:04,536][15401] Updated weights for policy 0, policy_version 797723 (0.0035) [2024-06-25 03:08:08,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13070008320. Throughput: 0: 42639.5. Samples: 13070114320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:08:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 03:08:09,030][15401] Updated weights for policy 0, policy_version 797733 (0.0024) [2024-06-25 03:08:12,081][15401] Updated weights for policy 0, policy_version 797743 (0.0045) [2024-06-25 03:08:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13070254080. Throughput: 0: 42763.5. Samples: 13070364360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 03:08:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:08:16,807][15401] Updated weights for policy 0, policy_version 797753 (0.0035) [2024-06-25 03:08:18,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13070483456. Throughput: 0: 42896.1. Samples: 13070627920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 03:08:19,645][15401] Updated weights for policy 0, policy_version 797763 (0.0042) [2024-06-25 03:08:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13070647296. Throughput: 0: 42796.0. Samples: 13070755740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 03:08:24,230][15401] Updated weights for policy 0, policy_version 797773 (0.0039) [2024-06-25 03:08:27,614][15401] Updated weights for policy 0, policy_version 797783 (0.0036) [2024-06-25 03:08:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13070909440. Throughput: 0: 42876.0. Samples: 13071010080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 03:08:31,855][15401] Updated weights for policy 0, policy_version 797793 (0.0024) [2024-06-25 03:08:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13071106048. Throughput: 0: 42817.7. Samples: 13071270380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:33,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 03:08:35,041][15401] Updated weights for policy 0, policy_version 797803 (0.0031) [2024-06-25 03:08:38,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13071286272. Throughput: 0: 42764.3. Samples: 13071392680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:38,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 03:08:39,678][15401] Updated weights for policy 0, policy_version 797813 (0.0035) [2024-06-25 03:08:42,755][15401] Updated weights for policy 0, policy_version 797823 (0.0037) [2024-06-25 03:08:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13071548416. Throughput: 0: 42860.4. Samples: 13071653420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 03:08:47,177][15401] Updated weights for policy 0, policy_version 797833 (0.0035) [2024-06-25 03:08:48,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13071761408. Throughput: 0: 42881.7. Samples: 13071911200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 03:08:50,607][15401] Updated weights for policy 0, policy_version 797843 (0.0041) [2024-06-25 03:08:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 13071941632. Throughput: 0: 42806.8. Samples: 13072040620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 03:08:54,842][15401] Updated weights for policy 0, policy_version 797853 (0.0026) [2024-06-25 03:08:57,983][15401] Updated weights for policy 0, policy_version 797863 (0.0031) [2024-06-25 03:08:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 13072187392. Throughput: 0: 43000.1. Samples: 13072299360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:08:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:09:02,431][15401] Updated weights for policy 0, policy_version 797873 (0.0039) [2024-06-25 03:09:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13072400384. Throughput: 0: 42943.8. Samples: 13072560400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 03:09:05,448][15401] Updated weights for policy 0, policy_version 797883 (0.0033) [2024-06-25 03:09:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13072596992. Throughput: 0: 42911.0. Samples: 13072686740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 03:09:09,985][15401] Updated weights for policy 0, policy_version 797893 (0.0036) [2024-06-25 03:09:13,268][15401] Updated weights for policy 0, policy_version 797903 (0.0037) [2024-06-25 03:09:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 13072842752. Throughput: 0: 42897.8. Samples: 13072940480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 03:09:17,611][15401] Updated weights for policy 0, policy_version 797913 (0.0031) [2024-06-25 03:09:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13073055744. Throughput: 0: 42988.5. Samples: 13073204860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 03:09:20,683][15401] Updated weights for policy 0, policy_version 797923 (0.0028) [2024-06-25 03:09:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13073235968. Throughput: 0: 42983.2. Samples: 13073326920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 03:09:25,281][15401] Updated weights for policy 0, policy_version 797933 (0.0034) [2024-06-25 03:09:26,266][15349] Signal inference workers to stop experience collection... (193500 times) [2024-06-25 03:09:26,266][15349] Signal inference workers to resume experience collection... (193500 times) [2024-06-25 03:09:26,284][15401] InferenceWorker_p0-w0: stopping experience collection (193500 times) [2024-06-25 03:09:26,284][15401] InferenceWorker_p0-w0: resuming experience collection (193500 times) [2024-06-25 03:09:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13073481728. Throughput: 0: 42960.5. Samples: 13073586640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:28,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 03:09:28,517][15401] Updated weights for policy 0, policy_version 797943 (0.0041) [2024-06-25 03:09:32,888][15401] Updated weights for policy 0, policy_version 797953 (0.0037) [2024-06-25 03:09:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13073661952. Throughput: 0: 43037.3. Samples: 13073847880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 03:09:36,108][15401] Updated weights for policy 0, policy_version 797963 (0.0033) [2024-06-25 03:09:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13073874944. Throughput: 0: 42878.6. Samples: 13073970160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:09:40,386][15401] Updated weights for policy 0, policy_version 797973 (0.0041) [2024-06-25 03:09:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13074104320. Throughput: 0: 43035.9. Samples: 13074235980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:43,396][15132] Avg episode reward: [(0, '0.245')] [2024-06-25 03:09:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797981_13074120704.pth... [2024-06-25 03:09:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797354_13063847936.pth [2024-06-25 03:09:43,787][15401] Updated weights for policy 0, policy_version 797983 (0.0033) [2024-06-25 03:09:48,026][15401] Updated weights for policy 0, policy_version 797993 (0.0033) [2024-06-25 03:09:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13074317312. Throughput: 0: 42838.3. Samples: 13074488120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:48,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 03:09:51,366][15401] Updated weights for policy 0, policy_version 798003 (0.0032) [2024-06-25 03:09:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 13074530304. Throughput: 0: 42911.9. Samples: 13074617780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 03:09:55,604][15401] Updated weights for policy 0, policy_version 798013 (0.0037) [2024-06-25 03:09:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13074759680. Throughput: 0: 43109.8. Samples: 13074880420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 03:09:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 03:09:59,111][15401] Updated weights for policy 0, policy_version 798023 (0.0042) [2024-06-25 03:10:03,279][15401] Updated weights for policy 0, policy_version 798033 (0.0042) [2024-06-25 03:10:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13074972672. Throughput: 0: 42934.2. Samples: 13075136900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:03,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 03:10:07,092][15401] Updated weights for policy 0, policy_version 798043 (0.0038) [2024-06-25 03:10:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13075185664. Throughput: 0: 43084.9. Samples: 13075265740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 03:10:10,839][15401] Updated weights for policy 0, policy_version 798053 (0.0027) [2024-06-25 03:10:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 13075398656. Throughput: 0: 43030.3. Samples: 13075523000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:13,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-25 03:10:14,616][15401] Updated weights for policy 0, policy_version 798063 (0.0043) [2024-06-25 03:10:18,391][15401] Updated weights for policy 0, policy_version 798073 (0.0033) [2024-06-25 03:10:18,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 13075628032. Throughput: 0: 42943.3. Samples: 13075780600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:18,397][15132] Avg episode reward: [(0, '0.845')] [2024-06-25 03:10:22,383][15401] Updated weights for policy 0, policy_version 798083 (0.0027) [2024-06-25 03:10:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13075824640. Throughput: 0: 43063.5. Samples: 13075908020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 03:10:26,235][15401] Updated weights for policy 0, policy_version 798093 (0.0028) [2024-06-25 03:10:28,392][15132] Fps is (10 sec: 40976.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 13076037632. Throughput: 0: 42826.3. Samples: 13076163260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:28,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 03:10:29,861][15401] Updated weights for policy 0, policy_version 798103 (0.0032) [2024-06-25 03:10:33,394][15132] Fps is (10 sec: 42579.8, 60 sec: 43141.4, 300 sec: 42764.4). Total num frames: 13076250624. Throughput: 0: 42902.4. Samples: 13076418920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:33,394][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 03:10:33,940][15401] Updated weights for policy 0, policy_version 798113 (0.0031) [2024-06-25 03:10:37,625][15401] Updated weights for policy 0, policy_version 798123 (0.0034) [2024-06-25 03:10:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13076480000. Throughput: 0: 42861.0. Samples: 13076546520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:38,391][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 03:10:41,664][15401] Updated weights for policy 0, policy_version 798133 (0.0039) [2024-06-25 03:10:43,390][15132] Fps is (10 sec: 40977.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13076660224. Throughput: 0: 42713.7. Samples: 13076802540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 03:10:45,385][15401] Updated weights for policy 0, policy_version 798143 (0.0038) [2024-06-25 03:10:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13076905984. Throughput: 0: 42570.5. Samples: 13077052580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 03:10:49,914][15401] Updated weights for policy 0, policy_version 798153 (0.0035) [2024-06-25 03:10:51,854][15349] Signal inference workers to stop experience collection... (193550 times) [2024-06-25 03:10:51,914][15349] Signal inference workers to resume experience collection... (193550 times) [2024-06-25 03:10:51,937][15401] InferenceWorker_p0-w0: stopping experience collection (193550 times) [2024-06-25 03:10:51,969][15401] InferenceWorker_p0-w0: resuming experience collection (193550 times) [2024-06-25 03:10:53,124][15401] Updated weights for policy 0, policy_version 798163 (0.0029) [2024-06-25 03:10:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13077118976. Throughput: 0: 42629.7. Samples: 13077184080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:53,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-25 03:10:57,322][15401] Updated weights for policy 0, policy_version 798173 (0.0040) [2024-06-25 03:10:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 13077315584. Throughput: 0: 42650.5. Samples: 13077442280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:10:58,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 03:11:00,560][15401] Updated weights for policy 0, policy_version 798183 (0.0027) [2024-06-25 03:11:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13077528576. Throughput: 0: 42735.0. Samples: 13077703400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 03:11:05,144][15401] Updated weights for policy 0, policy_version 798193 (0.0022) [2024-06-25 03:11:08,299][15401] Updated weights for policy 0, policy_version 798203 (0.0042) [2024-06-25 03:11:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13077757952. Throughput: 0: 42705.7. Samples: 13077829780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 03:11:12,631][15401] Updated weights for policy 0, policy_version 798213 (0.0035) [2024-06-25 03:11:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13077954560. Throughput: 0: 42733.3. Samples: 13078086160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 03:11:15,844][15401] Updated weights for policy 0, policy_version 798223 (0.0040) [2024-06-25 03:11:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 13078183936. Throughput: 0: 42728.2. Samples: 13078341500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 03:11:20,202][15401] Updated weights for policy 0, policy_version 798233 (0.0030) [2024-06-25 03:11:23,395][15132] Fps is (10 sec: 44214.5, 60 sec: 42867.8, 300 sec: 42930.9). Total num frames: 13078396928. Throughput: 0: 42866.7. Samples: 13078475740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:23,395][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 03:11:23,617][15401] Updated weights for policy 0, policy_version 798243 (0.0028) [2024-06-25 03:11:27,919][15401] Updated weights for policy 0, policy_version 798253 (0.0024) [2024-06-25 03:11:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 13078577152. Throughput: 0: 42873.8. Samples: 13078731860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 03:11:31,025][15401] Updated weights for policy 0, policy_version 798263 (0.0034) [2024-06-25 03:11:33,390][15132] Fps is (10 sec: 42620.0, 60 sec: 42874.6, 300 sec: 42765.0). Total num frames: 13078822912. Throughput: 0: 42960.5. Samples: 13078985800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 03:11:35,411][15401] Updated weights for policy 0, policy_version 798273 (0.0027) [2024-06-25 03:11:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13079035904. Throughput: 0: 43069.3. Samples: 13079122200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 03:11:39,041][15401] Updated weights for policy 0, policy_version 798283 (0.0032) [2024-06-25 03:11:43,244][15401] Updated weights for policy 0, policy_version 798293 (0.0034) [2024-06-25 03:11:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13079232512. Throughput: 0: 42920.6. Samples: 13079373700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:11:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 03:11:43,504][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798294_13079248896.pth... [2024-06-25 03:11:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797667_13068976128.pth [2024-06-25 03:11:46,437][15401] Updated weights for policy 0, policy_version 798303 (0.0033) [2024-06-25 03:11:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 13079478272. Throughput: 0: 42847.6. Samples: 13079631540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:11:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 03:11:50,737][15401] Updated weights for policy 0, policy_version 798313 (0.0027) [2024-06-25 03:11:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13079674880. Throughput: 0: 43086.6. Samples: 13079768680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:11:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 03:11:53,880][15401] Updated weights for policy 0, policy_version 798323 (0.0039) [2024-06-25 03:11:58,261][15401] Updated weights for policy 0, policy_version 798333 (0.0033) [2024-06-25 03:11:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13079887872. Throughput: 0: 42842.3. Samples: 13080014060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:11:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 03:12:01,528][15401] Updated weights for policy 0, policy_version 798343 (0.0028) [2024-06-25 03:12:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13080133632. Throughput: 0: 43037.7. Samples: 13080278200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 03:12:05,790][15401] Updated weights for policy 0, policy_version 798353 (0.0040) [2024-06-25 03:12:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13080313856. Throughput: 0: 43005.4. Samples: 13080410760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 03:12:09,056][15401] Updated weights for policy 0, policy_version 798363 (0.0029) [2024-06-25 03:12:13,214][15401] Updated weights for policy 0, policy_version 798373 (0.0036) [2024-06-25 03:12:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13080543232. Throughput: 0: 42895.1. Samples: 13080662140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 03:12:16,923][15401] Updated weights for policy 0, policy_version 798383 (0.0034) [2024-06-25 03:12:18,365][15349] Signal inference workers to stop experience collection... (193600 times) [2024-06-25 03:12:18,365][15349] Signal inference workers to resume experience collection... (193600 times) [2024-06-25 03:12:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13080772608. Throughput: 0: 42957.0. Samples: 13080918860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:18,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 03:12:18,406][15401] InferenceWorker_p0-w0: stopping experience collection (193600 times) [2024-06-25 03:12:18,406][15401] InferenceWorker_p0-w0: resuming experience collection (193600 times) [2024-06-25 03:12:21,151][15401] Updated weights for policy 0, policy_version 798393 (0.0033) [2024-06-25 03:12:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42602.1, 300 sec: 42820.6). Total num frames: 13080952832. Throughput: 0: 42959.7. Samples: 13081055380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:23,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 03:12:24,515][15401] Updated weights for policy 0, policy_version 798403 (0.0030) [2024-06-25 03:12:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13081182208. Throughput: 0: 43003.0. Samples: 13081308840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:28,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 03:12:28,581][15401] Updated weights for policy 0, policy_version 798413 (0.0035) [2024-06-25 03:12:31,954][15401] Updated weights for policy 0, policy_version 798423 (0.0041) [2024-06-25 03:12:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13081411584. Throughput: 0: 43137.3. Samples: 13081572720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 03:12:36,114][15401] Updated weights for policy 0, policy_version 798433 (0.0035) [2024-06-25 03:12:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13081575424. Throughput: 0: 42952.6. Samples: 13081701540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 03:12:39,618][15401] Updated weights for policy 0, policy_version 798443 (0.0022) [2024-06-25 03:12:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13081821184. Throughput: 0: 43123.6. Samples: 13081954620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 03:12:43,599][15401] Updated weights for policy 0, policy_version 798453 (0.0038) [2024-06-25 03:12:47,210][15401] Updated weights for policy 0, policy_version 798463 (0.0045) [2024-06-25 03:12:48,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13082066944. Throughput: 0: 43105.5. Samples: 13082217940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 03:12:51,149][15401] Updated weights for policy 0, policy_version 798473 (0.0025) [2024-06-25 03:12:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42820.9). Total num frames: 13082230784. Throughput: 0: 43131.1. Samples: 13082351660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 03:12:54,851][15401] Updated weights for policy 0, policy_version 798483 (0.0047) [2024-06-25 03:12:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13082476544. Throughput: 0: 43160.9. Samples: 13082604380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:12:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 03:12:58,658][15401] Updated weights for policy 0, policy_version 798493 (0.0038) [2024-06-25 03:13:02,491][15401] Updated weights for policy 0, policy_version 798503 (0.0042) [2024-06-25 03:13:03,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 13082722304. Throughput: 0: 43320.5. Samples: 13082868280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 03:13:06,179][15401] Updated weights for policy 0, policy_version 798513 (0.0037) [2024-06-25 03:13:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13082886144. Throughput: 0: 43131.0. Samples: 13082996280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:08,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 03:13:10,325][15401] Updated weights for policy 0, policy_version 798523 (0.0030) [2024-06-25 03:13:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13083131904. Throughput: 0: 43079.1. Samples: 13083247400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 03:13:13,742][15401] Updated weights for policy 0, policy_version 798533 (0.0033) [2024-06-25 03:13:17,826][15401] Updated weights for policy 0, policy_version 798543 (0.0036) [2024-06-25 03:13:18,392][15132] Fps is (10 sec: 47502.3, 60 sec: 43142.8, 300 sec: 43097.9). Total num frames: 13083361280. Throughput: 0: 43064.8. Samples: 13083510740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:18,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 03:13:21,315][15401] Updated weights for policy 0, policy_version 798553 (0.0037) [2024-06-25 03:13:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13083541504. Throughput: 0: 43015.9. Samples: 13083637260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 03:13:25,518][15401] Updated weights for policy 0, policy_version 798563 (0.0031) [2024-06-25 03:13:27,001][15349] Signal inference workers to stop experience collection... (193650 times) [2024-06-25 03:13:27,001][15349] Signal inference workers to resume experience collection... (193650 times) [2024-06-25 03:13:27,019][15401] InferenceWorker_p0-w0: stopping experience collection (193650 times) [2024-06-25 03:13:27,019][15401] InferenceWorker_p0-w0: resuming experience collection (193650 times) [2024-06-25 03:13:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13083787264. Throughput: 0: 43011.1. Samples: 13083890120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 03:13:28,744][15401] Updated weights for policy 0, policy_version 798573 (0.0045) [2024-06-25 03:13:33,097][15401] Updated weights for policy 0, policy_version 798583 (0.0034) [2024-06-25 03:13:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 13084000256. Throughput: 0: 43046.2. Samples: 13084155020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:13:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 03:13:36,461][15401] Updated weights for policy 0, policy_version 798593 (0.0034) [2024-06-25 03:13:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13084180480. Throughput: 0: 42847.8. Samples: 13084279820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:13:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 03:13:40,536][15401] Updated weights for policy 0, policy_version 798603 (0.0033) [2024-06-25 03:13:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 13084442624. Throughput: 0: 43065.8. Samples: 13084542340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:13:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 03:13:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798612_13084459008.pth... [2024-06-25 03:13:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000797981_13074120704.pth [2024-06-25 03:13:43,857][15401] Updated weights for policy 0, policy_version 798613 (0.0041) [2024-06-25 03:13:48,052][15401] Updated weights for policy 0, policy_version 798623 (0.0038) [2024-06-25 03:13:48,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 13084639232. Throughput: 0: 42973.3. Samples: 13084802080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:13:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 03:13:51,402][15401] Updated weights for policy 0, policy_version 798633 (0.0040) [2024-06-25 03:13:53,392][15132] Fps is (10 sec: 39312.1, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 13084835840. Throughput: 0: 42895.1. Samples: 13084926660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:13:53,392][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 03:13:55,858][15401] Updated weights for policy 0, policy_version 798643 (0.0038) [2024-06-25 03:13:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 13085081600. Throughput: 0: 43056.4. Samples: 13085185040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:13:58,392][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 03:13:58,900][15401] Updated weights for policy 0, policy_version 798653 (0.0036) [2024-06-25 03:14:03,389][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13085278208. Throughput: 0: 43079.6. Samples: 13085449220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 03:14:03,434][15401] Updated weights for policy 0, policy_version 798663 (0.0030) [2024-06-25 03:14:07,368][15401] Updated weights for policy 0, policy_version 798673 (0.0031) [2024-06-25 03:14:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13085474816. Throughput: 0: 43015.7. Samples: 13085572960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 03:14:10,988][15401] Updated weights for policy 0, policy_version 798683 (0.0036) [2024-06-25 03:14:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42987.1). Total num frames: 13085736960. Throughput: 0: 43218.1. Samples: 13085834940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:13,391][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 03:14:14,952][15401] Updated weights for policy 0, policy_version 798693 (0.0033) [2024-06-25 03:14:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 43042.7). Total num frames: 13085933568. Throughput: 0: 43081.3. Samples: 13086093680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 03:14:18,475][15401] Updated weights for policy 0, policy_version 798703 (0.0041) [2024-06-25 03:14:22,830][15401] Updated weights for policy 0, policy_version 798713 (0.0033) [2024-06-25 03:14:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13086130176. Throughput: 0: 43089.5. Samples: 13086218840. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:23,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 03:14:26,172][15401] Updated weights for policy 0, policy_version 798723 (0.0037) [2024-06-25 03:14:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 13086392320. Throughput: 0: 43025.2. Samples: 13086478480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:28,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 03:14:30,442][15401] Updated weights for policy 0, policy_version 798733 (0.0048) [2024-06-25 03:14:33,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 43042.4). Total num frames: 13086572544. Throughput: 0: 43127.2. Samples: 13086742900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:33,392][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 03:14:33,854][15401] Updated weights for policy 0, policy_version 798743 (0.0038) [2024-06-25 03:14:38,315][15401] Updated weights for policy 0, policy_version 798753 (0.0029) [2024-06-25 03:14:38,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13086769152. Throughput: 0: 43071.1. Samples: 13086864760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 03:14:41,225][15349] Signal inference workers to stop experience collection... (193700 times) [2024-06-25 03:14:41,227][15349] Signal inference workers to resume experience collection... (193700 times) [2024-06-25 03:14:41,242][15401] InferenceWorker_p0-w0: stopping experience collection (193700 times) [2024-06-25 03:14:41,242][15401] InferenceWorker_p0-w0: resuming experience collection (193700 times) [2024-06-25 03:14:41,372][15401] Updated weights for policy 0, policy_version 798763 (0.0024) [2024-06-25 03:14:43,390][15132] Fps is (10 sec: 47523.6, 60 sec: 43417.5, 300 sec: 43153.8). Total num frames: 13087047680. Throughput: 0: 43114.7. Samples: 13087125100. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 03:14:45,796][15401] Updated weights for policy 0, policy_version 798773 (0.0036) [2024-06-25 03:14:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13087211520. Throughput: 0: 43099.2. Samples: 13087388680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 03:14:48,938][15401] Updated weights for policy 0, policy_version 798783 (0.0031) [2024-06-25 03:14:53,231][15401] Updated weights for policy 0, policy_version 798793 (0.0038) [2024-06-25 03:14:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 13087424512. Throughput: 0: 43128.9. Samples: 13087513760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 03:14:56,670][15401] Updated weights for policy 0, policy_version 798803 (0.0038) [2024-06-25 03:14:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43146.2, 300 sec: 43042.7). Total num frames: 13087670272. Throughput: 0: 43110.2. Samples: 13087774900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:14:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 03:15:00,726][15401] Updated weights for policy 0, policy_version 798813 (0.0033) [2024-06-25 03:15:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 13087866880. Throughput: 0: 43004.3. Samples: 13088028880. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:15:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:15:04,477][15401] Updated weights for policy 0, policy_version 798823 (0.0036) [2024-06-25 03:15:08,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13088063488. Throughput: 0: 43084.4. Samples: 13088157640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:15:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 03:15:08,548][15401] Updated weights for policy 0, policy_version 798833 (0.0037) [2024-06-25 03:15:12,073][15401] Updated weights for policy 0, policy_version 798843 (0.0040) [2024-06-25 03:15:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42988.1). Total num frames: 13088309248. Throughput: 0: 43047.2. Samples: 13088415600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:15:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 03:15:16,442][15401] Updated weights for policy 0, policy_version 798853 (0.0034) [2024-06-25 03:15:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13088489472. Throughput: 0: 42852.3. Samples: 13088671160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-25 03:15:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 03:15:19,886][15401] Updated weights for policy 0, policy_version 798863 (0.0023) [2024-06-25 03:15:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42987.5). Total num frames: 13088718848. Throughput: 0: 42847.1. Samples: 13088792880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:15:23,906][15401] Updated weights for policy 0, policy_version 798873 (0.0033) [2024-06-25 03:15:27,484][15401] Updated weights for policy 0, policy_version 798883 (0.0032) [2024-06-25 03:15:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42987.8). Total num frames: 13088931840. Throughput: 0: 42923.2. Samples: 13089056640. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:28,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 03:15:31,413][15401] Updated weights for policy 0, policy_version 798893 (0.0040) [2024-06-25 03:15:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42599.9, 300 sec: 42876.1). Total num frames: 13089128448. Throughput: 0: 42830.1. Samples: 13089316040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:33,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 03:15:35,346][15401] Updated weights for policy 0, policy_version 798903 (0.0031) [2024-06-25 03:15:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 43098.3). Total num frames: 13089374208. Throughput: 0: 42681.7. Samples: 13089434440. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 03:15:38,733][15401] Updated weights for policy 0, policy_version 798913 (0.0031) [2024-06-25 03:15:42,826][15401] Updated weights for policy 0, policy_version 798923 (0.0036) [2024-06-25 03:15:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 13089587200. Throughput: 0: 42736.9. Samples: 13089698060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 03:15:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798925_13089587200.pth... [2024-06-25 03:15:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798294_13079248896.pth [2024-06-25 03:15:46,154][15401] Updated weights for policy 0, policy_version 798933 (0.0033) [2024-06-25 03:15:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13089767424. Throughput: 0: 42917.4. Samples: 13089960160. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:48,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 03:15:50,527][15401] Updated weights for policy 0, policy_version 798943 (0.0033) [2024-06-25 03:15:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 13090029568. Throughput: 0: 42709.2. Samples: 13090079560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 03:15:53,614][15401] Updated weights for policy 0, policy_version 798953 (0.0037) [2024-06-25 03:15:58,211][15401] Updated weights for policy 0, policy_version 798963 (0.0036) [2024-06-25 03:15:58,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42320.9, 300 sec: 42986.2). Total num frames: 13090209792. Throughput: 0: 42809.4. Samples: 13090342300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:15:58,396][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 03:15:58,458][15349] Signal inference workers to stop experience collection... (193750 times) [2024-06-25 03:15:58,458][15349] Signal inference workers to resume experience collection... (193750 times) [2024-06-25 03:15:58,503][15401] InferenceWorker_p0-w0: stopping experience collection (193750 times) [2024-06-25 03:15:58,503][15401] InferenceWorker_p0-w0: resuming experience collection (193750 times) [2024-06-25 03:16:01,533][15401] Updated weights for policy 0, policy_version 798973 (0.0034) [2024-06-25 03:16:03,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 13090406400. Throughput: 0: 43065.8. Samples: 13090609120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 03:16:05,825][15401] Updated weights for policy 0, policy_version 798983 (0.0037) [2024-06-25 03:16:08,389][15132] Fps is (10 sec: 45905.1, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 13090668544. Throughput: 0: 43015.8. Samples: 13090728580. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 03:16:09,255][15401] Updated weights for policy 0, policy_version 798993 (0.0033) [2024-06-25 03:16:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 13090848768. Throughput: 0: 42903.1. Samples: 13090987280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:13,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 03:16:13,406][15401] Updated weights for policy 0, policy_version 799003 (0.0042) [2024-06-25 03:16:16,969][15401] Updated weights for policy 0, policy_version 799013 (0.0039) [2024-06-25 03:16:18,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42876.8). Total num frames: 13091045376. Throughput: 0: 42856.5. Samples: 13091244580. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 03:16:21,036][15401] Updated weights for policy 0, policy_version 799023 (0.0040) [2024-06-25 03:16:23,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43415.9, 300 sec: 43209.0). Total num frames: 13091323904. Throughput: 0: 42997.3. Samples: 13091369420. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:23,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 03:16:24,526][15401] Updated weights for policy 0, policy_version 799033 (0.0038) [2024-06-25 03:16:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13091487744. Throughput: 0: 43069.8. Samples: 13091636200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 03:16:28,591][15401] Updated weights for policy 0, policy_version 799043 (0.0037) [2024-06-25 03:16:32,139][15401] Updated weights for policy 0, policy_version 799053 (0.0024) [2024-06-25 03:16:33,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13091717120. Throughput: 0: 42890.6. Samples: 13091890240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 03:16:36,134][15401] Updated weights for policy 0, policy_version 799063 (0.0028) [2024-06-25 03:16:38,392][15132] Fps is (10 sec: 49140.5, 60 sec: 43415.8, 300 sec: 43209.0). Total num frames: 13091979264. Throughput: 0: 43020.9. Samples: 13092015600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:38,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 03:16:40,215][15401] Updated weights for policy 0, policy_version 799073 (0.0036) [2024-06-25 03:16:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13092143104. Throughput: 0: 43094.9. Samples: 13092281300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 03:16:43,834][15401] Updated weights for policy 0, policy_version 799083 (0.0034) [2024-06-25 03:16:47,725][15401] Updated weights for policy 0, policy_version 799093 (0.0041) [2024-06-25 03:16:48,389][15132] Fps is (10 sec: 37692.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13092356096. Throughput: 0: 42753.3. Samples: 13092533020. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 03:16:51,514][15401] Updated weights for policy 0, policy_version 799103 (0.0028) [2024-06-25 03:16:53,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.7, 300 sec: 43153.8). Total num frames: 13092618240. Throughput: 0: 42969.3. Samples: 13092662200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 03:16:55,287][15401] Updated weights for policy 0, policy_version 799113 (0.0040) [2024-06-25 03:16:58,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42328.1, 300 sec: 42764.7). Total num frames: 13092749312. Throughput: 0: 42838.1. Samples: 13092915100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:16:58,393][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 03:16:59,291][15401] Updated weights for policy 0, policy_version 799123 (0.0052) [2024-06-25 03:17:03,114][15401] Updated weights for policy 0, policy_version 799133 (0.0036) [2024-06-25 03:17:03,390][15132] Fps is (10 sec: 37682.7, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 13092995072. Throughput: 0: 42675.9. Samples: 13093165000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-25 03:17:03,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 03:17:06,964][15349] Signal inference workers to stop experience collection... (193800 times) [2024-06-25 03:17:07,019][15401] InferenceWorker_p0-w0: stopping experience collection (193800 times) [2024-06-25 03:17:07,025][15349] Signal inference workers to resume experience collection... (193800 times) [2024-06-25 03:17:07,036][15401] InferenceWorker_p0-w0: resuming experience collection (193800 times) [2024-06-25 03:17:07,038][15401] Updated weights for policy 0, policy_version 799143 (0.0023) [2024-06-25 03:17:08,389][15132] Fps is (10 sec: 50803.0, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 13093257216. Throughput: 0: 42909.4. Samples: 13093300240. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:17:10,600][15401] Updated weights for policy 0, policy_version 799153 (0.0036) [2024-06-25 03:17:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13093404672. Throughput: 0: 42641.2. Samples: 13093555060. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:13,399][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 03:17:14,598][15401] Updated weights for policy 0, policy_version 799163 (0.0026) [2024-06-25 03:17:18,209][15401] Updated weights for policy 0, policy_version 799173 (0.0027) [2024-06-25 03:17:18,390][15132] Fps is (10 sec: 39319.7, 60 sec: 43417.3, 300 sec: 43042.6). Total num frames: 13093650432. Throughput: 0: 42379.2. Samples: 13093797320. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 03:17:22,627][15401] Updated weights for policy 0, policy_version 799183 (0.0022) [2024-06-25 03:17:23,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42327.0, 300 sec: 42987.2). Total num frames: 13093863424. Throughput: 0: 42590.8. Samples: 13093932080. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:23,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 03:17:25,782][15401] Updated weights for policy 0, policy_version 799193 (0.0022) [2024-06-25 03:17:28,396][15132] Fps is (10 sec: 39298.1, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 13094043648. Throughput: 0: 42418.9. Samples: 13094190420. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:28,396][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 03:17:30,091][15401] Updated weights for policy 0, policy_version 799203 (0.0032) [2024-06-25 03:17:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 43098.2). Total num frames: 13094289408. Throughput: 0: 42413.2. Samples: 13094441620. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:33,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 03:17:33,957][15401] Updated weights for policy 0, policy_version 799213 (0.0031) [2024-06-25 03:17:37,797][15401] Updated weights for policy 0, policy_version 799223 (0.0036) [2024-06-25 03:17:38,390][15132] Fps is (10 sec: 47544.0, 60 sec: 42327.0, 300 sec: 43042.7). Total num frames: 13094518784. Throughput: 0: 42610.6. Samples: 13094579680. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:38,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 03:17:41,496][15401] Updated weights for policy 0, policy_version 799233 (0.0040) [2024-06-25 03:17:43,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 13094666240. Throughput: 0: 42468.8. Samples: 13094826100. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 03:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799235_13094666240.pth... [2024-06-25 03:17:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798612_13084459008.pth [2024-06-25 03:17:45,471][15401] Updated weights for policy 0, policy_version 799243 (0.0038) [2024-06-25 03:17:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 13094928384. Throughput: 0: 42532.5. Samples: 13095078960. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 03:17:49,064][15401] Updated weights for policy 0, policy_version 799253 (0.0037) [2024-06-25 03:17:52,953][15401] Updated weights for policy 0, policy_version 799263 (0.0045) [2024-06-25 03:17:53,389][15132] Fps is (10 sec: 49153.1, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 13095157760. Throughput: 0: 42553.4. Samples: 13095215140. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 03:17:56,899][15401] Updated weights for policy 0, policy_version 799273 (0.0024) [2024-06-25 03:17:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 13095305216. Throughput: 0: 42397.6. Samples: 13095462940. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:17:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 03:18:00,484][15401] Updated weights for policy 0, policy_version 799283 (0.0027) [2024-06-25 03:18:02,229][15349] Signal inference workers to stop experience collection... (193850 times) [2024-06-25 03:18:02,229][15349] Signal inference workers to resume experience collection... (193850 times) [2024-06-25 03:18:02,243][15401] InferenceWorker_p0-w0: stopping experience collection (193850 times) [2024-06-25 03:18:02,243][15401] InferenceWorker_p0-w0: resuming experience collection (193850 times) [2024-06-25 03:18:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 13095583744. Throughput: 0: 42611.5. Samples: 13095714820. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 03:18:04,527][15401] Updated weights for policy 0, policy_version 799293 (0.0039) [2024-06-25 03:18:08,069][15401] Updated weights for policy 0, policy_version 799303 (0.0040) [2024-06-25 03:18:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 13095780352. Throughput: 0: 42812.4. Samples: 13095858640. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 03:18:12,065][15401] Updated weights for policy 0, policy_version 799313 (0.0028) [2024-06-25 03:18:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 13095960576. Throughput: 0: 42644.4. Samples: 13096109140. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:13,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 03:18:15,685][15401] Updated weights for policy 0, policy_version 799323 (0.0023) [2024-06-25 03:18:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.8, 300 sec: 42987.2). Total num frames: 13096222720. Throughput: 0: 42716.1. Samples: 13096363840. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 03:18:19,773][15401] Updated weights for policy 0, policy_version 799333 (0.0037) [2024-06-25 03:18:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13096419328. Throughput: 0: 42642.3. Samples: 13096498580. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 03:18:23,480][15401] Updated weights for policy 0, policy_version 799343 (0.0038) [2024-06-25 03:18:27,772][15401] Updated weights for policy 0, policy_version 799353 (0.0033) [2024-06-25 03:18:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 13096599552. Throughput: 0: 42707.3. Samples: 13096747920. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 03:18:31,103][15401] Updated weights for policy 0, policy_version 799363 (0.0043) [2024-06-25 03:18:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13096861696. Throughput: 0: 42795.4. Samples: 13097004760. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 03:18:35,269][15401] Updated weights for policy 0, policy_version 799373 (0.0029) [2024-06-25 03:18:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13097058304. Throughput: 0: 42889.7. Samples: 13097145180. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 03:18:38,617][15401] Updated weights for policy 0, policy_version 799383 (0.0029) [2024-06-25 03:18:42,836][15401] Updated weights for policy 0, policy_version 799393 (0.0044) [2024-06-25 03:18:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13097254912. Throughput: 0: 42873.6. Samples: 13097392260. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 03:18:46,677][15401] Updated weights for policy 0, policy_version 799403 (0.0038) [2024-06-25 03:18:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 13097500672. Throughput: 0: 43009.3. Samples: 13097650240. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 03:18:50,457][15401] Updated weights for policy 0, policy_version 799413 (0.0032) [2024-06-25 03:18:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 13097713664. Throughput: 0: 42876.4. Samples: 13097788080. Policy #0 lag: (min: 0.0, avg: 14.2, max: 25.0) [2024-06-25 03:18:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 03:18:54,144][15401] Updated weights for policy 0, policy_version 799423 (0.0028) [2024-06-25 03:18:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13097893888. Throughput: 0: 42825.8. Samples: 13098036300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:18:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 03:18:58,495][15401] Updated weights for policy 0, policy_version 799433 (0.0028) [2024-06-25 03:19:01,759][15401] Updated weights for policy 0, policy_version 799443 (0.0037) [2024-06-25 03:19:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13098139648. Throughput: 0: 42890.2. Samples: 13098293900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 03:19:06,058][15401] Updated weights for policy 0, policy_version 799453 (0.0033) [2024-06-25 03:19:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13098352640. Throughput: 0: 42801.2. Samples: 13098424640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 03:19:09,276][15401] Updated weights for policy 0, policy_version 799463 (0.0034) [2024-06-25 03:19:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13098549248. Throughput: 0: 42878.1. Samples: 13098677440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:13,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-25 03:19:13,539][15401] Updated weights for policy 0, policy_version 799473 (0.0036) [2024-06-25 03:19:16,877][15401] Updated weights for policy 0, policy_version 799483 (0.0031) [2024-06-25 03:19:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13098778624. Throughput: 0: 42898.6. Samples: 13098935200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:18,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 03:19:18,525][15349] Signal inference workers to stop experience collection... (193900 times) [2024-06-25 03:19:18,544][15401] InferenceWorker_p0-w0: stopping experience collection (193900 times) [2024-06-25 03:19:18,599][15349] Signal inference workers to resume experience collection... (193900 times) [2024-06-25 03:19:18,600][15401] InferenceWorker_p0-w0: resuming experience collection (193900 times) [2024-06-25 03:19:21,317][15401] Updated weights for policy 0, policy_version 799493 (0.0037) [2024-06-25 03:19:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13098991616. Throughput: 0: 42824.5. Samples: 13099072280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 03:19:24,340][15401] Updated weights for policy 0, policy_version 799503 (0.0042) [2024-06-25 03:19:28,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 13099188224. Throughput: 0: 42905.8. Samples: 13099323020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 03:19:28,937][15401] Updated weights for policy 0, policy_version 799513 (0.0040) [2024-06-25 03:19:31,875][15401] Updated weights for policy 0, policy_version 799523 (0.0032) [2024-06-25 03:19:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13099433984. Throughput: 0: 43016.5. Samples: 13099585980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 03:19:37,015][15401] Updated weights for policy 0, policy_version 799533 (0.0042) [2024-06-25 03:19:38,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13099663360. Throughput: 0: 42972.9. Samples: 13099721860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 03:19:39,862][15401] Updated weights for policy 0, policy_version 799543 (0.0036) [2024-06-25 03:19:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13099843584. Throughput: 0: 42994.2. Samples: 13099971040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:43,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 03:19:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799551_13099843584.pth... [2024-06-25 03:19:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000798925_13089587200.pth [2024-06-25 03:19:44,379][15401] Updated weights for policy 0, policy_version 799553 (0.0025) [2024-06-25 03:19:47,695][15401] Updated weights for policy 0, policy_version 799563 (0.0032) [2024-06-25 03:19:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13100089344. Throughput: 0: 42980.4. Samples: 13100228020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:19:51,839][15401] Updated weights for policy 0, policy_version 799573 (0.0023) [2024-06-25 03:19:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13100302336. Throughput: 0: 43189.9. Samples: 13100368180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 03:19:55,214][15401] Updated weights for policy 0, policy_version 799583 (0.0035) [2024-06-25 03:19:58,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13100482560. Throughput: 0: 43093.0. Samples: 13100616620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:19:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 03:19:59,546][15401] Updated weights for policy 0, policy_version 799593 (0.0033) [2024-06-25 03:20:02,985][15401] Updated weights for policy 0, policy_version 799603 (0.0047) [2024-06-25 03:20:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13100728320. Throughput: 0: 42996.5. Samples: 13100870040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:03,400][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 03:20:07,234][15401] Updated weights for policy 0, policy_version 799613 (0.0039) [2024-06-25 03:20:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13100924928. Throughput: 0: 42912.8. Samples: 13101003360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 03:20:10,549][15401] Updated weights for policy 0, policy_version 799623 (0.0036) [2024-06-25 03:20:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13101137920. Throughput: 0: 42882.8. Samples: 13101252740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:13,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-25 03:20:14,865][15401] Updated weights for policy 0, policy_version 799633 (0.0041) [2024-06-25 03:20:18,090][15401] Updated weights for policy 0, policy_version 799643 (0.0033) [2024-06-25 03:20:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13101367296. Throughput: 0: 42945.7. Samples: 13101518540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:18,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 03:20:22,293][15401] Updated weights for policy 0, policy_version 799653 (0.0043) [2024-06-25 03:20:23,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13101563904. Throughput: 0: 42863.5. Samples: 13101650820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:23,393][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 03:20:25,686][15401] Updated weights for policy 0, policy_version 799663 (0.0043) [2024-06-25 03:20:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 13101776896. Throughput: 0: 42907.5. Samples: 13101901980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:28,392][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 03:20:29,773][15401] Updated weights for policy 0, policy_version 799673 (0.0029) [2024-06-25 03:20:33,281][15401] Updated weights for policy 0, policy_version 799683 (0.0038) [2024-06-25 03:20:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13102006272. Throughput: 0: 42825.3. Samples: 13102155160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 03:20:37,338][15401] Updated weights for policy 0, policy_version 799693 (0.0046) [2024-06-25 03:20:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13102186496. Throughput: 0: 42692.4. Samples: 13102289340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 03:20:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 03:20:40,943][15401] Updated weights for policy 0, policy_version 799703 (0.0023) [2024-06-25 03:20:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13102415872. Throughput: 0: 42749.2. Samples: 13102540340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:20:43,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 03:20:44,915][15401] Updated weights for policy 0, policy_version 799713 (0.0034) [2024-06-25 03:20:46,665][15349] Signal inference workers to stop experience collection... (193950 times) [2024-06-25 03:20:46,665][15349] Signal inference workers to resume experience collection... (193950 times) [2024-06-25 03:20:46,676][15401] InferenceWorker_p0-w0: stopping experience collection (193950 times) [2024-06-25 03:20:46,676][15401] InferenceWorker_p0-w0: resuming experience collection (193950 times) [2024-06-25 03:20:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13102628864. Throughput: 0: 42783.7. Samples: 13102795300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:20:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 03:20:49,048][15401] Updated weights for policy 0, policy_version 799723 (0.0052) [2024-06-25 03:20:52,741][15401] Updated weights for policy 0, policy_version 799733 (0.0033) [2024-06-25 03:20:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42765.9). Total num frames: 13102825472. Throughput: 0: 42668.0. Samples: 13102923420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:20:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 03:20:56,567][15401] Updated weights for policy 0, policy_version 799743 (0.0034) [2024-06-25 03:20:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13103071232. Throughput: 0: 42943.9. Samples: 13103185220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:20:58,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 03:21:00,290][15401] Updated weights for policy 0, policy_version 799753 (0.0034) [2024-06-25 03:21:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13103267840. Throughput: 0: 42677.4. Samples: 13103439020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 03:21:04,178][15401] Updated weights for policy 0, policy_version 799763 (0.0040) [2024-06-25 03:21:08,184][15401] Updated weights for policy 0, policy_version 799773 (0.0029) [2024-06-25 03:21:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13103480832. Throughput: 0: 42593.7. Samples: 13103567440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:08,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 03:21:11,714][15401] Updated weights for policy 0, policy_version 799783 (0.0040) [2024-06-25 03:21:13,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 13103693824. Throughput: 0: 42748.9. Samples: 13103825680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:13,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 03:21:15,834][15401] Updated weights for policy 0, policy_version 799793 (0.0026) [2024-06-25 03:21:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 13103906816. Throughput: 0: 42662.3. Samples: 13104074960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 03:21:20,021][15401] Updated weights for policy 0, policy_version 799803 (0.0033) [2024-06-25 03:21:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 13104119808. Throughput: 0: 42639.2. Samples: 13104208100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 03:21:23,485][15401] Updated weights for policy 0, policy_version 799813 (0.0034) [2024-06-25 03:21:27,733][15401] Updated weights for policy 0, policy_version 799823 (0.0028) [2024-06-25 03:21:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13104332800. Throughput: 0: 42649.4. Samples: 13104459560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 03:21:31,175][15401] Updated weights for policy 0, policy_version 799833 (0.0043) [2024-06-25 03:21:33,390][15132] Fps is (10 sec: 42596.5, 60 sec: 42325.1, 300 sec: 42598.7). Total num frames: 13104545792. Throughput: 0: 42500.5. Samples: 13104707840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 03:21:35,376][15401] Updated weights for policy 0, policy_version 799843 (0.0041) [2024-06-25 03:21:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13104742400. Throughput: 0: 42442.6. Samples: 13104833340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 03:21:39,001][15401] Updated weights for policy 0, policy_version 799853 (0.0039) [2024-06-25 03:21:42,973][15401] Updated weights for policy 0, policy_version 799863 (0.0022) [2024-06-25 03:21:43,389][15132] Fps is (10 sec: 42600.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13104971776. Throughput: 0: 42404.0. Samples: 13105093400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 03:21:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799865_13104988160.pth... [2024-06-25 03:21:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799235_13094666240.pth [2024-06-25 03:21:46,654][15401] Updated weights for policy 0, policy_version 799873 (0.0026) [2024-06-25 03:21:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13105184768. Throughput: 0: 42452.9. Samples: 13105349400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 03:21:50,595][15401] Updated weights for policy 0, policy_version 799883 (0.0031) [2024-06-25 03:21:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 13105397760. Throughput: 0: 42542.2. Samples: 13105481840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 03:21:54,381][15401] Updated weights for policy 0, policy_version 799893 (0.0030) [2024-06-25 03:21:58,371][15401] Updated weights for policy 0, policy_version 799903 (0.0036) [2024-06-25 03:21:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13105610752. Throughput: 0: 42425.4. Samples: 13105734720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:21:58,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 03:22:01,994][15401] Updated weights for policy 0, policy_version 799913 (0.0039) [2024-06-25 03:22:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13105840128. Throughput: 0: 42579.7. Samples: 13105991060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 03:22:05,845][15401] Updated weights for policy 0, policy_version 799923 (0.0040) [2024-06-25 03:22:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13106053120. Throughput: 0: 42589.8. Samples: 13106124640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 03:22:09,399][15401] Updated weights for policy 0, policy_version 799933 (0.0038) [2024-06-25 03:22:13,288][15349] Signal inference workers to stop experience collection... (194000 times) [2024-06-25 03:22:13,288][15349] Signal inference workers to resume experience collection... (194000 times) [2024-06-25 03:22:13,298][15401] InferenceWorker_p0-w0: stopping experience collection (194000 times) [2024-06-25 03:22:13,298][15401] InferenceWorker_p0-w0: resuming experience collection (194000 times) [2024-06-25 03:22:13,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13106249728. Throughput: 0: 42748.1. Samples: 13106383220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 03:22:13,446][15401] Updated weights for policy 0, policy_version 799943 (0.0034) [2024-06-25 03:22:17,329][15401] Updated weights for policy 0, policy_version 799953 (0.0038) [2024-06-25 03:22:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13106462720. Throughput: 0: 42810.1. Samples: 13106634280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 03:22:21,251][15401] Updated weights for policy 0, policy_version 799963 (0.0029) [2024-06-25 03:22:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 13106692096. Throughput: 0: 42957.9. Samples: 13106766440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 03:22:24,835][15401] Updated weights for policy 0, policy_version 799973 (0.0032) [2024-06-25 03:22:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13106872320. Throughput: 0: 42790.7. Samples: 13107018980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 03:22:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 03:22:28,935][15401] Updated weights for policy 0, policy_version 799983 (0.0033) [2024-06-25 03:22:32,350][15401] Updated weights for policy 0, policy_version 799993 (0.0030) [2024-06-25 03:22:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 13107118080. Throughput: 0: 42854.7. Samples: 13107277860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 03:22:36,589][15401] Updated weights for policy 0, policy_version 800003 (0.0026) [2024-06-25 03:22:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13107347456. Throughput: 0: 42966.3. Samples: 13107415320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 03:22:40,106][15401] Updated weights for policy 0, policy_version 800013 (0.0025) [2024-06-25 03:22:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13107527680. Throughput: 0: 42878.5. Samples: 13107664260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 03:22:44,238][15401] Updated weights for policy 0, policy_version 800023 (0.0031) [2024-06-25 03:22:47,958][15401] Updated weights for policy 0, policy_version 800033 (0.0034) [2024-06-25 03:22:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13107757056. Throughput: 0: 42842.9. Samples: 13107918980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 03:22:51,817][15401] Updated weights for policy 0, policy_version 800043 (0.0022) [2024-06-25 03:22:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13107986432. Throughput: 0: 42884.3. Samples: 13108054440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 03:22:55,556][15401] Updated weights for policy 0, policy_version 800053 (0.0037) [2024-06-25 03:22:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13108183040. Throughput: 0: 42684.4. Samples: 13108304020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:22:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 03:22:59,577][15401] Updated weights for policy 0, policy_version 800063 (0.0037) [2024-06-25 03:23:03,185][15401] Updated weights for policy 0, policy_version 800073 (0.0029) [2024-06-25 03:23:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13108396032. Throughput: 0: 42691.5. Samples: 13108555400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 03:23:07,410][15401] Updated weights for policy 0, policy_version 800083 (0.0041) [2024-06-25 03:23:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13108609024. Throughput: 0: 42633.1. Samples: 13108684940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 03:23:10,848][15401] Updated weights for policy 0, policy_version 800093 (0.0028) [2024-06-25 03:23:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13108838400. Throughput: 0: 42772.3. Samples: 13108943740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 03:23:14,953][15401] Updated weights for policy 0, policy_version 800103 (0.0043) [2024-06-25 03:23:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13109035008. Throughput: 0: 42714.7. Samples: 13109200020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:18,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 03:23:18,486][15401] Updated weights for policy 0, policy_version 800113 (0.0032) [2024-06-25 03:23:22,689][15401] Updated weights for policy 0, policy_version 800123 (0.0027) [2024-06-25 03:23:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13109231616. Throughput: 0: 42432.0. Samples: 13109324760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 03:23:25,967][15401] Updated weights for policy 0, policy_version 800133 (0.0045) [2024-06-25 03:23:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13109477376. Throughput: 0: 42605.3. Samples: 13109581500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 03:23:30,570][15401] Updated weights for policy 0, policy_version 800143 (0.0031) [2024-06-25 03:23:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13109690368. Throughput: 0: 42628.0. Samples: 13109837240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 03:23:33,547][15401] Updated weights for policy 0, policy_version 800153 (0.0029) [2024-06-25 03:23:35,520][15349] Signal inference workers to stop experience collection... (194050 times) [2024-06-25 03:23:35,521][15349] Signal inference workers to resume experience collection... (194050 times) [2024-06-25 03:23:35,536][15401] InferenceWorker_p0-w0: stopping experience collection (194050 times) [2024-06-25 03:23:35,536][15401] InferenceWorker_p0-w0: resuming experience collection (194050 times) [2024-06-25 03:23:38,353][15401] Updated weights for policy 0, policy_version 800163 (0.0031) [2024-06-25 03:23:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 13109870592. Throughput: 0: 42401.9. Samples: 13109962520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:38,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 03:23:41,414][15401] Updated weights for policy 0, policy_version 800173 (0.0025) [2024-06-25 03:23:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13110116352. Throughput: 0: 42561.3. Samples: 13110219280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 03:23:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800178_13110116352.pth... [2024-06-25 03:23:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799551_13099843584.pth [2024-06-25 03:23:46,139][15401] Updated weights for policy 0, policy_version 800183 (0.0030) [2024-06-25 03:23:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13110312960. Throughput: 0: 42698.8. Samples: 13110476840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:48,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 03:23:48,999][15401] Updated weights for policy 0, policy_version 800193 (0.0037) [2024-06-25 03:23:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 13110493184. Throughput: 0: 42729.1. Samples: 13110607740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:53,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 03:23:53,611][15401] Updated weights for policy 0, policy_version 800203 (0.0036) [2024-06-25 03:23:56,522][15401] Updated weights for policy 0, policy_version 800213 (0.0025) [2024-06-25 03:23:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13110771712. Throughput: 0: 42737.7. Samples: 13110866940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:23:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 03:24:01,277][15401] Updated weights for policy 0, policy_version 800223 (0.0038) [2024-06-25 03:24:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13110968320. Throughput: 0: 42903.1. Samples: 13111130660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:24:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 03:24:04,064][15401] Updated weights for policy 0, policy_version 800233 (0.0042) [2024-06-25 03:24:08,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13111148544. Throughput: 0: 42825.7. Samples: 13111251920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:24:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 03:24:08,862][15401] Updated weights for policy 0, policy_version 800243 (0.0024) [2024-06-25 03:24:11,658][15401] Updated weights for policy 0, policy_version 800253 (0.0042) [2024-06-25 03:24:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13111394304. Throughput: 0: 42881.5. Samples: 13111511160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-06-25 03:24:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:24:16,644][15401] Updated weights for policy 0, policy_version 800263 (0.0036) [2024-06-25 03:24:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13111607296. Throughput: 0: 43047.1. Samples: 13111774360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 03:24:19,242][15401] Updated weights for policy 0, policy_version 800273 (0.0037) [2024-06-25 03:24:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13111787520. Throughput: 0: 42912.8. Samples: 13111893600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 03:24:24,085][15401] Updated weights for policy 0, policy_version 800283 (0.0035) [2024-06-25 03:24:26,830][15401] Updated weights for policy 0, policy_version 800293 (0.0039) [2024-06-25 03:24:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13112049664. Throughput: 0: 43019.6. Samples: 13112155160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 03:24:31,575][15401] Updated weights for policy 0, policy_version 800303 (0.0044) [2024-06-25 03:24:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13112262656. Throughput: 0: 43315.1. Samples: 13112426020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 03:24:34,696][15401] Updated weights for policy 0, policy_version 800313 (0.0042) [2024-06-25 03:24:38,391][15132] Fps is (10 sec: 40954.1, 60 sec: 43143.4, 300 sec: 42764.8). Total num frames: 13112459264. Throughput: 0: 43004.8. Samples: 13112543020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:38,391][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 03:24:39,144][15401] Updated weights for policy 0, policy_version 800323 (0.0036) [2024-06-25 03:24:42,417][15401] Updated weights for policy 0, policy_version 800333 (0.0031) [2024-06-25 03:24:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13112688640. Throughput: 0: 43021.5. Samples: 13112802900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:43,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 03:24:46,739][15401] Updated weights for policy 0, policy_version 800343 (0.0025) [2024-06-25 03:24:48,389][15132] Fps is (10 sec: 40966.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13112868864. Throughput: 0: 43009.4. Samples: 13113066080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 03:24:48,429][15349] Signal inference workers to stop experience collection... (194100 times) [2024-06-25 03:24:48,432][15349] Signal inference workers to resume experience collection... (194100 times) [2024-06-25 03:24:48,454][15401] InferenceWorker_p0-w0: stopping experience collection (194100 times) [2024-06-25 03:24:48,454][15401] InferenceWorker_p0-w0: resuming experience collection (194100 times) [2024-06-25 03:24:50,174][15401] Updated weights for policy 0, policy_version 800353 (0.0034) [2024-06-25 03:24:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 13113114624. Throughput: 0: 42972.5. Samples: 13113185680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 03:24:54,139][15401] Updated weights for policy 0, policy_version 800363 (0.0032) [2024-06-25 03:24:57,662][15401] Updated weights for policy 0, policy_version 800373 (0.0028) [2024-06-25 03:24:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13113344000. Throughput: 0: 42983.8. Samples: 13113445440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:24:58,399][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 03:25:01,638][15401] Updated weights for policy 0, policy_version 800383 (0.0024) [2024-06-25 03:25:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13113524224. Throughput: 0: 43007.6. Samples: 13113709700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 03:25:05,166][15401] Updated weights for policy 0, policy_version 800393 (0.0033) [2024-06-25 03:25:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 13113753600. Throughput: 0: 43057.4. Samples: 13113831180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 03:25:09,554][15401] Updated weights for policy 0, policy_version 800403 (0.0039) [2024-06-25 03:25:12,717][15401] Updated weights for policy 0, policy_version 800413 (0.0037) [2024-06-25 03:25:13,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13113982976. Throughput: 0: 43010.2. Samples: 13114090620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 03:25:17,097][15401] Updated weights for policy 0, policy_version 800423 (0.0036) [2024-06-25 03:25:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13114179584. Throughput: 0: 42788.0. Samples: 13114351480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 03:25:20,579][15401] Updated weights for policy 0, policy_version 800433 (0.0046) [2024-06-25 03:25:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 42820.9). Total num frames: 13114408960. Throughput: 0: 42970.3. Samples: 13114476620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 03:25:25,023][15401] Updated weights for policy 0, policy_version 800443 (0.0026) [2024-06-25 03:25:27,953][15401] Updated weights for policy 0, policy_version 800453 (0.0026) [2024-06-25 03:25:28,390][15132] Fps is (10 sec: 44235.0, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 13114621952. Throughput: 0: 43000.0. Samples: 13114737920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:28,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 03:25:32,724][15401] Updated weights for policy 0, policy_version 800463 (0.0043) [2024-06-25 03:25:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13114818560. Throughput: 0: 42907.9. Samples: 13114996940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 03:25:35,581][15401] Updated weights for policy 0, policy_version 800473 (0.0037) [2024-06-25 03:25:38,389][15132] Fps is (10 sec: 44238.8, 60 sec: 43418.7, 300 sec: 42876.1). Total num frames: 13115064320. Throughput: 0: 43023.2. Samples: 13115121720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 03:25:40,185][15401] Updated weights for policy 0, policy_version 800483 (0.0033) [2024-06-25 03:25:43,332][15401] Updated weights for policy 0, policy_version 800493 (0.0039) [2024-06-25 03:25:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13115277312. Throughput: 0: 43100.6. Samples: 13115384960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 03:25:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800493_13115277312.pth... [2024-06-25 03:25:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000799865_13104988160.pth [2024-06-25 03:25:47,641][15401] Updated weights for policy 0, policy_version 800503 (0.0049) [2024-06-25 03:25:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13115473920. Throughput: 0: 42923.4. Samples: 13115641260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 03:25:50,917][15401] Updated weights for policy 0, policy_version 800513 (0.0036) [2024-06-25 03:25:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13115686912. Throughput: 0: 42989.4. Samples: 13115765700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 03:25:55,321][15401] Updated weights for policy 0, policy_version 800523 (0.0023) [2024-06-25 03:25:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13115916288. Throughput: 0: 43047.5. Samples: 13116027760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:25:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 03:25:58,535][15401] Updated weights for policy 0, policy_version 800533 (0.0023) [2024-06-25 03:26:02,829][15401] Updated weights for policy 0, policy_version 800543 (0.0047) [2024-06-25 03:26:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13116096512. Throughput: 0: 43104.5. Samples: 13116291180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-25 03:26:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:26:06,130][15401] Updated weights for policy 0, policy_version 800553 (0.0036) [2024-06-25 03:26:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 13116342272. Throughput: 0: 43059.9. Samples: 13116414320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 03:26:10,199][15401] Updated weights for policy 0, policy_version 800563 (0.0037) [2024-06-25 03:26:13,394][15132] Fps is (10 sec: 45853.6, 60 sec: 42868.1, 300 sec: 42875.4). Total num frames: 13116555264. Throughput: 0: 43054.2. Samples: 13116675540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:13,395][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 03:26:14,007][15401] Updated weights for policy 0, policy_version 800573 (0.0028) [2024-06-25 03:26:17,746][15401] Updated weights for policy 0, policy_version 800583 (0.0034) [2024-06-25 03:26:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13116751872. Throughput: 0: 42901.8. Samples: 13116927520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:18,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 03:26:21,539][15349] Signal inference workers to stop experience collection... (194150 times) [2024-06-25 03:26:21,539][15349] Signal inference workers to resume experience collection... (194150 times) [2024-06-25 03:26:21,554][15401] InferenceWorker_p0-w0: stopping experience collection (194150 times) [2024-06-25 03:26:21,554][15401] InferenceWorker_p0-w0: resuming experience collection (194150 times) [2024-06-25 03:26:21,685][15401] Updated weights for policy 0, policy_version 800593 (0.0039) [2024-06-25 03:26:23,389][15132] Fps is (10 sec: 42618.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13116981248. Throughput: 0: 42924.5. Samples: 13117053320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:23,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 03:26:25,606][15401] Updated weights for policy 0, policy_version 800603 (0.0026) [2024-06-25 03:26:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.8, 300 sec: 42820.6). Total num frames: 13117177856. Throughput: 0: 42879.6. Samples: 13117314540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 03:26:29,174][15401] Updated weights for policy 0, policy_version 800613 (0.0033) [2024-06-25 03:26:33,339][15401] Updated weights for policy 0, policy_version 800623 (0.0044) [2024-06-25 03:26:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13117407232. Throughput: 0: 42906.3. Samples: 13117572040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 03:26:36,725][15401] Updated weights for policy 0, policy_version 800633 (0.0040) [2024-06-25 03:26:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13117636608. Throughput: 0: 42996.0. Samples: 13117700520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 03:26:41,135][15401] Updated weights for policy 0, policy_version 800643 (0.0027) [2024-06-25 03:26:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 13117816832. Throughput: 0: 42961.3. Samples: 13117961120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:43,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 03:26:44,297][15401] Updated weights for policy 0, policy_version 800653 (0.0043) [2024-06-25 03:26:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13118029824. Throughput: 0: 42727.9. Samples: 13118213940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 03:26:49,054][15401] Updated weights for policy 0, policy_version 800663 (0.0038) [2024-06-25 03:26:51,951][15401] Updated weights for policy 0, policy_version 800673 (0.0039) [2024-06-25 03:26:53,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13118275584. Throughput: 0: 42860.5. Samples: 13118343040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 03:26:56,721][15401] Updated weights for policy 0, policy_version 800683 (0.0028) [2024-06-25 03:26:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 13118455808. Throughput: 0: 42804.5. Samples: 13118601540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:26:58,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 03:26:59,791][15401] Updated weights for policy 0, policy_version 800693 (0.0028) [2024-06-25 03:27:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13118685184. Throughput: 0: 42717.0. Samples: 13118849780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:03,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 03:27:04,411][15401] Updated weights for policy 0, policy_version 800703 (0.0028) [2024-06-25 03:27:07,381][15401] Updated weights for policy 0, policy_version 800713 (0.0027) [2024-06-25 03:27:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13118914560. Throughput: 0: 42908.5. Samples: 13118984200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 03:27:11,911][15401] Updated weights for policy 0, policy_version 800723 (0.0044) [2024-06-25 03:27:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42328.5, 300 sec: 42820.5). Total num frames: 13119094784. Throughput: 0: 42776.6. Samples: 13119239500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 03:27:15,205][15401] Updated weights for policy 0, policy_version 800733 (0.0033) [2024-06-25 03:27:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13119324160. Throughput: 0: 42687.6. Samples: 13119492980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:18,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 03:27:19,359][15401] Updated weights for policy 0, policy_version 800743 (0.0027) [2024-06-25 03:27:22,933][15401] Updated weights for policy 0, policy_version 800753 (0.0030) [2024-06-25 03:27:23,390][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13119569920. Throughput: 0: 42788.8. Samples: 13119626020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:23,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 03:27:27,382][15401] Updated weights for policy 0, policy_version 800763 (0.0031) [2024-06-25 03:27:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13119750144. Throughput: 0: 42687.2. Samples: 13119881940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 03:27:30,600][15401] Updated weights for policy 0, policy_version 800773 (0.0038) [2024-06-25 03:27:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13119979520. Throughput: 0: 42623.4. Samples: 13120132000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:33,399][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 03:27:34,883][15401] Updated weights for policy 0, policy_version 800783 (0.0029) [2024-06-25 03:27:36,761][15349] Signal inference workers to stop experience collection... (194200 times) [2024-06-25 03:27:36,761][15349] Signal inference workers to resume experience collection... (194200 times) [2024-06-25 03:27:36,807][15401] InferenceWorker_p0-w0: stopping experience collection (194200 times) [2024-06-25 03:27:36,807][15401] InferenceWorker_p0-w0: resuming experience collection (194200 times) [2024-06-25 03:27:38,177][15401] Updated weights for policy 0, policy_version 800793 (0.0043) [2024-06-25 03:27:38,391][15132] Fps is (10 sec: 44232.0, 60 sec: 42597.6, 300 sec: 42931.5). Total num frames: 13120192512. Throughput: 0: 42828.3. Samples: 13120270360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:38,391][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 03:27:43,022][15401] Updated weights for policy 0, policy_version 800803 (0.0046) [2024-06-25 03:27:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13120372736. Throughput: 0: 42639.6. Samples: 13120520320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:43,396][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 03:27:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800805_13120389120.pth... [2024-06-25 03:27:43,604][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800178_13110116352.pth [2024-06-25 03:27:46,158][15401] Updated weights for policy 0, policy_version 800813 (0.0032) [2024-06-25 03:27:48,390][15132] Fps is (10 sec: 42602.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13120618496. Throughput: 0: 42734.6. Samples: 13120772840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-25 03:27:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:27:50,546][15401] Updated weights for policy 0, policy_version 800823 (0.0036) [2024-06-25 03:27:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 13120815104. Throughput: 0: 42683.8. Samples: 13120904980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:27:53,391][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 03:27:53,810][15401] Updated weights for policy 0, policy_version 800833 (0.0038) [2024-06-25 03:27:58,095][15401] Updated weights for policy 0, policy_version 800843 (0.0028) [2024-06-25 03:27:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13121011712. Throughput: 0: 42610.4. Samples: 13121156960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:27:58,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 03:28:01,550][15401] Updated weights for policy 0, policy_version 800853 (0.0047) [2024-06-25 03:28:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13121257472. Throughput: 0: 42494.6. Samples: 13121405240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:03,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 03:28:05,670][15401] Updated weights for policy 0, policy_version 800863 (0.0031) [2024-06-25 03:28:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 13121421312. Throughput: 0: 42526.7. Samples: 13121539720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:08,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 03:28:09,271][15401] Updated weights for policy 0, policy_version 800873 (0.0036) [2024-06-25 03:28:13,188][15401] Updated weights for policy 0, policy_version 800883 (0.0033) [2024-06-25 03:28:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 13121667072. Throughput: 0: 42451.5. Samples: 13121792260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:13,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-25 03:28:16,987][15401] Updated weights for policy 0, policy_version 800893 (0.0029) [2024-06-25 03:28:18,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13121896448. Throughput: 0: 42689.1. Samples: 13122053000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 03:28:20,902][15401] Updated weights for policy 0, policy_version 800903 (0.0035) [2024-06-25 03:28:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 13122076672. Throughput: 0: 42431.6. Samples: 13122179740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 03:28:24,721][15401] Updated weights for policy 0, policy_version 800913 (0.0032) [2024-06-25 03:28:28,336][15401] Updated weights for policy 0, policy_version 800923 (0.0034) [2024-06-25 03:28:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13122322432. Throughput: 0: 42454.7. Samples: 13122430780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 03:28:32,112][15401] Updated weights for policy 0, policy_version 800933 (0.0040) [2024-06-25 03:28:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13122535424. Throughput: 0: 42664.5. Samples: 13122692740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 03:28:35,682][15401] Updated weights for policy 0, policy_version 800943 (0.0031) [2024-06-25 03:28:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42053.0, 300 sec: 42709.5). Total num frames: 13122715648. Throughput: 0: 42621.4. Samples: 13122822940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 03:28:40,075][15401] Updated weights for policy 0, policy_version 800953 (0.0034) [2024-06-25 03:28:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13122961408. Throughput: 0: 42578.9. Samples: 13123073020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:43,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 03:28:43,458][15401] Updated weights for policy 0, policy_version 800963 (0.0041) [2024-06-25 03:28:47,747][15401] Updated weights for policy 0, policy_version 800973 (0.0021) [2024-06-25 03:28:48,088][15349] Signal inference workers to stop experience collection... (194250 times) [2024-06-25 03:28:48,090][15349] Signal inference workers to resume experience collection... (194250 times) [2024-06-25 03:28:48,102][15401] InferenceWorker_p0-w0: stopping experience collection (194250 times) [2024-06-25 03:28:48,102][15401] InferenceWorker_p0-w0: resuming experience collection (194250 times) [2024-06-25 03:28:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13123190784. Throughput: 0: 42909.3. Samples: 13123336160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 03:28:50,924][15401] Updated weights for policy 0, policy_version 800983 (0.0029) [2024-06-25 03:28:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13123354624. Throughput: 0: 42749.7. Samples: 13123463460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 03:28:55,363][15401] Updated weights for policy 0, policy_version 800993 (0.0041) [2024-06-25 03:28:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13123616768. Throughput: 0: 42703.5. Samples: 13123713920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:28:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 03:28:58,510][15401] Updated weights for policy 0, policy_version 801003 (0.0033) [2024-06-25 03:29:03,028][15401] Updated weights for policy 0, policy_version 801013 (0.0036) [2024-06-25 03:29:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13123796992. Throughput: 0: 42728.0. Samples: 13123975760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 03:29:06,384][15401] Updated weights for policy 0, policy_version 801023 (0.0037) [2024-06-25 03:29:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13124009984. Throughput: 0: 42717.9. Samples: 13124102040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 03:29:10,559][15401] Updated weights for policy 0, policy_version 801033 (0.0037) [2024-06-25 03:29:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13124255744. Throughput: 0: 42923.4. Samples: 13124362340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:13,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 03:29:13,896][15401] Updated weights for policy 0, policy_version 801043 (0.0026) [2024-06-25 03:29:18,281][15401] Updated weights for policy 0, policy_version 801053 (0.0033) [2024-06-25 03:29:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13124452352. Throughput: 0: 42763.1. Samples: 13124617080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 03:29:21,678][15401] Updated weights for policy 0, policy_version 801063 (0.0041) [2024-06-25 03:29:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13124665344. Throughput: 0: 42586.1. Samples: 13124739320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:23,391][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 03:29:25,859][15401] Updated weights for policy 0, policy_version 801073 (0.0035) [2024-06-25 03:29:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13124861952. Throughput: 0: 42793.5. Samples: 13124998720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:28,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 03:29:29,398][15401] Updated weights for policy 0, policy_version 801083 (0.0032) [2024-06-25 03:29:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42820.8). Total num frames: 13125091328. Throughput: 0: 42595.6. Samples: 13125252960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 03:29:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 03:29:33,801][15401] Updated weights for policy 0, policy_version 801093 (0.0033) [2024-06-25 03:29:37,834][15401] Updated weights for policy 0, policy_version 801103 (0.0030) [2024-06-25 03:29:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13125304320. Throughput: 0: 42699.7. Samples: 13125384940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:29:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 03:29:41,307][15401] Updated weights for policy 0, policy_version 801113 (0.0033) [2024-06-25 03:29:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13125517312. Throughput: 0: 42627.1. Samples: 13125632140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:29:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 03:29:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801118_13125517312.pth... [2024-06-25 03:29:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800493_13115277312.pth [2024-06-25 03:29:45,572][15401] Updated weights for policy 0, policy_version 801123 (0.0033) [2024-06-25 03:29:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13125713920. Throughput: 0: 42487.0. Samples: 13125887680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:29:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 03:29:48,977][15401] Updated weights for policy 0, policy_version 801133 (0.0041) [2024-06-25 03:29:53,236][15401] Updated weights for policy 0, policy_version 801143 (0.0029) [2024-06-25 03:29:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13125926912. Throughput: 0: 42453.8. Samples: 13126012460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:29:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 03:29:57,147][15401] Updated weights for policy 0, policy_version 801153 (0.0038) [2024-06-25 03:29:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13126139904. Throughput: 0: 42401.4. Samples: 13126270400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:29:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 03:30:00,898][15401] Updated weights for policy 0, policy_version 801163 (0.0033) [2024-06-25 03:30:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13126369280. Throughput: 0: 42388.8. Samples: 13126524580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:03,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 03:30:04,626][15401] Updated weights for policy 0, policy_version 801173 (0.0048) [2024-06-25 03:30:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13126565888. Throughput: 0: 42629.1. Samples: 13126657620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:08,390][15132] Avg episode reward: [(0, '0.205')] [2024-06-25 03:30:08,515][15401] Updated weights for policy 0, policy_version 801183 (0.0029) [2024-06-25 03:30:12,215][15401] Updated weights for policy 0, policy_version 801193 (0.0036) [2024-06-25 03:30:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13126795264. Throughput: 0: 42539.1. Samples: 13126912980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:13,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 03:30:16,261][15401] Updated weights for policy 0, policy_version 801203 (0.0029) [2024-06-25 03:30:16,699][15349] Signal inference workers to stop experience collection... (194300 times) [2024-06-25 03:30:16,700][15349] Signal inference workers to resume experience collection... (194300 times) [2024-06-25 03:30:16,717][15401] InferenceWorker_p0-w0: stopping experience collection (194300 times) [2024-06-25 03:30:16,717][15401] InferenceWorker_p0-w0: resuming experience collection (194300 times) [2024-06-25 03:30:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13127024640. Throughput: 0: 42476.9. Samples: 13127164420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 03:30:19,837][15401] Updated weights for policy 0, policy_version 801213 (0.0034) [2024-06-25 03:30:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13127204864. Throughput: 0: 42406.5. Samples: 13127293240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 03:30:23,819][15401] Updated weights for policy 0, policy_version 801223 (0.0045) [2024-06-25 03:30:27,491][15401] Updated weights for policy 0, policy_version 801233 (0.0026) [2024-06-25 03:30:28,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 13127434240. Throughput: 0: 42662.0. Samples: 13127552200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:28,397][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 03:30:31,660][15401] Updated weights for policy 0, policy_version 801243 (0.0037) [2024-06-25 03:30:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13127630848. Throughput: 0: 42566.7. Samples: 13127803180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:33,393][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 03:30:35,207][15401] Updated weights for policy 0, policy_version 801253 (0.0027) [2024-06-25 03:30:38,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13127860224. Throughput: 0: 42537.8. Samples: 13127926660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 03:30:39,062][15401] Updated weights for policy 0, policy_version 801263 (0.0032) [2024-06-25 03:30:42,726][15401] Updated weights for policy 0, policy_version 801273 (0.0025) [2024-06-25 03:30:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13128073216. Throughput: 0: 42631.6. Samples: 13128188820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 03:30:46,682][15401] Updated weights for policy 0, policy_version 801283 (0.0037) [2024-06-25 03:30:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13128269824. Throughput: 0: 42769.4. Samples: 13128449200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:48,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 03:30:50,357][15401] Updated weights for policy 0, policy_version 801293 (0.0031) [2024-06-25 03:30:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13128515584. Throughput: 0: 42482.6. Samples: 13128569340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:53,391][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 03:30:54,006][15401] Updated weights for policy 0, policy_version 801303 (0.0038) [2024-06-25 03:30:58,256][15401] Updated weights for policy 0, policy_version 801313 (0.0031) [2024-06-25 03:30:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13128712192. Throughput: 0: 42622.8. Samples: 13128831000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:30:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 03:31:02,144][15401] Updated weights for policy 0, policy_version 801323 (0.0044) [2024-06-25 03:31:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13128908800. Throughput: 0: 42728.1. Samples: 13129087180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:31:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 03:31:05,918][15401] Updated weights for policy 0, policy_version 801333 (0.0028) [2024-06-25 03:31:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42654.3). Total num frames: 13129138176. Throughput: 0: 42491.7. Samples: 13129205460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:31:08,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 03:31:09,717][15401] Updated weights for policy 0, policy_version 801343 (0.0027) [2024-06-25 03:31:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13129351168. Throughput: 0: 42631.5. Samples: 13129470340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:31:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 03:31:13,503][15401] Updated weights for policy 0, policy_version 801353 (0.0036) [2024-06-25 03:31:17,203][15401] Updated weights for policy 0, policy_version 801363 (0.0035) [2024-06-25 03:31:18,391][15132] Fps is (10 sec: 39323.3, 60 sec: 41777.9, 300 sec: 42542.6). Total num frames: 13129531392. Throughput: 0: 42791.6. Samples: 13129728880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:31:18,392][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 03:31:21,122][15401] Updated weights for policy 0, policy_version 801373 (0.0023) [2024-06-25 03:31:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13129760768. Throughput: 0: 42791.9. Samples: 13129852300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-25 03:31:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 03:31:24,829][15401] Updated weights for policy 0, policy_version 801383 (0.0028) [2024-06-25 03:31:27,176][15349] Signal inference workers to stop experience collection... (194350 times) [2024-06-25 03:31:27,177][15349] Signal inference workers to resume experience collection... (194350 times) [2024-06-25 03:31:27,228][15401] InferenceWorker_p0-w0: stopping experience collection (194350 times) [2024-06-25 03:31:27,228][15401] InferenceWorker_p0-w0: resuming experience collection (194350 times) [2024-06-25 03:31:28,390][15132] Fps is (10 sec: 45884.0, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 13129990144. Throughput: 0: 42821.7. Samples: 13130115800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:28,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 03:31:28,782][15401] Updated weights for policy 0, policy_version 801393 (0.0038) [2024-06-25 03:31:32,472][15401] Updated weights for policy 0, policy_version 801403 (0.0035) [2024-06-25 03:31:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13130186752. Throughput: 0: 42738.2. Samples: 13130372420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 03:31:36,341][15401] Updated weights for policy 0, policy_version 801413 (0.0040) [2024-06-25 03:31:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 13130416128. Throughput: 0: 42899.6. Samples: 13130499820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:31:40,155][15401] Updated weights for policy 0, policy_version 801423 (0.0038) [2024-06-25 03:31:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13130629120. Throughput: 0: 42943.9. Samples: 13130763480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:43,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 03:31:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801431_13130645504.pth... [2024-06-25 03:31:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000800805_13120389120.pth [2024-06-25 03:31:44,074][15401] Updated weights for policy 0, policy_version 801433 (0.0038) [2024-06-25 03:31:47,840][15401] Updated weights for policy 0, policy_version 801443 (0.0038) [2024-06-25 03:31:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13130842112. Throughput: 0: 42924.1. Samples: 13131018760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 03:31:51,723][15401] Updated weights for policy 0, policy_version 801453 (0.0031) [2024-06-25 03:31:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13131055104. Throughput: 0: 43191.2. Samples: 13131148960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 03:31:55,475][15401] Updated weights for policy 0, policy_version 801463 (0.0024) [2024-06-25 03:31:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13131300864. Throughput: 0: 43061.3. Samples: 13131408100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:31:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:31:59,398][15401] Updated weights for policy 0, policy_version 801473 (0.0034) [2024-06-25 03:32:03,191][15401] Updated weights for policy 0, policy_version 801483 (0.0028) [2024-06-25 03:32:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13131497472. Throughput: 0: 43144.2. Samples: 13131670280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 03:32:06,789][15401] Updated weights for policy 0, policy_version 801493 (0.0032) [2024-06-25 03:32:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13131710464. Throughput: 0: 43233.4. Samples: 13131797800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 03:32:10,757][15401] Updated weights for policy 0, policy_version 801503 (0.0034) [2024-06-25 03:32:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13131939840. Throughput: 0: 43130.2. Samples: 13132056660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:13,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 03:32:14,260][15401] Updated weights for policy 0, policy_version 801513 (0.0041) [2024-06-25 03:32:18,386][15401] Updated weights for policy 0, policy_version 801523 (0.0028) [2024-06-25 03:32:18,393][15132] Fps is (10 sec: 44220.9, 60 sec: 43689.5, 300 sec: 42653.4). Total num frames: 13132152832. Throughput: 0: 43109.9. Samples: 13132312520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:18,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 03:32:22,027][15401] Updated weights for policy 0, policy_version 801533 (0.0047) [2024-06-25 03:32:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13132349440. Throughput: 0: 43120.1. Samples: 13132440220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 03:32:25,927][15401] Updated weights for policy 0, policy_version 801543 (0.0025) [2024-06-25 03:32:28,390][15132] Fps is (10 sec: 42613.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13132578816. Throughput: 0: 42911.9. Samples: 13132694520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 03:32:29,598][15401] Updated weights for policy 0, policy_version 801553 (0.0038) [2024-06-25 03:32:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42709.6). Total num frames: 13132791808. Throughput: 0: 43100.8. Samples: 13132958300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 03:32:33,453][15401] Updated weights for policy 0, policy_version 801563 (0.0027) [2024-06-25 03:32:37,104][15401] Updated weights for policy 0, policy_version 801573 (0.0041) [2024-06-25 03:32:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13133004800. Throughput: 0: 42988.8. Samples: 13133083460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 03:32:41,076][15401] Updated weights for policy 0, policy_version 801583 (0.0046) [2024-06-25 03:32:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13133217792. Throughput: 0: 42879.1. Samples: 13133337660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 03:32:44,929][15401] Updated weights for policy 0, policy_version 801593 (0.0027) [2024-06-25 03:32:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13133430784. Throughput: 0: 42888.8. Samples: 13133600280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 03:32:48,943][15401] Updated weights for policy 0, policy_version 801603 (0.0032) [2024-06-25 03:32:52,693][15401] Updated weights for policy 0, policy_version 801613 (0.0032) [2024-06-25 03:32:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13133643776. Throughput: 0: 42792.7. Samples: 13133723480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 03:32:56,818][15401] Updated weights for policy 0, policy_version 801623 (0.0038) [2024-06-25 03:32:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13133873152. Throughput: 0: 42783.2. Samples: 13133981900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:32:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 03:33:00,603][15401] Updated weights for policy 0, policy_version 801633 (0.0038) [2024-06-25 03:33:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13134069760. Throughput: 0: 42776.2. Samples: 13134237300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:33:03,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 03:33:04,514][15401] Updated weights for policy 0, policy_version 801643 (0.0033) [2024-06-25 03:33:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13134266368. Throughput: 0: 42639.6. Samples: 13134359000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 03:33:08,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 03:33:08,434][15401] Updated weights for policy 0, policy_version 801653 (0.0037) [2024-06-25 03:33:12,249][15401] Updated weights for policy 0, policy_version 801663 (0.0034) [2024-06-25 03:33:12,275][15349] Signal inference workers to stop experience collection... (194400 times) [2024-06-25 03:33:12,275][15349] Signal inference workers to resume experience collection... (194400 times) [2024-06-25 03:33:12,312][15401] InferenceWorker_p0-w0: stopping experience collection (194400 times) [2024-06-25 03:33:12,313][15401] InferenceWorker_p0-w0: resuming experience collection (194400 times) [2024-06-25 03:33:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13134528512. Throughput: 0: 42867.1. Samples: 13134623540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 03:33:16,442][15401] Updated weights for policy 0, policy_version 801673 (0.0030) [2024-06-25 03:33:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.9, 300 sec: 42876.1). Total num frames: 13134725120. Throughput: 0: 42579.0. Samples: 13134874360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:33:19,831][15401] Updated weights for policy 0, policy_version 801683 (0.0026) [2024-06-25 03:33:23,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13134905344. Throughput: 0: 42679.5. Samples: 13135004040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 03:33:23,953][15401] Updated weights for policy 0, policy_version 801693 (0.0035) [2024-06-25 03:33:27,209][15401] Updated weights for policy 0, policy_version 801703 (0.0033) [2024-06-25 03:33:28,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 13135151104. Throughput: 0: 42767.5. Samples: 13135262300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:28,392][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 03:33:31,523][15401] Updated weights for policy 0, policy_version 801713 (0.0042) [2024-06-25 03:33:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13135364096. Throughput: 0: 42654.3. Samples: 13135519720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 03:33:34,834][15401] Updated weights for policy 0, policy_version 801723 (0.0036) [2024-06-25 03:33:38,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13135560704. Throughput: 0: 42820.6. Samples: 13135650400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:33:38,913][15401] Updated weights for policy 0, policy_version 801733 (0.0036) [2024-06-25 03:33:42,256][15401] Updated weights for policy 0, policy_version 801743 (0.0028) [2024-06-25 03:33:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13135806464. Throughput: 0: 42896.9. Samples: 13135912260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 03:33:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801746_13135806464.pth... [2024-06-25 03:33:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801118_13125517312.pth [2024-06-25 03:33:46,411][15401] Updated weights for policy 0, policy_version 801753 (0.0041) [2024-06-25 03:33:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13136003072. Throughput: 0: 42933.4. Samples: 13136169300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 03:33:49,910][15401] Updated weights for policy 0, policy_version 801763 (0.0032) [2024-06-25 03:33:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13136216064. Throughput: 0: 43036.0. Samples: 13136295620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 03:33:54,412][15401] Updated weights for policy 0, policy_version 801773 (0.0031) [2024-06-25 03:33:57,838][15401] Updated weights for policy 0, policy_version 801783 (0.0029) [2024-06-25 03:33:58,393][15132] Fps is (10 sec: 44220.3, 60 sec: 42868.8, 300 sec: 42875.5). Total num frames: 13136445440. Throughput: 0: 42907.6. Samples: 13136554540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:33:58,394][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 03:34:01,956][15401] Updated weights for policy 0, policy_version 801793 (0.0031) [2024-06-25 03:34:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13136658432. Throughput: 0: 42985.5. Samples: 13136808700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 03:34:05,341][15401] Updated weights for policy 0, policy_version 801803 (0.0045) [2024-06-25 03:34:08,389][15132] Fps is (10 sec: 40975.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13136855040. Throughput: 0: 43097.4. Samples: 13136943420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 03:34:09,404][15401] Updated weights for policy 0, policy_version 801813 (0.0053) [2024-06-25 03:34:12,977][15401] Updated weights for policy 0, policy_version 801823 (0.0031) [2024-06-25 03:34:13,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13137068032. Throughput: 0: 42980.0. Samples: 13137196300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 03:34:17,117][15401] Updated weights for policy 0, policy_version 801833 (0.0046) [2024-06-25 03:34:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13137297408. Throughput: 0: 42975.5. Samples: 13137453620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 03:34:20,771][15401] Updated weights for policy 0, policy_version 801843 (0.0028) [2024-06-25 03:34:23,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 13137510400. Throughput: 0: 43149.6. Samples: 13137592240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:23,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 03:34:24,765][15401] Updated weights for policy 0, policy_version 801853 (0.0032) [2024-06-25 03:34:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 13137723392. Throughput: 0: 42982.4. Samples: 13137846460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 03:34:28,398][15401] Updated weights for policy 0, policy_version 801863 (0.0047) [2024-06-25 03:34:32,107][15401] Updated weights for policy 0, policy_version 801873 (0.0030) [2024-06-25 03:34:32,611][15349] Signal inference workers to stop experience collection... (194450 times) [2024-06-25 03:34:32,611][15349] Signal inference workers to resume experience collection... (194450 times) [2024-06-25 03:34:32,633][15401] InferenceWorker_p0-w0: stopping experience collection (194450 times) [2024-06-25 03:34:32,634][15401] InferenceWorker_p0-w0: resuming experience collection (194450 times) [2024-06-25 03:34:33,389][15132] Fps is (10 sec: 44248.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13137952768. Throughput: 0: 42907.2. Samples: 13138100120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 03:34:36,050][15401] Updated weights for policy 0, policy_version 801883 (0.0042) [2024-06-25 03:34:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13138149376. Throughput: 0: 43046.6. Samples: 13138232720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 03:34:39,917][15401] Updated weights for policy 0, policy_version 801893 (0.0033) [2024-06-25 03:34:43,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13138345984. Throughput: 0: 42907.0. Samples: 13138485200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:43,396][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 03:34:43,638][15401] Updated weights for policy 0, policy_version 801903 (0.0038) [2024-06-25 03:34:47,432][15401] Updated weights for policy 0, policy_version 801913 (0.0031) [2024-06-25 03:34:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13138575360. Throughput: 0: 43043.9. Samples: 13138745680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:34:51,423][15401] Updated weights for policy 0, policy_version 801923 (0.0029) [2024-06-25 03:34:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13138788352. Throughput: 0: 42992.3. Samples: 13138878080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 03:34:54,934][15401] Updated weights for policy 0, policy_version 801933 (0.0028) [2024-06-25 03:34:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42599.4, 300 sec: 42820.2). Total num frames: 13139001344. Throughput: 0: 42999.5. Samples: 13139131380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:34:58,392][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 03:34:58,941][15401] Updated weights for policy 0, policy_version 801943 (0.0043) [2024-06-25 03:35:02,453][15401] Updated weights for policy 0, policy_version 801953 (0.0036) [2024-06-25 03:35:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13139247104. Throughput: 0: 43147.9. Samples: 13139395280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:03,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-25 03:35:06,466][15401] Updated weights for policy 0, policy_version 801963 (0.0039) [2024-06-25 03:35:08,392][15132] Fps is (10 sec: 44236.8, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 13139443712. Throughput: 0: 43042.7. Samples: 13139529160. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:08,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 03:35:09,921][15401] Updated weights for policy 0, policy_version 801973 (0.0054) [2024-06-25 03:35:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13139640320. Throughput: 0: 42952.3. Samples: 13139779320. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 03:35:14,038][15401] Updated weights for policy 0, policy_version 801983 (0.0034) [2024-06-25 03:35:17,712][15401] Updated weights for policy 0, policy_version 801993 (0.0021) [2024-06-25 03:35:18,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 13139869696. Throughput: 0: 43118.4. Samples: 13140040560. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:18,393][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 03:35:21,557][15401] Updated weights for policy 0, policy_version 802003 (0.0034) [2024-06-25 03:35:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.3, 300 sec: 42932.6). Total num frames: 13140099072. Throughput: 0: 43288.0. Samples: 13140180680. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 03:35:25,109][15401] Updated weights for policy 0, policy_version 802013 (0.0023) [2024-06-25 03:35:28,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 13140295680. Throughput: 0: 43158.4. Samples: 13140427320. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 03:35:29,134][15401] Updated weights for policy 0, policy_version 802023 (0.0043) [2024-06-25 03:35:32,565][15401] Updated weights for policy 0, policy_version 802033 (0.0030) [2024-06-25 03:35:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13140541440. Throughput: 0: 43298.1. Samples: 13140694100. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 03:35:36,733][15401] Updated weights for policy 0, policy_version 802043 (0.0040) [2024-06-25 03:35:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13140738048. Throughput: 0: 43341.5. Samples: 13140828440. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 03:35:40,091][15401] Updated weights for policy 0, policy_version 802053 (0.0034) [2024-06-25 03:35:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 13140951040. Throughput: 0: 43310.8. Samples: 13141080260. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 03:35:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802060_13140951040.pth... [2024-06-25 03:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801431_13130645504.pth [2024-06-25 03:35:44,536][15401] Updated weights for policy 0, policy_version 802063 (0.0040) [2024-06-25 03:35:47,858][15401] Updated weights for policy 0, policy_version 802073 (0.0037) [2024-06-25 03:35:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13141164032. Throughput: 0: 43263.6. Samples: 13141342140. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 03:35:48,813][15349] Signal inference workers to stop experience collection... (194500 times) [2024-06-25 03:35:48,814][15349] Signal inference workers to resume experience collection... (194500 times) [2024-06-25 03:35:48,855][15401] InferenceWorker_p0-w0: stopping experience collection (194500 times) [2024-06-25 03:35:48,859][15401] InferenceWorker_p0-w0: resuming experience collection (194500 times) [2024-06-25 03:35:52,011][15401] Updated weights for policy 0, policy_version 802083 (0.0029) [2024-06-25 03:35:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13141360640. Throughput: 0: 43179.7. Samples: 13141472140. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 03:35:55,604][15401] Updated weights for policy 0, policy_version 802093 (0.0031) [2024-06-25 03:35:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43419.4, 300 sec: 43042.7). Total num frames: 13141606400. Throughput: 0: 43180.5. Samples: 13141722440. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:35:58,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 03:35:59,890][15401] Updated weights for policy 0, policy_version 802103 (0.0041) [2024-06-25 03:36:03,025][15401] Updated weights for policy 0, policy_version 802113 (0.0036) [2024-06-25 03:36:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 13141819392. Throughput: 0: 43222.9. Samples: 13141985480. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:03,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:36:07,603][15401] Updated weights for policy 0, policy_version 802123 (0.0027) [2024-06-25 03:36:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 13142016000. Throughput: 0: 43000.8. Samples: 13142115720. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:08,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 03:36:10,555][15401] Updated weights for policy 0, policy_version 802133 (0.0032) [2024-06-25 03:36:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 43098.5). Total num frames: 13142245376. Throughput: 0: 43059.1. Samples: 13142364980. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 03:36:15,010][15401] Updated weights for policy 0, policy_version 802143 (0.0040) [2024-06-25 03:36:18,279][15401] Updated weights for policy 0, policy_version 802153 (0.0032) [2024-06-25 03:36:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43419.4, 300 sec: 43098.3). Total num frames: 13142474752. Throughput: 0: 42936.9. Samples: 13142626260. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 03:36:22,379][15401] Updated weights for policy 0, policy_version 802163 (0.0032) [2024-06-25 03:36:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13142671360. Throughput: 0: 42869.7. Samples: 13142757580. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 03:36:25,787][15401] Updated weights for policy 0, policy_version 802173 (0.0039) [2024-06-25 03:36:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 43098.3). Total num frames: 13142900736. Throughput: 0: 43005.3. Samples: 13143015500. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 03:36:29,795][15401] Updated weights for policy 0, policy_version 802183 (0.0027) [2024-06-25 03:36:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13143097344. Throughput: 0: 42900.0. Samples: 13143272640. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:33,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 03:36:33,561][15401] Updated weights for policy 0, policy_version 802193 (0.0033) [2024-06-25 03:36:37,311][15401] Updated weights for policy 0, policy_version 802203 (0.0037) [2024-06-25 03:36:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13143310336. Throughput: 0: 42840.9. Samples: 13143399980. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 03:36:41,142][15401] Updated weights for policy 0, policy_version 802213 (0.0045) [2024-06-25 03:36:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13143539712. Throughput: 0: 42959.1. Samples: 13143655600. Policy #0 lag: (min: 1.0, avg: 8.7, max: 22.0) [2024-06-25 03:36:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 03:36:45,281][15401] Updated weights for policy 0, policy_version 802223 (0.0041) [2024-06-25 03:36:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13143752704. Throughput: 0: 42952.8. Samples: 13143918360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:36:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 03:36:48,970][15401] Updated weights for policy 0, policy_version 802233 (0.0023) [2024-06-25 03:36:52,753][15401] Updated weights for policy 0, policy_version 802243 (0.0038) [2024-06-25 03:36:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 13143982080. Throughput: 0: 42909.9. Samples: 13144046660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:36:53,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-25 03:36:56,578][15401] Updated weights for policy 0, policy_version 802253 (0.0031) [2024-06-25 03:36:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 13144178688. Throughput: 0: 43077.2. Samples: 13144303460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:36:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 03:37:00,292][15401] Updated weights for policy 0, policy_version 802263 (0.0025) [2024-06-25 03:37:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13144408064. Throughput: 0: 43147.6. Samples: 13144567900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 03:37:03,981][15401] Updated weights for policy 0, policy_version 802273 (0.0035) [2024-06-25 03:37:07,803][15401] Updated weights for policy 0, policy_version 802283 (0.0035) [2024-06-25 03:37:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 13144604672. Throughput: 0: 43189.8. Samples: 13144701120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 03:37:11,527][15401] Updated weights for policy 0, policy_version 802293 (0.0033) [2024-06-25 03:37:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42932.2). Total num frames: 13144817664. Throughput: 0: 43044.0. Samples: 13144952480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:13,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 03:37:15,649][15401] Updated weights for policy 0, policy_version 802303 (0.0034) [2024-06-25 03:37:17,288][15349] Signal inference workers to stop experience collection... (194550 times) [2024-06-25 03:37:17,288][15349] Signal inference workers to resume experience collection... (194550 times) [2024-06-25 03:37:17,299][15401] InferenceWorker_p0-w0: stopping experience collection (194550 times) [2024-06-25 03:37:17,325][15401] InferenceWorker_p0-w0: resuming experience collection (194550 times) [2024-06-25 03:37:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13145047040. Throughput: 0: 43050.6. Samples: 13145209920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 03:37:19,131][15401] Updated weights for policy 0, policy_version 802313 (0.0033) [2024-06-25 03:37:23,262][15401] Updated weights for policy 0, policy_version 802323 (0.0034) [2024-06-25 03:37:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13145260032. Throughput: 0: 43291.8. Samples: 13145348120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:23,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 03:37:26,691][15401] Updated weights for policy 0, policy_version 802333 (0.0036) [2024-06-25 03:37:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13145473024. Throughput: 0: 43084.9. Samples: 13145594420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 03:37:30,839][15401] Updated weights for policy 0, policy_version 802343 (0.0046) [2024-06-25 03:37:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13145686016. Throughput: 0: 43011.5. Samples: 13145853880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:33,398][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 03:37:34,355][15401] Updated weights for policy 0, policy_version 802353 (0.0043) [2024-06-25 03:37:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13145882624. Throughput: 0: 43047.9. Samples: 13145983820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 03:37:38,565][15401] Updated weights for policy 0, policy_version 802363 (0.0036) [2024-06-25 03:37:42,163][15401] Updated weights for policy 0, policy_version 802373 (0.0042) [2024-06-25 03:37:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13146128384. Throughput: 0: 43027.7. Samples: 13146239700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 03:37:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802376_13146128384.pth... [2024-06-25 03:37:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000801746_13135806464.pth [2024-06-25 03:37:46,329][15401] Updated weights for policy 0, policy_version 802383 (0.0040) [2024-06-25 03:37:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13146324992. Throughput: 0: 42814.3. Samples: 13146494540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:48,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 03:37:49,775][15401] Updated weights for policy 0, policy_version 802393 (0.0029) [2024-06-25 03:37:53,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42593.8, 300 sec: 42930.7). Total num frames: 13146537984. Throughput: 0: 42679.2. Samples: 13146621960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:53,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 03:37:54,107][15401] Updated weights for policy 0, policy_version 802403 (0.0039) [2024-06-25 03:37:57,358][15401] Updated weights for policy 0, policy_version 802413 (0.0039) [2024-06-25 03:37:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 13146767360. Throughput: 0: 42984.9. Samples: 13146886800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:37:58,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 03:38:01,659][15401] Updated weights for policy 0, policy_version 802423 (0.0038) [2024-06-25 03:38:03,390][15132] Fps is (10 sec: 44264.6, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 13146980352. Throughput: 0: 42860.9. Samples: 13147138660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:03,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 03:38:04,913][15401] Updated weights for policy 0, policy_version 802433 (0.0035) [2024-06-25 03:38:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13147176960. Throughput: 0: 42595.3. Samples: 13147264900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 03:38:09,385][15401] Updated weights for policy 0, policy_version 802443 (0.0039) [2024-06-25 03:38:12,700][15401] Updated weights for policy 0, policy_version 802453 (0.0038) [2024-06-25 03:38:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43415.9, 300 sec: 43042.4). Total num frames: 13147422720. Throughput: 0: 42840.8. Samples: 13147522360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:13,392][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 03:38:16,884][15401] Updated weights for policy 0, policy_version 802463 (0.0045) [2024-06-25 03:38:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 13147602944. Throughput: 0: 42785.8. Samples: 13147779240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 03:38:20,299][15401] Updated weights for policy 0, policy_version 802473 (0.0032) [2024-06-25 03:38:23,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 13147832320. Throughput: 0: 42688.8. Samples: 13147904820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 03:38:24,558][15401] Updated weights for policy 0, policy_version 802483 (0.0031) [2024-06-25 03:38:28,091][15401] Updated weights for policy 0, policy_version 802493 (0.0036) [2024-06-25 03:38:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13148045312. Throughput: 0: 42773.4. Samples: 13148164500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 03:38:32,132][15401] Updated weights for policy 0, policy_version 802503 (0.0031) [2024-06-25 03:38:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 13148241920. Throughput: 0: 42854.7. Samples: 13148423000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 03:38:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 03:38:35,641][15401] Updated weights for policy 0, policy_version 802513 (0.0036) [2024-06-25 03:38:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13148471296. Throughput: 0: 42859.4. Samples: 13148550360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:38:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 03:38:39,543][15401] Updated weights for policy 0, policy_version 802523 (0.0032) [2024-06-25 03:38:42,943][15349] Signal inference workers to stop experience collection... (194600 times) [2024-06-25 03:38:42,944][15349] Signal inference workers to resume experience collection... (194600 times) [2024-06-25 03:38:42,984][15401] InferenceWorker_p0-w0: stopping experience collection (194600 times) [2024-06-25 03:38:42,984][15401] InferenceWorker_p0-w0: resuming experience collection (194600 times) [2024-06-25 03:38:43,081][15401] Updated weights for policy 0, policy_version 802533 (0.0031) [2024-06-25 03:38:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 43042.4). Total num frames: 13148700672. Throughput: 0: 42755.9. Samples: 13148810920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:38:43,393][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 03:38:47,262][15401] Updated weights for policy 0, policy_version 802543 (0.0033) [2024-06-25 03:38:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13148880896. Throughput: 0: 42984.2. Samples: 13149072940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:38:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 03:38:50,632][15401] Updated weights for policy 0, policy_version 802553 (0.0029) [2024-06-25 03:38:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42876.0, 300 sec: 42932.2). Total num frames: 13149110272. Throughput: 0: 42808.8. Samples: 13149191300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:38:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 03:38:54,693][15401] Updated weights for policy 0, policy_version 802563 (0.0037) [2024-06-25 03:38:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 13149339648. Throughput: 0: 42876.9. Samples: 13149451720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:38:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 03:38:58,426][15401] Updated weights for policy 0, policy_version 802573 (0.0039) [2024-06-25 03:39:02,533][15401] Updated weights for policy 0, policy_version 802583 (0.0030) [2024-06-25 03:39:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 13149536256. Throughput: 0: 42895.6. Samples: 13149709540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 03:39:06,265][15401] Updated weights for policy 0, policy_version 802593 (0.0031) [2024-06-25 03:39:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13149765632. Throughput: 0: 42895.2. Samples: 13149835100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 03:39:10,345][15401] Updated weights for policy 0, policy_version 802603 (0.0023) [2024-06-25 03:39:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42987.1). Total num frames: 13149978624. Throughput: 0: 42919.3. Samples: 13150095880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 03:39:14,051][15401] Updated weights for policy 0, policy_version 802613 (0.0032) [2024-06-25 03:39:17,866][15401] Updated weights for policy 0, policy_version 802623 (0.0035) [2024-06-25 03:39:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 13150191616. Throughput: 0: 42730.6. Samples: 13150345880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:18,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 03:39:22,158][15401] Updated weights for policy 0, policy_version 802633 (0.0037) [2024-06-25 03:39:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.1). Total num frames: 13150404608. Throughput: 0: 42728.9. Samples: 13150473160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 03:39:25,453][15401] Updated weights for policy 0, policy_version 802643 (0.0033) [2024-06-25 03:39:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13150617600. Throughput: 0: 42805.4. Samples: 13150737060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 03:39:29,658][15401] Updated weights for policy 0, policy_version 802653 (0.0037) [2024-06-25 03:39:32,966][15401] Updated weights for policy 0, policy_version 802663 (0.0036) [2024-06-25 03:39:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13150830592. Throughput: 0: 42465.3. Samples: 13150983880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 03:39:37,297][15401] Updated weights for policy 0, policy_version 802673 (0.0037) [2024-06-25 03:39:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13151027200. Throughput: 0: 42790.5. Samples: 13151116880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 03:39:40,507][15401] Updated weights for policy 0, policy_version 802683 (0.0025) [2024-06-25 03:39:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42987.2). Total num frames: 13151256576. Throughput: 0: 42816.1. Samples: 13151378440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 03:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802690_13151272960.pth... [2024-06-25 03:39:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802060_13140951040.pth [2024-06-25 03:39:44,881][15401] Updated weights for policy 0, policy_version 802693 (0.0037) [2024-06-25 03:39:48,059][15401] Updated weights for policy 0, policy_version 802703 (0.0030) [2024-06-25 03:39:48,389][15132] Fps is (10 sec: 45876.5, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13151485952. Throughput: 0: 42573.9. Samples: 13151625360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 03:39:50,454][15349] Signal inference workers to stop experience collection... (194650 times) [2024-06-25 03:39:50,489][15401] InferenceWorker_p0-w0: stopping experience collection (194650 times) [2024-06-25 03:39:50,514][15349] Signal inference workers to resume experience collection... (194650 times) [2024-06-25 03:39:50,520][15401] InferenceWorker_p0-w0: resuming experience collection (194650 times) [2024-06-25 03:39:52,481][15401] Updated weights for policy 0, policy_version 802713 (0.0033) [2024-06-25 03:39:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 13151666176. Throughput: 0: 42698.6. Samples: 13151756540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 03:39:56,093][15401] Updated weights for policy 0, policy_version 802723 (0.0044) [2024-06-25 03:39:58,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 13151879168. Throughput: 0: 42536.6. Samples: 13152010120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:39:58,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 03:39:59,972][15401] Updated weights for policy 0, policy_version 802733 (0.0039) [2024-06-25 03:40:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 13152108544. Throughput: 0: 42801.2. Samples: 13152271940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:40:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 03:40:03,604][15401] Updated weights for policy 0, policy_version 802743 (0.0046) [2024-06-25 03:40:07,955][15401] Updated weights for policy 0, policy_version 802753 (0.0037) [2024-06-25 03:40:08,396][15132] Fps is (10 sec: 42581.0, 60 sec: 42320.8, 300 sec: 42930.7). Total num frames: 13152305152. Throughput: 0: 42911.7. Samples: 13152404460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:40:08,396][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 03:40:11,477][15401] Updated weights for policy 0, policy_version 802763 (0.0040) [2024-06-25 03:40:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42876.4). Total num frames: 13152518144. Throughput: 0: 42590.6. Samples: 13152653640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:40:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 03:40:15,463][15401] Updated weights for policy 0, policy_version 802773 (0.0040) [2024-06-25 03:40:18,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13152747520. Throughput: 0: 42774.7. Samples: 13152908740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 03:40:18,390][15132] Avg episode reward: [(0, '0.175')] [2024-06-25 03:40:19,006][15401] Updated weights for policy 0, policy_version 802783 (0.0032) [2024-06-25 03:40:23,004][15401] Updated weights for policy 0, policy_version 802793 (0.0034) [2024-06-25 03:40:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13152960512. Throughput: 0: 42808.6. Samples: 13153043260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:23,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 03:40:26,875][15401] Updated weights for policy 0, policy_version 802803 (0.0035) [2024-06-25 03:40:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13153157120. Throughput: 0: 42486.1. Samples: 13153290320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 03:40:31,091][15401] Updated weights for policy 0, policy_version 802813 (0.0029) [2024-06-25 03:40:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13153402880. Throughput: 0: 42660.7. Samples: 13153545100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 03:40:34,526][15401] Updated weights for policy 0, policy_version 802823 (0.0031) [2024-06-25 03:40:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 13153583104. Throughput: 0: 42699.6. Samples: 13153678020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:40:38,724][15401] Updated weights for policy 0, policy_version 802833 (0.0027) [2024-06-25 03:40:42,025][15401] Updated weights for policy 0, policy_version 802843 (0.0030) [2024-06-25 03:40:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13153828864. Throughput: 0: 42692.5. Samples: 13153931180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:43,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 03:40:46,287][15401] Updated weights for policy 0, policy_version 802853 (0.0023) [2024-06-25 03:40:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13154041856. Throughput: 0: 42625.4. Samples: 13154190080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 03:40:49,711][15401] Updated weights for policy 0, policy_version 802863 (0.0039) [2024-06-25 03:40:53,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13154222080. Throughput: 0: 42632.2. Samples: 13154322640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 03:40:53,936][15401] Updated weights for policy 0, policy_version 802873 (0.0030) [2024-06-25 03:40:57,447][15401] Updated weights for policy 0, policy_version 802883 (0.0030) [2024-06-25 03:40:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13154435072. Throughput: 0: 42632.5. Samples: 13154572100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:40:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 03:41:01,491][15401] Updated weights for policy 0, policy_version 802893 (0.0033) [2024-06-25 03:41:03,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 13154680832. Throughput: 0: 42679.1. Samples: 13154829300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 03:41:05,728][15401] Updated weights for policy 0, policy_version 802903 (0.0040) [2024-06-25 03:41:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.1, 300 sec: 42820.6). Total num frames: 13154877440. Throughput: 0: 42637.4. Samples: 13154961940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 03:41:09,360][15401] Updated weights for policy 0, policy_version 802913 (0.0029) [2024-06-25 03:41:10,156][15349] Signal inference workers to stop experience collection... (194700 times) [2024-06-25 03:41:10,158][15349] Signal inference workers to resume experience collection... (194700 times) [2024-06-25 03:41:10,201][15401] InferenceWorker_p0-w0: stopping experience collection (194700 times) [2024-06-25 03:41:10,201][15401] InferenceWorker_p0-w0: resuming experience collection (194700 times) [2024-06-25 03:41:13,197][15401] Updated weights for policy 0, policy_version 802923 (0.0028) [2024-06-25 03:41:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13155090432. Throughput: 0: 42892.5. Samples: 13155220480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 03:41:17,067][15401] Updated weights for policy 0, policy_version 802933 (0.0033) [2024-06-25 03:41:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13155336192. Throughput: 0: 42898.4. Samples: 13155475520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 03:41:20,679][15401] Updated weights for policy 0, policy_version 802943 (0.0028) [2024-06-25 03:41:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13155500032. Throughput: 0: 42822.3. Samples: 13155605020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 03:41:24,767][15401] Updated weights for policy 0, policy_version 802953 (0.0042) [2024-06-25 03:41:28,212][15401] Updated weights for policy 0, policy_version 802963 (0.0033) [2024-06-25 03:41:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13155745792. Throughput: 0: 42899.9. Samples: 13155861680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 03:41:32,296][15401] Updated weights for policy 0, policy_version 802973 (0.0042) [2024-06-25 03:41:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13155975168. Throughput: 0: 42806.2. Samples: 13156116360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 03:41:36,065][15401] Updated weights for policy 0, policy_version 802983 (0.0039) [2024-06-25 03:41:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13156155392. Throughput: 0: 42772.1. Samples: 13156247380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 03:41:39,820][15401] Updated weights for policy 0, policy_version 802993 (0.0027) [2024-06-25 03:41:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13156384768. Throughput: 0: 42895.1. Samples: 13156502380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 03:41:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803003_13156401152.pth... [2024-06-25 03:41:43,552][15401] Updated weights for policy 0, policy_version 803003 (0.0031) [2024-06-25 03:41:43,590][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802376_13146128384.pth [2024-06-25 03:41:47,796][15401] Updated weights for policy 0, policy_version 803013 (0.0047) [2024-06-25 03:41:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13156597760. Throughput: 0: 42935.9. Samples: 13156761420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 03:41:51,523][15401] Updated weights for policy 0, policy_version 803023 (0.0041) [2024-06-25 03:41:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13156794368. Throughput: 0: 42823.1. Samples: 13156888980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 03:41:55,447][15401] Updated weights for policy 0, policy_version 803033 (0.0023) [2024-06-25 03:41:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13157023744. Throughput: 0: 42596.9. Samples: 13157137340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:41:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 03:41:59,018][15401] Updated weights for policy 0, policy_version 803043 (0.0033) [2024-06-25 03:42:02,952][15401] Updated weights for policy 0, policy_version 803053 (0.0021) [2024-06-25 03:42:03,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13157236736. Throughput: 0: 42670.4. Samples: 13157395700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 03:42:03,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-25 03:42:06,819][15401] Updated weights for policy 0, policy_version 803063 (0.0037) [2024-06-25 03:42:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13157449728. Throughput: 0: 42731.9. Samples: 13157527960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:08,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 03:42:10,462][15401] Updated weights for policy 0, policy_version 803073 (0.0032) [2024-06-25 03:42:13,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13157662720. Throughput: 0: 42669.3. Samples: 13157781800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 03:42:14,266][15401] Updated weights for policy 0, policy_version 803083 (0.0038) [2024-06-25 03:42:17,990][15401] Updated weights for policy 0, policy_version 803093 (0.0035) [2024-06-25 03:42:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13157892096. Throughput: 0: 42768.4. Samples: 13158040940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 03:42:21,891][15401] Updated weights for policy 0, policy_version 803103 (0.0034) [2024-06-25 03:42:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13158072320. Throughput: 0: 42837.3. Samples: 13158175060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 03:42:25,657][15401] Updated weights for policy 0, policy_version 803113 (0.0044) [2024-06-25 03:42:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13158318080. Throughput: 0: 42845.3. Samples: 13158430420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 03:42:29,317][15349] Signal inference workers to stop experience collection... (194750 times) [2024-06-25 03:42:29,348][15401] InferenceWorker_p0-w0: stopping experience collection (194750 times) [2024-06-25 03:42:29,373][15349] Signal inference workers to resume experience collection... (194750 times) [2024-06-25 03:42:29,374][15401] InferenceWorker_p0-w0: resuming experience collection (194750 times) [2024-06-25 03:42:29,515][15401] Updated weights for policy 0, policy_version 803123 (0.0030) [2024-06-25 03:42:33,309][15401] Updated weights for policy 0, policy_version 803133 (0.0028) [2024-06-25 03:42:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13158531072. Throughput: 0: 42862.3. Samples: 13158690220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 03:42:37,091][15401] Updated weights for policy 0, policy_version 803143 (0.0029) [2024-06-25 03:42:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13158711296. Throughput: 0: 42888.4. Samples: 13158818960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 03:42:41,030][15401] Updated weights for policy 0, policy_version 803153 (0.0033) [2024-06-25 03:42:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13158957056. Throughput: 0: 42976.9. Samples: 13159071300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 03:42:45,231][15401] Updated weights for policy 0, policy_version 803163 (0.0027) [2024-06-25 03:42:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42821.5). Total num frames: 13159170048. Throughput: 0: 42979.0. Samples: 13159329740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 03:42:48,529][15401] Updated weights for policy 0, policy_version 803173 (0.0036) [2024-06-25 03:42:52,631][15401] Updated weights for policy 0, policy_version 803183 (0.0033) [2024-06-25 03:42:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13159366656. Throughput: 0: 42757.9. Samples: 13159452060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 03:42:56,655][15401] Updated weights for policy 0, policy_version 803193 (0.0028) [2024-06-25 03:42:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13159596032. Throughput: 0: 42897.3. Samples: 13159712180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:42:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 03:43:00,222][15401] Updated weights for policy 0, policy_version 803203 (0.0032) [2024-06-25 03:43:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13159809024. Throughput: 0: 42781.7. Samples: 13159966120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 03:43:04,256][15401] Updated weights for policy 0, policy_version 803213 (0.0041) [2024-06-25 03:43:08,192][15401] Updated weights for policy 0, policy_version 803223 (0.0047) [2024-06-25 03:43:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13160005632. Throughput: 0: 42734.2. Samples: 13160098100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 03:43:11,720][15401] Updated weights for policy 0, policy_version 803233 (0.0044) [2024-06-25 03:43:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13160251392. Throughput: 0: 42756.6. Samples: 13160354460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:13,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 03:43:15,743][15401] Updated weights for policy 0, policy_version 803243 (0.0028) [2024-06-25 03:43:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13160431616. Throughput: 0: 42820.8. Samples: 13160617160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 03:43:19,228][15401] Updated weights for policy 0, policy_version 803253 (0.0029) [2024-06-25 03:43:23,355][15401] Updated weights for policy 0, policy_version 803263 (0.0039) [2024-06-25 03:43:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13160660992. Throughput: 0: 42636.0. Samples: 13160737580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:23,390][15132] Avg episode reward: [(0, '0.188')] [2024-06-25 03:43:27,110][15401] Updated weights for policy 0, policy_version 803273 (0.0041) [2024-06-25 03:43:28,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 13160873984. Throughput: 0: 42665.5. Samples: 13160991520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:28,396][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 03:43:30,934][15401] Updated weights for policy 0, policy_version 803283 (0.0031) [2024-06-25 03:43:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13161070592. Throughput: 0: 42700.7. Samples: 13161251280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 03:43:34,623][15401] Updated weights for policy 0, policy_version 803293 (0.0024) [2024-06-25 03:43:38,396][15132] Fps is (10 sec: 42598.4, 60 sec: 43139.9, 300 sec: 42708.9). Total num frames: 13161299968. Throughput: 0: 42719.2. Samples: 13161374700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:38,397][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 03:43:38,527][15401] Updated weights for policy 0, policy_version 803303 (0.0027) [2024-06-25 03:43:42,208][15401] Updated weights for policy 0, policy_version 803313 (0.0046) [2024-06-25 03:43:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13161529344. Throughput: 0: 42721.4. Samples: 13161634640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 03:43:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803316_13161529344.pth... [2024-06-25 03:43:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000802690_13151272960.pth [2024-06-25 03:43:43,730][15349] Signal inference workers to stop experience collection... (194800 times) [2024-06-25 03:43:43,773][15401] InferenceWorker_p0-w0: stopping experience collection (194800 times) [2024-06-25 03:43:43,847][15349] Signal inference workers to resume experience collection... (194800 times) [2024-06-25 03:43:43,847][15401] InferenceWorker_p0-w0: resuming experience collection (194800 times) [2024-06-25 03:43:46,203][15401] Updated weights for policy 0, policy_version 803323 (0.0045) [2024-06-25 03:43:48,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 13161725952. Throughput: 0: 42784.9. Samples: 13161891440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 03:43:49,913][15401] Updated weights for policy 0, policy_version 803333 (0.0039) [2024-06-25 03:43:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13161938944. Throughput: 0: 42566.3. Samples: 13162013580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 03:43:53,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-25 03:43:54,265][15401] Updated weights for policy 0, policy_version 803343 (0.0028) [2024-06-25 03:43:57,506][15401] Updated weights for policy 0, policy_version 803353 (0.0031) [2024-06-25 03:43:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13162151936. Throughput: 0: 42601.3. Samples: 13162271520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:43:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 03:44:01,830][15401] Updated weights for policy 0, policy_version 803363 (0.0032) [2024-06-25 03:44:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13162364928. Throughput: 0: 42601.7. Samples: 13162534240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 03:44:05,347][15401] Updated weights for policy 0, policy_version 803373 (0.0029) [2024-06-25 03:44:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13162594304. Throughput: 0: 42717.3. Samples: 13162659860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 03:44:09,538][15401] Updated weights for policy 0, policy_version 803383 (0.0034) [2024-06-25 03:44:12,869][15401] Updated weights for policy 0, policy_version 803393 (0.0037) [2024-06-25 03:44:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13162807296. Throughput: 0: 42877.1. Samples: 13162920720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:13,391][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 03:44:17,149][15401] Updated weights for policy 0, policy_version 803403 (0.0037) [2024-06-25 03:44:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13163003904. Throughput: 0: 42863.6. Samples: 13163180140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 03:44:20,368][15401] Updated weights for policy 0, policy_version 803413 (0.0026) [2024-06-25 03:44:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13163216896. Throughput: 0: 42875.9. Samples: 13163303840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 03:44:24,735][15401] Updated weights for policy 0, policy_version 803423 (0.0032) [2024-06-25 03:44:28,204][15401] Updated weights for policy 0, policy_version 803433 (0.0036) [2024-06-25 03:44:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 13163446272. Throughput: 0: 42884.0. Samples: 13163564420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 03:44:32,497][15401] Updated weights for policy 0, policy_version 803443 (0.0037) [2024-06-25 03:44:33,393][15132] Fps is (10 sec: 44220.0, 60 sec: 43141.8, 300 sec: 42820.0). Total num frames: 13163659264. Throughput: 0: 42897.4. Samples: 13163821980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:33,394][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 03:44:35,905][15401] Updated weights for policy 0, policy_version 803453 (0.0032) [2024-06-25 03:44:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 13163855872. Throughput: 0: 43002.3. Samples: 13163948680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 03:44:40,203][15401] Updated weights for policy 0, policy_version 803463 (0.0030) [2024-06-25 03:44:43,369][15401] Updated weights for policy 0, policy_version 803473 (0.0028) [2024-06-25 03:44:43,390][15132] Fps is (10 sec: 44252.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13164101632. Throughput: 0: 42937.7. Samples: 13164203720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 03:44:47,929][15401] Updated weights for policy 0, policy_version 803483 (0.0030) [2024-06-25 03:44:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13164281856. Throughput: 0: 43014.3. Samples: 13164469880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 03:44:50,925][15401] Updated weights for policy 0, policy_version 803493 (0.0033) [2024-06-25 03:44:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13164511232. Throughput: 0: 42940.9. Samples: 13164592200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 03:44:55,639][15401] Updated weights for policy 0, policy_version 803503 (0.0035) [2024-06-25 03:44:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13164724224. Throughput: 0: 42780.5. Samples: 13164845840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:44:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 03:44:58,744][15401] Updated weights for policy 0, policy_version 803513 (0.0038) [2024-06-25 03:45:03,245][15401] Updated weights for policy 0, policy_version 803523 (0.0030) [2024-06-25 03:45:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42765.9). Total num frames: 13164920832. Throughput: 0: 42966.2. Samples: 13165113620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 03:45:06,280][15401] Updated weights for policy 0, policy_version 803533 (0.0040) [2024-06-25 03:45:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13165166592. Throughput: 0: 42911.0. Samples: 13165234840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 03:45:10,745][15401] Updated weights for policy 0, policy_version 803543 (0.0035) [2024-06-25 03:45:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13165363200. Throughput: 0: 42861.4. Samples: 13165493180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 03:45:14,120][15401] Updated weights for policy 0, policy_version 803553 (0.0028) [2024-06-25 03:45:18,396][15132] Fps is (10 sec: 39296.9, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 13165559808. Throughput: 0: 42846.4. Samples: 13165750180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:18,396][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 03:45:18,681][15401] Updated weights for policy 0, policy_version 803563 (0.0049) [2024-06-25 03:45:21,784][15401] Updated weights for policy 0, policy_version 803573 (0.0030) [2024-06-25 03:45:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13165805568. Throughput: 0: 42785.2. Samples: 13165874020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:23,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-25 03:45:26,182][15401] Updated weights for policy 0, policy_version 803583 (0.0033) [2024-06-25 03:45:28,390][15132] Fps is (10 sec: 44265.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13166002176. Throughput: 0: 42851.7. Samples: 13166132040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 03:45:29,262][15349] Signal inference workers to stop experience collection... (194850 times) [2024-06-25 03:45:29,297][15401] InferenceWorker_p0-w0: stopping experience collection (194850 times) [2024-06-25 03:45:29,323][15349] Signal inference workers to resume experience collection... (194850 times) [2024-06-25 03:45:29,331][15401] InferenceWorker_p0-w0: resuming experience collection (194850 times) [2024-06-25 03:45:29,333][15401] Updated weights for policy 0, policy_version 803593 (0.0032) [2024-06-25 03:45:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 13166198784. Throughput: 0: 42725.7. Samples: 13166392540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 03:45:33,683][15401] Updated weights for policy 0, policy_version 803603 (0.0037) [2024-06-25 03:45:36,874][15401] Updated weights for policy 0, policy_version 803613 (0.0032) [2024-06-25 03:45:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13166460928. Throughput: 0: 42870.2. Samples: 13166521360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 03:45:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 03:45:41,258][15401] Updated weights for policy 0, policy_version 803623 (0.0035) [2024-06-25 03:45:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13166641152. Throughput: 0: 42831.1. Samples: 13166773240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:45:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 03:45:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803628_13166641152.pth... [2024-06-25 03:45:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803003_13156401152.pth [2024-06-25 03:45:44,657][15401] Updated weights for policy 0, policy_version 803633 (0.0034) [2024-06-25 03:45:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13166854144. Throughput: 0: 42641.8. Samples: 13167032500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:45:48,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 03:45:49,099][15401] Updated weights for policy 0, policy_version 803643 (0.0032) [2024-06-25 03:45:52,303][15401] Updated weights for policy 0, policy_version 803653 (0.0036) [2024-06-25 03:45:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13167099904. Throughput: 0: 42774.3. Samples: 13167159680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:45:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 03:45:56,596][15401] Updated weights for policy 0, policy_version 803663 (0.0039) [2024-06-25 03:45:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13167263744. Throughput: 0: 42941.7. Samples: 13167425560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:45:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 03:45:59,765][15401] Updated weights for policy 0, policy_version 803673 (0.0033) [2024-06-25 03:46:03,391][15132] Fps is (10 sec: 39314.5, 60 sec: 42870.2, 300 sec: 42764.7). Total num frames: 13167493120. Throughput: 0: 42763.5. Samples: 13167674340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:03,404][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 03:46:04,093][15401] Updated weights for policy 0, policy_version 803683 (0.0037) [2024-06-25 03:46:07,818][15401] Updated weights for policy 0, policy_version 803693 (0.0044) [2024-06-25 03:46:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13167738880. Throughput: 0: 42863.6. Samples: 13167802880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 03:46:11,578][15401] Updated weights for policy 0, policy_version 803703 (0.0042) [2024-06-25 03:46:13,390][15132] Fps is (10 sec: 40966.5, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 13167902720. Throughput: 0: 42808.7. Samples: 13168058440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:13,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-25 03:46:15,386][15401] Updated weights for policy 0, policy_version 803713 (0.0038) [2024-06-25 03:46:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 13168148480. Throughput: 0: 42629.1. Samples: 13168310840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:18,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 03:46:19,400][15401] Updated weights for policy 0, policy_version 803723 (0.0032) [2024-06-25 03:46:23,105][15401] Updated weights for policy 0, policy_version 803733 (0.0026) [2024-06-25 03:46:23,390][15132] Fps is (10 sec: 47514.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13168377856. Throughput: 0: 42716.5. Samples: 13168443600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 03:46:27,054][15401] Updated weights for policy 0, policy_version 803743 (0.0033) [2024-06-25 03:46:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13168558080. Throughput: 0: 42848.5. Samples: 13168701420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 03:46:30,627][15401] Updated weights for policy 0, policy_version 803753 (0.0039) [2024-06-25 03:46:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 13168803840. Throughput: 0: 42809.8. Samples: 13168958940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:33,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 03:46:34,644][15401] Updated weights for policy 0, policy_version 803763 (0.0035) [2024-06-25 03:46:38,169][15401] Updated weights for policy 0, policy_version 803773 (0.0030) [2024-06-25 03:46:38,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 13169016832. Throughput: 0: 42951.6. Samples: 13169092780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:38,397][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 03:46:42,196][15401] Updated weights for policy 0, policy_version 803783 (0.0023) [2024-06-25 03:46:43,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13169197056. Throughput: 0: 42641.7. Samples: 13169344440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 03:46:45,647][15401] Updated weights for policy 0, policy_version 803793 (0.0043) [2024-06-25 03:46:48,389][15132] Fps is (10 sec: 42626.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13169442816. Throughput: 0: 42766.2. Samples: 13169598740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 03:46:50,255][15401] Updated weights for policy 0, policy_version 803803 (0.0046) [2024-06-25 03:46:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13169655808. Throughput: 0: 42981.4. Samples: 13169737040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 03:46:53,553][15401] Updated weights for policy 0, policy_version 803813 (0.0055) [2024-06-25 03:46:57,662][15401] Updated weights for policy 0, policy_version 803823 (0.0047) [2024-06-25 03:46:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13169836032. Throughput: 0: 42920.6. Samples: 13169989860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:46:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 03:46:59,834][15349] Signal inference workers to stop experience collection... (194900 times) [2024-06-25 03:46:59,888][15401] InferenceWorker_p0-w0: stopping experience collection (194900 times) [2024-06-25 03:46:59,889][15349] Signal inference workers to resume experience collection... (194900 times) [2024-06-25 03:46:59,904][15401] InferenceWorker_p0-w0: resuming experience collection (194900 times) [2024-06-25 03:47:01,518][15401] Updated weights for policy 0, policy_version 803833 (0.0037) [2024-06-25 03:47:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43145.8, 300 sec: 42820.6). Total num frames: 13170081792. Throughput: 0: 42895.5. Samples: 13170241140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 03:47:05,282][15401] Updated weights for policy 0, policy_version 803843 (0.0024) [2024-06-25 03:47:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13170294784. Throughput: 0: 43135.5. Samples: 13170384700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 03:47:09,022][15401] Updated weights for policy 0, policy_version 803853 (0.0042) [2024-06-25 03:47:12,843][15401] Updated weights for policy 0, policy_version 803863 (0.0028) [2024-06-25 03:47:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13170491392. Throughput: 0: 42947.0. Samples: 13170634040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:13,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 03:47:16,645][15401] Updated weights for policy 0, policy_version 803873 (0.0033) [2024-06-25 03:47:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13170720768. Throughput: 0: 42926.7. Samples: 13170890640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:18,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 03:47:20,739][15401] Updated weights for policy 0, policy_version 803883 (0.0036) [2024-06-25 03:47:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13170950144. Throughput: 0: 43027.0. Samples: 13171028720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 03:47:24,059][15401] Updated weights for policy 0, policy_version 803893 (0.0039) [2024-06-25 03:47:28,150][15401] Updated weights for policy 0, policy_version 803903 (0.0037) [2024-06-25 03:47:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13171146752. Throughput: 0: 42951.2. Samples: 13171277240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 03:47:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 03:47:31,796][15401] Updated weights for policy 0, policy_version 803913 (0.0032) [2024-06-25 03:47:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13171359744. Throughput: 0: 43079.1. Samples: 13171537300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 03:47:35,622][15401] Updated weights for policy 0, policy_version 803923 (0.0038) [2024-06-25 03:47:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 13171605504. Throughput: 0: 42958.6. Samples: 13171670180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 03:47:39,417][15401] Updated weights for policy 0, policy_version 803933 (0.0032) [2024-06-25 03:47:43,229][15401] Updated weights for policy 0, policy_version 803943 (0.0035) [2024-06-25 03:47:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13171802112. Throughput: 0: 42946.7. Samples: 13171922460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 03:47:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803943_13171802112.pth... [2024-06-25 03:47:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803316_13161529344.pth [2024-06-25 03:47:47,357][15401] Updated weights for policy 0, policy_version 803953 (0.0027) [2024-06-25 03:47:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13171998720. Throughput: 0: 43075.4. Samples: 13172179540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:48,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 03:47:50,787][15401] Updated weights for policy 0, policy_version 803963 (0.0046) [2024-06-25 03:47:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13172244480. Throughput: 0: 42643.5. Samples: 13172303660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:53,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 03:47:54,861][15401] Updated weights for policy 0, policy_version 803973 (0.0039) [2024-06-25 03:47:57,502][15349] Signal inference workers to stop experience collection... (194950 times) [2024-06-25 03:47:57,502][15349] Signal inference workers to resume experience collection... (194950 times) [2024-06-25 03:47:57,517][15401] InferenceWorker_p0-w0: stopping experience collection (194950 times) [2024-06-25 03:47:57,518][15401] InferenceWorker_p0-w0: resuming experience collection (194950 times) [2024-06-25 03:47:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43417.8, 300 sec: 42820.6). Total num frames: 13172441088. Throughput: 0: 42762.4. Samples: 13172558340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:47:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 03:47:58,409][15401] Updated weights for policy 0, policy_version 803983 (0.0033) [2024-06-25 03:48:02,397][15401] Updated weights for policy 0, policy_version 803993 (0.0033) [2024-06-25 03:48:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13172654080. Throughput: 0: 42817.6. Samples: 13172817440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:03,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-25 03:48:05,879][15401] Updated weights for policy 0, policy_version 804003 (0.0034) [2024-06-25 03:48:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 13172883456. Throughput: 0: 42476.4. Samples: 13172940260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:08,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 03:48:10,047][15401] Updated weights for policy 0, policy_version 804013 (0.0038) [2024-06-25 03:48:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13173080064. Throughput: 0: 42792.5. Samples: 13173202900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 03:48:14,069][15401] Updated weights for policy 0, policy_version 804023 (0.0034) [2024-06-25 03:48:17,966][15401] Updated weights for policy 0, policy_version 804033 (0.0037) [2024-06-25 03:48:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13173293056. Throughput: 0: 42497.7. Samples: 13173449700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 03:48:21,627][15401] Updated weights for policy 0, policy_version 804043 (0.0033) [2024-06-25 03:48:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 13173506048. Throughput: 0: 42397.4. Samples: 13173578060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 03:48:26,181][15401] Updated weights for policy 0, policy_version 804053 (0.0036) [2024-06-25 03:48:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13173719040. Throughput: 0: 42549.4. Samples: 13173837180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:28,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 03:48:29,338][15401] Updated weights for policy 0, policy_version 804063 (0.0042) [2024-06-25 03:48:33,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42320.8, 300 sec: 42709.5). Total num frames: 13173899264. Throughput: 0: 42506.1. Samples: 13174092580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:33,396][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 03:48:33,691][15401] Updated weights for policy 0, policy_version 804073 (0.0042) [2024-06-25 03:48:37,145][15401] Updated weights for policy 0, policy_version 804083 (0.0040) [2024-06-25 03:48:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13174145024. Throughput: 0: 42551.6. Samples: 13174218480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 03:48:41,193][15401] Updated weights for policy 0, policy_version 804093 (0.0047) [2024-06-25 03:48:43,389][15132] Fps is (10 sec: 45904.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13174358016. Throughput: 0: 42738.1. Samples: 13174481560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 03:48:44,665][15401] Updated weights for policy 0, policy_version 804103 (0.0033) [2024-06-25 03:48:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13174571008. Throughput: 0: 42660.5. Samples: 13174737160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:48,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 03:48:48,717][15401] Updated weights for policy 0, policy_version 804113 (0.0035) [2024-06-25 03:48:52,318][15401] Updated weights for policy 0, policy_version 804123 (0.0027) [2024-06-25 03:48:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13174784000. Throughput: 0: 42737.0. Samples: 13174863320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:53,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 03:48:56,511][15401] Updated weights for policy 0, policy_version 804133 (0.0037) [2024-06-25 03:48:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13174996992. Throughput: 0: 42708.8. Samples: 13175124800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:48:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 03:48:59,902][15401] Updated weights for policy 0, policy_version 804143 (0.0043) [2024-06-25 03:49:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13175209984. Throughput: 0: 42928.9. Samples: 13175381500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:49:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 03:49:04,274][15401] Updated weights for policy 0, policy_version 804153 (0.0033) [2024-06-25 03:49:07,428][15401] Updated weights for policy 0, policy_version 804163 (0.0029) [2024-06-25 03:49:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 13175439360. Throughput: 0: 42969.3. Samples: 13175511680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:49:08,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 03:49:11,748][15401] Updated weights for policy 0, policy_version 804173 (0.0042) [2024-06-25 03:49:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13175652352. Throughput: 0: 42997.8. Samples: 13175772080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-25 03:49:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 03:49:15,376][15401] Updated weights for policy 0, policy_version 804183 (0.0037) [2024-06-25 03:49:18,393][15132] Fps is (10 sec: 42585.1, 60 sec: 42869.2, 300 sec: 42875.6). Total num frames: 13175865344. Throughput: 0: 42819.5. Samples: 13176019320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:18,393][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 03:49:19,275][15401] Updated weights for policy 0, policy_version 804193 (0.0031) [2024-06-25 03:49:23,016][15401] Updated weights for policy 0, policy_version 804203 (0.0036) [2024-06-25 03:49:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13176078336. Throughput: 0: 42958.7. Samples: 13176151620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:23,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 03:49:26,779][15401] Updated weights for policy 0, policy_version 804213 (0.0033) [2024-06-25 03:49:28,392][15132] Fps is (10 sec: 44240.4, 60 sec: 43142.9, 300 sec: 42876.3). Total num frames: 13176307712. Throughput: 0: 42895.9. Samples: 13176411980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:28,393][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 03:49:30,781][15401] Updated weights for policy 0, policy_version 804223 (0.0034) [2024-06-25 03:49:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 13176504320. Throughput: 0: 42810.7. Samples: 13176663640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:33,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 03:49:34,277][15349] Signal inference workers to stop experience collection... (195000 times) [2024-06-25 03:49:34,278][15349] Signal inference workers to resume experience collection... (195000 times) [2024-06-25 03:49:34,316][15401] InferenceWorker_p0-w0: stopping experience collection (195000 times) [2024-06-25 03:49:34,316][15401] InferenceWorker_p0-w0: resuming experience collection (195000 times) [2024-06-25 03:49:34,407][15401] Updated weights for policy 0, policy_version 804233 (0.0036) [2024-06-25 03:49:38,376][15401] Updated weights for policy 0, policy_version 804243 (0.0035) [2024-06-25 03:49:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13176717312. Throughput: 0: 42904.9. Samples: 13176794040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 03:49:42,307][15401] Updated weights for policy 0, policy_version 804253 (0.0037) [2024-06-25 03:49:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13176930304. Throughput: 0: 42972.0. Samples: 13177058540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:43,391][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 03:49:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804256_13176930304.pth... [2024-06-25 03:49:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803628_13166641152.pth [2024-06-25 03:49:45,786][15401] Updated weights for policy 0, policy_version 804263 (0.0022) [2024-06-25 03:49:48,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 13177159680. Throughput: 0: 42770.2. Samples: 13177306260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:48,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 03:49:49,875][15401] Updated weights for policy 0, policy_version 804273 (0.0034) [2024-06-25 03:49:53,199][15401] Updated weights for policy 0, policy_version 804283 (0.0022) [2024-06-25 03:49:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13177372672. Throughput: 0: 42917.9. Samples: 13177442980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:53,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-25 03:49:57,688][15401] Updated weights for policy 0, policy_version 804293 (0.0037) [2024-06-25 03:49:58,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13177585664. Throughput: 0: 42905.4. Samples: 13177702820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:49:58,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 03:50:01,067][15401] Updated weights for policy 0, policy_version 804303 (0.0038) [2024-06-25 03:50:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 13177782272. Throughput: 0: 42826.5. Samples: 13177946480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:03,392][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 03:50:05,149][15401] Updated weights for policy 0, policy_version 804313 (0.0041) [2024-06-25 03:50:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13178011648. Throughput: 0: 42825.1. Samples: 13178078760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:08,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-25 03:50:08,565][15401] Updated weights for policy 0, policy_version 804323 (0.0039) [2024-06-25 03:50:12,526][15401] Updated weights for policy 0, policy_version 804333 (0.0030) [2024-06-25 03:50:13,390][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42932.6). Total num frames: 13178224640. Throughput: 0: 42849.8. Samples: 13178340120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:13,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 03:50:16,043][15401] Updated weights for policy 0, policy_version 804343 (0.0031) [2024-06-25 03:50:18,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42600.8, 300 sec: 42765.0). Total num frames: 13178421248. Throughput: 0: 43069.9. Samples: 13178601780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:18,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 03:50:20,126][15401] Updated weights for policy 0, policy_version 804353 (0.0038) [2024-06-25 03:50:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13178650624. Throughput: 0: 42926.6. Samples: 13178725740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 03:50:23,909][15401] Updated weights for policy 0, policy_version 804363 (0.0032) [2024-06-25 03:50:27,693][15401] Updated weights for policy 0, policy_version 804373 (0.0038) [2024-06-25 03:50:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.2, 300 sec: 42931.7). Total num frames: 13178863616. Throughput: 0: 42752.1. Samples: 13178982380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 03:50:31,470][15401] Updated weights for policy 0, policy_version 804383 (0.0037) [2024-06-25 03:50:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13179060224. Throughput: 0: 43027.2. Samples: 13179242380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 03:50:35,373][15401] Updated weights for policy 0, policy_version 804393 (0.0035) [2024-06-25 03:50:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13179289600. Throughput: 0: 42638.2. Samples: 13179361700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 03:50:39,145][15401] Updated weights for policy 0, policy_version 804403 (0.0036) [2024-06-25 03:50:43,085][15401] Updated weights for policy 0, policy_version 804413 (0.0038) [2024-06-25 03:50:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13179502592. Throughput: 0: 42642.6. Samples: 13179621840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:43,392][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 03:50:46,916][15401] Updated weights for policy 0, policy_version 804423 (0.0030) [2024-06-25 03:50:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 13179699200. Throughput: 0: 42780.6. Samples: 13179871500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 03:50:49,847][15349] Signal inference workers to stop experience collection... (195050 times) [2024-06-25 03:50:49,848][15349] Signal inference workers to resume experience collection... (195050 times) [2024-06-25 03:50:49,856][15401] InferenceWorker_p0-w0: stopping experience collection (195050 times) [2024-06-25 03:50:49,871][15401] InferenceWorker_p0-w0: resuming experience collection (195050 times) [2024-06-25 03:50:50,946][15401] Updated weights for policy 0, policy_version 804433 (0.0024) [2024-06-25 03:50:53,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13179912192. Throughput: 0: 42701.0. Samples: 13180000300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 03:50:54,627][15401] Updated weights for policy 0, policy_version 804443 (0.0027) [2024-06-25 03:50:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42820.8). Total num frames: 13180125184. Throughput: 0: 42724.4. Samples: 13180262720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:50:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 03:50:58,638][15401] Updated weights for policy 0, policy_version 804453 (0.0028) [2024-06-25 03:51:02,602][15401] Updated weights for policy 0, policy_version 804463 (0.0038) [2024-06-25 03:51:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13180354560. Throughput: 0: 42418.5. Samples: 13180510620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 03:51:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 03:51:06,321][15401] Updated weights for policy 0, policy_version 804473 (0.0031) [2024-06-25 03:51:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 13180551168. Throughput: 0: 42626.8. Samples: 13180643940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 03:51:10,130][15401] Updated weights for policy 0, policy_version 804483 (0.0037) [2024-06-25 03:51:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13180780544. Throughput: 0: 42716.4. Samples: 13180904620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 03:51:13,834][15401] Updated weights for policy 0, policy_version 804493 (0.0030) [2024-06-25 03:51:17,538][15401] Updated weights for policy 0, policy_version 804503 (0.0036) [2024-06-25 03:51:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13181009920. Throughput: 0: 42609.0. Samples: 13181159780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 03:51:21,352][15401] Updated weights for policy 0, policy_version 804513 (0.0042) [2024-06-25 03:51:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13181206528. Throughput: 0: 42907.6. Samples: 13181292540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 03:51:25,125][15401] Updated weights for policy 0, policy_version 804523 (0.0035) [2024-06-25 03:51:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13181419520. Throughput: 0: 42996.6. Samples: 13181556580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:51:29,076][15401] Updated weights for policy 0, policy_version 804533 (0.0038) [2024-06-25 03:51:32,653][15401] Updated weights for policy 0, policy_version 804543 (0.0036) [2024-06-25 03:51:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42821.5). Total num frames: 13181648896. Throughput: 0: 43006.3. Samples: 13181806780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:33,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 03:51:36,720][15401] Updated weights for policy 0, policy_version 804553 (0.0038) [2024-06-25 03:51:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13181861888. Throughput: 0: 43189.4. Samples: 13181943820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 03:51:40,109][15401] Updated weights for policy 0, policy_version 804563 (0.0030) [2024-06-25 03:51:43,392][15132] Fps is (10 sec: 42587.1, 60 sec: 42871.4, 300 sec: 42820.2). Total num frames: 13182074880. Throughput: 0: 43235.4. Samples: 13182208420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:43,393][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 03:51:43,541][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804571_13182091264.pth... [2024-06-25 03:51:43,600][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000803943_13171802112.pth [2024-06-25 03:51:44,290][15401] Updated weights for policy 0, policy_version 804573 (0.0032) [2024-06-25 03:51:47,872][15401] Updated weights for policy 0, policy_version 804583 (0.0036) [2024-06-25 03:51:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 13182320640. Throughput: 0: 43161.3. Samples: 13182452880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 03:51:52,073][15401] Updated weights for policy 0, policy_version 804593 (0.0039) [2024-06-25 03:51:53,389][15132] Fps is (10 sec: 42609.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13182500864. Throughput: 0: 43102.2. Samples: 13182583540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 03:51:55,416][15349] Signal inference workers to stop experience collection... (195100 times) [2024-06-25 03:51:55,416][15349] Signal inference workers to resume experience collection... (195100 times) [2024-06-25 03:51:55,444][15401] InferenceWorker_p0-w0: stopping experience collection (195100 times) [2024-06-25 03:51:55,444][15401] InferenceWorker_p0-w0: resuming experience collection (195100 times) [2024-06-25 03:51:55,578][15401] Updated weights for policy 0, policy_version 804603 (0.0023) [2024-06-25 03:51:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13182730240. Throughput: 0: 43118.6. Samples: 13182844960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:51:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 03:51:59,824][15401] Updated weights for policy 0, policy_version 804613 (0.0035) [2024-06-25 03:52:03,128][15401] Updated weights for policy 0, policy_version 804623 (0.0037) [2024-06-25 03:52:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13182943232. Throughput: 0: 43082.2. Samples: 13183098480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:03,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 03:52:07,358][15401] Updated weights for policy 0, policy_version 804633 (0.0041) [2024-06-25 03:52:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 13183156224. Throughput: 0: 43120.5. Samples: 13183232960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 03:52:10,709][15401] Updated weights for policy 0, policy_version 804643 (0.0045) [2024-06-25 03:52:13,390][15132] Fps is (10 sec: 42595.8, 60 sec: 43144.1, 300 sec: 42876.0). Total num frames: 13183369216. Throughput: 0: 42961.2. Samples: 13183489860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:13,391][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 03:52:15,246][15401] Updated weights for policy 0, policy_version 804653 (0.0041) [2024-06-25 03:52:18,200][15401] Updated weights for policy 0, policy_version 804663 (0.0035) [2024-06-25 03:52:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13183598592. Throughput: 0: 43084.8. Samples: 13183745600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 03:52:22,826][15401] Updated weights for policy 0, policy_version 804673 (0.0026) [2024-06-25 03:52:23,390][15132] Fps is (10 sec: 42600.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13183795200. Throughput: 0: 42937.3. Samples: 13183876000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 03:52:25,916][15401] Updated weights for policy 0, policy_version 804683 (0.0031) [2024-06-25 03:52:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13184008192. Throughput: 0: 42693.9. Samples: 13184129540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 03:52:30,402][15401] Updated weights for policy 0, policy_version 804693 (0.0032) [2024-06-25 03:52:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13184221184. Throughput: 0: 42971.6. Samples: 13184386600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 03:52:33,587][15401] Updated weights for policy 0, policy_version 804703 (0.0032) [2024-06-25 03:52:37,902][15401] Updated weights for policy 0, policy_version 804713 (0.0036) [2024-06-25 03:52:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13184434176. Throughput: 0: 42967.0. Samples: 13184517060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 03:52:41,202][15401] Updated weights for policy 0, policy_version 804723 (0.0030) [2024-06-25 03:52:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 13184630784. Throughput: 0: 42798.1. Samples: 13184770880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 03:52:45,793][15401] Updated weights for policy 0, policy_version 804733 (0.0038) [2024-06-25 03:52:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13184876544. Throughput: 0: 42768.8. Samples: 13185023080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 03:52:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 03:52:48,883][15401] Updated weights for policy 0, policy_version 804743 (0.0032) [2024-06-25 03:52:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13185056768. Throughput: 0: 42721.3. Samples: 13185155420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:52:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 03:52:53,400][15401] Updated weights for policy 0, policy_version 804753 (0.0036) [2024-06-25 03:52:56,564][15401] Updated weights for policy 0, policy_version 804763 (0.0036) [2024-06-25 03:52:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13185286144. Throughput: 0: 42737.9. Samples: 13185413040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:52:58,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 03:53:01,081][15401] Updated weights for policy 0, policy_version 804773 (0.0030) [2024-06-25 03:53:03,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 13185531904. Throughput: 0: 42789.7. Samples: 13185671140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:03,399][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 03:53:04,102][15401] Updated weights for policy 0, policy_version 804783 (0.0039) [2024-06-25 03:53:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13185712128. Throughput: 0: 42936.6. Samples: 13185808140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:08,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 03:53:08,519][15401] Updated weights for policy 0, policy_version 804793 (0.0034) [2024-06-25 03:53:11,670][15401] Updated weights for policy 0, policy_version 804803 (0.0030) [2024-06-25 03:53:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.8, 300 sec: 42876.1). Total num frames: 13185941504. Throughput: 0: 42891.1. Samples: 13186059640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:13,390][15132] Avg episode reward: [(0, '0.150')] [2024-06-25 03:53:15,958][15401] Updated weights for policy 0, policy_version 804813 (0.0030) [2024-06-25 03:53:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13186170880. Throughput: 0: 42911.5. Samples: 13186317620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 03:53:19,137][15401] Updated weights for policy 0, policy_version 804823 (0.0042) [2024-06-25 03:53:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13186351104. Throughput: 0: 42945.9. Samples: 13186449620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 03:53:23,680][15401] Updated weights for policy 0, policy_version 804833 (0.0029) [2024-06-25 03:53:26,331][15349] Signal inference workers to stop experience collection... (195150 times) [2024-06-25 03:53:26,388][15401] InferenceWorker_p0-w0: stopping experience collection (195150 times) [2024-06-25 03:53:26,391][15349] Signal inference workers to resume experience collection... (195150 times) [2024-06-25 03:53:26,399][15401] InferenceWorker_p0-w0: resuming experience collection (195150 times) [2024-06-25 03:53:26,948][15401] Updated weights for policy 0, policy_version 804843 (0.0040) [2024-06-25 03:53:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43043.6). Total num frames: 13186596864. Throughput: 0: 42904.1. Samples: 13186701560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 03:53:31,531][15401] Updated weights for policy 0, policy_version 804853 (0.0032) [2024-06-25 03:53:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13186793472. Throughput: 0: 43218.8. Samples: 13186967920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:33,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 03:53:34,761][15401] Updated weights for policy 0, policy_version 804863 (0.0028) [2024-06-25 03:53:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13186990080. Throughput: 0: 43049.2. Samples: 13187092640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 03:53:38,998][15401] Updated weights for policy 0, policy_version 804873 (0.0030) [2024-06-25 03:53:42,376][15401] Updated weights for policy 0, policy_version 804883 (0.0035) [2024-06-25 03:53:43,390][15132] Fps is (10 sec: 45874.0, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 13187252224. Throughput: 0: 43108.7. Samples: 13187352940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:43,396][15132] Avg episode reward: [(0, '0.205')] [2024-06-25 03:53:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804886_13187252224.pth... [2024-06-25 03:53:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804256_13176930304.pth [2024-06-25 03:53:46,846][15401] Updated weights for policy 0, policy_version 804893 (0.0030) [2024-06-25 03:53:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13187448832. Throughput: 0: 43155.2. Samples: 13187613120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:48,390][15132] Avg episode reward: [(0, '0.196')] [2024-06-25 03:53:49,939][15401] Updated weights for policy 0, policy_version 804903 (0.0037) [2024-06-25 03:53:53,389][15132] Fps is (10 sec: 39322.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13187645440. Throughput: 0: 42835.1. Samples: 13187735720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:53,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-25 03:53:54,325][15401] Updated weights for policy 0, policy_version 804913 (0.0027) [2024-06-25 03:53:57,325][15401] Updated weights for policy 0, policy_version 804923 (0.0035) [2024-06-25 03:53:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13187891200. Throughput: 0: 43023.1. Samples: 13187995680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:53:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 03:54:01,898][15401] Updated weights for policy 0, policy_version 804933 (0.0026) [2024-06-25 03:54:03,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13188104192. Throughput: 0: 43056.1. Samples: 13188255140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 03:54:05,443][15401] Updated weights for policy 0, policy_version 804943 (0.0029) [2024-06-25 03:54:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13188300800. Throughput: 0: 42892.0. Samples: 13188379760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 03:54:09,384][15401] Updated weights for policy 0, policy_version 804953 (0.0027) [2024-06-25 03:54:12,861][15401] Updated weights for policy 0, policy_version 804963 (0.0038) [2024-06-25 03:54:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42932.1). Total num frames: 13188530176. Throughput: 0: 43090.7. Samples: 13188640640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 03:54:16,976][15401] Updated weights for policy 0, policy_version 804973 (0.0043) [2024-06-25 03:54:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13188710400. Throughput: 0: 43057.1. Samples: 13188905500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 03:54:20,460][15401] Updated weights for policy 0, policy_version 804983 (0.0034) [2024-06-25 03:54:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 13188939776. Throughput: 0: 43010.7. Samples: 13189028120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 03:54:24,667][15401] Updated weights for policy 0, policy_version 804993 (0.0048) [2024-06-25 03:54:27,964][15401] Updated weights for policy 0, policy_version 805003 (0.0037) [2024-06-25 03:54:28,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13189185536. Throughput: 0: 42924.2. Samples: 13189284520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 03:54:32,257][15401] Updated weights for policy 0, policy_version 805013 (0.0029) [2024-06-25 03:54:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13189365760. Throughput: 0: 43062.2. Samples: 13189550920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 03:54:35,397][15401] Updated weights for policy 0, policy_version 805023 (0.0049) [2024-06-25 03:54:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 13189595136. Throughput: 0: 43048.0. Samples: 13189672880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-25 03:54:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 03:54:40,319][15401] Updated weights for policy 0, policy_version 805033 (0.0032) [2024-06-25 03:54:42,836][15401] Updated weights for policy 0, policy_version 805043 (0.0030) [2024-06-25 03:54:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 13189824512. Throughput: 0: 42991.2. Samples: 13189930280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:54:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 03:54:47,746][15401] Updated weights for policy 0, policy_version 805053 (0.0033) [2024-06-25 03:54:48,125][15349] Signal inference workers to stop experience collection... (195200 times) [2024-06-25 03:54:48,173][15349] Signal inference workers to resume experience collection... (195200 times) [2024-06-25 03:54:48,174][15401] InferenceWorker_p0-w0: stopping experience collection (195200 times) [2024-06-25 03:54:48,201][15401] InferenceWorker_p0-w0: resuming experience collection (195200 times) [2024-06-25 03:54:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13190021120. Throughput: 0: 43239.5. Samples: 13190200920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:54:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 03:54:50,672][15401] Updated weights for policy 0, policy_version 805063 (0.0032) [2024-06-25 03:54:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13190234112. Throughput: 0: 43175.5. Samples: 13190322660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:54:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 03:54:55,225][15401] Updated weights for policy 0, policy_version 805073 (0.0049) [2024-06-25 03:54:58,229][15401] Updated weights for policy 0, policy_version 805083 (0.0039) [2024-06-25 03:54:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 43043.1). Total num frames: 13190479872. Throughput: 0: 43008.4. Samples: 13190576020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:54:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 03:55:03,243][15401] Updated weights for policy 0, policy_version 805093 (0.0024) [2024-06-25 03:55:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13190643712. Throughput: 0: 42981.0. Samples: 13190839640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 03:55:06,068][15401] Updated weights for policy 0, policy_version 805103 (0.0042) [2024-06-25 03:55:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13190889472. Throughput: 0: 42861.8. Samples: 13190956900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 03:55:10,719][15401] Updated weights for policy 0, policy_version 805113 (0.0035) [2024-06-25 03:55:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13191102464. Throughput: 0: 43029.9. Samples: 13191220860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:13,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 03:55:13,907][15401] Updated weights for policy 0, policy_version 805123 (0.0029) [2024-06-25 03:55:18,203][15401] Updated weights for policy 0, policy_version 805133 (0.0033) [2024-06-25 03:55:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 13191299072. Throughput: 0: 42894.2. Samples: 13191481260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:18,392][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 03:55:21,370][15401] Updated weights for policy 0, policy_version 805143 (0.0027) [2024-06-25 03:55:23,392][15132] Fps is (10 sec: 42587.5, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 13191528448. Throughput: 0: 42923.8. Samples: 13191604560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:23,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 03:55:25,792][15401] Updated weights for policy 0, policy_version 805153 (0.0046) [2024-06-25 03:55:28,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13191741440. Throughput: 0: 42879.0. Samples: 13191859840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 03:55:29,644][15401] Updated weights for policy 0, policy_version 805163 (0.0028) [2024-06-25 03:55:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13191938048. Throughput: 0: 42621.8. Samples: 13192118900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 03:55:33,771][15401] Updated weights for policy 0, policy_version 805173 (0.0039) [2024-06-25 03:55:37,162][15401] Updated weights for policy 0, policy_version 805183 (0.0028) [2024-06-25 03:55:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 13192167424. Throughput: 0: 42665.4. Samples: 13192242600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 03:55:41,333][15401] Updated weights for policy 0, policy_version 805193 (0.0026) [2024-06-25 03:55:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 13192364032. Throughput: 0: 42764.9. Samples: 13192500440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 03:55:43,577][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805199_13192380416.pth... [2024-06-25 03:55:43,624][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804571_13182091264.pth [2024-06-25 03:55:44,624][15401] Updated weights for policy 0, policy_version 805203 (0.0042) [2024-06-25 03:55:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13192577024. Throughput: 0: 42592.4. Samples: 13192756300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 03:55:48,846][15401] Updated weights for policy 0, policy_version 805213 (0.0033) [2024-06-25 03:55:52,290][15401] Updated weights for policy 0, policy_version 805223 (0.0044) [2024-06-25 03:55:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13192806400. Throughput: 0: 42690.6. Samples: 13192877980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:53,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 03:55:56,474][15401] Updated weights for policy 0, policy_version 805233 (0.0034) [2024-06-25 03:55:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 13193003008. Throughput: 0: 42532.7. Samples: 13193134840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:55:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 03:55:59,866][15401] Updated weights for policy 0, policy_version 805243 (0.0036) [2024-06-25 03:56:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13193232384. Throughput: 0: 42574.3. Samples: 13193397000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:56:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 03:56:03,939][15401] Updated weights for policy 0, policy_version 805253 (0.0028) [2024-06-25 03:56:07,447][15401] Updated weights for policy 0, policy_version 805263 (0.0037) [2024-06-25 03:56:08,392][15132] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 13193445376. Throughput: 0: 42606.4. Samples: 13193521840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:56:08,392][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 03:56:11,904][15401] Updated weights for policy 0, policy_version 805273 (0.0034) [2024-06-25 03:56:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13193641984. Throughput: 0: 42738.2. Samples: 13193783060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:56:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 03:56:15,498][15401] Updated weights for policy 0, policy_version 805283 (0.0034) [2024-06-25 03:56:18,054][15349] Signal inference workers to stop experience collection... (195250 times) [2024-06-25 03:56:18,054][15349] Signal inference workers to resume experience collection... (195250 times) [2024-06-25 03:56:18,094][15401] InferenceWorker_p0-w0: stopping experience collection (195250 times) [2024-06-25 03:56:18,094][15401] InferenceWorker_p0-w0: resuming experience collection (195250 times) [2024-06-25 03:56:18,389][15132] Fps is (10 sec: 44246.9, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 13193887744. Throughput: 0: 42636.5. Samples: 13194037540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:56:18,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 03:56:19,343][15401] Updated weights for policy 0, policy_version 805293 (0.0039) [2024-06-25 03:56:23,104][15401] Updated weights for policy 0, policy_version 805303 (0.0027) [2024-06-25 03:56:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.2, 300 sec: 42931.6). Total num frames: 13194084352. Throughput: 0: 42838.3. Samples: 13194170320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 03:56:23,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 03:56:26,855][15401] Updated weights for policy 0, policy_version 805313 (0.0033) [2024-06-25 03:56:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13194297344. Throughput: 0: 42653.4. Samples: 13194419840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 03:56:30,957][15401] Updated weights for policy 0, policy_version 805323 (0.0023) [2024-06-25 03:56:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13194526720. Throughput: 0: 42690.2. Samples: 13194677360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 03:56:34,289][15401] Updated weights for policy 0, policy_version 805333 (0.0035) [2024-06-25 03:56:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 13194723328. Throughput: 0: 42946.8. Samples: 13194810580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 03:56:38,420][15401] Updated weights for policy 0, policy_version 805343 (0.0031) [2024-06-25 03:56:42,167][15401] Updated weights for policy 0, policy_version 805353 (0.0035) [2024-06-25 03:56:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13194936320. Throughput: 0: 42889.8. Samples: 13195064880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 03:56:45,972][15401] Updated weights for policy 0, policy_version 805363 (0.0043) [2024-06-25 03:56:48,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 13195182080. Throughput: 0: 42675.3. Samples: 13195317400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 03:56:49,977][15401] Updated weights for policy 0, policy_version 805373 (0.0035) [2024-06-25 03:56:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13195362304. Throughput: 0: 42858.0. Samples: 13195450360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 03:56:53,795][15401] Updated weights for policy 0, policy_version 805383 (0.0031) [2024-06-25 03:56:57,609][15401] Updated weights for policy 0, policy_version 805393 (0.0038) [2024-06-25 03:56:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13195575296. Throughput: 0: 42731.1. Samples: 13195705960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:56:58,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 03:57:01,718][15401] Updated weights for policy 0, policy_version 805403 (0.0038) [2024-06-25 03:57:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13195804672. Throughput: 0: 42711.1. Samples: 13195959540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 03:57:05,497][15401] Updated weights for policy 0, policy_version 805413 (0.0050) [2024-06-25 03:57:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 13196001280. Throughput: 0: 42659.1. Samples: 13196089980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 03:57:09,441][15401] Updated weights for policy 0, policy_version 805423 (0.0043) [2024-06-25 03:57:13,032][15401] Updated weights for policy 0, policy_version 805433 (0.0045) [2024-06-25 03:57:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13196230656. Throughput: 0: 42872.9. Samples: 13196349120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 03:57:16,914][15401] Updated weights for policy 0, policy_version 805443 (0.0035) [2024-06-25 03:57:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13196460032. Throughput: 0: 42735.5. Samples: 13196600460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 03:57:20,597][15401] Updated weights for policy 0, policy_version 805453 (0.0031) [2024-06-25 03:57:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13196656640. Throughput: 0: 42738.7. Samples: 13196733820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:23,396][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 03:57:24,617][15401] Updated weights for policy 0, policy_version 805463 (0.0028) [2024-06-25 03:57:28,051][15401] Updated weights for policy 0, policy_version 805473 (0.0032) [2024-06-25 03:57:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13196886016. Throughput: 0: 42891.5. Samples: 13196995000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 03:57:32,131][15401] Updated weights for policy 0, policy_version 805483 (0.0031) [2024-06-25 03:57:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13197082624. Throughput: 0: 42857.0. Samples: 13197245960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 03:57:35,815][15401] Updated weights for policy 0, policy_version 805493 (0.0037) [2024-06-25 03:57:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13197295616. Throughput: 0: 42759.2. Samples: 13197374520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 03:57:38,592][15349] Signal inference workers to stop experience collection... (195300 times) [2024-06-25 03:57:38,596][15349] Signal inference workers to resume experience collection... (195300 times) [2024-06-25 03:57:38,614][15401] InferenceWorker_p0-w0: stopping experience collection (195300 times) [2024-06-25 03:57:38,614][15401] InferenceWorker_p0-w0: resuming experience collection (195300 times) [2024-06-25 03:57:40,004][15401] Updated weights for policy 0, policy_version 805503 (0.0035) [2024-06-25 03:57:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13197508608. Throughput: 0: 42816.2. Samples: 13197632700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 03:57:43,570][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805513_13197524992.pth... [2024-06-25 03:57:43,582][15401] Updated weights for policy 0, policy_version 805513 (0.0033) [2024-06-25 03:57:43,639][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000804886_13187252224.pth [2024-06-25 03:57:47,551][15401] Updated weights for policy 0, policy_version 805523 (0.0034) [2024-06-25 03:57:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 13197721600. Throughput: 0: 42936.3. Samples: 13197891680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 03:57:51,124][15401] Updated weights for policy 0, policy_version 805533 (0.0041) [2024-06-25 03:57:53,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13197934592. Throughput: 0: 42884.9. Samples: 13198019800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 03:57:55,149][15401] Updated weights for policy 0, policy_version 805543 (0.0029) [2024-06-25 03:57:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13198131200. Throughput: 0: 42908.9. Samples: 13198280020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:57:58,396][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 03:57:58,685][15401] Updated weights for policy 0, policy_version 805553 (0.0030) [2024-06-25 03:58:02,684][15401] Updated weights for policy 0, policy_version 805563 (0.0046) [2024-06-25 03:58:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13198376960. Throughput: 0: 43014.7. Samples: 13198536120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:58:03,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 03:58:06,324][15401] Updated weights for policy 0, policy_version 805573 (0.0032) [2024-06-25 03:58:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13198589952. Throughput: 0: 42950.5. Samples: 13198666600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:58:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 03:58:10,280][15401] Updated weights for policy 0, policy_version 805583 (0.0033) [2024-06-25 03:58:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 13198802944. Throughput: 0: 42712.0. Samples: 13198917140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 03:58:13,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 03:58:14,211][15401] Updated weights for policy 0, policy_version 805593 (0.0038) [2024-06-25 03:58:18,145][15401] Updated weights for policy 0, policy_version 805603 (0.0034) [2024-06-25 03:58:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13198999552. Throughput: 0: 42948.1. Samples: 13199178620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 03:58:21,681][15401] Updated weights for policy 0, policy_version 805613 (0.0038) [2024-06-25 03:58:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13199228928. Throughput: 0: 42871.1. Samples: 13199303720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 03:58:25,857][15401] Updated weights for policy 0, policy_version 805623 (0.0031) [2024-06-25 03:58:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13199441920. Throughput: 0: 42827.3. Samples: 13199559920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:28,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 03:58:29,355][15401] Updated weights for policy 0, policy_version 805633 (0.0026) [2024-06-25 03:58:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13199638528. Throughput: 0: 42831.3. Samples: 13199819080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 03:58:33,473][15401] Updated weights for policy 0, policy_version 805643 (0.0022) [2024-06-25 03:58:36,727][15401] Updated weights for policy 0, policy_version 805653 (0.0043) [2024-06-25 03:58:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13199884288. Throughput: 0: 42627.0. Samples: 13199938020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 03:58:41,386][15401] Updated weights for policy 0, policy_version 805663 (0.0028) [2024-06-25 03:58:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 13200080896. Throughput: 0: 42578.8. Samples: 13200196060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 03:58:44,750][15401] Updated weights for policy 0, policy_version 805673 (0.0042) [2024-06-25 03:58:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 13200261120. Throughput: 0: 42687.6. Samples: 13200457060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:48,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 03:58:49,233][15401] Updated weights for policy 0, policy_version 805683 (0.0043) [2024-06-25 03:58:52,200][15401] Updated weights for policy 0, policy_version 805693 (0.0026) [2024-06-25 03:58:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13200506880. Throughput: 0: 42321.4. Samples: 13200571060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:53,391][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 03:58:57,041][15401] Updated weights for policy 0, policy_version 805703 (0.0051) [2024-06-25 03:58:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13200719872. Throughput: 0: 42764.4. Samples: 13200841440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:58:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 03:59:00,128][15401] Updated weights for policy 0, policy_version 805713 (0.0048) [2024-06-25 03:59:00,557][15349] Signal inference workers to stop experience collection... (195350 times) [2024-06-25 03:59:00,604][15401] InferenceWorker_p0-w0: stopping experience collection (195350 times) [2024-06-25 03:59:00,614][15349] Signal inference workers to resume experience collection... (195350 times) [2024-06-25 03:59:00,620][15401] InferenceWorker_p0-w0: resuming experience collection (195350 times) [2024-06-25 03:59:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13200900096. Throughput: 0: 42705.2. Samples: 13201100360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:03,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 03:59:04,828][15401] Updated weights for policy 0, policy_version 805723 (0.0034) [2024-06-25 03:59:07,459][15401] Updated weights for policy 0, policy_version 805733 (0.0035) [2024-06-25 03:59:08,396][15132] Fps is (10 sec: 44210.0, 60 sec: 42867.1, 300 sec: 42819.7). Total num frames: 13201162240. Throughput: 0: 42456.0. Samples: 13201214500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:08,396][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 03:59:12,507][15401] Updated weights for policy 0, policy_version 805743 (0.0026) [2024-06-25 03:59:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42600.2, 300 sec: 42876.1). Total num frames: 13201358848. Throughput: 0: 42766.8. Samples: 13201484420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 03:59:15,209][15401] Updated weights for policy 0, policy_version 805753 (0.0025) [2024-06-25 03:59:18,389][15132] Fps is (10 sec: 39345.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13201555456. Throughput: 0: 42552.8. Samples: 13201733960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 03:59:20,238][15401] Updated weights for policy 0, policy_version 805763 (0.0036) [2024-06-25 03:59:23,255][15401] Updated weights for policy 0, policy_version 805773 (0.0039) [2024-06-25 03:59:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13201784832. Throughput: 0: 42691.0. Samples: 13201859120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:23,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 03:59:28,113][15401] Updated weights for policy 0, policy_version 805783 (0.0043) [2024-06-25 03:59:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13201981440. Throughput: 0: 42760.0. Samples: 13202120260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:28,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 03:59:30,757][15401] Updated weights for policy 0, policy_version 805793 (0.0032) [2024-06-25 03:59:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 13202194432. Throughput: 0: 42450.9. Samples: 13202367360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:33,390][15132] Avg episode reward: [(0, '0.879')] [2024-06-25 03:59:35,803][15401] Updated weights for policy 0, policy_version 805803 (0.0027) [2024-06-25 03:59:38,323][15401] Updated weights for policy 0, policy_version 805813 (0.0028) [2024-06-25 03:59:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13202440192. Throughput: 0: 42910.6. Samples: 13202502040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:38,395][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 03:59:43,337][15401] Updated weights for policy 0, policy_version 805823 (0.0045) [2024-06-25 03:59:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13202604032. Throughput: 0: 42632.1. Samples: 13202759880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 03:59:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805823_13202604032.pth... [2024-06-25 03:59:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805199_13192380416.pth [2024-06-25 03:59:45,868][15401] Updated weights for policy 0, policy_version 805833 (0.0034) [2024-06-25 03:59:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13202849792. Throughput: 0: 42395.2. Samples: 13203008140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 03:59:50,919][15401] Updated weights for policy 0, policy_version 805843 (0.0037) [2024-06-25 03:59:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13203079168. Throughput: 0: 42816.0. Samples: 13203140960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 03:59:53,442][15401] Updated weights for policy 0, policy_version 805853 (0.0038) [2024-06-25 03:59:58,390][15132] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 13203226624. Throughput: 0: 42440.8. Samples: 13203394260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 03:59:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 03:59:58,736][15401] Updated weights for policy 0, policy_version 805863 (0.0035) [2024-06-25 04:00:00,111][15349] Signal inference workers to stop experience collection... (195400 times) [2024-06-25 04:00:00,114][15349] Signal inference workers to resume experience collection... (195400 times) [2024-06-25 04:00:00,149][15401] InferenceWorker_p0-w0: stopping experience collection (195400 times) [2024-06-25 04:00:00,149][15401] InferenceWorker_p0-w0: resuming experience collection (195400 times) [2024-06-25 04:00:01,061][15401] Updated weights for policy 0, policy_version 805873 (0.0033) [2024-06-25 04:00:03,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13203472384. Throughput: 0: 42387.5. Samples: 13203641500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:03,392][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 04:00:06,459][15401] Updated weights for policy 0, policy_version 805883 (0.0033) [2024-06-25 04:00:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42329.7, 300 sec: 42709.5). Total num frames: 13203701760. Throughput: 0: 42693.9. Samples: 13203780340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 04:00:08,789][15401] Updated weights for policy 0, policy_version 805893 (0.0028) [2024-06-25 04:00:13,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41779.1, 300 sec: 42598.7). Total num frames: 13203865600. Throughput: 0: 42375.9. Samples: 13204027180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 04:00:13,980][15401] Updated weights for policy 0, policy_version 805903 (0.0028) [2024-06-25 04:00:16,761][15401] Updated weights for policy 0, policy_version 805913 (0.0028) [2024-06-25 04:00:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 13204111360. Throughput: 0: 42448.5. Samples: 13204277540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 04:00:21,699][15401] Updated weights for policy 0, policy_version 805923 (0.0033) [2024-06-25 04:00:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13204340736. Throughput: 0: 42563.8. Samples: 13204417400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 04:00:24,628][15401] Updated weights for policy 0, policy_version 805933 (0.0038) [2024-06-25 04:00:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13204520960. Throughput: 0: 42238.1. Samples: 13204660600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:28,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 04:00:29,271][15401] Updated weights for policy 0, policy_version 805943 (0.0039) [2024-06-25 04:00:32,235][15401] Updated weights for policy 0, policy_version 805953 (0.0051) [2024-06-25 04:00:33,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13204766720. Throughput: 0: 42404.4. Samples: 13204916340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 04:00:36,938][15401] Updated weights for policy 0, policy_version 805963 (0.0042) [2024-06-25 04:00:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13204979712. Throughput: 0: 42474.7. Samples: 13205052320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 04:00:40,195][15401] Updated weights for policy 0, policy_version 805973 (0.0033) [2024-06-25 04:00:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13205159936. Throughput: 0: 42360.3. Samples: 13205300480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:43,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 04:00:44,565][15401] Updated weights for policy 0, policy_version 805983 (0.0039) [2024-06-25 04:00:48,029][15401] Updated weights for policy 0, policy_version 805993 (0.0040) [2024-06-25 04:00:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13205389312. Throughput: 0: 42548.1. Samples: 13205556060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 04:00:52,049][15401] Updated weights for policy 0, policy_version 806003 (0.0038) [2024-06-25 04:00:53,389][15132] Fps is (10 sec: 45876.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13205618688. Throughput: 0: 42491.6. Samples: 13205692460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 04:00:55,627][15401] Updated weights for policy 0, policy_version 806013 (0.0036) [2024-06-25 04:00:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13205798912. Throughput: 0: 42573.7. Samples: 13205943000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:00:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 04:00:59,639][15401] Updated weights for policy 0, policy_version 806023 (0.0026) [2024-06-25 04:01:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13206028288. Throughput: 0: 42786.7. Samples: 13206203040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:03,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:01:03,450][15401] Updated weights for policy 0, policy_version 806033 (0.0037) [2024-06-25 04:01:07,466][15401] Updated weights for policy 0, policy_version 806043 (0.0034) [2024-06-25 04:01:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13206257664. Throughput: 0: 42617.1. Samples: 13206335180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 04:01:11,155][15401] Updated weights for policy 0, policy_version 806053 (0.0037) [2024-06-25 04:01:13,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13206454272. Throughput: 0: 42766.7. Samples: 13206585100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 04:01:14,982][15401] Updated weights for policy 0, policy_version 806063 (0.0043) [2024-06-25 04:01:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13206667264. Throughput: 0: 42728.9. Samples: 13206839140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 04:01:18,980][15401] Updated weights for policy 0, policy_version 806073 (0.0033) [2024-06-25 04:01:22,776][15349] Signal inference workers to stop experience collection... (195450 times) [2024-06-25 04:01:22,807][15401] InferenceWorker_p0-w0: stopping experience collection (195450 times) [2024-06-25 04:01:22,835][15349] Signal inference workers to resume experience collection... (195450 times) [2024-06-25 04:01:22,840][15401] InferenceWorker_p0-w0: resuming experience collection (195450 times) [2024-06-25 04:01:22,843][15401] Updated weights for policy 0, policy_version 806083 (0.0024) [2024-06-25 04:01:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13206913024. Throughput: 0: 42651.6. Samples: 13206971640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 04:01:26,632][15401] Updated weights for policy 0, policy_version 806093 (0.0046) [2024-06-25 04:01:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13207109632. Throughput: 0: 42770.4. Samples: 13207225140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 04:01:30,478][15401] Updated weights for policy 0, policy_version 806103 (0.0041) [2024-06-25 04:01:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13207306240. Throughput: 0: 42781.1. Samples: 13207481220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 04:01:34,130][15401] Updated weights for policy 0, policy_version 806113 (0.0034) [2024-06-25 04:01:38,183][15401] Updated weights for policy 0, policy_version 806123 (0.0033) [2024-06-25 04:01:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13207535616. Throughput: 0: 42570.2. Samples: 13207608120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 04:01:41,805][15401] Updated weights for policy 0, policy_version 806133 (0.0023) [2024-06-25 04:01:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 13207748608. Throughput: 0: 42758.7. Samples: 13207867140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:43,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 04:01:43,538][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806138_13207764992.pth... [2024-06-25 04:01:43,595][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805513_13197524992.pth [2024-06-25 04:01:45,768][15401] Updated weights for policy 0, policy_version 806143 (0.0032) [2024-06-25 04:01:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13207961600. Throughput: 0: 42554.6. Samples: 13208117900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 04:01:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 04:01:49,912][15401] Updated weights for policy 0, policy_version 806153 (0.0039) [2024-06-25 04:01:53,343][15401] Updated weights for policy 0, policy_version 806163 (0.0039) [2024-06-25 04:01:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13208174592. Throughput: 0: 42520.6. Samples: 13208248600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:01:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 04:01:57,367][15401] Updated weights for policy 0, policy_version 806173 (0.0035) [2024-06-25 04:01:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13208387584. Throughput: 0: 42799.6. Samples: 13208511080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:01:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 04:02:00,724][15401] Updated weights for policy 0, policy_version 806183 (0.0049) [2024-06-25 04:02:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 13208584192. Throughput: 0: 42767.7. Samples: 13208763680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 04:02:04,898][15401] Updated weights for policy 0, policy_version 806193 (0.0043) [2024-06-25 04:02:08,346][15401] Updated weights for policy 0, policy_version 806203 (0.0032) [2024-06-25 04:02:08,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 13208829952. Throughput: 0: 42737.8. Samples: 13208894940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:08,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 04:02:12,389][15401] Updated weights for policy 0, policy_version 806213 (0.0041) [2024-06-25 04:02:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13209010176. Throughput: 0: 42917.5. Samples: 13209156420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 04:02:15,947][15401] Updated weights for policy 0, policy_version 806223 (0.0035) [2024-06-25 04:02:18,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13209239552. Throughput: 0: 42808.9. Samples: 13209407620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 04:02:19,972][15401] Updated weights for policy 0, policy_version 806233 (0.0023) [2024-06-25 04:02:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13209468928. Throughput: 0: 43010.1. Samples: 13209543580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 04:02:23,399][15401] Updated weights for policy 0, policy_version 806243 (0.0035) [2024-06-25 04:02:27,686][15401] Updated weights for policy 0, policy_version 806253 (0.0038) [2024-06-25 04:02:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13209665536. Throughput: 0: 42915.0. Samples: 13209798320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:28,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 04:02:30,961][15401] Updated weights for policy 0, policy_version 806263 (0.0032) [2024-06-25 04:02:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 13209894912. Throughput: 0: 42973.3. Samples: 13210051800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:33,393][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 04:02:35,322][15401] Updated weights for policy 0, policy_version 806273 (0.0034) [2024-06-25 04:02:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13210107904. Throughput: 0: 42961.7. Samples: 13210181880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 04:02:38,391][15349] Signal inference workers to stop experience collection... (195500 times) [2024-06-25 04:02:38,396][15349] Signal inference workers to resume experience collection... (195500 times) [2024-06-25 04:02:38,444][15401] InferenceWorker_p0-w0: stopping experience collection (195500 times) [2024-06-25 04:02:38,444][15401] InferenceWorker_p0-w0: resuming experience collection (195500 times) [2024-06-25 04:02:38,720][15401] Updated weights for policy 0, policy_version 806283 (0.0040) [2024-06-25 04:02:43,031][15401] Updated weights for policy 0, policy_version 806293 (0.0034) [2024-06-25 04:02:43,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13210304512. Throughput: 0: 42860.9. Samples: 13210439820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 04:02:46,336][15401] Updated weights for policy 0, policy_version 806303 (0.0027) [2024-06-25 04:02:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 13210550272. Throughput: 0: 42795.9. Samples: 13210689600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:48,392][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 04:02:50,658][15401] Updated weights for policy 0, policy_version 806313 (0.0028) [2024-06-25 04:02:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13210730496. Throughput: 0: 42998.6. Samples: 13210829780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 04:02:54,088][15401] Updated weights for policy 0, policy_version 806323 (0.0046) [2024-06-25 04:02:58,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13210959872. Throughput: 0: 42793.2. Samples: 13211082220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:02:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:02:58,393][15401] Updated weights for policy 0, policy_version 806333 (0.0033) [2024-06-25 04:03:01,712][15401] Updated weights for policy 0, policy_version 806343 (0.0027) [2024-06-25 04:03:03,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13211189248. Throughput: 0: 42751.6. Samples: 13211331440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:03,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 04:03:06,175][15401] Updated weights for policy 0, policy_version 806353 (0.0028) [2024-06-25 04:03:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42327.0, 300 sec: 42598.7). Total num frames: 13211369472. Throughput: 0: 42780.1. Samples: 13211468680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:08,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 04:03:09,401][15401] Updated weights for policy 0, policy_version 806363 (0.0026) [2024-06-25 04:03:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13211582464. Throughput: 0: 42839.8. Samples: 13211726100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 04:03:13,751][15401] Updated weights for policy 0, policy_version 806373 (0.0028) [2024-06-25 04:03:17,013][15401] Updated weights for policy 0, policy_version 806383 (0.0030) [2024-06-25 04:03:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13211828224. Throughput: 0: 42677.0. Samples: 13211972160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 04:03:21,650][15401] Updated weights for policy 0, policy_version 806393 (0.0036) [2024-06-25 04:03:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13212024832. Throughput: 0: 42889.3. Samples: 13212111900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:23,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 04:03:24,713][15401] Updated weights for policy 0, policy_version 806403 (0.0030) [2024-06-25 04:03:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13212205056. Throughput: 0: 42746.2. Samples: 13212363400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 04:03:29,391][15401] Updated weights for policy 0, policy_version 806413 (0.0036) [2024-06-25 04:03:32,298][15401] Updated weights for policy 0, policy_version 806423 (0.0032) [2024-06-25 04:03:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 13212483584. Throughput: 0: 42719.1. Samples: 13212611860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 04:03:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 04:03:37,219][15401] Updated weights for policy 0, policy_version 806433 (0.0025) [2024-06-25 04:03:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13212663808. Throughput: 0: 42775.8. Samples: 13212754680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:03:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 04:03:39,801][15401] Updated weights for policy 0, policy_version 806443 (0.0040) [2024-06-25 04:03:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13212860416. Throughput: 0: 42518.7. Samples: 13212995460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:03:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 04:03:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806449_13212860416.pth... [2024-06-25 04:03:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000805823_13202604032.pth [2024-06-25 04:03:44,873][15401] Updated weights for policy 0, policy_version 806453 (0.0046) [2024-06-25 04:03:45,339][15349] Signal inference workers to stop experience collection... (195550 times) [2024-06-25 04:03:45,339][15349] Signal inference workers to resume experience collection... (195550 times) [2024-06-25 04:03:45,388][15401] InferenceWorker_p0-w0: stopping experience collection (195550 times) [2024-06-25 04:03:45,388][15401] InferenceWorker_p0-w0: resuming experience collection (195550 times) [2024-06-25 04:03:47,908][15401] Updated weights for policy 0, policy_version 806463 (0.0040) [2024-06-25 04:03:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 13213089792. Throughput: 0: 42629.0. Samples: 13213249740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:03:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 04:03:52,574][15401] Updated weights for policy 0, policy_version 806473 (0.0032) [2024-06-25 04:03:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13213302784. Throughput: 0: 42609.7. Samples: 13213386220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:03:53,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 04:03:55,393][15401] Updated weights for policy 0, policy_version 806483 (0.0036) [2024-06-25 04:03:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 13213499392. Throughput: 0: 42313.2. Samples: 13213630300. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:03:58,392][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 04:04:00,453][15401] Updated weights for policy 0, policy_version 806493 (0.0035) [2024-06-25 04:04:03,361][15401] Updated weights for policy 0, policy_version 806503 (0.0031) [2024-06-25 04:04:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 13213745152. Throughput: 0: 42638.6. Samples: 13213890900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 04:04:08,078][15401] Updated weights for policy 0, policy_version 806513 (0.0042) [2024-06-25 04:04:08,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13213941760. Throughput: 0: 42483.5. Samples: 13214023660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 04:04:10,952][15401] Updated weights for policy 0, policy_version 806523 (0.0036) [2024-06-25 04:04:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13214154752. Throughput: 0: 42354.7. Samples: 13214269360. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 04:04:15,795][15401] Updated weights for policy 0, policy_version 806533 (0.0025) [2024-06-25 04:04:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13214384128. Throughput: 0: 42664.0. Samples: 13214531740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 04:04:18,445][15401] Updated weights for policy 0, policy_version 806543 (0.0043) [2024-06-25 04:04:23,310][15401] Updated weights for policy 0, policy_version 806553 (0.0043) [2024-06-25 04:04:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13214564352. Throughput: 0: 42378.9. Samples: 13214661740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:04:26,167][15401] Updated weights for policy 0, policy_version 806563 (0.0031) [2024-06-25 04:04:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 13214810112. Throughput: 0: 42572.6. Samples: 13214911220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 04:04:30,991][15401] Updated weights for policy 0, policy_version 806573 (0.0039) [2024-06-25 04:04:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13215023104. Throughput: 0: 42681.8. Samples: 13215170420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 04:04:33,626][15401] Updated weights for policy 0, policy_version 806583 (0.0038) [2024-06-25 04:04:38,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13215186944. Throughput: 0: 42582.8. Samples: 13215302340. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 04:04:38,549][15401] Updated weights for policy 0, policy_version 806593 (0.0040) [2024-06-25 04:04:41,603][15401] Updated weights for policy 0, policy_version 806603 (0.0028) [2024-06-25 04:04:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13215432704. Throughput: 0: 42605.4. Samples: 13215547440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 04:04:46,187][15401] Updated weights for policy 0, policy_version 806613 (0.0029) [2024-06-25 04:04:48,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13215662080. Throughput: 0: 42698.3. Samples: 13215812320. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 04:04:49,307][15401] Updated weights for policy 0, policy_version 806623 (0.0042) [2024-06-25 04:04:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 13215842304. Throughput: 0: 42466.3. Samples: 13215934640. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 04:04:53,975][15401] Updated weights for policy 0, policy_version 806633 (0.0035) [2024-06-25 04:04:57,163][15401] Updated weights for policy 0, policy_version 806643 (0.0029) [2024-06-25 04:04:57,591][15349] Signal inference workers to stop experience collection... (195600 times) [2024-06-25 04:04:57,594][15349] Signal inference workers to resume experience collection... (195600 times) [2024-06-25 04:04:57,644][15401] InferenceWorker_p0-w0: stopping experience collection (195600 times) [2024-06-25 04:04:57,644][15401] InferenceWorker_p0-w0: resuming experience collection (195600 times) [2024-06-25 04:04:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13216071680. Throughput: 0: 42722.7. Samples: 13216191980. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:04:58,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 04:05:01,566][15401] Updated weights for policy 0, policy_version 806653 (0.0042) [2024-06-25 04:05:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 13216284672. Throughput: 0: 42613.8. Samples: 13216449460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:05:03,393][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 04:05:05,024][15401] Updated weights for policy 0, policy_version 806663 (0.0044) [2024-06-25 04:05:08,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13216497664. Throughput: 0: 42572.5. Samples: 13216577500. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:05:08,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 04:05:09,276][15401] Updated weights for policy 0, policy_version 806673 (0.0034) [2024-06-25 04:05:12,465][15401] Updated weights for policy 0, policy_version 806683 (0.0040) [2024-06-25 04:05:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13216727040. Throughput: 0: 42819.0. Samples: 13216838080. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:05:13,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 04:05:16,803][15401] Updated weights for policy 0, policy_version 806693 (0.0031) [2024-06-25 04:05:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13216940032. Throughput: 0: 42750.1. Samples: 13217094180. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:05:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 04:05:20,094][15401] Updated weights for policy 0, policy_version 806703 (0.0031) [2024-06-25 04:05:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13217136640. Throughput: 0: 42741.3. Samples: 13217225700. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-25 04:05:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 04:05:24,528][15401] Updated weights for policy 0, policy_version 806713 (0.0031) [2024-06-25 04:05:27,807][15401] Updated weights for policy 0, policy_version 806723 (0.0047) [2024-06-25 04:05:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13217349632. Throughput: 0: 42934.3. Samples: 13217479480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 04:05:31,984][15401] Updated weights for policy 0, policy_version 806733 (0.0027) [2024-06-25 04:05:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13217595392. Throughput: 0: 42893.4. Samples: 13217742520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:33,390][15132] Avg episode reward: [(0, '0.198')] [2024-06-25 04:05:35,288][15401] Updated weights for policy 0, policy_version 806743 (0.0040) [2024-06-25 04:05:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13217792000. Throughput: 0: 43078.2. Samples: 13217873160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:38,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 04:05:39,699][15401] Updated weights for policy 0, policy_version 806753 (0.0040) [2024-06-25 04:05:42,832][15401] Updated weights for policy 0, policy_version 806763 (0.0032) [2024-06-25 04:05:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13218004992. Throughput: 0: 43021.0. Samples: 13218127820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 04:05:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806763_13218004992.pth... [2024-06-25 04:05:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806138_13207764992.pth [2024-06-25 04:05:47,129][15401] Updated weights for policy 0, policy_version 806773 (0.0045) [2024-06-25 04:05:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13218217984. Throughput: 0: 43160.1. Samples: 13218391560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:48,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 04:05:50,240][15401] Updated weights for policy 0, policy_version 806783 (0.0029) [2024-06-25 04:05:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13218447360. Throughput: 0: 43101.4. Samples: 13218517060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 04:05:54,614][15401] Updated weights for policy 0, policy_version 806793 (0.0039) [2024-06-25 04:05:58,015][15401] Updated weights for policy 0, policy_version 806803 (0.0025) [2024-06-25 04:05:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.2, 300 sec: 42820.9). Total num frames: 13218660352. Throughput: 0: 43016.5. Samples: 13218773820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:05:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 04:06:02,550][15401] Updated weights for policy 0, policy_version 806813 (0.0029) [2024-06-25 04:06:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 13218873344. Throughput: 0: 43105.7. Samples: 13219033940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:03,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 04:06:06,013][15401] Updated weights for policy 0, policy_version 806823 (0.0041) [2024-06-25 04:06:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13219102720. Throughput: 0: 42981.3. Samples: 13219159860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 04:06:10,061][15401] Updated weights for policy 0, policy_version 806833 (0.0045) [2024-06-25 04:06:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13219282944. Throughput: 0: 43032.8. Samples: 13219415960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 04:06:13,839][15401] Updated weights for policy 0, policy_version 806843 (0.0048) [2024-06-25 04:06:17,786][15401] Updated weights for policy 0, policy_version 806853 (0.0043) [2024-06-25 04:06:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13219512320. Throughput: 0: 42915.6. Samples: 13219673720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:18,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:06:21,396][15401] Updated weights for policy 0, policy_version 806863 (0.0035) [2024-06-25 04:06:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13219725312. Throughput: 0: 42856.8. Samples: 13219801720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:23,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 04:06:23,513][15349] Signal inference workers to stop experience collection... (195650 times) [2024-06-25 04:06:23,513][15349] Signal inference workers to resume experience collection... (195650 times) [2024-06-25 04:06:23,532][15401] InferenceWorker_p0-w0: stopping experience collection (195650 times) [2024-06-25 04:06:23,532][15401] InferenceWorker_p0-w0: resuming experience collection (195650 times) [2024-06-25 04:06:25,470][15401] Updated weights for policy 0, policy_version 806873 (0.0031) [2024-06-25 04:06:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13219938304. Throughput: 0: 42911.5. Samples: 13220058840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:28,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 04:06:29,242][15401] Updated weights for policy 0, policy_version 806883 (0.0044) [2024-06-25 04:06:32,872][15401] Updated weights for policy 0, policy_version 806893 (0.0036) [2024-06-25 04:06:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13220151296. Throughput: 0: 42725.3. Samples: 13220314200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 04:06:36,759][15401] Updated weights for policy 0, policy_version 806903 (0.0034) [2024-06-25 04:06:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13220380672. Throughput: 0: 42864.9. Samples: 13220445980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 04:06:40,291][15401] Updated weights for policy 0, policy_version 806913 (0.0033) [2024-06-25 04:06:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13220577280. Throughput: 0: 43048.5. Samples: 13220711000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 04:06:44,533][15401] Updated weights for policy 0, policy_version 806923 (0.0034) [2024-06-25 04:06:47,704][15401] Updated weights for policy 0, policy_version 806933 (0.0035) [2024-06-25 04:06:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13220790272. Throughput: 0: 42893.9. Samples: 13220964160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 04:06:52,244][15401] Updated weights for policy 0, policy_version 806943 (0.0031) [2024-06-25 04:06:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13221019648. Throughput: 0: 42972.5. Samples: 13221093620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:53,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 04:06:55,859][15401] Updated weights for policy 0, policy_version 806953 (0.0037) [2024-06-25 04:06:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13221232640. Throughput: 0: 43077.9. Samples: 13221354460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:06:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 04:06:59,699][15401] Updated weights for policy 0, policy_version 806963 (0.0027) [2024-06-25 04:07:03,251][15401] Updated weights for policy 0, policy_version 806973 (0.0030) [2024-06-25 04:07:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 13221445632. Throughput: 0: 43129.1. Samples: 13221614540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:07:03,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 04:07:07,105][15401] Updated weights for policy 0, policy_version 806983 (0.0037) [2024-06-25 04:07:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13221658624. Throughput: 0: 43185.1. Samples: 13221745040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 04:07:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 04:07:10,617][15401] Updated weights for policy 0, policy_version 806993 (0.0046) [2024-06-25 04:07:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13221871616. Throughput: 0: 43194.6. Samples: 13222002600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:13,400][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 04:07:15,105][15401] Updated weights for policy 0, policy_version 807003 (0.0023) [2024-06-25 04:07:18,268][15401] Updated weights for policy 0, policy_version 807013 (0.0036) [2024-06-25 04:07:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13222100992. Throughput: 0: 43097.7. Samples: 13222253600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 04:07:22,665][15401] Updated weights for policy 0, policy_version 807023 (0.0038) [2024-06-25 04:07:23,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 13222313984. Throughput: 0: 43117.4. Samples: 13222386540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:23,397][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 04:07:26,082][15401] Updated weights for policy 0, policy_version 807033 (0.0031) [2024-06-25 04:07:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 13222494208. Throughput: 0: 42831.9. Samples: 13222638440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 04:07:30,457][15401] Updated weights for policy 0, policy_version 807043 (0.0043) [2024-06-25 04:07:31,766][15349] Signal inference workers to stop experience collection... (195700 times) [2024-06-25 04:07:31,773][15349] Signal inference workers to resume experience collection... (195700 times) [2024-06-25 04:07:31,814][15401] InferenceWorker_p0-w0: stopping experience collection (195700 times) [2024-06-25 04:07:31,814][15401] InferenceWorker_p0-w0: resuming experience collection (195700 times) [2024-06-25 04:07:33,390][15132] Fps is (10 sec: 42625.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13222739968. Throughput: 0: 42858.6. Samples: 13222892800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 04:07:33,576][15401] Updated weights for policy 0, policy_version 807053 (0.0044) [2024-06-25 04:07:38,035][15401] Updated weights for policy 0, policy_version 807063 (0.0028) [2024-06-25 04:07:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13222936576. Throughput: 0: 42899.5. Samples: 13223024100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:38,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-25 04:07:41,144][15401] Updated weights for policy 0, policy_version 807073 (0.0043) [2024-06-25 04:07:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13223149568. Throughput: 0: 42639.0. Samples: 13223273220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 04:07:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807077_13223149568.pth... [2024-06-25 04:07:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806449_13212860416.pth [2024-06-25 04:07:45,780][15401] Updated weights for policy 0, policy_version 807083 (0.0034) [2024-06-25 04:07:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13223378944. Throughput: 0: 42594.3. Samples: 13223531280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:48,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 04:07:48,757][15401] Updated weights for policy 0, policy_version 807093 (0.0026) [2024-06-25 04:07:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 13223575552. Throughput: 0: 42616.8. Samples: 13223662800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:53,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 04:07:53,399][15401] Updated weights for policy 0, policy_version 807103 (0.0033) [2024-06-25 04:07:56,837][15401] Updated weights for policy 0, policy_version 807113 (0.0039) [2024-06-25 04:07:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13223804928. Throughput: 0: 42526.2. Samples: 13223916280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:07:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 04:08:00,966][15401] Updated weights for policy 0, policy_version 807123 (0.0026) [2024-06-25 04:08:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13224017920. Throughput: 0: 42587.6. Samples: 13224170040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 04:08:04,362][15401] Updated weights for policy 0, policy_version 807133 (0.0029) [2024-06-25 04:08:08,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.0, 300 sec: 42765.0). Total num frames: 13224198144. Throughput: 0: 42544.0. Samples: 13224300760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 04:08:08,608][15401] Updated weights for policy 0, policy_version 807143 (0.0028) [2024-06-25 04:08:12,064][15401] Updated weights for policy 0, policy_version 807153 (0.0038) [2024-06-25 04:08:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13224443904. Throughput: 0: 42632.5. Samples: 13224556900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:13,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 04:08:16,175][15401] Updated weights for policy 0, policy_version 807163 (0.0036) [2024-06-25 04:08:18,390][15132] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13224656896. Throughput: 0: 42655.5. Samples: 13224812300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:18,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 04:08:19,611][15401] Updated weights for policy 0, policy_version 807173 (0.0029) [2024-06-25 04:08:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42329.9, 300 sec: 42876.1). Total num frames: 13224853504. Throughput: 0: 42665.9. Samples: 13224944060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 04:08:23,666][15401] Updated weights for policy 0, policy_version 807183 (0.0046) [2024-06-25 04:08:27,348][15401] Updated weights for policy 0, policy_version 807193 (0.0030) [2024-06-25 04:08:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13225082880. Throughput: 0: 42865.7. Samples: 13225202180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 04:08:31,255][15401] Updated weights for policy 0, policy_version 807203 (0.0030) [2024-06-25 04:08:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13225312256. Throughput: 0: 42775.7. Samples: 13225456180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 04:08:35,055][15401] Updated weights for policy 0, policy_version 807213 (0.0030) [2024-06-25 04:08:38,395][15132] Fps is (10 sec: 40939.6, 60 sec: 42594.8, 300 sec: 42819.8). Total num frames: 13225492480. Throughput: 0: 42809.8. Samples: 13225589460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:38,395][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 04:08:39,001][15401] Updated weights for policy 0, policy_version 807223 (0.0043) [2024-06-25 04:08:42,558][15401] Updated weights for policy 0, policy_version 807233 (0.0048) [2024-06-25 04:08:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13225738240. Throughput: 0: 42880.5. Samples: 13225845900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 04:08:46,658][15349] Signal inference workers to stop experience collection... (195750 times) [2024-06-25 04:08:46,714][15401] InferenceWorker_p0-w0: stopping experience collection (195750 times) [2024-06-25 04:08:46,722][15349] Signal inference workers to resume experience collection... (195750 times) [2024-06-25 04:08:46,730][15401] InferenceWorker_p0-w0: resuming experience collection (195750 times) [2024-06-25 04:08:46,733][15401] Updated weights for policy 0, policy_version 807243 (0.0025) [2024-06-25 04:08:48,390][15132] Fps is (10 sec: 45898.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 13225951232. Throughput: 0: 42873.3. Samples: 13226099340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 04:08:50,096][15401] Updated weights for policy 0, policy_version 807253 (0.0029) [2024-06-25 04:08:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 13226131456. Throughput: 0: 42991.0. Samples: 13226235340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 04:08:54,127][15401] Updated weights for policy 0, policy_version 807263 (0.0044) [2024-06-25 04:08:57,854][15401] Updated weights for policy 0, policy_version 807273 (0.0033) [2024-06-25 04:08:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13226360832. Throughput: 0: 42870.8. Samples: 13226486080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 04:08:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 04:09:01,742][15401] Updated weights for policy 0, policy_version 807283 (0.0033) [2024-06-25 04:09:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13226590208. Throughput: 0: 42969.4. Samples: 13226745920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:03,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 04:09:05,687][15401] Updated weights for policy 0, policy_version 807293 (0.0042) [2024-06-25 04:09:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 13226770432. Throughput: 0: 42898.6. Samples: 13226874500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 04:09:09,686][15401] Updated weights for policy 0, policy_version 807303 (0.0037) [2024-06-25 04:09:13,189][15401] Updated weights for policy 0, policy_version 807313 (0.0027) [2024-06-25 04:09:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13227016192. Throughput: 0: 42855.6. Samples: 13227130680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 04:09:17,125][15401] Updated weights for policy 0, policy_version 807323 (0.0029) [2024-06-25 04:09:18,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13227229184. Throughput: 0: 43060.4. Samples: 13227393900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 04:09:20,773][15401] Updated weights for policy 0, policy_version 807333 (0.0024) [2024-06-25 04:09:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13227425792. Throughput: 0: 42959.6. Samples: 13227522420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 04:09:24,525][15401] Updated weights for policy 0, policy_version 807343 (0.0031) [2024-06-25 04:09:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13227655168. Throughput: 0: 42845.5. Samples: 13227773940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:28,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 04:09:28,437][15401] Updated weights for policy 0, policy_version 807353 (0.0029) [2024-06-25 04:09:32,552][15401] Updated weights for policy 0, policy_version 807363 (0.0028) [2024-06-25 04:09:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13227868160. Throughput: 0: 42994.6. Samples: 13228034100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:33,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 04:09:35,850][15401] Updated weights for policy 0, policy_version 807373 (0.0047) [2024-06-25 04:09:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42875.2, 300 sec: 42820.6). Total num frames: 13228064768. Throughput: 0: 42830.8. Samples: 13228162720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 04:09:40,023][15401] Updated weights for policy 0, policy_version 807383 (0.0034) [2024-06-25 04:09:43,278][15401] Updated weights for policy 0, policy_version 807393 (0.0041) [2024-06-25 04:09:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13228326912. Throughput: 0: 43084.4. Samples: 13228424880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 04:09:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807393_13228326912.pth... [2024-06-25 04:09:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000806763_13218004992.pth [2024-06-25 04:09:47,517][15401] Updated weights for policy 0, policy_version 807403 (0.0037) [2024-06-25 04:09:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13228523520. Throughput: 0: 42996.4. Samples: 13228680760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:48,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 04:09:50,022][15349] Signal inference workers to stop experience collection... (195800 times) [2024-06-25 04:09:50,072][15401] InferenceWorker_p0-w0: stopping experience collection (195800 times) [2024-06-25 04:09:50,136][15349] Signal inference workers to resume experience collection... (195800 times) [2024-06-25 04:09:50,136][15401] InferenceWorker_p0-w0: resuming experience collection (195800 times) [2024-06-25 04:09:50,986][15401] Updated weights for policy 0, policy_version 807413 (0.0030) [2024-06-25 04:09:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13228703744. Throughput: 0: 42857.4. Samples: 13228803080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 04:09:55,199][15401] Updated weights for policy 0, policy_version 807423 (0.0032) [2024-06-25 04:09:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13228949504. Throughput: 0: 43006.7. Samples: 13229065980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:09:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 04:09:58,932][15401] Updated weights for policy 0, policy_version 807433 (0.0047) [2024-06-25 04:10:02,757][15401] Updated weights for policy 0, policy_version 807443 (0.0048) [2024-06-25 04:10:03,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13229162496. Throughput: 0: 42973.4. Samples: 13229327700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 04:10:06,445][15401] Updated weights for policy 0, policy_version 807453 (0.0035) [2024-06-25 04:10:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13229359104. Throughput: 0: 42881.4. Samples: 13229452080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 04:10:10,293][15401] Updated weights for policy 0, policy_version 807463 (0.0032) [2024-06-25 04:10:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13229588480. Throughput: 0: 43067.9. Samples: 13229712000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 04:10:14,247][15401] Updated weights for policy 0, policy_version 807473 (0.0033) [2024-06-25 04:10:17,914][15401] Updated weights for policy 0, policy_version 807483 (0.0038) [2024-06-25 04:10:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13229801472. Throughput: 0: 42927.2. Samples: 13229965820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 04:10:21,777][15401] Updated weights for policy 0, policy_version 807493 (0.0040) [2024-06-25 04:10:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13230014464. Throughput: 0: 43049.7. Samples: 13230099960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 04:10:25,646][15401] Updated weights for policy 0, policy_version 807503 (0.0037) [2024-06-25 04:10:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13230227456. Throughput: 0: 42889.8. Samples: 13230354920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 04:10:29,421][15401] Updated weights for policy 0, policy_version 807513 (0.0039) [2024-06-25 04:10:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13230440448. Throughput: 0: 42840.5. Samples: 13230608580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:33,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 04:10:33,423][15401] Updated weights for policy 0, policy_version 807523 (0.0033) [2024-06-25 04:10:37,043][15401] Updated weights for policy 0, policy_version 807533 (0.0029) [2024-06-25 04:10:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13230653440. Throughput: 0: 42955.3. Samples: 13230736080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 04:10:41,399][15401] Updated weights for policy 0, policy_version 807543 (0.0028) [2024-06-25 04:10:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13230882816. Throughput: 0: 42834.2. Samples: 13230993520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-25 04:10:43,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-25 04:10:44,901][15401] Updated weights for policy 0, policy_version 807553 (0.0034) [2024-06-25 04:10:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13231079424. Throughput: 0: 42721.3. Samples: 13231250160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:10:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 04:10:49,062][15401] Updated weights for policy 0, policy_version 807563 (0.0037) [2024-06-25 04:10:52,611][15401] Updated weights for policy 0, policy_version 807573 (0.0031) [2024-06-25 04:10:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13231308800. Throughput: 0: 42886.1. Samples: 13231381960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:10:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 04:10:56,567][15401] Updated weights for policy 0, policy_version 807583 (0.0032) [2024-06-25 04:10:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13231489024. Throughput: 0: 42781.0. Samples: 13231637140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:10:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 04:11:00,331][15401] Updated weights for policy 0, policy_version 807593 (0.0030) [2024-06-25 04:11:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13231734784. Throughput: 0: 42820.8. Samples: 13231892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:03,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-25 04:11:04,013][15401] Updated weights for policy 0, policy_version 807603 (0.0030) [2024-06-25 04:11:07,978][15401] Updated weights for policy 0, policy_version 807613 (0.0040) [2024-06-25 04:11:08,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 13231931392. Throughput: 0: 42733.7. Samples: 13232023080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:08,401][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 04:11:12,155][15401] Updated weights for policy 0, policy_version 807623 (0.0033) [2024-06-25 04:11:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13232144384. Throughput: 0: 42806.5. Samples: 13232281220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:11:15,459][15401] Updated weights for policy 0, policy_version 807633 (0.0024) [2024-06-25 04:11:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13232373760. Throughput: 0: 42811.1. Samples: 13232535080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 04:11:19,548][15401] Updated weights for policy 0, policy_version 807643 (0.0049) [2024-06-25 04:11:22,683][15349] Signal inference workers to stop experience collection... (195850 times) [2024-06-25 04:11:22,683][15349] Signal inference workers to resume experience collection... (195850 times) [2024-06-25 04:11:22,708][15401] InferenceWorker_p0-w0: stopping experience collection (195850 times) [2024-06-25 04:11:22,708][15401] InferenceWorker_p0-w0: resuming experience collection (195850 times) [2024-06-25 04:11:22,983][15401] Updated weights for policy 0, policy_version 807653 (0.0023) [2024-06-25 04:11:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13232586752. Throughput: 0: 42937.4. Samples: 13232668260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 04:11:26,961][15401] Updated weights for policy 0, policy_version 807663 (0.0040) [2024-06-25 04:11:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13232799744. Throughput: 0: 42936.0. Samples: 13232925640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:11:30,685][15401] Updated weights for policy 0, policy_version 807673 (0.0031) [2024-06-25 04:11:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13233029120. Throughput: 0: 42860.0. Samples: 13233178860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:33,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 04:11:34,538][15401] Updated weights for policy 0, policy_version 807683 (0.0029) [2024-06-25 04:11:38,353][15401] Updated weights for policy 0, policy_version 807693 (0.0036) [2024-06-25 04:11:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13233242112. Throughput: 0: 42857.2. Samples: 13233310540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 04:11:42,215][15401] Updated weights for policy 0, policy_version 807703 (0.0031) [2024-06-25 04:11:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13233438720. Throughput: 0: 42905.4. Samples: 13233567880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 04:11:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807706_13233455104.pth... [2024-06-25 04:11:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807077_13223149568.pth [2024-06-25 04:11:45,858][15401] Updated weights for policy 0, policy_version 807713 (0.0034) [2024-06-25 04:11:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13233668096. Throughput: 0: 42846.8. Samples: 13233820860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 04:11:49,695][15401] Updated weights for policy 0, policy_version 807723 (0.0037) [2024-06-25 04:11:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13233881088. Throughput: 0: 42881.9. Samples: 13233952660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 04:11:53,563][15401] Updated weights for policy 0, policy_version 807733 (0.0030) [2024-06-25 04:11:57,737][15401] Updated weights for policy 0, policy_version 807743 (0.0032) [2024-06-25 04:11:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13234077696. Throughput: 0: 42805.8. Samples: 13234207480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:11:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 04:12:01,167][15401] Updated weights for policy 0, policy_version 807753 (0.0033) [2024-06-25 04:12:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13234307072. Throughput: 0: 42810.3. Samples: 13234461540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 04:12:05,227][15401] Updated weights for policy 0, policy_version 807763 (0.0036) [2024-06-25 04:12:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 13234520064. Throughput: 0: 42935.7. Samples: 13234600360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 04:12:08,656][15401] Updated weights for policy 0, policy_version 807773 (0.0034) [2024-06-25 04:12:12,774][15401] Updated weights for policy 0, policy_version 807783 (0.0034) [2024-06-25 04:12:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13234716672. Throughput: 0: 42886.1. Samples: 13234855520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 04:12:16,775][15401] Updated weights for policy 0, policy_version 807793 (0.0038) [2024-06-25 04:12:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 13234962432. Throughput: 0: 42797.3. Samples: 13235104740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:18,394][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 04:12:20,280][15401] Updated weights for policy 0, policy_version 807803 (0.0024) [2024-06-25 04:12:23,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 13235159040. Throughput: 0: 42914.3. Samples: 13235241780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:23,392][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 04:12:24,179][15401] Updated weights for policy 0, policy_version 807813 (0.0027) [2024-06-25 04:12:27,785][15401] Updated weights for policy 0, policy_version 807823 (0.0032) [2024-06-25 04:12:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13235372032. Throughput: 0: 42834.6. Samples: 13235495440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 04:12:31,708][15401] Updated weights for policy 0, policy_version 807833 (0.0040) [2024-06-25 04:12:33,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13235617792. Throughput: 0: 42871.9. Samples: 13235750100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 04:12:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 04:12:35,684][15401] Updated weights for policy 0, policy_version 807843 (0.0028) [2024-06-25 04:12:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13235781632. Throughput: 0: 42939.5. Samples: 13235884940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:12:38,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 04:12:38,987][15349] Signal inference workers to stop experience collection... (195900 times) [2024-06-25 04:12:38,990][15349] Signal inference workers to resume experience collection... (195900 times) [2024-06-25 04:12:39,007][15401] InferenceWorker_p0-w0: stopping experience collection (195900 times) [2024-06-25 04:12:39,008][15401] InferenceWorker_p0-w0: resuming experience collection (195900 times) [2024-06-25 04:12:39,279][15401] Updated weights for policy 0, policy_version 807853 (0.0036) [2024-06-25 04:12:43,115][15401] Updated weights for policy 0, policy_version 807863 (0.0029) [2024-06-25 04:12:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13236027392. Throughput: 0: 42949.4. Samples: 13236140200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:12:43,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 04:12:47,181][15401] Updated weights for policy 0, policy_version 807873 (0.0028) [2024-06-25 04:12:48,389][15132] Fps is (10 sec: 49152.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13236273152. Throughput: 0: 43048.1. Samples: 13236398700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:12:48,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 04:12:50,564][15401] Updated weights for policy 0, policy_version 807883 (0.0038) [2024-06-25 04:12:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13236420608. Throughput: 0: 42957.6. Samples: 13236533460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:12:53,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 04:12:54,573][15401] Updated weights for policy 0, policy_version 807893 (0.0037) [2024-06-25 04:12:58,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13236649984. Throughput: 0: 42820.7. Samples: 13236782440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:12:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 04:12:59,065][15401] Updated weights for policy 0, policy_version 807903 (0.0042) [2024-06-25 04:13:02,326][15401] Updated weights for policy 0, policy_version 807913 (0.0027) [2024-06-25 04:13:03,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 43042.8). Total num frames: 13236895744. Throughput: 0: 42948.6. Samples: 13237037420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 04:13:06,575][15401] Updated weights for policy 0, policy_version 807923 (0.0027) [2024-06-25 04:13:08,396][15132] Fps is (10 sec: 40933.3, 60 sec: 42320.7, 300 sec: 42764.1). Total num frames: 13237059584. Throughput: 0: 42815.7. Samples: 13237168660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:08,397][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 04:13:10,170][15401] Updated weights for policy 0, policy_version 807933 (0.0036) [2024-06-25 04:13:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13237305344. Throughput: 0: 42728.0. Samples: 13237418200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 04:13:14,074][15401] Updated weights for policy 0, policy_version 807943 (0.0027) [2024-06-25 04:13:17,713][15401] Updated weights for policy 0, policy_version 807953 (0.0027) [2024-06-25 04:13:18,392][15132] Fps is (10 sec: 45893.6, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 13237518336. Throughput: 0: 42883.1. Samples: 13237679940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:18,395][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:13:21,591][15401] Updated weights for policy 0, policy_version 807963 (0.0039) [2024-06-25 04:13:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 13237714944. Throughput: 0: 42681.3. Samples: 13237805600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:13:25,193][15401] Updated weights for policy 0, policy_version 807973 (0.0041) [2024-06-25 04:13:28,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13237944320. Throughput: 0: 42702.8. Samples: 13238061820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 04:13:29,162][15401] Updated weights for policy 0, policy_version 807983 (0.0039) [2024-06-25 04:13:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42876.8). Total num frames: 13238140928. Throughput: 0: 42620.8. Samples: 13238316640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:33,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 04:13:33,433][15401] Updated weights for policy 0, policy_version 807993 (0.0036) [2024-06-25 04:13:36,922][15401] Updated weights for policy 0, policy_version 808003 (0.0029) [2024-06-25 04:13:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13238353920. Throughput: 0: 42345.0. Samples: 13238438980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 04:13:41,063][15401] Updated weights for policy 0, policy_version 808013 (0.0029) [2024-06-25 04:13:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13238583296. Throughput: 0: 42536.7. Samples: 13238696600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:13:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808019_13238583296.pth... [2024-06-25 04:13:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807393_13228326912.pth [2024-06-25 04:13:45,191][15401] Updated weights for policy 0, policy_version 808023 (0.0035) [2024-06-25 04:13:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42876.1). Total num frames: 13238779904. Throughput: 0: 42434.6. Samples: 13238946980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 04:13:49,130][15401] Updated weights for policy 0, policy_version 808033 (0.0037) [2024-06-25 04:13:52,834][15401] Updated weights for policy 0, policy_version 808043 (0.0034) [2024-06-25 04:13:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13238992896. Throughput: 0: 42338.1. Samples: 13239073600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:53,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-25 04:13:56,699][15401] Updated weights for policy 0, policy_version 808053 (0.0042) [2024-06-25 04:13:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13239205888. Throughput: 0: 42506.7. Samples: 13239331000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:13:58,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 04:14:00,558][15401] Updated weights for policy 0, policy_version 808063 (0.0021) [2024-06-25 04:14:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 13239418880. Throughput: 0: 42373.0. Samples: 13239586620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:14:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 04:14:04,556][15401] Updated weights for policy 0, policy_version 808073 (0.0033) [2024-06-25 04:14:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 13239615488. Throughput: 0: 42402.6. Samples: 13239713720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:14:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 04:14:08,408][15401] Updated weights for policy 0, policy_version 808083 (0.0035) [2024-06-25 04:14:12,055][15401] Updated weights for policy 0, policy_version 808093 (0.0032) [2024-06-25 04:14:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13239828480. Throughput: 0: 42375.9. Samples: 13239968740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:14:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 04:14:14,204][15349] Signal inference workers to stop experience collection... (195950 times) [2024-06-25 04:14:14,204][15349] Signal inference workers to resume experience collection... (195950 times) [2024-06-25 04:14:14,257][15401] InferenceWorker_p0-w0: stopping experience collection (195950 times) [2024-06-25 04:14:14,258][15401] InferenceWorker_p0-w0: resuming experience collection (195950 times) [2024-06-25 04:14:15,939][15401] Updated weights for policy 0, policy_version 808103 (0.0032) [2024-06-25 04:14:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 13240041472. Throughput: 0: 42430.3. Samples: 13240226000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 04:14:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 04:14:19,638][15401] Updated weights for policy 0, policy_version 808113 (0.0034) [2024-06-25 04:14:23,351][15401] Updated weights for policy 0, policy_version 808123 (0.0032) [2024-06-25 04:14:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13240287232. Throughput: 0: 42576.8. Samples: 13240354940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 04:14:27,244][15401] Updated weights for policy 0, policy_version 808133 (0.0037) [2024-06-25 04:14:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 13240483840. Throughput: 0: 42467.9. Samples: 13240607660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 04:14:30,894][15401] Updated weights for policy 0, policy_version 808143 (0.0044) [2024-06-25 04:14:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13240696832. Throughput: 0: 42718.9. Samples: 13240869340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:33,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 04:14:34,878][15401] Updated weights for policy 0, policy_version 808153 (0.0038) [2024-06-25 04:14:38,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13240926208. Throughput: 0: 42677.3. Samples: 13240994080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 04:14:38,548][15401] Updated weights for policy 0, policy_version 808163 (0.0033) [2024-06-25 04:14:42,872][15401] Updated weights for policy 0, policy_version 808173 (0.0040) [2024-06-25 04:14:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13241122816. Throughput: 0: 42737.8. Samples: 13241254200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 04:14:45,991][15401] Updated weights for policy 0, policy_version 808183 (0.0039) [2024-06-25 04:14:48,391][15132] Fps is (10 sec: 40952.8, 60 sec: 42597.1, 300 sec: 42820.3). Total num frames: 13241335808. Throughput: 0: 42844.0. Samples: 13241514680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:48,392][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 04:14:50,430][15401] Updated weights for policy 0, policy_version 808193 (0.0031) [2024-06-25 04:14:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13241565184. Throughput: 0: 42844.6. Samples: 13241641720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 04:14:53,543][15401] Updated weights for policy 0, policy_version 808203 (0.0028) [2024-06-25 04:14:57,987][15401] Updated weights for policy 0, policy_version 808213 (0.0036) [2024-06-25 04:14:58,389][15132] Fps is (10 sec: 42606.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13241761792. Throughput: 0: 42880.6. Samples: 13241898360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:14:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 04:15:01,505][15401] Updated weights for policy 0, policy_version 808223 (0.0036) [2024-06-25 04:15:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13241974784. Throughput: 0: 42888.0. Samples: 13242155960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 04:15:05,600][15401] Updated weights for policy 0, policy_version 808233 (0.0032) [2024-06-25 04:15:08,396][15132] Fps is (10 sec: 45845.5, 60 sec: 43413.0, 300 sec: 42819.6). Total num frames: 13242220544. Throughput: 0: 42781.9. Samples: 13242280400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:08,397][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 04:15:09,115][15401] Updated weights for policy 0, policy_version 808243 (0.0038) [2024-06-25 04:15:13,126][15401] Updated weights for policy 0, policy_version 808253 (0.0040) [2024-06-25 04:15:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13242417152. Throughput: 0: 42832.5. Samples: 13242535120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 04:15:16,855][15401] Updated weights for policy 0, policy_version 808263 (0.0038) [2024-06-25 04:15:18,390][15132] Fps is (10 sec: 39346.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13242613760. Throughput: 0: 42713.9. Samples: 13242791460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 04:15:20,753][15401] Updated weights for policy 0, policy_version 808273 (0.0023) [2024-06-25 04:15:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13242859520. Throughput: 0: 42768.1. Samples: 13242918640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 04:15:24,787][15401] Updated weights for policy 0, policy_version 808283 (0.0023) [2024-06-25 04:15:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13243056128. Throughput: 0: 42794.3. Samples: 13243179940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 04:15:28,452][15401] Updated weights for policy 0, policy_version 808293 (0.0024) [2024-06-25 04:15:32,359][15401] Updated weights for policy 0, policy_version 808303 (0.0036) [2024-06-25 04:15:33,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13243252736. Throughput: 0: 42872.4. Samples: 13243443860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:33,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 04:15:36,047][15401] Updated weights for policy 0, policy_version 808313 (0.0045) [2024-06-25 04:15:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13243498496. Throughput: 0: 42760.0. Samples: 13243565920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:38,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 04:15:39,869][15401] Updated weights for policy 0, policy_version 808323 (0.0040) [2024-06-25 04:15:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13243695104. Throughput: 0: 42822.6. Samples: 13243825380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 04:15:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808332_13243711488.pth... [2024-06-25 04:15:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000807706_13233455104.pth [2024-06-25 04:15:43,782][15401] Updated weights for policy 0, policy_version 808333 (0.0036) [2024-06-25 04:15:47,775][15401] Updated weights for policy 0, policy_version 808343 (0.0024) [2024-06-25 04:15:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43145.9, 300 sec: 42765.0). Total num frames: 13243924480. Throughput: 0: 42820.4. Samples: 13244082880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 04:15:51,329][15401] Updated weights for policy 0, policy_version 808353 (0.0030) [2024-06-25 04:15:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13244137472. Throughput: 0: 42927.5. Samples: 13244211860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 04:15:55,473][15401] Updated weights for policy 0, policy_version 808363 (0.0022) [2024-06-25 04:15:55,496][15349] Signal inference workers to stop experience collection... (196000 times) [2024-06-25 04:15:55,497][15349] Signal inference workers to resume experience collection... (196000 times) [2024-06-25 04:15:55,529][15401] InferenceWorker_p0-w0: stopping experience collection (196000 times) [2024-06-25 04:15:55,529][15401] InferenceWorker_p0-w0: resuming experience collection (196000 times) [2024-06-25 04:15:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13244334080. Throughput: 0: 42996.2. Samples: 13244469940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:15:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 04:15:58,810][15401] Updated weights for policy 0, policy_version 808373 (0.0024) [2024-06-25 04:16:03,017][15401] Updated weights for policy 0, policy_version 808383 (0.0034) [2024-06-25 04:16:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 13244563456. Throughput: 0: 42998.7. Samples: 13244726400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:16:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 04:16:06,436][15401] Updated weights for policy 0, policy_version 808393 (0.0032) [2024-06-25 04:16:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 13244776448. Throughput: 0: 43010.1. Samples: 13244854100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:16:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 04:16:10,575][15401] Updated weights for policy 0, policy_version 808403 (0.0028) [2024-06-25 04:16:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13244989440. Throughput: 0: 42956.0. Samples: 13245112960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 04:16:14,233][15401] Updated weights for policy 0, policy_version 808413 (0.0034) [2024-06-25 04:16:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13245186048. Throughput: 0: 42748.9. Samples: 13245367560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:18,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 04:16:18,426][15401] Updated weights for policy 0, policy_version 808423 (0.0029) [2024-06-25 04:16:22,283][15401] Updated weights for policy 0, policy_version 808433 (0.0038) [2024-06-25 04:16:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13245399040. Throughput: 0: 42846.6. Samples: 13245494020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:23,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-25 04:16:26,061][15401] Updated weights for policy 0, policy_version 808443 (0.0039) [2024-06-25 04:16:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13245628416. Throughput: 0: 42766.6. Samples: 13245749880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:28,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 04:16:29,860][15401] Updated weights for policy 0, policy_version 808453 (0.0040) [2024-06-25 04:16:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13245825024. Throughput: 0: 42876.9. Samples: 13246012340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 04:16:33,643][15401] Updated weights for policy 0, policy_version 808463 (0.0039) [2024-06-25 04:16:37,378][15401] Updated weights for policy 0, policy_version 808473 (0.0028) [2024-06-25 04:16:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13246054400. Throughput: 0: 42861.3. Samples: 13246140620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 04:16:41,188][15401] Updated weights for policy 0, policy_version 808483 (0.0051) [2024-06-25 04:16:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13246267392. Throughput: 0: 42749.6. Samples: 13246393680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:43,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 04:16:44,985][15401] Updated weights for policy 0, policy_version 808493 (0.0033) [2024-06-25 04:16:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13246480384. Throughput: 0: 42926.2. Samples: 13246658080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 04:16:48,719][15401] Updated weights for policy 0, policy_version 808503 (0.0032) [2024-06-25 04:16:52,539][15401] Updated weights for policy 0, policy_version 808513 (0.0039) [2024-06-25 04:16:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13246693376. Throughput: 0: 42880.1. Samples: 13246783700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:53,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 04:16:56,399][15401] Updated weights for policy 0, policy_version 808523 (0.0046) [2024-06-25 04:16:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13246922752. Throughput: 0: 42818.7. Samples: 13247039800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:16:58,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 04:17:00,245][15401] Updated weights for policy 0, policy_version 808533 (0.0035) [2024-06-25 04:17:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13247135744. Throughput: 0: 42932.0. Samples: 13247299500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:03,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 04:17:03,962][15401] Updated weights for policy 0, policy_version 808543 (0.0044) [2024-06-25 04:17:07,711][15401] Updated weights for policy 0, policy_version 808553 (0.0030) [2024-06-25 04:17:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13247348736. Throughput: 0: 42941.3. Samples: 13247426380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:17:11,495][15401] Updated weights for policy 0, policy_version 808563 (0.0032) [2024-06-25 04:17:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13247561728. Throughput: 0: 42785.4. Samples: 13247675220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 04:17:15,446][15401] Updated weights for policy 0, policy_version 808573 (0.0041) [2024-06-25 04:17:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 13247774720. Throughput: 0: 42859.5. Samples: 13247941020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:18,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 04:17:19,034][15401] Updated weights for policy 0, policy_version 808583 (0.0040) [2024-06-25 04:17:23,117][15401] Updated weights for policy 0, policy_version 808593 (0.0037) [2024-06-25 04:17:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13247987712. Throughput: 0: 42794.3. Samples: 13248066360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 04:17:26,649][15401] Updated weights for policy 0, policy_version 808603 (0.0034) [2024-06-25 04:17:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13248200704. Throughput: 0: 42839.7. Samples: 13248321460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 04:17:31,002][15401] Updated weights for policy 0, policy_version 808613 (0.0029) [2024-06-25 04:17:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13248397312. Throughput: 0: 42775.4. Samples: 13248582980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 04:17:33,895][15349] Signal inference workers to stop experience collection... (196050 times) [2024-06-25 04:17:33,948][15401] InferenceWorker_p0-w0: stopping experience collection (196050 times) [2024-06-25 04:17:33,950][15349] Signal inference workers to resume experience collection... (196050 times) [2024-06-25 04:17:33,960][15401] InferenceWorker_p0-w0: resuming experience collection (196050 times) [2024-06-25 04:17:34,104][15401] Updated weights for policy 0, policy_version 808623 (0.0031) [2024-06-25 04:17:38,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13248626688. Throughput: 0: 42788.7. Samples: 13248709200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 04:17:38,462][15401] Updated weights for policy 0, policy_version 808633 (0.0039) [2024-06-25 04:17:42,186][15401] Updated weights for policy 0, policy_version 808643 (0.0026) [2024-06-25 04:17:43,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 13248856064. Throughput: 0: 42838.0. Samples: 13248967620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:43,401][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 04:17:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808646_13248856064.pth... [2024-06-25 04:17:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808019_13238583296.pth [2024-06-25 04:17:45,872][15401] Updated weights for policy 0, policy_version 808653 (0.0032) [2024-06-25 04:17:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13249052672. Throughput: 0: 42804.3. Samples: 13249225700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 04:17:49,756][15401] Updated weights for policy 0, policy_version 808663 (0.0029) [2024-06-25 04:17:53,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13249282048. Throughput: 0: 42831.0. Samples: 13249353780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 04:17:53,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 04:17:53,551][15401] Updated weights for policy 0, policy_version 808673 (0.0035) [2024-06-25 04:17:57,181][15401] Updated weights for policy 0, policy_version 808683 (0.0037) [2024-06-25 04:17:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13249478656. Throughput: 0: 42935.7. Samples: 13249607320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:17:58,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 04:18:01,700][15401] Updated weights for policy 0, policy_version 808693 (0.0031) [2024-06-25 04:18:03,393][15132] Fps is (10 sec: 42582.5, 60 sec: 42868.7, 300 sec: 42876.5). Total num frames: 13249708032. Throughput: 0: 42775.4. Samples: 13249866080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:03,394][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 04:18:05,046][15401] Updated weights for policy 0, policy_version 808703 (0.0023) [2024-06-25 04:18:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13249904640. Throughput: 0: 42851.9. Samples: 13249994700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 04:18:09,070][15401] Updated weights for policy 0, policy_version 808713 (0.0041) [2024-06-25 04:18:12,733][15401] Updated weights for policy 0, policy_version 808723 (0.0030) [2024-06-25 04:18:13,389][15132] Fps is (10 sec: 42615.1, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13250134016. Throughput: 0: 42881.8. Samples: 13250251140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 04:18:16,663][15401] Updated weights for policy 0, policy_version 808733 (0.0028) [2024-06-25 04:18:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13250330624. Throughput: 0: 42872.1. Samples: 13250512220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:18,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 04:18:20,478][15401] Updated weights for policy 0, policy_version 808743 (0.0037) [2024-06-25 04:18:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13250560000. Throughput: 0: 42856.1. Samples: 13250637720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 04:18:24,095][15401] Updated weights for policy 0, policy_version 808753 (0.0035) [2024-06-25 04:18:27,940][15401] Updated weights for policy 0, policy_version 808763 (0.0038) [2024-06-25 04:18:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13250789376. Throughput: 0: 42937.1. Samples: 13250899680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 04:18:31,528][15401] Updated weights for policy 0, policy_version 808773 (0.0034) [2024-06-25 04:18:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 13250985984. Throughput: 0: 42985.0. Samples: 13251160020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 04:18:35,699][15401] Updated weights for policy 0, policy_version 808783 (0.0033) [2024-06-25 04:18:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13251215360. Throughput: 0: 43057.4. Samples: 13251291360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:18:39,352][15401] Updated weights for policy 0, policy_version 808793 (0.0033) [2024-06-25 04:18:43,391][15132] Fps is (10 sec: 42593.4, 60 sec: 42599.3, 300 sec: 42820.4). Total num frames: 13251411968. Throughput: 0: 43153.5. Samples: 13251549280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:43,391][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 04:18:43,426][15401] Updated weights for policy 0, policy_version 808803 (0.0030) [2024-06-25 04:18:46,844][15401] Updated weights for policy 0, policy_version 808813 (0.0033) [2024-06-25 04:18:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13251641344. Throughput: 0: 42949.5. Samples: 13251798640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:48,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 04:18:51,181][15401] Updated weights for policy 0, policy_version 808823 (0.0035) [2024-06-25 04:18:53,390][15132] Fps is (10 sec: 44241.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13251854336. Throughput: 0: 43056.8. Samples: 13251932260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:18:54,832][15401] Updated weights for policy 0, policy_version 808833 (0.0045) [2024-06-25 04:18:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13252050944. Throughput: 0: 42917.3. Samples: 13252182420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:18:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 04:18:58,877][15401] Updated weights for policy 0, policy_version 808843 (0.0034) [2024-06-25 04:19:02,704][15401] Updated weights for policy 0, policy_version 808853 (0.0042) [2024-06-25 04:19:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42874.2, 300 sec: 42931.6). Total num frames: 13252280320. Throughput: 0: 42813.4. Samples: 13252438820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:03,398][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 04:19:06,491][15401] Updated weights for policy 0, policy_version 808863 (0.0033) [2024-06-25 04:19:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13252493312. Throughput: 0: 42967.6. Samples: 13252571260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 04:19:10,413][15401] Updated weights for policy 0, policy_version 808873 (0.0041) [2024-06-25 04:19:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13252689920. Throughput: 0: 42805.2. Samples: 13252825920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 04:19:14,106][15401] Updated weights for policy 0, policy_version 808883 (0.0031) [2024-06-25 04:19:17,973][15349] Signal inference workers to stop experience collection... (196100 times) [2024-06-25 04:19:17,973][15349] Signal inference workers to resume experience collection... (196100 times) [2024-06-25 04:19:18,019][15401] InferenceWorker_p0-w0: stopping experience collection (196100 times) [2024-06-25 04:19:18,019][15401] InferenceWorker_p0-w0: resuming experience collection (196100 times) [2024-06-25 04:19:18,107][15401] Updated weights for policy 0, policy_version 808893 (0.0028) [2024-06-25 04:19:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13252919296. Throughput: 0: 42785.3. Samples: 13253085360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:18,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:19:21,621][15401] Updated weights for policy 0, policy_version 808903 (0.0037) [2024-06-25 04:19:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13253148672. Throughput: 0: 42716.9. Samples: 13253213620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:19:25,526][15401] Updated weights for policy 0, policy_version 808913 (0.0043) [2024-06-25 04:19:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 13253328896. Throughput: 0: 42655.2. Samples: 13253468720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:28,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 04:19:29,175][15401] Updated weights for policy 0, policy_version 808923 (0.0024) [2024-06-25 04:19:32,940][15401] Updated weights for policy 0, policy_version 808933 (0.0028) [2024-06-25 04:19:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13253574656. Throughput: 0: 42919.0. Samples: 13253730000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:19:36,752][15401] Updated weights for policy 0, policy_version 808943 (0.0032) [2024-06-25 04:19:38,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13253804032. Throughput: 0: 42959.1. Samples: 13253865420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:19:40,419][15401] Updated weights for policy 0, policy_version 808953 (0.0034) [2024-06-25 04:19:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42872.2, 300 sec: 42876.3). Total num frames: 13253984256. Throughput: 0: 43088.4. Samples: 13254121400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 04:19:43,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 04:19:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808960_13254000640.pth... [2024-06-25 04:19:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808332_13243711488.pth [2024-06-25 04:19:44,533][15401] Updated weights for policy 0, policy_version 808963 (0.0036) [2024-06-25 04:19:48,311][15401] Updated weights for policy 0, policy_version 808973 (0.0030) [2024-06-25 04:19:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13254213632. Throughput: 0: 43086.7. Samples: 13254377720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:19:48,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 04:19:52,038][15401] Updated weights for policy 0, policy_version 808983 (0.0038) [2024-06-25 04:19:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13254443008. Throughput: 0: 42962.6. Samples: 13254504580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:19:53,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 04:19:55,827][15401] Updated weights for policy 0, policy_version 808993 (0.0044) [2024-06-25 04:19:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13254639616. Throughput: 0: 43050.2. Samples: 13254763180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:19:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 04:19:59,751][15401] Updated weights for policy 0, policy_version 809003 (0.0030) [2024-06-25 04:20:03,363][15401] Updated weights for policy 0, policy_version 809013 (0.0039) [2024-06-25 04:20:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 13254868992. Throughput: 0: 42999.0. Samples: 13255020320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 04:20:07,246][15401] Updated weights for policy 0, policy_version 809023 (0.0033) [2024-06-25 04:20:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13255065600. Throughput: 0: 42963.1. Samples: 13255146960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 04:20:10,963][15401] Updated weights for policy 0, policy_version 809033 (0.0035) [2024-06-25 04:20:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13255278592. Throughput: 0: 43024.5. Samples: 13255404820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 04:20:14,942][15401] Updated weights for policy 0, policy_version 809043 (0.0041) [2024-06-25 04:20:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13255507968. Throughput: 0: 42829.5. Samples: 13255657320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 04:20:18,473][15401] Updated weights for policy 0, policy_version 809053 (0.0031) [2024-06-25 04:20:22,334][15401] Updated weights for policy 0, policy_version 809063 (0.0033) [2024-06-25 04:20:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13255720960. Throughput: 0: 42837.3. Samples: 13255793100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:23,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 04:20:25,949][15401] Updated weights for policy 0, policy_version 809073 (0.0026) [2024-06-25 04:20:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13255901184. Throughput: 0: 42915.5. Samples: 13256052600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 04:20:30,125][15401] Updated weights for policy 0, policy_version 809083 (0.0035) [2024-06-25 04:20:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13256146944. Throughput: 0: 42827.1. Samples: 13256304940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 04:20:33,626][15401] Updated weights for policy 0, policy_version 809093 (0.0042) [2024-06-25 04:20:37,678][15401] Updated weights for policy 0, policy_version 809103 (0.0040) [2024-06-25 04:20:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13256376320. Throughput: 0: 42993.8. Samples: 13256439300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 04:20:41,386][15401] Updated weights for policy 0, policy_version 809113 (0.0035) [2024-06-25 04:20:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13256540160. Throughput: 0: 42928.9. Samples: 13256694980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:43,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 04:20:45,163][15401] Updated weights for policy 0, policy_version 809123 (0.0034) [2024-06-25 04:20:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13256785920. Throughput: 0: 42911.6. Samples: 13256951340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 04:20:49,278][15401] Updated weights for policy 0, policy_version 809133 (0.0032) [2024-06-25 04:20:52,916][15349] Signal inference workers to stop experience collection... (196150 times) [2024-06-25 04:20:52,974][15401] InferenceWorker_p0-w0: stopping experience collection (196150 times) [2024-06-25 04:20:53,035][15349] Signal inference workers to resume experience collection... (196150 times) [2024-06-25 04:20:53,035][15401] InferenceWorker_p0-w0: resuming experience collection (196150 times) [2024-06-25 04:20:53,207][15401] Updated weights for policy 0, policy_version 809143 (0.0038) [2024-06-25 04:20:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 13257015296. Throughput: 0: 43084.1. Samples: 13257085740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 04:20:57,082][15401] Updated weights for policy 0, policy_version 809153 (0.0032) [2024-06-25 04:20:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13257211904. Throughput: 0: 42987.1. Samples: 13257339340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:20:58,392][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 04:21:00,843][15401] Updated weights for policy 0, policy_version 809163 (0.0025) [2024-06-25 04:21:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13257424896. Throughput: 0: 43084.0. Samples: 13257596100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 04:21:04,633][15401] Updated weights for policy 0, policy_version 809173 (0.0035) [2024-06-25 04:21:08,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13257637888. Throughput: 0: 42942.7. Samples: 13257725520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 04:21:08,453][15401] Updated weights for policy 0, policy_version 809183 (0.0040) [2024-06-25 04:21:12,161][15401] Updated weights for policy 0, policy_version 809193 (0.0036) [2024-06-25 04:21:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13257834496. Throughput: 0: 42677.5. Samples: 13257973080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 04:21:16,081][15401] Updated weights for policy 0, policy_version 809203 (0.0033) [2024-06-25 04:21:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13258080256. Throughput: 0: 42689.2. Samples: 13258225960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 04:21:20,211][15401] Updated weights for policy 0, policy_version 809213 (0.0040) [2024-06-25 04:21:23,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13258276864. Throughput: 0: 42695.2. Samples: 13258360580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 04:21:23,880][15401] Updated weights for policy 0, policy_version 809223 (0.0042) [2024-06-25 04:21:27,844][15401] Updated weights for policy 0, policy_version 809233 (0.0034) [2024-06-25 04:21:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13258473472. Throughput: 0: 42561.7. Samples: 13258610260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 04:21:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 04:21:31,441][15401] Updated weights for policy 0, policy_version 809243 (0.0051) [2024-06-25 04:21:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13258719232. Throughput: 0: 42573.4. Samples: 13258867140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 04:21:35,660][15401] Updated weights for policy 0, policy_version 809253 (0.0024) [2024-06-25 04:21:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13258915840. Throughput: 0: 42537.2. Samples: 13258999920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:38,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 04:21:39,039][15401] Updated weights for policy 0, policy_version 809263 (0.0046) [2024-06-25 04:21:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13259112448. Throughput: 0: 42450.7. Samples: 13259249520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 04:21:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809273_13259128832.pth... [2024-06-25 04:21:43,479][15401] Updated weights for policy 0, policy_version 809273 (0.0034) [2024-06-25 04:21:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808646_13248856064.pth [2024-06-25 04:21:46,921][15401] Updated weights for policy 0, policy_version 809283 (0.0033) [2024-06-25 04:21:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13259341824. Throughput: 0: 42317.7. Samples: 13259500400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:48,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 04:21:51,236][15401] Updated weights for policy 0, policy_version 809293 (0.0032) [2024-06-25 04:21:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13259538432. Throughput: 0: 42436.9. Samples: 13259635180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 04:21:54,727][15401] Updated weights for policy 0, policy_version 809303 (0.0038) [2024-06-25 04:21:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 13259767808. Throughput: 0: 42346.4. Samples: 13259878680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:21:58,391][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 04:21:59,545][15401] Updated weights for policy 0, policy_version 809313 (0.0031) [2024-06-25 04:22:02,504][15401] Updated weights for policy 0, policy_version 809323 (0.0033) [2024-06-25 04:22:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 13259997184. Throughput: 0: 42364.8. Samples: 13260132380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 04:22:07,047][15401] Updated weights for policy 0, policy_version 809333 (0.0038) [2024-06-25 04:22:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 13260144640. Throughput: 0: 42180.0. Samples: 13260258680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:08,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:22:10,120][15401] Updated weights for policy 0, policy_version 809343 (0.0036) [2024-06-25 04:22:13,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13260390400. Throughput: 0: 42166.8. Samples: 13260507760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:13,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 04:22:14,550][15401] Updated weights for policy 0, policy_version 809353 (0.0037) [2024-06-25 04:22:17,919][15401] Updated weights for policy 0, policy_version 809363 (0.0042) [2024-06-25 04:22:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13260603392. Throughput: 0: 42363.6. Samples: 13260773500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 04:22:22,021][15401] Updated weights for policy 0, policy_version 809373 (0.0040) [2024-06-25 04:22:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13260800000. Throughput: 0: 42260.4. Samples: 13260901640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 04:22:25,491][15401] Updated weights for policy 0, policy_version 809383 (0.0039) [2024-06-25 04:22:26,540][15349] Signal inference workers to stop experience collection... (196200 times) [2024-06-25 04:22:26,576][15401] InferenceWorker_p0-w0: stopping experience collection (196200 times) [2024-06-25 04:22:26,601][15349] Signal inference workers to resume experience collection... (196200 times) [2024-06-25 04:22:26,602][15401] InferenceWorker_p0-w0: resuming experience collection (196200 times) [2024-06-25 04:22:28,390][15132] Fps is (10 sec: 44233.7, 60 sec: 42871.1, 300 sec: 42876.0). Total num frames: 13261045760. Throughput: 0: 42148.8. Samples: 13261146240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:28,391][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 04:22:29,585][15401] Updated weights for policy 0, policy_version 809393 (0.0042) [2024-06-25 04:22:33,381][15401] Updated weights for policy 0, policy_version 809403 (0.0038) [2024-06-25 04:22:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13261258752. Throughput: 0: 42481.9. Samples: 13261412080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 04:22:37,447][15401] Updated weights for policy 0, policy_version 809413 (0.0034) [2024-06-25 04:22:38,390][15132] Fps is (10 sec: 37684.9, 60 sec: 41779.1, 300 sec: 42598.7). Total num frames: 13261422592. Throughput: 0: 42241.1. Samples: 13261536040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 04:22:41,019][15401] Updated weights for policy 0, policy_version 809423 (0.0030) [2024-06-25 04:22:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13261701120. Throughput: 0: 42314.3. Samples: 13261782820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 04:22:45,937][15401] Updated weights for policy 0, policy_version 809433 (0.0026) [2024-06-25 04:22:48,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13261881344. Throughput: 0: 42479.3. Samples: 13262043940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:48,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 04:22:48,832][15401] Updated weights for policy 0, policy_version 809443 (0.0031) [2024-06-25 04:22:53,389][15132] Fps is (10 sec: 34406.6, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 13262045184. Throughput: 0: 42296.4. Samples: 13262162020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 04:22:53,558][15401] Updated weights for policy 0, policy_version 809453 (0.0026) [2024-06-25 04:22:56,539][15401] Updated weights for policy 0, policy_version 809463 (0.0037) [2024-06-25 04:22:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42765.6). Total num frames: 13262323712. Throughput: 0: 42562.7. Samples: 13262423080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:22:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 04:23:01,147][15401] Updated weights for policy 0, policy_version 809473 (0.0035) [2024-06-25 04:23:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 13262503936. Throughput: 0: 42349.2. Samples: 13262679220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:23:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 04:23:04,362][15401] Updated weights for policy 0, policy_version 809483 (0.0038) [2024-06-25 04:23:08,392][15132] Fps is (10 sec: 37673.8, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 13262700544. Throughput: 0: 42056.5. Samples: 13262794280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:23:08,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 04:23:08,801][15401] Updated weights for policy 0, policy_version 809493 (0.0043) [2024-06-25 04:23:12,053][15401] Updated weights for policy 0, policy_version 809503 (0.0030) [2024-06-25 04:23:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13262962688. Throughput: 0: 42493.0. Samples: 13263058400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:23:13,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 04:23:16,930][15401] Updated weights for policy 0, policy_version 809513 (0.0040) [2024-06-25 04:23:18,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13263126528. Throughput: 0: 42385.8. Samples: 13263319440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:23:18,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 04:23:19,602][15401] Updated weights for policy 0, policy_version 809523 (0.0042) [2024-06-25 04:23:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13263355904. Throughput: 0: 42196.2. Samples: 13263434860. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:23,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 04:23:24,387][15401] Updated weights for policy 0, policy_version 809533 (0.0035) [2024-06-25 04:23:25,067][15349] Signal inference workers to stop experience collection... (196250 times) [2024-06-25 04:23:25,117][15401] InferenceWorker_p0-w0: stopping experience collection (196250 times) [2024-06-25 04:23:25,125][15349] Signal inference workers to resume experience collection... (196250 times) [2024-06-25 04:23:25,130][15401] InferenceWorker_p0-w0: resuming experience collection (196250 times) [2024-06-25 04:23:27,225][15401] Updated weights for policy 0, policy_version 809543 (0.0046) [2024-06-25 04:23:28,389][15132] Fps is (10 sec: 49151.7, 60 sec: 42872.0, 300 sec: 42820.6). Total num frames: 13263618048. Throughput: 0: 42665.4. Samples: 13263702760. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:28,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 04:23:31,973][15401] Updated weights for policy 0, policy_version 809553 (0.0039) [2024-06-25 04:23:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 13263765504. Throughput: 0: 42746.2. Samples: 13263967520. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 04:23:34,924][15401] Updated weights for policy 0, policy_version 809563 (0.0031) [2024-06-25 04:23:38,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.6, 300 sec: 42654.1). Total num frames: 13263994880. Throughput: 0: 42688.9. Samples: 13264083020. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:38,390][15132] Avg episode reward: [(0, '0.907')] [2024-06-25 04:23:39,375][15401] Updated weights for policy 0, policy_version 809573 (0.0036) [2024-06-25 04:23:42,416][15401] Updated weights for policy 0, policy_version 809583 (0.0034) [2024-06-25 04:23:43,390][15132] Fps is (10 sec: 49151.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13264257024. Throughput: 0: 42709.1. Samples: 13264345000. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 04:23:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809586_13264257024.pth... [2024-06-25 04:23:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000808960_13254000640.pth [2024-06-25 04:23:47,296][15401] Updated weights for policy 0, policy_version 809593 (0.0038) [2024-06-25 04:23:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 13264404480. Throughput: 0: 42836.1. Samples: 13264606840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 04:23:50,042][15401] Updated weights for policy 0, policy_version 809603 (0.0044) [2024-06-25 04:23:53,389][15132] Fps is (10 sec: 37683.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13264633856. Throughput: 0: 42766.3. Samples: 13264718660. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 04:23:54,727][15401] Updated weights for policy 0, policy_version 809613 (0.0022) [2024-06-25 04:23:57,837][15401] Updated weights for policy 0, policy_version 809623 (0.0041) [2024-06-25 04:23:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13264879616. Throughput: 0: 42716.9. Samples: 13264980660. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:23:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 04:24:02,288][15401] Updated weights for policy 0, policy_version 809633 (0.0049) [2024-06-25 04:24:03,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 13265043456. Throughput: 0: 42864.7. Samples: 13265248460. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:03,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 04:24:05,356][15401] Updated weights for policy 0, policy_version 809643 (0.0034) [2024-06-25 04:24:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 13265289216. Throughput: 0: 42843.0. Samples: 13265362800. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 04:24:10,159][15401] Updated weights for policy 0, policy_version 809653 (0.0033) [2024-06-25 04:24:13,049][15401] Updated weights for policy 0, policy_version 809663 (0.0035) [2024-06-25 04:24:13,389][15132] Fps is (10 sec: 47525.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13265518592. Throughput: 0: 42709.3. Samples: 13265624680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 04:24:18,077][15401] Updated weights for policy 0, policy_version 809673 (0.0036) [2024-06-25 04:24:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 13265682432. Throughput: 0: 42595.0. Samples: 13265884300. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 04:24:20,773][15401] Updated weights for policy 0, policy_version 809683 (0.0027) [2024-06-25 04:24:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13265928192. Throughput: 0: 42595.1. Samples: 13265999800. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 04:24:25,577][15401] Updated weights for policy 0, policy_version 809693 (0.0041) [2024-06-25 04:24:28,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13266157568. Throughput: 0: 42642.7. Samples: 13266263920. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 04:24:28,411][15401] Updated weights for policy 0, policy_version 809703 (0.0023) [2024-06-25 04:24:32,787][15349] Signal inference workers to stop experience collection... (196300 times) [2024-06-25 04:24:32,843][15349] Signal inference workers to resume experience collection... (196300 times) [2024-06-25 04:24:32,843][15401] InferenceWorker_p0-w0: stopping experience collection (196300 times) [2024-06-25 04:24:32,857][15401] InferenceWorker_p0-w0: resuming experience collection (196300 times) [2024-06-25 04:24:32,996][15401] Updated weights for policy 0, policy_version 809713 (0.0035) [2024-06-25 04:24:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 13266337792. Throughput: 0: 42578.7. Samples: 13266522880. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 04:24:36,138][15401] Updated weights for policy 0, policy_version 809723 (0.0043) [2024-06-25 04:24:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13266567168. Throughput: 0: 42808.8. Samples: 13266645060. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:38,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 04:24:41,023][15401] Updated weights for policy 0, policy_version 809733 (0.0038) [2024-06-25 04:24:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13266796544. Throughput: 0: 42868.0. Samples: 13266909720. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 04:24:43,775][15401] Updated weights for policy 0, policy_version 809743 (0.0028) [2024-06-25 04:24:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 13266960384. Throughput: 0: 42793.8. Samples: 13267174080. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 04:24:48,626][15401] Updated weights for policy 0, policy_version 809753 (0.0035) [2024-06-25 04:24:51,326][15401] Updated weights for policy 0, policy_version 809763 (0.0044) [2024-06-25 04:24:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13267238912. Throughput: 0: 42887.6. Samples: 13267292740. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 04:24:56,365][15401] Updated weights for policy 0, policy_version 809773 (0.0044) [2024-06-25 04:24:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13267435520. Throughput: 0: 42923.0. Samples: 13267556220. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:24:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 04:24:59,125][15401] Updated weights for policy 0, policy_version 809783 (0.0029) [2024-06-25 04:25:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 13267615744. Throughput: 0: 42985.9. Samples: 13267818660. Policy #0 lag: (min: 2.0, avg: 10.9, max: 25.0) [2024-06-25 04:25:03,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 04:25:04,045][15401] Updated weights for policy 0, policy_version 809793 (0.0046) [2024-06-25 04:25:06,662][15401] Updated weights for policy 0, policy_version 809803 (0.0040) [2024-06-25 04:25:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 13267894272. Throughput: 0: 43053.0. Samples: 13267937180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:25:11,510][15401] Updated weights for policy 0, policy_version 809813 (0.0033) [2024-06-25 04:25:13,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13268090880. Throughput: 0: 43032.1. Samples: 13268200360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 04:25:14,193][15401] Updated weights for policy 0, policy_version 809823 (0.0032) [2024-06-25 04:25:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 13268271104. Throughput: 0: 43098.1. Samples: 13268462300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 04:25:19,019][15401] Updated weights for policy 0, policy_version 809833 (0.0039) [2024-06-25 04:25:21,811][15401] Updated weights for policy 0, policy_version 809843 (0.0040) [2024-06-25 04:25:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13268533248. Throughput: 0: 43074.5. Samples: 13268583420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 04:25:26,711][15401] Updated weights for policy 0, policy_version 809853 (0.0026) [2024-06-25 04:25:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13268713472. Throughput: 0: 43029.4. Samples: 13268846040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 04:25:29,445][15401] Updated weights for policy 0, policy_version 809863 (0.0028) [2024-06-25 04:25:33,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 13268910080. Throughput: 0: 42926.7. Samples: 13269105780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 04:25:34,107][15401] Updated weights for policy 0, policy_version 809873 (0.0040) [2024-06-25 04:25:36,633][15349] Signal inference workers to stop experience collection... (196350 times) [2024-06-25 04:25:36,634][15349] Signal inference workers to resume experience collection... (196350 times) [2024-06-25 04:25:36,649][15401] InferenceWorker_p0-w0: stopping experience collection (196350 times) [2024-06-25 04:25:36,649][15401] InferenceWorker_p0-w0: resuming experience collection (196350 times) [2024-06-25 04:25:37,181][15401] Updated weights for policy 0, policy_version 809883 (0.0041) [2024-06-25 04:25:38,396][15132] Fps is (10 sec: 44208.1, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 13269155840. Throughput: 0: 43045.5. Samples: 13269230060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:38,397][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 04:25:41,611][15401] Updated weights for policy 0, policy_version 809893 (0.0045) [2024-06-25 04:25:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13269352448. Throughput: 0: 42924.1. Samples: 13269487800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:25:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809898_13269368832.pth... [2024-06-25 04:25:43,547][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809273_13259128832.pth [2024-06-25 04:25:44,707][15401] Updated weights for policy 0, policy_version 809903 (0.0030) [2024-06-25 04:25:48,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 13269565440. Throughput: 0: 42958.6. Samples: 13269751800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 04:25:49,146][15401] Updated weights for policy 0, policy_version 809913 (0.0029) [2024-06-25 04:25:52,455][15401] Updated weights for policy 0, policy_version 809923 (0.0029) [2024-06-25 04:25:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 13269827584. Throughput: 0: 43103.5. Samples: 13269876840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 04:25:56,745][15401] Updated weights for policy 0, policy_version 809933 (0.0037) [2024-06-25 04:25:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13269975040. Throughput: 0: 43045.8. Samples: 13270137420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:25:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 04:25:59,971][15401] Updated weights for policy 0, policy_version 809943 (0.0037) [2024-06-25 04:26:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 43415.8, 300 sec: 42653.6). Total num frames: 13270220800. Throughput: 0: 42917.7. Samples: 13270393700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:03,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 04:26:05,018][15401] Updated weights for policy 0, policy_version 809953 (0.0031) [2024-06-25 04:26:07,520][15401] Updated weights for policy 0, policy_version 809963 (0.0032) [2024-06-25 04:26:08,389][15132] Fps is (10 sec: 49152.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13270466560. Throughput: 0: 43138.0. Samples: 13270524620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 04:26:12,448][15401] Updated weights for policy 0, policy_version 809973 (0.0032) [2024-06-25 04:26:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13270646784. Throughput: 0: 43203.6. Samples: 13270790200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:13,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 04:26:15,086][15401] Updated weights for policy 0, policy_version 809983 (0.0042) [2024-06-25 04:26:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 13270876160. Throughput: 0: 43044.9. Samples: 13271042800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 04:26:19,857][15401] Updated weights for policy 0, policy_version 809993 (0.0043) [2024-06-25 04:26:22,933][15401] Updated weights for policy 0, policy_version 810003 (0.0042) [2024-06-25 04:26:23,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43143.0, 300 sec: 42875.8). Total num frames: 13271121920. Throughput: 0: 43125.7. Samples: 13271170540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:23,392][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 04:26:27,412][15401] Updated weights for policy 0, policy_version 810013 (0.0031) [2024-06-25 04:26:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 13271302144. Throughput: 0: 43269.2. Samples: 13271434920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:28,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 04:26:30,444][15401] Updated weights for policy 0, policy_version 810023 (0.0029) [2024-06-25 04:26:33,390][15132] Fps is (10 sec: 39330.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 13271515136. Throughput: 0: 43144.0. Samples: 13271693280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:33,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-25 04:26:34,844][15401] Updated weights for policy 0, policy_version 810033 (0.0037) [2024-06-25 04:26:38,038][15401] Updated weights for policy 0, policy_version 810043 (0.0033) [2024-06-25 04:26:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43422.2, 300 sec: 42876.1). Total num frames: 13271760896. Throughput: 0: 43191.1. Samples: 13271820440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:38,398][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 04:26:42,618][15401] Updated weights for policy 0, policy_version 810053 (0.0028) [2024-06-25 04:26:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13271957504. Throughput: 0: 43261.4. Samples: 13272084180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 04:26:45,715][15401] Updated weights for policy 0, policy_version 810063 (0.0030) [2024-06-25 04:26:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13272170496. Throughput: 0: 43160.5. Samples: 13272335820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 04:26:50,123][15401] Updated weights for policy 0, policy_version 810073 (0.0039) [2024-06-25 04:26:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13272383488. Throughput: 0: 43137.7. Samples: 13272465820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 04:26:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 04:26:53,580][15401] Updated weights for policy 0, policy_version 810083 (0.0035) [2024-06-25 04:26:57,558][15401] Updated weights for policy 0, policy_version 810093 (0.0028) [2024-06-25 04:26:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 13272596480. Throughput: 0: 42991.0. Samples: 13272724800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:26:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 04:27:00,672][15349] Signal inference workers to stop experience collection... (196400 times) [2024-06-25 04:27:00,673][15349] Signal inference workers to resume experience collection... (196400 times) [2024-06-25 04:27:00,722][15401] InferenceWorker_p0-w0: stopping experience collection (196400 times) [2024-06-25 04:27:00,722][15401] InferenceWorker_p0-w0: resuming experience collection (196400 times) [2024-06-25 04:27:01,163][15401] Updated weights for policy 0, policy_version 810103 (0.0040) [2024-06-25 04:27:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 13272809472. Throughput: 0: 42998.7. Samples: 13272977740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 04:27:04,982][15401] Updated weights for policy 0, policy_version 810113 (0.0029) [2024-06-25 04:27:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 13273038848. Throughput: 0: 43055.4. Samples: 13273107940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 04:27:08,719][15401] Updated weights for policy 0, policy_version 810123 (0.0037) [2024-06-25 04:27:12,519][15401] Updated weights for policy 0, policy_version 810133 (0.0037) [2024-06-25 04:27:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 13273235456. Throughput: 0: 42932.8. Samples: 13273367000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:13,392][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 04:27:16,138][15401] Updated weights for policy 0, policy_version 810143 (0.0043) [2024-06-25 04:27:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13273448448. Throughput: 0: 42881.4. Samples: 13273622940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:18,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 04:27:20,339][15401] Updated weights for policy 0, policy_version 810153 (0.0040) [2024-06-25 04:27:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 13273677824. Throughput: 0: 42836.0. Samples: 13273748060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 04:27:23,671][15401] Updated weights for policy 0, policy_version 810163 (0.0031) [2024-06-25 04:27:27,909][15401] Updated weights for policy 0, policy_version 810173 (0.0027) [2024-06-25 04:27:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13273890816. Throughput: 0: 42895.1. Samples: 13274014460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 04:27:31,486][15401] Updated weights for policy 0, policy_version 810183 (0.0036) [2024-06-25 04:27:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13274103808. Throughput: 0: 42946.3. Samples: 13274268400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 04:27:35,545][15401] Updated weights for policy 0, policy_version 810193 (0.0035) [2024-06-25 04:27:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13274316800. Throughput: 0: 42852.9. Samples: 13274394200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 04:27:39,245][15401] Updated weights for policy 0, policy_version 810203 (0.0029) [2024-06-25 04:27:43,173][15401] Updated weights for policy 0, policy_version 810213 (0.0043) [2024-06-25 04:27:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13274529792. Throughput: 0: 42889.4. Samples: 13274654820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:43,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-25 04:27:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810214_13274546176.pth... [2024-06-25 04:27:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809586_13264257024.pth [2024-06-25 04:27:46,921][15401] Updated weights for policy 0, policy_version 810223 (0.0028) [2024-06-25 04:27:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13274742784. Throughput: 0: 42876.2. Samples: 13274907180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:48,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 04:27:50,996][15401] Updated weights for policy 0, policy_version 810233 (0.0025) [2024-06-25 04:27:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13274955776. Throughput: 0: 42745.0. Samples: 13275031460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 04:27:54,417][15401] Updated weights for policy 0, policy_version 810243 (0.0034) [2024-06-25 04:27:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13275152384. Throughput: 0: 42882.2. Samples: 13275296600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:27:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 04:27:58,608][15401] Updated weights for policy 0, policy_version 810253 (0.0040) [2024-06-25 04:28:02,186][15401] Updated weights for policy 0, policy_version 810263 (0.0040) [2024-06-25 04:28:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 13275381760. Throughput: 0: 42724.3. Samples: 13275545540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:03,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 04:28:06,162][15401] Updated weights for policy 0, policy_version 810273 (0.0037) [2024-06-25 04:28:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13275594752. Throughput: 0: 42810.2. Samples: 13275674520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 04:28:09,755][15401] Updated weights for policy 0, policy_version 810283 (0.0026) [2024-06-25 04:28:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42987.1). Total num frames: 13275807744. Throughput: 0: 42677.2. Samples: 13275934940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:13,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 04:28:13,810][15401] Updated weights for policy 0, policy_version 810293 (0.0046) [2024-06-25 04:28:17,560][15401] Updated weights for policy 0, policy_version 810303 (0.0036) [2024-06-25 04:28:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13276037120. Throughput: 0: 42699.5. Samples: 13276189880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:28:21,404][15401] Updated weights for policy 0, policy_version 810313 (0.0027) [2024-06-25 04:28:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13276250112. Throughput: 0: 42769.7. Samples: 13276318840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 04:28:24,065][15349] Signal inference workers to stop experience collection... (196450 times) [2024-06-25 04:28:24,065][15349] Signal inference workers to resume experience collection... (196450 times) [2024-06-25 04:28:24,085][15401] InferenceWorker_p0-w0: stopping experience collection (196450 times) [2024-06-25 04:28:24,086][15401] InferenceWorker_p0-w0: resuming experience collection (196450 times) [2024-06-25 04:28:25,365][15401] Updated weights for policy 0, policy_version 810323 (0.0038) [2024-06-25 04:28:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 13276463104. Throughput: 0: 42741.8. Samples: 13276578200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:28,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 04:28:29,044][15401] Updated weights for policy 0, policy_version 810333 (0.0026) [2024-06-25 04:28:33,233][15401] Updated weights for policy 0, policy_version 810343 (0.0036) [2024-06-25 04:28:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 13276659712. Throughput: 0: 42907.4. Samples: 13276838000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:33,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 04:28:36,594][15401] Updated weights for policy 0, policy_version 810353 (0.0035) [2024-06-25 04:28:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13276905472. Throughput: 0: 42905.0. Samples: 13276962180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 04:28:38,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 04:28:40,774][15401] Updated weights for policy 0, policy_version 810363 (0.0040) [2024-06-25 04:28:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13277102080. Throughput: 0: 42768.9. Samples: 13277221200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:28:43,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 04:28:44,291][15401] Updated weights for policy 0, policy_version 810373 (0.0030) [2024-06-25 04:28:48,319][15401] Updated weights for policy 0, policy_version 810383 (0.0026) [2024-06-25 04:28:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13277315072. Throughput: 0: 43018.2. Samples: 13277481360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:28:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 04:28:51,877][15401] Updated weights for policy 0, policy_version 810393 (0.0040) [2024-06-25 04:28:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13277544448. Throughput: 0: 42917.4. Samples: 13277605800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:28:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 04:28:55,819][15401] Updated weights for policy 0, policy_version 810403 (0.0027) [2024-06-25 04:28:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 43154.1). Total num frames: 13277773824. Throughput: 0: 43002.7. Samples: 13277870060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:28:58,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 04:28:59,872][15401] Updated weights for policy 0, policy_version 810413 (0.0050) [2024-06-25 04:29:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13277937664. Throughput: 0: 43119.5. Samples: 13278130260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:03,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 04:29:03,688][15401] Updated weights for policy 0, policy_version 810423 (0.0041) [2024-06-25 04:29:07,378][15401] Updated weights for policy 0, policy_version 810433 (0.0037) [2024-06-25 04:29:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13278183424. Throughput: 0: 42961.4. Samples: 13278252100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 04:29:11,471][15401] Updated weights for policy 0, policy_version 810443 (0.0033) [2024-06-25 04:29:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 43153.8). Total num frames: 13278412800. Throughput: 0: 42985.2. Samples: 13278512540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 04:29:14,988][15401] Updated weights for policy 0, policy_version 810453 (0.0029) [2024-06-25 04:29:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13278576640. Throughput: 0: 42895.5. Samples: 13278768300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 04:29:19,229][15401] Updated weights for policy 0, policy_version 810463 (0.0043) [2024-06-25 04:29:22,476][15401] Updated weights for policy 0, policy_version 810473 (0.0037) [2024-06-25 04:29:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13278822400. Throughput: 0: 42819.5. Samples: 13278889060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 04:29:26,802][15401] Updated weights for policy 0, policy_version 810483 (0.0037) [2024-06-25 04:29:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13279035392. Throughput: 0: 42964.0. Samples: 13279154580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 04:29:30,295][15401] Updated weights for policy 0, policy_version 810493 (0.0039) [2024-06-25 04:29:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13279232000. Throughput: 0: 42767.6. Samples: 13279405900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 04:29:34,331][15401] Updated weights for policy 0, policy_version 810503 (0.0032) [2024-06-25 04:29:37,761][15401] Updated weights for policy 0, policy_version 810513 (0.0032) [2024-06-25 04:29:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13279477760. Throughput: 0: 42817.3. Samples: 13279532580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 04:29:42,297][15401] Updated weights for policy 0, policy_version 810523 (0.0041) [2024-06-25 04:29:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 43098.2). Total num frames: 13279674368. Throughput: 0: 42911.0. Samples: 13279801060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 04:29:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810528_13279690752.pth... [2024-06-25 04:29:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000809898_13269368832.pth [2024-06-25 04:29:45,367][15401] Updated weights for policy 0, policy_version 810533 (0.0036) [2024-06-25 04:29:47,180][15349] Signal inference workers to stop experience collection... (196500 times) [2024-06-25 04:29:47,180][15349] Signal inference workers to resume experience collection... (196500 times) [2024-06-25 04:29:47,218][15401] InferenceWorker_p0-w0: stopping experience collection (196500 times) [2024-06-25 04:29:47,218][15401] InferenceWorker_p0-w0: resuming experience collection (196500 times) [2024-06-25 04:29:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13279887360. Throughput: 0: 42692.1. Samples: 13280051400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 04:29:50,251][15401] Updated weights for policy 0, policy_version 810543 (0.0034) [2024-06-25 04:29:52,847][15401] Updated weights for policy 0, policy_version 810553 (0.0042) [2024-06-25 04:29:53,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13280133120. Throughput: 0: 42759.1. Samples: 13280176260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:53,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 04:29:57,862][15401] Updated weights for policy 0, policy_version 810563 (0.0032) [2024-06-25 04:29:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42987.2). Total num frames: 13280296960. Throughput: 0: 42869.3. Samples: 13280441660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:29:58,391][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 04:30:00,567][15401] Updated weights for policy 0, policy_version 810573 (0.0029) [2024-06-25 04:30:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13280526336. Throughput: 0: 42639.4. Samples: 13280687080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 04:30:05,438][15401] Updated weights for policy 0, policy_version 810583 (0.0028) [2024-06-25 04:30:08,287][15401] Updated weights for policy 0, policy_version 810593 (0.0048) [2024-06-25 04:30:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13280755712. Throughput: 0: 42817.4. Samples: 13280815840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 04:30:13,058][15401] Updated weights for policy 0, policy_version 810603 (0.0030) [2024-06-25 04:30:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 13280952320. Throughput: 0: 42761.4. Samples: 13281078840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 04:30:15,803][15401] Updated weights for policy 0, policy_version 810613 (0.0026) [2024-06-25 04:30:18,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 13281181696. Throughput: 0: 42672.4. Samples: 13281326260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:18,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 04:30:20,700][15401] Updated weights for policy 0, policy_version 810623 (0.0031) [2024-06-25 04:30:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13281394688. Throughput: 0: 42890.7. Samples: 13281462660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 04:30:23,518][15401] Updated weights for policy 0, policy_version 810633 (0.0049) [2024-06-25 04:30:28,221][15401] Updated weights for policy 0, policy_version 810643 (0.0040) [2024-06-25 04:30:28,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 13281574912. Throughput: 0: 42645.9. Samples: 13281720120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 04:30:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 04:30:31,439][15401] Updated weights for policy 0, policy_version 810653 (0.0034) [2024-06-25 04:30:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42877.0). Total num frames: 13281804288. Throughput: 0: 42649.8. Samples: 13281970640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 04:30:35,813][15401] Updated weights for policy 0, policy_version 810663 (0.0038) [2024-06-25 04:30:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13282033664. Throughput: 0: 42768.1. Samples: 13282100820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:38,396][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 04:30:38,995][15401] Updated weights for policy 0, policy_version 810673 (0.0034) [2024-06-25 04:30:43,348][15401] Updated weights for policy 0, policy_version 810683 (0.0042) [2024-06-25 04:30:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 13282230272. Throughput: 0: 42547.1. Samples: 13282356380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:43,393][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 04:30:46,705][15401] Updated weights for policy 0, policy_version 810693 (0.0027) [2024-06-25 04:30:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13282459648. Throughput: 0: 42824.0. Samples: 13282614160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 04:30:51,283][15401] Updated weights for policy 0, policy_version 810703 (0.0032) [2024-06-25 04:30:52,193][15349] Signal inference workers to stop experience collection... (196550 times) [2024-06-25 04:30:52,194][15349] Signal inference workers to resume experience collection... (196550 times) [2024-06-25 04:30:52,244][15401] InferenceWorker_p0-w0: stopping experience collection (196550 times) [2024-06-25 04:30:52,244][15401] InferenceWorker_p0-w0: resuming experience collection (196550 times) [2024-06-25 04:30:53,392][15132] Fps is (10 sec: 45875.1, 60 sec: 42596.7, 300 sec: 43097.9). Total num frames: 13282689024. Throughput: 0: 43027.4. Samples: 13282752180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:53,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 04:30:54,372][15401] Updated weights for policy 0, policy_version 810713 (0.0037) [2024-06-25 04:30:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 13282852864. Throughput: 0: 42687.1. Samples: 13282999760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:30:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 04:30:58,886][15401] Updated weights for policy 0, policy_version 810723 (0.0034) [2024-06-25 04:31:01,905][15401] Updated weights for policy 0, policy_version 810733 (0.0040) [2024-06-25 04:31:03,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13283098624. Throughput: 0: 42885.9. Samples: 13283256020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:03,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 04:31:06,572][15401] Updated weights for policy 0, policy_version 810743 (0.0044) [2024-06-25 04:31:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13283311616. Throughput: 0: 42846.7. Samples: 13283390760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:08,390][15132] Avg episode reward: [(0, '0.213')] [2024-06-25 04:31:09,454][15401] Updated weights for policy 0, policy_version 810753 (0.0031) [2024-06-25 04:31:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13283491840. Throughput: 0: 42730.3. Samples: 13283642980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:13,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 04:31:14,286][15401] Updated weights for policy 0, policy_version 810763 (0.0031) [2024-06-25 04:31:17,538][15401] Updated weights for policy 0, policy_version 810773 (0.0039) [2024-06-25 04:31:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.2, 300 sec: 42765.4). Total num frames: 13283737600. Throughput: 0: 42796.4. Samples: 13283896480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 04:31:22,043][15401] Updated weights for policy 0, policy_version 810783 (0.0040) [2024-06-25 04:31:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13283950592. Throughput: 0: 42845.7. Samples: 13284028880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:23,394][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 04:31:25,146][15401] Updated weights for policy 0, policy_version 810793 (0.0034) [2024-06-25 04:31:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13284147200. Throughput: 0: 42855.1. Samples: 13284284760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:28,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 04:31:29,453][15401] Updated weights for policy 0, policy_version 810803 (0.0038) [2024-06-25 04:31:32,834][15401] Updated weights for policy 0, policy_version 810813 (0.0026) [2024-06-25 04:31:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13284392960. Throughput: 0: 42868.5. Samples: 13284543240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:33,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 04:31:36,806][15401] Updated weights for policy 0, policy_version 810823 (0.0038) [2024-06-25 04:31:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13284589568. Throughput: 0: 42704.0. Samples: 13284673760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 04:31:40,606][15401] Updated weights for policy 0, policy_version 810833 (0.0039) [2024-06-25 04:31:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 13284802560. Throughput: 0: 42834.2. Samples: 13284927300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 04:31:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810841_13284818944.pth... [2024-06-25 04:31:43,539][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810214_13274546176.pth [2024-06-25 04:31:44,506][15401] Updated weights for policy 0, policy_version 810843 (0.0029) [2024-06-25 04:31:48,041][15401] Updated weights for policy 0, policy_version 810853 (0.0025) [2024-06-25 04:31:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13285015552. Throughput: 0: 42948.9. Samples: 13285188720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 04:31:51,955][15401] Updated weights for policy 0, policy_version 810863 (0.0046) [2024-06-25 04:31:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42820.6). Total num frames: 13285228544. Throughput: 0: 42954.7. Samples: 13285323720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:53,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 04:31:55,846][15401] Updated weights for policy 0, policy_version 810873 (0.0023) [2024-06-25 04:31:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 13285441536. Throughput: 0: 42927.5. Samples: 13285574820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:31:58,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 04:31:59,709][15401] Updated weights for policy 0, policy_version 810883 (0.0034) [2024-06-25 04:32:00,416][15349] Signal inference workers to stop experience collection... (196600 times) [2024-06-25 04:32:00,417][15349] Signal inference workers to resume experience collection... (196600 times) [2024-06-25 04:32:00,462][15401] InferenceWorker_p0-w0: stopping experience collection (196600 times) [2024-06-25 04:32:00,462][15401] InferenceWorker_p0-w0: resuming experience collection (196600 times) [2024-06-25 04:32:03,202][15401] Updated weights for policy 0, policy_version 810893 (0.0035) [2024-06-25 04:32:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13285670912. Throughput: 0: 43014.9. Samples: 13285832160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:32:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 04:32:07,343][15401] Updated weights for policy 0, policy_version 810903 (0.0021) [2024-06-25 04:32:08,390][15132] Fps is (10 sec: 45885.8, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13285900288. Throughput: 0: 43071.1. Samples: 13285967080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:32:08,396][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 04:32:11,056][15401] Updated weights for policy 0, policy_version 810913 (0.0038) [2024-06-25 04:32:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13286080512. Throughput: 0: 42975.5. Samples: 13286218660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 26.0) [2024-06-25 04:32:13,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 04:32:14,944][15401] Updated weights for policy 0, policy_version 810923 (0.0041) [2024-06-25 04:32:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13286309888. Throughput: 0: 42947.5. Samples: 13286475880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 04:32:18,634][15401] Updated weights for policy 0, policy_version 810933 (0.0027) [2024-06-25 04:32:22,443][15401] Updated weights for policy 0, policy_version 810943 (0.0028) [2024-06-25 04:32:23,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13286539264. Throughput: 0: 42952.0. Samples: 13286606600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 04:32:26,342][15401] Updated weights for policy 0, policy_version 810953 (0.0032) [2024-06-25 04:32:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13286719488. Throughput: 0: 42961.4. Samples: 13286860560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 04:32:30,079][15401] Updated weights for policy 0, policy_version 810963 (0.0030) [2024-06-25 04:32:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13286948864. Throughput: 0: 42878.7. Samples: 13287118260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 04:32:33,870][15401] Updated weights for policy 0, policy_version 810973 (0.0025) [2024-06-25 04:32:37,625][15401] Updated weights for policy 0, policy_version 810983 (0.0040) [2024-06-25 04:32:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13287178240. Throughput: 0: 42933.6. Samples: 13287255740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 04:32:41,570][15401] Updated weights for policy 0, policy_version 810993 (0.0032) [2024-06-25 04:32:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13287358464. Throughput: 0: 42959.6. Samples: 13287507900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:32:45,222][15401] Updated weights for policy 0, policy_version 811003 (0.0035) [2024-06-25 04:32:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13287604224. Throughput: 0: 42963.7. Samples: 13287765520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 04:32:49,270][15401] Updated weights for policy 0, policy_version 811013 (0.0038) [2024-06-25 04:32:52,811][15401] Updated weights for policy 0, policy_version 811023 (0.0040) [2024-06-25 04:32:53,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13287833600. Throughput: 0: 42917.9. Samples: 13287898380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:53,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 04:32:56,931][15401] Updated weights for policy 0, policy_version 811033 (0.0024) [2024-06-25 04:32:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13287997440. Throughput: 0: 42819.6. Samples: 13288145540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:32:58,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 04:33:00,753][15401] Updated weights for policy 0, policy_version 811043 (0.0047) [2024-06-25 04:33:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13288243200. Throughput: 0: 42695.5. Samples: 13288397180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 04:33:04,647][15401] Updated weights for policy 0, policy_version 811053 (0.0045) [2024-06-25 04:33:08,244][15401] Updated weights for policy 0, policy_version 811063 (0.0033) [2024-06-25 04:33:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13288456192. Throughput: 0: 42904.8. Samples: 13288537320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 04:33:12,249][15401] Updated weights for policy 0, policy_version 811073 (0.0033) [2024-06-25 04:33:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13288652800. Throughput: 0: 42968.3. Samples: 13288794140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 04:33:15,604][15401] Updated weights for policy 0, policy_version 811083 (0.0030) [2024-06-25 04:33:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13288898560. Throughput: 0: 42801.3. Samples: 13289044320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:18,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 04:33:19,702][15401] Updated weights for policy 0, policy_version 811093 (0.0034) [2024-06-25 04:33:23,096][15349] Signal inference workers to stop experience collection... (196650 times) [2024-06-25 04:33:23,144][15349] Signal inference workers to resume experience collection... (196650 times) [2024-06-25 04:33:23,145][15401] InferenceWorker_p0-w0: stopping experience collection (196650 times) [2024-06-25 04:33:23,160][15401] InferenceWorker_p0-w0: resuming experience collection (196650 times) [2024-06-25 04:33:23,309][15401] Updated weights for policy 0, policy_version 811103 (0.0038) [2024-06-25 04:33:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13289111552. Throughput: 0: 42908.1. Samples: 13289186600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 04:33:27,229][15401] Updated weights for policy 0, policy_version 811113 (0.0025) [2024-06-25 04:33:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13289308160. Throughput: 0: 42968.0. Samples: 13289441460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 04:33:30,924][15401] Updated weights for policy 0, policy_version 811123 (0.0041) [2024-06-25 04:33:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13289553920. Throughput: 0: 42829.7. Samples: 13289692860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 04:33:34,873][15401] Updated weights for policy 0, policy_version 811133 (0.0039) [2024-06-25 04:33:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13289750528. Throughput: 0: 42943.0. Samples: 13289830820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:38,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 04:33:38,444][15401] Updated weights for policy 0, policy_version 811143 (0.0032) [2024-06-25 04:33:42,806][15401] Updated weights for policy 0, policy_version 811153 (0.0028) [2024-06-25 04:33:43,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13289930752. Throughput: 0: 43090.3. Samples: 13290084600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 04:33:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811154_13289947136.pth... [2024-06-25 04:33:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810528_13279690752.pth [2024-06-25 04:33:46,089][15401] Updated weights for policy 0, policy_version 811163 (0.0033) [2024-06-25 04:33:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 13290192896. Throughput: 0: 43095.6. Samples: 13290336580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:48,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 04:33:50,409][15401] Updated weights for policy 0, policy_version 811173 (0.0034) [2024-06-25 04:33:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13290389504. Throughput: 0: 43062.8. Samples: 13290475140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 04:33:54,017][15401] Updated weights for policy 0, policy_version 811183 (0.0030) [2024-06-25 04:33:57,982][15401] Updated weights for policy 0, policy_version 811193 (0.0027) [2024-06-25 04:33:58,389][15132] Fps is (10 sec: 39330.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13290586112. Throughput: 0: 43035.7. Samples: 13290730740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:33:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 04:34:01,495][15401] Updated weights for policy 0, policy_version 811203 (0.0026) [2024-06-25 04:34:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13290815488. Throughput: 0: 43226.7. Samples: 13290989520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-25 04:34:03,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 04:34:05,623][15401] Updated weights for policy 0, policy_version 811213 (0.0039) [2024-06-25 04:34:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13291044864. Throughput: 0: 42922.3. Samples: 13291118100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 04:34:09,037][15401] Updated weights for policy 0, policy_version 811223 (0.0038) [2024-06-25 04:34:13,146][15401] Updated weights for policy 0, policy_version 811233 (0.0029) [2024-06-25 04:34:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13291241472. Throughput: 0: 43001.8. Samples: 13291376540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:13,393][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 04:34:16,496][15401] Updated weights for policy 0, policy_version 811243 (0.0035) [2024-06-25 04:34:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13291470848. Throughput: 0: 43089.4. Samples: 13291631880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 04:34:21,090][15401] Updated weights for policy 0, policy_version 811253 (0.0037) [2024-06-25 04:34:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13291683840. Throughput: 0: 43084.0. Samples: 13291769600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 04:34:23,864][15401] Updated weights for policy 0, policy_version 811263 (0.0027) [2024-06-25 04:34:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13291880448. Throughput: 0: 43117.7. Samples: 13292024900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 04:34:28,558][15401] Updated weights for policy 0, policy_version 811273 (0.0033) [2024-06-25 04:34:30,146][15349] Signal inference workers to stop experience collection... (196700 times) [2024-06-25 04:34:30,146][15349] Signal inference workers to resume experience collection... (196700 times) [2024-06-25 04:34:30,191][15401] InferenceWorker_p0-w0: stopping experience collection (196700 times) [2024-06-25 04:34:30,191][15401] InferenceWorker_p0-w0: resuming experience collection (196700 times) [2024-06-25 04:34:31,893][15401] Updated weights for policy 0, policy_version 811283 (0.0039) [2024-06-25 04:34:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13292109824. Throughput: 0: 43240.9. Samples: 13292282320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 04:34:36,151][15401] Updated weights for policy 0, policy_version 811293 (0.0025) [2024-06-25 04:34:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13292339200. Throughput: 0: 43144.4. Samples: 13292416640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:38,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 04:34:39,218][15401] Updated weights for policy 0, policy_version 811303 (0.0047) [2024-06-25 04:34:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13292519424. Throughput: 0: 43094.6. Samples: 13292670000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 04:34:43,677][15401] Updated weights for policy 0, policy_version 811313 (0.0029) [2024-06-25 04:34:47,099][15401] Updated weights for policy 0, policy_version 811323 (0.0034) [2024-06-25 04:34:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 13292765184. Throughput: 0: 43062.6. Samples: 13292927340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 04:34:51,474][15401] Updated weights for policy 0, policy_version 811333 (0.0028) [2024-06-25 04:34:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.4, 300 sec: 43042.7). Total num frames: 13292994560. Throughput: 0: 43257.6. Samples: 13293064700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:34:54,636][15401] Updated weights for policy 0, policy_version 811343 (0.0037) [2024-06-25 04:34:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13293174784. Throughput: 0: 43173.7. Samples: 13293319360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:34:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 04:34:59,015][15401] Updated weights for policy 0, policy_version 811353 (0.0039) [2024-06-25 04:35:02,446][15401] Updated weights for policy 0, policy_version 811363 (0.0027) [2024-06-25 04:35:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 13293420544. Throughput: 0: 43040.4. Samples: 13293568700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 04:35:06,767][15401] Updated weights for policy 0, policy_version 811373 (0.0030) [2024-06-25 04:35:08,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 13293633536. Throughput: 0: 43103.9. Samples: 13293709380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:08,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 04:35:10,149][15401] Updated weights for policy 0, policy_version 811383 (0.0041) [2024-06-25 04:35:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13293813760. Throughput: 0: 42947.2. Samples: 13293957520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 04:35:14,249][15401] Updated weights for policy 0, policy_version 811393 (0.0035) [2024-06-25 04:35:17,656][15401] Updated weights for policy 0, policy_version 811403 (0.0033) [2024-06-25 04:35:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13294075904. Throughput: 0: 42817.9. Samples: 13294209120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 04:35:21,828][15401] Updated weights for policy 0, policy_version 811413 (0.0043) [2024-06-25 04:35:23,390][15132] Fps is (10 sec: 45871.1, 60 sec: 43143.9, 300 sec: 43042.6). Total num frames: 13294272512. Throughput: 0: 42905.0. Samples: 13294347400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:23,391][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 04:35:25,417][15401] Updated weights for policy 0, policy_version 811423 (0.0038) [2024-06-25 04:35:28,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13294452736. Throughput: 0: 42882.3. Samples: 13294599700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 04:35:29,302][15401] Updated weights for policy 0, policy_version 811433 (0.0030) [2024-06-25 04:35:32,931][15401] Updated weights for policy 0, policy_version 811443 (0.0035) [2024-06-25 04:35:33,389][15132] Fps is (10 sec: 40963.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13294682112. Throughput: 0: 42873.5. Samples: 13294856640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:33,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 04:35:36,467][15349] Signal inference workers to stop experience collection... (196750 times) [2024-06-25 04:35:36,493][15401] InferenceWorker_p0-w0: stopping experience collection (196750 times) [2024-06-25 04:35:36,533][15349] Signal inference workers to resume experience collection... (196750 times) [2024-06-25 04:35:36,534][15401] InferenceWorker_p0-w0: resuming experience collection (196750 times) [2024-06-25 04:35:36,949][15401] Updated weights for policy 0, policy_version 811453 (0.0035) [2024-06-25 04:35:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 13294911488. Throughput: 0: 42779.6. Samples: 13294989780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 04:35:40,389][15401] Updated weights for policy 0, policy_version 811463 (0.0037) [2024-06-25 04:35:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13295091712. Throughput: 0: 42737.5. Samples: 13295242540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 04:35:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811469_13295108096.pth... [2024-06-25 04:35:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000810841_13284818944.pth [2024-06-25 04:35:44,542][15401] Updated weights for policy 0, policy_version 811473 (0.0033) [2024-06-25 04:35:48,147][15401] Updated weights for policy 0, policy_version 811483 (0.0042) [2024-06-25 04:35:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 13295337472. Throughput: 0: 42986.3. Samples: 13295503080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 04:35:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 04:35:52,107][15401] Updated weights for policy 0, policy_version 811493 (0.0034) [2024-06-25 04:35:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 13295550464. Throughput: 0: 42886.7. Samples: 13295639180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:35:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 04:35:56,058][15401] Updated weights for policy 0, policy_version 811503 (0.0040) [2024-06-25 04:35:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13295747072. Throughput: 0: 42863.1. Samples: 13295886360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:35:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 04:35:59,748][15401] Updated weights for policy 0, policy_version 811513 (0.0031) [2024-06-25 04:36:03,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13295976448. Throughput: 0: 42973.1. Samples: 13296142920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 04:36:03,572][15401] Updated weights for policy 0, policy_version 811523 (0.0026) [2024-06-25 04:36:07,600][15401] Updated weights for policy 0, policy_version 811533 (0.0034) [2024-06-25 04:36:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 43042.7). Total num frames: 13296189440. Throughput: 0: 42910.1. Samples: 13296278320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 04:36:11,183][15401] Updated weights for policy 0, policy_version 811543 (0.0030) [2024-06-25 04:36:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13296402432. Throughput: 0: 42856.7. Samples: 13296528260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:13,391][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 04:36:15,301][15401] Updated weights for policy 0, policy_version 811553 (0.0030) [2024-06-25 04:36:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 13296615424. Throughput: 0: 42817.8. Samples: 13296783440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 04:36:18,896][15401] Updated weights for policy 0, policy_version 811563 (0.0036) [2024-06-25 04:36:22,877][15401] Updated weights for policy 0, policy_version 811573 (0.0031) [2024-06-25 04:36:23,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42599.0, 300 sec: 42987.2). Total num frames: 13296828416. Throughput: 0: 42705.9. Samples: 13296911540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:23,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 04:36:26,631][15401] Updated weights for policy 0, policy_version 811583 (0.0043) [2024-06-25 04:36:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13297041408. Throughput: 0: 42734.6. Samples: 13297165600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 04:36:31,064][15401] Updated weights for policy 0, policy_version 811593 (0.0039) [2024-06-25 04:36:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13297270784. Throughput: 0: 42393.6. Samples: 13297410800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:33,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 04:36:34,082][15401] Updated weights for policy 0, policy_version 811603 (0.0026) [2024-06-25 04:36:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13297451008. Throughput: 0: 42431.1. Samples: 13297548580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:38,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 04:36:38,637][15401] Updated weights for policy 0, policy_version 811613 (0.0042) [2024-06-25 04:36:41,805][15401] Updated weights for policy 0, policy_version 811623 (0.0036) [2024-06-25 04:36:43,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13297647616. Throughput: 0: 42520.5. Samples: 13297799780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 04:36:46,131][15401] Updated weights for policy 0, policy_version 811633 (0.0028) [2024-06-25 04:36:48,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13297926144. Throughput: 0: 42454.8. Samples: 13298053380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 04:36:49,331][15401] Updated weights for policy 0, policy_version 811643 (0.0033) [2024-06-25 04:36:53,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42876.4). Total num frames: 13298089984. Throughput: 0: 42515.0. Samples: 13298191500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 04:36:53,713][15401] Updated weights for policy 0, policy_version 811653 (0.0056) [2024-06-25 04:36:56,895][15401] Updated weights for policy 0, policy_version 811663 (0.0033) [2024-06-25 04:36:58,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13298302976. Throughput: 0: 42440.2. Samples: 13298438060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:36:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 04:36:59,111][15349] Signal inference workers to stop experience collection... (196800 times) [2024-06-25 04:36:59,111][15349] Signal inference workers to resume experience collection... (196800 times) [2024-06-25 04:36:59,129][15401] InferenceWorker_p0-w0: stopping experience collection (196800 times) [2024-06-25 04:36:59,129][15401] InferenceWorker_p0-w0: resuming experience collection (196800 times) [2024-06-25 04:37:01,164][15401] Updated weights for policy 0, policy_version 811673 (0.0028) [2024-06-25 04:37:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13298548736. Throughput: 0: 42635.5. Samples: 13298702040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 04:37:04,395][15401] Updated weights for policy 0, policy_version 811683 (0.0040) [2024-06-25 04:37:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 13298745344. Throughput: 0: 42762.7. Samples: 13298835860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 04:37:08,642][15401] Updated weights for policy 0, policy_version 811693 (0.0034) [2024-06-25 04:37:12,395][15401] Updated weights for policy 0, policy_version 811703 (0.0032) [2024-06-25 04:37:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 13298958336. Throughput: 0: 42666.7. Samples: 13299085600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:13,390][15132] Avg episode reward: [(0, '0.156')] [2024-06-25 04:37:16,246][15401] Updated weights for policy 0, policy_version 811713 (0.0032) [2024-06-25 04:37:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13299187712. Throughput: 0: 42938.7. Samples: 13299343040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 04:37:19,823][15401] Updated weights for policy 0, policy_version 811723 (0.0034) [2024-06-25 04:37:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13299400704. Throughput: 0: 42924.0. Samples: 13299480160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:23,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-25 04:37:23,842][15401] Updated weights for policy 0, policy_version 811733 (0.0027) [2024-06-25 04:37:27,426][15401] Updated weights for policy 0, policy_version 811743 (0.0030) [2024-06-25 04:37:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13299613696. Throughput: 0: 43031.9. Samples: 13299736220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 04:37:31,306][15401] Updated weights for policy 0, policy_version 811753 (0.0037) [2024-06-25 04:37:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 13299843072. Throughput: 0: 43159.9. Samples: 13299995680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:33,393][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 04:37:35,110][15401] Updated weights for policy 0, policy_version 811763 (0.0030) [2024-06-25 04:37:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13300023296. Throughput: 0: 42993.1. Samples: 13300126180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-25 04:37:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 04:37:38,974][15401] Updated weights for policy 0, policy_version 811773 (0.0025) [2024-06-25 04:37:42,595][15401] Updated weights for policy 0, policy_version 811783 (0.0031) [2024-06-25 04:37:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13300252672. Throughput: 0: 43188.4. Samples: 13300381540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:37:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 04:37:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811784_13300269056.pth... [2024-06-25 04:37:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811154_13289947136.pth [2024-06-25 04:37:46,657][15401] Updated weights for policy 0, policy_version 811793 (0.0037) [2024-06-25 04:37:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13300482048. Throughput: 0: 43144.4. Samples: 13300643540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:37:48,399][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 04:37:50,035][15401] Updated weights for policy 0, policy_version 811803 (0.0047) [2024-06-25 04:37:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13300678656. Throughput: 0: 43046.6. Samples: 13300772960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:37:53,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 04:37:54,270][15401] Updated weights for policy 0, policy_version 811813 (0.0042) [2024-06-25 04:37:57,616][15401] Updated weights for policy 0, policy_version 811823 (0.0027) [2024-06-25 04:37:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 13300924416. Throughput: 0: 43236.3. Samples: 13301031240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:37:58,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 04:38:02,031][15401] Updated weights for policy 0, policy_version 811833 (0.0036) [2024-06-25 04:38:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13301121024. Throughput: 0: 43331.2. Samples: 13301292940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:38:05,400][15401] Updated weights for policy 0, policy_version 811843 (0.0028) [2024-06-25 04:38:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13301334016. Throughput: 0: 43103.2. Samples: 13301419800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:08,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 04:38:09,545][15401] Updated weights for policy 0, policy_version 811853 (0.0028) [2024-06-25 04:38:13,327][15401] Updated weights for policy 0, policy_version 811863 (0.0037) [2024-06-25 04:38:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13301563392. Throughput: 0: 43074.2. Samples: 13301674560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 04:38:17,459][15401] Updated weights for policy 0, policy_version 811873 (0.0028) [2024-06-25 04:38:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13301776384. Throughput: 0: 43121.3. Samples: 13301936040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:18,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 04:38:21,203][15401] Updated weights for policy 0, policy_version 811883 (0.0033) [2024-06-25 04:38:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13302005760. Throughput: 0: 43074.1. Samples: 13302064520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 04:38:24,935][15401] Updated weights for policy 0, policy_version 811893 (0.0037) [2024-06-25 04:38:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13302202368. Throughput: 0: 43126.7. Samples: 13302322240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 04:38:28,781][15401] Updated weights for policy 0, policy_version 811903 (0.0028) [2024-06-25 04:38:32,526][15401] Updated weights for policy 0, policy_version 811913 (0.0033) [2024-06-25 04:38:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 13302431744. Throughput: 0: 43054.7. Samples: 13302581000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 04:38:36,275][15401] Updated weights for policy 0, policy_version 811923 (0.0036) [2024-06-25 04:38:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 13302644736. Throughput: 0: 43038.7. Samples: 13302709700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:38,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 04:38:40,468][15401] Updated weights for policy 0, policy_version 811933 (0.0034) [2024-06-25 04:38:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 13302841344. Throughput: 0: 43018.4. Samples: 13302967060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:43,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 04:38:43,905][15401] Updated weights for policy 0, policy_version 811943 (0.0027) [2024-06-25 04:38:45,636][15349] Signal inference workers to stop experience collection... (196850 times) [2024-06-25 04:38:45,636][15349] Signal inference workers to resume experience collection... (196850 times) [2024-06-25 04:38:45,648][15401] InferenceWorker_p0-w0: stopping experience collection (196850 times) [2024-06-25 04:38:45,648][15401] InferenceWorker_p0-w0: resuming experience collection (196850 times) [2024-06-25 04:38:48,018][15401] Updated weights for policy 0, policy_version 811953 (0.0029) [2024-06-25 04:38:48,394][15132] Fps is (10 sec: 40943.4, 60 sec: 42868.6, 300 sec: 42931.0). Total num frames: 13303054336. Throughput: 0: 42910.3. Samples: 13303224080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:48,394][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 04:38:51,699][15401] Updated weights for policy 0, policy_version 811963 (0.0039) [2024-06-25 04:38:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13303283712. Throughput: 0: 42920.4. Samples: 13303351220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:53,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 04:38:55,546][15401] Updated weights for policy 0, policy_version 811973 (0.0030) [2024-06-25 04:38:58,389][15132] Fps is (10 sec: 42615.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13303480320. Throughput: 0: 42913.8. Samples: 13303605680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:38:58,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 04:38:59,186][15401] Updated weights for policy 0, policy_version 811983 (0.0034) [2024-06-25 04:39:03,120][15401] Updated weights for policy 0, policy_version 811993 (0.0039) [2024-06-25 04:39:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13303709696. Throughput: 0: 42872.4. Samples: 13303865300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 04:39:06,860][15401] Updated weights for policy 0, policy_version 812003 (0.0032) [2024-06-25 04:39:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13303922688. Throughput: 0: 42912.6. Samples: 13303995580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:08,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 04:39:10,611][15401] Updated weights for policy 0, policy_version 812013 (0.0032) [2024-06-25 04:39:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13304135680. Throughput: 0: 42872.4. Samples: 13304251500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 04:39:14,736][15401] Updated weights for policy 0, policy_version 812023 (0.0047) [2024-06-25 04:39:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13304332288. Throughput: 0: 42767.5. Samples: 13304505540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 04:39:18,523][15401] Updated weights for policy 0, policy_version 812033 (0.0024) [2024-06-25 04:39:22,444][15401] Updated weights for policy 0, policy_version 812043 (0.0031) [2024-06-25 04:39:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13304561664. Throughput: 0: 42764.3. Samples: 13304634100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 04:39:25,931][15401] Updated weights for policy 0, policy_version 812053 (0.0029) [2024-06-25 04:39:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13304774656. Throughput: 0: 42844.0. Samples: 13304895040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 04:39:28,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 04:39:29,918][15401] Updated weights for policy 0, policy_version 812063 (0.0046) [2024-06-25 04:39:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13304987648. Throughput: 0: 42764.7. Samples: 13305148320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 04:39:33,980][15401] Updated weights for policy 0, policy_version 812073 (0.0036) [2024-06-25 04:39:37,430][15401] Updated weights for policy 0, policy_version 812083 (0.0038) [2024-06-25 04:39:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 13305200640. Throughput: 0: 42790.8. Samples: 13305276800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 04:39:41,552][15401] Updated weights for policy 0, policy_version 812093 (0.0037) [2024-06-25 04:39:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13305413632. Throughput: 0: 42899.5. Samples: 13305536160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 04:39:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812099_13305430016.pth... [2024-06-25 04:39:43,582][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811469_13295108096.pth [2024-06-25 04:39:44,809][15401] Updated weights for policy 0, policy_version 812103 (0.0030) [2024-06-25 04:39:48,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42601.3, 300 sec: 42765.0). Total num frames: 13305610240. Throughput: 0: 42812.1. Samples: 13305791840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:48,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 04:39:49,103][15401] Updated weights for policy 0, policy_version 812113 (0.0039) [2024-06-25 04:39:52,874][15401] Updated weights for policy 0, policy_version 812123 (0.0030) [2024-06-25 04:39:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13305839616. Throughput: 0: 42671.4. Samples: 13305915800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:53,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-25 04:39:56,628][15401] Updated weights for policy 0, policy_version 812133 (0.0033) [2024-06-25 04:39:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13306068992. Throughput: 0: 42914.7. Samples: 13306182660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:39:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 04:40:00,284][15401] Updated weights for policy 0, policy_version 812143 (0.0034) [2024-06-25 04:40:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 13306265600. Throughput: 0: 43035.6. Samples: 13306442140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 04:40:04,660][15401] Updated weights for policy 0, policy_version 812153 (0.0037) [2024-06-25 04:40:04,974][15349] Signal inference workers to stop experience collection... (196900 times) [2024-06-25 04:40:04,975][15349] Signal inference workers to resume experience collection... (196900 times) [2024-06-25 04:40:05,020][15401] InferenceWorker_p0-w0: stopping experience collection (196900 times) [2024-06-25 04:40:05,020][15401] InferenceWorker_p0-w0: resuming experience collection (196900 times) [2024-06-25 04:40:07,681][15401] Updated weights for policy 0, policy_version 812163 (0.0033) [2024-06-25 04:40:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13306494976. Throughput: 0: 42915.4. Samples: 13306565280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 04:40:12,261][15401] Updated weights for policy 0, policy_version 812173 (0.0029) [2024-06-25 04:40:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13306691584. Throughput: 0: 42909.2. Samples: 13306825960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 04:40:15,183][15401] Updated weights for policy 0, policy_version 812183 (0.0038) [2024-06-25 04:40:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.7). Total num frames: 13306904576. Throughput: 0: 42876.9. Samples: 13307077780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:18,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 04:40:19,946][15401] Updated weights for policy 0, policy_version 812193 (0.0030) [2024-06-25 04:40:23,097][15401] Updated weights for policy 0, policy_version 812203 (0.0027) [2024-06-25 04:40:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 13307133952. Throughput: 0: 42916.4. Samples: 13307208040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 04:40:28,013][15401] Updated weights for policy 0, policy_version 812213 (0.0030) [2024-06-25 04:40:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13307330560. Throughput: 0: 42855.5. Samples: 13307464660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:28,393][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 04:40:30,811][15401] Updated weights for policy 0, policy_version 812223 (0.0030) [2024-06-25 04:40:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13307543552. Throughput: 0: 42805.8. Samples: 13307718100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 04:40:35,583][15401] Updated weights for policy 0, policy_version 812233 (0.0024) [2024-06-25 04:40:38,286][15401] Updated weights for policy 0, policy_version 812243 (0.0036) [2024-06-25 04:40:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13307789312. Throughput: 0: 43010.2. Samples: 13307851260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 04:40:43,030][15401] Updated weights for policy 0, policy_version 812253 (0.0034) [2024-06-25 04:40:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13307969536. Throughput: 0: 42806.5. Samples: 13308108960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 04:40:45,763][15401] Updated weights for policy 0, policy_version 812263 (0.0036) [2024-06-25 04:40:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13308198912. Throughput: 0: 42629.3. Samples: 13308360460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 04:40:50,504][15401] Updated weights for policy 0, policy_version 812273 (0.0023) [2024-06-25 04:40:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13308428288. Throughput: 0: 42834.7. Samples: 13308492840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 04:40:53,441][15401] Updated weights for policy 0, policy_version 812283 (0.0028) [2024-06-25 04:40:58,180][15401] Updated weights for policy 0, policy_version 812293 (0.0023) [2024-06-25 04:40:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13308608512. Throughput: 0: 42906.8. Samples: 13308756760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:40:58,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-25 04:41:01,191][15401] Updated weights for policy 0, policy_version 812303 (0.0032) [2024-06-25 04:41:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13308837888. Throughput: 0: 42814.3. Samples: 13309004420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:41:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 04:41:05,652][15401] Updated weights for policy 0, policy_version 812313 (0.0031) [2024-06-25 04:41:08,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 13309067264. Throughput: 0: 42966.1. Samples: 13309141620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:41:08,401][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 04:41:08,873][15401] Updated weights for policy 0, policy_version 812323 (0.0039) [2024-06-25 04:41:13,288][15401] Updated weights for policy 0, policy_version 812333 (0.0028) [2024-06-25 04:41:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13309263872. Throughput: 0: 42989.3. Samples: 13309399180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 04:41:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 04:41:17,016][15401] Updated weights for policy 0, policy_version 812343 (0.0040) [2024-06-25 04:41:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13309493248. Throughput: 0: 42887.9. Samples: 13309648060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 04:41:21,006][15401] Updated weights for policy 0, policy_version 812353 (0.0035) [2024-06-25 04:41:21,876][15349] Signal inference workers to stop experience collection... (196950 times) [2024-06-25 04:41:21,880][15349] Signal inference workers to resume experience collection... (196950 times) [2024-06-25 04:41:21,922][15401] InferenceWorker_p0-w0: stopping experience collection (196950 times) [2024-06-25 04:41:21,922][15401] InferenceWorker_p0-w0: resuming experience collection (196950 times) [2024-06-25 04:41:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13309706240. Throughput: 0: 42922.3. Samples: 13309782760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 04:41:24,481][15401] Updated weights for policy 0, policy_version 812363 (0.0037) [2024-06-25 04:41:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13309902848. Throughput: 0: 42859.1. Samples: 13310037620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 04:41:28,684][15401] Updated weights for policy 0, policy_version 812373 (0.0040) [2024-06-25 04:41:32,162][15401] Updated weights for policy 0, policy_version 812383 (0.0037) [2024-06-25 04:41:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13310148608. Throughput: 0: 42934.7. Samples: 13310292520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:33,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 04:41:36,307][15401] Updated weights for policy 0, policy_version 812393 (0.0033) [2024-06-25 04:41:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 13310345216. Throughput: 0: 42897.7. Samples: 13310423240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 04:41:39,806][15401] Updated weights for policy 0, policy_version 812403 (0.0037) [2024-06-25 04:41:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13310541824. Throughput: 0: 42723.1. Samples: 13310679300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 04:41:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812412_13310558208.pth... [2024-06-25 04:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000811784_13300269056.pth [2024-06-25 04:41:43,849][15401] Updated weights for policy 0, policy_version 812413 (0.0028) [2024-06-25 04:41:47,316][15401] Updated weights for policy 0, policy_version 812423 (0.0023) [2024-06-25 04:41:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13310771200. Throughput: 0: 42895.1. Samples: 13310934700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:48,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 04:41:51,551][15401] Updated weights for policy 0, policy_version 812433 (0.0035) [2024-06-25 04:41:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13310984192. Throughput: 0: 42805.5. Samples: 13311067760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 04:41:55,302][15401] Updated weights for policy 0, policy_version 812443 (0.0041) [2024-06-25 04:41:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13311197184. Throughput: 0: 42762.4. Samples: 13311323480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:41:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 04:41:58,942][15401] Updated weights for policy 0, policy_version 812453 (0.0033) [2024-06-25 04:42:02,963][15401] Updated weights for policy 0, policy_version 812463 (0.0019) [2024-06-25 04:42:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13311410176. Throughput: 0: 42972.4. Samples: 13311581820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 04:42:06,496][15401] Updated weights for policy 0, policy_version 812473 (0.0033) [2024-06-25 04:42:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 13311639552. Throughput: 0: 42796.4. Samples: 13311708600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 04:42:10,824][15401] Updated weights for policy 0, policy_version 812483 (0.0033) [2024-06-25 04:42:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13311852544. Throughput: 0: 42792.5. Samples: 13311963280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:13,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 04:42:14,589][15401] Updated weights for policy 0, policy_version 812493 (0.0034) [2024-06-25 04:42:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13312032768. Throughput: 0: 42868.0. Samples: 13312221580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 04:42:18,684][15401] Updated weights for policy 0, policy_version 812503 (0.0033) [2024-06-25 04:42:22,094][15401] Updated weights for policy 0, policy_version 812513 (0.0032) [2024-06-25 04:42:23,390][15132] Fps is (10 sec: 44233.5, 60 sec: 43143.9, 300 sec: 42987.1). Total num frames: 13312294912. Throughput: 0: 42735.7. Samples: 13312346380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:23,391][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 04:42:26,257][15401] Updated weights for policy 0, policy_version 812523 (0.0046) [2024-06-25 04:42:28,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.9, 300 sec: 42876.1). Total num frames: 13312491520. Throughput: 0: 42762.5. Samples: 13312603720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:28,392][15132] Avg episode reward: [(0, '0.283')] [2024-06-25 04:42:29,707][15401] Updated weights for policy 0, policy_version 812533 (0.0036) [2024-06-25 04:42:33,389][15132] Fps is (10 sec: 37686.5, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 13312671744. Throughput: 0: 42927.2. Samples: 13312866420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:33,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 04:42:33,850][15401] Updated weights for policy 0, policy_version 812543 (0.0031) [2024-06-25 04:42:37,068][15349] Signal inference workers to stop experience collection... (197000 times) [2024-06-25 04:42:37,068][15349] Signal inference workers to resume experience collection... (197000 times) [2024-06-25 04:42:37,092][15401] InferenceWorker_p0-w0: stopping experience collection (197000 times) [2024-06-25 04:42:37,092][15401] InferenceWorker_p0-w0: resuming experience collection (197000 times) [2024-06-25 04:42:37,234][15401] Updated weights for policy 0, policy_version 812553 (0.0033) [2024-06-25 04:42:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13312933888. Throughput: 0: 42656.3. Samples: 13312987300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 04:42:41,748][15401] Updated weights for policy 0, policy_version 812563 (0.0030) [2024-06-25 04:42:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13313114112. Throughput: 0: 42769.2. Samples: 13313248100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 04:42:44,702][15401] Updated weights for policy 0, policy_version 812573 (0.0038) [2024-06-25 04:42:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13313327104. Throughput: 0: 42697.4. Samples: 13313503200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:48,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 04:42:49,270][15401] Updated weights for policy 0, policy_version 812583 (0.0037) [2024-06-25 04:42:52,390][15401] Updated weights for policy 0, policy_version 812593 (0.0041) [2024-06-25 04:42:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13313572864. Throughput: 0: 42614.1. Samples: 13313626240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 04:42:56,762][15401] Updated weights for policy 0, policy_version 812603 (0.0034) [2024-06-25 04:42:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13313753088. Throughput: 0: 42728.1. Samples: 13313886040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:42:58,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 04:43:00,104][15401] Updated weights for policy 0, policy_version 812613 (0.0025) [2024-06-25 04:43:03,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13313949696. Throughput: 0: 42627.9. Samples: 13314139840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 04:43:03,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 04:43:04,515][15401] Updated weights for policy 0, policy_version 812623 (0.0029) [2024-06-25 04:43:07,775][15401] Updated weights for policy 0, policy_version 812633 (0.0028) [2024-06-25 04:43:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13314228224. Throughput: 0: 42771.0. Samples: 13314271040. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:08,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 04:43:12,207][15401] Updated weights for policy 0, policy_version 812643 (0.0040) [2024-06-25 04:43:13,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 13314375680. Throughput: 0: 42609.3. Samples: 13314521140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:13,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:43:15,320][15401] Updated weights for policy 0, policy_version 812653 (0.0027) [2024-06-25 04:43:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13314605056. Throughput: 0: 42426.6. Samples: 13314775620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:18,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 04:43:20,059][15401] Updated weights for policy 0, policy_version 812663 (0.0039) [2024-06-25 04:43:22,874][15401] Updated weights for policy 0, policy_version 812673 (0.0036) [2024-06-25 04:43:23,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42599.0, 300 sec: 42876.1). Total num frames: 13314850816. Throughput: 0: 42657.4. Samples: 13314906880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 04:43:28,245][15401] Updated weights for policy 0, policy_version 812683 (0.0038) [2024-06-25 04:43:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 42598.4). Total num frames: 13314998272. Throughput: 0: 42468.0. Samples: 13315159160. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 04:43:30,638][15401] Updated weights for policy 0, policy_version 812693 (0.0037) [2024-06-25 04:43:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13315260416. Throughput: 0: 42316.4. Samples: 13315407440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 04:43:35,884][15401] Updated weights for policy 0, policy_version 812703 (0.0032) [2024-06-25 04:43:37,174][15349] Signal inference workers to stop experience collection... (197050 times) [2024-06-25 04:43:37,219][15401] InferenceWorker_p0-w0: stopping experience collection (197050 times) [2024-06-25 04:43:37,223][15349] Signal inference workers to resume experience collection... (197050 times) [2024-06-25 04:43:37,230][15401] InferenceWorker_p0-w0: resuming experience collection (197050 times) [2024-06-25 04:43:38,229][15401] Updated weights for policy 0, policy_version 812713 (0.0039) [2024-06-25 04:43:38,389][15132] Fps is (10 sec: 50790.9, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 13315506176. Throughput: 0: 42647.7. Samples: 13315545380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 04:43:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42654.5). Total num frames: 13315637248. Throughput: 0: 42486.2. Samples: 13315797920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:43,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 04:43:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812723_13315653632.pth... [2024-06-25 04:43:43,474][15401] Updated weights for policy 0, policy_version 812723 (0.0031) [2024-06-25 04:43:43,514][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812099_13305430016.pth [2024-06-25 04:43:45,875][15401] Updated weights for policy 0, policy_version 812733 (0.0021) [2024-06-25 04:43:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13315899392. Throughput: 0: 42413.3. Samples: 13316048440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:43:50,998][15401] Updated weights for policy 0, policy_version 812743 (0.0031) [2024-06-25 04:43:53,389][15132] Fps is (10 sec: 49152.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13316128768. Throughput: 0: 42677.3. Samples: 13316191520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:53,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:43:53,496][15401] Updated weights for policy 0, policy_version 812753 (0.0036) [2024-06-25 04:43:58,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13316276224. Throughput: 0: 42748.1. Samples: 13316444700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:43:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 04:43:58,599][15401] Updated weights for policy 0, policy_version 812763 (0.0038) [2024-06-25 04:44:01,183][15401] Updated weights for policy 0, policy_version 812773 (0.0026) [2024-06-25 04:44:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43416.0, 300 sec: 42820.2). Total num frames: 13316554752. Throughput: 0: 42497.3. Samples: 13316688100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:03,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 04:44:06,312][15401] Updated weights for policy 0, policy_version 812783 (0.0033) [2024-06-25 04:44:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 13316751360. Throughput: 0: 42807.6. Samples: 13316833220. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 04:44:08,915][15401] Updated weights for policy 0, policy_version 812793 (0.0025) [2024-06-25 04:44:13,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13316931584. Throughput: 0: 42749.9. Samples: 13317082900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 04:44:14,058][15401] Updated weights for policy 0, policy_version 812803 (0.0029) [2024-06-25 04:44:16,434][15401] Updated weights for policy 0, policy_version 812813 (0.0027) [2024-06-25 04:44:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13317193728. Throughput: 0: 42918.3. Samples: 13317338760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:18,399][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 04:44:21,622][15401] Updated weights for policy 0, policy_version 812823 (0.0028) [2024-06-25 04:44:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13317390336. Throughput: 0: 42948.4. Samples: 13317478060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:23,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 04:44:24,248][15401] Updated weights for policy 0, policy_version 812833 (0.0045) [2024-06-25 04:44:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13317586944. Throughput: 0: 42798.6. Samples: 13317723860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 04:44:29,402][15401] Updated weights for policy 0, policy_version 812843 (0.0034) [2024-06-25 04:44:32,098][15401] Updated weights for policy 0, policy_version 812853 (0.0027) [2024-06-25 04:44:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 13317832704. Throughput: 0: 42757.5. Samples: 13317972520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 04:44:37,197][15401] Updated weights for policy 0, policy_version 812863 (0.0052) [2024-06-25 04:44:37,234][15349] Signal inference workers to stop experience collection... (197100 times) [2024-06-25 04:44:37,287][15401] InferenceWorker_p0-w0: stopping experience collection (197100 times) [2024-06-25 04:44:37,362][15349] Signal inference workers to resume experience collection... (197100 times) [2024-06-25 04:44:37,362][15401] InferenceWorker_p0-w0: resuming experience collection (197100 times) [2024-06-25 04:44:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 13318012928. Throughput: 0: 42656.3. Samples: 13318111060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 04:44:39,807][15401] Updated weights for policy 0, policy_version 812873 (0.0040) [2024-06-25 04:44:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13318242304. Throughput: 0: 42627.5. Samples: 13318362940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 04:44:44,703][15401] Updated weights for policy 0, policy_version 812883 (0.0032) [2024-06-25 04:44:47,430][15401] Updated weights for policy 0, policy_version 812893 (0.0040) [2024-06-25 04:44:48,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13318488064. Throughput: 0: 42801.8. Samples: 13318614080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 04:44:48,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 04:44:52,185][15401] Updated weights for policy 0, policy_version 812903 (0.0034) [2024-06-25 04:44:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13318651904. Throughput: 0: 42565.0. Samples: 13318748640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:44:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 04:44:55,096][15401] Updated weights for policy 0, policy_version 812913 (0.0026) [2024-06-25 04:44:58,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 13318897664. Throughput: 0: 42865.7. Samples: 13319011960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:44:58,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 04:44:59,650][15401] Updated weights for policy 0, policy_version 812923 (0.0021) [2024-06-25 04:45:02,776][15401] Updated weights for policy 0, policy_version 812933 (0.0030) [2024-06-25 04:45:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13319110656. Throughput: 0: 42551.6. Samples: 13319253580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 04:45:07,143][15401] Updated weights for policy 0, policy_version 812943 (0.0024) [2024-06-25 04:45:08,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13319290880. Throughput: 0: 42528.0. Samples: 13319391820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:08,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 04:45:10,266][15401] Updated weights for policy 0, policy_version 812953 (0.0038) [2024-06-25 04:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 13319536640. Throughput: 0: 42810.2. Samples: 13319650320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 04:45:14,718][15401] Updated weights for policy 0, policy_version 812963 (0.0039) [2024-06-25 04:45:17,999][15401] Updated weights for policy 0, policy_version 812973 (0.0027) [2024-06-25 04:45:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13319749632. Throughput: 0: 42727.9. Samples: 13319895280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 04:45:22,259][15401] Updated weights for policy 0, policy_version 812983 (0.0035) [2024-06-25 04:45:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13319929856. Throughput: 0: 42663.6. Samples: 13320030920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 04:45:26,147][15401] Updated weights for policy 0, policy_version 812993 (0.0036) [2024-06-25 04:45:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13320159232. Throughput: 0: 42705.5. Samples: 13320284680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:28,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 04:45:30,225][15401] Updated weights for policy 0, policy_version 813003 (0.0037) [2024-06-25 04:45:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13320372224. Throughput: 0: 42713.0. Samples: 13320536160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:45:33,718][15401] Updated weights for policy 0, policy_version 813013 (0.0035) [2024-06-25 04:45:37,708][15401] Updated weights for policy 0, policy_version 813023 (0.0039) [2024-06-25 04:45:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13320585216. Throughput: 0: 42613.3. Samples: 13320666240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 04:45:41,230][15401] Updated weights for policy 0, policy_version 813033 (0.0029) [2024-06-25 04:45:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13320798208. Throughput: 0: 42308.0. Samples: 13320915720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 04:45:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813037_13320798208.pth... [2024-06-25 04:45:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812412_13310558208.pth [2024-06-25 04:45:45,282][15401] Updated weights for policy 0, policy_version 813043 (0.0031) [2024-06-25 04:45:47,938][15349] Signal inference workers to stop experience collection... (197150 times) [2024-06-25 04:45:47,979][15401] InferenceWorker_p0-w0: stopping experience collection (197150 times) [2024-06-25 04:45:47,998][15349] Signal inference workers to resume experience collection... (197150 times) [2024-06-25 04:45:48,000][15401] InferenceWorker_p0-w0: resuming experience collection (197150 times) [2024-06-25 04:45:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 13321011200. Throughput: 0: 42669.8. Samples: 13321173720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 04:45:48,964][15401] Updated weights for policy 0, policy_version 813053 (0.0033) [2024-06-25 04:45:53,179][15401] Updated weights for policy 0, policy_version 813063 (0.0031) [2024-06-25 04:45:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13321224192. Throughput: 0: 42481.2. Samples: 13321303480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:53,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 04:45:56,555][15401] Updated weights for policy 0, policy_version 813073 (0.0034) [2024-06-25 04:45:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 13321437184. Throughput: 0: 42313.8. Samples: 13321554540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:45:58,393][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 04:46:00,793][15401] Updated weights for policy 0, policy_version 813083 (0.0040) [2024-06-25 04:46:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 13321633792. Throughput: 0: 42740.9. Samples: 13321818620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:03,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-25 04:46:04,351][15401] Updated weights for policy 0, policy_version 813093 (0.0024) [2024-06-25 04:46:08,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13321863168. Throughput: 0: 42507.9. Samples: 13321943780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 04:46:08,463][15401] Updated weights for policy 0, policy_version 813103 (0.0046) [2024-06-25 04:46:12,046][15401] Updated weights for policy 0, policy_version 813113 (0.0026) [2024-06-25 04:46:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13322076160. Throughput: 0: 42541.1. Samples: 13322199040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 04:46:16,057][15401] Updated weights for policy 0, policy_version 813123 (0.0024) [2024-06-25 04:46:18,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13322305536. Throughput: 0: 42814.2. Samples: 13322462800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 04:46:19,740][15401] Updated weights for policy 0, policy_version 813133 (0.0042) [2024-06-25 04:46:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13322502144. Throughput: 0: 42765.7. Samples: 13322590700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 04:46:23,739][15401] Updated weights for policy 0, policy_version 813143 (0.0033) [2024-06-25 04:46:27,389][15401] Updated weights for policy 0, policy_version 813153 (0.0027) [2024-06-25 04:46:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13322731520. Throughput: 0: 42925.2. Samples: 13322847360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 04:46:31,385][15401] Updated weights for policy 0, policy_version 813163 (0.0050) [2024-06-25 04:46:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13322928128. Throughput: 0: 42884.4. Samples: 13323103520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 04:46:34,993][15401] Updated weights for policy 0, policy_version 813173 (0.0032) [2024-06-25 04:46:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13323157504. Throughput: 0: 42748.4. Samples: 13323227160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 04:46:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:46:39,260][15401] Updated weights for policy 0, policy_version 813183 (0.0025) [2024-06-25 04:46:42,900][15401] Updated weights for policy 0, policy_version 813193 (0.0040) [2024-06-25 04:46:43,395][15132] Fps is (10 sec: 44214.6, 60 sec: 42867.9, 300 sec: 42708.7). Total num frames: 13323370496. Throughput: 0: 42870.8. Samples: 13323483840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:46:43,395][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 04:46:46,879][15401] Updated weights for policy 0, policy_version 813203 (0.0052) [2024-06-25 04:46:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13323567104. Throughput: 0: 42684.0. Samples: 13323739400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:46:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 04:46:50,768][15401] Updated weights for policy 0, policy_version 813213 (0.0032) [2024-06-25 04:46:53,390][15132] Fps is (10 sec: 42619.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13323796480. Throughput: 0: 42711.6. Samples: 13323865800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:46:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 04:46:54,569][15401] Updated weights for policy 0, policy_version 813223 (0.0038) [2024-06-25 04:46:56,243][15349] Signal inference workers to stop experience collection... (197200 times) [2024-06-25 04:46:56,296][15349] Signal inference workers to resume experience collection... (197200 times) [2024-06-25 04:46:56,297][15401] InferenceWorker_p0-w0: stopping experience collection (197200 times) [2024-06-25 04:46:56,314][15401] InferenceWorker_p0-w0: resuming experience collection (197200 times) [2024-06-25 04:46:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 13323993088. Throughput: 0: 42625.3. Samples: 13324117280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:46:58,401][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 04:46:58,449][15401] Updated weights for policy 0, policy_version 813233 (0.0043) [2024-06-25 04:47:02,035][15401] Updated weights for policy 0, policy_version 813243 (0.0028) [2024-06-25 04:47:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13324206080. Throughput: 0: 42509.1. Samples: 13324375720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:03,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 04:47:06,306][15401] Updated weights for policy 0, policy_version 813253 (0.0022) [2024-06-25 04:47:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13324419072. Throughput: 0: 42540.5. Samples: 13324505020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 04:47:09,909][15401] Updated weights for policy 0, policy_version 813263 (0.0038) [2024-06-25 04:47:13,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13324632064. Throughput: 0: 42438.5. Samples: 13324757080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 04:47:13,871][15401] Updated weights for policy 0, policy_version 813273 (0.0034) [2024-06-25 04:47:17,570][15401] Updated weights for policy 0, policy_version 813283 (0.0044) [2024-06-25 04:47:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 13324845056. Throughput: 0: 42496.0. Samples: 13325015840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 04:47:21,407][15401] Updated weights for policy 0, policy_version 813293 (0.0036) [2024-06-25 04:47:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 13325074432. Throughput: 0: 42621.4. Samples: 13325145120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 04:47:25,295][15401] Updated weights for policy 0, policy_version 813303 (0.0030) [2024-06-25 04:47:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13325287424. Throughput: 0: 42685.6. Samples: 13325404480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 04:47:29,057][15401] Updated weights for policy 0, policy_version 813313 (0.0044) [2024-06-25 04:47:32,907][15401] Updated weights for policy 0, policy_version 813323 (0.0023) [2024-06-25 04:47:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13325484032. Throughput: 0: 42690.6. Samples: 13325660480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 04:47:36,538][15401] Updated weights for policy 0, policy_version 813333 (0.0034) [2024-06-25 04:47:38,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13325713408. Throughput: 0: 42644.5. Samples: 13325784800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 04:47:40,888][15401] Updated weights for policy 0, policy_version 813343 (0.0027) [2024-06-25 04:47:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42875.1, 300 sec: 42765.0). Total num frames: 13325942784. Throughput: 0: 42838.8. Samples: 13326044920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:43,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 04:47:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813351_13325942784.pth... [2024-06-25 04:47:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000812723_13315653632.pth [2024-06-25 04:47:44,498][15401] Updated weights for policy 0, policy_version 813353 (0.0019) [2024-06-25 04:47:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 13326106624. Throughput: 0: 42793.8. Samples: 13326301440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 04:47:48,690][15401] Updated weights for policy 0, policy_version 813363 (0.0023) [2024-06-25 04:47:52,101][15401] Updated weights for policy 0, policy_version 813373 (0.0032) [2024-06-25 04:47:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13326336000. Throughput: 0: 42495.6. Samples: 13326417320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:47:56,590][15401] Updated weights for policy 0, policy_version 813383 (0.0041) [2024-06-25 04:47:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13326548992. Throughput: 0: 42691.4. Samples: 13326678200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:47:58,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 04:47:59,750][15401] Updated weights for policy 0, policy_version 813393 (0.0031) [2024-06-25 04:48:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13326745600. Throughput: 0: 42461.4. Samples: 13326926600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:48:03,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 04:48:04,190][15401] Updated weights for policy 0, policy_version 813403 (0.0028) [2024-06-25 04:48:07,313][15401] Updated weights for policy 0, policy_version 813413 (0.0039) [2024-06-25 04:48:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13326991360. Throughput: 0: 42350.7. Samples: 13327050900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:48:08,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 04:48:11,887][15401] Updated weights for policy 0, policy_version 813423 (0.0036) [2024-06-25 04:48:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13327171584. Throughput: 0: 42385.3. Samples: 13327311820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:48:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 04:48:13,722][15349] Signal inference workers to stop experience collection... (197250 times) [2024-06-25 04:48:13,726][15349] Signal inference workers to resume experience collection... (197250 times) [2024-06-25 04:48:13,741][15401] InferenceWorker_p0-w0: stopping experience collection (197250 times) [2024-06-25 04:48:13,742][15401] InferenceWorker_p0-w0: resuming experience collection (197250 times) [2024-06-25 04:48:15,157][15401] Updated weights for policy 0, policy_version 813433 (0.0034) [2024-06-25 04:48:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13327400960. Throughput: 0: 42424.1. Samples: 13327569560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:48:18,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 04:48:19,328][15401] Updated weights for policy 0, policy_version 813443 (0.0035) [2024-06-25 04:48:22,832][15401] Updated weights for policy 0, policy_version 813453 (0.0032) [2024-06-25 04:48:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13327630336. Throughput: 0: 42460.5. Samples: 13327695520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 04:48:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 04:48:27,006][15401] Updated weights for policy 0, policy_version 813463 (0.0044) [2024-06-25 04:48:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 13327810560. Throughput: 0: 42262.2. Samples: 13327946720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:28,392][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 04:48:30,664][15401] Updated weights for policy 0, policy_version 813473 (0.0041) [2024-06-25 04:48:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13328023552. Throughput: 0: 42271.6. Samples: 13328203660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 04:48:34,496][15401] Updated weights for policy 0, policy_version 813483 (0.0049) [2024-06-25 04:48:38,200][15401] Updated weights for policy 0, policy_version 813493 (0.0019) [2024-06-25 04:48:38,392][15132] Fps is (10 sec: 47502.5, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13328285696. Throughput: 0: 42713.2. Samples: 13328339520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:38,392][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 04:48:42,033][15401] Updated weights for policy 0, policy_version 813503 (0.0043) [2024-06-25 04:48:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 13328449536. Throughput: 0: 42465.7. Samples: 13328589160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:43,396][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 04:48:45,952][15401] Updated weights for policy 0, policy_version 813513 (0.0040) [2024-06-25 04:48:48,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 13328678912. Throughput: 0: 42741.3. Samples: 13328850060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:48,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 04:48:49,763][15401] Updated weights for policy 0, policy_version 813523 (0.0035) [2024-06-25 04:48:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13328891904. Throughput: 0: 42910.2. Samples: 13328981860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:53,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 04:48:53,556][15401] Updated weights for policy 0, policy_version 813533 (0.0021) [2024-06-25 04:48:57,535][15401] Updated weights for policy 0, policy_version 813543 (0.0040) [2024-06-25 04:48:58,390][15132] Fps is (10 sec: 40969.1, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 13329088512. Throughput: 0: 42612.8. Samples: 13329229400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:48:58,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 04:49:01,271][15401] Updated weights for policy 0, policy_version 813553 (0.0033) [2024-06-25 04:49:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13329317888. Throughput: 0: 42585.9. Samples: 13329485920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 04:49:05,169][15401] Updated weights for policy 0, policy_version 813563 (0.0031) [2024-06-25 04:49:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13329530880. Throughput: 0: 42719.0. Samples: 13329617880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 04:49:08,401][15349] Signal inference workers to stop experience collection... (197300 times) [2024-06-25 04:49:08,447][15401] InferenceWorker_p0-w0: stopping experience collection (197300 times) [2024-06-25 04:49:08,455][15349] Signal inference workers to resume experience collection... (197300 times) [2024-06-25 04:49:08,460][15401] InferenceWorker_p0-w0: resuming experience collection (197300 times) [2024-06-25 04:49:08,753][15401] Updated weights for policy 0, policy_version 813573 (0.0027) [2024-06-25 04:49:12,770][15401] Updated weights for policy 0, policy_version 813583 (0.0041) [2024-06-25 04:49:13,390][15132] Fps is (10 sec: 42595.3, 60 sec: 42871.1, 300 sec: 42542.8). Total num frames: 13329743872. Throughput: 0: 42747.4. Samples: 13329870380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:13,399][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 04:49:16,403][15401] Updated weights for policy 0, policy_version 813593 (0.0032) [2024-06-25 04:49:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13329973248. Throughput: 0: 42636.5. Samples: 13330122300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 04:49:20,430][15401] Updated weights for policy 0, policy_version 813603 (0.0029) [2024-06-25 04:49:23,389][15132] Fps is (10 sec: 40963.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13330153472. Throughput: 0: 42483.7. Samples: 13330251180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 04:49:24,205][15401] Updated weights for policy 0, policy_version 813613 (0.0041) [2024-06-25 04:49:28,335][15401] Updated weights for policy 0, policy_version 813623 (0.0034) [2024-06-25 04:49:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 13330399232. Throughput: 0: 42642.7. Samples: 13330508080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 04:49:31,818][15401] Updated weights for policy 0, policy_version 813633 (0.0028) [2024-06-25 04:49:33,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13330628608. Throughput: 0: 42449.3. Samples: 13330760180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 04:49:35,715][15401] Updated weights for policy 0, policy_version 813643 (0.0028) [2024-06-25 04:49:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42053.8, 300 sec: 42598.4). Total num frames: 13330808832. Throughput: 0: 42451.0. Samples: 13330892160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 04:49:39,558][15401] Updated weights for policy 0, policy_version 813653 (0.0032) [2024-06-25 04:49:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 13331038208. Throughput: 0: 42744.9. Samples: 13331152920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:43,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 04:49:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813662_13331038208.pth... [2024-06-25 04:49:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813037_13320798208.pth [2024-06-25 04:49:43,914][15401] Updated weights for policy 0, policy_version 813663 (0.0032) [2024-06-25 04:49:47,340][15401] Updated weights for policy 0, policy_version 813673 (0.0037) [2024-06-25 04:49:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 13331267584. Throughput: 0: 42654.9. Samples: 13331405400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 04:49:51,654][15401] Updated weights for policy 0, policy_version 813683 (0.0031) [2024-06-25 04:49:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 13331447808. Throughput: 0: 42657.5. Samples: 13331537460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 04:49:55,311][15401] Updated weights for policy 0, policy_version 813693 (0.0043) [2024-06-25 04:49:58,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 13331693568. Throughput: 0: 42774.9. Samples: 13331795220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:49:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 04:49:59,158][15401] Updated weights for policy 0, policy_version 813703 (0.0041) [2024-06-25 04:50:00,036][15349] Signal inference workers to stop experience collection... (197350 times) [2024-06-25 04:50:00,037][15349] Signal inference workers to resume experience collection... (197350 times) [2024-06-25 04:50:00,075][15401] InferenceWorker_p0-w0: stopping experience collection (197350 times) [2024-06-25 04:50:00,075][15401] InferenceWorker_p0-w0: resuming experience collection (197350 times) [2024-06-25 04:50:02,886][15401] Updated weights for policy 0, policy_version 813713 (0.0054) [2024-06-25 04:50:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13331890176. Throughput: 0: 42785.9. Samples: 13332047660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:50:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 04:50:07,154][15401] Updated weights for policy 0, policy_version 813723 (0.0055) [2024-06-25 04:50:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13332103168. Throughput: 0: 42770.5. Samples: 13332175860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:50:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 04:50:10,450][15401] Updated weights for policy 0, policy_version 813733 (0.0038) [2024-06-25 04:50:13,390][15132] Fps is (10 sec: 44232.9, 60 sec: 43144.4, 300 sec: 42653.8). Total num frames: 13332332544. Throughput: 0: 42785.4. Samples: 13332433460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-06-25 04:50:13,391][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 04:50:14,631][15401] Updated weights for policy 0, policy_version 813743 (0.0038) [2024-06-25 04:50:18,223][15401] Updated weights for policy 0, policy_version 813753 (0.0031) [2024-06-25 04:50:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13332529152. Throughput: 0: 42948.9. Samples: 13332692880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:18,396][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 04:50:22,095][15401] Updated weights for policy 0, policy_version 813763 (0.0029) [2024-06-25 04:50:23,389][15132] Fps is (10 sec: 40963.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13332742144. Throughput: 0: 42790.4. Samples: 13332817720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:23,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 04:50:25,885][15401] Updated weights for policy 0, policy_version 813773 (0.0041) [2024-06-25 04:50:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13332955136. Throughput: 0: 42774.0. Samples: 13333077740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:28,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 04:50:29,487][15401] Updated weights for policy 0, policy_version 813783 (0.0037) [2024-06-25 04:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13333151744. Throughput: 0: 42844.5. Samples: 13333333400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 04:50:33,806][15401] Updated weights for policy 0, policy_version 813793 (0.0032) [2024-06-25 04:50:37,145][15401] Updated weights for policy 0, policy_version 813803 (0.0038) [2024-06-25 04:50:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13333381120. Throughput: 0: 42692.3. Samples: 13333458620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:38,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 04:50:41,421][15401] Updated weights for policy 0, policy_version 813813 (0.0034) [2024-06-25 04:50:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13333610496. Throughput: 0: 42850.7. Samples: 13333723500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:43,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 04:50:44,755][15401] Updated weights for policy 0, policy_version 813823 (0.0030) [2024-06-25 04:50:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13333807104. Throughput: 0: 42798.1. Samples: 13333973580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:50:49,226][15401] Updated weights for policy 0, policy_version 813833 (0.0037) [2024-06-25 04:50:52,308][15401] Updated weights for policy 0, policy_version 813843 (0.0037) [2024-06-25 04:50:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 13334020096. Throughput: 0: 42774.7. Samples: 13334100720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 04:50:56,738][15401] Updated weights for policy 0, policy_version 813853 (0.0038) [2024-06-25 04:50:58,391][15132] Fps is (10 sec: 42591.1, 60 sec: 42324.0, 300 sec: 42709.2). Total num frames: 13334233088. Throughput: 0: 42951.9. Samples: 13334366340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:50:58,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 04:50:59,817][15401] Updated weights for policy 0, policy_version 813863 (0.0039) [2024-06-25 04:51:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13334462464. Throughput: 0: 42835.5. Samples: 13334620480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 04:51:04,254][15401] Updated weights for policy 0, policy_version 813873 (0.0034) [2024-06-25 04:51:07,373][15401] Updated weights for policy 0, policy_version 813883 (0.0046) [2024-06-25 04:51:08,389][15132] Fps is (10 sec: 44244.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13334675456. Throughput: 0: 42972.0. Samples: 13334751460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 04:51:11,840][15401] Updated weights for policy 0, policy_version 813893 (0.0050) [2024-06-25 04:51:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42326.0, 300 sec: 42598.4). Total num frames: 13334872064. Throughput: 0: 42928.0. Samples: 13335009500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:13,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 04:51:15,499][15401] Updated weights for policy 0, policy_version 813903 (0.0027) [2024-06-25 04:51:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13335101440. Throughput: 0: 42883.6. Samples: 13335263160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 04:51:19,451][15401] Updated weights for policy 0, policy_version 813913 (0.0028) [2024-06-25 04:51:22,945][15401] Updated weights for policy 0, policy_version 813923 (0.0037) [2024-06-25 04:51:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13335314432. Throughput: 0: 43025.8. Samples: 13335394780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:23,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 04:51:27,375][15401] Updated weights for policy 0, policy_version 813933 (0.0041) [2024-06-25 04:51:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13335511040. Throughput: 0: 42695.1. Samples: 13335644780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 04:51:29,237][15349] Signal inference workers to stop experience collection... (197400 times) [2024-06-25 04:51:29,272][15401] InferenceWorker_p0-w0: stopping experience collection (197400 times) [2024-06-25 04:51:29,355][15349] Signal inference workers to resume experience collection... (197400 times) [2024-06-25 04:51:29,356][15401] InferenceWorker_p0-w0: resuming experience collection (197400 times) [2024-06-25 04:51:30,730][15401] Updated weights for policy 0, policy_version 813943 (0.0033) [2024-06-25 04:51:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13335724032. Throughput: 0: 42592.9. Samples: 13335890260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:33,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-25 04:51:35,044][15401] Updated weights for policy 0, policy_version 813953 (0.0040) [2024-06-25 04:51:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.7). Total num frames: 13335953408. Throughput: 0: 42666.2. Samples: 13336020700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:38,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 04:51:38,767][15401] Updated weights for policy 0, policy_version 813963 (0.0040) [2024-06-25 04:51:43,106][15401] Updated weights for policy 0, policy_version 813973 (0.0030) [2024-06-25 04:51:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13336133632. Throughput: 0: 42408.3. Samples: 13336274640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:43,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 04:51:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813974_13336150016.pth... [2024-06-25 04:51:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813351_13325942784.pth [2024-06-25 04:51:46,349][15401] Updated weights for policy 0, policy_version 813983 (0.0025) [2024-06-25 04:51:48,391][15132] Fps is (10 sec: 40953.8, 60 sec: 42597.4, 300 sec: 42598.2). Total num frames: 13336363008. Throughput: 0: 42260.4. Samples: 13336522260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:48,391][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 04:51:50,873][15401] Updated weights for policy 0, policy_version 813993 (0.0033) [2024-06-25 04:51:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13336576000. Throughput: 0: 42261.3. Samples: 13336653220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 04:51:54,135][15401] Updated weights for policy 0, policy_version 814003 (0.0031) [2024-06-25 04:51:58,390][15132] Fps is (10 sec: 40966.1, 60 sec: 42326.6, 300 sec: 42598.4). Total num frames: 13336772608. Throughput: 0: 42158.5. Samples: 13336906640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:51:58,392][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 04:51:58,497][15401] Updated weights for policy 0, policy_version 814013 (0.0031) [2024-06-25 04:52:01,797][15401] Updated weights for policy 0, policy_version 814023 (0.0031) [2024-06-25 04:52:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13337018368. Throughput: 0: 42033.8. Samples: 13337154680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 04:52:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 04:52:06,159][15401] Updated weights for policy 0, policy_version 814033 (0.0033) [2024-06-25 04:52:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13337198592. Throughput: 0: 42071.2. Samples: 13337287980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 04:52:09,925][15401] Updated weights for policy 0, policy_version 814043 (0.0036) [2024-06-25 04:52:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13337411584. Throughput: 0: 42057.4. Samples: 13337537360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:13,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-25 04:52:13,728][15401] Updated weights for policy 0, policy_version 814053 (0.0024) [2024-06-25 04:52:17,742][15401] Updated weights for policy 0, policy_version 814063 (0.0029) [2024-06-25 04:52:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13337640960. Throughput: 0: 42260.4. Samples: 13337791980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:18,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 04:52:21,867][15401] Updated weights for policy 0, policy_version 814073 (0.0027) [2024-06-25 04:52:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 13337821184. Throughput: 0: 42204.4. Samples: 13337919900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 04:52:25,618][15401] Updated weights for policy 0, policy_version 814083 (0.0043) [2024-06-25 04:52:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13338050560. Throughput: 0: 42108.1. Samples: 13338169500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 04:52:29,442][15401] Updated weights for policy 0, policy_version 814093 (0.0039) [2024-06-25 04:52:33,282][15401] Updated weights for policy 0, policy_version 814103 (0.0028) [2024-06-25 04:52:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13338263552. Throughput: 0: 42523.3. Samples: 13338435740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:52:36,983][15401] Updated weights for policy 0, policy_version 814113 (0.0028) [2024-06-25 04:52:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 13338476544. Throughput: 0: 42457.7. Samples: 13338563820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 04:52:40,687][15401] Updated weights for policy 0, policy_version 814123 (0.0042) [2024-06-25 04:52:43,394][15132] Fps is (10 sec: 44215.6, 60 sec: 42868.1, 300 sec: 42708.8). Total num frames: 13338705920. Throughput: 0: 42320.5. Samples: 13338811260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:43,395][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 04:52:44,467][15401] Updated weights for policy 0, policy_version 814133 (0.0039) [2024-06-25 04:52:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42326.4, 300 sec: 42598.4). Total num frames: 13338902528. Throughput: 0: 42744.0. Samples: 13339078160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:48,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 04:52:48,437][15401] Updated weights for policy 0, policy_version 814143 (0.0034) [2024-06-25 04:52:52,270][15401] Updated weights for policy 0, policy_version 814153 (0.0035) [2024-06-25 04:52:53,389][15132] Fps is (10 sec: 39340.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 13339099136. Throughput: 0: 42516.5. Samples: 13339201220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 04:52:56,490][15401] Updated weights for policy 0, policy_version 814163 (0.0039) [2024-06-25 04:52:57,317][15349] Signal inference workers to stop experience collection... (197450 times) [2024-06-25 04:52:57,317][15349] Signal inference workers to resume experience collection... (197450 times) [2024-06-25 04:52:57,366][15401] InferenceWorker_p0-w0: stopping experience collection (197450 times) [2024-06-25 04:52:57,367][15401] InferenceWorker_p0-w0: resuming experience collection (197450 times) [2024-06-25 04:52:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13339328512. Throughput: 0: 42461.7. Samples: 13339448140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:52:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 04:52:59,967][15401] Updated weights for policy 0, policy_version 814173 (0.0039) [2024-06-25 04:53:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13339557888. Throughput: 0: 42716.0. Samples: 13339714200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 04:53:03,976][15401] Updated weights for policy 0, policy_version 814183 (0.0039) [2024-06-25 04:53:07,580][15401] Updated weights for policy 0, policy_version 814193 (0.0042) [2024-06-25 04:53:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13339754496. Throughput: 0: 42670.8. Samples: 13339840080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:08,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 04:53:11,573][15401] Updated weights for policy 0, policy_version 814203 (0.0040) [2024-06-25 04:53:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13339983872. Throughput: 0: 42710.6. Samples: 13340091480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:13,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 04:53:15,312][15401] Updated weights for policy 0, policy_version 814213 (0.0044) [2024-06-25 04:53:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13340180480. Throughput: 0: 42652.4. Samples: 13340355100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 04:53:19,209][15401] Updated weights for policy 0, policy_version 814223 (0.0024) [2024-06-25 04:53:23,052][15401] Updated weights for policy 0, policy_version 814233 (0.0034) [2024-06-25 04:53:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13340393472. Throughput: 0: 42480.9. Samples: 13340475460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 04:53:26,936][15401] Updated weights for policy 0, policy_version 814243 (0.0035) [2024-06-25 04:53:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13340639232. Throughput: 0: 42691.6. Samples: 13340732180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 04:53:30,797][15401] Updated weights for policy 0, policy_version 814253 (0.0034) [2024-06-25 04:53:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42543.2). Total num frames: 13340835840. Throughput: 0: 42530.5. Samples: 13340992040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 04:53:34,529][15401] Updated weights for policy 0, policy_version 814263 (0.0054) [2024-06-25 04:53:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13341032448. Throughput: 0: 42501.2. Samples: 13341113780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:53:38,428][15401] Updated weights for policy 0, policy_version 814273 (0.0045) [2024-06-25 04:53:42,310][15401] Updated weights for policy 0, policy_version 814283 (0.0026) [2024-06-25 04:53:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42874.8, 300 sec: 42709.8). Total num frames: 13341278208. Throughput: 0: 42765.7. Samples: 13341372600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 04:53:43,496][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814288_13341294592.pth... [2024-06-25 04:53:43,544][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813662_13331038208.pth [2024-06-25 04:53:46,461][15401] Updated weights for policy 0, policy_version 814293 (0.0026) [2024-06-25 04:53:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13341458432. Throughput: 0: 42601.3. Samples: 13341631260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 04:53:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 04:53:49,829][15401] Updated weights for policy 0, policy_version 814303 (0.0027) [2024-06-25 04:53:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 13341671424. Throughput: 0: 42543.4. Samples: 13341754540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:53:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 04:53:54,014][15401] Updated weights for policy 0, policy_version 814313 (0.0032) [2024-06-25 04:53:57,650][15401] Updated weights for policy 0, policy_version 814323 (0.0040) [2024-06-25 04:53:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13341917184. Throughput: 0: 42799.2. Samples: 13342017440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:53:58,396][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 04:54:01,710][15401] Updated weights for policy 0, policy_version 814333 (0.0033) [2024-06-25 04:54:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 13342081024. Throughput: 0: 42685.3. Samples: 13342275940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:03,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 04:54:05,144][15401] Updated weights for policy 0, policy_version 814343 (0.0040) [2024-06-25 04:54:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.5). Total num frames: 13342310400. Throughput: 0: 42645.0. Samples: 13342394480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:08,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 04:54:09,356][15401] Updated weights for policy 0, policy_version 814353 (0.0028) [2024-06-25 04:54:12,848][15401] Updated weights for policy 0, policy_version 814363 (0.0039) [2024-06-25 04:54:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13342556160. Throughput: 0: 42776.0. Samples: 13342657100. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 04:54:16,899][15401] Updated weights for policy 0, policy_version 814373 (0.0036) [2024-06-25 04:54:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13342720000. Throughput: 0: 42772.6. Samples: 13342916800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:18,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 04:54:20,667][15401] Updated weights for policy 0, policy_version 814383 (0.0034) [2024-06-25 04:54:21,660][15349] Signal inference workers to stop experience collection... (197500 times) [2024-06-25 04:54:21,708][15401] InferenceWorker_p0-w0: stopping experience collection (197500 times) [2024-06-25 04:54:21,713][15349] Signal inference workers to resume experience collection... (197500 times) [2024-06-25 04:54:21,732][15401] InferenceWorker_p0-w0: resuming experience collection (197500 times) [2024-06-25 04:54:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13342965760. Throughput: 0: 42776.9. Samples: 13343038740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 04:54:24,555][15401] Updated weights for policy 0, policy_version 814393 (0.0024) [2024-06-25 04:54:28,302][15401] Updated weights for policy 0, policy_version 814403 (0.0035) [2024-06-25 04:54:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13343178752. Throughput: 0: 42880.1. Samples: 13343302200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 04:54:32,230][15401] Updated weights for policy 0, policy_version 814413 (0.0028) [2024-06-25 04:54:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13343375360. Throughput: 0: 42723.1. Samples: 13343553800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:33,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 04:54:35,950][15401] Updated weights for policy 0, policy_version 814423 (0.0038) [2024-06-25 04:54:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 13343621120. Throughput: 0: 42791.5. Samples: 13343680160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 04:54:39,760][15401] Updated weights for policy 0, policy_version 814433 (0.0029) [2024-06-25 04:54:43,392][15132] Fps is (10 sec: 42589.4, 60 sec: 42050.8, 300 sec: 42487.0). Total num frames: 13343801344. Throughput: 0: 42673.9. Samples: 13343937860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:43,392][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 04:54:43,657][15401] Updated weights for policy 0, policy_version 814443 (0.0049) [2024-06-25 04:54:47,723][15401] Updated weights for policy 0, policy_version 814453 (0.0031) [2024-06-25 04:54:48,391][15132] Fps is (10 sec: 40956.3, 60 sec: 42870.7, 300 sec: 42653.8). Total num frames: 13344030720. Throughput: 0: 42561.2. Samples: 13344191240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:48,391][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 04:54:51,125][15401] Updated weights for policy 0, policy_version 814463 (0.0028) [2024-06-25 04:54:53,390][15132] Fps is (10 sec: 44245.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 13344243712. Throughput: 0: 42762.9. Samples: 13344318820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 04:54:55,447][15401] Updated weights for policy 0, policy_version 814473 (0.0036) [2024-06-25 04:54:58,390][15132] Fps is (10 sec: 42602.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13344456704. Throughput: 0: 42671.1. Samples: 13344577300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:54:58,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-25 04:54:58,797][15401] Updated weights for policy 0, policy_version 814483 (0.0033) [2024-06-25 04:55:03,183][15401] Updated weights for policy 0, policy_version 814493 (0.0040) [2024-06-25 04:55:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 13344653312. Throughput: 0: 42548.3. Samples: 13344831480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 04:55:06,331][15401] Updated weights for policy 0, policy_version 814503 (0.0036) [2024-06-25 04:55:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42542.6). Total num frames: 13344882688. Throughput: 0: 42525.4. Samples: 13344952480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:08,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 04:55:10,804][15401] Updated weights for policy 0, policy_version 814513 (0.0033) [2024-06-25 04:55:13,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13345112064. Throughput: 0: 42541.3. Samples: 13345216560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:13,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 04:55:13,984][15401] Updated weights for policy 0, policy_version 814523 (0.0038) [2024-06-25 04:55:18,305][15401] Updated weights for policy 0, policy_version 814533 (0.0034) [2024-06-25 04:55:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13345308672. Throughput: 0: 42562.7. Samples: 13345469120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 04:55:22,164][15401] Updated weights for policy 0, policy_version 814543 (0.0026) [2024-06-25 04:55:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13345521664. Throughput: 0: 42457.1. Samples: 13345590720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 04:55:25,904][15401] Updated weights for policy 0, policy_version 814553 (0.0034) [2024-06-25 04:55:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13345751040. Throughput: 0: 42640.7. Samples: 13345856600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 04:55:29,887][15401] Updated weights for policy 0, policy_version 814563 (0.0029) [2024-06-25 04:55:33,341][15349] Signal inference workers to stop experience collection... (197550 times) [2024-06-25 04:55:33,341][15349] Signal inference workers to resume experience collection... (197550 times) [2024-06-25 04:55:33,376][15401] InferenceWorker_p0-w0: stopping experience collection (197550 times) [2024-06-25 04:55:33,376][15401] InferenceWorker_p0-w0: resuming experience collection (197550 times) [2024-06-25 04:55:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13345947648. Throughput: 0: 42676.7. Samples: 13346111640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 04:55:33,492][15401] Updated weights for policy 0, policy_version 814573 (0.0042) [2024-06-25 04:55:37,581][15401] Updated weights for policy 0, policy_version 814583 (0.0034) [2024-06-25 04:55:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13346177024. Throughput: 0: 42596.2. Samples: 13346235640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 04:55:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 04:55:41,203][15401] Updated weights for policy 0, policy_version 814593 (0.0033) [2024-06-25 04:55:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.1, 300 sec: 42654.0). Total num frames: 13346390016. Throughput: 0: 42787.2. Samples: 13346502720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:55:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 04:55:43,438][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814600_13346406400.pth... [2024-06-25 04:55:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000813974_13336150016.pth [2024-06-25 04:55:45,178][15401] Updated weights for policy 0, policy_version 814603 (0.0040) [2024-06-25 04:55:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42599.2, 300 sec: 42598.4). Total num frames: 13346586624. Throughput: 0: 42833.5. Samples: 13346758980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:55:48,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 04:55:49,019][15401] Updated weights for policy 0, policy_version 814613 (0.0023) [2024-06-25 04:55:52,725][15401] Updated weights for policy 0, policy_version 814623 (0.0040) [2024-06-25 04:55:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42654.2). Total num frames: 13346816000. Throughput: 0: 42824.9. Samples: 13346879500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:55:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 04:55:56,543][15401] Updated weights for policy 0, policy_version 814633 (0.0033) [2024-06-25 04:55:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13347028992. Throughput: 0: 42870.4. Samples: 13347145720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:55:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 04:56:00,363][15401] Updated weights for policy 0, policy_version 814643 (0.0032) [2024-06-25 04:56:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 13347225600. Throughput: 0: 42858.6. Samples: 13347397760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 04:56:04,233][15401] Updated weights for policy 0, policy_version 814653 (0.0033) [2024-06-25 04:56:07,927][15401] Updated weights for policy 0, policy_version 814663 (0.0050) [2024-06-25 04:56:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 13347438592. Throughput: 0: 42943.0. Samples: 13347523160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 04:56:12,096][15401] Updated weights for policy 0, policy_version 814673 (0.0043) [2024-06-25 04:56:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13347651584. Throughput: 0: 42755.2. Samples: 13347780580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 04:56:15,880][15401] Updated weights for policy 0, policy_version 814683 (0.0036) [2024-06-25 04:56:18,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 13347880960. Throughput: 0: 42717.6. Samples: 13348034040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:18,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 04:56:19,683][15401] Updated weights for policy 0, policy_version 814693 (0.0035) [2024-06-25 04:56:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13348077568. Throughput: 0: 42913.8. Samples: 13348166760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 04:56:23,493][15401] Updated weights for policy 0, policy_version 814703 (0.0033) [2024-06-25 04:56:27,156][15401] Updated weights for policy 0, policy_version 814713 (0.0037) [2024-06-25 04:56:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13348306944. Throughput: 0: 42784.0. Samples: 13348428000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 04:56:31,018][15401] Updated weights for policy 0, policy_version 814723 (0.0046) [2024-06-25 04:56:32,641][15349] Signal inference workers to stop experience collection... (197600 times) [2024-06-25 04:56:32,692][15401] InferenceWorker_p0-w0: stopping experience collection (197600 times) [2024-06-25 04:56:32,759][15349] Signal inference workers to resume experience collection... (197600 times) [2024-06-25 04:56:32,760][15401] InferenceWorker_p0-w0: resuming experience collection (197600 times) [2024-06-25 04:56:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13348519936. Throughput: 0: 42867.6. Samples: 13348688020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 04:56:34,755][15401] Updated weights for policy 0, policy_version 814733 (0.0027) [2024-06-25 04:56:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13348732928. Throughput: 0: 42935.1. Samples: 13348811580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 04:56:38,459][15401] Updated weights for policy 0, policy_version 814743 (0.0043) [2024-06-25 04:56:42,385][15401] Updated weights for policy 0, policy_version 814753 (0.0042) [2024-06-25 04:56:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 13348945920. Throughput: 0: 42706.1. Samples: 13349067500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 04:56:46,045][15401] Updated weights for policy 0, policy_version 814763 (0.0021) [2024-06-25 04:56:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13349158912. Throughput: 0: 42831.2. Samples: 13349325160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:48,398][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 04:56:50,370][15401] Updated weights for policy 0, policy_version 814773 (0.0028) [2024-06-25 04:56:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13349371904. Throughput: 0: 42745.0. Samples: 13349446680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 04:56:53,672][15401] Updated weights for policy 0, policy_version 814783 (0.0049) [2024-06-25 04:56:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13349568512. Throughput: 0: 42670.7. Samples: 13349700760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:56:58,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-25 04:56:58,400][15401] Updated weights for policy 0, policy_version 814793 (0.0029) [2024-06-25 04:57:01,346][15401] Updated weights for policy 0, policy_version 814803 (0.0036) [2024-06-25 04:57:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13349781504. Throughput: 0: 42557.8. Samples: 13349949040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:57:03,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 04:57:06,045][15401] Updated weights for policy 0, policy_version 814813 (0.0035) [2024-06-25 04:57:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13350010880. Throughput: 0: 42415.8. Samples: 13350075480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:57:08,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 04:57:08,899][15401] Updated weights for policy 0, policy_version 814823 (0.0025) [2024-06-25 04:57:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 13350174720. Throughput: 0: 42387.6. Samples: 13350335440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:57:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 04:57:13,739][15401] Updated weights for policy 0, policy_version 814833 (0.0028) [2024-06-25 04:57:16,540][15401] Updated weights for policy 0, policy_version 814843 (0.0030) [2024-06-25 04:57:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 13350404096. Throughput: 0: 42158.7. Samples: 13350585160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:57:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 04:57:21,399][15401] Updated weights for policy 0, policy_version 814853 (0.0029) [2024-06-25 04:57:23,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13350649856. Throughput: 0: 42296.0. Samples: 13350714900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 04:57:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 04:57:24,757][15401] Updated weights for policy 0, policy_version 814863 (0.0043) [2024-06-25 04:57:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 13350813696. Throughput: 0: 42303.1. Samples: 13350971140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:28,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 04:57:29,061][15401] Updated weights for policy 0, policy_version 814873 (0.0028) [2024-06-25 04:57:33,032][15401] Updated weights for policy 0, policy_version 814883 (0.0021) [2024-06-25 04:57:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13351043072. Throughput: 0: 42122.7. Samples: 13351220680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:33,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 04:57:36,600][15349] Signal inference workers to stop experience collection... (197650 times) [2024-06-25 04:57:36,635][15401] InferenceWorker_p0-w0: stopping experience collection (197650 times) [2024-06-25 04:57:36,656][15349] Signal inference workers to resume experience collection... (197650 times) [2024-06-25 04:57:36,660][15401] InferenceWorker_p0-w0: resuming experience collection (197650 times) [2024-06-25 04:57:36,802][15401] Updated weights for policy 0, policy_version 814893 (0.0040) [2024-06-25 04:57:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.5, 300 sec: 42654.6). Total num frames: 13351288832. Throughput: 0: 42420.9. Samples: 13351355620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 04:57:40,528][15401] Updated weights for policy 0, policy_version 814903 (0.0035) [2024-06-25 04:57:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 13351452672. Throughput: 0: 42317.3. Samples: 13351605040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 04:57:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814909_13351469056.pth... [2024-06-25 04:57:43,573][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814288_13341294592.pth [2024-06-25 04:57:44,478][15401] Updated weights for policy 0, policy_version 814913 (0.0041) [2024-06-25 04:57:48,295][15401] Updated weights for policy 0, policy_version 814923 (0.0034) [2024-06-25 04:57:48,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 13351698432. Throughput: 0: 42419.1. Samples: 13351858000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:48,392][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 04:57:52,077][15401] Updated weights for policy 0, policy_version 814933 (0.0033) [2024-06-25 04:57:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13351911424. Throughput: 0: 42571.2. Samples: 13351991180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 04:57:55,842][15401] Updated weights for policy 0, policy_version 814943 (0.0034) [2024-06-25 04:57:58,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13352108032. Throughput: 0: 42368.4. Samples: 13352242020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:57:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 04:57:59,786][15401] Updated weights for policy 0, policy_version 814953 (0.0039) [2024-06-25 04:58:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13352337408. Throughput: 0: 42462.2. Samples: 13352495960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 04:58:03,509][15401] Updated weights for policy 0, policy_version 814963 (0.0027) [2024-06-25 04:58:07,384][15401] Updated weights for policy 0, policy_version 814973 (0.0033) [2024-06-25 04:58:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13352550400. Throughput: 0: 42506.7. Samples: 13352627700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 04:58:11,130][15401] Updated weights for policy 0, policy_version 814983 (0.0028) [2024-06-25 04:58:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13352747008. Throughput: 0: 42364.5. Samples: 13352877540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 04:58:15,007][15401] Updated weights for policy 0, policy_version 814993 (0.0035) [2024-06-25 04:58:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13352960000. Throughput: 0: 42598.0. Samples: 13353137600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:18,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 04:58:18,798][15401] Updated weights for policy 0, policy_version 815003 (0.0032) [2024-06-25 04:58:22,822][15401] Updated weights for policy 0, policy_version 815013 (0.0045) [2024-06-25 04:58:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13353205760. Throughput: 0: 42565.2. Samples: 13353271060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 04:58:26,424][15401] Updated weights for policy 0, policy_version 815023 (0.0036) [2024-06-25 04:58:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13353402368. Throughput: 0: 42510.5. Samples: 13353518020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 04:58:30,548][15401] Updated weights for policy 0, policy_version 815033 (0.0032) [2024-06-25 04:58:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13353598976. Throughput: 0: 42832.1. Samples: 13353785340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:33,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 04:58:34,051][15401] Updated weights for policy 0, policy_version 815043 (0.0029) [2024-06-25 04:58:38,202][15401] Updated weights for policy 0, policy_version 815053 (0.0044) [2024-06-25 04:58:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13353828352. Throughput: 0: 42672.9. Samples: 13353911460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 04:58:38,460][15349] Signal inference workers to stop experience collection... (197700 times) [2024-06-25 04:58:38,460][15349] Signal inference workers to resume experience collection... (197700 times) [2024-06-25 04:58:38,487][15401] InferenceWorker_p0-w0: stopping experience collection (197700 times) [2024-06-25 04:58:38,487][15401] InferenceWorker_p0-w0: resuming experience collection (197700 times) [2024-06-25 04:58:41,676][15401] Updated weights for policy 0, policy_version 815063 (0.0030) [2024-06-25 04:58:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13354041344. Throughput: 0: 42679.1. Samples: 13354162580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 04:58:45,651][15401] Updated weights for policy 0, policy_version 815073 (0.0032) [2024-06-25 04:58:48,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 13354221568. Throughput: 0: 42955.1. Samples: 13354428940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 04:58:49,596][15401] Updated weights for policy 0, policy_version 815083 (0.0037) [2024-06-25 04:58:53,311][15401] Updated weights for policy 0, policy_version 815093 (0.0033) [2024-06-25 04:58:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13354483712. Throughput: 0: 42704.4. Samples: 13354549400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 04:58:57,382][15401] Updated weights for policy 0, policy_version 815103 (0.0031) [2024-06-25 04:58:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13354680320. Throughput: 0: 42841.4. Samples: 13354805400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:58:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 04:59:00,940][15401] Updated weights for policy 0, policy_version 815113 (0.0028) [2024-06-25 04:59:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13354860544. Throughput: 0: 42811.2. Samples: 13355064100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:59:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 04:59:05,114][15401] Updated weights for policy 0, policy_version 815123 (0.0036) [2024-06-25 04:59:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13355122688. Throughput: 0: 42537.2. Samples: 13355185240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:59:08,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 04:59:08,620][15401] Updated weights for policy 0, policy_version 815133 (0.0043) [2024-06-25 04:59:12,807][15401] Updated weights for policy 0, policy_version 815143 (0.0030) [2024-06-25 04:59:13,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13355335680. Throughput: 0: 42865.5. Samples: 13355446960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-25 04:59:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 04:59:16,115][15401] Updated weights for policy 0, policy_version 815153 (0.0039) [2024-06-25 04:59:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 13355515904. Throughput: 0: 42564.1. Samples: 13355700720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 04:59:20,399][15401] Updated weights for policy 0, policy_version 815163 (0.0026) [2024-06-25 04:59:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13355778048. Throughput: 0: 42592.4. Samples: 13355828120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 04:59:24,069][15401] Updated weights for policy 0, policy_version 815173 (0.0040) [2024-06-25 04:59:28,053][15401] Updated weights for policy 0, policy_version 815183 (0.0028) [2024-06-25 04:59:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13355958272. Throughput: 0: 42777.9. Samples: 13356087580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:28,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 04:59:31,543][15401] Updated weights for policy 0, policy_version 815193 (0.0043) [2024-06-25 04:59:33,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 13356154880. Throughput: 0: 42473.2. Samples: 13356340240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 04:59:35,910][15401] Updated weights for policy 0, policy_version 815203 (0.0041) [2024-06-25 04:59:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 13356400640. Throughput: 0: 42585.8. Samples: 13356465760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 04:59:39,187][15401] Updated weights for policy 0, policy_version 815213 (0.0025) [2024-06-25 04:59:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 13356580864. Throughput: 0: 42664.4. Samples: 13356725300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 04:59:43,612][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815223_13356613632.pth... [2024-06-25 04:59:43,618][15401] Updated weights for policy 0, policy_version 815223 (0.0030) [2024-06-25 04:59:43,664][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814600_13346406400.pth [2024-06-25 04:59:47,246][15401] Updated weights for policy 0, policy_version 815233 (0.0024) [2024-06-25 04:59:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13356810240. Throughput: 0: 42492.9. Samples: 13356976280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:48,395][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 04:59:51,381][15401] Updated weights for policy 0, policy_version 815243 (0.0044) [2024-06-25 04:59:53,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 13357039616. Throughput: 0: 42711.1. Samples: 13357107340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:53,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 04:59:54,968][15401] Updated weights for policy 0, policy_version 815253 (0.0030) [2024-06-25 04:59:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 13357219840. Throughput: 0: 42604.3. Samples: 13357364260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 04:59:58,393][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 04:59:59,231][15401] Updated weights for policy 0, policy_version 815263 (0.0037) [2024-06-25 05:00:02,635][15401] Updated weights for policy 0, policy_version 815273 (0.0032) [2024-06-25 05:00:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42598.7). Total num frames: 13357449216. Throughput: 0: 42320.3. Samples: 13357605140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 05:00:07,093][15401] Updated weights for policy 0, policy_version 815283 (0.0042) [2024-06-25 05:00:08,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13357678592. Throughput: 0: 42372.0. Samples: 13357734860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:08,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 05:00:10,598][15401] Updated weights for policy 0, policy_version 815293 (0.0031) [2024-06-25 05:00:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 13357858816. Throughput: 0: 42401.3. Samples: 13357995640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 05:00:14,506][15349] Signal inference workers to stop experience collection... (197750 times) [2024-06-25 05:00:14,543][15401] InferenceWorker_p0-w0: stopping experience collection (197750 times) [2024-06-25 05:00:14,628][15349] Signal inference workers to resume experience collection... (197750 times) [2024-06-25 05:00:14,628][15401] InferenceWorker_p0-w0: resuming experience collection (197750 times) [2024-06-25 05:00:14,793][15401] Updated weights for policy 0, policy_version 815303 (0.0040) [2024-06-25 05:00:18,136][15401] Updated weights for policy 0, policy_version 815313 (0.0030) [2024-06-25 05:00:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13358088192. Throughput: 0: 42340.1. Samples: 13358245540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 05:00:22,392][15401] Updated weights for policy 0, policy_version 815323 (0.0034) [2024-06-25 05:00:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13358333952. Throughput: 0: 42596.1. Samples: 13358382580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 05:00:25,719][15401] Updated weights for policy 0, policy_version 815333 (0.0040) [2024-06-25 05:00:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13358497792. Throughput: 0: 42385.0. Samples: 13358632620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 05:00:29,940][15401] Updated weights for policy 0, policy_version 815343 (0.0033) [2024-06-25 05:00:33,249][15401] Updated weights for policy 0, policy_version 815353 (0.0031) [2024-06-25 05:00:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 13358743552. Throughput: 0: 42532.1. Samples: 13358890220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:33,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 05:00:37,714][15401] Updated weights for policy 0, policy_version 815363 (0.0031) [2024-06-25 05:00:38,396][15132] Fps is (10 sec: 44207.9, 60 sec: 42320.8, 300 sec: 42541.9). Total num frames: 13358940160. Throughput: 0: 42535.3. Samples: 13359021600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:38,397][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 05:00:40,877][15401] Updated weights for policy 0, policy_version 815373 (0.0048) [2024-06-25 05:00:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13359153152. Throughput: 0: 42538.2. Samples: 13359278380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:43,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 05:00:45,466][15401] Updated weights for policy 0, policy_version 815383 (0.0039) [2024-06-25 05:00:48,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13359382528. Throughput: 0: 42860.1. Samples: 13359533840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 05:00:48,495][15401] Updated weights for policy 0, policy_version 815393 (0.0029) [2024-06-25 05:00:53,025][15401] Updated weights for policy 0, policy_version 815403 (0.0046) [2024-06-25 05:00:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 13359562752. Throughput: 0: 42920.5. Samples: 13359666280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 05:00:56,176][15401] Updated weights for policy 0, policy_version 815413 (0.0042) [2024-06-25 05:00:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 13359808512. Throughput: 0: 42719.0. Samples: 13359918000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:00:58,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 05:01:00,468][15401] Updated weights for policy 0, policy_version 815423 (0.0023) [2024-06-25 05:01:03,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13360037888. Throughput: 0: 42952.5. Samples: 13360178400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 05:01:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 05:01:03,568][15401] Updated weights for policy 0, policy_version 815433 (0.0041) [2024-06-25 05:01:08,039][15401] Updated weights for policy 0, policy_version 815443 (0.0032) [2024-06-25 05:01:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13360218112. Throughput: 0: 42852.8. Samples: 13360310960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:08,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 05:01:11,452][15401] Updated weights for policy 0, policy_version 815453 (0.0024) [2024-06-25 05:01:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 13360447488. Throughput: 0: 42969.7. Samples: 13360566260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 05:01:15,678][15401] Updated weights for policy 0, policy_version 815463 (0.0028) [2024-06-25 05:01:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13360660480. Throughput: 0: 42960.5. Samples: 13360823440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:18,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 05:01:18,864][15401] Updated weights for policy 0, policy_version 815473 (0.0025) [2024-06-25 05:01:23,188][15401] Updated weights for policy 0, policy_version 815483 (0.0032) [2024-06-25 05:01:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13360873472. Throughput: 0: 43100.8. Samples: 13360960860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:23,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 05:01:26,898][15401] Updated weights for policy 0, policy_version 815493 (0.0036) [2024-06-25 05:01:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 13361086464. Throughput: 0: 42972.1. Samples: 13361212120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 05:01:30,732][15401] Updated weights for policy 0, policy_version 815503 (0.0036) [2024-06-25 05:01:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13361299456. Throughput: 0: 42960.0. Samples: 13361467040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 05:01:34,331][15349] Signal inference workers to stop experience collection... (197800 times) [2024-06-25 05:01:34,331][15349] Signal inference workers to resume experience collection... (197800 times) [2024-06-25 05:01:34,376][15401] InferenceWorker_p0-w0: stopping experience collection (197800 times) [2024-06-25 05:01:34,376][15401] InferenceWorker_p0-w0: resuming experience collection (197800 times) [2024-06-25 05:01:34,467][15401] Updated weights for policy 0, policy_version 815513 (0.0025) [2024-06-25 05:01:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 13361512448. Throughput: 0: 42962.1. Samples: 13361599580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 05:01:38,694][15401] Updated weights for policy 0, policy_version 815523 (0.0033) [2024-06-25 05:01:42,375][15401] Updated weights for policy 0, policy_version 815533 (0.0035) [2024-06-25 05:01:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13361725440. Throughput: 0: 43078.4. Samples: 13361856520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:43,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 05:01:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815536_13361741824.pth... [2024-06-25 05:01:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000814909_13351469056.pth [2024-06-25 05:01:46,269][15401] Updated weights for policy 0, policy_version 815543 (0.0030) [2024-06-25 05:01:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13361938432. Throughput: 0: 42847.5. Samples: 13362106540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 05:01:50,192][15401] Updated weights for policy 0, policy_version 815553 (0.0033) [2024-06-25 05:01:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13362151424. Throughput: 0: 42777.0. Samples: 13362235920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 05:01:54,047][15401] Updated weights for policy 0, policy_version 815563 (0.0036) [2024-06-25 05:01:58,067][15401] Updated weights for policy 0, policy_version 815573 (0.0035) [2024-06-25 05:01:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13362364416. Throughput: 0: 42795.5. Samples: 13362492060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:01:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 05:02:01,539][15401] Updated weights for policy 0, policy_version 815583 (0.0042) [2024-06-25 05:02:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13362593792. Throughput: 0: 42759.4. Samples: 13362747620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 05:02:05,557][15401] Updated weights for policy 0, policy_version 815593 (0.0025) [2024-06-25 05:02:08,392][15132] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 13362790400. Throughput: 0: 42576.2. Samples: 13362876880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:08,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 05:02:09,236][15401] Updated weights for policy 0, policy_version 815603 (0.0044) [2024-06-25 05:02:13,027][15401] Updated weights for policy 0, policy_version 815613 (0.0036) [2024-06-25 05:02:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13363003392. Throughput: 0: 42565.3. Samples: 13363127560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 05:02:17,002][15401] Updated weights for policy 0, policy_version 815623 (0.0035) [2024-06-25 05:02:18,390][15132] Fps is (10 sec: 42607.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13363216384. Throughput: 0: 42624.7. Samples: 13363385160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 05:02:20,511][15401] Updated weights for policy 0, policy_version 815633 (0.0022) [2024-06-25 05:02:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13363412992. Throughput: 0: 42593.0. Samples: 13363516260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 05:02:24,522][15401] Updated weights for policy 0, policy_version 815643 (0.0036) [2024-06-25 05:02:28,385][15401] Updated weights for policy 0, policy_version 815653 (0.0043) [2024-06-25 05:02:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13363658752. Throughput: 0: 42467.1. Samples: 13363767540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:28,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 05:02:31,979][15401] Updated weights for policy 0, policy_version 815663 (0.0048) [2024-06-25 05:02:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13363871744. Throughput: 0: 42585.2. Samples: 13364022880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:33,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 05:02:36,181][15401] Updated weights for policy 0, policy_version 815673 (0.0031) [2024-06-25 05:02:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13364051968. Throughput: 0: 42640.4. Samples: 13364154740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 05:02:39,763][15401] Updated weights for policy 0, policy_version 815683 (0.0034) [2024-06-25 05:02:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13364297728. Throughput: 0: 42476.0. Samples: 13364403480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:43,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-25 05:02:43,826][15401] Updated weights for policy 0, policy_version 815693 (0.0033) [2024-06-25 05:02:47,462][15401] Updated weights for policy 0, policy_version 815703 (0.0030) [2024-06-25 05:02:48,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13364527104. Throughput: 0: 42429.5. Samples: 13364656940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 05:02:48,390][15132] Avg episode reward: [(0, '0.263')] [2024-06-25 05:02:51,376][15401] Updated weights for policy 0, policy_version 815713 (0.0031) [2024-06-25 05:02:53,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13364690944. Throughput: 0: 42603.9. Samples: 13364793960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:02:53,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 05:02:55,093][15349] Signal inference workers to stop experience collection... (197850 times) [2024-06-25 05:02:55,093][15349] Signal inference workers to resume experience collection... (197850 times) [2024-06-25 05:02:55,135][15401] InferenceWorker_p0-w0: stopping experience collection (197850 times) [2024-06-25 05:02:55,135][15401] InferenceWorker_p0-w0: resuming experience collection (197850 times) [2024-06-25 05:02:55,242][15401] Updated weights for policy 0, policy_version 815723 (0.0029) [2024-06-25 05:02:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13364936704. Throughput: 0: 42657.7. Samples: 13365047160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:02:58,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 05:02:58,931][15401] Updated weights for policy 0, policy_version 815733 (0.0031) [2024-06-25 05:03:02,913][15401] Updated weights for policy 0, policy_version 815743 (0.0026) [2024-06-25 05:03:03,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13365166080. Throughput: 0: 42698.3. Samples: 13365306580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 05:03:06,343][15401] Updated weights for policy 0, policy_version 815753 (0.0029) [2024-06-25 05:03:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 13365346304. Throughput: 0: 42763.9. Samples: 13365440640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:08,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 05:03:10,714][15401] Updated weights for policy 0, policy_version 815763 (0.0033) [2024-06-25 05:03:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 13365575680. Throughput: 0: 42772.9. Samples: 13365692320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 05:03:13,942][15401] Updated weights for policy 0, policy_version 815773 (0.0031) [2024-06-25 05:03:18,363][15401] Updated weights for policy 0, policy_version 815783 (0.0055) [2024-06-25 05:03:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13365788672. Throughput: 0: 43053.4. Samples: 13365960280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 05:03:21,849][15401] Updated weights for policy 0, policy_version 815793 (0.0025) [2024-06-25 05:03:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13365985280. Throughput: 0: 42801.0. Samples: 13366080780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 05:03:25,965][15401] Updated weights for policy 0, policy_version 815803 (0.0037) [2024-06-25 05:03:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13366231040. Throughput: 0: 43098.7. Samples: 13366342920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 05:03:29,280][15401] Updated weights for policy 0, policy_version 815813 (0.0036) [2024-06-25 05:03:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 13366411264. Throughput: 0: 43198.7. Samples: 13366600880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 05:03:33,655][15401] Updated weights for policy 0, policy_version 815823 (0.0033) [2024-06-25 05:03:37,225][15401] Updated weights for policy 0, policy_version 815833 (0.0049) [2024-06-25 05:03:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13366640640. Throughput: 0: 42859.9. Samples: 13366722660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 05:03:41,086][15401] Updated weights for policy 0, policy_version 815843 (0.0039) [2024-06-25 05:03:43,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13366886400. Throughput: 0: 42987.2. Samples: 13366981580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:43,399][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 05:03:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815850_13366886400.pth... [2024-06-25 05:03:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815223_13356613632.pth [2024-06-25 05:03:45,415][15401] Updated weights for policy 0, policy_version 815853 (0.0032) [2024-06-25 05:03:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13367066624. Throughput: 0: 42774.3. Samples: 13367231420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 05:03:49,052][15401] Updated weights for policy 0, policy_version 815863 (0.0032) [2024-06-25 05:03:53,004][15401] Updated weights for policy 0, policy_version 815873 (0.0041) [2024-06-25 05:03:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13367279616. Throughput: 0: 42609.8. Samples: 13367358080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 05:03:56,586][15401] Updated weights for policy 0, policy_version 815883 (0.0037) [2024-06-25 05:03:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13367492608. Throughput: 0: 42904.4. Samples: 13367623020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:03:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 05:04:00,518][15401] Updated weights for policy 0, policy_version 815893 (0.0029) [2024-06-25 05:04:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13367721984. Throughput: 0: 42565.4. Samples: 13367875720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 05:04:04,375][15401] Updated weights for policy 0, policy_version 815903 (0.0033) [2024-06-25 05:04:07,958][15401] Updated weights for policy 0, policy_version 815913 (0.0033) [2024-06-25 05:04:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13367934976. Throughput: 0: 42699.9. Samples: 13368002280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 05:04:12,065][15401] Updated weights for policy 0, policy_version 815923 (0.0033) [2024-06-25 05:04:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 13368131584. Throughput: 0: 42671.4. Samples: 13368263140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:13,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 05:04:13,920][15349] Signal inference workers to stop experience collection... (197900 times) [2024-06-25 05:04:13,920][15349] Signal inference workers to resume experience collection... (197900 times) [2024-06-25 05:04:13,976][15401] InferenceWorker_p0-w0: stopping experience collection (197900 times) [2024-06-25 05:04:13,976][15401] InferenceWorker_p0-w0: resuming experience collection (197900 times) [2024-06-25 05:04:15,345][15401] Updated weights for policy 0, policy_version 815933 (0.0024) [2024-06-25 05:04:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13368344576. Throughput: 0: 42726.2. Samples: 13368523560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 05:04:19,845][15401] Updated weights for policy 0, policy_version 815943 (0.0035) [2024-06-25 05:04:23,100][15401] Updated weights for policy 0, policy_version 815953 (0.0038) [2024-06-25 05:04:23,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13368590336. Throughput: 0: 42793.8. Samples: 13368648380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:23,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 05:04:27,543][15401] Updated weights for policy 0, policy_version 815963 (0.0043) [2024-06-25 05:04:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13368770560. Throughput: 0: 42850.3. Samples: 13368909840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:28,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 05:04:30,630][15401] Updated weights for policy 0, policy_version 815973 (0.0034) [2024-06-25 05:04:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13368983552. Throughput: 0: 42811.0. Samples: 13369157920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:04:35,594][15401] Updated weights for policy 0, policy_version 815983 (0.0027) [2024-06-25 05:04:38,198][15401] Updated weights for policy 0, policy_version 815993 (0.0029) [2024-06-25 05:04:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13369229312. Throughput: 0: 43000.0. Samples: 13369293080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:04:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 05:04:43,193][15401] Updated weights for policy 0, policy_version 816003 (0.0047) [2024-06-25 05:04:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13369409536. Throughput: 0: 42721.8. Samples: 13369545500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:04:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 05:04:45,834][15401] Updated weights for policy 0, policy_version 816013 (0.0032) [2024-06-25 05:04:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13369638912. Throughput: 0: 42713.7. Samples: 13369797840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:04:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 05:04:50,796][15401] Updated weights for policy 0, policy_version 816023 (0.0025) [2024-06-25 05:04:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13369851904. Throughput: 0: 42796.9. Samples: 13369928140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:04:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 05:04:53,771][15401] Updated weights for policy 0, policy_version 816033 (0.0036) [2024-06-25 05:04:58,282][15401] Updated weights for policy 0, policy_version 816043 (0.0033) [2024-06-25 05:04:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13370048512. Throughput: 0: 42682.8. Samples: 13370183860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:04:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 05:05:01,427][15401] Updated weights for policy 0, policy_version 816053 (0.0045) [2024-06-25 05:05:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13370294272. Throughput: 0: 42480.7. Samples: 13370435200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:03,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 05:05:05,897][15401] Updated weights for policy 0, policy_version 816063 (0.0032) [2024-06-25 05:05:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13370490880. Throughput: 0: 42705.7. Samples: 13370570140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 05:05:09,103][15401] Updated weights for policy 0, policy_version 816073 (0.0035) [2024-06-25 05:05:13,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 13370687488. Throughput: 0: 42475.0. Samples: 13370821320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:13,393][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 05:05:13,794][15401] Updated weights for policy 0, policy_version 816083 (0.0046) [2024-06-25 05:05:16,869][15401] Updated weights for policy 0, policy_version 816093 (0.0030) [2024-06-25 05:05:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13370916864. Throughput: 0: 42505.2. Samples: 13371070660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 05:05:21,559][15401] Updated weights for policy 0, policy_version 816103 (0.0042) [2024-06-25 05:05:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 13371113472. Throughput: 0: 42582.3. Samples: 13371209280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 05:05:23,423][15349] Signal inference workers to stop experience collection... (197950 times) [2024-06-25 05:05:23,472][15401] InferenceWorker_p0-w0: stopping experience collection (197950 times) [2024-06-25 05:05:23,480][15349] Signal inference workers to resume experience collection... (197950 times) [2024-06-25 05:05:23,492][15401] InferenceWorker_p0-w0: resuming experience collection (197950 times) [2024-06-25 05:05:24,209][15401] Updated weights for policy 0, policy_version 816113 (0.0060) [2024-06-25 05:05:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13371326464. Throughput: 0: 42619.1. Samples: 13371463360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 05:05:29,211][15401] Updated weights for policy 0, policy_version 816123 (0.0038) [2024-06-25 05:05:32,276][15401] Updated weights for policy 0, policy_version 816133 (0.0042) [2024-06-25 05:05:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 13371572224. Throughput: 0: 42595.5. Samples: 13371714640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 05:05:36,850][15401] Updated weights for policy 0, policy_version 816143 (0.0040) [2024-06-25 05:05:38,390][15132] Fps is (10 sec: 42596.4, 60 sec: 42052.0, 300 sec: 42709.4). Total num frames: 13371752448. Throughput: 0: 42643.1. Samples: 13371847100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 05:05:39,900][15401] Updated weights for policy 0, policy_version 816153 (0.0025) [2024-06-25 05:05:43,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13371965440. Throughput: 0: 42503.3. Samples: 13372096520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:43,396][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 05:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816160_13371965440.pth... [2024-06-25 05:05:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815536_13361741824.pth [2024-06-25 05:05:44,297][15401] Updated weights for policy 0, policy_version 816163 (0.0046) [2024-06-25 05:05:47,765][15401] Updated weights for policy 0, policy_version 816173 (0.0038) [2024-06-25 05:05:48,389][15132] Fps is (10 sec: 44239.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13372194816. Throughput: 0: 42622.9. Samples: 13372353220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 05:05:51,937][15401] Updated weights for policy 0, policy_version 816183 (0.0046) [2024-06-25 05:05:53,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13372391424. Throughput: 0: 42511.6. Samples: 13372483160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 05:05:55,563][15401] Updated weights for policy 0, policy_version 816193 (0.0036) [2024-06-25 05:05:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13372620800. Throughput: 0: 42474.2. Samples: 13372732560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:05:58,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 05:05:59,899][15401] Updated weights for policy 0, policy_version 816203 (0.0032) [2024-06-25 05:06:03,187][15401] Updated weights for policy 0, policy_version 816213 (0.0035) [2024-06-25 05:06:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13372833792. Throughput: 0: 42620.5. Samples: 13372988580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 05:06:07,479][15401] Updated weights for policy 0, policy_version 816223 (0.0027) [2024-06-25 05:06:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13373030400. Throughput: 0: 42403.9. Samples: 13373117460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 05:06:11,179][15401] Updated weights for policy 0, policy_version 816233 (0.0032) [2024-06-25 05:06:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13373259776. Throughput: 0: 42232.4. Samples: 13373363820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 05:06:15,182][15401] Updated weights for policy 0, policy_version 816243 (0.0033) [2024-06-25 05:06:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 13373440000. Throughput: 0: 42357.9. Samples: 13373620740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 05:06:18,962][15401] Updated weights for policy 0, policy_version 816253 (0.0027) [2024-06-25 05:06:23,137][15401] Updated weights for policy 0, policy_version 816263 (0.0044) [2024-06-25 05:06:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13373652992. Throughput: 0: 42108.0. Samples: 13373741940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:23,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 05:06:26,558][15401] Updated weights for policy 0, policy_version 816273 (0.0034) [2024-06-25 05:06:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 13373898752. Throughput: 0: 42253.5. Samples: 13373997920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 05:06:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:06:30,925][15401] Updated weights for policy 0, policy_version 816283 (0.0024) [2024-06-25 05:06:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 13374078976. Throughput: 0: 42355.5. Samples: 13374259220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 05:06:34,177][15401] Updated weights for policy 0, policy_version 816293 (0.0032) [2024-06-25 05:06:38,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.6, 300 sec: 42542.9). Total num frames: 13374275584. Throughput: 0: 42097.9. Samples: 13374377560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:38,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-25 05:06:38,740][15401] Updated weights for policy 0, policy_version 816303 (0.0036) [2024-06-25 05:06:42,038][15401] Updated weights for policy 0, policy_version 816313 (0.0035) [2024-06-25 05:06:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 13374537728. Throughput: 0: 42218.4. Samples: 13374632380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:43,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-25 05:06:46,584][15401] Updated weights for policy 0, policy_version 816323 (0.0029) [2024-06-25 05:06:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 13374701568. Throughput: 0: 42377.8. Samples: 13374895580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:48,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 05:06:48,884][15349] Signal inference workers to stop experience collection... (198000 times) [2024-06-25 05:06:48,920][15401] InferenceWorker_p0-w0: stopping experience collection (198000 times) [2024-06-25 05:06:48,948][15349] Signal inference workers to resume experience collection... (198000 times) [2024-06-25 05:06:48,951][15401] InferenceWorker_p0-w0: resuming experience collection (198000 times) [2024-06-25 05:06:49,757][15401] Updated weights for policy 0, policy_version 816333 (0.0035) [2024-06-25 05:06:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13374930944. Throughput: 0: 42015.1. Samples: 13375008140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 05:06:54,220][15401] Updated weights for policy 0, policy_version 816343 (0.0037) [2024-06-25 05:06:57,457][15401] Updated weights for policy 0, policy_version 816353 (0.0041) [2024-06-25 05:06:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13375176704. Throughput: 0: 42354.1. Samples: 13375269760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:06:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 05:07:01,994][15401] Updated weights for policy 0, policy_version 816363 (0.0032) [2024-06-25 05:07:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42543.2). Total num frames: 13375340544. Throughput: 0: 42515.1. Samples: 13375533920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 05:07:05,084][15401] Updated weights for policy 0, policy_version 816373 (0.0032) [2024-06-25 05:07:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13375569920. Throughput: 0: 42325.8. Samples: 13375646600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 05:07:09,978][15401] Updated weights for policy 0, policy_version 816383 (0.0050) [2024-06-25 05:07:12,741][15401] Updated weights for policy 0, policy_version 816393 (0.0037) [2024-06-25 05:07:13,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13375815680. Throughput: 0: 42372.1. Samples: 13375904660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 05:07:17,810][15401] Updated weights for policy 0, policy_version 816403 (0.0026) [2024-06-25 05:07:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13375979520. Throughput: 0: 42647.5. Samples: 13376178360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:18,399][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 05:07:20,237][15401] Updated weights for policy 0, policy_version 816413 (0.0044) [2024-06-25 05:07:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13376225280. Throughput: 0: 42435.5. Samples: 13376287160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:23,399][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 05:07:25,663][15401] Updated weights for policy 0, policy_version 816423 (0.0034) [2024-06-25 05:07:27,968][15401] Updated weights for policy 0, policy_version 816433 (0.0049) [2024-06-25 05:07:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13376454656. Throughput: 0: 42630.5. Samples: 13376550760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 05:07:33,331][15401] Updated weights for policy 0, policy_version 816443 (0.0044) [2024-06-25 05:07:33,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 13376602112. Throughput: 0: 42812.8. Samples: 13376822260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:33,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:07:35,560][15401] Updated weights for policy 0, policy_version 816453 (0.0036) [2024-06-25 05:07:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13376864256. Throughput: 0: 42694.3. Samples: 13376929380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:38,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 05:07:40,893][15401] Updated weights for policy 0, policy_version 816463 (0.0031) [2024-06-25 05:07:43,136][15401] Updated weights for policy 0, policy_version 816473 (0.0031) [2024-06-25 05:07:43,390][15132] Fps is (10 sec: 50801.5, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 13377110016. Throughput: 0: 42813.7. Samples: 13377196380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 05:07:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816474_13377110016.pth... [2024-06-25 05:07:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000815850_13366886400.pth [2024-06-25 05:07:48,226][15401] Updated weights for policy 0, policy_version 816483 (0.0043) [2024-06-25 05:07:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13377257472. Throughput: 0: 42874.2. Samples: 13377463260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:48,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 05:07:50,679][15401] Updated weights for policy 0, policy_version 816493 (0.0035) [2024-06-25 05:07:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13377519616. Throughput: 0: 42915.9. Samples: 13377577820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:53,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 05:07:55,666][15401] Updated weights for policy 0, policy_version 816503 (0.0033) [2024-06-25 05:07:57,282][15349] Signal inference workers to stop experience collection... (198050 times) [2024-06-25 05:07:57,288][15349] Signal inference workers to resume experience collection... (198050 times) [2024-06-25 05:07:57,312][15401] InferenceWorker_p0-w0: stopping experience collection (198050 times) [2024-06-25 05:07:57,312][15401] InferenceWorker_p0-w0: resuming experience collection (198050 times) [2024-06-25 05:07:58,115][15401] Updated weights for policy 0, policy_version 816513 (0.0032) [2024-06-25 05:07:58,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13377748992. Throughput: 0: 43108.4. Samples: 13377844540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:07:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 05:08:03,270][15401] Updated weights for policy 0, policy_version 816523 (0.0035) [2024-06-25 05:08:03,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 13377912832. Throughput: 0: 42784.0. Samples: 13378103740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:08:03,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 05:08:05,745][15401] Updated weights for policy 0, policy_version 816533 (0.0038) [2024-06-25 05:08:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13378174976. Throughput: 0: 42937.8. Samples: 13378219360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:08:08,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 05:08:11,130][15401] Updated weights for policy 0, policy_version 816543 (0.0039) [2024-06-25 05:08:13,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13378371584. Throughput: 0: 43034.3. Samples: 13378487300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 05:08:13,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 05:08:13,929][15401] Updated weights for policy 0, policy_version 816553 (0.0027) [2024-06-25 05:08:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13378551808. Throughput: 0: 42680.1. Samples: 13378742760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 05:08:18,547][15401] Updated weights for policy 0, policy_version 816563 (0.0030) [2024-06-25 05:08:21,586][15401] Updated weights for policy 0, policy_version 816573 (0.0028) [2024-06-25 05:08:23,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 13378813952. Throughput: 0: 42968.8. Samples: 13378863080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:23,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 05:08:26,392][15401] Updated weights for policy 0, policy_version 816583 (0.0032) [2024-06-25 05:08:28,394][15132] Fps is (10 sec: 45855.3, 60 sec: 42595.4, 300 sec: 42708.8). Total num frames: 13379010560. Throughput: 0: 42900.1. Samples: 13379127060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:28,394][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 05:08:29,296][15401] Updated weights for policy 0, policy_version 816593 (0.0036) [2024-06-25 05:08:33,390][15132] Fps is (10 sec: 37692.0, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 13379190784. Throughput: 0: 42611.0. Samples: 13379380760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:33,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 05:08:33,936][15401] Updated weights for policy 0, policy_version 816603 (0.0033) [2024-06-25 05:08:36,919][15401] Updated weights for policy 0, policy_version 816613 (0.0040) [2024-06-25 05:08:38,390][15132] Fps is (10 sec: 44255.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 13379452928. Throughput: 0: 42759.1. Samples: 13379501980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 05:08:41,811][15401] Updated weights for policy 0, policy_version 816623 (0.0043) [2024-06-25 05:08:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13379633152. Throughput: 0: 42638.5. Samples: 13379763280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 05:08:45,062][15401] Updated weights for policy 0, policy_version 816633 (0.0035) [2024-06-25 05:08:48,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13379829760. Throughput: 0: 42507.1. Samples: 13380016460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:48,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 05:08:49,627][15401] Updated weights for policy 0, policy_version 816643 (0.0034) [2024-06-25 05:08:52,635][15401] Updated weights for policy 0, policy_version 816653 (0.0034) [2024-06-25 05:08:53,390][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13380108288. Throughput: 0: 42613.3. Samples: 13380136960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:53,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 05:08:57,106][15401] Updated weights for policy 0, policy_version 816663 (0.0026) [2024-06-25 05:08:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13380272128. Throughput: 0: 42581.0. Samples: 13380403440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:08:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 05:09:00,198][15401] Updated weights for policy 0, policy_version 816673 (0.0030) [2024-06-25 05:09:03,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 13380468736. Throughput: 0: 42428.4. Samples: 13380652040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:03,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 05:09:04,850][15401] Updated weights for policy 0, policy_version 816683 (0.0025) [2024-06-25 05:09:07,729][15401] Updated weights for policy 0, policy_version 816693 (0.0031) [2024-06-25 05:09:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13380730880. Throughput: 0: 42568.0. Samples: 13380778540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 05:09:12,844][15401] Updated weights for policy 0, policy_version 816703 (0.0042) [2024-06-25 05:09:12,855][15349] Signal inference workers to stop experience collection... (198100 times) [2024-06-25 05:09:12,856][15349] Signal inference workers to resume experience collection... (198100 times) [2024-06-25 05:09:12,866][15401] InferenceWorker_p0-w0: stopping experience collection (198100 times) [2024-06-25 05:09:12,866][15401] InferenceWorker_p0-w0: resuming experience collection (198100 times) [2024-06-25 05:09:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13380911104. Throughput: 0: 42577.9. Samples: 13381042880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 05:09:15,149][15401] Updated weights for policy 0, policy_version 816713 (0.0032) [2024-06-25 05:09:18,394][15132] Fps is (10 sec: 39303.1, 60 sec: 42868.0, 300 sec: 42486.6). Total num frames: 13381124096. Throughput: 0: 42606.2. Samples: 13381298240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:18,395][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 05:09:20,358][15401] Updated weights for policy 0, policy_version 816723 (0.0025) [2024-06-25 05:09:22,648][15401] Updated weights for policy 0, policy_version 816733 (0.0039) [2024-06-25 05:09:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13381369856. Throughput: 0: 42689.9. Samples: 13381423020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 05:09:27,909][15401] Updated weights for policy 0, policy_version 816743 (0.0045) [2024-06-25 05:09:28,392][15132] Fps is (10 sec: 44247.3, 60 sec: 42599.7, 300 sec: 42653.6). Total num frames: 13381566464. Throughput: 0: 42804.1. Samples: 13381689560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:28,392][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 05:09:30,425][15401] Updated weights for policy 0, policy_version 816753 (0.0044) [2024-06-25 05:09:33,394][15132] Fps is (10 sec: 39304.3, 60 sec: 42868.4, 300 sec: 42486.7). Total num frames: 13381763072. Throughput: 0: 42764.3. Samples: 13381941040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:33,394][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 05:09:35,265][15401] Updated weights for policy 0, policy_version 816763 (0.0031) [2024-06-25 05:09:38,209][15401] Updated weights for policy 0, policy_version 816773 (0.0035) [2024-06-25 05:09:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13382008832. Throughput: 0: 42873.8. Samples: 13382066280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 05:09:42,969][15401] Updated weights for policy 0, policy_version 816783 (0.0043) [2024-06-25 05:09:43,396][15132] Fps is (10 sec: 42589.8, 60 sec: 42594.0, 300 sec: 42541.9). Total num frames: 13382189056. Throughput: 0: 42707.7. Samples: 13382325560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:43,396][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 05:09:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816785_13382205440.pth... [2024-06-25 05:09:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816160_13371965440.pth [2024-06-25 05:09:45,945][15401] Updated weights for policy 0, policy_version 816793 (0.0036) [2024-06-25 05:09:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13382402048. Throughput: 0: 42874.2. Samples: 13382581380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 05:09:50,514][15401] Updated weights for policy 0, policy_version 816803 (0.0045) [2024-06-25 05:09:53,389][15132] Fps is (10 sec: 45905.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13382647808. Throughput: 0: 42857.0. Samples: 13382707100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 05:09:53,504][15401] Updated weights for policy 0, policy_version 816813 (0.0035) [2024-06-25 05:09:58,130][15401] Updated weights for policy 0, policy_version 816823 (0.0031) [2024-06-25 05:09:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13382844416. Throughput: 0: 42708.8. Samples: 13382964780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:09:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 05:10:01,063][15401] Updated weights for policy 0, policy_version 816833 (0.0032) [2024-06-25 05:10:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13383041024. Throughput: 0: 42741.9. Samples: 13383221420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 05:10:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 05:10:05,657][15401] Updated weights for policy 0, policy_version 816843 (0.0037) [2024-06-25 05:10:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 13383286784. Throughput: 0: 42783.1. Samples: 13383348260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 05:10:08,925][15401] Updated weights for policy 0, policy_version 816853 (0.0027) [2024-06-25 05:10:13,081][15401] Updated weights for policy 0, policy_version 816863 (0.0052) [2024-06-25 05:10:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13383483392. Throughput: 0: 42639.6. Samples: 13383608240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 05:10:16,283][15349] Signal inference workers to stop experience collection... (198150 times) [2024-06-25 05:10:16,283][15349] Signal inference workers to resume experience collection... (198150 times) [2024-06-25 05:10:16,300][15401] InferenceWorker_p0-w0: stopping experience collection (198150 times) [2024-06-25 05:10:16,300][15401] InferenceWorker_p0-w0: resuming experience collection (198150 times) [2024-06-25 05:10:16,439][15401] Updated weights for policy 0, policy_version 816873 (0.0030) [2024-06-25 05:10:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42874.9, 300 sec: 42653.9). Total num frames: 13383696384. Throughput: 0: 42840.6. Samples: 13383868680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:18,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 05:10:21,083][15401] Updated weights for policy 0, policy_version 816883 (0.0045) [2024-06-25 05:10:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13383925760. Throughput: 0: 42891.0. Samples: 13383996380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:23,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 05:10:24,421][15401] Updated weights for policy 0, policy_version 816893 (0.0048) [2024-06-25 05:10:28,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42325.4, 300 sec: 42487.0). Total num frames: 13384105984. Throughput: 0: 42787.8. Samples: 13384250840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:28,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 05:10:28,560][15401] Updated weights for policy 0, policy_version 816903 (0.0021) [2024-06-25 05:10:32,038][15401] Updated weights for policy 0, policy_version 816913 (0.0033) [2024-06-25 05:10:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43147.7, 300 sec: 42709.5). Total num frames: 13384351744. Throughput: 0: 42788.5. Samples: 13384506860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 05:10:36,024][15401] Updated weights for policy 0, policy_version 816923 (0.0028) [2024-06-25 05:10:38,389][15132] Fps is (10 sec: 47525.2, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 13384581120. Throughput: 0: 42774.6. Samples: 13384631960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 05:10:39,511][15401] Updated weights for policy 0, policy_version 816933 (0.0048) [2024-06-25 05:10:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 13384761344. Throughput: 0: 43003.5. Samples: 13384899940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:43,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 05:10:43,579][15401] Updated weights for policy 0, policy_version 816943 (0.0042) [2024-06-25 05:10:47,116][15401] Updated weights for policy 0, policy_version 816953 (0.0043) [2024-06-25 05:10:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13384990720. Throughput: 0: 42836.0. Samples: 13385149040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 05:10:51,385][15401] Updated weights for policy 0, policy_version 816963 (0.0050) [2024-06-25 05:10:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13385220096. Throughput: 0: 42948.5. Samples: 13385280940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 05:10:54,893][15401] Updated weights for policy 0, policy_version 816973 (0.0024) [2024-06-25 05:10:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13385400320. Throughput: 0: 43065.9. Samples: 13385546200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:10:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 05:10:58,869][15401] Updated weights for policy 0, policy_version 816983 (0.0032) [2024-06-25 05:11:02,507][15401] Updated weights for policy 0, policy_version 816993 (0.0037) [2024-06-25 05:11:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13385629696. Throughput: 0: 42879.4. Samples: 13385798260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:03,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 05:11:06,688][15401] Updated weights for policy 0, policy_version 817003 (0.0034) [2024-06-25 05:11:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13385842688. Throughput: 0: 42922.0. Samples: 13385927860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 05:11:10,107][15401] Updated weights for policy 0, policy_version 817013 (0.0026) [2024-06-25 05:11:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13386055680. Throughput: 0: 43060.0. Samples: 13386188440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 05:11:14,145][15401] Updated weights for policy 0, policy_version 817023 (0.0023) [2024-06-25 05:11:17,806][15401] Updated weights for policy 0, policy_version 817033 (0.0035) [2024-06-25 05:11:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13386268672. Throughput: 0: 42820.8. Samples: 13386433800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 05:11:21,719][15401] Updated weights for policy 0, policy_version 817043 (0.0032) [2024-06-25 05:11:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13386481664. Throughput: 0: 43097.2. Samples: 13386571340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 05:11:25,317][15401] Updated weights for policy 0, policy_version 817053 (0.0048) [2024-06-25 05:11:28,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43141.7, 300 sec: 42764.1). Total num frames: 13386694656. Throughput: 0: 42808.7. Samples: 13386826600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:28,397][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 05:11:29,271][15401] Updated weights for policy 0, policy_version 817063 (0.0033) [2024-06-25 05:11:32,865][15401] Updated weights for policy 0, policy_version 817073 (0.0030) [2024-06-25 05:11:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13386924032. Throughput: 0: 42980.5. Samples: 13387083160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 05:11:36,640][15401] Updated weights for policy 0, policy_version 817083 (0.0035) [2024-06-25 05:11:38,390][15132] Fps is (10 sec: 44264.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13387137024. Throughput: 0: 43047.9. Samples: 13387218100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 05:11:40,457][15401] Updated weights for policy 0, policy_version 817093 (0.0034) [2024-06-25 05:11:43,107][15349] Signal inference workers to stop experience collection... (198200 times) [2024-06-25 05:11:43,108][15349] Signal inference workers to resume experience collection... (198200 times) [2024-06-25 05:11:43,136][15401] InferenceWorker_p0-w0: stopping experience collection (198200 times) [2024-06-25 05:11:43,137][15401] InferenceWorker_p0-w0: resuming experience collection (198200 times) [2024-06-25 05:11:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13387333632. Throughput: 0: 42804.3. Samples: 13387472400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 05:11:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817098_13387333632.pth... [2024-06-25 05:11:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816474_13377110016.pth [2024-06-25 05:11:44,344][15401] Updated weights for policy 0, policy_version 817103 (0.0031) [2024-06-25 05:11:48,108][15401] Updated weights for policy 0, policy_version 817113 (0.0033) [2024-06-25 05:11:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13387579392. Throughput: 0: 42901.8. Samples: 13387728840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 05:11:51,821][15401] Updated weights for policy 0, policy_version 817123 (0.0041) [2024-06-25 05:11:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13387776000. Throughput: 0: 43142.2. Samples: 13387869260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-25 05:11:53,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 05:11:56,151][15401] Updated weights for policy 0, policy_version 817133 (0.0043) [2024-06-25 05:11:58,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13387988992. Throughput: 0: 43029.4. Samples: 13388124760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:11:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 05:11:59,432][15401] Updated weights for policy 0, policy_version 817143 (0.0032) [2024-06-25 05:12:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13388218368. Throughput: 0: 43284.0. Samples: 13388381580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 05:12:03,598][15401] Updated weights for policy 0, policy_version 817153 (0.0023) [2024-06-25 05:12:06,973][15401] Updated weights for policy 0, policy_version 817163 (0.0037) [2024-06-25 05:12:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13388447744. Throughput: 0: 43185.5. Samples: 13388514680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 05:12:11,061][15401] Updated weights for policy 0, policy_version 817173 (0.0045) [2024-06-25 05:12:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13388627968. Throughput: 0: 43149.1. Samples: 13388768040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:13,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 05:12:14,911][15401] Updated weights for policy 0, policy_version 817183 (0.0033) [2024-06-25 05:12:18,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13388840960. Throughput: 0: 43214.1. Samples: 13389027800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 05:12:18,749][15401] Updated weights for policy 0, policy_version 817193 (0.0035) [2024-06-25 05:12:22,410][15401] Updated weights for policy 0, policy_version 817203 (0.0030) [2024-06-25 05:12:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13389070336. Throughput: 0: 43067.6. Samples: 13389156140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 05:12:26,651][15401] Updated weights for policy 0, policy_version 817213 (0.0032) [2024-06-25 05:12:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42932.0). Total num frames: 13389266944. Throughput: 0: 43049.8. Samples: 13389409640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:28,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 05:12:30,101][15401] Updated weights for policy 0, policy_version 817223 (0.0037) [2024-06-25 05:12:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13389496320. Throughput: 0: 43224.0. Samples: 13389673920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 05:12:34,212][15401] Updated weights for policy 0, policy_version 817233 (0.0038) [2024-06-25 05:12:37,667][15401] Updated weights for policy 0, policy_version 817243 (0.0023) [2024-06-25 05:12:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 13389725696. Throughput: 0: 42965.3. Samples: 13389802700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:38,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 05:12:41,819][15401] Updated weights for policy 0, policy_version 817253 (0.0039) [2024-06-25 05:12:43,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13389922304. Throughput: 0: 42916.0. Samples: 13390055980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 05:12:45,145][15401] Updated weights for policy 0, policy_version 817263 (0.0026) [2024-06-25 05:12:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13390151680. Throughput: 0: 43028.0. Samples: 13390317840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 05:12:49,332][15401] Updated weights for policy 0, policy_version 817273 (0.0034) [2024-06-25 05:12:53,106][15401] Updated weights for policy 0, policy_version 817283 (0.0031) [2024-06-25 05:12:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13390364672. Throughput: 0: 42918.5. Samples: 13390446020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 05:12:55,119][15349] Signal inference workers to stop experience collection... (198250 times) [2024-06-25 05:12:55,119][15349] Signal inference workers to resume experience collection... (198250 times) [2024-06-25 05:12:55,147][15401] InferenceWorker_p0-w0: stopping experience collection (198250 times) [2024-06-25 05:12:55,147][15401] InferenceWorker_p0-w0: resuming experience collection (198250 times) [2024-06-25 05:12:56,977][15401] Updated weights for policy 0, policy_version 817293 (0.0048) [2024-06-25 05:12:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13390577664. Throughput: 0: 42784.0. Samples: 13390693320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:12:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:13:00,709][15401] Updated weights for policy 0, policy_version 817303 (0.0042) [2024-06-25 05:13:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13390774272. Throughput: 0: 42717.9. Samples: 13390950100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 05:13:05,071][15401] Updated weights for policy 0, policy_version 817313 (0.0036) [2024-06-25 05:13:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 13391020032. Throughput: 0: 42831.1. Samples: 13391083540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 05:13:08,392][15401] Updated weights for policy 0, policy_version 817323 (0.0040) [2024-06-25 05:13:12,860][15401] Updated weights for policy 0, policy_version 817333 (0.0046) [2024-06-25 05:13:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 13391216640. Throughput: 0: 42773.3. Samples: 13391334440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 05:13:16,350][15401] Updated weights for policy 0, policy_version 817343 (0.0029) [2024-06-25 05:13:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 13391429632. Throughput: 0: 42503.6. Samples: 13391586580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:18,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 05:13:20,373][15401] Updated weights for policy 0, policy_version 817353 (0.0036) [2024-06-25 05:13:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 13391642624. Throughput: 0: 42651.2. Samples: 13391722000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 05:13:23,927][15401] Updated weights for policy 0, policy_version 817363 (0.0038) [2024-06-25 05:13:27,966][15401] Updated weights for policy 0, policy_version 817373 (0.0032) [2024-06-25 05:13:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13391839232. Throughput: 0: 42637.3. Samples: 13391974660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 05:13:31,709][15401] Updated weights for policy 0, policy_version 817383 (0.0034) [2024-06-25 05:13:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13392084992. Throughput: 0: 42412.0. Samples: 13392226380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:33,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 05:13:35,577][15401] Updated weights for policy 0, policy_version 817393 (0.0040) [2024-06-25 05:13:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13392281600. Throughput: 0: 42491.2. Samples: 13392358120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 05:13:38,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 05:13:39,288][15401] Updated weights for policy 0, policy_version 817403 (0.0054) [2024-06-25 05:13:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13392478208. Throughput: 0: 42558.3. Samples: 13392608440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:13:43,392][15132] Avg episode reward: [(0, '0.148')] [2024-06-25 05:13:43,527][15401] Updated weights for policy 0, policy_version 817413 (0.0037) [2024-06-25 05:13:43,529][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817413_13392494592.pth... [2024-06-25 05:13:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000816785_13382205440.pth [2024-06-25 05:13:46,846][15401] Updated weights for policy 0, policy_version 817423 (0.0034) [2024-06-25 05:13:48,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 13392691200. Throughput: 0: 42582.5. Samples: 13392866420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:13:48,393][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 05:13:51,670][15401] Updated weights for policy 0, policy_version 817433 (0.0030) [2024-06-25 05:13:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13392904192. Throughput: 0: 42521.8. Samples: 13392997020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:13:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 05:13:54,415][15401] Updated weights for policy 0, policy_version 817443 (0.0033) [2024-06-25 05:13:58,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13393133568. Throughput: 0: 42535.9. Samples: 13393248560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:13:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 05:13:59,275][15401] Updated weights for policy 0, policy_version 817453 (0.0045) [2024-06-25 05:14:02,204][15401] Updated weights for policy 0, policy_version 817463 (0.0049) [2024-06-25 05:14:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13393346560. Throughput: 0: 42571.2. Samples: 13393502280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:03,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 05:14:06,965][15401] Updated weights for policy 0, policy_version 817473 (0.0041) [2024-06-25 05:14:08,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.7, 300 sec: 42875.7). Total num frames: 13393559552. Throughput: 0: 42389.2. Samples: 13393629620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:08,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 05:14:10,379][15401] Updated weights for policy 0, policy_version 817483 (0.0031) [2024-06-25 05:14:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42932.3). Total num frames: 13393788928. Throughput: 0: 42616.9. Samples: 13393892420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:13,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 05:14:14,511][15401] Updated weights for policy 0, policy_version 817493 (0.0038) [2024-06-25 05:14:17,905][15401] Updated weights for policy 0, policy_version 817503 (0.0035) [2024-06-25 05:14:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13393985536. Throughput: 0: 42597.3. Samples: 13394143260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:18,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 05:14:22,139][15401] Updated weights for policy 0, policy_version 817513 (0.0032) [2024-06-25 05:14:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 13394198528. Throughput: 0: 42490.1. Samples: 13394270180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 05:14:25,689][15401] Updated weights for policy 0, policy_version 817523 (0.0037) [2024-06-25 05:14:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42932.3). Total num frames: 13394427904. Throughput: 0: 42824.1. Samples: 13394535520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 05:14:30,027][15401] Updated weights for policy 0, policy_version 817533 (0.0033) [2024-06-25 05:14:30,846][15349] Signal inference workers to stop experience collection... (198300 times) [2024-06-25 05:14:30,878][15401] InferenceWorker_p0-w0: stopping experience collection (198300 times) [2024-06-25 05:14:30,906][15349] Signal inference workers to resume experience collection... (198300 times) [2024-06-25 05:14:30,914][15401] InferenceWorker_p0-w0: resuming experience collection (198300 times) [2024-06-25 05:14:33,310][15401] Updated weights for policy 0, policy_version 817543 (0.0035) [2024-06-25 05:14:33,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13394624512. Throughput: 0: 42851.6. Samples: 13394794640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:33,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 05:14:37,408][15401] Updated weights for policy 0, policy_version 817553 (0.0047) [2024-06-25 05:14:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 13394837504. Throughput: 0: 42629.4. Samples: 13394915340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 05:14:40,852][15401] Updated weights for policy 0, policy_version 817563 (0.0031) [2024-06-25 05:14:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13395050496. Throughput: 0: 42841.2. Samples: 13395176420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:43,390][15132] Avg episode reward: [(0, '0.171')] [2024-06-25 05:14:44,973][15401] Updated weights for policy 0, policy_version 817573 (0.0029) [2024-06-25 05:14:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13395247104. Throughput: 0: 42885.7. Samples: 13395432140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 05:14:48,697][15401] Updated weights for policy 0, policy_version 817583 (0.0028) [2024-06-25 05:14:52,465][15401] Updated weights for policy 0, policy_version 817593 (0.0044) [2024-06-25 05:14:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13395476480. Throughput: 0: 42799.2. Samples: 13395555480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:53,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 05:14:56,258][15401] Updated weights for policy 0, policy_version 817603 (0.0038) [2024-06-25 05:14:58,391][15132] Fps is (10 sec: 42591.0, 60 sec: 42324.1, 300 sec: 42820.3). Total num frames: 13395673088. Throughput: 0: 42729.0. Samples: 13395815300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:14:58,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 05:15:00,077][15401] Updated weights for policy 0, policy_version 817613 (0.0041) [2024-06-25 05:15:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13395902464. Throughput: 0: 42843.6. Samples: 13396071220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 05:15:03,865][15401] Updated weights for policy 0, policy_version 817623 (0.0034) [2024-06-25 05:15:07,407][15401] Updated weights for policy 0, policy_version 817633 (0.0046) [2024-06-25 05:15:08,389][15132] Fps is (10 sec: 45883.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 13396131840. Throughput: 0: 42867.2. Samples: 13396199200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 05:15:11,285][15401] Updated weights for policy 0, policy_version 817643 (0.0047) [2024-06-25 05:15:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13396312064. Throughput: 0: 42765.3. Samples: 13396459960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 05:15:14,794][15401] Updated weights for policy 0, policy_version 817653 (0.0029) [2024-06-25 05:15:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 13396541440. Throughput: 0: 42746.8. Samples: 13396718240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 05:15:19,164][15401] Updated weights for policy 0, policy_version 817663 (0.0031) [2024-06-25 05:15:22,323][15401] Updated weights for policy 0, policy_version 817673 (0.0027) [2024-06-25 05:15:23,392][15132] Fps is (10 sec: 47501.8, 60 sec: 43142.8, 300 sec: 42987.2). Total num frames: 13396787200. Throughput: 0: 43093.6. Samples: 13396854660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:23,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 05:15:26,659][15401] Updated weights for policy 0, policy_version 817683 (0.0024) [2024-06-25 05:15:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13396967424. Throughput: 0: 42944.7. Samples: 13397108920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:15:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 05:15:29,749][15401] Updated weights for policy 0, policy_version 817693 (0.0024) [2024-06-25 05:15:33,390][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13397213184. Throughput: 0: 43064.4. Samples: 13397370040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 05:15:34,740][15401] Updated weights for policy 0, policy_version 817703 (0.0029) [2024-06-25 05:15:37,117][15401] Updated weights for policy 0, policy_version 817713 (0.0029) [2024-06-25 05:15:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 13397442560. Throughput: 0: 43356.5. Samples: 13397506520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 05:15:39,767][15349] Signal inference workers to stop experience collection... (198350 times) [2024-06-25 05:15:39,767][15349] Signal inference workers to resume experience collection... (198350 times) [2024-06-25 05:15:39,819][15401] InferenceWorker_p0-w0: stopping experience collection (198350 times) [2024-06-25 05:15:39,819][15401] InferenceWorker_p0-w0: resuming experience collection (198350 times) [2024-06-25 05:15:42,189][15401] Updated weights for policy 0, policy_version 817723 (0.0034) [2024-06-25 05:15:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 13397606400. Throughput: 0: 43353.7. Samples: 13397766140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:43,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 05:15:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817726_13397622784.pth... [2024-06-25 05:15:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817098_13387333632.pth [2024-06-25 05:15:44,699][15401] Updated weights for policy 0, policy_version 817733 (0.0035) [2024-06-25 05:15:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13397852160. Throughput: 0: 43163.0. Samples: 13398013560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:48,393][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 05:15:50,119][15401] Updated weights for policy 0, policy_version 817743 (0.0040) [2024-06-25 05:15:52,399][15401] Updated weights for policy 0, policy_version 817753 (0.0036) [2024-06-25 05:15:53,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43690.8, 300 sec: 43042.7). Total num frames: 13398097920. Throughput: 0: 43355.2. Samples: 13398150180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 05:15:57,786][15401] Updated weights for policy 0, policy_version 817763 (0.0038) [2024-06-25 05:15:58,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42871.0, 300 sec: 42764.7). Total num frames: 13398245376. Throughput: 0: 43202.1. Samples: 13398404160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:15:58,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 05:16:00,368][15401] Updated weights for policy 0, policy_version 817773 (0.0038) [2024-06-25 05:16:03,389][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13398491136. Throughput: 0: 42998.6. Samples: 13398653180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:03,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 05:16:05,447][15401] Updated weights for policy 0, policy_version 817783 (0.0033) [2024-06-25 05:16:07,979][15401] Updated weights for policy 0, policy_version 817793 (0.0032) [2024-06-25 05:16:08,390][15132] Fps is (10 sec: 47524.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13398720512. Throughput: 0: 42976.0. Samples: 13398788480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:08,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 05:16:12,935][15401] Updated weights for policy 0, policy_version 817803 (0.0024) [2024-06-25 05:16:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13398884352. Throughput: 0: 42981.3. Samples: 13399043080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 05:16:15,795][15401] Updated weights for policy 0, policy_version 817813 (0.0024) [2024-06-25 05:16:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13399130112. Throughput: 0: 42688.5. Samples: 13399291020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:18,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 05:16:20,335][15401] Updated weights for policy 0, policy_version 817823 (0.0036) [2024-06-25 05:16:23,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42873.2, 300 sec: 42932.6). Total num frames: 13399359488. Throughput: 0: 42637.4. Samples: 13399425200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:23,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 05:16:23,658][15401] Updated weights for policy 0, policy_version 817833 (0.0035) [2024-06-25 05:16:27,940][15401] Updated weights for policy 0, policy_version 817843 (0.0040) [2024-06-25 05:16:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13399539712. Throughput: 0: 42509.3. Samples: 13399679060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:28,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 05:16:31,195][15401] Updated weights for policy 0, policy_version 817853 (0.0038) [2024-06-25 05:16:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13399785472. Throughput: 0: 42646.2. Samples: 13399932640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 05:16:35,700][15401] Updated weights for policy 0, policy_version 817863 (0.0041) [2024-06-25 05:16:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 13399998464. Throughput: 0: 42625.3. Samples: 13400068320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 05:16:39,019][15401] Updated weights for policy 0, policy_version 817873 (0.0025) [2024-06-25 05:16:43,094][15401] Updated weights for policy 0, policy_version 817883 (0.0038) [2024-06-25 05:16:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13400195072. Throughput: 0: 42736.2. Samples: 13400327180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 05:16:46,444][15401] Updated weights for policy 0, policy_version 817893 (0.0032) [2024-06-25 05:16:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13400424448. Throughput: 0: 42853.8. Samples: 13400581600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:48,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 05:16:49,461][15349] Signal inference workers to stop experience collection... (198400 times) [2024-06-25 05:16:49,507][15401] InferenceWorker_p0-w0: stopping experience collection (198400 times) [2024-06-25 05:16:49,513][15349] Signal inference workers to resume experience collection... (198400 times) [2024-06-25 05:16:49,524][15401] InferenceWorker_p0-w0: resuming experience collection (198400 times) [2024-06-25 05:16:50,393][15401] Updated weights for policy 0, policy_version 817903 (0.0034) [2024-06-25 05:16:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.6, 300 sec: 42875.8). Total num frames: 13400637440. Throughput: 0: 42824.5. Samples: 13400715680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:53,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 05:16:53,883][15401] Updated weights for policy 0, policy_version 817913 (0.0049) [2024-06-25 05:16:57,895][15401] Updated weights for policy 0, policy_version 817923 (0.0047) [2024-06-25 05:16:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43690.7, 300 sec: 42875.8). Total num frames: 13400866816. Throughput: 0: 43005.2. Samples: 13400978420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:16:58,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 05:17:01,655][15401] Updated weights for policy 0, policy_version 817933 (0.0037) [2024-06-25 05:17:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13401079808. Throughput: 0: 43046.2. Samples: 13401228100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:17:03,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 05:17:05,777][15401] Updated weights for policy 0, policy_version 817943 (0.0026) [2024-06-25 05:17:08,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13401260032. Throughput: 0: 42994.7. Samples: 13401359960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:17:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 05:17:09,183][15401] Updated weights for policy 0, policy_version 817953 (0.0037) [2024-06-25 05:17:13,375][15401] Updated weights for policy 0, policy_version 817963 (0.0031) [2024-06-25 05:17:13,391][15132] Fps is (10 sec: 42593.3, 60 sec: 43689.8, 300 sec: 42931.5). Total num frames: 13401505792. Throughput: 0: 43141.1. Samples: 13401620460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:17:13,393][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 05:17:16,935][15401] Updated weights for policy 0, policy_version 817973 (0.0042) [2024-06-25 05:17:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13401718784. Throughput: 0: 42984.0. Samples: 13401866920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-25 05:17:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 05:17:21,108][15401] Updated weights for policy 0, policy_version 817983 (0.0027) [2024-06-25 05:17:23,389][15132] Fps is (10 sec: 39326.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13401899008. Throughput: 0: 42953.7. Samples: 13402001240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 05:17:25,039][15401] Updated weights for policy 0, policy_version 817993 (0.0030) [2024-06-25 05:17:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13402144768. Throughput: 0: 42857.2. Samples: 13402255760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 05:17:28,569][15401] Updated weights for policy 0, policy_version 818003 (0.0032) [2024-06-25 05:17:32,535][15401] Updated weights for policy 0, policy_version 818013 (0.0033) [2024-06-25 05:17:33,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13402374144. Throughput: 0: 42877.2. Samples: 13402511080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 05:17:36,141][15401] Updated weights for policy 0, policy_version 818023 (0.0029) [2024-06-25 05:17:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13402554368. Throughput: 0: 42822.7. Samples: 13402642600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:38,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 05:17:40,063][15401] Updated weights for policy 0, policy_version 818033 (0.0033) [2024-06-25 05:17:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13402783744. Throughput: 0: 42713.4. Samples: 13402900420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:43,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 05:17:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818042_13402800128.pth... [2024-06-25 05:17:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817413_13392494592.pth [2024-06-25 05:17:43,795][15401] Updated weights for policy 0, policy_version 818043 (0.0047) [2024-06-25 05:17:47,579][15401] Updated weights for policy 0, policy_version 818053 (0.0050) [2024-06-25 05:17:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13403013120. Throughput: 0: 42892.2. Samples: 13403158260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 05:17:51,533][15401] Updated weights for policy 0, policy_version 818063 (0.0030) [2024-06-25 05:17:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 13403176960. Throughput: 0: 42800.9. Samples: 13403286000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 05:17:53,547][15349] Signal inference workers to stop experience collection... (198450 times) [2024-06-25 05:17:53,607][15401] InferenceWorker_p0-w0: stopping experience collection (198450 times) [2024-06-25 05:17:53,671][15349] Signal inference workers to resume experience collection... (198450 times) [2024-06-25 05:17:53,671][15401] InferenceWorker_p0-w0: resuming experience collection (198450 times) [2024-06-25 05:17:55,766][15401] Updated weights for policy 0, policy_version 818073 (0.0037) [2024-06-25 05:17:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 13403422720. Throughput: 0: 42631.2. Samples: 13403538820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:17:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 05:17:59,252][15401] Updated weights for policy 0, policy_version 818083 (0.0034) [2024-06-25 05:18:03,364][15401] Updated weights for policy 0, policy_version 818093 (0.0030) [2024-06-25 05:18:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13403635712. Throughput: 0: 42848.4. Samples: 13403795100. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 05:18:07,029][15401] Updated weights for policy 0, policy_version 818103 (0.0028) [2024-06-25 05:18:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13403815936. Throughput: 0: 42752.3. Samples: 13403925100. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 05:18:11,109][15401] Updated weights for policy 0, policy_version 818113 (0.0035) [2024-06-25 05:18:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.2, 300 sec: 42820.6). Total num frames: 13404061696. Throughput: 0: 42710.3. Samples: 13404177720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:13,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 05:18:14,674][15401] Updated weights for policy 0, policy_version 818123 (0.0036) [2024-06-25 05:18:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13404258304. Throughput: 0: 42681.9. Samples: 13404431760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:18,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 05:18:18,807][15401] Updated weights for policy 0, policy_version 818133 (0.0030) [2024-06-25 05:18:22,591][15401] Updated weights for policy 0, policy_version 818143 (0.0036) [2024-06-25 05:18:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13404471296. Throughput: 0: 42473.8. Samples: 13404553920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:23,390][15132] Avg episode reward: [(0, '0.292')] [2024-06-25 05:18:26,742][15401] Updated weights for policy 0, policy_version 818153 (0.0034) [2024-06-25 05:18:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13404700672. Throughput: 0: 42524.8. Samples: 13404814040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 05:18:30,348][15401] Updated weights for policy 0, policy_version 818163 (0.0039) [2024-06-25 05:18:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13404897280. Throughput: 0: 42392.6. Samples: 13405065920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 05:18:34,486][15401] Updated weights for policy 0, policy_version 818173 (0.0034) [2024-06-25 05:18:38,052][15401] Updated weights for policy 0, policy_version 818183 (0.0036) [2024-06-25 05:18:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13405110272. Throughput: 0: 42248.7. Samples: 13405187200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 05:18:42,280][15401] Updated weights for policy 0, policy_version 818193 (0.0035) [2024-06-25 05:18:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 13405323264. Throughput: 0: 42416.5. Samples: 13405447560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:43,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 05:18:45,615][15401] Updated weights for policy 0, policy_version 818203 (0.0036) [2024-06-25 05:18:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 13405519872. Throughput: 0: 42399.2. Samples: 13405703060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 05:18:49,778][15401] Updated weights for policy 0, policy_version 818213 (0.0038) [2024-06-25 05:18:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13405749248. Throughput: 0: 42293.9. Samples: 13405828320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:53,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 05:18:53,495][15401] Updated weights for policy 0, policy_version 818223 (0.0032) [2024-06-25 05:18:57,449][15401] Updated weights for policy 0, policy_version 818233 (0.0037) [2024-06-25 05:18:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13405978624. Throughput: 0: 42351.1. Samples: 13406083520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:18:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 05:19:01,296][15401] Updated weights for policy 0, policy_version 818243 (0.0046) [2024-06-25 05:19:02,128][15349] Signal inference workers to stop experience collection... (198500 times) [2024-06-25 05:19:02,128][15349] Signal inference workers to resume experience collection... (198500 times) [2024-06-25 05:19:02,169][15401] InferenceWorker_p0-w0: stopping experience collection (198500 times) [2024-06-25 05:19:02,169][15401] InferenceWorker_p0-w0: resuming experience collection (198500 times) [2024-06-25 05:19:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 13406142464. Throughput: 0: 42464.1. Samples: 13406342640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-25 05:19:03,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 05:19:05,127][15401] Updated weights for policy 0, policy_version 818253 (0.0034) [2024-06-25 05:19:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13406388224. Throughput: 0: 42348.8. Samples: 13406459620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:08,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 05:19:09,031][15401] Updated weights for policy 0, policy_version 818263 (0.0043) [2024-06-25 05:19:12,987][15401] Updated weights for policy 0, policy_version 818273 (0.0040) [2024-06-25 05:19:13,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13406617600. Throughput: 0: 42370.2. Samples: 13406720700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:13,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 05:19:16,663][15401] Updated weights for policy 0, policy_version 818283 (0.0037) [2024-06-25 05:19:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13406797824. Throughput: 0: 42628.9. Samples: 13406984220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:18,394][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 05:19:20,440][15401] Updated weights for policy 0, policy_version 818293 (0.0042) [2024-06-25 05:19:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13407043584. Throughput: 0: 42716.9. Samples: 13407109460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 05:19:24,154][15401] Updated weights for policy 0, policy_version 818303 (0.0043) [2024-06-25 05:19:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 13407223808. Throughput: 0: 42703.9. Samples: 13407369240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 05:19:28,579][15401] Updated weights for policy 0, policy_version 818313 (0.0022) [2024-06-25 05:19:31,702][15401] Updated weights for policy 0, policy_version 818323 (0.0034) [2024-06-25 05:19:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13407453184. Throughput: 0: 42748.0. Samples: 13407626720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 05:19:36,066][15401] Updated weights for policy 0, policy_version 818333 (0.0030) [2024-06-25 05:19:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13407682560. Throughput: 0: 42889.6. Samples: 13407758360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 05:19:39,372][15401] Updated weights for policy 0, policy_version 818343 (0.0043) [2024-06-25 05:19:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13407879168. Throughput: 0: 42973.7. Samples: 13408017340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 05:19:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818352_13407879168.pth... [2024-06-25 05:19:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000817726_13397622784.pth [2024-06-25 05:19:43,690][15401] Updated weights for policy 0, policy_version 818353 (0.0032) [2024-06-25 05:19:46,926][15401] Updated weights for policy 0, policy_version 818363 (0.0050) [2024-06-25 05:19:48,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 13408092160. Throughput: 0: 42832.3. Samples: 13408270200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:48,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 05:19:51,297][15401] Updated weights for policy 0, policy_version 818373 (0.0033) [2024-06-25 05:19:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42820.4). Total num frames: 13408305152. Throughput: 0: 43152.8. Samples: 13408401600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:53,393][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 05:19:54,372][15401] Updated weights for policy 0, policy_version 818383 (0.0037) [2024-06-25 05:19:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13408518144. Throughput: 0: 43014.7. Samples: 13408656360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:19:58,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 05:19:58,929][15401] Updated weights for policy 0, policy_version 818393 (0.0038) [2024-06-25 05:20:02,101][15401] Updated weights for policy 0, policy_version 818403 (0.0029) [2024-06-25 05:20:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13408747520. Throughput: 0: 42783.9. Samples: 13408909500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 05:20:06,535][15401] Updated weights for policy 0, policy_version 818413 (0.0036) [2024-06-25 05:20:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13408960512. Throughput: 0: 43096.2. Samples: 13409048780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:08,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 05:20:09,679][15401] Updated weights for policy 0, policy_version 818423 (0.0034) [2024-06-25 05:20:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13409157120. Throughput: 0: 42962.0. Samples: 13409302520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:13,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 05:20:14,190][15401] Updated weights for policy 0, policy_version 818433 (0.0033) [2024-06-25 05:20:17,496][15401] Updated weights for policy 0, policy_version 818443 (0.0045) [2024-06-25 05:20:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 13409402880. Throughput: 0: 42632.4. Samples: 13409545180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 05:20:21,661][15401] Updated weights for policy 0, policy_version 818453 (0.0033) [2024-06-25 05:20:22,021][15349] Signal inference workers to stop experience collection... (198550 times) [2024-06-25 05:20:22,067][15401] InferenceWorker_p0-w0: stopping experience collection (198550 times) [2024-06-25 05:20:22,072][15349] Signal inference workers to resume experience collection... (198550 times) [2024-06-25 05:20:22,082][15401] InferenceWorker_p0-w0: resuming experience collection (198550 times) [2024-06-25 05:20:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 13409599488. Throughput: 0: 42846.8. Samples: 13409686460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 05:20:25,203][15401] Updated weights for policy 0, policy_version 818463 (0.0024) [2024-06-25 05:20:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 13409796096. Throughput: 0: 42705.0. Samples: 13409939060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 05:20:29,706][15401] Updated weights for policy 0, policy_version 818473 (0.0036) [2024-06-25 05:20:32,879][15401] Updated weights for policy 0, policy_version 818483 (0.0031) [2024-06-25 05:20:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13410025472. Throughput: 0: 42643.7. Samples: 13410189060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 05:20:37,338][15401] Updated weights for policy 0, policy_version 818493 (0.0032) [2024-06-25 05:20:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13410238464. Throughput: 0: 42685.4. Samples: 13410322340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 05:20:40,347][15401] Updated weights for policy 0, policy_version 818503 (0.0029) [2024-06-25 05:20:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13410467840. Throughput: 0: 42720.0. Samples: 13410578760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 05:20:44,737][15401] Updated weights for policy 0, policy_version 818513 (0.0034) [2024-06-25 05:20:48,214][15401] Updated weights for policy 0, policy_version 818523 (0.0034) [2024-06-25 05:20:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 13410680832. Throughput: 0: 42859.2. Samples: 13410838160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 05:20:52,314][15401] Updated weights for policy 0, policy_version 818533 (0.0026) [2024-06-25 05:20:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42873.3, 300 sec: 42820.9). Total num frames: 13410877440. Throughput: 0: 42610.6. Samples: 13410966260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 05:20:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 05:20:55,881][15401] Updated weights for policy 0, policy_version 818543 (0.0037) [2024-06-25 05:20:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13411123200. Throughput: 0: 42717.2. Samples: 13411224800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:20:58,393][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 05:20:59,809][15401] Updated weights for policy 0, policy_version 818553 (0.0041) [2024-06-25 05:21:03,358][15401] Updated weights for policy 0, policy_version 818563 (0.0047) [2024-06-25 05:21:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13411336192. Throughput: 0: 42944.0. Samples: 13411477660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:03,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 05:21:07,394][15401] Updated weights for policy 0, policy_version 818573 (0.0029) [2024-06-25 05:21:08,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13411516416. Throughput: 0: 42777.3. Samples: 13411611440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:08,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 05:21:10,803][15401] Updated weights for policy 0, policy_version 818583 (0.0037) [2024-06-25 05:21:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13411745792. Throughput: 0: 42927.6. Samples: 13411870800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 05:21:14,897][15401] Updated weights for policy 0, policy_version 818593 (0.0028) [2024-06-25 05:21:18,323][15401] Updated weights for policy 0, policy_version 818603 (0.0037) [2024-06-25 05:21:18,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13411991552. Throughput: 0: 43077.3. Samples: 13412127540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 05:21:22,642][15401] Updated weights for policy 0, policy_version 818613 (0.0026) [2024-06-25 05:21:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13412171776. Throughput: 0: 43012.6. Samples: 13412257900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 05:21:25,966][15401] Updated weights for policy 0, policy_version 818623 (0.0029) [2024-06-25 05:21:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13412401152. Throughput: 0: 43008.9. Samples: 13412514160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:28,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 05:21:30,560][15401] Updated weights for policy 0, policy_version 818633 (0.0045) [2024-06-25 05:21:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13412630528. Throughput: 0: 42996.8. Samples: 13412773020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 05:21:33,750][15401] Updated weights for policy 0, policy_version 818643 (0.0030) [2024-06-25 05:21:37,920][15401] Updated weights for policy 0, policy_version 818653 (0.0028) [2024-06-25 05:21:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13412827136. Throughput: 0: 43087.1. Samples: 13412905180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 05:21:41,363][15401] Updated weights for policy 0, policy_version 818663 (0.0031) [2024-06-25 05:21:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 13413056512. Throughput: 0: 43172.5. Samples: 13413167660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:43,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 05:21:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818668_13413056512.pth... [2024-06-25 05:21:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818042_13402800128.pth [2024-06-25 05:21:45,463][15401] Updated weights for policy 0, policy_version 818673 (0.0037) [2024-06-25 05:21:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13413253120. Throughput: 0: 43166.7. Samples: 13413420160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 05:21:49,213][15401] Updated weights for policy 0, policy_version 818683 (0.0038) [2024-06-25 05:21:53,038][15401] Updated weights for policy 0, policy_version 818693 (0.0034) [2024-06-25 05:21:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 13413482496. Throughput: 0: 43058.8. Samples: 13413549080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 05:21:56,770][15401] Updated weights for policy 0, policy_version 818703 (0.0023) [2024-06-25 05:21:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13413679104. Throughput: 0: 43155.6. Samples: 13413812800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:21:58,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-25 05:21:58,430][15349] Signal inference workers to stop experience collection... (198600 times) [2024-06-25 05:21:58,481][15401] InferenceWorker_p0-w0: stopping experience collection (198600 times) [2024-06-25 05:21:58,490][15349] Signal inference workers to resume experience collection... (198600 times) [2024-06-25 05:21:58,500][15401] InferenceWorker_p0-w0: resuming experience collection (198600 times) [2024-06-25 05:22:00,546][15401] Updated weights for policy 0, policy_version 818713 (0.0034) [2024-06-25 05:22:03,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13413892096. Throughput: 0: 43082.5. Samples: 13414066360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:03,392][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 05:22:04,405][15401] Updated weights for policy 0, policy_version 818723 (0.0031) [2024-06-25 05:22:08,086][15401] Updated weights for policy 0, policy_version 818733 (0.0030) [2024-06-25 05:22:08,392][15132] Fps is (10 sec: 45863.5, 60 sec: 43689.0, 300 sec: 42820.4). Total num frames: 13414137856. Throughput: 0: 43062.6. Samples: 13414195820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:08,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 05:22:12,070][15401] Updated weights for policy 0, policy_version 818743 (0.0033) [2024-06-25 05:22:13,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13414334464. Throughput: 0: 43172.6. Samples: 13414456920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 05:22:15,585][15401] Updated weights for policy 0, policy_version 818753 (0.0031) [2024-06-25 05:22:18,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13414531072. Throughput: 0: 43206.8. Samples: 13414717320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 05:22:19,597][15401] Updated weights for policy 0, policy_version 818763 (0.0037) [2024-06-25 05:22:23,094][15401] Updated weights for policy 0, policy_version 818773 (0.0031) [2024-06-25 05:22:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13414776832. Throughput: 0: 43019.6. Samples: 13414841060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 05:22:26,971][15401] Updated weights for policy 0, policy_version 818783 (0.0034) [2024-06-25 05:22:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13414989824. Throughput: 0: 43070.7. Samples: 13415105740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:28,398][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 05:22:30,724][15401] Updated weights for policy 0, policy_version 818793 (0.0045) [2024-06-25 05:22:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13415186432. Throughput: 0: 43252.9. Samples: 13415366540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 05:22:35,054][15401] Updated weights for policy 0, policy_version 818803 (0.0039) [2024-06-25 05:22:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13415399424. Throughput: 0: 43161.8. Samples: 13415491360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 05:22:38,600][15401] Updated weights for policy 0, policy_version 818813 (0.0028) [2024-06-25 05:22:42,735][15401] Updated weights for policy 0, policy_version 818823 (0.0032) [2024-06-25 05:22:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 13415628800. Throughput: 0: 43130.9. Samples: 13415753700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 05:22:43,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-25 05:22:46,328][15401] Updated weights for policy 0, policy_version 818833 (0.0035) [2024-06-25 05:22:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13415841792. Throughput: 0: 42992.9. Samples: 13416000940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:22:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 05:22:50,354][15401] Updated weights for policy 0, policy_version 818843 (0.0027) [2024-06-25 05:22:53,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13416054784. Throughput: 0: 43067.1. Samples: 13416133840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:22:53,392][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 05:22:53,860][15401] Updated weights for policy 0, policy_version 818853 (0.0034) [2024-06-25 05:22:57,907][15401] Updated weights for policy 0, policy_version 818863 (0.0043) [2024-06-25 05:22:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13416284160. Throughput: 0: 43019.0. Samples: 13416392780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:22:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 05:23:01,450][15401] Updated weights for policy 0, policy_version 818873 (0.0029) [2024-06-25 05:23:03,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43419.4, 300 sec: 42987.2). Total num frames: 13416497152. Throughput: 0: 42920.4. Samples: 13416648740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:03,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 05:23:05,500][15401] Updated weights for policy 0, policy_version 818883 (0.0032) [2024-06-25 05:23:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 13416693760. Throughput: 0: 43070.0. Samples: 13416779220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 05:23:09,509][15401] Updated weights for policy 0, policy_version 818893 (0.0042) [2024-06-25 05:23:12,962][15349] Signal inference workers to stop experience collection... (198650 times) [2024-06-25 05:23:12,962][15349] Signal inference workers to resume experience collection... (198650 times) [2024-06-25 05:23:13,008][15401] InferenceWorker_p0-w0: stopping experience collection (198650 times) [2024-06-25 05:23:13,008][15401] InferenceWorker_p0-w0: resuming experience collection (198650 times) [2024-06-25 05:23:13,096][15401] Updated weights for policy 0, policy_version 818903 (0.0029) [2024-06-25 05:23:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13416906752. Throughput: 0: 42912.9. Samples: 13417036820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 05:23:16,968][15401] Updated weights for policy 0, policy_version 818913 (0.0035) [2024-06-25 05:23:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13417119744. Throughput: 0: 42923.5. Samples: 13417298100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 05:23:20,606][15401] Updated weights for policy 0, policy_version 818923 (0.0022) [2024-06-25 05:23:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13417349120. Throughput: 0: 43038.7. Samples: 13417428100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 05:23:24,392][15401] Updated weights for policy 0, policy_version 818933 (0.0028) [2024-06-25 05:23:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13417545728. Throughput: 0: 43096.1. Samples: 13417693020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 05:23:28,481][15401] Updated weights for policy 0, policy_version 818943 (0.0033) [2024-06-25 05:23:31,977][15401] Updated weights for policy 0, policy_version 818953 (0.0031) [2024-06-25 05:23:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13417775104. Throughput: 0: 43130.3. Samples: 13417941800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 05:23:36,002][15401] Updated weights for policy 0, policy_version 818963 (0.0045) [2024-06-25 05:23:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13417988096. Throughput: 0: 43119.6. Samples: 13418074120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:38,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 05:23:39,522][15401] Updated weights for policy 0, policy_version 818973 (0.0031) [2024-06-25 05:23:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13418201088. Throughput: 0: 43184.0. Samples: 13418336060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:43,396][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 05:23:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818982_13418201088.pth... [2024-06-25 05:23:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818352_13407879168.pth [2024-06-25 05:23:43,617][15401] Updated weights for policy 0, policy_version 818983 (0.0032) [2024-06-25 05:23:47,134][15401] Updated weights for policy 0, policy_version 818993 (0.0032) [2024-06-25 05:23:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13418446848. Throughput: 0: 43046.2. Samples: 13418585820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:48,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 05:23:51,182][15401] Updated weights for policy 0, policy_version 819003 (0.0031) [2024-06-25 05:23:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 13418627072. Throughput: 0: 43093.1. Samples: 13418718400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 05:23:54,651][15401] Updated weights for policy 0, policy_version 819013 (0.0043) [2024-06-25 05:23:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 13418840064. Throughput: 0: 43086.2. Samples: 13418975700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:23:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 05:23:58,618][15401] Updated weights for policy 0, policy_version 819023 (0.0033) [2024-06-25 05:24:02,296][15401] Updated weights for policy 0, policy_version 819033 (0.0035) [2024-06-25 05:24:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 13419069440. Throughput: 0: 42918.6. Samples: 13419229440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 05:24:06,167][15401] Updated weights for policy 0, policy_version 819043 (0.0033) [2024-06-25 05:24:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13419282432. Throughput: 0: 42966.1. Samples: 13419361580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:08,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 05:24:10,026][15401] Updated weights for policy 0, policy_version 819053 (0.0041) [2024-06-25 05:24:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13419495424. Throughput: 0: 42847.5. Samples: 13419621160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 05:24:13,645][15401] Updated weights for policy 0, policy_version 819063 (0.0026) [2024-06-25 05:24:17,581][15401] Updated weights for policy 0, policy_version 819073 (0.0038) [2024-06-25 05:24:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13419708416. Throughput: 0: 43035.5. Samples: 13419878400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:18,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 05:24:21,445][15401] Updated weights for policy 0, policy_version 819083 (0.0031) [2024-06-25 05:24:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13419921408. Throughput: 0: 42982.3. Samples: 13420008320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:23,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 05:24:25,244][15401] Updated weights for policy 0, policy_version 819093 (0.0037) [2024-06-25 05:24:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13420134400. Throughput: 0: 42911.2. Samples: 13420267060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-25 05:24:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 05:24:29,036][15401] Updated weights for policy 0, policy_version 819103 (0.0030) [2024-06-25 05:24:31,355][15349] Signal inference workers to stop experience collection... (198700 times) [2024-06-25 05:24:31,355][15349] Signal inference workers to resume experience collection... (198700 times) [2024-06-25 05:24:31,398][15401] InferenceWorker_p0-w0: stopping experience collection (198700 times) [2024-06-25 05:24:31,398][15401] InferenceWorker_p0-w0: resuming experience collection (198700 times) [2024-06-25 05:24:32,842][15401] Updated weights for policy 0, policy_version 819113 (0.0028) [2024-06-25 05:24:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13420347392. Throughput: 0: 43016.0. Samples: 13420521540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 05:24:36,578][15401] Updated weights for policy 0, policy_version 819123 (0.0038) [2024-06-25 05:24:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 13420576768. Throughput: 0: 42922.7. Samples: 13420649920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 05:24:40,715][15401] Updated weights for policy 0, policy_version 819133 (0.0027) [2024-06-25 05:24:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42987.5). Total num frames: 13420773376. Throughput: 0: 42948.0. Samples: 13420908360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 05:24:44,271][15401] Updated weights for policy 0, policy_version 819143 (0.0029) [2024-06-25 05:24:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42932.0). Total num frames: 13420969984. Throughput: 0: 42910.3. Samples: 13421160400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:24:48,718][15401] Updated weights for policy 0, policy_version 819153 (0.0036) [2024-06-25 05:24:52,063][15401] Updated weights for policy 0, policy_version 819163 (0.0037) [2024-06-25 05:24:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13421215744. Throughput: 0: 42728.5. Samples: 13421284360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 05:24:56,408][15401] Updated weights for policy 0, policy_version 819173 (0.0028) [2024-06-25 05:24:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13421412352. Throughput: 0: 42673.8. Samples: 13421541480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:24:58,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 05:24:59,649][15401] Updated weights for policy 0, policy_version 819183 (0.0032) [2024-06-25 05:25:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13421625344. Throughput: 0: 42706.2. Samples: 13421800180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:03,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-25 05:25:03,994][15401] Updated weights for policy 0, policy_version 819193 (0.0033) [2024-06-25 05:25:07,200][15401] Updated weights for policy 0, policy_version 819203 (0.0035) [2024-06-25 05:25:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 13421871104. Throughput: 0: 42706.5. Samples: 13421930120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:08,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-25 05:25:11,454][15401] Updated weights for policy 0, policy_version 819213 (0.0035) [2024-06-25 05:25:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13422067712. Throughput: 0: 42832.4. Samples: 13422194520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:13,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 05:25:14,874][15401] Updated weights for policy 0, policy_version 819223 (0.0029) [2024-06-25 05:25:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13422280704. Throughput: 0: 42743.6. Samples: 13422445000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:18,396][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 05:25:19,209][15401] Updated weights for policy 0, policy_version 819233 (0.0043) [2024-06-25 05:25:22,854][15401] Updated weights for policy 0, policy_version 819243 (0.0033) [2024-06-25 05:25:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 13422493696. Throughput: 0: 42703.0. Samples: 13422571560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:23,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 05:25:26,758][15401] Updated weights for policy 0, policy_version 819253 (0.0031) [2024-06-25 05:25:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13422706688. Throughput: 0: 42654.6. Samples: 13422827820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 05:25:30,299][15401] Updated weights for policy 0, policy_version 819263 (0.0037) [2024-06-25 05:25:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13422919680. Throughput: 0: 42704.4. Samples: 13423082100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 05:25:34,807][15401] Updated weights for policy 0, policy_version 819273 (0.0044) [2024-06-25 05:25:38,255][15401] Updated weights for policy 0, policy_version 819283 (0.0036) [2024-06-25 05:25:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 13423132672. Throughput: 0: 42732.4. Samples: 13423207320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:38,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 05:25:42,544][15401] Updated weights for policy 0, policy_version 819293 (0.0037) [2024-06-25 05:25:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13423345664. Throughput: 0: 42957.1. Samples: 13423474560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:43,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 05:25:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819297_13423362048.pth... [2024-06-25 05:25:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818668_13413056512.pth [2024-06-25 05:25:44,629][15349] Signal inference workers to stop experience collection... (198750 times) [2024-06-25 05:25:44,632][15349] Signal inference workers to resume experience collection... (198750 times) [2024-06-25 05:25:44,650][15401] InferenceWorker_p0-w0: stopping experience collection (198750 times) [2024-06-25 05:25:44,650][15401] InferenceWorker_p0-w0: resuming experience collection (198750 times) [2024-06-25 05:25:45,704][15401] Updated weights for policy 0, policy_version 819303 (0.0031) [2024-06-25 05:25:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13423558656. Throughput: 0: 42847.2. Samples: 13423728300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 05:25:49,914][15401] Updated weights for policy 0, policy_version 819313 (0.0027) [2024-06-25 05:25:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13423771648. Throughput: 0: 42777.5. Samples: 13423855100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 05:25:53,422][15401] Updated weights for policy 0, policy_version 819323 (0.0042) [2024-06-25 05:25:57,320][15401] Updated weights for policy 0, policy_version 819333 (0.0043) [2024-06-25 05:25:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13423984640. Throughput: 0: 42693.8. Samples: 13424115740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:25:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 05:26:01,258][15401] Updated weights for policy 0, policy_version 819343 (0.0030) [2024-06-25 05:26:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13424197632. Throughput: 0: 42759.1. Samples: 13424369160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:26:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 05:26:05,106][15401] Updated weights for policy 0, policy_version 819353 (0.0029) [2024-06-25 05:26:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 13424410624. Throughput: 0: 42844.5. Samples: 13424499560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:26:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 05:26:08,930][15401] Updated weights for policy 0, policy_version 819363 (0.0037) [2024-06-25 05:26:13,025][15401] Updated weights for policy 0, policy_version 819373 (0.0032) [2024-06-25 05:26:13,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13424623616. Throughput: 0: 42686.2. Samples: 13424748800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:26:13,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 05:26:16,664][15401] Updated weights for policy 0, policy_version 819383 (0.0043) [2024-06-25 05:26:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13424852992. Throughput: 0: 42676.1. Samples: 13425002520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 05:26:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 05:26:20,651][15401] Updated weights for policy 0, policy_version 819393 (0.0028) [2024-06-25 05:26:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13425065984. Throughput: 0: 42984.9. Samples: 13425141640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 05:26:24,419][15401] Updated weights for policy 0, policy_version 819403 (0.0043) [2024-06-25 05:26:28,134][15401] Updated weights for policy 0, policy_version 819413 (0.0029) [2024-06-25 05:26:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13425262592. Throughput: 0: 42661.8. Samples: 13425394340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 05:26:31,836][15401] Updated weights for policy 0, policy_version 819423 (0.0027) [2024-06-25 05:26:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13425475584. Throughput: 0: 42777.0. Samples: 13425653260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 05:26:36,070][15401] Updated weights for policy 0, policy_version 819433 (0.0038) [2024-06-25 05:26:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 13425704960. Throughput: 0: 42901.6. Samples: 13425785680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 05:26:39,335][15401] Updated weights for policy 0, policy_version 819443 (0.0032) [2024-06-25 05:26:43,351][15401] Updated weights for policy 0, policy_version 819453 (0.0037) [2024-06-25 05:26:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 13425917952. Throughput: 0: 42750.2. Samples: 13426039500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 05:26:47,227][15401] Updated weights for policy 0, policy_version 819463 (0.0033) [2024-06-25 05:26:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13426147328. Throughput: 0: 42932.1. Samples: 13426301100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:48,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 05:26:50,853][15401] Updated weights for policy 0, policy_version 819473 (0.0035) [2024-06-25 05:26:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13426343936. Throughput: 0: 42932.0. Samples: 13426431500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 05:26:54,857][15401] Updated weights for policy 0, policy_version 819483 (0.0029) [2024-06-25 05:26:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 13426556928. Throughput: 0: 42954.3. Samples: 13426681640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:26:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 05:26:58,713][15401] Updated weights for policy 0, policy_version 819493 (0.0032) [2024-06-25 05:27:02,480][15401] Updated weights for policy 0, policy_version 819503 (0.0037) [2024-06-25 05:27:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43142.8, 300 sec: 42876.1). Total num frames: 13426786304. Throughput: 0: 43021.3. Samples: 13426938580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:03,393][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 05:27:06,244][15401] Updated weights for policy 0, policy_version 819513 (0.0049) [2024-06-25 05:27:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13426982912. Throughput: 0: 42808.8. Samples: 13427068040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 05:27:10,052][15401] Updated weights for policy 0, policy_version 819523 (0.0035) [2024-06-25 05:27:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 13427179520. Throughput: 0: 42792.1. Samples: 13427319980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:27:14,184][15401] Updated weights for policy 0, policy_version 819533 (0.0037) [2024-06-25 05:27:17,718][15401] Updated weights for policy 0, policy_version 819543 (0.0046) [2024-06-25 05:27:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13427425280. Throughput: 0: 42661.3. Samples: 13427573020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 05:27:21,919][15401] Updated weights for policy 0, policy_version 819553 (0.0039) [2024-06-25 05:27:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13427638272. Throughput: 0: 42807.6. Samples: 13427712020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:23,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 05:27:25,097][15401] Updated weights for policy 0, policy_version 819563 (0.0026) [2024-06-25 05:27:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13427834880. Throughput: 0: 42810.8. Samples: 13427965980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 05:27:29,502][15401] Updated weights for policy 0, policy_version 819573 (0.0027) [2024-06-25 05:27:32,377][15349] Signal inference workers to stop experience collection... (198800 times) [2024-06-25 05:27:32,377][15349] Signal inference workers to resume experience collection... (198800 times) [2024-06-25 05:27:32,391][15401] InferenceWorker_p0-w0: stopping experience collection (198800 times) [2024-06-25 05:27:32,392][15401] InferenceWorker_p0-w0: resuming experience collection (198800 times) [2024-06-25 05:27:32,671][15401] Updated weights for policy 0, policy_version 819583 (0.0030) [2024-06-25 05:27:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13428064256. Throughput: 0: 42765.3. Samples: 13428225540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:27:37,059][15401] Updated weights for policy 0, policy_version 819593 (0.0026) [2024-06-25 05:27:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13428293632. Throughput: 0: 42919.0. Samples: 13428362860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:38,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 05:27:40,131][15401] Updated weights for policy 0, policy_version 819603 (0.0030) [2024-06-25 05:27:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13428490240. Throughput: 0: 42978.2. Samples: 13428615660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:43,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 05:27:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819610_13428490240.pth... [2024-06-25 05:27:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000818982_13418201088.pth [2024-06-25 05:27:44,642][15401] Updated weights for policy 0, policy_version 819613 (0.0046) [2024-06-25 05:27:47,594][15401] Updated weights for policy 0, policy_version 819623 (0.0032) [2024-06-25 05:27:48,395][15132] Fps is (10 sec: 40937.7, 60 sec: 42594.4, 300 sec: 42875.6). Total num frames: 13428703232. Throughput: 0: 43059.7. Samples: 13428876400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:48,395][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 05:27:52,050][15401] Updated weights for policy 0, policy_version 819633 (0.0042) [2024-06-25 05:27:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13428932608. Throughput: 0: 43125.3. Samples: 13429008680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 05:27:55,708][15401] Updated weights for policy 0, policy_version 819643 (0.0038) [2024-06-25 05:27:58,389][15132] Fps is (10 sec: 40982.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13429112832. Throughput: 0: 43138.7. Samples: 13429261220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:27:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 05:27:59,667][15401] Updated weights for policy 0, policy_version 819653 (0.0025) [2024-06-25 05:28:03,330][15401] Updated weights for policy 0, policy_version 819663 (0.0038) [2024-06-25 05:28:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42931.7). Total num frames: 13429358592. Throughput: 0: 43194.2. Samples: 13429516760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:28:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 05:28:07,595][15401] Updated weights for policy 0, policy_version 819673 (0.0038) [2024-06-25 05:28:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13429555200. Throughput: 0: 43028.9. Samples: 13429648320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:28:08,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 05:28:10,835][15401] Updated weights for policy 0, policy_version 819683 (0.0032) [2024-06-25 05:28:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13429768192. Throughput: 0: 43023.4. Samples: 13429902040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 05:28:15,053][15401] Updated weights for policy 0, policy_version 819693 (0.0034) [2024-06-25 05:28:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13429997568. Throughput: 0: 42926.2. Samples: 13430157220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:18,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 05:28:18,410][15401] Updated weights for policy 0, policy_version 819703 (0.0033) [2024-06-25 05:28:22,654][15401] Updated weights for policy 0, policy_version 819713 (0.0036) [2024-06-25 05:28:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13430210560. Throughput: 0: 42824.9. Samples: 13430289980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:23,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 05:28:26,186][15401] Updated weights for policy 0, policy_version 819723 (0.0030) [2024-06-25 05:28:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13430423552. Throughput: 0: 42911.6. Samples: 13430546680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:28,391][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:28:30,093][15401] Updated weights for policy 0, policy_version 819733 (0.0036) [2024-06-25 05:28:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13430636544. Throughput: 0: 42736.7. Samples: 13430799320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 05:28:33,829][15401] Updated weights for policy 0, policy_version 819743 (0.0030) [2024-06-25 05:28:37,818][15401] Updated weights for policy 0, policy_version 819753 (0.0038) [2024-06-25 05:28:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13430849536. Throughput: 0: 42751.6. Samples: 13430932500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 05:28:41,394][15401] Updated weights for policy 0, policy_version 819763 (0.0032) [2024-06-25 05:28:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13431062528. Throughput: 0: 42883.9. Samples: 13431191000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:43,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 05:28:45,298][15401] Updated weights for policy 0, policy_version 819773 (0.0041) [2024-06-25 05:28:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42875.3, 300 sec: 42876.1). Total num frames: 13431275520. Throughput: 0: 42951.4. Samples: 13431449580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 05:28:48,987][15401] Updated weights for policy 0, policy_version 819783 (0.0030) [2024-06-25 05:28:52,799][15401] Updated weights for policy 0, policy_version 819793 (0.0037) [2024-06-25 05:28:53,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13431521280. Throughput: 0: 42983.1. Samples: 13431582560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 05:28:56,809][15401] Updated weights for policy 0, policy_version 819803 (0.0040) [2024-06-25 05:28:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13431701504. Throughput: 0: 42995.7. Samples: 13431836840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:28:58,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 05:29:00,614][15401] Updated weights for policy 0, policy_version 819813 (0.0032) [2024-06-25 05:29:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13431930880. Throughput: 0: 43036.8. Samples: 13432093880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:03,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 05:29:03,622][15349] Signal inference workers to stop experience collection... (198850 times) [2024-06-25 05:29:03,628][15349] Signal inference workers to resume experience collection... (198850 times) [2024-06-25 05:29:03,660][15401] InferenceWorker_p0-w0: stopping experience collection (198850 times) [2024-06-25 05:29:03,660][15401] InferenceWorker_p0-w0: resuming experience collection (198850 times) [2024-06-25 05:29:04,636][15401] Updated weights for policy 0, policy_version 819823 (0.0035) [2024-06-25 05:29:08,094][15401] Updated weights for policy 0, policy_version 819833 (0.0033) [2024-06-25 05:29:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13432160256. Throughput: 0: 43101.3. Samples: 13432229540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 05:29:12,006][15401] Updated weights for policy 0, policy_version 819843 (0.0040) [2024-06-25 05:29:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13432340480. Throughput: 0: 43052.9. Samples: 13432484060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:13,390][15132] Avg episode reward: [(0, '0.910')] [2024-06-25 05:29:15,702][15401] Updated weights for policy 0, policy_version 819853 (0.0047) [2024-06-25 05:29:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13432586240. Throughput: 0: 43077.5. Samples: 13432737800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 05:29:19,479][15401] Updated weights for policy 0, policy_version 819863 (0.0034) [2024-06-25 05:29:23,251][15401] Updated weights for policy 0, policy_version 819873 (0.0033) [2024-06-25 05:29:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13432799232. Throughput: 0: 43283.1. Samples: 13432880240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 05:29:27,186][15401] Updated weights for policy 0, policy_version 819883 (0.0025) [2024-06-25 05:29:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13432995840. Throughput: 0: 43032.5. Samples: 13433127460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 05:29:30,787][15401] Updated weights for policy 0, policy_version 819893 (0.0045) [2024-06-25 05:29:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13433225216. Throughput: 0: 42913.8. Samples: 13433380700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:33,390][15132] Avg episode reward: [(0, '0.118')] [2024-06-25 05:29:34,926][15401] Updated weights for policy 0, policy_version 819903 (0.0037) [2024-06-25 05:29:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13433438208. Throughput: 0: 42945.8. Samples: 13433515120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 05:29:38,545][15401] Updated weights for policy 0, policy_version 819913 (0.0033) [2024-06-25 05:29:42,803][15401] Updated weights for policy 0, policy_version 819923 (0.0027) [2024-06-25 05:29:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13433651200. Throughput: 0: 42932.3. Samples: 13433768800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 05:29:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819925_13433651200.pth... [2024-06-25 05:29:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819297_13423362048.pth [2024-06-25 05:29:46,154][15401] Updated weights for policy 0, policy_version 819933 (0.0031) [2024-06-25 05:29:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13433864192. Throughput: 0: 42917.9. Samples: 13434025180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 05:29:50,250][15401] Updated weights for policy 0, policy_version 819943 (0.0033) [2024-06-25 05:29:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13434077184. Throughput: 0: 42825.8. Samples: 13434156700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 05:29:53,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 05:29:53,699][15401] Updated weights for policy 0, policy_version 819953 (0.0027) [2024-06-25 05:29:57,676][15401] Updated weights for policy 0, policy_version 819963 (0.0047) [2024-06-25 05:29:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13434290176. Throughput: 0: 42932.6. Samples: 13434416020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:29:58,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 05:30:01,216][15401] Updated weights for policy 0, policy_version 819973 (0.0048) [2024-06-25 05:30:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13434519552. Throughput: 0: 42968.0. Samples: 13434671360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:03,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 05:30:05,188][15401] Updated weights for policy 0, policy_version 819983 (0.0040) [2024-06-25 05:30:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 13434716160. Throughput: 0: 42786.1. Samples: 13434805720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:08,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 05:30:08,976][15401] Updated weights for policy 0, policy_version 819993 (0.0050) [2024-06-25 05:30:12,896][15401] Updated weights for policy 0, policy_version 820003 (0.0032) [2024-06-25 05:30:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13434945536. Throughput: 0: 42995.5. Samples: 13435062260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 05:30:16,752][15401] Updated weights for policy 0, policy_version 820013 (0.0028) [2024-06-25 05:30:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13435142144. Throughput: 0: 43135.6. Samples: 13435321800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:18,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 05:30:20,455][15401] Updated weights for policy 0, policy_version 820023 (0.0033) [2024-06-25 05:30:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13435355136. Throughput: 0: 42963.5. Samples: 13435448480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 05:30:24,180][15401] Updated weights for policy 0, policy_version 820033 (0.0040) [2024-06-25 05:30:27,938][15401] Updated weights for policy 0, policy_version 820043 (0.0034) [2024-06-25 05:30:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13435584512. Throughput: 0: 43031.2. Samples: 13435705200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 05:30:31,933][15401] Updated weights for policy 0, policy_version 820053 (0.0040) [2024-06-25 05:30:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13435781120. Throughput: 0: 43099.5. Samples: 13435964660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 05:30:35,579][15401] Updated weights for policy 0, policy_version 820063 (0.0036) [2024-06-25 05:30:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13436010496. Throughput: 0: 42867.2. Samples: 13436085720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:38,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:30:39,707][15401] Updated weights for policy 0, policy_version 820073 (0.0039) [2024-06-25 05:30:43,238][15401] Updated weights for policy 0, policy_version 820083 (0.0023) [2024-06-25 05:30:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 13436239872. Throughput: 0: 42948.2. Samples: 13436348800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:43,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:30:47,321][15401] Updated weights for policy 0, policy_version 820093 (0.0037) [2024-06-25 05:30:48,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 13436436480. Throughput: 0: 42815.3. Samples: 13436598320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:48,396][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 05:30:51,034][15349] Signal inference workers to stop experience collection... (198900 times) [2024-06-25 05:30:51,061][15401] InferenceWorker_p0-w0: stopping experience collection (198900 times) [2024-06-25 05:30:51,096][15349] Signal inference workers to resume experience collection... (198900 times) [2024-06-25 05:30:51,097][15401] InferenceWorker_p0-w0: resuming experience collection (198900 times) [2024-06-25 05:30:51,249][15401] Updated weights for policy 0, policy_version 820103 (0.0043) [2024-06-25 05:30:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13436649472. Throughput: 0: 42772.4. Samples: 13436730380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 05:30:54,824][15401] Updated weights for policy 0, policy_version 820113 (0.0037) [2024-06-25 05:30:58,389][15132] Fps is (10 sec: 40986.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13436846080. Throughput: 0: 42878.9. Samples: 13436991800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:30:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 05:30:58,915][15401] Updated weights for policy 0, policy_version 820123 (0.0039) [2024-06-25 05:31:02,492][15401] Updated weights for policy 0, policy_version 820133 (0.0030) [2024-06-25 05:31:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13437091840. Throughput: 0: 42536.9. Samples: 13437235960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 05:31:06,816][15401] Updated weights for policy 0, policy_version 820143 (0.0035) [2024-06-25 05:31:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42873.1, 300 sec: 42932.0). Total num frames: 13437288448. Throughput: 0: 42683.0. Samples: 13437369220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 05:31:10,167][15401] Updated weights for policy 0, policy_version 820153 (0.0034) [2024-06-25 05:31:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13437485056. Throughput: 0: 42700.9. Samples: 13437626740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 05:31:14,320][15401] Updated weights for policy 0, policy_version 820163 (0.0032) [2024-06-25 05:31:17,988][15401] Updated weights for policy 0, policy_version 820173 (0.0033) [2024-06-25 05:31:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13437714432. Throughput: 0: 42484.8. Samples: 13437876480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:18,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 05:31:22,055][15401] Updated weights for policy 0, policy_version 820183 (0.0038) [2024-06-25 05:31:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13437943808. Throughput: 0: 42723.7. Samples: 13438008280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 05:31:25,560][15401] Updated weights for policy 0, policy_version 820193 (0.0030) [2024-06-25 05:31:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13438124032. Throughput: 0: 42632.6. Samples: 13438267160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 05:31:29,963][15401] Updated weights for policy 0, policy_version 820203 (0.0036) [2024-06-25 05:31:33,286][15401] Updated weights for policy 0, policy_version 820213 (0.0023) [2024-06-25 05:31:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 13438369792. Throughput: 0: 42770.9. Samples: 13438522740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:33,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 05:31:37,546][15401] Updated weights for policy 0, policy_version 820223 (0.0035) [2024-06-25 05:31:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13438582784. Throughput: 0: 42704.4. Samples: 13438652080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 05:31:41,237][15401] Updated weights for policy 0, policy_version 820233 (0.0031) [2024-06-25 05:31:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 13438763008. Throughput: 0: 42508.7. Samples: 13438904700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 05:31:43,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-25 05:31:43,559][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820238_13438779392.pth... [2024-06-25 05:31:43,618][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819610_13428490240.pth [2024-06-25 05:31:44,974][15401] Updated weights for policy 0, policy_version 820243 (0.0033) [2024-06-25 05:31:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.9, 300 sec: 42876.1). Total num frames: 13438992384. Throughput: 0: 42816.9. Samples: 13439162720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:31:48,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 05:31:48,873][15401] Updated weights for policy 0, policy_version 820253 (0.0041) [2024-06-25 05:31:52,641][15401] Updated weights for policy 0, policy_version 820263 (0.0027) [2024-06-25 05:31:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13439221760. Throughput: 0: 42769.9. Samples: 13439293860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:31:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 05:31:56,383][15401] Updated weights for policy 0, policy_version 820273 (0.0042) [2024-06-25 05:31:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 13439418368. Throughput: 0: 42684.0. Samples: 13439547520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:31:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 05:32:00,532][15401] Updated weights for policy 0, policy_version 820283 (0.0040) [2024-06-25 05:32:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13439647744. Throughput: 0: 42816.0. Samples: 13439803200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 05:32:03,970][15401] Updated weights for policy 0, policy_version 820293 (0.0028) [2024-06-25 05:32:07,897][15349] Signal inference workers to stop experience collection... (198950 times) [2024-06-25 05:32:07,951][15349] Signal inference workers to resume experience collection... (198950 times) [2024-06-25 05:32:07,951][15401] InferenceWorker_p0-w0: stopping experience collection (198950 times) [2024-06-25 05:32:07,966][15401] InferenceWorker_p0-w0: resuming experience collection (198950 times) [2024-06-25 05:32:08,098][15401] Updated weights for policy 0, policy_version 820303 (0.0039) [2024-06-25 05:32:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13439860736. Throughput: 0: 42943.5. Samples: 13439940740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:08,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:32:11,734][15401] Updated weights for policy 0, policy_version 820313 (0.0038) [2024-06-25 05:32:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13440057344. Throughput: 0: 42820.9. Samples: 13440194100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:13,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 05:32:15,544][15401] Updated weights for policy 0, policy_version 820323 (0.0039) [2024-06-25 05:32:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13440303104. Throughput: 0: 42679.5. Samples: 13440443320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 05:32:19,311][15401] Updated weights for policy 0, policy_version 820333 (0.0024) [2024-06-25 05:32:23,063][15401] Updated weights for policy 0, policy_version 820343 (0.0036) [2024-06-25 05:32:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13440499712. Throughput: 0: 42869.8. Samples: 13440581220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 05:32:26,883][15401] Updated weights for policy 0, policy_version 820353 (0.0036) [2024-06-25 05:32:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13440712704. Throughput: 0: 42860.5. Samples: 13440833420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:28,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 05:32:30,713][15401] Updated weights for policy 0, policy_version 820363 (0.0034) [2024-06-25 05:32:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13440942080. Throughput: 0: 42766.3. Samples: 13441087200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 05:32:34,505][15401] Updated weights for policy 0, policy_version 820373 (0.0039) [2024-06-25 05:32:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13441138688. Throughput: 0: 42748.8. Samples: 13441217560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:38,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-25 05:32:38,630][15401] Updated weights for policy 0, policy_version 820383 (0.0027) [2024-06-25 05:32:42,048][15401] Updated weights for policy 0, policy_version 820393 (0.0039) [2024-06-25 05:32:43,390][15132] Fps is (10 sec: 40956.2, 60 sec: 43144.0, 300 sec: 42876.8). Total num frames: 13441351680. Throughput: 0: 42708.5. Samples: 13441469440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:43,391][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 05:32:46,211][15401] Updated weights for policy 0, policy_version 820403 (0.0028) [2024-06-25 05:32:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13441564672. Throughput: 0: 42692.9. Samples: 13441724380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 05:32:50,318][15401] Updated weights for policy 0, policy_version 820413 (0.0036) [2024-06-25 05:32:53,389][15132] Fps is (10 sec: 40963.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13441761280. Throughput: 0: 42478.7. Samples: 13441852280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 05:32:54,041][15401] Updated weights for policy 0, policy_version 820423 (0.0051) [2024-06-25 05:32:58,242][15401] Updated weights for policy 0, policy_version 820433 (0.0024) [2024-06-25 05:32:58,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 13441974272. Throughput: 0: 42422.4. Samples: 13442103380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:32:58,396][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 05:33:01,588][15401] Updated weights for policy 0, policy_version 820443 (0.0036) [2024-06-25 05:33:03,390][15132] Fps is (10 sec: 44234.0, 60 sec: 42598.0, 300 sec: 42876.0). Total num frames: 13442203648. Throughput: 0: 42591.9. Samples: 13442359980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:03,391][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 05:33:06,096][15401] Updated weights for policy 0, policy_version 820453 (0.0045) [2024-06-25 05:33:08,389][15132] Fps is (10 sec: 40986.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13442383872. Throughput: 0: 42403.6. Samples: 13442489380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 05:33:09,441][15401] Updated weights for policy 0, policy_version 820463 (0.0039) [2024-06-25 05:33:13,390][15132] Fps is (10 sec: 40962.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13442613248. Throughput: 0: 42411.6. Samples: 13442741940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:13,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 05:33:13,662][15401] Updated weights for policy 0, policy_version 820473 (0.0036) [2024-06-25 05:33:17,215][15401] Updated weights for policy 0, policy_version 820483 (0.0045) [2024-06-25 05:33:18,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13442859008. Throughput: 0: 42358.7. Samples: 13442993340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 05:33:21,408][15401] Updated weights for policy 0, policy_version 820493 (0.0037) [2024-06-25 05:33:23,390][15132] Fps is (10 sec: 42596.7, 60 sec: 42325.0, 300 sec: 42765.0). Total num frames: 13443039232. Throughput: 0: 42460.1. Samples: 13443128280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:23,399][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 05:33:24,844][15401] Updated weights for policy 0, policy_version 820503 (0.0036) [2024-06-25 05:33:26,467][15349] Signal inference workers to stop experience collection... (199000 times) [2024-06-25 05:33:26,470][15349] Signal inference workers to resume experience collection... (199000 times) [2024-06-25 05:33:26,497][15401] InferenceWorker_p0-w0: stopping experience collection (199000 times) [2024-06-25 05:33:26,497][15401] InferenceWorker_p0-w0: resuming experience collection (199000 times) [2024-06-25 05:33:28,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 13443252224. Throughput: 0: 42428.4. Samples: 13443378780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:28,401][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 05:33:29,078][15401] Updated weights for policy 0, policy_version 820513 (0.0041) [2024-06-25 05:33:32,407][15401] Updated weights for policy 0, policy_version 820523 (0.0032) [2024-06-25 05:33:33,389][15132] Fps is (10 sec: 44238.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13443481600. Throughput: 0: 42435.1. Samples: 13443633960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-25 05:33:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 05:33:36,577][15401] Updated weights for policy 0, policy_version 820533 (0.0031) [2024-06-25 05:33:38,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13443678208. Throughput: 0: 42571.6. Samples: 13443768000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:33:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 05:33:39,995][15401] Updated weights for policy 0, policy_version 820543 (0.0036) [2024-06-25 05:33:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42599.0, 300 sec: 42820.6). Total num frames: 13443907584. Throughput: 0: 42641.5. Samples: 13444021980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:33:43,392][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 05:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820551_13443907584.pth... [2024-06-25 05:33:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000819925_13433651200.pth [2024-06-25 05:33:44,158][15401] Updated weights for policy 0, policy_version 820553 (0.0043) [2024-06-25 05:33:47,669][15401] Updated weights for policy 0, policy_version 820563 (0.0029) [2024-06-25 05:33:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13444136960. Throughput: 0: 42449.0. Samples: 13444270160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:33:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 05:33:51,969][15401] Updated weights for policy 0, policy_version 820573 (0.0029) [2024-06-25 05:33:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13444333568. Throughput: 0: 42644.9. Samples: 13444408400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:33:53,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 05:33:55,226][15401] Updated weights for policy 0, policy_version 820583 (0.0028) [2024-06-25 05:33:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 13444530176. Throughput: 0: 42737.4. Samples: 13444665120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:33:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 05:33:59,471][15401] Updated weights for policy 0, policy_version 820593 (0.0036) [2024-06-25 05:34:03,047][15401] Updated weights for policy 0, policy_version 820603 (0.0036) [2024-06-25 05:34:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.9, 300 sec: 42820.6). Total num frames: 13444792320. Throughput: 0: 42661.2. Samples: 13444913100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 05:34:07,097][15401] Updated weights for policy 0, policy_version 820613 (0.0040) [2024-06-25 05:34:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13444956160. Throughput: 0: 42657.3. Samples: 13445047840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:08,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 05:34:10,753][15401] Updated weights for policy 0, policy_version 820623 (0.0033) [2024-06-25 05:34:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13445169152. Throughput: 0: 42670.6. Samples: 13445298860. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 05:34:14,720][15401] Updated weights for policy 0, policy_version 820633 (0.0039) [2024-06-25 05:34:18,252][15401] Updated weights for policy 0, policy_version 820643 (0.0032) [2024-06-25 05:34:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13445414912. Throughput: 0: 42730.7. Samples: 13445556840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:18,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 05:34:22,356][15401] Updated weights for policy 0, policy_version 820653 (0.0036) [2024-06-25 05:34:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 13445595136. Throughput: 0: 42767.9. Samples: 13445692560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 05:34:26,076][15401] Updated weights for policy 0, policy_version 820663 (0.0039) [2024-06-25 05:34:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 13445808128. Throughput: 0: 42524.1. Samples: 13445935560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:28,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 05:34:30,025][15401] Updated weights for policy 0, policy_version 820673 (0.0035) [2024-06-25 05:34:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13446053888. Throughput: 0: 42823.5. Samples: 13446197220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 05:34:33,654][15401] Updated weights for policy 0, policy_version 820683 (0.0026) [2024-06-25 05:34:37,754][15401] Updated weights for policy 0, policy_version 820693 (0.0041) [2024-06-25 05:34:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13446234112. Throughput: 0: 42765.4. Samples: 13446332840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 05:34:41,090][15349] Signal inference workers to stop experience collection... (199050 times) [2024-06-25 05:34:41,147][15401] InferenceWorker_p0-w0: stopping experience collection (199050 times) [2024-06-25 05:34:41,204][15349] Signal inference workers to resume experience collection... (199050 times) [2024-06-25 05:34:41,204][15401] InferenceWorker_p0-w0: resuming experience collection (199050 times) [2024-06-25 05:34:41,335][15401] Updated weights for policy 0, policy_version 820703 (0.0031) [2024-06-25 05:34:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13446447104. Throughput: 0: 42569.7. Samples: 13446580760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:43,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 05:34:45,789][15401] Updated weights for policy 0, policy_version 820713 (0.0030) [2024-06-25 05:34:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13446676480. Throughput: 0: 42723.7. Samples: 13446835660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 05:34:48,923][15401] Updated weights for policy 0, policy_version 820723 (0.0028) [2024-06-25 05:34:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13446873088. Throughput: 0: 42654.3. Samples: 13446967280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 05:34:53,532][15401] Updated weights for policy 0, policy_version 820733 (0.0038) [2024-06-25 05:34:56,656][15401] Updated weights for policy 0, policy_version 820743 (0.0023) [2024-06-25 05:34:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13447102464. Throughput: 0: 42524.6. Samples: 13447212460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:34:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 05:35:01,318][15401] Updated weights for policy 0, policy_version 820753 (0.0055) [2024-06-25 05:35:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42709.8). Total num frames: 13447315456. Throughput: 0: 42561.8. Samples: 13447472120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:35:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 05:35:04,747][15401] Updated weights for policy 0, policy_version 820763 (0.0036) [2024-06-25 05:35:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13447495680. Throughput: 0: 42303.2. Samples: 13447596200. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:35:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 05:35:09,010][15401] Updated weights for policy 0, policy_version 820773 (0.0043) [2024-06-25 05:35:12,308][15401] Updated weights for policy 0, policy_version 820783 (0.0032) [2024-06-25 05:35:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13447741440. Throughput: 0: 42536.3. Samples: 13447849700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:35:13,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 05:35:16,806][15401] Updated weights for policy 0, policy_version 820793 (0.0033) [2024-06-25 05:35:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13447938048. Throughput: 0: 42561.4. Samples: 13448112480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 05:35:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 05:35:19,909][15401] Updated weights for policy 0, policy_version 820803 (0.0029) [2024-06-25 05:35:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13448151040. Throughput: 0: 42328.3. Samples: 13448237620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 05:35:24,239][15401] Updated weights for policy 0, policy_version 820813 (0.0033) [2024-06-25 05:35:27,702][15401] Updated weights for policy 0, policy_version 820823 (0.0038) [2024-06-25 05:35:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13448364032. Throughput: 0: 42428.6. Samples: 13448490040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:28,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 05:35:31,842][15401] Updated weights for policy 0, policy_version 820833 (0.0022) [2024-06-25 05:35:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13448577024. Throughput: 0: 42533.7. Samples: 13448749680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 05:35:35,215][15401] Updated weights for policy 0, policy_version 820843 (0.0032) [2024-06-25 05:35:38,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42542.9). Total num frames: 13448790016. Throughput: 0: 42546.3. Samples: 13448881960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:38,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 05:35:39,309][15401] Updated weights for policy 0, policy_version 820853 (0.0027) [2024-06-25 05:35:42,792][15401] Updated weights for policy 0, policy_version 820863 (0.0035) [2024-06-25 05:35:43,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42654.5). Total num frames: 13449019392. Throughput: 0: 42807.4. Samples: 13449138900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:43,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 05:35:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820863_13449019392.pth... [2024-06-25 05:35:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820238_13438779392.pth [2024-06-25 05:35:46,932][15401] Updated weights for policy 0, policy_version 820873 (0.0041) [2024-06-25 05:35:48,389][15132] Fps is (10 sec: 44246.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13449232384. Throughput: 0: 42818.7. Samples: 13449398960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 05:35:50,437][15401] Updated weights for policy 0, policy_version 820883 (0.0037) [2024-06-25 05:35:53,391][15132] Fps is (10 sec: 42601.0, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 13449445376. Throughput: 0: 42904.0. Samples: 13449526960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:53,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 05:35:54,301][15401] Updated weights for policy 0, policy_version 820893 (0.0039) [2024-06-25 05:35:56,119][15349] Signal inference workers to stop experience collection... (199100 times) [2024-06-25 05:35:56,119][15349] Signal inference workers to resume experience collection... (199100 times) [2024-06-25 05:35:56,142][15401] InferenceWorker_p0-w0: stopping experience collection (199100 times) [2024-06-25 05:35:56,142][15401] InferenceWorker_p0-w0: resuming experience collection (199100 times) [2024-06-25 05:35:57,824][15401] Updated weights for policy 0, policy_version 820903 (0.0031) [2024-06-25 05:35:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13449674752. Throughput: 0: 42989.9. Samples: 13449784240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:35:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 05:36:02,204][15401] Updated weights for policy 0, policy_version 820913 (0.0033) [2024-06-25 05:36:03,390][15132] Fps is (10 sec: 42605.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13449871360. Throughput: 0: 43043.4. Samples: 13450049440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 05:36:05,705][15401] Updated weights for policy 0, policy_version 820923 (0.0042) [2024-06-25 05:36:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13450084352. Throughput: 0: 43020.1. Samples: 13450173520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 05:36:09,522][15401] Updated weights for policy 0, policy_version 820933 (0.0028) [2024-06-25 05:36:13,330][15401] Updated weights for policy 0, policy_version 820943 (0.0028) [2024-06-25 05:36:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13450330112. Throughput: 0: 43112.5. Samples: 13450430100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 05:36:17,482][15401] Updated weights for policy 0, policy_version 820953 (0.0032) [2024-06-25 05:36:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 13450526720. Throughput: 0: 43154.2. Samples: 13450691620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:18,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 05:36:21,257][15401] Updated weights for policy 0, policy_version 820963 (0.0036) [2024-06-25 05:36:23,392][15132] Fps is (10 sec: 40949.9, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 13450739712. Throughput: 0: 43056.2. Samples: 13450819500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:23,393][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 05:36:25,154][15401] Updated weights for policy 0, policy_version 820973 (0.0028) [2024-06-25 05:36:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13450952704. Throughput: 0: 43027.1. Samples: 13451075020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 05:36:28,797][15401] Updated weights for policy 0, policy_version 820983 (0.0035) [2024-06-25 05:36:32,814][15401] Updated weights for policy 0, policy_version 820993 (0.0027) [2024-06-25 05:36:33,393][15132] Fps is (10 sec: 44230.2, 60 sec: 43414.8, 300 sec: 42708.9). Total num frames: 13451182080. Throughput: 0: 42922.4. Samples: 13451330640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:33,394][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 05:36:36,530][15401] Updated weights for policy 0, policy_version 821003 (0.0024) [2024-06-25 05:36:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 13451378688. Throughput: 0: 42968.8. Samples: 13451460480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 05:36:40,401][15401] Updated weights for policy 0, policy_version 821013 (0.0029) [2024-06-25 05:36:43,389][15132] Fps is (10 sec: 40976.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13451591680. Throughput: 0: 42961.3. Samples: 13451717500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 05:36:44,346][15401] Updated weights for policy 0, policy_version 821023 (0.0026) [2024-06-25 05:36:48,305][15401] Updated weights for policy 0, policy_version 821033 (0.0044) [2024-06-25 05:36:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13451804672. Throughput: 0: 42901.4. Samples: 13451980000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 05:36:51,729][15401] Updated weights for policy 0, policy_version 821043 (0.0031) [2024-06-25 05:36:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.8, 300 sec: 42709.5). Total num frames: 13452017664. Throughput: 0: 42936.8. Samples: 13452105680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 05:36:55,913][15401] Updated weights for policy 0, policy_version 821053 (0.0031) [2024-06-25 05:36:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13452230656. Throughput: 0: 42761.4. Samples: 13452354360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:36:58,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 05:36:59,278][15401] Updated weights for policy 0, policy_version 821063 (0.0038) [2024-06-25 05:37:03,334][15401] Updated weights for policy 0, policy_version 821073 (0.0031) [2024-06-25 05:37:03,393][15132] Fps is (10 sec: 44220.6, 60 sec: 43141.9, 300 sec: 42708.9). Total num frames: 13452460032. Throughput: 0: 42874.7. Samples: 13452621140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:37:03,394][15132] Avg episode reward: [(0, '0.270')] [2024-06-25 05:37:07,200][15401] Updated weights for policy 0, policy_version 821083 (0.0030) [2024-06-25 05:37:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13452656640. Throughput: 0: 42804.9. Samples: 13452745620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 05:37:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 05:37:10,873][15401] Updated weights for policy 0, policy_version 821093 (0.0045) [2024-06-25 05:37:13,390][15132] Fps is (10 sec: 42613.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13452886016. Throughput: 0: 42794.7. Samples: 13453000780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 05:37:14,787][15401] Updated weights for policy 0, policy_version 821103 (0.0036) [2024-06-25 05:37:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13453099008. Throughput: 0: 42962.3. Samples: 13453263780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 05:37:18,446][15401] Updated weights for policy 0, policy_version 821113 (0.0031) [2024-06-25 05:37:22,171][15401] Updated weights for policy 0, policy_version 821123 (0.0028) [2024-06-25 05:37:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13453312000. Throughput: 0: 42937.4. Samples: 13453392660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 05:37:25,883][15349] Signal inference workers to stop experience collection... (199150 times) [2024-06-25 05:37:25,883][15349] Signal inference workers to resume experience collection... (199150 times) [2024-06-25 05:37:25,913][15401] InferenceWorker_p0-w0: stopping experience collection (199150 times) [2024-06-25 05:37:25,914][15401] InferenceWorker_p0-w0: resuming experience collection (199150 times) [2024-06-25 05:37:26,035][15401] Updated weights for policy 0, policy_version 821133 (0.0033) [2024-06-25 05:37:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 13453524992. Throughput: 0: 42917.8. Samples: 13453648800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 05:37:29,744][15401] Updated weights for policy 0, policy_version 821143 (0.0032) [2024-06-25 05:37:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42599.5, 300 sec: 42709.1). Total num frames: 13453737984. Throughput: 0: 42811.6. Samples: 13453906620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:33,392][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 05:37:33,619][15401] Updated weights for policy 0, policy_version 821153 (0.0028) [2024-06-25 05:37:37,819][15401] Updated weights for policy 0, policy_version 821163 (0.0025) [2024-06-25 05:37:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 13453934592. Throughput: 0: 42886.3. Samples: 13454035560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 05:37:41,237][15401] Updated weights for policy 0, policy_version 821173 (0.0040) [2024-06-25 05:37:43,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13454180352. Throughput: 0: 43026.9. Samples: 13454290580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:37:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821178_13454180352.pth... [2024-06-25 05:37:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820551_13443907584.pth [2024-06-25 05:37:45,305][15401] Updated weights for policy 0, policy_version 821183 (0.0036) [2024-06-25 05:37:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13454376960. Throughput: 0: 42802.2. Samples: 13454547080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 05:37:48,977][15401] Updated weights for policy 0, policy_version 821193 (0.0036) [2024-06-25 05:37:53,266][15401] Updated weights for policy 0, policy_version 821203 (0.0029) [2024-06-25 05:37:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 13454589952. Throughput: 0: 42874.4. Samples: 13454674960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:53,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 05:37:56,822][15401] Updated weights for policy 0, policy_version 821213 (0.0036) [2024-06-25 05:37:58,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.4, 300 sec: 42820.6). Total num frames: 13454835712. Throughput: 0: 42856.0. Samples: 13454929300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:37:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 05:38:00,983][15401] Updated weights for policy 0, policy_version 821223 (0.0031) [2024-06-25 05:38:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42601.1, 300 sec: 42820.6). Total num frames: 13455015936. Throughput: 0: 42864.2. Samples: 13455192660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 05:38:04,508][15401] Updated weights for policy 0, policy_version 821233 (0.0032) [2024-06-25 05:38:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13455228928. Throughput: 0: 42724.4. Samples: 13455315260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 05:38:08,510][15401] Updated weights for policy 0, policy_version 821243 (0.0039) [2024-06-25 05:38:12,134][15401] Updated weights for policy 0, policy_version 821253 (0.0032) [2024-06-25 05:38:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13455458304. Throughput: 0: 42739.1. Samples: 13455572060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 05:38:16,239][15401] Updated weights for policy 0, policy_version 821263 (0.0035) [2024-06-25 05:38:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 13455654912. Throughput: 0: 42744.9. Samples: 13455830040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:18,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 05:38:19,863][15401] Updated weights for policy 0, policy_version 821273 (0.0039) [2024-06-25 05:38:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 13455867904. Throughput: 0: 42772.4. Samples: 13455960320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 05:38:23,675][15401] Updated weights for policy 0, policy_version 821283 (0.0034) [2024-06-25 05:38:27,589][15401] Updated weights for policy 0, policy_version 821293 (0.0037) [2024-06-25 05:38:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13456097280. Throughput: 0: 42807.0. Samples: 13456216900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 05:38:31,539][15401] Updated weights for policy 0, policy_version 821303 (0.0044) [2024-06-25 05:38:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 13456310272. Throughput: 0: 42797.8. Samples: 13456473080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:33,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 05:38:35,115][15401] Updated weights for policy 0, policy_version 821313 (0.0034) [2024-06-25 05:38:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13456506880. Throughput: 0: 42770.9. Samples: 13456599660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 05:38:39,003][15401] Updated weights for policy 0, policy_version 821323 (0.0028) [2024-06-25 05:38:42,552][15349] Signal inference workers to stop experience collection... (199200 times) [2024-06-25 05:38:42,591][15401] InferenceWorker_p0-w0: stopping experience collection (199200 times) [2024-06-25 05:38:42,610][15349] Signal inference workers to resume experience collection... (199200 times) [2024-06-25 05:38:42,611][15401] InferenceWorker_p0-w0: resuming experience collection (199200 times) [2024-06-25 05:38:42,757][15401] Updated weights for policy 0, policy_version 821333 (0.0028) [2024-06-25 05:38:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13456736256. Throughput: 0: 42932.2. Samples: 13456861240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 05:38:46,774][15401] Updated weights for policy 0, policy_version 821343 (0.0030) [2024-06-25 05:38:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13456965632. Throughput: 0: 42548.7. Samples: 13457107360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:48,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 05:38:50,320][15401] Updated weights for policy 0, policy_version 821353 (0.0029) [2024-06-25 05:38:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13457129472. Throughput: 0: 42801.3. Samples: 13457241320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:53,396][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 05:38:54,408][15401] Updated weights for policy 0, policy_version 821363 (0.0025) [2024-06-25 05:38:57,888][15401] Updated weights for policy 0, policy_version 821373 (0.0035) [2024-06-25 05:38:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13457408000. Throughput: 0: 42965.7. Samples: 13457505520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 05:38:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 05:39:01,867][15401] Updated weights for policy 0, policy_version 821383 (0.0036) [2024-06-25 05:39:03,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13457604608. Throughput: 0: 42803.5. Samples: 13457756200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 05:39:05,466][15401] Updated weights for policy 0, policy_version 821393 (0.0038) [2024-06-25 05:39:08,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13457784832. Throughput: 0: 42696.5. Samples: 13457881660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:39:09,823][15401] Updated weights for policy 0, policy_version 821403 (0.0027) [2024-06-25 05:39:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13458014208. Throughput: 0: 42664.6. Samples: 13458136800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 05:39:13,467][15401] Updated weights for policy 0, policy_version 821413 (0.0031) [2024-06-25 05:39:17,346][15401] Updated weights for policy 0, policy_version 821423 (0.0042) [2024-06-25 05:39:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13458243584. Throughput: 0: 42727.6. Samples: 13458395720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 05:39:21,202][15401] Updated weights for policy 0, policy_version 821433 (0.0028) [2024-06-25 05:39:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13458423808. Throughput: 0: 42861.8. Samples: 13458528440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:23,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 05:39:25,394][15401] Updated weights for policy 0, policy_version 821443 (0.0029) [2024-06-25 05:39:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13458653184. Throughput: 0: 42610.1. Samples: 13458778700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 05:39:28,826][15401] Updated weights for policy 0, policy_version 821453 (0.0035) [2024-06-25 05:39:32,979][15401] Updated weights for policy 0, policy_version 821463 (0.0040) [2024-06-25 05:39:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 13458882560. Throughput: 0: 42936.1. Samples: 13459039480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 05:39:36,393][15401] Updated weights for policy 0, policy_version 821473 (0.0035) [2024-06-25 05:39:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13459079168. Throughput: 0: 42850.8. Samples: 13459169600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 05:39:40,521][15401] Updated weights for policy 0, policy_version 821483 (0.0038) [2024-06-25 05:39:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13459308544. Throughput: 0: 42628.1. Samples: 13459423780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 05:39:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821491_13459308544.pth... [2024-06-25 05:39:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000820863_13449019392.pth [2024-06-25 05:39:44,125][15401] Updated weights for policy 0, policy_version 821493 (0.0043) [2024-06-25 05:39:48,140][15401] Updated weights for policy 0, policy_version 821503 (0.0041) [2024-06-25 05:39:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13459505152. Throughput: 0: 42858.7. Samples: 13459684840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:48,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 05:39:50,020][15349] Signal inference workers to stop experience collection... (199250 times) [2024-06-25 05:39:50,021][15349] Signal inference workers to resume experience collection... (199250 times) [2024-06-25 05:39:50,048][15401] InferenceWorker_p0-w0: stopping experience collection (199250 times) [2024-06-25 05:39:50,049][15401] InferenceWorker_p0-w0: resuming experience collection (199250 times) [2024-06-25 05:39:51,682][15401] Updated weights for policy 0, policy_version 821513 (0.0035) [2024-06-25 05:39:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13459734528. Throughput: 0: 42901.2. Samples: 13459812220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 05:39:55,669][15401] Updated weights for policy 0, policy_version 821523 (0.0034) [2024-06-25 05:39:58,394][15132] Fps is (10 sec: 44219.0, 60 sec: 42322.5, 300 sec: 42820.0). Total num frames: 13459947520. Throughput: 0: 42974.8. Samples: 13460070840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:39:58,394][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 05:39:59,196][15401] Updated weights for policy 0, policy_version 821533 (0.0033) [2024-06-25 05:40:03,282][15401] Updated weights for policy 0, policy_version 821543 (0.0042) [2024-06-25 05:40:03,391][15132] Fps is (10 sec: 42592.8, 60 sec: 42597.5, 300 sec: 42931.4). Total num frames: 13460160512. Throughput: 0: 42868.5. Samples: 13460324860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:03,391][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 05:40:07,059][15401] Updated weights for policy 0, policy_version 821553 (0.0031) [2024-06-25 05:40:08,390][15132] Fps is (10 sec: 42614.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13460373504. Throughput: 0: 42693.2. Samples: 13460449640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:08,399][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 05:40:10,910][15401] Updated weights for policy 0, policy_version 821563 (0.0039) [2024-06-25 05:40:13,396][15132] Fps is (10 sec: 42577.0, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 13460586496. Throughput: 0: 42883.7. Samples: 13460708740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:13,397][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 05:40:14,811][15401] Updated weights for policy 0, policy_version 821573 (0.0042) [2024-06-25 05:40:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13460783104. Throughput: 0: 42761.3. Samples: 13460963740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 05:40:18,665][15401] Updated weights for policy 0, policy_version 821583 (0.0041) [2024-06-25 05:40:22,587][15401] Updated weights for policy 0, policy_version 821593 (0.0039) [2024-06-25 05:40:23,390][15132] Fps is (10 sec: 44264.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13461028864. Throughput: 0: 42626.8. Samples: 13461087820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 05:40:26,190][15401] Updated weights for policy 0, policy_version 821603 (0.0029) [2024-06-25 05:40:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13461225472. Throughput: 0: 42703.5. Samples: 13461345440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 05:40:30,354][15401] Updated weights for policy 0, policy_version 821613 (0.0047) [2024-06-25 05:40:33,390][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 13461438464. Throughput: 0: 42522.2. Samples: 13461598340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:33,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 05:40:33,757][15401] Updated weights for policy 0, policy_version 821623 (0.0045) [2024-06-25 05:40:38,007][15401] Updated weights for policy 0, policy_version 821633 (0.0037) [2024-06-25 05:40:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 13461635072. Throughput: 0: 42632.0. Samples: 13461730660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 05:40:41,903][15401] Updated weights for policy 0, policy_version 821643 (0.0035) [2024-06-25 05:40:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 13461848064. Throughput: 0: 42527.3. Samples: 13461984400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 05:40:45,712][15401] Updated weights for policy 0, policy_version 821653 (0.0038) [2024-06-25 05:40:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 13462077440. Throughput: 0: 42411.0. Samples: 13462233300. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 05:40:49,630][15401] Updated weights for policy 0, policy_version 821663 (0.0044) [2024-06-25 05:40:53,272][15401] Updated weights for policy 0, policy_version 821673 (0.0039) [2024-06-25 05:40:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13462290432. Throughput: 0: 42640.5. Samples: 13462368460. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 05:40:57,274][15401] Updated weights for policy 0, policy_version 821683 (0.0042) [2024-06-25 05:40:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42328.1, 300 sec: 42765.0). Total num frames: 13462487040. Throughput: 0: 42568.2. Samples: 13462624040. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:40:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 05:41:00,826][15401] Updated weights for policy 0, policy_version 821693 (0.0032) [2024-06-25 05:41:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.4, 300 sec: 42876.1). Total num frames: 13462732800. Throughput: 0: 42385.7. Samples: 13462871100. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:03,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 05:41:05,146][15401] Updated weights for policy 0, policy_version 821703 (0.0041) [2024-06-25 05:41:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13462913024. Throughput: 0: 42635.2. Samples: 13463006400. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:08,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 05:41:08,587][15401] Updated weights for policy 0, policy_version 821713 (0.0037) [2024-06-25 05:41:09,911][15349] Signal inference workers to stop experience collection... (199300 times) [2024-06-25 05:41:09,912][15349] Signal inference workers to resume experience collection... (199300 times) [2024-06-25 05:41:09,923][15401] InferenceWorker_p0-w0: stopping experience collection (199300 times) [2024-06-25 05:41:09,947][15401] InferenceWorker_p0-w0: resuming experience collection (199300 times) [2024-06-25 05:41:12,791][15401] Updated weights for policy 0, policy_version 821723 (0.0036) [2024-06-25 05:41:13,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42055.1, 300 sec: 42653.6). Total num frames: 13463109632. Throughput: 0: 42392.0. Samples: 13463253180. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:13,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 05:41:16,455][15401] Updated weights for policy 0, policy_version 821733 (0.0034) [2024-06-25 05:41:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13463355392. Throughput: 0: 42331.1. Samples: 13463503240. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 05:41:20,447][15401] Updated weights for policy 0, policy_version 821743 (0.0028) [2024-06-25 05:41:23,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13463552000. Throughput: 0: 42459.0. Samples: 13463641320. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 05:41:24,316][15401] Updated weights for policy 0, policy_version 821753 (0.0035) [2024-06-25 05:41:28,272][15401] Updated weights for policy 0, policy_version 821763 (0.0035) [2024-06-25 05:41:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 13463764992. Throughput: 0: 42355.3. Samples: 13463890380. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 05:41:31,766][15401] Updated weights for policy 0, policy_version 821773 (0.0031) [2024-06-25 05:41:33,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13464010752. Throughput: 0: 42466.7. Samples: 13464144300. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 05:41:35,896][15401] Updated weights for policy 0, policy_version 821783 (0.0039) [2024-06-25 05:41:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13464174592. Throughput: 0: 42433.0. Samples: 13464277940. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 05:41:39,619][15401] Updated weights for policy 0, policy_version 821793 (0.0033) [2024-06-25 05:41:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13464403968. Throughput: 0: 42339.6. Samples: 13464529320. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:43,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 05:41:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821803_13464420352.pth... [2024-06-25 05:41:43,460][15401] Updated weights for policy 0, policy_version 821803 (0.0028) [2024-06-25 05:41:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821178_13454180352.pth [2024-06-25 05:41:47,190][15401] Updated weights for policy 0, policy_version 821813 (0.0022) [2024-06-25 05:41:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13464649728. Throughput: 0: 42539.9. Samples: 13464785400. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 05:41:50,981][15401] Updated weights for policy 0, policy_version 821823 (0.0042) [2024-06-25 05:41:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 13464813568. Throughput: 0: 42489.5. Samples: 13464918420. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:53,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 05:41:54,692][15401] Updated weights for policy 0, policy_version 821833 (0.0033) [2024-06-25 05:41:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 13465059328. Throughput: 0: 42651.6. Samples: 13465172400. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:41:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 05:41:58,533][15401] Updated weights for policy 0, policy_version 821843 (0.0030) [2024-06-25 05:42:02,424][15401] Updated weights for policy 0, policy_version 821853 (0.0043) [2024-06-25 05:42:03,389][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13465288704. Throughput: 0: 42684.8. Samples: 13465424060. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:03,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 05:42:06,811][15401] Updated weights for policy 0, policy_version 821863 (0.0039) [2024-06-25 05:42:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13465452544. Throughput: 0: 42568.1. Samples: 13465556880. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 05:42:10,134][15401] Updated weights for policy 0, policy_version 821873 (0.0034) [2024-06-25 05:42:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 13465698304. Throughput: 0: 42635.8. Samples: 13465809000. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:13,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 05:42:14,531][15401] Updated weights for policy 0, policy_version 821883 (0.0045) [2024-06-25 05:42:17,652][15401] Updated weights for policy 0, policy_version 821893 (0.0030) [2024-06-25 05:42:18,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13465927680. Throughput: 0: 42551.1. Samples: 13466059100. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 05:42:22,157][15401] Updated weights for policy 0, policy_version 821903 (0.0036) [2024-06-25 05:42:23,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 13466091520. Throughput: 0: 42568.9. Samples: 13466193540. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:23,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 05:42:25,130][15401] Updated weights for policy 0, policy_version 821913 (0.0026) [2024-06-25 05:42:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13466337280. Throughput: 0: 42711.1. Samples: 13466451320. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:28,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 05:42:30,364][15349] Signal inference workers to stop experience collection... (199350 times) [2024-06-25 05:42:30,365][15349] Signal inference workers to resume experience collection... (199350 times) [2024-06-25 05:42:30,373][15401] Updated weights for policy 0, policy_version 821923 (0.0039) [2024-06-25 05:42:30,388][15401] InferenceWorker_p0-w0: stopping experience collection (199350 times) [2024-06-25 05:42:30,388][15401] InferenceWorker_p0-w0: resuming experience collection (199350 times) [2024-06-25 05:42:32,777][15401] Updated weights for policy 0, policy_version 821933 (0.0037) [2024-06-25 05:42:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13466566656. Throughput: 0: 42555.7. Samples: 13466700400. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-06-25 05:42:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 05:42:37,863][15401] Updated weights for policy 0, policy_version 821943 (0.0029) [2024-06-25 05:42:38,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 13466746880. Throughput: 0: 42445.2. Samples: 13466828560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:42:38,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 05:42:40,590][15401] Updated weights for policy 0, policy_version 821953 (0.0029) [2024-06-25 05:42:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13466992640. Throughput: 0: 42632.9. Samples: 13467090880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:42:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 05:42:45,524][15401] Updated weights for policy 0, policy_version 821963 (0.0029) [2024-06-25 05:42:48,274][15401] Updated weights for policy 0, policy_version 821973 (0.0028) [2024-06-25 05:42:48,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13467205632. Throughput: 0: 42721.9. Samples: 13467346540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:42:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 05:42:52,987][15401] Updated weights for policy 0, policy_version 821983 (0.0041) [2024-06-25 05:42:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13467385856. Throughput: 0: 42577.4. Samples: 13467472860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:42:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 05:42:55,945][15401] Updated weights for policy 0, policy_version 821993 (0.0049) [2024-06-25 05:42:58,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 13467631616. Throughput: 0: 42868.5. Samples: 13467738180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:42:58,401][15132] Avg episode reward: [(0, '0.129')] [2024-06-25 05:43:00,446][15401] Updated weights for policy 0, policy_version 822003 (0.0032) [2024-06-25 05:43:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13467828224. Throughput: 0: 42868.0. Samples: 13467988160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:03,392][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 05:43:03,539][15401] Updated weights for policy 0, policy_version 822013 (0.0025) [2024-06-25 05:43:08,148][15401] Updated weights for policy 0, policy_version 822023 (0.0026) [2024-06-25 05:43:08,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13468024832. Throughput: 0: 42777.1. Samples: 13468118520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 05:43:11,147][15401] Updated weights for policy 0, policy_version 822033 (0.0041) [2024-06-25 05:43:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13468270592. Throughput: 0: 42911.6. Samples: 13468382340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 05:43:15,636][15401] Updated weights for policy 0, policy_version 822043 (0.0039) [2024-06-25 05:43:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13468483584. Throughput: 0: 42897.2. Samples: 13468630780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 05:43:18,688][15401] Updated weights for policy 0, policy_version 822053 (0.0043) [2024-06-25 05:43:23,186][15401] Updated weights for policy 0, policy_version 822063 (0.0040) [2024-06-25 05:43:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 13468680192. Throughput: 0: 42994.0. Samples: 13468763180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 05:43:26,468][15401] Updated weights for policy 0, policy_version 822073 (0.0031) [2024-06-25 05:43:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 13468876800. Throughput: 0: 42777.8. Samples: 13469015880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 05:43:30,672][15401] Updated weights for policy 0, policy_version 822083 (0.0041) [2024-06-25 05:43:32,503][15349] Signal inference workers to stop experience collection... (199400 times) [2024-06-25 05:43:32,506][15349] Signal inference workers to resume experience collection... (199400 times) [2024-06-25 05:43:32,530][15401] InferenceWorker_p0-w0: stopping experience collection (199400 times) [2024-06-25 05:43:32,530][15401] InferenceWorker_p0-w0: resuming experience collection (199400 times) [2024-06-25 05:43:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13469122560. Throughput: 0: 42891.9. Samples: 13469276680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 05:43:34,006][15401] Updated weights for policy 0, policy_version 822093 (0.0029) [2024-06-25 05:43:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42871.4, 300 sec: 42653.6). Total num frames: 13469319168. Throughput: 0: 42903.1. Samples: 13469403600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:38,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 05:43:38,462][15401] Updated weights for policy 0, policy_version 822103 (0.0038) [2024-06-25 05:43:42,077][15401] Updated weights for policy 0, policy_version 822113 (0.0034) [2024-06-25 05:43:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13469532160. Throughput: 0: 42488.1. Samples: 13469650040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:43,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 05:43:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822116_13469548544.pth... [2024-06-25 05:43:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821491_13459308544.pth [2024-06-25 05:43:46,393][15401] Updated weights for policy 0, policy_version 822123 (0.0040) [2024-06-25 05:43:48,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13469745152. Throughput: 0: 42857.4. Samples: 13469916740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 05:43:49,595][15401] Updated weights for policy 0, policy_version 822133 (0.0044) [2024-06-25 05:43:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13469958144. Throughput: 0: 42760.6. Samples: 13470042740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:53,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 05:43:53,852][15401] Updated weights for policy 0, policy_version 822143 (0.0034) [2024-06-25 05:43:57,082][15401] Updated weights for policy 0, policy_version 822153 (0.0029) [2024-06-25 05:43:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 13470171136. Throughput: 0: 42426.7. Samples: 13470291540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:43:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 05:44:01,659][15401] Updated weights for policy 0, policy_version 822163 (0.0034) [2024-06-25 05:44:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13470384128. Throughput: 0: 42867.6. Samples: 13470559820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:44:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 05:44:04,712][15401] Updated weights for policy 0, policy_version 822173 (0.0037) [2024-06-25 05:44:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13470613504. Throughput: 0: 42634.9. Samples: 13470681760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:44:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 05:44:09,339][15401] Updated weights for policy 0, policy_version 822183 (0.0027) [2024-06-25 05:44:13,077][15401] Updated weights for policy 0, policy_version 822193 (0.0029) [2024-06-25 05:44:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 13470810112. Throughput: 0: 42564.4. Samples: 13470931380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:44:13,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:44:17,011][15401] Updated weights for policy 0, policy_version 822203 (0.0040) [2024-06-25 05:44:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13471023104. Throughput: 0: 42617.4. Samples: 13471194460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:44:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 05:44:20,746][15401] Updated weights for policy 0, policy_version 822213 (0.0032) [2024-06-25 05:44:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13471252480. Throughput: 0: 42569.8. Samples: 13471319140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-06-25 05:44:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 05:44:24,545][15401] Updated weights for policy 0, policy_version 822223 (0.0042) [2024-06-25 05:44:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13471449088. Throughput: 0: 42656.0. Samples: 13471569560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 05:44:28,694][15401] Updated weights for policy 0, policy_version 822233 (0.0040) [2024-06-25 05:44:32,083][15401] Updated weights for policy 0, policy_version 822243 (0.0024) [2024-06-25 05:44:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13471645696. Throughput: 0: 42617.4. Samples: 13471834520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 05:44:36,351][15401] Updated weights for policy 0, policy_version 822253 (0.0032) [2024-06-25 05:44:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42598.4, 300 sec: 42598.0). Total num frames: 13471875072. Throughput: 0: 42594.1. Samples: 13471959580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:38,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 05:44:39,669][15401] Updated weights for policy 0, policy_version 822263 (0.0034) [2024-06-25 05:44:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13472088064. Throughput: 0: 42814.0. Samples: 13472218180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:43,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 05:44:44,046][15401] Updated weights for policy 0, policy_version 822273 (0.0036) [2024-06-25 05:44:47,337][15401] Updated weights for policy 0, policy_version 822283 (0.0037) [2024-06-25 05:44:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13472301056. Throughput: 0: 42647.6. Samples: 13472478960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:48,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 05:44:51,656][15401] Updated weights for policy 0, policy_version 822293 (0.0024) [2024-06-25 05:44:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.5, 300 sec: 42710.1). Total num frames: 13472546816. Throughput: 0: 42867.6. Samples: 13472610800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:53,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 05:44:55,134][15401] Updated weights for policy 0, policy_version 822303 (0.0040) [2024-06-25 05:44:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 13472727040. Throughput: 0: 43041.0. Samples: 13472868120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:44:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 05:44:59,179][15401] Updated weights for policy 0, policy_version 822313 (0.0032) [2024-06-25 05:45:02,760][15401] Updated weights for policy 0, policy_version 822323 (0.0030) [2024-06-25 05:45:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13472956416. Throughput: 0: 42729.8. Samples: 13473117300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 05:45:07,105][15401] Updated weights for policy 0, policy_version 822333 (0.0033) [2024-06-25 05:45:07,816][15349] Signal inference workers to stop experience collection... (199450 times) [2024-06-25 05:45:07,817][15349] Signal inference workers to resume experience collection... (199450 times) [2024-06-25 05:45:07,858][15401] InferenceWorker_p0-w0: stopping experience collection (199450 times) [2024-06-25 05:45:07,858][15401] InferenceWorker_p0-w0: resuming experience collection (199450 times) [2024-06-25 05:45:08,394][15132] Fps is (10 sec: 45854.5, 60 sec: 42868.3, 300 sec: 42709.8). Total num frames: 13473185792. Throughput: 0: 42907.8. Samples: 13473250180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:08,394][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 05:45:10,270][15401] Updated weights for policy 0, policy_version 822343 (0.0037) [2024-06-25 05:45:13,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 13473382400. Throughput: 0: 42987.5. Samples: 13473504000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 05:45:14,585][15401] Updated weights for policy 0, policy_version 822353 (0.0030) [2024-06-25 05:45:18,146][15401] Updated weights for policy 0, policy_version 822363 (0.0033) [2024-06-25 05:45:18,390][15132] Fps is (10 sec: 42617.3, 60 sec: 43144.4, 300 sec: 42654.0). Total num frames: 13473611776. Throughput: 0: 42955.0. Samples: 13473767500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 05:45:22,031][15401] Updated weights for policy 0, policy_version 822373 (0.0031) [2024-06-25 05:45:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13473824768. Throughput: 0: 43109.4. Samples: 13473899400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 05:45:25,555][15401] Updated weights for policy 0, policy_version 822383 (0.0030) [2024-06-25 05:45:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13474021376. Throughput: 0: 42936.6. Samples: 13474150320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 05:45:30,036][15401] Updated weights for policy 0, policy_version 822393 (0.0050) [2024-06-25 05:45:33,294][15401] Updated weights for policy 0, policy_version 822403 (0.0026) [2024-06-25 05:45:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13474250752. Throughput: 0: 42979.6. Samples: 13474413040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 05:45:37,551][15401] Updated weights for policy 0, policy_version 822413 (0.0034) [2024-06-25 05:45:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 13474480128. Throughput: 0: 43002.1. Samples: 13474545900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 05:45:40,790][15401] Updated weights for policy 0, policy_version 822423 (0.0036) [2024-06-25 05:45:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13474676736. Throughput: 0: 42902.6. Samples: 13474798740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 05:45:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822429_13474676736.pth... [2024-06-25 05:45:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000821803_13464420352.pth [2024-06-25 05:45:45,083][15401] Updated weights for policy 0, policy_version 822433 (0.0028) [2024-06-25 05:45:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13474889728. Throughput: 0: 43156.9. Samples: 13475059360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:48,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 05:45:48,479][15401] Updated weights for policy 0, policy_version 822443 (0.0043) [2024-06-25 05:45:52,547][15401] Updated weights for policy 0, policy_version 822453 (0.0036) [2024-06-25 05:45:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13475086336. Throughput: 0: 43048.8. Samples: 13475187180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 05:45:56,138][15401] Updated weights for policy 0, policy_version 822463 (0.0029) [2024-06-25 05:45:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13475332096. Throughput: 0: 43046.3. Samples: 13475441080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:45:58,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 05:46:00,080][15401] Updated weights for policy 0, policy_version 822473 (0.0027) [2024-06-25 05:46:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13475528704. Throughput: 0: 42859.1. Samples: 13475696160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:46:03,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 05:46:03,935][15401] Updated weights for policy 0, policy_version 822483 (0.0039) [2024-06-25 05:46:07,750][15401] Updated weights for policy 0, policy_version 822493 (0.0033) [2024-06-25 05:46:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42601.6, 300 sec: 42820.9). Total num frames: 13475741696. Throughput: 0: 42786.7. Samples: 13475824800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:46:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 05:46:11,649][15401] Updated weights for policy 0, policy_version 822503 (0.0033) [2024-06-25 05:46:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 13475987456. Throughput: 0: 42850.2. Samples: 13476078580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 05:46:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 05:46:15,388][15401] Updated weights for policy 0, policy_version 822513 (0.0039) [2024-06-25 05:46:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13476167680. Throughput: 0: 42836.4. Samples: 13476340680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:18,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 05:46:19,274][15401] Updated weights for policy 0, policy_version 822523 (0.0031) [2024-06-25 05:46:23,044][15401] Updated weights for policy 0, policy_version 822533 (0.0036) [2024-06-25 05:46:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13476380672. Throughput: 0: 42633.5. Samples: 13476464400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 05:46:25,729][15349] Signal inference workers to stop experience collection... (199500 times) [2024-06-25 05:46:25,730][15349] Signal inference workers to resume experience collection... (199500 times) [2024-06-25 05:46:25,744][15401] InferenceWorker_p0-w0: stopping experience collection (199500 times) [2024-06-25 05:46:25,776][15401] InferenceWorker_p0-w0: resuming experience collection (199500 times) [2024-06-25 05:46:26,815][15401] Updated weights for policy 0, policy_version 822543 (0.0028) [2024-06-25 05:46:28,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 13476626432. Throughput: 0: 42780.9. Samples: 13476723980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:28,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 05:46:31,123][15401] Updated weights for policy 0, policy_version 822553 (0.0034) [2024-06-25 05:46:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 13476806656. Throughput: 0: 42611.8. Samples: 13476976900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 05:46:34,856][15401] Updated weights for policy 0, policy_version 822563 (0.0035) [2024-06-25 05:46:38,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 13477019648. Throughput: 0: 42496.0. Samples: 13477099500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 05:46:38,605][15401] Updated weights for policy 0, policy_version 822573 (0.0038) [2024-06-25 05:46:42,394][15401] Updated weights for policy 0, policy_version 822583 (0.0040) [2024-06-25 05:46:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13477249024. Throughput: 0: 42782.3. Samples: 13477366280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 05:46:46,258][15401] Updated weights for policy 0, policy_version 822593 (0.0039) [2024-06-25 05:46:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13477445632. Throughput: 0: 42623.1. Samples: 13477614200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:48,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 05:46:49,911][15401] Updated weights for policy 0, policy_version 822603 (0.0024) [2024-06-25 05:46:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13477658624. Throughput: 0: 42629.7. Samples: 13477743140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:53,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 05:46:53,903][15401] Updated weights for policy 0, policy_version 822613 (0.0034) [2024-06-25 05:46:57,703][15401] Updated weights for policy 0, policy_version 822623 (0.0041) [2024-06-25 05:46:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13477888000. Throughput: 0: 42766.2. Samples: 13478003060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:46:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 05:47:01,530][15401] Updated weights for policy 0, policy_version 822633 (0.0031) [2024-06-25 05:47:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13478100992. Throughput: 0: 42695.1. Samples: 13478261960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 05:47:05,308][15401] Updated weights for policy 0, policy_version 822643 (0.0030) [2024-06-25 05:47:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13478297600. Throughput: 0: 42803.1. Samples: 13478390540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 05:47:09,017][15401] Updated weights for policy 0, policy_version 822653 (0.0037) [2024-06-25 05:47:12,928][15401] Updated weights for policy 0, policy_version 822663 (0.0036) [2024-06-25 05:47:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13478510592. Throughput: 0: 42848.5. Samples: 13478652060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 05:47:16,930][15401] Updated weights for policy 0, policy_version 822673 (0.0046) [2024-06-25 05:47:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13478739968. Throughput: 0: 42774.0. Samples: 13478901720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 05:47:20,514][15401] Updated weights for policy 0, policy_version 822683 (0.0034) [2024-06-25 05:47:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13478952960. Throughput: 0: 42983.4. Samples: 13479033760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 05:47:24,530][15401] Updated weights for policy 0, policy_version 822693 (0.0025) [2024-06-25 05:47:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 13479149568. Throughput: 0: 42625.8. Samples: 13479284440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:28,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 05:47:28,471][15401] Updated weights for policy 0, policy_version 822703 (0.0027) [2024-06-25 05:47:32,045][15401] Updated weights for policy 0, policy_version 822713 (0.0029) [2024-06-25 05:47:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 13479395328. Throughput: 0: 42918.1. Samples: 13479545520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:33,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 05:47:36,069][15401] Updated weights for policy 0, policy_version 822723 (0.0043) [2024-06-25 05:47:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13479591936. Throughput: 0: 43009.8. Samples: 13479678580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 05:47:39,897][15401] Updated weights for policy 0, policy_version 822733 (0.0029) [2024-06-25 05:47:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13479804928. Throughput: 0: 42871.5. Samples: 13479932280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 05:47:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822742_13479804928.pth... [2024-06-25 05:47:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822116_13469548544.pth [2024-06-25 05:47:43,636][15401] Updated weights for policy 0, policy_version 822743 (0.0039) [2024-06-25 05:47:47,423][15401] Updated weights for policy 0, policy_version 822753 (0.0035) [2024-06-25 05:47:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13480034304. Throughput: 0: 42830.6. Samples: 13480189340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:48,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:47:51,334][15401] Updated weights for policy 0, policy_version 822763 (0.0042) [2024-06-25 05:47:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13480230912. Throughput: 0: 42846.0. Samples: 13480318620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 05:47:55,281][15401] Updated weights for policy 0, policy_version 822773 (0.0048) [2024-06-25 05:47:58,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 13480443904. Throughput: 0: 42535.0. Samples: 13480566240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 05:47:58,392][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 05:47:59,026][15401] Updated weights for policy 0, policy_version 822783 (0.0037) [2024-06-25 05:48:02,851][15401] Updated weights for policy 0, policy_version 822793 (0.0028) [2024-06-25 05:48:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13480673280. Throughput: 0: 42732.8. Samples: 13480824700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:03,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 05:48:06,647][15401] Updated weights for policy 0, policy_version 822803 (0.0037) [2024-06-25 05:48:08,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13480853504. Throughput: 0: 42677.1. Samples: 13480954220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 05:48:10,437][15401] Updated weights for policy 0, policy_version 822813 (0.0026) [2024-06-25 05:48:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13481099264. Throughput: 0: 42727.5. Samples: 13481207180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 05:48:14,257][15401] Updated weights for policy 0, policy_version 822823 (0.0050) [2024-06-25 05:48:15,920][15349] Signal inference workers to stop experience collection... (199550 times) [2024-06-25 05:48:15,962][15401] InferenceWorker_p0-w0: stopping experience collection (199550 times) [2024-06-25 05:48:15,982][15349] Signal inference workers to resume experience collection... (199550 times) [2024-06-25 05:48:15,983][15401] InferenceWorker_p0-w0: resuming experience collection (199550 times) [2024-06-25 05:48:18,022][15401] Updated weights for policy 0, policy_version 822833 (0.0033) [2024-06-25 05:48:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13481312256. Throughput: 0: 42713.5. Samples: 13481467620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:18,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 05:48:21,751][15401] Updated weights for policy 0, policy_version 822843 (0.0042) [2024-06-25 05:48:23,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13481492480. Throughput: 0: 42617.8. Samples: 13481596380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:23,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 05:48:25,779][15401] Updated weights for policy 0, policy_version 822853 (0.0040) [2024-06-25 05:48:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13481754624. Throughput: 0: 42532.8. Samples: 13481846260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 05:48:29,220][15401] Updated weights for policy 0, policy_version 822863 (0.0029) [2024-06-25 05:48:33,312][15401] Updated weights for policy 0, policy_version 822873 (0.0044) [2024-06-25 05:48:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 13481951232. Throughput: 0: 42703.7. Samples: 13482111000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 05:48:36,779][15401] Updated weights for policy 0, policy_version 822883 (0.0023) [2024-06-25 05:48:38,390][15132] Fps is (10 sec: 36045.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13482115072. Throughput: 0: 42544.0. Samples: 13482233100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:38,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 05:48:41,357][15401] Updated weights for policy 0, policy_version 822893 (0.0029) [2024-06-25 05:48:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13482393600. Throughput: 0: 42566.2. Samples: 13482481620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 05:48:44,404][15401] Updated weights for policy 0, policy_version 822903 (0.0040) [2024-06-25 05:48:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13482573824. Throughput: 0: 42595.5. Samples: 13482741500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 05:48:48,983][15401] Updated weights for policy 0, policy_version 822913 (0.0033) [2024-06-25 05:48:52,290][15401] Updated weights for policy 0, policy_version 822923 (0.0024) [2024-06-25 05:48:53,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 13482770432. Throughput: 0: 42411.8. Samples: 13482862860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:53,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 05:48:56,451][15401] Updated weights for policy 0, policy_version 822933 (0.0027) [2024-06-25 05:48:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 13483016192. Throughput: 0: 42412.5. Samples: 13483115740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:48:58,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 05:49:00,426][15401] Updated weights for policy 0, policy_version 822943 (0.0029) [2024-06-25 05:49:03,390][15132] Fps is (10 sec: 40969.0, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 13483180032. Throughput: 0: 42379.3. Samples: 13483374700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:03,391][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 05:49:04,638][15401] Updated weights for policy 0, policy_version 822953 (0.0030) [2024-06-25 05:49:08,180][15401] Updated weights for policy 0, policy_version 822963 (0.0034) [2024-06-25 05:49:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42765.4). Total num frames: 13483425792. Throughput: 0: 42101.3. Samples: 13483490940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 05:49:12,018][15349] Signal inference workers to stop experience collection... (199600 times) [2024-06-25 05:49:12,019][15349] Signal inference workers to resume experience collection... (199600 times) [2024-06-25 05:49:12,060][15401] InferenceWorker_p0-w0: stopping experience collection (199600 times) [2024-06-25 05:49:12,060][15401] InferenceWorker_p0-w0: resuming experience collection (199600 times) [2024-06-25 05:49:12,175][15401] Updated weights for policy 0, policy_version 822973 (0.0034) [2024-06-25 05:49:13,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13483655168. Throughput: 0: 42416.9. Samples: 13483755020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:13,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 05:49:15,738][15401] Updated weights for policy 0, policy_version 822983 (0.0030) [2024-06-25 05:49:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 13483819008. Throughput: 0: 42448.0. Samples: 13484021160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 05:49:19,854][15401] Updated weights for policy 0, policy_version 822993 (0.0029) [2024-06-25 05:49:23,216][15401] Updated weights for policy 0, policy_version 823003 (0.0028) [2024-06-25 05:49:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13484081152. Throughput: 0: 42324.4. Samples: 13484137700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 05:49:27,573][15401] Updated weights for policy 0, policy_version 823013 (0.0027) [2024-06-25 05:49:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13484294144. Throughput: 0: 42691.1. Samples: 13484402720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:28,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 05:49:30,744][15401] Updated weights for policy 0, policy_version 823023 (0.0032) [2024-06-25 05:49:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 13484490752. Throughput: 0: 42554.3. Samples: 13484656440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 05:49:35,144][15401] Updated weights for policy 0, policy_version 823033 (0.0037) [2024-06-25 05:49:38,347][15401] Updated weights for policy 0, policy_version 823043 (0.0027) [2024-06-25 05:49:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 13484736512. Throughput: 0: 42648.0. Samples: 13484781920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 05:49:42,845][15401] Updated weights for policy 0, policy_version 823053 (0.0035) [2024-06-25 05:49:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13484949504. Throughput: 0: 42933.7. Samples: 13485047760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 05:49:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823056_13484949504.pth... [2024-06-25 05:49:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822429_13474676736.pth [2024-06-25 05:49:46,142][15401] Updated weights for policy 0, policy_version 823063 (0.0039) [2024-06-25 05:49:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13485113344. Throughput: 0: 42898.9. Samples: 13485305140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 05:49:48,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 05:49:50,531][15401] Updated weights for policy 0, policy_version 823073 (0.0036) [2024-06-25 05:49:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 13485375488. Throughput: 0: 43013.8. Samples: 13485426560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:49:53,396][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 05:49:53,593][15401] Updated weights for policy 0, policy_version 823083 (0.0037) [2024-06-25 05:49:58,010][15401] Updated weights for policy 0, policy_version 823093 (0.0027) [2024-06-25 05:49:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13485572096. Throughput: 0: 43066.8. Samples: 13485693020. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:49:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 05:50:01,058][15401] Updated weights for policy 0, policy_version 823103 (0.0028) [2024-06-25 05:50:03,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.7, 300 sec: 42654.6). Total num frames: 13485768704. Throughput: 0: 42959.4. Samples: 13485954340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:03,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 05:50:05,782][15401] Updated weights for policy 0, policy_version 823113 (0.0037) [2024-06-25 05:50:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13485998080. Throughput: 0: 43152.1. Samples: 13486079540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 05:50:09,153][15401] Updated weights for policy 0, policy_version 823123 (0.0027) [2024-06-25 05:50:10,777][15349] Signal inference workers to stop experience collection... (199650 times) [2024-06-25 05:50:10,832][15401] InferenceWorker_p0-w0: stopping experience collection (199650 times) [2024-06-25 05:50:10,838][15349] Signal inference workers to resume experience collection... (199650 times) [2024-06-25 05:50:10,842][15401] InferenceWorker_p0-w0: resuming experience collection (199650 times) [2024-06-25 05:50:13,188][15401] Updated weights for policy 0, policy_version 823133 (0.0047) [2024-06-25 05:50:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13486211072. Throughput: 0: 43106.3. Samples: 13486342500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:13,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-25 05:50:16,647][15401] Updated weights for policy 0, policy_version 823143 (0.0029) [2024-06-25 05:50:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13486424064. Throughput: 0: 43203.2. Samples: 13486600580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:18,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 05:50:20,655][15401] Updated weights for policy 0, policy_version 823153 (0.0037) [2024-06-25 05:50:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13486653440. Throughput: 0: 43201.4. Samples: 13486725980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 05:50:24,318][15401] Updated weights for policy 0, policy_version 823163 (0.0038) [2024-06-25 05:50:28,322][15401] Updated weights for policy 0, policy_version 823173 (0.0029) [2024-06-25 05:50:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13486866432. Throughput: 0: 43108.4. Samples: 13486987640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 05:50:32,183][15401] Updated weights for policy 0, policy_version 823183 (0.0034) [2024-06-25 05:50:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13487063040. Throughput: 0: 43109.8. Samples: 13487245080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 05:50:35,910][15401] Updated weights for policy 0, policy_version 823193 (0.0029) [2024-06-25 05:50:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13487292416. Throughput: 0: 43168.6. Samples: 13487369140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 05:50:39,758][15401] Updated weights for policy 0, policy_version 823203 (0.0035) [2024-06-25 05:50:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13487505408. Throughput: 0: 43037.8. Samples: 13487629720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:43,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 05:50:43,480][15401] Updated weights for policy 0, policy_version 823213 (0.0037) [2024-06-25 05:50:47,936][15401] Updated weights for policy 0, policy_version 823223 (0.0025) [2024-06-25 05:50:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13487718400. Throughput: 0: 42916.9. Samples: 13487885600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 05:50:51,101][15401] Updated weights for policy 0, policy_version 823233 (0.0037) [2024-06-25 05:50:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13487947776. Throughput: 0: 42924.4. Samples: 13488011140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 05:50:55,459][15401] Updated weights for policy 0, policy_version 823243 (0.0029) [2024-06-25 05:50:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13488128000. Throughput: 0: 42889.7. Samples: 13488272540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:50:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 05:50:58,807][15401] Updated weights for policy 0, policy_version 823253 (0.0041) [2024-06-25 05:51:03,040][15401] Updated weights for policy 0, policy_version 823263 (0.0035) [2024-06-25 05:51:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13488340992. Throughput: 0: 42671.0. Samples: 13488520780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 05:51:06,564][15401] Updated weights for policy 0, policy_version 823273 (0.0026) [2024-06-25 05:51:08,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 13488586752. Throughput: 0: 42730.9. Samples: 13488648880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:08,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-25 05:51:10,984][15401] Updated weights for policy 0, policy_version 823283 (0.0037) [2024-06-25 05:51:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13488783360. Throughput: 0: 42713.4. Samples: 13488909740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:13,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-25 05:51:14,169][15401] Updated weights for policy 0, policy_version 823293 (0.0023) [2024-06-25 05:51:18,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13488979968. Throughput: 0: 42699.1. Samples: 13489166540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:18,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 05:51:18,448][15401] Updated weights for policy 0, policy_version 823303 (0.0041) [2024-06-25 05:51:20,795][15349] Signal inference workers to stop experience collection... (199700 times) [2024-06-25 05:51:20,795][15349] Signal inference workers to resume experience collection... (199700 times) [2024-06-25 05:51:20,844][15401] InferenceWorker_p0-w0: stopping experience collection (199700 times) [2024-06-25 05:51:20,845][15401] InferenceWorker_p0-w0: resuming experience collection (199700 times) [2024-06-25 05:51:21,914][15401] Updated weights for policy 0, policy_version 823313 (0.0040) [2024-06-25 05:51:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 13489242112. Throughput: 0: 42744.3. Samples: 13489292640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 05:51:26,426][15401] Updated weights for policy 0, policy_version 823323 (0.0043) [2024-06-25 05:51:28,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 13489405952. Throughput: 0: 42654.1. Samples: 13489549260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:28,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 05:51:29,613][15401] Updated weights for policy 0, policy_version 823333 (0.0034) [2024-06-25 05:51:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13489635328. Throughput: 0: 42482.7. Samples: 13489797320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 05:51:33,982][15401] Updated weights for policy 0, policy_version 823343 (0.0030) [2024-06-25 05:51:37,225][15401] Updated weights for policy 0, policy_version 823353 (0.0036) [2024-06-25 05:51:38,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13489864704. Throughput: 0: 42703.5. Samples: 13489932800. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-06-25 05:51:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 05:51:41,455][15401] Updated weights for policy 0, policy_version 823363 (0.0032) [2024-06-25 05:51:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13490044928. Throughput: 0: 42546.7. Samples: 13490187140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:51:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 05:51:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823368_13490061312.pth... [2024-06-25 05:51:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000822742_13479804928.pth [2024-06-25 05:51:44,726][15401] Updated weights for policy 0, policy_version 823373 (0.0045) [2024-06-25 05:51:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13490274304. Throughput: 0: 42768.0. Samples: 13490445340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:51:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 05:51:48,955][15401] Updated weights for policy 0, policy_version 823383 (0.0036) [2024-06-25 05:51:52,330][15401] Updated weights for policy 0, policy_version 823393 (0.0028) [2024-06-25 05:51:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13490503680. Throughput: 0: 42872.5. Samples: 13490578140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:51:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 05:51:56,579][15401] Updated weights for policy 0, policy_version 823403 (0.0046) [2024-06-25 05:51:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13490683904. Throughput: 0: 42723.4. Samples: 13490832300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:51:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 05:52:00,142][15401] Updated weights for policy 0, policy_version 823413 (0.0026) [2024-06-25 05:52:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13490929664. Throughput: 0: 42562.5. Samples: 13491081860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 05:52:03,994][15401] Updated weights for policy 0, policy_version 823423 (0.0037) [2024-06-25 05:52:07,815][15401] Updated weights for policy 0, policy_version 823433 (0.0029) [2024-06-25 05:52:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13491142656. Throughput: 0: 42808.0. Samples: 13491219000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 05:52:11,499][15401] Updated weights for policy 0, policy_version 823443 (0.0033) [2024-06-25 05:52:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13491322880. Throughput: 0: 42768.4. Samples: 13491473740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 05:52:15,400][15401] Updated weights for policy 0, policy_version 823453 (0.0044) [2024-06-25 05:52:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 13491585024. Throughput: 0: 42710.6. Samples: 13491719300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:18,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 05:52:19,240][15401] Updated weights for policy 0, policy_version 823463 (0.0036) [2024-06-25 05:52:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13491765248. Throughput: 0: 42777.9. Samples: 13491857800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:23,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 05:52:23,811][15401] Updated weights for policy 0, policy_version 823473 (0.0027) [2024-06-25 05:52:24,309][15349] Signal inference workers to stop experience collection... (199750 times) [2024-06-25 05:52:24,310][15349] Signal inference workers to resume experience collection... (199750 times) [2024-06-25 05:52:24,359][15401] InferenceWorker_p0-w0: stopping experience collection (199750 times) [2024-06-25 05:52:24,359][15401] InferenceWorker_p0-w0: resuming experience collection (199750 times) [2024-06-25 05:52:26,955][15401] Updated weights for policy 0, policy_version 823483 (0.0040) [2024-06-25 05:52:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 13491978240. Throughput: 0: 42638.6. Samples: 13492105880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 05:52:31,377][15401] Updated weights for policy 0, policy_version 823493 (0.0037) [2024-06-25 05:52:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13492207616. Throughput: 0: 42575.6. Samples: 13492361240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 05:52:34,577][15401] Updated weights for policy 0, policy_version 823503 (0.0036) [2024-06-25 05:52:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 13492404224. Throughput: 0: 42661.4. Samples: 13492498000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:38,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 05:52:38,903][15401] Updated weights for policy 0, policy_version 823513 (0.0036) [2024-06-25 05:52:42,313][15401] Updated weights for policy 0, policy_version 823523 (0.0040) [2024-06-25 05:52:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 13492633600. Throughput: 0: 42601.3. Samples: 13492749360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 05:52:46,736][15401] Updated weights for policy 0, policy_version 823533 (0.0031) [2024-06-25 05:52:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13492846592. Throughput: 0: 42728.6. Samples: 13493004640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 05:52:49,892][15401] Updated weights for policy 0, policy_version 823543 (0.0027) [2024-06-25 05:52:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 13493043200. Throughput: 0: 42577.7. Samples: 13493135000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:53,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-25 05:52:54,242][15401] Updated weights for policy 0, policy_version 823553 (0.0039) [2024-06-25 05:52:57,520][15401] Updated weights for policy 0, policy_version 823563 (0.0031) [2024-06-25 05:52:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13493256192. Throughput: 0: 42615.3. Samples: 13493391420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:52:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 05:53:01,826][15401] Updated weights for policy 0, policy_version 823573 (0.0022) [2024-06-25 05:53:03,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 13493485568. Throughput: 0: 42915.9. Samples: 13493650620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:03,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 05:53:05,785][15401] Updated weights for policy 0, policy_version 823583 (0.0039) [2024-06-25 05:53:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13493698560. Throughput: 0: 42636.8. Samples: 13493776560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:08,393][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 05:53:09,380][15401] Updated weights for policy 0, policy_version 823593 (0.0032) [2024-06-25 05:53:13,181][15401] Updated weights for policy 0, policy_version 823603 (0.0047) [2024-06-25 05:53:13,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13493911552. Throughput: 0: 42838.6. Samples: 13494033620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 05:53:17,005][15401] Updated weights for policy 0, policy_version 823613 (0.0029) [2024-06-25 05:53:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13494140928. Throughput: 0: 42961.4. Samples: 13494294500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 05:53:20,708][15401] Updated weights for policy 0, policy_version 823623 (0.0027) [2024-06-25 05:53:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13494337536. Throughput: 0: 42869.4. Samples: 13494427020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 05:53:24,844][15401] Updated weights for policy 0, policy_version 823633 (0.0037) [2024-06-25 05:53:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13494550528. Throughput: 0: 42891.7. Samples: 13494679480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 05:53:28,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 05:53:28,747][15401] Updated weights for policy 0, policy_version 823643 (0.0026) [2024-06-25 05:53:32,558][15401] Updated weights for policy 0, policy_version 823653 (0.0028) [2024-06-25 05:53:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13494779904. Throughput: 0: 42838.6. Samples: 13494932380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 05:53:36,302][15401] Updated weights for policy 0, policy_version 823663 (0.0034) [2024-06-25 05:53:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 13494960128. Throughput: 0: 42946.9. Samples: 13495067600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 05:53:40,180][15401] Updated weights for policy 0, policy_version 823673 (0.0037) [2024-06-25 05:53:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13495189504. Throughput: 0: 42737.4. Samples: 13495314620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:43,390][15132] Avg episode reward: [(0, '0.015')] [2024-06-25 05:53:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823681_13495189504.pth... [2024-06-25 05:53:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823056_13484949504.pth [2024-06-25 05:53:44,065][15401] Updated weights for policy 0, policy_version 823683 (0.0044) [2024-06-25 05:53:47,837][15401] Updated weights for policy 0, policy_version 823693 (0.0037) [2024-06-25 05:53:48,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 13495418880. Throughput: 0: 42809.4. Samples: 13495576940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:48,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 05:53:49,491][15349] Signal inference workers to stop experience collection... (199800 times) [2024-06-25 05:53:49,492][15349] Signal inference workers to resume experience collection... (199800 times) [2024-06-25 05:53:49,514][15401] InferenceWorker_p0-w0: stopping experience collection (199800 times) [2024-06-25 05:53:49,514][15401] InferenceWorker_p0-w0: resuming experience collection (199800 times) [2024-06-25 05:53:51,685][15401] Updated weights for policy 0, policy_version 823703 (0.0026) [2024-06-25 05:53:53,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13495615488. Throughput: 0: 42866.7. Samples: 13495705460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 05:53:55,479][15401] Updated weights for policy 0, policy_version 823713 (0.0036) [2024-06-25 05:53:58,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 13495844864. Throughput: 0: 42728.6. Samples: 13495956680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:53:58,396][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 05:53:59,363][15401] Updated weights for policy 0, policy_version 823723 (0.0027) [2024-06-25 05:54:03,156][15401] Updated weights for policy 0, policy_version 823733 (0.0038) [2024-06-25 05:54:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 13496057856. Throughput: 0: 42773.8. Samples: 13496219320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 05:54:07,124][15401] Updated weights for policy 0, policy_version 823743 (0.0044) [2024-06-25 05:54:08,390][15132] Fps is (10 sec: 40985.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13496254464. Throughput: 0: 42543.0. Samples: 13496341460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 05:54:10,595][15401] Updated weights for policy 0, policy_version 823753 (0.0039) [2024-06-25 05:54:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13496483840. Throughput: 0: 42700.8. Samples: 13496601020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 05:54:14,537][15401] Updated weights for policy 0, policy_version 823763 (0.0028) [2024-06-25 05:54:18,180][15401] Updated weights for policy 0, policy_version 823773 (0.0043) [2024-06-25 05:54:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13496696832. Throughput: 0: 42808.1. Samples: 13496858740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:18,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 05:54:22,350][15401] Updated weights for policy 0, policy_version 823783 (0.0036) [2024-06-25 05:54:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13496893440. Throughput: 0: 42649.7. Samples: 13496986840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:23,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 05:54:25,787][15401] Updated weights for policy 0, policy_version 823793 (0.0032) [2024-06-25 05:54:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13497122816. Throughput: 0: 42746.8. Samples: 13497238220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 05:54:29,981][15401] Updated weights for policy 0, policy_version 823803 (0.0036) [2024-06-25 05:54:33,281][15401] Updated weights for policy 0, policy_version 823813 (0.0038) [2024-06-25 05:54:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13497352192. Throughput: 0: 42569.3. Samples: 13497492560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 05:54:37,573][15401] Updated weights for policy 0, policy_version 823823 (0.0039) [2024-06-25 05:54:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13497532416. Throughput: 0: 42673.3. Samples: 13497625760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 05:54:40,957][15401] Updated weights for policy 0, policy_version 823833 (0.0024) [2024-06-25 05:54:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.6, 300 sec: 42820.5). Total num frames: 13497745408. Throughput: 0: 42606.0. Samples: 13497873680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 05:54:45,579][15401] Updated weights for policy 0, policy_version 823843 (0.0026) [2024-06-25 05:54:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13497974784. Throughput: 0: 42479.5. Samples: 13498131000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:48,392][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 05:54:48,931][15401] Updated weights for policy 0, policy_version 823853 (0.0031) [2024-06-25 05:54:53,001][15401] Updated weights for policy 0, policy_version 823863 (0.0023) [2024-06-25 05:54:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13498187776. Throughput: 0: 42646.2. Samples: 13498260540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 05:54:56,447][15401] Updated weights for policy 0, policy_version 823873 (0.0035) [2024-06-25 05:54:58,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42602.8, 300 sec: 42820.5). Total num frames: 13498400768. Throughput: 0: 42545.7. Samples: 13498515580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:54:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 05:55:00,644][15401] Updated weights for policy 0, policy_version 823883 (0.0026) [2024-06-25 05:55:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13498630144. Throughput: 0: 42574.5. Samples: 13498774600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:55:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 05:55:04,056][15401] Updated weights for policy 0, policy_version 823893 (0.0028) [2024-06-25 05:55:08,211][15401] Updated weights for policy 0, policy_version 823903 (0.0027) [2024-06-25 05:55:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13498826752. Throughput: 0: 42714.5. Samples: 13498909000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:55:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 05:55:11,482][15349] Signal inference workers to stop experience collection... (199850 times) [2024-06-25 05:55:11,523][15401] InferenceWorker_p0-w0: stopping experience collection (199850 times) [2024-06-25 05:55:11,544][15349] Signal inference workers to resume experience collection... (199850 times) [2024-06-25 05:55:11,545][15401] InferenceWorker_p0-w0: resuming experience collection (199850 times) [2024-06-25 05:55:11,687][15401] Updated weights for policy 0, policy_version 823913 (0.0037) [2024-06-25 05:55:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13499039744. Throughput: 0: 42581.3. Samples: 13499154380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 05:55:13,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 05:55:15,698][15401] Updated weights for policy 0, policy_version 823923 (0.0037) [2024-06-25 05:55:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13499269120. Throughput: 0: 42656.6. Samples: 13499412100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 05:55:19,717][15401] Updated weights for policy 0, policy_version 823933 (0.0035) [2024-06-25 05:55:23,338][15401] Updated weights for policy 0, policy_version 823943 (0.0028) [2024-06-25 05:55:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13499482112. Throughput: 0: 42569.4. Samples: 13499541380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 05:55:27,250][15401] Updated weights for policy 0, policy_version 823953 (0.0034) [2024-06-25 05:55:28,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13499678720. Throughput: 0: 42659.9. Samples: 13499793380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 05:55:31,029][15401] Updated weights for policy 0, policy_version 823963 (0.0032) [2024-06-25 05:55:33,390][15132] Fps is (10 sec: 39319.0, 60 sec: 42051.8, 300 sec: 42653.8). Total num frames: 13499875328. Throughput: 0: 42807.0. Samples: 13500057240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:33,391][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 05:55:35,006][15401] Updated weights for policy 0, policy_version 823973 (0.0036) [2024-06-25 05:55:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13500121088. Throughput: 0: 42697.7. Samples: 13500181940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:38,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 05:55:38,645][15401] Updated weights for policy 0, policy_version 823983 (0.0045) [2024-06-25 05:55:42,579][15401] Updated weights for policy 0, policy_version 823993 (0.0053) [2024-06-25 05:55:43,389][15132] Fps is (10 sec: 44239.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13500317696. Throughput: 0: 42703.8. Samples: 13500437240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 05:55:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823995_13500334080.pth... [2024-06-25 05:55:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823368_13490061312.pth [2024-06-25 05:55:46,428][15401] Updated weights for policy 0, policy_version 824003 (0.0041) [2024-06-25 05:55:48,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 13500497920. Throughput: 0: 42668.9. Samples: 13500694700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:48,399][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 05:55:50,460][15401] Updated weights for policy 0, policy_version 824013 (0.0042) [2024-06-25 05:55:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13500743680. Throughput: 0: 42281.5. Samples: 13500811660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 05:55:54,149][15401] Updated weights for policy 0, policy_version 824023 (0.0038) [2024-06-25 05:55:58,105][15401] Updated weights for policy 0, policy_version 824033 (0.0038) [2024-06-25 05:55:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 13500956672. Throughput: 0: 42650.8. Samples: 13501073660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:55:58,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 05:56:01,526][15401] Updated weights for policy 0, policy_version 824043 (0.0027) [2024-06-25 05:56:03,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13501153280. Throughput: 0: 42612.7. Samples: 13501329680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 05:56:05,731][15401] Updated weights for policy 0, policy_version 824053 (0.0024) [2024-06-25 05:56:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13501382656. Throughput: 0: 42499.5. Samples: 13501453860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 05:56:09,691][15401] Updated weights for policy 0, policy_version 824063 (0.0035) [2024-06-25 05:56:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13501595648. Throughput: 0: 42601.0. Samples: 13501710420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 05:56:13,545][15401] Updated weights for policy 0, policy_version 824073 (0.0036) [2024-06-25 05:56:17,374][15401] Updated weights for policy 0, policy_version 824083 (0.0044) [2024-06-25 05:56:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 13501792256. Throughput: 0: 42472.5. Samples: 13501968480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 05:56:21,150][15401] Updated weights for policy 0, policy_version 824093 (0.0032) [2024-06-25 05:56:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 13502021632. Throughput: 0: 42461.4. Samples: 13502092700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 05:56:24,817][15401] Updated weights for policy 0, policy_version 824103 (0.0040) [2024-06-25 05:56:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13502234624. Throughput: 0: 42542.1. Samples: 13502351640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 05:56:28,959][15401] Updated weights for policy 0, policy_version 824113 (0.0036) [2024-06-25 05:56:32,423][15349] Signal inference workers to stop experience collection... (199900 times) [2024-06-25 05:56:32,428][15349] Signal inference workers to resume experience collection... (199900 times) [2024-06-25 05:56:32,472][15401] InferenceWorker_p0-w0: stopping experience collection (199900 times) [2024-06-25 05:56:32,472][15401] InferenceWorker_p0-w0: resuming experience collection (199900 times) [2024-06-25 05:56:32,563][15401] Updated weights for policy 0, policy_version 824123 (0.0040) [2024-06-25 05:56:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 13502447616. Throughput: 0: 42384.8. Samples: 13502602020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:33,396][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 05:56:37,023][15401] Updated weights for policy 0, policy_version 824133 (0.0030) [2024-06-25 05:56:38,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13502644224. Throughput: 0: 42588.3. Samples: 13502728140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:38,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 05:56:40,662][15401] Updated weights for policy 0, policy_version 824143 (0.0036) [2024-06-25 05:56:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13502857216. Throughput: 0: 42628.3. Samples: 13502991940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:43,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 05:56:44,531][15401] Updated weights for policy 0, policy_version 824153 (0.0032) [2024-06-25 05:56:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13503070208. Throughput: 0: 42562.8. Samples: 13503245000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 05:56:48,414][15401] Updated weights for policy 0, policy_version 824163 (0.0029) [2024-06-25 05:56:52,080][15401] Updated weights for policy 0, policy_version 824173 (0.0040) [2024-06-25 05:56:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 13503299584. Throughput: 0: 42664.0. Samples: 13503373740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 05:56:56,078][15401] Updated weights for policy 0, policy_version 824183 (0.0034) [2024-06-25 05:56:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13503496192. Throughput: 0: 42681.4. Samples: 13503631080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:56:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 05:56:59,728][15401] Updated weights for policy 0, policy_version 824193 (0.0037) [2024-06-25 05:57:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13503725568. Throughput: 0: 42727.3. Samples: 13503891200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 05:57:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 05:57:03,551][15401] Updated weights for policy 0, policy_version 824203 (0.0039) [2024-06-25 05:57:07,515][15401] Updated weights for policy 0, policy_version 824213 (0.0033) [2024-06-25 05:57:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13503938560. Throughput: 0: 42826.3. Samples: 13504019880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:08,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 05:57:11,050][15401] Updated weights for policy 0, policy_version 824223 (0.0033) [2024-06-25 05:57:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 13504118784. Throughput: 0: 42721.4. Samples: 13504274100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 05:57:15,116][15401] Updated weights for policy 0, policy_version 824233 (0.0023) [2024-06-25 05:57:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13504364544. Throughput: 0: 42645.9. Samples: 13504521080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 05:57:18,709][15401] Updated weights for policy 0, policy_version 824243 (0.0050) [2024-06-25 05:57:22,841][15401] Updated weights for policy 0, policy_version 824253 (0.0029) [2024-06-25 05:57:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13504561152. Throughput: 0: 42821.7. Samples: 13504655120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 05:57:26,647][15401] Updated weights for policy 0, policy_version 824263 (0.0041) [2024-06-25 05:57:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 13504757760. Throughput: 0: 42459.6. Samples: 13504902620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:28,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 05:57:30,784][15401] Updated weights for policy 0, policy_version 824273 (0.0041) [2024-06-25 05:57:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 13505003520. Throughput: 0: 42479.4. Samples: 13505156580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 05:57:34,453][15401] Updated weights for policy 0, policy_version 824283 (0.0048) [2024-06-25 05:57:38,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13505200128. Throughput: 0: 42582.7. Samples: 13505290060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:38,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 05:57:38,734][15401] Updated weights for policy 0, policy_version 824293 (0.0034) [2024-06-25 05:57:42,272][15401] Updated weights for policy 0, policy_version 824303 (0.0036) [2024-06-25 05:57:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13505413120. Throughput: 0: 42563.4. Samples: 13505546440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 05:57:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824305_13505413120.pth... [2024-06-25 05:57:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823681_13495189504.pth [2024-06-25 05:57:43,811][15349] Signal inference workers to stop experience collection... (199950 times) [2024-06-25 05:57:43,811][15349] Signal inference workers to resume experience collection... (199950 times) [2024-06-25 05:57:43,827][15401] InferenceWorker_p0-w0: stopping experience collection (199950 times) [2024-06-25 05:57:43,827][15401] InferenceWorker_p0-w0: resuming experience collection (199950 times) [2024-06-25 05:57:46,691][15401] Updated weights for policy 0, policy_version 824313 (0.0039) [2024-06-25 05:57:48,396][15132] Fps is (10 sec: 44219.1, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 13505642496. Throughput: 0: 42318.3. Samples: 13505795800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:48,397][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 05:57:49,853][15401] Updated weights for policy 0, policy_version 824323 (0.0051) [2024-06-25 05:57:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13505839104. Throughput: 0: 42417.3. Samples: 13505928660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 05:57:54,151][15401] Updated weights for policy 0, policy_version 824333 (0.0041) [2024-06-25 05:57:57,596][15401] Updated weights for policy 0, policy_version 824343 (0.0036) [2024-06-25 05:57:58,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 13506052096. Throughput: 0: 42382.6. Samples: 13506181320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:57:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 05:58:01,792][15401] Updated weights for policy 0, policy_version 824353 (0.0040) [2024-06-25 05:58:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 13506265088. Throughput: 0: 42482.7. Samples: 13506432800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 05:58:05,172][15401] Updated weights for policy 0, policy_version 824363 (0.0033) [2024-06-25 05:58:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13506478080. Throughput: 0: 42361.4. Samples: 13506561380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:08,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 05:58:09,495][15401] Updated weights for policy 0, policy_version 824373 (0.0030) [2024-06-25 05:58:13,137][15401] Updated weights for policy 0, policy_version 824383 (0.0028) [2024-06-25 05:58:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13506691072. Throughput: 0: 42580.9. Samples: 13506818760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:13,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 05:58:17,090][15401] Updated weights for policy 0, policy_version 824393 (0.0033) [2024-06-25 05:58:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13506920448. Throughput: 0: 42622.7. Samples: 13507074600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 05:58:21,093][15401] Updated weights for policy 0, policy_version 824403 (0.0035) [2024-06-25 05:58:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13507117056. Throughput: 0: 42522.4. Samples: 13507203460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 05:58:24,623][15401] Updated weights for policy 0, policy_version 824413 (0.0041) [2024-06-25 05:58:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13507330048. Throughput: 0: 42448.1. Samples: 13507456600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 05:58:28,946][15401] Updated weights for policy 0, policy_version 824423 (0.0032) [2024-06-25 05:58:32,299][15401] Updated weights for policy 0, policy_version 824433 (0.0031) [2024-06-25 05:58:33,391][15132] Fps is (10 sec: 44228.5, 60 sec: 42597.2, 300 sec: 42709.2). Total num frames: 13507559424. Throughput: 0: 42575.5. Samples: 13507711500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:33,392][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 05:58:36,617][15401] Updated weights for policy 0, policy_version 824443 (0.0063) [2024-06-25 05:58:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 13507756032. Throughput: 0: 42659.9. Samples: 13507848360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:38,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 05:58:39,742][15401] Updated weights for policy 0, policy_version 824453 (0.0039) [2024-06-25 05:58:43,392][15132] Fps is (10 sec: 40957.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 13507969024. Throughput: 0: 42610.2. Samples: 13508098880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:43,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 05:58:44,064][15401] Updated weights for policy 0, policy_version 824463 (0.0038) [2024-06-25 05:58:47,179][15401] Updated weights for policy 0, policy_version 824473 (0.0040) [2024-06-25 05:58:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 13508214784. Throughput: 0: 42715.0. Samples: 13508354980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:48,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 05:58:51,462][15401] Updated weights for policy 0, policy_version 824483 (0.0037) [2024-06-25 05:58:53,391][15132] Fps is (10 sec: 42602.1, 60 sec: 42597.3, 300 sec: 42543.6). Total num frames: 13508395008. Throughput: 0: 42960.7. Samples: 13508494680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 05:58:53,391][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 05:58:54,886][15401] Updated weights for policy 0, policy_version 824493 (0.0025) [2024-06-25 05:58:57,803][15349] Signal inference workers to stop experience collection... (200000 times) [2024-06-25 05:58:57,834][15401] InferenceWorker_p0-w0: stopping experience collection (200000 times) [2024-06-25 05:58:57,917][15349] Signal inference workers to resume experience collection... (200000 times) [2024-06-25 05:58:57,917][15401] InferenceWorker_p0-w0: resuming experience collection (200000 times) [2024-06-25 05:58:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13508624384. Throughput: 0: 42870.6. Samples: 13508747940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:58:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 05:58:58,935][15401] Updated weights for policy 0, policy_version 824503 (0.0041) [2024-06-25 05:59:02,504][15401] Updated weights for policy 0, policy_version 824513 (0.0030) [2024-06-25 05:59:03,390][15132] Fps is (10 sec: 47520.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13508870144. Throughput: 0: 42764.9. Samples: 13508999020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:03,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 05:59:06,485][15401] Updated weights for policy 0, policy_version 824523 (0.0036) [2024-06-25 05:59:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13509033984. Throughput: 0: 42923.9. Samples: 13509135040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 05:59:10,143][15401] Updated weights for policy 0, policy_version 824533 (0.0027) [2024-06-25 05:59:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13509279744. Throughput: 0: 43009.0. Samples: 13509392000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 05:59:14,327][15401] Updated weights for policy 0, policy_version 824543 (0.0032) [2024-06-25 05:59:17,702][15401] Updated weights for policy 0, policy_version 824553 (0.0028) [2024-06-25 05:59:18,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13509509120. Throughput: 0: 42980.3. Samples: 13509645540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 05:59:21,922][15401] Updated weights for policy 0, policy_version 824563 (0.0027) [2024-06-25 05:59:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 13509689344. Throughput: 0: 42955.4. Samples: 13509781360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 05:59:25,366][15401] Updated weights for policy 0, policy_version 824573 (0.0025) [2024-06-25 05:59:28,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43415.9, 300 sec: 42653.6). Total num frames: 13509935104. Throughput: 0: 43234.2. Samples: 13510044420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:28,392][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 05:59:29,534][15401] Updated weights for policy 0, policy_version 824583 (0.0026) [2024-06-25 05:59:33,094][15401] Updated weights for policy 0, policy_version 824593 (0.0025) [2024-06-25 05:59:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43145.8, 300 sec: 42765.0). Total num frames: 13510148096. Throughput: 0: 43051.1. Samples: 13510292280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 05:59:37,229][15401] Updated weights for policy 0, policy_version 824603 (0.0035) [2024-06-25 05:59:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13510344704. Throughput: 0: 42835.3. Samples: 13510422200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 05:59:40,691][15401] Updated weights for policy 0, policy_version 824613 (0.0035) [2024-06-25 05:59:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 13510574080. Throughput: 0: 43027.4. Samples: 13510684280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:43,392][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 05:59:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824620_13510574080.pth... [2024-06-25 05:59:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000823995_13500334080.pth [2024-06-25 05:59:44,696][15401] Updated weights for policy 0, policy_version 824623 (0.0025) [2024-06-25 05:59:48,225][15401] Updated weights for policy 0, policy_version 824633 (0.0041) [2024-06-25 05:59:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13510787072. Throughput: 0: 43045.2. Samples: 13510936060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 05:59:52,248][15401] Updated weights for policy 0, policy_version 824643 (0.0046) [2024-06-25 05:59:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43418.8, 300 sec: 42709.5). Total num frames: 13511000064. Throughput: 0: 42971.2. Samples: 13511068740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:53,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 05:59:55,761][15401] Updated weights for policy 0, policy_version 824653 (0.0028) [2024-06-25 05:59:58,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 13511180288. Throughput: 0: 42737.6. Samples: 13511315300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 05:59:58,392][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 05:59:59,982][15401] Updated weights for policy 0, policy_version 824663 (0.0028) [2024-06-25 06:00:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13511426048. Throughput: 0: 42765.4. Samples: 13511569980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 06:00:03,423][15401] Updated weights for policy 0, policy_version 824673 (0.0032) [2024-06-25 06:00:07,680][15401] Updated weights for policy 0, policy_version 824683 (0.0052) [2024-06-25 06:00:08,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13511606272. Throughput: 0: 42671.7. Samples: 13511701580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 06:00:11,306][15401] Updated weights for policy 0, policy_version 824693 (0.0029) [2024-06-25 06:00:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13511835648. Throughput: 0: 42484.0. Samples: 13511956100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 06:00:15,511][15401] Updated weights for policy 0, policy_version 824703 (0.0031) [2024-06-25 06:00:17,214][15349] Signal inference workers to stop experience collection... (200050 times) [2024-06-25 06:00:17,233][15401] InferenceWorker_p0-w0: stopping experience collection (200050 times) [2024-06-25 06:00:17,293][15349] Signal inference workers to resume experience collection... (200050 times) [2024-06-25 06:00:17,293][15401] InferenceWorker_p0-w0: resuming experience collection (200050 times) [2024-06-25 06:00:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 13512065024. Throughput: 0: 42768.0. Samples: 13512216940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:18,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 06:00:19,122][15401] Updated weights for policy 0, policy_version 824713 (0.0025) [2024-06-25 06:00:22,951][15401] Updated weights for policy 0, policy_version 824723 (0.0051) [2024-06-25 06:00:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13512261632. Throughput: 0: 42827.4. Samples: 13512349440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 06:00:26,601][15401] Updated weights for policy 0, policy_version 824733 (0.0034) [2024-06-25 06:00:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42327.0, 300 sec: 42709.6). Total num frames: 13512474624. Throughput: 0: 42697.4. Samples: 13512605560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 06:00:30,934][15401] Updated weights for policy 0, policy_version 824743 (0.0034) [2024-06-25 06:00:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13512704000. Throughput: 0: 42653.9. Samples: 13512855480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 06:00:34,343][15401] Updated weights for policy 0, policy_version 824753 (0.0026) [2024-06-25 06:00:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13512900608. Throughput: 0: 42636.0. Samples: 13512987360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:38,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:00:38,763][15401] Updated weights for policy 0, policy_version 824763 (0.0038) [2024-06-25 06:00:41,847][15401] Updated weights for policy 0, policy_version 824773 (0.0032) [2024-06-25 06:00:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 13513113600. Throughput: 0: 42789.4. Samples: 13513240720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 06:00:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 06:00:46,237][15401] Updated weights for policy 0, policy_version 824783 (0.0027) [2024-06-25 06:00:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13513342976. Throughput: 0: 42878.7. Samples: 13513499520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:00:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 06:00:49,856][15401] Updated weights for policy 0, policy_version 824793 (0.0032) [2024-06-25 06:00:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13513555968. Throughput: 0: 42882.7. Samples: 13513631300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:00:53,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 06:00:53,778][15401] Updated weights for policy 0, policy_version 824803 (0.0048) [2024-06-25 06:00:57,442][15401] Updated weights for policy 0, policy_version 824813 (0.0041) [2024-06-25 06:00:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13513768960. Throughput: 0: 42956.0. Samples: 13513889120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:00:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 06:01:01,186][15401] Updated weights for policy 0, policy_version 824823 (0.0035) [2024-06-25 06:01:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13513981952. Throughput: 0: 42796.8. Samples: 13514142700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:03,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 06:01:05,121][15401] Updated weights for policy 0, policy_version 824833 (0.0036) [2024-06-25 06:01:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13514194944. Throughput: 0: 42741.0. Samples: 13514272780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 06:01:08,938][15401] Updated weights for policy 0, policy_version 824843 (0.0024) [2024-06-25 06:01:12,867][15401] Updated weights for policy 0, policy_version 824853 (0.0041) [2024-06-25 06:01:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13514391552. Throughput: 0: 42728.9. Samples: 13514528360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 06:01:16,428][15401] Updated weights for policy 0, policy_version 824863 (0.0038) [2024-06-25 06:01:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 13514620928. Throughput: 0: 42956.4. Samples: 13514788520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:18,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 06:01:20,405][15401] Updated weights for policy 0, policy_version 824873 (0.0044) [2024-06-25 06:01:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13514833920. Throughput: 0: 42906.6. Samples: 13514918160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 06:01:23,939][15401] Updated weights for policy 0, policy_version 824883 (0.0035) [2024-06-25 06:01:28,236][15401] Updated weights for policy 0, policy_version 824893 (0.0036) [2024-06-25 06:01:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13515046912. Throughput: 0: 42814.2. Samples: 13515167360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 06:01:31,817][15401] Updated weights for policy 0, policy_version 824903 (0.0030) [2024-06-25 06:01:33,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 13515259904. Throughput: 0: 42760.5. Samples: 13515424020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:33,396][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:01:35,763][15401] Updated weights for policy 0, policy_version 824913 (0.0044) [2024-06-25 06:01:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13515489280. Throughput: 0: 42706.1. Samples: 13515553080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:38,394][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 06:01:40,019][15401] Updated weights for policy 0, policy_version 824923 (0.0030) [2024-06-25 06:01:43,354][15401] Updated weights for policy 0, policy_version 824933 (0.0032) [2024-06-25 06:01:43,392][15132] Fps is (10 sec: 44254.6, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 13515702272. Throughput: 0: 42614.6. Samples: 13515806880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:43,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 06:01:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824933_13515702272.pth... [2024-06-25 06:01:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824305_13505413120.pth [2024-06-25 06:01:47,642][15401] Updated weights for policy 0, policy_version 824943 (0.0037) [2024-06-25 06:01:47,656][15349] Signal inference workers to stop experience collection... (200100 times) [2024-06-25 06:01:47,656][15349] Signal inference workers to resume experience collection... (200100 times) [2024-06-25 06:01:47,679][15401] InferenceWorker_p0-w0: stopping experience collection (200100 times) [2024-06-25 06:01:47,679][15401] InferenceWorker_p0-w0: resuming experience collection (200100 times) [2024-06-25 06:01:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13515898880. Throughput: 0: 42632.1. Samples: 13516061140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 06:01:50,963][15401] Updated weights for policy 0, policy_version 824953 (0.0040) [2024-06-25 06:01:53,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13516111872. Throughput: 0: 42599.1. Samples: 13516189740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 06:01:55,204][15401] Updated weights for policy 0, policy_version 824963 (0.0045) [2024-06-25 06:01:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13516324864. Throughput: 0: 42601.8. Samples: 13516445440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:01:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 06:01:58,585][15401] Updated weights for policy 0, policy_version 824973 (0.0028) [2024-06-25 06:02:02,768][15401] Updated weights for policy 0, policy_version 824983 (0.0043) [2024-06-25 06:02:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13516537856. Throughput: 0: 42593.0. Samples: 13516705200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:03,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 06:02:06,250][15401] Updated weights for policy 0, policy_version 824993 (0.0029) [2024-06-25 06:02:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13516750848. Throughput: 0: 42497.8. Samples: 13516830560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 06:02:10,364][15401] Updated weights for policy 0, policy_version 825003 (0.0033) [2024-06-25 06:02:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13516947456. Throughput: 0: 42643.2. Samples: 13517086300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 06:02:13,864][15401] Updated weights for policy 0, policy_version 825013 (0.0028) [2024-06-25 06:02:18,139][15401] Updated weights for policy 0, policy_version 825023 (0.0031) [2024-06-25 06:02:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13517176832. Throughput: 0: 42774.6. Samples: 13517348600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:18,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 06:02:21,713][15401] Updated weights for policy 0, policy_version 825033 (0.0028) [2024-06-25 06:02:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13517389824. Throughput: 0: 42764.1. Samples: 13517477460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:23,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 06:02:25,758][15401] Updated weights for policy 0, policy_version 825043 (0.0041) [2024-06-25 06:02:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13517602816. Throughput: 0: 42943.2. Samples: 13517739220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 06:02:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 06:02:29,263][15401] Updated weights for policy 0, policy_version 825053 (0.0028) [2024-06-25 06:02:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42603.0, 300 sec: 42765.4). Total num frames: 13517815808. Throughput: 0: 43037.9. Samples: 13517997840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 06:02:33,490][15401] Updated weights for policy 0, policy_version 825063 (0.0037) [2024-06-25 06:02:36,751][15401] Updated weights for policy 0, policy_version 825073 (0.0029) [2024-06-25 06:02:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13518045184. Throughput: 0: 42914.6. Samples: 13518120900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 06:02:41,113][15401] Updated weights for policy 0, policy_version 825083 (0.0051) [2024-06-25 06:02:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42766.0). Total num frames: 13518258176. Throughput: 0: 43004.4. Samples: 13518380640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:43,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 06:02:44,350][15401] Updated weights for policy 0, policy_version 825093 (0.0046) [2024-06-25 06:02:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13518454784. Throughput: 0: 43006.3. Samples: 13518640480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:48,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 06:02:48,680][15401] Updated weights for policy 0, policy_version 825103 (0.0047) [2024-06-25 06:02:51,977][15401] Updated weights for policy 0, policy_version 825113 (0.0030) [2024-06-25 06:02:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13518684160. Throughput: 0: 42924.7. Samples: 13518762180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:53,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 06:02:56,654][15401] Updated weights for policy 0, policy_version 825123 (0.0024) [2024-06-25 06:02:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13518897152. Throughput: 0: 42900.5. Samples: 13519016820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:02:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 06:02:59,859][15401] Updated weights for policy 0, policy_version 825133 (0.0033) [2024-06-25 06:03:03,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 13519093760. Throughput: 0: 42837.6. Samples: 13519276400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:03,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 06:03:04,405][15349] Signal inference workers to stop experience collection... (200150 times) [2024-06-25 06:03:04,407][15349] Signal inference workers to resume experience collection... (200150 times) [2024-06-25 06:03:04,418][15401] Updated weights for policy 0, policy_version 825143 (0.0031) [2024-06-25 06:03:04,429][15401] InferenceWorker_p0-w0: stopping experience collection (200150 times) [2024-06-25 06:03:04,429][15401] InferenceWorker_p0-w0: resuming experience collection (200150 times) [2024-06-25 06:03:07,474][15401] Updated weights for policy 0, policy_version 825153 (0.0049) [2024-06-25 06:03:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13519339520. Throughput: 0: 42761.2. Samples: 13519401720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:08,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 06:03:12,097][15401] Updated weights for policy 0, policy_version 825163 (0.0032) [2024-06-25 06:03:13,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13519536128. Throughput: 0: 42724.7. Samples: 13519661840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:13,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-25 06:03:15,124][15401] Updated weights for policy 0, policy_version 825173 (0.0022) [2024-06-25 06:03:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13519749120. Throughput: 0: 42524.4. Samples: 13519911440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:18,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 06:03:19,894][15401] Updated weights for policy 0, policy_version 825183 (0.0041) [2024-06-25 06:03:22,671][15401] Updated weights for policy 0, policy_version 825193 (0.0045) [2024-06-25 06:03:23,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 13519978496. Throughput: 0: 42690.2. Samples: 13520042060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:23,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 06:03:27,643][15401] Updated weights for policy 0, policy_version 825203 (0.0037) [2024-06-25 06:03:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 13520175104. Throughput: 0: 42938.1. Samples: 13520312860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 06:03:30,321][15401] Updated weights for policy 0, policy_version 825213 (0.0042) [2024-06-25 06:03:33,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13520388096. Throughput: 0: 42645.2. Samples: 13520559520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:33,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 06:03:35,279][15401] Updated weights for policy 0, policy_version 825223 (0.0035) [2024-06-25 06:03:38,075][15401] Updated weights for policy 0, policy_version 825233 (0.0029) [2024-06-25 06:03:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 13520617472. Throughput: 0: 42842.8. Samples: 13520690100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 06:03:42,843][15401] Updated weights for policy 0, policy_version 825243 (0.0034) [2024-06-25 06:03:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13520797696. Throughput: 0: 42898.6. Samples: 13520947260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 06:03:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825245_13520814080.pth... [2024-06-25 06:03:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824620_13510574080.pth [2024-06-25 06:03:45,850][15401] Updated weights for policy 0, policy_version 825253 (0.0025) [2024-06-25 06:03:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 13521027072. Throughput: 0: 42572.2. Samples: 13521192040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 06:03:50,486][15401] Updated weights for policy 0, policy_version 825263 (0.0030) [2024-06-25 06:03:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 13521256448. Throughput: 0: 42795.3. Samples: 13521327500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 06:03:53,508][15401] Updated weights for policy 0, policy_version 825273 (0.0024) [2024-06-25 06:03:58,167][15401] Updated weights for policy 0, policy_version 825283 (0.0034) [2024-06-25 06:03:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13521453056. Throughput: 0: 42831.2. Samples: 13521589240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:03:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 06:04:01,076][15401] Updated weights for policy 0, policy_version 825293 (0.0034) [2024-06-25 06:04:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 13521682432. Throughput: 0: 42749.7. Samples: 13521835180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:04:03,394][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 06:04:05,826][15401] Updated weights for policy 0, policy_version 825303 (0.0036) [2024-06-25 06:04:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13521895424. Throughput: 0: 42767.9. Samples: 13521966520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:04:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 06:04:08,740][15401] Updated weights for policy 0, policy_version 825313 (0.0038) [2024-06-25 06:04:10,499][15349] Signal inference workers to stop experience collection... (200200 times) [2024-06-25 06:04:10,499][15349] Signal inference workers to resume experience collection... (200200 times) [2024-06-25 06:04:10,548][15401] InferenceWorker_p0-w0: stopping experience collection (200200 times) [2024-06-25 06:04:10,548][15401] InferenceWorker_p0-w0: resuming experience collection (200200 times) [2024-06-25 06:04:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13522075648. Throughput: 0: 42614.8. Samples: 13522230520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:04:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 06:04:13,459][15401] Updated weights for policy 0, policy_version 825323 (0.0024) [2024-06-25 06:04:16,534][15401] Updated weights for policy 0, policy_version 825333 (0.0042) [2024-06-25 06:04:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13522321408. Throughput: 0: 42516.5. Samples: 13522472760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 06:04:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 06:04:21,314][15401] Updated weights for policy 0, policy_version 825343 (0.0035) [2024-06-25 06:04:23,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 13522550784. Throughput: 0: 42684.9. Samples: 13522610920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 06:04:24,238][15401] Updated weights for policy 0, policy_version 825353 (0.0038) [2024-06-25 06:04:28,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 13522698240. Throughput: 0: 42648.3. Samples: 13522866440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 06:04:29,032][15401] Updated weights for policy 0, policy_version 825363 (0.0044) [2024-06-25 06:04:31,931][15401] Updated weights for policy 0, policy_version 825373 (0.0027) [2024-06-25 06:04:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13522976768. Throughput: 0: 42692.8. Samples: 13523113220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 06:04:36,557][15401] Updated weights for policy 0, policy_version 825383 (0.0037) [2024-06-25 06:04:38,392][15132] Fps is (10 sec: 49141.0, 60 sec: 42869.7, 300 sec: 42765.0). Total num frames: 13523189760. Throughput: 0: 42830.5. Samples: 13523254980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:38,393][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 06:04:39,536][15401] Updated weights for policy 0, policy_version 825393 (0.0031) [2024-06-25 06:04:43,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13523353600. Throughput: 0: 42616.0. Samples: 13523506960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 06:04:44,082][15401] Updated weights for policy 0, policy_version 825403 (0.0050) [2024-06-25 06:04:47,101][15401] Updated weights for policy 0, policy_version 825413 (0.0040) [2024-06-25 06:04:48,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13523615744. Throughput: 0: 42614.6. Samples: 13523752840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 06:04:51,754][15401] Updated weights for policy 0, policy_version 825423 (0.0043) [2024-06-25 06:04:53,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42876.5). Total num frames: 13523828736. Throughput: 0: 42851.3. Samples: 13523894820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:53,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 06:04:54,719][15401] Updated weights for policy 0, policy_version 825433 (0.0030) [2024-06-25 06:04:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13523992576. Throughput: 0: 42629.8. Samples: 13524148860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:04:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 06:04:59,222][15401] Updated weights for policy 0, policy_version 825443 (0.0044) [2024-06-25 06:05:02,640][15401] Updated weights for policy 0, policy_version 825453 (0.0032) [2024-06-25 06:05:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13524254720. Throughput: 0: 42798.7. Samples: 13524398700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 06:05:07,128][15401] Updated weights for policy 0, policy_version 825463 (0.0032) [2024-06-25 06:05:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13524467712. Throughput: 0: 42721.8. Samples: 13524533400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 06:05:10,178][15401] Updated weights for policy 0, policy_version 825473 (0.0031) [2024-06-25 06:05:13,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13524631552. Throughput: 0: 42705.0. Samples: 13524788160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 06:05:14,724][15401] Updated weights for policy 0, policy_version 825483 (0.0043) [2024-06-25 06:05:17,698][15401] Updated weights for policy 0, policy_version 825493 (0.0033) [2024-06-25 06:05:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13524893696. Throughput: 0: 42804.4. Samples: 13525039420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:05:22,238][15401] Updated weights for policy 0, policy_version 825503 (0.0039) [2024-06-25 06:05:23,396][15132] Fps is (10 sec: 45845.7, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 13525090304. Throughput: 0: 42661.1. Samples: 13525174900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:23,397][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 06:05:25,814][15401] Updated weights for policy 0, policy_version 825513 (0.0031) [2024-06-25 06:05:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13525286912. Throughput: 0: 42626.6. Samples: 13525425160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 06:05:29,779][15401] Updated weights for policy 0, policy_version 825523 (0.0037) [2024-06-25 06:05:33,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13525516288. Throughput: 0: 43054.9. Samples: 13525690300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 06:05:33,419][15401] Updated weights for policy 0, policy_version 825533 (0.0026) [2024-06-25 06:05:37,224][15401] Updated weights for policy 0, policy_version 825543 (0.0030) [2024-06-25 06:05:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 13525745664. Throughput: 0: 42721.8. Samples: 13525817300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 06:05:40,705][15349] Signal inference workers to stop experience collection... (200250 times) [2024-06-25 06:05:40,714][15349] Signal inference workers to resume experience collection... (200250 times) [2024-06-25 06:05:40,753][15401] InferenceWorker_p0-w0: stopping experience collection (200250 times) [2024-06-25 06:05:40,760][15401] InferenceWorker_p0-w0: resuming experience collection (200250 times) [2024-06-25 06:05:40,882][15401] Updated weights for policy 0, policy_version 825553 (0.0031) [2024-06-25 06:05:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13525958656. Throughput: 0: 42840.8. Samples: 13526076700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 06:05:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825559_13525958656.pth... [2024-06-25 06:05:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000824933_13515702272.pth [2024-06-25 06:05:44,798][15401] Updated weights for policy 0, policy_version 825563 (0.0041) [2024-06-25 06:05:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13526171648. Throughput: 0: 42980.0. Samples: 13526332800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 06:05:48,466][15401] Updated weights for policy 0, policy_version 825573 (0.0041) [2024-06-25 06:05:52,463][15401] Updated weights for policy 0, policy_version 825583 (0.0031) [2024-06-25 06:05:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13526384640. Throughput: 0: 42930.1. Samples: 13526465260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 06:05:55,930][15401] Updated weights for policy 0, policy_version 825593 (0.0028) [2024-06-25 06:05:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13526581248. Throughput: 0: 42954.1. Samples: 13526721100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:05:58,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 06:05:59,932][15401] Updated weights for policy 0, policy_version 825603 (0.0037) [2024-06-25 06:06:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13526827008. Throughput: 0: 43058.3. Samples: 13526977040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:06:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 06:06:03,675][15401] Updated weights for policy 0, policy_version 825613 (0.0039) [2024-06-25 06:06:07,609][15401] Updated weights for policy 0, policy_version 825623 (0.0027) [2024-06-25 06:06:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13527040000. Throughput: 0: 42989.1. Samples: 13527109140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 06:06:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 06:06:11,180][15401] Updated weights for policy 0, policy_version 825633 (0.0028) [2024-06-25 06:06:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13527220224. Throughput: 0: 43073.3. Samples: 13527363460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:13,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 06:06:15,339][15401] Updated weights for policy 0, policy_version 825643 (0.0033) [2024-06-25 06:06:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13527465984. Throughput: 0: 42894.9. Samples: 13527620580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 06:06:18,707][15401] Updated weights for policy 0, policy_version 825653 (0.0028) [2024-06-25 06:06:22,901][15401] Updated weights for policy 0, policy_version 825663 (0.0030) [2024-06-25 06:06:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43149.0, 300 sec: 42820.5). Total num frames: 13527678976. Throughput: 0: 43048.2. Samples: 13527754480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:23,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 06:06:26,350][15401] Updated weights for policy 0, policy_version 825673 (0.0031) [2024-06-25 06:06:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 13527859200. Throughput: 0: 42847.3. Samples: 13528004820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 06:06:30,641][15401] Updated weights for policy 0, policy_version 825683 (0.0044) [2024-06-25 06:06:33,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13528104960. Throughput: 0: 42711.1. Samples: 13528254800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 06:06:34,041][15401] Updated weights for policy 0, policy_version 825693 (0.0042) [2024-06-25 06:06:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 13528301568. Throughput: 0: 42704.9. Samples: 13528386980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:38,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 06:06:38,508][15401] Updated weights for policy 0, policy_version 825703 (0.0043) [2024-06-25 06:06:42,320][15401] Updated weights for policy 0, policy_version 825713 (0.0034) [2024-06-25 06:06:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13528498176. Throughput: 0: 42607.7. Samples: 13528638440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:43,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 06:06:46,134][15401] Updated weights for policy 0, policy_version 825723 (0.0040) [2024-06-25 06:06:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13528727552. Throughput: 0: 42505.2. Samples: 13528889780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:48,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:06:49,856][15401] Updated weights for policy 0, policy_version 825733 (0.0035) [2024-06-25 06:06:53,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 13528940544. Throughput: 0: 42592.3. Samples: 13529026060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:53,396][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 06:06:53,977][15401] Updated weights for policy 0, policy_version 825743 (0.0026) [2024-06-25 06:06:57,861][15401] Updated weights for policy 0, policy_version 825753 (0.0032) [2024-06-25 06:06:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13529137152. Throughput: 0: 42463.1. Samples: 13529274300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:06:58,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 06:07:01,567][15401] Updated weights for policy 0, policy_version 825763 (0.0028) [2024-06-25 06:07:03,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13529366528. Throughput: 0: 42402.3. Samples: 13529528680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 06:07:05,476][15401] Updated weights for policy 0, policy_version 825773 (0.0033) [2024-06-25 06:07:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13529563136. Throughput: 0: 42331.2. Samples: 13529659380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 06:07:09,640][15401] Updated weights for policy 0, policy_version 825783 (0.0028) [2024-06-25 06:07:12,897][15401] Updated weights for policy 0, policy_version 825793 (0.0039) [2024-06-25 06:07:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13529792512. Throughput: 0: 42371.1. Samples: 13529911520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 06:07:15,316][15349] Signal inference workers to stop experience collection... (200300 times) [2024-06-25 06:07:15,316][15349] Signal inference workers to resume experience collection... (200300 times) [2024-06-25 06:07:15,327][15401] InferenceWorker_p0-w0: stopping experience collection (200300 times) [2024-06-25 06:07:15,345][15401] InferenceWorker_p0-w0: resuming experience collection (200300 times) [2024-06-25 06:07:17,093][15401] Updated weights for policy 0, policy_version 825803 (0.0038) [2024-06-25 06:07:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13529989120. Throughput: 0: 42577.7. Samples: 13530170800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 06:07:20,457][15401] Updated weights for policy 0, policy_version 825813 (0.0024) [2024-06-25 06:07:23,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 13530202112. Throughput: 0: 42467.0. Samples: 13530298000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:23,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 06:07:24,748][15401] Updated weights for policy 0, policy_version 825823 (0.0027) [2024-06-25 06:07:28,072][15401] Updated weights for policy 0, policy_version 825833 (0.0031) [2024-06-25 06:07:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13530447872. Throughput: 0: 42442.1. Samples: 13530548340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:28,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 06:07:32,612][15401] Updated weights for policy 0, policy_version 825843 (0.0031) [2024-06-25 06:07:33,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13530644480. Throughput: 0: 42708.4. Samples: 13530811660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 06:07:35,605][15401] Updated weights for policy 0, policy_version 825853 (0.0028) [2024-06-25 06:07:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13530857472. Throughput: 0: 42410.8. Samples: 13530934280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 06:07:40,193][15401] Updated weights for policy 0, policy_version 825863 (0.0031) [2024-06-25 06:07:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13531086848. Throughput: 0: 42723.0. Samples: 13531196840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:43,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 06:07:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825872_13531086848.pth... [2024-06-25 06:07:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825245_13520814080.pth [2024-06-25 06:07:43,699][15401] Updated weights for policy 0, policy_version 825873 (0.0041) [2024-06-25 06:07:47,749][15401] Updated weights for policy 0, policy_version 825883 (0.0044) [2024-06-25 06:07:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13531283456. Throughput: 0: 42818.6. Samples: 13531455520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 06:07:51,558][15401] Updated weights for policy 0, policy_version 825893 (0.0031) [2024-06-25 06:07:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 13531496448. Throughput: 0: 42621.8. Samples: 13531577360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 06:07:55,414][15401] Updated weights for policy 0, policy_version 825903 (0.0037) [2024-06-25 06:07:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 13531725824. Throughput: 0: 42781.3. Samples: 13531836680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-25 06:07:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:07:59,086][15401] Updated weights for policy 0, policy_version 825913 (0.0024) [2024-06-25 06:08:02,949][15401] Updated weights for policy 0, policy_version 825923 (0.0032) [2024-06-25 06:08:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13531922432. Throughput: 0: 42827.1. Samples: 13532098020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 06:08:06,450][15401] Updated weights for policy 0, policy_version 825933 (0.0028) [2024-06-25 06:08:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13532151808. Throughput: 0: 42870.8. Samples: 13532227180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 06:08:10,822][15401] Updated weights for policy 0, policy_version 825943 (0.0034) [2024-06-25 06:08:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13532364800. Throughput: 0: 43080.6. Samples: 13532486960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 06:08:14,215][15401] Updated weights for policy 0, policy_version 825953 (0.0036) [2024-06-25 06:08:18,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 13532561408. Throughput: 0: 42989.6. Samples: 13532746180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:18,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 06:08:18,409][15401] Updated weights for policy 0, policy_version 825963 (0.0031) [2024-06-25 06:08:21,988][15401] Updated weights for policy 0, policy_version 825973 (0.0041) [2024-06-25 06:08:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 13532807168. Throughput: 0: 42911.6. Samples: 13532865300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:23,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 06:08:26,218][15401] Updated weights for policy 0, policy_version 825983 (0.0051) [2024-06-25 06:08:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13533020160. Throughput: 0: 42931.2. Samples: 13533128740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:28,390][15132] Avg episode reward: [(0, '0.207')] [2024-06-25 06:08:29,490][15401] Updated weights for policy 0, policy_version 825993 (0.0032) [2024-06-25 06:08:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13533200384. Throughput: 0: 42967.9. Samples: 13533389080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:33,399][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 06:08:33,768][15401] Updated weights for policy 0, policy_version 826003 (0.0037) [2024-06-25 06:08:36,929][15401] Updated weights for policy 0, policy_version 826013 (0.0033) [2024-06-25 06:08:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13533446144. Throughput: 0: 42927.2. Samples: 13533509080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 06:08:39,187][15349] Signal inference workers to stop experience collection... (200350 times) [2024-06-25 06:08:39,188][15349] Signal inference workers to resume experience collection... (200350 times) [2024-06-25 06:08:39,228][15401] InferenceWorker_p0-w0: stopping experience collection (200350 times) [2024-06-25 06:08:39,228][15401] InferenceWorker_p0-w0: resuming experience collection (200350 times) [2024-06-25 06:08:41,404][15401] Updated weights for policy 0, policy_version 826023 (0.0043) [2024-06-25 06:08:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 13533659136. Throughput: 0: 42973.8. Samples: 13533770500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 06:08:44,715][15401] Updated weights for policy 0, policy_version 826033 (0.0034) [2024-06-25 06:08:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13533839360. Throughput: 0: 42878.2. Samples: 13534027540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 06:08:49,223][15401] Updated weights for policy 0, policy_version 826043 (0.0041) [2024-06-25 06:08:52,260][15401] Updated weights for policy 0, policy_version 826053 (0.0029) [2024-06-25 06:08:53,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 13534101504. Throughput: 0: 42654.3. Samples: 13534146720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:53,393][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 06:08:56,859][15401] Updated weights for policy 0, policy_version 826063 (0.0027) [2024-06-25 06:08:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13534298112. Throughput: 0: 42785.3. Samples: 13534412300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:08:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 06:08:59,890][15401] Updated weights for policy 0, policy_version 826073 (0.0022) [2024-06-25 06:09:03,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13534478336. Throughput: 0: 42767.0. Samples: 13534670700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 06:09:04,486][15401] Updated weights for policy 0, policy_version 826083 (0.0031) [2024-06-25 06:09:07,611][15401] Updated weights for policy 0, policy_version 826093 (0.0042) [2024-06-25 06:09:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13534740480. Throughput: 0: 42830.8. Samples: 13534792680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 06:09:12,379][15401] Updated weights for policy 0, policy_version 826103 (0.0036) [2024-06-25 06:09:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13534904320. Throughput: 0: 42694.6. Samples: 13535050000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 06:09:15,187][15401] Updated weights for policy 0, policy_version 826113 (0.0029) [2024-06-25 06:09:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13535117312. Throughput: 0: 42599.6. Samples: 13535306060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 06:09:20,186][15401] Updated weights for policy 0, policy_version 826123 (0.0045) [2024-06-25 06:09:23,114][15401] Updated weights for policy 0, policy_version 826133 (0.0024) [2024-06-25 06:09:23,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 13535379456. Throughput: 0: 42665.3. Samples: 13535429020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 06:09:27,831][15401] Updated weights for policy 0, policy_version 826143 (0.0026) [2024-06-25 06:09:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13535559680. Throughput: 0: 42686.3. Samples: 13535691380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 06:09:30,791][15401] Updated weights for policy 0, policy_version 826153 (0.0040) [2024-06-25 06:09:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 13535772672. Throughput: 0: 42521.8. Samples: 13535941020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 06:09:35,416][15401] Updated weights for policy 0, policy_version 826163 (0.0027) [2024-06-25 06:09:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13536002048. Throughput: 0: 42832.1. Samples: 13536074060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 06:09:38,508][15401] Updated weights for policy 0, policy_version 826173 (0.0054) [2024-06-25 06:09:42,995][15401] Updated weights for policy 0, policy_version 826183 (0.0042) [2024-06-25 06:09:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13536198656. Throughput: 0: 42642.2. Samples: 13536331200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 06:09:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 06:09:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826185_13536215040.pth... [2024-06-25 06:09:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825559_13525958656.pth [2024-06-25 06:09:46,049][15401] Updated weights for policy 0, policy_version 826193 (0.0041) [2024-06-25 06:09:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 13536411648. Throughput: 0: 42566.7. Samples: 13536586200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:09:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 06:09:50,572][15401] Updated weights for policy 0, policy_version 826203 (0.0042) [2024-06-25 06:09:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 13536641024. Throughput: 0: 42783.6. Samples: 13536717940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:09:53,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 06:09:53,788][15401] Updated weights for policy 0, policy_version 826213 (0.0037) [2024-06-25 06:09:58,283][15401] Updated weights for policy 0, policy_version 826223 (0.0038) [2024-06-25 06:09:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13536837632. Throughput: 0: 42782.3. Samples: 13536975200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:09:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 06:10:01,387][15401] Updated weights for policy 0, policy_version 826233 (0.0033) [2024-06-25 06:10:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13537067008. Throughput: 0: 42656.1. Samples: 13537225580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 06:10:06,146][15401] Updated weights for policy 0, policy_version 826243 (0.0031) [2024-06-25 06:10:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13537280000. Throughput: 0: 42977.2. Samples: 13537363000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 06:10:09,157][15401] Updated weights for policy 0, policy_version 826253 (0.0049) [2024-06-25 06:10:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13537460224. Throughput: 0: 42773.1. Samples: 13537616180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 06:10:13,731][15401] Updated weights for policy 0, policy_version 826263 (0.0041) [2024-06-25 06:10:14,172][15349] Signal inference workers to stop experience collection... (200400 times) [2024-06-25 06:10:14,173][15349] Signal inference workers to resume experience collection... (200400 times) [2024-06-25 06:10:14,198][15401] InferenceWorker_p0-w0: stopping experience collection (200400 times) [2024-06-25 06:10:14,230][15401] InferenceWorker_p0-w0: resuming experience collection (200400 times) [2024-06-25 06:10:17,017][15401] Updated weights for policy 0, policy_version 826273 (0.0029) [2024-06-25 06:10:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42821.5). Total num frames: 13537722368. Throughput: 0: 42712.4. Samples: 13537863080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 06:10:21,651][15401] Updated weights for policy 0, policy_version 826283 (0.0032) [2024-06-25 06:10:23,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13537935360. Throughput: 0: 42874.1. Samples: 13538003400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 06:10:24,654][15401] Updated weights for policy 0, policy_version 826293 (0.0027) [2024-06-25 06:10:28,396][15132] Fps is (10 sec: 39297.6, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 13538115584. Throughput: 0: 42773.7. Samples: 13538256280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:28,396][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 06:10:29,164][15401] Updated weights for policy 0, policy_version 826303 (0.0037) [2024-06-25 06:10:32,285][15401] Updated weights for policy 0, policy_version 826313 (0.0036) [2024-06-25 06:10:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13538361344. Throughput: 0: 42712.0. Samples: 13538508240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 06:10:36,710][15401] Updated weights for policy 0, policy_version 826323 (0.0040) [2024-06-25 06:10:38,390][15132] Fps is (10 sec: 45902.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13538574336. Throughput: 0: 42765.3. Samples: 13538642380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 06:10:39,884][15401] Updated weights for policy 0, policy_version 826333 (0.0034) [2024-06-25 06:10:43,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13538754560. Throughput: 0: 42558.7. Samples: 13538890440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:43,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 06:10:44,325][15401] Updated weights for policy 0, policy_version 826343 (0.0031) [2024-06-25 06:10:47,525][15401] Updated weights for policy 0, policy_version 826353 (0.0044) [2024-06-25 06:10:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13538983936. Throughput: 0: 42773.2. Samples: 13539150380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 06:10:51,859][15401] Updated weights for policy 0, policy_version 826363 (0.0038) [2024-06-25 06:10:53,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13539196928. Throughput: 0: 42652.4. Samples: 13539282360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 06:10:55,243][15401] Updated weights for policy 0, policy_version 826373 (0.0027) [2024-06-25 06:10:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13539409920. Throughput: 0: 42694.3. Samples: 13539537420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:10:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 06:10:59,376][15401] Updated weights for policy 0, policy_version 826383 (0.0024) [2024-06-25 06:11:02,929][15401] Updated weights for policy 0, policy_version 826393 (0.0033) [2024-06-25 06:11:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13539622912. Throughput: 0: 42803.1. Samples: 13539789220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 06:11:07,050][15401] Updated weights for policy 0, policy_version 826403 (0.0029) [2024-06-25 06:11:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13539852288. Throughput: 0: 42644.9. Samples: 13539922420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:08,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 06:11:10,873][15401] Updated weights for policy 0, policy_version 826413 (0.0033) [2024-06-25 06:11:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 13540065280. Throughput: 0: 42681.7. Samples: 13540176700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:13,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 06:11:14,706][15401] Updated weights for policy 0, policy_version 826423 (0.0041) [2024-06-25 06:11:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13540261888. Throughput: 0: 42809.8. Samples: 13540434680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:18,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 06:11:18,615][15401] Updated weights for policy 0, policy_version 826433 (0.0039) [2024-06-25 06:11:22,309][15401] Updated weights for policy 0, policy_version 826443 (0.0029) [2024-06-25 06:11:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13540491264. Throughput: 0: 42676.1. Samples: 13540562800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 06:11:26,127][15401] Updated weights for policy 0, policy_version 826453 (0.0037) [2024-06-25 06:11:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43149.0, 300 sec: 42709.5). Total num frames: 13540704256. Throughput: 0: 42902.4. Samples: 13540820940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 06:11:29,857][15401] Updated weights for policy 0, policy_version 826463 (0.0030) [2024-06-25 06:11:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13540900864. Throughput: 0: 42908.1. Samples: 13541081240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 06:11:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 06:11:34,135][15401] Updated weights for policy 0, policy_version 826473 (0.0038) [2024-06-25 06:11:37,480][15401] Updated weights for policy 0, policy_version 826483 (0.0033) [2024-06-25 06:11:38,037][15349] Signal inference workers to stop experience collection... (200450 times) [2024-06-25 06:11:38,037][15349] Signal inference workers to resume experience collection... (200450 times) [2024-06-25 06:11:38,051][15401] InferenceWorker_p0-w0: stopping experience collection (200450 times) [2024-06-25 06:11:38,051][15401] InferenceWorker_p0-w0: resuming experience collection (200450 times) [2024-06-25 06:11:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13541146624. Throughput: 0: 42753.8. Samples: 13541206280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:11:38,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 06:11:41,605][15401] Updated weights for policy 0, policy_version 826493 (0.0036) [2024-06-25 06:11:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13541343232. Throughput: 0: 42944.5. Samples: 13541469920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:11:43,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 06:11:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826499_13541359616.pth... [2024-06-25 06:11:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000825872_13531086848.pth [2024-06-25 06:11:45,006][15401] Updated weights for policy 0, policy_version 826503 (0.0046) [2024-06-25 06:11:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 13541539840. Throughput: 0: 43176.4. Samples: 13541732160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:11:48,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 06:11:49,164][15401] Updated weights for policy 0, policy_version 826513 (0.0033) [2024-06-25 06:11:52,522][15401] Updated weights for policy 0, policy_version 826523 (0.0036) [2024-06-25 06:11:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13541785600. Throughput: 0: 42903.7. Samples: 13541853080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:11:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 06:11:56,931][15401] Updated weights for policy 0, policy_version 826533 (0.0035) [2024-06-25 06:11:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13542014976. Throughput: 0: 43060.4. Samples: 13542114420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:11:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:11:59,985][15401] Updated weights for policy 0, policy_version 826543 (0.0028) [2024-06-25 06:12:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13542195200. Throughput: 0: 43179.1. Samples: 13542377740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:03,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 06:12:04,474][15401] Updated weights for policy 0, policy_version 826553 (0.0032) [2024-06-25 06:12:07,760][15401] Updated weights for policy 0, policy_version 826563 (0.0044) [2024-06-25 06:12:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13542424576. Throughput: 0: 42844.9. Samples: 13542490820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:08,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 06:12:11,935][15401] Updated weights for policy 0, policy_version 826573 (0.0035) [2024-06-25 06:12:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 13542670336. Throughput: 0: 43107.1. Samples: 13542760760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:13,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-25 06:12:15,219][15401] Updated weights for policy 0, policy_version 826583 (0.0028) [2024-06-25 06:12:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13542850560. Throughput: 0: 42914.2. Samples: 13543012380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:18,390][15132] Avg episode reward: [(0, '0.149')] [2024-06-25 06:12:19,632][15401] Updated weights for policy 0, policy_version 826593 (0.0032) [2024-06-25 06:12:22,742][15401] Updated weights for policy 0, policy_version 826603 (0.0037) [2024-06-25 06:12:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 13543063552. Throughput: 0: 43010.3. Samples: 13543141840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:23,392][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 06:12:27,182][15401] Updated weights for policy 0, policy_version 826613 (0.0027) [2024-06-25 06:12:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13543276544. Throughput: 0: 43027.1. Samples: 13543406140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:28,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:12:30,274][15401] Updated weights for policy 0, policy_version 826623 (0.0039) [2024-06-25 06:12:33,393][15132] Fps is (10 sec: 42591.5, 60 sec: 43141.6, 300 sec: 42820.0). Total num frames: 13543489536. Throughput: 0: 42852.7. Samples: 13543660700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:33,394][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 06:12:34,831][15401] Updated weights for policy 0, policy_version 826633 (0.0027) [2024-06-25 06:12:37,736][15401] Updated weights for policy 0, policy_version 826643 (0.0027) [2024-06-25 06:12:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13543718912. Throughput: 0: 42976.8. Samples: 13543787040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:38,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-25 06:12:42,635][15401] Updated weights for policy 0, policy_version 826653 (0.0031) [2024-06-25 06:12:42,649][15349] Signal inference workers to stop experience collection... (200500 times) [2024-06-25 06:12:42,650][15349] Signal inference workers to resume experience collection... (200500 times) [2024-06-25 06:12:42,662][15401] InferenceWorker_p0-w0: stopping experience collection (200500 times) [2024-06-25 06:12:42,675][15401] InferenceWorker_p0-w0: resuming experience collection (200500 times) [2024-06-25 06:12:43,389][15132] Fps is (10 sec: 44254.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13543931904. Throughput: 0: 43066.8. Samples: 13544052420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 06:12:45,367][15401] Updated weights for policy 0, policy_version 826663 (0.0034) [2024-06-25 06:12:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13544128512. Throughput: 0: 42784.4. Samples: 13544303040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 06:12:50,160][15401] Updated weights for policy 0, policy_version 826673 (0.0025) [2024-06-25 06:12:53,041][15401] Updated weights for policy 0, policy_version 826683 (0.0034) [2024-06-25 06:12:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13544374272. Throughput: 0: 43009.7. Samples: 13544426260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 06:12:57,696][15401] Updated weights for policy 0, policy_version 826693 (0.0037) [2024-06-25 06:12:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13544570880. Throughput: 0: 42955.0. Samples: 13544693740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:12:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 06:13:00,612][15401] Updated weights for policy 0, policy_version 826703 (0.0024) [2024-06-25 06:13:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13544767488. Throughput: 0: 43009.3. Samples: 13544947800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:13:03,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-25 06:13:05,522][15401] Updated weights for policy 0, policy_version 826713 (0.0033) [2024-06-25 06:13:08,391][15132] Fps is (10 sec: 44229.2, 60 sec: 43143.2, 300 sec: 42875.8). Total num frames: 13545013248. Throughput: 0: 42912.1. Samples: 13545072860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:13:08,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 06:13:09,024][15401] Updated weights for policy 0, policy_version 826723 (0.0027) [2024-06-25 06:13:13,058][15401] Updated weights for policy 0, policy_version 826733 (0.0036) [2024-06-25 06:13:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 13545209856. Throughput: 0: 42838.0. Samples: 13545333860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:13:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 06:13:16,599][15401] Updated weights for policy 0, policy_version 826743 (0.0048) [2024-06-25 06:13:18,389][15132] Fps is (10 sec: 39329.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13545406464. Throughput: 0: 42962.1. Samples: 13545593820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:13:18,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:13:20,567][15401] Updated weights for policy 0, policy_version 826753 (0.0034) [2024-06-25 06:13:23,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 13545652224. Throughput: 0: 42891.6. Samples: 13545717160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 06:13:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 06:13:23,994][15401] Updated weights for policy 0, policy_version 826763 (0.0033) [2024-06-25 06:13:28,143][15401] Updated weights for policy 0, policy_version 826773 (0.0034) [2024-06-25 06:13:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13545848832. Throughput: 0: 42843.4. Samples: 13545980380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 06:13:31,592][15401] Updated weights for policy 0, policy_version 826783 (0.0043) [2024-06-25 06:13:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42874.3, 300 sec: 42765.0). Total num frames: 13546061824. Throughput: 0: 42873.2. Samples: 13546232340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:33,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 06:13:35,947][15401] Updated weights for policy 0, policy_version 826793 (0.0037) [2024-06-25 06:13:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13546307584. Throughput: 0: 42941.3. Samples: 13546358620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 06:13:39,241][15401] Updated weights for policy 0, policy_version 826803 (0.0031) [2024-06-25 06:13:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13546487808. Throughput: 0: 42717.5. Samples: 13546616020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 06:13:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826812_13546487808.pth... [2024-06-25 06:13:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826185_13536215040.pth [2024-06-25 06:13:43,637][15401] Updated weights for policy 0, policy_version 826813 (0.0037) [2024-06-25 06:13:47,017][15401] Updated weights for policy 0, policy_version 826823 (0.0028) [2024-06-25 06:13:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13546684416. Throughput: 0: 42686.7. Samples: 13546868700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:48,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 06:13:51,336][15401] Updated weights for policy 0, policy_version 826833 (0.0038) [2024-06-25 06:13:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13546946560. Throughput: 0: 42777.8. Samples: 13546997780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 06:13:54,710][15401] Updated weights for policy 0, policy_version 826843 (0.0027) [2024-06-25 06:13:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13547110400. Throughput: 0: 42613.1. Samples: 13547251440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:13:58,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 06:13:59,017][15401] Updated weights for policy 0, policy_version 826853 (0.0036) [2024-06-25 06:14:02,550][15401] Updated weights for policy 0, policy_version 826863 (0.0037) [2024-06-25 06:14:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13547339776. Throughput: 0: 42517.3. Samples: 13547507100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 06:14:06,711][15401] Updated weights for policy 0, policy_version 826873 (0.0030) [2024-06-25 06:14:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42599.7, 300 sec: 42931.7). Total num frames: 13547569152. Throughput: 0: 42646.3. Samples: 13547636240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 06:14:10,408][15401] Updated weights for policy 0, policy_version 826883 (0.0037) [2024-06-25 06:14:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13547765760. Throughput: 0: 42523.2. Samples: 13547893920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:13,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 06:14:14,497][15401] Updated weights for policy 0, policy_version 826893 (0.0031) [2024-06-25 06:14:18,102][15401] Updated weights for policy 0, policy_version 826903 (0.0038) [2024-06-25 06:14:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13547978752. Throughput: 0: 42591.7. Samples: 13548148960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 06:14:22,199][15401] Updated weights for policy 0, policy_version 826913 (0.0030) [2024-06-25 06:14:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 13548191744. Throughput: 0: 42655.6. Samples: 13548278220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:23,392][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 06:14:23,780][15349] Signal inference workers to stop experience collection... (200550 times) [2024-06-25 06:14:23,831][15401] InferenceWorker_p0-w0: stopping experience collection (200550 times) [2024-06-25 06:14:23,839][15349] Signal inference workers to resume experience collection... (200550 times) [2024-06-25 06:14:23,844][15401] InferenceWorker_p0-w0: resuming experience collection (200550 times) [2024-06-25 06:14:25,676][15401] Updated weights for policy 0, policy_version 826923 (0.0041) [2024-06-25 06:14:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13548404736. Throughput: 0: 42701.6. Samples: 13548537600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:28,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 06:14:29,787][15401] Updated weights for policy 0, policy_version 826933 (0.0043) [2024-06-25 06:14:33,332][15401] Updated weights for policy 0, policy_version 826943 (0.0042) [2024-06-25 06:14:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13548634112. Throughput: 0: 42787.5. Samples: 13548794140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 06:14:37,227][15401] Updated weights for policy 0, policy_version 826953 (0.0032) [2024-06-25 06:14:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13548847104. Throughput: 0: 42972.0. Samples: 13548931520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 06:14:40,873][15401] Updated weights for policy 0, policy_version 826963 (0.0036) [2024-06-25 06:14:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13549043712. Throughput: 0: 42972.3. Samples: 13549185200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 06:14:44,761][15401] Updated weights for policy 0, policy_version 826973 (0.0028) [2024-06-25 06:14:48,326][15401] Updated weights for policy 0, policy_version 826983 (0.0031) [2024-06-25 06:14:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13549289472. Throughput: 0: 42993.8. Samples: 13549441820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 06:14:52,368][15401] Updated weights for policy 0, policy_version 826993 (0.0029) [2024-06-25 06:14:53,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13549518848. Throughput: 0: 43048.4. Samples: 13549573420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:53,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 06:14:55,886][15401] Updated weights for policy 0, policy_version 827003 (0.0043) [2024-06-25 06:14:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13549699072. Throughput: 0: 43087.6. Samples: 13549832860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:14:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:14:59,868][15401] Updated weights for policy 0, policy_version 827013 (0.0034) [2024-06-25 06:15:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13549928448. Throughput: 0: 43145.9. Samples: 13550090520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:15:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 06:15:03,489][15401] Updated weights for policy 0, policy_version 827023 (0.0042) [2024-06-25 06:15:07,820][15401] Updated weights for policy 0, policy_version 827033 (0.0032) [2024-06-25 06:15:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13550141440. Throughput: 0: 43078.3. Samples: 13550216640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:15:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 06:15:11,115][15401] Updated weights for policy 0, policy_version 827043 (0.0040) [2024-06-25 06:15:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13550338048. Throughput: 0: 42973.3. Samples: 13550471400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 06:15:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 06:15:15,232][15401] Updated weights for policy 0, policy_version 827053 (0.0030) [2024-06-25 06:15:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13550567424. Throughput: 0: 43035.5. Samples: 13550730740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 06:15:18,991][15401] Updated weights for policy 0, policy_version 827063 (0.0035) [2024-06-25 06:15:23,024][15401] Updated weights for policy 0, policy_version 827073 (0.0034) [2024-06-25 06:15:23,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43419.3, 300 sec: 42988.1). Total num frames: 13550796800. Throughput: 0: 42816.0. Samples: 13550858240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 06:15:26,661][15401] Updated weights for policy 0, policy_version 827083 (0.0060) [2024-06-25 06:15:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13550993408. Throughput: 0: 42949.9. Samples: 13551117940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 06:15:30,522][15401] Updated weights for policy 0, policy_version 827093 (0.0043) [2024-06-25 06:15:33,391][15132] Fps is (10 sec: 42591.8, 60 sec: 43143.4, 300 sec: 42875.9). Total num frames: 13551222784. Throughput: 0: 42758.9. Samples: 13551366040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:33,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 06:15:34,683][15401] Updated weights for policy 0, policy_version 827103 (0.0044) [2024-06-25 06:15:38,112][15401] Updated weights for policy 0, policy_version 827113 (0.0045) [2024-06-25 06:15:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 13551419392. Throughput: 0: 42788.1. Samples: 13551498880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 06:15:42,190][15401] Updated weights for policy 0, policy_version 827123 (0.0031) [2024-06-25 06:15:43,389][15132] Fps is (10 sec: 40966.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13551632384. Throughput: 0: 42667.1. Samples: 13551752880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:43,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 06:15:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827126_13551632384.pth... [2024-06-25 06:15:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826499_13541359616.pth [2024-06-25 06:15:45,954][15401] Updated weights for policy 0, policy_version 827133 (0.0031) [2024-06-25 06:15:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 13551861760. Throughput: 0: 42547.5. Samples: 13552005160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 06:15:49,648][15401] Updated weights for policy 0, policy_version 827143 (0.0033) [2024-06-25 06:15:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 13552041984. Throughput: 0: 42740.9. Samples: 13552139980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 06:15:53,543][15349] Signal inference workers to stop experience collection... (200600 times) [2024-06-25 06:15:53,580][15401] InferenceWorker_p0-w0: stopping experience collection (200600 times) [2024-06-25 06:15:53,589][15349] Signal inference workers to resume experience collection... (200600 times) [2024-06-25 06:15:53,599][15401] InferenceWorker_p0-w0: resuming experience collection (200600 times) [2024-06-25 06:15:53,725][15401] Updated weights for policy 0, policy_version 827153 (0.0031) [2024-06-25 06:15:57,351][15401] Updated weights for policy 0, policy_version 827163 (0.0029) [2024-06-25 06:15:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 13552271360. Throughput: 0: 42741.7. Samples: 13552394780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:15:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 06:16:01,355][15401] Updated weights for policy 0, policy_version 827173 (0.0026) [2024-06-25 06:16:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13552484352. Throughput: 0: 42579.6. Samples: 13552646820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 06:16:04,947][15401] Updated weights for policy 0, policy_version 827183 (0.0041) [2024-06-25 06:16:08,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13552697344. Throughput: 0: 42759.5. Samples: 13552782520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:08,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 06:16:08,958][15401] Updated weights for policy 0, policy_version 827193 (0.0040) [2024-06-25 06:16:12,326][15401] Updated weights for policy 0, policy_version 827203 (0.0024) [2024-06-25 06:16:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13552926720. Throughput: 0: 42728.1. Samples: 13553040700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 06:16:16,508][15401] Updated weights for policy 0, policy_version 827213 (0.0034) [2024-06-25 06:16:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13553123328. Throughput: 0: 42989.5. Samples: 13553300500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:18,390][15132] Avg episode reward: [(0, '0.169')] [2024-06-25 06:16:19,769][15401] Updated weights for policy 0, policy_version 827223 (0.0026) [2024-06-25 06:16:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13553336320. Throughput: 0: 42914.6. Samples: 13553430040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:23,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 06:16:24,009][15401] Updated weights for policy 0, policy_version 827233 (0.0040) [2024-06-25 06:16:27,535][15401] Updated weights for policy 0, policy_version 827243 (0.0030) [2024-06-25 06:16:28,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 13553582080. Throughput: 0: 42934.1. Samples: 13553685020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:28,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 06:16:31,870][15401] Updated weights for policy 0, policy_version 827253 (0.0028) [2024-06-25 06:16:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.5, 300 sec: 42820.6). Total num frames: 13553778688. Throughput: 0: 43052.9. Samples: 13553942540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 06:16:35,084][15401] Updated weights for policy 0, policy_version 827263 (0.0050) [2024-06-25 06:16:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13553991680. Throughput: 0: 42867.6. Samples: 13554069020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 06:16:39,405][15401] Updated weights for policy 0, policy_version 827273 (0.0039) [2024-06-25 06:16:42,655][15401] Updated weights for policy 0, policy_version 827283 (0.0033) [2024-06-25 06:16:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 13554204672. Throughput: 0: 42823.2. Samples: 13554321920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:43,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 06:16:47,027][15401] Updated weights for policy 0, policy_version 827293 (0.0032) [2024-06-25 06:16:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13554434048. Throughput: 0: 43064.5. Samples: 13554584720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 06:16:50,049][15401] Updated weights for policy 0, policy_version 827303 (0.0038) [2024-06-25 06:16:53,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13554630656. Throughput: 0: 42919.2. Samples: 13554713780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:53,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 06:16:54,582][15401] Updated weights for policy 0, policy_version 827313 (0.0029) [2024-06-25 06:16:57,931][15401] Updated weights for policy 0, policy_version 827323 (0.0037) [2024-06-25 06:16:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13554860032. Throughput: 0: 42824.4. Samples: 13554967800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 06:16:58,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 06:17:02,425][15401] Updated weights for policy 0, policy_version 827333 (0.0035) [2024-06-25 06:17:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13555073024. Throughput: 0: 42799.8. Samples: 13555226500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 06:17:05,994][15401] Updated weights for policy 0, policy_version 827343 (0.0028) [2024-06-25 06:17:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 13555269632. Throughput: 0: 42815.0. Samples: 13555356820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:08,393][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 06:17:10,181][15401] Updated weights for policy 0, policy_version 827353 (0.0027) [2024-06-25 06:17:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13555499008. Throughput: 0: 42661.3. Samples: 13555604680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 06:17:13,608][15401] Updated weights for policy 0, policy_version 827363 (0.0039) [2024-06-25 06:17:17,855][15401] Updated weights for policy 0, policy_version 827373 (0.0039) [2024-06-25 06:17:18,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13555695616. Throughput: 0: 42757.4. Samples: 13555866620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:18,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 06:17:18,900][15349] Signal inference workers to stop experience collection... (200650 times) [2024-06-25 06:17:18,904][15349] Signal inference workers to resume experience collection... (200650 times) [2024-06-25 06:17:18,916][15401] InferenceWorker_p0-w0: stopping experience collection (200650 times) [2024-06-25 06:17:18,930][15401] InferenceWorker_p0-w0: resuming experience collection (200650 times) [2024-06-25 06:17:21,246][15401] Updated weights for policy 0, policy_version 827383 (0.0028) [2024-06-25 06:17:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13555924992. Throughput: 0: 42806.5. Samples: 13555995320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 06:17:25,218][15401] Updated weights for policy 0, policy_version 827393 (0.0032) [2024-06-25 06:17:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42876.7). Total num frames: 13556137984. Throughput: 0: 42918.6. Samples: 13556253160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 06:17:29,060][15401] Updated weights for policy 0, policy_version 827403 (0.0040) [2024-06-25 06:17:33,001][15401] Updated weights for policy 0, policy_version 827413 (0.0029) [2024-06-25 06:17:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13556334592. Throughput: 0: 42801.9. Samples: 13556510800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 06:17:36,670][15401] Updated weights for policy 0, policy_version 827423 (0.0025) [2024-06-25 06:17:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13556547584. Throughput: 0: 42728.4. Samples: 13556636560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 06:17:40,575][15401] Updated weights for policy 0, policy_version 827433 (0.0040) [2024-06-25 06:17:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 13556776960. Throughput: 0: 42847.1. Samples: 13556895920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 06:17:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827440_13556776960.pth... [2024-06-25 06:17:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000826812_13546487808.pth [2024-06-25 06:17:44,174][15401] Updated weights for policy 0, policy_version 827443 (0.0030) [2024-06-25 06:17:48,120][15401] Updated weights for policy 0, policy_version 827453 (0.0035) [2024-06-25 06:17:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 13556989952. Throughput: 0: 42789.9. Samples: 13557152140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:48,401][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:17:51,676][15401] Updated weights for policy 0, policy_version 827463 (0.0036) [2024-06-25 06:17:53,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42597.0, 300 sec: 42764.8). Total num frames: 13557186560. Throughput: 0: 42813.8. Samples: 13557283420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:53,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 06:17:55,544][15401] Updated weights for policy 0, policy_version 827473 (0.0027) [2024-06-25 06:17:58,389][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13557415936. Throughput: 0: 43095.7. Samples: 13557543980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:17:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 06:17:59,868][15401] Updated weights for policy 0, policy_version 827483 (0.0035) [2024-06-25 06:18:03,077][15401] Updated weights for policy 0, policy_version 827493 (0.0034) [2024-06-25 06:18:03,390][15132] Fps is (10 sec: 45880.5, 60 sec: 42871.0, 300 sec: 42820.7). Total num frames: 13557645312. Throughput: 0: 42905.0. Samples: 13557797380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:03,391][15132] Avg episode reward: [(0, '0.918')] [2024-06-25 06:18:07,273][15401] Updated weights for policy 0, policy_version 827503 (0.0037) [2024-06-25 06:18:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 13557841920. Throughput: 0: 42967.6. Samples: 13557928860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 06:18:10,988][15401] Updated weights for policy 0, policy_version 827513 (0.0038) [2024-06-25 06:18:13,390][15132] Fps is (10 sec: 42600.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13558071296. Throughput: 0: 42958.1. Samples: 13558186280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:13,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 06:18:14,914][15401] Updated weights for policy 0, policy_version 827523 (0.0033) [2024-06-25 06:18:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13558267904. Throughput: 0: 42999.0. Samples: 13558445760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 06:18:18,591][15401] Updated weights for policy 0, policy_version 827533 (0.0031) [2024-06-25 06:18:22,479][15401] Updated weights for policy 0, policy_version 827543 (0.0033) [2024-06-25 06:18:23,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13558497280. Throughput: 0: 43009.3. Samples: 13558571980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 06:18:26,101][15401] Updated weights for policy 0, policy_version 827553 (0.0037) [2024-06-25 06:18:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13558710272. Throughput: 0: 42939.1. Samples: 13558828180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 06:18:30,055][15401] Updated weights for policy 0, policy_version 827563 (0.0033) [2024-06-25 06:18:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13558923264. Throughput: 0: 43159.2. Samples: 13559094200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 06:18:33,674][15401] Updated weights for policy 0, policy_version 827573 (0.0033) [2024-06-25 06:18:37,699][15401] Updated weights for policy 0, policy_version 827583 (0.0033) [2024-06-25 06:18:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 13559152640. Throughput: 0: 42985.0. Samples: 13559217660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 06:18:41,494][15401] Updated weights for policy 0, policy_version 827593 (0.0033) [2024-06-25 06:18:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13559365632. Throughput: 0: 43023.2. Samples: 13559480020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 06:18:45,217][15401] Updated weights for policy 0, policy_version 827603 (0.0040) [2024-06-25 06:18:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13559545856. Throughput: 0: 43218.1. Samples: 13559742160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 06:18:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 06:18:48,415][15349] Signal inference workers to stop experience collection... (200700 times) [2024-06-25 06:18:48,447][15401] InferenceWorker_p0-w0: stopping experience collection (200700 times) [2024-06-25 06:18:48,469][15349] Signal inference workers to resume experience collection... (200700 times) [2024-06-25 06:18:48,476][15401] InferenceWorker_p0-w0: resuming experience collection (200700 times) [2024-06-25 06:18:49,114][15401] Updated weights for policy 0, policy_version 827613 (0.0031) [2024-06-25 06:18:52,730][15401] Updated weights for policy 0, policy_version 827623 (0.0029) [2024-06-25 06:18:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43692.0, 300 sec: 43042.7). Total num frames: 13559808000. Throughput: 0: 42974.1. Samples: 13559862700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:18:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 06:18:57,100][15401] Updated weights for policy 0, policy_version 827633 (0.0033) [2024-06-25 06:18:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13560004608. Throughput: 0: 43034.5. Samples: 13560122820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:18:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 06:19:00,085][15401] Updated weights for policy 0, policy_version 827643 (0.0025) [2024-06-25 06:19:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.9, 300 sec: 42765.0). Total num frames: 13560184832. Throughput: 0: 42881.4. Samples: 13560375420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 06:19:04,698][15401] Updated weights for policy 0, policy_version 827653 (0.0027) [2024-06-25 06:19:07,944][15401] Updated weights for policy 0, policy_version 827663 (0.0043) [2024-06-25 06:19:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13560446976. Throughput: 0: 42882.2. Samples: 13560501680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:08,400][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 06:19:12,190][15401] Updated weights for policy 0, policy_version 827673 (0.0040) [2024-06-25 06:19:13,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42870.0, 300 sec: 42931.3). Total num frames: 13560643584. Throughput: 0: 43007.2. Samples: 13560763600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:13,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 06:19:15,662][15401] Updated weights for policy 0, policy_version 827683 (0.0029) [2024-06-25 06:19:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 13560840192. Throughput: 0: 42816.9. Samples: 13561020960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:18,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 06:19:19,804][15401] Updated weights for policy 0, policy_version 827693 (0.0038) [2024-06-25 06:19:23,163][15401] Updated weights for policy 0, policy_version 827703 (0.0032) [2024-06-25 06:19:23,390][15132] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13561085952. Throughput: 0: 42985.6. Samples: 13561152020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 06:19:27,337][15401] Updated weights for policy 0, policy_version 827713 (0.0041) [2024-06-25 06:19:28,396][15132] Fps is (10 sec: 45845.2, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 13561298944. Throughput: 0: 42986.7. Samples: 13561414700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:28,397][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 06:19:30,813][15401] Updated weights for policy 0, policy_version 827723 (0.0031) [2024-06-25 06:19:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13561495552. Throughput: 0: 42872.0. Samples: 13561671400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:33,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 06:19:35,040][15401] Updated weights for policy 0, policy_version 827733 (0.0040) [2024-06-25 06:19:38,303][15401] Updated weights for policy 0, policy_version 827743 (0.0025) [2024-06-25 06:19:38,389][15132] Fps is (10 sec: 44265.5, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13561741312. Throughput: 0: 43079.6. Samples: 13561801280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:38,394][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:19:42,597][15401] Updated weights for policy 0, policy_version 827753 (0.0026) [2024-06-25 06:19:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13561937920. Throughput: 0: 43134.6. Samples: 13562063880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 06:19:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827755_13561937920.pth... [2024-06-25 06:19:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827126_13551632384.pth [2024-06-25 06:19:45,938][15401] Updated weights for policy 0, policy_version 827763 (0.0035) [2024-06-25 06:19:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13562150912. Throughput: 0: 43072.4. Samples: 13562313680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:48,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 06:19:50,242][15401] Updated weights for policy 0, policy_version 827773 (0.0045) [2024-06-25 06:19:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13562363904. Throughput: 0: 43248.9. Samples: 13562447880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 06:19:53,603][15401] Updated weights for policy 0, policy_version 827783 (0.0032) [2024-06-25 06:19:58,097][15401] Updated weights for policy 0, policy_version 827793 (0.0032) [2024-06-25 06:19:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13562560512. Throughput: 0: 43138.2. Samples: 13562704720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:19:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 06:20:01,169][15401] Updated weights for policy 0, policy_version 827803 (0.0036) [2024-06-25 06:20:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13562789888. Throughput: 0: 42980.4. Samples: 13562955080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 06:20:05,607][15401] Updated weights for policy 0, policy_version 827813 (0.0037) [2024-06-25 06:20:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13563019264. Throughput: 0: 43041.4. Samples: 13563088880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 06:20:08,776][15349] Signal inference workers to stop experience collection... (200750 times) [2024-06-25 06:20:08,776][15349] Signal inference workers to resume experience collection... (200750 times) [2024-06-25 06:20:08,798][15401] InferenceWorker_p0-w0: stopping experience collection (200750 times) [2024-06-25 06:20:08,798][15401] InferenceWorker_p0-w0: resuming experience collection (200750 times) [2024-06-25 06:20:08,930][15401] Updated weights for policy 0, policy_version 827823 (0.0036) [2024-06-25 06:20:13,339][15401] Updated weights for policy 0, policy_version 827833 (0.0040) [2024-06-25 06:20:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 13563215872. Throughput: 0: 42981.1. Samples: 13563348580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 06:20:16,668][15401] Updated weights for policy 0, policy_version 827843 (0.0036) [2024-06-25 06:20:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13563445248. Throughput: 0: 43049.7. Samples: 13563608640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 06:20:20,872][15401] Updated weights for policy 0, policy_version 827853 (0.0034) [2024-06-25 06:20:23,392][15132] Fps is (10 sec: 45864.9, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 13563674624. Throughput: 0: 42969.7. Samples: 13563735020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:23,393][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 06:20:24,040][15401] Updated weights for policy 0, policy_version 827863 (0.0034) [2024-06-25 06:20:28,365][15401] Updated weights for policy 0, policy_version 827873 (0.0027) [2024-06-25 06:20:28,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42874.4, 300 sec: 42876.0). Total num frames: 13563871232. Throughput: 0: 43080.4. Samples: 13564002600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:28,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 06:20:31,455][15401] Updated weights for policy 0, policy_version 827883 (0.0035) [2024-06-25 06:20:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 13564100608. Throughput: 0: 43253.4. Samples: 13564260080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 06:20:35,815][15401] Updated weights for policy 0, policy_version 827893 (0.0038) [2024-06-25 06:20:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13564329984. Throughput: 0: 43227.1. Samples: 13564393100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-25 06:20:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 06:20:39,187][15401] Updated weights for policy 0, policy_version 827903 (0.0033) [2024-06-25 06:20:43,392][15132] Fps is (10 sec: 39311.6, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 13564493824. Throughput: 0: 43155.9. Samples: 13564646840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:20:43,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 06:20:43,732][15401] Updated weights for policy 0, policy_version 827913 (0.0030) [2024-06-25 06:20:46,787][15401] Updated weights for policy 0, policy_version 827923 (0.0029) [2024-06-25 06:20:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 13564739584. Throughput: 0: 43280.0. Samples: 13564902680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:20:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 06:20:51,212][15401] Updated weights for policy 0, policy_version 827933 (0.0038) [2024-06-25 06:20:53,390][15132] Fps is (10 sec: 47525.0, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13564968960. Throughput: 0: 43382.2. Samples: 13565041080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:20:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 06:20:54,403][15401] Updated weights for policy 0, policy_version 827943 (0.0043) [2024-06-25 06:20:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13565132800. Throughput: 0: 43228.6. Samples: 13565293860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:20:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 06:20:58,845][15401] Updated weights for policy 0, policy_version 827953 (0.0029) [2024-06-25 06:21:02,047][15401] Updated weights for policy 0, policy_version 827963 (0.0030) [2024-06-25 06:21:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 43098.6). Total num frames: 13565411328. Throughput: 0: 43102.3. Samples: 13565548240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 06:21:06,666][15401] Updated weights for policy 0, policy_version 827973 (0.0034) [2024-06-25 06:21:08,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 13565607936. Throughput: 0: 43414.7. Samples: 13565688680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:08,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 06:21:09,714][15401] Updated weights for policy 0, policy_version 827983 (0.0025) [2024-06-25 06:21:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 13565788160. Throughput: 0: 42932.0. Samples: 13565934440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:13,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 06:21:14,259][15401] Updated weights for policy 0, policy_version 827993 (0.0037) [2024-06-25 06:21:17,414][15401] Updated weights for policy 0, policy_version 828003 (0.0030) [2024-06-25 06:21:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 13566050304. Throughput: 0: 42833.8. Samples: 13566187600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 06:21:21,959][15401] Updated weights for policy 0, policy_version 828013 (0.0031) [2024-06-25 06:21:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42873.2, 300 sec: 42932.0). Total num frames: 13566246912. Throughput: 0: 42955.6. Samples: 13566326100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:23,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 06:21:24,927][15401] Updated weights for policy 0, policy_version 828023 (0.0033) [2024-06-25 06:21:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 13566443520. Throughput: 0: 42760.6. Samples: 13566570960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:28,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 06:21:29,483][15401] Updated weights for policy 0, policy_version 828033 (0.0038) [2024-06-25 06:21:32,898][15401] Updated weights for policy 0, policy_version 828043 (0.0046) [2024-06-25 06:21:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 13566689280. Throughput: 0: 42910.3. Samples: 13566833640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:33,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 06:21:33,978][15349] Signal inference workers to stop experience collection... (200800 times) [2024-06-25 06:21:33,978][15349] Signal inference workers to resume experience collection... (200800 times) [2024-06-25 06:21:33,997][15401] InferenceWorker_p0-w0: stopping experience collection (200800 times) [2024-06-25 06:21:34,032][15401] InferenceWorker_p0-w0: resuming experience collection (200800 times) [2024-06-25 06:21:37,104][15401] Updated weights for policy 0, policy_version 828053 (0.0047) [2024-06-25 06:21:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42932.0). Total num frames: 13566869504. Throughput: 0: 42740.6. Samples: 13566964400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 06:21:40,519][15401] Updated weights for policy 0, policy_version 828063 (0.0034) [2024-06-25 06:21:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43419.4, 300 sec: 42931.6). Total num frames: 13567098880. Throughput: 0: 42718.7. Samples: 13567216200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 06:21:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828070_13567098880.pth... [2024-06-25 06:21:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827440_13556776960.pth [2024-06-25 06:21:44,926][15401] Updated weights for policy 0, policy_version 828073 (0.0032) [2024-06-25 06:21:48,334][15401] Updated weights for policy 0, policy_version 828083 (0.0039) [2024-06-25 06:21:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13567311872. Throughput: 0: 42816.1. Samples: 13567474960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 06:21:52,569][15401] Updated weights for policy 0, policy_version 828093 (0.0039) [2024-06-25 06:21:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 13567508480. Throughput: 0: 42531.3. Samples: 13567602480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 06:21:55,888][15401] Updated weights for policy 0, policy_version 828103 (0.0045) [2024-06-25 06:21:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 13567737856. Throughput: 0: 42675.2. Samples: 13567854820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:21:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 06:22:00,018][15401] Updated weights for policy 0, policy_version 828113 (0.0045) [2024-06-25 06:22:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 42987.5). Total num frames: 13567950848. Throughput: 0: 42927.4. Samples: 13568119340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 06:22:03,515][15401] Updated weights for policy 0, policy_version 828123 (0.0030) [2024-06-25 06:22:08,005][15401] Updated weights for policy 0, policy_version 828133 (0.0050) [2024-06-25 06:22:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 13568163840. Throughput: 0: 42540.4. Samples: 13568240420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 06:22:11,299][15401] Updated weights for policy 0, policy_version 828143 (0.0024) [2024-06-25 06:22:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 13568376832. Throughput: 0: 42827.0. Samples: 13568498280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:13,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 06:22:15,386][15401] Updated weights for policy 0, policy_version 828153 (0.0040) [2024-06-25 06:22:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 13568589824. Throughput: 0: 42771.3. Samples: 13568758360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:18,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 06:22:18,998][15401] Updated weights for policy 0, policy_version 828163 (0.0030) [2024-06-25 06:22:22,882][15401] Updated weights for policy 0, policy_version 828173 (0.0035) [2024-06-25 06:22:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13568786432. Throughput: 0: 42619.9. Samples: 13568882300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 06:22:26,601][15401] Updated weights for policy 0, policy_version 828183 (0.0048) [2024-06-25 06:22:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13569015808. Throughput: 0: 42681.9. Samples: 13569136880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 06:22:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 06:22:30,354][15401] Updated weights for policy 0, policy_version 828193 (0.0035) [2024-06-25 06:22:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42876.1). Total num frames: 13569196032. Throughput: 0: 42884.0. Samples: 13569404740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 06:22:34,287][15401] Updated weights for policy 0, policy_version 828203 (0.0024) [2024-06-25 06:22:37,817][15401] Updated weights for policy 0, policy_version 828213 (0.0032) [2024-06-25 06:22:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13569441792. Throughput: 0: 42754.1. Samples: 13569526420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 06:22:41,886][15401] Updated weights for policy 0, policy_version 828223 (0.0025) [2024-06-25 06:22:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42987.5). Total num frames: 13569671168. Throughput: 0: 42920.5. Samples: 13569786240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 06:22:45,221][15401] Updated weights for policy 0, policy_version 828233 (0.0036) [2024-06-25 06:22:47,354][15349] Signal inference workers to stop experience collection... (200850 times) [2024-06-25 06:22:47,388][15401] InferenceWorker_p0-w0: stopping experience collection (200850 times) [2024-06-25 06:22:47,422][15349] Signal inference workers to resume experience collection... (200850 times) [2024-06-25 06:22:47,422][15401] InferenceWorker_p0-w0: resuming experience collection (200850 times) [2024-06-25 06:22:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42931.9). Total num frames: 13569851392. Throughput: 0: 42870.0. Samples: 13570048480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 06:22:49,446][15401] Updated weights for policy 0, policy_version 828243 (0.0021) [2024-06-25 06:22:52,890][15401] Updated weights for policy 0, policy_version 828253 (0.0024) [2024-06-25 06:22:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 13570097152. Throughput: 0: 42818.7. Samples: 13570167260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 06:22:57,234][15401] Updated weights for policy 0, policy_version 828263 (0.0035) [2024-06-25 06:22:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 13570310144. Throughput: 0: 42926.3. Samples: 13570429860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:22:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 06:23:00,489][15401] Updated weights for policy 0, policy_version 828273 (0.0027) [2024-06-25 06:23:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13570506752. Throughput: 0: 42848.1. Samples: 13570686520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 06:23:04,829][15401] Updated weights for policy 0, policy_version 828283 (0.0050) [2024-06-25 06:23:08,172][15401] Updated weights for policy 0, policy_version 828293 (0.0029) [2024-06-25 06:23:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13570752512. Throughput: 0: 42889.0. Samples: 13570812300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 06:23:12,674][15401] Updated weights for policy 0, policy_version 828303 (0.0035) [2024-06-25 06:23:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 13570949120. Throughput: 0: 42874.2. Samples: 13571066220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:13,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 06:23:16,043][15401] Updated weights for policy 0, policy_version 828313 (0.0033) [2024-06-25 06:23:18,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13571129344. Throughput: 0: 42599.0. Samples: 13571321700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 06:23:20,405][15401] Updated weights for policy 0, policy_version 828323 (0.0034) [2024-06-25 06:23:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13571375104. Throughput: 0: 42537.8. Samples: 13571440620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 06:23:23,793][15401] Updated weights for policy 0, policy_version 828333 (0.0032) [2024-06-25 06:23:27,942][15401] Updated weights for policy 0, policy_version 828343 (0.0041) [2024-06-25 06:23:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13571588096. Throughput: 0: 42637.6. Samples: 13571704940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:28,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 06:23:31,814][15401] Updated weights for policy 0, policy_version 828353 (0.0037) [2024-06-25 06:23:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 13571784704. Throughput: 0: 42588.3. Samples: 13571965060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:33,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 06:23:35,465][15401] Updated weights for policy 0, policy_version 828363 (0.0025) [2024-06-25 06:23:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13572014080. Throughput: 0: 42775.0. Samples: 13572092140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 06:23:39,323][15401] Updated weights for policy 0, policy_version 828373 (0.0032) [2024-06-25 06:23:43,363][15401] Updated weights for policy 0, policy_version 828383 (0.0034) [2024-06-25 06:23:43,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 13572227072. Throughput: 0: 42616.4. Samples: 13572347600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 06:23:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828383_13572227072.pth... [2024-06-25 06:23:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000827755_13561937920.pth [2024-06-25 06:23:47,209][15401] Updated weights for policy 0, policy_version 828393 (0.0042) [2024-06-25 06:23:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13572423680. Throughput: 0: 42527.1. Samples: 13572600240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 06:23:51,204][15401] Updated weights for policy 0, policy_version 828403 (0.0035) [2024-06-25 06:23:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13572653056. Throughput: 0: 42606.5. Samples: 13572729600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 06:23:54,962][15401] Updated weights for policy 0, policy_version 828413 (0.0029) [2024-06-25 06:23:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 13572849664. Throughput: 0: 42741.3. Samples: 13572989580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:23:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 06:23:58,967][15401] Updated weights for policy 0, policy_version 828423 (0.0035) [2024-06-25 06:23:59,081][15349] Signal inference workers to stop experience collection... (200900 times) [2024-06-25 06:23:59,126][15401] InferenceWorker_p0-w0: stopping experience collection (200900 times) [2024-06-25 06:23:59,149][15349] Signal inference workers to resume experience collection... (200900 times) [2024-06-25 06:23:59,156][15401] InferenceWorker_p0-w0: resuming experience collection (200900 times) [2024-06-25 06:24:02,543][15401] Updated weights for policy 0, policy_version 828433 (0.0040) [2024-06-25 06:24:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13573079040. Throughput: 0: 42732.9. Samples: 13573244780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:24:03,393][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 06:24:06,515][15401] Updated weights for policy 0, policy_version 828443 (0.0039) [2024-06-25 06:24:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 13573308416. Throughput: 0: 42924.5. Samples: 13573372220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:24:08,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 06:24:10,617][15401] Updated weights for policy 0, policy_version 828453 (0.0036) [2024-06-25 06:24:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13573505024. Throughput: 0: 42596.1. Samples: 13573621760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 06:24:13,398][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 06:24:14,152][15401] Updated weights for policy 0, policy_version 828463 (0.0046) [2024-06-25 06:24:18,123][15401] Updated weights for policy 0, policy_version 828473 (0.0029) [2024-06-25 06:24:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13573701632. Throughput: 0: 42574.6. Samples: 13573880820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 06:24:22,111][15401] Updated weights for policy 0, policy_version 828483 (0.0034) [2024-06-25 06:24:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42877.0). Total num frames: 13573947392. Throughput: 0: 42538.2. Samples: 13574006360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 06:24:25,959][15401] Updated weights for policy 0, policy_version 828493 (0.0043) [2024-06-25 06:24:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13574144000. Throughput: 0: 42493.4. Samples: 13574259800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 06:24:29,577][15401] Updated weights for policy 0, policy_version 828503 (0.0035) [2024-06-25 06:24:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13574340608. Throughput: 0: 42747.9. Samples: 13574523900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:33,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 06:24:33,422][15401] Updated weights for policy 0, policy_version 828513 (0.0026) [2024-06-25 06:24:36,950][15401] Updated weights for policy 0, policy_version 828523 (0.0027) [2024-06-25 06:24:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13574602752. Throughput: 0: 42622.6. Samples: 13574647620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:38,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 06:24:40,946][15401] Updated weights for policy 0, policy_version 828533 (0.0033) [2024-06-25 06:24:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13574766592. Throughput: 0: 42545.7. Samples: 13574904140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:43,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-25 06:24:44,991][15401] Updated weights for policy 0, policy_version 828543 (0.0031) [2024-06-25 06:24:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13574995968. Throughput: 0: 42549.8. Samples: 13575159420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 06:24:48,649][15401] Updated weights for policy 0, policy_version 828553 (0.0036) [2024-06-25 06:24:52,573][15401] Updated weights for policy 0, policy_version 828563 (0.0022) [2024-06-25 06:24:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13575225344. Throughput: 0: 42613.2. Samples: 13575289820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 06:24:56,069][15401] Updated weights for policy 0, policy_version 828573 (0.0045) [2024-06-25 06:24:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13575405568. Throughput: 0: 42767.9. Samples: 13575546320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:24:58,391][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 06:25:00,110][15401] Updated weights for policy 0, policy_version 828583 (0.0027) [2024-06-25 06:25:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 13575634944. Throughput: 0: 42802.4. Samples: 13575806920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 06:25:03,657][15401] Updated weights for policy 0, policy_version 828593 (0.0034) [2024-06-25 06:25:07,645][15401] Updated weights for policy 0, policy_version 828603 (0.0028) [2024-06-25 06:25:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13575864320. Throughput: 0: 42947.6. Samples: 13575939000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 06:25:11,347][15401] Updated weights for policy 0, policy_version 828613 (0.0033) [2024-06-25 06:25:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13576060928. Throughput: 0: 42872.3. Samples: 13576189060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:13,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 06:25:15,308][15401] Updated weights for policy 0, policy_version 828623 (0.0033) [2024-06-25 06:25:15,538][15349] Signal inference workers to stop experience collection... (200950 times) [2024-06-25 06:25:15,557][15401] InferenceWorker_p0-w0: stopping experience collection (200950 times) [2024-06-25 06:25:15,594][15349] Signal inference workers to resume experience collection... (200950 times) [2024-06-25 06:25:15,594][15401] InferenceWorker_p0-w0: resuming experience collection (200950 times) [2024-06-25 06:25:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 13576273920. Throughput: 0: 42711.6. Samples: 13576445920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 06:25:19,299][15401] Updated weights for policy 0, policy_version 828633 (0.0033) [2024-06-25 06:25:22,977][15401] Updated weights for policy 0, policy_version 828643 (0.0037) [2024-06-25 06:25:23,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.6, 300 sec: 42820.9). Total num frames: 13576503296. Throughput: 0: 42826.4. Samples: 13576574800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 06:25:26,809][15401] Updated weights for policy 0, policy_version 828653 (0.0041) [2024-06-25 06:25:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 13576683520. Throughput: 0: 42787.7. Samples: 13576829680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:28,393][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 06:25:30,931][15401] Updated weights for policy 0, policy_version 828663 (0.0038) [2024-06-25 06:25:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 13576929280. Throughput: 0: 42725.7. Samples: 13577082180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:33,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 06:25:34,404][15401] Updated weights for policy 0, policy_version 828673 (0.0031) [2024-06-25 06:25:38,389][15132] Fps is (10 sec: 42608.6, 60 sec: 41779.3, 300 sec: 42765.4). Total num frames: 13577109504. Throughput: 0: 42653.5. Samples: 13577209220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 06:25:38,555][15401] Updated weights for policy 0, policy_version 828683 (0.0045) [2024-06-25 06:25:42,108][15401] Updated weights for policy 0, policy_version 828693 (0.0029) [2024-06-25 06:25:43,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13577338880. Throughput: 0: 42608.4. Samples: 13577463700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 06:25:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828695_13577338880.pth... [2024-06-25 06:25:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828070_13567098880.pth [2024-06-25 06:25:46,433][15401] Updated weights for policy 0, policy_version 828703 (0.0044) [2024-06-25 06:25:48,391][15132] Fps is (10 sec: 45870.3, 60 sec: 42870.7, 300 sec: 42709.3). Total num frames: 13577568256. Throughput: 0: 42398.1. Samples: 13577714880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:48,391][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 06:25:49,726][15401] Updated weights for policy 0, policy_version 828713 (0.0032) [2024-06-25 06:25:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13577764864. Throughput: 0: 42464.5. Samples: 13577849900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 06:25:54,054][15401] Updated weights for policy 0, policy_version 828723 (0.0025) [2024-06-25 06:25:57,567][15401] Updated weights for policy 0, policy_version 828733 (0.0024) [2024-06-25 06:25:58,390][15132] Fps is (10 sec: 42602.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13577994240. Throughput: 0: 42517.0. Samples: 13578102320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:25:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 06:26:01,910][15401] Updated weights for policy 0, policy_version 828743 (0.0038) [2024-06-25 06:26:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13578207232. Throughput: 0: 42410.6. Samples: 13578354400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 06:26:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 06:26:05,081][15401] Updated weights for policy 0, policy_version 828753 (0.0049) [2024-06-25 06:26:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13578420224. Throughput: 0: 42340.4. Samples: 13578480120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 06:26:09,352][15401] Updated weights for policy 0, policy_version 828763 (0.0033) [2024-06-25 06:26:13,049][15401] Updated weights for policy 0, policy_version 828773 (0.0047) [2024-06-25 06:26:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 13578633216. Throughput: 0: 42483.5. Samples: 13578741440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:13,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 06:26:16,895][15401] Updated weights for policy 0, policy_version 828783 (0.0027) [2024-06-25 06:26:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13578829824. Throughput: 0: 42661.8. Samples: 13579001860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:18,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 06:26:20,588][15401] Updated weights for policy 0, policy_version 828793 (0.0042) [2024-06-25 06:26:23,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13579042816. Throughput: 0: 42582.6. Samples: 13579125440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 06:26:24,359][15401] Updated weights for policy 0, policy_version 828803 (0.0030) [2024-06-25 06:26:28,128][15401] Updated weights for policy 0, policy_version 828813 (0.0037) [2024-06-25 06:26:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 13579272192. Throughput: 0: 42561.1. Samples: 13579378940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 06:26:32,190][15401] Updated weights for policy 0, policy_version 828823 (0.0027) [2024-06-25 06:26:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 13579468800. Throughput: 0: 42796.6. Samples: 13579640680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 06:26:35,602][15401] Updated weights for policy 0, policy_version 828833 (0.0041) [2024-06-25 06:26:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13579665408. Throughput: 0: 42510.8. Samples: 13579762880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 06:26:39,712][15401] Updated weights for policy 0, policy_version 828843 (0.0027) [2024-06-25 06:26:43,198][15401] Updated weights for policy 0, policy_version 828853 (0.0029) [2024-06-25 06:26:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 13579927552. Throughput: 0: 42855.5. Samples: 13580030920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:43,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 06:26:46,396][15349] Signal inference workers to stop experience collection... (201000 times) [2024-06-25 06:26:46,400][15349] Signal inference workers to resume experience collection... (201000 times) [2024-06-25 06:26:46,414][15401] InferenceWorker_p0-w0: stopping experience collection (201000 times) [2024-06-25 06:26:46,451][15401] InferenceWorker_p0-w0: resuming experience collection (201000 times) [2024-06-25 06:26:47,358][15401] Updated weights for policy 0, policy_version 828863 (0.0024) [2024-06-25 06:26:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 13580124160. Throughput: 0: 42873.8. Samples: 13580283720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:48,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 06:26:50,882][15401] Updated weights for policy 0, policy_version 828873 (0.0036) [2024-06-25 06:26:53,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13580320768. Throughput: 0: 42767.9. Samples: 13580404780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:53,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:26:55,201][15401] Updated weights for policy 0, policy_version 828883 (0.0050) [2024-06-25 06:26:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13580566528. Throughput: 0: 42955.1. Samples: 13580674320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:26:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:26:58,413][15401] Updated weights for policy 0, policy_version 828893 (0.0034) [2024-06-25 06:27:02,730][15401] Updated weights for policy 0, policy_version 828903 (0.0031) [2024-06-25 06:27:03,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13580763136. Throughput: 0: 42819.6. Samples: 13580928740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:03,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 06:27:06,074][15401] Updated weights for policy 0, policy_version 828913 (0.0043) [2024-06-25 06:27:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 13580959744. Throughput: 0: 42816.0. Samples: 13581052160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 06:27:10,251][15401] Updated weights for policy 0, policy_version 828923 (0.0034) [2024-06-25 06:27:13,394][15132] Fps is (10 sec: 44215.6, 60 sec: 42869.8, 300 sec: 42764.3). Total num frames: 13581205504. Throughput: 0: 43045.2. Samples: 13581316180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:13,395][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 06:27:13,950][15401] Updated weights for policy 0, policy_version 828933 (0.0027) [2024-06-25 06:27:17,850][15401] Updated weights for policy 0, policy_version 828943 (0.0028) [2024-06-25 06:27:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13581402112. Throughput: 0: 42828.7. Samples: 13581567980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 06:27:21,599][15401] Updated weights for policy 0, policy_version 828953 (0.0031) [2024-06-25 06:27:23,390][15132] Fps is (10 sec: 39339.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13581598720. Throughput: 0: 43084.7. Samples: 13581701700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 06:27:25,376][15401] Updated weights for policy 0, policy_version 828963 (0.0030) [2024-06-25 06:27:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13581844480. Throughput: 0: 42768.0. Samples: 13581955380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:28,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 06:27:29,585][15401] Updated weights for policy 0, policy_version 828973 (0.0036) [2024-06-25 06:27:33,217][15401] Updated weights for policy 0, policy_version 828983 (0.0035) [2024-06-25 06:27:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13582057472. Throughput: 0: 42864.3. Samples: 13582212620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 06:27:37,292][15401] Updated weights for policy 0, policy_version 828993 (0.0031) [2024-06-25 06:27:38,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 13582237696. Throughput: 0: 43048.9. Samples: 13582341980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:38,393][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 06:27:40,775][15401] Updated weights for policy 0, policy_version 829003 (0.0029) [2024-06-25 06:27:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 13582483456. Throughput: 0: 42730.3. Samples: 13582597180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 06:27:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829010_13582499840.pth... [2024-06-25 06:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828383_13572227072.pth [2024-06-25 06:27:45,004][15401] Updated weights for policy 0, policy_version 829013 (0.0053) [2024-06-25 06:27:48,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13582696448. Throughput: 0: 42716.3. Samples: 13582850980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:48,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 06:27:48,826][15401] Updated weights for policy 0, policy_version 829023 (0.0037) [2024-06-25 06:27:52,625][15401] Updated weights for policy 0, policy_version 829033 (0.0040) [2024-06-25 06:27:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 13582893056. Throughput: 0: 42813.6. Samples: 13582978780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 06:27:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 06:27:56,272][15401] Updated weights for policy 0, policy_version 829043 (0.0036) [2024-06-25 06:27:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13583122432. Throughput: 0: 42676.1. Samples: 13583236400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:27:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 06:28:00,282][15401] Updated weights for policy 0, policy_version 829053 (0.0027) [2024-06-25 06:28:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13583335424. Throughput: 0: 42804.9. Samples: 13583494200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 06:28:04,315][15401] Updated weights for policy 0, policy_version 829063 (0.0030) [2024-06-25 06:28:08,300][15401] Updated weights for policy 0, policy_version 829073 (0.0038) [2024-06-25 06:28:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13583532032. Throughput: 0: 42547.1. Samples: 13583616420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:08,392][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 06:28:11,735][15401] Updated weights for policy 0, policy_version 829083 (0.0038) [2024-06-25 06:28:11,944][15349] Signal inference workers to stop experience collection... (201050 times) [2024-06-25 06:28:11,978][15401] InferenceWorker_p0-w0: stopping experience collection (201050 times) [2024-06-25 06:28:12,004][15349] Signal inference workers to resume experience collection... (201050 times) [2024-06-25 06:28:12,008][15401] InferenceWorker_p0-w0: resuming experience collection (201050 times) [2024-06-25 06:28:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42874.7, 300 sec: 42876.1). Total num frames: 13583777792. Throughput: 0: 42650.6. Samples: 13583874660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:28:16,029][15401] Updated weights for policy 0, policy_version 829093 (0.0034) [2024-06-25 06:28:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13583974400. Throughput: 0: 42738.4. Samples: 13584135840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 06:28:19,351][15401] Updated weights for policy 0, policy_version 829103 (0.0043) [2024-06-25 06:28:23,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13584154624. Throughput: 0: 42638.3. Samples: 13584260600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:23,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 06:28:23,843][15401] Updated weights for policy 0, policy_version 829113 (0.0022) [2024-06-25 06:28:27,117][15401] Updated weights for policy 0, policy_version 829123 (0.0029) [2024-06-25 06:28:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13584416768. Throughput: 0: 42785.8. Samples: 13584522540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:28,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 06:28:31,400][15401] Updated weights for policy 0, policy_version 829133 (0.0039) [2024-06-25 06:28:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 13584596992. Throughput: 0: 42886.8. Samples: 13584780880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:33,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 06:28:34,658][15401] Updated weights for policy 0, policy_version 829143 (0.0026) [2024-06-25 06:28:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 13584809984. Throughput: 0: 42671.6. Samples: 13584899000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 06:28:38,973][15401] Updated weights for policy 0, policy_version 829153 (0.0039) [2024-06-25 06:28:42,395][15401] Updated weights for policy 0, policy_version 829163 (0.0031) [2024-06-25 06:28:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13585055744. Throughput: 0: 42876.0. Samples: 13585165820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 06:28:46,783][15401] Updated weights for policy 0, policy_version 829173 (0.0025) [2024-06-25 06:28:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 13585219584. Throughput: 0: 42876.1. Samples: 13585423620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 06:28:50,058][15401] Updated weights for policy 0, policy_version 829183 (0.0044) [2024-06-25 06:28:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13585465344. Throughput: 0: 42715.6. Samples: 13585538520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:53,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 06:28:54,580][15401] Updated weights for policy 0, policy_version 829193 (0.0030) [2024-06-25 06:28:57,534][15401] Updated weights for policy 0, policy_version 829203 (0.0040) [2024-06-25 06:28:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 13585694720. Throughput: 0: 42868.5. Samples: 13585803740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:28:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 06:29:02,218][15401] Updated weights for policy 0, policy_version 829213 (0.0024) [2024-06-25 06:29:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13585874944. Throughput: 0: 42883.1. Samples: 13586065580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 06:29:05,300][15401] Updated weights for policy 0, policy_version 829223 (0.0026) [2024-06-25 06:29:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13586104320. Throughput: 0: 42698.2. Samples: 13586182020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 06:29:09,655][15401] Updated weights for policy 0, policy_version 829233 (0.0032) [2024-06-25 06:29:12,870][15401] Updated weights for policy 0, policy_version 829243 (0.0025) [2024-06-25 06:29:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 13586350080. Throughput: 0: 42862.7. Samples: 13586451360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 06:29:17,004][15401] Updated weights for policy 0, policy_version 829253 (0.0033) [2024-06-25 06:29:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13586513920. Throughput: 0: 43016.9. Samples: 13586716640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 06:29:20,426][15401] Updated weights for policy 0, policy_version 829263 (0.0031) [2024-06-25 06:29:20,866][15349] Signal inference workers to stop experience collection... (201100 times) [2024-06-25 06:29:20,870][15349] Signal inference workers to resume experience collection... (201100 times) [2024-06-25 06:29:20,903][15401] InferenceWorker_p0-w0: stopping experience collection (201100 times) [2024-06-25 06:29:20,906][15401] InferenceWorker_p0-w0: resuming experience collection (201100 times) [2024-06-25 06:29:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13586759680. Throughput: 0: 43020.6. Samples: 13586834920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:23,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 06:29:24,398][15401] Updated weights for policy 0, policy_version 829273 (0.0045) [2024-06-25 06:29:27,906][15401] Updated weights for policy 0, policy_version 829283 (0.0027) [2024-06-25 06:29:28,390][15132] Fps is (10 sec: 50790.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13587021824. Throughput: 0: 43146.1. Samples: 13587107400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:28,399][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 06:29:31,881][15401] Updated weights for policy 0, policy_version 829293 (0.0037) [2024-06-25 06:29:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13587185664. Throughput: 0: 43219.5. Samples: 13587368500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 06:29:35,323][15401] Updated weights for policy 0, policy_version 829303 (0.0025) [2024-06-25 06:29:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13587415040. Throughput: 0: 43261.2. Samples: 13587485280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:38,399][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 06:29:39,484][15401] Updated weights for policy 0, policy_version 829313 (0.0036) [2024-06-25 06:29:42,825][15401] Updated weights for policy 0, policy_version 829323 (0.0043) [2024-06-25 06:29:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 13587660800. Throughput: 0: 43432.0. Samples: 13587758180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 06:29:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 06:29:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829325_13587660800.pth... [2024-06-25 06:29:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000828695_13577338880.pth [2024-06-25 06:29:47,106][15401] Updated weights for policy 0, policy_version 829333 (0.0035) [2024-06-25 06:29:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13587824640. Throughput: 0: 43283.8. Samples: 13588013360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:29:48,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 06:29:50,376][15401] Updated weights for policy 0, policy_version 829343 (0.0026) [2024-06-25 06:29:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 13588070400. Throughput: 0: 43417.7. Samples: 13588135820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:29:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 06:29:54,410][15401] Updated weights for policy 0, policy_version 829353 (0.0029) [2024-06-25 06:29:57,883][15401] Updated weights for policy 0, policy_version 829363 (0.0029) [2024-06-25 06:29:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13588283392. Throughput: 0: 43357.7. Samples: 13588402460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:29:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 06:30:01,905][15401] Updated weights for policy 0, policy_version 829373 (0.0021) [2024-06-25 06:30:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13588480000. Throughput: 0: 43108.5. Samples: 13588656520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 06:30:05,668][15401] Updated weights for policy 0, policy_version 829383 (0.0035) [2024-06-25 06:30:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13588692992. Throughput: 0: 43214.1. Samples: 13588779560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 06:30:09,895][15401] Updated weights for policy 0, policy_version 829393 (0.0037) [2024-06-25 06:30:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13588905984. Throughput: 0: 43038.3. Samples: 13589044120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 06:30:13,704][15401] Updated weights for policy 0, policy_version 829403 (0.0033) [2024-06-25 06:30:17,390][15401] Updated weights for policy 0, policy_version 829413 (0.0032) [2024-06-25 06:30:18,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 13589118976. Throughput: 0: 42764.9. Samples: 13589293020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:18,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 06:30:21,197][15401] Updated weights for policy 0, policy_version 829423 (0.0031) [2024-06-25 06:30:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 13589348352. Throughput: 0: 43037.3. Samples: 13589421960. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 06:30:24,786][15349] Signal inference workers to stop experience collection... (201150 times) [2024-06-25 06:30:24,792][15349] Signal inference workers to resume experience collection... (201150 times) [2024-06-25 06:30:24,803][15401] InferenceWorker_p0-w0: stopping experience collection (201150 times) [2024-06-25 06:30:24,812][15401] InferenceWorker_p0-w0: resuming experience collection (201150 times) [2024-06-25 06:30:24,943][15401] Updated weights for policy 0, policy_version 829433 (0.0028) [2024-06-25 06:30:28,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 13589561344. Throughput: 0: 42809.7. Samples: 13589684620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 06:30:28,995][15401] Updated weights for policy 0, policy_version 829443 (0.0031) [2024-06-25 06:30:32,731][15401] Updated weights for policy 0, policy_version 829453 (0.0033) [2024-06-25 06:30:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13589774336. Throughput: 0: 42650.3. Samples: 13589932620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 06:30:36,606][15401] Updated weights for policy 0, policy_version 829463 (0.0025) [2024-06-25 06:30:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13589970944. Throughput: 0: 42781.8. Samples: 13590061000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 06:30:40,531][15401] Updated weights for policy 0, policy_version 829473 (0.0039) [2024-06-25 06:30:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.7). Total num frames: 13590200320. Throughput: 0: 42658.2. Samples: 13590322080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 06:30:44,094][15401] Updated weights for policy 0, policy_version 829483 (0.0028) [2024-06-25 06:30:48,110][15401] Updated weights for policy 0, policy_version 829493 (0.0046) [2024-06-25 06:30:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13590413312. Throughput: 0: 42517.6. Samples: 13590569820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 06:30:51,939][15401] Updated weights for policy 0, policy_version 829503 (0.0038) [2024-06-25 06:30:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13590609920. Throughput: 0: 42639.2. Samples: 13590698320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:53,392][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 06:30:55,847][15401] Updated weights for policy 0, policy_version 829513 (0.0025) [2024-06-25 06:30:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13590839296. Throughput: 0: 42591.6. Samples: 13590960740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:30:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 06:30:59,739][15401] Updated weights for policy 0, policy_version 829523 (0.0026) [2024-06-25 06:31:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13591052288. Throughput: 0: 42657.8. Samples: 13591212520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 06:31:03,819][15401] Updated weights for policy 0, policy_version 829533 (0.0035) [2024-06-25 06:31:07,263][15401] Updated weights for policy 0, policy_version 829543 (0.0036) [2024-06-25 06:31:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 13591265280. Throughput: 0: 42571.3. Samples: 13591337660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 06:31:11,182][15401] Updated weights for policy 0, policy_version 829553 (0.0032) [2024-06-25 06:31:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13591478272. Throughput: 0: 42632.1. Samples: 13591603060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:13,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 06:31:14,659][15401] Updated weights for policy 0, policy_version 829563 (0.0021) [2024-06-25 06:31:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 13591707648. Throughput: 0: 42848.4. Samples: 13591860800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:18,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 06:31:18,653][15401] Updated weights for policy 0, policy_version 829573 (0.0040) [2024-06-25 06:31:22,275][15401] Updated weights for policy 0, policy_version 829583 (0.0038) [2024-06-25 06:31:23,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13591920640. Throughput: 0: 42934.0. Samples: 13591993140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:23,393][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 06:31:23,804][15349] Signal inference workers to stop experience collection... (201200 times) [2024-06-25 06:31:23,808][15349] Signal inference workers to resume experience collection... (201200 times) [2024-06-25 06:31:23,847][15401] InferenceWorker_p0-w0: stopping experience collection (201200 times) [2024-06-25 06:31:23,847][15401] InferenceWorker_p0-w0: resuming experience collection (201200 times) [2024-06-25 06:31:26,660][15401] Updated weights for policy 0, policy_version 829593 (0.0032) [2024-06-25 06:31:28,394][15132] Fps is (10 sec: 40941.8, 60 sec: 42595.3, 300 sec: 42875.4). Total num frames: 13592117248. Throughput: 0: 42942.3. Samples: 13592254680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:28,395][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 06:31:29,758][15401] Updated weights for policy 0, policy_version 829603 (0.0023) [2024-06-25 06:31:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13592330240. Throughput: 0: 43091.7. Samples: 13592508940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-06-25 06:31:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 06:31:34,247][15401] Updated weights for policy 0, policy_version 829613 (0.0035) [2024-06-25 06:31:37,508][15401] Updated weights for policy 0, policy_version 829623 (0.0036) [2024-06-25 06:31:38,390][15132] Fps is (10 sec: 44256.9, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 13592559616. Throughput: 0: 43148.9. Samples: 13592640020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:31:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 06:31:41,833][15401] Updated weights for policy 0, policy_version 829633 (0.0021) [2024-06-25 06:31:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13592756224. Throughput: 0: 42999.5. Samples: 13592895720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:31:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 06:31:43,565][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829638_13592788992.pth... [2024-06-25 06:31:43,623][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829010_13582499840.pth [2024-06-25 06:31:44,958][15401] Updated weights for policy 0, policy_version 829643 (0.0032) [2024-06-25 06:31:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 13592985600. Throughput: 0: 43124.9. Samples: 13593153140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:31:48,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 06:31:49,495][15401] Updated weights for policy 0, policy_version 829653 (0.0036) [2024-06-25 06:31:52,514][15401] Updated weights for policy 0, policy_version 829663 (0.0033) [2024-06-25 06:31:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13593214976. Throughput: 0: 43191.0. Samples: 13593281260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:31:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 06:31:57,214][15401] Updated weights for policy 0, policy_version 829673 (0.0034) [2024-06-25 06:31:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13593411584. Throughput: 0: 43090.2. Samples: 13593542120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:31:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 06:32:00,027][15401] Updated weights for policy 0, policy_version 829683 (0.0030) [2024-06-25 06:32:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13593624576. Throughput: 0: 43079.6. Samples: 13593799380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 06:32:04,734][15401] Updated weights for policy 0, policy_version 829693 (0.0035) [2024-06-25 06:32:07,439][15401] Updated weights for policy 0, policy_version 829703 (0.0033) [2024-06-25 06:32:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42932.3). Total num frames: 13593870336. Throughput: 0: 43064.6. Samples: 13593930940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:08,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 06:32:12,474][15401] Updated weights for policy 0, policy_version 829713 (0.0042) [2024-06-25 06:32:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13594050560. Throughput: 0: 43118.3. Samples: 13594194800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 06:32:15,356][15401] Updated weights for policy 0, policy_version 829723 (0.0028) [2024-06-25 06:32:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13594279936. Throughput: 0: 43145.3. Samples: 13594450480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 06:32:20,054][15401] Updated weights for policy 0, policy_version 829733 (0.0031) [2024-06-25 06:32:23,150][15401] Updated weights for policy 0, policy_version 829743 (0.0029) [2024-06-25 06:32:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 13594509312. Throughput: 0: 43043.5. Samples: 13594576980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:23,393][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 06:32:27,502][15401] Updated weights for policy 0, policy_version 829753 (0.0035) [2024-06-25 06:32:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42874.7, 300 sec: 42820.6). Total num frames: 13594689536. Throughput: 0: 43080.9. Samples: 13594834360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 06:32:30,842][15401] Updated weights for policy 0, policy_version 829763 (0.0032) [2024-06-25 06:32:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42987.2). Total num frames: 13594918912. Throughput: 0: 43025.7. Samples: 13595089400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:33,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 06:32:35,039][15401] Updated weights for policy 0, policy_version 829773 (0.0043) [2024-06-25 06:32:38,263][15349] Signal inference workers to stop experience collection... (201250 times) [2024-06-25 06:32:38,293][15401] InferenceWorker_p0-w0: stopping experience collection (201250 times) [2024-06-25 06:32:38,328][15349] Signal inference workers to resume experience collection... (201250 times) [2024-06-25 06:32:38,329][15401] InferenceWorker_p0-w0: resuming experience collection (201250 times) [2024-06-25 06:32:38,330][15401] Updated weights for policy 0, policy_version 829783 (0.0021) [2024-06-25 06:32:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13595164672. Throughput: 0: 43086.3. Samples: 13595220140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 06:32:42,935][15401] Updated weights for policy 0, policy_version 829793 (0.0032) [2024-06-25 06:32:43,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 13595361280. Throughput: 0: 43039.0. Samples: 13595478980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:43,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 06:32:45,706][15401] Updated weights for policy 0, policy_version 829803 (0.0027) [2024-06-25 06:32:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 13595557888. Throughput: 0: 43102.6. Samples: 13595739100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:48,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 06:32:50,508][15401] Updated weights for policy 0, policy_version 829813 (0.0034) [2024-06-25 06:32:53,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13595803648. Throughput: 0: 42903.1. Samples: 13595861580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 06:32:53,549][15401] Updated weights for policy 0, policy_version 829823 (0.0023) [2024-06-25 06:32:57,962][15401] Updated weights for policy 0, policy_version 829833 (0.0034) [2024-06-25 06:32:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13596000256. Throughput: 0: 42868.8. Samples: 13596123900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:32:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 06:33:01,063][15401] Updated weights for policy 0, policy_version 829843 (0.0039) [2024-06-25 06:33:03,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42931.6). Total num frames: 13596196864. Throughput: 0: 43029.7. Samples: 13596386920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:33:03,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 06:33:05,785][15401] Updated weights for policy 0, policy_version 829853 (0.0032) [2024-06-25 06:33:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13596459008. Throughput: 0: 42960.5. Samples: 13596510200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:33:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 06:33:08,682][15401] Updated weights for policy 0, policy_version 829863 (0.0037) [2024-06-25 06:33:13,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13596622848. Throughput: 0: 43077.0. Samples: 13596772820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:33:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 06:33:13,438][15401] Updated weights for policy 0, policy_version 829873 (0.0042) [2024-06-25 06:33:16,844][15401] Updated weights for policy 0, policy_version 829883 (0.0042) [2024-06-25 06:33:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13596835840. Throughput: 0: 42869.5. Samples: 13597018420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 06:33:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 06:33:20,990][15401] Updated weights for policy 0, policy_version 829893 (0.0031) [2024-06-25 06:33:23,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13597097984. Throughput: 0: 42819.0. Samples: 13597147000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:23,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 06:33:24,423][15401] Updated weights for policy 0, policy_version 829903 (0.0032) [2024-06-25 06:33:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13597261824. Throughput: 0: 42846.3. Samples: 13597406960. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:28,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 06:33:28,777][15401] Updated weights for policy 0, policy_version 829913 (0.0041) [2024-06-25 06:33:31,990][15401] Updated weights for policy 0, policy_version 829923 (0.0042) [2024-06-25 06:33:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 13597491200. Throughput: 0: 42715.0. Samples: 13597661180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:33,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 06:33:36,677][15401] Updated weights for policy 0, policy_version 829933 (0.0042) [2024-06-25 06:33:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 13597736960. Throughput: 0: 42890.2. Samples: 13597791640. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 06:33:39,481][15401] Updated weights for policy 0, policy_version 829943 (0.0036) [2024-06-25 06:33:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42327.1, 300 sec: 42987.2). Total num frames: 13597900800. Throughput: 0: 42748.4. Samples: 13598047580. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 06:33:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829950_13597900800.pth... [2024-06-25 06:33:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829325_13587660800.pth [2024-06-25 06:33:44,266][15401] Updated weights for policy 0, policy_version 829953 (0.0031) [2024-06-25 06:33:45,119][15349] Signal inference workers to stop experience collection... (201300 times) [2024-06-25 06:33:45,119][15349] Signal inference workers to resume experience collection... (201300 times) [2024-06-25 06:33:45,166][15401] InferenceWorker_p0-w0: stopping experience collection (201300 times) [2024-06-25 06:33:45,167][15401] InferenceWorker_p0-w0: resuming experience collection (201300 times) [2024-06-25 06:33:47,040][15401] Updated weights for policy 0, policy_version 829963 (0.0035) [2024-06-25 06:33:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 13598130176. Throughput: 0: 42319.2. Samples: 13598291280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:48,392][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 06:33:51,701][15401] Updated weights for policy 0, policy_version 829973 (0.0032) [2024-06-25 06:33:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13598375936. Throughput: 0: 42624.8. Samples: 13598428320. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:53,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 06:33:55,411][15401] Updated weights for policy 0, policy_version 829983 (0.0037) [2024-06-25 06:33:58,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 13598523392. Throughput: 0: 42549.7. Samples: 13598687560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:33:58,391][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 06:33:59,345][15401] Updated weights for policy 0, policy_version 829993 (0.0034) [2024-06-25 06:34:02,935][15401] Updated weights for policy 0, policy_version 830003 (0.0027) [2024-06-25 06:34:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 13598769152. Throughput: 0: 42736.3. Samples: 13598941560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 06:34:06,822][15401] Updated weights for policy 0, policy_version 830013 (0.0029) [2024-06-25 06:34:08,389][15132] Fps is (10 sec: 49152.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13599014912. Throughput: 0: 42859.7. Samples: 13599075680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 06:34:10,295][15401] Updated weights for policy 0, policy_version 830023 (0.0037) [2024-06-25 06:34:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 13599195136. Throughput: 0: 42870.2. Samples: 13599336120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:13,390][15132] Avg episode reward: [(0, '0.162')] [2024-06-25 06:34:14,329][15401] Updated weights for policy 0, policy_version 830033 (0.0038) [2024-06-25 06:34:17,866][15401] Updated weights for policy 0, policy_version 830043 (0.0036) [2024-06-25 06:34:18,390][15132] Fps is (10 sec: 40959.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13599424512. Throughput: 0: 42782.7. Samples: 13599586400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 06:34:21,935][15401] Updated weights for policy 0, policy_version 830053 (0.0034) [2024-06-25 06:34:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13599653888. Throughput: 0: 42862.7. Samples: 13599720460. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:23,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 06:34:25,461][15401] Updated weights for policy 0, policy_version 830063 (0.0027) [2024-06-25 06:34:28,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 13599834112. Throughput: 0: 42987.9. Samples: 13599982140. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:28,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:34:29,473][15401] Updated weights for policy 0, policy_version 830073 (0.0038) [2024-06-25 06:34:32,939][15401] Updated weights for policy 0, policy_version 830083 (0.0040) [2024-06-25 06:34:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 13600079872. Throughput: 0: 42936.1. Samples: 13600223300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 06:34:37,602][15401] Updated weights for policy 0, policy_version 830093 (0.0032) [2024-06-25 06:34:38,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13600292864. Throughput: 0: 42883.6. Samples: 13600358080. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:38,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 06:34:40,805][15401] Updated weights for policy 0, policy_version 830103 (0.0030) [2024-06-25 06:34:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13600473088. Throughput: 0: 42741.7. Samples: 13600610940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 06:34:45,256][15401] Updated weights for policy 0, policy_version 830113 (0.0041) [2024-06-25 06:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 13600718848. Throughput: 0: 42627.5. Samples: 13600859800. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 06:34:48,526][15401] Updated weights for policy 0, policy_version 830123 (0.0035) [2024-06-25 06:34:52,763][15401] Updated weights for policy 0, policy_version 830133 (0.0033) [2024-06-25 06:34:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13600931840. Throughput: 0: 42688.3. Samples: 13600996660. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 06:34:54,638][15349] Signal inference workers to stop experience collection... (201350 times) [2024-06-25 06:34:54,638][15349] Signal inference workers to resume experience collection... (201350 times) [2024-06-25 06:34:54,681][15401] InferenceWorker_p0-w0: stopping experience collection (201350 times) [2024-06-25 06:34:54,681][15401] InferenceWorker_p0-w0: resuming experience collection (201350 times) [2024-06-25 06:34:56,248][15401] Updated weights for policy 0, policy_version 830143 (0.0027) [2024-06-25 06:34:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13601112064. Throughput: 0: 42638.8. Samples: 13601254860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:34:58,396][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 06:35:00,291][15401] Updated weights for policy 0, policy_version 830153 (0.0029) [2024-06-25 06:35:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 13601357824. Throughput: 0: 42598.8. Samples: 13601503340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:35:03,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 06:35:03,951][15401] Updated weights for policy 0, policy_version 830163 (0.0038) [2024-06-25 06:35:07,858][15401] Updated weights for policy 0, policy_version 830173 (0.0032) [2024-06-25 06:35:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13601587200. Throughput: 0: 42601.2. Samples: 13601637520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-25 06:35:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 06:35:12,146][15401] Updated weights for policy 0, policy_version 830183 (0.0026) [2024-06-25 06:35:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 13601751040. Throughput: 0: 42462.8. Samples: 13601892860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 06:35:15,448][15401] Updated weights for policy 0, policy_version 830193 (0.0032) [2024-06-25 06:35:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13601980416. Throughput: 0: 42799.0. Samples: 13602149260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 06:35:19,677][15401] Updated weights for policy 0, policy_version 830203 (0.0034) [2024-06-25 06:35:22,941][15401] Updated weights for policy 0, policy_version 830213 (0.0032) [2024-06-25 06:35:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13602209792. Throughput: 0: 42682.6. Samples: 13602278800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 06:35:27,402][15401] Updated weights for policy 0, policy_version 830223 (0.0040) [2024-06-25 06:35:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42598.4, 300 sec: 42764.7). Total num frames: 13602390016. Throughput: 0: 42781.3. Samples: 13602536200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:28,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 06:35:30,539][15401] Updated weights for policy 0, policy_version 830233 (0.0032) [2024-06-25 06:35:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13602619392. Throughput: 0: 42946.7. Samples: 13602792400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 06:35:34,922][15401] Updated weights for policy 0, policy_version 830243 (0.0036) [2024-06-25 06:35:38,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13602848768. Throughput: 0: 42977.7. Samples: 13602930660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 06:35:38,402][15401] Updated weights for policy 0, policy_version 830253 (0.0026) [2024-06-25 06:35:42,634][15401] Updated weights for policy 0, policy_version 830263 (0.0042) [2024-06-25 06:35:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13603028992. Throughput: 0: 42789.8. Samples: 13603180400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 06:35:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830263_13603028992.pth... [2024-06-25 06:35:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829638_13592788992.pth [2024-06-25 06:35:45,845][15401] Updated weights for policy 0, policy_version 830273 (0.0034) [2024-06-25 06:35:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13603274752. Throughput: 0: 42949.3. Samples: 13603436060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 06:35:50,197][15401] Updated weights for policy 0, policy_version 830283 (0.0026) [2024-06-25 06:35:53,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13603504128. Throughput: 0: 42968.0. Samples: 13603571080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 06:35:53,553][15401] Updated weights for policy 0, policy_version 830293 (0.0048) [2024-06-25 06:35:58,317][15401] Updated weights for policy 0, policy_version 830303 (0.0039) [2024-06-25 06:35:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13603684352. Throughput: 0: 42928.3. Samples: 13603824640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:35:58,390][15132] Avg episode reward: [(0, '0.908')] [2024-06-25 06:36:01,062][15349] Signal inference workers to stop experience collection... (201400 times) [2024-06-25 06:36:01,104][15401] InferenceWorker_p0-w0: stopping experience collection (201400 times) [2024-06-25 06:36:01,122][15349] Signal inference workers to resume experience collection... (201400 times) [2024-06-25 06:36:01,124][15401] InferenceWorker_p0-w0: resuming experience collection (201400 times) [2024-06-25 06:36:01,127][15401] Updated weights for policy 0, policy_version 830313 (0.0030) [2024-06-25 06:36:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13603913728. Throughput: 0: 42915.2. Samples: 13604080440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:03,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 06:36:05,779][15401] Updated weights for policy 0, policy_version 830323 (0.0043) [2024-06-25 06:36:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13604143104. Throughput: 0: 42968.1. Samples: 13604212360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:08,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 06:36:08,759][15401] Updated weights for policy 0, policy_version 830333 (0.0038) [2024-06-25 06:36:13,352][15401] Updated weights for policy 0, policy_version 830343 (0.0033) [2024-06-25 06:36:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13604339712. Throughput: 0: 43003.1. Samples: 13604471240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:13,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 06:36:16,267][15401] Updated weights for policy 0, policy_version 830353 (0.0035) [2024-06-25 06:36:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 13604552704. Throughput: 0: 42845.3. Samples: 13604720440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:18,396][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 06:36:20,892][15401] Updated weights for policy 0, policy_version 830363 (0.0038) [2024-06-25 06:36:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.7). Total num frames: 13604765696. Throughput: 0: 42716.5. Samples: 13604852900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 06:36:23,985][15401] Updated weights for policy 0, policy_version 830373 (0.0035) [2024-06-25 06:36:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 13604978688. Throughput: 0: 42995.5. Samples: 13605115200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:28,394][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 06:36:28,542][15401] Updated weights for policy 0, policy_version 830383 (0.0047) [2024-06-25 06:36:31,680][15401] Updated weights for policy 0, policy_version 830393 (0.0032) [2024-06-25 06:36:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13605208064. Throughput: 0: 42790.7. Samples: 13605361640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 06:36:36,198][15401] Updated weights for policy 0, policy_version 830403 (0.0035) [2024-06-25 06:36:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13605421056. Throughput: 0: 42846.2. Samples: 13605499160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:38,390][15132] Avg episode reward: [(0, '0.871')] [2024-06-25 06:36:39,343][15401] Updated weights for policy 0, policy_version 830413 (0.0028) [2024-06-25 06:36:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13605617664. Throughput: 0: 42965.0. Samples: 13605758060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:43,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 06:36:43,863][15401] Updated weights for policy 0, policy_version 830423 (0.0030) [2024-06-25 06:36:46,952][15401] Updated weights for policy 0, policy_version 830433 (0.0033) [2024-06-25 06:36:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13605847040. Throughput: 0: 42819.1. Samples: 13606007300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 06:36:51,495][15401] Updated weights for policy 0, policy_version 830443 (0.0034) [2024-06-25 06:36:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13606043648. Throughput: 0: 42853.8. Samples: 13606140780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:53,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 06:36:54,541][15401] Updated weights for policy 0, policy_version 830453 (0.0033) [2024-06-25 06:36:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 13606273024. Throughput: 0: 42706.2. Samples: 13606393120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 06:36:58,393][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 06:36:59,419][15401] Updated weights for policy 0, policy_version 830463 (0.0030) [2024-06-25 06:37:02,519][15401] Updated weights for policy 0, policy_version 830473 (0.0024) [2024-06-25 06:37:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13606502400. Throughput: 0: 42769.8. Samples: 13606645080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:03,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 06:37:07,109][15401] Updated weights for policy 0, policy_version 830483 (0.0033) [2024-06-25 06:37:08,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 13606682624. Throughput: 0: 42703.5. Samples: 13606774560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 06:37:10,142][15401] Updated weights for policy 0, policy_version 830493 (0.0036) [2024-06-25 06:37:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13606912000. Throughput: 0: 42545.0. Samples: 13607029720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 06:37:14,700][15401] Updated weights for policy 0, policy_version 830503 (0.0029) [2024-06-25 06:37:17,823][15401] Updated weights for policy 0, policy_version 830513 (0.0030) [2024-06-25 06:37:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13607124992. Throughput: 0: 42757.3. Samples: 13607285720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 06:37:22,389][15401] Updated weights for policy 0, policy_version 830523 (0.0041) [2024-06-25 06:37:23,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13607321600. Throughput: 0: 42586.2. Samples: 13607415640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:23,393][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 06:37:25,319][15349] Signal inference workers to stop experience collection... (201450 times) [2024-06-25 06:37:25,320][15349] Signal inference workers to resume experience collection... (201450 times) [2024-06-25 06:37:25,356][15401] InferenceWorker_p0-w0: stopping experience collection (201450 times) [2024-06-25 06:37:25,357][15401] InferenceWorker_p0-w0: resuming experience collection (201450 times) [2024-06-25 06:37:25,475][15401] Updated weights for policy 0, policy_version 830533 (0.0029) [2024-06-25 06:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 13607550976. Throughput: 0: 42531.1. Samples: 13607671960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:28,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 06:37:30,024][15401] Updated weights for policy 0, policy_version 830543 (0.0032) [2024-06-25 06:37:32,961][15401] Updated weights for policy 0, policy_version 830553 (0.0039) [2024-06-25 06:37:33,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13607780352. Throughput: 0: 42697.4. Samples: 13607928680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 06:37:37,642][15401] Updated weights for policy 0, policy_version 830563 (0.0035) [2024-06-25 06:37:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13607976960. Throughput: 0: 42638.6. Samples: 13608059520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:38,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 06:37:41,062][15401] Updated weights for policy 0, policy_version 830573 (0.0039) [2024-06-25 06:37:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 13608206336. Throughput: 0: 42721.7. Samples: 13608315500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 06:37:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830579_13608206336.pth... [2024-06-25 06:37:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000829950_13597900800.pth [2024-06-25 06:37:45,330][15401] Updated weights for policy 0, policy_version 830583 (0.0036) [2024-06-25 06:37:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13608402944. Throughput: 0: 42748.1. Samples: 13608568740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 06:37:48,812][15401] Updated weights for policy 0, policy_version 830593 (0.0041) [2024-06-25 06:37:52,930][15401] Updated weights for policy 0, policy_version 830603 (0.0033) [2024-06-25 06:37:53,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13608599552. Throughput: 0: 42771.6. Samples: 13608699280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 06:37:56,361][15401] Updated weights for policy 0, policy_version 830613 (0.0038) [2024-06-25 06:37:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42876.5). Total num frames: 13608845312. Throughput: 0: 42863.5. Samples: 13608958580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:37:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 06:38:00,525][15401] Updated weights for policy 0, policy_version 830623 (0.0026) [2024-06-25 06:38:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13609058304. Throughput: 0: 42934.1. Samples: 13609217860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:03,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 06:38:03,895][15401] Updated weights for policy 0, policy_version 830633 (0.0033) [2024-06-25 06:38:07,994][15401] Updated weights for policy 0, policy_version 830643 (0.0029) [2024-06-25 06:38:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13609271296. Throughput: 0: 43041.3. Samples: 13609352400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 06:38:11,367][15401] Updated weights for policy 0, policy_version 830653 (0.0037) [2024-06-25 06:38:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13609484288. Throughput: 0: 42923.5. Samples: 13609603520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 06:38:15,800][15401] Updated weights for policy 0, policy_version 830663 (0.0037) [2024-06-25 06:38:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13609697280. Throughput: 0: 42978.2. Samples: 13609862700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 06:38:19,104][15401] Updated weights for policy 0, policy_version 830673 (0.0035) [2024-06-25 06:38:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 13609893888. Throughput: 0: 42986.1. Samples: 13609993900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 06:38:23,491][15401] Updated weights for policy 0, policy_version 830683 (0.0042) [2024-06-25 06:38:26,879][15401] Updated weights for policy 0, policy_version 830693 (0.0030) [2024-06-25 06:38:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13610106880. Throughput: 0: 42782.9. Samples: 13610240720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 06:38:31,090][15401] Updated weights for policy 0, policy_version 830703 (0.0035) [2024-06-25 06:38:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13610352640. Throughput: 0: 42938.7. Samples: 13610500980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 06:38:34,431][15401] Updated weights for policy 0, policy_version 830713 (0.0030) [2024-06-25 06:38:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13610549248. Throughput: 0: 43124.0. Samples: 13610639860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 06:38:38,603][15401] Updated weights for policy 0, policy_version 830723 (0.0040) [2024-06-25 06:38:41,897][15401] Updated weights for policy 0, policy_version 830733 (0.0027) [2024-06-25 06:38:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 13610762240. Throughput: 0: 42963.5. Samples: 13610891940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:43,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 06:38:44,698][15349] Signal inference workers to stop experience collection... (201500 times) [2024-06-25 06:38:44,733][15401] InferenceWorker_p0-w0: stopping experience collection (201500 times) [2024-06-25 06:38:44,760][15349] Signal inference workers to resume experience collection... (201500 times) [2024-06-25 06:38:44,764][15401] InferenceWorker_p0-w0: resuming experience collection (201500 times) [2024-06-25 06:38:46,347][15401] Updated weights for policy 0, policy_version 830743 (0.0034) [2024-06-25 06:38:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13611008000. Throughput: 0: 43009.9. Samples: 13611153200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 06:38:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 06:38:49,453][15401] Updated weights for policy 0, policy_version 830753 (0.0024) [2024-06-25 06:38:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13611188224. Throughput: 0: 42977.0. Samples: 13611286360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:38:53,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 06:38:54,030][15401] Updated weights for policy 0, policy_version 830763 (0.0049) [2024-06-25 06:38:57,357][15401] Updated weights for policy 0, policy_version 830773 (0.0036) [2024-06-25 06:38:58,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13611417600. Throughput: 0: 42855.0. Samples: 13611532000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:38:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 06:39:01,649][15401] Updated weights for policy 0, policy_version 830783 (0.0042) [2024-06-25 06:39:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13611630592. Throughput: 0: 42713.0. Samples: 13611784780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 06:39:05,022][15401] Updated weights for policy 0, policy_version 830793 (0.0028) [2024-06-25 06:39:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13611827200. Throughput: 0: 42760.1. Samples: 13611918100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 06:39:09,154][15401] Updated weights for policy 0, policy_version 830803 (0.0032) [2024-06-25 06:39:12,880][15401] Updated weights for policy 0, policy_version 830813 (0.0040) [2024-06-25 06:39:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13612056576. Throughput: 0: 42872.9. Samples: 13612170000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:13,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 06:39:17,058][15401] Updated weights for policy 0, policy_version 830823 (0.0033) [2024-06-25 06:39:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13612269568. Throughput: 0: 42659.5. Samples: 13612420660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 06:39:20,504][15401] Updated weights for policy 0, policy_version 830833 (0.0029) [2024-06-25 06:39:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 13612466176. Throughput: 0: 42591.7. Samples: 13612556480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 06:39:24,620][15401] Updated weights for policy 0, policy_version 830843 (0.0030) [2024-06-25 06:39:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 13612679168. Throughput: 0: 42461.6. Samples: 13612802720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 06:39:28,464][15401] Updated weights for policy 0, policy_version 830853 (0.0038) [2024-06-25 06:39:32,285][15401] Updated weights for policy 0, policy_version 830863 (0.0031) [2024-06-25 06:39:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13612908544. Throughput: 0: 42363.5. Samples: 13613059560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 06:39:36,042][15401] Updated weights for policy 0, policy_version 830873 (0.0033) [2024-06-25 06:39:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13613088768. Throughput: 0: 42344.8. Samples: 13613191880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 06:39:39,867][15401] Updated weights for policy 0, policy_version 830883 (0.0038) [2024-06-25 06:39:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13613318144. Throughput: 0: 42556.1. Samples: 13613447020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 06:39:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830892_13613334528.pth... [2024-06-25 06:39:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830263_13603028992.pth [2024-06-25 06:39:43,723][15401] Updated weights for policy 0, policy_version 830893 (0.0037) [2024-06-25 06:39:47,476][15401] Updated weights for policy 0, policy_version 830903 (0.0051) [2024-06-25 06:39:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13613547520. Throughput: 0: 42621.7. Samples: 13613702760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:48,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 06:39:51,364][15401] Updated weights for policy 0, policy_version 830913 (0.0023) [2024-06-25 06:39:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13613727744. Throughput: 0: 42459.2. Samples: 13613828760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 06:39:55,010][15401] Updated weights for policy 0, policy_version 830923 (0.0043) [2024-06-25 06:39:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13613973504. Throughput: 0: 42632.0. Samples: 13614088440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:39:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 06:39:58,940][15401] Updated weights for policy 0, policy_version 830933 (0.0043) [2024-06-25 06:40:02,539][15401] Updated weights for policy 0, policy_version 830943 (0.0032) [2024-06-25 06:40:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13614186496. Throughput: 0: 42700.5. Samples: 13614342180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 06:40:06,957][15401] Updated weights for policy 0, policy_version 830953 (0.0029) [2024-06-25 06:40:08,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 13614383104. Throughput: 0: 42452.1. Samples: 13614467100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:08,397][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 06:40:10,330][15401] Updated weights for policy 0, policy_version 830963 (0.0031) [2024-06-25 06:40:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13614612480. Throughput: 0: 42821.0. Samples: 13614729660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 06:40:14,572][15401] Updated weights for policy 0, policy_version 830973 (0.0028) [2024-06-25 06:40:17,884][15349] Signal inference workers to stop experience collection... (201550 times) [2024-06-25 06:40:17,916][15401] InferenceWorker_p0-w0: stopping experience collection (201550 times) [2024-06-25 06:40:17,941][15349] Signal inference workers to resume experience collection... (201550 times) [2024-06-25 06:40:17,942][15401] InferenceWorker_p0-w0: resuming experience collection (201550 times) [2024-06-25 06:40:18,075][15401] Updated weights for policy 0, policy_version 830983 (0.0022) [2024-06-25 06:40:18,389][15132] Fps is (10 sec: 44265.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13614825472. Throughput: 0: 42708.9. Samples: 13614981460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 06:40:22,562][15401] Updated weights for policy 0, policy_version 830993 (0.0030) [2024-06-25 06:40:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42876.5). Total num frames: 13615038464. Throughput: 0: 42596.6. Samples: 13615108720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 06:40:25,854][15401] Updated weights for policy 0, policy_version 831003 (0.0027) [2024-06-25 06:40:28,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43140.1, 300 sec: 42875.2). Total num frames: 13615267840. Throughput: 0: 42660.7. Samples: 13615367020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:28,396][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 06:40:30,224][15401] Updated weights for policy 0, policy_version 831013 (0.0031) [2024-06-25 06:40:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13615464448. Throughput: 0: 42592.8. Samples: 13615619440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:33,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 06:40:33,580][15401] Updated weights for policy 0, policy_version 831023 (0.0023) [2024-06-25 06:40:37,789][15401] Updated weights for policy 0, policy_version 831033 (0.0036) [2024-06-25 06:40:38,390][15132] Fps is (10 sec: 39346.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13615661056. Throughput: 0: 42625.7. Samples: 13615746920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 06:40:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 06:40:41,378][15401] Updated weights for policy 0, policy_version 831043 (0.0035) [2024-06-25 06:40:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13615890432. Throughput: 0: 42636.1. Samples: 13616007060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:40:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 06:40:45,583][15401] Updated weights for policy 0, policy_version 831053 (0.0026) [2024-06-25 06:40:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13616103424. Throughput: 0: 42572.3. Samples: 13616257940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:40:48,399][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 06:40:49,624][15401] Updated weights for policy 0, policy_version 831063 (0.0038) [2024-06-25 06:40:53,166][15401] Updated weights for policy 0, policy_version 831073 (0.0041) [2024-06-25 06:40:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13616300032. Throughput: 0: 42706.9. Samples: 13616388640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:40:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 06:40:57,169][15401] Updated weights for policy 0, policy_version 831083 (0.0037) [2024-06-25 06:40:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13616513024. Throughput: 0: 42610.8. Samples: 13616647140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:40:58,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-25 06:41:00,997][15401] Updated weights for policy 0, policy_version 831093 (0.0033) [2024-06-25 06:41:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13616758784. Throughput: 0: 42512.0. Samples: 13616894500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:03,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 06:41:04,850][15401] Updated weights for policy 0, policy_version 831103 (0.0038) [2024-06-25 06:41:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42329.9, 300 sec: 42654.0). Total num frames: 13616922624. Throughput: 0: 42658.3. Samples: 13617028340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:08,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 06:41:08,646][15401] Updated weights for policy 0, policy_version 831113 (0.0035) [2024-06-25 06:41:12,662][15401] Updated weights for policy 0, policy_version 831123 (0.0038) [2024-06-25 06:41:13,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13617152000. Throughput: 0: 42684.6. Samples: 13617287560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 06:41:16,125][15401] Updated weights for policy 0, policy_version 831133 (0.0037) [2024-06-25 06:41:18,390][15132] Fps is (10 sec: 47512.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13617397760. Throughput: 0: 42478.6. Samples: 13617530980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 06:41:20,064][15401] Updated weights for policy 0, policy_version 831143 (0.0030) [2024-06-25 06:41:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13617577984. Throughput: 0: 42785.0. Samples: 13617672240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:23,390][15132] Avg episode reward: [(0, '0.867')] [2024-06-25 06:41:23,655][15401] Updated weights for policy 0, policy_version 831153 (0.0037) [2024-06-25 06:41:27,543][15401] Updated weights for policy 0, policy_version 831163 (0.0031) [2024-06-25 06:41:28,392][15132] Fps is (10 sec: 39312.7, 60 sec: 42055.0, 300 sec: 42653.6). Total num frames: 13617790976. Throughput: 0: 42673.2. Samples: 13617927460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:28,392][15132] Avg episode reward: [(0, '0.882')] [2024-06-25 06:41:31,286][15401] Updated weights for policy 0, policy_version 831173 (0.0035) [2024-06-25 06:41:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13618036736. Throughput: 0: 42672.4. Samples: 13618178200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 06:41:35,318][15401] Updated weights for policy 0, policy_version 831183 (0.0033) [2024-06-25 06:41:38,391][15132] Fps is (10 sec: 44239.2, 60 sec: 42870.2, 300 sec: 42764.7). Total num frames: 13618233344. Throughput: 0: 42856.1. Samples: 13618317240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:38,392][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:41:38,737][15401] Updated weights for policy 0, policy_version 831193 (0.0027) [2024-06-25 06:41:42,670][15401] Updated weights for policy 0, policy_version 831203 (0.0030) [2024-06-25 06:41:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13618429952. Throughput: 0: 42839.6. Samples: 13618574920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:41:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831204_13618446336.pth... [2024-06-25 06:41:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830579_13608206336.pth [2024-06-25 06:41:46,214][15401] Updated weights for policy 0, policy_version 831213 (0.0025) [2024-06-25 06:41:48,389][15132] Fps is (10 sec: 45883.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13618692096. Throughput: 0: 42904.4. Samples: 13618825200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:48,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 06:41:50,049][15401] Updated weights for policy 0, policy_version 831223 (0.0030) [2024-06-25 06:41:52,908][15349] Signal inference workers to stop experience collection... (201600 times) [2024-06-25 06:41:52,908][15349] Signal inference workers to resume experience collection... (201600 times) [2024-06-25 06:41:52,949][15401] InferenceWorker_p0-w0: stopping experience collection (201600 times) [2024-06-25 06:41:52,949][15401] InferenceWorker_p0-w0: resuming experience collection (201600 times) [2024-06-25 06:41:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 13618888704. Throughput: 0: 43111.5. Samples: 13618968360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:41:54,240][15401] Updated weights for policy 0, policy_version 831233 (0.0028) [2024-06-25 06:41:57,621][15401] Updated weights for policy 0, policy_version 831243 (0.0028) [2024-06-25 06:41:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13619085312. Throughput: 0: 42768.6. Samples: 13619212140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:41:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 06:42:01,952][15401] Updated weights for policy 0, policy_version 831253 (0.0033) [2024-06-25 06:42:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13619314688. Throughput: 0: 42976.5. Samples: 13619465020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:03,392][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 06:42:05,173][15401] Updated weights for policy 0, policy_version 831263 (0.0037) [2024-06-25 06:42:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13619527680. Throughput: 0: 42889.7. Samples: 13619602280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:08,390][15132] Avg episode reward: [(0, '0.164')] [2024-06-25 06:42:09,610][15401] Updated weights for policy 0, policy_version 831273 (0.0048) [2024-06-25 06:42:12,759][15401] Updated weights for policy 0, policy_version 831283 (0.0035) [2024-06-25 06:42:13,389][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13619740672. Throughput: 0: 42714.7. Samples: 13619849520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 06:42:17,234][15401] Updated weights for policy 0, policy_version 831293 (0.0023) [2024-06-25 06:42:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 13619953664. Throughput: 0: 42924.2. Samples: 13620109780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 06:42:20,613][15401] Updated weights for policy 0, policy_version 831303 (0.0025) [2024-06-25 06:42:23,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13620133888. Throughput: 0: 42653.6. Samples: 13620236580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 06:42:24,952][15401] Updated weights for policy 0, policy_version 831313 (0.0048) [2024-06-25 06:42:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 13620379648. Throughput: 0: 42491.5. Samples: 13620487040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 06:42:28,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 06:42:28,843][15401] Updated weights for policy 0, policy_version 831323 (0.0038) [2024-06-25 06:42:33,098][15401] Updated weights for policy 0, policy_version 831333 (0.0039) [2024-06-25 06:42:33,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 13620576256. Throughput: 0: 42866.7. Samples: 13620754200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 06:42:36,317][15401] Updated weights for policy 0, policy_version 831343 (0.0028) [2024-06-25 06:42:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42326.6, 300 sec: 42598.4). Total num frames: 13620772864. Throughput: 0: 42371.0. Samples: 13620875060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 06:42:40,554][15401] Updated weights for policy 0, policy_version 831353 (0.0028) [2024-06-25 06:42:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13621035008. Throughput: 0: 42623.6. Samples: 13621130200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 06:42:44,181][15401] Updated weights for policy 0, policy_version 831363 (0.0027) [2024-06-25 06:42:48,239][15401] Updated weights for policy 0, policy_version 831373 (0.0036) [2024-06-25 06:42:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 13621215232. Throughput: 0: 42871.1. Samples: 13621394120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:48,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 06:42:51,592][15401] Updated weights for policy 0, policy_version 831383 (0.0040) [2024-06-25 06:42:53,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13621411840. Throughput: 0: 42472.5. Samples: 13621513540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 06:42:55,687][15401] Updated weights for policy 0, policy_version 831393 (0.0030) [2024-06-25 06:42:58,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 13621673984. Throughput: 0: 42695.5. Samples: 13621770920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:42:58,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 06:42:59,235][15401] Updated weights for policy 0, policy_version 831403 (0.0030) [2024-06-25 06:43:01,968][15349] Signal inference workers to stop experience collection... (201650 times) [2024-06-25 06:43:01,969][15349] Signal inference workers to resume experience collection... (201650 times) [2024-06-25 06:43:01,999][15401] InferenceWorker_p0-w0: stopping experience collection (201650 times) [2024-06-25 06:43:01,999][15401] InferenceWorker_p0-w0: resuming experience collection (201650 times) [2024-06-25 06:43:03,139][15401] Updated weights for policy 0, policy_version 831413 (0.0038) [2024-06-25 06:43:03,396][15132] Fps is (10 sec: 45845.5, 60 sec: 42595.5, 300 sec: 42708.6). Total num frames: 13621870592. Throughput: 0: 42663.2. Samples: 13622029900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:03,396][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 06:43:06,746][15401] Updated weights for policy 0, policy_version 831423 (0.0038) [2024-06-25 06:43:08,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13622067200. Throughput: 0: 42735.2. Samples: 13622159660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 06:43:10,785][15401] Updated weights for policy 0, policy_version 831433 (0.0031) [2024-06-25 06:43:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13622296576. Throughput: 0: 42669.3. Samples: 13622407160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 06:43:14,387][15401] Updated weights for policy 0, policy_version 831443 (0.0029) [2024-06-25 06:43:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13622509568. Throughput: 0: 42520.4. Samples: 13622667620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:18,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 06:43:18,605][15401] Updated weights for policy 0, policy_version 831453 (0.0045) [2024-06-25 06:43:22,239][15401] Updated weights for policy 0, policy_version 831463 (0.0029) [2024-06-25 06:43:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 13622706176. Throughput: 0: 42647.2. Samples: 13622794180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:23,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 06:43:26,087][15401] Updated weights for policy 0, policy_version 831473 (0.0027) [2024-06-25 06:43:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13622935552. Throughput: 0: 42635.0. Samples: 13623048780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 06:43:30,114][15401] Updated weights for policy 0, policy_version 831483 (0.0029) [2024-06-25 06:43:33,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13623148544. Throughput: 0: 42629.4. Samples: 13623312440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 06:43:33,858][15401] Updated weights for policy 0, policy_version 831493 (0.0039) [2024-06-25 06:43:38,051][15401] Updated weights for policy 0, policy_version 831503 (0.0052) [2024-06-25 06:43:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13623361536. Throughput: 0: 42737.7. Samples: 13623436740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 06:43:41,741][15401] Updated weights for policy 0, policy_version 831513 (0.0041) [2024-06-25 06:43:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13623590912. Throughput: 0: 42665.3. Samples: 13623690860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:43,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 06:43:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831518_13623590912.pth... [2024-06-25 06:43:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000830892_13613334528.pth [2024-06-25 06:43:45,854][15401] Updated weights for policy 0, policy_version 831523 (0.0028) [2024-06-25 06:43:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13623787520. Throughput: 0: 42557.7. Samples: 13623944720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:48,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 06:43:49,522][15401] Updated weights for policy 0, policy_version 831533 (0.0032) [2024-06-25 06:43:53,394][15132] Fps is (10 sec: 39313.1, 60 sec: 42868.2, 300 sec: 42597.8). Total num frames: 13623984128. Throughput: 0: 42497.1. Samples: 13624072220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:53,395][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 06:43:53,512][15401] Updated weights for policy 0, policy_version 831543 (0.0034) [2024-06-25 06:43:57,015][15401] Updated weights for policy 0, policy_version 831553 (0.0032) [2024-06-25 06:43:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13624229888. Throughput: 0: 42766.8. Samples: 13624331660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:43:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:44:01,225][15401] Updated weights for policy 0, policy_version 831563 (0.0036) [2024-06-25 06:44:03,390][15132] Fps is (10 sec: 44256.9, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 13624426496. Throughput: 0: 42799.1. Samples: 13624593580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:44:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 06:44:04,756][15401] Updated weights for policy 0, policy_version 831573 (0.0030) [2024-06-25 06:44:08,391][15132] Fps is (10 sec: 40953.0, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 13624639488. Throughput: 0: 42721.0. Samples: 13624716700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:44:08,392][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 06:44:08,803][15401] Updated weights for policy 0, policy_version 831583 (0.0035) [2024-06-25 06:44:12,273][15401] Updated weights for policy 0, policy_version 831593 (0.0037) [2024-06-25 06:44:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13624868864. Throughput: 0: 42978.6. Samples: 13624982820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 06:44:13,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 06:44:16,252][15401] Updated weights for policy 0, policy_version 831603 (0.0042) [2024-06-25 06:44:18,389][15132] Fps is (10 sec: 42605.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13625065472. Throughput: 0: 42885.7. Samples: 13625242300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 06:44:19,776][15401] Updated weights for policy 0, policy_version 831613 (0.0044) [2024-06-25 06:44:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13625294848. Throughput: 0: 42879.9. Samples: 13625366340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 06:44:23,678][15401] Updated weights for policy 0, policy_version 831623 (0.0050) [2024-06-25 06:44:25,166][15349] Signal inference workers to stop experience collection... (201700 times) [2024-06-25 06:44:25,167][15349] Signal inference workers to resume experience collection... (201700 times) [2024-06-25 06:44:25,199][15401] InferenceWorker_p0-w0: stopping experience collection (201700 times) [2024-06-25 06:44:25,199][15401] InferenceWorker_p0-w0: resuming experience collection (201700 times) [2024-06-25 06:44:27,284][15401] Updated weights for policy 0, policy_version 831633 (0.0035) [2024-06-25 06:44:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13625507840. Throughput: 0: 43012.0. Samples: 13625626300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:28,394][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 06:44:31,669][15401] Updated weights for policy 0, policy_version 831643 (0.0047) [2024-06-25 06:44:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13625704448. Throughput: 0: 43226.6. Samples: 13625889920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 06:44:34,733][15401] Updated weights for policy 0, policy_version 831653 (0.0030) [2024-06-25 06:44:38,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 13625950208. Throughput: 0: 43160.3. Samples: 13626014340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:38,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 06:44:39,057][15401] Updated weights for policy 0, policy_version 831663 (0.0034) [2024-06-25 06:44:42,501][15401] Updated weights for policy 0, policy_version 831673 (0.0024) [2024-06-25 06:44:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 13626146816. Throughput: 0: 43083.9. Samples: 13626270440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 06:44:46,892][15401] Updated weights for policy 0, policy_version 831683 (0.0037) [2024-06-25 06:44:48,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13626343424. Throughput: 0: 43128.9. Samples: 13626534380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 06:44:50,726][15401] Updated weights for policy 0, policy_version 831693 (0.0038) [2024-06-25 06:44:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43147.8, 300 sec: 42709.5). Total num frames: 13626572800. Throughput: 0: 43013.6. Samples: 13626652240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 06:44:54,379][15401] Updated weights for policy 0, policy_version 831703 (0.0031) [2024-06-25 06:44:58,201][15401] Updated weights for policy 0, policy_version 831713 (0.0038) [2024-06-25 06:44:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13626785792. Throughput: 0: 42837.9. Samples: 13626910520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:44:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 06:45:02,159][15401] Updated weights for policy 0, policy_version 831723 (0.0031) [2024-06-25 06:45:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 13626982400. Throughput: 0: 42967.5. Samples: 13627175840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:03,391][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 06:45:05,860][15401] Updated weights for policy 0, policy_version 831733 (0.0034) [2024-06-25 06:45:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43145.7, 300 sec: 42765.0). Total num frames: 13627228160. Throughput: 0: 42914.7. Samples: 13627297500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 06:45:09,723][15401] Updated weights for policy 0, policy_version 831743 (0.0033) [2024-06-25 06:45:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13627408384. Throughput: 0: 42881.3. Samples: 13627555960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 06:45:13,877][15401] Updated weights for policy 0, policy_version 831753 (0.0037) [2024-06-25 06:45:17,525][15401] Updated weights for policy 0, policy_version 831763 (0.0031) [2024-06-25 06:45:18,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 13627637760. Throughput: 0: 42619.1. Samples: 13627807860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:18,392][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 06:45:21,493][15401] Updated weights for policy 0, policy_version 831773 (0.0036) [2024-06-25 06:45:23,392][15132] Fps is (10 sec: 47502.5, 60 sec: 43142.9, 300 sec: 42765.6). Total num frames: 13627883520. Throughput: 0: 42807.1. Samples: 13627940660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:23,392][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 06:45:24,955][15401] Updated weights for policy 0, policy_version 831783 (0.0035) [2024-06-25 06:45:28,389][15132] Fps is (10 sec: 40967.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13628047360. Throughput: 0: 42766.3. Samples: 13628194920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 06:45:29,024][15401] Updated weights for policy 0, policy_version 831793 (0.0032) [2024-06-25 06:45:32,475][15401] Updated weights for policy 0, policy_version 831803 (0.0033) [2024-06-25 06:45:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13628293120. Throughput: 0: 42617.8. Samples: 13628452180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 06:45:36,701][15401] Updated weights for policy 0, policy_version 831813 (0.0035) [2024-06-25 06:45:38,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 13628522496. Throughput: 0: 42928.4. Samples: 13628584020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:38,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 06:45:40,440][15401] Updated weights for policy 0, policy_version 831823 (0.0043) [2024-06-25 06:45:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13628686336. Throughput: 0: 42866.6. Samples: 13628839520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 06:45:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831830_13628702720.pth... [2024-06-25 06:45:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831204_13618446336.pth [2024-06-25 06:45:44,338][15401] Updated weights for policy 0, policy_version 831833 (0.0030) [2024-06-25 06:45:47,926][15401] Updated weights for policy 0, policy_version 831843 (0.0025) [2024-06-25 06:45:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13628932096. Throughput: 0: 42548.4. Samples: 13629090520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 06:45:51,859][15349] Signal inference workers to stop experience collection... (201750 times) [2024-06-25 06:45:51,860][15349] Signal inference workers to resume experience collection... (201750 times) [2024-06-25 06:45:51,915][15401] InferenceWorker_p0-w0: stopping experience collection (201750 times) [2024-06-25 06:45:51,915][15401] InferenceWorker_p0-w0: resuming experience collection (201750 times) [2024-06-25 06:45:52,002][15401] Updated weights for policy 0, policy_version 831853 (0.0027) [2024-06-25 06:45:53,396][15132] Fps is (10 sec: 47483.1, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 13629161472. Throughput: 0: 42781.1. Samples: 13629222920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:53,396][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 06:45:55,695][15401] Updated weights for policy 0, policy_version 831863 (0.0038) [2024-06-25 06:45:58,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13629325312. Throughput: 0: 42652.5. Samples: 13629475320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:45:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 06:45:59,521][15401] Updated weights for policy 0, policy_version 831873 (0.0049) [2024-06-25 06:46:03,219][15401] Updated weights for policy 0, policy_version 831883 (0.0031) [2024-06-25 06:46:03,389][15132] Fps is (10 sec: 40986.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13629571072. Throughput: 0: 42752.0. Samples: 13629731620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-25 06:46:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 06:46:07,202][15401] Updated weights for policy 0, policy_version 831893 (0.0029) [2024-06-25 06:46:08,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42593.9, 300 sec: 42819.6). Total num frames: 13629784064. Throughput: 0: 42759.7. Samples: 13629865020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:08,396][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 06:46:10,744][15401] Updated weights for policy 0, policy_version 831903 (0.0045) [2024-06-25 06:46:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13629980672. Throughput: 0: 42584.9. Samples: 13630111240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 06:46:14,931][15401] Updated weights for policy 0, policy_version 831913 (0.0039) [2024-06-25 06:46:18,390][15132] Fps is (10 sec: 42625.0, 60 sec: 42872.7, 300 sec: 42820.5). Total num frames: 13630210048. Throughput: 0: 42519.4. Samples: 13630365560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:18,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 06:46:18,637][15401] Updated weights for policy 0, policy_version 831923 (0.0033) [2024-06-25 06:46:22,472][15401] Updated weights for policy 0, policy_version 831933 (0.0027) [2024-06-25 06:46:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 42820.9). Total num frames: 13630423040. Throughput: 0: 42587.1. Samples: 13630500440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 06:46:26,066][15401] Updated weights for policy 0, policy_version 831943 (0.0033) [2024-06-25 06:46:28,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13630603264. Throughput: 0: 42677.7. Samples: 13630760120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:28,392][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 06:46:30,054][15401] Updated weights for policy 0, policy_version 831953 (0.0048) [2024-06-25 06:46:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.8). Total num frames: 13630865408. Throughput: 0: 42660.0. Samples: 13631010220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:33,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 06:46:33,542][15401] Updated weights for policy 0, policy_version 831963 (0.0020) [2024-06-25 06:46:38,103][15401] Updated weights for policy 0, policy_version 831973 (0.0031) [2024-06-25 06:46:38,392][15132] Fps is (10 sec: 47513.2, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 13631078400. Throughput: 0: 42793.5. Samples: 13631148460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:38,401][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 06:46:41,148][15401] Updated weights for policy 0, policy_version 831983 (0.0047) [2024-06-25 06:46:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 13631258624. Throughput: 0: 42730.0. Samples: 13631398180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 06:46:45,721][15401] Updated weights for policy 0, policy_version 831993 (0.0032) [2024-06-25 06:46:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13631504384. Throughput: 0: 42522.7. Samples: 13631645140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 06:46:49,170][15401] Updated weights for policy 0, policy_version 832003 (0.0041) [2024-06-25 06:46:53,203][15401] Updated weights for policy 0, policy_version 832013 (0.0029) [2024-06-25 06:46:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42329.9, 300 sec: 42765.0). Total num frames: 13631700992. Throughput: 0: 42661.2. Samples: 13631784500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 06:46:56,587][15401] Updated weights for policy 0, policy_version 832023 (0.0034) [2024-06-25 06:46:58,391][15132] Fps is (10 sec: 40952.1, 60 sec: 43143.2, 300 sec: 42709.5). Total num frames: 13631913984. Throughput: 0: 42799.9. Samples: 13632037320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:46:58,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 06:46:59,572][15349] Signal inference workers to stop experience collection... (201800 times) [2024-06-25 06:46:59,572][15349] Signal inference workers to resume experience collection... (201800 times) [2024-06-25 06:46:59,597][15401] InferenceWorker_p0-w0: stopping experience collection (201800 times) [2024-06-25 06:46:59,597][15401] InferenceWorker_p0-w0: resuming experience collection (201800 times) [2024-06-25 06:47:01,237][15401] Updated weights for policy 0, policy_version 832033 (0.0030) [2024-06-25 06:47:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13632159744. Throughput: 0: 42912.2. Samples: 13632296600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 06:47:04,202][15401] Updated weights for policy 0, policy_version 832043 (0.0028) [2024-06-25 06:47:08,389][15132] Fps is (10 sec: 40967.7, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 13632323584. Throughput: 0: 42817.3. Samples: 13632427220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 06:47:08,744][15401] Updated weights for policy 0, policy_version 832053 (0.0032) [2024-06-25 06:47:11,884][15401] Updated weights for policy 0, policy_version 832063 (0.0032) [2024-06-25 06:47:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13632569344. Throughput: 0: 42627.6. Samples: 13632678260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 06:47:16,335][15401] Updated weights for policy 0, policy_version 832073 (0.0039) [2024-06-25 06:47:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13632765952. Throughput: 0: 42953.2. Samples: 13632943120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 06:47:19,449][15401] Updated weights for policy 0, policy_version 832083 (0.0033) [2024-06-25 06:47:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13632978944. Throughput: 0: 42687.7. Samples: 13633069300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 06:47:23,832][15401] Updated weights for policy 0, policy_version 832093 (0.0044) [2024-06-25 06:47:27,147][15401] Updated weights for policy 0, policy_version 832103 (0.0033) [2024-06-25 06:47:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 13633208320. Throughput: 0: 42760.3. Samples: 13633322380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:28,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 06:47:31,719][15401] Updated weights for policy 0, policy_version 832113 (0.0039) [2024-06-25 06:47:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13633404928. Throughput: 0: 43148.5. Samples: 13633586820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:33,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-25 06:47:34,954][15401] Updated weights for policy 0, policy_version 832123 (0.0030) [2024-06-25 06:47:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 13633617920. Throughput: 0: 42707.2. Samples: 13633706320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 06:47:39,374][15401] Updated weights for policy 0, policy_version 832133 (0.0029) [2024-06-25 06:47:42,418][15401] Updated weights for policy 0, policy_version 832143 (0.0032) [2024-06-25 06:47:43,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 13633863680. Throughput: 0: 42807.6. Samples: 13633963580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 06:47:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832145_13633863680.pth... [2024-06-25 06:47:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831518_13623590912.pth [2024-06-25 06:47:46,972][15401] Updated weights for policy 0, policy_version 832153 (0.0033) [2024-06-25 06:47:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 13634043904. Throughput: 0: 42909.2. Samples: 13634227520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:48,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 06:47:49,973][15401] Updated weights for policy 0, policy_version 832163 (0.0036) [2024-06-25 06:47:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13634273280. Throughput: 0: 42750.2. Samples: 13634350980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 06:47:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 06:47:54,374][15401] Updated weights for policy 0, policy_version 832173 (0.0024) [2024-06-25 06:47:57,935][15401] Updated weights for policy 0, policy_version 832183 (0.0030) [2024-06-25 06:47:58,394][15132] Fps is (10 sec: 45854.9, 60 sec: 43142.7, 300 sec: 42820.8). Total num frames: 13634502656. Throughput: 0: 42960.7. Samples: 13634611680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:47:58,394][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 06:48:02,416][15401] Updated weights for policy 0, policy_version 832193 (0.0024) [2024-06-25 06:48:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 13634682880. Throughput: 0: 42868.2. Samples: 13634872180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 06:48:05,867][15401] Updated weights for policy 0, policy_version 832203 (0.0027) [2024-06-25 06:48:08,389][15132] Fps is (10 sec: 42617.6, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 13634928640. Throughput: 0: 42738.2. Samples: 13634992520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 06:48:10,095][15401] Updated weights for policy 0, policy_version 832213 (0.0034) [2024-06-25 06:48:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13635125248. Throughput: 0: 42899.4. Samples: 13635252860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:13,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 06:48:13,432][15401] Updated weights for policy 0, policy_version 832223 (0.0035) [2024-06-25 06:48:17,758][15401] Updated weights for policy 0, policy_version 832233 (0.0028) [2024-06-25 06:48:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 13635321856. Throughput: 0: 42772.4. Samples: 13635511580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 06:48:21,284][15401] Updated weights for policy 0, policy_version 832243 (0.0042) [2024-06-25 06:48:22,026][15349] Signal inference workers to stop experience collection... (201850 times) [2024-06-25 06:48:22,026][15349] Signal inference workers to resume experience collection... (201850 times) [2024-06-25 06:48:22,052][15401] InferenceWorker_p0-w0: stopping experience collection (201850 times) [2024-06-25 06:48:22,052][15401] InferenceWorker_p0-w0: resuming experience collection (201850 times) [2024-06-25 06:48:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13635567616. Throughput: 0: 42825.3. Samples: 13635633460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:23,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 06:48:25,378][15401] Updated weights for policy 0, policy_version 832253 (0.0035) [2024-06-25 06:48:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13635764224. Throughput: 0: 42890.3. Samples: 13635893640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:28,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-25 06:48:28,687][15401] Updated weights for policy 0, policy_version 832263 (0.0034) [2024-06-25 06:48:33,286][15401] Updated weights for policy 0, policy_version 832273 (0.0029) [2024-06-25 06:48:33,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13635960832. Throughput: 0: 42773.8. Samples: 13636152340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 06:48:36,536][15401] Updated weights for policy 0, policy_version 832283 (0.0039) [2024-06-25 06:48:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 13636222976. Throughput: 0: 42827.6. Samples: 13636278220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 06:48:40,679][15401] Updated weights for policy 0, policy_version 832293 (0.0026) [2024-06-25 06:48:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13636419584. Throughput: 0: 42878.9. Samples: 13636541040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:43,396][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 06:48:44,021][15401] Updated weights for policy 0, policy_version 832303 (0.0028) [2024-06-25 06:48:48,288][15401] Updated weights for policy 0, policy_version 832313 (0.0049) [2024-06-25 06:48:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42821.2). Total num frames: 13636616192. Throughput: 0: 42894.5. Samples: 13636802440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 06:48:51,506][15401] Updated weights for policy 0, policy_version 832323 (0.0032) [2024-06-25 06:48:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13636861952. Throughput: 0: 42883.4. Samples: 13636922280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:48:56,078][15401] Updated weights for policy 0, policy_version 832333 (0.0028) [2024-06-25 06:48:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42874.6, 300 sec: 42876.1). Total num frames: 13637074944. Throughput: 0: 42850.6. Samples: 13637181140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:48:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 06:48:59,057][15401] Updated weights for policy 0, policy_version 832343 (0.0031) [2024-06-25 06:49:03,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42596.7, 300 sec: 42709.4). Total num frames: 13637238784. Throughput: 0: 42821.6. Samples: 13637438660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:03,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 06:49:03,946][15401] Updated weights for policy 0, policy_version 832353 (0.0044) [2024-06-25 06:49:06,681][15401] Updated weights for policy 0, policy_version 832363 (0.0032) [2024-06-25 06:49:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 13637500928. Throughput: 0: 42738.6. Samples: 13637556700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:08,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 06:49:11,522][15401] Updated weights for policy 0, policy_version 832373 (0.0036) [2024-06-25 06:49:13,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13637697536. Throughput: 0: 42867.9. Samples: 13637822700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 06:49:14,168][15401] Updated weights for policy 0, policy_version 832383 (0.0030) [2024-06-25 06:49:18,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13637877760. Throughput: 0: 42799.6. Samples: 13638078320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 06:49:19,098][15401] Updated weights for policy 0, policy_version 832393 (0.0040) [2024-06-25 06:49:21,768][15401] Updated weights for policy 0, policy_version 832403 (0.0046) [2024-06-25 06:49:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13638139904. Throughput: 0: 42871.9. Samples: 13638207460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 06:49:26,860][15401] Updated weights for policy 0, policy_version 832413 (0.0025) [2024-06-25 06:49:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13638336512. Throughput: 0: 42753.8. Samples: 13638464960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 06:49:29,521][15401] Updated weights for policy 0, policy_version 832423 (0.0046) [2024-06-25 06:49:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 13638533120. Throughput: 0: 42775.7. Samples: 13638727340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 06:49:34,367][15401] Updated weights for policy 0, policy_version 832433 (0.0030) [2024-06-25 06:49:37,056][15401] Updated weights for policy 0, policy_version 832443 (0.0024) [2024-06-25 06:49:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13638795264. Throughput: 0: 42794.3. Samples: 13638848020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 06:49:38,776][15349] Signal inference workers to stop experience collection... (201900 times) [2024-06-25 06:49:38,777][15349] Signal inference workers to resume experience collection... (201900 times) [2024-06-25 06:49:38,828][15401] InferenceWorker_p0-w0: stopping experience collection (201900 times) [2024-06-25 06:49:38,828][15401] InferenceWorker_p0-w0: resuming experience collection (201900 times) [2024-06-25 06:49:42,242][15401] Updated weights for policy 0, policy_version 832453 (0.0029) [2024-06-25 06:49:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13638991872. Throughput: 0: 42945.3. Samples: 13639113680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-25 06:49:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 06:49:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832458_13638991872.pth... [2024-06-25 06:49:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000831830_13628702720.pth [2024-06-25 06:49:44,662][15401] Updated weights for policy 0, policy_version 832463 (0.0042) [2024-06-25 06:49:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13639172096. Throughput: 0: 42935.2. Samples: 13639370640. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:49:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 06:49:49,676][15401] Updated weights for policy 0, policy_version 832473 (0.0043) [2024-06-25 06:49:52,206][15401] Updated weights for policy 0, policy_version 832483 (0.0027) [2024-06-25 06:49:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13639450624. Throughput: 0: 43119.6. Samples: 13639497080. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:49:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 06:49:57,189][15401] Updated weights for policy 0, policy_version 832493 (0.0027) [2024-06-25 06:49:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 13639614464. Throughput: 0: 43092.1. Samples: 13639761840. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:49:58,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 06:49:59,886][15401] Updated weights for policy 0, policy_version 832503 (0.0037) [2024-06-25 06:50:03,389][15132] Fps is (10 sec: 37683.9, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 13639827456. Throughput: 0: 43136.0. Samples: 13640019440. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:03,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 06:50:04,986][15401] Updated weights for policy 0, policy_version 832513 (0.0028) [2024-06-25 06:50:07,414][15401] Updated weights for policy 0, policy_version 832523 (0.0033) [2024-06-25 06:50:08,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13640089600. Throughput: 0: 43107.6. Samples: 13640147300. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:08,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 06:50:12,465][15401] Updated weights for policy 0, policy_version 832533 (0.0040) [2024-06-25 06:50:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 13640253440. Throughput: 0: 43255.4. Samples: 13640411460. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 06:50:14,870][15401] Updated weights for policy 0, policy_version 832543 (0.0030) [2024-06-25 06:50:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.6, 300 sec: 42709.8). Total num frames: 13640482816. Throughput: 0: 43079.6. Samples: 13640665920. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 06:50:19,831][15401] Updated weights for policy 0, policy_version 832553 (0.0037) [2024-06-25 06:50:22,365][15401] Updated weights for policy 0, policy_version 832563 (0.0027) [2024-06-25 06:50:23,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13640712192. Throughput: 0: 43226.2. Samples: 13640793200. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 06:50:27,459][15401] Updated weights for policy 0, policy_version 832573 (0.0026) [2024-06-25 06:50:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13640908800. Throughput: 0: 43006.4. Samples: 13641048960. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 06:50:30,455][15401] Updated weights for policy 0, policy_version 832583 (0.0032) [2024-06-25 06:50:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13641121792. Throughput: 0: 43024.8. Samples: 13641306760. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 06:50:34,855][15401] Updated weights for policy 0, policy_version 832593 (0.0044) [2024-06-25 06:50:37,956][15401] Updated weights for policy 0, policy_version 832603 (0.0027) [2024-06-25 06:50:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13641367552. Throughput: 0: 43067.6. Samples: 13641435120. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 06:50:42,665][15401] Updated weights for policy 0, policy_version 832613 (0.0022) [2024-06-25 06:50:43,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 13641580544. Throughput: 0: 43086.5. Samples: 13641700840. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:43,392][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 06:50:45,899][15401] Updated weights for policy 0, policy_version 832623 (0.0030) [2024-06-25 06:50:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42765.9). Total num frames: 13641777152. Throughput: 0: 42924.3. Samples: 13641951040. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:48,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 06:50:50,235][15401] Updated weights for policy 0, policy_version 832633 (0.0044) [2024-06-25 06:50:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 13642006528. Throughput: 0: 42909.4. Samples: 13642078220. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 06:50:53,554][15401] Updated weights for policy 0, policy_version 832643 (0.0041) [2024-06-25 06:50:57,528][15349] Signal inference workers to stop experience collection... (201950 times) [2024-06-25 06:50:57,528][15349] Signal inference workers to resume experience collection... (201950 times) [2024-06-25 06:50:57,565][15401] InferenceWorker_p0-w0: stopping experience collection (201950 times) [2024-06-25 06:50:57,565][15401] InferenceWorker_p0-w0: resuming experience collection (201950 times) [2024-06-25 06:50:57,669][15401] Updated weights for policy 0, policy_version 832653 (0.0033) [2024-06-25 06:50:58,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 13642203136. Throughput: 0: 42782.7. Samples: 13642336780. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:50:58,392][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 06:51:01,495][15401] Updated weights for policy 0, policy_version 832663 (0.0033) [2024-06-25 06:51:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42821.5). Total num frames: 13642416128. Throughput: 0: 42796.3. Samples: 13642591760. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:03,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 06:51:05,196][15401] Updated weights for policy 0, policy_version 832673 (0.0029) [2024-06-25 06:51:08,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13642645504. Throughput: 0: 42793.8. Samples: 13642718920. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:08,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 06:51:09,215][15401] Updated weights for policy 0, policy_version 832683 (0.0037) [2024-06-25 06:51:12,652][15401] Updated weights for policy 0, policy_version 832693 (0.0031) [2024-06-25 06:51:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 13642858496. Throughput: 0: 42922.5. Samples: 13642980480. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 06:51:16,865][15401] Updated weights for policy 0, policy_version 832703 (0.0037) [2024-06-25 06:51:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13643055104. Throughput: 0: 42975.6. Samples: 13643240760. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:18,392][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 06:51:20,272][15401] Updated weights for policy 0, policy_version 832713 (0.0035) [2024-06-25 06:51:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 43043.1). Total num frames: 13643300864. Throughput: 0: 42866.9. Samples: 13643364120. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 06:51:24,471][15401] Updated weights for policy 0, policy_version 832723 (0.0034) [2024-06-25 06:51:27,926][15401] Updated weights for policy 0, policy_version 832733 (0.0029) [2024-06-25 06:51:28,389][15132] Fps is (10 sec: 45886.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13643513856. Throughput: 0: 42864.1. Samples: 13643629620. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 06:51:32,013][15401] Updated weights for policy 0, policy_version 832743 (0.0042) [2024-06-25 06:51:33,389][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 13643710464. Throughput: 0: 43073.0. Samples: 13643889320. Policy #0 lag: (min: 1.0, avg: 13.3, max: 21.0) [2024-06-25 06:51:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 06:51:35,465][15401] Updated weights for policy 0, policy_version 832753 (0.0036) [2024-06-25 06:51:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13643939840. Throughput: 0: 42984.3. Samples: 13644012520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:51:38,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 06:51:40,066][15401] Updated weights for policy 0, policy_version 832763 (0.0028) [2024-06-25 06:51:43,228][15401] Updated weights for policy 0, policy_version 832773 (0.0048) [2024-06-25 06:51:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 13644169216. Throughput: 0: 43116.0. Samples: 13644276900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:51:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 06:51:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832774_13644169216.pth... [2024-06-25 06:51:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832145_13633863680.pth [2024-06-25 06:51:47,601][15401] Updated weights for policy 0, policy_version 832783 (0.0041) [2024-06-25 06:51:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13644349440. Throughput: 0: 42944.5. Samples: 13644524260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:51:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 06:51:50,928][15401] Updated weights for policy 0, policy_version 832793 (0.0048) [2024-06-25 06:51:53,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 13644595200. Throughput: 0: 42885.0. Samples: 13644648740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:51:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 06:51:54,984][15401] Updated weights for policy 0, policy_version 832803 (0.0042) [2024-06-25 06:51:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 13644791808. Throughput: 0: 43184.0. Samples: 13644923760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:51:58,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 06:51:58,416][15401] Updated weights for policy 0, policy_version 832813 (0.0033) [2024-06-25 06:52:02,959][15401] Updated weights for policy 0, policy_version 832823 (0.0041) [2024-06-25 06:52:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13644988416. Throughput: 0: 43111.6. Samples: 13645180680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 06:52:06,007][15401] Updated weights for policy 0, policy_version 832833 (0.0038) [2024-06-25 06:52:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 13645250560. Throughput: 0: 42970.9. Samples: 13645297820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 06:52:10,731][15401] Updated weights for policy 0, policy_version 832843 (0.0045) [2024-06-25 06:52:11,915][15349] Signal inference workers to stop experience collection... (202000 times) [2024-06-25 06:52:11,918][15349] Signal inference workers to resume experience collection... (202000 times) [2024-06-25 06:52:11,940][15401] InferenceWorker_p0-w0: stopping experience collection (202000 times) [2024-06-25 06:52:11,940][15401] InferenceWorker_p0-w0: resuming experience collection (202000 times) [2024-06-25 06:52:13,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13645447168. Throughput: 0: 42964.5. Samples: 13645563020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 06:52:13,605][15401] Updated weights for policy 0, policy_version 832853 (0.0039) [2024-06-25 06:52:18,388][15401] Updated weights for policy 0, policy_version 832863 (0.0040) [2024-06-25 06:52:18,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 13645627392. Throughput: 0: 42914.2. Samples: 13645820460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 06:52:21,235][15401] Updated weights for policy 0, policy_version 832873 (0.0028) [2024-06-25 06:52:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13645873152. Throughput: 0: 42819.5. Samples: 13645939400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 06:52:25,915][15401] Updated weights for policy 0, policy_version 832883 (0.0038) [2024-06-25 06:52:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 13646086144. Throughput: 0: 42872.5. Samples: 13646206160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:28,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 06:52:28,687][15401] Updated weights for policy 0, policy_version 832893 (0.0032) [2024-06-25 06:52:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13646266368. Throughput: 0: 43082.7. Samples: 13646462980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 06:52:33,978][15401] Updated weights for policy 0, policy_version 832903 (0.0043) [2024-06-25 06:52:36,294][15401] Updated weights for policy 0, policy_version 832913 (0.0039) [2024-06-25 06:52:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13646528512. Throughput: 0: 42870.6. Samples: 13646577920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 06:52:41,629][15401] Updated weights for policy 0, policy_version 832923 (0.0026) [2024-06-25 06:52:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 13646741504. Throughput: 0: 42779.2. Samples: 13646848820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 06:52:43,847][15401] Updated weights for policy 0, policy_version 832933 (0.0034) [2024-06-25 06:52:48,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13646905344. Throughput: 0: 42724.9. Samples: 13647103300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:48,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 06:52:49,149][15401] Updated weights for policy 0, policy_version 832943 (0.0022) [2024-06-25 06:52:51,848][15401] Updated weights for policy 0, policy_version 832953 (0.0037) [2024-06-25 06:52:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42932.3). Total num frames: 13647167488. Throughput: 0: 42804.5. Samples: 13647224020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 06:52:56,718][15401] Updated weights for policy 0, policy_version 832963 (0.0026) [2024-06-25 06:52:58,389][15132] Fps is (10 sec: 49152.3, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 13647396864. Throughput: 0: 42927.9. Samples: 13647494780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:52:58,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 06:52:59,370][15401] Updated weights for policy 0, policy_version 832973 (0.0030) [2024-06-25 06:53:03,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13647560704. Throughput: 0: 42751.5. Samples: 13647744280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:53:03,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 06:53:04,170][15401] Updated weights for policy 0, policy_version 832983 (0.0029) [2024-06-25 06:53:07,177][15401] Updated weights for policy 0, policy_version 832993 (0.0029) [2024-06-25 06:53:08,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.6, 300 sec: 42931.3). Total num frames: 13647790080. Throughput: 0: 42835.6. Samples: 13647867100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:53:08,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 06:53:11,615][15401] Updated weights for policy 0, policy_version 833003 (0.0035) [2024-06-25 06:53:13,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42596.6, 300 sec: 42986.8). Total num frames: 13648003072. Throughput: 0: 42731.1. Samples: 13648129160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:53:13,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 06:53:13,973][15349] Signal inference workers to stop experience collection... (202050 times) [2024-06-25 06:53:13,974][15349] Signal inference workers to resume experience collection... (202050 times) [2024-06-25 06:53:14,007][15401] InferenceWorker_p0-w0: stopping experience collection (202050 times) [2024-06-25 06:53:14,012][15401] InferenceWorker_p0-w0: resuming experience collection (202050 times) [2024-06-25 06:53:15,022][15401] Updated weights for policy 0, policy_version 833013 (0.0034) [2024-06-25 06:53:18,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13648216064. Throughput: 0: 42625.8. Samples: 13648381140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:53:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 06:53:19,515][15401] Updated weights for policy 0, policy_version 833023 (0.0026) [2024-06-25 06:53:22,543][15401] Updated weights for policy 0, policy_version 833033 (0.0034) [2024-06-25 06:53:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 13648445440. Throughput: 0: 42893.8. Samples: 13648508140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 06:53:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 06:53:27,084][15401] Updated weights for policy 0, policy_version 833043 (0.0037) [2024-06-25 06:53:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 13648625664. Throughput: 0: 42674.6. Samples: 13648769180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 06:53:30,625][15401] Updated weights for policy 0, policy_version 833053 (0.0040) [2024-06-25 06:53:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13648871424. Throughput: 0: 42501.3. Samples: 13649015860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 06:53:34,647][15401] Updated weights for policy 0, policy_version 833063 (0.0033) [2024-06-25 06:53:38,061][15401] Updated weights for policy 0, policy_version 833073 (0.0027) [2024-06-25 06:53:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 13649068032. Throughput: 0: 42742.9. Samples: 13649147460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 06:53:42,243][15401] Updated weights for policy 0, policy_version 833083 (0.0031) [2024-06-25 06:53:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 13649297408. Throughput: 0: 42632.0. Samples: 13649413220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 06:53:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833087_13649297408.pth... [2024-06-25 06:53:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832458_13638991872.pth [2024-06-25 06:53:46,108][15401] Updated weights for policy 0, policy_version 833093 (0.0033) [2024-06-25 06:53:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13649510400. Throughput: 0: 42632.0. Samples: 13649662720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 06:53:49,957][15401] Updated weights for policy 0, policy_version 833103 (0.0029) [2024-06-25 06:53:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13649707008. Throughput: 0: 42949.9. Samples: 13649799740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:53,391][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 06:53:53,748][15401] Updated weights for policy 0, policy_version 833113 (0.0037) [2024-06-25 06:53:57,482][15401] Updated weights for policy 0, policy_version 833123 (0.0022) [2024-06-25 06:53:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42987.5). Total num frames: 13649920000. Throughput: 0: 42809.4. Samples: 13650055480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:53:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 06:54:01,161][15401] Updated weights for policy 0, policy_version 833133 (0.0033) [2024-06-25 06:54:03,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13650149376. Throughput: 0: 42800.3. Samples: 13650307160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:03,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-25 06:54:05,103][15401] Updated weights for policy 0, policy_version 833143 (0.0029) [2024-06-25 06:54:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 13650345984. Throughput: 0: 42933.2. Samples: 13650440140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 06:54:08,813][15401] Updated weights for policy 0, policy_version 833153 (0.0032) [2024-06-25 06:54:12,781][15401] Updated weights for policy 0, policy_version 833163 (0.0035) [2024-06-25 06:54:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.1, 300 sec: 43042.7). Total num frames: 13650575360. Throughput: 0: 42882.5. Samples: 13650698900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 06:54:16,693][15401] Updated weights for policy 0, policy_version 833173 (0.0048) [2024-06-25 06:54:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13650788352. Throughput: 0: 43005.8. Samples: 13650951120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:18,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 06:54:20,312][15401] Updated weights for policy 0, policy_version 833183 (0.0035) [2024-06-25 06:54:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 13650984960. Throughput: 0: 42871.2. Samples: 13651076660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:54:24,322][15401] Updated weights for policy 0, policy_version 833193 (0.0035) [2024-06-25 06:54:27,850][15349] Signal inference workers to stop experience collection... (202100 times) [2024-06-25 06:54:27,850][15349] Signal inference workers to resume experience collection... (202100 times) [2024-06-25 06:54:27,865][15401] InferenceWorker_p0-w0: stopping experience collection (202100 times) [2024-06-25 06:54:27,865][15401] InferenceWorker_p0-w0: resuming experience collection (202100 times) [2024-06-25 06:54:28,024][15401] Updated weights for policy 0, policy_version 833203 (0.0039) [2024-06-25 06:54:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13651214336. Throughput: 0: 42650.3. Samples: 13651332480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 06:54:31,984][15401] Updated weights for policy 0, policy_version 833213 (0.0035) [2024-06-25 06:54:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13651427328. Throughput: 0: 42905.4. Samples: 13651593460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 06:54:35,503][15401] Updated weights for policy 0, policy_version 833223 (0.0032) [2024-06-25 06:54:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13651640320. Throughput: 0: 42626.6. Samples: 13651717940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 06:54:39,597][15401] Updated weights for policy 0, policy_version 833233 (0.0045) [2024-06-25 06:54:42,954][15401] Updated weights for policy 0, policy_version 833243 (0.0033) [2024-06-25 06:54:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 13651869696. Throughput: 0: 42756.1. Samples: 13651979500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 06:54:47,715][15401] Updated weights for policy 0, policy_version 833253 (0.0028) [2024-06-25 06:54:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13652049920. Throughput: 0: 42964.2. Samples: 13652240540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 06:54:50,431][15401] Updated weights for policy 0, policy_version 833263 (0.0025) [2024-06-25 06:54:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13652295680. Throughput: 0: 42634.4. Samples: 13652358680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:53,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 06:54:55,403][15401] Updated weights for policy 0, policy_version 833273 (0.0032) [2024-06-25 06:54:58,257][15401] Updated weights for policy 0, policy_version 833283 (0.0025) [2024-06-25 06:54:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 13652508672. Throughput: 0: 42643.6. Samples: 13652617860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:54:58,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 06:55:02,851][15401] Updated weights for policy 0, policy_version 833293 (0.0035) [2024-06-25 06:55:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13652688896. Throughput: 0: 42844.8. Samples: 13652879140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:55:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 06:55:05,725][15401] Updated weights for policy 0, policy_version 833303 (0.0037) [2024-06-25 06:55:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13652934656. Throughput: 0: 42653.8. Samples: 13652996080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:55:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 06:55:10,314][15401] Updated weights for policy 0, policy_version 833313 (0.0030) [2024-06-25 06:55:13,108][15401] Updated weights for policy 0, policy_version 833323 (0.0042) [2024-06-25 06:55:13,392][15132] Fps is (10 sec: 47502.6, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 13653164032. Throughput: 0: 42928.7. Samples: 13653264380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:13,393][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 06:55:17,890][15401] Updated weights for policy 0, policy_version 833333 (0.0032) [2024-06-25 06:55:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13653344256. Throughput: 0: 42886.6. Samples: 13653523360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 06:55:20,739][15401] Updated weights for policy 0, policy_version 833343 (0.0036) [2024-06-25 06:55:23,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13653573632. Throughput: 0: 42807.9. Samples: 13653644300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 06:55:25,567][15401] Updated weights for policy 0, policy_version 833353 (0.0035) [2024-06-25 06:55:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13653786624. Throughput: 0: 42944.3. Samples: 13653912000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 06:55:28,845][15401] Updated weights for policy 0, policy_version 833363 (0.0046) [2024-06-25 06:55:30,514][15349] Signal inference workers to stop experience collection... (202150 times) [2024-06-25 06:55:30,534][15401] InferenceWorker_p0-w0: stopping experience collection (202150 times) [2024-06-25 06:55:30,572][15349] Signal inference workers to resume experience collection... (202150 times) [2024-06-25 06:55:30,578][15401] InferenceWorker_p0-w0: resuming experience collection (202150 times) [2024-06-25 06:55:33,169][15401] Updated weights for policy 0, policy_version 833373 (0.0045) [2024-06-25 06:55:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13653983232. Throughput: 0: 42742.1. Samples: 13654163940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 06:55:36,603][15401] Updated weights for policy 0, policy_version 833383 (0.0041) [2024-06-25 06:55:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 13654228992. Throughput: 0: 42764.8. Samples: 13654283100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 06:55:40,991][15401] Updated weights for policy 0, policy_version 833393 (0.0037) [2024-06-25 06:55:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13654425600. Throughput: 0: 42771.5. Samples: 13654542580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:43,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 06:55:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833401_13654441984.pth... [2024-06-25 06:55:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000832774_13644169216.pth [2024-06-25 06:55:44,096][15401] Updated weights for policy 0, policy_version 833403 (0.0031) [2024-06-25 06:55:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13654622208. Throughput: 0: 42681.3. Samples: 13654799800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 06:55:48,784][15401] Updated weights for policy 0, policy_version 833413 (0.0035) [2024-06-25 06:55:51,890][15401] Updated weights for policy 0, policy_version 833423 (0.0027) [2024-06-25 06:55:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 13654851584. Throughput: 0: 42787.6. Samples: 13654921520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:53,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 06:55:56,735][15401] Updated weights for policy 0, policy_version 833433 (0.0029) [2024-06-25 06:55:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 13655048192. Throughput: 0: 42582.2. Samples: 13655180480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:55:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 06:55:59,495][15401] Updated weights for policy 0, policy_version 833443 (0.0029) [2024-06-25 06:56:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13655261184. Throughput: 0: 42551.2. Samples: 13655438160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 06:56:04,239][15401] Updated weights for policy 0, policy_version 833453 (0.0031) [2024-06-25 06:56:06,977][15401] Updated weights for policy 0, policy_version 833463 (0.0042) [2024-06-25 06:56:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13655490560. Throughput: 0: 42698.8. Samples: 13655565740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:08,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 06:56:11,763][15401] Updated weights for policy 0, policy_version 833473 (0.0025) [2024-06-25 06:56:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42053.9, 300 sec: 42820.9). Total num frames: 13655687168. Throughput: 0: 42445.8. Samples: 13655822060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 06:56:14,542][15401] Updated weights for policy 0, policy_version 833483 (0.0033) [2024-06-25 06:56:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13655900160. Throughput: 0: 42529.8. Samples: 13656077780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 06:56:19,290][15401] Updated weights for policy 0, policy_version 833493 (0.0035) [2024-06-25 06:56:22,297][15401] Updated weights for policy 0, policy_version 833503 (0.0049) [2024-06-25 06:56:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13656145920. Throughput: 0: 42816.1. Samples: 13656209820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 06:56:26,850][15401] Updated weights for policy 0, policy_version 833513 (0.0034) [2024-06-25 06:56:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13656342528. Throughput: 0: 42617.4. Samples: 13656460360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:28,396][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 06:56:30,450][15401] Updated weights for policy 0, policy_version 833523 (0.0028) [2024-06-25 06:56:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13656555520. Throughput: 0: 42695.3. Samples: 13656721080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:33,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 06:56:34,553][15401] Updated weights for policy 0, policy_version 833533 (0.0032) [2024-06-25 06:56:38,011][15401] Updated weights for policy 0, policy_version 833543 (0.0045) [2024-06-25 06:56:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13656784896. Throughput: 0: 42655.0. Samples: 13656841000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 06:56:41,965][15349] Signal inference workers to stop experience collection... (202200 times) [2024-06-25 06:56:41,966][15349] Signal inference workers to resume experience collection... (202200 times) [2024-06-25 06:56:41,978][15401] InferenceWorker_p0-w0: stopping experience collection (202200 times) [2024-06-25 06:56:41,993][15401] InferenceWorker_p0-w0: resuming experience collection (202200 times) [2024-06-25 06:56:42,264][15401] Updated weights for policy 0, policy_version 833553 (0.0037) [2024-06-25 06:56:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13656997888. Throughput: 0: 42710.4. Samples: 13657102440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 06:56:45,780][15401] Updated weights for policy 0, policy_version 833563 (0.0030) [2024-06-25 06:56:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 13657161728. Throughput: 0: 42649.8. Samples: 13657357400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 06:56:50,086][15401] Updated weights for policy 0, policy_version 833573 (0.0037) [2024-06-25 06:56:53,383][15401] Updated weights for policy 0, policy_version 833583 (0.0032) [2024-06-25 06:56:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13657423872. Throughput: 0: 42591.0. Samples: 13657482340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:53,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 06:56:57,562][15401] Updated weights for policy 0, policy_version 833593 (0.0025) [2024-06-25 06:56:58,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13657636864. Throughput: 0: 42715.6. Samples: 13657744260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-25 06:56:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 06:57:01,015][15401] Updated weights for policy 0, policy_version 833603 (0.0034) [2024-06-25 06:57:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 13657800704. Throughput: 0: 42757.2. Samples: 13658001860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 06:57:05,247][15401] Updated weights for policy 0, policy_version 833613 (0.0028) [2024-06-25 06:57:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13658062848. Throughput: 0: 42477.3. Samples: 13658121300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:08,391][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 06:57:08,771][15401] Updated weights for policy 0, policy_version 833623 (0.0039) [2024-06-25 06:57:12,790][15401] Updated weights for policy 0, policy_version 833633 (0.0031) [2024-06-25 06:57:13,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13658275840. Throughput: 0: 42649.1. Samples: 13658379580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:13,391][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 06:57:16,431][15401] Updated weights for policy 0, policy_version 833643 (0.0028) [2024-06-25 06:57:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13658456064. Throughput: 0: 42609.7. Samples: 13658638520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:18,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 06:57:20,349][15401] Updated weights for policy 0, policy_version 833653 (0.0036) [2024-06-25 06:57:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13658701824. Throughput: 0: 42655.5. Samples: 13658760500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 06:57:24,059][15401] Updated weights for policy 0, policy_version 833663 (0.0026) [2024-06-25 06:57:28,116][15401] Updated weights for policy 0, policy_version 833673 (0.0041) [2024-06-25 06:57:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13658898432. Throughput: 0: 42739.6. Samples: 13659025720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 06:57:31,520][15401] Updated weights for policy 0, policy_version 833683 (0.0028) [2024-06-25 06:57:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13659095040. Throughput: 0: 42806.2. Samples: 13659283680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 06:57:35,690][15401] Updated weights for policy 0, policy_version 833693 (0.0022) [2024-06-25 06:57:38,390][15132] Fps is (10 sec: 44235.0, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 13659340800. Throughput: 0: 42834.8. Samples: 13659409920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:38,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 06:57:39,132][15401] Updated weights for policy 0, policy_version 833703 (0.0038) [2024-06-25 06:57:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 13659537408. Throughput: 0: 42777.3. Samples: 13659669240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 06:57:43,420][15401] Updated weights for policy 0, policy_version 833713 (0.0031) [2024-06-25 06:57:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833713_13659553792.pth... [2024-06-25 06:57:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833087_13649297408.pth [2024-06-25 06:57:46,769][15401] Updated weights for policy 0, policy_version 833723 (0.0042) [2024-06-25 06:57:48,390][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13659750400. Throughput: 0: 42665.8. Samples: 13659921820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 06:57:51,098][15401] Updated weights for policy 0, policy_version 833733 (0.0028) [2024-06-25 06:57:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13659979776. Throughput: 0: 42916.8. Samples: 13660052560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:53,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 06:57:54,242][15401] Updated weights for policy 0, policy_version 833743 (0.0039) [2024-06-25 06:57:57,421][15349] Signal inference workers to stop experience collection... (202250 times) [2024-06-25 06:57:57,452][15401] InferenceWorker_p0-w0: stopping experience collection (202250 times) [2024-06-25 06:57:57,475][15349] Signal inference workers to resume experience collection... (202250 times) [2024-06-25 06:57:57,476][15401] InferenceWorker_p0-w0: resuming experience collection (202250 times) [2024-06-25 06:57:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13660176384. Throughput: 0: 43087.4. Samples: 13660318500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:57:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 06:57:58,949][15401] Updated weights for policy 0, policy_version 833753 (0.0034) [2024-06-25 06:58:02,070][15401] Updated weights for policy 0, policy_version 833763 (0.0043) [2024-06-25 06:58:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.8, 300 sec: 42765.4). Total num frames: 13660405760. Throughput: 0: 42941.9. Samples: 13660570900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 06:58:06,422][15401] Updated weights for policy 0, policy_version 833773 (0.0039) [2024-06-25 06:58:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13660618752. Throughput: 0: 43169.9. Samples: 13660703140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 06:58:09,738][15401] Updated weights for policy 0, policy_version 833783 (0.0041) [2024-06-25 06:58:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 13660831744. Throughput: 0: 42906.6. Samples: 13660956520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 06:58:13,917][15401] Updated weights for policy 0, policy_version 833793 (0.0039) [2024-06-25 06:58:17,246][15401] Updated weights for policy 0, policy_version 833803 (0.0026) [2024-06-25 06:58:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13661044736. Throughput: 0: 42926.3. Samples: 13661215360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 06:58:21,325][15401] Updated weights for policy 0, policy_version 833813 (0.0029) [2024-06-25 06:58:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13661257728. Throughput: 0: 42860.2. Samples: 13661338620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:23,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 06:58:25,026][15401] Updated weights for policy 0, policy_version 833823 (0.0037) [2024-06-25 06:58:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13661470720. Throughput: 0: 42887.1. Samples: 13661599160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:28,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 06:58:28,876][15401] Updated weights for policy 0, policy_version 833833 (0.0040) [2024-06-25 06:58:32,658][15401] Updated weights for policy 0, policy_version 833843 (0.0036) [2024-06-25 06:58:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13661683712. Throughput: 0: 42950.3. Samples: 13661854580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 06:58:36,532][15401] Updated weights for policy 0, policy_version 833853 (0.0023) [2024-06-25 06:58:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13661896704. Throughput: 0: 42845.8. Samples: 13661980620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:38,396][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 06:58:40,329][15401] Updated weights for policy 0, policy_version 833863 (0.0036) [2024-06-25 06:58:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13662109696. Throughput: 0: 42766.7. Samples: 13662243000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 06:58:44,478][15401] Updated weights for policy 0, policy_version 833873 (0.0030) [2024-06-25 06:58:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13662322688. Throughput: 0: 42746.9. Samples: 13662494520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 06:58:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 06:58:48,480][15401] Updated weights for policy 0, policy_version 833883 (0.0039) [2024-06-25 06:58:51,963][15401] Updated weights for policy 0, policy_version 833893 (0.0041) [2024-06-25 06:58:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13662552064. Throughput: 0: 42656.4. Samples: 13662622680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:58:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 06:58:56,060][15401] Updated weights for policy 0, policy_version 833903 (0.0036) [2024-06-25 06:58:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13662748672. Throughput: 0: 42776.0. Samples: 13662881440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:58:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 06:58:59,964][15401] Updated weights for policy 0, policy_version 833913 (0.0038) [2024-06-25 06:59:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13662961664. Throughput: 0: 42689.3. Samples: 13663136380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:03,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 06:59:03,895][15401] Updated weights for policy 0, policy_version 833923 (0.0040) [2024-06-25 06:59:07,507][15401] Updated weights for policy 0, policy_version 833933 (0.0032) [2024-06-25 06:59:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13663191040. Throughput: 0: 42759.7. Samples: 13663262800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 06:59:11,527][15401] Updated weights for policy 0, policy_version 833943 (0.0034) [2024-06-25 06:59:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13663371264. Throughput: 0: 42612.2. Samples: 13663516700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 06:59:15,119][15401] Updated weights for policy 0, policy_version 833953 (0.0036) [2024-06-25 06:59:15,568][15349] Signal inference workers to stop experience collection... (202300 times) [2024-06-25 06:59:15,568][15349] Signal inference workers to resume experience collection... (202300 times) [2024-06-25 06:59:15,597][15401] InferenceWorker_p0-w0: stopping experience collection (202300 times) [2024-06-25 06:59:15,597][15401] InferenceWorker_p0-w0: resuming experience collection (202300 times) [2024-06-25 06:59:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13663617024. Throughput: 0: 42599.6. Samples: 13663771560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 06:59:19,135][15401] Updated weights for policy 0, policy_version 833963 (0.0039) [2024-06-25 06:59:22,819][15401] Updated weights for policy 0, policy_version 833973 (0.0027) [2024-06-25 06:59:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13663830016. Throughput: 0: 42600.5. Samples: 13663897640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 06:59:26,637][15401] Updated weights for policy 0, policy_version 833983 (0.0037) [2024-06-25 06:59:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13664010240. Throughput: 0: 42498.2. Samples: 13664155420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 06:59:30,330][15401] Updated weights for policy 0, policy_version 833993 (0.0032) [2024-06-25 06:59:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13664256000. Throughput: 0: 42768.1. Samples: 13664419080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:33,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 06:59:34,526][15401] Updated weights for policy 0, policy_version 834003 (0.0028) [2024-06-25 06:59:37,856][15401] Updated weights for policy 0, policy_version 834013 (0.0020) [2024-06-25 06:59:38,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13664485376. Throughput: 0: 42835.7. Samples: 13664550280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:38,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 06:59:41,955][15401] Updated weights for policy 0, policy_version 834023 (0.0030) [2024-06-25 06:59:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13664665600. Throughput: 0: 42678.0. Samples: 13664801960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:43,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 06:59:43,500][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834026_13664681984.pth... [2024-06-25 06:59:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833401_13654441984.pth [2024-06-25 06:59:45,425][15401] Updated weights for policy 0, policy_version 834033 (0.0030) [2024-06-25 06:59:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13664894976. Throughput: 0: 42718.1. Samples: 13665058700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:48,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 06:59:49,791][15401] Updated weights for policy 0, policy_version 834043 (0.0026) [2024-06-25 06:59:53,267][15401] Updated weights for policy 0, policy_version 834053 (0.0036) [2024-06-25 06:59:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13665124352. Throughput: 0: 42792.0. Samples: 13665188440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 06:59:57,328][15401] Updated weights for policy 0, policy_version 834063 (0.0033) [2024-06-25 06:59:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 13665320960. Throughput: 0: 42789.2. Samples: 13665442220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 06:59:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 07:00:00,788][15401] Updated weights for policy 0, policy_version 834073 (0.0031) [2024-06-25 07:00:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13665533952. Throughput: 0: 42944.0. Samples: 13665704040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:03,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 07:00:05,119][15401] Updated weights for policy 0, policy_version 834083 (0.0035) [2024-06-25 07:00:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13665763328. Throughput: 0: 43054.3. Samples: 13665835080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 07:00:08,437][15401] Updated weights for policy 0, policy_version 834093 (0.0042) [2024-06-25 07:00:12,518][15401] Updated weights for policy 0, policy_version 834103 (0.0036) [2024-06-25 07:00:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13665959936. Throughput: 0: 43068.4. Samples: 13666093500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 07:00:15,967][15401] Updated weights for policy 0, policy_version 834113 (0.0027) [2024-06-25 07:00:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 13666189312. Throughput: 0: 42885.3. Samples: 13666349020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:18,393][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 07:00:20,102][15401] Updated weights for policy 0, policy_version 834123 (0.0040) [2024-06-25 07:00:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13666418688. Throughput: 0: 42877.3. Samples: 13666479760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:23,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 07:00:23,468][15401] Updated weights for policy 0, policy_version 834133 (0.0025) [2024-06-25 07:00:27,951][15401] Updated weights for policy 0, policy_version 834143 (0.0022) [2024-06-25 07:00:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13666615296. Throughput: 0: 43105.4. Samples: 13666741700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 07:00:29,746][15349] Signal inference workers to stop experience collection... (202350 times) [2024-06-25 07:00:29,752][15349] Signal inference workers to resume experience collection... (202350 times) [2024-06-25 07:00:29,793][15401] InferenceWorker_p0-w0: stopping experience collection (202350 times) [2024-06-25 07:00:29,793][15401] InferenceWorker_p0-w0: resuming experience collection (202350 times) [2024-06-25 07:00:30,927][15401] Updated weights for policy 0, policy_version 834153 (0.0036) [2024-06-25 07:00:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13666844672. Throughput: 0: 43137.0. Samples: 13666999860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 07:00:35,553][15401] Updated weights for policy 0, policy_version 834163 (0.0038) [2024-06-25 07:00:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13667074048. Throughput: 0: 43143.0. Samples: 13667129880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:00:38,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 07:00:38,786][15401] Updated weights for policy 0, policy_version 834173 (0.0028) [2024-06-25 07:00:43,073][15401] Updated weights for policy 0, policy_version 834183 (0.0042) [2024-06-25 07:00:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 13667254272. Throughput: 0: 43249.9. Samples: 13667388460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:00:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 07:00:46,656][15401] Updated weights for policy 0, policy_version 834193 (0.0030) [2024-06-25 07:00:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13667483648. Throughput: 0: 42881.8. Samples: 13667633720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:00:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 07:00:50,835][15401] Updated weights for policy 0, policy_version 834203 (0.0027) [2024-06-25 07:00:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13667696640. Throughput: 0: 42946.0. Samples: 13667767660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:00:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 07:00:54,318][15401] Updated weights for policy 0, policy_version 834213 (0.0032) [2024-06-25 07:00:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13667893248. Throughput: 0: 42952.5. Samples: 13668026360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:00:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 07:00:58,528][15401] Updated weights for policy 0, policy_version 834223 (0.0032) [2024-06-25 07:01:02,037][15401] Updated weights for policy 0, policy_version 834233 (0.0041) [2024-06-25 07:01:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13668122624. Throughput: 0: 42788.1. Samples: 13668274380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 07:01:06,292][15401] Updated weights for policy 0, policy_version 834243 (0.0037) [2024-06-25 07:01:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 13668319232. Throughput: 0: 42884.8. Samples: 13668409580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:08,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 07:01:09,814][15401] Updated weights for policy 0, policy_version 834253 (0.0040) [2024-06-25 07:01:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13668515840. Throughput: 0: 42647.2. Samples: 13668660820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:13,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 07:01:13,933][15401] Updated weights for policy 0, policy_version 834263 (0.0030) [2024-06-25 07:01:17,565][15401] Updated weights for policy 0, policy_version 834273 (0.0045) [2024-06-25 07:01:18,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 13668777984. Throughput: 0: 42554.6. Samples: 13668914820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 07:01:21,442][15401] Updated weights for policy 0, policy_version 834283 (0.0027) [2024-06-25 07:01:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13668958208. Throughput: 0: 42755.6. Samples: 13669053880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 07:01:24,985][15401] Updated weights for policy 0, policy_version 834293 (0.0034) [2024-06-25 07:01:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13669171200. Throughput: 0: 42516.8. Samples: 13669301720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 07:01:29,304][15401] Updated weights for policy 0, policy_version 834303 (0.0030) [2024-06-25 07:01:32,733][15401] Updated weights for policy 0, policy_version 834313 (0.0034) [2024-06-25 07:01:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13669416960. Throughput: 0: 42785.8. Samples: 13669559080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 07:01:36,828][15401] Updated weights for policy 0, policy_version 834323 (0.0030) [2024-06-25 07:01:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13669613568. Throughput: 0: 42917.1. Samples: 13669698920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:38,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 07:01:40,315][15401] Updated weights for policy 0, policy_version 834333 (0.0025) [2024-06-25 07:01:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13669810176. Throughput: 0: 42847.6. Samples: 13669954500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:43,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 07:01:43,432][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834340_13669826560.pth... [2024-06-25 07:01:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000833713_13659553792.pth [2024-06-25 07:01:44,280][15401] Updated weights for policy 0, policy_version 834343 (0.0034) [2024-06-25 07:01:47,717][15401] Updated weights for policy 0, policy_version 834353 (0.0041) [2024-06-25 07:01:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13670055936. Throughput: 0: 43019.5. Samples: 13670210260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:48,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 07:01:52,080][15401] Updated weights for policy 0, policy_version 834363 (0.0033) [2024-06-25 07:01:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13670252544. Throughput: 0: 43034.3. Samples: 13670346120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:53,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 07:01:55,330][15401] Updated weights for policy 0, policy_version 834373 (0.0025) [2024-06-25 07:01:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 13670465536. Throughput: 0: 43115.5. Samples: 13670601020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:01:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 07:01:59,589][15401] Updated weights for policy 0, policy_version 834383 (0.0050) [2024-06-25 07:02:02,951][15401] Updated weights for policy 0, policy_version 834393 (0.0032) [2024-06-25 07:02:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13670694912. Throughput: 0: 43005.8. Samples: 13670850080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:03,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 07:02:07,285][15401] Updated weights for policy 0, policy_version 834403 (0.0044) [2024-06-25 07:02:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13670907904. Throughput: 0: 42857.7. Samples: 13670982480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:08,395][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 07:02:10,830][15401] Updated weights for policy 0, policy_version 834413 (0.0037) [2024-06-25 07:02:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13671104512. Throughput: 0: 42917.4. Samples: 13671233000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 07:02:15,130][15401] Updated weights for policy 0, policy_version 834423 (0.0033) [2024-06-25 07:02:15,447][15349] Signal inference workers to stop experience collection... (202400 times) [2024-06-25 07:02:15,485][15401] InferenceWorker_p0-w0: stopping experience collection (202400 times) [2024-06-25 07:02:15,520][15349] Signal inference workers to resume experience collection... (202400 times) [2024-06-25 07:02:15,523][15401] InferenceWorker_p0-w0: resuming experience collection (202400 times) [2024-06-25 07:02:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13671333888. Throughput: 0: 42872.9. Samples: 13671488360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 07:02:18,435][15401] Updated weights for policy 0, policy_version 834433 (0.0023) [2024-06-25 07:02:22,799][15401] Updated weights for policy 0, policy_version 834443 (0.0033) [2024-06-25 07:02:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13671546880. Throughput: 0: 42764.4. Samples: 13671623320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 07:02:26,202][15401] Updated weights for policy 0, policy_version 834453 (0.0022) [2024-06-25 07:02:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13671743488. Throughput: 0: 42715.9. Samples: 13671876720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 07:02:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 07:02:30,504][15401] Updated weights for policy 0, policy_version 834463 (0.0028) [2024-06-25 07:02:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13671972864. Throughput: 0: 42675.5. Samples: 13672130660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 07:02:33,784][15401] Updated weights for policy 0, policy_version 834473 (0.0032) [2024-06-25 07:02:37,970][15401] Updated weights for policy 0, policy_version 834483 (0.0040) [2024-06-25 07:02:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13672185856. Throughput: 0: 42538.2. Samples: 13672260340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:38,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 07:02:41,349][15401] Updated weights for policy 0, policy_version 834493 (0.0038) [2024-06-25 07:02:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 13672398848. Throughput: 0: 42605.3. Samples: 13672518260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 07:02:45,358][15401] Updated weights for policy 0, policy_version 834503 (0.0039) [2024-06-25 07:02:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13672611840. Throughput: 0: 43077.0. Samples: 13672788540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 07:02:48,768][15401] Updated weights for policy 0, policy_version 834513 (0.0033) [2024-06-25 07:02:52,855][15401] Updated weights for policy 0, policy_version 834523 (0.0030) [2024-06-25 07:02:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13672824832. Throughput: 0: 43005.8. Samples: 13672917740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 07:02:56,312][15401] Updated weights for policy 0, policy_version 834533 (0.0041) [2024-06-25 07:02:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13673037824. Throughput: 0: 42977.8. Samples: 13673167000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:02:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 07:03:00,974][15401] Updated weights for policy 0, policy_version 834543 (0.0030) [2024-06-25 07:03:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13673267200. Throughput: 0: 43001.2. Samples: 13673423420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:03,398][15132] Avg episode reward: [(0, '0.203')] [2024-06-25 07:03:04,172][15401] Updated weights for policy 0, policy_version 834553 (0.0033) [2024-06-25 07:03:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13673463808. Throughput: 0: 42917.8. Samples: 13673554620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:08,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 07:03:08,537][15401] Updated weights for policy 0, policy_version 834563 (0.0037) [2024-06-25 07:03:11,792][15401] Updated weights for policy 0, policy_version 834573 (0.0036) [2024-06-25 07:03:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13673676800. Throughput: 0: 42793.2. Samples: 13673802520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:13,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 07:03:16,223][15401] Updated weights for policy 0, policy_version 834583 (0.0041) [2024-06-25 07:03:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13673889792. Throughput: 0: 42953.4. Samples: 13674063560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:18,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-25 07:03:19,395][15401] Updated weights for policy 0, policy_version 834593 (0.0038) [2024-06-25 07:03:23,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13674102784. Throughput: 0: 42897.8. Samples: 13674190740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:23,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-25 07:03:23,748][15401] Updated weights for policy 0, policy_version 834603 (0.0042) [2024-06-25 07:03:27,050][15401] Updated weights for policy 0, policy_version 834613 (0.0036) [2024-06-25 07:03:28,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43139.9, 300 sec: 42875.2). Total num frames: 13674332160. Throughput: 0: 42714.0. Samples: 13674440660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:28,396][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 07:03:31,447][15401] Updated weights for policy 0, policy_version 834623 (0.0030) [2024-06-25 07:03:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 13674545152. Throughput: 0: 42558.1. Samples: 13674703760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:33,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 07:03:34,627][15401] Updated weights for policy 0, policy_version 834633 (0.0033) [2024-06-25 07:03:38,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13674758144. Throughput: 0: 42522.2. Samples: 13674831240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 07:03:39,238][15401] Updated weights for policy 0, policy_version 834643 (0.0050) [2024-06-25 07:03:42,350][15401] Updated weights for policy 0, policy_version 834653 (0.0036) [2024-06-25 07:03:43,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13674971136. Throughput: 0: 42593.7. Samples: 13675083720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:43,390][15132] Avg episode reward: [(0, '0.222')] [2024-06-25 07:03:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834654_13674971136.pth... [2024-06-25 07:03:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834026_13664681984.pth [2024-06-25 07:03:46,934][15349] Signal inference workers to stop experience collection... (202450 times) [2024-06-25 07:03:46,934][15349] Signal inference workers to resume experience collection... (202450 times) [2024-06-25 07:03:46,984][15401] InferenceWorker_p0-w0: stopping experience collection (202450 times) [2024-06-25 07:03:46,984][15401] InferenceWorker_p0-w0: resuming experience collection (202450 times) [2024-06-25 07:03:47,074][15401] Updated weights for policy 0, policy_version 834663 (0.0042) [2024-06-25 07:03:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13675167744. Throughput: 0: 42618.8. Samples: 13675341260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 07:03:50,022][15401] Updated weights for policy 0, policy_version 834673 (0.0027) [2024-06-25 07:03:53,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 13675397120. Throughput: 0: 42497.8. Samples: 13675467120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:53,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 07:03:54,797][15401] Updated weights for policy 0, policy_version 834683 (0.0036) [2024-06-25 07:03:57,592][15401] Updated weights for policy 0, policy_version 834693 (0.0036) [2024-06-25 07:03:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13675610112. Throughput: 0: 42503.7. Samples: 13675715080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:03:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 07:04:02,423][15401] Updated weights for policy 0, policy_version 834703 (0.0033) [2024-06-25 07:04:03,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13675790336. Throughput: 0: 42435.9. Samples: 13675973180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:04:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 07:04:05,862][15401] Updated weights for policy 0, policy_version 834713 (0.0033) [2024-06-25 07:04:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13676019712. Throughput: 0: 42279.5. Samples: 13676093320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:04:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 07:04:10,611][15401] Updated weights for policy 0, policy_version 834723 (0.0039) [2024-06-25 07:04:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 13676249088. Throughput: 0: 42570.0. Samples: 13676356040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:04:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 07:04:13,521][15401] Updated weights for policy 0, policy_version 834733 (0.0046) [2024-06-25 07:04:18,312][15401] Updated weights for policy 0, policy_version 834743 (0.0034) [2024-06-25 07:04:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13676429312. Throughput: 0: 42493.0. Samples: 13676615840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:04:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 07:04:21,512][15401] Updated weights for policy 0, policy_version 834753 (0.0033) [2024-06-25 07:04:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 13676675072. Throughput: 0: 42394.6. Samples: 13676739100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:23,392][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 07:04:25,993][15401] Updated weights for policy 0, policy_version 834763 (0.0035) [2024-06-25 07:04:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 13676888064. Throughput: 0: 42452.1. Samples: 13676994060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 07:04:29,188][15401] Updated weights for policy 0, policy_version 834773 (0.0029) [2024-06-25 07:04:33,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 13677068288. Throughput: 0: 42600.8. Samples: 13677258300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:33,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 07:04:33,654][15401] Updated weights for policy 0, policy_version 834783 (0.0039) [2024-06-25 07:04:36,701][15401] Updated weights for policy 0, policy_version 834793 (0.0034) [2024-06-25 07:04:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13677314048. Throughput: 0: 42483.2. Samples: 13677378760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 07:04:41,329][15401] Updated weights for policy 0, policy_version 834803 (0.0031) [2024-06-25 07:04:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13677527040. Throughput: 0: 42825.2. Samples: 13677642220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:43,393][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 07:04:44,467][15401] Updated weights for policy 0, policy_version 834813 (0.0040) [2024-06-25 07:04:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13677707264. Throughput: 0: 42806.3. Samples: 13677899460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 07:04:48,961][15401] Updated weights for policy 0, policy_version 834823 (0.0046) [2024-06-25 07:04:52,058][15401] Updated weights for policy 0, policy_version 834833 (0.0032) [2024-06-25 07:04:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 13677953024. Throughput: 0: 42898.3. Samples: 13678023740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 07:04:56,498][15401] Updated weights for policy 0, policy_version 834843 (0.0031) [2024-06-25 07:04:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13678166016. Throughput: 0: 42848.0. Samples: 13678284200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:04:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 07:04:59,852][15401] Updated weights for policy 0, policy_version 834853 (0.0044) [2024-06-25 07:05:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13678362624. Throughput: 0: 42655.8. Samples: 13678535360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 07:05:04,186][15401] Updated weights for policy 0, policy_version 834863 (0.0029) [2024-06-25 07:05:07,555][15401] Updated weights for policy 0, policy_version 834873 (0.0042) [2024-06-25 07:05:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13678592000. Throughput: 0: 42670.7. Samples: 13678659180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 07:05:11,751][15401] Updated weights for policy 0, policy_version 834883 (0.0035) [2024-06-25 07:05:12,119][15349] Signal inference workers to stop experience collection... (202500 times) [2024-06-25 07:05:12,168][15401] InferenceWorker_p0-w0: stopping experience collection (202500 times) [2024-06-25 07:05:12,236][15349] Signal inference workers to resume experience collection... (202500 times) [2024-06-25 07:05:12,236][15401] InferenceWorker_p0-w0: resuming experience collection (202500 times) [2024-06-25 07:05:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13678804992. Throughput: 0: 42914.3. Samples: 13678925200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 07:05:15,031][15401] Updated weights for policy 0, policy_version 834893 (0.0033) [2024-06-25 07:05:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13679001600. Throughput: 0: 42704.5. Samples: 13679180000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 07:05:19,149][15401] Updated weights for policy 0, policy_version 834903 (0.0039) [2024-06-25 07:05:22,827][15401] Updated weights for policy 0, policy_version 834913 (0.0039) [2024-06-25 07:05:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 13679247360. Throughput: 0: 42874.1. Samples: 13679308100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:23,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 07:05:26,603][15401] Updated weights for policy 0, policy_version 834923 (0.0034) [2024-06-25 07:05:28,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 13679460352. Throughput: 0: 42886.2. Samples: 13679572200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:28,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 07:05:30,331][15401] Updated weights for policy 0, policy_version 834933 (0.0028) [2024-06-25 07:05:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 13679656960. Throughput: 0: 42809.2. Samples: 13679825980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:33,393][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 07:05:34,160][15401] Updated weights for policy 0, policy_version 834943 (0.0048) [2024-06-25 07:05:38,202][15401] Updated weights for policy 0, policy_version 834953 (0.0027) [2024-06-25 07:05:38,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13679869952. Throughput: 0: 42819.4. Samples: 13679950620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 07:05:42,109][15401] Updated weights for policy 0, policy_version 834963 (0.0028) [2024-06-25 07:05:43,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13680082944. Throughput: 0: 42789.7. Samples: 13680209740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 07:05:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834966_13680082944.pth... [2024-06-25 07:05:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834340_13669826560.pth [2024-06-25 07:05:45,909][15401] Updated weights for policy 0, policy_version 834973 (0.0030) [2024-06-25 07:05:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13680312320. Throughput: 0: 42898.8. Samples: 13680465800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 07:05:49,905][15401] Updated weights for policy 0, policy_version 834983 (0.0034) [2024-06-25 07:05:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13680492544. Throughput: 0: 42942.7. Samples: 13680591600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 07:05:53,664][15401] Updated weights for policy 0, policy_version 834993 (0.0047) [2024-06-25 07:05:57,577][15401] Updated weights for policy 0, policy_version 835003 (0.0026) [2024-06-25 07:05:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13680721920. Throughput: 0: 42783.0. Samples: 13680850440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:05:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 07:06:01,236][15401] Updated weights for policy 0, policy_version 835013 (0.0023) [2024-06-25 07:06:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13680951296. Throughput: 0: 42745.7. Samples: 13681103560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:06:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 07:06:05,078][15401] Updated weights for policy 0, policy_version 835023 (0.0044) [2024-06-25 07:06:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13681131520. Throughput: 0: 42799.7. Samples: 13681234080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-25 07:06:08,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 07:06:08,839][15401] Updated weights for policy 0, policy_version 835033 (0.0031) [2024-06-25 07:06:13,110][15401] Updated weights for policy 0, policy_version 835043 (0.0034) [2024-06-25 07:06:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13681360896. Throughput: 0: 42685.7. Samples: 13681493060. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:13,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 07:06:16,544][15401] Updated weights for policy 0, policy_version 835053 (0.0030) [2024-06-25 07:06:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13681590272. Throughput: 0: 42588.6. Samples: 13681742360. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 07:06:20,857][15401] Updated weights for policy 0, policy_version 835063 (0.0041) [2024-06-25 07:06:21,069][15349] Signal inference workers to stop experience collection... (202550 times) [2024-06-25 07:06:21,115][15401] InferenceWorker_p0-w0: stopping experience collection (202550 times) [2024-06-25 07:06:21,130][15349] Signal inference workers to resume experience collection... (202550 times) [2024-06-25 07:06:21,136][15401] InferenceWorker_p0-w0: resuming experience collection (202550 times) [2024-06-25 07:06:23,396][15132] Fps is (10 sec: 42581.3, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 13681786880. Throughput: 0: 42843.3. Samples: 13681878840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:23,396][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 07:06:24,114][15401] Updated weights for policy 0, policy_version 835073 (0.0028) [2024-06-25 07:06:28,239][15401] Updated weights for policy 0, policy_version 835083 (0.0037) [2024-06-25 07:06:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 13681999872. Throughput: 0: 42698.5. Samples: 13682131180. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:28,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 07:06:31,629][15401] Updated weights for policy 0, policy_version 835093 (0.0048) [2024-06-25 07:06:33,389][15132] Fps is (10 sec: 44265.5, 60 sec: 42873.3, 300 sec: 42765.0). Total num frames: 13682229248. Throughput: 0: 42781.8. Samples: 13682390980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:33,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 07:06:35,811][15401] Updated weights for policy 0, policy_version 835103 (0.0031) [2024-06-25 07:06:38,392][15132] Fps is (10 sec: 44227.1, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 13682442240. Throughput: 0: 42923.5. Samples: 13682523260. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:38,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 07:06:39,052][15401] Updated weights for policy 0, policy_version 835113 (0.0027) [2024-06-25 07:06:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13682638848. Throughput: 0: 42935.1. Samples: 13682782520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:43,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 07:06:43,452][15401] Updated weights for policy 0, policy_version 835123 (0.0026) [2024-06-25 07:06:46,604][15401] Updated weights for policy 0, policy_version 835133 (0.0039) [2024-06-25 07:06:48,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13682884608. Throughput: 0: 43048.8. Samples: 13683040760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:48,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 07:06:51,096][15401] Updated weights for policy 0, policy_version 835143 (0.0047) [2024-06-25 07:06:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13683081216. Throughput: 0: 42969.3. Samples: 13683167700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:06:54,142][15401] Updated weights for policy 0, policy_version 835153 (0.0027) [2024-06-25 07:06:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13683294208. Throughput: 0: 42847.6. Samples: 13683421100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:06:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 07:06:58,617][15401] Updated weights for policy 0, policy_version 835163 (0.0040) [2024-06-25 07:07:02,376][15401] Updated weights for policy 0, policy_version 835173 (0.0030) [2024-06-25 07:07:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13683507200. Throughput: 0: 43035.4. Samples: 13683678960. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:03,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-25 07:07:06,612][15401] Updated weights for policy 0, policy_version 835183 (0.0042) [2024-06-25 07:07:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 13683736576. Throughput: 0: 42907.0. Samples: 13683809380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:08,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 07:07:10,025][15401] Updated weights for policy 0, policy_version 835193 (0.0044) [2024-06-25 07:07:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13683933184. Throughput: 0: 42853.0. Samples: 13684059560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:07:14,188][15401] Updated weights for policy 0, policy_version 835203 (0.0036) [2024-06-25 07:07:17,654][15401] Updated weights for policy 0, policy_version 835213 (0.0034) [2024-06-25 07:07:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13684146176. Throughput: 0: 42667.5. Samples: 13684311020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 07:07:21,696][15401] Updated weights for policy 0, policy_version 835223 (0.0029) [2024-06-25 07:07:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43149.2, 300 sec: 42820.6). Total num frames: 13684375552. Throughput: 0: 42711.7. Samples: 13684445180. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 07:07:25,531][15401] Updated weights for policy 0, policy_version 835233 (0.0033) [2024-06-25 07:07:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13684555776. Throughput: 0: 42594.8. Samples: 13684699280. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 07:07:29,241][15401] Updated weights for policy 0, policy_version 835243 (0.0037) [2024-06-25 07:07:33,197][15401] Updated weights for policy 0, policy_version 835253 (0.0035) [2024-06-25 07:07:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13684785152. Throughput: 0: 42456.9. Samples: 13684951320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:07:36,252][15349] Signal inference workers to stop experience collection... (202600 times) [2024-06-25 07:07:36,253][15349] Signal inference workers to resume experience collection... (202600 times) [2024-06-25 07:07:36,269][15401] InferenceWorker_p0-w0: stopping experience collection (202600 times) [2024-06-25 07:07:36,292][15401] InferenceWorker_p0-w0: resuming experience collection (202600 times) [2024-06-25 07:07:36,721][15401] Updated weights for policy 0, policy_version 835263 (0.0032) [2024-06-25 07:07:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13685014528. Throughput: 0: 42654.2. Samples: 13685087140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:38,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 07:07:40,889][15401] Updated weights for policy 0, policy_version 835273 (0.0027) [2024-06-25 07:07:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13685194752. Throughput: 0: 42751.9. Samples: 13685344940. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 07:07:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835279_13685211136.pth... [2024-06-25 07:07:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834654_13674971136.pth [2024-06-25 07:07:44,194][15401] Updated weights for policy 0, policy_version 835283 (0.0034) [2024-06-25 07:07:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13685424128. Throughput: 0: 42683.2. Samples: 13685599700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 07:07:48,858][15401] Updated weights for policy 0, policy_version 835293 (0.0038) [2024-06-25 07:07:51,777][15401] Updated weights for policy 0, policy_version 835303 (0.0044) [2024-06-25 07:07:53,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13685653504. Throughput: 0: 42669.0. Samples: 13685729480. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 07:07:56,565][15401] Updated weights for policy 0, policy_version 835313 (0.0040) [2024-06-25 07:07:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 13685817344. Throughput: 0: 42695.5. Samples: 13685980860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-06-25 07:07:58,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 07:07:59,697][15401] Updated weights for policy 0, policy_version 835323 (0.0033) [2024-06-25 07:08:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13686063104. Throughput: 0: 42659.5. Samples: 13686230700. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:08:04,308][15401] Updated weights for policy 0, policy_version 835333 (0.0033) [2024-06-25 07:08:07,615][15401] Updated weights for policy 0, policy_version 835343 (0.0029) [2024-06-25 07:08:08,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13686292480. Throughput: 0: 42618.1. Samples: 13686363000. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:08,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 07:08:11,869][15401] Updated weights for policy 0, policy_version 835353 (0.0030) [2024-06-25 07:08:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13686472704. Throughput: 0: 42778.6. Samples: 13686624320. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 07:08:15,257][15401] Updated weights for policy 0, policy_version 835363 (0.0045) [2024-06-25 07:08:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13686718464. Throughput: 0: 42727.7. Samples: 13686874060. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 07:08:19,323][15401] Updated weights for policy 0, policy_version 835373 (0.0026) [2024-06-25 07:08:22,874][15401] Updated weights for policy 0, policy_version 835383 (0.0036) [2024-06-25 07:08:23,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42869.7, 300 sec: 42765.6). Total num frames: 13686947840. Throughput: 0: 42791.5. Samples: 13687012860. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:23,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 07:08:26,958][15401] Updated weights for policy 0, policy_version 835393 (0.0041) [2024-06-25 07:08:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 13687128064. Throughput: 0: 42614.8. Samples: 13687262600. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 07:08:30,485][15401] Updated weights for policy 0, policy_version 835403 (0.0043) [2024-06-25 07:08:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13687357440. Throughput: 0: 42559.9. Samples: 13687514900. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:08:34,576][15401] Updated weights for policy 0, policy_version 835413 (0.0033) [2024-06-25 07:08:38,086][15401] Updated weights for policy 0, policy_version 835423 (0.0036) [2024-06-25 07:08:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13687586816. Throughput: 0: 42767.4. Samples: 13687654020. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:38,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 07:08:41,937][15401] Updated weights for policy 0, policy_version 835433 (0.0031) [2024-06-25 07:08:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13687767040. Throughput: 0: 42792.5. Samples: 13687906520. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:43,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 07:08:43,723][15349] Signal inference workers to stop experience collection... (202650 times) [2024-06-25 07:08:43,724][15349] Signal inference workers to resume experience collection... (202650 times) [2024-06-25 07:08:43,739][15401] InferenceWorker_p0-w0: stopping experience collection (202650 times) [2024-06-25 07:08:43,739][15401] InferenceWorker_p0-w0: resuming experience collection (202650 times) [2024-06-25 07:08:45,716][15401] Updated weights for policy 0, policy_version 835443 (0.0025) [2024-06-25 07:08:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13687996416. Throughput: 0: 42859.9. Samples: 13688159400. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 07:08:49,915][15401] Updated weights for policy 0, policy_version 835453 (0.0039) [2024-06-25 07:08:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13688209408. Throughput: 0: 42999.2. Samples: 13688297960. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 07:08:53,495][15401] Updated weights for policy 0, policy_version 835463 (0.0037) [2024-06-25 07:08:57,577][15401] Updated weights for policy 0, policy_version 835473 (0.0035) [2024-06-25 07:08:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13688406016. Throughput: 0: 42834.3. Samples: 13688551860. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:08:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 07:09:01,092][15401] Updated weights for policy 0, policy_version 835483 (0.0038) [2024-06-25 07:09:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13688635392. Throughput: 0: 42885.2. Samples: 13688803900. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:03,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 07:09:05,049][15401] Updated weights for policy 0, policy_version 835493 (0.0028) [2024-06-25 07:09:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13688848384. Throughput: 0: 42767.7. Samples: 13688937300. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:08,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 07:09:08,598][15401] Updated weights for policy 0, policy_version 835503 (0.0022) [2024-06-25 07:09:12,603][15401] Updated weights for policy 0, policy_version 835513 (0.0030) [2024-06-25 07:09:13,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13689061376. Throughput: 0: 42905.3. Samples: 13689193340. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:13,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 07:09:16,472][15401] Updated weights for policy 0, policy_version 835523 (0.0043) [2024-06-25 07:09:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 13689274368. Throughput: 0: 42848.2. Samples: 13689443060. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 07:09:20,298][15401] Updated weights for policy 0, policy_version 835533 (0.0036) [2024-06-25 07:09:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42054.0, 300 sec: 42654.0). Total num frames: 13689470976. Throughput: 0: 42681.9. Samples: 13689574700. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:23,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 07:09:24,190][15401] Updated weights for policy 0, policy_version 835543 (0.0035) [2024-06-25 07:09:27,954][15401] Updated weights for policy 0, policy_version 835553 (0.0051) [2024-06-25 07:09:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13689700352. Throughput: 0: 42661.4. Samples: 13689826280. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 07:09:31,667][15401] Updated weights for policy 0, policy_version 835563 (0.0037) [2024-06-25 07:09:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13689913344. Throughput: 0: 42566.0. Samples: 13690074860. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 07:09:36,088][15401] Updated weights for policy 0, policy_version 835573 (0.0032) [2024-06-25 07:09:38,390][15132] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 13690093568. Throughput: 0: 42429.2. Samples: 13690207280. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:38,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 07:09:39,369][15401] Updated weights for policy 0, policy_version 835583 (0.0025) [2024-06-25 07:09:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13690322944. Throughput: 0: 42409.8. Samples: 13690460300. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:43,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 07:09:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835592_13690339328.pth... [2024-06-25 07:09:43,518][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000834966_13680082944.pth [2024-06-25 07:09:44,006][15401] Updated weights for policy 0, policy_version 835593 (0.0043) [2024-06-25 07:09:47,050][15401] Updated weights for policy 0, policy_version 835603 (0.0032) [2024-06-25 07:09:48,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13690552320. Throughput: 0: 42469.9. Samples: 13690715040. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-25 07:09:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 07:09:51,684][15401] Updated weights for policy 0, policy_version 835613 (0.0030) [2024-06-25 07:09:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13690748928. Throughput: 0: 42492.4. Samples: 13690849460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:09:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 07:09:54,323][15349] Signal inference workers to stop experience collection... (202700 times) [2024-06-25 07:09:54,379][15401] InferenceWorker_p0-w0: stopping experience collection (202700 times) [2024-06-25 07:09:54,385][15349] Signal inference workers to resume experience collection... (202700 times) [2024-06-25 07:09:54,391][15401] InferenceWorker_p0-w0: resuming experience collection (202700 times) [2024-06-25 07:09:54,558][15401] Updated weights for policy 0, policy_version 835623 (0.0045) [2024-06-25 07:09:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13690978304. Throughput: 0: 42491.6. Samples: 13691105460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:09:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 07:09:59,252][15401] Updated weights for policy 0, policy_version 835633 (0.0032) [2024-06-25 07:10:02,023][15401] Updated weights for policy 0, policy_version 835643 (0.0042) [2024-06-25 07:10:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13691207680. Throughput: 0: 42650.6. Samples: 13691362340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:03,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 07:10:06,786][15401] Updated weights for policy 0, policy_version 835653 (0.0030) [2024-06-25 07:10:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13691404288. Throughput: 0: 42714.5. Samples: 13691496860. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:08,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 07:10:09,753][15401] Updated weights for policy 0, policy_version 835663 (0.0040) [2024-06-25 07:10:13,393][15132] Fps is (10 sec: 40944.8, 60 sec: 42595.7, 300 sec: 42764.5). Total num frames: 13691617280. Throughput: 0: 42787.9. Samples: 13691751900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:13,394][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 07:10:14,308][15401] Updated weights for policy 0, policy_version 835673 (0.0029) [2024-06-25 07:10:17,249][15401] Updated weights for policy 0, policy_version 835683 (0.0030) [2024-06-25 07:10:18,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13691863040. Throughput: 0: 43012.4. Samples: 13692010420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 07:10:21,985][15401] Updated weights for policy 0, policy_version 835693 (0.0032) [2024-06-25 07:10:23,392][15132] Fps is (10 sec: 42604.0, 60 sec: 42869.7, 300 sec: 42653.9). Total num frames: 13692043264. Throughput: 0: 43076.5. Samples: 13692145820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:23,393][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 07:10:24,743][15401] Updated weights for policy 0, policy_version 835703 (0.0038) [2024-06-25 07:10:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 13692272640. Throughput: 0: 43135.9. Samples: 13692401420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 07:10:29,535][15401] Updated weights for policy 0, policy_version 835713 (0.0036) [2024-06-25 07:10:32,598][15401] Updated weights for policy 0, policy_version 835723 (0.0053) [2024-06-25 07:10:33,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13692518400. Throughput: 0: 42887.0. Samples: 13692644960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 07:10:37,427][15401] Updated weights for policy 0, policy_version 835733 (0.0021) [2024-06-25 07:10:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13692682240. Throughput: 0: 42923.1. Samples: 13692781000. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 07:10:40,477][15401] Updated weights for policy 0, policy_version 835743 (0.0034) [2024-06-25 07:10:43,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13692895232. Throughput: 0: 42913.8. Samples: 13693036580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 07:10:45,045][15401] Updated weights for policy 0, policy_version 835753 (0.0033) [2024-06-25 07:10:47,914][15401] Updated weights for policy 0, policy_version 835763 (0.0026) [2024-06-25 07:10:48,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43690.5, 300 sec: 42987.2). Total num frames: 13693173760. Throughput: 0: 42805.3. Samples: 13693288580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 07:10:52,810][15401] Updated weights for policy 0, policy_version 835773 (0.0033) [2024-06-25 07:10:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13693337600. Throughput: 0: 42870.4. Samples: 13693426020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 07:10:55,640][15401] Updated weights for policy 0, policy_version 835783 (0.0038) [2024-06-25 07:10:58,390][15132] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13693534208. Throughput: 0: 42794.7. Samples: 13693677500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:10:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 07:11:00,503][15401] Updated weights for policy 0, policy_version 835793 (0.0031) [2024-06-25 07:11:03,021][15349] Signal inference workers to stop experience collection... (202750 times) [2024-06-25 07:11:03,056][15401] InferenceWorker_p0-w0: stopping experience collection (202750 times) [2024-06-25 07:11:03,080][15349] Signal inference workers to resume experience collection... (202750 times) [2024-06-25 07:11:03,081][15401] InferenceWorker_p0-w0: resuming experience collection (202750 times) [2024-06-25 07:11:03,235][15401] Updated weights for policy 0, policy_version 835803 (0.0030) [2024-06-25 07:11:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13693796352. Throughput: 0: 42656.8. Samples: 13693929980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:03,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 07:11:08,072][15401] Updated weights for policy 0, policy_version 835813 (0.0030) [2024-06-25 07:11:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 13693976576. Throughput: 0: 42689.5. Samples: 13694066740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:08,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 07:11:11,312][15401] Updated weights for policy 0, policy_version 835823 (0.0038) [2024-06-25 07:11:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42874.2, 300 sec: 42709.5). Total num frames: 13694189568. Throughput: 0: 42521.4. Samples: 13694314880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 07:11:15,539][15401] Updated weights for policy 0, policy_version 835833 (0.0032) [2024-06-25 07:11:18,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 13694418944. Throughput: 0: 42872.9. Samples: 13694574240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 07:11:18,962][15401] Updated weights for policy 0, policy_version 835843 (0.0035) [2024-06-25 07:11:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13694599168. Throughput: 0: 42764.4. Samples: 13694705400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 07:11:23,512][15401] Updated weights for policy 0, policy_version 835853 (0.0040) [2024-06-25 07:11:26,617][15401] Updated weights for policy 0, policy_version 835863 (0.0038) [2024-06-25 07:11:28,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13694828544. Throughput: 0: 42581.3. Samples: 13694952840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:28,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 07:11:31,054][15401] Updated weights for policy 0, policy_version 835873 (0.0033) [2024-06-25 07:11:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 13695057920. Throughput: 0: 42714.7. Samples: 13695210740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-25 07:11:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 07:11:34,391][15401] Updated weights for policy 0, policy_version 835883 (0.0038) [2024-06-25 07:11:38,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13695238144. Throughput: 0: 42520.3. Samples: 13695339540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:11:38,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 07:11:38,740][15401] Updated weights for policy 0, policy_version 835893 (0.0040) [2024-06-25 07:11:42,305][15401] Updated weights for policy 0, policy_version 835903 (0.0034) [2024-06-25 07:11:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13695451136. Throughput: 0: 42625.3. Samples: 13695595640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:11:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 07:11:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835905_13695467520.pth... [2024-06-25 07:11:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835279_13685211136.pth [2024-06-25 07:11:46,341][15401] Updated weights for policy 0, policy_version 835913 (0.0035) [2024-06-25 07:11:48,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 13695713280. Throughput: 0: 42604.5. Samples: 13695847180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:11:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 07:11:49,958][15401] Updated weights for policy 0, policy_version 835923 (0.0034) [2024-06-25 07:11:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13695893504. Throughput: 0: 42598.5. Samples: 13695983680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:11:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 07:11:53,950][15401] Updated weights for policy 0, policy_version 835933 (0.0047) [2024-06-25 07:11:57,643][15401] Updated weights for policy 0, policy_version 835943 (0.0039) [2024-06-25 07:11:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13696106496. Throughput: 0: 42646.2. Samples: 13696233960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:11:58,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 07:12:01,714][15401] Updated weights for policy 0, policy_version 835953 (0.0031) [2024-06-25 07:12:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13696319488. Throughput: 0: 42553.8. Samples: 13696489160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 07:12:05,188][15401] Updated weights for policy 0, policy_version 835963 (0.0052) [2024-06-25 07:12:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13696532480. Throughput: 0: 42436.5. Samples: 13696615040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 07:12:09,405][15401] Updated weights for policy 0, policy_version 835973 (0.0037) [2024-06-25 07:12:10,935][15349] Signal inference workers to stop experience collection... (202800 times) [2024-06-25 07:12:10,941][15349] Signal inference workers to resume experience collection... (202800 times) [2024-06-25 07:12:10,958][15401] InferenceWorker_p0-w0: stopping experience collection (202800 times) [2024-06-25 07:12:10,958][15401] InferenceWorker_p0-w0: resuming experience collection (202800 times) [2024-06-25 07:12:12,934][15401] Updated weights for policy 0, policy_version 835983 (0.0044) [2024-06-25 07:12:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13696745472. Throughput: 0: 42589.0. Samples: 13696869240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 07:12:17,193][15401] Updated weights for policy 0, policy_version 835993 (0.0042) [2024-06-25 07:12:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13696974848. Throughput: 0: 42580.5. Samples: 13697126860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 07:12:20,537][15401] Updated weights for policy 0, policy_version 836003 (0.0045) [2024-06-25 07:12:23,393][15132] Fps is (10 sec: 42581.5, 60 sec: 42868.7, 300 sec: 42764.4). Total num frames: 13697171456. Throughput: 0: 42707.5. Samples: 13697261440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:23,394][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 07:12:24,746][15401] Updated weights for policy 0, policy_version 836013 (0.0031) [2024-06-25 07:12:28,174][15401] Updated weights for policy 0, policy_version 836023 (0.0029) [2024-06-25 07:12:28,393][15132] Fps is (10 sec: 42582.1, 60 sec: 42870.4, 300 sec: 42764.5). Total num frames: 13697400832. Throughput: 0: 42659.1. Samples: 13697515460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:28,394][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 07:12:32,196][15401] Updated weights for policy 0, policy_version 836033 (0.0043) [2024-06-25 07:12:33,392][15132] Fps is (10 sec: 44243.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13697613824. Throughput: 0: 42852.8. Samples: 13697775660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:33,393][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 07:12:35,831][15401] Updated weights for policy 0, policy_version 836043 (0.0040) [2024-06-25 07:12:38,389][15132] Fps is (10 sec: 42614.9, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 13697826816. Throughput: 0: 42652.6. Samples: 13697903040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 07:12:39,864][15401] Updated weights for policy 0, policy_version 836053 (0.0036) [2024-06-25 07:12:43,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 13698056192. Throughput: 0: 42866.2. Samples: 13698162940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:43,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 07:12:43,398][15401] Updated weights for policy 0, policy_version 836063 (0.0032) [2024-06-25 07:12:47,564][15401] Updated weights for policy 0, policy_version 836073 (0.0033) [2024-06-25 07:12:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13698252800. Throughput: 0: 42771.6. Samples: 13698413880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:48,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 07:12:51,218][15401] Updated weights for policy 0, policy_version 836083 (0.0026) [2024-06-25 07:12:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13698465792. Throughput: 0: 42820.8. Samples: 13698541980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 07:12:55,298][15401] Updated weights for policy 0, policy_version 836093 (0.0041) [2024-06-25 07:12:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13698695168. Throughput: 0: 42940.4. Samples: 13698801560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:12:58,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 07:12:59,365][15401] Updated weights for policy 0, policy_version 836103 (0.0040) [2024-06-25 07:13:03,012][15401] Updated weights for policy 0, policy_version 836113 (0.0033) [2024-06-25 07:13:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13698908160. Throughput: 0: 42748.0. Samples: 13699050520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:13:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 07:13:07,104][15401] Updated weights for policy 0, policy_version 836123 (0.0036) [2024-06-25 07:13:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13699104768. Throughput: 0: 42719.8. Samples: 13699183660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:13:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 07:13:10,450][15401] Updated weights for policy 0, policy_version 836133 (0.0053) [2024-06-25 07:13:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13699317760. Throughput: 0: 42692.0. Samples: 13699436440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:13:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 07:13:14,831][15401] Updated weights for policy 0, policy_version 836143 (0.0033) [2024-06-25 07:13:18,159][15401] Updated weights for policy 0, policy_version 836153 (0.0032) [2024-06-25 07:13:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13699547136. Throughput: 0: 42601.0. Samples: 13699692600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:13:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 07:13:22,394][15401] Updated weights for policy 0, policy_version 836163 (0.0028) [2024-06-25 07:13:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42601.2, 300 sec: 42709.5). Total num frames: 13699727360. Throughput: 0: 42622.2. Samples: 13699821040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 07:13:23,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 07:13:25,607][15401] Updated weights for policy 0, policy_version 836173 (0.0030) [2024-06-25 07:13:27,558][15349] Signal inference workers to stop experience collection... (202850 times) [2024-06-25 07:13:27,559][15349] Signal inference workers to resume experience collection... (202850 times) [2024-06-25 07:13:27,570][15401] InferenceWorker_p0-w0: stopping experience collection (202850 times) [2024-06-25 07:13:27,592][15401] InferenceWorker_p0-w0: resuming experience collection (202850 times) [2024-06-25 07:13:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42874.2, 300 sec: 42765.0). Total num frames: 13699973120. Throughput: 0: 42656.0. Samples: 13700082460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 07:13:29,983][15401] Updated weights for policy 0, policy_version 836183 (0.0031) [2024-06-25 07:13:33,170][15401] Updated weights for policy 0, policy_version 836193 (0.0034) [2024-06-25 07:13:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 13700186112. Throughput: 0: 42628.9. Samples: 13700332180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 07:13:37,579][15401] Updated weights for policy 0, policy_version 836203 (0.0033) [2024-06-25 07:13:38,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13700349952. Throughput: 0: 42663.6. Samples: 13700461840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:13:40,723][15401] Updated weights for policy 0, policy_version 836213 (0.0041) [2024-06-25 07:13:43,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42320.7, 300 sec: 42708.6). Total num frames: 13700595712. Throughput: 0: 42532.6. Samples: 13700715800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:43,396][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 07:13:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836218_13700595712.pth... [2024-06-25 07:13:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835592_13690339328.pth [2024-06-25 07:13:45,570][15401] Updated weights for policy 0, policy_version 836223 (0.0023) [2024-06-25 07:13:48,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13700825088. Throughput: 0: 42653.3. Samples: 13700969920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:48,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 07:13:48,454][15401] Updated weights for policy 0, policy_version 836233 (0.0027) [2024-06-25 07:13:53,286][15401] Updated weights for policy 0, policy_version 836243 (0.0033) [2024-06-25 07:13:53,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13701005312. Throughput: 0: 42611.5. Samples: 13701101180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:13:55,899][15401] Updated weights for policy 0, policy_version 836253 (0.0026) [2024-06-25 07:13:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 13701251072. Throughput: 0: 42605.8. Samples: 13701353800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:13:58,392][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 07:14:00,746][15401] Updated weights for policy 0, policy_version 836263 (0.0029) [2024-06-25 07:14:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13701447680. Throughput: 0: 42652.9. Samples: 13701611980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 07:14:04,172][15401] Updated weights for policy 0, policy_version 836273 (0.0031) [2024-06-25 07:14:08,389][15132] Fps is (10 sec: 37692.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13701627904. Throughput: 0: 42572.4. Samples: 13701736800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 07:14:08,631][15401] Updated weights for policy 0, policy_version 836283 (0.0033) [2024-06-25 07:14:11,641][15401] Updated weights for policy 0, policy_version 836293 (0.0027) [2024-06-25 07:14:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13701890048. Throughput: 0: 42359.5. Samples: 13701988640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:13,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-25 07:14:16,343][15401] Updated weights for policy 0, policy_version 836303 (0.0037) [2024-06-25 07:14:18,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13702103040. Throughput: 0: 42550.6. Samples: 13702246960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 07:14:19,335][15401] Updated weights for policy 0, policy_version 836313 (0.0040) [2024-06-25 07:14:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13702283264. Throughput: 0: 42527.5. Samples: 13702375580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 07:14:23,838][15401] Updated weights for policy 0, policy_version 836323 (0.0036) [2024-06-25 07:14:27,177][15401] Updated weights for policy 0, policy_version 836333 (0.0033) [2024-06-25 07:14:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 13702512640. Throughput: 0: 42600.3. Samples: 13702632640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:28,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 07:14:31,250][15349] Signal inference workers to stop experience collection... (202900 times) [2024-06-25 07:14:31,250][15349] Signal inference workers to resume experience collection... (202900 times) [2024-06-25 07:14:31,278][15401] InferenceWorker_p0-w0: stopping experience collection (202900 times) [2024-06-25 07:14:31,278][15401] InferenceWorker_p0-w0: resuming experience collection (202900 times) [2024-06-25 07:14:31,635][15401] Updated weights for policy 0, policy_version 836343 (0.0036) [2024-06-25 07:14:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 13702725632. Throughput: 0: 42632.3. Samples: 13702888380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 07:14:34,750][15401] Updated weights for policy 0, policy_version 836353 (0.0049) [2024-06-25 07:14:38,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13702905856. Throughput: 0: 42530.3. Samples: 13703015040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 07:14:39,395][15401] Updated weights for policy 0, policy_version 836363 (0.0045) [2024-06-25 07:14:42,364][15401] Updated weights for policy 0, policy_version 836373 (0.0046) [2024-06-25 07:14:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 13703168000. Throughput: 0: 42636.8. Samples: 13703272360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:43,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 07:14:47,038][15401] Updated weights for policy 0, policy_version 836383 (0.0037) [2024-06-25 07:14:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13703364608. Throughput: 0: 42703.2. Samples: 13703533620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 07:14:49,934][15401] Updated weights for policy 0, policy_version 836393 (0.0034) [2024-06-25 07:14:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13703561216. Throughput: 0: 42632.8. Samples: 13703655280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 07:14:54,757][15401] Updated weights for policy 0, policy_version 836403 (0.0043) [2024-06-25 07:14:57,558][15401] Updated weights for policy 0, policy_version 836413 (0.0032) [2024-06-25 07:14:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 13703806976. Throughput: 0: 42701.3. Samples: 13703910200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:14:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 07:15:02,612][15401] Updated weights for policy 0, policy_version 836423 (0.0034) [2024-06-25 07:15:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13704003584. Throughput: 0: 42736.0. Samples: 13704170080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 07:15:05,213][15401] Updated weights for policy 0, policy_version 836433 (0.0036) [2024-06-25 07:15:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 13704200192. Throughput: 0: 42720.9. Samples: 13704298020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 07:15:10,251][15401] Updated weights for policy 0, policy_version 836443 (0.0049) [2024-06-25 07:15:12,987][15401] Updated weights for policy 0, policy_version 836453 (0.0030) [2024-06-25 07:15:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13704445952. Throughput: 0: 42639.2. Samples: 13704551300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 07:15:17,847][15401] Updated weights for policy 0, policy_version 836463 (0.0032) [2024-06-25 07:15:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 13704642560. Throughput: 0: 42778.9. Samples: 13704813420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:15:20,583][15401] Updated weights for policy 0, policy_version 836473 (0.0033) [2024-06-25 07:15:23,390][15132] Fps is (10 sec: 40957.3, 60 sec: 42871.0, 300 sec: 42653.8). Total num frames: 13704855552. Throughput: 0: 42689.6. Samples: 13704936100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:23,391][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 07:15:25,488][15401] Updated weights for policy 0, policy_version 836483 (0.0037) [2024-06-25 07:15:28,162][15401] Updated weights for policy 0, policy_version 836493 (0.0034) [2024-06-25 07:15:28,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 13705101312. Throughput: 0: 42746.6. Samples: 13705195960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 07:15:33,029][15401] Updated weights for policy 0, policy_version 836503 (0.0031) [2024-06-25 07:15:33,389][15132] Fps is (10 sec: 42601.4, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 13705281536. Throughput: 0: 42808.4. Samples: 13705460000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:33,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 07:15:36,138][15401] Updated weights for policy 0, policy_version 836513 (0.0034) [2024-06-25 07:15:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13705510912. Throughput: 0: 42859.7. Samples: 13705583960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:38,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 07:15:40,639][15401] Updated weights for policy 0, policy_version 836523 (0.0026) [2024-06-25 07:15:42,367][15349] Signal inference workers to stop experience collection... (202950 times) [2024-06-25 07:15:42,415][15401] InferenceWorker_p0-w0: stopping experience collection (202950 times) [2024-06-25 07:15:42,417][15349] Signal inference workers to resume experience collection... (202950 times) [2024-06-25 07:15:42,429][15401] InferenceWorker_p0-w0: resuming experience collection (202950 times) [2024-06-25 07:15:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13705740288. Throughput: 0: 42952.5. Samples: 13705843060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 07:15:43,428][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836532_13705740288.pth... [2024-06-25 07:15:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000835905_13695467520.pth [2024-06-25 07:15:43,885][15401] Updated weights for policy 0, policy_version 836533 (0.0039) [2024-06-25 07:15:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13705904128. Throughput: 0: 42983.2. Samples: 13706104320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:15:48,462][15401] Updated weights for policy 0, policy_version 836543 (0.0048) [2024-06-25 07:15:51,574][15401] Updated weights for policy 0, policy_version 836553 (0.0027) [2024-06-25 07:15:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 13706149888. Throughput: 0: 42681.8. Samples: 13706218800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:53,392][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 07:15:56,185][15401] Updated weights for policy 0, policy_version 836563 (0.0037) [2024-06-25 07:15:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13706362880. Throughput: 0: 42915.2. Samples: 13706482480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:15:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 07:15:59,194][15401] Updated weights for policy 0, policy_version 836573 (0.0032) [2024-06-25 07:16:03,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13706543104. Throughput: 0: 42884.8. Samples: 13706743240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:03,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 07:16:03,803][15401] Updated weights for policy 0, policy_version 836583 (0.0040) [2024-06-25 07:16:06,648][15401] Updated weights for policy 0, policy_version 836593 (0.0047) [2024-06-25 07:16:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13706805248. Throughput: 0: 42737.0. Samples: 13706859240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:08,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 07:16:11,526][15401] Updated weights for policy 0, policy_version 836603 (0.0032) [2024-06-25 07:16:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13706985472. Throughput: 0: 42685.8. Samples: 13707116820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 07:16:14,191][15401] Updated weights for policy 0, policy_version 836613 (0.0042) [2024-06-25 07:16:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13707182080. Throughput: 0: 42568.3. Samples: 13707375580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 07:16:19,179][15401] Updated weights for policy 0, policy_version 836623 (0.0043) [2024-06-25 07:16:21,901][15401] Updated weights for policy 0, policy_version 836633 (0.0033) [2024-06-25 07:16:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.9, 300 sec: 42709.8). Total num frames: 13707427840. Throughput: 0: 42616.9. Samples: 13707501720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 07:16:26,837][15401] Updated weights for policy 0, policy_version 836643 (0.0028) [2024-06-25 07:16:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13707640832. Throughput: 0: 42590.3. Samples: 13707759620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 07:16:29,591][15401] Updated weights for policy 0, policy_version 836653 (0.0025) [2024-06-25 07:16:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 13707821056. Throughput: 0: 42657.6. Samples: 13708023920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:33,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 07:16:34,325][15401] Updated weights for policy 0, policy_version 836663 (0.0030) [2024-06-25 07:16:37,426][15401] Updated weights for policy 0, policy_version 836673 (0.0036) [2024-06-25 07:16:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13708083200. Throughput: 0: 42850.7. Samples: 13708146980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:38,391][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 07:16:41,999][15401] Updated weights for policy 0, policy_version 836683 (0.0034) [2024-06-25 07:16:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13708279808. Throughput: 0: 42692.3. Samples: 13708403640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:43,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 07:16:45,147][15401] Updated weights for policy 0, policy_version 836693 (0.0035) [2024-06-25 07:16:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 13708476416. Throughput: 0: 42640.5. Samples: 13708662060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:48,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 07:16:49,543][15401] Updated weights for policy 0, policy_version 836703 (0.0036) [2024-06-25 07:16:52,983][15401] Updated weights for policy 0, policy_version 836713 (0.0021) [2024-06-25 07:16:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13708705792. Throughput: 0: 42806.3. Samples: 13708785520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 07:16:57,334][15401] Updated weights for policy 0, policy_version 836723 (0.0029) [2024-06-25 07:16:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13708935168. Throughput: 0: 42868.1. Samples: 13709045880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:16:58,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 07:17:00,574][15401] Updated weights for policy 0, policy_version 836733 (0.0029) [2024-06-25 07:17:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13709115392. Throughput: 0: 42583.7. Samples: 13709291840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 07:17:03,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 07:17:05,106][15401] Updated weights for policy 0, policy_version 836743 (0.0033) [2024-06-25 07:17:08,381][15401] Updated weights for policy 0, policy_version 836753 (0.0035) [2024-06-25 07:17:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13709361152. Throughput: 0: 42600.4. Samples: 13709418740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 07:17:12,627][15349] Signal inference workers to stop experience collection... (203000 times) [2024-06-25 07:17:12,629][15349] Signal inference workers to resume experience collection... (203000 times) [2024-06-25 07:17:12,656][15401] InferenceWorker_p0-w0: stopping experience collection (203000 times) [2024-06-25 07:17:12,656][15401] InferenceWorker_p0-w0: resuming experience collection (203000 times) [2024-06-25 07:17:12,795][15401] Updated weights for policy 0, policy_version 836763 (0.0032) [2024-06-25 07:17:13,390][15132] Fps is (10 sec: 45872.6, 60 sec: 43144.2, 300 sec: 42709.4). Total num frames: 13709574144. Throughput: 0: 42735.5. Samples: 13709682740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:13,391][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 07:17:16,739][15401] Updated weights for policy 0, policy_version 836773 (0.0039) [2024-06-25 07:17:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42710.0). Total num frames: 13709770752. Throughput: 0: 42364.9. Samples: 13709930340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 07:17:20,325][15401] Updated weights for policy 0, policy_version 836783 (0.0041) [2024-06-25 07:17:23,390][15132] Fps is (10 sec: 40962.0, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 13709983744. Throughput: 0: 42361.3. Samples: 13710053240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:23,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 07:17:24,191][15401] Updated weights for policy 0, policy_version 836793 (0.0039) [2024-06-25 07:17:27,927][15401] Updated weights for policy 0, policy_version 836803 (0.0028) [2024-06-25 07:17:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13710213120. Throughput: 0: 42520.5. Samples: 13710317060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 07:17:31,802][15401] Updated weights for policy 0, policy_version 836813 (0.0028) [2024-06-25 07:17:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13710393344. Throughput: 0: 42422.7. Samples: 13710571080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:33,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 07:17:35,434][15401] Updated weights for policy 0, policy_version 836823 (0.0038) [2024-06-25 07:17:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13710622720. Throughput: 0: 42350.6. Samples: 13710691300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:38,391][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 07:17:39,333][15401] Updated weights for policy 0, policy_version 836833 (0.0038) [2024-06-25 07:17:42,940][15401] Updated weights for policy 0, policy_version 836843 (0.0030) [2024-06-25 07:17:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13710852096. Throughput: 0: 42452.0. Samples: 13710956220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:43,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 07:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836844_13710852096.pth... [2024-06-25 07:17:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836218_13700595712.pth [2024-06-25 07:17:47,014][15401] Updated weights for policy 0, policy_version 836853 (0.0044) [2024-06-25 07:17:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13711048704. Throughput: 0: 42619.0. Samples: 13711209700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 07:17:50,717][15401] Updated weights for policy 0, policy_version 836863 (0.0042) [2024-06-25 07:17:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13711261696. Throughput: 0: 42597.8. Samples: 13711335640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 07:17:55,047][15401] Updated weights for policy 0, policy_version 836873 (0.0055) [2024-06-25 07:17:58,361][15401] Updated weights for policy 0, policy_version 836883 (0.0038) [2024-06-25 07:17:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13711491072. Throughput: 0: 42528.1. Samples: 13711596480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:17:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 07:18:02,681][15401] Updated weights for policy 0, policy_version 836893 (0.0036) [2024-06-25 07:18:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13711687680. Throughput: 0: 42624.8. Samples: 13711848460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 07:18:06,151][15401] Updated weights for policy 0, policy_version 836903 (0.0038) [2024-06-25 07:18:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13711900672. Throughput: 0: 42682.3. Samples: 13711973940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 07:18:10,307][15401] Updated weights for policy 0, policy_version 836913 (0.0040) [2024-06-25 07:18:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.7, 300 sec: 42598.4). Total num frames: 13712113664. Throughput: 0: 42653.7. Samples: 13712236480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:18:13,864][15401] Updated weights for policy 0, policy_version 836923 (0.0037) [2024-06-25 07:18:17,977][15401] Updated weights for policy 0, policy_version 836933 (0.0038) [2024-06-25 07:18:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13712310272. Throughput: 0: 42554.2. Samples: 13712486020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:18,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 07:18:21,794][15401] Updated weights for policy 0, policy_version 836943 (0.0037) [2024-06-25 07:18:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13712556032. Throughput: 0: 42770.2. Samples: 13712616060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:23,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 07:18:25,517][15401] Updated weights for policy 0, policy_version 836953 (0.0044) [2024-06-25 07:18:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13712752640. Throughput: 0: 42644.4. Samples: 13712875220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:28,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 07:18:29,154][15349] Signal inference workers to stop experience collection... (203050 times) [2024-06-25 07:18:29,164][15349] Signal inference workers to resume experience collection... (203050 times) [2024-06-25 07:18:29,172][15401] InferenceWorker_p0-w0: stopping experience collection (203050 times) [2024-06-25 07:18:29,194][15401] InferenceWorker_p0-w0: resuming experience collection (203050 times) [2024-06-25 07:18:29,313][15401] Updated weights for policy 0, policy_version 836963 (0.0040) [2024-06-25 07:18:33,350][15401] Updated weights for policy 0, policy_version 836973 (0.0036) [2024-06-25 07:18:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13712965632. Throughput: 0: 42624.9. Samples: 13713127820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 07:18:36,867][15401] Updated weights for policy 0, policy_version 836983 (0.0041) [2024-06-25 07:18:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 13713178624. Throughput: 0: 42660.5. Samples: 13713255360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 07:18:40,869][15401] Updated weights for policy 0, policy_version 836993 (0.0041) [2024-06-25 07:18:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13713375232. Throughput: 0: 42720.7. Samples: 13713518920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:43,399][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 07:18:44,488][15401] Updated weights for policy 0, policy_version 837003 (0.0033) [2024-06-25 07:18:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13713604608. Throughput: 0: 42781.9. Samples: 13713773640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:48,402][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 07:18:48,725][15401] Updated weights for policy 0, policy_version 837013 (0.0038) [2024-06-25 07:18:51,987][15401] Updated weights for policy 0, policy_version 837023 (0.0033) [2024-06-25 07:18:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 13713833984. Throughput: 0: 42828.4. Samples: 13713901220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 07:18:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 07:18:56,607][15401] Updated weights for policy 0, policy_version 837033 (0.0024) [2024-06-25 07:18:58,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 13714030592. Throughput: 0: 42690.7. Samples: 13714157660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:18:58,392][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 07:18:59,945][15401] Updated weights for policy 0, policy_version 837043 (0.0035) [2024-06-25 07:19:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13714227200. Throughput: 0: 42753.8. Samples: 13714409940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 07:19:04,161][15401] Updated weights for policy 0, policy_version 837053 (0.0051) [2024-06-25 07:19:07,851][15401] Updated weights for policy 0, policy_version 837063 (0.0041) [2024-06-25 07:19:08,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13714472960. Throughput: 0: 42663.1. Samples: 13714535800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:08,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 07:19:12,111][15401] Updated weights for policy 0, policy_version 837073 (0.0033) [2024-06-25 07:19:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13714669568. Throughput: 0: 42718.8. Samples: 13714797560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:13,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 07:19:15,356][15401] Updated weights for policy 0, policy_version 837083 (0.0041) [2024-06-25 07:19:18,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 13714882560. Throughput: 0: 42644.5. Samples: 13715046920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:18,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 07:19:19,811][15401] Updated weights for policy 0, policy_version 837093 (0.0040) [2024-06-25 07:19:22,955][15401] Updated weights for policy 0, policy_version 837103 (0.0028) [2024-06-25 07:19:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.1, 300 sec: 42765.3). Total num frames: 13715128320. Throughput: 0: 42683.8. Samples: 13715176140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 07:19:27,824][15401] Updated weights for policy 0, policy_version 837113 (0.0037) [2024-06-25 07:19:28,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13715308544. Throughput: 0: 42614.3. Samples: 13715436560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:28,391][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 07:19:30,702][15401] Updated weights for policy 0, policy_version 837123 (0.0037) [2024-06-25 07:19:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13715521536. Throughput: 0: 42458.3. Samples: 13715684260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 07:19:35,385][15401] Updated weights for policy 0, policy_version 837133 (0.0035) [2024-06-25 07:19:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13715734528. Throughput: 0: 42478.7. Samples: 13715812860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:38,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 07:19:38,694][15401] Updated weights for policy 0, policy_version 837143 (0.0030) [2024-06-25 07:19:43,327][15401] Updated weights for policy 0, policy_version 837153 (0.0029) [2024-06-25 07:19:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 13715914752. Throughput: 0: 42545.2. Samples: 13716072100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:43,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 07:19:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837154_13715931136.pth... [2024-06-25 07:19:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836532_13705740288.pth [2024-06-25 07:19:46,223][15401] Updated weights for policy 0, policy_version 837163 (0.0029) [2024-06-25 07:19:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13716176896. Throughput: 0: 42450.2. Samples: 13716320200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:19:50,822][15401] Updated weights for policy 0, policy_version 837173 (0.0037) [2024-06-25 07:19:51,282][15349] Signal inference workers to stop experience collection... (203100 times) [2024-06-25 07:19:51,283][15349] Signal inference workers to resume experience collection... (203100 times) [2024-06-25 07:19:51,354][15401] InferenceWorker_p0-w0: stopping experience collection (203100 times) [2024-06-25 07:19:51,354][15401] InferenceWorker_p0-w0: resuming experience collection (203100 times) [2024-06-25 07:19:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13716373504. Throughput: 0: 42784.6. Samples: 13716461100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:53,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 07:19:53,946][15401] Updated weights for policy 0, policy_version 837183 (0.0032) [2024-06-25 07:19:58,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 13716553728. Throughput: 0: 42417.3. Samples: 13716706340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:19:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 07:19:58,463][15401] Updated weights for policy 0, policy_version 837193 (0.0044) [2024-06-25 07:20:01,538][15401] Updated weights for policy 0, policy_version 837203 (0.0032) [2024-06-25 07:20:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13716815872. Throughput: 0: 42347.1. Samples: 13716952440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 07:20:06,433][15401] Updated weights for policy 0, policy_version 837213 (0.0045) [2024-06-25 07:20:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13717028864. Throughput: 0: 42705.4. Samples: 13717097880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:08,390][15132] Avg episode reward: [(0, '0.891')] [2024-06-25 07:20:09,247][15401] Updated weights for policy 0, policy_version 837223 (0.0033) [2024-06-25 07:20:13,390][15132] Fps is (10 sec: 37680.8, 60 sec: 42051.8, 300 sec: 42542.8). Total num frames: 13717192704. Throughput: 0: 42337.2. Samples: 13717341760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:13,391][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 07:20:14,038][15401] Updated weights for policy 0, policy_version 837233 (0.0044) [2024-06-25 07:20:16,956][15401] Updated weights for policy 0, policy_version 837243 (0.0034) [2024-06-25 07:20:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.0, 300 sec: 42598.5). Total num frames: 13717422080. Throughput: 0: 42212.9. Samples: 13717583840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 07:20:21,711][15401] Updated weights for policy 0, policy_version 837253 (0.0032) [2024-06-25 07:20:23,390][15132] Fps is (10 sec: 44239.2, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 13717635072. Throughput: 0: 42409.3. Samples: 13717721180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 07:20:24,562][15401] Updated weights for policy 0, policy_version 837263 (0.0032) [2024-06-25 07:20:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 13717831680. Throughput: 0: 42150.8. Samples: 13717968880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:28,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 07:20:29,298][15401] Updated weights for policy 0, policy_version 837273 (0.0044) [2024-06-25 07:20:32,224][15401] Updated weights for policy 0, policy_version 837283 (0.0038) [2024-06-25 07:20:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13718077440. Throughput: 0: 42298.7. Samples: 13718223640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 07:20:36,861][15401] Updated weights for policy 0, policy_version 837293 (0.0034) [2024-06-25 07:20:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 13718257664. Throughput: 0: 42207.1. Samples: 13718360420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 07:20:39,916][15401] Updated weights for policy 0, policy_version 837303 (0.0050) [2024-06-25 07:20:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13718487040. Throughput: 0: 42340.3. Samples: 13718611660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 07:20:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 07:20:44,563][15401] Updated weights for policy 0, policy_version 837313 (0.0033) [2024-06-25 07:20:47,845][15401] Updated weights for policy 0, policy_version 837323 (0.0051) [2024-06-25 07:20:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 13718700032. Throughput: 0: 42503.1. Samples: 13718865080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:20:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 07:20:52,106][15401] Updated weights for policy 0, policy_version 837333 (0.0033) [2024-06-25 07:20:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 13718896640. Throughput: 0: 42199.2. Samples: 13718996840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:20:53,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 07:20:55,548][15401] Updated weights for policy 0, policy_version 837343 (0.0035) [2024-06-25 07:20:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 13719126016. Throughput: 0: 42473.9. Samples: 13719253160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:20:58,392][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 07:20:59,853][15401] Updated weights for policy 0, policy_version 837353 (0.0033) [2024-06-25 07:21:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 13719339008. Throughput: 0: 42701.8. Samples: 13719505420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:03,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 07:21:03,447][15401] Updated weights for policy 0, policy_version 837363 (0.0038) [2024-06-25 07:21:07,539][15401] Updated weights for policy 0, policy_version 837373 (0.0032) [2024-06-25 07:21:08,389][15132] Fps is (10 sec: 39331.1, 60 sec: 41506.2, 300 sec: 42487.3). Total num frames: 13719519232. Throughput: 0: 42517.5. Samples: 13719634460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 07:21:11,012][15401] Updated weights for policy 0, policy_version 837383 (0.0040) [2024-06-25 07:21:13,394][15132] Fps is (10 sec: 42578.2, 60 sec: 42868.6, 300 sec: 42653.3). Total num frames: 13719764992. Throughput: 0: 42589.4. Samples: 13719885600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:13,395][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 07:21:15,041][15401] Updated weights for policy 0, policy_version 837393 (0.0042) [2024-06-25 07:21:16,166][15349] Signal inference workers to stop experience collection... (203150 times) [2024-06-25 07:21:16,167][15349] Signal inference workers to resume experience collection... (203150 times) [2024-06-25 07:21:16,205][15401] InferenceWorker_p0-w0: stopping experience collection (203150 times) [2024-06-25 07:21:16,206][15401] InferenceWorker_p0-w0: resuming experience collection (203150 times) [2024-06-25 07:21:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13719977984. Throughput: 0: 42729.3. Samples: 13720146460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 07:21:18,727][15401] Updated weights for policy 0, policy_version 837403 (0.0029) [2024-06-25 07:21:22,666][15401] Updated weights for policy 0, policy_version 837413 (0.0029) [2024-06-25 07:21:23,390][15132] Fps is (10 sec: 40978.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13720174592. Throughput: 0: 42628.4. Samples: 13720278700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:21:26,638][15401] Updated weights for policy 0, policy_version 837423 (0.0034) [2024-06-25 07:21:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13720403968. Throughput: 0: 42563.7. Samples: 13720527020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 07:21:30,722][15401] Updated weights for policy 0, policy_version 837433 (0.0031) [2024-06-25 07:21:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13720616960. Throughput: 0: 42635.4. Samples: 13720783680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 07:21:34,195][15401] Updated weights for policy 0, policy_version 837443 (0.0033) [2024-06-25 07:21:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13720813568. Throughput: 0: 42684.8. Samples: 13720917660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 07:21:38,393][15401] Updated weights for policy 0, policy_version 837453 (0.0026) [2024-06-25 07:21:41,676][15401] Updated weights for policy 0, policy_version 837463 (0.0049) [2024-06-25 07:21:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13721042944. Throughput: 0: 42567.6. Samples: 13721168600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:43,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 07:21:43,502][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837467_13721059328.pth... [2024-06-25 07:21:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000836844_13710852096.pth [2024-06-25 07:21:45,946][15401] Updated weights for policy 0, policy_version 837473 (0.0037) [2024-06-25 07:21:48,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 13721272320. Throughput: 0: 42756.7. Samples: 13721429580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:48,393][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 07:21:49,190][15401] Updated weights for policy 0, policy_version 837483 (0.0037) [2024-06-25 07:21:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 13721452544. Throughput: 0: 42817.8. Samples: 13721561260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 07:21:53,577][15401] Updated weights for policy 0, policy_version 837493 (0.0034) [2024-06-25 07:21:56,806][15401] Updated weights for policy 0, policy_version 837503 (0.0041) [2024-06-25 07:21:58,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 13721698304. Throughput: 0: 42840.5. Samples: 13721813220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:21:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 07:22:01,191][15401] Updated weights for policy 0, policy_version 837513 (0.0034) [2024-06-25 07:22:03,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13721927680. Throughput: 0: 42745.8. Samples: 13722070020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:03,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 07:22:04,418][15401] Updated weights for policy 0, policy_version 837523 (0.0035) [2024-06-25 07:22:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42487.4). Total num frames: 13722107904. Throughput: 0: 42740.9. Samples: 13722202040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 07:22:08,786][15401] Updated weights for policy 0, policy_version 837533 (0.0046) [2024-06-25 07:22:12,029][15401] Updated weights for policy 0, policy_version 837543 (0.0032) [2024-06-25 07:22:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42873.1, 300 sec: 42598.1). Total num frames: 13722337280. Throughput: 0: 42805.7. Samples: 13722453380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:13,401][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 07:22:16,674][15401] Updated weights for policy 0, policy_version 837553 (0.0050) [2024-06-25 07:22:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13722550272. Throughput: 0: 42889.8. Samples: 13722713720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 07:22:19,696][15401] Updated weights for policy 0, policy_version 837563 (0.0031) [2024-06-25 07:22:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 13722746880. Throughput: 0: 42776.4. Samples: 13722842600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 07:22:24,266][15401] Updated weights for policy 0, policy_version 837573 (0.0035) [2024-06-25 07:22:27,729][15401] Updated weights for policy 0, policy_version 837583 (0.0041) [2024-06-25 07:22:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13722992640. Throughput: 0: 42882.6. Samples: 13723098320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:28,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 07:22:31,734][15401] Updated weights for policy 0, policy_version 837593 (0.0036) [2024-06-25 07:22:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13723189248. Throughput: 0: 42800.1. Samples: 13723355480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 07:22:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 07:22:35,198][15401] Updated weights for policy 0, policy_version 837603 (0.0035) [2024-06-25 07:22:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 13723402240. Throughput: 0: 42717.3. Samples: 13723483540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:22:38,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 07:22:39,346][15401] Updated weights for policy 0, policy_version 837613 (0.0038) [2024-06-25 07:22:39,986][15349] Signal inference workers to stop experience collection... (203200 times) [2024-06-25 07:22:39,987][15349] Signal inference workers to resume experience collection... (203200 times) [2024-06-25 07:22:40,030][15401] InferenceWorker_p0-w0: stopping experience collection (203200 times) [2024-06-25 07:22:40,031][15401] InferenceWorker_p0-w0: resuming experience collection (203200 times) [2024-06-25 07:22:42,719][15401] Updated weights for policy 0, policy_version 837623 (0.0035) [2024-06-25 07:22:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 13723631616. Throughput: 0: 42725.7. Samples: 13723735880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:22:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 07:22:47,293][15401] Updated weights for policy 0, policy_version 837633 (0.0031) [2024-06-25 07:22:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 13723828224. Throughput: 0: 42893.3. Samples: 13724000220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:22:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 07:22:50,344][15401] Updated weights for policy 0, policy_version 837643 (0.0032) [2024-06-25 07:22:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 13724041216. Throughput: 0: 42675.9. Samples: 13724122460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:22:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 07:22:54,787][15401] Updated weights for policy 0, policy_version 837653 (0.0039) [2024-06-25 07:22:57,784][15401] Updated weights for policy 0, policy_version 837663 (0.0029) [2024-06-25 07:22:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 13724286976. Throughput: 0: 42944.0. Samples: 13724385760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:22:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 07:23:02,518][15401] Updated weights for policy 0, policy_version 837673 (0.0040) [2024-06-25 07:23:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13724467200. Throughput: 0: 42914.8. Samples: 13724644880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 07:23:05,675][15401] Updated weights for policy 0, policy_version 837683 (0.0040) [2024-06-25 07:23:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13724696576. Throughput: 0: 42727.0. Samples: 13724765320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:08,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 07:23:10,220][15401] Updated weights for policy 0, policy_version 837693 (0.0033) [2024-06-25 07:23:13,306][15401] Updated weights for policy 0, policy_version 837703 (0.0029) [2024-06-25 07:23:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 13724925952. Throughput: 0: 42891.9. Samples: 13725028460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 07:23:18,051][15401] Updated weights for policy 0, policy_version 837713 (0.0029) [2024-06-25 07:23:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 13725106176. Throughput: 0: 42917.3. Samples: 13725286760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 07:23:21,013][15401] Updated weights for policy 0, policy_version 837723 (0.0027) [2024-06-25 07:23:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 13725351936. Throughput: 0: 42742.1. Samples: 13725406940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 07:23:25,725][15401] Updated weights for policy 0, policy_version 837733 (0.0033) [2024-06-25 07:23:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13725548544. Throughput: 0: 42921.3. Samples: 13725667340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:28,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-25 07:23:28,751][15401] Updated weights for policy 0, policy_version 837743 (0.0028) [2024-06-25 07:23:33,288][15401] Updated weights for policy 0, policy_version 837753 (0.0040) [2024-06-25 07:23:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13725745152. Throughput: 0: 42655.1. Samples: 13725919700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:33,390][15132] Avg episode reward: [(0, '0.126')] [2024-06-25 07:23:36,377][15401] Updated weights for policy 0, policy_version 837763 (0.0034) [2024-06-25 07:23:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13725974528. Throughput: 0: 42766.3. Samples: 13726046940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 07:23:40,915][15401] Updated weights for policy 0, policy_version 837773 (0.0026) [2024-06-25 07:23:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13726187520. Throughput: 0: 42738.3. Samples: 13726309080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:43,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 07:23:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837780_13726187520.pth... [2024-06-25 07:23:43,502][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837154_13715931136.pth [2024-06-25 07:23:43,969][15401] Updated weights for policy 0, policy_version 837783 (0.0033) [2024-06-25 07:23:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13726384128. Throughput: 0: 42722.2. Samples: 13726567380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 07:23:48,717][15401] Updated weights for policy 0, policy_version 837793 (0.0048) [2024-06-25 07:23:51,569][15401] Updated weights for policy 0, policy_version 837803 (0.0033) [2024-06-25 07:23:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 13726629888. Throughput: 0: 42769.9. Samples: 13726689960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 07:23:56,195][15401] Updated weights for policy 0, policy_version 837813 (0.0050) [2024-06-25 07:23:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13726810112. Throughput: 0: 42725.8. Samples: 13726951120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:23:58,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 07:23:59,583][15401] Updated weights for policy 0, policy_version 837823 (0.0027) [2024-06-25 07:24:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13727023104. Throughput: 0: 42664.5. Samples: 13727206660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:24:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 07:24:03,825][15401] Updated weights for policy 0, policy_version 837833 (0.0031) [2024-06-25 07:24:07,165][15401] Updated weights for policy 0, policy_version 837843 (0.0033) [2024-06-25 07:24:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13727268864. Throughput: 0: 42829.8. Samples: 13727334280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:24:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 07:24:11,392][15401] Updated weights for policy 0, policy_version 837853 (0.0030) [2024-06-25 07:24:13,391][15132] Fps is (10 sec: 44231.9, 60 sec: 42324.7, 300 sec: 42654.1). Total num frames: 13727465472. Throughput: 0: 42767.0. Samples: 13727591900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:24:13,391][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 07:24:14,751][15401] Updated weights for policy 0, policy_version 837863 (0.0041) [2024-06-25 07:24:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13727678464. Throughput: 0: 42801.7. Samples: 13727845780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:24:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 07:24:18,770][15349] Signal inference workers to stop experience collection... (203250 times) [2024-06-25 07:24:18,773][15349] Signal inference workers to resume experience collection... (203250 times) [2024-06-25 07:24:18,788][15401] InferenceWorker_p0-w0: stopping experience collection (203250 times) [2024-06-25 07:24:18,824][15401] InferenceWorker_p0-w0: resuming experience collection (203250 times) [2024-06-25 07:24:18,915][15401] Updated weights for policy 0, policy_version 837873 (0.0032) [2024-06-25 07:24:22,396][15401] Updated weights for policy 0, policy_version 837883 (0.0043) [2024-06-25 07:24:23,390][15132] Fps is (10 sec: 44240.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13727907840. Throughput: 0: 42832.7. Samples: 13727974420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 07:24:23,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 07:24:26,472][15401] Updated weights for policy 0, policy_version 837893 (0.0037) [2024-06-25 07:24:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13728120832. Throughput: 0: 42840.0. Samples: 13728236780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 07:24:30,110][15401] Updated weights for policy 0, policy_version 837903 (0.0036) [2024-06-25 07:24:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 13728333824. Throughput: 0: 42665.3. Samples: 13728487320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 07:24:34,376][15401] Updated weights for policy 0, policy_version 837913 (0.0034) [2024-06-25 07:24:37,788][15401] Updated weights for policy 0, policy_version 837923 (0.0036) [2024-06-25 07:24:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13728530432. Throughput: 0: 42877.3. Samples: 13728619440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:38,392][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 07:24:42,322][15401] Updated weights for policy 0, policy_version 837933 (0.0035) [2024-06-25 07:24:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 13728743424. Throughput: 0: 42820.0. Samples: 13728878020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 07:24:45,294][15401] Updated weights for policy 0, policy_version 837943 (0.0030) [2024-06-25 07:24:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13728972800. Throughput: 0: 42754.2. Samples: 13729130600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:48,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 07:24:49,853][15401] Updated weights for policy 0, policy_version 837953 (0.0029) [2024-06-25 07:24:52,675][15401] Updated weights for policy 0, policy_version 837963 (0.0029) [2024-06-25 07:24:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13729185792. Throughput: 0: 42868.4. Samples: 13729263360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 07:24:57,745][15401] Updated weights for policy 0, policy_version 837973 (0.0028) [2024-06-25 07:24:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13729382400. Throughput: 0: 42899.7. Samples: 13729522340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:24:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 07:25:00,662][15401] Updated weights for policy 0, policy_version 837983 (0.0036) [2024-06-25 07:25:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 13729611776. Throughput: 0: 43041.3. Samples: 13729782740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:03,393][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 07:25:05,162][15401] Updated weights for policy 0, policy_version 837993 (0.0030) [2024-06-25 07:25:08,033][15401] Updated weights for policy 0, policy_version 838003 (0.0038) [2024-06-25 07:25:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42876.2). Total num frames: 13729841152. Throughput: 0: 43071.6. Samples: 13729912640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 07:25:12,617][15401] Updated weights for policy 0, policy_version 838013 (0.0041) [2024-06-25 07:25:13,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43145.3, 300 sec: 42820.5). Total num frames: 13730054144. Throughput: 0: 42948.9. Samples: 13730169480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 07:25:15,506][15401] Updated weights for policy 0, policy_version 838023 (0.0030) [2024-06-25 07:25:18,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 13730250752. Throughput: 0: 43119.5. Samples: 13730427800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:18,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 07:25:20,099][15401] Updated weights for policy 0, policy_version 838033 (0.0048) [2024-06-25 07:25:23,356][15401] Updated weights for policy 0, policy_version 838043 (0.0023) [2024-06-25 07:25:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13730496512. Throughput: 0: 42944.4. Samples: 13730551940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:23,392][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 07:25:27,598][15401] Updated weights for policy 0, policy_version 838053 (0.0034) [2024-06-25 07:25:28,390][15132] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 13730676736. Throughput: 0: 43019.5. Samples: 13730813900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 07:25:30,875][15401] Updated weights for policy 0, policy_version 838063 (0.0035) [2024-06-25 07:25:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13730873344. Throughput: 0: 43190.7. Samples: 13731074180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:33,390][15132] Avg episode reward: [(0, '0.164')] [2024-06-25 07:25:35,241][15401] Updated weights for policy 0, policy_version 838073 (0.0032) [2024-06-25 07:25:38,338][15401] Updated weights for policy 0, policy_version 838083 (0.0037) [2024-06-25 07:25:38,390][15132] Fps is (10 sec: 47514.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 13731151872. Throughput: 0: 43017.0. Samples: 13731199120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:38,392][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 07:25:43,007][15401] Updated weights for policy 0, policy_version 838093 (0.0035) [2024-06-25 07:25:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13731315712. Throughput: 0: 43182.1. Samples: 13731465540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:43,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 07:25:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838094_13731332096.pth... [2024-06-25 07:25:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837467_13721059328.pth [2024-06-25 07:25:44,592][15349] Signal inference workers to stop experience collection... (203300 times) [2024-06-25 07:25:44,624][15401] InferenceWorker_p0-w0: stopping experience collection (203300 times) [2024-06-25 07:25:44,649][15349] Signal inference workers to resume experience collection... (203300 times) [2024-06-25 07:25:44,650][15401] InferenceWorker_p0-w0: resuming experience collection (203300 times) [2024-06-25 07:25:46,211][15401] Updated weights for policy 0, policy_version 838103 (0.0036) [2024-06-25 07:25:48,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13731528704. Throughput: 0: 43036.2. Samples: 13731719260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:48,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 07:25:50,767][15401] Updated weights for policy 0, policy_version 838113 (0.0029) [2024-06-25 07:25:53,391][15132] Fps is (10 sec: 45870.8, 60 sec: 43143.9, 300 sec: 42876.3). Total num frames: 13731774464. Throughput: 0: 42981.4. Samples: 13731846840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:53,391][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 07:25:53,585][15401] Updated weights for policy 0, policy_version 838123 (0.0038) [2024-06-25 07:25:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13731954688. Throughput: 0: 42999.9. Samples: 13732104480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:25:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 07:25:58,509][15401] Updated weights for policy 0, policy_version 838133 (0.0038) [2024-06-25 07:26:01,662][15401] Updated weights for policy 0, policy_version 838143 (0.0039) [2024-06-25 07:26:03,392][15132] Fps is (10 sec: 40954.5, 60 sec: 42871.5, 300 sec: 42931.3). Total num frames: 13732184064. Throughput: 0: 42862.7. Samples: 13732356620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:26:03,392][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 07:26:06,143][15401] Updated weights for policy 0, policy_version 838153 (0.0045) [2024-06-25 07:26:08,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 42932.3). Total num frames: 13732429824. Throughput: 0: 43038.2. Samples: 13732488660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:26:08,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 07:26:09,245][15401] Updated weights for policy 0, policy_version 838163 (0.0037) [2024-06-25 07:26:13,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13732593664. Throughput: 0: 42710.4. Samples: 13732735860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:26:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 07:26:13,898][15401] Updated weights for policy 0, policy_version 838173 (0.0045) [2024-06-25 07:26:16,962][15401] Updated weights for policy 0, policy_version 838183 (0.0030) [2024-06-25 07:26:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 13732823040. Throughput: 0: 42663.1. Samples: 13732994020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:18,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 07:26:21,452][15401] Updated weights for policy 0, policy_version 838193 (0.0039) [2024-06-25 07:26:23,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13733068800. Throughput: 0: 42822.3. Samples: 13733126120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 07:26:24,553][15401] Updated weights for policy 0, policy_version 838203 (0.0040) [2024-06-25 07:26:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13733249024. Throughput: 0: 42420.4. Samples: 13733374460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 07:26:29,228][15401] Updated weights for policy 0, policy_version 838213 (0.0034) [2024-06-25 07:26:32,119][15401] Updated weights for policy 0, policy_version 838223 (0.0026) [2024-06-25 07:26:33,390][15132] Fps is (10 sec: 39320.0, 60 sec: 43144.2, 300 sec: 42876.0). Total num frames: 13733462016. Throughput: 0: 42449.3. Samples: 13733629500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:33,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 07:26:36,787][15401] Updated weights for policy 0, policy_version 838233 (0.0039) [2024-06-25 07:26:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 13733675008. Throughput: 0: 42570.4. Samples: 13733762460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 07:26:39,667][15401] Updated weights for policy 0, policy_version 838243 (0.0033) [2024-06-25 07:26:43,390][15132] Fps is (10 sec: 42600.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 13733888000. Throughput: 0: 42507.2. Samples: 13734017300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 07:26:44,458][15401] Updated weights for policy 0, policy_version 838253 (0.0039) [2024-06-25 07:26:47,633][15401] Updated weights for policy 0, policy_version 838263 (0.0034) [2024-06-25 07:26:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13734100992. Throughput: 0: 42506.7. Samples: 13734269320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 07:26:52,114][15401] Updated weights for policy 0, policy_version 838273 (0.0032) [2024-06-25 07:26:53,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42051.3, 300 sec: 42709.1). Total num frames: 13734297600. Throughput: 0: 42367.1. Samples: 13734395280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:53,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 07:26:55,375][15401] Updated weights for policy 0, policy_version 838283 (0.0039) [2024-06-25 07:26:57,400][15349] Signal inference workers to stop experience collection... (203350 times) [2024-06-25 07:26:57,415][15401] InferenceWorker_p0-w0: stopping experience collection (203350 times) [2024-06-25 07:26:57,518][15349] Signal inference workers to resume experience collection... (203350 times) [2024-06-25 07:26:57,518][15401] InferenceWorker_p0-w0: resuming experience collection (203350 times) [2024-06-25 07:26:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13734526976. Throughput: 0: 42647.1. Samples: 13734654980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:26:58,393][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 07:26:59,563][15401] Updated weights for policy 0, policy_version 838293 (0.0036) [2024-06-25 07:27:03,296][15401] Updated weights for policy 0, policy_version 838303 (0.0038) [2024-06-25 07:27:03,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 13734756352. Throughput: 0: 42439.9. Samples: 13734903820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:03,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 07:27:07,533][15401] Updated weights for policy 0, policy_version 838313 (0.0029) [2024-06-25 07:27:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 13734936576. Throughput: 0: 42492.9. Samples: 13735038300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 07:27:10,894][15401] Updated weights for policy 0, policy_version 838323 (0.0033) [2024-06-25 07:27:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13735182336. Throughput: 0: 42635.2. Samples: 13735293040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:13,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 07:27:15,134][15401] Updated weights for policy 0, policy_version 838333 (0.0039) [2024-06-25 07:27:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13735395328. Throughput: 0: 42515.1. Samples: 13735542660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 07:27:18,604][15401] Updated weights for policy 0, policy_version 838343 (0.0027) [2024-06-25 07:27:22,828][15401] Updated weights for policy 0, policy_version 838353 (0.0047) [2024-06-25 07:27:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 13735575552. Throughput: 0: 42487.5. Samples: 13735674400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 07:27:26,354][15401] Updated weights for policy 0, policy_version 838363 (0.0029) [2024-06-25 07:27:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13735804928. Throughput: 0: 42412.4. Samples: 13735925860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:28,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 07:27:30,403][15401] Updated weights for policy 0, policy_version 838373 (0.0032) [2024-06-25 07:27:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.7, 300 sec: 42765.0). Total num frames: 13736017920. Throughput: 0: 42492.0. Samples: 13736181460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 07:27:34,061][15401] Updated weights for policy 0, policy_version 838383 (0.0029) [2024-06-25 07:27:37,951][15401] Updated weights for policy 0, policy_version 838393 (0.0041) [2024-06-25 07:27:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13736230912. Throughput: 0: 42524.1. Samples: 13736308760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 07:27:41,646][15401] Updated weights for policy 0, policy_version 838403 (0.0041) [2024-06-25 07:27:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13736443904. Throughput: 0: 42413.5. Samples: 13736563580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 07:27:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838406_13736443904.pth... [2024-06-25 07:27:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000837780_13726187520.pth [2024-06-25 07:27:45,948][15401] Updated weights for policy 0, policy_version 838413 (0.0039) [2024-06-25 07:27:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 13736624128. Throughput: 0: 42682.8. Samples: 13736824540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 07:27:49,377][15401] Updated weights for policy 0, policy_version 838423 (0.0043) [2024-06-25 07:27:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 13736869888. Throughput: 0: 42338.6. Samples: 13736943540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 07:27:53,630][15401] Updated weights for policy 0, policy_version 838433 (0.0031) [2024-06-25 07:27:57,338][15401] Updated weights for policy 0, policy_version 838443 (0.0032) [2024-06-25 07:27:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 13737099264. Throughput: 0: 42380.5. Samples: 13737200160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 07:27:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 07:28:01,151][15401] Updated weights for policy 0, policy_version 838453 (0.0033) [2024-06-25 07:28:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 13737263104. Throughput: 0: 42503.1. Samples: 13737455300. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 07:28:05,291][15401] Updated weights for policy 0, policy_version 838463 (0.0040) [2024-06-25 07:28:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13737525248. Throughput: 0: 42279.5. Samples: 13737576980. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:08,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 07:28:08,691][15401] Updated weights for policy 0, policy_version 838473 (0.0037) [2024-06-25 07:28:12,954][15401] Updated weights for policy 0, policy_version 838483 (0.0033) [2024-06-25 07:28:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13737721856. Throughput: 0: 42478.7. Samples: 13737837400. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 07:28:16,812][15401] Updated weights for policy 0, policy_version 838493 (0.0043) [2024-06-25 07:28:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 13737902080. Throughput: 0: 42527.1. Samples: 13738095180. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:18,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 07:28:20,903][15401] Updated weights for policy 0, policy_version 838503 (0.0034) [2024-06-25 07:28:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13738164224. Throughput: 0: 42471.5. Samples: 13738219980. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:23,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 07:28:24,326][15401] Updated weights for policy 0, policy_version 838513 (0.0031) [2024-06-25 07:28:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13738344448. Throughput: 0: 42595.5. Samples: 13738480380. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 07:28:28,554][15401] Updated weights for policy 0, policy_version 838523 (0.0033) [2024-06-25 07:28:28,992][15349] Signal inference workers to stop experience collection... (203400 times) [2024-06-25 07:28:29,040][15401] InferenceWorker_p0-w0: stopping experience collection (203400 times) [2024-06-25 07:28:29,043][15349] Signal inference workers to resume experience collection... (203400 times) [2024-06-25 07:28:29,050][15401] InferenceWorker_p0-w0: resuming experience collection (203400 times) [2024-06-25 07:28:31,750][15401] Updated weights for policy 0, policy_version 838533 (0.0041) [2024-06-25 07:28:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13738557440. Throughput: 0: 42427.9. Samples: 13738733800. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 07:28:36,275][15401] Updated weights for policy 0, policy_version 838543 (0.0043) [2024-06-25 07:28:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 13738803200. Throughput: 0: 42625.8. Samples: 13738861700. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 07:28:39,621][15401] Updated weights for policy 0, policy_version 838553 (0.0039) [2024-06-25 07:28:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13738983424. Throughput: 0: 42638.6. Samples: 13739118900. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 07:28:43,979][15401] Updated weights for policy 0, policy_version 838563 (0.0030) [2024-06-25 07:28:48,099][15401] Updated weights for policy 0, policy_version 838573 (0.0027) [2024-06-25 07:28:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 13739196416. Throughput: 0: 42537.2. Samples: 13739369480. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 07:28:51,755][15401] Updated weights for policy 0, policy_version 838583 (0.0033) [2024-06-25 07:28:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13739442176. Throughput: 0: 42635.2. Samples: 13739495560. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 07:28:55,741][15401] Updated weights for policy 0, policy_version 838593 (0.0052) [2024-06-25 07:28:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 13739606016. Throughput: 0: 42415.1. Samples: 13739746080. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:28:58,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 07:28:59,749][15401] Updated weights for policy 0, policy_version 838603 (0.0046) [2024-06-25 07:29:03,238][15401] Updated weights for policy 0, policy_version 838613 (0.0035) [2024-06-25 07:29:03,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 13739835392. Throughput: 0: 42217.7. Samples: 13739995080. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:03,393][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 07:29:07,431][15401] Updated weights for policy 0, policy_version 838623 (0.0043) [2024-06-25 07:29:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42654.1). Total num frames: 13740048384. Throughput: 0: 42355.7. Samples: 13740125980. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 07:29:10,929][15401] Updated weights for policy 0, policy_version 838633 (0.0030) [2024-06-25 07:29:13,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13740244992. Throughput: 0: 42096.0. Samples: 13740374700. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 07:29:14,990][15401] Updated weights for policy 0, policy_version 838643 (0.0039) [2024-06-25 07:29:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13740474368. Throughput: 0: 42080.4. Samples: 13740627420. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 07:29:18,731][15401] Updated weights for policy 0, policy_version 838653 (0.0023) [2024-06-25 07:29:22,702][15401] Updated weights for policy 0, policy_version 838663 (0.0027) [2024-06-25 07:29:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13740687360. Throughput: 0: 42267.1. Samples: 13740763720. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:23,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 07:29:26,166][15401] Updated weights for policy 0, policy_version 838673 (0.0020) [2024-06-25 07:29:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13740900352. Throughput: 0: 42303.5. Samples: 13741022560. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:28,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-25 07:29:30,263][15401] Updated weights for policy 0, policy_version 838683 (0.0038) [2024-06-25 07:29:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13741129728. Throughput: 0: 42360.5. Samples: 13741275700. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:33,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 07:29:33,651][15401] Updated weights for policy 0, policy_version 838693 (0.0034) [2024-06-25 07:29:37,819][15401] Updated weights for policy 0, policy_version 838703 (0.0043) [2024-06-25 07:29:38,391][15132] Fps is (10 sec: 40954.9, 60 sec: 41778.3, 300 sec: 42598.2). Total num frames: 13741309952. Throughput: 0: 42480.1. Samples: 13741407220. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:38,391][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 07:29:41,564][15401] Updated weights for policy 0, policy_version 838713 (0.0035) [2024-06-25 07:29:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13741539328. Throughput: 0: 42545.4. Samples: 13741660620. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 07:29:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838717_13741539328.pth... [2024-06-25 07:29:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838094_13731332096.pth [2024-06-25 07:29:45,627][15401] Updated weights for policy 0, policy_version 838723 (0.0043) [2024-06-25 07:29:47,648][15349] Signal inference workers to stop experience collection... (203450 times) [2024-06-25 07:29:47,649][15349] Signal inference workers to resume experience collection... (203450 times) [2024-06-25 07:29:47,700][15401] InferenceWorker_p0-w0: stopping experience collection (203450 times) [2024-06-25 07:29:47,700][15401] InferenceWorker_p0-w0: resuming experience collection (203450 times) [2024-06-25 07:29:48,389][15132] Fps is (10 sec: 44243.2, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 13741752320. Throughput: 0: 42574.4. Samples: 13741910820. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-06-25 07:29:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 07:29:49,227][15401] Updated weights for policy 0, policy_version 838733 (0.0033) [2024-06-25 07:29:53,242][15401] Updated weights for policy 0, policy_version 838743 (0.0028) [2024-06-25 07:29:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13741981696. Throughput: 0: 42749.3. Samples: 13742049700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:29:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 07:29:56,936][15401] Updated weights for policy 0, policy_version 838753 (0.0050) [2024-06-25 07:29:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 13742178304. Throughput: 0: 42918.7. Samples: 13742306040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:29:58,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 07:30:00,685][15401] Updated weights for policy 0, policy_version 838763 (0.0025) [2024-06-25 07:30:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 13742407680. Throughput: 0: 42886.8. Samples: 13742557320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:03,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-25 07:30:04,579][15401] Updated weights for policy 0, policy_version 838773 (0.0025) [2024-06-25 07:30:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13742604288. Throughput: 0: 42896.5. Samples: 13742694060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:08,396][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 07:30:08,547][15401] Updated weights for policy 0, policy_version 838783 (0.0039) [2024-06-25 07:30:12,254][15401] Updated weights for policy 0, policy_version 838793 (0.0043) [2024-06-25 07:30:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 13742833664. Throughput: 0: 42799.3. Samples: 13742948520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 07:30:16,315][15401] Updated weights for policy 0, policy_version 838803 (0.0027) [2024-06-25 07:30:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 13743046656. Throughput: 0: 42720.6. Samples: 13743198120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 07:30:19,927][15401] Updated weights for policy 0, policy_version 838813 (0.0037) [2024-06-25 07:30:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13743243264. Throughput: 0: 42657.2. Samples: 13743326740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 07:30:23,961][15401] Updated weights for policy 0, policy_version 838823 (0.0036) [2024-06-25 07:30:27,541][15401] Updated weights for policy 0, policy_version 838833 (0.0033) [2024-06-25 07:30:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13743472640. Throughput: 0: 42701.8. Samples: 13743582200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 07:30:31,482][15401] Updated weights for policy 0, policy_version 838843 (0.0025) [2024-06-25 07:30:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 13743652864. Throughput: 0: 42940.3. Samples: 13743843140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:33,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 07:30:35,039][15401] Updated weights for policy 0, policy_version 838853 (0.0041) [2024-06-25 07:30:38,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42599.4, 300 sec: 42542.9). Total num frames: 13743865856. Throughput: 0: 42450.8. Samples: 13743959980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 07:30:39,419][15401] Updated weights for policy 0, policy_version 838863 (0.0041) [2024-06-25 07:30:42,875][15401] Updated weights for policy 0, policy_version 838873 (0.0030) [2024-06-25 07:30:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13744111616. Throughput: 0: 42521.8. Samples: 13744219520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 07:30:47,101][15401] Updated weights for policy 0, policy_version 838883 (0.0038) [2024-06-25 07:30:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42487.5). Total num frames: 13744308224. Throughput: 0: 42731.4. Samples: 13744480240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:48,395][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 07:30:50,460][15401] Updated weights for policy 0, policy_version 838893 (0.0038) [2024-06-25 07:30:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13744521216. Throughput: 0: 42302.2. Samples: 13744597660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 07:30:54,696][15401] Updated weights for policy 0, policy_version 838903 (0.0029) [2024-06-25 07:30:58,049][15401] Updated weights for policy 0, policy_version 838913 (0.0023) [2024-06-25 07:30:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 13744750592. Throughput: 0: 42547.0. Samples: 13744863140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:30:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 07:31:02,426][15401] Updated weights for policy 0, policy_version 838923 (0.0025) [2024-06-25 07:31:03,392][15132] Fps is (10 sec: 42587.0, 60 sec: 42323.3, 300 sec: 42431.4). Total num frames: 13744947200. Throughput: 0: 42796.9. Samples: 13745124100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:03,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 07:31:05,551][15401] Updated weights for policy 0, policy_version 838933 (0.0038) [2024-06-25 07:31:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13745160192. Throughput: 0: 42547.2. Samples: 13745241360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 07:31:10,203][15401] Updated weights for policy 0, policy_version 838943 (0.0041) [2024-06-25 07:31:11,126][15349] Signal inference workers to stop experience collection... (203500 times) [2024-06-25 07:31:11,127][15349] Signal inference workers to resume experience collection... (203500 times) [2024-06-25 07:31:11,154][15401] InferenceWorker_p0-w0: stopping experience collection (203500 times) [2024-06-25 07:31:11,155][15401] InferenceWorker_p0-w0: resuming experience collection (203500 times) [2024-06-25 07:31:13,366][15401] Updated weights for policy 0, policy_version 838953 (0.0042) [2024-06-25 07:31:13,390][15132] Fps is (10 sec: 45887.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13745405952. Throughput: 0: 42716.9. Samples: 13745504460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 07:31:18,130][15401] Updated weights for policy 0, policy_version 838963 (0.0030) [2024-06-25 07:31:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 13745569792. Throughput: 0: 42549.3. Samples: 13745757860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 07:31:21,126][15401] Updated weights for policy 0, policy_version 838973 (0.0040) [2024-06-25 07:31:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13745799168. Throughput: 0: 42621.2. Samples: 13745877940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:23,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 07:31:25,612][15401] Updated weights for policy 0, policy_version 838983 (0.0026) [2024-06-25 07:31:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42598.5). Total num frames: 13746028544. Throughput: 0: 42623.1. Samples: 13746137560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:28,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 07:31:29,034][15401] Updated weights for policy 0, policy_version 838993 (0.0026) [2024-06-25 07:31:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13746208768. Throughput: 0: 42539.6. Samples: 13746394520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 07:31:33,479][15401] Updated weights for policy 0, policy_version 839003 (0.0035) [2024-06-25 07:31:36,850][15401] Updated weights for policy 0, policy_version 839013 (0.0031) [2024-06-25 07:31:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13746438144. Throughput: 0: 42587.2. Samples: 13746514080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 07:31:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 07:31:41,098][15401] Updated weights for policy 0, policy_version 839023 (0.0029) [2024-06-25 07:31:43,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 13746667520. Throughput: 0: 42547.6. Samples: 13746777880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:31:43,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 07:31:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839030_13746667520.pth... [2024-06-25 07:31:43,443][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838406_13736443904.pth [2024-06-25 07:31:44,299][15401] Updated weights for policy 0, policy_version 839033 (0.0036) [2024-06-25 07:31:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 13746847744. Throughput: 0: 42460.9. Samples: 13747034720. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:31:48,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 07:31:48,829][15401] Updated weights for policy 0, policy_version 839043 (0.0027) [2024-06-25 07:31:51,830][15401] Updated weights for policy 0, policy_version 839053 (0.0038) [2024-06-25 07:31:53,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13747093504. Throughput: 0: 42610.1. Samples: 13747158820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:31:53,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 07:31:56,474][15401] Updated weights for policy 0, policy_version 839063 (0.0029) [2024-06-25 07:31:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13747290112. Throughput: 0: 42553.8. Samples: 13747419380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:31:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 07:31:59,415][15401] Updated weights for policy 0, policy_version 839073 (0.0044) [2024-06-25 07:32:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42054.2, 300 sec: 42487.3). Total num frames: 13747470336. Throughput: 0: 42687.6. Samples: 13747678800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 07:32:04,046][15401] Updated weights for policy 0, policy_version 839083 (0.0033) [2024-06-25 07:32:07,307][15401] Updated weights for policy 0, policy_version 839093 (0.0029) [2024-06-25 07:32:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13747732480. Throughput: 0: 42624.5. Samples: 13747796040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 07:32:11,626][15401] Updated weights for policy 0, policy_version 839103 (0.0035) [2024-06-25 07:32:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13747945472. Throughput: 0: 42697.7. Samples: 13748058960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 07:32:15,197][15401] Updated weights for policy 0, policy_version 839113 (0.0031) [2024-06-25 07:32:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13748125696. Throughput: 0: 42674.3. Samples: 13748314860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:18,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 07:32:19,539][15401] Updated weights for policy 0, policy_version 839123 (0.0025) [2024-06-25 07:32:22,700][15401] Updated weights for policy 0, policy_version 839133 (0.0028) [2024-06-25 07:32:23,336][15349] Signal inference workers to stop experience collection... (203550 times) [2024-06-25 07:32:23,336][15349] Signal inference workers to resume experience collection... (203550 times) [2024-06-25 07:32:23,384][15401] InferenceWorker_p0-w0: stopping experience collection (203550 times) [2024-06-25 07:32:23,384][15401] InferenceWorker_p0-w0: resuming experience collection (203550 times) [2024-06-25 07:32:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13748371456. Throughput: 0: 42775.5. Samples: 13748438980. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:23,390][15132] Avg episode reward: [(0, '0.288')] [2024-06-25 07:32:27,167][15401] Updated weights for policy 0, policy_version 839143 (0.0042) [2024-06-25 07:32:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13748568064. Throughput: 0: 42749.0. Samples: 13748701480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:28,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 07:32:30,170][15401] Updated weights for policy 0, policy_version 839153 (0.0034) [2024-06-25 07:32:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13748781056. Throughput: 0: 42673.7. Samples: 13748955040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:33,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-25 07:32:34,723][15401] Updated weights for policy 0, policy_version 839163 (0.0052) [2024-06-25 07:32:37,885][15401] Updated weights for policy 0, policy_version 839173 (0.0032) [2024-06-25 07:32:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13749010432. Throughput: 0: 42572.4. Samples: 13749074580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 07:32:42,491][15401] Updated weights for policy 0, policy_version 839183 (0.0042) [2024-06-25 07:32:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 13749190656. Throughput: 0: 42564.3. Samples: 13749334780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:43,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 07:32:45,554][15401] Updated weights for policy 0, policy_version 839193 (0.0023) [2024-06-25 07:32:48,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42870.5, 300 sec: 42542.7). Total num frames: 13749420032. Throughput: 0: 42382.8. Samples: 13749586080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:48,391][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 07:32:50,333][15401] Updated weights for policy 0, policy_version 839203 (0.0045) [2024-06-25 07:32:53,129][15401] Updated weights for policy 0, policy_version 839213 (0.0047) [2024-06-25 07:32:53,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13749665792. Throughput: 0: 42686.7. Samples: 13749716940. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 07:32:58,091][15401] Updated weights for policy 0, policy_version 839223 (0.0041) [2024-06-25 07:32:58,389][15132] Fps is (10 sec: 40965.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13749829632. Throughput: 0: 42503.2. Samples: 13749971600. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:32:58,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-25 07:33:01,005][15401] Updated weights for policy 0, policy_version 839233 (0.0036) [2024-06-25 07:33:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 13750075392. Throughput: 0: 42369.6. Samples: 13750221500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 07:33:05,871][15401] Updated weights for policy 0, policy_version 839243 (0.0030) [2024-06-25 07:33:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13750288384. Throughput: 0: 42576.0. Samples: 13750354900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 07:33:08,651][15401] Updated weights for policy 0, policy_version 839253 (0.0041) [2024-06-25 07:33:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13750468608. Throughput: 0: 42549.3. Samples: 13750616200. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:13,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 07:33:13,500][15401] Updated weights for policy 0, policy_version 839263 (0.0026) [2024-06-25 07:33:16,340][15401] Updated weights for policy 0, policy_version 839273 (0.0030) [2024-06-25 07:33:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 13750697984. Throughput: 0: 42427.1. Samples: 13750864260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 07:33:21,377][15401] Updated weights for policy 0, policy_version 839283 (0.0037) [2024-06-25 07:33:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13750927360. Throughput: 0: 42767.6. Samples: 13750999120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:23,393][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 07:33:24,009][15401] Updated weights for policy 0, policy_version 839293 (0.0044) [2024-06-25 07:33:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 13751091200. Throughput: 0: 42712.1. Samples: 13751256820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 07:33:28,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 07:33:28,871][15401] Updated weights for policy 0, policy_version 839303 (0.0035) [2024-06-25 07:33:31,363][15401] Updated weights for policy 0, policy_version 839313 (0.0029) [2024-06-25 07:33:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13751353344. Throughput: 0: 42784.2. Samples: 13751511320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 07:33:35,994][15349] Signal inference workers to stop experience collection... (203600 times) [2024-06-25 07:33:35,994][15349] Signal inference workers to resume experience collection... (203600 times) [2024-06-25 07:33:36,035][15401] InferenceWorker_p0-w0: stopping experience collection (203600 times) [2024-06-25 07:33:36,035][15401] InferenceWorker_p0-w0: resuming experience collection (203600 times) [2024-06-25 07:33:36,333][15401] Updated weights for policy 0, policy_version 839323 (0.0034) [2024-06-25 07:33:38,390][15132] Fps is (10 sec: 49152.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13751582720. Throughput: 0: 42863.5. Samples: 13751645800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:38,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 07:33:38,965][15401] Updated weights for policy 0, policy_version 839333 (0.0040) [2024-06-25 07:33:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13751746560. Throughput: 0: 42885.7. Samples: 13751901460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 07:33:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839341_13751762944.pth... [2024-06-25 07:33:43,541][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000838717_13741539328.pth [2024-06-25 07:33:43,866][15401] Updated weights for policy 0, policy_version 839343 (0.0030) [2024-06-25 07:33:46,711][15401] Updated weights for policy 0, policy_version 839353 (0.0050) [2024-06-25 07:33:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42872.3, 300 sec: 42542.8). Total num frames: 13751992320. Throughput: 0: 42928.8. Samples: 13752153300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 07:33:51,750][15401] Updated weights for policy 0, policy_version 839363 (0.0038) [2024-06-25 07:33:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13752205312. Throughput: 0: 42941.2. Samples: 13752287260. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 07:33:54,476][15401] Updated weights for policy 0, policy_version 839373 (0.0027) [2024-06-25 07:33:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 13752401920. Throughput: 0: 42759.1. Samples: 13752540360. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:33:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 07:33:59,723][15401] Updated weights for policy 0, policy_version 839383 (0.0027) [2024-06-25 07:34:02,327][15401] Updated weights for policy 0, policy_version 839393 (0.0037) [2024-06-25 07:34:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13752647680. Throughput: 0: 42604.4. Samples: 13752781460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:03,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 07:34:07,341][15401] Updated weights for policy 0, policy_version 839403 (0.0034) [2024-06-25 07:34:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13752827904. Throughput: 0: 42587.5. Samples: 13752915560. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:08,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 07:34:10,662][15401] Updated weights for policy 0, policy_version 839413 (0.0042) [2024-06-25 07:34:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13753024512. Throughput: 0: 42458.7. Samples: 13753167460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 07:34:15,241][15401] Updated weights for policy 0, policy_version 839423 (0.0032) [2024-06-25 07:34:18,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13753253888. Throughput: 0: 42354.7. Samples: 13753417380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:18,393][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:34:18,553][15401] Updated weights for policy 0, policy_version 839433 (0.0042) [2024-06-25 07:34:22,918][15401] Updated weights for policy 0, policy_version 839443 (0.0035) [2024-06-25 07:34:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13753466880. Throughput: 0: 42314.7. Samples: 13753549960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 07:34:26,095][15401] Updated weights for policy 0, policy_version 839453 (0.0032) [2024-06-25 07:34:28,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 13753663488. Throughput: 0: 42263.1. Samples: 13753803300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:28,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 07:34:30,586][15401] Updated weights for policy 0, policy_version 839463 (0.0028) [2024-06-25 07:34:33,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42709.3). Total num frames: 13753909248. Throughput: 0: 42204.1. Samples: 13754052580. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:33,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 07:34:33,529][15401] Updated weights for policy 0, policy_version 839473 (0.0033) [2024-06-25 07:34:38,131][15401] Updated weights for policy 0, policy_version 839483 (0.0030) [2024-06-25 07:34:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 13754089472. Throughput: 0: 42210.7. Samples: 13754186740. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:34:41,093][15401] Updated weights for policy 0, policy_version 839493 (0.0033) [2024-06-25 07:34:43,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13754318848. Throughput: 0: 42302.3. Samples: 13754443960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 07:34:45,754][15401] Updated weights for policy 0, policy_version 839503 (0.0037) [2024-06-25 07:34:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13754564608. Throughput: 0: 42445.8. Samples: 13754691520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 07:34:48,612][15401] Updated weights for policy 0, policy_version 839513 (0.0034) [2024-06-25 07:34:53,289][15401] Updated weights for policy 0, policy_version 839523 (0.0040) [2024-06-25 07:34:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13754744832. Throughput: 0: 42643.6. Samples: 13754834520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:53,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 07:34:56,107][15401] Updated weights for policy 0, policy_version 839533 (0.0031) [2024-06-25 07:34:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 13754957824. Throughput: 0: 42669.4. Samples: 13755087580. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:34:58,392][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 07:35:00,940][15401] Updated weights for policy 0, policy_version 839543 (0.0029) [2024-06-25 07:35:01,701][15349] Signal inference workers to stop experience collection... (203650 times) [2024-06-25 07:35:01,743][15401] InferenceWorker_p0-w0: stopping experience collection (203650 times) [2024-06-25 07:35:01,822][15349] Signal inference workers to resume experience collection... (203650 times) [2024-06-25 07:35:01,823][15401] InferenceWorker_p0-w0: resuming experience collection (203650 times) [2024-06-25 07:35:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13755187200. Throughput: 0: 42552.1. Samples: 13755332120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:35:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 07:35:04,366][15401] Updated weights for policy 0, policy_version 839553 (0.0022) [2024-06-25 07:35:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13755383808. Throughput: 0: 42692.5. Samples: 13755471120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:35:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 07:35:08,923][15401] Updated weights for policy 0, policy_version 839563 (0.0034) [2024-06-25 07:35:12,101][15401] Updated weights for policy 0, policy_version 839573 (0.0024) [2024-06-25 07:35:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13755580416. Throughput: 0: 42570.2. Samples: 13755718960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:35:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 07:35:16,727][15401] Updated weights for policy 0, policy_version 839583 (0.0038) [2024-06-25 07:35:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 13755842560. Throughput: 0: 42649.0. Samples: 13755971680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-06-25 07:35:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 07:35:19,615][15401] Updated weights for policy 0, policy_version 839593 (0.0038) [2024-06-25 07:35:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13756006400. Throughput: 0: 42642.3. Samples: 13756105640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 07:35:24,295][15401] Updated weights for policy 0, policy_version 839603 (0.0038) [2024-06-25 07:35:27,475][15401] Updated weights for policy 0, policy_version 839613 (0.0042) [2024-06-25 07:35:28,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13756235776. Throughput: 0: 42432.8. Samples: 13756353440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:28,394][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 07:35:31,887][15401] Updated weights for policy 0, policy_version 839623 (0.0032) [2024-06-25 07:35:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42600.0, 300 sec: 42709.4). Total num frames: 13756465152. Throughput: 0: 42679.0. Samples: 13756612080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 07:35:35,066][15401] Updated weights for policy 0, policy_version 839633 (0.0029) [2024-06-25 07:35:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13756645376. Throughput: 0: 42399.9. Samples: 13756742520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 07:35:39,574][15401] Updated weights for policy 0, policy_version 839643 (0.0024) [2024-06-25 07:35:42,774][15401] Updated weights for policy 0, policy_version 839653 (0.0038) [2024-06-25 07:35:43,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 13756891136. Throughput: 0: 42331.1. Samples: 13756992580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:43,401][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 07:35:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839654_13756891136.pth... [2024-06-25 07:35:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839030_13746667520.pth [2024-06-25 07:35:47,544][15401] Updated weights for policy 0, policy_version 839663 (0.0030) [2024-06-25 07:35:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 13757087744. Throughput: 0: 42659.6. Samples: 13757251800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:48,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 07:35:50,383][15401] Updated weights for policy 0, policy_version 839673 (0.0030) [2024-06-25 07:35:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13757284352. Throughput: 0: 42340.0. Samples: 13757376420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 07:35:55,148][15401] Updated weights for policy 0, policy_version 839683 (0.0032) [2024-06-25 07:35:57,850][15401] Updated weights for policy 0, policy_version 839693 (0.0032) [2024-06-25 07:35:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42709.9). Total num frames: 13757546496. Throughput: 0: 42413.8. Samples: 13757627580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:35:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 07:36:02,805][15401] Updated weights for policy 0, policy_version 839703 (0.0036) [2024-06-25 07:36:03,132][15349] Signal inference workers to stop experience collection... (203700 times) [2024-06-25 07:36:03,132][15349] Signal inference workers to resume experience collection... (203700 times) [2024-06-25 07:36:03,174][15401] InferenceWorker_p0-w0: stopping experience collection (203700 times) [2024-06-25 07:36:03,174][15401] InferenceWorker_p0-w0: resuming experience collection (203700 times) [2024-06-25 07:36:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13757726720. Throughput: 0: 42703.0. Samples: 13757893320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 07:36:05,413][15401] Updated weights for policy 0, policy_version 839713 (0.0024) [2024-06-25 07:36:08,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 13757906944. Throughput: 0: 42398.6. Samples: 13758013580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 07:36:10,610][15401] Updated weights for policy 0, policy_version 839723 (0.0032) [2024-06-25 07:36:13,065][15401] Updated weights for policy 0, policy_version 839733 (0.0038) [2024-06-25 07:36:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13758185472. Throughput: 0: 42651.6. Samples: 13758272760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:13,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-25 07:36:18,273][15401] Updated weights for policy 0, policy_version 839743 (0.0039) [2024-06-25 07:36:18,393][15132] Fps is (10 sec: 45857.1, 60 sec: 42049.4, 300 sec: 42597.8). Total num frames: 13758365696. Throughput: 0: 42832.3. Samples: 13758539700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:18,394][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 07:36:20,825][15401] Updated weights for policy 0, policy_version 839753 (0.0035) [2024-06-25 07:36:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13758562304. Throughput: 0: 42508.1. Samples: 13758655380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 07:36:25,795][15401] Updated weights for policy 0, policy_version 839763 (0.0028) [2024-06-25 07:36:28,364][15401] Updated weights for policy 0, policy_version 839773 (0.0044) [2024-06-25 07:36:28,389][15132] Fps is (10 sec: 47532.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 13758840832. Throughput: 0: 42822.4. Samples: 13758919480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 07:36:33,199][15401] Updated weights for policy 0, policy_version 839783 (0.0029) [2024-06-25 07:36:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13759004672. Throughput: 0: 42955.4. Samples: 13759184800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:33,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-25 07:36:36,315][15401] Updated weights for policy 0, policy_version 839793 (0.0023) [2024-06-25 07:36:38,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 13759217664. Throughput: 0: 42870.2. Samples: 13759305680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:38,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 07:36:40,739][15401] Updated weights for policy 0, policy_version 839803 (0.0032) [2024-06-25 07:36:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13759463424. Throughput: 0: 43073.3. Samples: 13759565880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:43,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 07:36:43,921][15401] Updated weights for policy 0, policy_version 839813 (0.0042) [2024-06-25 07:36:48,180][15401] Updated weights for policy 0, policy_version 839823 (0.0042) [2024-06-25 07:36:48,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13759660032. Throughput: 0: 42914.7. Samples: 13759824480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 07:36:51,520][15401] Updated weights for policy 0, policy_version 839833 (0.0038) [2024-06-25 07:36:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13759873024. Throughput: 0: 43036.4. Samples: 13759950220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 07:36:56,152][15401] Updated weights for policy 0, policy_version 839843 (0.0043) [2024-06-25 07:36:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13760102400. Throughput: 0: 43202.7. Samples: 13760216880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:36:58,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 07:36:59,110][15401] Updated weights for policy 0, policy_version 839853 (0.0044) [2024-06-25 07:37:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13760299008. Throughput: 0: 42866.1. Samples: 13760468500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:37:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 07:37:03,688][15401] Updated weights for policy 0, policy_version 839863 (0.0034) [2024-06-25 07:37:06,782][15401] Updated weights for policy 0, policy_version 839873 (0.0028) [2024-06-25 07:37:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 13760512000. Throughput: 0: 43204.3. Samples: 13760599580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 07:37:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 07:37:11,112][15401] Updated weights for policy 0, policy_version 839883 (0.0035) [2024-06-25 07:37:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13760724992. Throughput: 0: 43005.3. Samples: 13760854720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 07:37:14,453][15401] Updated weights for policy 0, policy_version 839893 (0.0026) [2024-06-25 07:37:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42874.4, 300 sec: 42598.4). Total num frames: 13760937984. Throughput: 0: 42853.9. Samples: 13761113220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 07:37:18,627][15401] Updated weights for policy 0, policy_version 839903 (0.0038) [2024-06-25 07:37:20,525][15349] Signal inference workers to stop experience collection... (203750 times) [2024-06-25 07:37:20,525][15349] Signal inference workers to resume experience collection... (203750 times) [2024-06-25 07:37:20,541][15401] InferenceWorker_p0-w0: stopping experience collection (203750 times) [2024-06-25 07:37:20,542][15401] InferenceWorker_p0-w0: resuming experience collection (203750 times) [2024-06-25 07:37:22,049][15401] Updated weights for policy 0, policy_version 839913 (0.0040) [2024-06-25 07:37:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 13761167360. Throughput: 0: 43025.4. Samples: 13761241720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 07:37:26,226][15401] Updated weights for policy 0, policy_version 839923 (0.0024) [2024-06-25 07:37:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13761363968. Throughput: 0: 42977.2. Samples: 13761499860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 07:37:29,621][15401] Updated weights for policy 0, policy_version 839933 (0.0042) [2024-06-25 07:37:33,392][15132] Fps is (10 sec: 42589.0, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 13761593344. Throughput: 0: 42924.6. Samples: 13761756180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:33,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 07:37:34,121][15401] Updated weights for policy 0, policy_version 839943 (0.0045) [2024-06-25 07:37:37,768][15401] Updated weights for policy 0, policy_version 839953 (0.0031) [2024-06-25 07:37:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 13761806336. Throughput: 0: 43020.5. Samples: 13761886140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 07:37:41,852][15401] Updated weights for policy 0, policy_version 839963 (0.0039) [2024-06-25 07:37:43,389][15132] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42654.1). Total num frames: 13762002944. Throughput: 0: 42768.5. Samples: 13762141460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 07:37:43,516][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839967_13762019328.pth... [2024-06-25 07:37:43,588][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839341_13751762944.pth [2024-06-25 07:37:45,511][15401] Updated weights for policy 0, policy_version 839973 (0.0039) [2024-06-25 07:37:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13762248704. Throughput: 0: 42699.0. Samples: 13762389960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:37:49,418][15401] Updated weights for policy 0, policy_version 839983 (0.0044) [2024-06-25 07:37:53,121][15401] Updated weights for policy 0, policy_version 839993 (0.0030) [2024-06-25 07:37:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13762445312. Throughput: 0: 42791.7. Samples: 13762525200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 07:37:57,430][15401] Updated weights for policy 0, policy_version 840003 (0.0026) [2024-06-25 07:37:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13762641920. Throughput: 0: 42811.1. Samples: 13762781220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:37:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 07:38:00,614][15401] Updated weights for policy 0, policy_version 840013 (0.0044) [2024-06-25 07:38:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 13762887680. Throughput: 0: 42675.4. Samples: 13763033620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 07:38:04,964][15401] Updated weights for policy 0, policy_version 840023 (0.0030) [2024-06-25 07:38:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13763084288. Throughput: 0: 42792.0. Samples: 13763167360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 07:38:08,525][15401] Updated weights for policy 0, policy_version 840033 (0.0040) [2024-06-25 07:38:12,909][15401] Updated weights for policy 0, policy_version 840043 (0.0028) [2024-06-25 07:38:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13763280896. Throughput: 0: 42674.6. Samples: 13763420220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 07:38:15,985][15401] Updated weights for policy 0, policy_version 840053 (0.0038) [2024-06-25 07:38:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13763510272. Throughput: 0: 42656.8. Samples: 13763675640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 07:38:20,423][15401] Updated weights for policy 0, policy_version 840063 (0.0024) [2024-06-25 07:38:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13763739648. Throughput: 0: 42652.5. Samples: 13763805500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:38:23,587][15401] Updated weights for policy 0, policy_version 840073 (0.0031) [2024-06-25 07:38:26,162][15349] Signal inference workers to stop experience collection... (203800 times) [2024-06-25 07:38:26,200][15401] InferenceWorker_p0-w0: stopping experience collection (203800 times) [2024-06-25 07:38:26,222][15349] Signal inference workers to resume experience collection... (203800 times) [2024-06-25 07:38:26,223][15401] InferenceWorker_p0-w0: resuming experience collection (203800 times) [2024-06-25 07:38:28,052][15401] Updated weights for policy 0, policy_version 840083 (0.0039) [2024-06-25 07:38:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13763919872. Throughput: 0: 42629.7. Samples: 13764059800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 07:38:31,107][15401] Updated weights for policy 0, policy_version 840093 (0.0035) [2024-06-25 07:38:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 13764149248. Throughput: 0: 42921.0. Samples: 13764321400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 07:38:35,601][15401] Updated weights for policy 0, policy_version 840103 (0.0041) [2024-06-25 07:38:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13764378624. Throughput: 0: 42806.7. Samples: 13764451500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 07:38:38,878][15401] Updated weights for policy 0, policy_version 840113 (0.0037) [2024-06-25 07:38:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13764558848. Throughput: 0: 42673.8. Samples: 13764701540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 07:38:43,482][15401] Updated weights for policy 0, policy_version 840123 (0.0034) [2024-06-25 07:38:47,031][15401] Updated weights for policy 0, policy_version 840133 (0.0035) [2024-06-25 07:38:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13764788224. Throughput: 0: 42694.3. Samples: 13764954860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:48,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 07:38:51,155][15401] Updated weights for policy 0, policy_version 840143 (0.0036) [2024-06-25 07:38:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13765001216. Throughput: 0: 42623.9. Samples: 13765085440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 07:38:54,470][15401] Updated weights for policy 0, policy_version 840153 (0.0030) [2024-06-25 07:38:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13765197824. Throughput: 0: 42630.7. Samples: 13765338600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 07:38:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 07:38:58,650][15401] Updated weights for policy 0, policy_version 840163 (0.0028) [2024-06-25 07:39:02,172][15401] Updated weights for policy 0, policy_version 840173 (0.0040) [2024-06-25 07:39:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13765427200. Throughput: 0: 42889.4. Samples: 13765605660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:03,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 07:39:06,148][15401] Updated weights for policy 0, policy_version 840183 (0.0027) [2024-06-25 07:39:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13765640192. Throughput: 0: 42783.0. Samples: 13765730740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:08,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 07:39:09,879][15401] Updated weights for policy 0, policy_version 840193 (0.0021) [2024-06-25 07:39:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13765853184. Throughput: 0: 42857.4. Samples: 13765988380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 07:39:13,792][15401] Updated weights for policy 0, policy_version 840203 (0.0038) [2024-06-25 07:39:17,503][15401] Updated weights for policy 0, policy_version 840213 (0.0023) [2024-06-25 07:39:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13766082560. Throughput: 0: 42802.5. Samples: 13766247520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 07:39:21,546][15401] Updated weights for policy 0, policy_version 840223 (0.0033) [2024-06-25 07:39:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13766295552. Throughput: 0: 42828.5. Samples: 13766378780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 07:39:25,376][15401] Updated weights for policy 0, policy_version 840233 (0.0032) [2024-06-25 07:39:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 13766508544. Throughput: 0: 42905.8. Samples: 13766632300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:28,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 07:39:29,075][15401] Updated weights for policy 0, policy_version 840243 (0.0029) [2024-06-25 07:39:33,012][15401] Updated weights for policy 0, policy_version 840253 (0.0032) [2024-06-25 07:39:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13766737920. Throughput: 0: 43061.8. Samples: 13766892640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:33,392][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 07:39:36,590][15401] Updated weights for policy 0, policy_version 840263 (0.0032) [2024-06-25 07:39:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13766950912. Throughput: 0: 43043.6. Samples: 13767022400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:38,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 07:39:40,527][15401] Updated weights for policy 0, policy_version 840273 (0.0038) [2024-06-25 07:39:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13767131136. Throughput: 0: 43018.6. Samples: 13767274440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:43,390][15132] Avg episode reward: [(0, '0.883')] [2024-06-25 07:39:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840279_13767131136.pth... [2024-06-25 07:39:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839654_13756891136.pth [2024-06-25 07:39:43,593][15349] Signal inference workers to stop experience collection... (203850 times) [2024-06-25 07:39:43,628][15401] InferenceWorker_p0-w0: stopping experience collection (203850 times) [2024-06-25 07:39:43,653][15349] Signal inference workers to resume experience collection... (203850 times) [2024-06-25 07:39:43,654][15401] InferenceWorker_p0-w0: resuming experience collection (203850 times) [2024-06-25 07:39:44,198][15401] Updated weights for policy 0, policy_version 840283 (0.0029) [2024-06-25 07:39:48,021][15401] Updated weights for policy 0, policy_version 840293 (0.0035) [2024-06-25 07:39:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 13767376896. Throughput: 0: 42719.5. Samples: 13767528140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:48,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 07:39:52,037][15401] Updated weights for policy 0, policy_version 840303 (0.0026) [2024-06-25 07:39:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13767589888. Throughput: 0: 42991.2. Samples: 13767665340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 07:39:55,455][15401] Updated weights for policy 0, policy_version 840313 (0.0027) [2024-06-25 07:39:58,392][15132] Fps is (10 sec: 39323.1, 60 sec: 42870.1, 300 sec: 42653.6). Total num frames: 13767770112. Throughput: 0: 42764.3. Samples: 13767912860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:39:58,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 07:39:59,624][15401] Updated weights for policy 0, policy_version 840323 (0.0036) [2024-06-25 07:40:03,063][15401] Updated weights for policy 0, policy_version 840333 (0.0033) [2024-06-25 07:40:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13768032256. Throughput: 0: 42733.0. Samples: 13768170500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:03,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 07:40:07,310][15401] Updated weights for policy 0, policy_version 840343 (0.0034) [2024-06-25 07:40:08,389][15132] Fps is (10 sec: 45884.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13768228864. Throughput: 0: 42747.1. Samples: 13768302400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 07:40:10,750][15401] Updated weights for policy 0, policy_version 840353 (0.0036) [2024-06-25 07:40:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13768425472. Throughput: 0: 42680.5. Samples: 13768552920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 07:40:15,359][15401] Updated weights for policy 0, policy_version 840363 (0.0031) [2024-06-25 07:40:18,263][15401] Updated weights for policy 0, policy_version 840373 (0.0032) [2024-06-25 07:40:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13768671232. Throughput: 0: 42622.3. Samples: 13768810640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:18,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 07:40:22,947][15401] Updated weights for policy 0, policy_version 840383 (0.0028) [2024-06-25 07:40:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13768835072. Throughput: 0: 42659.7. Samples: 13768942080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 07:40:25,823][15401] Updated weights for policy 0, policy_version 840393 (0.0043) [2024-06-25 07:40:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13769080832. Throughput: 0: 42717.8. Samples: 13769196740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:28,390][15132] Avg episode reward: [(0, '0.167')] [2024-06-25 07:40:30,584][15401] Updated weights for policy 0, policy_version 840403 (0.0030) [2024-06-25 07:40:33,390][15132] Fps is (10 sec: 47512.1, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 13769310208. Throughput: 0: 42700.7. Samples: 13769449580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:33,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 07:40:33,456][15401] Updated weights for policy 0, policy_version 840413 (0.0042) [2024-06-25 07:40:38,164][15401] Updated weights for policy 0, policy_version 840423 (0.0031) [2024-06-25 07:40:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 13769490432. Throughput: 0: 42603.2. Samples: 13769582480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 07:40:41,121][15401] Updated weights for policy 0, policy_version 840433 (0.0033) [2024-06-25 07:40:43,389][15132] Fps is (10 sec: 40961.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13769719808. Throughput: 0: 42688.1. Samples: 13769833740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 07:40:45,846][15401] Updated weights for policy 0, policy_version 840443 (0.0037) [2024-06-25 07:40:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 13769932800. Throughput: 0: 42648.9. Samples: 13770089700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 07:40:48,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 07:40:49,272][15401] Updated weights for policy 0, policy_version 840453 (0.0042) [2024-06-25 07:40:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13770113024. Throughput: 0: 42440.9. Samples: 13770212240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:40:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 07:40:54,023][15401] Updated weights for policy 0, policy_version 840463 (0.0038) [2024-06-25 07:40:57,096][15401] Updated weights for policy 0, policy_version 840473 (0.0034) [2024-06-25 07:40:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43419.1, 300 sec: 42876.1). Total num frames: 13770375168. Throughput: 0: 42630.1. Samples: 13770471280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:40:58,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 07:41:01,572][15401] Updated weights for policy 0, policy_version 840483 (0.0033) [2024-06-25 07:41:01,824][15349] Signal inference workers to stop experience collection... (203900 times) [2024-06-25 07:41:01,857][15401] InferenceWorker_p0-w0: stopping experience collection (203900 times) [2024-06-25 07:41:01,885][15349] Signal inference workers to resume experience collection... (203900 times) [2024-06-25 07:41:01,885][15401] InferenceWorker_p0-w0: resuming experience collection (203900 times) [2024-06-25 07:41:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 13770571776. Throughput: 0: 42609.0. Samples: 13770728040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 07:41:05,272][15401] Updated weights for policy 0, policy_version 840493 (0.0030) [2024-06-25 07:41:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13770768384. Throughput: 0: 42336.3. Samples: 13770847220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 07:41:09,458][15401] Updated weights for policy 0, policy_version 840503 (0.0038) [2024-06-25 07:41:12,996][15401] Updated weights for policy 0, policy_version 840513 (0.0032) [2024-06-25 07:41:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.6). Total num frames: 13770981376. Throughput: 0: 42416.0. Samples: 13771105460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 07:41:17,076][15401] Updated weights for policy 0, policy_version 840523 (0.0034) [2024-06-25 07:41:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 13771194368. Throughput: 0: 42523.7. Samples: 13771363140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 07:41:20,565][15401] Updated weights for policy 0, policy_version 840533 (0.0043) [2024-06-25 07:41:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13771423744. Throughput: 0: 42435.0. Samples: 13771492060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 07:41:24,683][15401] Updated weights for policy 0, policy_version 840543 (0.0026) [2024-06-25 07:41:28,084][15401] Updated weights for policy 0, policy_version 840553 (0.0041) [2024-06-25 07:41:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13771620352. Throughput: 0: 42475.0. Samples: 13771745120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 07:41:32,315][15401] Updated weights for policy 0, policy_version 840563 (0.0037) [2024-06-25 07:41:33,391][15132] Fps is (10 sec: 40954.0, 60 sec: 42051.4, 300 sec: 42765.1). Total num frames: 13771833344. Throughput: 0: 42421.7. Samples: 13771998740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:33,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 07:41:36,230][15401] Updated weights for policy 0, policy_version 840573 (0.0029) [2024-06-25 07:41:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13772062720. Throughput: 0: 42599.6. Samples: 13772129220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 07:41:40,373][15401] Updated weights for policy 0, policy_version 840583 (0.0036) [2024-06-25 07:41:43,390][15132] Fps is (10 sec: 40965.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13772242944. Throughput: 0: 42407.5. Samples: 13772379620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:43,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 07:41:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840591_13772242944.pth... [2024-06-25 07:41:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000839967_13762019328.pth [2024-06-25 07:41:43,977][15401] Updated weights for policy 0, policy_version 840593 (0.0029) [2024-06-25 07:41:47,919][15401] Updated weights for policy 0, policy_version 840603 (0.0037) [2024-06-25 07:41:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13772472320. Throughput: 0: 42382.1. Samples: 13772635240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:48,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 07:41:51,601][15401] Updated weights for policy 0, policy_version 840613 (0.0028) [2024-06-25 07:41:53,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13772718080. Throughput: 0: 42547.6. Samples: 13772761860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:53,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 07:41:55,635][15401] Updated weights for policy 0, policy_version 840623 (0.0037) [2024-06-25 07:41:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13772898304. Throughput: 0: 42461.3. Samples: 13773016220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:41:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 07:41:59,375][15401] Updated weights for policy 0, policy_version 840633 (0.0035) [2024-06-25 07:42:03,390][15132] Fps is (10 sec: 36044.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 13773078528. Throughput: 0: 42364.1. Samples: 13773269520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 07:42:03,827][15401] Updated weights for policy 0, policy_version 840643 (0.0028) [2024-06-25 07:42:06,942][15401] Updated weights for policy 0, policy_version 840653 (0.0041) [2024-06-25 07:42:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13773340672. Throughput: 0: 42235.5. Samples: 13773392660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 07:42:11,333][15401] Updated weights for policy 0, policy_version 840663 (0.0022) [2024-06-25 07:42:12,860][15349] Signal inference workers to stop experience collection... (203950 times) [2024-06-25 07:42:12,861][15349] Signal inference workers to resume experience collection... (203950 times) [2024-06-25 07:42:12,903][15401] InferenceWorker_p0-w0: stopping experience collection (203950 times) [2024-06-25 07:42:12,903][15401] InferenceWorker_p0-w0: resuming experience collection (203950 times) [2024-06-25 07:42:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13773537280. Throughput: 0: 42477.5. Samples: 13773656600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 07:42:14,493][15401] Updated weights for policy 0, policy_version 840673 (0.0032) [2024-06-25 07:42:18,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13773717504. Throughput: 0: 42471.5. Samples: 13773909900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 07:42:18,913][15401] Updated weights for policy 0, policy_version 840683 (0.0033) [2024-06-25 07:42:22,120][15401] Updated weights for policy 0, policy_version 840693 (0.0044) [2024-06-25 07:42:23,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13773963264. Throughput: 0: 42312.8. Samples: 13774033300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 07:42:26,843][15401] Updated weights for policy 0, policy_version 840703 (0.0032) [2024-06-25 07:42:28,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42871.3, 300 sec: 42709.7). Total num frames: 13774192640. Throughput: 0: 42574.0. Samples: 13774295460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 07:42:29,920][15401] Updated weights for policy 0, policy_version 840713 (0.0036) [2024-06-25 07:42:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42326.4, 300 sec: 42598.4). Total num frames: 13774372864. Throughput: 0: 42479.6. Samples: 13774546820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 07:42:34,366][15401] Updated weights for policy 0, policy_version 840723 (0.0042) [2024-06-25 07:42:37,375][15401] Updated weights for policy 0, policy_version 840733 (0.0032) [2024-06-25 07:42:38,390][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13774602240. Throughput: 0: 42536.8. Samples: 13774676020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 07:42:38,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 07:42:41,936][15401] Updated weights for policy 0, policy_version 840743 (0.0035) [2024-06-25 07:42:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13774798848. Throughput: 0: 42593.0. Samples: 13774932900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:42:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:42:45,009][15401] Updated weights for policy 0, policy_version 840753 (0.0041) [2024-06-25 07:42:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13775028224. Throughput: 0: 42516.8. Samples: 13775182780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:42:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 07:42:49,404][15401] Updated weights for policy 0, policy_version 840763 (0.0037) [2024-06-25 07:42:52,787][15401] Updated weights for policy 0, policy_version 840773 (0.0036) [2024-06-25 07:42:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13775257600. Throughput: 0: 42732.8. Samples: 13775315640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:42:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 07:42:57,048][15401] Updated weights for policy 0, policy_version 840783 (0.0034) [2024-06-25 07:42:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13775454208. Throughput: 0: 42635.5. Samples: 13775575200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:42:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 07:43:00,406][15401] Updated weights for policy 0, policy_version 840793 (0.0027) [2024-06-25 07:43:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 13775683584. Throughput: 0: 42459.3. Samples: 13775820560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 07:43:04,710][15401] Updated weights for policy 0, policy_version 840803 (0.0032) [2024-06-25 07:43:08,095][15401] Updated weights for policy 0, policy_version 840813 (0.0037) [2024-06-25 07:43:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13775880192. Throughput: 0: 42769.0. Samples: 13775957900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 07:43:12,275][15401] Updated weights for policy 0, policy_version 840823 (0.0039) [2024-06-25 07:43:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13776076800. Throughput: 0: 42532.8. Samples: 13776209420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 07:43:15,931][15401] Updated weights for policy 0, policy_version 840833 (0.0036) [2024-06-25 07:43:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 13776289792. Throughput: 0: 42527.6. Samples: 13776460560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 07:43:19,987][15401] Updated weights for policy 0, policy_version 840843 (0.0026) [2024-06-25 07:43:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13776502784. Throughput: 0: 42601.8. Samples: 13776593100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 07:43:23,627][15401] Updated weights for policy 0, policy_version 840853 (0.0044) [2024-06-25 07:43:27,613][15401] Updated weights for policy 0, policy_version 840863 (0.0032) [2024-06-25 07:43:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.5, 300 sec: 42542.9). Total num frames: 13776699392. Throughput: 0: 42390.7. Samples: 13776840480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 07:43:28,975][15349] Signal inference workers to stop experience collection... (204000 times) [2024-06-25 07:43:28,975][15349] Signal inference workers to resume experience collection... (204000 times) [2024-06-25 07:43:28,998][15401] InferenceWorker_p0-w0: stopping experience collection (204000 times) [2024-06-25 07:43:28,999][15401] InferenceWorker_p0-w0: resuming experience collection (204000 times) [2024-06-25 07:43:31,453][15401] Updated weights for policy 0, policy_version 840873 (0.0033) [2024-06-25 07:43:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13776945152. Throughput: 0: 42593.4. Samples: 13777099480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:33,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 07:43:35,356][15401] Updated weights for policy 0, policy_version 840883 (0.0029) [2024-06-25 07:43:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 13777141760. Throughput: 0: 42629.5. Samples: 13777233960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 07:43:39,095][15401] Updated weights for policy 0, policy_version 840893 (0.0026) [2024-06-25 07:43:43,084][15401] Updated weights for policy 0, policy_version 840903 (0.0040) [2024-06-25 07:43:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13777354752. Throughput: 0: 42403.5. Samples: 13777483360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 07:43:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840903_13777354752.pth... [2024-06-25 07:43:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840279_13767131136.pth [2024-06-25 07:43:46,802][15401] Updated weights for policy 0, policy_version 840913 (0.0028) [2024-06-25 07:43:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13777567744. Throughput: 0: 42620.7. Samples: 13777738500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 07:43:50,723][15401] Updated weights for policy 0, policy_version 840923 (0.0034) [2024-06-25 07:43:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13777780736. Throughput: 0: 42394.0. Samples: 13777865640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 07:43:54,394][15401] Updated weights for policy 0, policy_version 840933 (0.0041) [2024-06-25 07:43:58,328][15401] Updated weights for policy 0, policy_version 840943 (0.0038) [2024-06-25 07:43:58,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13778010112. Throughput: 0: 42589.3. Samples: 13778126040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:43:58,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 07:44:02,026][15401] Updated weights for policy 0, policy_version 840953 (0.0043) [2024-06-25 07:44:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13778223104. Throughput: 0: 42503.9. Samples: 13778373240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 07:44:06,245][15401] Updated weights for policy 0, policy_version 840963 (0.0037) [2024-06-25 07:44:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13778436096. Throughput: 0: 42424.5. Samples: 13778502200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:08,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 07:44:10,079][15401] Updated weights for policy 0, policy_version 840973 (0.0025) [2024-06-25 07:44:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13778632704. Throughput: 0: 42762.0. Samples: 13778764780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 07:44:13,657][15401] Updated weights for policy 0, policy_version 840983 (0.0039) [2024-06-25 07:44:17,623][15401] Updated weights for policy 0, policy_version 840993 (0.0041) [2024-06-25 07:44:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13778862080. Throughput: 0: 42533.0. Samples: 13779013460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:18,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 07:44:21,582][15401] Updated weights for policy 0, policy_version 841003 (0.0032) [2024-06-25 07:44:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13779058688. Throughput: 0: 42415.6. Samples: 13779142660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 07:44:25,125][15401] Updated weights for policy 0, policy_version 841013 (0.0032) [2024-06-25 07:44:28,391][15132] Fps is (10 sec: 42590.9, 60 sec: 43143.3, 300 sec: 42542.6). Total num frames: 13779288064. Throughput: 0: 42599.7. Samples: 13779400420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 07:44:28,392][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 07:44:29,251][15401] Updated weights for policy 0, policy_version 841023 (0.0040) [2024-06-25 07:44:32,838][15401] Updated weights for policy 0, policy_version 841033 (0.0036) [2024-06-25 07:44:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13779484672. Throughput: 0: 42472.1. Samples: 13779649740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 07:44:36,898][15401] Updated weights for policy 0, policy_version 841043 (0.0032) [2024-06-25 07:44:38,389][15132] Fps is (10 sec: 40967.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13779697664. Throughput: 0: 42489.4. Samples: 13779777660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 07:44:40,349][15401] Updated weights for policy 0, policy_version 841053 (0.0034) [2024-06-25 07:44:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42543.2). Total num frames: 13779927040. Throughput: 0: 42560.2. Samples: 13780041160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:43,402][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 07:44:44,370][15401] Updated weights for policy 0, policy_version 841063 (0.0038) [2024-06-25 07:44:47,902][15401] Updated weights for policy 0, policy_version 841073 (0.0029) [2024-06-25 07:44:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13780140032. Throughput: 0: 42600.5. Samples: 13780290260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:48,399][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 07:44:51,681][15349] Signal inference workers to stop experience collection... (204050 times) [2024-06-25 07:44:51,682][15349] Signal inference workers to resume experience collection... (204050 times) [2024-06-25 07:44:51,707][15401] InferenceWorker_p0-w0: stopping experience collection (204050 times) [2024-06-25 07:44:51,707][15401] InferenceWorker_p0-w0: resuming experience collection (204050 times) [2024-06-25 07:44:52,338][15401] Updated weights for policy 0, policy_version 841083 (0.0032) [2024-06-25 07:44:53,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 13780353024. Throughput: 0: 42791.5. Samples: 13780427820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 07:44:55,530][15401] Updated weights for policy 0, policy_version 841093 (0.0039) [2024-06-25 07:44:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 13780549632. Throughput: 0: 42637.8. Samples: 13780683480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:44:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 07:45:00,029][15401] Updated weights for policy 0, policy_version 841103 (0.0036) [2024-06-25 07:45:03,258][15401] Updated weights for policy 0, policy_version 841113 (0.0038) [2024-06-25 07:45:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13780795392. Throughput: 0: 42611.8. Samples: 13780931000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 07:45:07,870][15401] Updated weights for policy 0, policy_version 841123 (0.0026) [2024-06-25 07:45:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13780992000. Throughput: 0: 42689.3. Samples: 13781063680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 07:45:10,851][15401] Updated weights for policy 0, policy_version 841133 (0.0034) [2024-06-25 07:45:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 13781188608. Throughput: 0: 42628.7. Samples: 13781318640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 07:45:15,552][15401] Updated weights for policy 0, policy_version 841143 (0.0029) [2024-06-25 07:45:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13781417984. Throughput: 0: 42690.3. Samples: 13781570800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:18,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 07:45:18,780][15401] Updated weights for policy 0, policy_version 841153 (0.0041) [2024-06-25 07:45:23,139][15401] Updated weights for policy 0, policy_version 841163 (0.0040) [2024-06-25 07:45:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13781630976. Throughput: 0: 42795.6. Samples: 13781703460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 07:45:26,326][15401] Updated weights for policy 0, policy_version 841173 (0.0037) [2024-06-25 07:45:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42326.5, 300 sec: 42431.8). Total num frames: 13781827584. Throughput: 0: 42450.8. Samples: 13781951440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:28,396][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 07:45:30,741][15401] Updated weights for policy 0, policy_version 841183 (0.0029) [2024-06-25 07:45:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13782056960. Throughput: 0: 42689.3. Samples: 13782211280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:33,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 07:45:34,202][15401] Updated weights for policy 0, policy_version 841193 (0.0036) [2024-06-25 07:45:38,239][15401] Updated weights for policy 0, policy_version 841203 (0.0044) [2024-06-25 07:45:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 13782269952. Throughput: 0: 42532.4. Samples: 13782341780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 07:45:42,237][15401] Updated weights for policy 0, policy_version 841213 (0.0023) [2024-06-25 07:45:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 13782466560. Throughput: 0: 42485.8. Samples: 13782595340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 07:45:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841215_13782466560.pth... [2024-06-25 07:45:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840591_13772242944.pth [2024-06-25 07:45:46,138][15401] Updated weights for policy 0, policy_version 841223 (0.0029) [2024-06-25 07:45:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13782695936. Throughput: 0: 42522.8. Samples: 13782844520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:48,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 07:45:50,406][15401] Updated weights for policy 0, policy_version 841233 (0.0030) [2024-06-25 07:45:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13782892544. Throughput: 0: 42463.1. Samples: 13782974520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 07:45:53,811][15401] Updated weights for policy 0, policy_version 841243 (0.0036) [2024-06-25 07:45:58,041][15401] Updated weights for policy 0, policy_version 841253 (0.0037) [2024-06-25 07:45:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 13783105536. Throughput: 0: 42421.0. Samples: 13783227580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:45:58,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-25 07:46:01,401][15401] Updated weights for policy 0, policy_version 841263 (0.0038) [2024-06-25 07:46:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 13783318528. Throughput: 0: 42672.3. Samples: 13783491060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:46:03,390][15132] Avg episode reward: [(0, '0.223')] [2024-06-25 07:46:05,554][15401] Updated weights for policy 0, policy_version 841273 (0.0042) [2024-06-25 07:46:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 13783531520. Throughput: 0: 42493.2. Samples: 13783615660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:46:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 07:46:09,135][15401] Updated weights for policy 0, policy_version 841283 (0.0029) [2024-06-25 07:46:11,355][15349] Signal inference workers to stop experience collection... (204100 times) [2024-06-25 07:46:11,407][15401] InferenceWorker_p0-w0: stopping experience collection (204100 times) [2024-06-25 07:46:11,411][15349] Signal inference workers to resume experience collection... (204100 times) [2024-06-25 07:46:11,418][15401] InferenceWorker_p0-w0: resuming experience collection (204100 times) [2024-06-25 07:46:13,212][15401] Updated weights for policy 0, policy_version 841293 (0.0033) [2024-06-25 07:46:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13783760896. Throughput: 0: 42606.3. Samples: 13783868720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:46:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 07:46:16,885][15401] Updated weights for policy 0, policy_version 841303 (0.0025) [2024-06-25 07:46:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 13783941120. Throughput: 0: 42625.3. Samples: 13784129420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 07:46:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 07:46:20,713][15401] Updated weights for policy 0, policy_version 841313 (0.0042) [2024-06-25 07:46:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 13784170496. Throughput: 0: 42390.6. Samples: 13784249360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 07:46:24,486][15401] Updated weights for policy 0, policy_version 841323 (0.0047) [2024-06-25 07:46:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42543.1). Total num frames: 13784383488. Throughput: 0: 42405.8. Samples: 13784503600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:28,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 07:46:28,775][15401] Updated weights for policy 0, policy_version 841333 (0.0035) [2024-06-25 07:46:32,132][15401] Updated weights for policy 0, policy_version 841343 (0.0027) [2024-06-25 07:46:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13784596480. Throughput: 0: 42604.7. Samples: 13784761740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 07:46:36,162][15401] Updated weights for policy 0, policy_version 841353 (0.0041) [2024-06-25 07:46:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13784809472. Throughput: 0: 42528.4. Samples: 13784888300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 07:46:40,075][15401] Updated weights for policy 0, policy_version 841363 (0.0044) [2024-06-25 07:46:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13785006080. Throughput: 0: 42529.1. Samples: 13785141400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:43,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 07:46:44,062][15401] Updated weights for policy 0, policy_version 841373 (0.0033) [2024-06-25 07:46:47,782][15401] Updated weights for policy 0, policy_version 841383 (0.0032) [2024-06-25 07:46:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.6, 300 sec: 42487.0). Total num frames: 13785251840. Throughput: 0: 42560.4. Samples: 13785406380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:48,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 07:46:51,855][15401] Updated weights for policy 0, policy_version 841393 (0.0034) [2024-06-25 07:46:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13785432064. Throughput: 0: 42678.3. Samples: 13785536180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 07:46:55,238][15401] Updated weights for policy 0, policy_version 841403 (0.0031) [2024-06-25 07:46:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13785661440. Throughput: 0: 42681.8. Samples: 13785789400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:46:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 07:46:59,279][15401] Updated weights for policy 0, policy_version 841413 (0.0039) [2024-06-25 07:47:02,871][15401] Updated weights for policy 0, policy_version 841423 (0.0028) [2024-06-25 07:47:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 13785874432. Throughput: 0: 42610.2. Samples: 13786046880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:47:07,180][15401] Updated weights for policy 0, policy_version 841433 (0.0038) [2024-06-25 07:47:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13786071040. Throughput: 0: 42839.2. Samples: 13786177120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 07:47:10,636][15401] Updated weights for policy 0, policy_version 841443 (0.0041) [2024-06-25 07:47:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13786300416. Throughput: 0: 42738.7. Samples: 13786426840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:13,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 07:47:14,961][15401] Updated weights for policy 0, policy_version 841453 (0.0038) [2024-06-25 07:47:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13786513408. Throughput: 0: 42574.2. Samples: 13786677580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 07:47:18,552][15401] Updated weights for policy 0, policy_version 841463 (0.0032) [2024-06-25 07:47:19,328][15349] Signal inference workers to stop experience collection... (204150 times) [2024-06-25 07:47:19,362][15401] InferenceWorker_p0-w0: stopping experience collection (204150 times) [2024-06-25 07:47:19,444][15349] Signal inference workers to resume experience collection... (204150 times) [2024-06-25 07:47:19,445][15401] InferenceWorker_p0-w0: resuming experience collection (204150 times) [2024-06-25 07:47:22,844][15401] Updated weights for policy 0, policy_version 841473 (0.0034) [2024-06-25 07:47:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 13786710016. Throughput: 0: 42461.2. Samples: 13786799060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:23,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 07:47:26,138][15401] Updated weights for policy 0, policy_version 841483 (0.0026) [2024-06-25 07:47:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13786955776. Throughput: 0: 42663.7. Samples: 13787061260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 07:47:30,444][15401] Updated weights for policy 0, policy_version 841493 (0.0029) [2024-06-25 07:47:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13787152384. Throughput: 0: 42490.7. Samples: 13787318360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 07:47:33,759][15401] Updated weights for policy 0, policy_version 841503 (0.0036) [2024-06-25 07:47:38,103][15401] Updated weights for policy 0, policy_version 841513 (0.0039) [2024-06-25 07:47:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 13787348992. Throughput: 0: 42489.3. Samples: 13787448200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 07:47:41,367][15401] Updated weights for policy 0, policy_version 841523 (0.0027) [2024-06-25 07:47:43,396][15132] Fps is (10 sec: 44208.3, 60 sec: 43140.0, 300 sec: 42597.5). Total num frames: 13787594752. Throughput: 0: 42568.5. Samples: 13787705260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:43,397][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 07:47:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841528_13787594752.pth... [2024-06-25 07:47:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000840903_13777354752.pth [2024-06-25 07:47:45,695][15401] Updated weights for policy 0, policy_version 841533 (0.0035) [2024-06-25 07:47:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 13787791360. Throughput: 0: 42458.9. Samples: 13787957520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:47:48,935][15401] Updated weights for policy 0, policy_version 841543 (0.0036) [2024-06-25 07:47:53,389][15132] Fps is (10 sec: 39347.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13787987968. Throughput: 0: 42408.9. Samples: 13788085520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 07:47:53,567][15401] Updated weights for policy 0, policy_version 841553 (0.0039) [2024-06-25 07:47:56,808][15401] Updated weights for policy 0, policy_version 841563 (0.0039) [2024-06-25 07:47:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13788217344. Throughput: 0: 42563.1. Samples: 13788342180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:47:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 07:48:01,209][15401] Updated weights for policy 0, policy_version 841573 (0.0028) [2024-06-25 07:48:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13788430336. Throughput: 0: 42744.6. Samples: 13788601080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:48:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 07:48:04,336][15401] Updated weights for policy 0, policy_version 841583 (0.0036) [2024-06-25 07:48:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13788626944. Throughput: 0: 42892.5. Samples: 13788729220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 07:48:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 07:48:08,962][15401] Updated weights for policy 0, policy_version 841593 (0.0030) [2024-06-25 07:48:11,980][15401] Updated weights for policy 0, policy_version 841603 (0.0032) [2024-06-25 07:48:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13788856320. Throughput: 0: 42563.8. Samples: 13788976640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 07:48:16,464][15401] Updated weights for policy 0, policy_version 841613 (0.0028) [2024-06-25 07:48:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13789085696. Throughput: 0: 42611.0. Samples: 13789235860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 07:48:19,727][15401] Updated weights for policy 0, policy_version 841623 (0.0046) [2024-06-25 07:48:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13789265920. Throughput: 0: 42592.1. Samples: 13789364840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:23,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 07:48:23,933][15401] Updated weights for policy 0, policy_version 841633 (0.0043) [2024-06-25 07:48:27,207][15401] Updated weights for policy 0, policy_version 841643 (0.0043) [2024-06-25 07:48:28,389][15132] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13789511680. Throughput: 0: 42661.8. Samples: 13789624760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 07:48:31,879][15401] Updated weights for policy 0, policy_version 841653 (0.0037) [2024-06-25 07:48:33,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13789741056. Throughput: 0: 42860.8. Samples: 13789886260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 07:48:34,706][15401] Updated weights for policy 0, policy_version 841663 (0.0031) [2024-06-25 07:48:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13789921280. Throughput: 0: 42921.3. Samples: 13790016980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:38,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 07:48:39,371][15401] Updated weights for policy 0, policy_version 841673 (0.0029) [2024-06-25 07:48:42,447][15401] Updated weights for policy 0, policy_version 841683 (0.0040) [2024-06-25 07:48:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 13790150656. Throughput: 0: 42799.4. Samples: 13790268160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:43,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 07:48:46,974][15401] Updated weights for policy 0, policy_version 841693 (0.0040) [2024-06-25 07:48:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13790380032. Throughput: 0: 42664.0. Samples: 13790520960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 07:48:50,635][15349] Signal inference workers to stop experience collection... (204200 times) [2024-06-25 07:48:50,687][15401] InferenceWorker_p0-w0: stopping experience collection (204200 times) [2024-06-25 07:48:50,750][15349] Signal inference workers to resume experience collection... (204200 times) [2024-06-25 07:48:50,750][15401] InferenceWorker_p0-w0: resuming experience collection (204200 times) [2024-06-25 07:48:50,752][15401] Updated weights for policy 0, policy_version 841703 (0.0032) [2024-06-25 07:48:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42598.7). Total num frames: 13790576640. Throughput: 0: 42819.0. Samples: 13790656080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 07:48:54,439][15401] Updated weights for policy 0, policy_version 841713 (0.0030) [2024-06-25 07:48:58,272][15401] Updated weights for policy 0, policy_version 841723 (0.0029) [2024-06-25 07:48:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13790789632. Throughput: 0: 42889.0. Samples: 13790906640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:48:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 07:49:01,990][15401] Updated weights for policy 0, policy_version 841733 (0.0042) [2024-06-25 07:49:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13791019008. Throughput: 0: 42912.6. Samples: 13791166920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 07:49:06,036][15401] Updated weights for policy 0, policy_version 841743 (0.0038) [2024-06-25 07:49:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 13791215616. Throughput: 0: 42957.3. Samples: 13791297920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 07:49:09,687][15401] Updated weights for policy 0, policy_version 841753 (0.0036) [2024-06-25 07:49:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13791428608. Throughput: 0: 42807.9. Samples: 13791551120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 07:49:13,783][15401] Updated weights for policy 0, policy_version 841763 (0.0050) [2024-06-25 07:49:17,677][15401] Updated weights for policy 0, policy_version 841773 (0.0048) [2024-06-25 07:49:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13791657984. Throughput: 0: 42718.3. Samples: 13791808580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:18,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 07:49:21,330][15401] Updated weights for policy 0, policy_version 841783 (0.0034) [2024-06-25 07:49:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.6). Total num frames: 13791854592. Throughput: 0: 42767.1. Samples: 13791941500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:23,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 07:49:25,212][15401] Updated weights for policy 0, policy_version 841793 (0.0040) [2024-06-25 07:49:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13792067584. Throughput: 0: 42581.8. Samples: 13792184340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:28,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 07:49:28,903][15401] Updated weights for policy 0, policy_version 841803 (0.0028) [2024-06-25 07:49:32,710][15401] Updated weights for policy 0, policy_version 841813 (0.0041) [2024-06-25 07:49:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13792296960. Throughput: 0: 42963.9. Samples: 13792454340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:33,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 07:49:36,502][15401] Updated weights for policy 0, policy_version 841823 (0.0028) [2024-06-25 07:49:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13792477184. Throughput: 0: 42837.0. Samples: 13792583740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 07:49:40,541][15401] Updated weights for policy 0, policy_version 841833 (0.0044) [2024-06-25 07:49:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13792706560. Throughput: 0: 42669.7. Samples: 13792826780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 07:49:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841840_13792706560.pth... [2024-06-25 07:49:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841215_13782466560.pth [2024-06-25 07:49:44,263][15401] Updated weights for policy 0, policy_version 841843 (0.0038) [2024-06-25 07:49:48,380][15401] Updated weights for policy 0, policy_version 841853 (0.0036) [2024-06-25 07:49:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13792919552. Throughput: 0: 42760.9. Samples: 13793091160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:48,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 07:49:52,032][15401] Updated weights for policy 0, policy_version 841863 (0.0036) [2024-06-25 07:49:53,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13793116160. Throughput: 0: 42651.1. Samples: 13793217220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 07:49:55,929][15401] Updated weights for policy 0, policy_version 841873 (0.0035) [2024-06-25 07:49:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13793361920. Throughput: 0: 42560.8. Samples: 13793466360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-25 07:49:58,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 07:49:59,539][15349] Signal inference workers to stop experience collection... (204250 times) [2024-06-25 07:49:59,584][15401] InferenceWorker_p0-w0: stopping experience collection (204250 times) [2024-06-25 07:49:59,651][15349] Signal inference workers to resume experience collection... (204250 times) [2024-06-25 07:49:59,651][15401] InferenceWorker_p0-w0: resuming experience collection (204250 times) [2024-06-25 07:49:59,787][15401] Updated weights for policy 0, policy_version 841883 (0.0035) [2024-06-25 07:50:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13793542144. Throughput: 0: 42668.8. Samples: 13793728680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:03,390][15132] Avg episode reward: [(0, '0.883')] [2024-06-25 07:50:03,614][15401] Updated weights for policy 0, policy_version 841893 (0.0034) [2024-06-25 07:50:07,525][15401] Updated weights for policy 0, policy_version 841903 (0.0047) [2024-06-25 07:50:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13793771520. Throughput: 0: 42424.2. Samples: 13793850580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 07:50:11,371][15401] Updated weights for policy 0, policy_version 841913 (0.0035) [2024-06-25 07:50:13,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13794000896. Throughput: 0: 42623.7. Samples: 13794102400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 07:50:15,164][15401] Updated weights for policy 0, policy_version 841923 (0.0038) [2024-06-25 07:50:18,390][15132] Fps is (10 sec: 40958.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 13794181120. Throughput: 0: 42512.7. Samples: 13794367420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:18,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 07:50:18,903][15401] Updated weights for policy 0, policy_version 841933 (0.0026) [2024-06-25 07:50:22,769][15401] Updated weights for policy 0, policy_version 841943 (0.0036) [2024-06-25 07:50:23,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 13794394112. Throughput: 0: 42236.0. Samples: 13794484460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:23,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 07:50:26,494][15401] Updated weights for policy 0, policy_version 841953 (0.0031) [2024-06-25 07:50:28,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13794639872. Throughput: 0: 42552.1. Samples: 13794741620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 07:50:30,149][15401] Updated weights for policy 0, policy_version 841963 (0.0028) [2024-06-25 07:50:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13794836480. Throughput: 0: 42540.9. Samples: 13795005500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 07:50:34,077][15401] Updated weights for policy 0, policy_version 841973 (0.0046) [2024-06-25 07:50:38,299][15401] Updated weights for policy 0, policy_version 841983 (0.0026) [2024-06-25 07:50:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13795049472. Throughput: 0: 42433.3. Samples: 13795126720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:38,391][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 07:50:42,047][15401] Updated weights for policy 0, policy_version 841993 (0.0029) [2024-06-25 07:50:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13795278848. Throughput: 0: 42541.3. Samples: 13795380720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 07:50:46,284][15401] Updated weights for policy 0, policy_version 842003 (0.0038) [2024-06-25 07:50:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13795459072. Throughput: 0: 42477.8. Samples: 13795640180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:48,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-25 07:50:49,622][15401] Updated weights for policy 0, policy_version 842013 (0.0035) [2024-06-25 07:50:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13795672064. Throughput: 0: 42448.8. Samples: 13795760780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:53,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-25 07:50:53,941][15401] Updated weights for policy 0, policy_version 842023 (0.0033) [2024-06-25 07:50:57,487][15401] Updated weights for policy 0, policy_version 842033 (0.0042) [2024-06-25 07:50:58,396][15132] Fps is (10 sec: 45845.9, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 13795917824. Throughput: 0: 42712.1. Samples: 13796024720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:50:58,396][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 07:51:01,429][15401] Updated weights for policy 0, policy_version 842043 (0.0034) [2024-06-25 07:51:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13796098048. Throughput: 0: 42622.8. Samples: 13796285440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 07:51:05,141][15401] Updated weights for policy 0, policy_version 842053 (0.0033) [2024-06-25 07:51:08,389][15132] Fps is (10 sec: 40986.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13796327424. Throughput: 0: 42740.2. Samples: 13796407660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:08,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 07:51:08,841][15401] Updated weights for policy 0, policy_version 842063 (0.0021) [2024-06-25 07:51:12,542][15401] Updated weights for policy 0, policy_version 842073 (0.0041) [2024-06-25 07:51:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13796556800. Throughput: 0: 43022.4. Samples: 13796677620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 07:51:14,051][15349] Signal inference workers to stop experience collection... (204300 times) [2024-06-25 07:51:14,051][15349] Signal inference workers to resume experience collection... (204300 times) [2024-06-25 07:51:14,066][15401] InferenceWorker_p0-w0: stopping experience collection (204300 times) [2024-06-25 07:51:14,066][15401] InferenceWorker_p0-w0: resuming experience collection (204300 times) [2024-06-25 07:51:16,291][15401] Updated weights for policy 0, policy_version 842083 (0.0027) [2024-06-25 07:51:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13796737024. Throughput: 0: 42985.2. Samples: 13796939840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 07:51:19,962][15401] Updated weights for policy 0, policy_version 842093 (0.0043) [2024-06-25 07:51:23,390][15132] Fps is (10 sec: 42597.2, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 13796982784. Throughput: 0: 43000.4. Samples: 13797061740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 07:51:23,779][15401] Updated weights for policy 0, policy_version 842103 (0.0041) [2024-06-25 07:51:27,445][15401] Updated weights for policy 0, policy_version 842113 (0.0027) [2024-06-25 07:51:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13797212160. Throughput: 0: 43113.4. Samples: 13797320820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 07:51:31,477][15401] Updated weights for policy 0, policy_version 842123 (0.0034) [2024-06-25 07:51:33,391][15132] Fps is (10 sec: 39315.0, 60 sec: 42324.0, 300 sec: 42598.1). Total num frames: 13797376000. Throughput: 0: 43141.4. Samples: 13797581620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:33,392][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 07:51:35,215][15401] Updated weights for policy 0, policy_version 842133 (0.0039) [2024-06-25 07:51:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13797638144. Throughput: 0: 43115.1. Samples: 13797700960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 07:51:39,262][15401] Updated weights for policy 0, policy_version 842143 (0.0031) [2024-06-25 07:51:43,197][15401] Updated weights for policy 0, policy_version 842154 (0.0031) [2024-06-25 07:51:43,390][15132] Fps is (10 sec: 49159.8, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 13797867520. Throughput: 0: 43014.8. Samples: 13797960120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 07:51:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842155_13797867520.pth... [2024-06-25 07:51:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841528_13787594752.pth [2024-06-25 07:51:47,730][15401] Updated weights for policy 0, policy_version 842164 (0.0031) [2024-06-25 07:51:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13798031360. Throughput: 0: 43018.7. Samples: 13798221280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 07:51:48,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 07:51:50,772][15401] Updated weights for policy 0, policy_version 842174 (0.0043) [2024-06-25 07:51:53,390][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13798277120. Throughput: 0: 43108.3. Samples: 13798347540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:51:53,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-25 07:51:55,468][15401] Updated weights for policy 0, policy_version 842184 (0.0042) [2024-06-25 07:51:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 13798490112. Throughput: 0: 42871.3. Samples: 13798606840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:51:58,390][15132] Avg episode reward: [(0, '0.283')] [2024-06-25 07:51:58,725][15401] Updated weights for policy 0, policy_version 842194 (0.0033) [2024-06-25 07:52:03,074][15401] Updated weights for policy 0, policy_version 842204 (0.0039) [2024-06-25 07:52:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13798686720. Throughput: 0: 42793.0. Samples: 13798865520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 07:52:06,687][15401] Updated weights for policy 0, policy_version 842214 (0.0029) [2024-06-25 07:52:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 13798932480. Throughput: 0: 42828.5. Samples: 13798989020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 07:52:10,665][15401] Updated weights for policy 0, policy_version 842224 (0.0037) [2024-06-25 07:52:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13799145472. Throughput: 0: 42901.5. Samples: 13799251380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 07:52:14,190][15401] Updated weights for policy 0, policy_version 842234 (0.0022) [2024-06-25 07:52:18,203][15401] Updated weights for policy 0, policy_version 842244 (0.0034) [2024-06-25 07:52:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 13799342080. Throughput: 0: 42853.7. Samples: 13799509960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:18,394][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 07:52:21,967][15401] Updated weights for policy 0, policy_version 842254 (0.0042) [2024-06-25 07:52:23,391][15132] Fps is (10 sec: 42591.6, 60 sec: 43143.6, 300 sec: 42764.8). Total num frames: 13799571456. Throughput: 0: 42908.0. Samples: 13799631880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:23,391][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:52:25,586][15401] Updated weights for policy 0, policy_version 842264 (0.0029) [2024-06-25 07:52:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13799768064. Throughput: 0: 43046.0. Samples: 13799897180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 07:52:29,361][15401] Updated weights for policy 0, policy_version 842274 (0.0039) [2024-06-25 07:52:33,390][15132] Fps is (10 sec: 39327.3, 60 sec: 43145.9, 300 sec: 42765.0). Total num frames: 13799964672. Throughput: 0: 42934.7. Samples: 13800153340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 07:52:33,475][15401] Updated weights for policy 0, policy_version 842284 (0.0040) [2024-06-25 07:52:36,176][15349] Signal inference workers to stop experience collection... (204350 times) [2024-06-25 07:52:36,180][15349] Signal inference workers to resume experience collection... (204350 times) [2024-06-25 07:52:36,220][15401] InferenceWorker_p0-w0: stopping experience collection (204350 times) [2024-06-25 07:52:36,220][15401] InferenceWorker_p0-w0: resuming experience collection (204350 times) [2024-06-25 07:52:36,838][15401] Updated weights for policy 0, policy_version 842294 (0.0038) [2024-06-25 07:52:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 13800210432. Throughput: 0: 42977.4. Samples: 13800281520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 07:52:41,028][15401] Updated weights for policy 0, policy_version 842304 (0.0036) [2024-06-25 07:52:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 13800407040. Throughput: 0: 42900.1. Samples: 13800537340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:43,390][15132] Avg episode reward: [(0, '0.903')] [2024-06-25 07:52:44,442][15401] Updated weights for policy 0, policy_version 842314 (0.0025) [2024-06-25 07:52:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13800620032. Throughput: 0: 42719.6. Samples: 13800787900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:48,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 07:52:48,478][15401] Updated weights for policy 0, policy_version 842324 (0.0036) [2024-06-25 07:52:52,226][15401] Updated weights for policy 0, policy_version 842334 (0.0046) [2024-06-25 07:52:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13800849408. Throughput: 0: 42951.2. Samples: 13800921820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 07:52:56,369][15401] Updated weights for policy 0, policy_version 842344 (0.0033) [2024-06-25 07:52:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13801046016. Throughput: 0: 42859.4. Samples: 13801180060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:52:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 07:52:59,726][15401] Updated weights for policy 0, policy_version 842354 (0.0026) [2024-06-25 07:53:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13801275392. Throughput: 0: 42694.6. Samples: 13801431220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 07:53:03,920][15401] Updated weights for policy 0, policy_version 842364 (0.0037) [2024-06-25 07:53:07,591][15401] Updated weights for policy 0, policy_version 842374 (0.0026) [2024-06-25 07:53:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13801472000. Throughput: 0: 42875.6. Samples: 13801561220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 07:53:11,486][15401] Updated weights for policy 0, policy_version 842384 (0.0043) [2024-06-25 07:53:13,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 13801684992. Throughput: 0: 42711.9. Samples: 13801819320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:13,393][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 07:53:15,224][15401] Updated weights for policy 0, policy_version 842394 (0.0037) [2024-06-25 07:53:18,392][15132] Fps is (10 sec: 45864.4, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 13801930752. Throughput: 0: 42475.1. Samples: 13802064820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:18,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 07:53:19,622][15401] Updated weights for policy 0, policy_version 842404 (0.0038) [2024-06-25 07:53:23,110][15401] Updated weights for policy 0, policy_version 842414 (0.0029) [2024-06-25 07:53:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42326.4, 300 sec: 42709.5). Total num frames: 13802110976. Throughput: 0: 42629.4. Samples: 13802199840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 07:53:27,359][15401] Updated weights for policy 0, policy_version 842424 (0.0037) [2024-06-25 07:53:28,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13802307584. Throughput: 0: 42678.2. Samples: 13802457860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 07:53:30,731][15401] Updated weights for policy 0, policy_version 842434 (0.0025) [2024-06-25 07:53:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13802553344. Throughput: 0: 42630.7. Samples: 13802706280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 07:53:34,857][15401] Updated weights for policy 0, policy_version 842444 (0.0037) [2024-06-25 07:53:38,221][15401] Updated weights for policy 0, policy_version 842454 (0.0038) [2024-06-25 07:53:38,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13802782720. Throughput: 0: 42726.2. Samples: 13802844500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 07:53:38,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 07:53:42,833][15401] Updated weights for policy 0, policy_version 842464 (0.0041) [2024-06-25 07:53:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13802946560. Throughput: 0: 42583.0. Samples: 13803096300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:53:43,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 07:53:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842465_13802946560.pth... [2024-06-25 07:53:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000841840_13792706560.pth [2024-06-25 07:53:45,932][15401] Updated weights for policy 0, policy_version 842474 (0.0035) [2024-06-25 07:53:48,396][15132] Fps is (10 sec: 42571.0, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 13803208704. Throughput: 0: 42443.8. Samples: 13803341460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:53:48,396][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 07:53:50,539][15401] Updated weights for policy 0, policy_version 842484 (0.0025) [2024-06-25 07:53:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 13803372544. Throughput: 0: 42623.2. Samples: 13803479260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:53:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 07:53:53,879][15401] Updated weights for policy 0, policy_version 842494 (0.0039) [2024-06-25 07:53:58,167][15401] Updated weights for policy 0, policy_version 842504 (0.0025) [2024-06-25 07:53:58,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13803601920. Throughput: 0: 42500.5. Samples: 13803731740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:53:58,390][15132] Avg episode reward: [(0, '0.206')] [2024-06-25 07:54:01,214][15349] Signal inference workers to stop experience collection... (204400 times) [2024-06-25 07:54:01,268][15401] InferenceWorker_p0-w0: stopping experience collection (204400 times) [2024-06-25 07:54:01,277][15349] Signal inference workers to resume experience collection... (204400 times) [2024-06-25 07:54:01,282][15401] InferenceWorker_p0-w0: resuming experience collection (204400 times) [2024-06-25 07:54:01,429][15401] Updated weights for policy 0, policy_version 842514 (0.0037) [2024-06-25 07:54:03,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13803847680. Throughput: 0: 42523.1. Samples: 13803978260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 07:54:05,600][15401] Updated weights for policy 0, policy_version 842524 (0.0029) [2024-06-25 07:54:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13804027904. Throughput: 0: 42670.1. Samples: 13804120000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:08,396][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 07:54:09,267][15401] Updated weights for policy 0, policy_version 842534 (0.0035) [2024-06-25 07:54:13,261][15401] Updated weights for policy 0, policy_version 842544 (0.0034) [2024-06-25 07:54:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 13804240896. Throughput: 0: 42476.1. Samples: 13804369280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:13,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 07:54:16,875][15401] Updated weights for policy 0, policy_version 842554 (0.0028) [2024-06-25 07:54:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 13804486656. Throughput: 0: 42611.1. Samples: 13804623780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 07:54:20,938][15401] Updated weights for policy 0, policy_version 842564 (0.0032) [2024-06-25 07:54:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13804666880. Throughput: 0: 42574.1. Samples: 13804760340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 07:54:24,187][15401] Updated weights for policy 0, policy_version 842574 (0.0046) [2024-06-25 07:54:28,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13804879872. Throughput: 0: 42631.2. Samples: 13805014700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 07:54:28,597][15401] Updated weights for policy 0, policy_version 842584 (0.0039) [2024-06-25 07:54:31,908][15401] Updated weights for policy 0, policy_version 842594 (0.0031) [2024-06-25 07:54:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13805125632. Throughput: 0: 42913.2. Samples: 13805272280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 07:54:36,530][15401] Updated weights for policy 0, policy_version 842604 (0.0038) [2024-06-25 07:54:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13805322240. Throughput: 0: 42825.3. Samples: 13805406400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 07:54:39,613][15401] Updated weights for policy 0, policy_version 842614 (0.0033) [2024-06-25 07:54:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13805535232. Throughput: 0: 42797.4. Samples: 13805657620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 07:54:43,985][15401] Updated weights for policy 0, policy_version 842624 (0.0040) [2024-06-25 07:54:47,219][15401] Updated weights for policy 0, policy_version 842634 (0.0028) [2024-06-25 07:54:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 13805764608. Throughput: 0: 43025.0. Samples: 13805914380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 07:54:51,505][15401] Updated weights for policy 0, policy_version 842644 (0.0033) [2024-06-25 07:54:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 13805977600. Throughput: 0: 42809.8. Samples: 13806046540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:53,393][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 07:54:55,210][15401] Updated weights for policy 0, policy_version 842654 (0.0041) [2024-06-25 07:54:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13806157824. Throughput: 0: 42983.9. Samples: 13806303560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:54:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 07:54:59,075][15401] Updated weights for policy 0, policy_version 842664 (0.0034) [2024-06-25 07:55:02,789][15401] Updated weights for policy 0, policy_version 842674 (0.0039) [2024-06-25 07:55:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13806387200. Throughput: 0: 43019.5. Samples: 13806559660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:03,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 07:55:06,519][15401] Updated weights for policy 0, policy_version 842684 (0.0025) [2024-06-25 07:55:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13806616576. Throughput: 0: 42869.4. Samples: 13806689460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 07:55:10,397][15401] Updated weights for policy 0, policy_version 842694 (0.0027) [2024-06-25 07:55:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13806813184. Throughput: 0: 42856.0. Samples: 13806943220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 07:55:14,145][15401] Updated weights for policy 0, policy_version 842704 (0.0039) [2024-06-25 07:55:18,033][15401] Updated weights for policy 0, policy_version 842714 (0.0044) [2024-06-25 07:55:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 13807026176. Throughput: 0: 42866.2. Samples: 13807201260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 07:55:21,966][15401] Updated weights for policy 0, policy_version 842724 (0.0028) [2024-06-25 07:55:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13807255552. Throughput: 0: 42859.6. Samples: 13807335080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 07:55:25,757][15401] Updated weights for policy 0, policy_version 842734 (0.0030) [2024-06-25 07:55:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13807468544. Throughput: 0: 42993.4. Samples: 13807592320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 07:55:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 07:55:29,640][15401] Updated weights for policy 0, policy_version 842744 (0.0025) [2024-06-25 07:55:33,377][15401] Updated weights for policy 0, policy_version 842754 (0.0035) [2024-06-25 07:55:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13807681536. Throughput: 0: 43014.1. Samples: 13807850020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:33,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 07:55:37,156][15401] Updated weights for policy 0, policy_version 842764 (0.0027) [2024-06-25 07:55:37,699][15349] Signal inference workers to stop experience collection... (204450 times) [2024-06-25 07:55:37,700][15349] Signal inference workers to resume experience collection... (204450 times) [2024-06-25 07:55:37,717][15401] InferenceWorker_p0-w0: stopping experience collection (204450 times) [2024-06-25 07:55:37,717][15401] InferenceWorker_p0-w0: resuming experience collection (204450 times) [2024-06-25 07:55:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13807910912. Throughput: 0: 42972.5. Samples: 13807980200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:38,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 07:55:40,964][15401] Updated weights for policy 0, policy_version 842774 (0.0038) [2024-06-25 07:55:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 13808107520. Throughput: 0: 42871.5. Samples: 13808232880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:43,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 07:55:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842780_13808107520.pth... [2024-06-25 07:55:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842155_13797867520.pth [2024-06-25 07:55:44,692][15401] Updated weights for policy 0, policy_version 842784 (0.0041) [2024-06-25 07:55:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13808320512. Throughput: 0: 42953.9. Samples: 13808492580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 07:55:48,950][15401] Updated weights for policy 0, policy_version 842794 (0.0029) [2024-06-25 07:55:52,095][15401] Updated weights for policy 0, policy_version 842804 (0.0048) [2024-06-25 07:55:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42600.1, 300 sec: 42766.0). Total num frames: 13808533504. Throughput: 0: 42860.1. Samples: 13808618160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 07:55:56,578][15401] Updated weights for policy 0, policy_version 842814 (0.0028) [2024-06-25 07:55:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13808746496. Throughput: 0: 42995.2. Samples: 13808878000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:55:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 07:55:59,533][15401] Updated weights for policy 0, policy_version 842824 (0.0032) [2024-06-25 07:56:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13808959488. Throughput: 0: 43021.8. Samples: 13809137240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 07:56:04,282][15401] Updated weights for policy 0, policy_version 842834 (0.0042) [2024-06-25 07:56:07,355][15401] Updated weights for policy 0, policy_version 842844 (0.0026) [2024-06-25 07:56:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13809188864. Throughput: 0: 42934.2. Samples: 13809267120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 07:56:11,825][15401] Updated weights for policy 0, policy_version 842854 (0.0033) [2024-06-25 07:56:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13809385472. Throughput: 0: 42764.9. Samples: 13809516740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 07:56:15,064][15401] Updated weights for policy 0, policy_version 842864 (0.0037) [2024-06-25 07:56:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13809598464. Throughput: 0: 42751.0. Samples: 13809773820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 07:56:19,652][15401] Updated weights for policy 0, policy_version 842874 (0.0037) [2024-06-25 07:56:22,586][15401] Updated weights for policy 0, policy_version 842884 (0.0041) [2024-06-25 07:56:23,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 13809844224. Throughput: 0: 42849.8. Samples: 13809908540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:23,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 07:56:27,152][15401] Updated weights for policy 0, policy_version 842894 (0.0040) [2024-06-25 07:56:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42931.9). Total num frames: 13810040832. Throughput: 0: 43062.6. Samples: 13810170600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 07:56:30,084][15401] Updated weights for policy 0, policy_version 842904 (0.0047) [2024-06-25 07:56:33,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13810253824. Throughput: 0: 42861.7. Samples: 13810421360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:33,396][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 07:56:34,676][15401] Updated weights for policy 0, policy_version 842914 (0.0042) [2024-06-25 07:56:37,463][15349] Signal inference workers to stop experience collection... (204500 times) [2024-06-25 07:56:37,503][15401] InferenceWorker_p0-w0: stopping experience collection (204500 times) [2024-06-25 07:56:37,513][15349] Signal inference workers to resume experience collection... (204500 times) [2024-06-25 07:56:37,525][15401] InferenceWorker_p0-w0: resuming experience collection (204500 times) [2024-06-25 07:56:37,690][15401] Updated weights for policy 0, policy_version 842924 (0.0031) [2024-06-25 07:56:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 13810483200. Throughput: 0: 42987.5. Samples: 13810552600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:38,391][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 07:56:42,229][15401] Updated weights for policy 0, policy_version 842934 (0.0033) [2024-06-25 07:56:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 13810679808. Throughput: 0: 43006.1. Samples: 13810813280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 07:56:45,299][15401] Updated weights for policy 0, policy_version 842944 (0.0028) [2024-06-25 07:56:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13810892800. Throughput: 0: 42788.4. Samples: 13811062720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 07:56:49,875][15401] Updated weights for policy 0, policy_version 842954 (0.0032) [2024-06-25 07:56:52,858][15401] Updated weights for policy 0, policy_version 842964 (0.0031) [2024-06-25 07:56:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13811138560. Throughput: 0: 42795.6. Samples: 13811192920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 07:56:57,616][15401] Updated weights for policy 0, policy_version 842974 (0.0041) [2024-06-25 07:56:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13811318784. Throughput: 0: 43122.2. Samples: 13811457240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:56:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 07:57:00,565][15401] Updated weights for policy 0, policy_version 842984 (0.0037) [2024-06-25 07:57:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13811531776. Throughput: 0: 43137.0. Samples: 13811714980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:57:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 07:57:05,167][15401] Updated weights for policy 0, policy_version 842994 (0.0040) [2024-06-25 07:57:08,085][15401] Updated weights for policy 0, policy_version 843004 (0.0043) [2024-06-25 07:57:08,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13811793920. Throughput: 0: 43009.0. Samples: 13811843840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:57:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 07:57:12,770][15401] Updated weights for policy 0, policy_version 843014 (0.0042) [2024-06-25 07:57:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13811974144. Throughput: 0: 42938.8. Samples: 13812102840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:57:13,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 07:57:15,559][15401] Updated weights for policy 0, policy_version 843024 (0.0043) [2024-06-25 07:57:18,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 13812170752. Throughput: 0: 43123.1. Samples: 13812361900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 07:57:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 07:57:20,286][15401] Updated weights for policy 0, policy_version 843034 (0.0030) [2024-06-25 07:57:23,145][15401] Updated weights for policy 0, policy_version 843044 (0.0041) [2024-06-25 07:57:23,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43144.5, 300 sec: 42931.3). Total num frames: 13812432896. Throughput: 0: 42943.5. Samples: 13812485160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:23,393][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 07:57:28,174][15401] Updated weights for policy 0, policy_version 843054 (0.0036) [2024-06-25 07:57:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 13812596736. Throughput: 0: 42869.1. Samples: 13812742380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 07:57:30,886][15401] Updated weights for policy 0, policy_version 843064 (0.0023) [2024-06-25 07:57:33,392][15132] Fps is (10 sec: 39321.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 13812826112. Throughput: 0: 42935.1. Samples: 13812994900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:33,393][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 07:57:35,707][15401] Updated weights for policy 0, policy_version 843074 (0.0026) [2024-06-25 07:57:38,390][15132] Fps is (10 sec: 47512.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13813071872. Throughput: 0: 43021.1. Samples: 13813128880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 07:57:38,463][15401] Updated weights for policy 0, policy_version 843084 (0.0036) [2024-06-25 07:57:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13813235712. Throughput: 0: 42829.0. Samples: 13813384540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:43,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 07:57:43,419][15401] Updated weights for policy 0, policy_version 843094 (0.0027) [2024-06-25 07:57:43,554][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843095_13813268480.pth... [2024-06-25 07:57:43,608][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842465_13802946560.pth [2024-06-25 07:57:46,087][15401] Updated weights for policy 0, policy_version 843104 (0.0038) [2024-06-25 07:57:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13813465088. Throughput: 0: 42763.5. Samples: 13813639340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:48,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 07:57:51,213][15401] Updated weights for policy 0, policy_version 843114 (0.0037) [2024-06-25 07:57:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13813710848. Throughput: 0: 42766.1. Samples: 13813768320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 07:57:54,018][15401] Updated weights for policy 0, policy_version 843124 (0.0038) [2024-06-25 07:57:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13813874688. Throughput: 0: 42663.9. Samples: 13814022720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:57:58,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 07:57:59,008][15401] Updated weights for policy 0, policy_version 843134 (0.0036) [2024-06-25 07:58:00,308][15349] Signal inference workers to stop experience collection... (204550 times) [2024-06-25 07:58:00,314][15349] Signal inference workers to resume experience collection... (204550 times) [2024-06-25 07:58:00,327][15401] InferenceWorker_p0-w0: stopping experience collection (204550 times) [2024-06-25 07:58:00,327][15401] InferenceWorker_p0-w0: resuming experience collection (204550 times) [2024-06-25 07:58:01,686][15401] Updated weights for policy 0, policy_version 843144 (0.0043) [2024-06-25 07:58:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13814104064. Throughput: 0: 42529.4. Samples: 13814275720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 07:58:06,581][15401] Updated weights for policy 0, policy_version 843154 (0.0040) [2024-06-25 07:58:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42876.5). Total num frames: 13814333440. Throughput: 0: 42657.0. Samples: 13814404620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 07:58:09,263][15401] Updated weights for policy 0, policy_version 843164 (0.0025) [2024-06-25 07:58:13,396][15132] Fps is (10 sec: 42570.7, 60 sec: 42593.8, 300 sec: 42708.9). Total num frames: 13814530048. Throughput: 0: 42541.8. Samples: 13814657040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:13,396][15132] Avg episode reward: [(0, '0.242')] [2024-06-25 07:58:14,195][15401] Updated weights for policy 0, policy_version 843174 (0.0043) [2024-06-25 07:58:16,917][15401] Updated weights for policy 0, policy_version 843184 (0.0037) [2024-06-25 07:58:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13814743040. Throughput: 0: 42660.9. Samples: 13814914540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 07:58:21,784][15401] Updated weights for policy 0, policy_version 843194 (0.0023) [2024-06-25 07:58:23,390][15132] Fps is (10 sec: 44264.9, 60 sec: 42327.0, 300 sec: 42931.6). Total num frames: 13814972416. Throughput: 0: 42564.5. Samples: 13815044280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 07:58:24,586][15401] Updated weights for policy 0, policy_version 843204 (0.0032) [2024-06-25 07:58:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13815185408. Throughput: 0: 42596.8. Samples: 13815301400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 07:58:29,283][15401] Updated weights for policy 0, policy_version 843214 (0.0027) [2024-06-25 07:58:32,601][15401] Updated weights for policy 0, policy_version 843224 (0.0023) [2024-06-25 07:58:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 13815382016. Throughput: 0: 42600.8. Samples: 13815556380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 07:58:37,135][15401] Updated weights for policy 0, policy_version 843234 (0.0038) [2024-06-25 07:58:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.7, 300 sec: 42931.3). Total num frames: 13815611392. Throughput: 0: 42608.9. Samples: 13815685820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:38,392][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 07:58:40,534][15401] Updated weights for policy 0, policy_version 843244 (0.0037) [2024-06-25 07:58:43,393][15132] Fps is (10 sec: 44219.9, 60 sec: 43141.7, 300 sec: 42765.4). Total num frames: 13815824384. Throughput: 0: 42587.0. Samples: 13815939300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:43,394][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 07:58:44,551][15401] Updated weights for policy 0, policy_version 843254 (0.0034) [2024-06-25 07:58:48,123][15401] Updated weights for policy 0, policy_version 843264 (0.0046) [2024-06-25 07:58:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 13816037376. Throughput: 0: 42663.9. Samples: 13816195600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 07:58:52,246][15401] Updated weights for policy 0, policy_version 843274 (0.0033) [2024-06-25 07:58:53,390][15132] Fps is (10 sec: 42615.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13816250368. Throughput: 0: 42816.4. Samples: 13816331360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 07:58:55,618][15401] Updated weights for policy 0, policy_version 843284 (0.0032) [2024-06-25 07:58:58,391][15132] Fps is (10 sec: 42591.0, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 13816463360. Throughput: 0: 42794.6. Samples: 13816582600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:58:58,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 07:58:59,765][15401] Updated weights for policy 0, policy_version 843294 (0.0036) [2024-06-25 07:59:03,254][15401] Updated weights for policy 0, policy_version 843304 (0.0024) [2024-06-25 07:59:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 13816692736. Throughput: 0: 42788.9. Samples: 13816840140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:59:03,392][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 07:59:07,531][15401] Updated weights for policy 0, policy_version 843314 (0.0032) [2024-06-25 07:59:08,392][15132] Fps is (10 sec: 42595.9, 60 sec: 42596.7, 300 sec: 42875.7). Total num frames: 13816889344. Throughput: 0: 42739.1. Samples: 13816967640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 07:59:08,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 07:59:11,330][15401] Updated weights for policy 0, policy_version 843324 (0.0029) [2024-06-25 07:59:11,346][15349] Signal inference workers to stop experience collection... (204600 times) [2024-06-25 07:59:11,346][15349] Signal inference workers to resume experience collection... (204600 times) [2024-06-25 07:59:11,391][15401] InferenceWorker_p0-w0: stopping experience collection (204600 times) [2024-06-25 07:59:11,391][15401] InferenceWorker_p0-w0: resuming experience collection (204600 times) [2024-06-25 07:59:13,393][15132] Fps is (10 sec: 42593.1, 60 sec: 43146.6, 300 sec: 42820.0). Total num frames: 13817118720. Throughput: 0: 42813.9. Samples: 13817228180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:13,394][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 07:59:14,979][15401] Updated weights for policy 0, policy_version 843334 (0.0034) [2024-06-25 07:59:18,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 13817315328. Throughput: 0: 42763.5. Samples: 13817480740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 07:59:19,383][15401] Updated weights for policy 0, policy_version 843344 (0.0038) [2024-06-25 07:59:22,700][15401] Updated weights for policy 0, policy_version 843354 (0.0037) [2024-06-25 07:59:23,389][15132] Fps is (10 sec: 42614.1, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 13817544704. Throughput: 0: 42635.7. Samples: 13817604320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:23,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 07:59:26,867][15401] Updated weights for policy 0, policy_version 843364 (0.0037) [2024-06-25 07:59:28,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13817741312. Throughput: 0: 42709.9. Samples: 13817861080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:28,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 07:59:30,202][15401] Updated weights for policy 0, policy_version 843374 (0.0035) [2024-06-25 07:59:33,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 13817937920. Throughput: 0: 42773.3. Samples: 13818120500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:33,393][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 07:59:34,539][15401] Updated weights for policy 0, policy_version 843384 (0.0039) [2024-06-25 07:59:37,995][15401] Updated weights for policy 0, policy_version 843394 (0.0034) [2024-06-25 07:59:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 13818183680. Throughput: 0: 42611.7. Samples: 13818248880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 07:59:42,250][15401] Updated weights for policy 0, policy_version 843404 (0.0050) [2024-06-25 07:59:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42601.2, 300 sec: 42765.0). Total num frames: 13818380288. Throughput: 0: 42636.4. Samples: 13818501160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:43,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 07:59:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843407_13818380288.pth... [2024-06-25 07:59:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000842780_13808107520.pth [2024-06-25 07:59:45,665][15401] Updated weights for policy 0, policy_version 843414 (0.0037) [2024-06-25 07:59:48,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42596.8, 300 sec: 42765.0). Total num frames: 13818593280. Throughput: 0: 42425.8. Samples: 13818749300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:48,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 07:59:49,897][15401] Updated weights for policy 0, policy_version 843424 (0.0042) [2024-06-25 07:59:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13818806272. Throughput: 0: 42531.2. Samples: 13818881440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 07:59:53,507][15401] Updated weights for policy 0, policy_version 843434 (0.0027) [2024-06-25 07:59:57,540][15401] Updated weights for policy 0, policy_version 843444 (0.0034) [2024-06-25 07:59:58,390][15132] Fps is (10 sec: 40967.8, 60 sec: 42326.3, 300 sec: 42765.0). Total num frames: 13819002880. Throughput: 0: 42420.7. Samples: 13819136980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 07:59:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 08:00:01,417][15401] Updated weights for policy 0, policy_version 843454 (0.0032) [2024-06-25 08:00:03,390][15132] Fps is (10 sec: 44233.6, 60 sec: 42599.6, 300 sec: 42820.5). Total num frames: 13819248640. Throughput: 0: 42180.4. Samples: 13819378880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:03,391][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 08:00:05,533][15401] Updated weights for policy 0, policy_version 843464 (0.0042) [2024-06-25 08:00:08,392][15132] Fps is (10 sec: 42590.1, 60 sec: 42325.3, 300 sec: 42764.7). Total num frames: 13819428864. Throughput: 0: 42533.7. Samples: 13819518440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:08,392][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 08:00:08,959][15401] Updated weights for policy 0, policy_version 843474 (0.0047) [2024-06-25 08:00:13,248][15349] Signal inference workers to stop experience collection... (204650 times) [2024-06-25 08:00:13,248][15349] Signal inference workers to resume experience collection... (204650 times) [2024-06-25 08:00:13,257][15401] Updated weights for policy 0, policy_version 843484 (0.0035) [2024-06-25 08:00:13,273][15401] InferenceWorker_p0-w0: stopping experience collection (204650 times) [2024-06-25 08:00:13,273][15401] InferenceWorker_p0-w0: resuming experience collection (204650 times) [2024-06-25 08:00:13,389][15132] Fps is (10 sec: 40962.9, 60 sec: 42327.9, 300 sec: 42820.6). Total num frames: 13819658240. Throughput: 0: 42653.0. Samples: 13819780460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 08:00:16,680][15401] Updated weights for policy 0, policy_version 843494 (0.0034) [2024-06-25 08:00:18,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13819887616. Throughput: 0: 42446.6. Samples: 13820030500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 08:00:20,857][15401] Updated weights for policy 0, policy_version 843504 (0.0044) [2024-06-25 08:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13820084224. Throughput: 0: 42529.2. Samples: 13820162700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 08:00:24,087][15401] Updated weights for policy 0, policy_version 843514 (0.0031) [2024-06-25 08:00:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13820280832. Throughput: 0: 42740.8. Samples: 13820424500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 08:00:28,560][15401] Updated weights for policy 0, policy_version 843524 (0.0032) [2024-06-25 08:00:31,621][15401] Updated weights for policy 0, policy_version 843534 (0.0037) [2024-06-25 08:00:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13820526592. Throughput: 0: 42829.9. Samples: 13820676540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:33,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 08:00:36,413][15401] Updated weights for policy 0, policy_version 843544 (0.0031) [2024-06-25 08:00:38,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 13820739584. Throughput: 0: 42812.5. Samples: 13820808000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:38,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 08:00:39,463][15401] Updated weights for policy 0, policy_version 843554 (0.0043) [2024-06-25 08:00:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13820936192. Throughput: 0: 42741.7. Samples: 13821060340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 08:00:43,875][15401] Updated weights for policy 0, policy_version 843564 (0.0026) [2024-06-25 08:00:47,103][15401] Updated weights for policy 0, policy_version 843574 (0.0033) [2024-06-25 08:00:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 13821165568. Throughput: 0: 42862.0. Samples: 13821307640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:48,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 08:00:51,355][15401] Updated weights for policy 0, policy_version 843584 (0.0040) [2024-06-25 08:00:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13821345792. Throughput: 0: 42712.6. Samples: 13821440400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:00:55,115][15401] Updated weights for policy 0, policy_version 843594 (0.0025) [2024-06-25 08:00:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 13821558784. Throughput: 0: 42553.7. Samples: 13821695380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 08:00:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 08:00:58,918][15401] Updated weights for policy 0, policy_version 843604 (0.0032) [2024-06-25 08:01:02,717][15401] Updated weights for policy 0, policy_version 843614 (0.0039) [2024-06-25 08:01:03,390][15132] Fps is (10 sec: 45873.8, 60 sec: 42598.8, 300 sec: 42765.0). Total num frames: 13821804544. Throughput: 0: 42506.6. Samples: 13821943300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:03,391][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 08:01:07,051][15401] Updated weights for policy 0, policy_version 843624 (0.0032) [2024-06-25 08:01:08,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42054.1, 300 sec: 42598.4). Total num frames: 13821952000. Throughput: 0: 42583.7. Samples: 13822078960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:08,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 08:01:10,376][15401] Updated weights for policy 0, policy_version 843634 (0.0040) [2024-06-25 08:01:13,392][15132] Fps is (10 sec: 40952.6, 60 sec: 42597.0, 300 sec: 42764.7). Total num frames: 13822214144. Throughput: 0: 42274.6. Samples: 13822326940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:13,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 08:01:14,771][15401] Updated weights for policy 0, policy_version 843644 (0.0028) [2024-06-25 08:01:17,929][15401] Updated weights for policy 0, policy_version 843654 (0.0035) [2024-06-25 08:01:18,389][15132] Fps is (10 sec: 49151.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 13822443520. Throughput: 0: 42370.2. Samples: 13822583200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 08:01:22,618][15401] Updated weights for policy 0, policy_version 843664 (0.0049) [2024-06-25 08:01:23,390][15132] Fps is (10 sec: 39329.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13822607360. Throughput: 0: 42393.3. Samples: 13822715700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 08:01:25,593][15401] Updated weights for policy 0, policy_version 843674 (0.0028) [2024-06-25 08:01:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13822853120. Throughput: 0: 42339.2. Samples: 13822965600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 08:01:29,986][15401] Updated weights for policy 0, policy_version 843684 (0.0036) [2024-06-25 08:01:33,331][15401] Updated weights for policy 0, policy_version 843694 (0.0038) [2024-06-25 08:01:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13823082496. Throughput: 0: 42697.7. Samples: 13823229040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 08:01:37,917][15401] Updated weights for policy 0, policy_version 843704 (0.0028) [2024-06-25 08:01:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 13823246336. Throughput: 0: 42514.1. Samples: 13823353540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 08:01:39,528][15349] Signal inference workers to stop experience collection... (204700 times) [2024-06-25 08:01:39,529][15349] Signal inference workers to resume experience collection... (204700 times) [2024-06-25 08:01:39,557][15401] InferenceWorker_p0-w0: stopping experience collection (204700 times) [2024-06-25 08:01:39,557][15401] InferenceWorker_p0-w0: resuming experience collection (204700 times) [2024-06-25 08:01:41,124][15401] Updated weights for policy 0, policy_version 843714 (0.0040) [2024-06-25 08:01:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13823508480. Throughput: 0: 42387.5. Samples: 13823602820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:43,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 08:01:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843720_13823508480.pth... [2024-06-25 08:01:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843095_13813268480.pth [2024-06-25 08:01:45,483][15401] Updated weights for policy 0, policy_version 843724 (0.0030) [2024-06-25 08:01:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13823705088. Throughput: 0: 42805.6. Samples: 13823869540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:48,390][15132] Avg episode reward: [(0, '0.190')] [2024-06-25 08:01:48,739][15401] Updated weights for policy 0, policy_version 843734 (0.0033) [2024-06-25 08:01:53,325][15401] Updated weights for policy 0, policy_version 843744 (0.0039) [2024-06-25 08:01:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13823901696. Throughput: 0: 42468.8. Samples: 13823990060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 08:01:56,446][15401] Updated weights for policy 0, policy_version 843754 (0.0034) [2024-06-25 08:01:58,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 13824163840. Throughput: 0: 42627.6. Samples: 13824245200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:01:58,392][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 08:02:00,922][15401] Updated weights for policy 0, policy_version 843764 (0.0025) [2024-06-25 08:02:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 13824327680. Throughput: 0: 42764.9. Samples: 13824507620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 08:02:04,091][15401] Updated weights for policy 0, policy_version 843774 (0.0043) [2024-06-25 08:02:08,396][15132] Fps is (10 sec: 36030.3, 60 sec: 42866.8, 300 sec: 42541.9). Total num frames: 13824524288. Throughput: 0: 42372.2. Samples: 13824622720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:08,396][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 08:02:08,746][15401] Updated weights for policy 0, policy_version 843784 (0.0027) [2024-06-25 08:02:11,850][15401] Updated weights for policy 0, policy_version 843794 (0.0038) [2024-06-25 08:02:13,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43419.0, 300 sec: 42876.1). Total num frames: 13824819200. Throughput: 0: 42686.2. Samples: 13824886480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:13,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 08:02:16,180][15401] Updated weights for policy 0, policy_version 843804 (0.0037) [2024-06-25 08:02:18,390][15132] Fps is (10 sec: 44265.2, 60 sec: 42052.2, 300 sec: 42487.7). Total num frames: 13824966656. Throughput: 0: 42717.3. Samples: 13825151320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 08:02:19,673][15401] Updated weights for policy 0, policy_version 843814 (0.0032) [2024-06-25 08:02:23,392][15132] Fps is (10 sec: 34398.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 13825163264. Throughput: 0: 42438.6. Samples: 13825263380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:23,392][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 08:02:23,798][15401] Updated weights for policy 0, policy_version 843824 (0.0035) [2024-06-25 08:02:27,442][15401] Updated weights for policy 0, policy_version 843834 (0.0031) [2024-06-25 08:02:28,392][15132] Fps is (10 sec: 49140.3, 60 sec: 43415.9, 300 sec: 42820.6). Total num frames: 13825458176. Throughput: 0: 42985.3. Samples: 13825537260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:28,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 08:02:31,566][15401] Updated weights for policy 0, policy_version 843844 (0.0033) [2024-06-25 08:02:33,390][15132] Fps is (10 sec: 47524.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13825638400. Throughput: 0: 42724.7. Samples: 13825792160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:33,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 08:02:35,066][15401] Updated weights for policy 0, policy_version 843854 (0.0039) [2024-06-25 08:02:35,216][15349] Signal inference workers to stop experience collection... (204750 times) [2024-06-25 08:02:35,220][15349] Signal inference workers to resume experience collection... (204750 times) [2024-06-25 08:02:35,266][15401] InferenceWorker_p0-w0: stopping experience collection (204750 times) [2024-06-25 08:02:35,266][15401] InferenceWorker_p0-w0: resuming experience collection (204750 times) [2024-06-25 08:02:38,389][15132] Fps is (10 sec: 37692.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13825835008. Throughput: 0: 42658.2. Samples: 13825909680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 08:02:39,440][15401] Updated weights for policy 0, policy_version 843864 (0.0038) [2024-06-25 08:02:42,671][15401] Updated weights for policy 0, policy_version 843874 (0.0024) [2024-06-25 08:02:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13826080768. Throughput: 0: 43061.8. Samples: 13826182880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 08:02:46,970][15401] Updated weights for policy 0, policy_version 843884 (0.0037) [2024-06-25 08:02:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13826277376. Throughput: 0: 42783.1. Samples: 13826432860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 08:02:48,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 08:02:50,334][15401] Updated weights for policy 0, policy_version 843894 (0.0026) [2024-06-25 08:02:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13826490368. Throughput: 0: 43035.9. Samples: 13826559060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:02:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 08:02:54,561][15401] Updated weights for policy 0, policy_version 843904 (0.0029) [2024-06-25 08:02:58,183][15401] Updated weights for policy 0, policy_version 843914 (0.0034) [2024-06-25 08:02:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 13826703360. Throughput: 0: 43111.2. Samples: 13826826480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:02:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 08:03:02,096][15401] Updated weights for policy 0, policy_version 843924 (0.0024) [2024-06-25 08:03:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13826916352. Throughput: 0: 42772.5. Samples: 13827076080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 08:03:05,762][15401] Updated weights for policy 0, policy_version 843934 (0.0033) [2024-06-25 08:03:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43695.3, 300 sec: 42765.9). Total num frames: 13827145728. Throughput: 0: 43368.0. Samples: 13827214840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 08:03:09,693][15401] Updated weights for policy 0, policy_version 843944 (0.0036) [2024-06-25 08:03:13,257][15401] Updated weights for policy 0, policy_version 843954 (0.0031) [2024-06-25 08:03:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13827342336. Throughput: 0: 43008.1. Samples: 13827472520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 08:03:17,145][15401] Updated weights for policy 0, policy_version 843964 (0.0036) [2024-06-25 08:03:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13827555328. Throughput: 0: 42927.2. Samples: 13827723880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 08:03:20,883][15401] Updated weights for policy 0, policy_version 843974 (0.0037) [2024-06-25 08:03:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43692.4, 300 sec: 42709.5). Total num frames: 13827784704. Throughput: 0: 43191.5. Samples: 13827853300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 08:03:24,693][15401] Updated weights for policy 0, policy_version 843984 (0.0035) [2024-06-25 08:03:28,389][15132] Fps is (10 sec: 40961.0, 60 sec: 41781.0, 300 sec: 42654.0). Total num frames: 13827964928. Throughput: 0: 42888.2. Samples: 13828112840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 08:03:28,628][15401] Updated weights for policy 0, policy_version 843994 (0.0040) [2024-06-25 08:03:32,197][15401] Updated weights for policy 0, policy_version 844004 (0.0034) [2024-06-25 08:03:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 13828210688. Throughput: 0: 42881.7. Samples: 13828362540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 08:03:36,202][15401] Updated weights for policy 0, policy_version 844014 (0.0029) [2024-06-25 08:03:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42710.0). Total num frames: 13828423680. Throughput: 0: 42957.4. Samples: 13828492140. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 08:03:39,744][15401] Updated weights for policy 0, policy_version 844024 (0.0033) [2024-06-25 08:03:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13828603904. Throughput: 0: 42867.0. Samples: 13828755500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 08:03:43,408][15349] Signal inference workers to stop experience collection... (204800 times) [2024-06-25 08:03:43,409][15349] Signal inference workers to resume experience collection... (204800 times) [2024-06-25 08:03:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844032_13828620288.pth... [2024-06-25 08:03:43,418][15401] InferenceWorker_p0-w0: stopping experience collection (204800 times) [2024-06-25 08:03:43,432][15401] InferenceWorker_p0-w0: resuming experience collection (204800 times) [2024-06-25 08:03:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843407_13818380288.pth [2024-06-25 08:03:43,799][15401] Updated weights for policy 0, policy_version 844034 (0.0035) [2024-06-25 08:03:47,218][15401] Updated weights for policy 0, policy_version 844044 (0.0032) [2024-06-25 08:03:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13828849664. Throughput: 0: 42886.7. Samples: 13829005980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 08:03:51,613][15401] Updated weights for policy 0, policy_version 844054 (0.0025) [2024-06-25 08:03:53,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 13829079040. Throughput: 0: 42758.2. Samples: 13829138960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:53,399][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 08:03:55,267][15401] Updated weights for policy 0, policy_version 844064 (0.0037) [2024-06-25 08:03:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 13829259264. Throughput: 0: 42619.6. Samples: 13829390400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:03:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 08:03:59,164][15401] Updated weights for policy 0, policy_version 844074 (0.0032) [2024-06-25 08:04:02,935][15401] Updated weights for policy 0, policy_version 844084 (0.0034) [2024-06-25 08:04:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13829472256. Throughput: 0: 42662.3. Samples: 13829643680. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 08:04:06,983][15401] Updated weights for policy 0, policy_version 844094 (0.0035) [2024-06-25 08:04:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 13829701632. Throughput: 0: 42655.6. Samples: 13829772800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:08,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 08:04:10,481][15401] Updated weights for policy 0, policy_version 844104 (0.0038) [2024-06-25 08:04:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13829914624. Throughput: 0: 42643.5. Samples: 13830031800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 08:04:14,423][15401] Updated weights for policy 0, policy_version 844114 (0.0031) [2024-06-25 08:04:18,023][15401] Updated weights for policy 0, policy_version 844124 (0.0025) [2024-06-25 08:04:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13830127616. Throughput: 0: 42747.9. Samples: 13830286200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 08:04:22,017][15401] Updated weights for policy 0, policy_version 844134 (0.0037) [2024-06-25 08:04:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13830324224. Throughput: 0: 42836.5. Samples: 13830419780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 08:04:25,714][15401] Updated weights for policy 0, policy_version 844144 (0.0044) [2024-06-25 08:04:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 13830553600. Throughput: 0: 42781.0. Samples: 13830680640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 08:04:29,790][15401] Updated weights for policy 0, policy_version 844154 (0.0042) [2024-06-25 08:04:33,289][15401] Updated weights for policy 0, policy_version 844164 (0.0026) [2024-06-25 08:04:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 13830782976. Throughput: 0: 42690.1. Samples: 13830927040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 08:04:37,385][15401] Updated weights for policy 0, policy_version 844174 (0.0032) [2024-06-25 08:04:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13830963200. Throughput: 0: 42683.3. Samples: 13831059700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-25 08:04:38,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 08:04:41,110][15401] Updated weights for policy 0, policy_version 844184 (0.0031) [2024-06-25 08:04:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 13831176192. Throughput: 0: 42702.9. Samples: 13831312040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:04:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 08:04:45,387][15401] Updated weights for policy 0, policy_version 844194 (0.0041) [2024-06-25 08:04:48,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13831405568. Throughput: 0: 42683.5. Samples: 13831564540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:04:48,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 08:04:48,798][15401] Updated weights for policy 0, policy_version 844204 (0.0037) [2024-06-25 08:04:53,179][15401] Updated weights for policy 0, policy_version 844214 (0.0026) [2024-06-25 08:04:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 13831618560. Throughput: 0: 42714.2. Samples: 13831694940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:04:53,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 08:04:56,474][15401] Updated weights for policy 0, policy_version 844224 (0.0043) [2024-06-25 08:04:58,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 13831831552. Throughput: 0: 42663.4. Samples: 13831951660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:04:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 08:05:00,827][15401] Updated weights for policy 0, policy_version 844234 (0.0025) [2024-06-25 08:05:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 13832028160. Throughput: 0: 42744.6. Samples: 13832209700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 08:05:04,236][15401] Updated weights for policy 0, policy_version 844244 (0.0029) [2024-06-25 08:05:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13832241152. Throughput: 0: 42553.3. Samples: 13832334680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 08:05:08,523][15349] Signal inference workers to stop experience collection... (204850 times) [2024-06-25 08:05:08,553][15401] InferenceWorker_p0-w0: stopping experience collection (204850 times) [2024-06-25 08:05:08,572][15349] Signal inference workers to resume experience collection... (204850 times) [2024-06-25 08:05:08,573][15401] InferenceWorker_p0-w0: resuming experience collection (204850 times) [2024-06-25 08:05:08,575][15401] Updated weights for policy 0, policy_version 844254 (0.0035) [2024-06-25 08:05:12,154][15401] Updated weights for policy 0, policy_version 844264 (0.0041) [2024-06-25 08:05:13,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 13832486912. Throughput: 0: 42483.4. Samples: 13832592400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 08:05:15,938][15401] Updated weights for policy 0, policy_version 844274 (0.0033) [2024-06-25 08:05:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13832683520. Throughput: 0: 42810.7. Samples: 13832853520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 08:05:19,574][15401] Updated weights for policy 0, policy_version 844284 (0.0048) [2024-06-25 08:05:23,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13832896512. Throughput: 0: 42682.1. Samples: 13832980400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:23,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 08:05:23,526][15401] Updated weights for policy 0, policy_version 844294 (0.0028) [2024-06-25 08:05:27,358][15401] Updated weights for policy 0, policy_version 844304 (0.0028) [2024-06-25 08:05:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13833109504. Throughput: 0: 42731.1. Samples: 13833234940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:28,396][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 08:05:31,150][15401] Updated weights for policy 0, policy_version 844314 (0.0025) [2024-06-25 08:05:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13833306112. Throughput: 0: 42980.3. Samples: 13833498560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 08:05:34,983][15401] Updated weights for policy 0, policy_version 844324 (0.0037) [2024-06-25 08:05:38,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13833551872. Throughput: 0: 42854.4. Samples: 13833623380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 08:05:38,464][15401] Updated weights for policy 0, policy_version 844334 (0.0034) [2024-06-25 08:05:42,512][15401] Updated weights for policy 0, policy_version 844344 (0.0029) [2024-06-25 08:05:43,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13833764864. Throughput: 0: 42987.2. Samples: 13833886080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:43,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 08:05:43,462][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844347_13833781248.pth... [2024-06-25 08:05:43,517][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000843720_13823508480.pth [2024-06-25 08:05:46,138][15401] Updated weights for policy 0, policy_version 844354 (0.0032) [2024-06-25 08:05:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 13833961472. Throughput: 0: 42865.3. Samples: 13834138640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 08:05:50,162][15401] Updated weights for policy 0, policy_version 844364 (0.0033) [2024-06-25 08:05:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13834207232. Throughput: 0: 42841.8. Samples: 13834262560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 08:05:53,950][15401] Updated weights for policy 0, policy_version 844374 (0.0030) [2024-06-25 08:05:57,762][15401] Updated weights for policy 0, policy_version 844384 (0.0030) [2024-06-25 08:05:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13834403840. Throughput: 0: 43032.6. Samples: 13834528860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:05:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 08:06:01,530][15401] Updated weights for policy 0, policy_version 844394 (0.0043) [2024-06-25 08:06:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13834600448. Throughput: 0: 42812.1. Samples: 13834780060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 08:06:05,489][15401] Updated weights for policy 0, policy_version 844404 (0.0041) [2024-06-25 08:06:08,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42820.5). Total num frames: 13834846208. Throughput: 0: 42726.6. Samples: 13834903200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:08,392][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 08:06:09,097][15401] Updated weights for policy 0, policy_version 844414 (0.0048) [2024-06-25 08:06:12,712][15349] Signal inference workers to stop experience collection... (204900 times) [2024-06-25 08:06:12,713][15349] Signal inference workers to resume experience collection... (204900 times) [2024-06-25 08:06:12,746][15401] InferenceWorker_p0-w0: stopping experience collection (204900 times) [2024-06-25 08:06:12,747][15401] InferenceWorker_p0-w0: resuming experience collection (204900 times) [2024-06-25 08:06:12,999][15401] Updated weights for policy 0, policy_version 844424 (0.0036) [2024-06-25 08:06:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13835042816. Throughput: 0: 42977.5. Samples: 13835168920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 08:06:16,590][15401] Updated weights for policy 0, policy_version 844434 (0.0023) [2024-06-25 08:06:18,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13835239424. Throughput: 0: 42782.5. Samples: 13835423760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 08:06:21,093][15401] Updated weights for policy 0, policy_version 844444 (0.0040) [2024-06-25 08:06:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 13835485184. Throughput: 0: 42795.4. Samples: 13835549180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 08:06:24,278][15401] Updated weights for policy 0, policy_version 844454 (0.0046) [2024-06-25 08:06:28,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 13835681792. Throughput: 0: 42804.4. Samples: 13835812380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:06:28,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 08:06:29,107][15401] Updated weights for policy 0, policy_version 844464 (0.0038) [2024-06-25 08:06:31,936][15401] Updated weights for policy 0, policy_version 844474 (0.0029) [2024-06-25 08:06:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 13835894784. Throughput: 0: 42699.6. Samples: 13836060120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 08:06:36,618][15401] Updated weights for policy 0, policy_version 844484 (0.0031) [2024-06-25 08:06:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13836124160. Throughput: 0: 42866.2. Samples: 13836191540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:38,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 08:06:39,786][15401] Updated weights for policy 0, policy_version 844494 (0.0036) [2024-06-25 08:06:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13836304384. Throughput: 0: 42628.9. Samples: 13836447160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:43,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 08:06:44,135][15401] Updated weights for policy 0, policy_version 844504 (0.0034) [2024-06-25 08:06:47,479][15401] Updated weights for policy 0, policy_version 844514 (0.0021) [2024-06-25 08:06:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13836550144. Throughput: 0: 42649.4. Samples: 13836699280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:48,390][15132] Avg episode reward: [(0, '0.256')] [2024-06-25 08:06:51,993][15401] Updated weights for policy 0, policy_version 844524 (0.0037) [2024-06-25 08:06:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 13836763136. Throughput: 0: 42885.0. Samples: 13836832920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 08:06:55,085][15401] Updated weights for policy 0, policy_version 844534 (0.0041) [2024-06-25 08:06:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13836943360. Throughput: 0: 42600.0. Samples: 13837085920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:06:58,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 08:06:59,702][15401] Updated weights for policy 0, policy_version 844544 (0.0039) [2024-06-25 08:07:02,868][15401] Updated weights for policy 0, policy_version 844554 (0.0030) [2024-06-25 08:07:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42932.6). Total num frames: 13837189120. Throughput: 0: 42439.4. Samples: 13837333540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:03,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 08:07:07,261][15401] Updated weights for policy 0, policy_version 844564 (0.0031) [2024-06-25 08:07:08,392][15132] Fps is (10 sec: 45865.3, 60 sec: 42598.6, 300 sec: 42653.6). Total num frames: 13837402112. Throughput: 0: 42763.8. Samples: 13837473640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:08,404][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 08:07:10,543][15401] Updated weights for policy 0, policy_version 844574 (0.0046) [2024-06-25 08:07:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13837598720. Throughput: 0: 42514.3. Samples: 13837725420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 08:07:14,967][15401] Updated weights for policy 0, policy_version 844584 (0.0048) [2024-06-25 08:07:18,281][15401] Updated weights for policy 0, policy_version 844594 (0.0032) [2024-06-25 08:07:18,389][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 13837828096. Throughput: 0: 42620.1. Samples: 13837978020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 08:07:22,568][15401] Updated weights for policy 0, policy_version 844604 (0.0034) [2024-06-25 08:07:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 13838024704. Throughput: 0: 42638.8. Samples: 13838110280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 08:07:25,777][15401] Updated weights for policy 0, policy_version 844614 (0.0044) [2024-06-25 08:07:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 13838237696. Throughput: 0: 42550.9. Samples: 13838361960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 08:07:30,295][15401] Updated weights for policy 0, policy_version 844624 (0.0034) [2024-06-25 08:07:30,467][15349] Signal inference workers to stop experience collection... (204950 times) [2024-06-25 08:07:30,470][15349] Signal inference workers to resume experience collection... (204950 times) [2024-06-25 08:07:30,487][15401] InferenceWorker_p0-w0: stopping experience collection (204950 times) [2024-06-25 08:07:30,487][15401] InferenceWorker_p0-w0: resuming experience collection (204950 times) [2024-06-25 08:07:33,309][15401] Updated weights for policy 0, policy_version 844634 (0.0041) [2024-06-25 08:07:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13838483456. Throughput: 0: 42596.9. Samples: 13838616140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 08:07:38,036][15401] Updated weights for policy 0, policy_version 844644 (0.0042) [2024-06-25 08:07:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13838663680. Throughput: 0: 42549.2. Samples: 13838747640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:38,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 08:07:41,250][15401] Updated weights for policy 0, policy_version 844654 (0.0029) [2024-06-25 08:07:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13838893056. Throughput: 0: 42502.5. Samples: 13838998540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 08:07:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844659_13838893056.pth... [2024-06-25 08:07:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844032_13828620288.pth [2024-06-25 08:07:45,761][15401] Updated weights for policy 0, policy_version 844664 (0.0037) [2024-06-25 08:07:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 13839073280. Throughput: 0: 42648.5. Samples: 13839252720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:48,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 08:07:49,355][15401] Updated weights for policy 0, policy_version 844674 (0.0039) [2024-06-25 08:07:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 13839286272. Throughput: 0: 42245.0. Samples: 13839374580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 08:07:53,688][15401] Updated weights for policy 0, policy_version 844684 (0.0035) [2024-06-25 08:07:57,178][15401] Updated weights for policy 0, policy_version 844694 (0.0026) [2024-06-25 08:07:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13839532032. Throughput: 0: 42391.5. Samples: 13839633040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:07:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 08:08:01,262][15401] Updated weights for policy 0, policy_version 844704 (0.0042) [2024-06-25 08:08:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13839712256. Throughput: 0: 42561.7. Samples: 13839893300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:08:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 08:08:04,737][15401] Updated weights for policy 0, policy_version 844714 (0.0032) [2024-06-25 08:08:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42326.8, 300 sec: 42709.5). Total num frames: 13839941632. Throughput: 0: 42307.9. Samples: 13840014140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:08:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 08:08:08,900][15401] Updated weights for policy 0, policy_version 844724 (0.0032) [2024-06-25 08:08:12,377][15401] Updated weights for policy 0, policy_version 844734 (0.0045) [2024-06-25 08:08:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13840154624. Throughput: 0: 42604.6. Samples: 13840279160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:08:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 08:08:16,381][15401] Updated weights for policy 0, policy_version 844744 (0.0039) [2024-06-25 08:08:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13840351232. Throughput: 0: 42652.3. Samples: 13840535500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:08:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 08:08:20,329][15401] Updated weights for policy 0, policy_version 844754 (0.0028) [2024-06-25 08:08:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13840580608. Throughput: 0: 42379.1. Samples: 13840654700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:23,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 08:08:23,824][15401] Updated weights for policy 0, policy_version 844764 (0.0049) [2024-06-25 08:08:28,019][15401] Updated weights for policy 0, policy_version 844774 (0.0029) [2024-06-25 08:08:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13840793600. Throughput: 0: 42721.0. Samples: 13840920980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 08:08:31,254][15401] Updated weights for policy 0, policy_version 844784 (0.0046) [2024-06-25 08:08:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13841006592. Throughput: 0: 42750.6. Samples: 13841176500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:33,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 08:08:35,645][15401] Updated weights for policy 0, policy_version 844794 (0.0043) [2024-06-25 08:08:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13841235968. Throughput: 0: 42824.1. Samples: 13841301660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 08:08:39,554][15401] Updated weights for policy 0, policy_version 844804 (0.0042) [2024-06-25 08:08:43,252][15401] Updated weights for policy 0, policy_version 844814 (0.0038) [2024-06-25 08:08:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 13841432576. Throughput: 0: 42847.1. Samples: 13841561260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:43,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 08:08:46,980][15401] Updated weights for policy 0, policy_version 844824 (0.0034) [2024-06-25 08:08:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13841629184. Throughput: 0: 42788.0. Samples: 13841818760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 08:08:51,018][15401] Updated weights for policy 0, policy_version 844834 (0.0032) [2024-06-25 08:08:51,166][15349] Signal inference workers to stop experience collection... (205000 times) [2024-06-25 08:08:51,166][15349] Signal inference workers to resume experience collection... (205000 times) [2024-06-25 08:08:51,179][15401] InferenceWorker_p0-w0: stopping experience collection (205000 times) [2024-06-25 08:08:51,179][15401] InferenceWorker_p0-w0: resuming experience collection (205000 times) [2024-06-25 08:08:53,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13841874944. Throughput: 0: 42913.0. Samples: 13841945220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 08:08:54,475][15401] Updated weights for policy 0, policy_version 844844 (0.0038) [2024-06-25 08:08:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13842071552. Throughput: 0: 42641.7. Samples: 13842198040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:08:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 08:08:58,535][15401] Updated weights for policy 0, policy_version 844854 (0.0036) [2024-06-25 08:09:02,536][15401] Updated weights for policy 0, policy_version 844864 (0.0037) [2024-06-25 08:09:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13842284544. Throughput: 0: 42647.5. Samples: 13842454640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:03,394][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 08:09:06,074][15401] Updated weights for policy 0, policy_version 844874 (0.0023) [2024-06-25 08:09:08,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13842530304. Throughput: 0: 42896.9. Samples: 13842585060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 08:09:09,967][15401] Updated weights for policy 0, policy_version 844884 (0.0038) [2024-06-25 08:09:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13842726912. Throughput: 0: 42666.8. Samples: 13842840980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:13,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 08:09:13,989][15401] Updated weights for policy 0, policy_version 844894 (0.0042) [2024-06-25 08:09:17,774][15401] Updated weights for policy 0, policy_version 844904 (0.0033) [2024-06-25 08:09:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13842923520. Throughput: 0: 42839.6. Samples: 13843104280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 08:09:21,498][15401] Updated weights for policy 0, policy_version 844914 (0.0040) [2024-06-25 08:09:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13843169280. Throughput: 0: 42816.0. Samples: 13843228380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:23,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 08:09:25,255][15401] Updated weights for policy 0, policy_version 844924 (0.0031) [2024-06-25 08:09:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13843365888. Throughput: 0: 42650.2. Samples: 13843480420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 08:09:29,068][15401] Updated weights for policy 0, policy_version 844934 (0.0035) [2024-06-25 08:09:32,831][15401] Updated weights for policy 0, policy_version 844944 (0.0024) [2024-06-25 08:09:33,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13843562496. Throughput: 0: 42608.0. Samples: 13843736220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:33,392][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 08:09:36,708][15401] Updated weights for policy 0, policy_version 844954 (0.0047) [2024-06-25 08:09:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13843791872. Throughput: 0: 42741.3. Samples: 13843868580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 08:09:40,861][15401] Updated weights for policy 0, policy_version 844964 (0.0042) [2024-06-25 08:09:43,392][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13844004864. Throughput: 0: 42752.0. Samples: 13844121980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:43,393][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 08:09:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844972_13844021248.pth... [2024-06-25 08:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844347_13833781248.pth [2024-06-25 08:09:44,543][15401] Updated weights for policy 0, policy_version 844974 (0.0037) [2024-06-25 08:09:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13844201472. Throughput: 0: 42729.8. Samples: 13844377480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 08:09:48,634][15401] Updated weights for policy 0, policy_version 844984 (0.0036) [2024-06-25 08:09:52,092][15401] Updated weights for policy 0, policy_version 844994 (0.0035) [2024-06-25 08:09:53,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13844414464. Throughput: 0: 42663.6. Samples: 13844504920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 08:09:56,284][15401] Updated weights for policy 0, policy_version 845004 (0.0036) [2024-06-25 08:09:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 13844660224. Throughput: 0: 42711.1. Samples: 13844762980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:09:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 08:09:59,715][15401] Updated weights for policy 0, policy_version 845014 (0.0034) [2024-06-25 08:10:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13844856832. Throughput: 0: 42501.6. Samples: 13845016860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:10:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 08:10:04,200][15401] Updated weights for policy 0, policy_version 845024 (0.0031) [2024-06-25 08:10:07,696][15401] Updated weights for policy 0, policy_version 845034 (0.0042) [2024-06-25 08:10:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 13845053440. Throughput: 0: 42436.8. Samples: 13845138040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-25 08:10:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 08:10:11,539][15401] Updated weights for policy 0, policy_version 845044 (0.0023) [2024-06-25 08:10:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13845299200. Throughput: 0: 42720.6. Samples: 13845402840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:13,391][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 08:10:15,363][15401] Updated weights for policy 0, policy_version 845054 (0.0032) [2024-06-25 08:10:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13845463040. Throughput: 0: 42737.9. Samples: 13845659320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 08:10:19,267][15401] Updated weights for policy 0, policy_version 845064 (0.0040) [2024-06-25 08:10:20,779][15349] Signal inference workers to stop experience collection... (205050 times) [2024-06-25 08:10:20,779][15349] Signal inference workers to resume experience collection... (205050 times) [2024-06-25 08:10:20,796][15401] InferenceWorker_p0-w0: stopping experience collection (205050 times) [2024-06-25 08:10:20,796][15401] InferenceWorker_p0-w0: resuming experience collection (205050 times) [2024-06-25 08:10:23,092][15401] Updated weights for policy 0, policy_version 845074 (0.0038) [2024-06-25 08:10:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13845708800. Throughput: 0: 42484.9. Samples: 13845780400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:23,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:10:27,105][15401] Updated weights for policy 0, policy_version 845084 (0.0037) [2024-06-25 08:10:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13845921792. Throughput: 0: 42745.8. Samples: 13846045440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:28,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 08:10:30,506][15401] Updated weights for policy 0, policy_version 845094 (0.0036) [2024-06-25 08:10:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 13846118400. Throughput: 0: 42805.3. Samples: 13846303720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 08:10:34,727][15401] Updated weights for policy 0, policy_version 845104 (0.0026) [2024-06-25 08:10:38,046][15401] Updated weights for policy 0, policy_version 845114 (0.0034) [2024-06-25 08:10:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13846364160. Throughput: 0: 42744.9. Samples: 13846428440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 08:10:42,218][15401] Updated weights for policy 0, policy_version 845124 (0.0041) [2024-06-25 08:10:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13846560768. Throughput: 0: 42807.5. Samples: 13846689320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 08:10:45,627][15401] Updated weights for policy 0, policy_version 845134 (0.0030) [2024-06-25 08:10:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13846773760. Throughput: 0: 42800.0. Samples: 13846942860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:48,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 08:10:50,440][15401] Updated weights for policy 0, policy_version 845144 (0.0037) [2024-06-25 08:10:53,217][15401] Updated weights for policy 0, policy_version 845154 (0.0037) [2024-06-25 08:10:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13847003136. Throughput: 0: 42813.3. Samples: 13847064640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 08:10:58,006][15401] Updated weights for policy 0, policy_version 845164 (0.0039) [2024-06-25 08:10:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13847199744. Throughput: 0: 42748.9. Samples: 13847326540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:10:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 08:11:00,857][15401] Updated weights for policy 0, policy_version 845174 (0.0035) [2024-06-25 08:11:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 13847429120. Throughput: 0: 42597.3. Samples: 13847576200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 08:11:05,693][15401] Updated weights for policy 0, policy_version 845184 (0.0031) [2024-06-25 08:11:08,331][15401] Updated weights for policy 0, policy_version 845194 (0.0031) [2024-06-25 08:11:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13847658496. Throughput: 0: 42974.1. Samples: 13847714240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 08:11:13,266][15401] Updated weights for policy 0, policy_version 845204 (0.0041) [2024-06-25 08:11:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13847822336. Throughput: 0: 42699.6. Samples: 13847966920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:13,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 08:11:16,423][15401] Updated weights for policy 0, policy_version 845214 (0.0027) [2024-06-25 08:11:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 13848068096. Throughput: 0: 42505.8. Samples: 13848216480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:18,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 08:11:21,109][15401] Updated weights for policy 0, policy_version 845224 (0.0042) [2024-06-25 08:11:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 13848281088. Throughput: 0: 42803.2. Samples: 13848354580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 08:11:24,025][15401] Updated weights for policy 0, policy_version 845234 (0.0030) [2024-06-25 08:11:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13848461312. Throughput: 0: 42558.2. Samples: 13848604440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 08:11:28,738][15401] Updated weights for policy 0, policy_version 845244 (0.0034) [2024-06-25 08:11:31,535][15401] Updated weights for policy 0, policy_version 845254 (0.0034) [2024-06-25 08:11:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13848690688. Throughput: 0: 42601.0. Samples: 13848859900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:33,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 08:11:36,374][15401] Updated weights for policy 0, policy_version 845264 (0.0032) [2024-06-25 08:11:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13848920064. Throughput: 0: 42992.1. Samples: 13848999280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 08:11:39,391][15401] Updated weights for policy 0, policy_version 845274 (0.0032) [2024-06-25 08:11:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 13849100288. Throughput: 0: 42647.3. Samples: 13849245680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 08:11:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845282_13849100288.pth... [2024-06-25 08:11:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844659_13838893056.pth [2024-06-25 08:11:43,930][15401] Updated weights for policy 0, policy_version 845284 (0.0040) [2024-06-25 08:11:46,945][15401] Updated weights for policy 0, policy_version 845294 (0.0047) [2024-06-25 08:11:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13849313280. Throughput: 0: 42718.2. Samples: 13849498520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 08:11:51,718][15349] Signal inference workers to stop experience collection... (205100 times) [2024-06-25 08:11:51,720][15349] Signal inference workers to resume experience collection... (205100 times) [2024-06-25 08:11:51,736][15401] Updated weights for policy 0, policy_version 845304 (0.0031) [2024-06-25 08:11:51,764][15401] InferenceWorker_p0-w0: stopping experience collection (205100 times) [2024-06-25 08:11:51,764][15401] InferenceWorker_p0-w0: resuming experience collection (205100 times) [2024-06-25 08:11:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13849542656. Throughput: 0: 42496.5. Samples: 13849626580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:53,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 08:11:54,900][15401] Updated weights for policy 0, policy_version 845314 (0.0034) [2024-06-25 08:11:58,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 13849739264. Throughput: 0: 42506.7. Samples: 13849879820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 08:11:58,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 08:11:59,379][15401] Updated weights for policy 0, policy_version 845324 (0.0046) [2024-06-25 08:12:02,469][15401] Updated weights for policy 0, policy_version 845334 (0.0029) [2024-06-25 08:12:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.7). Total num frames: 13849968640. Throughput: 0: 42571.0. Samples: 13850132180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 08:12:06,825][15401] Updated weights for policy 0, policy_version 845344 (0.0045) [2024-06-25 08:12:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 13850181632. Throughput: 0: 42552.0. Samples: 13850269420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 08:12:10,007][15401] Updated weights for policy 0, policy_version 845354 (0.0032) [2024-06-25 08:12:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13850394624. Throughput: 0: 42586.3. Samples: 13850520820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 08:12:14,405][15401] Updated weights for policy 0, policy_version 845364 (0.0034) [2024-06-25 08:12:17,661][15401] Updated weights for policy 0, policy_version 845374 (0.0028) [2024-06-25 08:12:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13850624000. Throughput: 0: 42508.9. Samples: 13850772800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:18,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 08:12:22,367][15401] Updated weights for policy 0, policy_version 845384 (0.0024) [2024-06-25 08:12:23,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42709.2). Total num frames: 13850836992. Throughput: 0: 42300.4. Samples: 13850902900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:23,392][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 08:12:25,364][15401] Updated weights for policy 0, policy_version 845394 (0.0043) [2024-06-25 08:12:28,394][15132] Fps is (10 sec: 39305.4, 60 sec: 42595.5, 300 sec: 42486.7). Total num frames: 13851017216. Throughput: 0: 42600.7. Samples: 13851162880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:28,394][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 08:12:29,976][15401] Updated weights for policy 0, policy_version 845404 (0.0030) [2024-06-25 08:12:33,040][15401] Updated weights for policy 0, policy_version 845414 (0.0031) [2024-06-25 08:12:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13851279360. Throughput: 0: 42490.1. Samples: 13851410580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:33,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 08:12:37,440][15401] Updated weights for policy 0, policy_version 845424 (0.0037) [2024-06-25 08:12:38,392][15132] Fps is (10 sec: 44244.5, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 13851459584. Throughput: 0: 42686.7. Samples: 13851547580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:38,392][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 08:12:40,835][15401] Updated weights for policy 0, policy_version 845434 (0.0039) [2024-06-25 08:12:43,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13851656192. Throughput: 0: 42790.3. Samples: 13851805280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:43,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 08:12:45,036][15401] Updated weights for policy 0, policy_version 845444 (0.0039) [2024-06-25 08:12:48,389][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13851901952. Throughput: 0: 42729.5. Samples: 13852055000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 08:12:48,496][15401] Updated weights for policy 0, policy_version 845454 (0.0044) [2024-06-25 08:12:52,710][15401] Updated weights for policy 0, policy_version 845464 (0.0043) [2024-06-25 08:12:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13852114944. Throughput: 0: 42687.5. Samples: 13852190360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:53,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 08:12:56,122][15349] Signal inference workers to stop experience collection... (205150 times) [2024-06-25 08:12:56,164][15401] InferenceWorker_p0-w0: stopping experience collection (205150 times) [2024-06-25 08:12:56,172][15349] Signal inference workers to resume experience collection... (205150 times) [2024-06-25 08:12:56,180][15401] InferenceWorker_p0-w0: resuming experience collection (205150 times) [2024-06-25 08:12:56,183][15401] Updated weights for policy 0, policy_version 845474 (0.0027) [2024-06-25 08:12:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 13852311552. Throughput: 0: 42645.4. Samples: 13852439860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:12:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 08:13:00,589][15401] Updated weights for policy 0, policy_version 845484 (0.0033) [2024-06-25 08:13:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13852540928. Throughput: 0: 42656.1. Samples: 13852692320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:03,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 08:13:04,006][15401] Updated weights for policy 0, policy_version 845494 (0.0047) [2024-06-25 08:13:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13852721152. Throughput: 0: 42683.3. Samples: 13852823540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:08,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 08:13:08,428][15401] Updated weights for policy 0, policy_version 845504 (0.0042) [2024-06-25 08:13:11,660][15401] Updated weights for policy 0, policy_version 845514 (0.0038) [2024-06-25 08:13:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13852950528. Throughput: 0: 42415.9. Samples: 13853071420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:13,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 08:13:16,199][15401] Updated weights for policy 0, policy_version 845524 (0.0032) [2024-06-25 08:13:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13853163520. Throughput: 0: 42577.9. Samples: 13853326580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 08:13:19,436][15401] Updated weights for policy 0, policy_version 845534 (0.0041) [2024-06-25 08:13:23,396][15132] Fps is (10 sec: 39296.3, 60 sec: 41776.4, 300 sec: 42541.9). Total num frames: 13853343744. Throughput: 0: 42413.9. Samples: 13853456380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:23,397][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 08:13:23,949][15401] Updated weights for policy 0, policy_version 845544 (0.0042) [2024-06-25 08:13:27,206][15401] Updated weights for policy 0, policy_version 845554 (0.0031) [2024-06-25 08:13:28,390][15132] Fps is (10 sec: 42596.3, 60 sec: 42874.1, 300 sec: 42653.9). Total num frames: 13853589504. Throughput: 0: 42272.9. Samples: 13853707580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 08:13:31,582][15401] Updated weights for policy 0, policy_version 845564 (0.0042) [2024-06-25 08:13:33,390][15132] Fps is (10 sec: 47543.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13853818880. Throughput: 0: 42359.4. Samples: 13853961180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 08:13:34,843][15401] Updated weights for policy 0, policy_version 845574 (0.0040) [2024-06-25 08:13:38,390][15132] Fps is (10 sec: 40961.6, 60 sec: 42327.0, 300 sec: 42598.7). Total num frames: 13853999104. Throughput: 0: 42217.3. Samples: 13854090140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:38,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-25 08:13:39,253][15401] Updated weights for policy 0, policy_version 845584 (0.0034) [2024-06-25 08:13:42,516][15401] Updated weights for policy 0, policy_version 845594 (0.0039) [2024-06-25 08:13:43,392][15132] Fps is (10 sec: 42588.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 13854244864. Throughput: 0: 42342.1. Samples: 13854345360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:43,392][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 08:13:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845597_13854261248.pth... [2024-06-25 08:13:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000844972_13844021248.pth [2024-06-25 08:13:47,018][15401] Updated weights for policy 0, policy_version 845604 (0.0035) [2024-06-25 08:13:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13854441472. Throughput: 0: 42507.9. Samples: 13854605180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-25 08:13:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 08:13:50,161][15401] Updated weights for policy 0, policy_version 845614 (0.0025) [2024-06-25 08:13:53,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13854638080. Throughput: 0: 42324.3. Samples: 13854728140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:13:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 08:13:54,804][15401] Updated weights for policy 0, policy_version 845624 (0.0030) [2024-06-25 08:13:57,718][15401] Updated weights for policy 0, policy_version 845634 (0.0031) [2024-06-25 08:13:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13854883840. Throughput: 0: 42599.6. Samples: 13854988400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:13:58,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-25 08:14:02,526][15401] Updated weights for policy 0, policy_version 845644 (0.0030) [2024-06-25 08:14:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13855080448. Throughput: 0: 42635.5. Samples: 13855245180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 08:14:05,605][15401] Updated weights for policy 0, policy_version 845654 (0.0023) [2024-06-25 08:14:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13855277056. Throughput: 0: 42517.2. Samples: 13855369380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 08:14:10,080][15401] Updated weights for policy 0, policy_version 845664 (0.0043) [2024-06-25 08:14:13,202][15401] Updated weights for policy 0, policy_version 845674 (0.0024) [2024-06-25 08:14:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13855522816. Throughput: 0: 42622.6. Samples: 13855625580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 08:14:17,473][15401] Updated weights for policy 0, policy_version 845684 (0.0039) [2024-06-25 08:14:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13855735808. Throughput: 0: 42858.3. Samples: 13855889800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 08:14:19,021][15349] Signal inference workers to stop experience collection... (205200 times) [2024-06-25 08:14:19,049][15401] InferenceWorker_p0-w0: stopping experience collection (205200 times) [2024-06-25 08:14:19,076][15349] Signal inference workers to resume experience collection... (205200 times) [2024-06-25 08:14:19,080][15401] InferenceWorker_p0-w0: resuming experience collection (205200 times) [2024-06-25 08:14:20,626][15401] Updated weights for policy 0, policy_version 845694 (0.0034) [2024-06-25 08:14:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42876.1, 300 sec: 42542.9). Total num frames: 13855916032. Throughput: 0: 42807.6. Samples: 13856016480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:23,392][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 08:14:25,201][15401] Updated weights for policy 0, policy_version 845704 (0.0041) [2024-06-25 08:14:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42870.0, 300 sec: 42709.5). Total num frames: 13856161792. Throughput: 0: 42805.7. Samples: 13856271620. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:28,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 08:14:28,416][15401] Updated weights for policy 0, policy_version 845714 (0.0034) [2024-06-25 08:14:32,732][15401] Updated weights for policy 0, policy_version 845724 (0.0028) [2024-06-25 08:14:33,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13856374784. Throughput: 0: 42846.6. Samples: 13856533280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:33,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 08:14:36,496][15401] Updated weights for policy 0, policy_version 845734 (0.0036) [2024-06-25 08:14:38,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 13856571392. Throughput: 0: 42991.7. Samples: 13856662760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 08:14:40,221][15401] Updated weights for policy 0, policy_version 845744 (0.0039) [2024-06-25 08:14:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 13856817152. Throughput: 0: 42955.1. Samples: 13856921380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 08:14:44,002][15401] Updated weights for policy 0, policy_version 845754 (0.0036) [2024-06-25 08:14:47,934][15401] Updated weights for policy 0, policy_version 845764 (0.0028) [2024-06-25 08:14:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13857013760. Throughput: 0: 43093.9. Samples: 13857184400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 08:14:51,367][15401] Updated weights for policy 0, policy_version 845774 (0.0037) [2024-06-25 08:14:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 13857226752. Throughput: 0: 43112.0. Samples: 13857309420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 08:14:55,716][15401] Updated weights for policy 0, policy_version 845784 (0.0034) [2024-06-25 08:14:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13857439744. Throughput: 0: 43193.0. Samples: 13857569260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:14:58,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 08:14:58,797][15401] Updated weights for policy 0, policy_version 845794 (0.0024) [2024-06-25 08:15:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13857636352. Throughput: 0: 43072.6. Samples: 13857828060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 08:15:03,418][15401] Updated weights for policy 0, policy_version 845804 (0.0039) [2024-06-25 08:15:06,598][15401] Updated weights for policy 0, policy_version 845814 (0.0030) [2024-06-25 08:15:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13857865728. Throughput: 0: 43024.3. Samples: 13857952580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:08,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 08:15:10,891][15401] Updated weights for policy 0, policy_version 845824 (0.0030) [2024-06-25 08:15:13,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13858095104. Throughput: 0: 43041.8. Samples: 13858208400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 08:15:15,020][15401] Updated weights for policy 0, policy_version 845834 (0.0056) [2024-06-25 08:15:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13858291712. Throughput: 0: 42949.7. Samples: 13858466020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:18,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 08:15:18,450][15401] Updated weights for policy 0, policy_version 845844 (0.0022) [2024-06-25 08:15:22,522][15401] Updated weights for policy 0, policy_version 845854 (0.0033) [2024-06-25 08:15:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13858488320. Throughput: 0: 42837.4. Samples: 13858590440. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 08:15:26,113][15401] Updated weights for policy 0, policy_version 845864 (0.0042) [2024-06-25 08:15:28,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 13858734080. Throughput: 0: 42848.4. Samples: 13858849660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:28,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 08:15:30,012][15401] Updated weights for policy 0, policy_version 845874 (0.0046) [2024-06-25 08:15:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 13858947072. Throughput: 0: 42744.0. Samples: 13859107880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 08:15:33,797][15401] Updated weights for policy 0, policy_version 845884 (0.0040) [2024-06-25 08:15:36,622][15349] Signal inference workers to stop experience collection... (205250 times) [2024-06-25 08:15:36,623][15349] Signal inference workers to resume experience collection... (205250 times) [2024-06-25 08:15:36,684][15401] InferenceWorker_p0-w0: stopping experience collection (205250 times) [2024-06-25 08:15:36,684][15401] InferenceWorker_p0-w0: resuming experience collection (205250 times) [2024-06-25 08:15:38,015][15401] Updated weights for policy 0, policy_version 845894 (0.0040) [2024-06-25 08:15:38,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 13859143680. Throughput: 0: 42693.2. Samples: 13859230720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-25 08:15:38,393][15132] Avg episode reward: [(0, '0.849')] [2024-06-25 08:15:41,481][15401] Updated weights for policy 0, policy_version 845904 (0.0036) [2024-06-25 08:15:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13859389440. Throughput: 0: 42779.8. Samples: 13859494360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:15:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 08:15:43,544][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845911_13859405824.pth... [2024-06-25 08:15:43,600][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845282_13849100288.pth [2024-06-25 08:15:45,562][15401] Updated weights for policy 0, policy_version 845914 (0.0032) [2024-06-25 08:15:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13859586048. Throughput: 0: 42704.7. Samples: 13859749780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:15:48,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 08:15:49,425][15401] Updated weights for policy 0, policy_version 845924 (0.0052) [2024-06-25 08:15:53,249][15401] Updated weights for policy 0, policy_version 845934 (0.0054) [2024-06-25 08:15:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13859799040. Throughput: 0: 42644.9. Samples: 13859871600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:15:53,392][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 08:15:57,108][15401] Updated weights for policy 0, policy_version 845944 (0.0036) [2024-06-25 08:15:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13860028416. Throughput: 0: 42826.8. Samples: 13860135600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:15:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 08:16:00,737][15401] Updated weights for policy 0, policy_version 845954 (0.0033) [2024-06-25 08:16:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 13860225024. Throughput: 0: 42669.0. Samples: 13860386120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 08:16:04,707][15401] Updated weights for policy 0, policy_version 845964 (0.0028) [2024-06-25 08:16:08,242][15401] Updated weights for policy 0, policy_version 845974 (0.0035) [2024-06-25 08:16:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13860438016. Throughput: 0: 42687.9. Samples: 13860511400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 08:16:12,507][15401] Updated weights for policy 0, policy_version 845984 (0.0043) [2024-06-25 08:16:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13860651008. Throughput: 0: 42855.6. Samples: 13860778060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 08:16:15,674][15401] Updated weights for policy 0, policy_version 845994 (0.0033) [2024-06-25 08:16:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 13860864000. Throughput: 0: 42759.5. Samples: 13861032060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 08:16:20,070][15401] Updated weights for policy 0, policy_version 846004 (0.0026) [2024-06-25 08:16:23,201][15401] Updated weights for policy 0, policy_version 846014 (0.0031) [2024-06-25 08:16:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 13861093376. Throughput: 0: 42849.3. Samples: 13861158840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 08:16:27,707][15401] Updated weights for policy 0, policy_version 846024 (0.0037) [2024-06-25 08:16:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 13861289984. Throughput: 0: 42714.0. Samples: 13861416480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 08:16:30,995][15401] Updated weights for policy 0, policy_version 846034 (0.0027) [2024-06-25 08:16:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13861519360. Throughput: 0: 42762.3. Samples: 13861674080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 08:16:35,338][15401] Updated weights for policy 0, policy_version 846044 (0.0036) [2024-06-25 08:16:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 13861732352. Throughput: 0: 42945.4. Samples: 13861804140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:38,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 08:16:38,721][15401] Updated weights for policy 0, policy_version 846054 (0.0037) [2024-06-25 08:16:42,841][15349] Signal inference workers to stop experience collection... (205300 times) [2024-06-25 08:16:42,841][15349] Signal inference workers to resume experience collection... (205300 times) [2024-06-25 08:16:42,842][15401] Updated weights for policy 0, policy_version 846064 (0.0044) [2024-06-25 08:16:42,884][15401] InferenceWorker_p0-w0: stopping experience collection (205300 times) [2024-06-25 08:16:42,884][15401] InferenceWorker_p0-w0: resuming experience collection (205300 times) [2024-06-25 08:16:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13861928960. Throughput: 0: 42697.3. Samples: 13862056980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 08:16:46,743][15401] Updated weights for policy 0, policy_version 846074 (0.0028) [2024-06-25 08:16:48,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 13862158336. Throughput: 0: 42754.0. Samples: 13862310060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:48,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 08:16:50,690][15401] Updated weights for policy 0, policy_version 846084 (0.0025) [2024-06-25 08:16:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13862354944. Throughput: 0: 42921.0. Samples: 13862442840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 08:16:54,648][15401] Updated weights for policy 0, policy_version 846094 (0.0051) [2024-06-25 08:16:58,389][15132] Fps is (10 sec: 39322.7, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 13862551552. Throughput: 0: 42707.6. Samples: 13862699900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:16:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 08:16:58,488][15401] Updated weights for policy 0, policy_version 846104 (0.0037) [2024-06-25 08:17:02,133][15401] Updated weights for policy 0, policy_version 846114 (0.0030) [2024-06-25 08:17:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13862797312. Throughput: 0: 42732.0. Samples: 13862955000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 08:17:06,057][15401] Updated weights for policy 0, policy_version 846124 (0.0051) [2024-06-25 08:17:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13863010304. Throughput: 0: 42948.1. Samples: 13863091500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:08,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 08:17:09,617][15401] Updated weights for policy 0, policy_version 846134 (0.0033) [2024-06-25 08:17:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13863206912. Throughput: 0: 42711.5. Samples: 13863338500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 08:17:13,886][15401] Updated weights for policy 0, policy_version 846144 (0.0047) [2024-06-25 08:17:17,427][15401] Updated weights for policy 0, policy_version 846154 (0.0035) [2024-06-25 08:17:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 13863452672. Throughput: 0: 42611.4. Samples: 13863591600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:18,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 08:17:21,576][15401] Updated weights for policy 0, policy_version 846164 (0.0044) [2024-06-25 08:17:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 13863649280. Throughput: 0: 42671.5. Samples: 13863724360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 08:17:25,342][15401] Updated weights for policy 0, policy_version 846174 (0.0035) [2024-06-25 08:17:28,396][15132] Fps is (10 sec: 39297.0, 60 sec: 42593.8, 300 sec: 42597.5). Total num frames: 13863845888. Throughput: 0: 42507.3. Samples: 13863970080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-25 08:17:28,396][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 08:17:29,318][15401] Updated weights for policy 0, policy_version 846184 (0.0043) [2024-06-25 08:17:32,957][15401] Updated weights for policy 0, policy_version 846194 (0.0033) [2024-06-25 08:17:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 13864058880. Throughput: 0: 42628.7. Samples: 13864228340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 08:17:37,096][15401] Updated weights for policy 0, policy_version 846204 (0.0037) [2024-06-25 08:17:38,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13864288256. Throughput: 0: 42528.1. Samples: 13864356600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 08:17:40,666][15401] Updated weights for policy 0, policy_version 846214 (0.0024) [2024-06-25 08:17:43,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 13864484864. Throughput: 0: 42215.9. Samples: 13864599720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:43,393][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 08:17:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846221_13864484864.pth... [2024-06-25 08:17:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845597_13854261248.pth [2024-06-25 08:17:44,788][15401] Updated weights for policy 0, policy_version 846224 (0.0032) [2024-06-25 08:17:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.5, 300 sec: 42598.4). Total num frames: 13864681472. Throughput: 0: 42336.0. Samples: 13864860120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 08:17:48,485][15401] Updated weights for policy 0, policy_version 846234 (0.0036) [2024-06-25 08:17:52,766][15401] Updated weights for policy 0, policy_version 846244 (0.0046) [2024-06-25 08:17:53,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13864894464. Throughput: 0: 42108.9. Samples: 13864986400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:53,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 08:17:53,785][15349] Signal inference workers to stop experience collection... (205350 times) [2024-06-25 08:17:53,839][15401] InferenceWorker_p0-w0: stopping experience collection (205350 times) [2024-06-25 08:17:53,844][15349] Signal inference workers to resume experience collection... (205350 times) [2024-06-25 08:17:53,859][15401] InferenceWorker_p0-w0: resuming experience collection (205350 times) [2024-06-25 08:17:56,163][15401] Updated weights for policy 0, policy_version 846254 (0.0022) [2024-06-25 08:17:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13865123840. Throughput: 0: 42099.6. Samples: 13865232980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:17:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 08:18:00,371][15401] Updated weights for policy 0, policy_version 846264 (0.0046) [2024-06-25 08:18:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42709.4). Total num frames: 13865320448. Throughput: 0: 42318.7. Samples: 13865495940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 08:18:03,783][15401] Updated weights for policy 0, policy_version 846274 (0.0033) [2024-06-25 08:18:08,189][15401] Updated weights for policy 0, policy_version 846284 (0.0029) [2024-06-25 08:18:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 13865517056. Throughput: 0: 42106.1. Samples: 13865619140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 08:18:11,517][15401] Updated weights for policy 0, policy_version 846294 (0.0033) [2024-06-25 08:18:13,390][15132] Fps is (10 sec: 45872.8, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 13865779200. Throughput: 0: 42179.6. Samples: 13865867920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:13,391][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 08:18:16,053][15401] Updated weights for policy 0, policy_version 846304 (0.0027) [2024-06-25 08:18:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 42765.9). Total num frames: 13865959424. Throughput: 0: 42073.7. Samples: 13866121660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 08:18:19,320][15401] Updated weights for policy 0, policy_version 846314 (0.0035) [2024-06-25 08:18:23,389][15132] Fps is (10 sec: 37685.9, 60 sec: 41779.3, 300 sec: 42598.5). Total num frames: 13866156032. Throughput: 0: 42162.2. Samples: 13866253900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 08:18:23,633][15401] Updated weights for policy 0, policy_version 846324 (0.0028) [2024-06-25 08:18:27,123][15401] Updated weights for policy 0, policy_version 846334 (0.0038) [2024-06-25 08:18:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42602.9, 300 sec: 42654.0). Total num frames: 13866401792. Throughput: 0: 42381.0. Samples: 13866506760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:28,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 08:18:31,238][15401] Updated weights for policy 0, policy_version 846344 (0.0031) [2024-06-25 08:18:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13866598400. Throughput: 0: 42108.3. Samples: 13866755000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 08:18:34,981][15401] Updated weights for policy 0, policy_version 846354 (0.0036) [2024-06-25 08:18:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.0, 300 sec: 42487.7). Total num frames: 13866778624. Throughput: 0: 42122.1. Samples: 13866881900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 08:18:38,899][15401] Updated weights for policy 0, policy_version 846364 (0.0038) [2024-06-25 08:18:42,548][15401] Updated weights for policy 0, policy_version 846374 (0.0037) [2024-06-25 08:18:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 13867024384. Throughput: 0: 42495.5. Samples: 13867145280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 08:18:46,830][15401] Updated weights for policy 0, policy_version 846384 (0.0032) [2024-06-25 08:18:48,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13867253760. Throughput: 0: 42136.9. Samples: 13867392100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 08:18:50,296][15401] Updated weights for policy 0, policy_version 846394 (0.0032) [2024-06-25 08:18:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 13867433984. Throughput: 0: 42298.6. Samples: 13867522580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 08:18:54,299][15401] Updated weights for policy 0, policy_version 846404 (0.0033) [2024-06-25 08:18:57,891][15401] Updated weights for policy 0, policy_version 846414 (0.0029) [2024-06-25 08:18:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13867663360. Throughput: 0: 42541.5. Samples: 13867782260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:18:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 08:19:00,650][15349] Signal inference workers to stop experience collection... (205400 times) [2024-06-25 08:19:00,651][15349] Signal inference workers to resume experience collection... (205400 times) [2024-06-25 08:19:00,670][15401] InferenceWorker_p0-w0: stopping experience collection (205400 times) [2024-06-25 08:19:00,670][15401] InferenceWorker_p0-w0: resuming experience collection (205400 times) [2024-06-25 08:19:01,832][15401] Updated weights for policy 0, policy_version 846424 (0.0033) [2024-06-25 08:19:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13867876352. Throughput: 0: 42583.5. Samples: 13868037920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:19:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 08:19:05,639][15401] Updated weights for policy 0, policy_version 846434 (0.0030) [2024-06-25 08:19:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13868072960. Throughput: 0: 42364.4. Samples: 13868160300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:19:08,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 08:19:09,634][15401] Updated weights for policy 0, policy_version 846444 (0.0041) [2024-06-25 08:19:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.7, 300 sec: 42598.4). Total num frames: 13868302336. Throughput: 0: 42551.6. Samples: 13868421580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:19:13,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 08:19:13,397][15401] Updated weights for policy 0, policy_version 846454 (0.0027) [2024-06-25 08:19:17,408][15401] Updated weights for policy 0, policy_version 846464 (0.0049) [2024-06-25 08:19:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13868515328. Throughput: 0: 42715.2. Samples: 13868677180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-25 08:19:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 08:19:20,965][15401] Updated weights for policy 0, policy_version 846474 (0.0028) [2024-06-25 08:19:23,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42487.7). Total num frames: 13868695552. Throughput: 0: 42683.1. Samples: 13868802640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:23,398][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 08:19:24,989][15401] Updated weights for policy 0, policy_version 846484 (0.0027) [2024-06-25 08:19:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13868941312. Throughput: 0: 42500.9. Samples: 13869057820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 08:19:28,576][15401] Updated weights for policy 0, policy_version 846494 (0.0034) [2024-06-25 08:19:32,594][15401] Updated weights for policy 0, policy_version 846504 (0.0039) [2024-06-25 08:19:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13869154304. Throughput: 0: 42869.2. Samples: 13869321220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:33,399][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 08:19:36,045][15401] Updated weights for policy 0, policy_version 846514 (0.0044) [2024-06-25 08:19:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 13869350912. Throughput: 0: 42784.2. Samples: 13869447860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:38,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 08:19:40,177][15401] Updated weights for policy 0, policy_version 846524 (0.0041) [2024-06-25 08:19:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13869596672. Throughput: 0: 42740.8. Samples: 13869705600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 08:19:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846533_13869596672.pth... [2024-06-25 08:19:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000845911_13859405824.pth [2024-06-25 08:19:43,648][15401] Updated weights for policy 0, policy_version 846534 (0.0027) [2024-06-25 08:19:47,986][15401] Updated weights for policy 0, policy_version 846544 (0.0034) [2024-06-25 08:19:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13869793280. Throughput: 0: 42725.3. Samples: 13869960560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:19:51,151][15401] Updated weights for policy 0, policy_version 846554 (0.0037) [2024-06-25 08:19:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13870006272. Throughput: 0: 42733.8. Samples: 13870083320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:53,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 08:19:55,535][15401] Updated weights for policy 0, policy_version 846564 (0.0035) [2024-06-25 08:19:58,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13870235648. Throughput: 0: 42722.7. Samples: 13870344100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:19:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 08:19:59,150][15401] Updated weights for policy 0, policy_version 846574 (0.0026) [2024-06-25 08:20:03,228][15401] Updated weights for policy 0, policy_version 846584 (0.0047) [2024-06-25 08:20:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13870432256. Throughput: 0: 42882.5. Samples: 13870606900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 08:20:06,562][15401] Updated weights for policy 0, policy_version 846594 (0.0024) [2024-06-25 08:20:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13870661632. Throughput: 0: 42751.7. Samples: 13870726460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 08:20:10,798][15401] Updated weights for policy 0, policy_version 846604 (0.0026) [2024-06-25 08:20:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 13870874624. Throughput: 0: 42984.4. Samples: 13870992120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 08:20:14,046][15401] Updated weights for policy 0, policy_version 846614 (0.0025) [2024-06-25 08:20:15,698][15349] Signal inference workers to stop experience collection... (205450 times) [2024-06-25 08:20:15,699][15349] Signal inference workers to resume experience collection... (205450 times) [2024-06-25 08:20:15,751][15401] InferenceWorker_p0-w0: stopping experience collection (205450 times) [2024-06-25 08:20:15,751][15401] InferenceWorker_p0-w0: resuming experience collection (205450 times) [2024-06-25 08:20:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13871071232. Throughput: 0: 42850.4. Samples: 13871249480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 08:20:18,778][15401] Updated weights for policy 0, policy_version 846624 (0.0029) [2024-06-25 08:20:21,669][15401] Updated weights for policy 0, policy_version 846634 (0.0034) [2024-06-25 08:20:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42598.7). Total num frames: 13871300608. Throughput: 0: 42682.2. Samples: 13871368560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:23,396][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 08:20:26,331][15401] Updated weights for policy 0, policy_version 846644 (0.0026) [2024-06-25 08:20:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13871513600. Throughput: 0: 42806.7. Samples: 13871631900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 08:20:29,233][15401] Updated weights for policy 0, policy_version 846654 (0.0033) [2024-06-25 08:20:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 13871693824. Throughput: 0: 43048.1. Samples: 13871897720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 08:20:34,004][15401] Updated weights for policy 0, policy_version 846664 (0.0025) [2024-06-25 08:20:36,748][15401] Updated weights for policy 0, policy_version 846674 (0.0045) [2024-06-25 08:20:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 13871939584. Throughput: 0: 42872.4. Samples: 13872012580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 08:20:41,744][15401] Updated weights for policy 0, policy_version 846684 (0.0053) [2024-06-25 08:20:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13872136192. Throughput: 0: 42789.0. Samples: 13872269600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:43,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 08:20:44,789][15401] Updated weights for policy 0, policy_version 846694 (0.0028) [2024-06-25 08:20:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13872349184. Throughput: 0: 42762.8. Samples: 13872531220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:48,392][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 08:20:49,304][15401] Updated weights for policy 0, policy_version 846704 (0.0024) [2024-06-25 08:20:52,402][15401] Updated weights for policy 0, policy_version 846714 (0.0043) [2024-06-25 08:20:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13872594944. Throughput: 0: 42831.1. Samples: 13872653860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:53,391][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 08:20:56,872][15401] Updated weights for policy 0, policy_version 846724 (0.0038) [2024-06-25 08:20:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13872791552. Throughput: 0: 42744.6. Samples: 13872915620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:20:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 08:20:59,973][15401] Updated weights for policy 0, policy_version 846734 (0.0035) [2024-06-25 08:21:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 13872988160. Throughput: 0: 42561.7. Samples: 13873164760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:21:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 08:21:04,779][15401] Updated weights for policy 0, policy_version 846744 (0.0036) [2024-06-25 08:21:07,938][15401] Updated weights for policy 0, policy_version 846754 (0.0037) [2024-06-25 08:21:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13873233920. Throughput: 0: 42656.0. Samples: 13873288080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 08:21:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 08:21:12,282][15401] Updated weights for policy 0, policy_version 846764 (0.0037) [2024-06-25 08:21:13,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 13873430528. Throughput: 0: 42566.1. Samples: 13873547480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:13,393][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 08:21:15,633][15401] Updated weights for policy 0, policy_version 846774 (0.0036) [2024-06-25 08:21:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13873643520. Throughput: 0: 42519.1. Samples: 13873811080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 08:21:19,929][15401] Updated weights for policy 0, policy_version 846784 (0.0031) [2024-06-25 08:21:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13873856512. Throughput: 0: 42667.6. Samples: 13873932620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:23,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 08:21:23,746][15401] Updated weights for policy 0, policy_version 846794 (0.0031) [2024-06-25 08:21:27,879][15401] Updated weights for policy 0, policy_version 846804 (0.0040) [2024-06-25 08:21:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13874069504. Throughput: 0: 42773.3. Samples: 13874194400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 08:21:31,230][15401] Updated weights for policy 0, policy_version 846814 (0.0028) [2024-06-25 08:21:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 13874266112. Throughput: 0: 42715.6. Samples: 13874453420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 08:21:35,466][15401] Updated weights for policy 0, policy_version 846824 (0.0035) [2024-06-25 08:21:38,396][15132] Fps is (10 sec: 44207.8, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 13874511872. Throughput: 0: 42749.4. Samples: 13874577860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:38,397][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 08:21:38,927][15401] Updated weights for policy 0, policy_version 846834 (0.0032) [2024-06-25 08:21:43,096][15401] Updated weights for policy 0, policy_version 846844 (0.0033) [2024-06-25 08:21:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13874708480. Throughput: 0: 42741.3. Samples: 13874838980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 08:21:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846845_13874708480.pth... [2024-06-25 08:21:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846221_13864484864.pth [2024-06-25 08:21:43,683][15349] Signal inference workers to stop experience collection... (205500 times) [2024-06-25 08:21:43,685][15349] Signal inference workers to resume experience collection... (205500 times) [2024-06-25 08:21:43,708][15401] InferenceWorker_p0-w0: stopping experience collection (205500 times) [2024-06-25 08:21:43,708][15401] InferenceWorker_p0-w0: resuming experience collection (205500 times) [2024-06-25 08:21:46,409][15401] Updated weights for policy 0, policy_version 846854 (0.0036) [2024-06-25 08:21:48,389][15132] Fps is (10 sec: 39347.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13874905088. Throughput: 0: 42966.0. Samples: 13875098220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 08:21:50,547][15401] Updated weights for policy 0, policy_version 846864 (0.0032) [2024-06-25 08:21:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13875167232. Throughput: 0: 42960.1. Samples: 13875221280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:53,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 08:21:54,207][15401] Updated weights for policy 0, policy_version 846874 (0.0036) [2024-06-25 08:21:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 13875331072. Throughput: 0: 42989.3. Samples: 13875481900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:21:58,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 08:21:58,400][15401] Updated weights for policy 0, policy_version 846884 (0.0029) [2024-06-25 08:22:01,977][15401] Updated weights for policy 0, policy_version 846894 (0.0030) [2024-06-25 08:22:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13875560448. Throughput: 0: 42829.8. Samples: 13875738420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 08:22:05,871][15401] Updated weights for policy 0, policy_version 846904 (0.0029) [2024-06-25 08:22:08,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13875806208. Throughput: 0: 42975.2. Samples: 13875866500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 08:22:09,584][15401] Updated weights for policy 0, policy_version 846914 (0.0034) [2024-06-25 08:22:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 13875986432. Throughput: 0: 42978.1. Samples: 13876128420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 08:22:13,599][15401] Updated weights for policy 0, policy_version 846924 (0.0042) [2024-06-25 08:22:17,235][15401] Updated weights for policy 0, policy_version 846934 (0.0033) [2024-06-25 08:22:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13876199424. Throughput: 0: 42835.6. Samples: 13876381020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 08:22:21,334][15401] Updated weights for policy 0, policy_version 846944 (0.0027) [2024-06-25 08:22:23,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.9, 300 sec: 42710.1). Total num frames: 13876445184. Throughput: 0: 42945.7. Samples: 13876510240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:23,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 08:22:24,892][15401] Updated weights for policy 0, policy_version 846954 (0.0028) [2024-06-25 08:22:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13876609024. Throughput: 0: 42785.7. Samples: 13876764340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 08:22:28,832][15401] Updated weights for policy 0, policy_version 846964 (0.0051) [2024-06-25 08:22:32,990][15401] Updated weights for policy 0, policy_version 846974 (0.0042) [2024-06-25 08:22:33,392][15132] Fps is (10 sec: 39321.6, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 13876838400. Throughput: 0: 42635.4. Samples: 13877016920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:33,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 08:22:36,592][15401] Updated weights for policy 0, policy_version 846984 (0.0034) [2024-06-25 08:22:38,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42876.1, 300 sec: 42709.8). Total num frames: 13877084160. Throughput: 0: 42820.8. Samples: 13877148220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 08:22:40,444][15401] Updated weights for policy 0, policy_version 846994 (0.0034) [2024-06-25 08:22:43,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13877264384. Throughput: 0: 42816.0. Samples: 13877408620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:43,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 08:22:44,187][15401] Updated weights for policy 0, policy_version 847004 (0.0034) [2024-06-25 08:22:47,863][15401] Updated weights for policy 0, policy_version 847014 (0.0028) [2024-06-25 08:22:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 13877493760. Throughput: 0: 42733.3. Samples: 13877661520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:48,392][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 08:22:51,242][15349] Signal inference workers to stop experience collection... (205550 times) [2024-06-25 08:22:51,244][15349] Signal inference workers to resume experience collection... (205550 times) [2024-06-25 08:22:51,258][15401] InferenceWorker_p0-w0: stopping experience collection (205550 times) [2024-06-25 08:22:51,258][15401] InferenceWorker_p0-w0: resuming experience collection (205550 times) [2024-06-25 08:22:51,704][15401] Updated weights for policy 0, policy_version 847024 (0.0037) [2024-06-25 08:22:53,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13877739520. Throughput: 0: 42823.1. Samples: 13877793540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 08:22:55,286][15401] Updated weights for policy 0, policy_version 847034 (0.0037) [2024-06-25 08:22:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13877903360. Throughput: 0: 42726.2. Samples: 13878051100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 08:22:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 08:22:59,410][15401] Updated weights for policy 0, policy_version 847044 (0.0038) [2024-06-25 08:23:02,770][15401] Updated weights for policy 0, policy_version 847054 (0.0032) [2024-06-25 08:23:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13878149120. Throughput: 0: 42714.1. Samples: 13878303160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:03,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 08:23:07,032][15401] Updated weights for policy 0, policy_version 847064 (0.0027) [2024-06-25 08:23:08,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 13878394880. Throughput: 0: 42898.7. Samples: 13878440580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:08,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 08:23:10,207][15401] Updated weights for policy 0, policy_version 847074 (0.0024) [2024-06-25 08:23:13,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 13878525952. Throughput: 0: 42833.7. Samples: 13878691960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:13,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 08:23:14,694][15401] Updated weights for policy 0, policy_version 847084 (0.0037) [2024-06-25 08:23:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13878771712. Throughput: 0: 42702.8. Samples: 13878938440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 08:23:18,796][15401] Updated weights for policy 0, policy_version 847094 (0.0038) [2024-06-25 08:23:22,449][15401] Updated weights for policy 0, policy_version 847104 (0.0040) [2024-06-25 08:23:23,390][15132] Fps is (10 sec: 47524.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13879001088. Throughput: 0: 42749.7. Samples: 13879071960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:23,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 08:23:26,793][15401] Updated weights for policy 0, policy_version 847114 (0.0031) [2024-06-25 08:23:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 13879181312. Throughput: 0: 42468.5. Samples: 13879319700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:28,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 08:23:30,291][15401] Updated weights for policy 0, policy_version 847124 (0.0029) [2024-06-25 08:23:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 13879427072. Throughput: 0: 42499.0. Samples: 13879573880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:33,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 08:23:34,656][15401] Updated weights for policy 0, policy_version 847134 (0.0049) [2024-06-25 08:23:37,717][15401] Updated weights for policy 0, policy_version 847144 (0.0035) [2024-06-25 08:23:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13879640064. Throughput: 0: 42550.6. Samples: 13879708320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 08:23:42,292][15401] Updated weights for policy 0, policy_version 847154 (0.0046) [2024-06-25 08:23:43,392][15132] Fps is (10 sec: 37674.5, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 13879803904. Throughput: 0: 42415.5. Samples: 13879959900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:43,393][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 08:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847156_13879803904.pth... [2024-06-25 08:23:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846533_13869596672.pth [2024-06-25 08:23:45,457][15401] Updated weights for policy 0, policy_version 847164 (0.0031) [2024-06-25 08:23:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42765.1). Total num frames: 13880049664. Throughput: 0: 42394.8. Samples: 13880210920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:48,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 08:23:49,800][15401] Updated weights for policy 0, policy_version 847174 (0.0041) [2024-06-25 08:23:53,247][15401] Updated weights for policy 0, policy_version 847184 (0.0029) [2024-06-25 08:23:53,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 13880262656. Throughput: 0: 42284.9. Samples: 13880343400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 08:23:56,447][15349] Signal inference workers to stop experience collection... (205600 times) [2024-06-25 08:23:56,448][15349] Signal inference workers to resume experience collection... (205600 times) [2024-06-25 08:23:56,495][15401] InferenceWorker_p0-w0: stopping experience collection (205600 times) [2024-06-25 08:23:56,495][15401] InferenceWorker_p0-w0: resuming experience collection (205600 times) [2024-06-25 08:23:57,599][15401] Updated weights for policy 0, policy_version 847194 (0.0026) [2024-06-25 08:23:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13880459264. Throughput: 0: 42286.7. Samples: 13880594760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:23:58,390][15132] Avg episode reward: [(0, '0.274')] [2024-06-25 08:24:00,734][15401] Updated weights for policy 0, policy_version 847204 (0.0039) [2024-06-25 08:24:03,394][15132] Fps is (10 sec: 44218.1, 60 sec: 42595.4, 300 sec: 42819.9). Total num frames: 13880705024. Throughput: 0: 42427.9. Samples: 13880847880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:03,394][15132] Avg episode reward: [(0, '0.182')] [2024-06-25 08:24:05,222][15401] Updated weights for policy 0, policy_version 847214 (0.0032) [2024-06-25 08:24:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 13880901632. Throughput: 0: 42534.3. Samples: 13880986000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 08:24:08,413][15401] Updated weights for policy 0, policy_version 847224 (0.0038) [2024-06-25 08:24:12,741][15401] Updated weights for policy 0, policy_version 847234 (0.0032) [2024-06-25 08:24:13,389][15132] Fps is (10 sec: 40977.5, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 13881114624. Throughput: 0: 42573.4. Samples: 13881235500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 08:24:15,979][15401] Updated weights for policy 0, policy_version 847244 (0.0031) [2024-06-25 08:24:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13881344000. Throughput: 0: 42549.1. Samples: 13881488580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 08:24:20,226][15401] Updated weights for policy 0, policy_version 847254 (0.0024) [2024-06-25 08:24:23,391][15132] Fps is (10 sec: 42590.2, 60 sec: 42324.0, 300 sec: 42709.2). Total num frames: 13881540608. Throughput: 0: 42599.5. Samples: 13881625380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:23,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 08:24:23,909][15401] Updated weights for policy 0, policy_version 847264 (0.0043) [2024-06-25 08:24:27,873][15401] Updated weights for policy 0, policy_version 847274 (0.0040) [2024-06-25 08:24:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13881753600. Throughput: 0: 42664.5. Samples: 13881879700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:28,391][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 08:24:31,723][15401] Updated weights for policy 0, policy_version 847284 (0.0038) [2024-06-25 08:24:33,390][15132] Fps is (10 sec: 44244.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 13881982976. Throughput: 0: 42639.8. Samples: 13882129720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 08:24:35,621][15401] Updated weights for policy 0, policy_version 847294 (0.0033) [2024-06-25 08:24:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13882179584. Throughput: 0: 42545.3. Samples: 13882257940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 08:24:39,334][15401] Updated weights for policy 0, policy_version 847304 (0.0042) [2024-06-25 08:24:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 13882376192. Throughput: 0: 42572.4. Samples: 13882510520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 08:24:43,647][15401] Updated weights for policy 0, policy_version 847314 (0.0026) [2024-06-25 08:24:46,941][15401] Updated weights for policy 0, policy_version 847324 (0.0046) [2024-06-25 08:24:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13882605568. Throughput: 0: 42608.5. Samples: 13882765080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-25 08:24:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 08:24:51,134][15401] Updated weights for policy 0, policy_version 847334 (0.0032) [2024-06-25 08:24:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13882802176. Throughput: 0: 42436.8. Samples: 13882895660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:24:53,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 08:24:54,625][15401] Updated weights for policy 0, policy_version 847344 (0.0030) [2024-06-25 08:24:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13883031552. Throughput: 0: 42509.8. Samples: 13883148440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:24:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 08:24:58,646][15401] Updated weights for policy 0, policy_version 847354 (0.0035) [2024-06-25 08:25:02,688][15401] Updated weights for policy 0, policy_version 847364 (0.0034) [2024-06-25 08:25:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42055.2, 300 sec: 42598.4). Total num frames: 13883228160. Throughput: 0: 42652.9. Samples: 13883407960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 08:25:06,245][15401] Updated weights for policy 0, policy_version 847374 (0.0021) [2024-06-25 08:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13883441152. Throughput: 0: 42364.5. Samples: 13883531700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 08:25:10,325][15401] Updated weights for policy 0, policy_version 847384 (0.0043) [2024-06-25 08:25:10,345][15349] Signal inference workers to stop experience collection... (205650 times) [2024-06-25 08:25:10,345][15349] Signal inference workers to resume experience collection... (205650 times) [2024-06-25 08:25:10,359][15401] InferenceWorker_p0-w0: stopping experience collection (205650 times) [2024-06-25 08:25:10,392][15401] InferenceWorker_p0-w0: resuming experience collection (205650 times) [2024-06-25 08:25:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13883654144. Throughput: 0: 42464.5. Samples: 13883790600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:13,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 08:25:14,134][15401] Updated weights for policy 0, policy_version 847394 (0.0031) [2024-06-25 08:25:17,906][15401] Updated weights for policy 0, policy_version 847404 (0.0039) [2024-06-25 08:25:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13883883520. Throughput: 0: 42483.9. Samples: 13884041500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:18,396][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 08:25:21,600][15401] Updated weights for policy 0, policy_version 847414 (0.0039) [2024-06-25 08:25:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42326.7, 300 sec: 42598.4). Total num frames: 13884080128. Throughput: 0: 42507.6. Samples: 13884170780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 08:25:25,593][15401] Updated weights for policy 0, policy_version 847424 (0.0037) [2024-06-25 08:25:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13884309504. Throughput: 0: 42584.6. Samples: 13884426820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 08:25:29,918][15401] Updated weights for policy 0, policy_version 847434 (0.0053) [2024-06-25 08:25:33,289][15401] Updated weights for policy 0, policy_version 847444 (0.0032) [2024-06-25 08:25:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13884522496. Throughput: 0: 42675.8. Samples: 13884685500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 08:25:37,465][15401] Updated weights for policy 0, policy_version 847454 (0.0029) [2024-06-25 08:25:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13884751872. Throughput: 0: 42573.4. Samples: 13884811460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:38,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 08:25:40,968][15401] Updated weights for policy 0, policy_version 847464 (0.0036) [2024-06-25 08:25:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13884948480. Throughput: 0: 42785.8. Samples: 13885073800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 08:25:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847471_13884964864.pth... [2024-06-25 08:25:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000846845_13874708480.pth [2024-06-25 08:25:44,936][15401] Updated weights for policy 0, policy_version 847474 (0.0035) [2024-06-25 08:25:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13885161472. Throughput: 0: 42577.7. Samples: 13885323960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:48,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 08:25:48,972][15401] Updated weights for policy 0, policy_version 847484 (0.0034) [2024-06-25 08:25:52,349][15401] Updated weights for policy 0, policy_version 847494 (0.0030) [2024-06-25 08:25:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13885374464. Throughput: 0: 42700.0. Samples: 13885453200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 08:25:56,450][15401] Updated weights for policy 0, policy_version 847504 (0.0031) [2024-06-25 08:25:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13885571072. Throughput: 0: 42689.3. Samples: 13885711620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:25:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 08:25:59,900][15401] Updated weights for policy 0, policy_version 847514 (0.0042) [2024-06-25 08:26:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13885816832. Throughput: 0: 42813.0. Samples: 13885968080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 08:26:04,266][15401] Updated weights for policy 0, policy_version 847524 (0.0025) [2024-06-25 08:26:07,473][15401] Updated weights for policy 0, policy_version 847534 (0.0033) [2024-06-25 08:26:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 13886013440. Throughput: 0: 42800.3. Samples: 13886096800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 08:26:12,123][15401] Updated weights for policy 0, policy_version 847544 (0.0044) [2024-06-25 08:26:13,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13886226432. Throughput: 0: 42751.2. Samples: 13886350640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:13,396][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 08:26:15,242][15401] Updated weights for policy 0, policy_version 847554 (0.0032) [2024-06-25 08:26:18,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13886439424. Throughput: 0: 42529.5. Samples: 13886599320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 08:26:19,790][15401] Updated weights for policy 0, policy_version 847564 (0.0050) [2024-06-25 08:26:22,796][15401] Updated weights for policy 0, policy_version 847574 (0.0042) [2024-06-25 08:26:23,396][15132] Fps is (10 sec: 44209.6, 60 sec: 43139.9, 300 sec: 42708.5). Total num frames: 13886668800. Throughput: 0: 42751.7. Samples: 13886735560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:23,396][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 08:26:27,390][15401] Updated weights for policy 0, policy_version 847584 (0.0029) [2024-06-25 08:26:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13886849024. Throughput: 0: 42432.9. Samples: 13886983280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 08:26:30,586][15401] Updated weights for policy 0, policy_version 847594 (0.0027) [2024-06-25 08:26:33,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 13887078400. Throughput: 0: 42463.1. Samples: 13887234800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 08:26:35,072][15401] Updated weights for policy 0, policy_version 847604 (0.0033) [2024-06-25 08:26:37,840][15349] Signal inference workers to stop experience collection... (205700 times) [2024-06-25 08:26:37,841][15349] Signal inference workers to resume experience collection... (205700 times) [2024-06-25 08:26:37,890][15401] InferenceWorker_p0-w0: stopping experience collection (205700 times) [2024-06-25 08:26:37,890][15401] InferenceWorker_p0-w0: resuming experience collection (205700 times) [2024-06-25 08:26:38,259][15401] Updated weights for policy 0, policy_version 847614 (0.0038) [2024-06-25 08:26:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13887307776. Throughput: 0: 42559.1. Samples: 13887368360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 08:26:38,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 08:26:43,003][15401] Updated weights for policy 0, policy_version 847624 (0.0037) [2024-06-25 08:26:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13887488000. Throughput: 0: 42327.5. Samples: 13887616360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:26:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 08:26:46,114][15401] Updated weights for policy 0, policy_version 847634 (0.0030) [2024-06-25 08:26:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 13887717376. Throughput: 0: 42291.0. Samples: 13887871180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:26:48,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 08:26:50,858][15401] Updated weights for policy 0, policy_version 847644 (0.0037) [2024-06-25 08:26:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 13887930368. Throughput: 0: 42544.1. Samples: 13888011380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:26:53,392][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 08:26:53,783][15401] Updated weights for policy 0, policy_version 847654 (0.0038) [2024-06-25 08:26:58,343][15401] Updated weights for policy 0, policy_version 847664 (0.0042) [2024-06-25 08:26:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13888126976. Throughput: 0: 42449.5. Samples: 13888260860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:26:58,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 08:27:01,412][15401] Updated weights for policy 0, policy_version 847674 (0.0037) [2024-06-25 08:27:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 13888356352. Throughput: 0: 42620.8. Samples: 13888517260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:03,390][15132] Avg episode reward: [(0, '0.904')] [2024-06-25 08:27:06,014][15401] Updated weights for policy 0, policy_version 847684 (0.0022) [2024-06-25 08:27:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 13888569344. Throughput: 0: 42616.4. Samples: 13888653020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 08:27:09,024][15401] Updated weights for policy 0, policy_version 847694 (0.0030) [2024-06-25 08:27:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 13888765952. Throughput: 0: 42633.3. Samples: 13888901780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 08:27:13,604][15401] Updated weights for policy 0, policy_version 847704 (0.0024) [2024-06-25 08:27:16,856][15401] Updated weights for policy 0, policy_version 847714 (0.0038) [2024-06-25 08:27:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 13889011712. Throughput: 0: 42596.8. Samples: 13889151660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 08:27:21,246][15401] Updated weights for policy 0, policy_version 847724 (0.0030) [2024-06-25 08:27:23,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42329.7, 300 sec: 42709.5). Total num frames: 13889208320. Throughput: 0: 42681.1. Samples: 13889289020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:23,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 08:27:24,578][15401] Updated weights for policy 0, policy_version 847734 (0.0045) [2024-06-25 08:27:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13889404928. Throughput: 0: 42828.5. Samples: 13889543640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 08:27:28,858][15401] Updated weights for policy 0, policy_version 847744 (0.0037) [2024-06-25 08:27:32,333][15401] Updated weights for policy 0, policy_version 847754 (0.0026) [2024-06-25 08:27:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13889650688. Throughput: 0: 42708.4. Samples: 13889793060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:33,391][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 08:27:36,478][15401] Updated weights for policy 0, policy_version 847764 (0.0036) [2024-06-25 08:27:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13889830912. Throughput: 0: 42533.8. Samples: 13889925300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:38,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 08:27:39,987][15401] Updated weights for policy 0, policy_version 847774 (0.0032) [2024-06-25 08:27:43,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 13890043904. Throughput: 0: 42439.1. Samples: 13890170620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 08:27:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847782_13890060288.pth... [2024-06-25 08:27:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847156_13879803904.pth [2024-06-25 08:27:44,331][15401] Updated weights for policy 0, policy_version 847784 (0.0039) [2024-06-25 08:27:47,691][15401] Updated weights for policy 0, policy_version 847794 (0.0046) [2024-06-25 08:27:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13890256896. Throughput: 0: 42433.8. Samples: 13890426780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 08:27:51,827][15401] Updated weights for policy 0, policy_version 847804 (0.0033) [2024-06-25 08:27:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 13890469888. Throughput: 0: 42331.0. Samples: 13890557920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 08:27:55,787][15401] Updated weights for policy 0, policy_version 847814 (0.0029) [2024-06-25 08:27:58,391][15132] Fps is (10 sec: 44231.1, 60 sec: 42870.6, 300 sec: 42542.7). Total num frames: 13890699264. Throughput: 0: 42389.4. Samples: 13890809360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:27:58,391][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 08:27:59,557][15401] Updated weights for policy 0, policy_version 847824 (0.0032) [2024-06-25 08:28:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 13890895872. Throughput: 0: 42651.2. Samples: 13891070960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:03,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 08:28:03,610][15401] Updated weights for policy 0, policy_version 847834 (0.0042) [2024-06-25 08:28:07,574][15401] Updated weights for policy 0, policy_version 847844 (0.0035) [2024-06-25 08:28:08,389][15132] Fps is (10 sec: 39327.0, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 13891092480. Throughput: 0: 42433.1. Samples: 13891198500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:08,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 08:28:09,356][15349] Signal inference workers to stop experience collection... (205750 times) [2024-06-25 08:28:09,391][15401] InferenceWorker_p0-w0: stopping experience collection (205750 times) [2024-06-25 08:28:09,416][15349] Signal inference workers to resume experience collection... (205750 times) [2024-06-25 08:28:09,417][15401] InferenceWorker_p0-w0: resuming experience collection (205750 times) [2024-06-25 08:28:11,146][15401] Updated weights for policy 0, policy_version 847854 (0.0027) [2024-06-25 08:28:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13891338240. Throughput: 0: 42393.3. Samples: 13891451340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:13,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 08:28:15,021][15401] Updated weights for policy 0, policy_version 847864 (0.0025) [2024-06-25 08:28:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13891551232. Throughput: 0: 42566.3. Samples: 13891708540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:18,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 08:28:18,549][15401] Updated weights for policy 0, policy_version 847874 (0.0046) [2024-06-25 08:28:22,422][15401] Updated weights for policy 0, policy_version 847884 (0.0030) [2024-06-25 08:28:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13891747840. Throughput: 0: 42519.5. Samples: 13891838680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 08:28:26,324][15401] Updated weights for policy 0, policy_version 847894 (0.0047) [2024-06-25 08:28:28,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 13891993600. Throughput: 0: 42808.4. Samples: 13892097100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-06-25 08:28:28,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 08:28:29,819][15401] Updated weights for policy 0, policy_version 847904 (0.0027) [2024-06-25 08:28:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13892190208. Throughput: 0: 42999.5. Samples: 13892361760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 08:28:33,838][15401] Updated weights for policy 0, policy_version 847914 (0.0032) [2024-06-25 08:28:37,278][15401] Updated weights for policy 0, policy_version 847924 (0.0043) [2024-06-25 08:28:38,396][15132] Fps is (10 sec: 40943.5, 60 sec: 42866.9, 300 sec: 42708.9). Total num frames: 13892403200. Throughput: 0: 42766.8. Samples: 13892482700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:38,397][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 08:28:41,466][15401] Updated weights for policy 0, policy_version 847934 (0.0033) [2024-06-25 08:28:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13892632576. Throughput: 0: 42997.2. Samples: 13892744180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:43,393][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 08:28:44,756][15401] Updated weights for policy 0, policy_version 847944 (0.0041) [2024-06-25 08:28:48,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13892812800. Throughput: 0: 42977.4. Samples: 13893004940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 08:28:49,337][15401] Updated weights for policy 0, policy_version 847954 (0.0034) [2024-06-25 08:28:52,198][15401] Updated weights for policy 0, policy_version 847964 (0.0036) [2024-06-25 08:28:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13893042176. Throughput: 0: 42855.0. Samples: 13893126980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 08:28:56,881][15401] Updated weights for policy 0, policy_version 847974 (0.0034) [2024-06-25 08:28:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42599.3, 300 sec: 42543.5). Total num frames: 13893255168. Throughput: 0: 42917.7. Samples: 13893382640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:28:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 08:29:00,727][15401] Updated weights for policy 0, policy_version 847984 (0.0035) [2024-06-25 08:29:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13893468160. Throughput: 0: 42933.4. Samples: 13893640540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:03,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 08:29:05,013][15401] Updated weights for policy 0, policy_version 847994 (0.0041) [2024-06-25 08:29:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 13893681152. Throughput: 0: 42751.7. Samples: 13893762500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 08:29:08,412][15401] Updated weights for policy 0, policy_version 848004 (0.0032) [2024-06-25 08:29:12,646][15401] Updated weights for policy 0, policy_version 848014 (0.0035) [2024-06-25 08:29:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13893894144. Throughput: 0: 42668.1. Samples: 13894017060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 08:29:15,954][15401] Updated weights for policy 0, policy_version 848024 (0.0047) [2024-06-25 08:29:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13894107136. Throughput: 0: 42525.4. Samples: 13894275400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:18,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 08:29:20,336][15401] Updated weights for policy 0, policy_version 848034 (0.0039) [2024-06-25 08:29:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 13894336512. Throughput: 0: 42679.0. Samples: 13894402980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 08:29:23,523][15401] Updated weights for policy 0, policy_version 848044 (0.0040) [2024-06-25 08:29:27,934][15401] Updated weights for policy 0, policy_version 848054 (0.0039) [2024-06-25 08:29:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 13894533120. Throughput: 0: 42564.4. Samples: 13894659580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 08:29:31,076][15401] Updated weights for policy 0, policy_version 848064 (0.0032) [2024-06-25 08:29:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13894729728. Throughput: 0: 42497.3. Samples: 13894917320. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:33,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 08:29:33,421][15349] Signal inference workers to stop experience collection... (205800 times) [2024-06-25 08:29:33,422][15349] Signal inference workers to resume experience collection... (205800 times) [2024-06-25 08:29:33,446][15401] InferenceWorker_p0-w0: stopping experience collection (205800 times) [2024-06-25 08:29:33,447][15401] InferenceWorker_p0-w0: resuming experience collection (205800 times) [2024-06-25 08:29:35,515][15401] Updated weights for policy 0, policy_version 848074 (0.0026) [2024-06-25 08:29:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 13894975488. Throughput: 0: 42595.2. Samples: 13895043760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:38,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 08:29:38,845][15401] Updated weights for policy 0, policy_version 848084 (0.0030) [2024-06-25 08:29:43,061][15401] Updated weights for policy 0, policy_version 848094 (0.0035) [2024-06-25 08:29:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13895172096. Throughput: 0: 42634.7. Samples: 13895301200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 08:29:43,469][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848095_13895188480.pth... [2024-06-25 08:29:43,520][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847471_13884964864.pth [2024-06-25 08:29:46,837][15401] Updated weights for policy 0, policy_version 848104 (0.0026) [2024-06-25 08:29:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13895385088. Throughput: 0: 42644.7. Samples: 13895559560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:48,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 08:29:50,908][15401] Updated weights for policy 0, policy_version 848114 (0.0038) [2024-06-25 08:29:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13895614464. Throughput: 0: 42748.4. Samples: 13895686180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 08:29:54,378][15401] Updated weights for policy 0, policy_version 848124 (0.0038) [2024-06-25 08:29:58,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 13895811072. Throughput: 0: 42615.7. Samples: 13895935040. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:29:58,397][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 08:29:58,813][15401] Updated weights for policy 0, policy_version 848134 (0.0034) [2024-06-25 08:30:02,142][15401] Updated weights for policy 0, policy_version 848144 (0.0044) [2024-06-25 08:30:03,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13896007680. Throughput: 0: 42553.1. Samples: 13896190300. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:30:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 08:30:06,400][15401] Updated weights for policy 0, policy_version 848154 (0.0038) [2024-06-25 08:30:08,389][15132] Fps is (10 sec: 44265.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13896253440. Throughput: 0: 42517.4. Samples: 13896316260. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:30:08,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 08:30:10,693][15401] Updated weights for policy 0, policy_version 848164 (0.0057) [2024-06-25 08:30:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13896433664. Throughput: 0: 42425.4. Samples: 13896568720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:30:13,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 08:30:14,021][15401] Updated weights for policy 0, policy_version 848174 (0.0041) [2024-06-25 08:30:18,142][15401] Updated weights for policy 0, policy_version 848184 (0.0033) [2024-06-25 08:30:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13896646656. Throughput: 0: 42599.3. Samples: 13896834280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-25 08:30:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 08:30:21,685][15401] Updated weights for policy 0, policy_version 848194 (0.0043) [2024-06-25 08:30:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13896892416. Throughput: 0: 42685.2. Samples: 13896964600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:23,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 08:30:25,718][15401] Updated weights for policy 0, policy_version 848204 (0.0034) [2024-06-25 08:30:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 13897072640. Throughput: 0: 42560.5. Samples: 13897216420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:28,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-25 08:30:29,191][15401] Updated weights for policy 0, policy_version 848214 (0.0043) [2024-06-25 08:30:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13897285632. Throughput: 0: 42596.1. Samples: 13897476380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 08:30:33,439][15401] Updated weights for policy 0, policy_version 848224 (0.0035) [2024-06-25 08:30:36,779][15401] Updated weights for policy 0, policy_version 848234 (0.0033) [2024-06-25 08:30:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13897531392. Throughput: 0: 42612.4. Samples: 13897603740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 08:30:40,797][15401] Updated weights for policy 0, policy_version 848244 (0.0026) [2024-06-25 08:30:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13897728000. Throughput: 0: 42860.3. Samples: 13897863480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 08:30:44,750][15401] Updated weights for policy 0, policy_version 848254 (0.0043) [2024-06-25 08:30:45,902][15349] Signal inference workers to stop experience collection... (205850 times) [2024-06-25 08:30:45,908][15349] Signal inference workers to resume experience collection... (205850 times) [2024-06-25 08:30:45,929][15401] InferenceWorker_p0-w0: stopping experience collection (205850 times) [2024-06-25 08:30:45,929][15401] InferenceWorker_p0-w0: resuming experience collection (205850 times) [2024-06-25 08:30:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13897940992. Throughput: 0: 42758.0. Samples: 13898114400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 08:30:48,606][15401] Updated weights for policy 0, policy_version 848264 (0.0042) [2024-06-25 08:30:52,392][15401] Updated weights for policy 0, policy_version 848274 (0.0040) [2024-06-25 08:30:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13898170368. Throughput: 0: 42896.9. Samples: 13898246620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:53,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 08:30:56,143][15401] Updated weights for policy 0, policy_version 848284 (0.0037) [2024-06-25 08:30:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42603.0, 300 sec: 42542.9). Total num frames: 13898366976. Throughput: 0: 43009.8. Samples: 13898504160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:30:58,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 08:31:00,037][15401] Updated weights for policy 0, policy_version 848294 (0.0038) [2024-06-25 08:31:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 13898596352. Throughput: 0: 42808.3. Samples: 13898760660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 08:31:03,621][15401] Updated weights for policy 0, policy_version 848304 (0.0033) [2024-06-25 08:31:07,682][15401] Updated weights for policy 0, policy_version 848314 (0.0032) [2024-06-25 08:31:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13898825728. Throughput: 0: 42964.6. Samples: 13898898000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 08:31:11,187][15401] Updated weights for policy 0, policy_version 848324 (0.0031) [2024-06-25 08:31:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13899005952. Throughput: 0: 42947.1. Samples: 13899149040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:13,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 08:31:15,365][15401] Updated weights for policy 0, policy_version 848334 (0.0037) [2024-06-25 08:31:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.4, 300 sec: 42654.8). Total num frames: 13899251712. Throughput: 0: 42817.2. Samples: 13899403160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:18,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 08:31:18,844][15401] Updated weights for policy 0, policy_version 848344 (0.0037) [2024-06-25 08:31:22,947][15401] Updated weights for policy 0, policy_version 848354 (0.0032) [2024-06-25 08:31:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13899448320. Throughput: 0: 42953.4. Samples: 13899536640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 08:31:26,308][15401] Updated weights for policy 0, policy_version 848364 (0.0032) [2024-06-25 08:31:28,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13899628544. Throughput: 0: 42748.1. Samples: 13899787140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 08:31:30,815][15401] Updated weights for policy 0, policy_version 848374 (0.0037) [2024-06-25 08:31:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 13899890688. Throughput: 0: 42790.7. Samples: 13900039980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:33,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 08:31:33,857][15401] Updated weights for policy 0, policy_version 848384 (0.0027) [2024-06-25 08:31:38,368][15401] Updated weights for policy 0, policy_version 848394 (0.0035) [2024-06-25 08:31:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13900087296. Throughput: 0: 42961.4. Samples: 13900179880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 08:31:41,299][15401] Updated weights for policy 0, policy_version 848404 (0.0040) [2024-06-25 08:31:43,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 13900283904. Throughput: 0: 42618.2. Samples: 13900422080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:43,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 08:31:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848406_13900283904.pth... [2024-06-25 08:31:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000847782_13890060288.pth [2024-06-25 08:31:46,134][15401] Updated weights for policy 0, policy_version 848414 (0.0022) [2024-06-25 08:31:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42765.4). Total num frames: 13900546048. Throughput: 0: 42548.5. Samples: 13900675340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:48,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 08:31:49,277][15401] Updated weights for policy 0, policy_version 848424 (0.0026) [2024-06-25 08:31:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13900726272. Throughput: 0: 42643.9. Samples: 13900816980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 08:31:53,796][15401] Updated weights for policy 0, policy_version 848434 (0.0038) [2024-06-25 08:31:56,884][15401] Updated weights for policy 0, policy_version 848444 (0.0029) [2024-06-25 08:31:58,392][15132] Fps is (10 sec: 37674.2, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13900922880. Throughput: 0: 42543.5. Samples: 13901063600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:31:58,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 08:32:01,486][15401] Updated weights for policy 0, policy_version 848454 (0.0022) [2024-06-25 08:32:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13901168640. Throughput: 0: 42644.9. Samples: 13901322180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:32:03,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 08:32:04,383][15401] Updated weights for policy 0, policy_version 848464 (0.0024) [2024-06-25 08:32:08,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13901365248. Throughput: 0: 42754.6. Samples: 13901460600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 08:32:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 08:32:08,999][15401] Updated weights for policy 0, policy_version 848474 (0.0032) [2024-06-25 08:32:12,037][15401] Updated weights for policy 0, policy_version 848484 (0.0030) [2024-06-25 08:32:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13901578240. Throughput: 0: 42675.2. Samples: 13901707520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 08:32:14,630][15349] Signal inference workers to stop experience collection... (205900 times) [2024-06-25 08:32:14,631][15349] Signal inference workers to resume experience collection... (205900 times) [2024-06-25 08:32:14,648][15401] InferenceWorker_p0-w0: stopping experience collection (205900 times) [2024-06-25 08:32:14,648][15401] InferenceWorker_p0-w0: resuming experience collection (205900 times) [2024-06-25 08:32:16,790][15401] Updated weights for policy 0, policy_version 848494 (0.0040) [2024-06-25 08:32:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 13901824000. Throughput: 0: 42694.6. Samples: 13901961240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:18,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 08:32:20,033][15401] Updated weights for policy 0, policy_version 848504 (0.0036) [2024-06-25 08:32:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13902004224. Throughput: 0: 42552.7. Samples: 13902094760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 08:32:24,570][15401] Updated weights for policy 0, policy_version 848514 (0.0026) [2024-06-25 08:32:27,686][15401] Updated weights for policy 0, policy_version 848524 (0.0030) [2024-06-25 08:32:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 13902233600. Throughput: 0: 42586.2. Samples: 13902338360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:28,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 08:32:32,142][15401] Updated weights for policy 0, policy_version 848534 (0.0043) [2024-06-25 08:32:33,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 13902446592. Throughput: 0: 42715.5. Samples: 13902597640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:33,393][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 08:32:35,603][15401] Updated weights for policy 0, policy_version 848544 (0.0040) [2024-06-25 08:32:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13902626816. Throughput: 0: 42436.4. Samples: 13902726620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 08:32:39,707][15401] Updated weights for policy 0, policy_version 848554 (0.0042) [2024-06-25 08:32:43,101][15401] Updated weights for policy 0, policy_version 848564 (0.0034) [2024-06-25 08:32:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 13902888960. Throughput: 0: 42686.3. Samples: 13902984380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:43,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 08:32:47,606][15401] Updated weights for policy 0, policy_version 848574 (0.0043) [2024-06-25 08:32:48,389][15132] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13903101952. Throughput: 0: 42604.1. Samples: 13903239360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 08:32:50,687][15401] Updated weights for policy 0, policy_version 848584 (0.0029) [2024-06-25 08:32:53,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 13903282176. Throughput: 0: 42368.0. Samples: 13903367160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 08:32:55,327][15401] Updated weights for policy 0, policy_version 848594 (0.0051) [2024-06-25 08:32:58,218][15401] Updated weights for policy 0, policy_version 848604 (0.0034) [2024-06-25 08:32:58,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43419.2, 300 sec: 42820.5). Total num frames: 13903527936. Throughput: 0: 42500.7. Samples: 13903620060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:32:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 08:33:02,902][15401] Updated weights for policy 0, policy_version 848614 (0.0032) [2024-06-25 08:33:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13903724544. Throughput: 0: 42809.7. Samples: 13903887680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 08:33:05,955][15401] Updated weights for policy 0, policy_version 848624 (0.0043) [2024-06-25 08:33:08,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 13903921152. Throughput: 0: 42568.1. Samples: 13904010420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:08,392][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 08:33:10,403][15401] Updated weights for policy 0, policy_version 848634 (0.0034) [2024-06-25 08:33:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13904166912. Throughput: 0: 42796.0. Samples: 13904264180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:13,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 08:33:13,978][15401] Updated weights for policy 0, policy_version 848644 (0.0025) [2024-06-25 08:33:18,076][15401] Updated weights for policy 0, policy_version 848654 (0.0036) [2024-06-25 08:33:18,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13904363520. Throughput: 0: 42920.9. Samples: 13904528980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:18,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 08:33:21,809][15401] Updated weights for policy 0, policy_version 848664 (0.0040) [2024-06-25 08:33:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13904560128. Throughput: 0: 42839.5. Samples: 13904654400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 08:33:23,876][15349] Signal inference workers to stop experience collection... (205950 times) [2024-06-25 08:33:23,915][15401] InferenceWorker_p0-w0: stopping experience collection (205950 times) [2024-06-25 08:33:23,925][15349] Signal inference workers to resume experience collection... (205950 times) [2024-06-25 08:33:23,935][15401] InferenceWorker_p0-w0: resuming experience collection (205950 times) [2024-06-25 08:33:25,802][15401] Updated weights for policy 0, policy_version 848674 (0.0031) [2024-06-25 08:33:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13904822272. Throughput: 0: 42696.4. Samples: 13904905720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 08:33:29,398][15401] Updated weights for policy 0, policy_version 848684 (0.0033) [2024-06-25 08:33:33,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.0, 300 sec: 42654.9). Total num frames: 13904986112. Throughput: 0: 42858.1. Samples: 13905167980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:33,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 08:33:33,596][15401] Updated weights for policy 0, policy_version 848694 (0.0022) [2024-06-25 08:33:36,956][15401] Updated weights for policy 0, policy_version 848704 (0.0030) [2024-06-25 08:33:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13905199104. Throughput: 0: 42808.1. Samples: 13905293520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 08:33:41,185][15401] Updated weights for policy 0, policy_version 848714 (0.0027) [2024-06-25 08:33:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13905444864. Throughput: 0: 42867.3. Samples: 13905549080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 08:33:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848721_13905444864.pth... [2024-06-25 08:33:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848095_13895188480.pth [2024-06-25 08:33:44,523][15401] Updated weights for policy 0, policy_version 848724 (0.0052) [2024-06-25 08:33:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 13905625088. Throughput: 0: 42723.6. Samples: 13905810240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 08:33:48,835][15401] Updated weights for policy 0, policy_version 848734 (0.0037) [2024-06-25 08:33:52,078][15401] Updated weights for policy 0, policy_version 848744 (0.0037) [2024-06-25 08:33:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13905854464. Throughput: 0: 42648.8. Samples: 13905929520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 08:33:56,466][15401] Updated weights for policy 0, policy_version 848754 (0.0041) [2024-06-25 08:33:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13906083840. Throughput: 0: 42829.4. Samples: 13906191500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-25 08:33:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 08:33:59,717][15401] Updated weights for policy 0, policy_version 848764 (0.0026) [2024-06-25 08:34:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13906264064. Throughput: 0: 42637.0. Samples: 13906447640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:03,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 08:34:04,130][15401] Updated weights for policy 0, policy_version 848774 (0.0026) [2024-06-25 08:34:07,409][15401] Updated weights for policy 0, policy_version 848784 (0.0040) [2024-06-25 08:34:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 13906493440. Throughput: 0: 42508.5. Samples: 13906567280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 08:34:11,645][15401] Updated weights for policy 0, policy_version 848794 (0.0041) [2024-06-25 08:34:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13906706432. Throughput: 0: 42678.5. Samples: 13906826260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 08:34:15,284][15401] Updated weights for policy 0, policy_version 848804 (0.0034) [2024-06-25 08:34:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13906903040. Throughput: 0: 42484.9. Samples: 13907079800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 08:34:19,971][15401] Updated weights for policy 0, policy_version 848814 (0.0028) [2024-06-25 08:34:23,309][15401] Updated weights for policy 0, policy_version 848824 (0.0027) [2024-06-25 08:34:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 13907132416. Throughput: 0: 42365.7. Samples: 13907199980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:23,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 08:34:27,554][15401] Updated weights for policy 0, policy_version 848834 (0.0028) [2024-06-25 08:34:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 13907329024. Throughput: 0: 42483.6. Samples: 13907460840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 08:34:30,823][15401] Updated weights for policy 0, policy_version 848844 (0.0037) [2024-06-25 08:34:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13907542016. Throughput: 0: 42418.2. Samples: 13907719060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:33,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 08:34:35,169][15401] Updated weights for policy 0, policy_version 848854 (0.0037) [2024-06-25 08:34:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13907771392. Throughput: 0: 42587.6. Samples: 13907845960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 08:34:38,766][15401] Updated weights for policy 0, policy_version 848864 (0.0042) [2024-06-25 08:34:41,208][15349] Signal inference workers to stop experience collection... (206000 times) [2024-06-25 08:34:41,209][15349] Signal inference workers to resume experience collection... (206000 times) [2024-06-25 08:34:41,244][15401] InferenceWorker_p0-w0: stopping experience collection (206000 times) [2024-06-25 08:34:41,244][15401] InferenceWorker_p0-w0: resuming experience collection (206000 times) [2024-06-25 08:34:42,795][15401] Updated weights for policy 0, policy_version 848874 (0.0031) [2024-06-25 08:34:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13907984384. Throughput: 0: 42525.8. Samples: 13908105160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 08:34:46,249][15401] Updated weights for policy 0, policy_version 848884 (0.0035) [2024-06-25 08:34:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13908180992. Throughput: 0: 42402.5. Samples: 13908355760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 08:34:50,443][15401] Updated weights for policy 0, policy_version 848894 (0.0031) [2024-06-25 08:34:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 13908410368. Throughput: 0: 42590.7. Samples: 13908483860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 08:34:53,791][15401] Updated weights for policy 0, policy_version 848904 (0.0032) [2024-06-25 08:34:58,213][15401] Updated weights for policy 0, policy_version 848914 (0.0027) [2024-06-25 08:34:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13908606976. Throughput: 0: 42569.0. Samples: 13908741860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:34:58,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 08:35:01,529][15401] Updated weights for policy 0, policy_version 848924 (0.0023) [2024-06-25 08:35:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13908836352. Throughput: 0: 42627.1. Samples: 13908998020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 08:35:05,880][15401] Updated weights for policy 0, policy_version 848934 (0.0032) [2024-06-25 08:35:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13909049344. Throughput: 0: 42819.9. Samples: 13909126880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:08,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 08:35:09,207][15401] Updated weights for policy 0, policy_version 848944 (0.0034) [2024-06-25 08:35:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42709.4). Total num frames: 13909245952. Throughput: 0: 42807.9. Samples: 13909387200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:13,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 08:35:13,431][15401] Updated weights for policy 0, policy_version 848954 (0.0031) [2024-06-25 08:35:16,680][15401] Updated weights for policy 0, policy_version 848964 (0.0042) [2024-06-25 08:35:18,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 13909491712. Throughput: 0: 42632.9. Samples: 13909637640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:18,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 08:35:21,339][15401] Updated weights for policy 0, policy_version 848974 (0.0036) [2024-06-25 08:35:23,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 13909704704. Throughput: 0: 42780.4. Samples: 13909771180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:23,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:35:24,324][15401] Updated weights for policy 0, policy_version 848984 (0.0027) [2024-06-25 08:35:28,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13909884928. Throughput: 0: 42868.5. Samples: 13910034240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 08:35:29,064][15401] Updated weights for policy 0, policy_version 848994 (0.0043) [2024-06-25 08:35:31,771][15401] Updated weights for policy 0, policy_version 849004 (0.0034) [2024-06-25 08:35:33,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13910130688. Throughput: 0: 42693.4. Samples: 13910276960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:33,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 08:35:36,625][15401] Updated weights for policy 0, policy_version 849014 (0.0036) [2024-06-25 08:35:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13910327296. Throughput: 0: 42840.8. Samples: 13910411700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 08:35:39,197][15401] Updated weights for policy 0, policy_version 849024 (0.0048) [2024-06-25 08:35:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13910523904. Throughput: 0: 42700.3. Samples: 13910663380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:43,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 08:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849031_13910523904.pth... [2024-06-25 08:35:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848406_13900283904.pth [2024-06-25 08:35:44,530][15401] Updated weights for policy 0, policy_version 849034 (0.0033) [2024-06-25 08:35:47,350][15401] Updated weights for policy 0, policy_version 849044 (0.0033) [2024-06-25 08:35:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13910769664. Throughput: 0: 42518.7. Samples: 13910911360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 08:35:48,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 08:35:52,199][15401] Updated weights for policy 0, policy_version 849054 (0.0025) [2024-06-25 08:35:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 13910966272. Throughput: 0: 42664.1. Samples: 13911046760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:35:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 08:35:53,856][15349] Signal inference workers to stop experience collection... (206050 times) [2024-06-25 08:35:53,857][15349] Signal inference workers to resume experience collection... (206050 times) [2024-06-25 08:35:53,877][15401] InferenceWorker_p0-w0: stopping experience collection (206050 times) [2024-06-25 08:35:53,877][15401] InferenceWorker_p0-w0: resuming experience collection (206050 times) [2024-06-25 08:35:55,090][15401] Updated weights for policy 0, policy_version 849064 (0.0039) [2024-06-25 08:35:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13911162880. Throughput: 0: 42581.4. Samples: 13911303360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:35:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 08:35:59,786][15401] Updated weights for policy 0, policy_version 849074 (0.0028) [2024-06-25 08:36:02,866][15401] Updated weights for policy 0, policy_version 849084 (0.0036) [2024-06-25 08:36:03,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 13911408640. Throughput: 0: 42487.0. Samples: 13911549560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:03,393][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 08:36:07,382][15401] Updated weights for policy 0, policy_version 849094 (0.0032) [2024-06-25 08:36:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13911605248. Throughput: 0: 42579.6. Samples: 13911687160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 08:36:10,296][15401] Updated weights for policy 0, policy_version 849104 (0.0034) [2024-06-25 08:36:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13911801856. Throughput: 0: 42453.8. Samples: 13911944660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 08:36:14,847][15401] Updated weights for policy 0, policy_version 849114 (0.0034) [2024-06-25 08:36:17,954][15401] Updated weights for policy 0, policy_version 849124 (0.0030) [2024-06-25 08:36:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13912047616. Throughput: 0: 42615.6. Samples: 13912194660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 08:36:22,410][15401] Updated weights for policy 0, policy_version 849134 (0.0035) [2024-06-25 08:36:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 13912244224. Throughput: 0: 42697.9. Samples: 13912333100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 08:36:25,630][15401] Updated weights for policy 0, policy_version 849144 (0.0022) [2024-06-25 08:36:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13912440832. Throughput: 0: 42683.3. Samples: 13912584120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 08:36:30,563][15401] Updated weights for policy 0, policy_version 849154 (0.0040) [2024-06-25 08:36:33,313][15401] Updated weights for policy 0, policy_version 849164 (0.0033) [2024-06-25 08:36:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13912702976. Throughput: 0: 42666.6. Samples: 13912831360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:33,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 08:36:38,181][15401] Updated weights for policy 0, policy_version 849174 (0.0040) [2024-06-25 08:36:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 13912883200. Throughput: 0: 42634.2. Samples: 13912965300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:38,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 08:36:41,089][15401] Updated weights for policy 0, policy_version 849184 (0.0039) [2024-06-25 08:36:43,396][15132] Fps is (10 sec: 37659.4, 60 sec: 42593.9, 300 sec: 42486.4). Total num frames: 13913079808. Throughput: 0: 42541.5. Samples: 13913218000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:43,396][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 08:36:45,661][15401] Updated weights for policy 0, policy_version 849194 (0.0028) [2024-06-25 08:36:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13913325568. Throughput: 0: 42696.6. Samples: 13913470800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 08:36:48,841][15401] Updated weights for policy 0, policy_version 849204 (0.0034) [2024-06-25 08:36:53,235][15401] Updated weights for policy 0, policy_version 849214 (0.0037) [2024-06-25 08:36:53,389][15132] Fps is (10 sec: 45904.6, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 13913538560. Throughput: 0: 42654.2. Samples: 13913606600. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 08:36:56,517][15401] Updated weights for policy 0, policy_version 849224 (0.0024) [2024-06-25 08:36:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13913735168. Throughput: 0: 42557.6. Samples: 13913859760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:36:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 08:37:00,460][15349] Signal inference workers to stop experience collection... (206100 times) [2024-06-25 08:37:00,499][15401] InferenceWorker_p0-w0: stopping experience collection (206100 times) [2024-06-25 08:37:00,511][15349] Signal inference workers to resume experience collection... (206100 times) [2024-06-25 08:37:00,518][15401] InferenceWorker_p0-w0: resuming experience collection (206100 times) [2024-06-25 08:37:00,835][15401] Updated weights for policy 0, policy_version 849234 (0.0035) [2024-06-25 08:37:03,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 13913980928. Throughput: 0: 42643.1. Samples: 13914113700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:03,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 08:37:04,095][15401] Updated weights for policy 0, policy_version 849244 (0.0036) [2024-06-25 08:37:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13914144768. Throughput: 0: 42564.4. Samples: 13914248500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 08:37:09,021][15401] Updated weights for policy 0, policy_version 849254 (0.0024) [2024-06-25 08:37:11,695][15401] Updated weights for policy 0, policy_version 849264 (0.0034) [2024-06-25 08:37:13,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 13914374144. Throughput: 0: 42371.4. Samples: 13914490840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 08:37:16,702][15401] Updated weights for policy 0, policy_version 849274 (0.0032) [2024-06-25 08:37:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13914587136. Throughput: 0: 42615.6. Samples: 13914749060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 08:37:19,891][15401] Updated weights for policy 0, policy_version 849284 (0.0030) [2024-06-25 08:37:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 13914783744. Throughput: 0: 42578.2. Samples: 13914881320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:23,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 08:37:24,268][15401] Updated weights for policy 0, policy_version 849294 (0.0042) [2024-06-25 08:37:27,302][15401] Updated weights for policy 0, policy_version 849304 (0.0041) [2024-06-25 08:37:28,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42653.9). Total num frames: 13915029504. Throughput: 0: 42525.6. Samples: 13915131480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:28,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 08:37:31,870][15401] Updated weights for policy 0, policy_version 849314 (0.0049) [2024-06-25 08:37:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 13915226112. Throughput: 0: 42635.4. Samples: 13915389400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 08:37:35,399][15401] Updated weights for policy 0, policy_version 849324 (0.0039) [2024-06-25 08:37:38,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13915422720. Throughput: 0: 42469.3. Samples: 13915517720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-25 08:37:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:37:39,485][15401] Updated weights for policy 0, policy_version 849334 (0.0024) [2024-06-25 08:37:42,979][15401] Updated weights for policy 0, policy_version 849344 (0.0042) [2024-06-25 08:37:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43149.2, 300 sec: 42598.4). Total num frames: 13915668480. Throughput: 0: 42672.6. Samples: 13915780020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:37:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 08:37:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849346_13915684864.pth... [2024-06-25 08:37:43,512][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000848721_13905444864.pth [2024-06-25 08:37:47,068][15401] Updated weights for policy 0, policy_version 849354 (0.0032) [2024-06-25 08:37:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 13915881472. Throughput: 0: 42724.9. Samples: 13916036220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:37:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 08:37:50,760][15401] Updated weights for policy 0, policy_version 849364 (0.0029) [2024-06-25 08:37:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 13916078080. Throughput: 0: 42421.9. Samples: 13916157480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:37:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 08:37:54,979][15401] Updated weights for policy 0, policy_version 849374 (0.0025) [2024-06-25 08:37:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 13916291072. Throughput: 0: 42872.7. Samples: 13916420100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:37:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 08:37:58,395][15401] Updated weights for policy 0, policy_version 849384 (0.0039) [2024-06-25 08:38:02,678][15401] Updated weights for policy 0, policy_version 849394 (0.0041) [2024-06-25 08:38:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42709.8). Total num frames: 13916520448. Throughput: 0: 42796.5. Samples: 13916674900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 08:38:06,084][15401] Updated weights for policy 0, policy_version 849404 (0.0039) [2024-06-25 08:38:08,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 13916733440. Throughput: 0: 42661.7. Samples: 13916801200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:08,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 08:38:10,311][15401] Updated weights for policy 0, policy_version 849414 (0.0027) [2024-06-25 08:38:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13916930048. Throughput: 0: 42818.4. Samples: 13917058200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:13,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 08:38:13,694][15401] Updated weights for policy 0, policy_version 849424 (0.0035) [2024-06-25 08:38:17,859][15401] Updated weights for policy 0, policy_version 849434 (0.0031) [2024-06-25 08:38:18,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13917159424. Throughput: 0: 42851.0. Samples: 13917317700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 08:38:21,424][15401] Updated weights for policy 0, policy_version 849444 (0.0033) [2024-06-25 08:38:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 13917372416. Throughput: 0: 42825.4. Samples: 13917444860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 08:38:25,361][15401] Updated weights for policy 0, policy_version 849454 (0.0036) [2024-06-25 08:38:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13917585408. Throughput: 0: 42835.5. Samples: 13917707620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 08:38:29,085][15401] Updated weights for policy 0, policy_version 849464 (0.0033) [2024-06-25 08:38:30,012][15349] Signal inference workers to stop experience collection... (206150 times) [2024-06-25 08:38:30,052][15401] InferenceWorker_p0-w0: stopping experience collection (206150 times) [2024-06-25 08:38:30,067][15349] Signal inference workers to resume experience collection... (206150 times) [2024-06-25 08:38:30,077][15401] InferenceWorker_p0-w0: resuming experience collection (206150 times) [2024-06-25 08:38:32,860][15401] Updated weights for policy 0, policy_version 849474 (0.0042) [2024-06-25 08:38:33,393][15132] Fps is (10 sec: 42583.7, 60 sec: 42869.1, 300 sec: 42709.0). Total num frames: 13917798400. Throughput: 0: 42821.2. Samples: 13917963320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:33,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 08:38:36,693][15401] Updated weights for policy 0, policy_version 849484 (0.0030) [2024-06-25 08:38:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 13918011392. Throughput: 0: 42994.6. Samples: 13918092240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:38,390][15132] Avg episode reward: [(0, '0.199')] [2024-06-25 08:38:40,550][15401] Updated weights for policy 0, policy_version 849494 (0.0036) [2024-06-25 08:38:43,390][15132] Fps is (10 sec: 40973.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 13918208000. Throughput: 0: 42784.3. Samples: 13918345400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:43,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 08:38:44,499][15401] Updated weights for policy 0, policy_version 849504 (0.0041) [2024-06-25 08:38:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13918420992. Throughput: 0: 42745.8. Samples: 13918598460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 08:38:48,570][15401] Updated weights for policy 0, policy_version 849514 (0.0030) [2024-06-25 08:38:52,247][15401] Updated weights for policy 0, policy_version 849524 (0.0033) [2024-06-25 08:38:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 13918666752. Throughput: 0: 42837.4. Samples: 13918728780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:53,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 08:38:56,073][15401] Updated weights for policy 0, policy_version 849534 (0.0042) [2024-06-25 08:38:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13918846976. Throughput: 0: 42730.2. Samples: 13918981060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:38:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 08:39:00,132][15401] Updated weights for policy 0, policy_version 849544 (0.0040) [2024-06-25 08:39:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13919059968. Throughput: 0: 42596.1. Samples: 13919234520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 08:39:03,942][15401] Updated weights for policy 0, policy_version 849554 (0.0031) [2024-06-25 08:39:07,557][15401] Updated weights for policy 0, policy_version 849564 (0.0037) [2024-06-25 08:39:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 13919289344. Throughput: 0: 42787.9. Samples: 13919370320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:08,394][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 08:39:11,617][15401] Updated weights for policy 0, policy_version 849574 (0.0032) [2024-06-25 08:39:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 13919485952. Throughput: 0: 42430.5. Samples: 13919617000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 08:39:15,373][15401] Updated weights for policy 0, policy_version 849584 (0.0032) [2024-06-25 08:39:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13919715328. Throughput: 0: 42369.0. Samples: 13919869780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:18,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 08:39:19,122][15401] Updated weights for policy 0, policy_version 849594 (0.0032) [2024-06-25 08:39:23,375][15401] Updated weights for policy 0, policy_version 849604 (0.0033) [2024-06-25 08:39:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13919911936. Throughput: 0: 42417.8. Samples: 13920001040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:23,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 08:39:27,051][15401] Updated weights for policy 0, policy_version 849614 (0.0025) [2024-06-25 08:39:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13920141312. Throughput: 0: 42347.6. Samples: 13920251040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 08:39:31,032][15401] Updated weights for policy 0, policy_version 849624 (0.0048) [2024-06-25 08:39:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.8, 300 sec: 42598.4). Total num frames: 13920337920. Throughput: 0: 42263.6. Samples: 13920500320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 08:39:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 08:39:34,705][15401] Updated weights for policy 0, policy_version 849634 (0.0034) [2024-06-25 08:39:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 13920550912. Throughput: 0: 42307.6. Samples: 13920632620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:39:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 08:39:38,519][15401] Updated weights for policy 0, policy_version 849644 (0.0026) [2024-06-25 08:39:42,211][15401] Updated weights for policy 0, policy_version 849654 (0.0036) [2024-06-25 08:39:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13920780288. Throughput: 0: 42575.6. Samples: 13920896960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:39:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 08:39:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849658_13920796672.pth... [2024-06-25 08:39:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849031_13910523904.pth [2024-06-25 08:39:46,231][15401] Updated weights for policy 0, policy_version 849664 (0.0031) [2024-06-25 08:39:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13920993280. Throughput: 0: 42474.6. Samples: 13921145880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:39:48,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 08:39:49,915][15401] Updated weights for policy 0, policy_version 849674 (0.0023) [2024-06-25 08:39:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 13921189888. Throughput: 0: 42386.3. Samples: 13921277700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:39:53,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-25 08:39:53,728][15401] Updated weights for policy 0, policy_version 849684 (0.0046) [2024-06-25 08:39:57,569][15401] Updated weights for policy 0, policy_version 849694 (0.0030) [2024-06-25 08:39:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13921402880. Throughput: 0: 42601.4. Samples: 13921534060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:39:58,390][15132] Avg episode reward: [(0, '0.245')] [2024-06-25 08:40:01,393][15401] Updated weights for policy 0, policy_version 849704 (0.0033) [2024-06-25 08:40:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 13921632256. Throughput: 0: 42585.8. Samples: 13921786140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:03,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 08:40:05,572][15401] Updated weights for policy 0, policy_version 849714 (0.0028) [2024-06-25 08:40:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13921828864. Throughput: 0: 42655.0. Samples: 13921920520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:08,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-25 08:40:09,262][15401] Updated weights for policy 0, policy_version 849724 (0.0035) [2024-06-25 08:40:13,207][15401] Updated weights for policy 0, policy_version 849734 (0.0034) [2024-06-25 08:40:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 13922041856. Throughput: 0: 42553.7. Samples: 13922165960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 08:40:17,053][15401] Updated weights for policy 0, policy_version 849744 (0.0032) [2024-06-25 08:40:18,196][15349] Signal inference workers to stop experience collection... (206200 times) [2024-06-25 08:40:18,197][15349] Signal inference workers to resume experience collection... (206200 times) [2024-06-25 08:40:18,242][15401] InferenceWorker_p0-w0: stopping experience collection (206200 times) [2024-06-25 08:40:18,243][15401] InferenceWorker_p0-w0: resuming experience collection (206200 times) [2024-06-25 08:40:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 13922271232. Throughput: 0: 42686.5. Samples: 13922421220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:18,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 08:40:20,872][15401] Updated weights for policy 0, policy_version 849754 (0.0046) [2024-06-25 08:40:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13922467840. Throughput: 0: 42573.7. Samples: 13922548440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 08:40:24,689][15401] Updated weights for policy 0, policy_version 849764 (0.0030) [2024-06-25 08:40:28,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 13922664448. Throughput: 0: 42251.6. Samples: 13922798280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:28,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 08:40:29,180][15401] Updated weights for policy 0, policy_version 849774 (0.0037) [2024-06-25 08:40:32,207][15401] Updated weights for policy 0, policy_version 849784 (0.0029) [2024-06-25 08:40:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13922893824. Throughput: 0: 42374.3. Samples: 13923052720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 08:40:36,717][15401] Updated weights for policy 0, policy_version 849794 (0.0040) [2024-06-25 08:40:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13923090432. Throughput: 0: 42381.2. Samples: 13923184860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 08:40:39,905][15401] Updated weights for policy 0, policy_version 849804 (0.0043) [2024-06-25 08:40:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 13923303424. Throughput: 0: 42246.3. Samples: 13923435140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:43,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 08:40:44,324][15401] Updated weights for policy 0, policy_version 849814 (0.0027) [2024-06-25 08:40:47,635][15401] Updated weights for policy 0, policy_version 849824 (0.0025) [2024-06-25 08:40:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13923532800. Throughput: 0: 42203.2. Samples: 13923685280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 08:40:51,947][15401] Updated weights for policy 0, policy_version 849834 (0.0031) [2024-06-25 08:40:53,394][15132] Fps is (10 sec: 42580.1, 60 sec: 42322.3, 300 sec: 42597.8). Total num frames: 13923729408. Throughput: 0: 42211.1. Samples: 13923820200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:53,394][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 08:40:55,482][15401] Updated weights for policy 0, policy_version 849844 (0.0043) [2024-06-25 08:40:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 13923958784. Throughput: 0: 42379.9. Samples: 13924073060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:40:58,391][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 08:40:59,727][15401] Updated weights for policy 0, policy_version 849854 (0.0044) [2024-06-25 08:41:03,301][15401] Updated weights for policy 0, policy_version 849864 (0.0041) [2024-06-25 08:41:03,389][15132] Fps is (10 sec: 44255.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13924171776. Throughput: 0: 42200.6. Samples: 13924320240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:41:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 08:41:07,377][15401] Updated weights for policy 0, policy_version 849874 (0.0034) [2024-06-25 08:41:08,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 13924352000. Throughput: 0: 42305.8. Samples: 13924452200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:41:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 08:41:11,152][15401] Updated weights for policy 0, policy_version 849884 (0.0043) [2024-06-25 08:41:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 13924564992. Throughput: 0: 42375.4. Samples: 13924705180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:41:13,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 08:41:15,383][15401] Updated weights for policy 0, policy_version 849894 (0.0025) [2024-06-25 08:41:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13924810752. Throughput: 0: 42316.9. Samples: 13924956980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:41:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 08:41:18,698][15401] Updated weights for policy 0, policy_version 849904 (0.0035) [2024-06-25 08:41:22,962][15401] Updated weights for policy 0, policy_version 849914 (0.0037) [2024-06-25 08:41:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13925007360. Throughput: 0: 42349.9. Samples: 13925090600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 08:41:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 08:41:26,254][15401] Updated weights for policy 0, policy_version 849924 (0.0036) [2024-06-25 08:41:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 13925220352. Throughput: 0: 42535.6. Samples: 13925349240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 08:41:30,615][15401] Updated weights for policy 0, policy_version 849934 (0.0040) [2024-06-25 08:41:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13925449728. Throughput: 0: 42584.0. Samples: 13925601560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:33,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 08:41:33,993][15401] Updated weights for policy 0, policy_version 849944 (0.0033) [2024-06-25 08:41:38,150][15401] Updated weights for policy 0, policy_version 849954 (0.0023) [2024-06-25 08:41:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 13925662720. Throughput: 0: 42477.8. Samples: 13925731520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 08:41:41,521][15401] Updated weights for policy 0, policy_version 849964 (0.0030) [2024-06-25 08:41:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13925859328. Throughput: 0: 42505.0. Samples: 13925985780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 08:41:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849968_13925875712.pth... [2024-06-25 08:41:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849346_13915684864.pth [2024-06-25 08:41:46,036][15401] Updated weights for policy 0, policy_version 849974 (0.0029) [2024-06-25 08:41:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13926088704. Throughput: 0: 42733.7. Samples: 13926243260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 08:41:49,148][15401] Updated weights for policy 0, policy_version 849984 (0.0031) [2024-06-25 08:41:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42601.4, 300 sec: 42542.9). Total num frames: 13926285312. Throughput: 0: 42715.1. Samples: 13926374380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 08:41:53,780][15401] Updated weights for policy 0, policy_version 849994 (0.0042) [2024-06-25 08:41:57,136][15401] Updated weights for policy 0, policy_version 850004 (0.0029) [2024-06-25 08:41:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 13926498304. Throughput: 0: 42716.0. Samples: 13926627400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:41:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 08:42:01,306][15401] Updated weights for policy 0, policy_version 850014 (0.0030) [2024-06-25 08:42:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13926727680. Throughput: 0: 42789.3. Samples: 13926882500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 08:42:03,737][15349] Signal inference workers to stop experience collection... (206250 times) [2024-06-25 08:42:03,737][15349] Signal inference workers to resume experience collection... (206250 times) [2024-06-25 08:42:03,771][15401] InferenceWorker_p0-w0: stopping experience collection (206250 times) [2024-06-25 08:42:03,771][15401] InferenceWorker_p0-w0: resuming experience collection (206250 times) [2024-06-25 08:42:04,579][15401] Updated weights for policy 0, policy_version 850024 (0.0039) [2024-06-25 08:42:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 13926940672. Throughput: 0: 42793.3. Samples: 13927016300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:08,392][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 08:42:09,107][15401] Updated weights for policy 0, policy_version 850034 (0.0038) [2024-06-25 08:42:12,020][15401] Updated weights for policy 0, policy_version 850044 (0.0037) [2024-06-25 08:42:13,391][15132] Fps is (10 sec: 42590.8, 60 sec: 43143.3, 300 sec: 42598.1). Total num frames: 13927153664. Throughput: 0: 42584.9. Samples: 13927265640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:13,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 08:42:16,636][15401] Updated weights for policy 0, policy_version 850054 (0.0038) [2024-06-25 08:42:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13927383040. Throughput: 0: 42771.5. Samples: 13927526280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 08:42:19,689][15401] Updated weights for policy 0, policy_version 850064 (0.0032) [2024-06-25 08:42:23,390][15132] Fps is (10 sec: 42605.8, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 13927579648. Throughput: 0: 42712.4. Samples: 13927653580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:23,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 08:42:24,230][15401] Updated weights for policy 0, policy_version 850074 (0.0036) [2024-06-25 08:42:27,722][15401] Updated weights for policy 0, policy_version 850084 (0.0032) [2024-06-25 08:42:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13927792640. Throughput: 0: 42726.2. Samples: 13927908460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:28,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 08:42:32,215][15401] Updated weights for policy 0, policy_version 850094 (0.0043) [2024-06-25 08:42:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13928005632. Throughput: 0: 42790.2. Samples: 13928168820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 08:42:35,395][15401] Updated weights for policy 0, policy_version 850104 (0.0033) [2024-06-25 08:42:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 13928218624. Throughput: 0: 42688.9. Samples: 13928295480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:38,392][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 08:42:39,723][15401] Updated weights for policy 0, policy_version 850114 (0.0029) [2024-06-25 08:42:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13928415232. Throughput: 0: 42641.8. Samples: 13928546280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 08:42:43,470][15401] Updated weights for policy 0, policy_version 850124 (0.0044) [2024-06-25 08:42:47,595][15401] Updated weights for policy 0, policy_version 850134 (0.0039) [2024-06-25 08:42:48,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 13928644608. Throughput: 0: 42707.5. Samples: 13928804440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:48,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 08:42:51,388][15401] Updated weights for policy 0, policy_version 850144 (0.0034) [2024-06-25 08:42:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13928841216. Throughput: 0: 42569.8. Samples: 13928931940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 08:42:55,362][15401] Updated weights for policy 0, policy_version 850154 (0.0027) [2024-06-25 08:42:58,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 13929054208. Throughput: 0: 42520.0. Samples: 13929178960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:42:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 08:42:59,166][15401] Updated weights for policy 0, policy_version 850164 (0.0043) [2024-06-25 08:43:02,927][15401] Updated weights for policy 0, policy_version 850174 (0.0050) [2024-06-25 08:43:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 13929267200. Throughput: 0: 42448.4. Samples: 13929436460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:43:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 08:43:06,985][15401] Updated weights for policy 0, policy_version 850184 (0.0032) [2024-06-25 08:43:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 13929480192. Throughput: 0: 42526.6. Samples: 13929567280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:43:08,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 08:43:10,607][15401] Updated weights for policy 0, policy_version 850194 (0.0028) [2024-06-25 08:43:13,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42598.0, 300 sec: 42542.5). Total num frames: 13929709568. Throughput: 0: 42406.6. Samples: 13929816860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 08:43:13,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 08:43:14,636][15401] Updated weights for policy 0, policy_version 850204 (0.0028) [2024-06-25 08:43:18,323][15401] Updated weights for policy 0, policy_version 850214 (0.0039) [2024-06-25 08:43:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 13929906176. Throughput: 0: 42298.8. Samples: 13930072260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:18,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 08:43:22,312][15401] Updated weights for policy 0, policy_version 850224 (0.0036) [2024-06-25 08:43:23,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13930119168. Throughput: 0: 42246.6. Samples: 13930196480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:23,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 08:43:26,155][15401] Updated weights for policy 0, policy_version 850234 (0.0041) [2024-06-25 08:43:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.8). Total num frames: 13930332160. Throughput: 0: 42229.0. Samples: 13930446580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 08:43:30,218][15401] Updated weights for policy 0, policy_version 850244 (0.0025) [2024-06-25 08:43:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13930545152. Throughput: 0: 42310.2. Samples: 13930708300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 08:43:33,895][15401] Updated weights for policy 0, policy_version 850254 (0.0032) [2024-06-25 08:43:37,609][15401] Updated weights for policy 0, policy_version 850264 (0.0042) [2024-06-25 08:43:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42326.9, 300 sec: 42542.9). Total num frames: 13930758144. Throughput: 0: 42260.3. Samples: 13930833660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 08:43:41,514][15401] Updated weights for policy 0, policy_version 850274 (0.0027) [2024-06-25 08:43:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13930987520. Throughput: 0: 42480.7. Samples: 13931090600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:43,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 08:43:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850280_13930987520.pth... [2024-06-25 08:43:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849658_13920796672.pth [2024-06-25 08:43:45,326][15401] Updated weights for policy 0, policy_version 850284 (0.0049) [2024-06-25 08:43:46,904][15349] Signal inference workers to stop experience collection... (206300 times) [2024-06-25 08:43:46,947][15401] InferenceWorker_p0-w0: stopping experience collection (206300 times) [2024-06-25 08:43:46,957][15349] Signal inference workers to resume experience collection... (206300 times) [2024-06-25 08:43:46,963][15401] InferenceWorker_p0-w0: resuming experience collection (206300 times) [2024-06-25 08:43:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 13931184128. Throughput: 0: 42533.5. Samples: 13931350460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 08:43:49,006][15401] Updated weights for policy 0, policy_version 850294 (0.0033) [2024-06-25 08:43:53,229][15401] Updated weights for policy 0, policy_version 850304 (0.0032) [2024-06-25 08:43:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13931397120. Throughput: 0: 42321.0. Samples: 13931471720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 08:43:56,565][15401] Updated weights for policy 0, policy_version 850314 (0.0031) [2024-06-25 08:43:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 13931626496. Throughput: 0: 42655.9. Samples: 13931736280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:43:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 08:44:00,779][15401] Updated weights for policy 0, policy_version 850324 (0.0031) [2024-06-25 08:44:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 13931823104. Throughput: 0: 42657.6. Samples: 13931991860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:03,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 08:44:04,037][15401] Updated weights for policy 0, policy_version 850334 (0.0023) [2024-06-25 08:44:08,382][15401] Updated weights for policy 0, policy_version 850344 (0.0028) [2024-06-25 08:44:08,395][15132] Fps is (10 sec: 40936.5, 60 sec: 42594.3, 300 sec: 42542.0). Total num frames: 13932036096. Throughput: 0: 42700.3. Samples: 13932118240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:08,396][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 08:44:11,824][15401] Updated weights for policy 0, policy_version 850354 (0.0035) [2024-06-25 08:44:13,394][15132] Fps is (10 sec: 42580.8, 60 sec: 42324.0, 300 sec: 42486.7). Total num frames: 13932249088. Throughput: 0: 42779.5. Samples: 13932371840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:13,394][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 08:44:16,151][15401] Updated weights for policy 0, policy_version 850364 (0.0038) [2024-06-25 08:44:18,389][15132] Fps is (10 sec: 42623.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 13932462080. Throughput: 0: 42501.9. Samples: 13932620880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 08:44:19,849][15401] Updated weights for policy 0, policy_version 850374 (0.0042) [2024-06-25 08:44:23,389][15132] Fps is (10 sec: 40977.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13932658688. Throughput: 0: 42728.6. Samples: 13932756440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:23,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 08:44:23,925][15401] Updated weights for policy 0, policy_version 850384 (0.0027) [2024-06-25 08:44:27,344][15401] Updated weights for policy 0, policy_version 850394 (0.0038) [2024-06-25 08:44:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 13932871680. Throughput: 0: 42529.8. Samples: 13933004440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:28,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 08:44:31,581][15401] Updated weights for policy 0, policy_version 850404 (0.0036) [2024-06-25 08:44:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 13933101056. Throughput: 0: 42383.5. Samples: 13933257720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 08:44:35,490][15401] Updated weights for policy 0, policy_version 850414 (0.0048) [2024-06-25 08:44:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 13933297664. Throughput: 0: 42500.8. Samples: 13933384260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 08:44:39,265][15401] Updated weights for policy 0, policy_version 850424 (0.0022) [2024-06-25 08:44:42,982][15401] Updated weights for policy 0, policy_version 850434 (0.0035) [2024-06-25 08:44:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 13933527040. Throughput: 0: 42532.5. Samples: 13933650240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 08:44:46,781][15401] Updated weights for policy 0, policy_version 850444 (0.0027) [2024-06-25 08:44:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 13933740032. Throughput: 0: 42530.3. Samples: 13933905720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:48,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 08:44:50,628][15401] Updated weights for policy 0, policy_version 850454 (0.0047) [2024-06-25 08:44:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 13933953024. Throughput: 0: 42440.0. Samples: 13934027800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 08:44:54,237][15401] Updated weights for policy 0, policy_version 850464 (0.0033) [2024-06-25 08:44:58,340][15401] Updated weights for policy 0, policy_version 850474 (0.0027) [2024-06-25 08:44:58,390][15132] Fps is (10 sec: 42594.6, 60 sec: 42324.7, 300 sec: 42487.2). Total num frames: 13934166016. Throughput: 0: 42597.8. Samples: 13934288600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:44:58,391][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 08:45:01,714][15401] Updated weights for policy 0, policy_version 850484 (0.0033) [2024-06-25 08:45:03,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13934379008. Throughput: 0: 42863.0. Samples: 13934549720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 08:45:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 08:45:05,988][15401] Updated weights for policy 0, policy_version 850494 (0.0042) [2024-06-25 08:45:08,390][15132] Fps is (10 sec: 42602.1, 60 sec: 42602.5, 300 sec: 42542.9). Total num frames: 13934592000. Throughput: 0: 42647.4. Samples: 13934675580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:08,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-25 08:45:09,388][15401] Updated weights for policy 0, policy_version 850504 (0.0033) [2024-06-25 08:45:11,715][15349] Signal inference workers to stop experience collection... (206350 times) [2024-06-25 08:45:11,766][15401] InferenceWorker_p0-w0: stopping experience collection (206350 times) [2024-06-25 08:45:11,766][15349] Signal inference workers to resume experience collection... (206350 times) [2024-06-25 08:45:11,784][15401] InferenceWorker_p0-w0: resuming experience collection (206350 times) [2024-06-25 08:45:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42601.5, 300 sec: 42487.3). Total num frames: 13934804992. Throughput: 0: 42827.7. Samples: 13934931680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 08:45:13,433][15401] Updated weights for policy 0, policy_version 850514 (0.0046) [2024-06-25 08:45:17,138][15401] Updated weights for policy 0, policy_version 850524 (0.0027) [2024-06-25 08:45:18,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 13935017984. Throughput: 0: 42952.0. Samples: 13935190660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:18,392][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 08:45:21,378][15401] Updated weights for policy 0, policy_version 850534 (0.0033) [2024-06-25 08:45:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13935230976. Throughput: 0: 42951.3. Samples: 13935317060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 08:45:24,586][15401] Updated weights for policy 0, policy_version 850544 (0.0031) [2024-06-25 08:45:28,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 13935460352. Throughput: 0: 42642.2. Samples: 13935569240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:28,401][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 08:45:28,919][15401] Updated weights for policy 0, policy_version 850554 (0.0022) [2024-06-25 08:45:32,518][15401] Updated weights for policy 0, policy_version 850564 (0.0032) [2024-06-25 08:45:33,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13935673344. Throughput: 0: 42656.9. Samples: 13935825380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:33,401][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 08:45:36,397][15401] Updated weights for policy 0, policy_version 850574 (0.0034) [2024-06-25 08:45:38,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13935869952. Throughput: 0: 42864.2. Samples: 13935956680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:38,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 08:45:40,304][15401] Updated weights for policy 0, policy_version 850584 (0.0035) [2024-06-25 08:45:43,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13936082944. Throughput: 0: 42734.4. Samples: 13936211600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 08:45:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850592_13936099328.pth... [2024-06-25 08:45:43,564][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000849968_13925875712.pth [2024-06-25 08:45:43,934][15401] Updated weights for policy 0, policy_version 850594 (0.0024) [2024-06-25 08:45:48,236][15401] Updated weights for policy 0, policy_version 850604 (0.0034) [2024-06-25 08:45:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42599.0). Total num frames: 13936295936. Throughput: 0: 42669.3. Samples: 13936469840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 08:45:51,849][15401] Updated weights for policy 0, policy_version 850614 (0.0043) [2024-06-25 08:45:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 13936525312. Throughput: 0: 42621.0. Samples: 13936593520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 08:45:55,736][15401] Updated weights for policy 0, policy_version 850624 (0.0040) [2024-06-25 08:45:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42599.1, 300 sec: 42542.9). Total num frames: 13936721920. Throughput: 0: 42618.6. Samples: 13936849520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:45:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 08:45:59,699][15401] Updated weights for policy 0, policy_version 850634 (0.0033) [2024-06-25 08:46:03,309][15401] Updated weights for policy 0, policy_version 850644 (0.0029) [2024-06-25 08:46:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 13936951296. Throughput: 0: 42632.0. Samples: 13937109100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:03,393][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 08:46:07,651][15401] Updated weights for policy 0, policy_version 850654 (0.0035) [2024-06-25 08:46:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13937164288. Throughput: 0: 42613.2. Samples: 13937234660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:08,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 08:46:11,058][15401] Updated weights for policy 0, policy_version 850664 (0.0040) [2024-06-25 08:46:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13937377280. Throughput: 0: 42784.1. Samples: 13937494420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 08:46:15,333][15401] Updated weights for policy 0, policy_version 850674 (0.0037) [2024-06-25 08:46:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 13937573888. Throughput: 0: 42775.1. Samples: 13937750160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:18,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 08:46:18,994][15401] Updated weights for policy 0, policy_version 850684 (0.0035) [2024-06-25 08:46:22,874][15401] Updated weights for policy 0, policy_version 850694 (0.0035) [2024-06-25 08:46:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13937803264. Throughput: 0: 42611.9. Samples: 13937874220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:23,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 08:46:26,749][15401] Updated weights for policy 0, policy_version 850704 (0.0025) [2024-06-25 08:46:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 13937999872. Throughput: 0: 42768.3. Samples: 13938136280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:28,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 08:46:30,309][15401] Updated weights for policy 0, policy_version 850714 (0.0028) [2024-06-25 08:46:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 13938212864. Throughput: 0: 42736.0. Samples: 13938392960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 08:46:34,348][15401] Updated weights for policy 0, policy_version 850724 (0.0030) [2024-06-25 08:46:36,244][15349] Signal inference workers to stop experience collection... (206400 times) [2024-06-25 08:46:36,275][15401] InferenceWorker_p0-w0: stopping experience collection (206400 times) [2024-06-25 08:46:36,293][15349] Signal inference workers to resume experience collection... (206400 times) [2024-06-25 08:46:36,294][15401] InferenceWorker_p0-w0: resuming experience collection (206400 times) [2024-06-25 08:46:38,165][15401] Updated weights for policy 0, policy_version 850734 (0.0034) [2024-06-25 08:46:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13938442240. Throughput: 0: 42913.3. Samples: 13938524620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 08:46:41,845][15401] Updated weights for policy 0, policy_version 850744 (0.0035) [2024-06-25 08:46:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13938638848. Throughput: 0: 42797.2. Samples: 13938775400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 08:46:45,524][15401] Updated weights for policy 0, policy_version 850754 (0.0035) [2024-06-25 08:46:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 13938868224. Throughput: 0: 42753.8. Samples: 13939033020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:48,392][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 08:46:49,533][15401] Updated weights for policy 0, policy_version 850764 (0.0030) [2024-06-25 08:46:53,222][15401] Updated weights for policy 0, policy_version 850774 (0.0043) [2024-06-25 08:46:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13939097600. Throughput: 0: 42937.3. Samples: 13939166840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-25 08:46:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 08:46:57,422][15401] Updated weights for policy 0, policy_version 850784 (0.0034) [2024-06-25 08:46:58,396][15132] Fps is (10 sec: 40943.8, 60 sec: 42593.8, 300 sec: 42541.9). Total num frames: 13939277824. Throughput: 0: 42833.9. Samples: 13939422220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:46:58,396][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 08:47:00,865][15401] Updated weights for policy 0, policy_version 850794 (0.0041) [2024-06-25 08:47:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 13939523584. Throughput: 0: 42760.5. Samples: 13939674380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:03,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 08:47:04,959][15401] Updated weights for policy 0, policy_version 850804 (0.0039) [2024-06-25 08:47:08,310][15401] Updated weights for policy 0, policy_version 850814 (0.0043) [2024-06-25 08:47:08,389][15132] Fps is (10 sec: 45904.5, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 13939736576. Throughput: 0: 42866.3. Samples: 13939803200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:08,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 08:47:12,502][15401] Updated weights for policy 0, policy_version 850824 (0.0031) [2024-06-25 08:47:13,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13939933184. Throughput: 0: 42875.5. Samples: 13940065580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:13,390][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 08:47:16,656][15401] Updated weights for policy 0, policy_version 850834 (0.0024) [2024-06-25 08:47:18,390][15132] Fps is (10 sec: 42596.1, 60 sec: 43144.2, 300 sec: 42653.9). Total num frames: 13940162560. Throughput: 0: 42578.3. Samples: 13940309000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:18,391][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 08:47:20,621][15401] Updated weights for policy 0, policy_version 850844 (0.0045) [2024-06-25 08:47:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13940359168. Throughput: 0: 42558.9. Samples: 13940439780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 08:47:24,096][15401] Updated weights for policy 0, policy_version 850854 (0.0036) [2024-06-25 08:47:27,995][15401] Updated weights for policy 0, policy_version 850864 (0.0039) [2024-06-25 08:47:28,390][15132] Fps is (10 sec: 40961.6, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 13940572160. Throughput: 0: 42850.6. Samples: 13940703680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 08:47:31,588][15401] Updated weights for policy 0, policy_version 850874 (0.0042) [2024-06-25 08:47:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 13940801536. Throughput: 0: 42761.0. Samples: 13940957160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 08:47:35,577][15401] Updated weights for policy 0, policy_version 850884 (0.0046) [2024-06-25 08:47:38,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13941014528. Throughput: 0: 42736.4. Samples: 13941089980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 08:47:39,102][15401] Updated weights for policy 0, policy_version 850894 (0.0030) [2024-06-25 08:47:43,350][15401] Updated weights for policy 0, policy_version 850904 (0.0033) [2024-06-25 08:47:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 13941211136. Throughput: 0: 42686.0. Samples: 13941342820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 08:47:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850905_13941227520.pth... [2024-06-25 08:47:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850280_13930987520.pth [2024-06-25 08:47:46,714][15401] Updated weights for policy 0, policy_version 850914 (0.0031) [2024-06-25 08:47:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 13941440512. Throughput: 0: 42589.2. Samples: 13941590900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:48,396][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 08:47:50,938][15401] Updated weights for policy 0, policy_version 850924 (0.0036) [2024-06-25 08:47:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13941653504. Throughput: 0: 42745.8. Samples: 13941726760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:53,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 08:47:54,458][15401] Updated weights for policy 0, policy_version 850934 (0.0040) [2024-06-25 08:47:55,419][15349] Signal inference workers to stop experience collection... (206450 times) [2024-06-25 08:47:55,419][15349] Signal inference workers to resume experience collection... (206450 times) [2024-06-25 08:47:55,466][15401] InferenceWorker_p0-w0: stopping experience collection (206450 times) [2024-06-25 08:47:55,466][15401] InferenceWorker_p0-w0: resuming experience collection (206450 times) [2024-06-25 08:47:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42876.1, 300 sec: 42654.0). Total num frames: 13941850112. Throughput: 0: 42674.8. Samples: 13941985940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:47:58,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 08:47:58,477][15401] Updated weights for policy 0, policy_version 850944 (0.0038) [2024-06-25 08:48:02,023][15401] Updated weights for policy 0, policy_version 850954 (0.0034) [2024-06-25 08:48:03,391][15132] Fps is (10 sec: 44227.8, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 13942095872. Throughput: 0: 42951.9. Samples: 13942241900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:03,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 08:48:06,170][15401] Updated weights for policy 0, policy_version 850964 (0.0032) [2024-06-25 08:48:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 13942292480. Throughput: 0: 42993.1. Samples: 13942374460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 08:48:09,708][15401] Updated weights for policy 0, policy_version 850974 (0.0035) [2024-06-25 08:48:13,389][15132] Fps is (10 sec: 37690.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 13942472704. Throughput: 0: 42723.8. Samples: 13942626240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 08:48:13,718][15401] Updated weights for policy 0, policy_version 850984 (0.0033) [2024-06-25 08:48:17,227][15401] Updated weights for policy 0, policy_version 850994 (0.0029) [2024-06-25 08:48:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 13942718464. Throughput: 0: 42820.4. Samples: 13942884080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 08:48:21,485][15401] Updated weights for policy 0, policy_version 851004 (0.0032) [2024-06-25 08:48:23,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13942947840. Throughput: 0: 42761.0. Samples: 13943014220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 08:48:25,496][15401] Updated weights for policy 0, policy_version 851014 (0.0037) [2024-06-25 08:48:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13943128064. Throughput: 0: 42801.8. Samples: 13943268900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 08:48:29,506][15401] Updated weights for policy 0, policy_version 851024 (0.0025) [2024-06-25 08:48:32,865][15401] Updated weights for policy 0, policy_version 851034 (0.0032) [2024-06-25 08:48:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13943373824. Throughput: 0: 43187.1. Samples: 13943534320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:33,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 08:48:36,965][15401] Updated weights for policy 0, policy_version 851044 (0.0038) [2024-06-25 08:48:38,390][15132] Fps is (10 sec: 47512.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13943603200. Throughput: 0: 43069.1. Samples: 13943664880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 08:48:40,357][15401] Updated weights for policy 0, policy_version 851054 (0.0040) [2024-06-25 08:48:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13943783424. Throughput: 0: 42998.2. Samples: 13943920860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 08:48:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 08:48:44,339][15401] Updated weights for policy 0, policy_version 851064 (0.0033) [2024-06-25 08:48:47,950][15401] Updated weights for policy 0, policy_version 851074 (0.0038) [2024-06-25 08:48:48,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13944012800. Throughput: 0: 42894.8. Samples: 13944172080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:48:48,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 08:48:51,781][15401] Updated weights for policy 0, policy_version 851084 (0.0045) [2024-06-25 08:48:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 13944209408. Throughput: 0: 42801.8. Samples: 13944300540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:48:53,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 08:48:55,815][15401] Updated weights for policy 0, policy_version 851094 (0.0028) [2024-06-25 08:48:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13944422400. Throughput: 0: 42785.3. Samples: 13944551580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:48:58,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 08:49:00,204][15401] Updated weights for policy 0, policy_version 851104 (0.0025) [2024-06-25 08:49:01,370][15349] Signal inference workers to stop experience collection... (206500 times) [2024-06-25 08:49:01,421][15401] InferenceWorker_p0-w0: stopping experience collection (206500 times) [2024-06-25 08:49:01,424][15349] Signal inference workers to resume experience collection... (206500 times) [2024-06-25 08:49:01,436][15401] InferenceWorker_p0-w0: resuming experience collection (206500 times) [2024-06-25 08:49:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42326.6, 300 sec: 42710.3). Total num frames: 13944635392. Throughput: 0: 42731.5. Samples: 13944807000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 08:49:03,814][15401] Updated weights for policy 0, policy_version 851114 (0.0044) [2024-06-25 08:49:07,926][15401] Updated weights for policy 0, policy_version 851124 (0.0029) [2024-06-25 08:49:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.6). Total num frames: 13944832000. Throughput: 0: 42766.1. Samples: 13944938700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 08:49:11,331][15401] Updated weights for policy 0, policy_version 851134 (0.0034) [2024-06-25 08:49:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 13945077760. Throughput: 0: 42730.2. Samples: 13945191760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 08:49:15,503][15401] Updated weights for policy 0, policy_version 851144 (0.0036) [2024-06-25 08:49:18,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 13945290752. Throughput: 0: 42542.2. Samples: 13945448820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:18,393][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 08:49:18,848][15401] Updated weights for policy 0, policy_version 851154 (0.0037) [2024-06-25 08:49:23,134][15401] Updated weights for policy 0, policy_version 851164 (0.0028) [2024-06-25 08:49:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13945487360. Throughput: 0: 42414.0. Samples: 13945573500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 08:49:26,363][15401] Updated weights for policy 0, policy_version 851174 (0.0028) [2024-06-25 08:49:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13945700352. Throughput: 0: 42614.2. Samples: 13945838500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:28,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 08:49:30,658][15401] Updated weights for policy 0, policy_version 851184 (0.0032) [2024-06-25 08:49:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13945913344. Throughput: 0: 42609.6. Samples: 13946089520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:33,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 08:49:33,880][15401] Updated weights for policy 0, policy_version 851194 (0.0038) [2024-06-25 08:49:38,263][15401] Updated weights for policy 0, policy_version 851204 (0.0028) [2024-06-25 08:49:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.5, 300 sec: 42709.5). Total num frames: 13946126336. Throughput: 0: 42520.4. Samples: 13946213960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 08:49:42,043][15401] Updated weights for policy 0, policy_version 851214 (0.0029) [2024-06-25 08:49:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13946355712. Throughput: 0: 42670.6. Samples: 13946471760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 08:49:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851218_13946355712.pth... [2024-06-25 08:49:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850592_13936099328.pth [2024-06-25 08:49:45,759][15401] Updated weights for policy 0, policy_version 851224 (0.0036) [2024-06-25 08:49:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13946568704. Throughput: 0: 42657.9. Samples: 13946726600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 08:49:49,419][15349] Signal inference workers to stop experience collection... (206550 times) [2024-06-25 08:49:49,419][15349] Signal inference workers to resume experience collection... (206550 times) [2024-06-25 08:49:49,434][15401] InferenceWorker_p0-w0: stopping experience collection (206550 times) [2024-06-25 08:49:49,434][15401] InferenceWorker_p0-w0: resuming experience collection (206550 times) [2024-06-25 08:49:49,566][15401] Updated weights for policy 0, policy_version 851234 (0.0039) [2024-06-25 08:49:53,190][15401] Updated weights for policy 0, policy_version 851244 (0.0032) [2024-06-25 08:49:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 13946781696. Throughput: 0: 42728.2. Samples: 13946861460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 08:49:57,262][15401] Updated weights for policy 0, policy_version 851254 (0.0031) [2024-06-25 08:49:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13946994688. Throughput: 0: 42770.7. Samples: 13947116440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:49:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 08:50:00,809][15401] Updated weights for policy 0, policy_version 851264 (0.0030) [2024-06-25 08:50:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 13947207680. Throughput: 0: 42792.6. Samples: 13947374380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 08:50:04,706][15401] Updated weights for policy 0, policy_version 851274 (0.0036) [2024-06-25 08:50:08,275][15401] Updated weights for policy 0, policy_version 851284 (0.0040) [2024-06-25 08:50:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13947437056. Throughput: 0: 42850.2. Samples: 13947501760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 08:50:12,698][15401] Updated weights for policy 0, policy_version 851294 (0.0036) [2024-06-25 08:50:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 13947633664. Throughput: 0: 42694.1. Samples: 13947759740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:13,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 08:50:15,921][15401] Updated weights for policy 0, policy_version 851304 (0.0032) [2024-06-25 08:50:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 13947863040. Throughput: 0: 42741.4. Samples: 13948012880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:18,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 08:50:20,443][15401] Updated weights for policy 0, policy_version 851314 (0.0024) [2024-06-25 08:50:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 13948076032. Throughput: 0: 42919.6. Samples: 13948145340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 08:50:23,429][15401] Updated weights for policy 0, policy_version 851324 (0.0035) [2024-06-25 08:50:27,982][15401] Updated weights for policy 0, policy_version 851334 (0.0039) [2024-06-25 08:50:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 13948272640. Throughput: 0: 42940.9. Samples: 13948404100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 08:50:31,186][15401] Updated weights for policy 0, policy_version 851344 (0.0039) [2024-06-25 08:50:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13948485632. Throughput: 0: 42889.2. Samples: 13948656620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 08:50:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 08:50:35,783][15401] Updated weights for policy 0, policy_version 851354 (0.0024) [2024-06-25 08:50:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 13948715008. Throughput: 0: 42791.8. Samples: 13948787100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:50:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 08:50:38,810][15401] Updated weights for policy 0, policy_version 851364 (0.0036) [2024-06-25 08:50:43,097][15401] Updated weights for policy 0, policy_version 851374 (0.0027) [2024-06-25 08:50:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13948911616. Throughput: 0: 42879.5. Samples: 13949046020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:50:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 08:50:46,716][15401] Updated weights for policy 0, policy_version 851384 (0.0031) [2024-06-25 08:50:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13949124608. Throughput: 0: 42918.2. Samples: 13949305700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:50:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 08:50:50,634][15401] Updated weights for policy 0, policy_version 851394 (0.0033) [2024-06-25 08:50:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13949353984. Throughput: 0: 42851.4. Samples: 13949430080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:50:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 08:50:54,163][15401] Updated weights for policy 0, policy_version 851404 (0.0028) [2024-06-25 08:50:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 13949550592. Throughput: 0: 42618.6. Samples: 13949677580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:50:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 08:50:58,608][15401] Updated weights for policy 0, policy_version 851414 (0.0038) [2024-06-25 08:51:01,760][15349] Signal inference workers to stop experience collection... (206600 times) [2024-06-25 08:51:01,762][15349] Signal inference workers to resume experience collection... (206600 times) [2024-06-25 08:51:01,782][15401] Updated weights for policy 0, policy_version 851424 (0.0052) [2024-06-25 08:51:01,811][15401] InferenceWorker_p0-w0: stopping experience collection (206600 times) [2024-06-25 08:51:01,811][15401] InferenceWorker_p0-w0: resuming experience collection (206600 times) [2024-06-25 08:51:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 13949763584. Throughput: 0: 42750.5. Samples: 13949936660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 08:51:06,238][15401] Updated weights for policy 0, policy_version 851434 (0.0027) [2024-06-25 08:51:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13949992960. Throughput: 0: 42783.1. Samples: 13950070580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 08:51:09,302][15401] Updated weights for policy 0, policy_version 851444 (0.0037) [2024-06-25 08:51:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13950189568. Throughput: 0: 42626.2. Samples: 13950322280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 08:51:13,903][15401] Updated weights for policy 0, policy_version 851454 (0.0034) [2024-06-25 08:51:17,352][15401] Updated weights for policy 0, policy_version 851464 (0.0032) [2024-06-25 08:51:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13950418944. Throughput: 0: 42648.5. Samples: 13950575800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 08:51:21,402][15401] Updated weights for policy 0, policy_version 851474 (0.0032) [2024-06-25 08:51:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42820.9). Total num frames: 13950631936. Throughput: 0: 42737.2. Samples: 13950710280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 08:51:24,926][15401] Updated weights for policy 0, policy_version 851484 (0.0035) [2024-06-25 08:51:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13950828544. Throughput: 0: 42667.0. Samples: 13950966040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 08:51:29,010][15401] Updated weights for policy 0, policy_version 851494 (0.0036) [2024-06-25 08:51:32,659][15401] Updated weights for policy 0, policy_version 851504 (0.0023) [2024-06-25 08:51:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13951074304. Throughput: 0: 42535.6. Samples: 13951219800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 08:51:36,851][15401] Updated weights for policy 0, policy_version 851514 (0.0037) [2024-06-25 08:51:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13951270912. Throughput: 0: 42702.8. Samples: 13951351700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 08:51:40,162][15401] Updated weights for policy 0, policy_version 851524 (0.0033) [2024-06-25 08:51:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 13951500288. Throughput: 0: 42984.4. Samples: 13951611880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 08:51:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851532_13951500288.pth... [2024-06-25 08:51:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000850905_13941227520.pth [2024-06-25 08:51:44,415][15401] Updated weights for policy 0, policy_version 851534 (0.0037) [2024-06-25 08:51:47,627][15401] Updated weights for policy 0, policy_version 851544 (0.0035) [2024-06-25 08:51:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 13951713280. Throughput: 0: 42720.9. Samples: 13951859100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 08:51:51,942][15401] Updated weights for policy 0, policy_version 851554 (0.0040) [2024-06-25 08:51:53,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42870.4, 300 sec: 42876.8). Total num frames: 13951926272. Throughput: 0: 42745.5. Samples: 13951994200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:53,392][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 08:51:55,196][15401] Updated weights for policy 0, policy_version 851564 (0.0030) [2024-06-25 08:51:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13952122880. Throughput: 0: 42943.6. Samples: 13952254740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:51:58,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 08:51:59,783][15401] Updated weights for policy 0, policy_version 851574 (0.0033) [2024-06-25 08:52:03,022][15401] Updated weights for policy 0, policy_version 851584 (0.0026) [2024-06-25 08:52:03,390][15132] Fps is (10 sec: 44243.3, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 13952368640. Throughput: 0: 42796.4. Samples: 13952501640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:52:03,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 08:52:07,340][15401] Updated weights for policy 0, policy_version 851594 (0.0039) [2024-06-25 08:52:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13952548864. Throughput: 0: 42765.5. Samples: 13952634720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:52:08,399][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 08:52:10,791][15401] Updated weights for policy 0, policy_version 851604 (0.0037) [2024-06-25 08:52:13,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 13952778240. Throughput: 0: 42805.1. Samples: 13952892260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:52:13,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 08:52:15,014][15401] Updated weights for policy 0, policy_version 851614 (0.0035) [2024-06-25 08:52:16,703][15349] Signal inference workers to stop experience collection... (206650 times) [2024-06-25 08:52:16,708][15349] Signal inference workers to resume experience collection... (206650 times) [2024-06-25 08:52:16,756][15401] InferenceWorker_p0-w0: stopping experience collection (206650 times) [2024-06-25 08:52:16,756][15401] InferenceWorker_p0-w0: resuming experience collection (206650 times) [2024-06-25 08:52:18,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 13952991232. Throughput: 0: 42797.2. Samples: 13953145780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:52:18,401][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 08:52:18,480][15401] Updated weights for policy 0, policy_version 851624 (0.0028) [2024-06-25 08:52:22,821][15401] Updated weights for policy 0, policy_version 851634 (0.0037) [2024-06-25 08:52:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13953187840. Throughput: 0: 42697.3. Samples: 13953273080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 08:52:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 08:52:26,387][15401] Updated weights for policy 0, policy_version 851644 (0.0023) [2024-06-25 08:52:28,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13953417216. Throughput: 0: 42589.4. Samples: 13953528400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 08:52:30,734][15401] Updated weights for policy 0, policy_version 851654 (0.0047) [2024-06-25 08:52:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13953646592. Throughput: 0: 42824.2. Samples: 13953786180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 08:52:34,001][15401] Updated weights for policy 0, policy_version 851664 (0.0033) [2024-06-25 08:52:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13953810432. Throughput: 0: 42727.2. Samples: 13953916860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 08:52:38,579][15401] Updated weights for policy 0, policy_version 851674 (0.0027) [2024-06-25 08:52:41,553][15401] Updated weights for policy 0, policy_version 851684 (0.0037) [2024-06-25 08:52:43,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 13954056192. Throughput: 0: 42665.7. Samples: 13954174800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:43,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 08:52:46,045][15401] Updated weights for policy 0, policy_version 851694 (0.0033) [2024-06-25 08:52:48,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13954285568. Throughput: 0: 42978.7. Samples: 13954435680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 08:52:49,068][15401] Updated weights for policy 0, policy_version 851704 (0.0038) [2024-06-25 08:52:53,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42326.5, 300 sec: 42765.0). Total num frames: 13954465792. Throughput: 0: 42893.4. Samples: 13954564920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 08:52:53,667][15401] Updated weights for policy 0, policy_version 851714 (0.0030) [2024-06-25 08:52:56,595][15401] Updated weights for policy 0, policy_version 851724 (0.0032) [2024-06-25 08:52:58,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 13954711552. Throughput: 0: 42854.2. Samples: 13954820700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:52:58,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 08:53:01,193][15401] Updated weights for policy 0, policy_version 851734 (0.0039) [2024-06-25 08:53:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13954908160. Throughput: 0: 42985.8. Samples: 13955080040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 08:53:04,156][15401] Updated weights for policy 0, policy_version 851744 (0.0033) [2024-06-25 08:53:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13955121152. Throughput: 0: 43077.7. Samples: 13955211580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 08:53:09,062][15401] Updated weights for policy 0, policy_version 851754 (0.0039) [2024-06-25 08:53:11,735][15401] Updated weights for policy 0, policy_version 851764 (0.0038) [2024-06-25 08:53:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 13955366912. Throughput: 0: 43064.8. Samples: 13955466420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:13,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 08:53:16,578][15401] Updated weights for policy 0, policy_version 851774 (0.0039) [2024-06-25 08:53:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 13955547136. Throughput: 0: 43172.8. Samples: 13955728960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 08:53:19,605][15401] Updated weights for policy 0, policy_version 851784 (0.0039) [2024-06-25 08:53:23,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13955760128. Throughput: 0: 42959.2. Samples: 13955850020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:23,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 08:53:24,472][15401] Updated weights for policy 0, policy_version 851794 (0.0022) [2024-06-25 08:53:27,544][15401] Updated weights for policy 0, policy_version 851804 (0.0038) [2024-06-25 08:53:28,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 13956022272. Throughput: 0: 43020.0. Samples: 13956110600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 08:53:32,102][15401] Updated weights for policy 0, policy_version 851814 (0.0029) [2024-06-25 08:53:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 13956186112. Throughput: 0: 43157.9. Samples: 13956377780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:33,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 08:53:33,526][15349] Signal inference workers to stop experience collection... (206700 times) [2024-06-25 08:53:33,529][15349] Signal inference workers to resume experience collection... (206700 times) [2024-06-25 08:53:33,550][15401] InferenceWorker_p0-w0: stopping experience collection (206700 times) [2024-06-25 08:53:33,550][15401] InferenceWorker_p0-w0: resuming experience collection (206700 times) [2024-06-25 08:53:35,075][15401] Updated weights for policy 0, policy_version 851824 (0.0036) [2024-06-25 08:53:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13956399104. Throughput: 0: 42848.0. Samples: 13956493080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 08:53:39,647][15401] Updated weights for policy 0, policy_version 851834 (0.0028) [2024-06-25 08:53:42,458][15401] Updated weights for policy 0, policy_version 851844 (0.0041) [2024-06-25 08:53:43,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43419.3, 300 sec: 42876.1). Total num frames: 13956661248. Throughput: 0: 42978.7. Samples: 13956754740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 08:53:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851847_13956661248.pth... [2024-06-25 08:53:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851218_13946355712.pth [2024-06-25 08:53:47,193][15401] Updated weights for policy 0, policy_version 851854 (0.0033) [2024-06-25 08:53:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 13956841472. Throughput: 0: 43222.4. Samples: 13957025040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 08:53:49,885][15401] Updated weights for policy 0, policy_version 851864 (0.0042) [2024-06-25 08:53:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 13957070848. Throughput: 0: 42868.0. Samples: 13957140640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 08:53:54,805][15401] Updated weights for policy 0, policy_version 851874 (0.0044) [2024-06-25 08:53:57,857][15401] Updated weights for policy 0, policy_version 851884 (0.0032) [2024-06-25 08:53:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13957283840. Throughput: 0: 42924.9. Samples: 13957397940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:53:58,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 08:54:02,445][15401] Updated weights for policy 0, policy_version 851894 (0.0031) [2024-06-25 08:54:03,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 13957464064. Throughput: 0: 43056.8. Samples: 13957666620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:54:03,392][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 08:54:05,312][15401] Updated weights for policy 0, policy_version 851904 (0.0035) [2024-06-25 08:54:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13957709824. Throughput: 0: 43051.6. Samples: 13957787340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:54:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 08:54:10,004][15401] Updated weights for policy 0, policy_version 851914 (0.0042) [2024-06-25 08:54:12,827][15401] Updated weights for policy 0, policy_version 851924 (0.0037) [2024-06-25 08:54:13,389][15132] Fps is (10 sec: 47525.2, 60 sec: 42873.2, 300 sec: 42876.4). Total num frames: 13957939200. Throughput: 0: 43053.4. Samples: 13958048000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 08:54:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 08:54:17,693][15401] Updated weights for policy 0, policy_version 851934 (0.0032) [2024-06-25 08:54:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 13958119424. Throughput: 0: 42974.7. Samples: 13958311640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 08:54:20,582][15401] Updated weights for policy 0, policy_version 851944 (0.0044) [2024-06-25 08:54:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 13958365184. Throughput: 0: 43053.6. Samples: 13958430500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:23,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 08:54:25,651][15401] Updated weights for policy 0, policy_version 851954 (0.0037) [2024-06-25 08:54:28,135][15401] Updated weights for policy 0, policy_version 851964 (0.0026) [2024-06-25 08:54:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 13958578176. Throughput: 0: 43048.5. Samples: 13958691920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:28,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 08:54:33,145][15401] Updated weights for policy 0, policy_version 851974 (0.0032) [2024-06-25 08:54:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13958758400. Throughput: 0: 42880.2. Samples: 13958954660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 08:54:35,803][15401] Updated weights for policy 0, policy_version 851984 (0.0043) [2024-06-25 08:54:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43690.5, 300 sec: 42931.6). Total num frames: 13959020544. Throughput: 0: 42873.7. Samples: 13959069960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:38,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 08:54:40,665][15401] Updated weights for policy 0, policy_version 851994 (0.0034) [2024-06-25 08:54:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13959217152. Throughput: 0: 42989.2. Samples: 13959332460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:43,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 08:54:43,645][15401] Updated weights for policy 0, policy_version 852004 (0.0035) [2024-06-25 08:54:44,332][15349] Signal inference workers to stop experience collection... (206750 times) [2024-06-25 08:54:44,332][15349] Signal inference workers to resume experience collection... (206750 times) [2024-06-25 08:54:44,376][15401] InferenceWorker_p0-w0: stopping experience collection (206750 times) [2024-06-25 08:54:44,376][15401] InferenceWorker_p0-w0: resuming experience collection (206750 times) [2024-06-25 08:54:48,205][15401] Updated weights for policy 0, policy_version 852014 (0.0043) [2024-06-25 08:54:48,390][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 13959397376. Throughput: 0: 42813.8. Samples: 13959593140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 08:54:51,479][15401] Updated weights for policy 0, policy_version 852024 (0.0042) [2024-06-25 08:54:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13959659520. Throughput: 0: 42803.5. Samples: 13959713500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:53,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 08:54:55,761][15401] Updated weights for policy 0, policy_version 852034 (0.0044) [2024-06-25 08:54:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13959856128. Throughput: 0: 42970.2. Samples: 13959981660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:54:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 08:54:59,016][15401] Updated weights for policy 0, policy_version 852044 (0.0050) [2024-06-25 08:55:03,243][15401] Updated weights for policy 0, policy_version 852054 (0.0047) [2024-06-25 08:55:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 13960052736. Throughput: 0: 42679.1. Samples: 13960232200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:03,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 08:55:06,515][15401] Updated weights for policy 0, policy_version 852064 (0.0027) [2024-06-25 08:55:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 13960298496. Throughput: 0: 42770.6. Samples: 13960355180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 08:55:10,788][15401] Updated weights for policy 0, policy_version 852074 (0.0044) [2024-06-25 08:55:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13960495104. Throughput: 0: 42895.0. Samples: 13960622200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 08:55:14,350][15401] Updated weights for policy 0, policy_version 852084 (0.0026) [2024-06-25 08:55:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13960691712. Throughput: 0: 42575.7. Samples: 13960870560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:18,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 08:55:18,434][15401] Updated weights for policy 0, policy_version 852094 (0.0038) [2024-06-25 08:55:21,950][15401] Updated weights for policy 0, policy_version 852104 (0.0032) [2024-06-25 08:55:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 13960921088. Throughput: 0: 42828.5. Samples: 13960997240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 08:55:26,351][15401] Updated weights for policy 0, policy_version 852114 (0.0029) [2024-06-25 08:55:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13961134080. Throughput: 0: 42818.8. Samples: 13961259300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:28,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 08:55:29,638][15401] Updated weights for policy 0, policy_version 852124 (0.0031) [2024-06-25 08:55:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13961347072. Throughput: 0: 42684.9. Samples: 13961513960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 08:55:33,858][15401] Updated weights for policy 0, policy_version 852134 (0.0028) [2024-06-25 08:55:37,202][15401] Updated weights for policy 0, policy_version 852144 (0.0033) [2024-06-25 08:55:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13961576448. Throughput: 0: 42757.8. Samples: 13961637600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 08:55:41,399][15401] Updated weights for policy 0, policy_version 852154 (0.0041) [2024-06-25 08:55:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13961789440. Throughput: 0: 42685.7. Samples: 13961902520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 08:55:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852160_13961789440.pth... [2024-06-25 08:55:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851532_13951500288.pth [2024-06-25 08:55:45,172][15401] Updated weights for policy 0, policy_version 852164 (0.0035) [2024-06-25 08:55:46,949][15349] Signal inference workers to stop experience collection... (206800 times) [2024-06-25 08:55:46,977][15401] InferenceWorker_p0-w0: stopping experience collection (206800 times) [2024-06-25 08:55:47,004][15349] Signal inference workers to resume experience collection... (206800 times) [2024-06-25 08:55:47,008][15401] InferenceWorker_p0-w0: resuming experience collection (206800 times) [2024-06-25 08:55:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13961986048. Throughput: 0: 42659.1. Samples: 13962151860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 08:55:49,159][15401] Updated weights for policy 0, policy_version 852174 (0.0034) [2024-06-25 08:55:52,903][15401] Updated weights for policy 0, policy_version 852184 (0.0035) [2024-06-25 08:55:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 13962215424. Throughput: 0: 42795.2. Samples: 13962280960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 08:55:56,523][15401] Updated weights for policy 0, policy_version 852194 (0.0040) [2024-06-25 08:55:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 13962412032. Throughput: 0: 42679.7. Samples: 13962542780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:55:58,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 08:56:00,534][15401] Updated weights for policy 0, policy_version 852204 (0.0038) [2024-06-25 08:56:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13962641408. Throughput: 0: 42898.1. Samples: 13962800980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 08:56:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 08:56:03,865][15401] Updated weights for policy 0, policy_version 852214 (0.0031) [2024-06-25 08:56:08,045][15401] Updated weights for policy 0, policy_version 852224 (0.0034) [2024-06-25 08:56:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 13962854400. Throughput: 0: 42913.4. Samples: 13962928340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 08:56:11,399][15401] Updated weights for policy 0, policy_version 852234 (0.0038) [2024-06-25 08:56:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13963083776. Throughput: 0: 43035.1. Samples: 13963195880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:13,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 08:56:15,843][15401] Updated weights for policy 0, policy_version 852244 (0.0029) [2024-06-25 08:56:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13963280384. Throughput: 0: 42984.9. Samples: 13963448280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 08:56:19,027][15401] Updated weights for policy 0, policy_version 852254 (0.0040) [2024-06-25 08:56:23,396][15132] Fps is (10 sec: 39296.6, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 13963476992. Throughput: 0: 43087.2. Samples: 13963576800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:23,396][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 08:56:23,549][15401] Updated weights for policy 0, policy_version 852264 (0.0040) [2024-06-25 08:56:27,160][15401] Updated weights for policy 0, policy_version 852274 (0.0029) [2024-06-25 08:56:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13963722752. Throughput: 0: 42979.1. Samples: 13963836580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 08:56:31,195][15401] Updated weights for policy 0, policy_version 852284 (0.0028) [2024-06-25 08:56:33,390][15132] Fps is (10 sec: 45904.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 13963935744. Throughput: 0: 42949.3. Samples: 13964084580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 08:56:34,608][15401] Updated weights for policy 0, policy_version 852294 (0.0036) [2024-06-25 08:56:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 13964115968. Throughput: 0: 43111.7. Samples: 13964220980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 08:56:38,872][15401] Updated weights for policy 0, policy_version 852304 (0.0040) [2024-06-25 08:56:42,438][15401] Updated weights for policy 0, policy_version 852314 (0.0038) [2024-06-25 08:56:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13964345344. Throughput: 0: 43080.8. Samples: 13964481420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 08:56:46,286][15401] Updated weights for policy 0, policy_version 852324 (0.0023) [2024-06-25 08:56:48,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43417.6, 300 sec: 42931.9). Total num frames: 13964591104. Throughput: 0: 42852.4. Samples: 13964729340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:48,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 08:56:49,945][15349] Signal inference workers to stop experience collection... (206850 times) [2024-06-25 08:56:50,003][15401] InferenceWorker_p0-w0: stopping experience collection (206850 times) [2024-06-25 08:56:50,009][15349] Signal inference workers to resume experience collection... (206850 times) [2024-06-25 08:56:50,016][15401] InferenceWorker_p0-w0: resuming experience collection (206850 times) [2024-06-25 08:56:50,024][15401] Updated weights for policy 0, policy_version 852334 (0.0040) [2024-06-25 08:56:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 13964787712. Throughput: 0: 43176.9. Samples: 13964871300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 08:56:53,754][15401] Updated weights for policy 0, policy_version 852344 (0.0042) [2024-06-25 08:56:57,411][15401] Updated weights for policy 0, policy_version 852354 (0.0032) [2024-06-25 08:56:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13965000704. Throughput: 0: 43039.2. Samples: 13965132640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:56:58,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 08:57:01,789][15401] Updated weights for policy 0, policy_version 852364 (0.0049) [2024-06-25 08:57:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 13965246464. Throughput: 0: 42857.2. Samples: 13965376860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 08:57:05,067][15401] Updated weights for policy 0, policy_version 852374 (0.0044) [2024-06-25 08:57:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 13965426688. Throughput: 0: 42884.3. Samples: 13965506320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 08:57:09,474][15401] Updated weights for policy 0, policy_version 852384 (0.0024) [2024-06-25 08:57:12,559][15401] Updated weights for policy 0, policy_version 852394 (0.0036) [2024-06-25 08:57:13,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.8, 300 sec: 42931.6). Total num frames: 13965656064. Throughput: 0: 42845.2. Samples: 13965764720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:13,393][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 08:57:16,911][15401] Updated weights for policy 0, policy_version 852404 (0.0044) [2024-06-25 08:57:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 13965869056. Throughput: 0: 43077.1. Samples: 13966023040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:18,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 08:57:20,284][15401] Updated weights for policy 0, policy_version 852414 (0.0054) [2024-06-25 08:57:23,390][15132] Fps is (10 sec: 40969.9, 60 sec: 43149.1, 300 sec: 42876.1). Total num frames: 13966065664. Throughput: 0: 42854.5. Samples: 13966149440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 08:57:24,622][15401] Updated weights for policy 0, policy_version 852424 (0.0031) [2024-06-25 08:57:27,820][15401] Updated weights for policy 0, policy_version 852434 (0.0049) [2024-06-25 08:57:28,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 13966295040. Throughput: 0: 42784.1. Samples: 13966406980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:28,396][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 08:57:32,191][15401] Updated weights for policy 0, policy_version 852444 (0.0046) [2024-06-25 08:57:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42931.7). Total num frames: 13966475264. Throughput: 0: 43038.8. Samples: 13966666080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 08:57:35,273][15401] Updated weights for policy 0, policy_version 852454 (0.0038) [2024-06-25 08:57:38,390][15132] Fps is (10 sec: 40985.8, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 13966704640. Throughput: 0: 42583.5. Samples: 13966787560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:38,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 08:57:40,009][15401] Updated weights for policy 0, policy_version 852464 (0.0043) [2024-06-25 08:57:42,695][15401] Updated weights for policy 0, policy_version 852474 (0.0032) [2024-06-25 08:57:43,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 13966950400. Throughput: 0: 42450.3. Samples: 13967042900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:43,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 08:57:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852475_13966950400.pth... [2024-06-25 08:57:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000851847_13956661248.pth [2024-06-25 08:57:47,470][15401] Updated weights for policy 0, policy_version 852484 (0.0043) [2024-06-25 08:57:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 13967114240. Throughput: 0: 42855.1. Samples: 13967305340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 08:57:50,685][15401] Updated weights for policy 0, policy_version 852494 (0.0037) [2024-06-25 08:57:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13967343616. Throughput: 0: 42596.5. Samples: 13967423160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 08:57:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 08:57:55,567][15401] Updated weights for policy 0, policy_version 852504 (0.0030) [2024-06-25 08:57:58,185][15401] Updated weights for policy 0, policy_version 852514 (0.0039) [2024-06-25 08:57:58,390][15132] Fps is (10 sec: 49152.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 13967605760. Throughput: 0: 42796.9. Samples: 13967690480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:57:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 08:58:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 42765.0). Total num frames: 13967736832. Throughput: 0: 42715.0. Samples: 13967945220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 08:58:03,620][15401] Updated weights for policy 0, policy_version 852524 (0.0036) [2024-06-25 08:58:05,650][15349] Signal inference workers to stop experience collection... (206900 times) [2024-06-25 08:58:05,650][15349] Signal inference workers to resume experience collection... (206900 times) [2024-06-25 08:58:05,693][15401] InferenceWorker_p0-w0: stopping experience collection (206900 times) [2024-06-25 08:58:05,693][15401] InferenceWorker_p0-w0: resuming experience collection (206900 times) [2024-06-25 08:58:05,795][15401] Updated weights for policy 0, policy_version 852534 (0.0027) [2024-06-25 08:58:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 13967998976. Throughput: 0: 42411.9. Samples: 13968057980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 08:58:11,520][15401] Updated weights for policy 0, policy_version 852544 (0.0033) [2024-06-25 08:58:13,396][15132] Fps is (10 sec: 49120.7, 60 sec: 42868.7, 300 sec: 42986.3). Total num frames: 13968228352. Throughput: 0: 42594.2. Samples: 13968323720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:13,396][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 08:58:13,450][15401] Updated weights for policy 0, policy_version 852554 (0.0029) [2024-06-25 08:58:18,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 13968392192. Throughput: 0: 42618.7. Samples: 13968583920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 08:58:19,043][15401] Updated weights for policy 0, policy_version 852564 (0.0043) [2024-06-25 08:58:21,294][15401] Updated weights for policy 0, policy_version 852574 (0.0028) [2024-06-25 08:58:23,390][15132] Fps is (10 sec: 42625.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13968654336. Throughput: 0: 42546.3. Samples: 13968702140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:23,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 08:58:26,512][15401] Updated weights for policy 0, policy_version 852584 (0.0032) [2024-06-25 08:58:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42876.1, 300 sec: 42987.2). Total num frames: 13968867328. Throughput: 0: 42880.5. Samples: 13968972520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 08:58:29,029][15401] Updated weights for policy 0, policy_version 852594 (0.0023) [2024-06-25 08:58:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 13969031168. Throughput: 0: 42743.6. Samples: 13969228800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 08:58:34,028][15401] Updated weights for policy 0, policy_version 852604 (0.0031) [2024-06-25 08:58:36,576][15401] Updated weights for policy 0, policy_version 852614 (0.0040) [2024-06-25 08:58:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13969293312. Throughput: 0: 42692.4. Samples: 13969344320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:38,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-25 08:58:41,750][15401] Updated weights for policy 0, policy_version 852624 (0.0034) [2024-06-25 08:58:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 13969489920. Throughput: 0: 42623.7. Samples: 13969608540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:43,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 08:58:44,164][15401] Updated weights for policy 0, policy_version 852634 (0.0031) [2024-06-25 08:58:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13969670144. Throughput: 0: 42664.4. Samples: 13969865120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:48,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 08:58:49,491][15401] Updated weights for policy 0, policy_version 852644 (0.0038) [2024-06-25 08:58:51,905][15401] Updated weights for policy 0, policy_version 852654 (0.0033) [2024-06-25 08:58:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13969932288. Throughput: 0: 42770.3. Samples: 13969982640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:53,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 08:58:57,121][15401] Updated weights for policy 0, policy_version 852664 (0.0023) [2024-06-25 08:58:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42052.3, 300 sec: 42932.0). Total num frames: 13970128896. Throughput: 0: 42821.7. Samples: 13970250420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:58:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 08:58:59,562][15401] Updated weights for policy 0, policy_version 852674 (0.0039) [2024-06-25 08:59:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13970309120. Throughput: 0: 42672.7. Samples: 13970504200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 08:59:04,810][15401] Updated weights for policy 0, policy_version 852684 (0.0040) [2024-06-25 08:59:07,549][15401] Updated weights for policy 0, policy_version 852694 (0.0039) [2024-06-25 08:59:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13970554880. Throughput: 0: 42683.1. Samples: 13970622880. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:08,394][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 08:59:12,451][15401] Updated weights for policy 0, policy_version 852704 (0.0041) [2024-06-25 08:59:13,194][15349] Signal inference workers to stop experience collection... (206950 times) [2024-06-25 08:59:13,241][15401] InferenceWorker_p0-w0: stopping experience collection (206950 times) [2024-06-25 08:59:13,248][15349] Signal inference workers to resume experience collection... (206950 times) [2024-06-25 08:59:13,255][15401] InferenceWorker_p0-w0: resuming experience collection (206950 times) [2024-06-25 08:59:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 13970767872. Throughput: 0: 42674.6. Samples: 13970892880. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 08:59:15,582][15401] Updated weights for policy 0, policy_version 852714 (0.0040) [2024-06-25 08:59:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13970964480. Throughput: 0: 42437.9. Samples: 13971138500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 08:59:20,018][15401] Updated weights for policy 0, policy_version 852724 (0.0030) [2024-06-25 08:59:23,288][15401] Updated weights for policy 0, policy_version 852734 (0.0038) [2024-06-25 08:59:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13971193856. Throughput: 0: 42694.6. Samples: 13971265580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 08:59:27,592][15401] Updated weights for policy 0, policy_version 852744 (0.0031) [2024-06-25 08:59:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 13971423232. Throughput: 0: 42931.4. Samples: 13971540460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 08:59:30,962][15401] Updated weights for policy 0, policy_version 852754 (0.0030) [2024-06-25 08:59:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 13971619840. Throughput: 0: 42769.0. Samples: 13971789720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:33,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 08:59:35,037][15401] Updated weights for policy 0, policy_version 852764 (0.0024) [2024-06-25 08:59:38,386][15401] Updated weights for policy 0, policy_version 852774 (0.0047) [2024-06-25 08:59:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13971849216. Throughput: 0: 42972.9. Samples: 13971916420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:38,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 08:59:42,490][15401] Updated weights for policy 0, policy_version 852784 (0.0041) [2024-06-25 08:59:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 13972045824. Throughput: 0: 42953.3. Samples: 13972183320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:43,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 08:59:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852786_13972045824.pth... [2024-06-25 08:59:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852160_13961789440.pth [2024-06-25 08:59:46,079][15401] Updated weights for policy 0, policy_version 852794 (0.0035) [2024-06-25 08:59:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 13972275200. Throughput: 0: 42857.0. Samples: 13972432760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-25 08:59:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 08:59:50,105][15401] Updated weights for policy 0, policy_version 852804 (0.0030) [2024-06-25 08:59:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13972488192. Throughput: 0: 43060.9. Samples: 13972560620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 08:59:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 08:59:53,893][15401] Updated weights for policy 0, policy_version 852814 (0.0034) [2024-06-25 08:59:57,565][15401] Updated weights for policy 0, policy_version 852824 (0.0037) [2024-06-25 08:59:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 13972701184. Throughput: 0: 42984.0. Samples: 13972827160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 08:59:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 09:00:01,325][15401] Updated weights for policy 0, policy_version 852834 (0.0039) [2024-06-25 09:00:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 13972930560. Throughput: 0: 43100.7. Samples: 13973078040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 09:00:05,328][15401] Updated weights for policy 0, policy_version 852844 (0.0022) [2024-06-25 09:00:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13973127168. Throughput: 0: 43152.6. Samples: 13973207440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 09:00:09,031][15401] Updated weights for policy 0, policy_version 852854 (0.0044) [2024-06-25 09:00:13,355][15401] Updated weights for policy 0, policy_version 852864 (0.0021) [2024-06-25 09:00:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 13973323776. Throughput: 0: 42747.6. Samples: 13973464100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 09:00:16,866][15401] Updated weights for policy 0, policy_version 852874 (0.0034) [2024-06-25 09:00:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13973553152. Throughput: 0: 42856.4. Samples: 13973718260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:18,395][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 09:00:20,897][15401] Updated weights for policy 0, policy_version 852884 (0.0043) [2024-06-25 09:00:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13973766144. Throughput: 0: 42914.2. Samples: 13973847560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 09:00:24,659][15401] Updated weights for policy 0, policy_version 852894 (0.0039) [2024-06-25 09:00:28,374][15401] Updated weights for policy 0, policy_version 852904 (0.0039) [2024-06-25 09:00:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13973979136. Throughput: 0: 42688.9. Samples: 13974104320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:28,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 09:00:28,911][15349] Signal inference workers to stop experience collection... (207000 times) [2024-06-25 09:00:28,964][15401] InferenceWorker_p0-w0: stopping experience collection (207000 times) [2024-06-25 09:00:28,972][15349] Signal inference workers to resume experience collection... (207000 times) [2024-06-25 09:00:28,982][15401] InferenceWorker_p0-w0: resuming experience collection (207000 times) [2024-06-25 09:00:32,118][15401] Updated weights for policy 0, policy_version 852914 (0.0033) [2024-06-25 09:00:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13974175744. Throughput: 0: 42941.3. Samples: 13974365120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:33,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 09:00:35,875][15401] Updated weights for policy 0, policy_version 852924 (0.0025) [2024-06-25 09:00:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13974421504. Throughput: 0: 42894.7. Samples: 13974490880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 09:00:39,465][15401] Updated weights for policy 0, policy_version 852934 (0.0029) [2024-06-25 09:00:43,274][15401] Updated weights for policy 0, policy_version 852944 (0.0028) [2024-06-25 09:00:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13974634496. Throughput: 0: 42768.9. Samples: 13974751760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 09:00:47,265][15401] Updated weights for policy 0, policy_version 852954 (0.0037) [2024-06-25 09:00:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13974831104. Throughput: 0: 43009.9. Samples: 13975013480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:48,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 09:00:50,784][15401] Updated weights for policy 0, policy_version 852964 (0.0020) [2024-06-25 09:00:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 13975076864. Throughput: 0: 42784.0. Samples: 13975132720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:53,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-25 09:00:55,070][15401] Updated weights for policy 0, policy_version 852974 (0.0046) [2024-06-25 09:00:58,267][15401] Updated weights for policy 0, policy_version 852984 (0.0043) [2024-06-25 09:00:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 13975289856. Throughput: 0: 43005.4. Samples: 13975399340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:00:58,390][15132] Avg episode reward: [(0, '0.158')] [2024-06-25 09:01:02,596][15401] Updated weights for policy 0, policy_version 852994 (0.0045) [2024-06-25 09:01:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 13975486464. Throughput: 0: 43137.5. Samples: 13975659440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:03,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 09:01:05,792][15401] Updated weights for policy 0, policy_version 853004 (0.0037) [2024-06-25 09:01:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 13975715840. Throughput: 0: 42993.0. Samples: 13975782240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:08,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 09:01:10,286][15401] Updated weights for policy 0, policy_version 853014 (0.0042) [2024-06-25 09:01:13,339][15401] Updated weights for policy 0, policy_version 853024 (0.0032) [2024-06-25 09:01:13,390][15132] Fps is (10 sec: 45873.5, 60 sec: 43690.4, 300 sec: 42931.6). Total num frames: 13975945216. Throughput: 0: 43030.8. Samples: 13976040720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 09:01:18,098][15401] Updated weights for policy 0, policy_version 853034 (0.0037) [2024-06-25 09:01:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42877.0). Total num frames: 13976125440. Throughput: 0: 42929.7. Samples: 13976296960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:18,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 09:01:21,026][15401] Updated weights for policy 0, policy_version 853044 (0.0032) [2024-06-25 09:01:23,389][15132] Fps is (10 sec: 39323.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13976338432. Throughput: 0: 42745.9. Samples: 13976414440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:23,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 09:01:26,035][15401] Updated weights for policy 0, policy_version 853054 (0.0032) [2024-06-25 09:01:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13976567808. Throughput: 0: 42788.0. Samples: 13976677220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:28,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 09:01:28,686][15401] Updated weights for policy 0, policy_version 853064 (0.0030) [2024-06-25 09:01:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13976748032. Throughput: 0: 42688.2. Samples: 13976934460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 09:01:33,703][15401] Updated weights for policy 0, policy_version 853074 (0.0025) [2024-06-25 09:01:36,775][15401] Updated weights for policy 0, policy_version 853084 (0.0048) [2024-06-25 09:01:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13976961024. Throughput: 0: 42659.9. Samples: 13977052420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 09:01:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 09:01:41,674][15401] Updated weights for policy 0, policy_version 853094 (0.0046) [2024-06-25 09:01:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13977206784. Throughput: 0: 42545.1. Samples: 13977313880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:01:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:01:43,554][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853102_13977223168.pth... [2024-06-25 09:01:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852475_13966950400.pth [2024-06-25 09:01:44,447][15401] Updated weights for policy 0, policy_version 853104 (0.0034) [2024-06-25 09:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13977403392. Throughput: 0: 42369.7. Samples: 13977566080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:01:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 09:01:49,381][15401] Updated weights for policy 0, policy_version 853114 (0.0037) [2024-06-25 09:01:52,176][15401] Updated weights for policy 0, policy_version 853124 (0.0040) [2024-06-25 09:01:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 13977616384. Throughput: 0: 42352.8. Samples: 13977688120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:01:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 09:01:56,967][15401] Updated weights for policy 0, policy_version 853134 (0.0038) [2024-06-25 09:01:58,297][15349] Signal inference workers to stop experience collection... (207050 times) [2024-06-25 09:01:58,346][15401] InferenceWorker_p0-w0: stopping experience collection (207050 times) [2024-06-25 09:01:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 13977812992. Throughput: 0: 42488.8. Samples: 13977952700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:01:58,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 09:01:58,412][15349] Signal inference workers to resume experience collection... (207050 times) [2024-06-25 09:01:58,412][15401] InferenceWorker_p0-w0: resuming experience collection (207050 times) [2024-06-25 09:02:00,010][15401] Updated weights for policy 0, policy_version 853144 (0.0033) [2024-06-25 09:02:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 13978058752. Throughput: 0: 42340.4. Samples: 13978202280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 09:02:04,642][15401] Updated weights for policy 0, policy_version 853154 (0.0036) [2024-06-25 09:02:07,550][15401] Updated weights for policy 0, policy_version 853164 (0.0030) [2024-06-25 09:02:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 13978255360. Throughput: 0: 42549.3. Samples: 13978329160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:08,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 09:02:12,558][15401] Updated weights for policy 0, policy_version 853174 (0.0034) [2024-06-25 09:02:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.4, 300 sec: 42709.4). Total num frames: 13978468352. Throughput: 0: 42563.8. Samples: 13978592600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:13,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 09:02:15,213][15401] Updated weights for policy 0, policy_version 853184 (0.0031) [2024-06-25 09:02:18,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 13978681344. Throughput: 0: 42424.3. Samples: 13978843640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:18,392][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 09:02:20,135][15401] Updated weights for policy 0, policy_version 853194 (0.0032) [2024-06-25 09:02:22,917][15401] Updated weights for policy 0, policy_version 853204 (0.0041) [2024-06-25 09:02:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 13978894336. Throughput: 0: 42667.0. Samples: 13978972440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 09:02:27,686][15401] Updated weights for policy 0, policy_version 853214 (0.0029) [2024-06-25 09:02:28,389][15132] Fps is (10 sec: 39330.6, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 13979074560. Throughput: 0: 42698.4. Samples: 13979235300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:28,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 09:02:30,347][15401] Updated weights for policy 0, policy_version 853224 (0.0038) [2024-06-25 09:02:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 13979336704. Throughput: 0: 42571.6. Samples: 13979481800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 09:02:35,177][15401] Updated weights for policy 0, policy_version 853234 (0.0044) [2024-06-25 09:02:38,042][15401] Updated weights for policy 0, policy_version 853244 (0.0028) [2024-06-25 09:02:38,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13979549696. Throughput: 0: 42914.2. Samples: 13979619260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:38,390][15132] Avg episode reward: [(0, '0.860')] [2024-06-25 09:02:42,816][15401] Updated weights for policy 0, policy_version 853254 (0.0028) [2024-06-25 09:02:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 13979729920. Throughput: 0: 42846.6. Samples: 13979880800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:43,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 09:02:46,050][15401] Updated weights for policy 0, policy_version 853264 (0.0026) [2024-06-25 09:02:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 13979992064. Throughput: 0: 42656.0. Samples: 13980121800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:48,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 09:02:50,517][15401] Updated weights for policy 0, policy_version 853274 (0.0035) [2024-06-25 09:02:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 13980188672. Throughput: 0: 42916.0. Samples: 13980260380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 09:02:53,761][15401] Updated weights for policy 0, policy_version 853284 (0.0036) [2024-06-25 09:02:58,035][15401] Updated weights for policy 0, policy_version 853294 (0.0034) [2024-06-25 09:02:58,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 13980368896. Throughput: 0: 42802.5. Samples: 13980518700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:02:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 09:03:01,468][15401] Updated weights for policy 0, policy_version 853304 (0.0035) [2024-06-25 09:03:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13980631040. Throughput: 0: 42590.9. Samples: 13980760140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:03,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 09:03:05,834][15401] Updated weights for policy 0, policy_version 853314 (0.0032) [2024-06-25 09:03:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 13980811264. Throughput: 0: 42814.8. Samples: 13980899100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:08,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 09:03:09,247][15401] Updated weights for policy 0, policy_version 853324 (0.0053) [2024-06-25 09:03:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.6, 300 sec: 42820.5). Total num frames: 13981024256. Throughput: 0: 42609.3. Samples: 13981152720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 09:03:13,392][15401] Updated weights for policy 0, policy_version 853334 (0.0030) [2024-06-25 09:03:16,820][15401] Updated weights for policy 0, policy_version 853344 (0.0029) [2024-06-25 09:03:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 13981253632. Throughput: 0: 42769.3. Samples: 13981406420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 09:03:20,875][15401] Updated weights for policy 0, policy_version 853354 (0.0030) [2024-06-25 09:03:22,514][15349] Signal inference workers to stop experience collection... (207100 times) [2024-06-25 09:03:22,550][15401] InferenceWorker_p0-w0: stopping experience collection (207100 times) [2024-06-25 09:03:22,564][15349] Signal inference workers to resume experience collection... (207100 times) [2024-06-25 09:03:22,566][15401] InferenceWorker_p0-w0: resuming experience collection (207100 times) [2024-06-25 09:03:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 13981450240. Throughput: 0: 42701.2. Samples: 13981540820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 09:03:24,683][15401] Updated weights for policy 0, policy_version 853364 (0.0036) [2024-06-25 09:03:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 13981663232. Throughput: 0: 42534.6. Samples: 13981794860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 09:03:28,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:03:28,623][15401] Updated weights for policy 0, policy_version 853374 (0.0038) [2024-06-25 09:03:32,208][15401] Updated weights for policy 0, policy_version 853384 (0.0037) [2024-06-25 09:03:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13981892608. Throughput: 0: 42797.9. Samples: 13982047700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 09:03:36,430][15401] Updated weights for policy 0, policy_version 853394 (0.0033) [2024-06-25 09:03:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 13982089216. Throughput: 0: 42615.1. Samples: 13982178060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 09:03:39,754][15401] Updated weights for policy 0, policy_version 853404 (0.0031) [2024-06-25 09:03:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 13982285824. Throughput: 0: 42487.5. Samples: 13982430640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 09:03:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853412_13982302208.pth... [2024-06-25 09:03:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000852786_13972045824.pth [2024-06-25 09:03:43,877][15401] Updated weights for policy 0, policy_version 853414 (0.0032) [2024-06-25 09:03:47,940][15401] Updated weights for policy 0, policy_version 853424 (0.0039) [2024-06-25 09:03:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 13982531584. Throughput: 0: 42948.5. Samples: 13982692820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 09:03:51,657][15401] Updated weights for policy 0, policy_version 853434 (0.0044) [2024-06-25 09:03:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 13982711808. Throughput: 0: 42715.4. Samples: 13982821300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:53,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 09:03:55,586][15401] Updated weights for policy 0, policy_version 853444 (0.0027) [2024-06-25 09:03:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13982941184. Throughput: 0: 42639.5. Samples: 13983071500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:03:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 09:03:59,064][15401] Updated weights for policy 0, policy_version 853454 (0.0035) [2024-06-25 09:04:03,162][15401] Updated weights for policy 0, policy_version 853464 (0.0032) [2024-06-25 09:04:03,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 13983170560. Throughput: 0: 42743.5. Samples: 13983329980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:03,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 09:04:07,223][15401] Updated weights for policy 0, policy_version 853474 (0.0031) [2024-06-25 09:04:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13983367168. Throughput: 0: 42578.0. Samples: 13983456820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 09:04:10,759][15401] Updated weights for policy 0, policy_version 853484 (0.0044) [2024-06-25 09:04:13,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 13983596544. Throughput: 0: 42558.6. Samples: 13983710000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:13,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 09:04:14,864][15401] Updated weights for policy 0, policy_version 853494 (0.0032) [2024-06-25 09:04:18,291][15401] Updated weights for policy 0, policy_version 853504 (0.0033) [2024-06-25 09:04:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 13983809536. Throughput: 0: 42688.0. Samples: 13983968660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 09:04:22,312][15401] Updated weights for policy 0, policy_version 853514 (0.0037) [2024-06-25 09:04:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13984006144. Throughput: 0: 42628.8. Samples: 13984096360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 09:04:25,998][15401] Updated weights for policy 0, policy_version 853524 (0.0038) [2024-06-25 09:04:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 13984235520. Throughput: 0: 42631.1. Samples: 13984349040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 09:04:30,246][15401] Updated weights for policy 0, policy_version 853534 (0.0037) [2024-06-25 09:04:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13984448512. Throughput: 0: 42494.1. Samples: 13984605060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 09:04:33,788][15401] Updated weights for policy 0, policy_version 853544 (0.0045) [2024-06-25 09:04:37,949][15401] Updated weights for policy 0, policy_version 853554 (0.0046) [2024-06-25 09:04:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13984628736. Throughput: 0: 42361.5. Samples: 13984727560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:38,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 09:04:41,557][15401] Updated weights for policy 0, policy_version 853564 (0.0045) [2024-06-25 09:04:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13984874496. Throughput: 0: 42574.6. Samples: 13984987360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 09:04:45,596][15401] Updated weights for policy 0, policy_version 853574 (0.0032) [2024-06-25 09:04:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13985087488. Throughput: 0: 42676.1. Samples: 13985250300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 09:04:49,178][15401] Updated weights for policy 0, policy_version 853584 (0.0035) [2024-06-25 09:04:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 13985267712. Throughput: 0: 42552.3. Samples: 13985371680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 09:04:53,538][15401] Updated weights for policy 0, policy_version 853594 (0.0042) [2024-06-25 09:04:54,908][15349] Signal inference workers to stop experience collection... (207150 times) [2024-06-25 09:04:54,946][15401] InferenceWorker_p0-w0: stopping experience collection (207150 times) [2024-06-25 09:04:54,957][15349] Signal inference workers to resume experience collection... (207150 times) [2024-06-25 09:04:54,964][15401] InferenceWorker_p0-w0: resuming experience collection (207150 times) [2024-06-25 09:04:57,014][15401] Updated weights for policy 0, policy_version 853604 (0.0028) [2024-06-25 09:04:58,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 13985497088. Throughput: 0: 42626.8. Samples: 13985628300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:04:58,392][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 09:05:01,061][15401] Updated weights for policy 0, policy_version 853614 (0.0027) [2024-06-25 09:05:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 13985710080. Throughput: 0: 42600.4. Samples: 13985885680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:05:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 09:05:04,981][15401] Updated weights for policy 0, policy_version 853624 (0.0037) [2024-06-25 09:05:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13985923072. Throughput: 0: 42600.5. Samples: 13986013380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:05:08,396][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 09:05:08,535][15401] Updated weights for policy 0, policy_version 853634 (0.0031) [2024-06-25 09:05:12,531][15401] Updated weights for policy 0, policy_version 853644 (0.0040) [2024-06-25 09:05:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13986136064. Throughput: 0: 42792.9. Samples: 13986274720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:05:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 09:05:15,987][15401] Updated weights for policy 0, policy_version 853654 (0.0037) [2024-06-25 09:05:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 13986349056. Throughput: 0: 42652.5. Samples: 13986524420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 09:05:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 09:05:20,196][15401] Updated weights for policy 0, policy_version 853664 (0.0028) [2024-06-25 09:05:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 13986562048. Throughput: 0: 42765.8. Samples: 13986652020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 09:05:23,571][15401] Updated weights for policy 0, policy_version 853674 (0.0043) [2024-06-25 09:05:27,906][15401] Updated weights for policy 0, policy_version 853684 (0.0030) [2024-06-25 09:05:28,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 13986775040. Throughput: 0: 42693.6. Samples: 13986908580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 09:05:31,742][15401] Updated weights for policy 0, policy_version 853694 (0.0029) [2024-06-25 09:05:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13986988032. Throughput: 0: 42448.9. Samples: 13987160500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 09:05:35,658][15401] Updated weights for policy 0, policy_version 853704 (0.0041) [2024-06-25 09:05:38,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13987201024. Throughput: 0: 42668.9. Samples: 13987291780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 09:05:39,234][15401] Updated weights for policy 0, policy_version 853714 (0.0033) [2024-06-25 09:05:43,031][15401] Updated weights for policy 0, policy_version 853724 (0.0042) [2024-06-25 09:05:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 13987414016. Throughput: 0: 42684.8. Samples: 13987549020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:43,394][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 09:05:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853725_13987430400.pth... [2024-06-25 09:05:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853102_13977223168.pth [2024-06-25 09:05:46,764][15401] Updated weights for policy 0, policy_version 853734 (0.0037) [2024-06-25 09:05:48,394][15132] Fps is (10 sec: 44218.5, 60 sec: 42595.4, 300 sec: 42597.8). Total num frames: 13987643392. Throughput: 0: 42553.0. Samples: 13987800740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:48,394][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 09:05:50,519][15401] Updated weights for policy 0, policy_version 853744 (0.0039) [2024-06-25 09:05:53,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 13987856384. Throughput: 0: 42691.1. Samples: 13987934480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:53,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 09:05:54,269][15401] Updated weights for policy 0, policy_version 853754 (0.0037) [2024-06-25 09:05:58,390][15132] Fps is (10 sec: 40976.8, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 13988052992. Throughput: 0: 42524.8. Samples: 13988188340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:05:58,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 09:05:58,915][15401] Updated weights for policy 0, policy_version 853764 (0.0034) [2024-06-25 09:06:01,846][15401] Updated weights for policy 0, policy_version 853774 (0.0040) [2024-06-25 09:06:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 13988282368. Throughput: 0: 42575.0. Samples: 13988440300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 09:06:06,399][15401] Updated weights for policy 0, policy_version 853784 (0.0037) [2024-06-25 09:06:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 13988495360. Throughput: 0: 42756.4. Samples: 13988576060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 09:06:09,628][15401] Updated weights for policy 0, policy_version 853794 (0.0025) [2024-06-25 09:06:12,888][15349] Signal inference workers to stop experience collection... (207200 times) [2024-06-25 09:06:12,896][15349] Signal inference workers to resume experience collection... (207200 times) [2024-06-25 09:06:12,916][15401] InferenceWorker_p0-w0: stopping experience collection (207200 times) [2024-06-25 09:06:12,944][15401] InferenceWorker_p0-w0: resuming experience collection (207200 times) [2024-06-25 09:06:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 13988708352. Throughput: 0: 42674.2. Samples: 13988828920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:13,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 09:06:13,902][15401] Updated weights for policy 0, policy_version 853804 (0.0041) [2024-06-25 09:06:17,388][15401] Updated weights for policy 0, policy_version 853814 (0.0039) [2024-06-25 09:06:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13988937728. Throughput: 0: 42687.9. Samples: 13989081460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:18,392][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 09:06:21,453][15401] Updated weights for policy 0, policy_version 853824 (0.0032) [2024-06-25 09:06:23,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 13989117952. Throughput: 0: 42686.2. Samples: 13989212660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 09:06:25,385][15401] Updated weights for policy 0, policy_version 853834 (0.0044) [2024-06-25 09:06:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 13989363712. Throughput: 0: 42539.2. Samples: 13989463280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 09:06:28,995][15401] Updated weights for policy 0, policy_version 853844 (0.0024) [2024-06-25 09:06:33,022][15401] Updated weights for policy 0, policy_version 853854 (0.0035) [2024-06-25 09:06:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13989560320. Throughput: 0: 42660.9. Samples: 13989720300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 09:06:37,010][15401] Updated weights for policy 0, policy_version 853864 (0.0034) [2024-06-25 09:06:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 13989756928. Throughput: 0: 42521.9. Samples: 13989847960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 09:06:40,622][15401] Updated weights for policy 0, policy_version 853874 (0.0037) [2024-06-25 09:06:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 13990002688. Throughput: 0: 42663.5. Samples: 13990108200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 09:06:44,505][15401] Updated weights for policy 0, policy_version 853884 (0.0036) [2024-06-25 09:06:48,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42326.6, 300 sec: 42598.1). Total num frames: 13990182912. Throughput: 0: 42773.3. Samples: 13990365200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:48,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 09:06:48,403][15401] Updated weights for policy 0, policy_version 853894 (0.0032) [2024-06-25 09:06:52,310][15401] Updated weights for policy 0, policy_version 853904 (0.0035) [2024-06-25 09:06:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 13990395904. Throughput: 0: 42525.8. Samples: 13990489720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:53,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 09:06:56,072][15401] Updated weights for policy 0, policy_version 853914 (0.0030) [2024-06-25 09:06:58,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13990625280. Throughput: 0: 42739.7. Samples: 13990752200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:06:58,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 09:06:59,715][15401] Updated weights for policy 0, policy_version 853924 (0.0036) [2024-06-25 09:07:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 13990838272. Throughput: 0: 42681.8. Samples: 13991002140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:07:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 09:07:03,683][15401] Updated weights for policy 0, policy_version 853934 (0.0023) [2024-06-25 09:07:07,173][15401] Updated weights for policy 0, policy_version 853944 (0.0036) [2024-06-25 09:07:08,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 13991051264. Throughput: 0: 42631.7. Samples: 13991131360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:07:08,397][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 09:07:11,321][15401] Updated weights for policy 0, policy_version 853954 (0.0040) [2024-06-25 09:07:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 13991280640. Throughput: 0: 42897.3. Samples: 13991393660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 09:07:15,117][15401] Updated weights for policy 0, policy_version 853964 (0.0039) [2024-06-25 09:07:18,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 13991493632. Throughput: 0: 42780.8. Samples: 13991645440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:18,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 09:07:18,890][15401] Updated weights for policy 0, policy_version 853974 (0.0036) [2024-06-25 09:07:22,660][15401] Updated weights for policy 0, policy_version 853984 (0.0034) [2024-06-25 09:07:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13991690240. Throughput: 0: 42837.6. Samples: 13991775660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 09:07:26,460][15401] Updated weights for policy 0, policy_version 853994 (0.0039) [2024-06-25 09:07:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 13991903232. Throughput: 0: 42633.5. Samples: 13992026700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:28,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 09:07:30,619][15401] Updated weights for policy 0, policy_version 854004 (0.0031) [2024-06-25 09:07:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13992116224. Throughput: 0: 42641.4. Samples: 13992283960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:33,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 09:07:34,455][15401] Updated weights for policy 0, policy_version 854014 (0.0027) [2024-06-25 09:07:36,179][15349] Signal inference workers to stop experience collection... (207250 times) [2024-06-25 09:07:36,179][15349] Signal inference workers to resume experience collection... (207250 times) [2024-06-25 09:07:36,221][15401] InferenceWorker_p0-w0: stopping experience collection (207250 times) [2024-06-25 09:07:36,221][15401] InferenceWorker_p0-w0: resuming experience collection (207250 times) [2024-06-25 09:07:38,172][15401] Updated weights for policy 0, policy_version 854024 (0.0036) [2024-06-25 09:07:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 13992329216. Throughput: 0: 42680.0. Samples: 13992410320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 09:07:41,984][15401] Updated weights for policy 0, policy_version 854034 (0.0027) [2024-06-25 09:07:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13992558592. Throughput: 0: 42628.8. Samples: 13992670500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 09:07:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854038_13992558592.pth... [2024-06-25 09:07:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853412_13982302208.pth [2024-06-25 09:07:45,773][15401] Updated weights for policy 0, policy_version 854044 (0.0037) [2024-06-25 09:07:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42871.5, 300 sec: 42598.0). Total num frames: 13992755200. Throughput: 0: 42732.9. Samples: 13992925220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:48,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 09:07:49,592][15401] Updated weights for policy 0, policy_version 854054 (0.0044) [2024-06-25 09:07:53,392][15132] Fps is (10 sec: 40949.4, 60 sec: 42869.6, 300 sec: 42709.1). Total num frames: 13992968192. Throughput: 0: 42729.0. Samples: 13993054000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:53,401][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 09:07:53,631][15401] Updated weights for policy 0, policy_version 854064 (0.0041) [2024-06-25 09:07:57,214][15401] Updated weights for policy 0, policy_version 854074 (0.0033) [2024-06-25 09:07:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 13993197568. Throughput: 0: 42640.5. Samples: 13993312480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:07:58,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 09:08:01,459][15401] Updated weights for policy 0, policy_version 854084 (0.0037) [2024-06-25 09:08:03,389][15132] Fps is (10 sec: 42609.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13993394176. Throughput: 0: 42724.1. Samples: 13993568020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 09:08:05,072][15401] Updated weights for policy 0, policy_version 854094 (0.0033) [2024-06-25 09:08:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 13993623552. Throughput: 0: 42484.1. Samples: 13993687440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 09:08:09,248][15401] Updated weights for policy 0, policy_version 854104 (0.0028) [2024-06-25 09:08:12,962][15401] Updated weights for policy 0, policy_version 854114 (0.0036) [2024-06-25 09:08:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13993836544. Throughput: 0: 42755.9. Samples: 13993950720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 09:08:17,077][15401] Updated weights for policy 0, policy_version 854124 (0.0043) [2024-06-25 09:08:18,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 13994016768. Throughput: 0: 42741.8. Samples: 13994207340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 09:08:20,595][15401] Updated weights for policy 0, policy_version 854134 (0.0037) [2024-06-25 09:08:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13994246144. Throughput: 0: 42554.6. Samples: 13994325280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 09:08:25,011][15401] Updated weights for policy 0, policy_version 854144 (0.0031) [2024-06-25 09:08:28,114][15401] Updated weights for policy 0, policy_version 854154 (0.0031) [2024-06-25 09:08:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 13994459136. Throughput: 0: 42604.6. Samples: 13994587700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:28,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 09:08:32,719][15401] Updated weights for policy 0, policy_version 854164 (0.0039) [2024-06-25 09:08:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13994655744. Throughput: 0: 42563.9. Samples: 13994840500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:33,391][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 09:08:36,062][15401] Updated weights for policy 0, policy_version 854174 (0.0039) [2024-06-25 09:08:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 13994901504. Throughput: 0: 42534.5. Samples: 13994967940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 09:08:40,200][15401] Updated weights for policy 0, policy_version 854184 (0.0033) [2024-06-25 09:08:43,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 13995098112. Throughput: 0: 42476.0. Samples: 13995224000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:43,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 09:08:43,641][15401] Updated weights for policy 0, policy_version 854194 (0.0029) [2024-06-25 09:08:47,946][15401] Updated weights for policy 0, policy_version 854204 (0.0038) [2024-06-25 09:08:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 13995294720. Throughput: 0: 42633.2. Samples: 13995486620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:48,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 09:08:50,170][15349] Signal inference workers to stop experience collection... (207300 times) [2024-06-25 09:08:50,171][15349] Signal inference workers to resume experience collection... (207300 times) [2024-06-25 09:08:50,205][15401] InferenceWorker_p0-w0: stopping experience collection (207300 times) [2024-06-25 09:08:50,205][15401] InferenceWorker_p0-w0: resuming experience collection (207300 times) [2024-06-25 09:08:51,331][15401] Updated weights for policy 0, policy_version 854214 (0.0042) [2024-06-25 09:08:53,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 13995540480. Throughput: 0: 42696.9. Samples: 13995608800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 09:08:55,418][15401] Updated weights for policy 0, policy_version 854224 (0.0044) [2024-06-25 09:08:58,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 13995720704. Throughput: 0: 42639.9. Samples: 13995869520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-25 09:08:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 09:08:58,996][15401] Updated weights for policy 0, policy_version 854234 (0.0038) [2024-06-25 09:09:02,991][15401] Updated weights for policy 0, policy_version 854244 (0.0027) [2024-06-25 09:09:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13995950080. Throughput: 0: 42652.9. Samples: 13996126720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:03,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 09:09:06,565][15401] Updated weights for policy 0, policy_version 854254 (0.0037) [2024-06-25 09:09:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 13996195840. Throughput: 0: 42926.4. Samples: 13996256960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 09:09:10,375][15401] Updated weights for policy 0, policy_version 854264 (0.0029) [2024-06-25 09:09:13,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 13996376064. Throughput: 0: 42940.6. Samples: 13996520040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 09:09:14,073][15401] Updated weights for policy 0, policy_version 854274 (0.0044) [2024-06-25 09:09:17,804][15401] Updated weights for policy 0, policy_version 854284 (0.0032) [2024-06-25 09:09:18,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 13996605440. Throughput: 0: 42914.4. Samples: 13996771740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:18,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 09:09:21,646][15401] Updated weights for policy 0, policy_version 854294 (0.0028) [2024-06-25 09:09:23,390][15132] Fps is (10 sec: 47514.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 13996851200. Throughput: 0: 43083.6. Samples: 13996906700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:23,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:09:25,285][15401] Updated weights for policy 0, policy_version 854304 (0.0024) [2024-06-25 09:09:28,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 13997015040. Throughput: 0: 43168.9. Samples: 13997166500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 09:09:29,444][15401] Updated weights for policy 0, policy_version 854314 (0.0041) [2024-06-25 09:09:32,827][15401] Updated weights for policy 0, policy_version 854324 (0.0032) [2024-06-25 09:09:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 13997260800. Throughput: 0: 42760.0. Samples: 13997410720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 09:09:36,911][15401] Updated weights for policy 0, policy_version 854334 (0.0024) [2024-06-25 09:09:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 13997490176. Throughput: 0: 43186.7. Samples: 13997552200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 09:09:40,649][15401] Updated weights for policy 0, policy_version 854344 (0.0037) [2024-06-25 09:09:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 13997654016. Throughput: 0: 43133.9. Samples: 13997810540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 09:09:43,480][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854350_13997670400.pth... [2024-06-25 09:09:43,550][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000853725_13987430400.pth [2024-06-25 09:09:44,543][15401] Updated weights for policy 0, policy_version 854354 (0.0029) [2024-06-25 09:09:48,352][15401] Updated weights for policy 0, policy_version 854364 (0.0043) [2024-06-25 09:09:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43419.4, 300 sec: 42820.6). Total num frames: 13997899776. Throughput: 0: 43006.2. Samples: 13998062000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 09:09:52,094][15401] Updated weights for policy 0, policy_version 854374 (0.0023) [2024-06-25 09:09:53,390][15132] Fps is (10 sec: 49151.2, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 13998145536. Throughput: 0: 43106.5. Samples: 13998196760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 09:09:55,886][15401] Updated weights for policy 0, policy_version 854384 (0.0027) [2024-06-25 09:09:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 13998292992. Throughput: 0: 42733.4. Samples: 13998443040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:09:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 09:09:59,818][15401] Updated weights for policy 0, policy_version 854394 (0.0037) [2024-06-25 09:10:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 13998538752. Throughput: 0: 42836.4. Samples: 13998699280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 09:10:03,825][15401] Updated weights for policy 0, policy_version 854404 (0.0027) [2024-06-25 09:10:07,418][15401] Updated weights for policy 0, policy_version 854414 (0.0031) [2024-06-25 09:10:08,311][15349] Signal inference workers to stop experience collection... (207350 times) [2024-06-25 09:10:08,312][15349] Signal inference workers to resume experience collection... (207350 times) [2024-06-25 09:10:08,329][15401] InferenceWorker_p0-w0: stopping experience collection (207350 times) [2024-06-25 09:10:08,329][15401] InferenceWorker_p0-w0: resuming experience collection (207350 times) [2024-06-25 09:10:08,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 13998768128. Throughput: 0: 42888.4. Samples: 13998836680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 09:10:11,343][15401] Updated weights for policy 0, policy_version 854424 (0.0025) [2024-06-25 09:10:13,396][15132] Fps is (10 sec: 40934.0, 60 sec: 42867.0, 300 sec: 42708.5). Total num frames: 13998948352. Throughput: 0: 42748.6. Samples: 13999090460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:13,397][15132] Avg episode reward: [(0, '0.915')] [2024-06-25 09:10:14,995][15401] Updated weights for policy 0, policy_version 854434 (0.0030) [2024-06-25 09:10:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 13999194112. Throughput: 0: 43008.2. Samples: 13999346080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:18,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 09:10:18,881][15401] Updated weights for policy 0, policy_version 854444 (0.0038) [2024-06-25 09:10:22,470][15401] Updated weights for policy 0, policy_version 854454 (0.0040) [2024-06-25 09:10:23,395][15132] Fps is (10 sec: 45878.9, 60 sec: 42594.4, 300 sec: 42819.8). Total num frames: 13999407104. Throughput: 0: 42860.4. Samples: 13999481160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:23,396][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 09:10:26,390][15401] Updated weights for policy 0, policy_version 854464 (0.0038) [2024-06-25 09:10:28,390][15132] Fps is (10 sec: 37682.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 13999570944. Throughput: 0: 42754.5. Samples: 13999734500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:28,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 09:10:30,073][15401] Updated weights for policy 0, policy_version 854474 (0.0033) [2024-06-25 09:10:33,389][15132] Fps is (10 sec: 42622.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 13999833088. Throughput: 0: 42865.4. Samples: 13999990940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 09:10:33,931][15401] Updated weights for policy 0, policy_version 854484 (0.0032) [2024-06-25 09:10:37,750][15401] Updated weights for policy 0, policy_version 854494 (0.0035) [2024-06-25 09:10:38,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14000062464. Throughput: 0: 43067.1. Samples: 14000134780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:38,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 09:10:41,478][15401] Updated weights for policy 0, policy_version 854504 (0.0032) [2024-06-25 09:10:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42654.5). Total num frames: 14000226304. Throughput: 0: 43021.4. Samples: 14000379000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 09:10:45,413][15401] Updated weights for policy 0, policy_version 854514 (0.0033) [2024-06-25 09:10:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14000504832. Throughput: 0: 43042.7. Samples: 14000636200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:48,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 09:10:49,086][15401] Updated weights for policy 0, policy_version 854524 (0.0037) [2024-06-25 09:10:52,945][15401] Updated weights for policy 0, policy_version 854534 (0.0025) [2024-06-25 09:10:53,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14000701440. Throughput: 0: 43159.4. Samples: 14000778860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 09:10:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 09:10:56,542][15401] Updated weights for policy 0, policy_version 854544 (0.0040) [2024-06-25 09:10:58,389][15132] Fps is (10 sec: 37683.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 14000881664. Throughput: 0: 43070.7. Samples: 14001028360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:10:58,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 09:11:00,865][15401] Updated weights for policy 0, policy_version 854554 (0.0024) [2024-06-25 09:11:03,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 14001143808. Throughput: 0: 43085.1. Samples: 14001285020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:03,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 09:11:03,965][15401] Updated weights for policy 0, policy_version 854564 (0.0035) [2024-06-25 09:11:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 14001307648. Throughput: 0: 43079.5. Samples: 14001419600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:08,403][15132] Avg episode reward: [(0, '0.871')] [2024-06-25 09:11:08,811][15401] Updated weights for policy 0, policy_version 854574 (0.0031) [2024-06-25 09:11:11,611][15401] Updated weights for policy 0, policy_version 854584 (0.0033) [2024-06-25 09:11:13,390][15132] Fps is (10 sec: 39331.1, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 14001537024. Throughput: 0: 42954.7. Samples: 14001667460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 09:11:16,142][15349] Signal inference workers to stop experience collection... (207400 times) [2024-06-25 09:11:16,147][15349] Signal inference workers to resume experience collection... (207400 times) [2024-06-25 09:11:16,175][15401] InferenceWorker_p0-w0: stopping experience collection (207400 times) [2024-06-25 09:11:16,175][15401] InferenceWorker_p0-w0: resuming experience collection (207400 times) [2024-06-25 09:11:16,464][15401] Updated weights for policy 0, policy_version 854594 (0.0038) [2024-06-25 09:11:18,390][15132] Fps is (10 sec: 47525.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 14001782784. Throughput: 0: 42943.1. Samples: 14001923380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 09:11:19,234][15401] Updated weights for policy 0, policy_version 854604 (0.0038) [2024-06-25 09:11:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42329.1, 300 sec: 42653.9). Total num frames: 14001946624. Throughput: 0: 42787.0. Samples: 14002060200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 09:11:24,170][15401] Updated weights for policy 0, policy_version 854614 (0.0034) [2024-06-25 09:11:27,710][15401] Updated weights for policy 0, policy_version 854624 (0.0048) [2024-06-25 09:11:28,394][15132] Fps is (10 sec: 39303.2, 60 sec: 43414.3, 300 sec: 42764.3). Total num frames: 14002176000. Throughput: 0: 42823.6. Samples: 14002306260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:28,395][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 09:11:31,903][15401] Updated weights for policy 0, policy_version 854634 (0.0029) [2024-06-25 09:11:33,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14002421760. Throughput: 0: 42673.8. Samples: 14002556520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 09:11:35,259][15401] Updated weights for policy 0, policy_version 854644 (0.0035) [2024-06-25 09:11:38,390][15132] Fps is (10 sec: 40979.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14002585600. Throughput: 0: 42508.1. Samples: 14002691720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 09:11:39,831][15401] Updated weights for policy 0, policy_version 854654 (0.0029) [2024-06-25 09:11:42,908][15401] Updated weights for policy 0, policy_version 854664 (0.0037) [2024-06-25 09:11:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 14002831360. Throughput: 0: 42443.8. Samples: 14002938340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 09:11:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854665_14002831360.pth... [2024-06-25 09:11:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854038_13992558592.pth [2024-06-25 09:11:47,810][15401] Updated weights for policy 0, policy_version 854674 (0.0035) [2024-06-25 09:11:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 14003027968. Throughput: 0: 42564.0. Samples: 14003200300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 09:11:50,459][15401] Updated weights for policy 0, policy_version 854684 (0.0039) [2024-06-25 09:11:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 14003224576. Throughput: 0: 42267.1. Samples: 14003321520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 09:11:55,349][15401] Updated weights for policy 0, policy_version 854694 (0.0034) [2024-06-25 09:11:58,036][15401] Updated weights for policy 0, policy_version 854704 (0.0028) [2024-06-25 09:11:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14003470336. Throughput: 0: 42463.0. Samples: 14003578300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:11:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 09:12:02,958][15401] Updated weights for policy 0, policy_version 854714 (0.0038) [2024-06-25 09:12:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42052.3, 300 sec: 42765.6). Total num frames: 14003666944. Throughput: 0: 42596.4. Samples: 14003840320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:03,393][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 09:12:05,523][15401] Updated weights for policy 0, policy_version 854724 (0.0037) [2024-06-25 09:12:07,599][15349] Signal inference workers to stop experience collection... (207450 times) [2024-06-25 09:12:07,623][15401] InferenceWorker_p0-w0: stopping experience collection (207450 times) [2024-06-25 09:12:07,661][15349] Signal inference workers to resume experience collection... (207450 times) [2024-06-25 09:12:07,661][15401] InferenceWorker_p0-w0: resuming experience collection (207450 times) [2024-06-25 09:12:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 14003863552. Throughput: 0: 42166.4. Samples: 14003957680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:08,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 09:12:10,632][15401] Updated weights for policy 0, policy_version 854734 (0.0030) [2024-06-25 09:12:13,048][15401] Updated weights for policy 0, policy_version 854744 (0.0024) [2024-06-25 09:12:13,390][15132] Fps is (10 sec: 45886.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14004125696. Throughput: 0: 42372.4. Samples: 14004212820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 09:12:18,251][15401] Updated weights for policy 0, policy_version 854754 (0.0029) [2024-06-25 09:12:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 14004289536. Throughput: 0: 42615.9. Samples: 14004474240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:18,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 09:12:21,275][15401] Updated weights for policy 0, policy_version 854764 (0.0030) [2024-06-25 09:12:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14004518912. Throughput: 0: 42296.0. Samples: 14004595040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:23,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 09:12:25,709][15401] Updated weights for policy 0, policy_version 854774 (0.0027) [2024-06-25 09:12:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42874.7, 300 sec: 42820.5). Total num frames: 14004748288. Throughput: 0: 42526.2. Samples: 14004852020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:28,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 09:12:29,238][15401] Updated weights for policy 0, policy_version 854784 (0.0031) [2024-06-25 09:12:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 41777.5, 300 sec: 42709.1). Total num frames: 14004928512. Throughput: 0: 42680.0. Samples: 14005121000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:33,401][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 09:12:33,622][15401] Updated weights for policy 0, policy_version 854794 (0.0038) [2024-06-25 09:12:36,695][15401] Updated weights for policy 0, policy_version 854804 (0.0028) [2024-06-25 09:12:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14005157888. Throughput: 0: 42627.6. Samples: 14005239760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 09:12:41,052][15401] Updated weights for policy 0, policy_version 854814 (0.0025) [2024-06-25 09:12:43,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14005387264. Throughput: 0: 42666.7. Samples: 14005498300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 09:12:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 09:12:44,448][15401] Updated weights for policy 0, policy_version 854824 (0.0039) [2024-06-25 09:12:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 14005583872. Throughput: 0: 42658.2. Samples: 14005759840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:12:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 09:12:48,839][15401] Updated weights for policy 0, policy_version 854834 (0.0038) [2024-06-25 09:12:52,092][15401] Updated weights for policy 0, policy_version 854844 (0.0035) [2024-06-25 09:12:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14005813248. Throughput: 0: 42681.6. Samples: 14005878360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:12:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 09:12:56,284][15401] Updated weights for policy 0, policy_version 854854 (0.0029) [2024-06-25 09:12:58,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14006042624. Throughput: 0: 42864.0. Samples: 14006141700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:12:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 09:12:59,718][15401] Updated weights for policy 0, policy_version 854864 (0.0030) [2024-06-25 09:13:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 14006206464. Throughput: 0: 42911.6. Samples: 14006405260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 09:13:03,858][15401] Updated weights for policy 0, policy_version 854874 (0.0030) [2024-06-25 09:13:07,179][15401] Updated weights for policy 0, policy_version 854884 (0.0037) [2024-06-25 09:13:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14006452224. Throughput: 0: 42843.4. Samples: 14006523000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:08,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-25 09:13:11,546][15401] Updated weights for policy 0, policy_version 854894 (0.0034) [2024-06-25 09:13:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14006681600. Throughput: 0: 42895.2. Samples: 14006782300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 09:13:14,864][15401] Updated weights for policy 0, policy_version 854904 (0.0049) [2024-06-25 09:13:18,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14006845440. Throughput: 0: 42791.6. Samples: 14007046520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:18,392][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 09:13:19,367][15349] Signal inference workers to stop experience collection... (207500 times) [2024-06-25 09:13:19,397][15401] InferenceWorker_p0-w0: stopping experience collection (207500 times) [2024-06-25 09:13:19,428][15349] Signal inference workers to resume experience collection... (207500 times) [2024-06-25 09:13:19,429][15401] InferenceWorker_p0-w0: resuming experience collection (207500 times) [2024-06-25 09:13:19,432][15401] Updated weights for policy 0, policy_version 854914 (0.0029) [2024-06-25 09:13:22,403][15401] Updated weights for policy 0, policy_version 854924 (0.0036) [2024-06-25 09:13:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14007091200. Throughput: 0: 42825.7. Samples: 14007166920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:23,394][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 09:13:26,966][15401] Updated weights for policy 0, policy_version 854934 (0.0049) [2024-06-25 09:13:28,390][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14007336960. Throughput: 0: 43008.0. Samples: 14007433660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 09:13:30,005][15401] Updated weights for policy 0, policy_version 854944 (0.0032) [2024-06-25 09:13:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14007500800. Throughput: 0: 43015.7. Samples: 14007695540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:33,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:13:34,430][15401] Updated weights for policy 0, policy_version 854954 (0.0037) [2024-06-25 09:13:37,579][15401] Updated weights for policy 0, policy_version 854964 (0.0027) [2024-06-25 09:13:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 14007746560. Throughput: 0: 42993.5. Samples: 14007813060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:38,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 09:13:41,875][15401] Updated weights for policy 0, policy_version 854974 (0.0039) [2024-06-25 09:13:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 14007959552. Throughput: 0: 42969.3. Samples: 14008075320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:43,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 09:13:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854978_14007959552.pth... [2024-06-25 09:13:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854350_13997670400.pth [2024-06-25 09:13:45,353][15401] Updated weights for policy 0, policy_version 854984 (0.0023) [2024-06-25 09:13:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14008172544. Throughput: 0: 42918.2. Samples: 14008336580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:48,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 09:13:49,568][15401] Updated weights for policy 0, policy_version 854994 (0.0029) [2024-06-25 09:13:53,116][15401] Updated weights for policy 0, policy_version 855004 (0.0032) [2024-06-25 09:13:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 14008385536. Throughput: 0: 42977.5. Samples: 14008456980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:53,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 09:13:57,154][15401] Updated weights for policy 0, policy_version 855014 (0.0022) [2024-06-25 09:13:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14008598528. Throughput: 0: 43058.3. Samples: 14008719920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:13:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 09:14:00,572][15401] Updated weights for policy 0, policy_version 855024 (0.0042) [2024-06-25 09:14:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14008795136. Throughput: 0: 42902.6. Samples: 14008977140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 09:14:04,860][15401] Updated weights for policy 0, policy_version 855034 (0.0032) [2024-06-25 09:14:08,070][15401] Updated weights for policy 0, policy_version 855044 (0.0039) [2024-06-25 09:14:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 14009040896. Throughput: 0: 43032.4. Samples: 14009103480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:08,396][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 09:14:12,559][15401] Updated weights for policy 0, policy_version 855054 (0.0043) [2024-06-25 09:14:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 14009253888. Throughput: 0: 42840.4. Samples: 14009361480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 09:14:16,043][15401] Updated weights for policy 0, policy_version 855064 (0.0032) [2024-06-25 09:14:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14009450496. Throughput: 0: 42777.8. Samples: 14009620540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:18,398][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 09:14:20,013][15401] Updated weights for policy 0, policy_version 855074 (0.0037) [2024-06-25 09:14:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 14009679872. Throughput: 0: 43063.8. Samples: 14009751040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:23,393][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 09:14:23,666][15401] Updated weights for policy 0, policy_version 855084 (0.0045) [2024-06-25 09:14:27,648][15401] Updated weights for policy 0, policy_version 855094 (0.0044) [2024-06-25 09:14:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14009876480. Throughput: 0: 42928.9. Samples: 14010007120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 09:14:31,086][15401] Updated weights for policy 0, policy_version 855104 (0.0023) [2024-06-25 09:14:33,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14010073088. Throughput: 0: 42964.0. Samples: 14010269960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 09:14:33,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 09:14:35,197][15401] Updated weights for policy 0, policy_version 855114 (0.0029) [2024-06-25 09:14:38,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42871.2, 300 sec: 42931.6). Total num frames: 14010318848. Throughput: 0: 42981.4. Samples: 14010391160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:14:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 09:14:38,854][15401] Updated weights for policy 0, policy_version 855124 (0.0031) [2024-06-25 09:14:42,719][15401] Updated weights for policy 0, policy_version 855134 (0.0042) [2024-06-25 09:14:43,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14010531840. Throughput: 0: 42894.7. Samples: 14010650180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:14:43,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 09:14:46,777][15401] Updated weights for policy 0, policy_version 855144 (0.0036) [2024-06-25 09:14:48,390][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14010712064. Throughput: 0: 42884.4. Samples: 14010906940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:14:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 09:14:50,858][15401] Updated weights for policy 0, policy_version 855154 (0.0041) [2024-06-25 09:14:51,017][15349] Signal inference workers to stop experience collection... (207550 times) [2024-06-25 09:14:51,018][15349] Signal inference workers to resume experience collection... (207550 times) [2024-06-25 09:14:51,035][15401] InferenceWorker_p0-w0: stopping experience collection (207550 times) [2024-06-25 09:14:51,035][15401] InferenceWorker_p0-w0: resuming experience collection (207550 times) [2024-06-25 09:14:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14010957824. Throughput: 0: 42736.9. Samples: 14011026540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:14:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 09:14:54,273][15401] Updated weights for policy 0, policy_version 855164 (0.0034) [2024-06-25 09:14:58,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14011154432. Throughput: 0: 42817.1. Samples: 14011288240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:14:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 09:14:58,429][15401] Updated weights for policy 0, policy_version 855174 (0.0040) [2024-06-25 09:15:01,785][15401] Updated weights for policy 0, policy_version 855184 (0.0046) [2024-06-25 09:15:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14011351040. Throughput: 0: 42652.8. Samples: 14011539920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:03,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 09:15:06,301][15401] Updated weights for policy 0, policy_version 855194 (0.0037) [2024-06-25 09:15:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42877.0). Total num frames: 14011596800. Throughput: 0: 42590.7. Samples: 14011667520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:08,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 09:15:09,420][15401] Updated weights for policy 0, policy_version 855204 (0.0038) [2024-06-25 09:15:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14011793408. Throughput: 0: 42573.8. Samples: 14011922940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 09:15:13,892][15401] Updated weights for policy 0, policy_version 855214 (0.0043) [2024-06-25 09:15:17,263][15401] Updated weights for policy 0, policy_version 855224 (0.0037) [2024-06-25 09:15:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42710.3). Total num frames: 14012006400. Throughput: 0: 42453.4. Samples: 14012180360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 09:15:21,603][15401] Updated weights for policy 0, policy_version 855234 (0.0037) [2024-06-25 09:15:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 14012219392. Throughput: 0: 42620.3. Samples: 14012309060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:23,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 09:15:24,784][15401] Updated weights for policy 0, policy_version 855244 (0.0039) [2024-06-25 09:15:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14012416000. Throughput: 0: 42448.5. Samples: 14012560360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:28,390][15132] Avg episode reward: [(0, '0.141')] [2024-06-25 09:15:29,434][15401] Updated weights for policy 0, policy_version 855254 (0.0048) [2024-06-25 09:15:32,535][15401] Updated weights for policy 0, policy_version 855264 (0.0038) [2024-06-25 09:15:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14012661760. Throughput: 0: 42213.9. Samples: 14012806560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 09:15:37,037][15401] Updated weights for policy 0, policy_version 855274 (0.0041) [2024-06-25 09:15:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 14012841984. Throughput: 0: 42538.6. Samples: 14012940780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:38,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 09:15:40,471][15401] Updated weights for policy 0, policy_version 855284 (0.0027) [2024-06-25 09:15:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14013071360. Throughput: 0: 42383.4. Samples: 14013195500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:43,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 09:15:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855290_14013071360.pth... [2024-06-25 09:15:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854665_14002831360.pth [2024-06-25 09:15:45,122][15401] Updated weights for policy 0, policy_version 855294 (0.0029) [2024-06-25 09:15:48,128][15401] Updated weights for policy 0, policy_version 855304 (0.0024) [2024-06-25 09:15:48,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14013300736. Throughput: 0: 42235.2. Samples: 14013440500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 09:15:52,824][15401] Updated weights for policy 0, policy_version 855314 (0.0040) [2024-06-25 09:15:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 14013480960. Throughput: 0: 42392.4. Samples: 14013575280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:53,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 09:15:55,978][15401] Updated weights for policy 0, policy_version 855324 (0.0031) [2024-06-25 09:15:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 14013693952. Throughput: 0: 42352.9. Samples: 14013828820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:15:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 09:16:00,572][15401] Updated weights for policy 0, policy_version 855334 (0.0041) [2024-06-25 09:16:03,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14013923328. Throughput: 0: 42244.4. Samples: 14014081360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:16:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 09:16:03,830][15401] Updated weights for policy 0, policy_version 855344 (0.0042) [2024-06-25 09:16:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 14014103552. Throughput: 0: 42276.5. Samples: 14014211500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:16:08,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 09:16:08,504][15401] Updated weights for policy 0, policy_version 855354 (0.0027) [2024-06-25 09:16:11,934][15401] Updated weights for policy 0, policy_version 855364 (0.0031) [2024-06-25 09:16:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14014349312. Throughput: 0: 42224.3. Samples: 14014460460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:16:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 09:16:16,184][15401] Updated weights for policy 0, policy_version 855374 (0.0035) [2024-06-25 09:16:16,836][15349] Signal inference workers to stop experience collection... (207600 times) [2024-06-25 09:16:16,836][15349] Signal inference workers to resume experience collection... (207600 times) [2024-06-25 09:16:16,853][15401] InferenceWorker_p0-w0: stopping experience collection (207600 times) [2024-06-25 09:16:16,853][15401] InferenceWorker_p0-w0: resuming experience collection (207600 times) [2024-06-25 09:16:18,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14014562304. Throughput: 0: 42445.3. Samples: 14014716700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:16:18,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 09:16:19,778][15401] Updated weights for policy 0, policy_version 855384 (0.0034) [2024-06-25 09:16:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42654.6). Total num frames: 14014758912. Throughput: 0: 42337.0. Samples: 14014845940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:16:23,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 09:16:23,905][15401] Updated weights for policy 0, policy_version 855394 (0.0042) [2024-06-25 09:16:27,294][15401] Updated weights for policy 0, policy_version 855404 (0.0042) [2024-06-25 09:16:28,390][15132] Fps is (10 sec: 42608.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14014988288. Throughput: 0: 42333.3. Samples: 14015100500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:28,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 09:16:31,387][15401] Updated weights for policy 0, policy_version 855414 (0.0049) [2024-06-25 09:16:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14015201280. Throughput: 0: 42613.4. Samples: 14015358100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 09:16:34,711][15401] Updated weights for policy 0, policy_version 855424 (0.0039) [2024-06-25 09:16:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14015414272. Throughput: 0: 42504.6. Samples: 14015487880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 09:16:38,803][15401] Updated weights for policy 0, policy_version 855434 (0.0028) [2024-06-25 09:16:42,211][15401] Updated weights for policy 0, policy_version 855444 (0.0027) [2024-06-25 09:16:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14015627264. Throughput: 0: 42596.8. Samples: 14015745680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 09:16:46,445][15401] Updated weights for policy 0, policy_version 855454 (0.0024) [2024-06-25 09:16:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14015856640. Throughput: 0: 42728.0. Samples: 14016004120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:48,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-25 09:16:49,891][15401] Updated weights for policy 0, policy_version 855464 (0.0034) [2024-06-25 09:16:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42873.3, 300 sec: 42654.0). Total num frames: 14016053248. Throughput: 0: 42746.7. Samples: 14016135100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 09:16:54,275][15401] Updated weights for policy 0, policy_version 855474 (0.0036) [2024-06-25 09:16:57,690][15401] Updated weights for policy 0, policy_version 855484 (0.0046) [2024-06-25 09:16:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 14016249856. Throughput: 0: 42805.5. Samples: 14016386700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:16:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 09:17:01,811][15401] Updated weights for policy 0, policy_version 855494 (0.0034) [2024-06-25 09:17:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14016479232. Throughput: 0: 42984.1. Samples: 14016650880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 09:17:05,160][15401] Updated weights for policy 0, policy_version 855504 (0.0031) [2024-06-25 09:17:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 14016692224. Throughput: 0: 42963.6. Samples: 14016779300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 09:17:09,871][15401] Updated weights for policy 0, policy_version 855514 (0.0033) [2024-06-25 09:17:13,062][15401] Updated weights for policy 0, policy_version 855524 (0.0028) [2024-06-25 09:17:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14016905216. Throughput: 0: 42977.5. Samples: 14017034480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:13,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 09:17:17,422][15401] Updated weights for policy 0, policy_version 855534 (0.0032) [2024-06-25 09:17:18,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 14017101824. Throughput: 0: 43001.2. Samples: 14017293260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:18,393][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 09:17:21,339][15401] Updated weights for policy 0, policy_version 855544 (0.0038) [2024-06-25 09:17:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14017331200. Throughput: 0: 42875.0. Samples: 14017417260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:23,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 09:17:24,941][15401] Updated weights for policy 0, policy_version 855554 (0.0040) [2024-06-25 09:17:28,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 14017544192. Throughput: 0: 42702.0. Samples: 14017667260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:28,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 09:17:28,716][15401] Updated weights for policy 0, policy_version 855564 (0.0037) [2024-06-25 09:17:32,419][15401] Updated weights for policy 0, policy_version 855574 (0.0029) [2024-06-25 09:17:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14017740800. Throughput: 0: 42725.8. Samples: 14017926780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:33,390][15132] Avg episode reward: [(0, '0.095')] [2024-06-25 09:17:36,340][15401] Updated weights for policy 0, policy_version 855584 (0.0037) [2024-06-25 09:17:38,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14017986560. Throughput: 0: 42614.4. Samples: 14018052760. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:38,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 09:17:39,901][15349] Signal inference workers to stop experience collection... (207650 times) [2024-06-25 09:17:39,901][15349] Signal inference workers to resume experience collection... (207650 times) [2024-06-25 09:17:39,944][15401] InferenceWorker_p0-w0: stopping experience collection (207650 times) [2024-06-25 09:17:39,944][15401] InferenceWorker_p0-w0: resuming experience collection (207650 times) [2024-06-25 09:17:40,041][15401] Updated weights for policy 0, policy_version 855594 (0.0038) [2024-06-25 09:17:43,391][15132] Fps is (10 sec: 44229.8, 60 sec: 42597.3, 300 sec: 42709.3). Total num frames: 14018183168. Throughput: 0: 42737.1. Samples: 14018309940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:43,392][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 09:17:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855603_14018199552.pth... [2024-06-25 09:17:43,580][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000854978_14007959552.pth [2024-06-25 09:17:43,827][15401] Updated weights for policy 0, policy_version 855604 (0.0036) [2024-06-25 09:17:48,102][15401] Updated weights for policy 0, policy_version 855614 (0.0022) [2024-06-25 09:17:48,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 14018396160. Throughput: 0: 42688.8. Samples: 14018571880. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 09:17:51,318][15401] Updated weights for policy 0, policy_version 855624 (0.0029) [2024-06-25 09:17:53,390][15132] Fps is (10 sec: 44243.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14018625536. Throughput: 0: 42667.9. Samples: 14018699360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 09:17:55,944][15401] Updated weights for policy 0, policy_version 855634 (0.0034) [2024-06-25 09:17:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14018822144. Throughput: 0: 42559.1. Samples: 14018949640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:17:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 09:17:59,210][15401] Updated weights for policy 0, policy_version 855644 (0.0024) [2024-06-25 09:18:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14019018752. Throughput: 0: 42516.1. Samples: 14019206380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:18:03,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 09:18:03,451][15401] Updated weights for policy 0, policy_version 855654 (0.0034) [2024-06-25 09:18:06,757][15401] Updated weights for policy 0, policy_version 855664 (0.0040) [2024-06-25 09:18:08,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14019264512. Throughput: 0: 42617.3. Samples: 14019335140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:18:08,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 09:18:11,080][15401] Updated weights for policy 0, policy_version 855674 (0.0043) [2024-06-25 09:18:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14019461120. Throughput: 0: 42679.9. Samples: 14019587860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-25 09:18:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 09:18:14,391][15401] Updated weights for policy 0, policy_version 855684 (0.0031) [2024-06-25 09:18:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 14019674112. Throughput: 0: 42546.6. Samples: 14019841380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 09:18:18,706][15401] Updated weights for policy 0, policy_version 855694 (0.0030) [2024-06-25 09:18:21,957][15401] Updated weights for policy 0, policy_version 855704 (0.0039) [2024-06-25 09:18:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14019903488. Throughput: 0: 42584.1. Samples: 14019969040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:23,391][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 09:18:26,545][15401] Updated weights for policy 0, policy_version 855714 (0.0037) [2024-06-25 09:18:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14020100096. Throughput: 0: 42706.0. Samples: 14020231640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 09:18:29,921][15401] Updated weights for policy 0, policy_version 855724 (0.0043) [2024-06-25 09:18:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14020313088. Throughput: 0: 42560.0. Samples: 14020487080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:33,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 09:18:34,412][15401] Updated weights for policy 0, policy_version 855734 (0.0032) [2024-06-25 09:18:37,375][15401] Updated weights for policy 0, policy_version 855744 (0.0027) [2024-06-25 09:18:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14020558848. Throughput: 0: 42527.6. Samples: 14020613100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 09:18:41,943][15401] Updated weights for policy 0, policy_version 855754 (0.0040) [2024-06-25 09:18:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42599.6, 300 sec: 42598.4). Total num frames: 14020739072. Throughput: 0: 42801.8. Samples: 14020875720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:43,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:18:44,836][15401] Updated weights for policy 0, policy_version 855764 (0.0048) [2024-06-25 09:18:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14020952064. Throughput: 0: 42794.5. Samples: 14021132140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:18:50,138][15401] Updated weights for policy 0, policy_version 855774 (0.0034) [2024-06-25 09:18:52,699][15401] Updated weights for policy 0, policy_version 855784 (0.0027) [2024-06-25 09:18:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14021181440. Throughput: 0: 42604.4. Samples: 14021252240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:53,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 09:18:57,640][15401] Updated weights for policy 0, policy_version 855794 (0.0029) [2024-06-25 09:18:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14021361664. Throughput: 0: 42827.5. Samples: 14021515100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:18:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 09:19:00,334][15349] Signal inference workers to stop experience collection... (207700 times) [2024-06-25 09:19:00,335][15349] Signal inference workers to resume experience collection... (207700 times) [2024-06-25 09:19:00,346][15401] InferenceWorker_p0-w0: stopping experience collection (207700 times) [2024-06-25 09:19:00,349][15401] Updated weights for policy 0, policy_version 855804 (0.0048) [2024-06-25 09:19:00,372][15401] InferenceWorker_p0-w0: resuming experience collection (207700 times) [2024-06-25 09:19:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 14021591040. Throughput: 0: 42717.3. Samples: 14021763660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 09:19:05,058][15401] Updated weights for policy 0, policy_version 855814 (0.0042) [2024-06-25 09:19:07,920][15401] Updated weights for policy 0, policy_version 855824 (0.0040) [2024-06-25 09:19:08,391][15132] Fps is (10 sec: 47504.5, 60 sec: 42871.8, 300 sec: 42653.7). Total num frames: 14021836800. Throughput: 0: 42875.1. Samples: 14021898500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:08,392][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 09:19:12,546][15401] Updated weights for policy 0, policy_version 855834 (0.0040) [2024-06-25 09:19:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14022017024. Throughput: 0: 42605.2. Samples: 14022148880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:13,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 09:19:16,055][15401] Updated weights for policy 0, policy_version 855844 (0.0029) [2024-06-25 09:19:18,390][15132] Fps is (10 sec: 37690.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 14022213632. Throughput: 0: 42600.9. Samples: 14022404120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-25 09:19:20,286][15401] Updated weights for policy 0, policy_version 855854 (0.0034) [2024-06-25 09:19:23,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14022459392. Throughput: 0: 42584.1. Samples: 14022529380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 09:19:23,518][15401] Updated weights for policy 0, policy_version 855864 (0.0023) [2024-06-25 09:19:27,680][15401] Updated weights for policy 0, policy_version 855874 (0.0027) [2024-06-25 09:19:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14022672384. Throughput: 0: 42605.7. Samples: 14022792980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 09:19:31,582][15401] Updated weights for policy 0, policy_version 855884 (0.0028) [2024-06-25 09:19:33,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14022868992. Throughput: 0: 42414.6. Samples: 14023040800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 09:19:35,254][15401] Updated weights for policy 0, policy_version 855894 (0.0028) [2024-06-25 09:19:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14023098368. Throughput: 0: 42622.8. Samples: 14023170260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 09:19:39,123][15401] Updated weights for policy 0, policy_version 855904 (0.0037) [2024-06-25 09:19:42,783][15401] Updated weights for policy 0, policy_version 855914 (0.0051) [2024-06-25 09:19:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14023311360. Throughput: 0: 42514.2. Samples: 14023428240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:43,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 09:19:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855915_14023311360.pth... [2024-06-25 09:19:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855290_14013071360.pth [2024-06-25 09:19:46,826][15401] Updated weights for policy 0, policy_version 855924 (0.0031) [2024-06-25 09:19:48,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14023524352. Throughput: 0: 42593.8. Samples: 14023680380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:48,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 09:19:50,579][15401] Updated weights for policy 0, policy_version 855934 (0.0040) [2024-06-25 09:19:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14023737344. Throughput: 0: 42336.4. Samples: 14023803560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 09:19:54,415][15401] Updated weights for policy 0, policy_version 855944 (0.0044) [2024-06-25 09:19:58,122][15401] Updated weights for policy 0, policy_version 855954 (0.0034) [2024-06-25 09:19:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14023950336. Throughput: 0: 42605.0. Samples: 14024066100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:19:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 09:20:02,060][15401] Updated weights for policy 0, policy_version 855964 (0.0027) [2024-06-25 09:20:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14024146944. Throughput: 0: 42577.4. Samples: 14024320100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 09:20:03,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 09:20:05,876][15401] Updated weights for policy 0, policy_version 855974 (0.0043) [2024-06-25 09:20:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42053.6, 300 sec: 42598.4). Total num frames: 14024359936. Throughput: 0: 42502.1. Samples: 14024441980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 09:20:09,544][15401] Updated weights for policy 0, policy_version 855984 (0.0037) [2024-06-25 09:20:13,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 14024589312. Throughput: 0: 42576.3. Samples: 14024709020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:13,393][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 09:20:13,545][15401] Updated weights for policy 0, policy_version 855994 (0.0048) [2024-06-25 09:20:17,142][15401] Updated weights for policy 0, policy_version 856004 (0.0038) [2024-06-25 09:20:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14024785920. Throughput: 0: 42631.6. Samples: 14024959220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 09:20:21,237][15401] Updated weights for policy 0, policy_version 856014 (0.0032) [2024-06-25 09:20:23,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14025015296. Throughput: 0: 42690.7. Samples: 14025091340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 09:20:25,050][15401] Updated weights for policy 0, policy_version 856024 (0.0035) [2024-06-25 09:20:28,228][15349] Signal inference workers to stop experience collection... (207750 times) [2024-06-25 09:20:28,233][15349] Signal inference workers to resume experience collection... (207750 times) [2024-06-25 09:20:28,241][15401] InferenceWorker_p0-w0: stopping experience collection (207750 times) [2024-06-25 09:20:28,269][15401] InferenceWorker_p0-w0: resuming experience collection (207750 times) [2024-06-25 09:20:28,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14025244672. Throughput: 0: 42752.8. Samples: 14025352220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:28,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 09:20:28,705][15401] Updated weights for policy 0, policy_version 856034 (0.0032) [2024-06-25 09:20:32,762][15401] Updated weights for policy 0, policy_version 856044 (0.0038) [2024-06-25 09:20:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14025441280. Throughput: 0: 42649.3. Samples: 14025599600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 09:20:36,611][15401] Updated weights for policy 0, policy_version 856054 (0.0034) [2024-06-25 09:20:38,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14025637888. Throughput: 0: 42782.4. Samples: 14025728760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 09:20:40,579][15401] Updated weights for policy 0, policy_version 856064 (0.0034) [2024-06-25 09:20:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14025867264. Throughput: 0: 42753.4. Samples: 14025990000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:43,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 09:20:44,095][15401] Updated weights for policy 0, policy_version 856074 (0.0028) [2024-06-25 09:20:48,306][15401] Updated weights for policy 0, policy_version 856084 (0.0034) [2024-06-25 09:20:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14026080256. Throughput: 0: 42662.1. Samples: 14026239900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:48,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 09:20:51,726][15401] Updated weights for policy 0, policy_version 856094 (0.0026) [2024-06-25 09:20:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14026276864. Throughput: 0: 42704.8. Samples: 14026363700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 09:20:56,200][15401] Updated weights for policy 0, policy_version 856104 (0.0038) [2024-06-25 09:20:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14026522624. Throughput: 0: 42701.5. Samples: 14026630480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:20:58,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 09:20:59,402][15401] Updated weights for policy 0, policy_version 856114 (0.0033) [2024-06-25 09:21:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14026719232. Throughput: 0: 42913.3. Samples: 14026890320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 09:21:03,686][15401] Updated weights for policy 0, policy_version 856124 (0.0029) [2024-06-25 09:21:06,745][15401] Updated weights for policy 0, policy_version 856134 (0.0036) [2024-06-25 09:21:08,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14026932224. Throughput: 0: 42711.9. Samples: 14027013380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:08,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 09:21:11,530][15401] Updated weights for policy 0, policy_version 856144 (0.0043) [2024-06-25 09:21:13,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43146.3, 300 sec: 42765.4). Total num frames: 14027177984. Throughput: 0: 42811.2. Samples: 14027278620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 09:21:14,215][15401] Updated weights for policy 0, policy_version 856154 (0.0031) [2024-06-25 09:21:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14027358208. Throughput: 0: 43034.6. Samples: 14027536160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 09:21:19,191][15401] Updated weights for policy 0, policy_version 856164 (0.0039) [2024-06-25 09:21:21,720][15401] Updated weights for policy 0, policy_version 856174 (0.0036) [2024-06-25 09:21:23,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14027587584. Throughput: 0: 42836.6. Samples: 14027656520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:23,393][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 09:21:26,692][15401] Updated weights for policy 0, policy_version 856184 (0.0038) [2024-06-25 09:21:28,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14027816960. Throughput: 0: 42899.0. Samples: 14027920460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 09:21:29,550][15401] Updated weights for policy 0, policy_version 856194 (0.0036) [2024-06-25 09:21:33,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14027997184. Throughput: 0: 43106.4. Samples: 14028179680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 09:21:34,289][15401] Updated weights for policy 0, policy_version 856204 (0.0030) [2024-06-25 09:21:37,265][15401] Updated weights for policy 0, policy_version 856214 (0.0042) [2024-06-25 09:21:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14028226560. Throughput: 0: 42980.5. Samples: 14028297820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 09:21:42,038][15401] Updated weights for policy 0, policy_version 856224 (0.0022) [2024-06-25 09:21:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14028455936. Throughput: 0: 42907.8. Samples: 14028561340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 09:21:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856229_14028455936.pth... [2024-06-25 09:21:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855603_14018199552.pth [2024-06-25 09:21:44,705][15401] Updated weights for policy 0, policy_version 856234 (0.0038) [2024-06-25 09:21:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14028619776. Throughput: 0: 42932.1. Samples: 14028822260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:48,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 09:21:49,988][15401] Updated weights for policy 0, policy_version 856244 (0.0034) [2024-06-25 09:21:50,999][15349] Signal inference workers to stop experience collection... (207800 times) [2024-06-25 09:21:51,000][15349] Signal inference workers to resume experience collection... (207800 times) [2024-06-25 09:21:51,020][15401] InferenceWorker_p0-w0: stopping experience collection (207800 times) [2024-06-25 09:21:51,020][15401] InferenceWorker_p0-w0: resuming experience collection (207800 times) [2024-06-25 09:21:52,233][15401] Updated weights for policy 0, policy_version 856254 (0.0034) [2024-06-25 09:21:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 14028881920. Throughput: 0: 42784.9. Samples: 14028938700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 09:21:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 09:21:57,551][15401] Updated weights for policy 0, policy_version 856264 (0.0041) [2024-06-25 09:21:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14029078528. Throughput: 0: 42897.2. Samples: 14029209000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:21:58,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 09:21:59,990][15401] Updated weights for policy 0, policy_version 856274 (0.0038) [2024-06-25 09:22:03,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14029258752. Throughput: 0: 42840.1. Samples: 14029463960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 09:22:05,296][15401] Updated weights for policy 0, policy_version 856284 (0.0036) [2024-06-25 09:22:07,497][15401] Updated weights for policy 0, policy_version 856294 (0.0029) [2024-06-25 09:22:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 14029537280. Throughput: 0: 42745.5. Samples: 14029579960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:08,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 09:22:12,989][15401] Updated weights for policy 0, policy_version 856304 (0.0027) [2024-06-25 09:22:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 14029717504. Throughput: 0: 42883.6. Samples: 14029850220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 09:22:15,027][15401] Updated weights for policy 0, policy_version 856314 (0.0036) [2024-06-25 09:22:18,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14029897728. Throughput: 0: 42683.5. Samples: 14030100440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 09:22:20,593][15401] Updated weights for policy 0, policy_version 856324 (0.0025) [2024-06-25 09:22:23,062][15401] Updated weights for policy 0, policy_version 856334 (0.0024) [2024-06-25 09:22:23,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43419.4, 300 sec: 42876.1). Total num frames: 14030192640. Throughput: 0: 42808.5. Samples: 14030224200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:23,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 09:22:28,241][15401] Updated weights for policy 0, policy_version 856344 (0.0044) [2024-06-25 09:22:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14030340096. Throughput: 0: 42855.2. Samples: 14030489820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:28,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:22:30,502][15401] Updated weights for policy 0, policy_version 856354 (0.0036) [2024-06-25 09:22:33,392][15132] Fps is (10 sec: 36035.8, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 14030553088. Throughput: 0: 42789.3. Samples: 14030747880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:33,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 09:22:36,030][15401] Updated weights for policy 0, policy_version 856364 (0.0045) [2024-06-25 09:22:38,003][15401] Updated weights for policy 0, policy_version 856374 (0.0032) [2024-06-25 09:22:38,390][15132] Fps is (10 sec: 50789.8, 60 sec: 43690.7, 300 sec: 42931.9). Total num frames: 14030848000. Throughput: 0: 43014.6. Samples: 14030874360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:38,394][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 09:22:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14030979072. Throughput: 0: 42775.6. Samples: 14031133900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 09:22:43,578][15401] Updated weights for policy 0, policy_version 856384 (0.0030) [2024-06-25 09:22:46,007][15401] Updated weights for policy 0, policy_version 856394 (0.0040) [2024-06-25 09:22:46,021][15349] Signal inference workers to stop experience collection... (207850 times) [2024-06-25 09:22:46,021][15349] Signal inference workers to resume experience collection... (207850 times) [2024-06-25 09:22:46,035][15401] InferenceWorker_p0-w0: stopping experience collection (207850 times) [2024-06-25 09:22:46,036][15401] InferenceWorker_p0-w0: resuming experience collection (207850 times) [2024-06-25 09:22:48,390][15132] Fps is (10 sec: 34406.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14031192064. Throughput: 0: 42469.3. Samples: 14031375080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 09:22:51,613][15401] Updated weights for policy 0, policy_version 856404 (0.0046) [2024-06-25 09:22:53,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14031437824. Throughput: 0: 42815.5. Samples: 14031506660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 09:22:53,773][15401] Updated weights for policy 0, policy_version 856414 (0.0031) [2024-06-25 09:22:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 14031601664. Throughput: 0: 42580.9. Samples: 14031766360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:22:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 09:22:59,080][15401] Updated weights for policy 0, policy_version 856424 (0.0037) [2024-06-25 09:23:01,530][15401] Updated weights for policy 0, policy_version 856434 (0.0031) [2024-06-25 09:23:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 14031847424. Throughput: 0: 42500.0. Samples: 14032012940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 09:23:06,790][15401] Updated weights for policy 0, policy_version 856444 (0.0028) [2024-06-25 09:23:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14032076800. Throughput: 0: 42708.4. Samples: 14032146080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 09:23:09,237][15401] Updated weights for policy 0, policy_version 856454 (0.0035) [2024-06-25 09:23:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 14032273408. Throughput: 0: 42539.4. Samples: 14032404200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:13,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 09:23:14,483][15401] Updated weights for policy 0, policy_version 856464 (0.0029) [2024-06-25 09:23:16,973][15401] Updated weights for policy 0, policy_version 856474 (0.0037) [2024-06-25 09:23:18,396][15132] Fps is (10 sec: 42570.9, 60 sec: 43413.0, 300 sec: 42708.6). Total num frames: 14032502784. Throughput: 0: 42246.0. Samples: 14032649120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:18,396][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 09:23:22,089][15401] Updated weights for policy 0, policy_version 856484 (0.0035) [2024-06-25 09:23:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 14032699392. Throughput: 0: 42495.7. Samples: 14032786660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 09:23:24,642][15401] Updated weights for policy 0, policy_version 856494 (0.0026) [2024-06-25 09:23:28,390][15132] Fps is (10 sec: 39346.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14032896000. Throughput: 0: 42381.3. Samples: 14033041060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 09:23:29,804][15401] Updated weights for policy 0, policy_version 856504 (0.0036) [2024-06-25 09:23:32,291][15401] Updated weights for policy 0, policy_version 856514 (0.0030) [2024-06-25 09:23:33,389][15132] Fps is (10 sec: 45874.7, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 14033158144. Throughput: 0: 42353.0. Samples: 14033280960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 09:23:37,333][15401] Updated weights for policy 0, policy_version 856524 (0.0031) [2024-06-25 09:23:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 41506.2, 300 sec: 42709.5). Total num frames: 14033338368. Throughput: 0: 42656.0. Samples: 14033426180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 09:23:39,878][15401] Updated weights for policy 0, policy_version 856534 (0.0030) [2024-06-25 09:23:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14033551360. Throughput: 0: 42569.2. Samples: 14033681980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 09:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856540_14033551360.pth... [2024-06-25 09:23:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000855915_14023311360.pth [2024-06-25 09:23:45,526][15401] Updated weights for policy 0, policy_version 856544 (0.0030) [2024-06-25 09:23:47,034][15349] Signal inference workers to stop experience collection... (207900 times) [2024-06-25 09:23:47,081][15401] InferenceWorker_p0-w0: stopping experience collection (207900 times) [2024-06-25 09:23:47,091][15349] Signal inference workers to resume experience collection... (207900 times) [2024-06-25 09:23:47,097][15401] InferenceWorker_p0-w0: resuming experience collection (207900 times) [2024-06-25 09:23:47,718][15401] Updated weights for policy 0, policy_version 856554 (0.0036) [2024-06-25 09:23:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 14033797120. Throughput: 0: 42526.7. Samples: 14033926640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-25 09:23:48,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 09:23:53,021][15401] Updated weights for policy 0, policy_version 856564 (0.0028) [2024-06-25 09:23:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14033960960. Throughput: 0: 42636.4. Samples: 14034064720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:23:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 09:23:55,388][15401] Updated weights for policy 0, policy_version 856574 (0.0034) [2024-06-25 09:23:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14034206720. Throughput: 0: 42608.6. Samples: 14034321480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:23:58,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 09:24:00,591][15401] Updated weights for policy 0, policy_version 856584 (0.0030) [2024-06-25 09:24:03,216][15401] Updated weights for policy 0, policy_version 856594 (0.0037) [2024-06-25 09:24:03,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 14034436096. Throughput: 0: 42747.5. Samples: 14034572480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 09:24:08,148][15401] Updated weights for policy 0, policy_version 856604 (0.0041) [2024-06-25 09:24:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14034616320. Throughput: 0: 42579.5. Samples: 14034702740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:08,392][15132] Avg episode reward: [(0, '0.845')] [2024-06-25 09:24:11,335][15401] Updated weights for policy 0, policy_version 856614 (0.0033) [2024-06-25 09:24:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 14034845696. Throughput: 0: 42664.3. Samples: 14034960960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 09:24:15,450][15401] Updated weights for policy 0, policy_version 856624 (0.0029) [2024-06-25 09:24:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 14035042304. Throughput: 0: 43012.3. Samples: 14035216520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:18,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 09:24:19,093][15401] Updated weights for policy 0, policy_version 856634 (0.0036) [2024-06-25 09:24:23,125][15401] Updated weights for policy 0, policy_version 856644 (0.0032) [2024-06-25 09:24:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14035255296. Throughput: 0: 42552.5. Samples: 14035341040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 09:24:26,599][15401] Updated weights for policy 0, policy_version 856654 (0.0034) [2024-06-25 09:24:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14035484672. Throughput: 0: 42527.7. Samples: 14035595720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:28,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 09:24:30,779][15401] Updated weights for policy 0, policy_version 856664 (0.0033) [2024-06-25 09:24:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14035681280. Throughput: 0: 42838.1. Samples: 14035854360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:33,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 09:24:34,250][15401] Updated weights for policy 0, policy_version 856674 (0.0028) [2024-06-25 09:24:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14035894272. Throughput: 0: 42648.8. Samples: 14035983920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 09:24:38,421][15401] Updated weights for policy 0, policy_version 856684 (0.0040) [2024-06-25 09:24:42,040][15401] Updated weights for policy 0, policy_version 856694 (0.0023) [2024-06-25 09:24:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14036123648. Throughput: 0: 42587.0. Samples: 14036237900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 09:24:46,018][15401] Updated weights for policy 0, policy_version 856704 (0.0035) [2024-06-25 09:24:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 14036320256. Throughput: 0: 42903.1. Samples: 14036503120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 09:24:50,004][15401] Updated weights for policy 0, policy_version 856714 (0.0034) [2024-06-25 09:24:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14036549632. Throughput: 0: 42855.2. Samples: 14036631220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:53,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 09:24:53,591][15401] Updated weights for policy 0, policy_version 856724 (0.0030) [2024-06-25 09:24:57,445][15401] Updated weights for policy 0, policy_version 856734 (0.0047) [2024-06-25 09:24:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14036746240. Throughput: 0: 42716.7. Samples: 14036883200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:24:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 09:25:01,049][15401] Updated weights for policy 0, policy_version 856744 (0.0032) [2024-06-25 09:25:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 14036942848. Throughput: 0: 42780.6. Samples: 14037141640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 09:25:05,089][15401] Updated weights for policy 0, policy_version 856754 (0.0032) [2024-06-25 09:25:08,396][15132] Fps is (10 sec: 45845.3, 60 sec: 43139.9, 300 sec: 42764.4). Total num frames: 14037204992. Throughput: 0: 42874.2. Samples: 14037270660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:08,397][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 09:25:08,632][15401] Updated weights for policy 0, policy_version 856764 (0.0035) [2024-06-25 09:25:12,554][15401] Updated weights for policy 0, policy_version 856774 (0.0041) [2024-06-25 09:25:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 14037401600. Throughput: 0: 42967.0. Samples: 14037529340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:13,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 09:25:16,026][15401] Updated weights for policy 0, policy_version 856784 (0.0035) [2024-06-25 09:25:18,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14037614592. Throughput: 0: 42889.0. Samples: 14037784360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:18,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 09:25:20,221][15401] Updated weights for policy 0, policy_version 856794 (0.0029) [2024-06-25 09:25:22,799][15349] Signal inference workers to stop experience collection... (207950 times) [2024-06-25 09:25:22,836][15401] InferenceWorker_p0-w0: stopping experience collection (207950 times) [2024-06-25 09:25:22,847][15349] Signal inference workers to resume experience collection... (207950 times) [2024-06-25 09:25:22,857][15401] InferenceWorker_p0-w0: resuming experience collection (207950 times) [2024-06-25 09:25:23,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 14037843968. Throughput: 0: 42762.3. Samples: 14037908220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:23,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 09:25:24,008][15401] Updated weights for policy 0, policy_version 856804 (0.0028) [2024-06-25 09:25:28,110][15401] Updated weights for policy 0, policy_version 856814 (0.0035) [2024-06-25 09:25:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14038040576. Throughput: 0: 42929.8. Samples: 14038169740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 09:25:31,692][15401] Updated weights for policy 0, policy_version 856824 (0.0039) [2024-06-25 09:25:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14038253568. Throughput: 0: 42617.4. Samples: 14038420900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 09:25:36,289][15401] Updated weights for policy 0, policy_version 856834 (0.0040) [2024-06-25 09:25:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14038482944. Throughput: 0: 42691.0. Samples: 14038552320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-25 09:25:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 09:25:39,298][15401] Updated weights for policy 0, policy_version 856844 (0.0024) [2024-06-25 09:25:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14038663168. Throughput: 0: 42712.8. Samples: 14038805280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:25:43,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 09:25:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856853_14038679552.pth... [2024-06-25 09:25:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856229_14028455936.pth [2024-06-25 09:25:43,896][15401] Updated weights for policy 0, policy_version 856854 (0.0035) [2024-06-25 09:25:47,354][15401] Updated weights for policy 0, policy_version 856864 (0.0047) [2024-06-25 09:25:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14038892544. Throughput: 0: 42465.4. Samples: 14039052580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:25:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 09:25:51,543][15401] Updated weights for policy 0, policy_version 856874 (0.0034) [2024-06-25 09:25:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 14039121920. Throughput: 0: 42600.2. Samples: 14039187400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:25:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 09:25:55,236][15401] Updated weights for policy 0, policy_version 856884 (0.0033) [2024-06-25 09:25:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 14039302144. Throughput: 0: 42368.9. Samples: 14039435840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:25:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 09:25:59,591][15401] Updated weights for policy 0, policy_version 856894 (0.0030) [2024-06-25 09:26:02,798][15401] Updated weights for policy 0, policy_version 856904 (0.0041) [2024-06-25 09:26:03,396][15132] Fps is (10 sec: 42572.2, 60 sec: 43413.0, 300 sec: 42764.1). Total num frames: 14039547904. Throughput: 0: 42279.4. Samples: 14039687200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:03,396][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 09:26:07,260][15401] Updated weights for policy 0, policy_version 856914 (0.0031) [2024-06-25 09:26:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 14039760896. Throughput: 0: 42575.4. Samples: 14039824120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:08,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 09:26:10,643][15401] Updated weights for policy 0, policy_version 856924 (0.0039) [2024-06-25 09:26:13,391][15132] Fps is (10 sec: 39338.8, 60 sec: 42325.6, 300 sec: 42653.7). Total num frames: 14039941120. Throughput: 0: 42310.1. Samples: 14040073780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:13,392][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 09:26:14,661][15401] Updated weights for policy 0, policy_version 856934 (0.0038) [2024-06-25 09:26:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 14040154112. Throughput: 0: 42454.6. Samples: 14040331360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 09:26:18,462][15401] Updated weights for policy 0, policy_version 856944 (0.0022) [2024-06-25 09:26:22,208][15401] Updated weights for policy 0, policy_version 856954 (0.0028) [2024-06-25 09:26:23,389][15132] Fps is (10 sec: 45884.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14040399872. Throughput: 0: 42519.6. Samples: 14040465700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 09:26:25,920][15401] Updated weights for policy 0, policy_version 856964 (0.0037) [2024-06-25 09:26:28,391][15132] Fps is (10 sec: 44231.4, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 14040596480. Throughput: 0: 42540.2. Samples: 14040719640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:28,391][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 09:26:29,688][15401] Updated weights for policy 0, policy_version 856974 (0.0021) [2024-06-25 09:26:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14040809472. Throughput: 0: 42897.3. Samples: 14040982960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 09:26:33,493][15401] Updated weights for policy 0, policy_version 856984 (0.0033) [2024-06-25 09:26:37,688][15401] Updated weights for policy 0, policy_version 856994 (0.0022) [2024-06-25 09:26:38,389][15132] Fps is (10 sec: 42604.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14041022464. Throughput: 0: 42762.5. Samples: 14041111700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 09:26:40,996][15401] Updated weights for policy 0, policy_version 857004 (0.0027) [2024-06-25 09:26:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14041251840. Throughput: 0: 42871.1. Samples: 14041365040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:43,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 09:26:45,363][15401] Updated weights for policy 0, policy_version 857014 (0.0036) [2024-06-25 09:26:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14041448448. Throughput: 0: 43022.1. Samples: 14041622920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 09:26:48,457][15349] Signal inference workers to stop experience collection... (208000 times) [2024-06-25 09:26:48,508][15401] InferenceWorker_p0-w0: stopping experience collection (208000 times) [2024-06-25 09:26:48,512][15349] Signal inference workers to resume experience collection... (208000 times) [2024-06-25 09:26:48,519][15401] InferenceWorker_p0-w0: resuming experience collection (208000 times) [2024-06-25 09:26:48,668][15401] Updated weights for policy 0, policy_version 857024 (0.0033) [2024-06-25 09:26:52,932][15401] Updated weights for policy 0, policy_version 857034 (0.0043) [2024-06-25 09:26:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 14041677824. Throughput: 0: 42842.8. Samples: 14041752040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 09:26:56,201][15401] Updated weights for policy 0, policy_version 857044 (0.0042) [2024-06-25 09:26:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14041890816. Throughput: 0: 43002.8. Samples: 14042008820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:26:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 09:27:00,423][15401] Updated weights for policy 0, policy_version 857054 (0.0026) [2024-06-25 09:27:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42329.8, 300 sec: 42542.8). Total num frames: 14042087424. Throughput: 0: 42939.5. Samples: 14042263640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 09:27:03,926][15401] Updated weights for policy 0, policy_version 857064 (0.0038) [2024-06-25 09:27:08,068][15401] Updated weights for policy 0, policy_version 857074 (0.0026) [2024-06-25 09:27:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14042316800. Throughput: 0: 42801.6. Samples: 14042391780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:08,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 09:27:11,518][15401] Updated weights for policy 0, policy_version 857084 (0.0034) [2024-06-25 09:27:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43146.0, 300 sec: 42820.6). Total num frames: 14042529792. Throughput: 0: 42713.7. Samples: 14042641700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 09:27:15,950][15401] Updated weights for policy 0, policy_version 857094 (0.0030) [2024-06-25 09:27:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14042726400. Throughput: 0: 42580.8. Samples: 14042899100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 09:27:19,193][15401] Updated weights for policy 0, policy_version 857104 (0.0026) [2024-06-25 09:27:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14042939392. Throughput: 0: 42565.6. Samples: 14043027160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 09:27:23,691][15401] Updated weights for policy 0, policy_version 857114 (0.0032) [2024-06-25 09:27:26,813][15401] Updated weights for policy 0, policy_version 857124 (0.0034) [2024-06-25 09:27:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42599.3, 300 sec: 42709.8). Total num frames: 14043152384. Throughput: 0: 42576.6. Samples: 14043280980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 09:27:28,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 09:27:31,136][15401] Updated weights for policy 0, policy_version 857134 (0.0034) [2024-06-25 09:27:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14043381760. Throughput: 0: 42610.7. Samples: 14043540400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:33,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 09:27:34,421][15401] Updated weights for policy 0, policy_version 857144 (0.0027) [2024-06-25 09:27:38,392][15132] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 14043594752. Throughput: 0: 42689.4. Samples: 14043673160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:38,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 09:27:38,682][15401] Updated weights for policy 0, policy_version 857154 (0.0029) [2024-06-25 09:27:42,230][15401] Updated weights for policy 0, policy_version 857164 (0.0047) [2024-06-25 09:27:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14043807744. Throughput: 0: 42671.1. Samples: 14043929020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:43,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 09:27:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857166_14043807744.pth... [2024-06-25 09:27:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856540_14033551360.pth [2024-06-25 09:27:46,218][15401] Updated weights for policy 0, policy_version 857174 (0.0044) [2024-06-25 09:27:48,389][15132] Fps is (10 sec: 40969.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14044004352. Throughput: 0: 42649.9. Samples: 14044182880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 09:27:50,120][15401] Updated weights for policy 0, policy_version 857184 (0.0026) [2024-06-25 09:27:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14044217344. Throughput: 0: 42630.7. Samples: 14044310160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 09:27:53,434][15349] Signal inference workers to stop experience collection... (208050 times) [2024-06-25 09:27:53,434][15349] Signal inference workers to resume experience collection... (208050 times) [2024-06-25 09:27:53,443][15401] InferenceWorker_p0-w0: stopping experience collection (208050 times) [2024-06-25 09:27:53,444][15401] InferenceWorker_p0-w0: resuming experience collection (208050 times) [2024-06-25 09:27:53,903][15401] Updated weights for policy 0, policy_version 857194 (0.0033) [2024-06-25 09:27:57,760][15401] Updated weights for policy 0, policy_version 857204 (0.0032) [2024-06-25 09:27:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14044463104. Throughput: 0: 42742.6. Samples: 14044565120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:27:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 09:28:01,441][15401] Updated weights for policy 0, policy_version 857214 (0.0032) [2024-06-25 09:28:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14044659712. Throughput: 0: 42918.1. Samples: 14044830420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 09:28:05,315][15401] Updated weights for policy 0, policy_version 857224 (0.0031) [2024-06-25 09:28:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 14044856320. Throughput: 0: 42728.6. Samples: 14044949940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 09:28:09,579][15401] Updated weights for policy 0, policy_version 857234 (0.0037) [2024-06-25 09:28:13,040][15401] Updated weights for policy 0, policy_version 857244 (0.0021) [2024-06-25 09:28:13,396][15132] Fps is (10 sec: 44209.0, 60 sec: 42866.8, 300 sec: 42709.5). Total num frames: 14045102080. Throughput: 0: 42803.6. Samples: 14045207420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:13,396][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 09:28:17,266][15401] Updated weights for policy 0, policy_version 857254 (0.0040) [2024-06-25 09:28:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14045282304. Throughput: 0: 42725.3. Samples: 14045463040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:18,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 09:28:20,717][15401] Updated weights for policy 0, policy_version 857264 (0.0039) [2024-06-25 09:28:23,389][15132] Fps is (10 sec: 39346.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14045495296. Throughput: 0: 42530.9. Samples: 14045586960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:23,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 09:28:24,820][15401] Updated weights for policy 0, policy_version 857274 (0.0034) [2024-06-25 09:28:28,153][15401] Updated weights for policy 0, policy_version 857284 (0.0032) [2024-06-25 09:28:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14045741056. Throughput: 0: 42703.2. Samples: 14045850660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 09:28:32,472][15401] Updated weights for policy 0, policy_version 857294 (0.0021) [2024-06-25 09:28:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14045937664. Throughput: 0: 42725.7. Samples: 14046105540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 09:28:35,851][15401] Updated weights for policy 0, policy_version 857304 (0.0030) [2024-06-25 09:28:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 14046150656. Throughput: 0: 42571.2. Samples: 14046225860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 09:28:40,009][15401] Updated weights for policy 0, policy_version 857314 (0.0031) [2024-06-25 09:28:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14046380032. Throughput: 0: 42868.4. Samples: 14046494200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 09:28:43,656][15401] Updated weights for policy 0, policy_version 857324 (0.0031) [2024-06-25 09:28:47,692][15401] Updated weights for policy 0, policy_version 857334 (0.0039) [2024-06-25 09:28:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14046593024. Throughput: 0: 42505.9. Samples: 14046743280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:48,392][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 09:28:51,311][15401] Updated weights for policy 0, policy_version 857344 (0.0033) [2024-06-25 09:28:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14046789632. Throughput: 0: 42789.4. Samples: 14046875460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 09:28:55,211][15401] Updated weights for policy 0, policy_version 857354 (0.0046) [2024-06-25 09:28:58,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14047002624. Throughput: 0: 42857.2. Samples: 14047135720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:28:58,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 09:28:58,939][15401] Updated weights for policy 0, policy_version 857364 (0.0028) [2024-06-25 09:29:03,119][15401] Updated weights for policy 0, policy_version 857374 (0.0033) [2024-06-25 09:29:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14047232000. Throughput: 0: 42864.0. Samples: 14047391920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:29:03,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 09:29:06,121][15349] Signal inference workers to stop experience collection... (208100 times) [2024-06-25 09:29:06,121][15349] Signal inference workers to resume experience collection... (208100 times) [2024-06-25 09:29:06,152][15401] InferenceWorker_p0-w0: stopping experience collection (208100 times) [2024-06-25 09:29:06,152][15401] InferenceWorker_p0-w0: resuming experience collection (208100 times) [2024-06-25 09:29:06,419][15401] Updated weights for policy 0, policy_version 857384 (0.0031) [2024-06-25 09:29:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14047444992. Throughput: 0: 42940.0. Samples: 14047519260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:29:08,391][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 09:29:10,693][15401] Updated weights for policy 0, policy_version 857394 (0.0029) [2024-06-25 09:29:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42329.8, 300 sec: 42709.5). Total num frames: 14047641600. Throughput: 0: 42740.3. Samples: 14047773980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:29:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 09:29:14,009][15401] Updated weights for policy 0, policy_version 857404 (0.0037) [2024-06-25 09:29:18,266][15401] Updated weights for policy 0, policy_version 857414 (0.0024) [2024-06-25 09:29:18,391][15132] Fps is (10 sec: 42593.6, 60 sec: 43143.6, 300 sec: 42764.8). Total num frames: 14047870976. Throughput: 0: 42779.8. Samples: 14048030680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-06-25 09:29:18,391][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 09:29:21,811][15401] Updated weights for policy 0, policy_version 857424 (0.0030) [2024-06-25 09:29:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14048067584. Throughput: 0: 43003.6. Samples: 14048161020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 09:29:25,852][15401] Updated weights for policy 0, policy_version 857434 (0.0027) [2024-06-25 09:29:28,390][15132] Fps is (10 sec: 42601.6, 60 sec: 42598.1, 300 sec: 42765.0). Total num frames: 14048296960. Throughput: 0: 42643.7. Samples: 14048413180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 09:29:29,982][15401] Updated weights for policy 0, policy_version 857444 (0.0040) [2024-06-25 09:29:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14048493568. Throughput: 0: 42848.5. Samples: 14048671360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:33,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 09:29:33,610][15401] Updated weights for policy 0, policy_version 857454 (0.0035) [2024-06-25 09:29:37,570][15401] Updated weights for policy 0, policy_version 857464 (0.0028) [2024-06-25 09:29:38,389][15132] Fps is (10 sec: 42600.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14048722944. Throughput: 0: 42669.7. Samples: 14048795600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 09:29:41,191][15401] Updated weights for policy 0, policy_version 857474 (0.0033) [2024-06-25 09:29:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14048919552. Throughput: 0: 42515.1. Samples: 14049048900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:43,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 09:29:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857479_14048935936.pth... [2024-06-25 09:29:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000856853_14038679552.pth [2024-06-25 09:29:45,233][15401] Updated weights for policy 0, policy_version 857484 (0.0043) [2024-06-25 09:29:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 14049132544. Throughput: 0: 42550.6. Samples: 14049306700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:48,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 09:29:49,142][15401] Updated weights for policy 0, policy_version 857494 (0.0039) [2024-06-25 09:29:52,799][15401] Updated weights for policy 0, policy_version 857504 (0.0040) [2024-06-25 09:29:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14049378304. Throughput: 0: 42524.0. Samples: 14049432840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 09:29:56,914][15401] Updated weights for policy 0, policy_version 857514 (0.0030) [2024-06-25 09:29:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14049574912. Throughput: 0: 42601.0. Samples: 14049691020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:29:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 09:30:00,353][15401] Updated weights for policy 0, policy_version 857524 (0.0034) [2024-06-25 09:30:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 14049771520. Throughput: 0: 42590.9. Samples: 14049947220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:03,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 09:30:04,546][15401] Updated weights for policy 0, policy_version 857534 (0.0029) [2024-06-25 09:30:08,058][15401] Updated weights for policy 0, policy_version 857544 (0.0032) [2024-06-25 09:30:08,393][15132] Fps is (10 sec: 44219.5, 60 sec: 42868.7, 300 sec: 42764.8). Total num frames: 14050017280. Throughput: 0: 42518.0. Samples: 14050074500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:08,394][15132] Avg episode reward: [(0, '0.099')] [2024-06-25 09:30:12,400][15401] Updated weights for policy 0, policy_version 857554 (0.0034) [2024-06-25 09:30:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14050213888. Throughput: 0: 42712.8. Samples: 14050335240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 09:30:15,757][15401] Updated weights for policy 0, policy_version 857564 (0.0027) [2024-06-25 09:30:18,390][15132] Fps is (10 sec: 40975.7, 60 sec: 42599.1, 300 sec: 42653.9). Total num frames: 14050426880. Throughput: 0: 42518.1. Samples: 14050584680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:18,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 09:30:19,913][15401] Updated weights for policy 0, policy_version 857574 (0.0041) [2024-06-25 09:30:23,380][15401] Updated weights for policy 0, policy_version 857584 (0.0025) [2024-06-25 09:30:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14050656256. Throughput: 0: 42656.4. Samples: 14050715140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 09:30:27,559][15401] Updated weights for policy 0, policy_version 857594 (0.0041) [2024-06-25 09:30:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.6, 300 sec: 42709.4). Total num frames: 14050852864. Throughput: 0: 42772.7. Samples: 14050973680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 09:30:31,188][15401] Updated weights for policy 0, policy_version 857604 (0.0028) [2024-06-25 09:30:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14051082240. Throughput: 0: 42582.8. Samples: 14051222920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:33,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 09:30:35,244][15401] Updated weights for policy 0, policy_version 857614 (0.0039) [2024-06-25 09:30:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14051262464. Throughput: 0: 42716.7. Samples: 14051355100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 09:30:39,094][15401] Updated weights for policy 0, policy_version 857624 (0.0023) [2024-06-25 09:30:43,226][15349] Signal inference workers to stop experience collection... (208150 times) [2024-06-25 09:30:43,259][15401] InferenceWorker_p0-w0: stopping experience collection (208150 times) [2024-06-25 09:30:43,343][15349] Signal inference workers to resume experience collection... (208150 times) [2024-06-25 09:30:43,343][15401] InferenceWorker_p0-w0: resuming experience collection (208150 times) [2024-06-25 09:30:43,345][15401] Updated weights for policy 0, policy_version 857634 (0.0033) [2024-06-25 09:30:43,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14051475456. Throughput: 0: 42487.6. Samples: 14051602960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 09:30:46,965][15401] Updated weights for policy 0, policy_version 857644 (0.0030) [2024-06-25 09:30:48,390][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14051721216. Throughput: 0: 42404.4. Samples: 14051855420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:48,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 09:30:51,039][15401] Updated weights for policy 0, policy_version 857654 (0.0041) [2024-06-25 09:30:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14051901440. Throughput: 0: 42580.7. Samples: 14051990460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 09:30:54,563][15401] Updated weights for policy 0, policy_version 857664 (0.0033) [2024-06-25 09:30:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 14052114432. Throughput: 0: 42446.3. Samples: 14052245320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:30:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 09:30:58,473][15401] Updated weights for policy 0, policy_version 857674 (0.0045) [2024-06-25 09:31:02,239][15401] Updated weights for policy 0, policy_version 857684 (0.0041) [2024-06-25 09:31:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14052343808. Throughput: 0: 42561.3. Samples: 14052499940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:31:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 09:31:06,137][15401] Updated weights for policy 0, policy_version 857694 (0.0043) [2024-06-25 09:31:08,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42328.0, 300 sec: 42765.3). Total num frames: 14052556800. Throughput: 0: 42602.1. Samples: 14052632240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 09:31:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 09:31:09,727][15401] Updated weights for policy 0, policy_version 857704 (0.0029) [2024-06-25 09:31:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14052769792. Throughput: 0: 42531.7. Samples: 14052887700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:13,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 09:31:13,606][15401] Updated weights for policy 0, policy_version 857714 (0.0034) [2024-06-25 09:31:17,189][15401] Updated weights for policy 0, policy_version 857724 (0.0038) [2024-06-25 09:31:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14052982784. Throughput: 0: 42739.4. Samples: 14053146200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:18,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 09:31:21,111][15401] Updated weights for policy 0, policy_version 857734 (0.0042) [2024-06-25 09:31:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42709.7). Total num frames: 14053195776. Throughput: 0: 42645.5. Samples: 14053274140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 09:31:24,946][15401] Updated weights for policy 0, policy_version 857744 (0.0039) [2024-06-25 09:31:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14053408768. Throughput: 0: 42808.4. Samples: 14053529340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:28,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 09:31:29,000][15401] Updated weights for policy 0, policy_version 857754 (0.0037) [2024-06-25 09:31:32,543][15401] Updated weights for policy 0, policy_version 857764 (0.0026) [2024-06-25 09:31:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14053621760. Throughput: 0: 42883.1. Samples: 14053785160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:33,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 09:31:37,090][15401] Updated weights for policy 0, policy_version 857774 (0.0033) [2024-06-25 09:31:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14053834752. Throughput: 0: 42780.8. Samples: 14053915600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:31:40,215][15401] Updated weights for policy 0, policy_version 857784 (0.0032) [2024-06-25 09:31:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14054064128. Throughput: 0: 42815.4. Samples: 14054172020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:43,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 09:31:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857792_14054064128.pth... [2024-06-25 09:31:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857166_14043807744.pth [2024-06-25 09:31:44,959][15401] Updated weights for policy 0, policy_version 857794 (0.0035) [2024-06-25 09:31:47,767][15401] Updated weights for policy 0, policy_version 857804 (0.0047) [2024-06-25 09:31:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14054260736. Throughput: 0: 42736.5. Samples: 14054423080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:48,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 09:31:52,465][15401] Updated weights for policy 0, policy_version 857814 (0.0030) [2024-06-25 09:31:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14054473728. Throughput: 0: 42707.6. Samples: 14054554080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:53,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 09:31:55,645][15401] Updated weights for policy 0, policy_version 857824 (0.0032) [2024-06-25 09:31:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14054703104. Throughput: 0: 42756.4. Samples: 14054811640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:31:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 09:32:00,053][15401] Updated weights for policy 0, policy_version 857834 (0.0025) [2024-06-25 09:32:03,387][15401] Updated weights for policy 0, policy_version 857844 (0.0024) [2024-06-25 09:32:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 14054916096. Throughput: 0: 42804.0. Samples: 14055072480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:03,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 09:32:07,747][15401] Updated weights for policy 0, policy_version 857854 (0.0049) [2024-06-25 09:32:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14055096320. Throughput: 0: 42650.6. Samples: 14055193420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:08,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 09:32:10,995][15349] Signal inference workers to stop experience collection... (208200 times) [2024-06-25 09:32:11,030][15401] InferenceWorker_p0-w0: stopping experience collection (208200 times) [2024-06-25 09:32:11,065][15349] Signal inference workers to resume experience collection... (208200 times) [2024-06-25 09:32:11,068][15401] InferenceWorker_p0-w0: resuming experience collection (208200 times) [2024-06-25 09:32:11,216][15401] Updated weights for policy 0, policy_version 857864 (0.0030) [2024-06-25 09:32:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 14055358464. Throughput: 0: 42696.4. Samples: 14055450680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:13,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 09:32:15,598][15401] Updated weights for policy 0, policy_version 857874 (0.0035) [2024-06-25 09:32:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14055538688. Throughput: 0: 42847.5. Samples: 14055713300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 09:32:18,861][15401] Updated weights for policy 0, policy_version 857884 (0.0042) [2024-06-25 09:32:23,133][15401] Updated weights for policy 0, policy_version 857894 (0.0028) [2024-06-25 09:32:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14055751680. Throughput: 0: 42644.5. Samples: 14055834600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:23,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 09:32:26,492][15401] Updated weights for policy 0, policy_version 857904 (0.0024) [2024-06-25 09:32:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14055981056. Throughput: 0: 42722.3. Samples: 14056094520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 09:32:30,559][15401] Updated weights for policy 0, policy_version 857914 (0.0024) [2024-06-25 09:32:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 14056194048. Throughput: 0: 42951.9. Samples: 14056355920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:33,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 09:32:34,182][15401] Updated weights for policy 0, policy_version 857924 (0.0031) [2024-06-25 09:32:38,259][15401] Updated weights for policy 0, policy_version 857934 (0.0027) [2024-06-25 09:32:38,394][15132] Fps is (10 sec: 40943.1, 60 sec: 42595.5, 300 sec: 42653.3). Total num frames: 14056390656. Throughput: 0: 42795.8. Samples: 14056480060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:38,394][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 09:32:41,813][15401] Updated weights for policy 0, policy_version 857944 (0.0035) [2024-06-25 09:32:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14056636416. Throughput: 0: 42768.4. Samples: 14056736220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 09:32:45,852][15401] Updated weights for policy 0, policy_version 857954 (0.0039) [2024-06-25 09:32:48,390][15132] Fps is (10 sec: 44254.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14056833024. Throughput: 0: 42643.2. Samples: 14056991320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 09:32:49,475][15401] Updated weights for policy 0, policy_version 857964 (0.0024) [2024-06-25 09:32:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14057029632. Throughput: 0: 42799.9. Samples: 14057119420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:53,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 09:32:53,595][15401] Updated weights for policy 0, policy_version 857974 (0.0034) [2024-06-25 09:32:57,199][15401] Updated weights for policy 0, policy_version 857984 (0.0032) [2024-06-25 09:32:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14057259008. Throughput: 0: 42777.1. Samples: 14057375640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 09:32:58,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 09:33:01,109][15401] Updated weights for policy 0, policy_version 857994 (0.0036) [2024-06-25 09:33:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14057472000. Throughput: 0: 42527.0. Samples: 14057627020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 09:33:04,814][15401] Updated weights for policy 0, policy_version 858004 (0.0025) [2024-06-25 09:33:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42599.3). Total num frames: 14057668608. Throughput: 0: 42724.5. Samples: 14057757200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:08,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 09:33:08,727][15401] Updated weights for policy 0, policy_version 858014 (0.0032) [2024-06-25 09:33:12,401][15401] Updated weights for policy 0, policy_version 858024 (0.0035) [2024-06-25 09:33:13,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14057897984. Throughput: 0: 42576.9. Samples: 14058010480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 09:33:16,418][15401] Updated weights for policy 0, policy_version 858034 (0.0032) [2024-06-25 09:33:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14058078208. Throughput: 0: 42509.5. Samples: 14058268840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 09:33:20,362][15401] Updated weights for policy 0, policy_version 858044 (0.0041) [2024-06-25 09:33:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42598.3). Total num frames: 14058307584. Throughput: 0: 42440.5. Samples: 14058389720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 09:33:24,097][15401] Updated weights for policy 0, policy_version 858054 (0.0051) [2024-06-25 09:33:28,376][15401] Updated weights for policy 0, policy_version 858064 (0.0035) [2024-06-25 09:33:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14058520576. Throughput: 0: 42375.5. Samples: 14058643120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 09:33:31,771][15401] Updated weights for policy 0, policy_version 858074 (0.0031) [2024-06-25 09:33:33,390][15132] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14058749952. Throughput: 0: 42504.0. Samples: 14058904000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:33,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 09:33:35,882][15401] Updated weights for policy 0, policy_version 858084 (0.0030) [2024-06-25 09:33:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42601.4, 300 sec: 42598.4). Total num frames: 14058946560. Throughput: 0: 42593.1. Samples: 14059036100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 09:33:39,147][15401] Updated weights for policy 0, policy_version 858094 (0.0028) [2024-06-25 09:33:41,711][15349] Signal inference workers to stop experience collection... (208250 times) [2024-06-25 09:33:41,711][15349] Signal inference workers to resume experience collection... (208250 times) [2024-06-25 09:33:41,749][15401] InferenceWorker_p0-w0: stopping experience collection (208250 times) [2024-06-25 09:33:41,749][15401] InferenceWorker_p0-w0: resuming experience collection (208250 times) [2024-06-25 09:33:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 14059159552. Throughput: 0: 42515.0. Samples: 14059288820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 09:33:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858104_14059175936.pth... [2024-06-25 09:33:43,482][15401] Updated weights for policy 0, policy_version 858104 (0.0029) [2024-06-25 09:33:43,548][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857479_14048935936.pth [2024-06-25 09:33:46,765][15401] Updated weights for policy 0, policy_version 858114 (0.0037) [2024-06-25 09:33:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14059372544. Throughput: 0: 42720.1. Samples: 14059549420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:48,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 09:33:51,076][15401] Updated weights for policy 0, policy_version 858124 (0.0023) [2024-06-25 09:33:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14059601920. Throughput: 0: 42598.5. Samples: 14059674140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:53,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 09:33:54,493][15401] Updated weights for policy 0, policy_version 858134 (0.0044) [2024-06-25 09:33:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14059798528. Throughput: 0: 42616.4. Samples: 14059928220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:33:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 09:33:58,756][15401] Updated weights for policy 0, policy_version 858144 (0.0031) [2024-06-25 09:34:02,257][15401] Updated weights for policy 0, policy_version 858154 (0.0035) [2024-06-25 09:34:03,394][15132] Fps is (10 sec: 42580.0, 60 sec: 42595.4, 300 sec: 42653.3). Total num frames: 14060027904. Throughput: 0: 42688.6. Samples: 14060190020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:03,394][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 09:34:06,242][15401] Updated weights for policy 0, policy_version 858164 (0.0041) [2024-06-25 09:34:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14060240896. Throughput: 0: 42879.8. Samples: 14060319300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 09:34:10,032][15401] Updated weights for policy 0, policy_version 858174 (0.0043) [2024-06-25 09:34:13,389][15132] Fps is (10 sec: 42617.6, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 14060453888. Throughput: 0: 43033.5. Samples: 14060579620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 09:34:13,860][15401] Updated weights for policy 0, policy_version 858184 (0.0035) [2024-06-25 09:34:17,739][15401] Updated weights for policy 0, policy_version 858194 (0.0032) [2024-06-25 09:34:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14060666880. Throughput: 0: 42680.5. Samples: 14060824620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:18,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 09:34:21,635][15401] Updated weights for policy 0, policy_version 858204 (0.0026) [2024-06-25 09:34:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14060879872. Throughput: 0: 42715.4. Samples: 14060958300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:23,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 09:34:25,350][15401] Updated weights for policy 0, policy_version 858214 (0.0035) [2024-06-25 09:34:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14061076480. Throughput: 0: 42789.0. Samples: 14061214320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 09:34:29,387][15401] Updated weights for policy 0, policy_version 858224 (0.0025) [2024-06-25 09:34:32,968][15401] Updated weights for policy 0, policy_version 858234 (0.0040) [2024-06-25 09:34:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14061322240. Throughput: 0: 42507.2. Samples: 14061462240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:33,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 09:34:37,329][15401] Updated weights for policy 0, policy_version 858244 (0.0037) [2024-06-25 09:34:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14061518848. Throughput: 0: 42753.8. Samples: 14061598060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:38,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 09:34:40,614][15401] Updated weights for policy 0, policy_version 858254 (0.0033) [2024-06-25 09:34:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14061715456. Throughput: 0: 42751.5. Samples: 14061852040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 09:34:44,947][15401] Updated weights for policy 0, policy_version 858264 (0.0045) [2024-06-25 09:34:48,371][15401] Updated weights for policy 0, policy_version 858274 (0.0053) [2024-06-25 09:34:48,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 14061961216. Throughput: 0: 42448.6. Samples: 14062100120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:48,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 09:34:52,666][15401] Updated weights for policy 0, policy_version 858284 (0.0037) [2024-06-25 09:34:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14062141440. Throughput: 0: 42376.5. Samples: 14062226240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 09:34:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 09:34:56,437][15401] Updated weights for policy 0, policy_version 858294 (0.0043) [2024-06-25 09:34:58,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14062370816. Throughput: 0: 42180.8. Samples: 14062477760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:34:58,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 09:35:00,809][15401] Updated weights for policy 0, policy_version 858304 (0.0033) [2024-06-25 09:35:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42055.4, 300 sec: 42487.9). Total num frames: 14062551040. Throughput: 0: 42416.9. Samples: 14062733380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 09:35:04,035][15401] Updated weights for policy 0, policy_version 858314 (0.0032) [2024-06-25 09:35:07,684][15349] Signal inference workers to stop experience collection... (208300 times) [2024-06-25 09:35:07,684][15349] Signal inference workers to resume experience collection... (208300 times) [2024-06-25 09:35:07,703][15401] InferenceWorker_p0-w0: stopping experience collection (208300 times) [2024-06-25 09:35:07,735][15401] InferenceWorker_p0-w0: resuming experience collection (208300 times) [2024-06-25 09:35:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14062764032. Throughput: 0: 42267.3. Samples: 14062860320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 09:35:08,478][15401] Updated weights for policy 0, policy_version 858324 (0.0031) [2024-06-25 09:35:11,881][15401] Updated weights for policy 0, policy_version 858334 (0.0032) [2024-06-25 09:35:13,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 14063009792. Throughput: 0: 42119.2. Samples: 14063109780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:13,392][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 09:35:16,032][15401] Updated weights for policy 0, policy_version 858344 (0.0039) [2024-06-25 09:35:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14063206400. Throughput: 0: 42435.6. Samples: 14063371840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 09:35:19,556][15401] Updated weights for policy 0, policy_version 858354 (0.0040) [2024-06-25 09:35:23,389][15132] Fps is (10 sec: 40969.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14063419392. Throughput: 0: 42272.5. Samples: 14063500320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:23,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 09:35:23,514][15401] Updated weights for policy 0, policy_version 858364 (0.0027) [2024-06-25 09:35:27,205][15401] Updated weights for policy 0, policy_version 858374 (0.0030) [2024-06-25 09:35:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14063648768. Throughput: 0: 42405.3. Samples: 14063760280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 09:35:31,460][15401] Updated weights for policy 0, policy_version 858384 (0.0026) [2024-06-25 09:35:33,391][15132] Fps is (10 sec: 42589.8, 60 sec: 42050.8, 300 sec: 42653.7). Total num frames: 14063845376. Throughput: 0: 42549.3. Samples: 14064014820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:33,392][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 09:35:34,871][15401] Updated weights for policy 0, policy_version 858394 (0.0041) [2024-06-25 09:35:38,390][15132] Fps is (10 sec: 39319.8, 60 sec: 42052.0, 300 sec: 42598.3). Total num frames: 14064041984. Throughput: 0: 42616.8. Samples: 14064144020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:38,391][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 09:35:39,185][15401] Updated weights for policy 0, policy_version 858404 (0.0052) [2024-06-25 09:35:42,640][15401] Updated weights for policy 0, policy_version 858414 (0.0036) [2024-06-25 09:35:43,392][15132] Fps is (10 sec: 42596.7, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 14064271360. Throughput: 0: 42695.1. Samples: 14064399140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:43,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 09:35:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858415_14064271360.pth... [2024-06-25 09:35:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000857792_14054064128.pth [2024-06-25 09:35:46,728][15401] Updated weights for policy 0, policy_version 858424 (0.0037) [2024-06-25 09:35:48,389][15132] Fps is (10 sec: 44239.2, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 14064484352. Throughput: 0: 42624.4. Samples: 14064651480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 09:35:50,151][15401] Updated weights for policy 0, policy_version 858434 (0.0037) [2024-06-25 09:35:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14064697344. Throughput: 0: 42673.2. Samples: 14064780620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:53,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 09:35:54,623][15401] Updated weights for policy 0, policy_version 858444 (0.0045) [2024-06-25 09:35:57,671][15401] Updated weights for policy 0, policy_version 858454 (0.0043) [2024-06-25 09:35:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14064926720. Throughput: 0: 42733.7. Samples: 14065032700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:35:58,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 09:36:02,258][15401] Updated weights for policy 0, policy_version 858464 (0.0031) [2024-06-25 09:36:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14065139712. Throughput: 0: 42751.0. Samples: 14065295640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:03,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 09:36:05,814][15401] Updated weights for policy 0, policy_version 858474 (0.0028) [2024-06-25 09:36:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 14065336320. Throughput: 0: 42708.1. Samples: 14065422180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:08,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 09:36:10,012][15401] Updated weights for policy 0, policy_version 858484 (0.0035) [2024-06-25 09:36:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 14065549312. Throughput: 0: 42565.9. Samples: 14065675740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 09:36:13,523][15401] Updated weights for policy 0, policy_version 858494 (0.0034) [2024-06-25 09:36:17,899][15401] Updated weights for policy 0, policy_version 858504 (0.0029) [2024-06-25 09:36:18,390][15132] Fps is (10 sec: 42596.1, 60 sec: 42598.1, 300 sec: 42598.3). Total num frames: 14065762304. Throughput: 0: 42632.6. Samples: 14065933220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:18,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 09:36:21,063][15401] Updated weights for policy 0, policy_version 858514 (0.0039) [2024-06-25 09:36:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14065958912. Throughput: 0: 42488.9. Samples: 14066056000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 09:36:25,436][15401] Updated weights for policy 0, policy_version 858524 (0.0029) [2024-06-25 09:36:28,325][15349] Signal inference workers to stop experience collection... (208350 times) [2024-06-25 09:36:28,360][15401] InferenceWorker_p0-w0: stopping experience collection (208350 times) [2024-06-25 09:36:28,389][15132] Fps is (10 sec: 40961.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14066171904. Throughput: 0: 42501.4. Samples: 14066311600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 09:36:28,393][15349] Signal inference workers to resume experience collection... (208350 times) [2024-06-25 09:36:28,396][15401] InferenceWorker_p0-w0: resuming experience collection (208350 times) [2024-06-25 09:36:28,687][15401] Updated weights for policy 0, policy_version 858534 (0.0025) [2024-06-25 09:36:32,941][15401] Updated weights for policy 0, policy_version 858544 (0.0038) [2024-06-25 09:36:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42872.8, 300 sec: 42653.9). Total num frames: 14066417664. Throughput: 0: 42815.5. Samples: 14066578180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 09:36:36,076][15401] Updated weights for policy 0, policy_version 858554 (0.0037) [2024-06-25 09:36:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.9, 300 sec: 42542.9). Total num frames: 14066614272. Throughput: 0: 42640.5. Samples: 14066699440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:38,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 09:36:40,341][15401] Updated weights for policy 0, policy_version 858564 (0.0042) [2024-06-25 09:36:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 14066860032. Throughput: 0: 42896.9. Samples: 14066963060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 09:36:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 09:36:43,552][15401] Updated weights for policy 0, policy_version 858574 (0.0023) [2024-06-25 09:36:47,870][15401] Updated weights for policy 0, policy_version 858584 (0.0030) [2024-06-25 09:36:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14067073024. Throughput: 0: 42847.6. Samples: 14067223780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:36:48,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 09:36:50,934][15401] Updated weights for policy 0, policy_version 858594 (0.0031) [2024-06-25 09:36:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14067269632. Throughput: 0: 42884.3. Samples: 14067351980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:36:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 09:36:55,252][15401] Updated weights for policy 0, policy_version 858604 (0.0052) [2024-06-25 09:36:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 14067515392. Throughput: 0: 43182.1. Samples: 14067618940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:36:58,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 09:36:58,804][15401] Updated weights for policy 0, policy_version 858614 (0.0038) [2024-06-25 09:37:02,977][15401] Updated weights for policy 0, policy_version 858624 (0.0030) [2024-06-25 09:37:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14067728384. Throughput: 0: 43033.2. Samples: 14067869700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 09:37:06,286][15401] Updated weights for policy 0, policy_version 858634 (0.0030) [2024-06-25 09:37:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14067924992. Throughput: 0: 43135.5. Samples: 14067997100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:08,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 09:37:10,522][15401] Updated weights for policy 0, policy_version 858644 (0.0038) [2024-06-25 09:37:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14068137984. Throughput: 0: 43269.3. Samples: 14068258720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:13,398][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 09:37:14,186][15401] Updated weights for policy 0, policy_version 858654 (0.0035) [2024-06-25 09:37:18,130][15401] Updated weights for policy 0, policy_version 858664 (0.0034) [2024-06-25 09:37:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.9, 300 sec: 42765.0). Total num frames: 14068367360. Throughput: 0: 42970.3. Samples: 14068511840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 09:37:22,036][15401] Updated weights for policy 0, policy_version 858674 (0.0034) [2024-06-25 09:37:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14068547584. Throughput: 0: 43187.6. Samples: 14068642880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 09:37:25,721][15401] Updated weights for policy 0, policy_version 858684 (0.0026) [2024-06-25 09:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 42709.5). Total num frames: 14068793344. Throughput: 0: 43011.1. Samples: 14068898560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 09:37:29,742][15401] Updated weights for policy 0, policy_version 858694 (0.0025) [2024-06-25 09:37:32,279][15349] Signal inference workers to stop experience collection... (208400 times) [2024-06-25 09:37:32,287][15349] Signal inference workers to resume experience collection... (208400 times) [2024-06-25 09:37:32,302][15401] InferenceWorker_p0-w0: stopping experience collection (208400 times) [2024-06-25 09:37:32,302][15401] InferenceWorker_p0-w0: resuming experience collection (208400 times) [2024-06-25 09:37:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42710.1). Total num frames: 14068989952. Throughput: 0: 42958.2. Samples: 14069156900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 09:37:33,442][15401] Updated weights for policy 0, policy_version 858704 (0.0035) [2024-06-25 09:37:37,142][15401] Updated weights for policy 0, policy_version 858714 (0.0028) [2024-06-25 09:37:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14069202944. Throughput: 0: 42937.0. Samples: 14069284140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 09:37:40,915][15401] Updated weights for policy 0, policy_version 858724 (0.0032) [2024-06-25 09:37:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14069432320. Throughput: 0: 42742.4. Samples: 14069542340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 09:37:43,473][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858731_14069448704.pth... [2024-06-25 09:37:43,529][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858104_14059175936.pth [2024-06-25 09:37:44,748][15401] Updated weights for policy 0, policy_version 858734 (0.0026) [2024-06-25 09:37:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14069628928. Throughput: 0: 42991.2. Samples: 14069804300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 09:37:49,070][15401] Updated weights for policy 0, policy_version 858744 (0.0034) [2024-06-25 09:37:52,282][15401] Updated weights for policy 0, policy_version 858754 (0.0036) [2024-06-25 09:37:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14069858304. Throughput: 0: 42809.5. Samples: 14069923520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 09:37:56,565][15401] Updated weights for policy 0, policy_version 858764 (0.0036) [2024-06-25 09:37:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14070087680. Throughput: 0: 42921.8. Samples: 14070190200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:37:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 09:38:00,405][15401] Updated weights for policy 0, policy_version 858774 (0.0036) [2024-06-25 09:38:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14070267904. Throughput: 0: 42916.0. Samples: 14070443060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 09:38:04,650][15401] Updated weights for policy 0, policy_version 858784 (0.0035) [2024-06-25 09:38:07,915][15401] Updated weights for policy 0, policy_version 858794 (0.0046) [2024-06-25 09:38:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14070497280. Throughput: 0: 42595.0. Samples: 14070559660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:08,393][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 09:38:12,346][15401] Updated weights for policy 0, policy_version 858804 (0.0029) [2024-06-25 09:38:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14070710272. Throughput: 0: 42819.1. Samples: 14070825420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:13,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 09:38:15,924][15401] Updated weights for policy 0, policy_version 858814 (0.0040) [2024-06-25 09:38:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 14070890496. Throughput: 0: 42715.9. Samples: 14071079220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:18,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 09:38:20,015][15401] Updated weights for policy 0, policy_version 858824 (0.0033) [2024-06-25 09:38:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14071136256. Throughput: 0: 42543.2. Samples: 14071198580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 09:38:23,401][15401] Updated weights for policy 0, policy_version 858834 (0.0042) [2024-06-25 09:38:27,565][15401] Updated weights for policy 0, policy_version 858844 (0.0032) [2024-06-25 09:38:28,390][15132] Fps is (10 sec: 47524.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14071365632. Throughput: 0: 42842.1. Samples: 14071470240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 09:38:30,797][15401] Updated weights for policy 0, policy_version 858854 (0.0037) [2024-06-25 09:38:33,389][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14071529472. Throughput: 0: 42659.5. Samples: 14071723980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 09:38:33,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 09:38:35,214][15401] Updated weights for policy 0, policy_version 858864 (0.0033) [2024-06-25 09:38:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14071775232. Throughput: 0: 42621.3. Samples: 14071841480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:38:38,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 09:38:38,521][15401] Updated weights for policy 0, policy_version 858874 (0.0038) [2024-06-25 09:38:42,879][15401] Updated weights for policy 0, policy_version 858884 (0.0036) [2024-06-25 09:38:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14071988224. Throughput: 0: 42617.6. Samples: 14072108000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:38:43,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 09:38:44,195][15349] Signal inference workers to stop experience collection... (208450 times) [2024-06-25 09:38:44,233][15401] InferenceWorker_p0-w0: stopping experience collection (208450 times) [2024-06-25 09:38:44,242][15349] Signal inference workers to resume experience collection... (208450 times) [2024-06-25 09:38:44,249][15401] InferenceWorker_p0-w0: resuming experience collection (208450 times) [2024-06-25 09:38:46,749][15401] Updated weights for policy 0, policy_version 858894 (0.0041) [2024-06-25 09:38:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14072184832. Throughput: 0: 42513.3. Samples: 14072356160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:38:48,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 09:38:50,721][15401] Updated weights for policy 0, policy_version 858904 (0.0028) [2024-06-25 09:38:53,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14072414208. Throughput: 0: 42696.1. Samples: 14072480980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:38:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 09:38:54,186][15401] Updated weights for policy 0, policy_version 858914 (0.0041) [2024-06-25 09:38:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42599.0). Total num frames: 14072594432. Throughput: 0: 42406.7. Samples: 14072733720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:38:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 09:38:58,398][15401] Updated weights for policy 0, policy_version 858924 (0.0042) [2024-06-25 09:39:02,078][15401] Updated weights for policy 0, policy_version 858934 (0.0036) [2024-06-25 09:39:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14072823808. Throughput: 0: 42424.4. Samples: 14072988220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 09:39:06,010][15401] Updated weights for policy 0, policy_version 858944 (0.0041) [2024-06-25 09:39:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14073053184. Throughput: 0: 42525.3. Samples: 14073112220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 09:39:09,653][15401] Updated weights for policy 0, policy_version 858954 (0.0038) [2024-06-25 09:39:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14073249792. Throughput: 0: 42412.4. Samples: 14073378800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:13,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 09:39:13,646][15401] Updated weights for policy 0, policy_version 858964 (0.0037) [2024-06-25 09:39:17,526][15401] Updated weights for policy 0, policy_version 858974 (0.0041) [2024-06-25 09:39:18,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 14073479168. Throughput: 0: 42267.6. Samples: 14073626020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:18,392][15132] Avg episode reward: [(0, '0.163')] [2024-06-25 09:39:21,227][15401] Updated weights for policy 0, policy_version 858984 (0.0040) [2024-06-25 09:39:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14073692160. Throughput: 0: 42578.1. Samples: 14073757500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:23,394][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 09:39:25,174][15401] Updated weights for policy 0, policy_version 858994 (0.0027) [2024-06-25 09:39:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 14073856000. Throughput: 0: 42415.6. Samples: 14074016700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:28,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 09:39:29,066][15401] Updated weights for policy 0, policy_version 859004 (0.0026) [2024-06-25 09:39:32,716][15401] Updated weights for policy 0, policy_version 859014 (0.0037) [2024-06-25 09:39:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14074101760. Throughput: 0: 42480.1. Samples: 14074267760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 09:39:36,540][15401] Updated weights for policy 0, policy_version 859024 (0.0030) [2024-06-25 09:39:38,390][15132] Fps is (10 sec: 49152.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14074347520. Throughput: 0: 42740.4. Samples: 14074404300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:38,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 09:39:40,385][15401] Updated weights for policy 0, policy_version 859034 (0.0043) [2024-06-25 09:39:43,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 14074511360. Throughput: 0: 42718.8. Samples: 14074656080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 09:39:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859040_14074511360.pth... [2024-06-25 09:39:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858415_14064271360.pth [2024-06-25 09:39:44,265][15401] Updated weights for policy 0, policy_version 859044 (0.0032) [2024-06-25 09:39:48,085][15401] Updated weights for policy 0, policy_version 859054 (0.0035) [2024-06-25 09:39:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14074757120. Throughput: 0: 42699.1. Samples: 14074909680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 09:39:51,974][15401] Updated weights for policy 0, policy_version 859064 (0.0032) [2024-06-25 09:39:53,390][15132] Fps is (10 sec: 47515.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14074986496. Throughput: 0: 42885.6. Samples: 14075042080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:53,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 09:39:55,867][15401] Updated weights for policy 0, policy_version 859074 (0.0040) [2024-06-25 09:39:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14075150336. Throughput: 0: 42664.1. Samples: 14075298680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:39:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 09:39:59,652][15401] Updated weights for policy 0, policy_version 859084 (0.0027) [2024-06-25 09:40:03,386][15401] Updated weights for policy 0, policy_version 859094 (0.0032) [2024-06-25 09:40:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14075396096. Throughput: 0: 42845.2. Samples: 14075554060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:40:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 09:40:07,251][15401] Updated weights for policy 0, policy_version 859104 (0.0037) [2024-06-25 09:40:08,390][15132] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 14075641856. Throughput: 0: 42788.4. Samples: 14075682980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:40:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 09:40:11,073][15401] Updated weights for policy 0, policy_version 859114 (0.0035) [2024-06-25 09:40:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14075805696. Throughput: 0: 42571.2. Samples: 14075932400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:40:13,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 09:40:14,997][15401] Updated weights for policy 0, policy_version 859124 (0.0035) [2024-06-25 09:40:18,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14076018688. Throughput: 0: 42819.6. Samples: 14076194640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:40:18,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 09:40:19,046][15401] Updated weights for policy 0, policy_version 859134 (0.0035) [2024-06-25 09:40:22,603][15401] Updated weights for policy 0, policy_version 859144 (0.0040) [2024-06-25 09:40:23,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14076280832. Throughput: 0: 42590.3. Samples: 14076320860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-06-25 09:40:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 09:40:27,059][15401] Updated weights for policy 0, policy_version 859154 (0.0028) [2024-06-25 09:40:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 14076444672. Throughput: 0: 42652.3. Samples: 14076575420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 09:40:30,181][15349] Signal inference workers to stop experience collection... (208500 times) [2024-06-25 09:40:30,224][15401] InferenceWorker_p0-w0: stopping experience collection (208500 times) [2024-06-25 09:40:30,233][15349] Signal inference workers to resume experience collection... (208500 times) [2024-06-25 09:40:30,243][15401] InferenceWorker_p0-w0: resuming experience collection (208500 times) [2024-06-25 09:40:30,246][15401] Updated weights for policy 0, policy_version 859164 (0.0030) [2024-06-25 09:40:33,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 14076657664. Throughput: 0: 42654.7. Samples: 14076829140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 09:40:34,520][15401] Updated weights for policy 0, policy_version 859174 (0.0034) [2024-06-25 09:40:37,796][15401] Updated weights for policy 0, policy_version 859184 (0.0023) [2024-06-25 09:40:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 14076887040. Throughput: 0: 42549.4. Samples: 14076956800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 09:40:42,132][15401] Updated weights for policy 0, policy_version 859194 (0.0041) [2024-06-25 09:40:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.7, 300 sec: 42653.9). Total num frames: 14077067264. Throughput: 0: 42487.5. Samples: 14077210620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:43,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 09:40:45,291][15401] Updated weights for policy 0, policy_version 859204 (0.0035) [2024-06-25 09:40:48,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 14077296640. Throughput: 0: 42530.7. Samples: 14077468040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:48,393][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 09:40:50,049][15401] Updated weights for policy 0, policy_version 859214 (0.0046) [2024-06-25 09:40:52,978][15401] Updated weights for policy 0, policy_version 859224 (0.0035) [2024-06-25 09:40:53,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14077542400. Throughput: 0: 42581.8. Samples: 14077599160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:53,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 09:40:57,544][15401] Updated weights for policy 0, policy_version 859234 (0.0033) [2024-06-25 09:40:58,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14077706240. Throughput: 0: 42687.6. Samples: 14077853340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:40:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 09:41:00,622][15401] Updated weights for policy 0, policy_version 859244 (0.0032) [2024-06-25 09:41:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14077935616. Throughput: 0: 42668.8. Samples: 14078114740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 09:41:04,966][15401] Updated weights for policy 0, policy_version 859254 (0.0047) [2024-06-25 09:41:08,318][15401] Updated weights for policy 0, policy_version 859264 (0.0052) [2024-06-25 09:41:08,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14078181376. Throughput: 0: 42726.6. Samples: 14078243560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 09:41:12,970][15401] Updated weights for policy 0, policy_version 859274 (0.0023) [2024-06-25 09:41:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14078361600. Throughput: 0: 42717.4. Samples: 14078497700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 09:41:16,015][15401] Updated weights for policy 0, policy_version 859284 (0.0028) [2024-06-25 09:41:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14078574592. Throughput: 0: 42733.1. Samples: 14078752120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 09:41:20,446][15401] Updated weights for policy 0, policy_version 859294 (0.0035) [2024-06-25 09:41:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 14078803968. Throughput: 0: 42833.1. Samples: 14078884300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 09:41:23,793][15401] Updated weights for policy 0, policy_version 859304 (0.0030) [2024-06-25 09:41:27,947][15401] Updated weights for policy 0, policy_version 859314 (0.0037) [2024-06-25 09:41:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14079016960. Throughput: 0: 42815.4. Samples: 14079137320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 09:41:31,349][15401] Updated weights for policy 0, policy_version 859324 (0.0028) [2024-06-25 09:41:33,390][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14079229952. Throughput: 0: 42771.1. Samples: 14079392640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 09:41:35,588][15401] Updated weights for policy 0, policy_version 859334 (0.0036) [2024-06-25 09:41:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14079459328. Throughput: 0: 42749.9. Samples: 14079522900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 09:41:39,051][15401] Updated weights for policy 0, policy_version 859344 (0.0036) [2024-06-25 09:41:43,161][15401] Updated weights for policy 0, policy_version 859354 (0.0028) [2024-06-25 09:41:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14079672320. Throughput: 0: 42873.8. Samples: 14079782660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 09:41:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859355_14079672320.pth... [2024-06-25 09:41:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000858731_14069448704.pth [2024-06-25 09:41:45,527][15349] Signal inference workers to stop experience collection... (208550 times) [2024-06-25 09:41:45,528][15349] Signal inference workers to resume experience collection... (208550 times) [2024-06-25 09:41:45,563][15401] InferenceWorker_p0-w0: stopping experience collection (208550 times) [2024-06-25 09:41:45,563][15401] InferenceWorker_p0-w0: resuming experience collection (208550 times) [2024-06-25 09:41:46,783][15401] Updated weights for policy 0, policy_version 859364 (0.0033) [2024-06-25 09:41:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14079868928. Throughput: 0: 42788.4. Samples: 14080040220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:48,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 09:41:50,826][15401] Updated weights for policy 0, policy_version 859374 (0.0031) [2024-06-25 09:41:53,392][15132] Fps is (10 sec: 42589.2, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 14080098304. Throughput: 0: 42562.9. Samples: 14080158980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:53,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 09:41:54,610][15401] Updated weights for policy 0, policy_version 859384 (0.0027) [2024-06-25 09:41:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 14080294912. Throughput: 0: 42707.5. Samples: 14080419540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:41:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 09:41:58,442][15401] Updated weights for policy 0, policy_version 859394 (0.0027) [2024-06-25 09:42:02,291][15401] Updated weights for policy 0, policy_version 859404 (0.0038) [2024-06-25 09:42:03,390][15132] Fps is (10 sec: 40968.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14080507904. Throughput: 0: 42793.1. Samples: 14080677820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:42:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 09:42:05,893][15401] Updated weights for policy 0, policy_version 859414 (0.0038) [2024-06-25 09:42:08,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14080753664. Throughput: 0: 42684.2. Samples: 14080805180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:42:08,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 09:42:10,189][15401] Updated weights for policy 0, policy_version 859424 (0.0045) [2024-06-25 09:42:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14080950272. Throughput: 0: 42874.7. Samples: 14081066680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 09:42:13,391][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 09:42:14,006][15401] Updated weights for policy 0, policy_version 859434 (0.0032) [2024-06-25 09:42:17,765][15401] Updated weights for policy 0, policy_version 859444 (0.0029) [2024-06-25 09:42:18,389][15132] Fps is (10 sec: 40970.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14081163264. Throughput: 0: 42848.5. Samples: 14081320820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 09:42:21,763][15401] Updated weights for policy 0, policy_version 859454 (0.0044) [2024-06-25 09:42:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14081392640. Throughput: 0: 42796.7. Samples: 14081448760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:23,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 09:42:25,837][15401] Updated weights for policy 0, policy_version 859464 (0.0029) [2024-06-25 09:42:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14081605632. Throughput: 0: 42713.3. Samples: 14081704760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 09:42:29,412][15401] Updated weights for policy 0, policy_version 859474 (0.0045) [2024-06-25 09:42:33,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14081769472. Throughput: 0: 42793.3. Samples: 14081965920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 09:42:33,797][15401] Updated weights for policy 0, policy_version 859484 (0.0035) [2024-06-25 09:42:36,995][15401] Updated weights for policy 0, policy_version 859494 (0.0038) [2024-06-25 09:42:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14082031616. Throughput: 0: 42892.3. Samples: 14082089040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 09:42:41,340][15401] Updated weights for policy 0, policy_version 859504 (0.0038) [2024-06-25 09:42:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14082211840. Throughput: 0: 42860.9. Samples: 14082348280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:43,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 09:42:44,582][15401] Updated weights for policy 0, policy_version 859514 (0.0030) [2024-06-25 09:42:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14082424832. Throughput: 0: 42827.7. Samples: 14082605060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 09:42:48,911][15401] Updated weights for policy 0, policy_version 859524 (0.0041) [2024-06-25 09:42:52,296][15401] Updated weights for policy 0, policy_version 859534 (0.0029) [2024-06-25 09:42:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 14082670592. Throughput: 0: 42705.3. Samples: 14082726820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:53,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 09:42:56,611][15401] Updated weights for policy 0, policy_version 859544 (0.0033) [2024-06-25 09:42:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14082834432. Throughput: 0: 42582.2. Samples: 14082982880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:42:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:42:59,921][15401] Updated weights for policy 0, policy_version 859554 (0.0032) [2024-06-25 09:43:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14083063808. Throughput: 0: 42608.8. Samples: 14083238220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:03,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 09:43:04,279][15401] Updated weights for policy 0, policy_version 859564 (0.0045) [2024-06-25 09:43:07,599][15401] Updated weights for policy 0, policy_version 859574 (0.0034) [2024-06-25 09:43:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14083309568. Throughput: 0: 42653.5. Samples: 14083368160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 09:43:09,016][15349] Signal inference workers to stop experience collection... (208600 times) [2024-06-25 09:43:09,016][15349] Signal inference workers to resume experience collection... (208600 times) [2024-06-25 09:43:09,046][15401] InferenceWorker_p0-w0: stopping experience collection (208600 times) [2024-06-25 09:43:09,046][15401] InferenceWorker_p0-w0: resuming experience collection (208600 times) [2024-06-25 09:43:12,045][15401] Updated weights for policy 0, policy_version 859584 (0.0039) [2024-06-25 09:43:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 14083489792. Throughput: 0: 42600.9. Samples: 14083621800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:43:15,223][15401] Updated weights for policy 0, policy_version 859594 (0.0038) [2024-06-25 09:43:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14083719168. Throughput: 0: 42394.3. Samples: 14083873660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 09:43:19,678][15401] Updated weights for policy 0, policy_version 859604 (0.0023) [2024-06-25 09:43:22,795][15401] Updated weights for policy 0, policy_version 859614 (0.0037) [2024-06-25 09:43:23,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 14083964928. Throughput: 0: 42647.4. Samples: 14084008280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:23,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 09:43:27,218][15401] Updated weights for policy 0, policy_version 859624 (0.0031) [2024-06-25 09:43:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 14084112384. Throughput: 0: 42649.8. Samples: 14084267520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 09:43:30,553][15401] Updated weights for policy 0, policy_version 859634 (0.0031) [2024-06-25 09:43:33,389][15132] Fps is (10 sec: 39331.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14084358144. Throughput: 0: 42518.6. Samples: 14084518400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:33,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 09:43:34,879][15401] Updated weights for policy 0, policy_version 859644 (0.0035) [2024-06-25 09:43:38,047][15401] Updated weights for policy 0, policy_version 859654 (0.0036) [2024-06-25 09:43:38,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14084587520. Throughput: 0: 42815.1. Samples: 14084653500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:38,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 09:43:42,279][15401] Updated weights for policy 0, policy_version 859664 (0.0036) [2024-06-25 09:43:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14084767744. Throughput: 0: 42848.0. Samples: 14084911040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:43,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 09:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859666_14084767744.pth... [2024-06-25 09:43:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859040_14074511360.pth [2024-06-25 09:43:45,622][15401] Updated weights for policy 0, policy_version 859674 (0.0026) [2024-06-25 09:43:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14084997120. Throughput: 0: 42707.1. Samples: 14085160040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 09:43:50,132][15401] Updated weights for policy 0, policy_version 859684 (0.0033) [2024-06-25 09:43:53,281][15401] Updated weights for policy 0, policy_version 859694 (0.0031) [2024-06-25 09:43:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14085226496. Throughput: 0: 42828.8. Samples: 14085295460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 09:43:57,835][15401] Updated weights for policy 0, policy_version 859704 (0.0029) [2024-06-25 09:43:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 14085423104. Throughput: 0: 42709.7. Samples: 14085543840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:43:58,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 09:44:01,062][15401] Updated weights for policy 0, policy_version 859714 (0.0036) [2024-06-25 09:44:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 14085652480. Throughput: 0: 42864.7. Samples: 14085802580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:44:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 09:44:05,368][15401] Updated weights for policy 0, policy_version 859724 (0.0033) [2024-06-25 09:44:08,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14085865472. Throughput: 0: 42826.6. Samples: 14085935380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 09:44:08,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 09:44:08,653][15401] Updated weights for policy 0, policy_version 859734 (0.0028) [2024-06-25 09:44:12,800][15401] Updated weights for policy 0, policy_version 859744 (0.0029) [2024-06-25 09:44:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14086078464. Throughput: 0: 42815.5. Samples: 14086194220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:13,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 09:44:16,399][15401] Updated weights for policy 0, policy_version 859754 (0.0026) [2024-06-25 09:44:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14086275072. Throughput: 0: 42885.3. Samples: 14086448240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 09:44:20,146][15401] Updated weights for policy 0, policy_version 859764 (0.0039) [2024-06-25 09:44:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 14086488064. Throughput: 0: 42892.6. Samples: 14086583660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 09:44:23,414][15349] Signal inference workers to stop experience collection... (208650 times) [2024-06-25 09:44:23,422][15349] Signal inference workers to resume experience collection... (208650 times) [2024-06-25 09:44:23,464][15401] InferenceWorker_p0-w0: stopping experience collection (208650 times) [2024-06-25 09:44:23,464][15401] InferenceWorker_p0-w0: resuming experience collection (208650 times) [2024-06-25 09:44:23,801][15401] Updated weights for policy 0, policy_version 859774 (0.0029) [2024-06-25 09:44:28,083][15401] Updated weights for policy 0, policy_version 859784 (0.0044) [2024-06-25 09:44:28,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 14086701056. Throughput: 0: 42711.1. Samples: 14086833140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:28,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 09:44:31,302][15401] Updated weights for policy 0, policy_version 859794 (0.0037) [2024-06-25 09:44:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 14086914048. Throughput: 0: 42964.4. Samples: 14087093540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:33,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 09:44:35,652][15401] Updated weights for policy 0, policy_version 859804 (0.0040) [2024-06-25 09:44:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14087143424. Throughput: 0: 42793.4. Samples: 14087221160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:38,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 09:44:38,907][15401] Updated weights for policy 0, policy_version 859814 (0.0028) [2024-06-25 09:44:43,336][15401] Updated weights for policy 0, policy_version 859824 (0.0026) [2024-06-25 09:44:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14087356416. Throughput: 0: 43087.7. Samples: 14087482680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 09:44:46,945][15401] Updated weights for policy 0, policy_version 859834 (0.0036) [2024-06-25 09:44:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14087569408. Throughput: 0: 43000.6. Samples: 14087737600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:48,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 09:44:51,155][15401] Updated weights for policy 0, policy_version 859844 (0.0034) [2024-06-25 09:44:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14087798784. Throughput: 0: 42899.3. Samples: 14087865840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 09:44:54,627][15401] Updated weights for policy 0, policy_version 859854 (0.0041) [2024-06-25 09:44:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 14087979008. Throughput: 0: 42829.3. Samples: 14088121540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:44:58,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 09:44:58,991][15401] Updated weights for policy 0, policy_version 859864 (0.0037) [2024-06-25 09:45:02,374][15401] Updated weights for policy 0, policy_version 859874 (0.0030) [2024-06-25 09:45:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 14088208384. Throughput: 0: 42740.6. Samples: 14088371560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:03,390][15132] Avg episode reward: [(0, '0.173')] [2024-06-25 09:45:06,575][15401] Updated weights for policy 0, policy_version 859884 (0.0028) [2024-06-25 09:45:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14088404992. Throughput: 0: 42688.0. Samples: 14088504620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 09:45:09,959][15401] Updated weights for policy 0, policy_version 859894 (0.0027) [2024-06-25 09:45:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14088634368. Throughput: 0: 42799.6. Samples: 14088759120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:13,393][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 09:45:14,255][15401] Updated weights for policy 0, policy_version 859904 (0.0030) [2024-06-25 09:45:17,819][15401] Updated weights for policy 0, policy_version 859914 (0.0037) [2024-06-25 09:45:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14088863744. Throughput: 0: 42567.6. Samples: 14089008980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 09:45:21,769][15401] Updated weights for policy 0, policy_version 859924 (0.0026) [2024-06-25 09:45:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14089060352. Throughput: 0: 42703.0. Samples: 14089142800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 09:45:25,452][15401] Updated weights for policy 0, policy_version 859934 (0.0044) [2024-06-25 09:45:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14089256960. Throughput: 0: 42462.2. Samples: 14089393480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 09:45:29,387][15401] Updated weights for policy 0, policy_version 859944 (0.0042) [2024-06-25 09:45:33,101][15401] Updated weights for policy 0, policy_version 859954 (0.0031) [2024-06-25 09:45:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 14089502720. Throughput: 0: 42439.5. Samples: 14089647380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:33,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 09:45:37,222][15401] Updated weights for policy 0, policy_version 859964 (0.0036) [2024-06-25 09:45:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14089699328. Throughput: 0: 42556.4. Samples: 14089780880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 09:45:40,847][15401] Updated weights for policy 0, policy_version 859974 (0.0033) [2024-06-25 09:45:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 14089912320. Throughput: 0: 42470.1. Samples: 14090032700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 09:45:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859980_14089912320.pth... [2024-06-25 09:45:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859355_14079672320.pth [2024-06-25 09:45:44,804][15401] Updated weights for policy 0, policy_version 859984 (0.0045) [2024-06-25 09:45:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14090125312. Throughput: 0: 42587.9. Samples: 14090288120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:48,393][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 09:45:48,732][15401] Updated weights for policy 0, policy_version 859994 (0.0044) [2024-06-25 09:45:52,645][15401] Updated weights for policy 0, policy_version 860004 (0.0041) [2024-06-25 09:45:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14090354688. Throughput: 0: 42674.1. Samples: 14090424960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 09:45:56,409][15401] Updated weights for policy 0, policy_version 860014 (0.0028) [2024-06-25 09:45:58,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14090551296. Throughput: 0: 42542.3. Samples: 14090673520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 09:45:58,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 09:45:59,744][15349] Signal inference workers to stop experience collection... (208700 times) [2024-06-25 09:45:59,773][15401] InferenceWorker_p0-w0: stopping experience collection (208700 times) [2024-06-25 09:45:59,800][15349] Signal inference workers to resume experience collection... (208700 times) [2024-06-25 09:45:59,804][15401] InferenceWorker_p0-w0: resuming experience collection (208700 times) [2024-06-25 09:46:00,164][15401] Updated weights for policy 0, policy_version 860024 (0.0028) [2024-06-25 09:46:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14090764288. Throughput: 0: 42766.7. Samples: 14090933480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 09:46:03,914][15401] Updated weights for policy 0, policy_version 860034 (0.0034) [2024-06-25 09:46:08,112][15401] Updated weights for policy 0, policy_version 860044 (0.0029) [2024-06-25 09:46:08,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14090977280. Throughput: 0: 42751.7. Samples: 14091066620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:08,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 09:46:11,412][15401] Updated weights for policy 0, policy_version 860054 (0.0044) [2024-06-25 09:46:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14091190272. Throughput: 0: 42639.9. Samples: 14091312280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:13,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 09:46:15,823][15401] Updated weights for policy 0, policy_version 860064 (0.0041) [2024-06-25 09:46:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14091403264. Throughput: 0: 42713.8. Samples: 14091569500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 09:46:19,185][15401] Updated weights for policy 0, policy_version 860074 (0.0043) [2024-06-25 09:46:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14091599872. Throughput: 0: 42697.7. Samples: 14091702280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:23,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 09:46:23,478][15401] Updated weights for policy 0, policy_version 860084 (0.0032) [2024-06-25 09:46:26,655][15401] Updated weights for policy 0, policy_version 860094 (0.0031) [2024-06-25 09:46:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14091829248. Throughput: 0: 42663.6. Samples: 14091952660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:28,392][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 09:46:31,033][15401] Updated weights for policy 0, policy_version 860104 (0.0032) [2024-06-25 09:46:33,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 14092058624. Throughput: 0: 42773.7. Samples: 14092212940. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:33,393][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 09:46:34,735][15401] Updated weights for policy 0, policy_version 860114 (0.0039) [2024-06-25 09:46:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14092222464. Throughput: 0: 42575.6. Samples: 14092340860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:38,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 09:46:38,764][15401] Updated weights for policy 0, policy_version 860124 (0.0029) [2024-06-25 09:46:42,230][15401] Updated weights for policy 0, policy_version 860134 (0.0044) [2024-06-25 09:46:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14092468224. Throughput: 0: 42648.7. Samples: 14092592620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:43,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 09:46:46,462][15401] Updated weights for policy 0, policy_version 860144 (0.0035) [2024-06-25 09:46:48,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 14092697600. Throughput: 0: 42608.9. Samples: 14092850880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 09:46:49,778][15401] Updated weights for policy 0, policy_version 860154 (0.0041) [2024-06-25 09:46:53,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14092894208. Throughput: 0: 42564.3. Samples: 14092982020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 09:46:53,895][15401] Updated weights for policy 0, policy_version 860164 (0.0030) [2024-06-25 09:46:57,309][15401] Updated weights for policy 0, policy_version 860174 (0.0031) [2024-06-25 09:46:58,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42871.4, 300 sec: 42764.7). Total num frames: 14093123584. Throughput: 0: 42728.0. Samples: 14093235140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:46:58,392][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 09:47:01,666][15401] Updated weights for policy 0, policy_version 860184 (0.0033) [2024-06-25 09:47:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 14093336576. Throughput: 0: 42823.2. Samples: 14093496540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:03,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 09:47:05,237][15401] Updated weights for policy 0, policy_version 860194 (0.0040) [2024-06-25 09:47:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14093533184. Throughput: 0: 42672.9. Samples: 14093622560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 09:47:09,600][15401] Updated weights for policy 0, policy_version 860204 (0.0037) [2024-06-25 09:47:12,835][15401] Updated weights for policy 0, policy_version 860214 (0.0032) [2024-06-25 09:47:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14093762560. Throughput: 0: 42796.1. Samples: 14093878380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 09:47:17,028][15401] Updated weights for policy 0, policy_version 860224 (0.0043) [2024-06-25 09:47:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14093975552. Throughput: 0: 42732.0. Samples: 14094135780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:18,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 09:47:20,281][15401] Updated weights for policy 0, policy_version 860234 (0.0040) [2024-06-25 09:47:22,373][15349] Signal inference workers to stop experience collection... (208750 times) [2024-06-25 09:47:22,374][15349] Signal inference workers to resume experience collection... (208750 times) [2024-06-25 09:47:22,413][15401] InferenceWorker_p0-w0: stopping experience collection (208750 times) [2024-06-25 09:47:22,414][15401] InferenceWorker_p0-w0: resuming experience collection (208750 times) [2024-06-25 09:47:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14094188544. Throughput: 0: 42665.2. Samples: 14094260800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 09:47:24,480][15401] Updated weights for policy 0, policy_version 860244 (0.0043) [2024-06-25 09:47:28,294][15401] Updated weights for policy 0, policy_version 860254 (0.0042) [2024-06-25 09:47:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 14094401536. Throughput: 0: 42868.3. Samples: 14094521680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:28,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 09:47:31,979][15401] Updated weights for policy 0, policy_version 860264 (0.0034) [2024-06-25 09:47:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 14094598144. Throughput: 0: 42970.6. Samples: 14094784560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 09:47:35,786][15401] Updated weights for policy 0, policy_version 860274 (0.0045) [2024-06-25 09:47:38,391][15132] Fps is (10 sec: 40951.9, 60 sec: 43143.1, 300 sec: 42709.2). Total num frames: 14094811136. Throughput: 0: 42620.9. Samples: 14094900040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:38,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 09:47:39,768][15401] Updated weights for policy 0, policy_version 860284 (0.0037) [2024-06-25 09:47:43,341][15401] Updated weights for policy 0, policy_version 860294 (0.0027) [2024-06-25 09:47:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 14095056896. Throughput: 0: 42751.2. Samples: 14095158840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 09:47:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860295_14095073280.pth... [2024-06-25 09:47:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859666_14084767744.pth [2024-06-25 09:47:47,532][15401] Updated weights for policy 0, policy_version 860304 (0.0031) [2024-06-25 09:47:48,389][15132] Fps is (10 sec: 42606.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14095237120. Throughput: 0: 42722.3. Samples: 14095419040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 19.0) [2024-06-25 09:47:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 09:47:51,086][15401] Updated weights for policy 0, policy_version 860314 (0.0040) [2024-06-25 09:47:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14095466496. Throughput: 0: 42639.6. Samples: 14095541340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:47:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:47:55,071][15401] Updated weights for policy 0, policy_version 860324 (0.0032) [2024-06-25 09:47:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14095679488. Throughput: 0: 42794.1. Samples: 14095804120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:47:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 09:47:59,051][15401] Updated weights for policy 0, policy_version 860334 (0.0036) [2024-06-25 09:48:02,493][15401] Updated weights for policy 0, policy_version 860344 (0.0043) [2024-06-25 09:48:03,390][15132] Fps is (10 sec: 42596.4, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 14095892480. Throughput: 0: 42751.7. Samples: 14096059620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 09:48:06,559][15401] Updated weights for policy 0, policy_version 860354 (0.0029) [2024-06-25 09:48:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14096105472. Throughput: 0: 42765.1. Samples: 14096185220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 09:48:10,447][15401] Updated weights for policy 0, policy_version 860364 (0.0044) [2024-06-25 09:48:13,389][15132] Fps is (10 sec: 42600.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14096318464. Throughput: 0: 42735.0. Samples: 14096444760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 09:48:14,043][15401] Updated weights for policy 0, policy_version 860374 (0.0045) [2024-06-25 09:48:18,040][15401] Updated weights for policy 0, policy_version 860384 (0.0032) [2024-06-25 09:48:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 14096531456. Throughput: 0: 42467.5. Samples: 14096695600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:18,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 09:48:21,703][15401] Updated weights for policy 0, policy_version 860394 (0.0030) [2024-06-25 09:48:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14096744448. Throughput: 0: 42715.1. Samples: 14096822140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 09:48:25,583][15401] Updated weights for policy 0, policy_version 860404 (0.0030) [2024-06-25 09:48:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14096941056. Throughput: 0: 42831.1. Samples: 14097086240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:28,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 09:48:29,488][15401] Updated weights for policy 0, policy_version 860414 (0.0035) [2024-06-25 09:48:31,816][15349] Signal inference workers to stop experience collection... (208800 times) [2024-06-25 09:48:31,816][15349] Signal inference workers to resume experience collection... (208800 times) [2024-06-25 09:48:31,865][15401] InferenceWorker_p0-w0: stopping experience collection (208800 times) [2024-06-25 09:48:31,865][15401] InferenceWorker_p0-w0: resuming experience collection (208800 times) [2024-06-25 09:48:33,107][15401] Updated weights for policy 0, policy_version 860424 (0.0035) [2024-06-25 09:48:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14097186816. Throughput: 0: 42581.7. Samples: 14097335220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 09:48:37,260][15401] Updated weights for policy 0, policy_version 860434 (0.0030) [2024-06-25 09:48:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 14097383424. Throughput: 0: 42796.8. Samples: 14097467200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 09:48:40,752][15401] Updated weights for policy 0, policy_version 860444 (0.0033) [2024-06-25 09:48:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14097596416. Throughput: 0: 42779.7. Samples: 14097729200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 09:48:45,474][15401] Updated weights for policy 0, policy_version 860454 (0.0030) [2024-06-25 09:48:48,388][15401] Updated weights for policy 0, policy_version 860464 (0.0037) [2024-06-25 09:48:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14097842176. Throughput: 0: 42716.4. Samples: 14097981840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 09:48:53,051][15401] Updated weights for policy 0, policy_version 860474 (0.0040) [2024-06-25 09:48:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 14098022400. Throughput: 0: 42855.5. Samples: 14098113720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 09:48:56,029][15401] Updated weights for policy 0, policy_version 860484 (0.0038) [2024-06-25 09:48:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14098235392. Throughput: 0: 42753.8. Samples: 14098368680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:48:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 09:49:00,578][15401] Updated weights for policy 0, policy_version 860494 (0.0040) [2024-06-25 09:49:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.8, 300 sec: 42765.0). Total num frames: 14098481152. Throughput: 0: 42549.8. Samples: 14098610340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:03,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 09:49:04,369][15401] Updated weights for policy 0, policy_version 860504 (0.0031) [2024-06-25 09:49:08,192][15401] Updated weights for policy 0, policy_version 860514 (0.0038) [2024-06-25 09:49:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14098661376. Throughput: 0: 42858.3. Samples: 14098750760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:08,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 09:49:11,873][15401] Updated weights for policy 0, policy_version 860524 (0.0027) [2024-06-25 09:49:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14098874368. Throughput: 0: 42670.3. Samples: 14099006400. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 09:49:15,971][15401] Updated weights for policy 0, policy_version 860534 (0.0045) [2024-06-25 09:49:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14099103744. Throughput: 0: 42671.9. Samples: 14099255460. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 09:49:19,476][15401] Updated weights for policy 0, policy_version 860544 (0.0032) [2024-06-25 09:49:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 14099300352. Throughput: 0: 42656.0. Samples: 14099386720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 09:49:23,646][15401] Updated weights for policy 0, policy_version 860554 (0.0044) [2024-06-25 09:49:26,952][15401] Updated weights for policy 0, policy_version 860564 (0.0025) [2024-06-25 09:49:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14099513344. Throughput: 0: 42646.2. Samples: 14099648280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 09:49:31,359][15401] Updated weights for policy 0, policy_version 860574 (0.0035) [2024-06-25 09:49:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14099759104. Throughput: 0: 42590.6. Samples: 14099898420. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:33,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 09:49:34,596][15401] Updated weights for policy 0, policy_version 860584 (0.0027) [2024-06-25 09:49:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14099939328. Throughput: 0: 42624.9. Samples: 14100031840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 19.0) [2024-06-25 09:49:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 09:49:38,943][15401] Updated weights for policy 0, policy_version 860594 (0.0033) [2024-06-25 09:49:42,261][15401] Updated weights for policy 0, policy_version 860604 (0.0029) [2024-06-25 09:49:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14100152320. Throughput: 0: 42529.3. Samples: 14100282500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:49:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 09:49:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860606_14100168704.pth... [2024-06-25 09:49:43,537][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000859980_14089912320.pth [2024-06-25 09:49:46,943][15401] Updated weights for policy 0, policy_version 860614 (0.0049) [2024-06-25 09:49:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14100398080. Throughput: 0: 42785.9. Samples: 14100535700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:49:48,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 09:49:50,231][15401] Updated weights for policy 0, policy_version 860624 (0.0040) [2024-06-25 09:49:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14100578304. Throughput: 0: 42476.0. Samples: 14100662180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:49:53,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 09:49:54,536][15401] Updated weights for policy 0, policy_version 860634 (0.0033) [2024-06-25 09:49:57,798][15401] Updated weights for policy 0, policy_version 860644 (0.0031) [2024-06-25 09:49:58,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 14100791296. Throughput: 0: 42393.1. Samples: 14100914100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:49:58,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 09:50:01,634][15349] Signal inference workers to stop experience collection... (208850 times) [2024-06-25 09:50:01,665][15401] InferenceWorker_p0-w0: stopping experience collection (208850 times) [2024-06-25 09:50:01,702][15349] Signal inference workers to resume experience collection... (208850 times) [2024-06-25 09:50:01,703][15401] InferenceWorker_p0-w0: resuming experience collection (208850 times) [2024-06-25 09:50:02,053][15401] Updated weights for policy 0, policy_version 860654 (0.0037) [2024-06-25 09:50:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14101004288. Throughput: 0: 42838.2. Samples: 14101183180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 09:50:05,461][15401] Updated weights for policy 0, policy_version 860664 (0.0035) [2024-06-25 09:50:08,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 14101217280. Throughput: 0: 42573.7. Samples: 14101302540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:08,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 09:50:09,477][15401] Updated weights for policy 0, policy_version 860674 (0.0031) [2024-06-25 09:50:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14101430272. Throughput: 0: 42435.0. Samples: 14101557860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 09:50:13,443][15401] Updated weights for policy 0, policy_version 860684 (0.0029) [2024-06-25 09:50:17,222][15401] Updated weights for policy 0, policy_version 860694 (0.0032) [2024-06-25 09:50:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14101626880. Throughput: 0: 42617.4. Samples: 14101816200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:18,391][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 09:50:21,097][15401] Updated weights for policy 0, policy_version 860704 (0.0028) [2024-06-25 09:50:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14101872640. Throughput: 0: 42452.7. Samples: 14101942220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 09:50:24,715][15401] Updated weights for policy 0, policy_version 860714 (0.0023) [2024-06-25 09:50:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14102069248. Throughput: 0: 42564.8. Samples: 14102197920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 09:50:28,684][15401] Updated weights for policy 0, policy_version 860724 (0.0048) [2024-06-25 09:50:32,634][15401] Updated weights for policy 0, policy_version 860734 (0.0043) [2024-06-25 09:50:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14102282240. Throughput: 0: 42561.2. Samples: 14102450960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:33,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 09:50:36,392][15401] Updated weights for policy 0, policy_version 860744 (0.0030) [2024-06-25 09:50:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14102495232. Throughput: 0: 42576.9. Samples: 14102578140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:38,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-25 09:50:40,168][15401] Updated weights for policy 0, policy_version 860754 (0.0025) [2024-06-25 09:50:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 14102708224. Throughput: 0: 42712.1. Samples: 14102836240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:43,393][15132] Avg episode reward: [(0, '0.280')] [2024-06-25 09:50:43,923][15401] Updated weights for policy 0, policy_version 860764 (0.0033) [2024-06-25 09:50:47,979][15401] Updated weights for policy 0, policy_version 860774 (0.0032) [2024-06-25 09:50:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14102937600. Throughput: 0: 42441.4. Samples: 14103093040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:48,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 09:50:51,708][15401] Updated weights for policy 0, policy_version 860784 (0.0048) [2024-06-25 09:50:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 14103117824. Throughput: 0: 42655.5. Samples: 14103222040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 09:50:55,803][15401] Updated weights for policy 0, policy_version 860794 (0.0051) [2024-06-25 09:50:58,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14103314432. Throughput: 0: 42435.7. Samples: 14103467460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:50:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 09:50:59,450][15401] Updated weights for policy 0, policy_version 860804 (0.0030) [2024-06-25 09:51:03,314][15401] Updated weights for policy 0, policy_version 860814 (0.0026) [2024-06-25 09:51:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14103576576. Throughput: 0: 42445.3. Samples: 14103726240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:03,395][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 09:51:06,635][15349] Signal inference workers to stop experience collection... (208900 times) [2024-06-25 09:51:06,689][15401] InferenceWorker_p0-w0: stopping experience collection (208900 times) [2024-06-25 09:51:06,753][15349] Signal inference workers to resume experience collection... (208900 times) [2024-06-25 09:51:06,753][15401] InferenceWorker_p0-w0: resuming experience collection (208900 times) [2024-06-25 09:51:07,471][15401] Updated weights for policy 0, policy_version 860824 (0.0034) [2024-06-25 09:51:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14103773184. Throughput: 0: 42696.5. Samples: 14103863560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 09:51:10,951][15401] Updated weights for policy 0, policy_version 860834 (0.0033) [2024-06-25 09:51:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14103969792. Throughput: 0: 42445.4. Samples: 14104107960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 09:51:15,085][15401] Updated weights for policy 0, policy_version 860844 (0.0029) [2024-06-25 09:51:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14104199168. Throughput: 0: 42627.8. Samples: 14104369200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:18,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 09:51:18,537][15401] Updated weights for policy 0, policy_version 860854 (0.0027) [2024-06-25 09:51:22,814][15401] Updated weights for policy 0, policy_version 860864 (0.0042) [2024-06-25 09:51:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 14104412160. Throughput: 0: 42792.1. Samples: 14104503780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 09:51:26,102][15401] Updated weights for policy 0, policy_version 860874 (0.0035) [2024-06-25 09:51:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 14104625152. Throughput: 0: 42562.2. Samples: 14104751440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-06-25 09:51:28,392][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 09:51:30,557][15401] Updated weights for policy 0, policy_version 860884 (0.0034) [2024-06-25 09:51:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14104854528. Throughput: 0: 42584.0. Samples: 14105009320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 09:51:33,997][15401] Updated weights for policy 0, policy_version 860894 (0.0029) [2024-06-25 09:51:38,301][15401] Updated weights for policy 0, policy_version 860904 (0.0038) [2024-06-25 09:51:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14105051136. Throughput: 0: 42647.2. Samples: 14105141160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:38,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 09:51:41,462][15401] Updated weights for policy 0, policy_version 860914 (0.0036) [2024-06-25 09:51:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 14105280512. Throughput: 0: 42790.5. Samples: 14105393040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:43,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 09:51:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860918_14105280512.pth... [2024-06-25 09:51:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860295_14095073280.pth [2024-06-25 09:51:45,711][15401] Updated weights for policy 0, policy_version 860924 (0.0041) [2024-06-25 09:51:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14105493504. Throughput: 0: 42826.6. Samples: 14105653440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 09:51:49,330][15401] Updated weights for policy 0, policy_version 860934 (0.0031) [2024-06-25 09:51:53,276][15401] Updated weights for policy 0, policy_version 860944 (0.0044) [2024-06-25 09:51:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 14105706496. Throughput: 0: 42736.9. Samples: 14105786720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 09:51:56,860][15401] Updated weights for policy 0, policy_version 860954 (0.0041) [2024-06-25 09:51:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 14105935872. Throughput: 0: 42880.4. Samples: 14106037580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:51:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 09:52:01,214][15401] Updated weights for policy 0, policy_version 860964 (0.0028) [2024-06-25 09:52:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14106148864. Throughput: 0: 42830.5. Samples: 14106296580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 09:52:04,403][15401] Updated weights for policy 0, policy_version 860974 (0.0037) [2024-06-25 09:52:08,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 14106345472. Throughput: 0: 42689.3. Samples: 14106424900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:08,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 09:52:08,844][15401] Updated weights for policy 0, policy_version 860984 (0.0051) [2024-06-25 09:52:12,598][15401] Updated weights for policy 0, policy_version 860994 (0.0029) [2024-06-25 09:52:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 14106574848. Throughput: 0: 42954.4. Samples: 14106684380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 09:52:16,574][15401] Updated weights for policy 0, policy_version 861004 (0.0036) [2024-06-25 09:52:18,392][15132] Fps is (10 sec: 44236.5, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 14106787840. Throughput: 0: 42788.3. Samples: 14106934900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:18,393][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 09:52:20,104][15401] Updated weights for policy 0, policy_version 861014 (0.0023) [2024-06-25 09:52:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14106968064. Throughput: 0: 42788.5. Samples: 14107066640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 09:52:24,150][15401] Updated weights for policy 0, policy_version 861024 (0.0048) [2024-06-25 09:52:27,360][15349] Signal inference workers to stop experience collection... (208950 times) [2024-06-25 09:52:27,409][15401] InferenceWorker_p0-w0: stopping experience collection (208950 times) [2024-06-25 09:52:27,417][15349] Signal inference workers to resume experience collection... (208950 times) [2024-06-25 09:52:27,430][15401] InferenceWorker_p0-w0: resuming experience collection (208950 times) [2024-06-25 09:52:27,767][15401] Updated weights for policy 0, policy_version 861034 (0.0050) [2024-06-25 09:52:28,392][15132] Fps is (10 sec: 42598.5, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 14107213824. Throughput: 0: 42925.4. Samples: 14107324780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:28,393][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 09:52:31,718][15401] Updated weights for policy 0, policy_version 861044 (0.0027) [2024-06-25 09:52:33,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 14107426816. Throughput: 0: 42658.6. Samples: 14107573080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 09:52:35,636][15401] Updated weights for policy 0, policy_version 861054 (0.0031) [2024-06-25 09:52:38,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14107607040. Throughput: 0: 42600.5. Samples: 14107703740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:38,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 09:52:39,678][15401] Updated weights for policy 0, policy_version 861064 (0.0022) [2024-06-25 09:52:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14107820032. Throughput: 0: 42616.8. Samples: 14107955340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 09:52:43,665][15401] Updated weights for policy 0, policy_version 861074 (0.0032) [2024-06-25 09:52:47,562][15401] Updated weights for policy 0, policy_version 861084 (0.0034) [2024-06-25 09:52:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14108033024. Throughput: 0: 42492.4. Samples: 14108208740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 09:52:51,136][15401] Updated weights for policy 0, policy_version 861094 (0.0037) [2024-06-25 09:52:53,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14108278784. Throughput: 0: 42586.6. Samples: 14108341200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:53,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 09:52:55,040][15401] Updated weights for policy 0, policy_version 861104 (0.0042) [2024-06-25 09:52:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14108475392. Throughput: 0: 42396.8. Samples: 14108592240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:52:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 09:52:58,804][15401] Updated weights for policy 0, policy_version 861114 (0.0036) [2024-06-25 09:53:03,230][15401] Updated weights for policy 0, policy_version 861124 (0.0028) [2024-06-25 09:53:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14108672000. Throughput: 0: 42614.7. Samples: 14108852460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:53:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 09:53:06,388][15401] Updated weights for policy 0, policy_version 861134 (0.0034) [2024-06-25 09:53:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14108917760. Throughput: 0: 42356.4. Samples: 14108972680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:53:08,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 09:53:10,721][15401] Updated weights for policy 0, policy_version 861144 (0.0045) [2024-06-25 09:53:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14109097984. Throughput: 0: 42418.7. Samples: 14109233520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:53:13,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 09:53:14,344][15401] Updated weights for policy 0, policy_version 861154 (0.0047) [2024-06-25 09:53:18,242][15401] Updated weights for policy 0, policy_version 861164 (0.0039) [2024-06-25 09:53:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 14109310976. Throughput: 0: 42513.4. Samples: 14109486180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 09:53:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 09:53:21,856][15401] Updated weights for policy 0, policy_version 861174 (0.0026) [2024-06-25 09:53:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14109556736. Throughput: 0: 42427.9. Samples: 14109613000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 09:53:25,766][15401] Updated weights for policy 0, policy_version 861184 (0.0028) [2024-06-25 09:53:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42542.8). Total num frames: 14109736960. Throughput: 0: 42637.4. Samples: 14109874020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:28,390][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 09:53:29,509][15401] Updated weights for policy 0, policy_version 861194 (0.0041) [2024-06-25 09:53:33,323][15401] Updated weights for policy 0, policy_version 861204 (0.0039) [2024-06-25 09:53:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14109966336. Throughput: 0: 42629.7. Samples: 14110127080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:33,396][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 09:53:37,321][15401] Updated weights for policy 0, policy_version 861214 (0.0032) [2024-06-25 09:53:38,392][15132] Fps is (10 sec: 42589.8, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 14110162944. Throughput: 0: 42469.2. Samples: 14110252400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:38,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 09:53:40,869][15401] Updated weights for policy 0, policy_version 861224 (0.0032) [2024-06-25 09:53:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 14110392320. Throughput: 0: 42671.0. Samples: 14110512440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 09:53:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861230_14110392320.pth... [2024-06-25 09:53:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860606_14100168704.pth [2024-06-25 09:53:44,975][15401] Updated weights for policy 0, policy_version 861234 (0.0039) [2024-06-25 09:53:48,390][15132] Fps is (10 sec: 44245.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14110605312. Throughput: 0: 42406.7. Samples: 14110760760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 09:53:48,583][15401] Updated weights for policy 0, policy_version 861244 (0.0045) [2024-06-25 09:53:52,614][15401] Updated weights for policy 0, policy_version 861254 (0.0029) [2024-06-25 09:53:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14110818304. Throughput: 0: 42684.7. Samples: 14110893500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 09:53:56,674][15401] Updated weights for policy 0, policy_version 861264 (0.0042) [2024-06-25 09:53:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14111014912. Throughput: 0: 42596.6. Samples: 14111150360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:53:58,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 09:54:00,424][15401] Updated weights for policy 0, policy_version 861274 (0.0034) [2024-06-25 09:54:03,392][15132] Fps is (10 sec: 44227.1, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 14111260672. Throughput: 0: 42554.7. Samples: 14111401240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:03,392][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 09:54:04,254][15401] Updated weights for policy 0, policy_version 861284 (0.0040) [2024-06-25 09:54:08,175][15401] Updated weights for policy 0, policy_version 861294 (0.0030) [2024-06-25 09:54:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14111457280. Throughput: 0: 42737.0. Samples: 14111536160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 09:54:09,231][15349] Signal inference workers to stop experience collection... (209000 times) [2024-06-25 09:54:09,231][15349] Signal inference workers to resume experience collection... (209000 times) [2024-06-25 09:54:09,265][15401] InferenceWorker_p0-w0: stopping experience collection (209000 times) [2024-06-25 09:54:09,265][15401] InferenceWorker_p0-w0: resuming experience collection (209000 times) [2024-06-25 09:54:11,760][15401] Updated weights for policy 0, policy_version 861304 (0.0035) [2024-06-25 09:54:13,390][15132] Fps is (10 sec: 37691.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14111637504. Throughput: 0: 42497.3. Samples: 14111786400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 09:54:15,619][15401] Updated weights for policy 0, policy_version 861314 (0.0028) [2024-06-25 09:54:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14111899648. Throughput: 0: 42462.7. Samples: 14112037900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:18,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 09:54:19,924][15401] Updated weights for policy 0, policy_version 861324 (0.0047) [2024-06-25 09:54:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14112079872. Throughput: 0: 42682.4. Samples: 14112173020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 09:54:23,441][15401] Updated weights for policy 0, policy_version 861334 (0.0044) [2024-06-25 09:54:27,354][15401] Updated weights for policy 0, policy_version 861344 (0.0029) [2024-06-25 09:54:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14112292864. Throughput: 0: 42589.1. Samples: 14112428940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:28,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 09:54:31,046][15401] Updated weights for policy 0, policy_version 861354 (0.0040) [2024-06-25 09:54:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14112522240. Throughput: 0: 42676.0. Samples: 14112681180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 09:54:34,913][15401] Updated weights for policy 0, policy_version 861364 (0.0043) [2024-06-25 09:54:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42872.9, 300 sec: 42653.9). Total num frames: 14112735232. Throughput: 0: 42640.1. Samples: 14112812300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 09:54:39,011][15401] Updated weights for policy 0, policy_version 861374 (0.0038) [2024-06-25 09:54:42,534][15401] Updated weights for policy 0, policy_version 861384 (0.0037) [2024-06-25 09:54:43,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 14112931840. Throughput: 0: 42570.1. Samples: 14113066120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:43,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 09:54:46,681][15401] Updated weights for policy 0, policy_version 861394 (0.0043) [2024-06-25 09:54:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14113161216. Throughput: 0: 42736.9. Samples: 14113324300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 09:54:50,107][15401] Updated weights for policy 0, policy_version 861404 (0.0026) [2024-06-25 09:54:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 14113374208. Throughput: 0: 42671.1. Samples: 14113456360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 09:54:54,011][15401] Updated weights for policy 0, policy_version 861414 (0.0030) [2024-06-25 09:54:57,929][15401] Updated weights for policy 0, policy_version 861424 (0.0036) [2024-06-25 09:54:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14113587200. Throughput: 0: 42735.6. Samples: 14113709500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:54:58,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 09:55:01,731][15401] Updated weights for policy 0, policy_version 861434 (0.0033) [2024-06-25 09:55:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 14113800192. Throughput: 0: 42922.3. Samples: 14113969400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:55:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 09:55:05,851][15401] Updated weights for policy 0, policy_version 861444 (0.0037) [2024-06-25 09:55:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14114029568. Throughput: 0: 42826.3. Samples: 14114100200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:55:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 09:55:09,395][15401] Updated weights for policy 0, policy_version 861454 (0.0033) [2024-06-25 09:55:13,299][15401] Updated weights for policy 0, policy_version 861464 (0.0031) [2024-06-25 09:55:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14114226176. Throughput: 0: 42707.8. Samples: 14114350800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-25 09:55:13,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 09:55:16,891][15401] Updated weights for policy 0, policy_version 861474 (0.0028) [2024-06-25 09:55:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14114455552. Throughput: 0: 42933.0. Samples: 14114613160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:18,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 09:55:20,866][15401] Updated weights for policy 0, policy_version 861484 (0.0037) [2024-06-25 09:55:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14114668544. Throughput: 0: 42906.3. Samples: 14114743080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 09:55:24,340][15401] Updated weights for policy 0, policy_version 861494 (0.0049) [2024-06-25 09:55:28,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14114865152. Throughput: 0: 42893.3. Samples: 14114996320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:28,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 09:55:28,402][15401] Updated weights for policy 0, policy_version 861504 (0.0035) [2024-06-25 09:55:32,282][15401] Updated weights for policy 0, policy_version 861514 (0.0049) [2024-06-25 09:55:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14115094528. Throughput: 0: 42824.4. Samples: 14115251400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:33,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 09:55:36,142][15401] Updated weights for policy 0, policy_version 861524 (0.0038) [2024-06-25 09:55:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14115307520. Throughput: 0: 42782.7. Samples: 14115381580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 09:55:39,994][15401] Updated weights for policy 0, policy_version 861534 (0.0033) [2024-06-25 09:55:43,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43144.5, 300 sec: 42653.6). Total num frames: 14115520512. Throughput: 0: 42672.4. Samples: 14115629860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:43,392][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 09:55:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861543_14115520512.pth... [2024-06-25 09:55:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000860918_14105280512.pth [2024-06-25 09:55:44,212][15401] Updated weights for policy 0, policy_version 861544 (0.0035) [2024-06-25 09:55:45,804][15349] Signal inference workers to stop experience collection... (209050 times) [2024-06-25 09:55:45,838][15401] InferenceWorker_p0-w0: stopping experience collection (209050 times) [2024-06-25 09:55:45,853][15349] Signal inference workers to resume experience collection... (209050 times) [2024-06-25 09:55:45,865][15401] InferenceWorker_p0-w0: resuming experience collection (209050 times) [2024-06-25 09:55:47,445][15401] Updated weights for policy 0, policy_version 861554 (0.0036) [2024-06-25 09:55:48,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14115717120. Throughput: 0: 42773.2. Samples: 14115894200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:48,391][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 09:55:51,782][15401] Updated weights for policy 0, policy_version 861564 (0.0037) [2024-06-25 09:55:53,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14115930112. Throughput: 0: 42664.7. Samples: 14116020120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 09:55:54,986][15401] Updated weights for policy 0, policy_version 861574 (0.0031) [2024-06-25 09:55:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14116159488. Throughput: 0: 42825.7. Samples: 14116277960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:55:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 09:55:59,223][15401] Updated weights for policy 0, policy_version 861584 (0.0033) [2024-06-25 09:56:02,434][15401] Updated weights for policy 0, policy_version 861594 (0.0032) [2024-06-25 09:56:03,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14116372480. Throughput: 0: 42695.9. Samples: 14116534480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 09:56:06,663][15401] Updated weights for policy 0, policy_version 861604 (0.0036) [2024-06-25 09:56:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14116585472. Throughput: 0: 42709.2. Samples: 14116665000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:08,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 09:56:10,106][15401] Updated weights for policy 0, policy_version 861614 (0.0022) [2024-06-25 09:56:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 14116798464. Throughput: 0: 42852.0. Samples: 14116924560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 09:56:14,091][15401] Updated weights for policy 0, policy_version 861624 (0.0034) [2024-06-25 09:56:17,790][15401] Updated weights for policy 0, policy_version 861634 (0.0028) [2024-06-25 09:56:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14117027840. Throughput: 0: 42652.9. Samples: 14117170780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 09:56:22,246][15401] Updated weights for policy 0, policy_version 861644 (0.0028) [2024-06-25 09:56:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14117224448. Throughput: 0: 42703.8. Samples: 14117303260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 09:56:25,597][15401] Updated weights for policy 0, policy_version 861654 (0.0041) [2024-06-25 09:56:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 14117437440. Throughput: 0: 42970.8. Samples: 14117563440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:28,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 09:56:29,933][15401] Updated weights for policy 0, policy_version 861664 (0.0032) [2024-06-25 09:56:33,387][15401] Updated weights for policy 0, policy_version 861674 (0.0030) [2024-06-25 09:56:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.6, 300 sec: 42764.6). Total num frames: 14117666816. Throughput: 0: 42735.0. Samples: 14117817380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:33,393][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 09:56:37,567][15401] Updated weights for policy 0, policy_version 861684 (0.0029) [2024-06-25 09:56:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 14117863424. Throughput: 0: 42751.3. Samples: 14117943920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 09:56:41,811][15401] Updated weights for policy 0, policy_version 861694 (0.0037) [2024-06-25 09:56:43,389][15132] Fps is (10 sec: 40970.7, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 14118076416. Throughput: 0: 42685.5. Samples: 14118198800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 09:56:45,456][15401] Updated weights for policy 0, policy_version 861704 (0.0031) [2024-06-25 09:56:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14118289408. Throughput: 0: 42651.9. Samples: 14118453820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 09:56:49,406][15401] Updated weights for policy 0, policy_version 861714 (0.0025) [2024-06-25 09:56:52,974][15401] Updated weights for policy 0, policy_version 861724 (0.0039) [2024-06-25 09:56:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 14118502400. Throughput: 0: 42528.2. Samples: 14118578760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:53,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-25 09:56:57,043][15401] Updated weights for policy 0, policy_version 861734 (0.0038) [2024-06-25 09:56:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14118715392. Throughput: 0: 42626.3. Samples: 14118842740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:56:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 09:57:00,629][15401] Updated weights for policy 0, policy_version 861744 (0.0038) [2024-06-25 09:57:02,904][15349] Signal inference workers to stop experience collection... (209100 times) [2024-06-25 09:57:02,904][15349] Signal inference workers to resume experience collection... (209100 times) [2024-06-25 09:57:02,947][15401] InferenceWorker_p0-w0: stopping experience collection (209100 times) [2024-06-25 09:57:02,947][15401] InferenceWorker_p0-w0: resuming experience collection (209100 times) [2024-06-25 09:57:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 14118928384. Throughput: 0: 42735.5. Samples: 14119093880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 09:57:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 09:57:04,486][15401] Updated weights for policy 0, policy_version 861754 (0.0040) [2024-06-25 09:57:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14119124992. Throughput: 0: 42635.3. Samples: 14119221840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:08,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 09:57:08,394][15401] Updated weights for policy 0, policy_version 861764 (0.0036) [2024-06-25 09:57:12,002][15401] Updated weights for policy 0, policy_version 861774 (0.0042) [2024-06-25 09:57:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 14119370752. Throughput: 0: 42688.4. Samples: 14119484420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:13,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 09:57:15,906][15401] Updated weights for policy 0, policy_version 861784 (0.0034) [2024-06-25 09:57:18,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14119583744. Throughput: 0: 42744.7. Samples: 14119740780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:18,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 09:57:19,657][15401] Updated weights for policy 0, policy_version 861794 (0.0033) [2024-06-25 09:57:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.6, 300 sec: 42598.8). Total num frames: 14119780352. Throughput: 0: 42789.0. Samples: 14119869420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:23,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 09:57:23,478][15401] Updated weights for policy 0, policy_version 861804 (0.0032) [2024-06-25 09:57:27,313][15401] Updated weights for policy 0, policy_version 861814 (0.0032) [2024-06-25 09:57:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14119993344. Throughput: 0: 42896.4. Samples: 14120129140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 09:57:31,072][15401] Updated weights for policy 0, policy_version 861824 (0.0036) [2024-06-25 09:57:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 14120222720. Throughput: 0: 42900.5. Samples: 14120384340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 09:57:34,838][15401] Updated weights for policy 0, policy_version 861834 (0.0023) [2024-06-25 09:57:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14120435712. Throughput: 0: 42987.5. Samples: 14120513200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:38,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 09:57:38,825][15401] Updated weights for policy 0, policy_version 861844 (0.0032) [2024-06-25 09:57:42,439][15401] Updated weights for policy 0, policy_version 861854 (0.0046) [2024-06-25 09:57:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14120632320. Throughput: 0: 42782.5. Samples: 14120767960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 09:57:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861855_14120632320.pth... [2024-06-25 09:57:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861230_14110392320.pth [2024-06-25 09:57:46,377][15401] Updated weights for policy 0, policy_version 861864 (0.0035) [2024-06-25 09:57:48,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 14120878080. Throughput: 0: 42777.4. Samples: 14121018960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:48,392][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 09:57:50,023][15401] Updated weights for policy 0, policy_version 861874 (0.0028) [2024-06-25 09:57:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14121058304. Throughput: 0: 42808.9. Samples: 14121148240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 09:57:54,294][15401] Updated weights for policy 0, policy_version 861884 (0.0031) [2024-06-25 09:57:58,374][15401] Updated weights for policy 0, policy_version 861894 (0.0040) [2024-06-25 09:57:58,389][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14121271296. Throughput: 0: 42721.9. Samples: 14121406900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:57:58,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 09:58:01,838][15401] Updated weights for policy 0, policy_version 861904 (0.0034) [2024-06-25 09:58:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14121500672. Throughput: 0: 42732.7. Samples: 14121663860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:03,393][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 09:58:05,760][15401] Updated weights for policy 0, policy_version 861914 (0.0051) [2024-06-25 09:58:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14121713664. Throughput: 0: 42840.8. Samples: 14121797260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 09:58:09,288][15401] Updated weights for policy 0, policy_version 861924 (0.0046) [2024-06-25 09:58:13,248][15401] Updated weights for policy 0, policy_version 861934 (0.0031) [2024-06-25 09:58:13,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14121926656. Throughput: 0: 42749.7. Samples: 14122052880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:13,396][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 09:58:16,732][15401] Updated weights for policy 0, policy_version 861944 (0.0040) [2024-06-25 09:58:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14122156032. Throughput: 0: 42784.0. Samples: 14122309620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 09:58:18,768][15349] Signal inference workers to stop experience collection... (209150 times) [2024-06-25 09:58:18,819][15401] InferenceWorker_p0-w0: stopping experience collection (209150 times) [2024-06-25 09:58:18,828][15349] Signal inference workers to resume experience collection... (209150 times) [2024-06-25 09:58:18,833][15401] InferenceWorker_p0-w0: resuming experience collection (209150 times) [2024-06-25 09:58:21,144][15401] Updated weights for policy 0, policy_version 861954 (0.0034) [2024-06-25 09:58:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14122352640. Throughput: 0: 42875.4. Samples: 14122442600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 09:58:24,594][15401] Updated weights for policy 0, policy_version 861964 (0.0036) [2024-06-25 09:58:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14122565632. Throughput: 0: 42785.3. Samples: 14122693300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 09:58:28,674][15401] Updated weights for policy 0, policy_version 861974 (0.0029) [2024-06-25 09:58:32,377][15401] Updated weights for policy 0, policy_version 861984 (0.0041) [2024-06-25 09:58:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14122795008. Throughput: 0: 42937.9. Samples: 14122951060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 09:58:36,238][15401] Updated weights for policy 0, policy_version 861994 (0.0032) [2024-06-25 09:58:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14123008000. Throughput: 0: 42902.2. Samples: 14123078840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 09:58:39,912][15401] Updated weights for policy 0, policy_version 862004 (0.0040) [2024-06-25 09:58:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14123188224. Throughput: 0: 42883.5. Samples: 14123336660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:43,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-25 09:58:43,790][15401] Updated weights for policy 0, policy_version 862014 (0.0041) [2024-06-25 09:58:47,691][15401] Updated weights for policy 0, policy_version 862024 (0.0032) [2024-06-25 09:58:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14123433984. Throughput: 0: 42837.9. Samples: 14123591460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:48,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 09:58:51,621][15401] Updated weights for policy 0, policy_version 862034 (0.0040) [2024-06-25 09:58:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14123646976. Throughput: 0: 42787.1. Samples: 14123722680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 09:58:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 09:58:55,437][15401] Updated weights for policy 0, policy_version 862044 (0.0041) [2024-06-25 09:58:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 14123827200. Throughput: 0: 42677.9. Samples: 14123973380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:58:58,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 09:58:59,559][15401] Updated weights for policy 0, policy_version 862054 (0.0032) [2024-06-25 09:59:03,075][15401] Updated weights for policy 0, policy_version 862064 (0.0036) [2024-06-25 09:59:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14124072960. Throughput: 0: 42666.2. Samples: 14124229600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 09:59:06,929][15401] Updated weights for policy 0, policy_version 862074 (0.0027) [2024-06-25 09:59:08,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 14124285952. Throughput: 0: 42575.5. Samples: 14124358500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:08,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-25 09:59:10,418][15401] Updated weights for policy 0, policy_version 862084 (0.0042) [2024-06-25 09:59:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14124482560. Throughput: 0: 42706.0. Samples: 14124615060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:13,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 09:59:14,567][15401] Updated weights for policy 0, policy_version 862094 (0.0040) [2024-06-25 09:59:17,883][15401] Updated weights for policy 0, policy_version 862104 (0.0024) [2024-06-25 09:59:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14124728320. Throughput: 0: 42491.5. Samples: 14124863180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 09:59:22,210][15401] Updated weights for policy 0, policy_version 862114 (0.0039) [2024-06-25 09:59:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14124924928. Throughput: 0: 42741.8. Samples: 14125002220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 09:59:25,410][15401] Updated weights for policy 0, policy_version 862124 (0.0040) [2024-06-25 09:59:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14125137920. Throughput: 0: 42690.6. Samples: 14125257740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:28,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 09:59:29,760][15401] Updated weights for policy 0, policy_version 862134 (0.0041) [2024-06-25 09:59:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14125350912. Throughput: 0: 42627.9. Samples: 14125509720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 09:59:33,434][15401] Updated weights for policy 0, policy_version 862144 (0.0026) [2024-06-25 09:59:37,534][15401] Updated weights for policy 0, policy_version 862154 (0.0033) [2024-06-25 09:59:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 14125563904. Throughput: 0: 42680.4. Samples: 14125643300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:38,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 09:59:38,956][15349] Signal inference workers to stop experience collection... (209200 times) [2024-06-25 09:59:39,012][15401] InferenceWorker_p0-w0: stopping experience collection (209200 times) [2024-06-25 09:59:39,077][15349] Signal inference workers to resume experience collection... (209200 times) [2024-06-25 09:59:39,077][15401] InferenceWorker_p0-w0: resuming experience collection (209200 times) [2024-06-25 09:59:41,521][15401] Updated weights for policy 0, policy_version 862164 (0.0020) [2024-06-25 09:59:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14125760512. Throughput: 0: 42770.9. Samples: 14125898080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 09:59:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862169_14125776896.pth... [2024-06-25 09:59:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861543_14115520512.pth [2024-06-25 09:59:45,281][15401] Updated weights for policy 0, policy_version 862174 (0.0044) [2024-06-25 09:59:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14126006272. Throughput: 0: 42730.7. Samples: 14126152480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:48,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 09:59:48,934][15401] Updated weights for policy 0, policy_version 862184 (0.0029) [2024-06-25 09:59:52,831][15401] Updated weights for policy 0, policy_version 862194 (0.0043) [2024-06-25 09:59:53,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14126219264. Throughput: 0: 42719.8. Samples: 14126280880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:53,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 09:59:56,838][15401] Updated weights for policy 0, policy_version 862204 (0.0030) [2024-06-25 09:59:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14126399488. Throughput: 0: 42677.2. Samples: 14126535540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 09:59:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 10:00:00,358][15401] Updated weights for policy 0, policy_version 862214 (0.0037) [2024-06-25 10:00:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14126645248. Throughput: 0: 42812.9. Samples: 14126789760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 10:00:04,452][15401] Updated weights for policy 0, policy_version 862224 (0.0038) [2024-06-25 10:00:07,980][15401] Updated weights for policy 0, policy_version 862234 (0.0028) [2024-06-25 10:00:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14126858240. Throughput: 0: 42699.6. Samples: 14126923700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 10:00:11,759][15401] Updated weights for policy 0, policy_version 862244 (0.0034) [2024-06-25 10:00:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14127054848. Throughput: 0: 42738.7. Samples: 14127180980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 10:00:15,539][15401] Updated weights for policy 0, policy_version 862254 (0.0025) [2024-06-25 10:00:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14127284224. Throughput: 0: 42800.5. Samples: 14127435740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 10:00:19,525][15401] Updated weights for policy 0, policy_version 862264 (0.0035) [2024-06-25 10:00:23,251][15401] Updated weights for policy 0, policy_version 862274 (0.0037) [2024-06-25 10:00:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 14127497216. Throughput: 0: 42782.2. Samples: 14127568500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 10:00:27,418][15401] Updated weights for policy 0, policy_version 862284 (0.0040) [2024-06-25 10:00:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14127693824. Throughput: 0: 42701.0. Samples: 14127819620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 10:00:30,987][15401] Updated weights for policy 0, policy_version 862294 (0.0040) [2024-06-25 10:00:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 14127906816. Throughput: 0: 42880.3. Samples: 14128082100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:33,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 10:00:34,861][15401] Updated weights for policy 0, policy_version 862304 (0.0043) [2024-06-25 10:00:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14128136192. Throughput: 0: 42911.5. Samples: 14128211900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 10:00:38,514][15401] Updated weights for policy 0, policy_version 862314 (0.0030) [2024-06-25 10:00:42,586][15401] Updated weights for policy 0, policy_version 862324 (0.0027) [2024-06-25 10:00:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14128349184. Throughput: 0: 42831.1. Samples: 14128462940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 10:00:43,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 10:00:46,258][15401] Updated weights for policy 0, policy_version 862334 (0.0031) [2024-06-25 10:00:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 14128545792. Throughput: 0: 42983.6. Samples: 14128724020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:00:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 10:00:49,977][15401] Updated weights for policy 0, policy_version 862344 (0.0036) [2024-06-25 10:00:53,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14128758784. Throughput: 0: 42758.7. Samples: 14128847840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:00:53,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 10:00:54,040][15401] Updated weights for policy 0, policy_version 862354 (0.0026) [2024-06-25 10:00:56,265][15349] Signal inference workers to stop experience collection... (209250 times) [2024-06-25 10:00:56,308][15401] InferenceWorker_p0-w0: stopping experience collection (209250 times) [2024-06-25 10:00:56,316][15349] Signal inference workers to resume experience collection... (209250 times) [2024-06-25 10:00:56,323][15401] InferenceWorker_p0-w0: resuming experience collection (209250 times) [2024-06-25 10:00:57,655][15401] Updated weights for policy 0, policy_version 862364 (0.0028) [2024-06-25 10:00:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 14129004544. Throughput: 0: 42895.0. Samples: 14129111260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:00:58,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 10:01:01,841][15401] Updated weights for policy 0, policy_version 862374 (0.0034) [2024-06-25 10:01:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14129201152. Throughput: 0: 42906.3. Samples: 14129366520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:03,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 10:01:05,088][15401] Updated weights for policy 0, policy_version 862384 (0.0027) [2024-06-25 10:01:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14129397760. Throughput: 0: 42756.1. Samples: 14129492520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 10:01:09,695][15401] Updated weights for policy 0, policy_version 862394 (0.0035) [2024-06-25 10:01:13,010][15401] Updated weights for policy 0, policy_version 862404 (0.0033) [2024-06-25 10:01:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14129643520. Throughput: 0: 42818.8. Samples: 14129746460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 10:01:17,408][15401] Updated weights for policy 0, policy_version 862414 (0.0041) [2024-06-25 10:01:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14129823744. Throughput: 0: 42706.3. Samples: 14130003880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:18,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 10:01:20,440][15401] Updated weights for policy 0, policy_version 862424 (0.0039) [2024-06-25 10:01:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14130053120. Throughput: 0: 42576.0. Samples: 14130127820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:23,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 10:01:24,954][15401] Updated weights for policy 0, policy_version 862434 (0.0035) [2024-06-25 10:01:28,213][15401] Updated weights for policy 0, policy_version 862444 (0.0025) [2024-06-25 10:01:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 14130282496. Throughput: 0: 42757.4. Samples: 14130387020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:28,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 10:01:32,593][15401] Updated weights for policy 0, policy_version 862454 (0.0034) [2024-06-25 10:01:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14130479104. Throughput: 0: 42771.9. Samples: 14130648760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 10:01:35,957][15401] Updated weights for policy 0, policy_version 862464 (0.0040) [2024-06-25 10:01:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14130708480. Throughput: 0: 42813.8. Samples: 14130774460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 10:01:40,143][15401] Updated weights for policy 0, policy_version 862474 (0.0041) [2024-06-25 10:01:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14130921472. Throughput: 0: 42724.9. Samples: 14131033880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:43,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 10:01:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862483_14130921472.pth... [2024-06-25 10:01:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000861855_14120632320.pth [2024-06-25 10:01:43,680][15401] Updated weights for policy 0, policy_version 862484 (0.0033) [2024-06-25 10:01:47,708][15401] Updated weights for policy 0, policy_version 862494 (0.0041) [2024-06-25 10:01:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14131118080. Throughput: 0: 42796.0. Samples: 14131292340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 10:01:51,223][15401] Updated weights for policy 0, policy_version 862504 (0.0026) [2024-06-25 10:01:53,392][15132] Fps is (10 sec: 44228.0, 60 sec: 43416.1, 300 sec: 42875.8). Total num frames: 14131363840. Throughput: 0: 42879.7. Samples: 14131422200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:53,392][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 10:01:55,330][15401] Updated weights for policy 0, policy_version 862514 (0.0034) [2024-06-25 10:01:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14131576832. Throughput: 0: 42945.7. Samples: 14131679020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:01:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 10:01:59,414][15401] Updated weights for policy 0, policy_version 862524 (0.0031) [2024-06-25 10:02:03,151][15401] Updated weights for policy 0, policy_version 862534 (0.0031) [2024-06-25 10:02:03,389][15132] Fps is (10 sec: 39329.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14131757056. Throughput: 0: 42956.0. Samples: 14131936900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 10:02:06,917][15401] Updated weights for policy 0, policy_version 862544 (0.0029) [2024-06-25 10:02:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14131986432. Throughput: 0: 42879.1. Samples: 14132057380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 10:02:10,656][15401] Updated weights for policy 0, policy_version 862554 (0.0029) [2024-06-25 10:02:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14132215808. Throughput: 0: 42893.0. Samples: 14132317200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 10:02:14,443][15401] Updated weights for policy 0, policy_version 862564 (0.0023) [2024-06-25 10:02:18,185][15401] Updated weights for policy 0, policy_version 862574 (0.0046) [2024-06-25 10:02:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14132412416. Throughput: 0: 42876.8. Samples: 14132578220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 10:02:21,968][15401] Updated weights for policy 0, policy_version 862584 (0.0044) [2024-06-25 10:02:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14132625408. Throughput: 0: 42852.8. Samples: 14132702840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 10:02:24,184][15349] Signal inference workers to stop experience collection... (209300 times) [2024-06-25 10:02:24,184][15349] Signal inference workers to resume experience collection... (209300 times) [2024-06-25 10:02:24,202][15401] InferenceWorker_p0-w0: stopping experience collection (209300 times) [2024-06-25 10:02:24,202][15401] InferenceWorker_p0-w0: resuming experience collection (209300 times) [2024-06-25 10:02:25,695][15401] Updated weights for policy 0, policy_version 862594 (0.0024) [2024-06-25 10:02:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14132854784. Throughput: 0: 42941.4. Samples: 14132966240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 10:02:29,514][15401] Updated weights for policy 0, policy_version 862604 (0.0037) [2024-06-25 10:02:33,370][15401] Updated weights for policy 0, policy_version 862614 (0.0052) [2024-06-25 10:02:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14133067776. Throughput: 0: 42961.3. Samples: 14133225600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:33,390][15132] Avg episode reward: [(0, '0.234')] [2024-06-25 10:02:37,153][15401] Updated weights for policy 0, policy_version 862624 (0.0043) [2024-06-25 10:02:38,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 14133264384. Throughput: 0: 42917.2. Samples: 14133353660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:02:38,396][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 10:02:40,925][15401] Updated weights for policy 0, policy_version 862634 (0.0039) [2024-06-25 10:02:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14133493760. Throughput: 0: 42975.6. Samples: 14133612920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:02:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 10:02:44,689][15401] Updated weights for policy 0, policy_version 862644 (0.0036) [2024-06-25 10:02:48,389][15132] Fps is (10 sec: 44265.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14133706752. Throughput: 0: 42884.0. Samples: 14133866680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:02:48,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 10:02:48,533][15401] Updated weights for policy 0, policy_version 862654 (0.0027) [2024-06-25 10:02:52,536][15401] Updated weights for policy 0, policy_version 862664 (0.0037) [2024-06-25 10:02:53,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42325.1, 300 sec: 42820.2). Total num frames: 14133903360. Throughput: 0: 43131.5. Samples: 14133998400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:02:53,392][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 10:02:56,554][15401] Updated weights for policy 0, policy_version 862674 (0.0032) [2024-06-25 10:02:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42820.6). Total num frames: 14134132736. Throughput: 0: 42976.3. Samples: 14134251240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:02:58,393][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 10:03:00,190][15401] Updated weights for policy 0, policy_version 862684 (0.0034) [2024-06-25 10:03:03,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14134345728. Throughput: 0: 42888.1. Samples: 14134508180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:03,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 10:03:04,217][15401] Updated weights for policy 0, policy_version 862694 (0.0035) [2024-06-25 10:03:08,158][15401] Updated weights for policy 0, policy_version 862704 (0.0042) [2024-06-25 10:03:08,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14134558720. Throughput: 0: 42978.6. Samples: 14134636880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:08,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 10:03:11,774][15401] Updated weights for policy 0, policy_version 862714 (0.0029) [2024-06-25 10:03:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14134788096. Throughput: 0: 42798.5. Samples: 14134892180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 10:03:15,807][15401] Updated weights for policy 0, policy_version 862724 (0.0026) [2024-06-25 10:03:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14134984704. Throughput: 0: 42732.5. Samples: 14135148560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 10:03:19,491][15401] Updated weights for policy 0, policy_version 862734 (0.0034) [2024-06-25 10:03:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14135181312. Throughput: 0: 42582.9. Samples: 14135269620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 10:03:23,782][15401] Updated weights for policy 0, policy_version 862744 (0.0044) [2024-06-25 10:03:27,002][15401] Updated weights for policy 0, policy_version 862754 (0.0038) [2024-06-25 10:03:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14135427072. Throughput: 0: 42598.5. Samples: 14135529860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 10:03:31,565][15401] Updated weights for policy 0, policy_version 862764 (0.0036) [2024-06-25 10:03:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14135623680. Throughput: 0: 42795.1. Samples: 14135792460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 10:03:34,542][15401] Updated weights for policy 0, policy_version 862774 (0.0040) [2024-06-25 10:03:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42875.9, 300 sec: 42876.1). Total num frames: 14135836672. Throughput: 0: 42583.1. Samples: 14135914540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 10:03:39,200][15401] Updated weights for policy 0, policy_version 862784 (0.0034) [2024-06-25 10:03:42,102][15401] Updated weights for policy 0, policy_version 862794 (0.0026) [2024-06-25 10:03:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14136066048. Throughput: 0: 42680.5. Samples: 14136171760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:43,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 10:03:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862797_14136066048.pth... [2024-06-25 10:03:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862169_14125776896.pth [2024-06-25 10:03:46,974][15401] Updated weights for policy 0, policy_version 862804 (0.0043) [2024-06-25 10:03:47,755][15349] Signal inference workers to stop experience collection... (209350 times) [2024-06-25 10:03:47,755][15349] Signal inference workers to resume experience collection... (209350 times) [2024-06-25 10:03:47,776][15401] InferenceWorker_p0-w0: stopping experience collection (209350 times) [2024-06-25 10:03:47,776][15401] InferenceWorker_p0-w0: resuming experience collection (209350 times) [2024-06-25 10:03:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14136246272. Throughput: 0: 42706.5. Samples: 14136429980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:48,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 10:03:49,959][15401] Updated weights for policy 0, policy_version 862814 (0.0025) [2024-06-25 10:03:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14136475648. Throughput: 0: 42681.4. Samples: 14136557540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 10:03:54,566][15401] Updated weights for policy 0, policy_version 862824 (0.0040) [2024-06-25 10:03:57,405][15401] Updated weights for policy 0, policy_version 862834 (0.0039) [2024-06-25 10:03:58,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14136705024. Throughput: 0: 42766.9. Samples: 14136816680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:03:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 10:04:02,183][15401] Updated weights for policy 0, policy_version 862844 (0.0030) [2024-06-25 10:04:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14136885248. Throughput: 0: 42847.6. Samples: 14137076700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 10:04:04,959][15401] Updated weights for policy 0, policy_version 862854 (0.0041) [2024-06-25 10:04:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14137131008. Throughput: 0: 42887.5. Samples: 14137199560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:08,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 10:04:09,743][15401] Updated weights for policy 0, policy_version 862864 (0.0029) [2024-06-25 10:04:12,467][15401] Updated weights for policy 0, policy_version 862874 (0.0027) [2024-06-25 10:04:13,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 14137344000. Throughput: 0: 42885.8. Samples: 14137459820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:13,392][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 10:04:17,370][15401] Updated weights for policy 0, policy_version 862884 (0.0043) [2024-06-25 10:04:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14137524224. Throughput: 0: 42882.5. Samples: 14137722180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:18,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 10:04:20,229][15401] Updated weights for policy 0, policy_version 862894 (0.0053) [2024-06-25 10:04:23,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14137769984. Throughput: 0: 42793.4. Samples: 14137840240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 10:04:25,119][15401] Updated weights for policy 0, policy_version 862904 (0.0034) [2024-06-25 10:04:27,750][15401] Updated weights for policy 0, policy_version 862914 (0.0041) [2024-06-25 10:04:28,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14137999360. Throughput: 0: 42882.2. Samples: 14138101460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 10:04:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 10:04:32,829][15401] Updated weights for policy 0, policy_version 862924 (0.0033) [2024-06-25 10:04:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14138163200. Throughput: 0: 43024.1. Samples: 14138366060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 10:04:35,467][15401] Updated weights for policy 0, policy_version 862934 (0.0040) [2024-06-25 10:04:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14138425344. Throughput: 0: 42755.2. Samples: 14138481520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 10:04:40,483][15401] Updated weights for policy 0, policy_version 862944 (0.0036) [2024-06-25 10:04:43,188][15401] Updated weights for policy 0, policy_version 862954 (0.0039) [2024-06-25 10:04:43,390][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14138654720. Throughput: 0: 42995.9. Samples: 14138751500. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:43,392][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 10:04:44,288][15349] Signal inference workers to stop experience collection... (209400 times) [2024-06-25 10:04:44,343][15401] InferenceWorker_p0-w0: stopping experience collection (209400 times) [2024-06-25 10:04:44,350][15349] Signal inference workers to resume experience collection... (209400 times) [2024-06-25 10:04:44,353][15401] InferenceWorker_p0-w0: resuming experience collection (209400 times) [2024-06-25 10:04:48,226][15401] Updated weights for policy 0, policy_version 862964 (0.0030) [2024-06-25 10:04:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14138818560. Throughput: 0: 43011.8. Samples: 14139012240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 10:04:51,241][15401] Updated weights for policy 0, policy_version 862974 (0.0029) [2024-06-25 10:04:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14139064320. Throughput: 0: 42952.0. Samples: 14139132400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 10:04:55,939][15401] Updated weights for policy 0, policy_version 862984 (0.0036) [2024-06-25 10:04:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14139277312. Throughput: 0: 42833.4. Samples: 14139387220. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:04:58,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 10:04:58,706][15401] Updated weights for policy 0, policy_version 862994 (0.0043) [2024-06-25 10:05:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14139441152. Throughput: 0: 42812.6. Samples: 14139648740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 10:05:03,535][15401] Updated weights for policy 0, policy_version 863004 (0.0029) [2024-06-25 10:05:06,516][15401] Updated weights for policy 0, policy_version 863014 (0.0032) [2024-06-25 10:05:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14139719680. Throughput: 0: 42912.8. Samples: 14139771320. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:08,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 10:05:10,996][15401] Updated weights for policy 0, policy_version 863024 (0.0037) [2024-06-25 10:05:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 14139899904. Throughput: 0: 42835.0. Samples: 14140029040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:05:14,180][15401] Updated weights for policy 0, policy_version 863034 (0.0036) [2024-06-25 10:05:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14140096512. Throughput: 0: 42748.5. Samples: 14140289740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:18,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 10:05:18,488][15401] Updated weights for policy 0, policy_version 863044 (0.0048) [2024-06-25 10:05:21,778][15401] Updated weights for policy 0, policy_version 863054 (0.0040) [2024-06-25 10:05:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14140358656. Throughput: 0: 43009.3. Samples: 14140416940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:23,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 10:05:25,935][15401] Updated weights for policy 0, policy_version 863064 (0.0049) [2024-06-25 10:05:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14140555264. Throughput: 0: 42848.5. Samples: 14140679680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 10:05:29,364][15401] Updated weights for policy 0, policy_version 863074 (0.0037) [2024-06-25 10:05:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14140751872. Throughput: 0: 42720.6. Samples: 14140934660. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 10:05:33,417][15401] Updated weights for policy 0, policy_version 863084 (0.0037) [2024-06-25 10:05:36,802][15401] Updated weights for policy 0, policy_version 863094 (0.0028) [2024-06-25 10:05:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14140997632. Throughput: 0: 42975.2. Samples: 14141066280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 10:05:41,089][15401] Updated weights for policy 0, policy_version 863104 (0.0034) [2024-06-25 10:05:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 14141177856. Throughput: 0: 42912.4. Samples: 14141318280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:43,396][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 10:05:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863110_14141194240.pth... [2024-06-25 10:05:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862483_14130921472.pth [2024-06-25 10:05:44,396][15401] Updated weights for policy 0, policy_version 863114 (0.0034) [2024-06-25 10:05:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14141407232. Throughput: 0: 42975.1. Samples: 14141582620. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 10:05:48,722][15401] Updated weights for policy 0, policy_version 863124 (0.0041) [2024-06-25 10:05:51,925][15401] Updated weights for policy 0, policy_version 863134 (0.0034) [2024-06-25 10:05:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14141652992. Throughput: 0: 43118.6. Samples: 14141711660. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 10:05:56,117][15401] Updated weights for policy 0, policy_version 863144 (0.0031) [2024-06-25 10:05:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14141816832. Throughput: 0: 43101.9. Samples: 14141968620. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:05:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 10:05:59,478][15401] Updated weights for policy 0, policy_version 863154 (0.0037) [2024-06-25 10:06:01,436][15349] Signal inference workers to stop experience collection... (209450 times) [2024-06-25 10:06:01,436][15349] Signal inference workers to resume experience collection... (209450 times) [2024-06-25 10:06:01,463][15401] InferenceWorker_p0-w0: stopping experience collection (209450 times) [2024-06-25 10:06:01,463][15401] InferenceWorker_p0-w0: resuming experience collection (209450 times) [2024-06-25 10:06:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14142046208. Throughput: 0: 43020.8. Samples: 14142225680. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:06:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 10:06:03,683][15401] Updated weights for policy 0, policy_version 863164 (0.0037) [2024-06-25 10:06:07,602][15401] Updated weights for policy 0, policy_version 863174 (0.0024) [2024-06-25 10:06:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14142275584. Throughput: 0: 43106.3. Samples: 14142356720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:06:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 10:06:11,280][15401] Updated weights for policy 0, policy_version 863184 (0.0035) [2024-06-25 10:06:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 14142472192. Throughput: 0: 42799.5. Samples: 14142605760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:06:13,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 10:06:15,082][15401] Updated weights for policy 0, policy_version 863194 (0.0029) [2024-06-25 10:06:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14142685184. Throughput: 0: 42897.4. Samples: 14142865040. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-06-25 10:06:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 10:06:18,817][15401] Updated weights for policy 0, policy_version 863204 (0.0036) [2024-06-25 10:06:22,844][15401] Updated weights for policy 0, policy_version 863214 (0.0032) [2024-06-25 10:06:23,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14142914560. Throughput: 0: 42925.7. Samples: 14142997940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:23,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 10:06:26,635][15401] Updated weights for policy 0, policy_version 863224 (0.0037) [2024-06-25 10:06:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14143127552. Throughput: 0: 42905.8. Samples: 14143249040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:28,394][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 10:06:30,472][15401] Updated weights for policy 0, policy_version 863234 (0.0036) [2024-06-25 10:06:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14143340544. Throughput: 0: 42731.1. Samples: 14143505520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 10:06:34,312][15401] Updated weights for policy 0, policy_version 863244 (0.0038) [2024-06-25 10:06:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14143537152. Throughput: 0: 42717.9. Samples: 14143633960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 10:06:38,603][15401] Updated weights for policy 0, policy_version 863254 (0.0033) [2024-06-25 10:06:42,046][15401] Updated weights for policy 0, policy_version 863264 (0.0021) [2024-06-25 10:06:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14143782912. Throughput: 0: 42571.4. Samples: 14143884340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:43,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 10:06:46,137][15401] Updated weights for policy 0, policy_version 863274 (0.0032) [2024-06-25 10:06:48,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 14143979520. Throughput: 0: 42541.8. Samples: 14144140160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:48,392][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 10:06:49,953][15401] Updated weights for policy 0, policy_version 863284 (0.0031) [2024-06-25 10:06:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 14144176128. Throughput: 0: 42532.4. Samples: 14144270680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 10:06:53,726][15401] Updated weights for policy 0, policy_version 863294 (0.0034) [2024-06-25 10:06:57,455][15401] Updated weights for policy 0, policy_version 863304 (0.0048) [2024-06-25 10:06:58,392][15132] Fps is (10 sec: 44236.7, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 14144421888. Throughput: 0: 42772.0. Samples: 14144530500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:06:58,393][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 10:07:01,362][15401] Updated weights for policy 0, policy_version 863314 (0.0034) [2024-06-25 10:07:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14144618496. Throughput: 0: 42711.5. Samples: 14144787060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 10:07:05,012][15401] Updated weights for policy 0, policy_version 863324 (0.0028) [2024-06-25 10:07:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14144831488. Throughput: 0: 42456.9. Samples: 14144908500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 10:07:09,603][15401] Updated weights for policy 0, policy_version 863334 (0.0024) [2024-06-25 10:07:10,869][15349] Signal inference workers to stop experience collection... (209500 times) [2024-06-25 10:07:10,870][15349] Signal inference workers to resume experience collection... (209500 times) [2024-06-25 10:07:10,904][15401] InferenceWorker_p0-w0: stopping experience collection (209500 times) [2024-06-25 10:07:10,908][15401] InferenceWorker_p0-w0: resuming experience collection (209500 times) [2024-06-25 10:07:12,451][15401] Updated weights for policy 0, policy_version 863344 (0.0031) [2024-06-25 10:07:13,391][15132] Fps is (10 sec: 45869.3, 60 sec: 43418.4, 300 sec: 42931.5). Total num frames: 14145077248. Throughput: 0: 42748.2. Samples: 14145172760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:13,391][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 10:07:17,116][15401] Updated weights for policy 0, policy_version 863354 (0.0038) [2024-06-25 10:07:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14145273856. Throughput: 0: 42803.3. Samples: 14145431660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 10:07:20,167][15401] Updated weights for policy 0, policy_version 863364 (0.0039) [2024-06-25 10:07:23,389][15132] Fps is (10 sec: 39326.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14145470464. Throughput: 0: 42749.4. Samples: 14145557680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:23,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 10:07:24,619][15401] Updated weights for policy 0, policy_version 863374 (0.0035) [2024-06-25 10:07:27,631][15401] Updated weights for policy 0, policy_version 863384 (0.0033) [2024-06-25 10:07:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14145716224. Throughput: 0: 42949.9. Samples: 14145817080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 10:07:32,175][15401] Updated weights for policy 0, policy_version 863394 (0.0028) [2024-06-25 10:07:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42821.5). Total num frames: 14145896448. Throughput: 0: 42993.9. Samples: 14146074780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 10:07:35,227][15401] Updated weights for policy 0, policy_version 863404 (0.0033) [2024-06-25 10:07:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14146109440. Throughput: 0: 42763.5. Samples: 14146195040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:38,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 10:07:39,810][15401] Updated weights for policy 0, policy_version 863414 (0.0027) [2024-06-25 10:07:42,701][15401] Updated weights for policy 0, policy_version 863424 (0.0024) [2024-06-25 10:07:43,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14146371584. Throughput: 0: 42826.3. Samples: 14146457580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:43,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 10:07:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863426_14146371584.pth... [2024-06-25 10:07:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000862797_14136066048.pth [2024-06-25 10:07:47,507][15401] Updated weights for policy 0, policy_version 863434 (0.0025) [2024-06-25 10:07:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42820.9). Total num frames: 14146535424. Throughput: 0: 42864.4. Samples: 14146715960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 10:07:50,317][15401] Updated weights for policy 0, policy_version 863444 (0.0045) [2024-06-25 10:07:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 14146764800. Throughput: 0: 42739.1. Samples: 14146831760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:53,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 10:07:55,376][15401] Updated weights for policy 0, policy_version 863454 (0.0029) [2024-06-25 10:07:58,062][15401] Updated weights for policy 0, policy_version 863464 (0.0034) [2024-06-25 10:07:58,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 14147010560. Throughput: 0: 42849.2. Samples: 14147100920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:07:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 10:08:03,004][15401] Updated weights for policy 0, policy_version 863474 (0.0033) [2024-06-25 10:08:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14147174400. Throughput: 0: 42834.6. Samples: 14147359220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:08:03,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 10:08:05,699][15401] Updated weights for policy 0, policy_version 863484 (0.0026) [2024-06-25 10:08:08,391][15132] Fps is (10 sec: 37678.8, 60 sec: 42597.6, 300 sec: 42709.3). Total num frames: 14147387392. Throughput: 0: 42667.7. Samples: 14147477780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-25 10:08:08,391][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 10:08:10,541][15401] Updated weights for policy 0, policy_version 863494 (0.0039) [2024-06-25 10:08:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42872.4, 300 sec: 42931.6). Total num frames: 14147649536. Throughput: 0: 42864.0. Samples: 14147745960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:13,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-25 10:08:13,403][15401] Updated weights for policy 0, policy_version 863504 (0.0036) [2024-06-25 10:08:17,978][15401] Updated weights for policy 0, policy_version 863514 (0.0024) [2024-06-25 10:08:18,389][15132] Fps is (10 sec: 44242.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14147829760. Throughput: 0: 42698.7. Samples: 14147996220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:18,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 10:08:20,219][15349] Signal inference workers to stop experience collection... (209550 times) [2024-06-25 10:08:20,220][15349] Signal inference workers to resume experience collection... (209550 times) [2024-06-25 10:08:20,238][15401] InferenceWorker_p0-w0: stopping experience collection (209550 times) [2024-06-25 10:08:20,238][15401] InferenceWorker_p0-w0: resuming experience collection (209550 times) [2024-06-25 10:08:21,020][15401] Updated weights for policy 0, policy_version 863524 (0.0039) [2024-06-25 10:08:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14148042752. Throughput: 0: 42769.0. Samples: 14148119640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 10:08:25,575][15401] Updated weights for policy 0, policy_version 863534 (0.0037) [2024-06-25 10:08:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14148272128. Throughput: 0: 42915.5. Samples: 14148388780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:28,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 10:08:28,762][15401] Updated weights for policy 0, policy_version 863544 (0.0036) [2024-06-25 10:08:33,242][15401] Updated weights for policy 0, policy_version 863554 (0.0028) [2024-06-25 10:08:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14148468736. Throughput: 0: 42828.3. Samples: 14148643240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 10:08:36,535][15401] Updated weights for policy 0, policy_version 863564 (0.0029) [2024-06-25 10:08:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14148698112. Throughput: 0: 42946.8. Samples: 14148764360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 10:08:40,779][15401] Updated weights for policy 0, policy_version 863574 (0.0027) [2024-06-25 10:08:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 14148894720. Throughput: 0: 42677.3. Samples: 14149021400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:43,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 10:08:44,529][15401] Updated weights for policy 0, policy_version 863584 (0.0038) [2024-06-25 10:08:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14149074944. Throughput: 0: 42392.5. Samples: 14149266880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:48,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 10:08:49,211][15401] Updated weights for policy 0, policy_version 863594 (0.0030) [2024-06-25 10:08:52,293][15401] Updated weights for policy 0, policy_version 863604 (0.0028) [2024-06-25 10:08:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14149337088. Throughput: 0: 42539.7. Samples: 14149392020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 10:08:56,798][15401] Updated weights for policy 0, policy_version 863614 (0.0026) [2024-06-25 10:08:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42820.5). Total num frames: 14149517312. Throughput: 0: 42228.0. Samples: 14149646220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:08:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 10:09:00,088][15401] Updated weights for policy 0, policy_version 863624 (0.0034) [2024-06-25 10:09:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14149713920. Throughput: 0: 42443.1. Samples: 14149906160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 10:09:04,438][15401] Updated weights for policy 0, policy_version 863634 (0.0038) [2024-06-25 10:09:07,589][15401] Updated weights for policy 0, policy_version 863644 (0.0034) [2024-06-25 10:09:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43145.3, 300 sec: 42820.9). Total num frames: 14149976064. Throughput: 0: 42519.4. Samples: 14150033020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:08,399][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 10:09:12,032][15401] Updated weights for policy 0, policy_version 863654 (0.0047) [2024-06-25 10:09:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42820.6). Total num frames: 14150156288. Throughput: 0: 42169.9. Samples: 14150286420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:09:15,382][15401] Updated weights for policy 0, policy_version 863664 (0.0032) [2024-06-25 10:09:18,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14150369280. Throughput: 0: 42195.2. Samples: 14150542020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 10:09:19,756][15401] Updated weights for policy 0, policy_version 863674 (0.0038) [2024-06-25 10:09:22,963][15401] Updated weights for policy 0, policy_version 863684 (0.0037) [2024-06-25 10:09:23,391][15132] Fps is (10 sec: 45867.2, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 14150615040. Throughput: 0: 42369.0. Samples: 14150671040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:23,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 10:09:27,347][15401] Updated weights for policy 0, policy_version 863694 (0.0036) [2024-06-25 10:09:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 14150795264. Throughput: 0: 42405.4. Samples: 14150929640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 10:09:29,832][15349] Signal inference workers to stop experience collection... (209600 times) [2024-06-25 10:09:29,881][15401] InferenceWorker_p0-w0: stopping experience collection (209600 times) [2024-06-25 10:09:29,890][15349] Signal inference workers to resume experience collection... (209600 times) [2024-06-25 10:09:29,898][15401] InferenceWorker_p0-w0: resuming experience collection (209600 times) [2024-06-25 10:09:30,526][15401] Updated weights for policy 0, policy_version 863704 (0.0039) [2024-06-25 10:09:33,390][15132] Fps is (10 sec: 40966.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14151024640. Throughput: 0: 42536.3. Samples: 14151181020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:33,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 10:09:35,156][15401] Updated weights for policy 0, policy_version 863714 (0.0042) [2024-06-25 10:09:38,340][15401] Updated weights for policy 0, policy_version 863724 (0.0040) [2024-06-25 10:09:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14151254016. Throughput: 0: 42597.7. Samples: 14151308920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:38,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 10:09:42,849][15401] Updated weights for policy 0, policy_version 863734 (0.0049) [2024-06-25 10:09:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14151450624. Throughput: 0: 42669.4. Samples: 14151566340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 10:09:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863736_14151450624.pth... [2024-06-25 10:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863110_14141194240.pth [2024-06-25 10:09:46,334][15401] Updated weights for policy 0, policy_version 863744 (0.0035) [2024-06-25 10:09:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14151647232. Throughput: 0: 42467.5. Samples: 14151817200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:48,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 10:09:50,682][15401] Updated weights for policy 0, policy_version 863754 (0.0041) [2024-06-25 10:09:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 14151860224. Throughput: 0: 42426.4. Samples: 14151942200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:53,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 10:09:54,043][15401] Updated weights for policy 0, policy_version 863764 (0.0030) [2024-06-25 10:09:58,287][15401] Updated weights for policy 0, policy_version 863774 (0.0028) [2024-06-25 10:09:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14152073216. Throughput: 0: 42630.3. Samples: 14152204780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:09:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 10:10:01,514][15401] Updated weights for policy 0, policy_version 863784 (0.0045) [2024-06-25 10:10:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 14152302592. Throughput: 0: 42466.2. Samples: 14152453100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-06-25 10:10:03,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 10:10:06,178][15401] Updated weights for policy 0, policy_version 863794 (0.0034) [2024-06-25 10:10:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14152515584. Throughput: 0: 42665.7. Samples: 14152590920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:08,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 10:10:09,021][15401] Updated weights for policy 0, policy_version 863804 (0.0030) [2024-06-25 10:10:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14152695808. Throughput: 0: 42503.1. Samples: 14152842280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 10:10:13,787][15401] Updated weights for policy 0, policy_version 863814 (0.0032) [2024-06-25 10:10:16,697][15401] Updated weights for policy 0, policy_version 863824 (0.0040) [2024-06-25 10:10:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14152941568. Throughput: 0: 42577.5. Samples: 14153097000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 10:10:21,438][15401] Updated weights for policy 0, policy_version 863834 (0.0034) [2024-06-25 10:10:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42053.5, 300 sec: 42653.9). Total num frames: 14153138176. Throughput: 0: 42640.6. Samples: 14153227740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 10:10:24,772][15401] Updated weights for policy 0, policy_version 863844 (0.0033) [2024-06-25 10:10:28,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14153334784. Throughput: 0: 42619.0. Samples: 14153484200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:28,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 10:10:29,014][15401] Updated weights for policy 0, policy_version 863854 (0.0032) [2024-06-25 10:10:32,088][15401] Updated weights for policy 0, policy_version 863864 (0.0041) [2024-06-25 10:10:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14153580544. Throughput: 0: 42687.1. Samples: 14153738120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:33,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 10:10:36,425][15401] Updated weights for policy 0, policy_version 863874 (0.0042) [2024-06-25 10:10:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14153793536. Throughput: 0: 42979.8. Samples: 14153876300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:10:39,613][15401] Updated weights for policy 0, policy_version 863884 (0.0026) [2024-06-25 10:10:43,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 14153990144. Throughput: 0: 42842.1. Samples: 14154132780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:43,393][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 10:10:43,920][15401] Updated weights for policy 0, policy_version 863894 (0.0039) [2024-06-25 10:10:47,066][15349] Signal inference workers to stop experience collection... (209650 times) [2024-06-25 10:10:47,066][15349] Signal inference workers to resume experience collection... (209650 times) [2024-06-25 10:10:47,079][15401] InferenceWorker_p0-w0: stopping experience collection (209650 times) [2024-06-25 10:10:47,112][15401] InferenceWorker_p0-w0: resuming experience collection (209650 times) [2024-06-25 10:10:47,208][15401] Updated weights for policy 0, policy_version 863904 (0.0029) [2024-06-25 10:10:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14154235904. Throughput: 0: 42976.9. Samples: 14154386960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 10:10:51,563][15401] Updated weights for policy 0, policy_version 863914 (0.0042) [2024-06-25 10:10:53,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14154448896. Throughput: 0: 42869.3. Samples: 14154520040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:53,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 10:10:55,292][15401] Updated weights for policy 0, policy_version 863924 (0.0032) [2024-06-25 10:10:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14154629120. Throughput: 0: 43011.6. Samples: 14154777800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:10:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 10:10:59,134][15401] Updated weights for policy 0, policy_version 863934 (0.0031) [2024-06-25 10:11:03,127][15401] Updated weights for policy 0, policy_version 863944 (0.0033) [2024-06-25 10:11:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14154874880. Throughput: 0: 42998.6. Samples: 14155031940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 10:11:06,587][15401] Updated weights for policy 0, policy_version 863954 (0.0028) [2024-06-25 10:11:08,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 14155104256. Throughput: 0: 42986.7. Samples: 14155162140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 10:11:10,577][15401] Updated weights for policy 0, policy_version 863964 (0.0030) [2024-06-25 10:11:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 14155284480. Throughput: 0: 42981.8. Samples: 14155418480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:13,393][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 10:11:14,749][15401] Updated weights for policy 0, policy_version 863974 (0.0033) [2024-06-25 10:11:18,058][15401] Updated weights for policy 0, policy_version 863984 (0.0036) [2024-06-25 10:11:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14155530240. Throughput: 0: 42874.7. Samples: 14155667480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 10:11:22,306][15401] Updated weights for policy 0, policy_version 863994 (0.0032) [2024-06-25 10:11:23,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14155726848. Throughput: 0: 42824.9. Samples: 14155803420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 10:11:25,898][15401] Updated weights for policy 0, policy_version 864004 (0.0028) [2024-06-25 10:11:28,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14155907072. Throughput: 0: 42677.7. Samples: 14156053180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:28,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 10:11:29,765][15401] Updated weights for policy 0, policy_version 864014 (0.0034) [2024-06-25 10:11:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14156152832. Throughput: 0: 42865.3. Samples: 14156315900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 10:11:33,437][15401] Updated weights for policy 0, policy_version 864024 (0.0029) [2024-06-25 10:11:37,613][15401] Updated weights for policy 0, policy_version 864034 (0.0027) [2024-06-25 10:11:38,390][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14156365824. Throughput: 0: 42837.8. Samples: 14156447740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 10:11:41,347][15401] Updated weights for policy 0, policy_version 864044 (0.0028) [2024-06-25 10:11:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42709.8). Total num frames: 14156578816. Throughput: 0: 42779.4. Samples: 14156702880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 10:11:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864049_14156578816.pth... [2024-06-25 10:11:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863426_14146371584.pth [2024-06-25 10:11:45,011][15401] Updated weights for policy 0, policy_version 864054 (0.0029) [2024-06-25 10:11:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14156791808. Throughput: 0: 42941.3. Samples: 14156964300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 10:11:48,794][15401] Updated weights for policy 0, policy_version 864064 (0.0037) [2024-06-25 10:11:52,430][15401] Updated weights for policy 0, policy_version 864074 (0.0027) [2024-06-25 10:11:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14157021184. Throughput: 0: 42975.0. Samples: 14157096020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-25 10:11:53,390][15132] Avg episode reward: [(0, '0.175')] [2024-06-25 10:11:56,255][15401] Updated weights for policy 0, policy_version 864084 (0.0037) [2024-06-25 10:11:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14157217792. Throughput: 0: 42865.9. Samples: 14157347340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:11:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 10:11:59,937][15401] Updated weights for policy 0, policy_version 864094 (0.0034) [2024-06-25 10:12:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14157447168. Throughput: 0: 43176.5. Samples: 14157610420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 10:12:03,801][15401] Updated weights for policy 0, policy_version 864104 (0.0029) [2024-06-25 10:12:07,569][15401] Updated weights for policy 0, policy_version 864114 (0.0027) [2024-06-25 10:12:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42654.1). Total num frames: 14157660160. Throughput: 0: 43024.0. Samples: 14157739500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 10:12:11,514][15401] Updated weights for policy 0, policy_version 864124 (0.0030) [2024-06-25 10:12:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 14157873152. Throughput: 0: 43182.8. Samples: 14157996400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:13,399][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 10:12:14,461][15349] Signal inference workers to stop experience collection... (209700 times) [2024-06-25 10:12:14,519][15401] InferenceWorker_p0-w0: stopping experience collection (209700 times) [2024-06-25 10:12:14,574][15349] Signal inference workers to resume experience collection... (209700 times) [2024-06-25 10:12:14,575][15401] InferenceWorker_p0-w0: resuming experience collection (209700 times) [2024-06-25 10:12:15,441][15401] Updated weights for policy 0, policy_version 864134 (0.0033) [2024-06-25 10:12:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14158086144. Throughput: 0: 43119.5. Samples: 14158256280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 10:12:19,099][15401] Updated weights for policy 0, policy_version 864144 (0.0031) [2024-06-25 10:12:23,292][15401] Updated weights for policy 0, policy_version 864154 (0.0029) [2024-06-25 10:12:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14158299136. Throughput: 0: 43047.1. Samples: 14158384860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:12:26,854][15401] Updated weights for policy 0, policy_version 864164 (0.0038) [2024-06-25 10:12:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 14158512128. Throughput: 0: 42845.9. Samples: 14158630940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 10:12:30,824][15401] Updated weights for policy 0, policy_version 864174 (0.0040) [2024-06-25 10:12:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14158741504. Throughput: 0: 42804.5. Samples: 14158890500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 10:12:34,948][15401] Updated weights for policy 0, policy_version 864184 (0.0032) [2024-06-25 10:12:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14158938112. Throughput: 0: 42818.7. Samples: 14159022860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 10:12:38,508][15401] Updated weights for policy 0, policy_version 864194 (0.0042) [2024-06-25 10:12:42,550][15401] Updated weights for policy 0, policy_version 864204 (0.0032) [2024-06-25 10:12:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14159151104. Throughput: 0: 42858.6. Samples: 14159275980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 10:12:46,653][15401] Updated weights for policy 0, policy_version 864214 (0.0028) [2024-06-25 10:12:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14159380480. Throughput: 0: 42679.2. Samples: 14159530980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 10:12:50,187][15401] Updated weights for policy 0, policy_version 864224 (0.0024) [2024-06-25 10:12:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14159577088. Throughput: 0: 42829.7. Samples: 14159666840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 10:12:53,860][15401] Updated weights for policy 0, policy_version 864234 (0.0047) [2024-06-25 10:12:57,650][15401] Updated weights for policy 0, policy_version 864244 (0.0029) [2024-06-25 10:12:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14159790080. Throughput: 0: 42911.6. Samples: 14159927420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:12:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 10:13:01,189][15401] Updated weights for policy 0, policy_version 864254 (0.0041) [2024-06-25 10:13:03,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 14160035840. Throughput: 0: 42891.2. Samples: 14160186380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:03,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 10:13:05,025][15401] Updated weights for policy 0, policy_version 864264 (0.0040) [2024-06-25 10:13:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14160232448. Throughput: 0: 42847.1. Samples: 14160312980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 10:13:08,673][15401] Updated weights for policy 0, policy_version 864274 (0.0043) [2024-06-25 10:13:12,546][15401] Updated weights for policy 0, policy_version 864284 (0.0039) [2024-06-25 10:13:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14160461824. Throughput: 0: 43234.1. Samples: 14160576480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:13,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 10:13:16,393][15401] Updated weights for policy 0, policy_version 864294 (0.0028) [2024-06-25 10:13:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 14160691200. Throughput: 0: 43181.2. Samples: 14160833760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:18,393][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 10:13:20,330][15401] Updated weights for policy 0, policy_version 864304 (0.0032) [2024-06-25 10:13:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14160887808. Throughput: 0: 43227.5. Samples: 14160968100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 10:13:23,898][15401] Updated weights for policy 0, policy_version 864314 (0.0040) [2024-06-25 10:13:27,742][15401] Updated weights for policy 0, policy_version 864324 (0.0045) [2024-06-25 10:13:28,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14161100800. Throughput: 0: 43338.8. Samples: 14161226220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:28,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 10:13:31,613][15401] Updated weights for policy 0, policy_version 864334 (0.0039) [2024-06-25 10:13:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14161330176. Throughput: 0: 43199.9. Samples: 14161474980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 10:13:35,530][15401] Updated weights for policy 0, policy_version 864344 (0.0038) [2024-06-25 10:13:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14161526784. Throughput: 0: 43252.6. Samples: 14161613200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 10:13:39,220][15401] Updated weights for policy 0, policy_version 864354 (0.0028) [2024-06-25 10:13:43,198][15401] Updated weights for policy 0, policy_version 864364 (0.0040) [2024-06-25 10:13:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14161739776. Throughput: 0: 43064.5. Samples: 14161865320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 10:13:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 10:13:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864364_14161739776.pth... [2024-06-25 10:13:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000863736_14151450624.pth [2024-06-25 10:13:47,003][15401] Updated weights for policy 0, policy_version 864374 (0.0038) [2024-06-25 10:13:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14161969152. Throughput: 0: 43106.2. Samples: 14162126160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:13:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 10:13:50,862][15401] Updated weights for policy 0, policy_version 864384 (0.0031) [2024-06-25 10:13:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14162165760. Throughput: 0: 43143.2. Samples: 14162254420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:13:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 10:13:54,484][15401] Updated weights for policy 0, policy_version 864394 (0.0047) [2024-06-25 10:13:58,355][15401] Updated weights for policy 0, policy_version 864404 (0.0030) [2024-06-25 10:13:58,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 14162395136. Throughput: 0: 42846.6. Samples: 14162504680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:13:58,392][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 10:14:02,501][15401] Updated weights for policy 0, policy_version 864414 (0.0044) [2024-06-25 10:14:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14162591744. Throughput: 0: 42794.7. Samples: 14162759420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:03,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 10:14:06,111][15401] Updated weights for policy 0, policy_version 864424 (0.0030) [2024-06-25 10:14:08,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14162788352. Throughput: 0: 42652.0. Samples: 14162887440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 10:14:10,006][15401] Updated weights for policy 0, policy_version 864434 (0.0032) [2024-06-25 10:14:10,030][15349] Signal inference workers to stop experience collection... (209750 times) [2024-06-25 10:14:10,031][15349] Signal inference workers to resume experience collection... (209750 times) [2024-06-25 10:14:10,071][15401] InferenceWorker_p0-w0: stopping experience collection (209750 times) [2024-06-25 10:14:10,071][15401] InferenceWorker_p0-w0: resuming experience collection (209750 times) [2024-06-25 10:14:13,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 14163034112. Throughput: 0: 42563.2. Samples: 14163141840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:13,397][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 10:14:13,794][15401] Updated weights for policy 0, policy_version 864444 (0.0037) [2024-06-25 10:14:17,857][15401] Updated weights for policy 0, policy_version 864454 (0.0031) [2024-06-25 10:14:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42600.1, 300 sec: 42820.8). Total num frames: 14163247104. Throughput: 0: 42758.7. Samples: 14163399120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 10:14:21,746][15401] Updated weights for policy 0, policy_version 864464 (0.0026) [2024-06-25 10:14:23,389][15132] Fps is (10 sec: 39347.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14163427328. Throughput: 0: 42519.6. Samples: 14163526580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:23,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 10:14:25,488][15401] Updated weights for policy 0, policy_version 864474 (0.0032) [2024-06-25 10:14:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14163673088. Throughput: 0: 42532.9. Samples: 14163779300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:28,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 10:14:29,523][15401] Updated weights for policy 0, policy_version 864484 (0.0028) [2024-06-25 10:14:33,332][15401] Updated weights for policy 0, policy_version 864494 (0.0036) [2024-06-25 10:14:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14163869696. Throughput: 0: 42480.5. Samples: 14164037780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 10:14:37,238][15401] Updated weights for policy 0, policy_version 864504 (0.0026) [2024-06-25 10:14:38,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 14164049920. Throughput: 0: 42406.6. Samples: 14164162720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:38,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 10:14:40,925][15401] Updated weights for policy 0, policy_version 864514 (0.0038) [2024-06-25 10:14:43,391][15132] Fps is (10 sec: 44229.7, 60 sec: 42870.3, 300 sec: 42931.4). Total num frames: 14164312064. Throughput: 0: 42487.5. Samples: 14164416580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:43,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 10:14:44,868][15401] Updated weights for policy 0, policy_version 864524 (0.0037) [2024-06-25 10:14:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 14164508672. Throughput: 0: 42599.1. Samples: 14164676380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 10:14:48,635][15401] Updated weights for policy 0, policy_version 864534 (0.0034) [2024-06-25 10:14:52,423][15401] Updated weights for policy 0, policy_version 864544 (0.0041) [2024-06-25 10:14:53,390][15132] Fps is (10 sec: 37689.0, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 14164688896. Throughput: 0: 42549.7. Samples: 14164802180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:53,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 10:14:56,111][15401] Updated weights for policy 0, policy_version 864554 (0.0046) [2024-06-25 10:14:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42876.4). Total num frames: 14164951040. Throughput: 0: 42568.3. Samples: 14165057140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:14:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 10:14:59,990][15401] Updated weights for policy 0, policy_version 864564 (0.0038) [2024-06-25 10:15:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14165131264. Throughput: 0: 42620.9. Samples: 14165317060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 10:15:03,712][15401] Updated weights for policy 0, policy_version 864574 (0.0036) [2024-06-25 10:15:07,958][15401] Updated weights for policy 0, policy_version 864584 (0.0031) [2024-06-25 10:15:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14165344256. Throughput: 0: 42587.9. Samples: 14165443040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:08,395][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 10:15:11,281][15401] Updated weights for policy 0, policy_version 864594 (0.0043) [2024-06-25 10:15:13,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42874.4, 300 sec: 42931.3). Total num frames: 14165606400. Throughput: 0: 42717.7. Samples: 14165701700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:13,392][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 10:15:15,338][15401] Updated weights for policy 0, policy_version 864604 (0.0023) [2024-06-25 10:15:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 14165786624. Throughput: 0: 42871.1. Samples: 14165966980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 10:15:18,948][15401] Updated weights for policy 0, policy_version 864614 (0.0043) [2024-06-25 10:15:22,850][15401] Updated weights for policy 0, policy_version 864624 (0.0036) [2024-06-25 10:15:23,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14165999616. Throughput: 0: 42676.0. Samples: 14166083140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 10:15:26,636][15401] Updated weights for policy 0, policy_version 864634 (0.0024) [2024-06-25 10:15:28,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14166261760. Throughput: 0: 42823.3. Samples: 14166343560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 10:15:30,313][15401] Updated weights for policy 0, policy_version 864644 (0.0051) [2024-06-25 10:15:32,573][15349] Signal inference workers to stop experience collection... (209800 times) [2024-06-25 10:15:32,573][15349] Signal inference workers to resume experience collection... (209800 times) [2024-06-25 10:15:32,613][15401] InferenceWorker_p0-w0: stopping experience collection (209800 times) [2024-06-25 10:15:32,614][15401] InferenceWorker_p0-w0: resuming experience collection (209800 times) [2024-06-25 10:15:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 14166425600. Throughput: 0: 42922.1. Samples: 14166607880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 10:15:34,385][15401] Updated weights for policy 0, policy_version 864654 (0.0036) [2024-06-25 10:15:37,724][15401] Updated weights for policy 0, policy_version 864664 (0.0028) [2024-06-25 10:15:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 14166654976. Throughput: 0: 42748.0. Samples: 14166725840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 10:15:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 10:15:41,879][15401] Updated weights for policy 0, policy_version 864674 (0.0032) [2024-06-25 10:15:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42599.5, 300 sec: 42820.6). Total num frames: 14166867968. Throughput: 0: 42954.2. Samples: 14166990080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:15:43,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 10:15:43,557][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864679_14166900736.pth... [2024-06-25 10:15:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864049_14156578816.pth [2024-06-25 10:15:45,347][15401] Updated weights for policy 0, policy_version 864684 (0.0038) [2024-06-25 10:15:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14167048192. Throughput: 0: 42914.6. Samples: 14167248220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:15:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 10:15:49,749][15401] Updated weights for policy 0, policy_version 864694 (0.0035) [2024-06-25 10:15:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14167293952. Throughput: 0: 42850.2. Samples: 14167371300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:15:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 10:15:53,459][15401] Updated weights for policy 0, policy_version 864704 (0.0041) [2024-06-25 10:15:57,394][15401] Updated weights for policy 0, policy_version 864714 (0.0023) [2024-06-25 10:15:58,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14167523328. Throughput: 0: 42912.0. Samples: 14167632640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:15:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 10:16:00,961][15401] Updated weights for policy 0, policy_version 864724 (0.0024) [2024-06-25 10:16:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14167719936. Throughput: 0: 42863.1. Samples: 14167895820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:03,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 10:16:05,200][15401] Updated weights for policy 0, policy_version 864734 (0.0038) [2024-06-25 10:16:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42932.0). Total num frames: 14167949312. Throughput: 0: 42973.4. Samples: 14168016940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 10:16:08,777][15401] Updated weights for policy 0, policy_version 864744 (0.0043) [2024-06-25 10:16:12,787][15401] Updated weights for policy 0, policy_version 864754 (0.0042) [2024-06-25 10:16:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14168162304. Throughput: 0: 43092.4. Samples: 14168282720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:13,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 10:16:16,978][15401] Updated weights for policy 0, policy_version 864764 (0.0037) [2024-06-25 10:16:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14168375296. Throughput: 0: 42792.1. Samples: 14168533520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:18,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 10:16:20,386][15401] Updated weights for policy 0, policy_version 864774 (0.0031) [2024-06-25 10:16:23,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.9, 300 sec: 42986.8). Total num frames: 14168588288. Throughput: 0: 42959.9. Samples: 14168659140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:23,392][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 10:16:24,457][15401] Updated weights for policy 0, policy_version 864784 (0.0031) [2024-06-25 10:16:27,863][15401] Updated weights for policy 0, policy_version 864794 (0.0029) [2024-06-25 10:16:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 14168801280. Throughput: 0: 42928.4. Samples: 14168921860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 10:16:31,979][15401] Updated weights for policy 0, policy_version 864804 (0.0026) [2024-06-25 10:16:33,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14169014272. Throughput: 0: 42862.7. Samples: 14169177040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 10:16:35,363][15401] Updated weights for policy 0, policy_version 864814 (0.0022) [2024-06-25 10:16:38,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 14169227264. Throughput: 0: 42942.8. Samples: 14169304000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:38,396][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 10:16:39,490][15401] Updated weights for policy 0, policy_version 864824 (0.0036) [2024-06-25 10:16:43,058][15401] Updated weights for policy 0, policy_version 864834 (0.0043) [2024-06-25 10:16:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14169456640. Throughput: 0: 43013.2. Samples: 14169568240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:43,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 10:16:47,316][15401] Updated weights for policy 0, policy_version 864844 (0.0035) [2024-06-25 10:16:48,389][15132] Fps is (10 sec: 40986.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14169636864. Throughput: 0: 42850.6. Samples: 14169824100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:48,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 10:16:50,603][15401] Updated weights for policy 0, policy_version 864854 (0.0027) [2024-06-25 10:16:51,947][15349] Signal inference workers to stop experience collection... (209850 times) [2024-06-25 10:16:51,947][15349] Signal inference workers to resume experience collection... (209850 times) [2024-06-25 10:16:51,965][15401] InferenceWorker_p0-w0: stopping experience collection (209850 times) [2024-06-25 10:16:51,965][15401] InferenceWorker_p0-w0: resuming experience collection (209850 times) [2024-06-25 10:16:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14169882624. Throughput: 0: 42891.1. Samples: 14169947040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:53,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 10:16:54,613][15401] Updated weights for policy 0, policy_version 864864 (0.0039) [2024-06-25 10:16:58,309][15401] Updated weights for policy 0, policy_version 864874 (0.0034) [2024-06-25 10:16:58,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 14170095616. Throughput: 0: 42858.1. Samples: 14170211440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:16:58,392][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 10:17:02,429][15401] Updated weights for policy 0, policy_version 864884 (0.0036) [2024-06-25 10:17:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14170292224. Throughput: 0: 42954.2. Samples: 14170466460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 10:17:06,025][15401] Updated weights for policy 0, policy_version 864894 (0.0040) [2024-06-25 10:17:08,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14170521600. Throughput: 0: 43025.9. Samples: 14170595200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 10:17:10,289][15401] Updated weights for policy 0, policy_version 864904 (0.0035) [2024-06-25 10:17:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14170734592. Throughput: 0: 43067.6. Samples: 14170859900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 10:17:13,607][15401] Updated weights for policy 0, policy_version 864914 (0.0033) [2024-06-25 10:17:17,622][15401] Updated weights for policy 0, policy_version 864924 (0.0023) [2024-06-25 10:17:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14170931200. Throughput: 0: 43066.5. Samples: 14171115040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 10:17:21,387][15401] Updated weights for policy 0, policy_version 864934 (0.0041) [2024-06-25 10:17:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 14171176960. Throughput: 0: 43112.3. Samples: 14171243780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:23,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 10:17:25,075][15401] Updated weights for policy 0, policy_version 864944 (0.0043) [2024-06-25 10:17:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14171357184. Throughput: 0: 42958.7. Samples: 14171501380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 10:17:28,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 10:17:29,267][15401] Updated weights for policy 0, policy_version 864954 (0.0036) [2024-06-25 10:17:32,853][15401] Updated weights for policy 0, policy_version 864964 (0.0045) [2024-06-25 10:17:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14171586560. Throughput: 0: 42898.6. Samples: 14171754540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:33,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 10:17:36,756][15401] Updated weights for policy 0, policy_version 864974 (0.0030) [2024-06-25 10:17:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43149.1, 300 sec: 42931.6). Total num frames: 14171815936. Throughput: 0: 43250.7. Samples: 14171893320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 10:17:40,306][15401] Updated weights for policy 0, policy_version 864984 (0.0044) [2024-06-25 10:17:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14171996160. Throughput: 0: 42876.0. Samples: 14172140760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:43,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 10:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864990_14171996160.pth... [2024-06-25 10:17:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864364_14161739776.pth [2024-06-25 10:17:44,281][15401] Updated weights for policy 0, policy_version 864994 (0.0026) [2024-06-25 10:17:47,763][15401] Updated weights for policy 0, policy_version 865004 (0.0036) [2024-06-25 10:17:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 14172241920. Throughput: 0: 42776.9. Samples: 14172391520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:48,392][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 10:17:52,086][15401] Updated weights for policy 0, policy_version 865014 (0.0027) [2024-06-25 10:17:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14172438528. Throughput: 0: 42875.1. Samples: 14172524580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 10:17:55,469][15401] Updated weights for policy 0, policy_version 865024 (0.0033) [2024-06-25 10:17:58,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 14172635136. Throughput: 0: 42475.6. Samples: 14172771300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:17:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 10:18:00,051][15401] Updated weights for policy 0, policy_version 865034 (0.0035) [2024-06-25 10:18:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14172864512. Throughput: 0: 42335.2. Samples: 14173020120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 10:18:03,478][15401] Updated weights for policy 0, policy_version 865044 (0.0034) [2024-06-25 10:18:07,845][15401] Updated weights for policy 0, policy_version 865054 (0.0039) [2024-06-25 10:18:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14173061120. Throughput: 0: 42464.8. Samples: 14173154700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:08,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 10:18:10,987][15401] Updated weights for policy 0, policy_version 865064 (0.0037) [2024-06-25 10:18:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 14173274112. Throughput: 0: 42378.7. Samples: 14173408420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 10:18:15,450][15401] Updated weights for policy 0, policy_version 865074 (0.0038) [2024-06-25 10:18:18,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 14173519872. Throughput: 0: 42371.6. Samples: 14173661260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 10:18:18,567][15401] Updated weights for policy 0, policy_version 865084 (0.0030) [2024-06-25 10:18:23,338][15401] Updated weights for policy 0, policy_version 865094 (0.0034) [2024-06-25 10:18:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14173700096. Throughput: 0: 42292.0. Samples: 14173796460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:18:26,077][15401] Updated weights for policy 0, policy_version 865104 (0.0041) [2024-06-25 10:18:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14173929472. Throughput: 0: 42387.2. Samples: 14174048180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 10:18:30,985][15401] Updated weights for policy 0, policy_version 865114 (0.0024) [2024-06-25 10:18:31,141][15349] Signal inference workers to stop experience collection... (209900 times) [2024-06-25 10:18:31,142][15349] Signal inference workers to resume experience collection... (209900 times) [2024-06-25 10:18:31,178][15401] InferenceWorker_p0-w0: stopping experience collection (209900 times) [2024-06-25 10:18:31,178][15401] InferenceWorker_p0-w0: resuming experience collection (209900 times) [2024-06-25 10:18:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14174158848. Throughput: 0: 42497.0. Samples: 14174303780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 10:18:33,656][15401] Updated weights for policy 0, policy_version 865124 (0.0036) [2024-06-25 10:18:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 14174322688. Throughput: 0: 42508.8. Samples: 14174437480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:38,393][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 10:18:38,659][15401] Updated weights for policy 0, policy_version 865134 (0.0039) [2024-06-25 10:18:41,394][15401] Updated weights for policy 0, policy_version 865144 (0.0042) [2024-06-25 10:18:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14174568448. Throughput: 0: 42563.5. Samples: 14174686660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 10:18:46,302][15401] Updated weights for policy 0, policy_version 865154 (0.0030) [2024-06-25 10:18:48,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42325.3, 300 sec: 42764.7). Total num frames: 14174781440. Throughput: 0: 42817.7. Samples: 14174947020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:48,393][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 10:18:49,385][15401] Updated weights for policy 0, policy_version 865164 (0.0037) [2024-06-25 10:18:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 14174978048. Throughput: 0: 42697.0. Samples: 14175076060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:53,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 10:18:53,785][15401] Updated weights for policy 0, policy_version 865174 (0.0036) [2024-06-25 10:18:56,843][15401] Updated weights for policy 0, policy_version 865184 (0.0028) [2024-06-25 10:18:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14175207424. Throughput: 0: 42642.9. Samples: 14175327340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:18:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 10:19:01,799][15401] Updated weights for policy 0, policy_version 865194 (0.0043) [2024-06-25 10:19:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14175436800. Throughput: 0: 42745.7. Samples: 14175584820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:19:03,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 10:19:04,536][15401] Updated weights for policy 0, policy_version 865204 (0.0034) [2024-06-25 10:19:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 14175617024. Throughput: 0: 42563.1. Samples: 14175711800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:19:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 10:19:09,658][15401] Updated weights for policy 0, policy_version 865214 (0.0040) [2024-06-25 10:19:12,564][15401] Updated weights for policy 0, policy_version 865224 (0.0035) [2024-06-25 10:19:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14175862784. Throughput: 0: 42516.8. Samples: 14175961440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:19:13,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 10:19:17,248][15401] Updated weights for policy 0, policy_version 865234 (0.0034) [2024-06-25 10:19:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14176059392. Throughput: 0: 42557.3. Samples: 14176218860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:19:18,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 10:19:20,249][15401] Updated weights for policy 0, policy_version 865244 (0.0038) [2024-06-25 10:19:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14176272384. Throughput: 0: 42361.3. Samples: 14176343740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:23,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 10:19:25,154][15401] Updated weights for policy 0, policy_version 865254 (0.0037) [2024-06-25 10:19:27,931][15401] Updated weights for policy 0, policy_version 865264 (0.0028) [2024-06-25 10:19:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14176501760. Throughput: 0: 42477.3. Samples: 14176598140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 10:19:32,774][15401] Updated weights for policy 0, policy_version 865274 (0.0032) [2024-06-25 10:19:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 14176681984. Throughput: 0: 42610.3. Samples: 14176864380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 10:19:35,477][15401] Updated weights for policy 0, policy_version 865284 (0.0033) [2024-06-25 10:19:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.7). Total num frames: 14176911360. Throughput: 0: 42400.9. Samples: 14176984100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:38,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 10:19:40,679][15401] Updated weights for policy 0, policy_version 865294 (0.0036) [2024-06-25 10:19:43,237][15401] Updated weights for policy 0, policy_version 865304 (0.0038) [2024-06-25 10:19:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14177140736. Throughput: 0: 42455.0. Samples: 14177237820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:43,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 10:19:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865304_14177140736.pth... [2024-06-25 10:19:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864679_14166900736.pth [2024-06-25 10:19:48,331][15401] Updated weights for policy 0, policy_version 865314 (0.0033) [2024-06-25 10:19:48,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 14177304576. Throughput: 0: 42505.9. Samples: 14177497580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 10:19:51,105][15401] Updated weights for policy 0, policy_version 865324 (0.0034) [2024-06-25 10:19:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14177550336. Throughput: 0: 42337.3. Samples: 14177616980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:53,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 10:19:55,777][15401] Updated weights for policy 0, policy_version 865334 (0.0039) [2024-06-25 10:19:58,393][15132] Fps is (10 sec: 47495.3, 60 sec: 42868.7, 300 sec: 42875.6). Total num frames: 14177779712. Throughput: 0: 42510.8. Samples: 14177874580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:19:58,394][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 10:19:58,681][15401] Updated weights for policy 0, policy_version 865344 (0.0055) [2024-06-25 10:20:03,394][15132] Fps is (10 sec: 39305.3, 60 sec: 41776.3, 300 sec: 42708.9). Total num frames: 14177943552. Throughput: 0: 42776.4. Samples: 14178143980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:03,394][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 10:20:03,556][15401] Updated weights for policy 0, policy_version 865354 (0.0039) [2024-06-25 10:20:04,144][15349] Signal inference workers to stop experience collection... (209950 times) [2024-06-25 10:20:04,190][15349] Signal inference workers to resume experience collection... (209950 times) [2024-06-25 10:20:04,195][15401] InferenceWorker_p0-w0: stopping experience collection (209950 times) [2024-06-25 10:20:04,208][15401] InferenceWorker_p0-w0: resuming experience collection (209950 times) [2024-06-25 10:20:06,152][15401] Updated weights for policy 0, policy_version 865364 (0.0033) [2024-06-25 10:20:08,389][15132] Fps is (10 sec: 40975.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 14178189312. Throughput: 0: 42613.0. Samples: 14178261320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:08,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 10:20:10,936][15401] Updated weights for policy 0, policy_version 865374 (0.0028) [2024-06-25 10:20:13,389][15132] Fps is (10 sec: 49172.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14178435072. Throughput: 0: 42749.0. Samples: 14178521840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 10:20:13,607][15401] Updated weights for policy 0, policy_version 865384 (0.0038) [2024-06-25 10:20:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14178598912. Throughput: 0: 42718.2. Samples: 14178786700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 10:20:18,617][15401] Updated weights for policy 0, policy_version 865394 (0.0048) [2024-06-25 10:20:21,194][15401] Updated weights for policy 0, policy_version 865404 (0.0034) [2024-06-25 10:20:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 14178828288. Throughput: 0: 42698.6. Samples: 14178905640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:23,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 10:20:26,279][15401] Updated weights for policy 0, policy_version 865414 (0.0037) [2024-06-25 10:20:28,390][15132] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14179090432. Throughput: 0: 42955.2. Samples: 14179170800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:28,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 10:20:28,817][15401] Updated weights for policy 0, policy_version 865424 (0.0044) [2024-06-25 10:20:33,391][15132] Fps is (10 sec: 42602.7, 60 sec: 42870.4, 300 sec: 42709.3). Total num frames: 14179254272. Throughput: 0: 43124.7. Samples: 14179438260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:33,391][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 10:20:33,665][15401] Updated weights for policy 0, policy_version 865434 (0.0031) [2024-06-25 10:20:36,515][15401] Updated weights for policy 0, policy_version 865444 (0.0037) [2024-06-25 10:20:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14179483648. Throughput: 0: 43233.4. Samples: 14179562480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 10:20:41,261][15401] Updated weights for policy 0, policy_version 865454 (0.0037) [2024-06-25 10:20:43,390][15132] Fps is (10 sec: 45881.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14179713024. Throughput: 0: 43229.4. Samples: 14179819740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:43,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 10:20:44,300][15401] Updated weights for policy 0, policy_version 865464 (0.0049) [2024-06-25 10:20:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14179893248. Throughput: 0: 43293.4. Samples: 14180092000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:48,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 10:20:48,749][15401] Updated weights for policy 0, policy_version 865474 (0.0042) [2024-06-25 10:20:51,725][15401] Updated weights for policy 0, policy_version 865484 (0.0033) [2024-06-25 10:20:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14180122624. Throughput: 0: 43376.7. Samples: 14180213280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 10:20:56,373][15401] Updated weights for policy 0, policy_version 865494 (0.0029) [2024-06-25 10:20:58,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43147.2, 300 sec: 42876.1). Total num frames: 14180368384. Throughput: 0: 43257.2. Samples: 14180468420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:20:58,392][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 10:20:59,375][15401] Updated weights for policy 0, policy_version 865504 (0.0039) [2024-06-25 10:21:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43147.5, 300 sec: 42653.9). Total num frames: 14180532224. Throughput: 0: 43448.0. Samples: 14180741860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:21:03,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 10:21:03,898][15401] Updated weights for policy 0, policy_version 865514 (0.0029) [2024-06-25 10:21:06,963][15401] Updated weights for policy 0, policy_version 865524 (0.0031) [2024-06-25 10:21:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14180777984. Throughput: 0: 43399.6. Samples: 14180858520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:21:08,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 10:21:11,039][15349] Signal inference workers to stop experience collection... (210000 times) [2024-06-25 10:21:11,040][15349] Signal inference workers to resume experience collection... (210000 times) [2024-06-25 10:21:11,063][15401] InferenceWorker_p0-w0: stopping experience collection (210000 times) [2024-06-25 10:21:11,063][15401] InferenceWorker_p0-w0: resuming experience collection (210000 times) [2024-06-25 10:21:11,514][15401] Updated weights for policy 0, policy_version 865534 (0.0043) [2024-06-25 10:21:13,389][15132] Fps is (10 sec: 49151.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14181023744. Throughput: 0: 43304.4. Samples: 14181119500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 10:21:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 10:21:14,726][15401] Updated weights for policy 0, policy_version 865544 (0.0047) [2024-06-25 10:21:18,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 14181171200. Throughput: 0: 43165.9. Samples: 14181380660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 10:21:19,278][15401] Updated weights for policy 0, policy_version 865554 (0.0033) [2024-06-25 10:21:22,548][15401] Updated weights for policy 0, policy_version 865564 (0.0031) [2024-06-25 10:21:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 14181433344. Throughput: 0: 43078.0. Samples: 14181501000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 10:21:26,870][15401] Updated weights for policy 0, policy_version 865574 (0.0032) [2024-06-25 10:21:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14181646336. Throughput: 0: 43133.3. Samples: 14181760740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:28,391][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 10:21:30,153][15401] Updated weights for policy 0, policy_version 865584 (0.0030) [2024-06-25 10:21:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42872.4, 300 sec: 42710.4). Total num frames: 14181826560. Throughput: 0: 42993.2. Samples: 14182026700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 10:21:34,307][15401] Updated weights for policy 0, policy_version 865594 (0.0030) [2024-06-25 10:21:37,714][15401] Updated weights for policy 0, policy_version 865604 (0.0025) [2024-06-25 10:21:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 14182088704. Throughput: 0: 43060.5. Samples: 14182151000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 10:21:41,832][15401] Updated weights for policy 0, policy_version 865614 (0.0028) [2024-06-25 10:21:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14182301696. Throughput: 0: 43121.8. Samples: 14182408900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 10:21:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865619_14182301696.pth... [2024-06-25 10:21:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000864990_14171996160.pth [2024-06-25 10:21:45,391][15401] Updated weights for policy 0, policy_version 865624 (0.0031) [2024-06-25 10:21:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14182498304. Throughput: 0: 42862.6. Samples: 14182670680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:48,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 10:21:49,259][15401] Updated weights for policy 0, policy_version 865634 (0.0031) [2024-06-25 10:21:52,855][15401] Updated weights for policy 0, policy_version 865644 (0.0033) [2024-06-25 10:21:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 14182727680. Throughput: 0: 43132.9. Samples: 14182799500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 10:21:56,889][15401] Updated weights for policy 0, policy_version 865654 (0.0038) [2024-06-25 10:21:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14182940672. Throughput: 0: 42937.4. Samples: 14183051680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:21:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 10:22:00,433][15401] Updated weights for policy 0, policy_version 865664 (0.0043) [2024-06-25 10:22:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14183137280. Throughput: 0: 42979.4. Samples: 14183314740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 10:22:04,438][15401] Updated weights for policy 0, policy_version 865674 (0.0035) [2024-06-25 10:22:08,230][15401] Updated weights for policy 0, policy_version 865684 (0.0040) [2024-06-25 10:22:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14183366656. Throughput: 0: 43092.1. Samples: 14183440140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:08,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 10:22:11,957][15401] Updated weights for policy 0, policy_version 865694 (0.0041) [2024-06-25 10:22:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 14183596032. Throughput: 0: 43033.5. Samples: 14183697240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:13,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 10:22:16,146][15401] Updated weights for policy 0, policy_version 865704 (0.0035) [2024-06-25 10:22:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 14183792640. Throughput: 0: 42748.1. Samples: 14183950360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 10:22:19,782][15401] Updated weights for policy 0, policy_version 865714 (0.0036) [2024-06-25 10:22:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14184005632. Throughput: 0: 42816.4. Samples: 14184077740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 10:22:23,744][15401] Updated weights for policy 0, policy_version 865724 (0.0033) [2024-06-25 10:22:27,843][15401] Updated weights for policy 0, policy_version 865734 (0.0033) [2024-06-25 10:22:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14184218624. Throughput: 0: 42727.6. Samples: 14184331640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 10:22:31,111][15349] Signal inference workers to stop experience collection... (210050 times) [2024-06-25 10:22:31,112][15349] Signal inference workers to resume experience collection... (210050 times) [2024-06-25 10:22:31,160][15401] InferenceWorker_p0-w0: stopping experience collection (210050 times) [2024-06-25 10:22:31,160][15401] InferenceWorker_p0-w0: resuming experience collection (210050 times) [2024-06-25 10:22:31,588][15401] Updated weights for policy 0, policy_version 865744 (0.0030) [2024-06-25 10:22:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14184415232. Throughput: 0: 42714.6. Samples: 14184592840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 10:22:35,488][15401] Updated weights for policy 0, policy_version 865754 (0.0040) [2024-06-25 10:22:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14184628224. Throughput: 0: 42644.1. Samples: 14184718480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 10:22:39,316][15401] Updated weights for policy 0, policy_version 865764 (0.0034) [2024-06-25 10:22:43,165][15401] Updated weights for policy 0, policy_version 865774 (0.0029) [2024-06-25 10:22:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 14184841216. Throughput: 0: 42691.1. Samples: 14184972780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 10:22:47,095][15401] Updated weights for policy 0, policy_version 865784 (0.0026) [2024-06-25 10:22:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14185054208. Throughput: 0: 42480.2. Samples: 14185226340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 10:22:50,844][15401] Updated weights for policy 0, policy_version 865794 (0.0034) [2024-06-25 10:22:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14185267200. Throughput: 0: 42560.1. Samples: 14185355340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 10:22:54,870][15401] Updated weights for policy 0, policy_version 865804 (0.0031) [2024-06-25 10:22:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14185463808. Throughput: 0: 42363.6. Samples: 14185603600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:22:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 10:22:58,883][15401] Updated weights for policy 0, policy_version 865814 (0.0034) [2024-06-25 10:23:02,482][15401] Updated weights for policy 0, policy_version 865824 (0.0029) [2024-06-25 10:23:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14185693184. Throughput: 0: 42508.3. Samples: 14185863240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 10:23:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 10:23:06,548][15401] Updated weights for policy 0, policy_version 865834 (0.0031) [2024-06-25 10:23:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14185922560. Throughput: 0: 42601.0. Samples: 14185994780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 10:23:10,153][15401] Updated weights for policy 0, policy_version 865844 (0.0040) [2024-06-25 10:23:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.1, 300 sec: 42709.5). Total num frames: 14186119168. Throughput: 0: 42406.1. Samples: 14186239920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 10:23:13,982][15401] Updated weights for policy 0, policy_version 865854 (0.0038) [2024-06-25 10:23:17,930][15401] Updated weights for policy 0, policy_version 865864 (0.0027) [2024-06-25 10:23:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14186332160. Throughput: 0: 42527.2. Samples: 14186506560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 10:23:21,636][15401] Updated weights for policy 0, policy_version 865874 (0.0029) [2024-06-25 10:23:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14186561536. Throughput: 0: 42540.4. Samples: 14186632800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 10:23:25,484][15401] Updated weights for policy 0, policy_version 865884 (0.0028) [2024-06-25 10:23:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14186758144. Throughput: 0: 42358.1. Samples: 14186878900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 10:23:29,164][15401] Updated weights for policy 0, policy_version 865894 (0.0027) [2024-06-25 10:23:33,221][15401] Updated weights for policy 0, policy_version 865904 (0.0044) [2024-06-25 10:23:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14186987520. Throughput: 0: 42767.9. Samples: 14187150900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 10:23:36,810][15401] Updated weights for policy 0, policy_version 865914 (0.0045) [2024-06-25 10:23:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14187200512. Throughput: 0: 42728.4. Samples: 14187278120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:38,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 10:23:39,464][15349] Signal inference workers to stop experience collection... (210100 times) [2024-06-25 10:23:39,465][15349] Signal inference workers to resume experience collection... (210100 times) [2024-06-25 10:23:39,507][15401] InferenceWorker_p0-w0: stopping experience collection (210100 times) [2024-06-25 10:23:39,507][15401] InferenceWorker_p0-w0: resuming experience collection (210100 times) [2024-06-25 10:23:40,768][15401] Updated weights for policy 0, policy_version 865924 (0.0025) [2024-06-25 10:23:43,397][15132] Fps is (10 sec: 42568.5, 60 sec: 42866.4, 300 sec: 42819.9). Total num frames: 14187413504. Throughput: 0: 42732.3. Samples: 14187526860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:43,397][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 10:23:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865931_14187413504.pth... [2024-06-25 10:23:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865304_14177140736.pth [2024-06-25 10:23:45,004][15401] Updated weights for policy 0, policy_version 865934 (0.0022) [2024-06-25 10:23:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14187610112. Throughput: 0: 42661.4. Samples: 14187783000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 10:23:48,838][15401] Updated weights for policy 0, policy_version 865944 (0.0046) [2024-06-25 10:23:52,595][15401] Updated weights for policy 0, policy_version 865954 (0.0036) [2024-06-25 10:23:53,389][15132] Fps is (10 sec: 42629.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14187839488. Throughput: 0: 42440.1. Samples: 14187904580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 10:23:56,393][15401] Updated weights for policy 0, policy_version 865964 (0.0039) [2024-06-25 10:23:58,396][15132] Fps is (10 sec: 44208.7, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 14188052480. Throughput: 0: 42753.1. Samples: 14188164080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:23:58,396][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 10:24:00,561][15401] Updated weights for policy 0, policy_version 865974 (0.0039) [2024-06-25 10:24:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14188249088. Throughput: 0: 42474.7. Samples: 14188417920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:03,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 10:24:04,127][15401] Updated weights for policy 0, policy_version 865984 (0.0033) [2024-06-25 10:24:08,046][15401] Updated weights for policy 0, policy_version 865994 (0.0040) [2024-06-25 10:24:08,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14188462080. Throughput: 0: 42416.8. Samples: 14188541560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:08,390][15132] Avg episode reward: [(0, '0.203')] [2024-06-25 10:24:11,559][15401] Updated weights for policy 0, policy_version 866004 (0.0026) [2024-06-25 10:24:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14188707840. Throughput: 0: 42774.6. Samples: 14188803760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:13,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-25 10:24:15,532][15401] Updated weights for policy 0, policy_version 866014 (0.0041) [2024-06-25 10:24:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14188904448. Throughput: 0: 42585.4. Samples: 14189067240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:18,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 10:24:19,299][15401] Updated weights for policy 0, policy_version 866024 (0.0030) [2024-06-25 10:24:23,003][15401] Updated weights for policy 0, policy_version 866034 (0.0027) [2024-06-25 10:24:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14189101056. Throughput: 0: 42448.7. Samples: 14189188320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 10:24:26,897][15401] Updated weights for policy 0, policy_version 866044 (0.0042) [2024-06-25 10:24:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14189346816. Throughput: 0: 42612.0. Samples: 14189444100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:28,399][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 10:24:30,822][15401] Updated weights for policy 0, policy_version 866054 (0.0035) [2024-06-25 10:24:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14189543424. Throughput: 0: 42680.5. Samples: 14189703620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:33,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 10:24:34,573][15401] Updated weights for policy 0, policy_version 866064 (0.0038) [2024-06-25 10:24:38,376][15401] Updated weights for policy 0, policy_version 866074 (0.0033) [2024-06-25 10:24:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14189756416. Throughput: 0: 42704.3. Samples: 14189826280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 10:24:42,042][15401] Updated weights for policy 0, policy_version 866084 (0.0034) [2024-06-25 10:24:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43149.6, 300 sec: 43042.7). Total num frames: 14190002176. Throughput: 0: 42759.9. Samples: 14190088000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 10:24:46,154][15401] Updated weights for policy 0, policy_version 866094 (0.0026) [2024-06-25 10:24:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14190182400. Throughput: 0: 42847.7. Samples: 14190346060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 10:24:49,968][15401] Updated weights for policy 0, policy_version 866104 (0.0040) [2024-06-25 10:24:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.2, 300 sec: 42710.0). Total num frames: 14190379008. Throughput: 0: 42866.7. Samples: 14190470560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 10:24:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 10:24:53,704][15401] Updated weights for policy 0, policy_version 866114 (0.0041) [2024-06-25 10:24:57,645][15401] Updated weights for policy 0, policy_version 866124 (0.0027) [2024-06-25 10:24:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42987.8). Total num frames: 14190624768. Throughput: 0: 42785.9. Samples: 14190729120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:24:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 10:25:01,715][15401] Updated weights for policy 0, policy_version 866134 (0.0032) [2024-06-25 10:25:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14190804992. Throughput: 0: 42584.0. Samples: 14190983520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:03,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 10:25:05,302][15401] Updated weights for policy 0, policy_version 866144 (0.0028) [2024-06-25 10:25:05,854][15349] Signal inference workers to stop experience collection... (210150 times) [2024-06-25 10:25:05,897][15401] InferenceWorker_p0-w0: stopping experience collection (210150 times) [2024-06-25 10:25:05,906][15349] Signal inference workers to resume experience collection... (210150 times) [2024-06-25 10:25:05,918][15401] InferenceWorker_p0-w0: resuming experience collection (210150 times) [2024-06-25 10:25:08,394][15132] Fps is (10 sec: 39305.4, 60 sec: 42595.6, 300 sec: 42653.4). Total num frames: 14191017984. Throughput: 0: 42516.8. Samples: 14191101740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:08,394][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 10:25:09,326][15401] Updated weights for policy 0, policy_version 866154 (0.0035) [2024-06-25 10:25:12,818][15401] Updated weights for policy 0, policy_version 866164 (0.0031) [2024-06-25 10:25:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 14191263744. Throughput: 0: 42779.2. Samples: 14191369160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 10:25:16,932][15401] Updated weights for policy 0, policy_version 866174 (0.0028) [2024-06-25 10:25:18,389][15132] Fps is (10 sec: 44254.6, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14191460352. Throughput: 0: 42703.1. Samples: 14191625260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:18,398][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 10:25:20,278][15401] Updated weights for policy 0, policy_version 866184 (0.0031) [2024-06-25 10:25:23,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14191656960. Throughput: 0: 42770.1. Samples: 14191750940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 10:25:24,473][15401] Updated weights for policy 0, policy_version 866194 (0.0037) [2024-06-25 10:25:27,976][15401] Updated weights for policy 0, policy_version 866204 (0.0040) [2024-06-25 10:25:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 14191902720. Throughput: 0: 42749.3. Samples: 14192011720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:28,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 10:25:32,198][15401] Updated weights for policy 0, policy_version 866214 (0.0043) [2024-06-25 10:25:33,391][15132] Fps is (10 sec: 45870.2, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 14192115712. Throughput: 0: 42656.1. Samples: 14192265640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:33,391][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 10:25:35,456][15401] Updated weights for policy 0, policy_version 866224 (0.0034) [2024-06-25 10:25:38,394][15132] Fps is (10 sec: 40940.0, 60 sec: 42594.9, 300 sec: 42708.8). Total num frames: 14192312320. Throughput: 0: 42756.7. Samples: 14192394820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:38,395][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 10:25:39,799][15401] Updated weights for policy 0, policy_version 866234 (0.0033) [2024-06-25 10:25:43,174][15401] Updated weights for policy 0, policy_version 866244 (0.0024) [2024-06-25 10:25:43,389][15132] Fps is (10 sec: 44242.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14192558080. Throughput: 0: 42932.4. Samples: 14192661080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 10:25:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866246_14192574464.pth... [2024-06-25 10:25:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865619_14182301696.pth [2024-06-25 10:25:47,397][15401] Updated weights for policy 0, policy_version 866254 (0.0040) [2024-06-25 10:25:48,389][15132] Fps is (10 sec: 45897.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14192771072. Throughput: 0: 42993.0. Samples: 14192918200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:48,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 10:25:50,670][15401] Updated weights for policy 0, policy_version 866264 (0.0026) [2024-06-25 10:25:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14192967680. Throughput: 0: 43190.9. Samples: 14193045160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:25:55,000][15401] Updated weights for policy 0, policy_version 866274 (0.0035) [2024-06-25 10:25:58,198][15401] Updated weights for policy 0, policy_version 866284 (0.0036) [2024-06-25 10:25:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14193213440. Throughput: 0: 43133.8. Samples: 14193310180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:25:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 10:26:02,597][15401] Updated weights for policy 0, policy_version 866294 (0.0033) [2024-06-25 10:26:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14193393664. Throughput: 0: 43196.1. Samples: 14193569080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 10:26:05,741][15401] Updated weights for policy 0, policy_version 866304 (0.0023) [2024-06-25 10:26:08,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42874.4, 300 sec: 42598.4). Total num frames: 14193590272. Throughput: 0: 43150.0. Samples: 14193692680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 10:26:10,183][15401] Updated weights for policy 0, policy_version 866314 (0.0032) [2024-06-25 10:26:13,182][15401] Updated weights for policy 0, policy_version 866324 (0.0030) [2024-06-25 10:26:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 14193868800. Throughput: 0: 43300.4. Samples: 14193960240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 10:26:17,807][15401] Updated weights for policy 0, policy_version 866334 (0.0037) [2024-06-25 10:26:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14194049024. Throughput: 0: 43329.2. Samples: 14194215400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:18,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 10:26:20,724][15401] Updated weights for policy 0, policy_version 866344 (0.0030) [2024-06-25 10:26:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14194245632. Throughput: 0: 43235.3. Samples: 14194340200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 10:26:25,302][15401] Updated weights for policy 0, policy_version 866354 (0.0033) [2024-06-25 10:26:27,620][15349] Signal inference workers to stop experience collection... (210200 times) [2024-06-25 10:26:27,620][15349] Signal inference workers to resume experience collection... (210200 times) [2024-06-25 10:26:27,651][15401] InferenceWorker_p0-w0: stopping experience collection (210200 times) [2024-06-25 10:26:27,652][15401] InferenceWorker_p0-w0: resuming experience collection (210200 times) [2024-06-25 10:26:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 14194475008. Throughput: 0: 43016.2. Samples: 14194596820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:28,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 10:26:29,076][15401] Updated weights for policy 0, policy_version 866364 (0.0026) [2024-06-25 10:26:33,317][15401] Updated weights for policy 0, policy_version 866374 (0.0041) [2024-06-25 10:26:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42599.3, 300 sec: 42654.0). Total num frames: 14194671616. Throughput: 0: 43133.8. Samples: 14194859220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:33,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 10:26:36,539][15401] Updated weights for policy 0, policy_version 866384 (0.0032) [2024-06-25 10:26:38,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43146.3, 300 sec: 42709.1). Total num frames: 14194900992. Throughput: 0: 43024.5. Samples: 14194981360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:38,393][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 10:26:40,756][15401] Updated weights for policy 0, policy_version 866394 (0.0035) [2024-06-25 10:26:43,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14195130368. Throughput: 0: 43074.9. Samples: 14195248660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 10:26:43,393][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 10:26:43,943][15401] Updated weights for policy 0, policy_version 866404 (0.0036) [2024-06-25 10:26:48,267][15401] Updated weights for policy 0, policy_version 866414 (0.0028) [2024-06-25 10:26:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14195326976. Throughput: 0: 43095.4. Samples: 14195508380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:26:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 10:26:51,348][15401] Updated weights for policy 0, policy_version 866424 (0.0028) [2024-06-25 10:26:53,389][15132] Fps is (10 sec: 42609.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14195556352. Throughput: 0: 43136.4. Samples: 14195633820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:26:53,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 10:26:55,969][15401] Updated weights for policy 0, policy_version 866434 (0.0030) [2024-06-25 10:26:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.2, 300 sec: 42820.6). Total num frames: 14195769344. Throughput: 0: 43015.5. Samples: 14195895940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:26:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 10:26:58,850][15401] Updated weights for policy 0, policy_version 866444 (0.0030) [2024-06-25 10:27:03,385][15401] Updated weights for policy 0, policy_version 866454 (0.0033) [2024-06-25 10:27:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14195982336. Throughput: 0: 43129.3. Samples: 14196156220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 10:27:06,515][15401] Updated weights for policy 0, policy_version 866464 (0.0028) [2024-06-25 10:27:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 14196211712. Throughput: 0: 43038.1. Samples: 14196276920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 10:27:11,228][15401] Updated weights for policy 0, policy_version 866474 (0.0033) [2024-06-25 10:27:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14196424704. Throughput: 0: 43138.7. Samples: 14196538060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:27:14,314][15401] Updated weights for policy 0, policy_version 866484 (0.0035) [2024-06-25 10:27:18,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14196604928. Throughput: 0: 43181.8. Samples: 14196802400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:18,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 10:27:18,925][15401] Updated weights for policy 0, policy_version 866494 (0.0036) [2024-06-25 10:27:22,107][15401] Updated weights for policy 0, policy_version 866504 (0.0032) [2024-06-25 10:27:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14196834304. Throughput: 0: 43067.6. Samples: 14196919300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 10:27:26,455][15401] Updated weights for policy 0, policy_version 866514 (0.0034) [2024-06-25 10:27:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14197063680. Throughput: 0: 42929.5. Samples: 14197180380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 10:27:29,951][15401] Updated weights for policy 0, policy_version 866524 (0.0041) [2024-06-25 10:27:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14197260288. Throughput: 0: 42939.7. Samples: 14197440660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 10:27:34,009][15401] Updated weights for policy 0, policy_version 866534 (0.0030) [2024-06-25 10:27:37,773][15401] Updated weights for policy 0, policy_version 866544 (0.0045) [2024-06-25 10:27:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 14197473280. Throughput: 0: 42711.5. Samples: 14197555840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:38,392][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 10:27:42,054][15401] Updated weights for policy 0, policy_version 866554 (0.0033) [2024-06-25 10:27:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 14197719040. Throughput: 0: 42831.6. Samples: 14197823360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:43,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 10:27:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866560_14197719040.pth... [2024-06-25 10:27:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000865931_14187413504.pth [2024-06-25 10:27:45,471][15401] Updated weights for policy 0, policy_version 866564 (0.0031) [2024-06-25 10:27:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14197899264. Throughput: 0: 42652.5. Samples: 14198075580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:48,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:27:49,899][15401] Updated weights for policy 0, policy_version 866574 (0.0039) [2024-06-25 10:27:52,942][15401] Updated weights for policy 0, policy_version 866584 (0.0034) [2024-06-25 10:27:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14198112256. Throughput: 0: 42717.4. Samples: 14198199200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 10:27:57,439][15401] Updated weights for policy 0, policy_version 866594 (0.0032) [2024-06-25 10:27:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14198308864. Throughput: 0: 42744.5. Samples: 14198461560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:27:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 10:28:00,977][15401] Updated weights for policy 0, policy_version 866604 (0.0034) [2024-06-25 10:28:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14198538240. Throughput: 0: 42423.6. Samples: 14198711460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 10:28:05,153][15401] Updated weights for policy 0, policy_version 866614 (0.0038) [2024-06-25 10:28:07,264][15349] Signal inference workers to stop experience collection... (210250 times) [2024-06-25 10:28:07,265][15349] Signal inference workers to resume experience collection... (210250 times) [2024-06-25 10:28:07,279][15401] InferenceWorker_p0-w0: stopping experience collection (210250 times) [2024-06-25 10:28:07,280][15401] InferenceWorker_p0-w0: resuming experience collection (210250 times) [2024-06-25 10:28:08,390][15132] Fps is (10 sec: 44235.1, 60 sec: 42325.1, 300 sec: 42820.5). Total num frames: 14198751232. Throughput: 0: 42561.4. Samples: 14198834580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 10:28:08,523][15401] Updated weights for policy 0, policy_version 866624 (0.0042) [2024-06-25 10:28:12,737][15401] Updated weights for policy 0, policy_version 866634 (0.0034) [2024-06-25 10:28:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14198964224. Throughput: 0: 42655.6. Samples: 14199099880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 10:28:16,071][15401] Updated weights for policy 0, policy_version 866644 (0.0032) [2024-06-25 10:28:18,390][15132] Fps is (10 sec: 42600.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14199177216. Throughput: 0: 42355.5. Samples: 14199346660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 10:28:20,267][15401] Updated weights for policy 0, policy_version 866654 (0.0037) [2024-06-25 10:28:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14199406592. Throughput: 0: 42672.1. Samples: 14199476080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 10:28:23,686][15401] Updated weights for policy 0, policy_version 866664 (0.0031) [2024-06-25 10:28:27,761][15401] Updated weights for policy 0, policy_version 866674 (0.0044) [2024-06-25 10:28:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14199603200. Throughput: 0: 42636.0. Samples: 14199741980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 10:28:31,360][15401] Updated weights for policy 0, policy_version 866684 (0.0042) [2024-06-25 10:28:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14199816192. Throughput: 0: 42724.9. Samples: 14199998200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 10:28:35,225][15401] Updated weights for policy 0, policy_version 866694 (0.0036) [2024-06-25 10:28:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42821.6). Total num frames: 14200045568. Throughput: 0: 42837.1. Samples: 14200126860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 10:28:38,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 10:28:39,140][15401] Updated weights for policy 0, policy_version 866704 (0.0041) [2024-06-25 10:28:42,795][15401] Updated weights for policy 0, policy_version 866714 (0.0040) [2024-06-25 10:28:43,392][15132] Fps is (10 sec: 44227.8, 60 sec: 42323.9, 300 sec: 42875.8). Total num frames: 14200258560. Throughput: 0: 42772.3. Samples: 14200386400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:28:43,392][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 10:28:46,885][15401] Updated weights for policy 0, policy_version 866724 (0.0023) [2024-06-25 10:28:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14200471552. Throughput: 0: 42846.1. Samples: 14200639540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:28:48,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 10:28:50,549][15401] Updated weights for policy 0, policy_version 866734 (0.0031) [2024-06-25 10:28:53,389][15132] Fps is (10 sec: 44246.0, 60 sec: 43144.6, 300 sec: 42877.0). Total num frames: 14200700928. Throughput: 0: 42991.1. Samples: 14200769160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:28:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 10:28:54,464][15401] Updated weights for policy 0, policy_version 866744 (0.0033) [2024-06-25 10:28:58,143][15401] Updated weights for policy 0, policy_version 866754 (0.0032) [2024-06-25 10:28:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14200897536. Throughput: 0: 42871.2. Samples: 14201029080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:28:58,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 10:29:01,883][15401] Updated weights for policy 0, policy_version 866764 (0.0030) [2024-06-25 10:29:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14201110528. Throughput: 0: 43204.0. Samples: 14201290840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:03,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 10:29:05,877][15401] Updated weights for policy 0, policy_version 866774 (0.0044) [2024-06-25 10:29:08,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.7, 300 sec: 42820.5). Total num frames: 14201339904. Throughput: 0: 43068.2. Samples: 14201414160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 10:29:09,440][15401] Updated weights for policy 0, policy_version 866784 (0.0037) [2024-06-25 10:29:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14201536512. Throughput: 0: 42939.5. Samples: 14201674260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:13,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 10:29:13,585][15401] Updated weights for policy 0, policy_version 866794 (0.0037) [2024-06-25 10:29:16,933][15401] Updated weights for policy 0, policy_version 866804 (0.0041) [2024-06-25 10:29:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14201749504. Throughput: 0: 42900.8. Samples: 14201928740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:18,402][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 10:29:21,076][15401] Updated weights for policy 0, policy_version 866814 (0.0032) [2024-06-25 10:29:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14201978880. Throughput: 0: 42841.5. Samples: 14202054740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 10:29:25,121][15401] Updated weights for policy 0, policy_version 866824 (0.0028) [2024-06-25 10:29:26,387][15349] Signal inference workers to stop experience collection... (210300 times) [2024-06-25 10:29:26,388][15349] Signal inference workers to resume experience collection... (210300 times) [2024-06-25 10:29:26,419][15401] InferenceWorker_p0-w0: stopping experience collection (210300 times) [2024-06-25 10:29:26,420][15401] InferenceWorker_p0-w0: resuming experience collection (210300 times) [2024-06-25 10:29:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14202191872. Throughput: 0: 42802.3. Samples: 14202312420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 10:29:28,483][15401] Updated weights for policy 0, policy_version 866834 (0.0043) [2024-06-25 10:29:32,583][15401] Updated weights for policy 0, policy_version 866844 (0.0036) [2024-06-25 10:29:33,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14202388480. Throughput: 0: 42929.4. Samples: 14202571360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 10:29:36,203][15401] Updated weights for policy 0, policy_version 866854 (0.0037) [2024-06-25 10:29:38,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14202617856. Throughput: 0: 42877.9. Samples: 14202698660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 10:29:40,158][15401] Updated weights for policy 0, policy_version 866864 (0.0041) [2024-06-25 10:29:43,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42872.8, 300 sec: 42876.1). Total num frames: 14202830848. Throughput: 0: 42971.7. Samples: 14202962820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 10:29:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866872_14202830848.pth... [2024-06-25 10:29:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866246_14192574464.pth [2024-06-25 10:29:43,847][15401] Updated weights for policy 0, policy_version 866874 (0.0035) [2024-06-25 10:29:47,598][15401] Updated weights for policy 0, policy_version 866884 (0.0027) [2024-06-25 10:29:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14203043840. Throughput: 0: 42717.8. Samples: 14203213140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 10:29:51,380][15401] Updated weights for policy 0, policy_version 866894 (0.0032) [2024-06-25 10:29:53,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14203256832. Throughput: 0: 42913.5. Samples: 14203345260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:29:55,207][15401] Updated weights for policy 0, policy_version 866904 (0.0034) [2024-06-25 10:29:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14203486208. Throughput: 0: 42910.3. Samples: 14203605220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:29:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 10:29:58,882][15401] Updated weights for policy 0, policy_version 866914 (0.0032) [2024-06-25 10:30:02,713][15401] Updated weights for policy 0, policy_version 866924 (0.0035) [2024-06-25 10:30:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42987.8). Total num frames: 14203699200. Throughput: 0: 42838.8. Samples: 14203856480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:03,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 10:30:06,728][15401] Updated weights for policy 0, policy_version 866934 (0.0034) [2024-06-25 10:30:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14203912192. Throughput: 0: 42975.7. Samples: 14203988640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:08,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 10:30:10,377][15401] Updated weights for policy 0, policy_version 866944 (0.0047) [2024-06-25 10:30:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14204108800. Throughput: 0: 42882.2. Samples: 14204242120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:13,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 10:30:14,301][15401] Updated weights for policy 0, policy_version 866954 (0.0042) [2024-06-25 10:30:18,243][15401] Updated weights for policy 0, policy_version 866964 (0.0047) [2024-06-25 10:30:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14204338176. Throughput: 0: 42584.0. Samples: 14204487640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:18,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 10:30:22,170][15401] Updated weights for policy 0, policy_version 866974 (0.0031) [2024-06-25 10:30:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14204551168. Throughput: 0: 42734.0. Samples: 14204621700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 10:30:25,796][15401] Updated weights for policy 0, policy_version 866984 (0.0050) [2024-06-25 10:30:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42820.7). Total num frames: 14204747776. Throughput: 0: 42578.0. Samples: 14204878820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 10:30:28,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 10:30:29,835][15401] Updated weights for policy 0, policy_version 866994 (0.0035) [2024-06-25 10:30:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42932.4). Total num frames: 14204977152. Throughput: 0: 42596.5. Samples: 14205129980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 10:30:34,030][15401] Updated weights for policy 0, policy_version 867004 (0.0029) [2024-06-25 10:30:37,411][15401] Updated weights for policy 0, policy_version 867014 (0.0035) [2024-06-25 10:30:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14205190144. Throughput: 0: 42697.8. Samples: 14205266660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 10:30:41,688][15401] Updated weights for policy 0, policy_version 867024 (0.0030) [2024-06-25 10:30:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14205386752. Throughput: 0: 42510.6. Samples: 14205518200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 10:30:44,920][15401] Updated weights for policy 0, policy_version 867034 (0.0043) [2024-06-25 10:30:46,054][15349] Signal inference workers to stop experience collection... (210350 times) [2024-06-25 10:30:46,106][15401] InferenceWorker_p0-w0: stopping experience collection (210350 times) [2024-06-25 10:30:46,112][15349] Signal inference workers to resume experience collection... (210350 times) [2024-06-25 10:30:46,120][15401] InferenceWorker_p0-w0: resuming experience collection (210350 times) [2024-06-25 10:30:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14205599744. Throughput: 0: 42602.6. Samples: 14205773600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:48,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 10:30:49,344][15401] Updated weights for policy 0, policy_version 867044 (0.0031) [2024-06-25 10:30:53,270][15401] Updated weights for policy 0, policy_version 867054 (0.0036) [2024-06-25 10:30:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 14205812736. Throughput: 0: 42613.8. Samples: 14205906260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:53,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 10:30:56,752][15401] Updated weights for policy 0, policy_version 867064 (0.0035) [2024-06-25 10:30:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14206042112. Throughput: 0: 42537.9. Samples: 14206156320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:30:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 10:31:00,893][15401] Updated weights for policy 0, policy_version 867074 (0.0038) [2024-06-25 10:31:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 14206255104. Throughput: 0: 42844.8. Samples: 14206415660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 10:31:04,504][15401] Updated weights for policy 0, policy_version 867084 (0.0042) [2024-06-25 10:31:08,281][15401] Updated weights for policy 0, policy_version 867094 (0.0030) [2024-06-25 10:31:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14206468096. Throughput: 0: 42802.8. Samples: 14206547820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 10:31:11,977][15401] Updated weights for policy 0, policy_version 867104 (0.0036) [2024-06-25 10:31:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14206681088. Throughput: 0: 42752.9. Samples: 14206802700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:13,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 10:31:15,941][15401] Updated weights for policy 0, policy_version 867114 (0.0035) [2024-06-25 10:31:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14206894080. Throughput: 0: 42917.2. Samples: 14207061260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 10:31:19,505][15401] Updated weights for policy 0, policy_version 867124 (0.0039) [2024-06-25 10:31:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14207090688. Throughput: 0: 42610.7. Samples: 14207184140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:31:23,651][15401] Updated weights for policy 0, policy_version 867134 (0.0034) [2024-06-25 10:31:27,215][15401] Updated weights for policy 0, policy_version 867144 (0.0039) [2024-06-25 10:31:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14207336448. Throughput: 0: 42705.3. Samples: 14207439940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 10:31:31,304][15401] Updated weights for policy 0, policy_version 867154 (0.0033) [2024-06-25 10:31:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14207533056. Throughput: 0: 42693.9. Samples: 14207694820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 10:31:34,936][15401] Updated weights for policy 0, policy_version 867164 (0.0031) [2024-06-25 10:31:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 14207746048. Throughput: 0: 42579.1. Samples: 14207822320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 10:31:38,809][15401] Updated weights for policy 0, policy_version 867174 (0.0025) [2024-06-25 10:31:42,678][15401] Updated weights for policy 0, policy_version 867184 (0.0032) [2024-06-25 10:31:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 14207959040. Throughput: 0: 42754.6. Samples: 14208080380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:43,393][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 10:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867185_14207959040.pth... [2024-06-25 10:31:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866560_14197719040.pth [2024-06-25 10:31:46,739][15401] Updated weights for policy 0, policy_version 867194 (0.0036) [2024-06-25 10:31:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 14208155648. Throughput: 0: 42699.5. Samples: 14208337140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 10:31:50,330][15401] Updated weights for policy 0, policy_version 867204 (0.0023) [2024-06-25 10:31:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14208368640. Throughput: 0: 42535.4. Samples: 14208461920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 10:31:54,222][15401] Updated weights for policy 0, policy_version 867214 (0.0033) [2024-06-25 10:31:58,358][15401] Updated weights for policy 0, policy_version 867224 (0.0043) [2024-06-25 10:31:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14208598016. Throughput: 0: 42638.2. Samples: 14208721420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:31:58,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 10:32:01,769][15401] Updated weights for policy 0, policy_version 867234 (0.0033) [2024-06-25 10:32:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14208794624. Throughput: 0: 42506.2. Samples: 14208974040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:32:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 10:32:06,085][15401] Updated weights for policy 0, policy_version 867244 (0.0037) [2024-06-25 10:32:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14209007616. Throughput: 0: 42584.7. Samples: 14209100460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:32:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 10:32:09,962][15401] Updated weights for policy 0, policy_version 867254 (0.0029) [2024-06-25 10:32:11,723][15349] Signal inference workers to stop experience collection... (210400 times) [2024-06-25 10:32:11,767][15401] InferenceWorker_p0-w0: stopping experience collection (210400 times) [2024-06-25 10:32:11,777][15349] Signal inference workers to resume experience collection... (210400 times) [2024-06-25 10:32:11,783][15401] InferenceWorker_p0-w0: resuming experience collection (210400 times) [2024-06-25 10:32:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14209220608. Throughput: 0: 42511.5. Samples: 14209352960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:32:13,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 10:32:13,642][15401] Updated weights for policy 0, policy_version 867264 (0.0036) [2024-06-25 10:32:17,642][15401] Updated weights for policy 0, policy_version 867274 (0.0027) [2024-06-25 10:32:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14209433600. Throughput: 0: 42512.8. Samples: 14209607900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-25 10:32:18,390][15132] Avg episode reward: [(0, '0.210')] [2024-06-25 10:32:21,601][15401] Updated weights for policy 0, policy_version 867284 (0.0037) [2024-06-25 10:32:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14209646592. Throughput: 0: 42578.2. Samples: 14209738340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 10:32:25,159][15401] Updated weights for policy 0, policy_version 867294 (0.0031) [2024-06-25 10:32:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 14209859584. Throughput: 0: 42427.1. Samples: 14209989500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:28,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 10:32:29,175][15401] Updated weights for policy 0, policy_version 867304 (0.0039) [2024-06-25 10:32:32,909][15401] Updated weights for policy 0, policy_version 867314 (0.0031) [2024-06-25 10:32:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 14210072576. Throughput: 0: 42300.6. Samples: 14210240760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:33,393][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 10:32:37,141][15401] Updated weights for policy 0, policy_version 867324 (0.0027) [2024-06-25 10:32:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14210285568. Throughput: 0: 42317.5. Samples: 14210366200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 10:32:40,719][15401] Updated weights for policy 0, policy_version 867334 (0.0033) [2024-06-25 10:32:43,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14210498560. Throughput: 0: 42304.5. Samples: 14210625120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 10:32:44,733][15401] Updated weights for policy 0, policy_version 867344 (0.0041) [2024-06-25 10:32:48,310][15401] Updated weights for policy 0, policy_version 867354 (0.0042) [2024-06-25 10:32:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 14210727936. Throughput: 0: 42452.6. Samples: 14210884400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 10:32:52,243][15401] Updated weights for policy 0, policy_version 867364 (0.0027) [2024-06-25 10:32:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14210940928. Throughput: 0: 42529.1. Samples: 14211014260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:53,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-25 10:32:55,955][15401] Updated weights for policy 0, policy_version 867374 (0.0036) [2024-06-25 10:32:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14211137536. Throughput: 0: 42528.6. Samples: 14211266740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:32:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 10:32:59,774][15401] Updated weights for policy 0, policy_version 867384 (0.0039) [2024-06-25 10:33:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14211350528. Throughput: 0: 42631.5. Samples: 14211526320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:03,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 10:33:03,999][15401] Updated weights for policy 0, policy_version 867394 (0.0033) [2024-06-25 10:33:07,525][15401] Updated weights for policy 0, policy_version 867404 (0.0037) [2024-06-25 10:33:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14211579904. Throughput: 0: 42545.3. Samples: 14211652880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 10:33:11,567][15401] Updated weights for policy 0, policy_version 867414 (0.0043) [2024-06-25 10:33:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14211776512. Throughput: 0: 42671.6. Samples: 14211909720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 10:33:15,363][15401] Updated weights for policy 0, policy_version 867424 (0.0027) [2024-06-25 10:33:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14211989504. Throughput: 0: 42727.2. Samples: 14212163380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 10:33:19,377][15401] Updated weights for policy 0, policy_version 867434 (0.0041) [2024-06-25 10:33:23,018][15401] Updated weights for policy 0, policy_version 867444 (0.0043) [2024-06-25 10:33:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14212235264. Throughput: 0: 42811.1. Samples: 14212292700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:23,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 10:33:26,907][15401] Updated weights for policy 0, policy_version 867454 (0.0033) [2024-06-25 10:33:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14212415488. Throughput: 0: 42817.0. Samples: 14212551880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 10:33:30,736][15401] Updated weights for policy 0, policy_version 867464 (0.0037) [2024-06-25 10:33:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 14212644864. Throughput: 0: 42544.0. Samples: 14212798880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:33,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 10:33:34,690][15401] Updated weights for policy 0, policy_version 867474 (0.0032) [2024-06-25 10:33:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 14212841472. Throughput: 0: 42550.6. Samples: 14212929040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 10:33:38,480][15401] Updated weights for policy 0, policy_version 867484 (0.0033) [2024-06-25 10:33:39,100][15349] Signal inference workers to stop experience collection... (210450 times) [2024-06-25 10:33:39,100][15349] Signal inference workers to resume experience collection... (210450 times) [2024-06-25 10:33:39,118][15401] InferenceWorker_p0-w0: stopping experience collection (210450 times) [2024-06-25 10:33:39,118][15401] InferenceWorker_p0-w0: resuming experience collection (210450 times) [2024-06-25 10:33:42,555][15401] Updated weights for policy 0, policy_version 867494 (0.0027) [2024-06-25 10:33:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14213054464. Throughput: 0: 42800.8. Samples: 14213192780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:43,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 10:33:43,446][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867497_14213070848.pth... [2024-06-25 10:33:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000866872_14202830848.pth [2024-06-25 10:33:45,877][15401] Updated weights for policy 0, policy_version 867504 (0.0039) [2024-06-25 10:33:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14213283840. Throughput: 0: 42522.2. Samples: 14213439820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:48,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 10:33:50,451][15401] Updated weights for policy 0, policy_version 867514 (0.0027) [2024-06-25 10:33:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14213496832. Throughput: 0: 42657.8. Samples: 14213572480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:53,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 10:33:53,634][15401] Updated weights for policy 0, policy_version 867524 (0.0028) [2024-06-25 10:33:58,006][15401] Updated weights for policy 0, policy_version 867534 (0.0031) [2024-06-25 10:33:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14213709824. Throughput: 0: 42770.7. Samples: 14213834400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:33:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 10:34:01,082][15401] Updated weights for policy 0, policy_version 867544 (0.0036) [2024-06-25 10:34:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14213922816. Throughput: 0: 42838.1. Samples: 14214091100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:34:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 10:34:05,509][15401] Updated weights for policy 0, policy_version 867554 (0.0035) [2024-06-25 10:34:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14214135808. Throughput: 0: 42873.8. Samples: 14214222020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:34:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 10:34:08,741][15401] Updated weights for policy 0, policy_version 867564 (0.0039) [2024-06-25 10:34:12,894][15401] Updated weights for policy 0, policy_version 867574 (0.0032) [2024-06-25 10:34:13,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14214348800. Throughput: 0: 42828.3. Samples: 14214479160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 10:34:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 10:34:16,271][15401] Updated weights for policy 0, policy_version 867584 (0.0027) [2024-06-25 10:34:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14214561792. Throughput: 0: 42991.5. Samples: 14214733500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 10:34:20,829][15401] Updated weights for policy 0, policy_version 867594 (0.0037) [2024-06-25 10:34:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 14214774784. Throughput: 0: 43005.8. Samples: 14214864300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:23,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 10:34:23,837][15401] Updated weights for policy 0, policy_version 867604 (0.0038) [2024-06-25 10:34:28,194][15401] Updated weights for policy 0, policy_version 867614 (0.0028) [2024-06-25 10:34:28,391][15132] Fps is (10 sec: 42590.2, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 14214987776. Throughput: 0: 43025.9. Samples: 14215129020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:28,392][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 10:34:31,434][15401] Updated weights for policy 0, policy_version 867624 (0.0043) [2024-06-25 10:34:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14215217152. Throughput: 0: 43122.3. Samples: 14215380320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 10:34:35,624][15401] Updated weights for policy 0, policy_version 867634 (0.0037) [2024-06-25 10:34:38,389][15132] Fps is (10 sec: 42606.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14215413760. Throughput: 0: 43061.9. Samples: 14215510260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:38,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 10:34:39,100][15401] Updated weights for policy 0, policy_version 867644 (0.0043) [2024-06-25 10:34:43,331][15401] Updated weights for policy 0, policy_version 867654 (0.0032) [2024-06-25 10:34:43,394][15132] Fps is (10 sec: 42578.3, 60 sec: 43141.2, 300 sec: 42708.8). Total num frames: 14215643136. Throughput: 0: 43012.4. Samples: 14215770160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:43,395][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 10:34:46,830][15401] Updated weights for policy 0, policy_version 867664 (0.0045) [2024-06-25 10:34:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14215856128. Throughput: 0: 42864.6. Samples: 14216020000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 10:34:51,182][15401] Updated weights for policy 0, policy_version 867674 (0.0033) [2024-06-25 10:34:53,390][15132] Fps is (10 sec: 40979.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14216052736. Throughput: 0: 42972.7. Samples: 14216155800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:53,404][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 10:34:54,227][15401] Updated weights for policy 0, policy_version 867684 (0.0039) [2024-06-25 10:34:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14216282112. Throughput: 0: 42945.7. Samples: 14216411720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:34:58,399][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 10:34:58,619][15401] Updated weights for policy 0, policy_version 867694 (0.0034) [2024-06-25 10:34:59,883][15349] Signal inference workers to stop experience collection... (210500 times) [2024-06-25 10:34:59,935][15401] InferenceWorker_p0-w0: stopping experience collection (210500 times) [2024-06-25 10:34:59,944][15349] Signal inference workers to resume experience collection... (210500 times) [2024-06-25 10:34:59,950][15401] InferenceWorker_p0-w0: resuming experience collection (210500 times) [2024-06-25 10:35:02,185][15401] Updated weights for policy 0, policy_version 867704 (0.0043) [2024-06-25 10:35:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 14216511488. Throughput: 0: 42841.7. Samples: 14216661480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:03,393][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 10:35:06,270][15401] Updated weights for policy 0, policy_version 867714 (0.0037) [2024-06-25 10:35:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14216691712. Throughput: 0: 42860.5. Samples: 14216793020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:08,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 10:35:09,842][15401] Updated weights for policy 0, policy_version 867724 (0.0038) [2024-06-25 10:35:13,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14216904704. Throughput: 0: 42608.4. Samples: 14217046320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 10:35:14,403][15401] Updated weights for policy 0, policy_version 867734 (0.0037) [2024-06-25 10:35:17,621][15401] Updated weights for policy 0, policy_version 867744 (0.0025) [2024-06-25 10:35:18,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43142.7, 300 sec: 42709.1). Total num frames: 14217150464. Throughput: 0: 42558.2. Samples: 14217295540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:18,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 10:35:22,028][15401] Updated weights for policy 0, policy_version 867754 (0.0029) [2024-06-25 10:35:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14217347072. Throughput: 0: 42660.4. Samples: 14217429980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:23,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 10:35:25,245][15401] Updated weights for policy 0, policy_version 867764 (0.0028) [2024-06-25 10:35:28,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 14217543680. Throughput: 0: 42544.6. Samples: 14217684460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:28,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 10:35:29,547][15401] Updated weights for policy 0, policy_version 867774 (0.0040) [2024-06-25 10:35:32,974][15401] Updated weights for policy 0, policy_version 867784 (0.0034) [2024-06-25 10:35:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14217805824. Throughput: 0: 42557.7. Samples: 14217935100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:33,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 10:35:37,485][15401] Updated weights for policy 0, policy_version 867794 (0.0033) [2024-06-25 10:35:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14217986048. Throughput: 0: 42598.7. Samples: 14218072740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:38,392][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 10:35:40,532][15401] Updated weights for policy 0, policy_version 867804 (0.0028) [2024-06-25 10:35:43,391][15132] Fps is (10 sec: 37676.7, 60 sec: 42327.4, 300 sec: 42653.7). Total num frames: 14218182656. Throughput: 0: 42453.0. Samples: 14218322180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:43,392][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 10:35:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867809_14218182656.pth... [2024-06-25 10:35:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867185_14207959040.pth [2024-06-25 10:35:44,958][15401] Updated weights for policy 0, policy_version 867814 (0.0041) [2024-06-25 10:35:48,226][15401] Updated weights for policy 0, policy_version 867824 (0.0041) [2024-06-25 10:35:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14218428416. Throughput: 0: 42578.3. Samples: 14218577400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 10:35:52,547][15401] Updated weights for policy 0, policy_version 867834 (0.0020) [2024-06-25 10:35:53,390][15132] Fps is (10 sec: 44244.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14218625024. Throughput: 0: 42596.8. Samples: 14218709880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 10:35:55,886][15401] Updated weights for policy 0, policy_version 867844 (0.0035) [2024-06-25 10:35:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14218838016. Throughput: 0: 42723.7. Samples: 14218968880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:35:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 10:36:00,158][15401] Updated weights for policy 0, policy_version 867854 (0.0031) [2024-06-25 10:36:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42598.4, 300 sec: 42709.1). Total num frames: 14219067392. Throughput: 0: 42836.0. Samples: 14219223160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 10:36:03,392][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 10:36:03,569][15401] Updated weights for policy 0, policy_version 867864 (0.0044) [2024-06-25 10:36:07,535][15401] Updated weights for policy 0, policy_version 867874 (0.0040) [2024-06-25 10:36:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14219280384. Throughput: 0: 42813.3. Samples: 14219356580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 10:36:11,042][15401] Updated weights for policy 0, policy_version 867884 (0.0031) [2024-06-25 10:36:13,377][15349] Signal inference workers to stop experience collection... (210550 times) [2024-06-25 10:36:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14219476992. Throughput: 0: 42940.8. Samples: 14219616800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 10:36:13,422][15401] InferenceWorker_p0-w0: stopping experience collection (210550 times) [2024-06-25 10:36:13,456][15349] Signal inference workers to resume experience collection... (210550 times) [2024-06-25 10:36:13,456][15401] InferenceWorker_p0-w0: resuming experience collection (210550 times) [2024-06-25 10:36:15,293][15401] Updated weights for policy 0, policy_version 867894 (0.0031) [2024-06-25 10:36:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 14219722752. Throughput: 0: 42943.7. Samples: 14219867560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 10:36:18,573][15401] Updated weights for policy 0, policy_version 867904 (0.0034) [2024-06-25 10:36:23,015][15401] Updated weights for policy 0, policy_version 867914 (0.0030) [2024-06-25 10:36:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14219902976. Throughput: 0: 42835.6. Samples: 14220000340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 10:36:26,099][15401] Updated weights for policy 0, policy_version 867924 (0.0047) [2024-06-25 10:36:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14220132352. Throughput: 0: 42907.9. Samples: 14220252960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 10:36:30,501][15401] Updated weights for policy 0, policy_version 867934 (0.0031) [2024-06-25 10:36:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14220345344. Throughput: 0: 42828.5. Samples: 14220504680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 10:36:33,898][15401] Updated weights for policy 0, policy_version 867944 (0.0036) [2024-06-25 10:36:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 14220541952. Throughput: 0: 42885.9. Samples: 14220639740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 10:36:38,596][15401] Updated weights for policy 0, policy_version 867954 (0.0026) [2024-06-25 10:36:41,646][15401] Updated weights for policy 0, policy_version 867964 (0.0034) [2024-06-25 10:36:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42872.7, 300 sec: 42709.5). Total num frames: 14220754944. Throughput: 0: 42714.1. Samples: 14220891020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:43,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 10:36:46,037][15401] Updated weights for policy 0, policy_version 867974 (0.0028) [2024-06-25 10:36:48,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 14220984320. Throughput: 0: 42809.1. Samples: 14221149740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:48,397][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 10:36:49,404][15401] Updated weights for policy 0, policy_version 867984 (0.0036) [2024-06-25 10:36:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14221197312. Throughput: 0: 42854.1. Samples: 14221285020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 10:36:53,533][15401] Updated weights for policy 0, policy_version 867994 (0.0044) [2024-06-25 10:36:57,010][15401] Updated weights for policy 0, policy_version 868004 (0.0022) [2024-06-25 10:36:58,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14221410304. Throughput: 0: 42589.8. Samples: 14221533340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:36:58,391][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 10:37:01,585][15401] Updated weights for policy 0, policy_version 868014 (0.0027) [2024-06-25 10:37:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 14221639680. Throughput: 0: 42637.2. Samples: 14221786240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 10:37:04,720][15401] Updated weights for policy 0, policy_version 868024 (0.0034) [2024-06-25 10:37:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14221852672. Throughput: 0: 42713.8. Samples: 14221922460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:08,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 10:37:09,007][15401] Updated weights for policy 0, policy_version 868034 (0.0036) [2024-06-25 10:37:12,716][15401] Updated weights for policy 0, policy_version 868044 (0.0035) [2024-06-25 10:37:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14222049280. Throughput: 0: 42754.8. Samples: 14222176920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 10:37:16,573][15401] Updated weights for policy 0, policy_version 868054 (0.0030) [2024-06-25 10:37:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14222278656. Throughput: 0: 42942.2. Samples: 14222437080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 10:37:20,205][15401] Updated weights for policy 0, policy_version 868064 (0.0027) [2024-06-25 10:37:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14222491648. Throughput: 0: 42819.9. Samples: 14222566640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 10:37:24,058][15401] Updated weights for policy 0, policy_version 868074 (0.0038) [2024-06-25 10:37:27,901][15401] Updated weights for policy 0, policy_version 868084 (0.0030) [2024-06-25 10:37:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42820.6). Total num frames: 14222704640. Throughput: 0: 42984.8. Samples: 14222825440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:28,393][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 10:37:31,882][15401] Updated weights for policy 0, policy_version 868094 (0.0031) [2024-06-25 10:37:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14222917632. Throughput: 0: 42867.9. Samples: 14223078520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:33,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 10:37:35,727][15401] Updated weights for policy 0, policy_version 868104 (0.0023) [2024-06-25 10:37:38,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14223130624. Throughput: 0: 42719.2. Samples: 14223207380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 10:37:39,381][15401] Updated weights for policy 0, policy_version 868114 (0.0038) [2024-06-25 10:37:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14223327232. Throughput: 0: 42997.9. Samples: 14223468240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:43,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 10:37:43,463][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868124_14223343616.pth... [2024-06-25 10:37:43,477][15401] Updated weights for policy 0, policy_version 868124 (0.0043) [2024-06-25 10:37:43,508][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867497_14213070848.pth [2024-06-25 10:37:47,044][15401] Updated weights for policy 0, policy_version 868134 (0.0038) [2024-06-25 10:37:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 14223556608. Throughput: 0: 43037.4. Samples: 14223722920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 10:37:51,285][15401] Updated weights for policy 0, policy_version 868144 (0.0037) [2024-06-25 10:37:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14223769600. Throughput: 0: 42819.4. Samples: 14223849340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 10:37:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 10:37:54,982][15401] Updated weights for policy 0, policy_version 868154 (0.0038) [2024-06-25 10:37:56,921][15349] Signal inference workers to stop experience collection... (210600 times) [2024-06-25 10:37:56,922][15349] Signal inference workers to resume experience collection... (210600 times) [2024-06-25 10:37:56,954][15401] InferenceWorker_p0-w0: stopping experience collection (210600 times) [2024-06-25 10:37:56,954][15401] InferenceWorker_p0-w0: resuming experience collection (210600 times) [2024-06-25 10:37:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14223966208. Throughput: 0: 42759.4. Samples: 14224101200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:37:58,392][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 10:37:58,861][15401] Updated weights for policy 0, policy_version 868164 (0.0029) [2024-06-25 10:38:02,491][15401] Updated weights for policy 0, policy_version 868174 (0.0040) [2024-06-25 10:38:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14224195584. Throughput: 0: 42818.2. Samples: 14224363900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:03,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 10:38:06,479][15401] Updated weights for policy 0, policy_version 868184 (0.0042) [2024-06-25 10:38:08,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14224424960. Throughput: 0: 42824.5. Samples: 14224493740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 10:38:10,180][15401] Updated weights for policy 0, policy_version 868194 (0.0030) [2024-06-25 10:38:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14224605184. Throughput: 0: 42619.2. Samples: 14224743200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:38:14,194][15401] Updated weights for policy 0, policy_version 868204 (0.0024) [2024-06-25 10:38:18,112][15401] Updated weights for policy 0, policy_version 868214 (0.0047) [2024-06-25 10:38:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14224834560. Throughput: 0: 42685.2. Samples: 14224999360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 10:38:22,113][15401] Updated weights for policy 0, policy_version 868224 (0.0034) [2024-06-25 10:38:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14225063936. Throughput: 0: 42688.1. Samples: 14225128340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 10:38:25,754][15401] Updated weights for policy 0, policy_version 868234 (0.0046) [2024-06-25 10:38:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14225244160. Throughput: 0: 42571.9. Samples: 14225383980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:28,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 10:38:29,804][15401] Updated weights for policy 0, policy_version 868244 (0.0028) [2024-06-25 10:38:33,145][15401] Updated weights for policy 0, policy_version 868254 (0.0037) [2024-06-25 10:38:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14225473536. Throughput: 0: 42533.4. Samples: 14225636920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:33,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 10:38:37,454][15401] Updated weights for policy 0, policy_version 868264 (0.0030) [2024-06-25 10:38:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14225702912. Throughput: 0: 42623.7. Samples: 14225767400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 10:38:41,139][15401] Updated weights for policy 0, policy_version 868274 (0.0037) [2024-06-25 10:38:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14225883136. Throughput: 0: 42604.5. Samples: 14226018300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 10:38:45,186][15401] Updated weights for policy 0, policy_version 868284 (0.0027) [2024-06-25 10:38:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14226112512. Throughput: 0: 42501.0. Samples: 14226276440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 10:38:48,585][15401] Updated weights for policy 0, policy_version 868294 (0.0036) [2024-06-25 10:38:52,738][15401] Updated weights for policy 0, policy_version 868304 (0.0034) [2024-06-25 10:38:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14226325504. Throughput: 0: 42518.2. Samples: 14226407060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 10:38:56,121][15401] Updated weights for policy 0, policy_version 868314 (0.0047) [2024-06-25 10:38:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 14226538496. Throughput: 0: 42731.5. Samples: 14226666120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:38:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 10:39:00,549][15401] Updated weights for policy 0, policy_version 868324 (0.0031) [2024-06-25 10:39:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14226767872. Throughput: 0: 42594.2. Samples: 14226916100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 10:39:03,588][15401] Updated weights for policy 0, policy_version 868334 (0.0038) [2024-06-25 10:39:08,158][15401] Updated weights for policy 0, policy_version 868344 (0.0038) [2024-06-25 10:39:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14226964480. Throughput: 0: 42724.9. Samples: 14227050960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 10:39:08,908][15349] Signal inference workers to stop experience collection... (210650 times) [2024-06-25 10:39:08,908][15349] Signal inference workers to resume experience collection... (210650 times) [2024-06-25 10:39:08,948][15401] InferenceWorker_p0-w0: stopping experience collection (210650 times) [2024-06-25 10:39:08,949][15401] InferenceWorker_p0-w0: resuming experience collection (210650 times) [2024-06-25 10:39:11,351][15401] Updated weights for policy 0, policy_version 868354 (0.0028) [2024-06-25 10:39:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14227177472. Throughput: 0: 42767.9. Samples: 14227308540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 10:39:15,618][15401] Updated weights for policy 0, policy_version 868364 (0.0035) [2024-06-25 10:39:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14227423232. Throughput: 0: 42888.9. Samples: 14227566920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 10:39:18,827][15401] Updated weights for policy 0, policy_version 868374 (0.0030) [2024-06-25 10:39:23,240][15401] Updated weights for policy 0, policy_version 868384 (0.0045) [2024-06-25 10:39:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 14227603456. Throughput: 0: 42899.6. Samples: 14227697880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 10:39:26,931][15401] Updated weights for policy 0, policy_version 868394 (0.0037) [2024-06-25 10:39:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14227816448. Throughput: 0: 42840.4. Samples: 14227946120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 10:39:31,201][15401] Updated weights for policy 0, policy_version 868404 (0.0039) [2024-06-25 10:39:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14228062208. Throughput: 0: 42861.7. Samples: 14228205220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:33,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 10:39:34,399][15401] Updated weights for policy 0, policy_version 868414 (0.0046) [2024-06-25 10:39:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42710.2). Total num frames: 14228242432. Throughput: 0: 42814.4. Samples: 14228333700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:38,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 10:39:38,712][15401] Updated weights for policy 0, policy_version 868424 (0.0037) [2024-06-25 10:39:41,860][15401] Updated weights for policy 0, policy_version 868434 (0.0041) [2024-06-25 10:39:43,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14228455424. Throughput: 0: 42706.2. Samples: 14228587900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-25 10:39:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 10:39:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868436_14228455424.pth... [2024-06-25 10:39:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000867809_14218182656.pth [2024-06-25 10:39:46,250][15401] Updated weights for policy 0, policy_version 868444 (0.0026) [2024-06-25 10:39:48,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14228701184. Throughput: 0: 42828.4. Samples: 14228843380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:39:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 10:39:49,342][15401] Updated weights for policy 0, policy_version 868454 (0.0031) [2024-06-25 10:39:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14228897792. Throughput: 0: 42808.0. Samples: 14228977320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:39:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:39:53,910][15401] Updated weights for policy 0, policy_version 868464 (0.0048) [2024-06-25 10:39:56,951][15401] Updated weights for policy 0, policy_version 868474 (0.0036) [2024-06-25 10:39:58,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 14229110784. Throughput: 0: 42557.4. Samples: 14229223720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:39:58,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 10:40:01,445][15401] Updated weights for policy 0, policy_version 868484 (0.0041) [2024-06-25 10:40:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14229307392. Throughput: 0: 42688.9. Samples: 14229487920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 10:40:04,629][15401] Updated weights for policy 0, policy_version 868494 (0.0042) [2024-06-25 10:40:08,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14229536768. Throughput: 0: 42616.0. Samples: 14229615600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 10:40:09,505][15401] Updated weights for policy 0, policy_version 868504 (0.0044) [2024-06-25 10:40:12,386][15401] Updated weights for policy 0, policy_version 868514 (0.0028) [2024-06-25 10:40:13,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14229766144. Throughput: 0: 42678.2. Samples: 14229866640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 10:40:17,314][15401] Updated weights for policy 0, policy_version 868524 (0.0044) [2024-06-25 10:40:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 14229946368. Throughput: 0: 42754.7. Samples: 14230129180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 10:40:20,223][15401] Updated weights for policy 0, policy_version 868534 (0.0041) [2024-06-25 10:40:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14230175744. Throughput: 0: 42595.9. Samples: 14230250520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:23,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-25 10:40:24,847][15401] Updated weights for policy 0, policy_version 868544 (0.0032) [2024-06-25 10:40:28,374][15401] Updated weights for policy 0, policy_version 868554 (0.0031) [2024-06-25 10:40:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14230388736. Throughput: 0: 42592.2. Samples: 14230504540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:28,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 10:40:32,331][15401] Updated weights for policy 0, policy_version 868564 (0.0039) [2024-06-25 10:40:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14230585344. Throughput: 0: 42731.7. Samples: 14230766300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 10:40:35,825][15401] Updated weights for policy 0, policy_version 868574 (0.0035) [2024-06-25 10:40:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42820.5). Total num frames: 14230814720. Throughput: 0: 42482.6. Samples: 14230889140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:38,392][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 10:40:39,915][15401] Updated weights for policy 0, policy_version 868584 (0.0035) [2024-06-25 10:40:41,614][15349] Signal inference workers to stop experience collection... (210700 times) [2024-06-25 10:40:41,614][15349] Signal inference workers to resume experience collection... (210700 times) [2024-06-25 10:40:41,636][15401] InferenceWorker_p0-w0: stopping experience collection (210700 times) [2024-06-25 10:40:41,636][15401] InferenceWorker_p0-w0: resuming experience collection (210700 times) [2024-06-25 10:40:43,332][15401] Updated weights for policy 0, policy_version 868594 (0.0043) [2024-06-25 10:40:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14231044096. Throughput: 0: 42728.9. Samples: 14231146420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:43,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:40:47,479][15401] Updated weights for policy 0, policy_version 868604 (0.0029) [2024-06-25 10:40:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14231240704. Throughput: 0: 42645.3. Samples: 14231406960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 10:40:50,916][15401] Updated weights for policy 0, policy_version 868614 (0.0039) [2024-06-25 10:40:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14231437312. Throughput: 0: 42517.2. Samples: 14231528880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 10:40:55,067][15401] Updated weights for policy 0, policy_version 868624 (0.0030) [2024-06-25 10:40:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 14231666688. Throughput: 0: 42673.4. Samples: 14231786940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:40:58,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 10:40:58,580][15401] Updated weights for policy 0, policy_version 868634 (0.0041) [2024-06-25 10:41:02,715][15401] Updated weights for policy 0, policy_version 868644 (0.0039) [2024-06-25 10:41:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14231879680. Throughput: 0: 42690.7. Samples: 14232050260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 10:41:06,117][15401] Updated weights for policy 0, policy_version 868654 (0.0041) [2024-06-25 10:41:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14232076288. Throughput: 0: 42787.9. Samples: 14232175980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 10:41:10,500][15401] Updated weights for policy 0, policy_version 868664 (0.0033) [2024-06-25 10:41:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14232305664. Throughput: 0: 42808.4. Samples: 14232430920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 10:41:13,911][15401] Updated weights for policy 0, policy_version 868674 (0.0022) [2024-06-25 10:41:18,301][15401] Updated weights for policy 0, policy_version 868684 (0.0028) [2024-06-25 10:41:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14232518656. Throughput: 0: 42853.8. Samples: 14232694720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 10:41:21,615][15401] Updated weights for policy 0, policy_version 868694 (0.0025) [2024-06-25 10:41:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14232731648. Throughput: 0: 42895.6. Samples: 14232819340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 10:41:26,148][15401] Updated weights for policy 0, policy_version 868704 (0.0036) [2024-06-25 10:41:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14232961024. Throughput: 0: 42772.0. Samples: 14233071160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:28,390][15132] Avg episode reward: [(0, '0.113')] [2024-06-25 10:41:29,161][15401] Updated weights for policy 0, policy_version 868714 (0.0027) [2024-06-25 10:41:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14233157632. Throughput: 0: 42780.4. Samples: 14233332080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 10:41:33,707][15401] Updated weights for policy 0, policy_version 868724 (0.0040) [2024-06-25 10:41:37,007][15401] Updated weights for policy 0, policy_version 868734 (0.0039) [2024-06-25 10:41:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 14233370624. Throughput: 0: 42826.6. Samples: 14233456080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 10:41:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 10:41:41,383][15401] Updated weights for policy 0, policy_version 868744 (0.0033) [2024-06-25 10:41:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 14233600000. Throughput: 0: 42734.2. Samples: 14233709980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:41:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 10:41:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868751_14233616384.pth... [2024-06-25 10:41:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868124_14223343616.pth [2024-06-25 10:41:44,594][15401] Updated weights for policy 0, policy_version 868754 (0.0026) [2024-06-25 10:41:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14233796608. Throughput: 0: 42843.5. Samples: 14233978220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:41:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 10:41:48,888][15401] Updated weights for policy 0, policy_version 868764 (0.0027) [2024-06-25 10:41:52,152][15401] Updated weights for policy 0, policy_version 868774 (0.0037) [2024-06-25 10:41:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14234009600. Throughput: 0: 42700.1. Samples: 14234097480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:41:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 10:41:56,378][15401] Updated weights for policy 0, policy_version 868784 (0.0033) [2024-06-25 10:41:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14234255360. Throughput: 0: 42793.7. Samples: 14234356640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:41:58,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 10:41:59,673][15401] Updated weights for policy 0, policy_version 868794 (0.0023) [2024-06-25 10:42:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14234451968. Throughput: 0: 42889.3. Samples: 14234624740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:03,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 10:42:04,018][15401] Updated weights for policy 0, policy_version 868804 (0.0033) [2024-06-25 10:42:06,708][15349] Signal inference workers to stop experience collection... (210750 times) [2024-06-25 10:42:06,708][15349] Signal inference workers to resume experience collection... (210750 times) [2024-06-25 10:42:06,730][15401] InferenceWorker_p0-w0: stopping experience collection (210750 times) [2024-06-25 10:42:06,730][15401] InferenceWorker_p0-w0: resuming experience collection (210750 times) [2024-06-25 10:42:07,177][15401] Updated weights for policy 0, policy_version 868814 (0.0031) [2024-06-25 10:42:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14234664960. Throughput: 0: 42796.9. Samples: 14234745200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 10:42:11,679][15401] Updated weights for policy 0, policy_version 868824 (0.0023) [2024-06-25 10:42:13,396][15132] Fps is (10 sec: 45845.6, 60 sec: 43412.9, 300 sec: 42819.6). Total num frames: 14234910720. Throughput: 0: 42862.8. Samples: 14235000260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:13,397][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 10:42:15,179][15401] Updated weights for policy 0, policy_version 868834 (0.0032) [2024-06-25 10:42:18,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 14235074560. Throughput: 0: 42976.9. Samples: 14235266140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:18,392][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 10:42:19,412][15401] Updated weights for policy 0, policy_version 868844 (0.0038) [2024-06-25 10:42:22,846][15401] Updated weights for policy 0, policy_version 868854 (0.0038) [2024-06-25 10:42:23,390][15132] Fps is (10 sec: 39346.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14235303936. Throughput: 0: 42769.5. Samples: 14235380700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 10:42:27,047][15401] Updated weights for policy 0, policy_version 868864 (0.0043) [2024-06-25 10:42:28,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14235533312. Throughput: 0: 42889.9. Samples: 14235640020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 10:42:30,503][15401] Updated weights for policy 0, policy_version 868874 (0.0034) [2024-06-25 10:42:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14235697152. Throughput: 0: 42763.9. Samples: 14235902600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 10:42:34,769][15401] Updated weights for policy 0, policy_version 868884 (0.0035) [2024-06-25 10:42:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14235942912. Throughput: 0: 42802.2. Samples: 14236023580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:38,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 10:42:38,427][15401] Updated weights for policy 0, policy_version 868894 (0.0032) [2024-06-25 10:42:42,647][15401] Updated weights for policy 0, policy_version 868904 (0.0037) [2024-06-25 10:42:43,390][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14236172288. Throughput: 0: 42952.4. Samples: 14236289500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 10:42:46,121][15401] Updated weights for policy 0, policy_version 868914 (0.0034) [2024-06-25 10:42:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14236352512. Throughput: 0: 42504.8. Samples: 14236537460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:48,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 10:42:50,319][15401] Updated weights for policy 0, policy_version 868924 (0.0034) [2024-06-25 10:42:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 14236598272. Throughput: 0: 42590.2. Samples: 14236661760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 10:42:53,677][15401] Updated weights for policy 0, policy_version 868934 (0.0028) [2024-06-25 10:42:58,158][15401] Updated weights for policy 0, policy_version 868944 (0.0042) [2024-06-25 10:42:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14236794880. Throughput: 0: 42848.8. Samples: 14236928180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:42:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 10:43:01,441][15401] Updated weights for policy 0, policy_version 868954 (0.0022) [2024-06-25 10:43:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14237007872. Throughput: 0: 42651.5. Samples: 14237185460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:03,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 10:43:05,733][15401] Updated weights for policy 0, policy_version 868964 (0.0030) [2024-06-25 10:43:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14237253632. Throughput: 0: 42930.6. Samples: 14237312580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:08,396][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 10:43:09,073][15401] Updated weights for policy 0, policy_version 868974 (0.0040) [2024-06-25 10:43:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 41783.7, 300 sec: 42653.9). Total num frames: 14237417472. Throughput: 0: 42787.0. Samples: 14237565440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:13,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-25 10:43:13,638][15401] Updated weights for policy 0, policy_version 868984 (0.0042) [2024-06-25 10:43:16,754][15401] Updated weights for policy 0, policy_version 868994 (0.0033) [2024-06-25 10:43:18,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 14237630464. Throughput: 0: 42462.4. Samples: 14237813400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 10:43:21,337][15401] Updated weights for policy 0, policy_version 869004 (0.0036) [2024-06-25 10:43:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14237876224. Throughput: 0: 42689.0. Samples: 14237944580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:23,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 10:43:24,354][15401] Updated weights for policy 0, policy_version 869014 (0.0046) [2024-06-25 10:43:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 14238040064. Throughput: 0: 42499.9. Samples: 14238202000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 10:43:28,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 10:43:28,735][15349] Signal inference workers to stop experience collection... (210800 times) [2024-06-25 10:43:28,783][15401] InferenceWorker_p0-w0: stopping experience collection (210800 times) [2024-06-25 10:43:28,792][15349] Signal inference workers to resume experience collection... (210800 times) [2024-06-25 10:43:28,796][15401] InferenceWorker_p0-w0: resuming experience collection (210800 times) [2024-06-25 10:43:28,951][15401] Updated weights for policy 0, policy_version 869024 (0.0036) [2024-06-25 10:43:32,057][15401] Updated weights for policy 0, policy_version 869034 (0.0046) [2024-06-25 10:43:33,389][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14238285824. Throughput: 0: 42432.5. Samples: 14238446920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 10:43:36,366][15401] Updated weights for policy 0, policy_version 869044 (0.0038) [2024-06-25 10:43:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14238498816. Throughput: 0: 42730.7. Samples: 14238584640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:43:39,708][15401] Updated weights for policy 0, policy_version 869054 (0.0034) [2024-06-25 10:43:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14238695424. Throughput: 0: 42519.0. Samples: 14238841540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:43,392][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:43:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869061_14238695424.pth... [2024-06-25 10:43:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868436_14228455424.pth [2024-06-25 10:43:44,012][15401] Updated weights for policy 0, policy_version 869064 (0.0041) [2024-06-25 10:43:47,589][15401] Updated weights for policy 0, policy_version 869074 (0.0029) [2024-06-25 10:43:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14238941184. Throughput: 0: 42366.8. Samples: 14239091860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:48,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 10:43:51,645][15401] Updated weights for policy 0, policy_version 869084 (0.0039) [2024-06-25 10:43:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14239154176. Throughput: 0: 42518.3. Samples: 14239225900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 10:43:55,186][15401] Updated weights for policy 0, policy_version 869094 (0.0037) [2024-06-25 10:43:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14239334400. Throughput: 0: 42531.7. Samples: 14239479360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:43:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 10:43:59,197][15401] Updated weights for policy 0, policy_version 869104 (0.0037) [2024-06-25 10:44:02,849][15401] Updated weights for policy 0, policy_version 869114 (0.0026) [2024-06-25 10:44:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14239580160. Throughput: 0: 42659.0. Samples: 14239733060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:03,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 10:44:06,967][15401] Updated weights for policy 0, policy_version 869124 (0.0035) [2024-06-25 10:44:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14239793152. Throughput: 0: 42665.6. Samples: 14239864540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 10:44:10,782][15401] Updated weights for policy 0, policy_version 869134 (0.0035) [2024-06-25 10:44:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14239989760. Throughput: 0: 42588.8. Samples: 14240118500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:13,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 10:44:14,717][15401] Updated weights for policy 0, policy_version 869144 (0.0038) [2024-06-25 10:44:18,379][15401] Updated weights for policy 0, policy_version 869154 (0.0043) [2024-06-25 10:44:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14240219136. Throughput: 0: 42920.1. Samples: 14240378320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 10:44:22,388][15401] Updated weights for policy 0, policy_version 869164 (0.0030) [2024-06-25 10:44:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14240415744. Throughput: 0: 42685.4. Samples: 14240505480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 10:44:25,952][15401] Updated weights for policy 0, policy_version 869174 (0.0028) [2024-06-25 10:44:28,389][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14240628736. Throughput: 0: 42677.0. Samples: 14240762000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:28,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 10:44:29,937][15401] Updated weights for policy 0, policy_version 869184 (0.0050) [2024-06-25 10:44:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14240841728. Throughput: 0: 42846.7. Samples: 14241019960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:33,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 10:44:33,684][15401] Updated weights for policy 0, policy_version 869194 (0.0035) [2024-06-25 10:44:37,589][15401] Updated weights for policy 0, policy_version 869204 (0.0025) [2024-06-25 10:44:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14241071104. Throughput: 0: 42547.6. Samples: 14241140540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 10:44:41,532][15401] Updated weights for policy 0, policy_version 869214 (0.0045) [2024-06-25 10:44:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 14241267712. Throughput: 0: 42668.0. Samples: 14241399420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:43,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 10:44:45,373][15401] Updated weights for policy 0, policy_version 869224 (0.0027) [2024-06-25 10:44:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14241464320. Throughput: 0: 42754.8. Samples: 14241657020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:48,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 10:44:49,348][15401] Updated weights for policy 0, policy_version 869234 (0.0040) [2024-06-25 10:44:51,493][15349] Signal inference workers to stop experience collection... (210850 times) [2024-06-25 10:44:51,496][15349] Signal inference workers to resume experience collection... (210850 times) [2024-06-25 10:44:51,519][15401] InferenceWorker_p0-w0: stopping experience collection (210850 times) [2024-06-25 10:44:51,519][15401] InferenceWorker_p0-w0: resuming experience collection (210850 times) [2024-06-25 10:44:53,063][15401] Updated weights for policy 0, policy_version 869244 (0.0040) [2024-06-25 10:44:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14241726464. Throughput: 0: 42623.1. Samples: 14241782580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:53,402][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 10:44:57,082][15401] Updated weights for policy 0, policy_version 869254 (0.0040) [2024-06-25 10:44:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14241923072. Throughput: 0: 42700.3. Samples: 14242040000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:44:58,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 10:45:00,589][15401] Updated weights for policy 0, policy_version 869264 (0.0042) [2024-06-25 10:45:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14242119680. Throughput: 0: 42589.2. Samples: 14242294840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:45:03,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 10:45:04,913][15401] Updated weights for policy 0, policy_version 869274 (0.0031) [2024-06-25 10:45:08,105][15401] Updated weights for policy 0, policy_version 869284 (0.0030) [2024-06-25 10:45:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14242365440. Throughput: 0: 42600.3. Samples: 14242422500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:45:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 10:45:12,300][15401] Updated weights for policy 0, policy_version 869294 (0.0025) [2024-06-25 10:45:13,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 14242562048. Throughput: 0: 42655.7. Samples: 14242681780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:45:13,396][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 10:45:15,770][15401] Updated weights for policy 0, policy_version 869304 (0.0035) [2024-06-25 10:45:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14242775040. Throughput: 0: 42443.1. Samples: 14242929900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 10:45:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 10:45:20,326][15401] Updated weights for policy 0, policy_version 869314 (0.0033) [2024-06-25 10:45:23,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14242988032. Throughput: 0: 42712.8. Samples: 14243062620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:23,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 10:45:23,681][15401] Updated weights for policy 0, policy_version 869324 (0.0035) [2024-06-25 10:45:27,787][15401] Updated weights for policy 0, policy_version 869334 (0.0042) [2024-06-25 10:45:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14243201024. Throughput: 0: 42718.9. Samples: 14243321780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:28,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 10:45:31,605][15401] Updated weights for policy 0, policy_version 869344 (0.0027) [2024-06-25 10:45:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 14243414016. Throughput: 0: 42795.8. Samples: 14243582840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 10:45:35,331][15401] Updated weights for policy 0, policy_version 869354 (0.0030) [2024-06-25 10:45:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14243610624. Throughput: 0: 42757.0. Samples: 14243706640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 10:45:39,221][15401] Updated weights for policy 0, policy_version 869364 (0.0052) [2024-06-25 10:45:42,948][15401] Updated weights for policy 0, policy_version 869374 (0.0052) [2024-06-25 10:45:43,391][15132] Fps is (10 sec: 42593.5, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 14243840000. Throughput: 0: 42632.4. Samples: 14243958520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:43,391][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 10:45:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869375_14243840000.pth... [2024-06-25 10:45:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000868751_14233616384.pth [2024-06-25 10:45:46,956][15401] Updated weights for policy 0, policy_version 869384 (0.0038) [2024-06-25 10:45:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14244052992. Throughput: 0: 42557.4. Samples: 14244209920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 10:45:50,542][15401] Updated weights for policy 0, policy_version 869394 (0.0040) [2024-06-25 10:45:53,390][15132] Fps is (10 sec: 40965.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14244249600. Throughput: 0: 42572.4. Samples: 14244338260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:53,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 10:45:54,433][15401] Updated weights for policy 0, policy_version 869404 (0.0029) [2024-06-25 10:45:58,190][15401] Updated weights for policy 0, policy_version 869414 (0.0024) [2024-06-25 10:45:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14244495360. Throughput: 0: 42473.9. Samples: 14244592840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:45:58,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 10:46:00,642][15349] Signal inference workers to stop experience collection... (210900 times) [2024-06-25 10:46:00,676][15401] InferenceWorker_p0-w0: stopping experience collection (210900 times) [2024-06-25 10:46:00,701][15349] Signal inference workers to resume experience collection... (210900 times) [2024-06-25 10:46:00,705][15401] InferenceWorker_p0-w0: resuming experience collection (210900 times) [2024-06-25 10:46:02,100][15401] Updated weights for policy 0, policy_version 869424 (0.0035) [2024-06-25 10:46:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14244691968. Throughput: 0: 42657.8. Samples: 14244849500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 10:46:05,688][15401] Updated weights for policy 0, policy_version 869434 (0.0028) [2024-06-25 10:46:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14244904960. Throughput: 0: 42542.7. Samples: 14244977040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 10:46:10,075][15401] Updated weights for policy 0, policy_version 869444 (0.0049) [2024-06-25 10:46:13,355][15401] Updated weights for policy 0, policy_version 869454 (0.0032) [2024-06-25 10:46:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 14245134336. Throughput: 0: 42559.6. Samples: 14245236960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:13,392][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 10:46:17,597][15401] Updated weights for policy 0, policy_version 869464 (0.0036) [2024-06-25 10:46:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14245330944. Throughput: 0: 42564.2. Samples: 14245498220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 10:46:20,974][15401] Updated weights for policy 0, policy_version 869474 (0.0044) [2024-06-25 10:46:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14245560320. Throughput: 0: 42494.5. Samples: 14245619000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:23,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 10:46:25,124][15401] Updated weights for policy 0, policy_version 869484 (0.0033) [2024-06-25 10:46:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14245773312. Throughput: 0: 42690.9. Samples: 14245879560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:28,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 10:46:28,545][15401] Updated weights for policy 0, policy_version 869494 (0.0031) [2024-06-25 10:46:32,659][15401] Updated weights for policy 0, policy_version 869504 (0.0036) [2024-06-25 10:46:33,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14245969920. Throughput: 0: 42952.3. Samples: 14246142780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 10:46:36,181][15401] Updated weights for policy 0, policy_version 869514 (0.0038) [2024-06-25 10:46:38,393][15132] Fps is (10 sec: 40944.3, 60 sec: 42868.6, 300 sec: 42653.4). Total num frames: 14246182912. Throughput: 0: 42928.3. Samples: 14246270200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:38,394][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 10:46:40,249][15401] Updated weights for policy 0, policy_version 869524 (0.0032) [2024-06-25 10:46:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43145.4, 300 sec: 42820.5). Total num frames: 14246428672. Throughput: 0: 43018.2. Samples: 14246528660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:43,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 10:46:44,084][15401] Updated weights for policy 0, policy_version 869534 (0.0043) [2024-06-25 10:46:48,098][15401] Updated weights for policy 0, policy_version 869544 (0.0027) [2024-06-25 10:46:48,389][15132] Fps is (10 sec: 44254.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14246625280. Throughput: 0: 43082.3. Samples: 14246788200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 10:46:51,522][15401] Updated weights for policy 0, policy_version 869554 (0.0041) [2024-06-25 10:46:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14246838272. Throughput: 0: 42943.0. Samples: 14246909480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:53,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 10:46:55,675][15401] Updated weights for policy 0, policy_version 869564 (0.0038) [2024-06-25 10:46:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14247051264. Throughput: 0: 42980.1. Samples: 14247171060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:46:58,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 10:46:58,960][15401] Updated weights for policy 0, policy_version 869574 (0.0043) [2024-06-25 10:47:03,288][15401] Updated weights for policy 0, policy_version 869584 (0.0028) [2024-06-25 10:47:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14247264256. Throughput: 0: 42920.8. Samples: 14247429660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:47:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 10:47:06,790][15401] Updated weights for policy 0, policy_version 869594 (0.0029) [2024-06-25 10:47:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 14247460864. Throughput: 0: 43029.0. Samples: 14247555200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:47:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 10:47:11,161][15401] Updated weights for policy 0, policy_version 869604 (0.0036) [2024-06-25 10:47:12,049][15349] Signal inference workers to stop experience collection... (210950 times) [2024-06-25 10:47:12,049][15349] Signal inference workers to resume experience collection... (210950 times) [2024-06-25 10:47:12,082][15401] InferenceWorker_p0-w0: stopping experience collection (210950 times) [2024-06-25 10:47:12,082][15401] InferenceWorker_p0-w0: resuming experience collection (210950 times) [2024-06-25 10:47:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 14247690240. Throughput: 0: 42855.7. Samples: 14247808060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-25 10:47:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 10:47:14,483][15401] Updated weights for policy 0, policy_version 869614 (0.0041) [2024-06-25 10:47:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14247903232. Throughput: 0: 42654.7. Samples: 14248062240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 10:47:18,631][15401] Updated weights for policy 0, policy_version 869624 (0.0041) [2024-06-25 10:47:22,111][15401] Updated weights for policy 0, policy_version 869634 (0.0028) [2024-06-25 10:47:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 14248116224. Throughput: 0: 42610.4. Samples: 14248187500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:23,390][15132] Avg episode reward: [(0, '0.214')] [2024-06-25 10:47:26,514][15401] Updated weights for policy 0, policy_version 869644 (0.0025) [2024-06-25 10:47:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14248312832. Throughput: 0: 42653.0. Samples: 14248448040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:28,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 10:47:30,150][15401] Updated weights for policy 0, policy_version 869654 (0.0024) [2024-06-25 10:47:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14248525824. Throughput: 0: 42667.4. Samples: 14248708240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 10:47:34,088][15401] Updated weights for policy 0, policy_version 869664 (0.0033) [2024-06-25 10:47:37,557][15401] Updated weights for policy 0, policy_version 869674 (0.0041) [2024-06-25 10:47:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43147.3, 300 sec: 42709.5). Total num frames: 14248771584. Throughput: 0: 42777.4. Samples: 14248834460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 10:47:41,670][15401] Updated weights for policy 0, policy_version 869684 (0.0037) [2024-06-25 10:47:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14248968192. Throughput: 0: 42669.2. Samples: 14249091180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 10:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869688_14248968192.pth... [2024-06-25 10:47:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869061_14238695424.pth [2024-06-25 10:47:44,934][15401] Updated weights for policy 0, policy_version 869694 (0.0040) [2024-06-25 10:47:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14249164800. Throughput: 0: 42646.8. Samples: 14249348760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 10:47:49,262][15401] Updated weights for policy 0, policy_version 869704 (0.0032) [2024-06-25 10:47:53,114][15401] Updated weights for policy 0, policy_version 869714 (0.0032) [2024-06-25 10:47:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14249394176. Throughput: 0: 42569.9. Samples: 14249470840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:53,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 10:47:57,446][15401] Updated weights for policy 0, policy_version 869724 (0.0033) [2024-06-25 10:47:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 14249590784. Throughput: 0: 42693.4. Samples: 14249729260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:47:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 10:48:00,728][15401] Updated weights for policy 0, policy_version 869734 (0.0031) [2024-06-25 10:48:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14249803776. Throughput: 0: 42833.4. Samples: 14249989740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 10:48:05,000][15401] Updated weights for policy 0, policy_version 869744 (0.0040) [2024-06-25 10:48:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14250033152. Throughput: 0: 42724.0. Samples: 14250110080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 10:48:08,423][15401] Updated weights for policy 0, policy_version 869754 (0.0027) [2024-06-25 10:48:12,743][15401] Updated weights for policy 0, policy_version 869764 (0.0038) [2024-06-25 10:48:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14250246144. Throughput: 0: 42740.9. Samples: 14250371380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:13,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 10:48:16,087][15401] Updated weights for policy 0, policy_version 869774 (0.0030) [2024-06-25 10:48:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14250459136. Throughput: 0: 42686.1. Samples: 14250629120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:18,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 10:48:20,497][15401] Updated weights for policy 0, policy_version 869784 (0.0034) [2024-06-25 10:48:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14250688512. Throughput: 0: 42735.2. Samples: 14250757540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 10:48:23,611][15401] Updated weights for policy 0, policy_version 869794 (0.0035) [2024-06-25 10:48:28,104][15401] Updated weights for policy 0, policy_version 869804 (0.0045) [2024-06-25 10:48:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14250868736. Throughput: 0: 42709.0. Samples: 14251013080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 10:48:29,451][15349] Signal inference workers to stop experience collection... (211000 times) [2024-06-25 10:48:29,480][15401] InferenceWorker_p0-w0: stopping experience collection (211000 times) [2024-06-25 10:48:29,516][15349] Signal inference workers to resume experience collection... (211000 times) [2024-06-25 10:48:29,518][15401] InferenceWorker_p0-w0: resuming experience collection (211000 times) [2024-06-25 10:48:31,223][15401] Updated weights for policy 0, policy_version 869814 (0.0037) [2024-06-25 10:48:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14251098112. Throughput: 0: 42665.7. Samples: 14251268720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 10:48:35,914][15401] Updated weights for policy 0, policy_version 869824 (0.0033) [2024-06-25 10:48:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14251327488. Throughput: 0: 42871.1. Samples: 14251400040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 10:48:38,778][15401] Updated weights for policy 0, policy_version 869834 (0.0033) [2024-06-25 10:48:43,346][15401] Updated weights for policy 0, policy_version 869844 (0.0031) [2024-06-25 10:48:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14251524096. Throughput: 0: 42779.5. Samples: 14251654340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 10:48:46,735][15401] Updated weights for policy 0, policy_version 869854 (0.0032) [2024-06-25 10:48:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14251720704. Throughput: 0: 42658.7. Samples: 14251909380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 10:48:51,236][15401] Updated weights for policy 0, policy_version 869864 (0.0043) [2024-06-25 10:48:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14251982848. Throughput: 0: 42919.5. Samples: 14252041460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 10:48:54,372][15401] Updated weights for policy 0, policy_version 869874 (0.0026) [2024-06-25 10:48:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14252163072. Throughput: 0: 42817.8. Samples: 14252298180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:48:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 10:48:58,711][15401] Updated weights for policy 0, policy_version 869884 (0.0035) [2024-06-25 10:49:02,252][15401] Updated weights for policy 0, policy_version 869894 (0.0042) [2024-06-25 10:49:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14252376064. Throughput: 0: 42786.4. Samples: 14252554500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-25 10:49:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 10:49:06,307][15401] Updated weights for policy 0, policy_version 869904 (0.0031) [2024-06-25 10:49:08,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14252621824. Throughput: 0: 42797.7. Samples: 14252683440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 10:49:10,004][15401] Updated weights for policy 0, policy_version 869914 (0.0030) [2024-06-25 10:49:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14252818432. Throughput: 0: 42755.1. Samples: 14252937060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:13,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 10:49:13,863][15401] Updated weights for policy 0, policy_version 869924 (0.0032) [2024-06-25 10:49:17,673][15401] Updated weights for policy 0, policy_version 869934 (0.0028) [2024-06-25 10:49:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14253031424. Throughput: 0: 42793.4. Samples: 14253194420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:18,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 10:49:21,443][15401] Updated weights for policy 0, policy_version 869944 (0.0031) [2024-06-25 10:49:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14253244416. Throughput: 0: 42692.4. Samples: 14253321200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:23,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 10:49:25,192][15401] Updated weights for policy 0, policy_version 869954 (0.0038) [2024-06-25 10:49:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14253441024. Throughput: 0: 42766.3. Samples: 14253578820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 10:49:28,926][15401] Updated weights for policy 0, policy_version 869964 (0.0034) [2024-06-25 10:49:32,828][15401] Updated weights for policy 0, policy_version 869974 (0.0030) [2024-06-25 10:49:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14253654016. Throughput: 0: 42791.5. Samples: 14253835000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:33,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-25 10:49:36,510][15401] Updated weights for policy 0, policy_version 869984 (0.0033) [2024-06-25 10:49:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14253883392. Throughput: 0: 42647.2. Samples: 14253960580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:38,390][15132] Avg episode reward: [(0, '0.909')] [2024-06-25 10:49:40,561][15401] Updated weights for policy 0, policy_version 869994 (0.0034) [2024-06-25 10:49:43,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14254096384. Throughput: 0: 42674.5. Samples: 14254218640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:43,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 10:49:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870001_14254096384.pth... [2024-06-25 10:49:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869375_14243840000.pth [2024-06-25 10:49:44,235][15401] Updated weights for policy 0, policy_version 870004 (0.0031) [2024-06-25 10:49:48,263][15401] Updated weights for policy 0, policy_version 870014 (0.0039) [2024-06-25 10:49:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14254309376. Throughput: 0: 42528.9. Samples: 14254468300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 10:49:52,357][15401] Updated weights for policy 0, policy_version 870024 (0.0035) [2024-06-25 10:49:53,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14254522368. Throughput: 0: 42556.9. Samples: 14254598500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:49:55,795][15401] Updated weights for policy 0, policy_version 870034 (0.0034) [2024-06-25 10:49:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14254735360. Throughput: 0: 42672.9. Samples: 14254857340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:49:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 10:50:00,056][15401] Updated weights for policy 0, policy_version 870044 (0.0033) [2024-06-25 10:50:03,059][15349] Signal inference workers to stop experience collection... (211050 times) [2024-06-25 10:50:03,112][15401] InferenceWorker_p0-w0: stopping experience collection (211050 times) [2024-06-25 10:50:03,116][15349] Signal inference workers to resume experience collection... (211050 times) [2024-06-25 10:50:03,130][15401] InferenceWorker_p0-w0: resuming experience collection (211050 times) [2024-06-25 10:50:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14254948352. Throughput: 0: 42636.3. Samples: 14255113060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 10:50:03,472][15401] Updated weights for policy 0, policy_version 870054 (0.0030) [2024-06-25 10:50:07,645][15401] Updated weights for policy 0, policy_version 870064 (0.0044) [2024-06-25 10:50:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42710.4). Total num frames: 14255161344. Throughput: 0: 42527.4. Samples: 14255234940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 10:50:11,235][15401] Updated weights for policy 0, policy_version 870074 (0.0037) [2024-06-25 10:50:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14255357952. Throughput: 0: 42457.3. Samples: 14255489400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 10:50:15,161][15401] Updated weights for policy 0, policy_version 870084 (0.0027) [2024-06-25 10:50:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14255570944. Throughput: 0: 42377.3. Samples: 14255741980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:18,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 10:50:19,127][15401] Updated weights for policy 0, policy_version 870094 (0.0034) [2024-06-25 10:50:22,705][15401] Updated weights for policy 0, policy_version 870104 (0.0040) [2024-06-25 10:50:23,393][15132] Fps is (10 sec: 42581.5, 60 sec: 42322.6, 300 sec: 42653.4). Total num frames: 14255783936. Throughput: 0: 42496.7. Samples: 14255873100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:23,394][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 10:50:26,823][15401] Updated weights for policy 0, policy_version 870114 (0.0034) [2024-06-25 10:50:28,393][15132] Fps is (10 sec: 42582.9, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 14255996928. Throughput: 0: 42280.2. Samples: 14256121300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:28,394][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 10:50:30,869][15401] Updated weights for policy 0, policy_version 870124 (0.0040) [2024-06-25 10:50:33,392][15132] Fps is (10 sec: 42604.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 14256209920. Throughput: 0: 42421.2. Samples: 14256377360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:33,392][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 10:50:34,454][15401] Updated weights for policy 0, policy_version 870134 (0.0046) [2024-06-25 10:50:38,390][15132] Fps is (10 sec: 42613.4, 60 sec: 42325.2, 300 sec: 42654.1). Total num frames: 14256422912. Throughput: 0: 42400.8. Samples: 14256506540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 10:50:38,631][15401] Updated weights for policy 0, policy_version 870144 (0.0036) [2024-06-25 10:50:41,998][15401] Updated weights for policy 0, policy_version 870154 (0.0022) [2024-06-25 10:50:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 14256635904. Throughput: 0: 42195.6. Samples: 14256756140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 10:50:46,189][15401] Updated weights for policy 0, policy_version 870164 (0.0050) [2024-06-25 10:50:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14256865280. Throughput: 0: 42281.9. Samples: 14257015740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:48,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 10:50:49,622][15401] Updated weights for policy 0, policy_version 870174 (0.0043) [2024-06-25 10:50:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14257045504. Throughput: 0: 42589.1. Samples: 14257151440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 10:50:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 10:50:53,997][15401] Updated weights for policy 0, policy_version 870184 (0.0028) [2024-06-25 10:50:57,420][15401] Updated weights for policy 0, policy_version 870194 (0.0031) [2024-06-25 10:50:58,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.8, 300 sec: 42708.5). Total num frames: 14257291264. Throughput: 0: 42575.2. Samples: 14257405560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:50:58,396][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 10:51:01,667][15401] Updated weights for policy 0, policy_version 870204 (0.0025) [2024-06-25 10:51:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14257504256. Throughput: 0: 42674.6. Samples: 14257662340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:03,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 10:51:05,048][15401] Updated weights for policy 0, policy_version 870214 (0.0024) [2024-06-25 10:51:08,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14257700864. Throughput: 0: 42732.2. Samples: 14257795880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 10:51:09,442][15401] Updated weights for policy 0, policy_version 870224 (0.0029) [2024-06-25 10:51:12,697][15401] Updated weights for policy 0, policy_version 870234 (0.0030) [2024-06-25 10:51:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14257946624. Throughput: 0: 42752.4. Samples: 14258045000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:13,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 10:51:16,991][15401] Updated weights for policy 0, policy_version 870244 (0.0038) [2024-06-25 10:51:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 14258143232. Throughput: 0: 42770.7. Samples: 14258301940. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:18,393][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 10:51:20,485][15349] Signal inference workers to stop experience collection... (211100 times) [2024-06-25 10:51:20,543][15401] InferenceWorker_p0-w0: stopping experience collection (211100 times) [2024-06-25 10:51:20,550][15349] Signal inference workers to resume experience collection... (211100 times) [2024-06-25 10:51:20,558][15401] InferenceWorker_p0-w0: resuming experience collection (211100 times) [2024-06-25 10:51:20,565][15401] Updated weights for policy 0, policy_version 870254 (0.0050) [2024-06-25 10:51:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42874.3, 300 sec: 42654.0). Total num frames: 14258356224. Throughput: 0: 42634.8. Samples: 14258425100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:23,390][15132] Avg episode reward: [(0, '0.078')] [2024-06-25 10:51:24,891][15401] Updated weights for policy 0, policy_version 870264 (0.0030) [2024-06-25 10:51:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 14258552832. Throughput: 0: 42724.9. Samples: 14258678760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:28,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 10:51:28,886][15401] Updated weights for policy 0, policy_version 870274 (0.0031) [2024-06-25 10:51:32,455][15401] Updated weights for policy 0, policy_version 870284 (0.0026) [2024-06-25 10:51:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42600.0, 300 sec: 42654.5). Total num frames: 14258765824. Throughput: 0: 42679.0. Samples: 14258936300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 10:51:36,499][15401] Updated weights for policy 0, policy_version 870294 (0.0027) [2024-06-25 10:51:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14258978816. Throughput: 0: 42430.6. Samples: 14259060820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 10:51:40,237][15401] Updated weights for policy 0, policy_version 870304 (0.0029) [2024-06-25 10:51:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14259208192. Throughput: 0: 42534.3. Samples: 14259319340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:43,399][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 10:51:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870313_14259208192.pth... [2024-06-25 10:51:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000869688_14248968192.pth [2024-06-25 10:51:44,105][15401] Updated weights for policy 0, policy_version 870314 (0.0043) [2024-06-25 10:51:47,927][15401] Updated weights for policy 0, policy_version 870324 (0.0029) [2024-06-25 10:51:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14259404800. Throughput: 0: 42478.7. Samples: 14259573880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:48,400][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 10:51:51,795][15401] Updated weights for policy 0, policy_version 870334 (0.0039) [2024-06-25 10:51:53,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 14259617792. Throughput: 0: 42339.9. Samples: 14259701280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:53,393][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 10:51:55,350][15401] Updated weights for policy 0, policy_version 870344 (0.0034) [2024-06-25 10:51:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42056.7, 300 sec: 42542.9). Total num frames: 14259814400. Throughput: 0: 42418.1. Samples: 14259953820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:51:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 10:51:59,674][15401] Updated weights for policy 0, policy_version 870354 (0.0038) [2024-06-25 10:52:03,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14260027392. Throughput: 0: 42313.7. Samples: 14260206060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:03,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 10:52:03,559][15401] Updated weights for policy 0, policy_version 870364 (0.0039) [2024-06-25 10:52:07,336][15401] Updated weights for policy 0, policy_version 870374 (0.0044) [2024-06-25 10:52:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14260240384. Throughput: 0: 42504.4. Samples: 14260337800. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 10:52:11,155][15401] Updated weights for policy 0, policy_version 870384 (0.0037) [2024-06-25 10:52:13,392][15132] Fps is (10 sec: 42588.6, 60 sec: 41777.5, 300 sec: 42542.5). Total num frames: 14260453376. Throughput: 0: 42384.8. Samples: 14260586180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:13,392][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 10:52:14,898][15401] Updated weights for policy 0, policy_version 870394 (0.0039) [2024-06-25 10:52:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14260666368. Throughput: 0: 42468.1. Samples: 14260847360. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 10:52:18,676][15401] Updated weights for policy 0, policy_version 870404 (0.0041) [2024-06-25 10:52:22,609][15401] Updated weights for policy 0, policy_version 870414 (0.0038) [2024-06-25 10:52:23,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14260895744. Throughput: 0: 42570.1. Samples: 14260976480. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 10:52:26,224][15401] Updated weights for policy 0, policy_version 870424 (0.0034) [2024-06-25 10:52:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14261092352. Throughput: 0: 42433.1. Samples: 14261228820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:28,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 10:52:30,242][15401] Updated weights for policy 0, policy_version 870434 (0.0034) [2024-06-25 10:52:33,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 14261305344. Throughput: 0: 42620.0. Samples: 14261491880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:33,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 10:52:33,827][15401] Updated weights for policy 0, policy_version 870444 (0.0030) [2024-06-25 10:52:37,748][15401] Updated weights for policy 0, policy_version 870454 (0.0057) [2024-06-25 10:52:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14261551104. Throughput: 0: 42612.9. Samples: 14261618760. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:38,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 10:52:41,042][15349] Signal inference workers to stop experience collection... (211150 times) [2024-06-25 10:52:41,090][15401] InferenceWorker_p0-w0: stopping experience collection (211150 times) [2024-06-25 10:52:41,101][15349] Signal inference workers to resume experience collection... (211150 times) [2024-06-25 10:52:41,106][15401] InferenceWorker_p0-w0: resuming experience collection (211150 times) [2024-06-25 10:52:41,393][15401] Updated weights for policy 0, policy_version 870464 (0.0029) [2024-06-25 10:52:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 14261747712. Throughput: 0: 42629.1. Samples: 14261872120. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 10:52:45,346][15401] Updated weights for policy 0, policy_version 870474 (0.0037) [2024-06-25 10:52:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14261960704. Throughput: 0: 42828.4. Samples: 14262133340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-25 10:52:48,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 10:52:49,124][15401] Updated weights for policy 0, policy_version 870484 (0.0031) [2024-06-25 10:52:52,935][15401] Updated weights for policy 0, policy_version 870494 (0.0037) [2024-06-25 10:52:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14262190080. Throughput: 0: 42719.5. Samples: 14262260180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:52:53,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 10:52:56,811][15401] Updated weights for policy 0, policy_version 870504 (0.0040) [2024-06-25 10:52:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14262370304. Throughput: 0: 42905.4. Samples: 14262516820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:52:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 10:53:00,559][15401] Updated weights for policy 0, policy_version 870514 (0.0038) [2024-06-25 10:53:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14262599680. Throughput: 0: 42576.0. Samples: 14262763280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:03,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 10:53:04,804][15401] Updated weights for policy 0, policy_version 870524 (0.0026) [2024-06-25 10:53:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14262812672. Throughput: 0: 42650.8. Samples: 14262895760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 10:53:08,519][15401] Updated weights for policy 0, policy_version 870534 (0.0026) [2024-06-25 10:53:12,466][15401] Updated weights for policy 0, policy_version 870544 (0.0030) [2024-06-25 10:53:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 14263025664. Throughput: 0: 42770.6. Samples: 14263153500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 10:53:15,999][15401] Updated weights for policy 0, policy_version 870554 (0.0029) [2024-06-25 10:53:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14263238656. Throughput: 0: 42613.9. Samples: 14263409400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:18,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 10:53:20,017][15401] Updated weights for policy 0, policy_version 870564 (0.0041) [2024-06-25 10:53:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14263468032. Throughput: 0: 42637.4. Samples: 14263537440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 10:53:23,450][15401] Updated weights for policy 0, policy_version 870574 (0.0040) [2024-06-25 10:53:27,812][15401] Updated weights for policy 0, policy_version 870584 (0.0035) [2024-06-25 10:53:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14263664640. Throughput: 0: 42714.7. Samples: 14263794280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 10:53:31,434][15401] Updated weights for policy 0, policy_version 870594 (0.0039) [2024-06-25 10:53:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42542.8). Total num frames: 14263877632. Throughput: 0: 42512.0. Samples: 14264046380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:33,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 10:53:35,468][15401] Updated weights for policy 0, policy_version 870604 (0.0045) [2024-06-25 10:53:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14264107008. Throughput: 0: 42654.7. Samples: 14264179640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:38,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 10:53:38,910][15401] Updated weights for policy 0, policy_version 870614 (0.0033) [2024-06-25 10:53:43,038][15401] Updated weights for policy 0, policy_version 870624 (0.0032) [2024-06-25 10:53:43,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.8, 300 sec: 42708.5). Total num frames: 14264320000. Throughput: 0: 42702.7. Samples: 14264438720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:43,397][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 10:53:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870625_14264320000.pth... [2024-06-25 10:53:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870001_14254096384.pth [2024-06-25 10:53:46,693][15401] Updated weights for policy 0, policy_version 870634 (0.0049) [2024-06-25 10:53:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 14264532992. Throughput: 0: 42757.4. Samples: 14264687360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:48,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 10:53:50,761][15401] Updated weights for policy 0, policy_version 870644 (0.0037) [2024-06-25 10:53:53,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14264745984. Throughput: 0: 42732.7. Samples: 14264818740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 10:53:54,191][15401] Updated weights for policy 0, policy_version 870654 (0.0041) [2024-06-25 10:53:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14264942592. Throughput: 0: 42716.0. Samples: 14265075720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:53:58,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 10:53:58,659][15401] Updated weights for policy 0, policy_version 870664 (0.0030) [2024-06-25 10:54:01,966][15401] Updated weights for policy 0, policy_version 870674 (0.0026) [2024-06-25 10:54:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14265171968. Throughput: 0: 42697.7. Samples: 14265330800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 10:54:06,226][15401] Updated weights for policy 0, policy_version 870684 (0.0037) [2024-06-25 10:54:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14265384960. Throughput: 0: 42809.7. Samples: 14265463880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:08,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 10:54:09,753][15401] Updated weights for policy 0, policy_version 870694 (0.0036) [2024-06-25 10:54:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14265581568. Throughput: 0: 42777.7. Samples: 14265719280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 10:54:13,932][15401] Updated weights for policy 0, policy_version 870704 (0.0035) [2024-06-25 10:54:17,498][15401] Updated weights for policy 0, policy_version 870714 (0.0030) [2024-06-25 10:54:18,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14265794560. Throughput: 0: 42879.7. Samples: 14265975960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 10:54:21,587][15401] Updated weights for policy 0, policy_version 870724 (0.0043) [2024-06-25 10:54:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14266023936. Throughput: 0: 42780.5. Samples: 14266104760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:23,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 10:54:25,190][15401] Updated weights for policy 0, policy_version 870734 (0.0042) [2024-06-25 10:54:28,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14266236928. Throughput: 0: 42696.7. Samples: 14266359900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:28,393][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 10:54:29,046][15401] Updated weights for policy 0, policy_version 870744 (0.0036) [2024-06-25 10:54:32,871][15401] Updated weights for policy 0, policy_version 870754 (0.0037) [2024-06-25 10:54:33,396][15132] Fps is (10 sec: 42570.9, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 14266449920. Throughput: 0: 43039.2. Samples: 14266624400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:33,396][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 10:54:36,704][15401] Updated weights for policy 0, policy_version 870764 (0.0034) [2024-06-25 10:54:38,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42596.7, 300 sec: 42598.4). Total num frames: 14266662912. Throughput: 0: 42846.3. Samples: 14266746920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 10:54:38,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 10:54:39,732][15349] Signal inference workers to stop experience collection... (211200 times) [2024-06-25 10:54:39,786][15401] InferenceWorker_p0-w0: stopping experience collection (211200 times) [2024-06-25 10:54:39,786][15349] Signal inference workers to resume experience collection... (211200 times) [2024-06-25 10:54:39,804][15401] InferenceWorker_p0-w0: resuming experience collection (211200 times) [2024-06-25 10:54:40,583][15401] Updated weights for policy 0, policy_version 870774 (0.0033) [2024-06-25 10:54:43,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 14266875904. Throughput: 0: 42944.5. Samples: 14267008220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:54:43,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 10:54:44,151][15401] Updated weights for policy 0, policy_version 870784 (0.0026) [2024-06-25 10:54:48,236][15401] Updated weights for policy 0, policy_version 870794 (0.0041) [2024-06-25 10:54:48,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14267105280. Throughput: 0: 43144.6. Samples: 14267272300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:54:48,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 10:54:51,721][15401] Updated weights for policy 0, policy_version 870804 (0.0032) [2024-06-25 10:54:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14267318272. Throughput: 0: 42970.2. Samples: 14267397540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:54:53,399][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 10:54:55,714][15401] Updated weights for policy 0, policy_version 870814 (0.0033) [2024-06-25 10:54:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 14267531264. Throughput: 0: 43075.7. Samples: 14267657680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:54:58,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 10:54:59,289][15401] Updated weights for policy 0, policy_version 870824 (0.0035) [2024-06-25 10:55:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14267727872. Throughput: 0: 43160.4. Samples: 14267918180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 10:55:03,445][15401] Updated weights for policy 0, policy_version 870834 (0.0027) [2024-06-25 10:55:06,905][15401] Updated weights for policy 0, policy_version 870844 (0.0050) [2024-06-25 10:55:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14267940864. Throughput: 0: 43038.6. Samples: 14268041500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:08,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 10:55:11,011][15401] Updated weights for policy 0, policy_version 870854 (0.0033) [2024-06-25 10:55:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14268170240. Throughput: 0: 42973.5. Samples: 14268293600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 10:55:14,690][15401] Updated weights for policy 0, policy_version 870864 (0.0034) [2024-06-25 10:55:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 14268366848. Throughput: 0: 42851.4. Samples: 14268552440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:18,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 10:55:18,876][15401] Updated weights for policy 0, policy_version 870874 (0.0036) [2024-06-25 10:55:22,876][15401] Updated weights for policy 0, policy_version 870884 (0.0040) [2024-06-25 10:55:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 14268596224. Throughput: 0: 42836.8. Samples: 14268674480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 10:55:26,262][15401] Updated weights for policy 0, policy_version 870894 (0.0026) [2024-06-25 10:55:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.1, 300 sec: 42709.8). Total num frames: 14268809216. Throughput: 0: 42802.5. Samples: 14268934340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:28,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 10:55:30,624][15401] Updated weights for policy 0, policy_version 870904 (0.0046) [2024-06-25 10:55:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 14269022208. Throughput: 0: 42631.4. Samples: 14269190720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:33,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 10:55:34,081][15401] Updated weights for policy 0, policy_version 870914 (0.0040) [2024-06-25 10:55:38,105][15401] Updated weights for policy 0, policy_version 870924 (0.0029) [2024-06-25 10:55:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14269235200. Throughput: 0: 42704.1. Samples: 14269319220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 10:55:41,617][15401] Updated weights for policy 0, policy_version 870934 (0.0027) [2024-06-25 10:55:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14269448192. Throughput: 0: 42538.1. Samples: 14269571900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 10:55:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870939_14269464576.pth... [2024-06-25 10:55:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870313_14259208192.pth [2024-06-25 10:55:45,739][15401] Updated weights for policy 0, policy_version 870944 (0.0037) [2024-06-25 10:55:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14269677568. Throughput: 0: 42495.4. Samples: 14269830480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 10:55:49,383][15401] Updated weights for policy 0, policy_version 870954 (0.0036) [2024-06-25 10:55:53,257][15401] Updated weights for policy 0, policy_version 870964 (0.0038) [2024-06-25 10:55:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 14269874176. Throughput: 0: 42570.8. Samples: 14269957180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 10:55:56,956][15401] Updated weights for policy 0, policy_version 870974 (0.0037) [2024-06-25 10:55:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14270087168. Throughput: 0: 42723.4. Samples: 14270216160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:55:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 10:56:00,951][15401] Updated weights for policy 0, policy_version 870984 (0.0028) [2024-06-25 10:56:01,040][15349] Signal inference workers to stop experience collection... (211250 times) [2024-06-25 10:56:01,093][15401] InferenceWorker_p0-w0: stopping experience collection (211250 times) [2024-06-25 10:56:01,093][15349] Signal inference workers to resume experience collection... (211250 times) [2024-06-25 10:56:01,116][15401] InferenceWorker_p0-w0: resuming experience collection (211250 times) [2024-06-25 10:56:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14270316544. Throughput: 0: 42642.3. Samples: 14270471340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 10:56:04,510][15401] Updated weights for policy 0, policy_version 870994 (0.0033) [2024-06-25 10:56:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14270529536. Throughput: 0: 42911.7. Samples: 14270605500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 10:56:08,398][15401] Updated weights for policy 0, policy_version 871004 (0.0036) [2024-06-25 10:56:12,328][15401] Updated weights for policy 0, policy_version 871014 (0.0038) [2024-06-25 10:56:13,395][15132] Fps is (10 sec: 42575.2, 60 sec: 42867.6, 300 sec: 42708.7). Total num frames: 14270742528. Throughput: 0: 42960.8. Samples: 14270867800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:13,395][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 10:56:15,989][15401] Updated weights for policy 0, policy_version 871024 (0.0027) [2024-06-25 10:56:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 14270971904. Throughput: 0: 42729.9. Samples: 14271113560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 10:56:20,223][15401] Updated weights for policy 0, policy_version 871034 (0.0025) [2024-06-25 10:56:23,390][15132] Fps is (10 sec: 42621.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14271168512. Throughput: 0: 42790.2. Samples: 14271244780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:23,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-25 10:56:23,530][15401] Updated weights for policy 0, policy_version 871044 (0.0040) [2024-06-25 10:56:27,791][15401] Updated weights for policy 0, policy_version 871054 (0.0042) [2024-06-25 10:56:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14271381504. Throughput: 0: 42939.6. Samples: 14271504180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 10:56:28,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-25 10:56:31,315][15401] Updated weights for policy 0, policy_version 871064 (0.0039) [2024-06-25 10:56:33,394][15132] Fps is (10 sec: 45853.9, 60 sec: 43414.2, 300 sec: 42875.4). Total num frames: 14271627264. Throughput: 0: 42836.5. Samples: 14271758320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:33,395][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 10:56:35,344][15401] Updated weights for policy 0, policy_version 871074 (0.0039) [2024-06-25 10:56:38,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42709.2). Total num frames: 14271807488. Throughput: 0: 43013.6. Samples: 14271892900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:38,393][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 10:56:39,210][15401] Updated weights for policy 0, policy_version 871084 (0.0037) [2024-06-25 10:56:42,844][15401] Updated weights for policy 0, policy_version 871094 (0.0039) [2024-06-25 10:56:43,389][15132] Fps is (10 sec: 39340.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14272020480. Throughput: 0: 42953.9. Samples: 14272149080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:43,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 10:56:46,893][15401] Updated weights for policy 0, policy_version 871104 (0.0035) [2024-06-25 10:56:48,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 14272266240. Throughput: 0: 42879.0. Samples: 14272400900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 10:56:50,510][15401] Updated weights for policy 0, policy_version 871114 (0.0035) [2024-06-25 10:56:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14272446464. Throughput: 0: 42939.0. Samples: 14272537760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:53,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 10:56:54,585][15401] Updated weights for policy 0, policy_version 871124 (0.0024) [2024-06-25 10:56:58,078][15401] Updated weights for policy 0, policy_version 871134 (0.0041) [2024-06-25 10:56:58,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14272659456. Throughput: 0: 42659.0. Samples: 14272787220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:56:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 10:57:02,147][15401] Updated weights for policy 0, policy_version 871144 (0.0030) [2024-06-25 10:57:03,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 14272905216. Throughput: 0: 42822.5. Samples: 14273040680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:03,392][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 10:57:06,390][15401] Updated weights for policy 0, policy_version 871154 (0.0026) [2024-06-25 10:57:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 14273101824. Throughput: 0: 42999.6. Samples: 14273179760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 10:57:09,717][15401] Updated weights for policy 0, policy_version 871164 (0.0038) [2024-06-25 10:57:13,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42602.2, 300 sec: 42820.6). Total num frames: 14273298432. Throughput: 0: 42819.0. Samples: 14273431040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 10:57:13,981][15401] Updated weights for policy 0, policy_version 871174 (0.0041) [2024-06-25 10:57:17,299][15401] Updated weights for policy 0, policy_version 871184 (0.0036) [2024-06-25 10:57:18,391][15132] Fps is (10 sec: 44232.4, 60 sec: 42870.7, 300 sec: 42876.0). Total num frames: 14273544192. Throughput: 0: 42838.6. Samples: 14273685900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:18,391][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 10:57:21,546][15401] Updated weights for policy 0, policy_version 871194 (0.0028) [2024-06-25 10:57:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14273740800. Throughput: 0: 42839.3. Samples: 14273820560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 10:57:24,137][15349] Signal inference workers to stop experience collection... (211300 times) [2024-06-25 10:57:24,139][15349] Signal inference workers to resume experience collection... (211300 times) [2024-06-25 10:57:24,178][15401] InferenceWorker_p0-w0: stopping experience collection (211300 times) [2024-06-25 10:57:24,179][15401] InferenceWorker_p0-w0: resuming experience collection (211300 times) [2024-06-25 10:57:24,744][15401] Updated weights for policy 0, policy_version 871204 (0.0029) [2024-06-25 10:57:28,389][15132] Fps is (10 sec: 39325.7, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14273937408. Throughput: 0: 42746.6. Samples: 14274072680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 10:57:29,171][15401] Updated weights for policy 0, policy_version 871214 (0.0029) [2024-06-25 10:57:32,465][15401] Updated weights for policy 0, policy_version 871224 (0.0034) [2024-06-25 10:57:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42328.7, 300 sec: 42765.0). Total num frames: 14274166784. Throughput: 0: 42922.8. Samples: 14274332420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 10:57:36,592][15401] Updated weights for policy 0, policy_version 871234 (0.0042) [2024-06-25 10:57:38,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43141.7, 300 sec: 42875.2). Total num frames: 14274396160. Throughput: 0: 42814.3. Samples: 14274464680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:38,396][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 10:57:39,935][15401] Updated weights for policy 0, policy_version 871244 (0.0035) [2024-06-25 10:57:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14274592768. Throughput: 0: 42891.9. Samples: 14274717360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 10:57:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871252_14274592768.pth... [2024-06-25 10:57:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870625_14264320000.pth [2024-06-25 10:57:44,535][15401] Updated weights for policy 0, policy_version 871254 (0.0026) [2024-06-25 10:57:47,631][15401] Updated weights for policy 0, policy_version 871264 (0.0046) [2024-06-25 10:57:48,390][15132] Fps is (10 sec: 42625.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14274822144. Throughput: 0: 42977.4. Samples: 14274974560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 10:57:52,095][15401] Updated weights for policy 0, policy_version 871274 (0.0035) [2024-06-25 10:57:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14275018752. Throughput: 0: 42880.1. Samples: 14275109360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 10:57:55,288][15401] Updated weights for policy 0, policy_version 871284 (0.0046) [2024-06-25 10:57:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14275231744. Throughput: 0: 42681.8. Samples: 14275351720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:57:58,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 10:57:59,797][15401] Updated weights for policy 0, policy_version 871294 (0.0036) [2024-06-25 10:58:02,801][15401] Updated weights for policy 0, policy_version 871304 (0.0036) [2024-06-25 10:58:03,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 14275477504. Throughput: 0: 42868.4. Samples: 14275614940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:58:03,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 10:58:07,173][15401] Updated weights for policy 0, policy_version 871314 (0.0034) [2024-06-25 10:58:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14275657728. Throughput: 0: 42954.6. Samples: 14275753520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:58:08,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 10:58:10,421][15401] Updated weights for policy 0, policy_version 871324 (0.0032) [2024-06-25 10:58:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14275870720. Throughput: 0: 42767.9. Samples: 14275997240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:58:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 10:58:14,949][15401] Updated weights for policy 0, policy_version 871334 (0.0032) [2024-06-25 10:58:18,139][15401] Updated weights for policy 0, policy_version 871344 (0.0031) [2024-06-25 10:58:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42599.1, 300 sec: 42820.6). Total num frames: 14276100096. Throughput: 0: 42770.2. Samples: 14276257080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:58:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 10:58:22,515][15401] Updated weights for policy 0, policy_version 871354 (0.0032) [2024-06-25 10:58:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14276296704. Throughput: 0: 42885.7. Samples: 14276394260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 10:58:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 10:58:25,774][15401] Updated weights for policy 0, policy_version 871364 (0.0031) [2024-06-25 10:58:28,392][15132] Fps is (10 sec: 42589.6, 60 sec: 43143.0, 300 sec: 42875.8). Total num frames: 14276526080. Throughput: 0: 42754.9. Samples: 14276641420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:28,392][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 10:58:30,092][15401] Updated weights for policy 0, policy_version 871374 (0.0035) [2024-06-25 10:58:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14276739072. Throughput: 0: 43023.5. Samples: 14276910720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:33,393][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 10:58:33,416][15401] Updated weights for policy 0, policy_version 871384 (0.0044) [2024-06-25 10:58:37,592][15401] Updated weights for policy 0, policy_version 871394 (0.0026) [2024-06-25 10:58:38,389][15132] Fps is (10 sec: 42607.3, 60 sec: 42603.0, 300 sec: 42821.5). Total num frames: 14276952064. Throughput: 0: 42958.6. Samples: 14277042500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:38,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 10:58:40,905][15401] Updated weights for policy 0, policy_version 871404 (0.0031) [2024-06-25 10:58:41,968][15349] Signal inference workers to stop experience collection... (211350 times) [2024-06-25 10:58:41,969][15349] Signal inference workers to resume experience collection... (211350 times) [2024-06-25 10:58:41,998][15401] InferenceWorker_p0-w0: stopping experience collection (211350 times) [2024-06-25 10:58:41,998][15401] InferenceWorker_p0-w0: resuming experience collection (211350 times) [2024-06-25 10:58:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14277181440. Throughput: 0: 43012.8. Samples: 14277287300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 10:58:45,308][15401] Updated weights for policy 0, policy_version 871414 (0.0043) [2024-06-25 10:58:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14277394432. Throughput: 0: 43116.5. Samples: 14277555180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 10:58:48,434][15401] Updated weights for policy 0, policy_version 871424 (0.0029) [2024-06-25 10:58:52,965][15401] Updated weights for policy 0, policy_version 871434 (0.0028) [2024-06-25 10:58:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 14277591040. Throughput: 0: 42896.0. Samples: 14277683840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:53,395][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 10:58:55,907][15401] Updated weights for policy 0, policy_version 871444 (0.0040) [2024-06-25 10:58:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14277836800. Throughput: 0: 43061.5. Samples: 14277935000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:58:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 10:59:00,476][15401] Updated weights for policy 0, policy_version 871454 (0.0030) [2024-06-25 10:59:03,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 14278049792. Throughput: 0: 43125.0. Samples: 14278197700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 10:59:03,423][15401] Updated weights for policy 0, policy_version 871464 (0.0036) [2024-06-25 10:59:08,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14278213632. Throughput: 0: 42936.1. Samples: 14278326380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:08,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 10:59:08,488][15401] Updated weights for policy 0, policy_version 871474 (0.0032) [2024-06-25 10:59:10,992][15401] Updated weights for policy 0, policy_version 871484 (0.0033) [2024-06-25 10:59:13,390][15132] Fps is (10 sec: 42593.8, 60 sec: 43416.9, 300 sec: 42987.0). Total num frames: 14278475776. Throughput: 0: 43206.4. Samples: 14278585660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:13,391][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 10:59:16,030][15401] Updated weights for policy 0, policy_version 871494 (0.0043) [2024-06-25 10:59:18,389][15132] Fps is (10 sec: 49152.1, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 14278705152. Throughput: 0: 42916.6. Samples: 14278841860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 10:59:18,842][15401] Updated weights for policy 0, policy_version 871504 (0.0032) [2024-06-25 10:59:23,389][15132] Fps is (10 sec: 39325.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14278868992. Throughput: 0: 42807.6. Samples: 14278968840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 10:59:23,790][15401] Updated weights for policy 0, policy_version 871514 (0.0027) [2024-06-25 10:59:26,452][15401] Updated weights for policy 0, policy_version 871524 (0.0024) [2024-06-25 10:59:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43419.1, 300 sec: 42988.1). Total num frames: 14279131136. Throughput: 0: 43063.6. Samples: 14279225160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 10:59:31,355][15401] Updated weights for policy 0, policy_version 871534 (0.0027) [2024-06-25 10:59:33,390][15132] Fps is (10 sec: 45874.1, 60 sec: 43146.1, 300 sec: 42932.0). Total num frames: 14279327744. Throughput: 0: 43018.6. Samples: 14279491020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:33,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 10:59:33,928][15401] Updated weights for policy 0, policy_version 871544 (0.0028) [2024-06-25 10:59:38,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 14279524352. Throughput: 0: 43034.7. Samples: 14279620500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:38,393][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 10:59:39,008][15401] Updated weights for policy 0, policy_version 871554 (0.0039) [2024-06-25 10:59:41,458][15401] Updated weights for policy 0, policy_version 871564 (0.0023) [2024-06-25 10:59:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14279770112. Throughput: 0: 43116.8. Samples: 14279875260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:43,390][15132] Avg episode reward: [(0, '0.126')] [2024-06-25 10:59:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871568_14279770112.pth... [2024-06-25 10:59:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000870939_14269464576.pth [2024-06-25 10:59:43,681][15349] Signal inference workers to stop experience collection... (211400 times) [2024-06-25 10:59:43,681][15349] Signal inference workers to resume experience collection... (211400 times) [2024-06-25 10:59:43,708][15401] InferenceWorker_p0-w0: stopping experience collection (211400 times) [2024-06-25 10:59:43,709][15401] InferenceWorker_p0-w0: resuming experience collection (211400 times) [2024-06-25 10:59:46,683][15401] Updated weights for policy 0, policy_version 871574 (0.0029) [2024-06-25 10:59:48,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14279950336. Throughput: 0: 43264.5. Samples: 14280144600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 10:59:49,162][15401] Updated weights for policy 0, policy_version 871584 (0.0028) [2024-06-25 10:59:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 14280179712. Throughput: 0: 43085.2. Samples: 14280265320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:53,392][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 10:59:54,101][15401] Updated weights for policy 0, policy_version 871594 (0.0029) [2024-06-25 10:59:56,794][15401] Updated weights for policy 0, policy_version 871604 (0.0032) [2024-06-25 10:59:58,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 14280425472. Throughput: 0: 42921.7. Samples: 14280517100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 10:59:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 11:00:01,477][15401] Updated weights for policy 0, policy_version 871614 (0.0045) [2024-06-25 11:00:03,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 14280589312. Throughput: 0: 43196.3. Samples: 14280785700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 11:00:03,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 11:00:04,401][15401] Updated weights for policy 0, policy_version 871624 (0.0034) [2024-06-25 11:00:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 14280835072. Throughput: 0: 43015.1. Samples: 14280904520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 11:00:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 11:00:08,943][15401] Updated weights for policy 0, policy_version 871634 (0.0032) [2024-06-25 11:00:12,182][15401] Updated weights for policy 0, policy_version 871644 (0.0036) [2024-06-25 11:00:13,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43145.2, 300 sec: 43042.7). Total num frames: 14281064448. Throughput: 0: 43008.9. Samples: 14281160560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 11:00:13,396][15132] Avg episode reward: [(0, '0.180')] [2024-06-25 11:00:16,731][15401] Updated weights for policy 0, policy_version 871654 (0.0037) [2024-06-25 11:00:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 14281211904. Throughput: 0: 43090.4. Samples: 14281430080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:18,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 11:00:19,625][15401] Updated weights for policy 0, policy_version 871664 (0.0041) [2024-06-25 11:00:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 14281474048. Throughput: 0: 42770.3. Samples: 14281545060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 11:00:24,597][15401] Updated weights for policy 0, policy_version 871674 (0.0047) [2024-06-25 11:00:27,364][15401] Updated weights for policy 0, policy_version 871684 (0.0034) [2024-06-25 11:00:28,390][15132] Fps is (10 sec: 49152.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 14281703424. Throughput: 0: 42884.0. Samples: 14281805040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 11:00:32,182][15401] Updated weights for policy 0, policy_version 871694 (0.0028) [2024-06-25 11:00:33,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 14281867264. Throughput: 0: 42854.6. Samples: 14282073060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 11:00:34,889][15401] Updated weights for policy 0, policy_version 871704 (0.0036) [2024-06-25 11:00:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 14282113024. Throughput: 0: 42804.9. Samples: 14282191440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:38,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 11:00:39,641][15401] Updated weights for policy 0, policy_version 871714 (0.0040) [2024-06-25 11:00:42,638][15401] Updated weights for policy 0, policy_version 871724 (0.0032) [2024-06-25 11:00:43,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14282342400. Throughput: 0: 42940.5. Samples: 14282449420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 11:00:47,596][15401] Updated weights for policy 0, policy_version 871734 (0.0035) [2024-06-25 11:00:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14282522624. Throughput: 0: 43009.9. Samples: 14282721140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 11:00:48,628][15349] Signal inference workers to stop experience collection... (211450 times) [2024-06-25 11:00:48,629][15349] Signal inference workers to resume experience collection... (211450 times) [2024-06-25 11:00:48,646][15401] InferenceWorker_p0-w0: stopping experience collection (211450 times) [2024-06-25 11:00:48,646][15401] InferenceWorker_p0-w0: resuming experience collection (211450 times) [2024-06-25 11:00:50,392][15401] Updated weights for policy 0, policy_version 871744 (0.0034) [2024-06-25 11:00:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 14282768384. Throughput: 0: 43039.9. Samples: 14282841320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:53,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 11:00:54,902][15401] Updated weights for policy 0, policy_version 871754 (0.0038) [2024-06-25 11:00:58,021][15401] Updated weights for policy 0, policy_version 871764 (0.0023) [2024-06-25 11:00:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 14282981376. Throughput: 0: 43090.7. Samples: 14283099640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:00:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 11:01:02,283][15401] Updated weights for policy 0, policy_version 871774 (0.0028) [2024-06-25 11:01:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14283177984. Throughput: 0: 43088.5. Samples: 14283369060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 11:01:06,098][15401] Updated weights for policy 0, policy_version 871784 (0.0039) [2024-06-25 11:01:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42988.0). Total num frames: 14283423744. Throughput: 0: 43248.8. Samples: 14283491260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 11:01:10,329][15401] Updated weights for policy 0, policy_version 871794 (0.0040) [2024-06-25 11:01:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14283620352. Throughput: 0: 43133.0. Samples: 14283746020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 11:01:13,769][15401] Updated weights for policy 0, policy_version 871804 (0.0037) [2024-06-25 11:01:17,878][15401] Updated weights for policy 0, policy_version 871814 (0.0041) [2024-06-25 11:01:18,390][15132] Fps is (10 sec: 37682.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14283800576. Throughput: 0: 43018.1. Samples: 14284008880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 11:01:21,333][15401] Updated weights for policy 0, policy_version 871824 (0.0040) [2024-06-25 11:01:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 14284062720. Throughput: 0: 43165.3. Samples: 14284133880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 11:01:25,366][15401] Updated weights for policy 0, policy_version 871834 (0.0041) [2024-06-25 11:01:28,390][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42876.8). Total num frames: 14284275712. Throughput: 0: 42984.0. Samples: 14284383700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 11:01:28,767][15401] Updated weights for policy 0, policy_version 871844 (0.0045) [2024-06-25 11:01:32,985][15401] Updated weights for policy 0, policy_version 871854 (0.0037) [2024-06-25 11:01:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42932.0). Total num frames: 14284472320. Throughput: 0: 42861.6. Samples: 14284649920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 11:01:36,439][15401] Updated weights for policy 0, policy_version 871864 (0.0029) [2024-06-25 11:01:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 14284718080. Throughput: 0: 43005.4. Samples: 14284776560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:38,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 11:01:40,639][15401] Updated weights for policy 0, policy_version 871874 (0.0039) [2024-06-25 11:01:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14284931072. Throughput: 0: 43023.0. Samples: 14285035680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 11:01:43,395][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871883_14284931072.pth... [2024-06-25 11:01:43,449][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871252_14274592768.pth [2024-06-25 11:01:43,853][15401] Updated weights for policy 0, policy_version 871884 (0.0036) [2024-06-25 11:01:48,166][15401] Updated weights for policy 0, policy_version 871894 (0.0039) [2024-06-25 11:01:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 14285127680. Throughput: 0: 42790.7. Samples: 14285294640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:48,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 11:01:51,187][15401] Updated weights for policy 0, policy_version 871904 (0.0042) [2024-06-25 11:01:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 14285357056. Throughput: 0: 42805.7. Samples: 14285417520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 11:01:55,728][15401] Updated weights for policy 0, policy_version 871914 (0.0032) [2024-06-25 11:01:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 14285553664. Throughput: 0: 42916.4. Samples: 14285677260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:01:58,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 11:01:59,508][15401] Updated weights for policy 0, policy_version 871924 (0.0034) [2024-06-25 11:02:03,325][15401] Updated weights for policy 0, policy_version 871934 (0.0036) [2024-06-25 11:02:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14285766656. Throughput: 0: 42848.9. Samples: 14285937080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:02:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 11:02:04,476][15349] Signal inference workers to stop experience collection... (211500 times) [2024-06-25 11:02:04,476][15349] Signal inference workers to resume experience collection... (211500 times) [2024-06-25 11:02:04,500][15401] InferenceWorker_p0-w0: stopping experience collection (211500 times) [2024-06-25 11:02:04,500][15401] InferenceWorker_p0-w0: resuming experience collection (211500 times) [2024-06-25 11:02:06,865][15401] Updated weights for policy 0, policy_version 871944 (0.0031) [2024-06-25 11:02:08,392][15132] Fps is (10 sec: 45864.5, 60 sec: 43142.8, 300 sec: 43097.9). Total num frames: 14286012416. Throughput: 0: 42810.3. Samples: 14286060440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:02:08,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 11:02:11,457][15401] Updated weights for policy 0, policy_version 871954 (0.0046) [2024-06-25 11:02:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42931.8). Total num frames: 14286209024. Throughput: 0: 43067.6. Samples: 14286321740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:13,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 11:02:14,467][15401] Updated weights for policy 0, policy_version 871964 (0.0041) [2024-06-25 11:02:18,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 14286405632. Throughput: 0: 42991.1. Samples: 14286584520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 11:02:18,903][15401] Updated weights for policy 0, policy_version 871974 (0.0028) [2024-06-25 11:02:22,083][15401] Updated weights for policy 0, policy_version 871984 (0.0025) [2024-06-25 11:02:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 14286651392. Throughput: 0: 42913.8. Samples: 14286707680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 11:02:26,471][15401] Updated weights for policy 0, policy_version 871994 (0.0039) [2024-06-25 11:02:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14286848000. Throughput: 0: 42933.9. Samples: 14286967700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 11:02:29,578][15401] Updated weights for policy 0, policy_version 872004 (0.0033) [2024-06-25 11:02:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42877.0). Total num frames: 14287044608. Throughput: 0: 43063.2. Samples: 14287232480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 11:02:33,993][15401] Updated weights for policy 0, policy_version 872014 (0.0038) [2024-06-25 11:02:37,067][15401] Updated weights for policy 0, policy_version 872024 (0.0042) [2024-06-25 11:02:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 14287290368. Throughput: 0: 43022.8. Samples: 14287353540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:38,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 11:02:41,740][15401] Updated weights for policy 0, policy_version 872034 (0.0039) [2024-06-25 11:02:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 14287503360. Throughput: 0: 43079.2. Samples: 14287615820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:43,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 11:02:44,657][15401] Updated weights for policy 0, policy_version 872044 (0.0033) [2024-06-25 11:02:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14287683584. Throughput: 0: 43036.1. Samples: 14287873700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 11:02:49,415][15401] Updated weights for policy 0, policy_version 872054 (0.0036) [2024-06-25 11:02:52,363][15401] Updated weights for policy 0, policy_version 872064 (0.0029) [2024-06-25 11:02:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 14287929344. Throughput: 0: 43005.3. Samples: 14287995580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 11:02:57,090][15401] Updated weights for policy 0, policy_version 872074 (0.0028) [2024-06-25 11:02:58,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14288142336. Throughput: 0: 43117.3. Samples: 14288262020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:02:58,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 11:02:59,992][15401] Updated weights for policy 0, policy_version 872084 (0.0033) [2024-06-25 11:03:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14288338944. Throughput: 0: 42827.1. Samples: 14288511740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 11:03:04,927][15401] Updated weights for policy 0, policy_version 872094 (0.0037) [2024-06-25 11:03:07,949][15401] Updated weights for policy 0, policy_version 872104 (0.0042) [2024-06-25 11:03:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 43098.3). Total num frames: 14288584704. Throughput: 0: 42867.1. Samples: 14288636700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 11:03:12,411][15401] Updated weights for policy 0, policy_version 872114 (0.0033) [2024-06-25 11:03:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 14288764928. Throughput: 0: 42983.6. Samples: 14288901960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 11:03:15,419][15401] Updated weights for policy 0, policy_version 872124 (0.0038) [2024-06-25 11:03:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 14288994304. Throughput: 0: 42740.8. Samples: 14289155820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 11:03:19,773][15401] Updated weights for policy 0, policy_version 872134 (0.0034) [2024-06-25 11:03:23,121][15401] Updated weights for policy 0, policy_version 872144 (0.0037) [2024-06-25 11:03:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 43043.0). Total num frames: 14289223680. Throughput: 0: 42940.9. Samples: 14289285880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 11:03:27,286][15401] Updated weights for policy 0, policy_version 872154 (0.0038) [2024-06-25 11:03:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 14289403904. Throughput: 0: 42867.5. Samples: 14289544860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 11:03:30,573][15401] Updated weights for policy 0, policy_version 872164 (0.0033) [2024-06-25 11:03:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 14289616896. Throughput: 0: 42801.2. Samples: 14289799760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:33,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 11:03:33,438][15349] Signal inference workers to stop experience collection... (211550 times) [2024-06-25 11:03:33,438][15349] Signal inference workers to resume experience collection... (211550 times) [2024-06-25 11:03:33,478][15401] InferenceWorker_p0-w0: stopping experience collection (211550 times) [2024-06-25 11:03:33,478][15401] InferenceWorker_p0-w0: resuming experience collection (211550 times) [2024-06-25 11:03:34,922][15401] Updated weights for policy 0, policy_version 872174 (0.0039) [2024-06-25 11:03:38,184][15401] Updated weights for policy 0, policy_version 872184 (0.0042) [2024-06-25 11:03:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14289862656. Throughput: 0: 43006.8. Samples: 14289930880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:38,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 11:03:42,626][15401] Updated weights for policy 0, policy_version 872194 (0.0038) [2024-06-25 11:03:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 14290059264. Throughput: 0: 42708.3. Samples: 14290183900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 11:03:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872196_14290059264.pth... [2024-06-25 11:03:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871568_14279770112.pth [2024-06-25 11:03:45,900][15401] Updated weights for policy 0, policy_version 872204 (0.0040) [2024-06-25 11:03:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14290272256. Throughput: 0: 42855.3. Samples: 14290440220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:48,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 11:03:50,250][15401] Updated weights for policy 0, policy_version 872214 (0.0040) [2024-06-25 11:03:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14290485248. Throughput: 0: 42989.8. Samples: 14290571240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 11:03:53,791][15401] Updated weights for policy 0, policy_version 872224 (0.0034) [2024-06-25 11:03:57,701][15401] Updated weights for policy 0, policy_version 872234 (0.0032) [2024-06-25 11:03:58,394][15132] Fps is (10 sec: 42580.0, 60 sec: 42595.4, 300 sec: 42875.5). Total num frames: 14290698240. Throughput: 0: 42661.2. Samples: 14290821900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-25 11:03:58,394][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 11:04:01,809][15401] Updated weights for policy 0, policy_version 872244 (0.0023) [2024-06-25 11:04:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 14290927616. Throughput: 0: 42624.8. Samples: 14291073940. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:03,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 11:04:05,426][15401] Updated weights for policy 0, policy_version 872254 (0.0027) [2024-06-25 11:04:08,390][15132] Fps is (10 sec: 42616.3, 60 sec: 42325.3, 300 sec: 42876.2). Total num frames: 14291124224. Throughput: 0: 42688.0. Samples: 14291206840. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:08,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 11:04:09,214][15401] Updated weights for policy 0, policy_version 872264 (0.0035) [2024-06-25 11:04:12,946][15401] Updated weights for policy 0, policy_version 872274 (0.0027) [2024-06-25 11:04:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14291337216. Throughput: 0: 42612.1. Samples: 14291462400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:13,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 11:04:17,182][15401] Updated weights for policy 0, policy_version 872284 (0.0033) [2024-06-25 11:04:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 14291582976. Throughput: 0: 42481.9. Samples: 14291711440. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 11:04:20,877][15401] Updated weights for policy 0, policy_version 872294 (0.0034) [2024-06-25 11:04:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 14291730432. Throughput: 0: 42508.9. Samples: 14291843780. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:23,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 11:04:24,957][15401] Updated weights for policy 0, policy_version 872304 (0.0036) [2024-06-25 11:04:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14291976192. Throughput: 0: 42648.0. Samples: 14292103060. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:28,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 11:04:28,663][15401] Updated weights for policy 0, policy_version 872314 (0.0028) [2024-06-25 11:04:32,406][15401] Updated weights for policy 0, policy_version 872324 (0.0042) [2024-06-25 11:04:33,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.7, 300 sec: 42987.5). Total num frames: 14292205568. Throughput: 0: 42644.4. Samples: 14292359220. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 11:04:36,245][15401] Updated weights for policy 0, policy_version 872334 (0.0039) [2024-06-25 11:04:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14292402176. Throughput: 0: 42687.9. Samples: 14292492200. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 11:04:39,881][15401] Updated weights for policy 0, policy_version 872344 (0.0022) [2024-06-25 11:04:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 14292631552. Throughput: 0: 42753.8. Samples: 14292745640. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:43,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 11:04:43,830][15401] Updated weights for policy 0, policy_version 872354 (0.0042) [2024-06-25 11:04:47,339][15401] Updated weights for policy 0, policy_version 872364 (0.0024) [2024-06-25 11:04:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 14292828160. Throughput: 0: 42905.0. Samples: 14293004660. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:48,391][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 11:04:51,338][15401] Updated weights for policy 0, policy_version 872374 (0.0028) [2024-06-25 11:04:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14293041152. Throughput: 0: 42811.5. Samples: 14293133360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:53,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 11:04:54,465][15349] Signal inference workers to stop experience collection... (211600 times) [2024-06-25 11:04:54,467][15349] Signal inference workers to resume experience collection... (211600 times) [2024-06-25 11:04:54,508][15401] InferenceWorker_p0-w0: stopping experience collection (211600 times) [2024-06-25 11:04:54,508][15401] InferenceWorker_p0-w0: resuming experience collection (211600 times) [2024-06-25 11:04:54,869][15401] Updated weights for policy 0, policy_version 872384 (0.0028) [2024-06-25 11:04:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43147.6, 300 sec: 43042.7). Total num frames: 14293286912. Throughput: 0: 42845.2. Samples: 14293390440. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:04:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 11:04:58,979][15401] Updated weights for policy 0, policy_version 872394 (0.0024) [2024-06-25 11:05:02,506][15401] Updated weights for policy 0, policy_version 872404 (0.0035) [2024-06-25 11:05:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14293483520. Throughput: 0: 43154.2. Samples: 14293653380. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 11:05:06,406][15401] Updated weights for policy 0, policy_version 872414 (0.0043) [2024-06-25 11:05:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14293696512. Throughput: 0: 43045.4. Samples: 14293780820. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:05:10,510][15401] Updated weights for policy 0, policy_version 872424 (0.0037) [2024-06-25 11:05:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 14293925888. Throughput: 0: 42934.8. Samples: 14294035120. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:13,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 11:05:14,424][15401] Updated weights for policy 0, policy_version 872434 (0.0032) [2024-06-25 11:05:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 14294106112. Throughput: 0: 42990.2. Samples: 14294293780. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 11:05:18,411][15401] Updated weights for policy 0, policy_version 872444 (0.0031) [2024-06-25 11:05:22,042][15401] Updated weights for policy 0, policy_version 872454 (0.0032) [2024-06-25 11:05:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 14294351872. Throughput: 0: 42812.4. Samples: 14294418760. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:23,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 11:05:25,983][15401] Updated weights for policy 0, policy_version 872464 (0.0040) [2024-06-25 11:05:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 14294564864. Throughput: 0: 42789.3. Samples: 14294671160. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 11:05:29,625][15401] Updated weights for policy 0, policy_version 872474 (0.0052) [2024-06-25 11:05:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14294761472. Throughput: 0: 42883.0. Samples: 14294934400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:33,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 11:05:33,607][15401] Updated weights for policy 0, policy_version 872484 (0.0041) [2024-06-25 11:05:37,086][15401] Updated weights for policy 0, policy_version 872494 (0.0041) [2024-06-25 11:05:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14294974464. Throughput: 0: 42699.6. Samples: 14295054840. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 11:05:41,392][15401] Updated weights for policy 0, policy_version 872504 (0.0029) [2024-06-25 11:05:43,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 14295220224. Throughput: 0: 42890.2. Samples: 14295320500. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:43,394][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 11:05:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872511_14295220224.pth... [2024-06-25 11:05:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000871883_14284931072.pth [2024-06-25 11:05:44,627][15401] Updated weights for policy 0, policy_version 872514 (0.0034) [2024-06-25 11:05:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14295400448. Throughput: 0: 42969.3. Samples: 14295587000. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-25 11:05:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 11:05:49,016][15401] Updated weights for policy 0, policy_version 872524 (0.0032) [2024-06-25 11:05:52,177][15401] Updated weights for policy 0, policy_version 872534 (0.0035) [2024-06-25 11:05:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14295613440. Throughput: 0: 42716.9. Samples: 14295703080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:05:53,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 11:05:56,526][15401] Updated weights for policy 0, policy_version 872544 (0.0027) [2024-06-25 11:05:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 14295859200. Throughput: 0: 42770.6. Samples: 14295959800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:05:58,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 11:05:59,746][15401] Updated weights for policy 0, policy_version 872554 (0.0023) [2024-06-25 11:06:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14296039424. Throughput: 0: 42911.5. Samples: 14296224800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:03,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 11:06:04,095][15401] Updated weights for policy 0, policy_version 872564 (0.0040) [2024-06-25 11:06:07,801][15401] Updated weights for policy 0, policy_version 872574 (0.0026) [2024-06-25 11:06:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14296268800. Throughput: 0: 42802.8. Samples: 14296344880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 11:06:11,635][15401] Updated weights for policy 0, policy_version 872584 (0.0032) [2024-06-25 11:06:13,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 14296514560. Throughput: 0: 42992.9. Samples: 14296605840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:06:15,558][15401] Updated weights for policy 0, policy_version 872594 (0.0029) [2024-06-25 11:06:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14296694784. Throughput: 0: 42899.2. Samples: 14296864860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 11:06:19,459][15401] Updated weights for policy 0, policy_version 872604 (0.0032) [2024-06-25 11:06:23,091][15401] Updated weights for policy 0, policy_version 872614 (0.0034) [2024-06-25 11:06:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14296907776. Throughput: 0: 42939.7. Samples: 14296987120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 11:06:27,340][15401] Updated weights for policy 0, policy_version 872624 (0.0028) [2024-06-25 11:06:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14297120768. Throughput: 0: 42822.4. Samples: 14297247500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 11:06:31,068][15401] Updated weights for policy 0, policy_version 872634 (0.0024) [2024-06-25 11:06:33,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14297333760. Throughput: 0: 42588.3. Samples: 14297503480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:33,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 11:06:34,826][15401] Updated weights for policy 0, policy_version 872644 (0.0031) [2024-06-25 11:06:36,071][15349] Signal inference workers to stop experience collection... (211650 times) [2024-06-25 11:06:36,072][15349] Signal inference workers to resume experience collection... (211650 times) [2024-06-25 11:06:36,114][15401] InferenceWorker_p0-w0: stopping experience collection (211650 times) [2024-06-25 11:06:36,115][15401] InferenceWorker_p0-w0: resuming experience collection (211650 times) [2024-06-25 11:06:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14297546752. Throughput: 0: 42771.9. Samples: 14297627820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 11:06:38,639][15401] Updated weights for policy 0, policy_version 872654 (0.0029) [2024-06-25 11:06:42,636][15401] Updated weights for policy 0, policy_version 872664 (0.0041) [2024-06-25 11:06:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14297759744. Throughput: 0: 42785.7. Samples: 14297885160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:06:46,393][15401] Updated weights for policy 0, policy_version 872674 (0.0035) [2024-06-25 11:06:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14297956352. Throughput: 0: 42540.1. Samples: 14298139100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:48,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 11:06:50,357][15401] Updated weights for policy 0, policy_version 872684 (0.0045) [2024-06-25 11:06:53,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14298202112. Throughput: 0: 42597.8. Samples: 14298261780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 11:06:54,130][15401] Updated weights for policy 0, policy_version 872694 (0.0029) [2024-06-25 11:06:58,041][15401] Updated weights for policy 0, policy_version 872704 (0.0030) [2024-06-25 11:06:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14298398720. Throughput: 0: 42567.5. Samples: 14298521380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:06:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 11:07:01,553][15401] Updated weights for policy 0, policy_version 872714 (0.0035) [2024-06-25 11:07:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 14298611712. Throughput: 0: 42429.7. Samples: 14298774200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:07:05,743][15401] Updated weights for policy 0, policy_version 872724 (0.0022) [2024-06-25 11:07:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14298824704. Throughput: 0: 42572.9. Samples: 14298902900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 11:07:09,215][15401] Updated weights for policy 0, policy_version 872734 (0.0031) [2024-06-25 11:07:13,127][15401] Updated weights for policy 0, policy_version 872744 (0.0031) [2024-06-25 11:07:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 14299054080. Throughput: 0: 42853.5. Samples: 14299175920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 11:07:17,018][15401] Updated weights for policy 0, policy_version 872754 (0.0045) [2024-06-25 11:07:18,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14299267072. Throughput: 0: 42717.5. Samples: 14299425860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:18,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 11:07:20,697][15401] Updated weights for policy 0, policy_version 872764 (0.0033) [2024-06-25 11:07:23,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14299480064. Throughput: 0: 42745.9. Samples: 14299551380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:23,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 11:07:24,893][15401] Updated weights for policy 0, policy_version 872774 (0.0034) [2024-06-25 11:07:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14299676672. Throughput: 0: 42950.4. Samples: 14299817920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 11:07:28,433][15401] Updated weights for policy 0, policy_version 872784 (0.0026) [2024-06-25 11:07:32,379][15401] Updated weights for policy 0, policy_version 872794 (0.0030) [2024-06-25 11:07:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14299906048. Throughput: 0: 42996.7. Samples: 14300073960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:33,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 11:07:36,242][15401] Updated weights for policy 0, policy_version 872804 (0.0030) [2024-06-25 11:07:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14300151808. Throughput: 0: 43196.4. Samples: 14300205620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:38,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 11:07:39,902][15401] Updated weights for policy 0, policy_version 872814 (0.0031) [2024-06-25 11:07:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 14300315648. Throughput: 0: 43120.0. Samples: 14300461780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 11:07:43,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 11:07:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872823_14300332032.pth... [2024-06-25 11:07:43,542][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872196_14290059264.pth [2024-06-25 11:07:43,730][15401] Updated weights for policy 0, policy_version 872824 (0.0025) [2024-06-25 11:07:47,475][15401] Updated weights for policy 0, policy_version 872834 (0.0026) [2024-06-25 11:07:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14300545024. Throughput: 0: 43061.9. Samples: 14300711980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:07:48,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 11:07:51,348][15401] Updated weights for policy 0, policy_version 872844 (0.0032) [2024-06-25 11:07:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14300758016. Throughput: 0: 43185.3. Samples: 14300846240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:07:53,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 11:07:55,203][15401] Updated weights for policy 0, policy_version 872854 (0.0045) [2024-06-25 11:07:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14300954624. Throughput: 0: 42628.6. Samples: 14301094200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:07:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 11:07:59,283][15401] Updated weights for policy 0, policy_version 872864 (0.0035) [2024-06-25 11:08:00,290][15349] Signal inference workers to stop experience collection... (211700 times) [2024-06-25 11:08:00,292][15349] Signal inference workers to resume experience collection... (211700 times) [2024-06-25 11:08:00,325][15401] InferenceWorker_p0-w0: stopping experience collection (211700 times) [2024-06-25 11:08:00,325][15401] InferenceWorker_p0-w0: resuming experience collection (211700 times) [2024-06-25 11:08:02,675][15401] Updated weights for policy 0, policy_version 872874 (0.0037) [2024-06-25 11:08:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14301184000. Throughput: 0: 42915.1. Samples: 14301356940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 11:08:06,727][15401] Updated weights for policy 0, policy_version 872884 (0.0044) [2024-06-25 11:08:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14301396992. Throughput: 0: 43014.6. Samples: 14301487040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:08,390][15132] Avg episode reward: [(0, '0.244')] [2024-06-25 11:08:10,148][15401] Updated weights for policy 0, policy_version 872894 (0.0027) [2024-06-25 11:08:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 14301609984. Throughput: 0: 42792.4. Samples: 14301743580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 11:08:14,234][15401] Updated weights for policy 0, policy_version 872904 (0.0030) [2024-06-25 11:08:17,653][15401] Updated weights for policy 0, policy_version 872914 (0.0032) [2024-06-25 11:08:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14301839360. Throughput: 0: 42762.4. Samples: 14301998260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:18,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 11:08:21,803][15401] Updated weights for policy 0, policy_version 872924 (0.0043) [2024-06-25 11:08:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14302052352. Throughput: 0: 42890.2. Samples: 14302135680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:23,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 11:08:25,121][15401] Updated weights for policy 0, policy_version 872934 (0.0033) [2024-06-25 11:08:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14302265344. Throughput: 0: 42878.2. Samples: 14302391300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:28,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-25 11:08:29,560][15401] Updated weights for policy 0, policy_version 872944 (0.0043) [2024-06-25 11:08:32,621][15401] Updated weights for policy 0, policy_version 872954 (0.0047) [2024-06-25 11:08:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14302494720. Throughput: 0: 43076.9. Samples: 14302650440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:33,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 11:08:37,144][15401] Updated weights for policy 0, policy_version 872964 (0.0033) [2024-06-25 11:08:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 14302674944. Throughput: 0: 42909.4. Samples: 14302777160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 11:08:40,719][15401] Updated weights for policy 0, policy_version 872974 (0.0035) [2024-06-25 11:08:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14302920704. Throughput: 0: 43155.5. Samples: 14303036200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 11:08:44,596][15401] Updated weights for policy 0, policy_version 872984 (0.0033) [2024-06-25 11:08:48,347][15401] Updated weights for policy 0, policy_version 872994 (0.0029) [2024-06-25 11:08:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14303133696. Throughput: 0: 43036.5. Samples: 14303293580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:48,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 11:08:52,007][15401] Updated weights for policy 0, policy_version 873004 (0.0033) [2024-06-25 11:08:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42821.2). Total num frames: 14303330304. Throughput: 0: 42945.8. Samples: 14303419600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:08:56,094][15401] Updated weights for policy 0, policy_version 873014 (0.0038) [2024-06-25 11:08:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 14303576064. Throughput: 0: 43042.6. Samples: 14303680500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:08:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 11:09:00,061][15401] Updated weights for policy 0, policy_version 873024 (0.0043) [2024-06-25 11:09:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14303772672. Throughput: 0: 42999.9. Samples: 14303933260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 11:09:03,746][15401] Updated weights for policy 0, policy_version 873034 (0.0041) [2024-06-25 11:09:07,986][15401] Updated weights for policy 0, policy_version 873044 (0.0039) [2024-06-25 11:09:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14303969280. Throughput: 0: 42832.1. Samples: 14304063120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 11:09:11,372][15401] Updated weights for policy 0, policy_version 873054 (0.0035) [2024-06-25 11:09:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 14304215040. Throughput: 0: 42940.6. Samples: 14304323620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 11:09:15,462][15401] Updated weights for policy 0, policy_version 873064 (0.0039) [2024-06-25 11:09:17,303][15349] Signal inference workers to stop experience collection... (211750 times) [2024-06-25 11:09:17,303][15349] Signal inference workers to resume experience collection... (211750 times) [2024-06-25 11:09:17,359][15401] InferenceWorker_p0-w0: stopping experience collection (211750 times) [2024-06-25 11:09:17,359][15401] InferenceWorker_p0-w0: resuming experience collection (211750 times) [2024-06-25 11:09:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 14304428032. Throughput: 0: 42987.5. Samples: 14304584880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:18,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 11:09:18,870][15401] Updated weights for policy 0, policy_version 873074 (0.0032) [2024-06-25 11:09:22,983][15401] Updated weights for policy 0, policy_version 873084 (0.0026) [2024-06-25 11:09:23,394][15132] Fps is (10 sec: 40941.7, 60 sec: 42868.4, 300 sec: 42875.5). Total num frames: 14304624640. Throughput: 0: 42887.3. Samples: 14304707280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:23,394][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 11:09:26,506][15401] Updated weights for policy 0, policy_version 873094 (0.0024) [2024-06-25 11:09:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 14304870400. Throughput: 0: 42977.7. Samples: 14304970200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 11:09:30,476][15401] Updated weights for policy 0, policy_version 873104 (0.0031) [2024-06-25 11:09:33,389][15132] Fps is (10 sec: 42617.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14305050624. Throughput: 0: 43048.5. Samples: 14305230760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:09:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 11:09:34,086][15401] Updated weights for policy 0, policy_version 873114 (0.0028) [2024-06-25 11:09:37,902][15401] Updated weights for policy 0, policy_version 873124 (0.0031) [2024-06-25 11:09:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14305280000. Throughput: 0: 42972.8. Samples: 14305353380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:09:38,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 11:09:41,651][15401] Updated weights for policy 0, policy_version 873134 (0.0036) [2024-06-25 11:09:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14305492992. Throughput: 0: 42977.8. Samples: 14305614500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:09:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 11:09:43,452][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873139_14305509376.pth... [2024-06-25 11:09:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872511_14295220224.pth [2024-06-25 11:09:45,314][15401] Updated weights for policy 0, policy_version 873144 (0.0028) [2024-06-25 11:09:48,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 14305705984. Throughput: 0: 43107.1. Samples: 14305873180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:09:48,393][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 11:09:49,221][15401] Updated weights for policy 0, policy_version 873154 (0.0032) [2024-06-25 11:09:52,967][15401] Updated weights for policy 0, policy_version 873164 (0.0027) [2024-06-25 11:09:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14305935360. Throughput: 0: 43125.7. Samples: 14306003780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:09:53,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-25 11:09:56,879][15401] Updated weights for policy 0, policy_version 873174 (0.0032) [2024-06-25 11:09:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14306148352. Throughput: 0: 42941.6. Samples: 14306256000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:09:58,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 11:10:00,497][15401] Updated weights for policy 0, policy_version 873184 (0.0027) [2024-06-25 11:10:03,396][15132] Fps is (10 sec: 42571.5, 60 sec: 43140.0, 300 sec: 42930.7). Total num frames: 14306361344. Throughput: 0: 42999.7. Samples: 14306520140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:03,396][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 11:10:04,558][15401] Updated weights for policy 0, policy_version 873194 (0.0027) [2024-06-25 11:10:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14306557952. Throughput: 0: 43037.0. Samples: 14306643760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 11:10:08,526][15401] Updated weights for policy 0, policy_version 873204 (0.0037) [2024-06-25 11:10:12,231][15401] Updated weights for policy 0, policy_version 873214 (0.0042) [2024-06-25 11:10:13,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42871.3, 300 sec: 42987.2). Total num frames: 14306787328. Throughput: 0: 42961.4. Samples: 14306903460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:13,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 11:10:15,913][15401] Updated weights for policy 0, policy_version 873224 (0.0043) [2024-06-25 11:10:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14306983936. Throughput: 0: 42996.4. Samples: 14307165600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:18,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 11:10:19,965][15401] Updated weights for policy 0, policy_version 873234 (0.0020) [2024-06-25 11:10:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42874.6, 300 sec: 42820.5). Total num frames: 14307196928. Throughput: 0: 42898.3. Samples: 14307283800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 11:10:23,917][15401] Updated weights for policy 0, policy_version 873244 (0.0040) [2024-06-25 11:10:27,821][15401] Updated weights for policy 0, policy_version 873254 (0.0022) [2024-06-25 11:10:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 14307409920. Throughput: 0: 42805.0. Samples: 14307540720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 11:10:31,458][15401] Updated weights for policy 0, policy_version 873264 (0.0037) [2024-06-25 11:10:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14307622912. Throughput: 0: 43004.9. Samples: 14307808300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 11:10:35,343][15401] Updated weights for policy 0, policy_version 873274 (0.0040) [2024-06-25 11:10:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14307868672. Throughput: 0: 42768.0. Samples: 14307928340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:38,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 11:10:39,142][15401] Updated weights for policy 0, policy_version 873284 (0.0038) [2024-06-25 11:10:42,856][15401] Updated weights for policy 0, policy_version 873294 (0.0041) [2024-06-25 11:10:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14308081664. Throughput: 0: 42944.1. Samples: 14308188480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:43,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 11:10:46,871][15401] Updated weights for policy 0, policy_version 873304 (0.0029) [2024-06-25 11:10:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42598.4, 300 sec: 42875.7). Total num frames: 14308261888. Throughput: 0: 42831.3. Samples: 14308447380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:48,393][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 11:10:50,340][15401] Updated weights for policy 0, policy_version 873314 (0.0038) [2024-06-25 11:10:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14308507648. Throughput: 0: 42839.1. Samples: 14308571520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:53,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 11:10:54,540][15401] Updated weights for policy 0, policy_version 873324 (0.0030) [2024-06-25 11:10:57,558][15349] Signal inference workers to stop experience collection... (211800 times) [2024-06-25 11:10:57,559][15349] Signal inference workers to resume experience collection... (211800 times) [2024-06-25 11:10:57,578][15401] InferenceWorker_p0-w0: stopping experience collection (211800 times) [2024-06-25 11:10:57,579][15401] InferenceWorker_p0-w0: resuming experience collection (211800 times) [2024-06-25 11:10:57,873][15401] Updated weights for policy 0, policy_version 873334 (0.0038) [2024-06-25 11:10:58,389][15132] Fps is (10 sec: 47525.4, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 14308737024. Throughput: 0: 42957.9. Samples: 14308836560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:10:58,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 11:11:02,278][15401] Updated weights for policy 0, policy_version 873344 (0.0023) [2024-06-25 11:11:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 14308900864. Throughput: 0: 42933.0. Samples: 14309097580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:11:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 11:11:05,643][15401] Updated weights for policy 0, policy_version 873354 (0.0035) [2024-06-25 11:11:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14309163008. Throughput: 0: 43001.8. Samples: 14309218880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:11:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 11:11:10,034][15401] Updated weights for policy 0, policy_version 873364 (0.0044) [2024-06-25 11:11:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14309343232. Throughput: 0: 43021.2. Samples: 14309476680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:11:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 11:11:13,554][15401] Updated weights for policy 0, policy_version 873374 (0.0037) [2024-06-25 11:11:17,698][15401] Updated weights for policy 0, policy_version 873384 (0.0033) [2024-06-25 11:11:18,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 14309539840. Throughput: 0: 42783.2. Samples: 14309733540. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:11:18,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 11:11:21,167][15401] Updated weights for policy 0, policy_version 873394 (0.0046) [2024-06-25 11:11:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 14309801984. Throughput: 0: 42914.7. Samples: 14309859500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-25 11:11:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 11:11:25,402][15401] Updated weights for policy 0, policy_version 873404 (0.0036) [2024-06-25 11:11:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 14309965824. Throughput: 0: 42908.4. Samples: 14310119460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:28,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 11:11:28,932][15401] Updated weights for policy 0, policy_version 873414 (0.0036) [2024-06-25 11:11:33,096][15401] Updated weights for policy 0, policy_version 873424 (0.0032) [2024-06-25 11:11:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14310178816. Throughput: 0: 42873.0. Samples: 14310376560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:33,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 11:11:36,478][15401] Updated weights for policy 0, policy_version 873434 (0.0042) [2024-06-25 11:11:38,389][15132] Fps is (10 sec: 49164.2, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 14310457344. Throughput: 0: 42917.5. Samples: 14310502800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 11:11:40,867][15401] Updated weights for policy 0, policy_version 873444 (0.0037) [2024-06-25 11:11:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 14310621184. Throughput: 0: 42743.8. Samples: 14310760040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:43,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-25 11:11:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873452_14310637568.pth... [2024-06-25 11:11:43,551][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000872823_14300332032.pth [2024-06-25 11:11:44,231][15401] Updated weights for policy 0, policy_version 873454 (0.0028) [2024-06-25 11:11:48,389][15132] Fps is (10 sec: 36044.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 14310817792. Throughput: 0: 42600.4. Samples: 14311014600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:48,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 11:11:48,440][15401] Updated weights for policy 0, policy_version 873464 (0.0030) [2024-06-25 11:11:51,972][15401] Updated weights for policy 0, policy_version 873474 (0.0039) [2024-06-25 11:11:53,389][15132] Fps is (10 sec: 47514.7, 60 sec: 43144.7, 300 sec: 43042.7). Total num frames: 14311096320. Throughput: 0: 42715.2. Samples: 14311141060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 11:11:56,031][15401] Updated weights for policy 0, policy_version 873484 (0.0039) [2024-06-25 11:11:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 14311260160. Throughput: 0: 42665.4. Samples: 14311396620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:11:58,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 11:11:59,649][15401] Updated weights for policy 0, policy_version 873494 (0.0033) [2024-06-25 11:12:00,200][15349] Signal inference workers to stop experience collection... (211850 times) [2024-06-25 11:12:00,201][15349] Signal inference workers to resume experience collection... (211850 times) [2024-06-25 11:12:00,229][15401] InferenceWorker_p0-w0: stopping experience collection (211850 times) [2024-06-25 11:12:00,229][15401] InferenceWorker_p0-w0: resuming experience collection (211850 times) [2024-06-25 11:12:03,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 14311473152. Throughput: 0: 42603.0. Samples: 14311650680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:03,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 11:12:03,550][15401] Updated weights for policy 0, policy_version 873504 (0.0032) [2024-06-25 11:12:07,142][15401] Updated weights for policy 0, policy_version 873514 (0.0033) [2024-06-25 11:12:08,396][15132] Fps is (10 sec: 47482.8, 60 sec: 42866.9, 300 sec: 42986.3). Total num frames: 14311735296. Throughput: 0: 42663.7. Samples: 14311779640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:08,397][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 11:12:11,421][15401] Updated weights for policy 0, policy_version 873524 (0.0041) [2024-06-25 11:12:13,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 14311882752. Throughput: 0: 42666.7. Samples: 14312039360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 11:12:14,674][15401] Updated weights for policy 0, policy_version 873534 (0.0025) [2024-06-25 11:12:18,389][15132] Fps is (10 sec: 37707.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14312112128. Throughput: 0: 42502.2. Samples: 14312289160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 11:12:19,100][15401] Updated weights for policy 0, policy_version 873544 (0.0030) [2024-06-25 11:12:22,263][15401] Updated weights for policy 0, policy_version 873554 (0.0041) [2024-06-25 11:12:23,390][15132] Fps is (10 sec: 49150.6, 60 sec: 42871.2, 300 sec: 43042.7). Total num frames: 14312374272. Throughput: 0: 42733.0. Samples: 14312425800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 11:12:26,850][15401] Updated weights for policy 0, policy_version 873564 (0.0030) [2024-06-25 11:12:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14312521728. Throughput: 0: 42633.1. Samples: 14312678520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 11:12:29,993][15401] Updated weights for policy 0, policy_version 873574 (0.0043) [2024-06-25 11:12:33,389][15132] Fps is (10 sec: 37684.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14312751104. Throughput: 0: 42512.0. Samples: 14312927640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 11:12:34,776][15401] Updated weights for policy 0, policy_version 873584 (0.0038) [2024-06-25 11:12:37,573][15401] Updated weights for policy 0, policy_version 873594 (0.0028) [2024-06-25 11:12:38,390][15132] Fps is (10 sec: 47509.1, 60 sec: 42324.6, 300 sec: 42987.0). Total num frames: 14312996864. Throughput: 0: 42688.8. Samples: 14313062100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:38,391][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 11:12:42,257][15401] Updated weights for policy 0, policy_version 873604 (0.0041) [2024-06-25 11:12:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 14313160704. Throughput: 0: 42726.8. Samples: 14313319320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:43,390][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 11:12:45,153][15401] Updated weights for policy 0, policy_version 873614 (0.0042) [2024-06-25 11:12:48,389][15132] Fps is (10 sec: 40964.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14313406464. Throughput: 0: 42625.9. Samples: 14313568840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:48,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 11:12:49,769][15401] Updated weights for policy 0, policy_version 873624 (0.0041) [2024-06-25 11:12:53,163][15401] Updated weights for policy 0, policy_version 873634 (0.0033) [2024-06-25 11:12:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42931.6). Total num frames: 14313619456. Throughput: 0: 42797.3. Samples: 14313705240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 11:12:57,291][15401] Updated weights for policy 0, policy_version 873644 (0.0034) [2024-06-25 11:12:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14313799680. Throughput: 0: 42622.3. Samples: 14313957360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:12:58,396][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 11:13:00,797][15401] Updated weights for policy 0, policy_version 873654 (0.0027) [2024-06-25 11:13:02,389][15349] Signal inference workers to stop experience collection... (211900 times) [2024-06-25 11:13:02,400][15401] InferenceWorker_p0-w0: stopping experience collection (211900 times) [2024-06-25 11:13:02,450][15349] Signal inference workers to resume experience collection... (211900 times) [2024-06-25 11:13:02,450][15401] InferenceWorker_p0-w0: resuming experience collection (211900 times) [2024-06-25 11:13:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14314061824. Throughput: 0: 42518.6. Samples: 14314202500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:13:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 11:13:05,182][15401] Updated weights for policy 0, policy_version 873664 (0.0047) [2024-06-25 11:13:08,378][15401] Updated weights for policy 0, policy_version 873674 (0.0038) [2024-06-25 11:13:08,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42328.2, 300 sec: 42931.3). Total num frames: 14314274816. Throughput: 0: 42662.9. Samples: 14314345720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:13:08,392][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 11:13:12,679][15401] Updated weights for policy 0, policy_version 873684 (0.0029) [2024-06-25 11:13:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14314455040. Throughput: 0: 42542.5. Samples: 14314592940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:13:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 11:13:16,280][15401] Updated weights for policy 0, policy_version 873694 (0.0041) [2024-06-25 11:13:18,390][15132] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14314700800. Throughput: 0: 42482.1. Samples: 14314839340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-25 11:13:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 11:13:20,279][15401] Updated weights for policy 0, policy_version 873704 (0.0034) [2024-06-25 11:13:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42052.5, 300 sec: 42820.6). Total num frames: 14314897408. Throughput: 0: 42492.9. Samples: 14314974240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 11:13:24,056][15401] Updated weights for policy 0, policy_version 873714 (0.0033) [2024-06-25 11:13:28,190][15401] Updated weights for policy 0, policy_version 873724 (0.0041) [2024-06-25 11:13:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14315094016. Throughput: 0: 42326.6. Samples: 14315224020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:28,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 11:13:31,721][15401] Updated weights for policy 0, policy_version 873734 (0.0041) [2024-06-25 11:13:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14315339776. Throughput: 0: 42400.4. Samples: 14315476860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:33,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 11:13:35,963][15401] Updated weights for policy 0, policy_version 873744 (0.0026) [2024-06-25 11:13:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42326.0, 300 sec: 42765.0). Total num frames: 14315536384. Throughput: 0: 42399.1. Samples: 14315613200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 11:13:39,231][15401] Updated weights for policy 0, policy_version 873754 (0.0027) [2024-06-25 11:13:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14315732992. Throughput: 0: 42436.0. Samples: 14315866980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 11:13:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873764_14315749376.pth... [2024-06-25 11:13:43,519][15401] Updated weights for policy 0, policy_version 873764 (0.0046) [2024-06-25 11:13:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873139_14305509376.pth [2024-06-25 11:13:46,849][15401] Updated weights for policy 0, policy_version 873774 (0.0040) [2024-06-25 11:13:48,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.8, 300 sec: 42875.1). Total num frames: 14315978752. Throughput: 0: 42482.9. Samples: 14316114500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:48,397][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 11:13:51,334][15401] Updated weights for policy 0, policy_version 873784 (0.0030) [2024-06-25 11:13:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14316158976. Throughput: 0: 42275.5. Samples: 14316248020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 11:13:54,601][15401] Updated weights for policy 0, policy_version 873794 (0.0028) [2024-06-25 11:13:58,389][15132] Fps is (10 sec: 37707.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14316355584. Throughput: 0: 42260.6. Samples: 14316494660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:13:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 11:13:59,058][15401] Updated weights for policy 0, policy_version 873804 (0.0035) [2024-06-25 11:14:02,304][15401] Updated weights for policy 0, policy_version 873814 (0.0038) [2024-06-25 11:14:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14316617728. Throughput: 0: 42493.4. Samples: 14316751540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 11:14:06,772][15401] Updated weights for policy 0, policy_version 873824 (0.0042) [2024-06-25 11:14:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42054.0, 300 sec: 42653.9). Total num frames: 14316797952. Throughput: 0: 42576.0. Samples: 14316890160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:08,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 11:14:10,168][15401] Updated weights for policy 0, policy_version 873834 (0.0037) [2024-06-25 11:14:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14317010944. Throughput: 0: 42601.9. Samples: 14317141100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:13,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 11:14:14,433][15401] Updated weights for policy 0, policy_version 873844 (0.0028) [2024-06-25 11:14:17,793][15401] Updated weights for policy 0, policy_version 873854 (0.0026) [2024-06-25 11:14:18,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42821.2). Total num frames: 14317256704. Throughput: 0: 42596.3. Samples: 14317393700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 11:14:22,195][15401] Updated weights for policy 0, policy_version 873864 (0.0031) [2024-06-25 11:14:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14317420544. Throughput: 0: 42565.2. Samples: 14317528640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:23,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 11:14:25,649][15401] Updated weights for policy 0, policy_version 873874 (0.0033) [2024-06-25 11:14:27,922][15349] Signal inference workers to stop experience collection... (211950 times) [2024-06-25 11:14:27,978][15401] InferenceWorker_p0-w0: stopping experience collection (211950 times) [2024-06-25 11:14:27,987][15349] Signal inference workers to resume experience collection... (211950 times) [2024-06-25 11:14:27,997][15401] InferenceWorker_p0-w0: resuming experience collection (211950 times) [2024-06-25 11:14:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14317666304. Throughput: 0: 42439.0. Samples: 14317776740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 11:14:29,770][15401] Updated weights for policy 0, policy_version 873884 (0.0043) [2024-06-25 11:14:33,208][15401] Updated weights for policy 0, policy_version 873894 (0.0029) [2024-06-25 11:14:33,394][15132] Fps is (10 sec: 45857.2, 60 sec: 42322.5, 300 sec: 42708.9). Total num frames: 14317879296. Throughput: 0: 42493.9. Samples: 14318026620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:33,394][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:14:37,495][15401] Updated weights for policy 0, policy_version 873904 (0.0034) [2024-06-25 11:14:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14318059520. Throughput: 0: 42384.9. Samples: 14318155340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:14:40,722][15401] Updated weights for policy 0, policy_version 873914 (0.0029) [2024-06-25 11:14:43,392][15132] Fps is (10 sec: 40966.8, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 14318288896. Throughput: 0: 42733.2. Samples: 14318417760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:43,392][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 11:14:45,041][15401] Updated weights for policy 0, policy_version 873924 (0.0039) [2024-06-25 11:14:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42330.0, 300 sec: 42654.0). Total num frames: 14318518272. Throughput: 0: 42647.3. Samples: 14318670660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:48,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 11:14:48,736][15401] Updated weights for policy 0, policy_version 873934 (0.0044) [2024-06-25 11:14:52,966][15401] Updated weights for policy 0, policy_version 873944 (0.0039) [2024-06-25 11:14:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14318714880. Throughput: 0: 42600.8. Samples: 14318807200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 11:14:56,222][15401] Updated weights for policy 0, policy_version 873954 (0.0023) [2024-06-25 11:14:58,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42869.7, 300 sec: 42599.0). Total num frames: 14318927872. Throughput: 0: 42666.1. Samples: 14319061180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:14:58,393][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 11:15:00,503][15401] Updated weights for policy 0, policy_version 873964 (0.0031) [2024-06-25 11:15:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14319157248. Throughput: 0: 42674.8. Samples: 14319314060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:15:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 11:15:03,882][15401] Updated weights for policy 0, policy_version 873974 (0.0039) [2024-06-25 11:15:08,121][15401] Updated weights for policy 0, policy_version 873984 (0.0046) [2024-06-25 11:15:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14319370240. Throughput: 0: 42743.7. Samples: 14319452100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 11:15:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 11:15:11,393][15401] Updated weights for policy 0, policy_version 873994 (0.0028) [2024-06-25 11:15:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 14319566848. Throughput: 0: 42714.1. Samples: 14319698880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 11:15:15,757][15401] Updated weights for policy 0, policy_version 874004 (0.0042) [2024-06-25 11:15:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14319796224. Throughput: 0: 42946.9. Samples: 14319959060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:18,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 11:15:19,052][15401] Updated weights for policy 0, policy_version 874014 (0.0047) [2024-06-25 11:15:23,392][15132] Fps is (10 sec: 44227.3, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 14320009216. Throughput: 0: 42972.8. Samples: 14320089220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:23,392][15401] Updated weights for policy 0, policy_version 874024 (0.0037) [2024-06-25 11:15:23,392][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 11:15:26,536][15401] Updated weights for policy 0, policy_version 874034 (0.0043) [2024-06-25 11:15:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14320205824. Throughput: 0: 42652.0. Samples: 14320337000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:28,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 11:15:31,111][15401] Updated weights for policy 0, policy_version 874044 (0.0029) [2024-06-25 11:15:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42601.2, 300 sec: 42598.4). Total num frames: 14320435200. Throughput: 0: 42778.5. Samples: 14320595700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:33,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 11:15:34,351][15401] Updated weights for policy 0, policy_version 874054 (0.0040) [2024-06-25 11:15:37,125][15349] Signal inference workers to stop experience collection... (212000 times) [2024-06-25 11:15:37,125][15349] Signal inference workers to resume experience collection... (212000 times) [2024-06-25 11:15:37,176][15401] InferenceWorker_p0-w0: stopping experience collection (212000 times) [2024-06-25 11:15:37,176][15401] InferenceWorker_p0-w0: resuming experience collection (212000 times) [2024-06-25 11:15:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14320631808. Throughput: 0: 42590.3. Samples: 14320723760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:38,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 11:15:38,823][15401] Updated weights for policy 0, policy_version 874064 (0.0036) [2024-06-25 11:15:42,146][15401] Updated weights for policy 0, policy_version 874074 (0.0037) [2024-06-25 11:15:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 14320861184. Throughput: 0: 42382.2. Samples: 14320968280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 11:15:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874076_14320861184.pth... [2024-06-25 11:15:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873452_14310637568.pth [2024-06-25 11:15:46,654][15401] Updated weights for policy 0, policy_version 874084 (0.0038) [2024-06-25 11:15:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 14321025024. Throughput: 0: 42593.7. Samples: 14321230780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 11:15:50,226][15401] Updated weights for policy 0, policy_version 874094 (0.0023) [2024-06-25 11:15:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14321270784. Throughput: 0: 42209.8. Samples: 14321351540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:53,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 11:15:54,346][15401] Updated weights for policy 0, policy_version 874104 (0.0036) [2024-06-25 11:15:58,022][15401] Updated weights for policy 0, policy_version 874114 (0.0037) [2024-06-25 11:15:58,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 14321500160. Throughput: 0: 42342.4. Samples: 14321604280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:15:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 11:16:02,921][15401] Updated weights for policy 0, policy_version 874124 (0.0026) [2024-06-25 11:16:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 14321680384. Throughput: 0: 42215.3. Samples: 14321858740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 11:16:05,747][15401] Updated weights for policy 0, policy_version 874134 (0.0032) [2024-06-25 11:16:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 14321876992. Throughput: 0: 41934.3. Samples: 14321976160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 11:16:10,489][15401] Updated weights for policy 0, policy_version 874144 (0.0032) [2024-06-25 11:16:13,388][15401] Updated weights for policy 0, policy_version 874154 (0.0036) [2024-06-25 11:16:13,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14322139136. Throughput: 0: 42206.2. Samples: 14322236280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 11:16:18,046][15401] Updated weights for policy 0, policy_version 874164 (0.0043) [2024-06-25 11:16:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 14322319360. Throughput: 0: 42201.3. Samples: 14322494760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:18,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 11:16:21,006][15401] Updated weights for policy 0, policy_version 874174 (0.0048) [2024-06-25 11:16:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42053.8, 300 sec: 42598.7). Total num frames: 14322532352. Throughput: 0: 41917.1. Samples: 14322610040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 11:16:25,679][15401] Updated weights for policy 0, policy_version 874184 (0.0027) [2024-06-25 11:16:28,391][15132] Fps is (10 sec: 44230.2, 60 sec: 42597.3, 300 sec: 42653.7). Total num frames: 14322761728. Throughput: 0: 42282.5. Samples: 14322871060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:28,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 11:16:28,683][15401] Updated weights for policy 0, policy_version 874194 (0.0043) [2024-06-25 11:16:33,284][15401] Updated weights for policy 0, policy_version 874204 (0.0047) [2024-06-25 11:16:33,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 14322958336. Throughput: 0: 42255.7. Samples: 14323132280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 11:16:36,765][15401] Updated weights for policy 0, policy_version 874214 (0.0027) [2024-06-25 11:16:38,390][15132] Fps is (10 sec: 40966.1, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 14323171328. Throughput: 0: 42246.5. Samples: 14323252640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:38,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 11:16:41,123][15401] Updated weights for policy 0, policy_version 874224 (0.0032) [2024-06-25 11:16:43,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14323384320. Throughput: 0: 42235.0. Samples: 14323504860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:43,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 11:16:44,406][15401] Updated weights for policy 0, policy_version 874234 (0.0029) [2024-06-25 11:16:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 14323597312. Throughput: 0: 42330.9. Samples: 14323763640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 11:16:48,956][15401] Updated weights for policy 0, policy_version 874244 (0.0045) [2024-06-25 11:16:52,034][15401] Updated weights for policy 0, policy_version 874254 (0.0032) [2024-06-25 11:16:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14323826688. Throughput: 0: 42538.2. Samples: 14323890380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 11:16:56,521][15401] Updated weights for policy 0, policy_version 874264 (0.0035) [2024-06-25 11:16:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14324039680. Throughput: 0: 42473.8. Samples: 14324147600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:16:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 11:16:59,744][15401] Updated weights for policy 0, policy_version 874274 (0.0042) [2024-06-25 11:17:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42321.6). Total num frames: 14324219904. Throughput: 0: 42579.7. Samples: 14324410840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 11:17:03,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 11:17:04,086][15401] Updated weights for policy 0, policy_version 874284 (0.0028) [2024-06-25 11:17:07,352][15349] Signal inference workers to stop experience collection... (212050 times) [2024-06-25 11:17:07,409][15349] Signal inference workers to resume experience collection... (212050 times) [2024-06-25 11:17:07,410][15401] InferenceWorker_p0-w0: stopping experience collection (212050 times) [2024-06-25 11:17:07,413][15401] Updated weights for policy 0, policy_version 874294 (0.0021) [2024-06-25 11:17:07,437][15401] InferenceWorker_p0-w0: resuming experience collection (212050 times) [2024-06-25 11:17:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14324465664. Throughput: 0: 42792.6. Samples: 14324535700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 11:17:11,835][15401] Updated weights for policy 0, policy_version 874304 (0.0033) [2024-06-25 11:17:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14324662272. Throughput: 0: 42681.6. Samples: 14324791660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 11:17:14,922][15401] Updated weights for policy 0, policy_version 874314 (0.0023) [2024-06-25 11:17:18,391][15132] Fps is (10 sec: 39317.3, 60 sec: 42324.7, 300 sec: 42320.6). Total num frames: 14324858880. Throughput: 0: 42451.0. Samples: 14325042620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:18,391][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 11:17:19,659][15401] Updated weights for policy 0, policy_version 874324 (0.0033) [2024-06-25 11:17:22,826][15401] Updated weights for policy 0, policy_version 874334 (0.0033) [2024-06-25 11:17:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14325104640. Throughput: 0: 42549.4. Samples: 14325167360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:17:27,369][15401] Updated weights for policy 0, policy_version 874344 (0.0034) [2024-06-25 11:17:28,390][15132] Fps is (10 sec: 45879.4, 60 sec: 42599.4, 300 sec: 42598.4). Total num frames: 14325317632. Throughput: 0: 42879.6. Samples: 14325434440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 11:17:30,449][15401] Updated weights for policy 0, policy_version 874354 (0.0040) [2024-06-25 11:17:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42431.9). Total num frames: 14325514240. Throughput: 0: 42612.5. Samples: 14325681200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:17:34,959][15401] Updated weights for policy 0, policy_version 874364 (0.0029) [2024-06-25 11:17:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14325727232. Throughput: 0: 42567.1. Samples: 14325805900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:38,392][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 11:17:38,558][15401] Updated weights for policy 0, policy_version 874374 (0.0022) [2024-06-25 11:17:42,679][15401] Updated weights for policy 0, policy_version 874384 (0.0041) [2024-06-25 11:17:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14325940224. Throughput: 0: 42776.0. Samples: 14326072520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 11:17:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874387_14325956608.pth... [2024-06-25 11:17:43,578][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000873764_14315749376.pth [2024-06-25 11:17:46,203][15401] Updated weights for policy 0, policy_version 874394 (0.0035) [2024-06-25 11:17:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 14326169600. Throughput: 0: 42525.3. Samples: 14326324480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 11:17:50,177][15401] Updated weights for policy 0, policy_version 874404 (0.0033) [2024-06-25 11:17:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14326382592. Throughput: 0: 42634.6. Samples: 14326454260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:53,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 11:17:53,752][15401] Updated weights for policy 0, policy_version 874414 (0.0043) [2024-06-25 11:17:57,794][15401] Updated weights for policy 0, policy_version 874424 (0.0039) [2024-06-25 11:17:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 14326595584. Throughput: 0: 42807.6. Samples: 14326718000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:17:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 11:18:01,359][15401] Updated weights for policy 0, policy_version 874434 (0.0025) [2024-06-25 11:18:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42487.7). Total num frames: 14326808576. Throughput: 0: 42826.4. Samples: 14326969760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 11:18:05,261][15401] Updated weights for policy 0, policy_version 874444 (0.0024) [2024-06-25 11:18:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 14327021568. Throughput: 0: 42993.7. Samples: 14327102180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:08,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 11:18:08,759][15401] Updated weights for policy 0, policy_version 874454 (0.0022) [2024-06-25 11:18:12,738][15401] Updated weights for policy 0, policy_version 874464 (0.0035) [2024-06-25 11:18:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 14327250944. Throughput: 0: 42910.3. Samples: 14327365400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:13,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 11:18:16,298][15401] Updated weights for policy 0, policy_version 874474 (0.0034) [2024-06-25 11:18:18,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43418.4, 300 sec: 42598.4). Total num frames: 14327463936. Throughput: 0: 43063.6. Samples: 14327619060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 11:18:20,326][15401] Updated weights for policy 0, policy_version 874484 (0.0036) [2024-06-25 11:18:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14327693312. Throughput: 0: 43151.6. Samples: 14327747720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 11:18:23,747][15401] Updated weights for policy 0, policy_version 874494 (0.0028) [2024-06-25 11:18:27,826][15401] Updated weights for policy 0, policy_version 874504 (0.0041) [2024-06-25 11:18:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 14327906304. Throughput: 0: 43231.7. Samples: 14328017940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 11:18:30,499][15349] Signal inference workers to stop experience collection... (212100 times) [2024-06-25 11:18:30,499][15349] Signal inference workers to resume experience collection... (212100 times) [2024-06-25 11:18:30,544][15401] InferenceWorker_p0-w0: stopping experience collection (212100 times) [2024-06-25 11:18:30,544][15401] InferenceWorker_p0-w0: resuming experience collection (212100 times) [2024-06-25 11:18:31,518][15401] Updated weights for policy 0, policy_version 874514 (0.0035) [2024-06-25 11:18:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14328102912. Throughput: 0: 43136.0. Samples: 14328265600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:18:35,352][15401] Updated weights for policy 0, policy_version 874524 (0.0039) [2024-06-25 11:18:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14328315904. Throughput: 0: 43161.3. Samples: 14328396520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 11:18:39,381][15401] Updated weights for policy 0, policy_version 874534 (0.0037) [2024-06-25 11:18:43,136][15401] Updated weights for policy 0, policy_version 874544 (0.0029) [2024-06-25 11:18:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42599.3). Total num frames: 14328545280. Throughput: 0: 43097.2. Samples: 14328657380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:43,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 11:18:46,874][15401] Updated weights for policy 0, policy_version 874554 (0.0037) [2024-06-25 11:18:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14328758272. Throughput: 0: 43088.7. Samples: 14328908760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:48,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 11:18:51,091][15401] Updated weights for policy 0, policy_version 874564 (0.0042) [2024-06-25 11:18:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14328954880. Throughput: 0: 43083.7. Samples: 14329040840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-25 11:18:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 11:18:54,434][15401] Updated weights for policy 0, policy_version 874574 (0.0027) [2024-06-25 11:18:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 14329167872. Throughput: 0: 42994.6. Samples: 14329300160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:18:58,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 11:18:58,786][15401] Updated weights for policy 0, policy_version 874584 (0.0030) [2024-06-25 11:19:02,307][15401] Updated weights for policy 0, policy_version 874594 (0.0026) [2024-06-25 11:19:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14329397248. Throughput: 0: 42958.3. Samples: 14329552180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:03,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 11:19:06,376][15401] Updated weights for policy 0, policy_version 874604 (0.0042) [2024-06-25 11:19:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 14329610240. Throughput: 0: 43027.2. Samples: 14329683940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 11:19:09,821][15401] Updated weights for policy 0, policy_version 874614 (0.0028) [2024-06-25 11:19:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14329806848. Throughput: 0: 42689.7. Samples: 14329938980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 11:19:13,950][15401] Updated weights for policy 0, policy_version 874624 (0.0034) [2024-06-25 11:19:17,567][15401] Updated weights for policy 0, policy_version 874634 (0.0047) [2024-06-25 11:19:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14330036224. Throughput: 0: 42770.6. Samples: 14330190280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:18,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 11:19:21,590][15401] Updated weights for policy 0, policy_version 874644 (0.0042) [2024-06-25 11:19:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14330265600. Throughput: 0: 42875.6. Samples: 14330325920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 11:19:25,299][15401] Updated weights for policy 0, policy_version 874654 (0.0029) [2024-06-25 11:19:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42599.0). Total num frames: 14330445824. Throughput: 0: 42725.3. Samples: 14330580020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 11:19:29,585][15401] Updated weights for policy 0, policy_version 874664 (0.0031) [2024-06-25 11:19:32,918][15401] Updated weights for policy 0, policy_version 874674 (0.0031) [2024-06-25 11:19:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14330675200. Throughput: 0: 42609.7. Samples: 14330826200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 11:19:37,147][15401] Updated weights for policy 0, policy_version 874684 (0.0037) [2024-06-25 11:19:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14330888192. Throughput: 0: 42710.2. Samples: 14330962800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 11:19:40,278][15401] Updated weights for policy 0, policy_version 874694 (0.0037) [2024-06-25 11:19:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14331084800. Throughput: 0: 42671.6. Samples: 14331220380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:43,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 11:19:43,456][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874701_14331101184.pth... [2024-06-25 11:19:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874076_14320861184.pth [2024-06-25 11:19:44,707][15401] Updated weights for policy 0, policy_version 874704 (0.0036) [2024-06-25 11:19:47,938][15401] Updated weights for policy 0, policy_version 874714 (0.0024) [2024-06-25 11:19:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14331314176. Throughput: 0: 42573.7. Samples: 14331468000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 11:19:52,297][15349] Signal inference workers to stop experience collection... (212150 times) [2024-06-25 11:19:52,298][15349] Signal inference workers to resume experience collection... (212150 times) [2024-06-25 11:19:52,308][15401] InferenceWorker_p0-w0: stopping experience collection (212150 times) [2024-06-25 11:19:52,308][15401] InferenceWorker_p0-w0: resuming experience collection (212150 times) [2024-06-25 11:19:52,435][15401] Updated weights for policy 0, policy_version 874724 (0.0023) [2024-06-25 11:19:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 14331527168. Throughput: 0: 42664.7. Samples: 14331603860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:53,392][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 11:19:55,729][15401] Updated weights for policy 0, policy_version 874734 (0.0040) [2024-06-25 11:19:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14331723776. Throughput: 0: 42623.1. Samples: 14331857020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:19:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 11:20:00,151][15401] Updated weights for policy 0, policy_version 874744 (0.0034) [2024-06-25 11:20:03,293][15401] Updated weights for policy 0, policy_version 874754 (0.0032) [2024-06-25 11:20:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14331969536. Throughput: 0: 42561.3. Samples: 14332105540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:03,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 11:20:07,713][15401] Updated weights for policy 0, policy_version 874764 (0.0030) [2024-06-25 11:20:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14332166144. Throughput: 0: 42604.1. Samples: 14332243100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 11:20:10,817][15401] Updated weights for policy 0, policy_version 874774 (0.0035) [2024-06-25 11:20:13,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14332346368. Throughput: 0: 42536.2. Samples: 14332494140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 11:20:15,330][15401] Updated weights for policy 0, policy_version 874784 (0.0028) [2024-06-25 11:20:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14332608512. Throughput: 0: 42838.3. Samples: 14332753920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 11:20:18,682][15401] Updated weights for policy 0, policy_version 874794 (0.0033) [2024-06-25 11:20:22,861][15401] Updated weights for policy 0, policy_version 874804 (0.0042) [2024-06-25 11:20:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14332805120. Throughput: 0: 42855.6. Samples: 14332891300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:23,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 11:20:26,301][15401] Updated weights for policy 0, policy_version 874814 (0.0039) [2024-06-25 11:20:28,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 14332985344. Throughput: 0: 42633.9. Samples: 14333138900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 11:20:30,787][15401] Updated weights for policy 0, policy_version 874824 (0.0030) [2024-06-25 11:20:33,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14333247488. Throughput: 0: 42925.6. Samples: 14333399660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:33,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 11:20:33,790][15401] Updated weights for policy 0, policy_version 874834 (0.0030) [2024-06-25 11:20:38,245][15401] Updated weights for policy 0, policy_version 874844 (0.0049) [2024-06-25 11:20:38,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14333460480. Throughput: 0: 42969.9. Samples: 14333537500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:20:41,751][15401] Updated weights for policy 0, policy_version 874854 (0.0032) [2024-06-25 11:20:43,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14333640704. Throughput: 0: 42766.6. Samples: 14333781620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:20:43,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 11:20:45,795][15401] Updated weights for policy 0, policy_version 874864 (0.0029) [2024-06-25 11:20:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14333886464. Throughput: 0: 43054.8. Samples: 14334043000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:20:48,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 11:20:49,254][15401] Updated weights for policy 0, policy_version 874874 (0.0036) [2024-06-25 11:20:53,268][15401] Updated weights for policy 0, policy_version 874884 (0.0039) [2024-06-25 11:20:53,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14334099456. Throughput: 0: 43015.9. Samples: 14334178820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:20:53,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 11:20:57,015][15401] Updated weights for policy 0, policy_version 874894 (0.0033) [2024-06-25 11:20:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14334296064. Throughput: 0: 42938.0. Samples: 14334426360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:20:58,391][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 11:21:00,631][15401] Updated weights for policy 0, policy_version 874904 (0.0034) [2024-06-25 11:21:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14334525440. Throughput: 0: 43021.4. Samples: 14334689880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 11:21:04,911][15401] Updated weights for policy 0, policy_version 874914 (0.0026) [2024-06-25 11:21:08,075][15401] Updated weights for policy 0, policy_version 874924 (0.0033) [2024-06-25 11:21:08,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14334754816. Throughput: 0: 42888.3. Samples: 14334821280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 11:21:12,400][15401] Updated weights for policy 0, policy_version 874934 (0.0028) [2024-06-25 11:21:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43415.8, 300 sec: 42820.2). Total num frames: 14334951424. Throughput: 0: 43207.4. Samples: 14335083340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:13,392][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 11:21:15,570][15401] Updated weights for policy 0, policy_version 874944 (0.0042) [2024-06-25 11:21:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14335180800. Throughput: 0: 43011.6. Samples: 14335335180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:18,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 11:21:20,049][15401] Updated weights for policy 0, policy_version 874954 (0.0045) [2024-06-25 11:21:23,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42820.8). Total num frames: 14335393792. Throughput: 0: 42871.6. Samples: 14335466720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 11:21:23,515][15401] Updated weights for policy 0, policy_version 874964 (0.0042) [2024-06-25 11:21:27,621][15401] Updated weights for policy 0, policy_version 874974 (0.0044) [2024-06-25 11:21:28,390][15132] Fps is (10 sec: 40960.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 14335590400. Throughput: 0: 43250.4. Samples: 14335727780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 11:21:29,760][15349] Signal inference workers to stop experience collection... (212200 times) [2024-06-25 11:21:29,761][15349] Signal inference workers to resume experience collection... (212200 times) [2024-06-25 11:21:29,789][15401] InferenceWorker_p0-w0: stopping experience collection (212200 times) [2024-06-25 11:21:29,789][15401] InferenceWorker_p0-w0: resuming experience collection (212200 times) [2024-06-25 11:21:30,971][15401] Updated weights for policy 0, policy_version 874984 (0.0034) [2024-06-25 11:21:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14335819776. Throughput: 0: 43055.5. Samples: 14335980500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 11:21:35,307][15401] Updated weights for policy 0, policy_version 874994 (0.0040) [2024-06-25 11:21:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14336049152. Throughput: 0: 42881.8. Samples: 14336108500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 11:21:39,107][15401] Updated weights for policy 0, policy_version 875004 (0.0031) [2024-06-25 11:21:42,979][15401] Updated weights for policy 0, policy_version 875014 (0.0026) [2024-06-25 11:21:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 14336229376. Throughput: 0: 43178.7. Samples: 14336369400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:43,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 11:21:43,448][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875015_14336245760.pth... [2024-06-25 11:21:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874387_14325956608.pth [2024-06-25 11:21:46,681][15401] Updated weights for policy 0, policy_version 875024 (0.0040) [2024-06-25 11:21:48,391][15132] Fps is (10 sec: 40952.2, 60 sec: 42870.0, 300 sec: 42820.3). Total num frames: 14336458752. Throughput: 0: 42968.3. Samples: 14336623540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:48,392][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 11:21:50,891][15401] Updated weights for policy 0, policy_version 875034 (0.0041) [2024-06-25 11:21:53,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14336688128. Throughput: 0: 42999.7. Samples: 14336756260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 11:21:54,192][15401] Updated weights for policy 0, policy_version 875044 (0.0029) [2024-06-25 11:21:58,389][15132] Fps is (10 sec: 40968.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14336868352. Throughput: 0: 42916.2. Samples: 14337014460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:21:58,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 11:21:58,413][15401] Updated weights for policy 0, policy_version 875054 (0.0027) [2024-06-25 11:22:01,781][15401] Updated weights for policy 0, policy_version 875064 (0.0032) [2024-06-25 11:22:03,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14337097728. Throughput: 0: 43068.6. Samples: 14337273260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:03,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 11:22:06,007][15401] Updated weights for policy 0, policy_version 875074 (0.0030) [2024-06-25 11:22:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14337327104. Throughput: 0: 43109.6. Samples: 14337406660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:08,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 11:22:09,303][15401] Updated weights for policy 0, policy_version 875084 (0.0040) [2024-06-25 11:22:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42931.8). Total num frames: 14337523712. Throughput: 0: 42905.3. Samples: 14337658520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:13,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 11:22:13,584][15401] Updated weights for policy 0, policy_version 875094 (0.0033) [2024-06-25 11:22:17,079][15401] Updated weights for policy 0, policy_version 875104 (0.0038) [2024-06-25 11:22:18,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 14337736704. Throughput: 0: 43014.8. Samples: 14337916160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 11:22:21,099][15401] Updated weights for policy 0, policy_version 875114 (0.0037) [2024-06-25 11:22:23,393][15132] Fps is (10 sec: 44219.9, 60 sec: 42868.7, 300 sec: 42875.6). Total num frames: 14337966080. Throughput: 0: 43083.9. Samples: 14338047440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:23,394][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 11:22:24,718][15401] Updated weights for policy 0, policy_version 875124 (0.0032) [2024-06-25 11:22:28,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14338179072. Throughput: 0: 43015.6. Samples: 14338305100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 11:22:28,497][15401] Updated weights for policy 0, policy_version 875134 (0.0035) [2024-06-25 11:22:32,559][15401] Updated weights for policy 0, policy_version 875144 (0.0027) [2024-06-25 11:22:33,390][15132] Fps is (10 sec: 42614.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14338392064. Throughput: 0: 43045.8. Samples: 14338560520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:33,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 11:22:35,866][15401] Updated weights for policy 0, policy_version 875154 (0.0037) [2024-06-25 11:22:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14338621440. Throughput: 0: 42926.5. Samples: 14338687960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 11:22:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 11:22:40,179][15401] Updated weights for policy 0, policy_version 875164 (0.0031) [2024-06-25 11:22:43,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 14338834432. Throughput: 0: 42934.0. Samples: 14338946600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:22:43,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 11:22:43,456][15401] Updated weights for policy 0, policy_version 875174 (0.0022) [2024-06-25 11:22:48,014][15401] Updated weights for policy 0, policy_version 875184 (0.0038) [2024-06-25 11:22:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42872.8, 300 sec: 42876.1). Total num frames: 14339031040. Throughput: 0: 43032.8. Samples: 14339209740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:22:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 11:22:48,760][15349] Signal inference workers to stop experience collection... (212250 times) [2024-06-25 11:22:48,760][15349] Signal inference workers to resume experience collection... (212250 times) [2024-06-25 11:22:48,786][15401] InferenceWorker_p0-w0: stopping experience collection (212250 times) [2024-06-25 11:22:48,786][15401] InferenceWorker_p0-w0: resuming experience collection (212250 times) [2024-06-25 11:22:50,983][15401] Updated weights for policy 0, policy_version 875194 (0.0022) [2024-06-25 11:22:53,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14339260416. Throughput: 0: 42733.9. Samples: 14339329680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:22:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 11:22:55,723][15401] Updated weights for policy 0, policy_version 875204 (0.0028) [2024-06-25 11:22:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 14339489792. Throughput: 0: 43015.2. Samples: 14339594200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:22:58,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:22:58,426][15401] Updated weights for policy 0, policy_version 875214 (0.0028) [2024-06-25 11:23:03,359][15401] Updated weights for policy 0, policy_version 875224 (0.0027) [2024-06-25 11:23:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 14339670016. Throughput: 0: 43078.5. Samples: 14339854700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:23:06,054][15401] Updated weights for policy 0, policy_version 875234 (0.0039) [2024-06-25 11:23:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14339899392. Throughput: 0: 42715.2. Samples: 14339969460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:08,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 11:23:10,871][15401] Updated weights for policy 0, policy_version 875244 (0.0044) [2024-06-25 11:23:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14340112384. Throughput: 0: 42932.5. Samples: 14340237060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:13,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 11:23:13,738][15401] Updated weights for policy 0, policy_version 875254 (0.0036) [2024-06-25 11:23:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14340308992. Throughput: 0: 42831.5. Samples: 14340487940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 11:23:18,426][15401] Updated weights for policy 0, policy_version 875264 (0.0025) [2024-06-25 11:23:21,571][15401] Updated weights for policy 0, policy_version 875274 (0.0027) [2024-06-25 11:23:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42874.2, 300 sec: 42820.6). Total num frames: 14340538368. Throughput: 0: 42761.4. Samples: 14340612220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 11:23:26,105][15401] Updated weights for policy 0, policy_version 875284 (0.0037) [2024-06-25 11:23:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14340751360. Throughput: 0: 42859.7. Samples: 14340875180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 11:23:29,485][15401] Updated weights for policy 0, policy_version 875294 (0.0033) [2024-06-25 11:23:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14340947968. Throughput: 0: 42562.7. Samples: 14341125060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 11:23:33,876][15401] Updated weights for policy 0, policy_version 875304 (0.0033) [2024-06-25 11:23:37,312][15401] Updated weights for policy 0, policy_version 875314 (0.0035) [2024-06-25 11:23:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14341177344. Throughput: 0: 42524.5. Samples: 14341243280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 11:23:41,749][15401] Updated weights for policy 0, policy_version 875324 (0.0035) [2024-06-25 11:23:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14341390336. Throughput: 0: 42533.7. Samples: 14341508220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:43,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 11:23:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875330_14341406720.pth... [2024-06-25 11:23:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000874701_14331101184.pth [2024-06-25 11:23:44,975][15401] Updated weights for policy 0, policy_version 875334 (0.0026) [2024-06-25 11:23:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14341586944. Throughput: 0: 42320.0. Samples: 14341759100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 11:23:49,725][15401] Updated weights for policy 0, policy_version 875344 (0.0036) [2024-06-25 11:23:52,729][15401] Updated weights for policy 0, policy_version 875354 (0.0031) [2024-06-25 11:23:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14341816320. Throughput: 0: 42609.2. Samples: 14341886880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 11:23:57,519][15401] Updated weights for policy 0, policy_version 875364 (0.0022) [2024-06-25 11:23:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42709.4). Total num frames: 14341996544. Throughput: 0: 42428.7. Samples: 14342146360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:23:58,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 11:23:59,631][15349] Signal inference workers to stop experience collection... (212300 times) [2024-06-25 11:23:59,692][15401] InferenceWorker_p0-w0: stopping experience collection (212300 times) [2024-06-25 11:23:59,752][15349] Signal inference workers to resume experience collection... (212300 times) [2024-06-25 11:23:59,752][15401] InferenceWorker_p0-w0: resuming experience collection (212300 times) [2024-06-25 11:24:00,284][15401] Updated weights for policy 0, policy_version 875374 (0.0047) [2024-06-25 11:24:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14342225920. Throughput: 0: 42424.9. Samples: 14342397060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 11:24:05,311][15401] Updated weights for policy 0, policy_version 875384 (0.0030) [2024-06-25 11:24:07,793][15401] Updated weights for policy 0, policy_version 875394 (0.0044) [2024-06-25 11:24:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14342471680. Throughput: 0: 42541.3. Samples: 14342526580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:08,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 11:24:12,902][15401] Updated weights for policy 0, policy_version 875404 (0.0038) [2024-06-25 11:24:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42653.9). Total num frames: 14342619136. Throughput: 0: 42402.1. Samples: 14342783280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 11:24:15,313][15401] Updated weights for policy 0, policy_version 875414 (0.0028) [2024-06-25 11:24:18,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14342848512. Throughput: 0: 42418.7. Samples: 14343033900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 11:24:20,505][15401] Updated weights for policy 0, policy_version 875424 (0.0034) [2024-06-25 11:24:22,910][15401] Updated weights for policy 0, policy_version 875434 (0.0026) [2024-06-25 11:24:23,390][15132] Fps is (10 sec: 50791.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14343127040. Throughput: 0: 42789.7. Samples: 14343168820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 11:24:28,131][15401] Updated weights for policy 0, policy_version 875444 (0.0030) [2024-06-25 11:24:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14343274496. Throughput: 0: 42551.2. Samples: 14343423020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:24:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 11:24:30,776][15401] Updated weights for policy 0, policy_version 875454 (0.0038) [2024-06-25 11:24:33,392][15132] Fps is (10 sec: 36036.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 14343487488. Throughput: 0: 42465.3. Samples: 14343670140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:33,393][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 11:24:35,757][15401] Updated weights for policy 0, policy_version 875464 (0.0030) [2024-06-25 11:24:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14343733248. Throughput: 0: 42492.6. Samples: 14343799040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 11:24:38,863][15401] Updated weights for policy 0, policy_version 875474 (0.0037) [2024-06-25 11:24:43,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14343913472. Throughput: 0: 42633.5. Samples: 14344064860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 11:24:43,414][15401] Updated weights for policy 0, policy_version 875484 (0.0048) [2024-06-25 11:24:46,653][15401] Updated weights for policy 0, policy_version 875494 (0.0030) [2024-06-25 11:24:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14344142848. Throughput: 0: 42540.2. Samples: 14344311360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 11:24:51,131][15401] Updated weights for policy 0, policy_version 875504 (0.0035) [2024-06-25 11:24:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14344372224. Throughput: 0: 42676.8. Samples: 14344447040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:53,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 11:24:54,139][15401] Updated weights for policy 0, policy_version 875514 (0.0027) [2024-06-25 11:24:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14344568832. Throughput: 0: 42735.4. Samples: 14344706360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:24:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 11:24:58,840][15401] Updated weights for policy 0, policy_version 875524 (0.0035) [2024-06-25 11:24:59,158][15349] Signal inference workers to stop experience collection... (212350 times) [2024-06-25 11:24:59,159][15349] Signal inference workers to resume experience collection... (212350 times) [2024-06-25 11:24:59,188][15401] InferenceWorker_p0-w0: stopping experience collection (212350 times) [2024-06-25 11:24:59,188][15401] InferenceWorker_p0-w0: resuming experience collection (212350 times) [2024-06-25 11:25:01,933][15401] Updated weights for policy 0, policy_version 875534 (0.0039) [2024-06-25 11:25:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14344798208. Throughput: 0: 42724.0. Samples: 14344956480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:03,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 11:25:06,217][15401] Updated weights for policy 0, policy_version 875544 (0.0033) [2024-06-25 11:25:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 14345011200. Throughput: 0: 42739.6. Samples: 14345092100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 11:25:09,419][15401] Updated weights for policy 0, policy_version 875554 (0.0032) [2024-06-25 11:25:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 14345207808. Throughput: 0: 42894.7. Samples: 14345353280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 11:25:13,647][15401] Updated weights for policy 0, policy_version 875564 (0.0033) [2024-06-25 11:25:17,458][15401] Updated weights for policy 0, policy_version 875574 (0.0036) [2024-06-25 11:25:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14345437184. Throughput: 0: 42965.0. Samples: 14345603460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 11:25:21,177][15401] Updated weights for policy 0, policy_version 875584 (0.0037) [2024-06-25 11:25:23,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 14345666560. Throughput: 0: 43000.4. Samples: 14345734060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 11:25:25,129][15401] Updated weights for policy 0, policy_version 875594 (0.0030) [2024-06-25 11:25:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14345846784. Throughput: 0: 42910.1. Samples: 14345995820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:28,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 11:25:28,990][15401] Updated weights for policy 0, policy_version 875604 (0.0025) [2024-06-25 11:25:32,824][15401] Updated weights for policy 0, policy_version 875614 (0.0039) [2024-06-25 11:25:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 14346092544. Throughput: 0: 42944.8. Samples: 14346243880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 11:25:36,998][15401] Updated weights for policy 0, policy_version 875624 (0.0037) [2024-06-25 11:25:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 14346305536. Throughput: 0: 42931.2. Samples: 14346378940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 11:25:40,410][15401] Updated weights for policy 0, policy_version 875634 (0.0036) [2024-06-25 11:25:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14346502144. Throughput: 0: 42746.9. Samples: 14346629980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 11:25:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875641_14346502144.pth... [2024-06-25 11:25:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875015_14336245760.pth [2024-06-25 11:25:44,312][15401] Updated weights for policy 0, policy_version 875644 (0.0024) [2024-06-25 11:25:48,326][15401] Updated weights for policy 0, policy_version 875654 (0.0048) [2024-06-25 11:25:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 14346715136. Throughput: 0: 42887.8. Samples: 14346886420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:48,389][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 11:25:52,086][15401] Updated weights for policy 0, policy_version 875664 (0.0033) [2024-06-25 11:25:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14346928128. Throughput: 0: 42612.5. Samples: 14347009660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:25:55,842][15401] Updated weights for policy 0, policy_version 875674 (0.0039) [2024-06-25 11:25:58,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14347157504. Throughput: 0: 42675.6. Samples: 14347273680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:25:58,390][15132] Avg episode reward: [(0, '0.304')] [2024-06-25 11:25:59,634][15401] Updated weights for policy 0, policy_version 875684 (0.0038) [2024-06-25 11:26:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14347354112. Throughput: 0: 42842.7. Samples: 14347531380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:26:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 11:26:03,610][15401] Updated weights for policy 0, policy_version 875694 (0.0036) [2024-06-25 11:26:07,471][15401] Updated weights for policy 0, policy_version 875704 (0.0026) [2024-06-25 11:26:08,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 14347583488. Throughput: 0: 42708.4. Samples: 14347656040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:26:08,392][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 11:26:11,092][15401] Updated weights for policy 0, policy_version 875714 (0.0043) [2024-06-25 11:26:13,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14347796480. Throughput: 0: 42614.7. Samples: 14347913480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:26:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 11:26:14,921][15401] Updated weights for policy 0, policy_version 875724 (0.0042) [2024-06-25 11:26:18,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14348009472. Throughput: 0: 42817.0. Samples: 14348170640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:26:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 11:26:18,602][15401] Updated weights for policy 0, policy_version 875734 (0.0041) [2024-06-25 11:26:22,657][15401] Updated weights for policy 0, policy_version 875744 (0.0039) [2024-06-25 11:26:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14348238848. Throughput: 0: 42603.6. Samples: 14348296100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 11:26:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 11:26:26,834][15401] Updated weights for policy 0, policy_version 875754 (0.0032) [2024-06-25 11:26:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14348435456. Throughput: 0: 42800.6. Samples: 14348556000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 11:26:30,351][15401] Updated weights for policy 0, policy_version 875764 (0.0030) [2024-06-25 11:26:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14348632064. Throughput: 0: 42756.3. Samples: 14348810460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:33,391][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 11:26:34,421][15401] Updated weights for policy 0, policy_version 875774 (0.0038) [2024-06-25 11:26:36,148][15349] Signal inference workers to stop experience collection... (212400 times) [2024-06-25 11:26:36,148][15349] Signal inference workers to resume experience collection... (212400 times) [2024-06-25 11:26:36,189][15401] InferenceWorker_p0-w0: stopping experience collection (212400 times) [2024-06-25 11:26:36,189][15401] InferenceWorker_p0-w0: resuming experience collection (212400 times) [2024-06-25 11:26:37,961][15401] Updated weights for policy 0, policy_version 875784 (0.0050) [2024-06-25 11:26:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14348877824. Throughput: 0: 42916.5. Samples: 14348940900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 11:26:41,931][15401] Updated weights for policy 0, policy_version 875794 (0.0034) [2024-06-25 11:26:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 14349074432. Throughput: 0: 42785.3. Samples: 14349199020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:43,400][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 11:26:45,387][15401] Updated weights for policy 0, policy_version 875804 (0.0030) [2024-06-25 11:26:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14349287424. Throughput: 0: 42806.7. Samples: 14349457680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 11:26:49,727][15401] Updated weights for policy 0, policy_version 875814 (0.0026) [2024-06-25 11:26:52,806][15401] Updated weights for policy 0, policy_version 875824 (0.0026) [2024-06-25 11:26:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14349516800. Throughput: 0: 42918.4. Samples: 14349587260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:53,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 11:26:57,359][15401] Updated weights for policy 0, policy_version 875834 (0.0036) [2024-06-25 11:26:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14349729792. Throughput: 0: 42960.5. Samples: 14349846700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:26:58,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 11:27:00,245][15401] Updated weights for policy 0, policy_version 875844 (0.0038) [2024-06-25 11:27:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.1). Total num frames: 14349942784. Throughput: 0: 42734.2. Samples: 14350093680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 11:27:05,175][15401] Updated weights for policy 0, policy_version 875854 (0.0039) [2024-06-25 11:27:08,120][15401] Updated weights for policy 0, policy_version 875864 (0.0051) [2024-06-25 11:27:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14350155776. Throughput: 0: 42938.7. Samples: 14350228340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:08,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 11:27:12,525][15401] Updated weights for policy 0, policy_version 875874 (0.0036) [2024-06-25 11:27:13,391][15132] Fps is (10 sec: 39313.2, 60 sec: 42323.9, 300 sec: 42709.2). Total num frames: 14350336000. Throughput: 0: 42906.0. Samples: 14350486860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:13,392][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 11:27:15,707][15401] Updated weights for policy 0, policy_version 875884 (0.0031) [2024-06-25 11:27:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 14350581760. Throughput: 0: 42996.9. Samples: 14350745320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:18,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 11:27:20,293][15401] Updated weights for policy 0, policy_version 875894 (0.0042) [2024-06-25 11:27:23,345][15401] Updated weights for policy 0, policy_version 875904 (0.0034) [2024-06-25 11:27:23,391][15132] Fps is (10 sec: 47517.4, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 14350811136. Throughput: 0: 42932.5. Samples: 14350872920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:23,391][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 11:27:27,683][15401] Updated weights for policy 0, policy_version 875914 (0.0035) [2024-06-25 11:27:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14351007744. Throughput: 0: 42956.0. Samples: 14351132140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:28,392][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 11:27:30,931][15401] Updated weights for policy 0, policy_version 875924 (0.0048) [2024-06-25 11:27:33,392][15132] Fps is (10 sec: 40955.5, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 14351220736. Throughput: 0: 43005.7. Samples: 14351393040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:33,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 11:27:35,175][15401] Updated weights for policy 0, policy_version 875934 (0.0032) [2024-06-25 11:27:38,389][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14351450112. Throughput: 0: 42936.9. Samples: 14351519420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 11:27:38,606][15401] Updated weights for policy 0, policy_version 875944 (0.0042) [2024-06-25 11:27:42,684][15401] Updated weights for policy 0, policy_version 875954 (0.0040) [2024-06-25 11:27:43,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14351646720. Throughput: 0: 42760.0. Samples: 14351770900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 11:27:43,445][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875956_14351663104.pth... [2024-06-25 11:27:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875330_14341406720.pth [2024-06-25 11:27:46,219][15401] Updated weights for policy 0, policy_version 875964 (0.0036) [2024-06-25 11:27:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14351859712. Throughput: 0: 42992.9. Samples: 14352028360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:48,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 11:27:50,149][15401] Updated weights for policy 0, policy_version 875974 (0.0030) [2024-06-25 11:27:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14352089088. Throughput: 0: 42939.1. Samples: 14352160600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 11:27:54,018][15401] Updated weights for policy 0, policy_version 875984 (0.0030) [2024-06-25 11:27:56,942][15349] Signal inference workers to stop experience collection... (212450 times) [2024-06-25 11:27:56,976][15401] InferenceWorker_p0-w0: stopping experience collection (212450 times) [2024-06-25 11:27:57,002][15349] Signal inference workers to resume experience collection... (212450 times) [2024-06-25 11:27:57,002][15401] InferenceWorker_p0-w0: resuming experience collection (212450 times) [2024-06-25 11:27:57,782][15401] Updated weights for policy 0, policy_version 875994 (0.0038) [2024-06-25 11:27:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14352302080. Throughput: 0: 42917.9. Samples: 14352418080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:27:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 11:28:01,874][15401] Updated weights for policy 0, policy_version 876004 (0.0031) [2024-06-25 11:28:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14352498688. Throughput: 0: 42854.2. Samples: 14352673760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:28:03,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 11:28:05,573][15401] Updated weights for policy 0, policy_version 876014 (0.0040) [2024-06-25 11:28:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14352711680. Throughput: 0: 42694.1. Samples: 14352794100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:28:08,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 11:28:09,598][15401] Updated weights for policy 0, policy_version 876024 (0.0031) [2024-06-25 11:28:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 14352924672. Throughput: 0: 42579.1. Samples: 14353048100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-25 11:28:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 11:28:13,425][15401] Updated weights for policy 0, policy_version 876034 (0.0041) [2024-06-25 11:28:17,442][15401] Updated weights for policy 0, policy_version 876044 (0.0042) [2024-06-25 11:28:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14353137664. Throughput: 0: 42605.8. Samples: 14353310200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 11:28:21,054][15401] Updated weights for policy 0, policy_version 876054 (0.0030) [2024-06-25 11:28:23,394][15132] Fps is (10 sec: 40941.0, 60 sec: 42049.9, 300 sec: 42653.3). Total num frames: 14353334272. Throughput: 0: 42485.0. Samples: 14353431440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:23,395][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 11:28:25,668][15401] Updated weights for policy 0, policy_version 876064 (0.0032) [2024-06-25 11:28:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 14353580032. Throughput: 0: 42613.8. Samples: 14353688520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:28,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 11:28:28,595][15401] Updated weights for policy 0, policy_version 876074 (0.0038) [2024-06-25 11:28:33,185][15401] Updated weights for policy 0, policy_version 876084 (0.0039) [2024-06-25 11:28:33,389][15132] Fps is (10 sec: 44257.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14353776640. Throughput: 0: 42749.3. Samples: 14353952080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 11:28:36,022][15401] Updated weights for policy 0, policy_version 876094 (0.0045) [2024-06-25 11:28:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14353989632. Throughput: 0: 42421.8. Samples: 14354069580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 11:28:40,637][15401] Updated weights for policy 0, policy_version 876104 (0.0033) [2024-06-25 11:28:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14354235392. Throughput: 0: 42640.1. Samples: 14354336880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:43,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 11:28:43,468][15401] Updated weights for policy 0, policy_version 876114 (0.0037) [2024-06-25 11:28:48,222][15401] Updated weights for policy 0, policy_version 876124 (0.0032) [2024-06-25 11:28:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 14354432000. Throughput: 0: 42566.3. Samples: 14354589240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 11:28:51,192][15401] Updated weights for policy 0, policy_version 876134 (0.0035) [2024-06-25 11:28:53,391][15132] Fps is (10 sec: 40952.6, 60 sec: 42597.2, 300 sec: 42875.9). Total num frames: 14354644992. Throughput: 0: 42561.8. Samples: 14354709460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:53,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:28:56,877][15401] Updated weights for policy 0, policy_version 876144 (0.0029) [2024-06-25 11:28:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14354857984. Throughput: 0: 42779.9. Samples: 14354973200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:28:58,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 11:28:59,124][15401] Updated weights for policy 0, policy_version 876154 (0.0034) [2024-06-25 11:29:03,392][15132] Fps is (10 sec: 39318.6, 60 sec: 42323.5, 300 sec: 42598.0). Total num frames: 14355038208. Throughput: 0: 42449.6. Samples: 14355220540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:03,393][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 11:29:04,531][15401] Updated weights for policy 0, policy_version 876164 (0.0038) [2024-06-25 11:29:06,360][15349] Signal inference workers to stop experience collection... (212500 times) [2024-06-25 11:29:06,409][15401] InferenceWorker_p0-w0: stopping experience collection (212500 times) [2024-06-25 11:29:06,418][15349] Signal inference workers to resume experience collection... (212500 times) [2024-06-25 11:29:06,427][15401] InferenceWorker_p0-w0: resuming experience collection (212500 times) [2024-06-25 11:29:06,715][15401] Updated weights for policy 0, policy_version 876174 (0.0030) [2024-06-25 11:29:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14355300352. Throughput: 0: 42531.5. Samples: 14355345160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 11:29:12,141][15401] Updated weights for policy 0, policy_version 876184 (0.0036) [2024-06-25 11:29:13,391][15132] Fps is (10 sec: 44239.9, 60 sec: 42597.1, 300 sec: 42820.3). Total num frames: 14355480576. Throughput: 0: 42796.9. Samples: 14355614460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:13,392][15132] Avg episode reward: [(0, '0.900')] [2024-06-25 11:29:14,475][15401] Updated weights for policy 0, policy_version 876194 (0.0033) [2024-06-25 11:29:18,396][15132] Fps is (10 sec: 37658.6, 60 sec: 42320.7, 300 sec: 42541.9). Total num frames: 14355677184. Throughput: 0: 42520.5. Samples: 14355865780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:18,396][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 11:29:19,747][15401] Updated weights for policy 0, policy_version 876204 (0.0035) [2024-06-25 11:29:22,309][15401] Updated weights for policy 0, policy_version 876214 (0.0046) [2024-06-25 11:29:23,389][15132] Fps is (10 sec: 45883.7, 60 sec: 43421.0, 300 sec: 42931.6). Total num frames: 14355939328. Throughput: 0: 42633.3. Samples: 14355988080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:23,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 11:29:27,463][15401] Updated weights for policy 0, policy_version 876224 (0.0037) [2024-06-25 11:29:28,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 14356119552. Throughput: 0: 42595.4. Samples: 14356253680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 11:29:29,743][15401] Updated weights for policy 0, policy_version 876234 (0.0033) [2024-06-25 11:29:33,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14356332544. Throughput: 0: 42635.5. Samples: 14356507840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:33,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 11:29:35,235][15401] Updated weights for policy 0, policy_version 876244 (0.0048) [2024-06-25 11:29:37,310][15401] Updated weights for policy 0, policy_version 876254 (0.0030) [2024-06-25 11:29:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14356578304. Throughput: 0: 42807.4. Samples: 14356635720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 11:29:42,755][15401] Updated weights for policy 0, policy_version 876264 (0.0033) [2024-06-25 11:29:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42653.9). Total num frames: 14356725760. Throughput: 0: 42684.6. Samples: 14356894000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 11:29:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876265_14356725760.pth... [2024-06-25 11:29:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875641_14346502144.pth [2024-06-25 11:29:44,973][15401] Updated weights for policy 0, policy_version 876274 (0.0034) [2024-06-25 11:29:48,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 14356971520. Throughput: 0: 42630.2. Samples: 14357138900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:48,393][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 11:29:50,482][15401] Updated weights for policy 0, policy_version 876284 (0.0039) [2024-06-25 11:29:52,758][15401] Updated weights for policy 0, policy_version 876294 (0.0031) [2024-06-25 11:29:53,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42872.7, 300 sec: 42876.1). Total num frames: 14357217280. Throughput: 0: 42893.8. Samples: 14357275380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:53,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 11:29:57,996][15401] Updated weights for policy 0, policy_version 876304 (0.0030) [2024-06-25 11:29:58,389][15132] Fps is (10 sec: 40970.6, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 14357381120. Throughput: 0: 42701.8. Samples: 14357535960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:29:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 11:30:00,354][15349] Signal inference workers to stop experience collection... (212550 times) [2024-06-25 11:30:00,408][15401] InferenceWorker_p0-w0: stopping experience collection (212550 times) [2024-06-25 11:30:00,408][15349] Signal inference workers to resume experience collection... (212550 times) [2024-06-25 11:30:00,409][15401] Updated weights for policy 0, policy_version 876314 (0.0035) [2024-06-25 11:30:00,423][15401] InferenceWorker_p0-w0: resuming experience collection (212550 times) [2024-06-25 11:30:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43146.4, 300 sec: 42765.0). Total num frames: 14357626880. Throughput: 0: 42758.2. Samples: 14357789620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 11:30:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 11:30:05,504][15401] Updated weights for policy 0, policy_version 876324 (0.0023) [2024-06-25 11:30:07,864][15401] Updated weights for policy 0, policy_version 876334 (0.0048) [2024-06-25 11:30:08,389][15132] Fps is (10 sec: 49152.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14357872640. Throughput: 0: 42905.4. Samples: 14357918820. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 11:30:13,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42053.6, 300 sec: 42598.4). Total num frames: 14358003712. Throughput: 0: 42665.5. Samples: 14358173620. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:13,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 11:30:13,473][15401] Updated weights for policy 0, policy_version 876344 (0.0029) [2024-06-25 11:30:15,872][15401] Updated weights for policy 0, policy_version 876354 (0.0019) [2024-06-25 11:30:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 43149.2, 300 sec: 42709.5). Total num frames: 14358265856. Throughput: 0: 42648.0. Samples: 14358427000. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 11:30:20,965][15401] Updated weights for policy 0, policy_version 876364 (0.0047) [2024-06-25 11:30:23,389][15132] Fps is (10 sec: 49151.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14358495232. Throughput: 0: 42792.0. Samples: 14358561360. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:23,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 11:30:23,433][15401] Updated weights for policy 0, policy_version 876374 (0.0038) [2024-06-25 11:30:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14358659072. Throughput: 0: 42671.9. Samples: 14358814240. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:28,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 11:30:28,590][15401] Updated weights for policy 0, policy_version 876384 (0.0027) [2024-06-25 11:30:31,090][15401] Updated weights for policy 0, policy_version 876394 (0.0048) [2024-06-25 11:30:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14358904832. Throughput: 0: 42850.9. Samples: 14359067080. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:33,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 11:30:36,246][15401] Updated weights for policy 0, policy_version 876404 (0.0043) [2024-06-25 11:30:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14359134208. Throughput: 0: 42853.4. Samples: 14359203780. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:38,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 11:30:38,652][15401] Updated weights for policy 0, policy_version 876414 (0.0033) [2024-06-25 11:30:43,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14359298048. Throughput: 0: 42496.3. Samples: 14359448400. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:43,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 11:30:43,853][15401] Updated weights for policy 0, policy_version 876424 (0.0034) [2024-06-25 11:30:46,397][15401] Updated weights for policy 0, policy_version 876434 (0.0036) [2024-06-25 11:30:48,391][15132] Fps is (10 sec: 40955.2, 60 sec: 42872.5, 300 sec: 42764.9). Total num frames: 14359543808. Throughput: 0: 42421.6. Samples: 14359698640. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:48,391][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 11:30:52,054][15401] Updated weights for policy 0, policy_version 876444 (0.0027) [2024-06-25 11:30:53,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14359740416. Throughput: 0: 42644.0. Samples: 14359837800. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 11:30:54,147][15401] Updated weights for policy 0, policy_version 876454 (0.0043) [2024-06-25 11:30:58,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14359953408. Throughput: 0: 42347.8. Samples: 14360079280. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:30:58,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 11:30:59,933][15401] Updated weights for policy 0, policy_version 876464 (0.0033) [2024-06-25 11:31:01,431][15349] Signal inference workers to stop experience collection... (212600 times) [2024-06-25 11:31:01,468][15401] InferenceWorker_p0-w0: stopping experience collection (212600 times) [2024-06-25 11:31:01,482][15349] Signal inference workers to resume experience collection... (212600 times) [2024-06-25 11:31:01,486][15401] InferenceWorker_p0-w0: resuming experience collection (212600 times) [2024-06-25 11:31:01,991][15401] Updated weights for policy 0, policy_version 876474 (0.0033) [2024-06-25 11:31:03,390][15132] Fps is (10 sec: 44234.0, 60 sec: 42598.0, 300 sec: 42709.8). Total num frames: 14360182784. Throughput: 0: 42369.3. Samples: 14360333640. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 11:31:07,704][15401] Updated weights for policy 0, policy_version 876484 (0.0025) [2024-06-25 11:31:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 41504.4, 300 sec: 42598.0). Total num frames: 14360363008. Throughput: 0: 42248.3. Samples: 14360462640. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:08,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 11:31:10,070][15401] Updated weights for policy 0, policy_version 876494 (0.0038) [2024-06-25 11:31:13,390][15132] Fps is (10 sec: 42600.2, 60 sec: 43417.4, 300 sec: 42709.5). Total num frames: 14360608768. Throughput: 0: 42143.4. Samples: 14360710700. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:13,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 11:31:15,351][15401] Updated weights for policy 0, policy_version 876504 (0.0032) [2024-06-25 11:31:17,836][15401] Updated weights for policy 0, policy_version 876514 (0.0032) [2024-06-25 11:31:18,390][15132] Fps is (10 sec: 45884.5, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 14360821760. Throughput: 0: 42123.1. Samples: 14360962640. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:18,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-25 11:31:23,065][15401] Updated weights for policy 0, policy_version 876524 (0.0033) [2024-06-25 11:31:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 41506.1, 300 sec: 42542.9). Total num frames: 14360985600. Throughput: 0: 41928.4. Samples: 14361090560. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 11:31:25,769][15401] Updated weights for policy 0, policy_version 876534 (0.0051) [2024-06-25 11:31:28,389][15132] Fps is (10 sec: 40962.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14361231360. Throughput: 0: 42024.1. Samples: 14361339380. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:28,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 11:31:30,949][15401] Updated weights for policy 0, policy_version 876544 (0.0032) [2024-06-25 11:31:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 14361427968. Throughput: 0: 42240.9. Samples: 14361599440. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 11:31:33,689][15401] Updated weights for policy 0, policy_version 876554 (0.0037) [2024-06-25 11:31:38,389][15132] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 42487.3). Total num frames: 14361608192. Throughput: 0: 41835.5. Samples: 14361720400. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 11:31:38,626][15401] Updated weights for policy 0, policy_version 876564 (0.0030) [2024-06-25 11:31:41,506][15401] Updated weights for policy 0, policy_version 876574 (0.0035) [2024-06-25 11:31:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 14361870336. Throughput: 0: 42066.3. Samples: 14361972260. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 11:31:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876579_14361870336.pth... [2024-06-25 11:31:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000875956_14351663104.pth [2024-06-25 11:31:46,367][15401] Updated weights for policy 0, policy_version 876584 (0.0026) [2024-06-25 11:31:48,394][15132] Fps is (10 sec: 45855.8, 60 sec: 42050.1, 300 sec: 42542.2). Total num frames: 14362066944. Throughput: 0: 42108.5. Samples: 14362228680. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:48,394][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 11:31:49,319][15401] Updated weights for policy 0, policy_version 876594 (0.0045) [2024-06-25 11:31:53,391][15132] Fps is (10 sec: 39315.6, 60 sec: 42051.1, 300 sec: 42487.1). Total num frames: 14362263552. Throughput: 0: 41968.0. Samples: 14362351160. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:53,392][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 11:31:53,975][15401] Updated weights for policy 0, policy_version 876604 (0.0033) [2024-06-25 11:31:56,817][15401] Updated weights for policy 0, policy_version 876614 (0.0060) [2024-06-25 11:31:58,389][15132] Fps is (10 sec: 44255.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14362509312. Throughput: 0: 42149.0. Samples: 14362607400. Policy #0 lag: (min: 1.0, avg: 13.0, max: 25.0) [2024-06-25 11:31:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 11:32:01,697][15401] Updated weights for policy 0, policy_version 876624 (0.0038) [2024-06-25 11:32:03,389][15132] Fps is (10 sec: 44243.3, 60 sec: 42052.6, 300 sec: 42542.9). Total num frames: 14362705920. Throughput: 0: 42334.6. Samples: 14362867680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:03,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 11:32:04,658][15401] Updated weights for policy 0, policy_version 876634 (0.0033) [2024-06-25 11:32:08,390][15132] Fps is (10 sec: 37681.9, 60 sec: 42053.7, 300 sec: 42543.1). Total num frames: 14362886144. Throughput: 0: 42155.2. Samples: 14362987560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 11:32:08,465][15349] Signal inference workers to stop experience collection... (212650 times) [2024-06-25 11:32:08,495][15401] InferenceWorker_p0-w0: stopping experience collection (212650 times) [2024-06-25 11:32:08,525][15349] Signal inference workers to resume experience collection... (212650 times) [2024-06-25 11:32:08,526][15401] InferenceWorker_p0-w0: resuming experience collection (212650 times) [2024-06-25 11:32:09,100][15401] Updated weights for policy 0, policy_version 876644 (0.0036) [2024-06-25 11:32:12,162][15401] Updated weights for policy 0, policy_version 876654 (0.0027) [2024-06-25 11:32:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14363164672. Throughput: 0: 42304.8. Samples: 14363243100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 11:32:16,888][15401] Updated weights for policy 0, policy_version 876664 (0.0022) [2024-06-25 11:32:18,389][15132] Fps is (10 sec: 44238.7, 60 sec: 41779.5, 300 sec: 42432.0). Total num frames: 14363328512. Throughput: 0: 42333.0. Samples: 14363504420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:18,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 11:32:19,893][15401] Updated weights for policy 0, policy_version 876674 (0.0029) [2024-06-25 11:32:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 14363541504. Throughput: 0: 42276.8. Samples: 14363622860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:23,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 11:32:24,435][15401] Updated weights for policy 0, policy_version 876684 (0.0027) [2024-06-25 11:32:27,427][15401] Updated weights for policy 0, policy_version 876694 (0.0030) [2024-06-25 11:32:28,390][15132] Fps is (10 sec: 47510.2, 60 sec: 42870.9, 300 sec: 42654.2). Total num frames: 14363803648. Throughput: 0: 42590.9. Samples: 14363888880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:28,391][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 11:32:32,139][15401] Updated weights for policy 0, policy_version 876704 (0.0025) [2024-06-25 11:32:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14363983872. Throughput: 0: 42765.4. Samples: 14364152940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 11:32:34,974][15401] Updated weights for policy 0, policy_version 876714 (0.0035) [2024-06-25 11:32:38,389][15132] Fps is (10 sec: 37685.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14364180480. Throughput: 0: 42719.6. Samples: 14364273480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:38,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 11:32:39,605][15401] Updated weights for policy 0, policy_version 876724 (0.0037) [2024-06-25 11:32:43,163][15401] Updated weights for policy 0, policy_version 876734 (0.0032) [2024-06-25 11:32:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14364426240. Throughput: 0: 42647.2. Samples: 14364526520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:43,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 11:32:47,348][15401] Updated weights for policy 0, policy_version 876744 (0.0042) [2024-06-25 11:32:48,394][15132] Fps is (10 sec: 44218.3, 60 sec: 42598.4, 300 sec: 42486.7). Total num frames: 14364622848. Throughput: 0: 42650.3. Samples: 14364787120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:48,394][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 11:32:50,790][15401] Updated weights for policy 0, policy_version 876754 (0.0029) [2024-06-25 11:32:53,392][15132] Fps is (10 sec: 40949.5, 60 sec: 42870.8, 300 sec: 42487.0). Total num frames: 14364835840. Throughput: 0: 42738.9. Samples: 14364910900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:53,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 11:32:55,145][15401] Updated weights for policy 0, policy_version 876764 (0.0028) [2024-06-25 11:32:58,178][15401] Updated weights for policy 0, policy_version 876774 (0.0041) [2024-06-25 11:32:58,392][15132] Fps is (10 sec: 44244.7, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 14365065216. Throughput: 0: 42953.7. Samples: 14365176120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:32:58,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 11:33:02,802][15401] Updated weights for policy 0, policy_version 876784 (0.0030) [2024-06-25 11:33:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14365261824. Throughput: 0: 42915.5. Samples: 14365435620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 11:33:05,792][15401] Updated weights for policy 0, policy_version 876794 (0.0026) [2024-06-25 11:33:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 43144.8, 300 sec: 42542.9). Total num frames: 14365474816. Throughput: 0: 42956.0. Samples: 14365555880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 11:33:10,434][15401] Updated weights for policy 0, policy_version 876804 (0.0040) [2024-06-25 11:33:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14365704192. Throughput: 0: 42890.1. Samples: 14365818900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:13,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 11:33:13,498][15401] Updated weights for policy 0, policy_version 876814 (0.0041) [2024-06-25 11:33:18,074][15401] Updated weights for policy 0, policy_version 876824 (0.0023) [2024-06-25 11:33:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42599.1). Total num frames: 14365900800. Throughput: 0: 42690.6. Samples: 14366074020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 11:33:21,419][15401] Updated weights for policy 0, policy_version 876834 (0.0028) [2024-06-25 11:33:23,392][15132] Fps is (10 sec: 42587.7, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 14366130176. Throughput: 0: 42716.8. Samples: 14366195840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:23,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 11:33:25,849][15401] Updated weights for policy 0, policy_version 876844 (0.0042) [2024-06-25 11:33:26,952][15349] Signal inference workers to stop experience collection... (212700 times) [2024-06-25 11:33:26,983][15401] InferenceWorker_p0-w0: stopping experience collection (212700 times) [2024-06-25 11:33:27,010][15349] Signal inference workers to resume experience collection... (212700 times) [2024-06-25 11:33:27,021][15401] InferenceWorker_p0-w0: resuming experience collection (212700 times) [2024-06-25 11:33:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.8, 300 sec: 42542.9). Total num frames: 14366326784. Throughput: 0: 42868.8. Samples: 14366455620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:28,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 11:33:29,123][15401] Updated weights for policy 0, policy_version 876854 (0.0044) [2024-06-25 11:33:33,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14366523392. Throughput: 0: 42810.7. Samples: 14366713420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 11:33:33,485][15401] Updated weights for policy 0, policy_version 876864 (0.0038) [2024-06-25 11:33:36,701][15401] Updated weights for policy 0, policy_version 876874 (0.0030) [2024-06-25 11:33:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 14366769152. Throughput: 0: 42889.0. Samples: 14366840800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:38,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 11:33:41,038][15401] Updated weights for policy 0, policy_version 876884 (0.0045) [2024-06-25 11:33:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14366965760. Throughput: 0: 42778.3. Samples: 14367101040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 11:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876890_14366965760.pth... [2024-06-25 11:33:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876265_14356725760.pth [2024-06-25 11:33:44,418][15401] Updated weights for policy 0, policy_version 876894 (0.0036) [2024-06-25 11:33:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42601.4, 300 sec: 42487.6). Total num frames: 14367178752. Throughput: 0: 42582.7. Samples: 14367351840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 11:33:48,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 11:33:48,778][15401] Updated weights for policy 0, policy_version 876904 (0.0043) [2024-06-25 11:33:51,969][15401] Updated weights for policy 0, policy_version 876914 (0.0046) [2024-06-25 11:33:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 14367408128. Throughput: 0: 42695.2. Samples: 14367477160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:33:53,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 11:33:56,532][15401] Updated weights for policy 0, policy_version 876924 (0.0046) [2024-06-25 11:33:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42054.0, 300 sec: 42543.2). Total num frames: 14367588352. Throughput: 0: 42523.5. Samples: 14367732460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:33:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 11:33:59,569][15401] Updated weights for policy 0, policy_version 876934 (0.0042) [2024-06-25 11:34:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14367834112. Throughput: 0: 42507.0. Samples: 14367986840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 11:34:04,055][15401] Updated weights for policy 0, policy_version 876944 (0.0033) [2024-06-25 11:34:07,369][15401] Updated weights for policy 0, policy_version 876954 (0.0039) [2024-06-25 11:34:08,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 14368047104. Throughput: 0: 42738.7. Samples: 14368118980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 11:34:11,564][15401] Updated weights for policy 0, policy_version 876964 (0.0036) [2024-06-25 11:34:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42599.3). Total num frames: 14368243712. Throughput: 0: 42690.5. Samples: 14368376700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 11:34:14,869][15401] Updated weights for policy 0, policy_version 876974 (0.0032) [2024-06-25 11:34:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 14368473088. Throughput: 0: 42640.8. Samples: 14368632260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 11:34:19,121][15401] Updated weights for policy 0, policy_version 876984 (0.0044) [2024-06-25 11:34:22,703][15401] Updated weights for policy 0, policy_version 876994 (0.0033) [2024-06-25 11:34:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 14368686080. Throughput: 0: 42688.9. Samples: 14368761800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:23,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 11:34:26,706][15401] Updated weights for policy 0, policy_version 877004 (0.0032) [2024-06-25 11:34:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14368899072. Throughput: 0: 42677.8. Samples: 14369021540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 11:34:30,283][15401] Updated weights for policy 0, policy_version 877014 (0.0035) [2024-06-25 11:34:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 14369128448. Throughput: 0: 42758.6. Samples: 14369275980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:33,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 11:34:34,662][15401] Updated weights for policy 0, policy_version 877024 (0.0027) [2024-06-25 11:34:38,048][15401] Updated weights for policy 0, policy_version 877034 (0.0024) [2024-06-25 11:34:38,390][15132] Fps is (10 sec: 42595.2, 60 sec: 42597.9, 300 sec: 42709.4). Total num frames: 14369325056. Throughput: 0: 42950.4. Samples: 14369409960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:38,391][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 11:34:41,983][15401] Updated weights for policy 0, policy_version 877044 (0.0038) [2024-06-25 11:34:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 14369554432. Throughput: 0: 42988.3. Samples: 14369666940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 11:34:45,858][15401] Updated weights for policy 0, policy_version 877054 (0.0028) [2024-06-25 11:34:48,394][15132] Fps is (10 sec: 42584.0, 60 sec: 42868.5, 300 sec: 42486.7). Total num frames: 14369751040. Throughput: 0: 42857.9. Samples: 14369915620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:48,394][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 11:34:48,849][15349] Signal inference workers to stop experience collection... (212750 times) [2024-06-25 11:34:48,904][15401] InferenceWorker_p0-w0: stopping experience collection (212750 times) [2024-06-25 11:34:48,966][15349] Signal inference workers to resume experience collection... (212750 times) [2024-06-25 11:34:48,966][15401] InferenceWorker_p0-w0: resuming experience collection (212750 times) [2024-06-25 11:34:49,485][15401] Updated weights for policy 0, policy_version 877064 (0.0025) [2024-06-25 11:34:53,387][15401] Updated weights for policy 0, policy_version 877074 (0.0031) [2024-06-25 11:34:53,390][15132] Fps is (10 sec: 42595.3, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 14369980416. Throughput: 0: 42855.7. Samples: 14370047520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:53,391][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 11:34:56,931][15401] Updated weights for policy 0, policy_version 877084 (0.0035) [2024-06-25 11:34:58,389][15132] Fps is (10 sec: 42615.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 14370177024. Throughput: 0: 42743.2. Samples: 14370300140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:34:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 11:35:00,949][15401] Updated weights for policy 0, policy_version 877094 (0.0029) [2024-06-25 11:35:03,389][15132] Fps is (10 sec: 39324.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 14370373632. Throughput: 0: 43011.1. Samples: 14370567760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 11:35:04,773][15401] Updated weights for policy 0, policy_version 877104 (0.0035) [2024-06-25 11:35:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14370619392. Throughput: 0: 42933.3. Samples: 14370693800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 11:35:08,580][15401] Updated weights for policy 0, policy_version 877114 (0.0038) [2024-06-25 11:35:12,507][15401] Updated weights for policy 0, policy_version 877124 (0.0022) [2024-06-25 11:35:13,390][15132] Fps is (10 sec: 45873.2, 60 sec: 43144.3, 300 sec: 42598.3). Total num frames: 14370832384. Throughput: 0: 42942.6. Samples: 14370953980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:13,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 11:35:16,709][15401] Updated weights for policy 0, policy_version 877134 (0.0025) [2024-06-25 11:35:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 14371028992. Throughput: 0: 43014.6. Samples: 14371211740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:18,393][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 11:35:19,977][15401] Updated weights for policy 0, policy_version 877144 (0.0037) [2024-06-25 11:35:23,389][15132] Fps is (10 sec: 42600.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14371258368. Throughput: 0: 42889.5. Samples: 14371339960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 11:35:24,083][15401] Updated weights for policy 0, policy_version 877154 (0.0037) [2024-06-25 11:35:27,682][15401] Updated weights for policy 0, policy_version 877164 (0.0037) [2024-06-25 11:35:28,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14371471360. Throughput: 0: 42893.8. Samples: 14371597160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 11:35:31,840][15401] Updated weights for policy 0, policy_version 877174 (0.0032) [2024-06-25 11:35:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 14371684352. Throughput: 0: 42953.2. Samples: 14371848340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:33,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 11:35:35,474][15401] Updated weights for policy 0, policy_version 877184 (0.0027) [2024-06-25 11:35:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42872.0, 300 sec: 42709.8). Total num frames: 14371897344. Throughput: 0: 42897.2. Samples: 14371977860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 11:35:39,456][15401] Updated weights for policy 0, policy_version 877194 (0.0035) [2024-06-25 11:35:43,290][15401] Updated weights for policy 0, policy_version 877204 (0.0038) [2024-06-25 11:35:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 14372110336. Throughput: 0: 42963.1. Samples: 14372233480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 11:35:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 11:35:43,467][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877205_14372126720.pth... [2024-06-25 11:35:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876579_14361870336.pth [2024-06-25 11:35:47,058][15401] Updated weights for policy 0, policy_version 877214 (0.0033) [2024-06-25 11:35:48,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42601.3, 300 sec: 42598.4). Total num frames: 14372306944. Throughput: 0: 42601.3. Samples: 14372484820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:35:48,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 11:35:51,063][15401] Updated weights for policy 0, policy_version 877224 (0.0036) [2024-06-25 11:35:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42872.0, 300 sec: 42709.5). Total num frames: 14372552704. Throughput: 0: 42575.2. Samples: 14372609680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:35:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 11:35:55,104][15401] Updated weights for policy 0, policy_version 877234 (0.0039) [2024-06-25 11:35:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.5). Total num frames: 14372749312. Throughput: 0: 42582.2. Samples: 14372870160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:35:58,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 11:35:58,755][15401] Updated weights for policy 0, policy_version 877244 (0.0022) [2024-06-25 11:36:02,860][15401] Updated weights for policy 0, policy_version 877254 (0.0042) [2024-06-25 11:36:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 14372945920. Throughput: 0: 42445.8. Samples: 14373121700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:03,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 11:36:06,508][15401] Updated weights for policy 0, policy_version 877264 (0.0031) [2024-06-25 11:36:08,393][15132] Fps is (10 sec: 42582.4, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 14373175296. Throughput: 0: 42369.8. Samples: 14373246760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:08,394][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 11:36:10,423][15401] Updated weights for policy 0, policy_version 877274 (0.0031) [2024-06-25 11:36:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.6, 300 sec: 42542.9). Total num frames: 14373371904. Throughput: 0: 42382.2. Samples: 14373504360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 11:36:14,379][15401] Updated weights for policy 0, policy_version 877284 (0.0030) [2024-06-25 11:36:15,716][15349] Signal inference workers to stop experience collection... (212800 times) [2024-06-25 11:36:15,763][15401] InferenceWorker_p0-w0: stopping experience collection (212800 times) [2024-06-25 11:36:15,834][15349] Signal inference workers to resume experience collection... (212800 times) [2024-06-25 11:36:15,834][15401] InferenceWorker_p0-w0: resuming experience collection (212800 times) [2024-06-25 11:36:18,015][15401] Updated weights for policy 0, policy_version 877294 (0.0028) [2024-06-25 11:36:18,389][15132] Fps is (10 sec: 40975.3, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 14373584896. Throughput: 0: 42375.5. Samples: 14373755240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:18,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 11:36:21,856][15401] Updated weights for policy 0, policy_version 877304 (0.0043) [2024-06-25 11:36:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14373814272. Throughput: 0: 42464.0. Samples: 14373888740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:23,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 11:36:25,662][15401] Updated weights for policy 0, policy_version 877314 (0.0040) [2024-06-25 11:36:28,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 14374010880. Throughput: 0: 42328.6. Samples: 14374138540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:28,396][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 11:36:29,430][15401] Updated weights for policy 0, policy_version 877324 (0.0041) [2024-06-25 11:36:33,275][15401] Updated weights for policy 0, policy_version 877334 (0.0038) [2024-06-25 11:36:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14374240256. Throughput: 0: 42527.5. Samples: 14374398560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 11:36:37,178][15401] Updated weights for policy 0, policy_version 877344 (0.0033) [2024-06-25 11:36:38,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14374436864. Throughput: 0: 42644.9. Samples: 14374528700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:38,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 11:36:41,261][15401] Updated weights for policy 0, policy_version 877354 (0.0043) [2024-06-25 11:36:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 14374666240. Throughput: 0: 42575.1. Samples: 14374786040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 11:36:44,901][15401] Updated weights for policy 0, policy_version 877364 (0.0032) [2024-06-25 11:36:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.8, 300 sec: 42764.9). Total num frames: 14374879232. Throughput: 0: 42565.9. Samples: 14375037260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:48,392][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 11:36:48,911][15401] Updated weights for policy 0, policy_version 877374 (0.0043) [2024-06-25 11:36:52,407][15401] Updated weights for policy 0, policy_version 877384 (0.0028) [2024-06-25 11:36:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.5, 300 sec: 42598.0). Total num frames: 14375075840. Throughput: 0: 42757.1. Samples: 14375170780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:53,393][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 11:36:56,555][15401] Updated weights for policy 0, policy_version 877394 (0.0036) [2024-06-25 11:36:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14375288832. Throughput: 0: 42605.9. Samples: 14375421620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:36:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 11:37:00,511][15401] Updated weights for policy 0, policy_version 877404 (0.0033) [2024-06-25 11:37:03,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14375518208. Throughput: 0: 42651.5. Samples: 14375674560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:03,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 11:37:04,161][15401] Updated weights for policy 0, policy_version 877414 (0.0024) [2024-06-25 11:37:08,233][15401] Updated weights for policy 0, policy_version 877424 (0.0047) [2024-06-25 11:37:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42328.0, 300 sec: 42542.9). Total num frames: 14375714816. Throughput: 0: 42601.8. Samples: 14375805820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 11:37:11,916][15401] Updated weights for policy 0, policy_version 877434 (0.0034) [2024-06-25 11:37:13,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 14375927808. Throughput: 0: 42765.1. Samples: 14376062800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:13,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 11:37:15,946][15401] Updated weights for policy 0, policy_version 877444 (0.0034) [2024-06-25 11:37:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14376173568. Throughput: 0: 42580.1. Samples: 14376314660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:18,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 11:37:19,820][15401] Updated weights for policy 0, policy_version 877454 (0.0049) [2024-06-25 11:37:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 14376353792. Throughput: 0: 42668.4. Samples: 14376448780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:23,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 11:37:23,512][15401] Updated weights for policy 0, policy_version 877464 (0.0044) [2024-06-25 11:37:27,473][15401] Updated weights for policy 0, policy_version 877474 (0.0046) [2024-06-25 11:37:28,392][15132] Fps is (10 sec: 40951.2, 60 sec: 42874.5, 300 sec: 42709.2). Total num frames: 14376583168. Throughput: 0: 42535.4. Samples: 14376700220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:28,392][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 11:37:31,085][15349] Signal inference workers to stop experience collection... (212850 times) [2024-06-25 11:37:31,116][15401] InferenceWorker_p0-w0: stopping experience collection (212850 times) [2024-06-25 11:37:31,145][15349] Signal inference workers to resume experience collection... (212850 times) [2024-06-25 11:37:31,148][15401] InferenceWorker_p0-w0: resuming experience collection (212850 times) [2024-06-25 11:37:31,151][15401] Updated weights for policy 0, policy_version 877484 (0.0034) [2024-06-25 11:37:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14376812544. Throughput: 0: 42571.1. Samples: 14376952860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 11:37:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 11:37:35,121][15401] Updated weights for policy 0, policy_version 877494 (0.0039) [2024-06-25 11:37:38,389][15132] Fps is (10 sec: 40968.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14376992768. Throughput: 0: 42619.8. Samples: 14377088560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:37:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 11:37:38,995][15401] Updated weights for policy 0, policy_version 877504 (0.0030) [2024-06-25 11:37:42,659][15401] Updated weights for policy 0, policy_version 877514 (0.0041) [2024-06-25 11:37:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42710.1). Total num frames: 14377222144. Throughput: 0: 42691.4. Samples: 14377342740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:37:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 11:37:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877516_14377222144.pth... [2024-06-25 11:37:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000876890_14366965760.pth [2024-06-25 11:37:46,759][15401] Updated weights for policy 0, policy_version 877524 (0.0032) [2024-06-25 11:37:48,393][15132] Fps is (10 sec: 45859.4, 60 sec: 42870.7, 300 sec: 42764.9). Total num frames: 14377451520. Throughput: 0: 42696.8. Samples: 14377596060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:37:48,393][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 11:37:50,311][15401] Updated weights for policy 0, policy_version 877534 (0.0028) [2024-06-25 11:37:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42598.7). Total num frames: 14377631744. Throughput: 0: 42770.2. Samples: 14377730480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:37:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 11:37:54,162][15401] Updated weights for policy 0, policy_version 877544 (0.0029) [2024-06-25 11:37:57,730][15401] Updated weights for policy 0, policy_version 877554 (0.0037) [2024-06-25 11:37:58,389][15132] Fps is (10 sec: 40974.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14377861120. Throughput: 0: 42613.4. Samples: 14377980300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:37:58,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 11:38:01,637][15401] Updated weights for policy 0, policy_version 877564 (0.0037) [2024-06-25 11:38:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14378074112. Throughput: 0: 42732.8. Samples: 14378237640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 11:38:05,246][15401] Updated weights for policy 0, policy_version 877574 (0.0033) [2024-06-25 11:38:08,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14378287104. Throughput: 0: 42695.0. Samples: 14378370160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:08,392][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 11:38:09,133][15401] Updated weights for policy 0, policy_version 877584 (0.0028) [2024-06-25 11:38:13,302][15401] Updated weights for policy 0, policy_version 877594 (0.0030) [2024-06-25 11:38:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14378500096. Throughput: 0: 42737.6. Samples: 14378623320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 11:38:17,039][15401] Updated weights for policy 0, policy_version 877604 (0.0035) [2024-06-25 11:38:18,391][15132] Fps is (10 sec: 42600.3, 60 sec: 42323.9, 300 sec: 42654.0). Total num frames: 14378713088. Throughput: 0: 42793.6. Samples: 14378878660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:18,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 11:38:20,815][15401] Updated weights for policy 0, policy_version 877614 (0.0030) [2024-06-25 11:38:23,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 14378909696. Throughput: 0: 42605.6. Samples: 14379005920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:23,393][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 11:38:24,685][15401] Updated weights for policy 0, policy_version 877624 (0.0025) [2024-06-25 11:38:28,389][15132] Fps is (10 sec: 42606.8, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 14379139072. Throughput: 0: 42530.2. Samples: 14379256600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 11:38:28,567][15401] Updated weights for policy 0, policy_version 877634 (0.0037) [2024-06-25 11:38:32,267][15401] Updated weights for policy 0, policy_version 877644 (0.0029) [2024-06-25 11:38:33,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14379352064. Throughput: 0: 42725.8. Samples: 14379518580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 11:38:36,157][15401] Updated weights for policy 0, policy_version 877654 (0.0026) [2024-06-25 11:38:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14379532288. Throughput: 0: 42614.1. Samples: 14379648120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 11:38:39,960][15401] Updated weights for policy 0, policy_version 877664 (0.0040) [2024-06-25 11:38:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14379794432. Throughput: 0: 42595.0. Samples: 14379897080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 11:38:43,738][15401] Updated weights for policy 0, policy_version 877674 (0.0034) [2024-06-25 11:38:47,749][15401] Updated weights for policy 0, policy_version 877684 (0.0034) [2024-06-25 11:38:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42327.8, 300 sec: 42653.9). Total num frames: 14379991040. Throughput: 0: 42561.9. Samples: 14380152920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:48,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 11:38:51,463][15401] Updated weights for policy 0, policy_version 877694 (0.0036) [2024-06-25 11:38:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14380171264. Throughput: 0: 42439.7. Samples: 14380279840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 11:38:55,463][15401] Updated weights for policy 0, policy_version 877704 (0.0022) [2024-06-25 11:38:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14380417024. Throughput: 0: 42434.1. Samples: 14380532960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:38:58,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 11:38:59,183][15401] Updated weights for policy 0, policy_version 877714 (0.0027) [2024-06-25 11:39:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14380613632. Throughput: 0: 42411.2. Samples: 14380787080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 11:39:03,732][15401] Updated weights for policy 0, policy_version 877724 (0.0041) [2024-06-25 11:39:05,488][15349] Signal inference workers to stop experience collection... (212900 times) [2024-06-25 11:39:05,525][15401] InferenceWorker_p0-w0: stopping experience collection (212900 times) [2024-06-25 11:39:05,552][15349] Signal inference workers to resume experience collection... (212900 times) [2024-06-25 11:39:05,554][15401] InferenceWorker_p0-w0: resuming experience collection (212900 times) [2024-06-25 11:39:07,118][15401] Updated weights for policy 0, policy_version 877734 (0.0033) [2024-06-25 11:39:08,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 14380826624. Throughput: 0: 42324.5. Samples: 14380910420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:08,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 11:39:11,230][15401] Updated weights for policy 0, policy_version 877744 (0.0033) [2024-06-25 11:39:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14381056000. Throughput: 0: 42586.2. Samples: 14381172980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 11:39:14,574][15401] Updated weights for policy 0, policy_version 877754 (0.0038) [2024-06-25 11:39:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42053.6, 300 sec: 42542.9). Total num frames: 14381236224. Throughput: 0: 42520.5. Samples: 14381432000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 11:39:18,845][15401] Updated weights for policy 0, policy_version 877764 (0.0036) [2024-06-25 11:39:22,339][15401] Updated weights for policy 0, policy_version 877774 (0.0041) [2024-06-25 11:39:23,391][15132] Fps is (10 sec: 42591.0, 60 sec: 42872.0, 300 sec: 42653.7). Total num frames: 14381481984. Throughput: 0: 42380.2. Samples: 14381555300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:23,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 11:39:26,440][15401] Updated weights for policy 0, policy_version 877784 (0.0044) [2024-06-25 11:39:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14381694976. Throughput: 0: 42464.5. Samples: 14381807980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 11:39:28,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 11:39:30,222][15401] Updated weights for policy 0, policy_version 877794 (0.0043) [2024-06-25 11:39:33,390][15132] Fps is (10 sec: 39328.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14381875200. Throughput: 0: 42591.0. Samples: 14382069520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 11:39:34,260][15401] Updated weights for policy 0, policy_version 877804 (0.0033) [2024-06-25 11:39:37,675][15401] Updated weights for policy 0, policy_version 877814 (0.0034) [2024-06-25 11:39:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14382120960. Throughput: 0: 42359.5. Samples: 14382186020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:38,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 11:39:42,045][15401] Updated weights for policy 0, policy_version 877824 (0.0028) [2024-06-25 11:39:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 14382333952. Throughput: 0: 42525.4. Samples: 14382446500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:43,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 11:39:43,565][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877829_14382350336.pth... [2024-06-25 11:39:43,615][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877205_14372126720.pth [2024-06-25 11:39:45,440][15401] Updated weights for policy 0, policy_version 877834 (0.0035) [2024-06-25 11:39:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42487.4). Total num frames: 14382514176. Throughput: 0: 42575.0. Samples: 14382702960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 11:39:49,766][15401] Updated weights for policy 0, policy_version 877844 (0.0024) [2024-06-25 11:39:53,085][15401] Updated weights for policy 0, policy_version 877854 (0.0045) [2024-06-25 11:39:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14382776320. Throughput: 0: 42544.8. Samples: 14382824940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 11:39:57,398][15401] Updated weights for policy 0, policy_version 877864 (0.0043) [2024-06-25 11:39:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 14382956544. Throughput: 0: 42543.6. Samples: 14383087440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:39:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 11:40:00,707][15401] Updated weights for policy 0, policy_version 877874 (0.0032) [2024-06-25 11:40:03,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 14383153152. Throughput: 0: 42464.4. Samples: 14383343000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:03,392][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 11:40:04,877][15349] Signal inference workers to stop experience collection... (212950 times) [2024-06-25 11:40:04,878][15349] Signal inference workers to resume experience collection... (212950 times) [2024-06-25 11:40:04,895][15401] InferenceWorker_p0-w0: stopping experience collection (212950 times) [2024-06-25 11:40:04,896][15401] InferenceWorker_p0-w0: resuming experience collection (212950 times) [2024-06-25 11:40:05,055][15401] Updated weights for policy 0, policy_version 877884 (0.0028) [2024-06-25 11:40:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.5). Total num frames: 14383398912. Throughput: 0: 42423.1. Samples: 14383464260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 11:40:08,640][15401] Updated weights for policy 0, policy_version 877894 (0.0041) [2024-06-25 11:40:13,241][15401] Updated weights for policy 0, policy_version 877904 (0.0033) [2024-06-25 11:40:13,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 14383579136. Throughput: 0: 42452.3. Samples: 14383718340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 11:40:16,245][15401] Updated weights for policy 0, policy_version 877914 (0.0046) [2024-06-25 11:40:18,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 14383808512. Throughput: 0: 42212.5. Samples: 14383969180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:18,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 11:40:21,036][15401] Updated weights for policy 0, policy_version 877924 (0.0034) [2024-06-25 11:40:23,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42871.0, 300 sec: 42653.6). Total num frames: 14384054272. Throughput: 0: 42549.3. Samples: 14384100840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:23,392][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 11:40:23,907][15401] Updated weights for policy 0, policy_version 877934 (0.0029) [2024-06-25 11:40:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14384218112. Throughput: 0: 42383.5. Samples: 14384353760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 11:40:28,491][15401] Updated weights for policy 0, policy_version 877944 (0.0035) [2024-06-25 11:40:32,238][15401] Updated weights for policy 0, policy_version 877954 (0.0036) [2024-06-25 11:40:33,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 14384447488. Throughput: 0: 42354.4. Samples: 14384608900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 11:40:36,589][15401] Updated weights for policy 0, policy_version 877964 (0.0040) [2024-06-25 11:40:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14384676864. Throughput: 0: 42623.2. Samples: 14384742980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 11:40:39,844][15401] Updated weights for policy 0, policy_version 877974 (0.0037) [2024-06-25 11:40:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 14384857088. Throughput: 0: 42203.9. Samples: 14384986720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:43,392][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 11:40:44,268][15401] Updated weights for policy 0, policy_version 877984 (0.0031) [2024-06-25 11:40:47,508][15401] Updated weights for policy 0, policy_version 877994 (0.0038) [2024-06-25 11:40:48,391][15132] Fps is (10 sec: 40951.7, 60 sec: 42870.1, 300 sec: 42487.0). Total num frames: 14385086464. Throughput: 0: 42163.5. Samples: 14385240340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:48,392][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 11:40:52,017][15401] Updated weights for policy 0, policy_version 878004 (0.0042) [2024-06-25 11:40:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 14385283072. Throughput: 0: 42395.0. Samples: 14385372040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:53,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 11:40:55,129][15401] Updated weights for policy 0, policy_version 878014 (0.0037) [2024-06-25 11:40:58,389][15132] Fps is (10 sec: 40968.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14385496064. Throughput: 0: 42280.5. Samples: 14385620960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:40:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 11:40:59,773][15401] Updated weights for policy 0, policy_version 878024 (0.0038) [2024-06-25 11:41:02,705][15401] Updated weights for policy 0, policy_version 878034 (0.0039) [2024-06-25 11:41:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42543.4). Total num frames: 14385725440. Throughput: 0: 42251.1. Samples: 14385870380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:41:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 11:41:07,382][15401] Updated weights for policy 0, policy_version 878044 (0.0035) [2024-06-25 11:41:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14385922048. Throughput: 0: 42281.4. Samples: 14386003400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:41:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 11:41:10,265][15401] Updated weights for policy 0, policy_version 878054 (0.0037) [2024-06-25 11:41:13,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14386118656. Throughput: 0: 42295.2. Samples: 14386257040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:41:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 11:41:15,408][15401] Updated weights for policy 0, policy_version 878064 (0.0030) [2024-06-25 11:41:17,733][15401] Updated weights for policy 0, policy_version 878074 (0.0030) [2024-06-25 11:41:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 14386380800. Throughput: 0: 42183.9. Samples: 14386507180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 11:41:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 11:41:22,765][15401] Updated weights for policy 0, policy_version 878084 (0.0028) [2024-06-25 11:41:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41507.9, 300 sec: 42488.3). Total num frames: 14386544640. Throughput: 0: 42237.4. Samples: 14386643660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 11:41:25,704][15401] Updated weights for policy 0, policy_version 878094 (0.0031) [2024-06-25 11:41:28,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14386774016. Throughput: 0: 42457.8. Samples: 14386897220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 11:41:30,618][15401] Updated weights for policy 0, policy_version 878104 (0.0047) [2024-06-25 11:41:32,524][15349] Signal inference workers to stop experience collection... (213000 times) [2024-06-25 11:41:32,525][15349] Signal inference workers to resume experience collection... (213000 times) [2024-06-25 11:41:32,578][15401] InferenceWorker_p0-w0: stopping experience collection (213000 times) [2024-06-25 11:41:32,584][15401] InferenceWorker_p0-w0: resuming experience collection (213000 times) [2024-06-25 11:41:33,360][15401] Updated weights for policy 0, policy_version 878114 (0.0036) [2024-06-25 11:41:33,391][15132] Fps is (10 sec: 47507.0, 60 sec: 42870.5, 300 sec: 42653.7). Total num frames: 14387019776. Throughput: 0: 42431.8. Samples: 14387149740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:33,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 11:41:38,165][15401] Updated weights for policy 0, policy_version 878124 (0.0050) [2024-06-25 11:41:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 14387183616. Throughput: 0: 42371.6. Samples: 14387278760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 11:41:41,392][15401] Updated weights for policy 0, policy_version 878134 (0.0029) [2024-06-25 11:41:43,389][15132] Fps is (10 sec: 39326.8, 60 sec: 42600.1, 300 sec: 42487.7). Total num frames: 14387412992. Throughput: 0: 42420.0. Samples: 14387529860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 11:41:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878138_14387412992.pth... [2024-06-25 11:41:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877516_14377222144.pth [2024-06-25 11:41:45,718][15401] Updated weights for policy 0, policy_version 878144 (0.0052) [2024-06-25 11:41:48,391][15132] Fps is (10 sec: 44231.5, 60 sec: 42325.9, 300 sec: 42543.1). Total num frames: 14387625984. Throughput: 0: 42590.0. Samples: 14387786980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:48,391][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 11:41:49,117][15401] Updated weights for policy 0, policy_version 878154 (0.0028) [2024-06-25 11:41:53,255][15401] Updated weights for policy 0, policy_version 878164 (0.0035) [2024-06-25 11:41:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 14387838976. Throughput: 0: 42527.5. Samples: 14387917140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 11:41:57,302][15401] Updated weights for policy 0, policy_version 878174 (0.0027) [2024-06-25 11:41:58,392][15132] Fps is (10 sec: 40956.4, 60 sec: 42323.9, 300 sec: 42431.5). Total num frames: 14388035584. Throughput: 0: 42528.7. Samples: 14388170920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:41:58,392][15132] Avg episode reward: [(0, '0.205')] [2024-06-25 11:42:00,894][15401] Updated weights for policy 0, policy_version 878184 (0.0036) [2024-06-25 11:42:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14388264960. Throughput: 0: 42607.7. Samples: 14388424520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:03,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 11:42:05,009][15401] Updated weights for policy 0, policy_version 878194 (0.0041) [2024-06-25 11:42:08,389][15132] Fps is (10 sec: 42607.4, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 14388461568. Throughput: 0: 42391.6. Samples: 14388551280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 11:42:08,816][15401] Updated weights for policy 0, policy_version 878204 (0.0034) [2024-06-25 11:42:12,574][15401] Updated weights for policy 0, policy_version 878214 (0.0037) [2024-06-25 11:42:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 14388674560. Throughput: 0: 42466.7. Samples: 14388808220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 11:42:16,140][15401] Updated weights for policy 0, policy_version 878224 (0.0046) [2024-06-25 11:42:18,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 14388903936. Throughput: 0: 42708.3. Samples: 14389071660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:18,393][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 11:42:20,165][15401] Updated weights for policy 0, policy_version 878234 (0.0039) [2024-06-25 11:42:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 14389133312. Throughput: 0: 42743.9. Samples: 14389202240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 11:42:23,576][15401] Updated weights for policy 0, policy_version 878244 (0.0036) [2024-06-25 11:42:27,754][15401] Updated weights for policy 0, policy_version 878254 (0.0030) [2024-06-25 11:42:28,391][15132] Fps is (10 sec: 44241.6, 60 sec: 42870.5, 300 sec: 42487.1). Total num frames: 14389346304. Throughput: 0: 43059.6. Samples: 14389467600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:28,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 11:42:31,029][15401] Updated weights for policy 0, policy_version 878264 (0.0031) [2024-06-25 11:42:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42326.3, 300 sec: 42598.4). Total num frames: 14389559296. Throughput: 0: 43054.4. Samples: 14389724380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:33,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 11:42:35,433][15401] Updated weights for policy 0, policy_version 878274 (0.0031) [2024-06-25 11:42:38,390][15132] Fps is (10 sec: 44242.6, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 14389788672. Throughput: 0: 43050.7. Samples: 14389854420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 11:42:38,862][15401] Updated weights for policy 0, policy_version 878284 (0.0024) [2024-06-25 11:42:43,106][15401] Updated weights for policy 0, policy_version 878294 (0.0033) [2024-06-25 11:42:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42487.8). Total num frames: 14389985280. Throughput: 0: 43143.2. Samples: 14390112280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:43,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 11:42:46,347][15401] Updated weights for policy 0, policy_version 878304 (0.0029) [2024-06-25 11:42:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42870.6, 300 sec: 42598.1). Total num frames: 14390198272. Throughput: 0: 43207.9. Samples: 14390368980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:48,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 11:42:50,806][15401] Updated weights for policy 0, policy_version 878314 (0.0029) [2024-06-25 11:42:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 14390444032. Throughput: 0: 43249.7. Samples: 14390497520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:53,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 11:42:53,813][15401] Updated weights for policy 0, policy_version 878324 (0.0027) [2024-06-25 11:42:58,305][15401] Updated weights for policy 0, policy_version 878334 (0.0039) [2024-06-25 11:42:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43146.0, 300 sec: 42542.9). Total num frames: 14390624256. Throughput: 0: 43325.3. Samples: 14390757860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:42:58,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 11:43:01,599][15401] Updated weights for policy 0, policy_version 878344 (0.0036) [2024-06-25 11:43:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 14390837248. Throughput: 0: 43069.5. Samples: 14391009680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:43:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 11:43:05,924][15401] Updated weights for policy 0, policy_version 878354 (0.0026) [2024-06-25 11:43:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 14391066624. Throughput: 0: 42896.0. Samples: 14391132560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:43:08,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 11:43:09,551][15401] Updated weights for policy 0, policy_version 878364 (0.0043) [2024-06-25 11:43:13,394][15132] Fps is (10 sec: 40941.5, 60 sec: 42868.3, 300 sec: 42487.0). Total num frames: 14391246848. Throughput: 0: 42680.2. Samples: 14391388340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:13,394][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 11:43:13,742][15401] Updated weights for policy 0, policy_version 878374 (0.0031) [2024-06-25 11:43:17,398][15401] Updated weights for policy 0, policy_version 878384 (0.0025) [2024-06-25 11:43:18,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42868.6, 300 sec: 42597.8). Total num frames: 14391476224. Throughput: 0: 42607.2. Samples: 14391641980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:18,396][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 11:43:20,974][15349] Signal inference workers to stop experience collection... (213050 times) [2024-06-25 11:43:20,976][15349] Signal inference workers to resume experience collection... (213050 times) [2024-06-25 11:43:20,993][15401] InferenceWorker_p0-w0: stopping experience collection (213050 times) [2024-06-25 11:43:21,024][15401] InferenceWorker_p0-w0: resuming experience collection (213050 times) [2024-06-25 11:43:21,632][15401] Updated weights for policy 0, policy_version 878394 (0.0031) [2024-06-25 11:43:23,389][15132] Fps is (10 sec: 45895.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14391705600. Throughput: 0: 42555.6. Samples: 14391769420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 11:43:24,918][15401] Updated weights for policy 0, policy_version 878404 (0.0035) [2024-06-25 11:43:28,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42326.2, 300 sec: 42487.3). Total num frames: 14391885824. Throughput: 0: 42450.7. Samples: 14392022560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:43:29,090][15401] Updated weights for policy 0, policy_version 878414 (0.0042) [2024-06-25 11:43:32,468][15401] Updated weights for policy 0, policy_version 878424 (0.0043) [2024-06-25 11:43:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14392115200. Throughput: 0: 42459.3. Samples: 14392279540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 11:43:36,491][15401] Updated weights for policy 0, policy_version 878434 (0.0026) [2024-06-25 11:43:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14392344576. Throughput: 0: 42642.2. Samples: 14392416420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:38,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 11:43:39,886][15401] Updated weights for policy 0, policy_version 878444 (0.0040) [2024-06-25 11:43:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14392541184. Throughput: 0: 42560.8. Samples: 14392673100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 11:43:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878451_14392541184.pth... [2024-06-25 11:43:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000877829_14382350336.pth [2024-06-25 11:43:44,299][15401] Updated weights for policy 0, policy_version 878454 (0.0034) [2024-06-25 11:43:47,881][15401] Updated weights for policy 0, policy_version 878464 (0.0027) [2024-06-25 11:43:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14392770560. Throughput: 0: 42699.1. Samples: 14392931140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 11:43:51,817][15401] Updated weights for policy 0, policy_version 878474 (0.0033) [2024-06-25 11:43:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 14392983552. Throughput: 0: 42875.2. Samples: 14393061940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:53,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 11:43:55,254][15401] Updated weights for policy 0, policy_version 878484 (0.0026) [2024-06-25 11:43:58,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14393196544. Throughput: 0: 42957.5. Samples: 14393321340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:43:58,393][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 11:43:59,119][15401] Updated weights for policy 0, policy_version 878494 (0.0037) [2024-06-25 11:44:02,886][15401] Updated weights for policy 0, policy_version 878504 (0.0029) [2024-06-25 11:44:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14393409536. Throughput: 0: 43059.0. Samples: 14393579360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:03,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 11:44:06,938][15401] Updated weights for policy 0, policy_version 878514 (0.0031) [2024-06-25 11:44:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14393622528. Throughput: 0: 43132.0. Samples: 14393710360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:08,390][15132] Avg episode reward: [(0, '0.096')] [2024-06-25 11:44:10,767][15401] Updated weights for policy 0, policy_version 878524 (0.0037) [2024-06-25 11:44:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43420.8, 300 sec: 42765.0). Total num frames: 14393851904. Throughput: 0: 43099.7. Samples: 14393962040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 11:44:14,844][15401] Updated weights for policy 0, policy_version 878534 (0.0036) [2024-06-25 11:44:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.1, 300 sec: 42598.7). Total num frames: 14394048512. Throughput: 0: 43140.4. Samples: 14394220860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 11:44:18,470][15401] Updated weights for policy 0, policy_version 878544 (0.0039) [2024-06-25 11:44:22,563][15401] Updated weights for policy 0, policy_version 878554 (0.0030) [2024-06-25 11:44:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14394277888. Throughput: 0: 42967.1. Samples: 14394349940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 11:44:26,005][15401] Updated weights for policy 0, policy_version 878564 (0.0049) [2024-06-25 11:44:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 14394490880. Throughput: 0: 42992.5. Samples: 14394607760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 11:44:30,151][15401] Updated weights for policy 0, policy_version 878574 (0.0035) [2024-06-25 11:44:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14394703872. Throughput: 0: 42887.1. Samples: 14394861060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 11:44:33,526][15401] Updated weights for policy 0, policy_version 878584 (0.0035) [2024-06-25 11:44:35,625][15349] Signal inference workers to stop experience collection... (213100 times) [2024-06-25 11:44:35,673][15401] InferenceWorker_p0-w0: stopping experience collection (213100 times) [2024-06-25 11:44:35,677][15349] Signal inference workers to resume experience collection... (213100 times) [2024-06-25 11:44:35,685][15401] InferenceWorker_p0-w0: resuming experience collection (213100 times) [2024-06-25 11:44:37,600][15401] Updated weights for policy 0, policy_version 878594 (0.0026) [2024-06-25 11:44:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14394900480. Throughput: 0: 42923.0. Samples: 14394993480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:38,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 11:44:41,143][15401] Updated weights for policy 0, policy_version 878604 (0.0031) [2024-06-25 11:44:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14395146240. Throughput: 0: 42889.0. Samples: 14395251240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 11:44:45,182][15401] Updated weights for policy 0, policy_version 878614 (0.0042) [2024-06-25 11:44:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14395359232. Throughput: 0: 42817.4. Samples: 14395506140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 11:44:49,005][15401] Updated weights for policy 0, policy_version 878624 (0.0036) [2024-06-25 11:44:52,734][15401] Updated weights for policy 0, policy_version 878634 (0.0041) [2024-06-25 11:44:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14395555840. Throughput: 0: 42884.4. Samples: 14395640160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 11:44:56,529][15401] Updated weights for policy 0, policy_version 878644 (0.0036) [2024-06-25 11:44:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 14395768832. Throughput: 0: 43033.3. Samples: 14395898540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:44:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 11:45:00,390][15401] Updated weights for policy 0, policy_version 878654 (0.0030) [2024-06-25 11:45:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14395981824. Throughput: 0: 42875.4. Samples: 14396150360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-25 11:45:03,393][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 11:45:04,139][15401] Updated weights for policy 0, policy_version 878664 (0.0036) [2024-06-25 11:45:08,299][15401] Updated weights for policy 0, policy_version 878674 (0.0035) [2024-06-25 11:45:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14396194816. Throughput: 0: 42923.6. Samples: 14396281500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 11:45:12,088][15401] Updated weights for policy 0, policy_version 878684 (0.0024) [2024-06-25 11:45:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 14396391424. Throughput: 0: 42896.4. Samples: 14396538100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 11:45:15,898][15401] Updated weights for policy 0, policy_version 878694 (0.0038) [2024-06-25 11:45:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 14396637184. Throughput: 0: 42877.3. Samples: 14396790540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:18,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 11:45:19,724][15401] Updated weights for policy 0, policy_version 878704 (0.0043) [2024-06-25 11:45:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14396833792. Throughput: 0: 42788.1. Samples: 14396918940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:23,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 11:45:23,454][15401] Updated weights for policy 0, policy_version 878714 (0.0034) [2024-06-25 11:45:27,394][15401] Updated weights for policy 0, policy_version 878724 (0.0029) [2024-06-25 11:45:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14397046784. Throughput: 0: 42760.9. Samples: 14397175480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 11:45:31,414][15401] Updated weights for policy 0, policy_version 878734 (0.0030) [2024-06-25 11:45:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14397259776. Throughput: 0: 42767.0. Samples: 14397430660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 11:45:35,108][15401] Updated weights for policy 0, policy_version 878744 (0.0042) [2024-06-25 11:45:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14397472768. Throughput: 0: 42519.1. Samples: 14397553520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 11:45:39,037][15401] Updated weights for policy 0, policy_version 878754 (0.0035) [2024-06-25 11:45:42,841][15401] Updated weights for policy 0, policy_version 878764 (0.0035) [2024-06-25 11:45:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 14397702144. Throughput: 0: 42576.4. Samples: 14397814480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:43,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 11:45:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878766_14397702144.pth... [2024-06-25 11:45:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878138_14387412992.pth [2024-06-25 11:45:46,821][15401] Updated weights for policy 0, policy_version 878774 (0.0033) [2024-06-25 11:45:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14397898752. Throughput: 0: 42430.8. Samples: 14398059640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:48,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 11:45:50,500][15401] Updated weights for policy 0, policy_version 878784 (0.0022) [2024-06-25 11:45:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14398111744. Throughput: 0: 42419.9. Samples: 14398190400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 11:45:54,683][15401] Updated weights for policy 0, policy_version 878794 (0.0027) [2024-06-25 11:45:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14398308352. Throughput: 0: 42401.0. Samples: 14398446140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:45:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 11:45:58,431][15349] Signal inference workers to stop experience collection... (213150 times) [2024-06-25 11:45:58,434][15349] Signal inference workers to resume experience collection... (213150 times) [2024-06-25 11:45:58,439][15401] Updated weights for policy 0, policy_version 878804 (0.0026) [2024-06-25 11:45:58,451][15401] InferenceWorker_p0-w0: stopping experience collection (213150 times) [2024-06-25 11:45:58,451][15401] InferenceWorker_p0-w0: resuming experience collection (213150 times) [2024-06-25 11:46:02,274][15401] Updated weights for policy 0, policy_version 878814 (0.0028) [2024-06-25 11:46:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42871.4, 300 sec: 42820.2). Total num frames: 14398554112. Throughput: 0: 42566.5. Samples: 14398706140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:03,393][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 11:46:06,032][15401] Updated weights for policy 0, policy_version 878824 (0.0032) [2024-06-25 11:46:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14398750720. Throughput: 0: 42508.3. Samples: 14398831820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 11:46:09,743][15401] Updated weights for policy 0, policy_version 878834 (0.0029) [2024-06-25 11:46:13,392][15132] Fps is (10 sec: 40960.2, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14398963712. Throughput: 0: 42518.9. Samples: 14399088940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:13,393][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 11:46:13,476][15401] Updated weights for policy 0, policy_version 878844 (0.0041) [2024-06-25 11:46:17,544][15401] Updated weights for policy 0, policy_version 878854 (0.0031) [2024-06-25 11:46:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14399176704. Throughput: 0: 42495.3. Samples: 14399342940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:18,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 11:46:21,165][15401] Updated weights for policy 0, policy_version 878864 (0.0030) [2024-06-25 11:46:23,392][15132] Fps is (10 sec: 44236.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14399406080. Throughput: 0: 42671.9. Samples: 14399473860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:23,393][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 11:46:25,091][15401] Updated weights for policy 0, policy_version 878874 (0.0021) [2024-06-25 11:46:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42653.8). Total num frames: 14399602688. Throughput: 0: 42646.7. Samples: 14399733680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:28,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 11:46:29,066][15401] Updated weights for policy 0, policy_version 878884 (0.0030) [2024-06-25 11:46:32,678][15401] Updated weights for policy 0, policy_version 878894 (0.0036) [2024-06-25 11:46:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14399815680. Throughput: 0: 42832.9. Samples: 14399987120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 11:46:36,682][15401] Updated weights for policy 0, policy_version 878904 (0.0038) [2024-06-25 11:46:38,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14400045056. Throughput: 0: 42859.7. Samples: 14400119080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 11:46:40,125][15401] Updated weights for policy 0, policy_version 878914 (0.0053) [2024-06-25 11:46:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.2). Total num frames: 14400241664. Throughput: 0: 42939.0. Samples: 14400378400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:43,398][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 11:46:44,304][15401] Updated weights for policy 0, policy_version 878924 (0.0040) [2024-06-25 11:46:48,178][15401] Updated weights for policy 0, policy_version 878934 (0.0044) [2024-06-25 11:46:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14400454656. Throughput: 0: 42755.7. Samples: 14400630040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 11:46:51,956][15401] Updated weights for policy 0, policy_version 878944 (0.0047) [2024-06-25 11:46:53,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.7, 300 sec: 42876.0). Total num frames: 14400684032. Throughput: 0: 42780.8. Samples: 14400757060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 11:46:53,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 11:46:55,700][15401] Updated weights for policy 0, policy_version 878954 (0.0032) [2024-06-25 11:46:58,390][15132] Fps is (10 sec: 44232.7, 60 sec: 43143.8, 300 sec: 42820.4). Total num frames: 14400897024. Throughput: 0: 42854.3. Samples: 14401017320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:46:58,391][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 11:46:59,642][15401] Updated weights for policy 0, policy_version 878964 (0.0039) [2024-06-25 11:47:03,390][15132] Fps is (10 sec: 40970.2, 60 sec: 42327.1, 300 sec: 42820.5). Total num frames: 14401093632. Throughput: 0: 42746.5. Samples: 14401266540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:03,396][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 11:47:03,809][15401] Updated weights for policy 0, policy_version 878974 (0.0040) [2024-06-25 11:47:07,486][15401] Updated weights for policy 0, policy_version 878984 (0.0033) [2024-06-25 11:47:08,389][15132] Fps is (10 sec: 42602.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14401323008. Throughput: 0: 42665.5. Samples: 14401393700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 11:47:11,445][15401] Updated weights for policy 0, policy_version 878994 (0.0036) [2024-06-25 11:47:13,295][15349] Signal inference workers to stop experience collection... (213200 times) [2024-06-25 11:47:13,295][15349] Signal inference workers to resume experience collection... (213200 times) [2024-06-25 11:47:13,345][15401] InferenceWorker_p0-w0: stopping experience collection (213200 times) [2024-06-25 11:47:13,345][15401] InferenceWorker_p0-w0: resuming experience collection (213200 times) [2024-06-25 11:47:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42600.2, 300 sec: 42765.4). Total num frames: 14401519616. Throughput: 0: 42617.5. Samples: 14401651360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 11:47:15,152][15401] Updated weights for policy 0, policy_version 879004 (0.0033) [2024-06-25 11:47:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14401732608. Throughput: 0: 42486.6. Samples: 14401899020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 11:47:19,087][15401] Updated weights for policy 0, policy_version 879014 (0.0033) [2024-06-25 11:47:22,842][15401] Updated weights for policy 0, policy_version 879024 (0.0029) [2024-06-25 11:47:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.1, 300 sec: 42765.2). Total num frames: 14401961984. Throughput: 0: 42438.1. Samples: 14402028800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:23,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 11:47:26,697][15401] Updated weights for policy 0, policy_version 879034 (0.0043) [2024-06-25 11:47:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14402158592. Throughput: 0: 42428.0. Samples: 14402287660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:28,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 11:47:30,426][15401] Updated weights for policy 0, policy_version 879044 (0.0035) [2024-06-25 11:47:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14402371584. Throughput: 0: 42447.6. Samples: 14402540180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 11:47:34,283][15401] Updated weights for policy 0, policy_version 879054 (0.0033) [2024-06-25 11:47:38,035][15401] Updated weights for policy 0, policy_version 879064 (0.0032) [2024-06-25 11:47:38,391][15132] Fps is (10 sec: 44228.9, 60 sec: 42597.1, 300 sec: 42764.8). Total num frames: 14402600960. Throughput: 0: 42610.9. Samples: 14402674520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:38,392][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 11:47:41,957][15401] Updated weights for policy 0, policy_version 879074 (0.0044) [2024-06-25 11:47:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14402813952. Throughput: 0: 42482.3. Samples: 14402928980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:43,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 11:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879078_14402813952.pth... [2024-06-25 11:47:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878451_14392541184.pth [2024-06-25 11:47:45,858][15401] Updated weights for policy 0, policy_version 879084 (0.0034) [2024-06-25 11:47:48,389][15132] Fps is (10 sec: 42606.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14403026944. Throughput: 0: 42533.8. Samples: 14403180560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:48,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-25 11:47:49,696][15401] Updated weights for policy 0, policy_version 879094 (0.0043) [2024-06-25 11:47:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.2, 300 sec: 42709.5). Total num frames: 14403223552. Throughput: 0: 42587.1. Samples: 14403310120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:53,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 11:47:53,435][15401] Updated weights for policy 0, policy_version 879104 (0.0032) [2024-06-25 11:47:57,994][15401] Updated weights for policy 0, policy_version 879114 (0.0025) [2024-06-25 11:47:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42326.0, 300 sec: 42709.5). Total num frames: 14403436544. Throughput: 0: 42571.0. Samples: 14403567060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:47:58,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 11:48:01,070][15401] Updated weights for policy 0, policy_version 879124 (0.0024) [2024-06-25 11:48:03,392][15132] Fps is (10 sec: 45863.4, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 14403682304. Throughput: 0: 42604.3. Samples: 14403816320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:03,393][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 11:48:05,551][15401] Updated weights for policy 0, policy_version 879134 (0.0034) [2024-06-25 11:48:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.7). Total num frames: 14403862528. Throughput: 0: 42774.7. Samples: 14403953660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 11:48:08,744][15401] Updated weights for policy 0, policy_version 879144 (0.0029) [2024-06-25 11:48:13,231][15401] Updated weights for policy 0, policy_version 879154 (0.0038) [2024-06-25 11:48:13,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 14404075520. Throughput: 0: 42565.8. Samples: 14404203120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 11:48:16,602][15401] Updated weights for policy 0, policy_version 879164 (0.0038) [2024-06-25 11:48:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14404321280. Throughput: 0: 42308.3. Samples: 14404444060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 11:48:20,773][15401] Updated weights for policy 0, policy_version 879174 (0.0033) [2024-06-25 11:48:23,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42047.8, 300 sec: 42708.6). Total num frames: 14404485120. Throughput: 0: 42328.5. Samples: 14404579500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:23,396][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 11:48:24,435][15401] Updated weights for policy 0, policy_version 879184 (0.0029) [2024-06-25 11:48:28,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14404698112. Throughput: 0: 42294.7. Samples: 14404832240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:28,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 11:48:28,669][15401] Updated weights for policy 0, policy_version 879194 (0.0032) [2024-06-25 11:48:28,672][15349] Signal inference workers to stop experience collection... (213250 times) [2024-06-25 11:48:28,678][15349] Signal inference workers to resume experience collection... (213250 times) [2024-06-25 11:48:28,716][15401] InferenceWorker_p0-w0: stopping experience collection (213250 times) [2024-06-25 11:48:28,716][15401] InferenceWorker_p0-w0: resuming experience collection (213250 times) [2024-06-25 11:48:32,140][15401] Updated weights for policy 0, policy_version 879204 (0.0045) [2024-06-25 11:48:33,390][15132] Fps is (10 sec: 45904.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14404943872. Throughput: 0: 42302.2. Samples: 14405084160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 11:48:36,151][15401] Updated weights for policy 0, policy_version 879214 (0.0039) [2024-06-25 11:48:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42326.6, 300 sec: 42709.5). Total num frames: 14405140480. Throughput: 0: 42507.9. Samples: 14405222980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:38,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 11:48:39,701][15401] Updated weights for policy 0, policy_version 879224 (0.0040) [2024-06-25 11:48:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14405337088. Throughput: 0: 42395.2. Samples: 14405474840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:43,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 11:48:43,817][15401] Updated weights for policy 0, policy_version 879234 (0.0034) [2024-06-25 11:48:47,487][15401] Updated weights for policy 0, policy_version 879244 (0.0039) [2024-06-25 11:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14405566464. Throughput: 0: 42510.4. Samples: 14405729180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-25 11:48:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 11:48:51,592][15401] Updated weights for policy 0, policy_version 879254 (0.0031) [2024-06-25 11:48:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 14405779456. Throughput: 0: 42342.7. Samples: 14405859080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:48:53,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 11:48:54,993][15401] Updated weights for policy 0, policy_version 879264 (0.0038) [2024-06-25 11:48:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14405976064. Throughput: 0: 42426.8. Samples: 14406112320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:48:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 11:48:59,240][15401] Updated weights for policy 0, policy_version 879274 (0.0035) [2024-06-25 11:49:02,841][15401] Updated weights for policy 0, policy_version 879284 (0.0031) [2024-06-25 11:49:03,396][15132] Fps is (10 sec: 44207.9, 60 sec: 42322.5, 300 sec: 42708.5). Total num frames: 14406221824. Throughput: 0: 42833.0. Samples: 14406371820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:03,396][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 11:49:06,690][15401] Updated weights for policy 0, policy_version 879294 (0.0039) [2024-06-25 11:49:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14406418432. Throughput: 0: 42730.5. Samples: 14406502100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:08,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 11:49:10,899][15401] Updated weights for policy 0, policy_version 879304 (0.0037) [2024-06-25 11:49:13,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14406631424. Throughput: 0: 42749.7. Samples: 14406755980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 11:49:14,691][15401] Updated weights for policy 0, policy_version 879314 (0.0038) [2024-06-25 11:49:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 14406828032. Throughput: 0: 42922.2. Samples: 14407015660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:18,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 11:49:18,640][15401] Updated weights for policy 0, policy_version 879324 (0.0040) [2024-06-25 11:49:22,237][15401] Updated weights for policy 0, policy_version 879334 (0.0040) [2024-06-25 11:49:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43149.2, 300 sec: 42653.9). Total num frames: 14407073792. Throughput: 0: 42669.8. Samples: 14407143120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:23,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 11:49:26,180][15401] Updated weights for policy 0, policy_version 879344 (0.0033) [2024-06-25 11:49:28,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14407286784. Throughput: 0: 42898.6. Samples: 14407405280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 11:49:29,628][15401] Updated weights for policy 0, policy_version 879354 (0.0032) [2024-06-25 11:49:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14407483392. Throughput: 0: 42893.2. Samples: 14407659380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 11:49:33,784][15401] Updated weights for policy 0, policy_version 879364 (0.0044) [2024-06-25 11:49:37,057][15349] Signal inference workers to stop experience collection... (213300 times) [2024-06-25 11:49:37,110][15401] InferenceWorker_p0-w0: stopping experience collection (213300 times) [2024-06-25 11:49:37,112][15349] Signal inference workers to resume experience collection... (213300 times) [2024-06-25 11:49:37,120][15401] InferenceWorker_p0-w0: resuming experience collection (213300 times) [2024-06-25 11:49:37,130][15401] Updated weights for policy 0, policy_version 879374 (0.0045) [2024-06-25 11:49:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14407712768. Throughput: 0: 42831.1. Samples: 14407786480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 11:49:41,339][15401] Updated weights for policy 0, policy_version 879384 (0.0031) [2024-06-25 11:49:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14407925760. Throughput: 0: 42990.1. Samples: 14408046880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:43,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 11:49:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879390_14407925760.pth... [2024-06-25 11:49:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000878766_14397702144.pth [2024-06-25 11:49:44,681][15401] Updated weights for policy 0, policy_version 879394 (0.0031) [2024-06-25 11:49:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14408122368. Throughput: 0: 42845.8. Samples: 14408299600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 11:49:48,922][15401] Updated weights for policy 0, policy_version 879404 (0.0041) [2024-06-25 11:49:52,364][15401] Updated weights for policy 0, policy_version 879414 (0.0030) [2024-06-25 11:49:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14408351744. Throughput: 0: 42881.7. Samples: 14408431780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 11:49:56,567][15401] Updated weights for policy 0, policy_version 879424 (0.0035) [2024-06-25 11:49:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 14408548352. Throughput: 0: 42840.5. Samples: 14408683800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:49:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 11:50:00,231][15401] Updated weights for policy 0, policy_version 879434 (0.0041) [2024-06-25 11:50:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 14408777728. Throughput: 0: 42764.8. Samples: 14408940080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 11:50:04,241][15401] Updated weights for policy 0, policy_version 879444 (0.0033) [2024-06-25 11:50:07,911][15401] Updated weights for policy 0, policy_version 879454 (0.0028) [2024-06-25 11:50:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14408990720. Throughput: 0: 42864.0. Samples: 14409072000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 11:50:11,945][15401] Updated weights for policy 0, policy_version 879464 (0.0035) [2024-06-25 11:50:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14409187328. Throughput: 0: 42710.6. Samples: 14409327260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:13,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 11:50:15,488][15401] Updated weights for policy 0, policy_version 879474 (0.0031) [2024-06-25 11:50:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14409400320. Throughput: 0: 42746.8. Samples: 14409582980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 11:50:19,551][15401] Updated weights for policy 0, policy_version 879484 (0.0038) [2024-06-25 11:50:23,036][15401] Updated weights for policy 0, policy_version 879494 (0.0023) [2024-06-25 11:50:23,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14409646080. Throughput: 0: 42850.1. Samples: 14409714740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 11:50:27,419][15401] Updated weights for policy 0, policy_version 879504 (0.0042) [2024-06-25 11:50:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 14409826304. Throughput: 0: 42834.7. Samples: 14409974540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:28,392][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 11:50:30,556][15401] Updated weights for policy 0, policy_version 879514 (0.0043) [2024-06-25 11:50:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14410055680. Throughput: 0: 42851.4. Samples: 14410227920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:33,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 11:50:35,055][15401] Updated weights for policy 0, policy_version 879524 (0.0036) [2024-06-25 11:50:38,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14410268672. Throughput: 0: 42872.5. Samples: 14410361040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 11:50:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 11:50:38,553][15401] Updated weights for policy 0, policy_version 879534 (0.0037) [2024-06-25 11:50:42,806][15401] Updated weights for policy 0, policy_version 879544 (0.0030) [2024-06-25 11:50:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14410465280. Throughput: 0: 42994.6. Samples: 14410618560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:50:43,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 11:50:45,893][15401] Updated weights for policy 0, policy_version 879554 (0.0026) [2024-06-25 11:50:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14410711040. Throughput: 0: 42871.6. Samples: 14410869300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:50:48,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 11:50:50,434][15401] Updated weights for policy 0, policy_version 879564 (0.0032) [2024-06-25 11:50:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14410924032. Throughput: 0: 42949.8. Samples: 14411004740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:50:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 11:50:53,524][15401] Updated weights for policy 0, policy_version 879574 (0.0033) [2024-06-25 11:50:58,184][15401] Updated weights for policy 0, policy_version 879584 (0.0041) [2024-06-25 11:50:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 14411104256. Throughput: 0: 42935.7. Samples: 14411259360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:50:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 11:50:59,371][15349] Signal inference workers to stop experience collection... (213350 times) [2024-06-25 11:50:59,377][15349] Signal inference workers to resume experience collection... (213350 times) [2024-06-25 11:50:59,408][15401] InferenceWorker_p0-w0: stopping experience collection (213350 times) [2024-06-25 11:50:59,408][15401] InferenceWorker_p0-w0: resuming experience collection (213350 times) [2024-06-25 11:51:01,129][15401] Updated weights for policy 0, policy_version 879594 (0.0039) [2024-06-25 11:51:03,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 14411366400. Throughput: 0: 42806.1. Samples: 14411509360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:03,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 11:51:05,902][15401] Updated weights for policy 0, policy_version 879604 (0.0028) [2024-06-25 11:51:08,389][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 14411579392. Throughput: 0: 42923.2. Samples: 14411646280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 11:51:08,840][15401] Updated weights for policy 0, policy_version 879614 (0.0033) [2024-06-25 11:51:13,389][15132] Fps is (10 sec: 37692.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14411743232. Throughput: 0: 42705.0. Samples: 14411896160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 11:51:13,412][15401] Updated weights for policy 0, policy_version 879624 (0.0035) [2024-06-25 11:51:16,688][15401] Updated weights for policy 0, policy_version 879634 (0.0038) [2024-06-25 11:51:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 14411988992. Throughput: 0: 42820.1. Samples: 14412154820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:18,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 11:51:20,909][15401] Updated weights for policy 0, policy_version 879644 (0.0031) [2024-06-25 11:51:23,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 14412201984. Throughput: 0: 42840.4. Samples: 14412288960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:23,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 11:51:24,366][15401] Updated weights for policy 0, policy_version 879654 (0.0042) [2024-06-25 11:51:28,317][15401] Updated weights for policy 0, policy_version 879664 (0.0027) [2024-06-25 11:51:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 14412414976. Throughput: 0: 42744.0. Samples: 14412542040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 11:51:31,780][15401] Updated weights for policy 0, policy_version 879674 (0.0039) [2024-06-25 11:51:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14412627968. Throughput: 0: 43018.7. Samples: 14412805140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 11:51:35,866][15401] Updated weights for policy 0, policy_version 879684 (0.0029) [2024-06-25 11:51:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14412857344. Throughput: 0: 42932.9. Samples: 14412936720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 11:51:39,327][15401] Updated weights for policy 0, policy_version 879694 (0.0031) [2024-06-25 11:51:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14413053952. Throughput: 0: 42922.6. Samples: 14413190880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 11:51:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879703_14413053952.pth... [2024-06-25 11:51:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879078_14402813952.pth [2024-06-25 11:51:43,834][15401] Updated weights for policy 0, policy_version 879704 (0.0036) [2024-06-25 11:51:46,923][15401] Updated weights for policy 0, policy_version 879714 (0.0027) [2024-06-25 11:51:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42709.5). Total num frames: 14413283328. Throughput: 0: 43003.6. Samples: 14413444520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:48,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 11:51:51,273][15401] Updated weights for policy 0, policy_version 879724 (0.0034) [2024-06-25 11:51:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42709.3). Total num frames: 14413496320. Throughput: 0: 42875.0. Samples: 14413575760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:53,392][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 11:51:54,405][15401] Updated weights for policy 0, policy_version 879734 (0.0050) [2024-06-25 11:51:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14413709312. Throughput: 0: 42970.1. Samples: 14413829820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:51:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 11:51:58,915][15401] Updated weights for policy 0, policy_version 879744 (0.0030) [2024-06-25 11:52:01,979][15401] Updated weights for policy 0, policy_version 879754 (0.0035) [2024-06-25 11:52:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14413922304. Throughput: 0: 42871.5. Samples: 14414084040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 11:52:06,302][15401] Updated weights for policy 0, policy_version 879764 (0.0023) [2024-06-25 11:52:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14414151680. Throughput: 0: 42834.3. Samples: 14414216400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:08,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 11:52:09,436][15349] Signal inference workers to stop experience collection... (213400 times) [2024-06-25 11:52:09,436][15349] Signal inference workers to resume experience collection... (213400 times) [2024-06-25 11:52:09,463][15401] InferenceWorker_p0-w0: stopping experience collection (213400 times) [2024-06-25 11:52:09,463][15401] InferenceWorker_p0-w0: resuming experience collection (213400 times) [2024-06-25 11:52:09,580][15401] Updated weights for policy 0, policy_version 879774 (0.0035) [2024-06-25 11:52:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14414331904. Throughput: 0: 42874.1. Samples: 14414471380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 11:52:14,297][15401] Updated weights for policy 0, policy_version 879784 (0.0041) [2024-06-25 11:52:17,354][15401] Updated weights for policy 0, policy_version 879794 (0.0053) [2024-06-25 11:52:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14414561280. Throughput: 0: 42544.0. Samples: 14414719620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 11:52:22,066][15401] Updated weights for policy 0, policy_version 879804 (0.0028) [2024-06-25 11:52:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14414774272. Throughput: 0: 42515.6. Samples: 14414849920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 11:52:25,594][15401] Updated weights for policy 0, policy_version 879814 (0.0029) [2024-06-25 11:52:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14414970880. Throughput: 0: 42540.8. Samples: 14415105220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 11:52:29,658][15401] Updated weights for policy 0, policy_version 879824 (0.0032) [2024-06-25 11:52:33,230][15401] Updated weights for policy 0, policy_version 879834 (0.0035) [2024-06-25 11:52:33,394][15132] Fps is (10 sec: 44215.5, 60 sec: 43141.0, 300 sec: 42764.6). Total num frames: 14415216640. Throughput: 0: 42576.8. Samples: 14415360580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 11:52:33,395][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 11:52:37,295][15401] Updated weights for policy 0, policy_version 879844 (0.0032) [2024-06-25 11:52:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14415413248. Throughput: 0: 42602.4. Samples: 14415492760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:52:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 11:52:40,818][15401] Updated weights for policy 0, policy_version 879854 (0.0026) [2024-06-25 11:52:43,390][15132] Fps is (10 sec: 39340.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14415609856. Throughput: 0: 42656.5. Samples: 14415749360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:52:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 11:52:45,014][15401] Updated weights for policy 0, policy_version 879864 (0.0024) [2024-06-25 11:52:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14415839232. Throughput: 0: 42502.6. Samples: 14415996660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:52:48,396][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 11:52:48,933][15401] Updated weights for policy 0, policy_version 879874 (0.0034) [2024-06-25 11:52:52,630][15401] Updated weights for policy 0, policy_version 879884 (0.0031) [2024-06-25 11:52:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 14416035840. Throughput: 0: 42407.6. Samples: 14416124740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:52:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 11:52:56,626][15401] Updated weights for policy 0, policy_version 879894 (0.0031) [2024-06-25 11:52:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 14416248832. Throughput: 0: 42339.2. Samples: 14416376640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:52:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 11:53:00,253][15401] Updated weights for policy 0, policy_version 879904 (0.0032) [2024-06-25 11:53:03,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14416478208. Throughput: 0: 42502.5. Samples: 14416632340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:03,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 11:53:04,325][15401] Updated weights for policy 0, policy_version 879914 (0.0035) [2024-06-25 11:53:07,918][15401] Updated weights for policy 0, policy_version 879924 (0.0033) [2024-06-25 11:53:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14416691200. Throughput: 0: 42537.7. Samples: 14416764120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 11:53:12,030][15401] Updated weights for policy 0, policy_version 879934 (0.0035) [2024-06-25 11:53:13,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14416887808. Throughput: 0: 42607.6. Samples: 14417022560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 11:53:15,505][15401] Updated weights for policy 0, policy_version 879944 (0.0043) [2024-06-25 11:53:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42821.5). Total num frames: 14417117184. Throughput: 0: 42709.0. Samples: 14417282280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 11:53:19,729][15401] Updated weights for policy 0, policy_version 879954 (0.0032) [2024-06-25 11:53:23,266][15401] Updated weights for policy 0, policy_version 879964 (0.0036) [2024-06-25 11:53:23,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 14417330176. Throughput: 0: 42568.5. Samples: 14417408620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:23,397][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 11:53:27,133][15401] Updated weights for policy 0, policy_version 879974 (0.0038) [2024-06-25 11:53:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14417526784. Throughput: 0: 42644.0. Samples: 14417668340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:28,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 11:53:30,743][15401] Updated weights for policy 0, policy_version 879984 (0.0036) [2024-06-25 11:53:33,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42328.7, 300 sec: 42765.0). Total num frames: 14417756160. Throughput: 0: 42842.7. Samples: 14417924580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:33,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 11:53:34,651][15401] Updated weights for policy 0, policy_version 879994 (0.0034) [2024-06-25 11:53:38,322][15401] Updated weights for policy 0, policy_version 880004 (0.0025) [2024-06-25 11:53:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14417985536. Throughput: 0: 42916.4. Samples: 14418055980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:38,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 11:53:42,434][15401] Updated weights for policy 0, policy_version 880014 (0.0040) [2024-06-25 11:53:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14418182144. Throughput: 0: 42917.2. Samples: 14418308020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:43,392][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 11:53:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880016_14418182144.pth... [2024-06-25 11:53:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879390_14407925760.pth [2024-06-25 11:53:46,204][15401] Updated weights for policy 0, policy_version 880024 (0.0030) [2024-06-25 11:53:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14418395136. Throughput: 0: 42953.9. Samples: 14418565160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 11:53:49,449][15349] Signal inference workers to stop experience collection... (213450 times) [2024-06-25 11:53:49,450][15349] Signal inference workers to resume experience collection... (213450 times) [2024-06-25 11:53:49,499][15401] InferenceWorker_p0-w0: stopping experience collection (213450 times) [2024-06-25 11:53:49,499][15401] InferenceWorker_p0-w0: resuming experience collection (213450 times) [2024-06-25 11:53:49,950][15401] Updated weights for policy 0, policy_version 880034 (0.0038) [2024-06-25 11:53:53,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14418624512. Throughput: 0: 42980.9. Samples: 14418698260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 11:53:53,617][15401] Updated weights for policy 0, policy_version 880044 (0.0029) [2024-06-25 11:53:57,542][15401] Updated weights for policy 0, policy_version 880054 (0.0042) [2024-06-25 11:53:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 14418821120. Throughput: 0: 42897.0. Samples: 14418952920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:53:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 11:54:01,446][15401] Updated weights for policy 0, policy_version 880064 (0.0027) [2024-06-25 11:54:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14419034112. Throughput: 0: 42857.8. Samples: 14419210880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:54:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 11:54:05,245][15401] Updated weights for policy 0, policy_version 880074 (0.0041) [2024-06-25 11:54:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14419263488. Throughput: 0: 42970.1. Samples: 14419342000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:54:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 11:54:08,877][15401] Updated weights for policy 0, policy_version 880084 (0.0041) [2024-06-25 11:54:12,919][15401] Updated weights for policy 0, policy_version 880094 (0.0030) [2024-06-25 11:54:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14419476480. Throughput: 0: 42950.6. Samples: 14419601120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:54:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 11:54:16,663][15401] Updated weights for policy 0, policy_version 880104 (0.0032) [2024-06-25 11:54:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14419705856. Throughput: 0: 42950.8. Samples: 14419857360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:54:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 11:54:20,468][15401] Updated weights for policy 0, policy_version 880114 (0.0031) [2024-06-25 11:54:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 14419902464. Throughput: 0: 42998.6. Samples: 14419990920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 11:54:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 11:54:24,121][15401] Updated weights for policy 0, policy_version 880124 (0.0029) [2024-06-25 11:54:28,238][15401] Updated weights for policy 0, policy_version 880134 (0.0036) [2024-06-25 11:54:28,391][15132] Fps is (10 sec: 40954.7, 60 sec: 43143.6, 300 sec: 42820.4). Total num frames: 14420115456. Throughput: 0: 43093.6. Samples: 14420247180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:28,391][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 11:54:31,694][15401] Updated weights for policy 0, policy_version 880144 (0.0037) [2024-06-25 11:54:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14420361216. Throughput: 0: 43003.9. Samples: 14420500340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 11:54:36,107][15401] Updated weights for policy 0, policy_version 880154 (0.0047) [2024-06-25 11:54:38,389][15132] Fps is (10 sec: 44242.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14420557824. Throughput: 0: 43141.0. Samples: 14420639600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 11:54:39,307][15401] Updated weights for policy 0, policy_version 880164 (0.0030) [2024-06-25 11:54:43,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 14420754432. Throughput: 0: 43084.7. Samples: 14420891740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:43,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 11:54:43,625][15401] Updated weights for policy 0, policy_version 880174 (0.0034) [2024-06-25 11:54:46,980][15401] Updated weights for policy 0, policy_version 880184 (0.0038) [2024-06-25 11:54:48,392][15132] Fps is (10 sec: 44225.6, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 14421000192. Throughput: 0: 42920.3. Samples: 14421142400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:48,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 11:54:51,420][15401] Updated weights for policy 0, policy_version 880194 (0.0036) [2024-06-25 11:54:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14421213184. Throughput: 0: 43140.5. Samples: 14421283320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 11:54:54,590][15401] Updated weights for policy 0, policy_version 880204 (0.0040) [2024-06-25 11:54:58,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14421393408. Throughput: 0: 42788.4. Samples: 14421526600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:54:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 11:54:59,377][15401] Updated weights for policy 0, policy_version 880214 (0.0041) [2024-06-25 11:55:02,236][15401] Updated weights for policy 0, policy_version 880224 (0.0040) [2024-06-25 11:55:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43415.9, 300 sec: 42875.7). Total num frames: 14421639168. Throughput: 0: 42779.5. Samples: 14421782540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:03,392][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 11:55:06,991][15401] Updated weights for policy 0, policy_version 880234 (0.0035) [2024-06-25 11:55:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14421835776. Throughput: 0: 42810.3. Samples: 14421917380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 11:55:09,909][15401] Updated weights for policy 0, policy_version 880244 (0.0028) [2024-06-25 11:55:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14422048768. Throughput: 0: 42602.6. Samples: 14422164240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:13,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 11:55:14,837][15401] Updated weights for policy 0, policy_version 880254 (0.0048) [2024-06-25 11:55:17,563][15401] Updated weights for policy 0, policy_version 880264 (0.0032) [2024-06-25 11:55:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14422278144. Throughput: 0: 42614.2. Samples: 14422417980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 11:55:22,317][15401] Updated weights for policy 0, policy_version 880274 (0.0044) [2024-06-25 11:55:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 14422441984. Throughput: 0: 42494.2. Samples: 14422551840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:23,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 11:55:24,662][15349] Signal inference workers to stop experience collection... (213500 times) [2024-06-25 11:55:24,708][15401] InferenceWorker_p0-w0: stopping experience collection (213500 times) [2024-06-25 11:55:24,716][15349] Signal inference workers to resume experience collection... (213500 times) [2024-06-25 11:55:24,723][15401] InferenceWorker_p0-w0: resuming experience collection (213500 times) [2024-06-25 11:55:25,440][15401] Updated weights for policy 0, policy_version 880284 (0.0039) [2024-06-25 11:55:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42872.4, 300 sec: 42820.6). Total num frames: 14422687744. Throughput: 0: 42482.2. Samples: 14422803440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 11:55:29,994][15401] Updated weights for policy 0, policy_version 880294 (0.0045) [2024-06-25 11:55:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 14422884352. Throughput: 0: 42735.3. Samples: 14423065380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:33,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 11:55:33,464][15401] Updated weights for policy 0, policy_version 880304 (0.0028) [2024-06-25 11:55:37,590][15401] Updated weights for policy 0, policy_version 880314 (0.0039) [2024-06-25 11:55:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 14423080960. Throughput: 0: 42445.8. Samples: 14423193380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:38,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 11:55:41,019][15401] Updated weights for policy 0, policy_version 880324 (0.0026) [2024-06-25 11:55:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14423343104. Throughput: 0: 42676.4. Samples: 14423447040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 11:55:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880331_14423343104.pth... [2024-06-25 11:55:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000879703_14413053952.pth [2024-06-25 11:55:45,576][15401] Updated weights for policy 0, policy_version 880334 (0.0032) [2024-06-25 11:55:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 14423539712. Throughput: 0: 42542.3. Samples: 14423696840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:48,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 11:55:48,636][15401] Updated weights for policy 0, policy_version 880344 (0.0033) [2024-06-25 11:55:53,240][15401] Updated weights for policy 0, policy_version 880354 (0.0029) [2024-06-25 11:55:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 14423719936. Throughput: 0: 42379.4. Samples: 14423824460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:53,390][15132] Avg episode reward: [(0, '0.103')] [2024-06-25 11:55:56,129][15401] Updated weights for policy 0, policy_version 880364 (0.0045) [2024-06-25 11:55:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14423982080. Throughput: 0: 42646.3. Samples: 14424083320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:55:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 11:56:00,686][15401] Updated weights for policy 0, policy_version 880374 (0.0024) [2024-06-25 11:56:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 14424162304. Throughput: 0: 42790.5. Samples: 14424343560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:56:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 11:56:04,018][15401] Updated weights for policy 0, policy_version 880384 (0.0042) [2024-06-25 11:56:08,383][15401] Updated weights for policy 0, policy_version 880394 (0.0037) [2024-06-25 11:56:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14424375296. Throughput: 0: 42576.8. Samples: 14424467800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:56:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 11:56:12,001][15401] Updated weights for policy 0, policy_version 880404 (0.0039) [2024-06-25 11:56:13,392][15132] Fps is (10 sec: 47503.1, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 14424637440. Throughput: 0: 42644.8. Samples: 14424722560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:56:13,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 11:56:16,384][15401] Updated weights for policy 0, policy_version 880414 (0.0035) [2024-06-25 11:56:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 14424801280. Throughput: 0: 42473.7. Samples: 14424976700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 11:56:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 11:56:19,690][15401] Updated weights for policy 0, policy_version 880424 (0.0032) [2024-06-25 11:56:23,392][15132] Fps is (10 sec: 37683.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14425014272. Throughput: 0: 42274.2. Samples: 14425095820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:23,392][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 11:56:24,160][15401] Updated weights for policy 0, policy_version 880434 (0.0049) [2024-06-25 11:56:27,322][15401] Updated weights for policy 0, policy_version 880444 (0.0035) [2024-06-25 11:56:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14425276416. Throughput: 0: 42514.4. Samples: 14425360180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 11:56:31,615][15401] Updated weights for policy 0, policy_version 880454 (0.0032) [2024-06-25 11:56:31,654][15349] Signal inference workers to stop experience collection... (213550 times) [2024-06-25 11:56:31,660][15349] Signal inference workers to resume experience collection... (213550 times) [2024-06-25 11:56:31,699][15401] InferenceWorker_p0-w0: stopping experience collection (213550 times) [2024-06-25 11:56:31,699][15401] InferenceWorker_p0-w0: resuming experience collection (213550 times) [2024-06-25 11:56:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14425440256. Throughput: 0: 42715.6. Samples: 14425619040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:33,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 11:56:34,877][15401] Updated weights for policy 0, policy_version 880464 (0.0036) [2024-06-25 11:56:38,389][15132] Fps is (10 sec: 36045.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14425636864. Throughput: 0: 42509.1. Samples: 14425737360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:38,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 11:56:39,172][15401] Updated weights for policy 0, policy_version 880474 (0.0040) [2024-06-25 11:56:42,428][15401] Updated weights for policy 0, policy_version 880484 (0.0036) [2024-06-25 11:56:43,390][15132] Fps is (10 sec: 45871.0, 60 sec: 42597.9, 300 sec: 42765.2). Total num frames: 14425899008. Throughput: 0: 42576.9. Samples: 14425999320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:43,391][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 11:56:46,937][15401] Updated weights for policy 0, policy_version 880494 (0.0038) [2024-06-25 11:56:48,389][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14426095616. Throughput: 0: 42446.0. Samples: 14426253620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 11:56:50,549][15401] Updated weights for policy 0, policy_version 880504 (0.0053) [2024-06-25 11:56:53,390][15132] Fps is (10 sec: 40963.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14426308608. Throughput: 0: 42513.3. Samples: 14426380900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 11:56:54,329][15401] Updated weights for policy 0, policy_version 880514 (0.0027) [2024-06-25 11:56:58,105][15401] Updated weights for policy 0, policy_version 880524 (0.0043) [2024-06-25 11:56:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14426521600. Throughput: 0: 42876.0. Samples: 14426651880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:56:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 11:57:01,837][15401] Updated weights for policy 0, policy_version 880534 (0.0048) [2024-06-25 11:57:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14426734592. Throughput: 0: 42776.5. Samples: 14426901640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:03,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 11:57:05,729][15401] Updated weights for policy 0, policy_version 880544 (0.0042) [2024-06-25 11:57:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14426947584. Throughput: 0: 43028.5. Samples: 14427032000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 11:57:09,196][15401] Updated weights for policy 0, policy_version 880554 (0.0032) [2024-06-25 11:57:13,214][15401] Updated weights for policy 0, policy_version 880564 (0.0025) [2024-06-25 11:57:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 14427176960. Throughput: 0: 43056.0. Samples: 14427297700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:13,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 11:57:16,676][15401] Updated weights for policy 0, policy_version 880574 (0.0036) [2024-06-25 11:57:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14427389952. Throughput: 0: 43105.6. Samples: 14427558800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 11:57:20,709][15401] Updated weights for policy 0, policy_version 880584 (0.0044) [2024-06-25 11:57:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 14427602944. Throughput: 0: 43230.0. Samples: 14427682720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 11:57:24,056][15401] Updated weights for policy 0, policy_version 880594 (0.0034) [2024-06-25 11:57:28,112][15401] Updated weights for policy 0, policy_version 880604 (0.0035) [2024-06-25 11:57:28,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42320.7, 300 sec: 42709.2). Total num frames: 14427815936. Throughput: 0: 43287.5. Samples: 14427947500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:28,397][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 11:57:31,931][15401] Updated weights for policy 0, policy_version 880614 (0.0030) [2024-06-25 11:57:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14428012544. Throughput: 0: 43355.5. Samples: 14428204620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 11:57:35,923][15401] Updated weights for policy 0, policy_version 880624 (0.0028) [2024-06-25 11:57:36,418][15349] Signal inference workers to stop experience collection... (213600 times) [2024-06-25 11:57:36,419][15349] Signal inference workers to resume experience collection... (213600 times) [2024-06-25 11:57:36,459][15401] InferenceWorker_p0-w0: stopping experience collection (213600 times) [2024-06-25 11:57:36,459][15401] InferenceWorker_p0-w0: resuming experience collection (213600 times) [2024-06-25 11:57:38,392][15132] Fps is (10 sec: 44254.7, 60 sec: 43688.8, 300 sec: 42875.7). Total num frames: 14428258304. Throughput: 0: 43390.1. Samples: 14428333560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:38,392][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 11:57:39,359][15401] Updated weights for policy 0, policy_version 880634 (0.0032) [2024-06-25 11:57:43,354][15401] Updated weights for policy 0, policy_version 880644 (0.0031) [2024-06-25 11:57:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42872.1, 300 sec: 42820.6). Total num frames: 14428471296. Throughput: 0: 43111.7. Samples: 14428591900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:43,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 11:57:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880644_14428471296.pth... [2024-06-25 11:57:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880016_14418182144.pth [2024-06-25 11:57:47,148][15401] Updated weights for policy 0, policy_version 880654 (0.0024) [2024-06-25 11:57:48,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 14428667904. Throughput: 0: 43260.4. Samples: 14428848460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:48,392][15132] Avg episode reward: [(0, '0.257')] [2024-06-25 11:57:50,861][15401] Updated weights for policy 0, policy_version 880664 (0.0030) [2024-06-25 11:57:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 14428913664. Throughput: 0: 43237.0. Samples: 14428977660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 11:57:54,704][15401] Updated weights for policy 0, policy_version 880674 (0.0039) [2024-06-25 11:57:58,390][15132] Fps is (10 sec: 44244.3, 60 sec: 43144.1, 300 sec: 42820.8). Total num frames: 14429110272. Throughput: 0: 43088.6. Samples: 14429236720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:57:58,391][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 11:57:58,558][15401] Updated weights for policy 0, policy_version 880684 (0.0037) [2024-06-25 11:58:02,296][15401] Updated weights for policy 0, policy_version 880694 (0.0037) [2024-06-25 11:58:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14429323264. Throughput: 0: 42867.7. Samples: 14429487840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:58:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 11:58:06,357][15401] Updated weights for policy 0, policy_version 880704 (0.0043) [2024-06-25 11:58:08,390][15132] Fps is (10 sec: 44239.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14429552640. Throughput: 0: 42959.6. Samples: 14429615900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 11:58:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 11:58:09,912][15401] Updated weights for policy 0, policy_version 880714 (0.0028) [2024-06-25 11:58:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14429749248. Throughput: 0: 42928.4. Samples: 14429879000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:13,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 11:58:13,887][15401] Updated weights for policy 0, policy_version 880724 (0.0036) [2024-06-25 11:58:17,528][15401] Updated weights for policy 0, policy_version 880734 (0.0027) [2024-06-25 11:58:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42876.7). Total num frames: 14429978624. Throughput: 0: 42775.9. Samples: 14430129640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:18,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 11:58:21,365][15401] Updated weights for policy 0, policy_version 880744 (0.0033) [2024-06-25 11:58:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14430175232. Throughput: 0: 42808.5. Samples: 14430259840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:23,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 11:58:25,258][15401] Updated weights for policy 0, policy_version 880754 (0.0034) [2024-06-25 11:58:28,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42876.2, 300 sec: 42820.6). Total num frames: 14430388224. Throughput: 0: 42758.8. Samples: 14430516040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 11:58:29,203][15401] Updated weights for policy 0, policy_version 880764 (0.0024) [2024-06-25 11:58:32,981][15401] Updated weights for policy 0, policy_version 880774 (0.0044) [2024-06-25 11:58:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 14430617600. Throughput: 0: 42694.6. Samples: 14430769720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:33,393][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 11:58:36,771][15401] Updated weights for policy 0, policy_version 880784 (0.0036) [2024-06-25 11:58:38,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42820.9). Total num frames: 14430814208. Throughput: 0: 42716.4. Samples: 14430899900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 11:58:40,668][15401] Updated weights for policy 0, policy_version 880794 (0.0037) [2024-06-25 11:58:43,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14431027200. Throughput: 0: 42737.6. Samples: 14431159880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 11:58:44,377][15401] Updated weights for policy 0, policy_version 880804 (0.0033) [2024-06-25 11:58:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14431240192. Throughput: 0: 42935.5. Samples: 14431419940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 11:58:48,462][15401] Updated weights for policy 0, policy_version 880814 (0.0036) [2024-06-25 11:58:52,023][15401] Updated weights for policy 0, policy_version 880824 (0.0034) [2024-06-25 11:58:53,391][15132] Fps is (10 sec: 42590.1, 60 sec: 42323.9, 300 sec: 42820.3). Total num frames: 14431453184. Throughput: 0: 42797.3. Samples: 14431541860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:53,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 11:58:56,052][15401] Updated weights for policy 0, policy_version 880834 (0.0029) [2024-06-25 11:58:57,194][15349] Signal inference workers to stop experience collection... (213650 times) [2024-06-25 11:58:57,194][15349] Signal inference workers to resume experience collection... (213650 times) [2024-06-25 11:58:57,224][15401] InferenceWorker_p0-w0: stopping experience collection (213650 times) [2024-06-25 11:58:57,224][15401] InferenceWorker_p0-w0: resuming experience collection (213650 times) [2024-06-25 11:58:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42872.0, 300 sec: 42876.1). Total num frames: 14431682560. Throughput: 0: 42690.6. Samples: 14431800080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:58:58,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 11:58:59,645][15401] Updated weights for policy 0, policy_version 880844 (0.0028) [2024-06-25 11:59:03,392][15132] Fps is (10 sec: 42596.5, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 14431879168. Throughput: 0: 42781.4. Samples: 14432054800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:03,392][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 11:59:03,809][15401] Updated weights for policy 0, policy_version 880854 (0.0051) [2024-06-25 11:59:07,156][15401] Updated weights for policy 0, policy_version 880864 (0.0047) [2024-06-25 11:59:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14432108544. Throughput: 0: 42747.9. Samples: 14432183500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:08,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 11:59:11,486][15401] Updated weights for policy 0, policy_version 880874 (0.0023) [2024-06-25 11:59:13,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14432305152. Throughput: 0: 42771.8. Samples: 14432440780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 11:59:14,843][15401] Updated weights for policy 0, policy_version 880884 (0.0027) [2024-06-25 11:59:18,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42322.5, 300 sec: 42764.1). Total num frames: 14432518144. Throughput: 0: 42810.5. Samples: 14432696360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:18,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 11:59:19,253][15401] Updated weights for policy 0, policy_version 880894 (0.0044) [2024-06-25 11:59:22,893][15401] Updated weights for policy 0, policy_version 880904 (0.0030) [2024-06-25 11:59:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.7). Total num frames: 14432747520. Throughput: 0: 42739.1. Samples: 14432823160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 11:59:26,965][15401] Updated weights for policy 0, policy_version 880914 (0.0032) [2024-06-25 11:59:28,390][15132] Fps is (10 sec: 44264.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 14432960512. Throughput: 0: 42729.2. Samples: 14433082700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 11:59:30,874][15401] Updated weights for policy 0, policy_version 880924 (0.0041) [2024-06-25 11:59:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42327.0, 300 sec: 42709.4). Total num frames: 14433157120. Throughput: 0: 42645.6. Samples: 14433339000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:33,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 11:59:34,674][15401] Updated weights for policy 0, policy_version 880934 (0.0042) [2024-06-25 11:59:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14433370112. Throughput: 0: 42741.5. Samples: 14433465140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:38,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 11:59:38,444][15401] Updated weights for policy 0, policy_version 880944 (0.0041) [2024-06-25 11:59:42,698][15401] Updated weights for policy 0, policy_version 880954 (0.0033) [2024-06-25 11:59:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 14433583104. Throughput: 0: 42665.2. Samples: 14433720020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:43,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 11:59:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880956_14433583104.pth... [2024-06-25 11:59:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880331_14423343104.pth [2024-06-25 11:59:46,077][15401] Updated weights for policy 0, policy_version 880964 (0.0029) [2024-06-25 11:59:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14433812480. Throughput: 0: 42421.0. Samples: 14433963640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:48,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 11:59:50,551][15401] Updated weights for policy 0, policy_version 880974 (0.0033) [2024-06-25 11:59:53,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42871.1, 300 sec: 42820.2). Total num frames: 14434025472. Throughput: 0: 42508.9. Samples: 14434096500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:53,392][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 11:59:53,701][15401] Updated weights for policy 0, policy_version 880984 (0.0038) [2024-06-25 11:59:58,168][15401] Updated weights for policy 0, policy_version 880994 (0.0036) [2024-06-25 11:59:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 14434222080. Throughput: 0: 42448.2. Samples: 14434350940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 11:59:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 12:00:01,187][15401] Updated weights for policy 0, policy_version 881004 (0.0039) [2024-06-25 12:00:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14434451456. Throughput: 0: 42357.1. Samples: 14434602160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:00:03,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 12:00:05,841][15401] Updated weights for policy 0, policy_version 881014 (0.0051) [2024-06-25 12:00:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14434664448. Throughput: 0: 42473.4. Samples: 14434734460. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:08,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 12:00:09,264][15401] Updated weights for policy 0, policy_version 881024 (0.0030) [2024-06-25 12:00:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14434844672. Throughput: 0: 42476.1. Samples: 14434994120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:13,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 12:00:13,533][15401] Updated weights for policy 0, policy_version 881034 (0.0034) [2024-06-25 12:00:16,842][15401] Updated weights for policy 0, policy_version 881044 (0.0041) [2024-06-25 12:00:18,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42874.3, 300 sec: 42875.7). Total num frames: 14435090432. Throughput: 0: 42295.1. Samples: 14435242380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:18,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 12:00:21,186][15401] Updated weights for policy 0, policy_version 881054 (0.0028) [2024-06-25 12:00:23,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 14435287040. Throughput: 0: 42511.4. Samples: 14435378260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:23,393][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 12:00:24,205][15349] Signal inference workers to stop experience collection... (213700 times) [2024-06-25 12:00:24,206][15349] Signal inference workers to resume experience collection... (213700 times) [2024-06-25 12:00:24,255][15401] InferenceWorker_p0-w0: stopping experience collection (213700 times) [2024-06-25 12:00:24,256][15401] InferenceWorker_p0-w0: resuming experience collection (213700 times) [2024-06-25 12:00:24,369][15401] Updated weights for policy 0, policy_version 881064 (0.0037) [2024-06-25 12:00:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14435483648. Throughput: 0: 42526.3. Samples: 14435633700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:28,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 12:00:28,773][15401] Updated weights for policy 0, policy_version 881074 (0.0028) [2024-06-25 12:00:32,013][15401] Updated weights for policy 0, policy_version 881084 (0.0036) [2024-06-25 12:00:33,392][15132] Fps is (10 sec: 45875.1, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 14435745792. Throughput: 0: 42593.2. Samples: 14435880440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:33,393][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 12:00:36,328][15401] Updated weights for policy 0, policy_version 881094 (0.0043) [2024-06-25 12:00:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14435909632. Throughput: 0: 42650.8. Samples: 14436015680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 12:00:39,555][15401] Updated weights for policy 0, policy_version 881104 (0.0050) [2024-06-25 12:00:43,392][15132] Fps is (10 sec: 37683.2, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 14436122624. Throughput: 0: 42522.5. Samples: 14436264560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:43,393][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 12:00:44,226][15401] Updated weights for policy 0, policy_version 881114 (0.0037) [2024-06-25 12:00:47,434][15401] Updated weights for policy 0, policy_version 881124 (0.0034) [2024-06-25 12:00:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14436368384. Throughput: 0: 42370.8. Samples: 14436508840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:48,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 12:00:52,234][15401] Updated weights for policy 0, policy_version 881134 (0.0038) [2024-06-25 12:00:53,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.0). Total num frames: 14436548608. Throughput: 0: 42395.0. Samples: 14436642340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:53,393][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 12:00:55,226][15401] Updated weights for policy 0, policy_version 881144 (0.0039) [2024-06-25 12:00:58,392][15132] Fps is (10 sec: 39311.5, 60 sec: 42323.5, 300 sec: 42709.2). Total num frames: 14436761600. Throughput: 0: 42249.3. Samples: 14436895440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:00:58,393][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 12:00:59,896][15401] Updated weights for policy 0, policy_version 881154 (0.0029) [2024-06-25 12:01:02,995][15401] Updated weights for policy 0, policy_version 881164 (0.0039) [2024-06-25 12:01:03,389][15132] Fps is (10 sec: 45886.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14437007360. Throughput: 0: 42201.9. Samples: 14437141360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:03,396][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 12:01:07,566][15401] Updated weights for policy 0, policy_version 881174 (0.0040) [2024-06-25 12:01:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 14437187584. Throughput: 0: 42194.3. Samples: 14437276900. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 12:01:11,121][15401] Updated weights for policy 0, policy_version 881184 (0.0038) [2024-06-25 12:01:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14437416960. Throughput: 0: 42029.3. Samples: 14437525020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:13,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 12:01:15,324][15401] Updated weights for policy 0, policy_version 881194 (0.0040) [2024-06-25 12:01:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41781.0, 300 sec: 42654.3). Total num frames: 14437597184. Throughput: 0: 42299.3. Samples: 14437783800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 12:01:18,905][15401] Updated weights for policy 0, policy_version 881204 (0.0035) [2024-06-25 12:01:23,051][15401] Updated weights for policy 0, policy_version 881214 (0.0029) [2024-06-25 12:01:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.0, 300 sec: 42542.8). Total num frames: 14437826560. Throughput: 0: 41990.6. Samples: 14437905260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 12:01:26,383][15401] Updated weights for policy 0, policy_version 881224 (0.0037) [2024-06-25 12:01:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14438039552. Throughput: 0: 42115.1. Samples: 14438159640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 12:01:30,789][15401] Updated weights for policy 0, policy_version 881234 (0.0044) [2024-06-25 12:01:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41507.8, 300 sec: 42709.5). Total num frames: 14438236160. Throughput: 0: 42302.5. Samples: 14438412460. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 12:01:34,261][15401] Updated weights for policy 0, policy_version 881244 (0.0032) [2024-06-25 12:01:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42543.0). Total num frames: 14438449152. Throughput: 0: 42198.3. Samples: 14438541160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:38,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 12:01:38,445][15401] Updated weights for policy 0, policy_version 881254 (0.0034) [2024-06-25 12:01:41,899][15401] Updated weights for policy 0, policy_version 881264 (0.0031) [2024-06-25 12:01:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.2, 300 sec: 42598.4). Total num frames: 14438662144. Throughput: 0: 42259.3. Samples: 14438797000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:43,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 12:01:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881267_14438678528.pth... [2024-06-25 12:01:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880644_14428471296.pth [2024-06-25 12:01:46,043][15401] Updated weights for policy 0, policy_version 881274 (0.0034) [2024-06-25 12:01:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 14438891520. Throughput: 0: 42461.1. Samples: 14439052120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 12:01:49,495][15401] Updated weights for policy 0, policy_version 881284 (0.0044) [2024-06-25 12:01:53,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42052.3, 300 sec: 42542.5). Total num frames: 14439071744. Throughput: 0: 42362.6. Samples: 14439183320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:01:53,392][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 12:01:53,919][15401] Updated weights for policy 0, policy_version 881294 (0.0043) [2024-06-25 12:01:56,863][15349] Signal inference workers to stop experience collection... (213750 times) [2024-06-25 12:01:56,892][15401] InferenceWorker_p0-w0: stopping experience collection (213750 times) [2024-06-25 12:01:56,913][15349] Signal inference workers to resume experience collection... (213750 times) [2024-06-25 12:01:56,914][15401] InferenceWorker_p0-w0: resuming experience collection (213750 times) [2024-06-25 12:01:57,058][15401] Updated weights for policy 0, policy_version 881304 (0.0028) [2024-06-25 12:01:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 14439317504. Throughput: 0: 42385.8. Samples: 14439432380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:01:58,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 12:02:01,722][15401] Updated weights for policy 0, policy_version 881314 (0.0044) [2024-06-25 12:02:03,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14439530496. Throughput: 0: 42495.0. Samples: 14439696080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:03,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 12:02:04,878][15401] Updated weights for policy 0, policy_version 881324 (0.0025) [2024-06-25 12:02:08,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 14439727104. Throughput: 0: 42521.8. Samples: 14439818840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:08,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 12:02:09,534][15401] Updated weights for policy 0, policy_version 881334 (0.0038) [2024-06-25 12:02:13,038][15401] Updated weights for policy 0, policy_version 881344 (0.0037) [2024-06-25 12:02:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14439956480. Throughput: 0: 42564.2. Samples: 14440075020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 12:02:17,145][15401] Updated weights for policy 0, policy_version 881354 (0.0033) [2024-06-25 12:02:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14440169472. Throughput: 0: 42753.3. Samples: 14440336360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:18,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 12:02:20,735][15401] Updated weights for policy 0, policy_version 881364 (0.0035) [2024-06-25 12:02:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 14440382464. Throughput: 0: 42650.6. Samples: 14440460440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 12:02:24,773][15401] Updated weights for policy 0, policy_version 881374 (0.0033) [2024-06-25 12:02:28,343][15401] Updated weights for policy 0, policy_version 881384 (0.0027) [2024-06-25 12:02:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14440595456. Throughput: 0: 42651.9. Samples: 14440716340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:28,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 12:02:32,943][15401] Updated weights for policy 0, policy_version 881394 (0.0040) [2024-06-25 12:02:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 14440792064. Throughput: 0: 42703.2. Samples: 14440973760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 12:02:35,990][15401] Updated weights for policy 0, policy_version 881404 (0.0037) [2024-06-25 12:02:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14441021440. Throughput: 0: 42473.8. Samples: 14441094540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 12:02:40,478][15401] Updated weights for policy 0, policy_version 881414 (0.0042) [2024-06-25 12:02:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 14441234432. Throughput: 0: 42624.9. Samples: 14441350500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 12:02:43,607][15401] Updated weights for policy 0, policy_version 881424 (0.0022) [2024-06-25 12:02:48,224][15401] Updated weights for policy 0, policy_version 881434 (0.0039) [2024-06-25 12:02:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14441431040. Throughput: 0: 42732.9. Samples: 14441619060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 12:02:51,135][15401] Updated weights for policy 0, policy_version 881444 (0.0032) [2024-06-25 12:02:53,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43417.5, 300 sec: 42598.1). Total num frames: 14441676800. Throughput: 0: 42700.4. Samples: 14441740360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:53,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 12:02:55,702][15401] Updated weights for policy 0, policy_version 881454 (0.0032) [2024-06-25 12:02:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14441889792. Throughput: 0: 42824.8. Samples: 14442002140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:02:58,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 12:02:58,609][15401] Updated weights for policy 0, policy_version 881464 (0.0028) [2024-06-25 12:03:03,113][15401] Updated weights for policy 0, policy_version 881474 (0.0033) [2024-06-25 12:03:03,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14442086400. Throughput: 0: 42852.9. Samples: 14442264740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:03,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 12:03:06,425][15401] Updated weights for policy 0, policy_version 881484 (0.0030) [2024-06-25 12:03:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43419.3, 300 sec: 42653.9). Total num frames: 14442332160. Throughput: 0: 42814.2. Samples: 14442387080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:08,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-25 12:03:10,695][15401] Updated weights for policy 0, policy_version 881494 (0.0031) [2024-06-25 12:03:11,804][15349] Signal inference workers to stop experience collection... (213800 times) [2024-06-25 12:03:11,829][15401] InferenceWorker_p0-w0: stopping experience collection (213800 times) [2024-06-25 12:03:11,867][15349] Signal inference workers to resume experience collection... (213800 times) [2024-06-25 12:03:11,867][15401] InferenceWorker_p0-w0: resuming experience collection (213800 times) [2024-06-25 12:03:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42598.8). Total num frames: 14442545152. Throughput: 0: 42873.3. Samples: 14442645640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:13,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 12:03:14,015][15401] Updated weights for policy 0, policy_version 881504 (0.0036) [2024-06-25 12:03:18,391][15132] Fps is (10 sec: 37677.4, 60 sec: 42324.2, 300 sec: 42487.1). Total num frames: 14442708992. Throughput: 0: 42849.2. Samples: 14442902040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:18,391][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 12:03:18,764][15401] Updated weights for policy 0, policy_version 881514 (0.0030) [2024-06-25 12:03:21,875][15401] Updated weights for policy 0, policy_version 881524 (0.0033) [2024-06-25 12:03:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 14442938368. Throughput: 0: 42868.0. Samples: 14443023600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:23,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 12:03:26,239][15401] Updated weights for policy 0, policy_version 881534 (0.0035) [2024-06-25 12:03:28,390][15132] Fps is (10 sec: 45882.3, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 14443167744. Throughput: 0: 42899.1. Samples: 14443280960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:28,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 12:03:29,311][15401] Updated weights for policy 0, policy_version 881544 (0.0026) [2024-06-25 12:03:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14443364352. Throughput: 0: 42709.8. Samples: 14443541000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 12:03:33,929][15401] Updated weights for policy 0, policy_version 881554 (0.0031) [2024-06-25 12:03:36,806][15401] Updated weights for policy 0, policy_version 881564 (0.0036) [2024-06-25 12:03:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14443577344. Throughput: 0: 42790.4. Samples: 14443665820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 12:03:41,525][15401] Updated weights for policy 0, policy_version 881574 (0.0044) [2024-06-25 12:03:43,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14443823104. Throughput: 0: 42704.4. Samples: 14443923840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 12:03:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881581_14443823104.pth... [2024-06-25 12:03:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000880956_14433583104.pth [2024-06-25 12:03:44,316][15401] Updated weights for policy 0, policy_version 881584 (0.0024) [2024-06-25 12:03:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42543.1). Total num frames: 14444003328. Throughput: 0: 42660.8. Samples: 14444184480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 12:03:48,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 12:03:49,142][15401] Updated weights for policy 0, policy_version 881594 (0.0026) [2024-06-25 12:03:52,172][15401] Updated weights for policy 0, policy_version 881604 (0.0031) [2024-06-25 12:03:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 14444232704. Throughput: 0: 42610.2. Samples: 14444304540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:03:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 12:03:56,789][15401] Updated weights for policy 0, policy_version 881614 (0.0039) [2024-06-25 12:03:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 14444462080. Throughput: 0: 42698.1. Samples: 14444567060. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:03:58,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 12:03:59,782][15401] Updated weights for policy 0, policy_version 881624 (0.0043) [2024-06-25 12:04:03,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 14444642304. Throughput: 0: 42604.7. Samples: 14444819460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:03,396][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 12:04:04,655][15401] Updated weights for policy 0, policy_version 881634 (0.0036) [2024-06-25 12:04:07,350][15401] Updated weights for policy 0, policy_version 881644 (0.0037) [2024-06-25 12:04:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14444888064. Throughput: 0: 42671.6. Samples: 14444943820. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 12:04:12,133][15401] Updated weights for policy 0, policy_version 881654 (0.0036) [2024-06-25 12:04:13,390][15132] Fps is (10 sec: 44265.2, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 14445084672. Throughput: 0: 42965.4. Samples: 14445214400. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 12:04:14,818][15401] Updated weights for policy 0, policy_version 881664 (0.0044) [2024-06-25 12:04:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43144.0, 300 sec: 42542.5). Total num frames: 14445297664. Throughput: 0: 42844.9. Samples: 14445469120. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:18,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 12:04:19,710][15401] Updated weights for policy 0, policy_version 881674 (0.0030) [2024-06-25 12:04:22,662][15401] Updated weights for policy 0, policy_version 881684 (0.0029) [2024-06-25 12:04:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14445527040. Throughput: 0: 42792.9. Samples: 14445591500. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:23,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 12:04:24,471][15349] Signal inference workers to stop experience collection... (213850 times) [2024-06-25 12:04:24,471][15349] Signal inference workers to resume experience collection... (213850 times) [2024-06-25 12:04:24,508][15401] InferenceWorker_p0-w0: stopping experience collection (213850 times) [2024-06-25 12:04:24,508][15401] InferenceWorker_p0-w0: resuming experience collection (213850 times) [2024-06-25 12:04:27,215][15401] Updated weights for policy 0, policy_version 881694 (0.0025) [2024-06-25 12:04:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14445723648. Throughput: 0: 42926.0. Samples: 14445855500. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:28,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 12:04:30,410][15401] Updated weights for policy 0, policy_version 881704 (0.0032) [2024-06-25 12:04:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14445936640. Throughput: 0: 42767.6. Samples: 14446109020. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 12:04:35,037][15401] Updated weights for policy 0, policy_version 881714 (0.0043) [2024-06-25 12:04:37,969][15401] Updated weights for policy 0, policy_version 881724 (0.0044) [2024-06-25 12:04:38,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 14446166016. Throughput: 0: 42796.8. Samples: 14446230400. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:38,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 12:04:42,732][15401] Updated weights for policy 0, policy_version 881734 (0.0031) [2024-06-25 12:04:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14446346240. Throughput: 0: 42724.9. Samples: 14446489680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:43,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 12:04:45,551][15401] Updated weights for policy 0, policy_version 881744 (0.0037) [2024-06-25 12:04:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 14446559232. Throughput: 0: 42859.0. Samples: 14446747840. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 12:04:50,206][15401] Updated weights for policy 0, policy_version 881754 (0.0037) [2024-06-25 12:04:53,082][15401] Updated weights for policy 0, policy_version 881764 (0.0036) [2024-06-25 12:04:53,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14446821376. Throughput: 0: 42866.3. Samples: 14446872800. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:53,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 12:04:58,334][15401] Updated weights for policy 0, policy_version 881774 (0.0028) [2024-06-25 12:04:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14446985216. Throughput: 0: 42618.7. Samples: 14447132240. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:04:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 12:05:00,708][15401] Updated weights for policy 0, policy_version 881784 (0.0035) [2024-06-25 12:05:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42603.0, 300 sec: 42487.3). Total num frames: 14447198208. Throughput: 0: 42635.2. Samples: 14447387600. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:03,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 12:05:05,741][15401] Updated weights for policy 0, policy_version 881794 (0.0041) [2024-06-25 12:05:08,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14447460352. Throughput: 0: 42847.1. Samples: 14447519620. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:08,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 12:05:08,466][15401] Updated weights for policy 0, policy_version 881804 (0.0036) [2024-06-25 12:05:13,217][15401] Updated weights for policy 0, policy_version 881814 (0.0027) [2024-06-25 12:05:13,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 14447640576. Throughput: 0: 42683.3. Samples: 14447776260. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:13,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 12:05:16,316][15401] Updated weights for policy 0, policy_version 881824 (0.0040) [2024-06-25 12:05:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42600.0, 300 sec: 42598.7). Total num frames: 14447853568. Throughput: 0: 42741.6. Samples: 14448032400. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:18,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 12:05:21,040][15401] Updated weights for policy 0, policy_version 881834 (0.0025) [2024-06-25 12:05:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14448082944. Throughput: 0: 42943.2. Samples: 14448162840. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 12:05:24,137][15401] Updated weights for policy 0, policy_version 881844 (0.0033) [2024-06-25 12:05:27,739][15349] Signal inference workers to stop experience collection... (213900 times) [2024-06-25 12:05:27,792][15401] InferenceWorker_p0-w0: stopping experience collection (213900 times) [2024-06-25 12:05:27,800][15349] Signal inference workers to resume experience collection... (213900 times) [2024-06-25 12:05:27,811][15401] InferenceWorker_p0-w0: resuming experience collection (213900 times) [2024-06-25 12:05:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 14448279552. Throughput: 0: 42848.5. Samples: 14448417860. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 12:05:28,778][15401] Updated weights for policy 0, policy_version 881854 (0.0031) [2024-06-25 12:05:31,836][15401] Updated weights for policy 0, policy_version 881864 (0.0030) [2024-06-25 12:05:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14448492544. Throughput: 0: 42835.0. Samples: 14448675420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:33,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 12:05:36,231][15401] Updated weights for policy 0, policy_version 881874 (0.0036) [2024-06-25 12:05:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 14448738304. Throughput: 0: 42971.5. Samples: 14448806520. Policy #0 lag: (min: 1.0, avg: 12.0, max: 23.0) [2024-06-25 12:05:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 12:05:39,452][15401] Updated weights for policy 0, policy_version 881884 (0.0036) [2024-06-25 12:05:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14448934912. Throughput: 0: 42944.0. Samples: 14449064720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:05:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 12:05:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881893_14448934912.pth... [2024-06-25 12:05:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881267_14438678528.pth [2024-06-25 12:05:43,769][15401] Updated weights for policy 0, policy_version 881894 (0.0046) [2024-06-25 12:05:47,157][15401] Updated weights for policy 0, policy_version 881904 (0.0032) [2024-06-25 12:05:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 14449147904. Throughput: 0: 42894.2. Samples: 14449317840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:05:48,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 12:05:51,361][15401] Updated weights for policy 0, policy_version 881914 (0.0037) [2024-06-25 12:05:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 14449393664. Throughput: 0: 42818.6. Samples: 14449446460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:05:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 12:05:54,746][15401] Updated weights for policy 0, policy_version 881924 (0.0043) [2024-06-25 12:05:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14449573888. Throughput: 0: 42980.2. Samples: 14449710360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:05:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 12:05:58,780][15401] Updated weights for policy 0, policy_version 881934 (0.0041) [2024-06-25 12:06:02,411][15401] Updated weights for policy 0, policy_version 881944 (0.0034) [2024-06-25 12:06:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14449786880. Throughput: 0: 42965.4. Samples: 14449965840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:03,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 12:06:06,627][15401] Updated weights for policy 0, policy_version 881954 (0.0022) [2024-06-25 12:06:08,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14450049024. Throughput: 0: 42919.9. Samples: 14450094340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:08,392][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 12:06:10,041][15401] Updated weights for policy 0, policy_version 881964 (0.0025) [2024-06-25 12:06:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14450212864. Throughput: 0: 43072.4. Samples: 14450356120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:13,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 12:06:14,440][15401] Updated weights for policy 0, policy_version 881974 (0.0041) [2024-06-25 12:06:18,296][15401] Updated weights for policy 0, policy_version 881984 (0.0035) [2024-06-25 12:06:18,389][15132] Fps is (10 sec: 37692.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14450425856. Throughput: 0: 43050.4. Samples: 14450612680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:18,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 12:06:21,943][15401] Updated weights for policy 0, policy_version 881994 (0.0027) [2024-06-25 12:06:23,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14450688000. Throughput: 0: 42940.4. Samples: 14450738840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 12:06:25,773][15349] Signal inference workers to stop experience collection... (213950 times) [2024-06-25 12:06:25,812][15401] InferenceWorker_p0-w0: stopping experience collection (213950 times) [2024-06-25 12:06:25,822][15349] Signal inference workers to resume experience collection... (213950 times) [2024-06-25 12:06:25,837][15401] InferenceWorker_p0-w0: resuming experience collection (213950 times) [2024-06-25 12:06:25,841][15401] Updated weights for policy 0, policy_version 882004 (0.0035) [2024-06-25 12:06:28,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14450868224. Throughput: 0: 42939.9. Samples: 14450997120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:28,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 12:06:29,546][15401] Updated weights for policy 0, policy_version 882014 (0.0035) [2024-06-25 12:06:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14451081216. Throughput: 0: 43028.6. Samples: 14451254140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:33,396][15401] Updated weights for policy 0, policy_version 882024 (0.0039) [2024-06-25 12:06:33,399][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 12:06:37,120][15401] Updated weights for policy 0, policy_version 882034 (0.0037) [2024-06-25 12:06:38,396][15132] Fps is (10 sec: 45856.9, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 14451326976. Throughput: 0: 42969.6. Samples: 14451380360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:38,396][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 12:06:40,973][15401] Updated weights for policy 0, policy_version 882044 (0.0048) [2024-06-25 12:06:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14451507200. Throughput: 0: 42721.8. Samples: 14451632840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 12:06:44,706][15401] Updated weights for policy 0, policy_version 882054 (0.0039) [2024-06-25 12:06:48,390][15132] Fps is (10 sec: 39346.3, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 14451720192. Throughput: 0: 42634.6. Samples: 14451884400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 12:06:48,983][15401] Updated weights for policy 0, policy_version 882064 (0.0028) [2024-06-25 12:06:52,538][15401] Updated weights for policy 0, policy_version 882074 (0.0045) [2024-06-25 12:06:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14451965952. Throughput: 0: 42719.5. Samples: 14452016620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 12:06:56,492][15401] Updated weights for policy 0, policy_version 882084 (0.0028) [2024-06-25 12:06:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14452146176. Throughput: 0: 42733.9. Samples: 14452279140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:06:58,390][15132] Avg episode reward: [(0, '0.217')] [2024-06-25 12:06:59,980][15401] Updated weights for policy 0, policy_version 882094 (0.0045) [2024-06-25 12:07:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 14452375552. Throughput: 0: 42587.5. Samples: 14452529120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 12:07:04,214][15401] Updated weights for policy 0, policy_version 882104 (0.0028) [2024-06-25 12:07:07,394][15401] Updated weights for policy 0, policy_version 882114 (0.0052) [2024-06-25 12:07:08,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 14452588544. Throughput: 0: 42743.1. Samples: 14452662380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:08,392][15132] Avg episode reward: [(0, '0.337')] [2024-06-25 12:07:11,584][15401] Updated weights for policy 0, policy_version 882124 (0.0029) [2024-06-25 12:07:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14452785152. Throughput: 0: 42827.6. Samples: 14452924260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:13,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 12:07:14,923][15401] Updated weights for policy 0, policy_version 882134 (0.0033) [2024-06-25 12:07:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14452998144. Throughput: 0: 42693.9. Samples: 14453175360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 12:07:19,750][15401] Updated weights for policy 0, policy_version 882144 (0.0048) [2024-06-25 12:07:22,710][15401] Updated weights for policy 0, policy_version 882154 (0.0037) [2024-06-25 12:07:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14453243904. Throughput: 0: 42767.4. Samples: 14453304620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:23,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 12:07:27,299][15401] Updated weights for policy 0, policy_version 882164 (0.0044) [2024-06-25 12:07:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14453424128. Throughput: 0: 42932.9. Samples: 14453564820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:28,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 12:07:30,555][15401] Updated weights for policy 0, policy_version 882174 (0.0032) [2024-06-25 12:07:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 14453653504. Throughput: 0: 42842.7. Samples: 14453812420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 26.0) [2024-06-25 12:07:33,393][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 12:07:34,752][15401] Updated weights for policy 0, policy_version 882184 (0.0029) [2024-06-25 12:07:38,389][15401] Updated weights for policy 0, policy_version 882194 (0.0027) [2024-06-25 12:07:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 14453866496. Throughput: 0: 42881.0. Samples: 14453946260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:07:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 12:07:42,468][15401] Updated weights for policy 0, policy_version 882204 (0.0032) [2024-06-25 12:07:43,391][15132] Fps is (10 sec: 40965.3, 60 sec: 42597.6, 300 sec: 42820.4). Total num frames: 14454063104. Throughput: 0: 42857.1. Samples: 14454207760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:07:43,391][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 12:07:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882207_14454079488.pth... [2024-06-25 12:07:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881581_14443823104.pth [2024-06-25 12:07:45,893][15401] Updated weights for policy 0, policy_version 882214 (0.0049) [2024-06-25 12:07:46,646][15349] Signal inference workers to stop experience collection... (214000 times) [2024-06-25 12:07:46,696][15401] InferenceWorker_p0-w0: stopping experience collection (214000 times) [2024-06-25 12:07:46,759][15349] Signal inference workers to resume experience collection... (214000 times) [2024-06-25 12:07:46,760][15401] InferenceWorker_p0-w0: resuming experience collection (214000 times) [2024-06-25 12:07:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43142.9, 300 sec: 42820.6). Total num frames: 14454308864. Throughput: 0: 42904.7. Samples: 14454459940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:07:48,392][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 12:07:49,920][15401] Updated weights for policy 0, policy_version 882224 (0.0026) [2024-06-25 12:07:53,228][15401] Updated weights for policy 0, policy_version 882234 (0.0037) [2024-06-25 12:07:53,389][15132] Fps is (10 sec: 45880.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14454521856. Throughput: 0: 42973.5. Samples: 14454596080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:07:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 12:07:57,791][15401] Updated weights for policy 0, policy_version 882244 (0.0041) [2024-06-25 12:07:58,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14454702080. Throughput: 0: 42898.3. Samples: 14454854680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:07:58,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 12:08:00,835][15401] Updated weights for policy 0, policy_version 882254 (0.0028) [2024-06-25 12:08:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14454931456. Throughput: 0: 42862.2. Samples: 14455104160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 12:08:05,519][15401] Updated weights for policy 0, policy_version 882264 (0.0035) [2024-06-25 12:08:08,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 14455177216. Throughput: 0: 43031.0. Samples: 14455241120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:08,393][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 12:08:08,403][15401] Updated weights for policy 0, policy_version 882274 (0.0039) [2024-06-25 12:08:12,920][15401] Updated weights for policy 0, policy_version 882284 (0.0028) [2024-06-25 12:08:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42876.0). Total num frames: 14455357440. Throughput: 0: 43036.8. Samples: 14455501580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:13,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 12:08:16,041][15401] Updated weights for policy 0, policy_version 882294 (0.0039) [2024-06-25 12:08:18,396][15132] Fps is (10 sec: 40943.9, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 14455586816. Throughput: 0: 43144.6. Samples: 14455754100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:18,396][15132] Avg episode reward: [(0, '0.187')] [2024-06-25 12:08:20,530][15401] Updated weights for policy 0, policy_version 882304 (0.0029) [2024-06-25 12:08:23,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14455816192. Throughput: 0: 43113.8. Samples: 14455886380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 12:08:23,652][15401] Updated weights for policy 0, policy_version 882314 (0.0034) [2024-06-25 12:08:28,266][15401] Updated weights for policy 0, policy_version 882324 (0.0032) [2024-06-25 12:08:28,390][15132] Fps is (10 sec: 40985.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14455996416. Throughput: 0: 43060.9. Samples: 14456145460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 12:08:31,211][15401] Updated weights for policy 0, policy_version 882334 (0.0049) [2024-06-25 12:08:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 14456242176. Throughput: 0: 43076.2. Samples: 14456398260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:33,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 12:08:35,804][15401] Updated weights for policy 0, policy_version 882344 (0.0036) [2024-06-25 12:08:38,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14456455168. Throughput: 0: 43008.4. Samples: 14456531460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:38,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 12:08:39,104][15401] Updated weights for policy 0, policy_version 882354 (0.0036) [2024-06-25 12:08:43,199][15401] Updated weights for policy 0, policy_version 882364 (0.0028) [2024-06-25 12:08:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43145.3, 300 sec: 42876.1). Total num frames: 14456651776. Throughput: 0: 42947.6. Samples: 14456787320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:43,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 12:08:46,767][15401] Updated weights for policy 0, policy_version 882374 (0.0031) [2024-06-25 12:08:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 14456881152. Throughput: 0: 43077.5. Samples: 14457042640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:08:50,770][15401] Updated weights for policy 0, policy_version 882384 (0.0039) [2024-06-25 12:08:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14457094144. Throughput: 0: 42813.9. Samples: 14457167640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 12:08:54,815][15401] Updated weights for policy 0, policy_version 882394 (0.0030) [2024-06-25 12:08:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 14457290752. Throughput: 0: 42835.6. Samples: 14457429080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:08:58,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 12:08:58,657][15401] Updated weights for policy 0, policy_version 882404 (0.0033) [2024-06-25 12:09:02,407][15401] Updated weights for policy 0, policy_version 882414 (0.0045) [2024-06-25 12:09:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14457536512. Throughput: 0: 42855.9. Samples: 14457682340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:09:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 12:09:06,189][15401] Updated weights for policy 0, policy_version 882424 (0.0042) [2024-06-25 12:09:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 14457733120. Throughput: 0: 42742.2. Samples: 14457809780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:09:08,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 12:09:10,328][15401] Updated weights for policy 0, policy_version 882434 (0.0042) [2024-06-25 12:09:13,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 14457913344. Throughput: 0: 42540.5. Samples: 14458059780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:09:13,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 12:09:14,068][15401] Updated weights for policy 0, policy_version 882444 (0.0026) [2024-06-25 12:09:14,448][15349] Signal inference workers to stop experience collection... (214050 times) [2024-06-25 12:09:14,479][15401] InferenceWorker_p0-w0: stopping experience collection (214050 times) [2024-06-25 12:09:14,559][15349] Signal inference workers to resume experience collection... (214050 times) [2024-06-25 12:09:14,559][15401] InferenceWorker_p0-w0: resuming experience collection (214050 times) [2024-06-25 12:09:17,918][15401] Updated weights for policy 0, policy_version 882454 (0.0037) [2024-06-25 12:09:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 14458142720. Throughput: 0: 42779.2. Samples: 14458323320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:09:18,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-25 12:09:21,588][15401] Updated weights for policy 0, policy_version 882464 (0.0030) [2024-06-25 12:09:23,391][15132] Fps is (10 sec: 47508.2, 60 sec: 42870.6, 300 sec: 42931.4). Total num frames: 14458388480. Throughput: 0: 42599.3. Samples: 14458448480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 12:09:23,391][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 12:09:25,523][15401] Updated weights for policy 0, policy_version 882474 (0.0049) [2024-06-25 12:09:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14458568704. Throughput: 0: 42647.1. Samples: 14458706440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 12:09:29,275][15401] Updated weights for policy 0, policy_version 882484 (0.0037) [2024-06-25 12:09:33,070][15401] Updated weights for policy 0, policy_version 882494 (0.0040) [2024-06-25 12:09:33,390][15132] Fps is (10 sec: 40964.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14458798080. Throughput: 0: 42606.5. Samples: 14458959940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 12:09:36,993][15401] Updated weights for policy 0, policy_version 882504 (0.0034) [2024-06-25 12:09:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 14459027456. Throughput: 0: 42843.4. Samples: 14459095600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 12:09:40,743][15401] Updated weights for policy 0, policy_version 882514 (0.0040) [2024-06-25 12:09:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14459207680. Throughput: 0: 42503.6. Samples: 14459341740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 12:09:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882520_14459207680.pth... [2024-06-25 12:09:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000881893_14448934912.pth [2024-06-25 12:09:44,605][15401] Updated weights for policy 0, policy_version 882524 (0.0034) [2024-06-25 12:09:48,349][15401] Updated weights for policy 0, policy_version 882534 (0.0045) [2024-06-25 12:09:48,391][15132] Fps is (10 sec: 40952.8, 60 sec: 42597.0, 300 sec: 42764.7). Total num frames: 14459437056. Throughput: 0: 42533.4. Samples: 14459596420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:48,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 12:09:52,186][15401] Updated weights for policy 0, policy_version 882544 (0.0041) [2024-06-25 12:09:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14459650048. Throughput: 0: 42588.4. Samples: 14459726260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:53,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 12:09:56,483][15401] Updated weights for policy 0, policy_version 882554 (0.0037) [2024-06-25 12:09:58,390][15132] Fps is (10 sec: 40967.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14459846656. Throughput: 0: 42757.3. Samples: 14459983860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:09:58,390][15132] Avg episode reward: [(0, '0.834')] [2024-06-25 12:09:59,761][15401] Updated weights for policy 0, policy_version 882564 (0.0030) [2024-06-25 12:10:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14460076032. Throughput: 0: 42406.5. Samples: 14460231620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 12:10:04,264][15401] Updated weights for policy 0, policy_version 882574 (0.0038) [2024-06-25 12:10:07,325][15401] Updated weights for policy 0, policy_version 882584 (0.0031) [2024-06-25 12:10:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14460289024. Throughput: 0: 42653.1. Samples: 14460367820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 12:10:12,009][15401] Updated weights for policy 0, policy_version 882594 (0.0042) [2024-06-25 12:10:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 14460469248. Throughput: 0: 42619.2. Samples: 14460624300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 12:10:15,095][15401] Updated weights for policy 0, policy_version 882604 (0.0031) [2024-06-25 12:10:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14460715008. Throughput: 0: 42515.2. Samples: 14460873120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:18,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 12:10:19,858][15401] Updated weights for policy 0, policy_version 882614 (0.0041) [2024-06-25 12:10:22,639][15401] Updated weights for policy 0, policy_version 882624 (0.0037) [2024-06-25 12:10:23,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42599.2, 300 sec: 42931.6). Total num frames: 14460944384. Throughput: 0: 42625.8. Samples: 14461013760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 12:10:27,183][15401] Updated weights for policy 0, policy_version 882634 (0.0031) [2024-06-25 12:10:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14461124608. Throughput: 0: 42957.2. Samples: 14461274820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:28,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 12:10:29,363][15349] Signal inference workers to stop experience collection... (214100 times) [2024-06-25 12:10:29,363][15349] Signal inference workers to resume experience collection... (214100 times) [2024-06-25 12:10:29,399][15401] InferenceWorker_p0-w0: stopping experience collection (214100 times) [2024-06-25 12:10:29,400][15401] InferenceWorker_p0-w0: resuming experience collection (214100 times) [2024-06-25 12:10:30,168][15401] Updated weights for policy 0, policy_version 882644 (0.0044) [2024-06-25 12:10:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14461353984. Throughput: 0: 42785.8. Samples: 14461521700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:33,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 12:10:34,541][15401] Updated weights for policy 0, policy_version 882654 (0.0037) [2024-06-25 12:10:37,897][15401] Updated weights for policy 0, policy_version 882664 (0.0031) [2024-06-25 12:10:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14461599744. Throughput: 0: 42976.9. Samples: 14461660220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:38,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 12:10:42,017][15401] Updated weights for policy 0, policy_version 882674 (0.0051) [2024-06-25 12:10:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 14461747200. Throughput: 0: 42814.1. Samples: 14461910500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 12:10:45,448][15401] Updated weights for policy 0, policy_version 882684 (0.0034) [2024-06-25 12:10:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 14462009344. Throughput: 0: 43065.3. Samples: 14462169560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 12:10:49,464][15401] Updated weights for policy 0, policy_version 882694 (0.0044) [2024-06-25 12:10:52,848][15401] Updated weights for policy 0, policy_version 882704 (0.0036) [2024-06-25 12:10:53,389][15132] Fps is (10 sec: 50791.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 14462255104. Throughput: 0: 43092.9. Samples: 14462307000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 12:10:57,053][15401] Updated weights for policy 0, policy_version 882714 (0.0047) [2024-06-25 12:10:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14462402560. Throughput: 0: 42834.2. Samples: 14462551840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:10:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 12:11:00,795][15401] Updated weights for policy 0, policy_version 882724 (0.0038) [2024-06-25 12:11:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14462664704. Throughput: 0: 42985.8. Samples: 14462807480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:11:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 12:11:05,603][15401] Updated weights for policy 0, policy_version 882734 (0.0035) [2024-06-25 12:11:08,372][15401] Updated weights for policy 0, policy_version 882744 (0.0047) [2024-06-25 12:11:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14462877696. Throughput: 0: 42916.5. Samples: 14462945000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:11:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 12:11:13,219][15401] Updated weights for policy 0, policy_version 882754 (0.0041) [2024-06-25 12:11:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14463057920. Throughput: 0: 42755.2. Samples: 14463198800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:11:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 12:11:16,122][15401] Updated weights for policy 0, policy_version 882764 (0.0026) [2024-06-25 12:11:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 14463320064. Throughput: 0: 42928.3. Samples: 14463453580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-25 12:11:18,392][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 12:11:20,549][15401] Updated weights for policy 0, policy_version 882774 (0.0045) [2024-06-25 12:11:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 14463516672. Throughput: 0: 43067.6. Samples: 14463598260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 12:11:23,461][15401] Updated weights for policy 0, policy_version 882784 (0.0029) [2024-06-25 12:11:27,927][15401] Updated weights for policy 0, policy_version 882794 (0.0045) [2024-06-25 12:11:28,390][15132] Fps is (10 sec: 39330.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14463713280. Throughput: 0: 43203.2. Samples: 14463854640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:28,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 12:11:31,287][15401] Updated weights for policy 0, policy_version 882804 (0.0034) [2024-06-25 12:11:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42821.5). Total num frames: 14463959040. Throughput: 0: 42950.2. Samples: 14464102320. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:33,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 12:11:35,521][15401] Updated weights for policy 0, policy_version 882814 (0.0040) [2024-06-25 12:11:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14464172032. Throughput: 0: 42954.7. Samples: 14464239960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 12:11:38,974][15401] Updated weights for policy 0, policy_version 882824 (0.0029) [2024-06-25 12:11:43,367][15401] Updated weights for policy 0, policy_version 882834 (0.0029) [2024-06-25 12:11:43,392][15132] Fps is (10 sec: 39312.4, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 14464352256. Throughput: 0: 43215.8. Samples: 14464496660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:43,393][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 12:11:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882834_14464352256.pth... [2024-06-25 12:11:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882207_14454079488.pth [2024-06-25 12:11:46,606][15401] Updated weights for policy 0, policy_version 882844 (0.0035) [2024-06-25 12:11:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14464598016. Throughput: 0: 43109.8. Samples: 14464747420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:48,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 12:11:50,921][15401] Updated weights for policy 0, policy_version 882854 (0.0033) [2024-06-25 12:11:53,390][15132] Fps is (10 sec: 45885.0, 60 sec: 42598.2, 300 sec: 42931.6). Total num frames: 14464811008. Throughput: 0: 43128.5. Samples: 14464885800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 12:11:53,911][15401] Updated weights for policy 0, policy_version 882864 (0.0040) [2024-06-25 12:11:55,208][15349] Signal inference workers to stop experience collection... (214150 times) [2024-06-25 12:11:55,247][15401] InferenceWorker_p0-w0: stopping experience collection (214150 times) [2024-06-25 12:11:55,265][15349] Signal inference workers to resume experience collection... (214150 times) [2024-06-25 12:11:55,266][15401] InferenceWorker_p0-w0: resuming experience collection (214150 times) [2024-06-25 12:11:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14464991232. Throughput: 0: 43278.6. Samples: 14465146340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:11:58,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 12:11:58,682][15401] Updated weights for policy 0, policy_version 882874 (0.0023) [2024-06-25 12:12:01,321][15401] Updated weights for policy 0, policy_version 882884 (0.0026) [2024-06-25 12:12:03,392][15132] Fps is (10 sec: 44229.1, 60 sec: 43143.1, 300 sec: 42931.7). Total num frames: 14465253376. Throughput: 0: 43190.5. Samples: 14465397140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:03,392][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 12:12:06,194][15401] Updated weights for policy 0, policy_version 882894 (0.0037) [2024-06-25 12:12:08,392][15132] Fps is (10 sec: 47504.1, 60 sec: 43143.0, 300 sec: 42986.9). Total num frames: 14465466368. Throughput: 0: 43156.2. Samples: 14465540380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:08,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 12:12:08,775][15401] Updated weights for policy 0, policy_version 882904 (0.0031) [2024-06-25 12:12:13,389][15132] Fps is (10 sec: 37691.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14465630208. Throughput: 0: 43047.7. Samples: 14465791780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 12:12:13,841][15401] Updated weights for policy 0, policy_version 882914 (0.0033) [2024-06-25 12:12:16,361][15401] Updated weights for policy 0, policy_version 882924 (0.0034) [2024-06-25 12:12:18,389][15132] Fps is (10 sec: 40968.7, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14465875968. Throughput: 0: 43216.2. Samples: 14466047040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 12:12:21,323][15401] Updated weights for policy 0, policy_version 882934 (0.0033) [2024-06-25 12:12:23,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 14466105344. Throughput: 0: 43154.1. Samples: 14466181900. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 12:12:23,948][15401] Updated weights for policy 0, policy_version 882944 (0.0028) [2024-06-25 12:12:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14466285568. Throughput: 0: 43048.6. Samples: 14466433740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 12:12:28,810][15401] Updated weights for policy 0, policy_version 882954 (0.0035) [2024-06-25 12:12:32,075][15401] Updated weights for policy 0, policy_version 882964 (0.0032) [2024-06-25 12:12:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 14466531328. Throughput: 0: 43149.3. Samples: 14466689140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 12:12:36,353][15401] Updated weights for policy 0, policy_version 882974 (0.0054) [2024-06-25 12:12:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42931.4). Total num frames: 14466727936. Throughput: 0: 43005.5. Samples: 14466821140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:38,393][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 12:12:39,680][15401] Updated weights for policy 0, policy_version 882984 (0.0049) [2024-06-25 12:12:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42873.2, 300 sec: 42765.4). Total num frames: 14466924544. Throughput: 0: 42664.1. Samples: 14467066220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 12:12:44,409][15401] Updated weights for policy 0, policy_version 882994 (0.0030) [2024-06-25 12:12:47,437][15401] Updated weights for policy 0, policy_version 883004 (0.0039) [2024-06-25 12:12:48,392][15132] Fps is (10 sec: 44236.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 14467170304. Throughput: 0: 42698.7. Samples: 14467318600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:48,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 12:12:52,177][15401] Updated weights for policy 0, policy_version 883014 (0.0039) [2024-06-25 12:12:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.7, 300 sec: 42931.6). Total num frames: 14467366912. Throughput: 0: 42523.8. Samples: 14467453860. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:53,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 12:12:55,304][15401] Updated weights for policy 0, policy_version 883024 (0.0029) [2024-06-25 12:12:58,389][15132] Fps is (10 sec: 40970.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14467579904. Throughput: 0: 42469.8. Samples: 14467702920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:12:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 12:12:59,786][15401] Updated weights for policy 0, policy_version 883034 (0.0041) [2024-06-25 12:13:03,337][15401] Updated weights for policy 0, policy_version 883044 (0.0035) [2024-06-25 12:13:03,390][15132] Fps is (10 sec: 42595.1, 60 sec: 42326.3, 300 sec: 42765.3). Total num frames: 14467792896. Throughput: 0: 42580.2. Samples: 14467963180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:13:03,391][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 12:13:03,745][15349] Signal inference workers to stop experience collection... (214200 times) [2024-06-25 12:13:03,791][15401] InferenceWorker_p0-w0: stopping experience collection (214200 times) [2024-06-25 12:13:03,800][15349] Signal inference workers to resume experience collection... (214200 times) [2024-06-25 12:13:03,807][15401] InferenceWorker_p0-w0: resuming experience collection (214200 times) [2024-06-25 12:13:07,245][15401] Updated weights for policy 0, policy_version 883054 (0.0034) [2024-06-25 12:13:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42053.7, 300 sec: 42820.9). Total num frames: 14467989504. Throughput: 0: 42370.3. Samples: 14468088560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:13:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 12:13:11,225][15401] Updated weights for policy 0, policy_version 883064 (0.0050) [2024-06-25 12:13:13,392][15132] Fps is (10 sec: 42591.1, 60 sec: 43142.7, 300 sec: 42821.1). Total num frames: 14468218880. Throughput: 0: 42313.7. Samples: 14468337960. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-25 12:13:13,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 12:13:15,119][15401] Updated weights for policy 0, policy_version 883074 (0.0030) [2024-06-25 12:13:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14468431872. Throughput: 0: 42331.9. Samples: 14468594080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 12:13:18,691][15401] Updated weights for policy 0, policy_version 883084 (0.0032) [2024-06-25 12:13:22,608][15401] Updated weights for policy 0, policy_version 883094 (0.0043) [2024-06-25 12:13:23,391][15132] Fps is (10 sec: 42603.2, 60 sec: 42324.5, 300 sec: 42875.9). Total num frames: 14468644864. Throughput: 0: 42342.8. Samples: 14468726520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:23,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 12:13:26,244][15401] Updated weights for policy 0, policy_version 883104 (0.0033) [2024-06-25 12:13:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14468874240. Throughput: 0: 42556.5. Samples: 14468981260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 12:13:30,368][15401] Updated weights for policy 0, policy_version 883114 (0.0044) [2024-06-25 12:13:33,390][15132] Fps is (10 sec: 42603.0, 60 sec: 42325.1, 300 sec: 42765.0). Total num frames: 14469070848. Throughput: 0: 42557.7. Samples: 14469233600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:33,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 12:13:33,825][15401] Updated weights for policy 0, policy_version 883124 (0.0047) [2024-06-25 12:13:38,165][15401] Updated weights for policy 0, policy_version 883134 (0.0039) [2024-06-25 12:13:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 14469267456. Throughput: 0: 42318.1. Samples: 14469358180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:38,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 12:13:41,584][15401] Updated weights for policy 0, policy_version 883144 (0.0027) [2024-06-25 12:13:43,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14469496832. Throughput: 0: 42478.0. Samples: 14469614440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 12:13:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883149_14469513216.pth... [2024-06-25 12:13:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882520_14459207680.pth [2024-06-25 12:13:45,980][15401] Updated weights for policy 0, policy_version 883154 (0.0037) [2024-06-25 12:13:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42327.2, 300 sec: 42765.0). Total num frames: 14469709824. Throughput: 0: 42376.8. Samples: 14469870100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 12:13:49,397][15401] Updated weights for policy 0, policy_version 883164 (0.0033) [2024-06-25 12:13:53,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14469906432. Throughput: 0: 42448.1. Samples: 14469998720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:53,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 12:13:53,512][15401] Updated weights for policy 0, policy_version 883174 (0.0025) [2024-06-25 12:13:57,144][15401] Updated weights for policy 0, policy_version 883184 (0.0047) [2024-06-25 12:13:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14470152192. Throughput: 0: 42711.1. Samples: 14470259860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:13:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 12:14:00,903][15401] Updated weights for policy 0, policy_version 883194 (0.0021) [2024-06-25 12:14:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.9, 300 sec: 42709.5). Total num frames: 14470332416. Throughput: 0: 42753.4. Samples: 14470517980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 12:14:05,088][15401] Updated weights for policy 0, policy_version 883204 (0.0037) [2024-06-25 12:14:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14470561792. Throughput: 0: 42508.7. Samples: 14470639360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 12:14:08,401][15401] Updated weights for policy 0, policy_version 883214 (0.0034) [2024-06-25 12:14:12,697][15401] Updated weights for policy 0, policy_version 883224 (0.0041) [2024-06-25 12:14:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 14470774784. Throughput: 0: 42602.5. Samples: 14470898380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 12:14:16,193][15401] Updated weights for policy 0, policy_version 883234 (0.0046) [2024-06-25 12:14:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 14470987776. Throughput: 0: 42675.3. Samples: 14471153980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:18,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 12:14:20,453][15401] Updated weights for policy 0, policy_version 883244 (0.0032) [2024-06-25 12:14:23,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42326.3, 300 sec: 42765.0). Total num frames: 14471184384. Throughput: 0: 42705.9. Samples: 14471279940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:23,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 12:14:23,979][15401] Updated weights for policy 0, policy_version 883254 (0.0034) [2024-06-25 12:14:28,000][15401] Updated weights for policy 0, policy_version 883264 (0.0033) [2024-06-25 12:14:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14471413760. Throughput: 0: 42811.3. Samples: 14471540940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 12:14:31,798][15401] Updated weights for policy 0, policy_version 883274 (0.0040) [2024-06-25 12:14:33,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14471643136. Throughput: 0: 42699.0. Samples: 14471791560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 12:14:35,556][15401] Updated weights for policy 0, policy_version 883284 (0.0026) [2024-06-25 12:14:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14471839744. Throughput: 0: 42734.2. Samples: 14471921760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:38,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 12:14:39,429][15401] Updated weights for policy 0, policy_version 883294 (0.0031) [2024-06-25 12:14:43,306][15401] Updated weights for policy 0, policy_version 883304 (0.0029) [2024-06-25 12:14:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.3). Total num frames: 14472052736. Throughput: 0: 42721.8. Samples: 14472182340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:43,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 12:14:43,743][15349] Signal inference workers to stop experience collection... (214250 times) [2024-06-25 12:14:43,792][15401] InferenceWorker_p0-w0: stopping experience collection (214250 times) [2024-06-25 12:14:43,864][15349] Signal inference workers to resume experience collection... (214250 times) [2024-06-25 12:14:43,864][15401] InferenceWorker_p0-w0: resuming experience collection (214250 times) [2024-06-25 12:14:47,057][15401] Updated weights for policy 0, policy_version 883314 (0.0040) [2024-06-25 12:14:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14472282112. Throughput: 0: 42312.4. Samples: 14472422040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:48,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 12:14:51,318][15401] Updated weights for policy 0, policy_version 883324 (0.0040) [2024-06-25 12:14:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14472478720. Throughput: 0: 42686.8. Samples: 14472560260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:53,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 12:14:54,815][15401] Updated weights for policy 0, policy_version 883334 (0.0042) [2024-06-25 12:14:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14472675328. Throughput: 0: 42631.6. Samples: 14472816800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:14:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 12:14:58,725][15401] Updated weights for policy 0, policy_version 883344 (0.0036) [2024-06-25 12:15:02,426][15401] Updated weights for policy 0, policy_version 883354 (0.0038) [2024-06-25 12:15:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14472921088. Throughput: 0: 42580.4. Samples: 14473070100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 12:15:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 12:15:06,345][15401] Updated weights for policy 0, policy_version 883364 (0.0048) [2024-06-25 12:15:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14473117696. Throughput: 0: 42761.5. Samples: 14473204220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:08,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 12:15:10,072][15401] Updated weights for policy 0, policy_version 883374 (0.0030) [2024-06-25 12:15:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 14473297920. Throughput: 0: 42628.9. Samples: 14473459240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:13,390][15132] Avg episode reward: [(0, '0.852')] [2024-06-25 12:15:14,112][15401] Updated weights for policy 0, policy_version 883384 (0.0029) [2024-06-25 12:15:17,507][15401] Updated weights for policy 0, policy_version 883394 (0.0041) [2024-06-25 12:15:18,389][15132] Fps is (10 sec: 45876.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14473576448. Throughput: 0: 42673.4. Samples: 14473711860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:18,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 12:15:21,967][15401] Updated weights for policy 0, policy_version 883404 (0.0033) [2024-06-25 12:15:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 14473756672. Throughput: 0: 42884.8. Samples: 14473851580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:23,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 12:15:25,099][15401] Updated weights for policy 0, policy_version 883414 (0.0030) [2024-06-25 12:15:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14473969664. Throughput: 0: 42589.4. Samples: 14474098860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 12:15:29,541][15401] Updated weights for policy 0, policy_version 883424 (0.0034) [2024-06-25 12:15:32,749][15401] Updated weights for policy 0, policy_version 883434 (0.0041) [2024-06-25 12:15:33,392][15132] Fps is (10 sec: 45865.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14474215424. Throughput: 0: 43036.0. Samples: 14474358760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:33,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 12:15:37,318][15401] Updated weights for policy 0, policy_version 883444 (0.0031) [2024-06-25 12:15:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14474412032. Throughput: 0: 42945.2. Samples: 14474492800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 12:15:40,348][15401] Updated weights for policy 0, policy_version 883454 (0.0041) [2024-06-25 12:15:43,392][15132] Fps is (10 sec: 39321.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 14474608640. Throughput: 0: 42708.4. Samples: 14474738780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:43,393][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 12:15:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883460_14474608640.pth... [2024-06-25 12:15:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000882834_14464352256.pth [2024-06-25 12:15:45,042][15401] Updated weights for policy 0, policy_version 883464 (0.0038) [2024-06-25 12:15:47,949][15401] Updated weights for policy 0, policy_version 883474 (0.0039) [2024-06-25 12:15:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14474854400. Throughput: 0: 42728.0. Samples: 14474992860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:48,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 12:15:52,791][15401] Updated weights for policy 0, policy_version 883484 (0.0042) [2024-06-25 12:15:53,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14475034624. Throughput: 0: 42720.5. Samples: 14475126640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:53,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-25 12:15:55,699][15401] Updated weights for policy 0, policy_version 883494 (0.0046) [2024-06-25 12:15:58,393][15132] Fps is (10 sec: 40943.7, 60 sec: 43141.7, 300 sec: 42708.9). Total num frames: 14475264000. Throughput: 0: 42581.5. Samples: 14475375580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:15:58,394][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 12:16:00,490][15401] Updated weights for policy 0, policy_version 883504 (0.0026) [2024-06-25 12:16:03,280][15401] Updated weights for policy 0, policy_version 883514 (0.0037) [2024-06-25 12:16:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14475493376. Throughput: 0: 42766.6. Samples: 14475636360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:03,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 12:16:07,985][15401] Updated weights for policy 0, policy_version 883524 (0.0040) [2024-06-25 12:16:08,389][15132] Fps is (10 sec: 40976.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14475673600. Throughput: 0: 42609.0. Samples: 14475768980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:08,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 12:16:09,001][15349] Signal inference workers to stop experience collection... (214300 times) [2024-06-25 12:16:09,003][15349] Signal inference workers to resume experience collection... (214300 times) [2024-06-25 12:16:09,018][15401] InferenceWorker_p0-w0: stopping experience collection (214300 times) [2024-06-25 12:16:09,019][15401] InferenceWorker_p0-w0: resuming experience collection (214300 times) [2024-06-25 12:16:11,014][15401] Updated weights for policy 0, policy_version 883534 (0.0051) [2024-06-25 12:16:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 42709.8). Total num frames: 14475919360. Throughput: 0: 42800.8. Samples: 14476024900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:13,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 12:16:15,485][15401] Updated weights for policy 0, policy_version 883544 (0.0048) [2024-06-25 12:16:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14476132352. Throughput: 0: 42680.0. Samples: 14476279260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 12:16:18,575][15401] Updated weights for policy 0, policy_version 883554 (0.0034) [2024-06-25 12:16:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 14476296192. Throughput: 0: 42619.7. Samples: 14476410680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:23,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 12:16:23,401][15401] Updated weights for policy 0, policy_version 883564 (0.0036) [2024-06-25 12:16:26,130][15401] Updated weights for policy 0, policy_version 883574 (0.0029) [2024-06-25 12:16:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14476558336. Throughput: 0: 42798.2. Samples: 14476664600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:28,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 12:16:30,923][15401] Updated weights for policy 0, policy_version 883584 (0.0043) [2024-06-25 12:16:33,389][15132] Fps is (10 sec: 47513.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 14476771328. Throughput: 0: 42952.9. Samples: 14476925740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:33,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 12:16:33,975][15401] Updated weights for policy 0, policy_version 883594 (0.0035) [2024-06-25 12:16:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 14476951552. Throughput: 0: 42765.9. Samples: 14477051100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:38,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 12:16:38,486][15401] Updated weights for policy 0, policy_version 883604 (0.0040) [2024-06-25 12:16:41,471][15401] Updated weights for policy 0, policy_version 883614 (0.0029) [2024-06-25 12:16:43,396][15132] Fps is (10 sec: 42570.9, 60 sec: 43141.6, 300 sec: 42708.5). Total num frames: 14477197312. Throughput: 0: 42778.5. Samples: 14477300720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:43,396][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 12:16:46,201][15401] Updated weights for policy 0, policy_version 883624 (0.0035) [2024-06-25 12:16:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14477410304. Throughput: 0: 42834.7. Samples: 14477563920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:48,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 12:16:48,931][15401] Updated weights for policy 0, policy_version 883634 (0.0030) [2024-06-25 12:16:53,392][15132] Fps is (10 sec: 39337.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 14477590528. Throughput: 0: 42612.9. Samples: 14477686660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:53,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 12:16:53,983][15401] Updated weights for policy 0, policy_version 883644 (0.0029) [2024-06-25 12:16:56,738][15401] Updated weights for policy 0, policy_version 883654 (0.0028) [2024-06-25 12:16:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43145.6, 300 sec: 42709.4). Total num frames: 14477852672. Throughput: 0: 42524.4. Samples: 14477938600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 12:16:58,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 12:17:02,086][15401] Updated weights for policy 0, policy_version 883664 (0.0033) [2024-06-25 12:17:03,389][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 14478032896. Throughput: 0: 42708.8. Samples: 14478201160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:03,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 12:17:04,438][15401] Updated weights for policy 0, policy_version 883674 (0.0033) [2024-06-25 12:17:08,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14478229504. Throughput: 0: 42360.4. Samples: 14478316900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 12:17:09,923][15401] Updated weights for policy 0, policy_version 883684 (0.0042) [2024-06-25 12:17:12,371][15401] Updated weights for policy 0, policy_version 883694 (0.0042) [2024-06-25 12:17:13,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14478491648. Throughput: 0: 42401.0. Samples: 14478572740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:13,392][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 12:17:17,393][15401] Updated weights for policy 0, policy_version 883704 (0.0027) [2024-06-25 12:17:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14478655488. Throughput: 0: 42507.1. Samples: 14478838560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:18,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 12:17:18,660][15349] Signal inference workers to stop experience collection... (214350 times) [2024-06-25 12:17:18,660][15349] Signal inference workers to resume experience collection... (214350 times) [2024-06-25 12:17:18,705][15401] InferenceWorker_p0-w0: stopping experience collection (214350 times) [2024-06-25 12:17:18,705][15401] InferenceWorker_p0-w0: resuming experience collection (214350 times) [2024-06-25 12:17:20,195][15401] Updated weights for policy 0, policy_version 883714 (0.0035) [2024-06-25 12:17:23,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14478868480. Throughput: 0: 42323.4. Samples: 14478955660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:23,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 12:17:24,815][15401] Updated weights for policy 0, policy_version 883724 (0.0031) [2024-06-25 12:17:27,881][15401] Updated weights for policy 0, policy_version 883734 (0.0028) [2024-06-25 12:17:28,390][15132] Fps is (10 sec: 45871.1, 60 sec: 42597.9, 300 sec: 42653.8). Total num frames: 14479114240. Throughput: 0: 42599.5. Samples: 14479217460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:28,391][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 12:17:32,422][15401] Updated weights for policy 0, policy_version 883744 (0.0032) [2024-06-25 12:17:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 14479294464. Throughput: 0: 42752.0. Samples: 14479487760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:33,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 12:17:35,490][15401] Updated weights for policy 0, policy_version 883754 (0.0032) [2024-06-25 12:17:38,389][15132] Fps is (10 sec: 40964.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14479523840. Throughput: 0: 42661.9. Samples: 14479606340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:38,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 12:17:40,250][15401] Updated weights for policy 0, policy_version 883764 (0.0039) [2024-06-25 12:17:42,991][15401] Updated weights for policy 0, policy_version 883774 (0.0037) [2024-06-25 12:17:43,392][15132] Fps is (10 sec: 47502.0, 60 sec: 42874.3, 300 sec: 42709.5). Total num frames: 14479769600. Throughput: 0: 42874.2. Samples: 14479867940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:43,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 12:17:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883775_14479769600.pth... [2024-06-25 12:17:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883149_14469513216.pth [2024-06-25 12:17:47,721][15401] Updated weights for policy 0, policy_version 883784 (0.0034) [2024-06-25 12:17:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14479949824. Throughput: 0: 42917.7. Samples: 14480132460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 12:17:50,643][15401] Updated weights for policy 0, policy_version 883794 (0.0037) [2024-06-25 12:17:53,391][15132] Fps is (10 sec: 40964.9, 60 sec: 43145.4, 300 sec: 42709.3). Total num frames: 14480179200. Throughput: 0: 43005.0. Samples: 14480252180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:53,391][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 12:17:55,152][15401] Updated weights for policy 0, policy_version 883804 (0.0043) [2024-06-25 12:17:58,222][15401] Updated weights for policy 0, policy_version 883814 (0.0033) [2024-06-25 12:17:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42600.0, 300 sec: 42765.1). Total num frames: 14480408576. Throughput: 0: 43092.4. Samples: 14480511800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:17:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 12:18:02,640][15401] Updated weights for policy 0, policy_version 883824 (0.0037) [2024-06-25 12:18:03,389][15132] Fps is (10 sec: 40965.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14480588800. Throughput: 0: 42929.8. Samples: 14480770400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 12:18:06,113][15401] Updated weights for policy 0, policy_version 883834 (0.0033) [2024-06-25 12:18:08,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 14480801792. Throughput: 0: 42977.9. Samples: 14480889660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 12:18:10,371][15401] Updated weights for policy 0, policy_version 883844 (0.0029) [2024-06-25 12:18:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14481031168. Throughput: 0: 42998.1. Samples: 14481152340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 12:18:13,710][15401] Updated weights for policy 0, policy_version 883854 (0.0034) [2024-06-25 12:18:18,236][15401] Updated weights for policy 0, policy_version 883864 (0.0027) [2024-06-25 12:18:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.7). Total num frames: 14481244160. Throughput: 0: 42956.9. Samples: 14481420820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 12:18:21,384][15401] Updated weights for policy 0, policy_version 883874 (0.0039) [2024-06-25 12:18:23,390][15132] Fps is (10 sec: 42595.5, 60 sec: 43144.1, 300 sec: 42653.8). Total num frames: 14481457152. Throughput: 0: 42923.2. Samples: 14481537920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:23,391][15132] Avg episode reward: [(0, '0.280')] [2024-06-25 12:18:25,928][15401] Updated weights for policy 0, policy_version 883884 (0.0047) [2024-06-25 12:18:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 14481670144. Throughput: 0: 42832.1. Samples: 14481795280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:28,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 12:18:28,899][15401] Updated weights for policy 0, policy_version 883894 (0.0032) [2024-06-25 12:18:33,368][15401] Updated weights for policy 0, policy_version 883904 (0.0039) [2024-06-25 12:18:33,390][15132] Fps is (10 sec: 42601.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14481883136. Throughput: 0: 42687.6. Samples: 14482053400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:33,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 12:18:36,571][15401] Updated weights for policy 0, policy_version 883914 (0.0027) [2024-06-25 12:18:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14482112512. Throughput: 0: 42884.7. Samples: 14482181940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:38,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 12:18:40,826][15401] Updated weights for policy 0, policy_version 883924 (0.0037) [2024-06-25 12:18:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 14482309120. Throughput: 0: 42753.1. Samples: 14482435680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 12:18:44,803][15401] Updated weights for policy 0, policy_version 883934 (0.0028) [2024-06-25 12:18:45,244][15349] Signal inference workers to stop experience collection... (214400 times) [2024-06-25 12:18:45,277][15401] InferenceWorker_p0-w0: stopping experience collection (214400 times) [2024-06-25 12:18:45,305][15349] Signal inference workers to resume experience collection... (214400 times) [2024-06-25 12:18:45,305][15401] InferenceWorker_p0-w0: resuming experience collection (214400 times) [2024-06-25 12:18:48,391][15132] Fps is (10 sec: 40952.5, 60 sec: 42870.2, 300 sec: 42764.7). Total num frames: 14482522112. Throughput: 0: 42767.6. Samples: 14482695020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:48,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 12:18:48,741][15401] Updated weights for policy 0, policy_version 883944 (0.0032) [2024-06-25 12:18:52,410][15401] Updated weights for policy 0, policy_version 883954 (0.0029) [2024-06-25 12:18:53,392][15132] Fps is (10 sec: 45863.6, 60 sec: 43143.7, 300 sec: 42764.7). Total num frames: 14482767872. Throughput: 0: 42975.4. Samples: 14482823660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:53,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 12:18:56,161][15401] Updated weights for policy 0, policy_version 883964 (0.0037) [2024-06-25 12:18:58,391][15132] Fps is (10 sec: 44239.7, 60 sec: 42597.7, 300 sec: 42820.4). Total num frames: 14482964480. Throughput: 0: 42802.5. Samples: 14483078500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:18:58,391][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 12:19:00,057][15401] Updated weights for policy 0, policy_version 883974 (0.0041) [2024-06-25 12:19:03,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14483144704. Throughput: 0: 42634.1. Samples: 14483339360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 12:19:03,912][15401] Updated weights for policy 0, policy_version 883984 (0.0035) [2024-06-25 12:19:07,831][15401] Updated weights for policy 0, policy_version 883994 (0.0032) [2024-06-25 12:19:08,390][15132] Fps is (10 sec: 42603.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14483390464. Throughput: 0: 42837.6. Samples: 14483465580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 12:19:11,500][15401] Updated weights for policy 0, policy_version 884004 (0.0043) [2024-06-25 12:19:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14483587072. Throughput: 0: 42736.5. Samples: 14483718420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 12:19:15,486][15401] Updated weights for policy 0, policy_version 884014 (0.0032) [2024-06-25 12:19:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14483800064. Throughput: 0: 42757.4. Samples: 14483977480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:18,390][15132] Avg episode reward: [(0, '0.894')] [2024-06-25 12:19:19,180][15401] Updated weights for policy 0, policy_version 884024 (0.0033) [2024-06-25 12:19:23,211][15401] Updated weights for policy 0, policy_version 884034 (0.0040) [2024-06-25 12:19:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42872.0, 300 sec: 42765.0). Total num frames: 14484029440. Throughput: 0: 42625.3. Samples: 14484100080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:23,390][15132] Avg episode reward: [(0, '0.879')] [2024-06-25 12:19:26,706][15401] Updated weights for policy 0, policy_version 884044 (0.0032) [2024-06-25 12:19:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14484242432. Throughput: 0: 42764.4. Samples: 14484360080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:28,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 12:19:30,566][15401] Updated weights for policy 0, policy_version 884054 (0.0027) [2024-06-25 12:19:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14484439040. Throughput: 0: 42825.3. Samples: 14484622080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 12:19:34,272][15401] Updated weights for policy 0, policy_version 884064 (0.0034) [2024-06-25 12:19:38,129][15401] Updated weights for policy 0, policy_version 884074 (0.0036) [2024-06-25 12:19:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14484684800. Throughput: 0: 42885.8. Samples: 14484753420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 12:19:41,753][15401] Updated weights for policy 0, policy_version 884084 (0.0043) [2024-06-25 12:19:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14484881408. Throughput: 0: 42874.1. Samples: 14485007780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:43,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 12:19:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884088_14484897792.pth... [2024-06-25 12:19:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883460_14474608640.pth [2024-06-25 12:19:45,500][15401] Updated weights for policy 0, policy_version 884094 (0.0030) [2024-06-25 12:19:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42871.1, 300 sec: 42764.7). Total num frames: 14485094400. Throughput: 0: 42796.5. Samples: 14485265300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:48,401][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 12:19:49,750][15401] Updated weights for policy 0, policy_version 884104 (0.0028) [2024-06-25 12:19:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42820.6). Total num frames: 14485307392. Throughput: 0: 42787.7. Samples: 14485391020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:53,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 12:19:53,538][15401] Updated weights for policy 0, policy_version 884114 (0.0028) [2024-06-25 12:19:57,688][15401] Updated weights for policy 0, policy_version 884124 (0.0037) [2024-06-25 12:19:58,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42599.2, 300 sec: 42709.5). Total num frames: 14485520384. Throughput: 0: 43015.5. Samples: 14485654120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:19:58,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 12:20:01,074][15401] Updated weights for policy 0, policy_version 884134 (0.0037) [2024-06-25 12:20:03,390][15132] Fps is (10 sec: 44235.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14485749760. Throughput: 0: 42858.1. Samples: 14485906100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:03,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 12:20:05,332][15401] Updated weights for policy 0, policy_version 884144 (0.0044) [2024-06-25 12:20:05,891][15349] Signal inference workers to stop experience collection... (214450 times) [2024-06-25 12:20:05,891][15349] Signal inference workers to resume experience collection... (214450 times) [2024-06-25 12:20:05,914][15401] InferenceWorker_p0-w0: stopping experience collection (214450 times) [2024-06-25 12:20:05,914][15401] InferenceWorker_p0-w0: resuming experience collection (214450 times) [2024-06-25 12:20:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14485962752. Throughput: 0: 42959.5. Samples: 14486033260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 12:20:08,584][15401] Updated weights for policy 0, policy_version 884154 (0.0039) [2024-06-25 12:20:12,897][15401] Updated weights for policy 0, policy_version 884164 (0.0038) [2024-06-25 12:20:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14486192128. Throughput: 0: 42940.4. Samples: 14486292400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:13,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 12:20:16,277][15401] Updated weights for policy 0, policy_version 884174 (0.0028) [2024-06-25 12:20:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14486372352. Throughput: 0: 42841.3. Samples: 14486549940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:18,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 12:20:20,355][15401] Updated weights for policy 0, policy_version 884184 (0.0052) [2024-06-25 12:20:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14486601728. Throughput: 0: 42606.2. Samples: 14486670700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 12:20:23,854][15401] Updated weights for policy 0, policy_version 884194 (0.0032) [2024-06-25 12:20:27,912][15401] Updated weights for policy 0, policy_version 884204 (0.0043) [2024-06-25 12:20:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14486831104. Throughput: 0: 42905.4. Samples: 14486938520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 12:20:31,572][15401] Updated weights for policy 0, policy_version 884214 (0.0036) [2024-06-25 12:20:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14487011328. Throughput: 0: 42908.4. Samples: 14487196080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 12:20:35,425][15401] Updated weights for policy 0, policy_version 884224 (0.0036) [2024-06-25 12:20:38,391][15132] Fps is (10 sec: 40955.4, 60 sec: 42597.7, 300 sec: 42820.8). Total num frames: 14487240704. Throughput: 0: 42890.0. Samples: 14487321120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:38,391][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 12:20:38,970][15401] Updated weights for policy 0, policy_version 884234 (0.0026) [2024-06-25 12:20:43,063][15401] Updated weights for policy 0, policy_version 884244 (0.0044) [2024-06-25 12:20:43,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14487486464. Throughput: 0: 42872.5. Samples: 14487583380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:20:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 12:20:46,748][15401] Updated weights for policy 0, policy_version 884254 (0.0038) [2024-06-25 12:20:48,389][15132] Fps is (10 sec: 42602.8, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14487666688. Throughput: 0: 42859.7. Samples: 14487834780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:20:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 12:20:50,802][15401] Updated weights for policy 0, policy_version 884264 (0.0028) [2024-06-25 12:20:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42821.1). Total num frames: 14487896064. Throughput: 0: 42773.7. Samples: 14487958080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:20:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 12:20:54,304][15401] Updated weights for policy 0, policy_version 884274 (0.0050) [2024-06-25 12:20:58,277][15401] Updated weights for policy 0, policy_version 884284 (0.0032) [2024-06-25 12:20:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14488109056. Throughput: 0: 43003.7. Samples: 14488227560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:20:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 12:21:01,988][15401] Updated weights for policy 0, policy_version 884294 (0.0042) [2024-06-25 12:21:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14488322048. Throughput: 0: 42773.8. Samples: 14488474760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 12:21:06,336][15401] Updated weights for policy 0, policy_version 884304 (0.0019) [2024-06-25 12:21:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14488518656. Throughput: 0: 43046.7. Samples: 14488607800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 12:21:10,087][15401] Updated weights for policy 0, policy_version 884314 (0.0049) [2024-06-25 12:21:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14488715264. Throughput: 0: 42734.1. Samples: 14488861560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 12:21:13,985][15401] Updated weights for policy 0, policy_version 884324 (0.0028) [2024-06-25 12:21:17,745][15401] Updated weights for policy 0, policy_version 884334 (0.0042) [2024-06-25 12:21:18,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 14488944640. Throughput: 0: 42743.1. Samples: 14489119620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:18,393][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 12:21:21,519][15401] Updated weights for policy 0, policy_version 884344 (0.0027) [2024-06-25 12:21:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14489174016. Throughput: 0: 42964.6. Samples: 14489254480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 12:21:25,283][15401] Updated weights for policy 0, policy_version 884354 (0.0031) [2024-06-25 12:21:28,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14489387008. Throughput: 0: 42730.6. Samples: 14489506260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 12:21:29,104][15401] Updated weights for policy 0, policy_version 884364 (0.0025) [2024-06-25 12:21:32,859][15401] Updated weights for policy 0, policy_version 884374 (0.0034) [2024-06-25 12:21:33,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14489600000. Throughput: 0: 42979.6. Samples: 14489768860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 12:21:36,778][15349] Signal inference workers to stop experience collection... (214500 times) [2024-06-25 12:21:36,823][15401] InferenceWorker_p0-w0: stopping experience collection (214500 times) [2024-06-25 12:21:36,831][15349] Signal inference workers to resume experience collection... (214500 times) [2024-06-25 12:21:36,834][15401] InferenceWorker_p0-w0: resuming experience collection (214500 times) [2024-06-25 12:21:36,842][15401] Updated weights for policy 0, policy_version 884384 (0.0038) [2024-06-25 12:21:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42872.2, 300 sec: 42766.0). Total num frames: 14489812992. Throughput: 0: 43069.4. Samples: 14489896200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 12:21:40,762][15401] Updated weights for policy 0, policy_version 884394 (0.0029) [2024-06-25 12:21:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14490025984. Throughput: 0: 42815.0. Samples: 14490154240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:43,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 12:21:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884402_14490042368.pth... [2024-06-25 12:21:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000883775_14479769600.pth [2024-06-25 12:21:44,360][15401] Updated weights for policy 0, policy_version 884404 (0.0031) [2024-06-25 12:21:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 14490222592. Throughput: 0: 43022.3. Samples: 14490410760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 12:21:48,427][15401] Updated weights for policy 0, policy_version 884414 (0.0033) [2024-06-25 12:21:51,978][15401] Updated weights for policy 0, policy_version 884424 (0.0038) [2024-06-25 12:21:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 14490451968. Throughput: 0: 42855.2. Samples: 14490536280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 12:21:56,119][15401] Updated weights for policy 0, policy_version 884434 (0.0028) [2024-06-25 12:21:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14490664960. Throughput: 0: 42804.9. Samples: 14490787780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:21:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 12:22:00,204][15401] Updated weights for policy 0, policy_version 884444 (0.0037) [2024-06-25 12:22:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14490877952. Throughput: 0: 42784.1. Samples: 14491044800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 12:22:03,846][15401] Updated weights for policy 0, policy_version 884454 (0.0040) [2024-06-25 12:22:07,803][15401] Updated weights for policy 0, policy_version 884464 (0.0024) [2024-06-25 12:22:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 14491090944. Throughput: 0: 42556.2. Samples: 14491169520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 12:22:11,205][15401] Updated weights for policy 0, policy_version 884474 (0.0027) [2024-06-25 12:22:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 14491320320. Throughput: 0: 42791.2. Samples: 14491431860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 12:22:15,292][15401] Updated weights for policy 0, policy_version 884484 (0.0045) [2024-06-25 12:22:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42873.3, 300 sec: 42876.1). Total num frames: 14491516928. Throughput: 0: 42720.5. Samples: 14491691280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 12:22:18,887][15401] Updated weights for policy 0, policy_version 884494 (0.0047) [2024-06-25 12:22:23,306][15401] Updated weights for policy 0, policy_version 884504 (0.0045) [2024-06-25 12:22:23,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 14491713536. Throughput: 0: 42583.1. Samples: 14491812440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 12:22:26,480][15401] Updated weights for policy 0, policy_version 884514 (0.0042) [2024-06-25 12:22:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14491975680. Throughput: 0: 42642.2. Samples: 14492073140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 12:22:30,855][15401] Updated weights for policy 0, policy_version 884524 (0.0035) [2024-06-25 12:22:33,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14492188672. Throughput: 0: 42722.1. Samples: 14492333260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 12:22:33,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 12:22:34,161][15401] Updated weights for policy 0, policy_version 884534 (0.0029) [2024-06-25 12:22:38,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 14492352512. Throughput: 0: 42720.0. Samples: 14492458680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:22:38,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 12:22:38,763][15401] Updated weights for policy 0, policy_version 884544 (0.0023) [2024-06-25 12:22:41,740][15401] Updated weights for policy 0, policy_version 884554 (0.0036) [2024-06-25 12:22:43,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 14492614656. Throughput: 0: 42796.4. Samples: 14492713720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:22:43,393][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 12:22:46,287][15401] Updated weights for policy 0, policy_version 884564 (0.0033) [2024-06-25 12:22:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 14492811264. Throughput: 0: 42746.6. Samples: 14492968400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:22:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 12:22:49,576][15401] Updated weights for policy 0, policy_version 884574 (0.0047) [2024-06-25 12:22:53,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14493007872. Throughput: 0: 42811.3. Samples: 14493096020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:22:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 12:22:53,698][15401] Updated weights for policy 0, policy_version 884584 (0.0027) [2024-06-25 12:22:57,144][15349] Signal inference workers to stop experience collection... (214550 times) [2024-06-25 12:22:57,144][15349] Signal inference workers to resume experience collection... (214550 times) [2024-06-25 12:22:57,164][15401] InferenceWorker_p0-w0: stopping experience collection (214550 times) [2024-06-25 12:22:57,165][15401] InferenceWorker_p0-w0: resuming experience collection (214550 times) [2024-06-25 12:22:57,297][15401] Updated weights for policy 0, policy_version 884594 (0.0032) [2024-06-25 12:22:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14493237248. Throughput: 0: 42787.0. Samples: 14493357280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:22:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:23:01,222][15401] Updated weights for policy 0, policy_version 884604 (0.0035) [2024-06-25 12:23:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14493450240. Throughput: 0: 42811.0. Samples: 14493617780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 12:23:04,800][15401] Updated weights for policy 0, policy_version 884614 (0.0036) [2024-06-25 12:23:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 14493646848. Throughput: 0: 42856.1. Samples: 14493740960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:08,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 12:23:08,786][15401] Updated weights for policy 0, policy_version 884624 (0.0033) [2024-06-25 12:23:12,230][15401] Updated weights for policy 0, policy_version 884634 (0.0038) [2024-06-25 12:23:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 14493908992. Throughput: 0: 42875.9. Samples: 14494002660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:13,393][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 12:23:16,329][15401] Updated weights for policy 0, policy_version 884644 (0.0043) [2024-06-25 12:23:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 14494089216. Throughput: 0: 42891.7. Samples: 14494263380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 12:23:19,846][15401] Updated weights for policy 0, policy_version 884654 (0.0046) [2024-06-25 12:23:23,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14494285824. Throughput: 0: 42817.8. Samples: 14494385480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:23,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 12:23:24,142][15401] Updated weights for policy 0, policy_version 884664 (0.0035) [2024-06-25 12:23:27,458][15401] Updated weights for policy 0, policy_version 884674 (0.0030) [2024-06-25 12:23:28,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 14494531584. Throughput: 0: 42929.8. Samples: 14494645560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:28,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 12:23:31,789][15401] Updated weights for policy 0, policy_version 884684 (0.0040) [2024-06-25 12:23:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14494728192. Throughput: 0: 42923.1. Samples: 14494899940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:33,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 12:23:35,363][15401] Updated weights for policy 0, policy_version 884694 (0.0029) [2024-06-25 12:23:38,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14494941184. Throughput: 0: 42967.9. Samples: 14495029580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 12:23:39,662][15401] Updated weights for policy 0, policy_version 884704 (0.0040) [2024-06-25 12:23:42,748][15401] Updated weights for policy 0, policy_version 884714 (0.0037) [2024-06-25 12:23:43,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42873.2, 300 sec: 42931.9). Total num frames: 14495186944. Throughput: 0: 42955.6. Samples: 14495290280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:43,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 12:23:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884716_14495186944.pth... [2024-06-25 12:23:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884088_14484897792.pth [2024-06-25 12:23:47,354][15401] Updated weights for policy 0, policy_version 884724 (0.0027) [2024-06-25 12:23:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14495367168. Throughput: 0: 42851.6. Samples: 14495546100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 12:23:50,410][15401] Updated weights for policy 0, policy_version 884734 (0.0033) [2024-06-25 12:23:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42765.2). Total num frames: 14495580160. Throughput: 0: 42831.9. Samples: 14495668400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 12:23:54,811][15401] Updated weights for policy 0, policy_version 884744 (0.0034) [2024-06-25 12:23:58,026][15401] Updated weights for policy 0, policy_version 884754 (0.0043) [2024-06-25 12:23:58,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14495825920. Throughput: 0: 42856.1. Samples: 14495931080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:23:58,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 12:24:02,337][15401] Updated weights for policy 0, policy_version 884764 (0.0036) [2024-06-25 12:24:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14496006144. Throughput: 0: 42784.4. Samples: 14496188680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 12:24:05,453][15401] Updated weights for policy 0, policy_version 884774 (0.0045) [2024-06-25 12:24:08,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42869.6, 300 sec: 42820.2). Total num frames: 14496219136. Throughput: 0: 42834.5. Samples: 14496313140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:08,393][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 12:24:09,794][15401] Updated weights for policy 0, policy_version 884784 (0.0036) [2024-06-25 12:24:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 14496464896. Throughput: 0: 42825.0. Samples: 14496572580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 12:24:13,391][15401] Updated weights for policy 0, policy_version 884794 (0.0032) [2024-06-25 12:24:17,822][15401] Updated weights for policy 0, policy_version 884804 (0.0036) [2024-06-25 12:24:18,390][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14496661504. Throughput: 0: 42884.5. Samples: 14496829740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:18,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 12:24:21,064][15401] Updated weights for policy 0, policy_version 884814 (0.0035) [2024-06-25 12:24:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14496874496. Throughput: 0: 42840.0. Samples: 14496957380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 12:24:25,579][15401] Updated weights for policy 0, policy_version 884824 (0.0040) [2024-06-25 12:24:27,296][15349] Signal inference workers to stop experience collection... (214600 times) [2024-06-25 12:24:27,297][15349] Signal inference workers to resume experience collection... (214600 times) [2024-06-25 12:24:27,318][15401] InferenceWorker_p0-w0: stopping experience collection (214600 times) [2024-06-25 12:24:27,319][15401] InferenceWorker_p0-w0: resuming experience collection (214600 times) [2024-06-25 12:24:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 14497087488. Throughput: 0: 42789.2. Samples: 14497215800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 12:24:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 12:24:28,727][15401] Updated weights for policy 0, policy_version 884834 (0.0027) [2024-06-25 12:24:33,181][15401] Updated weights for policy 0, policy_version 884844 (0.0027) [2024-06-25 12:24:33,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14497300480. Throughput: 0: 42822.5. Samples: 14497473220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:33,393][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 12:24:36,115][15401] Updated weights for policy 0, policy_version 884854 (0.0038) [2024-06-25 12:24:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14497513472. Throughput: 0: 42765.8. Samples: 14497592860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:38,390][15132] Avg episode reward: [(0, '0.009')] [2024-06-25 12:24:40,738][15401] Updated weights for policy 0, policy_version 884864 (0.0036) [2024-06-25 12:24:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 14497742848. Throughput: 0: 42750.2. Samples: 14497854840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:43,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-25 12:24:43,961][15401] Updated weights for policy 0, policy_version 884874 (0.0040) [2024-06-25 12:24:48,387][15401] Updated weights for policy 0, policy_version 884884 (0.0024) [2024-06-25 12:24:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14497939456. Throughput: 0: 42821.3. Samples: 14498115640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 12:24:51,461][15401] Updated weights for policy 0, policy_version 884894 (0.0051) [2024-06-25 12:24:53,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 14498168832. Throughput: 0: 42706.7. Samples: 14498234940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:53,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 12:24:55,959][15401] Updated weights for policy 0, policy_version 884904 (0.0027) [2024-06-25 12:24:58,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14498381824. Throughput: 0: 42752.0. Samples: 14498496420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:24:58,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 12:24:59,222][15401] Updated weights for policy 0, policy_version 884914 (0.0035) [2024-06-25 12:25:03,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14498578432. Throughput: 0: 42875.1. Samples: 14498759120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 12:25:03,407][15401] Updated weights for policy 0, policy_version 884924 (0.0030) [2024-06-25 12:25:06,686][15401] Updated weights for policy 0, policy_version 884934 (0.0048) [2024-06-25 12:25:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 14498791424. Throughput: 0: 42739.0. Samples: 14498880740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:08,392][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 12:25:11,231][15401] Updated weights for policy 0, policy_version 884944 (0.0031) [2024-06-25 12:25:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14499004416. Throughput: 0: 42609.0. Samples: 14499133200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:13,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 12:25:14,793][15401] Updated weights for policy 0, policy_version 884954 (0.0027) [2024-06-25 12:25:18,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14499217408. Throughput: 0: 42696.5. Samples: 14499394460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 12:25:19,036][15401] Updated weights for policy 0, policy_version 884964 (0.0035) [2024-06-25 12:25:22,274][15401] Updated weights for policy 0, policy_version 884974 (0.0028) [2024-06-25 12:25:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14499430400. Throughput: 0: 42742.6. Samples: 14499516280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 12:25:26,667][15401] Updated weights for policy 0, policy_version 884984 (0.0026) [2024-06-25 12:25:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14499659776. Throughput: 0: 42712.4. Samples: 14499776900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:28,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 12:25:29,756][15401] Updated weights for policy 0, policy_version 884994 (0.0035) [2024-06-25 12:25:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42765.2). Total num frames: 14499856384. Throughput: 0: 42645.8. Samples: 14500034700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 12:25:34,331][15401] Updated weights for policy 0, policy_version 885004 (0.0039) [2024-06-25 12:25:37,430][15401] Updated weights for policy 0, policy_version 885014 (0.0021) [2024-06-25 12:25:38,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 14500085760. Throughput: 0: 42758.7. Samples: 14500159080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:38,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 12:25:41,888][15401] Updated weights for policy 0, policy_version 885024 (0.0041) [2024-06-25 12:25:43,296][15349] Signal inference workers to stop experience collection... (214650 times) [2024-06-25 12:25:43,302][15349] Signal inference workers to resume experience collection... (214650 times) [2024-06-25 12:25:43,314][15401] InferenceWorker_p0-w0: stopping experience collection (214650 times) [2024-06-25 12:25:43,349][15401] InferenceWorker_p0-w0: resuming experience collection (214650 times) [2024-06-25 12:25:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 14500298752. Throughput: 0: 42666.1. Samples: 14500416500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:43,393][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 12:25:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885029_14500315136.pth... [2024-06-25 12:25:43,519][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884402_14490042368.pth [2024-06-25 12:25:45,425][15401] Updated weights for policy 0, policy_version 885034 (0.0034) [2024-06-25 12:25:48,395][15132] Fps is (10 sec: 40948.8, 60 sec: 42594.8, 300 sec: 42708.8). Total num frames: 14500495360. Throughput: 0: 42621.4. Samples: 14500677300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:48,396][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:25:49,461][15401] Updated weights for policy 0, policy_version 885044 (0.0034) [2024-06-25 12:25:52,914][15401] Updated weights for policy 0, policy_version 885054 (0.0039) [2024-06-25 12:25:53,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 14500741120. Throughput: 0: 42688.4. Samples: 14500801620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:25:57,007][15401] Updated weights for policy 0, policy_version 885064 (0.0035) [2024-06-25 12:25:58,391][15132] Fps is (10 sec: 45893.3, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 14500954112. Throughput: 0: 42871.7. Samples: 14501062480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:25:58,391][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 12:26:00,791][15401] Updated weights for policy 0, policy_version 885074 (0.0056) [2024-06-25 12:26:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14501134336. Throughput: 0: 42947.1. Samples: 14501327080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:26:03,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 12:26:04,618][15401] Updated weights for policy 0, policy_version 885084 (0.0026) [2024-06-25 12:26:08,392][15132] Fps is (10 sec: 40954.9, 60 sec: 42871.4, 300 sec: 42875.8). Total num frames: 14501363712. Throughput: 0: 42951.5. Samples: 14501449200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:26:08,393][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 12:26:08,736][15401] Updated weights for policy 0, policy_version 885094 (0.0025) [2024-06-25 12:26:12,314][15401] Updated weights for policy 0, policy_version 885104 (0.0035) [2024-06-25 12:26:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 14501593088. Throughput: 0: 42758.6. Samples: 14501701040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:26:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 12:26:16,314][15401] Updated weights for policy 0, policy_version 885114 (0.0034) [2024-06-25 12:26:18,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14501789696. Throughput: 0: 42874.3. Samples: 14501964040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 12:26:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 12:26:19,839][15401] Updated weights for policy 0, policy_version 885124 (0.0038) [2024-06-25 12:26:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14502019072. Throughput: 0: 42742.7. Samples: 14502082400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 12:26:23,967][15401] Updated weights for policy 0, policy_version 885134 (0.0026) [2024-06-25 12:26:27,614][15401] Updated weights for policy 0, policy_version 885144 (0.0038) [2024-06-25 12:26:28,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14502248448. Throughput: 0: 42879.2. Samples: 14502345960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 12:26:31,563][15401] Updated weights for policy 0, policy_version 885154 (0.0032) [2024-06-25 12:26:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14502412288. Throughput: 0: 42806.2. Samples: 14502603360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 12:26:35,373][15401] Updated weights for policy 0, policy_version 885164 (0.0033) [2024-06-25 12:26:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14502658048. Throughput: 0: 42667.3. Samples: 14502721640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:38,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 12:26:39,323][15401] Updated weights for policy 0, policy_version 885174 (0.0031) [2024-06-25 12:26:42,928][15401] Updated weights for policy 0, policy_version 885184 (0.0034) [2024-06-25 12:26:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 14502887424. Throughput: 0: 42809.5. Samples: 14502988860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 12:26:46,998][15401] Updated weights for policy 0, policy_version 885194 (0.0039) [2024-06-25 12:26:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42602.0, 300 sec: 42709.5). Total num frames: 14503051264. Throughput: 0: 42747.6. Samples: 14503250720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 12:26:50,410][15401] Updated weights for policy 0, policy_version 885204 (0.0032) [2024-06-25 12:26:53,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42867.0, 300 sec: 42875.2). Total num frames: 14503313408. Throughput: 0: 42650.9. Samples: 14503368660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:53,396][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 12:26:54,624][15401] Updated weights for policy 0, policy_version 885214 (0.0039) [2024-06-25 12:26:57,766][15349] Signal inference workers to stop experience collection... (214700 times) [2024-06-25 12:26:57,819][15349] Signal inference workers to resume experience collection... (214700 times) [2024-06-25 12:26:57,820][15401] InferenceWorker_p0-w0: stopping experience collection (214700 times) [2024-06-25 12:26:57,845][15401] InferenceWorker_p0-w0: resuming experience collection (214700 times) [2024-06-25 12:26:58,212][15401] Updated weights for policy 0, policy_version 885224 (0.0041) [2024-06-25 12:26:58,392][15132] Fps is (10 sec: 47502.5, 60 sec: 42870.6, 300 sec: 42875.7). Total num frames: 14503526400. Throughput: 0: 42862.3. Samples: 14503629940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:26:58,392][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 12:27:02,170][15401] Updated weights for policy 0, policy_version 885234 (0.0045) [2024-06-25 12:27:03,390][15132] Fps is (10 sec: 39346.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14503706624. Throughput: 0: 42741.7. Samples: 14503887420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:27:05,738][15401] Updated weights for policy 0, policy_version 885244 (0.0027) [2024-06-25 12:27:08,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 14503952384. Throughput: 0: 42774.2. Samples: 14504007240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 12:27:09,724][15401] Updated weights for policy 0, policy_version 885254 (0.0039) [2024-06-25 12:27:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14504148992. Throughput: 0: 42804.1. Samples: 14504272140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 12:27:13,448][15401] Updated weights for policy 0, policy_version 885264 (0.0044) [2024-06-25 12:27:18,054][15401] Updated weights for policy 0, policy_version 885274 (0.0043) [2024-06-25 12:27:18,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 14504345600. Throughput: 0: 42709.3. Samples: 14504525380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:18,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 12:27:21,202][15401] Updated weights for policy 0, policy_version 885284 (0.0049) [2024-06-25 12:27:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14504591360. Throughput: 0: 42808.7. Samples: 14504648040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:23,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 12:27:25,691][15401] Updated weights for policy 0, policy_version 885294 (0.0025) [2024-06-25 12:27:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14504771584. Throughput: 0: 42623.6. Samples: 14504906920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:28,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 12:27:28,755][15401] Updated weights for policy 0, policy_version 885304 (0.0028) [2024-06-25 12:27:33,204][15401] Updated weights for policy 0, policy_version 885314 (0.0035) [2024-06-25 12:27:33,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 14504984576. Throughput: 0: 42641.9. Samples: 14505169880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:33,396][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 12:27:36,467][15401] Updated weights for policy 0, policy_version 885324 (0.0029) [2024-06-25 12:27:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14505230336. Throughput: 0: 42778.1. Samples: 14505293400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 12:27:41,280][15401] Updated weights for policy 0, policy_version 885334 (0.0037) [2024-06-25 12:27:43,390][15132] Fps is (10 sec: 44263.6, 60 sec: 42325.1, 300 sec: 42765.0). Total num frames: 14505426944. Throughput: 0: 42699.2. Samples: 14505551320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 12:27:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885341_14505426944.pth... [2024-06-25 12:27:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000884716_14495186944.pth [2024-06-25 12:27:44,187][15401] Updated weights for policy 0, policy_version 885344 (0.0037) [2024-06-25 12:27:48,390][15132] Fps is (10 sec: 39320.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14505623552. Throughput: 0: 42684.7. Samples: 14505808240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 12:27:48,774][15401] Updated weights for policy 0, policy_version 885354 (0.0029) [2024-06-25 12:27:51,994][15401] Updated weights for policy 0, policy_version 885364 (0.0039) [2024-06-25 12:27:53,389][15132] Fps is (10 sec: 44238.5, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 14505869312. Throughput: 0: 42901.4. Samples: 14505937800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:53,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 12:27:56,380][15401] Updated weights for policy 0, policy_version 885374 (0.0033) [2024-06-25 12:27:58,389][15132] Fps is (10 sec: 44237.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 14506065920. Throughput: 0: 42572.4. Samples: 14506187900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:27:58,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 12:27:59,582][15401] Updated weights for policy 0, policy_version 885384 (0.0040) [2024-06-25 12:28:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14506278912. Throughput: 0: 42517.7. Samples: 14506438580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:28:03,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 12:28:03,928][15401] Updated weights for policy 0, policy_version 885394 (0.0026) [2024-06-25 12:28:07,465][15401] Updated weights for policy 0, policy_version 885404 (0.0033) [2024-06-25 12:28:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 14506508288. Throughput: 0: 42855.7. Samples: 14506576540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:28:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 12:28:11,519][15401] Updated weights for policy 0, policy_version 885414 (0.0028) [2024-06-25 12:28:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14506688512. Throughput: 0: 42712.8. Samples: 14506829000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-25 12:28:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 12:28:14,939][15401] Updated weights for policy 0, policy_version 885424 (0.0035) [2024-06-25 12:28:18,390][15132] Fps is (10 sec: 40958.7, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 14506917888. Throughput: 0: 42566.3. Samples: 14507085100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:18,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 12:28:19,391][15401] Updated weights for policy 0, policy_version 885434 (0.0036) [2024-06-25 12:28:22,621][15401] Updated weights for policy 0, policy_version 885444 (0.0042) [2024-06-25 12:28:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 14507147264. Throughput: 0: 42769.7. Samples: 14507218040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 12:28:26,827][15401] Updated weights for policy 0, policy_version 885454 (0.0023) [2024-06-25 12:28:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14507327488. Throughput: 0: 42635.8. Samples: 14507469920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 12:28:29,675][15349] Signal inference workers to stop experience collection... (214750 times) [2024-06-25 12:28:29,675][15349] Signal inference workers to resume experience collection... (214750 times) [2024-06-25 12:28:29,723][15401] InferenceWorker_p0-w0: stopping experience collection (214750 times) [2024-06-25 12:28:29,723][15401] InferenceWorker_p0-w0: resuming experience collection (214750 times) [2024-06-25 12:28:30,295][15401] Updated weights for policy 0, policy_version 885464 (0.0034) [2024-06-25 12:28:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 14507556864. Throughput: 0: 42593.1. Samples: 14507724920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 12:28:34,565][15401] Updated weights for policy 0, policy_version 885474 (0.0029) [2024-06-25 12:28:37,909][15401] Updated weights for policy 0, policy_version 885484 (0.0041) [2024-06-25 12:28:38,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14507786240. Throughput: 0: 42680.9. Samples: 14507858440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 12:28:42,267][15401] Updated weights for policy 0, policy_version 885494 (0.0039) [2024-06-25 12:28:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.6, 300 sec: 42709.5). Total num frames: 14507966464. Throughput: 0: 42885.4. Samples: 14508117740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 12:28:45,381][15401] Updated weights for policy 0, policy_version 885504 (0.0030) [2024-06-25 12:28:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 14508195840. Throughput: 0: 42949.9. Samples: 14508371320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 12:28:50,082][15401] Updated weights for policy 0, policy_version 885514 (0.0037) [2024-06-25 12:28:53,061][15401] Updated weights for policy 0, policy_version 885524 (0.0041) [2024-06-25 12:28:53,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14508441600. Throughput: 0: 42696.7. Samples: 14508497900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:53,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 12:28:57,721][15401] Updated weights for policy 0, policy_version 885534 (0.0040) [2024-06-25 12:28:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14508605440. Throughput: 0: 42757.5. Samples: 14508753080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:28:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 12:29:00,880][15401] Updated weights for policy 0, policy_version 885544 (0.0040) [2024-06-25 12:29:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 14508834816. Throughput: 0: 42679.4. Samples: 14509005660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 12:29:05,387][15401] Updated weights for policy 0, policy_version 885554 (0.0024) [2024-06-25 12:29:08,392][15132] Fps is (10 sec: 45863.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 14509064192. Throughput: 0: 42629.7. Samples: 14509136480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:08,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 12:29:08,471][15401] Updated weights for policy 0, policy_version 885564 (0.0023) [2024-06-25 12:29:13,192][15401] Updated weights for policy 0, policy_version 885574 (0.0029) [2024-06-25 12:29:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14509260800. Throughput: 0: 42831.3. Samples: 14509397320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:13,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 12:29:16,122][15401] Updated weights for policy 0, policy_version 885584 (0.0028) [2024-06-25 12:29:18,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 14509490176. Throughput: 0: 42701.3. Samples: 14509646580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:18,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 12:29:20,785][15401] Updated weights for policy 0, policy_version 885594 (0.0042) [2024-06-25 12:29:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14509703168. Throughput: 0: 42612.1. Samples: 14509775980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 12:29:23,672][15401] Updated weights for policy 0, policy_version 885604 (0.0034) [2024-06-25 12:29:28,267][15401] Updated weights for policy 0, policy_version 885614 (0.0030) [2024-06-25 12:29:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 14509899776. Throughput: 0: 42684.9. Samples: 14510038560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 12:29:31,264][15401] Updated weights for policy 0, policy_version 885624 (0.0029) [2024-06-25 12:29:33,396][15132] Fps is (10 sec: 42570.0, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 14510129152. Throughput: 0: 42566.2. Samples: 14510287080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:33,397][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 12:29:35,923][15401] Updated weights for policy 0, policy_version 885634 (0.0033) [2024-06-25 12:29:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14510342144. Throughput: 0: 42642.2. Samples: 14510416800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 12:29:39,066][15401] Updated weights for policy 0, policy_version 885644 (0.0025) [2024-06-25 12:29:43,389][15132] Fps is (10 sec: 40987.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14510538752. Throughput: 0: 42643.0. Samples: 14510672020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 12:29:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885653_14510538752.pth... [2024-06-25 12:29:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885029_14500315136.pth [2024-06-25 12:29:43,734][15401] Updated weights for policy 0, policy_version 885654 (0.0034) [2024-06-25 12:29:44,440][15349] Signal inference workers to stop experience collection... (214800 times) [2024-06-25 12:29:44,471][15401] InferenceWorker_p0-w0: stopping experience collection (214800 times) [2024-06-25 12:29:44,559][15349] Signal inference workers to resume experience collection... (214800 times) [2024-06-25 12:29:44,559][15401] InferenceWorker_p0-w0: resuming experience collection (214800 times) [2024-06-25 12:29:46,863][15401] Updated weights for policy 0, policy_version 885664 (0.0041) [2024-06-25 12:29:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14510784512. Throughput: 0: 42525.8. Samples: 14510919320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 12:29:51,586][15401] Updated weights for policy 0, policy_version 885674 (0.0034) [2024-06-25 12:29:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14510981120. Throughput: 0: 42591.6. Samples: 14511053000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 12:29:54,544][15401] Updated weights for policy 0, policy_version 885684 (0.0034) [2024-06-25 12:29:58,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14511161344. Throughput: 0: 42370.6. Samples: 14511304000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:29:58,390][15132] Avg episode reward: [(0, '0.892')] [2024-06-25 12:29:59,598][15401] Updated weights for policy 0, policy_version 885694 (0.0038) [2024-06-25 12:30:02,184][15401] Updated weights for policy 0, policy_version 885704 (0.0033) [2024-06-25 12:30:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14511407104. Throughput: 0: 42406.3. Samples: 14511554760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 12:30:03,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-25 12:30:07,422][15401] Updated weights for policy 0, policy_version 885714 (0.0037) [2024-06-25 12:30:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 14511587328. Throughput: 0: 42657.1. Samples: 14511695560. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 12:30:09,860][15401] Updated weights for policy 0, policy_version 885724 (0.0037) [2024-06-25 12:30:13,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 14511816704. Throughput: 0: 42211.9. Samples: 14511938200. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:13,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 12:30:15,139][15401] Updated weights for policy 0, policy_version 885734 (0.0023) [2024-06-25 12:30:17,723][15401] Updated weights for policy 0, policy_version 885744 (0.0030) [2024-06-25 12:30:18,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 14512046080. Throughput: 0: 42391.2. Samples: 14512194420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:18,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 12:30:22,647][15401] Updated weights for policy 0, policy_version 885754 (0.0031) [2024-06-25 12:30:23,392][15132] Fps is (10 sec: 40960.1, 60 sec: 42050.5, 300 sec: 42598.1). Total num frames: 14512226304. Throughput: 0: 42561.4. Samples: 14512332160. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:23,393][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 12:30:25,510][15401] Updated weights for policy 0, policy_version 885764 (0.0044) [2024-06-25 12:30:28,394][15132] Fps is (10 sec: 40942.0, 60 sec: 42595.0, 300 sec: 42708.8). Total num frames: 14512455680. Throughput: 0: 42349.3. Samples: 14512577940. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:28,395][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 12:30:30,243][15401] Updated weights for policy 0, policy_version 885774 (0.0036) [2024-06-25 12:30:33,231][15401] Updated weights for policy 0, policy_version 885784 (0.0038) [2024-06-25 12:30:33,389][15132] Fps is (10 sec: 47525.3, 60 sec: 42876.2, 300 sec: 42765.4). Total num frames: 14512701440. Throughput: 0: 42568.4. Samples: 14512834900. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 12:30:37,866][15401] Updated weights for policy 0, policy_version 885794 (0.0050) [2024-06-25 12:30:38,389][15132] Fps is (10 sec: 40979.3, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 14512865280. Throughput: 0: 42509.3. Samples: 14512965920. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 12:30:40,929][15401] Updated weights for policy 0, policy_version 885804 (0.0031) [2024-06-25 12:30:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.8). Total num frames: 14513111040. Throughput: 0: 42478.7. Samples: 14513215540. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 12:30:45,747][15401] Updated weights for policy 0, policy_version 885814 (0.0034) [2024-06-25 12:30:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14513324032. Throughput: 0: 42659.0. Samples: 14513474420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:48,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 12:30:48,547][15401] Updated weights for policy 0, policy_version 885824 (0.0032) [2024-06-25 12:30:53,381][15401] Updated weights for policy 0, policy_version 885834 (0.0036) [2024-06-25 12:30:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42543.0). Total num frames: 14513504256. Throughput: 0: 42379.6. Samples: 14513602640. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 12:30:56,087][15401] Updated weights for policy 0, policy_version 885844 (0.0032) [2024-06-25 12:30:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14513733632. Throughput: 0: 42437.9. Samples: 14513847800. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:30:58,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 12:31:00,920][15401] Updated weights for policy 0, policy_version 885854 (0.0035) [2024-06-25 12:31:02,331][15349] Signal inference workers to stop experience collection... (214850 times) [2024-06-25 12:31:02,331][15349] Signal inference workers to resume experience collection... (214850 times) [2024-06-25 12:31:02,343][15401] InferenceWorker_p0-w0: stopping experience collection (214850 times) [2024-06-25 12:31:02,364][15401] InferenceWorker_p0-w0: resuming experience collection (214850 times) [2024-06-25 12:31:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 14513946624. Throughput: 0: 42611.4. Samples: 14514111920. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 12:31:04,231][15401] Updated weights for policy 0, policy_version 885864 (0.0034) [2024-06-25 12:31:08,383][15401] Updated weights for policy 0, policy_version 885874 (0.0040) [2024-06-25 12:31:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 14514159616. Throughput: 0: 42378.8. Samples: 14514239100. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:08,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 12:31:11,802][15401] Updated weights for policy 0, policy_version 885884 (0.0020) [2024-06-25 12:31:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14514388992. Throughput: 0: 42484.9. Samples: 14514489560. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 12:31:15,911][15401] Updated weights for policy 0, policy_version 885894 (0.0032) [2024-06-25 12:31:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.7, 300 sec: 42654.0). Total num frames: 14514601984. Throughput: 0: 42657.8. Samples: 14514754500. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:18,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 12:31:19,270][15401] Updated weights for policy 0, policy_version 885904 (0.0033) [2024-06-25 12:31:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 14514798592. Throughput: 0: 42642.3. Samples: 14514884820. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:23,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 12:31:23,542][15401] Updated weights for policy 0, policy_version 885914 (0.0022) [2024-06-25 12:31:26,908][15401] Updated weights for policy 0, policy_version 885924 (0.0041) [2024-06-25 12:31:28,394][15132] Fps is (10 sec: 40942.6, 60 sec: 42598.8, 300 sec: 42708.9). Total num frames: 14515011584. Throughput: 0: 42599.2. Samples: 14515132680. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:28,394][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 12:31:31,515][15401] Updated weights for policy 0, policy_version 885934 (0.0037) [2024-06-25 12:31:33,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42050.5, 300 sec: 42598.0). Total num frames: 14515224576. Throughput: 0: 42722.1. Samples: 14515397020. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:33,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 12:31:34,732][15401] Updated weights for policy 0, policy_version 885944 (0.0034) [2024-06-25 12:31:38,389][15132] Fps is (10 sec: 44255.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14515453952. Throughput: 0: 42769.9. Samples: 14515527280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 12:31:38,990][15401] Updated weights for policy 0, policy_version 885954 (0.0037) [2024-06-25 12:31:42,647][15401] Updated weights for policy 0, policy_version 885964 (0.0028) [2024-06-25 12:31:43,391][15132] Fps is (10 sec: 44242.7, 60 sec: 42597.6, 300 sec: 42764.9). Total num frames: 14515666944. Throughput: 0: 42921.5. Samples: 14515779320. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:43,391][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 12:31:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885966_14515666944.pth... [2024-06-25 12:31:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885341_14505426944.pth [2024-06-25 12:31:46,517][15401] Updated weights for policy 0, policy_version 885974 (0.0029) [2024-06-25 12:31:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 14515863552. Throughput: 0: 42728.5. Samples: 14516034700. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:48,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 12:31:50,371][15401] Updated weights for policy 0, policy_version 885984 (0.0033) [2024-06-25 12:31:53,390][15132] Fps is (10 sec: 40964.5, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 14516076544. Throughput: 0: 42695.9. Samples: 14516160420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 12:31:54,175][15401] Updated weights for policy 0, policy_version 885994 (0.0037) [2024-06-25 12:31:58,069][15401] Updated weights for policy 0, policy_version 886004 (0.0036) [2024-06-25 12:31:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14516322304. Throughput: 0: 42902.7. Samples: 14516420180. Policy #0 lag: (min: 1.0, avg: 8.7, max: 23.0) [2024-06-25 12:31:58,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 12:32:02,090][15401] Updated weights for policy 0, policy_version 886014 (0.0037) [2024-06-25 12:32:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14516502528. Throughput: 0: 42681.7. Samples: 14516675180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 12:32:05,760][15401] Updated weights for policy 0, policy_version 886024 (0.0041) [2024-06-25 12:32:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14516731904. Throughput: 0: 42607.1. Samples: 14516802140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:08,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 12:32:09,585][15401] Updated weights for policy 0, policy_version 886034 (0.0031) [2024-06-25 12:32:13,376][15401] Updated weights for policy 0, policy_version 886044 (0.0046) [2024-06-25 12:32:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14516944896. Throughput: 0: 42771.9. Samples: 14517057240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 12:32:17,343][15401] Updated weights for policy 0, policy_version 886054 (0.0032) [2024-06-25 12:32:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14517141504. Throughput: 0: 42610.4. Samples: 14517314380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:18,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 12:32:18,410][15349] Signal inference workers to stop experience collection... (214900 times) [2024-06-25 12:32:18,410][15349] Signal inference workers to resume experience collection... (214900 times) [2024-06-25 12:32:18,460][15401] InferenceWorker_p0-w0: stopping experience collection (214900 times) [2024-06-25 12:32:18,460][15401] InferenceWorker_p0-w0: resuming experience collection (214900 times) [2024-06-25 12:32:20,997][15401] Updated weights for policy 0, policy_version 886064 (0.0031) [2024-06-25 12:32:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14517354496. Throughput: 0: 42491.5. Samples: 14517439400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:23,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 12:32:25,105][15401] Updated weights for policy 0, policy_version 886074 (0.0028) [2024-06-25 12:32:28,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42874.3, 300 sec: 42710.4). Total num frames: 14517583872. Throughput: 0: 42668.9. Samples: 14517699380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 12:32:28,775][15401] Updated weights for policy 0, policy_version 886084 (0.0026) [2024-06-25 12:32:32,920][15401] Updated weights for policy 0, policy_version 886094 (0.0030) [2024-06-25 12:32:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 14517780480. Throughput: 0: 42601.8. Samples: 14517951780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 12:32:36,442][15401] Updated weights for policy 0, policy_version 886104 (0.0030) [2024-06-25 12:32:38,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 14517993472. Throughput: 0: 42599.3. Samples: 14518077380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 12:32:40,634][15401] Updated weights for policy 0, policy_version 886114 (0.0035) [2024-06-25 12:32:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42326.1, 300 sec: 42654.0). Total num frames: 14518206464. Throughput: 0: 42513.3. Samples: 14518333280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 12:32:44,083][15401] Updated weights for policy 0, policy_version 886124 (0.0038) [2024-06-25 12:32:48,374][15401] Updated weights for policy 0, policy_version 886134 (0.0033) [2024-06-25 12:32:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14518419456. Throughput: 0: 42641.9. Samples: 14518594060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 12:32:51,961][15401] Updated weights for policy 0, policy_version 886144 (0.0032) [2024-06-25 12:32:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14518616064. Throughput: 0: 42593.7. Samples: 14518718860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:53,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 12:32:56,047][15401] Updated weights for policy 0, policy_version 886154 (0.0027) [2024-06-25 12:32:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14518845440. Throughput: 0: 42436.4. Samples: 14518966880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:32:58,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 12:33:00,258][15401] Updated weights for policy 0, policy_version 886164 (0.0028) [2024-06-25 12:33:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14519042048. Throughput: 0: 42481.3. Samples: 14519226040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:03,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 12:33:03,880][15401] Updated weights for policy 0, policy_version 886174 (0.0027) [2024-06-25 12:33:07,866][15401] Updated weights for policy 0, policy_version 886184 (0.0043) [2024-06-25 12:33:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14519255040. Throughput: 0: 42536.5. Samples: 14519353540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:08,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 12:33:11,582][15401] Updated weights for policy 0, policy_version 886194 (0.0036) [2024-06-25 12:33:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14519484416. Throughput: 0: 42325.0. Samples: 14519604000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:13,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-25 12:33:15,599][15401] Updated weights for policy 0, policy_version 886204 (0.0035) [2024-06-25 12:33:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14519697408. Throughput: 0: 42347.9. Samples: 14519857440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 12:33:19,204][15401] Updated weights for policy 0, policy_version 886214 (0.0026) [2024-06-25 12:33:23,232][15401] Updated weights for policy 0, policy_version 886224 (0.0042) [2024-06-25 12:33:23,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 14519894016. Throughput: 0: 42418.3. Samples: 14519986480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:23,396][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 12:33:27,076][15401] Updated weights for policy 0, policy_version 886234 (0.0033) [2024-06-25 12:33:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.5, 300 sec: 42542.9). Total num frames: 14520107008. Throughput: 0: 42421.0. Samples: 14520242220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 12:33:31,286][15401] Updated weights for policy 0, policy_version 886244 (0.0034) [2024-06-25 12:33:33,396][15132] Fps is (10 sec: 45875.2, 60 sec: 42866.8, 300 sec: 42597.5). Total num frames: 14520352768. Throughput: 0: 42142.8. Samples: 14520490760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:33,396][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 12:33:34,572][15401] Updated weights for policy 0, policy_version 886254 (0.0027) [2024-06-25 12:33:36,679][15349] Signal inference workers to stop experience collection... (214950 times) [2024-06-25 12:33:36,680][15349] Signal inference workers to resume experience collection... (214950 times) [2024-06-25 12:33:36,696][15401] InferenceWorker_p0-w0: stopping experience collection (214950 times) [2024-06-25 12:33:36,696][15401] InferenceWorker_p0-w0: resuming experience collection (214950 times) [2024-06-25 12:33:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 14520516608. Throughput: 0: 42428.4. Samples: 14520628140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:38,395][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 12:33:38,897][15401] Updated weights for policy 0, policy_version 886264 (0.0037) [2024-06-25 12:33:42,171][15401] Updated weights for policy 0, policy_version 886274 (0.0029) [2024-06-25 12:33:43,390][15132] Fps is (10 sec: 39346.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 14520745984. Throughput: 0: 42529.7. Samples: 14520880720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 12:33:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886276_14520745984.pth... [2024-06-25 12:33:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885653_14510538752.pth [2024-06-25 12:33:46,372][15401] Updated weights for policy 0, policy_version 886284 (0.0026) [2024-06-25 12:33:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 14520975360. Throughput: 0: 42370.6. Samples: 14521132720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 12:33:49,740][15401] Updated weights for policy 0, policy_version 886294 (0.0025) [2024-06-25 12:33:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14521171968. Throughput: 0: 42522.0. Samples: 14521267040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 12:33:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 12:33:53,974][15401] Updated weights for policy 0, policy_version 886304 (0.0032) [2024-06-25 12:33:57,863][15401] Updated weights for policy 0, policy_version 886314 (0.0030) [2024-06-25 12:33:58,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 14521384960. Throughput: 0: 42540.9. Samples: 14521518440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:33:58,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 12:34:02,021][15401] Updated weights for policy 0, policy_version 886324 (0.0026) [2024-06-25 12:34:03,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42869.7, 300 sec: 42542.9). Total num frames: 14521614336. Throughput: 0: 42511.1. Samples: 14521770540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:03,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 12:34:05,439][15401] Updated weights for policy 0, policy_version 886334 (0.0046) [2024-06-25 12:34:08,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14521810944. Throughput: 0: 42588.3. Samples: 14521902680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 12:34:09,489][15401] Updated weights for policy 0, policy_version 886344 (0.0035) [2024-06-25 12:34:12,933][15401] Updated weights for policy 0, policy_version 886354 (0.0029) [2024-06-25 12:34:13,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 14522040320. Throughput: 0: 42606.6. Samples: 14522159520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:13,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 12:34:17,131][15401] Updated weights for policy 0, policy_version 886364 (0.0037) [2024-06-25 12:34:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 14522253312. Throughput: 0: 42618.9. Samples: 14522408340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:18,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 12:34:20,697][15401] Updated weights for policy 0, policy_version 886374 (0.0033) [2024-06-25 12:34:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42603.0, 300 sec: 42542.9). Total num frames: 14522449920. Throughput: 0: 42461.8. Samples: 14522538920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:23,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 12:34:24,814][15401] Updated weights for policy 0, policy_version 886384 (0.0029) [2024-06-25 12:34:28,186][15401] Updated weights for policy 0, policy_version 886394 (0.0028) [2024-06-25 12:34:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 14522679296. Throughput: 0: 42513.0. Samples: 14522793800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 12:34:32,552][15401] Updated weights for policy 0, policy_version 886404 (0.0034) [2024-06-25 12:34:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42329.9, 300 sec: 42542.9). Total num frames: 14522892288. Throughput: 0: 42553.8. Samples: 14523047640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 12:34:35,879][15401] Updated weights for policy 0, policy_version 886414 (0.0039) [2024-06-25 12:34:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42598.0). Total num frames: 14523105280. Throughput: 0: 42334.7. Samples: 14523172200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:38,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 12:34:40,154][15401] Updated weights for policy 0, policy_version 886424 (0.0032) [2024-06-25 12:34:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14523318272. Throughput: 0: 42503.1. Samples: 14523430980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 12:34:44,099][15401] Updated weights for policy 0, policy_version 886434 (0.0029) [2024-06-25 12:34:47,534][15349] Signal inference workers to stop experience collection... (215000 times) [2024-06-25 12:34:47,534][15349] Signal inference workers to resume experience collection... (215000 times) [2024-06-25 12:34:47,565][15401] InferenceWorker_p0-w0: stopping experience collection (215000 times) [2024-06-25 12:34:47,566][15401] InferenceWorker_p0-w0: resuming experience collection (215000 times) [2024-06-25 12:34:48,066][15401] Updated weights for policy 0, policy_version 886444 (0.0042) [2024-06-25 12:34:48,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14523514880. Throughput: 0: 42506.8. Samples: 14523683240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:48,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 12:34:51,682][15401] Updated weights for policy 0, policy_version 886454 (0.0034) [2024-06-25 12:34:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 14523727872. Throughput: 0: 42395.5. Samples: 14523810580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:53,392][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 12:34:55,669][15401] Updated weights for policy 0, policy_version 886464 (0.0031) [2024-06-25 12:34:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 14523940864. Throughput: 0: 42311.9. Samples: 14524063560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:34:58,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 12:34:59,293][15401] Updated weights for policy 0, policy_version 886474 (0.0038) [2024-06-25 12:35:03,138][15401] Updated weights for policy 0, policy_version 886484 (0.0028) [2024-06-25 12:35:03,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 14524170240. Throughput: 0: 42454.1. Samples: 14524318780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 12:35:06,807][15401] Updated weights for policy 0, policy_version 886494 (0.0037) [2024-06-25 12:35:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 14524366848. Throughput: 0: 42438.3. Samples: 14524448640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 12:35:10,670][15401] Updated weights for policy 0, policy_version 886504 (0.0028) [2024-06-25 12:35:13,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42596.7, 300 sec: 42542.6). Total num frames: 14524596224. Throughput: 0: 42489.8. Samples: 14524705940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:13,392][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 12:35:14,603][15401] Updated weights for policy 0, policy_version 886514 (0.0034) [2024-06-25 12:35:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 14524792832. Throughput: 0: 42624.4. Samples: 14524965740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 12:35:18,765][15401] Updated weights for policy 0, policy_version 886524 (0.0052) [2024-06-25 12:35:22,253][15401] Updated weights for policy 0, policy_version 886534 (0.0034) [2024-06-25 12:35:23,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42871.4, 300 sec: 42599.1). Total num frames: 14525022208. Throughput: 0: 42677.3. Samples: 14525092580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:23,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 12:35:26,418][15401] Updated weights for policy 0, policy_version 886544 (0.0031) [2024-06-25 12:35:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14525235200. Throughput: 0: 42559.6. Samples: 14525346160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 12:35:30,165][15401] Updated weights for policy 0, policy_version 886554 (0.0036) [2024-06-25 12:35:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14525431808. Throughput: 0: 42729.7. Samples: 14525606080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:33,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 12:35:33,998][15401] Updated weights for policy 0, policy_version 886564 (0.0032) [2024-06-25 12:35:37,707][15401] Updated weights for policy 0, policy_version 886574 (0.0034) [2024-06-25 12:35:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 14525644800. Throughput: 0: 42578.3. Samples: 14525726500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 12:35:41,748][15401] Updated weights for policy 0, policy_version 886584 (0.0034) [2024-06-25 12:35:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 14525874176. Throughput: 0: 42751.0. Samples: 14525987460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-25 12:35:43,393][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 12:35:43,552][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886590_14525890560.pth... [2024-06-25 12:35:43,624][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000885966_14515666944.pth [2024-06-25 12:35:45,168][15401] Updated weights for policy 0, policy_version 886594 (0.0032) [2024-06-25 12:35:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14526054400. Throughput: 0: 42970.9. Samples: 14526252460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:35:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 12:35:49,447][15401] Updated weights for policy 0, policy_version 886604 (0.0037) [2024-06-25 12:35:52,741][15401] Updated weights for policy 0, policy_version 886614 (0.0035) [2024-06-25 12:35:53,389][15132] Fps is (10 sec: 42609.3, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 14526300160. Throughput: 0: 42735.1. Samples: 14526371720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:35:53,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 12:35:57,032][15401] Updated weights for policy 0, policy_version 886624 (0.0035) [2024-06-25 12:35:58,392][15132] Fps is (10 sec: 44225.4, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 14526496768. Throughput: 0: 42663.9. Samples: 14526625820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:35:58,393][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 12:36:00,539][15401] Updated weights for policy 0, policy_version 886634 (0.0038) [2024-06-25 12:36:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 14526709760. Throughput: 0: 42564.5. Samples: 14526881140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:03,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 12:36:04,743][15401] Updated weights for policy 0, policy_version 886644 (0.0030) [2024-06-25 12:36:08,209][15401] Updated weights for policy 0, policy_version 886654 (0.0042) [2024-06-25 12:36:08,389][15132] Fps is (10 sec: 44247.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14526939136. Throughput: 0: 42441.9. Samples: 14527002460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 12:36:12,501][15401] Updated weights for policy 0, policy_version 886664 (0.0038) [2024-06-25 12:36:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 14527152128. Throughput: 0: 42599.7. Samples: 14527263140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:13,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 12:36:15,941][15401] Updated weights for policy 0, policy_version 886674 (0.0029) [2024-06-25 12:36:18,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14527332352. Throughput: 0: 42449.8. Samples: 14527516320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:18,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 12:36:20,328][15401] Updated weights for policy 0, policy_version 886684 (0.0043) [2024-06-25 12:36:23,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.4, 300 sec: 42599.0). Total num frames: 14527578112. Throughput: 0: 42591.0. Samples: 14527643100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:23,396][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 12:36:23,559][15401] Updated weights for policy 0, policy_version 886694 (0.0031) [2024-06-25 12:36:28,042][15401] Updated weights for policy 0, policy_version 886704 (0.0032) [2024-06-25 12:36:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 14527791104. Throughput: 0: 42477.4. Samples: 14527898840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 12:36:30,769][15349] Signal inference workers to stop experience collection... (215050 times) [2024-06-25 12:36:30,789][15401] InferenceWorker_p0-w0: stopping experience collection (215050 times) [2024-06-25 12:36:30,827][15349] Signal inference workers to resume experience collection... (215050 times) [2024-06-25 12:36:30,828][15401] InferenceWorker_p0-w0: resuming experience collection (215050 times) [2024-06-25 12:36:31,221][15401] Updated weights for policy 0, policy_version 886714 (0.0025) [2024-06-25 12:36:33,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14527971328. Throughput: 0: 42235.9. Samples: 14528153080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 12:36:35,507][15401] Updated weights for policy 0, policy_version 886724 (0.0035) [2024-06-25 12:36:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42543.0). Total num frames: 14528217088. Throughput: 0: 42409.7. Samples: 14528280160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:36:38,877][15401] Updated weights for policy 0, policy_version 886734 (0.0043) [2024-06-25 12:36:43,118][15401] Updated weights for policy 0, policy_version 886744 (0.0029) [2024-06-25 12:36:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 14528413696. Throughput: 0: 42658.4. Samples: 14528545340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 12:36:46,398][15401] Updated weights for policy 0, policy_version 886754 (0.0023) [2024-06-25 12:36:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14528626688. Throughput: 0: 42711.1. Samples: 14528803140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:48,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 12:36:50,704][15401] Updated weights for policy 0, policy_version 886764 (0.0022) [2024-06-25 12:36:53,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 14528872448. Throughput: 0: 42867.4. Samples: 14528931600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:53,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 12:36:54,140][15401] Updated weights for policy 0, policy_version 886774 (0.0039) [2024-06-25 12:36:58,367][15401] Updated weights for policy 0, policy_version 886784 (0.0035) [2024-06-25 12:36:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 14529069056. Throughput: 0: 42892.7. Samples: 14529193420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:36:58,393][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 12:37:02,216][15401] Updated weights for policy 0, policy_version 886794 (0.0040) [2024-06-25 12:37:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 14529282048. Throughput: 0: 42682.1. Samples: 14529437020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:03,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 12:37:06,011][15401] Updated weights for policy 0, policy_version 886804 (0.0028) [2024-06-25 12:37:08,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14529511424. Throughput: 0: 42768.1. Samples: 14529567660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:08,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 12:37:09,803][15401] Updated weights for policy 0, policy_version 886814 (0.0031) [2024-06-25 12:37:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.1, 300 sec: 42542.8). Total num frames: 14529691648. Throughput: 0: 42812.8. Samples: 14529825420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:13,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 12:37:14,102][15401] Updated weights for policy 0, policy_version 886824 (0.0048) [2024-06-25 12:37:17,416][15401] Updated weights for policy 0, policy_version 886834 (0.0029) [2024-06-25 12:37:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14529921024. Throughput: 0: 42902.3. Samples: 14530083680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 12:37:21,611][15401] Updated weights for policy 0, policy_version 886844 (0.0033) [2024-06-25 12:37:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14530150400. Throughput: 0: 43053.6. Samples: 14530217580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:23,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 12:37:24,967][15401] Updated weights for policy 0, policy_version 886854 (0.0036) [2024-06-25 12:37:28,394][15132] Fps is (10 sec: 40941.9, 60 sec: 42322.3, 300 sec: 42542.2). Total num frames: 14530330624. Throughput: 0: 42746.0. Samples: 14530469100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:28,395][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 12:37:29,347][15401] Updated weights for policy 0, policy_version 886864 (0.0038) [2024-06-25 12:37:32,808][15401] Updated weights for policy 0, policy_version 886874 (0.0036) [2024-06-25 12:37:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14530560000. Throughput: 0: 42607.2. Samples: 14530720460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:33,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 12:37:36,919][15401] Updated weights for policy 0, policy_version 886884 (0.0027) [2024-06-25 12:37:38,389][15132] Fps is (10 sec: 44256.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14530772992. Throughput: 0: 42737.9. Samples: 14530854700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 12:37:38,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-25 12:37:40,363][15401] Updated weights for policy 0, policy_version 886894 (0.0037) [2024-06-25 12:37:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14530985984. Throughput: 0: 42639.6. Samples: 14531112100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:37:43,390][15132] Avg episode reward: [(0, '0.201')] [2024-06-25 12:37:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886901_14530985984.pth... [2024-06-25 12:37:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886276_14520745984.pth [2024-06-25 12:37:44,534][15401] Updated weights for policy 0, policy_version 886904 (0.0036) [2024-06-25 12:37:45,994][15349] Signal inference workers to stop experience collection... (215100 times) [2024-06-25 12:37:46,027][15401] InferenceWorker_p0-w0: stopping experience collection (215100 times) [2024-06-25 12:37:46,047][15349] Signal inference workers to resume experience collection... (215100 times) [2024-06-25 12:37:46,052][15401] InferenceWorker_p0-w0: resuming experience collection (215100 times) [2024-06-25 12:37:48,033][15401] Updated weights for policy 0, policy_version 886914 (0.0035) [2024-06-25 12:37:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14531215360. Throughput: 0: 42845.0. Samples: 14531365040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:37:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 12:37:52,492][15401] Updated weights for policy 0, policy_version 886924 (0.0031) [2024-06-25 12:37:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 14531395584. Throughput: 0: 42874.7. Samples: 14531497020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:37:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 12:37:55,656][15401] Updated weights for policy 0, policy_version 886934 (0.0039) [2024-06-25 12:37:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 14531624960. Throughput: 0: 42934.9. Samples: 14531757480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:37:58,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 12:38:00,041][15401] Updated weights for policy 0, policy_version 886944 (0.0034) [2024-06-25 12:38:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14531837952. Throughput: 0: 42844.5. Samples: 14532011680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 12:38:03,514][15401] Updated weights for policy 0, policy_version 886954 (0.0029) [2024-06-25 12:38:07,555][15401] Updated weights for policy 0, policy_version 886964 (0.0035) [2024-06-25 12:38:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14532034560. Throughput: 0: 42775.4. Samples: 14532142460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:08,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 12:38:11,144][15401] Updated weights for policy 0, policy_version 886974 (0.0026) [2024-06-25 12:38:13,389][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14532280320. Throughput: 0: 42934.8. Samples: 14532400980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:13,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 12:38:15,122][15401] Updated weights for policy 0, policy_version 886984 (0.0041) [2024-06-25 12:38:18,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 14532493312. Throughput: 0: 42952.7. Samples: 14532653340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:18,399][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 12:38:18,767][15401] Updated weights for policy 0, policy_version 886994 (0.0046) [2024-06-25 12:38:22,511][15401] Updated weights for policy 0, policy_version 887004 (0.0032) [2024-06-25 12:38:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 14532689920. Throughput: 0: 42905.9. Samples: 14532785460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 12:38:26,227][15401] Updated weights for policy 0, policy_version 887014 (0.0038) [2024-06-25 12:38:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42874.6, 300 sec: 42543.8). Total num frames: 14532902912. Throughput: 0: 42808.5. Samples: 14533038480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:28,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 12:38:30,150][15401] Updated weights for policy 0, policy_version 887024 (0.0042) [2024-06-25 12:38:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14533132288. Throughput: 0: 42929.7. Samples: 14533296880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 12:38:33,816][15401] Updated weights for policy 0, policy_version 887034 (0.0043) [2024-06-25 12:38:37,589][15401] Updated weights for policy 0, policy_version 887044 (0.0030) [2024-06-25 12:38:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14533345280. Throughput: 0: 42988.4. Samples: 14533431500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:38,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 12:38:41,527][15401] Updated weights for policy 0, policy_version 887054 (0.0032) [2024-06-25 12:38:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14533558272. Throughput: 0: 42869.7. Samples: 14533686620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 12:38:45,171][15401] Updated weights for policy 0, policy_version 887064 (0.0030) [2024-06-25 12:38:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14533771264. Throughput: 0: 43024.4. Samples: 14533947780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 12:38:49,293][15401] Updated weights for policy 0, policy_version 887074 (0.0033) [2024-06-25 12:38:52,737][15401] Updated weights for policy 0, policy_version 887084 (0.0042) [2024-06-25 12:38:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 14533984256. Throughput: 0: 42914.4. Samples: 14534073620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 12:38:55,698][15349] Signal inference workers to stop experience collection... (215150 times) [2024-06-25 12:38:55,698][15349] Signal inference workers to resume experience collection... (215150 times) [2024-06-25 12:38:55,734][15401] InferenceWorker_p0-w0: stopping experience collection (215150 times) [2024-06-25 12:38:55,734][15401] InferenceWorker_p0-w0: resuming experience collection (215150 times) [2024-06-25 12:38:56,821][15401] Updated weights for policy 0, policy_version 887094 (0.0037) [2024-06-25 12:38:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 14534197248. Throughput: 0: 42770.7. Samples: 14534325660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:38:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 12:39:00,663][15401] Updated weights for policy 0, policy_version 887104 (0.0028) [2024-06-25 12:39:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14534410240. Throughput: 0: 42893.4. Samples: 14534583540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:03,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 12:39:04,513][15401] Updated weights for policy 0, policy_version 887114 (0.0047) [2024-06-25 12:39:08,146][15401] Updated weights for policy 0, policy_version 887124 (0.0043) [2024-06-25 12:39:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.4, 300 sec: 42709.5). Total num frames: 14534639616. Throughput: 0: 42786.5. Samples: 14534710860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:08,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 12:39:12,035][15401] Updated weights for policy 0, policy_version 887134 (0.0032) [2024-06-25 12:39:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14534836224. Throughput: 0: 42783.6. Samples: 14534963740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:13,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 12:39:16,095][15401] Updated weights for policy 0, policy_version 887144 (0.0033) [2024-06-25 12:39:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14535065600. Throughput: 0: 42743.0. Samples: 14535220320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:18,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 12:39:20,174][15401] Updated weights for policy 0, policy_version 887154 (0.0037) [2024-06-25 12:39:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14535278592. Throughput: 0: 42601.3. Samples: 14535348560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 12:39:24,038][15401] Updated weights for policy 0, policy_version 887164 (0.0044) [2024-06-25 12:39:27,830][15401] Updated weights for policy 0, policy_version 887174 (0.0037) [2024-06-25 12:39:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14535491584. Throughput: 0: 42573.8. Samples: 14535602440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 12:39:28,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 12:39:31,589][15401] Updated weights for policy 0, policy_version 887184 (0.0034) [2024-06-25 12:39:33,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 14535688192. Throughput: 0: 42586.7. Samples: 14535864180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 12:39:35,186][15401] Updated weights for policy 0, policy_version 887194 (0.0027) [2024-06-25 12:39:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14535901184. Throughput: 0: 42634.8. Samples: 14535992180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 12:39:39,406][15401] Updated weights for policy 0, policy_version 887204 (0.0040) [2024-06-25 12:39:42,603][15401] Updated weights for policy 0, policy_version 887214 (0.0038) [2024-06-25 12:39:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14536130560. Throughput: 0: 42832.9. Samples: 14536253140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 12:39:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887216_14536146944.pth... [2024-06-25 12:39:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886590_14525890560.pth [2024-06-25 12:39:46,910][15401] Updated weights for policy 0, policy_version 887224 (0.0034) [2024-06-25 12:39:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14536343552. Throughput: 0: 42690.7. Samples: 14536504620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:48,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 12:39:50,754][15401] Updated weights for policy 0, policy_version 887234 (0.0031) [2024-06-25 12:39:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14536556544. Throughput: 0: 42553.4. Samples: 14536625760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:53,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 12:39:54,792][15401] Updated weights for policy 0, policy_version 887244 (0.0030) [2024-06-25 12:39:58,221][15401] Updated weights for policy 0, policy_version 887254 (0.0028) [2024-06-25 12:39:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14536769536. Throughput: 0: 42774.4. Samples: 14536888580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:39:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 12:40:02,338][15401] Updated weights for policy 0, policy_version 887264 (0.0032) [2024-06-25 12:40:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14536966144. Throughput: 0: 42718.9. Samples: 14537142660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 12:40:05,957][15401] Updated weights for policy 0, policy_version 887274 (0.0027) [2024-06-25 12:40:08,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14537195520. Throughput: 0: 42639.6. Samples: 14537267340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 12:40:09,870][15401] Updated weights for policy 0, policy_version 887284 (0.0037) [2024-06-25 12:40:13,392][15132] Fps is (10 sec: 44225.5, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14537408512. Throughput: 0: 42829.7. Samples: 14537529880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:13,392][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 12:40:13,506][15401] Updated weights for policy 0, policy_version 887294 (0.0030) [2024-06-25 12:40:17,621][15401] Updated weights for policy 0, policy_version 887304 (0.0030) [2024-06-25 12:40:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14537621504. Throughput: 0: 42597.7. Samples: 14537781080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 12:40:21,151][15349] Signal inference workers to stop experience collection... (215200 times) [2024-06-25 12:40:21,151][15349] Signal inference workers to resume experience collection... (215200 times) [2024-06-25 12:40:21,169][15401] InferenceWorker_p0-w0: stopping experience collection (215200 times) [2024-06-25 12:40:21,189][15401] InferenceWorker_p0-w0: resuming experience collection (215200 times) [2024-06-25 12:40:21,308][15401] Updated weights for policy 0, policy_version 887314 (0.0036) [2024-06-25 12:40:23,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14537850880. Throughput: 0: 42659.4. Samples: 14537911860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:23,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 12:40:25,410][15401] Updated weights for policy 0, policy_version 887324 (0.0032) [2024-06-25 12:40:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14538031104. Throughput: 0: 42547.9. Samples: 14538167800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 12:40:28,929][15401] Updated weights for policy 0, policy_version 887334 (0.0044) [2024-06-25 12:40:32,885][15401] Updated weights for policy 0, policy_version 887344 (0.0047) [2024-06-25 12:40:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14538276864. Throughput: 0: 42573.3. Samples: 14538420420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 12:40:36,993][15401] Updated weights for policy 0, policy_version 887354 (0.0041) [2024-06-25 12:40:38,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.9). Total num frames: 14538473472. Throughput: 0: 42800.5. Samples: 14538551780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 12:40:40,515][15401] Updated weights for policy 0, policy_version 887364 (0.0033) [2024-06-25 12:40:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14538670080. Throughput: 0: 42595.9. Samples: 14538805400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 12:40:44,890][15401] Updated weights for policy 0, policy_version 887374 (0.0042) [2024-06-25 12:40:48,119][15401] Updated weights for policy 0, policy_version 887384 (0.0027) [2024-06-25 12:40:48,389][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14538915840. Throughput: 0: 42564.3. Samples: 14539058060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 12:40:52,473][15401] Updated weights for policy 0, policy_version 887394 (0.0042) [2024-06-25 12:40:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 14539096064. Throughput: 0: 42794.3. Samples: 14539193080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:53,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 12:40:55,649][15401] Updated weights for policy 0, policy_version 887404 (0.0028) [2024-06-25 12:40:58,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42323.5, 300 sec: 42709.1). Total num frames: 14539309056. Throughput: 0: 42536.4. Samples: 14539444020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:40:58,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 12:41:00,195][15401] Updated weights for policy 0, policy_version 887414 (0.0033) [2024-06-25 12:41:03,234][15401] Updated weights for policy 0, policy_version 887424 (0.0028) [2024-06-25 12:41:03,391][15132] Fps is (10 sec: 45869.1, 60 sec: 43143.5, 300 sec: 42764.8). Total num frames: 14539554816. Throughput: 0: 42636.5. Samples: 14539699780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:41:03,391][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 12:41:07,687][15401] Updated weights for policy 0, policy_version 887434 (0.0039) [2024-06-25 12:41:08,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14539751424. Throughput: 0: 42681.4. Samples: 14539832520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:41:08,395][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 12:41:10,734][15401] Updated weights for policy 0, policy_version 887444 (0.0034) [2024-06-25 12:41:13,390][15132] Fps is (10 sec: 39326.3, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 14539948032. Throughput: 0: 42698.2. Samples: 14540089220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:41:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 12:41:15,155][15401] Updated weights for policy 0, policy_version 887454 (0.0039) [2024-06-25 12:41:18,326][15401] Updated weights for policy 0, policy_version 887464 (0.0033) [2024-06-25 12:41:18,396][15132] Fps is (10 sec: 45846.0, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 14540210176. Throughput: 0: 42721.1. Samples: 14540343140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:41:18,396][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 12:41:23,141][15401] Updated weights for policy 0, policy_version 887474 (0.0034) [2024-06-25 12:41:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14540390400. Throughput: 0: 42751.1. Samples: 14540475580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:41:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 12:41:26,015][15401] Updated weights for policy 0, policy_version 887484 (0.0034) [2024-06-25 12:41:28,392][15132] Fps is (10 sec: 37698.7, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 14540587008. Throughput: 0: 42750.3. Samples: 14540729260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:28,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 12:41:30,577][15401] Updated weights for policy 0, policy_version 887494 (0.0038) [2024-06-25 12:41:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14540832768. Throughput: 0: 42770.2. Samples: 14540982720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 12:41:33,608][15401] Updated weights for policy 0, policy_version 887504 (0.0031) [2024-06-25 12:41:38,236][15401] Updated weights for policy 0, policy_version 887514 (0.0025) [2024-06-25 12:41:38,390][15132] Fps is (10 sec: 44246.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14541029376. Throughput: 0: 42712.7. Samples: 14541115160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:38,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 12:41:41,606][15401] Updated weights for policy 0, policy_version 887524 (0.0030) [2024-06-25 12:41:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14541242368. Throughput: 0: 42822.3. Samples: 14541370920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 12:41:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887527_14541242368.pth... [2024-06-25 12:41:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000886901_14530985984.pth [2024-06-25 12:41:45,894][15401] Updated weights for policy 0, policy_version 887534 (0.0027) [2024-06-25 12:41:47,865][15349] Signal inference workers to stop experience collection... (215250 times) [2024-06-25 12:41:47,905][15401] InferenceWorker_p0-w0: stopping experience collection (215250 times) [2024-06-25 12:41:47,929][15349] Signal inference workers to resume experience collection... (215250 times) [2024-06-25 12:41:47,936][15401] InferenceWorker_p0-w0: resuming experience collection (215250 times) [2024-06-25 12:41:48,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14541488128. Throughput: 0: 42760.3. Samples: 14541623940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 12:41:49,014][15401] Updated weights for policy 0, policy_version 887544 (0.0038) [2024-06-25 12:41:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 14541668352. Throughput: 0: 42815.1. Samples: 14541759200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 12:41:53,713][15401] Updated weights for policy 0, policy_version 887554 (0.0041) [2024-06-25 12:41:56,902][15401] Updated weights for policy 0, policy_version 887564 (0.0027) [2024-06-25 12:41:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 14541881344. Throughput: 0: 42732.4. Samples: 14542012180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:41:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 12:42:01,362][15401] Updated weights for policy 0, policy_version 887574 (0.0033) [2024-06-25 12:42:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42872.4, 300 sec: 42765.0). Total num frames: 14542127104. Throughput: 0: 42797.3. Samples: 14542268740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 12:42:04,531][15401] Updated weights for policy 0, policy_version 887584 (0.0030) [2024-06-25 12:42:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14542307328. Throughput: 0: 42773.7. Samples: 14542400400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:08,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 12:42:09,129][15401] Updated weights for policy 0, policy_version 887594 (0.0037) [2024-06-25 12:42:12,055][15401] Updated weights for policy 0, policy_version 887604 (0.0024) [2024-06-25 12:42:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14542536704. Throughput: 0: 42713.7. Samples: 14542651280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 12:42:16,768][15401] Updated weights for policy 0, policy_version 887614 (0.0031) [2024-06-25 12:42:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 14542749696. Throughput: 0: 42805.9. Samples: 14542908980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 12:42:19,659][15401] Updated weights for policy 0, policy_version 887624 (0.0026) [2024-06-25 12:42:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.7). Total num frames: 14542946304. Throughput: 0: 42636.5. Samples: 14543033800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:23,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-25 12:42:24,308][15401] Updated weights for policy 0, policy_version 887634 (0.0029) [2024-06-25 12:42:27,810][15401] Updated weights for policy 0, policy_version 887644 (0.0036) [2024-06-25 12:42:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 14543175680. Throughput: 0: 42625.8. Samples: 14543289080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:28,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 12:42:32,201][15401] Updated weights for policy 0, policy_version 887654 (0.0038) [2024-06-25 12:42:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14543388672. Throughput: 0: 42722.7. Samples: 14543546460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 12:42:35,367][15401] Updated weights for policy 0, policy_version 887664 (0.0032) [2024-06-25 12:42:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14543585280. Throughput: 0: 42508.4. Samples: 14543672080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:38,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 12:42:39,884][15401] Updated weights for policy 0, policy_version 887674 (0.0033) [2024-06-25 12:42:42,898][15401] Updated weights for policy 0, policy_version 887684 (0.0036) [2024-06-25 12:42:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14543831040. Throughput: 0: 42637.5. Samples: 14543930860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:43,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 12:42:47,403][15401] Updated weights for policy 0, policy_version 887694 (0.0041) [2024-06-25 12:42:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14544027648. Throughput: 0: 42634.6. Samples: 14544187300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:48,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 12:42:50,978][15401] Updated weights for policy 0, policy_version 887704 (0.0047) [2024-06-25 12:42:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 14544224256. Throughput: 0: 42570.9. Samples: 14544316100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 12:42:54,921][15401] Updated weights for policy 0, policy_version 887714 (0.0035) [2024-06-25 12:42:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14544453632. Throughput: 0: 42631.0. Samples: 14544569680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:42:58,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 12:42:58,622][15401] Updated weights for policy 0, policy_version 887724 (0.0028) [2024-06-25 12:43:02,351][15401] Updated weights for policy 0, policy_version 887734 (0.0036) [2024-06-25 12:43:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14544666624. Throughput: 0: 42728.9. Samples: 14544831780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:43:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:43:06,200][15401] Updated weights for policy 0, policy_version 887744 (0.0044) [2024-06-25 12:43:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14544863232. Throughput: 0: 42792.3. Samples: 14544959460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:43:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 12:43:09,786][15349] Signal inference workers to stop experience collection... (215300 times) [2024-06-25 12:43:09,787][15349] Signal inference workers to resume experience collection... (215300 times) [2024-06-25 12:43:09,842][15401] InferenceWorker_p0-w0: stopping experience collection (215300 times) [2024-06-25 12:43:09,842][15401] InferenceWorker_p0-w0: resuming experience collection (215300 times) [2024-06-25 12:43:10,286][15401] Updated weights for policy 0, policy_version 887754 (0.0036) [2024-06-25 12:43:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14545092608. Throughput: 0: 42684.1. Samples: 14545209860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:43:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 12:43:14,004][15401] Updated weights for policy 0, policy_version 887764 (0.0032) [2024-06-25 12:43:17,862][15401] Updated weights for policy 0, policy_version 887774 (0.0040) [2024-06-25 12:43:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14545305600. Throughput: 0: 42948.5. Samples: 14545479140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 12:43:21,879][15401] Updated weights for policy 0, policy_version 887784 (0.0034) [2024-06-25 12:43:23,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14545518592. Throughput: 0: 42992.4. Samples: 14545606840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:23,393][15132] Avg episode reward: [(0, '0.198')] [2024-06-25 12:43:25,634][15401] Updated weights for policy 0, policy_version 887794 (0.0036) [2024-06-25 12:43:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14545747968. Throughput: 0: 42817.7. Samples: 14545857660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:43:29,448][15401] Updated weights for policy 0, policy_version 887804 (0.0029) [2024-06-25 12:43:33,071][15401] Updated weights for policy 0, policy_version 887814 (0.0031) [2024-06-25 12:43:33,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14545960960. Throughput: 0: 42889.8. Samples: 14546117340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 12:43:36,939][15401] Updated weights for policy 0, policy_version 887824 (0.0032) [2024-06-25 12:43:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14546157568. Throughput: 0: 42937.0. Samples: 14546248260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:38,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 12:43:40,838][15401] Updated weights for policy 0, policy_version 887834 (0.0040) [2024-06-25 12:43:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14546386944. Throughput: 0: 42903.6. Samples: 14546500340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 12:43:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887841_14546386944.pth... [2024-06-25 12:43:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887216_14536146944.pth [2024-06-25 12:43:44,969][15401] Updated weights for policy 0, policy_version 887844 (0.0041) [2024-06-25 12:43:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14546583552. Throughput: 0: 42759.0. Samples: 14546755940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 12:43:48,448][15401] Updated weights for policy 0, policy_version 887854 (0.0025) [2024-06-25 12:43:52,540][15401] Updated weights for policy 0, policy_version 887864 (0.0031) [2024-06-25 12:43:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14546780160. Throughput: 0: 42787.6. Samples: 14546884900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 12:43:55,902][15401] Updated weights for policy 0, policy_version 887874 (0.0037) [2024-06-25 12:43:58,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14547025920. Throughput: 0: 42832.3. Samples: 14547137420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:43:58,392][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 12:44:00,266][15401] Updated weights for policy 0, policy_version 887884 (0.0031) [2024-06-25 12:44:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14547238912. Throughput: 0: 42704.4. Samples: 14547400840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 12:44:03,509][15401] Updated weights for policy 0, policy_version 887894 (0.0035) [2024-06-25 12:44:08,285][15401] Updated weights for policy 0, policy_version 887904 (0.0034) [2024-06-25 12:44:08,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14547419136. Throughput: 0: 42617.4. Samples: 14547524520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:08,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 12:44:11,351][15401] Updated weights for policy 0, policy_version 887914 (0.0031) [2024-06-25 12:44:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14547664896. Throughput: 0: 42696.5. Samples: 14547779000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 12:44:15,848][15401] Updated weights for policy 0, policy_version 887924 (0.0036) [2024-06-25 12:44:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 14547861504. Throughput: 0: 42706.0. Samples: 14548039120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:18,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 12:44:18,957][15401] Updated weights for policy 0, policy_version 887934 (0.0033) [2024-06-25 12:44:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 14548058112. Throughput: 0: 42545.4. Samples: 14548162800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 12:44:23,440][15401] Updated weights for policy 0, policy_version 887944 (0.0030) [2024-06-25 12:44:26,266][15349] Signal inference workers to stop experience collection... (215350 times) [2024-06-25 12:44:26,267][15349] Signal inference workers to resume experience collection... (215350 times) [2024-06-25 12:44:26,309][15401] InferenceWorker_p0-w0: stopping experience collection (215350 times) [2024-06-25 12:44:26,309][15401] InferenceWorker_p0-w0: resuming experience collection (215350 times) [2024-06-25 12:44:26,653][15401] Updated weights for policy 0, policy_version 887954 (0.0029) [2024-06-25 12:44:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14548303872. Throughput: 0: 42599.2. Samples: 14548417300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 12:44:30,963][15401] Updated weights for policy 0, policy_version 887964 (0.0030) [2024-06-25 12:44:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14548500480. Throughput: 0: 42867.2. Samples: 14548684960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 12:44:34,183][15401] Updated weights for policy 0, policy_version 887974 (0.0026) [2024-06-25 12:44:38,371][15401] Updated weights for policy 0, policy_version 887984 (0.0043) [2024-06-25 12:44:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14548729856. Throughput: 0: 42804.9. Samples: 14548811120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:38,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 12:44:41,541][15401] Updated weights for policy 0, policy_version 887994 (0.0030) [2024-06-25 12:44:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14548959232. Throughput: 0: 42892.0. Samples: 14549067460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 12:44:45,763][15401] Updated weights for policy 0, policy_version 888004 (0.0026) [2024-06-25 12:44:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14549155840. Throughput: 0: 42892.0. Samples: 14549330980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:48,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 12:44:49,278][15401] Updated weights for policy 0, policy_version 888014 (0.0029) [2024-06-25 12:44:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 14549368832. Throughput: 0: 42930.6. Samples: 14549456400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:53,404][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 12:44:53,743][15401] Updated weights for policy 0, policy_version 888024 (0.0037) [2024-06-25 12:44:56,934][15401] Updated weights for policy 0, policy_version 888034 (0.0031) [2024-06-25 12:44:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14549581824. Throughput: 0: 42851.6. Samples: 14549707320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:44:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 12:45:01,467][15401] Updated weights for policy 0, policy_version 888044 (0.0026) [2024-06-25 12:45:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14549811200. Throughput: 0: 42908.6. Samples: 14549970000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:45:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 12:45:04,389][15401] Updated weights for policy 0, policy_version 888054 (0.0039) [2024-06-25 12:45:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42765.3). Total num frames: 14550024192. Throughput: 0: 43129.1. Samples: 14550103620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 12:45:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 12:45:09,079][15401] Updated weights for policy 0, policy_version 888064 (0.0031) [2024-06-25 12:45:12,249][15401] Updated weights for policy 0, policy_version 888074 (0.0035) [2024-06-25 12:45:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14550237184. Throughput: 0: 43115.8. Samples: 14550357520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 12:45:16,749][15401] Updated weights for policy 0, policy_version 888084 (0.0032) [2024-06-25 12:45:18,393][15132] Fps is (10 sec: 44220.5, 60 sec: 43414.9, 300 sec: 42764.5). Total num frames: 14550466560. Throughput: 0: 42916.8. Samples: 14550616380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:18,394][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 12:45:19,787][15401] Updated weights for policy 0, policy_version 888094 (0.0038) [2024-06-25 12:45:23,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14550663168. Throughput: 0: 43005.0. Samples: 14550746340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 12:45:24,127][15401] Updated weights for policy 0, policy_version 888104 (0.0031) [2024-06-25 12:45:27,356][15401] Updated weights for policy 0, policy_version 888114 (0.0041) [2024-06-25 12:45:28,390][15132] Fps is (10 sec: 42614.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14550892544. Throughput: 0: 43007.1. Samples: 14551002780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:28,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 12:45:32,246][15401] Updated weights for policy 0, policy_version 888124 (0.0034) [2024-06-25 12:45:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14551089152. Throughput: 0: 42905.0. Samples: 14551261700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:33,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 12:45:35,021][15401] Updated weights for policy 0, policy_version 888134 (0.0026) [2024-06-25 12:45:37,155][15349] Signal inference workers to stop experience collection... (215400 times) [2024-06-25 12:45:37,160][15349] Signal inference workers to resume experience collection... (215400 times) [2024-06-25 12:45:37,195][15401] InferenceWorker_p0-w0: stopping experience collection (215400 times) [2024-06-25 12:45:37,195][15401] InferenceWorker_p0-w0: resuming experience collection (215400 times) [2024-06-25 12:45:38,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 14551318528. Throughput: 0: 42953.8. Samples: 14551389420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:38,392][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 12:45:39,685][15401] Updated weights for policy 0, policy_version 888144 (0.0038) [2024-06-25 12:45:42,486][15401] Updated weights for policy 0, policy_version 888154 (0.0031) [2024-06-25 12:45:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14551547904. Throughput: 0: 43178.1. Samples: 14551650340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:43,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 12:45:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888156_14551547904.pth... [2024-06-25 12:45:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887527_14541242368.pth [2024-06-25 12:45:47,042][15401] Updated weights for policy 0, policy_version 888164 (0.0030) [2024-06-25 12:45:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14551744512. Throughput: 0: 43154.1. Samples: 14551911940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:48,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 12:45:50,251][15401] Updated weights for policy 0, policy_version 888174 (0.0034) [2024-06-25 12:45:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 14551957504. Throughput: 0: 42900.2. Samples: 14552034120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:45:54,481][15401] Updated weights for policy 0, policy_version 888184 (0.0032) [2024-06-25 12:45:57,915][15401] Updated weights for policy 0, policy_version 888194 (0.0039) [2024-06-25 12:45:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42820.7). Total num frames: 14552186880. Throughput: 0: 43110.3. Samples: 14552297480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:45:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:46:01,880][15401] Updated weights for policy 0, policy_version 888204 (0.0040) [2024-06-25 12:46:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14552367104. Throughput: 0: 43144.5. Samples: 14552557720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 12:46:05,431][15401] Updated weights for policy 0, policy_version 888214 (0.0036) [2024-06-25 12:46:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14552612864. Throughput: 0: 42959.0. Samples: 14552679500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:46:09,839][15401] Updated weights for policy 0, policy_version 888224 (0.0040) [2024-06-25 12:46:12,935][15401] Updated weights for policy 0, policy_version 888234 (0.0038) [2024-06-25 12:46:13,390][15132] Fps is (10 sec: 47509.3, 60 sec: 43417.0, 300 sec: 42821.3). Total num frames: 14552842240. Throughput: 0: 43125.9. Samples: 14552943480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:13,391][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 12:46:17,590][15401] Updated weights for policy 0, policy_version 888244 (0.0030) [2024-06-25 12:46:18,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42328.0, 300 sec: 42765.0). Total num frames: 14553006080. Throughput: 0: 43113.2. Samples: 14553201800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:18,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 12:46:20,568][15401] Updated weights for policy 0, policy_version 888254 (0.0039) [2024-06-25 12:46:23,389][15132] Fps is (10 sec: 40963.8, 60 sec: 43144.4, 300 sec: 42932.0). Total num frames: 14553251840. Throughput: 0: 42958.3. Samples: 14553322440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 12:46:25,059][15401] Updated weights for policy 0, policy_version 888264 (0.0047) [2024-06-25 12:46:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14553464832. Throughput: 0: 42997.9. Samples: 14553585240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 12:46:28,445][15401] Updated weights for policy 0, policy_version 888274 (0.0035) [2024-06-25 12:46:32,499][15401] Updated weights for policy 0, policy_version 888284 (0.0042) [2024-06-25 12:46:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14553677824. Throughput: 0: 43008.9. Samples: 14553847340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:33,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 12:46:36,058][15401] Updated weights for policy 0, policy_version 888294 (0.0030) [2024-06-25 12:46:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14553890816. Throughput: 0: 43078.7. Samples: 14553972660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 12:46:40,011][15401] Updated weights for policy 0, policy_version 888304 (0.0036) [2024-06-25 12:46:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14554120192. Throughput: 0: 42996.1. Samples: 14554232300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 12:46:43,518][15401] Updated weights for policy 0, policy_version 888314 (0.0038) [2024-06-25 12:46:47,449][15401] Updated weights for policy 0, policy_version 888324 (0.0033) [2024-06-25 12:46:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14554316800. Throughput: 0: 43104.9. Samples: 14554497440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:48,394][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 12:46:50,960][15401] Updated weights for policy 0, policy_version 888334 (0.0036) [2024-06-25 12:46:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 14554546176. Throughput: 0: 43049.8. Samples: 14554616740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 12:46:55,151][15401] Updated weights for policy 0, policy_version 888344 (0.0030) [2024-06-25 12:46:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14554775552. Throughput: 0: 42980.4. Samples: 14554877560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-25 12:46:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 12:46:58,415][15401] Updated weights for policy 0, policy_version 888354 (0.0029) [2024-06-25 12:47:03,021][15401] Updated weights for policy 0, policy_version 888364 (0.0035) [2024-06-25 12:47:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 14554988544. Throughput: 0: 43107.6. Samples: 14555141640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 12:47:06,475][15401] Updated weights for policy 0, policy_version 888374 (0.0032) [2024-06-25 12:47:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14555185152. Throughput: 0: 43233.2. Samples: 14555267940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:08,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 12:47:10,663][15401] Updated weights for policy 0, policy_version 888384 (0.0038) [2024-06-25 12:47:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42872.2, 300 sec: 42931.6). Total num frames: 14555414528. Throughput: 0: 42940.5. Samples: 14555517560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:13,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 12:47:13,887][15401] Updated weights for policy 0, policy_version 888394 (0.0032) [2024-06-25 12:47:18,028][15349] Signal inference workers to stop experience collection... (215450 times) [2024-06-25 12:47:18,028][15349] Signal inference workers to resume experience collection... (215450 times) [2024-06-25 12:47:18,037][15401] Updated weights for policy 0, policy_version 888404 (0.0035) [2024-06-25 12:47:18,060][15401] InferenceWorker_p0-w0: stopping experience collection (215450 times) [2024-06-25 12:47:18,060][15401] InferenceWorker_p0-w0: resuming experience collection (215450 times) [2024-06-25 12:47:18,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43690.8, 300 sec: 42987.2). Total num frames: 14555627520. Throughput: 0: 43134.4. Samples: 14555788380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 12:47:21,666][15401] Updated weights for policy 0, policy_version 888414 (0.0029) [2024-06-25 12:47:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14555824128. Throughput: 0: 43235.4. Samples: 14555918260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 12:47:25,536][15401] Updated weights for policy 0, policy_version 888424 (0.0038) [2024-06-25 12:47:28,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 14556069888. Throughput: 0: 43219.8. Samples: 14556177200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:28,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 12:47:29,179][15401] Updated weights for policy 0, policy_version 888434 (0.0032) [2024-06-25 12:47:32,911][15401] Updated weights for policy 0, policy_version 888444 (0.0024) [2024-06-25 12:47:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 14556282880. Throughput: 0: 43068.5. Samples: 14556435520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 12:47:36,721][15401] Updated weights for policy 0, policy_version 888454 (0.0030) [2024-06-25 12:47:38,396][15132] Fps is (10 sec: 39296.7, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 14556463104. Throughput: 0: 43370.2. Samples: 14556568680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:38,397][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 12:47:40,740][15401] Updated weights for policy 0, policy_version 888464 (0.0026) [2024-06-25 12:47:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 14556708864. Throughput: 0: 43031.6. Samples: 14556813980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:43,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 12:47:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888471_14556708864.pth... [2024-06-25 12:47:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000887841_14546386944.pth [2024-06-25 12:47:44,587][15401] Updated weights for policy 0, policy_version 888474 (0.0038) [2024-06-25 12:47:48,389][15132] Fps is (10 sec: 44265.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14556905472. Throughput: 0: 43002.7. Samples: 14557076760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 12:47:48,477][15401] Updated weights for policy 0, policy_version 888484 (0.0033) [2024-06-25 12:47:52,189][15401] Updated weights for policy 0, policy_version 888494 (0.0026) [2024-06-25 12:47:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14557118464. Throughput: 0: 42993.4. Samples: 14557202640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:53,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 12:47:55,935][15401] Updated weights for policy 0, policy_version 888504 (0.0032) [2024-06-25 12:47:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14557347840. Throughput: 0: 43152.8. Samples: 14557459440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:47:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 12:47:59,754][15401] Updated weights for policy 0, policy_version 888514 (0.0027) [2024-06-25 12:48:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 14557560832. Throughput: 0: 42877.7. Samples: 14557717880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 12:48:03,591][15401] Updated weights for policy 0, policy_version 888524 (0.0032) [2024-06-25 12:48:07,830][15401] Updated weights for policy 0, policy_version 888534 (0.0029) [2024-06-25 12:48:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 14557757440. Throughput: 0: 42840.1. Samples: 14557846060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 12:48:11,338][15401] Updated weights for policy 0, policy_version 888544 (0.0038) [2024-06-25 12:48:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 43142.7, 300 sec: 43042.3). Total num frames: 14558003200. Throughput: 0: 42673.4. Samples: 14558097600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:13,392][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 12:48:15,596][15401] Updated weights for policy 0, policy_version 888554 (0.0042) [2024-06-25 12:48:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42987.5). Total num frames: 14558199808. Throughput: 0: 42749.6. Samples: 14558359260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:18,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 12:48:19,011][15401] Updated weights for policy 0, policy_version 888564 (0.0038) [2024-06-25 12:48:23,318][15401] Updated weights for policy 0, policy_version 888574 (0.0040) [2024-06-25 12:48:23,389][15132] Fps is (10 sec: 39331.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14558396416. Throughput: 0: 42457.8. Samples: 14558479000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:23,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 12:48:26,601][15401] Updated weights for policy 0, policy_version 888584 (0.0038) [2024-06-25 12:48:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14558625792. Throughput: 0: 42822.6. Samples: 14558741000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 12:48:28,411][15349] Signal inference workers to stop experience collection... (215500 times) [2024-06-25 12:48:28,456][15401] InferenceWorker_p0-w0: stopping experience collection (215500 times) [2024-06-25 12:48:28,466][15349] Signal inference workers to resume experience collection... (215500 times) [2024-06-25 12:48:28,473][15401] InferenceWorker_p0-w0: resuming experience collection (215500 times) [2024-06-25 12:48:30,941][15401] Updated weights for policy 0, policy_version 888594 (0.0027) [2024-06-25 12:48:33,394][15132] Fps is (10 sec: 44217.3, 60 sec: 42595.3, 300 sec: 42986.5). Total num frames: 14558838784. Throughput: 0: 42904.3. Samples: 14559007640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:33,394][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 12:48:34,143][15401] Updated weights for policy 0, policy_version 888604 (0.0032) [2024-06-25 12:48:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 14559035392. Throughput: 0: 42728.1. Samples: 14559125400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:38,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-25 12:48:38,624][15401] Updated weights for policy 0, policy_version 888614 (0.0041) [2024-06-25 12:48:41,907][15401] Updated weights for policy 0, policy_version 888624 (0.0033) [2024-06-25 12:48:43,389][15132] Fps is (10 sec: 44256.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 14559281152. Throughput: 0: 42760.9. Samples: 14559383680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 12:48:46,311][15401] Updated weights for policy 0, policy_version 888634 (0.0031) [2024-06-25 12:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 14559461376. Throughput: 0: 42825.0. Samples: 14559645000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:48,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 12:48:49,846][15401] Updated weights for policy 0, policy_version 888644 (0.0026) [2024-06-25 12:48:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 14559674368. Throughput: 0: 42592.9. Samples: 14559762740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 12:48:53,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 12:48:54,144][15401] Updated weights for policy 0, policy_version 888654 (0.0031) [2024-06-25 12:48:57,502][15401] Updated weights for policy 0, policy_version 888664 (0.0024) [2024-06-25 12:48:58,390][15132] Fps is (10 sec: 45870.7, 60 sec: 42870.9, 300 sec: 42987.0). Total num frames: 14559920128. Throughput: 0: 42821.5. Samples: 14560024500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:48:58,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 12:49:01,995][15401] Updated weights for policy 0, policy_version 888674 (0.0042) [2024-06-25 12:49:03,392][15132] Fps is (10 sec: 42587.2, 60 sec: 42323.5, 300 sec: 42986.8). Total num frames: 14560100352. Throughput: 0: 42783.9. Samples: 14560284640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:03,393][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 12:49:04,989][15401] Updated weights for policy 0, policy_version 888684 (0.0041) [2024-06-25 12:49:08,389][15132] Fps is (10 sec: 39325.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14560313344. Throughput: 0: 42758.2. Samples: 14560403120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 12:49:09,479][15401] Updated weights for policy 0, policy_version 888694 (0.0029) [2024-06-25 12:49:12,466][15401] Updated weights for policy 0, policy_version 888704 (0.0039) [2024-06-25 12:49:13,390][15132] Fps is (10 sec: 45887.0, 60 sec: 42600.1, 300 sec: 43042.7). Total num frames: 14560559104. Throughput: 0: 42795.2. Samples: 14560666780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 12:49:17,138][15401] Updated weights for policy 0, policy_version 888714 (0.0034) [2024-06-25 12:49:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42987.2). Total num frames: 14560739328. Throughput: 0: 42693.8. Samples: 14560928680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 12:49:20,375][15401] Updated weights for policy 0, policy_version 888724 (0.0025) [2024-06-25 12:49:21,312][15349] Signal inference workers to stop experience collection... (215550 times) [2024-06-25 12:49:21,313][15349] Signal inference workers to resume experience collection... (215550 times) [2024-06-25 12:49:21,347][15401] InferenceWorker_p0-w0: stopping experience collection (215550 times) [2024-06-25 12:49:21,348][15401] InferenceWorker_p0-w0: resuming experience collection (215550 times) [2024-06-25 12:49:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14560952320. Throughput: 0: 42657.3. Samples: 14561044980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:23,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 12:49:25,039][15401] Updated weights for policy 0, policy_version 888734 (0.0037) [2024-06-25 12:49:27,937][15401] Updated weights for policy 0, policy_version 888744 (0.0040) [2024-06-25 12:49:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 14561198080. Throughput: 0: 42670.3. Samples: 14561303840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 12:49:32,709][15401] Updated weights for policy 0, policy_version 888754 (0.0045) [2024-06-25 12:49:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42328.3, 300 sec: 42876.1). Total num frames: 14561378304. Throughput: 0: 42671.0. Samples: 14561565200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:33,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 12:49:35,594][15401] Updated weights for policy 0, policy_version 888764 (0.0030) [2024-06-25 12:49:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 14561607680. Throughput: 0: 42772.3. Samples: 14561687600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:38,392][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 12:49:40,281][15401] Updated weights for policy 0, policy_version 888774 (0.0027) [2024-06-25 12:49:43,295][15401] Updated weights for policy 0, policy_version 888784 (0.0029) [2024-06-25 12:49:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 14561837056. Throughput: 0: 42734.1. Samples: 14561947500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 12:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888784_14561837056.pth... [2024-06-25 12:49:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888156_14551547904.pth [2024-06-25 12:49:47,851][15401] Updated weights for policy 0, policy_version 888794 (0.0031) [2024-06-25 12:49:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14562017280. Throughput: 0: 42679.8. Samples: 14562205120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 12:49:51,180][15401] Updated weights for policy 0, policy_version 888804 (0.0040) [2024-06-25 12:49:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42987.1). Total num frames: 14562263040. Throughput: 0: 42867.4. Samples: 14562332160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 12:49:55,528][15401] Updated weights for policy 0, policy_version 888814 (0.0033) [2024-06-25 12:49:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42599.0, 300 sec: 42931.6). Total num frames: 14562476032. Throughput: 0: 42778.2. Samples: 14562591800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:49:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 12:49:58,601][15401] Updated weights for policy 0, policy_version 888824 (0.0032) [2024-06-25 12:50:03,013][15401] Updated weights for policy 0, policy_version 888834 (0.0036) [2024-06-25 12:50:03,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42873.4, 300 sec: 42876.1). Total num frames: 14562672640. Throughput: 0: 42560.1. Samples: 14562843880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 12:50:06,174][15401] Updated weights for policy 0, policy_version 888844 (0.0023) [2024-06-25 12:50:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14562902016. Throughput: 0: 42741.8. Samples: 14562968360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:08,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 12:50:10,658][15401] Updated weights for policy 0, policy_version 888854 (0.0037) [2024-06-25 12:50:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42821.1). Total num frames: 14563098624. Throughput: 0: 42803.9. Samples: 14563230020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 12:50:13,754][15401] Updated weights for policy 0, policy_version 888864 (0.0035) [2024-06-25 12:50:18,265][15401] Updated weights for policy 0, policy_version 888874 (0.0041) [2024-06-25 12:50:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14563311616. Throughput: 0: 42612.8. Samples: 14563482780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:18,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 12:50:21,489][15401] Updated weights for policy 0, policy_version 888884 (0.0039) [2024-06-25 12:50:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14563524608. Throughput: 0: 42650.7. Samples: 14563606780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:23,390][15132] Avg episode reward: [(0, '0.874')] [2024-06-25 12:50:25,943][15401] Updated weights for policy 0, policy_version 888894 (0.0042) [2024-06-25 12:50:27,383][15349] Signal inference workers to stop experience collection... (215600 times) [2024-06-25 12:50:27,388][15349] Signal inference workers to resume experience collection... (215600 times) [2024-06-25 12:50:27,405][15401] InferenceWorker_p0-w0: stopping experience collection (215600 times) [2024-06-25 12:50:27,405][15401] InferenceWorker_p0-w0: resuming experience collection (215600 times) [2024-06-25 12:50:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 14563737600. Throughput: 0: 42689.9. Samples: 14563868540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:28,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 12:50:29,043][15401] Updated weights for policy 0, policy_version 888904 (0.0036) [2024-06-25 12:50:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14563950592. Throughput: 0: 42723.5. Samples: 14564127680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:33,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 12:50:33,496][15401] Updated weights for policy 0, policy_version 888914 (0.0024) [2024-06-25 12:50:37,028][15401] Updated weights for policy 0, policy_version 888924 (0.0033) [2024-06-25 12:50:38,391][15132] Fps is (10 sec: 44231.0, 60 sec: 42872.4, 300 sec: 42820.4). Total num frames: 14564179968. Throughput: 0: 42596.4. Samples: 14564249040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:38,391][15132] Avg episode reward: [(0, '0.223')] [2024-06-25 12:50:41,322][15401] Updated weights for policy 0, policy_version 888934 (0.0028) [2024-06-25 12:50:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14564392960. Throughput: 0: 42676.0. Samples: 14564512220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 12:50:44,424][15401] Updated weights for policy 0, policy_version 888944 (0.0029) [2024-06-25 12:50:48,390][15132] Fps is (10 sec: 40964.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14564589568. Throughput: 0: 42937.2. Samples: 14564776060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-25 12:50:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 12:50:48,738][15401] Updated weights for policy 0, policy_version 888954 (0.0034) [2024-06-25 12:50:52,460][15401] Updated weights for policy 0, policy_version 888964 (0.0041) [2024-06-25 12:50:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14564818944. Throughput: 0: 42938.2. Samples: 14564900580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:50:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 12:50:56,489][15401] Updated weights for policy 0, policy_version 888974 (0.0038) [2024-06-25 12:50:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14565031936. Throughput: 0: 42718.6. Samples: 14565152360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:50:58,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 12:50:59,923][15401] Updated weights for policy 0, policy_version 888984 (0.0034) [2024-06-25 12:51:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14565244928. Throughput: 0: 43023.1. Samples: 14565418820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 12:51:03,767][15401] Updated weights for policy 0, policy_version 888994 (0.0040) [2024-06-25 12:51:07,446][15401] Updated weights for policy 0, policy_version 889004 (0.0039) [2024-06-25 12:51:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.7). Total num frames: 14565474304. Throughput: 0: 43094.7. Samples: 14565546040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 12:51:11,590][15401] Updated weights for policy 0, policy_version 889014 (0.0046) [2024-06-25 12:51:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14565670912. Throughput: 0: 43016.2. Samples: 14565804280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:13,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 12:51:15,043][15401] Updated weights for policy 0, policy_version 889024 (0.0039) [2024-06-25 12:51:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14565900288. Throughput: 0: 43027.7. Samples: 14566063920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:18,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 12:51:19,123][15401] Updated weights for policy 0, policy_version 889034 (0.0032) [2024-06-25 12:51:22,547][15401] Updated weights for policy 0, policy_version 889044 (0.0034) [2024-06-25 12:51:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 14566129664. Throughput: 0: 43219.2. Samples: 14566193860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:23,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 12:51:26,555][15401] Updated weights for policy 0, policy_version 889054 (0.0033) [2024-06-25 12:51:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42875.8). Total num frames: 14566326272. Throughput: 0: 43181.3. Samples: 14566455480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:28,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 12:51:30,239][15401] Updated weights for policy 0, policy_version 889064 (0.0028) [2024-06-25 12:51:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14566555648. Throughput: 0: 43042.8. Samples: 14566712980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:33,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 12:51:33,945][15401] Updated weights for policy 0, policy_version 889074 (0.0026) [2024-06-25 12:51:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 14566735872. Throughput: 0: 43104.6. Samples: 14566840280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:38,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 12:51:38,418][15401] Updated weights for policy 0, policy_version 889084 (0.0038) [2024-06-25 12:51:41,548][15401] Updated weights for policy 0, policy_version 889094 (0.0031) [2024-06-25 12:51:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14566981632. Throughput: 0: 43203.2. Samples: 14567096500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:43,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 12:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889098_14566981632.pth... [2024-06-25 12:51:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888471_14556708864.pth [2024-06-25 12:51:45,962][15401] Updated weights for policy 0, policy_version 889104 (0.0036) [2024-06-25 12:51:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 14567194624. Throughput: 0: 43043.3. Samples: 14567355760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 12:51:49,639][15401] Updated weights for policy 0, policy_version 889114 (0.0043) [2024-06-25 12:51:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14567391232. Throughput: 0: 43116.0. Samples: 14567486260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:53,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 12:51:53,428][15401] Updated weights for policy 0, policy_version 889124 (0.0038) [2024-06-25 12:51:56,994][15349] Signal inference workers to stop experience collection... (215650 times) [2024-06-25 12:51:57,044][15401] InferenceWorker_p0-w0: stopping experience collection (215650 times) [2024-06-25 12:51:57,049][15349] Signal inference workers to resume experience collection... (215650 times) [2024-06-25 12:51:57,056][15401] InferenceWorker_p0-w0: resuming experience collection (215650 times) [2024-06-25 12:51:57,059][15401] Updated weights for policy 0, policy_version 889134 (0.0032) [2024-06-25 12:51:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14567604224. Throughput: 0: 43133.9. Samples: 14567745300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:51:58,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 12:52:01,018][15401] Updated weights for policy 0, policy_version 889144 (0.0028) [2024-06-25 12:52:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 14567849984. Throughput: 0: 43031.9. Samples: 14568000360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:03,390][15132] Avg episode reward: [(0, '0.248')] [2024-06-25 12:52:04,594][15401] Updated weights for policy 0, policy_version 889154 (0.0033) [2024-06-25 12:52:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14568046592. Throughput: 0: 43085.9. Samples: 14568132720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:08,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 12:52:08,716][15401] Updated weights for policy 0, policy_version 889164 (0.0033) [2024-06-25 12:52:12,230][15401] Updated weights for policy 0, policy_version 889174 (0.0041) [2024-06-25 12:52:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14568259584. Throughput: 0: 42969.4. Samples: 14568389000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:13,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 12:52:16,046][15401] Updated weights for policy 0, policy_version 889184 (0.0027) [2024-06-25 12:52:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 14568488960. Throughput: 0: 42805.3. Samples: 14568639220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:18,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 12:52:19,863][15401] Updated weights for policy 0, policy_version 889194 (0.0033) [2024-06-25 12:52:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14568701952. Throughput: 0: 42934.5. Samples: 14568772340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 12:52:23,610][15401] Updated weights for policy 0, policy_version 889204 (0.0034) [2024-06-25 12:52:28,010][15401] Updated weights for policy 0, policy_version 889214 (0.0030) [2024-06-25 12:52:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14568898560. Throughput: 0: 42923.5. Samples: 14569028060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:28,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 12:52:31,726][15401] Updated weights for policy 0, policy_version 889224 (0.0035) [2024-06-25 12:52:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42932.6). Total num frames: 14569127936. Throughput: 0: 42678.5. Samples: 14569276300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:33,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 12:52:35,934][15401] Updated weights for policy 0, policy_version 889234 (0.0024) [2024-06-25 12:52:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14569324544. Throughput: 0: 42704.9. Samples: 14569407980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 12:52:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 12:52:39,652][15401] Updated weights for policy 0, policy_version 889244 (0.0028) [2024-06-25 12:52:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14569521152. Throughput: 0: 42504.0. Samples: 14569657980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:52:43,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 12:52:43,723][15401] Updated weights for policy 0, policy_version 889254 (0.0041) [2024-06-25 12:52:47,383][15401] Updated weights for policy 0, policy_version 889264 (0.0037) [2024-06-25 12:52:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14569766912. Throughput: 0: 42488.0. Samples: 14569912320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:52:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 12:52:51,383][15401] Updated weights for policy 0, policy_version 889274 (0.0029) [2024-06-25 12:52:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14569963520. Throughput: 0: 42527.4. Samples: 14570046460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:52:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 12:52:55,150][15401] Updated weights for policy 0, policy_version 889284 (0.0029) [2024-06-25 12:52:58,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14570160128. Throughput: 0: 42376.8. Samples: 14570295960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:52:58,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 12:52:58,888][15401] Updated weights for policy 0, policy_version 889294 (0.0039) [2024-06-25 12:53:02,593][15401] Updated weights for policy 0, policy_version 889304 (0.0039) [2024-06-25 12:53:03,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42593.9, 300 sec: 42875.2). Total num frames: 14570405888. Throughput: 0: 42435.8. Samples: 14570549100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:03,397][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 12:53:06,487][15401] Updated weights for policy 0, policy_version 889314 (0.0027) [2024-06-25 12:53:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14570618880. Throughput: 0: 42493.0. Samples: 14570684520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:08,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-25 12:53:10,251][15401] Updated weights for policy 0, policy_version 889324 (0.0033) [2024-06-25 12:53:13,392][15132] Fps is (10 sec: 39337.3, 60 sec: 42323.6, 300 sec: 42709.2). Total num frames: 14570799104. Throughput: 0: 42371.5. Samples: 14570934880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:13,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 12:53:13,937][15349] Signal inference workers to stop experience collection... (215700 times) [2024-06-25 12:53:13,937][15349] Signal inference workers to resume experience collection... (215700 times) [2024-06-25 12:53:13,988][15401] InferenceWorker_p0-w0: stopping experience collection (215700 times) [2024-06-25 12:53:13,988][15401] InferenceWorker_p0-w0: resuming experience collection (215700 times) [2024-06-25 12:53:14,076][15401] Updated weights for policy 0, policy_version 889334 (0.0042) [2024-06-25 12:53:17,933][15401] Updated weights for policy 0, policy_version 889344 (0.0038) [2024-06-25 12:53:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14571044864. Throughput: 0: 42553.8. Samples: 14571191220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 12:53:21,815][15401] Updated weights for policy 0, policy_version 889354 (0.0035) [2024-06-25 12:53:23,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14571241472. Throughput: 0: 42592.9. Samples: 14571324660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:23,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 12:53:25,451][15401] Updated weights for policy 0, policy_version 889364 (0.0028) [2024-06-25 12:53:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 14571454464. Throughput: 0: 42661.7. Samples: 14571577760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:28,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 12:53:29,322][15401] Updated weights for policy 0, policy_version 889374 (0.0028) [2024-06-25 12:53:32,983][15401] Updated weights for policy 0, policy_version 889384 (0.0032) [2024-06-25 12:53:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 14571700224. Throughput: 0: 42857.0. Samples: 14571840880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:33,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 12:53:36,919][15401] Updated weights for policy 0, policy_version 889394 (0.0037) [2024-06-25 12:53:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14571896832. Throughput: 0: 42826.3. Samples: 14571973640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:38,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 12:53:40,684][15401] Updated weights for policy 0, policy_version 889404 (0.0033) [2024-06-25 12:53:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14572109824. Throughput: 0: 42933.9. Samples: 14572227980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 12:53:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889411_14572109824.pth... [2024-06-25 12:53:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000888784_14561837056.pth [2024-06-25 12:53:44,449][15401] Updated weights for policy 0, policy_version 889414 (0.0036) [2024-06-25 12:53:48,181][15401] Updated weights for policy 0, policy_version 889424 (0.0032) [2024-06-25 12:53:48,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 14572339200. Throughput: 0: 43057.1. Samples: 14572486500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:48,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 12:53:51,952][15401] Updated weights for policy 0, policy_version 889434 (0.0033) [2024-06-25 12:53:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.6). Total num frames: 14572519424. Throughput: 0: 42942.2. Samples: 14572616920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 12:53:55,854][15401] Updated weights for policy 0, policy_version 889444 (0.0030) [2024-06-25 12:53:58,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 14572765184. Throughput: 0: 43163.6. Samples: 14572877140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:53:58,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 12:53:59,421][15401] Updated weights for policy 0, policy_version 889454 (0.0031) [2024-06-25 12:54:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 14572961792. Throughput: 0: 43189.9. Samples: 14573134760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:03,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 12:54:03,528][15401] Updated weights for policy 0, policy_version 889464 (0.0033) [2024-06-25 12:54:07,387][15401] Updated weights for policy 0, policy_version 889474 (0.0043) [2024-06-25 12:54:08,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 14573174784. Throughput: 0: 43040.5. Samples: 14573261760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:08,396][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 12:54:11,033][15401] Updated weights for policy 0, policy_version 889484 (0.0031) [2024-06-25 12:54:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43419.3, 300 sec: 42931.6). Total num frames: 14573404160. Throughput: 0: 43116.0. Samples: 14573517980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:13,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 12:54:15,012][15401] Updated weights for policy 0, policy_version 889494 (0.0035) [2024-06-25 12:54:18,391][15132] Fps is (10 sec: 42621.1, 60 sec: 42597.7, 300 sec: 42875.9). Total num frames: 14573600768. Throughput: 0: 42994.0. Samples: 14573775660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:18,391][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 12:54:18,777][15401] Updated weights for policy 0, policy_version 889504 (0.0033) [2024-06-25 12:54:22,495][15401] Updated weights for policy 0, policy_version 889514 (0.0030) [2024-06-25 12:54:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14573813760. Throughput: 0: 42871.1. Samples: 14573902840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:23,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 12:54:26,282][15401] Updated weights for policy 0, policy_version 889524 (0.0028) [2024-06-25 12:54:28,389][15132] Fps is (10 sec: 44242.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14574043136. Throughput: 0: 42945.0. Samples: 14574160500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 12:54:30,002][15401] Updated weights for policy 0, policy_version 889534 (0.0034) [2024-06-25 12:54:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 14574256128. Throughput: 0: 42925.5. Samples: 14574418040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 12:54:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 12:54:33,897][15401] Updated weights for policy 0, policy_version 889544 (0.0034) [2024-06-25 12:54:37,585][15401] Updated weights for policy 0, policy_version 889554 (0.0042) [2024-06-25 12:54:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14574469120. Throughput: 0: 42830.7. Samples: 14574544300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:54:38,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 12:54:41,432][15349] Signal inference workers to stop experience collection... (215750 times) [2024-06-25 12:54:41,432][15349] Signal inference workers to resume experience collection... (215750 times) [2024-06-25 12:54:41,476][15401] InferenceWorker_p0-w0: stopping experience collection (215750 times) [2024-06-25 12:54:41,477][15401] InferenceWorker_p0-w0: resuming experience collection (215750 times) [2024-06-25 12:54:41,584][15401] Updated weights for policy 0, policy_version 889564 (0.0038) [2024-06-25 12:54:43,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14574682112. Throughput: 0: 42656.5. Samples: 14574796680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:54:43,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 12:54:45,225][15401] Updated weights for policy 0, policy_version 889574 (0.0034) [2024-06-25 12:54:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 14574895104. Throughput: 0: 42702.6. Samples: 14575056380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:54:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 12:54:49,898][15401] Updated weights for policy 0, policy_version 889584 (0.0043) [2024-06-25 12:54:52,798][15401] Updated weights for policy 0, policy_version 889594 (0.0035) [2024-06-25 12:54:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14575124480. Throughput: 0: 42694.5. Samples: 14575182740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:54:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 12:54:57,674][15401] Updated weights for policy 0, policy_version 889604 (0.0030) [2024-06-25 12:54:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14575304704. Throughput: 0: 42512.1. Samples: 14575431020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:54:58,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 12:55:00,676][15401] Updated weights for policy 0, policy_version 889614 (0.0042) [2024-06-25 12:55:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14575517696. Throughput: 0: 42556.9. Samples: 14575690680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:03,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 12:55:05,221][15401] Updated weights for policy 0, policy_version 889624 (0.0041) [2024-06-25 12:55:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42876.1). Total num frames: 14575747072. Throughput: 0: 42428.9. Samples: 14575812140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:08,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 12:55:08,509][15401] Updated weights for policy 0, policy_version 889634 (0.0036) [2024-06-25 12:55:12,757][15401] Updated weights for policy 0, policy_version 889644 (0.0031) [2024-06-25 12:55:13,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 14575960064. Throughput: 0: 42413.6. Samples: 14576069220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:13,393][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 12:55:16,093][15401] Updated weights for policy 0, policy_version 889654 (0.0039) [2024-06-25 12:55:18,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42326.1, 300 sec: 42765.0). Total num frames: 14576140288. Throughput: 0: 42430.1. Samples: 14576327400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:18,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 12:55:20,314][15401] Updated weights for policy 0, policy_version 889664 (0.0044) [2024-06-25 12:55:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14576386048. Throughput: 0: 42398.6. Samples: 14576452240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 12:55:23,839][15401] Updated weights for policy 0, policy_version 889674 (0.0033) [2024-06-25 12:55:28,042][15401] Updated weights for policy 0, policy_version 889684 (0.0047) [2024-06-25 12:55:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 14576599040. Throughput: 0: 42588.4. Samples: 14576713260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:28,393][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 12:55:31,666][15401] Updated weights for policy 0, policy_version 889694 (0.0044) [2024-06-25 12:55:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42764.8). Total num frames: 14576795648. Throughput: 0: 42472.4. Samples: 14576967740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:33,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 12:55:35,562][15401] Updated weights for policy 0, policy_version 889704 (0.0038) [2024-06-25 12:55:38,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14577008640. Throughput: 0: 42410.7. Samples: 14577091220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:38,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 12:55:39,183][15401] Updated weights for policy 0, policy_version 889714 (0.0029) [2024-06-25 12:55:43,390][15132] Fps is (10 sec: 42607.8, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 14577221632. Throughput: 0: 42687.3. Samples: 14577351960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 12:55:43,459][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889724_14577238016.pth... [2024-06-25 12:55:43,473][15401] Updated weights for policy 0, policy_version 889724 (0.0032) [2024-06-25 12:55:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889098_14566981632.pth [2024-06-25 12:55:46,878][15401] Updated weights for policy 0, policy_version 889734 (0.0043) [2024-06-25 12:55:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14577434624. Throughput: 0: 42745.4. Samples: 14577614220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:48,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 12:55:50,966][15401] Updated weights for policy 0, policy_version 889744 (0.0038) [2024-06-25 12:55:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14577664000. Throughput: 0: 42816.7. Samples: 14577738900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:53,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 12:55:54,605][15401] Updated weights for policy 0, policy_version 889754 (0.0028) [2024-06-25 12:55:58,395][15132] Fps is (10 sec: 44214.0, 60 sec: 42867.7, 300 sec: 42819.8). Total num frames: 14577876992. Throughput: 0: 42914.3. Samples: 14578000480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:55:58,395][15132] Avg episode reward: [(0, '0.181')] [2024-06-25 12:55:58,672][15401] Updated weights for policy 0, policy_version 889764 (0.0039) [2024-06-25 12:56:01,600][15349] Signal inference workers to stop experience collection... (215800 times) [2024-06-25 12:56:01,601][15349] Signal inference workers to resume experience collection... (215800 times) [2024-06-25 12:56:01,642][15401] InferenceWorker_p0-w0: stopping experience collection (215800 times) [2024-06-25 12:56:01,643][15401] InferenceWorker_p0-w0: resuming experience collection (215800 times) [2024-06-25 12:56:02,134][15401] Updated weights for policy 0, policy_version 889774 (0.0024) [2024-06-25 12:56:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14578089984. Throughput: 0: 42924.8. Samples: 14578259020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:56:03,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 12:56:06,190][15401] Updated weights for policy 0, policy_version 889784 (0.0038) [2024-06-25 12:56:08,390][15132] Fps is (10 sec: 40980.7, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14578286592. Throughput: 0: 43007.4. Samples: 14578387580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:56:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 12:56:09,635][15401] Updated weights for policy 0, policy_version 889794 (0.0030) [2024-06-25 12:56:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 14578515968. Throughput: 0: 42930.7. Samples: 14578645040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:56:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 12:56:13,814][15401] Updated weights for policy 0, policy_version 889804 (0.0033) [2024-06-25 12:56:17,193][15401] Updated weights for policy 0, policy_version 889814 (0.0042) [2024-06-25 12:56:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14578728960. Throughput: 0: 42859.2. Samples: 14578896300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:56:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 12:56:21,323][15401] Updated weights for policy 0, policy_version 889824 (0.0030) [2024-06-25 12:56:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 14578941952. Throughput: 0: 43030.0. Samples: 14579027580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 12:56:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 12:56:24,718][15401] Updated weights for policy 0, policy_version 889834 (0.0032) [2024-06-25 12:56:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 14579138560. Throughput: 0: 42981.1. Samples: 14579286100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 12:56:29,509][15401] Updated weights for policy 0, policy_version 889844 (0.0036) [2024-06-25 12:56:32,688][15401] Updated weights for policy 0, policy_version 889854 (0.0037) [2024-06-25 12:56:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 14579384320. Throughput: 0: 42592.0. Samples: 14579530860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:33,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 12:56:36,938][15401] Updated weights for policy 0, policy_version 889864 (0.0035) [2024-06-25 12:56:38,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 14579597312. Throughput: 0: 42981.3. Samples: 14579673160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:38,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 12:56:40,234][15401] Updated weights for policy 0, policy_version 889874 (0.0029) [2024-06-25 12:56:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14579793920. Throughput: 0: 42603.9. Samples: 14579917440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 12:56:44,801][15401] Updated weights for policy 0, policy_version 889884 (0.0028) [2024-06-25 12:56:48,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14580006912. Throughput: 0: 42687.2. Samples: 14580179940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 12:56:48,476][15401] Updated weights for policy 0, policy_version 889894 (0.0040) [2024-06-25 12:56:52,375][15401] Updated weights for policy 0, policy_version 889904 (0.0042) [2024-06-25 12:56:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14580219904. Throughput: 0: 42742.9. Samples: 14580311000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 12:56:56,049][15401] Updated weights for policy 0, policy_version 889914 (0.0043) [2024-06-25 12:56:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42875.1, 300 sec: 42709.5). Total num frames: 14580449280. Throughput: 0: 42542.6. Samples: 14580559460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:56:58,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 12:57:00,000][15401] Updated weights for policy 0, policy_version 889924 (0.0036) [2024-06-25 12:57:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14580662272. Throughput: 0: 42733.3. Samples: 14580819300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:03,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 12:57:03,658][15401] Updated weights for policy 0, policy_version 889934 (0.0034) [2024-06-25 12:57:07,882][15401] Updated weights for policy 0, policy_version 889944 (0.0037) [2024-06-25 12:57:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14580858880. Throughput: 0: 42674.6. Samples: 14580947940. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 12:57:11,491][15401] Updated weights for policy 0, policy_version 889954 (0.0041) [2024-06-25 12:57:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14581071872. Throughput: 0: 42504.4. Samples: 14581198800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:13,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 12:57:15,831][15401] Updated weights for policy 0, policy_version 889964 (0.0027) [2024-06-25 12:57:18,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14581284864. Throughput: 0: 42665.3. Samples: 14581450900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:18,393][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 12:57:19,446][15401] Updated weights for policy 0, policy_version 889974 (0.0037) [2024-06-25 12:57:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 14581481472. Throughput: 0: 42417.0. Samples: 14581581820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 12:57:23,448][15401] Updated weights for policy 0, policy_version 889984 (0.0050) [2024-06-25 12:57:23,469][15349] Signal inference workers to stop experience collection... (215850 times) [2024-06-25 12:57:23,469][15349] Signal inference workers to resume experience collection... (215850 times) [2024-06-25 12:57:23,484][15401] InferenceWorker_p0-w0: stopping experience collection (215850 times) [2024-06-25 12:57:23,484][15401] InferenceWorker_p0-w0: resuming experience collection (215850 times) [2024-06-25 12:57:27,337][15401] Updated weights for policy 0, policy_version 889994 (0.0031) [2024-06-25 12:57:28,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14581727232. Throughput: 0: 42731.6. Samples: 14581840360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 12:57:30,973][15401] Updated weights for policy 0, policy_version 890004 (0.0049) [2024-06-25 12:57:33,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14581940224. Throughput: 0: 42456.0. Samples: 14582090460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 12:57:34,988][15401] Updated weights for policy 0, policy_version 890014 (0.0049) [2024-06-25 12:57:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 14582136832. Throughput: 0: 42458.9. Samples: 14582221660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:38,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 12:57:38,619][15401] Updated weights for policy 0, policy_version 890024 (0.0044) [2024-06-25 12:57:42,581][15401] Updated weights for policy 0, policy_version 890034 (0.0034) [2024-06-25 12:57:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14582349824. Throughput: 0: 42729.9. Samples: 14582482300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 12:57:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890037_14582366208.pth... [2024-06-25 12:57:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889411_14572109824.pth [2024-06-25 12:57:46,208][15401] Updated weights for policy 0, policy_version 890044 (0.0034) [2024-06-25 12:57:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14582579200. Throughput: 0: 42584.6. Samples: 14582735600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:48,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 12:57:50,066][15401] Updated weights for policy 0, policy_version 890054 (0.0034) [2024-06-25 12:57:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14582792192. Throughput: 0: 42651.3. Samples: 14582867240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 12:57:53,820][15401] Updated weights for policy 0, policy_version 890064 (0.0031) [2024-06-25 12:57:57,494][15401] Updated weights for policy 0, policy_version 890074 (0.0035) [2024-06-25 12:57:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42710.4). Total num frames: 14583005184. Throughput: 0: 42648.0. Samples: 14583117960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:57:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 12:58:01,451][15401] Updated weights for policy 0, policy_version 890084 (0.0024) [2024-06-25 12:58:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14583218176. Throughput: 0: 42831.2. Samples: 14583378200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:58:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 12:58:04,925][15401] Updated weights for policy 0, policy_version 890094 (0.0024) [2024-06-25 12:58:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 14583414784. Throughput: 0: 42719.9. Samples: 14583504220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:58:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 12:58:08,970][15401] Updated weights for policy 0, policy_version 890104 (0.0028) [2024-06-25 12:58:12,966][15401] Updated weights for policy 0, policy_version 890114 (0.0045) [2024-06-25 12:58:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 14583644160. Throughput: 0: 42646.6. Samples: 14583759560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:58:13,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 12:58:16,740][15401] Updated weights for policy 0, policy_version 890124 (0.0045) [2024-06-25 12:58:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14583857152. Throughput: 0: 42902.6. Samples: 14584021080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 12:58:18,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 12:58:20,543][15401] Updated weights for policy 0, policy_version 890134 (0.0026) [2024-06-25 12:58:23,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14584070144. Throughput: 0: 42736.9. Samples: 14584144820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:23,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 12:58:24,737][15401] Updated weights for policy 0, policy_version 890144 (0.0033) [2024-06-25 12:58:27,971][15401] Updated weights for policy 0, policy_version 890154 (0.0041) [2024-06-25 12:58:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14584299520. Throughput: 0: 42560.0. Samples: 14584397500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 12:58:32,751][15401] Updated weights for policy 0, policy_version 890164 (0.0035) [2024-06-25 12:58:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14584496128. Throughput: 0: 42799.1. Samples: 14584661560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 12:58:35,829][15401] Updated weights for policy 0, policy_version 890174 (0.0039) [2024-06-25 12:58:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14584709120. Throughput: 0: 42640.5. Samples: 14584786060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 12:58:40,319][15401] Updated weights for policy 0, policy_version 890184 (0.0026) [2024-06-25 12:58:43,389][15401] Updated weights for policy 0, policy_version 890194 (0.0040) [2024-06-25 12:58:43,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42709.5). Total num frames: 14584938496. Throughput: 0: 42890.6. Samples: 14585048140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:43,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 12:58:47,813][15401] Updated weights for policy 0, policy_version 890204 (0.0037) [2024-06-25 12:58:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14585135104. Throughput: 0: 42947.0. Samples: 14585310820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 12:58:50,827][15401] Updated weights for policy 0, policy_version 890214 (0.0038) [2024-06-25 12:58:53,392][15132] Fps is (10 sec: 42598.2, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14585364480. Throughput: 0: 42943.4. Samples: 14585436780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:53,393][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 12:58:55,348][15401] Updated weights for policy 0, policy_version 890224 (0.0031) [2024-06-25 12:58:58,382][15401] Updated weights for policy 0, policy_version 890234 (0.0028) [2024-06-25 12:58:58,389][15132] Fps is (10 sec: 45876.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14585593856. Throughput: 0: 43062.4. Samples: 14585697260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:58:58,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 12:59:00,233][15349] Signal inference workers to stop experience collection... (215900 times) [2024-06-25 12:59:00,235][15349] Signal inference workers to resume experience collection... (215900 times) [2024-06-25 12:59:00,276][15401] InferenceWorker_p0-w0: stopping experience collection (215900 times) [2024-06-25 12:59:00,276][15401] InferenceWorker_p0-w0: resuming experience collection (215900 times) [2024-06-25 12:59:02,774][15401] Updated weights for policy 0, policy_version 890244 (0.0037) [2024-06-25 12:59:03,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 14585774080. Throughput: 0: 43055.1. Samples: 14585958560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:03,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 12:59:06,143][15401] Updated weights for policy 0, policy_version 890254 (0.0040) [2024-06-25 12:59:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14586019840. Throughput: 0: 43142.3. Samples: 14586086220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 12:59:10,408][15401] Updated weights for policy 0, policy_version 890264 (0.0045) [2024-06-25 12:59:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43146.2, 300 sec: 42820.7). Total num frames: 14586232832. Throughput: 0: 43273.2. Samples: 14586344800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:13,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 12:59:13,801][15401] Updated weights for policy 0, policy_version 890274 (0.0030) [2024-06-25 12:59:18,059][15401] Updated weights for policy 0, policy_version 890284 (0.0044) [2024-06-25 12:59:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14586429440. Throughput: 0: 43234.9. Samples: 14586607140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:18,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 12:59:21,511][15401] Updated weights for policy 0, policy_version 890294 (0.0029) [2024-06-25 12:59:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14586658816. Throughput: 0: 43231.5. Samples: 14586731480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 12:59:25,442][15401] Updated weights for policy 0, policy_version 890304 (0.0030) [2024-06-25 12:59:28,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14586871808. Throughput: 0: 43237.9. Samples: 14586993740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:28,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 12:59:28,836][15401] Updated weights for policy 0, policy_version 890314 (0.0048) [2024-06-25 12:59:32,959][15401] Updated weights for policy 0, policy_version 890324 (0.0032) [2024-06-25 12:59:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14587084800. Throughput: 0: 43155.7. Samples: 14587252820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 12:59:36,279][15401] Updated weights for policy 0, policy_version 890334 (0.0028) [2024-06-25 12:59:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 14587314176. Throughput: 0: 43173.4. Samples: 14587379480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 12:59:40,403][15401] Updated weights for policy 0, policy_version 890344 (0.0036) [2024-06-25 12:59:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 14587527168. Throughput: 0: 43170.6. Samples: 14587639940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:43,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 12:59:43,480][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890353_14587543552.pth... [2024-06-25 12:59:43,527][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000889724_14577238016.pth [2024-06-25 12:59:43,876][15401] Updated weights for policy 0, policy_version 890354 (0.0043) [2024-06-25 12:59:47,864][15401] Updated weights for policy 0, policy_version 890364 (0.0034) [2024-06-25 12:59:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 14587723776. Throughput: 0: 43145.8. Samples: 14587900120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 12:59:51,680][15401] Updated weights for policy 0, policy_version 890374 (0.0040) [2024-06-25 12:59:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 14587953152. Throughput: 0: 43025.2. Samples: 14588022360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 12:59:55,317][15401] Updated weights for policy 0, policy_version 890384 (0.0028) [2024-06-25 12:59:58,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14588182528. Throughput: 0: 43223.7. Samples: 14588289860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 12:59:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 12:59:59,452][15401] Updated weights for policy 0, policy_version 890394 (0.0028) [2024-06-25 13:00:02,873][15401] Updated weights for policy 0, policy_version 890404 (0.0041) [2024-06-25 13:00:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 14588379136. Throughput: 0: 43024.6. Samples: 14588543240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:00:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 13:00:07,172][15401] Updated weights for policy 0, policy_version 890414 (0.0028) [2024-06-25 13:00:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 14588592128. Throughput: 0: 43062.1. Samples: 14588669380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:00:08,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 13:00:10,850][15401] Updated weights for policy 0, policy_version 890424 (0.0030) [2024-06-25 13:00:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 14588821504. Throughput: 0: 43123.1. Samples: 14588934280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:00:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 13:00:14,922][15401] Updated weights for policy 0, policy_version 890434 (0.0033) [2024-06-25 13:00:18,157][15401] Updated weights for policy 0, policy_version 890444 (0.0033) [2024-06-25 13:00:18,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 14589050880. Throughput: 0: 43017.2. Samples: 14589188600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 13:00:22,391][15401] Updated weights for policy 0, policy_version 890454 (0.0030) [2024-06-25 13:00:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.7, 300 sec: 42876.1). Total num frames: 14589247488. Throughput: 0: 43081.7. Samples: 14589318260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:23,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 13:00:25,702][15401] Updated weights for policy 0, policy_version 890464 (0.0028) [2024-06-25 13:00:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 14589460480. Throughput: 0: 43092.8. Samples: 14589579120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:28,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 13:00:29,839][15401] Updated weights for policy 0, policy_version 890474 (0.0034) [2024-06-25 13:00:32,254][15349] Signal inference workers to stop experience collection... (215950 times) [2024-06-25 13:00:32,254][15349] Signal inference workers to resume experience collection... (215950 times) [2024-06-25 13:00:32,268][15401] InferenceWorker_p0-w0: stopping experience collection (215950 times) [2024-06-25 13:00:32,269][15401] InferenceWorker_p0-w0: resuming experience collection (215950 times) [2024-06-25 13:00:33,330][15401] Updated weights for policy 0, policy_version 890484 (0.0038) [2024-06-25 13:00:33,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43417.5, 300 sec: 42987.1). Total num frames: 14589689856. Throughput: 0: 42912.3. Samples: 14589831180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:33,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 13:00:37,368][15401] Updated weights for policy 0, policy_version 890494 (0.0034) [2024-06-25 13:00:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14589870080. Throughput: 0: 43044.2. Samples: 14589959340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 13:00:40,974][15401] Updated weights for policy 0, policy_version 890504 (0.0035) [2024-06-25 13:00:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14590083072. Throughput: 0: 42796.5. Samples: 14590215700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:43,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 13:00:44,903][15401] Updated weights for policy 0, policy_version 890514 (0.0031) [2024-06-25 13:00:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14590312448. Throughput: 0: 42978.2. Samples: 14590477260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:00:48,732][15401] Updated weights for policy 0, policy_version 890524 (0.0046) [2024-06-25 13:00:52,659][15401] Updated weights for policy 0, policy_version 890534 (0.0023) [2024-06-25 13:00:53,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42869.8, 300 sec: 42876.5). Total num frames: 14590525440. Throughput: 0: 42927.1. Samples: 14590601100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:53,392][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 13:00:56,602][15401] Updated weights for policy 0, policy_version 890544 (0.0045) [2024-06-25 13:00:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14590738432. Throughput: 0: 42728.5. Samples: 14590857060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:00:58,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 13:01:00,507][15401] Updated weights for policy 0, policy_version 890554 (0.0037) [2024-06-25 13:01:03,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14590935040. Throughput: 0: 42785.0. Samples: 14591113920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 13:01:04,325][15401] Updated weights for policy 0, policy_version 890564 (0.0032) [2024-06-25 13:01:08,022][15401] Updated weights for policy 0, policy_version 890574 (0.0033) [2024-06-25 13:01:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14591164416. Throughput: 0: 42731.7. Samples: 14591241080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 13:01:12,226][15401] Updated weights for policy 0, policy_version 890584 (0.0025) [2024-06-25 13:01:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14591361024. Throughput: 0: 42557.8. Samples: 14591494220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:13,396][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 13:01:16,115][15401] Updated weights for policy 0, policy_version 890594 (0.0027) [2024-06-25 13:01:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 14591574016. Throughput: 0: 42713.4. Samples: 14591753280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 13:01:19,858][15401] Updated weights for policy 0, policy_version 890604 (0.0026) [2024-06-25 13:01:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 14591803392. Throughput: 0: 42718.0. Samples: 14591881660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:23,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 13:01:23,618][15401] Updated weights for policy 0, policy_version 890614 (0.0036) [2024-06-25 13:01:27,540][15401] Updated weights for policy 0, policy_version 890624 (0.0036) [2024-06-25 13:01:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14592000000. Throughput: 0: 42601.7. Samples: 14592132780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:28,396][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 13:01:31,364][15401] Updated weights for policy 0, policy_version 890634 (0.0027) [2024-06-25 13:01:33,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42320.9, 300 sec: 42820.0). Total num frames: 14592229376. Throughput: 0: 42532.6. Samples: 14592391500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:33,397][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 13:01:35,589][15401] Updated weights for policy 0, policy_version 890644 (0.0029) [2024-06-25 13:01:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 14592442368. Throughput: 0: 42561.8. Samples: 14592516380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:38,392][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 13:01:38,879][15401] Updated weights for policy 0, policy_version 890654 (0.0032) [2024-06-25 13:01:43,122][15401] Updated weights for policy 0, policy_version 890664 (0.0032) [2024-06-25 13:01:43,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14592638976. Throughput: 0: 42731.5. Samples: 14592779980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:43,394][15132] Avg episode reward: [(0, '0.867')] [2024-06-25 13:01:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890665_14592655360.pth... [2024-06-25 13:01:43,533][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890037_14582366208.pth [2024-06-25 13:01:46,673][15401] Updated weights for policy 0, policy_version 890674 (0.0034) [2024-06-25 13:01:48,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14592884736. Throughput: 0: 42750.3. Samples: 14593037680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:48,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 13:01:50,704][15401] Updated weights for policy 0, policy_version 890684 (0.0046) [2024-06-25 13:01:51,478][15349] Signal inference workers to stop experience collection... (216000 times) [2024-06-25 13:01:51,479][15349] Signal inference workers to resume experience collection... (216000 times) [2024-06-25 13:01:51,508][15401] InferenceWorker_p0-w0: stopping experience collection (216000 times) [2024-06-25 13:01:51,508][15401] InferenceWorker_p0-w0: resuming experience collection (216000 times) [2024-06-25 13:01:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 14593081344. Throughput: 0: 42795.8. Samples: 14593166900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 13:01:54,106][15401] Updated weights for policy 0, policy_version 890694 (0.0030) [2024-06-25 13:01:58,150][15401] Updated weights for policy 0, policy_version 890704 (0.0037) [2024-06-25 13:01:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14593294336. Throughput: 0: 42893.0. Samples: 14593424400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:01:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 13:02:01,564][15401] Updated weights for policy 0, policy_version 890714 (0.0037) [2024-06-25 13:02:03,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42870.2, 300 sec: 42875.9). Total num frames: 14593507328. Throughput: 0: 42821.5. Samples: 14593680320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:03,392][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:02:05,818][15401] Updated weights for policy 0, policy_version 890724 (0.0033) [2024-06-25 13:02:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14593720320. Throughput: 0: 42701.0. Samples: 14593803200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:02:09,549][15401] Updated weights for policy 0, policy_version 890734 (0.0032) [2024-06-25 13:02:13,362][15401] Updated weights for policy 0, policy_version 890744 (0.0034) [2024-06-25 13:02:13,390][15132] Fps is (10 sec: 44244.0, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 14593949696. Throughput: 0: 42774.6. Samples: 14594057640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 13:02:17,213][15401] Updated weights for policy 0, policy_version 890754 (0.0029) [2024-06-25 13:02:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14594146304. Throughput: 0: 42690.5. Samples: 14594312300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:18,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 13:02:21,237][15401] Updated weights for policy 0, policy_version 890764 (0.0032) [2024-06-25 13:02:23,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 14594359296. Throughput: 0: 42790.2. Samples: 14594441940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:23,392][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 13:02:25,122][15401] Updated weights for policy 0, policy_version 890774 (0.0037) [2024-06-25 13:02:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14594588672. Throughput: 0: 42599.5. Samples: 14594696960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:28,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 13:02:28,924][15401] Updated weights for policy 0, policy_version 890784 (0.0040) [2024-06-25 13:02:32,611][15401] Updated weights for policy 0, policy_version 890794 (0.0045) [2024-06-25 13:02:33,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42876.0, 300 sec: 42931.6). Total num frames: 14594801664. Throughput: 0: 42730.6. Samples: 14594960560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 13:02:36,474][15401] Updated weights for policy 0, policy_version 890804 (0.0033) [2024-06-25 13:02:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 14594998272. Throughput: 0: 42732.6. Samples: 14595089860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:38,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 13:02:40,080][15401] Updated weights for policy 0, policy_version 890814 (0.0044) [2024-06-25 13:02:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14595211264. Throughput: 0: 42606.5. Samples: 14595341700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 13:02:44,131][15401] Updated weights for policy 0, policy_version 890824 (0.0035) [2024-06-25 13:02:47,635][15401] Updated weights for policy 0, policy_version 890834 (0.0039) [2024-06-25 13:02:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14595440640. Throughput: 0: 42679.8. Samples: 14595600840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:48,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 13:02:51,889][15401] Updated weights for policy 0, policy_version 890844 (0.0041) [2024-06-25 13:02:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14595637248. Throughput: 0: 43003.1. Samples: 14595738340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 13:02:55,411][15401] Updated weights for policy 0, policy_version 890854 (0.0039) [2024-06-25 13:02:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14595866624. Throughput: 0: 42885.0. Samples: 14595987460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:02:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 13:02:59,317][15401] Updated weights for policy 0, policy_version 890864 (0.0043) [2024-06-25 13:03:03,317][15401] Updated weights for policy 0, policy_version 890874 (0.0041) [2024-06-25 13:03:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42872.6, 300 sec: 42931.6). Total num frames: 14596079616. Throughput: 0: 43065.2. Samples: 14596250240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 13:03:07,126][15401] Updated weights for policy 0, policy_version 890884 (0.0034) [2024-06-25 13:03:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 14596292608. Throughput: 0: 43044.2. Samples: 14596378820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:08,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 13:03:10,216][15349] Signal inference workers to stop experience collection... (216050 times) [2024-06-25 13:03:10,244][15401] InferenceWorker_p0-w0: stopping experience collection (216050 times) [2024-06-25 13:03:10,270][15349] Signal inference workers to resume experience collection... (216050 times) [2024-06-25 13:03:10,276][15401] InferenceWorker_p0-w0: resuming experience collection (216050 times) [2024-06-25 13:03:10,819][15401] Updated weights for policy 0, policy_version 890894 (0.0042) [2024-06-25 13:03:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14596505600. Throughput: 0: 42962.2. Samples: 14596630260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 13:03:15,084][15401] Updated weights for policy 0, policy_version 890904 (0.0033) [2024-06-25 13:03:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14596718592. Throughput: 0: 42821.9. Samples: 14596887540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 13:03:18,409][15401] Updated weights for policy 0, policy_version 890914 (0.0034) [2024-06-25 13:03:22,848][15401] Updated weights for policy 0, policy_version 890924 (0.0044) [2024-06-25 13:03:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 14596931584. Throughput: 0: 42692.0. Samples: 14597011000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 13:03:26,172][15401] Updated weights for policy 0, policy_version 890934 (0.0033) [2024-06-25 13:03:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14597144576. Throughput: 0: 42785.1. Samples: 14597267020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 13:03:30,360][15401] Updated weights for policy 0, policy_version 890944 (0.0023) [2024-06-25 13:03:33,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 14597341184. Throughput: 0: 42787.9. Samples: 14597526400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:33,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 13:03:33,891][15401] Updated weights for policy 0, policy_version 890954 (0.0029) [2024-06-25 13:03:37,848][15401] Updated weights for policy 0, policy_version 890964 (0.0033) [2024-06-25 13:03:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14597570560. Throughput: 0: 42537.0. Samples: 14597652500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 13:03:41,904][15401] Updated weights for policy 0, policy_version 890974 (0.0036) [2024-06-25 13:03:43,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14597799936. Throughput: 0: 42743.9. Samples: 14597910940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:43,393][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 13:03:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890979_14597799936.pth... [2024-06-25 13:03:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890353_14587543552.pth [2024-06-25 13:03:45,790][15401] Updated weights for policy 0, policy_version 890984 (0.0028) [2024-06-25 13:03:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14597996544. Throughput: 0: 42511.6. Samples: 14598163260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 13:03:49,564][15401] Updated weights for policy 0, policy_version 890994 (0.0030) [2024-06-25 13:03:53,294][15401] Updated weights for policy 0, policy_version 891004 (0.0040) [2024-06-25 13:03:53,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14598209536. Throughput: 0: 42498.6. Samples: 14598291360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:53,392][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 13:03:57,012][15401] Updated weights for policy 0, policy_version 891014 (0.0025) [2024-06-25 13:03:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14598438912. Throughput: 0: 42637.4. Samples: 14598548940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:03:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 13:04:00,789][15401] Updated weights for policy 0, policy_version 891024 (0.0024) [2024-06-25 13:04:03,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14598651904. Throughput: 0: 42417.5. Samples: 14598796340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 13:04:04,861][15401] Updated weights for policy 0, policy_version 891034 (0.0044) [2024-06-25 13:04:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14598848512. Throughput: 0: 42604.3. Samples: 14598928200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 13:04:08,883][15401] Updated weights for policy 0, policy_version 891044 (0.0024) [2024-06-25 13:04:12,541][15401] Updated weights for policy 0, policy_version 891054 (0.0035) [2024-06-25 13:04:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14599077888. Throughput: 0: 42820.8. Samples: 14599193960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:13,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 13:04:16,649][15401] Updated weights for policy 0, policy_version 891064 (0.0031) [2024-06-25 13:04:18,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14599307264. Throughput: 0: 42572.5. Samples: 14599442060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 13:04:20,243][15401] Updated weights for policy 0, policy_version 891074 (0.0025) [2024-06-25 13:04:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14599487488. Throughput: 0: 42591.4. Samples: 14599569220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:23,393][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 13:04:24,466][15401] Updated weights for policy 0, policy_version 891084 (0.0034) [2024-06-25 13:04:27,439][15349] Signal inference workers to stop experience collection... (216100 times) [2024-06-25 13:04:27,440][15349] Signal inference workers to resume experience collection... (216100 times) [2024-06-25 13:04:27,478][15401] InferenceWorker_p0-w0: stopping experience collection (216100 times) [2024-06-25 13:04:27,479][15401] InferenceWorker_p0-w0: resuming experience collection (216100 times) [2024-06-25 13:04:27,785][15401] Updated weights for policy 0, policy_version 891094 (0.0027) [2024-06-25 13:04:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14599716864. Throughput: 0: 42783.1. Samples: 14599836180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 13:04:32,080][15401] Updated weights for policy 0, policy_version 891104 (0.0044) [2024-06-25 13:04:33,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14599913472. Throughput: 0: 42787.5. Samples: 14600088700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 13:04:35,366][15401] Updated weights for policy 0, policy_version 891114 (0.0038) [2024-06-25 13:04:38,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14600126464. Throughput: 0: 42577.8. Samples: 14600207260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:38,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 13:04:39,744][15401] Updated weights for policy 0, policy_version 891124 (0.0036) [2024-06-25 13:04:42,889][15401] Updated weights for policy 0, policy_version 891134 (0.0032) [2024-06-25 13:04:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14600355840. Throughput: 0: 42711.0. Samples: 14600470940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 13:04:47,567][15401] Updated weights for policy 0, policy_version 891144 (0.0030) [2024-06-25 13:04:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14600552448. Throughput: 0: 42844.6. Samples: 14600724340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 13:04:50,467][15401] Updated weights for policy 0, policy_version 891154 (0.0025) [2024-06-25 13:04:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 14600765440. Throughput: 0: 42648.2. Samples: 14600847360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 13:04:55,035][15401] Updated weights for policy 0, policy_version 891164 (0.0029) [2024-06-25 13:04:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14600978432. Throughput: 0: 42479.2. Samples: 14601105520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:04:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 13:04:58,524][15401] Updated weights for policy 0, policy_version 891174 (0.0036) [2024-06-25 13:05:02,814][15401] Updated weights for policy 0, policy_version 891184 (0.0031) [2024-06-25 13:05:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42709.8). Total num frames: 14601191424. Throughput: 0: 42694.8. Samples: 14601363320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 13:05:06,143][15401] Updated weights for policy 0, policy_version 891194 (0.0033) [2024-06-25 13:05:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14601404416. Throughput: 0: 42563.6. Samples: 14601484480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 13:05:10,589][15401] Updated weights for policy 0, policy_version 891204 (0.0031) [2024-06-25 13:05:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14601617408. Throughput: 0: 42294.4. Samples: 14601739420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 13:05:13,962][15401] Updated weights for policy 0, policy_version 891214 (0.0051) [2024-06-25 13:05:18,124][15401] Updated weights for policy 0, policy_version 891224 (0.0028) [2024-06-25 13:05:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 14601830400. Throughput: 0: 42623.2. Samples: 14602006740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 13:05:21,618][15401] Updated weights for policy 0, policy_version 891234 (0.0039) [2024-06-25 13:05:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 14602059776. Throughput: 0: 42754.2. Samples: 14602131200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:23,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 13:05:25,587][15401] Updated weights for policy 0, policy_version 891244 (0.0035) [2024-06-25 13:05:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14602256384. Throughput: 0: 42647.7. Samples: 14602390080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 13:05:29,334][15401] Updated weights for policy 0, policy_version 891254 (0.0047) [2024-06-25 13:05:33,156][15401] Updated weights for policy 0, policy_version 891264 (0.0036) [2024-06-25 13:05:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14602469376. Throughput: 0: 42787.5. Samples: 14602649780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:33,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 13:05:37,000][15401] Updated weights for policy 0, policy_version 891274 (0.0040) [2024-06-25 13:05:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14602698752. Throughput: 0: 42886.1. Samples: 14602777240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:38,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 13:05:40,871][15401] Updated weights for policy 0, policy_version 891284 (0.0025) [2024-06-25 13:05:41,970][15349] Signal inference workers to stop experience collection... (216150 times) [2024-06-25 13:05:41,970][15349] Signal inference workers to resume experience collection... (216150 times) [2024-06-25 13:05:41,993][15401] InferenceWorker_p0-w0: stopping experience collection (216150 times) [2024-06-25 13:05:41,993][15401] InferenceWorker_p0-w0: resuming experience collection (216150 times) [2024-06-25 13:05:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14602911744. Throughput: 0: 42706.5. Samples: 14603027320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:43,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 13:05:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891291_14602911744.pth... [2024-06-25 13:05:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890665_14592655360.pth [2024-06-25 13:05:44,603][15401] Updated weights for policy 0, policy_version 891294 (0.0039) [2024-06-25 13:05:48,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 14603108352. Throughput: 0: 42739.5. Samples: 14603286600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 13:05:48,660][15401] Updated weights for policy 0, policy_version 891304 (0.0044) [2024-06-25 13:05:52,171][15401] Updated weights for policy 0, policy_version 891314 (0.0044) [2024-06-25 13:05:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14603354112. Throughput: 0: 42779.5. Samples: 14603409560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 13:05:53,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 13:05:56,373][15401] Updated weights for policy 0, policy_version 891324 (0.0045) [2024-06-25 13:05:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14603550720. Throughput: 0: 42955.1. Samples: 14603672400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:05:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 13:05:59,703][15401] Updated weights for policy 0, policy_version 891334 (0.0038) [2024-06-25 13:06:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14603763712. Throughput: 0: 42532.9. Samples: 14603920720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 13:06:04,071][15401] Updated weights for policy 0, policy_version 891344 (0.0041) [2024-06-25 13:06:07,287][15401] Updated weights for policy 0, policy_version 891354 (0.0040) [2024-06-25 13:06:08,392][15132] Fps is (10 sec: 42587.6, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14603976704. Throughput: 0: 42656.3. Samples: 14604050840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:08,393][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 13:06:11,766][15401] Updated weights for policy 0, policy_version 891364 (0.0040) [2024-06-25 13:06:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14604189696. Throughput: 0: 42756.4. Samples: 14604314120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:13,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:06:15,489][15401] Updated weights for policy 0, policy_version 891374 (0.0034) [2024-06-25 13:06:18,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14604419072. Throughput: 0: 42444.4. Samples: 14604559780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 13:06:19,418][15401] Updated weights for policy 0, policy_version 891384 (0.0039) [2024-06-25 13:06:23,015][15401] Updated weights for policy 0, policy_version 891394 (0.0039) [2024-06-25 13:06:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14604632064. Throughput: 0: 42507.1. Samples: 14604690060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 13:06:27,015][15401] Updated weights for policy 0, policy_version 891404 (0.0036) [2024-06-25 13:06:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 14604828672. Throughput: 0: 42688.9. Samples: 14604948320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 13:06:30,531][15401] Updated weights for policy 0, policy_version 891414 (0.0043) [2024-06-25 13:06:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14605058048. Throughput: 0: 42581.4. Samples: 14605202760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 13:06:35,003][15401] Updated weights for policy 0, policy_version 891424 (0.0037) [2024-06-25 13:06:38,135][15401] Updated weights for policy 0, policy_version 891434 (0.0041) [2024-06-25 13:06:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14605271040. Throughput: 0: 42780.8. Samples: 14605334700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 13:06:42,601][15401] Updated weights for policy 0, policy_version 891444 (0.0045) [2024-06-25 13:06:43,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14605451264. Throughput: 0: 42673.2. Samples: 14605592700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:43,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 13:06:45,841][15401] Updated weights for policy 0, policy_version 891454 (0.0032) [2024-06-25 13:06:48,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14605697024. Throughput: 0: 42619.1. Samples: 14605838580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:48,394][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 13:06:50,236][15401] Updated weights for policy 0, policy_version 891464 (0.0043) [2024-06-25 13:06:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 14605893632. Throughput: 0: 42823.1. Samples: 14605977780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 13:06:53,550][15401] Updated weights for policy 0, policy_version 891474 (0.0036) [2024-06-25 13:06:57,894][15401] Updated weights for policy 0, policy_version 891484 (0.0028) [2024-06-25 13:06:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42654.2). Total num frames: 14606090240. Throughput: 0: 42482.7. Samples: 14606225840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:06:58,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 13:07:01,201][15401] Updated weights for policy 0, policy_version 891494 (0.0033) [2024-06-25 13:07:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14606336000. Throughput: 0: 42604.0. Samples: 14606476960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 13:07:05,469][15401] Updated weights for policy 0, policy_version 891504 (0.0039) [2024-06-25 13:07:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 14606532608. Throughput: 0: 42808.5. Samples: 14606616440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:08,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 13:07:08,764][15401] Updated weights for policy 0, policy_version 891514 (0.0041) [2024-06-25 13:07:13,090][15401] Updated weights for policy 0, policy_version 891524 (0.0036) [2024-06-25 13:07:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14606745600. Throughput: 0: 42787.9. Samples: 14606873780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:13,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 13:07:16,666][15401] Updated weights for policy 0, policy_version 891534 (0.0043) [2024-06-25 13:07:16,899][15349] Signal inference workers to stop experience collection... (216200 times) [2024-06-25 13:07:16,932][15401] InferenceWorker_p0-w0: stopping experience collection (216200 times) [2024-06-25 13:07:16,948][15349] Signal inference workers to resume experience collection... (216200 times) [2024-06-25 13:07:16,961][15401] InferenceWorker_p0-w0: resuming experience collection (216200 times) [2024-06-25 13:07:18,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 14606991360. Throughput: 0: 42609.6. Samples: 14607120200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 13:07:20,551][15401] Updated weights for policy 0, policy_version 891544 (0.0037) [2024-06-25 13:07:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14607171584. Throughput: 0: 42749.4. Samples: 14607258420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 13:07:24,055][15401] Updated weights for policy 0, policy_version 891554 (0.0033) [2024-06-25 13:07:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14607368192. Throughput: 0: 42659.9. Samples: 14607512400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 13:07:28,587][15401] Updated weights for policy 0, policy_version 891564 (0.0030) [2024-06-25 13:07:31,782][15401] Updated weights for policy 0, policy_version 891574 (0.0039) [2024-06-25 13:07:33,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14607646720. Throughput: 0: 42796.5. Samples: 14607764420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:33,390][15132] Avg episode reward: [(0, '0.218')] [2024-06-25 13:07:36,188][15401] Updated weights for policy 0, policy_version 891584 (0.0023) [2024-06-25 13:07:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14607810560. Throughput: 0: 42810.4. Samples: 14607904240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 13:07:39,343][15401] Updated weights for policy 0, policy_version 891594 (0.0032) [2024-06-25 13:07:43,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14608023552. Throughput: 0: 42797.7. Samples: 14608151740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 13:07:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 13:07:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891603_14608023552.pth... [2024-06-25 13:07:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000890979_14597799936.pth [2024-06-25 13:07:44,150][15401] Updated weights for policy 0, policy_version 891604 (0.0034) [2024-06-25 13:07:47,207][15401] Updated weights for policy 0, policy_version 891614 (0.0042) [2024-06-25 13:07:48,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14608285696. Throughput: 0: 42735.9. Samples: 14608400080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:07:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 13:07:51,710][15401] Updated weights for policy 0, policy_version 891624 (0.0035) [2024-06-25 13:07:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14608449536. Throughput: 0: 42834.2. Samples: 14608543980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:07:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 13:07:54,825][15401] Updated weights for policy 0, policy_version 891634 (0.0020) [2024-06-25 13:07:58,396][15132] Fps is (10 sec: 37659.5, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 14608662528. Throughput: 0: 42607.0. Samples: 14608791360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:07:58,396][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 13:07:59,303][15401] Updated weights for policy 0, policy_version 891644 (0.0040) [2024-06-25 13:08:02,425][15401] Updated weights for policy 0, policy_version 891654 (0.0036) [2024-06-25 13:08:03,392][15132] Fps is (10 sec: 45864.3, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14608908288. Throughput: 0: 42681.4. Samples: 14609040960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:03,392][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 13:08:06,857][15401] Updated weights for policy 0, policy_version 891664 (0.0039) [2024-06-25 13:08:08,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14609088512. Throughput: 0: 42741.0. Samples: 14609181760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 13:08:10,085][15401] Updated weights for policy 0, policy_version 891674 (0.0038) [2024-06-25 13:08:13,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14609301504. Throughput: 0: 42476.1. Samples: 14609423820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:13,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 13:08:14,409][15401] Updated weights for policy 0, policy_version 891684 (0.0036) [2024-06-25 13:08:18,047][15401] Updated weights for policy 0, policy_version 891694 (0.0042) [2024-06-25 13:08:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14609530880. Throughput: 0: 42757.0. Samples: 14609688480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 13:08:21,953][15401] Updated weights for policy 0, policy_version 891704 (0.0037) [2024-06-25 13:08:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14609727488. Throughput: 0: 42525.8. Samples: 14609817900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:08:25,444][15349] Signal inference workers to stop experience collection... (216250 times) [2024-06-25 13:08:25,444][15349] Signal inference workers to resume experience collection... (216250 times) [2024-06-25 13:08:25,455][15401] InferenceWorker_p0-w0: stopping experience collection (216250 times) [2024-06-25 13:08:25,456][15401] InferenceWorker_p0-w0: resuming experience collection (216250 times) [2024-06-25 13:08:25,988][15401] Updated weights for policy 0, policy_version 891714 (0.0034) [2024-06-25 13:08:28,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14609956864. Throughput: 0: 42529.9. Samples: 14610065580. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 13:08:29,517][15401] Updated weights for policy 0, policy_version 891724 (0.0047) [2024-06-25 13:08:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 14610153472. Throughput: 0: 42978.8. Samples: 14610334120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 13:08:33,502][15401] Updated weights for policy 0, policy_version 891734 (0.0038) [2024-06-25 13:08:37,410][15401] Updated weights for policy 0, policy_version 891744 (0.0032) [2024-06-25 13:08:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 14610382848. Throughput: 0: 42543.5. Samples: 14610458540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:38,393][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 13:08:41,062][15401] Updated weights for policy 0, policy_version 891754 (0.0046) [2024-06-25 13:08:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14610612224. Throughput: 0: 42932.3. Samples: 14610723040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:43,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 13:08:44,874][15401] Updated weights for policy 0, policy_version 891764 (0.0030) [2024-06-25 13:08:48,389][15132] Fps is (10 sec: 40970.1, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 14610792448. Throughput: 0: 43064.1. Samples: 14610978740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 13:08:48,784][15401] Updated weights for policy 0, policy_version 891774 (0.0034) [2024-06-25 13:08:52,575][15401] Updated weights for policy 0, policy_version 891784 (0.0040) [2024-06-25 13:08:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14611038208. Throughput: 0: 42618.6. Samples: 14611099600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 13:08:56,376][15401] Updated weights for policy 0, policy_version 891794 (0.0041) [2024-06-25 13:08:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 14611251200. Throughput: 0: 43000.4. Samples: 14611358840. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:08:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 13:09:00,089][15401] Updated weights for policy 0, policy_version 891804 (0.0035) [2024-06-25 13:09:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 14611447808. Throughput: 0: 42945.7. Samples: 14611621040. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:03,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 13:09:03,813][15401] Updated weights for policy 0, policy_version 891814 (0.0038) [2024-06-25 13:09:07,569][15401] Updated weights for policy 0, policy_version 891824 (0.0029) [2024-06-25 13:09:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14611677184. Throughput: 0: 42851.0. Samples: 14611746200. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 13:09:11,320][15401] Updated weights for policy 0, policy_version 891834 (0.0045) [2024-06-25 13:09:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14611906560. Throughput: 0: 43080.8. Samples: 14612004220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 13:09:15,145][15401] Updated weights for policy 0, policy_version 891844 (0.0035) [2024-06-25 13:09:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 14612086784. Throughput: 0: 42726.6. Samples: 14612256820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 13:09:19,630][15401] Updated weights for policy 0, policy_version 891854 (0.0036) [2024-06-25 13:09:22,915][15401] Updated weights for policy 0, policy_version 891864 (0.0029) [2024-06-25 13:09:23,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14612299776. Throughput: 0: 42784.0. Samples: 14612383720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 13:09:27,388][15401] Updated weights for policy 0, policy_version 891874 (0.0038) [2024-06-25 13:09:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14612529152. Throughput: 0: 42587.6. Samples: 14612639480. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:28,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 13:09:30,942][15401] Updated weights for policy 0, policy_version 891884 (0.0037) [2024-06-25 13:09:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14612725760. Throughput: 0: 42452.9. Samples: 14612889120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:33,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 13:09:34,937][15401] Updated weights for policy 0, policy_version 891894 (0.0031) [2024-06-25 13:09:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 14612938752. Throughput: 0: 42614.6. Samples: 14613017260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 20.0) [2024-06-25 13:09:38,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 13:09:38,543][15401] Updated weights for policy 0, policy_version 891904 (0.0034) [2024-06-25 13:09:42,018][15349] Signal inference workers to stop experience collection... (216300 times) [2024-06-25 13:09:42,019][15349] Signal inference workers to resume experience collection... (216300 times) [2024-06-25 13:09:42,060][15401] InferenceWorker_p0-w0: stopping experience collection (216300 times) [2024-06-25 13:09:42,060][15401] InferenceWorker_p0-w0: resuming experience collection (216300 times) [2024-06-25 13:09:42,514][15401] Updated weights for policy 0, policy_version 891914 (0.0033) [2024-06-25 13:09:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14613151744. Throughput: 0: 42565.7. Samples: 14613274300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:09:43,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 13:09:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891916_14613151744.pth... [2024-06-25 13:09:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891291_14602911744.pth [2024-06-25 13:09:46,196][15401] Updated weights for policy 0, policy_version 891924 (0.0033) [2024-06-25 13:09:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14613348352. Throughput: 0: 42504.4. Samples: 14613533740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:09:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 13:09:50,162][15401] Updated weights for policy 0, policy_version 891934 (0.0024) [2024-06-25 13:09:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14613577728. Throughput: 0: 42570.6. Samples: 14613661880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:09:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 13:09:53,689][15401] Updated weights for policy 0, policy_version 891944 (0.0032) [2024-06-25 13:09:57,649][15401] Updated weights for policy 0, policy_version 891954 (0.0035) [2024-06-25 13:09:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14613790720. Throughput: 0: 42537.9. Samples: 14613918420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:09:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 13:10:01,704][15401] Updated weights for policy 0, policy_version 891964 (0.0042) [2024-06-25 13:10:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14614020096. Throughput: 0: 42519.1. Samples: 14614170280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:03,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:10:05,404][15401] Updated weights for policy 0, policy_version 891974 (0.0031) [2024-06-25 13:10:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14614216704. Throughput: 0: 42596.2. Samples: 14614300540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:08,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 13:10:09,281][15401] Updated weights for policy 0, policy_version 891984 (0.0024) [2024-06-25 13:10:13,390][15132] Fps is (10 sec: 39330.7, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 14614413312. Throughput: 0: 42530.1. Samples: 14614553340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 13:10:13,427][15401] Updated weights for policy 0, policy_version 891994 (0.0034) [2024-06-25 13:10:16,936][15401] Updated weights for policy 0, policy_version 892004 (0.0039) [2024-06-25 13:10:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14614626304. Throughput: 0: 42507.6. Samples: 14614801960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 13:10:21,337][15401] Updated weights for policy 0, policy_version 892014 (0.0029) [2024-06-25 13:10:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14614855680. Throughput: 0: 42549.4. Samples: 14614931980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 13:10:24,505][15401] Updated weights for policy 0, policy_version 892024 (0.0042) [2024-06-25 13:10:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 14615035904. Throughput: 0: 42397.0. Samples: 14615182160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 13:10:29,016][15401] Updated weights for policy 0, policy_version 892034 (0.0040) [2024-06-25 13:10:32,579][15401] Updated weights for policy 0, policy_version 892044 (0.0034) [2024-06-25 13:10:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14615265280. Throughput: 0: 42298.2. Samples: 14615437160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:33,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 13:10:36,646][15401] Updated weights for policy 0, policy_version 892054 (0.0046) [2024-06-25 13:10:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14615494656. Throughput: 0: 42399.2. Samples: 14615569840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:38,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 13:10:40,250][15401] Updated weights for policy 0, policy_version 892064 (0.0041) [2024-06-25 13:10:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14615691264. Throughput: 0: 42223.9. Samples: 14615818500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 13:10:44,388][15401] Updated weights for policy 0, policy_version 892074 (0.0026) [2024-06-25 13:10:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14615887872. Throughput: 0: 42207.2. Samples: 14616069500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 13:10:48,482][15401] Updated weights for policy 0, policy_version 892084 (0.0037) [2024-06-25 13:10:52,421][15401] Updated weights for policy 0, policy_version 892094 (0.0043) [2024-06-25 13:10:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14616133632. Throughput: 0: 42120.0. Samples: 14616195940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:53,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 13:10:55,938][15401] Updated weights for policy 0, policy_version 892104 (0.0029) [2024-06-25 13:10:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14616346624. Throughput: 0: 42224.9. Samples: 14616453460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:10:58,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 13:11:00,251][15401] Updated weights for policy 0, policy_version 892114 (0.0031) [2024-06-25 13:11:03,392][15132] Fps is (10 sec: 40949.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14616543232. Throughput: 0: 42240.3. Samples: 14616702880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:03,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 13:11:03,550][15401] Updated weights for policy 0, policy_version 892124 (0.0027) [2024-06-25 13:11:07,904][15401] Updated weights for policy 0, policy_version 892134 (0.0034) [2024-06-25 13:11:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14616756224. Throughput: 0: 42261.8. Samples: 14616833760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:08,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 13:11:11,114][15401] Updated weights for policy 0, policy_version 892144 (0.0034) [2024-06-25 13:11:13,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14616952832. Throughput: 0: 42477.3. Samples: 14617093640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:13,390][15132] Avg episode reward: [(0, '0.303')] [2024-06-25 13:11:15,653][15401] Updated weights for policy 0, policy_version 892154 (0.0027) [2024-06-25 13:11:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14617198592. Throughput: 0: 42326.5. Samples: 14617341860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 13:11:18,797][15349] Signal inference workers to stop experience collection... (216350 times) [2024-06-25 13:11:18,825][15401] InferenceWorker_p0-w0: stopping experience collection (216350 times) [2024-06-25 13:11:18,851][15349] Signal inference workers to resume experience collection... (216350 times) [2024-06-25 13:11:18,851][15401] InferenceWorker_p0-w0: resuming experience collection (216350 times) [2024-06-25 13:11:18,854][15401] Updated weights for policy 0, policy_version 892164 (0.0038) [2024-06-25 13:11:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 14617362432. Throughput: 0: 42369.7. Samples: 14617476480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 13:11:23,412][15401] Updated weights for policy 0, policy_version 892174 (0.0042) [2024-06-25 13:11:26,322][15401] Updated weights for policy 0, policy_version 892184 (0.0039) [2024-06-25 13:11:28,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14617591808. Throughput: 0: 42474.4. Samples: 14617729840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:11:30,919][15401] Updated weights for policy 0, policy_version 892194 (0.0039) [2024-06-25 13:11:33,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14617837568. Throughput: 0: 42548.3. Samples: 14617984180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 13:11:33,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 13:11:33,869][15401] Updated weights for policy 0, policy_version 892204 (0.0039) [2024-06-25 13:11:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42050.6, 300 sec: 42598.1). Total num frames: 14618017792. Throughput: 0: 42690.6. Samples: 14618117120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:11:38,392][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 13:11:38,790][15401] Updated weights for policy 0, policy_version 892214 (0.0034) [2024-06-25 13:11:41,411][15401] Updated weights for policy 0, policy_version 892224 (0.0046) [2024-06-25 13:11:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14618247168. Throughput: 0: 42554.8. Samples: 14618368420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:11:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 13:11:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892227_14618247168.pth... [2024-06-25 13:11:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891603_14608023552.pth [2024-06-25 13:11:46,260][15401] Updated weights for policy 0, policy_version 892234 (0.0036) [2024-06-25 13:11:48,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14618492928. Throughput: 0: 42772.6. Samples: 14618627540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:11:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 13:11:48,840][15401] Updated weights for policy 0, policy_version 892244 (0.0028) [2024-06-25 13:11:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14618656768. Throughput: 0: 42826.6. Samples: 14618760960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:11:53,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:11:53,815][15401] Updated weights for policy 0, policy_version 892254 (0.0032) [2024-06-25 13:11:56,430][15401] Updated weights for policy 0, policy_version 892264 (0.0033) [2024-06-25 13:11:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14618902528. Throughput: 0: 42820.5. Samples: 14619020560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:11:58,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:12:01,238][15401] Updated weights for policy 0, policy_version 892274 (0.0027) [2024-06-25 13:12:03,389][15132] Fps is (10 sec: 49152.9, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 14619148288. Throughput: 0: 43017.6. Samples: 14619277640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:12:03,952][15401] Updated weights for policy 0, policy_version 892284 (0.0041) [2024-06-25 13:12:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14619295744. Throughput: 0: 43123.3. Samples: 14619417020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 13:12:09,038][15401] Updated weights for policy 0, policy_version 892294 (0.0036) [2024-06-25 13:12:11,444][15401] Updated weights for policy 0, policy_version 892304 (0.0042) [2024-06-25 13:12:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 14619541504. Throughput: 0: 43012.0. Samples: 14619665380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:13,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 13:12:16,705][15401] Updated weights for policy 0, policy_version 892314 (0.0027) [2024-06-25 13:12:18,389][15132] Fps is (10 sec: 49152.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 14619787264. Throughput: 0: 43183.3. Samples: 14619927420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:18,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 13:12:19,135][15401] Updated weights for policy 0, policy_version 892324 (0.0036) [2024-06-25 13:12:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 14619951104. Throughput: 0: 43177.9. Samples: 14620060020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 13:12:24,315][15401] Updated weights for policy 0, policy_version 892334 (0.0029) [2024-06-25 13:12:27,252][15401] Updated weights for policy 0, policy_version 892344 (0.0038) [2024-06-25 13:12:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 14620196864. Throughput: 0: 43223.6. Samples: 14620313480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 13:12:31,731][15401] Updated weights for policy 0, policy_version 892354 (0.0035) [2024-06-25 13:12:33,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14620426240. Throughput: 0: 43223.8. Samples: 14620572620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:33,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 13:12:33,976][15349] Signal inference workers to stop experience collection... (216400 times) [2024-06-25 13:12:33,989][15401] InferenceWorker_p0-w0: stopping experience collection (216400 times) [2024-06-25 13:12:34,034][15349] Signal inference workers to resume experience collection... (216400 times) [2024-06-25 13:12:34,034][15401] InferenceWorker_p0-w0: resuming experience collection (216400 times) [2024-06-25 13:12:34,734][15401] Updated weights for policy 0, policy_version 892364 (0.0028) [2024-06-25 13:12:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 14620590080. Throughput: 0: 43148.6. Samples: 14620702640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:38,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 13:12:39,253][15401] Updated weights for policy 0, policy_version 892374 (0.0022) [2024-06-25 13:12:42,445][15401] Updated weights for policy 0, policy_version 892384 (0.0032) [2024-06-25 13:12:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 14620852224. Throughput: 0: 42979.9. Samples: 14620954660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 13:12:46,960][15401] Updated weights for policy 0, policy_version 892394 (0.0035) [2024-06-25 13:12:48,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14621065216. Throughput: 0: 43054.0. Samples: 14621215080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 13:12:49,975][15401] Updated weights for policy 0, policy_version 892404 (0.0033) [2024-06-25 13:12:53,389][15132] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42654.9). Total num frames: 14621245440. Throughput: 0: 42724.0. Samples: 14621339600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:53,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 13:12:54,547][15401] Updated weights for policy 0, policy_version 892414 (0.0045) [2024-06-25 13:12:57,606][15401] Updated weights for policy 0, policy_version 892424 (0.0034) [2024-06-25 13:12:58,392][15132] Fps is (10 sec: 44226.8, 60 sec: 43415.9, 300 sec: 42709.5). Total num frames: 14621507584. Throughput: 0: 42971.9. Samples: 14621599220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:12:58,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 13:13:02,262][15401] Updated weights for policy 0, policy_version 892434 (0.0033) [2024-06-25 13:13:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 14621671424. Throughput: 0: 42895.4. Samples: 14621857720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:13:03,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 13:13:05,408][15401] Updated weights for policy 0, policy_version 892444 (0.0030) [2024-06-25 13:13:08,390][15132] Fps is (10 sec: 39330.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14621900800. Throughput: 0: 42605.2. Samples: 14621977260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:13:08,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 13:13:09,965][15401] Updated weights for policy 0, policy_version 892454 (0.0034) [2024-06-25 13:13:13,349][15401] Updated weights for policy 0, policy_version 892464 (0.0038) [2024-06-25 13:13:13,392][15132] Fps is (10 sec: 45863.4, 60 sec: 43142.5, 300 sec: 42709.1). Total num frames: 14622130176. Throughput: 0: 42841.9. Samples: 14622241480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:13:13,393][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 13:13:17,447][15401] Updated weights for policy 0, policy_version 892474 (0.0035) [2024-06-25 13:13:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14622326784. Throughput: 0: 42730.3. Samples: 14622495480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:13:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 13:13:20,854][15401] Updated weights for policy 0, policy_version 892484 (0.0046) [2024-06-25 13:13:23,390][15132] Fps is (10 sec: 40970.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14622539776. Throughput: 0: 42562.1. Samples: 14622617940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 13:13:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 13:13:25,202][15401] Updated weights for policy 0, policy_version 892494 (0.0032) [2024-06-25 13:13:28,352][15401] Updated weights for policy 0, policy_version 892504 (0.0021) [2024-06-25 13:13:28,392][15132] Fps is (10 sec: 45864.1, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14622785536. Throughput: 0: 42903.6. Samples: 14622885420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:28,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 13:13:32,711][15401] Updated weights for policy 0, policy_version 892514 (0.0029) [2024-06-25 13:13:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 14622982144. Throughput: 0: 42717.4. Samples: 14623137360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:33,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 13:13:36,000][15401] Updated weights for policy 0, policy_version 892524 (0.0031) [2024-06-25 13:13:38,392][15132] Fps is (10 sec: 39321.5, 60 sec: 43142.7, 300 sec: 42598.1). Total num frames: 14623178752. Throughput: 0: 42676.8. Samples: 14623260160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:38,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 13:13:40,325][15401] Updated weights for policy 0, policy_version 892534 (0.0045) [2024-06-25 13:13:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14623408128. Throughput: 0: 42730.6. Samples: 14623522000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 13:13:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892542_14623408128.pth... [2024-06-25 13:13:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000891916_14613151744.pth [2024-06-25 13:13:44,150][15401] Updated weights for policy 0, policy_version 892544 (0.0041) [2024-06-25 13:13:48,299][15401] Updated weights for policy 0, policy_version 892554 (0.0032) [2024-06-25 13:13:48,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14623604736. Throughput: 0: 42632.5. Samples: 14623776180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 13:13:51,766][15401] Updated weights for policy 0, policy_version 892564 (0.0044) [2024-06-25 13:13:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14623817728. Throughput: 0: 42791.7. Samples: 14623902880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:53,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 13:13:56,106][15401] Updated weights for policy 0, policy_version 892574 (0.0039) [2024-06-25 13:13:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 14624030720. Throughput: 0: 42577.6. Samples: 14624157360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:13:58,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 13:13:59,524][15349] Signal inference workers to stop experience collection... (216450 times) [2024-06-25 13:13:59,559][15401] InferenceWorker_p0-w0: stopping experience collection (216450 times) [2024-06-25 13:13:59,571][15349] Signal inference workers to resume experience collection... (216450 times) [2024-06-25 13:13:59,584][15401] InferenceWorker_p0-w0: resuming experience collection (216450 times) [2024-06-25 13:13:59,586][15401] Updated weights for policy 0, policy_version 892584 (0.0032) [2024-06-25 13:14:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14624243712. Throughput: 0: 42533.3. Samples: 14624409480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:03,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 13:14:03,803][15401] Updated weights for policy 0, policy_version 892594 (0.0024) [2024-06-25 13:14:07,104][15401] Updated weights for policy 0, policy_version 892604 (0.0036) [2024-06-25 13:14:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14624456704. Throughput: 0: 42612.1. Samples: 14624535480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 13:14:11,453][15401] Updated weights for policy 0, policy_version 892614 (0.0040) [2024-06-25 13:14:13,390][15132] Fps is (10 sec: 44234.7, 60 sec: 42599.9, 300 sec: 42709.4). Total num frames: 14624686080. Throughput: 0: 42548.9. Samples: 14624800040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:13,391][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 13:14:14,866][15401] Updated weights for policy 0, policy_version 892624 (0.0031) [2024-06-25 13:14:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14624899072. Throughput: 0: 42585.9. Samples: 14625053720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 13:14:18,985][15401] Updated weights for policy 0, policy_version 892634 (0.0034) [2024-06-25 13:14:22,803][15401] Updated weights for policy 0, policy_version 892644 (0.0040) [2024-06-25 13:14:23,389][15132] Fps is (10 sec: 40962.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14625095680. Throughput: 0: 42681.9. Samples: 14625180740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:14:26,412][15401] Updated weights for policy 0, policy_version 892654 (0.0031) [2024-06-25 13:14:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14625325056. Throughput: 0: 42648.5. Samples: 14625441180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 13:14:30,426][15401] Updated weights for policy 0, policy_version 892664 (0.0030) [2024-06-25 13:14:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14625538048. Throughput: 0: 42726.3. Samples: 14625698860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:33,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 13:14:33,874][15401] Updated weights for policy 0, policy_version 892674 (0.0035) [2024-06-25 13:14:38,201][15401] Updated weights for policy 0, policy_version 892684 (0.0045) [2024-06-25 13:14:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.2, 300 sec: 42654.0). Total num frames: 14625734656. Throughput: 0: 42703.2. Samples: 14625824520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:38,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 13:14:41,496][15401] Updated weights for policy 0, policy_version 892694 (0.0035) [2024-06-25 13:14:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14625964032. Throughput: 0: 42830.3. Samples: 14626084720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 13:14:45,818][15401] Updated weights for policy 0, policy_version 892704 (0.0034) [2024-06-25 13:14:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14626177024. Throughput: 0: 42997.7. Samples: 14626344380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:48,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 13:14:49,362][15401] Updated weights for policy 0, policy_version 892714 (0.0032) [2024-06-25 13:14:53,338][15401] Updated weights for policy 0, policy_version 892724 (0.0037) [2024-06-25 13:14:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14626390016. Throughput: 0: 43018.1. Samples: 14626471300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:53,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 13:14:57,078][15401] Updated weights for policy 0, policy_version 892734 (0.0034) [2024-06-25 13:14:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 14626619392. Throughput: 0: 42876.5. Samples: 14626729460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:14:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 13:15:00,811][15401] Updated weights for policy 0, policy_version 892744 (0.0039) [2024-06-25 13:15:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 14626816000. Throughput: 0: 42909.2. Samples: 14626984740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:15:03,392][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 13:15:05,256][15401] Updated weights for policy 0, policy_version 892754 (0.0034) [2024-06-25 13:15:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14627028992. Throughput: 0: 42931.9. Samples: 14627112680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:15:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 13:15:08,591][15401] Updated weights for policy 0, policy_version 892764 (0.0047) [2024-06-25 13:15:12,699][15401] Updated weights for policy 0, policy_version 892774 (0.0034) [2024-06-25 13:15:13,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.8, 300 sec: 42820.5). Total num frames: 14627258368. Throughput: 0: 42928.4. Samples: 14627372960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:15:13,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 13:15:16,239][15401] Updated weights for policy 0, policy_version 892784 (0.0034) [2024-06-25 13:15:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14627454976. Throughput: 0: 42849.9. Samples: 14627627100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 13:15:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 13:15:20,317][15401] Updated weights for policy 0, policy_version 892794 (0.0037) [2024-06-25 13:15:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 14627684352. Throughput: 0: 42867.4. Samples: 14627753660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:23,393][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 13:15:23,599][15401] Updated weights for policy 0, policy_version 892804 (0.0022) [2024-06-25 13:15:26,087][15349] Signal inference workers to stop experience collection... (216500 times) [2024-06-25 13:15:26,131][15401] InferenceWorker_p0-w0: stopping experience collection (216500 times) [2024-06-25 13:15:26,142][15349] Signal inference workers to resume experience collection... (216500 times) [2024-06-25 13:15:26,152][15401] InferenceWorker_p0-w0: resuming experience collection (216500 times) [2024-06-25 13:15:27,830][15401] Updated weights for policy 0, policy_version 892814 (0.0039) [2024-06-25 13:15:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14627897344. Throughput: 0: 42850.2. Samples: 14628012980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 13:15:31,139][15401] Updated weights for policy 0, policy_version 892824 (0.0034) [2024-06-25 13:15:33,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14628093952. Throughput: 0: 42823.0. Samples: 14628271420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:33,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 13:15:35,407][15401] Updated weights for policy 0, policy_version 892834 (0.0027) [2024-06-25 13:15:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 14628323328. Throughput: 0: 42721.3. Samples: 14628393760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:38,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 13:15:38,738][15401] Updated weights for policy 0, policy_version 892844 (0.0032) [2024-06-25 13:15:43,130][15401] Updated weights for policy 0, policy_version 892854 (0.0027) [2024-06-25 13:15:43,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14628536320. Throughput: 0: 42691.6. Samples: 14628650580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:43,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 13:15:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892856_14628552704.pth... [2024-06-25 13:15:43,576][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892227_14618247168.pth [2024-06-25 13:15:46,915][15401] Updated weights for policy 0, policy_version 892864 (0.0039) [2024-06-25 13:15:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14628749312. Throughput: 0: 42740.5. Samples: 14628907960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:48,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 13:15:50,835][15401] Updated weights for policy 0, policy_version 892874 (0.0032) [2024-06-25 13:15:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14628962304. Throughput: 0: 42678.2. Samples: 14629033200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:15:54,385][15401] Updated weights for policy 0, policy_version 892884 (0.0032) [2024-06-25 13:15:58,131][15401] Updated weights for policy 0, policy_version 892894 (0.0036) [2024-06-25 13:15:58,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 14629191680. Throughput: 0: 42753.3. Samples: 14629296860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:15:58,395][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 13:16:02,066][15401] Updated weights for policy 0, policy_version 892904 (0.0034) [2024-06-25 13:16:03,396][15132] Fps is (10 sec: 40933.4, 60 sec: 42595.5, 300 sec: 42764.1). Total num frames: 14629371904. Throughput: 0: 42928.8. Samples: 14629559180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:03,397][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 13:16:05,879][15401] Updated weights for policy 0, policy_version 892914 (0.0026) [2024-06-25 13:16:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14629617664. Throughput: 0: 42904.4. Samples: 14629684260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:08,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 13:16:09,674][15401] Updated weights for policy 0, policy_version 892924 (0.0036) [2024-06-25 13:16:13,390][15132] Fps is (10 sec: 44265.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14629814272. Throughput: 0: 42930.5. Samples: 14629944860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 13:16:13,529][15401] Updated weights for policy 0, policy_version 892934 (0.0020) [2024-06-25 13:16:17,426][15401] Updated weights for policy 0, policy_version 892944 (0.0025) [2024-06-25 13:16:18,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 14630027264. Throughput: 0: 42938.4. Samples: 14630203640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:18,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 13:16:21,376][15401] Updated weights for policy 0, policy_version 892954 (0.0035) [2024-06-25 13:16:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 14630240256. Throughput: 0: 43086.2. Samples: 14630332640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:16:24,871][15401] Updated weights for policy 0, policy_version 892964 (0.0036) [2024-06-25 13:16:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14630469632. Throughput: 0: 42916.7. Samples: 14630581840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:28,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 13:16:28,774][15401] Updated weights for policy 0, policy_version 892974 (0.0023) [2024-06-25 13:16:32,612][15401] Updated weights for policy 0, policy_version 892984 (0.0027) [2024-06-25 13:16:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42932.0). Total num frames: 14630682624. Throughput: 0: 42953.7. Samples: 14630840880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:33,396][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 13:16:36,489][15401] Updated weights for policy 0, policy_version 892994 (0.0046) [2024-06-25 13:16:38,390][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14630879232. Throughput: 0: 42975.5. Samples: 14630967100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:38,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 13:16:40,293][15401] Updated weights for policy 0, policy_version 893004 (0.0039) [2024-06-25 13:16:42,407][15349] Signal inference workers to stop experience collection... (216550 times) [2024-06-25 13:16:42,412][15349] Signal inference workers to resume experience collection... (216550 times) [2024-06-25 13:16:42,436][15401] InferenceWorker_p0-w0: stopping experience collection (216550 times) [2024-06-25 13:16:42,437][15401] InferenceWorker_p0-w0: resuming experience collection (216550 times) [2024-06-25 13:16:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14631108608. Throughput: 0: 42973.1. Samples: 14631230640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 13:16:44,015][15401] Updated weights for policy 0, policy_version 893014 (0.0027) [2024-06-25 13:16:47,776][15401] Updated weights for policy 0, policy_version 893024 (0.0038) [2024-06-25 13:16:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14631321600. Throughput: 0: 42817.8. Samples: 14631485700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 13:16:51,901][15401] Updated weights for policy 0, policy_version 893034 (0.0045) [2024-06-25 13:16:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14631534592. Throughput: 0: 42907.6. Samples: 14631615100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:53,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 13:16:55,213][15401] Updated weights for policy 0, policy_version 893044 (0.0044) [2024-06-25 13:16:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14631747584. Throughput: 0: 42882.6. Samples: 14631874580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:16:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 13:16:59,323][15401] Updated weights for policy 0, policy_version 893054 (0.0031) [2024-06-25 13:17:02,800][15401] Updated weights for policy 0, policy_version 893064 (0.0051) [2024-06-25 13:17:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43422.2, 300 sec: 42987.2). Total num frames: 14631976960. Throughput: 0: 42716.8. Samples: 14632125900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:17:03,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 13:17:07,493][15401] Updated weights for policy 0, policy_version 893074 (0.0032) [2024-06-25 13:17:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14632173568. Throughput: 0: 42671.3. Samples: 14632252840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:17:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 13:17:10,888][15401] Updated weights for policy 0, policy_version 893084 (0.0039) [2024-06-25 13:17:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14632402944. Throughput: 0: 42840.6. Samples: 14632509660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 13:17:15,276][15401] Updated weights for policy 0, policy_version 893094 (0.0036) [2024-06-25 13:17:18,329][15401] Updated weights for policy 0, policy_version 893104 (0.0048) [2024-06-25 13:17:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14632615936. Throughput: 0: 42763.7. Samples: 14632765240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 13:17:23,079][15401] Updated weights for policy 0, policy_version 893114 (0.0037) [2024-06-25 13:17:23,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 14632796160. Throughput: 0: 42708.4. Samples: 14632889080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:23,393][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 13:17:26,250][15401] Updated weights for policy 0, policy_version 893124 (0.0027) [2024-06-25 13:17:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14633025536. Throughput: 0: 42671.0. Samples: 14633150840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 13:17:30,640][15401] Updated weights for policy 0, policy_version 893134 (0.0039) [2024-06-25 13:17:33,396][15132] Fps is (10 sec: 45857.0, 60 sec: 42866.9, 300 sec: 42930.7). Total num frames: 14633254912. Throughput: 0: 42653.4. Samples: 14633405380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:33,397][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 13:17:33,916][15401] Updated weights for policy 0, policy_version 893144 (0.0027) [2024-06-25 13:17:38,177][15401] Updated weights for policy 0, policy_version 893154 (0.0048) [2024-06-25 13:17:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14633435136. Throughput: 0: 42638.7. Samples: 14633533840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:38,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 13:17:38,529][15349] Signal inference workers to stop experience collection... (216600 times) [2024-06-25 13:17:38,529][15349] Signal inference workers to resume experience collection... (216600 times) [2024-06-25 13:17:38,538][15401] InferenceWorker_p0-w0: stopping experience collection (216600 times) [2024-06-25 13:17:38,556][15401] InferenceWorker_p0-w0: resuming experience collection (216600 times) [2024-06-25 13:17:41,414][15401] Updated weights for policy 0, policy_version 893164 (0.0040) [2024-06-25 13:17:43,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14633664512. Throughput: 0: 42534.7. Samples: 14633788640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:43,394][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 13:17:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893168_14633664512.pth... [2024-06-25 13:17:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892542_14623408128.pth [2024-06-25 13:17:45,806][15401] Updated weights for policy 0, policy_version 893174 (0.0021) [2024-06-25 13:17:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14633877504. Throughput: 0: 42659.1. Samples: 14634045560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 13:17:48,975][15401] Updated weights for policy 0, policy_version 893184 (0.0033) [2024-06-25 13:17:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 14634074112. Throughput: 0: 42666.2. Samples: 14634172820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:17:53,424][15401] Updated weights for policy 0, policy_version 893194 (0.0042) [2024-06-25 13:17:56,897][15401] Updated weights for policy 0, policy_version 893204 (0.0037) [2024-06-25 13:17:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14634303488. Throughput: 0: 42590.3. Samples: 14634426220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:17:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:18:01,213][15401] Updated weights for policy 0, policy_version 893214 (0.0046) [2024-06-25 13:18:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14634516480. Throughput: 0: 42587.5. Samples: 14634681680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:03,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 13:18:04,487][15401] Updated weights for policy 0, policy_version 893224 (0.0037) [2024-06-25 13:18:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.9). Total num frames: 14634729472. Throughput: 0: 42619.2. Samples: 14634806840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 13:18:08,975][15401] Updated weights for policy 0, policy_version 893234 (0.0043) [2024-06-25 13:18:12,403][15401] Updated weights for policy 0, policy_version 893244 (0.0024) [2024-06-25 13:18:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14634942464. Throughput: 0: 42517.7. Samples: 14635064140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:13,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 13:18:16,803][15401] Updated weights for policy 0, policy_version 893254 (0.0036) [2024-06-25 13:18:18,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 14635171840. Throughput: 0: 42578.0. Samples: 14635321220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:18,393][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 13:18:20,105][15401] Updated weights for policy 0, policy_version 893264 (0.0034) [2024-06-25 13:18:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42600.2, 300 sec: 42598.8). Total num frames: 14635352064. Throughput: 0: 42614.2. Samples: 14635451480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 13:18:24,245][15401] Updated weights for policy 0, policy_version 893274 (0.0033) [2024-06-25 13:18:27,547][15401] Updated weights for policy 0, policy_version 893284 (0.0030) [2024-06-25 13:18:28,395][15132] Fps is (10 sec: 40949.3, 60 sec: 42594.8, 300 sec: 42708.8). Total num frames: 14635581440. Throughput: 0: 42582.4. Samples: 14635705060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:28,395][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 13:18:31,564][15401] Updated weights for policy 0, policy_version 893294 (0.0028) [2024-06-25 13:18:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42329.8, 300 sec: 42765.4). Total num frames: 14635794432. Throughput: 0: 42796.1. Samples: 14635971380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 13:18:35,060][15401] Updated weights for policy 0, policy_version 893304 (0.0039) [2024-06-25 13:18:38,389][15132] Fps is (10 sec: 42620.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14636007424. Throughput: 0: 42652.0. Samples: 14636092160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 13:18:39,031][15401] Updated weights for policy 0, policy_version 893314 (0.0030) [2024-06-25 13:18:42,551][15401] Updated weights for policy 0, policy_version 893324 (0.0026) [2024-06-25 13:18:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14636236800. Throughput: 0: 42751.8. Samples: 14636350060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:43,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 13:18:47,050][15401] Updated weights for policy 0, policy_version 893334 (0.0041) [2024-06-25 13:18:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14636417024. Throughput: 0: 42944.4. Samples: 14636614180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:48,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 13:18:50,230][15401] Updated weights for policy 0, policy_version 893344 (0.0036) [2024-06-25 13:18:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14636662784. Throughput: 0: 42826.7. Samples: 14636734040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:53,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 13:18:54,416][15401] Updated weights for policy 0, policy_version 893354 (0.0041) [2024-06-25 13:18:56,707][15349] Signal inference workers to stop experience collection... (216650 times) [2024-06-25 13:18:56,707][15349] Signal inference workers to resume experience collection... (216650 times) [2024-06-25 13:18:56,734][15401] InferenceWorker_p0-w0: stopping experience collection (216650 times) [2024-06-25 13:18:56,734][15401] InferenceWorker_p0-w0: resuming experience collection (216650 times) [2024-06-25 13:18:57,608][15401] Updated weights for policy 0, policy_version 893364 (0.0041) [2024-06-25 13:18:58,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 14636892160. Throughput: 0: 42902.7. Samples: 14636994860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:18:58,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 13:19:02,319][15401] Updated weights for policy 0, policy_version 893374 (0.0034) [2024-06-25 13:19:03,392][15132] Fps is (10 sec: 40951.4, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 14637072384. Throughput: 0: 42977.2. Samples: 14637255180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-25 13:19:03,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 13:19:05,697][15401] Updated weights for policy 0, policy_version 893384 (0.0043) [2024-06-25 13:19:08,389][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 14637301760. Throughput: 0: 42743.9. Samples: 14637374960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 13:19:10,015][15401] Updated weights for policy 0, policy_version 893394 (0.0023) [2024-06-25 13:19:13,187][15401] Updated weights for policy 0, policy_version 893404 (0.0032) [2024-06-25 13:19:13,389][15132] Fps is (10 sec: 45885.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14637531136. Throughput: 0: 42935.1. Samples: 14637636920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 13:19:17,677][15401] Updated weights for policy 0, policy_version 893414 (0.0031) [2024-06-25 13:19:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 14637727744. Throughput: 0: 42795.9. Samples: 14637897200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 13:19:20,822][15401] Updated weights for policy 0, policy_version 893424 (0.0045) [2024-06-25 13:19:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14637957120. Throughput: 0: 42827.1. Samples: 14638019380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 13:19:25,383][15401] Updated weights for policy 0, policy_version 893434 (0.0038) [2024-06-25 13:19:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42875.1, 300 sec: 42765.0). Total num frames: 14638153728. Throughput: 0: 42996.2. Samples: 14638284880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:28,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-25 13:19:28,585][15401] Updated weights for policy 0, policy_version 893444 (0.0036) [2024-06-25 13:19:33,010][15401] Updated weights for policy 0, policy_version 893454 (0.0023) [2024-06-25 13:19:33,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14638366720. Throughput: 0: 42726.1. Samples: 14638536960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:33,393][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 13:19:36,165][15401] Updated weights for policy 0, policy_version 893464 (0.0030) [2024-06-25 13:19:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14638596096. Throughput: 0: 42865.3. Samples: 14638662980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 13:19:40,526][15401] Updated weights for policy 0, policy_version 893474 (0.0024) [2024-06-25 13:19:43,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14638792704. Throughput: 0: 42848.1. Samples: 14638922920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 13:19:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893482_14638809088.pth... [2024-06-25 13:19:43,609][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000892856_14628552704.pth [2024-06-25 13:19:44,179][15401] Updated weights for policy 0, policy_version 893484 (0.0037) [2024-06-25 13:19:48,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14638989312. Throughput: 0: 42604.1. Samples: 14639172280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 13:19:48,426][15401] Updated weights for policy 0, policy_version 893494 (0.0037) [2024-06-25 13:19:51,702][15401] Updated weights for policy 0, policy_version 893504 (0.0034) [2024-06-25 13:19:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14639235072. Throughput: 0: 42691.5. Samples: 14639296080. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:19:56,201][15401] Updated weights for policy 0, policy_version 893514 (0.0030) [2024-06-25 13:19:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42327.0, 300 sec: 42765.4). Total num frames: 14639431680. Throughput: 0: 42763.1. Samples: 14639561260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:19:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:19:59,289][15401] Updated weights for policy 0, policy_version 893524 (0.0035) [2024-06-25 13:20:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 14639628288. Throughput: 0: 42721.0. Samples: 14639819640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 13:20:03,739][15401] Updated weights for policy 0, policy_version 893534 (0.0032) [2024-06-25 13:20:06,732][15401] Updated weights for policy 0, policy_version 893544 (0.0026) [2024-06-25 13:20:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14639874048. Throughput: 0: 42730.6. Samples: 14639942260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 13:20:11,642][15401] Updated weights for policy 0, policy_version 893554 (0.0041) [2024-06-25 13:20:13,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 14640087040. Throughput: 0: 42654.1. Samples: 14640204420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:13,393][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 13:20:14,616][15401] Updated weights for policy 0, policy_version 893564 (0.0030) [2024-06-25 13:20:18,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.7, 300 sec: 42653.9). Total num frames: 14640267264. Throughput: 0: 42722.7. Samples: 14640459480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:18,392][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 13:20:19,163][15401] Updated weights for policy 0, policy_version 893574 (0.0036) [2024-06-25 13:20:22,563][15401] Updated weights for policy 0, policy_version 893584 (0.0044) [2024-06-25 13:20:23,261][15349] Signal inference workers to stop experience collection... (216700 times) [2024-06-25 13:20:23,261][15349] Signal inference workers to resume experience collection... (216700 times) [2024-06-25 13:20:23,308][15401] InferenceWorker_p0-w0: stopping experience collection (216700 times) [2024-06-25 13:20:23,308][15401] InferenceWorker_p0-w0: resuming experience collection (216700 times) [2024-06-25 13:20:23,389][15132] Fps is (10 sec: 45886.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14640545792. Throughput: 0: 42636.9. Samples: 14640581640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 13:20:26,745][15401] Updated weights for policy 0, policy_version 893594 (0.0030) [2024-06-25 13:20:28,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14640726016. Throughput: 0: 42796.3. Samples: 14640848760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 13:20:30,150][15401] Updated weights for policy 0, policy_version 893604 (0.0045) [2024-06-25 13:20:33,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 14640922624. Throughput: 0: 42981.9. Samples: 14641106460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 13:20:34,511][15401] Updated weights for policy 0, policy_version 893614 (0.0025) [2024-06-25 13:20:37,678][15401] Updated weights for policy 0, policy_version 893624 (0.0035) [2024-06-25 13:20:38,393][15132] Fps is (10 sec: 44223.9, 60 sec: 42869.3, 300 sec: 42820.1). Total num frames: 14641168384. Throughput: 0: 43031.8. Samples: 14641232640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:38,393][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:20:42,163][15401] Updated weights for policy 0, policy_version 893634 (0.0033) [2024-06-25 13:20:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14641364992. Throughput: 0: 42912.0. Samples: 14641492300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:43,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 13:20:45,138][15401] Updated weights for policy 0, policy_version 893644 (0.0026) [2024-06-25 13:20:48,389][15132] Fps is (10 sec: 39333.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14641561600. Throughput: 0: 42904.0. Samples: 14641750320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:48,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 13:20:49,762][15401] Updated weights for policy 0, policy_version 893654 (0.0028) [2024-06-25 13:20:52,850][15401] Updated weights for policy 0, policy_version 893664 (0.0029) [2024-06-25 13:20:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14641823744. Throughput: 0: 42914.6. Samples: 14641873420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:53,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 13:20:57,364][15401] Updated weights for policy 0, policy_version 893674 (0.0030) [2024-06-25 13:20:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42766.0). Total num frames: 14641987584. Throughput: 0: 42902.4. Samples: 14642134920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 13:20:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 13:21:00,418][15401] Updated weights for policy 0, policy_version 893684 (0.0045) [2024-06-25 13:21:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14642200576. Throughput: 0: 42886.2. Samples: 14642389260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 13:21:04,934][15401] Updated weights for policy 0, policy_version 893694 (0.0031) [2024-06-25 13:21:08,097][15401] Updated weights for policy 0, policy_version 893704 (0.0026) [2024-06-25 13:21:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14642462720. Throughput: 0: 42988.8. Samples: 14642516140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:08,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 13:21:12,435][15401] Updated weights for policy 0, policy_version 893714 (0.0031) [2024-06-25 13:21:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14642626560. Throughput: 0: 42866.3. Samples: 14642777740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 13:21:15,623][15401] Updated weights for policy 0, policy_version 893724 (0.0038) [2024-06-25 13:21:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14642839552. Throughput: 0: 42884.0. Samples: 14643036240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:18,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 13:21:20,161][15401] Updated weights for policy 0, policy_version 893734 (0.0036) [2024-06-25 13:21:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14643085312. Throughput: 0: 42889.1. Samples: 14643162520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 13:21:23,397][15401] Updated weights for policy 0, policy_version 893744 (0.0040) [2024-06-25 13:21:27,794][15401] Updated weights for policy 0, policy_version 893754 (0.0023) [2024-06-25 13:21:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14643281920. Throughput: 0: 42850.2. Samples: 14643420560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:21:31,123][15401] Updated weights for policy 0, policy_version 893764 (0.0035) [2024-06-25 13:21:31,937][15349] Signal inference workers to stop experience collection... (216750 times) [2024-06-25 13:21:31,987][15401] InferenceWorker_p0-w0: stopping experience collection (216750 times) [2024-06-25 13:21:31,995][15349] Signal inference workers to resume experience collection... (216750 times) [2024-06-25 13:21:32,011][15401] InferenceWorker_p0-w0: resuming experience collection (216750 times) [2024-06-25 13:21:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 14643494912. Throughput: 0: 42746.1. Samples: 14643674000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:33,393][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 13:21:35,389][15401] Updated weights for policy 0, policy_version 893774 (0.0041) [2024-06-25 13:21:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42600.6, 300 sec: 42765.0). Total num frames: 14643724288. Throughput: 0: 42895.7. Samples: 14643803720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:38,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-25 13:21:38,755][15401] Updated weights for policy 0, policy_version 893784 (0.0028) [2024-06-25 13:21:43,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14643904512. Throughput: 0: 42793.2. Samples: 14644060620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:43,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 13:21:43,554][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893794_14643920896.pth... [2024-06-25 13:21:43,563][15401] Updated weights for policy 0, policy_version 893794 (0.0034) [2024-06-25 13:21:43,619][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893168_14633664512.pth [2024-06-25 13:21:46,758][15401] Updated weights for policy 0, policy_version 893804 (0.0044) [2024-06-25 13:21:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14644150272. Throughput: 0: 42625.8. Samples: 14644307420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 13:21:51,306][15401] Updated weights for policy 0, policy_version 893814 (0.0042) [2024-06-25 13:21:53,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 14644363264. Throughput: 0: 42851.2. Samples: 14644444440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:53,390][15132] Avg episode reward: [(0, '0.262')] [2024-06-25 13:21:54,280][15401] Updated weights for policy 0, policy_version 893824 (0.0033) [2024-06-25 13:21:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 14644543488. Throughput: 0: 42663.4. Samples: 14644697600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:21:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 13:21:58,937][15401] Updated weights for policy 0, policy_version 893834 (0.0033) [2024-06-25 13:22:01,731][15401] Updated weights for policy 0, policy_version 893844 (0.0044) [2024-06-25 13:22:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 14644805632. Throughput: 0: 42520.3. Samples: 14644949660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 13:22:06,434][15401] Updated weights for policy 0, policy_version 893854 (0.0024) [2024-06-25 13:22:08,392][15132] Fps is (10 sec: 45865.2, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 14645002240. Throughput: 0: 42739.6. Samples: 14645085900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:08,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 13:22:09,345][15401] Updated weights for policy 0, policy_version 893864 (0.0041) [2024-06-25 13:22:13,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14645198848. Throughput: 0: 42603.6. Samples: 14645337720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:13,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 13:22:14,038][15401] Updated weights for policy 0, policy_version 893874 (0.0038) [2024-06-25 13:22:17,330][15401] Updated weights for policy 0, policy_version 893884 (0.0024) [2024-06-25 13:22:18,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.4, 300 sec: 42820.9). Total num frames: 14645428224. Throughput: 0: 42672.0. Samples: 14645594140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 13:22:21,413][15401] Updated weights for policy 0, policy_version 893894 (0.0033) [2024-06-25 13:22:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14645641216. Throughput: 0: 42717.7. Samples: 14645726020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 13:22:24,899][15401] Updated weights for policy 0, policy_version 893904 (0.0030) [2024-06-25 13:22:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 14645854208. Throughput: 0: 42772.9. Samples: 14645985400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:28,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 13:22:29,318][15401] Updated weights for policy 0, policy_version 893914 (0.0032) [2024-06-25 13:22:32,449][15401] Updated weights for policy 0, policy_version 893924 (0.0038) [2024-06-25 13:22:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 14646083584. Throughput: 0: 42877.8. Samples: 14646236920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 13:22:36,733][15401] Updated weights for policy 0, policy_version 893934 (0.0038) [2024-06-25 13:22:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14646280192. Throughput: 0: 42813.1. Samples: 14646371040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 13:22:39,971][15401] Updated weights for policy 0, policy_version 893944 (0.0028) [2024-06-25 13:22:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14646493184. Throughput: 0: 42822.0. Samples: 14646624580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 13:22:44,360][15401] Updated weights for policy 0, policy_version 893954 (0.0036) [2024-06-25 13:22:47,979][15401] Updated weights for policy 0, policy_version 893964 (0.0036) [2024-06-25 13:22:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14646722560. Throughput: 0: 42921.5. Samples: 14646881120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:48,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 13:22:52,018][15401] Updated weights for policy 0, policy_version 893974 (0.0034) [2024-06-25 13:22:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14646919168. Throughput: 0: 42861.4. Samples: 14647014560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 13:22:53,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 13:22:55,503][15401] Updated weights for policy 0, policy_version 893984 (0.0039) [2024-06-25 13:22:57,285][15349] Signal inference workers to stop experience collection... (216800 times) [2024-06-25 13:22:57,292][15349] Signal inference workers to resume experience collection... (216800 times) [2024-06-25 13:22:57,299][15401] InferenceWorker_p0-w0: stopping experience collection (216800 times) [2024-06-25 13:22:57,325][15401] InferenceWorker_p0-w0: resuming experience collection (216800 times) [2024-06-25 13:22:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 14647132160. Throughput: 0: 42891.5. Samples: 14647267840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:22:58,390][15132] Avg episode reward: [(0, '0.146')] [2024-06-25 13:22:59,606][15401] Updated weights for policy 0, policy_version 893994 (0.0028) [2024-06-25 13:23:02,968][15401] Updated weights for policy 0, policy_version 894004 (0.0031) [2024-06-25 13:23:03,396][15132] Fps is (10 sec: 45845.2, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 14647377920. Throughput: 0: 42876.6. Samples: 14647523860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:03,397][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 13:23:07,122][15401] Updated weights for policy 0, policy_version 894014 (0.0034) [2024-06-25 13:23:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 14647558144. Throughput: 0: 42970.1. Samples: 14647659680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 13:23:10,815][15401] Updated weights for policy 0, policy_version 894024 (0.0035) [2024-06-25 13:23:13,391][15132] Fps is (10 sec: 40981.9, 60 sec: 43143.7, 300 sec: 42765.2). Total num frames: 14647787520. Throughput: 0: 42669.2. Samples: 14647905560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:13,391][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 13:23:14,992][15401] Updated weights for policy 0, policy_version 894034 (0.0040) [2024-06-25 13:23:18,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 14648000512. Throughput: 0: 42815.0. Samples: 14648163700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:18,392][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 13:23:18,577][15401] Updated weights for policy 0, policy_version 894044 (0.0026) [2024-06-25 13:23:22,856][15401] Updated weights for policy 0, policy_version 894054 (0.0038) [2024-06-25 13:23:23,390][15132] Fps is (10 sec: 42602.9, 60 sec: 42871.4, 300 sec: 42821.3). Total num frames: 14648213504. Throughput: 0: 42686.3. Samples: 14648291920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:23,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 13:23:26,146][15401] Updated weights for policy 0, policy_version 894064 (0.0035) [2024-06-25 13:23:28,396][15132] Fps is (10 sec: 42581.5, 60 sec: 42866.9, 300 sec: 42819.6). Total num frames: 14648426496. Throughput: 0: 42649.5. Samples: 14648544080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:28,396][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 13:23:30,485][15401] Updated weights for policy 0, policy_version 894074 (0.0043) [2024-06-25 13:23:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14648655872. Throughput: 0: 42886.5. Samples: 14648811020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:33,394][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:23:33,577][15401] Updated weights for policy 0, policy_version 894084 (0.0042) [2024-06-25 13:23:37,940][15401] Updated weights for policy 0, policy_version 894094 (0.0042) [2024-06-25 13:23:38,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14648852480. Throughput: 0: 42805.7. Samples: 14648940820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 13:23:41,065][15401] Updated weights for policy 0, policy_version 894104 (0.0039) [2024-06-25 13:23:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 14649065472. Throughput: 0: 42804.8. Samples: 14649194160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:43,392][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 13:23:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894108_14649065472.pth... [2024-06-25 13:23:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893482_14638809088.pth [2024-06-25 13:23:45,498][15401] Updated weights for policy 0, policy_version 894114 (0.0033) [2024-06-25 13:23:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14649294848. Throughput: 0: 42999.0. Samples: 14649458540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 13:23:49,039][15401] Updated weights for policy 0, policy_version 894124 (0.0028) [2024-06-25 13:23:53,054][15401] Updated weights for policy 0, policy_version 894134 (0.0042) [2024-06-25 13:23:53,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 14649507840. Throughput: 0: 42795.2. Samples: 14649585460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 13:23:56,681][15401] Updated weights for policy 0, policy_version 894144 (0.0029) [2024-06-25 13:23:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 14649720832. Throughput: 0: 42854.8. Samples: 14649833980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:23:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 13:24:00,756][15401] Updated weights for policy 0, policy_version 894154 (0.0033) [2024-06-25 13:24:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42602.8, 300 sec: 42820.5). Total num frames: 14649933824. Throughput: 0: 42987.4. Samples: 14650098040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:03,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 13:24:04,424][15401] Updated weights for policy 0, policy_version 894164 (0.0048) [2024-06-25 13:24:08,271][15401] Updated weights for policy 0, policy_version 894174 (0.0048) [2024-06-25 13:24:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 14650146816. Throughput: 0: 42920.6. Samples: 14650223340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 13:24:11,852][15401] Updated weights for policy 0, policy_version 894184 (0.0034) [2024-06-25 13:24:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43145.3, 300 sec: 42876.1). Total num frames: 14650376192. Throughput: 0: 43047.4. Samples: 14650480940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:13,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 13:24:15,711][15401] Updated weights for policy 0, policy_version 894194 (0.0029) [2024-06-25 13:24:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14650556416. Throughput: 0: 42981.9. Samples: 14650745200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 13:24:19,697][15401] Updated weights for policy 0, policy_version 894204 (0.0037) [2024-06-25 13:24:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14650785792. Throughput: 0: 42706.2. Samples: 14650862600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:23,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 13:24:23,475][15401] Updated weights for policy 0, policy_version 894214 (0.0024) [2024-06-25 13:24:27,457][15401] Updated weights for policy 0, policy_version 894224 (0.0037) [2024-06-25 13:24:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42876.1, 300 sec: 42820.9). Total num frames: 14650998784. Throughput: 0: 42779.2. Samples: 14651119120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:28,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 13:24:28,630][15349] Signal inference workers to stop experience collection... (216850 times) [2024-06-25 13:24:28,631][15349] Signal inference workers to resume experience collection... (216850 times) [2024-06-25 13:24:28,673][15401] InferenceWorker_p0-w0: stopping experience collection (216850 times) [2024-06-25 13:24:28,673][15401] InferenceWorker_p0-w0: resuming experience collection (216850 times) [2024-06-25 13:24:31,009][15401] Updated weights for policy 0, policy_version 894234 (0.0043) [2024-06-25 13:24:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14651195392. Throughput: 0: 42755.6. Samples: 14651382540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 13:24:35,084][15401] Updated weights for policy 0, policy_version 894244 (0.0032) [2024-06-25 13:24:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14651441152. Throughput: 0: 42640.5. Samples: 14651504280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 13:24:38,687][15401] Updated weights for policy 0, policy_version 894254 (0.0025) [2024-06-25 13:24:42,671][15401] Updated weights for policy 0, policy_version 894264 (0.0031) [2024-06-25 13:24:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14651637760. Throughput: 0: 43067.5. Samples: 14651772020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:24:43,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 13:24:46,051][15401] Updated weights for policy 0, policy_version 894274 (0.0041) [2024-06-25 13:24:48,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14651850752. Throughput: 0: 42717.9. Samples: 14652020340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:24:48,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 13:24:50,473][15401] Updated weights for policy 0, policy_version 894284 (0.0032) [2024-06-25 13:24:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14652080128. Throughput: 0: 42683.0. Samples: 14652144080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:24:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 13:24:53,934][15401] Updated weights for policy 0, policy_version 894294 (0.0033) [2024-06-25 13:24:58,317][15401] Updated weights for policy 0, policy_version 894304 (0.0032) [2024-06-25 13:24:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14652276736. Throughput: 0: 42787.2. Samples: 14652406360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:24:58,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 13:25:01,395][15401] Updated weights for policy 0, policy_version 894314 (0.0036) [2024-06-25 13:25:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14652473344. Throughput: 0: 42630.6. Samples: 14652663580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 13:25:05,835][15401] Updated weights for policy 0, policy_version 894324 (0.0036) [2024-06-25 13:25:08,396][15132] Fps is (10 sec: 45845.6, 60 sec: 43139.8, 300 sec: 42875.5). Total num frames: 14652735488. Throughput: 0: 42719.2. Samples: 14652785240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:08,397][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 13:25:08,923][15401] Updated weights for policy 0, policy_version 894334 (0.0044) [2024-06-25 13:25:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 14652915712. Throughput: 0: 42766.5. Samples: 14653043620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:13,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 13:25:13,719][15401] Updated weights for policy 0, policy_version 894344 (0.0028) [2024-06-25 13:25:16,769][15401] Updated weights for policy 0, policy_version 894354 (0.0038) [2024-06-25 13:25:18,389][15132] Fps is (10 sec: 37707.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14653112320. Throughput: 0: 42662.8. Samples: 14653302360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:18,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 13:25:21,413][15401] Updated weights for policy 0, policy_version 894364 (0.0031) [2024-06-25 13:25:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14653374464. Throughput: 0: 42699.6. Samples: 14653425760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 13:25:24,235][15401] Updated weights for policy 0, policy_version 894374 (0.0033) [2024-06-25 13:25:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14653538304. Throughput: 0: 42548.9. Samples: 14653686720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 13:25:29,160][15401] Updated weights for policy 0, policy_version 894384 (0.0027) [2024-06-25 13:25:31,932][15401] Updated weights for policy 0, policy_version 894394 (0.0043) [2024-06-25 13:25:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.9). Total num frames: 14653767680. Throughput: 0: 42668.5. Samples: 14653940420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:33,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 13:25:36,883][15401] Updated weights for policy 0, policy_version 894404 (0.0030) [2024-06-25 13:25:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14653997056. Throughput: 0: 42894.3. Samples: 14654074320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 13:25:39,809][15401] Updated weights for policy 0, policy_version 894414 (0.0032) [2024-06-25 13:25:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14654193664. Throughput: 0: 42656.4. Samples: 14654325900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 13:25:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894421_14654193664.pth... [2024-06-25 13:25:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000893794_14643920896.pth [2024-06-25 13:25:43,909][15349] Signal inference workers to stop experience collection... (216900 times) [2024-06-25 13:25:43,961][15401] InferenceWorker_p0-w0: stopping experience collection (216900 times) [2024-06-25 13:25:43,964][15349] Signal inference workers to resume experience collection... (216900 times) [2024-06-25 13:25:43,973][15401] InferenceWorker_p0-w0: resuming experience collection (216900 times) [2024-06-25 13:25:44,453][15401] Updated weights for policy 0, policy_version 894424 (0.0033) [2024-06-25 13:25:47,341][15401] Updated weights for policy 0, policy_version 894434 (0.0038) [2024-06-25 13:25:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14654423040. Throughput: 0: 42521.7. Samples: 14654577060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 13:25:51,912][15401] Updated weights for policy 0, policy_version 894444 (0.0045) [2024-06-25 13:25:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14654652416. Throughput: 0: 42880.3. Samples: 14654714580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:53,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 13:25:54,974][15401] Updated weights for policy 0, policy_version 894454 (0.0040) [2024-06-25 13:25:58,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14654816256. Throughput: 0: 42774.8. Samples: 14654968480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:25:58,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 13:26:00,070][15401] Updated weights for policy 0, policy_version 894464 (0.0038) [2024-06-25 13:26:02,866][15401] Updated weights for policy 0, policy_version 894474 (0.0029) [2024-06-25 13:26:03,392][15132] Fps is (10 sec: 42588.3, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 14655078400. Throughput: 0: 42526.1. Samples: 14655216140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:03,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 13:26:07,631][15401] Updated weights for policy 0, policy_version 894484 (0.0029) [2024-06-25 13:26:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42602.9, 300 sec: 42931.6). Total num frames: 14655291392. Throughput: 0: 42939.5. Samples: 14655358040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:08,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 13:26:10,282][15401] Updated weights for policy 0, policy_version 894494 (0.0034) [2024-06-25 13:26:13,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14655488000. Throughput: 0: 42909.7. Samples: 14655617660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 13:26:15,039][15401] Updated weights for policy 0, policy_version 894504 (0.0036) [2024-06-25 13:26:18,237][15401] Updated weights for policy 0, policy_version 894514 (0.0023) [2024-06-25 13:26:18,394][15132] Fps is (10 sec: 44219.2, 60 sec: 43687.7, 300 sec: 42875.5). Total num frames: 14655733760. Throughput: 0: 42805.6. Samples: 14655866840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:18,394][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 13:26:22,609][15401] Updated weights for policy 0, policy_version 894524 (0.0033) [2024-06-25 13:26:23,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 14655913984. Throughput: 0: 42782.1. Samples: 14655999620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:23,392][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 13:26:25,923][15401] Updated weights for policy 0, policy_version 894534 (0.0037) [2024-06-25 13:26:28,389][15132] Fps is (10 sec: 39337.5, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 14656126976. Throughput: 0: 42903.6. Samples: 14656256560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:28,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 13:26:30,216][15401] Updated weights for policy 0, policy_version 894544 (0.0049) [2024-06-25 13:26:33,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14656356352. Throughput: 0: 42917.9. Samples: 14656508360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 13:26:33,554][15401] Updated weights for policy 0, policy_version 894554 (0.0039) [2024-06-25 13:26:38,067][15401] Updated weights for policy 0, policy_version 894564 (0.0038) [2024-06-25 13:26:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14656552960. Throughput: 0: 42865.4. Samples: 14656643520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-25 13:26:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 13:26:41,084][15401] Updated weights for policy 0, policy_version 894574 (0.0045) [2024-06-25 13:26:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14656749568. Throughput: 0: 42717.9. Samples: 14656890780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:26:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 13:26:45,702][15401] Updated weights for policy 0, policy_version 894584 (0.0032) [2024-06-25 13:26:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14657011712. Throughput: 0: 42885.3. Samples: 14657145880. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:26:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 13:26:49,178][15401] Updated weights for policy 0, policy_version 894594 (0.0049) [2024-06-25 13:26:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 14657175552. Throughput: 0: 42723.6. Samples: 14657280600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:26:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 13:26:53,443][15401] Updated weights for policy 0, policy_version 894604 (0.0043) [2024-06-25 13:26:56,521][15401] Updated weights for policy 0, policy_version 894614 (0.0033) [2024-06-25 13:26:58,389][15132] Fps is (10 sec: 39322.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14657404928. Throughput: 0: 42466.9. Samples: 14657528660. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:26:58,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 13:27:01,007][15349] Signal inference workers to stop experience collection... (216950 times) [2024-06-25 13:27:01,007][15349] Signal inference workers to resume experience collection... (216950 times) [2024-06-25 13:27:01,032][15401] InferenceWorker_p0-w0: stopping experience collection (216950 times) [2024-06-25 13:27:01,032][15401] InferenceWorker_p0-w0: resuming experience collection (216950 times) [2024-06-25 13:27:01,184][15401] Updated weights for policy 0, policy_version 894624 (0.0041) [2024-06-25 13:27:03,392][15132] Fps is (10 sec: 49140.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 14657667072. Throughput: 0: 42608.6. Samples: 14657784160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:03,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 13:27:04,228][15401] Updated weights for policy 0, policy_version 894634 (0.0039) [2024-06-25 13:27:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 14657798144. Throughput: 0: 42693.4. Samples: 14657920720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 13:27:08,824][15401] Updated weights for policy 0, policy_version 894644 (0.0048) [2024-06-25 13:27:11,655][15401] Updated weights for policy 0, policy_version 894654 (0.0028) [2024-06-25 13:27:13,390][15132] Fps is (10 sec: 37691.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14658043904. Throughput: 0: 42434.1. Samples: 14658166100. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 13:27:16,358][15401] Updated weights for policy 0, policy_version 894664 (0.0035) [2024-06-25 13:27:18,389][15132] Fps is (10 sec: 49151.9, 60 sec: 42601.2, 300 sec: 42876.1). Total num frames: 14658289664. Throughput: 0: 42520.9. Samples: 14658421800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:18,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 13:27:19,445][15401] Updated weights for policy 0, policy_version 894674 (0.0039) [2024-06-25 13:27:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 14658453504. Throughput: 0: 42490.1. Samples: 14658555580. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:23,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 13:27:23,971][15401] Updated weights for policy 0, policy_version 894684 (0.0044) [2024-06-25 13:27:27,238][15401] Updated weights for policy 0, policy_version 894694 (0.0024) [2024-06-25 13:27:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14658699264. Throughput: 0: 42589.2. Samples: 14658807300. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:28,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 13:27:31,935][15401] Updated weights for policy 0, policy_version 894704 (0.0033) [2024-06-25 13:27:33,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14658912256. Throughput: 0: 42669.0. Samples: 14659065980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:33,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 13:27:34,859][15401] Updated weights for policy 0, policy_version 894714 (0.0033) [2024-06-25 13:27:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14659092480. Throughput: 0: 42434.1. Samples: 14659190140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:38,390][15132] Avg episode reward: [(0, '0.077')] [2024-06-25 13:27:39,692][15401] Updated weights for policy 0, policy_version 894724 (0.0031) [2024-06-25 13:27:42,559][15401] Updated weights for policy 0, policy_version 894734 (0.0028) [2024-06-25 13:27:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14659338240. Throughput: 0: 42489.1. Samples: 14659440680. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 13:27:43,530][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894736_14659354624.pth... [2024-06-25 13:27:43,581][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894108_14649065472.pth [2024-06-25 13:27:47,335][15401] Updated weights for policy 0, policy_version 894744 (0.0036) [2024-06-25 13:27:48,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14659551232. Throughput: 0: 42660.8. Samples: 14659703800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 13:27:50,129][15401] Updated weights for policy 0, policy_version 894754 (0.0037) [2024-06-25 13:27:53,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 14659731456. Throughput: 0: 42435.0. Samples: 14659830400. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:53,393][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 13:27:54,852][15401] Updated weights for policy 0, policy_version 894764 (0.0029) [2024-06-25 13:27:57,757][15401] Updated weights for policy 0, policy_version 894774 (0.0037) [2024-06-25 13:27:58,392][15132] Fps is (10 sec: 44226.5, 60 sec: 43142.7, 300 sec: 42765.6). Total num frames: 14659993600. Throughput: 0: 42604.9. Samples: 14660083420. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:27:58,393][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 13:28:02,377][15401] Updated weights for policy 0, policy_version 894784 (0.0033) [2024-06-25 13:28:03,389][15132] Fps is (10 sec: 45886.7, 60 sec: 42054.0, 300 sec: 42820.6). Total num frames: 14660190208. Throughput: 0: 42780.5. Samples: 14660346920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 13:28:05,415][15401] Updated weights for policy 0, policy_version 894794 (0.0033) [2024-06-25 13:28:08,390][15132] Fps is (10 sec: 36053.6, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 14660354048. Throughput: 0: 42523.6. Samples: 14660469140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:08,393][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 13:28:09,807][15349] Signal inference workers to stop experience collection... (217000 times) [2024-06-25 13:28:09,856][15401] InferenceWorker_p0-w0: stopping experience collection (217000 times) [2024-06-25 13:28:09,865][15349] Signal inference workers to resume experience collection... (217000 times) [2024-06-25 13:28:09,865][15401] InferenceWorker_p0-w0: resuming experience collection (217000 times) [2024-06-25 13:28:10,188][15401] Updated weights for policy 0, policy_version 894804 (0.0042) [2024-06-25 13:28:12,953][15401] Updated weights for policy 0, policy_version 894814 (0.0037) [2024-06-25 13:28:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 14660632576. Throughput: 0: 42485.4. Samples: 14660719140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 13:28:18,059][15401] Updated weights for policy 0, policy_version 894824 (0.0034) [2024-06-25 13:28:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 14660812800. Throughput: 0: 42646.2. Samples: 14660985060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 13:28:20,591][15401] Updated weights for policy 0, policy_version 894834 (0.0034) [2024-06-25 13:28:23,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 14661009408. Throughput: 0: 42505.9. Samples: 14661102900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 13:28:25,661][15401] Updated weights for policy 0, policy_version 894844 (0.0029) [2024-06-25 13:28:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14661271552. Throughput: 0: 42668.6. Samples: 14661360760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:28,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 13:28:28,687][15401] Updated weights for policy 0, policy_version 894854 (0.0037) [2024-06-25 13:28:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 14661419008. Throughput: 0: 42811.7. Samples: 14661630320. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-25 13:28:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 13:28:33,598][15401] Updated weights for policy 0, policy_version 894864 (0.0047) [2024-06-25 13:28:36,433][15401] Updated weights for policy 0, policy_version 894874 (0.0024) [2024-06-25 13:28:38,392][15132] Fps is (10 sec: 37673.9, 60 sec: 42596.8, 300 sec: 42653.9). Total num frames: 14661648384. Throughput: 0: 42461.8. Samples: 14661741180. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:28:38,393][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 13:28:41,114][15401] Updated weights for policy 0, policy_version 894884 (0.0040) [2024-06-25 13:28:43,389][15132] Fps is (10 sec: 49152.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14661910528. Throughput: 0: 42604.5. Samples: 14662000520. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:28:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 13:28:44,030][15401] Updated weights for policy 0, policy_version 894894 (0.0037) [2024-06-25 13:28:48,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 14662074368. Throughput: 0: 42699.1. Samples: 14662268380. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:28:48,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 13:28:48,873][15401] Updated weights for policy 0, policy_version 894904 (0.0039) [2024-06-25 13:28:51,704][15401] Updated weights for policy 0, policy_version 894914 (0.0039) [2024-06-25 13:28:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.3, 300 sec: 42653.9). Total num frames: 14662303744. Throughput: 0: 42471.2. Samples: 14662380340. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:28:53,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 13:28:56,515][15401] Updated weights for policy 0, policy_version 894924 (0.0043) [2024-06-25 13:28:58,390][15132] Fps is (10 sec: 49148.5, 60 sec: 42872.7, 300 sec: 42820.5). Total num frames: 14662565888. Throughput: 0: 42714.9. Samples: 14662641340. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:28:58,391][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 13:28:59,309][15401] Updated weights for policy 0, policy_version 894934 (0.0020) [2024-06-25 13:29:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14662713344. Throughput: 0: 42756.9. Samples: 14662909120. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 13:29:04,146][15401] Updated weights for policy 0, policy_version 894944 (0.0033) [2024-06-25 13:29:06,917][15401] Updated weights for policy 0, policy_version 894954 (0.0028) [2024-06-25 13:29:08,390][15132] Fps is (10 sec: 39324.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14662959104. Throughput: 0: 42675.6. Samples: 14663023300. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 13:29:11,701][15349] Signal inference workers to stop experience collection... (217050 times) [2024-06-25 13:29:11,759][15401] InferenceWorker_p0-w0: stopping experience collection (217050 times) [2024-06-25 13:29:11,816][15349] Signal inference workers to resume experience collection... (217050 times) [2024-06-25 13:29:11,816][15401] InferenceWorker_p0-w0: resuming experience collection (217050 times) [2024-06-25 13:29:11,818][15401] Updated weights for policy 0, policy_version 894964 (0.0031) [2024-06-25 13:29:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14663172096. Throughput: 0: 42661.3. Samples: 14663280520. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:13,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 13:29:14,986][15401] Updated weights for policy 0, policy_version 894974 (0.0031) [2024-06-25 13:29:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14663352320. Throughput: 0: 42297.3. Samples: 14663533700. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:18,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 13:29:19,658][15401] Updated weights for policy 0, policy_version 894984 (0.0030) [2024-06-25 13:29:22,858][15401] Updated weights for policy 0, policy_version 894994 (0.0028) [2024-06-25 13:29:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14663581696. Throughput: 0: 42553.3. Samples: 14663655980. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:23,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 13:29:27,205][15401] Updated weights for policy 0, policy_version 895004 (0.0040) [2024-06-25 13:29:28,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14663827456. Throughput: 0: 42825.7. Samples: 14663927680. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:28,404][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 13:29:31,159][15401] Updated weights for policy 0, policy_version 895014 (0.0031) [2024-06-25 13:29:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 14664007680. Throughput: 0: 42643.0. Samples: 14664187320. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 13:29:34,643][15401] Updated weights for policy 0, policy_version 895024 (0.0034) [2024-06-25 13:29:38,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42868.6, 300 sec: 42653.0). Total num frames: 14664220672. Throughput: 0: 42839.6. Samples: 14664308400. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:38,397][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 13:29:38,771][15401] Updated weights for policy 0, policy_version 895034 (0.0041) [2024-06-25 13:29:42,256][15401] Updated weights for policy 0, policy_version 895044 (0.0032) [2024-06-25 13:29:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14664466432. Throughput: 0: 42943.8. Samples: 14664573780. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 13:29:43,495][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895049_14664482816.pth... [2024-06-25 13:29:43,545][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894421_14654193664.pth [2024-06-25 13:29:46,168][15401] Updated weights for policy 0, policy_version 895054 (0.0032) [2024-06-25 13:29:48,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14664646656. Throughput: 0: 42880.1. Samples: 14664838720. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 13:29:49,657][15401] Updated weights for policy 0, policy_version 895064 (0.0043) [2024-06-25 13:29:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14664876032. Throughput: 0: 42948.4. Samples: 14664955980. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 13:29:53,564][15401] Updated weights for policy 0, policy_version 895074 (0.0032) [2024-06-25 13:29:57,256][15401] Updated weights for policy 0, policy_version 895084 (0.0026) [2024-06-25 13:29:58,390][15132] Fps is (10 sec: 49151.4, 60 sec: 42871.9, 300 sec: 42931.6). Total num frames: 14665138176. Throughput: 0: 43203.0. Samples: 14665224660. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:29:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 13:30:01,087][15401] Updated weights for policy 0, policy_version 895094 (0.0034) [2024-06-25 13:30:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42488.2). Total num frames: 14665269248. Throughput: 0: 43271.1. Samples: 14665480900. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:30:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 13:30:04,922][15401] Updated weights for policy 0, policy_version 895104 (0.0033) [2024-06-25 13:30:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14665531392. Throughput: 0: 43028.5. Samples: 14665592260. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:30:08,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 13:30:09,132][15401] Updated weights for policy 0, policy_version 895114 (0.0041) [2024-06-25 13:30:12,828][15401] Updated weights for policy 0, policy_version 895124 (0.0043) [2024-06-25 13:30:13,389][15132] Fps is (10 sec: 49152.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14665760768. Throughput: 0: 42927.3. Samples: 14665859400. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:30:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 13:30:16,643][15401] Updated weights for policy 0, policy_version 895134 (0.0027) [2024-06-25 13:30:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 14665924608. Throughput: 0: 42906.6. Samples: 14666118120. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:30:18,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 13:30:20,514][15401] Updated weights for policy 0, policy_version 895144 (0.0025) [2024-06-25 13:30:23,393][15132] Fps is (10 sec: 42581.3, 60 sec: 43414.8, 300 sec: 42875.5). Total num frames: 14666186752. Throughput: 0: 42866.9. Samples: 14666237300. Policy #0 lag: (min: 3.0, avg: 10.5, max: 22.0) [2024-06-25 13:30:23,394][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 13:30:24,131][15401] Updated weights for policy 0, policy_version 895154 (0.0036) [2024-06-25 13:30:27,999][15401] Updated weights for policy 0, policy_version 895164 (0.0032) [2024-06-25 13:30:28,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14666383360. Throughput: 0: 42839.9. Samples: 14666501580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 13:30:31,663][15401] Updated weights for policy 0, policy_version 895174 (0.0032) [2024-06-25 13:30:33,389][15132] Fps is (10 sec: 37698.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14666563584. Throughput: 0: 42733.8. Samples: 14666761740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:33,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 13:30:34,224][15349] Signal inference workers to stop experience collection... (217100 times) [2024-06-25 13:30:34,224][15349] Signal inference workers to resume experience collection... (217100 times) [2024-06-25 13:30:34,282][15401] InferenceWorker_p0-w0: stopping experience collection (217100 times) [2024-06-25 13:30:34,282][15401] InferenceWorker_p0-w0: resuming experience collection (217100 times) [2024-06-25 13:30:35,587][15401] Updated weights for policy 0, policy_version 895184 (0.0029) [2024-06-25 13:30:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43422.3, 300 sec: 42820.6). Total num frames: 14666825728. Throughput: 0: 42832.1. Samples: 14666883420. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 13:30:39,694][15401] Updated weights for policy 0, policy_version 895194 (0.0038) [2024-06-25 13:30:43,078][15401] Updated weights for policy 0, policy_version 895204 (0.0033) [2024-06-25 13:30:43,395][15132] Fps is (10 sec: 47488.9, 60 sec: 42867.8, 300 sec: 42764.3). Total num frames: 14667038720. Throughput: 0: 42535.6. Samples: 14667138980. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:43,395][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 13:30:47,292][15401] Updated weights for policy 0, policy_version 895214 (0.0037) [2024-06-25 13:30:48,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14667218944. Throughput: 0: 42508.8. Samples: 14667393800. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 13:30:50,622][15401] Updated weights for policy 0, policy_version 895224 (0.0027) [2024-06-25 13:30:53,389][15132] Fps is (10 sec: 40981.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14667448320. Throughput: 0: 42859.6. Samples: 14667520940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 13:30:54,900][15401] Updated weights for policy 0, policy_version 895234 (0.0044) [2024-06-25 13:30:58,315][15401] Updated weights for policy 0, policy_version 895244 (0.0031) [2024-06-25 13:30:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 14667677696. Throughput: 0: 42791.8. Samples: 14667785040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:30:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 13:31:02,615][15401] Updated weights for policy 0, policy_version 895254 (0.0036) [2024-06-25 13:31:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14667857920. Throughput: 0: 42654.9. Samples: 14668037580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 13:31:06,513][15401] Updated weights for policy 0, policy_version 895264 (0.0029) [2024-06-25 13:31:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14668103680. Throughput: 0: 42869.5. Samples: 14668166260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 13:31:10,359][15401] Updated weights for policy 0, policy_version 895274 (0.0033) [2024-06-25 13:31:13,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42599.0). Total num frames: 14668300288. Throughput: 0: 42725.3. Samples: 14668424220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 13:31:14,073][15401] Updated weights for policy 0, policy_version 895284 (0.0041) [2024-06-25 13:31:18,016][15401] Updated weights for policy 0, policy_version 895294 (0.0038) [2024-06-25 13:31:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 43142.9, 300 sec: 42709.5). Total num frames: 14668513280. Throughput: 0: 42622.1. Samples: 14668679840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:18,393][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 13:31:21,608][15401] Updated weights for policy 0, policy_version 895304 (0.0038) [2024-06-25 13:31:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42601.1, 300 sec: 42765.0). Total num frames: 14668742656. Throughput: 0: 42809.2. Samples: 14668809840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 13:31:25,604][15401] Updated weights for policy 0, policy_version 895314 (0.0042) [2024-06-25 13:31:28,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14668939264. Throughput: 0: 42861.0. Samples: 14669067500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:28,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 13:31:29,203][15401] Updated weights for policy 0, policy_version 895324 (0.0028) [2024-06-25 13:31:33,262][15401] Updated weights for policy 0, policy_version 895334 (0.0031) [2024-06-25 13:31:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14669152256. Throughput: 0: 42744.0. Samples: 14669317280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 13:31:37,003][15401] Updated weights for policy 0, policy_version 895344 (0.0041) [2024-06-25 13:31:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14669365248. Throughput: 0: 42698.6. Samples: 14669442380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 13:31:38,557][15349] Signal inference workers to stop experience collection... (217150 times) [2024-06-25 13:31:38,604][15401] InferenceWorker_p0-w0: stopping experience collection (217150 times) [2024-06-25 13:31:38,608][15349] Signal inference workers to resume experience collection... (217150 times) [2024-06-25 13:31:38,615][15401] InferenceWorker_p0-w0: resuming experience collection (217150 times) [2024-06-25 13:31:41,062][15401] Updated weights for policy 0, policy_version 895354 (0.0024) [2024-06-25 13:31:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42328.9, 300 sec: 42598.4). Total num frames: 14669578240. Throughput: 0: 42784.1. Samples: 14669710320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 13:31:43,555][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895362_14669611008.pth... [2024-06-25 13:31:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000894736_14659354624.pth [2024-06-25 13:31:44,427][15401] Updated weights for policy 0, policy_version 895364 (0.0029) [2024-06-25 13:31:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14669774848. Throughput: 0: 42786.2. Samples: 14669962960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 13:31:48,722][15401] Updated weights for policy 0, policy_version 895374 (0.0030) [2024-06-25 13:31:52,138][15401] Updated weights for policy 0, policy_version 895384 (0.0041) [2024-06-25 13:31:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14670020608. Throughput: 0: 42654.5. Samples: 14670085720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 13:31:56,277][15401] Updated weights for policy 0, policy_version 895394 (0.0040) [2024-06-25 13:31:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42543.2). Total num frames: 14670217216. Throughput: 0: 42741.1. Samples: 14670347560. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:31:58,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 13:31:59,807][15401] Updated weights for policy 0, policy_version 895404 (0.0040) [2024-06-25 13:32:03,391][15132] Fps is (10 sec: 39314.5, 60 sec: 42597.0, 300 sec: 42764.7). Total num frames: 14670413824. Throughput: 0: 42728.9. Samples: 14670602620. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:32:03,392][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 13:32:03,976][15401] Updated weights for policy 0, policy_version 895414 (0.0034) [2024-06-25 13:32:07,800][15401] Updated weights for policy 0, policy_version 895424 (0.0037) [2024-06-25 13:32:08,389][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14670675968. Throughput: 0: 42588.5. Samples: 14670726320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:32:08,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 13:32:11,412][15401] Updated weights for policy 0, policy_version 895434 (0.0042) [2024-06-25 13:32:13,389][15132] Fps is (10 sec: 44245.7, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 14670856192. Throughput: 0: 42630.7. Samples: 14670985880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:32:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 13:32:15,353][15401] Updated weights for policy 0, policy_version 895444 (0.0042) [2024-06-25 13:32:18,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 14671052800. Throughput: 0: 42783.6. Samples: 14671242540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-25 13:32:18,390][15132] Avg episode reward: [(0, '0.313')] [2024-06-25 13:32:19,146][15401] Updated weights for policy 0, policy_version 895454 (0.0021) [2024-06-25 13:32:22,875][15401] Updated weights for policy 0, policy_version 895464 (0.0024) [2024-06-25 13:32:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14671298560. Throughput: 0: 42887.6. Samples: 14671372320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:23,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 13:32:26,929][15401] Updated weights for policy 0, policy_version 895474 (0.0037) [2024-06-25 13:32:28,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14671511552. Throughput: 0: 42564.9. Samples: 14671625840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:28,392][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 13:32:30,688][15401] Updated weights for policy 0, policy_version 895484 (0.0032) [2024-06-25 13:32:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14671708160. Throughput: 0: 42423.1. Samples: 14671872000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:33,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 13:32:34,726][15401] Updated weights for policy 0, policy_version 895494 (0.0032) [2024-06-25 13:32:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14671921152. Throughput: 0: 42620.5. Samples: 14672003640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:38,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 13:32:38,459][15401] Updated weights for policy 0, policy_version 895504 (0.0033) [2024-06-25 13:32:42,424][15401] Updated weights for policy 0, policy_version 895514 (0.0036) [2024-06-25 13:32:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14672117760. Throughput: 0: 42432.4. Samples: 14672257020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 13:32:46,175][15401] Updated weights for policy 0, policy_version 895524 (0.0033) [2024-06-25 13:32:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 14672363520. Throughput: 0: 42233.4. Samples: 14672503040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:48,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 13:32:49,422][15349] Signal inference workers to stop experience collection... (217200 times) [2024-06-25 13:32:49,454][15401] InferenceWorker_p0-w0: stopping experience collection (217200 times) [2024-06-25 13:32:49,479][15349] Signal inference workers to resume experience collection... (217200 times) [2024-06-25 13:32:49,479][15401] InferenceWorker_p0-w0: resuming experience collection (217200 times) [2024-06-25 13:32:50,394][15401] Updated weights for policy 0, policy_version 895534 (0.0031) [2024-06-25 13:32:53,396][15132] Fps is (10 sec: 44208.2, 60 sec: 42320.9, 300 sec: 42597.8). Total num frames: 14672560128. Throughput: 0: 42488.2. Samples: 14672638560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:53,397][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 13:32:53,997][15401] Updated weights for policy 0, policy_version 895544 (0.0033) [2024-06-25 13:32:58,202][15401] Updated weights for policy 0, policy_version 895554 (0.0030) [2024-06-25 13:32:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14672756736. Throughput: 0: 42312.4. Samples: 14672889940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:32:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 13:33:01,592][15401] Updated weights for policy 0, policy_version 895564 (0.0043) [2024-06-25 13:33:03,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 14672986112. Throughput: 0: 42065.3. Samples: 14673135480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 13:33:06,633][15401] Updated weights for policy 0, policy_version 895574 (0.0025) [2024-06-25 13:33:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 14673182720. Throughput: 0: 42205.3. Samples: 14673271560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:33:09,322][15401] Updated weights for policy 0, policy_version 895584 (0.0023) [2024-06-25 13:33:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 14673395712. Throughput: 0: 42186.3. Samples: 14673524120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:13,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 13:33:14,060][15401] Updated weights for policy 0, policy_version 895594 (0.0026) [2024-06-25 13:33:16,913][15401] Updated weights for policy 0, policy_version 895604 (0.0036) [2024-06-25 13:33:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14673625088. Throughput: 0: 42308.8. Samples: 14673775900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 13:33:21,522][15401] Updated weights for policy 0, policy_version 895614 (0.0028) [2024-06-25 13:33:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 14673805312. Throughput: 0: 42233.9. Samples: 14673904160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:23,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-25 13:33:24,874][15401] Updated weights for policy 0, policy_version 895624 (0.0045) [2024-06-25 13:33:28,392][15132] Fps is (10 sec: 39312.3, 60 sec: 41779.2, 300 sec: 42709.1). Total num frames: 14674018304. Throughput: 0: 42092.0. Samples: 14674151260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:28,393][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 13:33:29,723][15401] Updated weights for policy 0, policy_version 895634 (0.0026) [2024-06-25 13:33:32,744][15401] Updated weights for policy 0, policy_version 895644 (0.0041) [2024-06-25 13:33:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 14674264064. Throughput: 0: 42156.0. Samples: 14674400060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 13:33:37,539][15401] Updated weights for policy 0, policy_version 895654 (0.0024) [2024-06-25 13:33:38,389][15132] Fps is (10 sec: 40969.8, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 14674427904. Throughput: 0: 42132.7. Samples: 14674534260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:38,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 13:33:40,422][15401] Updated weights for policy 0, policy_version 895664 (0.0040) [2024-06-25 13:33:43,390][15132] Fps is (10 sec: 39320.3, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 14674657280. Throughput: 0: 42082.9. Samples: 14674783680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:33:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895670_14674657280.pth... [2024-06-25 13:33:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895049_14664482816.pth [2024-06-25 13:33:45,095][15401] Updated weights for policy 0, policy_version 895674 (0.0040) [2024-06-25 13:33:48,310][15401] Updated weights for policy 0, policy_version 895684 (0.0032) [2024-06-25 13:33:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14674886656. Throughput: 0: 42354.8. Samples: 14675041440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:48,390][15132] Avg episode reward: [(0, '0.093')] [2024-06-25 13:33:52,657][15401] Updated weights for policy 0, policy_version 895694 (0.0039) [2024-06-25 13:33:53,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42056.7, 300 sec: 42431.9). Total num frames: 14675083264. Throughput: 0: 42292.4. Samples: 14675174720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:53,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 13:33:55,901][15401] Updated weights for policy 0, policy_version 895704 (0.0039) [2024-06-25 13:33:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14675296256. Throughput: 0: 42192.5. Samples: 14675422780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:33:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 13:34:00,275][15401] Updated weights for policy 0, policy_version 895714 (0.0038) [2024-06-25 13:34:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14675525632. Throughput: 0: 42333.3. Samples: 14675680900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:34:03,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 13:34:03,559][15401] Updated weights for policy 0, policy_version 895724 (0.0032) [2024-06-25 13:34:07,767][15401] Updated weights for policy 0, policy_version 895734 (0.0041) [2024-06-25 13:34:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14675722240. Throughput: 0: 42311.2. Samples: 14675808160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:34:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 13:34:11,135][15401] Updated weights for policy 0, policy_version 895744 (0.0033) [2024-06-25 13:34:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14675935232. Throughput: 0: 42516.1. Samples: 14676064380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 13:34:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:34:15,349][15401] Updated weights for policy 0, policy_version 895754 (0.0041) [2024-06-25 13:34:18,349][15349] Signal inference workers to stop experience collection... (217250 times) [2024-06-25 13:34:18,358][15349] Signal inference workers to resume experience collection... (217250 times) [2024-06-25 13:34:18,372][15401] InferenceWorker_p0-w0: stopping experience collection (217250 times) [2024-06-25 13:34:18,372][15401] InferenceWorker_p0-w0: resuming experience collection (217250 times) [2024-06-25 13:34:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14676164608. Throughput: 0: 42731.6. Samples: 14676322980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 13:34:18,675][15401] Updated weights for policy 0, policy_version 895764 (0.0031) [2024-06-25 13:34:23,097][15401] Updated weights for policy 0, policy_version 895774 (0.0030) [2024-06-25 13:34:23,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 14676377600. Throughput: 0: 42645.2. Samples: 14676453300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:23,399][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 13:34:27,006][15401] Updated weights for policy 0, policy_version 895784 (0.0046) [2024-06-25 13:34:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 14676574208. Throughput: 0: 42610.5. Samples: 14676701140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:28,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 13:34:30,787][15401] Updated weights for policy 0, policy_version 895794 (0.0037) [2024-06-25 13:34:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42654.9). Total num frames: 14676803584. Throughput: 0: 42651.5. Samples: 14676960760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 13:34:34,491][15401] Updated weights for policy 0, policy_version 895804 (0.0031) [2024-06-25 13:34:38,348][15401] Updated weights for policy 0, policy_version 895814 (0.0035) [2024-06-25 13:34:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 14677016576. Throughput: 0: 42663.2. Samples: 14677094560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 13:34:41,991][15401] Updated weights for policy 0, policy_version 895824 (0.0038) [2024-06-25 13:34:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 14677213184. Throughput: 0: 42672.8. Samples: 14677343060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 13:34:45,940][15401] Updated weights for policy 0, policy_version 895834 (0.0031) [2024-06-25 13:34:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14677442560. Throughput: 0: 42520.9. Samples: 14677594340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:48,391][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 13:34:49,559][15401] Updated weights for policy 0, policy_version 895844 (0.0029) [2024-06-25 13:34:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 14677622784. Throughput: 0: 42667.9. Samples: 14677728220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:53,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 13:34:53,844][15401] Updated weights for policy 0, policy_version 895854 (0.0037) [2024-06-25 13:34:57,858][15401] Updated weights for policy 0, policy_version 895864 (0.0045) [2024-06-25 13:34:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14677852160. Throughput: 0: 42423.8. Samples: 14677973460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:34:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 13:35:01,473][15401] Updated weights for policy 0, policy_version 895874 (0.0041) [2024-06-25 13:35:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14678065152. Throughput: 0: 42315.5. Samples: 14678227180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:03,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 13:35:05,807][15401] Updated weights for policy 0, policy_version 895884 (0.0032) [2024-06-25 13:35:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 14678261760. Throughput: 0: 42303.2. Samples: 14678356940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:08,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 13:35:09,213][15401] Updated weights for policy 0, policy_version 895894 (0.0039) [2024-06-25 13:35:13,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14678474752. Throughput: 0: 42326.2. Samples: 14678605820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 13:35:13,402][15401] Updated weights for policy 0, policy_version 895904 (0.0040) [2024-06-25 13:35:17,208][15401] Updated weights for policy 0, policy_version 895914 (0.0041) [2024-06-25 13:35:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42376.8). Total num frames: 14678687744. Throughput: 0: 42279.6. Samples: 14678863340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 13:35:20,974][15401] Updated weights for policy 0, policy_version 895924 (0.0023) [2024-06-25 13:35:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 14678900736. Throughput: 0: 42117.2. Samples: 14678989840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:23,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 13:35:24,714][15401] Updated weights for policy 0, policy_version 895934 (0.0038) [2024-06-25 13:35:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 14679113728. Throughput: 0: 42177.7. Samples: 14679241160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:28,393][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 13:35:28,561][15401] Updated weights for policy 0, policy_version 895944 (0.0026) [2024-06-25 13:35:32,476][15401] Updated weights for policy 0, policy_version 895954 (0.0048) [2024-06-25 13:35:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14679343104. Throughput: 0: 42422.8. Samples: 14679503360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:33,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 13:35:36,141][15401] Updated weights for policy 0, policy_version 895964 (0.0033) [2024-06-25 13:35:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42052.2, 300 sec: 42377.0). Total num frames: 14679539712. Throughput: 0: 42245.3. Samples: 14679629260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:38,394][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 13:35:40,251][15401] Updated weights for policy 0, policy_version 895974 (0.0026) [2024-06-25 13:35:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14679769088. Throughput: 0: 42471.6. Samples: 14679884680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:35:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895983_14679785472.pth... [2024-06-25 13:35:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895362_14669611008.pth [2024-06-25 13:35:43,797][15401] Updated weights for policy 0, policy_version 895984 (0.0048) [2024-06-25 13:35:48,049][15401] Updated weights for policy 0, policy_version 895994 (0.0048) [2024-06-25 13:35:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14679965696. Throughput: 0: 42405.6. Samples: 14680135440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 13:35:51,739][15401] Updated weights for policy 0, policy_version 896004 (0.0027) [2024-06-25 13:35:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 14680178688. Throughput: 0: 42386.7. Samples: 14680264340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:53,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 13:35:53,630][15349] Signal inference workers to stop experience collection... (217300 times) [2024-06-25 13:35:53,631][15349] Signal inference workers to resume experience collection... (217300 times) [2024-06-25 13:35:53,679][15401] InferenceWorker_p0-w0: stopping experience collection (217300 times) [2024-06-25 13:35:53,679][15401] InferenceWorker_p0-w0: resuming experience collection (217300 times) [2024-06-25 13:35:55,951][15401] Updated weights for policy 0, policy_version 896014 (0.0031) [2024-06-25 13:35:58,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 14680408064. Throughput: 0: 42668.8. Samples: 14680526020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:35:58,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 13:35:59,221][15401] Updated weights for policy 0, policy_version 896024 (0.0033) [2024-06-25 13:36:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 14680604672. Throughput: 0: 42584.5. Samples: 14680779640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 13:36:03,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 13:36:03,477][15401] Updated weights for policy 0, policy_version 896034 (0.0039) [2024-06-25 13:36:06,954][15401] Updated weights for policy 0, policy_version 896044 (0.0038) [2024-06-25 13:36:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 14680817664. Throughput: 0: 42491.6. Samples: 14680901960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:08,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 13:36:11,206][15401] Updated weights for policy 0, policy_version 896054 (0.0031) [2024-06-25 13:36:13,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 14681047040. Throughput: 0: 42707.3. Samples: 14681162880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 13:36:14,533][15401] Updated weights for policy 0, policy_version 896064 (0.0038) [2024-06-25 13:36:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 14681260032. Throughput: 0: 42506.2. Samples: 14681416140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:18,390][15132] Avg episode reward: [(0, '0.277')] [2024-06-25 13:36:18,866][15401] Updated weights for policy 0, policy_version 896074 (0.0030) [2024-06-25 13:36:22,286][15401] Updated weights for policy 0, policy_version 896084 (0.0027) [2024-06-25 13:36:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 14681456640. Throughput: 0: 42547.2. Samples: 14681543880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:23,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 13:36:26,686][15401] Updated weights for policy 0, policy_version 896094 (0.0026) [2024-06-25 13:36:28,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 14681686016. Throughput: 0: 42799.8. Samples: 14681810680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 13:36:30,173][15401] Updated weights for policy 0, policy_version 896104 (0.0044) [2024-06-25 13:36:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14681882624. Throughput: 0: 42736.1. Samples: 14682058560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:33,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 13:36:34,579][15401] Updated weights for policy 0, policy_version 896114 (0.0029) [2024-06-25 13:36:38,057][15401] Updated weights for policy 0, policy_version 896124 (0.0033) [2024-06-25 13:36:38,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14682112000. Throughput: 0: 42646.2. Samples: 14682183420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:38,393][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 13:36:42,311][15401] Updated weights for policy 0, policy_version 896134 (0.0034) [2024-06-25 13:36:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14682324992. Throughput: 0: 42771.2. Samples: 14682450620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 13:36:45,739][15401] Updated weights for policy 0, policy_version 896144 (0.0035) [2024-06-25 13:36:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42376.3). Total num frames: 14682521600. Throughput: 0: 42596.0. Samples: 14682696460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:48,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 13:36:49,834][15401] Updated weights for policy 0, policy_version 896154 (0.0034) [2024-06-25 13:36:53,230][15401] Updated weights for policy 0, policy_version 896164 (0.0043) [2024-06-25 13:36:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 14682750976. Throughput: 0: 42767.2. Samples: 14682826480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:53,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 13:36:57,309][15401] Updated weights for policy 0, policy_version 896174 (0.0031) [2024-06-25 13:36:58,390][15132] Fps is (10 sec: 44232.7, 60 sec: 42599.6, 300 sec: 42543.0). Total num frames: 14682963968. Throughput: 0: 42784.9. Samples: 14683088240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:36:58,391][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 13:37:00,764][15401] Updated weights for policy 0, policy_version 896184 (0.0028) [2024-06-25 13:37:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 14683176960. Throughput: 0: 42852.0. Samples: 14683344480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:03,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 13:37:04,821][15401] Updated weights for policy 0, policy_version 896194 (0.0031) [2024-06-25 13:37:08,253][15401] Updated weights for policy 0, policy_version 896204 (0.0046) [2024-06-25 13:37:08,389][15132] Fps is (10 sec: 44240.3, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 14683406336. Throughput: 0: 42888.4. Samples: 14683473860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 13:37:12,375][15401] Updated weights for policy 0, policy_version 896214 (0.0038) [2024-06-25 13:37:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14683586560. Throughput: 0: 42702.4. Samples: 14683732280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:13,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 13:37:15,783][15401] Updated weights for policy 0, policy_version 896224 (0.0026) [2024-06-25 13:37:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 14683815936. Throughput: 0: 42725.3. Samples: 14683981200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:18,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 13:37:20,171][15401] Updated weights for policy 0, policy_version 896234 (0.0036) [2024-06-25 13:37:23,265][15401] Updated weights for policy 0, policy_version 896244 (0.0028) [2024-06-25 13:37:23,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42543.2). Total num frames: 14684061696. Throughput: 0: 42909.8. Samples: 14684114360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 13:37:27,985][15401] Updated weights for policy 0, policy_version 896254 (0.0044) [2024-06-25 13:37:28,009][15349] Signal inference workers to stop experience collection... (217350 times) [2024-06-25 13:37:28,009][15349] Signal inference workers to resume experience collection... (217350 times) [2024-06-25 13:37:28,021][15401] InferenceWorker_p0-w0: stopping experience collection (217350 times) [2024-06-25 13:37:28,021][15401] InferenceWorker_p0-w0: resuming experience collection (217350 times) [2024-06-25 13:37:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14684241920. Throughput: 0: 42594.2. Samples: 14684367360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 13:37:30,740][15401] Updated weights for policy 0, policy_version 896264 (0.0027) [2024-06-25 13:37:33,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14684454912. Throughput: 0: 42759.9. Samples: 14684620660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 13:37:35,673][15401] Updated weights for policy 0, policy_version 896274 (0.0044) [2024-06-25 13:37:38,336][15401] Updated weights for policy 0, policy_version 896284 (0.0036) [2024-06-25 13:37:38,392][15132] Fps is (10 sec: 47502.2, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 14684717056. Throughput: 0: 42747.9. Samples: 14684750240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:38,392][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 13:37:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 14684864512. Throughput: 0: 42808.3. Samples: 14685014580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:43,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 13:37:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896294_14684880896.pth... [2024-06-25 13:37:43,413][15401] Updated weights for policy 0, policy_version 896294 (0.0031) [2024-06-25 13:37:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895670_14674657280.pth [2024-06-25 13:37:46,028][15401] Updated weights for policy 0, policy_version 896304 (0.0039) [2024-06-25 13:37:48,390][15132] Fps is (10 sec: 39330.6, 60 sec: 43144.3, 300 sec: 42543.8). Total num frames: 14685110272. Throughput: 0: 42647.8. Samples: 14685263640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:48,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 13:37:51,167][15401] Updated weights for policy 0, policy_version 896314 (0.0023) [2024-06-25 13:37:53,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14685339648. Throughput: 0: 42700.0. Samples: 14685395360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 13:37:53,947][15401] Updated weights for policy 0, policy_version 896324 (0.0023) [2024-06-25 13:37:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.9, 300 sec: 42431.8). Total num frames: 14685503488. Throughput: 0: 42576.5. Samples: 14685648220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-25 13:37:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 13:37:58,957][15401] Updated weights for policy 0, policy_version 896334 (0.0044) [2024-06-25 13:38:01,834][15401] Updated weights for policy 0, policy_version 896344 (0.0027) [2024-06-25 13:38:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14685749248. Throughput: 0: 42815.5. Samples: 14685907900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:03,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:38:06,698][15401] Updated weights for policy 0, policy_version 896354 (0.0040) [2024-06-25 13:38:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14685962240. Throughput: 0: 42760.0. Samples: 14686038560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:08,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 13:38:09,346][15401] Updated weights for policy 0, policy_version 896364 (0.0042) [2024-06-25 13:38:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 14686142464. Throughput: 0: 42631.2. Samples: 14686285760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 13:38:14,395][15401] Updated weights for policy 0, policy_version 896374 (0.0030) [2024-06-25 13:38:16,998][15401] Updated weights for policy 0, policy_version 896384 (0.0040) [2024-06-25 13:38:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14686388224. Throughput: 0: 42665.7. Samples: 14686540620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 13:38:22,035][15401] Updated weights for policy 0, policy_version 896394 (0.0040) [2024-06-25 13:38:23,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42323.6, 300 sec: 42653.9). Total num frames: 14686601216. Throughput: 0: 42717.3. Samples: 14686672520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:23,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 13:38:25,022][15401] Updated weights for policy 0, policy_version 896404 (0.0029) [2024-06-25 13:38:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14686797824. Throughput: 0: 42545.3. Samples: 14686929120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 13:38:29,657][15401] Updated weights for policy 0, policy_version 896414 (0.0041) [2024-06-25 13:38:32,460][15401] Updated weights for policy 0, policy_version 896424 (0.0039) [2024-06-25 13:38:33,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14687027200. Throughput: 0: 42641.4. Samples: 14687182500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 13:38:37,115][15401] Updated weights for policy 0, policy_version 896434 (0.0040) [2024-06-25 13:38:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 42654.0). Total num frames: 14687240192. Throughput: 0: 42711.9. Samples: 14687317400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:38,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 13:38:40,202][15401] Updated weights for policy 0, policy_version 896444 (0.0031) [2024-06-25 13:38:40,221][15349] Signal inference workers to stop experience collection... (217400 times) [2024-06-25 13:38:40,221][15349] Signal inference workers to resume experience collection... (217400 times) [2024-06-25 13:38:40,269][15401] InferenceWorker_p0-w0: stopping experience collection (217400 times) [2024-06-25 13:38:40,270][15401] InferenceWorker_p0-w0: resuming experience collection (217400 times) [2024-06-25 13:38:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14687436800. Throughput: 0: 42709.8. Samples: 14687570160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:43,390][15132] Avg episode reward: [(0, '0.152')] [2024-06-25 13:38:44,782][15401] Updated weights for policy 0, policy_version 896454 (0.0038) [2024-06-25 13:38:47,818][15401] Updated weights for policy 0, policy_version 896464 (0.0032) [2024-06-25 13:38:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14687682560. Throughput: 0: 42483.7. Samples: 14687819660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 13:38:52,262][15401] Updated weights for policy 0, policy_version 896474 (0.0044) [2024-06-25 13:38:53,393][15132] Fps is (10 sec: 42583.5, 60 sec: 42049.9, 300 sec: 42597.9). Total num frames: 14687862784. Throughput: 0: 42574.5. Samples: 14687954560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:53,394][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 13:38:55,606][15401] Updated weights for policy 0, policy_version 896484 (0.0025) [2024-06-25 13:38:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14688092160. Throughput: 0: 42802.0. Samples: 14688211860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:38:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 13:39:00,459][15401] Updated weights for policy 0, policy_version 896494 (0.0033) [2024-06-25 13:39:03,064][15401] Updated weights for policy 0, policy_version 896504 (0.0041) [2024-06-25 13:39:03,390][15132] Fps is (10 sec: 47529.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14688337920. Throughput: 0: 42700.9. Samples: 14688462160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:39:08,118][15401] Updated weights for policy 0, policy_version 896514 (0.0032) [2024-06-25 13:39:08,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14688501760. Throughput: 0: 42746.4. Samples: 14688596000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:08,390][15132] Avg episode reward: [(0, '0.224')] [2024-06-25 13:39:11,170][15401] Updated weights for policy 0, policy_version 896524 (0.0037) [2024-06-25 13:39:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14688731136. Throughput: 0: 42637.3. Samples: 14688847800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:13,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 13:39:15,745][15401] Updated weights for policy 0, policy_version 896534 (0.0031) [2024-06-25 13:39:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14688960512. Throughput: 0: 42596.1. Samples: 14689099320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 13:39:18,689][15401] Updated weights for policy 0, policy_version 896544 (0.0024) [2024-06-25 13:39:23,312][15401] Updated weights for policy 0, policy_version 896554 (0.0033) [2024-06-25 13:39:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 14689140736. Throughput: 0: 42683.3. Samples: 14689238140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:23,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:39:26,227][15401] Updated weights for policy 0, policy_version 896564 (0.0034) [2024-06-25 13:39:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14689353728. Throughput: 0: 42649.3. Samples: 14689489380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 13:39:31,053][15401] Updated weights for policy 0, policy_version 896574 (0.0028) [2024-06-25 13:39:33,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14689615872. Throughput: 0: 42681.8. Samples: 14689740340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:33,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 13:39:33,794][15401] Updated weights for policy 0, policy_version 896584 (0.0036) [2024-06-25 13:39:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14689763328. Throughput: 0: 42800.2. Samples: 14689880420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 13:39:38,776][15401] Updated weights for policy 0, policy_version 896594 (0.0040) [2024-06-25 13:39:41,372][15401] Updated weights for policy 0, policy_version 896604 (0.0036) [2024-06-25 13:39:43,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 14690009088. Throughput: 0: 42574.2. Samples: 14690127700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:43,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 13:39:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896607_14690009088.pth... [2024-06-25 13:39:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000895983_14679785472.pth [2024-06-25 13:39:46,292][15401] Updated weights for policy 0, policy_version 896614 (0.0045) [2024-06-25 13:39:48,390][15132] Fps is (10 sec: 49151.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14690254848. Throughput: 0: 42579.1. Samples: 14690378220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:48,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 13:39:49,000][15401] Updated weights for policy 0, policy_version 896624 (0.0040) [2024-06-25 13:39:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42327.7, 300 sec: 42542.9). Total num frames: 14690402304. Throughput: 0: 42676.3. Samples: 14690516440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-06-25 13:39:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 13:39:53,843][15401] Updated weights for policy 0, policy_version 896634 (0.0044) [2024-06-25 13:39:56,733][15401] Updated weights for policy 0, policy_version 896644 (0.0043) [2024-06-25 13:39:58,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 14690648064. Throughput: 0: 42480.4. Samples: 14690759520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:39:58,392][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 13:40:01,434][15401] Updated weights for policy 0, policy_version 896654 (0.0041) [2024-06-25 13:40:02,775][15349] Signal inference workers to stop experience collection... (217450 times) [2024-06-25 13:40:02,775][15349] Signal inference workers to resume experience collection... (217450 times) [2024-06-25 13:40:02,813][15401] InferenceWorker_p0-w0: stopping experience collection (217450 times) [2024-06-25 13:40:02,814][15401] InferenceWorker_p0-w0: resuming experience collection (217450 times) [2024-06-25 13:40:03,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14690877440. Throughput: 0: 42764.3. Samples: 14691023720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 13:40:04,406][15401] Updated weights for policy 0, policy_version 896664 (0.0036) [2024-06-25 13:40:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14691057664. Throughput: 0: 42589.2. Samples: 14691154660. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:08,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 13:40:09,093][15401] Updated weights for policy 0, policy_version 896674 (0.0034) [2024-06-25 13:40:12,065][15401] Updated weights for policy 0, policy_version 896684 (0.0043) [2024-06-25 13:40:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14691303424. Throughput: 0: 42500.2. Samples: 14691401900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 13:40:17,139][15401] Updated weights for policy 0, policy_version 896694 (0.0035) [2024-06-25 13:40:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14691500032. Throughput: 0: 42892.8. Samples: 14691670520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:18,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 13:40:20,456][15401] Updated weights for policy 0, policy_version 896704 (0.0034) [2024-06-25 13:40:23,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 14691696640. Throughput: 0: 42518.7. Samples: 14691793760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 13:40:24,767][15401] Updated weights for policy 0, policy_version 896714 (0.0036) [2024-06-25 13:40:28,130][15401] Updated weights for policy 0, policy_version 896724 (0.0042) [2024-06-25 13:40:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14691942400. Throughput: 0: 42648.6. Samples: 14692046880. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 13:40:32,251][15401] Updated weights for policy 0, policy_version 896734 (0.0036) [2024-06-25 13:40:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 14692122624. Throughput: 0: 43074.9. Samples: 14692316580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:33,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 13:40:35,699][15401] Updated weights for policy 0, policy_version 896744 (0.0041) [2024-06-25 13:40:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14692352000. Throughput: 0: 42669.9. Samples: 14692436580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 13:40:39,922][15401] Updated weights for policy 0, policy_version 896754 (0.0040) [2024-06-25 13:40:43,385][15401] Updated weights for policy 0, policy_version 896764 (0.0030) [2024-06-25 13:40:43,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14692581376. Throughput: 0: 42866.3. Samples: 14692688400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:40:47,622][15401] Updated weights for policy 0, policy_version 896774 (0.0037) [2024-06-25 13:40:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 14692761600. Throughput: 0: 42807.1. Samples: 14692950040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 13:40:51,253][15401] Updated weights for policy 0, policy_version 896784 (0.0045) [2024-06-25 13:40:53,393][15132] Fps is (10 sec: 39307.1, 60 sec: 42868.9, 300 sec: 42598.2). Total num frames: 14692974592. Throughput: 0: 42738.4. Samples: 14693078040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:53,394][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 13:40:55,326][15401] Updated weights for policy 0, policy_version 896794 (0.0033) [2024-06-25 13:40:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14693220352. Throughput: 0: 42871.3. Samples: 14693331100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:40:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 13:40:58,806][15401] Updated weights for policy 0, policy_version 896804 (0.0031) [2024-06-25 13:41:02,785][15401] Updated weights for policy 0, policy_version 896814 (0.0026) [2024-06-25 13:41:03,390][15132] Fps is (10 sec: 44252.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14693416960. Throughput: 0: 42825.7. Samples: 14693597680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:03,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 13:41:06,567][15401] Updated weights for policy 0, policy_version 896824 (0.0042) [2024-06-25 13:41:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14693629952. Throughput: 0: 42865.4. Samples: 14693722700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:08,390][15132] Avg episode reward: [(0, '0.177')] [2024-06-25 13:41:10,351][15401] Updated weights for policy 0, policy_version 896834 (0.0028) [2024-06-25 13:41:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 14693842944. Throughput: 0: 42857.4. Samples: 14693975460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 13:41:14,147][15401] Updated weights for policy 0, policy_version 896844 (0.0043) [2024-06-25 13:41:18,349][15401] Updated weights for policy 0, policy_version 896854 (0.0032) [2024-06-25 13:41:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14694055936. Throughput: 0: 42498.0. Samples: 14694229000. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 13:41:19,627][15349] Signal inference workers to stop experience collection... (217500 times) [2024-06-25 13:41:19,627][15349] Signal inference workers to resume experience collection... (217500 times) [2024-06-25 13:41:19,676][15401] InferenceWorker_p0-w0: stopping experience collection (217500 times) [2024-06-25 13:41:19,676][15401] InferenceWorker_p0-w0: resuming experience collection (217500 times) [2024-06-25 13:41:21,865][15401] Updated weights for policy 0, policy_version 896864 (0.0028) [2024-06-25 13:41:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14694268928. Throughput: 0: 42701.8. Samples: 14694358160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:23,390][15132] Avg episode reward: [(0, '0.898')] [2024-06-25 13:41:25,946][15401] Updated weights for policy 0, policy_version 896874 (0.0045) [2024-06-25 13:41:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14694481920. Throughput: 0: 42760.9. Samples: 14694612640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:28,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 13:41:29,528][15401] Updated weights for policy 0, policy_version 896884 (0.0040) [2024-06-25 13:41:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14694694912. Throughput: 0: 42584.3. Samples: 14694866340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:33,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 13:41:33,647][15401] Updated weights for policy 0, policy_version 896894 (0.0038) [2024-06-25 13:41:37,439][15401] Updated weights for policy 0, policy_version 896904 (0.0032) [2024-06-25 13:41:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14694907904. Throughput: 0: 42654.9. Samples: 14694997360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 13:41:41,465][15401] Updated weights for policy 0, policy_version 896914 (0.0034) [2024-06-25 13:41:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14695104512. Throughput: 0: 42599.9. Samples: 14695248100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 13:41:43,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-25 13:41:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896919_14695120896.pth... [2024-06-25 13:41:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896294_14684880896.pth [2024-06-25 13:41:45,062][15401] Updated weights for policy 0, policy_version 896924 (0.0027) [2024-06-25 13:41:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14695333888. Throughput: 0: 42355.2. Samples: 14695503660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:41:48,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 13:41:49,039][15401] Updated weights for policy 0, policy_version 896934 (0.0033) [2024-06-25 13:41:52,902][15401] Updated weights for policy 0, policy_version 896944 (0.0035) [2024-06-25 13:41:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42874.0, 300 sec: 42654.0). Total num frames: 14695546880. Throughput: 0: 42450.1. Samples: 14695632960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:41:53,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 13:41:56,815][15401] Updated weights for policy 0, policy_version 896954 (0.0034) [2024-06-25 13:41:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14695759872. Throughput: 0: 42472.9. Samples: 14695886740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:41:58,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 13:42:00,717][15401] Updated weights for policy 0, policy_version 896964 (0.0039) [2024-06-25 13:42:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14695989248. Throughput: 0: 42478.2. Samples: 14696140520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:03,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 13:42:04,342][15401] Updated weights for policy 0, policy_version 896974 (0.0035) [2024-06-25 13:42:08,144][15401] Updated weights for policy 0, policy_version 896984 (0.0043) [2024-06-25 13:42:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14696185856. Throughput: 0: 42517.7. Samples: 14696271460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 13:42:11,868][15401] Updated weights for policy 0, policy_version 896994 (0.0033) [2024-06-25 13:42:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14696398848. Throughput: 0: 42638.5. Samples: 14696531380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 13:42:15,609][15401] Updated weights for policy 0, policy_version 897004 (0.0040) [2024-06-25 13:42:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14696628224. Throughput: 0: 42641.4. Samples: 14696785200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 13:42:19,990][15401] Updated weights for policy 0, policy_version 897014 (0.0029) [2024-06-25 13:42:23,381][15401] Updated weights for policy 0, policy_version 897024 (0.0033) [2024-06-25 13:42:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14696841216. Throughput: 0: 42754.8. Samples: 14696921320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 13:42:27,562][15401] Updated weights for policy 0, policy_version 897034 (0.0036) [2024-06-25 13:42:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14697037824. Throughput: 0: 42859.6. Samples: 14697176780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 13:42:30,772][15401] Updated weights for policy 0, policy_version 897044 (0.0042) [2024-06-25 13:42:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42598.8). Total num frames: 14697283584. Throughput: 0: 42748.2. Samples: 14697427320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:33,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 13:42:35,105][15401] Updated weights for policy 0, policy_version 897054 (0.0039) [2024-06-25 13:42:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14697480192. Throughput: 0: 42857.5. Samples: 14697561540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 13:42:38,766][15401] Updated weights for policy 0, policy_version 897064 (0.0050) [2024-06-25 13:42:43,301][15401] Updated weights for policy 0, policy_version 897074 (0.0044) [2024-06-25 13:42:43,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14697660416. Throughput: 0: 42823.1. Samples: 14697813780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:43,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 13:42:44,360][15349] Signal inference workers to stop experience collection... (217550 times) [2024-06-25 13:42:44,396][15401] InferenceWorker_p0-w0: stopping experience collection (217550 times) [2024-06-25 13:42:44,420][15349] Signal inference workers to resume experience collection... (217550 times) [2024-06-25 13:42:44,421][15401] InferenceWorker_p0-w0: resuming experience collection (217550 times) [2024-06-25 13:42:46,338][15401] Updated weights for policy 0, policy_version 897084 (0.0026) [2024-06-25 13:42:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 14697922560. Throughput: 0: 42627.2. Samples: 14698058740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:48,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 13:42:50,745][15401] Updated weights for policy 0, policy_version 897094 (0.0029) [2024-06-25 13:42:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14698119168. Throughput: 0: 42899.6. Samples: 14698201940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 13:42:53,787][15401] Updated weights for policy 0, policy_version 897104 (0.0034) [2024-06-25 13:42:58,300][15401] Updated weights for policy 0, policy_version 897114 (0.0037) [2024-06-25 13:42:58,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14698315776. Throughput: 0: 42713.4. Samples: 14698453480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:42:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 13:43:01,356][15401] Updated weights for policy 0, policy_version 897124 (0.0030) [2024-06-25 13:43:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 14698561536. Throughput: 0: 42607.8. Samples: 14698702560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:03,391][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 13:43:05,868][15401] Updated weights for policy 0, policy_version 897134 (0.0030) [2024-06-25 13:43:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14698741760. Throughput: 0: 42630.2. Samples: 14698839680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:08,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 13:43:09,242][15401] Updated weights for policy 0, policy_version 897144 (0.0033) [2024-06-25 13:43:13,393][15132] Fps is (10 sec: 39308.1, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 14698954752. Throughput: 0: 42562.9. Samples: 14699092260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:13,394][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 13:43:13,728][15401] Updated weights for policy 0, policy_version 897154 (0.0027) [2024-06-25 13:43:16,819][15401] Updated weights for policy 0, policy_version 897164 (0.0032) [2024-06-25 13:43:18,392][15132] Fps is (10 sec: 47502.5, 60 sec: 43142.8, 300 sec: 42765.0). Total num frames: 14699216896. Throughput: 0: 42586.1. Samples: 14699343800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:18,392][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 13:43:21,602][15401] Updated weights for policy 0, policy_version 897174 (0.0032) [2024-06-25 13:43:23,390][15132] Fps is (10 sec: 42613.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14699380736. Throughput: 0: 42738.1. Samples: 14699484760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:23,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 13:43:24,417][15401] Updated weights for policy 0, policy_version 897184 (0.0042) [2024-06-25 13:43:28,390][15132] Fps is (10 sec: 37692.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14699593728. Throughput: 0: 42586.2. Samples: 14699730160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:28,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 13:43:29,076][15401] Updated weights for policy 0, policy_version 897194 (0.0031) [2024-06-25 13:43:31,915][15401] Updated weights for policy 0, policy_version 897204 (0.0028) [2024-06-25 13:43:33,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14699855872. Throughput: 0: 42838.1. Samples: 14699986460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:33,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 13:43:36,939][15401] Updated weights for policy 0, policy_version 897214 (0.0024) [2024-06-25 13:43:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14700019712. Throughput: 0: 42866.1. Samples: 14700130920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-25 13:43:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 13:43:39,574][15401] Updated weights for policy 0, policy_version 897224 (0.0038) [2024-06-25 13:43:43,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14700249088. Throughput: 0: 42759.9. Samples: 14700377680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:43:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 13:43:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897232_14700249088.pth... [2024-06-25 13:43:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896607_14690009088.pth [2024-06-25 13:43:44,754][15349] Signal inference workers to stop experience collection... (217600 times) [2024-06-25 13:43:44,755][15349] Signal inference workers to resume experience collection... (217600 times) [2024-06-25 13:43:44,773][15401] Updated weights for policy 0, policy_version 897234 (0.0030) [2024-06-25 13:43:44,802][15401] InferenceWorker_p0-w0: stopping experience collection (217600 times) [2024-06-25 13:43:44,802][15401] InferenceWorker_p0-w0: resuming experience collection (217600 times) [2024-06-25 13:43:47,055][15401] Updated weights for policy 0, policy_version 897244 (0.0035) [2024-06-25 13:43:48,390][15132] Fps is (10 sec: 49152.6, 60 sec: 43144.4, 300 sec: 42876.6). Total num frames: 14700511232. Throughput: 0: 42953.9. Samples: 14700635480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:43:48,392][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 13:43:52,246][15401] Updated weights for policy 0, policy_version 897254 (0.0039) [2024-06-25 13:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14700658688. Throughput: 0: 43062.3. Samples: 14700777480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:43:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 13:43:54,695][15401] Updated weights for policy 0, policy_version 897264 (0.0043) [2024-06-25 13:43:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 14700904448. Throughput: 0: 43027.4. Samples: 14701028340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:43:58,392][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 13:43:59,708][15401] Updated weights for policy 0, policy_version 897274 (0.0038) [2024-06-25 13:44:02,427][15401] Updated weights for policy 0, policy_version 897284 (0.0023) [2024-06-25 13:44:03,389][15132] Fps is (10 sec: 49152.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14701150208. Throughput: 0: 43010.8. Samples: 14701279180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 13:44:07,543][15401] Updated weights for policy 0, policy_version 897294 (0.0039) [2024-06-25 13:44:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14701297664. Throughput: 0: 42936.5. Samples: 14701416900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:08,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 13:44:09,913][15401] Updated weights for policy 0, policy_version 897304 (0.0036) [2024-06-25 13:44:13,391][15132] Fps is (10 sec: 39317.3, 60 sec: 43146.4, 300 sec: 42653.8). Total num frames: 14701543424. Throughput: 0: 42971.1. Samples: 14701663900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:13,391][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 13:44:15,379][15401] Updated weights for policy 0, policy_version 897314 (0.0037) [2024-06-25 13:44:17,758][15401] Updated weights for policy 0, policy_version 897324 (0.0023) [2024-06-25 13:44:18,390][15132] Fps is (10 sec: 49152.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14701789184. Throughput: 0: 42794.2. Samples: 14701912200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 13:44:23,032][15401] Updated weights for policy 0, policy_version 897334 (0.0035) [2024-06-25 13:44:23,390][15132] Fps is (10 sec: 39325.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14701936640. Throughput: 0: 42624.1. Samples: 14702049000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:23,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 13:44:25,740][15401] Updated weights for policy 0, policy_version 897344 (0.0024) [2024-06-25 13:44:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14702198784. Throughput: 0: 42688.1. Samples: 14702298640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 13:44:30,647][15401] Updated weights for policy 0, policy_version 897354 (0.0033) [2024-06-25 13:44:33,372][15401] Updated weights for policy 0, policy_version 897364 (0.0038) [2024-06-25 13:44:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14702411776. Throughput: 0: 42772.0. Samples: 14702560220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:33,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 13:44:38,233][15401] Updated weights for policy 0, policy_version 897374 (0.0041) [2024-06-25 13:44:38,390][15132] Fps is (10 sec: 37681.1, 60 sec: 42598.1, 300 sec: 42598.3). Total num frames: 14702575616. Throughput: 0: 42543.5. Samples: 14702691960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:38,391][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 13:44:40,890][15401] Updated weights for policy 0, policy_version 897384 (0.0031) [2024-06-25 13:44:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14702837760. Throughput: 0: 42488.4. Samples: 14702940320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:43,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 13:44:45,986][15401] Updated weights for policy 0, policy_version 897394 (0.0032) [2024-06-25 13:44:46,671][15349] Signal inference workers to stop experience collection... (217650 times) [2024-06-25 13:44:46,671][15349] Signal inference workers to resume experience collection... (217650 times) [2024-06-25 13:44:46,698][15401] InferenceWorker_p0-w0: stopping experience collection (217650 times) [2024-06-25 13:44:46,698][15401] InferenceWorker_p0-w0: resuming experience collection (217650 times) [2024-06-25 13:44:48,374][15401] Updated weights for policy 0, policy_version 897404 (0.0031) [2024-06-25 13:44:48,391][15132] Fps is (10 sec: 49145.7, 60 sec: 42597.1, 300 sec: 42931.4). Total num frames: 14703067136. Throughput: 0: 42664.8. Samples: 14703199180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:48,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 13:44:53,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 14703214592. Throughput: 0: 42461.7. Samples: 14703327680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 13:44:53,865][15401] Updated weights for policy 0, policy_version 897414 (0.0033) [2024-06-25 13:44:56,086][15401] Updated weights for policy 0, policy_version 897424 (0.0022) [2024-06-25 13:44:58,390][15132] Fps is (10 sec: 42605.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14703493120. Throughput: 0: 42556.9. Samples: 14703578920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:44:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 13:45:01,356][15401] Updated weights for policy 0, policy_version 897434 (0.0042) [2024-06-25 13:45:03,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14703689728. Throughput: 0: 42888.1. Samples: 14703842160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:03,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-25 13:45:03,907][15401] Updated weights for policy 0, policy_version 897444 (0.0026) [2024-06-25 13:45:08,389][15132] Fps is (10 sec: 36045.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14703853568. Throughput: 0: 42722.4. Samples: 14703971500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 13:45:08,935][15401] Updated weights for policy 0, policy_version 897454 (0.0023) [2024-06-25 13:45:11,589][15401] Updated weights for policy 0, policy_version 897464 (0.0038) [2024-06-25 13:45:13,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43140.7, 300 sec: 42819.6). Total num frames: 14704132096. Throughput: 0: 42729.1. Samples: 14704221720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:13,396][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 13:45:16,742][15401] Updated weights for policy 0, policy_version 897474 (0.0041) [2024-06-25 13:45:18,390][15132] Fps is (10 sec: 47512.8, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 14704328704. Throughput: 0: 42711.0. Samples: 14704482220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:18,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 13:45:19,292][15401] Updated weights for policy 0, policy_version 897484 (0.0032) [2024-06-25 13:45:23,389][15132] Fps is (10 sec: 36067.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14704492544. Throughput: 0: 42501.5. Samples: 14704604500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:23,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 13:45:24,263][15401] Updated weights for policy 0, policy_version 897494 (0.0038) [2024-06-25 13:45:27,025][15401] Updated weights for policy 0, policy_version 897504 (0.0039) [2024-06-25 13:45:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14704771072. Throughput: 0: 42745.5. Samples: 14704863860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 13:45:31,923][15401] Updated weights for policy 0, policy_version 897514 (0.0041) [2024-06-25 13:45:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14704967680. Throughput: 0: 42748.3. Samples: 14705122780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 13:45:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 13:45:34,657][15401] Updated weights for policy 0, policy_version 897524 (0.0038) [2024-06-25 13:45:38,389][15132] Fps is (10 sec: 36044.5, 60 sec: 42598.8, 300 sec: 42542.9). Total num frames: 14705131520. Throughput: 0: 42644.9. Samples: 14705246700. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:45:38,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 13:45:39,529][15401] Updated weights for policy 0, policy_version 897534 (0.0028) [2024-06-25 13:45:42,207][15401] Updated weights for policy 0, policy_version 897544 (0.0037) [2024-06-25 13:45:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14705410048. Throughput: 0: 42742.7. Samples: 14705502340. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:45:43,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 13:45:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897547_14705410048.pth... [2024-06-25 13:45:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000896919_14695120896.pth [2024-06-25 13:45:43,802][15349] Signal inference workers to stop experience collection... (217700 times) [2024-06-25 13:45:43,802][15349] Signal inference workers to resume experience collection... (217700 times) [2024-06-25 13:45:43,818][15401] InferenceWorker_p0-w0: stopping experience collection (217700 times) [2024-06-25 13:45:43,844][15401] InferenceWorker_p0-w0: resuming experience collection (217700 times) [2024-06-25 13:45:47,217][15401] Updated weights for policy 0, policy_version 897554 (0.0040) [2024-06-25 13:45:48,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42326.7, 300 sec: 42821.1). Total num frames: 14705606656. Throughput: 0: 42567.5. Samples: 14705757700. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:45:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 13:45:50,171][15401] Updated weights for policy 0, policy_version 897564 (0.0033) [2024-06-25 13:45:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14705786880. Throughput: 0: 42491.9. Samples: 14705883640. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:45:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 13:45:54,890][15401] Updated weights for policy 0, policy_version 897574 (0.0040) [2024-06-25 13:45:57,829][15401] Updated weights for policy 0, policy_version 897584 (0.0040) [2024-06-25 13:45:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14706049024. Throughput: 0: 42672.7. Samples: 14706141720. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:45:58,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 13:46:02,708][15401] Updated weights for policy 0, policy_version 897594 (0.0027) [2024-06-25 13:46:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14706245632. Throughput: 0: 42669.5. Samples: 14706402340. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 13:46:05,337][15401] Updated weights for policy 0, policy_version 897604 (0.0034) [2024-06-25 13:46:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14706442240. Throughput: 0: 42631.1. Samples: 14706522900. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 13:46:10,228][15401] Updated weights for policy 0, policy_version 897614 (0.0029) [2024-06-25 13:46:12,835][15401] Updated weights for policy 0, policy_version 897624 (0.0027) [2024-06-25 13:46:13,393][15132] Fps is (10 sec: 44219.2, 60 sec: 42600.2, 300 sec: 42820.0). Total num frames: 14706688000. Throughput: 0: 42618.0. Samples: 14706781840. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:13,394][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 13:46:17,978][15401] Updated weights for policy 0, policy_version 897634 (0.0037) [2024-06-25 13:46:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 14706851840. Throughput: 0: 42659.7. Samples: 14707042460. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 13:46:20,754][15401] Updated weights for policy 0, policy_version 897644 (0.0032) [2024-06-25 13:46:23,390][15132] Fps is (10 sec: 39336.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14707081216. Throughput: 0: 42592.4. Samples: 14707163360. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:23,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 13:46:25,707][15401] Updated weights for policy 0, policy_version 897654 (0.0032) [2024-06-25 13:46:28,343][15401] Updated weights for policy 0, policy_version 897664 (0.0036) [2024-06-25 13:46:28,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14707326976. Throughput: 0: 42625.8. Samples: 14707420500. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 13:46:33,166][15401] Updated weights for policy 0, policy_version 897674 (0.0031) [2024-06-25 13:46:33,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 14707507200. Throughput: 0: 43039.0. Samples: 14707694560. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:33,392][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 13:46:35,787][15349] Signal inference workers to stop experience collection... (217750 times) [2024-06-25 13:46:35,788][15349] Signal inference workers to resume experience collection... (217750 times) [2024-06-25 13:46:35,802][15401] InferenceWorker_p0-w0: stopping experience collection (217750 times) [2024-06-25 13:46:35,805][15401] Updated weights for policy 0, policy_version 897684 (0.0039) [2024-06-25 13:46:35,833][15401] InferenceWorker_p0-w0: resuming experience collection (217750 times) [2024-06-25 13:46:38,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 14707736576. Throughput: 0: 42856.8. Samples: 14707812300. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:38,393][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 13:46:40,708][15401] Updated weights for policy 0, policy_version 897694 (0.0030) [2024-06-25 13:46:43,390][15132] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14707965952. Throughput: 0: 42954.1. Samples: 14708074660. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 13:46:43,564][15401] Updated weights for policy 0, policy_version 897704 (0.0032) [2024-06-25 13:46:48,276][15401] Updated weights for policy 0, policy_version 897714 (0.0036) [2024-06-25 13:46:48,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14708146176. Throughput: 0: 42971.1. Samples: 14708336040. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:48,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 13:46:51,352][15401] Updated weights for policy 0, policy_version 897724 (0.0037) [2024-06-25 13:46:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14708375552. Throughput: 0: 42956.9. Samples: 14708455960. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:53,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 13:46:55,835][15401] Updated weights for policy 0, policy_version 897734 (0.0043) [2024-06-25 13:46:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14708604928. Throughput: 0: 42879.7. Samples: 14708711260. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:46:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 13:46:59,068][15401] Updated weights for policy 0, policy_version 897744 (0.0043) [2024-06-25 13:47:03,300][15401] Updated weights for policy 0, policy_version 897754 (0.0037) [2024-06-25 13:47:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14708801536. Throughput: 0: 43012.0. Samples: 14708978000. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 13:47:06,771][15401] Updated weights for policy 0, policy_version 897764 (0.0028) [2024-06-25 13:47:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14709030912. Throughput: 0: 42959.1. Samples: 14709096520. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:08,395][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 13:47:11,046][15401] Updated weights for policy 0, policy_version 897774 (0.0026) [2024-06-25 13:47:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42601.2, 300 sec: 42765.0). Total num frames: 14709243904. Throughput: 0: 43042.7. Samples: 14709357420. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:13,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 13:47:14,586][15401] Updated weights for policy 0, policy_version 897784 (0.0044) [2024-06-25 13:47:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14709440512. Throughput: 0: 42665.9. Samples: 14709614420. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 13:47:18,603][15401] Updated weights for policy 0, policy_version 897794 (0.0029) [2024-06-25 13:47:22,170][15401] Updated weights for policy 0, policy_version 897804 (0.0033) [2024-06-25 13:47:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14709669888. Throughput: 0: 42686.6. Samples: 14709733100. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 13:47:26,439][15401] Updated weights for policy 0, policy_version 897814 (0.0027) [2024-06-25 13:47:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14709882880. Throughput: 0: 42722.3. Samples: 14709997160. Policy #0 lag: (min: 2.0, avg: 12.7, max: 25.0) [2024-06-25 13:47:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 13:47:29,841][15401] Updated weights for policy 0, policy_version 897824 (0.0034) [2024-06-25 13:47:33,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 14710063104. Throughput: 0: 42560.3. Samples: 14710251260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:33,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 13:47:33,939][15401] Updated weights for policy 0, policy_version 897834 (0.0030) [2024-06-25 13:47:37,375][15401] Updated weights for policy 0, policy_version 897844 (0.0036) [2024-06-25 13:47:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 14710308864. Throughput: 0: 42706.1. Samples: 14710377740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 13:47:41,551][15401] Updated weights for policy 0, policy_version 897854 (0.0031) [2024-06-25 13:47:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 14710505472. Throughput: 0: 42779.1. Samples: 14710636320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:43,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 13:47:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897858_14710505472.pth... [2024-06-25 13:47:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897232_14700249088.pth [2024-06-25 13:47:45,029][15401] Updated weights for policy 0, policy_version 897864 (0.0032) [2024-06-25 13:47:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14710718464. Throughput: 0: 42632.4. Samples: 14710896460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:48,390][15132] Avg episode reward: [(0, '0.166')] [2024-06-25 13:47:49,250][15401] Updated weights for policy 0, policy_version 897874 (0.0035) [2024-06-25 13:47:52,736][15401] Updated weights for policy 0, policy_version 897884 (0.0046) [2024-06-25 13:47:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14710964224. Throughput: 0: 42817.8. Samples: 14711023320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 13:47:56,825][15401] Updated weights for policy 0, policy_version 897894 (0.0037) [2024-06-25 13:47:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14711144448. Throughput: 0: 42591.2. Samples: 14711274020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:47:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 13:48:00,436][15401] Updated weights for policy 0, policy_version 897904 (0.0026) [2024-06-25 13:48:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14711357440. Throughput: 0: 42731.4. Samples: 14711537340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 13:48:04,264][15401] Updated weights for policy 0, policy_version 897914 (0.0051) [2024-06-25 13:48:07,908][15401] Updated weights for policy 0, policy_version 897924 (0.0038) [2024-06-25 13:48:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42876.6). Total num frames: 14711603200. Throughput: 0: 42923.8. Samples: 14711664660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 13:48:11,239][15349] Signal inference workers to stop experience collection... (217800 times) [2024-06-25 13:48:11,240][15349] Signal inference workers to resume experience collection... (217800 times) [2024-06-25 13:48:11,273][15401] InferenceWorker_p0-w0: stopping experience collection (217800 times) [2024-06-25 13:48:11,273][15401] InferenceWorker_p0-w0: resuming experience collection (217800 times) [2024-06-25 13:48:11,815][15401] Updated weights for policy 0, policy_version 897934 (0.0034) [2024-06-25 13:48:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 14711816192. Throughput: 0: 42682.7. Samples: 14711917880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 13:48:15,826][15401] Updated weights for policy 0, policy_version 897944 (0.0033) [2024-06-25 13:48:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14712012800. Throughput: 0: 42816.5. Samples: 14712178000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 13:48:19,904][15401] Updated weights for policy 0, policy_version 897954 (0.0030) [2024-06-25 13:48:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 14712225792. Throughput: 0: 42822.4. Samples: 14712304740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 13:48:23,486][15401] Updated weights for policy 0, policy_version 897964 (0.0036) [2024-06-25 13:48:27,555][15401] Updated weights for policy 0, policy_version 897974 (0.0036) [2024-06-25 13:48:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14712438784. Throughput: 0: 42863.4. Samples: 14712565280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:28,392][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 13:48:31,169][15401] Updated weights for policy 0, policy_version 897984 (0.0031) [2024-06-25 13:48:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14712651776. Throughput: 0: 42797.2. Samples: 14712822340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 13:48:34,997][15401] Updated weights for policy 0, policy_version 897994 (0.0035) [2024-06-25 13:48:38,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14712848384. Throughput: 0: 42825.5. Samples: 14712950460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 13:48:38,858][15401] Updated weights for policy 0, policy_version 898004 (0.0026) [2024-06-25 13:48:42,849][15401] Updated weights for policy 0, policy_version 898014 (0.0053) [2024-06-25 13:48:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14713077760. Throughput: 0: 42896.9. Samples: 14713204380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:43,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 13:48:46,644][15401] Updated weights for policy 0, policy_version 898024 (0.0032) [2024-06-25 13:48:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14713290752. Throughput: 0: 42761.7. Samples: 14713461620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:48,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 13:48:50,604][15401] Updated weights for policy 0, policy_version 898034 (0.0041) [2024-06-25 13:48:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 14713487360. Throughput: 0: 42701.8. Samples: 14713586240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:53,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 13:48:54,109][15401] Updated weights for policy 0, policy_version 898044 (0.0032) [2024-06-25 13:48:58,100][15401] Updated weights for policy 0, policy_version 898054 (0.0034) [2024-06-25 13:48:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14713733120. Throughput: 0: 42866.8. Samples: 14713846880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:48:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 13:49:02,187][15401] Updated weights for policy 0, policy_version 898064 (0.0037) [2024-06-25 13:49:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14713946112. Throughput: 0: 42828.9. Samples: 14714105300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:49:03,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 13:49:05,506][15401] Updated weights for policy 0, policy_version 898074 (0.0027) [2024-06-25 13:49:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.2). Total num frames: 14714159104. Throughput: 0: 42801.2. Samples: 14714230800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:49:08,396][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 13:49:09,737][15401] Updated weights for policy 0, policy_version 898084 (0.0044) [2024-06-25 13:49:13,307][15401] Updated weights for policy 0, policy_version 898094 (0.0042) [2024-06-25 13:49:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14714372096. Throughput: 0: 42728.1. Samples: 14714487940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:49:13,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 13:49:17,477][15401] Updated weights for policy 0, policy_version 898104 (0.0036) [2024-06-25 13:49:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14714585088. Throughput: 0: 42779.2. Samples: 14714747400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 13:49:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:49:20,794][15401] Updated weights for policy 0, policy_version 898114 (0.0035) [2024-06-25 13:49:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14714798080. Throughput: 0: 42715.9. Samples: 14714872680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:23,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 13:49:24,923][15401] Updated weights for policy 0, policy_version 898124 (0.0041) [2024-06-25 13:49:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 14715011072. Throughput: 0: 42793.2. Samples: 14715130180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:28,392][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 13:49:28,485][15401] Updated weights for policy 0, policy_version 898134 (0.0031) [2024-06-25 13:49:32,671][15401] Updated weights for policy 0, policy_version 898144 (0.0031) [2024-06-25 13:49:33,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42820.3). Total num frames: 14715207680. Throughput: 0: 42874.7. Samples: 14715391080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:33,393][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 13:49:36,128][15401] Updated weights for policy 0, policy_version 898154 (0.0030) [2024-06-25 13:49:38,389][15132] Fps is (10 sec: 42609.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14715437056. Throughput: 0: 42908.0. Samples: 14715517100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:38,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 13:49:40,409][15401] Updated weights for policy 0, policy_version 898164 (0.0042) [2024-06-25 13:49:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 14715650048. Throughput: 0: 42799.1. Samples: 14715772840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:43,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 13:49:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898173_14715666432.pth... [2024-06-25 13:49:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897547_14705410048.pth [2024-06-25 13:49:43,901][15349] Signal inference workers to stop experience collection... (217850 times) [2024-06-25 13:49:43,927][15401] InferenceWorker_p0-w0: stopping experience collection (217850 times) [2024-06-25 13:49:43,948][15349] Signal inference workers to resume experience collection... (217850 times) [2024-06-25 13:49:43,956][15401] InferenceWorker_p0-w0: resuming experience collection (217850 times) [2024-06-25 13:49:43,959][15401] Updated weights for policy 0, policy_version 898174 (0.0041) [2024-06-25 13:49:48,119][15401] Updated weights for policy 0, policy_version 898184 (0.0031) [2024-06-25 13:49:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14715863040. Throughput: 0: 42676.7. Samples: 14716025760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 13:49:51,791][15401] Updated weights for policy 0, policy_version 898194 (0.0034) [2024-06-25 13:49:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14716076032. Throughput: 0: 42621.4. Samples: 14716148760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:53,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 13:49:55,770][15401] Updated weights for policy 0, policy_version 898204 (0.0035) [2024-06-25 13:49:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 14716272640. Throughput: 0: 42598.9. Samples: 14716404900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:49:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 13:49:59,543][15401] Updated weights for policy 0, policy_version 898214 (0.0028) [2024-06-25 13:50:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 14716485632. Throughput: 0: 42419.9. Samples: 14716656300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 13:50:03,415][15401] Updated weights for policy 0, policy_version 898224 (0.0037) [2024-06-25 13:50:07,483][15401] Updated weights for policy 0, policy_version 898234 (0.0030) [2024-06-25 13:50:08,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 14716715008. Throughput: 0: 42552.5. Samples: 14716787540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 13:50:11,302][15401] Updated weights for policy 0, policy_version 898244 (0.0039) [2024-06-25 13:50:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14716911616. Throughput: 0: 42490.2. Samples: 14717042140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:13,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 13:50:14,903][15401] Updated weights for policy 0, policy_version 898254 (0.0025) [2024-06-25 13:50:18,393][15132] Fps is (10 sec: 42585.5, 60 sec: 42596.2, 300 sec: 42875.7). Total num frames: 14717140992. Throughput: 0: 42269.6. Samples: 14717293240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:18,393][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 13:50:18,838][15401] Updated weights for policy 0, policy_version 898264 (0.0036) [2024-06-25 13:50:22,398][15401] Updated weights for policy 0, policy_version 898274 (0.0025) [2024-06-25 13:50:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14717370368. Throughput: 0: 42459.4. Samples: 14717427780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 13:50:26,950][15401] Updated weights for policy 0, policy_version 898284 (0.0028) [2024-06-25 13:50:28,389][15132] Fps is (10 sec: 40972.8, 60 sec: 42327.1, 300 sec: 42654.0). Total num frames: 14717550592. Throughput: 0: 42477.4. Samples: 14717684320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 13:50:29,979][15401] Updated weights for policy 0, policy_version 898294 (0.0038) [2024-06-25 13:50:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 14717763584. Throughput: 0: 42449.7. Samples: 14717936000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 13:50:34,683][15401] Updated weights for policy 0, policy_version 898304 (0.0038) [2024-06-25 13:50:37,687][15401] Updated weights for policy 0, policy_version 898314 (0.0035) [2024-06-25 13:50:38,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 14717992960. Throughput: 0: 42649.8. Samples: 14718068000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 13:50:42,314][15401] Updated weights for policy 0, policy_version 898324 (0.0033) [2024-06-25 13:50:43,390][15132] Fps is (10 sec: 42595.0, 60 sec: 42324.6, 300 sec: 42653.8). Total num frames: 14718189568. Throughput: 0: 42579.7. Samples: 14718321020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:43,391][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 13:50:45,457][15401] Updated weights for policy 0, policy_version 898334 (0.0035) [2024-06-25 13:50:48,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 14718386176. Throughput: 0: 42708.4. Samples: 14718578280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:48,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 13:50:49,883][15401] Updated weights for policy 0, policy_version 898344 (0.0031) [2024-06-25 13:50:53,211][15401] Updated weights for policy 0, policy_version 898354 (0.0040) [2024-06-25 13:50:53,390][15132] Fps is (10 sec: 44241.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14718631936. Throughput: 0: 42627.1. Samples: 14718705760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:53,394][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 13:50:57,472][15401] Updated weights for policy 0, policy_version 898364 (0.0031) [2024-06-25 13:50:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14718812160. Throughput: 0: 42711.2. Samples: 14718964140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:50:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 13:51:01,059][15401] Updated weights for policy 0, policy_version 898374 (0.0043) [2024-06-25 13:51:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14719041536. Throughput: 0: 42694.6. Samples: 14719214360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:51:03,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 13:51:05,128][15401] Updated weights for policy 0, policy_version 898384 (0.0035) [2024-06-25 13:51:06,888][15349] Signal inference workers to stop experience collection... (217900 times) [2024-06-25 13:51:06,890][15349] Signal inference workers to resume experience collection... (217900 times) [2024-06-25 13:51:06,904][15401] InferenceWorker_p0-w0: stopping experience collection (217900 times) [2024-06-25 13:51:06,937][15401] InferenceWorker_p0-w0: resuming experience collection (217900 times) [2024-06-25 13:51:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 14719270912. Throughput: 0: 42643.7. Samples: 14719346740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:51:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 13:51:08,599][15401] Updated weights for policy 0, policy_version 898394 (0.0037) [2024-06-25 13:51:12,655][15401] Updated weights for policy 0, policy_version 898404 (0.0035) [2024-06-25 13:51:13,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14719467520. Throughput: 0: 42770.9. Samples: 14719609020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 13:51:13,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 13:51:16,004][15401] Updated weights for policy 0, policy_version 898414 (0.0041) [2024-06-25 13:51:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.6, 300 sec: 42820.6). Total num frames: 14719713280. Throughput: 0: 42726.9. Samples: 14719858700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:18,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 13:51:20,279][15401] Updated weights for policy 0, policy_version 898424 (0.0031) [2024-06-25 13:51:23,392][15132] Fps is (10 sec: 45864.6, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 14719926272. Throughput: 0: 42725.7. Samples: 14719990760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:23,392][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 13:51:23,654][15401] Updated weights for policy 0, policy_version 898434 (0.0035) [2024-06-25 13:51:27,807][15401] Updated weights for policy 0, policy_version 898444 (0.0031) [2024-06-25 13:51:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 14720122880. Throughput: 0: 42918.8. Samples: 14720252320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:28,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 13:51:31,382][15401] Updated weights for policy 0, policy_version 898454 (0.0031) [2024-06-25 13:51:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.7, 300 sec: 42765.4). Total num frames: 14720352256. Throughput: 0: 42789.9. Samples: 14720503720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 13:51:35,837][15401] Updated weights for policy 0, policy_version 898464 (0.0035) [2024-06-25 13:51:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14720565248. Throughput: 0: 42775.7. Samples: 14720630660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 13:51:39,308][15401] Updated weights for policy 0, policy_version 898474 (0.0031) [2024-06-25 13:51:43,344][15401] Updated weights for policy 0, policy_version 898484 (0.0036) [2024-06-25 13:51:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42872.3, 300 sec: 42765.0). Total num frames: 14720761856. Throughput: 0: 42858.7. Samples: 14720892780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:43,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 13:51:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898485_14720778240.pth... [2024-06-25 13:51:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000897858_14710505472.pth [2024-06-25 13:51:46,848][15401] Updated weights for policy 0, policy_version 898494 (0.0031) [2024-06-25 13:51:48,396][15132] Fps is (10 sec: 42570.8, 60 sec: 43414.8, 300 sec: 42764.1). Total num frames: 14720991232. Throughput: 0: 42844.9. Samples: 14721142660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:48,396][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 13:51:50,789][15401] Updated weights for policy 0, policy_version 898504 (0.0036) [2024-06-25 13:51:53,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14721204224. Throughput: 0: 42681.7. Samples: 14721267420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:53,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 13:51:54,425][15401] Updated weights for policy 0, policy_version 898514 (0.0031) [2024-06-25 13:51:58,390][15132] Fps is (10 sec: 40985.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14721400832. Throughput: 0: 42649.8. Samples: 14721528260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:51:58,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 13:51:58,555][15401] Updated weights for policy 0, policy_version 898524 (0.0028) [2024-06-25 13:52:02,513][15401] Updated weights for policy 0, policy_version 898534 (0.0036) [2024-06-25 13:52:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14721630208. Throughput: 0: 42673.3. Samples: 14721779000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:03,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 13:52:06,167][15401] Updated weights for policy 0, policy_version 898544 (0.0032) [2024-06-25 13:52:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14721843200. Throughput: 0: 42746.7. Samples: 14721914260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:52:10,133][15401] Updated weights for policy 0, policy_version 898554 (0.0036) [2024-06-25 13:52:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14722039808. Throughput: 0: 42658.6. Samples: 14722171960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 13:52:14,165][15401] Updated weights for policy 0, policy_version 898564 (0.0037) [2024-06-25 13:52:17,769][15401] Updated weights for policy 0, policy_version 898574 (0.0031) [2024-06-25 13:52:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14722252800. Throughput: 0: 42620.1. Samples: 14722421620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:18,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 13:52:21,828][15401] Updated weights for policy 0, policy_version 898584 (0.0033) [2024-06-25 13:52:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14722482176. Throughput: 0: 42860.3. Samples: 14722559380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:23,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 13:52:23,427][15349] Signal inference workers to stop experience collection... (217950 times) [2024-06-25 13:52:23,433][15349] Signal inference workers to resume experience collection... (217950 times) [2024-06-25 13:52:23,461][15401] InferenceWorker_p0-w0: stopping experience collection (217950 times) [2024-06-25 13:52:23,461][15401] InferenceWorker_p0-w0: resuming experience collection (217950 times) [2024-06-25 13:52:25,310][15401] Updated weights for policy 0, policy_version 898594 (0.0039) [2024-06-25 13:52:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14722662400. Throughput: 0: 42557.2. Samples: 14722807860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:28,391][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 13:52:29,624][15401] Updated weights for policy 0, policy_version 898604 (0.0037) [2024-06-25 13:52:33,271][15401] Updated weights for policy 0, policy_version 898614 (0.0039) [2024-06-25 13:52:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 14722891776. Throughput: 0: 42649.1. Samples: 14723061600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 13:52:37,222][15401] Updated weights for policy 0, policy_version 898624 (0.0033) [2024-06-25 13:52:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14723121152. Throughput: 0: 42841.0. Samples: 14723195260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:38,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 13:52:40,791][15401] Updated weights for policy 0, policy_version 898634 (0.0041) [2024-06-25 13:52:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14723301376. Throughput: 0: 42511.1. Samples: 14723441260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 13:52:44,872][15401] Updated weights for policy 0, policy_version 898644 (0.0036) [2024-06-25 13:52:48,331][15401] Updated weights for policy 0, policy_version 898654 (0.0029) [2024-06-25 13:52:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 14723547136. Throughput: 0: 42793.4. Samples: 14723704700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:48,394][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 13:52:52,371][15401] Updated weights for policy 0, policy_version 898664 (0.0042) [2024-06-25 13:52:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14723760128. Throughput: 0: 42812.4. Samples: 14723840820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:53,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 13:52:55,929][15401] Updated weights for policy 0, policy_version 898674 (0.0032) [2024-06-25 13:52:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14723956736. Throughput: 0: 42630.3. Samples: 14724090320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:52:58,398][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:53:00,325][15401] Updated weights for policy 0, policy_version 898684 (0.0033) [2024-06-25 13:53:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14724186112. Throughput: 0: 42656.4. Samples: 14724341160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:53:03,399][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 13:53:03,647][15401] Updated weights for policy 0, policy_version 898694 (0.0025) [2024-06-25 13:53:07,951][15401] Updated weights for policy 0, policy_version 898704 (0.0038) [2024-06-25 13:53:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14724399104. Throughput: 0: 42508.0. Samples: 14724472240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-25 13:53:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 13:53:11,723][15401] Updated weights for policy 0, policy_version 898714 (0.0038) [2024-06-25 13:53:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14724595712. Throughput: 0: 42515.6. Samples: 14724721060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:13,399][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 13:53:15,625][15401] Updated weights for policy 0, policy_version 898724 (0.0041) [2024-06-25 13:53:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14724841472. Throughput: 0: 42375.1. Samples: 14724968480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:18,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 13:53:19,386][15401] Updated weights for policy 0, policy_version 898734 (0.0028) [2024-06-25 13:53:23,201][15401] Updated weights for policy 0, policy_version 898744 (0.0039) [2024-06-25 13:53:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 14725038080. Throughput: 0: 42459.0. Samples: 14725106020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:23,400][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 13:53:26,857][15401] Updated weights for policy 0, policy_version 898754 (0.0041) [2024-06-25 13:53:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14725234688. Throughput: 0: 42741.5. Samples: 14725364620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 13:53:30,737][15401] Updated weights for policy 0, policy_version 898764 (0.0040) [2024-06-25 13:53:33,390][15132] Fps is (10 sec: 42604.6, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 14725464064. Throughput: 0: 42586.7. Samples: 14725621140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:33,391][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 13:53:34,836][15401] Updated weights for policy 0, policy_version 898774 (0.0045) [2024-06-25 13:53:38,237][15401] Updated weights for policy 0, policy_version 898784 (0.0039) [2024-06-25 13:53:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14725677056. Throughput: 0: 42472.0. Samples: 14725752060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 13:53:42,154][15349] Signal inference workers to stop experience collection... (218000 times) [2024-06-25 13:53:42,184][15401] InferenceWorker_p0-w0: stopping experience collection (218000 times) [2024-06-25 13:53:42,216][15349] Signal inference workers to resume experience collection... (218000 times) [2024-06-25 13:53:42,220][15401] InferenceWorker_p0-w0: resuming experience collection (218000 times) [2024-06-25 13:53:42,221][15401] Updated weights for policy 0, policy_version 898794 (0.0037) [2024-06-25 13:53:43,389][15132] Fps is (10 sec: 42602.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14725890048. Throughput: 0: 42712.0. Samples: 14726012360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:43,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 13:53:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898797_14725890048.pth... [2024-06-25 13:53:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898173_14715666432.pth [2024-06-25 13:53:45,719][15401] Updated weights for policy 0, policy_version 898804 (0.0035) [2024-06-25 13:53:48,395][15132] Fps is (10 sec: 44213.4, 60 sec: 42867.7, 300 sec: 42819.8). Total num frames: 14726119424. Throughput: 0: 42834.1. Samples: 14726268920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:48,395][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 13:53:49,625][15401] Updated weights for policy 0, policy_version 898814 (0.0027) [2024-06-25 13:53:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14726316032. Throughput: 0: 42851.0. Samples: 14726400540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 13:53:53,545][15401] Updated weights for policy 0, policy_version 898824 (0.0037) [2024-06-25 13:53:57,399][15401] Updated weights for policy 0, policy_version 898834 (0.0028) [2024-06-25 13:53:58,389][15132] Fps is (10 sec: 40982.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14726529024. Throughput: 0: 42979.7. Samples: 14726655140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:53:58,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:54:01,522][15401] Updated weights for policy 0, policy_version 898844 (0.0033) [2024-06-25 13:54:03,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14726758400. Throughput: 0: 42993.3. Samples: 14726903180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:03,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 13:54:05,134][15401] Updated weights for policy 0, policy_version 898854 (0.0029) [2024-06-25 13:54:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14726971392. Throughput: 0: 42934.8. Samples: 14727037980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 13:54:09,130][15401] Updated weights for policy 0, policy_version 898864 (0.0039) [2024-06-25 13:54:12,661][15401] Updated weights for policy 0, policy_version 898874 (0.0038) [2024-06-25 13:54:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14727168000. Throughput: 0: 42910.1. Samples: 14727295580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:13,390][15132] Avg episode reward: [(0, '0.278')] [2024-06-25 13:54:16,746][15401] Updated weights for policy 0, policy_version 898884 (0.0026) [2024-06-25 13:54:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14727413760. Throughput: 0: 42768.0. Samples: 14727545660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 13:54:20,455][15401] Updated weights for policy 0, policy_version 898894 (0.0039) [2024-06-25 13:54:23,396][15132] Fps is (10 sec: 42571.0, 60 sec: 42595.5, 300 sec: 42653.4). Total num frames: 14727593984. Throughput: 0: 42755.6. Samples: 14727676340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:23,397][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 13:54:24,508][15401] Updated weights for policy 0, policy_version 898904 (0.0034) [2024-06-25 13:54:28,188][15401] Updated weights for policy 0, policy_version 898914 (0.0029) [2024-06-25 13:54:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 14727823360. Throughput: 0: 42764.1. Samples: 14727936740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 13:54:32,216][15401] Updated weights for policy 0, policy_version 898924 (0.0022) [2024-06-25 13:54:33,390][15132] Fps is (10 sec: 44265.3, 60 sec: 42872.1, 300 sec: 42709.5). Total num frames: 14728036352. Throughput: 0: 42538.3. Samples: 14728182920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 13:54:35,856][15401] Updated weights for policy 0, policy_version 898934 (0.0036) [2024-06-25 13:54:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14728216576. Throughput: 0: 42467.2. Samples: 14728311560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:38,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 13:54:39,687][15401] Updated weights for policy 0, policy_version 898944 (0.0032) [2024-06-25 13:54:43,337][15401] Updated weights for policy 0, policy_version 898954 (0.0031) [2024-06-25 13:54:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14728462336. Throughput: 0: 42587.0. Samples: 14728571560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 13:54:47,091][15349] Signal inference workers to stop experience collection... (218050 times) [2024-06-25 13:54:47,092][15349] Signal inference workers to resume experience collection... (218050 times) [2024-06-25 13:54:47,136][15401] InferenceWorker_p0-w0: stopping experience collection (218050 times) [2024-06-25 13:54:47,136][15401] InferenceWorker_p0-w0: resuming experience collection (218050 times) [2024-06-25 13:54:47,232][15401] Updated weights for policy 0, policy_version 898964 (0.0023) [2024-06-25 13:54:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42329.1, 300 sec: 42653.9). Total num frames: 14728658944. Throughput: 0: 42743.6. Samples: 14728826640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:48,390][15132] Avg episode reward: [(0, '0.321')] [2024-06-25 13:54:50,951][15401] Updated weights for policy 0, policy_version 898974 (0.0023) [2024-06-25 13:54:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14728871936. Throughput: 0: 42579.8. Samples: 14728954080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 13:54:54,837][15401] Updated weights for policy 0, policy_version 898984 (0.0035) [2024-06-25 13:54:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14729084928. Throughput: 0: 42539.6. Samples: 14729209860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:54:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 13:54:58,691][15401] Updated weights for policy 0, policy_version 898994 (0.0037) [2024-06-25 13:55:02,643][15401] Updated weights for policy 0, policy_version 899004 (0.0032) [2024-06-25 13:55:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14729314304. Throughput: 0: 42611.1. Samples: 14729463160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-25 13:55:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 13:55:06,282][15401] Updated weights for policy 0, policy_version 899014 (0.0034) [2024-06-25 13:55:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14729510912. Throughput: 0: 42631.4. Samples: 14729594480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 13:55:10,289][15401] Updated weights for policy 0, policy_version 899024 (0.0037) [2024-06-25 13:55:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.4). Total num frames: 14729723904. Throughput: 0: 42564.9. Samples: 14729852160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 13:55:13,994][15401] Updated weights for policy 0, policy_version 899034 (0.0033) [2024-06-25 13:55:18,050][15401] Updated weights for policy 0, policy_version 899044 (0.0031) [2024-06-25 13:55:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14729953280. Throughput: 0: 42866.2. Samples: 14730111900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:55:21,345][15401] Updated weights for policy 0, policy_version 899054 (0.0034) [2024-06-25 13:55:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 14730166272. Throughput: 0: 42836.9. Samples: 14730239220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 13:55:25,597][15401] Updated weights for policy 0, policy_version 899064 (0.0036) [2024-06-25 13:55:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14730379264. Throughput: 0: 42900.0. Samples: 14730502060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:28,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 13:55:29,098][15401] Updated weights for policy 0, policy_version 899074 (0.0031) [2024-06-25 13:55:33,104][15401] Updated weights for policy 0, policy_version 899084 (0.0031) [2024-06-25 13:55:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14730608640. Throughput: 0: 42909.2. Samples: 14730757560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 13:55:36,883][15401] Updated weights for policy 0, policy_version 899094 (0.0028) [2024-06-25 13:55:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.2). Total num frames: 14730805248. Throughput: 0: 42939.1. Samples: 14730886340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 13:55:40,766][15401] Updated weights for policy 0, policy_version 899104 (0.0031) [2024-06-25 13:55:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14731018240. Throughput: 0: 43053.2. Samples: 14731147260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 13:55:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899110_14731018240.pth... [2024-06-25 13:55:43,493][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898485_14720778240.pth [2024-06-25 13:55:44,547][15401] Updated weights for policy 0, policy_version 899114 (0.0042) [2024-06-25 13:55:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14731231232. Throughput: 0: 42980.4. Samples: 14731397280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 13:55:48,548][15401] Updated weights for policy 0, policy_version 899124 (0.0031) [2024-06-25 13:55:52,078][15401] Updated weights for policy 0, policy_version 899134 (0.0031) [2024-06-25 13:55:53,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 14731444224. Throughput: 0: 43018.8. Samples: 14731530420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:53,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 13:55:56,221][15401] Updated weights for policy 0, policy_version 899144 (0.0037) [2024-06-25 13:55:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14731657216. Throughput: 0: 43006.9. Samples: 14731787480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:55:58,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 13:55:59,871][15401] Updated weights for policy 0, policy_version 899154 (0.0038) [2024-06-25 13:56:03,392][15132] Fps is (10 sec: 44236.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14731886592. Throughput: 0: 42967.0. Samples: 14732045520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:03,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 13:56:03,678][15401] Updated weights for policy 0, policy_version 899164 (0.0034) [2024-06-25 13:56:08,010][15401] Updated weights for policy 0, policy_version 899174 (0.0040) [2024-06-25 13:56:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14732083200. Throughput: 0: 42880.2. Samples: 14732168820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:08,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 13:56:11,541][15401] Updated weights for policy 0, policy_version 899184 (0.0042) [2024-06-25 13:56:13,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 14732296192. Throughput: 0: 42607.1. Samples: 14732419480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:13,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 13:56:15,655][15401] Updated weights for policy 0, policy_version 899194 (0.0035) [2024-06-25 13:56:18,392][15132] Fps is (10 sec: 44225.8, 60 sec: 42869.8, 300 sec: 42709.5). Total num frames: 14732525568. Throughput: 0: 42562.3. Samples: 14732672960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:18,392][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 13:56:19,054][15401] Updated weights for policy 0, policy_version 899204 (0.0022) [2024-06-25 13:56:22,491][15349] Signal inference workers to stop experience collection... (218100 times) [2024-06-25 13:56:22,532][15401] InferenceWorker_p0-w0: stopping experience collection (218100 times) [2024-06-25 13:56:22,606][15349] Signal inference workers to resume experience collection... (218100 times) [2024-06-25 13:56:22,606][15401] InferenceWorker_p0-w0: resuming experience collection (218100 times) [2024-06-25 13:56:23,292][15401] Updated weights for policy 0, policy_version 899214 (0.0023) [2024-06-25 13:56:23,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14732722176. Throughput: 0: 42599.6. Samples: 14732803320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:23,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 13:56:26,596][15401] Updated weights for policy 0, policy_version 899224 (0.0049) [2024-06-25 13:56:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14732935168. Throughput: 0: 42503.7. Samples: 14733059920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 13:56:31,537][15401] Updated weights for policy 0, policy_version 899234 (0.0040) [2024-06-25 13:56:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14733164544. Throughput: 0: 42484.2. Samples: 14733309060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:33,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 13:56:34,141][15401] Updated weights for policy 0, policy_version 899244 (0.0035) [2024-06-25 13:56:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 14733328384. Throughput: 0: 42354.6. Samples: 14733436280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 13:56:39,107][15401] Updated weights for policy 0, policy_version 899254 (0.0036) [2024-06-25 13:56:41,835][15401] Updated weights for policy 0, policy_version 899264 (0.0047) [2024-06-25 13:56:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 14733574144. Throughput: 0: 42264.0. Samples: 14733689360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:43,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 13:56:46,820][15401] Updated weights for policy 0, policy_version 899274 (0.0023) [2024-06-25 13:56:48,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14733803520. Throughput: 0: 42414.2. Samples: 14733954060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 13:56:49,603][15401] Updated weights for policy 0, policy_version 899284 (0.0036) [2024-06-25 13:56:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 14733983744. Throughput: 0: 42526.9. Samples: 14734082540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 13:56:54,503][15401] Updated weights for policy 0, policy_version 899294 (0.0036) [2024-06-25 13:56:57,420][15401] Updated weights for policy 0, policy_version 899304 (0.0024) [2024-06-25 13:56:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14734213120. Throughput: 0: 42510.3. Samples: 14734332340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:56:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 13:57:02,153][15401] Updated weights for policy 0, policy_version 899314 (0.0031) [2024-06-25 13:57:03,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 14734442496. Throughput: 0: 42486.2. Samples: 14734584740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:03,391][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 13:57:05,465][15401] Updated weights for policy 0, policy_version 899324 (0.0040) [2024-06-25 13:57:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14734639104. Throughput: 0: 42600.4. Samples: 14734720340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 13:57:09,543][15401] Updated weights for policy 0, policy_version 899334 (0.0030) [2024-06-25 13:57:13,147][15401] Updated weights for policy 0, policy_version 899344 (0.0033) [2024-06-25 13:57:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 14734868480. Throughput: 0: 42566.5. Samples: 14734975420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 13:57:17,065][15401] Updated weights for policy 0, policy_version 899354 (0.0027) [2024-06-25 13:57:18,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 14735097856. Throughput: 0: 42767.4. Samples: 14735233600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 13:57:20,641][15401] Updated weights for policy 0, policy_version 899364 (0.0037) [2024-06-25 13:57:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14735294464. Throughput: 0: 42897.7. Samples: 14735366680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 13:57:25,026][15401] Updated weights for policy 0, policy_version 899374 (0.0027) [2024-06-25 13:57:28,253][15401] Updated weights for policy 0, policy_version 899384 (0.0036) [2024-06-25 13:57:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14735523840. Throughput: 0: 42856.9. Samples: 14735617920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 13:57:32,632][15401] Updated weights for policy 0, policy_version 899394 (0.0045) [2024-06-25 13:57:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14735720448. Throughput: 0: 42681.9. Samples: 14735874740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 13:57:35,843][15401] Updated weights for policy 0, policy_version 899404 (0.0029) [2024-06-25 13:57:38,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14735917056. Throughput: 0: 42557.4. Samples: 14735997620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 13:57:40,090][15401] Updated weights for policy 0, policy_version 899414 (0.0031) [2024-06-25 13:57:43,119][15349] Signal inference workers to stop experience collection... (218150 times) [2024-06-25 13:57:43,119][15349] Signal inference workers to resume experience collection... (218150 times) [2024-06-25 13:57:43,135][15401] InferenceWorker_p0-w0: stopping experience collection (218150 times) [2024-06-25 13:57:43,140][15401] InferenceWorker_p0-w0: resuming experience collection (218150 times) [2024-06-25 13:57:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14736146432. Throughput: 0: 42798.2. Samples: 14736258260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:43,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 13:57:43,437][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899424_14736162816.pth... [2024-06-25 13:57:43,438][15401] Updated weights for policy 0, policy_version 899424 (0.0035) [2024-06-25 13:57:43,486][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000898797_14725890048.pth [2024-06-25 13:57:47,667][15401] Updated weights for policy 0, policy_version 899434 (0.0033) [2024-06-25 13:57:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14736326656. Throughput: 0: 43012.0. Samples: 14736520280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:48,392][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 13:57:50,992][15401] Updated weights for policy 0, policy_version 899444 (0.0031) [2024-06-25 13:57:53,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14736556032. Throughput: 0: 42620.5. Samples: 14736638260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 13:57:55,149][15401] Updated weights for policy 0, policy_version 899454 (0.0030) [2024-06-25 13:57:58,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14736801792. Throughput: 0: 42797.0. Samples: 14736901280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:57:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 13:57:58,482][15401] Updated weights for policy 0, policy_version 899464 (0.0041) [2024-06-25 13:58:03,044][15401] Updated weights for policy 0, policy_version 899474 (0.0038) [2024-06-25 13:58:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 14736982016. Throughput: 0: 42758.8. Samples: 14737157740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 13:58:06,472][15401] Updated weights for policy 0, policy_version 899484 (0.0039) [2024-06-25 13:58:08,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14737178624. Throughput: 0: 42492.4. Samples: 14737278840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 13:58:10,704][15401] Updated weights for policy 0, policy_version 899494 (0.0046) [2024-06-25 13:58:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14737457152. Throughput: 0: 42832.0. Samples: 14737545360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 13:58:13,923][15401] Updated weights for policy 0, policy_version 899504 (0.0034) [2024-06-25 13:58:18,245][15401] Updated weights for policy 0, policy_version 899514 (0.0035) [2024-06-25 13:58:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 14737637376. Throughput: 0: 42732.3. Samples: 14737797700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 13:58:21,402][15401] Updated weights for policy 0, policy_version 899524 (0.0049) [2024-06-25 13:58:23,390][15132] Fps is (10 sec: 36044.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14737817600. Throughput: 0: 42716.5. Samples: 14737919860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 13:58:26,487][15401] Updated weights for policy 0, policy_version 899534 (0.0040) [2024-06-25 13:58:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.6). Total num frames: 14738063360. Throughput: 0: 42705.7. Samples: 14738180020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 13:58:29,442][15401] Updated weights for policy 0, policy_version 899544 (0.0032) [2024-06-25 13:58:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14738259968. Throughput: 0: 42491.1. Samples: 14738432380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 13:58:34,070][15401] Updated weights for policy 0, policy_version 899554 (0.0038) [2024-06-25 13:58:35,701][15349] Signal inference workers to stop experience collection... (218200 times) [2024-06-25 13:58:35,743][15401] InferenceWorker_p0-w0: stopping experience collection (218200 times) [2024-06-25 13:58:35,757][15349] Signal inference workers to resume experience collection... (218200 times) [2024-06-25 13:58:35,760][15401] InferenceWorker_p0-w0: resuming experience collection (218200 times) [2024-06-25 13:58:36,995][15401] Updated weights for policy 0, policy_version 899564 (0.0028) [2024-06-25 13:58:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14738472960. Throughput: 0: 42662.6. Samples: 14738558080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 13:58:41,497][15401] Updated weights for policy 0, policy_version 899574 (0.0041) [2024-06-25 13:58:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 14738702336. Throughput: 0: 42605.3. Samples: 14738818520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:43,391][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 13:58:44,903][15401] Updated weights for policy 0, policy_version 899584 (0.0039) [2024-06-25 13:58:48,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14738915328. Throughput: 0: 42535.0. Samples: 14739071820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 13:58:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 13:58:49,610][15401] Updated weights for policy 0, policy_version 899594 (0.0033) [2024-06-25 13:58:53,339][15401] Updated weights for policy 0, policy_version 899604 (0.0035) [2024-06-25 13:58:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14739111936. Throughput: 0: 42611.5. Samples: 14739196360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:58:53,390][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 13:58:57,183][15401] Updated weights for policy 0, policy_version 899614 (0.0032) [2024-06-25 13:58:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14739324928. Throughput: 0: 42447.6. Samples: 14739455500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:58:58,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 13:59:00,980][15401] Updated weights for policy 0, policy_version 899624 (0.0030) [2024-06-25 13:59:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14739537920. Throughput: 0: 42361.7. Samples: 14739703980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:03,395][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 13:59:05,138][15401] Updated weights for policy 0, policy_version 899634 (0.0027) [2024-06-25 13:59:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14739750912. Throughput: 0: 42515.0. Samples: 14739833040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:08,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 13:59:08,633][15401] Updated weights for policy 0, policy_version 899644 (0.0039) [2024-06-25 13:59:12,695][15401] Updated weights for policy 0, policy_version 899654 (0.0038) [2024-06-25 13:59:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 14739963904. Throughput: 0: 42491.1. Samples: 14740092120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:13,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 13:59:16,236][15401] Updated weights for policy 0, policy_version 899664 (0.0037) [2024-06-25 13:59:18,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 14740176896. Throughput: 0: 42491.8. Samples: 14740344500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 13:59:20,295][15401] Updated weights for policy 0, policy_version 899674 (0.0032) [2024-06-25 13:59:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14740406272. Throughput: 0: 42430.3. Samples: 14740467440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:23,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 13:59:23,784][15401] Updated weights for policy 0, policy_version 899684 (0.0036) [2024-06-25 13:59:27,958][15401] Updated weights for policy 0, policy_version 899694 (0.0025) [2024-06-25 13:59:28,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14740619264. Throughput: 0: 42506.6. Samples: 14740731320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:28,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 13:59:31,584][15401] Updated weights for policy 0, policy_version 899704 (0.0043) [2024-06-25 13:59:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14740815872. Throughput: 0: 42441.3. Samples: 14740981680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:33,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 13:59:35,871][15401] Updated weights for policy 0, policy_version 899714 (0.0038) [2024-06-25 13:59:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14741028864. Throughput: 0: 42506.3. Samples: 14741109140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 13:59:39,323][15401] Updated weights for policy 0, policy_version 899724 (0.0028) [2024-06-25 13:59:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14741225472. Throughput: 0: 42399.1. Samples: 14741363460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:43,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 13:59:43,546][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899734_14741241856.pth... [2024-06-25 13:59:43,551][15401] Updated weights for policy 0, policy_version 899734 (0.0047) [2024-06-25 13:59:43,606][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899110_14731018240.pth [2024-06-25 13:59:47,287][15401] Updated weights for policy 0, policy_version 899744 (0.0024) [2024-06-25 13:59:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14741471232. Throughput: 0: 42415.1. Samples: 14741612660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:48,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 13:59:51,366][15401] Updated weights for policy 0, policy_version 899754 (0.0028) [2024-06-25 13:59:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 14741684224. Throughput: 0: 42609.3. Samples: 14741750460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 13:59:54,973][15401] Updated weights for policy 0, policy_version 899764 (0.0032) [2024-06-25 13:59:58,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14741880832. Throughput: 0: 42439.1. Samples: 14742001880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 13:59:58,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 13:59:58,871][15401] Updated weights for policy 0, policy_version 899774 (0.0039) [2024-06-25 14:00:01,808][15349] Signal inference workers to stop experience collection... (218250 times) [2024-06-25 14:00:01,810][15349] Signal inference workers to resume experience collection... (218250 times) [2024-06-25 14:00:01,860][15401] InferenceWorker_p0-w0: stopping experience collection (218250 times) [2024-06-25 14:00:01,860][15401] InferenceWorker_p0-w0: resuming experience collection (218250 times) [2024-06-25 14:00:02,521][15401] Updated weights for policy 0, policy_version 899784 (0.0040) [2024-06-25 14:00:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14742110208. Throughput: 0: 42539.0. Samples: 14742258760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:03,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 14:00:06,614][15401] Updated weights for policy 0, policy_version 899794 (0.0035) [2024-06-25 14:00:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14742306816. Throughput: 0: 42759.1. Samples: 14742391600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:00:09,910][15401] Updated weights for policy 0, policy_version 899804 (0.0025) [2024-06-25 14:00:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14742536192. Throughput: 0: 42519.2. Samples: 14742644680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 14:00:14,212][15401] Updated weights for policy 0, policy_version 899814 (0.0035) [2024-06-25 14:00:17,691][15401] Updated weights for policy 0, policy_version 899824 (0.0025) [2024-06-25 14:00:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14742749184. Throughput: 0: 42584.0. Samples: 14742897960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 14:00:21,741][15401] Updated weights for policy 0, policy_version 899834 (0.0042) [2024-06-25 14:00:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14742945792. Throughput: 0: 42600.9. Samples: 14743026180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 14:00:25,278][15401] Updated weights for policy 0, policy_version 899844 (0.0034) [2024-06-25 14:00:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14743158784. Throughput: 0: 42758.7. Samples: 14743287600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 14:00:29,281][15401] Updated weights for policy 0, policy_version 899854 (0.0030) [2024-06-25 14:00:32,925][15401] Updated weights for policy 0, policy_version 899864 (0.0030) [2024-06-25 14:00:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14743388160. Throughput: 0: 42696.6. Samples: 14743534000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 14:00:37,261][15401] Updated weights for policy 0, policy_version 899874 (0.0041) [2024-06-25 14:00:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14743568384. Throughput: 0: 42653.9. Samples: 14743669880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 14:00:40,477][15401] Updated weights for policy 0, policy_version 899884 (0.0035) [2024-06-25 14:00:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14743781376. Throughput: 0: 42676.9. Samples: 14743922340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 14:00:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 14:00:44,792][15401] Updated weights for policy 0, policy_version 899894 (0.0047) [2024-06-25 14:00:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42598.7). Total num frames: 14744010752. Throughput: 0: 42610.3. Samples: 14744176220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:00:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 14:00:48,455][15401] Updated weights for policy 0, policy_version 899904 (0.0024) [2024-06-25 14:00:52,429][15401] Updated weights for policy 0, policy_version 899914 (0.0040) [2024-06-25 14:00:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14744223744. Throughput: 0: 42492.5. Samples: 14744303760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:00:53,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 14:00:55,941][15401] Updated weights for policy 0, policy_version 899924 (0.0037) [2024-06-25 14:00:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 14744436736. Throughput: 0: 42457.4. Samples: 14744555260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:00:58,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 14:01:00,099][15401] Updated weights for policy 0, policy_version 899934 (0.0040) [2024-06-25 14:01:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14744666112. Throughput: 0: 42593.8. Samples: 14744814680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:03,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 14:01:03,770][15401] Updated weights for policy 0, policy_version 899944 (0.0032) [2024-06-25 14:01:08,181][15401] Updated weights for policy 0, policy_version 899954 (0.0027) [2024-06-25 14:01:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 14744862720. Throughput: 0: 42589.3. Samples: 14744942700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 14:01:11,391][15401] Updated weights for policy 0, policy_version 899964 (0.0028) [2024-06-25 14:01:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 14745075712. Throughput: 0: 42415.6. Samples: 14745196300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 14:01:15,742][15401] Updated weights for policy 0, policy_version 899974 (0.0034) [2024-06-25 14:01:16,822][15349] Signal inference workers to stop experience collection... (218300 times) [2024-06-25 14:01:16,822][15349] Signal inference workers to resume experience collection... (218300 times) [2024-06-25 14:01:16,835][15401] InferenceWorker_p0-w0: stopping experience collection (218300 times) [2024-06-25 14:01:16,835][15401] InferenceWorker_p0-w0: resuming experience collection (218300 times) [2024-06-25 14:01:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14745288704. Throughput: 0: 42746.2. Samples: 14745457580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 14:01:19,112][15401] Updated weights for policy 0, policy_version 899984 (0.0038) [2024-06-25 14:01:23,366][15401] Updated weights for policy 0, policy_version 899994 (0.0027) [2024-06-25 14:01:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14745501696. Throughput: 0: 42465.7. Samples: 14745580840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:23,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 14:01:26,986][15401] Updated weights for policy 0, policy_version 900004 (0.0023) [2024-06-25 14:01:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14745731072. Throughput: 0: 42499.9. Samples: 14745834840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 14:01:31,377][15401] Updated weights for policy 0, policy_version 900014 (0.0041) [2024-06-25 14:01:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14745911296. Throughput: 0: 42611.1. Samples: 14746093720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:33,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 14:01:34,624][15401] Updated weights for policy 0, policy_version 900024 (0.0036) [2024-06-25 14:01:38,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14746107904. Throughput: 0: 42373.4. Samples: 14746210560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:38,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 14:01:39,150][15401] Updated weights for policy 0, policy_version 900034 (0.0038) [2024-06-25 14:01:42,291][15401] Updated weights for policy 0, policy_version 900044 (0.0045) [2024-06-25 14:01:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14746353664. Throughput: 0: 42369.2. Samples: 14746461880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:43,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 14:01:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900047_14746370048.pth... [2024-06-25 14:01:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899424_14736162816.pth [2024-06-25 14:01:46,924][15401] Updated weights for policy 0, policy_version 900054 (0.0023) [2024-06-25 14:01:48,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14746550272. Throughput: 0: 42219.6. Samples: 14746714560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:48,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 14:01:50,001][15401] Updated weights for policy 0, policy_version 900064 (0.0024) [2024-06-25 14:01:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14746746880. Throughput: 0: 42195.5. Samples: 14746841500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:53,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 14:01:54,431][15401] Updated weights for policy 0, policy_version 900074 (0.0033) [2024-06-25 14:01:57,973][15401] Updated weights for policy 0, policy_version 900084 (0.0027) [2024-06-25 14:01:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14746992640. Throughput: 0: 42345.7. Samples: 14747101860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:01:58,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 14:02:01,997][15401] Updated weights for policy 0, policy_version 900094 (0.0032) [2024-06-25 14:02:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 14747172864. Throughput: 0: 42267.6. Samples: 14747359620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:03,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 14:02:05,542][15401] Updated weights for policy 0, policy_version 900104 (0.0036) [2024-06-25 14:02:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14747402240. Throughput: 0: 42175.2. Samples: 14747478720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 14:02:09,870][15401] Updated weights for policy 0, policy_version 900114 (0.0029) [2024-06-25 14:02:13,169][15401] Updated weights for policy 0, policy_version 900124 (0.0037) [2024-06-25 14:02:13,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14747648000. Throughput: 0: 42307.7. Samples: 14747738680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 14:02:17,748][15401] Updated weights for policy 0, policy_version 900134 (0.0035) [2024-06-25 14:02:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14747811840. Throughput: 0: 42116.3. Samples: 14747988960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:18,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 14:02:21,055][15401] Updated weights for policy 0, policy_version 900144 (0.0031) [2024-06-25 14:02:23,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14748057600. Throughput: 0: 42193.3. Samples: 14748109260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:23,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 14:02:25,480][15401] Updated weights for policy 0, policy_version 900154 (0.0038) [2024-06-25 14:02:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14748254208. Throughput: 0: 42477.0. Samples: 14748373340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 14:02:28,823][15401] Updated weights for policy 0, policy_version 900164 (0.0032) [2024-06-25 14:02:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14748434432. Throughput: 0: 42492.0. Samples: 14748626700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 14:02:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 14:02:33,442][15401] Updated weights for policy 0, policy_version 900174 (0.0031) [2024-06-25 14:02:36,361][15401] Updated weights for policy 0, policy_version 900184 (0.0047) [2024-06-25 14:02:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 14748696576. Throughput: 0: 42328.5. Samples: 14748746280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:02:38,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 14:02:41,189][15401] Updated weights for policy 0, policy_version 900194 (0.0029) [2024-06-25 14:02:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14748876800. Throughput: 0: 42409.9. Samples: 14749010300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:02:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 14:02:43,744][15349] Signal inference workers to stop experience collection... (218350 times) [2024-06-25 14:02:43,748][15349] Signal inference workers to resume experience collection... (218350 times) [2024-06-25 14:02:43,783][15401] InferenceWorker_p0-w0: stopping experience collection (218350 times) [2024-06-25 14:02:43,784][15401] InferenceWorker_p0-w0: resuming experience collection (218350 times) [2024-06-25 14:02:44,381][15401] Updated weights for policy 0, policy_version 900204 (0.0035) [2024-06-25 14:02:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 14749073408. Throughput: 0: 42371.5. Samples: 14749266340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:02:48,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 14:02:48,871][15401] Updated weights for policy 0, policy_version 900214 (0.0025) [2024-06-25 14:02:52,294][15401] Updated weights for policy 0, policy_version 900224 (0.0026) [2024-06-25 14:02:53,389][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 14749335552. Throughput: 0: 42350.2. Samples: 14749384480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:02:53,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 14:02:56,574][15401] Updated weights for policy 0, policy_version 900234 (0.0028) [2024-06-25 14:02:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14749515776. Throughput: 0: 42503.4. Samples: 14749651340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:02:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 14:02:59,795][15401] Updated weights for policy 0, policy_version 900244 (0.0032) [2024-06-25 14:03:03,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 14749712384. Throughput: 0: 42605.8. Samples: 14749906220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 14:03:04,150][15401] Updated weights for policy 0, policy_version 900254 (0.0025) [2024-06-25 14:03:07,391][15401] Updated weights for policy 0, policy_version 900264 (0.0033) [2024-06-25 14:03:08,392][15132] Fps is (10 sec: 47502.8, 60 sec: 43142.8, 300 sec: 42487.0). Total num frames: 14749990912. Throughput: 0: 42776.3. Samples: 14750034300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:08,392][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 14:03:11,625][15401] Updated weights for policy 0, policy_version 900274 (0.0044) [2024-06-25 14:03:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 42376.3). Total num frames: 14750138368. Throughput: 0: 42521.0. Samples: 14750286780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:13,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 14:03:15,046][15401] Updated weights for policy 0, policy_version 900284 (0.0025) [2024-06-25 14:03:18,390][15132] Fps is (10 sec: 37692.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14750367744. Throughput: 0: 42464.8. Samples: 14750537620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:18,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 14:03:19,974][15401] Updated weights for policy 0, policy_version 900294 (0.0031) [2024-06-25 14:03:22,626][15401] Updated weights for policy 0, policy_version 900304 (0.0034) [2024-06-25 14:03:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14750597120. Throughput: 0: 42808.9. Samples: 14750672680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 14:03:27,495][15401] Updated weights for policy 0, policy_version 900314 (0.0051) [2024-06-25 14:03:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14750793728. Throughput: 0: 42763.4. Samples: 14750934660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 14:03:30,365][15401] Updated weights for policy 0, policy_version 900324 (0.0043) [2024-06-25 14:03:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 14751023104. Throughput: 0: 42500.9. Samples: 14751178880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:33,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 14:03:35,256][15401] Updated weights for policy 0, policy_version 900334 (0.0022) [2024-06-25 14:03:38,198][15401] Updated weights for policy 0, policy_version 900344 (0.0052) [2024-06-25 14:03:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14751236096. Throughput: 0: 42862.2. Samples: 14751313280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:38,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 14:03:42,750][15401] Updated weights for policy 0, policy_version 900354 (0.0041) [2024-06-25 14:03:43,396][15132] Fps is (10 sec: 39295.9, 60 sec: 42320.7, 300 sec: 42375.3). Total num frames: 14751416320. Throughput: 0: 42548.7. Samples: 14751566300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:43,397][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 14:03:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900356_14751432704.pth... [2024-06-25 14:03:43,594][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000899734_14741241856.pth [2024-06-25 14:03:45,989][15401] Updated weights for policy 0, policy_version 900364 (0.0035) [2024-06-25 14:03:48,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42598.1). Total num frames: 14751678464. Throughput: 0: 42377.3. Samples: 14751813300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:48,393][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 14:03:50,315][15401] Updated weights for policy 0, policy_version 900374 (0.0032) [2024-06-25 14:03:53,335][15349] Signal inference workers to stop experience collection... (218400 times) [2024-06-25 14:03:53,337][15349] Signal inference workers to resume experience collection... (218400 times) [2024-06-25 14:03:53,347][15401] InferenceWorker_p0-w0: stopping experience collection (218400 times) [2024-06-25 14:03:53,381][15401] InferenceWorker_p0-w0: resuming experience collection (218400 times) [2024-06-25 14:03:53,389][15132] Fps is (10 sec: 45905.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14751875072. Throughput: 0: 42609.0. Samples: 14751951600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:53,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 14:03:53,492][15401] Updated weights for policy 0, policy_version 900384 (0.0031) [2024-06-25 14:03:57,875][15401] Updated weights for policy 0, policy_version 900394 (0.0039) [2024-06-25 14:03:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14752071680. Throughput: 0: 42733.3. Samples: 14752209780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:03:58,394][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 14:04:01,134][15401] Updated weights for policy 0, policy_version 900404 (0.0044) [2024-06-25 14:04:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 14752317440. Throughput: 0: 42656.4. Samples: 14752457160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 14:04:05,483][15401] Updated weights for policy 0, policy_version 900414 (0.0021) [2024-06-25 14:04:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41780.8, 300 sec: 42487.3). Total num frames: 14752497664. Throughput: 0: 42596.9. Samples: 14752589540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:08,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 14:04:09,275][15401] Updated weights for policy 0, policy_version 900424 (0.0033) [2024-06-25 14:04:13,168][15401] Updated weights for policy 0, policy_version 900434 (0.0033) [2024-06-25 14:04:13,391][15132] Fps is (10 sec: 39316.3, 60 sec: 42870.4, 300 sec: 42487.1). Total num frames: 14752710656. Throughput: 0: 42227.6. Samples: 14752834960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:13,391][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 14:04:17,053][15401] Updated weights for policy 0, policy_version 900444 (0.0034) [2024-06-25 14:04:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 14752940032. Throughput: 0: 42404.0. Samples: 14753087060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 14:04:21,130][15401] Updated weights for policy 0, policy_version 900454 (0.0034) [2024-06-25 14:04:23,389][15132] Fps is (10 sec: 40966.2, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 14753120256. Throughput: 0: 42378.8. Samples: 14753220320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 14:04:24,581][15401] Updated weights for policy 0, policy_version 900464 (0.0043) [2024-06-25 14:04:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14753349632. Throughput: 0: 42271.5. Samples: 14753468240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-25 14:04:28,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 14:04:28,682][15401] Updated weights for policy 0, policy_version 900474 (0.0029) [2024-06-25 14:04:32,342][15401] Updated weights for policy 0, policy_version 900484 (0.0028) [2024-06-25 14:04:33,391][15132] Fps is (10 sec: 45868.0, 60 sec: 42597.3, 300 sec: 42542.6). Total num frames: 14753579008. Throughput: 0: 42535.6. Samples: 14753727360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:33,391][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 14:04:36,692][15401] Updated weights for policy 0, policy_version 900494 (0.0044) [2024-06-25 14:04:38,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 14753759232. Throughput: 0: 42345.2. Samples: 14753857240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:38,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 14:04:40,156][15401] Updated weights for policy 0, policy_version 900504 (0.0038) [2024-06-25 14:04:43,390][15132] Fps is (10 sec: 42604.2, 60 sec: 43149.1, 300 sec: 42487.3). Total num frames: 14754004992. Throughput: 0: 42160.3. Samples: 14754107000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:43,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 14:04:44,233][15401] Updated weights for policy 0, policy_version 900514 (0.0021) [2024-06-25 14:04:47,744][15401] Updated weights for policy 0, policy_version 900524 (0.0041) [2024-06-25 14:04:48,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 14754217984. Throughput: 0: 42465.7. Samples: 14754368120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:48,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 14:04:51,835][15401] Updated weights for policy 0, policy_version 900534 (0.0035) [2024-06-25 14:04:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 14754381824. Throughput: 0: 42213.8. Samples: 14754489160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 14:04:55,685][15401] Updated weights for policy 0, policy_version 900544 (0.0044) [2024-06-25 14:04:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 14754627584. Throughput: 0: 42318.7. Samples: 14754739240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:04:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 14:04:59,348][15401] Updated weights for policy 0, policy_version 900554 (0.0040) [2024-06-25 14:05:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14754840576. Throughput: 0: 42544.8. Samples: 14755001580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 14:05:03,391][15401] Updated weights for policy 0, policy_version 900564 (0.0039) [2024-06-25 14:05:06,957][15401] Updated weights for policy 0, policy_version 900574 (0.0036) [2024-06-25 14:05:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 14755037184. Throughput: 0: 42354.2. Samples: 14755126260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:08,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 14:05:11,197][15401] Updated weights for policy 0, policy_version 900584 (0.0037) [2024-06-25 14:05:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42326.3, 300 sec: 42376.2). Total num frames: 14755250176. Throughput: 0: 42434.5. Samples: 14755377800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:13,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 14:05:15,166][15401] Updated weights for policy 0, policy_version 900594 (0.0030) [2024-06-25 14:05:16,429][15349] Signal inference workers to stop experience collection... (218450 times) [2024-06-25 14:05:16,485][15401] InferenceWorker_p0-w0: stopping experience collection (218450 times) [2024-06-25 14:05:16,551][15349] Signal inference workers to resume experience collection... (218450 times) [2024-06-25 14:05:16,551][15401] InferenceWorker_p0-w0: resuming experience collection (218450 times) [2024-06-25 14:05:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14755463168. Throughput: 0: 42565.7. Samples: 14755642760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:18,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 14:05:18,877][15401] Updated weights for policy 0, policy_version 900604 (0.0039) [2024-06-25 14:05:22,746][15401] Updated weights for policy 0, policy_version 900614 (0.0035) [2024-06-25 14:05:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 14755676160. Throughput: 0: 42444.5. Samples: 14755767140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 14:05:26,347][15401] Updated weights for policy 0, policy_version 900624 (0.0026) [2024-06-25 14:05:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 14755921920. Throughput: 0: 42493.3. Samples: 14756019200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 14:05:30,552][15401] Updated weights for policy 0, policy_version 900634 (0.0029) [2024-06-25 14:05:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42053.3, 300 sec: 42487.3). Total num frames: 14756102144. Throughput: 0: 42458.4. Samples: 14756278740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 14:05:34,321][15401] Updated weights for policy 0, policy_version 900644 (0.0033) [2024-06-25 14:05:38,164][15401] Updated weights for policy 0, policy_version 900654 (0.0037) [2024-06-25 14:05:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.1, 300 sec: 42542.8). Total num frames: 14756331520. Throughput: 0: 42460.3. Samples: 14756399880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:38,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 14:05:42,003][15401] Updated weights for policy 0, policy_version 900664 (0.0033) [2024-06-25 14:05:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 14756544512. Throughput: 0: 42633.4. Samples: 14756657740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:43,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 14:05:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900669_14756560896.pth... [2024-06-25 14:05:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900047_14746370048.pth [2024-06-25 14:05:45,662][15401] Updated weights for policy 0, policy_version 900674 (0.0034) [2024-06-25 14:05:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 14756741120. Throughput: 0: 42525.9. Samples: 14756915240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 14:05:49,591][15401] Updated weights for policy 0, policy_version 900684 (0.0043) [2024-06-25 14:05:53,261][15401] Updated weights for policy 0, policy_version 900694 (0.0024) [2024-06-25 14:05:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 14756970496. Throughput: 0: 42490.6. Samples: 14757038340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 14:05:57,317][15401] Updated weights for policy 0, policy_version 900704 (0.0037) [2024-06-25 14:05:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 14757167104. Throughput: 0: 42652.5. Samples: 14757297160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:05:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 14:06:01,377][15401] Updated weights for policy 0, policy_version 900714 (0.0032) [2024-06-25 14:06:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14757380096. Throughput: 0: 42416.5. Samples: 14757551500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:06:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:06:05,062][15401] Updated weights for policy 0, policy_version 900724 (0.0032) [2024-06-25 14:06:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 14757609472. Throughput: 0: 42482.6. Samples: 14757678860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:06:08,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 14:06:08,949][15401] Updated weights for policy 0, policy_version 900734 (0.0031) [2024-06-25 14:06:13,055][15401] Updated weights for policy 0, policy_version 900744 (0.0032) [2024-06-25 14:06:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 14757806080. Throughput: 0: 42548.1. Samples: 14757933860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:06:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 14:06:16,638][15401] Updated weights for policy 0, policy_version 900754 (0.0034) [2024-06-25 14:06:18,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42431.4). Total num frames: 14758019072. Throughput: 0: 42203.5. Samples: 14758178000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:06:18,392][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 14:06:20,840][15401] Updated weights for policy 0, policy_version 900764 (0.0037) [2024-06-25 14:06:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 14758248448. Throughput: 0: 42304.1. Samples: 14758303560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 14:06:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 14:06:24,368][15401] Updated weights for policy 0, policy_version 900774 (0.0047) [2024-06-25 14:06:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 41506.3, 300 sec: 42376.2). Total num frames: 14758412288. Throughput: 0: 42296.9. Samples: 14758561100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:28,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 14:06:28,566][15401] Updated weights for policy 0, policy_version 900784 (0.0029) [2024-06-25 14:06:31,858][15401] Updated weights for policy 0, policy_version 900794 (0.0039) [2024-06-25 14:06:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14758641664. Throughput: 0: 42232.5. Samples: 14758815700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 14:06:36,328][15401] Updated weights for policy 0, policy_version 900804 (0.0056) [2024-06-25 14:06:38,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14758871040. Throughput: 0: 42366.5. Samples: 14758944840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:38,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 14:06:39,502][15401] Updated weights for policy 0, policy_version 900814 (0.0024) [2024-06-25 14:06:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 14759051264. Throughput: 0: 42284.9. Samples: 14759199980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 14:06:44,110][15401] Updated weights for policy 0, policy_version 900824 (0.0039) [2024-06-25 14:06:48,119][15401] Updated weights for policy 0, policy_version 900834 (0.0043) [2024-06-25 14:06:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14759280640. Throughput: 0: 42163.1. Samples: 14759448840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 14:06:51,848][15401] Updated weights for policy 0, policy_version 900844 (0.0034) [2024-06-25 14:06:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14759510016. Throughput: 0: 42131.1. Samples: 14759574760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 14:06:55,604][15401] Updated weights for policy 0, policy_version 900854 (0.0026) [2024-06-25 14:06:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.5, 300 sec: 42431.4). Total num frames: 14759690240. Throughput: 0: 42140.4. Samples: 14759830280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:06:58,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 14:06:59,788][15401] Updated weights for policy 0, policy_version 900864 (0.0041) [2024-06-25 14:07:03,040][15401] Updated weights for policy 0, policy_version 900874 (0.0032) [2024-06-25 14:07:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 14759936000. Throughput: 0: 42382.2. Samples: 14760085100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:03,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 14:07:07,431][15401] Updated weights for policy 0, policy_version 900884 (0.0027) [2024-06-25 14:07:08,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 14760148992. Throughput: 0: 42542.1. Samples: 14760217960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 14:07:10,016][15349] Signal inference workers to stop experience collection... (218500 times) [2024-06-25 14:07:10,017][15349] Signal inference workers to resume experience collection... (218500 times) [2024-06-25 14:07:10,033][15401] InferenceWorker_p0-w0: stopping experience collection (218500 times) [2024-06-25 14:07:10,033][15401] InferenceWorker_p0-w0: resuming experience collection (218500 times) [2024-06-25 14:07:10,483][15401] Updated weights for policy 0, policy_version 900894 (0.0033) [2024-06-25 14:07:13,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 14760345600. Throughput: 0: 42465.4. Samples: 14760472040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 14:07:15,046][15401] Updated weights for policy 0, policy_version 900904 (0.0031) [2024-06-25 14:07:18,251][15401] Updated weights for policy 0, policy_version 900914 (0.0032) [2024-06-25 14:07:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42431.8). Total num frames: 14760574976. Throughput: 0: 42522.2. Samples: 14760729200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:18,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 14:07:22,571][15401] Updated weights for policy 0, policy_version 900924 (0.0037) [2024-06-25 14:07:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14760787968. Throughput: 0: 42649.4. Samples: 14760864060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 14:07:25,823][15401] Updated weights for policy 0, policy_version 900934 (0.0040) [2024-06-25 14:07:28,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14761000960. Throughput: 0: 42695.0. Samples: 14761121260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:28,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 14:07:30,144][15401] Updated weights for policy 0, policy_version 900944 (0.0033) [2024-06-25 14:07:33,272][15401] Updated weights for policy 0, policy_version 900954 (0.0037) [2024-06-25 14:07:33,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 14761246720. Throughput: 0: 42920.1. Samples: 14761380240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 14:07:37,703][15401] Updated weights for policy 0, policy_version 900964 (0.0039) [2024-06-25 14:07:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 14761426944. Throughput: 0: 43118.7. Samples: 14761515100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:38,390][15132] Avg episode reward: [(0, '0.219')] [2024-06-25 14:07:40,898][15401] Updated weights for policy 0, policy_version 900974 (0.0036) [2024-06-25 14:07:43,390][15132] Fps is (10 sec: 39320.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14761639936. Throughput: 0: 42984.9. Samples: 14761764500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:43,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 14:07:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900979_14761639936.pth... [2024-06-25 14:07:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900356_14751432704.pth [2024-06-25 14:07:45,254][15401] Updated weights for policy 0, policy_version 900984 (0.0052) [2024-06-25 14:07:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 14761869312. Throughput: 0: 42950.0. Samples: 14762017840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 14:07:48,408][15401] Updated weights for policy 0, policy_version 900994 (0.0034) [2024-06-25 14:07:52,827][15401] Updated weights for policy 0, policy_version 901004 (0.0039) [2024-06-25 14:07:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14762082304. Throughput: 0: 42956.0. Samples: 14762150980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 14:07:56,192][15401] Updated weights for policy 0, policy_version 901014 (0.0036) [2024-06-25 14:07:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 14762278912. Throughput: 0: 42977.2. Samples: 14762406020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:07:58,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 14:08:00,543][15401] Updated weights for policy 0, policy_version 901024 (0.0036) [2024-06-25 14:08:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42376.6). Total num frames: 14762491904. Throughput: 0: 42889.8. Samples: 14762659240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:08:03,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 14:08:03,814][15401] Updated weights for policy 0, policy_version 901034 (0.0027) [2024-06-25 14:08:08,118][15401] Updated weights for policy 0, policy_version 901044 (0.0034) [2024-06-25 14:08:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14762704896. Throughput: 0: 42752.0. Samples: 14762787900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:08:08,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 14:08:11,453][15401] Updated weights for policy 0, policy_version 901054 (0.0026) [2024-06-25 14:08:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14762917888. Throughput: 0: 42604.1. Samples: 14763038440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:08:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 14:08:16,007][15401] Updated weights for policy 0, policy_version 901064 (0.0035) [2024-06-25 14:08:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14763130880. Throughput: 0: 42584.9. Samples: 14763296560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 14:08:18,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 14:08:19,190][15401] Updated weights for policy 0, policy_version 901074 (0.0033) [2024-06-25 14:08:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14763327488. Throughput: 0: 42367.6. Samples: 14763421640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 14:08:24,403][15401] Updated weights for policy 0, policy_version 901084 (0.0039) [2024-06-25 14:08:24,648][15349] Signal inference workers to stop experience collection... (218550 times) [2024-06-25 14:08:24,675][15401] InferenceWorker_p0-w0: stopping experience collection (218550 times) [2024-06-25 14:08:24,707][15349] Signal inference workers to resume experience collection... (218550 times) [2024-06-25 14:08:24,708][15401] InferenceWorker_p0-w0: resuming experience collection (218550 times) [2024-06-25 14:08:26,920][15401] Updated weights for policy 0, policy_version 901094 (0.0043) [2024-06-25 14:08:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14763556864. Throughput: 0: 42365.5. Samples: 14763670940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:28,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 14:08:31,936][15401] Updated weights for policy 0, policy_version 901104 (0.0033) [2024-06-25 14:08:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 14763786240. Throughput: 0: 42598.1. Samples: 14763934860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:33,392][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 14:08:34,650][15401] Updated weights for policy 0, policy_version 901114 (0.0037) [2024-06-25 14:08:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 14763982848. Throughput: 0: 42539.5. Samples: 14764065260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 14:08:39,506][15401] Updated weights for policy 0, policy_version 901124 (0.0043) [2024-06-25 14:08:42,292][15401] Updated weights for policy 0, policy_version 901134 (0.0030) [2024-06-25 14:08:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.6, 300 sec: 42487.7). Total num frames: 14764212224. Throughput: 0: 42449.7. Samples: 14764316260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 14:08:47,008][15401] Updated weights for policy 0, policy_version 901144 (0.0036) [2024-06-25 14:08:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 14764408832. Throughput: 0: 42586.5. Samples: 14764575640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 14:08:49,898][15401] Updated weights for policy 0, policy_version 901154 (0.0031) [2024-06-25 14:08:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14764621824. Throughput: 0: 42570.2. Samples: 14764703560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 14:08:54,647][15401] Updated weights for policy 0, policy_version 901164 (0.0034) [2024-06-25 14:08:58,019][15401] Updated weights for policy 0, policy_version 901174 (0.0028) [2024-06-25 14:08:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 14764851200. Throughput: 0: 42678.7. Samples: 14764958980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:08:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 14:09:02,502][15401] Updated weights for policy 0, policy_version 901184 (0.0031) [2024-06-25 14:09:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.2, 300 sec: 42542.9). Total num frames: 14765047808. Throughput: 0: 42662.4. Samples: 14765216380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 14:09:05,702][15401] Updated weights for policy 0, policy_version 901194 (0.0040) [2024-06-25 14:09:08,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42596.7, 300 sec: 42542.7). Total num frames: 14765260800. Throughput: 0: 42590.1. Samples: 14765338300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:08,392][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 14:09:10,213][15401] Updated weights for policy 0, policy_version 901204 (0.0042) [2024-06-25 14:09:13,232][15401] Updated weights for policy 0, policy_version 901214 (0.0051) [2024-06-25 14:09:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14765490176. Throughput: 0: 42892.9. Samples: 14765601120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:13,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 14:09:17,853][15401] Updated weights for policy 0, policy_version 901224 (0.0047) [2024-06-25 14:09:18,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 14765686784. Throughput: 0: 42587.1. Samples: 14765851280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:18,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 14:09:21,325][15401] Updated weights for policy 0, policy_version 901234 (0.0032) [2024-06-25 14:09:23,392][15132] Fps is (10 sec: 39310.4, 60 sec: 42596.4, 300 sec: 42486.9). Total num frames: 14765883392. Throughput: 0: 42413.5. Samples: 14765973980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:23,393][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 14:09:25,670][15401] Updated weights for policy 0, policy_version 901244 (0.0035) [2024-06-25 14:09:28,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42543.1). Total num frames: 14766129152. Throughput: 0: 42596.3. Samples: 14766233100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 14:09:28,694][15401] Updated weights for policy 0, policy_version 901254 (0.0042) [2024-06-25 14:09:33,229][15401] Updated weights for policy 0, policy_version 901264 (0.0035) [2024-06-25 14:09:33,389][15132] Fps is (10 sec: 42610.4, 60 sec: 42054.0, 300 sec: 42543.2). Total num frames: 14766309376. Throughput: 0: 42723.2. Samples: 14766498180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 14:09:36,181][15401] Updated weights for policy 0, policy_version 901274 (0.0032) [2024-06-25 14:09:37,123][15349] Signal inference workers to stop experience collection... (218600 times) [2024-06-25 14:09:37,124][15349] Signal inference workers to resume experience collection... (218600 times) [2024-06-25 14:09:37,168][15401] InferenceWorker_p0-w0: stopping experience collection (218600 times) [2024-06-25 14:09:37,168][15401] InferenceWorker_p0-w0: resuming experience collection (218600 times) [2024-06-25 14:09:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14766538752. Throughput: 0: 42445.4. Samples: 14766613600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:38,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 14:09:40,838][15401] Updated weights for policy 0, policy_version 901284 (0.0030) [2024-06-25 14:09:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14766751744. Throughput: 0: 42631.0. Samples: 14766877380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 14:09:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901291_14766751744.pth... [2024-06-25 14:09:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900669_14756560896.pth [2024-06-25 14:09:44,171][15401] Updated weights for policy 0, policy_version 901294 (0.0035) [2024-06-25 14:09:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14766948352. Throughput: 0: 42694.4. Samples: 14767137620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 14:09:48,643][15401] Updated weights for policy 0, policy_version 901304 (0.0033) [2024-06-25 14:09:51,947][15401] Updated weights for policy 0, policy_version 901314 (0.0022) [2024-06-25 14:09:53,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 14767194112. Throughput: 0: 42772.0. Samples: 14767263040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:53,393][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 14:09:56,409][15401] Updated weights for policy 0, policy_version 901324 (0.0044) [2024-06-25 14:09:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 14767390720. Throughput: 0: 42470.5. Samples: 14767512300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:09:58,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 14:09:59,671][15401] Updated weights for policy 0, policy_version 901334 (0.0021) [2024-06-25 14:10:03,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14767603712. Throughput: 0: 42699.2. Samples: 14767772640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:10:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 14:10:03,986][15401] Updated weights for policy 0, policy_version 901344 (0.0033) [2024-06-25 14:10:07,439][15401] Updated weights for policy 0, policy_version 901354 (0.0046) [2024-06-25 14:10:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 14767816704. Throughput: 0: 42791.1. Samples: 14767899460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:10:08,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 14:10:11,692][15401] Updated weights for policy 0, policy_version 901364 (0.0049) [2024-06-25 14:10:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14768029696. Throughput: 0: 42706.7. Samples: 14768154900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-25 14:10:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 14:10:14,902][15401] Updated weights for policy 0, policy_version 901374 (0.0036) [2024-06-25 14:10:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 14768242688. Throughput: 0: 42502.3. Samples: 14768410780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 14:10:19,339][15401] Updated weights for policy 0, policy_version 901384 (0.0034) [2024-06-25 14:10:22,564][15401] Updated weights for policy 0, policy_version 901394 (0.0044) [2024-06-25 14:10:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.6, 300 sec: 42542.9). Total num frames: 14768472064. Throughput: 0: 42756.5. Samples: 14768537640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 14:10:26,994][15401] Updated weights for policy 0, policy_version 901404 (0.0044) [2024-06-25 14:10:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14768652288. Throughput: 0: 42525.0. Samples: 14768791000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:28,390][15132] Avg episode reward: [(0, '0.878')] [2024-06-25 14:10:30,245][15401] Updated weights for policy 0, policy_version 901414 (0.0037) [2024-06-25 14:10:33,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14768865280. Throughput: 0: 42393.3. Samples: 14769045320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:33,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 14:10:34,544][15401] Updated weights for policy 0, policy_version 901424 (0.0038) [2024-06-25 14:10:38,329][15401] Updated weights for policy 0, policy_version 901434 (0.0024) [2024-06-25 14:10:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 14769094656. Throughput: 0: 42432.9. Samples: 14769172420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:38,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 14:10:42,084][15401] Updated weights for policy 0, policy_version 901444 (0.0036) [2024-06-25 14:10:43,394][15132] Fps is (10 sec: 42577.9, 60 sec: 42322.0, 300 sec: 42542.2). Total num frames: 14769291264. Throughput: 0: 42486.7. Samples: 14769424400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:43,395][15132] Avg episode reward: [(0, '0.268')] [2024-06-25 14:10:46,089][15401] Updated weights for policy 0, policy_version 901454 (0.0033) [2024-06-25 14:10:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14769520640. Throughput: 0: 42500.0. Samples: 14769685140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:48,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 14:10:49,666][15401] Updated weights for policy 0, policy_version 901464 (0.0042) [2024-06-25 14:10:53,392][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.4, 300 sec: 42598.1). Total num frames: 14769733632. Throughput: 0: 42425.7. Samples: 14769808720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:53,393][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 14:10:53,997][15401] Updated weights for policy 0, policy_version 901474 (0.0032) [2024-06-25 14:10:57,901][15401] Updated weights for policy 0, policy_version 901484 (0.0046) [2024-06-25 14:10:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14769930240. Throughput: 0: 42310.3. Samples: 14770058860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:10:58,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 14:11:01,766][15401] Updated weights for policy 0, policy_version 901494 (0.0031) [2024-06-25 14:11:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 14770143232. Throughput: 0: 42236.3. Samples: 14770311420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:03,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 14:11:04,209][15349] Signal inference workers to stop experience collection... (218650 times) [2024-06-25 14:11:04,210][15349] Signal inference workers to resume experience collection... (218650 times) [2024-06-25 14:11:04,228][15401] InferenceWorker_p0-w0: stopping experience collection (218650 times) [2024-06-25 14:11:04,228][15401] InferenceWorker_p0-w0: resuming experience collection (218650 times) [2024-06-25 14:11:05,633][15401] Updated weights for policy 0, policy_version 901504 (0.0045) [2024-06-25 14:11:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14770356224. Throughput: 0: 42214.7. Samples: 14770437300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 14:11:09,550][15401] Updated weights for policy 0, policy_version 901514 (0.0037) [2024-06-25 14:11:13,213][15401] Updated weights for policy 0, policy_version 901524 (0.0028) [2024-06-25 14:11:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 14770585600. Throughput: 0: 42277.7. Samples: 14770693500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 14:11:16,966][15401] Updated weights for policy 0, policy_version 901534 (0.0034) [2024-06-25 14:11:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14770798592. Throughput: 0: 42344.0. Samples: 14770950800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:18,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 14:11:20,754][15401] Updated weights for policy 0, policy_version 901544 (0.0037) [2024-06-25 14:11:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14771011584. Throughput: 0: 42326.4. Samples: 14771077100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 14:11:24,471][15401] Updated weights for policy 0, policy_version 901554 (0.0031) [2024-06-25 14:11:28,311][15401] Updated weights for policy 0, policy_version 901564 (0.0038) [2024-06-25 14:11:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14771224576. Throughput: 0: 42492.1. Samples: 14771336340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:28,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 14:11:32,289][15401] Updated weights for policy 0, policy_version 901574 (0.0029) [2024-06-25 14:11:33,393][15132] Fps is (10 sec: 40944.1, 60 sec: 42595.7, 300 sec: 42542.3). Total num frames: 14771421184. Throughput: 0: 42352.9. Samples: 14771591180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:33,394][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 14:11:35,831][15401] Updated weights for policy 0, policy_version 901584 (0.0039) [2024-06-25 14:11:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14771650560. Throughput: 0: 42484.6. Samples: 14771720420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 14:11:39,880][15401] Updated weights for policy 0, policy_version 901594 (0.0032) [2024-06-25 14:11:43,390][15132] Fps is (10 sec: 44253.5, 60 sec: 42874.9, 300 sec: 42653.9). Total num frames: 14771863552. Throughput: 0: 42709.8. Samples: 14771980800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 14:11:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901603_14771863552.pth... [2024-06-25 14:11:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000900979_14761639936.pth [2024-06-25 14:11:43,712][15401] Updated weights for policy 0, policy_version 901604 (0.0044) [2024-06-25 14:11:47,579][15401] Updated weights for policy 0, policy_version 901614 (0.0032) [2024-06-25 14:11:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14772060160. Throughput: 0: 42774.3. Samples: 14772236260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:48,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 14:11:51,176][15401] Updated weights for policy 0, policy_version 901624 (0.0025) [2024-06-25 14:11:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.8). Total num frames: 14772289536. Throughput: 0: 42738.2. Samples: 14772360520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:53,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 14:11:55,613][15401] Updated weights for policy 0, policy_version 901634 (0.0030) [2024-06-25 14:11:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 14772518912. Throughput: 0: 42859.5. Samples: 14772622180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:11:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 14:11:58,651][15401] Updated weights for policy 0, policy_version 901644 (0.0032) [2024-06-25 14:12:03,071][15401] Updated weights for policy 0, policy_version 901654 (0.0037) [2024-06-25 14:12:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14772715520. Throughput: 0: 42774.7. Samples: 14772875660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-25 14:12:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 14:12:06,493][15401] Updated weights for policy 0, policy_version 901664 (0.0032) [2024-06-25 14:12:08,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14772912128. Throughput: 0: 42673.3. Samples: 14772997400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:08,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 14:12:08,841][15349] Signal inference workers to stop experience collection... (218700 times) [2024-06-25 14:12:08,841][15349] Signal inference workers to resume experience collection... (218700 times) [2024-06-25 14:12:08,852][15401] InferenceWorker_p0-w0: stopping experience collection (218700 times) [2024-06-25 14:12:08,852][15401] InferenceWorker_p0-w0: resuming experience collection (218700 times) [2024-06-25 14:12:10,702][15401] Updated weights for policy 0, policy_version 901674 (0.0033) [2024-06-25 14:12:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14773141504. Throughput: 0: 42784.0. Samples: 14773261620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:13,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 14:12:14,257][15401] Updated weights for policy 0, policy_version 901684 (0.0030) [2024-06-25 14:12:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14773354496. Throughput: 0: 42671.2. Samples: 14773511220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 14:12:18,399][15401] Updated weights for policy 0, policy_version 901694 (0.0047) [2024-06-25 14:12:21,976][15401] Updated weights for policy 0, policy_version 901704 (0.0036) [2024-06-25 14:12:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14773551104. Throughput: 0: 42704.4. Samples: 14773642120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:23,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 14:12:25,959][15401] Updated weights for policy 0, policy_version 901714 (0.0023) [2024-06-25 14:12:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14773780480. Throughput: 0: 42657.3. Samples: 14773900380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:28,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 14:12:29,553][15401] Updated weights for policy 0, policy_version 901724 (0.0036) [2024-06-25 14:12:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42874.1, 300 sec: 42598.4). Total num frames: 14773993472. Throughput: 0: 42632.3. Samples: 14774154720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 14:12:33,541][15401] Updated weights for policy 0, policy_version 901734 (0.0026) [2024-06-25 14:12:37,522][15401] Updated weights for policy 0, policy_version 901744 (0.0053) [2024-06-25 14:12:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14774206464. Throughput: 0: 42820.4. Samples: 14774287440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:38,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 14:12:41,198][15401] Updated weights for policy 0, policy_version 901754 (0.0023) [2024-06-25 14:12:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14774403072. Throughput: 0: 42484.0. Samples: 14774533960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 14:12:45,351][15401] Updated weights for policy 0, policy_version 901764 (0.0030) [2024-06-25 14:12:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14774616064. Throughput: 0: 42468.9. Samples: 14774786760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:48,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 14:12:49,051][15401] Updated weights for policy 0, policy_version 901774 (0.0038) [2024-06-25 14:12:53,153][15401] Updated weights for policy 0, policy_version 901784 (0.0039) [2024-06-25 14:12:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14774845440. Throughput: 0: 42656.0. Samples: 14774916920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 14:12:56,639][15401] Updated weights for policy 0, policy_version 901794 (0.0032) [2024-06-25 14:12:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14775042048. Throughput: 0: 42382.9. Samples: 14775168840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:12:58,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 14:13:00,750][15401] Updated weights for policy 0, policy_version 901804 (0.0032) [2024-06-25 14:13:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14775271424. Throughput: 0: 42484.0. Samples: 14775423000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:03,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 14:13:04,778][15401] Updated weights for policy 0, policy_version 901814 (0.0044) [2024-06-25 14:13:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14775468032. Throughput: 0: 42544.0. Samples: 14775556600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:08,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 14:13:08,401][15401] Updated weights for policy 0, policy_version 901824 (0.0030) [2024-06-25 14:13:12,581][15401] Updated weights for policy 0, policy_version 901834 (0.0023) [2024-06-25 14:13:13,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 14775664640. Throughput: 0: 42408.4. Samples: 14775808860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:13,401][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 14:13:16,102][15401] Updated weights for policy 0, policy_version 901844 (0.0035) [2024-06-25 14:13:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14775910400. Throughput: 0: 42251.7. Samples: 14776056040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:18,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 14:13:20,143][15401] Updated weights for policy 0, policy_version 901854 (0.0031) [2024-06-25 14:13:23,390][15132] Fps is (10 sec: 45886.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14776123392. Throughput: 0: 42319.1. Samples: 14776191800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:23,395][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 14:13:23,540][15401] Updated weights for policy 0, policy_version 901864 (0.0037) [2024-06-25 14:13:28,154][15401] Updated weights for policy 0, policy_version 901874 (0.0040) [2024-06-25 14:13:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 14776320000. Throughput: 0: 42479.7. Samples: 14776445540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 14:13:31,288][15401] Updated weights for policy 0, policy_version 901884 (0.0033) [2024-06-25 14:13:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14776549376. Throughput: 0: 42581.3. Samples: 14776702920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 14:13:35,691][15401] Updated weights for policy 0, policy_version 901894 (0.0037) [2024-06-25 14:13:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14776745984. Throughput: 0: 42502.7. Samples: 14776829540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:38,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 14:13:38,969][15401] Updated weights for policy 0, policy_version 901904 (0.0037) [2024-06-25 14:13:41,022][15349] Signal inference workers to stop experience collection... (218750 times) [2024-06-25 14:13:41,023][15349] Signal inference workers to resume experience collection... (218750 times) [2024-06-25 14:13:41,040][15401] InferenceWorker_p0-w0: stopping experience collection (218750 times) [2024-06-25 14:13:41,040][15401] InferenceWorker_p0-w0: resuming experience collection (218750 times) [2024-06-25 14:13:43,179][15401] Updated weights for policy 0, policy_version 901914 (0.0033) [2024-06-25 14:13:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14776958976. Throughput: 0: 42680.0. Samples: 14777089440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:43,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 14:13:43,515][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901915_14776975360.pth... [2024-06-25 14:13:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901291_14766751744.pth [2024-06-25 14:13:47,167][15401] Updated weights for policy 0, policy_version 901924 (0.0030) [2024-06-25 14:13:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14777171968. Throughput: 0: 42595.0. Samples: 14777339780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:48,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 14:13:50,736][15401] Updated weights for policy 0, policy_version 901934 (0.0047) [2024-06-25 14:13:53,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 14777384960. Throughput: 0: 42395.4. Samples: 14777464500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:53,392][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 14:13:54,964][15401] Updated weights for policy 0, policy_version 901944 (0.0045) [2024-06-25 14:13:58,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42593.8, 300 sec: 42542.0). Total num frames: 14777597952. Throughput: 0: 42419.8. Samples: 14777717920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-25 14:13:58,397][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 14:13:58,658][15401] Updated weights for policy 0, policy_version 901954 (0.0035) [2024-06-25 14:14:02,567][15401] Updated weights for policy 0, policy_version 901964 (0.0035) [2024-06-25 14:14:03,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 14777810944. Throughput: 0: 42640.8. Samples: 14777974880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 14:14:06,575][15401] Updated weights for policy 0, policy_version 901974 (0.0024) [2024-06-25 14:14:08,396][15132] Fps is (10 sec: 42598.3, 60 sec: 42593.8, 300 sec: 42486.4). Total num frames: 14778023936. Throughput: 0: 42510.4. Samples: 14778105040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:08,396][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 14:14:10,131][15401] Updated weights for policy 0, policy_version 901984 (0.0037) [2024-06-25 14:14:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42327.1, 300 sec: 42432.1). Total num frames: 14778204160. Throughput: 0: 42411.6. Samples: 14778354060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 14:14:14,377][15401] Updated weights for policy 0, policy_version 901994 (0.0038) [2024-06-25 14:14:17,775][15401] Updated weights for policy 0, policy_version 902004 (0.0040) [2024-06-25 14:14:18,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 14778449920. Throughput: 0: 42189.8. Samples: 14778601460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:18,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 14:14:22,161][15401] Updated weights for policy 0, policy_version 902014 (0.0030) [2024-06-25 14:14:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14778662912. Throughput: 0: 42299.6. Samples: 14778733020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:23,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 14:14:25,651][15401] Updated weights for policy 0, policy_version 902024 (0.0039) [2024-06-25 14:14:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14778859520. Throughput: 0: 42234.2. Samples: 14778989980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:28,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 14:14:29,827][15401] Updated weights for policy 0, policy_version 902034 (0.0036) [2024-06-25 14:14:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14779072512. Throughput: 0: 42319.5. Samples: 14779244160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 14:14:33,465][15401] Updated weights for policy 0, policy_version 902044 (0.0035) [2024-06-25 14:14:37,237][15401] Updated weights for policy 0, policy_version 902054 (0.0037) [2024-06-25 14:14:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14779301888. Throughput: 0: 42348.5. Samples: 14779370080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:38,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 14:14:41,055][15401] Updated weights for policy 0, policy_version 902064 (0.0032) [2024-06-25 14:14:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14779514880. Throughput: 0: 42523.3. Samples: 14779631200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 14:14:44,676][15401] Updated weights for policy 0, policy_version 902074 (0.0050) [2024-06-25 14:14:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 14779727872. Throughput: 0: 42465.8. Samples: 14779885840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 14:14:48,771][15401] Updated weights for policy 0, policy_version 902084 (0.0027) [2024-06-25 14:14:52,861][15401] Updated weights for policy 0, policy_version 902094 (0.0034) [2024-06-25 14:14:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 14779940864. Throughput: 0: 42421.6. Samples: 14780013740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:53,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 14:14:56,357][15401] Updated weights for policy 0, policy_version 902104 (0.0032) [2024-06-25 14:14:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42603.0, 300 sec: 42542.9). Total num frames: 14780153856. Throughput: 0: 42621.8. Samples: 14780272040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:14:58,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 14:15:00,314][15401] Updated weights for policy 0, policy_version 902114 (0.0048) [2024-06-25 14:15:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 14780350464. Throughput: 0: 42723.2. Samples: 14780524000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 14:15:03,997][15401] Updated weights for policy 0, policy_version 902124 (0.0029) [2024-06-25 14:15:05,955][15349] Signal inference workers to stop experience collection... (218800 times) [2024-06-25 14:15:05,955][15349] Signal inference workers to resume experience collection... (218800 times) [2024-06-25 14:15:05,968][15401] InferenceWorker_p0-w0: stopping experience collection (218800 times) [2024-06-25 14:15:05,968][15401] InferenceWorker_p0-w0: resuming experience collection (218800 times) [2024-06-25 14:15:08,127][15401] Updated weights for policy 0, policy_version 902134 (0.0038) [2024-06-25 14:15:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 14780579840. Throughput: 0: 42516.4. Samples: 14780646260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 14:15:12,259][15401] Updated weights for policy 0, policy_version 902144 (0.0029) [2024-06-25 14:15:13,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 14780776448. Throughput: 0: 42552.8. Samples: 14780904960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:13,392][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 14:15:15,691][15401] Updated weights for policy 0, policy_version 902154 (0.0039) [2024-06-25 14:15:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 14780973056. Throughput: 0: 42448.9. Samples: 14781154360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 14:15:19,907][15401] Updated weights for policy 0, policy_version 902164 (0.0039) [2024-06-25 14:15:23,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14781202432. Throughput: 0: 42431.2. Samples: 14781279480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:23,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 14:15:23,415][15401] Updated weights for policy 0, policy_version 902174 (0.0025) [2024-06-25 14:15:27,690][15401] Updated weights for policy 0, policy_version 902184 (0.0035) [2024-06-25 14:15:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14781415424. Throughput: 0: 42344.1. Samples: 14781536680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:28,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 14:15:31,009][15401] Updated weights for policy 0, policy_version 902194 (0.0035) [2024-06-25 14:15:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14781612032. Throughput: 0: 42253.5. Samples: 14781787240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 14:15:35,287][15401] Updated weights for policy 0, policy_version 902204 (0.0030) [2024-06-25 14:15:38,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42323.6, 300 sec: 42543.2). Total num frames: 14781841408. Throughput: 0: 42304.0. Samples: 14781917520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:38,393][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 14:15:39,033][15401] Updated weights for policy 0, policy_version 902214 (0.0034) [2024-06-25 14:15:43,039][15401] Updated weights for policy 0, policy_version 902224 (0.0038) [2024-06-25 14:15:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14782038016. Throughput: 0: 42215.4. Samples: 14782171740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 14:15:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902224_14782038016.pth... [2024-06-25 14:15:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901603_14771863552.pth [2024-06-25 14:15:46,891][15401] Updated weights for policy 0, policy_version 902234 (0.0021) [2024-06-25 14:15:48,390][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 14782251008. Throughput: 0: 42066.6. Samples: 14782417000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 14:15:51,018][15401] Updated weights for policy 0, policy_version 902244 (0.0043) [2024-06-25 14:15:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 14782480384. Throughput: 0: 42313.6. Samples: 14782550380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 14:15:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 14:15:54,783][15401] Updated weights for policy 0, policy_version 902254 (0.0036) [2024-06-25 14:15:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 14782676992. Throughput: 0: 42189.7. Samples: 14782803400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:15:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 14:15:58,548][15401] Updated weights for policy 0, policy_version 902264 (0.0034) [2024-06-25 14:16:02,278][15401] Updated weights for policy 0, policy_version 902274 (0.0029) [2024-06-25 14:16:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 14782889984. Throughput: 0: 42279.5. Samples: 14783056940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:03,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 14:16:06,667][15401] Updated weights for policy 0, policy_version 902284 (0.0044) [2024-06-25 14:16:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14783119360. Throughput: 0: 42478.6. Samples: 14783191020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 14:16:09,745][15401] Updated weights for policy 0, policy_version 902294 (0.0037) [2024-06-25 14:16:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42053.9, 300 sec: 42376.2). Total num frames: 14783299584. Throughput: 0: 42432.7. Samples: 14783446160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 14:16:14,201][15401] Updated weights for policy 0, policy_version 902304 (0.0032) [2024-06-25 14:16:17,779][15401] Updated weights for policy 0, policy_version 902314 (0.0041) [2024-06-25 14:16:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14783545344. Throughput: 0: 42392.8. Samples: 14783694920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:18,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 14:16:21,778][15401] Updated weights for policy 0, policy_version 902324 (0.0028) [2024-06-25 14:16:23,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14783758336. Throughput: 0: 42530.8. Samples: 14783831300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 14:16:25,533][15401] Updated weights for policy 0, policy_version 902334 (0.0033) [2024-06-25 14:16:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 14783938560. Throughput: 0: 42465.5. Samples: 14784082680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:28,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 14:16:29,584][15401] Updated weights for policy 0, policy_version 902344 (0.0038) [2024-06-25 14:16:33,259][15401] Updated weights for policy 0, policy_version 902354 (0.0046) [2024-06-25 14:16:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 14784167936. Throughput: 0: 42575.6. Samples: 14784332900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 14:16:37,440][15401] Updated weights for policy 0, policy_version 902364 (0.0040) [2024-06-25 14:16:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42327.1, 300 sec: 42431.8). Total num frames: 14784380928. Throughput: 0: 42549.1. Samples: 14784465080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 14:16:38,797][15349] Signal inference workers to stop experience collection... (218850 times) [2024-06-25 14:16:38,797][15349] Signal inference workers to resume experience collection... (218850 times) [2024-06-25 14:16:38,830][15401] InferenceWorker_p0-w0: stopping experience collection (218850 times) [2024-06-25 14:16:38,830][15401] InferenceWorker_p0-w0: resuming experience collection (218850 times) [2024-06-25 14:16:40,848][15401] Updated weights for policy 0, policy_version 902374 (0.0036) [2024-06-25 14:16:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14784577536. Throughput: 0: 42397.0. Samples: 14784711260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 14:16:44,980][15401] Updated weights for policy 0, policy_version 902384 (0.0041) [2024-06-25 14:16:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 14784806912. Throughput: 0: 42524.5. Samples: 14784970640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:48,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 14:16:48,545][15401] Updated weights for policy 0, policy_version 902394 (0.0028) [2024-06-25 14:16:52,682][15401] Updated weights for policy 0, policy_version 902404 (0.0038) [2024-06-25 14:16:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 14785036288. Throughput: 0: 42547.5. Samples: 14785105660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 14:16:56,102][15401] Updated weights for policy 0, policy_version 902414 (0.0034) [2024-06-25 14:16:58,392][15132] Fps is (10 sec: 42598.5, 60 sec: 42596.8, 300 sec: 42431.4). Total num frames: 14785232896. Throughput: 0: 42414.3. Samples: 14785354900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:16:58,392][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 14:17:00,380][15401] Updated weights for policy 0, policy_version 902424 (0.0046) [2024-06-25 14:17:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14785462272. Throughput: 0: 42764.9. Samples: 14785619340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 14:17:03,665][15401] Updated weights for policy 0, policy_version 902434 (0.0027) [2024-06-25 14:17:08,029][15401] Updated weights for policy 0, policy_version 902444 (0.0033) [2024-06-25 14:17:08,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14785658880. Throughput: 0: 42553.6. Samples: 14785746220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 14:17:11,318][15401] Updated weights for policy 0, policy_version 902454 (0.0040) [2024-06-25 14:17:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 14785871872. Throughput: 0: 42496.2. Samples: 14785995020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 14:17:15,676][15401] Updated weights for policy 0, policy_version 902464 (0.0037) [2024-06-25 14:17:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14786084864. Throughput: 0: 42752.0. Samples: 14786256740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:18,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 14:17:18,927][15401] Updated weights for policy 0, policy_version 902474 (0.0027) [2024-06-25 14:17:23,142][15401] Updated weights for policy 0, policy_version 902484 (0.0039) [2024-06-25 14:17:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 14786297856. Throughput: 0: 42603.1. Samples: 14786382220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:23,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 14:17:26,872][15401] Updated weights for policy 0, policy_version 902494 (0.0035) [2024-06-25 14:17:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 14786527232. Throughput: 0: 42800.5. Samples: 14786637280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:28,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 14:17:30,625][15401] Updated weights for policy 0, policy_version 902504 (0.0032) [2024-06-25 14:17:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 14786723840. Throughput: 0: 42855.6. Samples: 14786899040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 14:17:34,567][15401] Updated weights for policy 0, policy_version 902514 (0.0032) [2024-06-25 14:17:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14786936832. Throughput: 0: 42568.5. Samples: 14787021240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:38,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 14:17:38,789][15401] Updated weights for policy 0, policy_version 902524 (0.0032) [2024-06-25 14:17:42,460][15401] Updated weights for policy 0, policy_version 902534 (0.0031) [2024-06-25 14:17:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 14787166208. Throughput: 0: 42754.7. Samples: 14787278760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:43,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 14:17:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902537_14787166208.pth... [2024-06-25 14:17:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000901915_14776975360.pth [2024-06-25 14:17:46,210][15401] Updated weights for policy 0, policy_version 902544 (0.0041) [2024-06-25 14:17:48,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 14787379200. Throughput: 0: 42487.1. Samples: 14787531260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:17:48,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 14:17:50,101][15401] Updated weights for policy 0, policy_version 902554 (0.0046) [2024-06-25 14:17:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14787575808. Throughput: 0: 42632.1. Samples: 14787664660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:17:53,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 14:17:53,741][15401] Updated weights for policy 0, policy_version 902564 (0.0044) [2024-06-25 14:17:57,631][15401] Updated weights for policy 0, policy_version 902574 (0.0032) [2024-06-25 14:17:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 14787805184. Throughput: 0: 42836.1. Samples: 14787922640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:17:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 14:18:01,425][15401] Updated weights for policy 0, policy_version 902584 (0.0040) [2024-06-25 14:18:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14788034560. Throughput: 0: 42572.0. Samples: 14788172480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:03,392][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 14:18:05,318][15401] Updated weights for policy 0, policy_version 902594 (0.0037) [2024-06-25 14:18:07,124][15349] Signal inference workers to stop experience collection... (218900 times) [2024-06-25 14:18:07,153][15401] InferenceWorker_p0-w0: stopping experience collection (218900 times) [2024-06-25 14:18:07,183][15349] Signal inference workers to resume experience collection... (218900 times) [2024-06-25 14:18:07,184][15401] InferenceWorker_p0-w0: resuming experience collection (218900 times) [2024-06-25 14:18:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 14788214784. Throughput: 0: 42788.1. Samples: 14788307680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:08,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 14:18:09,443][15401] Updated weights for policy 0, policy_version 902604 (0.0033) [2024-06-25 14:18:13,287][15401] Updated weights for policy 0, policy_version 902614 (0.0047) [2024-06-25 14:18:13,398][15132] Fps is (10 sec: 39289.8, 60 sec: 42592.8, 300 sec: 42430.6). Total num frames: 14788427776. Throughput: 0: 42759.0. Samples: 14788561780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:13,398][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 14:18:17,151][15401] Updated weights for policy 0, policy_version 902624 (0.0029) [2024-06-25 14:18:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 14788673536. Throughput: 0: 42407.6. Samples: 14788807380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 14:18:21,019][15401] Updated weights for policy 0, policy_version 902634 (0.0048) [2024-06-25 14:18:23,389][15132] Fps is (10 sec: 44272.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14788870144. Throughput: 0: 42770.7. Samples: 14788945920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:23,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 14:18:24,666][15401] Updated weights for policy 0, policy_version 902644 (0.0034) [2024-06-25 14:18:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 14789050368. Throughput: 0: 42688.9. Samples: 14789199760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 14:18:28,718][15401] Updated weights for policy 0, policy_version 902654 (0.0037) [2024-06-25 14:18:32,325][15401] Updated weights for policy 0, policy_version 902664 (0.0036) [2024-06-25 14:18:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14789312512. Throughput: 0: 42634.7. Samples: 14789449820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 14:18:36,352][15401] Updated weights for policy 0, policy_version 902674 (0.0036) [2024-06-25 14:18:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14789509120. Throughput: 0: 42753.3. Samples: 14789588560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 14:18:39,967][15401] Updated weights for policy 0, policy_version 902684 (0.0024) [2024-06-25 14:18:43,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14789689344. Throughput: 0: 42477.3. Samples: 14789834120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 14:18:44,345][15401] Updated weights for policy 0, policy_version 902694 (0.0034) [2024-06-25 14:18:47,547][15401] Updated weights for policy 0, policy_version 902704 (0.0039) [2024-06-25 14:18:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 14789967872. Throughput: 0: 42660.9. Samples: 14790092220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 14:18:51,819][15401] Updated weights for policy 0, policy_version 902714 (0.0029) [2024-06-25 14:18:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42543.8). Total num frames: 14790148096. Throughput: 0: 42826.6. Samples: 14790234880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:53,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:18:55,131][15401] Updated weights for policy 0, policy_version 902724 (0.0032) [2024-06-25 14:18:58,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14790344704. Throughput: 0: 42652.1. Samples: 14790480780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:18:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 14:18:59,399][15401] Updated weights for policy 0, policy_version 902734 (0.0026) [2024-06-25 14:19:02,848][15401] Updated weights for policy 0, policy_version 902744 (0.0041) [2024-06-25 14:19:03,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 14790606848. Throughput: 0: 42940.5. Samples: 14790739700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:19:07,134][15401] Updated weights for policy 0, policy_version 902754 (0.0039) [2024-06-25 14:19:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42598.0). Total num frames: 14790770688. Throughput: 0: 42823.0. Samples: 14790873060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:08,392][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 14:19:10,398][15401] Updated weights for policy 0, policy_version 902764 (0.0048) [2024-06-25 14:19:10,979][15349] Signal inference workers to stop experience collection... (218950 times) [2024-06-25 14:19:10,999][15401] InferenceWorker_p0-w0: stopping experience collection (218950 times) [2024-06-25 14:19:11,039][15349] Signal inference workers to resume experience collection... (218950 times) [2024-06-25 14:19:11,040][15401] InferenceWorker_p0-w0: resuming experience collection (218950 times) [2024-06-25 14:19:13,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42877.2, 300 sec: 42542.9). Total num frames: 14791000064. Throughput: 0: 42657.7. Samples: 14791119360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 14:19:14,543][15401] Updated weights for policy 0, policy_version 902774 (0.0031) [2024-06-25 14:19:18,162][15401] Updated weights for policy 0, policy_version 902784 (0.0037) [2024-06-25 14:19:18,392][15132] Fps is (10 sec: 45875.4, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 14791229440. Throughput: 0: 42945.3. Samples: 14791382460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:18,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 14:19:22,109][15401] Updated weights for policy 0, policy_version 902794 (0.0040) [2024-06-25 14:19:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14791409664. Throughput: 0: 42674.3. Samples: 14791508900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 14:19:25,885][15401] Updated weights for policy 0, policy_version 902804 (0.0033) [2024-06-25 14:19:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14791655424. Throughput: 0: 42792.1. Samples: 14791759760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 14:19:29,771][15401] Updated weights for policy 0, policy_version 902814 (0.0025) [2024-06-25 14:19:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 14791852032. Throughput: 0: 42857.2. Samples: 14792020800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 14:19:33,705][15401] Updated weights for policy 0, policy_version 902824 (0.0024) [2024-06-25 14:19:37,262][15401] Updated weights for policy 0, policy_version 902834 (0.0033) [2024-06-25 14:19:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14792048640. Throughput: 0: 42385.3. Samples: 14792142220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 14:19:41,289][15401] Updated weights for policy 0, policy_version 902844 (0.0035) [2024-06-25 14:19:43,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43415.9, 300 sec: 42598.1). Total num frames: 14792294400. Throughput: 0: 42778.6. Samples: 14792405920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:19:43,393][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 14:19:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902850_14792294400.pth... [2024-06-25 14:19:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902224_14782038016.pth [2024-06-25 14:19:44,867][15401] Updated weights for policy 0, policy_version 902854 (0.0027) [2024-06-25 14:19:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14792491008. Throughput: 0: 42678.7. Samples: 14792660240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:19:48,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 14:19:48,923][15401] Updated weights for policy 0, policy_version 902864 (0.0026) [2024-06-25 14:19:52,385][15401] Updated weights for policy 0, policy_version 902874 (0.0042) [2024-06-25 14:19:53,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 14792704000. Throughput: 0: 42465.8. Samples: 14792783920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:19:53,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 14:19:56,647][15401] Updated weights for policy 0, policy_version 902884 (0.0024) [2024-06-25 14:19:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14792949760. Throughput: 0: 42724.5. Samples: 14793041960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:19:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 14:20:00,298][15401] Updated weights for policy 0, policy_version 902894 (0.0024) [2024-06-25 14:20:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14793129984. Throughput: 0: 42718.2. Samples: 14793304680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 14:20:04,292][15401] Updated weights for policy 0, policy_version 902904 (0.0039) [2024-06-25 14:20:07,750][15401] Updated weights for policy 0, policy_version 902914 (0.0033) [2024-06-25 14:20:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42598.7). Total num frames: 14793342976. Throughput: 0: 42591.5. Samples: 14793425520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:08,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 14:20:11,872][15401] Updated weights for policy 0, policy_version 902924 (0.0040) [2024-06-25 14:20:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14793572352. Throughput: 0: 42853.3. Samples: 14793688160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:13,402][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 14:20:15,495][15401] Updated weights for policy 0, policy_version 902934 (0.0031) [2024-06-25 14:20:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 14793768960. Throughput: 0: 42716.2. Samples: 14793943020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:18,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 14:20:19,479][15401] Updated weights for policy 0, policy_version 902944 (0.0032) [2024-06-25 14:20:20,626][15349] Signal inference workers to stop experience collection... (219000 times) [2024-06-25 14:20:20,626][15349] Signal inference workers to resume experience collection... (219000 times) [2024-06-25 14:20:20,660][15401] InferenceWorker_p0-w0: stopping experience collection (219000 times) [2024-06-25 14:20:20,660][15401] InferenceWorker_p0-w0: resuming experience collection (219000 times) [2024-06-25 14:20:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14793981952. Throughput: 0: 42847.9. Samples: 14794070380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:23,394][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 14:20:23,696][15401] Updated weights for policy 0, policy_version 902954 (0.0025) [2024-06-25 14:20:26,965][15401] Updated weights for policy 0, policy_version 902964 (0.0034) [2024-06-25 14:20:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14794227712. Throughput: 0: 42853.8. Samples: 14794334240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 14:20:31,122][15401] Updated weights for policy 0, policy_version 902974 (0.0036) [2024-06-25 14:20:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 14794424320. Throughput: 0: 42982.6. Samples: 14794594460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:33,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 14:20:34,400][15401] Updated weights for policy 0, policy_version 902984 (0.0027) [2024-06-25 14:20:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14794637312. Throughput: 0: 43087.1. Samples: 14794722840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:38,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 14:20:38,583][15401] Updated weights for policy 0, policy_version 902994 (0.0037) [2024-06-25 14:20:42,090][15401] Updated weights for policy 0, policy_version 903004 (0.0039) [2024-06-25 14:20:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14794866688. Throughput: 0: 43046.2. Samples: 14794979040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:43,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 14:20:46,436][15401] Updated weights for policy 0, policy_version 903014 (0.0037) [2024-06-25 14:20:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14795096064. Throughput: 0: 43009.3. Samples: 14795240100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 14:20:49,745][15401] Updated weights for policy 0, policy_version 903024 (0.0042) [2024-06-25 14:20:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14795276288. Throughput: 0: 43209.3. Samples: 14795369940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 14:20:54,015][15401] Updated weights for policy 0, policy_version 903034 (0.0034) [2024-06-25 14:20:57,491][15401] Updated weights for policy 0, policy_version 903044 (0.0043) [2024-06-25 14:20:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14795505664. Throughput: 0: 42948.9. Samples: 14795620860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:20:58,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 14:21:01,710][15401] Updated weights for policy 0, policy_version 903054 (0.0034) [2024-06-25 14:21:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14795718656. Throughput: 0: 42925.8. Samples: 14795874680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 14:21:05,566][15401] Updated weights for policy 0, policy_version 903064 (0.0028) [2024-06-25 14:21:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14795915264. Throughput: 0: 42874.7. Samples: 14795999740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:08,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 14:21:09,298][15401] Updated weights for policy 0, policy_version 903074 (0.0041) [2024-06-25 14:21:12,990][15401] Updated weights for policy 0, policy_version 903084 (0.0028) [2024-06-25 14:21:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14796144640. Throughput: 0: 42859.2. Samples: 14796262900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 14:21:17,003][15401] Updated weights for policy 0, policy_version 903094 (0.0029) [2024-06-25 14:21:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14796357632. Throughput: 0: 42784.5. Samples: 14796519760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:18,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 14:21:20,548][15401] Updated weights for policy 0, policy_version 903104 (0.0034) [2024-06-25 14:21:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14796554240. Throughput: 0: 42710.2. Samples: 14796644900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:23,392][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 14:21:24,780][15401] Updated weights for policy 0, policy_version 903114 (0.0045) [2024-06-25 14:21:27,996][15401] Updated weights for policy 0, policy_version 903124 (0.0032) [2024-06-25 14:21:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14796800000. Throughput: 0: 42770.7. Samples: 14796903720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:28,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 14:21:32,496][15401] Updated weights for policy 0, policy_version 903134 (0.0048) [2024-06-25 14:21:33,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14796980224. Throughput: 0: 42719.1. Samples: 14797162460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 14:21:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 14:21:35,428][15401] Updated weights for policy 0, policy_version 903144 (0.0029) [2024-06-25 14:21:37,490][15349] Signal inference workers to stop experience collection... (219050 times) [2024-06-25 14:21:37,491][15349] Signal inference workers to resume experience collection... (219050 times) [2024-06-25 14:21:37,504][15401] InferenceWorker_p0-w0: stopping experience collection (219050 times) [2024-06-25 14:21:37,540][15401] InferenceWorker_p0-w0: resuming experience collection (219050 times) [2024-06-25 14:21:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14797193216. Throughput: 0: 42566.7. Samples: 14797285440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:21:38,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 14:21:39,947][15401] Updated weights for policy 0, policy_version 903154 (0.0041) [2024-06-25 14:21:43,130][15401] Updated weights for policy 0, policy_version 903164 (0.0036) [2024-06-25 14:21:43,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42876.5). Total num frames: 14797455360. Throughput: 0: 42769.4. Samples: 14797545480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:21:43,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 14:21:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903165_14797455360.pth... [2024-06-25 14:21:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902537_14787166208.pth [2024-06-25 14:21:47,614][15401] Updated weights for policy 0, policy_version 903174 (0.0031) [2024-06-25 14:21:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 14797619200. Throughput: 0: 42860.1. Samples: 14797803380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:21:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 14:21:50,630][15401] Updated weights for policy 0, policy_version 903184 (0.0036) [2024-06-25 14:21:53,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 14797832192. Throughput: 0: 42707.5. Samples: 14797921580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:21:53,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 14:21:55,900][15401] Updated weights for policy 0, policy_version 903194 (0.0042) [2024-06-25 14:21:58,204][15401] Updated weights for policy 0, policy_version 903204 (0.0029) [2024-06-25 14:21:58,389][15132] Fps is (10 sec: 49151.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14798110720. Throughput: 0: 42594.2. Samples: 14798179640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:21:58,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 14:22:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14798258176. Throughput: 0: 42736.4. Samples: 14798442900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:03,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 14:22:03,395][15401] Updated weights for policy 0, policy_version 903214 (0.0030) [2024-06-25 14:22:06,199][15401] Updated weights for policy 0, policy_version 903224 (0.0032) [2024-06-25 14:22:08,390][15132] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14798471168. Throughput: 0: 42555.6. Samples: 14798559800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 14:22:11,068][15401] Updated weights for policy 0, policy_version 903234 (0.0036) [2024-06-25 14:22:13,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14798733312. Throughput: 0: 42749.4. Samples: 14798827440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 14:22:13,750][15401] Updated weights for policy 0, policy_version 903244 (0.0047) [2024-06-25 14:22:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14798897152. Throughput: 0: 42766.7. Samples: 14799086960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 14:22:18,575][15401] Updated weights for policy 0, policy_version 903254 (0.0031) [2024-06-25 14:22:21,659][15401] Updated weights for policy 0, policy_version 903264 (0.0037) [2024-06-25 14:22:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 14799142912. Throughput: 0: 42674.1. Samples: 14799205780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:23,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 14:22:26,710][15401] Updated weights for policy 0, policy_version 903274 (0.0033) [2024-06-25 14:22:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14799355904. Throughput: 0: 42563.5. Samples: 14799460840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 14:22:29,264][15401] Updated weights for policy 0, policy_version 903284 (0.0054) [2024-06-25 14:22:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14799519744. Throughput: 0: 42599.9. Samples: 14799720380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 14:22:34,186][15401] Updated weights for policy 0, policy_version 903294 (0.0056) [2024-06-25 14:22:37,125][15401] Updated weights for policy 0, policy_version 903304 (0.0032) [2024-06-25 14:22:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14799781888. Throughput: 0: 42623.6. Samples: 14799839640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 14:22:41,867][15401] Updated weights for policy 0, policy_version 903314 (0.0044) [2024-06-25 14:22:43,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14799994880. Throughput: 0: 42707.1. Samples: 14800101460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 14:22:44,881][15401] Updated weights for policy 0, policy_version 903324 (0.0051) [2024-06-25 14:22:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14800175104. Throughput: 0: 42554.3. Samples: 14800357840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 14:22:49,519][15401] Updated weights for policy 0, policy_version 903334 (0.0040) [2024-06-25 14:22:52,734][15401] Updated weights for policy 0, policy_version 903344 (0.0041) [2024-06-25 14:22:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14800420864. Throughput: 0: 42585.3. Samples: 14800476140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 14:22:57,031][15401] Updated weights for policy 0, policy_version 903354 (0.0027) [2024-06-25 14:22:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14800633856. Throughput: 0: 42519.6. Samples: 14800740820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:22:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 14:23:00,291][15401] Updated weights for policy 0, policy_version 903364 (0.0026) [2024-06-25 14:23:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14800814080. Throughput: 0: 42460.9. Samples: 14800997700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 14:23:04,584][15401] Updated weights for policy 0, policy_version 903374 (0.0028) [2024-06-25 14:23:07,866][15401] Updated weights for policy 0, policy_version 903384 (0.0037) [2024-06-25 14:23:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42821.7). Total num frames: 14801059840. Throughput: 0: 42636.5. Samples: 14801124420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:08,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 14:23:12,018][15401] Updated weights for policy 0, policy_version 903394 (0.0029) [2024-06-25 14:23:12,366][15349] Signal inference workers to stop experience collection... (219100 times) [2024-06-25 14:23:12,390][15401] InferenceWorker_p0-w0: stopping experience collection (219100 times) [2024-06-25 14:23:12,422][15349] Signal inference workers to resume experience collection... (219100 times) [2024-06-25 14:23:12,422][15401] InferenceWorker_p0-w0: resuming experience collection (219100 times) [2024-06-25 14:23:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14801256448. Throughput: 0: 42687.4. Samples: 14801381780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:13,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 14:23:15,601][15401] Updated weights for policy 0, policy_version 903404 (0.0033) [2024-06-25 14:23:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14801469440. Throughput: 0: 42683.9. Samples: 14801641160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 14:23:19,517][15401] Updated weights for policy 0, policy_version 903414 (0.0032) [2024-06-25 14:23:23,134][15401] Updated weights for policy 0, policy_version 903424 (0.0025) [2024-06-25 14:23:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14801698816. Throughput: 0: 42862.6. Samples: 14801768460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:23,390][15132] Avg episode reward: [(0, '0.848')] [2024-06-25 14:23:27,182][15401] Updated weights for policy 0, policy_version 903434 (0.0038) [2024-06-25 14:23:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14801895424. Throughput: 0: 42817.7. Samples: 14802028260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-25 14:23:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 14:23:30,678][15401] Updated weights for policy 0, policy_version 903444 (0.0042) [2024-06-25 14:23:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14802108416. Throughput: 0: 42836.8. Samples: 14802285500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:33,391][15132] Avg episode reward: [(0, '0.862')] [2024-06-25 14:23:35,023][15401] Updated weights for policy 0, policy_version 903454 (0.0035) [2024-06-25 14:23:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14802337792. Throughput: 0: 43047.6. Samples: 14802413280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 14:23:38,661][15401] Updated weights for policy 0, policy_version 903464 (0.0031) [2024-06-25 14:23:42,479][15401] Updated weights for policy 0, policy_version 903474 (0.0038) [2024-06-25 14:23:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14802550784. Throughput: 0: 42804.4. Samples: 14802667020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:43,396][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 14:23:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903476_14802550784.pth... [2024-06-25 14:23:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000902850_14792294400.pth [2024-06-25 14:23:46,276][15401] Updated weights for policy 0, policy_version 903484 (0.0044) [2024-06-25 14:23:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14802747392. Throughput: 0: 42953.8. Samples: 14802930620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 14:23:50,006][15401] Updated weights for policy 0, policy_version 903494 (0.0028) [2024-06-25 14:23:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14802976768. Throughput: 0: 42960.4. Samples: 14803057640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 14:23:53,972][15401] Updated weights for policy 0, policy_version 903504 (0.0032) [2024-06-25 14:23:57,634][15401] Updated weights for policy 0, policy_version 903514 (0.0041) [2024-06-25 14:23:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14803206144. Throughput: 0: 42873.0. Samples: 14803311060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:23:58,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 14:24:01,736][15401] Updated weights for policy 0, policy_version 903524 (0.0044) [2024-06-25 14:24:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 14803402752. Throughput: 0: 42766.3. Samples: 14803565640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 14:24:05,409][15401] Updated weights for policy 0, policy_version 903534 (0.0035) [2024-06-25 14:24:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14803599360. Throughput: 0: 42702.8. Samples: 14803690080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:08,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 14:24:09,571][15401] Updated weights for policy 0, policy_version 903544 (0.0037) [2024-06-25 14:24:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 14803812352. Throughput: 0: 42605.8. Samples: 14803945520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:13,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 14:24:13,572][15401] Updated weights for policy 0, policy_version 903554 (0.0046) [2024-06-25 14:24:17,352][15401] Updated weights for policy 0, policy_version 903564 (0.0028) [2024-06-25 14:24:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 14804025344. Throughput: 0: 42658.9. Samples: 14804205140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 14:24:21,099][15401] Updated weights for policy 0, policy_version 903574 (0.0038) [2024-06-25 14:24:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14804254720. Throughput: 0: 42558.7. Samples: 14804328420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 14:24:25,403][15401] Updated weights for policy 0, policy_version 903584 (0.0028) [2024-06-25 14:24:28,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14804467712. Throughput: 0: 42590.1. Samples: 14804583580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 14:24:28,827][15401] Updated weights for policy 0, policy_version 903594 (0.0033) [2024-06-25 14:24:32,904][15401] Updated weights for policy 0, policy_version 903604 (0.0029) [2024-06-25 14:24:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 14804664320. Throughput: 0: 42599.9. Samples: 14804847720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:33,392][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 14:24:36,250][15401] Updated weights for policy 0, policy_version 903614 (0.0038) [2024-06-25 14:24:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 14804893696. Throughput: 0: 42500.8. Samples: 14804970180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 14:24:40,590][15401] Updated weights for policy 0, policy_version 903624 (0.0037) [2024-06-25 14:24:42,992][15349] Signal inference workers to stop experience collection... (219150 times) [2024-06-25 14:24:42,992][15349] Signal inference workers to resume experience collection... (219150 times) [2024-06-25 14:24:43,033][15401] InferenceWorker_p0-w0: stopping experience collection (219150 times) [2024-06-25 14:24:43,033][15401] InferenceWorker_p0-w0: resuming experience collection (219150 times) [2024-06-25 14:24:43,389][15132] Fps is (10 sec: 45886.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14805123072. Throughput: 0: 42501.8. Samples: 14805223640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 14:24:43,808][15401] Updated weights for policy 0, policy_version 903634 (0.0037) [2024-06-25 14:24:48,389][15132] Fps is (10 sec: 37684.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 14805270528. Throughput: 0: 42660.5. Samples: 14805485360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 14:24:48,655][15401] Updated weights for policy 0, policy_version 903644 (0.0046) [2024-06-25 14:24:51,427][15401] Updated weights for policy 0, policy_version 903654 (0.0033) [2024-06-25 14:24:53,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14805532672. Throughput: 0: 42450.9. Samples: 14805600380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:53,404][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 14:24:56,483][15401] Updated weights for policy 0, policy_version 903664 (0.0034) [2024-06-25 14:24:58,390][15132] Fps is (10 sec: 49151.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14805762048. Throughput: 0: 42727.1. Samples: 14805868240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:24:58,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 14:24:59,010][15401] Updated weights for policy 0, policy_version 903674 (0.0031) [2024-06-25 14:25:03,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 14805925888. Throughput: 0: 42615.0. Samples: 14806122820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:25:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 14:25:04,077][15401] Updated weights for policy 0, policy_version 903684 (0.0042) [2024-06-25 14:25:06,821][15401] Updated weights for policy 0, policy_version 903694 (0.0031) [2024-06-25 14:25:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14806171648. Throughput: 0: 42433.8. Samples: 14806237940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:25:08,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 14:25:11,873][15401] Updated weights for policy 0, policy_version 903704 (0.0041) [2024-06-25 14:25:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14806384640. Throughput: 0: 42633.8. Samples: 14806502100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:25:13,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 14:25:14,650][15401] Updated weights for policy 0, policy_version 903714 (0.0044) [2024-06-25 14:25:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 14806581248. Throughput: 0: 42340.9. Samples: 14806752960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:25:18,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 14:25:19,482][15401] Updated weights for policy 0, policy_version 903724 (0.0047) [2024-06-25 14:25:22,527][15401] Updated weights for policy 0, policy_version 903734 (0.0046) [2024-06-25 14:25:23,396][15132] Fps is (10 sec: 42571.2, 60 sec: 42593.8, 300 sec: 42653.0). Total num frames: 14806810624. Throughput: 0: 42321.2. Samples: 14806874900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:25:23,396][15132] Avg episode reward: [(0, '0.286')] [2024-06-25 14:25:27,029][15401] Updated weights for policy 0, policy_version 903744 (0.0033) [2024-06-25 14:25:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14807007232. Throughput: 0: 42575.6. Samples: 14807139540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:28,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 14:25:30,234][15401] Updated weights for policy 0, policy_version 903754 (0.0040) [2024-06-25 14:25:33,392][15132] Fps is (10 sec: 40976.6, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 14807220224. Throughput: 0: 42463.4. Samples: 14807396320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:33,392][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 14:25:34,813][15401] Updated weights for policy 0, policy_version 903764 (0.0026) [2024-06-25 14:25:37,722][15401] Updated weights for policy 0, policy_version 903774 (0.0030) [2024-06-25 14:25:38,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14807465984. Throughput: 0: 42746.7. Samples: 14807523980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 14:25:42,188][15401] Updated weights for policy 0, policy_version 903784 (0.0037) [2024-06-25 14:25:43,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14807662592. Throughput: 0: 42642.7. Samples: 14807787160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:43,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-25 14:25:43,513][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903789_14807678976.pth... [2024-06-25 14:25:43,575][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903165_14797455360.pth [2024-06-25 14:25:45,589][15401] Updated weights for policy 0, policy_version 903794 (0.0050) [2024-06-25 14:25:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14807875584. Throughput: 0: 42449.7. Samples: 14808033060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 14:25:49,950][15401] Updated weights for policy 0, policy_version 903804 (0.0027) [2024-06-25 14:25:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14808072192. Throughput: 0: 42771.2. Samples: 14808162640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:53,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 14:25:53,406][15401] Updated weights for policy 0, policy_version 903814 (0.0047) [2024-06-25 14:25:57,592][15401] Updated weights for policy 0, policy_version 903824 (0.0028) [2024-06-25 14:25:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14808285184. Throughput: 0: 42603.5. Samples: 14808419260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:25:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 14:26:01,098][15349] Signal inference workers to stop experience collection... (219200 times) [2024-06-25 14:26:01,101][15349] Signal inference workers to resume experience collection... (219200 times) [2024-06-25 14:26:01,111][15401] Updated weights for policy 0, policy_version 903834 (0.0027) [2024-06-25 14:26:01,128][15401] InferenceWorker_p0-w0: stopping experience collection (219200 times) [2024-06-25 14:26:01,129][15401] InferenceWorker_p0-w0: resuming experience collection (219200 times) [2024-06-25 14:26:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14808514560. Throughput: 0: 42657.8. Samples: 14808672560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:03,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 14:26:05,410][15401] Updated weights for policy 0, policy_version 903844 (0.0037) [2024-06-25 14:26:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14808727552. Throughput: 0: 42887.9. Samples: 14808804580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 14:26:08,803][15401] Updated weights for policy 0, policy_version 903854 (0.0037) [2024-06-25 14:26:13,139][15401] Updated weights for policy 0, policy_version 903864 (0.0035) [2024-06-25 14:26:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14808924160. Throughput: 0: 42694.5. Samples: 14809060800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 14:26:16,555][15401] Updated weights for policy 0, policy_version 903874 (0.0036) [2024-06-25 14:26:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 14809169920. Throughput: 0: 42631.6. Samples: 14809314640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:18,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 14:26:20,613][15401] Updated weights for policy 0, policy_version 903884 (0.0038) [2024-06-25 14:26:23,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42876.1, 300 sec: 42653.9). Total num frames: 14809382912. Throughput: 0: 42853.9. Samples: 14809452400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:23,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-25 14:26:24,087][15401] Updated weights for policy 0, policy_version 903894 (0.0036) [2024-06-25 14:26:28,213][15401] Updated weights for policy 0, policy_version 903904 (0.0043) [2024-06-25 14:26:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14809579520. Throughput: 0: 42607.9. Samples: 14809704520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:28,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 14:26:31,646][15401] Updated weights for policy 0, policy_version 903914 (0.0053) [2024-06-25 14:26:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43419.3, 300 sec: 42820.5). Total num frames: 14809825280. Throughput: 0: 42703.1. Samples: 14809954700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:33,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 14:26:35,717][15401] Updated weights for policy 0, policy_version 903924 (0.0041) [2024-06-25 14:26:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.6, 300 sec: 42542.9). Total num frames: 14810005504. Throughput: 0: 42773.9. Samples: 14810087460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:38,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 14:26:39,374][15401] Updated weights for policy 0, policy_version 903934 (0.0043) [2024-06-25 14:26:43,281][15401] Updated weights for policy 0, policy_version 903944 (0.0035) [2024-06-25 14:26:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14810218496. Throughput: 0: 42772.1. Samples: 14810344000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:43,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 14:26:47,033][15401] Updated weights for policy 0, policy_version 903954 (0.0051) [2024-06-25 14:26:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14810464256. Throughput: 0: 42798.2. Samples: 14810598480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 14:26:50,806][15401] Updated weights for policy 0, policy_version 903964 (0.0031) [2024-06-25 14:26:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 14810644480. Throughput: 0: 42806.6. Samples: 14810730880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 14:26:54,703][15401] Updated weights for policy 0, policy_version 903974 (0.0037) [2024-06-25 14:26:58,396][15132] Fps is (10 sec: 39296.6, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 14810857472. Throughput: 0: 42741.5. Samples: 14810984440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:26:58,396][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 14:26:58,789][15401] Updated weights for policy 0, policy_version 903984 (0.0028) [2024-06-25 14:27:02,344][15401] Updated weights for policy 0, policy_version 903994 (0.0027) [2024-06-25 14:27:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14811103232. Throughput: 0: 42719.2. Samples: 14811237000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:27:03,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 14:27:03,956][15349] Signal inference workers to stop experience collection... (219250 times) [2024-06-25 14:27:03,956][15349] Signal inference workers to resume experience collection... (219250 times) [2024-06-25 14:27:03,972][15401] InferenceWorker_p0-w0: stopping experience collection (219250 times) [2024-06-25 14:27:03,972][15401] InferenceWorker_p0-w0: resuming experience collection (219250 times) [2024-06-25 14:27:06,172][15401] Updated weights for policy 0, policy_version 904004 (0.0040) [2024-06-25 14:27:08,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14811267072. Throughput: 0: 42695.1. Samples: 14811373680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:27:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 14:27:09,946][15401] Updated weights for policy 0, policy_version 904014 (0.0040) [2024-06-25 14:27:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14811512832. Throughput: 0: 42581.3. Samples: 14811620680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:27:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 14:27:13,759][15401] Updated weights for policy 0, policy_version 904024 (0.0024) [2024-06-25 14:27:17,996][15401] Updated weights for policy 0, policy_version 904034 (0.0027) [2024-06-25 14:27:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14811725824. Throughput: 0: 42744.8. Samples: 14811878220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 14:27:18,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 14:27:21,489][15401] Updated weights for policy 0, policy_version 904044 (0.0039) [2024-06-25 14:27:23,390][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 14811889664. Throughput: 0: 42638.1. Samples: 14812006180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:23,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 14:27:25,648][15401] Updated weights for policy 0, policy_version 904054 (0.0031) [2024-06-25 14:27:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14812151808. Throughput: 0: 42643.3. Samples: 14812262940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:28,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 14:27:28,986][15401] Updated weights for policy 0, policy_version 904064 (0.0040) [2024-06-25 14:27:33,298][15401] Updated weights for policy 0, policy_version 904074 (0.0024) [2024-06-25 14:27:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14812348416. Throughput: 0: 42662.7. Samples: 14812518300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 14:27:36,533][15401] Updated weights for policy 0, policy_version 904084 (0.0034) [2024-06-25 14:27:38,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 14812545024. Throughput: 0: 42321.3. Samples: 14812635340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:38,391][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 14:27:40,945][15401] Updated weights for policy 0, policy_version 904094 (0.0047) [2024-06-25 14:27:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14812790784. Throughput: 0: 42458.5. Samples: 14812894800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:43,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 14:27:43,644][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904103_14812823552.pth... [2024-06-25 14:27:43,693][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903476_14802550784.pth [2024-06-25 14:27:44,963][15401] Updated weights for policy 0, policy_version 904104 (0.0043) [2024-06-25 14:27:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 14812971008. Throughput: 0: 42535.0. Samples: 14813151080. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:48,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 14:27:48,941][15401] Updated weights for policy 0, policy_version 904114 (0.0031) [2024-06-25 14:27:52,570][15401] Updated weights for policy 0, policy_version 904124 (0.0023) [2024-06-25 14:27:53,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14813184000. Throughput: 0: 42111.6. Samples: 14813268700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 14:27:56,528][15401] Updated weights for policy 0, policy_version 904134 (0.0026) [2024-06-25 14:27:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 14813413376. Throughput: 0: 42418.7. Samples: 14813529520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:27:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 14:28:00,176][15401] Updated weights for policy 0, policy_version 904144 (0.0024) [2024-06-25 14:28:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 14813609984. Throughput: 0: 42448.0. Samples: 14813788380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 14:28:04,468][15401] Updated weights for policy 0, policy_version 904154 (0.0052) [2024-06-25 14:28:06,093][15349] Signal inference workers to stop experience collection... (219300 times) [2024-06-25 14:28:06,094][15349] Signal inference workers to resume experience collection... (219300 times) [2024-06-25 14:28:06,116][15401] InferenceWorker_p0-w0: stopping experience collection (219300 times) [2024-06-25 14:28:06,116][15401] InferenceWorker_p0-w0: resuming experience collection (219300 times) [2024-06-25 14:28:07,907][15401] Updated weights for policy 0, policy_version 904164 (0.0022) [2024-06-25 14:28:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14813822976. Throughput: 0: 42389.0. Samples: 14813913680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 14:28:12,052][15401] Updated weights for policy 0, policy_version 904174 (0.0051) [2024-06-25 14:28:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14814052352. Throughput: 0: 42401.6. Samples: 14814171020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 14:28:15,491][15401] Updated weights for policy 0, policy_version 904184 (0.0032) [2024-06-25 14:28:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 14814248960. Throughput: 0: 42470.7. Samples: 14814429480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:18,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 14:28:19,612][15401] Updated weights for policy 0, policy_version 904194 (0.0037) [2024-06-25 14:28:23,044][15401] Updated weights for policy 0, policy_version 904204 (0.0034) [2024-06-25 14:28:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 14814478336. Throughput: 0: 42683.7. Samples: 14814556100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 14:28:27,538][15401] Updated weights for policy 0, policy_version 904214 (0.0035) [2024-06-25 14:28:28,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14814707712. Throughput: 0: 42597.7. Samples: 14814811700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:28:30,947][15401] Updated weights for policy 0, policy_version 904224 (0.0035) [2024-06-25 14:28:33,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14814904320. Throughput: 0: 42595.5. Samples: 14815067880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:33,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 14:28:35,213][15401] Updated weights for policy 0, policy_version 904234 (0.0036) [2024-06-25 14:28:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 14815117312. Throughput: 0: 42684.9. Samples: 14815189520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 14:28:38,839][15401] Updated weights for policy 0, policy_version 904244 (0.0030) [2024-06-25 14:28:42,821][15401] Updated weights for policy 0, policy_version 904254 (0.0041) [2024-06-25 14:28:43,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 14815330304. Throughput: 0: 42654.8. Samples: 14815448980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:43,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 14:28:46,814][15401] Updated weights for policy 0, policy_version 904264 (0.0023) [2024-06-25 14:28:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14815526912. Throughput: 0: 42452.5. Samples: 14815698740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 14:28:50,435][15401] Updated weights for policy 0, policy_version 904274 (0.0027) [2024-06-25 14:28:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 14815756288. Throughput: 0: 42374.5. Samples: 14815820540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:53,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 14:28:54,444][15401] Updated weights for policy 0, policy_version 904284 (0.0047) [2024-06-25 14:28:58,010][15401] Updated weights for policy 0, policy_version 904294 (0.0039) [2024-06-25 14:28:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14815969280. Throughput: 0: 42422.7. Samples: 14816080040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:28:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 14:29:02,047][15401] Updated weights for policy 0, policy_version 904304 (0.0034) [2024-06-25 14:29:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14816165888. Throughput: 0: 42420.5. Samples: 14816338400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:29:03,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 14:29:05,647][15401] Updated weights for policy 0, policy_version 904314 (0.0033) [2024-06-25 14:29:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14816378880. Throughput: 0: 42415.4. Samples: 14816464800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-25 14:29:08,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 14:29:09,586][15401] Updated weights for policy 0, policy_version 904324 (0.0026) [2024-06-25 14:29:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14816591872. Throughput: 0: 42379.7. Samples: 14816718780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:13,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 14:29:13,483][15401] Updated weights for policy 0, policy_version 904334 (0.0041) [2024-06-25 14:29:17,302][15401] Updated weights for policy 0, policy_version 904344 (0.0029) [2024-06-25 14:29:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14816804864. Throughput: 0: 42469.8. Samples: 14816979020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:18,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 14:29:21,091][15401] Updated weights for policy 0, policy_version 904354 (0.0030) [2024-06-25 14:29:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 14817017856. Throughput: 0: 42562.5. Samples: 14817104840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:23,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:29:25,001][15401] Updated weights for policy 0, policy_version 904364 (0.0039) [2024-06-25 14:29:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42598.8). Total num frames: 14817230848. Throughput: 0: 42427.5. Samples: 14817358220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:29:28,643][15401] Updated weights for policy 0, policy_version 904374 (0.0033) [2024-06-25 14:29:31,200][15349] Signal inference workers to stop experience collection... (219350 times) [2024-06-25 14:29:31,200][15349] Signal inference workers to resume experience collection... (219350 times) [2024-06-25 14:29:31,215][15401] InferenceWorker_p0-w0: stopping experience collection (219350 times) [2024-06-25 14:29:31,215][15401] InferenceWorker_p0-w0: resuming experience collection (219350 times) [2024-06-25 14:29:32,636][15401] Updated weights for policy 0, policy_version 904384 (0.0034) [2024-06-25 14:29:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14817443840. Throughput: 0: 42699.5. Samples: 14817620220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 14:29:36,276][15401] Updated weights for policy 0, policy_version 904394 (0.0043) [2024-06-25 14:29:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14817656832. Throughput: 0: 42837.4. Samples: 14817748220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 14:29:40,329][15401] Updated weights for policy 0, policy_version 904404 (0.0033) [2024-06-25 14:29:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14817886208. Throughput: 0: 42768.4. Samples: 14818004620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:43,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 14:29:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904412_14817886208.pth... [2024-06-25 14:29:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000903789_14807678976.pth [2024-06-25 14:29:44,008][15401] Updated weights for policy 0, policy_version 904414 (0.0029) [2024-06-25 14:29:48,115][15401] Updated weights for policy 0, policy_version 904424 (0.0038) [2024-06-25 14:29:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14818099200. Throughput: 0: 42484.4. Samples: 14818250200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:48,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:29:52,048][15401] Updated weights for policy 0, policy_version 904434 (0.0034) [2024-06-25 14:29:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14818295808. Throughput: 0: 42461.8. Samples: 14818375580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 14:29:55,827][15401] Updated weights for policy 0, policy_version 904444 (0.0042) [2024-06-25 14:29:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14818508800. Throughput: 0: 42667.5. Samples: 14818638820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:29:58,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 14:29:59,794][15401] Updated weights for policy 0, policy_version 904454 (0.0033) [2024-06-25 14:30:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 14818721792. Throughput: 0: 42354.2. Samples: 14818884960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:03,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 14:30:03,423][15401] Updated weights for policy 0, policy_version 904464 (0.0033) [2024-06-25 14:30:07,818][15401] Updated weights for policy 0, policy_version 904474 (0.0033) [2024-06-25 14:30:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14818934784. Throughput: 0: 42422.7. Samples: 14819013860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:08,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 14:30:11,119][15401] Updated weights for policy 0, policy_version 904484 (0.0034) [2024-06-25 14:30:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14819147776. Throughput: 0: 42527.0. Samples: 14819271940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:13,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 14:30:15,260][15401] Updated weights for policy 0, policy_version 904494 (0.0030) [2024-06-25 14:30:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 14819360768. Throughput: 0: 42294.7. Samples: 14819523480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 14:30:19,308][15401] Updated weights for policy 0, policy_version 904504 (0.0044) [2024-06-25 14:30:23,160][15401] Updated weights for policy 0, policy_version 904514 (0.0051) [2024-06-25 14:30:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14819573760. Throughput: 0: 42259.1. Samples: 14819649880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:23,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 14:30:26,795][15401] Updated weights for policy 0, policy_version 904524 (0.0036) [2024-06-25 14:30:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 14819786752. Throughput: 0: 42316.1. Samples: 14819908840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:28,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 14:30:30,697][15401] Updated weights for policy 0, policy_version 904534 (0.0036) [2024-06-25 14:30:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 14819999744. Throughput: 0: 42487.5. Samples: 14820162140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:33,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 14:30:34,450][15401] Updated weights for policy 0, policy_version 904544 (0.0036) [2024-06-25 14:30:38,133][15401] Updated weights for policy 0, policy_version 904554 (0.0032) [2024-06-25 14:30:38,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 14820229120. Throughput: 0: 42639.5. Samples: 14820294460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:38,392][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 14:30:41,886][15401] Updated weights for policy 0, policy_version 904564 (0.0031) [2024-06-25 14:30:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14820442112. Throughput: 0: 42587.4. Samples: 14820555260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:43,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 14:30:45,566][15401] Updated weights for policy 0, policy_version 904574 (0.0045) [2024-06-25 14:30:48,390][15132] Fps is (10 sec: 42607.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 14820655104. Throughput: 0: 42790.6. Samples: 14820810540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 14:30:49,873][15401] Updated weights for policy 0, policy_version 904584 (0.0032) [2024-06-25 14:30:50,660][15349] Signal inference workers to stop experience collection... (219400 times) [2024-06-25 14:30:50,660][15349] Signal inference workers to resume experience collection... (219400 times) [2024-06-25 14:30:50,678][15401] InferenceWorker_p0-w0: stopping experience collection (219400 times) [2024-06-25 14:30:50,678][15401] InferenceWorker_p0-w0: resuming experience collection (219400 times) [2024-06-25 14:30:53,253][15401] Updated weights for policy 0, policy_version 904594 (0.0038) [2024-06-25 14:30:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14820868096. Throughput: 0: 42742.6. Samples: 14820937280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:53,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 14:30:57,667][15401] Updated weights for policy 0, policy_version 904604 (0.0036) [2024-06-25 14:30:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14821097472. Throughput: 0: 42827.5. Samples: 14821199180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:30:58,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 14:31:01,082][15401] Updated weights for policy 0, policy_version 904614 (0.0034) [2024-06-25 14:31:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14821277696. Throughput: 0: 42749.3. Samples: 14821447200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 14:31:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 14:31:05,394][15401] Updated weights for policy 0, policy_version 904624 (0.0037) [2024-06-25 14:31:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14821507072. Throughput: 0: 42764.1. Samples: 14821574260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:08,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 14:31:08,525][15401] Updated weights for policy 0, policy_version 904634 (0.0037) [2024-06-25 14:31:12,854][15401] Updated weights for policy 0, policy_version 904644 (0.0028) [2024-06-25 14:31:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14821703680. Throughput: 0: 42849.8. Samples: 14821837080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 14:31:16,289][15401] Updated weights for policy 0, policy_version 904654 (0.0040) [2024-06-25 14:31:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14821916672. Throughput: 0: 42804.8. Samples: 14822088360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:18,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 14:31:20,377][15401] Updated weights for policy 0, policy_version 904664 (0.0031) [2024-06-25 14:31:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 14822129664. Throughput: 0: 42741.7. Samples: 14822217740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:23,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 14:31:24,318][15401] Updated weights for policy 0, policy_version 904674 (0.0043) [2024-06-25 14:31:27,971][15401] Updated weights for policy 0, policy_version 904684 (0.0036) [2024-06-25 14:31:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 14822342656. Throughput: 0: 42707.7. Samples: 14822477100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 14:31:31,789][15401] Updated weights for policy 0, policy_version 904694 (0.0039) [2024-06-25 14:31:33,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14822572032. Throughput: 0: 42511.3. Samples: 14822723540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:33,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 14:31:36,009][15401] Updated weights for policy 0, policy_version 904704 (0.0028) [2024-06-25 14:31:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 14822785024. Throughput: 0: 42694.6. Samples: 14822858540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:38,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 14:31:39,814][15401] Updated weights for policy 0, policy_version 904714 (0.0045) [2024-06-25 14:31:43,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14822981632. Throughput: 0: 42634.1. Samples: 14823117720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 14:31:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904723_14822981632.pth... [2024-06-25 14:31:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904103_14812823552.pth [2024-06-25 14:31:43,614][15401] Updated weights for policy 0, policy_version 904724 (0.0031) [2024-06-25 14:31:47,431][15401] Updated weights for policy 0, policy_version 904734 (0.0026) [2024-06-25 14:31:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14823211008. Throughput: 0: 42586.3. Samples: 14823363580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:48,393][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 14:31:51,527][15401] Updated weights for policy 0, policy_version 904744 (0.0040) [2024-06-25 14:31:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 14823407616. Throughput: 0: 42640.0. Samples: 14823493060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 14:31:55,204][15401] Updated weights for policy 0, policy_version 904754 (0.0031) [2024-06-25 14:31:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42376.2). Total num frames: 14823604224. Throughput: 0: 42276.4. Samples: 14823739520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:31:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 14:31:59,514][15401] Updated weights for policy 0, policy_version 904764 (0.0036) [2024-06-25 14:32:02,928][15401] Updated weights for policy 0, policy_version 904774 (0.0033) [2024-06-25 14:32:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14823849984. Throughput: 0: 42169.4. Samples: 14823985980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 14:32:07,267][15401] Updated weights for policy 0, policy_version 904784 (0.0038) [2024-06-25 14:32:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 14824046592. Throughput: 0: 42225.8. Samples: 14824117900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 14:32:10,405][15401] Updated weights for policy 0, policy_version 904794 (0.0030) [2024-06-25 14:32:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 14824243200. Throughput: 0: 42113.3. Samples: 14824372200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:13,398][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:32:14,713][15401] Updated weights for policy 0, policy_version 904804 (0.0042) [2024-06-25 14:32:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14824456192. Throughput: 0: 42270.7. Samples: 14824625720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:18,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 14:32:18,470][15401] Updated weights for policy 0, policy_version 904814 (0.0035) [2024-06-25 14:32:22,544][15401] Updated weights for policy 0, policy_version 904824 (0.0023) [2024-06-25 14:32:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42431.7). Total num frames: 14824669184. Throughput: 0: 42231.5. Samples: 14824758960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:32:26,031][15401] Updated weights for policy 0, policy_version 904834 (0.0028) [2024-06-25 14:32:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 14824882176. Throughput: 0: 42157.9. Samples: 14825014920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:28,393][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 14:32:29,444][15349] Signal inference workers to stop experience collection... (219450 times) [2024-06-25 14:32:29,444][15349] Signal inference workers to resume experience collection... (219450 times) [2024-06-25 14:32:29,484][15401] InferenceWorker_p0-w0: stopping experience collection (219450 times) [2024-06-25 14:32:29,484][15401] InferenceWorker_p0-w0: resuming experience collection (219450 times) [2024-06-25 14:32:30,148][15401] Updated weights for policy 0, policy_version 904844 (0.0032) [2024-06-25 14:32:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14825111552. Throughput: 0: 42248.0. Samples: 14825264740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 14:32:33,844][15401] Updated weights for policy 0, policy_version 904854 (0.0046) [2024-06-25 14:32:37,774][15401] Updated weights for policy 0, policy_version 904864 (0.0038) [2024-06-25 14:32:38,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14825324544. Throughput: 0: 42406.5. Samples: 14825401360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:38,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 14:32:41,427][15401] Updated weights for policy 0, policy_version 904874 (0.0023) [2024-06-25 14:32:43,391][15132] Fps is (10 sec: 40952.7, 60 sec: 42324.2, 300 sec: 42542.6). Total num frames: 14825521152. Throughput: 0: 42630.2. Samples: 14825657960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:43,392][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 14:32:45,321][15401] Updated weights for policy 0, policy_version 904884 (0.0046) [2024-06-25 14:32:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14825750528. Throughput: 0: 42624.3. Samples: 14825904080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:48,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 14:32:49,124][15401] Updated weights for policy 0, policy_version 904894 (0.0031) [2024-06-25 14:32:52,984][15401] Updated weights for policy 0, policy_version 904904 (0.0029) [2024-06-25 14:32:53,392][15132] Fps is (10 sec: 42595.9, 60 sec: 42323.6, 300 sec: 42487.0). Total num frames: 14825947136. Throughput: 0: 42749.4. Samples: 14826041720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:53,392][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 14:32:56,691][15401] Updated weights for policy 0, policy_version 904914 (0.0030) [2024-06-25 14:32:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14826160128. Throughput: 0: 42712.8. Samples: 14826294280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 14:32:58,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 14:33:00,562][15401] Updated weights for policy 0, policy_version 904924 (0.0035) [2024-06-25 14:33:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14826389504. Throughput: 0: 42814.1. Samples: 14826552360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 14:33:04,447][15401] Updated weights for policy 0, policy_version 904934 (0.0032) [2024-06-25 14:33:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14826586112. Throughput: 0: 42822.2. Samples: 14826685960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:08,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 14:33:08,543][15401] Updated weights for policy 0, policy_version 904944 (0.0039) [2024-06-25 14:33:11,929][15401] Updated weights for policy 0, policy_version 904954 (0.0036) [2024-06-25 14:33:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 14826799104. Throughput: 0: 42771.1. Samples: 14826939520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:13,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 14:33:16,147][15401] Updated weights for policy 0, policy_version 904964 (0.0038) [2024-06-25 14:33:18,392][15132] Fps is (10 sec: 45865.0, 60 sec: 43142.8, 300 sec: 42598.0). Total num frames: 14827044864. Throughput: 0: 42852.0. Samples: 14827193180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:18,393][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 14:33:19,616][15401] Updated weights for policy 0, policy_version 904974 (0.0046) [2024-06-25 14:33:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14827241472. Throughput: 0: 42868.4. Samples: 14827330440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:33:23,663][15401] Updated weights for policy 0, policy_version 904984 (0.0029) [2024-06-25 14:33:27,587][15401] Updated weights for policy 0, policy_version 904994 (0.0034) [2024-06-25 14:33:28,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42600.1, 300 sec: 42487.4). Total num frames: 14827438080. Throughput: 0: 42756.9. Samples: 14827581940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 14:33:31,112][15401] Updated weights for policy 0, policy_version 905004 (0.0038) [2024-06-25 14:33:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14827683840. Throughput: 0: 42818.8. Samples: 14827830920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:33,390][15132] Avg episode reward: [(0, '0.832')] [2024-06-25 14:33:35,177][15401] Updated weights for policy 0, policy_version 905014 (0.0033) [2024-06-25 14:33:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14827880448. Throughput: 0: 42767.3. Samples: 14827966140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 14:33:39,097][15401] Updated weights for policy 0, policy_version 905024 (0.0033) [2024-06-25 14:33:42,600][15401] Updated weights for policy 0, policy_version 905034 (0.0024) [2024-06-25 14:33:43,390][15132] Fps is (10 sec: 40956.8, 60 sec: 42872.3, 300 sec: 42598.3). Total num frames: 14828093440. Throughput: 0: 42634.6. Samples: 14828212860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:43,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 14:33:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905035_14828093440.pth... [2024-06-25 14:33:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904412_14817886208.pth [2024-06-25 14:33:46,682][15401] Updated weights for policy 0, policy_version 905044 (0.0042) [2024-06-25 14:33:48,391][15132] Fps is (10 sec: 44228.8, 60 sec: 42870.3, 300 sec: 42598.2). Total num frames: 14828322816. Throughput: 0: 42671.8. Samples: 14828472660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:48,392][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 14:33:50,397][15401] Updated weights for policy 0, policy_version 905054 (0.0033) [2024-06-25 14:33:53,389][15132] Fps is (10 sec: 42602.0, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 14828519424. Throughput: 0: 42538.1. Samples: 14828600160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:53,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 14:33:53,715][15349] Signal inference workers to stop experience collection... (219500 times) [2024-06-25 14:33:53,715][15349] Signal inference workers to resume experience collection... (219500 times) [2024-06-25 14:33:53,767][15401] InferenceWorker_p0-w0: stopping experience collection (219500 times) [2024-06-25 14:33:53,767][15401] InferenceWorker_p0-w0: resuming experience collection (219500 times) [2024-06-25 14:33:54,420][15401] Updated weights for policy 0, policy_version 905064 (0.0036) [2024-06-25 14:33:58,078][15401] Updated weights for policy 0, policy_version 905074 (0.0037) [2024-06-25 14:33:58,390][15132] Fps is (10 sec: 42605.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14828748800. Throughput: 0: 42552.9. Samples: 14828854400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:33:58,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 14:34:02,044][15401] Updated weights for policy 0, policy_version 905084 (0.0033) [2024-06-25 14:34:03,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 14828961792. Throughput: 0: 42572.5. Samples: 14829108840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:03,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 14:34:05,679][15401] Updated weights for policy 0, policy_version 905094 (0.0023) [2024-06-25 14:34:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 14829158400. Throughput: 0: 42422.4. Samples: 14829239440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 14:34:09,480][15401] Updated weights for policy 0, policy_version 905104 (0.0034) [2024-06-25 14:34:13,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14829371392. Throughput: 0: 42506.6. Samples: 14829494740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:13,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 14:34:13,657][15401] Updated weights for policy 0, policy_version 905114 (0.0037) [2024-06-25 14:34:17,063][15401] Updated weights for policy 0, policy_version 905124 (0.0037) [2024-06-25 14:34:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14829617152. Throughput: 0: 42757.3. Samples: 14829755000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 14:34:21,444][15401] Updated weights for policy 0, policy_version 905134 (0.0037) [2024-06-25 14:34:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14829797376. Throughput: 0: 42680.8. Samples: 14829886780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 14:34:24,425][15401] Updated weights for policy 0, policy_version 905144 (0.0041) [2024-06-25 14:34:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14830010368. Throughput: 0: 42795.8. Samples: 14830138640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 14:34:29,076][15401] Updated weights for policy 0, policy_version 905154 (0.0044) [2024-06-25 14:34:32,483][15401] Updated weights for policy 0, policy_version 905164 (0.0041) [2024-06-25 14:34:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14830239744. Throughput: 0: 42628.7. Samples: 14830390880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 14:34:36,532][15401] Updated weights for policy 0, policy_version 905174 (0.0037) [2024-06-25 14:34:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14830436352. Throughput: 0: 42760.8. Samples: 14830524400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:38,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-25 14:34:40,094][15401] Updated weights for policy 0, policy_version 905184 (0.0044) [2024-06-25 14:34:43,390][15132] Fps is (10 sec: 40956.3, 60 sec: 42598.2, 300 sec: 42542.7). Total num frames: 14830649344. Throughput: 0: 42730.3. Samples: 14830777300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:43,391][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 14:34:44,410][15401] Updated weights for policy 0, policy_version 905194 (0.0042) [2024-06-25 14:34:47,705][15401] Updated weights for policy 0, policy_version 905204 (0.0038) [2024-06-25 14:34:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42599.7, 300 sec: 42654.0). Total num frames: 14830878720. Throughput: 0: 42549.8. Samples: 14831023580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:48,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 14:34:51,962][15401] Updated weights for policy 0, policy_version 905214 (0.0032) [2024-06-25 14:34:53,390][15132] Fps is (10 sec: 40963.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 14831058944. Throughput: 0: 42638.5. Samples: 14831158180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-25 14:34:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 14:34:55,802][15401] Updated weights for policy 0, policy_version 905224 (0.0034) [2024-06-25 14:34:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14831304704. Throughput: 0: 42551.0. Samples: 14831409540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:34:58,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 14:35:00,195][15401] Updated weights for policy 0, policy_version 905234 (0.0034) [2024-06-25 14:35:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14831501312. Throughput: 0: 42569.4. Samples: 14831670620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:03,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 14:35:03,482][15401] Updated weights for policy 0, policy_version 905244 (0.0031) [2024-06-25 14:35:07,714][15401] Updated weights for policy 0, policy_version 905254 (0.0026) [2024-06-25 14:35:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 14831697920. Throughput: 0: 42443.9. Samples: 14831796760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:08,392][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 14:35:09,054][15349] Signal inference workers to stop experience collection... (219550 times) [2024-06-25 14:35:09,055][15349] Signal inference workers to resume experience collection... (219550 times) [2024-06-25 14:35:09,102][15401] InferenceWorker_p0-w0: stopping experience collection (219550 times) [2024-06-25 14:35:09,103][15401] InferenceWorker_p0-w0: resuming experience collection (219550 times) [2024-06-25 14:35:11,067][15401] Updated weights for policy 0, policy_version 905264 (0.0030) [2024-06-25 14:35:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14831927296. Throughput: 0: 42498.1. Samples: 14832051060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 14:35:15,221][15401] Updated weights for policy 0, policy_version 905274 (0.0040) [2024-06-25 14:35:18,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 14832140288. Throughput: 0: 42560.9. Samples: 14832306120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:18,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 14:35:18,776][15401] Updated weights for policy 0, policy_version 905284 (0.0031) [2024-06-25 14:35:22,775][15401] Updated weights for policy 0, policy_version 905294 (0.0042) [2024-06-25 14:35:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14832353280. Throughput: 0: 42391.9. Samples: 14832432040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 14:35:26,835][15401] Updated weights for policy 0, policy_version 905304 (0.0040) [2024-06-25 14:35:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14832566272. Throughput: 0: 42434.7. Samples: 14832686820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 14:35:30,330][15401] Updated weights for policy 0, policy_version 905314 (0.0034) [2024-06-25 14:35:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 14832779264. Throughput: 0: 42737.2. Samples: 14832946760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:33,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 14:35:34,478][15401] Updated weights for policy 0, policy_version 905324 (0.0042) [2024-06-25 14:35:38,019][15401] Updated weights for policy 0, policy_version 905334 (0.0037) [2024-06-25 14:35:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14832992256. Throughput: 0: 42483.3. Samples: 14833069920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:38,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 14:35:42,152][15401] Updated weights for policy 0, policy_version 905344 (0.0037) [2024-06-25 14:35:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42872.2, 300 sec: 42598.4). Total num frames: 14833221632. Throughput: 0: 42714.8. Samples: 14833331700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 14:35:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905348_14833221632.pth... [2024-06-25 14:35:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000904723_14822981632.pth [2024-06-25 14:35:45,538][15401] Updated weights for policy 0, policy_version 905354 (0.0026) [2024-06-25 14:35:48,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 14833418240. Throughput: 0: 42484.8. Samples: 14833582540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:48,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 14:35:49,636][15401] Updated weights for policy 0, policy_version 905364 (0.0041) [2024-06-25 14:35:53,061][15401] Updated weights for policy 0, policy_version 905374 (0.0028) [2024-06-25 14:35:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 14833647616. Throughput: 0: 42587.2. Samples: 14833713180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 14:35:57,220][15401] Updated weights for policy 0, policy_version 905384 (0.0032) [2024-06-25 14:35:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14833827840. Throughput: 0: 42539.7. Samples: 14833965340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:35:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 14:36:00,974][15401] Updated weights for policy 0, policy_version 905394 (0.0028) [2024-06-25 14:36:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14834040832. Throughput: 0: 42600.5. Samples: 14834223140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 14:36:04,822][15401] Updated weights for policy 0, policy_version 905404 (0.0029) [2024-06-25 14:36:08,382][15401] Updated weights for policy 0, policy_version 905414 (0.0031) [2024-06-25 14:36:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14834302976. Throughput: 0: 42576.4. Samples: 14834347980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 14:36:12,967][15401] Updated weights for policy 0, policy_version 905424 (0.0042) [2024-06-25 14:36:13,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 14834483200. Throughput: 0: 42573.9. Samples: 14834602920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:13,396][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 14:36:16,101][15401] Updated weights for policy 0, policy_version 905434 (0.0031) [2024-06-25 14:36:18,389][15132] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 14834663424. Throughput: 0: 42604.1. Samples: 14834863940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:18,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 14:36:20,457][15401] Updated weights for policy 0, policy_version 905444 (0.0028) [2024-06-25 14:36:23,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14834909184. Throughput: 0: 42621.4. Samples: 14834987880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 14:36:23,772][15401] Updated weights for policy 0, policy_version 905454 (0.0039) [2024-06-25 14:36:26,788][15349] Signal inference workers to stop experience collection... (219600 times) [2024-06-25 14:36:26,837][15401] InferenceWorker_p0-w0: stopping experience collection (219600 times) [2024-06-25 14:36:26,847][15349] Signal inference workers to resume experience collection... (219600 times) [2024-06-25 14:36:26,853][15401] InferenceWorker_p0-w0: resuming experience collection (219600 times) [2024-06-25 14:36:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14835105792. Throughput: 0: 42482.6. Samples: 14835243420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 14:36:28,501][15401] Updated weights for policy 0, policy_version 905464 (0.0035) [2024-06-25 14:36:31,955][15401] Updated weights for policy 0, policy_version 905474 (0.0036) [2024-06-25 14:36:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 14835318784. Throughput: 0: 42575.9. Samples: 14835498360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 14:36:35,983][15401] Updated weights for policy 0, policy_version 905484 (0.0027) [2024-06-25 14:36:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14835564544. Throughput: 0: 42471.6. Samples: 14835624400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:38,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:36:39,887][15401] Updated weights for policy 0, policy_version 905494 (0.0026) [2024-06-25 14:36:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14835744768. Throughput: 0: 42646.2. Samples: 14835884420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:43,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 14:36:43,637][15401] Updated weights for policy 0, policy_version 905504 (0.0033) [2024-06-25 14:36:47,754][15401] Updated weights for policy 0, policy_version 905514 (0.0037) [2024-06-25 14:36:48,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42598.4, 300 sec: 42598.0). Total num frames: 14835974144. Throughput: 0: 42626.2. Samples: 14836141420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-25 14:36:48,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 14:36:51,170][15401] Updated weights for policy 0, policy_version 905524 (0.0040) [2024-06-25 14:36:53,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14836203520. Throughput: 0: 42658.8. Samples: 14836267620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:36:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 14:36:55,691][15401] Updated weights for policy 0, policy_version 905534 (0.0026) [2024-06-25 14:36:58,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14836416512. Throughput: 0: 42684.4. Samples: 14836523440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:36:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 14:36:58,998][15401] Updated weights for policy 0, policy_version 905544 (0.0034) [2024-06-25 14:37:03,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 14836580352. Throughput: 0: 42655.6. Samples: 14836783440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:03,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 14:37:03,427][15401] Updated weights for policy 0, policy_version 905554 (0.0028) [2024-06-25 14:37:06,438][15401] Updated weights for policy 0, policy_version 905564 (0.0049) [2024-06-25 14:37:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14836826112. Throughput: 0: 42578.9. Samples: 14836903940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:08,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:37:10,995][15401] Updated weights for policy 0, policy_version 905574 (0.0028) [2024-06-25 14:37:13,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 14837039104. Throughput: 0: 42731.5. Samples: 14837166340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 14:37:13,944][15401] Updated weights for policy 0, policy_version 905584 (0.0031) [2024-06-25 14:37:18,393][15132] Fps is (10 sec: 40946.2, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 14837235712. Throughput: 0: 42807.0. Samples: 14837424820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:18,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 14:37:18,583][15401] Updated weights for policy 0, policy_version 905594 (0.0027) [2024-06-25 14:37:21,539][15401] Updated weights for policy 0, policy_version 905604 (0.0054) [2024-06-25 14:37:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 14837465088. Throughput: 0: 42803.1. Samples: 14837550540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 14:37:26,419][15401] Updated weights for policy 0, policy_version 905614 (0.0039) [2024-06-25 14:37:28,390][15132] Fps is (10 sec: 44252.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14837678080. Throughput: 0: 42817.2. Samples: 14837811200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:28,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 14:37:29,184][15401] Updated weights for policy 0, policy_version 905624 (0.0049) [2024-06-25 14:37:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 14837891072. Throughput: 0: 42697.0. Samples: 14838062680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 14:37:33,909][15401] Updated weights for policy 0, policy_version 905634 (0.0039) [2024-06-25 14:37:37,030][15401] Updated weights for policy 0, policy_version 905644 (0.0033) [2024-06-25 14:37:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42654.2). Total num frames: 14838104064. Throughput: 0: 42721.6. Samples: 14838190100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 14:37:41,330][15401] Updated weights for policy 0, policy_version 905654 (0.0038) [2024-06-25 14:37:43,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14838300672. Throughput: 0: 42724.4. Samples: 14838446040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 14:37:43,483][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905659_14838317056.pth... [2024-06-25 14:37:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905035_14828093440.pth [2024-06-25 14:37:44,721][15401] Updated weights for policy 0, policy_version 905664 (0.0035) [2024-06-25 14:37:48,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42600.2, 300 sec: 42654.3). Total num frames: 14838530048. Throughput: 0: 42612.4. Samples: 14838701000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 14:37:48,903][15401] Updated weights for policy 0, policy_version 905674 (0.0023) [2024-06-25 14:37:50,208][15349] Signal inference workers to stop experience collection... (219650 times) [2024-06-25 14:37:50,209][15349] Signal inference workers to resume experience collection... (219650 times) [2024-06-25 14:37:50,256][15401] InferenceWorker_p0-w0: stopping experience collection (219650 times) [2024-06-25 14:37:50,256][15401] InferenceWorker_p0-w0: resuming experience collection (219650 times) [2024-06-25 14:37:52,336][15401] Updated weights for policy 0, policy_version 905684 (0.0032) [2024-06-25 14:37:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14838759424. Throughput: 0: 42877.5. Samples: 14838833420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 14:37:56,504][15401] Updated weights for policy 0, policy_version 905694 (0.0030) [2024-06-25 14:37:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14838956032. Throughput: 0: 42762.3. Samples: 14839090640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:37:58,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 14:37:59,887][15401] Updated weights for policy 0, policy_version 905704 (0.0028) [2024-06-25 14:38:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14839185408. Throughput: 0: 42685.9. Samples: 14839345540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:03,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 14:38:03,991][15401] Updated weights for policy 0, policy_version 905714 (0.0034) [2024-06-25 14:38:07,459][15401] Updated weights for policy 0, policy_version 905724 (0.0038) [2024-06-25 14:38:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14839414784. Throughput: 0: 43033.9. Samples: 14839487060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:08,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-25 14:38:11,437][15401] Updated weights for policy 0, policy_version 905734 (0.0038) [2024-06-25 14:38:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 14839578624. Throughput: 0: 42840.0. Samples: 14839739000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 14:38:15,605][15401] Updated weights for policy 0, policy_version 905744 (0.0029) [2024-06-25 14:38:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43420.1, 300 sec: 42709.5). Total num frames: 14839840768. Throughput: 0: 42939.0. Samples: 14839994940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:18,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 14:38:19,233][15401] Updated weights for policy 0, policy_version 905754 (0.0027) [2024-06-25 14:38:23,313][15401] Updated weights for policy 0, policy_version 905764 (0.0031) [2024-06-25 14:38:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 14840037376. Throughput: 0: 43193.3. Samples: 14840133800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 14:38:26,734][15401] Updated weights for policy 0, policy_version 905774 (0.0031) [2024-06-25 14:38:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 14840233984. Throughput: 0: 42952.8. Samples: 14840378920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:28,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 14:38:30,801][15401] Updated weights for policy 0, policy_version 905784 (0.0030) [2024-06-25 14:38:33,389][15132] Fps is (10 sec: 45876.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14840496128. Throughput: 0: 43024.4. Samples: 14840637100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 14:38:34,375][15401] Updated weights for policy 0, policy_version 905794 (0.0030) [2024-06-25 14:38:38,369][15401] Updated weights for policy 0, policy_version 905804 (0.0033) [2024-06-25 14:38:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.6). Total num frames: 14840692736. Throughput: 0: 43112.4. Samples: 14840773480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:38,390][15132] Avg episode reward: [(0, '0.877')] [2024-06-25 14:38:41,902][15401] Updated weights for policy 0, policy_version 905814 (0.0039) [2024-06-25 14:38:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 14840889344. Throughput: 0: 42943.6. Samples: 14841023100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 14:38:43,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 14:38:45,971][15401] Updated weights for policy 0, policy_version 905824 (0.0040) [2024-06-25 14:38:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14841118720. Throughput: 0: 43049.4. Samples: 14841282760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:38:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 14:38:49,369][15401] Updated weights for policy 0, policy_version 905834 (0.0046) [2024-06-25 14:38:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14841315328. Throughput: 0: 42926.6. Samples: 14841418760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:38:53,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 14:38:53,574][15401] Updated weights for policy 0, policy_version 905844 (0.0034) [2024-06-25 14:38:55,521][15349] Signal inference workers to stop experience collection... (219700 times) [2024-06-25 14:38:55,545][15401] InferenceWorker_p0-w0: stopping experience collection (219700 times) [2024-06-25 14:38:55,583][15349] Signal inference workers to resume experience collection... (219700 times) [2024-06-25 14:38:55,583][15401] InferenceWorker_p0-w0: resuming experience collection (219700 times) [2024-06-25 14:38:56,792][15401] Updated weights for policy 0, policy_version 905854 (0.0032) [2024-06-25 14:38:58,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 14841544704. Throughput: 0: 42918.0. Samples: 14841670320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:38:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 14:39:01,135][15401] Updated weights for policy 0, policy_version 905864 (0.0038) [2024-06-25 14:39:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14841774080. Throughput: 0: 43126.3. Samples: 14841935620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:03,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 14:39:04,383][15401] Updated weights for policy 0, policy_version 905874 (0.0028) [2024-06-25 14:39:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14841970688. Throughput: 0: 42891.3. Samples: 14842063900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 14:39:08,668][15401] Updated weights for policy 0, policy_version 905884 (0.0027) [2024-06-25 14:39:11,866][15401] Updated weights for policy 0, policy_version 905894 (0.0035) [2024-06-25 14:39:13,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 14842183680. Throughput: 0: 43080.0. Samples: 14842317520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 14:39:16,326][15401] Updated weights for policy 0, policy_version 905904 (0.0030) [2024-06-25 14:39:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14842429440. Throughput: 0: 43245.7. Samples: 14842583260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:18,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 14:39:19,414][15401] Updated weights for policy 0, policy_version 905914 (0.0025) [2024-06-25 14:39:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14842626048. Throughput: 0: 43073.7. Samples: 14842711800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:23,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 14:39:24,205][15401] Updated weights for policy 0, policy_version 905924 (0.0035) [2024-06-25 14:39:27,487][15401] Updated weights for policy 0, policy_version 905934 (0.0035) [2024-06-25 14:39:28,392][15132] Fps is (10 sec: 42598.1, 60 sec: 43688.9, 300 sec: 42764.7). Total num frames: 14842855424. Throughput: 0: 43193.2. Samples: 14842966900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:28,393][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 14:39:31,744][15401] Updated weights for policy 0, policy_version 905944 (0.0029) [2024-06-25 14:39:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14843068416. Throughput: 0: 43182.1. Samples: 14843225960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:33,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 14:39:35,147][15401] Updated weights for policy 0, policy_version 905954 (0.0038) [2024-06-25 14:39:38,390][15132] Fps is (10 sec: 42608.6, 60 sec: 43144.5, 300 sec: 42820.7). Total num frames: 14843281408. Throughput: 0: 42919.9. Samples: 14843350160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 14:39:39,458][15401] Updated weights for policy 0, policy_version 905964 (0.0026) [2024-06-25 14:39:42,847][15401] Updated weights for policy 0, policy_version 905974 (0.0028) [2024-06-25 14:39:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 14843494400. Throughput: 0: 43069.3. Samples: 14843608440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 14:39:43,493][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905976_14843510784.pth... [2024-06-25 14:39:43,546][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905348_14833221632.pth [2024-06-25 14:39:47,315][15401] Updated weights for policy 0, policy_version 905984 (0.0037) [2024-06-25 14:39:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 14843707392. Throughput: 0: 42839.4. Samples: 14843863500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:48,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 14:39:50,655][15401] Updated weights for policy 0, policy_version 905994 (0.0037) [2024-06-25 14:39:53,390][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14843920384. Throughput: 0: 42819.9. Samples: 14843990800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:53,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 14:39:54,973][15401] Updated weights for policy 0, policy_version 906004 (0.0033) [2024-06-25 14:39:58,355][15401] Updated weights for policy 0, policy_version 906014 (0.0042) [2024-06-25 14:39:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14844133376. Throughput: 0: 42922.2. Samples: 14844249020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:39:58,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 14:40:02,510][15401] Updated weights for policy 0, policy_version 906024 (0.0034) [2024-06-25 14:40:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14844346368. Throughput: 0: 42816.6. Samples: 14844509900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:03,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 14:40:05,862][15401] Updated weights for policy 0, policy_version 906034 (0.0034) [2024-06-25 14:40:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14844559360. Throughput: 0: 42785.0. Samples: 14844637120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:08,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 14:40:10,035][15401] Updated weights for policy 0, policy_version 906044 (0.0028) [2024-06-25 14:40:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14844772352. Throughput: 0: 42808.2. Samples: 14844893160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 14:40:13,577][15349] Signal inference workers to stop experience collection... (219750 times) [2024-06-25 14:40:13,604][15401] InferenceWorker_p0-w0: stopping experience collection (219750 times) [2024-06-25 14:40:13,631][15349] Signal inference workers to resume experience collection... (219750 times) [2024-06-25 14:40:13,636][15401] InferenceWorker_p0-w0: resuming experience collection (219750 times) [2024-06-25 14:40:13,640][15401] Updated weights for policy 0, policy_version 906054 (0.0035) [2024-06-25 14:40:17,733][15401] Updated weights for policy 0, policy_version 906064 (0.0029) [2024-06-25 14:40:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14844985344. Throughput: 0: 42696.5. Samples: 14845147300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 14:40:21,362][15401] Updated weights for policy 0, policy_version 906074 (0.0022) [2024-06-25 14:40:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14845198336. Throughput: 0: 42636.4. Samples: 14845268800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:40:25,529][15401] Updated weights for policy 0, policy_version 906084 (0.0037) [2024-06-25 14:40:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 14845394944. Throughput: 0: 42581.9. Samples: 14845524620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 14:40:29,295][15401] Updated weights for policy 0, policy_version 906094 (0.0044) [2024-06-25 14:40:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14845591552. Throughput: 0: 42567.6. Samples: 14845778940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 14:40:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 14:40:33,434][15401] Updated weights for policy 0, policy_version 906104 (0.0046) [2024-06-25 14:40:36,878][15401] Updated weights for policy 0, policy_version 906114 (0.0028) [2024-06-25 14:40:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 14845820928. Throughput: 0: 42543.0. Samples: 14845905340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:40:38,393][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 14:40:41,151][15401] Updated weights for policy 0, policy_version 906124 (0.0037) [2024-06-25 14:40:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 14846033920. Throughput: 0: 42516.9. Samples: 14846162280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:40:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 14:40:44,394][15401] Updated weights for policy 0, policy_version 906134 (0.0040) [2024-06-25 14:40:48,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 14846230528. Throughput: 0: 42539.3. Samples: 14846424180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:40:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 14:40:48,879][15401] Updated weights for policy 0, policy_version 906144 (0.0025) [2024-06-25 14:40:52,173][15401] Updated weights for policy 0, policy_version 906154 (0.0045) [2024-06-25 14:40:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14846459904. Throughput: 0: 42405.7. Samples: 14846545380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:40:53,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 14:40:56,722][15401] Updated weights for policy 0, policy_version 906164 (0.0027) [2024-06-25 14:40:58,390][15132] Fps is (10 sec: 44234.4, 60 sec: 42324.9, 300 sec: 42820.5). Total num frames: 14846672896. Throughput: 0: 42400.6. Samples: 14846801220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:40:58,396][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 14:40:59,991][15401] Updated weights for policy 0, policy_version 906174 (0.0031) [2024-06-25 14:41:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 14846885888. Throughput: 0: 42452.4. Samples: 14847057660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 14:41:04,490][15401] Updated weights for policy 0, policy_version 906184 (0.0040) [2024-06-25 14:41:07,683][15401] Updated weights for policy 0, policy_version 906194 (0.0022) [2024-06-25 14:41:08,396][15132] Fps is (10 sec: 44211.4, 60 sec: 42593.8, 300 sec: 42820.6). Total num frames: 14847115264. Throughput: 0: 42461.2. Samples: 14847179820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:08,396][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 14:41:12,386][15401] Updated weights for policy 0, policy_version 906204 (0.0040) [2024-06-25 14:41:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 14847295488. Throughput: 0: 42611.7. Samples: 14847442140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:13,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 14:41:15,230][15401] Updated weights for policy 0, policy_version 906214 (0.0056) [2024-06-25 14:41:18,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14847524864. Throughput: 0: 42430.6. Samples: 14847688320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 14:41:19,971][15401] Updated weights for policy 0, policy_version 906224 (0.0049) [2024-06-25 14:41:22,867][15401] Updated weights for policy 0, policy_version 906234 (0.0036) [2024-06-25 14:41:23,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42596.8, 300 sec: 42875.7). Total num frames: 14847754240. Throughput: 0: 42555.2. Samples: 14847820320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:23,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 14:41:27,591][15401] Updated weights for policy 0, policy_version 906244 (0.0025) [2024-06-25 14:41:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 14847918080. Throughput: 0: 42501.4. Samples: 14848074840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:28,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 14:41:30,217][15349] Signal inference workers to stop experience collection... (219800 times) [2024-06-25 14:41:30,217][15349] Signal inference workers to resume experience collection... (219800 times) [2024-06-25 14:41:30,255][15401] InferenceWorker_p0-w0: stopping experience collection (219800 times) [2024-06-25 14:41:30,256][15401] InferenceWorker_p0-w0: resuming experience collection (219800 times) [2024-06-25 14:41:30,538][15401] Updated weights for policy 0, policy_version 906254 (0.0033) [2024-06-25 14:41:33,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14848163840. Throughput: 0: 42310.4. Samples: 14848328140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:33,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 14:41:35,140][15401] Updated weights for policy 0, policy_version 906264 (0.0048) [2024-06-25 14:41:38,365][15401] Updated weights for policy 0, policy_version 906274 (0.0047) [2024-06-25 14:41:38,393][15132] Fps is (10 sec: 47497.7, 60 sec: 42870.9, 300 sec: 42875.6). Total num frames: 14848393216. Throughput: 0: 42581.8. Samples: 14848461700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:38,393][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 14:41:43,060][15401] Updated weights for policy 0, policy_version 906284 (0.0033) [2024-06-25 14:41:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 14848573440. Throughput: 0: 42529.9. Samples: 14848715040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 14:41:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906285_14848573440.pth... [2024-06-25 14:41:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905659_14838317056.pth [2024-06-25 14:41:46,095][15401] Updated weights for policy 0, policy_version 906294 (0.0029) [2024-06-25 14:41:48,392][15132] Fps is (10 sec: 40963.8, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 14848802816. Throughput: 0: 42341.4. Samples: 14848963120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:48,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 14:41:50,694][15401] Updated weights for policy 0, policy_version 906304 (0.0026) [2024-06-25 14:41:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14849032192. Throughput: 0: 42648.8. Samples: 14849098740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 14:41:53,812][15401] Updated weights for policy 0, policy_version 906314 (0.0027) [2024-06-25 14:41:58,361][15401] Updated weights for policy 0, policy_version 906324 (0.0036) [2024-06-25 14:41:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.8, 300 sec: 42820.5). Total num frames: 14849212416. Throughput: 0: 42383.6. Samples: 14849349400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:41:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 14:42:01,676][15401] Updated weights for policy 0, policy_version 906334 (0.0040) [2024-06-25 14:42:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14849441792. Throughput: 0: 42333.8. Samples: 14849593340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 14:42:06,214][15401] Updated weights for policy 0, policy_version 906344 (0.0039) [2024-06-25 14:42:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42056.8, 300 sec: 42709.5). Total num frames: 14849638400. Throughput: 0: 42431.7. Samples: 14849729640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:08,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 14:42:09,323][15401] Updated weights for policy 0, policy_version 906354 (0.0030) [2024-06-25 14:42:13,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42654.4). Total num frames: 14849818624. Throughput: 0: 42353.3. Samples: 14849980740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 14:42:13,880][15401] Updated weights for policy 0, policy_version 906364 (0.0040) [2024-06-25 14:42:17,069][15401] Updated weights for policy 0, policy_version 906374 (0.0032) [2024-06-25 14:42:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14850064384. Throughput: 0: 42297.4. Samples: 14850231520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 14:42:21,419][15401] Updated weights for policy 0, policy_version 906384 (0.0035) [2024-06-25 14:42:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 14850277376. Throughput: 0: 42360.4. Samples: 14850367780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 14:42:24,569][15401] Updated weights for policy 0, policy_version 906394 (0.0042) [2024-06-25 14:42:28,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14850473984. Throughput: 0: 42273.8. Samples: 14850617360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 14:42:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 14:42:28,870][15401] Updated weights for policy 0, policy_version 906404 (0.0037) [2024-06-25 14:42:32,517][15401] Updated weights for policy 0, policy_version 906414 (0.0030) [2024-06-25 14:42:33,390][15132] Fps is (10 sec: 44234.5, 60 sec: 42597.9, 300 sec: 42765.0). Total num frames: 14850719744. Throughput: 0: 42557.3. Samples: 14850878120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:33,391][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 14:42:36,511][15401] Updated weights for policy 0, policy_version 906424 (0.0031) [2024-06-25 14:42:38,001][15349] Signal inference workers to stop experience collection... (219850 times) [2024-06-25 14:42:38,002][15349] Signal inference workers to resume experience collection... (219850 times) [2024-06-25 14:42:38,036][15401] InferenceWorker_p0-w0: stopping experience collection (219850 times) [2024-06-25 14:42:38,037][15401] InferenceWorker_p0-w0: resuming experience collection (219850 times) [2024-06-25 14:42:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42054.7, 300 sec: 42765.0). Total num frames: 14850916352. Throughput: 0: 42533.4. Samples: 14851012740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:38,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 14:42:39,998][15401] Updated weights for policy 0, policy_version 906434 (0.0039) [2024-06-25 14:42:43,390][15132] Fps is (10 sec: 40962.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14851129344. Throughput: 0: 42475.9. Samples: 14851260820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:43,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 14:42:44,232][15401] Updated weights for policy 0, policy_version 906444 (0.0036) [2024-06-25 14:42:47,574][15401] Updated weights for policy 0, policy_version 906454 (0.0029) [2024-06-25 14:42:48,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 14851358720. Throughput: 0: 42800.4. Samples: 14851519360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:48,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 14:42:52,472][15401] Updated weights for policy 0, policy_version 906464 (0.0032) [2024-06-25 14:42:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 14851555328. Throughput: 0: 42617.2. Samples: 14851647420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 14:42:55,390][15401] Updated weights for policy 0, policy_version 906474 (0.0030) [2024-06-25 14:42:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14851768320. Throughput: 0: 42639.5. Samples: 14851899520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:42:58,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 14:43:00,112][15401] Updated weights for policy 0, policy_version 906484 (0.0031) [2024-06-25 14:43:03,255][15401] Updated weights for policy 0, policy_version 906494 (0.0025) [2024-06-25 14:43:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14852014080. Throughput: 0: 42751.3. Samples: 14852155340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 14:43:07,698][15401] Updated weights for policy 0, policy_version 906504 (0.0041) [2024-06-25 14:43:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14852177920. Throughput: 0: 42522.2. Samples: 14852281280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:08,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 14:43:10,803][15401] Updated weights for policy 0, policy_version 906514 (0.0032) [2024-06-25 14:43:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14852423680. Throughput: 0: 42665.8. Samples: 14852537320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 14:43:15,481][15401] Updated weights for policy 0, policy_version 906524 (0.0039) [2024-06-25 14:43:18,267][15401] Updated weights for policy 0, policy_version 906534 (0.0027) [2024-06-25 14:43:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14852653056. Throughput: 0: 42411.7. Samples: 14852786620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 14:43:23,043][15401] Updated weights for policy 0, policy_version 906544 (0.0029) [2024-06-25 14:43:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14852833280. Throughput: 0: 42298.4. Samples: 14852916180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 14:43:26,406][15401] Updated weights for policy 0, policy_version 906554 (0.0023) [2024-06-25 14:43:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14853079040. Throughput: 0: 42477.8. Samples: 14853172320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 14:43:30,647][15401] Updated weights for policy 0, policy_version 906564 (0.0036) [2024-06-25 14:43:33,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.8, 300 sec: 42598.4). Total num frames: 14853259264. Throughput: 0: 42521.5. Samples: 14853432820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 14:43:33,931][15401] Updated weights for policy 0, policy_version 906574 (0.0035) [2024-06-25 14:43:38,115][15401] Updated weights for policy 0, policy_version 906584 (0.0033) [2024-06-25 14:43:38,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 14853488640. Throughput: 0: 42314.2. Samples: 14853551660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:38,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 14:43:42,142][15401] Updated weights for policy 0, policy_version 906594 (0.0029) [2024-06-25 14:43:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14853701632. Throughput: 0: 42576.9. Samples: 14853815480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 14:43:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906598_14853701632.pth... [2024-06-25 14:43:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000905976_14843510784.pth [2024-06-25 14:43:45,659][15401] Updated weights for policy 0, policy_version 906604 (0.0027) [2024-06-25 14:43:48,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 14853898240. Throughput: 0: 42566.4. Samples: 14854070820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 14:43:49,602][15401] Updated weights for policy 0, policy_version 906614 (0.0027) [2024-06-25 14:43:53,299][15401] Updated weights for policy 0, policy_version 906624 (0.0035) [2024-06-25 14:43:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 14854127616. Throughput: 0: 42560.5. Samples: 14854196500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:53,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 14:43:57,252][15401] Updated weights for policy 0, policy_version 906634 (0.0035) [2024-06-25 14:43:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14854340608. Throughput: 0: 42623.9. Samples: 14854455400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:43:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 14:44:00,755][15401] Updated weights for policy 0, policy_version 906644 (0.0039) [2024-06-25 14:44:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14854537216. Throughput: 0: 42862.7. Samples: 14854715440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:44:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 14:44:04,963][15401] Updated weights for policy 0, policy_version 906654 (0.0028) [2024-06-25 14:44:04,986][15349] Signal inference workers to stop experience collection... (219900 times) [2024-06-25 14:44:04,986][15349] Signal inference workers to resume experience collection... (219900 times) [2024-06-25 14:44:05,030][15401] InferenceWorker_p0-w0: stopping experience collection (219900 times) [2024-06-25 14:44:05,031][15401] InferenceWorker_p0-w0: resuming experience collection (219900 times) [2024-06-25 14:44:08,317][15401] Updated weights for policy 0, policy_version 906664 (0.0041) [2024-06-25 14:44:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14854782976. Throughput: 0: 42719.6. Samples: 14854838560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:44:08,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 14:44:12,610][15401] Updated weights for policy 0, policy_version 906674 (0.0029) [2024-06-25 14:44:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 14854963200. Throughput: 0: 42679.6. Samples: 14855092900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:44:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 14:44:16,477][15401] Updated weights for policy 0, policy_version 906684 (0.0030) [2024-06-25 14:44:18,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 14855176192. Throughput: 0: 42572.7. Samples: 14855348600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:44:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 14:44:20,429][15401] Updated weights for policy 0, policy_version 906694 (0.0048) [2024-06-25 14:44:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42487.7). Total num frames: 14855389184. Throughput: 0: 42733.5. Samples: 14855474560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 14:44:23,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 14:44:24,291][15401] Updated weights for policy 0, policy_version 906704 (0.0036) [2024-06-25 14:44:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 14855585792. Throughput: 0: 42595.7. Samples: 14855732280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:28,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 14:44:28,497][15401] Updated weights for policy 0, policy_version 906714 (0.0028) [2024-06-25 14:44:31,867][15401] Updated weights for policy 0, policy_version 906724 (0.0036) [2024-06-25 14:44:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14855831552. Throughput: 0: 42551.0. Samples: 14855985620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:33,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 14:44:36,044][15401] Updated weights for policy 0, policy_version 906734 (0.0046) [2024-06-25 14:44:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 14856028160. Throughput: 0: 42744.0. Samples: 14856119980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:38,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 14:44:39,579][15401] Updated weights for policy 0, policy_version 906744 (0.0040) [2024-06-25 14:44:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 14856241152. Throughput: 0: 42617.8. Samples: 14856373200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:43,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 14:44:43,688][15401] Updated weights for policy 0, policy_version 906754 (0.0033) [2024-06-25 14:44:47,026][15401] Updated weights for policy 0, policy_version 906764 (0.0028) [2024-06-25 14:44:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14856470528. Throughput: 0: 42370.8. Samples: 14856622120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 14:44:51,371][15401] Updated weights for policy 0, policy_version 906774 (0.0033) [2024-06-25 14:44:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14856650752. Throughput: 0: 42735.6. Samples: 14856761660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:53,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 14:44:54,675][15401] Updated weights for policy 0, policy_version 906784 (0.0034) [2024-06-25 14:44:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14856880128. Throughput: 0: 42554.2. Samples: 14857007840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:44:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 14:44:58,972][15401] Updated weights for policy 0, policy_version 906794 (0.0031) [2024-06-25 14:45:02,313][15401] Updated weights for policy 0, policy_version 906804 (0.0026) [2024-06-25 14:45:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 14857109504. Throughput: 0: 42642.6. Samples: 14857267520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:03,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 14:45:06,549][15401] Updated weights for policy 0, policy_version 906814 (0.0046) [2024-06-25 14:45:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14857306112. Throughput: 0: 42698.9. Samples: 14857396020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:08,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 14:45:10,295][15401] Updated weights for policy 0, policy_version 906824 (0.0029) [2024-06-25 14:45:13,393][15132] Fps is (10 sec: 40945.2, 60 sec: 42595.8, 300 sec: 42486.8). Total num frames: 14857519104. Throughput: 0: 42641.7. Samples: 14857651320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:13,394][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:45:14,018][15401] Updated weights for policy 0, policy_version 906834 (0.0031) [2024-06-25 14:45:18,026][15401] Updated weights for policy 0, policy_version 906844 (0.0031) [2024-06-25 14:45:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14857748480. Throughput: 0: 42563.1. Samples: 14857900960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 14:45:21,671][15401] Updated weights for policy 0, policy_version 906854 (0.0048) [2024-06-25 14:45:23,390][15132] Fps is (10 sec: 44253.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14857961472. Throughput: 0: 42505.3. Samples: 14858032720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 14:45:25,842][15401] Updated weights for policy 0, policy_version 906864 (0.0029) [2024-06-25 14:45:27,592][15349] Signal inference workers to stop experience collection... (219950 times) [2024-06-25 14:45:27,622][15401] InferenceWorker_p0-w0: stopping experience collection (219950 times) [2024-06-25 14:45:27,653][15349] Signal inference workers to resume experience collection... (219950 times) [2024-06-25 14:45:27,653][15401] InferenceWorker_p0-w0: resuming experience collection (219950 times) [2024-06-25 14:45:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 14858174464. Throughput: 0: 42477.3. Samples: 14858284680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 14:45:29,477][15401] Updated weights for policy 0, policy_version 906874 (0.0043) [2024-06-25 14:45:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 14858371072. Throughput: 0: 42675.8. Samples: 14858542540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:33,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 14:45:33,500][15401] Updated weights for policy 0, policy_version 906884 (0.0035) [2024-06-25 14:45:37,110][15401] Updated weights for policy 0, policy_version 906894 (0.0032) [2024-06-25 14:45:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14858584064. Throughput: 0: 42373.9. Samples: 14858668480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:38,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 14:45:41,322][15401] Updated weights for policy 0, policy_version 906904 (0.0035) [2024-06-25 14:45:43,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14858829824. Throughput: 0: 42604.0. Samples: 14858925020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 14:45:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906911_14858829824.pth... [2024-06-25 14:45:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906285_14848573440.pth [2024-06-25 14:45:44,912][15401] Updated weights for policy 0, policy_version 906914 (0.0035) [2024-06-25 14:45:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14858993664. Throughput: 0: 42403.2. Samples: 14859175660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:48,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 14:45:49,109][15401] Updated weights for policy 0, policy_version 906924 (0.0025) [2024-06-25 14:45:52,600][15401] Updated weights for policy 0, policy_version 906934 (0.0032) [2024-06-25 14:45:53,395][15132] Fps is (10 sec: 39300.5, 60 sec: 42867.7, 300 sec: 42542.2). Total num frames: 14859223040. Throughput: 0: 42198.7. Samples: 14859295180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:53,395][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 14:45:57,077][15401] Updated weights for policy 0, policy_version 906944 (0.0046) [2024-06-25 14:45:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14859452416. Throughput: 0: 42371.6. Samples: 14859557880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:45:58,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 14:46:00,327][15401] Updated weights for policy 0, policy_version 906954 (0.0041) [2024-06-25 14:46:03,390][15132] Fps is (10 sec: 40981.6, 60 sec: 42052.3, 300 sec: 42432.7). Total num frames: 14859632640. Throughput: 0: 42563.8. Samples: 14859816340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:46:03,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 14:46:04,649][15401] Updated weights for policy 0, policy_version 906964 (0.0040) [2024-06-25 14:46:07,870][15401] Updated weights for policy 0, policy_version 906974 (0.0034) [2024-06-25 14:46:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 14859878400. Throughput: 0: 42244.1. Samples: 14859933700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:46:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 14:46:12,205][15401] Updated weights for policy 0, policy_version 906984 (0.0022) [2024-06-25 14:46:13,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42601.1, 300 sec: 42542.9). Total num frames: 14860075008. Throughput: 0: 42468.5. Samples: 14860195760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:46:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:46:15,406][15401] Updated weights for policy 0, policy_version 906994 (0.0031) [2024-06-25 14:46:18,390][15132] Fps is (10 sec: 37682.5, 60 sec: 41779.1, 300 sec: 42376.6). Total num frames: 14860255232. Throughput: 0: 42471.9. Samples: 14860453780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 14:46:18,403][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 14:46:20,021][15401] Updated weights for policy 0, policy_version 907004 (0.0051) [2024-06-25 14:46:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14860500992. Throughput: 0: 42339.9. Samples: 14860573780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 14:46:23,532][15401] Updated weights for policy 0, policy_version 907014 (0.0034) [2024-06-25 14:46:27,596][15401] Updated weights for policy 0, policy_version 907024 (0.0041) [2024-06-25 14:46:28,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14860713984. Throughput: 0: 42485.0. Samples: 14860836840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 14:46:31,383][15401] Updated weights for policy 0, policy_version 907034 (0.0029) [2024-06-25 14:46:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42376.7). Total num frames: 14860894208. Throughput: 0: 42611.5. Samples: 14861093180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:33,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 14:46:35,255][15401] Updated weights for policy 0, policy_version 907044 (0.0033) [2024-06-25 14:46:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14861156352. Throughput: 0: 42640.6. Samples: 14861213780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 14:46:39,112][15401] Updated weights for policy 0, policy_version 907054 (0.0034) [2024-06-25 14:46:42,799][15401] Updated weights for policy 0, policy_version 907064 (0.0041) [2024-06-25 14:46:43,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42323.6, 300 sec: 42598.4). Total num frames: 14861369344. Throughput: 0: 42803.9. Samples: 14861484160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:43,392][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 14:46:46,665][15401] Updated weights for policy 0, policy_version 907074 (0.0032) [2024-06-25 14:46:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 14861549568. Throughput: 0: 42705.3. Samples: 14861738080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 14:46:50,455][15401] Updated weights for policy 0, policy_version 907084 (0.0041) [2024-06-25 14:46:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42875.3, 300 sec: 42653.9). Total num frames: 14861795328. Throughput: 0: 42889.2. Samples: 14861863720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:53,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 14:46:54,108][15401] Updated weights for policy 0, policy_version 907094 (0.0049) [2024-06-25 14:46:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14861975552. Throughput: 0: 42778.2. Samples: 14862120780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:46:58,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 14:46:58,442][15401] Updated weights for policy 0, policy_version 907104 (0.0030) [2024-06-25 14:47:01,652][15401] Updated weights for policy 0, policy_version 907114 (0.0043) [2024-06-25 14:47:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42598.3). Total num frames: 14862204928. Throughput: 0: 42665.2. Samples: 14862373720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 14:47:06,038][15401] Updated weights for policy 0, policy_version 907124 (0.0040) [2024-06-25 14:47:08,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14862434304. Throughput: 0: 42862.4. Samples: 14862502580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:08,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-25 14:47:09,229][15401] Updated weights for policy 0, policy_version 907134 (0.0034) [2024-06-25 14:47:13,396][15132] Fps is (10 sec: 40936.0, 60 sec: 42320.9, 300 sec: 42542.0). Total num frames: 14862614528. Throughput: 0: 42729.6. Samples: 14862759940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:13,396][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 14:47:14,151][15401] Updated weights for policy 0, policy_version 907144 (0.0049) [2024-06-25 14:47:14,485][15349] Signal inference workers to stop experience collection... (220000 times) [2024-06-25 14:47:14,485][15349] Signal inference workers to resume experience collection... (220000 times) [2024-06-25 14:47:14,518][15401] InferenceWorker_p0-w0: stopping experience collection (220000 times) [2024-06-25 14:47:14,519][15401] InferenceWorker_p0-w0: resuming experience collection (220000 times) [2024-06-25 14:47:16,902][15401] Updated weights for policy 0, policy_version 907154 (0.0037) [2024-06-25 14:47:18,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14862843904. Throughput: 0: 42629.4. Samples: 14863011500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 14:47:21,785][15401] Updated weights for policy 0, policy_version 907164 (0.0035) [2024-06-25 14:47:23,390][15132] Fps is (10 sec: 47542.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14863089664. Throughput: 0: 42981.3. Samples: 14863147940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:23,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 14:47:24,561][15401] Updated weights for policy 0, policy_version 907174 (0.0033) [2024-06-25 14:47:28,391][15132] Fps is (10 sec: 40955.3, 60 sec: 42324.5, 300 sec: 42487.2). Total num frames: 14863253504. Throughput: 0: 42583.4. Samples: 14863400360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:28,391][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 14:47:29,438][15401] Updated weights for policy 0, policy_version 907184 (0.0034) [2024-06-25 14:47:32,351][15401] Updated weights for policy 0, policy_version 907194 (0.0041) [2024-06-25 14:47:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14863482880. Throughput: 0: 42498.9. Samples: 14863650520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:33,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 14:47:37,145][15401] Updated weights for policy 0, policy_version 907204 (0.0034) [2024-06-25 14:47:38,390][15132] Fps is (10 sec: 45880.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14863712256. Throughput: 0: 42566.2. Samples: 14863779200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:38,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 14:47:40,712][15401] Updated weights for policy 0, policy_version 907214 (0.0033) [2024-06-25 14:47:43,392][15132] Fps is (10 sec: 39313.3, 60 sec: 41779.5, 300 sec: 42431.5). Total num frames: 14863876096. Throughput: 0: 42341.2. Samples: 14864026220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:43,392][15132] Avg episode reward: [(0, '0.243')] [2024-06-25 14:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907219_14863876096.pth... [2024-06-25 14:47:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906598_14853701632.pth [2024-06-25 14:47:44,740][15401] Updated weights for policy 0, policy_version 907224 (0.0042) [2024-06-25 14:47:48,135][15401] Updated weights for policy 0, policy_version 907234 (0.0039) [2024-06-25 14:47:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 14864138240. Throughput: 0: 42440.2. Samples: 14864283620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:48,393][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 14:47:52,483][15401] Updated weights for policy 0, policy_version 907244 (0.0036) [2024-06-25 14:47:53,392][15132] Fps is (10 sec: 47511.8, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14864351232. Throughput: 0: 42565.1. Samples: 14864418120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:53,393][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 14:47:55,774][15401] Updated weights for policy 0, policy_version 907254 (0.0042) [2024-06-25 14:47:58,396][15132] Fps is (10 sec: 37668.5, 60 sec: 42320.9, 300 sec: 42375.3). Total num frames: 14864515072. Throughput: 0: 42454.1. Samples: 14864670380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:47:58,396][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 14:48:00,011][15401] Updated weights for policy 0, policy_version 907264 (0.0023) [2024-06-25 14:48:03,166][15401] Updated weights for policy 0, policy_version 907274 (0.0022) [2024-06-25 14:48:03,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 14864777216. Throughput: 0: 42448.5. Samples: 14864921680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:48:03,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 14:48:08,088][15401] Updated weights for policy 0, policy_version 907284 (0.0033) [2024-06-25 14:48:08,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 14864957440. Throughput: 0: 42442.3. Samples: 14865057840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:48:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 14:48:10,620][15401] Updated weights for policy 0, policy_version 907294 (0.0024) [2024-06-25 14:48:13,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42602.8, 300 sec: 42431.8). Total num frames: 14865170432. Throughput: 0: 42406.0. Samples: 14865308580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 14:48:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 14:48:15,739][15401] Updated weights for policy 0, policy_version 907304 (0.0034) [2024-06-25 14:48:18,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14865416192. Throughput: 0: 42319.0. Samples: 14865554880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:18,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 14:48:18,734][15401] Updated weights for policy 0, policy_version 907314 (0.0030) [2024-06-25 14:48:20,292][15349] Signal inference workers to stop experience collection... (220050 times) [2024-06-25 14:48:20,318][15401] InferenceWorker_p0-w0: stopping experience collection (220050 times) [2024-06-25 14:48:20,348][15349] Signal inference workers to resume experience collection... (220050 times) [2024-06-25 14:48:20,349][15401] InferenceWorker_p0-w0: resuming experience collection (220050 times) [2024-06-25 14:48:23,331][15401] Updated weights for policy 0, policy_version 907324 (0.0036) [2024-06-25 14:48:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 14865596416. Throughput: 0: 42526.1. Samples: 14865692880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 14:48:26,298][15401] Updated weights for policy 0, policy_version 907334 (0.0034) [2024-06-25 14:48:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42599.2, 300 sec: 42542.8). Total num frames: 14865809408. Throughput: 0: 42586.4. Samples: 14865942520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 14:48:31,046][15401] Updated weights for policy 0, policy_version 907344 (0.0035) [2024-06-25 14:48:33,390][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 14866071552. Throughput: 0: 42446.3. Samples: 14866193600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:33,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 14:48:33,772][15401] Updated weights for policy 0, policy_version 907354 (0.0036) [2024-06-25 14:48:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 14866219008. Throughput: 0: 42456.6. Samples: 14866328560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 14:48:38,751][15401] Updated weights for policy 0, policy_version 907364 (0.0046) [2024-06-25 14:48:41,490][15401] Updated weights for policy 0, policy_version 907374 (0.0028) [2024-06-25 14:48:43,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42872.9, 300 sec: 42542.8). Total num frames: 14866448384. Throughput: 0: 42444.6. Samples: 14866580120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 14:48:46,730][15401] Updated weights for policy 0, policy_version 907384 (0.0046) [2024-06-25 14:48:48,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 14866694144. Throughput: 0: 42441.8. Samples: 14866831560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:48,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 14:48:49,078][15401] Updated weights for policy 0, policy_version 907394 (0.0042) [2024-06-25 14:48:53,390][15132] Fps is (10 sec: 40956.9, 60 sec: 41780.3, 300 sec: 42431.7). Total num frames: 14866857984. Throughput: 0: 42413.8. Samples: 14866966500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:53,391][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 14:48:54,194][15401] Updated weights for policy 0, policy_version 907404 (0.0036) [2024-06-25 14:48:57,044][15401] Updated weights for policy 0, policy_version 907414 (0.0038) [2024-06-25 14:48:58,390][15132] Fps is (10 sec: 40959.1, 60 sec: 43149.0, 300 sec: 42598.4). Total num frames: 14867103744. Throughput: 0: 42482.1. Samples: 14867220280. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:48:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 14:49:01,758][15401] Updated weights for policy 0, policy_version 907424 (0.0025) [2024-06-25 14:49:03,389][15132] Fps is (10 sec: 47518.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14867333120. Throughput: 0: 42748.5. Samples: 14867478560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 14:49:04,726][15401] Updated weights for policy 0, policy_version 907434 (0.0033) [2024-06-25 14:49:08,396][15132] Fps is (10 sec: 39297.0, 60 sec: 42320.8, 300 sec: 42486.4). Total num frames: 14867496960. Throughput: 0: 42635.9. Samples: 14867611760. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:08,397][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 14:49:09,348][15401] Updated weights for policy 0, policy_version 907444 (0.0035) [2024-06-25 14:49:12,544][15401] Updated weights for policy 0, policy_version 907454 (0.0031) [2024-06-25 14:49:13,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14867759104. Throughput: 0: 42778.7. Samples: 14867867560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 14:49:17,010][15401] Updated weights for policy 0, policy_version 907464 (0.0025) [2024-06-25 14:49:18,389][15132] Fps is (10 sec: 47544.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14867972096. Throughput: 0: 42848.6. Samples: 14868121780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 14:49:19,977][15401] Updated weights for policy 0, policy_version 907474 (0.0039) [2024-06-25 14:49:23,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14868152320. Throughput: 0: 42765.7. Samples: 14868253020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:23,392][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 14:49:24,513][15401] Updated weights for policy 0, policy_version 907484 (0.0024) [2024-06-25 14:49:24,857][15349] Signal inference workers to stop experience collection... (220100 times) [2024-06-25 14:49:24,858][15349] Signal inference workers to resume experience collection... (220100 times) [2024-06-25 14:49:24,907][15401] InferenceWorker_p0-w0: stopping experience collection (220100 times) [2024-06-25 14:49:24,908][15401] InferenceWorker_p0-w0: resuming experience collection (220100 times) [2024-06-25 14:49:27,492][15401] Updated weights for policy 0, policy_version 907494 (0.0032) [2024-06-25 14:49:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 14868398080. Throughput: 0: 42786.8. Samples: 14868505520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:28,390][15132] Avg episode reward: [(0, '0.084')] [2024-06-25 14:49:32,238][15401] Updated weights for policy 0, policy_version 907504 (0.0045) [2024-06-25 14:49:33,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14868611072. Throughput: 0: 43035.9. Samples: 14868768180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 14:49:34,922][15401] Updated weights for policy 0, policy_version 907514 (0.0030) [2024-06-25 14:49:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14868807680. Throughput: 0: 42835.0. Samples: 14868894040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:38,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 14:49:40,010][15401] Updated weights for policy 0, policy_version 907524 (0.0029) [2024-06-25 14:49:42,778][15401] Updated weights for policy 0, policy_version 907534 (0.0039) [2024-06-25 14:49:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 14869053440. Throughput: 0: 42756.1. Samples: 14869144300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:43,395][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 14:49:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907535_14869053440.pth... [2024-06-25 14:49:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000906911_14858829824.pth [2024-06-25 14:49:47,552][15401] Updated weights for policy 0, policy_version 907544 (0.0027) [2024-06-25 14:49:48,391][15132] Fps is (10 sec: 42591.9, 60 sec: 42324.1, 300 sec: 42653.7). Total num frames: 14869233664. Throughput: 0: 42898.4. Samples: 14869409060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:48,392][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 14:49:50,238][15401] Updated weights for policy 0, policy_version 907554 (0.0033) [2024-06-25 14:49:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42872.0, 300 sec: 42542.9). Total num frames: 14869430272. Throughput: 0: 42657.5. Samples: 14869531080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 14:49:55,157][15401] Updated weights for policy 0, policy_version 907564 (0.0031) [2024-06-25 14:49:57,893][15401] Updated weights for policy 0, policy_version 907574 (0.0039) [2024-06-25 14:49:58,390][15132] Fps is (10 sec: 47520.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14869708800. Throughput: 0: 42839.5. Samples: 14869795340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:49:58,399][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 14:50:02,747][15401] Updated weights for policy 0, policy_version 907584 (0.0035) [2024-06-25 14:50:03,392][15132] Fps is (10 sec: 45864.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14869889024. Throughput: 0: 42895.0. Samples: 14870052160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:50:03,401][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 14:50:05,387][15401] Updated weights for policy 0, policy_version 907594 (0.0037) [2024-06-25 14:50:08,390][15132] Fps is (10 sec: 37683.1, 60 sec: 43149.1, 300 sec: 42598.9). Total num frames: 14870085632. Throughput: 0: 42703.1. Samples: 14870174660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-25 14:50:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 14:50:10,490][15401] Updated weights for policy 0, policy_version 907604 (0.0031) [2024-06-25 14:50:12,957][15401] Updated weights for policy 0, policy_version 907614 (0.0028) [2024-06-25 14:50:13,392][15132] Fps is (10 sec: 47513.5, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 14870364160. Throughput: 0: 42822.6. Samples: 14870432640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:13,393][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 14:50:18,182][15401] Updated weights for policy 0, policy_version 907624 (0.0038) [2024-06-25 14:50:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14870528000. Throughput: 0: 42780.4. Samples: 14870693300. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:18,393][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 14:50:20,928][15401] Updated weights for policy 0, policy_version 907634 (0.0040) [2024-06-25 14:50:23,389][15132] Fps is (10 sec: 34415.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14870708224. Throughput: 0: 42542.8. Samples: 14870808460. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:23,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 14:50:25,869][15401] Updated weights for policy 0, policy_version 907644 (0.0036) [2024-06-25 14:50:28,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14870986752. Throughput: 0: 42940.6. Samples: 14871076620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 14:50:28,424][15401] Updated weights for policy 0, policy_version 907654 (0.0033) [2024-06-25 14:50:32,029][15349] Signal inference workers to stop experience collection... (220150 times) [2024-06-25 14:50:32,029][15349] Signal inference workers to resume experience collection... (220150 times) [2024-06-25 14:50:32,061][15401] InferenceWorker_p0-w0: stopping experience collection (220150 times) [2024-06-25 14:50:32,061][15401] InferenceWorker_p0-w0: resuming experience collection (220150 times) [2024-06-25 14:50:33,255][15401] Updated weights for policy 0, policy_version 907664 (0.0033) [2024-06-25 14:50:33,392][15132] Fps is (10 sec: 45863.6, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 14871166976. Throughput: 0: 42865.9. Samples: 14871338060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:33,393][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 14:50:36,377][15401] Updated weights for policy 0, policy_version 907674 (0.0034) [2024-06-25 14:50:38,396][15132] Fps is (10 sec: 37658.8, 60 sec: 42593.9, 300 sec: 42486.4). Total num frames: 14871363584. Throughput: 0: 42778.4. Samples: 14871456380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:38,396][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 14:50:40,966][15401] Updated weights for policy 0, policy_version 907684 (0.0028) [2024-06-25 14:50:43,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14871625728. Throughput: 0: 42738.6. Samples: 14871718580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 14:50:43,846][15401] Updated weights for policy 0, policy_version 907694 (0.0033) [2024-06-25 14:50:48,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42599.6, 300 sec: 42599.2). Total num frames: 14871789568. Throughput: 0: 42765.4. Samples: 14871976500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:48,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 14:50:48,694][15401] Updated weights for policy 0, policy_version 907704 (0.0037) [2024-06-25 14:50:51,553][15401] Updated weights for policy 0, policy_version 907714 (0.0027) [2024-06-25 14:50:53,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 14872002560. Throughput: 0: 42741.8. Samples: 14872098040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:53,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 14:50:56,286][15401] Updated weights for policy 0, policy_version 907724 (0.0027) [2024-06-25 14:50:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14872264704. Throughput: 0: 42813.3. Samples: 14872359140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:50:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:50:59,331][15401] Updated weights for policy 0, policy_version 907734 (0.0032) [2024-06-25 14:51:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42326.9, 300 sec: 42542.8). Total num frames: 14872428544. Throughput: 0: 42755.0. Samples: 14872617280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 14:51:04,147][15401] Updated weights for policy 0, policy_version 907744 (0.0036) [2024-06-25 14:51:07,176][15401] Updated weights for policy 0, policy_version 907754 (0.0032) [2024-06-25 14:51:08,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14872657920. Throughput: 0: 42803.8. Samples: 14872734640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 14:51:12,182][15401] Updated weights for policy 0, policy_version 907764 (0.0028) [2024-06-25 14:51:13,389][15132] Fps is (10 sec: 47514.8, 60 sec: 42327.1, 300 sec: 42876.1). Total num frames: 14872903680. Throughput: 0: 42781.3. Samples: 14873001780. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 14:51:14,753][15401] Updated weights for policy 0, policy_version 907774 (0.0052) [2024-06-25 14:51:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14873067520. Throughput: 0: 42680.1. Samples: 14873258560. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 14:51:19,695][15401] Updated weights for policy 0, policy_version 907784 (0.0036) [2024-06-25 14:51:22,294][15401] Updated weights for policy 0, policy_version 907794 (0.0032) [2024-06-25 14:51:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14873313280. Throughput: 0: 42692.7. Samples: 14873377280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 14:51:27,202][15401] Updated weights for policy 0, policy_version 907804 (0.0053) [2024-06-25 14:51:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 14873526272. Throughput: 0: 42813.4. Samples: 14873645180. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:28,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 14:51:29,830][15401] Updated weights for policy 0, policy_version 907814 (0.0037) [2024-06-25 14:51:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 14873706496. Throughput: 0: 42861.3. Samples: 14873905260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:33,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 14:51:34,788][15401] Updated weights for policy 0, policy_version 907824 (0.0032) [2024-06-25 14:51:37,534][15401] Updated weights for policy 0, policy_version 907834 (0.0036) [2024-06-25 14:51:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43422.2, 300 sec: 42709.8). Total num frames: 14873968640. Throughput: 0: 42781.3. Samples: 14874023200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 14:51:42,510][15401] Updated weights for policy 0, policy_version 907844 (0.0041) [2024-06-25 14:51:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 14874165248. Throughput: 0: 42853.0. Samples: 14874287520. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 14:51:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907848_14874181632.pth... [2024-06-25 14:51:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907219_14863876096.pth [2024-06-25 14:51:45,930][15401] Updated weights for policy 0, policy_version 907854 (0.0033) [2024-06-25 14:51:48,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14874345472. Throughput: 0: 42901.0. Samples: 14874547820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 14:51:50,162][15349] Signal inference workers to stop experience collection... (220200 times) [2024-06-25 14:51:50,216][15401] InferenceWorker_p0-w0: stopping experience collection (220200 times) [2024-06-25 14:51:50,277][15349] Signal inference workers to resume experience collection... (220200 times) [2024-06-25 14:51:50,278][15401] InferenceWorker_p0-w0: resuming experience collection (220200 times) [2024-06-25 14:51:50,280][15401] Updated weights for policy 0, policy_version 907864 (0.0026) [2024-06-25 14:51:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14874591232. Throughput: 0: 42976.6. Samples: 14874668580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:53,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 14:51:53,510][15401] Updated weights for policy 0, policy_version 907874 (0.0035) [2024-06-25 14:51:57,974][15401] Updated weights for policy 0, policy_version 907884 (0.0037) [2024-06-25 14:51:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 14874804224. Throughput: 0: 42787.2. Samples: 14874927200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:51:58,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 14:52:01,059][15401] Updated weights for policy 0, policy_version 907894 (0.0048) [2024-06-25 14:52:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 14875000832. Throughput: 0: 42648.4. Samples: 14875177740. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-25 14:52:03,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 14:52:05,710][15401] Updated weights for policy 0, policy_version 907904 (0.0031) [2024-06-25 14:52:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42821.4). Total num frames: 14875246592. Throughput: 0: 42812.9. Samples: 14875303860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:08,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 14:52:09,170][15401] Updated weights for policy 0, policy_version 907914 (0.0033) [2024-06-25 14:52:13,254][15401] Updated weights for policy 0, policy_version 907924 (0.0038) [2024-06-25 14:52:13,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14875443200. Throughput: 0: 42604.2. Samples: 14875562360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:13,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 14:52:16,879][15401] Updated weights for policy 0, policy_version 907934 (0.0025) [2024-06-25 14:52:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 14875623424. Throughput: 0: 42433.5. Samples: 14875814760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:18,390][15132] Avg episode reward: [(0, '0.238')] [2024-06-25 14:52:21,301][15401] Updated weights for policy 0, policy_version 907944 (0.0047) [2024-06-25 14:52:23,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 14875869184. Throughput: 0: 42721.5. Samples: 14875945660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 14:52:24,429][15401] Updated weights for policy 0, policy_version 907954 (0.0040) [2024-06-25 14:52:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 14876065792. Throughput: 0: 42454.6. Samples: 14876197980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:28,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-25 14:52:28,903][15401] Updated weights for policy 0, policy_version 907964 (0.0042) [2024-06-25 14:52:32,070][15401] Updated weights for policy 0, policy_version 907974 (0.0022) [2024-06-25 14:52:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14876262400. Throughput: 0: 42422.9. Samples: 14876456840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 14:52:36,248][15401] Updated weights for policy 0, policy_version 907984 (0.0038) [2024-06-25 14:52:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 14876508160. Throughput: 0: 42602.2. Samples: 14876585680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 14:52:40,252][15401] Updated weights for policy 0, policy_version 907994 (0.0030) [2024-06-25 14:52:43,396][15132] Fps is (10 sec: 45845.4, 60 sec: 42593.8, 300 sec: 42653.4). Total num frames: 14876721152. Throughput: 0: 42454.3. Samples: 14876837920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:43,396][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 14:52:44,117][15401] Updated weights for policy 0, policy_version 908004 (0.0036) [2024-06-25 14:52:47,724][15401] Updated weights for policy 0, policy_version 908014 (0.0029) [2024-06-25 14:52:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42654.3). Total num frames: 14876934144. Throughput: 0: 42554.4. Samples: 14877092680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 14:52:51,556][15401] Updated weights for policy 0, policy_version 908024 (0.0049) [2024-06-25 14:52:53,390][15132] Fps is (10 sec: 42625.2, 60 sec: 42598.3, 300 sec: 42821.5). Total num frames: 14877147136. Throughput: 0: 42649.3. Samples: 14877223080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 14:52:55,376][15401] Updated weights for policy 0, policy_version 908034 (0.0040) [2024-06-25 14:52:58,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 14877343744. Throughput: 0: 42658.6. Samples: 14877482100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:52:58,392][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 14:52:58,995][15401] Updated weights for policy 0, policy_version 908044 (0.0034) [2024-06-25 14:53:03,029][15401] Updated weights for policy 0, policy_version 908054 (0.0030) [2024-06-25 14:53:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14877573120. Throughput: 0: 42694.6. Samples: 14877736020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:03,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 14:53:06,749][15401] Updated weights for policy 0, policy_version 908064 (0.0033) [2024-06-25 14:53:08,390][15132] Fps is (10 sec: 44246.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14877786112. Throughput: 0: 42715.5. Samples: 14877867860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 14:53:10,570][15401] Updated weights for policy 0, policy_version 908074 (0.0030) [2024-06-25 14:53:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14877982720. Throughput: 0: 42787.5. Samples: 14878123420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 14:53:14,263][15349] Signal inference workers to stop experience collection... (220250 times) [2024-06-25 14:53:14,307][15401] InferenceWorker_p0-w0: stopping experience collection (220250 times) [2024-06-25 14:53:14,311][15349] Signal inference workers to resume experience collection... (220250 times) [2024-06-25 14:53:14,328][15401] InferenceWorker_p0-w0: resuming experience collection (220250 times) [2024-06-25 14:53:14,336][15401] Updated weights for policy 0, policy_version 908084 (0.0038) [2024-06-25 14:53:18,166][15401] Updated weights for policy 0, policy_version 908094 (0.0037) [2024-06-25 14:53:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14878212096. Throughput: 0: 42781.7. Samples: 14878382020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:18,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 14:53:22,162][15401] Updated weights for policy 0, policy_version 908104 (0.0033) [2024-06-25 14:53:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14878425088. Throughput: 0: 42728.1. Samples: 14878508440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 14:53:25,858][15401] Updated weights for policy 0, policy_version 908114 (0.0028) [2024-06-25 14:53:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14878638080. Throughput: 0: 42930.2. Samples: 14878769500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:28,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 14:53:29,748][15401] Updated weights for policy 0, policy_version 908124 (0.0033) [2024-06-25 14:53:33,327][15401] Updated weights for policy 0, policy_version 908134 (0.0024) [2024-06-25 14:53:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14878867456. Throughput: 0: 42955.9. Samples: 14879025700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 14:53:37,543][15401] Updated weights for policy 0, policy_version 908144 (0.0033) [2024-06-25 14:53:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14879047680. Throughput: 0: 42900.5. Samples: 14879153600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 14:53:40,997][15401] Updated weights for policy 0, policy_version 908154 (0.0035) [2024-06-25 14:53:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 14879293440. Throughput: 0: 42885.7. Samples: 14879411860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 14:53:43,447][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908161_14879309824.pth... [2024-06-25 14:53:43,506][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907535_14869053440.pth [2024-06-25 14:53:45,146][15401] Updated weights for policy 0, policy_version 908164 (0.0039) [2024-06-25 14:53:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42876.2). Total num frames: 14879506432. Throughput: 0: 42876.4. Samples: 14879665460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:48,399][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 14:53:48,942][15401] Updated weights for policy 0, policy_version 908174 (0.0041) [2024-06-25 14:53:52,803][15401] Updated weights for policy 0, policy_version 908184 (0.0032) [2024-06-25 14:53:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14879703040. Throughput: 0: 42864.0. Samples: 14879796740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-25 14:53:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 14:53:56,519][15401] Updated weights for policy 0, policy_version 908194 (0.0036) [2024-06-25 14:53:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43419.3, 300 sec: 42765.0). Total num frames: 14879948800. Throughput: 0: 42898.2. Samples: 14880053840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:53:58,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 14:54:00,717][15401] Updated weights for policy 0, policy_version 908204 (0.0030) [2024-06-25 14:54:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 14880129024. Throughput: 0: 42822.7. Samples: 14880309040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 14:54:04,371][15401] Updated weights for policy 0, policy_version 908214 (0.0040) [2024-06-25 14:54:08,135][15401] Updated weights for policy 0, policy_version 908224 (0.0039) [2024-06-25 14:54:08,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14880342016. Throughput: 0: 42733.3. Samples: 14880431440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:08,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 14:54:11,835][15401] Updated weights for policy 0, policy_version 908234 (0.0034) [2024-06-25 14:54:13,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 14880604160. Throughput: 0: 42800.8. Samples: 14880695540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 14:54:15,530][15401] Updated weights for policy 0, policy_version 908244 (0.0031) [2024-06-25 14:54:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14880784384. Throughput: 0: 42753.4. Samples: 14880949600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 14:54:19,768][15401] Updated weights for policy 0, policy_version 908254 (0.0036) [2024-06-25 14:54:23,295][15401] Updated weights for policy 0, policy_version 908264 (0.0027) [2024-06-25 14:54:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14880997376. Throughput: 0: 42645.9. Samples: 14881072660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 14:54:27,253][15401] Updated weights for policy 0, policy_version 908274 (0.0037) [2024-06-25 14:54:28,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14881226752. Throughput: 0: 42855.6. Samples: 14881340360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 14:54:30,672][15401] Updated weights for policy 0, policy_version 908284 (0.0041) [2024-06-25 14:54:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14881423360. Throughput: 0: 42751.1. Samples: 14881589260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 14:54:34,762][15401] Updated weights for policy 0, policy_version 908294 (0.0034) [2024-06-25 14:54:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14881636352. Throughput: 0: 42596.0. Samples: 14881713560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:38,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 14:54:38,494][15401] Updated weights for policy 0, policy_version 908304 (0.0025) [2024-06-25 14:54:42,446][15401] Updated weights for policy 0, policy_version 908314 (0.0030) [2024-06-25 14:54:42,836][15349] Signal inference workers to stop experience collection... (220300 times) [2024-06-25 14:54:42,837][15349] Signal inference workers to resume experience collection... (220300 times) [2024-06-25 14:54:42,869][15401] InferenceWorker_p0-w0: stopping experience collection (220300 times) [2024-06-25 14:54:42,870][15401] InferenceWorker_p0-w0: resuming experience collection (220300 times) [2024-06-25 14:54:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.3). Total num frames: 14881882112. Throughput: 0: 42786.1. Samples: 14881979220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:43,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 14:54:45,995][15401] Updated weights for policy 0, policy_version 908324 (0.0028) [2024-06-25 14:54:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14882062336. Throughput: 0: 42912.0. Samples: 14882240080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 14:54:49,879][15401] Updated weights for policy 0, policy_version 908334 (0.0024) [2024-06-25 14:54:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 14882258944. Throughput: 0: 42860.3. Samples: 14882360160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 14:54:54,400][15401] Updated weights for policy 0, policy_version 908344 (0.0032) [2024-06-25 14:54:57,401][15401] Updated weights for policy 0, policy_version 908354 (0.0035) [2024-06-25 14:54:58,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 14882537472. Throughput: 0: 42743.7. Samples: 14882619000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:54:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 14:55:01,987][15401] Updated weights for policy 0, policy_version 908364 (0.0039) [2024-06-25 14:55:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14882684928. Throughput: 0: 42784.7. Samples: 14882874920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:03,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 14:55:05,291][15401] Updated weights for policy 0, policy_version 908374 (0.0025) [2024-06-25 14:55:08,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 14882914304. Throughput: 0: 42644.0. Samples: 14882991640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 14:55:09,584][15401] Updated weights for policy 0, policy_version 908384 (0.0040) [2024-06-25 14:55:12,856][15401] Updated weights for policy 0, policy_version 908394 (0.0029) [2024-06-25 14:55:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14883143680. Throughput: 0: 42541.4. Samples: 14883254720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 14:55:17,172][15401] Updated weights for policy 0, policy_version 908404 (0.0039) [2024-06-25 14:55:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14883323904. Throughput: 0: 42761.8. Samples: 14883513540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:18,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 14:55:20,902][15401] Updated weights for policy 0, policy_version 908414 (0.0041) [2024-06-25 14:55:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14883553280. Throughput: 0: 42655.9. Samples: 14883633080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 14:55:25,108][15401] Updated weights for policy 0, policy_version 908424 (0.0033) [2024-06-25 14:55:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 14883766272. Throughput: 0: 42586.3. Samples: 14883895600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:28,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 14:55:28,412][15401] Updated weights for policy 0, policy_version 908434 (0.0035) [2024-06-25 14:55:32,656][15401] Updated weights for policy 0, policy_version 908444 (0.0043) [2024-06-25 14:55:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 14883962880. Throughput: 0: 42340.5. Samples: 14884145400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:33,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 14:55:36,145][15401] Updated weights for policy 0, policy_version 908454 (0.0036) [2024-06-25 14:55:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14884208640. Throughput: 0: 42502.7. Samples: 14884272780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:38,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 14:55:40,112][15401] Updated weights for policy 0, policy_version 908464 (0.0040) [2024-06-25 14:55:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 14884405248. Throughput: 0: 42607.4. Samples: 14884536340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:43,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 14:55:43,457][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908473_14884421632.pth... [2024-06-25 14:55:43,521][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000907848_14874181632.pth [2024-06-25 14:55:43,977][15401] Updated weights for policy 0, policy_version 908474 (0.0037) [2024-06-25 14:55:47,925][15401] Updated weights for policy 0, policy_version 908484 (0.0042) [2024-06-25 14:55:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14884618240. Throughput: 0: 42570.6. Samples: 14884790600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-25 14:55:48,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 14:55:51,593][15401] Updated weights for policy 0, policy_version 908494 (0.0030) [2024-06-25 14:55:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 14884864000. Throughput: 0: 42845.8. Samples: 14884919700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:55:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 14:55:55,437][15401] Updated weights for policy 0, policy_version 908504 (0.0037) [2024-06-25 14:55:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 14885044224. Throughput: 0: 42700.3. Samples: 14885176240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:55:58,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 14:55:58,983][15401] Updated weights for policy 0, policy_version 908514 (0.0031) [2024-06-25 14:56:03,107][15401] Updated weights for policy 0, policy_version 908524 (0.0042) [2024-06-25 14:56:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14885273600. Throughput: 0: 42673.0. Samples: 14885433820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 14:56:06,199][15349] Signal inference workers to stop experience collection... (220350 times) [2024-06-25 14:56:06,253][15349] Signal inference workers to resume experience collection... (220350 times) [2024-06-25 14:56:06,253][15401] InferenceWorker_p0-w0: stopping experience collection (220350 times) [2024-06-25 14:56:06,272][15401] InferenceWorker_p0-w0: resuming experience collection (220350 times) [2024-06-25 14:56:06,581][15401] Updated weights for policy 0, policy_version 908534 (0.0033) [2024-06-25 14:56:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14885486592. Throughput: 0: 42902.3. Samples: 14885563680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:08,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 14:56:10,597][15401] Updated weights for policy 0, policy_version 908544 (0.0049) [2024-06-25 14:56:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14885683200. Throughput: 0: 42816.0. Samples: 14885822320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:13,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 14:56:14,173][15401] Updated weights for policy 0, policy_version 908554 (0.0040) [2024-06-25 14:56:18,229][15401] Updated weights for policy 0, policy_version 908564 (0.0043) [2024-06-25 14:56:18,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14885912576. Throughput: 0: 42979.0. Samples: 14886079460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:18,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 14:56:21,828][15401] Updated weights for policy 0, policy_version 908574 (0.0025) [2024-06-25 14:56:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14886125568. Throughput: 0: 43045.4. Samples: 14886209820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 14:56:25,868][15401] Updated weights for policy 0, policy_version 908584 (0.0036) [2024-06-25 14:56:28,396][15132] Fps is (10 sec: 42570.8, 60 sec: 42866.8, 300 sec: 42819.6). Total num frames: 14886338560. Throughput: 0: 42945.4. Samples: 14886469160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:28,397][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 14:56:29,264][15401] Updated weights for policy 0, policy_version 908594 (0.0035) [2024-06-25 14:56:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 14886551552. Throughput: 0: 42972.1. Samples: 14886724340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:33,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 14:56:33,469][15401] Updated weights for policy 0, policy_version 908604 (0.0033) [2024-06-25 14:56:36,889][15401] Updated weights for policy 0, policy_version 908614 (0.0027) [2024-06-25 14:56:38,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14886764544. Throughput: 0: 42912.9. Samples: 14886850780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:38,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 14:56:41,618][15401] Updated weights for policy 0, policy_version 908624 (0.0036) [2024-06-25 14:56:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14886961152. Throughput: 0: 43045.4. Samples: 14887113280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 14:56:44,495][15401] Updated weights for policy 0, policy_version 908634 (0.0041) [2024-06-25 14:56:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14887190528. Throughput: 0: 42929.4. Samples: 14887365640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:48,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 14:56:49,150][15401] Updated weights for policy 0, policy_version 908644 (0.0028) [2024-06-25 14:56:52,046][15401] Updated weights for policy 0, policy_version 908654 (0.0038) [2024-06-25 14:56:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14887419904. Throughput: 0: 42985.0. Samples: 14887498000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:53,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 14:56:56,690][15401] Updated weights for policy 0, policy_version 908664 (0.0020) [2024-06-25 14:56:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14887600128. Throughput: 0: 42896.9. Samples: 14887752680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:56:58,390][15132] Avg episode reward: [(0, '0.094')] [2024-06-25 14:56:59,856][15401] Updated weights for policy 0, policy_version 908674 (0.0025) [2024-06-25 14:57:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14887829504. Throughput: 0: 42773.3. Samples: 14888004260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:03,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 14:57:04,299][15401] Updated weights for policy 0, policy_version 908684 (0.0032) [2024-06-25 14:57:07,435][15401] Updated weights for policy 0, policy_version 908694 (0.0045) [2024-06-25 14:57:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14888075264. Throughput: 0: 42880.9. Samples: 14888139460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 14:57:12,384][15401] Updated weights for policy 0, policy_version 908704 (0.0032) [2024-06-25 14:57:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14888239104. Throughput: 0: 42768.0. Samples: 14888393440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 14:57:14,160][15349] Signal inference workers to stop experience collection... (220400 times) [2024-06-25 14:57:14,160][15349] Signal inference workers to resume experience collection... (220400 times) [2024-06-25 14:57:14,193][15401] InferenceWorker_p0-w0: stopping experience collection (220400 times) [2024-06-25 14:57:14,193][15401] InferenceWorker_p0-w0: resuming experience collection (220400 times) [2024-06-25 14:57:15,408][15401] Updated weights for policy 0, policy_version 908714 (0.0036) [2024-06-25 14:57:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14888484864. Throughput: 0: 42591.6. Samples: 14888640960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 14:57:19,999][15401] Updated weights for policy 0, policy_version 908724 (0.0027) [2024-06-25 14:57:22,991][15401] Updated weights for policy 0, policy_version 908734 (0.0030) [2024-06-25 14:57:23,392][15132] Fps is (10 sec: 47501.9, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 14888714240. Throughput: 0: 42929.2. Samples: 14888782700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:23,393][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 14:57:27,580][15401] Updated weights for policy 0, policy_version 908744 (0.0039) [2024-06-25 14:57:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42330.0, 300 sec: 42765.0). Total num frames: 14888878080. Throughput: 0: 42748.5. Samples: 14889036960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 14:57:30,876][15401] Updated weights for policy 0, policy_version 908754 (0.0042) [2024-06-25 14:57:33,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14889140224. Throughput: 0: 42520.3. Samples: 14889279060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 14:57:35,242][15401] Updated weights for policy 0, policy_version 908764 (0.0034) [2024-06-25 14:57:38,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 14889336832. Throughput: 0: 42632.4. Samples: 14889416460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:38,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 14:57:38,686][15401] Updated weights for policy 0, policy_version 908774 (0.0026) [2024-06-25 14:57:43,263][15401] Updated weights for policy 0, policy_version 908784 (0.0042) [2024-06-25 14:57:43,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14889517056. Throughput: 0: 42710.7. Samples: 14889674660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-25 14:57:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 14:57:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908785_14889533440.pth... [2024-06-25 14:57:43,509][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908161_14879309824.pth [2024-06-25 14:57:46,256][15401] Updated weights for policy 0, policy_version 908794 (0.0052) [2024-06-25 14:57:48,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43412.9, 300 sec: 42875.2). Total num frames: 14889795584. Throughput: 0: 42525.1. Samples: 14889918160. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:57:48,396][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 14:57:50,888][15401] Updated weights for policy 0, policy_version 908804 (0.0034) [2024-06-25 14:57:53,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 14889975808. Throughput: 0: 42545.3. Samples: 14890054000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:57:53,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 14:57:54,139][15401] Updated weights for policy 0, policy_version 908814 (0.0038) [2024-06-25 14:57:58,389][15132] Fps is (10 sec: 36068.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14890156032. Throughput: 0: 42368.4. Samples: 14890300020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:57:58,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 14:57:58,448][15401] Updated weights for policy 0, policy_version 908824 (0.0041) [2024-06-25 14:58:01,876][15401] Updated weights for policy 0, policy_version 908834 (0.0033) [2024-06-25 14:58:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14890418176. Throughput: 0: 42577.7. Samples: 14890556960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 14:58:06,088][15401] Updated weights for policy 0, policy_version 908844 (0.0033) [2024-06-25 14:58:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14890614784. Throughput: 0: 42492.9. Samples: 14890694780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 14:58:09,364][15401] Updated weights for policy 0, policy_version 908854 (0.0031) [2024-06-25 14:58:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14890811392. Throughput: 0: 42435.9. Samples: 14890946580. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 14:58:13,620][15401] Updated weights for policy 0, policy_version 908864 (0.0035) [2024-06-25 14:58:16,797][15349] Signal inference workers to stop experience collection... (220450 times) [2024-06-25 14:58:16,844][15401] InferenceWorker_p0-w0: stopping experience collection (220450 times) [2024-06-25 14:58:16,857][15349] Signal inference workers to resume experience collection... (220450 times) [2024-06-25 14:58:16,858][15401] InferenceWorker_p0-w0: resuming experience collection (220450 times) [2024-06-25 14:58:17,033][15401] Updated weights for policy 0, policy_version 908874 (0.0036) [2024-06-25 14:58:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14891057152. Throughput: 0: 42656.6. Samples: 14891198600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:18,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 14:58:21,087][15401] Updated weights for policy 0, policy_version 908884 (0.0041) [2024-06-25 14:58:23,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42322.5, 300 sec: 42764.1). Total num frames: 14891253760. Throughput: 0: 42605.0. Samples: 14891333960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:23,397][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 14:58:24,443][15401] Updated weights for policy 0, policy_version 908894 (0.0046) [2024-06-25 14:58:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14891466752. Throughput: 0: 42492.4. Samples: 14891586820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:28,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 14:58:28,670][15401] Updated weights for policy 0, policy_version 908904 (0.0032) [2024-06-25 14:58:31,955][15401] Updated weights for policy 0, policy_version 908914 (0.0031) [2024-06-25 14:58:33,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14891679744. Throughput: 0: 42859.4. Samples: 14891846560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:33,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 14:58:36,189][15401] Updated weights for policy 0, policy_version 908924 (0.0031) [2024-06-25 14:58:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14891909120. Throughput: 0: 42752.0. Samples: 14891977840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:38,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 14:58:39,761][15401] Updated weights for policy 0, policy_version 908934 (0.0043) [2024-06-25 14:58:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43415.8, 300 sec: 42764.7). Total num frames: 14892122112. Throughput: 0: 42943.0. Samples: 14892232560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:43,392][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 14:58:44,278][15401] Updated weights for policy 0, policy_version 908944 (0.0042) [2024-06-25 14:58:47,603][15401] Updated weights for policy 0, policy_version 908954 (0.0033) [2024-06-25 14:58:48,390][15132] Fps is (10 sec: 42596.2, 60 sec: 42329.5, 300 sec: 42820.5). Total num frames: 14892335104. Throughput: 0: 42959.1. Samples: 14892490140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:48,391][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 14:58:51,771][15401] Updated weights for policy 0, policy_version 908964 (0.0032) [2024-06-25 14:58:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14892531712. Throughput: 0: 42751.5. Samples: 14892618600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 14:58:55,247][15401] Updated weights for policy 0, policy_version 908974 (0.0032) [2024-06-25 14:58:58,389][15132] Fps is (10 sec: 42600.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 14892761088. Throughput: 0: 42773.4. Samples: 14892871380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:58:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 14:58:59,379][15401] Updated weights for policy 0, policy_version 908984 (0.0038) [2024-06-25 14:59:02,868][15401] Updated weights for policy 0, policy_version 908994 (0.0039) [2024-06-25 14:59:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14892974080. Throughput: 0: 42899.6. Samples: 14893129080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:03,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 14:59:06,858][15401] Updated weights for policy 0, policy_version 909004 (0.0041) [2024-06-25 14:59:08,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14893154304. Throughput: 0: 42711.8. Samples: 14893255720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:08,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-25 14:59:10,444][15401] Updated weights for policy 0, policy_version 909014 (0.0040) [2024-06-25 14:59:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14893400064. Throughput: 0: 42771.1. Samples: 14893511520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 14:59:14,462][15401] Updated weights for policy 0, policy_version 909024 (0.0033) [2024-06-25 14:59:18,310][15401] Updated weights for policy 0, policy_version 909034 (0.0026) [2024-06-25 14:59:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14893613056. Throughput: 0: 42593.8. Samples: 14893763280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:18,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 14:59:22,116][15401] Updated weights for policy 0, policy_version 909044 (0.0022) [2024-06-25 14:59:23,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42328.1, 300 sec: 42598.0). Total num frames: 14893793280. Throughput: 0: 42506.2. Samples: 14893890720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:23,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 14:59:26,027][15401] Updated weights for policy 0, policy_version 909054 (0.0031) [2024-06-25 14:59:28,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42870.6, 300 sec: 42764.8). Total num frames: 14894039040. Throughput: 0: 42577.1. Samples: 14894148480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:28,391][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 14:59:29,760][15401] Updated weights for policy 0, policy_version 909064 (0.0043) [2024-06-25 14:59:33,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14894235648. Throughput: 0: 42676.4. Samples: 14894410560. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:33,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 14:59:33,757][15401] Updated weights for policy 0, policy_version 909074 (0.0041) [2024-06-25 14:59:37,506][15401] Updated weights for policy 0, policy_version 909084 (0.0031) [2024-06-25 14:59:38,390][15132] Fps is (10 sec: 40964.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14894448640. Throughput: 0: 42467.1. Samples: 14894529620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-25 14:59:38,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 14:59:41,295][15349] Signal inference workers to stop experience collection... (220500 times) [2024-06-25 14:59:41,296][15349] Signal inference workers to resume experience collection... (220500 times) [2024-06-25 14:59:41,320][15401] InferenceWorker_p0-w0: stopping experience collection (220500 times) [2024-06-25 14:59:41,352][15401] InferenceWorker_p0-w0: resuming experience collection (220500 times) [2024-06-25 14:59:41,439][15401] Updated weights for policy 0, policy_version 909094 (0.0031) [2024-06-25 14:59:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 14894678016. Throughput: 0: 42633.2. Samples: 14894789880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 14:59:43,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 14:59:43,423][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909100_14894694400.pth... [2024-06-25 14:59:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908473_14884421632.pth [2024-06-25 14:59:44,979][15401] Updated weights for policy 0, policy_version 909104 (0.0032) [2024-06-25 14:59:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.7, 300 sec: 42765.0). Total num frames: 14894874624. Throughput: 0: 42674.2. Samples: 14895049420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 14:59:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 14:59:49,269][15401] Updated weights for policy 0, policy_version 909114 (0.0033) [2024-06-25 14:59:52,673][15401] Updated weights for policy 0, policy_version 909124 (0.0034) [2024-06-25 14:59:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14895104000. Throughput: 0: 42545.0. Samples: 14895170240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 14:59:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 14:59:56,859][15401] Updated weights for policy 0, policy_version 909134 (0.0030) [2024-06-25 14:59:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 14895316992. Throughput: 0: 42515.5. Samples: 14895424720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 14:59:58,390][15132] Avg episode reward: [(0, '0.272')] [2024-06-25 15:00:00,422][15401] Updated weights for policy 0, policy_version 909144 (0.0039) [2024-06-25 15:00:03,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 14895480832. Throughput: 0: 42667.9. Samples: 14895683340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:03,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 15:00:04,963][15401] Updated weights for policy 0, policy_version 909154 (0.0022) [2024-06-25 15:00:08,002][15401] Updated weights for policy 0, policy_version 909164 (0.0045) [2024-06-25 15:00:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14895742976. Throughput: 0: 42580.5. Samples: 14895806740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 15:00:12,445][15401] Updated weights for policy 0, policy_version 909174 (0.0028) [2024-06-25 15:00:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14895939584. Throughput: 0: 42617.9. Samples: 14896066240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 15:00:15,890][15401] Updated weights for policy 0, policy_version 909184 (0.0043) [2024-06-25 15:00:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 14896136192. Throughput: 0: 42488.1. Samples: 14896322520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 15:00:20,282][15401] Updated weights for policy 0, policy_version 909194 (0.0036) [2024-06-25 15:00:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 14896381952. Throughput: 0: 42639.7. Samples: 14896448400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:23,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 15:00:23,473][15401] Updated weights for policy 0, policy_version 909204 (0.0034) [2024-06-25 15:00:27,843][15401] Updated weights for policy 0, policy_version 909214 (0.0037) [2024-06-25 15:00:28,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42599.3, 300 sec: 42820.6). Total num frames: 14896594944. Throughput: 0: 42569.9. Samples: 14896705520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:28,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 15:00:31,428][15401] Updated weights for policy 0, policy_version 909224 (0.0032) [2024-06-25 15:00:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14896791552. Throughput: 0: 42445.3. Samples: 14896959460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 15:00:35,444][15401] Updated weights for policy 0, policy_version 909234 (0.0045) [2024-06-25 15:00:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14897020928. Throughput: 0: 42442.6. Samples: 14897080160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 15:00:39,349][15401] Updated weights for policy 0, policy_version 909244 (0.0042) [2024-06-25 15:00:43,298][15401] Updated weights for policy 0, policy_version 909254 (0.0034) [2024-06-25 15:00:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14897217536. Throughput: 0: 42568.0. Samples: 14897340280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 15:00:47,034][15401] Updated weights for policy 0, policy_version 909264 (0.0037) [2024-06-25 15:00:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14897430528. Throughput: 0: 42342.7. Samples: 14897588760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 15:00:51,000][15401] Updated weights for policy 0, policy_version 909274 (0.0042) [2024-06-25 15:00:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14897659904. Throughput: 0: 42518.1. Samples: 14897720060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 15:00:54,912][15401] Updated weights for policy 0, policy_version 909284 (0.0035) [2024-06-25 15:00:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 14897840128. Throughput: 0: 42495.2. Samples: 14897978520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:00:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 15:00:58,719][15401] Updated weights for policy 0, policy_version 909294 (0.0033) [2024-06-25 15:01:01,676][15349] Signal inference workers to stop experience collection... (220550 times) [2024-06-25 15:01:01,677][15349] Signal inference workers to resume experience collection... (220550 times) [2024-06-25 15:01:01,720][15401] InferenceWorker_p0-w0: stopping experience collection (220550 times) [2024-06-25 15:01:01,720][15401] InferenceWorker_p0-w0: resuming experience collection (220550 times) [2024-06-25 15:01:02,412][15401] Updated weights for policy 0, policy_version 909304 (0.0044) [2024-06-25 15:01:03,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 14898069504. Throughput: 0: 42472.0. Samples: 14898233860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:03,392][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 15:01:06,283][15401] Updated weights for policy 0, policy_version 909314 (0.0039) [2024-06-25 15:01:08,395][15132] Fps is (10 sec: 44214.6, 60 sec: 42321.8, 300 sec: 42708.7). Total num frames: 14898282496. Throughput: 0: 42449.4. Samples: 14898358840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:08,395][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 15:01:09,847][15401] Updated weights for policy 0, policy_version 909324 (0.0035) [2024-06-25 15:01:13,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 14898479104. Throughput: 0: 42416.9. Samples: 14898614280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:13,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 15:01:13,897][15401] Updated weights for policy 0, policy_version 909334 (0.0044) [2024-06-25 15:01:17,984][15401] Updated weights for policy 0, policy_version 909344 (0.0054) [2024-06-25 15:01:18,389][15132] Fps is (10 sec: 40980.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14898692096. Throughput: 0: 42384.5. Samples: 14898866760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:18,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 15:01:21,592][15401] Updated weights for policy 0, policy_version 909354 (0.0036) [2024-06-25 15:01:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42599.3). Total num frames: 14898905088. Throughput: 0: 42538.4. Samples: 14898994380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:23,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 15:01:25,720][15401] Updated weights for policy 0, policy_version 909364 (0.0036) [2024-06-25 15:01:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 14899101696. Throughput: 0: 42415.1. Samples: 14899248960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:28,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 15:01:29,294][15401] Updated weights for policy 0, policy_version 909374 (0.0024) [2024-06-25 15:01:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14899331072. Throughput: 0: 42677.8. Samples: 14899509260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-25 15:01:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 15:01:33,524][15401] Updated weights for policy 0, policy_version 909384 (0.0024) [2024-06-25 15:01:36,978][15401] Updated weights for policy 0, policy_version 909394 (0.0027) [2024-06-25 15:01:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 14899544064. Throughput: 0: 42567.6. Samples: 14899635600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:01:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 15:01:41,312][15401] Updated weights for policy 0, policy_version 909404 (0.0028) [2024-06-25 15:01:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14899757056. Throughput: 0: 42550.7. Samples: 14899893300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:01:43,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 15:01:43,494][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909410_14899773440.pth... [2024-06-25 15:01:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000908785_14889533440.pth [2024-06-25 15:01:44,638][15401] Updated weights for policy 0, policy_version 909414 (0.0025) [2024-06-25 15:01:48,392][15132] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 14899986432. Throughput: 0: 42475.6. Samples: 14900145260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:01:48,393][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 15:01:48,767][15401] Updated weights for policy 0, policy_version 909424 (0.0040) [2024-06-25 15:01:52,224][15401] Updated weights for policy 0, policy_version 909434 (0.0030) [2024-06-25 15:01:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14900199424. Throughput: 0: 42654.5. Samples: 14900278080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:01:53,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 15:01:56,496][15401] Updated weights for policy 0, policy_version 909444 (0.0030) [2024-06-25 15:01:58,391][15132] Fps is (10 sec: 42601.8, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 14900412416. Throughput: 0: 42762.4. Samples: 14900538660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:01:58,392][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 15:01:59,851][15401] Updated weights for policy 0, policy_version 909454 (0.0028) [2024-06-25 15:02:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 14900625408. Throughput: 0: 42779.1. Samples: 14900791820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:03,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 15:02:04,143][15401] Updated weights for policy 0, policy_version 909464 (0.0025) [2024-06-25 15:02:07,410][15401] Updated weights for policy 0, policy_version 909474 (0.0036) [2024-06-25 15:02:08,389][15132] Fps is (10 sec: 44244.1, 60 sec: 42875.1, 300 sec: 42765.0). Total num frames: 14900854784. Throughput: 0: 42926.7. Samples: 14900926080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 15:02:11,679][15401] Updated weights for policy 0, policy_version 909484 (0.0032) [2024-06-25 15:02:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 14901035008. Throughput: 0: 42861.2. Samples: 14901177720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:13,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 15:02:15,018][15401] Updated weights for policy 0, policy_version 909494 (0.0030) [2024-06-25 15:02:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 14901264384. Throughput: 0: 42779.1. Samples: 14901434320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:18,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 15:02:19,124][15401] Updated weights for policy 0, policy_version 909504 (0.0037) [2024-06-25 15:02:22,886][15401] Updated weights for policy 0, policy_version 909514 (0.0027) [2024-06-25 15:02:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14901493760. Throughput: 0: 43036.5. Samples: 14901572240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:23,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 15:02:26,747][15401] Updated weights for policy 0, policy_version 909524 (0.0029) [2024-06-25 15:02:26,752][15349] Signal inference workers to stop experience collection... (220600 times) [2024-06-25 15:02:26,752][15349] Signal inference workers to resume experience collection... (220600 times) [2024-06-25 15:02:26,767][15401] InferenceWorker_p0-w0: stopping experience collection (220600 times) [2024-06-25 15:02:26,772][15401] InferenceWorker_p0-w0: resuming experience collection (220600 times) [2024-06-25 15:02:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 14901673984. Throughput: 0: 42852.9. Samples: 14901821680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:28,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:02:30,643][15401] Updated weights for policy 0, policy_version 909534 (0.0038) [2024-06-25 15:02:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14901936128. Throughput: 0: 42899.9. Samples: 14902075660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:33,391][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 15:02:34,272][15401] Updated weights for policy 0, policy_version 909544 (0.0027) [2024-06-25 15:02:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14902116352. Throughput: 0: 43050.3. Samples: 14902215340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:38,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 15:02:38,440][15401] Updated weights for policy 0, policy_version 909554 (0.0029) [2024-06-25 15:02:41,675][15401] Updated weights for policy 0, policy_version 909564 (0.0034) [2024-06-25 15:02:43,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42488.2). Total num frames: 14902329344. Throughput: 0: 42789.0. Samples: 14902464100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 15:02:46,001][15401] Updated weights for policy 0, policy_version 909574 (0.0035) [2024-06-25 15:02:48,392][15132] Fps is (10 sec: 45863.7, 60 sec: 43144.5, 300 sec: 42709.1). Total num frames: 14902575104. Throughput: 0: 42724.0. Samples: 14902714500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:48,393][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 15:02:49,732][15401] Updated weights for policy 0, policy_version 909584 (0.0033) [2024-06-25 15:02:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14902771712. Throughput: 0: 42882.1. Samples: 14902855780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:53,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 15:02:53,592][15401] Updated weights for policy 0, policy_version 909594 (0.0038) [2024-06-25 15:02:57,381][15401] Updated weights for policy 0, policy_version 909604 (0.0041) [2024-06-25 15:02:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42599.6, 300 sec: 42542.9). Total num frames: 14902968320. Throughput: 0: 42892.6. Samples: 14903107880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:02:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 15:03:01,102][15401] Updated weights for policy 0, policy_version 909614 (0.0037) [2024-06-25 15:03:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14903230464. Throughput: 0: 42874.6. Samples: 14903363680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:03,394][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 15:03:04,847][15401] Updated weights for policy 0, policy_version 909624 (0.0044) [2024-06-25 15:03:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14903394304. Throughput: 0: 42804.6. Samples: 14903498440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:08,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 15:03:09,110][15401] Updated weights for policy 0, policy_version 909634 (0.0030) [2024-06-25 15:03:12,281][15401] Updated weights for policy 0, policy_version 909644 (0.0031) [2024-06-25 15:03:13,392][15132] Fps is (10 sec: 39312.0, 60 sec: 43142.9, 300 sec: 42598.0). Total num frames: 14903623680. Throughput: 0: 42931.0. Samples: 14903753680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:13,393][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 15:03:16,710][15401] Updated weights for policy 0, policy_version 909654 (0.0042) [2024-06-25 15:03:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.7, 300 sec: 42766.0). Total num frames: 14903869440. Throughput: 0: 42791.3. Samples: 14904001260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:03:20,418][15401] Updated weights for policy 0, policy_version 909664 (0.0035) [2024-06-25 15:03:23,389][15132] Fps is (10 sec: 42609.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 14904049664. Throughput: 0: 42742.7. Samples: 14904138760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:23,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 15:03:24,136][15401] Updated weights for policy 0, policy_version 909674 (0.0032) [2024-06-25 15:03:27,728][15401] Updated weights for policy 0, policy_version 909684 (0.0034) [2024-06-25 15:03:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 14904295424. Throughput: 0: 42904.5. Samples: 14904394800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-25 15:03:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 15:03:31,758][15401] Updated weights for policy 0, policy_version 909694 (0.0028) [2024-06-25 15:03:33,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14904524800. Throughput: 0: 43151.2. Samples: 14904656200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:33,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 15:03:35,094][15401] Updated weights for policy 0, policy_version 909704 (0.0046) [2024-06-25 15:03:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42598.7). Total num frames: 14904688640. Throughput: 0: 42904.9. Samples: 14904786500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:38,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 15:03:39,396][15401] Updated weights for policy 0, policy_version 909714 (0.0029) [2024-06-25 15:03:41,812][15349] Signal inference workers to stop experience collection... (220650 times) [2024-06-25 15:03:41,812][15349] Signal inference workers to resume experience collection... (220650 times) [2024-06-25 15:03:41,851][15401] InferenceWorker_p0-w0: stopping experience collection (220650 times) [2024-06-25 15:03:41,851][15401] InferenceWorker_p0-w0: resuming experience collection (220650 times) [2024-06-25 15:03:42,710][15401] Updated weights for policy 0, policy_version 909724 (0.0036) [2024-06-25 15:03:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 14904934400. Throughput: 0: 42913.6. Samples: 14905039000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:43,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 15:03:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909725_14904934400.pth... [2024-06-25 15:03:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909100_14894694400.pth [2024-06-25 15:03:47,163][15401] Updated weights for policy 0, policy_version 909734 (0.0037) [2024-06-25 15:03:48,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 14905131008. Throughput: 0: 42925.0. Samples: 14905295300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 15:03:50,188][15401] Updated weights for policy 0, policy_version 909744 (0.0043) [2024-06-25 15:03:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14905327616. Throughput: 0: 42811.5. Samples: 14905424960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 15:03:54,803][15401] Updated weights for policy 0, policy_version 909754 (0.0036) [2024-06-25 15:03:58,320][15401] Updated weights for policy 0, policy_version 909764 (0.0038) [2024-06-25 15:03:58,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 14905573376. Throughput: 0: 42788.5. Samples: 14905679060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:03:58,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:04:02,349][15401] Updated weights for policy 0, policy_version 909774 (0.0036) [2024-06-25 15:04:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14905786368. Throughput: 0: 43069.6. Samples: 14905939400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 15:04:05,822][15401] Updated weights for policy 0, policy_version 909784 (0.0035) [2024-06-25 15:04:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14905982976. Throughput: 0: 42819.5. Samples: 14906065640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:08,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 15:04:10,118][15401] Updated weights for policy 0, policy_version 909794 (0.0032) [2024-06-25 15:04:13,281][15401] Updated weights for policy 0, policy_version 909804 (0.0036) [2024-06-25 15:04:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 14906228736. Throughput: 0: 42905.3. Samples: 14906325540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:13,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 15:04:18,016][15401] Updated weights for policy 0, policy_version 909814 (0.0037) [2024-06-25 15:04:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 14906425344. Throughput: 0: 42926.2. Samples: 14906587880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 15:04:20,822][15401] Updated weights for policy 0, policy_version 909824 (0.0034) [2024-06-25 15:04:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42654.1). Total num frames: 14906621952. Throughput: 0: 42726.8. Samples: 14906709200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:23,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 15:04:25,470][15401] Updated weights for policy 0, policy_version 909834 (0.0039) [2024-06-25 15:04:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14906867712. Throughput: 0: 42991.8. Samples: 14906973620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:28,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 15:04:28,430][15401] Updated weights for policy 0, policy_version 909844 (0.0037) [2024-06-25 15:04:33,339][15401] Updated weights for policy 0, policy_version 909854 (0.0043) [2024-06-25 15:04:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 14907047936. Throughput: 0: 43091.0. Samples: 14907234400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 15:04:36,268][15401] Updated weights for policy 0, policy_version 909864 (0.0041) [2024-06-25 15:04:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14907277312. Throughput: 0: 42872.8. Samples: 14907354240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:38,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 15:04:40,656][15401] Updated weights for policy 0, policy_version 909874 (0.0033) [2024-06-25 15:04:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 14907523072. Throughput: 0: 43135.2. Samples: 14907620140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:43,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 15:04:43,942][15401] Updated weights for policy 0, policy_version 909884 (0.0041) [2024-06-25 15:04:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14907686912. Throughput: 0: 43041.8. Samples: 14907876280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 15:04:48,790][15401] Updated weights for policy 0, policy_version 909894 (0.0033) [2024-06-25 15:04:52,364][15401] Updated weights for policy 0, policy_version 909904 (0.0026) [2024-06-25 15:04:53,390][15132] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 14907916288. Throughput: 0: 42866.1. Samples: 14907994620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:53,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 15:04:56,189][15401] Updated weights for policy 0, policy_version 909914 (0.0039) [2024-06-25 15:04:58,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 14908145664. Throughput: 0: 42896.8. Samples: 14908255900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:04:58,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 15:04:59,834][15401] Updated weights for policy 0, policy_version 909924 (0.0041) [2024-06-25 15:05:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 14908342272. Throughput: 0: 42795.6. Samples: 14908513680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:05:03,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 15:05:03,631][15401] Updated weights for policy 0, policy_version 909934 (0.0032) [2024-06-25 15:05:04,682][15349] Signal inference workers to stop experience collection... (220700 times) [2024-06-25 15:05:04,682][15349] Signal inference workers to resume experience collection... (220700 times) [2024-06-25 15:05:04,709][15401] InferenceWorker_p0-w0: stopping experience collection (220700 times) [2024-06-25 15:05:04,709][15401] InferenceWorker_p0-w0: resuming experience collection (220700 times) [2024-06-25 15:05:07,286][15401] Updated weights for policy 0, policy_version 909944 (0.0027) [2024-06-25 15:05:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14908555264. Throughput: 0: 43004.4. Samples: 14908644400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:05:08,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 15:05:11,147][15401] Updated weights for policy 0, policy_version 909954 (0.0034) [2024-06-25 15:05:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14908784640. Throughput: 0: 42858.7. Samples: 14908902260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:05:13,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 15:05:14,681][15401] Updated weights for policy 0, policy_version 909964 (0.0035) [2024-06-25 15:05:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14908997632. Throughput: 0: 42829.7. Samples: 14909161740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:05:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 15:05:18,645][15401] Updated weights for policy 0, policy_version 909974 (0.0036) [2024-06-25 15:05:22,571][15401] Updated weights for policy 0, policy_version 909984 (0.0026) [2024-06-25 15:05:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14909210624. Throughput: 0: 42913.9. Samples: 14909285360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 15:05:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 15:05:26,198][15401] Updated weights for policy 0, policy_version 909994 (0.0046) [2024-06-25 15:05:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14909440000. Throughput: 0: 42836.0. Samples: 14909547760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:28,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 15:05:29,965][15401] Updated weights for policy 0, policy_version 910004 (0.0043) [2024-06-25 15:05:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14909620224. Throughput: 0: 42903.7. Samples: 14909806940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:33,398][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 15:05:34,076][15401] Updated weights for policy 0, policy_version 910014 (0.0043) [2024-06-25 15:05:37,469][15401] Updated weights for policy 0, policy_version 910024 (0.0038) [2024-06-25 15:05:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14909849600. Throughput: 0: 43018.9. Samples: 14909930460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:38,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 15:05:41,564][15401] Updated weights for policy 0, policy_version 910034 (0.0047) [2024-06-25 15:05:43,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14910078976. Throughput: 0: 43088.5. Samples: 14910194880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:43,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 15:05:43,497][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910040_14910095360.pth... [2024-06-25 15:05:43,560][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909410_14899773440.pth [2024-06-25 15:05:45,360][15401] Updated weights for policy 0, policy_version 910044 (0.0040) [2024-06-25 15:05:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 14910291968. Throughput: 0: 43034.6. Samples: 14910450240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:48,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 15:05:49,024][15401] Updated weights for policy 0, policy_version 910054 (0.0058) [2024-06-25 15:05:52,923][15401] Updated weights for policy 0, policy_version 910064 (0.0032) [2024-06-25 15:05:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 14910521344. Throughput: 0: 42916.0. Samples: 14910575620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 15:05:56,674][15401] Updated weights for policy 0, policy_version 910074 (0.0037) [2024-06-25 15:05:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 14910717952. Throughput: 0: 42893.2. Samples: 14910832460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:05:58,390][15132] Avg episode reward: [(0, '0.226')] [2024-06-25 15:06:00,810][15401] Updated weights for policy 0, policy_version 910084 (0.0030) [2024-06-25 15:06:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.8). Total num frames: 14910930944. Throughput: 0: 42708.1. Samples: 14911083600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:03,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 15:06:04,426][15401] Updated weights for policy 0, policy_version 910094 (0.0037) [2024-06-25 15:06:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14911127552. Throughput: 0: 42871.1. Samples: 14911214560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:08,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 15:06:08,418][15401] Updated weights for policy 0, policy_version 910104 (0.0028) [2024-06-25 15:06:11,850][15401] Updated weights for policy 0, policy_version 910114 (0.0037) [2024-06-25 15:06:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14911340544. Throughput: 0: 42695.9. Samples: 14911469080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:13,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 15:06:15,933][15401] Updated weights for policy 0, policy_version 910124 (0.0036) [2024-06-25 15:06:18,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43140.1, 300 sec: 42986.2). Total num frames: 14911586304. Throughput: 0: 42709.9. Samples: 14911729160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:18,396][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 15:06:20,013][15401] Updated weights for policy 0, policy_version 910134 (0.0036) [2024-06-25 15:06:21,758][15349] Signal inference workers to stop experience collection... (220750 times) [2024-06-25 15:06:21,765][15349] Signal inference workers to resume experience collection... (220750 times) [2024-06-25 15:06:21,804][15401] InferenceWorker_p0-w0: stopping experience collection (220750 times) [2024-06-25 15:06:21,804][15401] InferenceWorker_p0-w0: resuming experience collection (220750 times) [2024-06-25 15:06:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 14911782912. Throughput: 0: 42873.3. Samples: 14911859760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 15:06:23,487][15401] Updated weights for policy 0, policy_version 910144 (0.0039) [2024-06-25 15:06:27,646][15401] Updated weights for policy 0, policy_version 910154 (0.0027) [2024-06-25 15:06:28,389][15132] Fps is (10 sec: 40986.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 14911995904. Throughput: 0: 42620.5. Samples: 14912112800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:28,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 15:06:30,991][15401] Updated weights for policy 0, policy_version 910164 (0.0034) [2024-06-25 15:06:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 14912225280. Throughput: 0: 42733.7. Samples: 14912373260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 15:06:35,270][15401] Updated weights for policy 0, policy_version 910174 (0.0045) [2024-06-25 15:06:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14912421888. Throughput: 0: 42845.9. Samples: 14912503680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 15:06:38,963][15401] Updated weights for policy 0, policy_version 910184 (0.0029) [2024-06-25 15:06:43,124][15401] Updated weights for policy 0, policy_version 910194 (0.0038) [2024-06-25 15:06:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 14912634880. Throughput: 0: 42835.7. Samples: 14912760060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:43,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 15:06:46,880][15401] Updated weights for policy 0, policy_version 910204 (0.0040) [2024-06-25 15:06:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14912847872. Throughput: 0: 42884.9. Samples: 14913013420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:48,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 15:06:50,565][15401] Updated weights for policy 0, policy_version 910214 (0.0031) [2024-06-25 15:06:53,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42876.3). Total num frames: 14913060864. Throughput: 0: 42746.9. Samples: 14913138180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 15:06:54,514][15401] Updated weights for policy 0, policy_version 910224 (0.0028) [2024-06-25 15:06:58,378][15401] Updated weights for policy 0, policy_version 910234 (0.0033) [2024-06-25 15:06:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14913273856. Throughput: 0: 42833.9. Samples: 14913396600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:06:58,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 15:07:02,173][15401] Updated weights for policy 0, policy_version 910244 (0.0035) [2024-06-25 15:07:03,396][15132] Fps is (10 sec: 42571.4, 60 sec: 42593.8, 300 sec: 42819.6). Total num frames: 14913486848. Throughput: 0: 42791.9. Samples: 14913654800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:07:03,396][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 15:07:05,949][15401] Updated weights for policy 0, policy_version 910254 (0.0039) [2024-06-25 15:07:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 14913716224. Throughput: 0: 42788.7. Samples: 14913785260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:07:08,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 15:07:09,816][15401] Updated weights for policy 0, policy_version 910264 (0.0033) [2024-06-25 15:07:13,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14913912832. Throughput: 0: 42770.2. Samples: 14914037460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:07:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 15:07:13,576][15401] Updated weights for policy 0, policy_version 910274 (0.0040) [2024-06-25 15:07:17,571][15401] Updated weights for policy 0, policy_version 910284 (0.0044) [2024-06-25 15:07:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42329.9, 300 sec: 42820.6). Total num frames: 14914125824. Throughput: 0: 42734.0. Samples: 14914296280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 15:07:18,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 15:07:21,275][15401] Updated weights for policy 0, policy_version 910294 (0.0039) [2024-06-25 15:07:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 14914338816. Throughput: 0: 42555.9. Samples: 14914418700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 15:07:25,381][15401] Updated weights for policy 0, policy_version 910304 (0.0044) [2024-06-25 15:07:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14914568192. Throughput: 0: 42642.6. Samples: 14914678980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 15:07:28,676][15401] Updated weights for policy 0, policy_version 910314 (0.0035) [2024-06-25 15:07:32,953][15401] Updated weights for policy 0, policy_version 910324 (0.0030) [2024-06-25 15:07:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 14914764800. Throughput: 0: 42605.2. Samples: 14914930660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:33,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 15:07:36,708][15401] Updated weights for policy 0, policy_version 910334 (0.0033) [2024-06-25 15:07:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 14914977792. Throughput: 0: 42736.5. Samples: 14915061320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:38,396][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 15:07:40,630][15401] Updated weights for policy 0, policy_version 910344 (0.0027) [2024-06-25 15:07:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 14915190784. Throughput: 0: 42736.4. Samples: 14915319740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:43,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 15:07:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910352_14915207168.pth... [2024-06-25 15:07:43,535][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000909725_14904934400.pth [2024-06-25 15:07:44,307][15401] Updated weights for policy 0, policy_version 910354 (0.0038) [2024-06-25 15:07:48,235][15401] Updated weights for policy 0, policy_version 910364 (0.0042) [2024-06-25 15:07:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14915403776. Throughput: 0: 42840.8. Samples: 14915582360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 15:07:51,898][15401] Updated weights for policy 0, policy_version 910374 (0.0044) [2024-06-25 15:07:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 14915633152. Throughput: 0: 42671.2. Samples: 14915705460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 15:07:53,769][15349] Signal inference workers to stop experience collection... (220800 times) [2024-06-25 15:07:53,813][15401] InferenceWorker_p0-w0: stopping experience collection (220800 times) [2024-06-25 15:07:53,818][15349] Signal inference workers to resume experience collection... (220800 times) [2024-06-25 15:07:53,828][15401] InferenceWorker_p0-w0: resuming experience collection (220800 times) [2024-06-25 15:07:55,959][15401] Updated weights for policy 0, policy_version 910384 (0.0029) [2024-06-25 15:07:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14915846144. Throughput: 0: 42819.5. Samples: 14915964340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:07:58,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 15:07:59,528][15401] Updated weights for policy 0, policy_version 910394 (0.0034) [2024-06-25 15:08:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42603.0, 300 sec: 42876.1). Total num frames: 14916042752. Throughput: 0: 43008.4. Samples: 14916231660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:03,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 15:08:03,510][15401] Updated weights for policy 0, policy_version 910404 (0.0031) [2024-06-25 15:08:06,977][15401] Updated weights for policy 0, policy_version 910414 (0.0034) [2024-06-25 15:08:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42876.5). Total num frames: 14916272128. Throughput: 0: 42998.7. Samples: 14916353640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:08,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 15:08:11,211][15401] Updated weights for policy 0, policy_version 910424 (0.0033) [2024-06-25 15:08:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14916485120. Throughput: 0: 42914.5. Samples: 14916610140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:13,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 15:08:14,532][15401] Updated weights for policy 0, policy_version 910434 (0.0037) [2024-06-25 15:08:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14916681728. Throughput: 0: 42974.2. Samples: 14916864500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 15:08:18,949][15401] Updated weights for policy 0, policy_version 910444 (0.0031) [2024-06-25 15:08:22,247][15401] Updated weights for policy 0, policy_version 910454 (0.0037) [2024-06-25 15:08:23,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14916927488. Throughput: 0: 42916.4. Samples: 14916992660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:23,392][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 15:08:26,627][15401] Updated weights for policy 0, policy_version 910464 (0.0034) [2024-06-25 15:08:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14917124096. Throughput: 0: 43065.4. Samples: 14917257680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:28,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 15:08:29,887][15401] Updated weights for policy 0, policy_version 910474 (0.0039) [2024-06-25 15:08:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14917337088. Throughput: 0: 42858.0. Samples: 14917510980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:33,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 15:08:34,214][15401] Updated weights for policy 0, policy_version 910484 (0.0039) [2024-06-25 15:08:37,555][15401] Updated weights for policy 0, policy_version 910494 (0.0035) [2024-06-25 15:08:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14917566464. Throughput: 0: 42778.6. Samples: 14917630500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 15:08:42,154][15401] Updated weights for policy 0, policy_version 910504 (0.0039) [2024-06-25 15:08:43,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14917746688. Throughput: 0: 42868.1. Samples: 14917893400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 15:08:45,165][15401] Updated weights for policy 0, policy_version 910514 (0.0032) [2024-06-25 15:08:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14917976064. Throughput: 0: 42412.4. Samples: 14918140220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:48,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 15:08:50,117][15401] Updated weights for policy 0, policy_version 910524 (0.0042) [2024-06-25 15:08:53,116][15401] Updated weights for policy 0, policy_version 910534 (0.0037) [2024-06-25 15:08:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14918205440. Throughput: 0: 42610.6. Samples: 14918271120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:53,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 15:08:57,795][15401] Updated weights for policy 0, policy_version 910544 (0.0033) [2024-06-25 15:08:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 14918369280. Throughput: 0: 42420.1. Samples: 14918519040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:08:58,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 15:09:00,748][15401] Updated weights for policy 0, policy_version 910554 (0.0037) [2024-06-25 15:09:03,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14918615040. Throughput: 0: 42304.9. Samples: 14918768320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:09:03,393][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 15:09:05,558][15401] Updated weights for policy 0, policy_version 910564 (0.0046) [2024-06-25 15:09:08,286][15401] Updated weights for policy 0, policy_version 910574 (0.0038) [2024-06-25 15:09:08,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14918844416. Throughput: 0: 42368.2. Samples: 14918899120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 15:09:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 15:09:13,309][15401] Updated weights for policy 0, policy_version 910584 (0.0032) [2024-06-25 15:09:13,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 14919008256. Throughput: 0: 42266.5. Samples: 14919159680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:13,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:09:16,287][15401] Updated weights for policy 0, policy_version 910594 (0.0033) [2024-06-25 15:09:18,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14919254016. Throughput: 0: 42210.8. Samples: 14919410460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 15:09:20,833][15401] Updated weights for policy 0, policy_version 910604 (0.0030) [2024-06-25 15:09:21,513][15349] Signal inference workers to stop experience collection... (220850 times) [2024-06-25 15:09:21,513][15349] Signal inference workers to resume experience collection... (220850 times) [2024-06-25 15:09:21,536][15401] InferenceWorker_p0-w0: stopping experience collection (220850 times) [2024-06-25 15:09:21,536][15401] InferenceWorker_p0-w0: resuming experience collection (220850 times) [2024-06-25 15:09:23,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 14919483392. Throughput: 0: 42509.4. Samples: 14919543420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:23,396][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 15:09:23,896][15401] Updated weights for policy 0, policy_version 910614 (0.0027) [2024-06-25 15:09:28,345][15401] Updated weights for policy 0, policy_version 910624 (0.0025) [2024-06-25 15:09:28,392][15132] Fps is (10 sec: 40948.8, 60 sec: 42323.4, 300 sec: 42764.6). Total num frames: 14919663616. Throughput: 0: 42380.9. Samples: 14919800660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:28,393][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 15:09:31,474][15401] Updated weights for policy 0, policy_version 910634 (0.0037) [2024-06-25 15:09:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 14919909376. Throughput: 0: 42560.5. Samples: 14920055440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:33,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 15:09:36,109][15401] Updated weights for policy 0, policy_version 910644 (0.0023) [2024-06-25 15:09:38,390][15132] Fps is (10 sec: 45887.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14920122368. Throughput: 0: 42606.3. Samples: 14920188400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:38,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 15:09:39,086][15401] Updated weights for policy 0, policy_version 910654 (0.0043) [2024-06-25 15:09:43,390][15132] Fps is (10 sec: 37682.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14920286208. Throughput: 0: 42763.4. Samples: 14920443400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:43,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 15:09:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910663_14920302592.pth... [2024-06-25 15:09:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910040_14910095360.pth [2024-06-25 15:09:43,653][15401] Updated weights for policy 0, policy_version 910664 (0.0031) [2024-06-25 15:09:46,777][15401] Updated weights for policy 0, policy_version 910674 (0.0043) [2024-06-25 15:09:48,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 14920548352. Throughput: 0: 42770.7. Samples: 14920693000. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:48,392][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 15:09:51,415][15401] Updated weights for policy 0, policy_version 910684 (0.0033) [2024-06-25 15:09:53,392][15132] Fps is (10 sec: 47503.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 14920761344. Throughput: 0: 42878.1. Samples: 14920828740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:53,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 15:09:54,286][15401] Updated weights for policy 0, policy_version 910694 (0.0034) [2024-06-25 15:09:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14920941568. Throughput: 0: 42637.5. Samples: 14921078360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:09:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 15:09:58,916][15401] Updated weights for policy 0, policy_version 910704 (0.0044) [2024-06-25 15:10:02,348][15401] Updated weights for policy 0, policy_version 910714 (0.0038) [2024-06-25 15:10:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14921187328. Throughput: 0: 42571.5. Samples: 14921326180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:03,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 15:10:06,694][15401] Updated weights for policy 0, policy_version 910724 (0.0030) [2024-06-25 15:10:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14921383936. Throughput: 0: 42641.7. Samples: 14921462300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:08,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 15:10:09,832][15401] Updated weights for policy 0, policy_version 910734 (0.0030) [2024-06-25 15:10:13,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 14921564160. Throughput: 0: 42505.3. Samples: 14921713280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:13,390][15132] Avg episode reward: [(0, '0.879')] [2024-06-25 15:10:14,614][15401] Updated weights for policy 0, policy_version 910744 (0.0034) [2024-06-25 15:10:18,029][15401] Updated weights for policy 0, policy_version 910754 (0.0035) [2024-06-25 15:10:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14921809920. Throughput: 0: 42495.0. Samples: 14921967720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 15:10:22,033][15401] Updated weights for policy 0, policy_version 910764 (0.0038) [2024-06-25 15:10:23,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 14922022912. Throughput: 0: 42483.1. Samples: 14922100240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:23,392][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 15:10:25,586][15401] Updated weights for policy 0, policy_version 910774 (0.0032) [2024-06-25 15:10:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.3, 300 sec: 42709.5). Total num frames: 14922219520. Throughput: 0: 42360.5. Samples: 14922349620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:10:29,569][15401] Updated weights for policy 0, policy_version 910784 (0.0035) [2024-06-25 15:10:33,168][15401] Updated weights for policy 0, policy_version 910794 (0.0045) [2024-06-25 15:10:33,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14922448896. Throughput: 0: 42556.1. Samples: 14922607920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:33,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 15:10:37,357][15401] Updated weights for policy 0, policy_version 910804 (0.0028) [2024-06-25 15:10:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 14922661888. Throughput: 0: 42475.1. Samples: 14922740020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:38,390][15132] Avg episode reward: [(0, '0.208')] [2024-06-25 15:10:40,053][15349] Signal inference workers to stop experience collection... (220900 times) [2024-06-25 15:10:40,053][15349] Signal inference workers to resume experience collection... (220900 times) [2024-06-25 15:10:40,072][15401] InferenceWorker_p0-w0: stopping experience collection (220900 times) [2024-06-25 15:10:40,072][15401] InferenceWorker_p0-w0: resuming experience collection (220900 times) [2024-06-25 15:10:40,738][15401] Updated weights for policy 0, policy_version 910814 (0.0033) [2024-06-25 15:10:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14922858496. Throughput: 0: 42455.5. Samples: 14922988860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:43,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 15:10:45,297][15401] Updated weights for policy 0, policy_version 910824 (0.0040) [2024-06-25 15:10:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 14923087872. Throughput: 0: 42670.3. Samples: 14923246340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:48,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 15:10:48,579][15401] Updated weights for policy 0, policy_version 910834 (0.0042) [2024-06-25 15:10:52,899][15401] Updated weights for policy 0, policy_version 910844 (0.0042) [2024-06-25 15:10:53,394][15132] Fps is (10 sec: 44218.8, 60 sec: 42324.1, 300 sec: 42653.3). Total num frames: 14923300864. Throughput: 0: 42561.9. Samples: 14923377760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:53,394][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 15:10:56,364][15401] Updated weights for policy 0, policy_version 910854 (0.0027) [2024-06-25 15:10:58,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14923513856. Throughput: 0: 42669.7. Samples: 14923633420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:10:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 15:11:00,352][15401] Updated weights for policy 0, policy_version 910864 (0.0038) [2024-06-25 15:11:03,390][15132] Fps is (10 sec: 42615.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14923726848. Throughput: 0: 42653.8. Samples: 14923887140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 15:11:03,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 15:11:04,025][15401] Updated weights for policy 0, policy_version 910874 (0.0038) [2024-06-25 15:11:08,129][15401] Updated weights for policy 0, policy_version 910884 (0.0038) [2024-06-25 15:11:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14923939840. Throughput: 0: 42544.9. Samples: 14924014660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:08,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 15:11:11,718][15401] Updated weights for policy 0, policy_version 910894 (0.0037) [2024-06-25 15:11:13,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 14924152832. Throughput: 0: 42669.0. Samples: 14924269720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:13,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 15:11:15,951][15401] Updated weights for policy 0, policy_version 910904 (0.0037) [2024-06-25 15:11:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14924349440. Throughput: 0: 42551.6. Samples: 14924522740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 15:11:19,418][15401] Updated weights for policy 0, policy_version 910914 (0.0040) [2024-06-25 15:11:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 14924562432. Throughput: 0: 42427.9. Samples: 14924649280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 15:11:23,930][15401] Updated weights for policy 0, policy_version 910924 (0.0037) [2024-06-25 15:11:27,202][15401] Updated weights for policy 0, policy_version 910934 (0.0038) [2024-06-25 15:11:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14924775424. Throughput: 0: 42428.1. Samples: 14924898120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 15:11:31,526][15401] Updated weights for policy 0, policy_version 910944 (0.0039) [2024-06-25 15:11:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14925004800. Throughput: 0: 42438.5. Samples: 14925156080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:33,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 15:11:35,437][15401] Updated weights for policy 0, policy_version 910954 (0.0046) [2024-06-25 15:11:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 14925185024. Throughput: 0: 42354.5. Samples: 14925283540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 15:11:39,063][15401] Updated weights for policy 0, policy_version 910964 (0.0035) [2024-06-25 15:11:43,102][15401] Updated weights for policy 0, policy_version 910974 (0.0033) [2024-06-25 15:11:43,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 14925414400. Throughput: 0: 42237.3. Samples: 14925534200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:43,392][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 15:11:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910975_14925414400.pth... [2024-06-25 15:11:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910352_14915207168.pth [2024-06-25 15:11:46,615][15401] Updated weights for policy 0, policy_version 910984 (0.0028) [2024-06-25 15:11:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14925627392. Throughput: 0: 42397.4. Samples: 14925795020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 15:11:50,625][15401] Updated weights for policy 0, policy_version 910994 (0.0029) [2024-06-25 15:11:52,339][15349] Signal inference workers to stop experience collection... (220950 times) [2024-06-25 15:11:52,384][15401] InferenceWorker_p0-w0: stopping experience collection (220950 times) [2024-06-25 15:11:52,394][15349] Signal inference workers to resume experience collection... (220950 times) [2024-06-25 15:11:52,411][15401] InferenceWorker_p0-w0: resuming experience collection (220950 times) [2024-06-25 15:11:53,392][15132] Fps is (10 sec: 42597.8, 60 sec: 42326.5, 300 sec: 42598.0). Total num frames: 14925840384. Throughput: 0: 42373.2. Samples: 14925921560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:53,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 15:11:54,543][15401] Updated weights for policy 0, policy_version 911004 (0.0025) [2024-06-25 15:11:58,168][15401] Updated weights for policy 0, policy_version 911014 (0.0044) [2024-06-25 15:11:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 14926053376. Throughput: 0: 42280.7. Samples: 14926172360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:11:58,391][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 15:12:02,179][15401] Updated weights for policy 0, policy_version 911024 (0.0042) [2024-06-25 15:12:03,390][15132] Fps is (10 sec: 40970.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14926249984. Throughput: 0: 42289.3. Samples: 14926425760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 15:12:05,747][15401] Updated weights for policy 0, policy_version 911034 (0.0032) [2024-06-25 15:12:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 14926462976. Throughput: 0: 42272.1. Samples: 14926551520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:08,396][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 15:12:09,911][15401] Updated weights for policy 0, policy_version 911044 (0.0032) [2024-06-25 15:12:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 14926692352. Throughput: 0: 42435.5. Samples: 14926807720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 15:12:13,499][15401] Updated weights for policy 0, policy_version 911054 (0.0044) [2024-06-25 15:12:17,757][15401] Updated weights for policy 0, policy_version 911064 (0.0038) [2024-06-25 15:12:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14926905344. Throughput: 0: 42354.7. Samples: 14927062040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:18,391][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:12:21,400][15401] Updated weights for policy 0, policy_version 911074 (0.0048) [2024-06-25 15:12:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14927101952. Throughput: 0: 42359.6. Samples: 14927189720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:23,399][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 15:12:25,434][15401] Updated weights for policy 0, policy_version 911084 (0.0039) [2024-06-25 15:12:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14927331328. Throughput: 0: 42478.7. Samples: 14927445640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:28,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 15:12:29,374][15401] Updated weights for policy 0, policy_version 911094 (0.0041) [2024-06-25 15:12:32,879][15401] Updated weights for policy 0, policy_version 911104 (0.0041) [2024-06-25 15:12:33,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14927544320. Throughput: 0: 42288.4. Samples: 14927698000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 15:12:37,014][15401] Updated weights for policy 0, policy_version 911114 (0.0028) [2024-06-25 15:12:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 14927724544. Throughput: 0: 42444.2. Samples: 14927831440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 15:12:40,533][15401] Updated weights for policy 0, policy_version 911124 (0.0037) [2024-06-25 15:12:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 14927953920. Throughput: 0: 42441.4. Samples: 14928082220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 15:12:44,531][15401] Updated weights for policy 0, policy_version 911134 (0.0027) [2024-06-25 15:12:48,281][15401] Updated weights for policy 0, policy_version 911144 (0.0046) [2024-06-25 15:12:48,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14928183296. Throughput: 0: 42554.3. Samples: 14928340700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 15:12:52,165][15401] Updated weights for policy 0, policy_version 911154 (0.0022) [2024-06-25 15:12:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 14928379904. Throughput: 0: 42530.6. Samples: 14928465400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:53,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 15:12:56,265][15401] Updated weights for policy 0, policy_version 911164 (0.0039) [2024-06-25 15:12:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14928609280. Throughput: 0: 42425.4. Samples: 14928716860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:12:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 15:13:00,444][15401] Updated weights for policy 0, policy_version 911174 (0.0029) [2024-06-25 15:13:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14928805888. Throughput: 0: 42481.3. Samples: 14928973700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 15:13:03,895][15401] Updated weights for policy 0, policy_version 911184 (0.0030) [2024-06-25 15:13:08,123][15401] Updated weights for policy 0, policy_version 911194 (0.0030) [2024-06-25 15:13:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14929002496. Throughput: 0: 42388.1. Samples: 14929097180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:08,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 15:13:11,436][15401] Updated weights for policy 0, policy_version 911204 (0.0038) [2024-06-25 15:13:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14929231872. Throughput: 0: 42280.5. Samples: 14929348260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 15:13:15,746][15401] Updated weights for policy 0, policy_version 911214 (0.0037) [2024-06-25 15:13:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 14929428480. Throughput: 0: 42468.9. Samples: 14929609100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:18,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 15:13:19,019][15349] Signal inference workers to stop experience collection... (221000 times) [2024-06-25 15:13:19,067][15401] InferenceWorker_p0-w0: stopping experience collection (221000 times) [2024-06-25 15:13:19,072][15349] Signal inference workers to resume experience collection... (221000 times) [2024-06-25 15:13:19,077][15401] InferenceWorker_p0-w0: resuming experience collection (221000 times) [2024-06-25 15:13:19,226][15401] Updated weights for policy 0, policy_version 911224 (0.0030) [2024-06-25 15:13:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 14929641472. Throughput: 0: 42251.9. Samples: 14929732780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:23,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 15:13:23,796][15401] Updated weights for policy 0, policy_version 911234 (0.0042) [2024-06-25 15:13:27,136][15401] Updated weights for policy 0, policy_version 911244 (0.0039) [2024-06-25 15:13:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14929870848. Throughput: 0: 42144.4. Samples: 14929978720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:13:31,291][15401] Updated weights for policy 0, policy_version 911254 (0.0033) [2024-06-25 15:13:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 14930067456. Throughput: 0: 42162.8. Samples: 14930238020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:33,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 15:13:35,021][15401] Updated weights for policy 0, policy_version 911264 (0.0035) [2024-06-25 15:13:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14930280448. Throughput: 0: 42162.8. Samples: 14930362720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:38,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 15:13:38,961][15401] Updated weights for policy 0, policy_version 911274 (0.0037) [2024-06-25 15:13:42,444][15401] Updated weights for policy 0, policy_version 911284 (0.0037) [2024-06-25 15:13:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 14930509824. Throughput: 0: 42346.1. Samples: 14930622440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:43,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 15:13:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911286_14930509824.pth... [2024-06-25 15:13:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910663_14920302592.pth [2024-06-25 15:13:46,663][15401] Updated weights for policy 0, policy_version 911294 (0.0024) [2024-06-25 15:13:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 14930706432. Throughput: 0: 42378.8. Samples: 14930880740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 15:13:50,165][15401] Updated weights for policy 0, policy_version 911304 (0.0034) [2024-06-25 15:13:53,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 14930903040. Throughput: 0: 42379.8. Samples: 14931004380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:53,393][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 15:13:54,243][15401] Updated weights for policy 0, policy_version 911314 (0.0039) [2024-06-25 15:13:58,058][15401] Updated weights for policy 0, policy_version 911324 (0.0028) [2024-06-25 15:13:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 14931148800. Throughput: 0: 42512.0. Samples: 14931261300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:13:58,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 15:14:01,835][15401] Updated weights for policy 0, policy_version 911334 (0.0034) [2024-06-25 15:14:03,392][15132] Fps is (10 sec: 45875.2, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 14931361792. Throughput: 0: 42483.9. Samples: 14931520980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:03,393][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 15:14:05,574][15401] Updated weights for policy 0, policy_version 911344 (0.0027) [2024-06-25 15:14:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14931558400. Throughput: 0: 42616.9. Samples: 14931650540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:08,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 15:14:09,785][15401] Updated weights for policy 0, policy_version 911354 (0.0035) [2024-06-25 15:14:13,081][15401] Updated weights for policy 0, policy_version 911364 (0.0047) [2024-06-25 15:14:13,389][15132] Fps is (10 sec: 42609.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14931787776. Throughput: 0: 42828.5. Samples: 14931906000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 15:14:17,354][15401] Updated weights for policy 0, policy_version 911374 (0.0033) [2024-06-25 15:14:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 14932000768. Throughput: 0: 42820.2. Samples: 14932164940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 15:14:20,618][15401] Updated weights for policy 0, policy_version 911384 (0.0032) [2024-06-25 15:14:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 14932197376. Throughput: 0: 42746.6. Samples: 14932286320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 15:14:25,434][15401] Updated weights for policy 0, policy_version 911394 (0.0036) [2024-06-25 15:14:28,167][15401] Updated weights for policy 0, policy_version 911404 (0.0028) [2024-06-25 15:14:28,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 14932443136. Throughput: 0: 42692.0. Samples: 14932543680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:28,392][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 15:14:33,218][15401] Updated weights for policy 0, policy_version 911414 (0.0031) [2024-06-25 15:14:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 14932623360. Throughput: 0: 42735.8. Samples: 14932803860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 15:14:35,674][15401] Updated weights for policy 0, policy_version 911424 (0.0043) [2024-06-25 15:14:38,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14932852736. Throughput: 0: 42560.5. Samples: 14932919500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:38,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 15:14:40,768][15401] Updated weights for policy 0, policy_version 911434 (0.0038) [2024-06-25 15:14:42,533][15349] Signal inference workers to stop experience collection... (221050 times) [2024-06-25 15:14:42,533][15349] Signal inference workers to resume experience collection... (221050 times) [2024-06-25 15:14:42,554][15401] InferenceWorker_p0-w0: stopping experience collection (221050 times) [2024-06-25 15:14:42,554][15401] InferenceWorker_p0-w0: resuming experience collection (221050 times) [2024-06-25 15:14:43,334][15401] Updated weights for policy 0, policy_version 911444 (0.0032) [2024-06-25 15:14:43,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 14933098496. Throughput: 0: 42784.2. Samples: 14933186600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 15:14:48,337][15401] Updated weights for policy 0, policy_version 911454 (0.0048) [2024-06-25 15:14:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42376.6). Total num frames: 14933262336. Throughput: 0: 42781.0. Samples: 14933446020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:48,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 15:14:50,951][15401] Updated weights for policy 0, policy_version 911464 (0.0028) [2024-06-25 15:14:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43419.3, 300 sec: 42598.4). Total num frames: 14933508096. Throughput: 0: 42477.2. Samples: 14933562020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-25 15:14:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 15:14:56,053][15401] Updated weights for policy 0, policy_version 911474 (0.0031) [2024-06-25 15:14:58,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 14933737472. Throughput: 0: 42915.4. Samples: 14933837200. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:14:58,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 15:14:58,562][15401] Updated weights for policy 0, policy_version 911484 (0.0028) [2024-06-25 15:15:03,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42053.9, 300 sec: 42376.2). Total num frames: 14933884928. Throughput: 0: 42829.4. Samples: 14934092260. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 15:15:03,770][15401] Updated weights for policy 0, policy_version 911494 (0.0023) [2024-06-25 15:15:06,383][15401] Updated weights for policy 0, policy_version 911504 (0.0032) [2024-06-25 15:15:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 14934147072. Throughput: 0: 42673.2. Samples: 14934206620. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 15:15:11,595][15401] Updated weights for policy 0, policy_version 911514 (0.0029) [2024-06-25 15:15:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 14934343680. Throughput: 0: 42846.7. Samples: 14934471680. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 15:15:14,014][15401] Updated weights for policy 0, policy_version 911524 (0.0041) [2024-06-25 15:15:18,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 14934523904. Throughput: 0: 42672.0. Samples: 14934724100. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 15:15:19,508][15401] Updated weights for policy 0, policy_version 911534 (0.0042) [2024-06-25 15:15:22,015][15401] Updated weights for policy 0, policy_version 911544 (0.0036) [2024-06-25 15:15:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 14934802432. Throughput: 0: 42710.7. Samples: 14934841480. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:23,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 15:15:26,962][15401] Updated weights for policy 0, policy_version 911554 (0.0036) [2024-06-25 15:15:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 14934982656. Throughput: 0: 42652.6. Samples: 14935105960. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 15:15:29,715][15401] Updated weights for policy 0, policy_version 911564 (0.0028) [2024-06-25 15:15:33,389][15132] Fps is (10 sec: 36045.4, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 14935162880. Throughput: 0: 42580.0. Samples: 14935362120. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 15:15:34,644][15401] Updated weights for policy 0, policy_version 911574 (0.0034) [2024-06-25 15:15:37,309][15401] Updated weights for policy 0, policy_version 911584 (0.0030) [2024-06-25 15:15:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 14935425024. Throughput: 0: 42735.6. Samples: 14935485120. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 15:15:42,332][15401] Updated weights for policy 0, policy_version 911594 (0.0046) [2024-06-25 15:15:43,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 14935621632. Throughput: 0: 42432.0. Samples: 14935746640. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:43,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 15:15:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911598_14935621632.pth... [2024-06-25 15:15:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000910975_14925414400.pth [2024-06-25 15:15:45,124][15401] Updated weights for policy 0, policy_version 911604 (0.0034) [2024-06-25 15:15:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42432.4). Total num frames: 14935818240. Throughput: 0: 42268.1. Samples: 14935994320. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:48,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 15:15:49,846][15401] Updated weights for policy 0, policy_version 911614 (0.0026) [2024-06-25 15:15:52,831][15401] Updated weights for policy 0, policy_version 911624 (0.0029) [2024-06-25 15:15:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14936064000. Throughput: 0: 42569.8. Samples: 14936122260. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:53,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-25 15:15:55,323][15349] Signal inference workers to stop experience collection... (221100 times) [2024-06-25 15:15:55,324][15349] Signal inference workers to resume experience collection... (221100 times) [2024-06-25 15:15:55,339][15401] InferenceWorker_p0-w0: stopping experience collection (221100 times) [2024-06-25 15:15:55,339][15401] InferenceWorker_p0-w0: resuming experience collection (221100 times) [2024-06-25 15:15:57,729][15401] Updated weights for policy 0, policy_version 911634 (0.0043) [2024-06-25 15:15:58,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14936260608. Throughput: 0: 42563.1. Samples: 14936387020. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:15:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 15:16:00,719][15401] Updated weights for policy 0, policy_version 911644 (0.0028) [2024-06-25 15:16:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 14936473600. Throughput: 0: 42382.3. Samples: 14936631300. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 15:16:05,343][15401] Updated weights for policy 0, policy_version 911654 (0.0039) [2024-06-25 15:16:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14936686592. Throughput: 0: 42625.4. Samples: 14936759620. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:08,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 15:16:08,686][15401] Updated weights for policy 0, policy_version 911664 (0.0029) [2024-06-25 15:16:12,979][15401] Updated weights for policy 0, policy_version 911674 (0.0035) [2024-06-25 15:16:13,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 14936899584. Throughput: 0: 42595.1. Samples: 14937022840. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:13,392][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 15:16:16,368][15401] Updated weights for policy 0, policy_version 911684 (0.0042) [2024-06-25 15:16:18,396][15132] Fps is (10 sec: 42571.1, 60 sec: 43140.0, 300 sec: 42542.0). Total num frames: 14937112576. Throughput: 0: 42323.2. Samples: 14937266940. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:18,396][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 15:16:20,531][15401] Updated weights for policy 0, policy_version 911694 (0.0032) [2024-06-25 15:16:23,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 14937325568. Throughput: 0: 42543.6. Samples: 14937399580. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:23,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 15:16:23,845][15401] Updated weights for policy 0, policy_version 911704 (0.0041) [2024-06-25 15:16:28,025][15401] Updated weights for policy 0, policy_version 911714 (0.0024) [2024-06-25 15:16:28,390][15132] Fps is (10 sec: 42625.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 14937538560. Throughput: 0: 42532.9. Samples: 14937660620. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:28,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 15:16:31,541][15401] Updated weights for policy 0, policy_version 911724 (0.0036) [2024-06-25 15:16:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 14937751552. Throughput: 0: 42638.6. Samples: 14937913060. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:33,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 15:16:35,697][15401] Updated weights for policy 0, policy_version 911734 (0.0027) [2024-06-25 15:16:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.6). Total num frames: 14937948160. Throughput: 0: 42697.7. Samples: 14938043660. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:38,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 15:16:39,454][15401] Updated weights for policy 0, policy_version 911744 (0.0035) [2024-06-25 15:16:43,314][15401] Updated weights for policy 0, policy_version 911754 (0.0039) [2024-06-25 15:16:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14938177536. Throughput: 0: 42447.7. Samples: 14938297160. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:43,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 15:16:47,426][15401] Updated weights for policy 0, policy_version 911764 (0.0027) [2024-06-25 15:16:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 14938374144. Throughput: 0: 42715.9. Samples: 14938553520. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-06-25 15:16:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 15:16:50,957][15401] Updated weights for policy 0, policy_version 911774 (0.0034) [2024-06-25 15:16:53,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14938587136. Throughput: 0: 42602.7. Samples: 14938676740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:16:53,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 15:16:55,227][15401] Updated weights for policy 0, policy_version 911784 (0.0051) [2024-06-25 15:16:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14938816512. Throughput: 0: 42487.6. Samples: 14938934680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:16:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 15:16:58,472][15401] Updated weights for policy 0, policy_version 911794 (0.0032) [2024-06-25 15:17:02,810][15401] Updated weights for policy 0, policy_version 911804 (0.0037) [2024-06-25 15:17:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14939013120. Throughput: 0: 42862.2. Samples: 14939195460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:03,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 15:17:05,996][15401] Updated weights for policy 0, policy_version 911814 (0.0040) [2024-06-25 15:17:08,394][15132] Fps is (10 sec: 42579.1, 60 sec: 42595.2, 300 sec: 42542.2). Total num frames: 14939242496. Throughput: 0: 42593.1. Samples: 14939316460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:08,395][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 15:17:10,520][15401] Updated weights for policy 0, policy_version 911824 (0.0026) [2024-06-25 15:17:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 14939455488. Throughput: 0: 42613.9. Samples: 14939578240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 15:17:13,810][15401] Updated weights for policy 0, policy_version 911834 (0.0046) [2024-06-25 15:17:18,070][15401] Updated weights for policy 0, policy_version 911844 (0.0046) [2024-06-25 15:17:18,390][15132] Fps is (10 sec: 42616.8, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 14939668480. Throughput: 0: 42579.8. Samples: 14939829160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:18,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 15:17:21,695][15349] Signal inference workers to stop experience collection... (221150 times) [2024-06-25 15:17:21,731][15401] InferenceWorker_p0-w0: stopping experience collection (221150 times) [2024-06-25 15:17:21,743][15349] Signal inference workers to resume experience collection... (221150 times) [2024-06-25 15:17:21,753][15401] InferenceWorker_p0-w0: resuming experience collection (221150 times) [2024-06-25 15:17:21,760][15401] Updated weights for policy 0, policy_version 911854 (0.0038) [2024-06-25 15:17:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14939881472. Throughput: 0: 42547.3. Samples: 14939958280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 15:17:25,698][15401] Updated weights for policy 0, policy_version 911864 (0.0035) [2024-06-25 15:17:28,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 14940078080. Throughput: 0: 42547.0. Samples: 14940211780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 15:17:29,295][15401] Updated weights for policy 0, policy_version 911874 (0.0030) [2024-06-25 15:17:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14940291072. Throughput: 0: 42497.9. Samples: 14940465920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 15:17:33,735][15401] Updated weights for policy 0, policy_version 911884 (0.0034) [2024-06-25 15:17:37,076][15401] Updated weights for policy 0, policy_version 911894 (0.0036) [2024-06-25 15:17:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14940536832. Throughput: 0: 42668.4. Samples: 14940596820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:17:41,363][15401] Updated weights for policy 0, policy_version 911904 (0.0034) [2024-06-25 15:17:43,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14940700672. Throughput: 0: 42540.9. Samples: 14940849020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:43,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 15:17:43,498][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911909_14940717056.pth... [2024-06-25 15:17:43,559][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911286_14930509824.pth [2024-06-25 15:17:44,757][15401] Updated weights for policy 0, policy_version 911914 (0.0039) [2024-06-25 15:17:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14940930048. Throughput: 0: 42336.0. Samples: 14941100580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 15:17:48,976][15401] Updated weights for policy 0, policy_version 911924 (0.0029) [2024-06-25 15:17:52,495][15401] Updated weights for policy 0, policy_version 911934 (0.0032) [2024-06-25 15:17:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14941159424. Throughput: 0: 42654.1. Samples: 14941235700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:53,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 15:17:56,645][15401] Updated weights for policy 0, policy_version 911944 (0.0041) [2024-06-25 15:17:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 14941339648. Throughput: 0: 42335.0. Samples: 14941483320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:17:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 15:18:00,147][15401] Updated weights for policy 0, policy_version 911954 (0.0025) [2024-06-25 15:18:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 14941585408. Throughput: 0: 42391.7. Samples: 14941736780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 15:18:04,267][15401] Updated weights for policy 0, policy_version 911964 (0.0038) [2024-06-25 15:18:08,000][15401] Updated weights for policy 0, policy_version 911974 (0.0032) [2024-06-25 15:18:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42601.6, 300 sec: 42598.4). Total num frames: 14941798400. Throughput: 0: 42492.0. Samples: 14941870420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 15:18:11,919][15401] Updated weights for policy 0, policy_version 911984 (0.0031) [2024-06-25 15:18:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 14941978624. Throughput: 0: 42511.0. Samples: 14942124780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:13,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 15:18:15,785][15401] Updated weights for policy 0, policy_version 911994 (0.0040) [2024-06-25 15:18:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 14942224384. Throughput: 0: 42429.2. Samples: 14942375240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:18,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 15:18:19,597][15401] Updated weights for policy 0, policy_version 912004 (0.0030) [2024-06-25 15:18:23,311][15401] Updated weights for policy 0, policy_version 912014 (0.0033) [2024-06-25 15:18:23,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14942437376. Throughput: 0: 42560.4. Samples: 14942512040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:23,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 15:18:27,926][15401] Updated weights for policy 0, policy_version 912024 (0.0049) [2024-06-25 15:18:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 14942617600. Throughput: 0: 42553.8. Samples: 14942763940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:28,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:18:31,075][15401] Updated weights for policy 0, policy_version 912034 (0.0031) [2024-06-25 15:18:33,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14942863360. Throughput: 0: 42527.1. Samples: 14943014300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 15:18:35,574][15401] Updated weights for policy 0, policy_version 912044 (0.0029) [2024-06-25 15:18:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 14943076352. Throughput: 0: 42618.2. Samples: 14943153520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 15:18:38,598][15401] Updated weights for policy 0, policy_version 912054 (0.0028) [2024-06-25 15:18:43,196][15401] Updated weights for policy 0, policy_version 912064 (0.0040) [2024-06-25 15:18:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 14943272960. Throughput: 0: 42732.8. Samples: 14943406400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-25 15:18:43,392][15132] Avg episode reward: [(0, '0.272')] [2024-06-25 15:18:46,139][15401] Updated weights for policy 0, policy_version 912074 (0.0037) [2024-06-25 15:18:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 14943518720. Throughput: 0: 42683.7. Samples: 14943657540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:18:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 15:18:51,212][15401] Updated weights for policy 0, policy_version 912084 (0.0024) [2024-06-25 15:18:53,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 14943731712. Throughput: 0: 42802.6. Samples: 14943796540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:18:53,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 15:18:53,744][15401] Updated weights for policy 0, policy_version 912094 (0.0038) [2024-06-25 15:18:58,390][15132] Fps is (10 sec: 37682.4, 60 sec: 42598.3, 300 sec: 42487.7). Total num frames: 14943895552. Throughput: 0: 42738.7. Samples: 14944048020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:18:58,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 15:18:58,759][15401] Updated weights for policy 0, policy_version 912104 (0.0023) [2024-06-25 15:19:01,379][15401] Updated weights for policy 0, policy_version 912114 (0.0039) [2024-06-25 15:19:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14944141312. Throughput: 0: 42668.0. Samples: 14944295300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 15:19:06,238][15401] Updated weights for policy 0, policy_version 912124 (0.0038) [2024-06-25 15:19:07,560][15349] Signal inference workers to stop experience collection... (221200 times) [2024-06-25 15:19:07,562][15349] Signal inference workers to resume experience collection... (221200 times) [2024-06-25 15:19:07,586][15401] InferenceWorker_p0-w0: stopping experience collection (221200 times) [2024-06-25 15:19:07,586][15401] InferenceWorker_p0-w0: resuming experience collection (221200 times) [2024-06-25 15:19:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 14944354304. Throughput: 0: 42765.8. Samples: 14944436500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 15:19:09,071][15401] Updated weights for policy 0, policy_version 912134 (0.0026) [2024-06-25 15:19:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 14944550912. Throughput: 0: 42693.6. Samples: 14944685160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 15:19:13,712][15401] Updated weights for policy 0, policy_version 912144 (0.0051) [2024-06-25 15:19:17,478][15401] Updated weights for policy 0, policy_version 912154 (0.0030) [2024-06-25 15:19:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14944780288. Throughput: 0: 42787.4. Samples: 14944939740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:18,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 15:19:21,317][15401] Updated weights for policy 0, policy_version 912164 (0.0031) [2024-06-25 15:19:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 14945009664. Throughput: 0: 42673.8. Samples: 14945073840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 15:19:24,879][15401] Updated weights for policy 0, policy_version 912174 (0.0031) [2024-06-25 15:19:28,392][15132] Fps is (10 sec: 40950.9, 60 sec: 42869.7, 300 sec: 42598.1). Total num frames: 14945189888. Throughput: 0: 42642.7. Samples: 14945325320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:28,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 15:19:29,089][15401] Updated weights for policy 0, policy_version 912184 (0.0030) [2024-06-25 15:19:32,381][15401] Updated weights for policy 0, policy_version 912194 (0.0044) [2024-06-25 15:19:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 14945419264. Throughput: 0: 42786.5. Samples: 14945582940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:33,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 15:19:36,729][15401] Updated weights for policy 0, policy_version 912204 (0.0033) [2024-06-25 15:19:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 14945648640. Throughput: 0: 42696.9. Samples: 14945717900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 15:19:40,221][15401] Updated weights for policy 0, policy_version 912214 (0.0042) [2024-06-25 15:19:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 14945845248. Throughput: 0: 42629.0. Samples: 14945966320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 15:19:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912222_14945845248.pth... [2024-06-25 15:19:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911598_14935621632.pth [2024-06-25 15:19:44,357][15401] Updated weights for policy 0, policy_version 912224 (0.0027) [2024-06-25 15:19:47,867][15401] Updated weights for policy 0, policy_version 912234 (0.0038) [2024-06-25 15:19:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14946058240. Throughput: 0: 42820.1. Samples: 14946222200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 15:19:51,977][15401] Updated weights for policy 0, policy_version 912244 (0.0041) [2024-06-25 15:19:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 14946254848. Throughput: 0: 42565.7. Samples: 14946351960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 15:19:55,705][15401] Updated weights for policy 0, policy_version 912254 (0.0024) [2024-06-25 15:19:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14946467840. Throughput: 0: 42599.2. Samples: 14946602120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:19:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 15:19:59,444][15401] Updated weights for policy 0, policy_version 912264 (0.0033) [2024-06-25 15:20:03,281][15401] Updated weights for policy 0, policy_version 912274 (0.0033) [2024-06-25 15:20:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 14946697216. Throughput: 0: 42741.5. Samples: 14946863100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:03,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:20:07,071][15401] Updated weights for policy 0, policy_version 912284 (0.0034) [2024-06-25 15:20:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 14946893824. Throughput: 0: 42672.9. Samples: 14946994120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 15:20:11,064][15401] Updated weights for policy 0, policy_version 912294 (0.0036) [2024-06-25 15:20:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 14947123200. Throughput: 0: 42654.2. Samples: 14947244660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 15:20:14,578][15401] Updated weights for policy 0, policy_version 912304 (0.0032) [2024-06-25 15:20:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 14947336192. Throughput: 0: 42701.9. Samples: 14947504520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:18,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 15:20:18,442][15401] Updated weights for policy 0, policy_version 912314 (0.0029) [2024-06-25 15:20:22,703][15401] Updated weights for policy 0, policy_version 912324 (0.0025) [2024-06-25 15:20:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14947549184. Throughput: 0: 42599.5. Samples: 14947634880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:23,396][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 15:20:26,319][15401] Updated weights for policy 0, policy_version 912334 (0.0033) [2024-06-25 15:20:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 14947778560. Throughput: 0: 42779.1. Samples: 14947891380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:28,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 15:20:30,054][15401] Updated weights for policy 0, policy_version 912344 (0.0038) [2024-06-25 15:20:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 14947991552. Throughput: 0: 42836.5. Samples: 14948149840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:33,398][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 15:20:33,825][15401] Updated weights for policy 0, policy_version 912354 (0.0046) [2024-06-25 15:20:37,633][15401] Updated weights for policy 0, policy_version 912364 (0.0026) [2024-06-25 15:20:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14948204544. Throughput: 0: 42858.8. Samples: 14948280600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 15:20:38,396][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 15:20:41,330][15401] Updated weights for policy 0, policy_version 912374 (0.0036) [2024-06-25 15:20:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14948433920. Throughput: 0: 43008.5. Samples: 14948537500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:20:43,404][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 15:20:45,289][15349] Signal inference workers to stop experience collection... (221250 times) [2024-06-25 15:20:45,335][15401] InferenceWorker_p0-w0: stopping experience collection (221250 times) [2024-06-25 15:20:45,407][15349] Signal inference workers to resume experience collection... (221250 times) [2024-06-25 15:20:45,407][15401] InferenceWorker_p0-w0: resuming experience collection (221250 times) [2024-06-25 15:20:45,409][15401] Updated weights for policy 0, policy_version 912384 (0.0033) [2024-06-25 15:20:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 14948614144. Throughput: 0: 42875.6. Samples: 14948792500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:20:48,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 15:20:49,191][15401] Updated weights for policy 0, policy_version 912394 (0.0026) [2024-06-25 15:20:53,057][15401] Updated weights for policy 0, policy_version 912404 (0.0036) [2024-06-25 15:20:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 14948843520. Throughput: 0: 42771.6. Samples: 14948918840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:20:53,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-25 15:20:56,897][15401] Updated weights for policy 0, policy_version 912414 (0.0035) [2024-06-25 15:20:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 14949056512. Throughput: 0: 42941.8. Samples: 14949177040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:20:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 15:21:00,656][15401] Updated weights for policy 0, policy_version 912424 (0.0042) [2024-06-25 15:21:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14949269504. Throughput: 0: 42848.0. Samples: 14949432680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:03,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 15:21:04,723][15401] Updated weights for policy 0, policy_version 912434 (0.0031) [2024-06-25 15:21:08,295][15401] Updated weights for policy 0, policy_version 912444 (0.0026) [2024-06-25 15:21:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 14949482496. Throughput: 0: 42737.1. Samples: 14949558040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 15:21:12,233][15401] Updated weights for policy 0, policy_version 912454 (0.0035) [2024-06-25 15:21:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 14949679104. Throughput: 0: 42734.7. Samples: 14949814440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:13,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 15:21:15,887][15401] Updated weights for policy 0, policy_version 912464 (0.0032) [2024-06-25 15:21:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14949908480. Throughput: 0: 42609.8. Samples: 14950067280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:18,391][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 15:21:19,737][15401] Updated weights for policy 0, policy_version 912474 (0.0027) [2024-06-25 15:21:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 14950121472. Throughput: 0: 42662.6. Samples: 14950200420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 15:21:23,433][15401] Updated weights for policy 0, policy_version 912484 (0.0038) [2024-06-25 15:21:27,231][15401] Updated weights for policy 0, policy_version 912494 (0.0040) [2024-06-25 15:21:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14950318080. Throughput: 0: 42643.9. Samples: 14950456480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 15:21:31,218][15401] Updated weights for policy 0, policy_version 912504 (0.0037) [2024-06-25 15:21:33,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14950547456. Throughput: 0: 42580.8. Samples: 14950708640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:33,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 15:21:35,091][15401] Updated weights for policy 0, policy_version 912514 (0.0028) [2024-06-25 15:21:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 14950744064. Throughput: 0: 42804.0. Samples: 14950845020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 15:21:38,973][15401] Updated weights for policy 0, policy_version 912524 (0.0038) [2024-06-25 15:21:42,562][15401] Updated weights for policy 0, policy_version 912534 (0.0030) [2024-06-25 15:21:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 14950973440. Throughput: 0: 42691.0. Samples: 14951098140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:43,395][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 15:21:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912535_14950973440.pth... [2024-06-25 15:21:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000911909_14940717056.pth [2024-06-25 15:21:46,463][15401] Updated weights for policy 0, policy_version 912544 (0.0043) [2024-06-25 15:21:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14951186432. Throughput: 0: 42788.7. Samples: 14951358180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 15:21:50,627][15401] Updated weights for policy 0, policy_version 912554 (0.0028) [2024-06-25 15:21:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14951399424. Throughput: 0: 42934.0. Samples: 14951490080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:53,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 15:21:54,162][15401] Updated weights for policy 0, policy_version 912564 (0.0027) [2024-06-25 15:21:56,263][15349] Signal inference workers to stop experience collection... (221300 times) [2024-06-25 15:21:56,264][15349] Signal inference workers to resume experience collection... (221300 times) [2024-06-25 15:21:56,303][15401] InferenceWorker_p0-w0: stopping experience collection (221300 times) [2024-06-25 15:21:56,303][15401] InferenceWorker_p0-w0: resuming experience collection (221300 times) [2024-06-25 15:21:58,328][15401] Updated weights for policy 0, policy_version 912574 (0.0032) [2024-06-25 15:21:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14951612416. Throughput: 0: 42841.7. Samples: 14951742320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:21:58,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 15:22:01,897][15401] Updated weights for policy 0, policy_version 912584 (0.0036) [2024-06-25 15:22:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42710.1). Total num frames: 14951841792. Throughput: 0: 42820.4. Samples: 14951994200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 15:22:06,028][15401] Updated weights for policy 0, policy_version 912594 (0.0036) [2024-06-25 15:22:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14952038400. Throughput: 0: 42825.7. Samples: 14952127580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:08,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 15:22:09,500][15401] Updated weights for policy 0, policy_version 912604 (0.0028) [2024-06-25 15:22:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 14952251392. Throughput: 0: 42719.5. Samples: 14952378860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:13,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 15:22:13,644][15401] Updated weights for policy 0, policy_version 912614 (0.0041) [2024-06-25 15:22:17,132][15401] Updated weights for policy 0, policy_version 912624 (0.0035) [2024-06-25 15:22:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 14952497152. Throughput: 0: 42735.6. Samples: 14952631740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 15:22:21,610][15401] Updated weights for policy 0, policy_version 912634 (0.0041) [2024-06-25 15:22:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14952693760. Throughput: 0: 42701.3. Samples: 14952766580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 15:22:24,629][15401] Updated weights for policy 0, policy_version 912644 (0.0028) [2024-06-25 15:22:28,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 14952890368. Throughput: 0: 42610.5. Samples: 14953015880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:28,396][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 15:22:29,262][15401] Updated weights for policy 0, policy_version 912654 (0.0045) [2024-06-25 15:22:32,803][15401] Updated weights for policy 0, policy_version 912664 (0.0023) [2024-06-25 15:22:33,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 14953136128. Throughput: 0: 42521.4. Samples: 14953271740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-25 15:22:33,392][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 15:22:37,035][15401] Updated weights for policy 0, policy_version 912674 (0.0040) [2024-06-25 15:22:38,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14953316352. Throughput: 0: 42566.0. Samples: 14953405540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:22:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 15:22:40,359][15401] Updated weights for policy 0, policy_version 912684 (0.0037) [2024-06-25 15:22:43,390][15132] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 14953529344. Throughput: 0: 42639.9. Samples: 14953661120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:22:43,399][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 15:22:44,579][15401] Updated weights for policy 0, policy_version 912694 (0.0034) [2024-06-25 15:22:47,849][15401] Updated weights for policy 0, policy_version 912704 (0.0036) [2024-06-25 15:22:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 14953758720. Throughput: 0: 42737.4. Samples: 14953917380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:22:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 15:22:52,127][15401] Updated weights for policy 0, policy_version 912714 (0.0044) [2024-06-25 15:22:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14953971712. Throughput: 0: 42671.9. Samples: 14954047820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:22:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 15:22:55,737][15401] Updated weights for policy 0, policy_version 912724 (0.0053) [2024-06-25 15:22:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14954184704. Throughput: 0: 42644.5. Samples: 14954297860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:22:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 15:22:59,965][15401] Updated weights for policy 0, policy_version 912734 (0.0041) [2024-06-25 15:23:03,257][15401] Updated weights for policy 0, policy_version 912744 (0.0039) [2024-06-25 15:23:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 14954397696. Throughput: 0: 42811.4. Samples: 14954558260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:03,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 15:23:07,807][15401] Updated weights for policy 0, policy_version 912754 (0.0028) [2024-06-25 15:23:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14954610688. Throughput: 0: 42589.0. Samples: 14954683080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 15:23:11,193][15401] Updated weights for policy 0, policy_version 912764 (0.0028) [2024-06-25 15:23:11,845][15349] Signal inference workers to stop experience collection... (221350 times) [2024-06-25 15:23:11,896][15401] InferenceWorker_p0-w0: stopping experience collection (221350 times) [2024-06-25 15:23:11,906][15349] Signal inference workers to resume experience collection... (221350 times) [2024-06-25 15:23:11,910][15401] InferenceWorker_p0-w0: resuming experience collection (221350 times) [2024-06-25 15:23:13,390][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14954840064. Throughput: 0: 42690.1. Samples: 14954936660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:23:15,383][15401] Updated weights for policy 0, policy_version 912774 (0.0025) [2024-06-25 15:23:18,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14955036672. Throughput: 0: 42882.7. Samples: 14955201360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:18,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 15:23:18,830][15401] Updated weights for policy 0, policy_version 912784 (0.0021) [2024-06-25 15:23:23,017][15401] Updated weights for policy 0, policy_version 912794 (0.0030) [2024-06-25 15:23:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14955233280. Throughput: 0: 42689.7. Samples: 14955326580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 15:23:26,575][15401] Updated weights for policy 0, policy_version 912804 (0.0032) [2024-06-25 15:23:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 14955479040. Throughput: 0: 42645.8. Samples: 14955580180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 15:23:30,689][15401] Updated weights for policy 0, policy_version 912814 (0.0033) [2024-06-25 15:23:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 14955659264. Throughput: 0: 42616.6. Samples: 14955835140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:33,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-25 15:23:34,261][15401] Updated weights for policy 0, policy_version 912824 (0.0041) [2024-06-25 15:23:38,327][15401] Updated weights for policy 0, policy_version 912834 (0.0033) [2024-06-25 15:23:38,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 14955872256. Throughput: 0: 42432.2. Samples: 14955957260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:38,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-25 15:23:41,881][15401] Updated weights for policy 0, policy_version 912844 (0.0032) [2024-06-25 15:23:43,390][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14956118016. Throughput: 0: 42575.6. Samples: 14956213760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 15:23:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912849_14956118016.pth... [2024-06-25 15:23:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912222_14945845248.pth [2024-06-25 15:23:45,889][15401] Updated weights for policy 0, policy_version 912854 (0.0031) [2024-06-25 15:23:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 14956314624. Throughput: 0: 42624.3. Samples: 14956476340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 15:23:49,505][15401] Updated weights for policy 0, policy_version 912864 (0.0041) [2024-06-25 15:23:53,396][15132] Fps is (10 sec: 39296.5, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 14956511232. Throughput: 0: 42535.6. Samples: 14956597460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:53,396][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 15:23:53,974][15401] Updated weights for policy 0, policy_version 912874 (0.0028) [2024-06-25 15:23:56,958][15401] Updated weights for policy 0, policy_version 912884 (0.0024) [2024-06-25 15:23:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14956756992. Throughput: 0: 42720.1. Samples: 14956859060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:23:58,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 15:24:01,394][15401] Updated weights for policy 0, policy_version 912894 (0.0032) [2024-06-25 15:24:03,389][15132] Fps is (10 sec: 45904.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 14956969984. Throughput: 0: 42409.9. Samples: 14957109800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:03,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 15:24:04,941][15401] Updated weights for policy 0, policy_version 912904 (0.0027) [2024-06-25 15:24:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14957150208. Throughput: 0: 42559.2. Samples: 14957241740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:24:08,782][15401] Updated weights for policy 0, policy_version 912914 (0.0031) [2024-06-25 15:24:12,382][15401] Updated weights for policy 0, policy_version 912924 (0.0045) [2024-06-25 15:24:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14957379584. Throughput: 0: 42683.7. Samples: 14957500940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 15:24:16,221][15401] Updated weights for policy 0, policy_version 912934 (0.0039) [2024-06-25 15:24:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 14957592576. Throughput: 0: 42719.0. Samples: 14957757480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 15:24:19,934][15401] Updated weights for policy 0, policy_version 912944 (0.0029) [2024-06-25 15:24:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 14957789184. Throughput: 0: 42780.4. Samples: 14957882380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 15:24:24,322][15401] Updated weights for policy 0, policy_version 912954 (0.0035) [2024-06-25 15:24:27,418][15401] Updated weights for policy 0, policy_version 912964 (0.0033) [2024-06-25 15:24:28,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14958018560. Throughput: 0: 42802.2. Samples: 14958139860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 15:24:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 15:24:31,895][15401] Updated weights for policy 0, policy_version 912974 (0.0045) [2024-06-25 15:24:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 14958215168. Throughput: 0: 42662.6. Samples: 14958396160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:33,398][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 15:24:35,556][15401] Updated weights for policy 0, policy_version 912984 (0.0033) [2024-06-25 15:24:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14958444544. Throughput: 0: 42866.1. Samples: 14958526160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:38,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 15:24:39,325][15401] Updated weights for policy 0, policy_version 912994 (0.0036) [2024-06-25 15:24:43,131][15401] Updated weights for policy 0, policy_version 913004 (0.0031) [2024-06-25 15:24:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14958673920. Throughput: 0: 42795.0. Samples: 14958784840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 15:24:47,201][15401] Updated weights for policy 0, policy_version 913014 (0.0045) [2024-06-25 15:24:48,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 14958870528. Throughput: 0: 42938.4. Samples: 14959042040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:48,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 15:24:50,545][15349] Signal inference workers to stop experience collection... (221400 times) [2024-06-25 15:24:50,546][15349] Signal inference workers to resume experience collection... (221400 times) [2024-06-25 15:24:50,567][15401] InferenceWorker_p0-w0: stopping experience collection (221400 times) [2024-06-25 15:24:50,567][15401] InferenceWorker_p0-w0: resuming experience collection (221400 times) [2024-06-25 15:24:50,728][15401] Updated weights for policy 0, policy_version 913024 (0.0044) [2024-06-25 15:24:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 14959083520. Throughput: 0: 42780.7. Samples: 14959166880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 15:24:54,799][15401] Updated weights for policy 0, policy_version 913034 (0.0031) [2024-06-25 15:24:58,327][15401] Updated weights for policy 0, policy_version 913044 (0.0049) [2024-06-25 15:24:58,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14959312896. Throughput: 0: 42856.4. Samples: 14959429480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:24:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:25:02,348][15401] Updated weights for policy 0, policy_version 913054 (0.0040) [2024-06-25 15:25:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14959525888. Throughput: 0: 42885.6. Samples: 14959687340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:03,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 15:25:05,796][15401] Updated weights for policy 0, policy_version 913064 (0.0037) [2024-06-25 15:25:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 14959738880. Throughput: 0: 42927.0. Samples: 14959814100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:08,390][15132] Avg episode reward: [(0, '0.164')] [2024-06-25 15:25:10,147][15401] Updated weights for policy 0, policy_version 913074 (0.0029) [2024-06-25 15:25:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14959951872. Throughput: 0: 42980.0. Samples: 14960073960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:13,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 15:25:13,623][15401] Updated weights for policy 0, policy_version 913084 (0.0025) [2024-06-25 15:25:17,498][15401] Updated weights for policy 0, policy_version 913094 (0.0042) [2024-06-25 15:25:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 14960164864. Throughput: 0: 43003.1. Samples: 14960331300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:18,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 15:25:21,449][15401] Updated weights for policy 0, policy_version 913104 (0.0040) [2024-06-25 15:25:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 14960377856. Throughput: 0: 42932.0. Samples: 14960458100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 15:25:25,086][15401] Updated weights for policy 0, policy_version 913114 (0.0036) [2024-06-25 15:25:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 14960590848. Throughput: 0: 42913.4. Samples: 14960715940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:28,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 15:25:28,980][15401] Updated weights for policy 0, policy_version 913124 (0.0030) [2024-06-25 15:25:32,645][15401] Updated weights for policy 0, policy_version 913134 (0.0032) [2024-06-25 15:25:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.9, 300 sec: 42764.7). Total num frames: 14960820224. Throughput: 0: 43055.2. Samples: 14960979620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:33,393][15132] Avg episode reward: [(0, '0.131')] [2024-06-25 15:25:36,558][15401] Updated weights for policy 0, policy_version 913144 (0.0027) [2024-06-25 15:25:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 14961033216. Throughput: 0: 43173.1. Samples: 14961109660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:38,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 15:25:40,182][15401] Updated weights for policy 0, policy_version 913154 (0.0024) [2024-06-25 15:25:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14961246208. Throughput: 0: 43054.2. Samples: 14961366920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:43,390][15132] Avg episode reward: [(0, '0.266')] [2024-06-25 15:25:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913163_14961262592.pth... [2024-06-25 15:25:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912535_14950973440.pth [2024-06-25 15:25:44,116][15401] Updated weights for policy 0, policy_version 913164 (0.0036) [2024-06-25 15:25:47,698][15401] Updated weights for policy 0, policy_version 913174 (0.0033) [2024-06-25 15:25:48,394][15132] Fps is (10 sec: 42578.6, 60 sec: 43141.4, 300 sec: 42764.3). Total num frames: 14961459200. Throughput: 0: 43024.1. Samples: 14961623620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:48,394][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 15:25:51,722][15401] Updated weights for policy 0, policy_version 913184 (0.0050) [2024-06-25 15:25:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 14961672192. Throughput: 0: 43121.9. Samples: 14961754580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:53,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 15:25:55,171][15401] Updated weights for policy 0, policy_version 913194 (0.0035) [2024-06-25 15:25:58,389][15132] Fps is (10 sec: 42618.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14961885184. Throughput: 0: 43077.8. Samples: 14962012460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:25:58,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 15:25:59,622][15401] Updated weights for policy 0, policy_version 913204 (0.0038) [2024-06-25 15:26:02,921][15401] Updated weights for policy 0, policy_version 913214 (0.0039) [2024-06-25 15:26:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14962098176. Throughput: 0: 43119.7. Samples: 14962271680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:26:03,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 15:26:07,161][15401] Updated weights for policy 0, policy_version 913224 (0.0040) [2024-06-25 15:26:07,304][15349] Signal inference workers to stop experience collection... (221450 times) [2024-06-25 15:26:07,305][15349] Signal inference workers to resume experience collection... (221450 times) [2024-06-25 15:26:07,347][15401] InferenceWorker_p0-w0: stopping experience collection (221450 times) [2024-06-25 15:26:07,347][15401] InferenceWorker_p0-w0: resuming experience collection (221450 times) [2024-06-25 15:26:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14962311168. Throughput: 0: 43194.3. Samples: 14962401840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:26:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 15:26:10,632][15401] Updated weights for policy 0, policy_version 913234 (0.0046) [2024-06-25 15:26:13,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 14962540544. Throughput: 0: 43082.6. Samples: 14962654760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:26:13,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 15:26:14,705][15401] Updated weights for policy 0, policy_version 913244 (0.0034) [2024-06-25 15:26:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 14962737152. Throughput: 0: 42990.9. Samples: 14962914100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:26:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 15:26:18,424][15401] Updated weights for policy 0, policy_version 913254 (0.0039) [2024-06-25 15:26:22,285][15401] Updated weights for policy 0, policy_version 913264 (0.0034) [2024-06-25 15:26:23,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14962950144. Throughput: 0: 42875.9. Samples: 14963039080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 15:26:23,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 15:26:26,004][15401] Updated weights for policy 0, policy_version 913274 (0.0029) [2024-06-25 15:26:28,390][15132] Fps is (10 sec: 44235.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14963179520. Throughput: 0: 42907.5. Samples: 14963297760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:28,390][15132] Avg episode reward: [(0, '0.905')] [2024-06-25 15:26:29,828][15401] Updated weights for policy 0, policy_version 913284 (0.0037) [2024-06-25 15:26:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 14963376128. Throughput: 0: 43052.5. Samples: 14963560780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:26:33,702][15401] Updated weights for policy 0, policy_version 913294 (0.0040) [2024-06-25 15:26:37,456][15401] Updated weights for policy 0, policy_version 913304 (0.0029) [2024-06-25 15:26:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14963605504. Throughput: 0: 42872.3. Samples: 14963683840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 15:26:41,362][15401] Updated weights for policy 0, policy_version 913314 (0.0031) [2024-06-25 15:26:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14963818496. Throughput: 0: 42806.6. Samples: 14963938760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 15:26:45,253][15401] Updated weights for policy 0, policy_version 913324 (0.0044) [2024-06-25 15:26:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42601.7, 300 sec: 42765.0). Total num frames: 14964015104. Throughput: 0: 42880.5. Samples: 14964201300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 15:26:48,982][15401] Updated weights for policy 0, policy_version 913334 (0.0032) [2024-06-25 15:26:52,748][15401] Updated weights for policy 0, policy_version 913344 (0.0035) [2024-06-25 15:26:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14964244480. Throughput: 0: 42673.6. Samples: 14964322160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 15:26:56,786][15401] Updated weights for policy 0, policy_version 913354 (0.0030) [2024-06-25 15:26:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14964473856. Throughput: 0: 42813.0. Samples: 14964581240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:26:58,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 15:27:00,151][15401] Updated weights for policy 0, policy_version 913364 (0.0028) [2024-06-25 15:27:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14964670464. Throughput: 0: 42892.3. Samples: 14964844260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 15:27:04,307][15401] Updated weights for policy 0, policy_version 913374 (0.0034) [2024-06-25 15:27:07,651][15401] Updated weights for policy 0, policy_version 913384 (0.0027) [2024-06-25 15:27:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14964899840. Throughput: 0: 42780.0. Samples: 14964964180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 15:27:12,002][15401] Updated weights for policy 0, policy_version 913394 (0.0039) [2024-06-25 15:27:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14965112832. Throughput: 0: 42817.5. Samples: 14965224540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 15:27:15,284][15401] Updated weights for policy 0, policy_version 913404 (0.0040) [2024-06-25 15:27:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14965309440. Throughput: 0: 42819.0. Samples: 14965487640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 15:27:19,511][15401] Updated weights for policy 0, policy_version 913414 (0.0041) [2024-06-25 15:27:23,266][15401] Updated weights for policy 0, policy_version 913424 (0.0029) [2024-06-25 15:27:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 14965538816. Throughput: 0: 42824.4. Samples: 14965610940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:23,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 15:27:27,332][15401] Updated weights for policy 0, policy_version 913434 (0.0042) [2024-06-25 15:27:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 14965751808. Throughput: 0: 42846.2. Samples: 14965866840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 15:27:30,997][15401] Updated weights for policy 0, policy_version 913444 (0.0040) [2024-06-25 15:27:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14965932032. Throughput: 0: 42720.4. Samples: 14966123720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:33,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 15:27:35,026][15401] Updated weights for policy 0, policy_version 913454 (0.0028) [2024-06-25 15:27:38,234][15349] Signal inference workers to stop experience collection... (221500 times) [2024-06-25 15:27:38,235][15349] Signal inference workers to resume experience collection... (221500 times) [2024-06-25 15:27:38,284][15401] InferenceWorker_p0-w0: stopping experience collection (221500 times) [2024-06-25 15:27:38,284][15401] InferenceWorker_p0-w0: resuming experience collection (221500 times) [2024-06-25 15:27:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 14966177792. Throughput: 0: 42722.9. Samples: 14966244680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 15:27:38,536][15401] Updated weights for policy 0, policy_version 913464 (0.0039) [2024-06-25 15:27:42,702][15401] Updated weights for policy 0, policy_version 913474 (0.0041) [2024-06-25 15:27:43,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14966390784. Throughput: 0: 42830.7. Samples: 14966508620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 15:27:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913476_14966390784.pth... [2024-06-25 15:27:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000912849_14956118016.pth [2024-06-25 15:27:46,295][15401] Updated weights for policy 0, policy_version 913484 (0.0028) [2024-06-25 15:27:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 14966603776. Throughput: 0: 42633.7. Samples: 14966762780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:48,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 15:27:50,200][15401] Updated weights for policy 0, policy_version 913494 (0.0037) [2024-06-25 15:27:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14966800384. Throughput: 0: 42807.0. Samples: 14966890500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:53,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 15:27:53,956][15401] Updated weights for policy 0, policy_version 913504 (0.0037) [2024-06-25 15:27:58,105][15401] Updated weights for policy 0, policy_version 913514 (0.0043) [2024-06-25 15:27:58,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 14967013376. Throughput: 0: 42800.5. Samples: 14967150560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:27:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 15:28:01,742][15401] Updated weights for policy 0, policy_version 913524 (0.0030) [2024-06-25 15:28:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14967242752. Throughput: 0: 42639.6. Samples: 14967406420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:28:03,398][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 15:28:06,059][15401] Updated weights for policy 0, policy_version 913534 (0.0042) [2024-06-25 15:28:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14967455744. Throughput: 0: 42665.4. Samples: 14967530880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:28:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 15:28:09,373][15401] Updated weights for policy 0, policy_version 913544 (0.0040) [2024-06-25 15:28:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 14967652352. Throughput: 0: 42732.0. Samples: 14967789780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:28:13,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 15:28:13,504][15401] Updated weights for policy 0, policy_version 913554 (0.0039) [2024-06-25 15:28:17,057][15401] Updated weights for policy 0, policy_version 913564 (0.0024) [2024-06-25 15:28:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 14967898112. Throughput: 0: 42490.3. Samples: 14968035780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 15:28:18,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 15:28:21,135][15401] Updated weights for policy 0, policy_version 913574 (0.0024) [2024-06-25 15:28:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14968078336. Throughput: 0: 42769.3. Samples: 14968169300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:23,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 15:28:25,003][15401] Updated weights for policy 0, policy_version 913584 (0.0037) [2024-06-25 15:28:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14968291328. Throughput: 0: 42668.4. Samples: 14968428700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:28,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 15:28:28,895][15401] Updated weights for policy 0, policy_version 913594 (0.0035) [2024-06-25 15:28:32,279][15401] Updated weights for policy 0, policy_version 913604 (0.0035) [2024-06-25 15:28:33,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 14968553472. Throughput: 0: 42588.1. Samples: 14968679240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 15:28:36,440][15401] Updated weights for policy 0, policy_version 913614 (0.0029) [2024-06-25 15:28:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 14968717312. Throughput: 0: 42905.9. Samples: 14968821260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:38,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 15:28:39,753][15401] Updated weights for policy 0, policy_version 913624 (0.0037) [2024-06-25 15:28:43,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 14968930304. Throughput: 0: 42799.0. Samples: 14969076520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:43,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 15:28:43,935][15401] Updated weights for policy 0, policy_version 913634 (0.0036) [2024-06-25 15:28:47,478][15401] Updated weights for policy 0, policy_version 913644 (0.0030) [2024-06-25 15:28:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42932.6). Total num frames: 14969176064. Throughput: 0: 42612.9. Samples: 14969324000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:48,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 15:28:51,811][15401] Updated weights for policy 0, policy_version 913654 (0.0029) [2024-06-25 15:28:52,804][15349] Signal inference workers to stop experience collection... (221550 times) [2024-06-25 15:28:52,804][15349] Signal inference workers to resume experience collection... (221550 times) [2024-06-25 15:28:52,838][15401] InferenceWorker_p0-w0: stopping experience collection (221550 times) [2024-06-25 15:28:52,838][15401] InferenceWorker_p0-w0: resuming experience collection (221550 times) [2024-06-25 15:28:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14969372672. Throughput: 0: 42839.9. Samples: 14969458680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:53,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 15:28:54,896][15401] Updated weights for policy 0, policy_version 913664 (0.0032) [2024-06-25 15:28:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14969585664. Throughput: 0: 42780.8. Samples: 14969714920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:28:58,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 15:28:59,663][15401] Updated weights for policy 0, policy_version 913674 (0.0031) [2024-06-25 15:29:02,590][15401] Updated weights for policy 0, policy_version 913684 (0.0044) [2024-06-25 15:29:03,392][15132] Fps is (10 sec: 45864.6, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 14969831424. Throughput: 0: 42854.0. Samples: 14969964320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:03,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 15:29:07,233][15401] Updated weights for policy 0, policy_version 913694 (0.0037) [2024-06-25 15:29:08,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 14970011648. Throughput: 0: 42900.3. Samples: 14970099920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:08,393][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 15:29:10,260][15401] Updated weights for policy 0, policy_version 913704 (0.0024) [2024-06-25 15:29:13,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14970224640. Throughput: 0: 42739.0. Samples: 14970351960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 15:29:14,841][15401] Updated weights for policy 0, policy_version 913714 (0.0041) [2024-06-25 15:29:18,073][15401] Updated weights for policy 0, policy_version 913724 (0.0042) [2024-06-25 15:29:18,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 14970454016. Throughput: 0: 42924.8. Samples: 14970610860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 15:29:22,269][15401] Updated weights for policy 0, policy_version 913734 (0.0037) [2024-06-25 15:29:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14970667008. Throughput: 0: 42827.5. Samples: 14970748500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 15:29:25,597][15401] Updated weights for policy 0, policy_version 913744 (0.0030) [2024-06-25 15:29:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 14970880000. Throughput: 0: 42766.2. Samples: 14971001000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 15:29:29,906][15401] Updated weights for policy 0, policy_version 913754 (0.0034) [2024-06-25 15:29:33,368][15401] Updated weights for policy 0, policy_version 913764 (0.0038) [2024-06-25 15:29:33,394][15132] Fps is (10 sec: 44214.9, 60 sec: 42594.8, 300 sec: 42930.9). Total num frames: 14971109376. Throughput: 0: 42985.9. Samples: 14971258580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:33,395][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 15:29:37,532][15401] Updated weights for policy 0, policy_version 913774 (0.0029) [2024-06-25 15:29:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 14971305984. Throughput: 0: 42999.2. Samples: 14971393640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 15:29:41,235][15401] Updated weights for policy 0, policy_version 913784 (0.0034) [2024-06-25 15:29:43,390][15132] Fps is (10 sec: 40979.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14971518976. Throughput: 0: 42846.5. Samples: 14971643020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 15:29:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913789_14971518976.pth... [2024-06-25 15:29:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913163_14961262592.pth [2024-06-25 15:29:45,343][15401] Updated weights for policy 0, policy_version 913794 (0.0042) [2024-06-25 15:29:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 14971748352. Throughput: 0: 42961.5. Samples: 14971897480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:48,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 15:29:48,770][15401] Updated weights for policy 0, policy_version 913804 (0.0036) [2024-06-25 15:29:53,056][15401] Updated weights for policy 0, policy_version 913814 (0.0034) [2024-06-25 15:29:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14971944960. Throughput: 0: 42955.5. Samples: 14972032820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:53,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 15:29:56,663][15401] Updated weights for policy 0, policy_version 913824 (0.0036) [2024-06-25 15:29:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14972174336. Throughput: 0: 42930.8. Samples: 14972283840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:29:58,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 15:30:00,611][15401] Updated weights for policy 0, policy_version 913834 (0.0035) [2024-06-25 15:30:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 14972387328. Throughput: 0: 42884.9. Samples: 14972540680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:30:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 15:30:04,314][15401] Updated weights for policy 0, policy_version 913844 (0.0047) [2024-06-25 15:30:08,311][15401] Updated weights for policy 0, policy_version 913854 (0.0041) [2024-06-25 15:30:08,320][15349] Signal inference workers to stop experience collection... (221600 times) [2024-06-25 15:30:08,320][15349] Signal inference workers to resume experience collection... (221600 times) [2024-06-25 15:30:08,359][15401] InferenceWorker_p0-w0: stopping experience collection (221600 times) [2024-06-25 15:30:08,359][15401] InferenceWorker_p0-w0: resuming experience collection (221600 times) [2024-06-25 15:30:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 14972583936. Throughput: 0: 42752.0. Samples: 14972672340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:30:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 15:30:12,038][15401] Updated weights for policy 0, policy_version 913864 (0.0047) [2024-06-25 15:30:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14972796928. Throughput: 0: 42717.0. Samples: 14972923260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:30:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 15:30:15,925][15401] Updated weights for policy 0, policy_version 913874 (0.0036) [2024-06-25 15:30:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14973009920. Throughput: 0: 42557.7. Samples: 14973173460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:18,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 15:30:20,041][15401] Updated weights for policy 0, policy_version 913884 (0.0030) [2024-06-25 15:30:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14973222912. Throughput: 0: 42435.1. Samples: 14973303220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:23,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 15:30:23,461][15401] Updated weights for policy 0, policy_version 913894 (0.0034) [2024-06-25 15:30:27,528][15401] Updated weights for policy 0, policy_version 913904 (0.0036) [2024-06-25 15:30:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 14973435904. Throughput: 0: 42818.4. Samples: 14973569840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 15:30:31,019][15401] Updated weights for policy 0, policy_version 913914 (0.0036) [2024-06-25 15:30:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42601.9, 300 sec: 42820.5). Total num frames: 14973665280. Throughput: 0: 42781.3. Samples: 14973822640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:33,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 15:30:35,147][15401] Updated weights for policy 0, policy_version 913924 (0.0032) [2024-06-25 15:30:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14973861888. Throughput: 0: 42679.2. Samples: 14973953380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 15:30:38,604][15401] Updated weights for policy 0, policy_version 913934 (0.0024) [2024-06-25 15:30:42,538][15401] Updated weights for policy 0, policy_version 913944 (0.0040) [2024-06-25 15:30:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42710.2). Total num frames: 14974058496. Throughput: 0: 42906.7. Samples: 14974214640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 15:30:46,165][15401] Updated weights for policy 0, policy_version 913954 (0.0036) [2024-06-25 15:30:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 14974304256. Throughput: 0: 42628.8. Samples: 14974458980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:48,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 15:30:50,134][15401] Updated weights for policy 0, policy_version 913964 (0.0036) [2024-06-25 15:30:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14974500864. Throughput: 0: 42746.7. Samples: 14974595940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:53,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 15:30:53,950][15401] Updated weights for policy 0, policy_version 913974 (0.0040) [2024-06-25 15:30:57,552][15401] Updated weights for policy 0, policy_version 913984 (0.0038) [2024-06-25 15:30:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14974713856. Throughput: 0: 42859.3. Samples: 14974851940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:30:58,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 15:31:01,668][15401] Updated weights for policy 0, policy_version 913994 (0.0047) [2024-06-25 15:31:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14974943232. Throughput: 0: 42935.4. Samples: 14975105560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:03,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 15:31:05,054][15401] Updated weights for policy 0, policy_version 914004 (0.0039) [2024-06-25 15:31:06,964][15349] Signal inference workers to stop experience collection... (221650 times) [2024-06-25 15:31:06,964][15349] Signal inference workers to resume experience collection... (221650 times) [2024-06-25 15:31:07,012][15401] InferenceWorker_p0-w0: stopping experience collection (221650 times) [2024-06-25 15:31:07,012][15401] InferenceWorker_p0-w0: resuming experience collection (221650 times) [2024-06-25 15:31:08,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 14975156224. Throughput: 0: 43060.1. Samples: 14975240920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:08,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 15:31:09,242][15401] Updated weights for policy 0, policy_version 914014 (0.0023) [2024-06-25 15:31:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14975352832. Throughput: 0: 42826.3. Samples: 14975497020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 15:31:13,486][15401] Updated weights for policy 0, policy_version 914024 (0.0040) [2024-06-25 15:31:17,008][15401] Updated weights for policy 0, policy_version 914034 (0.0023) [2024-06-25 15:31:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14975598592. Throughput: 0: 42753.8. Samples: 14975746560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 15:31:21,138][15401] Updated weights for policy 0, policy_version 914044 (0.0029) [2024-06-25 15:31:23,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14975795200. Throughput: 0: 42791.2. Samples: 14975878980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 15:31:24,677][15401] Updated weights for policy 0, policy_version 914054 (0.0030) [2024-06-25 15:31:28,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 14976008192. Throughput: 0: 42593.2. Samples: 14976131340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:28,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 15:31:28,615][15401] Updated weights for policy 0, policy_version 914064 (0.0028) [2024-06-25 15:31:32,252][15401] Updated weights for policy 0, policy_version 914074 (0.0029) [2024-06-25 15:31:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14976221184. Throughput: 0: 42835.3. Samples: 14976386560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:33,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 15:31:36,171][15401] Updated weights for policy 0, policy_version 914084 (0.0042) [2024-06-25 15:31:38,392][15132] Fps is (10 sec: 42588.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 14976434176. Throughput: 0: 42709.3. Samples: 14976517960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:38,392][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:31:39,823][15401] Updated weights for policy 0, policy_version 914094 (0.0045) [2024-06-25 15:31:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 14976663552. Throughput: 0: 42679.2. Samples: 14976772500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:43,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 15:31:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914103_14976663552.pth... [2024-06-25 15:31:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913476_14966390784.pth [2024-06-25 15:31:44,022][15401] Updated weights for policy 0, policy_version 914104 (0.0035) [2024-06-25 15:31:47,700][15401] Updated weights for policy 0, policy_version 914114 (0.0030) [2024-06-25 15:31:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14976860160. Throughput: 0: 42704.0. Samples: 14977027240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:48,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 15:31:51,648][15401] Updated weights for policy 0, policy_version 914124 (0.0047) [2024-06-25 15:31:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 14977056768. Throughput: 0: 42503.0. Samples: 14977153560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:53,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 15:31:55,218][15401] Updated weights for policy 0, policy_version 914134 (0.0040) [2024-06-25 15:31:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14977302528. Throughput: 0: 42633.7. Samples: 14977415540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:31:58,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 15:31:59,405][15401] Updated weights for policy 0, policy_version 914144 (0.0035) [2024-06-25 15:32:02,836][15401] Updated weights for policy 0, policy_version 914154 (0.0026) [2024-06-25 15:32:03,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 14977515520. Throughput: 0: 42752.0. Samples: 14977670400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:32:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 15:32:06,878][15401] Updated weights for policy 0, policy_version 914164 (0.0029) [2024-06-25 15:32:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 14977712128. Throughput: 0: 42568.4. Samples: 14977794560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 15:32:08,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 15:32:10,360][15401] Updated weights for policy 0, policy_version 914174 (0.0044) [2024-06-25 15:32:13,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14977941504. Throughput: 0: 42656.9. Samples: 14978050900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 15:32:14,419][15401] Updated weights for policy 0, policy_version 914184 (0.0057) [2024-06-25 15:32:18,380][15401] Updated weights for policy 0, policy_version 914194 (0.0026) [2024-06-25 15:32:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.6, 300 sec: 42764.7). Total num frames: 14978154496. Throughput: 0: 42642.6. Samples: 14978305580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:18,393][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 15:32:21,996][15401] Updated weights for policy 0, policy_version 914204 (0.0038) [2024-06-25 15:32:23,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 14978351104. Throughput: 0: 42693.4. Samples: 14978439060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:23,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 15:32:26,073][15401] Updated weights for policy 0, policy_version 914214 (0.0031) [2024-06-25 15:32:28,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14978580480. Throughput: 0: 42688.9. Samples: 14978693500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:28,390][15132] Avg episode reward: [(0, '0.306')] [2024-06-25 15:32:29,703][15401] Updated weights for policy 0, policy_version 914224 (0.0028) [2024-06-25 15:32:33,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.7, 300 sec: 42764.6). Total num frames: 14978793472. Throughput: 0: 42576.8. Samples: 14978943300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:33,393][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:32:33,789][15401] Updated weights for policy 0, policy_version 914234 (0.0030) [2024-06-25 15:32:36,934][15349] Signal inference workers to stop experience collection... (221700 times) [2024-06-25 15:32:36,934][15349] Signal inference workers to resume experience collection... (221700 times) [2024-06-25 15:32:36,961][15401] InferenceWorker_p0-w0: stopping experience collection (221700 times) [2024-06-25 15:32:36,961][15401] InferenceWorker_p0-w0: resuming experience collection (221700 times) [2024-06-25 15:32:37,866][15401] Updated weights for policy 0, policy_version 914244 (0.0033) [2024-06-25 15:32:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 14979006464. Throughput: 0: 42758.3. Samples: 14979077680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 15:32:41,470][15401] Updated weights for policy 0, policy_version 914254 (0.0023) [2024-06-25 15:32:43,392][15132] Fps is (10 sec: 40959.9, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 14979203072. Throughput: 0: 42583.0. Samples: 14979331880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:43,393][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 15:32:45,235][15401] Updated weights for policy 0, policy_version 914264 (0.0039) [2024-06-25 15:32:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14979416064. Throughput: 0: 42717.7. Samples: 14979592700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 15:32:49,199][15401] Updated weights for policy 0, policy_version 914274 (0.0025) [2024-06-25 15:32:52,623][15401] Updated weights for policy 0, policy_version 914284 (0.0038) [2024-06-25 15:32:53,389][15132] Fps is (10 sec: 45886.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 14979661824. Throughput: 0: 42950.3. Samples: 14979727320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 15:32:56,639][15401] Updated weights for policy 0, policy_version 914294 (0.0031) [2024-06-25 15:32:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 14979858432. Throughput: 0: 42931.7. Samples: 14979982820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:32:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:33:00,093][15401] Updated weights for policy 0, policy_version 914304 (0.0033) [2024-06-25 15:33:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14980071424. Throughput: 0: 42994.7. Samples: 14980240240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:03,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 15:33:04,416][15401] Updated weights for policy 0, policy_version 914314 (0.0030) [2024-06-25 15:33:07,659][15401] Updated weights for policy 0, policy_version 914324 (0.0040) [2024-06-25 15:33:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14980300800. Throughput: 0: 42936.3. Samples: 14980371200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:08,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 15:33:11,841][15401] Updated weights for policy 0, policy_version 914334 (0.0031) [2024-06-25 15:33:13,395][15132] Fps is (10 sec: 44213.0, 60 sec: 42867.7, 300 sec: 42764.2). Total num frames: 14980513792. Throughput: 0: 43069.1. Samples: 14980631840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:13,396][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 15:33:15,648][15401] Updated weights for policy 0, policy_version 914344 (0.0046) [2024-06-25 15:33:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14980726784. Throughput: 0: 43141.0. Samples: 14980884540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:33:19,476][15401] Updated weights for policy 0, policy_version 914354 (0.0029) [2024-06-25 15:33:23,389][15132] Fps is (10 sec: 40982.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14980923392. Throughput: 0: 43048.9. Samples: 14981014880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 15:33:23,516][15401] Updated weights for policy 0, policy_version 914364 (0.0040) [2024-06-25 15:33:27,754][15401] Updated weights for policy 0, policy_version 914374 (0.0032) [2024-06-25 15:33:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 14981136384. Throughput: 0: 43033.0. Samples: 14981268260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:28,392][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 15:33:30,975][15401] Updated weights for policy 0, policy_version 914384 (0.0031) [2024-06-25 15:33:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 14981365760. Throughput: 0: 42863.6. Samples: 14981521560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:33,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 15:33:35,354][15401] Updated weights for policy 0, policy_version 914394 (0.0029) [2024-06-25 15:33:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14981578752. Throughput: 0: 42803.5. Samples: 14981653480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:38,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 15:33:38,660][15401] Updated weights for policy 0, policy_version 914404 (0.0035) [2024-06-25 15:33:43,118][15401] Updated weights for policy 0, policy_version 914414 (0.0031) [2024-06-25 15:33:43,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 14981775360. Throughput: 0: 42858.6. Samples: 14981911460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:43,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 15:33:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914416_14981791744.pth... [2024-06-25 15:33:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000913789_14971518976.pth [2024-06-25 15:33:46,354][15401] Updated weights for policy 0, policy_version 914424 (0.0033) [2024-06-25 15:33:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 14982004736. Throughput: 0: 42808.6. Samples: 14982166620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:48,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 15:33:51,065][15401] Updated weights for policy 0, policy_version 914434 (0.0049) [2024-06-25 15:33:51,994][15349] Signal inference workers to stop experience collection... (221750 times) [2024-06-25 15:33:51,994][15349] Signal inference workers to resume experience collection... (221750 times) [2024-06-25 15:33:52,020][15401] InferenceWorker_p0-w0: stopping experience collection (221750 times) [2024-06-25 15:33:52,020][15401] InferenceWorker_p0-w0: resuming experience collection (221750 times) [2024-06-25 15:33:53,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14982234112. Throughput: 0: 42898.3. Samples: 14982301620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:53,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-25 15:33:53,937][15401] Updated weights for policy 0, policy_version 914444 (0.0029) [2024-06-25 15:33:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 14982397952. Throughput: 0: 42767.4. Samples: 14982556140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:33:58,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 15:33:58,526][15401] Updated weights for policy 0, policy_version 914454 (0.0039) [2024-06-25 15:34:01,750][15401] Updated weights for policy 0, policy_version 914464 (0.0042) [2024-06-25 15:34:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 14982643712. Throughput: 0: 42674.2. Samples: 14982804880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-25 15:34:03,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 15:34:05,995][15401] Updated weights for policy 0, policy_version 914474 (0.0041) [2024-06-25 15:34:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 14982856704. Throughput: 0: 42756.5. Samples: 14982938920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:08,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 15:34:09,151][15401] Updated weights for policy 0, policy_version 914484 (0.0027) [2024-06-25 15:34:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42329.2, 300 sec: 42709.5). Total num frames: 14983053312. Throughput: 0: 42802.7. Samples: 14983194380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:13,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 15:34:13,853][15401] Updated weights for policy 0, policy_version 914494 (0.0037) [2024-06-25 15:34:16,691][15401] Updated weights for policy 0, policy_version 914504 (0.0031) [2024-06-25 15:34:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14983299072. Throughput: 0: 42772.0. Samples: 14983446300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:18,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:34:21,372][15401] Updated weights for policy 0, policy_version 914514 (0.0042) [2024-06-25 15:34:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14983495680. Throughput: 0: 43002.2. Samples: 14983588580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 15:34:24,193][15401] Updated weights for policy 0, policy_version 914524 (0.0028) [2024-06-25 15:34:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42710.2). Total num frames: 14983708672. Throughput: 0: 42859.6. Samples: 14983840140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 15:34:28,827][15401] Updated weights for policy 0, policy_version 914534 (0.0042) [2024-06-25 15:34:31,867][15401] Updated weights for policy 0, policy_version 914544 (0.0034) [2024-06-25 15:34:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14983954432. Throughput: 0: 42895.4. Samples: 14984096920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 15:34:36,395][15401] Updated weights for policy 0, policy_version 914554 (0.0028) [2024-06-25 15:34:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 14984134656. Throughput: 0: 42889.8. Samples: 14984231660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:38,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:34:39,866][15401] Updated weights for policy 0, policy_version 914564 (0.0023) [2024-06-25 15:34:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 14984364032. Throughput: 0: 42844.9. Samples: 14984484160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 15:34:44,034][15401] Updated weights for policy 0, policy_version 914574 (0.0045) [2024-06-25 15:34:47,610][15401] Updated weights for policy 0, policy_version 914584 (0.0023) [2024-06-25 15:34:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 14984593408. Throughput: 0: 42888.0. Samples: 14984734840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 15:34:51,563][15401] Updated weights for policy 0, policy_version 914594 (0.0024) [2024-06-25 15:34:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 14984790016. Throughput: 0: 42927.0. Samples: 14984870640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:53,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 15:34:55,174][15401] Updated weights for policy 0, policy_version 914604 (0.0033) [2024-06-25 15:34:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 14985003008. Throughput: 0: 42918.7. Samples: 14985125720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:34:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 15:34:58,992][15401] Updated weights for policy 0, policy_version 914614 (0.0045) [2024-06-25 15:35:02,983][15401] Updated weights for policy 0, policy_version 914624 (0.0025) [2024-06-25 15:35:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14985216000. Throughput: 0: 43072.9. Samples: 14985384580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:03,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 15:35:06,602][15401] Updated weights for policy 0, policy_version 914634 (0.0032) [2024-06-25 15:35:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14985428992. Throughput: 0: 42855.3. Samples: 14985517060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 15:35:10,176][15349] Signal inference workers to stop experience collection... (221800 times) [2024-06-25 15:35:10,177][15349] Signal inference workers to resume experience collection... (221800 times) [2024-06-25 15:35:10,191][15401] InferenceWorker_p0-w0: stopping experience collection (221800 times) [2024-06-25 15:35:10,191][15401] InferenceWorker_p0-w0: resuming experience collection (221800 times) [2024-06-25 15:35:10,329][15401] Updated weights for policy 0, policy_version 914644 (0.0045) [2024-06-25 15:35:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 14985641984. Throughput: 0: 42815.4. Samples: 14985766840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 15:35:14,062][15401] Updated weights for policy 0, policy_version 914654 (0.0036) [2024-06-25 15:35:18,039][15401] Updated weights for policy 0, policy_version 914664 (0.0033) [2024-06-25 15:35:18,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14985854976. Throughput: 0: 42952.9. Samples: 14986029800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:18,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 15:35:21,711][15401] Updated weights for policy 0, policy_version 914674 (0.0048) [2024-06-25 15:35:23,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 14986067968. Throughput: 0: 42921.3. Samples: 14986163120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 15:35:25,727][15401] Updated weights for policy 0, policy_version 914684 (0.0028) [2024-06-25 15:35:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14986297344. Throughput: 0: 42805.3. Samples: 14986410400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:28,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 15:35:29,305][15401] Updated weights for policy 0, policy_version 914694 (0.0045) [2024-06-25 15:35:33,263][15401] Updated weights for policy 0, policy_version 914704 (0.0039) [2024-06-25 15:35:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 14986510336. Throughput: 0: 43099.6. Samples: 14986674320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 15:35:36,861][15401] Updated weights for policy 0, policy_version 914714 (0.0041) [2024-06-25 15:35:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14986706944. Throughput: 0: 42962.8. Samples: 14986803960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 15:35:40,810][15401] Updated weights for policy 0, policy_version 914724 (0.0032) [2024-06-25 15:35:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14986952704. Throughput: 0: 42958.2. Samples: 14987058840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:43,392][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 15:35:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914731_14986952704.pth... [2024-06-25 15:35:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914103_14976663552.pth [2024-06-25 15:35:44,563][15401] Updated weights for policy 0, policy_version 914734 (0.0046) [2024-06-25 15:35:48,386][15401] Updated weights for policy 0, policy_version 914744 (0.0031) [2024-06-25 15:35:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14987165696. Throughput: 0: 42990.3. Samples: 14987319140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 15:35:52,461][15401] Updated weights for policy 0, policy_version 914754 (0.0031) [2024-06-25 15:35:53,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 14987362304. Throughput: 0: 42732.7. Samples: 14987440140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-25 15:35:53,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 15:35:56,429][15401] Updated weights for policy 0, policy_version 914764 (0.0042) [2024-06-25 15:35:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14987591680. Throughput: 0: 42939.2. Samples: 14987699100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:35:58,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 15:35:59,886][15401] Updated weights for policy 0, policy_version 914774 (0.0030) [2024-06-25 15:36:03,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14987804672. Throughput: 0: 42808.4. Samples: 14987956180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 15:36:03,958][15401] Updated weights for policy 0, policy_version 914784 (0.0030) [2024-06-25 15:36:07,397][15401] Updated weights for policy 0, policy_version 914794 (0.0026) [2024-06-25 15:36:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14988001280. Throughput: 0: 42704.0. Samples: 14988084800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:08,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 15:36:11,543][15401] Updated weights for policy 0, policy_version 914804 (0.0026) [2024-06-25 15:36:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 14988230656. Throughput: 0: 42922.2. Samples: 14988341900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 15:36:15,309][15401] Updated weights for policy 0, policy_version 914814 (0.0041) [2024-06-25 15:36:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14988427264. Throughput: 0: 42873.2. Samples: 14988603620. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 15:36:19,117][15401] Updated weights for policy 0, policy_version 914824 (0.0037) [2024-06-25 15:36:22,843][15401] Updated weights for policy 0, policy_version 914834 (0.0033) [2024-06-25 15:36:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14988656640. Throughput: 0: 42739.0. Samples: 14988727220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 15:36:26,912][15401] Updated weights for policy 0, policy_version 914844 (0.0038) [2024-06-25 15:36:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 14988869632. Throughput: 0: 42746.2. Samples: 14988982420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:28,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 15:36:30,343][15401] Updated weights for policy 0, policy_version 914854 (0.0033) [2024-06-25 15:36:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 14989066240. Throughput: 0: 42706.7. Samples: 14989240940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 15:36:34,743][15401] Updated weights for policy 0, policy_version 914864 (0.0029) [2024-06-25 15:36:37,092][15349] Signal inference workers to stop experience collection... (221850 times) [2024-06-25 15:36:37,092][15349] Signal inference workers to resume experience collection... (221850 times) [2024-06-25 15:36:37,148][15401] InferenceWorker_p0-w0: stopping experience collection (221850 times) [2024-06-25 15:36:37,148][15401] InferenceWorker_p0-w0: resuming experience collection (221850 times) [2024-06-25 15:36:38,369][15401] Updated weights for policy 0, policy_version 914874 (0.0035) [2024-06-25 15:36:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14989295616. Throughput: 0: 42797.1. Samples: 14989365900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:38,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 15:36:42,600][15401] Updated weights for policy 0, policy_version 914884 (0.0049) [2024-06-25 15:36:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14989492224. Throughput: 0: 42759.0. Samples: 14989623260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:36:45,863][15401] Updated weights for policy 0, policy_version 914894 (0.0037) [2024-06-25 15:36:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 14989705216. Throughput: 0: 42722.3. Samples: 14989878680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:48,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 15:36:50,292][15401] Updated weights for policy 0, policy_version 914904 (0.0039) [2024-06-25 15:36:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 14989934592. Throughput: 0: 42633.3. Samples: 14990003300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 15:36:53,567][15401] Updated weights for policy 0, policy_version 914914 (0.0028) [2024-06-25 15:36:58,131][15401] Updated weights for policy 0, policy_version 914924 (0.0036) [2024-06-25 15:36:58,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 14990131200. Throughput: 0: 42601.6. Samples: 14990258980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:36:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 15:37:01,515][15401] Updated weights for policy 0, policy_version 914934 (0.0027) [2024-06-25 15:37:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 14990344192. Throughput: 0: 42400.9. Samples: 14990511660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:03,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 15:37:05,629][15401] Updated weights for policy 0, policy_version 914944 (0.0035) [2024-06-25 15:37:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14990573568. Throughput: 0: 42516.8. Samples: 14990640480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:08,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 15:37:09,043][15401] Updated weights for policy 0, policy_version 914954 (0.0027) [2024-06-25 15:37:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 14990753792. Throughput: 0: 42628.4. Samples: 14990900700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:13,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 15:37:13,509][15401] Updated weights for policy 0, policy_version 914964 (0.0047) [2024-06-25 15:37:16,639][15401] Updated weights for policy 0, policy_version 914974 (0.0024) [2024-06-25 15:37:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14990983168. Throughput: 0: 42572.4. Samples: 14991156700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:18,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 15:37:21,067][15401] Updated weights for policy 0, policy_version 914984 (0.0036) [2024-06-25 15:37:23,390][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14991228928. Throughput: 0: 42722.5. Samples: 14991288420. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:23,396][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 15:37:24,412][15401] Updated weights for policy 0, policy_version 914994 (0.0032) [2024-06-25 15:37:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 14991409152. Throughput: 0: 42780.0. Samples: 14991548360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:28,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 15:37:28,566][15401] Updated weights for policy 0, policy_version 915004 (0.0024) [2024-06-25 15:37:31,929][15401] Updated weights for policy 0, policy_version 915014 (0.0029) [2024-06-25 15:37:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14991638528. Throughput: 0: 42760.0. Samples: 14991802880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:33,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 15:37:36,354][15401] Updated weights for policy 0, policy_version 915024 (0.0035) [2024-06-25 15:37:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.3, 300 sec: 42932.0). Total num frames: 14991867904. Throughput: 0: 42964.4. Samples: 14991936700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 15:37:39,523][15401] Updated weights for policy 0, policy_version 915034 (0.0029) [2024-06-25 15:37:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14992048128. Throughput: 0: 42904.6. Samples: 14992189680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:43,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:37:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915042_14992048128.pth... [2024-06-25 15:37:43,489][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914416_14981791744.pth [2024-06-25 15:37:44,083][15401] Updated weights for policy 0, policy_version 915044 (0.0022) [2024-06-25 15:37:47,025][15401] Updated weights for policy 0, policy_version 915054 (0.0042) [2024-06-25 15:37:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 14992293888. Throughput: 0: 42893.8. Samples: 14992441880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-25 15:37:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 15:37:51,743][15401] Updated weights for policy 0, policy_version 915064 (0.0026) [2024-06-25 15:37:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14992506880. Throughput: 0: 42967.6. Samples: 14992574020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:37:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 15:37:54,919][15401] Updated weights for policy 0, policy_version 915074 (0.0033) [2024-06-25 15:37:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 14992703488. Throughput: 0: 42815.1. Samples: 14992827380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:37:58,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 15:37:59,127][15349] Signal inference workers to stop experience collection... (221900 times) [2024-06-25 15:37:59,162][15401] InferenceWorker_p0-w0: stopping experience collection (221900 times) [2024-06-25 15:37:59,174][15349] Signal inference workers to resume experience collection... (221900 times) [2024-06-25 15:37:59,183][15401] InferenceWorker_p0-w0: resuming experience collection (221900 times) [2024-06-25 15:37:59,190][15401] Updated weights for policy 0, policy_version 915084 (0.0035) [2024-06-25 15:38:02,458][15401] Updated weights for policy 0, policy_version 915094 (0.0037) [2024-06-25 15:38:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14992932864. Throughput: 0: 42983.6. Samples: 14993090960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:03,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 15:38:06,623][15401] Updated weights for policy 0, policy_version 915104 (0.0030) [2024-06-25 15:38:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42821.3). Total num frames: 14993145856. Throughput: 0: 43069.4. Samples: 14993226540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 15:38:09,905][15401] Updated weights for policy 0, policy_version 915114 (0.0032) [2024-06-25 15:38:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 14993358848. Throughput: 0: 42863.6. Samples: 14993477220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 15:38:14,372][15401] Updated weights for policy 0, policy_version 915124 (0.0034) [2024-06-25 15:38:17,510][15401] Updated weights for policy 0, policy_version 915134 (0.0028) [2024-06-25 15:38:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 14993571840. Throughput: 0: 43108.5. Samples: 14993742760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 15:38:22,355][15401] Updated weights for policy 0, policy_version 915144 (0.0034) [2024-06-25 15:38:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14993801216. Throughput: 0: 43112.5. Samples: 14993876760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 15:38:25,371][15401] Updated weights for policy 0, policy_version 915154 (0.0038) [2024-06-25 15:38:28,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 14993997824. Throughput: 0: 42917.4. Samples: 14994120960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:28,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 15:38:29,826][15401] Updated weights for policy 0, policy_version 915164 (0.0039) [2024-06-25 15:38:32,721][15401] Updated weights for policy 0, policy_version 915174 (0.0030) [2024-06-25 15:38:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14994227200. Throughput: 0: 43135.1. Samples: 14994382960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:33,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 15:38:37,438][15401] Updated weights for policy 0, policy_version 915184 (0.0031) [2024-06-25 15:38:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 14994440192. Throughput: 0: 43174.4. Samples: 14994516860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 15:38:40,532][15401] Updated weights for policy 0, policy_version 915194 (0.0037) [2024-06-25 15:38:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 14994653184. Throughput: 0: 43177.4. Samples: 14994770360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 15:38:44,917][15401] Updated weights for policy 0, policy_version 915204 (0.0036) [2024-06-25 15:38:48,285][15401] Updated weights for policy 0, policy_version 915214 (0.0039) [2024-06-25 15:38:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 14994866176. Throughput: 0: 43185.3. Samples: 14995034300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:48,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 15:38:52,428][15401] Updated weights for policy 0, policy_version 915224 (0.0038) [2024-06-25 15:38:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 14995095552. Throughput: 0: 42962.6. Samples: 14995159860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 15:38:55,931][15401] Updated weights for policy 0, policy_version 915234 (0.0041) [2024-06-25 15:38:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 14995308544. Throughput: 0: 43036.4. Samples: 14995413860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:38:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 15:38:59,935][15401] Updated weights for policy 0, policy_version 915244 (0.0033) [2024-06-25 15:39:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 14995505152. Throughput: 0: 42978.5. Samples: 14995676800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:03,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 15:39:03,443][15401] Updated weights for policy 0, policy_version 915254 (0.0030) [2024-06-25 15:39:07,733][15401] Updated weights for policy 0, policy_version 915264 (0.0035) [2024-06-25 15:39:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 14995718144. Throughput: 0: 42910.7. Samples: 14995807740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:08,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 15:39:11,302][15349] Signal inference workers to stop experience collection... (221950 times) [2024-06-25 15:39:11,304][15349] Signal inference workers to resume experience collection... (221950 times) [2024-06-25 15:39:11,308][15401] Updated weights for policy 0, policy_version 915274 (0.0025) [2024-06-25 15:39:11,317][15401] InferenceWorker_p0-w0: stopping experience collection (221950 times) [2024-06-25 15:39:11,318][15401] InferenceWorker_p0-w0: resuming experience collection (221950 times) [2024-06-25 15:39:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 14995947520. Throughput: 0: 43032.8. Samples: 14996057440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:13,391][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 15:39:15,135][15401] Updated weights for policy 0, policy_version 915284 (0.0023) [2024-06-25 15:39:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 14996144128. Throughput: 0: 43087.5. Samples: 14996322000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:18,392][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 15:39:18,962][15401] Updated weights for policy 0, policy_version 915294 (0.0035) [2024-06-25 15:39:23,174][15401] Updated weights for policy 0, policy_version 915304 (0.0031) [2024-06-25 15:39:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14996357120. Throughput: 0: 42811.9. Samples: 14996443400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 15:39:26,482][15401] Updated weights for policy 0, policy_version 915314 (0.0031) [2024-06-25 15:39:28,390][15132] Fps is (10 sec: 45886.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 14996602880. Throughput: 0: 42825.8. Samples: 14996697520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 15:39:30,715][15401] Updated weights for policy 0, policy_version 915324 (0.0028) [2024-06-25 15:39:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 14996783104. Throughput: 0: 42699.2. Samples: 14996955760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:33,390][15132] Avg episode reward: [(0, '0.227')] [2024-06-25 15:39:34,325][15401] Updated weights for policy 0, policy_version 915334 (0.0027) [2024-06-25 15:39:38,301][15401] Updated weights for policy 0, policy_version 915344 (0.0037) [2024-06-25 15:39:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14996996096. Throughput: 0: 42567.2. Samples: 14997075380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 15:39:41,742][15401] Updated weights for policy 0, policy_version 915354 (0.0033) [2024-06-25 15:39:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 14997225472. Throughput: 0: 42583.6. Samples: 14997330120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-25 15:39:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 15:39:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915358_14997225472.pth... [2024-06-25 15:39:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000914731_14986952704.pth [2024-06-25 15:39:46,103][15401] Updated weights for policy 0, policy_version 915364 (0.0039) [2024-06-25 15:39:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 14997422080. Throughput: 0: 42586.8. Samples: 14997593200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:39:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 15:39:49,487][15401] Updated weights for policy 0, policy_version 915374 (0.0024) [2024-06-25 15:39:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14997635072. Throughput: 0: 42373.2. Samples: 14997714540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:39:53,395][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 15:39:54,046][15401] Updated weights for policy 0, policy_version 915384 (0.0036) [2024-06-25 15:39:57,164][15401] Updated weights for policy 0, policy_version 915394 (0.0041) [2024-06-25 15:39:58,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 14997848064. Throughput: 0: 42560.0. Samples: 14997972740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:39:58,392][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:40:01,424][15401] Updated weights for policy 0, policy_version 915404 (0.0030) [2024-06-25 15:40:03,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42764.6). Total num frames: 14998044672. Throughput: 0: 42515.0. Samples: 14998235180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:03,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 15:40:04,725][15401] Updated weights for policy 0, policy_version 915414 (0.0031) [2024-06-25 15:40:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 14998290432. Throughput: 0: 42511.9. Samples: 14998356440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:08,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 15:40:08,910][15401] Updated weights for policy 0, policy_version 915424 (0.0027) [2024-06-25 15:40:12,578][15401] Updated weights for policy 0, policy_version 915434 (0.0038) [2024-06-25 15:40:13,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 14998487040. Throughput: 0: 42532.0. Samples: 14998611460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 15:40:16,363][15401] Updated weights for policy 0, policy_version 915444 (0.0030) [2024-06-25 15:40:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 14998700032. Throughput: 0: 42609.3. Samples: 14998873180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 15:40:20,164][15401] Updated weights for policy 0, policy_version 915454 (0.0037) [2024-06-25 15:40:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 14998929408. Throughput: 0: 42731.5. Samples: 14998998300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 15:40:24,357][15401] Updated weights for policy 0, policy_version 915464 (0.0040) [2024-06-25 15:40:27,794][15401] Updated weights for policy 0, policy_version 915474 (0.0033) [2024-06-25 15:40:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 14999142400. Throughput: 0: 42709.4. Samples: 14999252040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 15:40:32,335][15401] Updated weights for policy 0, policy_version 915484 (0.0032) [2024-06-25 15:40:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 14999339008. Throughput: 0: 42637.3. Samples: 14999511880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 15:40:35,531][15401] Updated weights for policy 0, policy_version 915494 (0.0026) [2024-06-25 15:40:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 14999568384. Throughput: 0: 42689.0. Samples: 14999635540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 15:40:39,902][15401] Updated weights for policy 0, policy_version 915504 (0.0028) [2024-06-25 15:40:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 14999764992. Throughput: 0: 42652.1. Samples: 14999891980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:43,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 15:40:43,417][15401] Updated weights for policy 0, policy_version 915514 (0.0033) [2024-06-25 15:40:47,445][15401] Updated weights for policy 0, policy_version 915524 (0.0037) [2024-06-25 15:40:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 14999961600. Throughput: 0: 42494.8. Samples: 15000147340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 15:40:51,087][15401] Updated weights for policy 0, policy_version 915534 (0.0032) [2024-06-25 15:40:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15000190976. Throughput: 0: 42566.3. Samples: 15000271920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:40:54,216][15349] Signal inference workers to stop experience collection... (222000 times) [2024-06-25 15:40:54,229][15401] InferenceWorker_p0-w0: stopping experience collection (222000 times) [2024-06-25 15:40:54,280][15349] Signal inference workers to resume experience collection... (222000 times) [2024-06-25 15:40:54,280][15401] InferenceWorker_p0-w0: resuming experience collection (222000 times) [2024-06-25 15:40:54,948][15401] Updated weights for policy 0, policy_version 915544 (0.0036) [2024-06-25 15:40:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 15000403968. Throughput: 0: 42728.8. Samples: 15000534260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:40:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 15:40:59,074][15401] Updated weights for policy 0, policy_version 915554 (0.0026) [2024-06-25 15:41:02,953][15401] Updated weights for policy 0, policy_version 915564 (0.0030) [2024-06-25 15:41:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 15000616960. Throughput: 0: 42496.8. Samples: 15000785540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 15:41:06,542][15401] Updated weights for policy 0, policy_version 915574 (0.0027) [2024-06-25 15:41:08,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 15000829952. Throughput: 0: 42522.2. Samples: 15000911900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:08,401][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:41:10,744][15401] Updated weights for policy 0, policy_version 915584 (0.0033) [2024-06-25 15:41:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15001059328. Throughput: 0: 42612.8. Samples: 15001169620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:13,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 15:41:14,032][15401] Updated weights for policy 0, policy_version 915594 (0.0038) [2024-06-25 15:41:18,389][15132] Fps is (10 sec: 40970.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15001239552. Throughput: 0: 42706.3. Samples: 15001433660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 15:41:18,430][15401] Updated weights for policy 0, policy_version 915604 (0.0024) [2024-06-25 15:41:21,561][15401] Updated weights for policy 0, policy_version 915614 (0.0049) [2024-06-25 15:41:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15001468928. Throughput: 0: 42656.8. Samples: 15001555100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:23,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 15:41:26,063][15401] Updated weights for policy 0, policy_version 915624 (0.0046) [2024-06-25 15:41:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15001698304. Throughput: 0: 42696.4. Samples: 15001813320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 15:41:29,087][15401] Updated weights for policy 0, policy_version 915634 (0.0042) [2024-06-25 15:41:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15001894912. Throughput: 0: 42655.6. Samples: 15002066840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:33,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 15:41:33,659][15401] Updated weights for policy 0, policy_version 915644 (0.0033) [2024-06-25 15:41:37,103][15401] Updated weights for policy 0, policy_version 915654 (0.0033) [2024-06-25 15:41:38,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15002107904. Throughput: 0: 42588.3. Samples: 15002188400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-25 15:41:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 15:41:41,220][15401] Updated weights for policy 0, policy_version 915664 (0.0032) [2024-06-25 15:41:43,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15002320896. Throughput: 0: 42492.4. Samples: 15002446420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:41:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 15:41:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915670_15002337280.pth... [2024-06-25 15:41:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915042_14992048128.pth [2024-06-25 15:41:45,185][15401] Updated weights for policy 0, policy_version 915674 (0.0032) [2024-06-25 15:41:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15002533888. Throughput: 0: 42596.1. Samples: 15002702360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:41:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 15:41:49,114][15401] Updated weights for policy 0, policy_version 915684 (0.0042) [2024-06-25 15:41:52,691][15401] Updated weights for policy 0, policy_version 915694 (0.0042) [2024-06-25 15:41:53,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15002763264. Throughput: 0: 42625.4. Samples: 15002829940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:41:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 15:41:56,651][15401] Updated weights for policy 0, policy_version 915704 (0.0040) [2024-06-25 15:41:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15002959872. Throughput: 0: 42686.2. Samples: 15003090500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:41:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 15:42:00,201][15401] Updated weights for policy 0, policy_version 915714 (0.0046) [2024-06-25 15:42:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15003172864. Throughput: 0: 42453.3. Samples: 15003344060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 15:42:04,531][15401] Updated weights for policy 0, policy_version 915724 (0.0033) [2024-06-25 15:42:07,817][15401] Updated weights for policy 0, policy_version 915734 (0.0034) [2024-06-25 15:42:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 15003402240. Throughput: 0: 42684.0. Samples: 15003475880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 15:42:12,054][15401] Updated weights for policy 0, policy_version 915744 (0.0038) [2024-06-25 15:42:13,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 15003582464. Throughput: 0: 42631.0. Samples: 15003731820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:13,393][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 15:42:13,610][15349] Signal inference workers to stop experience collection... (222050 times) [2024-06-25 15:42:13,616][15349] Signal inference workers to resume experience collection... (222050 times) [2024-06-25 15:42:13,652][15401] InferenceWorker_p0-w0: stopping experience collection (222050 times) [2024-06-25 15:42:13,652][15401] InferenceWorker_p0-w0: resuming experience collection (222050 times) [2024-06-25 15:42:15,337][15401] Updated weights for policy 0, policy_version 915754 (0.0046) [2024-06-25 15:42:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15003828224. Throughput: 0: 42500.4. Samples: 15003979360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:18,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 15:42:19,905][15401] Updated weights for policy 0, policy_version 915764 (0.0044) [2024-06-25 15:42:23,062][15401] Updated weights for policy 0, policy_version 915774 (0.0041) [2024-06-25 15:42:23,390][15132] Fps is (10 sec: 47524.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15004057600. Throughput: 0: 42857.4. Samples: 15004116980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:23,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 15:42:27,440][15401] Updated weights for policy 0, policy_version 915784 (0.0028) [2024-06-25 15:42:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15004221440. Throughput: 0: 42657.0. Samples: 15004365980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:28,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 15:42:30,874][15401] Updated weights for policy 0, policy_version 915794 (0.0031) [2024-06-25 15:42:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15004467200. Throughput: 0: 42692.5. Samples: 15004623520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 15:42:34,838][15401] Updated weights for policy 0, policy_version 915804 (0.0035) [2024-06-25 15:42:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15004680192. Throughput: 0: 42838.7. Samples: 15004757680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:38,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 15:42:38,997][15401] Updated weights for policy 0, policy_version 915814 (0.0034) [2024-06-25 15:42:42,587][15401] Updated weights for policy 0, policy_version 915824 (0.0044) [2024-06-25 15:42:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15004876800. Throughput: 0: 42553.3. Samples: 15005005400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:43,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 15:42:46,746][15401] Updated weights for policy 0, policy_version 915834 (0.0028) [2024-06-25 15:42:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15005106176. Throughput: 0: 42585.7. Samples: 15005260420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 15:42:50,066][15401] Updated weights for policy 0, policy_version 915844 (0.0035) [2024-06-25 15:42:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15005319168. Throughput: 0: 42529.9. Samples: 15005389720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:53,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 15:42:54,236][15401] Updated weights for policy 0, policy_version 915854 (0.0039) [2024-06-25 15:42:57,747][15401] Updated weights for policy 0, policy_version 915864 (0.0038) [2024-06-25 15:42:58,391][15132] Fps is (10 sec: 42593.1, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 15005532160. Throughput: 0: 42473.1. Samples: 15005643060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:42:58,391][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 15:43:01,772][15401] Updated weights for policy 0, policy_version 915874 (0.0044) [2024-06-25 15:43:03,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15005745152. Throughput: 0: 42599.4. Samples: 15005896340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:43:05,440][15401] Updated weights for policy 0, policy_version 915884 (0.0020) [2024-06-25 15:43:08,390][15132] Fps is (10 sec: 40965.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15005941760. Throughput: 0: 42453.4. Samples: 15006027380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 15:43:09,633][15401] Updated weights for policy 0, policy_version 915894 (0.0040) [2024-06-25 15:43:13,219][15401] Updated weights for policy 0, policy_version 915904 (0.0027) [2024-06-25 15:43:13,389][15132] Fps is (10 sec: 42599.3, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 15006171136. Throughput: 0: 42558.8. Samples: 15006281120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 15:43:17,352][15401] Updated weights for policy 0, policy_version 915914 (0.0027) [2024-06-25 15:43:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15006384128. Throughput: 0: 42637.7. Samples: 15006542220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:18,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 15:43:20,818][15401] Updated weights for policy 0, policy_version 915924 (0.0024) [2024-06-25 15:43:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 15006580736. Throughput: 0: 42433.0. Samples: 15006667160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:23,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 15:43:24,974][15401] Updated weights for policy 0, policy_version 915934 (0.0027) [2024-06-25 15:43:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15006810112. Throughput: 0: 42524.5. Samples: 15006919000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:28,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 15:43:29,041][15401] Updated weights for policy 0, policy_version 915944 (0.0040) [2024-06-25 15:43:32,603][15401] Updated weights for policy 0, policy_version 915954 (0.0036) [2024-06-25 15:43:33,389][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15007023104. Throughput: 0: 42753.8. Samples: 15007184340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 15:43:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 15:43:35,134][15349] Signal inference workers to stop experience collection... (222100 times) [2024-06-25 15:43:35,135][15349] Signal inference workers to resume experience collection... (222100 times) [2024-06-25 15:43:35,150][15401] InferenceWorker_p0-w0: stopping experience collection (222100 times) [2024-06-25 15:43:35,150][15401] InferenceWorker_p0-w0: resuming experience collection (222100 times) [2024-06-25 15:43:36,805][15401] Updated weights for policy 0, policy_version 915964 (0.0035) [2024-06-25 15:43:38,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15007236096. Throughput: 0: 42750.3. Samples: 15007313480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:43:38,398][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 15:43:40,429][15401] Updated weights for policy 0, policy_version 915974 (0.0034) [2024-06-25 15:43:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15007465472. Throughput: 0: 42733.2. Samples: 15007566000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:43:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 15:43:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915983_15007465472.pth... [2024-06-25 15:43:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915358_14997225472.pth [2024-06-25 15:43:44,353][15401] Updated weights for policy 0, policy_version 915984 (0.0033) [2024-06-25 15:43:47,980][15401] Updated weights for policy 0, policy_version 915994 (0.0033) [2024-06-25 15:43:48,390][15132] Fps is (10 sec: 42596.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15007662080. Throughput: 0: 42847.5. Samples: 15007824480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:43:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 15:43:51,871][15401] Updated weights for policy 0, policy_version 916004 (0.0026) [2024-06-25 15:43:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15007875072. Throughput: 0: 42771.1. Samples: 15007952080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:43:53,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 15:43:55,463][15401] Updated weights for policy 0, policy_version 916014 (0.0030) [2024-06-25 15:43:58,390][15132] Fps is (10 sec: 44237.6, 60 sec: 42872.4, 300 sec: 42709.5). Total num frames: 15008104448. Throughput: 0: 42911.5. Samples: 15008212140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:43:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 15:43:59,486][15401] Updated weights for policy 0, policy_version 916024 (0.0035) [2024-06-25 15:44:03,035][15401] Updated weights for policy 0, policy_version 916034 (0.0037) [2024-06-25 15:44:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15008301056. Throughput: 0: 42709.3. Samples: 15008464140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 15:44:07,146][15401] Updated weights for policy 0, policy_version 916044 (0.0037) [2024-06-25 15:44:08,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 15008514048. Throughput: 0: 42837.8. Samples: 15008595140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:08,397][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 15:44:10,498][15401] Updated weights for policy 0, policy_version 916054 (0.0033) [2024-06-25 15:44:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15008743424. Throughput: 0: 43030.3. Samples: 15008855360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 15:44:14,701][15401] Updated weights for policy 0, policy_version 916064 (0.0027) [2024-06-25 15:44:18,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15008940032. Throughput: 0: 42796.5. Samples: 15009110180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:18,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 15:44:18,490][15401] Updated weights for policy 0, policy_version 916074 (0.0028) [2024-06-25 15:44:22,240][15401] Updated weights for policy 0, policy_version 916084 (0.0029) [2024-06-25 15:44:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15009153024. Throughput: 0: 42747.4. Samples: 15009237120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:44:25,957][15401] Updated weights for policy 0, policy_version 916094 (0.0026) [2024-06-25 15:44:28,394][15132] Fps is (10 sec: 44217.6, 60 sec: 42868.4, 300 sec: 42708.9). Total num frames: 15009382400. Throughput: 0: 42819.9. Samples: 15009493080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:28,394][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:44:29,803][15401] Updated weights for policy 0, policy_version 916104 (0.0028) [2024-06-25 15:44:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15009595392. Throughput: 0: 42840.1. Samples: 15009752280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:33,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 15:44:33,579][15401] Updated weights for policy 0, policy_version 916114 (0.0042) [2024-06-25 15:44:37,566][15401] Updated weights for policy 0, policy_version 916124 (0.0046) [2024-06-25 15:44:38,389][15132] Fps is (10 sec: 39338.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15009775616. Throughput: 0: 42772.9. Samples: 15009876860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:38,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 15:44:41,100][15401] Updated weights for policy 0, policy_version 916134 (0.0037) [2024-06-25 15:44:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15010021376. Throughput: 0: 42657.9. Samples: 15010131740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:43,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 15:44:45,183][15401] Updated weights for policy 0, policy_version 916144 (0.0045) [2024-06-25 15:44:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15010234368. Throughput: 0: 42806.3. Samples: 15010390420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:48,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 15:44:49,381][15401] Updated weights for policy 0, policy_version 916154 (0.0034) [2024-06-25 15:44:52,793][15401] Updated weights for policy 0, policy_version 916164 (0.0047) [2024-06-25 15:44:53,390][15132] Fps is (10 sec: 42596.7, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 15010447360. Throughput: 0: 42564.4. Samples: 15010510280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:53,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 15:44:57,289][15401] Updated weights for policy 0, policy_version 916174 (0.0038) [2024-06-25 15:44:57,599][15349] Signal inference workers to stop experience collection... (222150 times) [2024-06-25 15:44:57,600][15349] Signal inference workers to resume experience collection... (222150 times) [2024-06-25 15:44:57,626][15401] InferenceWorker_p0-w0: stopping experience collection (222150 times) [2024-06-25 15:44:57,626][15401] InferenceWorker_p0-w0: resuming experience collection (222150 times) [2024-06-25 15:44:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 15010660352. Throughput: 0: 42665.7. Samples: 15010775320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:44:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:45:00,740][15401] Updated weights for policy 0, policy_version 916184 (0.0032) [2024-06-25 15:45:03,390][15132] Fps is (10 sec: 39322.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15010840576. Throughput: 0: 42609.2. Samples: 15011027600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:03,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 15:45:04,950][15401] Updated weights for policy 0, policy_version 916194 (0.0042) [2024-06-25 15:45:08,325][15401] Updated weights for policy 0, policy_version 916204 (0.0033) [2024-06-25 15:45:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 15011086336. Throughput: 0: 42577.3. Samples: 15011153100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:08,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 15:45:12,576][15401] Updated weights for policy 0, policy_version 916214 (0.0040) [2024-06-25 15:45:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15011299328. Throughput: 0: 42685.9. Samples: 15011413760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 15:45:16,175][15401] Updated weights for policy 0, policy_version 916224 (0.0025) [2024-06-25 15:45:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15011495936. Throughput: 0: 42613.0. Samples: 15011669860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:18,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 15:45:20,195][15401] Updated weights for policy 0, policy_version 916234 (0.0034) [2024-06-25 15:45:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15011725312. Throughput: 0: 42484.1. Samples: 15011788640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:23,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 15:45:23,947][15401] Updated weights for policy 0, policy_version 916244 (0.0031) [2024-06-25 15:45:27,870][15401] Updated weights for policy 0, policy_version 916254 (0.0030) [2024-06-25 15:45:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42874.6, 300 sec: 42765.0). Total num frames: 15011954688. Throughput: 0: 42632.9. Samples: 15012050220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 15:45:28,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 15:45:31,637][15401] Updated weights for policy 0, policy_version 916264 (0.0030) [2024-06-25 15:45:33,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15012118528. Throughput: 0: 42619.1. Samples: 15012308280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 15:45:35,312][15401] Updated weights for policy 0, policy_version 916274 (0.0026) [2024-06-25 15:45:38,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15012347904. Throughput: 0: 42589.7. Samples: 15012426800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:38,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 15:45:39,341][15401] Updated weights for policy 0, policy_version 916284 (0.0031) [2024-06-25 15:45:42,900][15401] Updated weights for policy 0, policy_version 916294 (0.0039) [2024-06-25 15:45:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15012577280. Throughput: 0: 42421.1. Samples: 15012684260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 15:45:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916295_15012577280.pth... [2024-06-25 15:45:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915670_15002337280.pth [2024-06-25 15:45:47,314][15401] Updated weights for policy 0, policy_version 916304 (0.0034) [2024-06-25 15:45:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 15012757504. Throughput: 0: 42572.8. Samples: 15012943380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 15:45:50,487][15401] Updated weights for policy 0, policy_version 916314 (0.0046) [2024-06-25 15:45:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.6, 300 sec: 42654.0). Total num frames: 15012986880. Throughput: 0: 42445.8. Samples: 15013063160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:45:55,118][15401] Updated weights for policy 0, policy_version 916324 (0.0024) [2024-06-25 15:45:58,312][15401] Updated weights for policy 0, policy_version 916334 (0.0039) [2024-06-25 15:45:58,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15013216256. Throughput: 0: 42400.8. Samples: 15013321800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:45:58,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 15:46:02,841][15401] Updated weights for policy 0, policy_version 916344 (0.0037) [2024-06-25 15:46:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 15013396480. Throughput: 0: 42428.9. Samples: 15013579160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:03,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 15:46:05,897][15401] Updated weights for policy 0, policy_version 916354 (0.0040) [2024-06-25 15:46:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15013625856. Throughput: 0: 42444.4. Samples: 15013698640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 15:46:10,399][15401] Updated weights for policy 0, policy_version 916364 (0.0035) [2024-06-25 15:46:12,118][15349] Signal inference workers to stop experience collection... (222200 times) [2024-06-25 15:46:12,172][15401] InferenceWorker_p0-w0: stopping experience collection (222200 times) [2024-06-25 15:46:12,182][15349] Signal inference workers to resume experience collection... (222200 times) [2024-06-25 15:46:12,182][15401] InferenceWorker_p0-w0: resuming experience collection (222200 times) [2024-06-25 15:46:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15013838848. Throughput: 0: 42608.9. Samples: 15013967620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:13,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 15:46:13,583][15401] Updated weights for policy 0, policy_version 916374 (0.0028) [2024-06-25 15:46:17,922][15401] Updated weights for policy 0, policy_version 916384 (0.0032) [2024-06-25 15:46:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15014035456. Throughput: 0: 42573.4. Samples: 15014224080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 15:46:21,225][15401] Updated weights for policy 0, policy_version 916394 (0.0032) [2024-06-25 15:46:23,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15014264832. Throughput: 0: 42611.8. Samples: 15014344340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:23,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 15:46:26,113][15401] Updated weights for policy 0, policy_version 916404 (0.0034) [2024-06-25 15:46:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15014477824. Throughput: 0: 42715.4. Samples: 15014606460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 15:46:29,214][15401] Updated weights for policy 0, policy_version 916414 (0.0043) [2024-06-25 15:46:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15014674432. Throughput: 0: 42719.6. Samples: 15014865760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:33,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 15:46:33,683][15401] Updated weights for policy 0, policy_version 916424 (0.0034) [2024-06-25 15:46:36,829][15401] Updated weights for policy 0, policy_version 916434 (0.0031) [2024-06-25 15:46:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15014920192. Throughput: 0: 42855.8. Samples: 15014991680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:38,399][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 15:46:41,332][15401] Updated weights for policy 0, policy_version 916444 (0.0032) [2024-06-25 15:46:43,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42593.7, 300 sec: 42708.5). Total num frames: 15015133184. Throughput: 0: 42941.8. Samples: 15015254460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:43,397][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 15:46:44,580][15401] Updated weights for policy 0, policy_version 916454 (0.0030) [2024-06-25 15:46:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15015313408. Throughput: 0: 42805.8. Samples: 15015505420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 15:46:48,891][15401] Updated weights for policy 0, policy_version 916464 (0.0027) [2024-06-25 15:46:52,327][15401] Updated weights for policy 0, policy_version 916474 (0.0035) [2024-06-25 15:46:53,389][15132] Fps is (10 sec: 42626.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15015559168. Throughput: 0: 42884.0. Samples: 15015628420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 15:46:56,593][15401] Updated weights for policy 0, policy_version 916484 (0.0044) [2024-06-25 15:46:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15015755776. Throughput: 0: 42615.5. Samples: 15015885320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:46:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 15:47:00,215][15401] Updated weights for policy 0, policy_version 916494 (0.0031) [2024-06-25 15:47:03,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15015952384. Throughput: 0: 42643.4. Samples: 15016143040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:47:03,390][15132] Avg episode reward: [(0, '0.366')] [2024-06-25 15:47:04,235][15401] Updated weights for policy 0, policy_version 916504 (0.0036) [2024-06-25 15:47:07,664][15401] Updated weights for policy 0, policy_version 916514 (0.0033) [2024-06-25 15:47:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15016198144. Throughput: 0: 42611.3. Samples: 15016261840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:47:08,393][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 15:47:11,837][15401] Updated weights for policy 0, policy_version 916524 (0.0032) [2024-06-25 15:47:13,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15016411136. Throughput: 0: 42642.3. Samples: 15016525360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:47:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 15:47:15,346][15401] Updated weights for policy 0, policy_version 916534 (0.0034) [2024-06-25 15:47:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15016607744. Throughput: 0: 42701.9. Samples: 15016787340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:47:18,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 15:47:19,390][15401] Updated weights for policy 0, policy_version 916544 (0.0036) [2024-06-25 15:47:23,049][15401] Updated weights for policy 0, policy_version 916554 (0.0031) [2024-06-25 15:47:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15016837120. Throughput: 0: 42641.8. Samples: 15016910560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 15:47:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 15:47:23,640][15349] Signal inference workers to stop experience collection... (222250 times) [2024-06-25 15:47:23,688][15349] Signal inference workers to resume experience collection... (222250 times) [2024-06-25 15:47:23,698][15401] InferenceWorker_p0-w0: stopping experience collection (222250 times) [2024-06-25 15:47:23,733][15401] InferenceWorker_p0-w0: resuming experience collection (222250 times) [2024-06-25 15:47:26,969][15401] Updated weights for policy 0, policy_version 916564 (0.0043) [2024-06-25 15:47:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15017050112. Throughput: 0: 42448.8. Samples: 15017164380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:28,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 15:47:31,076][15401] Updated weights for policy 0, policy_version 916574 (0.0033) [2024-06-25 15:47:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15017246720. Throughput: 0: 42513.7. Samples: 15017418540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:33,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 15:47:34,534][15401] Updated weights for policy 0, policy_version 916584 (0.0040) [2024-06-25 15:47:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15017459712. Throughput: 0: 42839.9. Samples: 15017556220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:38,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 15:47:38,677][15401] Updated weights for policy 0, policy_version 916594 (0.0039) [2024-06-25 15:47:42,219][15401] Updated weights for policy 0, policy_version 916604 (0.0029) [2024-06-25 15:47:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42329.9, 300 sec: 42598.4). Total num frames: 15017672704. Throughput: 0: 42684.5. Samples: 15017806120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 15:47:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916606_15017672704.pth... [2024-06-25 15:47:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000915983_15007465472.pth [2024-06-25 15:47:46,104][15401] Updated weights for policy 0, policy_version 916614 (0.0024) [2024-06-25 15:47:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15017902080. Throughput: 0: 42539.6. Samples: 15018057320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:48,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 15:47:50,288][15401] Updated weights for policy 0, policy_version 916624 (0.0036) [2024-06-25 15:47:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.6). Total num frames: 15018098688. Throughput: 0: 42777.7. Samples: 15018186840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 15:47:53,649][15401] Updated weights for policy 0, policy_version 916634 (0.0029) [2024-06-25 15:47:57,898][15401] Updated weights for policy 0, policy_version 916644 (0.0034) [2024-06-25 15:47:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15018311680. Throughput: 0: 42715.6. Samples: 15018447560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:47:58,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 15:48:01,338][15401] Updated weights for policy 0, policy_version 916654 (0.0040) [2024-06-25 15:48:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 15018541056. Throughput: 0: 42352.0. Samples: 15018693180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:03,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 15:48:05,684][15401] Updated weights for policy 0, policy_version 916664 (0.0037) [2024-06-25 15:48:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15018737664. Throughput: 0: 42550.3. Samples: 15018825320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:08,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 15:48:09,065][15401] Updated weights for policy 0, policy_version 916674 (0.0030) [2024-06-25 15:48:13,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15018934272. Throughput: 0: 42653.7. Samples: 15019083800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:13,398][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 15:48:13,417][15401] Updated weights for policy 0, policy_version 916684 (0.0037) [2024-06-25 15:48:16,755][15401] Updated weights for policy 0, policy_version 916694 (0.0025) [2024-06-25 15:48:18,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 15019180032. Throughput: 0: 42629.8. Samples: 15019336980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:18,393][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 15:48:21,006][15401] Updated weights for policy 0, policy_version 916704 (0.0036) [2024-06-25 15:48:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15019393024. Throughput: 0: 42582.1. Samples: 15019472420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:23,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 15:48:24,219][15401] Updated weights for policy 0, policy_version 916714 (0.0039) [2024-06-25 15:48:28,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15019589632. Throughput: 0: 42655.2. Samples: 15019725600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 15:48:28,621][15401] Updated weights for policy 0, policy_version 916724 (0.0035) [2024-06-25 15:48:31,933][15401] Updated weights for policy 0, policy_version 916734 (0.0030) [2024-06-25 15:48:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15019835392. Throughput: 0: 42693.9. Samples: 15019978540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:33,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 15:48:36,443][15401] Updated weights for policy 0, policy_version 916744 (0.0051) [2024-06-25 15:48:38,138][15349] Signal inference workers to stop experience collection... (222300 times) [2024-06-25 15:48:38,138][15349] Signal inference workers to resume experience collection... (222300 times) [2024-06-25 15:48:38,181][15401] InferenceWorker_p0-w0: stopping experience collection (222300 times) [2024-06-25 15:48:38,182][15401] InferenceWorker_p0-w0: resuming experience collection (222300 times) [2024-06-25 15:48:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15020032000. Throughput: 0: 42732.1. Samples: 15020109780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:38,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 15:48:39,685][15401] Updated weights for policy 0, policy_version 916754 (0.0032) [2024-06-25 15:48:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15020228608. Throughput: 0: 42695.6. Samples: 15020368860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:43,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 15:48:43,969][15401] Updated weights for policy 0, policy_version 916764 (0.0045) [2024-06-25 15:48:47,359][15401] Updated weights for policy 0, policy_version 916774 (0.0021) [2024-06-25 15:48:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15020490752. Throughput: 0: 42863.0. Samples: 15020622020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 15:48:51,658][15401] Updated weights for policy 0, policy_version 916784 (0.0034) [2024-06-25 15:48:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15020670976. Throughput: 0: 42879.2. Samples: 15020754880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:53,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 15:48:54,945][15401] Updated weights for policy 0, policy_version 916794 (0.0032) [2024-06-25 15:48:58,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15020867584. Throughput: 0: 42652.9. Samples: 15021003180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:48:58,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 15:48:59,295][15401] Updated weights for policy 0, policy_version 916804 (0.0023) [2024-06-25 15:49:02,494][15401] Updated weights for policy 0, policy_version 916814 (0.0030) [2024-06-25 15:49:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 15021113344. Throughput: 0: 42748.0. Samples: 15021260540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:49:03,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 15:49:07,266][15401] Updated weights for policy 0, policy_version 916824 (0.0046) [2024-06-25 15:49:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15021309952. Throughput: 0: 42715.3. Samples: 15021394600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:49:08,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 15:49:10,118][15401] Updated weights for policy 0, policy_version 916834 (0.0030) [2024-06-25 15:49:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15021522944. Throughput: 0: 42687.5. Samples: 15021646540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:49:13,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 15:49:14,753][15401] Updated weights for policy 0, policy_version 916844 (0.0031) [2024-06-25 15:49:18,013][15401] Updated weights for policy 0, policy_version 916854 (0.0034) [2024-06-25 15:49:18,392][15132] Fps is (10 sec: 45863.9, 60 sec: 43144.5, 300 sec: 42764.7). Total num frames: 15021768704. Throughput: 0: 42828.4. Samples: 15021905920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 15:49:18,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 15:49:22,325][15401] Updated weights for policy 0, policy_version 916864 (0.0030) [2024-06-25 15:49:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42599.0). Total num frames: 15021948928. Throughput: 0: 42939.0. Samples: 15022042040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:23,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 15:49:25,496][15401] Updated weights for policy 0, policy_version 916874 (0.0029) [2024-06-25 15:49:28,390][15132] Fps is (10 sec: 39330.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15022161920. Throughput: 0: 42711.5. Samples: 15022290880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 15:49:30,181][15401] Updated weights for policy 0, policy_version 916884 (0.0041) [2024-06-25 15:49:33,000][15401] Updated weights for policy 0, policy_version 916894 (0.0035) [2024-06-25 15:49:33,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15022424064. Throughput: 0: 42727.5. Samples: 15022544760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 15:49:37,718][15401] Updated weights for policy 0, policy_version 916904 (0.0026) [2024-06-25 15:49:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15022587904. Throughput: 0: 42907.0. Samples: 15022685700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 15:49:40,548][15401] Updated weights for policy 0, policy_version 916914 (0.0021) [2024-06-25 15:49:43,396][15132] Fps is (10 sec: 39297.0, 60 sec: 43139.9, 300 sec: 42653.0). Total num frames: 15022817280. Throughput: 0: 42846.9. Samples: 15022931560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:43,396][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 15:49:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916920_15022817280.pth... [2024-06-25 15:49:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916295_15012577280.pth [2024-06-25 15:49:45,299][15401] Updated weights for policy 0, policy_version 916924 (0.0039) [2024-06-25 15:49:46,684][15349] Signal inference workers to stop experience collection... (222350 times) [2024-06-25 15:49:46,684][15349] Signal inference workers to resume experience collection... (222350 times) [2024-06-25 15:49:46,732][15401] InferenceWorker_p0-w0: stopping experience collection (222350 times) [2024-06-25 15:49:46,732][15401] InferenceWorker_p0-w0: resuming experience collection (222350 times) [2024-06-25 15:49:48,208][15401] Updated weights for policy 0, policy_version 916934 (0.0035) [2024-06-25 15:49:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15023046656. Throughput: 0: 42940.4. Samples: 15023192860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 15:49:52,978][15401] Updated weights for policy 0, policy_version 916944 (0.0037) [2024-06-25 15:49:53,390][15132] Fps is (10 sec: 42625.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15023243264. Throughput: 0: 42827.8. Samples: 15023321860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 15:49:55,986][15401] Updated weights for policy 0, policy_version 916954 (0.0028) [2024-06-25 15:49:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15023472640. Throughput: 0: 42875.5. Samples: 15023575940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:49:58,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 15:50:00,629][15401] Updated weights for policy 0, policy_version 916964 (0.0033) [2024-06-25 15:50:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15023685632. Throughput: 0: 42851.7. Samples: 15023834140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:03,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 15:50:03,714][15401] Updated weights for policy 0, policy_version 916974 (0.0039) [2024-06-25 15:50:08,188][15401] Updated weights for policy 0, policy_version 916984 (0.0031) [2024-06-25 15:50:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15023882240. Throughput: 0: 42686.7. Samples: 15023962940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:08,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 15:50:11,171][15401] Updated weights for policy 0, policy_version 916994 (0.0045) [2024-06-25 15:50:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15024128000. Throughput: 0: 42831.1. Samples: 15024218280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:50:15,834][15401] Updated weights for policy 0, policy_version 917004 (0.0037) [2024-06-25 15:50:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 15024308224. Throughput: 0: 42962.3. Samples: 15024478060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:18,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:50:19,196][15401] Updated weights for policy 0, policy_version 917014 (0.0027) [2024-06-25 15:50:23,390][15132] Fps is (10 sec: 36044.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15024488448. Throughput: 0: 42516.0. Samples: 15024598920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:23,399][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:50:23,761][15401] Updated weights for policy 0, policy_version 917024 (0.0041) [2024-06-25 15:50:26,613][15401] Updated weights for policy 0, policy_version 917034 (0.0029) [2024-06-25 15:50:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 15024783360. Throughput: 0: 42792.6. Samples: 15024856960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 15:50:31,241][15401] Updated weights for policy 0, policy_version 917044 (0.0026) [2024-06-25 15:50:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15024963584. Throughput: 0: 42962.3. Samples: 15025126160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:33,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 15:50:34,274][15401] Updated weights for policy 0, policy_version 917054 (0.0038) [2024-06-25 15:50:38,390][15132] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15025143808. Throughput: 0: 42786.7. Samples: 15025247260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 15:50:38,915][15401] Updated weights for policy 0, policy_version 917064 (0.0032) [2024-06-25 15:50:41,773][15401] Updated weights for policy 0, policy_version 917074 (0.0062) [2024-06-25 15:50:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 15025405952. Throughput: 0: 42955.2. Samples: 15025508920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 15:50:46,691][15349] Signal inference workers to stop experience collection... (222400 times) [2024-06-25 15:50:46,691][15349] Signal inference workers to resume experience collection... (222400 times) [2024-06-25 15:50:46,695][15401] Updated weights for policy 0, policy_version 917084 (0.0038) [2024-06-25 15:50:46,708][15401] InferenceWorker_p0-w0: stopping experience collection (222400 times) [2024-06-25 15:50:46,708][15401] InferenceWorker_p0-w0: resuming experience collection (222400 times) [2024-06-25 15:50:48,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15025602560. Throughput: 0: 42999.6. Samples: 15025769120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:48,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 15:50:49,265][15401] Updated weights for policy 0, policy_version 917094 (0.0043) [2024-06-25 15:50:53,392][15132] Fps is (10 sec: 37673.5, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 15025782784. Throughput: 0: 42882.5. Samples: 15025892760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:53,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 15:50:54,027][15401] Updated weights for policy 0, policy_version 917104 (0.0034) [2024-06-25 15:50:56,833][15401] Updated weights for policy 0, policy_version 917114 (0.0041) [2024-06-25 15:50:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15026028544. Throughput: 0: 42989.3. Samples: 15026152800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:50:58,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 15:51:01,567][15401] Updated weights for policy 0, policy_version 917124 (0.0037) [2024-06-25 15:51:03,390][15132] Fps is (10 sec: 47524.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15026257920. Throughput: 0: 43072.8. Samples: 15026416340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:51:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 15:51:04,605][15401] Updated weights for policy 0, policy_version 917134 (0.0031) [2024-06-25 15:51:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15026438144. Throughput: 0: 43195.5. Samples: 15026542720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:51:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 15:51:08,997][15401] Updated weights for policy 0, policy_version 917144 (0.0033) [2024-06-25 15:51:12,301][15401] Updated weights for policy 0, policy_version 917154 (0.0042) [2024-06-25 15:51:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15026683904. Throughput: 0: 43125.9. Samples: 15026797620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 15:51:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 15:51:16,758][15401] Updated weights for policy 0, policy_version 917164 (0.0030) [2024-06-25 15:51:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15026896896. Throughput: 0: 42882.2. Samples: 15027055860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 15:51:20,022][15401] Updated weights for policy 0, policy_version 917174 (0.0031) [2024-06-25 15:51:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15027077120. Throughput: 0: 42951.2. Samples: 15027180060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:23,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 15:51:24,378][15401] Updated weights for policy 0, policy_version 917184 (0.0033) [2024-06-25 15:51:27,819][15401] Updated weights for policy 0, policy_version 917194 (0.0035) [2024-06-25 15:51:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15027339264. Throughput: 0: 42815.9. Samples: 15027435640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:28,391][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 15:51:32,155][15401] Updated weights for policy 0, policy_version 917204 (0.0037) [2024-06-25 15:51:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15027503104. Throughput: 0: 42824.8. Samples: 15027696240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:33,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 15:51:35,500][15401] Updated weights for policy 0, policy_version 917214 (0.0027) [2024-06-25 15:51:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 43144.4, 300 sec: 42710.4). Total num frames: 15027732480. Throughput: 0: 42783.1. Samples: 15027817900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:38,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 15:51:39,651][15401] Updated weights for policy 0, policy_version 917224 (0.0023) [2024-06-25 15:51:42,935][15401] Updated weights for policy 0, policy_version 917234 (0.0036) [2024-06-25 15:51:43,392][15132] Fps is (10 sec: 47501.9, 60 sec: 42869.6, 300 sec: 42931.3). Total num frames: 15027978240. Throughput: 0: 42780.8. Samples: 15028078040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:43,401][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:51:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917235_15027978240.pth... [2024-06-25 15:51:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916606_15017672704.pth [2024-06-25 15:51:47,459][15401] Updated weights for policy 0, policy_version 917244 (0.0038) [2024-06-25 15:51:48,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15028158464. Throughput: 0: 42840.1. Samples: 15028344140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:48,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 15:51:50,495][15401] Updated weights for policy 0, policy_version 917254 (0.0038) [2024-06-25 15:51:53,389][15132] Fps is (10 sec: 39331.3, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 15028371456. Throughput: 0: 42688.6. Samples: 15028463700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 15:51:55,281][15401] Updated weights for policy 0, policy_version 917264 (0.0027) [2024-06-25 15:51:58,057][15401] Updated weights for policy 0, policy_version 917274 (0.0042) [2024-06-25 15:51:58,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 15028633600. Throughput: 0: 42918.7. Samples: 15028728960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:51:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 15:52:02,919][15401] Updated weights for policy 0, policy_version 917284 (0.0034) [2024-06-25 15:52:03,118][15349] Signal inference workers to stop experience collection... (222450 times) [2024-06-25 15:52:03,152][15401] InferenceWorker_p0-w0: stopping experience collection (222450 times) [2024-06-25 15:52:03,235][15349] Signal inference workers to resume experience collection... (222450 times) [2024-06-25 15:52:03,235][15401] InferenceWorker_p0-w0: resuming experience collection (222450 times) [2024-06-25 15:52:03,390][15132] Fps is (10 sec: 45872.5, 60 sec: 42871.1, 300 sec: 42820.5). Total num frames: 15028830208. Throughput: 0: 42910.1. Samples: 15028986840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:03,391][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 15:52:05,928][15401] Updated weights for policy 0, policy_version 917294 (0.0029) [2024-06-25 15:52:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15029026816. Throughput: 0: 42923.1. Samples: 15029111600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:08,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 15:52:10,427][15401] Updated weights for policy 0, policy_version 917304 (0.0036) [2024-06-25 15:52:13,318][15401] Updated weights for policy 0, policy_version 917314 (0.0026) [2024-06-25 15:52:13,392][15132] Fps is (10 sec: 44228.7, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 15029272576. Throughput: 0: 43048.0. Samples: 15029372900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:13,393][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 15:52:17,779][15401] Updated weights for policy 0, policy_version 917324 (0.0024) [2024-06-25 15:52:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15029469184. Throughput: 0: 43036.4. Samples: 15029632880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:18,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 15:52:21,445][15401] Updated weights for policy 0, policy_version 917334 (0.0033) [2024-06-25 15:52:23,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15029682176. Throughput: 0: 43136.2. Samples: 15029759020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 15:52:25,389][15401] Updated weights for policy 0, policy_version 917344 (0.0036) [2024-06-25 15:52:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15029911552. Throughput: 0: 43045.5. Samples: 15030014980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 15:52:28,918][15401] Updated weights for policy 0, policy_version 917354 (0.0038) [2024-06-25 15:52:32,966][15401] Updated weights for policy 0, policy_version 917364 (0.0040) [2024-06-25 15:52:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15030108160. Throughput: 0: 42814.7. Samples: 15030270800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:33,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 15:52:36,761][15401] Updated weights for policy 0, policy_version 917374 (0.0037) [2024-06-25 15:52:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 15030321152. Throughput: 0: 43001.8. Samples: 15030398780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:38,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 15:52:40,318][15401] Updated weights for policy 0, policy_version 917384 (0.0039) [2024-06-25 15:52:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 15030534144. Throughput: 0: 42924.0. Samples: 15030660540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 15:52:44,194][15401] Updated weights for policy 0, policy_version 917394 (0.0039) [2024-06-25 15:52:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15030730752. Throughput: 0: 42822.3. Samples: 15030913820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 15:52:48,425][15401] Updated weights for policy 0, policy_version 917404 (0.0037) [2024-06-25 15:52:51,917][15401] Updated weights for policy 0, policy_version 917414 (0.0033) [2024-06-25 15:52:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15030976512. Throughput: 0: 42851.0. Samples: 15031039900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:53,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 15:52:55,911][15401] Updated weights for policy 0, policy_version 917424 (0.0031) [2024-06-25 15:52:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15031173120. Throughput: 0: 42901.9. Samples: 15031303380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:52:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 15:52:59,653][15401] Updated weights for policy 0, policy_version 917434 (0.0027) [2024-06-25 15:53:03,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42597.1, 300 sec: 42875.7). Total num frames: 15031386112. Throughput: 0: 42804.8. Samples: 15031559200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:53:03,393][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 15:53:03,522][15401] Updated weights for policy 0, policy_version 917444 (0.0027) [2024-06-25 15:53:07,420][15401] Updated weights for policy 0, policy_version 917454 (0.0033) [2024-06-25 15:53:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15031599104. Throughput: 0: 42763.1. Samples: 15031683360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 15:53:08,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 15:53:11,003][15401] Updated weights for policy 0, policy_version 917464 (0.0035) [2024-06-25 15:53:13,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42054.0, 300 sec: 42765.4). Total num frames: 15031795712. Throughput: 0: 42874.3. Samples: 15031944320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 15:53:14,877][15401] Updated weights for policy 0, policy_version 917474 (0.0038) [2024-06-25 15:53:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15032025088. Throughput: 0: 42948.8. Samples: 15032203500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:18,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 15:53:18,603][15401] Updated weights for policy 0, policy_version 917484 (0.0050) [2024-06-25 15:53:22,459][15401] Updated weights for policy 0, policy_version 917494 (0.0028) [2024-06-25 15:53:23,395][15132] Fps is (10 sec: 45851.5, 60 sec: 42867.8, 300 sec: 42930.9). Total num frames: 15032254464. Throughput: 0: 42870.6. Samples: 15032328180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:23,395][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 15:53:26,516][15401] Updated weights for policy 0, policy_version 917504 (0.0024) [2024-06-25 15:53:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15032418304. Throughput: 0: 42566.7. Samples: 15032576040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:28,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 15:53:30,072][15401] Updated weights for policy 0, policy_version 917514 (0.0039) [2024-06-25 15:53:31,130][15349] Signal inference workers to stop experience collection... (222500 times) [2024-06-25 15:53:31,130][15349] Signal inference workers to resume experience collection... (222500 times) [2024-06-25 15:53:31,164][15401] InferenceWorker_p0-w0: stopping experience collection (222500 times) [2024-06-25 15:53:31,164][15401] InferenceWorker_p0-w0: resuming experience collection (222500 times) [2024-06-25 15:53:33,389][15132] Fps is (10 sec: 40981.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15032664064. Throughput: 0: 42737.4. Samples: 15032837000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:33,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 15:53:34,070][15401] Updated weights for policy 0, policy_version 917524 (0.0031) [2024-06-25 15:53:37,979][15401] Updated weights for policy 0, policy_version 917534 (0.0036) [2024-06-25 15:53:38,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15032877056. Throughput: 0: 42842.3. Samples: 15032967800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 15:53:41,811][15401] Updated weights for policy 0, policy_version 917544 (0.0029) [2024-06-25 15:53:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15033073664. Throughput: 0: 42607.5. Samples: 15033220720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:43,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 15:53:43,427][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917547_15033090048.pth... [2024-06-25 15:53:43,498][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000916920_15022817280.pth [2024-06-25 15:53:45,950][15401] Updated weights for policy 0, policy_version 917554 (0.0039) [2024-06-25 15:53:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15033303040. Throughput: 0: 42525.0. Samples: 15033472720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 15:53:49,383][15401] Updated weights for policy 0, policy_version 917564 (0.0033) [2024-06-25 15:53:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15033516032. Throughput: 0: 42639.4. Samples: 15033602140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 15:53:53,764][15401] Updated weights for policy 0, policy_version 917574 (0.0037) [2024-06-25 15:53:56,960][15401] Updated weights for policy 0, policy_version 917584 (0.0051) [2024-06-25 15:53:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15033729024. Throughput: 0: 42281.3. Samples: 15033846980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:53:58,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:54:01,368][15401] Updated weights for policy 0, policy_version 917594 (0.0041) [2024-06-25 15:54:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 15033942016. Throughput: 0: 42402.3. Samples: 15034111600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 15:54:04,434][15401] Updated weights for policy 0, policy_version 917604 (0.0027) [2024-06-25 15:54:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15034155008. Throughput: 0: 42585.7. Samples: 15034244320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:08,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 15:54:08,904][15401] Updated weights for policy 0, policy_version 917614 (0.0028) [2024-06-25 15:54:11,954][15401] Updated weights for policy 0, policy_version 917624 (0.0030) [2024-06-25 15:54:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 15034368000. Throughput: 0: 42595.0. Samples: 15034492820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 15:54:16,444][15401] Updated weights for policy 0, policy_version 917634 (0.0038) [2024-06-25 15:54:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15034580992. Throughput: 0: 42672.4. Samples: 15034757260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 15:54:20,172][15401] Updated weights for policy 0, policy_version 917644 (0.0031) [2024-06-25 15:54:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42328.9, 300 sec: 42820.6). Total num frames: 15034793984. Throughput: 0: 42612.8. Samples: 15034885380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 15:54:24,183][15401] Updated weights for policy 0, policy_version 917654 (0.0029) [2024-06-25 15:54:27,722][15401] Updated weights for policy 0, policy_version 917664 (0.0031) [2024-06-25 15:54:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 43415.9, 300 sec: 42709.1). Total num frames: 15035023360. Throughput: 0: 42628.9. Samples: 15035139120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:28,392][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 15:54:32,220][15401] Updated weights for policy 0, policy_version 917674 (0.0029) [2024-06-25 15:54:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15035219968. Throughput: 0: 42803.0. Samples: 15035398860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 15:54:35,332][15401] Updated weights for policy 0, policy_version 917684 (0.0033) [2024-06-25 15:54:38,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 15035449344. Throughput: 0: 42753.4. Samples: 15035526040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:38,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 15:54:39,692][15401] Updated weights for policy 0, policy_version 917694 (0.0035) [2024-06-25 15:54:42,901][15401] Updated weights for policy 0, policy_version 917704 (0.0029) [2024-06-25 15:54:43,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15035678720. Throughput: 0: 43124.8. Samples: 15035787600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 15:54:47,521][15401] Updated weights for policy 0, policy_version 917714 (0.0032) [2024-06-25 15:54:48,390][15132] Fps is (10 sec: 40958.0, 60 sec: 42598.0, 300 sec: 42765.0). Total num frames: 15035858944. Throughput: 0: 43003.5. Samples: 15036046780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:48,391][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 15:54:50,494][15401] Updated weights for policy 0, policy_version 917724 (0.0034) [2024-06-25 15:54:53,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 15036104704. Throughput: 0: 42715.9. Samples: 15036166640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:53,393][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 15:54:54,969][15401] Updated weights for policy 0, policy_version 917734 (0.0033) [2024-06-25 15:54:58,037][15401] Updated weights for policy 0, policy_version 917744 (0.0028) [2024-06-25 15:54:58,390][15132] Fps is (10 sec: 47515.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15036334080. Throughput: 0: 43173.8. Samples: 15036435640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:54:58,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 15:55:02,028][15349] Signal inference workers to stop experience collection... (222550 times) [2024-06-25 15:55:02,078][15401] InferenceWorker_p0-w0: stopping experience collection (222550 times) [2024-06-25 15:55:02,086][15349] Signal inference workers to resume experience collection... (222550 times) [2024-06-25 15:55:02,093][15401] InferenceWorker_p0-w0: resuming experience collection (222550 times) [2024-06-25 15:55:02,409][15401] Updated weights for policy 0, policy_version 917754 (0.0052) [2024-06-25 15:55:03,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15036514304. Throughput: 0: 43136.5. Samples: 15036698400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-25 15:55:03,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 15:55:05,743][15401] Updated weights for policy 0, policy_version 917764 (0.0022) [2024-06-25 15:55:08,390][15132] Fps is (10 sec: 42596.3, 60 sec: 43417.2, 300 sec: 42820.5). Total num frames: 15036760064. Throughput: 0: 42814.2. Samples: 15036812040. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:08,391][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 15:55:10,771][15401] Updated weights for policy 0, policy_version 917774 (0.0040) [2024-06-25 15:55:13,293][15401] Updated weights for policy 0, policy_version 917784 (0.0044) [2024-06-25 15:55:13,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 15036973056. Throughput: 0: 43085.0. Samples: 15037077840. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 15:55:18,253][15401] Updated weights for policy 0, policy_version 917794 (0.0023) [2024-06-25 15:55:18,389][15132] Fps is (10 sec: 37685.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15037136896. Throughput: 0: 43122.8. Samples: 15037339380. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:18,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 15:55:21,056][15401] Updated weights for policy 0, policy_version 917804 (0.0036) [2024-06-25 15:55:23,395][15132] Fps is (10 sec: 42575.6, 60 sec: 43413.8, 300 sec: 42764.3). Total num frames: 15037399040. Throughput: 0: 42828.4. Samples: 15037453540. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:23,395][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 15:55:25,720][15401] Updated weights for policy 0, policy_version 917814 (0.0029) [2024-06-25 15:55:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 15037595648. Throughput: 0: 42936.8. Samples: 15037719760. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:28,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 15:55:28,861][15401] Updated weights for policy 0, policy_version 917824 (0.0038) [2024-06-25 15:55:33,389][15132] Fps is (10 sec: 37703.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15037775872. Throughput: 0: 42936.1. Samples: 15037978880. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 15:55:33,508][15401] Updated weights for policy 0, policy_version 917834 (0.0044) [2024-06-25 15:55:36,480][15401] Updated weights for policy 0, policy_version 917844 (0.0029) [2024-06-25 15:55:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15038054400. Throughput: 0: 43029.4. Samples: 15038102860. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 15:55:40,918][15401] Updated weights for policy 0, policy_version 917854 (0.0029) [2024-06-25 15:55:43,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15038234624. Throughput: 0: 42957.7. Samples: 15038368740. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:43,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 15:55:43,478][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917862_15038251008.pth... [2024-06-25 15:55:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917235_15027978240.pth [2024-06-25 15:55:44,158][15401] Updated weights for policy 0, policy_version 917864 (0.0042) [2024-06-25 15:55:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.8, 300 sec: 42876.4). Total num frames: 15038431232. Throughput: 0: 42749.2. Samples: 15038622120. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:48,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 15:55:49,009][15401] Updated weights for policy 0, policy_version 917874 (0.0033) [2024-06-25 15:55:51,652][15401] Updated weights for policy 0, policy_version 917884 (0.0033) [2024-06-25 15:55:53,390][15132] Fps is (10 sec: 47514.3, 60 sec: 43419.4, 300 sec: 42987.2). Total num frames: 15038709760. Throughput: 0: 42958.7. Samples: 15038745160. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:53,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 15:55:56,523][15401] Updated weights for policy 0, policy_version 917894 (0.0031) [2024-06-25 15:55:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15038873600. Throughput: 0: 42824.3. Samples: 15039004940. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:55:58,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 15:55:59,673][15401] Updated weights for policy 0, policy_version 917904 (0.0035) [2024-06-25 15:56:03,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42869.7, 300 sec: 42875.8). Total num frames: 15039086592. Throughput: 0: 42573.2. Samples: 15039255280. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:03,393][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 15:56:04,157][15401] Updated weights for policy 0, policy_version 917914 (0.0025) [2024-06-25 15:56:07,204][15349] Signal inference workers to stop experience collection... (222600 times) [2024-06-25 15:56:07,236][15401] InferenceWorker_p0-w0: stopping experience collection (222600 times) [2024-06-25 15:56:07,251][15349] Signal inference workers to resume experience collection... (222600 times) [2024-06-25 15:56:07,257][15401] InferenceWorker_p0-w0: resuming experience collection (222600 times) [2024-06-25 15:56:07,410][15401] Updated weights for policy 0, policy_version 917924 (0.0039) [2024-06-25 15:56:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.8, 300 sec: 42931.6). Total num frames: 15039348736. Throughput: 0: 42916.5. Samples: 15039384560. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 15:56:11,832][15401] Updated weights for policy 0, policy_version 917934 (0.0029) [2024-06-25 15:56:13,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 15039496192. Throughput: 0: 42683.2. Samples: 15039640500. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 15:56:15,082][15401] Updated weights for policy 0, policy_version 917944 (0.0035) [2024-06-25 15:56:18,390][15132] Fps is (10 sec: 37683.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15039725568. Throughput: 0: 42516.4. Samples: 15039892120. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 15:56:19,246][15401] Updated weights for policy 0, policy_version 917954 (0.0033) [2024-06-25 15:56:22,967][15401] Updated weights for policy 0, policy_version 917964 (0.0034) [2024-06-25 15:56:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42329.1, 300 sec: 42709.5). Total num frames: 15039938560. Throughput: 0: 42521.9. Samples: 15040016340. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:23,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 15:56:26,658][15401] Updated weights for policy 0, policy_version 917974 (0.0026) [2024-06-25 15:56:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15040151552. Throughput: 0: 42319.2. Samples: 15040273100. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 15:56:30,528][15401] Updated weights for policy 0, policy_version 917984 (0.0029) [2024-06-25 15:56:33,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 15040364544. Throughput: 0: 42497.4. Samples: 15040534600. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:33,393][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 15:56:34,217][15401] Updated weights for policy 0, policy_version 917994 (0.0035) [2024-06-25 15:56:38,063][15401] Updated weights for policy 0, policy_version 918004 (0.0039) [2024-06-25 15:56:38,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 15040577536. Throughput: 0: 42537.4. Samples: 15040659340. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 15:56:41,914][15401] Updated weights for policy 0, policy_version 918014 (0.0028) [2024-06-25 15:56:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15040790528. Throughput: 0: 42383.2. Samples: 15040912180. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:43,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 15:56:45,691][15401] Updated weights for policy 0, policy_version 918024 (0.0048) [2024-06-25 15:56:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15040987136. Throughput: 0: 42309.5. Samples: 15041159100. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:48,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 15:56:49,882][15401] Updated weights for policy 0, policy_version 918034 (0.0047) [2024-06-25 15:56:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 15041216512. Throughput: 0: 42412.9. Samples: 15041293140. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:53,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 15:56:54,116][15401] Updated weights for policy 0, policy_version 918044 (0.0025) [2024-06-25 15:56:57,551][15401] Updated weights for policy 0, policy_version 918054 (0.0036) [2024-06-25 15:56:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15041413120. Throughput: 0: 42365.7. Samples: 15041546960. Policy #0 lag: (min: 2.0, avg: 11.1, max: 22.0) [2024-06-25 15:56:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 15:57:01,698][15401] Updated weights for policy 0, policy_version 918064 (0.0043) [2024-06-25 15:57:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 15041626112. Throughput: 0: 42413.8. Samples: 15041800740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:03,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 15:57:05,355][15401] Updated weights for policy 0, policy_version 918074 (0.0037) [2024-06-25 15:57:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42598.7). Total num frames: 15041839104. Throughput: 0: 42549.7. Samples: 15041931080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 15:57:09,342][15401] Updated weights for policy 0, policy_version 918084 (0.0039) [2024-06-25 15:57:12,926][15401] Updated weights for policy 0, policy_version 918094 (0.0025) [2024-06-25 15:57:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15042068480. Throughput: 0: 42592.5. Samples: 15042189760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 15:57:17,186][15401] Updated weights for policy 0, policy_version 918104 (0.0030) [2024-06-25 15:57:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15042281472. Throughput: 0: 42516.5. Samples: 15042447740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 15:57:20,640][15401] Updated weights for policy 0, policy_version 918114 (0.0036) [2024-06-25 15:57:22,240][15349] Signal inference workers to stop experience collection... (222650 times) [2024-06-25 15:57:22,273][15401] InferenceWorker_p0-w0: stopping experience collection (222650 times) [2024-06-25 15:57:22,302][15349] Signal inference workers to resume experience collection... (222650 times) [2024-06-25 15:57:22,302][15401] InferenceWorker_p0-w0: resuming experience collection (222650 times) [2024-06-25 15:57:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15042494464. Throughput: 0: 42458.1. Samples: 15042569960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:23,395][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 15:57:24,736][15401] Updated weights for policy 0, policy_version 918124 (0.0030) [2024-06-25 15:57:28,257][15401] Updated weights for policy 0, policy_version 918134 (0.0028) [2024-06-25 15:57:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15042707456. Throughput: 0: 42667.2. Samples: 15042832200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:28,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 15:57:32,351][15401] Updated weights for policy 0, policy_version 918144 (0.0029) [2024-06-25 15:57:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15042920448. Throughput: 0: 42884.4. Samples: 15043088900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 15:57:36,014][15401] Updated weights for policy 0, policy_version 918154 (0.0035) [2024-06-25 15:57:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15043149824. Throughput: 0: 42564.6. Samples: 15043208540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 15:57:39,966][15401] Updated weights for policy 0, policy_version 918164 (0.0027) [2024-06-25 15:57:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15043346432. Throughput: 0: 42833.3. Samples: 15043474560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:43,393][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 15:57:43,522][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918174_15043362816.pth... [2024-06-25 15:57:43,527][15401] Updated weights for policy 0, policy_version 918174 (0.0038) [2024-06-25 15:57:43,566][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917547_15033090048.pth [2024-06-25 15:57:47,729][15401] Updated weights for policy 0, policy_version 918184 (0.0026) [2024-06-25 15:57:48,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15043543040. Throughput: 0: 42796.3. Samples: 15043726580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:48,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 15:57:51,413][15401] Updated weights for policy 0, policy_version 918194 (0.0030) [2024-06-25 15:57:53,390][15132] Fps is (10 sec: 45886.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 15043805184. Throughput: 0: 42665.4. Samples: 15043851020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:53,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 15:57:55,154][15401] Updated weights for policy 0, policy_version 918204 (0.0029) [2024-06-25 15:57:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15043969024. Throughput: 0: 42741.2. Samples: 15044113120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:57:58,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 15:57:58,883][15401] Updated weights for policy 0, policy_version 918214 (0.0030) [2024-06-25 15:58:02,569][15401] Updated weights for policy 0, policy_version 918224 (0.0034) [2024-06-25 15:58:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15044198400. Throughput: 0: 42766.6. Samples: 15044372240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:03,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-25 15:58:06,471][15401] Updated weights for policy 0, policy_version 918234 (0.0032) [2024-06-25 15:58:08,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15044444160. Throughput: 0: 42946.7. Samples: 15044502560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:08,390][15132] Avg episode reward: [(0, '0.197')] [2024-06-25 15:58:10,566][15401] Updated weights for policy 0, policy_version 918244 (0.0039) [2024-06-25 15:58:13,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 15044640768. Throughput: 0: 42902.9. Samples: 15044762940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:13,392][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 15:58:13,862][15401] Updated weights for policy 0, policy_version 918254 (0.0034) [2024-06-25 15:58:18,067][15401] Updated weights for policy 0, policy_version 918264 (0.0036) [2024-06-25 15:58:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 15044837376. Throughput: 0: 42779.0. Samples: 15045013960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 15:58:21,525][15401] Updated weights for policy 0, policy_version 918274 (0.0037) [2024-06-25 15:58:23,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15045066752. Throughput: 0: 42932.3. Samples: 15045140500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 15:58:26,177][15401] Updated weights for policy 0, policy_version 918284 (0.0036) [2024-06-25 15:58:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15045263360. Throughput: 0: 42750.7. Samples: 15045398240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:28,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 15:58:29,554][15401] Updated weights for policy 0, policy_version 918294 (0.0032) [2024-06-25 15:58:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15045476352. Throughput: 0: 42588.9. Samples: 15045643080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:33,392][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 15:58:33,806][15401] Updated weights for policy 0, policy_version 918304 (0.0035) [2024-06-25 15:58:37,430][15401] Updated weights for policy 0, policy_version 918314 (0.0042) [2024-06-25 15:58:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15045689344. Throughput: 0: 42688.8. Samples: 15045772020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:38,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 15:58:41,502][15401] Updated weights for policy 0, policy_version 918324 (0.0031) [2024-06-25 15:58:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15045902336. Throughput: 0: 42535.6. Samples: 15046027220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:43,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 15:58:45,222][15401] Updated weights for policy 0, policy_version 918334 (0.0028) [2024-06-25 15:58:46,248][15349] Signal inference workers to stop experience collection... (222700 times) [2024-06-25 15:58:46,289][15401] InferenceWorker_p0-w0: stopping experience collection (222700 times) [2024-06-25 15:58:46,298][15349] Signal inference workers to resume experience collection... (222700 times) [2024-06-25 15:58:46,306][15401] InferenceWorker_p0-w0: resuming experience collection (222700 times) [2024-06-25 15:58:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15046131712. Throughput: 0: 42348.4. Samples: 15046277920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:48,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 15:58:49,131][15401] Updated weights for policy 0, policy_version 918344 (0.0038) [2024-06-25 15:58:53,115][15401] Updated weights for policy 0, policy_version 918354 (0.0051) [2024-06-25 15:58:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 15046328320. Throughput: 0: 42278.6. Samples: 15046405100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 15:58:53,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 15:58:56,776][15401] Updated weights for policy 0, policy_version 918364 (0.0028) [2024-06-25 15:58:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15046541312. Throughput: 0: 42186.7. Samples: 15046661240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:58:58,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 15:59:00,727][15401] Updated weights for policy 0, policy_version 918374 (0.0029) [2024-06-25 15:59:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15046754304. Throughput: 0: 42265.0. Samples: 15046915880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:03,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 15:59:04,334][15401] Updated weights for policy 0, policy_version 918384 (0.0026) [2024-06-25 15:59:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 15046950912. Throughput: 0: 42216.9. Samples: 15047040260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 15:59:08,615][15401] Updated weights for policy 0, policy_version 918394 (0.0043) [2024-06-25 15:59:12,500][15401] Updated weights for policy 0, policy_version 918404 (0.0040) [2024-06-25 15:59:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 15047180288. Throughput: 0: 42253.5. Samples: 15047299640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:13,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 15:59:16,170][15401] Updated weights for policy 0, policy_version 918414 (0.0034) [2024-06-25 15:59:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15047393280. Throughput: 0: 42404.9. Samples: 15047551300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 15:59:19,969][15401] Updated weights for policy 0, policy_version 918424 (0.0022) [2024-06-25 15:59:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 15047606272. Throughput: 0: 42353.5. Samples: 15047677920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:23,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 15:59:23,767][15401] Updated weights for policy 0, policy_version 918434 (0.0031) [2024-06-25 15:59:27,809][15401] Updated weights for policy 0, policy_version 918444 (0.0042) [2024-06-25 15:59:28,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 15047802880. Throughput: 0: 42443.3. Samples: 15047937160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 15:59:31,457][15401] Updated weights for policy 0, policy_version 918454 (0.0033) [2024-06-25 15:59:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15048015872. Throughput: 0: 42460.9. Samples: 15048188660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:33,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 15:59:35,618][15401] Updated weights for policy 0, policy_version 918464 (0.0036) [2024-06-25 15:59:38,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 15048228864. Throughput: 0: 42457.3. Samples: 15048315680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 15:59:39,000][15401] Updated weights for policy 0, policy_version 918474 (0.0037) [2024-06-25 15:59:43,327][15401] Updated weights for policy 0, policy_version 918484 (0.0034) [2024-06-25 15:59:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15048441856. Throughput: 0: 42481.3. Samples: 15048572900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:43,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 15:59:43,468][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918485_15048458240.pth... [2024-06-25 15:59:43,532][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000917862_15038251008.pth [2024-06-25 15:59:47,184][15401] Updated weights for policy 0, policy_version 918494 (0.0037) [2024-06-25 15:59:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 15048654848. Throughput: 0: 42392.1. Samples: 15048823520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 15:59:51,007][15401] Updated weights for policy 0, policy_version 918504 (0.0033) [2024-06-25 15:59:53,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15048867840. Throughput: 0: 42420.7. Samples: 15048949200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:53,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 15:59:54,846][15401] Updated weights for policy 0, policy_version 918514 (0.0034) [2024-06-25 15:59:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15049080832. Throughput: 0: 42346.7. Samples: 15049205240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 15:59:58,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 15:59:58,832][15401] Updated weights for policy 0, policy_version 918524 (0.0038) [2024-06-25 16:00:02,443][15401] Updated weights for policy 0, policy_version 918534 (0.0035) [2024-06-25 16:00:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42431.9). Total num frames: 15049277440. Throughput: 0: 42338.3. Samples: 15049456520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:03,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 16:00:06,552][15401] Updated weights for policy 0, policy_version 918544 (0.0028) [2024-06-25 16:00:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15049523200. Throughput: 0: 42420.7. Samples: 15049586860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:08,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:00:10,053][15401] Updated weights for policy 0, policy_version 918554 (0.0041) [2024-06-25 16:00:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 15049703424. Throughput: 0: 42346.9. Samples: 15049842780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 16:00:14,206][15401] Updated weights for policy 0, policy_version 918564 (0.0032) [2024-06-25 16:00:17,907][15401] Updated weights for policy 0, policy_version 918574 (0.0041) [2024-06-25 16:00:18,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42488.1). Total num frames: 15049932800. Throughput: 0: 42414.8. Samples: 15050097320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:18,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 16:00:21,871][15401] Updated weights for policy 0, policy_version 918584 (0.0029) [2024-06-25 16:00:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15050145792. Throughput: 0: 42443.5. Samples: 15050225640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:23,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 16:00:25,647][15401] Updated weights for policy 0, policy_version 918594 (0.0041) [2024-06-25 16:00:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15050342400. Throughput: 0: 42392.5. Samples: 15050480560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:28,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 16:00:29,367][15401] Updated weights for policy 0, policy_version 918604 (0.0032) [2024-06-25 16:00:33,334][15401] Updated weights for policy 0, policy_version 918614 (0.0032) [2024-06-25 16:00:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 15050571776. Throughput: 0: 42499.4. Samples: 15050736000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:33,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 16:00:37,182][15401] Updated weights for policy 0, policy_version 918624 (0.0035) [2024-06-25 16:00:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15050784768. Throughput: 0: 42513.1. Samples: 15050862280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 16:00:38,938][15349] Signal inference workers to stop experience collection... (222750 times) [2024-06-25 16:00:38,991][15401] InferenceWorker_p0-w0: stopping experience collection (222750 times) [2024-06-25 16:00:38,994][15349] Signal inference workers to resume experience collection... (222750 times) [2024-06-25 16:00:39,002][15401] InferenceWorker_p0-w0: resuming experience collection (222750 times) [2024-06-25 16:00:41,087][15401] Updated weights for policy 0, policy_version 918634 (0.0033) [2024-06-25 16:00:43,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 15050964992. Throughput: 0: 42572.6. Samples: 15051121020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:43,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 16:00:44,686][15401] Updated weights for policy 0, policy_version 918644 (0.0026) [2024-06-25 16:00:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 15051210752. Throughput: 0: 42500.4. Samples: 15051369040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-25 16:00:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:00:48,839][15401] Updated weights for policy 0, policy_version 918654 (0.0037) [2024-06-25 16:00:52,403][15401] Updated weights for policy 0, policy_version 918664 (0.0026) [2024-06-25 16:00:53,390][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15051423744. Throughput: 0: 42565.8. Samples: 15051502320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:00:53,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 16:00:56,702][15401] Updated weights for policy 0, policy_version 918674 (0.0044) [2024-06-25 16:00:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42487.7). Total num frames: 15051620352. Throughput: 0: 42442.3. Samples: 15051752680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:00:58,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 16:01:00,625][15401] Updated weights for policy 0, policy_version 918684 (0.0036) [2024-06-25 16:01:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 15051833344. Throughput: 0: 42426.2. Samples: 15052006500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:03,391][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 16:01:04,266][15401] Updated weights for policy 0, policy_version 918694 (0.0038) [2024-06-25 16:01:08,071][15401] Updated weights for policy 0, policy_version 918704 (0.0038) [2024-06-25 16:01:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15052062720. Throughput: 0: 42433.0. Samples: 15052135120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 16:01:11,862][15401] Updated weights for policy 0, policy_version 918714 (0.0025) [2024-06-25 16:01:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15052259328. Throughput: 0: 42457.2. Samples: 15052391140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 16:01:15,687][15401] Updated weights for policy 0, policy_version 918724 (0.0045) [2024-06-25 16:01:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15052488704. Throughput: 0: 42419.6. Samples: 15052644880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 16:01:19,647][15401] Updated weights for policy 0, policy_version 918734 (0.0028) [2024-06-25 16:01:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15052685312. Throughput: 0: 42480.8. Samples: 15052773920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:23,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 16:01:23,549][15401] Updated weights for policy 0, policy_version 918744 (0.0035) [2024-06-25 16:01:27,128][15401] Updated weights for policy 0, policy_version 918754 (0.0039) [2024-06-25 16:01:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 15052898304. Throughput: 0: 42418.0. Samples: 15053029820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 16:01:31,224][15401] Updated weights for policy 0, policy_version 918764 (0.0047) [2024-06-25 16:01:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15053127680. Throughput: 0: 42552.4. Samples: 15053283900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 16:01:34,705][15401] Updated weights for policy 0, policy_version 918774 (0.0031) [2024-06-25 16:01:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15053340672. Throughput: 0: 42438.2. Samples: 15053412040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 16:01:38,851][15401] Updated weights for policy 0, policy_version 918784 (0.0030) [2024-06-25 16:01:42,443][15401] Updated weights for policy 0, policy_version 918794 (0.0030) [2024-06-25 16:01:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15053553664. Throughput: 0: 42538.2. Samples: 15053666900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:43,391][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 16:01:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918796_15053553664.pth... [2024-06-25 16:01:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918174_15043362816.pth [2024-06-25 16:01:46,581][15401] Updated weights for policy 0, policy_version 918804 (0.0040) [2024-06-25 16:01:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15053783040. Throughput: 0: 42515.6. Samples: 15053919700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:48,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:01:50,102][15401] Updated weights for policy 0, policy_version 918814 (0.0026) [2024-06-25 16:01:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15053963264. Throughput: 0: 42628.4. Samples: 15054053400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 16:01:54,436][15401] Updated weights for policy 0, policy_version 918824 (0.0030) [2024-06-25 16:01:58,198][15401] Updated weights for policy 0, policy_version 918834 (0.0037) [2024-06-25 16:01:58,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15054176256. Throughput: 0: 42643.1. Samples: 15054310080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:01:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 16:02:01,585][15349] Signal inference workers to stop experience collection... (222800 times) [2024-06-25 16:02:01,585][15349] Signal inference workers to resume experience collection... (222800 times) [2024-06-25 16:02:01,621][15401] InferenceWorker_p0-w0: stopping experience collection (222800 times) [2024-06-25 16:02:01,622][15401] InferenceWorker_p0-w0: resuming experience collection (222800 times) [2024-06-25 16:02:02,034][15401] Updated weights for policy 0, policy_version 918844 (0.0050) [2024-06-25 16:02:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 15054422016. Throughput: 0: 42627.6. Samples: 15054563120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:03,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 16:02:05,802][15401] Updated weights for policy 0, policy_version 918854 (0.0032) [2024-06-25 16:02:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15054602240. Throughput: 0: 42630.3. Samples: 15054692280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 16:02:09,711][15401] Updated weights for policy 0, policy_version 918864 (0.0038) [2024-06-25 16:02:13,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15054815232. Throughput: 0: 42601.7. Samples: 15054946900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:13,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 16:02:13,538][15401] Updated weights for policy 0, policy_version 918874 (0.0035) [2024-06-25 16:02:17,319][15401] Updated weights for policy 0, policy_version 918884 (0.0042) [2024-06-25 16:02:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15055060992. Throughput: 0: 42691.6. Samples: 15055205020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 16:02:21,735][15401] Updated weights for policy 0, policy_version 918894 (0.0032) [2024-06-25 16:02:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15055257600. Throughput: 0: 42794.4. Samples: 15055337780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:23,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 16:02:24,957][15401] Updated weights for policy 0, policy_version 918904 (0.0041) [2024-06-25 16:02:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15055470592. Throughput: 0: 42681.3. Samples: 15055587560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:28,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 16:02:29,308][15401] Updated weights for policy 0, policy_version 918914 (0.0042) [2024-06-25 16:02:32,643][15401] Updated weights for policy 0, policy_version 918924 (0.0038) [2024-06-25 16:02:33,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15055683584. Throughput: 0: 42864.9. Samples: 15055848620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 16:02:36,833][15401] Updated weights for policy 0, policy_version 918934 (0.0031) [2024-06-25 16:02:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 15055912960. Throughput: 0: 42755.5. Samples: 15055977400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:38,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 16:02:40,035][15401] Updated weights for policy 0, policy_version 918944 (0.0036) [2024-06-25 16:02:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15056125952. Throughput: 0: 42740.0. Samples: 15056233380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-25 16:02:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 16:02:44,316][15401] Updated weights for policy 0, policy_version 918954 (0.0031) [2024-06-25 16:02:47,931][15401] Updated weights for policy 0, policy_version 918964 (0.0028) [2024-06-25 16:02:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15056338944. Throughput: 0: 42903.8. Samples: 15056493800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:02:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:02:51,799][15401] Updated weights for policy 0, policy_version 918974 (0.0030) [2024-06-25 16:02:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15056535552. Throughput: 0: 42809.7. Samples: 15056618720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:02:53,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 16:02:55,617][15401] Updated weights for policy 0, policy_version 918984 (0.0037) [2024-06-25 16:02:58,392][15132] Fps is (10 sec: 42588.9, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 15056764928. Throughput: 0: 42909.3. Samples: 15056877920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:02:58,392][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 16:02:59,229][15401] Updated weights for policy 0, policy_version 918994 (0.0045) [2024-06-25 16:03:03,207][15401] Updated weights for policy 0, policy_version 919004 (0.0036) [2024-06-25 16:03:03,392][15132] Fps is (10 sec: 44225.6, 60 sec: 42596.6, 300 sec: 42487.0). Total num frames: 15056977920. Throughput: 0: 43012.6. Samples: 15057140700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:03,393][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 16:03:06,690][15401] Updated weights for policy 0, policy_version 919014 (0.0025) [2024-06-25 16:03:08,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.3, 300 sec: 42487.7). Total num frames: 15057174528. Throughput: 0: 42739.4. Samples: 15057261060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:08,390][15132] Avg episode reward: [(0, '0.286')] [2024-06-25 16:03:10,867][15401] Updated weights for policy 0, policy_version 919024 (0.0034) [2024-06-25 16:03:12,418][15349] Signal inference workers to stop experience collection... (222850 times) [2024-06-25 16:03:12,419][15349] Signal inference workers to resume experience collection... (222850 times) [2024-06-25 16:03:12,460][15401] InferenceWorker_p0-w0: stopping experience collection (222850 times) [2024-06-25 16:03:12,460][15401] InferenceWorker_p0-w0: resuming experience collection (222850 times) [2024-06-25 16:03:13,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 15057420288. Throughput: 0: 42904.8. Samples: 15057518280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:13,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 16:03:14,906][15401] Updated weights for policy 0, policy_version 919034 (0.0028) [2024-06-25 16:03:18,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15057616896. Throughput: 0: 42953.0. Samples: 15057781500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:18,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 16:03:18,396][15401] Updated weights for policy 0, policy_version 919044 (0.0032) [2024-06-25 16:03:22,467][15401] Updated weights for policy 0, policy_version 919054 (0.0031) [2024-06-25 16:03:23,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15057813504. Throughput: 0: 42711.6. Samples: 15057899420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:23,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 16:03:26,205][15401] Updated weights for policy 0, policy_version 919064 (0.0034) [2024-06-25 16:03:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15058059264. Throughput: 0: 42776.4. Samples: 15058158320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 16:03:30,154][15401] Updated weights for policy 0, policy_version 919074 (0.0034) [2024-06-25 16:03:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15058223104. Throughput: 0: 42856.2. Samples: 15058422320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:33,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 16:03:33,868][15401] Updated weights for policy 0, policy_version 919084 (0.0035) [2024-06-25 16:03:37,647][15401] Updated weights for policy 0, policy_version 919094 (0.0031) [2024-06-25 16:03:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15058468864. Throughput: 0: 42668.4. Samples: 15058538800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 16:03:41,504][15401] Updated weights for policy 0, policy_version 919104 (0.0032) [2024-06-25 16:03:43,390][15132] Fps is (10 sec: 49151.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15058714624. Throughput: 0: 42820.4. Samples: 15058804740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:43,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 16:03:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919111_15058714624.pth... [2024-06-25 16:03:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918485_15048458240.pth [2024-06-25 16:03:45,129][15401] Updated weights for policy 0, policy_version 919114 (0.0048) [2024-06-25 16:03:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15058878464. Throughput: 0: 42761.0. Samples: 15059064840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 16:03:49,055][15401] Updated weights for policy 0, policy_version 919124 (0.0036) [2024-06-25 16:03:52,680][15401] Updated weights for policy 0, policy_version 919134 (0.0042) [2024-06-25 16:03:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15059107840. Throughput: 0: 42739.2. Samples: 15059184320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:53,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 16:03:56,617][15401] Updated weights for policy 0, policy_version 919144 (0.0047) [2024-06-25 16:03:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 15059337216. Throughput: 0: 42823.6. Samples: 15059445340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:03:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 16:04:00,210][15401] Updated weights for policy 0, policy_version 919154 (0.0034) [2024-06-25 16:04:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 15059517440. Throughput: 0: 42649.5. Samples: 15059700740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 16:04:04,214][15401] Updated weights for policy 0, policy_version 919164 (0.0035) [2024-06-25 16:04:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15059730432. Throughput: 0: 42781.8. Samples: 15059824600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 16:04:08,518][15401] Updated weights for policy 0, policy_version 919174 (0.0039) [2024-06-25 16:04:11,878][15401] Updated weights for policy 0, policy_version 919184 (0.0032) [2024-06-25 16:04:13,390][15132] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15059992576. Throughput: 0: 42780.0. Samples: 15060083420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:13,392][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 16:04:16,042][15401] Updated weights for policy 0, policy_version 919194 (0.0049) [2024-06-25 16:04:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15060156416. Throughput: 0: 42608.9. Samples: 15060339720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:18,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 16:04:19,483][15401] Updated weights for policy 0, policy_version 919204 (0.0025) [2024-06-25 16:04:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15060385792. Throughput: 0: 42713.8. Samples: 15060460920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:23,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 16:04:24,315][15401] Updated weights for policy 0, policy_version 919214 (0.0023) [2024-06-25 16:04:26,967][15401] Updated weights for policy 0, policy_version 919224 (0.0035) [2024-06-25 16:04:28,392][15132] Fps is (10 sec: 47503.6, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 15060631552. Throughput: 0: 42631.9. Samples: 15060723260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:28,392][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 16:04:31,893][15349] Signal inference workers to stop experience collection... (222900 times) [2024-06-25 16:04:31,931][15401] InferenceWorker_p0-w0: stopping experience collection (222900 times) [2024-06-25 16:04:31,942][15349] Signal inference workers to resume experience collection... (222900 times) [2024-06-25 16:04:31,947][15401] InferenceWorker_p0-w0: resuming experience collection (222900 times) [2024-06-25 16:04:31,949][15401] Updated weights for policy 0, policy_version 919234 (0.0031) [2024-06-25 16:04:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15060795392. Throughput: 0: 42638.2. Samples: 15060983560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:33,390][15132] Avg episode reward: [(0, '0.298')] [2024-06-25 16:04:34,491][15401] Updated weights for policy 0, policy_version 919244 (0.0043) [2024-06-25 16:04:38,389][15132] Fps is (10 sec: 37691.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15061008384. Throughput: 0: 42744.0. Samples: 15061107800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-25 16:04:38,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 16:04:39,475][15401] Updated weights for policy 0, policy_version 919254 (0.0041) [2024-06-25 16:04:42,536][15401] Updated weights for policy 0, policy_version 919264 (0.0033) [2024-06-25 16:04:43,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 15061254144. Throughput: 0: 42602.3. Samples: 15061362440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:04:43,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 16:04:47,028][15401] Updated weights for policy 0, policy_version 919274 (0.0031) [2024-06-25 16:04:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15061434368. Throughput: 0: 42794.0. Samples: 15061626460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:04:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 16:04:50,130][15401] Updated weights for policy 0, policy_version 919284 (0.0037) [2024-06-25 16:04:53,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15061663744. Throughput: 0: 42705.6. Samples: 15061746360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:04:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 16:04:54,655][15401] Updated weights for policy 0, policy_version 919294 (0.0031) [2024-06-25 16:04:57,837][15401] Updated weights for policy 0, policy_version 919304 (0.0042) [2024-06-25 16:04:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15061893120. Throughput: 0: 42750.8. Samples: 15062007200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:04:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 16:05:02,337][15401] Updated weights for policy 0, policy_version 919314 (0.0036) [2024-06-25 16:05:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15062089728. Throughput: 0: 42732.0. Samples: 15062262660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:03,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 16:05:05,814][15401] Updated weights for policy 0, policy_version 919324 (0.0029) [2024-06-25 16:05:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15062319104. Throughput: 0: 42632.4. Samples: 15062379480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:08,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 16:05:10,441][15401] Updated weights for policy 0, policy_version 919334 (0.0042) [2024-06-25 16:05:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15062515712. Throughput: 0: 42573.6. Samples: 15062638980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 16:05:13,587][15401] Updated weights for policy 0, policy_version 919344 (0.0033) [2024-06-25 16:05:18,077][15401] Updated weights for policy 0, policy_version 919354 (0.0038) [2024-06-25 16:05:18,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15062728704. Throughput: 0: 42517.7. Samples: 15062896860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:18,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 16:05:21,135][15401] Updated weights for policy 0, policy_version 919364 (0.0029) [2024-06-25 16:05:23,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15062958080. Throughput: 0: 42480.9. Samples: 15063019440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 16:05:25,515][15401] Updated weights for policy 0, policy_version 919374 (0.0032) [2024-06-25 16:05:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42053.6, 300 sec: 42653.9). Total num frames: 15063154688. Throughput: 0: 42715.8. Samples: 15063284660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:28,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 16:05:28,742][15401] Updated weights for policy 0, policy_version 919384 (0.0029) [2024-06-25 16:05:33,034][15401] Updated weights for policy 0, policy_version 919394 (0.0033) [2024-06-25 16:05:33,390][15132] Fps is (10 sec: 42597.3, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 15063384064. Throughput: 0: 42558.8. Samples: 15063541620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 16:05:36,451][15401] Updated weights for policy 0, policy_version 919404 (0.0040) [2024-06-25 16:05:38,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15063597056. Throughput: 0: 42777.0. Samples: 15063671320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 16:05:40,539][15401] Updated weights for policy 0, policy_version 919414 (0.0042) [2024-06-25 16:05:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15063777280. Throughput: 0: 42591.9. Samples: 15063923840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:43,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 16:05:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919421_15063793664.pth... [2024-06-25 16:05:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000918796_15053553664.pth [2024-06-25 16:05:44,070][15401] Updated weights for policy 0, policy_version 919424 (0.0039) [2024-06-25 16:05:48,178][15401] Updated weights for policy 0, policy_version 919434 (0.0028) [2024-06-25 16:05:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15064023040. Throughput: 0: 42644.9. Samples: 15064181680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:48,390][15132] Avg episode reward: [(0, '0.881')] [2024-06-25 16:05:51,531][15401] Updated weights for policy 0, policy_version 919444 (0.0024) [2024-06-25 16:05:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15064236032. Throughput: 0: 42926.7. Samples: 15064311080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:53,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 16:05:55,780][15401] Updated weights for policy 0, policy_version 919454 (0.0032) [2024-06-25 16:05:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15064432640. Throughput: 0: 42889.6. Samples: 15064569020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:05:58,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-25 16:05:58,778][15349] Signal inference workers to stop experience collection... (222950 times) [2024-06-25 16:05:58,781][15349] Signal inference workers to resume experience collection... (222950 times) [2024-06-25 16:05:58,788][15401] InferenceWorker_p0-w0: stopping experience collection (222950 times) [2024-06-25 16:05:58,805][15401] InferenceWorker_p0-w0: resuming experience collection (222950 times) [2024-06-25 16:05:59,292][15401] Updated weights for policy 0, policy_version 919464 (0.0033) [2024-06-25 16:06:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15064645632. Throughput: 0: 42909.0. Samples: 15064827760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:03,390][15132] Avg episode reward: [(0, '0.131')] [2024-06-25 16:06:03,442][15401] Updated weights for policy 0, policy_version 919474 (0.0044) [2024-06-25 16:06:06,833][15401] Updated weights for policy 0, policy_version 919484 (0.0031) [2024-06-25 16:06:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 15064891392. Throughput: 0: 43008.4. Samples: 15064954820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:08,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 16:06:10,988][15401] Updated weights for policy 0, policy_version 919494 (0.0038) [2024-06-25 16:06:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15065071616. Throughput: 0: 42690.8. Samples: 15065205740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:13,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 16:06:14,646][15401] Updated weights for policy 0, policy_version 919504 (0.0038) [2024-06-25 16:06:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15065284608. Throughput: 0: 42719.8. Samples: 15065464000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:18,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:06:18,621][15401] Updated weights for policy 0, policy_version 919514 (0.0049) [2024-06-25 16:06:22,138][15401] Updated weights for policy 0, policy_version 919524 (0.0026) [2024-06-25 16:06:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15065513984. Throughput: 0: 42699.2. Samples: 15065592780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:23,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:06:26,095][15401] Updated weights for policy 0, policy_version 919534 (0.0035) [2024-06-25 16:06:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15065726976. Throughput: 0: 42793.3. Samples: 15065849540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 16:06:30,273][15401] Updated weights for policy 0, policy_version 919544 (0.0044) [2024-06-25 16:06:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 15065939968. Throughput: 0: 42804.9. Samples: 15066107900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-25 16:06:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 16:06:34,240][15401] Updated weights for policy 0, policy_version 919554 (0.0036) [2024-06-25 16:06:37,810][15401] Updated weights for policy 0, policy_version 919564 (0.0033) [2024-06-25 16:06:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15066152960. Throughput: 0: 42657.8. Samples: 15066230680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:06:38,390][15132] Avg episode reward: [(0, '0.902')] [2024-06-25 16:06:41,771][15401] Updated weights for policy 0, policy_version 919574 (0.0034) [2024-06-25 16:06:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15066382336. Throughput: 0: 42816.4. Samples: 15066495760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:06:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 16:06:45,314][15401] Updated weights for policy 0, policy_version 919584 (0.0032) [2024-06-25 16:06:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15066578944. Throughput: 0: 42790.1. Samples: 15066753320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:06:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 16:06:49,386][15401] Updated weights for policy 0, policy_version 919594 (0.0047) [2024-06-25 16:06:52,883][15401] Updated weights for policy 0, policy_version 919604 (0.0032) [2024-06-25 16:06:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15066808320. Throughput: 0: 42744.5. Samples: 15066878320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:06:53,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 16:06:57,167][15401] Updated weights for policy 0, policy_version 919614 (0.0027) [2024-06-25 16:06:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15067021312. Throughput: 0: 43000.1. Samples: 15067140740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:06:58,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 16:07:00,639][15401] Updated weights for policy 0, policy_version 919624 (0.0029) [2024-06-25 16:07:03,396][15132] Fps is (10 sec: 40933.7, 60 sec: 42866.8, 300 sec: 42764.1). Total num frames: 15067217920. Throughput: 0: 42760.9. Samples: 15067388520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:03,397][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 16:07:05,020][15401] Updated weights for policy 0, policy_version 919634 (0.0035) [2024-06-25 16:07:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15067430912. Throughput: 0: 42706.3. Samples: 15067514560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:08,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 16:07:08,649][15401] Updated weights for policy 0, policy_version 919644 (0.0043) [2024-06-25 16:07:08,688][15349] Signal inference workers to stop experience collection... (223000 times) [2024-06-25 16:07:08,689][15349] Signal inference workers to resume experience collection... (223000 times) [2024-06-25 16:07:08,712][15401] InferenceWorker_p0-w0: stopping experience collection (223000 times) [2024-06-25 16:07:08,712][15401] InferenceWorker_p0-w0: resuming experience collection (223000 times) [2024-06-25 16:07:12,831][15401] Updated weights for policy 0, policy_version 919654 (0.0030) [2024-06-25 16:07:13,389][15132] Fps is (10 sec: 42626.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15067643904. Throughput: 0: 42770.8. Samples: 15067774220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 16:07:16,311][15401] Updated weights for policy 0, policy_version 919664 (0.0035) [2024-06-25 16:07:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15067873280. Throughput: 0: 42574.2. Samples: 15068023740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:18,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 16:07:20,582][15401] Updated weights for policy 0, policy_version 919674 (0.0038) [2024-06-25 16:07:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15068086272. Throughput: 0: 42734.3. Samples: 15068153720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 16:07:24,016][15401] Updated weights for policy 0, policy_version 919684 (0.0042) [2024-06-25 16:07:28,139][15401] Updated weights for policy 0, policy_version 919694 (0.0033) [2024-06-25 16:07:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15068266496. Throughput: 0: 42508.1. Samples: 15068408620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 16:07:31,709][15401] Updated weights for policy 0, policy_version 919704 (0.0040) [2024-06-25 16:07:33,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15068512256. Throughput: 0: 42394.2. Samples: 15068661060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 16:07:36,122][15401] Updated weights for policy 0, policy_version 919714 (0.0032) [2024-06-25 16:07:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15068708864. Throughput: 0: 42533.8. Samples: 15068792340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:38,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 16:07:39,387][15401] Updated weights for policy 0, policy_version 919724 (0.0040) [2024-06-25 16:07:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15068905472. Throughput: 0: 42342.7. Samples: 15069046160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 16:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919733_15068905472.pth... [2024-06-25 16:07:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919111_15058714624.pth [2024-06-25 16:07:43,756][15401] Updated weights for policy 0, policy_version 919734 (0.0034) [2024-06-25 16:07:47,000][15401] Updated weights for policy 0, policy_version 919744 (0.0039) [2024-06-25 16:07:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15069151232. Throughput: 0: 42462.5. Samples: 15069299060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 16:07:51,423][15401] Updated weights for policy 0, policy_version 919754 (0.0034) [2024-06-25 16:07:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15069364224. Throughput: 0: 42780.3. Samples: 15069439680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:07:54,548][15401] Updated weights for policy 0, policy_version 919764 (0.0042) [2024-06-25 16:07:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.8). Total num frames: 15069544448. Throughput: 0: 42527.1. Samples: 15069687940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:07:58,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 16:07:59,076][15401] Updated weights for policy 0, policy_version 919774 (0.0041) [2024-06-25 16:08:02,294][15401] Updated weights for policy 0, policy_version 919784 (0.0049) [2024-06-25 16:08:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 15069790208. Throughput: 0: 42595.5. Samples: 15069940540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 16:08:06,731][15401] Updated weights for policy 0, policy_version 919794 (0.0032) [2024-06-25 16:08:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15069986816. Throughput: 0: 42630.1. Samples: 15070072080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 16:08:09,906][15401] Updated weights for policy 0, policy_version 919804 (0.0027) [2024-06-25 16:08:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15070183424. Throughput: 0: 42369.3. Samples: 15070315240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:13,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 16:08:14,370][15401] Updated weights for policy 0, policy_version 919814 (0.0040) [2024-06-25 16:08:15,894][15349] Signal inference workers to stop experience collection... (223050 times) [2024-06-25 16:08:15,939][15401] InferenceWorker_p0-w0: stopping experience collection (223050 times) [2024-06-25 16:08:15,943][15349] Signal inference workers to resume experience collection... (223050 times) [2024-06-25 16:08:15,949][15401] InferenceWorker_p0-w0: resuming experience collection (223050 times) [2024-06-25 16:08:17,994][15401] Updated weights for policy 0, policy_version 919824 (0.0041) [2024-06-25 16:08:18,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 15070412800. Throughput: 0: 42526.8. Samples: 15070574860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:18,392][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 16:08:22,435][15401] Updated weights for policy 0, policy_version 919834 (0.0030) [2024-06-25 16:08:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15070625792. Throughput: 0: 42554.3. Samples: 15070707280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 16:08:25,591][15401] Updated weights for policy 0, policy_version 919844 (0.0037) [2024-06-25 16:08:28,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15070838784. Throughput: 0: 42497.8. Samples: 15070958560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-06-25 16:08:28,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 16:08:29,995][15401] Updated weights for policy 0, policy_version 919854 (0.0034) [2024-06-25 16:08:33,147][15401] Updated weights for policy 0, policy_version 919864 (0.0036) [2024-06-25 16:08:33,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15071068160. Throughput: 0: 42618.3. Samples: 15071216880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 16:08:37,459][15401] Updated weights for policy 0, policy_version 919874 (0.0044) [2024-06-25 16:08:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15071264768. Throughput: 0: 42373.5. Samples: 15071346480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:38,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 16:08:40,687][15401] Updated weights for policy 0, policy_version 919884 (0.0039) [2024-06-25 16:08:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15071494144. Throughput: 0: 42581.2. Samples: 15071604100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:43,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 16:08:45,232][15401] Updated weights for policy 0, policy_version 919894 (0.0037) [2024-06-25 16:08:48,171][15401] Updated weights for policy 0, policy_version 919904 (0.0033) [2024-06-25 16:08:48,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15071707136. Throughput: 0: 42550.2. Samples: 15071855300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:48,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 16:08:53,149][15401] Updated weights for policy 0, policy_version 919914 (0.0032) [2024-06-25 16:08:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15071887360. Throughput: 0: 42391.5. Samples: 15071979700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 16:08:56,082][15401] Updated weights for policy 0, policy_version 919924 (0.0025) [2024-06-25 16:08:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15072116736. Throughput: 0: 42716.1. Samples: 15072237460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:08:58,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 16:09:00,648][15401] Updated weights for policy 0, policy_version 919934 (0.0028) [2024-06-25 16:09:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15072346112. Throughput: 0: 42550.1. Samples: 15072489520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 16:09:04,028][15401] Updated weights for policy 0, policy_version 919944 (0.0039) [2024-06-25 16:09:08,263][15401] Updated weights for policy 0, policy_version 919954 (0.0036) [2024-06-25 16:09:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15072526336. Throughput: 0: 42395.9. Samples: 15072615100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:08,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 16:09:11,653][15401] Updated weights for policy 0, policy_version 919964 (0.0050) [2024-06-25 16:09:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15072755712. Throughput: 0: 42542.6. Samples: 15072872980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 16:09:15,806][15401] Updated weights for policy 0, policy_version 919974 (0.0041) [2024-06-25 16:09:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 15072968704. Throughput: 0: 42440.9. Samples: 15073126720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:18,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 16:09:19,256][15401] Updated weights for policy 0, policy_version 919984 (0.0031) [2024-06-25 16:09:23,278][15401] Updated weights for policy 0, policy_version 919994 (0.0027) [2024-06-25 16:09:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42543.1). Total num frames: 15073181696. Throughput: 0: 42439.8. Samples: 15073256280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:23,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 16:09:26,958][15401] Updated weights for policy 0, policy_version 920004 (0.0031) [2024-06-25 16:09:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15073394688. Throughput: 0: 42392.5. Samples: 15073511760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:28,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 16:09:28,799][15349] Signal inference workers to stop experience collection... (223100 times) [2024-06-25 16:09:28,799][15349] Signal inference workers to resume experience collection... (223100 times) [2024-06-25 16:09:28,834][15401] InferenceWorker_p0-w0: stopping experience collection (223100 times) [2024-06-25 16:09:28,835][15401] InferenceWorker_p0-w0: resuming experience collection (223100 times) [2024-06-25 16:09:31,302][15401] Updated weights for policy 0, policy_version 920014 (0.0038) [2024-06-25 16:09:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15073607680. Throughput: 0: 42678.6. Samples: 15073775840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:33,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 16:09:34,677][15401] Updated weights for policy 0, policy_version 920024 (0.0030) [2024-06-25 16:09:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15073820672. Throughput: 0: 42580.0. Samples: 15073895800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 16:09:38,794][15401] Updated weights for policy 0, policy_version 920034 (0.0033) [2024-06-25 16:09:42,351][15401] Updated weights for policy 0, policy_version 920044 (0.0029) [2024-06-25 16:09:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15074033664. Throughput: 0: 42576.8. Samples: 15074153420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 16:09:43,589][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920048_15074066432.pth... [2024-06-25 16:09:43,640][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919421_15063793664.pth [2024-06-25 16:09:46,202][15401] Updated weights for policy 0, policy_version 920054 (0.0036) [2024-06-25 16:09:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15074230272. Throughput: 0: 42785.8. Samples: 15074414880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:48,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 16:09:50,029][15401] Updated weights for policy 0, policy_version 920064 (0.0035) [2024-06-25 16:09:53,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15074476032. Throughput: 0: 42684.0. Samples: 15074535980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:53,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 16:09:53,953][15401] Updated weights for policy 0, policy_version 920074 (0.0039) [2024-06-25 16:09:57,819][15401] Updated weights for policy 0, policy_version 920084 (0.0034) [2024-06-25 16:09:58,395][15132] Fps is (10 sec: 45851.9, 60 sec: 42867.7, 300 sec: 42708.7). Total num frames: 15074689024. Throughput: 0: 42765.0. Samples: 15074797620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:09:58,395][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 16:10:01,767][15401] Updated weights for policy 0, policy_version 920094 (0.0035) [2024-06-25 16:10:03,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 15074885632. Throughput: 0: 42969.8. Samples: 15075060360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:10:03,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 16:10:05,433][15401] Updated weights for policy 0, policy_version 920104 (0.0034) [2024-06-25 16:10:08,390][15132] Fps is (10 sec: 42620.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15075115008. Throughput: 0: 42809.4. Samples: 15075182700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:10:08,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 16:10:09,446][15401] Updated weights for policy 0, policy_version 920114 (0.0030) [2024-06-25 16:10:13,039][15401] Updated weights for policy 0, policy_version 920124 (0.0023) [2024-06-25 16:10:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15075328000. Throughput: 0: 42960.3. Samples: 15075444980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:10:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 16:10:16,830][15401] Updated weights for policy 0, policy_version 920134 (0.0028) [2024-06-25 16:10:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15075524608. Throughput: 0: 42899.6. Samples: 15075706320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:10:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 16:10:20,597][15401] Updated weights for policy 0, policy_version 920144 (0.0041) [2024-06-25 16:10:23,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 15075753984. Throughput: 0: 42989.7. Samples: 15075830440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 16:10:23,393][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 16:10:24,389][15401] Updated weights for policy 0, policy_version 920154 (0.0040) [2024-06-25 16:10:28,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 15075950592. Throughput: 0: 42855.1. Samples: 15076082000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:28,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 16:10:28,446][15401] Updated weights for policy 0, policy_version 920164 (0.0035) [2024-06-25 16:10:32,075][15401] Updated weights for policy 0, policy_version 920174 (0.0041) [2024-06-25 16:10:33,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15076163584. Throughput: 0: 42897.3. Samples: 15076345260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:33,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 16:10:36,170][15401] Updated weights for policy 0, policy_version 920184 (0.0031) [2024-06-25 16:10:38,390][15132] Fps is (10 sec: 45885.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15076409344. Throughput: 0: 43051.6. Samples: 15076473200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:38,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 16:10:39,756][15401] Updated weights for policy 0, policy_version 920194 (0.0032) [2024-06-25 16:10:43,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 15076605952. Throughput: 0: 42875.0. Samples: 15076726880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:43,393][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 16:10:43,885][15401] Updated weights for policy 0, policy_version 920204 (0.0034) [2024-06-25 16:10:47,477][15401] Updated weights for policy 0, policy_version 920214 (0.0036) [2024-06-25 16:10:48,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15076802560. Throughput: 0: 42792.5. Samples: 15076986020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:48,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 16:10:51,503][15401] Updated weights for policy 0, policy_version 920224 (0.0032) [2024-06-25 16:10:52,054][15349] Signal inference workers to stop experience collection... (223150 times) [2024-06-25 16:10:52,083][15401] InferenceWorker_p0-w0: stopping experience collection (223150 times) [2024-06-25 16:10:52,170][15349] Signal inference workers to resume experience collection... (223150 times) [2024-06-25 16:10:52,170][15401] InferenceWorker_p0-w0: resuming experience collection (223150 times) [2024-06-25 16:10:53,394][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42708.8). Total num frames: 15077031936. Throughput: 0: 42833.3. Samples: 15077110400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:53,395][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 16:10:55,428][15401] Updated weights for policy 0, policy_version 920234 (0.0046) [2024-06-25 16:10:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42329.0, 300 sec: 42653.9). Total num frames: 15077228544. Throughput: 0: 42683.3. Samples: 15077365720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:10:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:10:59,183][15401] Updated weights for policy 0, policy_version 920244 (0.0039) [2024-06-25 16:11:03,014][15401] Updated weights for policy 0, policy_version 920254 (0.0031) [2024-06-25 16:11:03,396][15132] Fps is (10 sec: 42591.6, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 15077457920. Throughput: 0: 42549.9. Samples: 15077621340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:03,396][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 16:11:06,791][15401] Updated weights for policy 0, policy_version 920264 (0.0023) [2024-06-25 16:11:08,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15077670912. Throughput: 0: 42665.4. Samples: 15077750280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 16:11:10,907][15401] Updated weights for policy 0, policy_version 920274 (0.0041) [2024-06-25 16:11:13,390][15132] Fps is (10 sec: 40986.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15077867520. Throughput: 0: 42810.2. Samples: 15078008360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 16:11:14,388][15401] Updated weights for policy 0, policy_version 920284 (0.0032) [2024-06-25 16:11:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15078080512. Throughput: 0: 42616.0. Samples: 15078262980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:18,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 16:11:18,533][15401] Updated weights for policy 0, policy_version 920294 (0.0039) [2024-06-25 16:11:22,092][15401] Updated weights for policy 0, policy_version 920304 (0.0033) [2024-06-25 16:11:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 15078326272. Throughput: 0: 42696.4. Samples: 15078394540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 16:11:26,057][15401] Updated weights for policy 0, policy_version 920314 (0.0041) [2024-06-25 16:11:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 15078522880. Throughput: 0: 42587.1. Samples: 15078643200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 16:11:29,844][15401] Updated weights for policy 0, policy_version 920324 (0.0035) [2024-06-25 16:11:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15078735872. Throughput: 0: 42568.8. Samples: 15078901620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 16:11:33,514][15401] Updated weights for policy 0, policy_version 920334 (0.0035) [2024-06-25 16:11:37,560][15401] Updated weights for policy 0, policy_version 920344 (0.0037) [2024-06-25 16:11:38,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15078948864. Throughput: 0: 42648.2. Samples: 15079029360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:38,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 16:11:41,087][15401] Updated weights for policy 0, policy_version 920354 (0.0031) [2024-06-25 16:11:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 15079145472. Throughput: 0: 42601.1. Samples: 15079282780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:43,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 16:11:43,602][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920360_15079178240.pth... [2024-06-25 16:11:43,660][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000919733_15068905472.pth [2024-06-25 16:11:45,408][15401] Updated weights for policy 0, policy_version 920364 (0.0029) [2024-06-25 16:11:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15079391232. Throughput: 0: 42613.2. Samples: 15079538760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:48,392][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 16:11:48,893][15401] Updated weights for policy 0, policy_version 920374 (0.0041) [2024-06-25 16:11:52,928][15401] Updated weights for policy 0, policy_version 920384 (0.0036) [2024-06-25 16:11:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42601.7, 300 sec: 42598.4). Total num frames: 15079587840. Throughput: 0: 42705.3. Samples: 15079672020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:53,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 16:11:56,255][15401] Updated weights for policy 0, policy_version 920394 (0.0031) [2024-06-25 16:11:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.4, 300 sec: 42654.9). Total num frames: 15079800832. Throughput: 0: 42678.3. Samples: 15079928880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:11:58,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 16:12:00,407][15401] Updated weights for policy 0, policy_version 920404 (0.0030) [2024-06-25 16:12:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 15080013824. Throughput: 0: 42672.9. Samples: 15080183260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:12:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 16:12:04,173][15401] Updated weights for policy 0, policy_version 920414 (0.0035) [2024-06-25 16:12:08,136][15401] Updated weights for policy 0, policy_version 920424 (0.0034) [2024-06-25 16:12:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15080243200. Throughput: 0: 42560.0. Samples: 15080309740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:12:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 16:12:11,207][15349] Signal inference workers to stop experience collection... (223200 times) [2024-06-25 16:12:11,208][15349] Signal inference workers to resume experience collection... (223200 times) [2024-06-25 16:12:11,234][15401] InferenceWorker_p0-w0: stopping experience collection (223200 times) [2024-06-25 16:12:11,234][15401] InferenceWorker_p0-w0: resuming experience collection (223200 times) [2024-06-25 16:12:11,681][15401] Updated weights for policy 0, policy_version 920434 (0.0039) [2024-06-25 16:12:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15080439808. Throughput: 0: 42549.5. Samples: 15080557920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:12:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 16:12:16,103][15401] Updated weights for policy 0, policy_version 920444 (0.0026) [2024-06-25 16:12:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 15080652800. Throughput: 0: 42657.3. Samples: 15080821300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 16:12:18,392][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 16:12:19,283][15401] Updated weights for policy 0, policy_version 920454 (0.0047) [2024-06-25 16:12:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15080849408. Throughput: 0: 42642.6. Samples: 15080948280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:23,394][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 16:12:23,887][15401] Updated weights for policy 0, policy_version 920464 (0.0028) [2024-06-25 16:12:27,101][15401] Updated weights for policy 0, policy_version 920474 (0.0038) [2024-06-25 16:12:28,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15081095168. Throughput: 0: 42722.7. Samples: 15081205300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 16:12:31,419][15401] Updated weights for policy 0, policy_version 920484 (0.0037) [2024-06-25 16:12:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15081308160. Throughput: 0: 42773.8. Samples: 15081463480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 16:12:34,657][15401] Updated weights for policy 0, policy_version 920494 (0.0035) [2024-06-25 16:12:38,392][15132] Fps is (10 sec: 39312.8, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 15081488384. Throughput: 0: 42633.4. Samples: 15081590620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:38,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 16:12:39,117][15401] Updated weights for policy 0, policy_version 920504 (0.0038) [2024-06-25 16:12:42,272][15401] Updated weights for policy 0, policy_version 920514 (0.0034) [2024-06-25 16:12:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15081750528. Throughput: 0: 42566.1. Samples: 15081844360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:12:46,751][15401] Updated weights for policy 0, policy_version 920524 (0.0042) [2024-06-25 16:12:48,392][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.1). Total num frames: 15081930752. Throughput: 0: 42799.6. Samples: 15082109340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:48,393][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 16:12:49,914][15401] Updated weights for policy 0, policy_version 920534 (0.0038) [2024-06-25 16:12:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15082143744. Throughput: 0: 42607.5. Samples: 15082227080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:53,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:12:54,575][15401] Updated weights for policy 0, policy_version 920544 (0.0035) [2024-06-25 16:12:57,513][15401] Updated weights for policy 0, policy_version 920554 (0.0028) [2024-06-25 16:12:58,392][15132] Fps is (10 sec: 45875.3, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 15082389504. Throughput: 0: 42879.0. Samples: 15082487580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:12:58,393][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 16:13:02,101][15401] Updated weights for policy 0, policy_version 920564 (0.0030) [2024-06-25 16:13:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15082553344. Throughput: 0: 42833.4. Samples: 15082748700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:03,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 16:13:05,142][15401] Updated weights for policy 0, policy_version 920574 (0.0032) [2024-06-25 16:13:08,390][15132] Fps is (10 sec: 39330.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15082782720. Throughput: 0: 42693.3. Samples: 15082869480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 16:13:09,639][15401] Updated weights for policy 0, policy_version 920584 (0.0043) [2024-06-25 16:13:12,789][15401] Updated weights for policy 0, policy_version 920594 (0.0036) [2024-06-25 16:13:13,390][15132] Fps is (10 sec: 49152.0, 60 sec: 43417.6, 300 sec: 42820.9). Total num frames: 15083044864. Throughput: 0: 42842.4. Samples: 15083133200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 16:13:17,646][15401] Updated weights for policy 0, policy_version 920604 (0.0029) [2024-06-25 16:13:18,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 15083192320. Throughput: 0: 42853.5. Samples: 15083391880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:18,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 16:13:20,472][15401] Updated weights for policy 0, policy_version 920614 (0.0034) [2024-06-25 16:13:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15083438080. Throughput: 0: 42655.2. Samples: 15083510000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 16:13:25,395][15401] Updated weights for policy 0, policy_version 920624 (0.0036) [2024-06-25 16:13:28,166][15401] Updated weights for policy 0, policy_version 920634 (0.0029) [2024-06-25 16:13:28,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15083667456. Throughput: 0: 42833.4. Samples: 15083771860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 16:13:32,757][15401] Updated weights for policy 0, policy_version 920644 (0.0040) [2024-06-25 16:13:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15083864064. Throughput: 0: 42805.4. Samples: 15084035480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:33,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 16:13:34,077][15349] Signal inference workers to stop experience collection... (223250 times) [2024-06-25 16:13:34,077][15349] Signal inference workers to resume experience collection... (223250 times) [2024-06-25 16:13:34,096][15401] InferenceWorker_p0-w0: stopping experience collection (223250 times) [2024-06-25 16:13:34,096][15401] InferenceWorker_p0-w0: resuming experience collection (223250 times) [2024-06-25 16:13:36,014][15401] Updated weights for policy 0, policy_version 920654 (0.0038) [2024-06-25 16:13:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43417.5, 300 sec: 42709.1). Total num frames: 15084093440. Throughput: 0: 43010.6. Samples: 15084162660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:38,393][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 16:13:40,239][15401] Updated weights for policy 0, policy_version 920664 (0.0033) [2024-06-25 16:13:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15084306432. Throughput: 0: 42996.9. Samples: 15084422340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:43,394][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 16:13:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920673_15084306432.pth... [2024-06-25 16:13:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920048_15074066432.pth [2024-06-25 16:13:43,636][15401] Updated weights for policy 0, policy_version 920674 (0.0038) [2024-06-25 16:13:47,623][15401] Updated weights for policy 0, policy_version 920684 (0.0039) [2024-06-25 16:13:48,390][15132] Fps is (10 sec: 42608.7, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 15084519424. Throughput: 0: 42908.8. Samples: 15084679600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:48,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 16:13:51,193][15401] Updated weights for policy 0, policy_version 920694 (0.0029) [2024-06-25 16:13:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15084748800. Throughput: 0: 43178.2. Samples: 15084812500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:53,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 16:13:55,017][15401] Updated weights for policy 0, policy_version 920704 (0.0030) [2024-06-25 16:13:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 15084961792. Throughput: 0: 43191.5. Samples: 15085076820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:13:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 16:13:58,845][15401] Updated weights for policy 0, policy_version 920714 (0.0030) [2024-06-25 16:14:02,503][15401] Updated weights for policy 0, policy_version 920724 (0.0028) [2024-06-25 16:14:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 15085174784. Throughput: 0: 43066.8. Samples: 15085329900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:14:03,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 16:14:06,507][15401] Updated weights for policy 0, policy_version 920734 (0.0035) [2024-06-25 16:14:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15085387776. Throughput: 0: 43303.5. Samples: 15085458660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:14:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 16:14:09,975][15401] Updated weights for policy 0, policy_version 920744 (0.0036) [2024-06-25 16:14:13,389][15132] Fps is (10 sec: 40961.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15085584384. Throughput: 0: 43249.8. Samples: 15085718100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 16:14:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 16:14:14,123][15401] Updated weights for policy 0, policy_version 920754 (0.0028) [2024-06-25 16:14:17,607][15401] Updated weights for policy 0, policy_version 920764 (0.0038) [2024-06-25 16:14:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43690.5, 300 sec: 42820.6). Total num frames: 15085813760. Throughput: 0: 43054.6. Samples: 15085972940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:18,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 16:14:21,945][15401] Updated weights for policy 0, policy_version 920774 (0.0027) [2024-06-25 16:14:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15086026752. Throughput: 0: 43303.6. Samples: 15086111220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:23,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 16:14:25,400][15401] Updated weights for policy 0, policy_version 920784 (0.0033) [2024-06-25 16:14:28,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15086239744. Throughput: 0: 43136.4. Samples: 15086363480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:28,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 16:14:29,562][15401] Updated weights for policy 0, policy_version 920794 (0.0037) [2024-06-25 16:14:33,000][15401] Updated weights for policy 0, policy_version 920804 (0.0040) [2024-06-25 16:14:33,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15086469120. Throughput: 0: 43175.5. Samples: 15086622500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:33,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 16:14:36,942][15401] Updated weights for policy 0, policy_version 920814 (0.0026) [2024-06-25 16:14:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43146.3, 300 sec: 42876.1). Total num frames: 15086682112. Throughput: 0: 43124.0. Samples: 15086753080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:38,399][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 16:14:40,404][15401] Updated weights for policy 0, policy_version 920824 (0.0046) [2024-06-25 16:14:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15086895104. Throughput: 0: 43082.2. Samples: 15087015520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:43,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 16:14:44,520][15401] Updated weights for policy 0, policy_version 920834 (0.0044) [2024-06-25 16:14:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15087091712. Throughput: 0: 43227.0. Samples: 15087275100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:48,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 16:14:48,537][15401] Updated weights for policy 0, policy_version 920844 (0.0033) [2024-06-25 16:14:52,163][15401] Updated weights for policy 0, policy_version 920854 (0.0034) [2024-06-25 16:14:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42821.3). Total num frames: 15087321088. Throughput: 0: 43183.5. Samples: 15087401920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 16:14:56,053][15401] Updated weights for policy 0, policy_version 920864 (0.0027) [2024-06-25 16:14:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15087534080. Throughput: 0: 43169.3. Samples: 15087660720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:14:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 16:14:59,674][15401] Updated weights for policy 0, policy_version 920874 (0.0038) [2024-06-25 16:15:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.7, 300 sec: 42820.6). Total num frames: 15087747072. Throughput: 0: 43380.1. Samples: 15087925040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 16:15:03,491][15349] Signal inference workers to stop experience collection... (223300 times) [2024-06-25 16:15:03,511][15401] InferenceWorker_p0-w0: stopping experience collection (223300 times) [2024-06-25 16:15:03,549][15349] Signal inference workers to resume experience collection... (223300 times) [2024-06-25 16:15:03,550][15401] InferenceWorker_p0-w0: resuming experience collection (223300 times) [2024-06-25 16:15:03,552][15401] Updated weights for policy 0, policy_version 920884 (0.0026) [2024-06-25 16:15:07,180][15401] Updated weights for policy 0, policy_version 920894 (0.0028) [2024-06-25 16:15:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15087976448. Throughput: 0: 43094.0. Samples: 15088050440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 16:15:11,322][15401] Updated weights for policy 0, policy_version 920904 (0.0040) [2024-06-25 16:15:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15088173056. Throughput: 0: 43148.1. Samples: 15088305140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 16:15:14,793][15401] Updated weights for policy 0, policy_version 920914 (0.0022) [2024-06-25 16:15:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 15088386048. Throughput: 0: 43141.0. Samples: 15088563840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:18,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 16:15:18,926][15401] Updated weights for policy 0, policy_version 920924 (0.0038) [2024-06-25 16:15:22,419][15401] Updated weights for policy 0, policy_version 920934 (0.0040) [2024-06-25 16:15:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 15088599040. Throughput: 0: 43158.4. Samples: 15088695200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 16:15:26,452][15401] Updated weights for policy 0, policy_version 920944 (0.0049) [2024-06-25 16:15:28,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 15088812032. Throughput: 0: 42895.0. Samples: 15088945900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:28,393][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 16:15:29,971][15401] Updated weights for policy 0, policy_version 920954 (0.0029) [2024-06-25 16:15:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15089025024. Throughput: 0: 42952.9. Samples: 15089207980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 16:15:33,872][15401] Updated weights for policy 0, policy_version 920964 (0.0026) [2024-06-25 16:15:37,558][15401] Updated weights for policy 0, policy_version 920974 (0.0033) [2024-06-25 16:15:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 15089254400. Throughput: 0: 43027.6. Samples: 15089338160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:38,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 16:15:41,718][15401] Updated weights for policy 0, policy_version 920984 (0.0029) [2024-06-25 16:15:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15089451008. Throughput: 0: 42946.1. Samples: 15089593300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:43,390][15132] Avg episode reward: [(0, '0.940')] [2024-06-25 16:15:43,549][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920988_15089467392.pth... [2024-06-25 16:15:43,611][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920360_15079178240.pth [2024-06-25 16:15:45,306][15401] Updated weights for policy 0, policy_version 920994 (0.0031) [2024-06-25 16:15:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42821.2). Total num frames: 15089664000. Throughput: 0: 42764.8. Samples: 15089849460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:48,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 16:15:49,317][15401] Updated weights for policy 0, policy_version 921004 (0.0036) [2024-06-25 16:15:52,910][15401] Updated weights for policy 0, policy_version 921014 (0.0038) [2024-06-25 16:15:53,396][15132] Fps is (10 sec: 45846.1, 60 sec: 43140.0, 300 sec: 42986.2). Total num frames: 15089909760. Throughput: 0: 42877.4. Samples: 15089980200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:53,396][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 16:15:57,102][15401] Updated weights for policy 0, policy_version 921024 (0.0031) [2024-06-25 16:15:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42821.5). Total num frames: 15090089984. Throughput: 0: 42810.6. Samples: 15090231620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:15:58,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 16:16:00,571][15401] Updated weights for policy 0, policy_version 921034 (0.0033) [2024-06-25 16:16:03,390][15132] Fps is (10 sec: 40986.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15090319360. Throughput: 0: 42719.5. Samples: 15090486220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:16:03,392][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 16:16:04,854][15401] Updated weights for policy 0, policy_version 921044 (0.0022) [2024-06-25 16:16:08,242][15401] Updated weights for policy 0, policy_version 921054 (0.0032) [2024-06-25 16:16:08,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15090548736. Throughput: 0: 42808.4. Samples: 15090621580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 16:16:08,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 16:16:12,371][15401] Updated weights for policy 0, policy_version 921064 (0.0031) [2024-06-25 16:16:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 15090745344. Throughput: 0: 42954.7. Samples: 15090878760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 16:16:15,818][15401] Updated weights for policy 0, policy_version 921074 (0.0031) [2024-06-25 16:16:18,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 15090958336. Throughput: 0: 42734.5. Samples: 15091131140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:18,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 16:16:19,891][15401] Updated weights for policy 0, policy_version 921084 (0.0031) [2024-06-25 16:16:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 15091187712. Throughput: 0: 42748.8. Samples: 15091261860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:23,390][15132] Avg episode reward: [(0, '0.128')] [2024-06-25 16:16:23,513][15401] Updated weights for policy 0, policy_version 921094 (0.0032) [2024-06-25 16:16:24,696][15349] Signal inference workers to stop experience collection... (223350 times) [2024-06-25 16:16:24,696][15349] Signal inference workers to resume experience collection... (223350 times) [2024-06-25 16:16:24,729][15401] InferenceWorker_p0-w0: stopping experience collection (223350 times) [2024-06-25 16:16:24,729][15401] InferenceWorker_p0-w0: resuming experience collection (223350 times) [2024-06-25 16:16:27,620][15401] Updated weights for policy 0, policy_version 921104 (0.0029) [2024-06-25 16:16:28,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 15091384320. Throughput: 0: 42765.0. Samples: 15091517720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:28,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 16:16:31,283][15401] Updated weights for policy 0, policy_version 921114 (0.0027) [2024-06-25 16:16:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 15091613696. Throughput: 0: 42969.3. Samples: 15091783080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 16:16:35,001][15401] Updated weights for policy 0, policy_version 921124 (0.0036) [2024-06-25 16:16:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15091826688. Throughput: 0: 42992.7. Samples: 15091914600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:38,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 16:16:38,720][15401] Updated weights for policy 0, policy_version 921134 (0.0026) [2024-06-25 16:16:42,390][15401] Updated weights for policy 0, policy_version 921144 (0.0043) [2024-06-25 16:16:43,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 15092039680. Throughput: 0: 42953.0. Samples: 15092164500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:43,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 16:16:46,651][15401] Updated weights for policy 0, policy_version 921154 (0.0033) [2024-06-25 16:16:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15092252672. Throughput: 0: 43101.3. Samples: 15092425780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 16:16:50,468][15401] Updated weights for policy 0, policy_version 921164 (0.0036) [2024-06-25 16:16:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42876.0, 300 sec: 42987.2). Total num frames: 15092482048. Throughput: 0: 42968.9. Samples: 15092555180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:53,390][15132] Avg episode reward: [(0, '0.854')] [2024-06-25 16:16:54,485][15401] Updated weights for policy 0, policy_version 921174 (0.0035) [2024-06-25 16:16:57,993][15401] Updated weights for policy 0, policy_version 921184 (0.0036) [2024-06-25 16:16:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15092678656. Throughput: 0: 42910.3. Samples: 15092809720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:16:58,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 16:17:01,889][15401] Updated weights for policy 0, policy_version 921194 (0.0030) [2024-06-25 16:17:03,396][15132] Fps is (10 sec: 42571.1, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 15092908032. Throughput: 0: 43009.5. Samples: 15093066740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:03,396][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 16:17:05,441][15401] Updated weights for policy 0, policy_version 921204 (0.0036) [2024-06-25 16:17:08,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15093121024. Throughput: 0: 43017.4. Samples: 15093197640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 16:17:09,392][15401] Updated weights for policy 0, policy_version 921214 (0.0043) [2024-06-25 16:17:13,290][15401] Updated weights for policy 0, policy_version 921224 (0.0033) [2024-06-25 16:17:13,390][15132] Fps is (10 sec: 42625.5, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 15093334016. Throughput: 0: 42954.6. Samples: 15093450680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 16:17:17,091][15401] Updated weights for policy 0, policy_version 921234 (0.0029) [2024-06-25 16:17:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42873.2, 300 sec: 42987.2). Total num frames: 15093530624. Throughput: 0: 42739.2. Samples: 15093706340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:18,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 16:17:21,331][15401] Updated weights for policy 0, policy_version 921244 (0.0031) [2024-06-25 16:17:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15093760000. Throughput: 0: 42655.1. Samples: 15093834080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:23,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 16:17:25,015][15401] Updated weights for policy 0, policy_version 921254 (0.0032) [2024-06-25 16:17:28,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 15093972992. Throughput: 0: 42627.4. Samples: 15094082740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:28,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 16:17:29,037][15401] Updated weights for policy 0, policy_version 921264 (0.0037) [2024-06-25 16:17:32,521][15401] Updated weights for policy 0, policy_version 921274 (0.0035) [2024-06-25 16:17:33,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 43043.1). Total num frames: 15094185984. Throughput: 0: 42672.6. Samples: 15094346040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 16:17:36,665][15401] Updated weights for policy 0, policy_version 921284 (0.0034) [2024-06-25 16:17:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15094382592. Throughput: 0: 42673.8. Samples: 15094475500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:38,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 16:17:40,087][15401] Updated weights for policy 0, policy_version 921294 (0.0040) [2024-06-25 16:17:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42987.5). Total num frames: 15094611968. Throughput: 0: 42647.1. Samples: 15094728840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 16:17:43,431][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921303_15094628352.pth... [2024-06-25 16:17:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920673_15084306432.pth [2024-06-25 16:17:44,047][15401] Updated weights for policy 0, policy_version 921304 (0.0044) [2024-06-25 16:17:48,111][15401] Updated weights for policy 0, policy_version 921314 (0.0028) [2024-06-25 16:17:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15094824960. Throughput: 0: 42690.5. Samples: 15094987540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:48,399][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 16:17:51,599][15401] Updated weights for policy 0, policy_version 921324 (0.0028) [2024-06-25 16:17:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 15095037952. Throughput: 0: 42503.4. Samples: 15095110300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:53,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 16:17:55,691][15401] Updated weights for policy 0, policy_version 921334 (0.0026) [2024-06-25 16:17:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15095250944. Throughput: 0: 42619.6. Samples: 15095368560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:17:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 16:17:59,573][15401] Updated weights for policy 0, policy_version 921344 (0.0024) [2024-06-25 16:18:01,576][15349] Signal inference workers to stop experience collection... (223400 times) [2024-06-25 16:18:01,576][15349] Signal inference workers to resume experience collection... (223400 times) [2024-06-25 16:18:01,608][15401] InferenceWorker_p0-w0: stopping experience collection (223400 times) [2024-06-25 16:18:01,608][15401] InferenceWorker_p0-w0: resuming experience collection (223400 times) [2024-06-25 16:18:03,267][15401] Updated weights for policy 0, policy_version 921354 (0.0037) [2024-06-25 16:18:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42602.9, 300 sec: 42987.2). Total num frames: 15095463936. Throughput: 0: 42589.3. Samples: 15095622860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 16:18:03,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 16:18:07,852][15401] Updated weights for policy 0, policy_version 921364 (0.0031) [2024-06-25 16:18:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15095660544. Throughput: 0: 42512.1. Samples: 15095747120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:08,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 16:18:11,182][15401] Updated weights for policy 0, policy_version 921374 (0.0029) [2024-06-25 16:18:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 43042.7). Total num frames: 15095889920. Throughput: 0: 42772.1. Samples: 15096007480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 16:18:15,364][15401] Updated weights for policy 0, policy_version 921384 (0.0037) [2024-06-25 16:18:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15096086528. Throughput: 0: 42669.1. Samples: 15096266160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 16:18:18,766][15401] Updated weights for policy 0, policy_version 921394 (0.0026) [2024-06-25 16:18:23,069][15401] Updated weights for policy 0, policy_version 921404 (0.0040) [2024-06-25 16:18:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15096283136. Throughput: 0: 42457.8. Samples: 15096386100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 16:18:26,374][15401] Updated weights for policy 0, policy_version 921414 (0.0048) [2024-06-25 16:18:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15096528896. Throughput: 0: 42547.5. Samples: 15096643480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 16:18:30,832][15401] Updated weights for policy 0, policy_version 921424 (0.0039) [2024-06-25 16:18:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42820.9). Total num frames: 15096725504. Throughput: 0: 42504.0. Samples: 15096900220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:33,396][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 16:18:34,129][15401] Updated weights for policy 0, policy_version 921434 (0.0035) [2024-06-25 16:18:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15096922112. Throughput: 0: 42494.0. Samples: 15097022520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 16:18:38,499][15401] Updated weights for policy 0, policy_version 921444 (0.0037) [2024-06-25 16:18:41,794][15401] Updated weights for policy 0, policy_version 921454 (0.0027) [2024-06-25 16:18:43,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 15097167872. Throughput: 0: 42465.7. Samples: 15097279620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:43,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 16:18:46,090][15401] Updated weights for policy 0, policy_version 921464 (0.0044) [2024-06-25 16:18:48,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 15097364480. Throughput: 0: 42568.6. Samples: 15097538720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:48,396][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 16:18:49,552][15401] Updated weights for policy 0, policy_version 921474 (0.0028) [2024-06-25 16:18:53,390][15132] Fps is (10 sec: 39331.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 15097561088. Throughput: 0: 42542.7. Samples: 15097661540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:53,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 16:18:53,973][15401] Updated weights for policy 0, policy_version 921484 (0.0031) [2024-06-25 16:18:57,143][15401] Updated weights for policy 0, policy_version 921494 (0.0044) [2024-06-25 16:18:58,392][15132] Fps is (10 sec: 44254.6, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 15097806848. Throughput: 0: 42519.0. Samples: 15097920940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:18:58,393][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 16:19:01,502][15401] Updated weights for policy 0, policy_version 921504 (0.0034) [2024-06-25 16:19:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15098019840. Throughput: 0: 42432.0. Samples: 15098175600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:03,391][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 16:19:04,849][15401] Updated weights for policy 0, policy_version 921514 (0.0027) [2024-06-25 16:19:08,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15098216448. Throughput: 0: 42695.5. Samples: 15098307400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:08,392][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 16:19:09,025][15401] Updated weights for policy 0, policy_version 921524 (0.0031) [2024-06-25 16:19:12,373][15401] Updated weights for policy 0, policy_version 921534 (0.0028) [2024-06-25 16:19:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15098445824. Throughput: 0: 42715.6. Samples: 15098565680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 16:19:16,736][15401] Updated weights for policy 0, policy_version 921544 (0.0039) [2024-06-25 16:19:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15098642432. Throughput: 0: 42631.2. Samples: 15098818620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:18,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 16:19:19,656][15349] Signal inference workers to stop experience collection... (223450 times) [2024-06-25 16:19:19,657][15349] Signal inference workers to resume experience collection... (223450 times) [2024-06-25 16:19:19,706][15401] InferenceWorker_p0-w0: stopping experience collection (223450 times) [2024-06-25 16:19:19,706][15401] InferenceWorker_p0-w0: resuming experience collection (223450 times) [2024-06-25 16:19:20,110][15401] Updated weights for policy 0, policy_version 921554 (0.0026) [2024-06-25 16:19:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15098855424. Throughput: 0: 42643.5. Samples: 15098941480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:23,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 16:19:24,828][15401] Updated weights for policy 0, policy_version 921564 (0.0042) [2024-06-25 16:19:27,932][15401] Updated weights for policy 0, policy_version 921574 (0.0037) [2024-06-25 16:19:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15099101184. Throughput: 0: 42725.4. Samples: 15099202160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:28,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 16:19:32,290][15401] Updated weights for policy 0, policy_version 921584 (0.0033) [2024-06-25 16:19:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15099297792. Throughput: 0: 42551.4. Samples: 15099453260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:33,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 16:19:35,451][15401] Updated weights for policy 0, policy_version 921594 (0.0028) [2024-06-25 16:19:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15099494400. Throughput: 0: 42779.5. Samples: 15099586620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:38,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 16:19:39,720][15401] Updated weights for policy 0, policy_version 921604 (0.0031) [2024-06-25 16:19:42,953][15401] Updated weights for policy 0, policy_version 921614 (0.0044) [2024-06-25 16:19:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 15099740160. Throughput: 0: 42785.8. Samples: 15099846200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:43,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 16:19:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921615_15099740160.pth... [2024-06-25 16:19:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000920988_15089467392.pth [2024-06-25 16:19:47,475][15401] Updated weights for policy 0, policy_version 921624 (0.0037) [2024-06-25 16:19:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 15099936768. Throughput: 0: 42775.3. Samples: 15100100480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:48,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 16:19:50,452][15401] Updated weights for policy 0, policy_version 921634 (0.0028) [2024-06-25 16:19:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15100149760. Throughput: 0: 42648.0. Samples: 15100226660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:53,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:19:55,081][15401] Updated weights for policy 0, policy_version 921644 (0.0042) [2024-06-25 16:19:58,199][15401] Updated weights for policy 0, policy_version 921654 (0.0040) [2024-06-25 16:19:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 15100395520. Throughput: 0: 42674.7. Samples: 15100486040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-25 16:19:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 16:20:02,651][15401] Updated weights for policy 0, policy_version 921664 (0.0030) [2024-06-25 16:20:03,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15100575744. Throughput: 0: 42807.1. Samples: 15100744940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:03,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 16:20:05,741][15401] Updated weights for policy 0, policy_version 921674 (0.0027) [2024-06-25 16:20:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15100805120. Throughput: 0: 42788.4. Samples: 15100866960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 16:20:10,233][15401] Updated weights for policy 0, policy_version 921684 (0.0040) [2024-06-25 16:20:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15101018112. Throughput: 0: 42814.2. Samples: 15101128800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:13,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 16:20:13,451][15401] Updated weights for policy 0, policy_version 921694 (0.0030) [2024-06-25 16:20:17,910][15401] Updated weights for policy 0, policy_version 921704 (0.0024) [2024-06-25 16:20:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15101214720. Throughput: 0: 42903.6. Samples: 15101383920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:18,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 16:20:21,227][15401] Updated weights for policy 0, policy_version 921714 (0.0034) [2024-06-25 16:20:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 15101427712. Throughput: 0: 42629.3. Samples: 15101504940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:23,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 16:20:25,964][15401] Updated weights for policy 0, policy_version 921724 (0.0031) [2024-06-25 16:20:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 15101657088. Throughput: 0: 42700.0. Samples: 15101767800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:28,393][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 16:20:28,934][15401] Updated weights for policy 0, policy_version 921734 (0.0042) [2024-06-25 16:20:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15101820928. Throughput: 0: 42761.4. Samples: 15102024740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:33,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 16:20:33,715][15401] Updated weights for policy 0, policy_version 921744 (0.0030) [2024-06-25 16:20:36,754][15401] Updated weights for policy 0, policy_version 921754 (0.0043) [2024-06-25 16:20:38,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15102083072. Throughput: 0: 42571.7. Samples: 15102142280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:38,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 16:20:41,223][15401] Updated weights for policy 0, policy_version 921764 (0.0042) [2024-06-25 16:20:43,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 15102279680. Throughput: 0: 42543.1. Samples: 15102400580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:43,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 16:20:44,349][15401] Updated weights for policy 0, policy_version 921774 (0.0033) [2024-06-25 16:20:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 42543.8). Total num frames: 15102459904. Throughput: 0: 42493.8. Samples: 15102657160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 16:20:48,948][15401] Updated weights for policy 0, policy_version 921784 (0.0044) [2024-06-25 16:20:52,282][15401] Updated weights for policy 0, policy_version 921794 (0.0051) [2024-06-25 16:20:53,390][15132] Fps is (10 sec: 45885.7, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 15102738432. Throughput: 0: 42541.2. Samples: 15102781320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 16:20:56,979][15349] Signal inference workers to stop experience collection... (223500 times) [2024-06-25 16:20:57,028][15401] InferenceWorker_p0-w0: stopping experience collection (223500 times) [2024-06-25 16:20:57,037][15349] Signal inference workers to resume experience collection... (223500 times) [2024-06-25 16:20:57,047][15401] InferenceWorker_p0-w0: resuming experience collection (223500 times) [2024-06-25 16:20:57,054][15401] Updated weights for policy 0, policy_version 921804 (0.0033) [2024-06-25 16:20:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 15102885888. Throughput: 0: 42422.3. Samples: 15103037800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:20:58,390][15132] Avg episode reward: [(0, '0.279')] [2024-06-25 16:21:00,086][15401] Updated weights for policy 0, policy_version 921814 (0.0031) [2024-06-25 16:21:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15103115264. Throughput: 0: 42250.7. Samples: 15103285200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:03,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 16:21:04,714][15401] Updated weights for policy 0, policy_version 921824 (0.0036) [2024-06-25 16:21:07,682][15401] Updated weights for policy 0, policy_version 921834 (0.0044) [2024-06-25 16:21:08,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15103361024. Throughput: 0: 42530.7. Samples: 15103418820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:08,392][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:21:12,747][15401] Updated weights for policy 0, policy_version 921844 (0.0033) [2024-06-25 16:21:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42598.7). Total num frames: 15103524864. Throughput: 0: 42407.9. Samples: 15103676060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:13,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 16:21:15,328][15401] Updated weights for policy 0, policy_version 921854 (0.0034) [2024-06-25 16:21:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15103770624. Throughput: 0: 42216.9. Samples: 15103924500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:18,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:21:20,257][15401] Updated weights for policy 0, policy_version 921864 (0.0034) [2024-06-25 16:21:23,147][15401] Updated weights for policy 0, policy_version 921874 (0.0040) [2024-06-25 16:21:23,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 15103983616. Throughput: 0: 42605.2. Samples: 15104059620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:23,393][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 16:21:27,780][15401] Updated weights for policy 0, policy_version 921884 (0.0033) [2024-06-25 16:21:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 15104180224. Throughput: 0: 42685.4. Samples: 15104321320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 16:21:30,818][15401] Updated weights for policy 0, policy_version 921894 (0.0029) [2024-06-25 16:21:33,389][15132] Fps is (10 sec: 44247.9, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 15104425984. Throughput: 0: 42378.6. Samples: 15104564200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 16:21:35,306][15401] Updated weights for policy 0, policy_version 921904 (0.0025) [2024-06-25 16:21:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15104606208. Throughput: 0: 42681.9. Samples: 15104702000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:38,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 16:21:38,533][15401] Updated weights for policy 0, policy_version 921914 (0.0034) [2024-06-25 16:21:43,237][15401] Updated weights for policy 0, policy_version 921924 (0.0035) [2024-06-25 16:21:43,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42325.3, 300 sec: 42598.1). Total num frames: 15104819200. Throughput: 0: 42576.7. Samples: 15104953860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:43,392][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 16:21:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921925_15104819200.pth... [2024-06-25 16:21:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921303_15094628352.pth [2024-06-25 16:21:46,230][15401] Updated weights for policy 0, policy_version 921934 (0.0038) [2024-06-25 16:21:48,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 15105081344. Throughput: 0: 42548.8. Samples: 15105199900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 16:21:50,854][15401] Updated weights for policy 0, policy_version 921944 (0.0032) [2024-06-25 16:21:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15105245184. Throughput: 0: 42579.5. Samples: 15105334900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 16:21:53,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 16:21:54,073][15401] Updated weights for policy 0, policy_version 921954 (0.0048) [2024-06-25 16:21:58,316][15401] Updated weights for policy 0, policy_version 921964 (0.0031) [2024-06-25 16:21:58,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 15105458176. Throughput: 0: 42535.3. Samples: 15105590140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:21:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 16:22:01,846][15401] Updated weights for policy 0, policy_version 921974 (0.0041) [2024-06-25 16:22:03,392][15132] Fps is (10 sec: 47502.4, 60 sec: 43415.8, 300 sec: 42709.1). Total num frames: 15105720320. Throughput: 0: 42505.6. Samples: 15105837360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:03,392][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 16:22:05,830][15401] Updated weights for policy 0, policy_version 921984 (0.0028) [2024-06-25 16:22:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 15105867776. Throughput: 0: 42553.0. Samples: 15105974400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:08,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 16:22:09,452][15401] Updated weights for policy 0, policy_version 921994 (0.0031) [2024-06-25 16:22:13,259][15401] Updated weights for policy 0, policy_version 922004 (0.0036) [2024-06-25 16:22:13,389][15132] Fps is (10 sec: 40970.3, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 15106129920. Throughput: 0: 42485.4. Samples: 15106233160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:13,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 16:22:17,195][15401] Updated weights for policy 0, policy_version 922014 (0.0037) [2024-06-25 16:22:17,679][15349] Signal inference workers to stop experience collection... (223550 times) [2024-06-25 16:22:17,680][15349] Signal inference workers to resume experience collection... (223550 times) [2024-06-25 16:22:17,695][15401] InferenceWorker_p0-w0: stopping experience collection (223550 times) [2024-06-25 16:22:17,695][15401] InferenceWorker_p0-w0: resuming experience collection (223550 times) [2024-06-25 16:22:18,389][15132] Fps is (10 sec: 49152.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15106359296. Throughput: 0: 42580.0. Samples: 15106480300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:18,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 16:22:20,814][15401] Updated weights for policy 0, policy_version 922024 (0.0038) [2024-06-25 16:22:23,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42054.0, 300 sec: 42487.3). Total num frames: 15106506752. Throughput: 0: 42546.2. Samples: 15106616580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:23,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 16:22:24,826][15401] Updated weights for policy 0, policy_version 922034 (0.0040) [2024-06-25 16:22:28,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15106752512. Throughput: 0: 42550.4. Samples: 15106868520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 16:22:28,415][15401] Updated weights for policy 0, policy_version 922044 (0.0035) [2024-06-25 16:22:32,673][15401] Updated weights for policy 0, policy_version 922054 (0.0029) [2024-06-25 16:22:33,389][15132] Fps is (10 sec: 49152.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15106998272. Throughput: 0: 42854.7. Samples: 15107128360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:33,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 16:22:35,837][15401] Updated weights for policy 0, policy_version 922064 (0.0029) [2024-06-25 16:22:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15107162112. Throughput: 0: 42703.2. Samples: 15107256540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 16:22:40,257][15401] Updated weights for policy 0, policy_version 922074 (0.0029) [2024-06-25 16:22:43,278][15401] Updated weights for policy 0, policy_version 922084 (0.0039) [2024-06-25 16:22:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43419.4, 300 sec: 42709.5). Total num frames: 15107424256. Throughput: 0: 42778.2. Samples: 15107515160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:43,399][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 16:22:47,859][15401] Updated weights for policy 0, policy_version 922094 (0.0040) [2024-06-25 16:22:48,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 15107637248. Throughput: 0: 43008.1. Samples: 15107772720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:48,392][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 16:22:51,192][15401] Updated weights for policy 0, policy_version 922104 (0.0044) [2024-06-25 16:22:53,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15107801088. Throughput: 0: 42816.1. Samples: 15107901120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:53,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 16:22:55,419][15401] Updated weights for policy 0, policy_version 922114 (0.0038) [2024-06-25 16:22:58,389][15132] Fps is (10 sec: 40969.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 15108046848. Throughput: 0: 42728.0. Samples: 15108155920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:22:58,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 16:22:58,861][15401] Updated weights for policy 0, policy_version 922124 (0.0042) [2024-06-25 16:23:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 15108227072. Throughput: 0: 42904.9. Samples: 15108411020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:23:03,509][15401] Updated weights for policy 0, policy_version 922134 (0.0033) [2024-06-25 16:23:06,727][15401] Updated weights for policy 0, policy_version 922144 (0.0040) [2024-06-25 16:23:08,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15108423680. Throughput: 0: 42504.5. Samples: 15108529280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:08,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 16:23:11,219][15401] Updated weights for policy 0, policy_version 922154 (0.0033) [2024-06-25 16:23:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15108685824. Throughput: 0: 42627.0. Samples: 15108786740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 16:23:14,496][15401] Updated weights for policy 0, policy_version 922164 (0.0044) [2024-06-25 16:23:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42598.4). Total num frames: 15108849664. Throughput: 0: 42711.6. Samples: 15109050380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 16:23:19,081][15401] Updated weights for policy 0, policy_version 922174 (0.0045) [2024-06-25 16:23:22,090][15401] Updated weights for policy 0, policy_version 922184 (0.0033) [2024-06-25 16:23:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15109079040. Throughput: 0: 42472.7. Samples: 15109167820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:23,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 16:23:26,554][15401] Updated weights for policy 0, policy_version 922194 (0.0032) [2024-06-25 16:23:27,077][15349] Signal inference workers to stop experience collection... (223600 times) [2024-06-25 16:23:27,077][15349] Signal inference workers to resume experience collection... (223600 times) [2024-06-25 16:23:27,090][15401] InferenceWorker_p0-w0: stopping experience collection (223600 times) [2024-06-25 16:23:27,090][15401] InferenceWorker_p0-w0: resuming experience collection (223600 times) [2024-06-25 16:23:28,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15109341184. Throughput: 0: 42582.6. Samples: 15109431380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 16:23:29,529][15401] Updated weights for policy 0, policy_version 922204 (0.0032) [2024-06-25 16:23:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15109505024. Throughput: 0: 42746.6. Samples: 15109696220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 16:23:34,058][15401] Updated weights for policy 0, policy_version 922214 (0.0020) [2024-06-25 16:23:37,198][15401] Updated weights for policy 0, policy_version 922224 (0.0034) [2024-06-25 16:23:38,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 15109734400. Throughput: 0: 42449.7. Samples: 15109811360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:38,390][15132] Avg episode reward: [(0, '0.257')] [2024-06-25 16:23:41,571][15401] Updated weights for policy 0, policy_version 922234 (0.0037) [2024-06-25 16:23:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 15109980160. Throughput: 0: 42634.1. Samples: 15110074460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:43,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 16:23:43,418][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922240_15109980160.pth... [2024-06-25 16:23:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921615_15099740160.pth [2024-06-25 16:23:45,237][15401] Updated weights for policy 0, policy_version 922244 (0.0039) [2024-06-25 16:23:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 15110160384. Throughput: 0: 42738.6. Samples: 15110334260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:48,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 16:23:49,038][15401] Updated weights for policy 0, policy_version 922254 (0.0028) [2024-06-25 16:23:53,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42596.6, 300 sec: 42542.9). Total num frames: 15110356992. Throughput: 0: 42813.2. Samples: 15110455980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:53,392][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 16:23:53,446][15401] Updated weights for policy 0, policy_version 922264 (0.0025) [2024-06-25 16:23:56,985][15401] Updated weights for policy 0, policy_version 922274 (0.0036) [2024-06-25 16:23:58,396][15132] Fps is (10 sec: 47483.5, 60 sec: 43139.9, 300 sec: 42764.1). Total num frames: 15110635520. Throughput: 0: 42978.8. Samples: 15110721060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:23:58,397][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 16:24:00,801][15401] Updated weights for policy 0, policy_version 922284 (0.0037) [2024-06-25 16:24:03,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15110799360. Throughput: 0: 42956.3. Samples: 15110983420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 16:24:04,398][15401] Updated weights for policy 0, policy_version 922294 (0.0040) [2024-06-25 16:24:08,150][15401] Updated weights for policy 0, policy_version 922304 (0.0040) [2024-06-25 16:24:08,390][15132] Fps is (10 sec: 39346.6, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 15111028736. Throughput: 0: 43085.4. Samples: 15111106660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:08,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 16:24:11,859][15401] Updated weights for policy 0, policy_version 922314 (0.0044) [2024-06-25 16:24:13,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15111258112. Throughput: 0: 43018.2. Samples: 15111367200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:13,396][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 16:24:15,434][15401] Updated weights for policy 0, policy_version 922324 (0.0032) [2024-06-25 16:24:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15111454720. Throughput: 0: 43073.3. Samples: 15111634520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 16:24:19,438][15401] Updated weights for policy 0, policy_version 922334 (0.0036) [2024-06-25 16:24:23,025][15401] Updated weights for policy 0, policy_version 922344 (0.0027) [2024-06-25 16:24:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 15111684096. Throughput: 0: 43157.3. Samples: 15111753440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 16:24:27,020][15401] Updated weights for policy 0, policy_version 922354 (0.0035) [2024-06-25 16:24:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15111913472. Throughput: 0: 43128.5. Samples: 15112015240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:28,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 16:24:30,620][15401] Updated weights for policy 0, policy_version 922364 (0.0037) [2024-06-25 16:24:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15112077312. Throughput: 0: 43184.5. Samples: 15112277560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:33,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 16:24:34,764][15401] Updated weights for policy 0, policy_version 922374 (0.0039) [2024-06-25 16:24:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15112323072. Throughput: 0: 43107.7. Samples: 15112395720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:38,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 16:24:38,883][15401] Updated weights for policy 0, policy_version 922384 (0.0036) [2024-06-25 16:24:42,218][15401] Updated weights for policy 0, policy_version 922394 (0.0032) [2024-06-25 16:24:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15112536064. Throughput: 0: 43053.2. Samples: 15112658180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:43,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 16:24:46,631][15401] Updated weights for policy 0, policy_version 922404 (0.0025) [2024-06-25 16:24:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15112732672. Throughput: 0: 43028.0. Samples: 15112919680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 16:24:50,054][15401] Updated weights for policy 0, policy_version 922414 (0.0031) [2024-06-25 16:24:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43419.3, 300 sec: 42598.4). Total num frames: 15112962048. Throughput: 0: 42966.1. Samples: 15113040140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:53,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 16:24:54,243][15401] Updated weights for policy 0, policy_version 922424 (0.0029) [2024-06-25 16:24:56,726][15349] Signal inference workers to stop experience collection... (223650 times) [2024-06-25 16:24:56,767][15401] InferenceWorker_p0-w0: stopping experience collection (223650 times) [2024-06-25 16:24:56,781][15349] Signal inference workers to resume experience collection... (223650 times) [2024-06-25 16:24:56,782][15401] InferenceWorker_p0-w0: resuming experience collection (223650 times) [2024-06-25 16:24:57,779][15401] Updated weights for policy 0, policy_version 922434 (0.0040) [2024-06-25 16:24:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 15113191424. Throughput: 0: 42950.6. Samples: 15113299980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:24:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 16:25:01,960][15401] Updated weights for policy 0, policy_version 922444 (0.0043) [2024-06-25 16:25:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15113355264. Throughput: 0: 42641.8. Samples: 15113553400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:03,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 16:25:05,391][15401] Updated weights for policy 0, policy_version 922454 (0.0030) [2024-06-25 16:25:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15113617408. Throughput: 0: 42733.4. Samples: 15113676440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 16:25:09,437][15401] Updated weights for policy 0, policy_version 922464 (0.0026) [2024-06-25 16:25:12,907][15401] Updated weights for policy 0, policy_version 922474 (0.0028) [2024-06-25 16:25:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15113814016. Throughput: 0: 42814.2. Samples: 15113941880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:13,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 16:25:17,370][15401] Updated weights for policy 0, policy_version 922484 (0.0037) [2024-06-25 16:25:18,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 15114010624. Throughput: 0: 42566.6. Samples: 15114193160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:18,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 16:25:20,993][15401] Updated weights for policy 0, policy_version 922494 (0.0034) [2024-06-25 16:25:23,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 15114256384. Throughput: 0: 42772.1. Samples: 15114320460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:23,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 16:25:24,787][15401] Updated weights for policy 0, policy_version 922504 (0.0027) [2024-06-25 16:25:28,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 15114452992. Throughput: 0: 42730.1. Samples: 15114581040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:28,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 16:25:28,577][15401] Updated weights for policy 0, policy_version 922514 (0.0033) [2024-06-25 16:25:32,386][15401] Updated weights for policy 0, policy_version 922524 (0.0038) [2024-06-25 16:25:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15114665984. Throughput: 0: 42538.7. Samples: 15114833920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:33,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 16:25:36,188][15401] Updated weights for policy 0, policy_version 922534 (0.0031) [2024-06-25 16:25:38,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15114878976. Throughput: 0: 42724.5. Samples: 15114962740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 16:25:39,986][15401] Updated weights for policy 0, policy_version 922544 (0.0035) [2024-06-25 16:25:43,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15115075584. Throughput: 0: 42624.6. Samples: 15115218080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-25 16:25:43,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 16:25:43,471][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922552_15115091968.pth... [2024-06-25 16:25:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000921925_15104819200.pth [2024-06-25 16:25:43,889][15401] Updated weights for policy 0, policy_version 922554 (0.0037) [2024-06-25 16:25:47,608][15401] Updated weights for policy 0, policy_version 922564 (0.0040) [2024-06-25 16:25:48,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 15115304960. Throughput: 0: 42601.8. Samples: 15115470580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:25:48,392][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 16:25:51,593][15401] Updated weights for policy 0, policy_version 922574 (0.0039) [2024-06-25 16:25:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 15115517952. Throughput: 0: 42776.4. Samples: 15115601380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:25:53,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 16:25:55,419][15401] Updated weights for policy 0, policy_version 922584 (0.0031) [2024-06-25 16:25:58,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15115730944. Throughput: 0: 42519.1. Samples: 15115855240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:25:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 16:25:59,365][15401] Updated weights for policy 0, policy_version 922594 (0.0038) [2024-06-25 16:26:02,955][15401] Updated weights for policy 0, policy_version 922604 (0.0032) [2024-06-25 16:26:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15115943936. Throughput: 0: 42665.8. Samples: 15116113020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 16:26:06,972][15401] Updated weights for policy 0, policy_version 922614 (0.0027) [2024-06-25 16:26:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15116173312. Throughput: 0: 42696.8. Samples: 15116241820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 16:26:11,042][15401] Updated weights for policy 0, policy_version 922624 (0.0044) [2024-06-25 16:26:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15116353536. Throughput: 0: 42481.3. Samples: 15116492700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:13,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:26:13,974][15349] Signal inference workers to stop experience collection... (223700 times) [2024-06-25 16:26:14,011][15401] InferenceWorker_p0-w0: stopping experience collection (223700 times) [2024-06-25 16:26:14,033][15349] Signal inference workers to resume experience collection... (223700 times) [2024-06-25 16:26:14,034][15401] InferenceWorker_p0-w0: resuming experience collection (223700 times) [2024-06-25 16:26:14,542][15401] Updated weights for policy 0, policy_version 922634 (0.0036) [2024-06-25 16:26:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42873.2, 300 sec: 42709.8). Total num frames: 15116582912. Throughput: 0: 42447.1. Samples: 15116744040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:18,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 16:26:18,723][15401] Updated weights for policy 0, policy_version 922644 (0.0029) [2024-06-25 16:26:22,105][15401] Updated weights for policy 0, policy_version 922654 (0.0025) [2024-06-25 16:26:23,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15116828672. Throughput: 0: 42517.8. Samples: 15116876040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:23,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 16:26:26,206][15401] Updated weights for policy 0, policy_version 922664 (0.0036) [2024-06-25 16:26:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15116992512. Throughput: 0: 42672.4. Samples: 15117138340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 16:26:29,691][15401] Updated weights for policy 0, policy_version 922674 (0.0034) [2024-06-25 16:26:33,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15117221888. Throughput: 0: 42742.2. Samples: 15117393980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:33,393][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 16:26:33,730][15401] Updated weights for policy 0, policy_version 922684 (0.0037) [2024-06-25 16:26:37,494][15401] Updated weights for policy 0, policy_version 922694 (0.0037) [2024-06-25 16:26:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 15117467648. Throughput: 0: 42689.2. Samples: 15117522400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:38,390][15132] Avg episode reward: [(0, '0.150')] [2024-06-25 16:26:41,792][15401] Updated weights for policy 0, policy_version 922704 (0.0038) [2024-06-25 16:26:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15117647872. Throughput: 0: 42653.7. Samples: 15117774660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 16:26:45,224][15401] Updated weights for policy 0, policy_version 922714 (0.0038) [2024-06-25 16:26:48,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 15117860864. Throughput: 0: 42664.4. Samples: 15118032920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:48,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 16:26:49,268][15401] Updated weights for policy 0, policy_version 922724 (0.0026) [2024-06-25 16:26:52,878][15401] Updated weights for policy 0, policy_version 922734 (0.0037) [2024-06-25 16:26:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15118090240. Throughput: 0: 42715.6. Samples: 15118164020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:53,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 16:26:56,903][15401] Updated weights for policy 0, policy_version 922744 (0.0034) [2024-06-25 16:26:58,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42598.8). Total num frames: 15118286848. Throughput: 0: 42782.8. Samples: 15118417920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:26:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:27:00,656][15401] Updated weights for policy 0, policy_version 922754 (0.0034) [2024-06-25 16:27:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15118516224. Throughput: 0: 42864.0. Samples: 15118672920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 16:27:04,538][15401] Updated weights for policy 0, policy_version 922764 (0.0041) [2024-06-25 16:27:08,310][15401] Updated weights for policy 0, policy_version 922774 (0.0028) [2024-06-25 16:27:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15118729216. Throughput: 0: 42904.5. Samples: 15118806740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 16:27:12,277][15401] Updated weights for policy 0, policy_version 922784 (0.0033) [2024-06-25 16:27:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15118925824. Throughput: 0: 42730.7. Samples: 15119061220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:13,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 16:27:15,999][15401] Updated weights for policy 0, policy_version 922794 (0.0027) [2024-06-25 16:27:18,392][15132] Fps is (10 sec: 44225.9, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 15119171584. Throughput: 0: 42591.1. Samples: 15119310580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:18,392][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 16:27:20,057][15401] Updated weights for policy 0, policy_version 922804 (0.0035) [2024-06-25 16:27:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15119368192. Throughput: 0: 42693.4. Samples: 15119443600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:23,392][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 16:27:23,760][15401] Updated weights for policy 0, policy_version 922814 (0.0038) [2024-06-25 16:27:27,638][15401] Updated weights for policy 0, policy_version 922824 (0.0024) [2024-06-25 16:27:28,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15119564800. Throughput: 0: 42741.0. Samples: 15119698000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:28,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 16:27:31,454][15401] Updated weights for policy 0, policy_version 922834 (0.0045) [2024-06-25 16:27:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15119777792. Throughput: 0: 42635.6. Samples: 15119951520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 16:27:35,532][15401] Updated weights for policy 0, policy_version 922844 (0.0026) [2024-06-25 16:27:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15120007168. Throughput: 0: 42552.0. Samples: 15120078860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 16:27:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 16:27:39,485][15401] Updated weights for policy 0, policy_version 922854 (0.0035) [2024-06-25 16:27:43,386][15401] Updated weights for policy 0, policy_version 922864 (0.0037) [2024-06-25 16:27:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 15120203776. Throughput: 0: 42562.8. Samples: 15120333260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:27:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 16:27:43,534][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922865_15120220160.pth... [2024-06-25 16:27:43,586][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922240_15109980160.pth [2024-06-25 16:27:47,095][15401] Updated weights for policy 0, policy_version 922874 (0.0033) [2024-06-25 16:27:48,329][15349] Signal inference workers to stop experience collection... (223750 times) [2024-06-25 16:27:48,329][15349] Signal inference workers to resume experience collection... (223750 times) [2024-06-25 16:27:48,376][15401] InferenceWorker_p0-w0: stopping experience collection (223750 times) [2024-06-25 16:27:48,377][15401] InferenceWorker_p0-w0: resuming experience collection (223750 times) [2024-06-25 16:27:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15120416768. Throughput: 0: 42583.6. Samples: 15120589180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:27:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 16:27:50,886][15401] Updated weights for policy 0, policy_version 922884 (0.0029) [2024-06-25 16:27:53,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15120646144. Throughput: 0: 42439.4. Samples: 15120716520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:27:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 16:27:54,752][15401] Updated weights for policy 0, policy_version 922894 (0.0041) [2024-06-25 16:27:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15120842752. Throughput: 0: 42443.2. Samples: 15120971160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:27:58,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 16:27:58,687][15401] Updated weights for policy 0, policy_version 922904 (0.0036) [2024-06-25 16:28:02,387][15401] Updated weights for policy 0, policy_version 922914 (0.0033) [2024-06-25 16:28:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15121055744. Throughput: 0: 42516.6. Samples: 15121223720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:03,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 16:28:06,555][15401] Updated weights for policy 0, policy_version 922924 (0.0038) [2024-06-25 16:28:08,389][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15121285120. Throughput: 0: 42446.3. Samples: 15121353680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:08,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 16:28:10,012][15401] Updated weights for policy 0, policy_version 922934 (0.0035) [2024-06-25 16:28:13,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15121481728. Throughput: 0: 42487.9. Samples: 15121609960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:13,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 16:28:13,999][15401] Updated weights for policy 0, policy_version 922944 (0.0023) [2024-06-25 16:28:17,813][15401] Updated weights for policy 0, policy_version 922954 (0.0037) [2024-06-25 16:28:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42054.0, 300 sec: 42765.0). Total num frames: 15121694720. Throughput: 0: 42389.1. Samples: 15121859020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 16:28:21,810][15401] Updated weights for policy 0, policy_version 922964 (0.0040) [2024-06-25 16:28:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 15121907712. Throughput: 0: 42458.6. Samples: 15121989600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:23,393][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 16:28:25,422][15401] Updated weights for policy 0, policy_version 922974 (0.0040) [2024-06-25 16:28:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15122104320. Throughput: 0: 42498.9. Samples: 15122245700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:28,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:28:29,394][15401] Updated weights for policy 0, policy_version 922984 (0.0029) [2024-06-25 16:28:33,124][15401] Updated weights for policy 0, policy_version 922994 (0.0031) [2024-06-25 16:28:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15122333696. Throughput: 0: 42456.4. Samples: 15122499720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:33,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 16:28:37,289][15401] Updated weights for policy 0, policy_version 923004 (0.0024) [2024-06-25 16:28:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15122546688. Throughput: 0: 42400.4. Samples: 15122624540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:38,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-25 16:28:40,771][15401] Updated weights for policy 0, policy_version 923014 (0.0046) [2024-06-25 16:28:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 15122759680. Throughput: 0: 42471.0. Samples: 15122882360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 16:28:44,812][15401] Updated weights for policy 0, policy_version 923024 (0.0029) [2024-06-25 16:28:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 15122972672. Throughput: 0: 42421.3. Samples: 15123132680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 16:28:48,619][15401] Updated weights for policy 0, policy_version 923034 (0.0036) [2024-06-25 16:28:52,383][15401] Updated weights for policy 0, policy_version 923044 (0.0026) [2024-06-25 16:28:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42543.8). Total num frames: 15123185664. Throughput: 0: 42544.5. Samples: 15123268180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:53,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 16:28:56,286][15401] Updated weights for policy 0, policy_version 923054 (0.0025) [2024-06-25 16:28:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15123382272. Throughput: 0: 42385.8. Samples: 15123517320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:28:58,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 16:29:00,267][15401] Updated weights for policy 0, policy_version 923064 (0.0043) [2024-06-25 16:29:03,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 15123611648. Throughput: 0: 42586.1. Samples: 15123775500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:03,392][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 16:29:04,141][15401] Updated weights for policy 0, policy_version 923074 (0.0034) [2024-06-25 16:29:07,783][15401] Updated weights for policy 0, policy_version 923084 (0.0032) [2024-06-25 16:29:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15123824640. Throughput: 0: 42533.0. Samples: 15123903480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:08,396][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 16:29:11,731][15401] Updated weights for policy 0, policy_version 923094 (0.0032) [2024-06-25 16:29:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15124037632. Throughput: 0: 42524.9. Samples: 15124159320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:13,390][15132] Avg episode reward: [(0, '0.247')] [2024-06-25 16:29:15,799][15401] Updated weights for policy 0, policy_version 923104 (0.0032) [2024-06-25 16:29:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15124250624. Throughput: 0: 42643.1. Samples: 15124418660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:18,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 16:29:19,123][15401] Updated weights for policy 0, policy_version 923114 (0.0040) [2024-06-25 16:29:23,294][15401] Updated weights for policy 0, policy_version 923124 (0.0026) [2024-06-25 16:29:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 15124463616. Throughput: 0: 42714.7. Samples: 15124546700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:23,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 16:29:27,260][15401] Updated weights for policy 0, policy_version 923134 (0.0025) [2024-06-25 16:29:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15124676608. Throughput: 0: 42625.7. Samples: 15124800520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 16:29:31,255][15401] Updated weights for policy 0, policy_version 923144 (0.0038) [2024-06-25 16:29:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15124905984. Throughput: 0: 42620.1. Samples: 15125050580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:29:33,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 16:29:34,777][15401] Updated weights for policy 0, policy_version 923154 (0.0033) [2024-06-25 16:29:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15125086208. Throughput: 0: 42639.1. Samples: 15125186940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:29:38,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 16:29:38,951][15401] Updated weights for policy 0, policy_version 923164 (0.0028) [2024-06-25 16:29:39,642][15349] Signal inference workers to stop experience collection... (223800 times) [2024-06-25 16:29:39,673][15401] InferenceWorker_p0-w0: stopping experience collection (223800 times) [2024-06-25 16:29:39,702][15349] Signal inference workers to resume experience collection... (223800 times) [2024-06-25 16:29:39,702][15401] InferenceWorker_p0-w0: resuming experience collection (223800 times) [2024-06-25 16:29:42,342][15401] Updated weights for policy 0, policy_version 923174 (0.0028) [2024-06-25 16:29:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15125315584. Throughput: 0: 42692.4. Samples: 15125438480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:29:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 16:29:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923176_15125315584.pth... [2024-06-25 16:29:43,475][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922552_15115091968.pth [2024-06-25 16:29:46,678][15401] Updated weights for policy 0, policy_version 923184 (0.0037) [2024-06-25 16:29:48,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15125544960. Throughput: 0: 42550.1. Samples: 15125690160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:29:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 16:29:50,002][15401] Updated weights for policy 0, policy_version 923194 (0.0039) [2024-06-25 16:29:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 15125725184. Throughput: 0: 42670.1. Samples: 15125823640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:29:53,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 16:29:54,158][15401] Updated weights for policy 0, policy_version 923204 (0.0043) [2024-06-25 16:29:57,519][15401] Updated weights for policy 0, policy_version 923214 (0.0032) [2024-06-25 16:29:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15125954560. Throughput: 0: 42615.6. Samples: 15126077020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:29:58,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 16:30:01,969][15401] Updated weights for policy 0, policy_version 923224 (0.0040) [2024-06-25 16:30:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 15126167552. Throughput: 0: 42367.4. Samples: 15126325200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 16:30:05,296][15401] Updated weights for policy 0, policy_version 923234 (0.0038) [2024-06-25 16:30:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15126364160. Throughput: 0: 42294.0. Samples: 15126449920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:08,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 16:30:09,689][15401] Updated weights for policy 0, policy_version 923244 (0.0041) [2024-06-25 16:30:12,993][15401] Updated weights for policy 0, policy_version 923254 (0.0033) [2024-06-25 16:30:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 15126609920. Throughput: 0: 42254.1. Samples: 15126701960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:13,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 16:30:17,622][15401] Updated weights for policy 0, policy_version 923264 (0.0037) [2024-06-25 16:30:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15126790144. Throughput: 0: 42501.6. Samples: 15126963160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 16:30:20,790][15401] Updated weights for policy 0, policy_version 923274 (0.0036) [2024-06-25 16:30:23,389][15132] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15126986752. Throughput: 0: 42134.2. Samples: 15127082980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:23,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 16:30:25,494][15401] Updated weights for policy 0, policy_version 923284 (0.0032) [2024-06-25 16:30:28,286][15401] Updated weights for policy 0, policy_version 923294 (0.0034) [2024-06-25 16:30:28,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15127248896. Throughput: 0: 42199.6. Samples: 15127337460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:28,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 16:30:33,342][15401] Updated weights for policy 0, policy_version 923304 (0.0029) [2024-06-25 16:30:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 15127412736. Throughput: 0: 42407.7. Samples: 15127598500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:33,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 16:30:35,778][15401] Updated weights for policy 0, policy_version 923314 (0.0030) [2024-06-25 16:30:38,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 15127625728. Throughput: 0: 41996.4. Samples: 15127713480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:38,390][15132] Avg episode reward: [(0, '0.890')] [2024-06-25 16:30:41,050][15401] Updated weights for policy 0, policy_version 923324 (0.0024) [2024-06-25 16:30:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 15127887872. Throughput: 0: 42201.3. Samples: 15127976080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 16:30:43,842][15401] Updated weights for policy 0, policy_version 923334 (0.0030) [2024-06-25 16:30:48,392][15132] Fps is (10 sec: 39312.7, 60 sec: 41231.5, 300 sec: 42375.9). Total num frames: 15128018944. Throughput: 0: 42564.5. Samples: 15128240700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:48,392][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 16:30:48,852][15401] Updated weights for policy 0, policy_version 923344 (0.0030) [2024-06-25 16:30:51,855][15401] Updated weights for policy 0, policy_version 923354 (0.0037) [2024-06-25 16:30:53,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15128281088. Throughput: 0: 42305.6. Samples: 15128353680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:53,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 16:30:55,998][15349] Signal inference workers to stop experience collection... (223850 times) [2024-06-25 16:30:56,045][15401] InferenceWorker_p0-w0: stopping experience collection (223850 times) [2024-06-25 16:30:56,118][15349] Signal inference workers to resume experience collection... (223850 times) [2024-06-25 16:30:56,118][15401] InferenceWorker_p0-w0: resuming experience collection (223850 times) [2024-06-25 16:30:56,434][15401] Updated weights for policy 0, policy_version 923364 (0.0024) [2024-06-25 16:30:58,390][15132] Fps is (10 sec: 49163.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15128510464. Throughput: 0: 42410.7. Samples: 15128610440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:30:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 16:30:59,474][15401] Updated weights for policy 0, policy_version 923374 (0.0037) [2024-06-25 16:31:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 15128657920. Throughput: 0: 42513.3. Samples: 15128876260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:03,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 16:31:04,063][15401] Updated weights for policy 0, policy_version 923384 (0.0025) [2024-06-25 16:31:07,277][15401] Updated weights for policy 0, policy_version 923394 (0.0037) [2024-06-25 16:31:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15128936448. Throughput: 0: 42437.6. Samples: 15128992680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:08,390][15132] Avg episode reward: [(0, '0.239')] [2024-06-25 16:31:11,811][15401] Updated weights for policy 0, policy_version 923404 (0.0032) [2024-06-25 16:31:13,389][15132] Fps is (10 sec: 49152.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15129149440. Throughput: 0: 42583.1. Samples: 15129253700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:13,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 16:31:15,163][15401] Updated weights for policy 0, policy_version 923414 (0.0034) [2024-06-25 16:31:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 15129313280. Throughput: 0: 42559.1. Samples: 15129513660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:18,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 16:31:19,417][15401] Updated weights for policy 0, policy_version 923424 (0.0032) [2024-06-25 16:31:22,725][15401] Updated weights for policy 0, policy_version 923434 (0.0040) [2024-06-25 16:31:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15129575424. Throughput: 0: 42661.0. Samples: 15129633220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:23,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 16:31:27,190][15401] Updated weights for policy 0, policy_version 923444 (0.0043) [2024-06-25 16:31:28,389][15132] Fps is (10 sec: 49151.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15129804800. Throughput: 0: 42793.7. Samples: 15129901800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-25 16:31:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 16:31:30,417][15401] Updated weights for policy 0, policy_version 923454 (0.0030) [2024-06-25 16:31:33,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 15129968640. Throughput: 0: 42474.6. Samples: 15130151960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:33,396][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 16:31:34,924][15401] Updated weights for policy 0, policy_version 923464 (0.0041) [2024-06-25 16:31:38,128][15401] Updated weights for policy 0, policy_version 923474 (0.0033) [2024-06-25 16:31:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 15130198016. Throughput: 0: 42594.2. Samples: 15130270520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:38,393][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 16:31:42,506][15401] Updated weights for policy 0, policy_version 923484 (0.0033) [2024-06-25 16:31:43,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15130411008. Throughput: 0: 42717.0. Samples: 15130532700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:43,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 16:31:43,472][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923488_15130427392.pth... [2024-06-25 16:31:43,526][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000922865_15120220160.pth [2024-06-25 16:31:46,232][15401] Updated weights for policy 0, policy_version 923494 (0.0027) [2024-06-25 16:31:48,390][15132] Fps is (10 sec: 40969.4, 60 sec: 43146.1, 300 sec: 42431.8). Total num frames: 15130607616. Throughput: 0: 42619.1. Samples: 15130794120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 16:31:49,951][15401] Updated weights for policy 0, policy_version 923504 (0.0032) [2024-06-25 16:31:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15130836992. Throughput: 0: 42739.5. Samples: 15130915960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:53,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 16:31:53,824][15401] Updated weights for policy 0, policy_version 923514 (0.0032) [2024-06-25 16:31:57,611][15401] Updated weights for policy 0, policy_version 923524 (0.0022) [2024-06-25 16:31:58,390][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15131049984. Throughput: 0: 42810.6. Samples: 15131180180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:31:58,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 16:32:01,363][15401] Updated weights for policy 0, policy_version 923534 (0.0049) [2024-06-25 16:32:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43417.7, 300 sec: 42487.3). Total num frames: 15131262976. Throughput: 0: 42605.8. Samples: 15131430920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:03,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 16:32:05,376][15401] Updated weights for policy 0, policy_version 923544 (0.0032) [2024-06-25 16:32:06,372][15349] Signal inference workers to stop experience collection... (223900 times) [2024-06-25 16:32:06,372][15349] Signal inference workers to resume experience collection... (223900 times) [2024-06-25 16:32:06,389][15401] InferenceWorker_p0-w0: stopping experience collection (223900 times) [2024-06-25 16:32:06,389][15401] InferenceWorker_p0-w0: resuming experience collection (223900 times) [2024-06-25 16:32:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15131475968. Throughput: 0: 42825.7. Samples: 15131560380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 16:32:08,811][15401] Updated weights for policy 0, policy_version 923554 (0.0031) [2024-06-25 16:32:13,110][15401] Updated weights for policy 0, policy_version 923564 (0.0038) [2024-06-25 16:32:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 15131688960. Throughput: 0: 42500.8. Samples: 15131814340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:13,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 16:32:16,895][15401] Updated weights for policy 0, policy_version 923574 (0.0033) [2024-06-25 16:32:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 15131885568. Throughput: 0: 42714.4. Samples: 15132074100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:18,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 16:32:20,873][15401] Updated weights for policy 0, policy_version 923584 (0.0044) [2024-06-25 16:32:23,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15132147712. Throughput: 0: 42892.5. Samples: 15132200580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 16:32:24,509][15401] Updated weights for policy 0, policy_version 923594 (0.0041) [2024-06-25 16:32:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 15132311552. Throughput: 0: 42752.9. Samples: 15132456580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 16:32:28,536][15401] Updated weights for policy 0, policy_version 923604 (0.0035) [2024-06-25 16:32:32,115][15401] Updated weights for policy 0, policy_version 923614 (0.0044) [2024-06-25 16:32:33,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15132540928. Throughput: 0: 42715.6. Samples: 15132716320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 16:32:36,050][15401] Updated weights for policy 0, policy_version 923624 (0.0028) [2024-06-25 16:32:38,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.3, 300 sec: 42654.0). Total num frames: 15132786688. Throughput: 0: 42904.5. Samples: 15132846660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:38,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 16:32:39,668][15401] Updated weights for policy 0, policy_version 923634 (0.0041) [2024-06-25 16:32:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15132966912. Throughput: 0: 42743.1. Samples: 15133103620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 16:32:43,489][15401] Updated weights for policy 0, policy_version 923644 (0.0032) [2024-06-25 16:32:47,662][15401] Updated weights for policy 0, policy_version 923654 (0.0039) [2024-06-25 16:32:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 15133179904. Throughput: 0: 42874.2. Samples: 15133360260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 16:32:50,987][15401] Updated weights for policy 0, policy_version 923664 (0.0029) [2024-06-25 16:32:53,389][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 15133425664. Throughput: 0: 42726.8. Samples: 15133483080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:53,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 16:32:55,295][15401] Updated weights for policy 0, policy_version 923674 (0.0036) [2024-06-25 16:32:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15133605888. Throughput: 0: 42739.3. Samples: 15133737600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:32:58,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 16:32:58,773][15401] Updated weights for policy 0, policy_version 923684 (0.0034) [2024-06-25 16:33:02,910][15401] Updated weights for policy 0, policy_version 923694 (0.0036) [2024-06-25 16:33:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15133818880. Throughput: 0: 42549.8. Samples: 15133988840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:03,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 16:33:06,850][15401] Updated weights for policy 0, policy_version 923704 (0.0041) [2024-06-25 16:33:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15134064640. Throughput: 0: 42715.7. Samples: 15134122780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:08,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 16:33:10,475][15401] Updated weights for policy 0, policy_version 923714 (0.0023) [2024-06-25 16:33:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15134244864. Throughput: 0: 42738.6. Samples: 15134379820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 16:33:14,264][15401] Updated weights for policy 0, policy_version 923724 (0.0027) [2024-06-25 16:33:18,071][15401] Updated weights for policy 0, policy_version 923734 (0.0033) [2024-06-25 16:33:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42598.8). Total num frames: 15134474240. Throughput: 0: 42471.3. Samples: 15134627520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:18,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 16:33:22,065][15401] Updated weights for policy 0, policy_version 923744 (0.0037) [2024-06-25 16:33:23,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 15134654464. Throughput: 0: 42472.1. Samples: 15134757900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 16:33:25,698][15401] Updated weights for policy 0, policy_version 923754 (0.0042) [2024-06-25 16:33:28,392][15132] Fps is (10 sec: 40950.0, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 15134883840. Throughput: 0: 42454.6. Samples: 15135014180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:28,392][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 16:33:29,983][15401] Updated weights for policy 0, policy_version 923764 (0.0031) [2024-06-25 16:33:33,244][15401] Updated weights for policy 0, policy_version 923774 (0.0049) [2024-06-25 16:33:33,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15135113216. Throughput: 0: 42425.9. Samples: 15135269420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 16:33:35,286][15349] Signal inference workers to stop experience collection... (223950 times) [2024-06-25 16:33:35,332][15401] InferenceWorker_p0-w0: stopping experience collection (223950 times) [2024-06-25 16:33:35,340][15349] Signal inference workers to resume experience collection... (223950 times) [2024-06-25 16:33:35,350][15401] InferenceWorker_p0-w0: resuming experience collection (223950 times) [2024-06-25 16:33:37,611][15401] Updated weights for policy 0, policy_version 923784 (0.0038) [2024-06-25 16:33:38,390][15132] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 15135293440. Throughput: 0: 42565.2. Samples: 15135398520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 16:33:40,777][15401] Updated weights for policy 0, policy_version 923794 (0.0037) [2024-06-25 16:33:43,389][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15135539200. Throughput: 0: 42632.3. Samples: 15135656060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 16:33:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923800_15135539200.pth... [2024-06-25 16:33:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923176_15125315584.pth [2024-06-25 16:33:44,986][15401] Updated weights for policy 0, policy_version 923804 (0.0031) [2024-06-25 16:33:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15135752192. Throughput: 0: 42553.3. Samples: 15135903740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:48,396][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:33:48,615][15401] Updated weights for policy 0, policy_version 923814 (0.0044) [2024-06-25 16:33:52,538][15401] Updated weights for policy 0, policy_version 923824 (0.0024) [2024-06-25 16:33:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15135948800. Throughput: 0: 42592.5. Samples: 15136039440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:53,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:33:56,118][15401] Updated weights for policy 0, policy_version 923834 (0.0038) [2024-06-25 16:33:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42543.2). Total num frames: 15136161792. Throughput: 0: 42626.2. Samples: 15136298000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:33:58,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 16:34:00,323][15401] Updated weights for policy 0, policy_version 923844 (0.0027) [2024-06-25 16:34:03,392][15132] Fps is (10 sec: 45865.1, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 15136407552. Throughput: 0: 42721.5. Samples: 15136550080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:03,392][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 16:34:03,730][15401] Updated weights for policy 0, policy_version 923854 (0.0028) [2024-06-25 16:34:07,804][15401] Updated weights for policy 0, policy_version 923864 (0.0023) [2024-06-25 16:34:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15136604160. Throughput: 0: 42880.0. Samples: 15136687500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:08,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 16:34:11,405][15401] Updated weights for policy 0, policy_version 923874 (0.0038) [2024-06-25 16:34:13,389][15132] Fps is (10 sec: 40968.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15136817152. Throughput: 0: 42787.2. Samples: 15136939500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:13,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 16:34:15,361][15401] Updated weights for policy 0, policy_version 923884 (0.0026) [2024-06-25 16:34:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15137046528. Throughput: 0: 42650.2. Samples: 15137188680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:18,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 16:34:19,125][15401] Updated weights for policy 0, policy_version 923894 (0.0029) [2024-06-25 16:34:23,286][15401] Updated weights for policy 0, policy_version 923904 (0.0036) [2024-06-25 16:34:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15137243136. Throughput: 0: 42732.2. Samples: 15137321460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 16:34:26,699][15401] Updated weights for policy 0, policy_version 923914 (0.0036) [2024-06-25 16:34:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 15137439744. Throughput: 0: 42752.4. Samples: 15137579920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 16:34:30,772][15401] Updated weights for policy 0, policy_version 923924 (0.0045) [2024-06-25 16:34:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 15137669120. Throughput: 0: 42943.9. Samples: 15137836220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:33,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-25 16:34:34,694][15401] Updated weights for policy 0, policy_version 923934 (0.0033) [2024-06-25 16:34:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 15137882112. Throughput: 0: 42875.6. Samples: 15137968840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:38,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 16:34:38,434][15401] Updated weights for policy 0, policy_version 923944 (0.0034) [2024-06-25 16:34:40,192][15349] Signal inference workers to stop experience collection... (224000 times) [2024-06-25 16:34:40,192][15349] Signal inference workers to resume experience collection... (224000 times) [2024-06-25 16:34:40,242][15401] InferenceWorker_p0-w0: stopping experience collection (224000 times) [2024-06-25 16:34:40,242][15401] InferenceWorker_p0-w0: resuming experience collection (224000 times) [2024-06-25 16:34:42,614][15401] Updated weights for policy 0, policy_version 923954 (0.0047) [2024-06-25 16:34:43,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15138095104. Throughput: 0: 42844.5. Samples: 15138226000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:43,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 16:34:45,820][15401] Updated weights for policy 0, policy_version 923964 (0.0030) [2024-06-25 16:34:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15138324480. Throughput: 0: 42884.0. Samples: 15138479760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 16:34:50,141][15401] Updated weights for policy 0, policy_version 923974 (0.0037) [2024-06-25 16:34:53,392][15132] Fps is (10 sec: 44226.6, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15138537472. Throughput: 0: 42660.4. Samples: 15138607320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:53,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 16:34:53,834][15401] Updated weights for policy 0, policy_version 923984 (0.0036) [2024-06-25 16:34:57,688][15401] Updated weights for policy 0, policy_version 923994 (0.0038) [2024-06-25 16:34:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15138734080. Throughput: 0: 42960.0. Samples: 15138872700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:34:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 16:35:01,478][15401] Updated weights for policy 0, policy_version 924004 (0.0034) [2024-06-25 16:35:03,390][15132] Fps is (10 sec: 40969.0, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 15138947072. Throughput: 0: 42995.3. Samples: 15139123480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:35:03,391][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 16:35:05,163][15401] Updated weights for policy 0, policy_version 924014 (0.0035) [2024-06-25 16:35:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15139176448. Throughput: 0: 42872.0. Samples: 15139250700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:35:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 16:35:09,208][15401] Updated weights for policy 0, policy_version 924024 (0.0036) [2024-06-25 16:35:12,768][15401] Updated weights for policy 0, policy_version 924034 (0.0048) [2024-06-25 16:35:13,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15139389440. Throughput: 0: 42849.0. Samples: 15139508120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:35:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 16:35:16,958][15401] Updated weights for policy 0, policy_version 924044 (0.0031) [2024-06-25 16:35:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 15139569664. Throughput: 0: 42827.8. Samples: 15139763460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:35:18,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 16:35:20,442][15401] Updated weights for policy 0, policy_version 924054 (0.0038) [2024-06-25 16:35:23,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15139815424. Throughput: 0: 42622.8. Samples: 15139886880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 16:35:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 16:35:24,654][15401] Updated weights for policy 0, policy_version 924064 (0.0030) [2024-06-25 16:35:28,204][15401] Updated weights for policy 0, policy_version 924074 (0.0040) [2024-06-25 16:35:28,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15140028416. Throughput: 0: 42753.9. Samples: 15140149920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 16:35:32,413][15401] Updated weights for policy 0, policy_version 924084 (0.0044) [2024-06-25 16:35:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15140208640. Throughput: 0: 42653.2. Samples: 15140399160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 16:35:36,228][15401] Updated weights for policy 0, policy_version 924094 (0.0042) [2024-06-25 16:35:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15140454400. Throughput: 0: 42644.1. Samples: 15140526200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:38,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 16:35:40,161][15401] Updated weights for policy 0, policy_version 924104 (0.0041) [2024-06-25 16:35:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 15140651008. Throughput: 0: 42465.3. Samples: 15140783640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:43,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 16:35:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924112_15140651008.pth... [2024-06-25 16:35:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923488_15130427392.pth [2024-06-25 16:35:44,200][15401] Updated weights for policy 0, policy_version 924114 (0.0023) [2024-06-25 16:35:47,691][15401] Updated weights for policy 0, policy_version 924124 (0.0043) [2024-06-25 16:35:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42654.0). Total num frames: 15140864000. Throughput: 0: 42449.6. Samples: 15141033700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:48,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 16:35:51,874][15401] Updated weights for policy 0, policy_version 924134 (0.0024) [2024-06-25 16:35:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 15141109760. Throughput: 0: 42624.3. Samples: 15141168800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 16:35:55,536][15401] Updated weights for policy 0, policy_version 924144 (0.0049) [2024-06-25 16:35:58,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15141289984. Throughput: 0: 42462.6. Samples: 15141418940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:35:58,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 16:35:59,543][15401] Updated weights for policy 0, policy_version 924154 (0.0053) [2024-06-25 16:36:03,126][15401] Updated weights for policy 0, policy_version 924164 (0.0037) [2024-06-25 16:36:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15141519360. Throughput: 0: 42491.4. Samples: 15141675580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:03,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 16:36:07,154][15401] Updated weights for policy 0, policy_version 924174 (0.0054) [2024-06-25 16:36:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15141732352. Throughput: 0: 42704.7. Samples: 15141808580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 16:36:10,763][15401] Updated weights for policy 0, policy_version 924184 (0.0033) [2024-06-25 16:36:12,238][15349] Signal inference workers to stop experience collection... (224050 times) [2024-06-25 16:36:12,238][15349] Signal inference workers to resume experience collection... (224050 times) [2024-06-25 16:36:12,273][15401] InferenceWorker_p0-w0: stopping experience collection (224050 times) [2024-06-25 16:36:12,274][15401] InferenceWorker_p0-w0: resuming experience collection (224050 times) [2024-06-25 16:36:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15141928960. Throughput: 0: 42466.7. Samples: 15142060920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 16:36:14,608][15401] Updated weights for policy 0, policy_version 924194 (0.0037) [2024-06-25 16:36:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15142141952. Throughput: 0: 42748.9. Samples: 15142322860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 16:36:18,536][15401] Updated weights for policy 0, policy_version 924204 (0.0032) [2024-06-25 16:36:22,393][15401] Updated weights for policy 0, policy_version 924214 (0.0026) [2024-06-25 16:36:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15142387712. Throughput: 0: 42794.3. Samples: 15142451940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 16:36:26,186][15401] Updated weights for policy 0, policy_version 924224 (0.0043) [2024-06-25 16:36:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15142567936. Throughput: 0: 42595.0. Samples: 15142700420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:28,390][15132] Avg episode reward: [(0, '0.839')] [2024-06-25 16:36:30,135][15401] Updated weights for policy 0, policy_version 924234 (0.0030) [2024-06-25 16:36:33,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15142780928. Throughput: 0: 42716.8. Samples: 15142955960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 16:36:33,777][15401] Updated weights for policy 0, policy_version 924244 (0.0032) [2024-06-25 16:36:37,732][15401] Updated weights for policy 0, policy_version 924254 (0.0036) [2024-06-25 16:36:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15143010304. Throughput: 0: 42677.8. Samples: 15143089300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:38,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 16:36:41,334][15401] Updated weights for policy 0, policy_version 924264 (0.0035) [2024-06-25 16:36:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15143206912. Throughput: 0: 42700.9. Samples: 15143340480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 16:36:45,153][15401] Updated weights for policy 0, policy_version 924274 (0.0031) [2024-06-25 16:36:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15143403520. Throughput: 0: 42626.3. Samples: 15143593760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:48,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 16:36:49,281][15401] Updated weights for policy 0, policy_version 924284 (0.0034) [2024-06-25 16:36:52,977][15401] Updated weights for policy 0, policy_version 924294 (0.0027) [2024-06-25 16:36:53,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 15143649280. Throughput: 0: 42475.0. Samples: 15143720060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:53,393][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 16:36:56,976][15401] Updated weights for policy 0, policy_version 924304 (0.0042) [2024-06-25 16:36:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15143829504. Throughput: 0: 42421.7. Samples: 15143969900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:36:58,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 16:37:00,680][15401] Updated weights for policy 0, policy_version 924314 (0.0032) [2024-06-25 16:37:03,392][15132] Fps is (10 sec: 40960.0, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 15144058880. Throughput: 0: 42203.5. Samples: 15144222120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:37:03,393][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 16:37:04,738][15401] Updated weights for policy 0, policy_version 924324 (0.0037) [2024-06-25 16:37:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15144271872. Throughput: 0: 42234.7. Samples: 15144352500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:37:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 16:37:08,437][15401] Updated weights for policy 0, policy_version 924334 (0.0033) [2024-06-25 16:37:12,721][15401] Updated weights for policy 0, policy_version 924344 (0.0039) [2024-06-25 16:37:13,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15144468480. Throughput: 0: 42343.1. Samples: 15144605860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:37:13,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 16:37:16,390][15401] Updated weights for policy 0, policy_version 924354 (0.0041) [2024-06-25 16:37:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15144697856. Throughput: 0: 42138.2. Samples: 15144852180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 16:37:18,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 16:37:20,765][15401] Updated weights for policy 0, policy_version 924364 (0.0029) [2024-06-25 16:37:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15144894464. Throughput: 0: 42082.3. Samples: 15144983000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 16:37:23,912][15349] Signal inference workers to stop experience collection... (224100 times) [2024-06-25 16:37:23,956][15401] InferenceWorker_p0-w0: stopping experience collection (224100 times) [2024-06-25 16:37:24,027][15349] Signal inference workers to resume experience collection... (224100 times) [2024-06-25 16:37:24,027][15401] InferenceWorker_p0-w0: resuming experience collection (224100 times) [2024-06-25 16:37:24,029][15401] Updated weights for policy 0, policy_version 924374 (0.0032) [2024-06-25 16:37:28,195][15401] Updated weights for policy 0, policy_version 924384 (0.0025) [2024-06-25 16:37:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15145107456. Throughput: 0: 42187.1. Samples: 15145238900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 16:37:31,703][15401] Updated weights for policy 0, policy_version 924394 (0.0035) [2024-06-25 16:37:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15145353216. Throughput: 0: 42083.0. Samples: 15145487500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 16:37:35,979][15401] Updated weights for policy 0, policy_version 924404 (0.0025) [2024-06-25 16:37:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15145533440. Throughput: 0: 42220.0. Samples: 15145619860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:38,391][15132] Avg episode reward: [(0, '0.912')] [2024-06-25 16:37:39,326][15401] Updated weights for policy 0, policy_version 924414 (0.0033) [2024-06-25 16:37:43,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 15145730048. Throughput: 0: 42330.2. Samples: 15145874760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:43,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 16:37:43,532][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924423_15145746432.pth... [2024-06-25 16:37:43,589][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000923800_15135539200.pth [2024-06-25 16:37:43,739][15401] Updated weights for policy 0, policy_version 924424 (0.0033) [2024-06-25 16:37:47,069][15401] Updated weights for policy 0, policy_version 924434 (0.0036) [2024-06-25 16:37:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15145992192. Throughput: 0: 42236.3. Samples: 15146122660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 16:37:51,219][15401] Updated weights for policy 0, policy_version 924444 (0.0027) [2024-06-25 16:37:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 41780.9, 300 sec: 42542.8). Total num frames: 15146156032. Throughput: 0: 42356.4. Samples: 15146258540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 16:37:54,661][15401] Updated weights for policy 0, policy_version 924454 (0.0035) [2024-06-25 16:37:58,389][15132] Fps is (10 sec: 37684.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15146369024. Throughput: 0: 42439.7. Samples: 15146515640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:37:58,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 16:37:58,798][15401] Updated weights for policy 0, policy_version 924464 (0.0043) [2024-06-25 16:38:02,284][15401] Updated weights for policy 0, policy_version 924474 (0.0035) [2024-06-25 16:38:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 15146631168. Throughput: 0: 42611.9. Samples: 15146769720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 16:38:06,456][15401] Updated weights for policy 0, policy_version 924484 (0.0031) [2024-06-25 16:38:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15146795008. Throughput: 0: 42590.2. Samples: 15146899560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 16:38:10,070][15401] Updated weights for policy 0, policy_version 924494 (0.0030) [2024-06-25 16:38:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15147024384. Throughput: 0: 42569.9. Samples: 15147154540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 16:38:13,960][15401] Updated weights for policy 0, policy_version 924504 (0.0042) [2024-06-25 16:38:17,933][15401] Updated weights for policy 0, policy_version 924514 (0.0035) [2024-06-25 16:38:18,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15147270144. Throughput: 0: 42613.0. Samples: 15147405080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 16:38:21,750][15401] Updated weights for policy 0, policy_version 924524 (0.0029) [2024-06-25 16:38:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 15147433984. Throughput: 0: 42687.5. Samples: 15147540800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 16:38:25,501][15401] Updated weights for policy 0, policy_version 924534 (0.0042) [2024-06-25 16:38:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15147663360. Throughput: 0: 42589.9. Samples: 15147791300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 16:38:29,210][15401] Updated weights for policy 0, policy_version 924544 (0.0032) [2024-06-25 16:38:32,835][15349] Signal inference workers to stop experience collection... (224150 times) [2024-06-25 16:38:32,835][15349] Signal inference workers to resume experience collection... (224150 times) [2024-06-25 16:38:32,894][15401] InferenceWorker_p0-w0: stopping experience collection (224150 times) [2024-06-25 16:38:32,895][15401] InferenceWorker_p0-w0: resuming experience collection (224150 times) [2024-06-25 16:38:33,306][15401] Updated weights for policy 0, policy_version 924554 (0.0038) [2024-06-25 16:38:33,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 15147892736. Throughput: 0: 42878.8. Samples: 15148052300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:33,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 16:38:37,197][15401] Updated weights for policy 0, policy_version 924564 (0.0035) [2024-06-25 16:38:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15148089344. Throughput: 0: 42752.5. Samples: 15148182400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:38,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 16:38:40,798][15401] Updated weights for policy 0, policy_version 924574 (0.0035) [2024-06-25 16:38:43,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15148318720. Throughput: 0: 42669.2. Samples: 15148435760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:43,390][15132] Avg episode reward: [(0, '0.109')] [2024-06-25 16:38:44,677][15401] Updated weights for policy 0, policy_version 924584 (0.0044) [2024-06-25 16:38:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15148515328. Throughput: 0: 42950.3. Samples: 15148702480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:48,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 16:38:48,792][15401] Updated weights for policy 0, policy_version 924594 (0.0025) [2024-06-25 16:38:52,129][15401] Updated weights for policy 0, policy_version 924604 (0.0037) [2024-06-25 16:38:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15148744704. Throughput: 0: 42725.3. Samples: 15148822200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:53,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 16:38:56,484][15401] Updated weights for policy 0, policy_version 924614 (0.0040) [2024-06-25 16:38:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42543.2). Total num frames: 15148957696. Throughput: 0: 42595.1. Samples: 15149071320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:38:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 16:39:00,147][15401] Updated weights for policy 0, policy_version 924624 (0.0030) [2024-06-25 16:39:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15149154304. Throughput: 0: 42911.5. Samples: 15149336100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:39:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 16:39:04,211][15401] Updated weights for policy 0, policy_version 924634 (0.0042) [2024-06-25 16:39:07,746][15401] Updated weights for policy 0, policy_version 924644 (0.0039) [2024-06-25 16:39:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15149383680. Throughput: 0: 42660.2. Samples: 15149460500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:39:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 16:39:11,779][15401] Updated weights for policy 0, policy_version 924654 (0.0033) [2024-06-25 16:39:13,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15149613056. Throughput: 0: 42785.9. Samples: 15149716660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 16:39:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 16:39:15,310][15401] Updated weights for policy 0, policy_version 924664 (0.0037) [2024-06-25 16:39:18,395][15132] Fps is (10 sec: 40939.0, 60 sec: 42048.7, 300 sec: 42542.1). Total num frames: 15149793280. Throughput: 0: 42753.9. Samples: 15149976340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:18,395][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 16:39:19,390][15401] Updated weights for policy 0, policy_version 924674 (0.0028) [2024-06-25 16:39:23,275][15401] Updated weights for policy 0, policy_version 924684 (0.0039) [2024-06-25 16:39:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15150022656. Throughput: 0: 42438.5. Samples: 15150092140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 16:39:27,211][15401] Updated weights for policy 0, policy_version 924694 (0.0034) [2024-06-25 16:39:28,389][15132] Fps is (10 sec: 45898.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 15150252032. Throughput: 0: 42604.1. Samples: 15150352940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:28,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 16:39:30,897][15401] Updated weights for policy 0, policy_version 924704 (0.0034) [2024-06-25 16:39:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42542.8). Total num frames: 15150432256. Throughput: 0: 42411.5. Samples: 15150611000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:33,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 16:39:34,835][15401] Updated weights for policy 0, policy_version 924714 (0.0037) [2024-06-25 16:39:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15150661632. Throughput: 0: 42442.4. Samples: 15150732100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 16:39:38,602][15401] Updated weights for policy 0, policy_version 924724 (0.0029) [2024-06-25 16:39:42,145][15401] Updated weights for policy 0, policy_version 924734 (0.0027) [2024-06-25 16:39:43,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15150891008. Throughput: 0: 42767.0. Samples: 15150995840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 16:39:43,475][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924738_15150907392.pth... [2024-06-25 16:39:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924112_15140651008.pth [2024-06-25 16:39:46,167][15401] Updated weights for policy 0, policy_version 924744 (0.0032) [2024-06-25 16:39:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 15151071232. Throughput: 0: 42771.2. Samples: 15151260800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:48,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 16:39:49,312][15349] Signal inference workers to stop experience collection... (224200 times) [2024-06-25 16:39:49,312][15349] Signal inference workers to resume experience collection... (224200 times) [2024-06-25 16:39:49,352][15401] InferenceWorker_p0-w0: stopping experience collection (224200 times) [2024-06-25 16:39:49,352][15401] InferenceWorker_p0-w0: resuming experience collection (224200 times) [2024-06-25 16:39:49,655][15401] Updated weights for policy 0, policy_version 924754 (0.0028) [2024-06-25 16:39:53,393][15132] Fps is (10 sec: 42582.9, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 15151316992. Throughput: 0: 42782.7. Samples: 15151385880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:53,394][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 16:39:53,586][15401] Updated weights for policy 0, policy_version 924764 (0.0035) [2024-06-25 16:39:57,149][15401] Updated weights for policy 0, policy_version 924774 (0.0032) [2024-06-25 16:39:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15151529984. Throughput: 0: 42811.0. Samples: 15151643160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:39:58,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 16:40:01,110][15401] Updated weights for policy 0, policy_version 924784 (0.0037) [2024-06-25 16:40:03,390][15132] Fps is (10 sec: 40975.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 15151726592. Throughput: 0: 43034.2. Samples: 15151912660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:40:04,622][15401] Updated weights for policy 0, policy_version 924794 (0.0037) [2024-06-25 16:40:08,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 15151955968. Throughput: 0: 43082.7. Samples: 15152030960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:08,393][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 16:40:08,841][15401] Updated weights for policy 0, policy_version 924804 (0.0035) [2024-06-25 16:40:12,494][15401] Updated weights for policy 0, policy_version 924814 (0.0029) [2024-06-25 16:40:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15152168960. Throughput: 0: 42989.8. Samples: 15152287480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 16:40:16,605][15401] Updated weights for policy 0, policy_version 924824 (0.0033) [2024-06-25 16:40:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42875.1, 300 sec: 42542.9). Total num frames: 15152365568. Throughput: 0: 42935.2. Samples: 15152543080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 16:40:20,459][15401] Updated weights for policy 0, policy_version 924834 (0.0029) [2024-06-25 16:40:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15152578560. Throughput: 0: 42970.2. Samples: 15152665760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:23,392][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 16:40:24,450][15401] Updated weights for policy 0, policy_version 924844 (0.0030) [2024-06-25 16:40:28,123][15401] Updated weights for policy 0, policy_version 924854 (0.0037) [2024-06-25 16:40:28,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15152824320. Throughput: 0: 42936.6. Samples: 15152927980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 16:40:31,950][15401] Updated weights for policy 0, policy_version 924864 (0.0030) [2024-06-25 16:40:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15153004544. Throughput: 0: 42667.5. Samples: 15153180840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 16:40:35,810][15401] Updated weights for policy 0, policy_version 924874 (0.0036) [2024-06-25 16:40:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15153233920. Throughput: 0: 42690.2. Samples: 15153306780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:38,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 16:40:39,966][15401] Updated weights for policy 0, policy_version 924884 (0.0033) [2024-06-25 16:40:43,380][15401] Updated weights for policy 0, policy_version 924894 (0.0031) [2024-06-25 16:40:43,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 15153463296. Throughput: 0: 42868.4. Samples: 15153572340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:43,392][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 16:40:47,586][15401] Updated weights for policy 0, policy_version 924904 (0.0031) [2024-06-25 16:40:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15153643520. Throughput: 0: 42447.7. Samples: 15153822800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 16:40:51,026][15401] Updated weights for policy 0, policy_version 924914 (0.0027) [2024-06-25 16:40:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42601.1, 300 sec: 42654.0). Total num frames: 15153872896. Throughput: 0: 42569.5. Samples: 15153946480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 16:40:55,595][15401] Updated weights for policy 0, policy_version 924924 (0.0028) [2024-06-25 16:40:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15154085888. Throughput: 0: 42863.1. Samples: 15154216320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:40:58,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 16:40:58,635][15401] Updated weights for policy 0, policy_version 924934 (0.0027) [2024-06-25 16:41:03,165][15401] Updated weights for policy 0, policy_version 924944 (0.0036) [2024-06-25 16:41:03,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15154298880. Throughput: 0: 42765.9. Samples: 15154467540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:41:03,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 16:41:06,376][15401] Updated weights for policy 0, policy_version 924954 (0.0040) [2024-06-25 16:41:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 15154511872. Throughput: 0: 42750.3. Samples: 15154589520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 16:41:08,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 16:41:10,711][15401] Updated weights for policy 0, policy_version 924964 (0.0031) [2024-06-25 16:41:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15154724864. Throughput: 0: 42732.9. Samples: 15154850960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:13,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 16:41:14,347][15401] Updated weights for policy 0, policy_version 924974 (0.0029) [2024-06-25 16:41:18,255][15401] Updated weights for policy 0, policy_version 924984 (0.0040) [2024-06-25 16:41:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15154937856. Throughput: 0: 42577.0. Samples: 15155096800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:18,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 16:41:22,385][15349] Signal inference workers to stop experience collection... (224250 times) [2024-06-25 16:41:22,431][15401] InferenceWorker_p0-w0: stopping experience collection (224250 times) [2024-06-25 16:41:22,438][15349] Signal inference workers to resume experience collection... (224250 times) [2024-06-25 16:41:22,446][15401] InferenceWorker_p0-w0: resuming experience collection (224250 times) [2024-06-25 16:41:22,453][15401] Updated weights for policy 0, policy_version 924994 (0.0047) [2024-06-25 16:41:23,390][15132] Fps is (10 sec: 44235.2, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 15155167232. Throughput: 0: 42678.4. Samples: 15155227320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:23,391][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 16:41:25,815][15401] Updated weights for policy 0, policy_version 925004 (0.0039) [2024-06-25 16:41:28,389][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 15155331072. Throughput: 0: 42404.0. Samples: 15155480420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 16:41:30,020][15401] Updated weights for policy 0, policy_version 925014 (0.0044) [2024-06-25 16:41:33,321][15401] Updated weights for policy 0, policy_version 925024 (0.0040) [2024-06-25 16:41:33,390][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15155593216. Throughput: 0: 42414.9. Samples: 15155731480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:33,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 16:41:37,598][15401] Updated weights for policy 0, policy_version 925034 (0.0029) [2024-06-25 16:41:38,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15155806208. Throughput: 0: 42708.8. Samples: 15155868380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 16:41:41,394][15401] Updated weights for policy 0, policy_version 925044 (0.0039) [2024-06-25 16:41:43,390][15132] Fps is (10 sec: 37683.6, 60 sec: 41780.8, 300 sec: 42598.4). Total num frames: 15155970048. Throughput: 0: 42332.3. Samples: 15156121280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 16:41:43,420][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925047_15155970048.pth... [2024-06-25 16:41:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924423_15145746432.pth [2024-06-25 16:41:45,172][15401] Updated weights for policy 0, policy_version 925054 (0.0026) [2024-06-25 16:41:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 15156232192. Throughput: 0: 42312.0. Samples: 15156371580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:41:48,941][15401] Updated weights for policy 0, policy_version 925064 (0.0037) [2024-06-25 16:41:53,309][15401] Updated weights for policy 0, policy_version 925074 (0.0043) [2024-06-25 16:41:53,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42320.7, 300 sec: 42653.0). Total num frames: 15156412416. Throughput: 0: 42614.3. Samples: 15156507440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:53,397][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:41:56,775][15401] Updated weights for policy 0, policy_version 925084 (0.0040) [2024-06-25 16:41:58,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 15156609024. Throughput: 0: 42312.7. Samples: 15156755040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:41:58,390][15132] Avg episode reward: [(0, '0.342')] [2024-06-25 16:42:00,991][15401] Updated weights for policy 0, policy_version 925094 (0.0029) [2024-06-25 16:42:03,389][15132] Fps is (10 sec: 44265.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15156854784. Throughput: 0: 42303.5. Samples: 15157000460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:03,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 16:42:04,687][15401] Updated weights for policy 0, policy_version 925104 (0.0029) [2024-06-25 16:42:08,396][15132] Fps is (10 sec: 42571.5, 60 sec: 42047.8, 300 sec: 42597.5). Total num frames: 15157035008. Throughput: 0: 42376.9. Samples: 15157134540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:08,397][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 16:42:08,725][15401] Updated weights for policy 0, policy_version 925114 (0.0038) [2024-06-25 16:42:12,398][15401] Updated weights for policy 0, policy_version 925124 (0.0049) [2024-06-25 16:42:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15157248000. Throughput: 0: 42396.4. Samples: 15157388260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 16:42:16,294][15401] Updated weights for policy 0, policy_version 925134 (0.0023) [2024-06-25 16:42:18,389][15132] Fps is (10 sec: 45904.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15157493760. Throughput: 0: 42321.1. Samples: 15157635920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:18,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 16:42:20,282][15401] Updated weights for policy 0, policy_version 925144 (0.0029) [2024-06-25 16:42:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.4, 300 sec: 42598.4). Total num frames: 15157673984. Throughput: 0: 42280.0. Samples: 15157770980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:23,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 16:42:23,970][15401] Updated weights for policy 0, policy_version 925154 (0.0038) [2024-06-25 16:42:27,977][15401] Updated weights for policy 0, policy_version 925164 (0.0035) [2024-06-25 16:42:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15157903360. Throughput: 0: 42251.2. Samples: 15158022580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:28,390][15132] Avg episode reward: [(0, '0.220')] [2024-06-25 16:42:31,891][15401] Updated weights for policy 0, policy_version 925174 (0.0040) [2024-06-25 16:42:33,107][15349] Signal inference workers to stop experience collection... (224300 times) [2024-06-25 16:42:33,109][15349] Signal inference workers to resume experience collection... (224300 times) [2024-06-25 16:42:33,152][15401] InferenceWorker_p0-w0: stopping experience collection (224300 times) [2024-06-25 16:42:33,153][15401] InferenceWorker_p0-w0: resuming experience collection (224300 times) [2024-06-25 16:42:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15158132736. Throughput: 0: 42387.1. Samples: 15158279000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:33,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 16:42:35,409][15401] Updated weights for policy 0, policy_version 925184 (0.0032) [2024-06-25 16:42:38,390][15132] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 15158312960. Throughput: 0: 42221.5. Samples: 15158407140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 16:42:39,743][15401] Updated weights for policy 0, policy_version 925194 (0.0024) [2024-06-25 16:42:43,141][15401] Updated weights for policy 0, policy_version 925204 (0.0034) [2024-06-25 16:42:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15158542336. Throughput: 0: 42271.6. Samples: 15158657260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 16:42:47,389][15401] Updated weights for policy 0, policy_version 925214 (0.0034) [2024-06-25 16:42:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15158755328. Throughput: 0: 42470.2. Samples: 15158911620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:48,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:42:51,413][15401] Updated weights for policy 0, policy_version 925224 (0.0026) [2024-06-25 16:42:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42329.8, 300 sec: 42653.9). Total num frames: 15158951936. Throughput: 0: 42355.3. Samples: 15159040260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:53,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 16:42:55,196][15401] Updated weights for policy 0, policy_version 925234 (0.0031) [2024-06-25 16:42:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15159181312. Throughput: 0: 42204.9. Samples: 15159287480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:42:58,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 16:42:58,959][15401] Updated weights for policy 0, policy_version 925244 (0.0037) [2024-06-25 16:43:03,082][15401] Updated weights for policy 0, policy_version 925254 (0.0035) [2024-06-25 16:43:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15159377920. Throughput: 0: 42383.5. Samples: 15159543180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-25 16:43:03,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 16:43:06,501][15401] Updated weights for policy 0, policy_version 925264 (0.0035) [2024-06-25 16:43:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 15159590912. Throughput: 0: 42040.0. Samples: 15159662780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:08,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 16:43:10,872][15401] Updated weights for policy 0, policy_version 925274 (0.0038) [2024-06-25 16:43:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15159820288. Throughput: 0: 42097.3. Samples: 15159916960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:13,399][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 16:43:14,697][15401] Updated weights for policy 0, policy_version 925284 (0.0053) [2024-06-25 16:43:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15160000512. Throughput: 0: 42204.1. Samples: 15160178180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 16:43:18,454][15401] Updated weights for policy 0, policy_version 925294 (0.0033) [2024-06-25 16:43:22,301][15401] Updated weights for policy 0, policy_version 925304 (0.0037) [2024-06-25 16:43:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15160229888. Throughput: 0: 42011.1. Samples: 15160297640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:23,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 16:43:26,459][15401] Updated weights for policy 0, policy_version 925314 (0.0042) [2024-06-25 16:43:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 15160442880. Throughput: 0: 42325.4. Samples: 15160561900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:28,392][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 16:43:29,867][15401] Updated weights for policy 0, policy_version 925324 (0.0037) [2024-06-25 16:43:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15160655872. Throughput: 0: 42378.3. Samples: 15160818640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 16:43:34,234][15401] Updated weights for policy 0, policy_version 925334 (0.0030) [2024-06-25 16:43:37,483][15401] Updated weights for policy 0, policy_version 925344 (0.0030) [2024-06-25 16:43:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 15160885248. Throughput: 0: 42281.8. Samples: 15160943040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:38,393][15132] Avg episode reward: [(0, '0.256')] [2024-06-25 16:43:42,004][15401] Updated weights for policy 0, policy_version 925354 (0.0042) [2024-06-25 16:43:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15161098240. Throughput: 0: 42618.2. Samples: 15161205300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:43,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 16:43:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925360_15161098240.pth... [2024-06-25 16:43:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000924738_15150907392.pth [2024-06-25 16:43:45,422][15401] Updated weights for policy 0, policy_version 925364 (0.0029) [2024-06-25 16:43:48,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15161278464. Throughput: 0: 42463.2. Samples: 15161454020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 16:43:49,469][15401] Updated weights for policy 0, policy_version 925374 (0.0034) [2024-06-25 16:43:53,016][15401] Updated weights for policy 0, policy_version 925384 (0.0032) [2024-06-25 16:43:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15161507840. Throughput: 0: 42615.9. Samples: 15161580500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 16:43:57,062][15401] Updated weights for policy 0, policy_version 925394 (0.0028) [2024-06-25 16:43:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15161704448. Throughput: 0: 42777.7. Samples: 15161841960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:43:58,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 16:44:00,702][15401] Updated weights for policy 0, policy_version 925404 (0.0036) [2024-06-25 16:44:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15161933824. Throughput: 0: 42655.9. Samples: 15162097700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:03,391][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 16:44:04,505][15401] Updated weights for policy 0, policy_version 925414 (0.0035) [2024-06-25 16:44:08,365][15401] Updated weights for policy 0, policy_version 925424 (0.0037) [2024-06-25 16:44:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15162146816. Throughput: 0: 42932.0. Samples: 15162229580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:08,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 16:44:12,248][15401] Updated weights for policy 0, policy_version 925434 (0.0042) [2024-06-25 16:44:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42543.6). Total num frames: 15162343424. Throughput: 0: 42799.5. Samples: 15162487880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:13,393][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 16:44:13,824][15349] Signal inference workers to stop experience collection... (224350 times) [2024-06-25 16:44:13,880][15401] InferenceWorker_p0-w0: stopping experience collection (224350 times) [2024-06-25 16:44:13,944][15349] Signal inference workers to resume experience collection... (224350 times) [2024-06-25 16:44:13,944][15401] InferenceWorker_p0-w0: resuming experience collection (224350 times) [2024-06-25 16:44:16,044][15401] Updated weights for policy 0, policy_version 925444 (0.0028) [2024-06-25 16:44:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15162572800. Throughput: 0: 42647.5. Samples: 15162737780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 16:44:19,689][15401] Updated weights for policy 0, policy_version 925454 (0.0037) [2024-06-25 16:44:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 15162769408. Throughput: 0: 42859.7. Samples: 15162871620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:23,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 16:44:23,726][15401] Updated weights for policy 0, policy_version 925464 (0.0034) [2024-06-25 16:44:27,418][15401] Updated weights for policy 0, policy_version 925474 (0.0031) [2024-06-25 16:44:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15162998784. Throughput: 0: 42645.8. Samples: 15163124360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:28,393][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 16:44:31,543][15401] Updated weights for policy 0, policy_version 925484 (0.0023) [2024-06-25 16:44:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15163211776. Throughput: 0: 42688.9. Samples: 15163375020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:33,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 16:44:35,142][15401] Updated weights for policy 0, policy_version 925494 (0.0028) [2024-06-25 16:44:38,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 42431.8). Total num frames: 15163408384. Throughput: 0: 42730.7. Samples: 15163503380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:38,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 16:44:39,114][15401] Updated weights for policy 0, policy_version 925504 (0.0032) [2024-06-25 16:44:42,667][15401] Updated weights for policy 0, policy_version 925514 (0.0034) [2024-06-25 16:44:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15163654144. Throughput: 0: 42672.0. Samples: 15163762200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:43,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 16:44:46,657][15401] Updated weights for policy 0, policy_version 925524 (0.0022) [2024-06-25 16:44:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42487.9). Total num frames: 15163850752. Throughput: 0: 42554.3. Samples: 15164012640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:48,394][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 16:44:50,227][15401] Updated weights for policy 0, policy_version 925534 (0.0030) [2024-06-25 16:44:53,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 15164047360. Throughput: 0: 42501.4. Samples: 15164142140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 16:44:54,571][15401] Updated weights for policy 0, policy_version 925544 (0.0026) [2024-06-25 16:44:58,046][15401] Updated weights for policy 0, policy_version 925554 (0.0038) [2024-06-25 16:44:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15164276736. Throughput: 0: 42424.5. Samples: 15164396980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 16:44:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 16:45:02,462][15401] Updated weights for policy 0, policy_version 925564 (0.0033) [2024-06-25 16:45:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 15164489728. Throughput: 0: 42424.6. Samples: 15164646880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 16:45:05,687][15401] Updated weights for policy 0, policy_version 925574 (0.0038) [2024-06-25 16:45:08,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 15164669952. Throughput: 0: 42204.0. Samples: 15164770800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:08,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 16:45:10,113][15401] Updated weights for policy 0, policy_version 925584 (0.0031) [2024-06-25 16:45:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15164915712. Throughput: 0: 42300.5. Samples: 15165027880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:13,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 16:45:13,630][15401] Updated weights for policy 0, policy_version 925594 (0.0026) [2024-06-25 16:45:17,667][15401] Updated weights for policy 0, policy_version 925604 (0.0037) [2024-06-25 16:45:18,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15165128704. Throughput: 0: 42256.0. Samples: 15165276540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:18,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 16:45:21,315][15401] Updated weights for policy 0, policy_version 925614 (0.0027) [2024-06-25 16:45:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 15165308928. Throughput: 0: 42363.6. Samples: 15165409740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:23,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 16:45:25,323][15401] Updated weights for policy 0, policy_version 925624 (0.0042) [2024-06-25 16:45:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15165538304. Throughput: 0: 42195.2. Samples: 15165660980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:28,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 16:45:29,399][15401] Updated weights for policy 0, policy_version 925634 (0.0044) [2024-06-25 16:45:33,040][15401] Updated weights for policy 0, policy_version 925644 (0.0025) [2024-06-25 16:45:33,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15165767680. Throughput: 0: 42368.8. Samples: 15165919240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:33,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 16:45:36,863][15401] Updated weights for policy 0, policy_version 925654 (0.0032) [2024-06-25 16:45:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.6). Total num frames: 15165964288. Throughput: 0: 42384.5. Samples: 15166049440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 16:45:39,582][15349] Signal inference workers to stop experience collection... (224400 times) [2024-06-25 16:45:39,627][15401] InferenceWorker_p0-w0: stopping experience collection (224400 times) [2024-06-25 16:45:39,692][15349] Signal inference workers to resume experience collection... (224400 times) [2024-06-25 16:45:39,692][15401] InferenceWorker_p0-w0: resuming experience collection (224400 times) [2024-06-25 16:45:40,986][15401] Updated weights for policy 0, policy_version 925664 (0.0045) [2024-06-25 16:45:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15166177280. Throughput: 0: 42316.3. Samples: 15166301220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 16:45:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925670_15166177280.pth... [2024-06-25 16:45:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925047_15155970048.pth [2024-06-25 16:45:44,596][15401] Updated weights for policy 0, policy_version 925674 (0.0024) [2024-06-25 16:45:48,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 15166373888. Throughput: 0: 42482.1. Samples: 15166558580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:48,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 16:45:48,631][15401] Updated weights for policy 0, policy_version 925684 (0.0028) [2024-06-25 16:45:52,066][15401] Updated weights for policy 0, policy_version 925694 (0.0027) [2024-06-25 16:45:53,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 15166603264. Throughput: 0: 42543.5. Samples: 15166685260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 16:45:56,390][15401] Updated weights for policy 0, policy_version 925704 (0.0032) [2024-06-25 16:45:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15166832640. Throughput: 0: 42497.7. Samples: 15166940280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:45:58,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 16:45:59,687][15401] Updated weights for policy 0, policy_version 925714 (0.0028) [2024-06-25 16:46:03,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 15167012864. Throughput: 0: 42658.1. Samples: 15167196160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:03,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 16:46:04,386][15401] Updated weights for policy 0, policy_version 925724 (0.0033) [2024-06-25 16:46:07,557][15401] Updated weights for policy 0, policy_version 925734 (0.0044) [2024-06-25 16:46:08,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 15167258624. Throughput: 0: 42375.5. Samples: 15167316640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:08,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 16:46:12,099][15401] Updated weights for policy 0, policy_version 925744 (0.0037) [2024-06-25 16:46:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15167471616. Throughput: 0: 42574.3. Samples: 15167576820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 16:46:15,462][15401] Updated weights for policy 0, policy_version 925754 (0.0032) [2024-06-25 16:46:18,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42050.5, 300 sec: 42320.4). Total num frames: 15167651840. Throughput: 0: 42480.5. Samples: 15167830960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:18,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 16:46:20,083][15401] Updated weights for policy 0, policy_version 925764 (0.0031) [2024-06-25 16:46:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15167881216. Throughput: 0: 42083.9. Samples: 15167943220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 16:46:23,400][15401] Updated weights for policy 0, policy_version 925774 (0.0053) [2024-06-25 16:46:27,876][15401] Updated weights for policy 0, policy_version 925784 (0.0046) [2024-06-25 16:46:28,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 15168077824. Throughput: 0: 42189.0. Samples: 15168199720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:28,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 16:46:31,273][15401] Updated weights for policy 0, policy_version 925794 (0.0040) [2024-06-25 16:46:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 15168258048. Throughput: 0: 42325.8. Samples: 15168463240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:33,395][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:46:35,566][15401] Updated weights for policy 0, policy_version 925804 (0.0032) [2024-06-25 16:46:38,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15168520192. Throughput: 0: 42110.3. Samples: 15168580320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:38,393][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 16:46:38,846][15401] Updated weights for policy 0, policy_version 925814 (0.0030) [2024-06-25 16:46:43,345][15401] Updated weights for policy 0, policy_version 925824 (0.0039) [2024-06-25 16:46:43,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 15168700416. Throughput: 0: 42122.7. Samples: 15168835800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:43,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:46:46,785][15401] Updated weights for policy 0, policy_version 925834 (0.0041) [2024-06-25 16:46:48,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42052.3, 300 sec: 42321.6). Total num frames: 15168897024. Throughput: 0: 42151.2. Samples: 15169092960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 16:46:50,984][15401] Updated weights for policy 0, policy_version 925844 (0.0029) [2024-06-25 16:46:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15169159168. Throughput: 0: 42251.1. Samples: 15169217940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 16:46:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 16:46:54,189][15401] Updated weights for policy 0, policy_version 925854 (0.0041) [2024-06-25 16:46:58,392][15132] Fps is (10 sec: 44226.0, 60 sec: 41777.6, 300 sec: 42320.4). Total num frames: 15169339392. Throughput: 0: 42209.2. Samples: 15169476340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:46:58,393][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 16:46:58,589][15401] Updated weights for policy 0, policy_version 925864 (0.0046) [2024-06-25 16:46:59,628][15349] Signal inference workers to stop experience collection... (224450 times) [2024-06-25 16:46:59,628][15349] Signal inference workers to resume experience collection... (224450 times) [2024-06-25 16:46:59,659][15401] InferenceWorker_p0-w0: stopping experience collection (224450 times) [2024-06-25 16:46:59,659][15401] InferenceWorker_p0-w0: resuming experience collection (224450 times) [2024-06-25 16:47:01,732][15401] Updated weights for policy 0, policy_version 925874 (0.0032) [2024-06-25 16:47:03,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42052.4, 300 sec: 42377.2). Total num frames: 15169536000. Throughput: 0: 42329.9. Samples: 15169735700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 16:47:06,228][15401] Updated weights for policy 0, policy_version 925884 (0.0037) [2024-06-25 16:47:08,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15169798144. Throughput: 0: 42713.5. Samples: 15169865320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 16:47:09,283][15401] Updated weights for policy 0, policy_version 925894 (0.0037) [2024-06-25 16:47:13,392][15132] Fps is (10 sec: 44225.6, 60 sec: 41777.4, 300 sec: 42320.4). Total num frames: 15169978368. Throughput: 0: 42819.0. Samples: 15170126680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:13,393][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 16:47:14,060][15401] Updated weights for policy 0, policy_version 925904 (0.0042) [2024-06-25 16:47:16,855][15401] Updated weights for policy 0, policy_version 925914 (0.0032) [2024-06-25 16:47:18,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42325.4, 300 sec: 42431.4). Total num frames: 15170191360. Throughput: 0: 42434.7. Samples: 15170372900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:18,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 16:47:21,813][15401] Updated weights for policy 0, policy_version 925924 (0.0038) [2024-06-25 16:47:23,390][15132] Fps is (10 sec: 47524.7, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 15170453504. Throughput: 0: 42867.1. Samples: 15170509240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 16:47:25,046][15401] Updated weights for policy 0, policy_version 925934 (0.0034) [2024-06-25 16:47:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 15170617344. Throughput: 0: 42787.6. Samples: 15170761240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 16:47:29,365][15401] Updated weights for policy 0, policy_version 925944 (0.0027) [2024-06-25 16:47:32,559][15401] Updated weights for policy 0, policy_version 925954 (0.0029) [2024-06-25 16:47:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 15170846720. Throughput: 0: 42724.0. Samples: 15171015540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 16:47:36,995][15401] Updated weights for policy 0, policy_version 925964 (0.0035) [2024-06-25 16:47:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 15171092480. Throughput: 0: 42986.2. Samples: 15171152320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 16:47:39,967][15401] Updated weights for policy 0, policy_version 925974 (0.0030) [2024-06-25 16:47:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 15171272704. Throughput: 0: 43016.0. Samples: 15171411960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 16:47:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925981_15171272704.pth... [2024-06-25 16:47:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925360_15161098240.pth [2024-06-25 16:47:44,549][15401] Updated weights for policy 0, policy_version 925984 (0.0031) [2024-06-25 16:47:47,572][15401] Updated weights for policy 0, policy_version 925994 (0.0030) [2024-06-25 16:47:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43415.8, 300 sec: 42542.5). Total num frames: 15171502080. Throughput: 0: 42738.1. Samples: 15171659020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:48,392][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 16:47:52,046][15401] Updated weights for policy 0, policy_version 926004 (0.0030) [2024-06-25 16:47:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15171731456. Throughput: 0: 42947.5. Samples: 15171797960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 16:47:55,500][15401] Updated weights for policy 0, policy_version 926014 (0.0035) [2024-06-25 16:47:58,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 15171911680. Throughput: 0: 42605.4. Samples: 15172043820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:47:58,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:47:59,845][15401] Updated weights for policy 0, policy_version 926024 (0.0050) [2024-06-25 16:48:01,303][15349] Signal inference workers to stop experience collection... (224500 times) [2024-06-25 16:48:01,340][15401] InferenceWorker_p0-w0: stopping experience collection (224500 times) [2024-06-25 16:48:01,363][15349] Signal inference workers to resume experience collection... (224500 times) [2024-06-25 16:48:01,364][15401] InferenceWorker_p0-w0: resuming experience collection (224500 times) [2024-06-25 16:48:03,345][15401] Updated weights for policy 0, policy_version 926034 (0.0035) [2024-06-25 16:48:03,391][15132] Fps is (10 sec: 40955.5, 60 sec: 43416.8, 300 sec: 42542.7). Total num frames: 15172141056. Throughput: 0: 42807.5. Samples: 15172299180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:03,391][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 16:48:07,279][15401] Updated weights for policy 0, policy_version 926044 (0.0032) [2024-06-25 16:48:08,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15172370432. Throughput: 0: 42829.4. Samples: 15172436560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:08,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 16:48:10,670][15401] Updated weights for policy 0, policy_version 926054 (0.0034) [2024-06-25 16:48:13,390][15132] Fps is (10 sec: 42602.5, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 15172567040. Throughput: 0: 42944.9. Samples: 15172693760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 16:48:15,106][15401] Updated weights for policy 0, policy_version 926064 (0.0039) [2024-06-25 16:48:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43146.3, 300 sec: 42542.9). Total num frames: 15172780032. Throughput: 0: 42823.6. Samples: 15172942600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 16:48:18,527][15401] Updated weights for policy 0, policy_version 926074 (0.0038) [2024-06-25 16:48:22,650][15401] Updated weights for policy 0, policy_version 926084 (0.0040) [2024-06-25 16:48:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15172993024. Throughput: 0: 42728.5. Samples: 15173075100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 16:48:26,057][15401] Updated weights for policy 0, policy_version 926094 (0.0032) [2024-06-25 16:48:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 15173189632. Throughput: 0: 42651.1. Samples: 15173331260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:28,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 16:48:30,149][15401] Updated weights for policy 0, policy_version 926104 (0.0031) [2024-06-25 16:48:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42432.2). Total num frames: 15173402624. Throughput: 0: 42689.9. Samples: 15173579960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:33,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 16:48:34,006][15401] Updated weights for policy 0, policy_version 926114 (0.0029) [2024-06-25 16:48:38,077][15401] Updated weights for policy 0, policy_version 926124 (0.0038) [2024-06-25 16:48:38,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15173648384. Throughput: 0: 42493.8. Samples: 15173710180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 16:48:41,901][15401] Updated weights for policy 0, policy_version 926134 (0.0033) [2024-06-25 16:48:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15173828608. Throughput: 0: 42718.7. Samples: 15173966160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 16:48:45,536][15401] Updated weights for policy 0, policy_version 926144 (0.0030) [2024-06-25 16:48:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 15174057984. Throughput: 0: 42662.8. Samples: 15174218960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 16:48:48,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 16:48:49,646][15401] Updated weights for policy 0, policy_version 926154 (0.0027) [2024-06-25 16:48:53,110][15401] Updated weights for policy 0, policy_version 926164 (0.0033) [2024-06-25 16:48:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15174287360. Throughput: 0: 42615.0. Samples: 15174354240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:48:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 16:48:57,099][15401] Updated weights for policy 0, policy_version 926174 (0.0046) [2024-06-25 16:48:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15174483968. Throughput: 0: 42547.1. Samples: 15174608380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:48:58,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 16:49:00,673][15401] Updated weights for policy 0, policy_version 926184 (0.0033) [2024-06-25 16:49:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42872.2, 300 sec: 42598.4). Total num frames: 15174713344. Throughput: 0: 42725.8. Samples: 15174865260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 16:49:05,223][15401] Updated weights for policy 0, policy_version 926194 (0.0031) [2024-06-25 16:49:08,170][15401] Updated weights for policy 0, policy_version 926204 (0.0033) [2024-06-25 16:49:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15174926336. Throughput: 0: 42712.9. Samples: 15174997180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 16:49:11,135][15349] Signal inference workers to stop experience collection... (224550 times) [2024-06-25 16:49:11,135][15349] Signal inference workers to resume experience collection... (224550 times) [2024-06-25 16:49:11,153][15401] InferenceWorker_p0-w0: stopping experience collection (224550 times) [2024-06-25 16:49:11,183][15401] InferenceWorker_p0-w0: resuming experience collection (224550 times) [2024-06-25 16:49:12,678][15401] Updated weights for policy 0, policy_version 926214 (0.0034) [2024-06-25 16:49:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15175122944. Throughput: 0: 42575.6. Samples: 15175247160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:13,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 16:49:15,839][15401] Updated weights for policy 0, policy_version 926224 (0.0034) [2024-06-25 16:49:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15175352320. Throughput: 0: 42731.9. Samples: 15175502900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:18,399][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 16:49:20,097][15401] Updated weights for policy 0, policy_version 926234 (0.0031) [2024-06-25 16:49:23,344][15401] Updated weights for policy 0, policy_version 926244 (0.0026) [2024-06-25 16:49:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15175581696. Throughput: 0: 42775.9. Samples: 15175635100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:23,391][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:49:27,620][15401] Updated weights for policy 0, policy_version 926254 (0.0049) [2024-06-25 16:49:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15175761920. Throughput: 0: 42739.6. Samples: 15175889440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:28,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 16:49:30,878][15401] Updated weights for policy 0, policy_version 926264 (0.0030) [2024-06-25 16:49:33,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15175958528. Throughput: 0: 43033.2. Samples: 15176155460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 16:49:35,074][15401] Updated weights for policy 0, policy_version 926274 (0.0025) [2024-06-25 16:49:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15176204288. Throughput: 0: 42610.4. Samples: 15176271700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:38,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 16:49:38,901][15401] Updated weights for policy 0, policy_version 926284 (0.0027) [2024-06-25 16:49:42,686][15401] Updated weights for policy 0, policy_version 926294 (0.0044) [2024-06-25 16:49:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15176417280. Throughput: 0: 42766.7. Samples: 15176532880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 16:49:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926295_15176417280.pth... [2024-06-25 16:49:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925670_15166177280.pth [2024-06-25 16:49:46,639][15401] Updated weights for policy 0, policy_version 926304 (0.0037) [2024-06-25 16:49:48,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15176597504. Throughput: 0: 42847.5. Samples: 15176793400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:48,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 16:49:50,526][15401] Updated weights for policy 0, policy_version 926314 (0.0032) [2024-06-25 16:49:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15176859648. Throughput: 0: 42661.3. Samples: 15176916940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 16:49:54,315][15401] Updated weights for policy 0, policy_version 926324 (0.0022) [2024-06-25 16:49:57,995][15401] Updated weights for policy 0, policy_version 926334 (0.0028) [2024-06-25 16:49:58,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15177072640. Throughput: 0: 42937.3. Samples: 15177179440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:49:58,392][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 16:50:02,509][15401] Updated weights for policy 0, policy_version 926344 (0.0034) [2024-06-25 16:50:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15177252864. Throughput: 0: 42844.1. Samples: 15177430880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:03,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 16:50:05,648][15401] Updated weights for policy 0, policy_version 926354 (0.0035) [2024-06-25 16:50:08,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15177498624. Throughput: 0: 42639.5. Samples: 15177553880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 16:50:10,032][15401] Updated weights for policy 0, policy_version 926364 (0.0030) [2024-06-25 16:50:13,125][15401] Updated weights for policy 0, policy_version 926374 (0.0037) [2024-06-25 16:50:13,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15177711616. Throughput: 0: 42987.8. Samples: 15177823900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 16:50:15,389][15349] Signal inference workers to stop experience collection... (224600 times) [2024-06-25 16:50:15,394][15349] Signal inference workers to resume experience collection... (224600 times) [2024-06-25 16:50:15,435][15401] InferenceWorker_p0-w0: stopping experience collection (224600 times) [2024-06-25 16:50:15,436][15401] InferenceWorker_p0-w0: resuming experience collection (224600 times) [2024-06-25 16:50:17,588][15401] Updated weights for policy 0, policy_version 926384 (0.0030) [2024-06-25 16:50:18,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15177908224. Throughput: 0: 42712.5. Samples: 15178077520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:18,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 16:50:21,054][15401] Updated weights for policy 0, policy_version 926394 (0.0027) [2024-06-25 16:50:23,389][15132] Fps is (10 sec: 44238.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15178153984. Throughput: 0: 42882.3. Samples: 15178201400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 16:50:25,113][15401] Updated weights for policy 0, policy_version 926404 (0.0033) [2024-06-25 16:50:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15178350592. Throughput: 0: 42831.5. Samples: 15178460300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:28,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 16:50:28,836][15401] Updated weights for policy 0, policy_version 926414 (0.0043) [2024-06-25 16:50:32,696][15401] Updated weights for policy 0, policy_version 926424 (0.0038) [2024-06-25 16:50:33,390][15132] Fps is (10 sec: 37682.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15178530816. Throughput: 0: 42683.9. Samples: 15178714180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:33,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 16:50:36,350][15401] Updated weights for policy 0, policy_version 926434 (0.0036) [2024-06-25 16:50:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15178760192. Throughput: 0: 42714.1. Samples: 15178839080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 16:50:40,304][15401] Updated weights for policy 0, policy_version 926444 (0.0033) [2024-06-25 16:50:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15178973184. Throughput: 0: 42650.2. Samples: 15179098600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-25 16:50:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 16:50:44,102][15401] Updated weights for policy 0, policy_version 926454 (0.0034) [2024-06-25 16:50:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15179169792. Throughput: 0: 42633.4. Samples: 15179349380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:50:48,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 16:50:48,494][15401] Updated weights for policy 0, policy_version 926464 (0.0032) [2024-06-25 16:50:51,730][15401] Updated weights for policy 0, policy_version 926474 (0.0028) [2024-06-25 16:50:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15179382784. Throughput: 0: 42714.3. Samples: 15179476020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:50:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 16:50:56,536][15401] Updated weights for policy 0, policy_version 926484 (0.0042) [2024-06-25 16:50:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 15179579392. Throughput: 0: 42361.5. Samples: 15179730160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:50:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 16:51:00,118][15401] Updated weights for policy 0, policy_version 926494 (0.0039) [2024-06-25 16:51:03,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15179825152. Throughput: 0: 42311.4. Samples: 15179981540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:03,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 16:51:04,154][15401] Updated weights for policy 0, policy_version 926504 (0.0034) [2024-06-25 16:51:07,751][15401] Updated weights for policy 0, policy_version 926514 (0.0038) [2024-06-25 16:51:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 15180021760. Throughput: 0: 42514.0. Samples: 15180114540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:08,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 16:51:11,811][15401] Updated weights for policy 0, policy_version 926524 (0.0034) [2024-06-25 16:51:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 41779.4, 300 sec: 42598.8). Total num frames: 15180218368. Throughput: 0: 42355.3. Samples: 15180366280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:13,390][15132] Avg episode reward: [(0, '0.356')] [2024-06-25 16:51:15,302][15401] Updated weights for policy 0, policy_version 926534 (0.0031) [2024-06-25 16:51:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15180464128. Throughput: 0: 42349.3. Samples: 15180619900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 16:51:19,376][15401] Updated weights for policy 0, policy_version 926544 (0.0031) [2024-06-25 16:51:23,335][15401] Updated weights for policy 0, policy_version 926554 (0.0034) [2024-06-25 16:51:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 15180660736. Throughput: 0: 42514.4. Samples: 15180752220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:23,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 16:51:26,894][15401] Updated weights for policy 0, policy_version 926564 (0.0027) [2024-06-25 16:51:28,338][15349] Signal inference workers to stop experience collection... (224650 times) [2024-06-25 16:51:28,339][15349] Signal inference workers to resume experience collection... (224650 times) [2024-06-25 16:51:28,356][15401] InferenceWorker_p0-w0: stopping experience collection (224650 times) [2024-06-25 16:51:28,360][15401] InferenceWorker_p0-w0: resuming experience collection (224650 times) [2024-06-25 16:51:28,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 15180873728. Throughput: 0: 42435.8. Samples: 15181008200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:28,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 16:51:30,792][15401] Updated weights for policy 0, policy_version 926574 (0.0037) [2024-06-25 16:51:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.7, 300 sec: 42654.3). Total num frames: 15181103104. Throughput: 0: 42467.6. Samples: 15181260420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:33,390][15132] Avg episode reward: [(0, '0.235')] [2024-06-25 16:51:35,003][15401] Updated weights for policy 0, policy_version 926584 (0.0036) [2024-06-25 16:51:38,223][15401] Updated weights for policy 0, policy_version 926594 (0.0028) [2024-06-25 16:51:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15181316096. Throughput: 0: 42581.3. Samples: 15181392180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 16:51:42,523][15401] Updated weights for policy 0, policy_version 926604 (0.0025) [2024-06-25 16:51:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15181512704. Throughput: 0: 42754.7. Samples: 15181654120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:43,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 16:51:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926606_15181512704.pth... [2024-06-25 16:51:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000925981_15171272704.pth [2024-06-25 16:51:45,919][15401] Updated weights for policy 0, policy_version 926614 (0.0038) [2024-06-25 16:51:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15181758464. Throughput: 0: 42651.2. Samples: 15181900840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 16:51:50,336][15401] Updated weights for policy 0, policy_version 926624 (0.0031) [2024-06-25 16:51:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 15181955072. Throughput: 0: 42739.2. Samples: 15182037800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:53,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 16:51:53,483][15401] Updated weights for policy 0, policy_version 926634 (0.0041) [2024-06-25 16:51:57,845][15401] Updated weights for policy 0, policy_version 926644 (0.0031) [2024-06-25 16:51:58,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15182151680. Throughput: 0: 42850.5. Samples: 15182294560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:51:58,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 16:52:00,984][15401] Updated weights for policy 0, policy_version 926654 (0.0041) [2024-06-25 16:52:03,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 15182397440. Throughput: 0: 42797.4. Samples: 15182545880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:03,392][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 16:52:05,616][15401] Updated weights for policy 0, policy_version 926664 (0.0030) [2024-06-25 16:52:08,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 15182610432. Throughput: 0: 42755.5. Samples: 15182676220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:08,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 16:52:08,584][15401] Updated weights for policy 0, policy_version 926674 (0.0045) [2024-06-25 16:52:13,369][15401] Updated weights for policy 0, policy_version 926684 (0.0044) [2024-06-25 16:52:13,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15182790656. Throughput: 0: 42755.0. Samples: 15182932180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 16:52:16,165][15401] Updated weights for policy 0, policy_version 926694 (0.0037) [2024-06-25 16:52:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15183020032. Throughput: 0: 42772.7. Samples: 15183185200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:18,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 16:52:21,229][15401] Updated weights for policy 0, policy_version 926704 (0.0042) [2024-06-25 16:52:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15183249408. Throughput: 0: 42972.0. Samples: 15183325920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:23,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 16:52:23,746][15401] Updated weights for policy 0, policy_version 926714 (0.0034) [2024-06-25 16:52:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15183429632. Throughput: 0: 42624.9. Samples: 15183572240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:28,390][15132] Avg episode reward: [(0, '0.831')] [2024-06-25 16:52:28,647][15401] Updated weights for policy 0, policy_version 926724 (0.0034) [2024-06-25 16:52:31,257][15401] Updated weights for policy 0, policy_version 926734 (0.0046) [2024-06-25 16:52:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15183675392. Throughput: 0: 42766.6. Samples: 15183825340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:33,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 16:52:36,836][15401] Updated weights for policy 0, policy_version 926744 (0.0033) [2024-06-25 16:52:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15183888384. Throughput: 0: 42739.5. Samples: 15183961080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-25 16:52:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 16:52:39,102][15401] Updated weights for policy 0, policy_version 926754 (0.0037) [2024-06-25 16:52:43,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 15184084992. Throughput: 0: 42710.8. Samples: 15184216540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:52:43,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 16:52:44,522][15401] Updated weights for policy 0, policy_version 926764 (0.0029) [2024-06-25 16:52:44,963][15349] Signal inference workers to stop experience collection... (224700 times) [2024-06-25 16:52:44,967][15349] Signal inference workers to resume experience collection... (224700 times) [2024-06-25 16:52:45,005][15401] InferenceWorker_p0-w0: stopping experience collection (224700 times) [2024-06-25 16:52:45,005][15401] InferenceWorker_p0-w0: resuming experience collection (224700 times) [2024-06-25 16:52:46,753][15401] Updated weights for policy 0, policy_version 926774 (0.0043) [2024-06-25 16:52:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15184314368. Throughput: 0: 42583.7. Samples: 15184462040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:52:48,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 16:52:52,322][15401] Updated weights for policy 0, policy_version 926784 (0.0047) [2024-06-25 16:52:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15184510976. Throughput: 0: 42577.0. Samples: 15184592180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:52:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 16:52:54,590][15401] Updated weights for policy 0, policy_version 926794 (0.0029) [2024-06-25 16:52:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 15184723968. Throughput: 0: 42678.3. Samples: 15184852700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:52:58,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 16:52:59,686][15401] Updated weights for policy 0, policy_version 926804 (0.0022) [2024-06-25 16:53:02,209][15401] Updated weights for policy 0, policy_version 926814 (0.0030) [2024-06-25 16:53:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 15184953344. Throughput: 0: 42514.7. Samples: 15185098360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 16:53:07,140][15401] Updated weights for policy 0, policy_version 926824 (0.0042) [2024-06-25 16:53:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15185133568. Throughput: 0: 42372.2. Samples: 15185232660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 16:53:09,887][15401] Updated weights for policy 0, policy_version 926834 (0.0033) [2024-06-25 16:53:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15185362944. Throughput: 0: 42530.2. Samples: 15185486100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:13,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 16:53:14,551][15401] Updated weights for policy 0, policy_version 926844 (0.0047) [2024-06-25 16:53:17,548][15401] Updated weights for policy 0, policy_version 926854 (0.0030) [2024-06-25 16:53:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15185592320. Throughput: 0: 42483.7. Samples: 15185737100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:18,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 16:53:22,044][15401] Updated weights for policy 0, policy_version 926864 (0.0029) [2024-06-25 16:53:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 15185772544. Throughput: 0: 42513.0. Samples: 15185874160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:23,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 16:53:25,056][15401] Updated weights for policy 0, policy_version 926874 (0.0028) [2024-06-25 16:53:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15186001920. Throughput: 0: 42631.9. Samples: 15186134980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:28,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 16:53:29,578][15401] Updated weights for policy 0, policy_version 926884 (0.0039) [2024-06-25 16:53:33,063][15401] Updated weights for policy 0, policy_version 926894 (0.0023) [2024-06-25 16:53:33,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15186247680. Throughput: 0: 42711.6. Samples: 15186384060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:33,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 16:53:37,490][15401] Updated weights for policy 0, policy_version 926904 (0.0033) [2024-06-25 16:53:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15186411520. Throughput: 0: 42812.9. Samples: 15186518760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:53:40,697][15401] Updated weights for policy 0, policy_version 926914 (0.0035) [2024-06-25 16:53:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15186640896. Throughput: 0: 42704.5. Samples: 15186774400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:43,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 16:53:43,465][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926920_15186657280.pth... [2024-06-25 16:53:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926295_15176417280.pth [2024-06-25 16:53:44,172][15349] Signal inference workers to stop experience collection... (224750 times) [2024-06-25 16:53:44,172][15349] Signal inference workers to resume experience collection... (224750 times) [2024-06-25 16:53:44,184][15401] InferenceWorker_p0-w0: stopping experience collection (224750 times) [2024-06-25 16:53:44,184][15401] InferenceWorker_p0-w0: resuming experience collection (224750 times) [2024-06-25 16:53:44,956][15401] Updated weights for policy 0, policy_version 926924 (0.0041) [2024-06-25 16:53:48,390][15132] Fps is (10 sec: 45873.6, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 15186870272. Throughput: 0: 42865.5. Samples: 15187027320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:48,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 16:53:48,642][15401] Updated weights for policy 0, policy_version 926934 (0.0039) [2024-06-25 16:53:52,511][15401] Updated weights for policy 0, policy_version 926944 (0.0032) [2024-06-25 16:53:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15187066880. Throughput: 0: 42921.3. Samples: 15187164120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 16:53:56,119][15401] Updated weights for policy 0, policy_version 926954 (0.0036) [2024-06-25 16:53:58,395][15132] Fps is (10 sec: 40939.3, 60 sec: 42594.5, 300 sec: 42597.6). Total num frames: 15187279872. Throughput: 0: 42885.1. Samples: 15187416160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:53:58,395][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 16:54:00,566][15401] Updated weights for policy 0, policy_version 926964 (0.0035) [2024-06-25 16:54:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15187525632. Throughput: 0: 42848.8. Samples: 15187665300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:03,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 16:54:03,698][15401] Updated weights for policy 0, policy_version 926974 (0.0032) [2024-06-25 16:54:08,385][15401] Updated weights for policy 0, policy_version 926984 (0.0034) [2024-06-25 16:54:08,389][15132] Fps is (10 sec: 42621.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15187705856. Throughput: 0: 42886.7. Samples: 15187804060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:08,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 16:54:11,330][15401] Updated weights for policy 0, policy_version 926994 (0.0033) [2024-06-25 16:54:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15187935232. Throughput: 0: 42517.3. Samples: 15188048260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:13,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 16:54:15,848][15401] Updated weights for policy 0, policy_version 927004 (0.0027) [2024-06-25 16:54:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15188148224. Throughput: 0: 42687.0. Samples: 15188304980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:18,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 16:54:19,113][15401] Updated weights for policy 0, policy_version 927014 (0.0029) [2024-06-25 16:54:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15188344832. Throughput: 0: 42467.6. Samples: 15188429800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:23,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 16:54:23,614][15401] Updated weights for policy 0, policy_version 927024 (0.0041) [2024-06-25 16:54:26,733][15401] Updated weights for policy 0, policy_version 927034 (0.0031) [2024-06-25 16:54:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15188574208. Throughput: 0: 42298.2. Samples: 15188677820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:28,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 16:54:31,162][15401] Updated weights for policy 0, policy_version 927044 (0.0038) [2024-06-25 16:54:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15188770816. Throughput: 0: 42478.9. Samples: 15188938860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 16:54:33,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 16:54:34,784][15401] Updated weights for policy 0, policy_version 927054 (0.0033) [2024-06-25 16:54:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15188983808. Throughput: 0: 42207.2. Samples: 15189063440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:54:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 16:54:39,238][15401] Updated weights for policy 0, policy_version 927064 (0.0033) [2024-06-25 16:54:42,275][15401] Updated weights for policy 0, policy_version 927074 (0.0031) [2024-06-25 16:54:43,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15189213184. Throughput: 0: 42423.3. Samples: 15189324980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:54:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 16:54:47,028][15401] Updated weights for policy 0, policy_version 927084 (0.0037) [2024-06-25 16:54:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.7, 300 sec: 42598.4). Total num frames: 15189426176. Throughput: 0: 42582.3. Samples: 15189581500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:54:48,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 16:54:50,325][15401] Updated weights for policy 0, policy_version 927094 (0.0036) [2024-06-25 16:54:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 15189606400. Throughput: 0: 42194.2. Samples: 15189702800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:54:53,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 16:54:54,648][15401] Updated weights for policy 0, policy_version 927104 (0.0037) [2024-06-25 16:54:55,896][15349] Signal inference workers to stop experience collection... (224800 times) [2024-06-25 16:54:55,952][15401] InferenceWorker_p0-w0: stopping experience collection (224800 times) [2024-06-25 16:54:55,954][15349] Signal inference workers to resume experience collection... (224800 times) [2024-06-25 16:54:55,970][15401] InferenceWorker_p0-w0: resuming experience collection (224800 times) [2024-06-25 16:54:57,726][15401] Updated weights for policy 0, policy_version 927114 (0.0043) [2024-06-25 16:54:58,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42873.6, 300 sec: 42709.1). Total num frames: 15189852160. Throughput: 0: 42538.7. Samples: 15189962600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:54:58,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 16:55:02,235][15401] Updated weights for policy 0, policy_version 927124 (0.0028) [2024-06-25 16:55:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15190065152. Throughput: 0: 42578.3. Samples: 15190221000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 16:55:05,147][15401] Updated weights for policy 0, policy_version 927134 (0.0032) [2024-06-25 16:55:08,396][15132] Fps is (10 sec: 40943.6, 60 sec: 42593.8, 300 sec: 42542.0). Total num frames: 15190261760. Throughput: 0: 42546.8. Samples: 15190344680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:08,396][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 16:55:09,979][15401] Updated weights for policy 0, policy_version 927144 (0.0046) [2024-06-25 16:55:13,113][15401] Updated weights for policy 0, policy_version 927154 (0.0027) [2024-06-25 16:55:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15190507520. Throughput: 0: 42760.3. Samples: 15190602040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:13,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 16:55:17,919][15401] Updated weights for policy 0, policy_version 927164 (0.0036) [2024-06-25 16:55:18,390][15132] Fps is (10 sec: 42625.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15190687744. Throughput: 0: 42763.2. Samples: 15190863200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 16:55:20,711][15401] Updated weights for policy 0, policy_version 927174 (0.0043) [2024-06-25 16:55:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15190900736. Throughput: 0: 42551.4. Samples: 15190978260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 16:55:25,632][15401] Updated weights for policy 0, policy_version 927184 (0.0040) [2024-06-25 16:55:28,334][15401] Updated weights for policy 0, policy_version 927194 (0.0031) [2024-06-25 16:55:28,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15191146496. Throughput: 0: 42436.4. Samples: 15191234620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:28,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 16:55:33,267][15401] Updated weights for policy 0, policy_version 927204 (0.0038) [2024-06-25 16:55:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15191310336. Throughput: 0: 42719.0. Samples: 15191503860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:33,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 16:55:35,737][15401] Updated weights for policy 0, policy_version 927214 (0.0040) [2024-06-25 16:55:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15191539712. Throughput: 0: 42509.7. Samples: 15191615740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 16:55:41,001][15401] Updated weights for policy 0, policy_version 927224 (0.0033) [2024-06-25 16:55:43,390][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15191785472. Throughput: 0: 42474.7. Samples: 15191873860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:43,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 16:55:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927233_15191785472.pth... [2024-06-25 16:55:43,447][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926606_15181512704.pth [2024-06-25 16:55:43,652][15401] Updated weights for policy 0, policy_version 927234 (0.0036) [2024-06-25 16:55:48,389][15132] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 15191932928. Throughput: 0: 42596.1. Samples: 15192137820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:48,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 16:55:48,854][15401] Updated weights for policy 0, policy_version 927244 (0.0038) [2024-06-25 16:55:51,259][15401] Updated weights for policy 0, policy_version 927254 (0.0042) [2024-06-25 16:55:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15192195072. Throughput: 0: 42445.1. Samples: 15192254440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 16:55:56,300][15401] Updated weights for policy 0, policy_version 927264 (0.0038) [2024-06-25 16:55:58,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 15192408064. Throughput: 0: 42727.7. Samples: 15192524780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:55:58,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 16:55:58,770][15401] Updated weights for policy 0, policy_version 927274 (0.0028) [2024-06-25 16:56:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 15192571904. Throughput: 0: 42776.4. Samples: 15192788140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:03,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 16:56:03,523][15349] Signal inference workers to stop experience collection... (224850 times) [2024-06-25 16:56:03,564][15401] InferenceWorker_p0-w0: stopping experience collection (224850 times) [2024-06-25 16:56:03,571][15349] Signal inference workers to resume experience collection... (224850 times) [2024-06-25 16:56:03,582][15401] InferenceWorker_p0-w0: resuming experience collection (224850 times) [2024-06-25 16:56:03,886][15401] Updated weights for policy 0, policy_version 927284 (0.0043) [2024-06-25 16:56:06,785][15401] Updated weights for policy 0, policy_version 927294 (0.0027) [2024-06-25 16:56:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43149.1, 300 sec: 42820.5). Total num frames: 15192850432. Throughput: 0: 42802.7. Samples: 15192904380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:08,395][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 16:56:11,954][15401] Updated weights for policy 0, policy_version 927304 (0.0040) [2024-06-25 16:56:13,390][15132] Fps is (10 sec: 47514.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15193047040. Throughput: 0: 43016.4. Samples: 15193170360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 16:56:14,396][15401] Updated weights for policy 0, policy_version 927314 (0.0027) [2024-06-25 16:56:18,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15193227264. Throughput: 0: 42643.7. Samples: 15193422820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:18,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 16:56:19,506][15401] Updated weights for policy 0, policy_version 927324 (0.0038) [2024-06-25 16:56:21,881][15401] Updated weights for policy 0, policy_version 927334 (0.0034) [2024-06-25 16:56:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15193473024. Throughput: 0: 42858.8. Samples: 15193544380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 16:56:26,899][15401] Updated weights for policy 0, policy_version 927344 (0.0025) [2024-06-25 16:56:28,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15193686016. Throughput: 0: 43012.0. Samples: 15193809400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-25 16:56:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 16:56:29,655][15401] Updated weights for policy 0, policy_version 927354 (0.0031) [2024-06-25 16:56:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15193882624. Throughput: 0: 42814.1. Samples: 15194064460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:33,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 16:56:34,242][15401] Updated weights for policy 0, policy_version 927364 (0.0027) [2024-06-25 16:56:37,428][15401] Updated weights for policy 0, policy_version 927374 (0.0038) [2024-06-25 16:56:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15194128384. Throughput: 0: 43015.6. Samples: 15194190140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 16:56:41,954][15401] Updated weights for policy 0, policy_version 927384 (0.0037) [2024-06-25 16:56:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15194324992. Throughput: 0: 42732.3. Samples: 15194447740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 16:56:45,288][15401] Updated weights for policy 0, policy_version 927394 (0.0031) [2024-06-25 16:56:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15194521600. Throughput: 0: 42483.2. Samples: 15194699880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:48,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 16:56:49,354][15401] Updated weights for policy 0, policy_version 927404 (0.0025) [2024-06-25 16:56:52,948][15401] Updated weights for policy 0, policy_version 927414 (0.0035) [2024-06-25 16:56:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15194767360. Throughput: 0: 42803.6. Samples: 15194830540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:53,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 16:56:56,947][15401] Updated weights for policy 0, policy_version 927424 (0.0034) [2024-06-25 16:56:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 15194947584. Throughput: 0: 42493.8. Samples: 15195082580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:56:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 16:57:00,589][15401] Updated weights for policy 0, policy_version 927434 (0.0033) [2024-06-25 16:57:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 15195176960. Throughput: 0: 42702.1. Samples: 15195344420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 16:57:04,731][15401] Updated weights for policy 0, policy_version 927444 (0.0028) [2024-06-25 16:57:08,161][15401] Updated weights for policy 0, policy_version 927454 (0.0037) [2024-06-25 16:57:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15195406336. Throughput: 0: 42880.9. Samples: 15195474020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:08,390][15132] Avg episode reward: [(0, '0.176')] [2024-06-25 16:57:12,370][15401] Updated weights for policy 0, policy_version 927464 (0.0032) [2024-06-25 16:57:13,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 15195602944. Throughput: 0: 42555.9. Samples: 15195724520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:13,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 16:57:16,164][15401] Updated weights for policy 0, policy_version 927474 (0.0044) [2024-06-25 16:57:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15195815936. Throughput: 0: 42649.2. Samples: 15195983680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:18,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 16:57:20,129][15401] Updated weights for policy 0, policy_version 927484 (0.0031) [2024-06-25 16:57:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15196045312. Throughput: 0: 42690.3. Samples: 15196111200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:23,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 16:57:23,763][15401] Updated weights for policy 0, policy_version 927494 (0.0031) [2024-06-25 16:57:27,957][15401] Updated weights for policy 0, policy_version 927504 (0.0040) [2024-06-25 16:57:28,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15196241920. Throughput: 0: 42565.0. Samples: 15196363160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:28,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 16:57:31,283][15401] Updated weights for policy 0, policy_version 927514 (0.0027) [2024-06-25 16:57:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15196454912. Throughput: 0: 42697.9. Samples: 15196621280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 16:57:35,683][15401] Updated weights for policy 0, policy_version 927524 (0.0031) [2024-06-25 16:57:35,706][15349] Signal inference workers to stop experience collection... (224900 times) [2024-06-25 16:57:35,706][15349] Signal inference workers to resume experience collection... (224900 times) [2024-06-25 16:57:35,727][15401] InferenceWorker_p0-w0: stopping experience collection (224900 times) [2024-06-25 16:57:35,728][15401] InferenceWorker_p0-w0: resuming experience collection (224900 times) [2024-06-25 16:57:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15196684288. Throughput: 0: 42670.1. Samples: 15196750700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 16:57:38,833][15401] Updated weights for policy 0, policy_version 927534 (0.0043) [2024-06-25 16:57:43,252][15401] Updated weights for policy 0, policy_version 927544 (0.0040) [2024-06-25 16:57:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15196897280. Throughput: 0: 42769.4. Samples: 15197007200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:43,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 16:57:43,398][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927545_15196897280.pth... [2024-06-25 16:57:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000926920_15186657280.pth [2024-06-25 16:57:46,989][15401] Updated weights for policy 0, policy_version 927554 (0.0052) [2024-06-25 16:57:48,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15197093888. Throughput: 0: 42564.6. Samples: 15197259820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:48,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 16:57:50,947][15401] Updated weights for policy 0, policy_version 927564 (0.0030) [2024-06-25 16:57:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15197306880. Throughput: 0: 42436.0. Samples: 15197383640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 16:57:54,734][15401] Updated weights for policy 0, policy_version 927574 (0.0026) [2024-06-25 16:57:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15197519872. Throughput: 0: 42592.6. Samples: 15197641080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:57:58,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 16:57:58,527][15401] Updated weights for policy 0, policy_version 927584 (0.0030) [2024-06-25 16:58:02,311][15401] Updated weights for policy 0, policy_version 927594 (0.0033) [2024-06-25 16:58:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15197749248. Throughput: 0: 42535.7. Samples: 15197897780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:03,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 16:58:06,365][15401] Updated weights for policy 0, policy_version 927604 (0.0035) [2024-06-25 16:58:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15197945856. Throughput: 0: 42672.0. Samples: 15198031440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:08,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 16:58:09,936][15401] Updated weights for policy 0, policy_version 927614 (0.0047) [2024-06-25 16:58:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 15198158848. Throughput: 0: 42643.4. Samples: 15198282120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:13,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 16:58:13,915][15401] Updated weights for policy 0, policy_version 927624 (0.0032) [2024-06-25 16:58:17,495][15401] Updated weights for policy 0, policy_version 927634 (0.0034) [2024-06-25 16:58:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15198388224. Throughput: 0: 42596.8. Samples: 15198538140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:18,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 16:58:21,642][15401] Updated weights for policy 0, policy_version 927644 (0.0048) [2024-06-25 16:58:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15198568448. Throughput: 0: 42586.4. Samples: 15198667080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:23,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 16:58:25,153][15401] Updated weights for policy 0, policy_version 927654 (0.0041) [2024-06-25 16:58:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15198814208. Throughput: 0: 42695.1. Samples: 15198928480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-25 16:58:28,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 16:58:29,068][15401] Updated weights for policy 0, policy_version 927664 (0.0034) [2024-06-25 16:58:32,927][15401] Updated weights for policy 0, policy_version 927674 (0.0038) [2024-06-25 16:58:33,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15199043584. Throughput: 0: 42685.7. Samples: 15199180680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:33,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 16:58:36,584][15401] Updated weights for policy 0, policy_version 927684 (0.0034) [2024-06-25 16:58:38,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42050.7, 300 sec: 42598.0). Total num frames: 15199207424. Throughput: 0: 42808.8. Samples: 15199310140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:38,401][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 16:58:40,586][15401] Updated weights for policy 0, policy_version 927694 (0.0032) [2024-06-25 16:58:43,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42593.8, 300 sec: 42653.1). Total num frames: 15199453184. Throughput: 0: 42861.8. Samples: 15199570140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:43,397][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 16:58:44,891][15401] Updated weights for policy 0, policy_version 927704 (0.0036) [2024-06-25 16:58:45,998][15349] Signal inference workers to stop experience collection... (224950 times) [2024-06-25 16:58:45,998][15349] Signal inference workers to resume experience collection... (224950 times) [2024-06-25 16:58:46,011][15401] InferenceWorker_p0-w0: stopping experience collection (224950 times) [2024-06-25 16:58:46,041][15401] InferenceWorker_p0-w0: resuming experience collection (224950 times) [2024-06-25 16:58:48,046][15401] Updated weights for policy 0, policy_version 927714 (0.0052) [2024-06-25 16:58:48,390][15132] Fps is (10 sec: 49163.3, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 15199698944. Throughput: 0: 42901.2. Samples: 15199828340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 16:58:52,359][15401] Updated weights for policy 0, policy_version 927724 (0.0037) [2024-06-25 16:58:53,389][15132] Fps is (10 sec: 39346.9, 60 sec: 42325.3, 300 sec: 42599.2). Total num frames: 15199846400. Throughput: 0: 42767.1. Samples: 15199955960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 16:58:55,791][15401] Updated weights for policy 0, policy_version 927734 (0.0029) [2024-06-25 16:58:58,392][15132] Fps is (10 sec: 40950.8, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15200108544. Throughput: 0: 42942.3. Samples: 15200214620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:58:58,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 16:58:59,860][15401] Updated weights for policy 0, policy_version 927744 (0.0042) [2024-06-25 16:59:03,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15200305152. Throughput: 0: 43042.7. Samples: 15200475060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:03,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 16:59:03,475][15401] Updated weights for policy 0, policy_version 927754 (0.0033) [2024-06-25 16:59:07,409][15401] Updated weights for policy 0, policy_version 927764 (0.0037) [2024-06-25 16:59:08,390][15132] Fps is (10 sec: 39330.3, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 15200501760. Throughput: 0: 42935.0. Samples: 15200599160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:08,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 16:59:11,290][15401] Updated weights for policy 0, policy_version 927774 (0.0027) [2024-06-25 16:59:13,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 15200747520. Throughput: 0: 42773.4. Samples: 15200853280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:13,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 16:59:14,939][15401] Updated weights for policy 0, policy_version 927784 (0.0034) [2024-06-25 16:59:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15200927744. Throughput: 0: 42915.0. Samples: 15201111860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:18,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-25 16:59:19,015][15401] Updated weights for policy 0, policy_version 927794 (0.0025) [2024-06-25 16:59:22,613][15401] Updated weights for policy 0, policy_version 927804 (0.0033) [2024-06-25 16:59:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15201157120. Throughput: 0: 42642.8. Samples: 15201228960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:23,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 16:59:26,618][15401] Updated weights for policy 0, policy_version 927814 (0.0039) [2024-06-25 16:59:28,390][15132] Fps is (10 sec: 45872.6, 60 sec: 42871.0, 300 sec: 42764.9). Total num frames: 15201386496. Throughput: 0: 42561.9. Samples: 15201485180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:28,391][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 16:59:30,195][15401] Updated weights for policy 0, policy_version 927824 (0.0038) [2024-06-25 16:59:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15201566720. Throughput: 0: 42712.2. Samples: 15201750380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:33,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 16:59:34,222][15401] Updated weights for policy 0, policy_version 927834 (0.0032) [2024-06-25 16:59:37,868][15401] Updated weights for policy 0, policy_version 927844 (0.0027) [2024-06-25 16:59:38,390][15132] Fps is (10 sec: 40962.7, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 15201796096. Throughput: 0: 42624.4. Samples: 15201874060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:38,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 16:59:42,054][15401] Updated weights for policy 0, policy_version 927854 (0.0036) [2024-06-25 16:59:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 15202025472. Throughput: 0: 42616.1. Samples: 15202132240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 16:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927859_15202041856.pth... [2024-06-25 16:59:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927233_15191785472.pth [2024-06-25 16:59:45,641][15401] Updated weights for policy 0, policy_version 927864 (0.0029) [2024-06-25 16:59:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15202222080. Throughput: 0: 42734.1. Samples: 15202398100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:48,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 16:59:49,581][15401] Updated weights for policy 0, policy_version 927874 (0.0032) [2024-06-25 16:59:53,219][15401] Updated weights for policy 0, policy_version 927884 (0.0028) [2024-06-25 16:59:53,390][15132] Fps is (10 sec: 42597.5, 60 sec: 43417.5, 300 sec: 42709.8). Total num frames: 15202451456. Throughput: 0: 42623.1. Samples: 15202517200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:53,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 16:59:56,987][15349] Signal inference workers to stop experience collection... (225000 times) [2024-06-25 16:59:56,989][15349] Signal inference workers to resume experience collection... (225000 times) [2024-06-25 16:59:57,005][15401] InferenceWorker_p0-w0: stopping experience collection (225000 times) [2024-06-25 16:59:57,006][15401] InferenceWorker_p0-w0: resuming experience collection (225000 times) [2024-06-25 16:59:57,149][15401] Updated weights for policy 0, policy_version 927894 (0.0038) [2024-06-25 16:59:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15202680832. Throughput: 0: 42765.7. Samples: 15202777740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 16:59:58,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 17:00:01,180][15401] Updated weights for policy 0, policy_version 927904 (0.0043) [2024-06-25 17:00:03,389][15132] Fps is (10 sec: 37684.2, 60 sec: 42052.3, 300 sec: 42599.3). Total num frames: 15202828288. Throughput: 0: 42897.1. Samples: 15203042220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 17:00:03,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 17:00:04,779][15401] Updated weights for policy 0, policy_version 927914 (0.0025) [2024-06-25 17:00:08,390][15132] Fps is (10 sec: 40958.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15203090432. Throughput: 0: 42808.6. Samples: 15203155360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 17:00:08,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 17:00:09,053][15401] Updated weights for policy 0, policy_version 927924 (0.0039) [2024-06-25 17:00:12,650][15401] Updated weights for policy 0, policy_version 927934 (0.0035) [2024-06-25 17:00:13,389][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15203319808. Throughput: 0: 42968.7. Samples: 15203418740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 17:00:13,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 17:00:16,836][15401] Updated weights for policy 0, policy_version 927944 (0.0049) [2024-06-25 17:00:18,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15203483648. Throughput: 0: 42921.2. Samples: 15203681840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 17:00:18,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 17:00:20,211][15401] Updated weights for policy 0, policy_version 927954 (0.0024) [2024-06-25 17:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15203745792. Throughput: 0: 42724.4. Samples: 15203796660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 17:00:23,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 17:00:24,679][15401] Updated weights for policy 0, policy_version 927964 (0.0039) [2024-06-25 17:00:27,770][15401] Updated weights for policy 0, policy_version 927974 (0.0030) [2024-06-25 17:00:28,390][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.9, 300 sec: 42876.1). Total num frames: 15203958784. Throughput: 0: 42875.9. Samples: 15204061660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 17:00:32,381][15401] Updated weights for policy 0, policy_version 927984 (0.0039) [2024-06-25 17:00:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15204122624. Throughput: 0: 42726.8. Samples: 15204320800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:33,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 17:00:35,389][15401] Updated weights for policy 0, policy_version 927994 (0.0053) [2024-06-25 17:00:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15204384768. Throughput: 0: 42677.9. Samples: 15204437700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 17:00:39,996][15401] Updated weights for policy 0, policy_version 928004 (0.0043) [2024-06-25 17:00:43,177][15401] Updated weights for policy 0, policy_version 928014 (0.0032) [2024-06-25 17:00:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15204581376. Throughput: 0: 42674.6. Samples: 15204698100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:43,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 17:00:47,795][15401] Updated weights for policy 0, policy_version 928024 (0.0037) [2024-06-25 17:00:48,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15204761600. Throughput: 0: 42487.1. Samples: 15204954140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 17:00:50,979][15401] Updated weights for policy 0, policy_version 928034 (0.0034) [2024-06-25 17:00:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15205007360. Throughput: 0: 42681.1. Samples: 15205076000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 17:00:55,404][15401] Updated weights for policy 0, policy_version 928044 (0.0037) [2024-06-25 17:00:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 15205220352. Throughput: 0: 42639.9. Samples: 15205337540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:00:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 17:00:58,794][15401] Updated weights for policy 0, policy_version 928054 (0.0045) [2024-06-25 17:01:02,930][15401] Updated weights for policy 0, policy_version 928064 (0.0036) [2024-06-25 17:01:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15205416960. Throughput: 0: 42494.3. Samples: 15205594080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 17:01:06,088][15349] Signal inference workers to stop experience collection... (225050 times) [2024-06-25 17:01:06,089][15349] Signal inference workers to resume experience collection... (225050 times) [2024-06-25 17:01:06,113][15401] InferenceWorker_p0-w0: stopping experience collection (225050 times) [2024-06-25 17:01:06,114][15401] InferenceWorker_p0-w0: resuming experience collection (225050 times) [2024-06-25 17:01:06,234][15401] Updated weights for policy 0, policy_version 928074 (0.0043) [2024-06-25 17:01:08,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 15205629952. Throughput: 0: 42734.8. Samples: 15205719720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 17:01:10,492][15401] Updated weights for policy 0, policy_version 928084 (0.0033) [2024-06-25 17:01:13,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42323.6, 300 sec: 42820.2). Total num frames: 15205859328. Throughput: 0: 42543.1. Samples: 15205976200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:13,393][15132] Avg episode reward: [(0, '0.263')] [2024-06-25 17:01:13,839][15401] Updated weights for policy 0, policy_version 928094 (0.0033) [2024-06-25 17:01:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15206039552. Throughput: 0: 42500.7. Samples: 15206233340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 17:01:18,651][15401] Updated weights for policy 0, policy_version 928104 (0.0041) [2024-06-25 17:01:21,371][15401] Updated weights for policy 0, policy_version 928114 (0.0032) [2024-06-25 17:01:23,392][15132] Fps is (10 sec: 42598.3, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 15206285312. Throughput: 0: 42659.5. Samples: 15206357480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:23,393][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 17:01:26,059][15401] Updated weights for policy 0, policy_version 928124 (0.0031) [2024-06-25 17:01:28,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15206481920. Throughput: 0: 42704.1. Samples: 15206619780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 17:01:29,277][15401] Updated weights for policy 0, policy_version 928134 (0.0039) [2024-06-25 17:01:33,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15206694912. Throughput: 0: 42610.2. Samples: 15206871600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:33,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 17:01:33,467][15401] Updated weights for policy 0, policy_version 928144 (0.0029) [2024-06-25 17:01:37,167][15401] Updated weights for policy 0, policy_version 928154 (0.0026) [2024-06-25 17:01:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15206940672. Throughput: 0: 42803.6. Samples: 15207002160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:38,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 17:01:40,957][15401] Updated weights for policy 0, policy_version 928164 (0.0036) [2024-06-25 17:01:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15207120896. Throughput: 0: 42671.6. Samples: 15207257760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 17:01:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928169_15207120896.pth... [2024-06-25 17:01:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927545_15196897280.pth [2024-06-25 17:01:44,931][15401] Updated weights for policy 0, policy_version 928174 (0.0036) [2024-06-25 17:01:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15207350272. Throughput: 0: 42511.2. Samples: 15207507080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 17:01:48,490][15401] Updated weights for policy 0, policy_version 928184 (0.0034) [2024-06-25 17:01:52,526][15401] Updated weights for policy 0, policy_version 928194 (0.0042) [2024-06-25 17:01:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15207563264. Throughput: 0: 42773.8. Samples: 15207644540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:53,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 17:01:56,028][15401] Updated weights for policy 0, policy_version 928204 (0.0025) [2024-06-25 17:01:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15207759872. Throughput: 0: 42626.2. Samples: 15207894280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:01:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 17:02:00,200][15401] Updated weights for policy 0, policy_version 928214 (0.0031) [2024-06-25 17:02:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15207989248. Throughput: 0: 42630.4. Samples: 15208151700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:02:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 17:02:03,780][15401] Updated weights for policy 0, policy_version 928224 (0.0030) [2024-06-25 17:02:08,116][15401] Updated weights for policy 0, policy_version 928234 (0.0038) [2024-06-25 17:02:08,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15208202240. Throughput: 0: 42779.2. Samples: 15208282440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:02:08,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 17:02:11,469][15401] Updated weights for policy 0, policy_version 928244 (0.0033) [2024-06-25 17:02:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15208415232. Throughput: 0: 42453.7. Samples: 15208530200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:02:13,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 17:02:15,667][15401] Updated weights for policy 0, policy_version 928254 (0.0034) [2024-06-25 17:02:18,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15208628224. Throughput: 0: 42631.8. Samples: 15208790040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-25 17:02:18,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 17:02:18,999][15401] Updated weights for policy 0, policy_version 928264 (0.0037) [2024-06-25 17:02:23,222][15401] Updated weights for policy 0, policy_version 928274 (0.0036) [2024-06-25 17:02:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42709.4). Total num frames: 15208841216. Throughput: 0: 42618.6. Samples: 15208920000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:23,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 17:02:26,840][15401] Updated weights for policy 0, policy_version 928284 (0.0040) [2024-06-25 17:02:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15209037824. Throughput: 0: 42305.8. Samples: 15209161520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 17:02:31,322][15401] Updated weights for policy 0, policy_version 928294 (0.0046) [2024-06-25 17:02:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15209250816. Throughput: 0: 42608.5. Samples: 15209424460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 17:02:34,364][15401] Updated weights for policy 0, policy_version 928304 (0.0032) [2024-06-25 17:02:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15209463808. Throughput: 0: 42402.2. Samples: 15209552640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:38,390][15132] Avg episode reward: [(0, '0.344')] [2024-06-25 17:02:38,848][15401] Updated weights for policy 0, policy_version 928314 (0.0034) [2024-06-25 17:02:40,832][15349] Signal inference workers to stop experience collection... (225100 times) [2024-06-25 17:02:40,832][15349] Signal inference workers to resume experience collection... (225100 times) [2024-06-25 17:02:40,884][15401] InferenceWorker_p0-w0: stopping experience collection (225100 times) [2024-06-25 17:02:40,884][15401] InferenceWorker_p0-w0: resuming experience collection (225100 times) [2024-06-25 17:02:42,538][15401] Updated weights for policy 0, policy_version 928324 (0.0050) [2024-06-25 17:02:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15209693184. Throughput: 0: 42507.6. Samples: 15209807120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:43,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 17:02:46,723][15401] Updated weights for policy 0, policy_version 928334 (0.0027) [2024-06-25 17:02:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15209889792. Throughput: 0: 42369.3. Samples: 15210058320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:48,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 17:02:50,183][15401] Updated weights for policy 0, policy_version 928344 (0.0035) [2024-06-25 17:02:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15210086400. Throughput: 0: 42238.1. Samples: 15210183160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:53,399][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 17:02:54,620][15401] Updated weights for policy 0, policy_version 928354 (0.0026) [2024-06-25 17:02:57,856][15401] Updated weights for policy 0, policy_version 928364 (0.0036) [2024-06-25 17:02:58,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15210348544. Throughput: 0: 42464.5. Samples: 15210441100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:02:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 17:03:02,266][15401] Updated weights for policy 0, policy_version 928374 (0.0028) [2024-06-25 17:03:03,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15210545152. Throughput: 0: 42486.4. Samples: 15210701920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 17:03:05,219][15401] Updated weights for policy 0, policy_version 928384 (0.0028) [2024-06-25 17:03:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15210741760. Throughput: 0: 42365.8. Samples: 15210826460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:08,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 17:03:09,984][15401] Updated weights for policy 0, policy_version 928394 (0.0042) [2024-06-25 17:03:12,768][15401] Updated weights for policy 0, policy_version 928404 (0.0031) [2024-06-25 17:03:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15210987520. Throughput: 0: 42776.4. Samples: 15211086460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 17:03:17,661][15401] Updated weights for policy 0, policy_version 928414 (0.0040) [2024-06-25 17:03:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15211167744. Throughput: 0: 42651.9. Samples: 15211343800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:18,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 17:03:20,371][15401] Updated weights for policy 0, policy_version 928424 (0.0033) [2024-06-25 17:03:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15211380736. Throughput: 0: 42564.0. Samples: 15211468020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 17:03:25,075][15401] Updated weights for policy 0, policy_version 928434 (0.0031) [2024-06-25 17:03:28,018][15401] Updated weights for policy 0, policy_version 928444 (0.0030) [2024-06-25 17:03:28,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15211642880. Throughput: 0: 42737.8. Samples: 15211730320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:28,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 17:03:32,767][15401] Updated weights for policy 0, policy_version 928454 (0.0046) [2024-06-25 17:03:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15211806720. Throughput: 0: 42779.2. Samples: 15211983380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 17:03:35,642][15401] Updated weights for policy 0, policy_version 928464 (0.0029) [2024-06-25 17:03:38,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 15212019712. Throughput: 0: 42668.6. Samples: 15212103240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:38,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 17:03:40,475][15401] Updated weights for policy 0, policy_version 928474 (0.0028) [2024-06-25 17:03:43,268][15401] Updated weights for policy 0, policy_version 928484 (0.0038) [2024-06-25 17:03:43,390][15132] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15212281856. Throughput: 0: 42872.7. Samples: 15212370380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:43,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 17:03:43,540][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928485_15212298240.pth... [2024-06-25 17:03:43,607][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000927859_15202041856.pth [2024-06-25 17:03:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15212429312. Throughput: 0: 42802.5. Samples: 15212628040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:48,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-25 17:03:48,397][15401] Updated weights for policy 0, policy_version 928494 (0.0025) [2024-06-25 17:03:50,576][15349] Signal inference workers to stop experience collection... (225150 times) [2024-06-25 17:03:50,633][15401] InferenceWorker_p0-w0: stopping experience collection (225150 times) [2024-06-25 17:03:50,693][15349] Signal inference workers to resume experience collection... (225150 times) [2024-06-25 17:03:50,694][15401] InferenceWorker_p0-w0: resuming experience collection (225150 times) [2024-06-25 17:03:51,216][15401] Updated weights for policy 0, policy_version 928504 (0.0026) [2024-06-25 17:03:53,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 15212658688. Throughput: 0: 42650.1. Samples: 15212745720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:53,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 17:03:56,105][15401] Updated weights for policy 0, policy_version 928514 (0.0036) [2024-06-25 17:03:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15212888064. Throughput: 0: 42710.8. Samples: 15213008440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:03:58,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 17:03:58,867][15401] Updated weights for policy 0, policy_version 928524 (0.0041) [2024-06-25 17:04:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 15213068288. Throughput: 0: 42780.9. Samples: 15213268940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:04:03,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 17:04:03,729][15401] Updated weights for policy 0, policy_version 928534 (0.0034) [2024-06-25 17:04:06,585][15401] Updated weights for policy 0, policy_version 928544 (0.0031) [2024-06-25 17:04:08,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 15213314048. Throughput: 0: 42666.1. Samples: 15213388100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:04:08,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 17:04:11,405][15401] Updated weights for policy 0, policy_version 928554 (0.0037) [2024-06-25 17:04:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15213510656. Throughput: 0: 42484.3. Samples: 15213642120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-25 17:04:13,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 17:04:14,376][15401] Updated weights for policy 0, policy_version 928564 (0.0038) [2024-06-25 17:04:18,392][15132] Fps is (10 sec: 39321.8, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 15213707264. Throughput: 0: 42615.5. Samples: 15213901180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:18,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 17:04:19,214][15401] Updated weights for policy 0, policy_version 928574 (0.0047) [2024-06-25 17:04:22,193][15401] Updated weights for policy 0, policy_version 928584 (0.0040) [2024-06-25 17:04:23,392][15132] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42598.2). Total num frames: 15213953024. Throughput: 0: 42657.7. Samples: 15214022940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:23,392][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 17:04:26,820][15401] Updated weights for policy 0, policy_version 928594 (0.0052) [2024-06-25 17:04:28,389][15132] Fps is (10 sec: 44247.9, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 15214149632. Throughput: 0: 42390.0. Samples: 15214277920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:28,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 17:04:29,903][15401] Updated weights for policy 0, policy_version 928604 (0.0035) [2024-06-25 17:04:33,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15214346240. Throughput: 0: 42382.8. Samples: 15214535260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:33,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 17:04:34,363][15401] Updated weights for policy 0, policy_version 928614 (0.0033) [2024-06-25 17:04:37,804][15401] Updated weights for policy 0, policy_version 928624 (0.0026) [2024-06-25 17:04:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15214608384. Throughput: 0: 42563.2. Samples: 15214661060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:38,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 17:04:42,123][15401] Updated weights for policy 0, policy_version 928634 (0.0036) [2024-06-25 17:04:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15214788608. Throughput: 0: 42441.7. Samples: 15214918320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:43,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 17:04:45,454][15401] Updated weights for policy 0, policy_version 928644 (0.0037) [2024-06-25 17:04:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 15215001600. Throughput: 0: 42194.7. Samples: 15215167800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:48,392][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 17:04:49,814][15401] Updated weights for policy 0, policy_version 928654 (0.0032) [2024-06-25 17:04:49,822][15349] Signal inference workers to stop experience collection... (225200 times) [2024-06-25 17:04:49,822][15349] Signal inference workers to resume experience collection... (225200 times) [2024-06-25 17:04:49,841][15401] InferenceWorker_p0-w0: stopping experience collection (225200 times) [2024-06-25 17:04:49,841][15401] InferenceWorker_p0-w0: resuming experience collection (225200 times) [2024-06-25 17:04:53,256][15401] Updated weights for policy 0, policy_version 928664 (0.0034) [2024-06-25 17:04:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15215230976. Throughput: 0: 42369.0. Samples: 15215294600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 17:04:57,398][15401] Updated weights for policy 0, policy_version 928674 (0.0034) [2024-06-25 17:04:58,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15215443968. Throughput: 0: 42477.8. Samples: 15215553620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:04:58,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 17:05:00,860][15401] Updated weights for policy 0, policy_version 928684 (0.0028) [2024-06-25 17:05:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15215656960. Throughput: 0: 42458.2. Samples: 15215811700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:03,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 17:05:05,279][15401] Updated weights for policy 0, policy_version 928694 (0.0039) [2024-06-25 17:05:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 15215869952. Throughput: 0: 42553.9. Samples: 15215937760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 17:05:08,464][15401] Updated weights for policy 0, policy_version 928704 (0.0039) [2024-06-25 17:05:12,841][15401] Updated weights for policy 0, policy_version 928714 (0.0034) [2024-06-25 17:05:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15216066560. Throughput: 0: 42556.2. Samples: 15216192960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:13,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 17:05:16,233][15401] Updated weights for policy 0, policy_version 928724 (0.0028) [2024-06-25 17:05:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.2, 300 sec: 42487.3). Total num frames: 15216279552. Throughput: 0: 42585.7. Samples: 15216451620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:18,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 17:05:20,840][15401] Updated weights for policy 0, policy_version 928734 (0.0034) [2024-06-25 17:05:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 15216508928. Throughput: 0: 42575.2. Samples: 15216576940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:23,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 17:05:23,807][15401] Updated weights for policy 0, policy_version 928744 (0.0031) [2024-06-25 17:05:28,355][15401] Updated weights for policy 0, policy_version 928754 (0.0031) [2024-06-25 17:05:28,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42653.6). Total num frames: 15216705536. Throughput: 0: 42634.2. Samples: 15216836960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:28,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 17:05:31,607][15401] Updated weights for policy 0, policy_version 928764 (0.0032) [2024-06-25 17:05:33,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 15216918528. Throughput: 0: 42883.9. Samples: 15217097480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:33,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 17:05:35,662][15401] Updated weights for policy 0, policy_version 928774 (0.0034) [2024-06-25 17:05:38,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15217147904. Throughput: 0: 42908.0. Samples: 15217225460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:38,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 17:05:39,335][15401] Updated weights for policy 0, policy_version 928784 (0.0043) [2024-06-25 17:05:43,227][15401] Updated weights for policy 0, policy_version 928794 (0.0033) [2024-06-25 17:05:43,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15217377280. Throughput: 0: 42856.0. Samples: 15217482140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:43,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 17:05:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928795_15217377280.pth... [2024-06-25 17:05:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928169_15207120896.pth [2024-06-25 17:05:47,289][15401] Updated weights for policy 0, policy_version 928804 (0.0035) [2024-06-25 17:05:48,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 15217573888. Throughput: 0: 42759.1. Samples: 15217735860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 17:05:50,871][15401] Updated weights for policy 0, policy_version 928814 (0.0032) [2024-06-25 17:05:53,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15217770496. Throughput: 0: 42614.6. Samples: 15217855420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:53,392][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 17:05:54,867][15401] Updated weights for policy 0, policy_version 928824 (0.0035) [2024-06-25 17:05:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15217999872. Throughput: 0: 42731.5. Samples: 15218115880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:05:58,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 17:05:58,593][15401] Updated weights for policy 0, policy_version 928834 (0.0045) [2024-06-25 17:06:02,421][15401] Updated weights for policy 0, policy_version 928844 (0.0031) [2024-06-25 17:06:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15218212864. Throughput: 0: 42601.4. Samples: 15218368680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:06:03,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 17:06:06,338][15401] Updated weights for policy 0, policy_version 928854 (0.0035) [2024-06-25 17:06:08,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.2, 300 sec: 42543.2). Total num frames: 15218409472. Throughput: 0: 42675.9. Samples: 15218497360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-25 17:06:08,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 17:06:10,321][15401] Updated weights for policy 0, policy_version 928864 (0.0027) [2024-06-25 17:06:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15218638848. Throughput: 0: 42532.6. Samples: 15218750820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:13,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 17:06:14,005][15401] Updated weights for policy 0, policy_version 928874 (0.0045) [2024-06-25 17:06:18,239][15401] Updated weights for policy 0, policy_version 928884 (0.0038) [2024-06-25 17:06:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 15218851840. Throughput: 0: 42430.1. Samples: 15219006820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:18,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 17:06:21,788][15401] Updated weights for policy 0, policy_version 928894 (0.0025) [2024-06-25 17:06:23,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42050.5, 300 sec: 42542.5). Total num frames: 15219032064. Throughput: 0: 42343.1. Samples: 15219131000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:23,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 17:06:25,859][15401] Updated weights for policy 0, policy_version 928904 (0.0029) [2024-06-25 17:06:26,611][15349] Signal inference workers to stop experience collection... (225250 times) [2024-06-25 17:06:26,656][15401] InferenceWorker_p0-w0: stopping experience collection (225250 times) [2024-06-25 17:06:26,723][15349] Signal inference workers to resume experience collection... (225250 times) [2024-06-25 17:06:26,723][15401] InferenceWorker_p0-w0: resuming experience collection (225250 times) [2024-06-25 17:06:28,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 15219277824. Throughput: 0: 42213.3. Samples: 15219381740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:28,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 17:06:29,505][15401] Updated weights for policy 0, policy_version 928914 (0.0041) [2024-06-25 17:06:33,390][15132] Fps is (10 sec: 44246.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15219474432. Throughput: 0: 42376.8. Samples: 15219642820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:33,391][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 17:06:33,469][15401] Updated weights for policy 0, policy_version 928924 (0.0035) [2024-06-25 17:06:37,165][15401] Updated weights for policy 0, policy_version 928934 (0.0029) [2024-06-25 17:06:38,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15219671040. Throughput: 0: 42556.9. Samples: 15219770480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:38,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 17:06:41,134][15401] Updated weights for policy 0, policy_version 928944 (0.0037) [2024-06-25 17:06:43,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15219916800. Throughput: 0: 42310.4. Samples: 15220019840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:43,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 17:06:44,851][15401] Updated weights for policy 0, policy_version 928954 (0.0038) [2024-06-25 17:06:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 15220113408. Throughput: 0: 42494.5. Samples: 15220280940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 17:06:48,808][15401] Updated weights for policy 0, policy_version 928964 (0.0037) [2024-06-25 17:06:52,646][15401] Updated weights for policy 0, policy_version 928974 (0.0030) [2024-06-25 17:06:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15220326400. Throughput: 0: 42332.9. Samples: 15220402340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 17:06:56,380][15401] Updated weights for policy 0, policy_version 928984 (0.0043) [2024-06-25 17:06:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15220555776. Throughput: 0: 42491.5. Samples: 15220662940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:06:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 17:07:00,298][15401] Updated weights for policy 0, policy_version 928994 (0.0042) [2024-06-25 17:07:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15220752384. Throughput: 0: 42486.1. Samples: 15220918700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:03,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 17:07:04,118][15401] Updated weights for policy 0, policy_version 929004 (0.0040) [2024-06-25 17:07:08,058][15401] Updated weights for policy 0, policy_version 929014 (0.0037) [2024-06-25 17:07:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15220965376. Throughput: 0: 42399.5. Samples: 15221038880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:08,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 17:07:12,284][15401] Updated weights for policy 0, policy_version 929024 (0.0036) [2024-06-25 17:07:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15221194752. Throughput: 0: 42584.1. Samples: 15221298020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 17:07:16,163][15401] Updated weights for policy 0, policy_version 929034 (0.0037) [2024-06-25 17:07:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15221391360. Throughput: 0: 42365.9. Samples: 15221549280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 17:07:19,964][15401] Updated weights for policy 0, policy_version 929044 (0.0039) [2024-06-25 17:07:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 15221587968. Throughput: 0: 42427.1. Samples: 15221679700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:23,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 17:07:24,167][15401] Updated weights for policy 0, policy_version 929054 (0.0028) [2024-06-25 17:07:27,439][15401] Updated weights for policy 0, policy_version 929064 (0.0023) [2024-06-25 17:07:28,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15221817344. Throughput: 0: 42598.2. Samples: 15221936760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:28,390][15132] Avg episode reward: [(0, '0.186')] [2024-06-25 17:07:31,690][15401] Updated weights for policy 0, policy_version 929074 (0.0025) [2024-06-25 17:07:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 15222013952. Throughput: 0: 42472.8. Samples: 15222192220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:33,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 17:07:35,025][15401] Updated weights for policy 0, policy_version 929084 (0.0048) [2024-06-25 17:07:38,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 15222243328. Throughput: 0: 42616.9. Samples: 15222320200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:38,392][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 17:07:39,207][15401] Updated weights for policy 0, policy_version 929094 (0.0036) [2024-06-25 17:07:42,745][15401] Updated weights for policy 0, policy_version 929104 (0.0036) [2024-06-25 17:07:43,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15222456320. Throughput: 0: 42523.6. Samples: 15222576500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:43,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 17:07:43,444][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929106_15222472704.pth... [2024-06-25 17:07:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928485_15212298240.pth [2024-06-25 17:07:46,718][15401] Updated weights for policy 0, policy_version 929114 (0.0021) [2024-06-25 17:07:48,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 15222652928. Throughput: 0: 42727.2. Samples: 15222841420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:48,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 17:07:50,293][15401] Updated weights for policy 0, policy_version 929124 (0.0038) [2024-06-25 17:07:52,684][15349] Signal inference workers to stop experience collection... (225300 times) [2024-06-25 17:07:52,721][15401] InferenceWorker_p0-w0: stopping experience collection (225300 times) [2024-06-25 17:07:52,741][15349] Signal inference workers to resume experience collection... (225300 times) [2024-06-25 17:07:52,742][15401] InferenceWorker_p0-w0: resuming experience collection (225300 times) [2024-06-25 17:07:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15222882304. Throughput: 0: 42695.7. Samples: 15222960180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:53,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 17:07:54,700][15401] Updated weights for policy 0, policy_version 929134 (0.0042) [2024-06-25 17:07:58,223][15401] Updated weights for policy 0, policy_version 929144 (0.0035) [2024-06-25 17:07:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15223111680. Throughput: 0: 42658.3. Samples: 15223217640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:07:58,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 17:08:02,255][15401] Updated weights for policy 0, policy_version 929154 (0.0026) [2024-06-25 17:08:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15223291904. Throughput: 0: 42797.4. Samples: 15223475160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 17:08:03,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 17:08:05,905][15401] Updated weights for policy 0, policy_version 929164 (0.0027) [2024-06-25 17:08:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15223521280. Throughput: 0: 42644.0. Samples: 15223598680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:08,404][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 17:08:09,951][15401] Updated weights for policy 0, policy_version 929174 (0.0021) [2024-06-25 17:08:13,390][15401] Updated weights for policy 0, policy_version 929184 (0.0028) [2024-06-25 17:08:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15223750656. Throughput: 0: 42779.5. Samples: 15223861840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 17:08:17,512][15401] Updated weights for policy 0, policy_version 929194 (0.0044) [2024-06-25 17:08:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15223947264. Throughput: 0: 42835.3. Samples: 15224119800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:18,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 17:08:20,983][15401] Updated weights for policy 0, policy_version 929204 (0.0032) [2024-06-25 17:08:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 15224143872. Throughput: 0: 42658.6. Samples: 15224239740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:23,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 17:08:25,394][15401] Updated weights for policy 0, policy_version 929214 (0.0034) [2024-06-25 17:08:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15224373248. Throughput: 0: 42580.0. Samples: 15224492600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 17:08:28,668][15401] Updated weights for policy 0, policy_version 929224 (0.0056) [2024-06-25 17:08:32,980][15401] Updated weights for policy 0, policy_version 929234 (0.0034) [2024-06-25 17:08:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15224586240. Throughput: 0: 42530.9. Samples: 15224755320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:33,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 17:08:36,467][15401] Updated weights for policy 0, policy_version 929244 (0.0027) [2024-06-25 17:08:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 15224799232. Throughput: 0: 42634.3. Samples: 15224878720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 17:08:40,956][15401] Updated weights for policy 0, policy_version 929254 (0.0025) [2024-06-25 17:08:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15225028608. Throughput: 0: 42652.3. Samples: 15225137000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:43,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 17:08:44,009][15401] Updated weights for policy 0, policy_version 929264 (0.0027) [2024-06-25 17:08:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15225192448. Throughput: 0: 42920.0. Samples: 15225406560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 17:08:48,585][15401] Updated weights for policy 0, policy_version 929274 (0.0043) [2024-06-25 17:08:51,699][15401] Updated weights for policy 0, policy_version 929284 (0.0024) [2024-06-25 17:08:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15225438208. Throughput: 0: 42792.1. Samples: 15225524320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:53,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 17:08:56,107][15401] Updated weights for policy 0, policy_version 929294 (0.0038) [2024-06-25 17:08:58,392][15132] Fps is (10 sec: 49139.8, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 15225683968. Throughput: 0: 42645.8. Samples: 15225781000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:08:58,393][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 17:08:59,264][15401] Updated weights for policy 0, policy_version 929304 (0.0031) [2024-06-25 17:09:03,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 15225831424. Throughput: 0: 42939.0. Samples: 15226052060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 17:09:03,757][15401] Updated weights for policy 0, policy_version 929314 (0.0031) [2024-06-25 17:09:03,903][15349] Signal inference workers to stop experience collection... (225350 times) [2024-06-25 17:09:03,903][15349] Signal inference workers to resume experience collection... (225350 times) [2024-06-25 17:09:03,950][15401] InferenceWorker_p0-w0: stopping experience collection (225350 times) [2024-06-25 17:09:03,951][15401] InferenceWorker_p0-w0: resuming experience collection (225350 times) [2024-06-25 17:09:06,813][15401] Updated weights for policy 0, policy_version 929324 (0.0031) [2024-06-25 17:09:08,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15226093568. Throughput: 0: 42862.3. Samples: 15226168540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 17:09:11,371][15401] Updated weights for policy 0, policy_version 929334 (0.0035) [2024-06-25 17:09:13,390][15132] Fps is (10 sec: 49151.6, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 15226322944. Throughput: 0: 42958.0. Samples: 15226425720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 17:09:14,485][15401] Updated weights for policy 0, policy_version 929344 (0.0038) [2024-06-25 17:09:18,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 15226486784. Throughput: 0: 43094.3. Samples: 15226694560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:18,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 17:09:18,950][15401] Updated weights for policy 0, policy_version 929354 (0.0039) [2024-06-25 17:09:22,012][15401] Updated weights for policy 0, policy_version 929364 (0.0034) [2024-06-25 17:09:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15226748928. Throughput: 0: 42860.0. Samples: 15226807420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:23,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 17:09:26,614][15401] Updated weights for policy 0, policy_version 929374 (0.0023) [2024-06-25 17:09:28,390][15132] Fps is (10 sec: 49152.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15226978304. Throughput: 0: 42973.4. Samples: 15227070800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 17:09:29,636][15401] Updated weights for policy 0, policy_version 929384 (0.0034) [2024-06-25 17:09:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 15227142144. Throughput: 0: 42828.5. Samples: 15227333840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:33,390][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 17:09:34,383][15401] Updated weights for policy 0, policy_version 929394 (0.0035) [2024-06-25 17:09:37,242][15401] Updated weights for policy 0, policy_version 929404 (0.0027) [2024-06-25 17:09:38,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15227371520. Throughput: 0: 42756.7. Samples: 15227448380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 17:09:41,885][15401] Updated weights for policy 0, policy_version 929414 (0.0047) [2024-06-25 17:09:43,390][15132] Fps is (10 sec: 47512.5, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 15227617280. Throughput: 0: 42862.6. Samples: 15227709720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:43,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 17:09:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929420_15227617280.pth... [2024-06-25 17:09:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000928795_15217377280.pth [2024-06-25 17:09:45,260][15401] Updated weights for policy 0, policy_version 929424 (0.0044) [2024-06-25 17:09:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 15227781120. Throughput: 0: 42621.8. Samples: 15227970040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:48,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 17:09:49,674][15401] Updated weights for policy 0, policy_version 929434 (0.0046) [2024-06-25 17:09:52,842][15401] Updated weights for policy 0, policy_version 929444 (0.0034) [2024-06-25 17:09:53,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.7, 300 sec: 42653.6). Total num frames: 15228026880. Throughput: 0: 42745.2. Samples: 15228092180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:53,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 17:09:57,286][15401] Updated weights for policy 0, policy_version 929454 (0.0035) [2024-06-25 17:09:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 15228239872. Throughput: 0: 42812.5. Samples: 15228352280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 17:09:58,390][15132] Avg episode reward: [(0, '0.840')] [2024-06-25 17:10:00,330][15401] Updated weights for policy 0, policy_version 929464 (0.0032) [2024-06-25 17:10:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 15228436480. Throughput: 0: 42559.1. Samples: 15228609720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:03,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 17:10:04,729][15401] Updated weights for policy 0, policy_version 929474 (0.0039) [2024-06-25 17:10:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15228649472. Throughput: 0: 42859.5. Samples: 15228736100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:08,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 17:10:08,443][15401] Updated weights for policy 0, policy_version 929484 (0.0045) [2024-06-25 17:10:12,349][15401] Updated weights for policy 0, policy_version 929494 (0.0023) [2024-06-25 17:10:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15228878848. Throughput: 0: 42804.9. Samples: 15228997020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 17:10:15,947][15401] Updated weights for policy 0, policy_version 929504 (0.0031) [2024-06-25 17:10:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15229075456. Throughput: 0: 42686.5. Samples: 15229254740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 17:10:19,882][15349] Signal inference workers to stop experience collection... (225400 times) [2024-06-25 17:10:19,885][15349] Signal inference workers to resume experience collection... (225400 times) [2024-06-25 17:10:19,894][15401] Updated weights for policy 0, policy_version 929514 (0.0032) [2024-06-25 17:10:19,916][15401] InferenceWorker_p0-w0: stopping experience collection (225400 times) [2024-06-25 17:10:19,920][15401] InferenceWorker_p0-w0: resuming experience collection (225400 times) [2024-06-25 17:10:23,387][15401] Updated weights for policy 0, policy_version 929524 (0.0031) [2024-06-25 17:10:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 15229321216. Throughput: 0: 42948.4. Samples: 15229381060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:23,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-25 17:10:27,522][15401] Updated weights for policy 0, policy_version 929534 (0.0039) [2024-06-25 17:10:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15229501440. Throughput: 0: 42780.4. Samples: 15229634840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:28,390][15132] Avg episode reward: [(0, '0.261')] [2024-06-25 17:10:31,112][15401] Updated weights for policy 0, policy_version 929544 (0.0038) [2024-06-25 17:10:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15229730816. Throughput: 0: 42807.5. Samples: 15229896380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:33,395][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 17:10:35,039][15401] Updated weights for policy 0, policy_version 929554 (0.0031) [2024-06-25 17:10:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15229943808. Throughput: 0: 42977.3. Samples: 15230026060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:38,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 17:10:38,740][15401] Updated weights for policy 0, policy_version 929564 (0.0037) [2024-06-25 17:10:42,631][15401] Updated weights for policy 0, policy_version 929574 (0.0031) [2024-06-25 17:10:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15230156800. Throughput: 0: 42806.2. Samples: 15230278560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 17:10:46,709][15401] Updated weights for policy 0, policy_version 929584 (0.0039) [2024-06-25 17:10:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15230353408. Throughput: 0: 42718.3. Samples: 15230532040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:48,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 17:10:50,235][15401] Updated weights for policy 0, policy_version 929594 (0.0037) [2024-06-25 17:10:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 15230582784. Throughput: 0: 42692.5. Samples: 15230657260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:53,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 17:10:54,260][15401] Updated weights for policy 0, policy_version 929604 (0.0029) [2024-06-25 17:10:58,084][15401] Updated weights for policy 0, policy_version 929614 (0.0029) [2024-06-25 17:10:58,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 15230812160. Throughput: 0: 42650.4. Samples: 15230916380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:10:58,392][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 17:11:01,830][15401] Updated weights for policy 0, policy_version 929624 (0.0031) [2024-06-25 17:11:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15230992384. Throughput: 0: 42774.2. Samples: 15231179580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:03,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 17:11:05,524][15401] Updated weights for policy 0, policy_version 929634 (0.0036) [2024-06-25 17:11:08,390][15132] Fps is (10 sec: 42607.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15231238144. Throughput: 0: 42624.9. Samples: 15231299180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:08,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 17:11:09,885][15401] Updated weights for policy 0, policy_version 929644 (0.0037) [2024-06-25 17:11:13,125][15401] Updated weights for policy 0, policy_version 929654 (0.0032) [2024-06-25 17:11:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15231467520. Throughput: 0: 42698.2. Samples: 15231556260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:13,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 17:11:17,815][15401] Updated weights for policy 0, policy_version 929664 (0.0031) [2024-06-25 17:11:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15231631360. Throughput: 0: 42586.7. Samples: 15231812780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 17:11:20,758][15401] Updated weights for policy 0, policy_version 929674 (0.0035) [2024-06-25 17:11:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15231844352. Throughput: 0: 42443.5. Samples: 15231936020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:23,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 17:11:25,346][15401] Updated weights for policy 0, policy_version 929684 (0.0037) [2024-06-25 17:11:28,288][15401] Updated weights for policy 0, policy_version 929694 (0.0035) [2024-06-25 17:11:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 15232106496. Throughput: 0: 42576.6. Samples: 15232194500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 17:11:33,109][15401] Updated weights for policy 0, policy_version 929704 (0.0041) [2024-06-25 17:11:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15232286720. Throughput: 0: 42786.2. Samples: 15232457420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:33,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 17:11:35,905][15349] Signal inference workers to stop experience collection... (225450 times) [2024-06-25 17:11:35,961][15401] InferenceWorker_p0-w0: stopping experience collection (225450 times) [2024-06-25 17:11:35,970][15349] Signal inference workers to resume experience collection... (225450 times) [2024-06-25 17:11:35,976][15401] InferenceWorker_p0-w0: resuming experience collection (225450 times) [2024-06-25 17:11:35,979][15401] Updated weights for policy 0, policy_version 929714 (0.0040) [2024-06-25 17:11:38,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15232499712. Throughput: 0: 42534.2. Samples: 15232571300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:38,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 17:11:41,037][15401] Updated weights for policy 0, policy_version 929724 (0.0034) [2024-06-25 17:11:43,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15232745472. Throughput: 0: 42635.3. Samples: 15232834880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:43,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 17:11:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929734_15232761856.pth... [2024-06-25 17:11:43,492][15401] Updated weights for policy 0, policy_version 929734 (0.0023) [2024-06-25 17:11:43,534][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929106_15222472704.pth [2024-06-25 17:11:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15232909312. Throughput: 0: 42690.7. Samples: 15233100660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 17:11:48,559][15401] Updated weights for policy 0, policy_version 929744 (0.0022) [2024-06-25 17:11:51,378][15401] Updated weights for policy 0, policy_version 929754 (0.0035) [2024-06-25 17:11:53,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15233138688. Throughput: 0: 42685.0. Samples: 15233220000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:53,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 17:11:56,086][15401] Updated weights for policy 0, policy_version 929764 (0.0031) [2024-06-25 17:11:58,390][15132] Fps is (10 sec: 45873.4, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 15233368064. Throughput: 0: 42612.2. Samples: 15233473820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 17:11:58,391][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 17:11:59,468][15401] Updated weights for policy 0, policy_version 929774 (0.0034) [2024-06-25 17:12:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15233531904. Throughput: 0: 42874.7. Samples: 15233742140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:03,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 17:12:03,829][15401] Updated weights for policy 0, policy_version 929784 (0.0037) [2024-06-25 17:12:07,075][15401] Updated weights for policy 0, policy_version 929794 (0.0034) [2024-06-25 17:12:08,389][15132] Fps is (10 sec: 42600.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15233794048. Throughput: 0: 42782.0. Samples: 15233861200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:08,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 17:12:11,417][15401] Updated weights for policy 0, policy_version 929804 (0.0035) [2024-06-25 17:12:13,389][15132] Fps is (10 sec: 49151.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15234023424. Throughput: 0: 42767.5. Samples: 15234119040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:13,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 17:12:14,605][15401] Updated weights for policy 0, policy_version 929814 (0.0031) [2024-06-25 17:12:18,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15234170880. Throughput: 0: 42880.7. Samples: 15234387060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:18,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 17:12:19,085][15401] Updated weights for policy 0, policy_version 929824 (0.0030) [2024-06-25 17:12:22,109][15401] Updated weights for policy 0, policy_version 929834 (0.0037) [2024-06-25 17:12:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15234433024. Throughput: 0: 43021.8. Samples: 15234507280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 17:12:26,689][15401] Updated weights for policy 0, policy_version 929844 (0.0035) [2024-06-25 17:12:28,390][15132] Fps is (10 sec: 49152.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15234662400. Throughput: 0: 42863.7. Samples: 15234763740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:28,399][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 17:12:29,789][15401] Updated weights for policy 0, policy_version 929854 (0.0038) [2024-06-25 17:12:33,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 15234809856. Throughput: 0: 42801.3. Samples: 15235026720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:33,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 17:12:34,428][15401] Updated weights for policy 0, policy_version 929864 (0.0041) [2024-06-25 17:12:37,466][15401] Updated weights for policy 0, policy_version 929874 (0.0028) [2024-06-25 17:12:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15235072000. Throughput: 0: 42628.4. Samples: 15235138280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 17:12:42,167][15401] Updated weights for policy 0, policy_version 929884 (0.0029) [2024-06-25 17:12:43,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15235301376. Throughput: 0: 42823.4. Samples: 15235400860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 17:12:45,038][15401] Updated weights for policy 0, policy_version 929894 (0.0029) [2024-06-25 17:12:48,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 15235465216. Throughput: 0: 42500.8. Samples: 15235654780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:48,392][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 17:12:49,851][15401] Updated weights for policy 0, policy_version 929904 (0.0035) [2024-06-25 17:12:49,867][15349] Signal inference workers to stop experience collection... (225500 times) [2024-06-25 17:12:49,867][15349] Signal inference workers to resume experience collection... (225500 times) [2024-06-25 17:12:49,918][15401] InferenceWorker_p0-w0: stopping experience collection (225500 times) [2024-06-25 17:12:49,918][15401] InferenceWorker_p0-w0: resuming experience collection (225500 times) [2024-06-25 17:12:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 15235710976. Throughput: 0: 42530.5. Samples: 15235775080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 17:12:53,393][15401] Updated weights for policy 0, policy_version 929914 (0.0039) [2024-06-25 17:12:57,863][15401] Updated weights for policy 0, policy_version 929924 (0.0037) [2024-06-25 17:12:58,390][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.6, 300 sec: 42765.0). Total num frames: 15235907584. Throughput: 0: 42540.0. Samples: 15236033340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:12:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 17:13:00,914][15401] Updated weights for policy 0, policy_version 929934 (0.0033) [2024-06-25 17:13:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15236104192. Throughput: 0: 42273.5. Samples: 15236289360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:03,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 17:13:05,538][15401] Updated weights for policy 0, policy_version 929944 (0.0041) [2024-06-25 17:13:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15236349952. Throughput: 0: 42350.3. Samples: 15236413040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:08,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 17:13:08,547][15401] Updated weights for policy 0, policy_version 929954 (0.0038) [2024-06-25 17:13:13,150][15401] Updated weights for policy 0, policy_version 929964 (0.0025) [2024-06-25 17:13:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15236546560. Throughput: 0: 42473.4. Samples: 15236675040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 17:13:16,098][15401] Updated weights for policy 0, policy_version 929974 (0.0033) [2024-06-25 17:13:18,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15236743168. Throughput: 0: 42318.7. Samples: 15236931060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:18,395][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 17:13:20,899][15401] Updated weights for policy 0, policy_version 929984 (0.0040) [2024-06-25 17:13:23,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15237005312. Throughput: 0: 42596.1. Samples: 15237055100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:23,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 17:13:24,316][15401] Updated weights for policy 0, policy_version 929994 (0.0032) [2024-06-25 17:13:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 15237169152. Throughput: 0: 42460.0. Samples: 15237311560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:28,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 17:13:28,568][15401] Updated weights for policy 0, policy_version 930004 (0.0032) [2024-06-25 17:13:31,892][15401] Updated weights for policy 0, policy_version 930014 (0.0024) [2024-06-25 17:13:33,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15237382144. Throughput: 0: 42643.6. Samples: 15237573640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 17:13:35,963][15401] Updated weights for policy 0, policy_version 930024 (0.0035) [2024-06-25 17:13:38,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15237644288. Throughput: 0: 42717.1. Samples: 15237697340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 17:13:39,379][15401] Updated weights for policy 0, policy_version 930034 (0.0031) [2024-06-25 17:13:43,389][15401] Updated weights for policy 0, policy_version 930044 (0.0041) [2024-06-25 17:13:43,390][15132] Fps is (10 sec: 45872.6, 60 sec: 42324.9, 300 sec: 42876.0). Total num frames: 15237840896. Throughput: 0: 42739.4. Samples: 15237956640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:43,391][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 17:13:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930044_15237840896.pth... [2024-06-25 17:13:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929420_15227617280.pth [2024-06-25 17:13:46,969][15401] Updated weights for policy 0, policy_version 930054 (0.0034) [2024-06-25 17:13:48,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 15238021120. Throughput: 0: 42715.6. Samples: 15238211560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:48,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 17:13:51,208][15401] Updated weights for policy 0, policy_version 930064 (0.0021) [2024-06-25 17:13:53,390][15132] Fps is (10 sec: 44238.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15238283264. Throughput: 0: 42778.0. Samples: 15238338060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-25 17:13:53,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 17:13:54,466][15401] Updated weights for policy 0, policy_version 930074 (0.0046) [2024-06-25 17:13:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15238447104. Throughput: 0: 42790.6. Samples: 15238600620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:13:58,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 17:13:58,853][15401] Updated weights for policy 0, policy_version 930084 (0.0031) [2024-06-25 17:14:02,041][15401] Updated weights for policy 0, policy_version 930094 (0.0046) [2024-06-25 17:14:03,390][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15238676480. Throughput: 0: 42533.2. Samples: 15238845060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:03,391][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 17:14:06,287][15401] Updated weights for policy 0, policy_version 930104 (0.0036) [2024-06-25 17:14:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15238889472. Throughput: 0: 42714.2. Samples: 15238977240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:08,394][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 17:14:09,920][15401] Updated weights for policy 0, policy_version 930114 (0.0032) [2024-06-25 17:14:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15239086080. Throughput: 0: 42731.6. Samples: 15239234480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 17:14:14,172][15401] Updated weights for policy 0, policy_version 930124 (0.0038) [2024-06-25 17:14:14,190][15349] Signal inference workers to stop experience collection... (225550 times) [2024-06-25 17:14:14,190][15349] Signal inference workers to resume experience collection... (225550 times) [2024-06-25 17:14:14,214][15401] InferenceWorker_p0-w0: stopping experience collection (225550 times) [2024-06-25 17:14:14,214][15401] InferenceWorker_p0-w0: resuming experience collection (225550 times) [2024-06-25 17:14:17,439][15401] Updated weights for policy 0, policy_version 930134 (0.0027) [2024-06-25 17:14:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15239331840. Throughput: 0: 42420.5. Samples: 15239482560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:18,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 17:14:21,722][15401] Updated weights for policy 0, policy_version 930144 (0.0027) [2024-06-25 17:14:23,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15239544832. Throughput: 0: 42743.3. Samples: 15239620800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 17:14:25,436][15401] Updated weights for policy 0, policy_version 930154 (0.0025) [2024-06-25 17:14:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15239741440. Throughput: 0: 42910.3. Samples: 15239887580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:28,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 17:14:29,160][15401] Updated weights for policy 0, policy_version 930164 (0.0033) [2024-06-25 17:14:33,112][15401] Updated weights for policy 0, policy_version 930174 (0.0039) [2024-06-25 17:14:33,389][15132] Fps is (10 sec: 42599.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15239970816. Throughput: 0: 42782.3. Samples: 15240136760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 17:14:37,134][15401] Updated weights for policy 0, policy_version 930184 (0.0034) [2024-06-25 17:14:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15240200192. Throughput: 0: 42935.2. Samples: 15240270140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:38,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 17:14:40,939][15401] Updated weights for policy 0, policy_version 930194 (0.0034) [2024-06-25 17:14:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.7, 300 sec: 42709.5). Total num frames: 15240380416. Throughput: 0: 42734.1. Samples: 15240523660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:43,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 17:14:44,670][15401] Updated weights for policy 0, policy_version 930204 (0.0039) [2024-06-25 17:14:48,392][15132] Fps is (10 sec: 40950.5, 60 sec: 43142.8, 300 sec: 42653.9). Total num frames: 15240609792. Throughput: 0: 43016.9. Samples: 15240780920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:48,392][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 17:14:48,471][15401] Updated weights for policy 0, policy_version 930214 (0.0036) [2024-06-25 17:14:52,203][15401] Updated weights for policy 0, policy_version 930224 (0.0032) [2024-06-25 17:14:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15240839168. Throughput: 0: 43088.8. Samples: 15240916240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 17:14:55,985][15401] Updated weights for policy 0, policy_version 930234 (0.0038) [2024-06-25 17:14:58,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15241035776. Throughput: 0: 43064.8. Samples: 15241172400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:14:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 17:15:00,168][15401] Updated weights for policy 0, policy_version 930244 (0.0037) [2024-06-25 17:15:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15241265152. Throughput: 0: 43115.0. Samples: 15241422840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:03,393][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 17:15:03,414][15401] Updated weights for policy 0, policy_version 930254 (0.0028) [2024-06-25 17:15:08,082][15401] Updated weights for policy 0, policy_version 930264 (0.0027) [2024-06-25 17:15:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15241461760. Throughput: 0: 43001.4. Samples: 15241555860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:08,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 17:15:10,930][15401] Updated weights for policy 0, policy_version 930274 (0.0022) [2024-06-25 17:15:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15241691136. Throughput: 0: 42777.3. Samples: 15241812560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 17:15:15,511][15401] Updated weights for policy 0, policy_version 930284 (0.0036) [2024-06-25 17:15:18,390][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15241920512. Throughput: 0: 42763.4. Samples: 15242061120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 17:15:18,590][15401] Updated weights for policy 0, policy_version 930294 (0.0037) [2024-06-25 17:15:20,996][15349] Signal inference workers to stop experience collection... (225600 times) [2024-06-25 17:15:20,996][15349] Signal inference workers to resume experience collection... (225600 times) [2024-06-25 17:15:21,018][15401] InferenceWorker_p0-w0: stopping experience collection (225600 times) [2024-06-25 17:15:21,050][15401] InferenceWorker_p0-w0: resuming experience collection (225600 times) [2024-06-25 17:15:23,232][15401] Updated weights for policy 0, policy_version 930304 (0.0033) [2024-06-25 17:15:23,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15242117120. Throughput: 0: 42750.2. Samples: 15242193900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:23,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 17:15:26,176][15401] Updated weights for policy 0, policy_version 930314 (0.0024) [2024-06-25 17:15:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15242330112. Throughput: 0: 42869.9. Samples: 15242452800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:28,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 17:15:30,784][15401] Updated weights for policy 0, policy_version 930324 (0.0031) [2024-06-25 17:15:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15242559488. Throughput: 0: 42740.6. Samples: 15242704140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:33,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 17:15:34,317][15401] Updated weights for policy 0, policy_version 930334 (0.0036) [2024-06-25 17:15:38,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15242739712. Throughput: 0: 42605.0. Samples: 15242833460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 17:15:38,780][15401] Updated weights for policy 0, policy_version 930344 (0.0037) [2024-06-25 17:15:41,988][15401] Updated weights for policy 0, policy_version 930354 (0.0030) [2024-06-25 17:15:43,389][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15242969088. Throughput: 0: 42505.8. Samples: 15243085160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:43,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 17:15:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930358_15242985472.pth... [2024-06-25 17:15:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000929734_15232761856.pth [2024-06-25 17:15:46,262][15401] Updated weights for policy 0, policy_version 930364 (0.0034) [2024-06-25 17:15:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 15243182080. Throughput: 0: 42562.2. Samples: 15243338040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 17:15:48,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 17:15:49,677][15401] Updated weights for policy 0, policy_version 930374 (0.0032) [2024-06-25 17:15:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 15243378688. Throughput: 0: 42352.0. Samples: 15243461700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:15:53,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 17:15:54,322][15401] Updated weights for policy 0, policy_version 930384 (0.0039) [2024-06-25 17:15:57,735][15401] Updated weights for policy 0, policy_version 930394 (0.0034) [2024-06-25 17:15:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15243624448. Throughput: 0: 42408.4. Samples: 15243720940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:15:58,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 17:16:01,887][15401] Updated weights for policy 0, policy_version 930404 (0.0034) [2024-06-25 17:16:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 15243788288. Throughput: 0: 42707.9. Samples: 15243982980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 17:16:05,333][15401] Updated weights for policy 0, policy_version 930414 (0.0038) [2024-06-25 17:16:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15244017664. Throughput: 0: 42438.2. Samples: 15244103620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:08,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 17:16:09,323][15401] Updated weights for policy 0, policy_version 930424 (0.0032) [2024-06-25 17:16:12,791][15401] Updated weights for policy 0, policy_version 930434 (0.0044) [2024-06-25 17:16:13,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15244263424. Throughput: 0: 42544.6. Samples: 15244367320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:13,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 17:16:16,801][15401] Updated weights for policy 0, policy_version 930444 (0.0046) [2024-06-25 17:16:18,394][15132] Fps is (10 sec: 42578.6, 60 sec: 42049.0, 300 sec: 42708.8). Total num frames: 15244443648. Throughput: 0: 42705.2. Samples: 15244626080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:18,395][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 17:16:20,591][15401] Updated weights for policy 0, policy_version 930454 (0.0044) [2024-06-25 17:16:23,390][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15244673024. Throughput: 0: 42525.8. Samples: 15244747120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 17:16:24,579][15401] Updated weights for policy 0, policy_version 930464 (0.0035) [2024-06-25 17:16:28,192][15401] Updated weights for policy 0, policy_version 930474 (0.0035) [2024-06-25 17:16:28,390][15132] Fps is (10 sec: 45896.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15244902400. Throughput: 0: 42878.6. Samples: 15245014700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:28,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-25 17:16:32,035][15401] Updated weights for policy 0, policy_version 930484 (0.0022) [2024-06-25 17:16:33,390][15132] Fps is (10 sec: 40958.2, 60 sec: 42051.9, 300 sec: 42653.9). Total num frames: 15245082624. Throughput: 0: 43001.5. Samples: 15245273120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:33,391][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 17:16:35,248][15349] Signal inference workers to stop experience collection... (225650 times) [2024-06-25 17:16:35,249][15349] Signal inference workers to resume experience collection... (225650 times) [2024-06-25 17:16:35,287][15401] InferenceWorker_p0-w0: stopping experience collection (225650 times) [2024-06-25 17:16:35,287][15401] InferenceWorker_p0-w0: resuming experience collection (225650 times) [2024-06-25 17:16:35,774][15401] Updated weights for policy 0, policy_version 930494 (0.0033) [2024-06-25 17:16:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15245328384. Throughput: 0: 42957.8. Samples: 15245394800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:38,394][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:16:39,571][15401] Updated weights for policy 0, policy_version 930504 (0.0028) [2024-06-25 17:16:43,334][15401] Updated weights for policy 0, policy_version 930514 (0.0029) [2024-06-25 17:16:43,389][15132] Fps is (10 sec: 45877.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15245541376. Throughput: 0: 43254.7. Samples: 15245667400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:43,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 17:16:47,164][15401] Updated weights for policy 0, policy_version 930524 (0.0026) [2024-06-25 17:16:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 15245737984. Throughput: 0: 43052.5. Samples: 15245920440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:48,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 17:16:50,866][15401] Updated weights for policy 0, policy_version 930534 (0.0028) [2024-06-25 17:16:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.1). Total num frames: 15245983744. Throughput: 0: 43163.9. Samples: 15246046000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 17:16:54,730][15401] Updated weights for policy 0, policy_version 930544 (0.0036) [2024-06-25 17:16:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15246180352. Throughput: 0: 43284.7. Samples: 15246315120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:16:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 17:16:58,468][15401] Updated weights for policy 0, policy_version 930554 (0.0036) [2024-06-25 17:17:02,254][15401] Updated weights for policy 0, policy_version 930564 (0.0022) [2024-06-25 17:17:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15246393344. Throughput: 0: 43122.3. Samples: 15246566380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:03,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 17:17:05,986][15401] Updated weights for policy 0, policy_version 930574 (0.0023) [2024-06-25 17:17:08,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 15246639104. Throughput: 0: 43212.5. Samples: 15246691680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 17:17:10,044][15401] Updated weights for policy 0, policy_version 930584 (0.0024) [2024-06-25 17:17:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 15246819328. Throughput: 0: 43087.6. Samples: 15246953640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:13,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 17:17:13,655][15401] Updated weights for policy 0, policy_version 930594 (0.0022) [2024-06-25 17:17:18,033][15401] Updated weights for policy 0, policy_version 930604 (0.0036) [2024-06-25 17:17:18,390][15132] Fps is (10 sec: 39320.8, 60 sec: 43147.8, 300 sec: 42709.5). Total num frames: 15247032320. Throughput: 0: 43038.1. Samples: 15247209820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 17:17:21,263][15401] Updated weights for policy 0, policy_version 930614 (0.0027) [2024-06-25 17:17:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15247278080. Throughput: 0: 43120.4. Samples: 15247335220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:23,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 17:17:25,571][15401] Updated weights for policy 0, policy_version 930624 (0.0034) [2024-06-25 17:17:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15247458304. Throughput: 0: 42982.3. Samples: 15247601600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:28,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 17:17:28,885][15401] Updated weights for policy 0, policy_version 930634 (0.0041) [2024-06-25 17:17:33,000][15401] Updated weights for policy 0, policy_version 930644 (0.0028) [2024-06-25 17:17:33,390][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.8, 300 sec: 42709.5). Total num frames: 15247671296. Throughput: 0: 42829.8. Samples: 15247847680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:33,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 17:17:36,452][15401] Updated weights for policy 0, policy_version 930654 (0.0037) [2024-06-25 17:17:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15247917056. Throughput: 0: 42949.0. Samples: 15247978700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:38,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 17:17:40,918][15401] Updated weights for policy 0, policy_version 930664 (0.0038) [2024-06-25 17:17:43,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 15248097280. Throughput: 0: 42750.3. Samples: 15248238880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:17:43,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 17:17:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930670_15248097280.pth... [2024-06-25 17:17:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930044_15237840896.pth [2024-06-25 17:17:44,223][15401] Updated weights for policy 0, policy_version 930674 (0.0028) [2024-06-25 17:17:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 15248310272. Throughput: 0: 42706.6. Samples: 15248488180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:17:48,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 17:17:48,486][15401] Updated weights for policy 0, policy_version 930684 (0.0031) [2024-06-25 17:17:51,197][15349] Signal inference workers to stop experience collection... (225700 times) [2024-06-25 17:17:51,250][15401] InferenceWorker_p0-w0: stopping experience collection (225700 times) [2024-06-25 17:17:51,309][15349] Signal inference workers to resume experience collection... (225700 times) [2024-06-25 17:17:51,309][15401] InferenceWorker_p0-w0: resuming experience collection (225700 times) [2024-06-25 17:17:51,969][15401] Updated weights for policy 0, policy_version 930694 (0.0035) [2024-06-25 17:17:53,390][15132] Fps is (10 sec: 45873.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15248556032. Throughput: 0: 42842.5. Samples: 15248619600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:17:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 17:17:56,242][15401] Updated weights for policy 0, policy_version 930704 (0.0046) [2024-06-25 17:17:58,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15248719872. Throughput: 0: 42585.6. Samples: 15248870000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:17:58,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 17:17:59,740][15401] Updated weights for policy 0, policy_version 930714 (0.0035) [2024-06-25 17:18:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15248949248. Throughput: 0: 42497.8. Samples: 15249122220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 17:18:04,061][15401] Updated weights for policy 0, policy_version 930724 (0.0032) [2024-06-25 17:18:07,335][15401] Updated weights for policy 0, policy_version 930734 (0.0033) [2024-06-25 17:18:08,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15249178624. Throughput: 0: 42738.3. Samples: 15249258440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:08,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 17:18:11,866][15401] Updated weights for policy 0, policy_version 930744 (0.0033) [2024-06-25 17:18:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15249358848. Throughput: 0: 42465.7. Samples: 15249512560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:13,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 17:18:15,287][15401] Updated weights for policy 0, policy_version 930754 (0.0030) [2024-06-25 17:18:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15249604608. Throughput: 0: 42527.1. Samples: 15249761400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:18,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 17:18:19,602][15401] Updated weights for policy 0, policy_version 930764 (0.0037) [2024-06-25 17:18:23,033][15401] Updated weights for policy 0, policy_version 930774 (0.0034) [2024-06-25 17:18:23,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15249833984. Throughput: 0: 42525.8. Samples: 15249892360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 17:18:27,404][15401] Updated weights for policy 0, policy_version 930784 (0.0032) [2024-06-25 17:18:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15249997824. Throughput: 0: 42354.2. Samples: 15250144820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 17:18:30,661][15401] Updated weights for policy 0, policy_version 930794 (0.0038) [2024-06-25 17:18:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15250243584. Throughput: 0: 42415.1. Samples: 15250396860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:33,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 17:18:35,134][15401] Updated weights for policy 0, policy_version 930804 (0.0022) [2024-06-25 17:18:38,292][15401] Updated weights for policy 0, policy_version 930814 (0.0040) [2024-06-25 17:18:38,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 15250456576. Throughput: 0: 42614.0. Samples: 15250537220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:38,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 17:18:42,491][15401] Updated weights for policy 0, policy_version 930824 (0.0040) [2024-06-25 17:18:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15250636800. Throughput: 0: 42702.4. Samples: 15250791600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 17:18:45,883][15401] Updated weights for policy 0, policy_version 930834 (0.0038) [2024-06-25 17:18:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15250915328. Throughput: 0: 42753.0. Samples: 15251046100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:48,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 17:18:49,860][15401] Updated weights for policy 0, policy_version 930844 (0.0036) [2024-06-25 17:18:53,390][15132] Fps is (10 sec: 45874.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15251095552. Throughput: 0: 42749.2. Samples: 15251182160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:53,390][15132] Avg episode reward: [(0, '0.267')] [2024-06-25 17:18:53,480][15401] Updated weights for policy 0, policy_version 930854 (0.0025) [2024-06-25 17:18:57,273][15401] Updated weights for policy 0, policy_version 930864 (0.0026) [2024-06-25 17:18:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15251308544. Throughput: 0: 42770.1. Samples: 15251437220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:18:58,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 17:19:00,846][15349] Signal inference workers to stop experience collection... (225750 times) [2024-06-25 17:19:00,852][15349] Signal inference workers to resume experience collection... (225750 times) [2024-06-25 17:19:00,863][15401] InferenceWorker_p0-w0: stopping experience collection (225750 times) [2024-06-25 17:19:00,893][15401] InferenceWorker_p0-w0: resuming experience collection (225750 times) [2024-06-25 17:19:01,016][15401] Updated weights for policy 0, policy_version 930874 (0.0031) [2024-06-25 17:19:03,390][15132] Fps is (10 sec: 45875.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 15251554304. Throughput: 0: 42877.3. Samples: 15251690880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:03,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 17:19:04,668][15401] Updated weights for policy 0, policy_version 930884 (0.0034) [2024-06-25 17:19:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15251734528. Throughput: 0: 43111.1. Samples: 15251832360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:08,396][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 17:19:08,683][15401] Updated weights for policy 0, policy_version 930894 (0.0026) [2024-06-25 17:19:12,343][15401] Updated weights for policy 0, policy_version 930904 (0.0043) [2024-06-25 17:19:13,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15251947520. Throughput: 0: 43055.5. Samples: 15252082320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:13,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 17:19:16,307][15401] Updated weights for policy 0, policy_version 930914 (0.0040) [2024-06-25 17:19:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15252193280. Throughput: 0: 42954.2. Samples: 15252329800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:18,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 17:19:19,798][15401] Updated weights for policy 0, policy_version 930924 (0.0022) [2024-06-25 17:19:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15252357120. Throughput: 0: 42923.4. Samples: 15252468780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 17:19:24,063][15401] Updated weights for policy 0, policy_version 930934 (0.0028) [2024-06-25 17:19:27,855][15401] Updated weights for policy 0, policy_version 930944 (0.0026) [2024-06-25 17:19:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15252586496. Throughput: 0: 42622.5. Samples: 15252709620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:28,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 17:19:31,785][15401] Updated weights for policy 0, policy_version 930954 (0.0030) [2024-06-25 17:19:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15252832256. Throughput: 0: 42580.7. Samples: 15252962240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:33,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 17:19:35,949][15401] Updated weights for policy 0, policy_version 930964 (0.0029) [2024-06-25 17:19:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15252996096. Throughput: 0: 42606.0. Samples: 15253099420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 17:19:38,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 17:19:39,477][15401] Updated weights for policy 0, policy_version 930974 (0.0030) [2024-06-25 17:19:43,390][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.4, 300 sec: 42765.3). Total num frames: 15253225472. Throughput: 0: 42522.2. Samples: 15253350720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:19:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 17:19:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930983_15253225472.pth... [2024-06-25 17:19:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930358_15242985472.pth [2024-06-25 17:19:43,684][15401] Updated weights for policy 0, policy_version 930984 (0.0034) [2024-06-25 17:19:47,494][15401] Updated weights for policy 0, policy_version 930994 (0.0046) [2024-06-25 17:19:48,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15253471232. Throughput: 0: 42555.5. Samples: 15253605880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:19:48,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 17:19:51,320][15401] Updated weights for policy 0, policy_version 931004 (0.0032) [2024-06-25 17:19:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15253651456. Throughput: 0: 42427.6. Samples: 15253741600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:19:53,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 17:19:55,131][15401] Updated weights for policy 0, policy_version 931014 (0.0043) [2024-06-25 17:19:58,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 15253864448. Throughput: 0: 42473.4. Samples: 15253993620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:19:58,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 17:19:59,038][15401] Updated weights for policy 0, policy_version 931024 (0.0027) [2024-06-25 17:20:02,791][15401] Updated weights for policy 0, policy_version 931034 (0.0038) [2024-06-25 17:20:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15254077440. Throughput: 0: 42531.6. Samples: 15254243720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:03,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 17:20:06,646][15401] Updated weights for policy 0, policy_version 931044 (0.0037) [2024-06-25 17:20:07,792][15349] Signal inference workers to stop experience collection... (225800 times) [2024-06-25 17:20:07,834][15401] InferenceWorker_p0-w0: stopping experience collection (225800 times) [2024-06-25 17:20:07,846][15349] Signal inference workers to resume experience collection... (225800 times) [2024-06-25 17:20:07,854][15401] InferenceWorker_p0-w0: resuming experience collection (225800 times) [2024-06-25 17:20:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15254290432. Throughput: 0: 42423.6. Samples: 15254377840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:08,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 17:20:10,703][15401] Updated weights for policy 0, policy_version 931054 (0.0032) [2024-06-25 17:20:13,390][15132] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 15254519808. Throughput: 0: 42696.7. Samples: 15254630980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:13,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 17:20:14,418][15401] Updated weights for policy 0, policy_version 931064 (0.0037) [2024-06-25 17:20:18,222][15401] Updated weights for policy 0, policy_version 931074 (0.0029) [2024-06-25 17:20:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15254732800. Throughput: 0: 42900.6. Samples: 15254892760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:18,394][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 17:20:21,964][15401] Updated weights for policy 0, policy_version 931084 (0.0036) [2024-06-25 17:20:23,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15254945792. Throughput: 0: 42600.8. Samples: 15255016460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:23,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 17:20:25,798][15401] Updated weights for policy 0, policy_version 931094 (0.0036) [2024-06-25 17:20:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15255158784. Throughput: 0: 42793.0. Samples: 15255276400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:28,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 17:20:29,520][15401] Updated weights for policy 0, policy_version 931104 (0.0046) [2024-06-25 17:20:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 15255355392. Throughput: 0: 42930.3. Samples: 15255537740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:33,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 17:20:33,735][15401] Updated weights for policy 0, policy_version 931114 (0.0028) [2024-06-25 17:20:37,125][15401] Updated weights for policy 0, policy_version 931124 (0.0029) [2024-06-25 17:20:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 15255601152. Throughput: 0: 42629.7. Samples: 15255659940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 17:20:41,243][15401] Updated weights for policy 0, policy_version 931134 (0.0037) [2024-06-25 17:20:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15255797760. Throughput: 0: 42859.5. Samples: 15255922300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:43,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 17:20:44,642][15401] Updated weights for policy 0, policy_version 931144 (0.0025) [2024-06-25 17:20:48,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15255994368. Throughput: 0: 43081.3. Samples: 15256182380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:48,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 17:20:48,802][15401] Updated weights for policy 0, policy_version 931154 (0.0039) [2024-06-25 17:20:52,059][15401] Updated weights for policy 0, policy_version 931164 (0.0030) [2024-06-25 17:20:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15256240128. Throughput: 0: 42761.8. Samples: 15256302120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 17:20:56,494][15401] Updated weights for policy 0, policy_version 931174 (0.0029) [2024-06-25 17:20:58,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.7, 300 sec: 42931.3). Total num frames: 15256453120. Throughput: 0: 43052.1. Samples: 15256568420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:20:58,393][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 17:20:59,778][15401] Updated weights for policy 0, policy_version 931184 (0.0028) [2024-06-25 17:21:03,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15256633344. Throughput: 0: 43045.4. Samples: 15256829800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 17:21:04,051][15401] Updated weights for policy 0, policy_version 931194 (0.0031) [2024-06-25 17:21:07,334][15401] Updated weights for policy 0, policy_version 931204 (0.0048) [2024-06-25 17:21:08,390][15132] Fps is (10 sec: 44247.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15256895488. Throughput: 0: 42913.8. Samples: 15256947580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:08,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 17:21:11,637][15401] Updated weights for policy 0, policy_version 931214 (0.0027) [2024-06-25 17:21:13,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.7, 300 sec: 42876.8). Total num frames: 15257092096. Throughput: 0: 43021.4. Samples: 15257212360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:13,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 17:21:14,887][15401] Updated weights for policy 0, policy_version 931224 (0.0033) [2024-06-25 17:21:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15257288704. Throughput: 0: 42888.9. Samples: 15257467740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 17:21:19,718][15401] Updated weights for policy 0, policy_version 931234 (0.0027) [2024-06-25 17:21:22,509][15401] Updated weights for policy 0, policy_version 931244 (0.0048) [2024-06-25 17:21:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15257534464. Throughput: 0: 42912.5. Samples: 15257591000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 17:21:27,431][15401] Updated weights for policy 0, policy_version 931254 (0.0031) [2024-06-25 17:21:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.2). Total num frames: 15257731072. Throughput: 0: 42912.8. Samples: 15257853380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:28,390][15132] Avg episode reward: [(0, '0.189')] [2024-06-25 17:21:29,810][15349] Signal inference workers to stop experience collection... (225850 times) [2024-06-25 17:21:29,867][15401] InferenceWorker_p0-w0: stopping experience collection (225850 times) [2024-06-25 17:21:29,930][15349] Signal inference workers to resume experience collection... (225850 times) [2024-06-25 17:21:29,931][15401] InferenceWorker_p0-w0: resuming experience collection (225850 times) [2024-06-25 17:21:30,245][15401] Updated weights for policy 0, policy_version 931264 (0.0025) [2024-06-25 17:21:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15257944064. Throughput: 0: 42695.0. Samples: 15258103660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:33,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 17:21:35,183][15401] Updated weights for policy 0, policy_version 931274 (0.0031) [2024-06-25 17:21:37,850][15401] Updated weights for policy 0, policy_version 931284 (0.0045) [2024-06-25 17:21:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15258173440. Throughput: 0: 42967.5. Samples: 15258235660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:21:38,392][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 17:21:42,873][15401] Updated weights for policy 0, policy_version 931294 (0.0037) [2024-06-25 17:21:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42765.3). Total num frames: 15258353664. Throughput: 0: 42737.7. Samples: 15258491520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:21:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 17:21:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931296_15258353664.pth... [2024-06-25 17:21:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930670_15248097280.pth [2024-06-25 17:21:45,716][15401] Updated weights for policy 0, policy_version 931304 (0.0036) [2024-06-25 17:21:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15258583040. Throughput: 0: 42529.7. Samples: 15258743640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:21:48,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 17:21:50,490][15401] Updated weights for policy 0, policy_version 931314 (0.0026) [2024-06-25 17:21:53,316][15401] Updated weights for policy 0, policy_version 931324 (0.0034) [2024-06-25 17:21:53,389][15132] Fps is (10 sec: 45876.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15258812416. Throughput: 0: 42836.5. Samples: 15258875220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:21:53,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 17:21:58,117][15401] Updated weights for policy 0, policy_version 931334 (0.0026) [2024-06-25 17:21:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 15258992640. Throughput: 0: 42495.1. Samples: 15259124640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:21:58,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 17:22:01,412][15401] Updated weights for policy 0, policy_version 931344 (0.0032) [2024-06-25 17:22:03,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15259205632. Throughput: 0: 42376.0. Samples: 15259374660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:03,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 17:22:05,845][15401] Updated weights for policy 0, policy_version 931354 (0.0027) [2024-06-25 17:22:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15259418624. Throughput: 0: 42497.0. Samples: 15259503360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 17:22:09,067][15401] Updated weights for policy 0, policy_version 931364 (0.0033) [2024-06-25 17:22:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 15259615232. Throughput: 0: 42395.2. Samples: 15259761160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:13,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 17:22:13,473][15401] Updated weights for policy 0, policy_version 931374 (0.0037) [2024-06-25 17:22:16,655][15401] Updated weights for policy 0, policy_version 931384 (0.0026) [2024-06-25 17:22:18,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15259860992. Throughput: 0: 42268.5. Samples: 15260005740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:18,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 17:22:21,211][15401] Updated weights for policy 0, policy_version 931394 (0.0038) [2024-06-25 17:22:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15260041216. Throughput: 0: 42410.2. Samples: 15260144120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 17:22:24,581][15401] Updated weights for policy 0, policy_version 931404 (0.0043) [2024-06-25 17:22:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15260270592. Throughput: 0: 42312.6. Samples: 15260395580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:28,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 17:22:28,787][15401] Updated weights for policy 0, policy_version 931414 (0.0027) [2024-06-25 17:22:31,988][15401] Updated weights for policy 0, policy_version 931424 (0.0042) [2024-06-25 17:22:33,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15260499968. Throughput: 0: 42439.1. Samples: 15260653400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:33,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 17:22:36,364][15401] Updated weights for policy 0, policy_version 931434 (0.0034) [2024-06-25 17:22:38,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42050.6, 300 sec: 42709.1). Total num frames: 15260696576. Throughput: 0: 42369.2. Samples: 15260781940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:38,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 17:22:39,733][15401] Updated weights for policy 0, policy_version 931444 (0.0032) [2024-06-25 17:22:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 15260893184. Throughput: 0: 42511.1. Samples: 15261037640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:43,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 17:22:44,226][15401] Updated weights for policy 0, policy_version 931454 (0.0038) [2024-06-25 17:22:46,868][15349] Signal inference workers to stop experience collection... (225900 times) [2024-06-25 17:22:46,876][15349] Signal inference workers to resume experience collection... (225900 times) [2024-06-25 17:22:46,908][15401] InferenceWorker_p0-w0: stopping experience collection (225900 times) [2024-06-25 17:22:46,908][15401] InferenceWorker_p0-w0: resuming experience collection (225900 times) [2024-06-25 17:22:47,451][15401] Updated weights for policy 0, policy_version 931464 (0.0042) [2024-06-25 17:22:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15261138944. Throughput: 0: 42404.0. Samples: 15261282840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 17:22:51,793][15401] Updated weights for policy 0, policy_version 931474 (0.0036) [2024-06-25 17:22:53,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15261335552. Throughput: 0: 42702.6. Samples: 15261424980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 17:22:55,284][15401] Updated weights for policy 0, policy_version 931484 (0.0044) [2024-06-25 17:22:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15261532160. Throughput: 0: 42527.6. Samples: 15261674900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:22:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 17:22:59,531][15401] Updated weights for policy 0, policy_version 931494 (0.0035) [2024-06-25 17:23:02,867][15401] Updated weights for policy 0, policy_version 931504 (0.0027) [2024-06-25 17:23:03,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15261794304. Throughput: 0: 42704.8. Samples: 15261927460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:03,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 17:23:07,072][15401] Updated weights for policy 0, policy_version 931514 (0.0029) [2024-06-25 17:23:08,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15261974528. Throughput: 0: 42674.7. Samples: 15262064480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:08,390][15132] Avg episode reward: [(0, '0.243')] [2024-06-25 17:23:10,500][15401] Updated weights for policy 0, policy_version 931524 (0.0034) [2024-06-25 17:23:13,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15262187520. Throughput: 0: 42752.4. Samples: 15262319440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:13,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 17:23:14,651][15401] Updated weights for policy 0, policy_version 931534 (0.0037) [2024-06-25 17:23:18,026][15401] Updated weights for policy 0, policy_version 931544 (0.0039) [2024-06-25 17:23:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15262433280. Throughput: 0: 42608.5. Samples: 15262570780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:18,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 17:23:22,602][15401] Updated weights for policy 0, policy_version 931554 (0.0036) [2024-06-25 17:23:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15262613504. Throughput: 0: 42728.1. Samples: 15262704600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:23,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 17:23:25,792][15401] Updated weights for policy 0, policy_version 931564 (0.0028) [2024-06-25 17:23:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15262826496. Throughput: 0: 42708.5. Samples: 15262959520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 17:23:30,031][15401] Updated weights for policy 0, policy_version 931574 (0.0036) [2024-06-25 17:23:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15263055872. Throughput: 0: 43079.9. Samples: 15263221440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-25 17:23:33,399][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 17:23:33,594][15401] Updated weights for policy 0, policy_version 931584 (0.0033) [2024-06-25 17:23:37,424][15401] Updated weights for policy 0, policy_version 931594 (0.0033) [2024-06-25 17:23:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 15263268864. Throughput: 0: 42753.3. Samples: 15263348880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:23:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 17:23:41,068][15401] Updated weights for policy 0, policy_version 931604 (0.0037) [2024-06-25 17:23:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15263481856. Throughput: 0: 42888.4. Samples: 15263604880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:23:43,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 17:23:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931609_15263481856.pth... [2024-06-25 17:23:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000930983_15253225472.pth [2024-06-25 17:23:44,828][15401] Updated weights for policy 0, policy_version 931614 (0.0047) [2024-06-25 17:23:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15263678464. Throughput: 0: 43089.0. Samples: 15263866460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:23:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 17:23:48,808][15401] Updated weights for policy 0, policy_version 931624 (0.0037) [2024-06-25 17:23:52,422][15401] Updated weights for policy 0, policy_version 931634 (0.0027) [2024-06-25 17:23:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15263907840. Throughput: 0: 42834.7. Samples: 15263992040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:23:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 17:23:56,411][15401] Updated weights for policy 0, policy_version 931644 (0.0040) [2024-06-25 17:23:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15264120832. Throughput: 0: 42804.4. Samples: 15264245640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:23:58,391][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 17:23:58,669][15349] Signal inference workers to stop experience collection... (225950 times) [2024-06-25 17:23:58,671][15349] Signal inference workers to resume experience collection... (225950 times) [2024-06-25 17:23:58,692][15401] InferenceWorker_p0-w0: stopping experience collection (225950 times) [2024-06-25 17:23:58,720][15401] InferenceWorker_p0-w0: resuming experience collection (225950 times) [2024-06-25 17:24:00,205][15401] Updated weights for policy 0, policy_version 931654 (0.0031) [2024-06-25 17:24:03,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15264317440. Throughput: 0: 42969.1. Samples: 15264504400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:03,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 17:24:04,501][15401] Updated weights for policy 0, policy_version 931664 (0.0040) [2024-06-25 17:24:07,848][15401] Updated weights for policy 0, policy_version 931674 (0.0045) [2024-06-25 17:24:08,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15264563200. Throughput: 0: 42879.1. Samples: 15264634160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 17:24:12,035][15401] Updated weights for policy 0, policy_version 931684 (0.0037) [2024-06-25 17:24:13,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15264776192. Throughput: 0: 42861.3. Samples: 15264888280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 17:24:15,534][15401] Updated weights for policy 0, policy_version 931694 (0.0037) [2024-06-25 17:24:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15264972800. Throughput: 0: 42850.3. Samples: 15265149700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:18,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 17:24:19,703][15401] Updated weights for policy 0, policy_version 931704 (0.0037) [2024-06-25 17:24:23,330][15401] Updated weights for policy 0, policy_version 931714 (0.0038) [2024-06-25 17:24:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15265202176. Throughput: 0: 42681.3. Samples: 15265269540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:23,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 17:24:27,440][15401] Updated weights for policy 0, policy_version 931724 (0.0038) [2024-06-25 17:24:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15265398784. Throughput: 0: 42688.9. Samples: 15265525880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:28,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 17:24:31,042][15401] Updated weights for policy 0, policy_version 931734 (0.0034) [2024-06-25 17:24:33,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42869.7, 300 sec: 42820.2). Total num frames: 15265628160. Throughput: 0: 42566.9. Samples: 15265782080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:33,393][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 17:24:34,823][15401] Updated weights for policy 0, policy_version 931744 (0.0035) [2024-06-25 17:24:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15265841152. Throughput: 0: 42715.5. Samples: 15265914240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:38,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 17:24:38,806][15401] Updated weights for policy 0, policy_version 931754 (0.0037) [2024-06-25 17:24:42,802][15401] Updated weights for policy 0, policy_version 931764 (0.0032) [2024-06-25 17:24:43,390][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15266054144. Throughput: 0: 42748.5. Samples: 15266169320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:43,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 17:24:46,355][15401] Updated weights for policy 0, policy_version 931774 (0.0028) [2024-06-25 17:24:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15266250752. Throughput: 0: 42775.7. Samples: 15266429300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:48,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 17:24:50,308][15401] Updated weights for policy 0, policy_version 931784 (0.0034) [2024-06-25 17:24:53,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15266480128. Throughput: 0: 42617.3. Samples: 15266551940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:24:54,274][15401] Updated weights for policy 0, policy_version 931794 (0.0028) [2024-06-25 17:24:57,864][15401] Updated weights for policy 0, policy_version 931804 (0.0026) [2024-06-25 17:24:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15266693120. Throughput: 0: 42591.5. Samples: 15266804900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:24:58,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 17:25:02,090][15401] Updated weights for policy 0, policy_version 931814 (0.0032) [2024-06-25 17:25:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15266906112. Throughput: 0: 42433.7. Samples: 15267059220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:03,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 17:25:05,415][15401] Updated weights for policy 0, policy_version 931824 (0.0028) [2024-06-25 17:25:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15267102720. Throughput: 0: 42636.1. Samples: 15267188160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 17:25:09,761][15401] Updated weights for policy 0, policy_version 931834 (0.0037) [2024-06-25 17:25:13,071][15401] Updated weights for policy 0, policy_version 931844 (0.0033) [2024-06-25 17:25:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15267348480. Throughput: 0: 42688.8. Samples: 15267446880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:13,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 17:25:17,527][15401] Updated weights for policy 0, policy_version 931854 (0.0031) [2024-06-25 17:25:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15267528704. Throughput: 0: 42673.6. Samples: 15267702280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:18,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 17:25:20,812][15401] Updated weights for policy 0, policy_version 931864 (0.0022) [2024-06-25 17:25:23,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15267725312. Throughput: 0: 42407.1. Samples: 15267822560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:23,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 17:25:25,362][15401] Updated weights for policy 0, policy_version 931874 (0.0039) [2024-06-25 17:25:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15267971072. Throughput: 0: 42403.2. Samples: 15268077460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:25:28,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 17:25:28,653][15401] Updated weights for policy 0, policy_version 931884 (0.0040) [2024-06-25 17:25:32,132][15349] Signal inference workers to stop experience collection... (226000 times) [2024-06-25 17:25:32,169][15401] InferenceWorker_p0-w0: stopping experience collection (226000 times) [2024-06-25 17:25:32,188][15349] Signal inference workers to resume experience collection... (226000 times) [2024-06-25 17:25:32,190][15401] InferenceWorker_p0-w0: resuming experience collection (226000 times) [2024-06-25 17:25:33,118][15401] Updated weights for policy 0, policy_version 931894 (0.0030) [2024-06-25 17:25:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 15268167680. Throughput: 0: 42472.4. Samples: 15268340560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 17:25:36,361][15401] Updated weights for policy 0, policy_version 931904 (0.0035) [2024-06-25 17:25:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15268397056. Throughput: 0: 42467.9. Samples: 15268463000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:38,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 17:25:40,756][15401] Updated weights for policy 0, policy_version 931914 (0.0034) [2024-06-25 17:25:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15268593664. Throughput: 0: 42542.6. Samples: 15268719320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:43,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 17:25:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931921_15268593664.pth... [2024-06-25 17:25:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931296_15258353664.pth [2024-06-25 17:25:44,294][15401] Updated weights for policy 0, policy_version 931924 (0.0041) [2024-06-25 17:25:48,337][15401] Updated weights for policy 0, policy_version 931934 (0.0040) [2024-06-25 17:25:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15268806656. Throughput: 0: 42553.8. Samples: 15268974140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:48,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 17:25:51,905][15401] Updated weights for policy 0, policy_version 931944 (0.0039) [2024-06-25 17:25:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15269036032. Throughput: 0: 42523.9. Samples: 15269101740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:53,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 17:25:56,074][15401] Updated weights for policy 0, policy_version 931954 (0.0038) [2024-06-25 17:25:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15269232640. Throughput: 0: 42521.9. Samples: 15269360360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:25:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 17:25:59,578][15401] Updated weights for policy 0, policy_version 931964 (0.0050) [2024-06-25 17:26:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15269445632. Throughput: 0: 42547.9. Samples: 15269616940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:03,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 17:26:03,531][15401] Updated weights for policy 0, policy_version 931974 (0.0039) [2024-06-25 17:26:07,205][15401] Updated weights for policy 0, policy_version 931984 (0.0039) [2024-06-25 17:26:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15269658624. Throughput: 0: 42660.5. Samples: 15269742280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 17:26:11,284][15401] Updated weights for policy 0, policy_version 931994 (0.0041) [2024-06-25 17:26:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15269871616. Throughput: 0: 42701.7. Samples: 15269999040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:13,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 17:26:14,999][15401] Updated weights for policy 0, policy_version 932004 (0.0030) [2024-06-25 17:26:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15270084608. Throughput: 0: 42498.7. Samples: 15270253000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:26:18,874][15401] Updated weights for policy 0, policy_version 932014 (0.0037) [2024-06-25 17:26:22,622][15401] Updated weights for policy 0, policy_version 932024 (0.0029) [2024-06-25 17:26:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15270313984. Throughput: 0: 42612.8. Samples: 15270380580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 17:26:26,487][15401] Updated weights for policy 0, policy_version 932034 (0.0038) [2024-06-25 17:26:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15270510592. Throughput: 0: 42661.4. Samples: 15270639080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 17:26:30,410][15401] Updated weights for policy 0, policy_version 932044 (0.0038) [2024-06-25 17:26:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15270739968. Throughput: 0: 42639.2. Samples: 15270892900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:33,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 17:26:34,259][15401] Updated weights for policy 0, policy_version 932054 (0.0033) [2024-06-25 17:26:38,106][15401] Updated weights for policy 0, policy_version 932064 (0.0025) [2024-06-25 17:26:38,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15270952960. Throughput: 0: 42733.7. Samples: 15271024760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 17:26:41,788][15401] Updated weights for policy 0, policy_version 932074 (0.0025) [2024-06-25 17:26:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 15271133184. Throughput: 0: 42630.3. Samples: 15271278720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:43,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 17:26:45,864][15401] Updated weights for policy 0, policy_version 932084 (0.0031) [2024-06-25 17:26:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15271411712. Throughput: 0: 42481.8. Samples: 15271528620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 17:26:49,245][15401] Updated weights for policy 0, policy_version 932094 (0.0023) [2024-06-25 17:26:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15271575552. Throughput: 0: 42696.1. Samples: 15271663600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:53,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 17:26:53,444][15401] Updated weights for policy 0, policy_version 932104 (0.0039) [2024-06-25 17:26:56,752][15401] Updated weights for policy 0, policy_version 932114 (0.0032) [2024-06-25 17:26:58,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15271788544. Throughput: 0: 42592.9. Samples: 15271915720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:26:58,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 17:27:01,238][15401] Updated weights for policy 0, policy_version 932124 (0.0033) [2024-06-25 17:27:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15272034304. Throughput: 0: 42572.5. Samples: 15272168760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:27:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 17:27:04,717][15401] Updated weights for policy 0, policy_version 932134 (0.0031) [2024-06-25 17:27:06,357][15349] Signal inference workers to stop experience collection... (226050 times) [2024-06-25 17:27:06,404][15401] InferenceWorker_p0-w0: stopping experience collection (226050 times) [2024-06-25 17:27:06,472][15349] Signal inference workers to resume experience collection... (226050 times) [2024-06-25 17:27:06,472][15401] InferenceWorker_p0-w0: resuming experience collection (226050 times) [2024-06-25 17:27:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15272198144. Throughput: 0: 42900.5. Samples: 15272311100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:27:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 17:27:08,991][15401] Updated weights for policy 0, policy_version 932144 (0.0030) [2024-06-25 17:27:12,312][15401] Updated weights for policy 0, policy_version 932154 (0.0034) [2024-06-25 17:27:13,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15272411136. Throughput: 0: 42531.5. Samples: 15272553000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:27:13,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 17:27:16,455][15401] Updated weights for policy 0, policy_version 932164 (0.0039) [2024-06-25 17:27:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15272673280. Throughput: 0: 42673.4. Samples: 15272813200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:27:18,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 17:27:19,675][15401] Updated weights for policy 0, policy_version 932174 (0.0044) [2024-06-25 17:27:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15272853504. Throughput: 0: 42814.3. Samples: 15272951400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 17:27:23,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 17:27:24,183][15401] Updated weights for policy 0, policy_version 932184 (0.0035) [2024-06-25 17:27:27,868][15401] Updated weights for policy 0, policy_version 932194 (0.0041) [2024-06-25 17:27:28,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15273066496. Throughput: 0: 42624.9. Samples: 15273196840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 17:27:31,932][15401] Updated weights for policy 0, policy_version 932204 (0.0044) [2024-06-25 17:27:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15273312256. Throughput: 0: 42736.0. Samples: 15273451740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:33,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 17:27:35,287][15401] Updated weights for policy 0, policy_version 932214 (0.0033) [2024-06-25 17:27:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15273492480. Throughput: 0: 42811.0. Samples: 15273590100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:38,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 17:27:39,609][15401] Updated weights for policy 0, policy_version 932224 (0.0026) [2024-06-25 17:27:42,858][15401] Updated weights for policy 0, policy_version 932234 (0.0033) [2024-06-25 17:27:43,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15273721856. Throughput: 0: 42680.9. Samples: 15273836360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:43,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 17:27:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932234_15273721856.pth... [2024-06-25 17:27:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931609_15263481856.pth [2024-06-25 17:27:47,203][15401] Updated weights for policy 0, policy_version 932244 (0.0037) [2024-06-25 17:27:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15273951232. Throughput: 0: 42823.1. Samples: 15274095800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:48,390][15132] Avg episode reward: [(0, '0.180')] [2024-06-25 17:27:50,302][15401] Updated weights for policy 0, policy_version 932254 (0.0048) [2024-06-25 17:27:53,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15274115072. Throughput: 0: 42687.5. Samples: 15274232040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:53,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 17:27:54,699][15401] Updated weights for policy 0, policy_version 932264 (0.0049) [2024-06-25 17:27:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15274360832. Throughput: 0: 42791.6. Samples: 15274478620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:27:58,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 17:27:58,609][15401] Updated weights for policy 0, policy_version 932274 (0.0046) [2024-06-25 17:28:02,352][15401] Updated weights for policy 0, policy_version 932284 (0.0039) [2024-06-25 17:28:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15274590208. Throughput: 0: 42792.4. Samples: 15274738860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:03,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 17:28:06,713][15401] Updated weights for policy 0, policy_version 932294 (0.0034) [2024-06-25 17:28:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15274770432. Throughput: 0: 42730.3. Samples: 15274874260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:08,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 17:28:10,010][15401] Updated weights for policy 0, policy_version 932304 (0.0034) [2024-06-25 17:28:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 15275016192. Throughput: 0: 42794.6. Samples: 15275122600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 17:28:14,349][15401] Updated weights for policy 0, policy_version 932314 (0.0023) [2024-06-25 17:28:17,570][15401] Updated weights for policy 0, policy_version 932324 (0.0032) [2024-06-25 17:28:18,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15275212800. Throughput: 0: 43050.7. Samples: 15275389020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:28:20,072][15349] Signal inference workers to stop experience collection... (226100 times) [2024-06-25 17:28:20,079][15349] Signal inference workers to resume experience collection... (226100 times) [2024-06-25 17:28:20,112][15401] InferenceWorker_p0-w0: stopping experience collection (226100 times) [2024-06-25 17:28:20,112][15401] InferenceWorker_p0-w0: resuming experience collection (226100 times) [2024-06-25 17:28:22,037][15401] Updated weights for policy 0, policy_version 932334 (0.0028) [2024-06-25 17:28:23,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15275409408. Throughput: 0: 42643.6. Samples: 15275509060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 17:28:25,687][15401] Updated weights for policy 0, policy_version 932344 (0.0037) [2024-06-25 17:28:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15275671552. Throughput: 0: 42792.8. Samples: 15275762040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 17:28:29,687][15401] Updated weights for policy 0, policy_version 932354 (0.0026) [2024-06-25 17:28:33,294][15401] Updated weights for policy 0, policy_version 932364 (0.0039) [2024-06-25 17:28:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15275851776. Throughput: 0: 42911.0. Samples: 15276026800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 17:28:37,704][15401] Updated weights for policy 0, policy_version 932374 (0.0031) [2024-06-25 17:28:38,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15276048384. Throughput: 0: 42486.6. Samples: 15276143940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:38,392][15132] Avg episode reward: [(0, '0.231')] [2024-06-25 17:28:41,075][15401] Updated weights for policy 0, policy_version 932384 (0.0040) [2024-06-25 17:28:43,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15276310528. Throughput: 0: 42704.9. Samples: 15276400340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 17:28:45,244][15401] Updated weights for policy 0, policy_version 932394 (0.0048) [2024-06-25 17:28:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15276474368. Throughput: 0: 42787.9. Samples: 15276664320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:48,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 17:28:48,890][15401] Updated weights for policy 0, policy_version 932404 (0.0033) [2024-06-25 17:28:53,227][15401] Updated weights for policy 0, policy_version 932414 (0.0033) [2024-06-25 17:28:53,390][15132] Fps is (10 sec: 37682.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15276687360. Throughput: 0: 42377.1. Samples: 15276781240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:53,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 17:28:56,298][15401] Updated weights for policy 0, policy_version 932424 (0.0034) [2024-06-25 17:28:58,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15276949504. Throughput: 0: 42705.7. Samples: 15277044360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:28:58,394][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 17:29:00,587][15401] Updated weights for policy 0, policy_version 932434 (0.0026) [2024-06-25 17:29:03,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15277113344. Throughput: 0: 42645.3. Samples: 15277308060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:29:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 17:29:03,934][15401] Updated weights for policy 0, policy_version 932444 (0.0032) [2024-06-25 17:29:08,166][15401] Updated weights for policy 0, policy_version 932454 (0.0026) [2024-06-25 17:29:08,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15277342720. Throughput: 0: 42670.6. Samples: 15277429240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:29:08,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 17:29:11,734][15401] Updated weights for policy 0, policy_version 932464 (0.0026) [2024-06-25 17:29:13,389][15132] Fps is (10 sec: 49151.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15277604864. Throughput: 0: 42797.4. Samples: 15277687920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:29:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 17:29:15,732][15401] Updated weights for policy 0, policy_version 932474 (0.0039) [2024-06-25 17:29:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15277768704. Throughput: 0: 42640.5. Samples: 15277945620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 17:29:18,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 17:29:19,407][15401] Updated weights for policy 0, policy_version 932484 (0.0034) [2024-06-25 17:29:20,624][15349] Signal inference workers to stop experience collection... (226150 times) [2024-06-25 17:29:20,624][15349] Signal inference workers to resume experience collection... (226150 times) [2024-06-25 17:29:20,672][15401] InferenceWorker_p0-w0: stopping experience collection (226150 times) [2024-06-25 17:29:20,672][15401] InferenceWorker_p0-w0: resuming experience collection (226150 times) [2024-06-25 17:29:23,258][15401] Updated weights for policy 0, policy_version 932494 (0.0028) [2024-06-25 17:29:23,390][15132] Fps is (10 sec: 37681.5, 60 sec: 42871.1, 300 sec: 42653.9). Total num frames: 15277981696. Throughput: 0: 42656.5. Samples: 15278063500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:23,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 17:29:27,133][15401] Updated weights for policy 0, policy_version 932504 (0.0035) [2024-06-25 17:29:28,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 15278243840. Throughput: 0: 42971.0. Samples: 15278334040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:28,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 17:29:31,360][15401] Updated weights for policy 0, policy_version 932514 (0.0033) [2024-06-25 17:29:33,389][15132] Fps is (10 sec: 42600.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15278407680. Throughput: 0: 42787.6. Samples: 15278589760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:33,396][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 17:29:34,613][15401] Updated weights for policy 0, policy_version 932524 (0.0029) [2024-06-25 17:29:38,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15278620672. Throughput: 0: 42825.4. Samples: 15278708380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 17:29:38,774][15401] Updated weights for policy 0, policy_version 932534 (0.0028) [2024-06-25 17:29:42,141][15401] Updated weights for policy 0, policy_version 932544 (0.0027) [2024-06-25 17:29:43,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15278866432. Throughput: 0: 42927.2. Samples: 15278976080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 17:29:43,545][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932550_15278899200.pth... [2024-06-25 17:29:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000931921_15268593664.pth [2024-06-25 17:29:46,278][15401] Updated weights for policy 0, policy_version 932554 (0.0033) [2024-06-25 17:29:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15279063040. Throughput: 0: 42743.9. Samples: 15279231540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:48,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 17:29:49,657][15401] Updated weights for policy 0, policy_version 932564 (0.0042) [2024-06-25 17:29:53,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 15279276032. Throughput: 0: 42901.9. Samples: 15279359820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 17:29:53,763][15401] Updated weights for policy 0, policy_version 932574 (0.0033) [2024-06-25 17:29:57,098][15401] Updated weights for policy 0, policy_version 932584 (0.0048) [2024-06-25 17:29:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15279505408. Throughput: 0: 42912.3. Samples: 15279618980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:29:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 17:30:01,671][15401] Updated weights for policy 0, policy_version 932594 (0.0034) [2024-06-25 17:30:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15279718400. Throughput: 0: 42890.2. Samples: 15279875680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:03,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 17:30:04,753][15401] Updated weights for policy 0, policy_version 932604 (0.0031) [2024-06-25 17:30:08,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15279915008. Throughput: 0: 43105.9. Samples: 15280003240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:08,390][15132] Avg episode reward: [(0, '0.806')] [2024-06-25 17:30:09,083][15401] Updated weights for policy 0, policy_version 932614 (0.0035) [2024-06-25 17:30:12,580][15401] Updated weights for policy 0, policy_version 932624 (0.0028) [2024-06-25 17:30:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15280144384. Throughput: 0: 42828.4. Samples: 15280261320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 17:30:16,694][15401] Updated weights for policy 0, policy_version 932634 (0.0034) [2024-06-25 17:30:18,390][15132] Fps is (10 sec: 44232.2, 60 sec: 43143.9, 300 sec: 42820.4). Total num frames: 15280357376. Throughput: 0: 42897.3. Samples: 15280520180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:18,391][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 17:30:20,174][15401] Updated weights for policy 0, policy_version 932644 (0.0039) [2024-06-25 17:30:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.8, 300 sec: 42653.9). Total num frames: 15280553984. Throughput: 0: 43177.5. Samples: 15280651360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 17:30:24,081][15401] Updated weights for policy 0, policy_version 932654 (0.0037) [2024-06-25 17:30:27,735][15401] Updated weights for policy 0, policy_version 932664 (0.0035) [2024-06-25 17:30:28,390][15132] Fps is (10 sec: 42601.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15280783360. Throughput: 0: 42986.8. Samples: 15280910500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:28,399][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 17:30:31,973][15401] Updated weights for policy 0, policy_version 932674 (0.0036) [2024-06-25 17:30:33,392][15132] Fps is (10 sec: 44225.7, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 15280996352. Throughput: 0: 42937.3. Samples: 15281163820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:33,393][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 17:30:35,380][15349] Signal inference workers to stop experience collection... (226200 times) [2024-06-25 17:30:35,432][15401] InferenceWorker_p0-w0: stopping experience collection (226200 times) [2024-06-25 17:30:35,441][15349] Signal inference workers to resume experience collection... (226200 times) [2024-06-25 17:30:35,451][15401] InferenceWorker_p0-w0: resuming experience collection (226200 times) [2024-06-25 17:30:35,459][15401] Updated weights for policy 0, policy_version 932684 (0.0038) [2024-06-25 17:30:38,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15281209344. Throughput: 0: 43028.9. Samples: 15281296120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 17:30:39,686][15401] Updated weights for policy 0, policy_version 932694 (0.0030) [2024-06-25 17:30:43,298][15401] Updated weights for policy 0, policy_version 932704 (0.0027) [2024-06-25 17:30:43,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15281422336. Throughput: 0: 42882.3. Samples: 15281548680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:43,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 17:30:47,051][15401] Updated weights for policy 0, policy_version 932714 (0.0033) [2024-06-25 17:30:48,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15281651712. Throughput: 0: 42950.6. Samples: 15281808460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:48,394][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 17:30:50,822][15401] Updated weights for policy 0, policy_version 932724 (0.0036) [2024-06-25 17:30:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15281864704. Throughput: 0: 43048.2. Samples: 15281940420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 17:30:54,632][15401] Updated weights for policy 0, policy_version 932734 (0.0040) [2024-06-25 17:30:58,391][15401] Updated weights for policy 0, policy_version 932744 (0.0024) [2024-06-25 17:30:58,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 15282077696. Throughput: 0: 43032.4. Samples: 15282197880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:30:58,392][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 17:31:02,031][15401] Updated weights for policy 0, policy_version 932754 (0.0044) [2024-06-25 17:31:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15282307072. Throughput: 0: 42977.3. Samples: 15282454120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:31:03,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 17:31:06,181][15401] Updated weights for policy 0, policy_version 932764 (0.0034) [2024-06-25 17:31:08,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15282503680. Throughput: 0: 43103.5. Samples: 15282591020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:31:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 17:31:09,641][15401] Updated weights for policy 0, policy_version 932774 (0.0031) [2024-06-25 17:31:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15282716672. Throughput: 0: 42884.1. Samples: 15282840280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:31:13,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 17:31:14,125][15401] Updated weights for policy 0, policy_version 932784 (0.0037) [2024-06-25 17:31:17,063][15401] Updated weights for policy 0, policy_version 932794 (0.0022) [2024-06-25 17:31:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43145.2, 300 sec: 42820.6). Total num frames: 15282946048. Throughput: 0: 43085.0. Samples: 15283102540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-25 17:31:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 17:31:21,806][15401] Updated weights for policy 0, policy_version 932804 (0.0032) [2024-06-25 17:31:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15283159040. Throughput: 0: 43007.0. Samples: 15283231440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:23,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 17:31:24,955][15401] Updated weights for policy 0, policy_version 932814 (0.0035) [2024-06-25 17:31:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15283355648. Throughput: 0: 42874.8. Samples: 15283478040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 17:31:29,462][15401] Updated weights for policy 0, policy_version 932824 (0.0026) [2024-06-25 17:31:32,397][15401] Updated weights for policy 0, policy_version 932834 (0.0032) [2024-06-25 17:31:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 15283585024. Throughput: 0: 42975.2. Samples: 15283742340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:33,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 17:31:36,968][15401] Updated weights for policy 0, policy_version 932844 (0.0036) [2024-06-25 17:31:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15283781632. Throughput: 0: 42997.5. Samples: 15283875300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 17:31:40,089][15401] Updated weights for policy 0, policy_version 932854 (0.0039) [2024-06-25 17:31:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15284011008. Throughput: 0: 42741.0. Samples: 15284121120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 17:31:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932862_15284011008.pth... [2024-06-25 17:31:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932234_15273721856.pth [2024-06-25 17:31:44,551][15401] Updated weights for policy 0, policy_version 932864 (0.0028) [2024-06-25 17:31:47,832][15401] Updated weights for policy 0, policy_version 932874 (0.0042) [2024-06-25 17:31:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15284224000. Throughput: 0: 42832.1. Samples: 15284381560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 17:31:51,987][15401] Updated weights for policy 0, policy_version 932884 (0.0033) [2024-06-25 17:31:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 15284420608. Throughput: 0: 42748.5. Samples: 15284514700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 17:31:55,488][15401] Updated weights for policy 0, policy_version 932894 (0.0029) [2024-06-25 17:31:56,703][15349] Signal inference workers to stop experience collection... (226250 times) [2024-06-25 17:31:56,755][15349] Signal inference workers to resume experience collection... (226250 times) [2024-06-25 17:31:56,755][15401] InferenceWorker_p0-w0: stopping experience collection (226250 times) [2024-06-25 17:31:56,771][15401] InferenceWorker_p0-w0: resuming experience collection (226250 times) [2024-06-25 17:31:58,392][15132] Fps is (10 sec: 44225.8, 60 sec: 43144.5, 300 sec: 42820.2). Total num frames: 15284666368. Throughput: 0: 42702.6. Samples: 15284762000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:31:58,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 17:31:59,556][15401] Updated weights for policy 0, policy_version 932904 (0.0027) [2024-06-25 17:32:03,127][15401] Updated weights for policy 0, policy_version 932914 (0.0038) [2024-06-25 17:32:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15284879360. Throughput: 0: 42720.8. Samples: 15285024980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:03,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 17:32:07,848][15401] Updated weights for policy 0, policy_version 932924 (0.0030) [2024-06-25 17:32:08,392][15132] Fps is (10 sec: 39321.5, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 15285059584. Throughput: 0: 42640.4. Samples: 15285150360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:08,393][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 17:32:10,651][15401] Updated weights for policy 0, policy_version 932934 (0.0033) [2024-06-25 17:32:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15285305344. Throughput: 0: 42832.9. Samples: 15285405520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:13,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 17:32:15,307][15401] Updated weights for policy 0, policy_version 932944 (0.0031) [2024-06-25 17:32:18,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15285501952. Throughput: 0: 42691.1. Samples: 15285663440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 17:32:18,674][15401] Updated weights for policy 0, policy_version 932954 (0.0040) [2024-06-25 17:32:22,843][15401] Updated weights for policy 0, policy_version 932964 (0.0023) [2024-06-25 17:32:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15285714944. Throughput: 0: 42531.0. Samples: 15285789200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:23,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 17:32:26,223][15401] Updated weights for policy 0, policy_version 932974 (0.0032) [2024-06-25 17:32:28,394][15132] Fps is (10 sec: 45853.1, 60 sec: 43414.0, 300 sec: 42875.4). Total num frames: 15285960704. Throughput: 0: 42798.4. Samples: 15286047260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:28,395][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 17:32:30,358][15401] Updated weights for policy 0, policy_version 932984 (0.0028) [2024-06-25 17:32:33,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15286124544. Throughput: 0: 42847.0. Samples: 15286309680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 17:32:33,805][15401] Updated weights for policy 0, policy_version 932994 (0.0040) [2024-06-25 17:32:37,994][15401] Updated weights for policy 0, policy_version 933004 (0.0045) [2024-06-25 17:32:38,389][15132] Fps is (10 sec: 40980.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15286370304. Throughput: 0: 42524.0. Samples: 15286428280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:38,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 17:32:41,488][15401] Updated weights for policy 0, policy_version 933014 (0.0034) [2024-06-25 17:32:43,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15286599680. Throughput: 0: 42906.7. Samples: 15286692700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 17:32:45,660][15401] Updated weights for policy 0, policy_version 933024 (0.0037) [2024-06-25 17:32:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 15286763520. Throughput: 0: 42845.8. Samples: 15286953040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 17:32:49,131][15401] Updated weights for policy 0, policy_version 933034 (0.0037) [2024-06-25 17:32:53,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15286976512. Throughput: 0: 42652.1. Samples: 15287069600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:53,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 17:32:53,465][15401] Updated weights for policy 0, policy_version 933044 (0.0039) [2024-06-25 17:32:56,966][15401] Updated weights for policy 0, policy_version 933054 (0.0039) [2024-06-25 17:32:58,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 15287222272. Throughput: 0: 42591.5. Samples: 15287322140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:32:58,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 17:33:01,425][15401] Updated weights for policy 0, policy_version 933064 (0.0040) [2024-06-25 17:33:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42820.5). Total num frames: 15287402496. Throughput: 0: 42643.2. Samples: 15287582380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:33:03,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 17:33:04,789][15401] Updated weights for policy 0, policy_version 933074 (0.0027) [2024-06-25 17:33:08,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 15287599104. Throughput: 0: 42573.5. Samples: 15287705000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:33:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 17:33:08,956][15401] Updated weights for policy 0, policy_version 933084 (0.0028) [2024-06-25 17:33:12,501][15401] Updated weights for policy 0, policy_version 933094 (0.0032) [2024-06-25 17:33:13,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15287861248. Throughput: 0: 42596.2. Samples: 15287963880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 17:33:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 17:33:16,703][15401] Updated weights for policy 0, policy_version 933104 (0.0029) [2024-06-25 17:33:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15288057856. Throughput: 0: 42390.3. Samples: 15288217240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 17:33:20,219][15401] Updated weights for policy 0, policy_version 933114 (0.0025) [2024-06-25 17:33:21,337][15349] Signal inference workers to stop experience collection... (226300 times) [2024-06-25 17:33:21,363][15401] InferenceWorker_p0-w0: stopping experience collection (226300 times) [2024-06-25 17:33:21,400][15349] Signal inference workers to resume experience collection... (226300 times) [2024-06-25 17:33:21,401][15401] InferenceWorker_p0-w0: resuming experience collection (226300 times) [2024-06-25 17:33:23,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15288238080. Throughput: 0: 42600.0. Samples: 15288345280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:23,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 17:33:24,777][15401] Updated weights for policy 0, policy_version 933124 (0.0034) [2024-06-25 17:33:27,816][15401] Updated weights for policy 0, policy_version 933134 (0.0038) [2024-06-25 17:33:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42054.0, 300 sec: 42820.2). Total num frames: 15288483840. Throughput: 0: 42334.7. Samples: 15288597860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:28,392][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 17:33:32,294][15401] Updated weights for policy 0, policy_version 933144 (0.0033) [2024-06-25 17:33:33,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15288696832. Throughput: 0: 42281.0. Samples: 15288855680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:33,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 17:33:35,528][15401] Updated weights for policy 0, policy_version 933154 (0.0045) [2024-06-25 17:33:38,391][15132] Fps is (10 sec: 42602.7, 60 sec: 42324.4, 300 sec: 42709.3). Total num frames: 15288909824. Throughput: 0: 42458.3. Samples: 15288980280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:38,392][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 17:33:39,838][15401] Updated weights for policy 0, policy_version 933164 (0.0028) [2024-06-25 17:33:43,168][15401] Updated weights for policy 0, policy_version 933174 (0.0035) [2024-06-25 17:33:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 15289122816. Throughput: 0: 42566.3. Samples: 15289237620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:43,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 17:33:43,477][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933175_15289139200.pth... [2024-06-25 17:33:43,531][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932550_15278899200.pth [2024-06-25 17:33:47,453][15401] Updated weights for policy 0, policy_version 933184 (0.0038) [2024-06-25 17:33:48,392][15132] Fps is (10 sec: 40955.8, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 15289319424. Throughput: 0: 42500.3. Samples: 15289495000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:48,392][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 17:33:50,877][15401] Updated weights for policy 0, policy_version 933194 (0.0037) [2024-06-25 17:33:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15289548800. Throughput: 0: 42568.0. Samples: 15289620560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 17:33:55,453][15401] Updated weights for policy 0, policy_version 933204 (0.0043) [2024-06-25 17:33:58,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15289761792. Throughput: 0: 42477.3. Samples: 15289875360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:33:58,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 17:33:58,706][15401] Updated weights for policy 0, policy_version 933214 (0.0034) [2024-06-25 17:34:03,121][15401] Updated weights for policy 0, policy_version 933224 (0.0036) [2024-06-25 17:34:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15289958400. Throughput: 0: 42672.8. Samples: 15290137520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:03,391][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 17:34:06,239][15401] Updated weights for policy 0, policy_version 933234 (0.0034) [2024-06-25 17:34:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15290171392. Throughput: 0: 42469.0. Samples: 15290256380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:08,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 17:34:10,844][15401] Updated weights for policy 0, policy_version 933244 (0.0030) [2024-06-25 17:34:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15290400768. Throughput: 0: 42505.7. Samples: 15290510520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:13,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 17:34:13,753][15401] Updated weights for policy 0, policy_version 933254 (0.0035) [2024-06-25 17:34:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 15290580992. Throughput: 0: 42675.1. Samples: 15290776060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:18,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 17:34:18,602][15401] Updated weights for policy 0, policy_version 933264 (0.0028) [2024-06-25 17:34:21,386][15401] Updated weights for policy 0, policy_version 933274 (0.0031) [2024-06-25 17:34:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15290826752. Throughput: 0: 42607.4. Samples: 15290897660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:23,393][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 17:34:26,069][15401] Updated weights for policy 0, policy_version 933284 (0.0032) [2024-06-25 17:34:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 15291039744. Throughput: 0: 42584.4. Samples: 15291153920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 17:34:29,507][15401] Updated weights for policy 0, policy_version 933294 (0.0032) [2024-06-25 17:34:33,389][15132] Fps is (10 sec: 37692.5, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 15291203584. Throughput: 0: 42735.6. Samples: 15291418000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 17:34:33,952][15401] Updated weights for policy 0, policy_version 933304 (0.0048) [2024-06-25 17:34:34,033][15349] Signal inference workers to stop experience collection... (226350 times) [2024-06-25 17:34:34,060][15401] InferenceWorker_p0-w0: stopping experience collection (226350 times) [2024-06-25 17:34:34,093][15349] Signal inference workers to resume experience collection... (226350 times) [2024-06-25 17:34:34,094][15401] InferenceWorker_p0-w0: resuming experience collection (226350 times) [2024-06-25 17:34:36,870][15401] Updated weights for policy 0, policy_version 933314 (0.0029) [2024-06-25 17:34:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 15291465728. Throughput: 0: 42521.4. Samples: 15291534020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:38,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 17:34:41,639][15401] Updated weights for policy 0, policy_version 933324 (0.0034) [2024-06-25 17:34:43,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15291678720. Throughput: 0: 42742.2. Samples: 15291798760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 17:34:44,424][15401] Updated weights for policy 0, policy_version 933334 (0.0032) [2024-06-25 17:34:48,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 15291875328. Throughput: 0: 42602.6. Samples: 15292054640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:48,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 17:34:49,566][15401] Updated weights for policy 0, policy_version 933344 (0.0032) [2024-06-25 17:34:51,810][15401] Updated weights for policy 0, policy_version 933354 (0.0044) [2024-06-25 17:34:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15292121088. Throughput: 0: 42695.9. Samples: 15292177700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:53,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 17:34:57,043][15401] Updated weights for policy 0, policy_version 933364 (0.0038) [2024-06-25 17:34:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15292301312. Throughput: 0: 42977.8. Samples: 15292444520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:34:58,390][15132] Avg episode reward: [(0, '0.370')] [2024-06-25 17:34:59,569][15401] Updated weights for policy 0, policy_version 933374 (0.0031) [2024-06-25 17:35:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 15292514304. Throughput: 0: 42735.9. Samples: 15292699180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:35:03,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 17:35:04,556][15401] Updated weights for policy 0, policy_version 933384 (0.0045) [2024-06-25 17:35:07,022][15401] Updated weights for policy 0, policy_version 933394 (0.0031) [2024-06-25 17:35:08,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15292760064. Throughput: 0: 42957.9. Samples: 15292830660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 17:35:08,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 17:35:12,075][15401] Updated weights for policy 0, policy_version 933404 (0.0037) [2024-06-25 17:35:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15292940288. Throughput: 0: 43027.8. Samples: 15293090180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 17:35:14,735][15401] Updated weights for policy 0, policy_version 933414 (0.0027) [2024-06-25 17:35:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15293169664. Throughput: 0: 42757.7. Samples: 15293342100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:18,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 17:35:19,645][15401] Updated weights for policy 0, policy_version 933424 (0.0036) [2024-06-25 17:35:22,578][15401] Updated weights for policy 0, policy_version 933434 (0.0031) [2024-06-25 17:35:23,389][15132] Fps is (10 sec: 47514.5, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 15293415424. Throughput: 0: 43089.3. Samples: 15293473040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 17:35:27,463][15401] Updated weights for policy 0, policy_version 933444 (0.0030) [2024-06-25 17:35:28,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15293595648. Throughput: 0: 42954.7. Samples: 15293731720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:28,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 17:35:30,184][15401] Updated weights for policy 0, policy_version 933454 (0.0033) [2024-06-25 17:35:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15293808640. Throughput: 0: 42798.7. Samples: 15293980580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:33,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 17:35:35,548][15401] Updated weights for policy 0, policy_version 933464 (0.0039) [2024-06-25 17:35:38,186][15401] Updated weights for policy 0, policy_version 933474 (0.0028) [2024-06-25 17:35:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 15294054400. Throughput: 0: 42914.6. Samples: 15294108860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:38,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 17:35:43,094][15401] Updated weights for policy 0, policy_version 933484 (0.0031) [2024-06-25 17:35:43,394][15132] Fps is (10 sec: 40940.8, 60 sec: 42322.0, 300 sec: 42597.7). Total num frames: 15294218240. Throughput: 0: 42620.9. Samples: 15294362660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:43,395][15132] Avg episode reward: [(0, '0.856')] [2024-06-25 17:35:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933485_15294218240.pth... [2024-06-25 17:35:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000932862_15284011008.pth [2024-06-25 17:35:46,026][15401] Updated weights for policy 0, policy_version 933494 (0.0039) [2024-06-25 17:35:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 15294464000. Throughput: 0: 42530.4. Samples: 15294613040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:48,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-25 17:35:50,754][15401] Updated weights for policy 0, policy_version 933504 (0.0032) [2024-06-25 17:35:53,392][15132] Fps is (10 sec: 45885.8, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 15294676992. Throughput: 0: 42600.8. Samples: 15294747800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:53,392][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 17:35:53,745][15401] Updated weights for policy 0, policy_version 933514 (0.0032) [2024-06-25 17:35:58,245][15401] Updated weights for policy 0, policy_version 933524 (0.0048) [2024-06-25 17:35:58,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15294857216. Throughput: 0: 42444.5. Samples: 15295000280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:35:58,392][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 17:36:00,646][15349] Signal inference workers to stop experience collection... (226400 times) [2024-06-25 17:36:00,652][15349] Signal inference workers to resume experience collection... (226400 times) [2024-06-25 17:36:00,662][15401] InferenceWorker_p0-w0: stopping experience collection (226400 times) [2024-06-25 17:36:00,682][15401] InferenceWorker_p0-w0: resuming experience collection (226400 times) [2024-06-25 17:36:01,349][15401] Updated weights for policy 0, policy_version 933534 (0.0033) [2024-06-25 17:36:03,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15295086592. Throughput: 0: 42409.4. Samples: 15295250520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 17:36:06,203][15401] Updated weights for policy 0, policy_version 933544 (0.0038) [2024-06-25 17:36:08,390][15132] Fps is (10 sec: 47524.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15295332352. Throughput: 0: 42561.7. Samples: 15295388320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:08,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 17:36:08,912][15401] Updated weights for policy 0, policy_version 933554 (0.0023) [2024-06-25 17:36:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15295496192. Throughput: 0: 42266.5. Samples: 15295633720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 17:36:13,655][15401] Updated weights for policy 0, policy_version 933564 (0.0038) [2024-06-25 17:36:17,258][15401] Updated weights for policy 0, policy_version 933574 (0.0039) [2024-06-25 17:36:18,391][15132] Fps is (10 sec: 39315.7, 60 sec: 42597.3, 300 sec: 42598.2). Total num frames: 15295725568. Throughput: 0: 42480.3. Samples: 15295892260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:18,392][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 17:36:21,067][15401] Updated weights for policy 0, policy_version 933584 (0.0030) [2024-06-25 17:36:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15295938560. Throughput: 0: 42374.3. Samples: 15296015700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:23,390][15132] Avg episode reward: [(0, '0.824')] [2024-06-25 17:36:24,832][15401] Updated weights for policy 0, policy_version 933594 (0.0044) [2024-06-25 17:36:28,390][15132] Fps is (10 sec: 42605.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15296151552. Throughput: 0: 42368.9. Samples: 15296269060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 17:36:28,567][15401] Updated weights for policy 0, policy_version 933604 (0.0039) [2024-06-25 17:36:32,447][15401] Updated weights for policy 0, policy_version 933614 (0.0034) [2024-06-25 17:36:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42598.1). Total num frames: 15296348160. Throughput: 0: 42642.1. Samples: 15296532040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:33,393][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 17:36:36,018][15401] Updated weights for policy 0, policy_version 933624 (0.0032) [2024-06-25 17:36:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15296577536. Throughput: 0: 42398.8. Samples: 15296655640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:38,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 17:36:40,141][15401] Updated weights for policy 0, policy_version 933634 (0.0034) [2024-06-25 17:36:43,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42874.9, 300 sec: 42598.4). Total num frames: 15296790528. Throughput: 0: 42571.2. Samples: 15296915880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:43,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 17:36:43,938][15401] Updated weights for policy 0, policy_version 933644 (0.0039) [2024-06-25 17:36:47,768][15401] Updated weights for policy 0, policy_version 933654 (0.0027) [2024-06-25 17:36:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15297003520. Throughput: 0: 42712.3. Samples: 15297172580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:48,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 17:36:51,336][15401] Updated weights for policy 0, policy_version 933664 (0.0033) [2024-06-25 17:36:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42327.1, 300 sec: 42543.2). Total num frames: 15297216512. Throughput: 0: 42413.5. Samples: 15297296920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:53,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 17:36:55,374][15401] Updated weights for policy 0, policy_version 933674 (0.0046) [2024-06-25 17:36:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42873.2, 300 sec: 42542.9). Total num frames: 15297429504. Throughput: 0: 42791.2. Samples: 15297559320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:36:58,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 17:36:59,038][15401] Updated weights for policy 0, policy_version 933684 (0.0028) [2024-06-25 17:37:03,213][15401] Updated weights for policy 0, policy_version 933694 (0.0035) [2024-06-25 17:37:03,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42596.7, 300 sec: 42653.9). Total num frames: 15297642496. Throughput: 0: 42824.1. Samples: 15297819380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 17:37:03,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 17:37:06,967][15401] Updated weights for policy 0, policy_version 933704 (0.0037) [2024-06-25 17:37:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15297871872. Throughput: 0: 42801.3. Samples: 15297941760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 17:37:10,719][15401] Updated weights for policy 0, policy_version 933714 (0.0037) [2024-06-25 17:37:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15298068480. Throughput: 0: 42939.6. Samples: 15298201340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:13,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 17:37:14,720][15401] Updated weights for policy 0, policy_version 933724 (0.0036) [2024-06-25 17:37:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42599.6, 300 sec: 42598.4). Total num frames: 15298281472. Throughput: 0: 42724.6. Samples: 15298454540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:18,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 17:37:18,717][15401] Updated weights for policy 0, policy_version 933734 (0.0037) [2024-06-25 17:37:21,361][15349] Signal inference workers to stop experience collection... (226450 times) [2024-06-25 17:37:21,361][15349] Signal inference workers to resume experience collection... (226450 times) [2024-06-25 17:37:21,403][15401] InferenceWorker_p0-w0: stopping experience collection (226450 times) [2024-06-25 17:37:21,403][15401] InferenceWorker_p0-w0: resuming experience collection (226450 times) [2024-06-25 17:37:22,177][15401] Updated weights for policy 0, policy_version 933744 (0.0041) [2024-06-25 17:37:23,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42487.7). Total num frames: 15298494464. Throughput: 0: 42771.8. Samples: 15298580480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:23,393][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 17:37:26,322][15401] Updated weights for policy 0, policy_version 933754 (0.0046) [2024-06-25 17:37:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15298707456. Throughput: 0: 42727.0. Samples: 15298838600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:28,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 17:37:29,863][15401] Updated weights for policy 0, policy_version 933764 (0.0048) [2024-06-25 17:37:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 15298936832. Throughput: 0: 42561.0. Samples: 15299087820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 17:37:33,941][15401] Updated weights for policy 0, policy_version 933774 (0.0033) [2024-06-25 17:37:37,689][15401] Updated weights for policy 0, policy_version 933784 (0.0044) [2024-06-25 17:37:38,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15299149824. Throughput: 0: 42838.1. Samples: 15299224640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 17:37:41,374][15401] Updated weights for policy 0, policy_version 933794 (0.0030) [2024-06-25 17:37:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15299346432. Throughput: 0: 42709.4. Samples: 15299481240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:43,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 17:37:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933799_15299362816.pth... [2024-06-25 17:37:43,482][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933175_15289139200.pth [2024-06-25 17:37:45,314][15401] Updated weights for policy 0, policy_version 933804 (0.0032) [2024-06-25 17:37:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15299575808. Throughput: 0: 42585.7. Samples: 15299735640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:48,399][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 17:37:49,378][15401] Updated weights for policy 0, policy_version 933814 (0.0036) [2024-06-25 17:37:52,893][15401] Updated weights for policy 0, policy_version 933824 (0.0027) [2024-06-25 17:37:53,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15299788800. Throughput: 0: 42721.4. Samples: 15299864220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 17:37:57,060][15401] Updated weights for policy 0, policy_version 933834 (0.0035) [2024-06-25 17:37:58,392][15132] Fps is (10 sec: 42589.0, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 15300001792. Throughput: 0: 42694.6. Samples: 15300122700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:37:58,392][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 17:38:00,544][15401] Updated weights for policy 0, policy_version 933844 (0.0038) [2024-06-25 17:38:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 15300198400. Throughput: 0: 42629.8. Samples: 15300372880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:03,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 17:38:04,628][15401] Updated weights for policy 0, policy_version 933854 (0.0046) [2024-06-25 17:38:08,180][15401] Updated weights for policy 0, policy_version 933864 (0.0030) [2024-06-25 17:38:08,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15300427776. Throughput: 0: 42578.7. Samples: 15300496420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:08,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 17:38:12,110][15401] Updated weights for policy 0, policy_version 933874 (0.0028) [2024-06-25 17:38:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15300640768. Throughput: 0: 42677.4. Samples: 15300759080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 17:38:16,093][15401] Updated weights for policy 0, policy_version 933884 (0.0039) [2024-06-25 17:38:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15300853760. Throughput: 0: 42911.5. Samples: 15301018840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 17:38:19,596][15401] Updated weights for policy 0, policy_version 933894 (0.0047) [2024-06-25 17:38:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42654.3). Total num frames: 15301066752. Throughput: 0: 42659.1. Samples: 15301144300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 17:38:23,591][15401] Updated weights for policy 0, policy_version 933904 (0.0029) [2024-06-25 17:38:27,386][15401] Updated weights for policy 0, policy_version 933914 (0.0028) [2024-06-25 17:38:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15301296128. Throughput: 0: 42697.7. Samples: 15301402640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:28,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:38:31,038][15401] Updated weights for policy 0, policy_version 933924 (0.0028) [2024-06-25 17:38:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.6). Total num frames: 15301476352. Throughput: 0: 42855.7. Samples: 15301664140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 17:38:35,126][15401] Updated weights for policy 0, policy_version 933934 (0.0026) [2024-06-25 17:38:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15301705728. Throughput: 0: 42703.2. Samples: 15301785860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:38,390][15132] Avg episode reward: [(0, '0.114')] [2024-06-25 17:38:38,786][15401] Updated weights for policy 0, policy_version 933944 (0.0041) [2024-06-25 17:38:42,928][15401] Updated weights for policy 0, policy_version 933954 (0.0029) [2024-06-25 17:38:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15301918720. Throughput: 0: 42850.7. Samples: 15302050880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:43,390][15132] Avg episode reward: [(0, '0.246')] [2024-06-25 17:38:46,294][15401] Updated weights for policy 0, policy_version 933964 (0.0037) [2024-06-25 17:38:48,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15302131712. Throughput: 0: 42791.8. Samples: 15302298520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:48,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 17:38:50,770][15401] Updated weights for policy 0, policy_version 933974 (0.0036) [2024-06-25 17:38:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15302344704. Throughput: 0: 43006.1. Samples: 15302431700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:53,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 17:38:54,217][15401] Updated weights for policy 0, policy_version 933984 (0.0033) [2024-06-25 17:38:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 15302541312. Throughput: 0: 42911.6. Samples: 15302690100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 17:38:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 17:38:58,515][15401] Updated weights for policy 0, policy_version 933994 (0.0039) [2024-06-25 17:39:02,083][15401] Updated weights for policy 0, policy_version 934004 (0.0037) [2024-06-25 17:39:03,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15302787072. Throughput: 0: 42772.9. Samples: 15302943620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:03,391][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 17:39:05,987][15401] Updated weights for policy 0, policy_version 934014 (0.0027) [2024-06-25 17:39:06,589][15349] Signal inference workers to stop experience collection... (226500 times) [2024-06-25 17:39:06,591][15349] Signal inference workers to resume experience collection... (226500 times) [2024-06-25 17:39:06,629][15401] InferenceWorker_p0-w0: stopping experience collection (226500 times) [2024-06-25 17:39:06,629][15401] InferenceWorker_p0-w0: resuming experience collection (226500 times) [2024-06-25 17:39:08,391][15132] Fps is (10 sec: 44230.0, 60 sec: 42597.3, 300 sec: 42653.7). Total num frames: 15302983680. Throughput: 0: 42841.7. Samples: 15303072240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:08,392][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 17:39:09,947][15401] Updated weights for policy 0, policy_version 934024 (0.0037) [2024-06-25 17:39:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15303196672. Throughput: 0: 42869.2. Samples: 15303331760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:13,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 17:39:13,498][15401] Updated weights for policy 0, policy_version 934034 (0.0024) [2024-06-25 17:39:17,530][15401] Updated weights for policy 0, policy_version 934044 (0.0040) [2024-06-25 17:39:18,389][15132] Fps is (10 sec: 44243.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 15303426048. Throughput: 0: 42712.0. Samples: 15303586180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:18,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 17:39:21,138][15401] Updated weights for policy 0, policy_version 934054 (0.0046) [2024-06-25 17:39:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15303639040. Throughput: 0: 42931.1. Samples: 15303717760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:23,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 17:39:25,165][15401] Updated weights for policy 0, policy_version 934064 (0.0032) [2024-06-25 17:39:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15303835648. Throughput: 0: 42675.5. Samples: 15303971280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:28,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 17:39:28,765][15401] Updated weights for policy 0, policy_version 934074 (0.0034) [2024-06-25 17:39:32,726][15401] Updated weights for policy 0, policy_version 934084 (0.0044) [2024-06-25 17:39:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15304048640. Throughput: 0: 42844.1. Samples: 15304226500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:33,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 17:39:36,346][15401] Updated weights for policy 0, policy_version 934094 (0.0033) [2024-06-25 17:39:38,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15304278016. Throughput: 0: 42788.2. Samples: 15304357160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:38,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 17:39:40,423][15401] Updated weights for policy 0, policy_version 934104 (0.0028) [2024-06-25 17:39:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15304491008. Throughput: 0: 42831.5. Samples: 15304617520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:43,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 17:39:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934112_15304491008.pth... [2024-06-25 17:39:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933485_15294218240.pth [2024-06-25 17:39:43,831][15401] Updated weights for policy 0, policy_version 934114 (0.0028) [2024-06-25 17:39:47,946][15401] Updated weights for policy 0, policy_version 934124 (0.0032) [2024-06-25 17:39:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15304704000. Throughput: 0: 42880.5. Samples: 15304873240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:48,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 17:39:51,666][15401] Updated weights for policy 0, policy_version 934134 (0.0029) [2024-06-25 17:39:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 15304933376. Throughput: 0: 42846.9. Samples: 15305000280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:53,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 17:39:55,421][15401] Updated weights for policy 0, policy_version 934144 (0.0040) [2024-06-25 17:39:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15305129984. Throughput: 0: 42875.6. Samples: 15305261160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:39:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 17:39:59,201][15401] Updated weights for policy 0, policy_version 934154 (0.0035) [2024-06-25 17:40:02,889][15401] Updated weights for policy 0, policy_version 934164 (0.0038) [2024-06-25 17:40:03,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15305359360. Throughput: 0: 42947.9. Samples: 15305518840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:03,393][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 17:40:06,799][15401] Updated weights for policy 0, policy_version 934174 (0.0039) [2024-06-25 17:40:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43691.8, 300 sec: 42931.7). Total num frames: 15305605120. Throughput: 0: 42923.5. Samples: 15305649320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:08,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 17:40:10,503][15401] Updated weights for policy 0, policy_version 934184 (0.0036) [2024-06-25 17:40:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15305768960. Throughput: 0: 43125.5. Samples: 15305911920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:13,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 17:40:14,884][15401] Updated weights for policy 0, policy_version 934194 (0.0028) [2024-06-25 17:40:18,352][15401] Updated weights for policy 0, policy_version 934204 (0.0032) [2024-06-25 17:40:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15305998336. Throughput: 0: 43023.9. Samples: 15306162580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:18,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 17:40:22,406][15401] Updated weights for policy 0, policy_version 934214 (0.0037) [2024-06-25 17:40:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15306227712. Throughput: 0: 42981.2. Samples: 15306291320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:23,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 17:40:25,916][15401] Updated weights for policy 0, policy_version 934224 (0.0041) [2024-06-25 17:40:27,366][15349] Signal inference workers to stop experience collection... (226550 times) [2024-06-25 17:40:27,411][15401] InferenceWorker_p0-w0: stopping experience collection (226550 times) [2024-06-25 17:40:27,414][15349] Signal inference workers to resume experience collection... (226550 times) [2024-06-25 17:40:27,426][15401] InferenceWorker_p0-w0: resuming experience collection (226550 times) [2024-06-25 17:40:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15306424320. Throughput: 0: 43061.3. Samples: 15306555280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 17:40:29,979][15401] Updated weights for policy 0, policy_version 934234 (0.0039) [2024-06-25 17:40:33,355][15401] Updated weights for policy 0, policy_version 934244 (0.0035) [2024-06-25 17:40:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15306653696. Throughput: 0: 42819.1. Samples: 15306800100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 17:40:37,435][15401] Updated weights for policy 0, policy_version 934254 (0.0034) [2024-06-25 17:40:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42821.2). Total num frames: 15306850304. Throughput: 0: 42911.3. Samples: 15306931300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 17:40:40,893][15401] Updated weights for policy 0, policy_version 934264 (0.0027) [2024-06-25 17:40:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15307046912. Throughput: 0: 42924.5. Samples: 15307192760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:43,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 17:40:45,218][15401] Updated weights for policy 0, policy_version 934274 (0.0046) [2024-06-25 17:40:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 15307292672. Throughput: 0: 42639.7. Samples: 15307437620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:48,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 17:40:48,797][15401] Updated weights for policy 0, policy_version 934284 (0.0038) [2024-06-25 17:40:52,671][15401] Updated weights for policy 0, policy_version 934294 (0.0049) [2024-06-25 17:40:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 15307489280. Throughput: 0: 42785.8. Samples: 15307574680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:53,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 17:40:56,342][15401] Updated weights for policy 0, policy_version 934304 (0.0037) [2024-06-25 17:40:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15307685888. Throughput: 0: 42656.9. Samples: 15307831480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-25 17:40:58,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 17:41:00,249][15401] Updated weights for policy 0, policy_version 934314 (0.0031) [2024-06-25 17:41:03,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15307948032. Throughput: 0: 42613.0. Samples: 15308080160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 17:41:04,665][15401] Updated weights for policy 0, policy_version 934324 (0.0054) [2024-06-25 17:41:07,995][15401] Updated weights for policy 0, policy_version 934334 (0.0031) [2024-06-25 17:41:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15308144640. Throughput: 0: 42725.8. Samples: 15308213980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:08,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 17:41:12,234][15401] Updated weights for policy 0, policy_version 934344 (0.0037) [2024-06-25 17:41:13,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42709.7). Total num frames: 15308324864. Throughput: 0: 42610.7. Samples: 15308472760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:13,390][15132] Avg episode reward: [(0, '0.378')] [2024-06-25 17:41:15,789][15401] Updated weights for policy 0, policy_version 934354 (0.0026) [2024-06-25 17:41:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15308570624. Throughput: 0: 42632.8. Samples: 15308718580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 17:41:19,860][15401] Updated weights for policy 0, policy_version 934364 (0.0049) [2024-06-25 17:41:23,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 15308767232. Throughput: 0: 42742.7. Samples: 15308854820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:23,392][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 17:41:23,402][15401] Updated weights for policy 0, policy_version 934374 (0.0039) [2024-06-25 17:41:27,358][15401] Updated weights for policy 0, policy_version 934384 (0.0029) [2024-06-25 17:41:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 15308980224. Throughput: 0: 42581.7. Samples: 15309108940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:28,390][15132] Avg episode reward: [(0, '0.148')] [2024-06-25 17:41:31,247][15401] Updated weights for policy 0, policy_version 934394 (0.0034) [2024-06-25 17:41:33,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15309225984. Throughput: 0: 42651.6. Samples: 15309356940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:33,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 17:41:35,021][15401] Updated weights for policy 0, policy_version 934404 (0.0031) [2024-06-25 17:41:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15309422592. Throughput: 0: 42542.7. Samples: 15309489100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:38,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 17:41:38,757][15401] Updated weights for policy 0, policy_version 934414 (0.0035) [2024-06-25 17:41:42,629][15401] Updated weights for policy 0, policy_version 934424 (0.0040) [2024-06-25 17:41:43,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15309602816. Throughput: 0: 42526.6. Samples: 15309745180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:43,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 17:41:43,582][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934426_15309635584.pth... [2024-06-25 17:41:43,629][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000933799_15299362816.pth [2024-06-25 17:41:46,337][15401] Updated weights for policy 0, policy_version 934434 (0.0037) [2024-06-25 17:41:47,955][15349] Signal inference workers to stop experience collection... (226600 times) [2024-06-25 17:41:47,955][15349] Signal inference workers to resume experience collection... (226600 times) [2024-06-25 17:41:47,968][15401] InferenceWorker_p0-w0: stopping experience collection (226600 times) [2024-06-25 17:41:47,983][15401] InferenceWorker_p0-w0: resuming experience collection (226600 times) [2024-06-25 17:41:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15309848576. Throughput: 0: 42483.1. Samples: 15309991900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 17:41:50,356][15401] Updated weights for policy 0, policy_version 934444 (0.0041) [2024-06-25 17:41:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15310045184. Throughput: 0: 42319.2. Samples: 15310118340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:53,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 17:41:54,928][15401] Updated weights for policy 0, policy_version 934454 (0.0036) [2024-06-25 17:41:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 15310241792. Throughput: 0: 42173.5. Samples: 15310370560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:41:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 17:41:58,463][15401] Updated weights for policy 0, policy_version 934464 (0.0031) [2024-06-25 17:42:02,646][15401] Updated weights for policy 0, policy_version 934474 (0.0027) [2024-06-25 17:42:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 15310454784. Throughput: 0: 42374.1. Samples: 15310625420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 17:42:06,077][15401] Updated weights for policy 0, policy_version 934484 (0.0039) [2024-06-25 17:42:08,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42050.8, 300 sec: 42709.2). Total num frames: 15310667776. Throughput: 0: 42080.2. Samples: 15310748420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:08,392][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 17:42:10,215][15401] Updated weights for policy 0, policy_version 934494 (0.0041) [2024-06-25 17:42:13,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15310897152. Throughput: 0: 42207.3. Samples: 15311008260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:13,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 17:42:13,662][15401] Updated weights for policy 0, policy_version 934504 (0.0024) [2024-06-25 17:42:17,845][15401] Updated weights for policy 0, policy_version 934514 (0.0037) [2024-06-25 17:42:18,389][15132] Fps is (10 sec: 40969.1, 60 sec: 41779.3, 300 sec: 42654.3). Total num frames: 15311077376. Throughput: 0: 42366.7. Samples: 15311263440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:18,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 17:42:21,371][15401] Updated weights for policy 0, policy_version 934524 (0.0040) [2024-06-25 17:42:23,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 15311290368. Throughput: 0: 42160.4. Samples: 15311386320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:23,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 17:42:25,473][15401] Updated weights for policy 0, policy_version 934534 (0.0043) [2024-06-25 17:42:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15311536128. Throughput: 0: 42252.1. Samples: 15311646520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:28,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 17:42:28,886][15401] Updated weights for policy 0, policy_version 934544 (0.0031) [2024-06-25 17:42:33,288][15401] Updated weights for policy 0, policy_version 934554 (0.0033) [2024-06-25 17:42:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 15311732736. Throughput: 0: 42394.8. Samples: 15311899660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:33,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 17:42:36,663][15401] Updated weights for policy 0, policy_version 934564 (0.0026) [2024-06-25 17:42:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15311929344. Throughput: 0: 42436.1. Samples: 15312027960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:38,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 17:42:40,903][15401] Updated weights for policy 0, policy_version 934574 (0.0031) [2024-06-25 17:42:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15312175104. Throughput: 0: 42628.8. Samples: 15312288860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:43,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 17:42:44,227][15401] Updated weights for policy 0, policy_version 934584 (0.0044) [2024-06-25 17:42:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15312371712. Throughput: 0: 42606.0. Samples: 15312542680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:48,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 17:42:48,612][15401] Updated weights for policy 0, policy_version 934594 (0.0035) [2024-06-25 17:42:51,849][15401] Updated weights for policy 0, policy_version 934604 (0.0032) [2024-06-25 17:42:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 15312584704. Throughput: 0: 42634.9. Samples: 15312666900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-25 17:42:53,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 17:42:56,593][15401] Updated weights for policy 0, policy_version 934614 (0.0035) [2024-06-25 17:42:56,600][15349] Signal inference workers to stop experience collection... (226650 times) [2024-06-25 17:42:56,600][15349] Signal inference workers to resume experience collection... (226650 times) [2024-06-25 17:42:56,619][15401] InferenceWorker_p0-w0: stopping experience collection (226650 times) [2024-06-25 17:42:56,619][15401] InferenceWorker_p0-w0: resuming experience collection (226650 times) [2024-06-25 17:42:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15312797696. Throughput: 0: 42681.7. Samples: 15312928940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:42:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 17:42:59,546][15401] Updated weights for policy 0, policy_version 934624 (0.0027) [2024-06-25 17:43:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15313010688. Throughput: 0: 42689.3. Samples: 15313184460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:03,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 17:43:04,339][15401] Updated weights for policy 0, policy_version 934634 (0.0031) [2024-06-25 17:43:07,031][15401] Updated weights for policy 0, policy_version 934644 (0.0032) [2024-06-25 17:43:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42872.9, 300 sec: 42709.5). Total num frames: 15313240064. Throughput: 0: 42774.1. Samples: 15313311160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:08,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 17:43:11,884][15401] Updated weights for policy 0, policy_version 934654 (0.0032) [2024-06-25 17:43:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15313453056. Throughput: 0: 42894.3. Samples: 15313576760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:13,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 17:43:14,988][15401] Updated weights for policy 0, policy_version 934664 (0.0050) [2024-06-25 17:43:18,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15313666048. Throughput: 0: 42818.1. Samples: 15313826480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:18,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 17:43:19,385][15401] Updated weights for policy 0, policy_version 934674 (0.0034) [2024-06-25 17:43:22,458][15401] Updated weights for policy 0, policy_version 934684 (0.0029) [2024-06-25 17:43:23,390][15132] Fps is (10 sec: 42596.1, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 15313879040. Throughput: 0: 42887.9. Samples: 15313957940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:23,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 17:43:26,996][15401] Updated weights for policy 0, policy_version 934694 (0.0034) [2024-06-25 17:43:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15314092032. Throughput: 0: 42851.6. Samples: 15314217180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:28,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 17:43:30,272][15401] Updated weights for policy 0, policy_version 934704 (0.0037) [2024-06-25 17:43:33,392][15132] Fps is (10 sec: 44228.1, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 15314321408. Throughput: 0: 42866.6. Samples: 15314471780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:33,401][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 17:43:34,447][15401] Updated weights for policy 0, policy_version 934714 (0.0028) [2024-06-25 17:43:37,965][15401] Updated weights for policy 0, policy_version 934724 (0.0025) [2024-06-25 17:43:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15314534400. Throughput: 0: 43127.5. Samples: 15314607640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 17:43:41,849][15401] Updated weights for policy 0, policy_version 934734 (0.0038) [2024-06-25 17:43:43,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15314747392. Throughput: 0: 42901.8. Samples: 15314859520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:43,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 17:43:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934738_15314747392.pth... [2024-06-25 17:43:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934112_15304491008.pth [2024-06-25 17:43:45,915][15401] Updated weights for policy 0, policy_version 934744 (0.0038) [2024-06-25 17:43:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15314944000. Throughput: 0: 42720.9. Samples: 15315106900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 17:43:49,917][15401] Updated weights for policy 0, policy_version 934754 (0.0038) [2024-06-25 17:43:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15315156992. Throughput: 0: 42758.4. Samples: 15315235280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 17:43:53,483][15401] Updated weights for policy 0, policy_version 934764 (0.0027) [2024-06-25 17:43:57,664][15401] Updated weights for policy 0, policy_version 934774 (0.0034) [2024-06-25 17:43:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15315369984. Throughput: 0: 42631.9. Samples: 15315495200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:43:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 17:44:01,347][15401] Updated weights for policy 0, policy_version 934784 (0.0033) [2024-06-25 17:44:03,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.7). Total num frames: 15315582976. Throughput: 0: 42719.2. Samples: 15315748840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 17:44:05,294][15401] Updated weights for policy 0, policy_version 934794 (0.0035) [2024-06-25 17:44:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 15315795968. Throughput: 0: 42638.2. Samples: 15315876640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:08,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 17:44:09,063][15401] Updated weights for policy 0, policy_version 934804 (0.0029) [2024-06-25 17:44:11,475][15349] Signal inference workers to stop experience collection... (226700 times) [2024-06-25 17:44:11,475][15349] Signal inference workers to resume experience collection... (226700 times) [2024-06-25 17:44:11,510][15401] InferenceWorker_p0-w0: stopping experience collection (226700 times) [2024-06-25 17:44:11,510][15401] InferenceWorker_p0-w0: resuming experience collection (226700 times) [2024-06-25 17:44:13,230][15401] Updated weights for policy 0, policy_version 934814 (0.0040) [2024-06-25 17:44:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15316008960. Throughput: 0: 42569.7. Samples: 15316132820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:13,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 17:44:16,732][15401] Updated weights for policy 0, policy_version 934824 (0.0033) [2024-06-25 17:44:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15316221952. Throughput: 0: 42573.9. Samples: 15316387500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 17:44:21,003][15401] Updated weights for policy 0, policy_version 934834 (0.0031) [2024-06-25 17:44:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 15316418560. Throughput: 0: 42427.0. Samples: 15316516860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 17:44:24,370][15401] Updated weights for policy 0, policy_version 934844 (0.0023) [2024-06-25 17:44:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15316631552. Throughput: 0: 42446.8. Samples: 15316769620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:28,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 17:44:28,522][15401] Updated weights for policy 0, policy_version 934854 (0.0031) [2024-06-25 17:44:32,014][15401] Updated weights for policy 0, policy_version 934864 (0.0031) [2024-06-25 17:44:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 15316877312. Throughput: 0: 42758.1. Samples: 15317031020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 17:44:35,882][15401] Updated weights for policy 0, policy_version 934874 (0.0038) [2024-06-25 17:44:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15317073920. Throughput: 0: 42728.7. Samples: 15317158080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:38,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 17:44:40,059][15401] Updated weights for policy 0, policy_version 934884 (0.0033) [2024-06-25 17:44:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15317286912. Throughput: 0: 42608.3. Samples: 15317412580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:43,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 17:44:43,755][15401] Updated weights for policy 0, policy_version 934894 (0.0026) [2024-06-25 17:44:47,768][15401] Updated weights for policy 0, policy_version 934904 (0.0039) [2024-06-25 17:44:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15317516288. Throughput: 0: 42887.5. Samples: 15317678780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-25 17:44:48,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 17:44:51,350][15401] Updated weights for policy 0, policy_version 934914 (0.0043) [2024-06-25 17:44:53,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15317729280. Throughput: 0: 42958.7. Samples: 15317809780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:44:53,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 17:44:55,327][15401] Updated weights for policy 0, policy_version 934924 (0.0033) [2024-06-25 17:44:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15317942272. Throughput: 0: 42736.1. Samples: 15318055940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:44:58,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 17:44:58,981][15401] Updated weights for policy 0, policy_version 934934 (0.0048) [2024-06-25 17:45:03,037][15401] Updated weights for policy 0, policy_version 934944 (0.0033) [2024-06-25 17:45:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15318138880. Throughput: 0: 43047.9. Samples: 15318324660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 17:45:06,279][15401] Updated weights for policy 0, policy_version 934954 (0.0036) [2024-06-25 17:45:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15318368256. Throughput: 0: 42997.5. Samples: 15318451740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 17:45:10,510][15401] Updated weights for policy 0, policy_version 934964 (0.0030) [2024-06-25 17:45:12,010][15349] Signal inference workers to stop experience collection... (226750 times) [2024-06-25 17:45:12,064][15401] InferenceWorker_p0-w0: stopping experience collection (226750 times) [2024-06-25 17:45:12,073][15349] Signal inference workers to resume experience collection... (226750 times) [2024-06-25 17:45:12,086][15401] InferenceWorker_p0-w0: resuming experience collection (226750 times) [2024-06-25 17:45:13,391][15132] Fps is (10 sec: 44230.0, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 15318581248. Throughput: 0: 43076.7. Samples: 15318708140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:13,391][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 17:45:13,822][15401] Updated weights for policy 0, policy_version 934974 (0.0047) [2024-06-25 17:45:18,070][15401] Updated weights for policy 0, policy_version 934984 (0.0033) [2024-06-25 17:45:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15318794240. Throughput: 0: 43160.2. Samples: 15318973220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:18,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 17:45:21,329][15401] Updated weights for policy 0, policy_version 934994 (0.0025) [2024-06-25 17:45:23,390][15132] Fps is (10 sec: 44243.4, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15319023616. Throughput: 0: 43095.2. Samples: 15319097360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:23,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 17:45:25,682][15401] Updated weights for policy 0, policy_version 935004 (0.0044) [2024-06-25 17:45:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15319220224. Throughput: 0: 43051.8. Samples: 15319349900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:28,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 17:45:29,063][15401] Updated weights for policy 0, policy_version 935014 (0.0046) [2024-06-25 17:45:33,248][15401] Updated weights for policy 0, policy_version 935024 (0.0028) [2024-06-25 17:45:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15319433216. Throughput: 0: 42983.1. Samples: 15319613020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:33,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 17:45:36,513][15401] Updated weights for policy 0, policy_version 935034 (0.0043) [2024-06-25 17:45:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15319678976. Throughput: 0: 42774.5. Samples: 15319734640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:38,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 17:45:41,182][15401] Updated weights for policy 0, policy_version 935044 (0.0034) [2024-06-25 17:45:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15319875584. Throughput: 0: 43084.8. Samples: 15319994760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 17:45:43,508][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935052_15319891968.pth... [2024-06-25 17:45:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934426_15309635584.pth [2024-06-25 17:45:44,231][15401] Updated weights for policy 0, policy_version 935054 (0.0034) [2024-06-25 17:45:48,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15320072192. Throughput: 0: 42889.7. Samples: 15320254700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 17:45:48,768][15401] Updated weights for policy 0, policy_version 935064 (0.0039) [2024-06-25 17:45:51,754][15401] Updated weights for policy 0, policy_version 935074 (0.0028) [2024-06-25 17:45:53,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15320317952. Throughput: 0: 42841.2. Samples: 15320379600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:53,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 17:45:56,248][15401] Updated weights for policy 0, policy_version 935084 (0.0037) [2024-06-25 17:45:58,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15320514560. Throughput: 0: 42916.7. Samples: 15320639320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:45:58,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 17:45:59,451][15401] Updated weights for policy 0, policy_version 935094 (0.0037) [2024-06-25 17:46:03,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15320711168. Throughput: 0: 42781.7. Samples: 15320898400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 17:46:04,070][15401] Updated weights for policy 0, policy_version 935104 (0.0028) [2024-06-25 17:46:07,088][15401] Updated weights for policy 0, policy_version 935114 (0.0033) [2024-06-25 17:46:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15320956928. Throughput: 0: 42727.5. Samples: 15321020100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 17:46:11,779][15401] Updated weights for policy 0, policy_version 935124 (0.0037) [2024-06-25 17:46:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42599.5, 300 sec: 42598.4). Total num frames: 15321137152. Throughput: 0: 42823.1. Samples: 15321276940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:13,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 17:46:14,686][15401] Updated weights for policy 0, policy_version 935134 (0.0046) [2024-06-25 17:46:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 15321350144. Throughput: 0: 42731.1. Samples: 15321535920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 17:46:19,326][15401] Updated weights for policy 0, policy_version 935144 (0.0032) [2024-06-25 17:46:22,479][15401] Updated weights for policy 0, policy_version 935154 (0.0025) [2024-06-25 17:46:23,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15321595904. Throughput: 0: 42880.0. Samples: 15321664240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:23,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 17:46:26,768][15401] Updated weights for policy 0, policy_version 935164 (0.0029) [2024-06-25 17:46:26,788][15349] Signal inference workers to stop experience collection... (226800 times) [2024-06-25 17:46:26,789][15349] Signal inference workers to resume experience collection... (226800 times) [2024-06-25 17:46:26,801][15401] InferenceWorker_p0-w0: stopping experience collection (226800 times) [2024-06-25 17:46:26,801][15401] InferenceWorker_p0-w0: resuming experience collection (226800 times) [2024-06-25 17:46:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15321792512. Throughput: 0: 42724.5. Samples: 15321917360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:28,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 17:46:30,119][15401] Updated weights for policy 0, policy_version 935174 (0.0055) [2024-06-25 17:46:33,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15322005504. Throughput: 0: 42786.3. Samples: 15322180080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:33,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 17:46:34,163][15401] Updated weights for policy 0, policy_version 935184 (0.0037) [2024-06-25 17:46:37,637][15401] Updated weights for policy 0, policy_version 935194 (0.0034) [2024-06-25 17:46:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15322234880. Throughput: 0: 42883.2. Samples: 15322309340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:38,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 17:46:41,968][15401] Updated weights for policy 0, policy_version 935204 (0.0041) [2024-06-25 17:46:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15322447872. Throughput: 0: 42790.9. Samples: 15322564920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 17:46:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 17:46:45,210][15401] Updated weights for policy 0, policy_version 935214 (0.0033) [2024-06-25 17:46:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15322677248. Throughput: 0: 42688.0. Samples: 15322819360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:46:48,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 17:46:49,541][15401] Updated weights for policy 0, policy_version 935224 (0.0027) [2024-06-25 17:46:52,698][15401] Updated weights for policy 0, policy_version 935234 (0.0026) [2024-06-25 17:46:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15322890240. Throughput: 0: 42963.1. Samples: 15322953440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:46:53,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 17:46:57,183][15401] Updated weights for policy 0, policy_version 935244 (0.0053) [2024-06-25 17:46:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 15323070464. Throughput: 0: 42834.2. Samples: 15323204480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:46:58,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 17:47:00,979][15401] Updated weights for policy 0, policy_version 935254 (0.0032) [2024-06-25 17:47:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 15323299840. Throughput: 0: 42695.2. Samples: 15323457200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:03,398][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 17:47:04,742][15401] Updated weights for policy 0, policy_version 935264 (0.0033) [2024-06-25 17:47:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15323512832. Throughput: 0: 42810.7. Samples: 15323590720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:08,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 17:47:08,679][15401] Updated weights for policy 0, policy_version 935274 (0.0041) [2024-06-25 17:47:12,604][15401] Updated weights for policy 0, policy_version 935284 (0.0035) [2024-06-25 17:47:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 15323709440. Throughput: 0: 42767.4. Samples: 15323841900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:13,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 17:47:16,463][15401] Updated weights for policy 0, policy_version 935294 (0.0037) [2024-06-25 17:47:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15323922432. Throughput: 0: 42585.8. Samples: 15324096440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:18,390][15132] Avg episode reward: [(0, '0.127')] [2024-06-25 17:47:20,373][15401] Updated weights for policy 0, policy_version 935304 (0.0044) [2024-06-25 17:47:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15324151808. Throughput: 0: 42596.4. Samples: 15324226180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:23,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 17:47:24,084][15401] Updated weights for policy 0, policy_version 935314 (0.0039) [2024-06-25 17:47:27,861][15401] Updated weights for policy 0, policy_version 935324 (0.0036) [2024-06-25 17:47:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15324364800. Throughput: 0: 42537.9. Samples: 15324479120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 17:47:31,618][15401] Updated weights for policy 0, policy_version 935334 (0.0041) [2024-06-25 17:47:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15324577792. Throughput: 0: 42658.8. Samples: 15324739000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:33,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 17:47:35,451][15401] Updated weights for policy 0, policy_version 935344 (0.0031) [2024-06-25 17:47:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15324790784. Throughput: 0: 42511.7. Samples: 15324866460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 17:47:39,135][15401] Updated weights for policy 0, policy_version 935354 (0.0030) [2024-06-25 17:47:43,027][15401] Updated weights for policy 0, policy_version 935364 (0.0034) [2024-06-25 17:47:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15325020160. Throughput: 0: 42747.5. Samples: 15325128120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:43,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 17:47:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935365_15325020160.pth... [2024-06-25 17:47:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000934738_15314747392.pth [2024-06-25 17:47:46,746][15401] Updated weights for policy 0, policy_version 935374 (0.0032) [2024-06-25 17:47:48,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15325200384. Throughput: 0: 42757.7. Samples: 15325381300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:48,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 17:47:50,897][15401] Updated weights for policy 0, policy_version 935384 (0.0031) [2024-06-25 17:47:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15325446144. Throughput: 0: 42639.4. Samples: 15325509500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:53,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 17:47:54,289][15349] Signal inference workers to stop experience collection... (226850 times) [2024-06-25 17:47:54,330][15401] InferenceWorker_p0-w0: stopping experience collection (226850 times) [2024-06-25 17:47:54,358][15349] Signal inference workers to resume experience collection... (226850 times) [2024-06-25 17:47:54,364][15401] InferenceWorker_p0-w0: resuming experience collection (226850 times) [2024-06-25 17:47:54,366][15401] Updated weights for policy 0, policy_version 935394 (0.0030) [2024-06-25 17:47:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15325642752. Throughput: 0: 42738.3. Samples: 15325765120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:47:58,391][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 17:47:58,724][15401] Updated weights for policy 0, policy_version 935404 (0.0032) [2024-06-25 17:48:01,969][15401] Updated weights for policy 0, policy_version 935414 (0.0047) [2024-06-25 17:48:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15325855744. Throughput: 0: 42608.8. Samples: 15326013840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:03,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 17:48:06,485][15401] Updated weights for policy 0, policy_version 935424 (0.0030) [2024-06-25 17:48:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15326068736. Throughput: 0: 42673.4. Samples: 15326146480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:08,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 17:48:09,571][15401] Updated weights for policy 0, policy_version 935434 (0.0028) [2024-06-25 17:48:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15326281728. Throughput: 0: 42760.7. Samples: 15326403360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:13,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 17:48:14,399][15401] Updated weights for policy 0, policy_version 935444 (0.0040) [2024-06-25 17:48:17,231][15401] Updated weights for policy 0, policy_version 935454 (0.0035) [2024-06-25 17:48:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 15326494720. Throughput: 0: 42563.5. Samples: 15326654360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:18,399][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 17:48:21,889][15401] Updated weights for policy 0, policy_version 935464 (0.0032) [2024-06-25 17:48:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15326707712. Throughput: 0: 42699.9. Samples: 15326787960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:23,390][15132] Avg episode reward: [(0, '0.842')] [2024-06-25 17:48:25,040][15401] Updated weights for policy 0, policy_version 935474 (0.0044) [2024-06-25 17:48:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 15326920704. Throughput: 0: 42552.0. Samples: 15327042960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:28,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 17:48:29,443][15401] Updated weights for policy 0, policy_version 935484 (0.0032) [2024-06-25 17:48:32,845][15401] Updated weights for policy 0, policy_version 935494 (0.0038) [2024-06-25 17:48:33,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 15327133696. Throughput: 0: 42471.9. Samples: 15327292540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:33,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 17:48:36,999][15401] Updated weights for policy 0, policy_version 935504 (0.0031) [2024-06-25 17:48:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15327330304. Throughput: 0: 42457.9. Samples: 15327420100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-25 17:48:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 17:48:40,536][15401] Updated weights for policy 0, policy_version 935514 (0.0040) [2024-06-25 17:48:43,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15327559680. Throughput: 0: 42475.3. Samples: 15327676500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:48:43,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 17:48:44,961][15401] Updated weights for policy 0, policy_version 935524 (0.0049) [2024-06-25 17:48:48,220][15401] Updated weights for policy 0, policy_version 935534 (0.0029) [2024-06-25 17:48:48,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15327789056. Throughput: 0: 42472.9. Samples: 15327925120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:48:48,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 17:48:52,538][15401] Updated weights for policy 0, policy_version 935544 (0.0024) [2024-06-25 17:48:53,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 15327969280. Throughput: 0: 42476.9. Samples: 15328057940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:48:53,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 17:48:56,021][15401] Updated weights for policy 0, policy_version 935554 (0.0029) [2024-06-25 17:48:58,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15328182272. Throughput: 0: 42426.8. Samples: 15328312560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:48:58,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 17:49:00,110][15401] Updated weights for policy 0, policy_version 935564 (0.0039) [2024-06-25 17:49:03,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15328411648. Throughput: 0: 42435.2. Samples: 15328563940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:03,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 17:49:03,930][15401] Updated weights for policy 0, policy_version 935574 (0.0028) [2024-06-25 17:49:07,650][15401] Updated weights for policy 0, policy_version 935584 (0.0029) [2024-06-25 17:49:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15328608256. Throughput: 0: 42345.4. Samples: 15328693500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 17:49:11,513][15401] Updated weights for policy 0, policy_version 935594 (0.0027) [2024-06-25 17:49:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 15328804864. Throughput: 0: 42292.9. Samples: 15328946140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 17:49:14,888][15349] Signal inference workers to stop experience collection... (226900 times) [2024-06-25 17:49:14,889][15349] Signal inference workers to resume experience collection... (226900 times) [2024-06-25 17:49:14,931][15401] InferenceWorker_p0-w0: stopping experience collection (226900 times) [2024-06-25 17:49:14,931][15401] InferenceWorker_p0-w0: resuming experience collection (226900 times) [2024-06-25 17:49:15,186][15401] Updated weights for policy 0, policy_version 935604 (0.0035) [2024-06-25 17:49:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15329034240. Throughput: 0: 42452.6. Samples: 15329202900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:18,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 17:49:19,577][15401] Updated weights for policy 0, policy_version 935614 (0.0030) [2024-06-25 17:49:22,983][15401] Updated weights for policy 0, policy_version 935624 (0.0041) [2024-06-25 17:49:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15329263616. Throughput: 0: 42527.4. Samples: 15329333840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:23,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 17:49:26,938][15401] Updated weights for policy 0, policy_version 935634 (0.0033) [2024-06-25 17:49:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15329460224. Throughput: 0: 42480.3. Samples: 15329588120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:28,392][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 17:49:31,136][15401] Updated weights for policy 0, policy_version 935644 (0.0035) [2024-06-25 17:49:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15329673216. Throughput: 0: 42729.9. Samples: 15329847960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:33,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 17:49:34,877][15401] Updated weights for policy 0, policy_version 935654 (0.0052) [2024-06-25 17:49:38,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15329902592. Throughput: 0: 42562.5. Samples: 15329973260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:38,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 17:49:38,804][15401] Updated weights for policy 0, policy_version 935664 (0.0038) [2024-06-25 17:49:42,518][15401] Updated weights for policy 0, policy_version 935674 (0.0032) [2024-06-25 17:49:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15330115584. Throughput: 0: 42572.4. Samples: 15330228320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 17:49:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935677_15330131968.pth... [2024-06-25 17:49:43,562][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935052_15319891968.pth [2024-06-25 17:49:46,397][15401] Updated weights for policy 0, policy_version 935684 (0.0030) [2024-06-25 17:49:48,392][15132] Fps is (10 sec: 40951.3, 60 sec: 42050.8, 300 sec: 42653.6). Total num frames: 15330312192. Throughput: 0: 42870.8. Samples: 15330493220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:48,392][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 17:49:50,167][15401] Updated weights for policy 0, policy_version 935694 (0.0042) [2024-06-25 17:49:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15330541568. Throughput: 0: 42576.8. Samples: 15330609460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:53,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 17:49:54,131][15401] Updated weights for policy 0, policy_version 935704 (0.0032) [2024-06-25 17:49:57,858][15401] Updated weights for policy 0, policy_version 935714 (0.0025) [2024-06-25 17:49:58,390][15132] Fps is (10 sec: 45884.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15330770944. Throughput: 0: 42838.5. Samples: 15330873880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:49:58,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 17:50:01,645][15401] Updated weights for policy 0, policy_version 935724 (0.0038) [2024-06-25 17:50:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15330967552. Throughput: 0: 42875.9. Samples: 15331132320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:03,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 17:50:05,499][15401] Updated weights for policy 0, policy_version 935734 (0.0037) [2024-06-25 17:50:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.3, 300 sec: 42709.7). Total num frames: 15331180544. Throughput: 0: 42675.5. Samples: 15331254240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:08,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 17:50:09,163][15401] Updated weights for policy 0, policy_version 935744 (0.0034) [2024-06-25 17:50:13,297][15401] Updated weights for policy 0, policy_version 935754 (0.0040) [2024-06-25 17:50:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15331393536. Throughput: 0: 42869.4. Samples: 15331517240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 17:50:17,016][15401] Updated weights for policy 0, policy_version 935764 (0.0027) [2024-06-25 17:50:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15331606528. Throughput: 0: 42866.2. Samples: 15331776940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:18,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 17:50:20,818][15401] Updated weights for policy 0, policy_version 935774 (0.0034) [2024-06-25 17:50:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15331819520. Throughput: 0: 42844.5. Samples: 15331901260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:23,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 17:50:24,707][15401] Updated weights for policy 0, policy_version 935784 (0.0024) [2024-06-25 17:50:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15332032512. Throughput: 0: 43062.7. Samples: 15332166140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:28,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 17:50:28,430][15349] Signal inference workers to stop experience collection... (226950 times) [2024-06-25 17:50:28,475][15401] InferenceWorker_p0-w0: stopping experience collection (226950 times) [2024-06-25 17:50:28,482][15401] Updated weights for policy 0, policy_version 935794 (0.0040) [2024-06-25 17:50:28,482][15349] Signal inference workers to resume experience collection... (226950 times) [2024-06-25 17:50:28,494][15401] InferenceWorker_p0-w0: resuming experience collection (226950 times) [2024-06-25 17:50:32,437][15401] Updated weights for policy 0, policy_version 935804 (0.0040) [2024-06-25 17:50:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15332245504. Throughput: 0: 42645.6. Samples: 15332412180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 17:50:36,206][15401] Updated weights for policy 0, policy_version 935814 (0.0026) [2024-06-25 17:50:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15332474880. Throughput: 0: 42957.3. Samples: 15332542540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 17:50:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 17:50:39,991][15401] Updated weights for policy 0, policy_version 935824 (0.0042) [2024-06-25 17:50:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15332671488. Throughput: 0: 42888.0. Samples: 15332803840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:50:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 17:50:43,908][15401] Updated weights for policy 0, policy_version 935834 (0.0038) [2024-06-25 17:50:47,747][15401] Updated weights for policy 0, policy_version 935844 (0.0037) [2024-06-25 17:50:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 15332884480. Throughput: 0: 42753.0. Samples: 15333056200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:50:48,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 17:50:51,326][15401] Updated weights for policy 0, policy_version 935854 (0.0039) [2024-06-25 17:50:53,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15333113856. Throughput: 0: 42965.9. Samples: 15333187700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:50:53,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 17:50:55,170][15401] Updated weights for policy 0, policy_version 935864 (0.0044) [2024-06-25 17:50:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15333310464. Throughput: 0: 42890.2. Samples: 15333447300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:50:58,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 17:50:58,773][15401] Updated weights for policy 0, policy_version 935874 (0.0039) [2024-06-25 17:51:02,722][15401] Updated weights for policy 0, policy_version 935884 (0.0043) [2024-06-25 17:51:03,394][15132] Fps is (10 sec: 42579.5, 60 sec: 42868.4, 300 sec: 42653.3). Total num frames: 15333539840. Throughput: 0: 42711.4. Samples: 15333699140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:03,394][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 17:51:06,281][15401] Updated weights for policy 0, policy_version 935894 (0.0029) [2024-06-25 17:51:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15333752832. Throughput: 0: 42816.8. Samples: 15333828020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 17:51:10,928][15401] Updated weights for policy 0, policy_version 935904 (0.0036) [2024-06-25 17:51:13,389][15132] Fps is (10 sec: 42617.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15333965824. Throughput: 0: 42739.6. Samples: 15334089420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:13,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 17:51:14,570][15401] Updated weights for policy 0, policy_version 935914 (0.0036) [2024-06-25 17:51:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15334162432. Throughput: 0: 43010.2. Samples: 15334347640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:18,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 17:51:18,447][15401] Updated weights for policy 0, policy_version 935924 (0.0034) [2024-06-25 17:51:22,181][15401] Updated weights for policy 0, policy_version 935934 (0.0030) [2024-06-25 17:51:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15334408192. Throughput: 0: 42887.1. Samples: 15334472460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 17:51:26,000][15401] Updated weights for policy 0, policy_version 935944 (0.0037) [2024-06-25 17:51:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15334604800. Throughput: 0: 42776.1. Samples: 15334728760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:28,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 17:51:29,649][15401] Updated weights for policy 0, policy_version 935954 (0.0045) [2024-06-25 17:51:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15334817792. Throughput: 0: 42927.5. Samples: 15334987940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 17:51:33,614][15401] Updated weights for policy 0, policy_version 935964 (0.0036) [2024-06-25 17:51:37,429][15401] Updated weights for policy 0, policy_version 935974 (0.0033) [2024-06-25 17:51:38,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15335063552. Throughput: 0: 42847.9. Samples: 15335115860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 17:51:41,323][15401] Updated weights for policy 0, policy_version 935984 (0.0041) [2024-06-25 17:51:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15335243776. Throughput: 0: 42755.9. Samples: 15335371320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 17:51:43,537][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935990_15335260160.pth... [2024-06-25 17:51:43,626][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935365_15325020160.pth [2024-06-25 17:51:45,025][15401] Updated weights for policy 0, policy_version 935994 (0.0044) [2024-06-25 17:51:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15335456768. Throughput: 0: 42814.9. Samples: 15335625620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 17:51:49,042][15401] Updated weights for policy 0, policy_version 936004 (0.0038) [2024-06-25 17:51:52,568][15401] Updated weights for policy 0, policy_version 936014 (0.0042) [2024-06-25 17:51:53,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15335686144. Throughput: 0: 42851.0. Samples: 15335756320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:53,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 17:51:56,573][15349] Signal inference workers to stop experience collection... (227000 times) [2024-06-25 17:51:56,575][15349] Signal inference workers to resume experience collection... (227000 times) [2024-06-25 17:51:56,583][15401] Updated weights for policy 0, policy_version 936024 (0.0036) [2024-06-25 17:51:56,604][15401] InferenceWorker_p0-w0: stopping experience collection (227000 times) [2024-06-25 17:51:56,605][15401] InferenceWorker_p0-w0: resuming experience collection (227000 times) [2024-06-25 17:51:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15335882752. Throughput: 0: 42664.4. Samples: 15336009320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:51:58,391][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 17:52:00,218][15401] Updated weights for policy 0, policy_version 936034 (0.0036) [2024-06-25 17:52:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42601.5, 300 sec: 42653.9). Total num frames: 15336095744. Throughput: 0: 42807.5. Samples: 15336273980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:03,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 17:52:04,090][15401] Updated weights for policy 0, policy_version 936044 (0.0035) [2024-06-25 17:52:07,839][15401] Updated weights for policy 0, policy_version 936054 (0.0037) [2024-06-25 17:52:08,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15336341504. Throughput: 0: 42902.3. Samples: 15336403060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:08,390][15132] Avg episode reward: [(0, '0.865')] [2024-06-25 17:52:11,540][15401] Updated weights for policy 0, policy_version 936064 (0.0036) [2024-06-25 17:52:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15336521728. Throughput: 0: 42808.9. Samples: 15336655160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:13,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 17:52:15,481][15401] Updated weights for policy 0, policy_version 936074 (0.0030) [2024-06-25 17:52:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15336734720. Throughput: 0: 42767.9. Samples: 15336912500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:18,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 17:52:19,137][15401] Updated weights for policy 0, policy_version 936084 (0.0033) [2024-06-25 17:52:23,138][15401] Updated weights for policy 0, policy_version 936094 (0.0033) [2024-06-25 17:52:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15336980480. Throughput: 0: 42823.9. Samples: 15337042940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 17:52:26,806][15401] Updated weights for policy 0, policy_version 936104 (0.0034) [2024-06-25 17:52:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15337177088. Throughput: 0: 42849.9. Samples: 15337299560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 17:52:30,586][15401] Updated weights for policy 0, policy_version 936114 (0.0036) [2024-06-25 17:52:33,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15337373696. Throughput: 0: 43050.7. Samples: 15337562900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:52:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 17:52:34,509][15401] Updated weights for policy 0, policy_version 936124 (0.0044) [2024-06-25 17:52:38,080][15401] Updated weights for policy 0, policy_version 936134 (0.0029) [2024-06-25 17:52:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15337635840. Throughput: 0: 43039.1. Samples: 15337693080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:52:38,390][15132] Avg episode reward: [(0, '0.335')] [2024-06-25 17:52:42,055][15401] Updated weights for policy 0, policy_version 936144 (0.0037) [2024-06-25 17:52:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15337816064. Throughput: 0: 43043.9. Samples: 15337946300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:52:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 17:52:45,692][15401] Updated weights for policy 0, policy_version 936154 (0.0037) [2024-06-25 17:52:48,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15338045440. Throughput: 0: 43084.6. Samples: 15338212780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:52:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 17:52:49,767][15401] Updated weights for policy 0, policy_version 936164 (0.0036) [2024-06-25 17:52:53,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15338258432. Throughput: 0: 43087.1. Samples: 15338342080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:52:53,396][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 17:52:53,681][15401] Updated weights for policy 0, policy_version 936174 (0.0040) [2024-06-25 17:52:57,358][15401] Updated weights for policy 0, policy_version 936184 (0.0032) [2024-06-25 17:52:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15338471424. Throughput: 0: 43121.7. Samples: 15338595640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:52:58,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 17:53:01,297][15401] Updated weights for policy 0, policy_version 936194 (0.0038) [2024-06-25 17:53:03,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15338684416. Throughput: 0: 43086.8. Samples: 15338851400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 17:53:05,089][15401] Updated weights for policy 0, policy_version 936204 (0.0033) [2024-06-25 17:53:06,020][15349] Signal inference workers to stop experience collection... (227050 times) [2024-06-25 17:53:06,071][15349] Signal inference workers to resume experience collection... (227050 times) [2024-06-25 17:53:06,071][15401] InferenceWorker_p0-w0: stopping experience collection (227050 times) [2024-06-25 17:53:06,086][15401] InferenceWorker_p0-w0: resuming experience collection (227050 times) [2024-06-25 17:53:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15338897408. Throughput: 0: 42972.9. Samples: 15338976720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 17:53:08,846][15401] Updated weights for policy 0, policy_version 936214 (0.0029) [2024-06-25 17:53:13,091][15401] Updated weights for policy 0, policy_version 936224 (0.0037) [2024-06-25 17:53:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15339110400. Throughput: 0: 42965.2. Samples: 15339233000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:13,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 17:53:16,664][15401] Updated weights for policy 0, policy_version 936234 (0.0037) [2024-06-25 17:53:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15339323392. Throughput: 0: 42828.8. Samples: 15339490200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:18,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 17:53:20,688][15401] Updated weights for policy 0, policy_version 936244 (0.0029) [2024-06-25 17:53:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 15339536384. Throughput: 0: 42838.9. Samples: 15339620820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 17:53:24,108][15401] Updated weights for policy 0, policy_version 936254 (0.0033) [2024-06-25 17:53:28,149][15401] Updated weights for policy 0, policy_version 936264 (0.0036) [2024-06-25 17:53:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15339749376. Throughput: 0: 43006.8. Samples: 15339881600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 17:53:31,901][15401] Updated weights for policy 0, policy_version 936274 (0.0040) [2024-06-25 17:53:33,389][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15339978752. Throughput: 0: 42784.9. Samples: 15340138100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:33,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 17:53:35,671][15401] Updated weights for policy 0, policy_version 936284 (0.0025) [2024-06-25 17:53:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15340191744. Throughput: 0: 42699.5. Samples: 15340263460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:38,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 17:53:39,967][15401] Updated weights for policy 0, policy_version 936294 (0.0036) [2024-06-25 17:53:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15340388352. Throughput: 0: 42880.7. Samples: 15340525260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:43,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 17:53:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936304_15340404736.pth... [2024-06-25 17:53:43,515][15401] Updated weights for policy 0, policy_version 936304 (0.0038) [2024-06-25 17:53:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935677_15330131968.pth [2024-06-25 17:53:47,510][15401] Updated weights for policy 0, policy_version 936314 (0.0033) [2024-06-25 17:53:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15340601344. Throughput: 0: 42889.2. Samples: 15340781420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 17:53:51,001][15401] Updated weights for policy 0, policy_version 936324 (0.0023) [2024-06-25 17:53:53,393][15132] Fps is (10 sec: 44222.1, 60 sec: 42870.9, 300 sec: 42875.6). Total num frames: 15340830720. Throughput: 0: 42902.4. Samples: 15340907460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:53,393][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 17:53:55,055][15401] Updated weights for policy 0, policy_version 936334 (0.0036) [2024-06-25 17:53:58,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15341043712. Throughput: 0: 42977.9. Samples: 15341167000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:53:58,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 17:53:58,425][15401] Updated weights for policy 0, policy_version 936344 (0.0032) [2024-06-25 17:54:02,547][15401] Updated weights for policy 0, policy_version 936354 (0.0043) [2024-06-25 17:54:03,390][15132] Fps is (10 sec: 40973.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15341240320. Throughput: 0: 42964.9. Samples: 15341423620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:03,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 17:54:05,979][15401] Updated weights for policy 0, policy_version 936364 (0.0035) [2024-06-25 17:54:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 15341469696. Throughput: 0: 42761.7. Samples: 15341545100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:08,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 17:54:10,218][15401] Updated weights for policy 0, policy_version 936374 (0.0038) [2024-06-25 17:54:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15341699072. Throughput: 0: 42915.9. Samples: 15341812820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:13,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 17:54:13,462][15401] Updated weights for policy 0, policy_version 936384 (0.0035) [2024-06-25 17:54:17,672][15401] Updated weights for policy 0, policy_version 936394 (0.0026) [2024-06-25 17:54:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15341912064. Throughput: 0: 42824.5. Samples: 15342065200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 17:54:21,340][15349] Signal inference workers to stop experience collection... (227100 times) [2024-06-25 17:54:21,345][15349] Signal inference workers to resume experience collection... (227100 times) [2024-06-25 17:54:21,355][15401] Updated weights for policy 0, policy_version 936404 (0.0035) [2024-06-25 17:54:21,381][15401] InferenceWorker_p0-w0: stopping experience collection (227100 times) [2024-06-25 17:54:21,381][15401] InferenceWorker_p0-w0: resuming experience collection (227100 times) [2024-06-25 17:54:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15342108672. Throughput: 0: 42871.2. Samples: 15342192660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:23,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 17:54:25,345][15401] Updated weights for policy 0, policy_version 936414 (0.0032) [2024-06-25 17:54:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15342338048. Throughput: 0: 42914.6. Samples: 15342456420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-25 17:54:28,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 17:54:28,854][15401] Updated weights for policy 0, policy_version 936424 (0.0029) [2024-06-25 17:54:32,750][15401] Updated weights for policy 0, policy_version 936434 (0.0037) [2024-06-25 17:54:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15342551040. Throughput: 0: 42879.2. Samples: 15342710980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 17:54:36,609][15401] Updated weights for policy 0, policy_version 936444 (0.0026) [2024-06-25 17:54:38,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15342747648. Throughput: 0: 42882.5. Samples: 15342837040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 17:54:40,492][15401] Updated weights for policy 0, policy_version 936454 (0.0037) [2024-06-25 17:54:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42987.5). Total num frames: 15342993408. Throughput: 0: 42943.9. Samples: 15343099480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:43,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 17:54:44,588][15401] Updated weights for policy 0, policy_version 936464 (0.0031) [2024-06-25 17:54:47,935][15401] Updated weights for policy 0, policy_version 936474 (0.0033) [2024-06-25 17:54:48,395][15132] Fps is (10 sec: 45852.3, 60 sec: 43414.0, 300 sec: 42930.9). Total num frames: 15343206400. Throughput: 0: 42876.1. Samples: 15343353260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:48,395][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 17:54:52,502][15401] Updated weights for policy 0, policy_version 936484 (0.0043) [2024-06-25 17:54:53,396][15132] Fps is (10 sec: 40934.1, 60 sec: 42869.2, 300 sec: 42819.6). Total num frames: 15343403008. Throughput: 0: 43033.8. Samples: 15343481900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:53,396][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 17:54:55,472][15401] Updated weights for policy 0, policy_version 936494 (0.0031) [2024-06-25 17:54:58,389][15132] Fps is (10 sec: 42620.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15343632384. Throughput: 0: 42940.5. Samples: 15343745140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:54:58,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 17:55:00,017][15401] Updated weights for policy 0, policy_version 936504 (0.0025) [2024-06-25 17:55:03,276][15401] Updated weights for policy 0, policy_version 936514 (0.0024) [2024-06-25 17:55:03,389][15132] Fps is (10 sec: 44265.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15343845376. Throughput: 0: 43120.8. Samples: 15344005640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:03,390][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 17:55:07,803][15401] Updated weights for policy 0, policy_version 936524 (0.0038) [2024-06-25 17:55:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15344041984. Throughput: 0: 43142.7. Samples: 15344134080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:08,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 17:55:10,855][15401] Updated weights for policy 0, policy_version 936534 (0.0025) [2024-06-25 17:55:13,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42986.8). Total num frames: 15344287744. Throughput: 0: 42995.5. Samples: 15344391320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:13,393][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 17:55:15,515][15401] Updated weights for policy 0, policy_version 936544 (0.0034) [2024-06-25 17:55:18,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15344484352. Throughput: 0: 42914.8. Samples: 15344642140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 17:55:18,497][15401] Updated weights for policy 0, policy_version 936554 (0.0040) [2024-06-25 17:55:23,089][15401] Updated weights for policy 0, policy_version 936564 (0.0036) [2024-06-25 17:55:23,389][15132] Fps is (10 sec: 39331.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15344680960. Throughput: 0: 42969.5. Samples: 15344770660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:23,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 17:55:26,554][15401] Updated weights for policy 0, policy_version 936574 (0.0026) [2024-06-25 17:55:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15344910336. Throughput: 0: 42875.2. Samples: 15345028860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 17:55:30,525][15401] Updated weights for policy 0, policy_version 936584 (0.0039) [2024-06-25 17:55:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15345123328. Throughput: 0: 43043.9. Samples: 15345290020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:33,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 17:55:34,038][15401] Updated weights for policy 0, policy_version 936594 (0.0032) [2024-06-25 17:55:37,952][15401] Updated weights for policy 0, policy_version 936604 (0.0043) [2024-06-25 17:55:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15345319936. Throughput: 0: 42919.0. Samples: 15345412980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:38,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 17:55:41,515][15349] Signal inference workers to stop experience collection... (227150 times) [2024-06-25 17:55:41,555][15401] InferenceWorker_p0-w0: stopping experience collection (227150 times) [2024-06-25 17:55:41,563][15349] Signal inference workers to resume experience collection... (227150 times) [2024-06-25 17:55:41,570][15401] InferenceWorker_p0-w0: resuming experience collection (227150 times) [2024-06-25 17:55:41,577][15401] Updated weights for policy 0, policy_version 936614 (0.0032) [2024-06-25 17:55:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42987.1). Total num frames: 15345565696. Throughput: 0: 42851.4. Samples: 15345673460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:43,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 17:55:43,542][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936620_15345582080.pth... [2024-06-25 17:55:43,601][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000935990_15335260160.pth [2024-06-25 17:55:45,785][15401] Updated weights for policy 0, policy_version 936624 (0.0046) [2024-06-25 17:55:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42328.9, 300 sec: 42820.6). Total num frames: 15345745920. Throughput: 0: 42985.4. Samples: 15345939980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:48,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 17:55:49,175][15401] Updated weights for policy 0, policy_version 936634 (0.0047) [2024-06-25 17:55:53,172][15401] Updated weights for policy 0, policy_version 936644 (0.0040) [2024-06-25 17:55:53,392][15132] Fps is (10 sec: 42586.9, 60 sec: 43147.1, 300 sec: 42986.8). Total num frames: 15345991680. Throughput: 0: 42974.6. Samples: 15346068060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:53,393][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 17:55:56,514][15401] Updated weights for policy 0, policy_version 936654 (0.0021) [2024-06-25 17:55:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42932.3). Total num frames: 15346204672. Throughput: 0: 43120.1. Samples: 15346331620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:55:58,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 17:56:00,623][15401] Updated weights for policy 0, policy_version 936664 (0.0023) [2024-06-25 17:56:03,389][15132] Fps is (10 sec: 42610.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15346417664. Throughput: 0: 43417.8. Samples: 15346595940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:56:03,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 17:56:04,171][15401] Updated weights for policy 0, policy_version 936674 (0.0034) [2024-06-25 17:56:08,061][15401] Updated weights for policy 0, policy_version 936684 (0.0049) [2024-06-25 17:56:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15346630656. Throughput: 0: 43308.4. Samples: 15346719540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:56:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 17:56:11,790][15401] Updated weights for policy 0, policy_version 936694 (0.0028) [2024-06-25 17:56:13,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 43042.7). Total num frames: 15346860032. Throughput: 0: 43296.9. Samples: 15346977220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:56:13,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 17:56:15,644][15401] Updated weights for policy 0, policy_version 936704 (0.0040) [2024-06-25 17:56:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 15347073024. Throughput: 0: 43419.7. Samples: 15347243900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:56:18,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 17:56:19,387][15401] Updated weights for policy 0, policy_version 936714 (0.0026) [2024-06-25 17:56:23,130][15401] Updated weights for policy 0, policy_version 936724 (0.0032) [2024-06-25 17:56:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15347286016. Throughput: 0: 43426.7. Samples: 15347367180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-25 17:56:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 17:56:26,959][15401] Updated weights for policy 0, policy_version 936734 (0.0029) [2024-06-25 17:56:28,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 15347515392. Throughput: 0: 43458.9. Samples: 15347629100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:28,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 17:56:30,989][15401] Updated weights for policy 0, policy_version 936744 (0.0038) [2024-06-25 17:56:33,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15347728384. Throughput: 0: 43249.2. Samples: 15347886200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:33,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 17:56:34,538][15401] Updated weights for policy 0, policy_version 936754 (0.0037) [2024-06-25 17:56:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15347924992. Throughput: 0: 43264.4. Samples: 15348014840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 17:56:38,553][15401] Updated weights for policy 0, policy_version 936764 (0.0045) [2024-06-25 17:56:42,621][15401] Updated weights for policy 0, policy_version 936774 (0.0028) [2024-06-25 17:56:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43417.8, 300 sec: 43098.2). Total num frames: 15348170752. Throughput: 0: 43204.1. Samples: 15348275800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 17:56:46,056][15401] Updated weights for policy 0, policy_version 936784 (0.0031) [2024-06-25 17:56:48,392][15132] Fps is (10 sec: 44227.1, 60 sec: 43688.9, 300 sec: 42986.8). Total num frames: 15348367360. Throughput: 0: 42828.8. Samples: 15348523340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:48,392][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 17:56:50,415][15401] Updated weights for policy 0, policy_version 936794 (0.0040) [2024-06-25 17:56:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 43146.5, 300 sec: 43042.7). Total num frames: 15348580352. Throughput: 0: 42987.4. Samples: 15348653980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:53,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 17:56:53,673][15401] Updated weights for policy 0, policy_version 936804 (0.0036) [2024-06-25 17:56:57,916][15401] Updated weights for policy 0, policy_version 936814 (0.0037) [2024-06-25 17:56:57,927][15349] Signal inference workers to stop experience collection... (227200 times) [2024-06-25 17:56:57,931][15349] Signal inference workers to resume experience collection... (227200 times) [2024-06-25 17:56:57,951][15401] InferenceWorker_p0-w0: stopping experience collection (227200 times) [2024-06-25 17:56:57,951][15401] InferenceWorker_p0-w0: resuming experience collection (227200 times) [2024-06-25 17:56:58,390][15132] Fps is (10 sec: 42607.9, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 15348793344. Throughput: 0: 43105.2. Samples: 15348916960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:56:58,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 17:57:01,239][15401] Updated weights for policy 0, policy_version 936824 (0.0038) [2024-06-25 17:57:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15349006336. Throughput: 0: 42846.9. Samples: 15349172020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:03,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 17:57:05,458][15401] Updated weights for policy 0, policy_version 936834 (0.0037) [2024-06-25 17:57:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15349202944. Throughput: 0: 42962.1. Samples: 15349300480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:08,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 17:57:08,745][15401] Updated weights for policy 0, policy_version 936844 (0.0033) [2024-06-25 17:57:12,864][15401] Updated weights for policy 0, policy_version 936854 (0.0024) [2024-06-25 17:57:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 15349448704. Throughput: 0: 42925.3. Samples: 15349560740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 17:57:16,406][15401] Updated weights for policy 0, policy_version 936864 (0.0032) [2024-06-25 17:57:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15349628928. Throughput: 0: 42799.7. Samples: 15349812180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:18,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 17:57:20,301][15401] Updated weights for policy 0, policy_version 936874 (0.0041) [2024-06-25 17:57:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15349858304. Throughput: 0: 42757.9. Samples: 15349938940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:23,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 17:57:24,183][15401] Updated weights for policy 0, policy_version 936884 (0.0034) [2024-06-25 17:57:28,076][15401] Updated weights for policy 0, policy_version 936894 (0.0037) [2024-06-25 17:57:28,391][15132] Fps is (10 sec: 45867.5, 60 sec: 42870.3, 300 sec: 43098.0). Total num frames: 15350087680. Throughput: 0: 42630.8. Samples: 15350194260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:28,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 17:57:31,750][15401] Updated weights for policy 0, policy_version 936904 (0.0031) [2024-06-25 17:57:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15350284288. Throughput: 0: 42967.0. Samples: 15350456760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:33,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 17:57:35,623][15401] Updated weights for policy 0, policy_version 936914 (0.0039) [2024-06-25 17:57:38,390][15132] Fps is (10 sec: 42604.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 15350513664. Throughput: 0: 42829.8. Samples: 15350581320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 17:57:39,453][15401] Updated weights for policy 0, policy_version 936924 (0.0041) [2024-06-25 17:57:43,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 15350710272. Throughput: 0: 42797.5. Samples: 15350842840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:43,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 17:57:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936934_15350726656.pth... [2024-06-25 17:57:43,418][15401] Updated weights for policy 0, policy_version 936934 (0.0043) [2024-06-25 17:57:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936304_15340404736.pth [2024-06-25 17:57:47,117][15401] Updated weights for policy 0, policy_version 936944 (0.0050) [2024-06-25 17:57:48,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42600.1, 300 sec: 42932.0). Total num frames: 15350923264. Throughput: 0: 42837.4. Samples: 15351099700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:48,394][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 17:57:51,060][15401] Updated weights for policy 0, policy_version 936954 (0.0030) [2024-06-25 17:57:53,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 15351169024. Throughput: 0: 42812.9. Samples: 15351227060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:53,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 17:57:54,587][15401] Updated weights for policy 0, policy_version 936964 (0.0024) [2024-06-25 17:57:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 15351332864. Throughput: 0: 42710.7. Samples: 15351482720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:57:58,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 17:57:59,037][15401] Updated weights for policy 0, policy_version 936974 (0.0042) [2024-06-25 17:58:02,527][15401] Updated weights for policy 0, policy_version 936984 (0.0032) [2024-06-25 17:58:03,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42931.7). Total num frames: 15351562240. Throughput: 0: 42786.3. Samples: 15351737560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:58:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 17:58:06,584][15401] Updated weights for policy 0, policy_version 936994 (0.0041) [2024-06-25 17:58:08,390][15132] Fps is (10 sec: 47513.4, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 15351808000. Throughput: 0: 42953.8. Samples: 15351871860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:58:08,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 17:58:09,970][15401] Updated weights for policy 0, policy_version 937004 (0.0036) [2024-06-25 17:58:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42931.7). Total num frames: 15351988224. Throughput: 0: 42929.2. Samples: 15352126000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:58:13,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 17:58:14,207][15401] Updated weights for policy 0, policy_version 937014 (0.0030) [2024-06-25 17:58:17,367][15401] Updated weights for policy 0, policy_version 937024 (0.0044) [2024-06-25 17:58:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 15352217600. Throughput: 0: 42833.4. Samples: 15352384260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:58:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 17:58:21,620][15401] Updated weights for policy 0, policy_version 937034 (0.0023) [2024-06-25 17:58:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 15352446976. Throughput: 0: 43108.5. Samples: 15352521200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-25 17:58:23,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 17:58:24,985][15401] Updated weights for policy 0, policy_version 937044 (0.0038) [2024-06-25 17:58:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42599.6, 300 sec: 42931.6). Total num frames: 15352643584. Throughput: 0: 43024.8. Samples: 15352778960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:28,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-25 17:58:28,950][15401] Updated weights for policy 0, policy_version 937054 (0.0032) [2024-06-25 17:58:32,599][15401] Updated weights for policy 0, policy_version 937064 (0.0033) [2024-06-25 17:58:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 15352872960. Throughput: 0: 42899.5. Samples: 15353030180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:33,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 17:58:36,805][15349] Signal inference workers to stop experience collection... (227250 times) [2024-06-25 17:58:36,856][15401] InferenceWorker_p0-w0: stopping experience collection (227250 times) [2024-06-25 17:58:36,917][15349] Signal inference workers to resume experience collection... (227250 times) [2024-06-25 17:58:36,917][15401] InferenceWorker_p0-w0: resuming experience collection (227250 times) [2024-06-25 17:58:36,919][15401] Updated weights for policy 0, policy_version 937074 (0.0037) [2024-06-25 17:58:38,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15353085952. Throughput: 0: 43124.0. Samples: 15353167640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:38,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 17:58:40,093][15401] Updated weights for policy 0, policy_version 937084 (0.0031) [2024-06-25 17:58:43,393][15132] Fps is (10 sec: 42584.2, 60 sec: 43142.0, 300 sec: 43042.2). Total num frames: 15353298944. Throughput: 0: 43246.9. Samples: 15353428980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:43,393][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 17:58:44,411][15401] Updated weights for policy 0, policy_version 937094 (0.0029) [2024-06-25 17:58:47,863][15401] Updated weights for policy 0, policy_version 937104 (0.0029) [2024-06-25 17:58:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42987.7). Total num frames: 15353511936. Throughput: 0: 43165.8. Samples: 15353680020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 17:58:51,869][15401] Updated weights for policy 0, policy_version 937114 (0.0028) [2024-06-25 17:58:53,389][15132] Fps is (10 sec: 44251.9, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15353741312. Throughput: 0: 43063.6. Samples: 15353809720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:53,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 17:58:55,737][15401] Updated weights for policy 0, policy_version 937124 (0.0046) [2024-06-25 17:58:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 15353937920. Throughput: 0: 43272.0. Samples: 15354073240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:58:58,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 17:58:59,451][15401] Updated weights for policy 0, policy_version 937134 (0.0028) [2024-06-25 17:59:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15354150912. Throughput: 0: 43243.2. Samples: 15354330200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:03,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 17:59:03,426][15401] Updated weights for policy 0, policy_version 937144 (0.0032) [2024-06-25 17:59:07,077][15401] Updated weights for policy 0, policy_version 937154 (0.0033) [2024-06-25 17:59:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15354380288. Throughput: 0: 43003.2. Samples: 15354456340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:08,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 17:59:11,024][15401] Updated weights for policy 0, policy_version 937164 (0.0029) [2024-06-25 17:59:13,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.4, 300 sec: 42987.1). Total num frames: 15354593280. Throughput: 0: 43156.3. Samples: 15354721000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 17:59:14,641][15401] Updated weights for policy 0, policy_version 937174 (0.0037) [2024-06-25 17:59:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 15354789888. Throughput: 0: 43231.3. Samples: 15354975580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:18,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 17:59:18,632][15401] Updated weights for policy 0, policy_version 937184 (0.0039) [2024-06-25 17:59:22,316][15401] Updated weights for policy 0, policy_version 937194 (0.0034) [2024-06-25 17:59:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 15355019264. Throughput: 0: 43031.3. Samples: 15355104040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 17:59:26,056][15401] Updated weights for policy 0, policy_version 937204 (0.0031) [2024-06-25 17:59:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 43042.7). Total num frames: 15355248640. Throughput: 0: 42948.1. Samples: 15355361500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:28,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 17:59:29,840][15401] Updated weights for policy 0, policy_version 937214 (0.0049) [2024-06-25 17:59:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 15355461632. Throughput: 0: 43098.2. Samples: 15355619440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:33,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 17:59:33,819][15401] Updated weights for policy 0, policy_version 937224 (0.0033) [2024-06-25 17:59:37,607][15401] Updated weights for policy 0, policy_version 937234 (0.0044) [2024-06-25 17:59:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 15355658240. Throughput: 0: 43103.7. Samples: 15355749380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:38,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 17:59:41,361][15401] Updated weights for policy 0, policy_version 937244 (0.0045) [2024-06-25 17:59:43,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43420.0, 300 sec: 43043.4). Total num frames: 15355904000. Throughput: 0: 42910.1. Samples: 15356004200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:43,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-25 17:59:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937250_15355904000.pth... [2024-06-25 17:59:43,464][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936620_15345582080.pth [2024-06-25 17:59:45,075][15401] Updated weights for policy 0, policy_version 937254 (0.0031) [2024-06-25 17:59:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 43043.7). Total num frames: 15356100608. Throughput: 0: 42953.8. Samples: 15356263120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 17:59:49,115][15401] Updated weights for policy 0, policy_version 937264 (0.0028) [2024-06-25 17:59:52,718][15401] Updated weights for policy 0, policy_version 937274 (0.0031) [2024-06-25 17:59:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15356297216. Throughput: 0: 42870.7. Samples: 15356385520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:53,392][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 17:59:56,564][15401] Updated weights for policy 0, policy_version 937284 (0.0044) [2024-06-25 17:59:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 15356542976. Throughput: 0: 42758.2. Samples: 15356645120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 17:59:58,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 18:00:00,845][15401] Updated weights for policy 0, policy_version 937294 (0.0032) [2024-06-25 18:00:03,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 15356739584. Throughput: 0: 42882.6. Samples: 15356905300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 18:00:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 18:00:04,001][15401] Updated weights for policy 0, policy_version 937304 (0.0029) [2024-06-25 18:00:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 15356936192. Throughput: 0: 42877.7. Samples: 15357033540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 18:00:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 18:00:08,406][15401] Updated weights for policy 0, policy_version 937314 (0.0040) [2024-06-25 18:00:12,001][15401] Updated weights for policy 0, policy_version 937324 (0.0026) [2024-06-25 18:00:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 15357181952. Throughput: 0: 42737.4. Samples: 15357284680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 18:00:13,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 18:00:15,788][15401] Updated weights for policy 0, policy_version 937334 (0.0049) [2024-06-25 18:00:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 15357378560. Throughput: 0: 42894.2. Samples: 15357549680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 18:00:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 18:00:18,466][15349] Signal inference workers to stop experience collection... (227300 times) [2024-06-25 18:00:18,503][15401] InferenceWorker_p0-w0: stopping experience collection (227300 times) [2024-06-25 18:00:18,513][15349] Signal inference workers to resume experience collection... (227300 times) [2024-06-25 18:00:18,523][15401] InferenceWorker_p0-w0: resuming experience collection (227300 times) [2024-06-25 18:00:19,835][15401] Updated weights for policy 0, policy_version 937344 (0.0032) [2024-06-25 18:00:23,230][15401] Updated weights for policy 0, policy_version 937354 (0.0023) [2024-06-25 18:00:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 15357607936. Throughput: 0: 42717.1. Samples: 15357671660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:23,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 18:00:27,239][15401] Updated weights for policy 0, policy_version 937364 (0.0032) [2024-06-25 18:00:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15357820928. Throughput: 0: 42841.5. Samples: 15357932060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:28,390][15132] Avg episode reward: [(0, '0.892')] [2024-06-25 18:00:30,718][15401] Updated weights for policy 0, policy_version 937374 (0.0034) [2024-06-25 18:00:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 15358017536. Throughput: 0: 43112.4. Samples: 15358203180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:33,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 18:00:34,551][15401] Updated weights for policy 0, policy_version 937384 (0.0040) [2024-06-25 18:00:38,096][15401] Updated weights for policy 0, policy_version 937394 (0.0038) [2024-06-25 18:00:38,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 43098.3). Total num frames: 15358279680. Throughput: 0: 43103.5. Samples: 15358325180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:38,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 18:00:42,581][15401] Updated weights for policy 0, policy_version 937404 (0.0029) [2024-06-25 18:00:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 43153.8). Total num frames: 15358476288. Throughput: 0: 43115.7. Samples: 15358585320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:43,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 18:00:45,750][15401] Updated weights for policy 0, policy_version 937414 (0.0027) [2024-06-25 18:00:48,396][15132] Fps is (10 sec: 40934.3, 60 sec: 43139.9, 300 sec: 43042.2). Total num frames: 15358689280. Throughput: 0: 43113.4. Samples: 15358845680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:48,396][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 18:00:50,281][15401] Updated weights for policy 0, policy_version 937424 (0.0025) [2024-06-25 18:00:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 43042.7). Total num frames: 15358902272. Throughput: 0: 43009.0. Samples: 15358968940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 18:00:53,492][15401] Updated weights for policy 0, policy_version 937434 (0.0032) [2024-06-25 18:00:57,726][15401] Updated weights for policy 0, policy_version 937444 (0.0032) [2024-06-25 18:00:58,389][15132] Fps is (10 sec: 44265.3, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 15359131648. Throughput: 0: 43282.7. Samples: 15359232400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:00:58,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 18:01:01,670][15401] Updated weights for policy 0, policy_version 937454 (0.0031) [2024-06-25 18:01:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15359311872. Throughput: 0: 43007.0. Samples: 15359485000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:03,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 18:01:05,467][15401] Updated weights for policy 0, policy_version 937464 (0.0043) [2024-06-25 18:01:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 15359557632. Throughput: 0: 43113.4. Samples: 15359611760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 18:01:09,019][15401] Updated weights for policy 0, policy_version 937474 (0.0029) [2024-06-25 18:01:13,165][15401] Updated weights for policy 0, policy_version 937484 (0.0035) [2024-06-25 18:01:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15359737856. Throughput: 0: 43208.4. Samples: 15359876440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:13,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 18:01:16,750][15401] Updated weights for policy 0, policy_version 937494 (0.0042) [2024-06-25 18:01:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15359967232. Throughput: 0: 42790.6. Samples: 15360128760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:18,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 18:01:20,832][15401] Updated weights for policy 0, policy_version 937504 (0.0027) [2024-06-25 18:01:23,390][15132] Fps is (10 sec: 45870.8, 60 sec: 43143.9, 300 sec: 42987.0). Total num frames: 15360196608. Throughput: 0: 42884.5. Samples: 15360255020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:23,391][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 18:01:24,389][15401] Updated weights for policy 0, policy_version 937514 (0.0046) [2024-06-25 18:01:26,419][15349] Signal inference workers to stop experience collection... (227350 times) [2024-06-25 18:01:26,419][15349] Signal inference workers to resume experience collection... (227350 times) [2024-06-25 18:01:26,457][15401] InferenceWorker_p0-w0: stopping experience collection (227350 times) [2024-06-25 18:01:26,457][15401] InferenceWorker_p0-w0: resuming experience collection (227350 times) [2024-06-25 18:01:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15360376832. Throughput: 0: 42803.1. Samples: 15360511460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 18:01:28,464][15401] Updated weights for policy 0, policy_version 937524 (0.0043) [2024-06-25 18:01:32,471][15401] Updated weights for policy 0, policy_version 937534 (0.0028) [2024-06-25 18:01:33,390][15132] Fps is (10 sec: 39325.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15360589824. Throughput: 0: 42775.8. Samples: 15360770320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:33,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 18:01:36,228][15401] Updated weights for policy 0, policy_version 937544 (0.0051) [2024-06-25 18:01:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15360835584. Throughput: 0: 42849.8. Samples: 15360897180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:38,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 18:01:39,939][15401] Updated weights for policy 0, policy_version 937554 (0.0028) [2024-06-25 18:01:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42876.4). Total num frames: 15361015808. Throughput: 0: 42515.5. Samples: 15361145600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:43,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 18:01:43,491][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937563_15361032192.pth... [2024-06-25 18:01:43,540][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000936934_15350726656.pth [2024-06-25 18:01:43,712][15401] Updated weights for policy 0, policy_version 937564 (0.0045) [2024-06-25 18:01:47,681][15401] Updated weights for policy 0, policy_version 937574 (0.0029) [2024-06-25 18:01:48,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42329.8, 300 sec: 42876.1). Total num frames: 15361228800. Throughput: 0: 42688.0. Samples: 15361405960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:48,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 18:01:51,276][15401] Updated weights for policy 0, policy_version 937584 (0.0040) [2024-06-25 18:01:53,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 15361458176. Throughput: 0: 42724.6. Samples: 15361534360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:53,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 18:01:55,212][15401] Updated weights for policy 0, policy_version 937594 (0.0038) [2024-06-25 18:01:58,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 15361654784. Throughput: 0: 42548.0. Samples: 15361791100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:01:58,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 18:01:58,938][15401] Updated weights for policy 0, policy_version 937604 (0.0033) [2024-06-25 18:02:02,875][15401] Updated weights for policy 0, policy_version 937614 (0.0035) [2024-06-25 18:02:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15361867776. Throughput: 0: 42659.0. Samples: 15362048420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:02:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 18:02:06,524][15401] Updated weights for policy 0, policy_version 937624 (0.0032) [2024-06-25 18:02:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15362097152. Throughput: 0: 42701.9. Samples: 15362176560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:02:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 18:02:10,511][15401] Updated weights for policy 0, policy_version 937634 (0.0038) [2024-06-25 18:02:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15362310144. Throughput: 0: 42794.3. Samples: 15362437200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 18:02:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 18:02:14,168][15401] Updated weights for policy 0, policy_version 937644 (0.0042) [2024-06-25 18:02:18,283][15401] Updated weights for policy 0, policy_version 937654 (0.0035) [2024-06-25 18:02:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15362523136. Throughput: 0: 42703.7. Samples: 15362691980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:18,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 18:02:21,946][15401] Updated weights for policy 0, policy_version 937664 (0.0025) [2024-06-25 18:02:23,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.9, 300 sec: 42820.8). Total num frames: 15362719744. Throughput: 0: 42734.5. Samples: 15362820240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 18:02:25,754][15401] Updated weights for policy 0, policy_version 937674 (0.0038) [2024-06-25 18:02:28,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 15362949120. Throughput: 0: 43000.4. Samples: 15363080620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 18:02:29,524][15401] Updated weights for policy 0, policy_version 937684 (0.0034) [2024-06-25 18:02:33,334][15401] Updated weights for policy 0, policy_version 937694 (0.0031) [2024-06-25 18:02:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15363178496. Throughput: 0: 42906.7. Samples: 15363336760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:33,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 18:02:36,974][15401] Updated weights for policy 0, policy_version 937704 (0.0032) [2024-06-25 18:02:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 15363375104. Throughput: 0: 43018.6. Samples: 15363470200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:38,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 18:02:41,115][15401] Updated weights for policy 0, policy_version 937714 (0.0040) [2024-06-25 18:02:43,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15363571712. Throughput: 0: 42832.5. Samples: 15363718560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:43,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 18:02:44,601][15401] Updated weights for policy 0, policy_version 937724 (0.0026) [2024-06-25 18:02:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15363817472. Throughput: 0: 42823.2. Samples: 15363975460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:48,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 18:02:48,643][15401] Updated weights for policy 0, policy_version 937734 (0.0036) [2024-06-25 18:02:52,575][15401] Updated weights for policy 0, policy_version 937744 (0.0031) [2024-06-25 18:02:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 15364014080. Throughput: 0: 42839.0. Samples: 15364104320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:53,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 18:02:56,437][15349] Signal inference workers to stop experience collection... (227400 times) [2024-06-25 18:02:56,467][15401] InferenceWorker_p0-w0: stopping experience collection (227400 times) [2024-06-25 18:02:56,490][15349] Signal inference workers to resume experience collection... (227400 times) [2024-06-25 18:02:56,496][15401] InferenceWorker_p0-w0: resuming experience collection (227400 times) [2024-06-25 18:02:56,500][15401] Updated weights for policy 0, policy_version 937754 (0.0036) [2024-06-25 18:02:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15364227072. Throughput: 0: 42650.2. Samples: 15364356460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:02:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 18:03:00,535][15401] Updated weights for policy 0, policy_version 937764 (0.0035) [2024-06-25 18:03:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15364440064. Throughput: 0: 42646.1. Samples: 15364611060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:03,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 18:03:04,347][15401] Updated weights for policy 0, policy_version 937774 (0.0024) [2024-06-25 18:03:08,013][15401] Updated weights for policy 0, policy_version 937784 (0.0037) [2024-06-25 18:03:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 15364653056. Throughput: 0: 42681.8. Samples: 15364740920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 18:03:12,120][15401] Updated weights for policy 0, policy_version 937794 (0.0034) [2024-06-25 18:03:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15364866048. Throughput: 0: 42507.5. Samples: 15364993460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:13,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 18:03:15,629][15401] Updated weights for policy 0, policy_version 937804 (0.0032) [2024-06-25 18:03:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15365079040. Throughput: 0: 42494.3. Samples: 15365249000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:18,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 18:03:19,455][15401] Updated weights for policy 0, policy_version 937814 (0.0028) [2024-06-25 18:03:23,190][15401] Updated weights for policy 0, policy_version 937824 (0.0031) [2024-06-25 18:03:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 15365324800. Throughput: 0: 42408.9. Samples: 15365378600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:23,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 18:03:27,443][15401] Updated weights for policy 0, policy_version 937834 (0.0040) [2024-06-25 18:03:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15365505024. Throughput: 0: 42465.4. Samples: 15365629500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:28,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 18:03:31,238][15401] Updated weights for policy 0, policy_version 937844 (0.0034) [2024-06-25 18:03:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15365734400. Throughput: 0: 42368.0. Samples: 15365882020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:33,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 18:03:35,056][15401] Updated weights for policy 0, policy_version 937854 (0.0037) [2024-06-25 18:03:38,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.2, 300 sec: 42821.0). Total num frames: 15365931008. Throughput: 0: 42507.8. Samples: 15366017180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:38,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 18:03:38,927][15401] Updated weights for policy 0, policy_version 937864 (0.0031) [2024-06-25 18:03:42,574][15401] Updated weights for policy 0, policy_version 937874 (0.0039) [2024-06-25 18:03:43,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15366127616. Throughput: 0: 42459.9. Samples: 15366267160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:43,390][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 18:03:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937874_15366127616.pth... [2024-06-25 18:03:43,452][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937250_15355904000.pth [2024-06-25 18:03:46,420][15401] Updated weights for policy 0, policy_version 937884 (0.0031) [2024-06-25 18:03:48,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15366373376. Throughput: 0: 42479.1. Samples: 15366522620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 18:03:50,398][15401] Updated weights for policy 0, policy_version 937894 (0.0028) [2024-06-25 18:03:53,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15366569984. Throughput: 0: 42650.2. Samples: 15366660180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:53,392][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 18:03:53,901][15401] Updated weights for policy 0, policy_version 937904 (0.0024) [2024-06-25 18:03:58,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15366766592. Throughput: 0: 42663.3. Samples: 15366913300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:03:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 18:03:58,408][15401] Updated weights for policy 0, policy_version 937914 (0.0026) [2024-06-25 18:04:01,986][15401] Updated weights for policy 0, policy_version 937924 (0.0033) [2024-06-25 18:04:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15367012352. Throughput: 0: 42519.0. Samples: 15367162360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:04:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 18:04:05,965][15401] Updated weights for policy 0, policy_version 937934 (0.0038) [2024-06-25 18:04:08,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15367225344. Throughput: 0: 42593.8. Samples: 15367295320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:04:08,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 18:04:09,742][15401] Updated weights for policy 0, policy_version 937944 (0.0045) [2024-06-25 18:04:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15367421952. Throughput: 0: 42593.3. Samples: 15367546200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:13,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 18:04:13,534][15401] Updated weights for policy 0, policy_version 937954 (0.0028) [2024-06-25 18:04:17,459][15401] Updated weights for policy 0, policy_version 937964 (0.0038) [2024-06-25 18:04:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15367651328. Throughput: 0: 42569.8. Samples: 15367797660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:18,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 18:04:21,105][15401] Updated weights for policy 0, policy_version 937974 (0.0032) [2024-06-25 18:04:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15367847936. Throughput: 0: 42508.3. Samples: 15367930040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:23,390][15132] Avg episode reward: [(0, '0.826')] [2024-06-25 18:04:24,936][15401] Updated weights for policy 0, policy_version 937984 (0.0033) [2024-06-25 18:04:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15368044544. Throughput: 0: 42608.2. Samples: 15368184520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:28,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 18:04:28,877][15401] Updated weights for policy 0, policy_version 937994 (0.0037) [2024-06-25 18:04:32,988][15349] Signal inference workers to stop experience collection... (227450 times) [2024-06-25 18:04:33,024][15401] InferenceWorker_p0-w0: stopping experience collection (227450 times) [2024-06-25 18:04:33,049][15349] Signal inference workers to resume experience collection... (227450 times) [2024-06-25 18:04:33,056][15401] InferenceWorker_p0-w0: resuming experience collection (227450 times) [2024-06-25 18:04:33,059][15401] Updated weights for policy 0, policy_version 938004 (0.0033) [2024-06-25 18:04:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15368290304. Throughput: 0: 42626.4. Samples: 15368440800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:33,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 18:04:36,870][15401] Updated weights for policy 0, policy_version 938014 (0.0041) [2024-06-25 18:04:38,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15368486912. Throughput: 0: 42350.2. Samples: 15368565940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:38,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 18:04:40,727][15401] Updated weights for policy 0, policy_version 938024 (0.0036) [2024-06-25 18:04:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15368699904. Throughput: 0: 42458.1. Samples: 15368823920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 18:04:44,422][15401] Updated weights for policy 0, policy_version 938034 (0.0024) [2024-06-25 18:04:48,263][15401] Updated weights for policy 0, policy_version 938044 (0.0038) [2024-06-25 18:04:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15368912896. Throughput: 0: 42546.0. Samples: 15369076920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 18:04:52,266][15401] Updated weights for policy 0, policy_version 938054 (0.0032) [2024-06-25 18:04:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15369125888. Throughput: 0: 42311.5. Samples: 15369199340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:53,390][15132] Avg episode reward: [(0, '0.252')] [2024-06-25 18:04:55,748][15401] Updated weights for policy 0, policy_version 938064 (0.0028) [2024-06-25 18:04:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15369338880. Throughput: 0: 42498.2. Samples: 15369458620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:04:58,392][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 18:04:59,808][15401] Updated weights for policy 0, policy_version 938074 (0.0035) [2024-06-25 18:05:03,390][15401] Updated weights for policy 0, policy_version 938084 (0.0038) [2024-06-25 18:05:03,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 15369568256. Throughput: 0: 42547.0. Samples: 15369712380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:03,392][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 18:05:07,582][15401] Updated weights for policy 0, policy_version 938094 (0.0032) [2024-06-25 18:05:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15369764864. Throughput: 0: 42522.5. Samples: 15369843560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 18:05:11,083][15401] Updated weights for policy 0, policy_version 938104 (0.0033) [2024-06-25 18:05:13,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15369961472. Throughput: 0: 42537.8. Samples: 15370098720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:13,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 18:05:15,686][15401] Updated weights for policy 0, policy_version 938114 (0.0033) [2024-06-25 18:05:18,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 15370174464. Throughput: 0: 42454.0. Samples: 15370351240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 18:05:18,756][15401] Updated weights for policy 0, policy_version 938124 (0.0021) [2024-06-25 18:05:23,114][15401] Updated weights for policy 0, policy_version 938134 (0.0033) [2024-06-25 18:05:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 15370403840. Throughput: 0: 42524.8. Samples: 15370479560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:23,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 18:05:26,328][15401] Updated weights for policy 0, policy_version 938144 (0.0049) [2024-06-25 18:05:28,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15370616832. Throughput: 0: 42436.4. Samples: 15370733560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:28,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 18:05:30,939][15401] Updated weights for policy 0, policy_version 938154 (0.0038) [2024-06-25 18:05:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15370829824. Throughput: 0: 42588.8. Samples: 15370993420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:33,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 18:05:34,011][15401] Updated weights for policy 0, policy_version 938164 (0.0037) [2024-06-25 18:05:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15371026432. Throughput: 0: 42698.2. Samples: 15371120760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:38,395][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:05:38,399][15401] Updated weights for policy 0, policy_version 938174 (0.0033) [2024-06-25 18:05:41,589][15401] Updated weights for policy 0, policy_version 938184 (0.0031) [2024-06-25 18:05:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42599.3). Total num frames: 15371255808. Throughput: 0: 42502.2. Samples: 15371371220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:43,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 18:05:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938187_15371255808.pth... [2024-06-25 18:05:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937563_15361032192.pth [2024-06-25 18:05:46,105][15401] Updated weights for policy 0, policy_version 938194 (0.0029) [2024-06-25 18:05:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15371468800. Throughput: 0: 42669.4. Samples: 15371632400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 18:05:49,413][15401] Updated weights for policy 0, policy_version 938204 (0.0022) [2024-06-25 18:05:53,221][15349] Signal inference workers to stop experience collection... (227500 times) [2024-06-25 18:05:53,246][15401] InferenceWorker_p0-w0: stopping experience collection (227500 times) [2024-06-25 18:05:53,283][15349] Signal inference workers to resume experience collection... (227500 times) [2024-06-25 18:05:53,283][15401] InferenceWorker_p0-w0: resuming experience collection (227500 times) [2024-06-25 18:05:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15371665408. Throughput: 0: 42675.2. Samples: 15371763940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:53,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 18:05:53,641][15401] Updated weights for policy 0, policy_version 938214 (0.0039) [2024-06-25 18:05:57,111][15401] Updated weights for policy 0, policy_version 938224 (0.0045) [2024-06-25 18:05:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15371894784. Throughput: 0: 42652.9. Samples: 15372018100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:05:58,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 18:06:01,234][15401] Updated weights for policy 0, policy_version 938234 (0.0040) [2024-06-25 18:06:03,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 15372124160. Throughput: 0: 42746.8. Samples: 15372274840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:06:03,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 18:06:04,625][15401] Updated weights for policy 0, policy_version 938244 (0.0028) [2024-06-25 18:06:08,392][15132] Fps is (10 sec: 40949.6, 60 sec: 42323.7, 300 sec: 42598.0). Total num frames: 15372304384. Throughput: 0: 42865.8. Samples: 15372408620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:06:08,393][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 18:06:08,833][15401] Updated weights for policy 0, policy_version 938254 (0.0041) [2024-06-25 18:06:12,171][15401] Updated weights for policy 0, policy_version 938264 (0.0036) [2024-06-25 18:06:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15372550144. Throughput: 0: 42778.7. Samples: 15372658600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:13,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 18:06:16,727][15401] Updated weights for policy 0, policy_version 938274 (0.0035) [2024-06-25 18:06:18,389][15132] Fps is (10 sec: 44248.0, 60 sec: 42871.6, 300 sec: 42543.0). Total num frames: 15372746752. Throughput: 0: 42893.4. Samples: 15372923620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:18,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 18:06:19,943][15401] Updated weights for policy 0, policy_version 938284 (0.0033) [2024-06-25 18:06:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15372959744. Throughput: 0: 42832.4. Samples: 15373048220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:06:24,390][15401] Updated weights for policy 0, policy_version 938294 (0.0027) [2024-06-25 18:06:27,651][15401] Updated weights for policy 0, policy_version 938304 (0.0044) [2024-06-25 18:06:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15373189120. Throughput: 0: 42772.9. Samples: 15373296000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:28,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 18:06:32,000][15401] Updated weights for policy 0, policy_version 938314 (0.0032) [2024-06-25 18:06:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15373385728. Throughput: 0: 42837.3. Samples: 15373560080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:33,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 18:06:35,426][15401] Updated weights for policy 0, policy_version 938324 (0.0034) [2024-06-25 18:06:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15373582336. Throughput: 0: 42689.3. Samples: 15373684960. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 18:06:39,559][15401] Updated weights for policy 0, policy_version 938334 (0.0034) [2024-06-25 18:06:42,882][15401] Updated weights for policy 0, policy_version 938344 (0.0034) [2024-06-25 18:06:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15373828096. Throughput: 0: 42620.9. Samples: 15373936040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 18:06:47,222][15401] Updated weights for policy 0, policy_version 938354 (0.0023) [2024-06-25 18:06:48,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15374024704. Throughput: 0: 42780.5. Samples: 15374199960. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:48,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 18:06:50,400][15401] Updated weights for policy 0, policy_version 938364 (0.0026) [2024-06-25 18:06:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15374221312. Throughput: 0: 42485.8. Samples: 15374320380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:53,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 18:06:55,422][15401] Updated weights for policy 0, policy_version 938374 (0.0028) [2024-06-25 18:06:58,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 15374467072. Throughput: 0: 42642.8. Samples: 15374577800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:06:58,397][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 18:06:58,549][15401] Updated weights for policy 0, policy_version 938384 (0.0057) [2024-06-25 18:07:02,898][15401] Updated weights for policy 0, policy_version 938394 (0.0023) [2024-06-25 18:07:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15374663680. Throughput: 0: 42503.1. Samples: 15374836260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 18:07:06,091][15401] Updated weights for policy 0, policy_version 938404 (0.0048) [2024-06-25 18:07:08,389][15132] Fps is (10 sec: 39347.0, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 15374860288. Throughput: 0: 42432.1. Samples: 15374957660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 18:07:10,781][15401] Updated weights for policy 0, policy_version 938414 (0.0049) [2024-06-25 18:07:12,908][15349] Signal inference workers to stop experience collection... (227550 times) [2024-06-25 18:07:12,908][15349] Signal inference workers to resume experience collection... (227550 times) [2024-06-25 18:07:12,951][15401] InferenceWorker_p0-w0: stopping experience collection (227550 times) [2024-06-25 18:07:12,951][15401] InferenceWorker_p0-w0: resuming experience collection (227550 times) [2024-06-25 18:07:13,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15375122432. Throughput: 0: 42721.8. Samples: 15375218480. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:13,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 18:07:14,279][15401] Updated weights for policy 0, policy_version 938424 (0.0028) [2024-06-25 18:07:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15375286272. Throughput: 0: 42503.2. Samples: 15375472720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:18,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 18:07:18,433][15401] Updated weights for policy 0, policy_version 938434 (0.0051) [2024-06-25 18:07:22,109][15401] Updated weights for policy 0, policy_version 938444 (0.0041) [2024-06-25 18:07:23,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15375499264. Throughput: 0: 42318.6. Samples: 15375589300. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:23,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 18:07:25,887][15401] Updated weights for policy 0, policy_version 938454 (0.0029) [2024-06-25 18:07:28,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15375761408. Throughput: 0: 42614.2. Samples: 15375853680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:28,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 18:07:29,696][15401] Updated weights for policy 0, policy_version 938464 (0.0033) [2024-06-25 18:07:33,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15375941632. Throughput: 0: 42415.0. Samples: 15376108640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:33,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 18:07:33,646][15401] Updated weights for policy 0, policy_version 938474 (0.0030) [2024-06-25 18:07:37,714][15401] Updated weights for policy 0, policy_version 938484 (0.0042) [2024-06-25 18:07:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15376154624. Throughput: 0: 42485.9. Samples: 15376232240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 18:07:41,099][15401] Updated weights for policy 0, policy_version 938494 (0.0040) [2024-06-25 18:07:43,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15376384000. Throughput: 0: 42549.6. Samples: 15376492260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 18:07:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938500_15376384000.pth... [2024-06-25 18:07:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000937874_15366127616.pth [2024-06-25 18:07:45,370][15401] Updated weights for policy 0, policy_version 938504 (0.0039) [2024-06-25 18:07:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15376580608. Throughput: 0: 42391.4. Samples: 15376743880. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:48,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 18:07:49,086][15401] Updated weights for policy 0, policy_version 938514 (0.0042) [2024-06-25 18:07:53,021][15401] Updated weights for policy 0, policy_version 938524 (0.0039) [2024-06-25 18:07:53,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15376793600. Throughput: 0: 42476.8. Samples: 15376869120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 18:07:56,656][15401] Updated weights for policy 0, policy_version 938534 (0.0026) [2024-06-25 18:07:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42603.0, 300 sec: 42654.0). Total num frames: 15377022976. Throughput: 0: 42575.6. Samples: 15377134380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:07:58,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 18:08:00,687][15401] Updated weights for policy 0, policy_version 938544 (0.0044) [2024-06-25 18:08:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15377219584. Throughput: 0: 42491.5. Samples: 15377384840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-25 18:08:03,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 18:08:04,357][15401] Updated weights for policy 0, policy_version 938554 (0.0037) [2024-06-25 18:08:08,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15377416192. Throughput: 0: 42760.5. Samples: 15377513520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 18:08:08,491][15401] Updated weights for policy 0, policy_version 938564 (0.0031) [2024-06-25 18:08:11,943][15401] Updated weights for policy 0, policy_version 938574 (0.0040) [2024-06-25 18:08:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15377661952. Throughput: 0: 42617.3. Samples: 15377771460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:13,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 18:08:16,142][15401] Updated weights for policy 0, policy_version 938584 (0.0035) [2024-06-25 18:08:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 15377858560. Throughput: 0: 42549.0. Samples: 15378023340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:18,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 18:08:20,120][15401] Updated weights for policy 0, policy_version 938594 (0.0025) [2024-06-25 18:08:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15378055168. Throughput: 0: 42642.2. Samples: 15378151140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:23,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 18:08:23,968][15401] Updated weights for policy 0, policy_version 938604 (0.0032) [2024-06-25 18:08:27,688][15401] Updated weights for policy 0, policy_version 938614 (0.0026) [2024-06-25 18:08:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15378300928. Throughput: 0: 42555.6. Samples: 15378407260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 18:08:31,418][15401] Updated weights for policy 0, policy_version 938624 (0.0037) [2024-06-25 18:08:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15378497536. Throughput: 0: 42757.5. Samples: 15378667960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:33,390][15132] Avg episode reward: [(0, '0.259')] [2024-06-25 18:08:35,098][15401] Updated weights for policy 0, policy_version 938634 (0.0035) [2024-06-25 18:08:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15378694144. Throughput: 0: 42730.3. Samples: 15378791980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:38,390][15132] Avg episode reward: [(0, '0.280')] [2024-06-25 18:08:39,455][15401] Updated weights for policy 0, policy_version 938644 (0.0042) [2024-06-25 18:08:42,773][15401] Updated weights for policy 0, policy_version 938654 (0.0024) [2024-06-25 18:08:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15378939904. Throughput: 0: 42636.4. Samples: 15379053020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:43,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 18:08:47,039][15349] Signal inference workers to stop experience collection... (227600 times) [2024-06-25 18:08:47,085][15401] InferenceWorker_p0-w0: stopping experience collection (227600 times) [2024-06-25 18:08:47,093][15349] Signal inference workers to resume experience collection... (227600 times) [2024-06-25 18:08:47,101][15401] InferenceWorker_p0-w0: resuming experience collection (227600 times) [2024-06-25 18:08:47,108][15401] Updated weights for policy 0, policy_version 938664 (0.0031) [2024-06-25 18:08:48,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15379136512. Throughput: 0: 42720.5. Samples: 15379307260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 18:08:50,298][15401] Updated weights for policy 0, policy_version 938674 (0.0035) [2024-06-25 18:08:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15379365888. Throughput: 0: 42643.1. Samples: 15379432460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:53,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 18:08:54,560][15401] Updated weights for policy 0, policy_version 938684 (0.0029) [2024-06-25 18:08:57,921][15401] Updated weights for policy 0, policy_version 938694 (0.0036) [2024-06-25 18:08:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15379562496. Throughput: 0: 42640.9. Samples: 15379690300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:08:58,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 18:09:02,111][15401] Updated weights for policy 0, policy_version 938704 (0.0036) [2024-06-25 18:09:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15379775488. Throughput: 0: 42662.9. Samples: 15379943180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:09:05,663][15401] Updated weights for policy 0, policy_version 938714 (0.0041) [2024-06-25 18:09:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15379988480. Throughput: 0: 42661.8. Samples: 15380070920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 18:09:09,786][15401] Updated weights for policy 0, policy_version 938724 (0.0029) [2024-06-25 18:09:13,193][15401] Updated weights for policy 0, policy_version 938734 (0.0032) [2024-06-25 18:09:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15380217856. Throughput: 0: 42754.6. Samples: 15380331220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 18:09:17,444][15401] Updated weights for policy 0, policy_version 938744 (0.0033) [2024-06-25 18:09:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15380414464. Throughput: 0: 42680.3. Samples: 15380588580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:18,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 18:09:20,842][15401] Updated weights for policy 0, policy_version 938754 (0.0042) [2024-06-25 18:09:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15380643840. Throughput: 0: 42764.0. Samples: 15380716360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:23,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 18:09:25,303][15401] Updated weights for policy 0, policy_version 938764 (0.0033) [2024-06-25 18:09:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15380856832. Throughput: 0: 42696.1. Samples: 15380974340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 18:09:28,542][15401] Updated weights for policy 0, policy_version 938774 (0.0028) [2024-06-25 18:09:32,749][15401] Updated weights for policy 0, policy_version 938784 (0.0030) [2024-06-25 18:09:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15381053440. Throughput: 0: 42717.7. Samples: 15381229560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 18:09:36,352][15401] Updated weights for policy 0, policy_version 938794 (0.0035) [2024-06-25 18:09:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15381282816. Throughput: 0: 42716.7. Samples: 15381354720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:38,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 18:09:40,694][15401] Updated weights for policy 0, policy_version 938804 (0.0039) [2024-06-25 18:09:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15381495808. Throughput: 0: 42699.7. Samples: 15381611780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:43,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 18:09:43,482][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938813_15381512192.pth... [2024-06-25 18:09:43,528][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938187_15371255808.pth [2024-06-25 18:09:44,089][15401] Updated weights for policy 0, policy_version 938814 (0.0032) [2024-06-25 18:09:48,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15381676032. Throughput: 0: 42819.7. Samples: 15381870060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:48,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 18:09:48,436][15401] Updated weights for policy 0, policy_version 938824 (0.0038) [2024-06-25 18:09:52,113][15401] Updated weights for policy 0, policy_version 938834 (0.0038) [2024-06-25 18:09:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15381921792. Throughput: 0: 42720.4. Samples: 15381993340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:53,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 18:09:56,262][15401] Updated weights for policy 0, policy_version 938844 (0.0043) [2024-06-25 18:09:58,396][15132] Fps is (10 sec: 45845.8, 60 sec: 42867.0, 300 sec: 42597.8). Total num frames: 15382134784. Throughput: 0: 42793.5. Samples: 15382257200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 18:09:58,396][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 18:09:59,528][15401] Updated weights for policy 0, policy_version 938854 (0.0031) [2024-06-25 18:10:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15382331392. Throughput: 0: 42770.2. Samples: 15382513240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:03,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 18:10:03,598][15401] Updated weights for policy 0, policy_version 938864 (0.0031) [2024-06-25 18:10:06,958][15401] Updated weights for policy 0, policy_version 938874 (0.0026) [2024-06-25 18:10:08,389][15132] Fps is (10 sec: 42625.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15382560768. Throughput: 0: 42741.7. Samples: 15382639740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:08,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:10:11,077][15401] Updated weights for policy 0, policy_version 938884 (0.0044) [2024-06-25 18:10:12,821][15349] Signal inference workers to stop experience collection... (227650 times) [2024-06-25 18:10:12,825][15349] Signal inference workers to resume experience collection... (227650 times) [2024-06-25 18:10:12,849][15401] InferenceWorker_p0-w0: stopping experience collection (227650 times) [2024-06-25 18:10:12,849][15401] InferenceWorker_p0-w0: resuming experience collection (227650 times) [2024-06-25 18:10:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15382773760. Throughput: 0: 42864.0. Samples: 15382903220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:13,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 18:10:14,432][15401] Updated weights for policy 0, policy_version 938894 (0.0049) [2024-06-25 18:10:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15382986752. Throughput: 0: 42730.6. Samples: 15383152440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:18,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 18:10:18,573][15401] Updated weights for policy 0, policy_version 938904 (0.0045) [2024-06-25 18:10:22,158][15401] Updated weights for policy 0, policy_version 938914 (0.0032) [2024-06-25 18:10:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15383199744. Throughput: 0: 42912.2. Samples: 15383285760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:23,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 18:10:26,245][15401] Updated weights for policy 0, policy_version 938924 (0.0030) [2024-06-25 18:10:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15383396352. Throughput: 0: 42871.1. Samples: 15383540980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:28,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 18:10:29,694][15401] Updated weights for policy 0, policy_version 938934 (0.0031) [2024-06-25 18:10:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15383642112. Throughput: 0: 42630.5. Samples: 15383788440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:33,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 18:10:33,640][15401] Updated weights for policy 0, policy_version 938944 (0.0039) [2024-06-25 18:10:37,784][15401] Updated weights for policy 0, policy_version 938954 (0.0036) [2024-06-25 18:10:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15383838720. Throughput: 0: 42886.2. Samples: 15383923220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:38,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 18:10:41,188][15401] Updated weights for policy 0, policy_version 938964 (0.0028) [2024-06-25 18:10:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15384068096. Throughput: 0: 42790.0. Samples: 15384182480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 18:10:45,328][15401] Updated weights for policy 0, policy_version 938974 (0.0041) [2024-06-25 18:10:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 15384297472. Throughput: 0: 42670.2. Samples: 15384433500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:48,392][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 18:10:49,674][15401] Updated weights for policy 0, policy_version 938984 (0.0038) [2024-06-25 18:10:53,211][15401] Updated weights for policy 0, policy_version 938994 (0.0045) [2024-06-25 18:10:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15384477696. Throughput: 0: 42664.7. Samples: 15384559660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:53,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 18:10:57,201][15401] Updated weights for policy 0, policy_version 939004 (0.0036) [2024-06-25 18:10:58,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 15384690688. Throughput: 0: 42569.8. Samples: 15384818860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:10:58,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:11:01,250][15401] Updated weights for policy 0, policy_version 939014 (0.0035) [2024-06-25 18:11:03,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 15384920064. Throughput: 0: 42520.6. Samples: 15385065860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:03,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 18:11:04,755][15401] Updated weights for policy 0, policy_version 939024 (0.0027) [2024-06-25 18:11:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15385116672. Throughput: 0: 42558.1. Samples: 15385200880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:08,390][15132] Avg episode reward: [(0, '0.800')] [2024-06-25 18:11:08,792][15401] Updated weights for policy 0, policy_version 939034 (0.0039) [2024-06-25 18:11:12,945][15401] Updated weights for policy 0, policy_version 939044 (0.0034) [2024-06-25 18:11:13,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15385313280. Throughput: 0: 42383.9. Samples: 15385448260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 18:11:16,291][15401] Updated weights for policy 0, policy_version 939054 (0.0031) [2024-06-25 18:11:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15385559040. Throughput: 0: 42649.5. Samples: 15385707660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:18,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 18:11:20,604][15401] Updated weights for policy 0, policy_version 939064 (0.0033) [2024-06-25 18:11:21,958][15349] Signal inference workers to stop experience collection... (227700 times) [2024-06-25 18:11:21,966][15349] Signal inference workers to resume experience collection... (227700 times) [2024-06-25 18:11:22,001][15401] InferenceWorker_p0-w0: stopping experience collection (227700 times) [2024-06-25 18:11:22,001][15401] InferenceWorker_p0-w0: resuming experience collection (227700 times) [2024-06-25 18:11:23,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15385755648. Throughput: 0: 42499.0. Samples: 15385835680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:23,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 18:11:24,397][15401] Updated weights for policy 0, policy_version 939074 (0.0048) [2024-06-25 18:11:28,169][15401] Updated weights for policy 0, policy_version 939084 (0.0048) [2024-06-25 18:11:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15385968640. Throughput: 0: 42201.4. Samples: 15386081540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:28,390][15132] Avg episode reward: [(0, '0.031')] [2024-06-25 18:11:32,098][15401] Updated weights for policy 0, policy_version 939094 (0.0040) [2024-06-25 18:11:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 15386165248. Throughput: 0: 42444.5. Samples: 15386343400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 18:11:35,669][15401] Updated weights for policy 0, policy_version 939104 (0.0030) [2024-06-25 18:11:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15386394624. Throughput: 0: 42379.3. Samples: 15386466720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:11:39,793][15401] Updated weights for policy 0, policy_version 939114 (0.0048) [2024-06-25 18:11:43,236][15401] Updated weights for policy 0, policy_version 939124 (0.0035) [2024-06-25 18:11:43,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15386607616. Throughput: 0: 42276.8. Samples: 15386721320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:43,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 18:11:43,425][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939124_15386607616.pth... [2024-06-25 18:11:43,496][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938500_15376384000.pth [2024-06-25 18:11:47,406][15401] Updated weights for policy 0, policy_version 939134 (0.0034) [2024-06-25 18:11:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 41506.1, 300 sec: 42598.1). Total num frames: 15386787840. Throughput: 0: 42531.0. Samples: 15386979860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:48,392][15132] Avg episode reward: [(0, '0.292')] [2024-06-25 18:11:50,763][15401] Updated weights for policy 0, policy_version 939144 (0.0028) [2024-06-25 18:11:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42599.3). Total num frames: 15387033600. Throughput: 0: 42304.1. Samples: 15387104560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:11:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 18:11:55,023][15401] Updated weights for policy 0, policy_version 939154 (0.0029) [2024-06-25 18:11:58,390][15132] Fps is (10 sec: 45885.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15387246592. Throughput: 0: 42577.2. Samples: 15387364240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:11:58,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 18:11:58,643][15401] Updated weights for policy 0, policy_version 939164 (0.0041) [2024-06-25 18:12:02,543][15401] Updated weights for policy 0, policy_version 939174 (0.0025) [2024-06-25 18:12:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 15387443200. Throughput: 0: 42563.8. Samples: 15387623040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 18:12:06,306][15401] Updated weights for policy 0, policy_version 939184 (0.0034) [2024-06-25 18:12:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15387688960. Throughput: 0: 42563.9. Samples: 15387751060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 18:12:10,245][15401] Updated weights for policy 0, policy_version 939194 (0.0037) [2024-06-25 18:12:13,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15387852800. Throughput: 0: 42759.1. Samples: 15388005700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:13,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 18:12:14,006][15401] Updated weights for policy 0, policy_version 939204 (0.0026) [2024-06-25 18:12:17,844][15401] Updated weights for policy 0, policy_version 939214 (0.0025) [2024-06-25 18:12:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15388098560. Throughput: 0: 42551.8. Samples: 15388258240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 18:12:21,832][15401] Updated weights for policy 0, policy_version 939224 (0.0035) [2024-06-25 18:12:23,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15388311552. Throughput: 0: 42806.1. Samples: 15388393000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 18:12:25,421][15401] Updated weights for policy 0, policy_version 939234 (0.0031) [2024-06-25 18:12:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15388491776. Throughput: 0: 42782.3. Samples: 15388646520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 18:12:29,747][15401] Updated weights for policy 0, policy_version 939244 (0.0037) [2024-06-25 18:12:32,895][15401] Updated weights for policy 0, policy_version 939254 (0.0037) [2024-06-25 18:12:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15388737536. Throughput: 0: 42589.3. Samples: 15388896280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 18:12:37,587][15401] Updated weights for policy 0, policy_version 939264 (0.0034) [2024-06-25 18:12:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15388966912. Throughput: 0: 42824.4. Samples: 15389031660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:38,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 18:12:40,848][15401] Updated weights for policy 0, policy_version 939274 (0.0031) [2024-06-25 18:12:43,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15389147136. Throughput: 0: 42774.4. Samples: 15389289080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:43,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 18:12:45,083][15401] Updated weights for policy 0, policy_version 939284 (0.0030) [2024-06-25 18:12:48,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43419.2, 300 sec: 42709.5). Total num frames: 15389392896. Throughput: 0: 42545.8. Samples: 15389537600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:12:48,387][15401] Updated weights for policy 0, policy_version 939294 (0.0033) [2024-06-25 18:12:52,449][15349] Signal inference workers to stop experience collection... (227750 times) [2024-06-25 18:12:52,449][15349] Signal inference workers to resume experience collection... (227750 times) [2024-06-25 18:12:52,472][15401] InferenceWorker_p0-w0: stopping experience collection (227750 times) [2024-06-25 18:12:52,473][15401] InferenceWorker_p0-w0: resuming experience collection (227750 times) [2024-06-25 18:12:52,851][15401] Updated weights for policy 0, policy_version 939304 (0.0037) [2024-06-25 18:12:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 15389573120. Throughput: 0: 42537.4. Samples: 15389665240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 18:12:56,185][15401] Updated weights for policy 0, policy_version 939314 (0.0032) [2024-06-25 18:12:58,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15389786112. Throughput: 0: 42412.9. Samples: 15389914280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:12:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 18:13:00,599][15401] Updated weights for policy 0, policy_version 939324 (0.0026) [2024-06-25 18:13:03,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15390031872. Throughput: 0: 42449.9. Samples: 15390168480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 18:13:03,822][15401] Updated weights for policy 0, policy_version 939334 (0.0028) [2024-06-25 18:13:08,275][15401] Updated weights for policy 0, policy_version 939344 (0.0041) [2024-06-25 18:13:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15390212096. Throughput: 0: 42420.5. Samples: 15390301920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:08,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 18:13:11,533][15401] Updated weights for policy 0, policy_version 939354 (0.0036) [2024-06-25 18:13:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15390425088. Throughput: 0: 42275.5. Samples: 15390548920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 18:13:16,137][15401] Updated weights for policy 0, policy_version 939364 (0.0033) [2024-06-25 18:13:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15390654464. Throughput: 0: 42457.9. Samples: 15390806880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:18,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 18:13:19,115][15401] Updated weights for policy 0, policy_version 939374 (0.0032) [2024-06-25 18:13:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15390851072. Throughput: 0: 42514.3. Samples: 15390944800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 18:13:23,670][15401] Updated weights for policy 0, policy_version 939384 (0.0042) [2024-06-25 18:13:26,787][15401] Updated weights for policy 0, policy_version 939394 (0.0034) [2024-06-25 18:13:28,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15391064064. Throughput: 0: 42299.4. Samples: 15391192560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 18:13:31,208][15401] Updated weights for policy 0, policy_version 939404 (0.0028) [2024-06-25 18:13:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15391293440. Throughput: 0: 42558.2. Samples: 15391452720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:33,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 18:13:34,444][15401] Updated weights for policy 0, policy_version 939414 (0.0026) [2024-06-25 18:13:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15391490048. Throughput: 0: 42594.7. Samples: 15391582000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:38,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 18:13:38,702][15401] Updated weights for policy 0, policy_version 939424 (0.0041) [2024-06-25 18:13:42,023][15401] Updated weights for policy 0, policy_version 939434 (0.0025) [2024-06-25 18:13:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15391719424. Throughput: 0: 42676.9. Samples: 15391834740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:43,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 18:13:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939436_15391719424.pth... [2024-06-25 18:13:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000938813_15381512192.pth [2024-06-25 18:13:46,281][15401] Updated weights for policy 0, policy_version 939444 (0.0046) [2024-06-25 18:13:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15391932416. Throughput: 0: 42852.9. Samples: 15392096860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:48,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 18:13:49,591][15401] Updated weights for policy 0, policy_version 939454 (0.0036) [2024-06-25 18:13:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 15392129024. Throughput: 0: 42684.0. Samples: 15392222800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-25 18:13:53,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 18:13:54,163][15401] Updated weights for policy 0, policy_version 939464 (0.0028) [2024-06-25 18:13:57,320][15401] Updated weights for policy 0, policy_version 939474 (0.0037) [2024-06-25 18:13:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15392358400. Throughput: 0: 42892.9. Samples: 15392479100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:13:58,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 18:14:01,613][15401] Updated weights for policy 0, policy_version 939484 (0.0042) [2024-06-25 18:14:03,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15392571392. Throughput: 0: 42887.5. Samples: 15392736820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:03,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 18:14:05,159][15401] Updated weights for policy 0, policy_version 939494 (0.0037) [2024-06-25 18:14:08,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15392768000. Throughput: 0: 42736.0. Samples: 15392867920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 18:14:08,853][15349] Signal inference workers to stop experience collection... (227800 times) [2024-06-25 18:14:08,900][15401] InferenceWorker_p0-w0: stopping experience collection (227800 times) [2024-06-25 18:14:08,903][15349] Signal inference workers to resume experience collection... (227800 times) [2024-06-25 18:14:08,919][15401] InferenceWorker_p0-w0: resuming experience collection (227800 times) [2024-06-25 18:14:09,278][15401] Updated weights for policy 0, policy_version 939504 (0.0027) [2024-06-25 18:14:12,733][15401] Updated weights for policy 0, policy_version 939514 (0.0038) [2024-06-25 18:14:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15393013760. Throughput: 0: 42895.6. Samples: 15393122860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:13,390][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 18:14:17,030][15401] Updated weights for policy 0, policy_version 939524 (0.0035) [2024-06-25 18:14:18,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 15393210368. Throughput: 0: 42826.2. Samples: 15393379900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 18:14:20,176][15401] Updated weights for policy 0, policy_version 939534 (0.0043) [2024-06-25 18:14:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15393406976. Throughput: 0: 42657.8. Samples: 15393501600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:23,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 18:14:24,681][15401] Updated weights for policy 0, policy_version 939544 (0.0034) [2024-06-25 18:14:27,874][15401] Updated weights for policy 0, policy_version 939554 (0.0034) [2024-06-25 18:14:28,392][15132] Fps is (10 sec: 44227.0, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 15393652736. Throughput: 0: 42776.0. Samples: 15393759760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:28,392][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 18:14:32,207][15401] Updated weights for policy 0, policy_version 939564 (0.0043) [2024-06-25 18:14:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15393849344. Throughput: 0: 42770.7. Samples: 15394021540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:33,391][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 18:14:36,189][15401] Updated weights for policy 0, policy_version 939574 (0.0031) [2024-06-25 18:14:38,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15394045952. Throughput: 0: 42768.1. Samples: 15394147260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 18:14:40,179][15401] Updated weights for policy 0, policy_version 939584 (0.0045) [2024-06-25 18:14:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15394291712. Throughput: 0: 42828.7. Samples: 15394406400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 18:14:43,624][15401] Updated weights for policy 0, policy_version 939594 (0.0029) [2024-06-25 18:14:47,594][15401] Updated weights for policy 0, policy_version 939604 (0.0031) [2024-06-25 18:14:48,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15394504704. Throughput: 0: 42757.4. Samples: 15394660900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 18:14:51,446][15401] Updated weights for policy 0, policy_version 939614 (0.0027) [2024-06-25 18:14:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42599.3). Total num frames: 15394701312. Throughput: 0: 42766.6. Samples: 15394792420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:53,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 18:14:55,008][15401] Updated weights for policy 0, policy_version 939624 (0.0034) [2024-06-25 18:14:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15394930688. Throughput: 0: 42986.3. Samples: 15395057240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:14:58,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 18:14:58,841][15401] Updated weights for policy 0, policy_version 939634 (0.0029) [2024-06-25 18:15:02,807][15401] Updated weights for policy 0, policy_version 939644 (0.0027) [2024-06-25 18:15:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15395143680. Throughput: 0: 43013.1. Samples: 15395315480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:03,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 18:15:06,253][15401] Updated weights for policy 0, policy_version 939654 (0.0043) [2024-06-25 18:15:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15395356672. Throughput: 0: 43139.1. Samples: 15395442860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 18:15:10,334][15401] Updated weights for policy 0, policy_version 939664 (0.0032) [2024-06-25 18:15:13,390][15132] Fps is (10 sec: 44232.1, 60 sec: 42870.8, 300 sec: 42709.4). Total num frames: 15395586048. Throughput: 0: 43172.9. Samples: 15395702480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:13,391][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 18:15:14,087][15401] Updated weights for policy 0, policy_version 939674 (0.0028) [2024-06-25 18:15:17,912][15401] Updated weights for policy 0, policy_version 939684 (0.0031) [2024-06-25 18:15:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15395799040. Throughput: 0: 43063.1. Samples: 15395959380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 18:15:21,623][15401] Updated weights for policy 0, policy_version 939694 (0.0027) [2024-06-25 18:15:23,389][15132] Fps is (10 sec: 42602.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15396012032. Throughput: 0: 43099.1. Samples: 15396086720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:23,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 18:15:25,690][15401] Updated weights for policy 0, policy_version 939704 (0.0036) [2024-06-25 18:15:28,394][15132] Fps is (10 sec: 42577.4, 60 sec: 42869.6, 300 sec: 42653.2). Total num frames: 15396225024. Throughput: 0: 42982.5. Samples: 15396340820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:28,395][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 18:15:29,212][15401] Updated weights for policy 0, policy_version 939714 (0.0031) [2024-06-25 18:15:33,102][15401] Updated weights for policy 0, policy_version 939724 (0.0035) [2024-06-25 18:15:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15396438016. Throughput: 0: 43127.0. Samples: 15396601620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 18:15:36,770][15401] Updated weights for policy 0, policy_version 939734 (0.0032) [2024-06-25 18:15:36,776][15349] Signal inference workers to stop experience collection... (227850 times) [2024-06-25 18:15:36,776][15349] Signal inference workers to resume experience collection... (227850 times) [2024-06-25 18:15:36,816][15401] InferenceWorker_p0-w0: stopping experience collection (227850 times) [2024-06-25 18:15:36,816][15401] InferenceWorker_p0-w0: resuming experience collection (227850 times) [2024-06-25 18:15:38,389][15132] Fps is (10 sec: 42619.7, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 15396651008. Throughput: 0: 43085.4. Samples: 15396731260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:38,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 18:15:40,692][15401] Updated weights for policy 0, policy_version 939744 (0.0028) [2024-06-25 18:15:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 15396864000. Throughput: 0: 42818.7. Samples: 15396984080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:15:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939750_15396864000.pth... [2024-06-25 18:15:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939124_15386607616.pth [2024-06-25 18:15:44,939][15401] Updated weights for policy 0, policy_version 939754 (0.0031) [2024-06-25 18:15:48,294][15401] Updated weights for policy 0, policy_version 939764 (0.0032) [2024-06-25 18:15:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 43144.3, 300 sec: 42765.0). Total num frames: 15397093376. Throughput: 0: 42769.1. Samples: 15397240100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-25 18:15:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 18:15:52,565][15401] Updated weights for policy 0, policy_version 939774 (0.0041) [2024-06-25 18:15:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15397289984. Throughput: 0: 42891.2. Samples: 15397372960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:15:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 18:15:56,074][15401] Updated weights for policy 0, policy_version 939784 (0.0029) [2024-06-25 18:15:58,389][15132] Fps is (10 sec: 39322.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15397486592. Throughput: 0: 42725.4. Samples: 15397625080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:15:58,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 18:16:00,504][15401] Updated weights for policy 0, policy_version 939794 (0.0023) [2024-06-25 18:16:03,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15397732352. Throughput: 0: 42633.7. Samples: 15397877900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 18:16:03,867][15401] Updated weights for policy 0, policy_version 939804 (0.0045) [2024-06-25 18:16:08,201][15401] Updated weights for policy 0, policy_version 939814 (0.0036) [2024-06-25 18:16:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15397928960. Throughput: 0: 42716.0. Samples: 15398008940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:08,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 18:16:11,922][15401] Updated weights for policy 0, policy_version 939824 (0.0034) [2024-06-25 18:16:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42326.0, 300 sec: 42598.4). Total num frames: 15398125568. Throughput: 0: 42657.2. Samples: 15398260180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 18:16:15,830][15401] Updated weights for policy 0, policy_version 939834 (0.0031) [2024-06-25 18:16:18,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15398354944. Throughput: 0: 42401.4. Samples: 15398509680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:18,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 18:16:19,587][15401] Updated weights for policy 0, policy_version 939844 (0.0033) [2024-06-25 18:16:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15398551552. Throughput: 0: 42567.6. Samples: 15398646800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:23,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 18:16:23,414][15401] Updated weights for policy 0, policy_version 939854 (0.0040) [2024-06-25 18:16:27,243][15401] Updated weights for policy 0, policy_version 939864 (0.0039) [2024-06-25 18:16:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42328.8, 300 sec: 42709.5). Total num frames: 15398764544. Throughput: 0: 42522.1. Samples: 15398897580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 18:16:31,457][15401] Updated weights for policy 0, policy_version 939874 (0.0033) [2024-06-25 18:16:33,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15399010304. Throughput: 0: 42416.5. Samples: 15399148840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:33,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 18:16:34,764][15401] Updated weights for policy 0, policy_version 939884 (0.0042) [2024-06-25 18:16:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15399206912. Throughput: 0: 42552.9. Samples: 15399287840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:38,390][15132] Avg episode reward: [(0, '0.876')] [2024-06-25 18:16:38,887][15401] Updated weights for policy 0, policy_version 939894 (0.0030) [2024-06-25 18:16:42,303][15401] Updated weights for policy 0, policy_version 939904 (0.0030) [2024-06-25 18:16:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42820.9). Total num frames: 15399419904. Throughput: 0: 42615.9. Samples: 15399542800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:43,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 18:16:46,706][15401] Updated weights for policy 0, policy_version 939914 (0.0028) [2024-06-25 18:16:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15399665664. Throughput: 0: 42566.7. Samples: 15399793400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:48,390][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 18:16:50,186][15401] Updated weights for policy 0, policy_version 939924 (0.0037) [2024-06-25 18:16:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15399829504. Throughput: 0: 42648.8. Samples: 15399928140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:53,392][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 18:16:54,355][15401] Updated weights for policy 0, policy_version 939934 (0.0031) [2024-06-25 18:16:54,983][15349] Signal inference workers to stop experience collection... (227900 times) [2024-06-25 18:16:54,984][15349] Signal inference workers to resume experience collection... (227900 times) [2024-06-25 18:16:55,027][15401] InferenceWorker_p0-w0: stopping experience collection (227900 times) [2024-06-25 18:16:55,027][15401] InferenceWorker_p0-w0: resuming experience collection (227900 times) [2024-06-25 18:16:57,636][15401] Updated weights for policy 0, policy_version 939944 (0.0038) [2024-06-25 18:16:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 15400058880. Throughput: 0: 42765.7. Samples: 15400184640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:16:58,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 18:17:01,726][15401] Updated weights for policy 0, policy_version 939954 (0.0034) [2024-06-25 18:17:03,392][15132] Fps is (10 sec: 49140.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 15400321024. Throughput: 0: 42881.2. Samples: 15400439440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:03,393][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 18:17:05,114][15401] Updated weights for policy 0, policy_version 939964 (0.0033) [2024-06-25 18:17:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15400484864. Throughput: 0: 42910.6. Samples: 15400577780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:08,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 18:17:09,534][15401] Updated weights for policy 0, policy_version 939974 (0.0041) [2024-06-25 18:17:12,641][15401] Updated weights for policy 0, policy_version 939984 (0.0031) [2024-06-25 18:17:13,389][15132] Fps is (10 sec: 37692.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15400697856. Throughput: 0: 42813.5. Samples: 15400824180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 18:17:17,268][15401] Updated weights for policy 0, policy_version 939994 (0.0034) [2024-06-25 18:17:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15400943616. Throughput: 0: 42869.9. Samples: 15401077980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 18:17:20,301][15401] Updated weights for policy 0, policy_version 940004 (0.0050) [2024-06-25 18:17:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15401123840. Throughput: 0: 42761.3. Samples: 15401212100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:23,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 18:17:24,825][15401] Updated weights for policy 0, policy_version 940014 (0.0033) [2024-06-25 18:17:27,862][15401] Updated weights for policy 0, policy_version 940024 (0.0039) [2024-06-25 18:17:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15401353216. Throughput: 0: 42576.5. Samples: 15401458740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 18:17:32,576][15401] Updated weights for policy 0, policy_version 940034 (0.0029) [2024-06-25 18:17:33,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15401566208. Throughput: 0: 42758.4. Samples: 15401717520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:33,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 18:17:35,494][15401] Updated weights for policy 0, policy_version 940044 (0.0032) [2024-06-25 18:17:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15401746432. Throughput: 0: 42540.9. Samples: 15401842480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 18:17:40,109][15401] Updated weights for policy 0, policy_version 940054 (0.0035) [2024-06-25 18:17:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15401992192. Throughput: 0: 42349.4. Samples: 15402090360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 18:17:43,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 18:17:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940064_15402008576.pth... [2024-06-25 18:17:43,432][15401] Updated weights for policy 0, policy_version 940064 (0.0042) [2024-06-25 18:17:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939436_15391719424.pth [2024-06-25 18:17:48,197][15401] Updated weights for policy 0, policy_version 940074 (0.0036) [2024-06-25 18:17:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 15402188800. Throughput: 0: 42498.3. Samples: 15402351760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:17:48,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 18:17:51,139][15401] Updated weights for policy 0, policy_version 940084 (0.0027) [2024-06-25 18:17:53,389][15132] Fps is (10 sec: 36045.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15402352640. Throughput: 0: 42011.1. Samples: 15402468280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:17:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 18:17:55,464][15349] Signal inference workers to stop experience collection... (227950 times) [2024-06-25 18:17:55,466][15349] Signal inference workers to resume experience collection... (227950 times) [2024-06-25 18:17:55,487][15401] InferenceWorker_p0-w0: stopping experience collection (227950 times) [2024-06-25 18:17:55,487][15401] InferenceWorker_p0-w0: resuming experience collection (227950 times) [2024-06-25 18:17:55,782][15401] Updated weights for policy 0, policy_version 940094 (0.0028) [2024-06-25 18:17:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 15402647552. Throughput: 0: 42112.4. Samples: 15402719240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:17:58,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 18:17:58,796][15401] Updated weights for policy 0, policy_version 940104 (0.0027) [2024-06-25 18:18:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 41507.8, 300 sec: 42709.5). Total num frames: 15402811392. Throughput: 0: 42484.4. Samples: 15402989780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 18:18:03,446][15401] Updated weights for policy 0, policy_version 940114 (0.0033) [2024-06-25 18:18:06,496][15401] Updated weights for policy 0, policy_version 940124 (0.0030) [2024-06-25 18:18:08,390][15132] Fps is (10 sec: 36044.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15403008000. Throughput: 0: 42041.7. Samples: 15403103980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:08,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 18:18:11,132][15401] Updated weights for policy 0, policy_version 940134 (0.0027) [2024-06-25 18:18:13,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15403286528. Throughput: 0: 42334.6. Samples: 15403363800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:13,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 18:18:14,003][15401] Updated weights for policy 0, policy_version 940144 (0.0029) [2024-06-25 18:18:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 42653.9). Total num frames: 15403433984. Throughput: 0: 42614.2. Samples: 15403635160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:18,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 18:18:18,896][15401] Updated weights for policy 0, policy_version 940154 (0.0043) [2024-06-25 18:18:21,811][15401] Updated weights for policy 0, policy_version 940164 (0.0024) [2024-06-25 18:18:23,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15403663360. Throughput: 0: 42402.8. Samples: 15403750600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 18:18:26,383][15401] Updated weights for policy 0, policy_version 940174 (0.0035) [2024-06-25 18:18:28,390][15132] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15403925504. Throughput: 0: 42679.6. Samples: 15404010940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 18:18:29,586][15401] Updated weights for policy 0, policy_version 940184 (0.0032) [2024-06-25 18:18:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 15404089344. Throughput: 0: 42921.7. Samples: 15404283240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 18:18:33,960][15401] Updated weights for policy 0, policy_version 940194 (0.0033) [2024-06-25 18:18:37,264][15401] Updated weights for policy 0, policy_version 940204 (0.0032) [2024-06-25 18:18:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15404318720. Throughput: 0: 42907.1. Samples: 15404399100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:38,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 18:18:41,664][15401] Updated weights for policy 0, policy_version 940214 (0.0040) [2024-06-25 18:18:43,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15404564480. Throughput: 0: 43126.5. Samples: 15404659940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:43,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 18:18:44,971][15401] Updated weights for policy 0, policy_version 940224 (0.0033) [2024-06-25 18:18:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 15404728320. Throughput: 0: 42928.0. Samples: 15404921540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:48,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 18:18:49,536][15401] Updated weights for policy 0, policy_version 940234 (0.0030) [2024-06-25 18:18:52,746][15401] Updated weights for policy 0, policy_version 940244 (0.0037) [2024-06-25 18:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 15404974080. Throughput: 0: 42963.6. Samples: 15405037340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 18:18:55,655][15349] Signal inference workers to stop experience collection... (228000 times) [2024-06-25 18:18:55,699][15401] InferenceWorker_p0-w0: stopping experience collection (228000 times) [2024-06-25 18:18:55,765][15349] Signal inference workers to resume experience collection... (228000 times) [2024-06-25 18:18:55,765][15401] InferenceWorker_p0-w0: resuming experience collection (228000 times) [2024-06-25 18:18:57,363][15401] Updated weights for policy 0, policy_version 940254 (0.0033) [2024-06-25 18:18:58,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15405187072. Throughput: 0: 43094.7. Samples: 15405303060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:18:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 18:19:00,514][15401] Updated weights for policy 0, policy_version 940264 (0.0037) [2024-06-25 18:19:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15405367296. Throughput: 0: 42668.4. Samples: 15405555240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:03,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 18:19:05,062][15401] Updated weights for policy 0, policy_version 940274 (0.0032) [2024-06-25 18:19:08,213][15401] Updated weights for policy 0, policy_version 940284 (0.0027) [2024-06-25 18:19:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15405613056. Throughput: 0: 42715.5. Samples: 15405672800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:08,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 18:19:12,581][15401] Updated weights for policy 0, policy_version 940294 (0.0049) [2024-06-25 18:19:13,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15405826048. Throughput: 0: 42974.3. Samples: 15405944780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 18:19:16,191][15401] Updated weights for policy 0, policy_version 940304 (0.0035) [2024-06-25 18:19:18,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15406022656. Throughput: 0: 42512.1. Samples: 15406196280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:18,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 18:19:20,083][15401] Updated weights for policy 0, policy_version 940314 (0.0033) [2024-06-25 18:19:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42654.3). Total num frames: 15406235648. Throughput: 0: 42639.8. Samples: 15406317900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:23,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 18:19:23,758][15401] Updated weights for policy 0, policy_version 940324 (0.0034) [2024-06-25 18:19:27,564][15401] Updated weights for policy 0, policy_version 940334 (0.0041) [2024-06-25 18:19:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15406448640. Throughput: 0: 42613.9. Samples: 15406577560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 18:19:31,366][15401] Updated weights for policy 0, policy_version 940344 (0.0020) [2024-06-25 18:19:33,390][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15406661632. Throughput: 0: 42502.1. Samples: 15406834140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:33,390][15132] Avg episode reward: [(0, '0.237')] [2024-06-25 18:19:35,030][15401] Updated weights for policy 0, policy_version 940354 (0.0042) [2024-06-25 18:19:38,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15406891008. Throughput: 0: 42898.5. Samples: 15406967780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-25 18:19:38,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 18:19:38,908][15401] Updated weights for policy 0, policy_version 940364 (0.0040) [2024-06-25 18:19:42,484][15401] Updated weights for policy 0, policy_version 940374 (0.0025) [2024-06-25 18:19:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15407104000. Throughput: 0: 42662.2. Samples: 15407222860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:19:43,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 18:19:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940375_15407104000.pth... [2024-06-25 18:19:43,473][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000939750_15396864000.pth [2024-06-25 18:19:46,934][15401] Updated weights for policy 0, policy_version 940384 (0.0032) [2024-06-25 18:19:48,389][15132] Fps is (10 sec: 42599.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15407316992. Throughput: 0: 42797.0. Samples: 15407481100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:19:48,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 18:19:50,066][15401] Updated weights for policy 0, policy_version 940394 (0.0042) [2024-06-25 18:19:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15407529984. Throughput: 0: 43108.9. Samples: 15407612700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:19:53,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 18:19:54,564][15401] Updated weights for policy 0, policy_version 940404 (0.0042) [2024-06-25 18:19:57,651][15401] Updated weights for policy 0, policy_version 940414 (0.0033) [2024-06-25 18:19:58,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 15407742976. Throughput: 0: 42605.1. Samples: 15407862020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:19:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 18:20:02,540][15401] Updated weights for policy 0, policy_version 940424 (0.0029) [2024-06-25 18:20:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15407939584. Throughput: 0: 42750.5. Samples: 15408120060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:03,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:20:06,003][15401] Updated weights for policy 0, policy_version 940434 (0.0037) [2024-06-25 18:20:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 15408168960. Throughput: 0: 42781.6. Samples: 15408243060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:08,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 18:20:10,108][15401] Updated weights for policy 0, policy_version 940444 (0.0021) [2024-06-25 18:20:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15408381952. Throughput: 0: 42695.5. Samples: 15408498860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:13,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 18:20:13,537][15401] Updated weights for policy 0, policy_version 940454 (0.0046) [2024-06-25 18:20:17,774][15401] Updated weights for policy 0, policy_version 940464 (0.0047) [2024-06-25 18:20:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15408578560. Throughput: 0: 42776.0. Samples: 15408759060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:18,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 18:20:21,192][15401] Updated weights for policy 0, policy_version 940474 (0.0031) [2024-06-25 18:20:23,294][15349] Signal inference workers to stop experience collection... (228050 times) [2024-06-25 18:20:23,295][15349] Signal inference workers to resume experience collection... (228050 times) [2024-06-25 18:20:23,316][15401] InferenceWorker_p0-w0: stopping experience collection (228050 times) [2024-06-25 18:20:23,316][15401] InferenceWorker_p0-w0: resuming experience collection (228050 times) [2024-06-25 18:20:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42654.7). Total num frames: 15408807936. Throughput: 0: 42532.1. Samples: 15408881720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:20:25,515][15401] Updated weights for policy 0, policy_version 940484 (0.0034) [2024-06-25 18:20:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15409020928. Throughput: 0: 42606.6. Samples: 15409140160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 18:20:28,710][15401] Updated weights for policy 0, policy_version 940494 (0.0032) [2024-06-25 18:20:33,231][15401] Updated weights for policy 0, policy_version 940504 (0.0030) [2024-06-25 18:20:33,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42598.0). Total num frames: 15409217536. Throughput: 0: 42635.0. Samples: 15409399780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:33,393][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:20:36,521][15401] Updated weights for policy 0, policy_version 940514 (0.0026) [2024-06-25 18:20:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15409463296. Throughput: 0: 42383.2. Samples: 15409519940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:38,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 18:20:40,889][15401] Updated weights for policy 0, policy_version 940524 (0.0030) [2024-06-25 18:20:43,390][15132] Fps is (10 sec: 45885.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15409676288. Throughput: 0: 42741.4. Samples: 15409785380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:43,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 18:20:44,138][15401] Updated weights for policy 0, policy_version 940534 (0.0040) [2024-06-25 18:20:48,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15409856512. Throughput: 0: 42791.6. Samples: 15410045680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:48,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 18:20:48,426][15401] Updated weights for policy 0, policy_version 940544 (0.0041) [2024-06-25 18:20:51,728][15401] Updated weights for policy 0, policy_version 940554 (0.0031) [2024-06-25 18:20:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15410102272. Throughput: 0: 42828.4. Samples: 15410170340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 18:20:55,899][15401] Updated weights for policy 0, policy_version 940564 (0.0034) [2024-06-25 18:20:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15410298880. Throughput: 0: 42963.6. Samples: 15410432220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:20:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 18:20:59,203][15401] Updated weights for policy 0, policy_version 940574 (0.0034) [2024-06-25 18:21:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15410511872. Throughput: 0: 42907.5. Samples: 15410689900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:03,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 18:21:03,410][15401] Updated weights for policy 0, policy_version 940584 (0.0031) [2024-06-25 18:21:06,846][15401] Updated weights for policy 0, policy_version 940594 (0.0033) [2024-06-25 18:21:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15410741248. Throughput: 0: 43034.3. Samples: 15410818260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:08,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 18:21:11,027][15401] Updated weights for policy 0, policy_version 940604 (0.0025) [2024-06-25 18:21:13,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15410954240. Throughput: 0: 42992.1. Samples: 15411074800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 18:21:14,654][15401] Updated weights for policy 0, policy_version 940614 (0.0033) [2024-06-25 18:21:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 15411150848. Throughput: 0: 42965.8. Samples: 15411333140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 18:21:18,787][15401] Updated weights for policy 0, policy_version 940624 (0.0029) [2024-06-25 18:21:22,287][15401] Updated weights for policy 0, policy_version 940634 (0.0028) [2024-06-25 18:21:23,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15411380224. Throughput: 0: 43041.6. Samples: 15411456920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:23,392][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 18:21:26,475][15401] Updated weights for policy 0, policy_version 940644 (0.0026) [2024-06-25 18:21:28,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 15411576832. Throughput: 0: 42871.3. Samples: 15411714580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:28,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:21:29,843][15401] Updated weights for policy 0, policy_version 940654 (0.0022) [2024-06-25 18:21:33,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 15411789824. Throughput: 0: 42861.2. Samples: 15411974440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:33,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:21:33,938][15401] Updated weights for policy 0, policy_version 940664 (0.0029) [2024-06-25 18:21:37,664][15401] Updated weights for policy 0, policy_version 940674 (0.0037) [2024-06-25 18:21:38,389][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15412035584. Throughput: 0: 42983.1. Samples: 15412104580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 18:21:38,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 18:21:41,936][15401] Updated weights for policy 0, policy_version 940684 (0.0032) [2024-06-25 18:21:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15412232192. Throughput: 0: 42805.3. Samples: 15412358460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:21:43,396][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 18:21:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940688_15412232192.pth... [2024-06-25 18:21:43,480][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940064_15402008576.pth [2024-06-25 18:21:45,261][15401] Updated weights for policy 0, policy_version 940694 (0.0042) [2024-06-25 18:21:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15412445184. Throughput: 0: 42649.4. Samples: 15412609120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:21:48,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 18:21:49,504][15401] Updated weights for policy 0, policy_version 940704 (0.0032) [2024-06-25 18:21:51,257][15349] Signal inference workers to stop experience collection... (228100 times) [2024-06-25 18:21:51,291][15401] InferenceWorker_p0-w0: stopping experience collection (228100 times) [2024-06-25 18:21:51,302][15349] Signal inference workers to resume experience collection... (228100 times) [2024-06-25 18:21:51,316][15401] InferenceWorker_p0-w0: resuming experience collection (228100 times) [2024-06-25 18:21:52,834][15401] Updated weights for policy 0, policy_version 940714 (0.0028) [2024-06-25 18:21:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15412674560. Throughput: 0: 42687.0. Samples: 15412739180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:21:53,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 18:21:57,287][15401] Updated weights for policy 0, policy_version 940724 (0.0030) [2024-06-25 18:21:58,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42487.3). Total num frames: 15412854784. Throughput: 0: 42696.0. Samples: 15412996220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:21:58,393][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 18:22:00,577][15401] Updated weights for policy 0, policy_version 940734 (0.0038) [2024-06-25 18:22:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15413084160. Throughput: 0: 42461.9. Samples: 15413243920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 18:22:05,042][15401] Updated weights for policy 0, policy_version 940744 (0.0042) [2024-06-25 18:22:08,334][15401] Updated weights for policy 0, policy_version 940754 (0.0029) [2024-06-25 18:22:08,390][15132] Fps is (10 sec: 45885.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 15413313536. Throughput: 0: 42709.3. Samples: 15413378740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:08,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 18:22:12,974][15401] Updated weights for policy 0, policy_version 940764 (0.0042) [2024-06-25 18:22:13,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 15413477376. Throughput: 0: 42570.6. Samples: 15413630260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:13,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 18:22:15,994][15401] Updated weights for policy 0, policy_version 940774 (0.0048) [2024-06-25 18:22:18,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42866.9, 300 sec: 42708.5). Total num frames: 15413723136. Throughput: 0: 42223.9. Samples: 15413874780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:18,396][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 18:22:20,703][15401] Updated weights for policy 0, policy_version 940784 (0.0037) [2024-06-25 18:22:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 15413936128. Throughput: 0: 42252.3. Samples: 15414005940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:23,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 18:22:24,069][15401] Updated weights for policy 0, policy_version 940794 (0.0034) [2024-06-25 18:22:28,377][15401] Updated weights for policy 0, policy_version 940804 (0.0032) [2024-06-25 18:22:28,390][15132] Fps is (10 sec: 40986.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15414132736. Throughput: 0: 42316.4. Samples: 15414262700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:28,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 18:22:31,627][15401] Updated weights for policy 0, policy_version 940814 (0.0029) [2024-06-25 18:22:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 15414378496. Throughput: 0: 42415.7. Samples: 15414517820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 18:22:36,080][15401] Updated weights for policy 0, policy_version 940824 (0.0044) [2024-06-25 18:22:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15414558720. Throughput: 0: 42441.7. Samples: 15414649060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:38,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 18:22:39,459][15401] Updated weights for policy 0, policy_version 940834 (0.0047) [2024-06-25 18:22:43,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 15414771712. Throughput: 0: 42442.2. Samples: 15414906120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:43,393][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 18:22:43,613][15401] Updated weights for policy 0, policy_version 940844 (0.0037) [2024-06-25 18:22:47,287][15401] Updated weights for policy 0, policy_version 940854 (0.0023) [2024-06-25 18:22:48,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15415001088. Throughput: 0: 42466.7. Samples: 15415154920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:48,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 18:22:51,228][15401] Updated weights for policy 0, policy_version 940864 (0.0045) [2024-06-25 18:22:53,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 15415197696. Throughput: 0: 42445.4. Samples: 15415288880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:53,392][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 18:22:54,814][15401] Updated weights for policy 0, policy_version 940874 (0.0033) [2024-06-25 18:22:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 15415410688. Throughput: 0: 42605.3. Samples: 15415547500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:22:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 18:22:59,065][15401] Updated weights for policy 0, policy_version 940884 (0.0039) [2024-06-25 18:23:02,665][15401] Updated weights for policy 0, policy_version 940894 (0.0032) [2024-06-25 18:23:03,392][15132] Fps is (10 sec: 44236.9, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 15415640064. Throughput: 0: 42598.1. Samples: 15415791520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:03,392][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:23:06,687][15401] Updated weights for policy 0, policy_version 940904 (0.0031) [2024-06-25 18:23:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15415853056. Throughput: 0: 42691.1. Samples: 15415927040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:23:10,263][15401] Updated weights for policy 0, policy_version 940914 (0.0042) [2024-06-25 18:23:10,413][15349] Signal inference workers to stop experience collection... (228150 times) [2024-06-25 18:23:10,415][15349] Signal inference workers to resume experience collection... (228150 times) [2024-06-25 18:23:10,436][15401] InferenceWorker_p0-w0: stopping experience collection (228150 times) [2024-06-25 18:23:10,464][15401] InferenceWorker_p0-w0: resuming experience collection (228150 times) [2024-06-25 18:23:13,389][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15416049664. Throughput: 0: 42687.1. Samples: 15416183620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 18:23:14,270][15401] Updated weights for policy 0, policy_version 940924 (0.0032) [2024-06-25 18:23:17,764][15401] Updated weights for policy 0, policy_version 940934 (0.0041) [2024-06-25 18:23:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42603.0, 300 sec: 42765.0). Total num frames: 15416279040. Throughput: 0: 42611.6. Samples: 15416435340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:18,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 18:23:21,816][15401] Updated weights for policy 0, policy_version 940944 (0.0027) [2024-06-25 18:23:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15416492032. Throughput: 0: 42678.7. Samples: 15416569600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:23,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 18:23:25,450][15401] Updated weights for policy 0, policy_version 940954 (0.0034) [2024-06-25 18:23:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15416688640. Throughput: 0: 42511.2. Samples: 15416819020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:28,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 18:23:29,689][15401] Updated weights for policy 0, policy_version 940964 (0.0036) [2024-06-25 18:23:33,354][15401] Updated weights for policy 0, policy_version 940974 (0.0048) [2024-06-25 18:23:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15416918016. Throughput: 0: 42697.3. Samples: 15417076300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 18:23:33,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 18:23:37,407][15401] Updated weights for policy 0, policy_version 940984 (0.0032) [2024-06-25 18:23:38,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15417131008. Throughput: 0: 42657.2. Samples: 15417208360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:23:38,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 18:23:40,913][15401] Updated weights for policy 0, policy_version 940994 (0.0036) [2024-06-25 18:23:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15417327616. Throughput: 0: 42509.3. Samples: 15417460420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:23:43,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 18:23:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940999_15417327616.pth... [2024-06-25 18:23:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940375_15407104000.pth [2024-06-25 18:23:44,779][15401] Updated weights for policy 0, policy_version 941004 (0.0034) [2024-06-25 18:23:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15417556992. Throughput: 0: 42898.7. Samples: 15417721860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:23:48,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 18:23:48,454][15401] Updated weights for policy 0, policy_version 941014 (0.0037) [2024-06-25 18:23:52,634][15401] Updated weights for policy 0, policy_version 941024 (0.0037) [2024-06-25 18:23:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 15417786368. Throughput: 0: 42738.3. Samples: 15417850260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:23:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 18:23:55,930][15401] Updated weights for policy 0, policy_version 941034 (0.0036) [2024-06-25 18:23:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15417966592. Throughput: 0: 42689.4. Samples: 15418104640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:23:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 18:24:00,424][15401] Updated weights for policy 0, policy_version 941044 (0.0028) [2024-06-25 18:24:03,394][15132] Fps is (10 sec: 40943.2, 60 sec: 42597.1, 300 sec: 42653.3). Total num frames: 15418195968. Throughput: 0: 42699.5. Samples: 15418357000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:03,394][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 18:24:04,066][15401] Updated weights for policy 0, policy_version 941054 (0.0037) [2024-06-25 18:24:08,038][15401] Updated weights for policy 0, policy_version 941064 (0.0038) [2024-06-25 18:24:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15418408960. Throughput: 0: 42623.1. Samples: 15418487640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:08,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 18:24:11,800][15401] Updated weights for policy 0, policy_version 941074 (0.0035) [2024-06-25 18:24:13,390][15132] Fps is (10 sec: 40976.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15418605568. Throughput: 0: 42682.0. Samples: 15418739720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:13,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 18:24:15,712][15401] Updated weights for policy 0, policy_version 941084 (0.0039) [2024-06-25 18:24:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15418834944. Throughput: 0: 42545.7. Samples: 15418990860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:18,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 18:24:19,475][15401] Updated weights for policy 0, policy_version 941094 (0.0041) [2024-06-25 18:24:23,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15419031552. Throughput: 0: 42732.1. Samples: 15419131300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 18:24:23,566][15401] Updated weights for policy 0, policy_version 941104 (0.0029) [2024-06-25 18:24:27,186][15401] Updated weights for policy 0, policy_version 941114 (0.0040) [2024-06-25 18:24:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15419244544. Throughput: 0: 42732.4. Samples: 15419383380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 18:24:31,010][15401] Updated weights for policy 0, policy_version 941124 (0.0042) [2024-06-25 18:24:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15419490304. Throughput: 0: 42484.4. Samples: 15419633660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:33,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 18:24:34,756][15401] Updated weights for policy 0, policy_version 941134 (0.0029) [2024-06-25 18:24:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15419686912. Throughput: 0: 42627.2. Samples: 15419768480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:38,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 18:24:38,909][15401] Updated weights for policy 0, policy_version 941144 (0.0037) [2024-06-25 18:24:42,320][15401] Updated weights for policy 0, policy_version 941154 (0.0030) [2024-06-25 18:24:43,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15419899904. Throughput: 0: 42652.0. Samples: 15420023980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 18:24:46,466][15401] Updated weights for policy 0, policy_version 941164 (0.0039) [2024-06-25 18:24:46,778][15349] Signal inference workers to stop experience collection... (228200 times) [2024-06-25 18:24:46,779][15349] Signal inference workers to resume experience collection... (228200 times) [2024-06-25 18:24:46,822][15401] InferenceWorker_p0-w0: stopping experience collection (228200 times) [2024-06-25 18:24:46,822][15401] InferenceWorker_p0-w0: resuming experience collection (228200 times) [2024-06-25 18:24:48,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15420129280. Throughput: 0: 42427.6. Samples: 15420266060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:48,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 18:24:50,282][15401] Updated weights for policy 0, policy_version 941174 (0.0033) [2024-06-25 18:24:53,390][15132] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15420342272. Throughput: 0: 42667.9. Samples: 15420407700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 18:24:54,133][15401] Updated weights for policy 0, policy_version 941184 (0.0029) [2024-06-25 18:24:57,951][15401] Updated weights for policy 0, policy_version 941194 (0.0022) [2024-06-25 18:24:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15420522496. Throughput: 0: 42610.9. Samples: 15420657200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:24:58,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 18:25:01,785][15401] Updated weights for policy 0, policy_version 941204 (0.0022) [2024-06-25 18:25:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43147.6, 300 sec: 42765.0). Total num frames: 15420784640. Throughput: 0: 42570.7. Samples: 15420906540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 18:25:05,670][15401] Updated weights for policy 0, policy_version 941214 (0.0033) [2024-06-25 18:25:08,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15420948480. Throughput: 0: 42533.4. Samples: 15421045300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 18:25:09,365][15401] Updated weights for policy 0, policy_version 941224 (0.0027) [2024-06-25 18:25:13,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15421161472. Throughput: 0: 42410.2. Samples: 15421291840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:13,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 18:25:13,497][15401] Updated weights for policy 0, policy_version 941234 (0.0032) [2024-06-25 18:25:17,135][15401] Updated weights for policy 0, policy_version 941244 (0.0034) [2024-06-25 18:25:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15421407232. Throughput: 0: 42476.6. Samples: 15421545100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 18:25:21,134][15401] Updated weights for policy 0, policy_version 941254 (0.0035) [2024-06-25 18:25:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15421571072. Throughput: 0: 42440.8. Samples: 15421678320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:23,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 18:25:24,832][15401] Updated weights for policy 0, policy_version 941264 (0.0040) [2024-06-25 18:25:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15421800448. Throughput: 0: 42219.4. Samples: 15421923860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-25 18:25:28,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 18:25:29,403][15401] Updated weights for policy 0, policy_version 941274 (0.0027) [2024-06-25 18:25:32,556][15401] Updated weights for policy 0, policy_version 941284 (0.0035) [2024-06-25 18:25:33,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15422029824. Throughput: 0: 42633.6. Samples: 15422184580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:33,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 18:25:37,064][15401] Updated weights for policy 0, policy_version 941294 (0.0027) [2024-06-25 18:25:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15422210048. Throughput: 0: 42419.3. Samples: 15422316560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:38,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 18:25:40,121][15401] Updated weights for policy 0, policy_version 941304 (0.0045) [2024-06-25 18:25:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15422455808. Throughput: 0: 42394.5. Samples: 15422564960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:25:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941312_15422455808.pth... [2024-06-25 18:25:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940688_15412232192.pth [2024-06-25 18:25:44,822][15401] Updated weights for policy 0, policy_version 941314 (0.0038) [2024-06-25 18:25:47,865][15401] Updated weights for policy 0, policy_version 941324 (0.0037) [2024-06-25 18:25:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15422668800. Throughput: 0: 42639.9. Samples: 15422825340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:48,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 18:25:52,431][15401] Updated weights for policy 0, policy_version 941334 (0.0028) [2024-06-25 18:25:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15422865408. Throughput: 0: 42484.8. Samples: 15422957120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 18:25:55,714][15401] Updated weights for policy 0, policy_version 941344 (0.0036) [2024-06-25 18:25:56,728][15349] Signal inference workers to stop experience collection... (228250 times) [2024-06-25 18:25:56,729][15349] Signal inference workers to resume experience collection... (228250 times) [2024-06-25 18:25:56,766][15401] InferenceWorker_p0-w0: stopping experience collection (228250 times) [2024-06-25 18:25:56,766][15401] InferenceWorker_p0-w0: resuming experience collection (228250 times) [2024-06-25 18:25:58,390][15132] Fps is (10 sec: 42594.7, 60 sec: 42870.8, 300 sec: 42653.8). Total num frames: 15423094784. Throughput: 0: 42491.2. Samples: 15423203980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:25:58,391][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 18:25:59,879][15401] Updated weights for policy 0, policy_version 941354 (0.0029) [2024-06-25 18:26:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 15423291392. Throughput: 0: 42803.5. Samples: 15423471260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 18:26:03,503][15401] Updated weights for policy 0, policy_version 941364 (0.0049) [2024-06-25 18:26:07,382][15401] Updated weights for policy 0, policy_version 941374 (0.0033) [2024-06-25 18:26:08,392][15132] Fps is (10 sec: 40953.8, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15423504384. Throughput: 0: 42679.1. Samples: 15423598980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:08,392][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 18:26:11,070][15401] Updated weights for policy 0, policy_version 941384 (0.0033) [2024-06-25 18:26:13,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15423750144. Throughput: 0: 42883.6. Samples: 15423853620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:13,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 18:26:14,898][15401] Updated weights for policy 0, policy_version 941394 (0.0036) [2024-06-25 18:26:18,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 15423946752. Throughput: 0: 42858.8. Samples: 15424113220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:26:18,509][15401] Updated weights for policy 0, policy_version 941404 (0.0032) [2024-06-25 18:26:22,861][15401] Updated weights for policy 0, policy_version 941414 (0.0022) [2024-06-25 18:26:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15424159744. Throughput: 0: 42656.4. Samples: 15424236100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:23,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 18:26:26,458][15401] Updated weights for policy 0, policy_version 941424 (0.0033) [2024-06-25 18:26:28,395][15132] Fps is (10 sec: 44213.1, 60 sec: 43140.7, 300 sec: 42708.7). Total num frames: 15424389120. Throughput: 0: 42859.0. Samples: 15424493840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:28,395][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 18:26:30,260][15401] Updated weights for policy 0, policy_version 941434 (0.0031) [2024-06-25 18:26:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15424585728. Throughput: 0: 42860.5. Samples: 15424754060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 18:26:33,921][15401] Updated weights for policy 0, policy_version 941444 (0.0035) [2024-06-25 18:26:38,182][15401] Updated weights for policy 0, policy_version 941454 (0.0041) [2024-06-25 18:26:38,391][15132] Fps is (10 sec: 40976.2, 60 sec: 43143.5, 300 sec: 42598.2). Total num frames: 15424798720. Throughput: 0: 42669.0. Samples: 15424877280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:38,391][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 18:26:41,694][15401] Updated weights for policy 0, policy_version 941464 (0.0044) [2024-06-25 18:26:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15425028096. Throughput: 0: 42906.3. Samples: 15425134720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 18:26:45,724][15401] Updated weights for policy 0, policy_version 941474 (0.0050) [2024-06-25 18:26:48,390][15132] Fps is (10 sec: 40965.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15425208320. Throughput: 0: 42788.3. Samples: 15425396740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:48,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 18:26:49,249][15401] Updated weights for policy 0, policy_version 941484 (0.0030) [2024-06-25 18:26:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.8). Total num frames: 15425421312. Throughput: 0: 42657.5. Samples: 15425518460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:53,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 18:26:53,436][15401] Updated weights for policy 0, policy_version 941494 (0.0044) [2024-06-25 18:26:56,768][15401] Updated weights for policy 0, policy_version 941504 (0.0044) [2024-06-25 18:26:58,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42872.1, 300 sec: 42653.9). Total num frames: 15425667072. Throughput: 0: 42646.7. Samples: 15425772720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:26:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 18:27:00,862][15401] Updated weights for policy 0, policy_version 941514 (0.0034) [2024-06-25 18:27:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15425863680. Throughput: 0: 42680.8. Samples: 15426033860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:03,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 18:27:04,603][15401] Updated weights for policy 0, policy_version 941524 (0.0042) [2024-06-25 18:27:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 15426076672. Throughput: 0: 42672.0. Samples: 15426156340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 18:27:08,424][15401] Updated weights for policy 0, policy_version 941534 (0.0041) [2024-06-25 18:27:12,559][15401] Updated weights for policy 0, policy_version 941544 (0.0042) [2024-06-25 18:27:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 15426289664. Throughput: 0: 42678.3. Samples: 15426414140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:13,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 18:27:16,304][15401] Updated weights for policy 0, policy_version 941554 (0.0035) [2024-06-25 18:27:18,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15426519040. Throughput: 0: 42680.7. Samples: 15426674700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:18,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 18:27:20,109][15401] Updated weights for policy 0, policy_version 941564 (0.0033) [2024-06-25 18:27:23,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15426732032. Throughput: 0: 42784.8. Samples: 15426802540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:23,390][15132] Avg episode reward: [(0, '0.059')] [2024-06-25 18:27:24,125][15401] Updated weights for policy 0, policy_version 941574 (0.0031) [2024-06-25 18:27:27,692][15401] Updated weights for policy 0, policy_version 941584 (0.0035) [2024-06-25 18:27:28,389][15132] Fps is (10 sec: 40961.3, 60 sec: 42329.2, 300 sec: 42542.9). Total num frames: 15426928640. Throughput: 0: 42548.5. Samples: 15427049400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-25 18:27:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 18:27:31,687][15401] Updated weights for policy 0, policy_version 941594 (0.0033) [2024-06-25 18:27:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15427141632. Throughput: 0: 42656.5. Samples: 15427316280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 18:27:33,618][15349] Signal inference workers to stop experience collection... (228300 times) [2024-06-25 18:27:33,618][15349] Signal inference workers to resume experience collection... (228300 times) [2024-06-25 18:27:33,663][15401] InferenceWorker_p0-w0: stopping experience collection (228300 times) [2024-06-25 18:27:33,664][15401] InferenceWorker_p0-w0: resuming experience collection (228300 times) [2024-06-25 18:27:35,105][15401] Updated weights for policy 0, policy_version 941604 (0.0036) [2024-06-25 18:27:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42599.4, 300 sec: 42654.3). Total num frames: 15427354624. Throughput: 0: 42777.8. Samples: 15427443460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:38,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 18:27:39,126][15401] Updated weights for policy 0, policy_version 941614 (0.0033) [2024-06-25 18:27:43,016][15401] Updated weights for policy 0, policy_version 941624 (0.0031) [2024-06-25 18:27:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15427584000. Throughput: 0: 42854.7. Samples: 15427701180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:43,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 18:27:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941626_15427600384.pth... [2024-06-25 18:27:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000940999_15417327616.pth [2024-06-25 18:27:47,034][15401] Updated weights for policy 0, policy_version 941634 (0.0048) [2024-06-25 18:27:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 15427796992. Throughput: 0: 42762.4. Samples: 15427958160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:48,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 18:27:50,549][15401] Updated weights for policy 0, policy_version 941644 (0.0041) [2024-06-25 18:27:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15428009984. Throughput: 0: 42863.1. Samples: 15428085180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:53,396][15132] Avg episode reward: [(0, '0.395')] [2024-06-25 18:27:54,898][15401] Updated weights for policy 0, policy_version 941654 (0.0044) [2024-06-25 18:27:58,293][15401] Updated weights for policy 0, policy_version 941664 (0.0026) [2024-06-25 18:27:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15428222976. Throughput: 0: 42772.0. Samples: 15428338880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:27:58,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 18:28:02,710][15401] Updated weights for policy 0, policy_version 941674 (0.0037) [2024-06-25 18:28:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15428435968. Throughput: 0: 42754.8. Samples: 15428598660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 18:28:05,919][15401] Updated weights for policy 0, policy_version 941684 (0.0038) [2024-06-25 18:28:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15428648960. Throughput: 0: 42677.8. Samples: 15428723040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:08,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 18:28:10,298][15401] Updated weights for policy 0, policy_version 941694 (0.0035) [2024-06-25 18:28:13,382][15401] Updated weights for policy 0, policy_version 941704 (0.0028) [2024-06-25 18:28:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15428878336. Throughput: 0: 42899.0. Samples: 15428979860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 18:28:17,841][15401] Updated weights for policy 0, policy_version 941714 (0.0030) [2024-06-25 18:28:18,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 15429074944. Throughput: 0: 42873.2. Samples: 15429245680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:18,392][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 18:28:20,950][15401] Updated weights for policy 0, policy_version 941724 (0.0034) [2024-06-25 18:28:23,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15429304320. Throughput: 0: 42785.1. Samples: 15429368800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:28:25,281][15401] Updated weights for policy 0, policy_version 941734 (0.0040) [2024-06-25 18:28:28,390][15132] Fps is (10 sec: 44247.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15429517312. Throughput: 0: 42932.3. Samples: 15429633140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:28,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:28:28,449][15401] Updated weights for policy 0, policy_version 941744 (0.0032) [2024-06-25 18:28:32,902][15401] Updated weights for policy 0, policy_version 941754 (0.0053) [2024-06-25 18:28:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15429713920. Throughput: 0: 42822.0. Samples: 15429885160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 18:28:36,314][15401] Updated weights for policy 0, policy_version 941764 (0.0031) [2024-06-25 18:28:38,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15429926912. Throughput: 0: 42750.8. Samples: 15430008960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:38,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 18:28:40,606][15401] Updated weights for policy 0, policy_version 941774 (0.0029) [2024-06-25 18:28:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15430139904. Throughput: 0: 42911.5. Samples: 15430269900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 18:28:44,107][15401] Updated weights for policy 0, policy_version 941784 (0.0029) [2024-06-25 18:28:48,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15430336512. Throughput: 0: 42805.4. Samples: 15430524900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:48,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 18:28:48,536][15401] Updated weights for policy 0, policy_version 941794 (0.0033) [2024-06-25 18:28:51,773][15401] Updated weights for policy 0, policy_version 941804 (0.0042) [2024-06-25 18:28:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15430549504. Throughput: 0: 42860.4. Samples: 15430651760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:53,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 18:28:56,213][15401] Updated weights for policy 0, policy_version 941814 (0.0035) [2024-06-25 18:28:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 15430778880. Throughput: 0: 42800.9. Samples: 15430905900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:28:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 18:28:59,592][15401] Updated weights for policy 0, policy_version 941824 (0.0037) [2024-06-25 18:29:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15430991872. Throughput: 0: 42752.0. Samples: 15431169420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:29:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 18:29:03,661][15401] Updated weights for policy 0, policy_version 941834 (0.0036) [2024-06-25 18:29:07,107][15401] Updated weights for policy 0, policy_version 941844 (0.0024) [2024-06-25 18:29:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15431221248. Throughput: 0: 42894.9. Samples: 15431299060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:29:08,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 18:29:08,909][15349] Signal inference workers to stop experience collection... (228350 times) [2024-06-25 18:29:08,910][15349] Signal inference workers to resume experience collection... (228350 times) [2024-06-25 18:29:08,950][15401] InferenceWorker_p0-w0: stopping experience collection (228350 times) [2024-06-25 18:29:08,951][15401] InferenceWorker_p0-w0: resuming experience collection (228350 times) [2024-06-25 18:29:11,198][15401] Updated weights for policy 0, policy_version 941854 (0.0029) [2024-06-25 18:29:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15431434240. Throughput: 0: 42632.5. Samples: 15431551600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:29:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 18:29:14,801][15401] Updated weights for policy 0, policy_version 941864 (0.0028) [2024-06-25 18:29:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15431647232. Throughput: 0: 42731.6. Samples: 15431808080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:29:18,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 18:29:18,701][15401] Updated weights for policy 0, policy_version 941874 (0.0043) [2024-06-25 18:29:22,543][15401] Updated weights for policy 0, policy_version 941884 (0.0032) [2024-06-25 18:29:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 15431843840. Throughput: 0: 42904.9. Samples: 15431939680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-25 18:29:23,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 18:29:26,097][15401] Updated weights for policy 0, policy_version 941894 (0.0034) [2024-06-25 18:29:28,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 15432089600. Throughput: 0: 42692.4. Samples: 15432191160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:28,393][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 18:29:30,526][15401] Updated weights for policy 0, policy_version 941904 (0.0029) [2024-06-25 18:29:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15432269824. Throughput: 0: 42756.0. Samples: 15432448920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 18:29:34,073][15401] Updated weights for policy 0, policy_version 941914 (0.0030) [2024-06-25 18:29:38,072][15401] Updated weights for policy 0, policy_version 941924 (0.0030) [2024-06-25 18:29:38,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15432499200. Throughput: 0: 42804.0. Samples: 15432577940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 18:29:41,650][15401] Updated weights for policy 0, policy_version 941934 (0.0035) [2024-06-25 18:29:43,392][15132] Fps is (10 sec: 45864.7, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 15432728576. Throughput: 0: 42946.2. Samples: 15432838580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:43,392][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 18:29:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941939_15432728576.pth... [2024-06-25 18:29:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941312_15422455808.pth [2024-06-25 18:29:45,522][15401] Updated weights for policy 0, policy_version 941944 (0.0026) [2024-06-25 18:29:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 15432925184. Throughput: 0: 42812.6. Samples: 15433095980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:48,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 18:29:49,393][15401] Updated weights for policy 0, policy_version 941954 (0.0040) [2024-06-25 18:29:53,156][15401] Updated weights for policy 0, policy_version 941964 (0.0028) [2024-06-25 18:29:53,392][15132] Fps is (10 sec: 40959.8, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 15433138176. Throughput: 0: 42714.6. Samples: 15433221320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:53,392][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 18:29:56,990][15401] Updated weights for policy 0, policy_version 941974 (0.0031) [2024-06-25 18:29:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15433383936. Throughput: 0: 42954.3. Samples: 15433484540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:29:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 18:30:01,008][15401] Updated weights for policy 0, policy_version 941984 (0.0038) [2024-06-25 18:30:03,391][15132] Fps is (10 sec: 44241.1, 60 sec: 43143.6, 300 sec: 42820.4). Total num frames: 15433580544. Throughput: 0: 42990.3. Samples: 15433742700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:03,391][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 18:30:04,469][15401] Updated weights for policy 0, policy_version 941994 (0.0034) [2024-06-25 18:30:08,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15433777152. Throughput: 0: 42807.4. Samples: 15433866020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 18:30:08,565][15401] Updated weights for policy 0, policy_version 942004 (0.0029) [2024-06-25 18:30:12,108][15401] Updated weights for policy 0, policy_version 942014 (0.0031) [2024-06-25 18:30:13,389][15132] Fps is (10 sec: 44243.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15434022912. Throughput: 0: 43027.7. Samples: 15434127300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:13,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 18:30:16,231][15401] Updated weights for policy 0, policy_version 942024 (0.0037) [2024-06-25 18:30:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15434219520. Throughput: 0: 43036.1. Samples: 15434385540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 18:30:19,561][15401] Updated weights for policy 0, policy_version 942034 (0.0041) [2024-06-25 18:30:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15434432512. Throughput: 0: 43053.9. Samples: 15434515360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 18:30:23,738][15401] Updated weights for policy 0, policy_version 942044 (0.0035) [2024-06-25 18:30:26,820][15349] Signal inference workers to stop experience collection... (228400 times) [2024-06-25 18:30:26,824][15349] Signal inference workers to resume experience collection... (228400 times) [2024-06-25 18:30:26,870][15401] InferenceWorker_p0-w0: stopping experience collection (228400 times) [2024-06-25 18:30:26,870][15401] InferenceWorker_p0-w0: resuming experience collection (228400 times) [2024-06-25 18:30:27,307][15401] Updated weights for policy 0, policy_version 942054 (0.0035) [2024-06-25 18:30:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42873.3, 300 sec: 42820.6). Total num frames: 15434661888. Throughput: 0: 43039.7. Samples: 15434775260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:28,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 18:30:31,269][15401] Updated weights for policy 0, policy_version 942064 (0.0048) [2024-06-25 18:30:33,394][15132] Fps is (10 sec: 44216.7, 60 sec: 43414.4, 300 sec: 42931.0). Total num frames: 15434874880. Throughput: 0: 42992.5. Samples: 15435030840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:33,394][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 18:30:35,391][15401] Updated weights for policy 0, policy_version 942074 (0.0031) [2024-06-25 18:30:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15435071488. Throughput: 0: 43005.3. Samples: 15435156460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 18:30:38,757][15401] Updated weights for policy 0, policy_version 942084 (0.0027) [2024-06-25 18:30:42,833][15401] Updated weights for policy 0, policy_version 942094 (0.0035) [2024-06-25 18:30:43,389][15132] Fps is (10 sec: 42617.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 15435300864. Throughput: 0: 43106.3. Samples: 15435424320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:43,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 18:30:46,307][15401] Updated weights for policy 0, policy_version 942104 (0.0022) [2024-06-25 18:30:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15435513856. Throughput: 0: 43063.5. Samples: 15435680500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:48,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 18:30:50,637][15401] Updated weights for policy 0, policy_version 942114 (0.0030) [2024-06-25 18:30:53,389][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42820.7). Total num frames: 15435726848. Throughput: 0: 43175.6. Samples: 15435808920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:53,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 18:30:53,787][15401] Updated weights for policy 0, policy_version 942124 (0.0038) [2024-06-25 18:30:58,125][15401] Updated weights for policy 0, policy_version 942134 (0.0039) [2024-06-25 18:30:58,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15435939840. Throughput: 0: 43051.4. Samples: 15436064620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:30:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 18:31:01,295][15401] Updated weights for policy 0, policy_version 942144 (0.0042) [2024-06-25 18:31:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42872.5, 300 sec: 42876.5). Total num frames: 15436152832. Throughput: 0: 43090.6. Samples: 15436324620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:31:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 18:31:05,522][15401] Updated weights for policy 0, policy_version 942154 (0.0032) [2024-06-25 18:31:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15436349440. Throughput: 0: 42967.0. Samples: 15436448880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:31:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 18:31:09,198][15401] Updated weights for policy 0, policy_version 942164 (0.0026) [2024-06-25 18:31:12,927][15401] Updated weights for policy 0, policy_version 942174 (0.0026) [2024-06-25 18:31:13,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 15436595200. Throughput: 0: 42945.3. Samples: 15436708080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:31:13,396][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 18:31:17,066][15401] Updated weights for policy 0, policy_version 942184 (0.0033) [2024-06-25 18:31:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15436808192. Throughput: 0: 42859.7. Samples: 15436959340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-25 18:31:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:31:20,844][15401] Updated weights for policy 0, policy_version 942194 (0.0043) [2024-06-25 18:31:23,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 15437004800. Throughput: 0: 42911.7. Samples: 15437087480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:23,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 18:31:24,713][15401] Updated weights for policy 0, policy_version 942204 (0.0034) [2024-06-25 18:31:28,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 15437217792. Throughput: 0: 42600.3. Samples: 15437341440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:28,392][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 18:31:28,462][15401] Updated weights for policy 0, policy_version 942214 (0.0030) [2024-06-25 18:31:32,261][15401] Updated weights for policy 0, policy_version 942224 (0.0041) [2024-06-25 18:31:33,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42874.7, 300 sec: 42876.3). Total num frames: 15437447168. Throughput: 0: 42780.1. Samples: 15437605600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:33,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 18:31:35,998][15401] Updated weights for policy 0, policy_version 942234 (0.0026) [2024-06-25 18:31:38,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15437660160. Throughput: 0: 42650.5. Samples: 15437728200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 18:31:39,872][15401] Updated weights for policy 0, policy_version 942244 (0.0031) [2024-06-25 18:31:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15437873152. Throughput: 0: 42751.6. Samples: 15437988440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 18:31:43,496][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942254_15437889536.pth... [2024-06-25 18:31:43,507][15401] Updated weights for policy 0, policy_version 942254 (0.0028) [2024-06-25 18:31:43,543][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941626_15427600384.pth [2024-06-25 18:31:47,503][15401] Updated weights for policy 0, policy_version 942264 (0.0032) [2024-06-25 18:31:48,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 15438102528. Throughput: 0: 42653.7. Samples: 15438244040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:48,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 18:31:51,013][15401] Updated weights for policy 0, policy_version 942274 (0.0037) [2024-06-25 18:31:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15438282752. Throughput: 0: 42703.3. Samples: 15438370520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:53,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-25 18:31:53,996][15349] Signal inference workers to stop experience collection... (228450 times) [2024-06-25 18:31:53,996][15349] Signal inference workers to resume experience collection... (228450 times) [2024-06-25 18:31:54,019][15401] InferenceWorker_p0-w0: stopping experience collection (228450 times) [2024-06-25 18:31:54,019][15401] InferenceWorker_p0-w0: resuming experience collection (228450 times) [2024-06-25 18:31:55,130][15401] Updated weights for policy 0, policy_version 942284 (0.0035) [2024-06-25 18:31:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15438512128. Throughput: 0: 42731.9. Samples: 15438630740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:31:58,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 18:31:58,659][15401] Updated weights for policy 0, policy_version 942294 (0.0022) [2024-06-25 18:32:02,804][15401] Updated weights for policy 0, policy_version 942304 (0.0029) [2024-06-25 18:32:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15438725120. Throughput: 0: 42825.5. Samples: 15438886480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:03,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 18:32:06,234][15401] Updated weights for policy 0, policy_version 942314 (0.0038) [2024-06-25 18:32:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15438921728. Throughput: 0: 42763.1. Samples: 15439011820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 18:32:10,650][15401] Updated weights for policy 0, policy_version 942324 (0.0025) [2024-06-25 18:32:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42876.0, 300 sec: 42876.1). Total num frames: 15439167488. Throughput: 0: 42896.1. Samples: 15439271660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 18:32:13,767][15401] Updated weights for policy 0, policy_version 942334 (0.0037) [2024-06-25 18:32:18,197][15401] Updated weights for policy 0, policy_version 942344 (0.0037) [2024-06-25 18:32:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15439380480. Throughput: 0: 42841.8. Samples: 15439533480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:18,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 18:32:21,688][15401] Updated weights for policy 0, policy_version 942354 (0.0035) [2024-06-25 18:32:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15439560704. Throughput: 0: 42814.3. Samples: 15439654840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:23,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 18:32:25,607][15401] Updated weights for policy 0, policy_version 942364 (0.0031) [2024-06-25 18:32:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43146.3, 300 sec: 42931.6). Total num frames: 15439806464. Throughput: 0: 42827.1. Samples: 15439915660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:28,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 18:32:29,098][15401] Updated weights for policy 0, policy_version 942374 (0.0034) [2024-06-25 18:32:33,367][15401] Updated weights for policy 0, policy_version 942384 (0.0034) [2024-06-25 18:32:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15440019456. Throughput: 0: 42991.2. Samples: 15440178640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:33,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 18:32:36,892][15401] Updated weights for policy 0, policy_version 942394 (0.0038) [2024-06-25 18:32:38,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15440199680. Throughput: 0: 42913.7. Samples: 15440301640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:38,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 18:32:41,108][15401] Updated weights for policy 0, policy_version 942404 (0.0047) [2024-06-25 18:32:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15440445440. Throughput: 0: 42736.8. Samples: 15440553900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:43,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 18:32:44,327][15401] Updated weights for policy 0, policy_version 942414 (0.0032) [2024-06-25 18:32:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15440642048. Throughput: 0: 42865.7. Samples: 15440815440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:48,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 18:32:48,790][15401] Updated weights for policy 0, policy_version 942424 (0.0037) [2024-06-25 18:32:52,624][15401] Updated weights for policy 0, policy_version 942434 (0.0029) [2024-06-25 18:32:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15440855040. Throughput: 0: 42892.8. Samples: 15440942000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:53,390][15132] Avg episode reward: [(0, '0.250')] [2024-06-25 18:32:56,283][15401] Updated weights for policy 0, policy_version 942444 (0.0029) [2024-06-25 18:32:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15441068032. Throughput: 0: 42832.0. Samples: 15441199100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:32:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 18:33:00,018][15401] Updated weights for policy 0, policy_version 942454 (0.0043) [2024-06-25 18:33:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15441297408. Throughput: 0: 42799.1. Samples: 15441459440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:33:03,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 18:33:04,085][15401] Updated weights for policy 0, policy_version 942464 (0.0050) [2024-06-25 18:33:07,688][15401] Updated weights for policy 0, policy_version 942474 (0.0037) [2024-06-25 18:33:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15441510400. Throughput: 0: 42998.7. Samples: 15441589780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:33:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 18:33:11,705][15401] Updated weights for policy 0, policy_version 942484 (0.0022) [2024-06-25 18:33:11,714][15349] Signal inference workers to stop experience collection... (228500 times) [2024-06-25 18:33:11,720][15349] Signal inference workers to resume experience collection... (228500 times) [2024-06-25 18:33:11,737][15401] InferenceWorker_p0-w0: stopping experience collection (228500 times) [2024-06-25 18:33:11,737][15401] InferenceWorker_p0-w0: resuming experience collection (228500 times) [2024-06-25 18:33:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 15441723392. Throughput: 0: 42926.6. Samples: 15441847360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 18:33:13,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 18:33:15,213][15401] Updated weights for policy 0, policy_version 942494 (0.0028) [2024-06-25 18:33:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15441936384. Throughput: 0: 42916.0. Samples: 15442109860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:18,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 18:33:19,395][15401] Updated weights for policy 0, policy_version 942504 (0.0034) [2024-06-25 18:33:22,822][15401] Updated weights for policy 0, policy_version 942514 (0.0034) [2024-06-25 18:33:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15442165760. Throughput: 0: 42861.8. Samples: 15442230420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 18:33:27,005][15401] Updated weights for policy 0, policy_version 942524 (0.0039) [2024-06-25 18:33:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15442362368. Throughput: 0: 42968.5. Samples: 15442487480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:28,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 18:33:31,223][15401] Updated weights for policy 0, policy_version 942534 (0.0025) [2024-06-25 18:33:33,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42596.6, 300 sec: 42875.7). Total num frames: 15442575360. Throughput: 0: 42891.9. Samples: 15442745680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:33,392][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 18:33:34,566][15401] Updated weights for policy 0, policy_version 942544 (0.0024) [2024-06-25 18:33:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15442788352. Throughput: 0: 42974.6. Samples: 15442875860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 18:33:38,949][15401] Updated weights for policy 0, policy_version 942554 (0.0032) [2024-06-25 18:33:42,105][15401] Updated weights for policy 0, policy_version 942564 (0.0033) [2024-06-25 18:33:43,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15443001344. Throughput: 0: 42919.1. Samples: 15443130460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 18:33:43,505][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942567_15443017728.pth... [2024-06-25 18:33:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000941939_15432728576.pth [2024-06-25 18:33:46,678][15401] Updated weights for policy 0, policy_version 942574 (0.0040) [2024-06-25 18:33:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 15443214336. Throughput: 0: 42859.2. Samples: 15443388100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 18:33:49,650][15401] Updated weights for policy 0, policy_version 942584 (0.0025) [2024-06-25 18:33:53,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15443427328. Throughput: 0: 42880.0. Samples: 15443519380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:53,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 18:33:54,127][15401] Updated weights for policy 0, policy_version 942594 (0.0042) [2024-06-25 18:33:57,318][15401] Updated weights for policy 0, policy_version 942604 (0.0042) [2024-06-25 18:33:58,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15443656704. Throughput: 0: 42752.1. Samples: 15443771200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:33:58,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 18:34:01,777][15401] Updated weights for policy 0, policy_version 942614 (0.0031) [2024-06-25 18:34:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 15443853312. Throughput: 0: 42607.9. Samples: 15444027320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:03,392][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 18:34:05,245][15401] Updated weights for policy 0, policy_version 942624 (0.0034) [2024-06-25 18:34:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 15444082688. Throughput: 0: 42798.6. Samples: 15444156360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 18:34:09,272][15401] Updated weights for policy 0, policy_version 942634 (0.0032) [2024-06-25 18:34:12,669][15401] Updated weights for policy 0, policy_version 942644 (0.0024) [2024-06-25 18:34:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15444279296. Throughput: 0: 42737.8. Samples: 15444410680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:13,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 18:34:16,849][15401] Updated weights for policy 0, policy_version 942654 (0.0028) [2024-06-25 18:34:18,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15444475904. Throughput: 0: 42813.4. Samples: 15444672180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:34:20,131][15401] Updated weights for policy 0, policy_version 942664 (0.0045) [2024-06-25 18:34:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 15444721664. Throughput: 0: 42644.5. Samples: 15444794860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:23,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:34:25,015][15401] Updated weights for policy 0, policy_version 942674 (0.0023) [2024-06-25 18:34:26,649][15349] Signal inference workers to stop experience collection... (228550 times) [2024-06-25 18:34:26,650][15349] Signal inference workers to resume experience collection... (228550 times) [2024-06-25 18:34:26,667][15401] InferenceWorker_p0-w0: stopping experience collection (228550 times) [2024-06-25 18:34:26,667][15401] InferenceWorker_p0-w0: resuming experience collection (228550 times) [2024-06-25 18:34:28,278][15401] Updated weights for policy 0, policy_version 942684 (0.0035) [2024-06-25 18:34:28,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.7, 300 sec: 42931.3). Total num frames: 15444934656. Throughput: 0: 42655.8. Samples: 15445050080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:28,393][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 18:34:32,724][15401] Updated weights for policy 0, policy_version 942694 (0.0038) [2024-06-25 18:34:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.1, 300 sec: 42820.6). Total num frames: 15445131264. Throughput: 0: 42804.4. Samples: 15445314300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 18:34:35,853][15401] Updated weights for policy 0, policy_version 942704 (0.0033) [2024-06-25 18:34:38,390][15132] Fps is (10 sec: 42608.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 15445360640. Throughput: 0: 42532.9. Samples: 15445433360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:38,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 18:34:40,837][15401] Updated weights for policy 0, policy_version 942714 (0.0028) [2024-06-25 18:34:43,336][15401] Updated weights for policy 0, policy_version 942724 (0.0033) [2024-06-25 18:34:43,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15445590016. Throughput: 0: 42681.3. Samples: 15445691860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:43,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 18:34:48,377][15401] Updated weights for policy 0, policy_version 942734 (0.0042) [2024-06-25 18:34:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42765.4). Total num frames: 15445753856. Throughput: 0: 43025.9. Samples: 15445963380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:48,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 18:34:50,866][15401] Updated weights for policy 0, policy_version 942744 (0.0033) [2024-06-25 18:34:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15446016000. Throughput: 0: 42785.4. Samples: 15446081700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:53,391][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 18:34:55,871][15401] Updated weights for policy 0, policy_version 942754 (0.0038) [2024-06-25 18:34:58,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42876.3). Total num frames: 15446228992. Throughput: 0: 42849.3. Samples: 15446338900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:34:58,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 18:34:58,458][15401] Updated weights for policy 0, policy_version 942764 (0.0036) [2024-06-25 18:35:03,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 15446392832. Throughput: 0: 42927.6. Samples: 15446603920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:35:03,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 18:35:03,553][15401] Updated weights for policy 0, policy_version 942774 (0.0031) [2024-06-25 18:35:06,087][15401] Updated weights for policy 0, policy_version 942784 (0.0035) [2024-06-25 18:35:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 15446638592. Throughput: 0: 42853.8. Samples: 15446723280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:35:08,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 18:35:11,136][15401] Updated weights for policy 0, policy_version 942794 (0.0039) [2024-06-25 18:35:13,389][15132] Fps is (10 sec: 49151.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 15446884352. Throughput: 0: 42979.7. Samples: 15446984060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-25 18:35:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 18:35:13,694][15401] Updated weights for policy 0, policy_version 942804 (0.0030) [2024-06-25 18:35:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15447048192. Throughput: 0: 42976.5. Samples: 15447248240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:18,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 18:35:18,681][15401] Updated weights for policy 0, policy_version 942814 (0.0032) [2024-06-25 18:35:21,208][15401] Updated weights for policy 0, policy_version 942824 (0.0039) [2024-06-25 18:35:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15447293952. Throughput: 0: 43016.5. Samples: 15447369100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:23,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 18:35:26,029][15401] Updated weights for policy 0, policy_version 942834 (0.0035) [2024-06-25 18:35:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43146.3, 300 sec: 42876.7). Total num frames: 15447523328. Throughput: 0: 43182.2. Samples: 15447635060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:28,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 18:35:28,941][15401] Updated weights for policy 0, policy_version 942844 (0.0029) [2024-06-25 18:35:33,156][15349] Signal inference workers to stop experience collection... (228600 times) [2024-06-25 18:35:33,157][15349] Signal inference workers to resume experience collection... (228600 times) [2024-06-25 18:35:33,175][15401] InferenceWorker_p0-w0: stopping experience collection (228600 times) [2024-06-25 18:35:33,175][15401] InferenceWorker_p0-w0: resuming experience collection (228600 times) [2024-06-25 18:35:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15447703552. Throughput: 0: 42947.5. Samples: 15447896020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:33,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 18:35:33,693][15401] Updated weights for policy 0, policy_version 942854 (0.0043) [2024-06-25 18:35:36,439][15401] Updated weights for policy 0, policy_version 942864 (0.0037) [2024-06-25 18:35:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15447932928. Throughput: 0: 42863.7. Samples: 15448010560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:38,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 18:35:41,639][15401] Updated weights for policy 0, policy_version 942874 (0.0035) [2024-06-25 18:35:43,389][15132] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15448178688. Throughput: 0: 42977.9. Samples: 15448272900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:43,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 18:35:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942882_15448178688.pth... [2024-06-25 18:35:43,453][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942254_15437889536.pth [2024-06-25 18:35:44,267][15401] Updated weights for policy 0, policy_version 942884 (0.0033) [2024-06-25 18:35:48,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15448326144. Throughput: 0: 42898.7. Samples: 15448534360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 18:35:49,098][15401] Updated weights for policy 0, policy_version 942894 (0.0044) [2024-06-25 18:35:51,760][15401] Updated weights for policy 0, policy_version 942904 (0.0032) [2024-06-25 18:35:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15448571904. Throughput: 0: 42809.6. Samples: 15448649720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:53,394][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 18:35:56,687][15401] Updated weights for policy 0, policy_version 942914 (0.0036) [2024-06-25 18:35:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15448801280. Throughput: 0: 42880.5. Samples: 15448913680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:35:58,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 18:35:59,538][15401] Updated weights for policy 0, policy_version 942924 (0.0023) [2024-06-25 18:36:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15448981504. Throughput: 0: 42948.3. Samples: 15449180920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 18:36:04,102][15401] Updated weights for policy 0, policy_version 942934 (0.0036) [2024-06-25 18:36:07,042][15401] Updated weights for policy 0, policy_version 942944 (0.0037) [2024-06-25 18:36:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42766.0). Total num frames: 15449210880. Throughput: 0: 42812.9. Samples: 15449295680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 18:36:11,604][15401] Updated weights for policy 0, policy_version 942954 (0.0034) [2024-06-25 18:36:13,391][15132] Fps is (10 sec: 47506.4, 60 sec: 42870.3, 300 sec: 42875.9). Total num frames: 15449456640. Throughput: 0: 42823.8. Samples: 15449562200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:13,392][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 18:36:14,685][15401] Updated weights for policy 0, policy_version 942964 (0.0032) [2024-06-25 18:36:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15449636864. Throughput: 0: 42884.1. Samples: 15449825800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:18,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 18:36:19,153][15401] Updated weights for policy 0, policy_version 942974 (0.0031) [2024-06-25 18:36:22,475][15401] Updated weights for policy 0, policy_version 942984 (0.0031) [2024-06-25 18:36:23,389][15132] Fps is (10 sec: 40966.8, 60 sec: 42871.5, 300 sec: 42876.5). Total num frames: 15449866240. Throughput: 0: 43011.6. Samples: 15449946080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 18:36:26,806][15401] Updated weights for policy 0, policy_version 942994 (0.0030) [2024-06-25 18:36:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15450062848. Throughput: 0: 42931.9. Samples: 15450204840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:28,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 18:36:30,346][15401] Updated weights for policy 0, policy_version 943004 (0.0036) [2024-06-25 18:36:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15450275840. Throughput: 0: 42952.8. Samples: 15450467240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:33,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 18:36:34,339][15401] Updated weights for policy 0, policy_version 943014 (0.0034) [2024-06-25 18:36:37,882][15401] Updated weights for policy 0, policy_version 943024 (0.0041) [2024-06-25 18:36:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15450521600. Throughput: 0: 43254.2. Samples: 15450596160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:38,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 18:36:41,817][15401] Updated weights for policy 0, policy_version 943034 (0.0033) [2024-06-25 18:36:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15450701824. Throughput: 0: 43018.2. Samples: 15450849500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:43,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 18:36:43,484][15349] Signal inference workers to stop experience collection... (228650 times) [2024-06-25 18:36:43,488][15349] Signal inference workers to resume experience collection... (228650 times) [2024-06-25 18:36:43,513][15401] InferenceWorker_p0-w0: stopping experience collection (228650 times) [2024-06-25 18:36:43,513][15401] InferenceWorker_p0-w0: resuming experience collection (228650 times) [2024-06-25 18:36:45,410][15401] Updated weights for policy 0, policy_version 943044 (0.0023) [2024-06-25 18:36:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15450931200. Throughput: 0: 42731.1. Samples: 15451103820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:48,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 18:36:49,748][15401] Updated weights for policy 0, policy_version 943054 (0.0027) [2024-06-25 18:36:53,043][15401] Updated weights for policy 0, policy_version 943064 (0.0039) [2024-06-25 18:36:53,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 15451176960. Throughput: 0: 43146.2. Samples: 15451237260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 18:36:57,576][15401] Updated weights for policy 0, policy_version 943074 (0.0034) [2024-06-25 18:36:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15451357184. Throughput: 0: 42845.6. Samples: 15451490180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:36:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 18:37:00,974][15401] Updated weights for policy 0, policy_version 943084 (0.0036) [2024-06-25 18:37:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15451586560. Throughput: 0: 42574.5. Samples: 15451741660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:37:03,390][15132] Avg episode reward: [(0, '0.779')] [2024-06-25 18:37:05,121][15401] Updated weights for policy 0, policy_version 943094 (0.0042) [2024-06-25 18:37:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15451799552. Throughput: 0: 42905.2. Samples: 15451876820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 18:37:08,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 18:37:08,576][15401] Updated weights for policy 0, policy_version 943104 (0.0030) [2024-06-25 18:37:13,259][15401] Updated weights for policy 0, policy_version 943114 (0.0028) [2024-06-25 18:37:13,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42053.4, 300 sec: 42709.5). Total num frames: 15451979776. Throughput: 0: 42747.6. Samples: 15452128480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:13,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 18:37:16,275][15401] Updated weights for policy 0, policy_version 943124 (0.0036) [2024-06-25 18:37:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 15452241920. Throughput: 0: 42418.2. Samples: 15452376060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:37:20,912][15401] Updated weights for policy 0, policy_version 943134 (0.0037) [2024-06-25 18:37:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15452405760. Throughput: 0: 42661.8. Samples: 15452515940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 18:37:23,933][15401] Updated weights for policy 0, policy_version 943144 (0.0033) [2024-06-25 18:37:28,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15452618752. Throughput: 0: 42652.9. Samples: 15452768880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:28,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 18:37:28,693][15401] Updated weights for policy 0, policy_version 943154 (0.0031) [2024-06-25 18:37:31,519][15401] Updated weights for policy 0, policy_version 943164 (0.0022) [2024-06-25 18:37:33,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 15452880896. Throughput: 0: 42724.1. Samples: 15453026400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:33,390][15132] Avg episode reward: [(0, '0.861')] [2024-06-25 18:37:36,136][15401] Updated weights for policy 0, policy_version 943174 (0.0033) [2024-06-25 18:37:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15453077504. Throughput: 0: 42884.4. Samples: 15453167060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 18:37:39,190][15401] Updated weights for policy 0, policy_version 943184 (0.0043) [2024-06-25 18:37:43,390][15132] Fps is (10 sec: 37682.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15453257728. Throughput: 0: 42747.4. Samples: 15453413820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 18:37:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943192_15453257728.pth... [2024-06-25 18:37:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942567_15443017728.pth [2024-06-25 18:37:43,777][15401] Updated weights for policy 0, policy_version 943194 (0.0030) [2024-06-25 18:37:46,275][15349] Signal inference workers to stop experience collection... (228700 times) [2024-06-25 18:37:46,276][15349] Signal inference workers to resume experience collection... (228700 times) [2024-06-25 18:37:46,314][15401] InferenceWorker_p0-w0: stopping experience collection (228700 times) [2024-06-25 18:37:46,314][15401] InferenceWorker_p0-w0: resuming experience collection (228700 times) [2024-06-25 18:37:46,887][15401] Updated weights for policy 0, policy_version 943204 (0.0028) [2024-06-25 18:37:48,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43415.9, 300 sec: 42986.8). Total num frames: 15453536256. Throughput: 0: 42808.9. Samples: 15453668160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:48,393][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 18:37:51,265][15401] Updated weights for policy 0, policy_version 943214 (0.0042) [2024-06-25 18:37:53,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 15453700096. Throughput: 0: 42904.0. Samples: 15453807500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:53,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 18:37:54,631][15401] Updated weights for policy 0, policy_version 943224 (0.0034) [2024-06-25 18:37:58,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15453913088. Throughput: 0: 42972.1. Samples: 15454062220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:37:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 18:37:58,675][15401] Updated weights for policy 0, policy_version 943234 (0.0040) [2024-06-25 18:38:02,182][15401] Updated weights for policy 0, policy_version 943244 (0.0032) [2024-06-25 18:38:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15454175232. Throughput: 0: 43034.1. Samples: 15454312600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:03,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 18:38:06,148][15401] Updated weights for policy 0, policy_version 943254 (0.0040) [2024-06-25 18:38:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15454339072. Throughput: 0: 43078.7. Samples: 15454454480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:08,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 18:38:09,634][15401] Updated weights for policy 0, policy_version 943264 (0.0034) [2024-06-25 18:38:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15454568448. Throughput: 0: 43039.6. Samples: 15454705660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:13,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 18:38:14,045][15401] Updated weights for policy 0, policy_version 943274 (0.0033) [2024-06-25 18:38:17,239][15401] Updated weights for policy 0, policy_version 943284 (0.0028) [2024-06-25 18:38:18,390][15132] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15454830592. Throughput: 0: 42995.0. Samples: 15454961180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:18,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 18:38:21,679][15401] Updated weights for policy 0, policy_version 943294 (0.0023) [2024-06-25 18:38:23,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15454978048. Throughput: 0: 42907.6. Samples: 15455097900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:23,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 18:38:24,864][15401] Updated weights for policy 0, policy_version 943304 (0.0035) [2024-06-25 18:38:28,396][15132] Fps is (10 sec: 39296.7, 60 sec: 43413.0, 300 sec: 42875.5). Total num frames: 15455223808. Throughput: 0: 42865.6. Samples: 15455343040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:28,396][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 18:38:29,160][15401] Updated weights for policy 0, policy_version 943314 (0.0037) [2024-06-25 18:38:32,932][15401] Updated weights for policy 0, policy_version 943324 (0.0029) [2024-06-25 18:38:33,390][15132] Fps is (10 sec: 49151.5, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 15455469568. Throughput: 0: 43015.1. Samples: 15455603740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:33,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 18:38:36,933][15401] Updated weights for policy 0, policy_version 943334 (0.0036) [2024-06-25 18:38:38,389][15132] Fps is (10 sec: 42625.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15455649792. Throughput: 0: 42871.6. Samples: 15455736720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 18:38:38,972][15349] Signal inference workers to stop experience collection... (228750 times) [2024-06-25 18:38:38,973][15349] Signal inference workers to resume experience collection... (228750 times) [2024-06-25 18:38:38,995][15401] InferenceWorker_p0-w0: stopping experience collection (228750 times) [2024-06-25 18:38:39,028][15401] InferenceWorker_p0-w0: resuming experience collection (228750 times) [2024-06-25 18:38:40,537][15401] Updated weights for policy 0, policy_version 943344 (0.0035) [2024-06-25 18:38:43,389][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 15455862784. Throughput: 0: 42677.7. Samples: 15455982720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:43,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 18:38:44,378][15401] Updated weights for policy 0, policy_version 943354 (0.0028) [2024-06-25 18:38:48,206][15401] Updated weights for policy 0, policy_version 943364 (0.0036) [2024-06-25 18:38:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 15456075776. Throughput: 0: 43095.2. Samples: 15456251880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 18:38:51,941][15401] Updated weights for policy 0, policy_version 943374 (0.0041) [2024-06-25 18:38:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15456288768. Throughput: 0: 42835.9. Samples: 15456382100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 18:38:56,046][15401] Updated weights for policy 0, policy_version 943384 (0.0033) [2024-06-25 18:38:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42932.0). Total num frames: 15456518144. Throughput: 0: 42708.8. Samples: 15456627560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:38:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 18:38:59,587][15401] Updated weights for policy 0, policy_version 943394 (0.0028) [2024-06-25 18:39:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 15456714752. Throughput: 0: 42833.3. Samples: 15456888780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:39:03,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 18:39:03,660][15401] Updated weights for policy 0, policy_version 943404 (0.0031) [2024-06-25 18:39:07,080][15401] Updated weights for policy 0, policy_version 943414 (0.0032) [2024-06-25 18:39:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15456927744. Throughput: 0: 42532.9. Samples: 15457011880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-25 18:39:08,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 18:39:11,259][15401] Updated weights for policy 0, policy_version 943424 (0.0037) [2024-06-25 18:39:13,390][15132] Fps is (10 sec: 44247.1, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 15457157120. Throughput: 0: 42935.8. Samples: 15457274880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:13,390][15132] Avg episode reward: [(0, '0.174')] [2024-06-25 18:39:14,763][15401] Updated weights for policy 0, policy_version 943434 (0.0031) [2024-06-25 18:39:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 15457337344. Throughput: 0: 42734.7. Samples: 15457526800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 18:39:18,837][15401] Updated weights for policy 0, policy_version 943444 (0.0044) [2024-06-25 18:39:22,783][15401] Updated weights for policy 0, policy_version 943454 (0.0033) [2024-06-25 18:39:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 15457583104. Throughput: 0: 42521.3. Samples: 15457650180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:23,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 18:39:26,980][15401] Updated weights for policy 0, policy_version 943464 (0.0041) [2024-06-25 18:39:28,393][15132] Fps is (10 sec: 45857.0, 60 sec: 42873.2, 300 sec: 42931.0). Total num frames: 15457796096. Throughput: 0: 42794.4. Samples: 15457908640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:28,394][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 18:39:30,444][15401] Updated weights for policy 0, policy_version 943474 (0.0038) [2024-06-25 18:39:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 15457976320. Throughput: 0: 42560.9. Samples: 15458167120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:33,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 18:39:34,382][15401] Updated weights for policy 0, policy_version 943484 (0.0040) [2024-06-25 18:39:37,974][15401] Updated weights for policy 0, policy_version 943494 (0.0036) [2024-06-25 18:39:38,390][15132] Fps is (10 sec: 42615.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15458222080. Throughput: 0: 42457.3. Samples: 15458292680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 18:39:41,940][15401] Updated weights for policy 0, policy_version 943504 (0.0036) [2024-06-25 18:39:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 15458451456. Throughput: 0: 42750.2. Samples: 15458551320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:43,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 18:39:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943509_15458451456.pth... [2024-06-25 18:39:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000942882_15448178688.pth [2024-06-25 18:39:45,570][15401] Updated weights for policy 0, policy_version 943514 (0.0034) [2024-06-25 18:39:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15458631680. Throughput: 0: 42635.5. Samples: 15458807280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:48,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 18:39:49,740][15401] Updated weights for policy 0, policy_version 943524 (0.0039) [2024-06-25 18:39:53,297][15401] Updated weights for policy 0, policy_version 943534 (0.0025) [2024-06-25 18:39:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15458861056. Throughput: 0: 42658.6. Samples: 15458931520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:53,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 18:39:57,479][15401] Updated weights for policy 0, policy_version 943544 (0.0024) [2024-06-25 18:39:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 15459057664. Throughput: 0: 42432.1. Samples: 15459184320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:39:58,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 18:40:00,026][15349] Signal inference workers to stop experience collection... (228800 times) [2024-06-25 18:40:00,073][15401] InferenceWorker_p0-w0: stopping experience collection (228800 times) [2024-06-25 18:40:00,080][15349] Signal inference workers to resume experience collection... (228800 times) [2024-06-25 18:40:00,088][15401] InferenceWorker_p0-w0: resuming experience collection (228800 times) [2024-06-25 18:40:01,155][15401] Updated weights for policy 0, policy_version 943554 (0.0026) [2024-06-25 18:40:03,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 15459254272. Throughput: 0: 42665.4. Samples: 15459446740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:03,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 18:40:04,891][15401] Updated weights for policy 0, policy_version 943564 (0.0042) [2024-06-25 18:40:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15459500032. Throughput: 0: 42787.1. Samples: 15459575600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 18:40:08,583][15401] Updated weights for policy 0, policy_version 943574 (0.0034) [2024-06-25 18:40:12,681][15401] Updated weights for policy 0, policy_version 943584 (0.0037) [2024-06-25 18:40:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15459696640. Throughput: 0: 42664.7. Samples: 15459828380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:13,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 18:40:16,096][15401] Updated weights for policy 0, policy_version 943594 (0.0055) [2024-06-25 18:40:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15459909632. Throughput: 0: 42686.7. Samples: 15460088020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 18:40:20,117][15401] Updated weights for policy 0, policy_version 943604 (0.0036) [2024-06-25 18:40:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15460139008. Throughput: 0: 42783.7. Samples: 15460217940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 18:40:24,169][15401] Updated weights for policy 0, policy_version 943614 (0.0043) [2024-06-25 18:40:27,516][15401] Updated weights for policy 0, policy_version 943624 (0.0031) [2024-06-25 18:40:28,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42326.5, 300 sec: 42820.2). Total num frames: 15460335616. Throughput: 0: 42792.5. Samples: 15460477080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:28,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 18:40:31,530][15401] Updated weights for policy 0, policy_version 943634 (0.0028) [2024-06-25 18:40:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15460564992. Throughput: 0: 42741.0. Samples: 15460730620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:33,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 18:40:35,951][15401] Updated weights for policy 0, policy_version 943644 (0.0034) [2024-06-25 18:40:38,389][15132] Fps is (10 sec: 45886.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15460794368. Throughput: 0: 42897.8. Samples: 15460861920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:38,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 18:40:39,228][15401] Updated weights for policy 0, policy_version 943654 (0.0032) [2024-06-25 18:40:43,369][15401] Updated weights for policy 0, policy_version 943664 (0.0038) [2024-06-25 18:40:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 15460990976. Throughput: 0: 42978.2. Samples: 15461118340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:43,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 18:40:46,978][15401] Updated weights for policy 0, policy_version 943674 (0.0032) [2024-06-25 18:40:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 15461220352. Throughput: 0: 42682.1. Samples: 15461367540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:48,392][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 18:40:50,789][15401] Updated weights for policy 0, policy_version 943684 (0.0033) [2024-06-25 18:40:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15461416960. Throughput: 0: 42804.1. Samples: 15461501780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:53,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 18:40:54,592][15401] Updated weights for policy 0, policy_version 943694 (0.0041) [2024-06-25 18:40:58,220][15401] Updated weights for policy 0, policy_version 943704 (0.0021) [2024-06-25 18:40:58,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15461646336. Throughput: 0: 42920.0. Samples: 15461759780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:40:58,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 18:41:02,161][15401] Updated weights for policy 0, policy_version 943714 (0.0038) [2024-06-25 18:41:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15461859328. Throughput: 0: 42757.2. Samples: 15462012100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-25 18:41:03,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 18:41:05,706][15401] Updated weights for policy 0, policy_version 943724 (0.0038) [2024-06-25 18:41:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 15462055936. Throughput: 0: 42761.3. Samples: 15462142200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:08,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 18:41:09,732][15401] Updated weights for policy 0, policy_version 943734 (0.0029) [2024-06-25 18:41:13,249][15401] Updated weights for policy 0, policy_version 943744 (0.0039) [2024-06-25 18:41:13,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15462301696. Throughput: 0: 42781.8. Samples: 15462402160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:13,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 18:41:14,818][15349] Signal inference workers to stop experience collection... (228850 times) [2024-06-25 18:41:14,818][15349] Signal inference workers to resume experience collection... (228850 times) [2024-06-25 18:41:14,862][15401] InferenceWorker_p0-w0: stopping experience collection (228850 times) [2024-06-25 18:41:14,863][15401] InferenceWorker_p0-w0: resuming experience collection (228850 times) [2024-06-25 18:41:17,688][15401] Updated weights for policy 0, policy_version 943754 (0.0028) [2024-06-25 18:41:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15462498304. Throughput: 0: 42753.6. Samples: 15462654540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 18:41:21,166][15401] Updated weights for policy 0, policy_version 943764 (0.0029) [2024-06-25 18:41:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15462694912. Throughput: 0: 42607.4. Samples: 15462779260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:23,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 18:41:25,331][15401] Updated weights for policy 0, policy_version 943774 (0.0030) [2024-06-25 18:41:28,389][15132] Fps is (10 sec: 44237.8, 60 sec: 43419.4, 300 sec: 42931.7). Total num frames: 15462940672. Throughput: 0: 42678.7. Samples: 15463038880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:28,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 18:41:28,806][15401] Updated weights for policy 0, policy_version 943784 (0.0034) [2024-06-25 18:41:33,181][15401] Updated weights for policy 0, policy_version 943794 (0.0038) [2024-06-25 18:41:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15463137280. Throughput: 0: 42887.6. Samples: 15463297380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:33,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 18:41:36,560][15401] Updated weights for policy 0, policy_version 943804 (0.0027) [2024-06-25 18:41:38,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15463333888. Throughput: 0: 42647.1. Samples: 15463420900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:38,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 18:41:40,899][15401] Updated weights for policy 0, policy_version 943814 (0.0030) [2024-06-25 18:41:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15463563264. Throughput: 0: 42596.4. Samples: 15463676620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:43,390][15132] Avg episode reward: [(0, '0.192')] [2024-06-25 18:41:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943821_15463563264.pth... [2024-06-25 18:41:43,454][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943192_15453257728.pth [2024-06-25 18:41:44,132][15401] Updated weights for policy 0, policy_version 943824 (0.0020) [2024-06-25 18:41:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 15463759872. Throughput: 0: 42823.5. Samples: 15463939160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:48,390][15132] Avg episode reward: [(0, '0.082')] [2024-06-25 18:41:48,685][15401] Updated weights for policy 0, policy_version 943834 (0.0030) [2024-06-25 18:41:51,697][15401] Updated weights for policy 0, policy_version 943844 (0.0045) [2024-06-25 18:41:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15463972864. Throughput: 0: 42819.9. Samples: 15464069100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:53,390][15132] Avg episode reward: [(0, '0.082')] [2024-06-25 18:41:56,230][15401] Updated weights for policy 0, policy_version 943854 (0.0046) [2024-06-25 18:41:58,389][15132] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15464218624. Throughput: 0: 42759.7. Samples: 15464326340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:41:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 18:41:59,389][15401] Updated weights for policy 0, policy_version 943864 (0.0032) [2024-06-25 18:42:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15464382464. Throughput: 0: 42928.0. Samples: 15464586300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:03,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 18:42:04,072][15401] Updated weights for policy 0, policy_version 943874 (0.0044) [2024-06-25 18:42:07,173][15401] Updated weights for policy 0, policy_version 943884 (0.0038) [2024-06-25 18:42:08,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15464644608. Throughput: 0: 42864.9. Samples: 15464708180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:08,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 18:42:11,774][15401] Updated weights for policy 0, policy_version 943894 (0.0023) [2024-06-25 18:42:13,390][15132] Fps is (10 sec: 49151.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15464873984. Throughput: 0: 43024.7. Samples: 15464975000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:13,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 18:42:14,696][15401] Updated weights for policy 0, policy_version 943904 (0.0039) [2024-06-25 18:42:18,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 15465037824. Throughput: 0: 42954.2. Samples: 15465230320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:18,396][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 18:42:19,360][15349] Signal inference workers to stop experience collection... (228900 times) [2024-06-25 18:42:19,422][15401] InferenceWorker_p0-w0: stopping experience collection (228900 times) [2024-06-25 18:42:19,478][15349] Signal inference workers to resume experience collection... (228900 times) [2024-06-25 18:42:19,479][15401] InferenceWorker_p0-w0: resuming experience collection (228900 times) [2024-06-25 18:42:19,482][15401] Updated weights for policy 0, policy_version 943914 (0.0047) [2024-06-25 18:42:22,242][15401] Updated weights for policy 0, policy_version 943924 (0.0037) [2024-06-25 18:42:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15465283584. Throughput: 0: 42807.9. Samples: 15465347260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:23,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 18:42:27,042][15401] Updated weights for policy 0, policy_version 943934 (0.0025) [2024-06-25 18:42:28,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15465496576. Throughput: 0: 43144.0. Samples: 15465618100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:28,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 18:42:29,762][15401] Updated weights for policy 0, policy_version 943944 (0.0039) [2024-06-25 18:42:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15465676800. Throughput: 0: 42940.0. Samples: 15465871460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:33,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 18:42:34,668][15401] Updated weights for policy 0, policy_version 943954 (0.0035) [2024-06-25 18:42:37,616][15401] Updated weights for policy 0, policy_version 943964 (0.0035) [2024-06-25 18:42:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 15465922560. Throughput: 0: 42671.2. Samples: 15465989300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:38,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 18:42:42,257][15401] Updated weights for policy 0, policy_version 943974 (0.0036) [2024-06-25 18:42:43,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15466135552. Throughput: 0: 43007.4. Samples: 15466261680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:43,392][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 18:42:45,472][15401] Updated weights for policy 0, policy_version 943984 (0.0031) [2024-06-25 18:42:48,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15466332160. Throughput: 0: 42804.5. Samples: 15466512500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:48,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 18:42:49,919][15401] Updated weights for policy 0, policy_version 943994 (0.0030) [2024-06-25 18:42:53,089][15401] Updated weights for policy 0, policy_version 944004 (0.0029) [2024-06-25 18:42:53,395][15132] Fps is (10 sec: 44214.4, 60 sec: 43414.0, 300 sec: 42930.9). Total num frames: 15466577920. Throughput: 0: 42957.4. Samples: 15466641480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:53,395][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 18:42:57,499][15401] Updated weights for policy 0, policy_version 944014 (0.0034) [2024-06-25 18:42:58,394][15132] Fps is (10 sec: 44217.2, 60 sec: 42595.2, 300 sec: 42708.9). Total num frames: 15466774528. Throughput: 0: 42854.1. Samples: 15466903620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-25 18:42:58,394][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 18:43:00,914][15401] Updated weights for policy 0, policy_version 944024 (0.0028) [2024-06-25 18:43:03,390][15132] Fps is (10 sec: 40980.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15466987520. Throughput: 0: 42806.2. Samples: 15467156600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 18:43:05,231][15401] Updated weights for policy 0, policy_version 944034 (0.0033) [2024-06-25 18:43:08,376][15401] Updated weights for policy 0, policy_version 944044 (0.0049) [2024-06-25 18:43:08,390][15132] Fps is (10 sec: 44256.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15467216896. Throughput: 0: 43068.0. Samples: 15467285320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:08,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 18:43:12,805][15401] Updated weights for policy 0, policy_version 944054 (0.0030) [2024-06-25 18:43:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15467397120. Throughput: 0: 42785.8. Samples: 15467543460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 18:43:15,987][15401] Updated weights for policy 0, policy_version 944064 (0.0033) [2024-06-25 18:43:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15467642880. Throughput: 0: 42675.1. Samples: 15467791840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 18:43:20,822][15401] Updated weights for policy 0, policy_version 944074 (0.0032) [2024-06-25 18:43:23,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 15467855872. Throughput: 0: 43024.5. Samples: 15467925400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 18:43:23,607][15401] Updated weights for policy 0, policy_version 944084 (0.0027) [2024-06-25 18:43:28,335][15401] Updated weights for policy 0, policy_version 944094 (0.0030) [2024-06-25 18:43:28,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15468036096. Throughput: 0: 42713.3. Samples: 15468183780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:28,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 18:43:31,159][15401] Updated weights for policy 0, policy_version 944104 (0.0031) [2024-06-25 18:43:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15468281856. Throughput: 0: 42641.8. Samples: 15468431380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 18:43:35,904][15401] Updated weights for policy 0, policy_version 944114 (0.0031) [2024-06-25 18:43:38,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15468478464. Throughput: 0: 42755.9. Samples: 15468565280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:38,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 18:43:38,590][15349] Signal inference workers to stop experience collection... (228950 times) [2024-06-25 18:43:38,641][15401] InferenceWorker_p0-w0: stopping experience collection (228950 times) [2024-06-25 18:43:38,646][15349] Signal inference workers to resume experience collection... (228950 times) [2024-06-25 18:43:38,661][15401] InferenceWorker_p0-w0: resuming experience collection (228950 times) [2024-06-25 18:43:38,787][15401] Updated weights for policy 0, policy_version 944124 (0.0024) [2024-06-25 18:43:43,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15468675072. Throughput: 0: 42661.1. Samples: 15468823180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:43,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 18:43:43,557][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944134_15468691456.pth... [2024-06-25 18:43:43,568][15401] Updated weights for policy 0, policy_version 944134 (0.0051) [2024-06-25 18:43:43,622][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943509_15458451456.pth [2024-06-25 18:43:46,647][15401] Updated weights for policy 0, policy_version 944144 (0.0041) [2024-06-25 18:43:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15468920832. Throughput: 0: 42552.5. Samples: 15469071460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 18:43:51,110][15401] Updated weights for policy 0, policy_version 944154 (0.0026) [2024-06-25 18:43:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42601.9, 300 sec: 42765.0). Total num frames: 15469133824. Throughput: 0: 42654.6. Samples: 15469204780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:53,390][15132] Avg episode reward: [(0, '0.309')] [2024-06-25 18:43:54,315][15401] Updated weights for policy 0, policy_version 944164 (0.0026) [2024-06-25 18:43:58,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42328.5, 300 sec: 42709.8). Total num frames: 15469314048. Throughput: 0: 42689.8. Samples: 15469464500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:43:58,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 18:43:58,643][15401] Updated weights for policy 0, policy_version 944174 (0.0037) [2024-06-25 18:44:02,154][15401] Updated weights for policy 0, policy_version 944184 (0.0029) [2024-06-25 18:44:03,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15469559808. Throughput: 0: 42849.3. Samples: 15469720060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:03,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 18:44:06,068][15401] Updated weights for policy 0, policy_version 944194 (0.0043) [2024-06-25 18:44:08,390][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15469789184. Throughput: 0: 42815.9. Samples: 15469852120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 18:44:09,611][15401] Updated weights for policy 0, policy_version 944204 (0.0026) [2024-06-25 18:44:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15469985792. Throughput: 0: 42904.9. Samples: 15470114500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:13,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 18:44:13,655][15401] Updated weights for policy 0, policy_version 944214 (0.0030) [2024-06-25 18:44:17,157][15401] Updated weights for policy 0, policy_version 944224 (0.0030) [2024-06-25 18:44:18,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 15470198784. Throughput: 0: 43048.3. Samples: 15470368660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:18,392][15132] Avg episode reward: [(0, '0.264')] [2024-06-25 18:44:21,107][15401] Updated weights for policy 0, policy_version 944234 (0.0031) [2024-06-25 18:44:23,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 15470428160. Throughput: 0: 42892.5. Samples: 15470495440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:23,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 18:44:24,829][15401] Updated weights for policy 0, policy_version 944244 (0.0029) [2024-06-25 18:44:28,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15470624768. Throughput: 0: 42980.9. Samples: 15470757320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:28,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 18:44:29,036][15401] Updated weights for policy 0, policy_version 944254 (0.0034) [2024-06-25 18:44:32,443][15401] Updated weights for policy 0, policy_version 944264 (0.0050) [2024-06-25 18:44:33,390][15132] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 15470854144. Throughput: 0: 42967.0. Samples: 15471004980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 18:44:36,544][15401] Updated weights for policy 0, policy_version 944274 (0.0038) [2024-06-25 18:44:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15471083520. Throughput: 0: 42972.5. Samples: 15471138540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:38,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 18:44:39,899][15401] Updated weights for policy 0, policy_version 944284 (0.0035) [2024-06-25 18:44:43,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43415.8, 300 sec: 42875.8). Total num frames: 15471280128. Throughput: 0: 42976.8. Samples: 15471398560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:43,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 18:44:44,237][15401] Updated weights for policy 0, policy_version 944294 (0.0030) [2024-06-25 18:44:47,370][15401] Updated weights for policy 0, policy_version 944304 (0.0028) [2024-06-25 18:44:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15471493120. Throughput: 0: 42907.6. Samples: 15471650900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:48,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 18:44:51,795][15401] Updated weights for policy 0, policy_version 944314 (0.0023) [2024-06-25 18:44:53,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15471706112. Throughput: 0: 43009.7. Samples: 15471787560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-25 18:44:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 18:44:53,806][15349] Signal inference workers to stop experience collection... (229000 times) [2024-06-25 18:44:53,807][15349] Signal inference workers to resume experience collection... (229000 times) [2024-06-25 18:44:53,850][15401] InferenceWorker_p0-w0: stopping experience collection (229000 times) [2024-06-25 18:44:53,850][15401] InferenceWorker_p0-w0: resuming experience collection (229000 times) [2024-06-25 18:44:55,296][15401] Updated weights for policy 0, policy_version 944324 (0.0031) [2024-06-25 18:44:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 15471919104. Throughput: 0: 42960.4. Samples: 15472047720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:44:58,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 18:44:59,463][15401] Updated weights for policy 0, policy_version 944334 (0.0026) [2024-06-25 18:45:02,977][15401] Updated weights for policy 0, policy_version 944344 (0.0035) [2024-06-25 18:45:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15472148480. Throughput: 0: 42877.8. Samples: 15472298060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 18:45:07,084][15401] Updated weights for policy 0, policy_version 944354 (0.0039) [2024-06-25 18:45:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15472345088. Throughput: 0: 43008.8. Samples: 15472430840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:08,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 18:45:10,493][15401] Updated weights for policy 0, policy_version 944364 (0.0032) [2024-06-25 18:45:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15472558080. Throughput: 0: 42903.9. Samples: 15472688000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:13,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 18:45:14,844][15401] Updated weights for policy 0, policy_version 944374 (0.0050) [2024-06-25 18:45:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 15472771072. Throughput: 0: 43002.8. Samples: 15472940100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 18:45:18,634][15401] Updated weights for policy 0, policy_version 944384 (0.0035) [2024-06-25 18:45:22,642][15401] Updated weights for policy 0, policy_version 944394 (0.0032) [2024-06-25 18:45:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42876.5). Total num frames: 15472984064. Throughput: 0: 43005.0. Samples: 15473073760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:23,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 18:45:26,043][15401] Updated weights for policy 0, policy_version 944404 (0.0031) [2024-06-25 18:45:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15473180672. Throughput: 0: 43027.3. Samples: 15473334680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 18:45:30,119][15401] Updated weights for policy 0, policy_version 944414 (0.0033) [2024-06-25 18:45:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15473426432. Throughput: 0: 43017.9. Samples: 15473586700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:33,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 18:45:33,606][15401] Updated weights for policy 0, policy_version 944424 (0.0028) [2024-06-25 18:45:37,575][15401] Updated weights for policy 0, policy_version 944434 (0.0034) [2024-06-25 18:45:38,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15473639424. Throughput: 0: 42952.1. Samples: 15473720400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:38,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 18:45:41,219][15401] Updated weights for policy 0, policy_version 944444 (0.0039) [2024-06-25 18:45:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42765.4). Total num frames: 15473836032. Throughput: 0: 42921.8. Samples: 15473979200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:43,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 18:45:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944448_15473836032.pth... [2024-06-25 18:45:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000943821_15463563264.pth [2024-06-25 18:45:45,101][15401] Updated weights for policy 0, policy_version 944454 (0.0026) [2024-06-25 18:45:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15474081792. Throughput: 0: 42856.0. Samples: 15474226580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 18:45:48,777][15401] Updated weights for policy 0, policy_version 944464 (0.0035) [2024-06-25 18:45:52,705][15401] Updated weights for policy 0, policy_version 944474 (0.0037) [2024-06-25 18:45:53,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15474311168. Throughput: 0: 43028.8. Samples: 15474367140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:53,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 18:45:56,345][15401] Updated weights for policy 0, policy_version 944484 (0.0037) [2024-06-25 18:45:58,392][15132] Fps is (10 sec: 39312.4, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 15474475008. Throughput: 0: 43020.9. Samples: 15474624040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:45:58,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 18:46:00,137][15401] Updated weights for policy 0, policy_version 944494 (0.0035) [2024-06-25 18:46:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 15474737152. Throughput: 0: 43061.0. Samples: 15474877840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:03,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 18:46:04,582][15401] Updated weights for policy 0, policy_version 944504 (0.0038) [2024-06-25 18:46:07,625][15401] Updated weights for policy 0, policy_version 944514 (0.0036) [2024-06-25 18:46:08,389][15132] Fps is (10 sec: 47525.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15474950144. Throughput: 0: 43065.8. Samples: 15475011720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 18:46:12,090][15401] Updated weights for policy 0, policy_version 944524 (0.0039) [2024-06-25 18:46:13,389][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15475113984. Throughput: 0: 43023.1. Samples: 15475270720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:13,390][15132] Avg episode reward: [(0, '0.816')] [2024-06-25 18:46:15,318][15401] Updated weights for policy 0, policy_version 944534 (0.0029) [2024-06-25 18:46:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15475359744. Throughput: 0: 43012.8. Samples: 15475522280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:18,392][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 18:46:19,649][15401] Updated weights for policy 0, policy_version 944544 (0.0040) [2024-06-25 18:46:23,166][15401] Updated weights for policy 0, policy_version 944554 (0.0044) [2024-06-25 18:46:23,390][15132] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15475572736. Throughput: 0: 42951.0. Samples: 15475653200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:23,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 18:46:27,229][15401] Updated weights for policy 0, policy_version 944564 (0.0029) [2024-06-25 18:46:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15475752960. Throughput: 0: 42896.1. Samples: 15475909520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:28,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 18:46:30,774][15401] Updated weights for policy 0, policy_version 944574 (0.0036) [2024-06-25 18:46:33,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15476015104. Throughput: 0: 43004.4. Samples: 15476161780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 18:46:34,867][15401] Updated weights for policy 0, policy_version 944584 (0.0037) [2024-06-25 18:46:38,298][15401] Updated weights for policy 0, policy_version 944594 (0.0028) [2024-06-25 18:46:38,392][15132] Fps is (10 sec: 47502.1, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 15476228096. Throughput: 0: 42909.3. Samples: 15476298160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:38,392][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 18:46:39,571][15349] Signal inference workers to stop experience collection... (229050 times) [2024-06-25 18:46:39,601][15401] InferenceWorker_p0-w0: stopping experience collection (229050 times) [2024-06-25 18:46:39,622][15349] Signal inference workers to resume experience collection... (229050 times) [2024-06-25 18:46:39,622][15401] InferenceWorker_p0-w0: resuming experience collection (229050 times) [2024-06-25 18:46:42,360][15401] Updated weights for policy 0, policy_version 944604 (0.0042) [2024-06-25 18:46:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15476408320. Throughput: 0: 42889.5. Samples: 15476553960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 18:46:46,283][15401] Updated weights for policy 0, policy_version 944614 (0.0036) [2024-06-25 18:46:48,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15476654080. Throughput: 0: 42833.2. Samples: 15476805340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 18:46:49,931][15401] Updated weights for policy 0, policy_version 944624 (0.0038) [2024-06-25 18:46:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15476867072. Throughput: 0: 42885.2. Samples: 15476941560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 18:46:53,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 18:46:53,929][15401] Updated weights for policy 0, policy_version 944634 (0.0040) [2024-06-25 18:46:57,470][15401] Updated weights for policy 0, policy_version 944644 (0.0036) [2024-06-25 18:46:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 15477047296. Throughput: 0: 42787.5. Samples: 15477196160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:46:58,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 18:47:01,500][15401] Updated weights for policy 0, policy_version 944654 (0.0035) [2024-06-25 18:47:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15477293056. Throughput: 0: 42756.1. Samples: 15477446300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:03,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 18:47:05,416][15401] Updated weights for policy 0, policy_version 944664 (0.0038) [2024-06-25 18:47:08,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15477489664. Throughput: 0: 42669.5. Samples: 15477573320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:08,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 18:47:09,078][15401] Updated weights for policy 0, policy_version 944674 (0.0033) [2024-06-25 18:47:12,949][15401] Updated weights for policy 0, policy_version 944684 (0.0034) [2024-06-25 18:47:13,390][15132] Fps is (10 sec: 42597.6, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15477719040. Throughput: 0: 42642.6. Samples: 15477828440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:13,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 18:47:17,366][15401] Updated weights for policy 0, policy_version 944694 (0.0040) [2024-06-25 18:47:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 15477948416. Throughput: 0: 42737.9. Samples: 15478084980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:18,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 18:47:20,502][15401] Updated weights for policy 0, policy_version 944704 (0.0036) [2024-06-25 18:47:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15478145024. Throughput: 0: 42711.1. Samples: 15478220060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 18:47:24,850][15401] Updated weights for policy 0, policy_version 944714 (0.0037) [2024-06-25 18:47:28,287][15401] Updated weights for policy 0, policy_version 944724 (0.0028) [2024-06-25 18:47:28,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 15478358016. Throughput: 0: 42803.1. Samples: 15478480100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:28,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 18:47:32,326][15401] Updated weights for policy 0, policy_version 944734 (0.0033) [2024-06-25 18:47:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15478587392. Throughput: 0: 42853.4. Samples: 15478733740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:33,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 18:47:35,751][15401] Updated weights for policy 0, policy_version 944744 (0.0035) [2024-06-25 18:47:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42876.1). Total num frames: 15478784000. Throughput: 0: 42775.7. Samples: 15478866460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:38,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 18:47:39,712][15401] Updated weights for policy 0, policy_version 944754 (0.0036) [2024-06-25 18:47:43,225][15401] Updated weights for policy 0, policy_version 944764 (0.0037) [2024-06-25 18:47:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15479013376. Throughput: 0: 42950.1. Samples: 15479128920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:43,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 18:47:43,510][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944765_15479029760.pth... [2024-06-25 18:47:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944134_15468691456.pth [2024-06-25 18:47:47,419][15401] Updated weights for policy 0, policy_version 944774 (0.0037) [2024-06-25 18:47:48,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42821.3). Total num frames: 15479209984. Throughput: 0: 43019.5. Samples: 15479382180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:48,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 18:47:50,699][15401] Updated weights for policy 0, policy_version 944784 (0.0030) [2024-06-25 18:47:53,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.7). Total num frames: 15479422976. Throughput: 0: 43015.9. Samples: 15479509040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:53,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 18:47:55,227][15401] Updated weights for policy 0, policy_version 944794 (0.0029) [2024-06-25 18:47:58,245][15401] Updated weights for policy 0, policy_version 944804 (0.0037) [2024-06-25 18:47:58,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 15479668736. Throughput: 0: 43204.1. Samples: 15479772620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:47:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 18:48:00,863][15349] Signal inference workers to stop experience collection... (229100 times) [2024-06-25 18:48:00,864][15349] Signal inference workers to resume experience collection... (229100 times) [2024-06-25 18:48:00,876][15401] InferenceWorker_p0-w0: stopping experience collection (229100 times) [2024-06-25 18:48:00,887][15401] InferenceWorker_p0-w0: resuming experience collection (229100 times) [2024-06-25 18:48:02,750][15401] Updated weights for policy 0, policy_version 944814 (0.0046) [2024-06-25 18:48:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15479865344. Throughput: 0: 43204.3. Samples: 15480029180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:03,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 18:48:05,948][15401] Updated weights for policy 0, policy_version 944824 (0.0040) [2024-06-25 18:48:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15480078336. Throughput: 0: 42997.4. Samples: 15480154940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:08,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 18:48:10,320][15401] Updated weights for policy 0, policy_version 944834 (0.0044) [2024-06-25 18:48:13,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15480291328. Throughput: 0: 43049.3. Samples: 15480417320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:13,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 18:48:13,741][15401] Updated weights for policy 0, policy_version 944844 (0.0029) [2024-06-25 18:48:18,166][15401] Updated weights for policy 0, policy_version 944854 (0.0047) [2024-06-25 18:48:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15480504320. Throughput: 0: 43087.2. Samples: 15480672660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:18,390][15132] Avg episode reward: [(0, '0.138')] [2024-06-25 18:48:21,511][15401] Updated weights for policy 0, policy_version 944864 (0.0036) [2024-06-25 18:48:23,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 15480717312. Throughput: 0: 42917.7. Samples: 15480797860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:23,392][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 18:48:25,807][15401] Updated weights for policy 0, policy_version 944874 (0.0032) [2024-06-25 18:48:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15480913920. Throughput: 0: 42819.2. Samples: 15481055780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:28,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 18:48:29,126][15401] Updated weights for policy 0, policy_version 944884 (0.0037) [2024-06-25 18:48:33,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15481126912. Throughput: 0: 42836.1. Samples: 15481309800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 18:48:33,514][15401] Updated weights for policy 0, policy_version 944894 (0.0031) [2024-06-25 18:48:36,782][15401] Updated weights for policy 0, policy_version 944904 (0.0028) [2024-06-25 18:48:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 15481372672. Throughput: 0: 42915.9. Samples: 15481440260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:38,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 18:48:40,937][15401] Updated weights for policy 0, policy_version 944914 (0.0033) [2024-06-25 18:48:43,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42047.9, 300 sec: 42764.1). Total num frames: 15481536512. Throughput: 0: 42821.0. Samples: 15481699840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:43,397][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 18:48:44,379][15401] Updated weights for policy 0, policy_version 944924 (0.0029) [2024-06-25 18:48:48,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15481782272. Throughput: 0: 42736.2. Samples: 15481952300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 18:48:48,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 18:48:48,483][15401] Updated weights for policy 0, policy_version 944934 (0.0033) [2024-06-25 18:48:52,139][15401] Updated weights for policy 0, policy_version 944944 (0.0025) [2024-06-25 18:48:53,389][15132] Fps is (10 sec: 49183.8, 60 sec: 43417.7, 300 sec: 43098.3). Total num frames: 15482028032. Throughput: 0: 42841.9. Samples: 15482082820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:48:53,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 18:48:56,274][15401] Updated weights for policy 0, policy_version 944954 (0.0043) [2024-06-25 18:48:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 15482191872. Throughput: 0: 42660.8. Samples: 15482337060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:48:58,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 18:48:59,081][15349] Signal inference workers to stop experience collection... (229150 times) [2024-06-25 18:48:59,082][15349] Signal inference workers to resume experience collection... (229150 times) [2024-06-25 18:48:59,122][15401] InferenceWorker_p0-w0: stopping experience collection (229150 times) [2024-06-25 18:48:59,122][15401] InferenceWorker_p0-w0: resuming experience collection (229150 times) [2024-06-25 18:48:59,711][15401] Updated weights for policy 0, policy_version 944964 (0.0036) [2024-06-25 18:49:03,389][15132] Fps is (10 sec: 37683.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15482404864. Throughput: 0: 42747.1. Samples: 15482596280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:03,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-25 18:49:04,170][15401] Updated weights for policy 0, policy_version 944974 (0.0049) [2024-06-25 18:49:07,536][15401] Updated weights for policy 0, policy_version 944984 (0.0034) [2024-06-25 18:49:08,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15482667008. Throughput: 0: 42740.5. Samples: 15482721080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 18:49:11,713][15401] Updated weights for policy 0, policy_version 944994 (0.0033) [2024-06-25 18:49:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 15482830848. Throughput: 0: 42627.9. Samples: 15482974040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:13,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 18:49:15,180][15401] Updated weights for policy 0, policy_version 945004 (0.0035) [2024-06-25 18:49:18,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15483060224. Throughput: 0: 42634.1. Samples: 15483228340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:18,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:49:19,312][15401] Updated weights for policy 0, policy_version 945014 (0.0043) [2024-06-25 18:49:22,938][15401] Updated weights for policy 0, policy_version 945024 (0.0034) [2024-06-25 18:49:23,389][15132] Fps is (10 sec: 47514.4, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 15483305984. Throughput: 0: 42653.5. Samples: 15483359660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 18:49:27,350][15401] Updated weights for policy 0, policy_version 945034 (0.0042) [2024-06-25 18:49:28,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 15483469824. Throughput: 0: 42589.3. Samples: 15483616080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:28,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 18:49:30,758][15401] Updated weights for policy 0, policy_version 945044 (0.0030) [2024-06-25 18:49:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15483699200. Throughput: 0: 42521.6. Samples: 15483865780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:33,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 18:49:34,945][15401] Updated weights for policy 0, policy_version 945054 (0.0040) [2024-06-25 18:49:38,341][15401] Updated weights for policy 0, policy_version 945064 (0.0034) [2024-06-25 18:49:38,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 15483928576. Throughput: 0: 42558.1. Samples: 15483997940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:38,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 18:49:42,704][15401] Updated weights for policy 0, policy_version 945074 (0.0041) [2024-06-25 18:49:43,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42874.3, 300 sec: 42764.7). Total num frames: 15484108800. Throughput: 0: 42448.0. Samples: 15484247320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:43,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 18:49:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945075_15484108800.pth... [2024-06-25 18:49:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944448_15473836032.pth [2024-06-25 18:49:45,802][15401] Updated weights for policy 0, policy_version 945084 (0.0025) [2024-06-25 18:49:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15484338176. Throughput: 0: 42312.9. Samples: 15484500360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 18:49:50,119][15401] Updated weights for policy 0, policy_version 945094 (0.0039) [2024-06-25 18:49:53,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15484567552. Throughput: 0: 42564.4. Samples: 15484636480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 18:49:53,486][15401] Updated weights for policy 0, policy_version 945104 (0.0037) [2024-06-25 18:49:58,283][15401] Updated weights for policy 0, policy_version 945114 (0.0026) [2024-06-25 18:49:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15484747776. Throughput: 0: 42595.6. Samples: 15484890840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:49:58,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 18:50:01,198][15401] Updated weights for policy 0, policy_version 945124 (0.0032) [2024-06-25 18:50:03,392][15132] Fps is (10 sec: 42588.1, 60 sec: 43142.8, 300 sec: 42875.7). Total num frames: 15484993536. Throughput: 0: 42507.6. Samples: 15485141280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:03,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 18:50:05,731][15401] Updated weights for policy 0, policy_version 945134 (0.0025) [2024-06-25 18:50:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 15485190144. Throughput: 0: 42696.0. Samples: 15485280980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 18:50:08,999][15401] Updated weights for policy 0, policy_version 945144 (0.0030) [2024-06-25 18:50:13,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15485386752. Throughput: 0: 42501.7. Samples: 15485528660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:13,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 18:50:13,424][15401] Updated weights for policy 0, policy_version 945154 (0.0044) [2024-06-25 18:50:15,875][15349] Signal inference workers to stop experience collection... (229200 times) [2024-06-25 18:50:15,931][15401] InferenceWorker_p0-w0: stopping experience collection (229200 times) [2024-06-25 18:50:15,935][15349] Signal inference workers to resume experience collection... (229200 times) [2024-06-25 18:50:15,944][15401] InferenceWorker_p0-w0: resuming experience collection (229200 times) [2024-06-25 18:50:16,692][15401] Updated weights for policy 0, policy_version 945164 (0.0036) [2024-06-25 18:50:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15485632512. Throughput: 0: 42554.8. Samples: 15485780740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 18:50:21,050][15401] Updated weights for policy 0, policy_version 945174 (0.0031) [2024-06-25 18:50:23,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 15485845504. Throughput: 0: 42544.5. Samples: 15485912440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:23,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 18:50:24,207][15401] Updated weights for policy 0, policy_version 945184 (0.0026) [2024-06-25 18:50:28,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15486025728. Throughput: 0: 42754.8. Samples: 15486171180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:28,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 18:50:28,751][15401] Updated weights for policy 0, policy_version 945194 (0.0029) [2024-06-25 18:50:31,733][15401] Updated weights for policy 0, policy_version 945204 (0.0033) [2024-06-25 18:50:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15486271488. Throughput: 0: 42728.3. Samples: 15486423140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 18:50:36,685][15401] Updated weights for policy 0, policy_version 945214 (0.0036) [2024-06-25 18:50:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15486484480. Throughput: 0: 42745.5. Samples: 15486560020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 18:50:40,012][15401] Updated weights for policy 0, policy_version 945224 (0.0026) [2024-06-25 18:50:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 15486681088. Throughput: 0: 42723.4. Samples: 15486813400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 18:50:43,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 18:50:44,598][15401] Updated weights for policy 0, policy_version 945234 (0.0038) [2024-06-25 18:50:47,548][15401] Updated weights for policy 0, policy_version 945244 (0.0034) [2024-06-25 18:50:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15486910464. Throughput: 0: 42800.6. Samples: 15487067200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:50:48,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 18:50:52,175][15401] Updated weights for policy 0, policy_version 945254 (0.0020) [2024-06-25 18:50:53,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 15487123456. Throughput: 0: 42663.9. Samples: 15487200860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:50:53,390][15132] Avg episode reward: [(0, '0.868')] [2024-06-25 18:50:54,926][15401] Updated weights for policy 0, policy_version 945264 (0.0022) [2024-06-25 18:50:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15487336448. Throughput: 0: 42802.1. Samples: 15487454760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:50:58,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 18:50:59,786][15401] Updated weights for policy 0, policy_version 945274 (0.0044) [2024-06-25 18:51:02,463][15401] Updated weights for policy 0, policy_version 945284 (0.0048) [2024-06-25 18:51:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 15487565824. Throughput: 0: 42813.0. Samples: 15487707340. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:03,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 18:51:07,463][15401] Updated weights for policy 0, policy_version 945294 (0.0037) [2024-06-25 18:51:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15487762432. Throughput: 0: 42885.3. Samples: 15487842280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:08,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 18:51:09,947][15401] Updated weights for policy 0, policy_version 945304 (0.0035) [2024-06-25 18:51:13,390][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15487975424. Throughput: 0: 42808.3. Samples: 15488097560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 18:51:15,022][15401] Updated weights for policy 0, policy_version 945314 (0.0028) [2024-06-25 18:51:17,503][15401] Updated weights for policy 0, policy_version 945324 (0.0024) [2024-06-25 18:51:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15488204800. Throughput: 0: 42852.2. Samples: 15488351480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:18,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 18:51:22,548][15401] Updated weights for policy 0, policy_version 945334 (0.0036) [2024-06-25 18:51:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15488401408. Throughput: 0: 42876.4. Samples: 15488489460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:23,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 18:51:25,391][15401] Updated weights for policy 0, policy_version 945344 (0.0039) [2024-06-25 18:51:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15488614400. Throughput: 0: 42866.0. Samples: 15488742360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:28,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 18:51:30,347][15401] Updated weights for policy 0, policy_version 945354 (0.0031) [2024-06-25 18:51:30,694][15349] Signal inference workers to stop experience collection... (229250 times) [2024-06-25 18:51:30,694][15349] Signal inference workers to resume experience collection... (229250 times) [2024-06-25 18:51:30,713][15401] InferenceWorker_p0-w0: stopping experience collection (229250 times) [2024-06-25 18:51:30,713][15401] InferenceWorker_p0-w0: resuming experience collection (229250 times) [2024-06-25 18:51:33,063][15401] Updated weights for policy 0, policy_version 945364 (0.0046) [2024-06-25 18:51:33,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 15488860160. Throughput: 0: 42892.7. Samples: 15488997380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:33,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 18:51:37,893][15401] Updated weights for policy 0, policy_version 945374 (0.0041) [2024-06-25 18:51:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15489040384. Throughput: 0: 42855.7. Samples: 15489129360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:38,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 18:51:40,665][15401] Updated weights for policy 0, policy_version 945384 (0.0049) [2024-06-25 18:51:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 15489269760. Throughput: 0: 43007.2. Samples: 15489390080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:43,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 18:51:43,514][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945391_15489286144.pth... [2024-06-25 18:51:43,560][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000944765_15479029760.pth [2024-06-25 18:51:45,360][15401] Updated weights for policy 0, policy_version 945394 (0.0038) [2024-06-25 18:51:48,294][15401] Updated weights for policy 0, policy_version 945404 (0.0037) [2024-06-25 18:51:48,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15489499136. Throughput: 0: 43021.6. Samples: 15489643300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:48,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 18:51:52,868][15401] Updated weights for policy 0, policy_version 945414 (0.0037) [2024-06-25 18:51:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15489695744. Throughput: 0: 42883.5. Samples: 15489772040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 18:51:55,975][15401] Updated weights for policy 0, policy_version 945424 (0.0034) [2024-06-25 18:51:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15489908736. Throughput: 0: 43020.9. Samples: 15490033500. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:51:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 18:52:00,399][15401] Updated weights for policy 0, policy_version 945434 (0.0036) [2024-06-25 18:52:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15490138112. Throughput: 0: 43066.9. Samples: 15490289500. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:03,390][15132] Avg episode reward: [(0, '0.815')] [2024-06-25 18:52:03,524][15401] Updated weights for policy 0, policy_version 945444 (0.0033) [2024-06-25 18:52:07,953][15401] Updated weights for policy 0, policy_version 945454 (0.0032) [2024-06-25 18:52:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15490334720. Throughput: 0: 42790.6. Samples: 15490415040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:08,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 18:52:11,561][15401] Updated weights for policy 0, policy_version 945464 (0.0033) [2024-06-25 18:52:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 15490547712. Throughput: 0: 42760.2. Samples: 15490666580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:13,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 18:52:15,449][15401] Updated weights for policy 0, policy_version 945474 (0.0035) [2024-06-25 18:52:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15490744320. Throughput: 0: 42895.8. Samples: 15490927680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:18,390][15132] Avg episode reward: [(0, '0.820')] [2024-06-25 18:52:19,223][15401] Updated weights for policy 0, policy_version 945484 (0.0029) [2024-06-25 18:52:23,001][15401] Updated weights for policy 0, policy_version 945494 (0.0036) [2024-06-25 18:52:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15490990080. Throughput: 0: 42811.9. Samples: 15491055900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 18:52:26,780][15401] Updated weights for policy 0, policy_version 945504 (0.0036) [2024-06-25 18:52:28,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15491186688. Throughput: 0: 42690.6. Samples: 15491311160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:28,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 18:52:30,856][15401] Updated weights for policy 0, policy_version 945514 (0.0041) [2024-06-25 18:52:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15491399680. Throughput: 0: 42765.0. Samples: 15491567720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:33,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 18:52:34,643][15401] Updated weights for policy 0, policy_version 945524 (0.0041) [2024-06-25 18:52:38,263][15401] Updated weights for policy 0, policy_version 945534 (0.0042) [2024-06-25 18:52:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15491629056. Throughput: 0: 42928.9. Samples: 15491703840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 18:52:42,282][15349] Signal inference workers to stop experience collection... (229300 times) [2024-06-25 18:52:42,282][15349] Signal inference workers to resume experience collection... (229300 times) [2024-06-25 18:52:42,286][15401] Updated weights for policy 0, policy_version 945544 (0.0028) [2024-06-25 18:52:42,309][15401] InferenceWorker_p0-w0: stopping experience collection (229300 times) [2024-06-25 18:52:42,309][15401] InferenceWorker_p0-w0: resuming experience collection (229300 times) [2024-06-25 18:52:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15491825664. Throughput: 0: 42803.9. Samples: 15491959680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-25 18:52:43,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 18:52:45,805][15401] Updated weights for policy 0, policy_version 945554 (0.0037) [2024-06-25 18:52:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15492038656. Throughput: 0: 42685.0. Samples: 15492210320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:52:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 18:52:50,315][15401] Updated weights for policy 0, policy_version 945564 (0.0029) [2024-06-25 18:52:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15492268032. Throughput: 0: 42696.9. Samples: 15492336400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:52:53,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 18:52:53,792][15401] Updated weights for policy 0, policy_version 945574 (0.0040) [2024-06-25 18:52:58,036][15401] Updated weights for policy 0, policy_version 945584 (0.0036) [2024-06-25 18:52:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15492481024. Throughput: 0: 42968.1. Samples: 15492600140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:52:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 18:53:01,145][15401] Updated weights for policy 0, policy_version 945594 (0.0023) [2024-06-25 18:53:03,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15492694016. Throughput: 0: 42800.6. Samples: 15492853720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:03,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 18:53:05,536][15401] Updated weights for policy 0, policy_version 945604 (0.0034) [2024-06-25 18:53:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15492923392. Throughput: 0: 42897.3. Samples: 15492986280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 18:53:08,568][15401] Updated weights for policy 0, policy_version 945614 (0.0033) [2024-06-25 18:53:13,376][15401] Updated weights for policy 0, policy_version 945624 (0.0034) [2024-06-25 18:53:13,389][15132] Fps is (10 sec: 40960.9, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 15493103616. Throughput: 0: 43004.5. Samples: 15493246360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:13,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 18:53:16,070][15401] Updated weights for policy 0, policy_version 945634 (0.0035) [2024-06-25 18:53:18,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.7, 300 sec: 42765.0). Total num frames: 15493332992. Throughput: 0: 42746.5. Samples: 15493491420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:18,393][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 18:53:20,943][15401] Updated weights for policy 0, policy_version 945644 (0.0036) [2024-06-25 18:53:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15493562368. Throughput: 0: 42658.6. Samples: 15493623480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:23,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 18:53:23,877][15401] Updated weights for policy 0, policy_version 945654 (0.0027) [2024-06-25 18:53:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15493742592. Throughput: 0: 42759.2. Samples: 15493883840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:28,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 18:53:28,498][15401] Updated weights for policy 0, policy_version 945664 (0.0026) [2024-06-25 18:53:31,328][15401] Updated weights for policy 0, policy_version 945674 (0.0027) [2024-06-25 18:53:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15493988352. Throughput: 0: 42636.4. Samples: 15494128960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 18:53:36,573][15401] Updated weights for policy 0, policy_version 945684 (0.0036) [2024-06-25 18:53:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 15494184960. Throughput: 0: 42985.3. Samples: 15494270740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 18:53:39,006][15401] Updated weights for policy 0, policy_version 945694 (0.0028) [2024-06-25 18:53:43,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15494381568. Throughput: 0: 42722.3. Samples: 15494522640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:43,390][15132] Avg episode reward: [(0, '0.258')] [2024-06-25 18:53:43,436][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945703_15494397952.pth... [2024-06-25 18:53:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945075_15484108800.pth [2024-06-25 18:53:44,235][15401] Updated weights for policy 0, policy_version 945704 (0.0028) [2024-06-25 18:53:47,039][15401] Updated weights for policy 0, policy_version 945714 (0.0031) [2024-06-25 18:53:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15494627328. Throughput: 0: 42664.2. Samples: 15494773600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:48,396][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 18:53:51,966][15401] Updated weights for policy 0, policy_version 945724 (0.0030) [2024-06-25 18:53:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15494807552. Throughput: 0: 42621.6. Samples: 15494904240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 18:53:54,731][15401] Updated weights for policy 0, policy_version 945734 (0.0037) [2024-06-25 18:53:58,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15495020544. Throughput: 0: 42341.7. Samples: 15495151740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:53:58,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 18:53:59,635][15401] Updated weights for policy 0, policy_version 945744 (0.0027) [2024-06-25 18:54:02,754][15401] Updated weights for policy 0, policy_version 945754 (0.0037) [2024-06-25 18:54:02,769][15349] Signal inference workers to stop experience collection... (229350 times) [2024-06-25 18:54:02,776][15349] Signal inference workers to resume experience collection... (229350 times) [2024-06-25 18:54:02,793][15401] InferenceWorker_p0-w0: stopping experience collection (229350 times) [2024-06-25 18:54:02,793][15401] InferenceWorker_p0-w0: resuming experience collection (229350 times) [2024-06-25 18:54:03,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15495266304. Throughput: 0: 42490.7. Samples: 15495403400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:03,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 18:54:07,374][15401] Updated weights for policy 0, policy_version 945764 (0.0039) [2024-06-25 18:54:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15495462912. Throughput: 0: 42588.0. Samples: 15495539940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:08,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 18:54:10,435][15401] Updated weights for policy 0, policy_version 945774 (0.0034) [2024-06-25 18:54:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 15495675904. Throughput: 0: 42507.9. Samples: 15495796700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:13,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 18:54:14,963][15401] Updated weights for policy 0, policy_version 945784 (0.0025) [2024-06-25 18:54:17,984][15401] Updated weights for policy 0, policy_version 945794 (0.0044) [2024-06-25 18:54:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 15495905280. Throughput: 0: 42698.0. Samples: 15496050360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:18,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 18:54:22,576][15401] Updated weights for policy 0, policy_version 945804 (0.0025) [2024-06-25 18:54:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15496085504. Throughput: 0: 42434.3. Samples: 15496180280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:23,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 18:54:25,676][15401] Updated weights for policy 0, policy_version 945814 (0.0028) [2024-06-25 18:54:28,392][15132] Fps is (10 sec: 40951.2, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 15496314880. Throughput: 0: 42428.6. Samples: 15496432020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:28,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 18:54:30,177][15401] Updated weights for policy 0, policy_version 945824 (0.0028) [2024-06-25 18:54:33,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15496527872. Throughput: 0: 42543.9. Samples: 15496688080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:33,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 18:54:33,450][15401] Updated weights for policy 0, policy_version 945834 (0.0034) [2024-06-25 18:54:37,674][15401] Updated weights for policy 0, policy_version 945844 (0.0028) [2024-06-25 18:54:38,390][15132] Fps is (10 sec: 40968.4, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 15496724480. Throughput: 0: 42630.5. Samples: 15496822620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 18:54:38,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 18:54:41,140][15401] Updated weights for policy 0, policy_version 945854 (0.0049) [2024-06-25 18:54:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 15496937472. Throughput: 0: 42781.2. Samples: 15497076900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:54:43,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 18:54:45,616][15401] Updated weights for policy 0, policy_version 945864 (0.0042) [2024-06-25 18:54:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15497166848. Throughput: 0: 42827.6. Samples: 15497330640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:54:48,392][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 18:54:48,901][15401] Updated weights for policy 0, policy_version 945874 (0.0046) [2024-06-25 18:54:53,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15497347072. Throughput: 0: 42640.8. Samples: 15497458780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:54:53,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 18:54:53,474][15401] Updated weights for policy 0, policy_version 945884 (0.0032) [2024-06-25 18:54:56,419][15401] Updated weights for policy 0, policy_version 945894 (0.0036) [2024-06-25 18:54:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 15497592832. Throughput: 0: 42701.0. Samples: 15497718240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:54:58,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 18:55:00,937][15401] Updated weights for policy 0, policy_version 945904 (0.0039) [2024-06-25 18:55:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15497805824. Throughput: 0: 42763.3. Samples: 15497974720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:03,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 18:55:04,074][15401] Updated weights for policy 0, policy_version 945914 (0.0030) [2024-06-25 18:55:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15498002432. Throughput: 0: 42820.5. Samples: 15498107200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:08,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 18:55:08,449][15401] Updated weights for policy 0, policy_version 945924 (0.0036) [2024-06-25 18:55:11,718][15401] Updated weights for policy 0, policy_version 945934 (0.0040) [2024-06-25 18:55:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15498215424. Throughput: 0: 42730.4. Samples: 15498354800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:13,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 18:55:16,344][15401] Updated weights for policy 0, policy_version 945944 (0.0031) [2024-06-25 18:55:18,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15498461184. Throughput: 0: 42603.3. Samples: 15498605220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:18,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 18:55:19,367][15401] Updated weights for policy 0, policy_version 945954 (0.0037) [2024-06-25 18:55:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15498641408. Throughput: 0: 42618.3. Samples: 15498740440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:23,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 18:55:23,908][15401] Updated weights for policy 0, policy_version 945964 (0.0034) [2024-06-25 18:55:27,053][15401] Updated weights for policy 0, policy_version 945974 (0.0033) [2024-06-25 18:55:28,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 15498870784. Throughput: 0: 42635.2. Samples: 15498995480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:28,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 18:55:31,466][15401] Updated weights for policy 0, policy_version 945984 (0.0032) [2024-06-25 18:55:33,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15499100160. Throughput: 0: 42726.2. Samples: 15499253420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:33,393][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 18:55:34,438][15349] Signal inference workers to stop experience collection... (229400 times) [2024-06-25 18:55:34,484][15401] InferenceWorker_p0-w0: stopping experience collection (229400 times) [2024-06-25 18:55:34,490][15349] Signal inference workers to resume experience collection... (229400 times) [2024-06-25 18:55:34,507][15401] InferenceWorker_p0-w0: resuming experience collection (229400 times) [2024-06-25 18:55:34,626][15401] Updated weights for policy 0, policy_version 945994 (0.0037) [2024-06-25 18:55:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15499280384. Throughput: 0: 42816.5. Samples: 15499385520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:38,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 18:55:39,021][15401] Updated weights for policy 0, policy_version 946004 (0.0041) [2024-06-25 18:55:42,355][15401] Updated weights for policy 0, policy_version 946014 (0.0040) [2024-06-25 18:55:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15499509760. Throughput: 0: 42559.1. Samples: 15499633400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:43,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 18:55:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946015_15499509760.pth... [2024-06-25 18:55:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945391_15489286144.pth [2024-06-25 18:55:47,015][15401] Updated weights for policy 0, policy_version 946024 (0.0035) [2024-06-25 18:55:48,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15499755520. Throughput: 0: 42366.4. Samples: 15499881200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:48,390][15132] Avg episode reward: [(0, '0.804')] [2024-06-25 18:55:50,308][15401] Updated weights for policy 0, policy_version 946034 (0.0023) [2024-06-25 18:55:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15499935744. Throughput: 0: 42536.4. Samples: 15500021340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:53,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 18:55:54,458][15401] Updated weights for policy 0, policy_version 946044 (0.0024) [2024-06-25 18:55:57,745][15401] Updated weights for policy 0, policy_version 946054 (0.0037) [2024-06-25 18:55:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15500165120. Throughput: 0: 42708.4. Samples: 15500276680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:55:58,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 18:56:01,947][15401] Updated weights for policy 0, policy_version 946064 (0.0037) [2024-06-25 18:56:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15500378112. Throughput: 0: 42756.5. Samples: 15500529260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:03,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 18:56:05,279][15401] Updated weights for policy 0, policy_version 946074 (0.0033) [2024-06-25 18:56:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15500574720. Throughput: 0: 42749.2. Samples: 15500664160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:08,390][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 18:56:09,837][15401] Updated weights for policy 0, policy_version 946084 (0.0032) [2024-06-25 18:56:13,201][15401] Updated weights for policy 0, policy_version 946094 (0.0038) [2024-06-25 18:56:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15500804096. Throughput: 0: 42718.8. Samples: 15500917820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:13,390][15132] Avg episode reward: [(0, '0.384')] [2024-06-25 18:56:17,371][15401] Updated weights for policy 0, policy_version 946104 (0.0039) [2024-06-25 18:56:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15501033472. Throughput: 0: 42730.3. Samples: 15501176180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:18,392][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:56:20,944][15401] Updated weights for policy 0, policy_version 946114 (0.0028) [2024-06-25 18:56:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15501213696. Throughput: 0: 42692.0. Samples: 15501306660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 18:56:24,903][15401] Updated weights for policy 0, policy_version 946124 (0.0032) [2024-06-25 18:56:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15501443072. Throughput: 0: 42757.3. Samples: 15501557480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:28,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 18:56:28,608][15401] Updated weights for policy 0, policy_version 946134 (0.0033) [2024-06-25 18:56:32,434][15401] Updated weights for policy 0, policy_version 946144 (0.0026) [2024-06-25 18:56:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15501656064. Throughput: 0: 43134.2. Samples: 15501822240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-25 18:56:33,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 18:56:36,152][15401] Updated weights for policy 0, policy_version 946154 (0.0045) [2024-06-25 18:56:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15501852672. Throughput: 0: 42876.1. Samples: 15501950760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:56:38,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 18:56:40,136][15401] Updated weights for policy 0, policy_version 946164 (0.0041) [2024-06-25 18:56:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15502098432. Throughput: 0: 42904.4. Samples: 15502207380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:56:43,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 18:56:43,779][15401] Updated weights for policy 0, policy_version 946174 (0.0040) [2024-06-25 18:56:47,604][15401] Updated weights for policy 0, policy_version 946184 (0.0028) [2024-06-25 18:56:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15502295040. Throughput: 0: 43212.4. Samples: 15502473820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:56:48,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 18:56:51,240][15401] Updated weights for policy 0, policy_version 946194 (0.0039) [2024-06-25 18:56:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15502508032. Throughput: 0: 42993.0. Samples: 15502598840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:56:53,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 18:56:55,457][15401] Updated weights for policy 0, policy_version 946204 (0.0047) [2024-06-25 18:56:57,155][15349] Signal inference workers to stop experience collection... (229450 times) [2024-06-25 18:56:57,174][15401] InferenceWorker_p0-w0: stopping experience collection (229450 times) [2024-06-25 18:56:57,214][15349] Signal inference workers to resume experience collection... (229450 times) [2024-06-25 18:56:57,215][15401] InferenceWorker_p0-w0: resuming experience collection (229450 times) [2024-06-25 18:56:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15502753792. Throughput: 0: 43087.4. Samples: 15502856760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:56:58,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 18:56:58,864][15401] Updated weights for policy 0, policy_version 946214 (0.0026) [2024-06-25 18:57:02,813][15401] Updated weights for policy 0, policy_version 946224 (0.0036) [2024-06-25 18:57:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15502950400. Throughput: 0: 43146.6. Samples: 15503117780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 18:57:06,875][15401] Updated weights for policy 0, policy_version 946234 (0.0038) [2024-06-25 18:57:08,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15503163392. Throughput: 0: 42984.8. Samples: 15503240980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 18:57:10,250][15401] Updated weights for policy 0, policy_version 946244 (0.0042) [2024-06-25 18:57:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15503392768. Throughput: 0: 43155.6. Samples: 15503499480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:13,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 18:57:14,502][15401] Updated weights for policy 0, policy_version 946254 (0.0023) [2024-06-25 18:57:17,770][15401] Updated weights for policy 0, policy_version 946264 (0.0032) [2024-06-25 18:57:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15503605760. Throughput: 0: 43130.8. Samples: 15503763120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:18,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 18:57:22,017][15401] Updated weights for policy 0, policy_version 946274 (0.0036) [2024-06-25 18:57:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15503802368. Throughput: 0: 43003.8. Samples: 15503885940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 18:57:25,955][15401] Updated weights for policy 0, policy_version 946284 (0.0038) [2024-06-25 18:57:28,390][15132] Fps is (10 sec: 42594.8, 60 sec: 43144.0, 300 sec: 42820.4). Total num frames: 15504031744. Throughput: 0: 42906.0. Samples: 15504138180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:28,391][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 18:57:29,651][15401] Updated weights for policy 0, policy_version 946294 (0.0036) [2024-06-25 18:57:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15504228352. Throughput: 0: 42858.6. Samples: 15504402460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 18:57:33,712][15401] Updated weights for policy 0, policy_version 946304 (0.0033) [2024-06-25 18:57:37,449][15401] Updated weights for policy 0, policy_version 946314 (0.0031) [2024-06-25 18:57:38,389][15132] Fps is (10 sec: 40963.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15504441344. Throughput: 0: 42817.7. Samples: 15504525640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:38,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 18:57:41,422][15401] Updated weights for policy 0, policy_version 946324 (0.0048) [2024-06-25 18:57:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15504670720. Throughput: 0: 42836.8. Samples: 15504784420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 18:57:43,518][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946331_15504687104.pth... [2024-06-25 18:57:43,570][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000945703_15494397952.pth [2024-06-25 18:57:44,858][15401] Updated weights for policy 0, policy_version 946334 (0.0031) [2024-06-25 18:57:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15504850944. Throughput: 0: 42729.8. Samples: 15505040620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:48,392][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 18:57:49,269][15401] Updated weights for policy 0, policy_version 946344 (0.0042) [2024-06-25 18:57:52,376][15401] Updated weights for policy 0, policy_version 946354 (0.0040) [2024-06-25 18:57:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15505080320. Throughput: 0: 42790.4. Samples: 15505166540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:53,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 18:57:56,869][15401] Updated weights for policy 0, policy_version 946364 (0.0036) [2024-06-25 18:57:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15505293312. Throughput: 0: 42741.6. Samples: 15505422860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:57:58,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 18:58:00,279][15401] Updated weights for policy 0, policy_version 946374 (0.0040) [2024-06-25 18:58:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15505506304. Throughput: 0: 42565.2. Samples: 15505678560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:03,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 18:58:04,444][15401] Updated weights for policy 0, policy_version 946384 (0.0031) [2024-06-25 18:58:08,052][15401] Updated weights for policy 0, policy_version 946394 (0.0034) [2024-06-25 18:58:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15505735680. Throughput: 0: 42648.5. Samples: 15505805120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:08,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 18:58:12,088][15401] Updated weights for policy 0, policy_version 946404 (0.0039) [2024-06-25 18:58:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 15505948672. Throughput: 0: 42783.8. Samples: 15506063420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:13,391][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 18:58:15,615][15401] Updated weights for policy 0, policy_version 946414 (0.0046) [2024-06-25 18:58:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15506145280. Throughput: 0: 42565.3. Samples: 15506317900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:18,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 18:58:19,808][15401] Updated weights for policy 0, policy_version 946424 (0.0030) [2024-06-25 18:58:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15506358272. Throughput: 0: 42574.7. Samples: 15506441500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:23,390][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 18:58:23,404][15401] Updated weights for policy 0, policy_version 946434 (0.0027) [2024-06-25 18:58:25,408][15349] Signal inference workers to stop experience collection... (229500 times) [2024-06-25 18:58:25,409][15349] Signal inference workers to resume experience collection... (229500 times) [2024-06-25 18:58:25,422][15401] InferenceWorker_p0-w0: stopping experience collection (229500 times) [2024-06-25 18:58:25,422][15401] InferenceWorker_p0-w0: resuming experience collection (229500 times) [2024-06-25 18:58:27,600][15401] Updated weights for policy 0, policy_version 946444 (0.0037) [2024-06-25 18:58:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.9, 300 sec: 42654.0). Total num frames: 15506571264. Throughput: 0: 42656.2. Samples: 15506703940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:28,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 18:58:31,180][15401] Updated weights for policy 0, policy_version 946454 (0.0034) [2024-06-25 18:58:33,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15506800640. Throughput: 0: 42436.6. Samples: 15506950360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-25 18:58:33,392][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 18:58:35,487][15401] Updated weights for policy 0, policy_version 946464 (0.0036) [2024-06-25 18:58:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15507013632. Throughput: 0: 42562.6. Samples: 15507081860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:58:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 18:58:38,716][15401] Updated weights for policy 0, policy_version 946474 (0.0027) [2024-06-25 18:58:43,254][15401] Updated weights for policy 0, policy_version 946484 (0.0038) [2024-06-25 18:58:43,389][15132] Fps is (10 sec: 39330.8, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15507193856. Throughput: 0: 42524.2. Samples: 15507336440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:58:43,390][15132] Avg episode reward: [(0, '0.373')] [2024-06-25 18:58:46,413][15401] Updated weights for policy 0, policy_version 946494 (0.0032) [2024-06-25 18:58:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15507423232. Throughput: 0: 42472.6. Samples: 15507589820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:58:48,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 18:58:51,109][15401] Updated weights for policy 0, policy_version 946504 (0.0038) [2024-06-25 18:58:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15507636224. Throughput: 0: 42615.1. Samples: 15507722800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:58:53,390][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 18:58:53,888][15401] Updated weights for policy 0, policy_version 946514 (0.0029) [2024-06-25 18:58:58,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 15507832832. Throughput: 0: 42524.0. Samples: 15507977100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:58:58,393][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 18:58:58,815][15401] Updated weights for policy 0, policy_version 946524 (0.0037) [2024-06-25 18:59:01,527][15401] Updated weights for policy 0, policy_version 946534 (0.0024) [2024-06-25 18:59:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15508078592. Throughput: 0: 42470.3. Samples: 15508229060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:03,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 18:59:06,334][15401] Updated weights for policy 0, policy_version 946544 (0.0030) [2024-06-25 18:59:08,391][15132] Fps is (10 sec: 44243.1, 60 sec: 42324.6, 300 sec: 42709.3). Total num frames: 15508275200. Throughput: 0: 42715.0. Samples: 15508363720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:08,391][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 18:59:09,180][15401] Updated weights for policy 0, policy_version 946554 (0.0038) [2024-06-25 18:59:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15508488192. Throughput: 0: 42410.1. Samples: 15508612400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:13,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 18:59:14,042][15401] Updated weights for policy 0, policy_version 946564 (0.0028) [2024-06-25 18:59:17,163][15401] Updated weights for policy 0, policy_version 946574 (0.0034) [2024-06-25 18:59:18,389][15132] Fps is (10 sec: 44241.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15508717568. Throughput: 0: 42491.6. Samples: 15508862380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:18,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 18:59:21,930][15401] Updated weights for policy 0, policy_version 946584 (0.0034) [2024-06-25 18:59:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 15508897792. Throughput: 0: 42483.5. Samples: 15508993620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 18:59:24,758][15401] Updated weights for policy 0, policy_version 946594 (0.0038) [2024-06-25 18:59:28,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15509110784. Throughput: 0: 42289.4. Samples: 15509239460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:28,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 18:59:29,989][15401] Updated weights for policy 0, policy_version 946604 (0.0028) [2024-06-25 18:59:33,071][15401] Updated weights for policy 0, policy_version 946614 (0.0037) [2024-06-25 18:59:33,396][15132] Fps is (10 sec: 44208.8, 60 sec: 42322.4, 300 sec: 42764.1). Total num frames: 15509340160. Throughput: 0: 42355.2. Samples: 15509496080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:33,396][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 18:59:37,499][15401] Updated weights for policy 0, policy_version 946624 (0.0030) [2024-06-25 18:59:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15509536768. Throughput: 0: 42330.7. Samples: 15509627680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 18:59:40,567][15401] Updated weights for policy 0, policy_version 946634 (0.0032) [2024-06-25 18:59:43,390][15132] Fps is (10 sec: 42625.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15509766144. Throughput: 0: 42322.2. Samples: 15509881500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 18:59:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946641_15509766144.pth... [2024-06-25 18:59:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946015_15499509760.pth [2024-06-25 18:59:44,975][15401] Updated weights for policy 0, policy_version 946644 (0.0029) [2024-06-25 18:59:48,212][15401] Updated weights for policy 0, policy_version 946654 (0.0031) [2024-06-25 18:59:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15509995520. Throughput: 0: 42402.2. Samples: 15510137160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:48,390][15132] Avg episode reward: [(0, '0.858')] [2024-06-25 18:59:52,591][15401] Updated weights for policy 0, policy_version 946664 (0.0033) [2024-06-25 18:59:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15510175744. Throughput: 0: 42233.4. Samples: 15510264180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:53,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 18:59:54,248][15349] Signal inference workers to stop experience collection... (229550 times) [2024-06-25 18:59:54,298][15401] InferenceWorker_p0-w0: stopping experience collection (229550 times) [2024-06-25 18:59:54,300][15349] Signal inference workers to resume experience collection... (229550 times) [2024-06-25 18:59:54,312][15401] InferenceWorker_p0-w0: resuming experience collection (229550 times) [2024-06-25 18:59:56,031][15401] Updated weights for policy 0, policy_version 946674 (0.0033) [2024-06-25 18:59:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 15510405120. Throughput: 0: 42335.2. Samples: 15510517480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 18:59:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 19:00:00,445][15401] Updated weights for policy 0, policy_version 946684 (0.0044) [2024-06-25 19:00:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15510618112. Throughput: 0: 42331.4. Samples: 15510767300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:03,390][15132] Avg episode reward: [(0, '0.358')] [2024-06-25 19:00:03,767][15401] Updated weights for policy 0, policy_version 946694 (0.0038) [2024-06-25 19:00:08,079][15401] Updated weights for policy 0, policy_version 946704 (0.0030) [2024-06-25 19:00:08,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42051.3, 300 sec: 42653.6). Total num frames: 15510798336. Throughput: 0: 42316.5. Samples: 15510897960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:08,393][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 19:00:11,345][15401] Updated weights for policy 0, policy_version 946714 (0.0036) [2024-06-25 19:00:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15511044096. Throughput: 0: 42667.0. Samples: 15511159480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:13,391][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 19:00:15,806][15401] Updated weights for policy 0, policy_version 946724 (0.0038) [2024-06-25 19:00:18,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15511257088. Throughput: 0: 42603.9. Samples: 15511412980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 19:00:18,976][15401] Updated weights for policy 0, policy_version 946734 (0.0030) [2024-06-25 19:00:23,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15511437312. Throughput: 0: 42527.5. Samples: 15511541420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:23,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 19:00:23,512][15401] Updated weights for policy 0, policy_version 946744 (0.0027) [2024-06-25 19:00:26,633][15401] Updated weights for policy 0, policy_version 946754 (0.0039) [2024-06-25 19:00:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15511683072. Throughput: 0: 42608.1. Samples: 15511798860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-25 19:00:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 19:00:30,967][15401] Updated weights for policy 0, policy_version 946764 (0.0033) [2024-06-25 19:00:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 15511896064. Throughput: 0: 42591.9. Samples: 15512053800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 19:00:34,175][15401] Updated weights for policy 0, policy_version 946774 (0.0025) [2024-06-25 19:00:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15512076288. Throughput: 0: 42668.1. Samples: 15512184240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:38,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 19:00:38,687][15401] Updated weights for policy 0, policy_version 946784 (0.0029) [2024-06-25 19:00:41,778][15401] Updated weights for policy 0, policy_version 946794 (0.0045) [2024-06-25 19:00:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15512322048. Throughput: 0: 42763.9. Samples: 15512441860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:43,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 19:00:46,211][15401] Updated weights for policy 0, policy_version 946804 (0.0039) [2024-06-25 19:00:48,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15512535040. Throughput: 0: 42803.2. Samples: 15512693440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:48,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 19:00:49,485][15401] Updated weights for policy 0, policy_version 946814 (0.0033) [2024-06-25 19:00:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15512715264. Throughput: 0: 42803.1. Samples: 15512824000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:53,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 19:00:54,108][15401] Updated weights for policy 0, policy_version 946824 (0.0031) [2024-06-25 19:00:57,741][15401] Updated weights for policy 0, policy_version 946834 (0.0031) [2024-06-25 19:00:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15512961024. Throughput: 0: 42617.9. Samples: 15513077280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:00:58,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 19:01:01,831][15401] Updated weights for policy 0, policy_version 946844 (0.0024) [2024-06-25 19:01:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15513190400. Throughput: 0: 42607.1. Samples: 15513330300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 19:01:05,275][15401] Updated weights for policy 0, policy_version 946854 (0.0023) [2024-06-25 19:01:08,389][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.1, 300 sec: 42542.8). Total num frames: 15513354240. Throughput: 0: 42679.5. Samples: 15513462000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:08,390][15132] Avg episode reward: [(0, '0.296')] [2024-06-25 19:01:09,506][15401] Updated weights for policy 0, policy_version 946864 (0.0045) [2024-06-25 19:01:12,842][15401] Updated weights for policy 0, policy_version 946874 (0.0037) [2024-06-25 19:01:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15513616384. Throughput: 0: 42746.6. Samples: 15513722460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:13,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 19:01:17,004][15401] Updated weights for policy 0, policy_version 946884 (0.0039) [2024-06-25 19:01:18,390][15132] Fps is (10 sec: 49151.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15513845760. Throughput: 0: 42782.2. Samples: 15513979000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:18,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 19:01:20,396][15401] Updated weights for policy 0, policy_version 946894 (0.0034) [2024-06-25 19:01:22,988][15349] Signal inference workers to stop experience collection... (229600 times) [2024-06-25 19:01:22,988][15349] Signal inference workers to resume experience collection... (229600 times) [2024-06-25 19:01:23,018][15401] InferenceWorker_p0-w0: stopping experience collection (229600 times) [2024-06-25 19:01:23,019][15401] InferenceWorker_p0-w0: resuming experience collection (229600 times) [2024-06-25 19:01:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15514009600. Throughput: 0: 42740.3. Samples: 15514107560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 19:01:24,415][15401] Updated weights for policy 0, policy_version 946904 (0.0039) [2024-06-25 19:01:27,861][15401] Updated weights for policy 0, policy_version 946914 (0.0037) [2024-06-25 19:01:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15514271744. Throughput: 0: 42833.4. Samples: 15514369360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:28,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 19:01:32,293][15401] Updated weights for policy 0, policy_version 946924 (0.0043) [2024-06-25 19:01:33,391][15132] Fps is (10 sec: 47509.2, 60 sec: 43143.9, 300 sec: 42820.4). Total num frames: 15514484736. Throughput: 0: 42993.3. Samples: 15514628180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:33,391][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 19:01:35,619][15401] Updated weights for policy 0, policy_version 946934 (0.0027) [2024-06-25 19:01:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15514648576. Throughput: 0: 42921.8. Samples: 15514755480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:38,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 19:01:40,074][15401] Updated weights for policy 0, policy_version 946944 (0.0035) [2024-06-25 19:01:43,141][15401] Updated weights for policy 0, policy_version 946954 (0.0032) [2024-06-25 19:01:43,390][15132] Fps is (10 sec: 42601.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15514910720. Throughput: 0: 42886.8. Samples: 15515007200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:43,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 19:01:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946955_15514910720.pth... [2024-06-25 19:01:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946331_15504687104.pth [2024-06-25 19:01:47,892][15401] Updated weights for policy 0, policy_version 946964 (0.0044) [2024-06-25 19:01:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15515090944. Throughput: 0: 43078.6. Samples: 15515268840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:48,390][15132] Avg episode reward: [(0, '0.780')] [2024-06-25 19:01:50,972][15401] Updated weights for policy 0, policy_version 946974 (0.0035) [2024-06-25 19:01:53,390][15132] Fps is (10 sec: 37683.0, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 15515287552. Throughput: 0: 42867.8. Samples: 15515391060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 19:01:55,482][15401] Updated weights for policy 0, policy_version 946984 (0.0043) [2024-06-25 19:01:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15515533312. Throughput: 0: 42758.7. Samples: 15515646600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:01:58,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 19:01:58,594][15401] Updated weights for policy 0, policy_version 946994 (0.0033) [2024-06-25 19:02:02,947][15401] Updated weights for policy 0, policy_version 947004 (0.0040) [2024-06-25 19:02:03,390][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15515729920. Throughput: 0: 42836.5. Samples: 15515906640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 19:02:06,250][15401] Updated weights for policy 0, policy_version 947014 (0.0047) [2024-06-25 19:02:08,392][15132] Fps is (10 sec: 40950.2, 60 sec: 43142.8, 300 sec: 42542.5). Total num frames: 15515942912. Throughput: 0: 42743.1. Samples: 15516031100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:08,393][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 19:02:10,554][15401] Updated weights for policy 0, policy_version 947024 (0.0039) [2024-06-25 19:02:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15516172288. Throughput: 0: 42671.1. Samples: 15516289560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:13,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 19:02:13,804][15401] Updated weights for policy 0, policy_version 947034 (0.0033) [2024-06-25 19:02:18,389][15132] Fps is (10 sec: 40970.4, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 15516352512. Throughput: 0: 42673.0. Samples: 15516548420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:18,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 19:02:18,412][15401] Updated weights for policy 0, policy_version 947044 (0.0045) [2024-06-25 19:02:21,751][15401] Updated weights for policy 0, policy_version 947054 (0.0032) [2024-06-25 19:02:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42543.0). Total num frames: 15516581888. Throughput: 0: 42531.9. Samples: 15516669420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 19:02:26,091][15401] Updated weights for policy 0, policy_version 947064 (0.0031) [2024-06-25 19:02:28,389][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15516811264. Throughput: 0: 42620.6. Samples: 15516925120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:02:28,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 19:02:29,391][15401] Updated weights for policy 0, policy_version 947074 (0.0037) [2024-06-25 19:02:33,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41779.9, 300 sec: 42542.9). Total num frames: 15516991488. Throughput: 0: 42528.5. Samples: 15517182620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:33,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 19:02:33,632][15401] Updated weights for policy 0, policy_version 947084 (0.0034) [2024-06-25 19:02:36,525][15349] Signal inference workers to stop experience collection... (229650 times) [2024-06-25 19:02:36,526][15349] Signal inference workers to resume experience collection... (229650 times) [2024-06-25 19:02:36,565][15401] InferenceWorker_p0-w0: stopping experience collection (229650 times) [2024-06-25 19:02:36,566][15401] InferenceWorker_p0-w0: resuming experience collection (229650 times) [2024-06-25 19:02:37,361][15401] Updated weights for policy 0, policy_version 947094 (0.0041) [2024-06-25 19:02:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15517220864. Throughput: 0: 42528.8. Samples: 15517304840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:38,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 19:02:41,388][15401] Updated weights for policy 0, policy_version 947104 (0.0040) [2024-06-25 19:02:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15517450240. Throughput: 0: 42578.2. Samples: 15517562620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:43,399][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 19:02:45,214][15401] Updated weights for policy 0, policy_version 947114 (0.0050) [2024-06-25 19:02:48,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15517646848. Throughput: 0: 42518.8. Samples: 15517819980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 19:02:49,015][15401] Updated weights for policy 0, policy_version 947124 (0.0036) [2024-06-25 19:02:52,748][15401] Updated weights for policy 0, policy_version 947134 (0.0028) [2024-06-25 19:02:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 15517859840. Throughput: 0: 42495.7. Samples: 15517943300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:53,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 19:02:56,674][15401] Updated weights for policy 0, policy_version 947144 (0.0025) [2024-06-25 19:02:58,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 15518089216. Throughput: 0: 42642.8. Samples: 15518208760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:02:58,397][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 19:03:00,277][15401] Updated weights for policy 0, policy_version 947154 (0.0033) [2024-06-25 19:03:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15518285824. Throughput: 0: 42466.6. Samples: 15518459420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:03,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 19:03:04,580][15401] Updated weights for policy 0, policy_version 947164 (0.0032) [2024-06-25 19:03:08,211][15401] Updated weights for policy 0, policy_version 947174 (0.0041) [2024-06-25 19:03:08,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 15518498816. Throughput: 0: 42511.7. Samples: 15518582440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:08,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 19:03:12,450][15401] Updated weights for policy 0, policy_version 947184 (0.0041) [2024-06-25 19:03:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15518711808. Throughput: 0: 42559.1. Samples: 15518840280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:13,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 19:03:15,774][15401] Updated weights for policy 0, policy_version 947194 (0.0046) [2024-06-25 19:03:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15518924800. Throughput: 0: 42470.1. Samples: 15519093780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:18,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 19:03:20,192][15401] Updated weights for policy 0, policy_version 947204 (0.0042) [2024-06-25 19:03:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15519137792. Throughput: 0: 42527.1. Samples: 15519218560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:23,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 19:03:23,722][15401] Updated weights for policy 0, policy_version 947214 (0.0035) [2024-06-25 19:03:27,705][15401] Updated weights for policy 0, policy_version 947224 (0.0029) [2024-06-25 19:03:28,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42487.6). Total num frames: 15519334400. Throughput: 0: 42500.0. Samples: 15519475120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:28,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 19:03:31,250][15401] Updated weights for policy 0, policy_version 947234 (0.0032) [2024-06-25 19:03:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15519563776. Throughput: 0: 42504.8. Samples: 15519732700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:33,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 19:03:35,302][15401] Updated weights for policy 0, policy_version 947244 (0.0032) [2024-06-25 19:03:38,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15519793152. Throughput: 0: 42686.6. Samples: 15519864200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:38,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 19:03:39,029][15401] Updated weights for policy 0, policy_version 947254 (0.0044) [2024-06-25 19:03:43,164][15401] Updated weights for policy 0, policy_version 947264 (0.0025) [2024-06-25 19:03:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 15519973376. Throughput: 0: 42369.1. Samples: 15520115100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:43,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 19:03:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947265_15519989760.pth... [2024-06-25 19:03:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946641_15509766144.pth [2024-06-25 19:03:46,781][15401] Updated weights for policy 0, policy_version 947274 (0.0037) [2024-06-25 19:03:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15520219136. Throughput: 0: 42496.9. Samples: 15520371780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:48,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 19:03:50,993][15401] Updated weights for policy 0, policy_version 947284 (0.0039) [2024-06-25 19:03:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15520415744. Throughput: 0: 42731.1. Samples: 15520505340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:53,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 19:03:54,425][15401] Updated weights for policy 0, policy_version 947294 (0.0052) [2024-06-25 19:03:58,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42056.8, 300 sec: 42487.3). Total num frames: 15520612352. Throughput: 0: 42588.0. Samples: 15520756740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:03:58,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 19:03:58,462][15401] Updated weights for policy 0, policy_version 947304 (0.0038) [2024-06-25 19:04:01,999][15401] Updated weights for policy 0, policy_version 947314 (0.0037) [2024-06-25 19:04:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42654.1). Total num frames: 15520858112. Throughput: 0: 42635.6. Samples: 15521012380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:04:03,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 19:04:03,621][15349] Signal inference workers to stop experience collection... (229700 times) [2024-06-25 19:04:03,622][15349] Signal inference workers to resume experience collection... (229700 times) [2024-06-25 19:04:03,652][15401] InferenceWorker_p0-w0: stopping experience collection (229700 times) [2024-06-25 19:04:03,652][15401] InferenceWorker_p0-w0: resuming experience collection (229700 times) [2024-06-25 19:04:06,150][15401] Updated weights for policy 0, policy_version 947324 (0.0032) [2024-06-25 19:04:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15521054720. Throughput: 0: 42814.2. Samples: 15521145200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:04:08,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 19:04:09,649][15401] Updated weights for policy 0, policy_version 947334 (0.0027) [2024-06-25 19:04:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15521267712. Throughput: 0: 42618.3. Samples: 15521392940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:04:13,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 19:04:13,771][15401] Updated weights for policy 0, policy_version 947344 (0.0035) [2024-06-25 19:04:17,161][15401] Updated weights for policy 0, policy_version 947354 (0.0028) [2024-06-25 19:04:18,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15521497088. Throughput: 0: 42575.2. Samples: 15521648580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:04:18,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 19:04:21,455][15401] Updated weights for policy 0, policy_version 947364 (0.0035) [2024-06-25 19:04:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15521710080. Throughput: 0: 42603.5. Samples: 15521781360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 19:04:23,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 19:04:25,011][15401] Updated weights for policy 0, policy_version 947374 (0.0027) [2024-06-25 19:04:28,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 42543.8). Total num frames: 15521890304. Throughput: 0: 42658.7. Samples: 15522034740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:28,398][15132] Avg episode reward: [(0, '0.257')] [2024-06-25 19:04:29,041][15401] Updated weights for policy 0, policy_version 947384 (0.0036) [2024-06-25 19:04:32,841][15401] Updated weights for policy 0, policy_version 947394 (0.0036) [2024-06-25 19:04:33,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15522136064. Throughput: 0: 42490.9. Samples: 15522283880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:33,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 19:04:37,040][15401] Updated weights for policy 0, policy_version 947404 (0.0036) [2024-06-25 19:04:38,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15522332672. Throughput: 0: 42538.7. Samples: 15522419580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 19:04:40,309][15401] Updated weights for policy 0, policy_version 947414 (0.0033) [2024-06-25 19:04:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 15522545664. Throughput: 0: 42664.8. Samples: 15522676660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 19:04:44,541][15401] Updated weights for policy 0, policy_version 947424 (0.0048) [2024-06-25 19:04:47,738][15401] Updated weights for policy 0, policy_version 947434 (0.0030) [2024-06-25 19:04:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15522791424. Throughput: 0: 42756.5. Samples: 15522936420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 19:04:51,933][15401] Updated weights for policy 0, policy_version 947444 (0.0032) [2024-06-25 19:04:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15522988032. Throughput: 0: 42745.7. Samples: 15523068760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:53,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 19:04:55,302][15401] Updated weights for policy 0, policy_version 947454 (0.0030) [2024-06-25 19:04:58,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 15523201024. Throughput: 0: 43011.6. Samples: 15523328460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:04:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 19:04:59,439][15401] Updated weights for policy 0, policy_version 947464 (0.0038) [2024-06-25 19:05:02,931][15401] Updated weights for policy 0, policy_version 947474 (0.0033) [2024-06-25 19:05:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 15523430400. Throughput: 0: 43111.0. Samples: 15523588580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:03,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 19:05:06,854][15401] Updated weights for policy 0, policy_version 947484 (0.0032) [2024-06-25 19:05:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15523627008. Throughput: 0: 43045.9. Samples: 15523718420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:08,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 19:05:10,365][15401] Updated weights for policy 0, policy_version 947494 (0.0036) [2024-06-25 19:05:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15523856384. Throughput: 0: 43154.2. Samples: 15523976680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 19:05:14,260][15401] Updated weights for policy 0, policy_version 947504 (0.0032) [2024-06-25 19:05:17,807][15401] Updated weights for policy 0, policy_version 947514 (0.0036) [2024-06-25 19:05:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15524085760. Throughput: 0: 43318.8. Samples: 15524233220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:18,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 19:05:21,993][15401] Updated weights for policy 0, policy_version 947524 (0.0034) [2024-06-25 19:05:23,392][15132] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 15524265984. Throughput: 0: 43135.9. Samples: 15524360800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:23,393][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 19:05:25,755][15401] Updated weights for policy 0, policy_version 947534 (0.0031) [2024-06-25 19:05:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15524495360. Throughput: 0: 43094.4. Samples: 15524615900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:28,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 19:05:29,999][15401] Updated weights for policy 0, policy_version 947544 (0.0032) [2024-06-25 19:05:33,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15524708352. Throughput: 0: 43036.4. Samples: 15524873060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 19:05:33,440][15401] Updated weights for policy 0, policy_version 947554 (0.0043) [2024-06-25 19:05:37,576][15401] Updated weights for policy 0, policy_version 947564 (0.0033) [2024-06-25 19:05:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15524904960. Throughput: 0: 42846.1. Samples: 15524996840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:38,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 19:05:39,177][15349] Signal inference workers to stop experience collection... (229750 times) [2024-06-25 19:05:39,179][15349] Signal inference workers to resume experience collection... (229750 times) [2024-06-25 19:05:39,224][15401] InferenceWorker_p0-w0: stopping experience collection (229750 times) [2024-06-25 19:05:39,224][15401] InferenceWorker_p0-w0: resuming experience collection (229750 times) [2024-06-25 19:05:41,192][15401] Updated weights for policy 0, policy_version 947574 (0.0031) [2024-06-25 19:05:43,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15525150720. Throughput: 0: 42793.7. Samples: 15525254180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 19:05:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947580_15525150720.pth... [2024-06-25 19:05:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000946955_15514910720.pth [2024-06-25 19:05:45,054][15401] Updated weights for policy 0, policy_version 947584 (0.0030) [2024-06-25 19:05:48,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15525363712. Throughput: 0: 42848.8. Samples: 15525516780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 19:05:48,547][15401] Updated weights for policy 0, policy_version 947594 (0.0036) [2024-06-25 19:05:52,568][15401] Updated weights for policy 0, policy_version 947604 (0.0032) [2024-06-25 19:05:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 15525560320. Throughput: 0: 42746.1. Samples: 15525642000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:53,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 19:05:56,420][15401] Updated weights for policy 0, policy_version 947614 (0.0040) [2024-06-25 19:05:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15525789696. Throughput: 0: 42728.7. Samples: 15525899460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:05:58,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 19:06:00,178][15401] Updated weights for policy 0, policy_version 947624 (0.0034) [2024-06-25 19:06:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15525986304. Throughput: 0: 42764.8. Samples: 15526157640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:06:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 19:06:04,153][15401] Updated weights for policy 0, policy_version 947634 (0.0028) [2024-06-25 19:06:07,677][15401] Updated weights for policy 0, policy_version 947644 (0.0029) [2024-06-25 19:06:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15526215680. Throughput: 0: 42581.0. Samples: 15526276840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:06:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 19:06:12,315][15401] Updated weights for policy 0, policy_version 947654 (0.0033) [2024-06-25 19:06:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15526428672. Throughput: 0: 42874.6. Samples: 15526545260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:06:13,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 19:06:15,307][15401] Updated weights for policy 0, policy_version 947664 (0.0037) [2024-06-25 19:06:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15526625280. Throughput: 0: 42736.4. Samples: 15526796200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-25 19:06:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 19:06:20,054][15401] Updated weights for policy 0, policy_version 947674 (0.0040) [2024-06-25 19:06:23,199][15401] Updated weights for policy 0, policy_version 947684 (0.0031) [2024-06-25 19:06:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43419.3, 300 sec: 42709.5). Total num frames: 15526871040. Throughput: 0: 42793.4. Samples: 15526922540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:23,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 19:06:27,489][15401] Updated weights for policy 0, policy_version 947694 (0.0037) [2024-06-25 19:06:28,390][15132] Fps is (10 sec: 45871.9, 60 sec: 43144.0, 300 sec: 42709.5). Total num frames: 15527084032. Throughput: 0: 42972.7. Samples: 15527187980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:28,391][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 19:06:30,665][15401] Updated weights for policy 0, policy_version 947704 (0.0036) [2024-06-25 19:06:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15527264256. Throughput: 0: 42876.1. Samples: 15527446200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:33,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 19:06:35,372][15401] Updated weights for policy 0, policy_version 947714 (0.0034) [2024-06-25 19:06:38,339][15401] Updated weights for policy 0, policy_version 947724 (0.0024) [2024-06-25 19:06:38,390][15132] Fps is (10 sec: 42601.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15527510016. Throughput: 0: 42771.2. Samples: 15527566700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 19:06:42,906][15401] Updated weights for policy 0, policy_version 947734 (0.0035) [2024-06-25 19:06:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15527706624. Throughput: 0: 42919.5. Samples: 15527830840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 19:06:45,850][15401] Updated weights for policy 0, policy_version 947744 (0.0036) [2024-06-25 19:06:48,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15527919616. Throughput: 0: 42791.2. Samples: 15528083240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 19:06:50,570][15401] Updated weights for policy 0, policy_version 947754 (0.0028) [2024-06-25 19:06:53,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15528148992. Throughput: 0: 42950.1. Samples: 15528209600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:53,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 19:06:53,424][15401] Updated weights for policy 0, policy_version 947764 (0.0039) [2024-06-25 19:06:58,292][15401] Updated weights for policy 0, policy_version 947774 (0.0042) [2024-06-25 19:06:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15528329216. Throughput: 0: 42780.2. Samples: 15528470360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:06:58,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 19:07:01,013][15401] Updated weights for policy 0, policy_version 947784 (0.0034) [2024-06-25 19:07:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15528558592. Throughput: 0: 42948.9. Samples: 15528728900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:03,390][15132] Avg episode reward: [(0, '0.829')] [2024-06-25 19:07:05,922][15401] Updated weights for policy 0, policy_version 947794 (0.0042) [2024-06-25 19:07:07,463][15349] Signal inference workers to stop experience collection... (229800 times) [2024-06-25 19:07:07,463][15349] Signal inference workers to resume experience collection... (229800 times) [2024-06-25 19:07:07,491][15401] InferenceWorker_p0-w0: stopping experience collection (229800 times) [2024-06-25 19:07:07,492][15401] InferenceWorker_p0-w0: resuming experience collection (229800 times) [2024-06-25 19:07:08,390][15132] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 15528804352. Throughput: 0: 43040.0. Samples: 15528859340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:08,392][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 19:07:08,493][15401] Updated weights for policy 0, policy_version 947804 (0.0039) [2024-06-25 19:07:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15528968192. Throughput: 0: 42838.0. Samples: 15529115660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:13,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 19:07:13,573][15401] Updated weights for policy 0, policy_version 947814 (0.0033) [2024-06-25 19:07:16,430][15401] Updated weights for policy 0, policy_version 947824 (0.0040) [2024-06-25 19:07:18,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15529197568. Throughput: 0: 42862.0. Samples: 15529375000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 19:07:21,142][15401] Updated weights for policy 0, policy_version 947834 (0.0022) [2024-06-25 19:07:23,389][15132] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15529443328. Throughput: 0: 43075.2. Samples: 15529505080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:23,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 19:07:23,956][15401] Updated weights for policy 0, policy_version 947844 (0.0035) [2024-06-25 19:07:28,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42051.1, 300 sec: 42764.7). Total num frames: 15529607168. Throughput: 0: 42758.5. Samples: 15529755080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:28,393][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 19:07:28,829][15401] Updated weights for policy 0, policy_version 947854 (0.0032) [2024-06-25 19:07:31,886][15401] Updated weights for policy 0, policy_version 947864 (0.0038) [2024-06-25 19:07:33,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15529836544. Throughput: 0: 42771.1. Samples: 15530007940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:33,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 19:07:36,346][15401] Updated weights for policy 0, policy_version 947874 (0.0031) [2024-06-25 19:07:38,389][15132] Fps is (10 sec: 45886.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15530065920. Throughput: 0: 42999.2. Samples: 15530144560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:38,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 19:07:39,486][15401] Updated weights for policy 0, policy_version 947884 (0.0041) [2024-06-25 19:07:43,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15530246144. Throughput: 0: 42862.5. Samples: 15530399180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:43,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 19:07:43,401][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947891_15530246144.pth... [2024-06-25 19:07:43,483][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947265_15519989760.pth [2024-06-25 19:07:43,925][15401] Updated weights for policy 0, policy_version 947894 (0.0032) [2024-06-25 19:07:47,123][15401] Updated weights for policy 0, policy_version 947904 (0.0035) [2024-06-25 19:07:48,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15530491904. Throughput: 0: 42690.4. Samples: 15530649960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 19:07:51,427][15401] Updated weights for policy 0, policy_version 947914 (0.0028) [2024-06-25 19:07:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 15530704896. Throughput: 0: 42682.7. Samples: 15530780060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:53,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 19:07:55,253][15401] Updated weights for policy 0, policy_version 947924 (0.0035) [2024-06-25 19:07:58,392][15132] Fps is (10 sec: 39311.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 15530885120. Throughput: 0: 42671.9. Samples: 15531036000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:07:58,393][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 19:07:58,963][15401] Updated weights for policy 0, policy_version 947934 (0.0029) [2024-06-25 19:08:03,062][15401] Updated weights for policy 0, policy_version 947944 (0.0041) [2024-06-25 19:08:03,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15531130880. Throughput: 0: 42553.0. Samples: 15531289880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:08:03,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 19:08:06,489][15401] Updated weights for policy 0, policy_version 947954 (0.0024) [2024-06-25 19:08:08,390][15132] Fps is (10 sec: 45886.3, 60 sec: 42325.4, 300 sec: 42820.5). Total num frames: 15531343872. Throughput: 0: 42547.0. Samples: 15531419700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:08:08,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 19:08:10,455][15401] Updated weights for policy 0, policy_version 947964 (0.0032) [2024-06-25 19:08:13,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15531540480. Throughput: 0: 42756.6. Samples: 15531679020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:08:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 19:08:14,373][15401] Updated weights for policy 0, policy_version 947974 (0.0033) [2024-06-25 19:08:17,996][15401] Updated weights for policy 0, policy_version 947984 (0.0041) [2024-06-25 19:08:18,004][15349] Signal inference workers to stop experience collection... (229850 times) [2024-06-25 19:08:18,004][15349] Signal inference workers to resume experience collection... (229850 times) [2024-06-25 19:08:18,048][15401] InferenceWorker_p0-w0: stopping experience collection (229850 times) [2024-06-25 19:08:18,048][15401] InferenceWorker_p0-w0: resuming experience collection (229850 times) [2024-06-25 19:08:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15531786240. Throughput: 0: 42835.6. Samples: 15531935540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-25 19:08:18,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 19:08:21,995][15401] Updated weights for policy 0, policy_version 947994 (0.0029) [2024-06-25 19:08:23,392][15132] Fps is (10 sec: 45864.1, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 15531999232. Throughput: 0: 42757.7. Samples: 15532068760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:23,392][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 19:08:25,475][15401] Updated weights for policy 0, policy_version 948004 (0.0046) [2024-06-25 19:08:28,392][15132] Fps is (10 sec: 40950.8, 60 sec: 43144.6, 300 sec: 42820.2). Total num frames: 15532195840. Throughput: 0: 42790.7. Samples: 15532324860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:28,392][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 19:08:29,579][15401] Updated weights for policy 0, policy_version 948014 (0.0039) [2024-06-25 19:08:32,951][15401] Updated weights for policy 0, policy_version 948024 (0.0035) [2024-06-25 19:08:33,389][15132] Fps is (10 sec: 42608.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15532425216. Throughput: 0: 42759.1. Samples: 15532574120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:33,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-25 19:08:37,641][15401] Updated weights for policy 0, policy_version 948034 (0.0034) [2024-06-25 19:08:38,390][15132] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15532638208. Throughput: 0: 42881.3. Samples: 15532709720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 19:08:40,951][15401] Updated weights for policy 0, policy_version 948044 (0.0040) [2024-06-25 19:08:43,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15532834816. Throughput: 0: 42907.1. Samples: 15532966720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:43,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 19:08:45,293][15401] Updated weights for policy 0, policy_version 948054 (0.0027) [2024-06-25 19:08:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15533064192. Throughput: 0: 42712.5. Samples: 15533211940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:48,390][15132] Avg episode reward: [(0, '0.305')] [2024-06-25 19:08:48,661][15401] Updated weights for policy 0, policy_version 948064 (0.0030) [2024-06-25 19:08:52,959][15401] Updated weights for policy 0, policy_version 948074 (0.0043) [2024-06-25 19:08:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15533260800. Throughput: 0: 42782.6. Samples: 15533344920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 19:08:56,246][15401] Updated weights for policy 0, policy_version 948084 (0.0035) [2024-06-25 19:08:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 15533473792. Throughput: 0: 42719.0. Samples: 15533601380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:08:58,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 19:09:00,514][15401] Updated weights for policy 0, policy_version 948094 (0.0057) [2024-06-25 19:09:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15533703168. Throughput: 0: 42588.0. Samples: 15533852000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 19:09:04,261][15401] Updated weights for policy 0, policy_version 948104 (0.0032) [2024-06-25 19:09:08,258][15401] Updated weights for policy 0, policy_version 948114 (0.0029) [2024-06-25 19:09:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15533899776. Throughput: 0: 42551.6. Samples: 15533983480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 19:09:11,817][15401] Updated weights for policy 0, policy_version 948124 (0.0026) [2024-06-25 19:09:13,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15534096384. Throughput: 0: 42524.8. Samples: 15534238380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:13,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 19:09:15,697][15401] Updated weights for policy 0, policy_version 948134 (0.0030) [2024-06-25 19:09:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15534342144. Throughput: 0: 42623.9. Samples: 15534492200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:18,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 19:09:19,355][15401] Updated weights for policy 0, policy_version 948144 (0.0038) [2024-06-25 19:09:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 15534538752. Throughput: 0: 42726.7. Samples: 15534632420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:23,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 19:09:23,558][15401] Updated weights for policy 0, policy_version 948154 (0.0038) [2024-06-25 19:09:26,942][15401] Updated weights for policy 0, policy_version 948164 (0.0027) [2024-06-25 19:09:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15534751744. Throughput: 0: 42513.9. Samples: 15534879840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:28,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 19:09:31,001][15401] Updated weights for policy 0, policy_version 948174 (0.0046) [2024-06-25 19:09:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15534997504. Throughput: 0: 42717.7. Samples: 15535134240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 19:09:35,074][15401] Updated weights for policy 0, policy_version 948184 (0.0034) [2024-06-25 19:09:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15535177728. Throughput: 0: 42849.8. Samples: 15535273160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:38,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 19:09:38,765][15401] Updated weights for policy 0, policy_version 948194 (0.0031) [2024-06-25 19:09:38,781][15349] Signal inference workers to stop experience collection... (229900 times) [2024-06-25 19:09:38,781][15349] Signal inference workers to resume experience collection... (229900 times) [2024-06-25 19:09:38,802][15401] InferenceWorker_p0-w0: stopping experience collection (229900 times) [2024-06-25 19:09:38,802][15401] InferenceWorker_p0-w0: resuming experience collection (229900 times) [2024-06-25 19:09:42,530][15401] Updated weights for policy 0, policy_version 948204 (0.0033) [2024-06-25 19:09:43,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 15535390720. Throughput: 0: 42786.3. Samples: 15535526860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:43,392][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 19:09:43,400][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948205_15535390720.pth... [2024-06-25 19:09:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947580_15525150720.pth [2024-06-25 19:09:46,391][15401] Updated weights for policy 0, policy_version 948214 (0.0031) [2024-06-25 19:09:48,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 15535652864. Throughput: 0: 42832.4. Samples: 15535779460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:48,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 19:09:49,879][15401] Updated weights for policy 0, policy_version 948224 (0.0042) [2024-06-25 19:09:53,389][15132] Fps is (10 sec: 40970.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 15535800320. Throughput: 0: 43102.6. Samples: 15535923100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 19:09:53,975][15401] Updated weights for policy 0, policy_version 948234 (0.0036) [2024-06-25 19:09:57,635][15401] Updated weights for policy 0, policy_version 948244 (0.0036) [2024-06-25 19:09:58,390][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15536046080. Throughput: 0: 42944.0. Samples: 15536170860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:09:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 19:10:01,594][15401] Updated weights for policy 0, policy_version 948254 (0.0023) [2024-06-25 19:10:03,389][15132] Fps is (10 sec: 49152.0, 60 sec: 43144.7, 300 sec: 42931.6). Total num frames: 15536291840. Throughput: 0: 42969.9. Samples: 15536425840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:10:03,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 19:10:05,350][15401] Updated weights for policy 0, policy_version 948264 (0.0032) [2024-06-25 19:10:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15536455680. Throughput: 0: 42881.8. Samples: 15536562100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:10:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 19:10:09,060][15401] Updated weights for policy 0, policy_version 948274 (0.0025) [2024-06-25 19:10:12,767][15401] Updated weights for policy 0, policy_version 948284 (0.0025) [2024-06-25 19:10:13,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15536701440. Throughput: 0: 43048.8. Samples: 15536817040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-25 19:10:13,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 19:10:16,803][15401] Updated weights for policy 0, policy_version 948294 (0.0027) [2024-06-25 19:10:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42876.5). Total num frames: 15536914432. Throughput: 0: 42943.3. Samples: 15537066680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:18,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 19:10:20,315][15401] Updated weights for policy 0, policy_version 948304 (0.0042) [2024-06-25 19:10:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15537111040. Throughput: 0: 42632.4. Samples: 15537191620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:23,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 19:10:24,480][15401] Updated weights for policy 0, policy_version 948314 (0.0032) [2024-06-25 19:10:28,247][15401] Updated weights for policy 0, policy_version 948324 (0.0041) [2024-06-25 19:10:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15537340416. Throughput: 0: 42585.8. Samples: 15537443120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:28,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 19:10:32,256][15401] Updated weights for policy 0, policy_version 948334 (0.0026) [2024-06-25 19:10:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15537537024. Throughput: 0: 42681.9. Samples: 15537700140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:33,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 19:10:35,706][15401] Updated weights for policy 0, policy_version 948344 (0.0027) [2024-06-25 19:10:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15537733632. Throughput: 0: 42292.7. Samples: 15537826280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:38,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 19:10:40,190][15401] Updated weights for policy 0, policy_version 948354 (0.0047) [2024-06-25 19:10:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 15537979392. Throughput: 0: 42559.6. Samples: 15538086040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 19:10:43,495][15401] Updated weights for policy 0, policy_version 948364 (0.0039) [2024-06-25 19:10:47,793][15401] Updated weights for policy 0, policy_version 948374 (0.0030) [2024-06-25 19:10:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 15538176000. Throughput: 0: 42384.9. Samples: 15538333160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:48,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 19:10:50,913][15349] Signal inference workers to stop experience collection... (229950 times) [2024-06-25 19:10:50,914][15349] Signal inference workers to resume experience collection... (229950 times) [2024-06-25 19:10:50,960][15401] InferenceWorker_p0-w0: stopping experience collection (229950 times) [2024-06-25 19:10:50,960][15401] InferenceWorker_p0-w0: resuming experience collection (229950 times) [2024-06-25 19:10:51,530][15401] Updated weights for policy 0, policy_version 948384 (0.0032) [2024-06-25 19:10:53,396][15132] Fps is (10 sec: 39296.4, 60 sec: 42866.8, 300 sec: 42653.0). Total num frames: 15538372608. Throughput: 0: 42121.5. Samples: 15538457840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:53,396][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 19:10:55,423][15401] Updated weights for policy 0, policy_version 948394 (0.0027) [2024-06-25 19:10:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15538618368. Throughput: 0: 42273.8. Samples: 15538719360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:10:58,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 19:10:59,310][15401] Updated weights for policy 0, policy_version 948404 (0.0034) [2024-06-25 19:11:03,076][15401] Updated weights for policy 0, policy_version 948414 (0.0036) [2024-06-25 19:11:03,390][15132] Fps is (10 sec: 45904.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15538831360. Throughput: 0: 42420.3. Samples: 15538975600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 19:11:07,130][15401] Updated weights for policy 0, policy_version 948424 (0.0035) [2024-06-25 19:11:08,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15539011584. Throughput: 0: 42476.1. Samples: 15539103040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:08,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 19:11:10,720][15401] Updated weights for policy 0, policy_version 948434 (0.0026) [2024-06-25 19:11:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15539257344. Throughput: 0: 42690.2. Samples: 15539364180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:13,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 19:11:14,785][15401] Updated weights for policy 0, policy_version 948444 (0.0032) [2024-06-25 19:11:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15539453952. Throughput: 0: 42652.9. Samples: 15539619520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 19:11:18,403][15401] Updated weights for policy 0, policy_version 948454 (0.0027) [2024-06-25 19:11:22,678][15401] Updated weights for policy 0, policy_version 948464 (0.0038) [2024-06-25 19:11:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.5). Total num frames: 15539650560. Throughput: 0: 42571.2. Samples: 15539741980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:23,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 19:11:26,478][15401] Updated weights for policy 0, policy_version 948474 (0.0036) [2024-06-25 19:11:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15539896320. Throughput: 0: 42444.9. Samples: 15539996060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:28,390][15132] Avg episode reward: [(0, '0.870')] [2024-06-25 19:11:30,711][15401] Updated weights for policy 0, policy_version 948484 (0.0027) [2024-06-25 19:11:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15540076544. Throughput: 0: 42658.2. Samples: 15540252780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:33,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 19:11:34,197][15401] Updated weights for policy 0, policy_version 948494 (0.0033) [2024-06-25 19:11:38,266][15401] Updated weights for policy 0, policy_version 948504 (0.0033) [2024-06-25 19:11:38,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15540289536. Throughput: 0: 42635.3. Samples: 15540376160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:38,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 19:11:41,964][15401] Updated weights for policy 0, policy_version 948514 (0.0034) [2024-06-25 19:11:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15540518912. Throughput: 0: 42487.1. Samples: 15540631280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:43,396][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 19:11:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948518_15540518912.pth... [2024-06-25 19:11:43,505][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000947891_15530246144.pth [2024-06-25 19:11:45,985][15401] Updated weights for policy 0, policy_version 948524 (0.0026) [2024-06-25 19:11:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15540715520. Throughput: 0: 42505.8. Samples: 15540888360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:48,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 19:11:49,477][15401] Updated weights for policy 0, policy_version 948534 (0.0040) [2024-06-25 19:11:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 15540928512. Throughput: 0: 42414.1. Samples: 15541011680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:53,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 19:11:53,574][15401] Updated weights for policy 0, policy_version 948544 (0.0043) [2024-06-25 19:11:57,047][15401] Updated weights for policy 0, policy_version 948554 (0.0029) [2024-06-25 19:11:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15541174272. Throughput: 0: 42355.2. Samples: 15541270160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:11:58,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 19:12:01,286][15401] Updated weights for policy 0, policy_version 948564 (0.0032) [2024-06-25 19:12:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 15541354496. Throughput: 0: 42448.1. Samples: 15541529680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:12:03,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 19:12:04,663][15401] Updated weights for policy 0, policy_version 948574 (0.0038) [2024-06-25 19:12:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15541567488. Throughput: 0: 42474.7. Samples: 15541653340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-25 19:12:08,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 19:12:09,162][15401] Updated weights for policy 0, policy_version 948584 (0.0040) [2024-06-25 19:12:12,238][15401] Updated weights for policy 0, policy_version 948594 (0.0028) [2024-06-25 19:12:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15541813248. Throughput: 0: 42411.9. Samples: 15541904600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:13,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 19:12:17,068][15401] Updated weights for policy 0, policy_version 948604 (0.0031) [2024-06-25 19:12:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15541993472. Throughput: 0: 42538.3. Samples: 15542167000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:18,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 19:12:20,317][15401] Updated weights for policy 0, policy_version 948614 (0.0041) [2024-06-25 19:12:23,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 15542206464. Throughput: 0: 42445.1. Samples: 15542286180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:23,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 19:12:24,671][15401] Updated weights for policy 0, policy_version 948624 (0.0043) [2024-06-25 19:12:27,864][15401] Updated weights for policy 0, policy_version 948634 (0.0035) [2024-06-25 19:12:28,216][15349] Signal inference workers to stop experience collection... (230000 times) [2024-06-25 19:12:28,217][15349] Signal inference workers to resume experience collection... (230000 times) [2024-06-25 19:12:28,264][15401] InferenceWorker_p0-w0: stopping experience collection (230000 times) [2024-06-25 19:12:28,264][15401] InferenceWorker_p0-w0: resuming experience collection (230000 times) [2024-06-25 19:12:28,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15542452224. Throughput: 0: 42567.6. Samples: 15542546920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:28,392][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:12:32,611][15401] Updated weights for policy 0, policy_version 948644 (0.0028) [2024-06-25 19:12:33,390][15132] Fps is (10 sec: 42594.3, 60 sec: 42597.8, 300 sec: 42598.3). Total num frames: 15542632448. Throughput: 0: 42530.7. Samples: 15542802280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:33,391][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 19:12:35,413][15401] Updated weights for policy 0, policy_version 948654 (0.0025) [2024-06-25 19:12:38,390][15132] Fps is (10 sec: 40968.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15542861824. Throughput: 0: 42571.8. Samples: 15542927420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 19:12:40,355][15401] Updated weights for policy 0, policy_version 948664 (0.0029) [2024-06-25 19:12:43,297][15401] Updated weights for policy 0, policy_version 948674 (0.0038) [2024-06-25 19:12:43,390][15132] Fps is (10 sec: 44240.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15543074816. Throughput: 0: 42482.9. Samples: 15543181900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:43,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 19:12:48,389][15132] Fps is (10 sec: 37684.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 15543238656. Throughput: 0: 42591.1. Samples: 15543446280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 19:12:48,395][15401] Updated weights for policy 0, policy_version 948684 (0.0033) [2024-06-25 19:12:51,023][15401] Updated weights for policy 0, policy_version 948694 (0.0024) [2024-06-25 19:12:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15543500800. Throughput: 0: 42337.8. Samples: 15543558540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 19:12:55,823][15401] Updated weights for policy 0, policy_version 948704 (0.0028) [2024-06-25 19:12:58,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15543697408. Throughput: 0: 42460.0. Samples: 15543815300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:12:58,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 19:12:58,767][15401] Updated weights for policy 0, policy_version 948714 (0.0029) [2024-06-25 19:13:03,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15543877632. Throughput: 0: 42416.5. Samples: 15544075740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:03,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 19:13:03,425][15401] Updated weights for policy 0, policy_version 948724 (0.0042) [2024-06-25 19:13:06,218][15401] Updated weights for policy 0, policy_version 948734 (0.0031) [2024-06-25 19:13:08,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15544123392. Throughput: 0: 42452.0. Samples: 15544196520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 19:13:10,993][15401] Updated weights for policy 0, policy_version 948744 (0.0037) [2024-06-25 19:13:13,392][15132] Fps is (10 sec: 45863.7, 60 sec: 42050.6, 300 sec: 42542.5). Total num frames: 15544336384. Throughput: 0: 42541.7. Samples: 15544461300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:13,393][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 19:13:13,844][15401] Updated weights for policy 0, policy_version 948754 (0.0029) [2024-06-25 19:13:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 15544532992. Throughput: 0: 42520.0. Samples: 15544715640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:18,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 19:13:18,505][15401] Updated weights for policy 0, policy_version 948764 (0.0046) [2024-06-25 19:13:21,431][15401] Updated weights for policy 0, policy_version 948774 (0.0053) [2024-06-25 19:13:23,392][15132] Fps is (10 sec: 44237.3, 60 sec: 42869.7, 300 sec: 42653.9). Total num frames: 15544778752. Throughput: 0: 42473.6. Samples: 15544838820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:23,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 19:13:26,142][15401] Updated weights for policy 0, policy_version 948784 (0.0037) [2024-06-25 19:13:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42054.0, 300 sec: 42542.9). Total num frames: 15544975360. Throughput: 0: 42653.5. Samples: 15545101300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:28,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 19:13:29,379][15401] Updated weights for policy 0, policy_version 948794 (0.0031) [2024-06-25 19:13:33,390][15132] Fps is (10 sec: 37691.8, 60 sec: 42052.9, 300 sec: 42431.8). Total num frames: 15545155584. Throughput: 0: 42461.6. Samples: 15545357060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:33,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 19:13:33,963][15401] Updated weights for policy 0, policy_version 948804 (0.0029) [2024-06-25 19:13:35,654][15349] Signal inference workers to stop experience collection... (230050 times) [2024-06-25 19:13:35,655][15349] Signal inference workers to resume experience collection... (230050 times) [2024-06-25 19:13:35,685][15401] InferenceWorker_p0-w0: stopping experience collection (230050 times) [2024-06-25 19:13:35,686][15401] InferenceWorker_p0-w0: resuming experience collection (230050 times) [2024-06-25 19:13:36,937][15401] Updated weights for policy 0, policy_version 948814 (0.0042) [2024-06-25 19:13:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15545417728. Throughput: 0: 42648.5. Samples: 15545477720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:38,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 19:13:41,450][15401] Updated weights for policy 0, policy_version 948824 (0.0041) [2024-06-25 19:13:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 15545614336. Throughput: 0: 42793.6. Samples: 15545741020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:43,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 19:13:43,453][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948830_15545630720.pth... [2024-06-25 19:13:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948205_15535390720.pth [2024-06-25 19:13:44,579][15401] Updated weights for policy 0, policy_version 948834 (0.0035) [2024-06-25 19:13:48,390][15132] Fps is (10 sec: 39318.6, 60 sec: 42870.8, 300 sec: 42542.8). Total num frames: 15545810944. Throughput: 0: 42755.2. Samples: 15545999760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:48,391][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 19:13:49,239][15401] Updated weights for policy 0, policy_version 948844 (0.0030) [2024-06-25 19:13:52,162][15401] Updated weights for policy 0, policy_version 948854 (0.0045) [2024-06-25 19:13:53,389][15132] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15546056704. Throughput: 0: 42904.9. Samples: 15546127240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 19:13:57,078][15401] Updated weights for policy 0, policy_version 948864 (0.0055) [2024-06-25 19:13:58,389][15132] Fps is (10 sec: 44240.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15546253312. Throughput: 0: 42832.2. Samples: 15546388640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:13:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 19:14:00,080][15401] Updated weights for policy 0, policy_version 948874 (0.0032) [2024-06-25 19:14:03,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15546449920. Throughput: 0: 42758.2. Samples: 15546639760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:14:03,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 19:14:04,613][15401] Updated weights for policy 0, policy_version 948884 (0.0037) [2024-06-25 19:14:07,591][15401] Updated weights for policy 0, policy_version 948894 (0.0038) [2024-06-25 19:14:08,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15546695680. Throughput: 0: 42864.0. Samples: 15546767600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:14:08,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 19:14:12,238][15401] Updated weights for policy 0, policy_version 948904 (0.0031) [2024-06-25 19:14:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 15546875904. Throughput: 0: 42882.7. Samples: 15547031020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:13,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 19:14:15,150][15401] Updated weights for policy 0, policy_version 948914 (0.0031) [2024-06-25 19:14:18,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15547088896. Throughput: 0: 42840.8. Samples: 15547284900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 19:14:19,871][15401] Updated weights for policy 0, policy_version 948924 (0.0034) [2024-06-25 19:14:23,260][15401] Updated weights for policy 0, policy_version 948934 (0.0039) [2024-06-25 19:14:23,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 15547351040. Throughput: 0: 42848.5. Samples: 15547405900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:23,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 19:14:27,328][15401] Updated weights for policy 0, policy_version 948944 (0.0026) [2024-06-25 19:14:28,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15547531264. Throughput: 0: 42800.5. Samples: 15547667040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 19:14:30,744][15401] Updated weights for policy 0, policy_version 948954 (0.0032) [2024-06-25 19:14:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15547744256. Throughput: 0: 42678.6. Samples: 15547920260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 19:14:35,171][15401] Updated weights for policy 0, policy_version 948964 (0.0033) [2024-06-25 19:14:38,370][15401] Updated weights for policy 0, policy_version 948974 (0.0037) [2024-06-25 19:14:38,396][15132] Fps is (10 sec: 45846.5, 60 sec: 42866.9, 300 sec: 42708.9). Total num frames: 15547990016. Throughput: 0: 42691.7. Samples: 15548048640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:38,396][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 19:14:42,739][15401] Updated weights for policy 0, policy_version 948984 (0.0034) [2024-06-25 19:14:43,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15548186624. Throughput: 0: 42548.2. Samples: 15548303320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:43,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 19:14:46,312][15401] Updated weights for policy 0, policy_version 948994 (0.0038) [2024-06-25 19:14:48,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42872.0, 300 sec: 42653.9). Total num frames: 15548383232. Throughput: 0: 42607.1. Samples: 15548557080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:48,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 19:14:50,643][15401] Updated weights for policy 0, policy_version 949004 (0.0036) [2024-06-25 19:14:52,223][15349] Signal inference workers to stop experience collection... (230100 times) [2024-06-25 19:14:52,223][15349] Signal inference workers to resume experience collection... (230100 times) [2024-06-25 19:14:52,263][15401] InferenceWorker_p0-w0: stopping experience collection (230100 times) [2024-06-25 19:14:52,263][15401] InferenceWorker_p0-w0: resuming experience collection (230100 times) [2024-06-25 19:14:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15548628992. Throughput: 0: 42677.8. Samples: 15548688100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:53,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 19:14:53,763][15401] Updated weights for policy 0, policy_version 949014 (0.0030) [2024-06-25 19:14:58,128][15401] Updated weights for policy 0, policy_version 949024 (0.0035) [2024-06-25 19:14:58,391][15132] Fps is (10 sec: 44231.4, 60 sec: 42870.5, 300 sec: 42487.1). Total num frames: 15548825600. Throughput: 0: 42569.0. Samples: 15548946680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:14:58,391][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 19:15:01,565][15401] Updated weights for policy 0, policy_version 949034 (0.0032) [2024-06-25 19:15:03,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15549038592. Throughput: 0: 42466.3. Samples: 15549195880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:03,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 19:15:05,630][15401] Updated weights for policy 0, policy_version 949044 (0.0030) [2024-06-25 19:15:08,390][15132] Fps is (10 sec: 42603.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15549251584. Throughput: 0: 42721.7. Samples: 15549328380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:08,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 19:15:09,156][15401] Updated weights for policy 0, policy_version 949054 (0.0042) [2024-06-25 19:15:13,354][15401] Updated weights for policy 0, policy_version 949064 (0.0030) [2024-06-25 19:15:13,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 15549464576. Throughput: 0: 42706.7. Samples: 15549588840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 19:15:17,160][15401] Updated weights for policy 0, policy_version 949074 (0.0031) [2024-06-25 19:15:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 15549677568. Throughput: 0: 42642.2. Samples: 15549839160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:18,390][15132] Avg episode reward: [(0, '0.367')] [2024-06-25 19:15:20,753][15401] Updated weights for policy 0, policy_version 949084 (0.0033) [2024-06-25 19:15:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15549890560. Throughput: 0: 42708.2. Samples: 15549970240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:23,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 19:15:24,519][15401] Updated weights for policy 0, policy_version 949094 (0.0029) [2024-06-25 19:15:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15550103552. Throughput: 0: 42820.1. Samples: 15550230220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:28,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 19:15:28,530][15401] Updated weights for policy 0, policy_version 949104 (0.0034) [2024-06-25 19:15:32,556][15401] Updated weights for policy 0, policy_version 949114 (0.0036) [2024-06-25 19:15:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15550316544. Throughput: 0: 42727.1. Samples: 15550479800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:33,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 19:15:36,170][15401] Updated weights for policy 0, policy_version 949124 (0.0021) [2024-06-25 19:15:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42056.8, 300 sec: 42487.3). Total num frames: 15550513152. Throughput: 0: 42628.1. Samples: 15550606360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:38,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 19:15:40,102][15401] Updated weights for policy 0, policy_version 949134 (0.0027) [2024-06-25 19:15:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15550742528. Throughput: 0: 42696.7. Samples: 15550867980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:43,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 19:15:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949143_15550758912.pth... [2024-06-25 19:15:43,569][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948518_15540518912.pth [2024-06-25 19:15:43,712][15401] Updated weights for policy 0, policy_version 949144 (0.0025) [2024-06-25 19:15:47,719][15401] Updated weights for policy 0, policy_version 949154 (0.0039) [2024-06-25 19:15:48,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 15550955520. Throughput: 0: 42746.3. Samples: 15551119460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 19:15:51,541][15401] Updated weights for policy 0, policy_version 949164 (0.0034) [2024-06-25 19:15:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15551152128. Throughput: 0: 42620.5. Samples: 15551246300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:53,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 19:15:55,315][15401] Updated weights for policy 0, policy_version 949174 (0.0033) [2024-06-25 19:15:58,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42599.3, 300 sec: 42542.9). Total num frames: 15551381504. Throughput: 0: 42680.5. Samples: 15551509460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:15:58,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 19:15:59,303][15401] Updated weights for policy 0, policy_version 949184 (0.0032) [2024-06-25 19:16:02,860][15401] Updated weights for policy 0, policy_version 949194 (0.0033) [2024-06-25 19:16:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15551594496. Throughput: 0: 42792.8. Samples: 15551764840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:16:03,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 19:16:06,989][15401] Updated weights for policy 0, policy_version 949204 (0.0027) [2024-06-25 19:16:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15551791104. Throughput: 0: 42795.2. Samples: 15551896020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 19:16:10,819][15401] Updated weights for policy 0, policy_version 949214 (0.0046) [2024-06-25 19:16:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15552036864. Throughput: 0: 42715.9. Samples: 15552152440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:13,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 19:16:14,672][15401] Updated weights for policy 0, policy_version 949224 (0.0033) [2024-06-25 19:16:18,291][15401] Updated weights for policy 0, policy_version 949234 (0.0022) [2024-06-25 19:16:18,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15552249856. Throughput: 0: 42823.0. Samples: 15552406840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:18,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 19:16:22,160][15401] Updated weights for policy 0, policy_version 949244 (0.0046) [2024-06-25 19:16:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15552446464. Throughput: 0: 42883.5. Samples: 15552536120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:23,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 19:16:25,674][15401] Updated weights for policy 0, policy_version 949254 (0.0033) [2024-06-25 19:16:28,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15552675840. Throughput: 0: 42871.7. Samples: 15552797200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:28,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 19:16:29,674][15401] Updated weights for policy 0, policy_version 949264 (0.0048) [2024-06-25 19:16:33,270][15401] Updated weights for policy 0, policy_version 949274 (0.0036) [2024-06-25 19:16:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15552905216. Throughput: 0: 43017.3. Samples: 15553055240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 19:16:35,650][15349] Signal inference workers to stop experience collection... (230150 times) [2024-06-25 19:16:35,657][15349] Signal inference workers to resume experience collection... (230150 times) [2024-06-25 19:16:35,694][15401] InferenceWorker_p0-w0: stopping experience collection (230150 times) [2024-06-25 19:16:35,694][15401] InferenceWorker_p0-w0: resuming experience collection (230150 times) [2024-06-25 19:16:37,583][15401] Updated weights for policy 0, policy_version 949284 (0.0043) [2024-06-25 19:16:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15553101824. Throughput: 0: 42995.5. Samples: 15553181100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:38,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 19:16:40,916][15401] Updated weights for policy 0, policy_version 949294 (0.0034) [2024-06-25 19:16:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15553331200. Throughput: 0: 42896.0. Samples: 15553439780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:43,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 19:16:45,253][15401] Updated weights for policy 0, policy_version 949304 (0.0034) [2024-06-25 19:16:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15553511424. Throughput: 0: 42865.0. Samples: 15553693760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:48,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 19:16:49,130][15401] Updated weights for policy 0, policy_version 949314 (0.0032) [2024-06-25 19:16:52,802][15401] Updated weights for policy 0, policy_version 949324 (0.0029) [2024-06-25 19:16:53,392][15132] Fps is (10 sec: 40950.3, 60 sec: 43142.8, 300 sec: 42598.1). Total num frames: 15553740800. Throughput: 0: 42685.7. Samples: 15553816980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:53,392][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 19:16:56,577][15401] Updated weights for policy 0, policy_version 949334 (0.0036) [2024-06-25 19:16:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15553970176. Throughput: 0: 42650.3. Samples: 15554071700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:16:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 19:17:00,511][15401] Updated weights for policy 0, policy_version 949344 (0.0043) [2024-06-25 19:17:03,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15554150400. Throughput: 0: 42890.3. Samples: 15554336900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:03,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 19:17:04,251][15401] Updated weights for policy 0, policy_version 949354 (0.0047) [2024-06-25 19:17:08,377][15401] Updated weights for policy 0, policy_version 949364 (0.0041) [2024-06-25 19:17:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15554379776. Throughput: 0: 42696.7. Samples: 15554457480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:08,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 19:17:11,841][15401] Updated weights for policy 0, policy_version 949374 (0.0029) [2024-06-25 19:17:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15554592768. Throughput: 0: 42567.9. Samples: 15554712760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:13,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 19:17:16,315][15401] Updated weights for policy 0, policy_version 949384 (0.0044) [2024-06-25 19:17:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15554789376. Throughput: 0: 42524.0. Samples: 15554968820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:18,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 19:17:19,922][15401] Updated weights for policy 0, policy_version 949394 (0.0041) [2024-06-25 19:17:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 15555018752. Throughput: 0: 42467.6. Samples: 15555092140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:23,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 19:17:23,994][15401] Updated weights for policy 0, policy_version 949404 (0.0042) [2024-06-25 19:17:27,693][15401] Updated weights for policy 0, policy_version 949414 (0.0044) [2024-06-25 19:17:28,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 15555248128. Throughput: 0: 42368.9. Samples: 15555346380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:28,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 19:17:31,774][15401] Updated weights for policy 0, policy_version 949424 (0.0047) [2024-06-25 19:17:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15555428352. Throughput: 0: 42463.4. Samples: 15555604620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:33,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 19:17:35,385][15401] Updated weights for policy 0, policy_version 949434 (0.0038) [2024-06-25 19:17:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15555641344. Throughput: 0: 42409.3. Samples: 15555725300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:38,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 19:17:39,312][15401] Updated weights for policy 0, policy_version 949444 (0.0030) [2024-06-25 19:17:42,969][15401] Updated weights for policy 0, policy_version 949454 (0.0034) [2024-06-25 19:17:43,392][15132] Fps is (10 sec: 44227.0, 60 sec: 42323.7, 300 sec: 42820.2). Total num frames: 15555870720. Throughput: 0: 42617.0. Samples: 15555989560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:43,392][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 19:17:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949455_15555870720.pth... [2024-06-25 19:17:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000948830_15545630720.pth [2024-06-25 19:17:47,170][15401] Updated weights for policy 0, policy_version 949464 (0.0035) [2024-06-25 19:17:48,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15556067328. Throughput: 0: 42191.1. Samples: 15556235500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:48,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 19:17:50,737][15401] Updated weights for policy 0, policy_version 949474 (0.0034) [2024-06-25 19:17:53,390][15132] Fps is (10 sec: 42607.7, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 15556296704. Throughput: 0: 42358.2. Samples: 15556363600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:53,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 19:17:54,698][15401] Updated weights for policy 0, policy_version 949484 (0.0025) [2024-06-25 19:17:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15556493312. Throughput: 0: 42448.8. Samples: 15556622960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:17:58,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 19:17:58,563][15401] Updated weights for policy 0, policy_version 949494 (0.0038) [2024-06-25 19:18:02,370][15401] Updated weights for policy 0, policy_version 949504 (0.0025) [2024-06-25 19:18:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15556722688. Throughput: 0: 42466.3. Samples: 15556879800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 19:18:03,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 19:18:06,097][15401] Updated weights for policy 0, policy_version 949514 (0.0035) [2024-06-25 19:18:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 15556935680. Throughput: 0: 42637.0. Samples: 15557010800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 19:18:10,036][15401] Updated weights for policy 0, policy_version 949524 (0.0031) [2024-06-25 19:18:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15557132288. Throughput: 0: 42633.4. Samples: 15557264880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:13,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 19:18:13,906][15401] Updated weights for policy 0, policy_version 949534 (0.0033) [2024-06-25 19:18:14,057][15349] Signal inference workers to stop experience collection... (230200 times) [2024-06-25 19:18:14,064][15349] Signal inference workers to resume experience collection... (230200 times) [2024-06-25 19:18:14,087][15401] InferenceWorker_p0-w0: stopping experience collection (230200 times) [2024-06-25 19:18:14,092][15401] InferenceWorker_p0-w0: resuming experience collection (230200 times) [2024-06-25 19:18:17,688][15401] Updated weights for policy 0, policy_version 949544 (0.0028) [2024-06-25 19:18:18,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15557361664. Throughput: 0: 42648.0. Samples: 15557523780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:18,390][15132] Avg episode reward: [(0, '0.216')] [2024-06-25 19:18:21,466][15401] Updated weights for policy 0, policy_version 949554 (0.0035) [2024-06-25 19:18:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15557558272. Throughput: 0: 42796.5. Samples: 15557651140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 19:18:25,386][15401] Updated weights for policy 0, policy_version 949564 (0.0034) [2024-06-25 19:18:28,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15557771264. Throughput: 0: 42428.8. Samples: 15557898760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:28,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 19:18:29,289][15401] Updated weights for policy 0, policy_version 949574 (0.0034) [2024-06-25 19:18:33,386][15401] Updated weights for policy 0, policy_version 949584 (0.0036) [2024-06-25 19:18:33,391][15132] Fps is (10 sec: 42593.7, 60 sec: 42597.7, 300 sec: 42598.2). Total num frames: 15557984256. Throughput: 0: 42722.7. Samples: 15558158060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:33,391][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 19:18:37,034][15401] Updated weights for policy 0, policy_version 949594 (0.0032) [2024-06-25 19:18:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15558197248. Throughput: 0: 42726.3. Samples: 15558286280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:38,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 19:18:40,827][15401] Updated weights for policy 0, policy_version 949604 (0.0044) [2024-06-25 19:18:43,390][15132] Fps is (10 sec: 42602.9, 60 sec: 42327.0, 300 sec: 42709.6). Total num frames: 15558410240. Throughput: 0: 42568.5. Samples: 15558538540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:43,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 19:18:44,554][15401] Updated weights for policy 0, policy_version 949614 (0.0039) [2024-06-25 19:18:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15558623232. Throughput: 0: 42632.4. Samples: 15558798260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:48,390][15132] Avg episode reward: [(0, '0.396')] [2024-06-25 19:18:48,545][15401] Updated weights for policy 0, policy_version 949624 (0.0037) [2024-06-25 19:18:52,590][15401] Updated weights for policy 0, policy_version 949634 (0.0038) [2024-06-25 19:18:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 15558836224. Throughput: 0: 42455.5. Samples: 15558921300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:53,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 19:18:56,435][15401] Updated weights for policy 0, policy_version 949644 (0.0034) [2024-06-25 19:18:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15559065600. Throughput: 0: 42435.1. Samples: 15559174460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:18:58,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 19:19:00,084][15401] Updated weights for policy 0, policy_version 949654 (0.0038) [2024-06-25 19:19:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15559245824. Throughput: 0: 42511.7. Samples: 15559436800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:03,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:19:04,088][15401] Updated weights for policy 0, policy_version 949664 (0.0029) [2024-06-25 19:19:07,558][15401] Updated weights for policy 0, policy_version 949674 (0.0031) [2024-06-25 19:19:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15559491584. Throughput: 0: 42355.0. Samples: 15559557120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:08,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 19:19:11,856][15401] Updated weights for policy 0, policy_version 949684 (0.0034) [2024-06-25 19:19:13,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15559720960. Throughput: 0: 42741.0. Samples: 15559822100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:13,390][15132] Avg episode reward: [(0, '0.102')] [2024-06-25 19:19:14,978][15401] Updated weights for policy 0, policy_version 949694 (0.0033) [2024-06-25 19:19:18,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 15559884800. Throughput: 0: 42699.7. Samples: 15560079500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:18,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 19:19:19,668][15401] Updated weights for policy 0, policy_version 949704 (0.0025) [2024-06-25 19:19:23,181][15401] Updated weights for policy 0, policy_version 949714 (0.0032) [2024-06-25 19:19:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15560130560. Throughput: 0: 42437.3. Samples: 15560195960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:23,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:19:27,189][15401] Updated weights for policy 0, policy_version 949724 (0.0042) [2024-06-25 19:19:27,288][15349] Signal inference workers to stop experience collection... (230250 times) [2024-06-25 19:19:27,331][15401] InferenceWorker_p0-w0: stopping experience collection (230250 times) [2024-06-25 19:19:27,351][15349] Signal inference workers to resume experience collection... (230250 times) [2024-06-25 19:19:27,356][15401] InferenceWorker_p0-w0: resuming experience collection (230250 times) [2024-06-25 19:19:28,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15560343552. Throughput: 0: 42723.9. Samples: 15560461120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:28,391][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 19:19:30,716][15401] Updated weights for policy 0, policy_version 949734 (0.0033) [2024-06-25 19:19:33,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42053.1, 300 sec: 42432.7). Total num frames: 15560507392. Throughput: 0: 42664.5. Samples: 15560718160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:33,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 19:19:34,738][15401] Updated weights for policy 0, policy_version 949744 (0.0030) [2024-06-25 19:19:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15560753152. Throughput: 0: 42584.0. Samples: 15560837580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:38,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 19:19:38,769][15401] Updated weights for policy 0, policy_version 949754 (0.0040) [2024-06-25 19:19:42,430][15401] Updated weights for policy 0, policy_version 949764 (0.0032) [2024-06-25 19:19:43,392][15132] Fps is (10 sec: 49139.6, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15560998912. Throughput: 0: 42811.0. Samples: 15561101060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:43,393][15132] Avg episode reward: [(0, '0.260')] [2024-06-25 19:19:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949768_15560998912.pth... [2024-06-25 19:19:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949143_15550758912.pth [2024-06-25 19:19:46,292][15401] Updated weights for policy 0, policy_version 949774 (0.0038) [2024-06-25 19:19:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15561162752. Throughput: 0: 42635.2. Samples: 15561355380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 19:19:49,973][15401] Updated weights for policy 0, policy_version 949784 (0.0027) [2024-06-25 19:19:53,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42871.4, 300 sec: 42654.1). Total num frames: 15561408512. Throughput: 0: 42675.5. Samples: 15561477520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:53,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 19:19:53,767][15401] Updated weights for policy 0, policy_version 949794 (0.0033) [2024-06-25 19:19:57,497][15401] Updated weights for policy 0, policy_version 949804 (0.0034) [2024-06-25 19:19:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15561621504. Throughput: 0: 42688.0. Samples: 15561743060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:19:58,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 19:20:01,311][15401] Updated weights for policy 0, policy_version 949814 (0.0034) [2024-06-25 19:20:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15561818112. Throughput: 0: 42553.7. Samples: 15561994420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:03,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 19:20:05,399][15401] Updated weights for policy 0, policy_version 949824 (0.0038) [2024-06-25 19:20:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15562031104. Throughput: 0: 42617.8. Samples: 15562113760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:08,390][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 19:20:08,954][15401] Updated weights for policy 0, policy_version 949834 (0.0039) [2024-06-25 19:20:13,234][15401] Updated weights for policy 0, policy_version 949844 (0.0035) [2024-06-25 19:20:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15562260480. Throughput: 0: 42421.8. Samples: 15562370100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:13,400][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 19:20:16,806][15401] Updated weights for policy 0, policy_version 949854 (0.0034) [2024-06-25 19:20:18,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15562440704. Throughput: 0: 42308.7. Samples: 15562622060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:18,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 19:20:20,999][15401] Updated weights for policy 0, policy_version 949864 (0.0027) [2024-06-25 19:20:23,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15562686464. Throughput: 0: 42411.0. Samples: 15562746080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:23,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 19:20:24,414][15401] Updated weights for policy 0, policy_version 949874 (0.0040) [2024-06-25 19:20:28,389][15132] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15562883072. Throughput: 0: 42450.8. Samples: 15563011240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 19:20:28,857][15401] Updated weights for policy 0, policy_version 949884 (0.0041) [2024-06-25 19:20:32,445][15401] Updated weights for policy 0, policy_version 949894 (0.0037) [2024-06-25 19:20:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15563096064. Throughput: 0: 42367.8. Samples: 15563261940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:33,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 19:20:36,349][15401] Updated weights for policy 0, policy_version 949904 (0.0038) [2024-06-25 19:20:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15563325440. Throughput: 0: 42476.1. Samples: 15563388940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:38,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 19:20:40,131][15401] Updated weights for policy 0, policy_version 949914 (0.0041) [2024-06-25 19:20:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42054.0, 300 sec: 42598.4). Total num frames: 15563522048. Throughput: 0: 42427.5. Samples: 15563652300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:43,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 19:20:43,917][15401] Updated weights for policy 0, policy_version 949924 (0.0036) [2024-06-25 19:20:47,763][15401] Updated weights for policy 0, policy_version 949934 (0.0039) [2024-06-25 19:20:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15563735040. Throughput: 0: 42305.4. Samples: 15563898160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:48,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 19:20:49,092][15349] Signal inference workers to stop experience collection... (230300 times) [2024-06-25 19:20:49,092][15349] Signal inference workers to resume experience collection... (230300 times) [2024-06-25 19:20:49,124][15401] InferenceWorker_p0-w0: stopping experience collection (230300 times) [2024-06-25 19:20:49,124][15401] InferenceWorker_p0-w0: resuming experience collection (230300 times) [2024-06-25 19:20:51,451][15401] Updated weights for policy 0, policy_version 949944 (0.0025) [2024-06-25 19:20:53,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15563980800. Throughput: 0: 42563.5. Samples: 15564029120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:53,390][15132] Avg episode reward: [(0, '0.308')] [2024-06-25 19:20:55,569][15401] Updated weights for policy 0, policy_version 949954 (0.0037) [2024-06-25 19:20:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15564144640. Throughput: 0: 42627.2. Samples: 15564288320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:20:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 19:20:59,318][15401] Updated weights for policy 0, policy_version 949964 (0.0034) [2024-06-25 19:21:03,287][15401] Updated weights for policy 0, policy_version 949974 (0.0040) [2024-06-25 19:21:03,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15564374016. Throughput: 0: 42566.0. Samples: 15564537520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:03,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 19:21:06,946][15401] Updated weights for policy 0, policy_version 949984 (0.0039) [2024-06-25 19:21:08,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15564619776. Throughput: 0: 42762.6. Samples: 15564670400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 19:21:11,406][15401] Updated weights for policy 0, policy_version 949994 (0.0037) [2024-06-25 19:21:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15564783616. Throughput: 0: 42472.8. Samples: 15564922520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 19:21:14,778][15401] Updated weights for policy 0, policy_version 950004 (0.0043) [2024-06-25 19:21:18,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.8, 300 sec: 42598.0). Total num frames: 15565012992. Throughput: 0: 42532.5. Samples: 15565176000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:18,392][15132] Avg episode reward: [(0, '0.518')] [2024-06-25 19:21:18,972][15401] Updated weights for policy 0, policy_version 950014 (0.0030) [2024-06-25 19:21:22,443][15401] Updated weights for policy 0, policy_version 950024 (0.0041) [2024-06-25 19:21:23,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15565242368. Throughput: 0: 42659.9. Samples: 15565308640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:23,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 19:21:26,467][15401] Updated weights for policy 0, policy_version 950034 (0.0033) [2024-06-25 19:21:28,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 15565422592. Throughput: 0: 42482.6. Samples: 15565564020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:28,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 19:21:30,025][15401] Updated weights for policy 0, policy_version 950044 (0.0039) [2024-06-25 19:21:33,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15565651968. Throughput: 0: 42687.5. Samples: 15565819100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 19:21:34,231][15401] Updated weights for policy 0, policy_version 950054 (0.0029) [2024-06-25 19:21:37,443][15401] Updated weights for policy 0, policy_version 950064 (0.0034) [2024-06-25 19:21:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15565881344. Throughput: 0: 42678.3. Samples: 15565949640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:21:41,793][15401] Updated weights for policy 0, policy_version 950074 (0.0043) [2024-06-25 19:21:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15566045184. Throughput: 0: 42527.9. Samples: 15566202080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 19:21:43,396][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950076_15566045184.pth... [2024-06-25 19:21:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949455_15555870720.pth [2024-06-25 19:21:44,940][15349] Signal inference workers to stop experience collection... (230350 times) [2024-06-25 19:21:44,940][15349] Signal inference workers to resume experience collection... (230350 times) [2024-06-25 19:21:44,971][15401] InferenceWorker_p0-w0: stopping experience collection (230350 times) [2024-06-25 19:21:44,971][15401] InferenceWorker_p0-w0: resuming experience collection (230350 times) [2024-06-25 19:21:45,097][15401] Updated weights for policy 0, policy_version 950084 (0.0031) [2024-06-25 19:21:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42543.2). Total num frames: 15566290944. Throughput: 0: 42723.9. Samples: 15566460100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:48,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 19:21:49,478][15401] Updated weights for policy 0, policy_version 950094 (0.0034) [2024-06-25 19:21:52,689][15401] Updated weights for policy 0, policy_version 950104 (0.0038) [2024-06-25 19:21:53,389][15132] Fps is (10 sec: 49152.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15566536704. Throughput: 0: 42796.6. Samples: 15566596240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:53,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 19:21:57,451][15401] Updated weights for policy 0, policy_version 950114 (0.0037) [2024-06-25 19:21:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15566700544. Throughput: 0: 42895.6. Samples: 15566852820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 19:21:58,390][15132] Avg episode reward: [(0, '0.492')] [2024-06-25 19:22:00,406][15401] Updated weights for policy 0, policy_version 950124 (0.0043) [2024-06-25 19:22:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15566946304. Throughput: 0: 42844.1. Samples: 15567103880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 19:22:04,926][15401] Updated weights for policy 0, policy_version 950134 (0.0037) [2024-06-25 19:22:08,143][15401] Updated weights for policy 0, policy_version 950144 (0.0034) [2024-06-25 19:22:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15567175680. Throughput: 0: 42883.1. Samples: 15567238380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:08,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 19:22:12,499][15401] Updated weights for policy 0, policy_version 950154 (0.0041) [2024-06-25 19:22:13,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15567339520. Throughput: 0: 42889.0. Samples: 15567494020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:13,390][15132] Avg episode reward: [(0, '0.311')] [2024-06-25 19:22:15,766][15401] Updated weights for policy 0, policy_version 950164 (0.0032) [2024-06-25 19:22:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 15567568896. Throughput: 0: 42788.3. Samples: 15567744580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:18,396][15132] Avg episode reward: [(0, '0.131')] [2024-06-25 19:22:20,215][15401] Updated weights for policy 0, policy_version 950174 (0.0035) [2024-06-25 19:22:23,389][15132] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15567814656. Throughput: 0: 42747.2. Samples: 15567873260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:23,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 19:22:23,395][15401] Updated weights for policy 0, policy_version 950184 (0.0039) [2024-06-25 19:22:27,675][15401] Updated weights for policy 0, policy_version 950194 (0.0034) [2024-06-25 19:22:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15567994880. Throughput: 0: 42888.6. Samples: 15568132060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 19:22:31,030][15401] Updated weights for policy 0, policy_version 950204 (0.0035) [2024-06-25 19:22:33,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15568224256. Throughput: 0: 42828.4. Samples: 15568387380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:33,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 19:22:35,440][15401] Updated weights for policy 0, policy_version 950214 (0.0040) [2024-06-25 19:22:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 15568453632. Throughput: 0: 42686.7. Samples: 15568517140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 19:22:38,541][15401] Updated weights for policy 0, policy_version 950224 (0.0037) [2024-06-25 19:22:43,262][15401] Updated weights for policy 0, policy_version 950234 (0.0039) [2024-06-25 19:22:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 15568633856. Throughput: 0: 42596.9. Samples: 15568769680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:43,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 19:22:46,270][15401] Updated weights for policy 0, policy_version 950244 (0.0035) [2024-06-25 19:22:48,392][15132] Fps is (10 sec: 40949.3, 60 sec: 42869.7, 300 sec: 42598.0). Total num frames: 15568863232. Throughput: 0: 42706.8. Samples: 15569025800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:48,393][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 19:22:50,724][15401] Updated weights for policy 0, policy_version 950254 (0.0039) [2024-06-25 19:22:53,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15569092608. Throughput: 0: 42648.8. Samples: 15569157580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:53,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 19:22:53,976][15401] Updated weights for policy 0, policy_version 950264 (0.0048) [2024-06-25 19:22:58,374][15401] Updated weights for policy 0, policy_version 950274 (0.0041) [2024-06-25 19:22:58,390][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15569289216. Throughput: 0: 42615.4. Samples: 15569411720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:22:58,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 19:23:01,837][15401] Updated weights for policy 0, policy_version 950284 (0.0047) [2024-06-25 19:23:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15569518592. Throughput: 0: 42503.2. Samples: 15569657220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:03,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 19:23:06,415][15401] Updated weights for policy 0, policy_version 950294 (0.0037) [2024-06-25 19:23:08,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15569715200. Throughput: 0: 42599.4. Samples: 15569790240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:08,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 19:23:09,567][15401] Updated weights for policy 0, policy_version 950304 (0.0032) [2024-06-25 19:23:10,948][15349] Signal inference workers to stop experience collection... (230400 times) [2024-06-25 19:23:10,951][15349] Signal inference workers to resume experience collection... (230400 times) [2024-06-25 19:23:10,971][15401] InferenceWorker_p0-w0: stopping experience collection (230400 times) [2024-06-25 19:23:10,971][15401] InferenceWorker_p0-w0: resuming experience collection (230400 times) [2024-06-25 19:23:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15569911808. Throughput: 0: 42465.3. Samples: 15570043000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:13,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 19:23:13,986][15401] Updated weights for policy 0, policy_version 950314 (0.0032) [2024-06-25 19:23:17,312][15401] Updated weights for policy 0, policy_version 950324 (0.0031) [2024-06-25 19:23:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 15570157568. Throughput: 0: 42425.4. Samples: 15570296520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:18,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 19:23:21,781][15401] Updated weights for policy 0, policy_version 950334 (0.0038) [2024-06-25 19:23:23,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15570337792. Throughput: 0: 42569.3. Samples: 15570432760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:23,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 19:23:24,982][15401] Updated weights for policy 0, policy_version 950344 (0.0028) [2024-06-25 19:23:28,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42598.5). Total num frames: 15570550784. Throughput: 0: 42483.4. Samples: 15570681440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 19:23:29,713][15401] Updated weights for policy 0, policy_version 950354 (0.0027) [2024-06-25 19:23:32,593][15401] Updated weights for policy 0, policy_version 950364 (0.0039) [2024-06-25 19:23:33,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15570796544. Throughput: 0: 42429.5. Samples: 15570935020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:33,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 19:23:37,272][15401] Updated weights for policy 0, policy_version 950374 (0.0036) [2024-06-25 19:23:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 15570976768. Throughput: 0: 42439.9. Samples: 15571067380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 19:23:40,373][15401] Updated weights for policy 0, policy_version 950384 (0.0029) [2024-06-25 19:23:43,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15571173376. Throughput: 0: 42327.7. Samples: 15571316460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:43,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 19:23:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950390_15571189760.pth... [2024-06-25 19:23:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000949768_15560998912.pth [2024-06-25 19:23:44,734][15401] Updated weights for policy 0, policy_version 950394 (0.0031) [2024-06-25 19:23:47,985][15401] Updated weights for policy 0, policy_version 950404 (0.0028) [2024-06-25 19:23:48,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.3, 300 sec: 42709.5). Total num frames: 15571435520. Throughput: 0: 42572.9. Samples: 15571573000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:48,393][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 19:23:52,888][15401] Updated weights for policy 0, policy_version 950414 (0.0043) [2024-06-25 19:23:53,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15571615744. Throughput: 0: 42602.3. Samples: 15571707340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-25 19:23:53,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 19:23:55,390][15401] Updated weights for policy 0, policy_version 950424 (0.0025) [2024-06-25 19:23:58,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 15571828736. Throughput: 0: 42595.6. Samples: 15571959800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:23:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 19:24:00,358][15401] Updated weights for policy 0, policy_version 950434 (0.0043) [2024-06-25 19:24:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15572058112. Throughput: 0: 42523.9. Samples: 15572210100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:03,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 19:24:03,558][15401] Updated weights for policy 0, policy_version 950444 (0.0041) [2024-06-25 19:24:07,980][15401] Updated weights for policy 0, policy_version 950454 (0.0036) [2024-06-25 19:24:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15572271104. Throughput: 0: 42465.7. Samples: 15572343720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 19:24:11,125][15401] Updated weights for policy 0, policy_version 950464 (0.0032) [2024-06-25 19:24:13,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15572467712. Throughput: 0: 42433.4. Samples: 15572590940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:13,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 19:24:15,528][15401] Updated weights for policy 0, policy_version 950474 (0.0037) [2024-06-25 19:24:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15572713472. Throughput: 0: 42636.5. Samples: 15572853660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 19:24:18,599][15401] Updated weights for policy 0, policy_version 950484 (0.0027) [2024-06-25 19:24:23,302][15401] Updated weights for policy 0, policy_version 950494 (0.0050) [2024-06-25 19:24:23,392][15132] Fps is (10 sec: 42588.2, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 15572893696. Throughput: 0: 42657.4. Samples: 15572987060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:23,393][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 19:24:26,459][15401] Updated weights for policy 0, policy_version 950504 (0.0044) [2024-06-25 19:24:28,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15573090304. Throughput: 0: 42630.6. Samples: 15573234840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:28,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 19:24:30,829][15401] Updated weights for policy 0, policy_version 950514 (0.0048) [2024-06-25 19:24:33,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15573336064. Throughput: 0: 42737.4. Samples: 15573496180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:33,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 19:24:34,155][15401] Updated weights for policy 0, policy_version 950524 (0.0027) [2024-06-25 19:24:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 15573532672. Throughput: 0: 42714.7. Samples: 15573629500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 19:24:38,464][15401] Updated weights for policy 0, policy_version 950534 (0.0033) [2024-06-25 19:24:41,904][15401] Updated weights for policy 0, policy_version 950544 (0.0038) [2024-06-25 19:24:43,391][15132] Fps is (10 sec: 40953.5, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 15573745664. Throughput: 0: 42534.9. Samples: 15573873940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:43,392][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 19:24:46,067][15401] Updated weights for policy 0, policy_version 950554 (0.0025) [2024-06-25 19:24:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15573975040. Throughput: 0: 42806.4. Samples: 15574136380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 19:24:49,303][15349] Signal inference workers to stop experience collection... (230450 times) [2024-06-25 19:24:49,351][15401] InferenceWorker_p0-w0: stopping experience collection (230450 times) [2024-06-25 19:24:49,352][15349] Signal inference workers to resume experience collection... (230450 times) [2024-06-25 19:24:49,366][15401] InferenceWorker_p0-w0: resuming experience collection (230450 times) [2024-06-25 19:24:49,528][15401] Updated weights for policy 0, policy_version 950564 (0.0041) [2024-06-25 19:24:53,390][15132] Fps is (10 sec: 42604.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15574171648. Throughput: 0: 42685.2. Samples: 15574264560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:53,391][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 19:24:53,995][15401] Updated weights for policy 0, policy_version 950574 (0.0034) [2024-06-25 19:24:57,173][15401] Updated weights for policy 0, policy_version 950584 (0.0046) [2024-06-25 19:24:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15574384640. Throughput: 0: 42732.5. Samples: 15574513900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:24:58,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 19:25:01,549][15401] Updated weights for policy 0, policy_version 950594 (0.0037) [2024-06-25 19:25:03,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 15574597632. Throughput: 0: 42668.4. Samples: 15574773840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:03,393][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 19:25:05,047][15401] Updated weights for policy 0, policy_version 950604 (0.0035) [2024-06-25 19:25:08,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15574810624. Throughput: 0: 42525.5. Samples: 15574900600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 19:25:09,024][15401] Updated weights for policy 0, policy_version 950614 (0.0041) [2024-06-25 19:25:12,997][15401] Updated weights for policy 0, policy_version 950624 (0.0033) [2024-06-25 19:25:13,391][15132] Fps is (10 sec: 44239.0, 60 sec: 42870.1, 300 sec: 42709.2). Total num frames: 15575040000. Throughput: 0: 42639.6. Samples: 15575153700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:13,392][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 19:25:16,754][15401] Updated weights for policy 0, policy_version 950634 (0.0027) [2024-06-25 19:25:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15575252992. Throughput: 0: 42617.3. Samples: 15575413960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:18,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 19:25:20,810][15401] Updated weights for policy 0, policy_version 950644 (0.0039) [2024-06-25 19:25:23,390][15132] Fps is (10 sec: 42606.3, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 15575465984. Throughput: 0: 42427.5. Samples: 15575538740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:23,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 19:25:24,448][15401] Updated weights for policy 0, policy_version 950654 (0.0040) [2024-06-25 19:25:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15575662592. Throughput: 0: 42760.2. Samples: 15575798080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 19:25:28,419][15401] Updated weights for policy 0, policy_version 950664 (0.0034) [2024-06-25 19:25:32,135][15401] Updated weights for policy 0, policy_version 950674 (0.0025) [2024-06-25 19:25:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15575891968. Throughput: 0: 42673.2. Samples: 15576056680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 19:25:35,875][15401] Updated weights for policy 0, policy_version 950684 (0.0038) [2024-06-25 19:25:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15576104960. Throughput: 0: 42637.8. Samples: 15576183260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:38,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 19:25:39,618][15401] Updated weights for policy 0, policy_version 950694 (0.0028) [2024-06-25 19:25:43,265][15401] Updated weights for policy 0, policy_version 950704 (0.0034) [2024-06-25 19:25:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43145.6, 300 sec: 42709.5). Total num frames: 15576334336. Throughput: 0: 42894.1. Samples: 15576444140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:43,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 19:25:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950704_15576334336.pth... [2024-06-25 19:25:43,461][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950076_15566045184.pth [2024-06-25 19:25:47,175][15401] Updated weights for policy 0, policy_version 950714 (0.0030) [2024-06-25 19:25:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15576547328. Throughput: 0: 42889.8. Samples: 15576703780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 19:25:48,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 19:25:51,040][15401] Updated weights for policy 0, policy_version 950724 (0.0040) [2024-06-25 19:25:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15576760320. Throughput: 0: 42910.0. Samples: 15576831560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:25:53,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 19:25:54,682][15401] Updated weights for policy 0, policy_version 950734 (0.0038) [2024-06-25 19:25:58,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15576973312. Throughput: 0: 42992.0. Samples: 15577088260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:25:58,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 19:25:58,671][15401] Updated weights for policy 0, policy_version 950744 (0.0038) [2024-06-25 19:26:02,167][15401] Updated weights for policy 0, policy_version 950754 (0.0030) [2024-06-25 19:26:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 15577186304. Throughput: 0: 42958.6. Samples: 15577347100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 19:26:06,246][15401] Updated weights for policy 0, policy_version 950764 (0.0049) [2024-06-25 19:26:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15577382912. Throughput: 0: 43046.7. Samples: 15577475840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:08,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 19:26:10,110][15401] Updated weights for policy 0, policy_version 950774 (0.0036) [2024-06-25 19:26:12,805][15349] Signal inference workers to stop experience collection... (230500 times) [2024-06-25 19:26:12,809][15349] Signal inference workers to resume experience collection... (230500 times) [2024-06-25 19:26:12,817][15401] InferenceWorker_p0-w0: stopping experience collection (230500 times) [2024-06-25 19:26:12,832][15401] InferenceWorker_p0-w0: resuming experience collection (230500 times) [2024-06-25 19:26:13,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43145.9, 300 sec: 42765.4). Total num frames: 15577628672. Throughput: 0: 43033.2. Samples: 15577734580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:13,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 19:26:14,075][15401] Updated weights for policy 0, policy_version 950784 (0.0034) [2024-06-25 19:26:17,665][15401] Updated weights for policy 0, policy_version 950794 (0.0035) [2024-06-25 19:26:18,396][15132] Fps is (10 sec: 45845.9, 60 sec: 43139.9, 300 sec: 42708.6). Total num frames: 15577841664. Throughput: 0: 42870.9. Samples: 15577986140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:18,396][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 19:26:21,613][15401] Updated weights for policy 0, policy_version 950804 (0.0034) [2024-06-25 19:26:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15578038272. Throughput: 0: 42934.6. Samples: 15578115320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:23,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 19:26:25,410][15401] Updated weights for policy 0, policy_version 950814 (0.0043) [2024-06-25 19:26:28,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15578251264. Throughput: 0: 42926.3. Samples: 15578375820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:28,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 19:26:29,136][15401] Updated weights for policy 0, policy_version 950824 (0.0042) [2024-06-25 19:26:32,825][15401] Updated weights for policy 0, policy_version 950834 (0.0031) [2024-06-25 19:26:33,396][15132] Fps is (10 sec: 44209.0, 60 sec: 43140.0, 300 sec: 42708.5). Total num frames: 15578480640. Throughput: 0: 42750.0. Samples: 15578627800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:33,397][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 19:26:37,180][15401] Updated weights for policy 0, policy_version 950844 (0.0032) [2024-06-25 19:26:38,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15578693632. Throughput: 0: 42955.3. Samples: 15578764540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 19:26:40,571][15401] Updated weights for policy 0, policy_version 950854 (0.0041) [2024-06-25 19:26:43,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15578890240. Throughput: 0: 42943.1. Samples: 15579020700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:43,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 19:26:44,735][15401] Updated weights for policy 0, policy_version 950864 (0.0039) [2024-06-25 19:26:48,108][15401] Updated weights for policy 0, policy_version 950874 (0.0035) [2024-06-25 19:26:48,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 15579119616. Throughput: 0: 42815.1. Samples: 15579273880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:48,392][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 19:26:52,171][15401] Updated weights for policy 0, policy_version 950884 (0.0027) [2024-06-25 19:26:53,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15579332608. Throughput: 0: 42934.2. Samples: 15579407880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:53,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 19:26:55,902][15401] Updated weights for policy 0, policy_version 950894 (0.0037) [2024-06-25 19:26:58,390][15132] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15579512832. Throughput: 0: 42919.9. Samples: 15579665980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:26:58,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 19:26:59,845][15401] Updated weights for policy 0, policy_version 950904 (0.0037) [2024-06-25 19:27:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15579758592. Throughput: 0: 43072.8. Samples: 15579924140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:03,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 19:27:03,460][15401] Updated weights for policy 0, policy_version 950914 (0.0042) [2024-06-25 19:27:07,373][15401] Updated weights for policy 0, policy_version 950924 (0.0042) [2024-06-25 19:27:08,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 15579987968. Throughput: 0: 43045.1. Samples: 15580052340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:08,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 19:27:11,460][15401] Updated weights for policy 0, policy_version 950934 (0.0032) [2024-06-25 19:27:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15580168192. Throughput: 0: 42941.8. Samples: 15580308200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 19:27:15,193][15401] Updated weights for policy 0, policy_version 950944 (0.0046) [2024-06-25 19:27:18,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42601.2, 300 sec: 42653.6). Total num frames: 15580397568. Throughput: 0: 43030.5. Samples: 15580564000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:18,393][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 19:27:19,135][15401] Updated weights for policy 0, policy_version 950954 (0.0038) [2024-06-25 19:27:22,734][15401] Updated weights for policy 0, policy_version 950964 (0.0034) [2024-06-25 19:27:23,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 15580626944. Throughput: 0: 42890.7. Samples: 15580694620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:23,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 19:27:26,755][15401] Updated weights for policy 0, policy_version 950974 (0.0033) [2024-06-25 19:27:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15580823552. Throughput: 0: 42956.5. Samples: 15580953740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:28,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 19:27:30,118][15401] Updated weights for policy 0, policy_version 950984 (0.0039) [2024-06-25 19:27:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 15581036544. Throughput: 0: 43073.4. Samples: 15581212080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 19:27:34,429][15401] Updated weights for policy 0, policy_version 950994 (0.0036) [2024-06-25 19:27:37,793][15401] Updated weights for policy 0, policy_version 951004 (0.0032) [2024-06-25 19:27:37,795][15349] Signal inference workers to stop experience collection... (230550 times) [2024-06-25 19:27:37,795][15349] Signal inference workers to resume experience collection... (230550 times) [2024-06-25 19:27:37,817][15401] InferenceWorker_p0-w0: stopping experience collection (230550 times) [2024-06-25 19:27:37,817][15401] InferenceWorker_p0-w0: resuming experience collection (230550 times) [2024-06-25 19:27:38,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15581282304. Throughput: 0: 43013.4. Samples: 15581343480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 19:27:42,076][15401] Updated weights for policy 0, policy_version 951014 (0.0039) [2024-06-25 19:27:43,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.4). Total num frames: 15581478912. Throughput: 0: 42904.0. Samples: 15581596660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 19:27:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951018_15581478912.pth... [2024-06-25 19:27:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950390_15571189760.pth [2024-06-25 19:27:45,314][15401] Updated weights for policy 0, policy_version 951024 (0.0028) [2024-06-25 19:27:48,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 15581675520. Throughput: 0: 42783.9. Samples: 15581849520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-25 19:27:48,393][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 19:27:49,576][15401] Updated weights for policy 0, policy_version 951034 (0.0035) [2024-06-25 19:27:53,035][15401] Updated weights for policy 0, policy_version 951044 (0.0035) [2024-06-25 19:27:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15581921280. Throughput: 0: 42837.2. Samples: 15581980020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:27:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 19:27:57,201][15401] Updated weights for policy 0, policy_version 951054 (0.0037) [2024-06-25 19:27:58,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15582117888. Throughput: 0: 42898.7. Samples: 15582238640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:27:58,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 19:28:00,575][15401] Updated weights for policy 0, policy_version 951064 (0.0028) [2024-06-25 19:28:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15582330880. Throughput: 0: 42927.6. Samples: 15582495640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:03,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 19:28:04,822][15401] Updated weights for policy 0, policy_version 951074 (0.0035) [2024-06-25 19:28:08,060][15401] Updated weights for policy 0, policy_version 951084 (0.0029) [2024-06-25 19:28:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15582560256. Throughput: 0: 42917.3. Samples: 15582625900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:08,390][15132] Avg episode reward: [(0, '0.326')] [2024-06-25 19:28:12,634][15401] Updated weights for policy 0, policy_version 951094 (0.0050) [2024-06-25 19:28:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15582740480. Throughput: 0: 42848.1. Samples: 15582881900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:13,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 19:28:15,542][15401] Updated weights for policy 0, policy_version 951104 (0.0032) [2024-06-25 19:28:18,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 15582953472. Throughput: 0: 42806.3. Samples: 15583138360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:18,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 19:28:20,433][15401] Updated weights for policy 0, policy_version 951114 (0.0050) [2024-06-25 19:28:23,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 15583199232. Throughput: 0: 42722.6. Samples: 15583266000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:23,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 19:28:23,817][15401] Updated weights for policy 0, policy_version 951124 (0.0033) [2024-06-25 19:28:28,224][15401] Updated weights for policy 0, policy_version 951134 (0.0042) [2024-06-25 19:28:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15583379456. Throughput: 0: 42750.8. Samples: 15583520440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 19:28:31,248][15401] Updated weights for policy 0, policy_version 951144 (0.0033) [2024-06-25 19:28:33,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15583608832. Throughput: 0: 42868.0. Samples: 15583778480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:33,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 19:28:35,862][15401] Updated weights for policy 0, policy_version 951154 (0.0031) [2024-06-25 19:28:38,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15583854592. Throughput: 0: 42815.1. Samples: 15583906700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:38,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 19:28:38,684][15401] Updated weights for policy 0, policy_version 951164 (0.0028) [2024-06-25 19:28:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15584018432. Throughput: 0: 42673.6. Samples: 15584158960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 19:28:43,647][15401] Updated weights for policy 0, policy_version 951174 (0.0037) [2024-06-25 19:28:47,259][15401] Updated weights for policy 0, policy_version 951184 (0.0036) [2024-06-25 19:28:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 15584247808. Throughput: 0: 42746.3. Samples: 15584419220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:48,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 19:28:51,171][15401] Updated weights for policy 0, policy_version 951194 (0.0034) [2024-06-25 19:28:53,390][15132] Fps is (10 sec: 45871.7, 60 sec: 42597.9, 300 sec: 42876.0). Total num frames: 15584477184. Throughput: 0: 42813.4. Samples: 15584552540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:53,391][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 19:28:54,723][15401] Updated weights for policy 0, policy_version 951204 (0.0032) [2024-06-25 19:28:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15584657408. Throughput: 0: 42659.6. Samples: 15584801580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:28:58,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 19:28:59,013][15401] Updated weights for policy 0, policy_version 951214 (0.0039) [2024-06-25 19:28:59,014][15349] Signal inference workers to stop experience collection... (230600 times) [2024-06-25 19:28:59,014][15349] Signal inference workers to resume experience collection... (230600 times) [2024-06-25 19:28:59,062][15401] InferenceWorker_p0-w0: stopping experience collection (230600 times) [2024-06-25 19:28:59,062][15401] InferenceWorker_p0-w0: resuming experience collection (230600 times) [2024-06-25 19:29:02,282][15401] Updated weights for policy 0, policy_version 951224 (0.0032) [2024-06-25 19:29:03,389][15132] Fps is (10 sec: 40963.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15584886784. Throughput: 0: 42814.2. Samples: 15585065000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:03,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 19:29:06,602][15401] Updated weights for policy 0, policy_version 951234 (0.0043) [2024-06-25 19:29:08,390][15132] Fps is (10 sec: 47512.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15585132544. Throughput: 0: 42925.8. Samples: 15585197660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:08,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 19:29:09,792][15401] Updated weights for policy 0, policy_version 951244 (0.0042) [2024-06-25 19:29:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15585312768. Throughput: 0: 42760.3. Samples: 15585444660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:13,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 19:29:14,234][15401] Updated weights for policy 0, policy_version 951254 (0.0036) [2024-06-25 19:29:17,316][15401] Updated weights for policy 0, policy_version 951264 (0.0029) [2024-06-25 19:29:18,392][15132] Fps is (10 sec: 39312.6, 60 sec: 42869.7, 300 sec: 42820.6). Total num frames: 15585525760. Throughput: 0: 42765.8. Samples: 15585703040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:18,393][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 19:29:21,767][15401] Updated weights for policy 0, policy_version 951274 (0.0026) [2024-06-25 19:29:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15585755136. Throughput: 0: 42889.7. Samples: 15585836740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:23,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 19:29:24,897][15401] Updated weights for policy 0, policy_version 951284 (0.0038) [2024-06-25 19:29:28,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15585951744. Throughput: 0: 42896.0. Samples: 15586089280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:28,390][15132] Avg episode reward: [(0, '0.242')] [2024-06-25 19:29:29,337][15401] Updated weights for policy 0, policy_version 951294 (0.0034) [2024-06-25 19:29:32,412][15401] Updated weights for policy 0, policy_version 951304 (0.0038) [2024-06-25 19:29:33,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15586181120. Throughput: 0: 42727.5. Samples: 15586341960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:33,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 19:29:37,089][15401] Updated weights for policy 0, policy_version 951314 (0.0028) [2024-06-25 19:29:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42820.8). Total num frames: 15586377728. Throughput: 0: 42713.7. Samples: 15586474620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:38,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 19:29:39,989][15401] Updated weights for policy 0, policy_version 951324 (0.0045) [2024-06-25 19:29:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 15586607104. Throughput: 0: 42862.1. Samples: 15586730380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 19:29:43,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 19:29:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951331_15586607104.pth... [2024-06-25 19:29:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000950704_15576334336.pth [2024-06-25 19:29:44,644][15401] Updated weights for policy 0, policy_version 951334 (0.0036) [2024-06-25 19:29:47,574][15401] Updated weights for policy 0, policy_version 951344 (0.0035) [2024-06-25 19:29:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15586836480. Throughput: 0: 42528.5. Samples: 15586978780. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:29:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 19:29:52,257][15401] Updated weights for policy 0, policy_version 951354 (0.0027) [2024-06-25 19:29:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.9, 300 sec: 42820.5). Total num frames: 15587016704. Throughput: 0: 42530.7. Samples: 15587111540. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:29:53,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 19:29:55,208][15401] Updated weights for policy 0, policy_version 951364 (0.0035) [2024-06-25 19:29:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42876.4). Total num frames: 15587246080. Throughput: 0: 42798.6. Samples: 15587370600. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:29:58,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 19:30:00,215][15401] Updated weights for policy 0, policy_version 951374 (0.0030) [2024-06-25 19:30:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15587459072. Throughput: 0: 42610.4. Samples: 15587620400. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:03,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 19:30:03,438][15401] Updated weights for policy 0, policy_version 951384 (0.0044) [2024-06-25 19:30:07,683][15401] Updated weights for policy 0, policy_version 951394 (0.0037) [2024-06-25 19:30:08,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42820.8). Total num frames: 15587672064. Throughput: 0: 42597.1. Samples: 15587753600. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 19:30:11,011][15401] Updated weights for policy 0, policy_version 951404 (0.0030) [2024-06-25 19:30:13,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15587885056. Throughput: 0: 42615.2. Samples: 15588006960. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:13,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 19:30:15,534][15401] Updated weights for policy 0, policy_version 951414 (0.0028) [2024-06-25 19:30:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 15588098048. Throughput: 0: 42665.8. Samples: 15588261920. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 19:30:18,785][15401] Updated weights for policy 0, policy_version 951424 (0.0035) [2024-06-25 19:30:22,892][15401] Updated weights for policy 0, policy_version 951434 (0.0029) [2024-06-25 19:30:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15588327424. Throughput: 0: 42688.4. Samples: 15588395600. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:23,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 19:30:26,084][15349] Signal inference workers to stop experience collection... (230650 times) [2024-06-25 19:30:26,105][15401] InferenceWorker_p0-w0: stopping experience collection (230650 times) [2024-06-25 19:30:26,194][15349] Signal inference workers to resume experience collection... (230650 times) [2024-06-25 19:30:26,194][15401] InferenceWorker_p0-w0: resuming experience collection (230650 times) [2024-06-25 19:30:26,195][15401] Updated weights for policy 0, policy_version 951444 (0.0034) [2024-06-25 19:30:28,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15588507648. Throughput: 0: 42651.2. Samples: 15588649680. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:28,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 19:30:30,554][15401] Updated weights for policy 0, policy_version 951454 (0.0024) [2024-06-25 19:30:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15588737024. Throughput: 0: 42868.5. Samples: 15588907860. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:33,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 19:30:34,173][15401] Updated weights for policy 0, policy_version 951464 (0.0045) [2024-06-25 19:30:38,102][15401] Updated weights for policy 0, policy_version 951474 (0.0036) [2024-06-25 19:30:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15588966400. Throughput: 0: 42851.6. Samples: 15589039860. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 19:30:41,655][15401] Updated weights for policy 0, policy_version 951484 (0.0028) [2024-06-25 19:30:43,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15589146624. Throughput: 0: 42712.9. Samples: 15589292680. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:43,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 19:30:45,609][15401] Updated weights for policy 0, policy_version 951494 (0.0045) [2024-06-25 19:30:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15589376000. Throughput: 0: 42891.0. Samples: 15589550500. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 19:30:49,311][15401] Updated weights for policy 0, policy_version 951504 (0.0044) [2024-06-25 19:30:53,256][15401] Updated weights for policy 0, policy_version 951514 (0.0025) [2024-06-25 19:30:53,390][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15589621760. Throughput: 0: 42916.8. Samples: 15589684860. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:53,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 19:30:56,975][15401] Updated weights for policy 0, policy_version 951524 (0.0026) [2024-06-25 19:30:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15589801984. Throughput: 0: 42826.6. Samples: 15589934160. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:30:58,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 19:31:00,787][15401] Updated weights for policy 0, policy_version 951534 (0.0028) [2024-06-25 19:31:03,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15590014976. Throughput: 0: 42932.9. Samples: 15590193900. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 19:31:04,600][15401] Updated weights for policy 0, policy_version 951544 (0.0052) [2024-06-25 19:31:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15590244352. Throughput: 0: 42868.1. Samples: 15590324660. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:08,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 19:31:08,654][15401] Updated weights for policy 0, policy_version 951554 (0.0027) [2024-06-25 19:31:12,351][15401] Updated weights for policy 0, policy_version 951564 (0.0032) [2024-06-25 19:31:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 15590457344. Throughput: 0: 42771.9. Samples: 15590574420. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:13,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 19:31:16,309][15401] Updated weights for policy 0, policy_version 951574 (0.0038) [2024-06-25 19:31:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15590670336. Throughput: 0: 42788.4. Samples: 15590833340. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 19:31:19,862][15401] Updated weights for policy 0, policy_version 951584 (0.0037) [2024-06-25 19:31:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15590866944. Throughput: 0: 42827.1. Samples: 15590967080. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:23,392][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 19:31:23,848][15401] Updated weights for policy 0, policy_version 951594 (0.0030) [2024-06-25 19:31:27,554][15401] Updated weights for policy 0, policy_version 951604 (0.0033) [2024-06-25 19:31:28,394][15132] Fps is (10 sec: 42579.8, 60 sec: 43141.4, 300 sec: 42765.3). Total num frames: 15591096320. Throughput: 0: 42889.3. Samples: 15591222880. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:28,394][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 19:31:31,420][15401] Updated weights for policy 0, policy_version 951614 (0.0033) [2024-06-25 19:31:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15591325696. Throughput: 0: 42867.9. Samples: 15591479560. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 19:31:35,374][15401] Updated weights for policy 0, policy_version 951624 (0.0022) [2024-06-25 19:31:38,396][15132] Fps is (10 sec: 40951.8, 60 sec: 42320.8, 300 sec: 42764.1). Total num frames: 15591505920. Throughput: 0: 42674.4. Samples: 15591605480. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:38,396][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 19:31:39,286][15401] Updated weights for policy 0, policy_version 951634 (0.0024) [2024-06-25 19:31:42,604][15349] Signal inference workers to stop experience collection... (230700 times) [2024-06-25 19:31:42,612][15349] Signal inference workers to resume experience collection... (230700 times) [2024-06-25 19:31:42,619][15401] InferenceWorker_p0-w0: stopping experience collection (230700 times) [2024-06-25 19:31:42,644][15401] InferenceWorker_p0-w0: resuming experience collection (230700 times) [2024-06-25 19:31:43,033][15401] Updated weights for policy 0, policy_version 951644 (0.0043) [2024-06-25 19:31:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 15591751680. Throughput: 0: 42859.2. Samples: 15591862820. Policy #0 lag: (min: 3.0, avg: 12.0, max: 21.0) [2024-06-25 19:31:43,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 19:31:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951645_15591751680.pth... [2024-06-25 19:31:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951018_15581478912.pth [2024-06-25 19:31:46,975][15401] Updated weights for policy 0, policy_version 951654 (0.0027) [2024-06-25 19:31:48,390][15132] Fps is (10 sec: 45903.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15591964672. Throughput: 0: 42621.7. Samples: 15592111880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:31:48,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 19:31:50,774][15401] Updated weights for policy 0, policy_version 951664 (0.0052) [2024-06-25 19:31:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 15592144896. Throughput: 0: 42640.4. Samples: 15592243480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:31:53,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 19:31:54,783][15401] Updated weights for policy 0, policy_version 951674 (0.0038) [2024-06-25 19:31:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15592374272. Throughput: 0: 42708.1. Samples: 15592496280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:31:58,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 19:31:58,403][15401] Updated weights for policy 0, policy_version 951684 (0.0026) [2024-06-25 19:32:02,376][15401] Updated weights for policy 0, policy_version 951694 (0.0029) [2024-06-25 19:32:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15592587264. Throughput: 0: 42736.5. Samples: 15592756480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:03,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 19:32:06,124][15401] Updated weights for policy 0, policy_version 951704 (0.0024) [2024-06-25 19:32:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15592783872. Throughput: 0: 42521.7. Samples: 15592880560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 19:32:10,210][15401] Updated weights for policy 0, policy_version 951714 (0.0022) [2024-06-25 19:32:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 15593029632. Throughput: 0: 42466.2. Samples: 15593133680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 19:32:13,908][15401] Updated weights for policy 0, policy_version 951724 (0.0027) [2024-06-25 19:32:17,821][15401] Updated weights for policy 0, policy_version 951734 (0.0035) [2024-06-25 19:32:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15593226240. Throughput: 0: 42542.3. Samples: 15593393960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:18,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 19:32:21,557][15401] Updated weights for policy 0, policy_version 951744 (0.0046) [2024-06-25 19:32:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15593439232. Throughput: 0: 42619.8. Samples: 15593523100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:23,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 19:32:25,371][15401] Updated weights for policy 0, policy_version 951754 (0.0031) [2024-06-25 19:32:28,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43147.5, 300 sec: 42876.1). Total num frames: 15593684992. Throughput: 0: 42680.7. Samples: 15593783460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 19:32:29,131][15401] Updated weights for policy 0, policy_version 951764 (0.0036) [2024-06-25 19:32:33,165][15401] Updated weights for policy 0, policy_version 951774 (0.0026) [2024-06-25 19:32:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15593865216. Throughput: 0: 42975.2. Samples: 15594045760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:33,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 19:32:36,642][15401] Updated weights for policy 0, policy_version 951784 (0.0033) [2024-06-25 19:32:38,396][15132] Fps is (10 sec: 39297.1, 60 sec: 42871.4, 300 sec: 42708.6). Total num frames: 15594078208. Throughput: 0: 42751.7. Samples: 15594167580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:38,396][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 19:32:40,616][15401] Updated weights for policy 0, policy_version 951794 (0.0040) [2024-06-25 19:32:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 15594323968. Throughput: 0: 42891.9. Samples: 15594426420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:43,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 19:32:44,135][15401] Updated weights for policy 0, policy_version 951804 (0.0027) [2024-06-25 19:32:48,118][15401] Updated weights for policy 0, policy_version 951814 (0.0044) [2024-06-25 19:32:48,389][15132] Fps is (10 sec: 44265.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15594520576. Throughput: 0: 42691.6. Samples: 15594677600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:48,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 19:32:52,013][15401] Updated weights for policy 0, policy_version 951824 (0.0028) [2024-06-25 19:32:53,392][15132] Fps is (10 sec: 40950.4, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 15594733568. Throughput: 0: 42770.7. Samples: 15594805340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:53,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 19:32:56,103][15401] Updated weights for policy 0, policy_version 951834 (0.0031) [2024-06-25 19:32:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15594946560. Throughput: 0: 42926.0. Samples: 15595065340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:32:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 19:32:59,745][15401] Updated weights for policy 0, policy_version 951844 (0.0032) [2024-06-25 19:33:03,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15595159552. Throughput: 0: 42747.1. Samples: 15595317580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:33:03,734][15401] Updated weights for policy 0, policy_version 951854 (0.0024) [2024-06-25 19:33:06,928][15349] Signal inference workers to stop experience collection... (230750 times) [2024-06-25 19:33:06,987][15401] InferenceWorker_p0-w0: stopping experience collection (230750 times) [2024-06-25 19:33:06,991][15349] Signal inference workers to resume experience collection... (230750 times) [2024-06-25 19:33:07,009][15401] InferenceWorker_p0-w0: resuming experience collection (230750 times) [2024-06-25 19:33:07,388][15401] Updated weights for policy 0, policy_version 951864 (0.0039) [2024-06-25 19:33:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15595372544. Throughput: 0: 42822.6. Samples: 15595450120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:08,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 19:33:11,449][15401] Updated weights for policy 0, policy_version 951874 (0.0045) [2024-06-25 19:33:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15595569152. Throughput: 0: 42636.1. Samples: 15595702080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:13,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 19:33:15,167][15401] Updated weights for policy 0, policy_version 951884 (0.0030) [2024-06-25 19:33:18,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15595814912. Throughput: 0: 42396.9. Samples: 15595953620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:18,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 19:33:19,120][15401] Updated weights for policy 0, policy_version 951894 (0.0053) [2024-06-25 19:33:22,839][15401] Updated weights for policy 0, policy_version 951904 (0.0029) [2024-06-25 19:33:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15595995136. Throughput: 0: 42700.6. Samples: 15596088840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:23,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 19:33:26,735][15401] Updated weights for policy 0, policy_version 951914 (0.0042) [2024-06-25 19:33:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15596224512. Throughput: 0: 42481.5. Samples: 15596338080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 19:33:31,025][15401] Updated weights for policy 0, policy_version 951924 (0.0028) [2024-06-25 19:33:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15596453888. Throughput: 0: 42470.5. Samples: 15596588780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:33,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 19:33:34,439][15401] Updated weights for policy 0, policy_version 951934 (0.0034) [2024-06-25 19:33:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 15596634112. Throughput: 0: 42575.5. Samples: 15596721140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 19:33:38,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 19:33:38,439][15401] Updated weights for policy 0, policy_version 951944 (0.0044) [2024-06-25 19:33:42,272][15401] Updated weights for policy 0, policy_version 951954 (0.0032) [2024-06-25 19:33:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 15596847104. Throughput: 0: 42474.8. Samples: 15596976720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:33:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 19:33:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951956_15596847104.pth... [2024-06-25 19:33:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951331_15586607104.pth [2024-06-25 19:33:46,057][15401] Updated weights for policy 0, policy_version 951964 (0.0031) [2024-06-25 19:33:48,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 15597076480. Throughput: 0: 42346.8. Samples: 15597223180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:33:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 19:33:50,090][15401] Updated weights for policy 0, policy_version 951974 (0.0032) [2024-06-25 19:33:53,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 15597273088. Throughput: 0: 42468.6. Samples: 15597361200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:33:53,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 19:33:53,578][15401] Updated weights for policy 0, policy_version 951984 (0.0033) [2024-06-25 19:33:57,815][15401] Updated weights for policy 0, policy_version 951994 (0.0043) [2024-06-25 19:33:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15597502464. Throughput: 0: 42488.0. Samples: 15597614040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:33:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 19:34:01,677][15401] Updated weights for policy 0, policy_version 952004 (0.0044) [2024-06-25 19:34:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15597715456. Throughput: 0: 42666.3. Samples: 15597873600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 19:34:05,202][15401] Updated weights for policy 0, policy_version 952014 (0.0030) [2024-06-25 19:34:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15597944832. Throughput: 0: 42553.4. Samples: 15598003740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:08,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 19:34:09,358][15401] Updated weights for policy 0, policy_version 952024 (0.0031) [2024-06-25 19:34:12,685][15401] Updated weights for policy 0, policy_version 952034 (0.0036) [2024-06-25 19:34:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 15598157824. Throughput: 0: 42774.5. Samples: 15598262940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:13,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 19:34:15,757][15349] Signal inference workers to stop experience collection... (230800 times) [2024-06-25 19:34:15,809][15401] InferenceWorker_p0-w0: stopping experience collection (230800 times) [2024-06-25 19:34:15,810][15349] Signal inference workers to resume experience collection... (230800 times) [2024-06-25 19:34:15,837][15401] InferenceWorker_p0-w0: resuming experience collection (230800 times) [2024-06-25 19:34:16,986][15401] Updated weights for policy 0, policy_version 952044 (0.0041) [2024-06-25 19:34:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15598370816. Throughput: 0: 42912.5. Samples: 15598519840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:18,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 19:34:20,213][15401] Updated weights for policy 0, policy_version 952054 (0.0045) [2024-06-25 19:34:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15598567424. Throughput: 0: 42809.0. Samples: 15598647540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 19:34:24,419][15401] Updated weights for policy 0, policy_version 952064 (0.0037) [2024-06-25 19:34:27,951][15401] Updated weights for policy 0, policy_version 952074 (0.0028) [2024-06-25 19:34:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15598796800. Throughput: 0: 43034.0. Samples: 15598913240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 19:34:31,970][15401] Updated weights for policy 0, policy_version 952084 (0.0034) [2024-06-25 19:34:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 15598977024. Throughput: 0: 43181.8. Samples: 15599166360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 19:34:35,644][15401] Updated weights for policy 0, policy_version 952094 (0.0033) [2024-06-25 19:34:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15599206400. Throughput: 0: 42782.2. Samples: 15599286400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:38,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 19:34:39,502][15401] Updated weights for policy 0, policy_version 952104 (0.0043) [2024-06-25 19:34:43,264][15401] Updated weights for policy 0, policy_version 952114 (0.0033) [2024-06-25 19:34:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.8, 300 sec: 42709.5). Total num frames: 15599435776. Throughput: 0: 42941.9. Samples: 15599546420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 19:34:47,191][15401] Updated weights for policy 0, policy_version 952124 (0.0033) [2024-06-25 19:34:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15599632384. Throughput: 0: 42752.0. Samples: 15599797440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:48,390][15132] Avg episode reward: [(0, '0.449')] [2024-06-25 19:34:51,002][15401] Updated weights for policy 0, policy_version 952134 (0.0034) [2024-06-25 19:34:53,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15599845376. Throughput: 0: 42691.1. Samples: 15599924840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:53,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 19:34:54,822][15401] Updated weights for policy 0, policy_version 952144 (0.0031) [2024-06-25 19:34:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15600041984. Throughput: 0: 42662.8. Samples: 15600182760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:34:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 19:34:58,886][15401] Updated weights for policy 0, policy_version 952154 (0.0036) [2024-06-25 19:35:02,586][15401] Updated weights for policy 0, policy_version 952164 (0.0030) [2024-06-25 19:35:03,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15600287744. Throughput: 0: 42622.7. Samples: 15600437860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 19:35:06,678][15401] Updated weights for policy 0, policy_version 952174 (0.0031) [2024-06-25 19:35:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15600484352. Throughput: 0: 42701.2. Samples: 15600569100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:08,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 19:35:10,273][15401] Updated weights for policy 0, policy_version 952184 (0.0051) [2024-06-25 19:35:13,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15600680960. Throughput: 0: 42376.3. Samples: 15600820180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:13,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 19:35:14,276][15401] Updated weights for policy 0, policy_version 952194 (0.0035) [2024-06-25 19:35:18,339][15401] Updated weights for policy 0, policy_version 952204 (0.0025) [2024-06-25 19:35:18,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15600910336. Throughput: 0: 42482.2. Samples: 15601078060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:18,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 19:35:22,032][15401] Updated weights for policy 0, policy_version 952214 (0.0048) [2024-06-25 19:35:23,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15601123328. Throughput: 0: 42703.1. Samples: 15601208040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 19:35:25,875][15401] Updated weights for policy 0, policy_version 952224 (0.0043) [2024-06-25 19:35:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42325.1, 300 sec: 42709.4). Total num frames: 15601336320. Throughput: 0: 42611.2. Samples: 15601463940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 19:35:29,110][15349] Signal inference workers to stop experience collection... (230850 times) [2024-06-25 19:35:29,167][15401] InferenceWorker_p0-w0: stopping experience collection (230850 times) [2024-06-25 19:35:29,175][15349] Signal inference workers to resume experience collection... (230850 times) [2024-06-25 19:35:29,181][15401] InferenceWorker_p0-w0: resuming experience collection (230850 times) [2024-06-25 19:35:29,475][15401] Updated weights for policy 0, policy_version 952234 (0.0031) [2024-06-25 19:35:33,320][15401] Updated weights for policy 0, policy_version 952244 (0.0038) [2024-06-25 19:35:33,391][15132] Fps is (10 sec: 44230.1, 60 sec: 43143.4, 300 sec: 42709.3). Total num frames: 15601565696. Throughput: 0: 42823.8. Samples: 15601724580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:33,392][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 19:35:37,231][15401] Updated weights for policy 0, policy_version 952254 (0.0044) [2024-06-25 19:35:38,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 15601778688. Throughput: 0: 42784.9. Samples: 15601850160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-25 19:35:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 19:35:40,897][15401] Updated weights for policy 0, policy_version 952264 (0.0031) [2024-06-25 19:35:43,390][15132] Fps is (10 sec: 40965.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15601975296. Throughput: 0: 42759.9. Samples: 15602106960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:35:43,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 19:35:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952269_15601975296.pth... [2024-06-25 19:35:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951645_15591751680.pth [2024-06-25 19:35:44,834][15401] Updated weights for policy 0, policy_version 952274 (0.0025) [2024-06-25 19:35:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15602204672. Throughput: 0: 42796.5. Samples: 15602363700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:35:48,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 19:35:48,516][15401] Updated weights for policy 0, policy_version 952284 (0.0031) [2024-06-25 19:35:52,520][15401] Updated weights for policy 0, policy_version 952294 (0.0029) [2024-06-25 19:35:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15602417664. Throughput: 0: 42858.2. Samples: 15602497720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:35:53,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 19:35:56,172][15401] Updated weights for policy 0, policy_version 952304 (0.0030) [2024-06-25 19:35:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15602630656. Throughput: 0: 42966.7. Samples: 15602753680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:35:58,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 19:36:00,094][15401] Updated weights for policy 0, policy_version 952314 (0.0031) [2024-06-25 19:36:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15602843648. Throughput: 0: 43011.1. Samples: 15603013560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:03,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 19:36:03,840][15401] Updated weights for policy 0, policy_version 952324 (0.0030) [2024-06-25 19:36:07,765][15401] Updated weights for policy 0, policy_version 952334 (0.0036) [2024-06-25 19:36:08,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15603056640. Throughput: 0: 42961.4. Samples: 15603141300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:08,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 19:36:11,607][15401] Updated weights for policy 0, policy_version 952344 (0.0039) [2024-06-25 19:36:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15603286016. Throughput: 0: 42870.4. Samples: 15603393100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:13,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 19:36:15,426][15401] Updated weights for policy 0, policy_version 952354 (0.0026) [2024-06-25 19:36:18,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 15603482624. Throughput: 0: 42823.2. Samples: 15603651660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:18,392][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 19:36:19,108][15401] Updated weights for policy 0, policy_version 952364 (0.0037) [2024-06-25 19:36:23,077][15401] Updated weights for policy 0, policy_version 952374 (0.0037) [2024-06-25 19:36:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.6). Total num frames: 15603712000. Throughput: 0: 42891.2. Samples: 15603780260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 19:36:27,018][15401] Updated weights for policy 0, policy_version 952384 (0.0038) [2024-06-25 19:36:28,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.8, 300 sec: 42709.5). Total num frames: 15603924992. Throughput: 0: 43055.7. Samples: 15604044460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:28,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 19:36:30,457][15401] Updated weights for policy 0, policy_version 952394 (0.0035) [2024-06-25 19:36:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42870.8, 300 sec: 42821.1). Total num frames: 15604137984. Throughput: 0: 43142.9. Samples: 15604305240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:33,393][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 19:36:34,521][15401] Updated weights for policy 0, policy_version 952404 (0.0032) [2024-06-25 19:36:37,907][15401] Updated weights for policy 0, policy_version 952414 (0.0033) [2024-06-25 19:36:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 15604367360. Throughput: 0: 42900.2. Samples: 15604428220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:38,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 19:36:41,965][15401] Updated weights for policy 0, policy_version 952424 (0.0044) [2024-06-25 19:36:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15604563968. Throughput: 0: 43074.8. Samples: 15604692040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:43,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 19:36:45,538][15401] Updated weights for policy 0, policy_version 952434 (0.0032) [2024-06-25 19:36:48,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15604776960. Throughput: 0: 43128.4. Samples: 15604954340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:48,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 19:36:49,569][15401] Updated weights for policy 0, policy_version 952444 (0.0036) [2024-06-25 19:36:53,277][15401] Updated weights for policy 0, policy_version 952454 (0.0027) [2024-06-25 19:36:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 15605006336. Throughput: 0: 43080.4. Samples: 15605079920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:53,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 19:36:57,262][15401] Updated weights for policy 0, policy_version 952464 (0.0037) [2024-06-25 19:36:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15605202944. Throughput: 0: 43345.4. Samples: 15605343640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:36:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 19:37:00,807][15401] Updated weights for policy 0, policy_version 952474 (0.0028) [2024-06-25 19:37:03,389][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15605432320. Throughput: 0: 43221.9. Samples: 15605596540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:03,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 19:37:04,960][15401] Updated weights for policy 0, policy_version 952484 (0.0029) [2024-06-25 19:37:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15605645312. Throughput: 0: 43323.2. Samples: 15605729800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:08,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 19:37:08,486][15401] Updated weights for policy 0, policy_version 952494 (0.0030) [2024-06-25 19:37:12,453][15401] Updated weights for policy 0, policy_version 952504 (0.0028) [2024-06-25 19:37:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15605841920. Throughput: 0: 43152.0. Samples: 15605986300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:13,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 19:37:15,970][15401] Updated weights for policy 0, policy_version 952514 (0.0032) [2024-06-25 19:37:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 15606071296. Throughput: 0: 43020.1. Samples: 15606241040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:18,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 19:37:20,130][15401] Updated weights for policy 0, policy_version 952524 (0.0037) [2024-06-25 19:37:23,248][15349] Signal inference workers to stop experience collection... (230900 times) [2024-06-25 19:37:23,248][15349] Signal inference workers to resume experience collection... (230900 times) [2024-06-25 19:37:23,276][15401] InferenceWorker_p0-w0: stopping experience collection (230900 times) [2024-06-25 19:37:23,276][15401] InferenceWorker_p0-w0: resuming experience collection (230900 times) [2024-06-25 19:37:23,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15606284288. Throughput: 0: 43146.6. Samples: 15606369820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:23,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 19:37:23,576][15401] Updated weights for policy 0, policy_version 952534 (0.0039) [2024-06-25 19:37:27,589][15401] Updated weights for policy 0, policy_version 952544 (0.0032) [2024-06-25 19:37:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15606497280. Throughput: 0: 43077.4. Samples: 15606630520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:28,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 19:37:30,993][15401] Updated weights for policy 0, policy_version 952554 (0.0036) [2024-06-25 19:37:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.3, 300 sec: 42821.5). Total num frames: 15606710272. Throughput: 0: 42997.8. Samples: 15606889240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 19:37:33,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 19:37:35,188][15401] Updated weights for policy 0, policy_version 952564 (0.0034) [2024-06-25 19:37:38,393][15132] Fps is (10 sec: 45858.4, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 15606956032. Throughput: 0: 43139.1. Samples: 15607021340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:37:38,394][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 19:37:38,540][15401] Updated weights for policy 0, policy_version 952574 (0.0033) [2024-06-25 19:37:43,367][15401] Updated weights for policy 0, policy_version 952584 (0.0028) [2024-06-25 19:37:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15607136256. Throughput: 0: 42855.1. Samples: 15607272120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:37:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 19:37:43,503][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952585_15607152640.pth... [2024-06-25 19:37:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000951956_15596847104.pth [2024-06-25 19:37:46,711][15401] Updated weights for policy 0, policy_version 952594 (0.0032) [2024-06-25 19:37:48,391][15132] Fps is (10 sec: 40967.3, 60 sec: 43143.2, 300 sec: 42820.6). Total num frames: 15607365632. Throughput: 0: 42888.5. Samples: 15607526600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:37:48,392][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 19:37:51,201][15401] Updated weights for policy 0, policy_version 952604 (0.0038) [2024-06-25 19:37:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15607562240. Throughput: 0: 42940.9. Samples: 15607662140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:37:53,390][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 19:37:54,406][15401] Updated weights for policy 0, policy_version 952614 (0.0032) [2024-06-25 19:37:58,390][15132] Fps is (10 sec: 40967.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15607775232. Throughput: 0: 42745.7. Samples: 15607909860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:37:58,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 19:37:58,743][15401] Updated weights for policy 0, policy_version 952624 (0.0037) [2024-06-25 19:38:01,975][15401] Updated weights for policy 0, policy_version 952634 (0.0023) [2024-06-25 19:38:03,389][15132] Fps is (10 sec: 47514.0, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 15608037376. Throughput: 0: 42816.1. Samples: 15608167760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:03,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 19:38:06,214][15401] Updated weights for policy 0, policy_version 952644 (0.0029) [2024-06-25 19:38:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15608217600. Throughput: 0: 43022.6. Samples: 15608305840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 19:38:09,670][15401] Updated weights for policy 0, policy_version 952654 (0.0031) [2024-06-25 19:38:13,390][15132] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15608414208. Throughput: 0: 42838.6. Samples: 15608558260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:13,400][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 19:38:13,792][15401] Updated weights for policy 0, policy_version 952664 (0.0036) [2024-06-25 19:38:17,204][15401] Updated weights for policy 0, policy_version 952674 (0.0037) [2024-06-25 19:38:18,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 15608659968. Throughput: 0: 42829.8. Samples: 15608816580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:18,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 19:38:21,352][15401] Updated weights for policy 0, policy_version 952684 (0.0031) [2024-06-25 19:38:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15608856576. Throughput: 0: 42863.5. Samples: 15608950040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:23,404][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 19:38:24,660][15401] Updated weights for policy 0, policy_version 952694 (0.0041) [2024-06-25 19:38:28,389][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15609069568. Throughput: 0: 42961.3. Samples: 15609205380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:28,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 19:38:28,678][15401] Updated weights for policy 0, policy_version 952704 (0.0034) [2024-06-25 19:38:32,487][15401] Updated weights for policy 0, policy_version 952714 (0.0026) [2024-06-25 19:38:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 15609298944. Throughput: 0: 43153.8. Samples: 15609468440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:33,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 19:38:36,445][15401] Updated weights for policy 0, policy_version 952724 (0.0044) [2024-06-25 19:38:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42601.0, 300 sec: 42931.7). Total num frames: 15609511936. Throughput: 0: 43145.8. Samples: 15609603700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 19:38:40,005][15401] Updated weights for policy 0, policy_version 952734 (0.0044) [2024-06-25 19:38:40,101][15349] Signal inference workers to stop experience collection... (230950 times) [2024-06-25 19:38:40,158][15401] InferenceWorker_p0-w0: stopping experience collection (230950 times) [2024-06-25 19:38:40,158][15349] Signal inference workers to resume experience collection... (230950 times) [2024-06-25 19:38:40,179][15401] InferenceWorker_p0-w0: resuming experience collection (230950 times) [2024-06-25 19:38:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15609724928. Throughput: 0: 43231.9. Samples: 15609855300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 19:38:44,097][15401] Updated weights for policy 0, policy_version 952744 (0.0045) [2024-06-25 19:38:47,526][15401] Updated weights for policy 0, policy_version 952754 (0.0036) [2024-06-25 19:38:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43145.8, 300 sec: 42987.2). Total num frames: 15609954304. Throughput: 0: 43237.7. Samples: 15610113460. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:48,394][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 19:38:51,637][15401] Updated weights for policy 0, policy_version 952764 (0.0043) [2024-06-25 19:38:53,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15610150912. Throughput: 0: 42990.4. Samples: 15610240400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 19:38:55,148][15401] Updated weights for policy 0, policy_version 952774 (0.0027) [2024-06-25 19:38:58,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 15610396672. Throughput: 0: 43049.4. Samples: 15610495480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:38:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 19:38:59,089][15401] Updated weights for policy 0, policy_version 952784 (0.0034) [2024-06-25 19:39:02,963][15401] Updated weights for policy 0, policy_version 952794 (0.0035) [2024-06-25 19:39:03,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42593.8, 300 sec: 42875.2). Total num frames: 15610593280. Throughput: 0: 43108.0. Samples: 15610756720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:03,397][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 19:39:06,641][15401] Updated weights for policy 0, policy_version 952804 (0.0029) [2024-06-25 19:39:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15610806272. Throughput: 0: 43036.9. Samples: 15610886700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:08,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 19:39:10,410][15401] Updated weights for policy 0, policy_version 952814 (0.0026) [2024-06-25 19:39:13,390][15132] Fps is (10 sec: 42625.4, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15611019264. Throughput: 0: 43043.5. Samples: 15611142340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:13,390][15132] Avg episode reward: [(0, '0.386')] [2024-06-25 19:39:14,237][15401] Updated weights for policy 0, policy_version 952824 (0.0032) [2024-06-25 19:39:18,046][15401] Updated weights for policy 0, policy_version 952834 (0.0033) [2024-06-25 19:39:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 15611248640. Throughput: 0: 42920.3. Samples: 15611399860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:18,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 19:39:21,962][15401] Updated weights for policy 0, policy_version 952844 (0.0029) [2024-06-25 19:39:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15611445248. Throughput: 0: 42820.0. Samples: 15611530600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:23,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 19:39:25,796][15401] Updated weights for policy 0, policy_version 952854 (0.0027) [2024-06-25 19:39:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42987.1). Total num frames: 15611658240. Throughput: 0: 42959.6. Samples: 15611788480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:28,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 19:39:29,461][15401] Updated weights for policy 0, policy_version 952864 (0.0032) [2024-06-25 19:39:33,333][15401] Updated weights for policy 0, policy_version 952874 (0.0032) [2024-06-25 19:39:33,395][15132] Fps is (10 sec: 44211.5, 60 sec: 43140.4, 300 sec: 42986.3). Total num frames: 15611887616. Throughput: 0: 42967.5. Samples: 15612047240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-25 19:39:33,396][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 19:39:37,032][15401] Updated weights for policy 0, policy_version 952884 (0.0040) [2024-06-25 19:39:38,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.7, 300 sec: 42875.7). Total num frames: 15612084224. Throughput: 0: 43095.4. Samples: 15612179800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:39:38,393][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 19:39:40,981][15401] Updated weights for policy 0, policy_version 952894 (0.0033) [2024-06-25 19:39:43,390][15132] Fps is (10 sec: 40983.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15612297216. Throughput: 0: 43063.1. Samples: 15612433320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:39:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 19:39:43,492][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952900_15612313600.pth... [2024-06-25 19:39:43,549][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952269_15601975296.pth [2024-06-25 19:39:44,686][15401] Updated weights for policy 0, policy_version 952904 (0.0035) [2024-06-25 19:39:48,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15612526592. Throughput: 0: 42825.2. Samples: 15612683580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:39:48,390][15132] Avg episode reward: [(0, '0.855')] [2024-06-25 19:39:48,410][15401] Updated weights for policy 0, policy_version 952914 (0.0057) [2024-06-25 19:39:53,110][15401] Updated weights for policy 0, policy_version 952924 (0.0025) [2024-06-25 19:39:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15612706816. Throughput: 0: 42835.7. Samples: 15612814300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:39:53,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 19:39:56,166][15401] Updated weights for policy 0, policy_version 952934 (0.0035) [2024-06-25 19:39:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15612952576. Throughput: 0: 42738.8. Samples: 15613065580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:39:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 19:40:00,610][15401] Updated weights for policy 0, policy_version 952944 (0.0032) [2024-06-25 19:40:03,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43149.2, 300 sec: 43042.7). Total num frames: 15613181952. Throughput: 0: 42665.9. Samples: 15613319820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:03,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 19:40:03,706][15401] Updated weights for policy 0, policy_version 952954 (0.0035) [2024-06-25 19:40:08,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 15613345792. Throughput: 0: 42706.5. Samples: 15613452400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:08,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 19:40:08,543][15401] Updated weights for policy 0, policy_version 952964 (0.0029) [2024-06-25 19:40:08,983][15349] Signal inference workers to stop experience collection... (231000 times) [2024-06-25 19:40:08,983][15349] Signal inference workers to resume experience collection... (231000 times) [2024-06-25 19:40:09,025][15401] InferenceWorker_p0-w0: stopping experience collection (231000 times) [2024-06-25 19:40:09,025][15401] InferenceWorker_p0-w0: resuming experience collection (231000 times) [2024-06-25 19:40:11,723][15401] Updated weights for policy 0, policy_version 952974 (0.0033) [2024-06-25 19:40:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15613575168. Throughput: 0: 42583.2. Samples: 15613704720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:13,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 19:40:16,121][15401] Updated weights for policy 0, policy_version 952984 (0.0049) [2024-06-25 19:40:18,390][15132] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 15613804544. Throughput: 0: 42439.5. Samples: 15613956780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:18,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 19:40:19,657][15401] Updated weights for policy 0, policy_version 952994 (0.0037) [2024-06-25 19:40:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 15614001152. Throughput: 0: 42410.3. Samples: 15614088160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 19:40:23,578][15401] Updated weights for policy 0, policy_version 953004 (0.0043) [2024-06-25 19:40:27,065][15401] Updated weights for policy 0, policy_version 953014 (0.0030) [2024-06-25 19:40:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42876.3). Total num frames: 15614214144. Throughput: 0: 42383.5. Samples: 15614340580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:28,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 19:40:31,371][15401] Updated weights for policy 0, policy_version 953024 (0.0028) [2024-06-25 19:40:33,394][15132] Fps is (10 sec: 44217.9, 60 sec: 42599.4, 300 sec: 42931.0). Total num frames: 15614443520. Throughput: 0: 42596.9. Samples: 15614600620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:33,394][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 19:40:34,508][15401] Updated weights for policy 0, policy_version 953034 (0.0033) [2024-06-25 19:40:38,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 15614640128. Throughput: 0: 42570.6. Samples: 15614729980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:38,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 19:40:39,113][15401] Updated weights for policy 0, policy_version 953044 (0.0038) [2024-06-25 19:40:42,166][15401] Updated weights for policy 0, policy_version 953054 (0.0033) [2024-06-25 19:40:43,390][15132] Fps is (10 sec: 40976.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15614853120. Throughput: 0: 42529.2. Samples: 15614979400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:43,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 19:40:46,830][15401] Updated weights for policy 0, policy_version 953064 (0.0045) [2024-06-25 19:40:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15615066112. Throughput: 0: 42638.6. Samples: 15615238560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:48,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 19:40:50,012][15401] Updated weights for policy 0, policy_version 953074 (0.0026) [2024-06-25 19:40:53,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15615262720. Throughput: 0: 42513.0. Samples: 15615365480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:53,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 19:40:54,397][15401] Updated weights for policy 0, policy_version 953084 (0.0045) [2024-06-25 19:40:57,396][15401] Updated weights for policy 0, policy_version 953094 (0.0037) [2024-06-25 19:40:58,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 15615508480. Throughput: 0: 42507.0. Samples: 15615617640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:40:58,392][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 19:41:02,039][15401] Updated weights for policy 0, policy_version 953104 (0.0033) [2024-06-25 19:41:03,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 15615721472. Throughput: 0: 42741.0. Samples: 15615880120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:03,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 19:41:04,877][15401] Updated weights for policy 0, policy_version 953114 (0.0035) [2024-06-25 19:41:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15615918080. Throughput: 0: 42765.3. Samples: 15616012600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:08,390][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 19:41:09,564][15401] Updated weights for policy 0, policy_version 953124 (0.0029) [2024-06-25 19:41:13,005][15401] Updated weights for policy 0, policy_version 953134 (0.0038) [2024-06-25 19:41:13,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42987.5). Total num frames: 15616163840. Throughput: 0: 42733.8. Samples: 15616263600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:13,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 19:41:17,210][15401] Updated weights for policy 0, policy_version 953144 (0.0036) [2024-06-25 19:41:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15616360448. Throughput: 0: 42705.2. Samples: 15616522180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:18,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 19:41:20,546][15401] Updated weights for policy 0, policy_version 953154 (0.0036) [2024-06-25 19:41:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 15616557056. Throughput: 0: 42565.6. Samples: 15616645440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:23,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 19:41:24,697][15401] Updated weights for policy 0, policy_version 953164 (0.0049) [2024-06-25 19:41:28,119][15401] Updated weights for policy 0, policy_version 953174 (0.0038) [2024-06-25 19:41:28,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42932.0). Total num frames: 15616802816. Throughput: 0: 42800.0. Samples: 15616905400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-25 19:41:28,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 19:41:32,669][15401] Updated weights for policy 0, policy_version 953184 (0.0031) [2024-06-25 19:41:32,887][15349] Signal inference workers to stop experience collection... (231050 times) [2024-06-25 19:41:32,887][15349] Signal inference workers to resume experience collection... (231050 times) [2024-06-25 19:41:32,905][15401] InferenceWorker_p0-w0: stopping experience collection (231050 times) [2024-06-25 19:41:32,937][15401] InferenceWorker_p0-w0: resuming experience collection (231050 times) [2024-06-25 19:41:33,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42601.4, 300 sec: 42820.5). Total num frames: 15616999424. Throughput: 0: 42768.9. Samples: 15617163160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:33,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 19:41:35,615][15401] Updated weights for policy 0, policy_version 953194 (0.0041) [2024-06-25 19:41:38,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15617196032. Throughput: 0: 42827.5. Samples: 15617292720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:38,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-25 19:41:40,259][15401] Updated weights for policy 0, policy_version 953204 (0.0034) [2024-06-25 19:41:43,103][15401] Updated weights for policy 0, policy_version 953214 (0.0025) [2024-06-25 19:41:43,396][15132] Fps is (10 sec: 45845.7, 60 sec: 43413.0, 300 sec: 42986.2). Total num frames: 15617458176. Throughput: 0: 42933.5. Samples: 15617549820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:43,396][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 19:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953214_15617458176.pth... [2024-06-25 19:41:43,501][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952585_15607152640.pth [2024-06-25 19:41:47,878][15401] Updated weights for policy 0, policy_version 953224 (0.0028) [2024-06-25 19:41:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15617638400. Throughput: 0: 42764.8. Samples: 15617804540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:48,390][15132] Avg episode reward: [(0, '0.412')] [2024-06-25 19:41:50,972][15401] Updated weights for policy 0, policy_version 953234 (0.0037) [2024-06-25 19:41:53,390][15132] Fps is (10 sec: 37707.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15617835008. Throughput: 0: 42551.5. Samples: 15617927420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:53,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 19:41:55,553][15401] Updated weights for policy 0, policy_version 953244 (0.0031) [2024-06-25 19:41:58,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 15618097152. Throughput: 0: 42784.5. Samples: 15618188900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:41:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 19:41:58,891][15401] Updated weights for policy 0, policy_version 953254 (0.0029) [2024-06-25 19:42:03,062][15401] Updated weights for policy 0, policy_version 953264 (0.0028) [2024-06-25 19:42:03,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15618277376. Throughput: 0: 42800.0. Samples: 15618448180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:03,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 19:42:06,651][15401] Updated weights for policy 0, policy_version 953274 (0.0034) [2024-06-25 19:42:08,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15618473984. Throughput: 0: 42726.3. Samples: 15618568120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:08,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 19:42:11,226][15401] Updated weights for policy 0, policy_version 953284 (0.0034) [2024-06-25 19:42:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15618719744. Throughput: 0: 42680.0. Samples: 15618826000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:13,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 19:42:14,463][15401] Updated weights for policy 0, policy_version 953294 (0.0034) [2024-06-25 19:42:18,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 15618883584. Throughput: 0: 42785.0. Samples: 15619088480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:18,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 19:42:18,839][15401] Updated weights for policy 0, policy_version 953304 (0.0031) [2024-06-25 19:42:22,028][15401] Updated weights for policy 0, policy_version 953314 (0.0031) [2024-06-25 19:42:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15619129344. Throughput: 0: 42561.9. Samples: 15619208000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:23,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 19:42:26,314][15401] Updated weights for policy 0, policy_version 953324 (0.0041) [2024-06-25 19:42:28,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15619358720. Throughput: 0: 42646.9. Samples: 15619468660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:28,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 19:42:29,573][15401] Updated weights for policy 0, policy_version 953334 (0.0039) [2024-06-25 19:42:33,390][15132] Fps is (10 sec: 40956.5, 60 sec: 42324.7, 300 sec: 42654.3). Total num frames: 15619538944. Throughput: 0: 42819.2. Samples: 15619731440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:33,391][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 19:42:34,046][15401] Updated weights for policy 0, policy_version 953344 (0.0025) [2024-06-25 19:42:37,175][15401] Updated weights for policy 0, policy_version 953354 (0.0036) [2024-06-25 19:42:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15619784704. Throughput: 0: 42744.0. Samples: 15619850900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:38,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 19:42:41,735][15401] Updated weights for policy 0, policy_version 953364 (0.0025) [2024-06-25 19:42:43,390][15132] Fps is (10 sec: 45878.7, 60 sec: 42329.8, 300 sec: 42820.8). Total num frames: 15619997696. Throughput: 0: 42708.8. Samples: 15620110800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:43,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 19:42:44,854][15401] Updated weights for policy 0, policy_version 953374 (0.0032) [2024-06-25 19:42:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15620177920. Throughput: 0: 42704.6. Samples: 15620369880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 19:42:49,258][15401] Updated weights for policy 0, policy_version 953384 (0.0037) [2024-06-25 19:42:52,535][15401] Updated weights for policy 0, policy_version 953394 (0.0024) [2024-06-25 19:42:53,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 15620423680. Throughput: 0: 42745.5. Samples: 15620491660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:53,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 19:42:57,129][15349] Signal inference workers to stop experience collection... (231100 times) [2024-06-25 19:42:57,181][15401] InferenceWorker_p0-w0: stopping experience collection (231100 times) [2024-06-25 19:42:57,185][15349] Signal inference workers to resume experience collection... (231100 times) [2024-06-25 19:42:57,192][15401] InferenceWorker_p0-w0: resuming experience collection (231100 times) [2024-06-25 19:42:57,199][15401] Updated weights for policy 0, policy_version 953404 (0.0029) [2024-06-25 19:42:58,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15620636672. Throughput: 0: 42709.8. Samples: 15620747940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:42:58,390][15132] Avg episode reward: [(0, '0.301')] [2024-06-25 19:43:00,329][15401] Updated weights for policy 0, policy_version 953414 (0.0038) [2024-06-25 19:43:03,392][15132] Fps is (10 sec: 37673.7, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 15620800512. Throughput: 0: 42633.6. Samples: 15621007100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:43:03,392][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 19:43:04,726][15401] Updated weights for policy 0, policy_version 953424 (0.0035) [2024-06-25 19:43:08,045][15401] Updated weights for policy 0, policy_version 953434 (0.0044) [2024-06-25 19:43:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15621079040. Throughput: 0: 42668.8. Samples: 15621128100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:43:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 19:43:12,247][15401] Updated weights for policy 0, policy_version 953444 (0.0030) [2024-06-25 19:43:13,390][15132] Fps is (10 sec: 47524.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15621275648. Throughput: 0: 42737.8. Samples: 15621391860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:43:13,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 19:43:15,804][15401] Updated weights for policy 0, policy_version 953454 (0.0049) [2024-06-25 19:43:18,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15621455872. Throughput: 0: 42668.0. Samples: 15621651460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:43:18,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 19:43:19,997][15401] Updated weights for policy 0, policy_version 953464 (0.0031) [2024-06-25 19:43:23,337][15401] Updated weights for policy 0, policy_version 953474 (0.0033) [2024-06-25 19:43:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15621718016. Throughput: 0: 42676.9. Samples: 15621771360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 19:43:23,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 19:43:27,697][15401] Updated weights for policy 0, policy_version 953484 (0.0036) [2024-06-25 19:43:28,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15621914624. Throughput: 0: 42698.8. Samples: 15622032240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:28,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 19:43:30,935][15401] Updated weights for policy 0, policy_version 953494 (0.0042) [2024-06-25 19:43:33,394][15132] Fps is (10 sec: 39305.5, 60 sec: 42869.1, 300 sec: 42708.9). Total num frames: 15622111232. Throughput: 0: 42699.1. Samples: 15622291520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:33,394][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 19:43:35,152][15401] Updated weights for policy 0, policy_version 953504 (0.0038) [2024-06-25 19:43:38,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15622356992. Throughput: 0: 42656.8. Samples: 15622411220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:38,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 19:43:38,578][15401] Updated weights for policy 0, policy_version 953514 (0.0032) [2024-06-25 19:43:42,984][15401] Updated weights for policy 0, policy_version 953524 (0.0026) [2024-06-25 19:43:43,390][15132] Fps is (10 sec: 44254.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15622553600. Throughput: 0: 42772.9. Samples: 15622672720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:43,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-25 19:43:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953526_15622569984.pth... [2024-06-25 19:43:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000952900_15612313600.pth [2024-06-25 19:43:46,654][15401] Updated weights for policy 0, policy_version 953534 (0.0031) [2024-06-25 19:43:48,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15622733824. Throughput: 0: 42754.3. Samples: 15622930940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:48,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 19:43:50,610][15401] Updated weights for policy 0, policy_version 953544 (0.0029) [2024-06-25 19:43:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15622979584. Throughput: 0: 42745.5. Samples: 15623051640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:53,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 19:43:54,512][15401] Updated weights for policy 0, policy_version 953554 (0.0035) [2024-06-25 19:43:58,129][15401] Updated weights for policy 0, policy_version 953564 (0.0035) [2024-06-25 19:43:58,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 15623208960. Throughput: 0: 42660.4. Samples: 15623311580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:43:58,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 19:44:02,040][15401] Updated weights for policy 0, policy_version 953574 (0.0048) [2024-06-25 19:44:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 15623372800. Throughput: 0: 42602.6. Samples: 15623568580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:03,403][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 19:44:05,674][15401] Updated weights for policy 0, policy_version 953584 (0.0031) [2024-06-25 19:44:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15623618560. Throughput: 0: 42616.9. Samples: 15623689120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:08,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 19:44:09,926][15401] Updated weights for policy 0, policy_version 953594 (0.0043) [2024-06-25 19:44:13,384][15401] Updated weights for policy 0, policy_version 953604 (0.0025) [2024-06-25 19:44:13,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15623847936. Throughput: 0: 42704.3. Samples: 15623953940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:13,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:44:17,817][15401] Updated weights for policy 0, policy_version 953614 (0.0031) [2024-06-25 19:44:18,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15624028160. Throughput: 0: 42401.2. Samples: 15624199400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 19:44:21,126][15401] Updated weights for policy 0, policy_version 953624 (0.0037) [2024-06-25 19:44:23,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 15624241152. Throughput: 0: 42469.5. Samples: 15624322360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 19:44:25,773][15401] Updated weights for policy 0, policy_version 953634 (0.0043) [2024-06-25 19:44:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42599.2). Total num frames: 15624454144. Throughput: 0: 42602.4. Samples: 15624589820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:28,390][15132] Avg episode reward: [(0, '0.289')] [2024-06-25 19:44:28,926][15401] Updated weights for policy 0, policy_version 953644 (0.0023) [2024-06-25 19:44:33,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42328.3, 300 sec: 42598.8). Total num frames: 15624650752. Throughput: 0: 42533.4. Samples: 15624844940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 19:44:33,432][15401] Updated weights for policy 0, policy_version 953654 (0.0041) [2024-06-25 19:44:35,247][15349] Signal inference workers to stop experience collection... (231150 times) [2024-06-25 19:44:35,249][15349] Signal inference workers to resume experience collection... (231150 times) [2024-06-25 19:44:35,294][15401] InferenceWorker_p0-w0: stopping experience collection (231150 times) [2024-06-25 19:44:35,294][15401] InferenceWorker_p0-w0: resuming experience collection (231150 times) [2024-06-25 19:44:36,665][15401] Updated weights for policy 0, policy_version 953664 (0.0035) [2024-06-25 19:44:38,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15624896512. Throughput: 0: 42526.7. Samples: 15624965340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 19:44:41,030][15401] Updated weights for policy 0, policy_version 953674 (0.0032) [2024-06-25 19:44:43,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15625109504. Throughput: 0: 42722.3. Samples: 15625234080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 19:44:44,315][15401] Updated weights for policy 0, policy_version 953684 (0.0026) [2024-06-25 19:44:48,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15625306112. Throughput: 0: 42780.5. Samples: 15625493700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 19:44:48,669][15401] Updated weights for policy 0, policy_version 953694 (0.0032) [2024-06-25 19:44:51,880][15401] Updated weights for policy 0, policy_version 953704 (0.0036) [2024-06-25 19:44:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15625551872. Throughput: 0: 42781.7. Samples: 15625614300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:53,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 19:44:56,244][15401] Updated weights for policy 0, policy_version 953714 (0.0035) [2024-06-25 19:44:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15625748480. Throughput: 0: 42751.6. Samples: 15625877760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:44:58,399][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 19:44:59,283][15401] Updated weights for policy 0, policy_version 953724 (0.0044) [2024-06-25 19:45:03,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15625945088. Throughput: 0: 42997.8. Samples: 15626134300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:45:03,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 19:45:03,916][15401] Updated weights for policy 0, policy_version 953734 (0.0037) [2024-06-25 19:45:06,776][15401] Updated weights for policy 0, policy_version 953744 (0.0028) [2024-06-25 19:45:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15626207232. Throughput: 0: 42969.9. Samples: 15626256000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:45:08,396][15132] Avg episode reward: [(0, '0.369')] [2024-06-25 19:45:11,827][15401] Updated weights for policy 0, policy_version 953754 (0.0034) [2024-06-25 19:45:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 15626371072. Throughput: 0: 42764.4. Samples: 15626514220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:45:13,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 19:45:14,408][15401] Updated weights for policy 0, policy_version 953764 (0.0039) [2024-06-25 19:45:18,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15626584064. Throughput: 0: 42786.5. Samples: 15626770340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:45:18,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 19:45:19,552][15401] Updated weights for policy 0, policy_version 953774 (0.0039) [2024-06-25 19:45:22,114][15401] Updated weights for policy 0, policy_version 953784 (0.0036) [2024-06-25 19:45:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15626829824. Throughput: 0: 42911.9. Samples: 15626896380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-25 19:45:23,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 19:45:27,024][15401] Updated weights for policy 0, policy_version 953794 (0.0041) [2024-06-25 19:45:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 15627026432. Throughput: 0: 42693.9. Samples: 15627155300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:28,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 19:45:29,904][15401] Updated weights for policy 0, policy_version 953804 (0.0030) [2024-06-25 19:45:33,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 15627223040. Throughput: 0: 42750.6. Samples: 15627417580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:33,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 19:45:34,625][15401] Updated weights for policy 0, policy_version 953814 (0.0045) [2024-06-25 19:45:37,485][15401] Updated weights for policy 0, policy_version 953824 (0.0042) [2024-06-25 19:45:38,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15627485184. Throughput: 0: 42776.5. Samples: 15627539240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:38,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 19:45:42,311][15401] Updated weights for policy 0, policy_version 953834 (0.0022) [2024-06-25 19:45:43,389][15132] Fps is (10 sec: 45886.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15627681792. Throughput: 0: 42819.2. Samples: 15627804620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:43,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 19:45:43,499][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953839_15627698176.pth... [2024-06-25 19:45:43,555][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953214_15617458176.pth [2024-06-25 19:45:45,230][15401] Updated weights for policy 0, policy_version 953844 (0.0041) [2024-06-25 19:45:48,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15627878400. Throughput: 0: 42713.7. Samples: 15628056420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 19:45:50,085][15401] Updated weights for policy 0, policy_version 953854 (0.0028) [2024-06-25 19:45:53,055][15401] Updated weights for policy 0, policy_version 953864 (0.0033) [2024-06-25 19:45:53,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 15628140544. Throughput: 0: 42830.3. Samples: 15628183360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 19:45:57,444][15401] Updated weights for policy 0, policy_version 953874 (0.0043) [2024-06-25 19:45:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15628304384. Throughput: 0: 42946.2. Samples: 15628446800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:45:58,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 19:45:58,742][15349] Signal inference workers to stop experience collection... (231200 times) [2024-06-25 19:45:58,762][15401] InferenceWorker_p0-w0: stopping experience collection (231200 times) [2024-06-25 19:45:58,854][15349] Signal inference workers to resume experience collection... (231200 times) [2024-06-25 19:45:58,854][15401] InferenceWorker_p0-w0: resuming experience collection (231200 times) [2024-06-25 19:46:00,664][15401] Updated weights for policy 0, policy_version 953884 (0.0040) [2024-06-25 19:46:03,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15628517376. Throughput: 0: 42841.9. Samples: 15628698220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:03,390][15132] Avg episode reward: [(0, '0.895')] [2024-06-25 19:46:04,826][15401] Updated weights for policy 0, policy_version 953894 (0.0035) [2024-06-25 19:46:08,140][15401] Updated weights for policy 0, policy_version 953904 (0.0034) [2024-06-25 19:46:08,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15628763136. Throughput: 0: 42983.1. Samples: 15628830620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:08,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 19:46:12,819][15401] Updated weights for policy 0, policy_version 953914 (0.0041) [2024-06-25 19:46:13,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15628959744. Throughput: 0: 42919.1. Samples: 15629086660. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 19:46:15,737][15401] Updated weights for policy 0, policy_version 953924 (0.0036) [2024-06-25 19:46:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15629172736. Throughput: 0: 42665.0. Samples: 15629337400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:18,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 19:46:20,507][15401] Updated weights for policy 0, policy_version 953934 (0.0035) [2024-06-25 19:46:23,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15629402112. Throughput: 0: 42974.7. Samples: 15629473100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:23,390][15132] Avg episode reward: [(0, '0.809')] [2024-06-25 19:46:23,433][15401] Updated weights for policy 0, policy_version 953944 (0.0035) [2024-06-25 19:46:28,169][15401] Updated weights for policy 0, policy_version 953954 (0.0056) [2024-06-25 19:46:28,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15629582336. Throughput: 0: 42681.8. Samples: 15629725300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 19:46:31,431][15401] Updated weights for policy 0, policy_version 953964 (0.0039) [2024-06-25 19:46:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43419.2, 300 sec: 42820.5). Total num frames: 15629828096. Throughput: 0: 42508.3. Samples: 15629969300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:33,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 19:46:35,878][15401] Updated weights for policy 0, policy_version 953974 (0.0038) [2024-06-25 19:46:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 15630024704. Throughput: 0: 42577.9. Samples: 15630099360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:38,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 19:46:39,006][15401] Updated weights for policy 0, policy_version 953984 (0.0030) [2024-06-25 19:46:43,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15630221312. Throughput: 0: 42399.2. Samples: 15630354760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 19:46:43,517][15401] Updated weights for policy 0, policy_version 953994 (0.0034) [2024-06-25 19:46:47,449][15401] Updated weights for policy 0, policy_version 954004 (0.0043) [2024-06-25 19:46:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15630434304. Throughput: 0: 42488.1. Samples: 15630610180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 19:46:51,087][15401] Updated weights for policy 0, policy_version 954014 (0.0032) [2024-06-25 19:46:53,395][15132] Fps is (10 sec: 44213.9, 60 sec: 42048.7, 300 sec: 42597.7). Total num frames: 15630663680. Throughput: 0: 42360.6. Samples: 15630737060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:53,395][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 19:46:55,022][15401] Updated weights for policy 0, policy_version 954024 (0.0031) [2024-06-25 19:46:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15630876672. Throughput: 0: 42331.5. Samples: 15630991580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:46:58,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 19:46:58,577][15401] Updated weights for policy 0, policy_version 954034 (0.0028) [2024-06-25 19:47:02,597][15401] Updated weights for policy 0, policy_version 954044 (0.0025) [2024-06-25 19:47:03,389][15132] Fps is (10 sec: 40981.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15631073280. Throughput: 0: 42574.7. Samples: 15631253260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:47:03,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 19:47:06,136][15401] Updated weights for policy 0, policy_version 954054 (0.0027) [2024-06-25 19:47:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15631319040. Throughput: 0: 42409.3. Samples: 15631381520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:47:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 19:47:10,059][15401] Updated weights for policy 0, policy_version 954064 (0.0033) [2024-06-25 19:47:13,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15631515648. Throughput: 0: 42588.4. Samples: 15631641780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:47:13,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 19:47:13,755][15401] Updated weights for policy 0, policy_version 954074 (0.0043) [2024-06-25 19:47:17,683][15401] Updated weights for policy 0, policy_version 954084 (0.0031) [2024-06-25 19:47:18,395][15132] Fps is (10 sec: 40939.2, 60 sec: 42594.7, 300 sec: 42708.7). Total num frames: 15631728640. Throughput: 0: 42752.6. Samples: 15631893380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:47:18,395][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 19:47:21,286][15349] Signal inference workers to stop experience collection... (231250 times) [2024-06-25 19:47:21,341][15349] Signal inference workers to resume experience collection... (231250 times) [2024-06-25 19:47:21,342][15401] InferenceWorker_p0-w0: stopping experience collection (231250 times) [2024-06-25 19:47:21,360][15401] InferenceWorker_p0-w0: resuming experience collection (231250 times) [2024-06-25 19:47:21,478][15401] Updated weights for policy 0, policy_version 954094 (0.0028) [2024-06-25 19:47:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15631941632. Throughput: 0: 42831.5. Samples: 15632026780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-25 19:47:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 19:47:25,382][15401] Updated weights for policy 0, policy_version 954104 (0.0028) [2024-06-25 19:47:28,390][15132] Fps is (10 sec: 40980.8, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 15632138240. Throughput: 0: 42771.0. Samples: 15632279460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:28,390][15132] Avg episode reward: [(0, '0.241')] [2024-06-25 19:47:29,233][15401] Updated weights for policy 0, policy_version 954114 (0.0031) [2024-06-25 19:47:33,078][15401] Updated weights for policy 0, policy_version 954124 (0.0043) [2024-06-25 19:47:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15632367616. Throughput: 0: 42633.1. Samples: 15632528680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:33,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 19:47:37,310][15401] Updated weights for policy 0, policy_version 954134 (0.0038) [2024-06-25 19:47:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 15632580608. Throughput: 0: 42752.4. Samples: 15632660700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 19:47:41,341][15401] Updated weights for policy 0, policy_version 954144 (0.0046) [2024-06-25 19:47:43,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15632777216. Throughput: 0: 42685.8. Samples: 15632912440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:43,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 19:47:43,461][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954150_15632793600.pth... [2024-06-25 19:47:43,511][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953526_15622569984.pth [2024-06-25 19:47:45,045][15401] Updated weights for policy 0, policy_version 954154 (0.0030) [2024-06-25 19:47:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15633006592. Throughput: 0: 42401.3. Samples: 15633161320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 19:47:48,915][15401] Updated weights for policy 0, policy_version 954164 (0.0040) [2024-06-25 19:47:52,854][15401] Updated weights for policy 0, policy_version 954174 (0.0028) [2024-06-25 19:47:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42055.9, 300 sec: 42542.9). Total num frames: 15633186816. Throughput: 0: 42513.9. Samples: 15633294640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:53,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 19:47:56,608][15401] Updated weights for policy 0, policy_version 954184 (0.0040) [2024-06-25 19:47:58,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42050.6, 300 sec: 42709.5). Total num frames: 15633399808. Throughput: 0: 42309.8. Samples: 15633545820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:47:58,393][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 19:48:00,726][15401] Updated weights for policy 0, policy_version 954194 (0.0052) [2024-06-25 19:48:03,390][15132] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 15633661952. Throughput: 0: 42296.8. Samples: 15633796520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:03,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 19:48:04,267][15401] Updated weights for policy 0, policy_version 954204 (0.0038) [2024-06-25 19:48:08,390][15132] Fps is (10 sec: 42608.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 15633825792. Throughput: 0: 42442.6. Samples: 15633936700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:08,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 19:48:08,445][15401] Updated weights for policy 0, policy_version 954214 (0.0045) [2024-06-25 19:48:11,971][15401] Updated weights for policy 0, policy_version 954224 (0.0041) [2024-06-25 19:48:13,392][15132] Fps is (10 sec: 37674.3, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 15634038784. Throughput: 0: 42396.4. Samples: 15634187400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:13,393][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 19:48:16,254][15401] Updated weights for policy 0, policy_version 954234 (0.0050) [2024-06-25 19:48:18,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42875.2, 300 sec: 42653.9). Total num frames: 15634300928. Throughput: 0: 42361.0. Samples: 15634434920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:18,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 19:48:19,611][15401] Updated weights for policy 0, policy_version 954244 (0.0031) [2024-06-25 19:48:23,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 15634464768. Throughput: 0: 42447.9. Samples: 15634570860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:23,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 19:48:23,863][15401] Updated weights for policy 0, policy_version 954254 (0.0039) [2024-06-25 19:48:27,122][15401] Updated weights for policy 0, policy_version 954264 (0.0032) [2024-06-25 19:48:28,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42599.0). Total num frames: 15634677760. Throughput: 0: 42435.2. Samples: 15634822020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:28,390][15132] Avg episode reward: [(0, '0.455')] [2024-06-25 19:48:31,517][15401] Updated weights for policy 0, policy_version 954274 (0.0041) [2024-06-25 19:48:32,580][15349] Signal inference workers to stop experience collection... (231300 times) [2024-06-25 19:48:32,633][15401] InferenceWorker_p0-w0: stopping experience collection (231300 times) [2024-06-25 19:48:32,634][15349] Signal inference workers to resume experience collection... (231300 times) [2024-06-25 19:48:32,649][15401] InferenceWorker_p0-w0: resuming experience collection (231300 times) [2024-06-25 19:48:33,389][15132] Fps is (10 sec: 47514.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15634939904. Throughput: 0: 42444.0. Samples: 15635071300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:33,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 19:48:35,367][15401] Updated weights for policy 0, policy_version 954284 (0.0034) [2024-06-25 19:48:38,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15635120128. Throughput: 0: 42601.2. Samples: 15635211700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:38,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 19:48:39,230][15401] Updated weights for policy 0, policy_version 954294 (0.0037) [2024-06-25 19:48:42,845][15401] Updated weights for policy 0, policy_version 954304 (0.0030) [2024-06-25 19:48:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15635333120. Throughput: 0: 42594.2. Samples: 15635462460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:43,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 19:48:46,729][15401] Updated weights for policy 0, policy_version 954314 (0.0035) [2024-06-25 19:48:48,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15635578880. Throughput: 0: 42761.4. Samples: 15635720780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:48,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:48:50,247][15401] Updated weights for policy 0, policy_version 954324 (0.0043) [2024-06-25 19:48:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15635775488. Throughput: 0: 42725.4. Samples: 15635859340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:53,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 19:48:54,162][15401] Updated weights for policy 0, policy_version 954334 (0.0033) [2024-06-25 19:48:57,793][15401] Updated weights for policy 0, policy_version 954344 (0.0025) [2024-06-25 19:48:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.3, 300 sec: 42765.0). Total num frames: 15635988480. Throughput: 0: 42737.9. Samples: 15636110500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:48:58,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 19:49:01,591][15401] Updated weights for policy 0, policy_version 954354 (0.0025) [2024-06-25 19:49:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15636217856. Throughput: 0: 42945.3. Samples: 15636367460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:49:03,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 19:49:05,302][15401] Updated weights for policy 0, policy_version 954364 (0.0031) [2024-06-25 19:49:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15636414464. Throughput: 0: 42857.9. Samples: 15636499460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:49:08,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 19:49:09,045][15401] Updated weights for policy 0, policy_version 954374 (0.0032) [2024-06-25 19:49:12,825][15401] Updated weights for policy 0, policy_version 954384 (0.0033) [2024-06-25 19:49:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43419.4, 300 sec: 42765.0). Total num frames: 15636643840. Throughput: 0: 42970.7. Samples: 15636755700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:49:13,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 19:49:16,867][15401] Updated weights for policy 0, policy_version 954394 (0.0032) [2024-06-25 19:49:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15636840448. Throughput: 0: 43175.1. Samples: 15637014180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:49:18,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-25 19:49:20,466][15401] Updated weights for policy 0, policy_version 954404 (0.0036) [2024-06-25 19:49:23,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15637053440. Throughput: 0: 42876.1. Samples: 15637141120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:23,400][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 19:49:24,483][15401] Updated weights for policy 0, policy_version 954414 (0.0032) [2024-06-25 19:49:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15637266432. Throughput: 0: 43015.7. Samples: 15637398160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:28,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 19:49:28,472][15401] Updated weights for policy 0, policy_version 954424 (0.0035) [2024-06-25 19:49:32,078][15401] Updated weights for policy 0, policy_version 954434 (0.0037) [2024-06-25 19:49:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15637479424. Throughput: 0: 43136.5. Samples: 15637661920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:33,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 19:49:35,969][15401] Updated weights for policy 0, policy_version 954444 (0.0033) [2024-06-25 19:49:38,392][15132] Fps is (10 sec: 44223.8, 60 sec: 43142.6, 300 sec: 42709.1). Total num frames: 15637708800. Throughput: 0: 42799.1. Samples: 15637785420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:38,393][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 19:49:40,024][15401] Updated weights for policy 0, policy_version 954454 (0.0033) [2024-06-25 19:49:43,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15637921792. Throughput: 0: 42980.5. Samples: 15638044620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 19:49:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954463_15637921792.pth... [2024-06-25 19:49:43,504][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000953839_15627698176.pth [2024-06-25 19:49:43,642][15401] Updated weights for policy 0, policy_version 954464 (0.0042) [2024-06-25 19:49:47,707][15401] Updated weights for policy 0, policy_version 954474 (0.0032) [2024-06-25 19:49:48,390][15132] Fps is (10 sec: 44248.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15638151168. Throughput: 0: 42924.0. Samples: 15638299040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 19:49:51,311][15401] Updated weights for policy 0, policy_version 954484 (0.0033) [2024-06-25 19:49:52,265][15349] Signal inference workers to stop experience collection... (231350 times) [2024-06-25 19:49:52,265][15349] Signal inference workers to resume experience collection... (231350 times) [2024-06-25 19:49:52,290][15401] InferenceWorker_p0-w0: stopping experience collection (231350 times) [2024-06-25 19:49:52,290][15401] InferenceWorker_p0-w0: resuming experience collection (231350 times) [2024-06-25 19:49:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15638347776. Throughput: 0: 42955.5. Samples: 15638432460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 19:49:55,555][15401] Updated weights for policy 0, policy_version 954494 (0.0038) [2024-06-25 19:49:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15638577152. Throughput: 0: 42884.1. Samples: 15638685480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:49:58,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 19:49:58,759][15401] Updated weights for policy 0, policy_version 954504 (0.0032) [2024-06-25 19:50:03,161][15401] Updated weights for policy 0, policy_version 954514 (0.0030) [2024-06-25 19:50:03,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15638757376. Throughput: 0: 43061.4. Samples: 15638951940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 19:50:06,585][15401] Updated weights for policy 0, policy_version 954524 (0.0033) [2024-06-25 19:50:08,390][15132] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15639003136. Throughput: 0: 42919.9. Samples: 15639072520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 19:50:10,996][15401] Updated weights for policy 0, policy_version 954534 (0.0032) [2024-06-25 19:50:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15639183360. Throughput: 0: 42823.8. Samples: 15639325240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:13,392][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 19:50:14,127][15401] Updated weights for policy 0, policy_version 954544 (0.0041) [2024-06-25 19:50:18,392][15132] Fps is (10 sec: 39312.5, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 15639396352. Throughput: 0: 42803.0. Samples: 15639588160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:18,393][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:50:18,718][15401] Updated weights for policy 0, policy_version 954554 (0.0040) [2024-06-25 19:50:21,644][15401] Updated weights for policy 0, policy_version 954564 (0.0028) [2024-06-25 19:50:23,390][15132] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15639642112. Throughput: 0: 42880.5. Samples: 15639714920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:23,390][15132] Avg episode reward: [(0, '0.756')] [2024-06-25 19:50:26,281][15401] Updated weights for policy 0, policy_version 954574 (0.0036) [2024-06-25 19:50:28,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 15639822336. Throughput: 0: 42804.8. Samples: 15639970840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 19:50:29,198][15401] Updated weights for policy 0, policy_version 954584 (0.0040) [2024-06-25 19:50:33,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15640051712. Throughput: 0: 42878.8. Samples: 15640228580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 19:50:33,625][15401] Updated weights for policy 0, policy_version 954594 (0.0050) [2024-06-25 19:50:36,943][15401] Updated weights for policy 0, policy_version 954604 (0.0040) [2024-06-25 19:50:38,389][15132] Fps is (10 sec: 47514.2, 60 sec: 43146.6, 300 sec: 42765.0). Total num frames: 15640297472. Throughput: 0: 42827.2. Samples: 15640359680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 19:50:41,115][15401] Updated weights for policy 0, policy_version 954614 (0.0040) [2024-06-25 19:50:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15640477696. Throughput: 0: 42860.0. Samples: 15640614180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 19:50:45,047][15401] Updated weights for policy 0, policy_version 954624 (0.0030) [2024-06-25 19:50:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15640707072. Throughput: 0: 42526.2. Samples: 15640865620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 19:50:48,740][15401] Updated weights for policy 0, policy_version 954634 (0.0030) [2024-06-25 19:50:52,595][15401] Updated weights for policy 0, policy_version 954644 (0.0033) [2024-06-25 19:50:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15640936448. Throughput: 0: 42784.5. Samples: 15640997820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:53,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 19:50:56,211][15401] Updated weights for policy 0, policy_version 954654 (0.0034) [2024-06-25 19:50:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15641116672. Throughput: 0: 42896.5. Samples: 15641255580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:50:58,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 19:50:59,971][15349] Signal inference workers to stop experience collection... (231400 times) [2024-06-25 19:50:59,999][15401] InferenceWorker_p0-w0: stopping experience collection (231400 times) [2024-06-25 19:51:00,037][15349] Signal inference workers to resume experience collection... (231400 times) [2024-06-25 19:51:00,037][15401] InferenceWorker_p0-w0: resuming experience collection (231400 times) [2024-06-25 19:51:00,183][15401] Updated weights for policy 0, policy_version 954664 (0.0029) [2024-06-25 19:51:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 15641346048. Throughput: 0: 42779.3. Samples: 15641513120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:51:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 19:51:03,884][15401] Updated weights for policy 0, policy_version 954674 (0.0039) [2024-06-25 19:51:07,647][15401] Updated weights for policy 0, policy_version 954684 (0.0037) [2024-06-25 19:51:08,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15641575424. Throughput: 0: 42941.5. Samples: 15641647280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:51:08,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 19:51:11,418][15401] Updated weights for policy 0, policy_version 954694 (0.0046) [2024-06-25 19:51:13,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15641772032. Throughput: 0: 42786.7. Samples: 15641896240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:51:13,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 19:51:15,328][15401] Updated weights for policy 0, policy_version 954704 (0.0037) [2024-06-25 19:51:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 43146.3, 300 sec: 42653.9). Total num frames: 15641985024. Throughput: 0: 42803.1. Samples: 15642154720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:51:18,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 19:51:18,871][15401] Updated weights for policy 0, policy_version 954714 (0.0030) [2024-06-25 19:51:22,928][15401] Updated weights for policy 0, policy_version 954724 (0.0049) [2024-06-25 19:51:23,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15642214400. Throughput: 0: 42921.7. Samples: 15642291160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 19:51:26,898][15401] Updated weights for policy 0, policy_version 954734 (0.0036) [2024-06-25 19:51:28,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15642394624. Throughput: 0: 42738.4. Samples: 15642537420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:28,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 19:51:30,577][15401] Updated weights for policy 0, policy_version 954744 (0.0038) [2024-06-25 19:51:33,389][15132] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15642640384. Throughput: 0: 42769.8. Samples: 15642790260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 19:51:34,524][15401] Updated weights for policy 0, policy_version 954754 (0.0046) [2024-06-25 19:51:38,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15642836992. Throughput: 0: 42918.3. Samples: 15642929140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:38,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 19:51:38,482][15401] Updated weights for policy 0, policy_version 954764 (0.0025) [2024-06-25 19:51:42,154][15401] Updated weights for policy 0, policy_version 954774 (0.0029) [2024-06-25 19:51:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15643049984. Throughput: 0: 42685.8. Samples: 15643176440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:51:43,422][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954776_15643049984.pth... [2024-06-25 19:51:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954150_15632793600.pth [2024-06-25 19:51:46,106][15401] Updated weights for policy 0, policy_version 954784 (0.0025) [2024-06-25 19:51:48,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.7). Total num frames: 15643279360. Throughput: 0: 42628.3. Samples: 15643431400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:48,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 19:51:49,949][15401] Updated weights for policy 0, policy_version 954794 (0.0044) [2024-06-25 19:51:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15643459584. Throughput: 0: 42591.0. Samples: 15643563880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:53,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 19:51:53,792][15401] Updated weights for policy 0, policy_version 954804 (0.0032) [2024-06-25 19:51:57,494][15401] Updated weights for policy 0, policy_version 954814 (0.0027) [2024-06-25 19:51:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15643688960. Throughput: 0: 42653.0. Samples: 15643815620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:51:58,390][15132] Avg episode reward: [(0, '0.093')] [2024-06-25 19:52:01,479][15401] Updated weights for policy 0, policy_version 954824 (0.0040) [2024-06-25 19:52:03,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15643934720. Throughput: 0: 42549.6. Samples: 15644069460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:03,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 19:52:05,247][15401] Updated weights for policy 0, policy_version 954834 (0.0034) [2024-06-25 19:52:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 15644098560. Throughput: 0: 42509.9. Samples: 15644204100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:08,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 19:52:09,086][15401] Updated weights for policy 0, policy_version 954844 (0.0030) [2024-06-25 19:52:12,880][15401] Updated weights for policy 0, policy_version 954854 (0.0039) [2024-06-25 19:52:13,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42710.2). Total num frames: 15644327936. Throughput: 0: 42622.3. Samples: 15644455420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 19:52:16,934][15401] Updated weights for policy 0, policy_version 954864 (0.0028) [2024-06-25 19:52:18,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15644557312. Throughput: 0: 42530.6. Samples: 15644704140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 19:52:20,728][15401] Updated weights for policy 0, policy_version 954874 (0.0036) [2024-06-25 19:52:23,390][15132] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15644721152. Throughput: 0: 42260.5. Samples: 15644830860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:23,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 19:52:24,637][15401] Updated weights for policy 0, policy_version 954884 (0.0034) [2024-06-25 19:52:26,638][15349] Signal inference workers to stop experience collection... (231450 times) [2024-06-25 19:52:26,680][15401] InferenceWorker_p0-w0: stopping experience collection (231450 times) [2024-06-25 19:52:26,694][15349] Signal inference workers to resume experience collection... (231450 times) [2024-06-25 19:52:26,696][15401] InferenceWorker_p0-w0: resuming experience collection (231450 times) [2024-06-25 19:52:28,266][15401] Updated weights for policy 0, policy_version 954894 (0.0038) [2024-06-25 19:52:28,396][15132] Fps is (10 sec: 42571.3, 60 sec: 43140.1, 300 sec: 42764.1). Total num frames: 15644983296. Throughput: 0: 42302.1. Samples: 15645080300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:28,396][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 19:52:33,128][15401] Updated weights for policy 0, policy_version 954904 (0.0043) [2024-06-25 19:52:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 15645163520. Throughput: 0: 42589.9. Samples: 15645347940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 19:52:35,939][15401] Updated weights for policy 0, policy_version 954914 (0.0039) [2024-06-25 19:52:38,390][15132] Fps is (10 sec: 39346.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15645376512. Throughput: 0: 42207.0. Samples: 15645463200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:38,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 19:52:40,810][15401] Updated weights for policy 0, policy_version 954924 (0.0031) [2024-06-25 19:52:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15645605888. Throughput: 0: 42193.7. Samples: 15645714340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 19:52:44,190][15401] Updated weights for policy 0, policy_version 954934 (0.0043) [2024-06-25 19:52:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 15645786112. Throughput: 0: 42345.9. Samples: 15645975020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:48,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 19:52:48,609][15401] Updated weights for policy 0, policy_version 954944 (0.0034) [2024-06-25 19:52:51,982][15401] Updated weights for policy 0, policy_version 954954 (0.0032) [2024-06-25 19:52:53,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 15645999104. Throughput: 0: 41882.1. Samples: 15646088800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:53,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 19:52:56,330][15401] Updated weights for policy 0, policy_version 954964 (0.0033) [2024-06-25 19:52:58,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15646244864. Throughput: 0: 42004.5. Samples: 15646345620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:52:58,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 19:52:59,800][15401] Updated weights for policy 0, policy_version 954974 (0.0029) [2024-06-25 19:53:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 42653.9). Total num frames: 15646408704. Throughput: 0: 42292.9. Samples: 15646607320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:53:03,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:53:04,045][15401] Updated weights for policy 0, policy_version 954984 (0.0034) [2024-06-25 19:53:07,620][15401] Updated weights for policy 0, policy_version 954994 (0.0042) [2024-06-25 19:53:08,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 15646638080. Throughput: 0: 42105.7. Samples: 15646725620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:53:08,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 19:53:11,748][15401] Updated weights for policy 0, policy_version 955004 (0.0040) [2024-06-25 19:53:13,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15646867456. Throughput: 0: 42143.3. Samples: 15646976480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 19:53:13,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 19:53:15,241][15401] Updated weights for policy 0, policy_version 955014 (0.0041) [2024-06-25 19:53:18,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 15647064064. Throughput: 0: 41910.5. Samples: 15647233920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:18,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 19:53:19,454][15401] Updated weights for policy 0, policy_version 955024 (0.0030) [2024-06-25 19:53:22,876][15401] Updated weights for policy 0, policy_version 955034 (0.0032) [2024-06-25 19:53:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15647277056. Throughput: 0: 42090.7. Samples: 15647357280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:23,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 19:53:27,295][15401] Updated weights for policy 0, policy_version 955044 (0.0027) [2024-06-25 19:53:28,396][15132] Fps is (10 sec: 44208.9, 60 sec: 42052.2, 300 sec: 42597.5). Total num frames: 15647506432. Throughput: 0: 42237.6. Samples: 15647615300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:28,396][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 19:53:30,661][15401] Updated weights for policy 0, policy_version 955054 (0.0038) [2024-06-25 19:53:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15647703040. Throughput: 0: 42164.0. Samples: 15647872400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:33,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 19:53:34,896][15349] Signal inference workers to stop experience collection... (231500 times) [2024-06-25 19:53:34,943][15401] InferenceWorker_p0-w0: stopping experience collection (231500 times) [2024-06-25 19:53:35,013][15349] Signal inference workers to resume experience collection... (231500 times) [2024-06-25 19:53:35,014][15401] InferenceWorker_p0-w0: resuming experience collection (231500 times) [2024-06-25 19:53:35,015][15401] Updated weights for policy 0, policy_version 955064 (0.0035) [2024-06-25 19:53:38,218][15401] Updated weights for policy 0, policy_version 955074 (0.0028) [2024-06-25 19:53:38,394][15132] Fps is (10 sec: 42604.3, 60 sec: 42594.9, 300 sec: 42708.8). Total num frames: 15647932416. Throughput: 0: 42326.5. Samples: 15647993700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:38,395][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 19:53:42,811][15401] Updated weights for policy 0, policy_version 955084 (0.0038) [2024-06-25 19:53:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15648145408. Throughput: 0: 42427.1. Samples: 15648254840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 19:53:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955087_15648145408.pth... [2024-06-25 19:53:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954463_15637921792.pth [2024-06-25 19:53:45,915][15401] Updated weights for policy 0, policy_version 955094 (0.0035) [2024-06-25 19:53:48,389][15132] Fps is (10 sec: 37702.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15648309248. Throughput: 0: 42392.9. Samples: 15648515000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:48,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 19:53:50,331][15401] Updated weights for policy 0, policy_version 955104 (0.0029) [2024-06-25 19:53:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15648555008. Throughput: 0: 42300.0. Samples: 15648629120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 19:53:53,963][15401] Updated weights for policy 0, policy_version 955114 (0.0036) [2024-06-25 19:53:57,930][15401] Updated weights for policy 0, policy_version 955124 (0.0030) [2024-06-25 19:53:58,396][15132] Fps is (10 sec: 47483.2, 60 sec: 42320.8, 300 sec: 42597.5). Total num frames: 15648784384. Throughput: 0: 42483.3. Samples: 15648888500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:53:58,396][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 19:54:01,750][15401] Updated weights for policy 0, policy_version 955134 (0.0037) [2024-06-25 19:54:03,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15648964608. Throughput: 0: 42526.7. Samples: 15649147720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:03,392][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 19:54:05,750][15401] Updated weights for policy 0, policy_version 955144 (0.0033) [2024-06-25 19:54:08,392][15132] Fps is (10 sec: 40976.1, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15649193984. Throughput: 0: 42528.8. Samples: 15649271180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:08,393][15132] Avg episode reward: [(0, '0.841')] [2024-06-25 19:54:09,699][15401] Updated weights for policy 0, policy_version 955154 (0.0037) [2024-06-25 19:54:13,368][15401] Updated weights for policy 0, policy_version 955164 (0.0024) [2024-06-25 19:54:13,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15649406976. Throughput: 0: 42387.8. Samples: 15649522480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:13,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 19:54:17,789][15401] Updated weights for policy 0, policy_version 955174 (0.0031) [2024-06-25 19:54:18,392][15132] Fps is (10 sec: 39321.8, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 15649587200. Throughput: 0: 42381.7. Samples: 15649779680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:18,392][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 19:54:21,073][15401] Updated weights for policy 0, policy_version 955184 (0.0032) [2024-06-25 19:54:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15649832960. Throughput: 0: 42367.3. Samples: 15649900020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 19:54:25,523][15401] Updated weights for policy 0, policy_version 955194 (0.0027) [2024-06-25 19:54:28,390][15132] Fps is (10 sec: 44247.0, 60 sec: 42056.7, 300 sec: 42542.8). Total num frames: 15650029568. Throughput: 0: 42227.9. Samples: 15650155100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:28,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 19:54:28,847][15401] Updated weights for policy 0, policy_version 955204 (0.0029) [2024-06-25 19:54:32,878][15401] Updated weights for policy 0, policy_version 955214 (0.0038) [2024-06-25 19:54:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 15650242560. Throughput: 0: 42285.3. Samples: 15650417840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:33,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 19:54:36,239][15401] Updated weights for policy 0, policy_version 955224 (0.0037) [2024-06-25 19:54:38,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42601.9, 300 sec: 42598.4). Total num frames: 15650488320. Throughput: 0: 42599.2. Samples: 15650546080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:38,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 19:54:40,564][15401] Updated weights for policy 0, policy_version 955234 (0.0029) [2024-06-25 19:54:43,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15650684928. Throughput: 0: 42641.2. Samples: 15650807080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:43,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 19:54:43,813][15401] Updated weights for policy 0, policy_version 955244 (0.0047) [2024-06-25 19:54:48,278][15401] Updated weights for policy 0, policy_version 955254 (0.0032) [2024-06-25 19:54:48,392][15132] Fps is (10 sec: 39312.1, 60 sec: 42869.7, 300 sec: 42487.0). Total num frames: 15650881536. Throughput: 0: 42559.5. Samples: 15651062900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:48,393][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 19:54:51,354][15401] Updated weights for policy 0, policy_version 955264 (0.0035) [2024-06-25 19:54:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 15651127296. Throughput: 0: 42538.3. Samples: 15651185300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:53,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 19:54:55,915][15401] Updated weights for policy 0, policy_version 955274 (0.0037) [2024-06-25 19:54:57,734][15349] Signal inference workers to stop experience collection... (231550 times) [2024-06-25 19:54:57,735][15349] Signal inference workers to resume experience collection... (231550 times) [2024-06-25 19:54:57,766][15401] InferenceWorker_p0-w0: stopping experience collection (231550 times) [2024-06-25 19:54:57,767][15401] InferenceWorker_p0-w0: resuming experience collection (231550 times) [2024-06-25 19:54:58,390][15132] Fps is (10 sec: 44246.6, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 15651323904. Throughput: 0: 42857.2. Samples: 15651451060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:54:58,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 19:54:58,909][15401] Updated weights for policy 0, policy_version 955284 (0.0023) [2024-06-25 19:55:03,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 15651520512. Throughput: 0: 42784.9. Samples: 15651704900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:55:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 19:55:03,680][15401] Updated weights for policy 0, policy_version 955294 (0.0024) [2024-06-25 19:55:06,483][15401] Updated weights for policy 0, policy_version 955304 (0.0035) [2024-06-25 19:55:08,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 15651766272. Throughput: 0: 42882.6. Samples: 15651829740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-25 19:55:08,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 19:55:11,259][15401] Updated weights for policy 0, policy_version 955314 (0.0047) [2024-06-25 19:55:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 15651962880. Throughput: 0: 43072.9. Samples: 15652093380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:13,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 19:55:14,176][15401] Updated weights for policy 0, policy_version 955324 (0.0033) [2024-06-25 19:55:18,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43146.2, 300 sec: 42487.3). Total num frames: 15652175872. Throughput: 0: 42735.9. Samples: 15652340960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:18,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 19:55:18,913][15401] Updated weights for policy 0, policy_version 955334 (0.0041) [2024-06-25 19:55:21,896][15401] Updated weights for policy 0, policy_version 955344 (0.0031) [2024-06-25 19:55:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15652405248. Throughput: 0: 42791.2. Samples: 15652471680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 19:55:26,929][15401] Updated weights for policy 0, policy_version 955354 (0.0025) [2024-06-25 19:55:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15652585472. Throughput: 0: 42696.4. Samples: 15652728420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:28,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 19:55:29,900][15401] Updated weights for policy 0, policy_version 955364 (0.0031) [2024-06-25 19:55:33,392][15132] Fps is (10 sec: 42587.5, 60 sec: 43142.7, 300 sec: 42487.0). Total num frames: 15652831232. Throughput: 0: 42461.3. Samples: 15652973660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:33,393][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 19:55:34,470][15401] Updated weights for policy 0, policy_version 955374 (0.0038) [2024-06-25 19:55:37,393][15401] Updated weights for policy 0, policy_version 955384 (0.0028) [2024-06-25 19:55:38,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42323.6, 300 sec: 42542.5). Total num frames: 15653027840. Throughput: 0: 42649.2. Samples: 15653104620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:38,393][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 19:55:42,371][15401] Updated weights for policy 0, policy_version 955394 (0.0021) [2024-06-25 19:55:43,389][15132] Fps is (10 sec: 37692.9, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 15653208064. Throughput: 0: 42509.6. Samples: 15653363980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:43,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 19:55:43,434][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955397_15653224448.pth... [2024-06-25 19:55:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000954776_15643049984.pth [2024-06-25 19:55:45,169][15401] Updated weights for policy 0, policy_version 955404 (0.0043) [2024-06-25 19:55:48,390][15132] Fps is (10 sec: 44247.4, 60 sec: 43146.3, 300 sec: 42487.3). Total num frames: 15653470208. Throughput: 0: 42312.0. Samples: 15653608940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:48,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 19:55:49,997][15401] Updated weights for policy 0, policy_version 955414 (0.0030) [2024-06-25 19:55:53,018][15401] Updated weights for policy 0, policy_version 955424 (0.0025) [2024-06-25 19:55:53,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15653683200. Throughput: 0: 42487.2. Samples: 15653741660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:53,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 19:55:57,511][15401] Updated weights for policy 0, policy_version 955434 (0.0039) [2024-06-25 19:55:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 15653863424. Throughput: 0: 42193.0. Samples: 15653992060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:55:58,390][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 19:56:00,690][15401] Updated weights for policy 0, policy_version 955444 (0.0041) [2024-06-25 19:56:03,393][15132] Fps is (10 sec: 39307.3, 60 sec: 42595.8, 300 sec: 42375.7). Total num frames: 15654076416. Throughput: 0: 42367.8. Samples: 15654247660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:03,394][15132] Avg episode reward: [(0, '0.833')] [2024-06-25 19:56:05,174][15401] Updated weights for policy 0, policy_version 955454 (0.0038) [2024-06-25 19:56:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 15654305792. Throughput: 0: 42355.1. Samples: 15654377660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:08,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 19:56:08,427][15401] Updated weights for policy 0, policy_version 955464 (0.0035) [2024-06-25 19:56:12,732][15401] Updated weights for policy 0, policy_version 955474 (0.0049) [2024-06-25 19:56:13,389][15132] Fps is (10 sec: 42614.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 15654502400. Throughput: 0: 42155.6. Samples: 15654625420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:13,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 19:56:16,657][15401] Updated weights for policy 0, policy_version 955484 (0.0024) [2024-06-25 19:56:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 15654715392. Throughput: 0: 42409.9. Samples: 15654882000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:18,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 19:56:20,486][15401] Updated weights for policy 0, policy_version 955494 (0.0028) [2024-06-25 19:56:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15654928384. Throughput: 0: 42338.3. Samples: 15655009740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:23,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 19:56:24,174][15401] Updated weights for policy 0, policy_version 955504 (0.0033) [2024-06-25 19:56:28,058][15401] Updated weights for policy 0, policy_version 955514 (0.0031) [2024-06-25 19:56:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 15655157760. Throughput: 0: 42268.0. Samples: 15655266040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:28,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 19:56:31,387][15349] Signal inference workers to stop experience collection... (231600 times) [2024-06-25 19:56:31,435][15349] Signal inference workers to resume experience collection... (231600 times) [2024-06-25 19:56:31,437][15401] InferenceWorker_p0-w0: stopping experience collection (231600 times) [2024-06-25 19:56:31,459][15401] InferenceWorker_p0-w0: resuming experience collection (231600 times) [2024-06-25 19:56:31,828][15401] Updated weights for policy 0, policy_version 955524 (0.0049) [2024-06-25 19:56:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42327.1, 300 sec: 42487.3). Total num frames: 15655370752. Throughput: 0: 42394.7. Samples: 15655516700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 19:56:35,585][15401] Updated weights for policy 0, policy_version 955534 (0.0034) [2024-06-25 19:56:38,389][15132] Fps is (10 sec: 37683.5, 60 sec: 41781.0, 300 sec: 42320.7). Total num frames: 15655534592. Throughput: 0: 42274.8. Samples: 15655644020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:38,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 19:56:39,381][15401] Updated weights for policy 0, policy_version 955544 (0.0039) [2024-06-25 19:56:42,985][15401] Updated weights for policy 0, policy_version 955554 (0.0036) [2024-06-25 19:56:43,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 15655813120. Throughput: 0: 42507.2. Samples: 15655904880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:43,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 19:56:47,299][15401] Updated weights for policy 0, policy_version 955564 (0.0028) [2024-06-25 19:56:48,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15655993344. Throughput: 0: 42327.0. Samples: 15656152220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:48,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 19:56:51,131][15401] Updated weights for policy 0, policy_version 955574 (0.0031) [2024-06-25 19:56:53,389][15132] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 15656189952. Throughput: 0: 42355.9. Samples: 15656283680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:53,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 19:56:55,004][15401] Updated weights for policy 0, policy_version 955584 (0.0034) [2024-06-25 19:56:58,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 15656435712. Throughput: 0: 42531.0. Samples: 15656539320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:56:58,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 19:56:58,907][15401] Updated weights for policy 0, policy_version 955594 (0.0042) [2024-06-25 19:57:02,682][15401] Updated weights for policy 0, policy_version 955604 (0.0053) [2024-06-25 19:57:03,392][15132] Fps is (10 sec: 45864.0, 60 sec: 42872.3, 300 sec: 42542.5). Total num frames: 15656648704. Throughput: 0: 42426.1. Samples: 15656791280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:57:03,392][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 19:57:06,468][15401] Updated weights for policy 0, policy_version 955614 (0.0040) [2024-06-25 19:57:08,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 15656812544. Throughput: 0: 42633.9. Samples: 15656928260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 19:57:08,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 19:57:10,167][15401] Updated weights for policy 0, policy_version 955624 (0.0029) [2024-06-25 19:57:13,390][15132] Fps is (10 sec: 40969.7, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 15657058304. Throughput: 0: 42600.8. Samples: 15657183080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:13,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 19:57:14,108][15401] Updated weights for policy 0, policy_version 955634 (0.0036) [2024-06-25 19:57:18,133][15401] Updated weights for policy 0, policy_version 955644 (0.0035) [2024-06-25 19:57:18,392][15132] Fps is (10 sec: 47499.5, 60 sec: 42869.4, 300 sec: 42598.0). Total num frames: 15657287680. Throughput: 0: 42716.4. Samples: 15657439060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:18,393][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 19:57:21,905][15401] Updated weights for policy 0, policy_version 955654 (0.0050) [2024-06-25 19:57:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42321.6). Total num frames: 15657467904. Throughput: 0: 42728.3. Samples: 15657566800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:23,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 19:57:25,733][15401] Updated weights for policy 0, policy_version 955664 (0.0033) [2024-06-25 19:57:28,390][15132] Fps is (10 sec: 40971.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15657697280. Throughput: 0: 42549.7. Samples: 15657819620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:28,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 19:57:29,624][15401] Updated weights for policy 0, policy_version 955674 (0.0043) [2024-06-25 19:57:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15657910272. Throughput: 0: 42786.6. Samples: 15658077620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:33,390][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 19:57:33,667][15401] Updated weights for policy 0, policy_version 955684 (0.0033) [2024-06-25 19:57:37,276][15401] Updated weights for policy 0, policy_version 955694 (0.0035) [2024-06-25 19:57:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 15658123264. Throughput: 0: 42701.3. Samples: 15658205240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:38,390][15132] Avg episode reward: [(0, '0.843')] [2024-06-25 19:57:41,197][15401] Updated weights for policy 0, policy_version 955704 (0.0032) [2024-06-25 19:57:43,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15658336256. Throughput: 0: 42798.8. Samples: 15658465260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:43,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 19:57:43,501][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955710_15658352640.pth... [2024-06-25 19:57:43,553][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955087_15648145408.pth [2024-06-25 19:57:44,831][15401] Updated weights for policy 0, policy_version 955714 (0.0034) [2024-06-25 19:57:48,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.6, 300 sec: 42542.5). Total num frames: 15658549248. Throughput: 0: 42902.7. Samples: 15658721900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:48,392][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 19:57:48,745][15401] Updated weights for policy 0, policy_version 955724 (0.0039) [2024-06-25 19:57:49,790][15349] Signal inference workers to stop experience collection... (231650 times) [2024-06-25 19:57:49,790][15349] Signal inference workers to resume experience collection... (231650 times) [2024-06-25 19:57:49,815][15401] InferenceWorker_p0-w0: stopping experience collection (231650 times) [2024-06-25 19:57:49,816][15401] InferenceWorker_p0-w0: resuming experience collection (231650 times) [2024-06-25 19:57:52,884][15401] Updated weights for policy 0, policy_version 955734 (0.0043) [2024-06-25 19:57:53,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 15658778624. Throughput: 0: 42719.8. Samples: 15658850660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:53,390][15132] Avg episode reward: [(0, '0.414')] [2024-06-25 19:57:56,562][15401] Updated weights for policy 0, policy_version 955744 (0.0037) [2024-06-25 19:57:58,390][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15658975232. Throughput: 0: 42706.7. Samples: 15659104880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:57:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 19:58:00,479][15401] Updated weights for policy 0, policy_version 955754 (0.0036) [2024-06-25 19:58:03,391][15132] Fps is (10 sec: 42593.3, 60 sec: 42599.2, 300 sec: 42598.2). Total num frames: 15659204608. Throughput: 0: 42770.4. Samples: 15659363660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:03,391][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 19:58:04,244][15401] Updated weights for policy 0, policy_version 955764 (0.0035) [2024-06-25 19:58:08,014][15401] Updated weights for policy 0, policy_version 955774 (0.0036) [2024-06-25 19:58:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42542.9). Total num frames: 15659417600. Throughput: 0: 42865.3. Samples: 15659495740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 19:58:11,849][15401] Updated weights for policy 0, policy_version 955784 (0.0027) [2024-06-25 19:58:13,390][15132] Fps is (10 sec: 42603.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15659630592. Throughput: 0: 42948.8. Samples: 15659752320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:13,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 19:58:15,562][15401] Updated weights for policy 0, policy_version 955794 (0.0041) [2024-06-25 19:58:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42327.4, 300 sec: 42542.9). Total num frames: 15659827200. Throughput: 0: 42935.6. Samples: 15660009720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 19:58:19,649][15401] Updated weights for policy 0, policy_version 955804 (0.0035) [2024-06-25 19:58:23,047][15401] Updated weights for policy 0, policy_version 955814 (0.0025) [2024-06-25 19:58:23,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42599.3). Total num frames: 15660072960. Throughput: 0: 43014.6. Samples: 15660140900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:23,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 19:58:27,288][15401] Updated weights for policy 0, policy_version 955824 (0.0036) [2024-06-25 19:58:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15660285952. Throughput: 0: 42955.0. Samples: 15660398240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:28,390][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 19:58:30,680][15401] Updated weights for policy 0, policy_version 955834 (0.0031) [2024-06-25 19:58:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42432.5). Total num frames: 15660449792. Throughput: 0: 43030.8. Samples: 15660658180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:33,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 19:58:34,769][15401] Updated weights for policy 0, policy_version 955844 (0.0034) [2024-06-25 19:58:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15660695552. Throughput: 0: 42856.0. Samples: 15660779180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:38,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 19:58:38,475][15401] Updated weights for policy 0, policy_version 955854 (0.0033) [2024-06-25 19:58:42,494][15401] Updated weights for policy 0, policy_version 955864 (0.0027) [2024-06-25 19:58:43,391][15132] Fps is (10 sec: 45870.0, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 15660908544. Throughput: 0: 42879.0. Samples: 15661034480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:43,391][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 19:58:45,919][15401] Updated weights for policy 0, policy_version 955874 (0.0024) [2024-06-25 19:58:48,394][15132] Fps is (10 sec: 40941.0, 60 sec: 42596.8, 300 sec: 42542.2). Total num frames: 15661105152. Throughput: 0: 42849.7. Samples: 15661292040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:48,395][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 19:58:50,171][15401] Updated weights for policy 0, policy_version 955884 (0.0043) [2024-06-25 19:58:53,389][15132] Fps is (10 sec: 44241.8, 60 sec: 42871.6, 300 sec: 42599.3). Total num frames: 15661350912. Throughput: 0: 42738.3. Samples: 15661418960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:53,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 19:58:53,546][15401] Updated weights for policy 0, policy_version 955894 (0.0029) [2024-06-25 19:58:57,907][15401] Updated weights for policy 0, policy_version 955904 (0.0034) [2024-06-25 19:58:58,389][15132] Fps is (10 sec: 42618.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 15661531136. Throughput: 0: 42767.2. Samples: 15661676840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:58:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 19:59:01,390][15401] Updated weights for policy 0, policy_version 955914 (0.0031) [2024-06-25 19:59:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42599.4, 300 sec: 42598.8). Total num frames: 15661760512. Throughput: 0: 42714.7. Samples: 15661931880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 19:59:03,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 19:59:05,845][15401] Updated weights for policy 0, policy_version 955924 (0.0026) [2024-06-25 19:59:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15661973504. Throughput: 0: 42736.4. Samples: 15662064040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:08,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 19:59:08,910][15401] Updated weights for policy 0, policy_version 955934 (0.0032) [2024-06-25 19:59:13,287][15401] Updated weights for policy 0, policy_version 955944 (0.0039) [2024-06-25 19:59:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15662186496. Throughput: 0: 42764.9. Samples: 15662322660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:13,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 19:59:16,728][15401] Updated weights for policy 0, policy_version 955954 (0.0055) [2024-06-25 19:59:18,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15662415872. Throughput: 0: 42451.5. Samples: 15662568500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 19:59:21,456][15401] Updated weights for policy 0, policy_version 955964 (0.0041) [2024-06-25 19:59:22,088][15349] Signal inference workers to stop experience collection... (231700 times) [2024-06-25 19:59:22,088][15349] Signal inference workers to resume experience collection... (231700 times) [2024-06-25 19:59:22,107][15401] InferenceWorker_p0-w0: stopping experience collection (231700 times) [2024-06-25 19:59:22,107][15401] InferenceWorker_p0-w0: resuming experience collection (231700 times) [2024-06-25 19:59:23,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15662645248. Throughput: 0: 42902.3. Samples: 15662709780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:23,390][15132] Avg episode reward: [(0, '0.796')] [2024-06-25 19:59:24,181][15401] Updated weights for policy 0, policy_version 955974 (0.0032) [2024-06-25 19:59:28,390][15132] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 15662792704. Throughput: 0: 42756.9. Samples: 15662958500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:28,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 19:59:28,954][15401] Updated weights for policy 0, policy_version 955984 (0.0027) [2024-06-25 19:59:31,911][15401] Updated weights for policy 0, policy_version 955994 (0.0032) [2024-06-25 19:59:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 15663054848. Throughput: 0: 42696.5. Samples: 15663213180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:33,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 19:59:36,557][15401] Updated weights for policy 0, policy_version 956004 (0.0039) [2024-06-25 19:59:38,389][15132] Fps is (10 sec: 47514.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15663267840. Throughput: 0: 42949.9. Samples: 15663351700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:38,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 19:59:39,475][15401] Updated weights for policy 0, policy_version 956014 (0.0038) [2024-06-25 19:59:43,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42326.1, 300 sec: 42598.7). Total num frames: 15663448064. Throughput: 0: 42720.8. Samples: 15663599280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 19:59:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956021_15663448064.pth... [2024-06-25 19:59:43,462][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955397_15653224448.pth [2024-06-25 19:59:44,161][15401] Updated weights for policy 0, policy_version 956024 (0.0033) [2024-06-25 19:59:47,118][15401] Updated weights for policy 0, policy_version 956034 (0.0033) [2024-06-25 19:59:48,389][15132] Fps is (10 sec: 42597.9, 60 sec: 43147.9, 300 sec: 42598.4). Total num frames: 15663693824. Throughput: 0: 42649.3. Samples: 15663851100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 19:59:51,831][15401] Updated weights for policy 0, policy_version 956044 (0.0034) [2024-06-25 19:59:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15663906816. Throughput: 0: 42716.9. Samples: 15663986300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:53,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 19:59:54,866][15401] Updated weights for policy 0, policy_version 956054 (0.0037) [2024-06-25 19:59:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15664103424. Throughput: 0: 42406.7. Samples: 15664230960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 19:59:58,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 19:59:59,433][15401] Updated weights for policy 0, policy_version 956064 (0.0048) [2024-06-25 20:00:02,741][15401] Updated weights for policy 0, policy_version 956074 (0.0038) [2024-06-25 20:00:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15664332800. Throughput: 0: 42447.5. Samples: 15664478640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:03,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 20:00:07,575][15401] Updated weights for policy 0, policy_version 956084 (0.0036) [2024-06-25 20:00:08,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15664529408. Throughput: 0: 42319.1. Samples: 15664614140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:08,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 20:00:10,411][15401] Updated weights for policy 0, policy_version 956094 (0.0036) [2024-06-25 20:00:13,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15664742400. Throughput: 0: 42365.0. Samples: 15664864920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:13,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 20:00:15,241][15401] Updated weights for policy 0, policy_version 956104 (0.0044) [2024-06-25 20:00:17,979][15401] Updated weights for policy 0, policy_version 956114 (0.0035) [2024-06-25 20:00:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15664971776. Throughput: 0: 42261.8. Samples: 15665114960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:18,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 20:00:22,854][15401] Updated weights for policy 0, policy_version 956124 (0.0033) [2024-06-25 20:00:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15665168384. Throughput: 0: 42171.4. Samples: 15665249420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 20:00:25,634][15401] Updated weights for policy 0, policy_version 956134 (0.0025) [2024-06-25 20:00:28,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 15665364992. Throughput: 0: 42170.7. Samples: 15665496960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:28,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 20:00:30,520][15401] Updated weights for policy 0, policy_version 956144 (0.0033) [2024-06-25 20:00:33,282][15401] Updated weights for policy 0, policy_version 956154 (0.0030) [2024-06-25 20:00:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 15665627136. Throughput: 0: 42241.8. Samples: 15665751980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 20:00:37,520][15349] Signal inference workers to stop experience collection... (231750 times) [2024-06-25 20:00:37,520][15349] Signal inference workers to resume experience collection... (231750 times) [2024-06-25 20:00:37,535][15401] InferenceWorker_p0-w0: stopping experience collection (231750 times) [2024-06-25 20:00:37,535][15401] InferenceWorker_p0-w0: resuming experience collection (231750 times) [2024-06-25 20:00:38,069][15401] Updated weights for policy 0, policy_version 956164 (0.0040) [2024-06-25 20:00:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15665790976. Throughput: 0: 42185.0. Samples: 15665884620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:38,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 20:00:40,739][15401] Updated weights for policy 0, policy_version 956174 (0.0036) [2024-06-25 20:00:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15666020352. Throughput: 0: 42481.8. Samples: 15666142640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:43,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 20:00:45,952][15401] Updated weights for policy 0, policy_version 956184 (0.0041) [2024-06-25 20:00:48,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15666249728. Throughput: 0: 42596.1. Samples: 15666395460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:48,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 20:00:48,751][15401] Updated weights for policy 0, policy_version 956194 (0.0036) [2024-06-25 20:00:53,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15666429952. Throughput: 0: 42554.9. Samples: 15666529120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:00:53,542][15401] Updated weights for policy 0, policy_version 956204 (0.0027) [2024-06-25 20:00:56,655][15401] Updated weights for policy 0, policy_version 956214 (0.0032) [2024-06-25 20:00:58,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 15666659328. Throughput: 0: 42612.4. Samples: 15666782480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:00:58,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 20:01:01,106][15401] Updated weights for policy 0, policy_version 956224 (0.0036) [2024-06-25 20:01:03,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15666888704. Throughput: 0: 42689.4. Samples: 15667035980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-25 20:01:03,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 20:01:04,312][15401] Updated weights for policy 0, policy_version 956234 (0.0026) [2024-06-25 20:01:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15667068928. Throughput: 0: 42674.6. Samples: 15667169780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:08,390][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 20:01:08,865][15401] Updated weights for policy 0, policy_version 956244 (0.0034) [2024-06-25 20:01:11,965][15401] Updated weights for policy 0, policy_version 956254 (0.0040) [2024-06-25 20:01:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15667298304. Throughput: 0: 42788.5. Samples: 15667422440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:13,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 20:01:16,574][15401] Updated weights for policy 0, policy_version 956264 (0.0028) [2024-06-25 20:01:18,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15667527680. Throughput: 0: 42793.3. Samples: 15667677680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:18,390][15132] Avg episode reward: [(0, '0.314')] [2024-06-25 20:01:19,773][15401] Updated weights for policy 0, policy_version 956274 (0.0034) [2024-06-25 20:01:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15667707904. Throughput: 0: 42693.7. Samples: 15667805840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 20:01:24,124][15401] Updated weights for policy 0, policy_version 956284 (0.0032) [2024-06-25 20:01:27,317][15401] Updated weights for policy 0, policy_version 956294 (0.0028) [2024-06-25 20:01:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15667953664. Throughput: 0: 42646.2. Samples: 15668061720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 20:01:31,935][15401] Updated weights for policy 0, policy_version 956304 (0.0045) [2024-06-25 20:01:33,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15668150272. Throughput: 0: 42821.8. Samples: 15668322440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:33,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 20:01:34,763][15401] Updated weights for policy 0, policy_version 956314 (0.0030) [2024-06-25 20:01:38,393][15132] Fps is (10 sec: 40945.4, 60 sec: 42868.8, 300 sec: 42542.3). Total num frames: 15668363264. Throughput: 0: 42579.8. Samples: 15668445360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:38,394][15132] Avg episode reward: [(0, '0.238')] [2024-06-25 20:01:39,394][15401] Updated weights for policy 0, policy_version 956324 (0.0033) [2024-06-25 20:01:42,316][15401] Updated weights for policy 0, policy_version 956334 (0.0034) [2024-06-25 20:01:43,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15668609024. Throughput: 0: 42568.4. Samples: 15668698160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:43,393][15132] Avg episode reward: [(0, '0.345')] [2024-06-25 20:01:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956336_15668609024.pth... [2024-06-25 20:01:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000955710_15658352640.pth [2024-06-25 20:01:46,983][15401] Updated weights for policy 0, policy_version 956344 (0.0034) [2024-06-25 20:01:48,390][15132] Fps is (10 sec: 40974.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15668772864. Throughput: 0: 42863.0. Samples: 15668964820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:48,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 20:01:50,235][15401] Updated weights for policy 0, policy_version 956354 (0.0037) [2024-06-25 20:01:53,392][15132] Fps is (10 sec: 39321.8, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 15669002240. Throughput: 0: 42432.9. Samples: 15669079360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:53,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 20:01:55,438][15401] Updated weights for policy 0, policy_version 956364 (0.0042) [2024-06-25 20:01:57,795][15401] Updated weights for policy 0, policy_version 956374 (0.0038) [2024-06-25 20:01:58,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 15669248000. Throughput: 0: 42522.5. Samples: 15669335960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:01:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 20:02:01,697][15349] Signal inference workers to stop experience collection... (231800 times) [2024-06-25 20:02:01,698][15349] Signal inference workers to resume experience collection... (231800 times) [2024-06-25 20:02:01,715][15401] InferenceWorker_p0-w0: stopping experience collection (231800 times) [2024-06-25 20:02:01,715][15401] InferenceWorker_p0-w0: resuming experience collection (231800 times) [2024-06-25 20:02:02,965][15401] Updated weights for policy 0, policy_version 956384 (0.0038) [2024-06-25 20:02:03,389][15132] Fps is (10 sec: 39331.6, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 15669395456. Throughput: 0: 42577.8. Samples: 15669593680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 20:02:05,857][15401] Updated weights for policy 0, policy_version 956394 (0.0034) [2024-06-25 20:02:08,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15669624832. Throughput: 0: 42317.0. Samples: 15669710100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 20:02:10,971][15401] Updated weights for policy 0, policy_version 956404 (0.0027) [2024-06-25 20:02:13,390][15132] Fps is (10 sec: 47512.7, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15669870592. Throughput: 0: 42555.6. Samples: 15669976720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:13,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 20:02:13,630][15401] Updated weights for policy 0, policy_version 956414 (0.0027) [2024-06-25 20:02:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 15670034432. Throughput: 0: 42366.2. Samples: 15670228920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 20:02:18,568][15401] Updated weights for policy 0, policy_version 956424 (0.0045) [2024-06-25 20:02:21,324][15401] Updated weights for policy 0, policy_version 956434 (0.0027) [2024-06-25 20:02:23,394][15132] Fps is (10 sec: 40940.5, 60 sec: 42868.0, 300 sec: 42653.2). Total num frames: 15670280192. Throughput: 0: 42307.3. Samples: 15670349240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:23,395][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 20:02:26,221][15401] Updated weights for policy 0, policy_version 956444 (0.0037) [2024-06-25 20:02:28,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15670476800. Throughput: 0: 42466.6. Samples: 15670609060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:28,390][15132] Avg episode reward: [(0, '0.526')] [2024-06-25 20:02:29,202][15401] Updated weights for policy 0, policy_version 956454 (0.0034) [2024-06-25 20:02:33,389][15132] Fps is (10 sec: 39340.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15670673408. Throughput: 0: 42278.8. Samples: 15670867360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:33,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 20:02:33,905][15401] Updated weights for policy 0, policy_version 956464 (0.0035) [2024-06-25 20:02:36,964][15401] Updated weights for policy 0, policy_version 956474 (0.0040) [2024-06-25 20:02:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 15670935552. Throughput: 0: 42446.7. Samples: 15670989360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:38,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 20:02:41,513][15401] Updated weights for policy 0, policy_version 956484 (0.0027) [2024-06-25 20:02:43,389][15132] Fps is (10 sec: 42598.3, 60 sec: 41507.9, 300 sec: 42543.2). Total num frames: 15671099392. Throughput: 0: 42428.1. Samples: 15671245220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:43,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 20:02:44,864][15401] Updated weights for policy 0, policy_version 956494 (0.0031) [2024-06-25 20:02:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15671312384. Throughput: 0: 42296.7. Samples: 15671497040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:48,390][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 20:02:49,115][15401] Updated weights for policy 0, policy_version 956504 (0.0041) [2024-06-25 20:02:52,700][15401] Updated weights for policy 0, policy_version 956514 (0.0042) [2024-06-25 20:02:53,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42600.2, 300 sec: 42653.9). Total num frames: 15671558144. Throughput: 0: 42541.8. Samples: 15671624480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:53,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 20:02:57,029][15401] Updated weights for policy 0, policy_version 956524 (0.0040) [2024-06-25 20:02:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42487.5). Total num frames: 15671738368. Throughput: 0: 42178.3. Samples: 15671874740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-25 20:02:58,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 20:03:00,575][15401] Updated weights for policy 0, policy_version 956534 (0.0036) [2024-06-25 20:03:03,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15671951360. Throughput: 0: 42192.9. Samples: 15672127600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:03,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 20:03:05,014][15401] Updated weights for policy 0, policy_version 956544 (0.0036) [2024-06-25 20:03:08,111][15401] Updated weights for policy 0, policy_version 956554 (0.0035) [2024-06-25 20:03:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15672197120. Throughput: 0: 42456.6. Samples: 15672259580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:08,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 20:03:12,636][15401] Updated weights for policy 0, policy_version 956564 (0.0035) [2024-06-25 20:03:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 15672377344. Throughput: 0: 42343.2. Samples: 15672514500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 20:03:15,955][15401] Updated weights for policy 0, policy_version 956574 (0.0040) [2024-06-25 20:03:16,907][15349] Signal inference workers to stop experience collection... (231850 times) [2024-06-25 20:03:16,908][15349] Signal inference workers to resume experience collection... (231850 times) [2024-06-25 20:03:16,922][15401] InferenceWorker_p0-w0: stopping experience collection (231850 times) [2024-06-25 20:03:16,958][15401] InferenceWorker_p0-w0: resuming experience collection (231850 times) [2024-06-25 20:03:18,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 15672606720. Throughput: 0: 42041.6. Samples: 15672759240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 20:03:20,119][15401] Updated weights for policy 0, policy_version 956584 (0.0037) [2024-06-25 20:03:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42055.6, 300 sec: 42431.8). Total num frames: 15672803328. Throughput: 0: 42343.0. Samples: 15672894800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:23,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 20:03:23,636][15401] Updated weights for policy 0, policy_version 956594 (0.0036) [2024-06-25 20:03:27,648][15401] Updated weights for policy 0, policy_version 956604 (0.0023) [2024-06-25 20:03:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 15673016320. Throughput: 0: 42371.2. Samples: 15673151920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:28,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 20:03:31,199][15401] Updated weights for policy 0, policy_version 956614 (0.0053) [2024-06-25 20:03:33,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15673262080. Throughput: 0: 42314.6. Samples: 15673401200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:33,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 20:03:35,336][15401] Updated weights for policy 0, policy_version 956624 (0.0035) [2024-06-25 20:03:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42487.5). Total num frames: 15673442304. Throughput: 0: 42422.1. Samples: 15673533480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:38,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 20:03:38,996][15401] Updated weights for policy 0, policy_version 956634 (0.0027) [2024-06-25 20:03:43,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42543.5). Total num frames: 15673655296. Throughput: 0: 42540.9. Samples: 15673789080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 20:03:43,394][15401] Updated weights for policy 0, policy_version 956644 (0.0028) [2024-06-25 20:03:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956644_15673655296.pth... [2024-06-25 20:03:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956021_15663448064.pth [2024-06-25 20:03:46,727][15401] Updated weights for policy 0, policy_version 956654 (0.0029) [2024-06-25 20:03:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15673884672. Throughput: 0: 42506.3. Samples: 15674040380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:48,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 20:03:50,999][15401] Updated weights for policy 0, policy_version 956664 (0.0025) [2024-06-25 20:03:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15674081280. Throughput: 0: 42569.2. Samples: 15674175200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:53,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 20:03:54,390][15401] Updated weights for policy 0, policy_version 956674 (0.0042) [2024-06-25 20:03:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15674294272. Throughput: 0: 42594.8. Samples: 15674431260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:03:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 20:03:58,506][15401] Updated weights for policy 0, policy_version 956684 (0.0030) [2024-06-25 20:04:01,910][15401] Updated weights for policy 0, policy_version 956694 (0.0029) [2024-06-25 20:04:03,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15674507264. Throughput: 0: 42898.4. Samples: 15674689660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 20:04:06,010][15401] Updated weights for policy 0, policy_version 956704 (0.0031) [2024-06-25 20:04:08,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15674736640. Throughput: 0: 42745.8. Samples: 15674818360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 20:04:09,666][15401] Updated weights for policy 0, policy_version 956714 (0.0031) [2024-06-25 20:04:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15674949632. Throughput: 0: 42628.8. Samples: 15675070220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:13,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 20:04:13,567][15401] Updated weights for policy 0, policy_version 956724 (0.0036) [2024-06-25 20:04:17,134][15401] Updated weights for policy 0, policy_version 956734 (0.0028) [2024-06-25 20:04:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15675179008. Throughput: 0: 42799.1. Samples: 15675327160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:18,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 20:04:21,033][15401] Updated weights for policy 0, policy_version 956744 (0.0040) [2024-06-25 20:04:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15675375616. Throughput: 0: 42727.5. Samples: 15675456220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:23,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 20:04:24,651][15401] Updated weights for policy 0, policy_version 956754 (0.0034) [2024-06-25 20:04:28,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15675588608. Throughput: 0: 42774.8. Samples: 15675713940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:28,390][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 20:04:28,721][15401] Updated weights for policy 0, policy_version 956764 (0.0031) [2024-06-25 20:04:32,123][15401] Updated weights for policy 0, policy_version 956774 (0.0034) [2024-06-25 20:04:33,396][15132] Fps is (10 sec: 44208.7, 60 sec: 42593.9, 300 sec: 42541.9). Total num frames: 15675817984. Throughput: 0: 42948.5. Samples: 15675973340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:33,397][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 20:04:36,502][15401] Updated weights for policy 0, policy_version 956784 (0.0041) [2024-06-25 20:04:38,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15676014592. Throughput: 0: 42878.8. Samples: 15676104740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:38,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 20:04:39,707][15401] Updated weights for policy 0, policy_version 956794 (0.0035) [2024-06-25 20:04:43,389][15132] Fps is (10 sec: 40986.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15676227584. Throughput: 0: 42915.0. Samples: 15676362440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:43,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 20:04:44,310][15401] Updated weights for policy 0, policy_version 956804 (0.0034) [2024-06-25 20:04:47,783][15349] Signal inference workers to stop experience collection... (231900 times) [2024-06-25 20:04:47,832][15401] InferenceWorker_p0-w0: stopping experience collection (231900 times) [2024-06-25 20:04:47,844][15349] Signal inference workers to resume experience collection... (231900 times) [2024-06-25 20:04:47,849][15401] InferenceWorker_p0-w0: resuming experience collection (231900 times) [2024-06-25 20:04:47,851][15401] Updated weights for policy 0, policy_version 956814 (0.0032) [2024-06-25 20:04:48,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15676456960. Throughput: 0: 42609.2. Samples: 15676607080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:48,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 20:04:51,804][15401] Updated weights for policy 0, policy_version 956824 (0.0033) [2024-06-25 20:04:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15676653568. Throughput: 0: 42795.7. Samples: 15676744160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:53,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 20:04:55,311][15401] Updated weights for policy 0, policy_version 956834 (0.0035) [2024-06-25 20:04:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 15676866560. Throughput: 0: 43057.6. Samples: 15677007820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-25 20:04:58,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 20:04:59,667][15401] Updated weights for policy 0, policy_version 956844 (0.0040) [2024-06-25 20:05:02,802][15401] Updated weights for policy 0, policy_version 956854 (0.0026) [2024-06-25 20:05:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 15677112320. Throughput: 0: 42772.0. Samples: 15677251900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:03,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 20:05:07,490][15401] Updated weights for policy 0, policy_version 956864 (0.0040) [2024-06-25 20:05:08,390][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15677308928. Throughput: 0: 42965.0. Samples: 15677389640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:08,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-25 20:05:10,594][15401] Updated weights for policy 0, policy_version 956874 (0.0034) [2024-06-25 20:05:13,394][15132] Fps is (10 sec: 40941.3, 60 sec: 42868.2, 300 sec: 42542.2). Total num frames: 15677521920. Throughput: 0: 42875.5. Samples: 15677643540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:13,394][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 20:05:15,221][15401] Updated weights for policy 0, policy_version 956884 (0.0045) [2024-06-25 20:05:18,339][15401] Updated weights for policy 0, policy_version 956894 (0.0030) [2024-06-25 20:05:18,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15677751296. Throughput: 0: 42676.4. Samples: 15677893500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:18,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 20:05:22,868][15401] Updated weights for policy 0, policy_version 956904 (0.0039) [2024-06-25 20:05:23,390][15132] Fps is (10 sec: 42617.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15677947904. Throughput: 0: 42640.7. Samples: 15678023580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:23,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 20:05:25,906][15401] Updated weights for policy 0, policy_version 956914 (0.0029) [2024-06-25 20:05:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 15678160896. Throughput: 0: 42623.6. Samples: 15678280500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:28,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 20:05:30,392][15401] Updated weights for policy 0, policy_version 956924 (0.0045) [2024-06-25 20:05:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 15678390272. Throughput: 0: 42949.8. Samples: 15678539820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:33,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 20:05:33,544][15401] Updated weights for policy 0, policy_version 956934 (0.0026) [2024-06-25 20:05:38,119][15401] Updated weights for policy 0, policy_version 956944 (0.0027) [2024-06-25 20:05:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15678586880. Throughput: 0: 42726.1. Samples: 15678666840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:38,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 20:05:41,271][15401] Updated weights for policy 0, policy_version 956954 (0.0032) [2024-06-25 20:05:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15678783488. Throughput: 0: 42347.6. Samples: 15678913460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:43,396][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 20:05:43,539][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956958_15678799872.pth... [2024-06-25 20:05:43,598][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956336_15668609024.pth [2024-06-25 20:05:45,993][15401] Updated weights for policy 0, policy_version 956964 (0.0038) [2024-06-25 20:05:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15679029248. Throughput: 0: 42641.0. Samples: 15679170740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:48,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 20:05:48,893][15401] Updated weights for policy 0, policy_version 956974 (0.0029) [2024-06-25 20:05:53,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15679193088. Throughput: 0: 42478.2. Samples: 15679301160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:53,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 20:05:53,941][15401] Updated weights for policy 0, policy_version 956984 (0.0035) [2024-06-25 20:05:56,589][15401] Updated weights for policy 0, policy_version 956994 (0.0028) [2024-06-25 20:05:58,390][15132] Fps is (10 sec: 39320.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15679422464. Throughput: 0: 42331.7. Samples: 15679548280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:05:58,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 20:06:01,755][15401] Updated weights for policy 0, policy_version 957004 (0.0034) [2024-06-25 20:06:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15679651840. Throughput: 0: 42598.6. Samples: 15679810440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:03,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 20:06:04,214][15401] Updated weights for policy 0, policy_version 957014 (0.0030) [2024-06-25 20:06:08,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15679848448. Throughput: 0: 42582.7. Samples: 15679939800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 20:06:09,310][15401] Updated weights for policy 0, policy_version 957024 (0.0045) [2024-06-25 20:06:10,873][15349] Signal inference workers to stop experience collection... (231950 times) [2024-06-25 20:06:10,919][15401] InferenceWorker_p0-w0: stopping experience collection (231950 times) [2024-06-25 20:06:10,927][15349] Signal inference workers to resume experience collection... (231950 times) [2024-06-25 20:06:10,943][15401] InferenceWorker_p0-w0: resuming experience collection (231950 times) [2024-06-25 20:06:12,078][15401] Updated weights for policy 0, policy_version 957034 (0.0029) [2024-06-25 20:06:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42601.6, 300 sec: 42542.8). Total num frames: 15680077824. Throughput: 0: 42393.2. Samples: 15680188200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:13,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 20:06:17,023][15401] Updated weights for policy 0, policy_version 957044 (0.0033) [2024-06-25 20:06:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15680290816. Throughput: 0: 42459.7. Samples: 15680450500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:18,390][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 20:06:20,131][15401] Updated weights for policy 0, policy_version 957054 (0.0035) [2024-06-25 20:06:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15680487424. Throughput: 0: 42324.1. Samples: 15680571420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:23,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 20:06:24,627][15401] Updated weights for policy 0, policy_version 957064 (0.0026) [2024-06-25 20:06:28,171][15401] Updated weights for policy 0, policy_version 957074 (0.0041) [2024-06-25 20:06:28,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42598.1). Total num frames: 15680716800. Throughput: 0: 42436.1. Samples: 15680823180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:28,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 20:06:32,241][15401] Updated weights for policy 0, policy_version 957084 (0.0036) [2024-06-25 20:06:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42543.4). Total num frames: 15680913408. Throughput: 0: 42375.1. Samples: 15681077620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 20:06:35,770][15401] Updated weights for policy 0, policy_version 957094 (0.0040) [2024-06-25 20:06:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 15681142784. Throughput: 0: 42363.0. Samples: 15681207500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:38,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 20:06:39,694][15401] Updated weights for policy 0, policy_version 957104 (0.0034) [2024-06-25 20:06:43,258][15401] Updated weights for policy 0, policy_version 957114 (0.0038) [2024-06-25 20:06:43,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15681355776. Throughput: 0: 42570.9. Samples: 15681463960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 20:06:47,422][15401] Updated weights for policy 0, policy_version 957124 (0.0022) [2024-06-25 20:06:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42543.2). Total num frames: 15681552384. Throughput: 0: 42475.9. Samples: 15681721860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:48,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 20:06:50,894][15401] Updated weights for policy 0, policy_version 957134 (0.0038) [2024-06-25 20:06:53,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 15681765376. Throughput: 0: 42311.1. Samples: 15681843800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-25 20:06:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 20:06:54,949][15401] Updated weights for policy 0, policy_version 957144 (0.0026) [2024-06-25 20:06:58,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 15681994752. Throughput: 0: 42577.8. Samples: 15682104200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:06:58,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 20:06:59,061][15401] Updated weights for policy 0, policy_version 957154 (0.0021) [2024-06-25 20:07:02,401][15401] Updated weights for policy 0, policy_version 957164 (0.0033) [2024-06-25 20:07:03,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15682191360. Throughput: 0: 42458.6. Samples: 15682361140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:03,393][15132] Avg episode reward: [(0, '0.724')] [2024-06-25 20:07:06,578][15401] Updated weights for policy 0, policy_version 957174 (0.0027) [2024-06-25 20:07:08,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15682404352. Throughput: 0: 42715.6. Samples: 15682493620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:08,390][15132] Avg episode reward: [(0, '0.290')] [2024-06-25 20:07:10,289][15401] Updated weights for policy 0, policy_version 957184 (0.0036) [2024-06-25 20:07:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15682617344. Throughput: 0: 42700.4. Samples: 15682744600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:13,400][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 20:07:14,275][15401] Updated weights for policy 0, policy_version 957194 (0.0036) [2024-06-25 20:07:17,877][15401] Updated weights for policy 0, policy_version 957204 (0.0038) [2024-06-25 20:07:18,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42599.1). Total num frames: 15682846720. Throughput: 0: 42623.5. Samples: 15682995680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:18,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 20:07:21,915][15401] Updated weights for policy 0, policy_version 957214 (0.0038) [2024-06-25 20:07:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15683043328. Throughput: 0: 42682.7. Samples: 15683128220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:23,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 20:07:25,483][15401] Updated weights for policy 0, policy_version 957224 (0.0045) [2024-06-25 20:07:27,811][15349] Signal inference workers to stop experience collection... (232000 times) [2024-06-25 20:07:27,811][15349] Signal inference workers to resume experience collection... (232000 times) [2024-06-25 20:07:27,850][15401] InferenceWorker_p0-w0: stopping experience collection (232000 times) [2024-06-25 20:07:27,850][15401] InferenceWorker_p0-w0: resuming experience collection (232000 times) [2024-06-25 20:07:28,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 15683256320. Throughput: 0: 42657.7. Samples: 15683383560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:28,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 20:07:29,712][15401] Updated weights for policy 0, policy_version 957234 (0.0034) [2024-06-25 20:07:33,089][15401] Updated weights for policy 0, policy_version 957244 (0.0031) [2024-06-25 20:07:33,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15683502080. Throughput: 0: 42618.8. Samples: 15683639700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:33,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 20:07:37,254][15401] Updated weights for policy 0, policy_version 957254 (0.0035) [2024-06-25 20:07:38,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15683682304. Throughput: 0: 42935.6. Samples: 15683775900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:38,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 20:07:40,638][15401] Updated weights for policy 0, policy_version 957264 (0.0035) [2024-06-25 20:07:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15683911680. Throughput: 0: 42853.4. Samples: 15684032600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:43,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 20:07:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957270_15683911680.pth... [2024-06-25 20:07:43,481][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956644_15673655296.pth [2024-06-25 20:07:44,917][15401] Updated weights for policy 0, policy_version 957274 (0.0048) [2024-06-25 20:07:48,390][15401] Updated weights for policy 0, policy_version 957284 (0.0033) [2024-06-25 20:07:48,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 15684141056. Throughput: 0: 42754.2. Samples: 15684285180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:48,393][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 20:07:52,543][15401] Updated weights for policy 0, policy_version 957294 (0.0033) [2024-06-25 20:07:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15684321280. Throughput: 0: 42624.9. Samples: 15684411740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:53,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 20:07:56,021][15401] Updated weights for policy 0, policy_version 957304 (0.0027) [2024-06-25 20:07:58,390][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15684567040. Throughput: 0: 42861.4. Samples: 15684673360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:07:58,392][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 20:08:00,305][15401] Updated weights for policy 0, policy_version 957314 (0.0027) [2024-06-25 20:08:03,390][15132] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15684780032. Throughput: 0: 43075.5. Samples: 15684934080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:03,390][15132] Avg episode reward: [(0, '0.744')] [2024-06-25 20:08:03,480][15401] Updated weights for policy 0, policy_version 957324 (0.0031) [2024-06-25 20:08:07,988][15401] Updated weights for policy 0, policy_version 957334 (0.0040) [2024-06-25 20:08:08,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15684976640. Throughput: 0: 43022.7. Samples: 15685064240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 20:08:11,046][15401] Updated weights for policy 0, policy_version 957344 (0.0022) [2024-06-25 20:08:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15685189632. Throughput: 0: 43008.9. Samples: 15685318960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:13,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 20:08:15,624][15401] Updated weights for policy 0, policy_version 957354 (0.0039) [2024-06-25 20:08:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15685435392. Throughput: 0: 42928.0. Samples: 15685571460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:18,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 20:08:18,484][15401] Updated weights for policy 0, policy_version 957364 (0.0029) [2024-06-25 20:08:23,159][15401] Updated weights for policy 0, policy_version 957374 (0.0031) [2024-06-25 20:08:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15685632000. Throughput: 0: 42819.1. Samples: 15685702760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 20:08:26,767][15401] Updated weights for policy 0, policy_version 957384 (0.0036) [2024-06-25 20:08:28,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15685828608. Throughput: 0: 42942.4. Samples: 15685965000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:28,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 20:08:30,514][15401] Updated weights for policy 0, policy_version 957394 (0.0035) [2024-06-25 20:08:33,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15686057984. Throughput: 0: 43060.5. Samples: 15686222900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:33,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 20:08:34,332][15401] Updated weights for policy 0, policy_version 957404 (0.0028) [2024-06-25 20:08:38,071][15401] Updated weights for policy 0, policy_version 957414 (0.0049) [2024-06-25 20:08:38,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15686287360. Throughput: 0: 43080.8. Samples: 15686350380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:38,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 20:08:42,274][15401] Updated weights for policy 0, policy_version 957424 (0.0037) [2024-06-25 20:08:43,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15686500352. Throughput: 0: 42981.4. Samples: 15686607520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:43,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:08:46,046][15349] Signal inference workers to stop experience collection... (232050 times) [2024-06-25 20:08:46,046][15349] Signal inference workers to resume experience collection... (232050 times) [2024-06-25 20:08:46,050][15401] Updated weights for policy 0, policy_version 957434 (0.0035) [2024-06-25 20:08:46,075][15401] InferenceWorker_p0-w0: stopping experience collection (232050 times) [2024-06-25 20:08:46,075][15401] InferenceWorker_p0-w0: resuming experience collection (232050 times) [2024-06-25 20:08:48,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 15686713344. Throughput: 0: 42762.6. Samples: 15686858400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:48,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 20:08:49,807][15401] Updated weights for policy 0, policy_version 957444 (0.0030) [2024-06-25 20:08:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15686909952. Throughput: 0: 42791.1. Samples: 15686989840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 20:08:53,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 20:08:53,511][15401] Updated weights for policy 0, policy_version 957454 (0.0039) [2024-06-25 20:08:57,663][15401] Updated weights for policy 0, policy_version 957464 (0.0035) [2024-06-25 20:08:58,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15687122944. Throughput: 0: 42728.9. Samples: 15687241760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:08:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 20:09:01,652][15401] Updated weights for policy 0, policy_version 957474 (0.0037) [2024-06-25 20:09:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15687352320. Throughput: 0: 42817.4. Samples: 15687498240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:03,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 20:09:05,312][15401] Updated weights for policy 0, policy_version 957484 (0.0032) [2024-06-25 20:09:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15687532544. Throughput: 0: 42806.8. Samples: 15687629060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 20:09:09,217][15401] Updated weights for policy 0, policy_version 957494 (0.0042) [2024-06-25 20:09:13,073][15401] Updated weights for policy 0, policy_version 957504 (0.0038) [2024-06-25 20:09:13,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15687761920. Throughput: 0: 42498.5. Samples: 15687877440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 20:09:16,856][15401] Updated weights for policy 0, policy_version 957514 (0.0026) [2024-06-25 20:09:18,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15687991296. Throughput: 0: 42548.0. Samples: 15688137460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 20:09:20,830][15401] Updated weights for policy 0, policy_version 957524 (0.0045) [2024-06-25 20:09:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15688187904. Throughput: 0: 42614.2. Samples: 15688268020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:23,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 20:09:24,359][15401] Updated weights for policy 0, policy_version 957534 (0.0029) [2024-06-25 20:09:28,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 15688384512. Throughput: 0: 42542.6. Samples: 15688521940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:28,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 20:09:28,454][15401] Updated weights for policy 0, policy_version 957544 (0.0035) [2024-06-25 20:09:31,955][15401] Updated weights for policy 0, policy_version 957554 (0.0027) [2024-06-25 20:09:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15688630272. Throughput: 0: 42657.1. Samples: 15688777960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 20:09:36,293][15401] Updated weights for policy 0, policy_version 957564 (0.0038) [2024-06-25 20:09:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15688826880. Throughput: 0: 42700.8. Samples: 15688911380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:38,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 20:09:39,707][15401] Updated weights for policy 0, policy_version 957574 (0.0027) [2024-06-25 20:09:43,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 15689039872. Throughput: 0: 42519.3. Samples: 15689155140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:43,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 20:09:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957583_15689039872.pth... [2024-06-25 20:09:43,471][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000956958_15678799872.pth [2024-06-25 20:09:44,031][15401] Updated weights for policy 0, policy_version 957584 (0.0046) [2024-06-25 20:09:47,453][15401] Updated weights for policy 0, policy_version 957594 (0.0034) [2024-06-25 20:09:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 15689236480. Throughput: 0: 42653.3. Samples: 15689417640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:48,390][15132] Avg episode reward: [(0, '0.822')] [2024-06-25 20:09:51,975][15401] Updated weights for policy 0, policy_version 957604 (0.0034) [2024-06-25 20:09:53,389][15132] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15689449472. Throughput: 0: 42566.6. Samples: 15689544560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:53,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 20:09:55,407][15401] Updated weights for policy 0, policy_version 957614 (0.0026) [2024-06-25 20:09:58,394][15132] Fps is (10 sec: 45855.9, 60 sec: 42868.5, 300 sec: 42653.3). Total num frames: 15689695232. Throughput: 0: 42551.7. Samples: 15689792440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:09:58,394][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 20:09:59,539][15401] Updated weights for policy 0, policy_version 957624 (0.0038) [2024-06-25 20:10:02,886][15401] Updated weights for policy 0, policy_version 957634 (0.0044) [2024-06-25 20:10:03,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42325.1, 300 sec: 42653.9). Total num frames: 15689891840. Throughput: 0: 42678.0. Samples: 15690057980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:03,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 20:10:07,246][15401] Updated weights for policy 0, policy_version 957644 (0.0054) [2024-06-25 20:10:07,277][15349] Signal inference workers to stop experience collection... (232100 times) [2024-06-25 20:10:07,277][15349] Signal inference workers to resume experience collection... (232100 times) [2024-06-25 20:10:07,287][15401] InferenceWorker_p0-w0: stopping experience collection (232100 times) [2024-06-25 20:10:07,306][15401] InferenceWorker_p0-w0: resuming experience collection (232100 times) [2024-06-25 20:10:08,390][15132] Fps is (10 sec: 40976.9, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 15690104832. Throughput: 0: 42441.3. Samples: 15690177880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:08,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 20:10:10,871][15401] Updated weights for policy 0, policy_version 957654 (0.0044) [2024-06-25 20:10:13,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15690334208. Throughput: 0: 42514.6. Samples: 15690435100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:13,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 20:10:14,849][15401] Updated weights for policy 0, policy_version 957664 (0.0028) [2024-06-25 20:10:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15690514432. Throughput: 0: 42667.4. Samples: 15690698000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:18,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 20:10:18,594][15401] Updated weights for policy 0, policy_version 957674 (0.0032) [2024-06-25 20:10:22,389][15401] Updated weights for policy 0, policy_version 957684 (0.0043) [2024-06-25 20:10:23,391][15132] Fps is (10 sec: 39315.8, 60 sec: 42324.2, 300 sec: 42598.2). Total num frames: 15690727424. Throughput: 0: 42388.3. Samples: 15690818920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:23,391][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 20:10:26,271][15401] Updated weights for policy 0, policy_version 957694 (0.0025) [2024-06-25 20:10:28,389][15132] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 15690973184. Throughput: 0: 42664.7. Samples: 15691075040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:28,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 20:10:29,928][15401] Updated weights for policy 0, policy_version 957704 (0.0040) [2024-06-25 20:10:33,390][15132] Fps is (10 sec: 44243.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15691169792. Throughput: 0: 42472.8. Samples: 15691328920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:33,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 20:10:33,882][15401] Updated weights for policy 0, policy_version 957714 (0.0041) [2024-06-25 20:10:37,751][15401] Updated weights for policy 0, policy_version 957724 (0.0031) [2024-06-25 20:10:38,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 15691366400. Throughput: 0: 42472.5. Samples: 15691455820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:38,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 20:10:41,535][15401] Updated weights for policy 0, policy_version 957734 (0.0031) [2024-06-25 20:10:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 15691595776. Throughput: 0: 42650.2. Samples: 15691711520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 20:10:45,365][15401] Updated weights for policy 0, policy_version 957744 (0.0033) [2024-06-25 20:10:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15691808768. Throughput: 0: 42350.5. Samples: 15691963740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-25 20:10:48,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 20:10:49,266][15401] Updated weights for policy 0, policy_version 957754 (0.0034) [2024-06-25 20:10:53,210][15401] Updated weights for policy 0, policy_version 957764 (0.0032) [2024-06-25 20:10:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15692005376. Throughput: 0: 42617.8. Samples: 15692095680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:10:53,390][15132] Avg episode reward: [(0, '0.398')] [2024-06-25 20:10:56,653][15401] Updated weights for policy 0, policy_version 957774 (0.0044) [2024-06-25 20:10:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42055.2, 300 sec: 42598.4). Total num frames: 15692218368. Throughput: 0: 42429.5. Samples: 15692344420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:10:58,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 20:11:00,939][15401] Updated weights for policy 0, policy_version 957784 (0.0028) [2024-06-25 20:11:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 15692447744. Throughput: 0: 42222.0. Samples: 15692597980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 20:11:04,568][15401] Updated weights for policy 0, policy_version 957794 (0.0037) [2024-06-25 20:11:08,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15692644352. Throughput: 0: 42550.9. Samples: 15692733640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 20:11:08,563][15401] Updated weights for policy 0, policy_version 957804 (0.0033) [2024-06-25 20:11:12,041][15401] Updated weights for policy 0, policy_version 957814 (0.0030) [2024-06-25 20:11:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.5, 300 sec: 42598.4). Total num frames: 15692857344. Throughput: 0: 42342.8. Samples: 15692980460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:13,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 20:11:16,172][15401] Updated weights for policy 0, policy_version 957824 (0.0039) [2024-06-25 20:11:18,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 15693086720. Throughput: 0: 42414.7. Samples: 15693237680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:18,393][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 20:11:19,720][15401] Updated weights for policy 0, policy_version 957834 (0.0044) [2024-06-25 20:11:23,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42599.6, 300 sec: 42598.8). Total num frames: 15693283328. Throughput: 0: 42359.1. Samples: 15693361980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:23,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 20:11:24,195][15401] Updated weights for policy 0, policy_version 957844 (0.0032) [2024-06-25 20:11:27,353][15401] Updated weights for policy 0, policy_version 957854 (0.0031) [2024-06-25 20:11:28,390][15132] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15693496320. Throughput: 0: 42313.7. Samples: 15693615640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:28,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 20:11:31,899][15401] Updated weights for policy 0, policy_version 957864 (0.0041) [2024-06-25 20:11:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15693709312. Throughput: 0: 42533.3. Samples: 15693877740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:33,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 20:11:34,951][15401] Updated weights for policy 0, policy_version 957874 (0.0027) [2024-06-25 20:11:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 15693922304. Throughput: 0: 42312.3. Samples: 15693999740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 20:11:39,642][15401] Updated weights for policy 0, policy_version 957884 (0.0036) [2024-06-25 20:11:40,759][15349] Signal inference workers to stop experience collection... (232150 times) [2024-06-25 20:11:40,814][15349] Signal inference workers to resume experience collection... (232150 times) [2024-06-25 20:11:40,814][15401] InferenceWorker_p0-w0: stopping experience collection (232150 times) [2024-06-25 20:11:40,829][15401] InferenceWorker_p0-w0: resuming experience collection (232150 times) [2024-06-25 20:11:43,076][15401] Updated weights for policy 0, policy_version 957894 (0.0032) [2024-06-25 20:11:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15694135296. Throughput: 0: 42458.1. Samples: 15694255040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:43,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 20:11:43,449][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957895_15694151680.pth... [2024-06-25 20:11:43,499][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957270_15683911680.pth [2024-06-25 20:11:47,099][15401] Updated weights for policy 0, policy_version 957904 (0.0036) [2024-06-25 20:11:48,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15694348288. Throughput: 0: 42571.1. Samples: 15694513680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:48,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 20:11:50,566][15401] Updated weights for policy 0, policy_version 957914 (0.0037) [2024-06-25 20:11:53,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15694528512. Throughput: 0: 42288.9. Samples: 15694636640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:53,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 20:11:54,976][15401] Updated weights for policy 0, policy_version 957924 (0.0033) [2024-06-25 20:11:58,134][15401] Updated weights for policy 0, policy_version 957934 (0.0036) [2024-06-25 20:11:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15694790656. Throughput: 0: 42485.2. Samples: 15694892300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:11:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 20:12:02,713][15401] Updated weights for policy 0, policy_version 957944 (0.0027) [2024-06-25 20:12:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15694987264. Throughput: 0: 42520.5. Samples: 15695151000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 20:12:05,789][15401] Updated weights for policy 0, policy_version 957954 (0.0030) [2024-06-25 20:12:08,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15695183872. Throughput: 0: 42494.6. Samples: 15695274240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:08,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 20:12:10,286][15401] Updated weights for policy 0, policy_version 957964 (0.0036) [2024-06-25 20:12:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15695429632. Throughput: 0: 42544.9. Samples: 15695530160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:13,390][15132] Avg episode reward: [(0, '0.883')] [2024-06-25 20:12:13,631][15401] Updated weights for policy 0, policy_version 957974 (0.0036) [2024-06-25 20:12:17,869][15401] Updated weights for policy 0, policy_version 957984 (0.0035) [2024-06-25 20:12:18,396][15132] Fps is (10 sec: 44208.6, 60 sec: 42322.5, 300 sec: 42653.0). Total num frames: 15695626240. Throughput: 0: 42506.8. Samples: 15695790820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:18,396][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 20:12:21,667][15401] Updated weights for policy 0, policy_version 957994 (0.0035) [2024-06-25 20:12:23,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15695839232. Throughput: 0: 42561.6. Samples: 15695915000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:23,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 20:12:25,357][15401] Updated weights for policy 0, policy_version 958004 (0.0042) [2024-06-25 20:12:28,389][15132] Fps is (10 sec: 42625.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15696052224. Throughput: 0: 42602.8. Samples: 15696172160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 20:12:29,147][15401] Updated weights for policy 0, policy_version 958014 (0.0037) [2024-06-25 20:12:33,388][15401] Updated weights for policy 0, policy_version 958024 (0.0039) [2024-06-25 20:12:33,390][15132] Fps is (10 sec: 42597.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 15696265216. Throughput: 0: 42769.2. Samples: 15696438300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:33,394][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 20:12:36,875][15401] Updated weights for policy 0, policy_version 958034 (0.0052) [2024-06-25 20:12:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15696494592. Throughput: 0: 42770.6. Samples: 15696561320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:38,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 20:12:41,029][15401] Updated weights for policy 0, policy_version 958044 (0.0032) [2024-06-25 20:12:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 15696707584. Throughput: 0: 42847.4. Samples: 15696820440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:43,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 20:12:44,501][15401] Updated weights for policy 0, policy_version 958054 (0.0041) [2024-06-25 20:12:48,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15696904192. Throughput: 0: 42845.8. Samples: 15697079060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 20:12:48,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 20:12:48,659][15401] Updated weights for policy 0, policy_version 958064 (0.0032) [2024-06-25 20:12:52,331][15401] Updated weights for policy 0, policy_version 958074 (0.0031) [2024-06-25 20:12:53,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 15697133568. Throughput: 0: 42876.5. Samples: 15697203680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:12:53,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 20:12:56,038][15401] Updated weights for policy 0, policy_version 958084 (0.0034) [2024-06-25 20:12:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15697346560. Throughput: 0: 42991.7. Samples: 15697464780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:12:58,390][15132] Avg episode reward: [(0, '0.413')] [2024-06-25 20:12:59,893][15401] Updated weights for policy 0, policy_version 958094 (0.0032) [2024-06-25 20:13:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15697559552. Throughput: 0: 42914.5. Samples: 15697721700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:03,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:13:03,680][15401] Updated weights for policy 0, policy_version 958104 (0.0025) [2024-06-25 20:13:07,473][15401] Updated weights for policy 0, policy_version 958114 (0.0050) [2024-06-25 20:13:08,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15697772544. Throughput: 0: 42990.9. Samples: 15697849700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:08,393][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:13:11,367][15401] Updated weights for policy 0, policy_version 958124 (0.0045) [2024-06-25 20:13:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15697985536. Throughput: 0: 42992.8. Samples: 15698106840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:13,390][15132] Avg episode reward: [(0, '0.295')] [2024-06-25 20:13:14,526][15349] Signal inference workers to stop experience collection... (232200 times) [2024-06-25 20:13:14,527][15349] Signal inference workers to resume experience collection... (232200 times) [2024-06-25 20:13:14,544][15401] InferenceWorker_p0-w0: stopping experience collection (232200 times) [2024-06-25 20:13:14,544][15401] InferenceWorker_p0-w0: resuming experience collection (232200 times) [2024-06-25 20:13:15,168][15401] Updated weights for policy 0, policy_version 958134 (0.0034) [2024-06-25 20:13:18,390][15132] Fps is (10 sec: 40966.0, 60 sec: 42602.3, 300 sec: 42542.7). Total num frames: 15698182144. Throughput: 0: 42795.7. Samples: 15698364140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:18,391][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 20:13:18,894][15401] Updated weights for policy 0, policy_version 958144 (0.0030) [2024-06-25 20:13:22,813][15401] Updated weights for policy 0, policy_version 958154 (0.0019) [2024-06-25 20:13:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15698411520. Throughput: 0: 42791.1. Samples: 15698486920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:23,390][15132] Avg episode reward: [(0, '0.178')] [2024-06-25 20:13:26,341][15401] Updated weights for policy 0, policy_version 958164 (0.0034) [2024-06-25 20:13:28,389][15132] Fps is (10 sec: 45879.9, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 15698640896. Throughput: 0: 42707.3. Samples: 15698742260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:28,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 20:13:30,678][15401] Updated weights for policy 0, policy_version 958174 (0.0026) [2024-06-25 20:13:33,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 15698821120. Throughput: 0: 42777.0. Samples: 15699004020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:33,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 20:13:34,145][15401] Updated weights for policy 0, policy_version 958184 (0.0028) [2024-06-25 20:13:38,387][15401] Updated weights for policy 0, policy_version 958194 (0.0024) [2024-06-25 20:13:38,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15699050496. Throughput: 0: 42804.8. Samples: 15699129900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:38,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 20:13:41,711][15401] Updated weights for policy 0, policy_version 958204 (0.0039) [2024-06-25 20:13:43,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15699279872. Throughput: 0: 42613.3. Samples: 15699382380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:43,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 20:13:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958208_15699279872.pth... [2024-06-25 20:13:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957583_15689039872.pth [2024-06-25 20:13:46,023][15401] Updated weights for policy 0, policy_version 958214 (0.0034) [2024-06-25 20:13:48,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15699476480. Throughput: 0: 42792.6. Samples: 15699647360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:48,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 20:13:49,200][15401] Updated weights for policy 0, policy_version 958224 (0.0026) [2024-06-25 20:13:53,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15699689472. Throughput: 0: 42723.7. Samples: 15699772160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:53,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 20:13:53,624][15401] Updated weights for policy 0, policy_version 958234 (0.0035) [2024-06-25 20:13:56,881][15401] Updated weights for policy 0, policy_version 958244 (0.0029) [2024-06-25 20:13:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15699935232. Throughput: 0: 42654.3. Samples: 15700026280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:13:58,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 20:14:01,142][15401] Updated weights for policy 0, policy_version 958254 (0.0022) [2024-06-25 20:14:03,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15700115456. Throughput: 0: 42822.6. Samples: 15700291120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 20:14:04,697][15401] Updated weights for policy 0, policy_version 958264 (0.0036) [2024-06-25 20:14:08,390][15132] Fps is (10 sec: 37683.3, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 15700312064. Throughput: 0: 42792.0. Samples: 15700412560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:08,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 20:14:09,189][15401] Updated weights for policy 0, policy_version 958274 (0.0038) [2024-06-25 20:14:12,321][15401] Updated weights for policy 0, policy_version 958284 (0.0033) [2024-06-25 20:14:13,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15700574208. Throughput: 0: 42794.1. Samples: 15700668000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:13,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 20:14:16,715][15401] Updated weights for policy 0, policy_version 958294 (0.0031) [2024-06-25 20:14:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42872.2, 300 sec: 42598.4). Total num frames: 15700754432. Throughput: 0: 42669.7. Samples: 15700924160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:18,390][15132] Avg episode reward: [(0, '0.316')] [2024-06-25 20:14:19,979][15401] Updated weights for policy 0, policy_version 958304 (0.0030) [2024-06-25 20:14:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15700967424. Throughput: 0: 42591.7. Samples: 15701046520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:23,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 20:14:24,489][15401] Updated weights for policy 0, policy_version 958314 (0.0039) [2024-06-25 20:14:28,108][15401] Updated weights for policy 0, policy_version 958324 (0.0046) [2024-06-25 20:14:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15701196800. Throughput: 0: 42791.2. Samples: 15701307980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:28,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 20:14:32,283][15401] Updated weights for policy 0, policy_version 958334 (0.0035) [2024-06-25 20:14:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15701409792. Throughput: 0: 42614.2. Samples: 15701565000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:33,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 20:14:35,555][15401] Updated weights for policy 0, policy_version 958344 (0.0027) [2024-06-25 20:14:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15701622784. Throughput: 0: 42705.7. Samples: 15701693920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:38,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 20:14:39,839][15401] Updated weights for policy 0, policy_version 958354 (0.0022) [2024-06-25 20:14:41,825][15349] Signal inference workers to stop experience collection... (232250 times) [2024-06-25 20:14:41,826][15349] Signal inference workers to resume experience collection... (232250 times) [2024-06-25 20:14:41,874][15401] InferenceWorker_p0-w0: stopping experience collection (232250 times) [2024-06-25 20:14:41,874][15401] InferenceWorker_p0-w0: resuming experience collection (232250 times) [2024-06-25 20:14:43,096][15401] Updated weights for policy 0, policy_version 958364 (0.0037) [2024-06-25 20:14:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15701835776. Throughput: 0: 42747.7. Samples: 15701949920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-25 20:14:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 20:14:47,462][15401] Updated weights for policy 0, policy_version 958374 (0.0030) [2024-06-25 20:14:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15702032384. Throughput: 0: 42643.1. Samples: 15702210060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:14:48,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 20:14:50,755][15401] Updated weights for policy 0, policy_version 958384 (0.0035) [2024-06-25 20:14:53,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42599.0). Total num frames: 15702261760. Throughput: 0: 42612.1. Samples: 15702330100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:14:53,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 20:14:55,047][15401] Updated weights for policy 0, policy_version 958394 (0.0023) [2024-06-25 20:14:58,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 15702474752. Throughput: 0: 42817.7. Samples: 15702594900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:14:58,393][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 20:14:58,618][15401] Updated weights for policy 0, policy_version 958404 (0.0033) [2024-06-25 20:15:02,714][15401] Updated weights for policy 0, policy_version 958414 (0.0029) [2024-06-25 20:15:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15702671360. Throughput: 0: 42591.4. Samples: 15702840780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 20:15:06,368][15401] Updated weights for policy 0, policy_version 958424 (0.0051) [2024-06-25 20:15:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 15702917120. Throughput: 0: 42740.0. Samples: 15702969820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 20:15:10,470][15401] Updated weights for policy 0, policy_version 958434 (0.0030) [2024-06-25 20:15:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 15703097344. Throughput: 0: 42710.6. Samples: 15703229960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:13,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 20:15:13,973][15401] Updated weights for policy 0, policy_version 958444 (0.0036) [2024-06-25 20:15:18,386][15401] Updated weights for policy 0, policy_version 958454 (0.0037) [2024-06-25 20:15:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 15703310336. Throughput: 0: 42731.2. Samples: 15703487900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:18,390][15132] Avg episode reward: [(0, '0.329')] [2024-06-25 20:15:21,598][15401] Updated weights for policy 0, policy_version 958464 (0.0030) [2024-06-25 20:15:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15703539712. Throughput: 0: 42546.2. Samples: 15703608500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:23,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 20:15:26,006][15401] Updated weights for policy 0, policy_version 958474 (0.0033) [2024-06-25 20:15:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15703752704. Throughput: 0: 42730.2. Samples: 15703872780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:28,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 20:15:29,679][15401] Updated weights for policy 0, policy_version 958484 (0.0034) [2024-06-25 20:15:33,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15703949312. Throughput: 0: 42554.3. Samples: 15704125000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:33,398][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 20:15:33,655][15401] Updated weights for policy 0, policy_version 958494 (0.0042) [2024-06-25 20:15:37,242][15401] Updated weights for policy 0, policy_version 958504 (0.0040) [2024-06-25 20:15:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15704178688. Throughput: 0: 42534.7. Samples: 15704244160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:38,390][15132] Avg episode reward: [(0, '0.300')] [2024-06-25 20:15:41,301][15401] Updated weights for policy 0, policy_version 958514 (0.0030) [2024-06-25 20:15:43,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15704391680. Throughput: 0: 42633.5. Samples: 15704513300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:43,390][15132] Avg episode reward: [(0, '0.400')] [2024-06-25 20:15:43,429][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958521_15704408064.pth... [2024-06-25 20:15:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000957895_15694151680.pth [2024-06-25 20:15:44,733][15401] Updated weights for policy 0, policy_version 958524 (0.0033) [2024-06-25 20:15:48,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15704588288. Throughput: 0: 42788.1. Samples: 15704766240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:48,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 20:15:48,820][15401] Updated weights for policy 0, policy_version 958534 (0.0036) [2024-06-25 20:15:52,241][15401] Updated weights for policy 0, policy_version 958544 (0.0024) [2024-06-25 20:15:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15704817664. Throughput: 0: 42773.3. Samples: 15704894620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:53,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 20:15:56,274][15401] Updated weights for policy 0, policy_version 958554 (0.0038) [2024-06-25 20:15:58,229][15349] Signal inference workers to stop experience collection... (232300 times) [2024-06-25 20:15:58,230][15349] Signal inference workers to resume experience collection... (232300 times) [2024-06-25 20:15:58,262][15401] InferenceWorker_p0-w0: stopping experience collection (232300 times) [2024-06-25 20:15:58,262][15401] InferenceWorker_p0-w0: resuming experience collection (232300 times) [2024-06-25 20:15:58,392][15132] Fps is (10 sec: 45864.2, 60 sec: 42871.5, 300 sec: 42709.1). Total num frames: 15705047040. Throughput: 0: 42836.4. Samples: 15705157700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:15:58,393][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 20:15:59,697][15401] Updated weights for policy 0, policy_version 958564 (0.0035) [2024-06-25 20:16:03,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15705227264. Throughput: 0: 42783.5. Samples: 15705413160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:03,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 20:16:03,847][15401] Updated weights for policy 0, policy_version 958574 (0.0030) [2024-06-25 20:16:07,383][15401] Updated weights for policy 0, policy_version 958584 (0.0033) [2024-06-25 20:16:08,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15705473024. Throughput: 0: 42978.8. Samples: 15705542540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:08,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 20:16:11,959][15401] Updated weights for policy 0, policy_version 958594 (0.0043) [2024-06-25 20:16:13,392][15132] Fps is (10 sec: 45864.0, 60 sec: 43142.8, 300 sec: 42709.5). Total num frames: 15705686016. Throughput: 0: 42741.2. Samples: 15705796240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:13,392][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 20:16:15,313][15401] Updated weights for policy 0, policy_version 958604 (0.0029) [2024-06-25 20:16:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15705882624. Throughput: 0: 42904.9. Samples: 15706055720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:18,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 20:16:19,528][15401] Updated weights for policy 0, policy_version 958614 (0.0038) [2024-06-25 20:16:22,949][15401] Updated weights for policy 0, policy_version 958624 (0.0037) [2024-06-25 20:16:23,389][15132] Fps is (10 sec: 42609.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15706112000. Throughput: 0: 43090.3. Samples: 15706183220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:23,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 20:16:27,023][15401] Updated weights for policy 0, policy_version 958634 (0.0032) [2024-06-25 20:16:28,396][15132] Fps is (10 sec: 45845.9, 60 sec: 43139.9, 300 sec: 42819.6). Total num frames: 15706341376. Throughput: 0: 42993.4. Samples: 15706448280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:28,396][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 20:16:30,627][15401] Updated weights for policy 0, policy_version 958644 (0.0032) [2024-06-25 20:16:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15706521600. Throughput: 0: 43167.1. Samples: 15706708760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:33,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 20:16:34,734][15401] Updated weights for policy 0, policy_version 958654 (0.0035) [2024-06-25 20:16:38,201][15401] Updated weights for policy 0, policy_version 958664 (0.0043) [2024-06-25 20:16:38,390][15132] Fps is (10 sec: 42625.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15706767360. Throughput: 0: 42963.1. Samples: 15706827960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:38,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 20:16:42,235][15401] Updated weights for policy 0, policy_version 958674 (0.0023) [2024-06-25 20:16:43,390][15132] Fps is (10 sec: 45872.6, 60 sec: 43144.1, 300 sec: 42820.5). Total num frames: 15706980352. Throughput: 0: 43184.4. Samples: 15707100920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:16:43,391][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 20:16:45,537][15401] Updated weights for policy 0, policy_version 958684 (0.0032) [2024-06-25 20:16:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15707176960. Throughput: 0: 42975.2. Samples: 15707347040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:16:48,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 20:16:49,916][15401] Updated weights for policy 0, policy_version 958694 (0.0035) [2024-06-25 20:16:53,389][15132] Fps is (10 sec: 40962.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15707389952. Throughput: 0: 42922.7. Samples: 15707474060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:16:53,390][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 20:16:53,402][15401] Updated weights for policy 0, policy_version 958704 (0.0039) [2024-06-25 20:16:57,574][15401] Updated weights for policy 0, policy_version 958714 (0.0033) [2024-06-25 20:16:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15707602944. Throughput: 0: 43171.6. Samples: 15707738860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:16:58,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 20:17:00,820][15401] Updated weights for policy 0, policy_version 958724 (0.0037) [2024-06-25 20:17:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15707815936. Throughput: 0: 42962.8. Samples: 15707989040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:03,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 20:17:05,061][15401] Updated weights for policy 0, policy_version 958734 (0.0038) [2024-06-25 20:17:08,301][15401] Updated weights for policy 0, policy_version 958744 (0.0025) [2024-06-25 20:17:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 15708061696. Throughput: 0: 42953.6. Samples: 15708116140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:08,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:17:12,686][15401] Updated weights for policy 0, policy_version 958754 (0.0037) [2024-06-25 20:17:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42873.2, 300 sec: 42821.5). Total num frames: 15708258304. Throughput: 0: 42924.7. Samples: 15708379620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:17:15,651][15349] Signal inference workers to stop experience collection... (232350 times) [2024-06-25 20:17:15,657][15349] Signal inference workers to resume experience collection... (232350 times) [2024-06-25 20:17:15,691][15401] InferenceWorker_p0-w0: stopping experience collection (232350 times) [2024-06-25 20:17:15,691][15401] InferenceWorker_p0-w0: resuming experience collection (232350 times) [2024-06-25 20:17:15,794][15401] Updated weights for policy 0, policy_version 958764 (0.0031) [2024-06-25 20:17:18,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15708454912. Throughput: 0: 42737.3. Samples: 15708631940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:18,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 20:17:20,458][15401] Updated weights for policy 0, policy_version 958774 (0.0024) [2024-06-25 20:17:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15708700672. Throughput: 0: 42849.7. Samples: 15708756200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:23,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 20:17:23,826][15401] Updated weights for policy 0, policy_version 958784 (0.0041) [2024-06-25 20:17:28,277][15401] Updated weights for policy 0, policy_version 958794 (0.0043) [2024-06-25 20:17:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42329.8, 300 sec: 42765.0). Total num frames: 15708880896. Throughput: 0: 42570.2. Samples: 15709016560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 20:17:31,492][15401] Updated weights for policy 0, policy_version 958804 (0.0043) [2024-06-25 20:17:33,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15709093888. Throughput: 0: 42579.0. Samples: 15709263100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 20:17:35,907][15401] Updated weights for policy 0, policy_version 958814 (0.0033) [2024-06-25 20:17:38,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15709339648. Throughput: 0: 42629.7. Samples: 15709392400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:38,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 20:17:39,035][15401] Updated weights for policy 0, policy_version 958824 (0.0026) [2024-06-25 20:17:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.6, 300 sec: 42709.5). Total num frames: 15709503488. Throughput: 0: 42523.1. Samples: 15709652400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 20:17:43,417][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958833_15709519872.pth... [2024-06-25 20:17:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958208_15699279872.pth [2024-06-25 20:17:43,770][15401] Updated weights for policy 0, policy_version 958834 (0.0027) [2024-06-25 20:17:46,929][15401] Updated weights for policy 0, policy_version 958844 (0.0057) [2024-06-25 20:17:48,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15709732864. Throughput: 0: 42388.3. Samples: 15709896520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:48,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 20:17:51,375][15401] Updated weights for policy 0, policy_version 958854 (0.0027) [2024-06-25 20:17:53,389][15132] Fps is (10 sec: 47514.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15709978624. Throughput: 0: 42634.8. Samples: 15710034700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:53,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 20:17:54,717][15401] Updated weights for policy 0, policy_version 958864 (0.0025) [2024-06-25 20:17:58,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 15710109696. Throughput: 0: 42375.5. Samples: 15710286520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:17:58,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 20:17:59,208][15401] Updated weights for policy 0, policy_version 958874 (0.0046) [2024-06-25 20:18:02,530][15401] Updated weights for policy 0, policy_version 958884 (0.0038) [2024-06-25 20:18:03,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 15710388224. Throughput: 0: 42349.1. Samples: 15710537660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:03,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 20:18:06,958][15401] Updated weights for policy 0, policy_version 958894 (0.0037) [2024-06-25 20:18:08,390][15132] Fps is (10 sec: 49151.5, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 15710601216. Throughput: 0: 42678.5. Samples: 15710676740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:08,391][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 20:18:10,325][15401] Updated weights for policy 0, policy_version 958904 (0.0033) [2024-06-25 20:18:13,390][15132] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 42654.1). Total num frames: 15710765056. Throughput: 0: 42326.7. Samples: 15710921260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:18:14,799][15401] Updated weights for policy 0, policy_version 958914 (0.0032) [2024-06-25 20:18:18,132][15401] Updated weights for policy 0, policy_version 958924 (0.0034) [2024-06-25 20:18:18,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15711027200. Throughput: 0: 42375.5. Samples: 15711170000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:18:22,425][15401] Updated weights for policy 0, policy_version 958934 (0.0031) [2024-06-25 20:18:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15711207424. Throughput: 0: 42543.4. Samples: 15711306860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:23,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 20:18:25,700][15401] Updated weights for policy 0, policy_version 958944 (0.0041) [2024-06-25 20:18:28,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15711420416. Throughput: 0: 42364.0. Samples: 15711558780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:28,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 20:18:29,658][15349] Signal inference workers to stop experience collection... (232400 times) [2024-06-25 20:18:29,664][15349] Signal inference workers to resume experience collection... (232400 times) [2024-06-25 20:18:29,704][15401] InferenceWorker_p0-w0: stopping experience collection (232400 times) [2024-06-25 20:18:29,705][15401] InferenceWorker_p0-w0: resuming experience collection (232400 times) [2024-06-25 20:18:30,493][15401] Updated weights for policy 0, policy_version 958954 (0.0041) [2024-06-25 20:18:33,250][15401] Updated weights for policy 0, policy_version 958964 (0.0033) [2024-06-25 20:18:33,392][15132] Fps is (10 sec: 47502.5, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 15711682560. Throughput: 0: 42393.8. Samples: 15711804340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:33,392][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 20:18:38,196][15401] Updated weights for policy 0, policy_version 958974 (0.0028) [2024-06-25 20:18:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15711846400. Throughput: 0: 42272.4. Samples: 15711936960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-25 20:18:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 20:18:40,751][15401] Updated weights for policy 0, policy_version 958984 (0.0046) [2024-06-25 20:18:43,390][15132] Fps is (10 sec: 37691.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15712059392. Throughput: 0: 42289.3. Samples: 15712189540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:18:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 20:18:45,878][15401] Updated weights for policy 0, policy_version 958994 (0.0039) [2024-06-25 20:18:48,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15712305152. Throughput: 0: 42205.5. Samples: 15712436900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:18:48,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 20:18:49,083][15401] Updated weights for policy 0, policy_version 959004 (0.0024) [2024-06-25 20:18:53,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 15712468992. Throughput: 0: 42194.1. Samples: 15712575460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:18:53,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 20:18:53,417][15401] Updated weights for policy 0, policy_version 959014 (0.0027) [2024-06-25 20:18:56,551][15401] Updated weights for policy 0, policy_version 959024 (0.0028) [2024-06-25 20:18:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 15712714752. Throughput: 0: 42219.2. Samples: 15712821120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:18:58,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 20:19:01,123][15401] Updated weights for policy 0, policy_version 959034 (0.0042) [2024-06-25 20:19:03,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15712927744. Throughput: 0: 42381.4. Samples: 15713077160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:03,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 20:19:04,079][15401] Updated weights for policy 0, policy_version 959044 (0.0039) [2024-06-25 20:19:08,389][15132] Fps is (10 sec: 37683.1, 60 sec: 41506.3, 300 sec: 42431.8). Total num frames: 15713091584. Throughput: 0: 42210.7. Samples: 15713206340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:08,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 20:19:08,977][15401] Updated weights for policy 0, policy_version 959054 (0.0044) [2024-06-25 20:19:11,694][15401] Updated weights for policy 0, policy_version 959064 (0.0026) [2024-06-25 20:19:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15713353728. Throughput: 0: 42081.8. Samples: 15713452460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:13,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 20:19:16,994][15401] Updated weights for policy 0, policy_version 959074 (0.0044) [2024-06-25 20:19:18,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15713550336. Throughput: 0: 42506.3. Samples: 15713717020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:18,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 20:19:19,362][15401] Updated weights for policy 0, policy_version 959084 (0.0034) [2024-06-25 20:19:23,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15713746944. Throughput: 0: 42310.7. Samples: 15713840940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:23,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 20:19:24,813][15401] Updated weights for policy 0, policy_version 959094 (0.0034) [2024-06-25 20:19:27,246][15401] Updated weights for policy 0, policy_version 959104 (0.0034) [2024-06-25 20:19:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15713992704. Throughput: 0: 42246.8. Samples: 15714090640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:28,390][15132] Avg episode reward: [(0, '0.717')] [2024-06-25 20:19:32,278][15401] Updated weights for policy 0, policy_version 959114 (0.0028) [2024-06-25 20:19:32,293][15349] Signal inference workers to stop experience collection... (232450 times) [2024-06-25 20:19:32,293][15349] Signal inference workers to resume experience collection... (232450 times) [2024-06-25 20:19:32,311][15401] InferenceWorker_p0-w0: stopping experience collection (232450 times) [2024-06-25 20:19:32,311][15401] InferenceWorker_p0-w0: resuming experience collection (232450 times) [2024-06-25 20:19:33,389][15132] Fps is (10 sec: 42598.2, 60 sec: 41507.8, 300 sec: 42542.9). Total num frames: 15714172928. Throughput: 0: 42707.6. Samples: 15714358740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:33,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 20:19:35,249][15401] Updated weights for policy 0, policy_version 959124 (0.0043) [2024-06-25 20:19:38,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15714385920. Throughput: 0: 42276.5. Samples: 15714477900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:38,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 20:19:39,870][15401] Updated weights for policy 0, policy_version 959134 (0.0041) [2024-06-25 20:19:42,963][15401] Updated weights for policy 0, policy_version 959144 (0.0032) [2024-06-25 20:19:43,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15714631680. Throughput: 0: 42487.0. Samples: 15714733040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:43,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 20:19:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959145_15714631680.pth... [2024-06-25 20:19:43,467][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958521_15704408064.pth [2024-06-25 20:19:47,479][15401] Updated weights for policy 0, policy_version 959154 (0.0033) [2024-06-25 20:19:48,389][15132] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 15714811904. Throughput: 0: 42511.1. Samples: 15714990160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:48,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 20:19:50,590][15401] Updated weights for policy 0, policy_version 959164 (0.0042) [2024-06-25 20:19:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 15715041280. Throughput: 0: 42372.9. Samples: 15715113120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:53,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 20:19:55,114][15401] Updated weights for policy 0, policy_version 959174 (0.0031) [2024-06-25 20:19:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15715270656. Throughput: 0: 42701.7. Samples: 15715374040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:19:58,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 20:19:58,391][15401] Updated weights for policy 0, policy_version 959184 (0.0038) [2024-06-25 20:20:02,670][15401] Updated weights for policy 0, policy_version 959194 (0.0029) [2024-06-25 20:20:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15715467264. Throughput: 0: 42599.6. Samples: 15715634000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:03,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 20:20:05,813][15401] Updated weights for policy 0, policy_version 959204 (0.0030) [2024-06-25 20:20:08,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15715680256. Throughput: 0: 42679.4. Samples: 15715761520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:08,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 20:20:10,224][15401] Updated weights for policy 0, policy_version 959214 (0.0037) [2024-06-25 20:20:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15715909632. Throughput: 0: 42959.1. Samples: 15716023800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:13,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 20:20:13,414][15401] Updated weights for policy 0, policy_version 959224 (0.0035) [2024-06-25 20:20:17,708][15401] Updated weights for policy 0, policy_version 959234 (0.0026) [2024-06-25 20:20:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15716122624. Throughput: 0: 42570.2. Samples: 15716274400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:20:21,014][15401] Updated weights for policy 0, policy_version 959244 (0.0033) [2024-06-25 20:20:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15716319232. Throughput: 0: 42820.0. Samples: 15716404800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:20:25,168][15401] Updated weights for policy 0, policy_version 959254 (0.0034) [2024-06-25 20:20:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15716548608. Throughput: 0: 43083.6. Samples: 15716671800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:28,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 20:20:28,576][15401] Updated weights for policy 0, policy_version 959264 (0.0023) [2024-06-25 20:20:32,712][15349] Signal inference workers to stop experience collection... (232500 times) [2024-06-25 20:20:32,715][15401] Updated weights for policy 0, policy_version 959274 (0.0034) [2024-06-25 20:20:32,716][15349] Signal inference workers to resume experience collection... (232500 times) [2024-06-25 20:20:32,724][15401] InferenceWorker_p0-w0: stopping experience collection (232500 times) [2024-06-25 20:20:32,750][15401] InferenceWorker_p0-w0: resuming experience collection (232500 times) [2024-06-25 20:20:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15716777984. Throughput: 0: 42915.2. Samples: 15716921340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:33,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 20:20:36,520][15401] Updated weights for policy 0, policy_version 959284 (0.0027) [2024-06-25 20:20:38,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15716958208. Throughput: 0: 43098.3. Samples: 15717052540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 20:20:38,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 20:20:40,355][15401] Updated weights for policy 0, policy_version 959294 (0.0038) [2024-06-25 20:20:43,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15717171200. Throughput: 0: 43082.4. Samples: 15717312740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:20:43,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 20:20:44,194][15401] Updated weights for policy 0, policy_version 959304 (0.0040) [2024-06-25 20:20:48,064][15401] Updated weights for policy 0, policy_version 959314 (0.0045) [2024-06-25 20:20:48,390][15132] Fps is (10 sec: 45873.8, 60 sec: 43417.4, 300 sec: 42709.5). Total num frames: 15717416960. Throughput: 0: 42801.5. Samples: 15717560080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:20:48,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 20:20:52,156][15401] Updated weights for policy 0, policy_version 959324 (0.0045) [2024-06-25 20:20:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 15717597184. Throughput: 0: 42930.3. Samples: 15717693380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:20:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 20:20:55,621][15401] Updated weights for policy 0, policy_version 959334 (0.0034) [2024-06-25 20:20:58,389][15132] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15717793792. Throughput: 0: 42729.9. Samples: 15717946640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:20:58,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 20:20:59,640][15401] Updated weights for policy 0, policy_version 959344 (0.0042) [2024-06-25 20:21:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15718039552. Throughput: 0: 42762.8. Samples: 15718198720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:03,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 20:21:03,457][15401] Updated weights for policy 0, policy_version 959354 (0.0031) [2024-06-25 20:21:07,397][15401] Updated weights for policy 0, policy_version 959364 (0.0038) [2024-06-25 20:21:08,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42598.8). Total num frames: 15718252544. Throughput: 0: 42802.3. Samples: 15718330900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 20:21:11,171][15401] Updated weights for policy 0, policy_version 959374 (0.0031) [2024-06-25 20:21:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15718449152. Throughput: 0: 42591.5. Samples: 15718588420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:13,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 20:21:14,974][15401] Updated weights for policy 0, policy_version 959384 (0.0035) [2024-06-25 20:21:18,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15718694912. Throughput: 0: 42701.2. Samples: 15718842900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:18,390][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 20:21:18,649][15401] Updated weights for policy 0, policy_version 959394 (0.0044) [2024-06-25 20:21:22,588][15401] Updated weights for policy 0, policy_version 959404 (0.0051) [2024-06-25 20:21:23,390][15132] Fps is (10 sec: 44234.3, 60 sec: 42871.0, 300 sec: 42543.7). Total num frames: 15718891520. Throughput: 0: 42674.9. Samples: 15718972940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:23,391][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 20:21:26,182][15401] Updated weights for policy 0, policy_version 959414 (0.0039) [2024-06-25 20:21:28,392][15132] Fps is (10 sec: 40950.5, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 15719104512. Throughput: 0: 42492.3. Samples: 15719225000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:28,393][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 20:21:30,387][15401] Updated weights for policy 0, policy_version 959424 (0.0032) [2024-06-25 20:21:33,394][15132] Fps is (10 sec: 44218.1, 60 sec: 42594.9, 300 sec: 42597.7). Total num frames: 15719333888. Throughput: 0: 42673.4. Samples: 15719480580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:33,395][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 20:21:34,021][15401] Updated weights for policy 0, policy_version 959434 (0.0025) [2024-06-25 20:21:37,893][15401] Updated weights for policy 0, policy_version 959444 (0.0030) [2024-06-25 20:21:38,396][15132] Fps is (10 sec: 42581.1, 60 sec: 42866.8, 300 sec: 42542.0). Total num frames: 15719530496. Throughput: 0: 42609.0. Samples: 15719611060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:38,397][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 20:21:41,960][15401] Updated weights for policy 0, policy_version 959454 (0.0031) [2024-06-25 20:21:43,390][15132] Fps is (10 sec: 39340.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15719727104. Throughput: 0: 42570.6. Samples: 15719862320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 20:21:43,527][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959458_15719759872.pth... [2024-06-25 20:21:43,584][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000958833_15709519872.pth [2024-06-25 20:21:45,329][15349] Signal inference workers to stop experience collection... (232550 times) [2024-06-25 20:21:45,369][15401] InferenceWorker_p0-w0: stopping experience collection (232550 times) [2024-06-25 20:21:45,383][15349] Signal inference workers to resume experience collection... (232550 times) [2024-06-25 20:21:45,397][15401] InferenceWorker_p0-w0: resuming experience collection (232550 times) [2024-06-25 20:21:45,520][15401] Updated weights for policy 0, policy_version 959464 (0.0027) [2024-06-25 20:21:48,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 15719956480. Throughput: 0: 42786.7. Samples: 15720124120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 20:21:49,757][15401] Updated weights for policy 0, policy_version 959474 (0.0042) [2024-06-25 20:21:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15720169472. Throughput: 0: 42584.4. Samples: 15720247200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:53,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 20:21:54,016][15401] Updated weights for policy 0, policy_version 959484 (0.0036) [2024-06-25 20:21:57,422][15401] Updated weights for policy 0, policy_version 959494 (0.0046) [2024-06-25 20:21:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15720382464. Throughput: 0: 42438.7. Samples: 15720498160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:21:58,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 20:22:01,742][15401] Updated weights for policy 0, policy_version 959504 (0.0036) [2024-06-25 20:22:03,389][15132] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 15720546304. Throughput: 0: 42593.9. Samples: 15720759620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:03,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 20:22:04,954][15401] Updated weights for policy 0, policy_version 959514 (0.0028) [2024-06-25 20:22:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 15720792064. Throughput: 0: 42386.3. Samples: 15720880300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 20:22:09,342][15401] Updated weights for policy 0, policy_version 959524 (0.0033) [2024-06-25 20:22:12,825][15401] Updated weights for policy 0, policy_version 959534 (0.0028) [2024-06-25 20:22:13,392][15132] Fps is (10 sec: 47502.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 15721021440. Throughput: 0: 42363.1. Samples: 15721131340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:13,392][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 20:22:17,257][15401] Updated weights for policy 0, policy_version 959544 (0.0037) [2024-06-25 20:22:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 42320.7). Total num frames: 15721185280. Throughput: 0: 42538.0. Samples: 15721394580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:18,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 20:22:20,707][15401] Updated weights for policy 0, policy_version 959554 (0.0023) [2024-06-25 20:22:23,389][15132] Fps is (10 sec: 40970.3, 60 sec: 42325.8, 300 sec: 42542.9). Total num frames: 15721431040. Throughput: 0: 42272.0. Samples: 15721513020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:23,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 20:22:24,909][15401] Updated weights for policy 0, policy_version 959564 (0.0041) [2024-06-25 20:22:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 15721644032. Throughput: 0: 42365.8. Samples: 15721768780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:28,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 20:22:28,480][15401] Updated weights for policy 0, policy_version 959574 (0.0034) [2024-06-25 20:22:32,592][15401] Updated weights for policy 0, policy_version 959584 (0.0046) [2024-06-25 20:22:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 41782.6, 300 sec: 42376.3). Total num frames: 15721840640. Throughput: 0: 42360.0. Samples: 15722030320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:33,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 20:22:36,323][15401] Updated weights for policy 0, policy_version 959594 (0.0041) [2024-06-25 20:22:38,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 15722086400. Throughput: 0: 42415.5. Samples: 15722155900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:22:38,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 20:22:40,244][15401] Updated weights for policy 0, policy_version 959604 (0.0037) [2024-06-25 20:22:43,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15722299392. Throughput: 0: 42623.9. Samples: 15722416240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:22:43,390][15132] Avg episode reward: [(0, '0.401')] [2024-06-25 20:22:43,839][15401] Updated weights for policy 0, policy_version 959614 (0.0037) [2024-06-25 20:22:47,800][15401] Updated weights for policy 0, policy_version 959624 (0.0025) [2024-06-25 20:22:48,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 15722496000. Throughput: 0: 42371.6. Samples: 15722666340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:22:48,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 20:22:51,651][15401] Updated weights for policy 0, policy_version 959634 (0.0033) [2024-06-25 20:22:53,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15722725376. Throughput: 0: 42505.3. Samples: 15722793040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:22:53,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 20:22:55,478][15401] Updated weights for policy 0, policy_version 959644 (0.0037) [2024-06-25 20:22:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 15722905600. Throughput: 0: 42681.4. Samples: 15723051900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:22:58,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 20:22:59,472][15401] Updated weights for policy 0, policy_version 959654 (0.0039) [2024-06-25 20:23:03,363][15401] Updated weights for policy 0, policy_version 959664 (0.0038) [2024-06-25 20:23:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42487.4). Total num frames: 15723134976. Throughput: 0: 42324.8. Samples: 15723299200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:03,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 20:23:06,757][15349] Signal inference workers to stop experience collection... (232600 times) [2024-06-25 20:23:06,757][15349] Signal inference workers to resume experience collection... (232600 times) [2024-06-25 20:23:06,779][15401] InferenceWorker_p0-w0: stopping experience collection (232600 times) [2024-06-25 20:23:06,779][15401] InferenceWorker_p0-w0: resuming experience collection (232600 times) [2024-06-25 20:23:06,927][15401] Updated weights for policy 0, policy_version 959674 (0.0038) [2024-06-25 20:23:08,390][15132] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15723364352. Throughput: 0: 42585.1. Samples: 15723429360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:08,390][15132] Avg episode reward: [(0, '0.849')] [2024-06-25 20:23:10,906][15401] Updated weights for policy 0, policy_version 959684 (0.0035) [2024-06-25 20:23:13,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42325.3, 300 sec: 42487.0). Total num frames: 15723560960. Throughput: 0: 42628.0. Samples: 15723687140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:13,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 20:23:14,695][15401] Updated weights for policy 0, policy_version 959694 (0.0032) [2024-06-25 20:23:18,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15723773952. Throughput: 0: 42317.2. Samples: 15723934600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:18,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:23:18,815][15401] Updated weights for policy 0, policy_version 959704 (0.0027) [2024-06-25 20:23:22,371][15401] Updated weights for policy 0, policy_version 959714 (0.0038) [2024-06-25 20:23:23,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15723986944. Throughput: 0: 42399.6. Samples: 15724063880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:23,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 20:23:26,620][15401] Updated weights for policy 0, policy_version 959724 (0.0039) [2024-06-25 20:23:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 15724199936. Throughput: 0: 42303.8. Samples: 15724319900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:28,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 20:23:30,029][15401] Updated weights for policy 0, policy_version 959734 (0.0031) [2024-06-25 20:23:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15724412928. Throughput: 0: 42541.7. Samples: 15724580720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:33,390][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 20:23:34,110][15401] Updated weights for policy 0, policy_version 959744 (0.0043) [2024-06-25 20:23:37,547][15401] Updated weights for policy 0, policy_version 959754 (0.0034) [2024-06-25 20:23:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15724642304. Throughput: 0: 42456.4. Samples: 15724703580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:38,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 20:23:41,780][15401] Updated weights for policy 0, policy_version 959764 (0.0030) [2024-06-25 20:23:43,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15724838912. Throughput: 0: 42499.0. Samples: 15724964360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:43,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 20:23:43,528][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959769_15724855296.pth... [2024-06-25 20:23:43,572][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959145_15714631680.pth [2024-06-25 20:23:45,118][15401] Updated weights for policy 0, policy_version 959774 (0.0030) [2024-06-25 20:23:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15725051904. Throughput: 0: 42680.8. Samples: 15725219840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 20:23:49,767][15401] Updated weights for policy 0, policy_version 959784 (0.0041) [2024-06-25 20:23:53,136][15401] Updated weights for policy 0, policy_version 959794 (0.0036) [2024-06-25 20:23:53,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15725281280. Throughput: 0: 42606.6. Samples: 15725346660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 20:23:57,226][15401] Updated weights for policy 0, policy_version 959804 (0.0042) [2024-06-25 20:23:58,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15725494272. Throughput: 0: 42764.1. Samples: 15725611420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:23:58,392][15132] Avg episode reward: [(0, '0.274')] [2024-06-25 20:24:00,699][15401] Updated weights for policy 0, policy_version 959814 (0.0031) [2024-06-25 20:24:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15725690880. Throughput: 0: 42843.0. Samples: 15725862540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:03,390][15132] Avg episode reward: [(0, '0.230')] [2024-06-25 20:24:04,760][15401] Updated weights for policy 0, policy_version 959824 (0.0038) [2024-06-25 20:24:08,305][15401] Updated weights for policy 0, policy_version 959834 (0.0028) [2024-06-25 20:24:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15725920256. Throughput: 0: 42742.1. Samples: 15725987280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:08,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 20:24:12,718][15401] Updated weights for policy 0, policy_version 959844 (0.0024) [2024-06-25 20:24:13,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 15726116864. Throughput: 0: 42796.2. Samples: 15726245740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:13,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 20:24:16,422][15401] Updated weights for policy 0, policy_version 959854 (0.0029) [2024-06-25 20:24:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15726313472. Throughput: 0: 42612.1. Samples: 15726498260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 20:24:20,436][15401] Updated weights for policy 0, policy_version 959864 (0.0034) [2024-06-25 20:24:23,393][15132] Fps is (10 sec: 44223.2, 60 sec: 42869.2, 300 sec: 42597.9). Total num frames: 15726559232. Throughput: 0: 42657.0. Samples: 15726623280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:23,393][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 20:24:24,063][15401] Updated weights for policy 0, policy_version 959874 (0.0031) [2024-06-25 20:24:28,123][15401] Updated weights for policy 0, policy_version 959884 (0.0021) [2024-06-25 20:24:28,166][15349] Signal inference workers to stop experience collection... (232650 times) [2024-06-25 20:24:28,172][15349] Signal inference workers to resume experience collection... (232650 times) [2024-06-25 20:24:28,192][15401] InferenceWorker_p0-w0: stopping experience collection (232650 times) [2024-06-25 20:24:28,192][15401] InferenceWorker_p0-w0: resuming experience collection (232650 times) [2024-06-25 20:24:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15726755840. Throughput: 0: 42681.4. Samples: 15726885020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:28,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 20:24:31,703][15401] Updated weights for policy 0, policy_version 959894 (0.0024) [2024-06-25 20:24:33,389][15132] Fps is (10 sec: 39334.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15726952448. Throughput: 0: 42681.1. Samples: 15727140480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:24:33,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 20:24:35,760][15401] Updated weights for policy 0, policy_version 959904 (0.0035) [2024-06-25 20:24:38,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15727214592. Throughput: 0: 42592.2. Samples: 15727263300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:24:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 20:24:39,408][15401] Updated weights for policy 0, policy_version 959914 (0.0026) [2024-06-25 20:24:43,370][15401] Updated weights for policy 0, policy_version 959924 (0.0052) [2024-06-25 20:24:43,390][15132] Fps is (10 sec: 44235.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15727394816. Throughput: 0: 42603.9. Samples: 15727528600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:24:43,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 20:24:46,984][15401] Updated weights for policy 0, policy_version 959934 (0.0038) [2024-06-25 20:24:48,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 15727591424. Throughput: 0: 42734.5. Samples: 15727785580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:24:48,390][15132] Avg episode reward: [(0, '0.690')] [2024-06-25 20:24:51,081][15401] Updated weights for policy 0, policy_version 959944 (0.0046) [2024-06-25 20:24:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15727853568. Throughput: 0: 42613.2. Samples: 15727904880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:24:53,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 20:24:55,280][15401] Updated weights for policy 0, policy_version 959954 (0.0045) [2024-06-25 20:24:58,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 15728017408. Throughput: 0: 42660.9. Samples: 15728165480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:24:58,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 20:24:58,715][15401] Updated weights for policy 0, policy_version 959964 (0.0025) [2024-06-25 20:25:03,060][15401] Updated weights for policy 0, policy_version 959974 (0.0030) [2024-06-25 20:25:03,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15728230400. Throughput: 0: 42643.4. Samples: 15728417220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:03,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 20:25:06,389][15401] Updated weights for policy 0, policy_version 959984 (0.0045) [2024-06-25 20:25:08,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15728492544. Throughput: 0: 42631.8. Samples: 15728541580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:08,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:25:10,739][15401] Updated weights for policy 0, policy_version 959994 (0.0043) [2024-06-25 20:25:13,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 15728640000. Throughput: 0: 42440.9. Samples: 15728794860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:13,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 20:25:14,231][15401] Updated weights for policy 0, policy_version 960004 (0.0032) [2024-06-25 20:25:18,284][15401] Updated weights for policy 0, policy_version 960014 (0.0028) [2024-06-25 20:25:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15728869376. Throughput: 0: 42472.8. Samples: 15729051760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:18,390][15132] Avg episode reward: [(0, '0.888')] [2024-06-25 20:25:21,785][15401] Updated weights for policy 0, policy_version 960024 (0.0035) [2024-06-25 20:25:23,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42600.7, 300 sec: 42598.4). Total num frames: 15729115136. Throughput: 0: 42608.0. Samples: 15729180660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:23,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 20:25:25,824][15401] Updated weights for policy 0, policy_version 960034 (0.0034) [2024-06-25 20:25:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 15729278976. Throughput: 0: 42271.2. Samples: 15729430800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:28,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 20:25:29,462][15401] Updated weights for policy 0, policy_version 960044 (0.0041) [2024-06-25 20:25:33,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15729508352. Throughput: 0: 42282.2. Samples: 15729688280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:33,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 20:25:33,451][15401] Updated weights for policy 0, policy_version 960054 (0.0031) [2024-06-25 20:25:37,062][15401] Updated weights for policy 0, policy_version 960064 (0.0039) [2024-06-25 20:25:38,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15729754112. Throughput: 0: 42563.6. Samples: 15729820240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:38,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 20:25:41,468][15401] Updated weights for policy 0, policy_version 960074 (0.0033) [2024-06-25 20:25:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 15729917952. Throughput: 0: 42451.6. Samples: 15730075800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:43,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 20:25:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960078_15729917952.pth... [2024-06-25 20:25:43,487][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959458_15719759872.pth [2024-06-25 20:25:44,739][15349] Signal inference workers to stop experience collection... (232700 times) [2024-06-25 20:25:44,746][15349] Signal inference workers to resume experience collection... (232700 times) [2024-06-25 20:25:44,750][15401] Updated weights for policy 0, policy_version 960084 (0.0032) [2024-06-25 20:25:44,759][15401] InferenceWorker_p0-w0: stopping experience collection (232700 times) [2024-06-25 20:25:44,759][15401] InferenceWorker_p0-w0: resuming experience collection (232700 times) [2024-06-25 20:25:48,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15730163712. Throughput: 0: 42392.1. Samples: 15730324860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 20:25:48,872][15401] Updated weights for policy 0, policy_version 960094 (0.0039) [2024-06-25 20:25:52,280][15401] Updated weights for policy 0, policy_version 960104 (0.0027) [2024-06-25 20:25:53,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 15730393088. Throughput: 0: 42688.6. Samples: 15730462560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:53,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 20:25:56,440][15401] Updated weights for policy 0, policy_version 960114 (0.0032) [2024-06-25 20:25:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15730573312. Throughput: 0: 42664.4. Samples: 15730714760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:25:58,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 20:26:00,000][15401] Updated weights for policy 0, policy_version 960124 (0.0036) [2024-06-25 20:26:03,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43142.9, 300 sec: 42598.0). Total num frames: 15730819072. Throughput: 0: 42586.6. Samples: 15730968260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:03,392][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 20:26:04,575][15401] Updated weights for policy 0, policy_version 960134 (0.0035) [2024-06-25 20:26:07,542][15401] Updated weights for policy 0, policy_version 960144 (0.0029) [2024-06-25 20:26:08,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15731032064. Throughput: 0: 42604.8. Samples: 15731097880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:08,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 20:26:12,174][15401] Updated weights for policy 0, policy_version 960154 (0.0030) [2024-06-25 20:26:13,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 15731195904. Throughput: 0: 42695.7. Samples: 15731352100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:13,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 20:26:15,523][15401] Updated weights for policy 0, policy_version 960164 (0.0033) [2024-06-25 20:26:18,392][15132] Fps is (10 sec: 40951.2, 60 sec: 42869.8, 300 sec: 42542.6). Total num frames: 15731441664. Throughput: 0: 42441.8. Samples: 15731598260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:18,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 20:26:20,238][15401] Updated weights for policy 0, policy_version 960174 (0.0039) [2024-06-25 20:26:23,054][15401] Updated weights for policy 0, policy_version 960184 (0.0056) [2024-06-25 20:26:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 15731654656. Throughput: 0: 42508.9. Samples: 15731733140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:23,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 20:26:27,652][15401] Updated weights for policy 0, policy_version 960194 (0.0028) [2024-06-25 20:26:28,389][15132] Fps is (10 sec: 39330.5, 60 sec: 42598.4, 300 sec: 42376.9). Total num frames: 15731834880. Throughput: 0: 42432.9. Samples: 15731985280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 20:26:31,151][15401] Updated weights for policy 0, policy_version 960204 (0.0030) [2024-06-25 20:26:33,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 15732080640. Throughput: 0: 42528.4. Samples: 15732238640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-25 20:26:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 20:26:35,123][15401] Updated weights for policy 0, policy_version 960214 (0.0038) [2024-06-25 20:26:38,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15732293632. Throughput: 0: 42467.9. Samples: 15732373620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:26:38,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 20:26:38,836][15401] Updated weights for policy 0, policy_version 960224 (0.0028) [2024-06-25 20:26:42,998][15401] Updated weights for policy 0, policy_version 960234 (0.0032) [2024-06-25 20:26:43,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 15732473856. Throughput: 0: 42309.4. Samples: 15732618680. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:26:43,390][15132] Avg episode reward: [(0, '0.813')] [2024-06-25 20:26:46,299][15401] Updated weights for policy 0, policy_version 960244 (0.0034) [2024-06-25 20:26:48,390][15132] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15732719616. Throughput: 0: 42391.4. Samples: 15732875780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:26:48,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 20:26:50,519][15401] Updated weights for policy 0, policy_version 960254 (0.0038) [2024-06-25 20:26:53,394][15132] Fps is (10 sec: 45856.2, 60 sec: 42322.4, 300 sec: 42542.3). Total num frames: 15732932608. Throughput: 0: 42484.7. Samples: 15733009860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:26:53,394][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 20:26:53,850][15401] Updated weights for policy 0, policy_version 960264 (0.0040) [2024-06-25 20:26:57,988][15401] Updated weights for policy 0, policy_version 960274 (0.0038) [2024-06-25 20:26:58,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15733129216. Throughput: 0: 42382.2. Samples: 15733259300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:26:58,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 20:27:01,450][15401] Updated weights for policy 0, policy_version 960284 (0.0032) [2024-06-25 20:27:01,461][15349] Signal inference workers to stop experience collection... (232750 times) [2024-06-25 20:27:01,461][15349] Signal inference workers to resume experience collection... (232750 times) [2024-06-25 20:27:01,471][15401] InferenceWorker_p0-w0: stopping experience collection (232750 times) [2024-06-25 20:27:01,490][15401] InferenceWorker_p0-w0: resuming experience collection (232750 times) [2024-06-25 20:27:03,389][15132] Fps is (10 sec: 42616.1, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 15733358592. Throughput: 0: 42614.7. Samples: 15733515820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 20:27:05,507][15401] Updated weights for policy 0, policy_version 960294 (0.0035) [2024-06-25 20:27:08,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42323.7, 300 sec: 42542.9). Total num frames: 15733571584. Throughput: 0: 42611.1. Samples: 15733650740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:08,392][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 20:27:09,143][15401] Updated weights for policy 0, policy_version 960304 (0.0019) [2024-06-25 20:27:13,270][15401] Updated weights for policy 0, policy_version 960314 (0.0040) [2024-06-25 20:27:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15733784576. Throughput: 0: 42608.0. Samples: 15733902640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:13,390][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 20:27:16,778][15401] Updated weights for policy 0, policy_version 960324 (0.0032) [2024-06-25 20:27:18,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 15733997568. Throughput: 0: 42732.4. Samples: 15734161600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:18,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 20:27:20,737][15401] Updated weights for policy 0, policy_version 960334 (0.0025) [2024-06-25 20:27:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15734210560. Throughput: 0: 42723.5. Samples: 15734296180. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:23,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:27:24,484][15401] Updated weights for policy 0, policy_version 960344 (0.0036) [2024-06-25 20:27:28,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15734423552. Throughput: 0: 42812.3. Samples: 15734545240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:28,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 20:27:28,546][15401] Updated weights for policy 0, policy_version 960354 (0.0044) [2024-06-25 20:27:32,077][15401] Updated weights for policy 0, policy_version 960364 (0.0025) [2024-06-25 20:27:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15734652928. Throughput: 0: 42857.5. Samples: 15734804360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:33,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 20:27:35,930][15401] Updated weights for policy 0, policy_version 960374 (0.0039) [2024-06-25 20:27:38,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15734833152. Throughput: 0: 42752.3. Samples: 15734933540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:38,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 20:27:39,661][15401] Updated weights for policy 0, policy_version 960384 (0.0031) [2024-06-25 20:27:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 15735078912. Throughput: 0: 42931.0. Samples: 15735191200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:43,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:27:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960393_15735078912.pth... [2024-06-25 20:27:43,460][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000959769_15724855296.pth [2024-06-25 20:27:43,685][15401] Updated weights for policy 0, policy_version 960394 (0.0037) [2024-06-25 20:27:47,312][15401] Updated weights for policy 0, policy_version 960404 (0.0033) [2024-06-25 20:27:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15735291904. Throughput: 0: 42967.0. Samples: 15735449340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:48,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 20:27:51,099][15401] Updated weights for policy 0, policy_version 960414 (0.0024) [2024-06-25 20:27:53,389][15132] Fps is (10 sec: 37683.8, 60 sec: 42055.2, 300 sec: 42542.9). Total num frames: 15735455744. Throughput: 0: 42785.5. Samples: 15735575980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 20:27:55,115][15401] Updated weights for policy 0, policy_version 960424 (0.0040) [2024-06-25 20:27:58,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15735734272. Throughput: 0: 42940.1. Samples: 15735834940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:27:58,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 20:27:59,172][15401] Updated weights for policy 0, policy_version 960434 (0.0030) [2024-06-25 20:28:02,806][15401] Updated weights for policy 0, policy_version 960444 (0.0025) [2024-06-25 20:28:03,390][15132] Fps is (10 sec: 47512.3, 60 sec: 42871.2, 300 sec: 42598.4). Total num frames: 15735930880. Throughput: 0: 42833.2. Samples: 15736089100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:03,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:28:06,922][15401] Updated weights for policy 0, policy_version 960454 (0.0026) [2024-06-25 20:28:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42327.1, 300 sec: 42543.2). Total num frames: 15736111104. Throughput: 0: 42663.2. Samples: 15736216020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:08,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 20:28:10,554][15401] Updated weights for policy 0, policy_version 960464 (0.0040) [2024-06-25 20:28:12,201][15349] Signal inference workers to stop experience collection... (232800 times) [2024-06-25 20:28:12,208][15349] Signal inference workers to resume experience collection... (232800 times) [2024-06-25 20:28:12,220][15401] InferenceWorker_p0-w0: stopping experience collection (232800 times) [2024-06-25 20:28:12,255][15401] InferenceWorker_p0-w0: resuming experience collection (232800 times) [2024-06-25 20:28:13,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15736356864. Throughput: 0: 42944.5. Samples: 15736477740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:13,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 20:28:14,575][15401] Updated weights for policy 0, policy_version 960474 (0.0027) [2024-06-25 20:28:18,077][15401] Updated weights for policy 0, policy_version 960484 (0.0037) [2024-06-25 20:28:18,389][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15736569856. Throughput: 0: 42641.9. Samples: 15736723240. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:18,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 20:28:22,217][15401] Updated weights for policy 0, policy_version 960494 (0.0043) [2024-06-25 20:28:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15736766464. Throughput: 0: 42693.7. Samples: 15736854760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:23,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 20:28:25,487][15401] Updated weights for policy 0, policy_version 960504 (0.0036) [2024-06-25 20:28:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15736995840. Throughput: 0: 42901.5. Samples: 15737121760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-25 20:28:28,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 20:28:29,878][15401] Updated weights for policy 0, policy_version 960514 (0.0029) [2024-06-25 20:28:33,041][15401] Updated weights for policy 0, policy_version 960524 (0.0041) [2024-06-25 20:28:33,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15737225216. Throughput: 0: 42708.4. Samples: 15737371220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:33,390][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 20:28:37,394][15401] Updated weights for policy 0, policy_version 960534 (0.0033) [2024-06-25 20:28:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15737421824. Throughput: 0: 42908.3. Samples: 15737506860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:38,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 20:28:40,866][15401] Updated weights for policy 0, policy_version 960544 (0.0036) [2024-06-25 20:28:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15737651200. Throughput: 0: 42902.1. Samples: 15737765540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:43,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 20:28:44,912][15401] Updated weights for policy 0, policy_version 960554 (0.0035) [2024-06-25 20:28:48,328][15401] Updated weights for policy 0, policy_version 960564 (0.0032) [2024-06-25 20:28:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15737880576. Throughput: 0: 42943.2. Samples: 15738021540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:48,396][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 20:28:52,588][15401] Updated weights for policy 0, policy_version 960574 (0.0032) [2024-06-25 20:28:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43690.5, 300 sec: 42653.9). Total num frames: 15738077184. Throughput: 0: 43078.9. Samples: 15738154580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 20:28:55,804][15401] Updated weights for policy 0, policy_version 960584 (0.0030) [2024-06-25 20:28:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15738290176. Throughput: 0: 43095.7. Samples: 15738417040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:28:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 20:29:00,781][15401] Updated weights for policy 0, policy_version 960594 (0.0036) [2024-06-25 20:29:03,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15738519552. Throughput: 0: 43112.7. Samples: 15738663320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:03,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 20:29:03,424][15401] Updated weights for policy 0, policy_version 960604 (0.0040) [2024-06-25 20:29:08,307][15401] Updated weights for policy 0, policy_version 960614 (0.0030) [2024-06-25 20:29:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 15738699776. Throughput: 0: 43130.3. Samples: 15738795620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:08,390][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 20:29:10,952][15401] Updated weights for policy 0, policy_version 960624 (0.0028) [2024-06-25 20:29:13,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15738945536. Throughput: 0: 43043.4. Samples: 15739058720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:13,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 20:29:15,758][15401] Updated weights for policy 0, policy_version 960634 (0.0042) [2024-06-25 20:29:18,389][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42765.5). Total num frames: 15739174912. Throughput: 0: 43196.6. Samples: 15739315060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 20:29:18,512][15401] Updated weights for policy 0, policy_version 960644 (0.0041) [2024-06-25 20:29:23,225][15401] Updated weights for policy 0, policy_version 960654 (0.0037) [2024-06-25 20:29:23,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15739355136. Throughput: 0: 43116.1. Samples: 15739447080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:23,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 20:29:26,434][15401] Updated weights for policy 0, policy_version 960664 (0.0028) [2024-06-25 20:29:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 15739600896. Throughput: 0: 43077.4. Samples: 15739704020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 20:29:30,803][15401] Updated weights for policy 0, policy_version 960674 (0.0036) [2024-06-25 20:29:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15739797504. Throughput: 0: 43061.8. Samples: 15739959320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:29:33,555][15349] Signal inference workers to stop experience collection... (232850 times) [2024-06-25 20:29:33,555][15349] Signal inference workers to resume experience collection... (232850 times) [2024-06-25 20:29:33,603][15401] InferenceWorker_p0-w0: stopping experience collection (232850 times) [2024-06-25 20:29:33,603][15401] InferenceWorker_p0-w0: resuming experience collection (232850 times) [2024-06-25 20:29:34,322][15401] Updated weights for policy 0, policy_version 960684 (0.0032) [2024-06-25 20:29:38,384][15401] Updated weights for policy 0, policy_version 960694 (0.0024) [2024-06-25 20:29:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15740010496. Throughput: 0: 43026.8. Samples: 15740090780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 20:29:41,814][15401] Updated weights for policy 0, policy_version 960704 (0.0043) [2024-06-25 20:29:43,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15740207104. Throughput: 0: 42826.6. Samples: 15740344240. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 20:29:43,470][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960707_15740223488.pth... [2024-06-25 20:29:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960078_15729917952.pth [2024-06-25 20:29:46,376][15401] Updated weights for policy 0, policy_version 960714 (0.0033) [2024-06-25 20:29:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15740436480. Throughput: 0: 43027.8. Samples: 15740599560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:48,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 20:29:49,325][15401] Updated weights for policy 0, policy_version 960724 (0.0035) [2024-06-25 20:29:53,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15740649472. Throughput: 0: 43169.8. Samples: 15740738260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:53,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 20:29:53,732][15401] Updated weights for policy 0, policy_version 960734 (0.0032) [2024-06-25 20:29:57,223][15401] Updated weights for policy 0, policy_version 960744 (0.0037) [2024-06-25 20:29:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15740846080. Throughput: 0: 42931.6. Samples: 15740990640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:29:58,390][15132] Avg episode reward: [(0, '0.407')] [2024-06-25 20:30:01,141][15401] Updated weights for policy 0, policy_version 960754 (0.0030) [2024-06-25 20:30:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15741091840. Throughput: 0: 42964.5. Samples: 15741248460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:03,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 20:30:04,903][15401] Updated weights for policy 0, policy_version 960764 (0.0030) [2024-06-25 20:30:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 15741304832. Throughput: 0: 42885.7. Samples: 15741376940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:08,392][15132] Avg episode reward: [(0, '0.254')] [2024-06-25 20:30:08,709][15401] Updated weights for policy 0, policy_version 960774 (0.0040) [2024-06-25 20:30:12,480][15401] Updated weights for policy 0, policy_version 960784 (0.0033) [2024-06-25 20:30:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15741517824. Throughput: 0: 42823.1. Samples: 15741631060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:13,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 20:30:16,650][15401] Updated weights for policy 0, policy_version 960794 (0.0037) [2024-06-25 20:30:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15741730816. Throughput: 0: 42879.7. Samples: 15741888900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:18,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 20:30:19,965][15401] Updated weights for policy 0, policy_version 960804 (0.0036) [2024-06-25 20:30:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15741927424. Throughput: 0: 42839.5. Samples: 15742018560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:23,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 20:30:24,089][15401] Updated weights for policy 0, policy_version 960814 (0.0032) [2024-06-25 20:30:27,902][15401] Updated weights for policy 0, policy_version 960824 (0.0038) [2024-06-25 20:30:28,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15742156800. Throughput: 0: 42827.5. Samples: 15742271480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-25 20:30:28,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 20:30:31,695][15401] Updated weights for policy 0, policy_version 960834 (0.0032) [2024-06-25 20:30:33,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15742369792. Throughput: 0: 42970.6. Samples: 15742533240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:33,390][15132] Avg episode reward: [(0, '0.585')] [2024-06-25 20:30:35,417][15401] Updated weights for policy 0, policy_version 960844 (0.0039) [2024-06-25 20:30:38,392][15132] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 15742582784. Throughput: 0: 42786.2. Samples: 15742663740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:38,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 20:30:39,299][15401] Updated weights for policy 0, policy_version 960854 (0.0042) [2024-06-25 20:30:42,920][15401] Updated weights for policy 0, policy_version 960864 (0.0038) [2024-06-25 20:30:43,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15742812160. Throughput: 0: 42895.2. Samples: 15742920920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:43,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 20:30:46,884][15401] Updated weights for policy 0, policy_version 960874 (0.0028) [2024-06-25 20:30:48,389][15132] Fps is (10 sec: 44247.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15743025152. Throughput: 0: 42979.1. Samples: 15743182520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:48,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 20:30:50,409][15401] Updated weights for policy 0, policy_version 960884 (0.0034) [2024-06-25 20:30:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15743221760. Throughput: 0: 42848.5. Samples: 15743305120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:53,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 20:30:54,463][15401] Updated weights for policy 0, policy_version 960894 (0.0032) [2024-06-25 20:30:58,111][15401] Updated weights for policy 0, policy_version 960904 (0.0043) [2024-06-25 20:30:58,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42876.4). Total num frames: 15743467520. Throughput: 0: 43042.8. Samples: 15743567980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:30:58,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 20:31:00,745][15349] Signal inference workers to stop experience collection... (232900 times) [2024-06-25 20:31:00,799][15401] InferenceWorker_p0-w0: stopping experience collection (232900 times) [2024-06-25 20:31:00,807][15349] Signal inference workers to resume experience collection... (232900 times) [2024-06-25 20:31:00,813][15401] InferenceWorker_p0-w0: resuming experience collection (232900 times) [2024-06-25 20:31:02,388][15401] Updated weights for policy 0, policy_version 960914 (0.0037) [2024-06-25 20:31:03,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15743680512. Throughput: 0: 42919.9. Samples: 15743820300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 20:31:05,583][15401] Updated weights for policy 0, policy_version 960924 (0.0026) [2024-06-25 20:31:08,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 15743844352. Throughput: 0: 42882.4. Samples: 15743948260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:08,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 20:31:09,902][15401] Updated weights for policy 0, policy_version 960934 (0.0039) [2024-06-25 20:31:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 15744090112. Throughput: 0: 43082.0. Samples: 15744210160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 20:31:13,433][15401] Updated weights for policy 0, policy_version 960944 (0.0030) [2024-06-25 20:31:17,526][15401] Updated weights for policy 0, policy_version 960954 (0.0036) [2024-06-25 20:31:18,390][15132] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15744303104. Throughput: 0: 42842.1. Samples: 15744461140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 20:31:21,063][15401] Updated weights for policy 0, policy_version 960964 (0.0034) [2024-06-25 20:31:23,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 15744499712. Throughput: 0: 42775.6. Samples: 15744588540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:23,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 20:31:25,153][15401] Updated weights for policy 0, policy_version 960974 (0.0035) [2024-06-25 20:31:28,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15744745472. Throughput: 0: 42876.0. Samples: 15744850340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:28,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-25 20:31:28,509][15401] Updated weights for policy 0, policy_version 960984 (0.0035) [2024-06-25 20:31:32,697][15401] Updated weights for policy 0, policy_version 960994 (0.0046) [2024-06-25 20:31:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15744942080. Throughput: 0: 42695.6. Samples: 15745103820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 20:31:36,074][15401] Updated weights for policy 0, policy_version 961004 (0.0037) [2024-06-25 20:31:38,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42600.1, 300 sec: 42931.6). Total num frames: 15745138688. Throughput: 0: 42746.3. Samples: 15745228700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:38,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 20:31:40,394][15401] Updated weights for policy 0, policy_version 961014 (0.0030) [2024-06-25 20:31:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42931.7). Total num frames: 15745384448. Throughput: 0: 42755.1. Samples: 15745491960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:43,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 20:31:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961023_15745400832.pth... [2024-06-25 20:31:43,446][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960393_15735078912.pth [2024-06-25 20:31:43,608][15401] Updated weights for policy 0, policy_version 961024 (0.0036) [2024-06-25 20:31:48,161][15401] Updated weights for policy 0, policy_version 961034 (0.0039) [2024-06-25 20:31:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42932.2). Total num frames: 15745597440. Throughput: 0: 42940.9. Samples: 15745752640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:48,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 20:31:51,520][15401] Updated weights for policy 0, policy_version 961044 (0.0038) [2024-06-25 20:31:53,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15745794048. Throughput: 0: 42955.9. Samples: 15745881280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:53,390][15132] Avg episode reward: [(0, '0.372')] [2024-06-25 20:31:56,025][15401] Updated weights for policy 0, policy_version 961054 (0.0039) [2024-06-25 20:31:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 15746007040. Throughput: 0: 42740.4. Samples: 15746133480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:31:58,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 20:31:59,171][15401] Updated weights for policy 0, policy_version 961064 (0.0029) [2024-06-25 20:32:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42820.9). Total num frames: 15746203648. Throughput: 0: 43024.1. Samples: 15746397220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:32:03,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 20:32:03,865][15401] Updated weights for policy 0, policy_version 961074 (0.0033) [2024-06-25 20:32:06,654][15401] Updated weights for policy 0, policy_version 961084 (0.0037) [2024-06-25 20:32:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15746433024. Throughput: 0: 42909.4. Samples: 15746519460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:32:08,390][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 20:32:11,504][15401] Updated weights for policy 0, policy_version 961094 (0.0049) [2024-06-25 20:32:13,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15746662400. Throughput: 0: 42836.4. Samples: 15746777980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:32:13,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 20:32:14,243][15401] Updated weights for policy 0, policy_version 961104 (0.0034) [2024-06-25 20:32:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15746842624. Throughput: 0: 43024.8. Samples: 15747039940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:32:18,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 20:32:19,049][15401] Updated weights for policy 0, policy_version 961114 (0.0043) [2024-06-25 20:32:19,877][15349] Signal inference workers to stop experience collection... (232950 times) [2024-06-25 20:32:19,907][15401] InferenceWorker_p0-w0: stopping experience collection (232950 times) [2024-06-25 20:32:19,932][15349] Signal inference workers to resume experience collection... (232950 times) [2024-06-25 20:32:19,932][15401] InferenceWorker_p0-w0: resuming experience collection (232950 times) [2024-06-25 20:32:21,980][15401] Updated weights for policy 0, policy_version 961124 (0.0037) [2024-06-25 20:32:23,390][15132] Fps is (10 sec: 40958.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 15747072000. Throughput: 0: 42912.2. Samples: 15747159760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 20:32:23,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 20:32:26,550][15401] Updated weights for policy 0, policy_version 961134 (0.0041) [2024-06-25 20:32:28,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15747317760. Throughput: 0: 42981.4. Samples: 15747426120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:28,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 20:32:29,685][15401] Updated weights for policy 0, policy_version 961144 (0.0033) [2024-06-25 20:32:33,390][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 15747497984. Throughput: 0: 42862.3. Samples: 15747681440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:33,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 20:32:34,101][15401] Updated weights for policy 0, policy_version 961154 (0.0025) [2024-06-25 20:32:37,401][15401] Updated weights for policy 0, policy_version 961164 (0.0035) [2024-06-25 20:32:38,390][15132] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15747727360. Throughput: 0: 42720.4. Samples: 15747803700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:38,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 20:32:41,915][15401] Updated weights for policy 0, policy_version 961174 (0.0026) [2024-06-25 20:32:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15747940352. Throughput: 0: 42947.5. Samples: 15748066120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:43,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 20:32:45,175][15401] Updated weights for policy 0, policy_version 961184 (0.0034) [2024-06-25 20:32:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 43042.7). Total num frames: 15748153344. Throughput: 0: 42763.5. Samples: 15748321580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:48,392][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 20:32:49,599][15401] Updated weights for policy 0, policy_version 961194 (0.0027) [2024-06-25 20:32:52,812][15401] Updated weights for policy 0, policy_version 961204 (0.0043) [2024-06-25 20:32:53,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15748399104. Throughput: 0: 42893.2. Samples: 15748449660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:53,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 20:32:57,532][15401] Updated weights for policy 0, policy_version 961214 (0.0039) [2024-06-25 20:32:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15748562944. Throughput: 0: 42879.6. Samples: 15748707560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:32:58,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 20:33:00,477][15401] Updated weights for policy 0, policy_version 961224 (0.0037) [2024-06-25 20:33:03,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15748792320. Throughput: 0: 42616.4. Samples: 15748957680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:03,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-25 20:33:05,089][15401] Updated weights for policy 0, policy_version 961234 (0.0042) [2024-06-25 20:33:08,187][15401] Updated weights for policy 0, policy_version 961244 (0.0027) [2024-06-25 20:33:08,389][15132] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15749038080. Throughput: 0: 42900.7. Samples: 15749090280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:08,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 20:33:13,211][15401] Updated weights for policy 0, policy_version 961254 (0.0040) [2024-06-25 20:33:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15749201920. Throughput: 0: 42662.6. Samples: 15749345940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 20:33:15,817][15401] Updated weights for policy 0, policy_version 961264 (0.0023) [2024-06-25 20:33:18,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15749414912. Throughput: 0: 42704.5. Samples: 15749603140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:18,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 20:33:20,904][15401] Updated weights for policy 0, policy_version 961274 (0.0025) [2024-06-25 20:33:23,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.8, 300 sec: 42931.6). Total num frames: 15749660672. Throughput: 0: 42810.0. Samples: 15749730140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:23,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 20:33:23,494][15401] Updated weights for policy 0, policy_version 961284 (0.0035) [2024-06-25 20:33:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 15749824512. Throughput: 0: 42663.1. Samples: 15749985960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:28,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 20:33:28,529][15401] Updated weights for policy 0, policy_version 961294 (0.0037) [2024-06-25 20:33:30,967][15349] Signal inference workers to stop experience collection... (233000 times) [2024-06-25 20:33:30,999][15401] InferenceWorker_p0-w0: stopping experience collection (233000 times) [2024-06-25 20:33:31,020][15349] Signal inference workers to resume experience collection... (233000 times) [2024-06-25 20:33:31,020][15401] InferenceWorker_p0-w0: resuming experience collection (233000 times) [2024-06-25 20:33:31,023][15401] Updated weights for policy 0, policy_version 961304 (0.0032) [2024-06-25 20:33:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15750070272. Throughput: 0: 42671.1. Samples: 15750241780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 20:33:36,203][15401] Updated weights for policy 0, policy_version 961314 (0.0033) [2024-06-25 20:33:38,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15750283264. Throughput: 0: 42804.1. Samples: 15750375840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:38,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 20:33:38,809][15401] Updated weights for policy 0, policy_version 961324 (0.0036) [2024-06-25 20:33:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15750479872. Throughput: 0: 42556.8. Samples: 15750622620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 20:33:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961333_15750479872.pth... [2024-06-25 20:33:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000960707_15740223488.pth [2024-06-25 20:33:43,932][15401] Updated weights for policy 0, policy_version 961334 (0.0030) [2024-06-25 20:33:46,385][15401] Updated weights for policy 0, policy_version 961344 (0.0040) [2024-06-25 20:33:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15750725632. Throughput: 0: 42622.2. Samples: 15750875680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:48,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 20:33:51,487][15401] Updated weights for policy 0, policy_version 961354 (0.0035) [2024-06-25 20:33:53,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.1, 300 sec: 42820.5). Total num frames: 15750922240. Throughput: 0: 42518.9. Samples: 15751003640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:53,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 20:33:54,112][15401] Updated weights for policy 0, policy_version 961364 (0.0035) [2024-06-25 20:33:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15751118848. Throughput: 0: 42380.4. Samples: 15751253060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:33:58,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 20:33:59,112][15401] Updated weights for policy 0, policy_version 961374 (0.0030) [2024-06-25 20:34:01,716][15401] Updated weights for policy 0, policy_version 961384 (0.0022) [2024-06-25 20:34:03,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.1, 300 sec: 42876.0). Total num frames: 15751348224. Throughput: 0: 42414.6. Samples: 15751511820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:34:03,391][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 20:34:06,942][15401] Updated weights for policy 0, policy_version 961394 (0.0035) [2024-06-25 20:34:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42709.5). Total num frames: 15751544832. Throughput: 0: 42492.9. Samples: 15751642320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:34:08,390][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 20:34:09,747][15401] Updated weights for policy 0, policy_version 961404 (0.0029) [2024-06-25 20:34:13,390][15132] Fps is (10 sec: 40961.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15751757824. Throughput: 0: 42292.8. Samples: 15751889140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:34:13,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 20:34:14,459][15401] Updated weights for policy 0, policy_version 961414 (0.0039) [2024-06-25 20:34:17,798][15401] Updated weights for policy 0, policy_version 961424 (0.0040) [2024-06-25 20:34:18,389][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15752003584. Throughput: 0: 42401.4. Samples: 15752149840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:34:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 20:34:22,061][15401] Updated weights for policy 0, policy_version 961434 (0.0031) [2024-06-25 20:34:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15752183808. Throughput: 0: 42387.1. Samples: 15752283260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 20:34:23,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 20:34:25,475][15401] Updated weights for policy 0, policy_version 961444 (0.0033) [2024-06-25 20:34:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15752413184. Throughput: 0: 42353.0. Samples: 15752528500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 20:34:29,754][15401] Updated weights for policy 0, policy_version 961454 (0.0029) [2024-06-25 20:34:33,208][15401] Updated weights for policy 0, policy_version 961464 (0.0033) [2024-06-25 20:34:33,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15752626176. Throughput: 0: 42484.5. Samples: 15752787480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:33,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 20:34:37,457][15401] Updated weights for policy 0, policy_version 961474 (0.0027) [2024-06-25 20:34:38,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15752822784. Throughput: 0: 42465.5. Samples: 15752914580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:38,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 20:34:41,121][15401] Updated weights for policy 0, policy_version 961484 (0.0030) [2024-06-25 20:34:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15753052160. Throughput: 0: 42631.2. Samples: 15753171460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:43,390][15132] Avg episode reward: [(0, '0.189')] [2024-06-25 20:34:44,986][15401] Updated weights for policy 0, policy_version 961494 (0.0027) [2024-06-25 20:34:44,996][15349] Signal inference workers to stop experience collection... (233050 times) [2024-06-25 20:34:44,997][15349] Signal inference workers to resume experience collection... (233050 times) [2024-06-25 20:34:45,046][15401] InferenceWorker_p0-w0: stopping experience collection (233050 times) [2024-06-25 20:34:45,046][15401] InferenceWorker_p0-w0: resuming experience collection (233050 times) [2024-06-25 20:34:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15753248768. Throughput: 0: 42440.9. Samples: 15753421640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:48,392][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 20:34:48,632][15401] Updated weights for policy 0, policy_version 961504 (0.0043) [2024-06-25 20:34:52,986][15401] Updated weights for policy 0, policy_version 961514 (0.0039) [2024-06-25 20:34:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 15753461760. Throughput: 0: 42392.8. Samples: 15753550000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:53,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 20:34:56,483][15401] Updated weights for policy 0, policy_version 961524 (0.0034) [2024-06-25 20:34:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15753674752. Throughput: 0: 42495.6. Samples: 15753801440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:34:58,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 20:35:00,493][15401] Updated weights for policy 0, policy_version 961534 (0.0039) [2024-06-25 20:35:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42325.6, 300 sec: 42653.9). Total num frames: 15753887744. Throughput: 0: 42491.9. Samples: 15754061980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:03,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 20:35:04,283][15401] Updated weights for policy 0, policy_version 961544 (0.0022) [2024-06-25 20:35:07,969][15401] Updated weights for policy 0, policy_version 961554 (0.0028) [2024-06-25 20:35:08,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 15754117120. Throughput: 0: 42393.3. Samples: 15754191060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:08,392][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 20:35:11,874][15401] Updated weights for policy 0, policy_version 961564 (0.0042) [2024-06-25 20:35:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15754313728. Throughput: 0: 42571.0. Samples: 15754444200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:13,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 20:35:16,031][15401] Updated weights for policy 0, policy_version 961574 (0.0034) [2024-06-25 20:35:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15754543104. Throughput: 0: 42455.0. Samples: 15754697960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:35:19,574][15401] Updated weights for policy 0, policy_version 961584 (0.0026) [2024-06-25 20:35:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15754739712. Throughput: 0: 42596.1. Samples: 15754831400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:23,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 20:35:23,642][15401] Updated weights for policy 0, policy_version 961594 (0.0039) [2024-06-25 20:35:27,042][15401] Updated weights for policy 0, policy_version 961604 (0.0038) [2024-06-25 20:35:28,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15754969088. Throughput: 0: 42441.3. Samples: 15755081320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:28,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 20:35:31,262][15401] Updated weights for policy 0, policy_version 961614 (0.0022) [2024-06-25 20:35:33,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.4). Total num frames: 15755198464. Throughput: 0: 42629.7. Samples: 15755339980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:33,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 20:35:34,505][15401] Updated weights for policy 0, policy_version 961624 (0.0029) [2024-06-25 20:35:38,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15755378688. Throughput: 0: 42766.7. Samples: 15755474500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:38,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 20:35:38,836][15401] Updated weights for policy 0, policy_version 961634 (0.0033) [2024-06-25 20:35:42,176][15401] Updated weights for policy 0, policy_version 961644 (0.0035) [2024-06-25 20:35:43,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15755591680. Throughput: 0: 42859.1. Samples: 15755730100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:43,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 20:35:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961646_15755608064.pth... [2024-06-25 20:35:43,488][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961023_15745400832.pth [2024-06-25 20:35:46,399][15401] Updated weights for policy 0, policy_version 961654 (0.0037) [2024-06-25 20:35:48,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15755837440. Throughput: 0: 42658.4. Samples: 15755981600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:48,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 20:35:49,981][15401] Updated weights for policy 0, policy_version 961664 (0.0042) [2024-06-25 20:35:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15756017664. Throughput: 0: 42768.9. Samples: 15756115560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:53,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 20:35:54,174][15401] Updated weights for policy 0, policy_version 961674 (0.0057) [2024-06-25 20:35:55,475][15349] Signal inference workers to stop experience collection... (233100 times) [2024-06-25 20:35:55,523][15401] InferenceWorker_p0-w0: stopping experience collection (233100 times) [2024-06-25 20:35:55,533][15349] Signal inference workers to resume experience collection... (233100 times) [2024-06-25 20:35:55,537][15401] InferenceWorker_p0-w0: resuming experience collection (233100 times) [2024-06-25 20:35:57,868][15401] Updated weights for policy 0, policy_version 961684 (0.0041) [2024-06-25 20:35:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15756247040. Throughput: 0: 42705.4. Samples: 15756365940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:35:58,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 20:36:01,813][15401] Updated weights for policy 0, policy_version 961694 (0.0043) [2024-06-25 20:36:03,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 15756476416. Throughput: 0: 42834.3. Samples: 15756625500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:36:03,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 20:36:05,502][15401] Updated weights for policy 0, policy_version 961704 (0.0038) [2024-06-25 20:36:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 15756656640. Throughput: 0: 42649.6. Samples: 15756750640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:36:08,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 20:36:09,553][15401] Updated weights for policy 0, policy_version 961714 (0.0032) [2024-06-25 20:36:12,940][15401] Updated weights for policy 0, policy_version 961724 (0.0035) [2024-06-25 20:36:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15756902400. Throughput: 0: 42938.2. Samples: 15757013540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:36:13,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 20:36:17,157][15401] Updated weights for policy 0, policy_version 961734 (0.0041) [2024-06-25 20:36:18,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15757115392. Throughput: 0: 42795.2. Samples: 15757265760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:36:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 20:36:20,927][15401] Updated weights for policy 0, policy_version 961744 (0.0035) [2024-06-25 20:36:23,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15757295616. Throughput: 0: 42705.6. Samples: 15757396260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-25 20:36:23,390][15132] Avg episode reward: [(0, '0.701')] [2024-06-25 20:36:24,703][15401] Updated weights for policy 0, policy_version 961754 (0.0037) [2024-06-25 20:36:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15757524992. Throughput: 0: 42540.5. Samples: 15757644420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:28,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 20:36:28,640][15401] Updated weights for policy 0, policy_version 961764 (0.0036) [2024-06-25 20:36:32,831][15401] Updated weights for policy 0, policy_version 961774 (0.0031) [2024-06-25 20:36:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15757737984. Throughput: 0: 42738.2. Samples: 15757904820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 20:36:36,119][15401] Updated weights for policy 0, policy_version 961784 (0.0028) [2024-06-25 20:36:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15757934592. Throughput: 0: 42613.0. Samples: 15758033140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:38,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 20:36:40,387][15401] Updated weights for policy 0, policy_version 961794 (0.0038) [2024-06-25 20:36:43,390][15132] Fps is (10 sec: 44235.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15758180352. Throughput: 0: 42668.7. Samples: 15758286040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:43,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 20:36:43,627][15401] Updated weights for policy 0, policy_version 961804 (0.0034) [2024-06-25 20:36:47,964][15401] Updated weights for policy 0, policy_version 961814 (0.0032) [2024-06-25 20:36:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15758376960. Throughput: 0: 42757.0. Samples: 15758549560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:48,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 20:36:51,153][15401] Updated weights for policy 0, policy_version 961824 (0.0032) [2024-06-25 20:36:53,392][15132] Fps is (10 sec: 40950.8, 60 sec: 42869.7, 300 sec: 42653.6). Total num frames: 15758589952. Throughput: 0: 42731.2. Samples: 15758673640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:53,392][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 20:36:55,661][15401] Updated weights for policy 0, policy_version 961834 (0.0029) [2024-06-25 20:36:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15758819328. Throughput: 0: 42530.8. Samples: 15758927420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:36:58,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 20:36:58,778][15401] Updated weights for policy 0, policy_version 961844 (0.0032) [2024-06-25 20:37:03,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15758999552. Throughput: 0: 42754.2. Samples: 15759189700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:03,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 20:37:03,502][15401] Updated weights for policy 0, policy_version 961854 (0.0035) [2024-06-25 20:37:06,482][15401] Updated weights for policy 0, policy_version 961864 (0.0035) [2024-06-25 20:37:08,389][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15759212544. Throughput: 0: 42484.5. Samples: 15759308060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 20:37:11,180][15401] Updated weights for policy 0, policy_version 961874 (0.0041) [2024-06-25 20:37:13,392][15132] Fps is (10 sec: 45864.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15759458304. Throughput: 0: 42670.2. Samples: 15759564680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:13,392][15132] Avg episode reward: [(0, '0.726')] [2024-06-25 20:37:14,120][15401] Updated weights for policy 0, policy_version 961884 (0.0036) [2024-06-25 20:37:18,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15759638528. Throughput: 0: 42793.3. Samples: 15759830520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 20:37:18,739][15401] Updated weights for policy 0, policy_version 961894 (0.0025) [2024-06-25 20:37:22,312][15401] Updated weights for policy 0, policy_version 961904 (0.0028) [2024-06-25 20:37:22,486][15349] Signal inference workers to stop experience collection... (233150 times) [2024-06-25 20:37:22,489][15349] Signal inference workers to resume experience collection... (233150 times) [2024-06-25 20:37:22,530][15401] InferenceWorker_p0-w0: stopping experience collection (233150 times) [2024-06-25 20:37:22,531][15401] InferenceWorker_p0-w0: resuming experience collection (233150 times) [2024-06-25 20:37:23,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15759867904. Throughput: 0: 42512.2. Samples: 15759946200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:23,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 20:37:26,195][15401] Updated weights for policy 0, policy_version 961914 (0.0038) [2024-06-25 20:37:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15760097280. Throughput: 0: 42760.7. Samples: 15760210260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:28,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 20:37:29,864][15401] Updated weights for policy 0, policy_version 961924 (0.0030) [2024-06-25 20:37:33,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15760277504. Throughput: 0: 42772.3. Samples: 15760474320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:33,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 20:37:33,888][15401] Updated weights for policy 0, policy_version 961934 (0.0037) [2024-06-25 20:37:37,367][15401] Updated weights for policy 0, policy_version 961944 (0.0032) [2024-06-25 20:37:38,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15760523264. Throughput: 0: 42512.9. Samples: 15760586620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:38,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 20:37:41,431][15401] Updated weights for policy 0, policy_version 961954 (0.0028) [2024-06-25 20:37:43,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15760752640. Throughput: 0: 42854.8. Samples: 15760855900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:43,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 20:37:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961960_15760752640.pth... [2024-06-25 20:37:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961333_15750479872.pth [2024-06-25 20:37:44,870][15401] Updated weights for policy 0, policy_version 961964 (0.0042) [2024-06-25 20:37:48,391][15132] Fps is (10 sec: 39316.8, 60 sec: 42324.3, 300 sec: 42431.6). Total num frames: 15760916480. Throughput: 0: 42693.9. Samples: 15761110980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:48,391][15132] Avg episode reward: [(0, '0.673')] [2024-06-25 20:37:49,651][15401] Updated weights for policy 0, policy_version 961974 (0.0036) [2024-06-25 20:37:52,531][15401] Updated weights for policy 0, policy_version 961984 (0.0036) [2024-06-25 20:37:53,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 15761178624. Throughput: 0: 42735.0. Samples: 15761231140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:53,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 20:37:57,173][15401] Updated weights for policy 0, policy_version 961994 (0.0032) [2024-06-25 20:37:58,390][15132] Fps is (10 sec: 45881.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15761375232. Throughput: 0: 43040.0. Samples: 15761501380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:37:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 20:38:00,003][15401] Updated weights for policy 0, policy_version 962004 (0.0041) [2024-06-25 20:38:03,393][15132] Fps is (10 sec: 39307.2, 60 sec: 42868.8, 300 sec: 42486.8). Total num frames: 15761571840. Throughput: 0: 42704.9. Samples: 15761752400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:38:03,394][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 20:38:04,852][15401] Updated weights for policy 0, policy_version 962014 (0.0030) [2024-06-25 20:38:07,606][15401] Updated weights for policy 0, policy_version 962024 (0.0038) [2024-06-25 20:38:08,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15761817600. Throughput: 0: 42875.2. Samples: 15761875580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:38:08,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 20:38:12,457][15401] Updated weights for policy 0, policy_version 962034 (0.0030) [2024-06-25 20:38:13,392][15132] Fps is (10 sec: 44242.7, 60 sec: 42598.4, 300 sec: 42709.1). Total num frames: 15762014208. Throughput: 0: 42772.3. Samples: 15762135120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:38:13,392][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 20:38:15,727][15401] Updated weights for policy 0, policy_version 962044 (0.0034) [2024-06-25 20:38:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15762227200. Throughput: 0: 42465.4. Samples: 15762385260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-25 20:38:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 20:38:20,475][15401] Updated weights for policy 0, policy_version 962054 (0.0031) [2024-06-25 20:38:23,289][15401] Updated weights for policy 0, policy_version 962064 (0.0039) [2024-06-25 20:38:23,392][15132] Fps is (10 sec: 44236.8, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 15762456576. Throughput: 0: 42819.6. Samples: 15762513600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:23,392][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 20:38:27,991][15401] Updated weights for policy 0, policy_version 962074 (0.0044) [2024-06-25 20:38:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15762636800. Throughput: 0: 42579.2. Samples: 15762771960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:28,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 20:38:29,001][15349] Signal inference workers to stop experience collection... (233200 times) [2024-06-25 20:38:29,039][15401] InferenceWorker_p0-w0: stopping experience collection (233200 times) [2024-06-25 20:38:29,048][15349] Signal inference workers to resume experience collection... (233200 times) [2024-06-25 20:38:29,068][15401] InferenceWorker_p0-w0: resuming experience collection (233200 times) [2024-06-25 20:38:31,271][15401] Updated weights for policy 0, policy_version 962084 (0.0034) [2024-06-25 20:38:33,392][15132] Fps is (10 sec: 40960.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15762866176. Throughput: 0: 42536.3. Samples: 15763025160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:33,392][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 20:38:35,569][15401] Updated weights for policy 0, policy_version 962094 (0.0029) [2024-06-25 20:38:38,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15763095552. Throughput: 0: 42724.9. Samples: 15763153760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:38,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 20:38:39,333][15401] Updated weights for policy 0, policy_version 962104 (0.0034) [2024-06-25 20:38:43,079][15401] Updated weights for policy 0, policy_version 962114 (0.0032) [2024-06-25 20:38:43,390][15132] Fps is (10 sec: 40969.3, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 15763275776. Throughput: 0: 42429.7. Samples: 15763410720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:43,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 20:38:47,037][15401] Updated weights for policy 0, policy_version 962124 (0.0028) [2024-06-25 20:38:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 43418.5, 300 sec: 42709.5). Total num frames: 15763521536. Throughput: 0: 42475.5. Samples: 15763663640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:48,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 20:38:50,623][15401] Updated weights for policy 0, policy_version 962134 (0.0032) [2024-06-25 20:38:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15763718144. Throughput: 0: 42734.2. Samples: 15763798620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 20:38:54,542][15401] Updated weights for policy 0, policy_version 962144 (0.0036) [2024-06-25 20:38:58,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15763931136. Throughput: 0: 42544.6. Samples: 15764049520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:38:58,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 20:38:58,397][15401] Updated weights for policy 0, policy_version 962154 (0.0032) [2024-06-25 20:39:02,044][15401] Updated weights for policy 0, policy_version 962164 (0.0036) [2024-06-25 20:39:03,389][15132] Fps is (10 sec: 44237.7, 60 sec: 43147.3, 300 sec: 42765.0). Total num frames: 15764160512. Throughput: 0: 42639.6. Samples: 15764304040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:03,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 20:39:06,321][15401] Updated weights for policy 0, policy_version 962174 (0.0032) [2024-06-25 20:39:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15764357120. Throughput: 0: 42840.9. Samples: 15764441340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:08,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 20:39:09,541][15401] Updated weights for policy 0, policy_version 962184 (0.0034) [2024-06-25 20:39:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 15764570112. Throughput: 0: 42711.6. Samples: 15764693980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:13,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 20:39:13,989][15401] Updated weights for policy 0, policy_version 962194 (0.0036) [2024-06-25 20:39:17,162][15401] Updated weights for policy 0, policy_version 962204 (0.0028) [2024-06-25 20:39:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15764783104. Throughput: 0: 42722.3. Samples: 15764947560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:18,390][15132] Avg episode reward: [(0, '0.232')] [2024-06-25 20:39:21,495][15401] Updated weights for policy 0, policy_version 962214 (0.0036) [2024-06-25 20:39:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 15764996096. Throughput: 0: 42766.3. Samples: 15765078240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:23,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 20:39:24,702][15401] Updated weights for policy 0, policy_version 962224 (0.0029) [2024-06-25 20:39:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15765192704. Throughput: 0: 42764.1. Samples: 15765335100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 20:39:28,999][15401] Updated weights for policy 0, policy_version 962234 (0.0025) [2024-06-25 20:39:32,583][15401] Updated weights for policy 0, policy_version 962244 (0.0043) [2024-06-25 20:39:33,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15765422080. Throughput: 0: 42722.6. Samples: 15765586160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:33,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 20:39:36,999][15401] Updated weights for policy 0, policy_version 962254 (0.0038) [2024-06-25 20:39:38,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15765651456. Throughput: 0: 42719.2. Samples: 15765720980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:38,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 20:39:40,329][15401] Updated weights for policy 0, policy_version 962264 (0.0040) [2024-06-25 20:39:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15765831680. Throughput: 0: 42748.8. Samples: 15765973220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:43,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 20:39:43,512][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962271_15765848064.pth... [2024-06-25 20:39:43,560][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961646_15755608064.pth [2024-06-25 20:39:44,494][15401] Updated weights for policy 0, policy_version 962274 (0.0043) [2024-06-25 20:39:48,338][15401] Updated weights for policy 0, policy_version 962284 (0.0035) [2024-06-25 20:39:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15766061056. Throughput: 0: 42844.7. Samples: 15766232060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:48,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 20:39:48,494][15349] Signal inference workers to stop experience collection... (233250 times) [2024-06-25 20:39:48,495][15349] Signal inference workers to resume experience collection... (233250 times) [2024-06-25 20:39:48,542][15401] InferenceWorker_p0-w0: stopping experience collection (233250 times) [2024-06-25 20:39:48,542][15401] InferenceWorker_p0-w0: resuming experience collection (233250 times) [2024-06-25 20:39:52,285][15401] Updated weights for policy 0, policy_version 962294 (0.0031) [2024-06-25 20:39:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15766290432. Throughput: 0: 42696.4. Samples: 15766362680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:53,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:39:55,753][15401] Updated weights for policy 0, policy_version 962304 (0.0035) [2024-06-25 20:39:58,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 15766487040. Throughput: 0: 42697.0. Samples: 15766615360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:39:58,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 20:39:59,986][15401] Updated weights for policy 0, policy_version 962314 (0.0027) [2024-06-25 20:40:03,315][15401] Updated weights for policy 0, policy_version 962324 (0.0034) [2024-06-25 20:40:03,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 15766716416. Throughput: 0: 42664.5. Samples: 15766867460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:40:03,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 20:40:07,812][15401] Updated weights for policy 0, policy_version 962334 (0.0031) [2024-06-25 20:40:08,389][15132] Fps is (10 sec: 44238.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15766929408. Throughput: 0: 42623.1. Samples: 15766996280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:40:08,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 20:40:10,983][15401] Updated weights for policy 0, policy_version 962344 (0.0035) [2024-06-25 20:40:13,392][15132] Fps is (10 sec: 39311.9, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 15767109632. Throughput: 0: 42602.6. Samples: 15767252320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:40:13,392][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 20:40:15,490][15401] Updated weights for policy 0, policy_version 962354 (0.0031) [2024-06-25 20:40:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15767355392. Throughput: 0: 42677.0. Samples: 15767506620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-25 20:40:18,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 20:40:18,541][15401] Updated weights for policy 0, policy_version 962364 (0.0044) [2024-06-25 20:40:23,099][15401] Updated weights for policy 0, policy_version 962374 (0.0033) [2024-06-25 20:40:23,390][15132] Fps is (10 sec: 45886.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15767568384. Throughput: 0: 42621.8. Samples: 15767638960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:23,390][15132] Avg episode reward: [(0, '0.625')] [2024-06-25 20:40:26,258][15401] Updated weights for policy 0, policy_version 962384 (0.0037) [2024-06-25 20:40:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15767764992. Throughput: 0: 42502.6. Samples: 15767885840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:28,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 20:40:30,588][15401] Updated weights for policy 0, policy_version 962394 (0.0038) [2024-06-25 20:40:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15767977984. Throughput: 0: 42577.8. Samples: 15768148060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:33,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 20:40:34,069][15401] Updated weights for policy 0, policy_version 962404 (0.0031) [2024-06-25 20:40:38,132][15401] Updated weights for policy 0, policy_version 962414 (0.0039) [2024-06-25 20:40:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15768190976. Throughput: 0: 42481.7. Samples: 15768274360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 20:40:41,554][15401] Updated weights for policy 0, policy_version 962424 (0.0041) [2024-06-25 20:40:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15768420352. Throughput: 0: 42458.4. Samples: 15768525980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:43,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 20:40:45,695][15401] Updated weights for policy 0, policy_version 962434 (0.0030) [2024-06-25 20:40:48,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15768633344. Throughput: 0: 42589.7. Samples: 15768784000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:48,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 20:40:49,442][15401] Updated weights for policy 0, policy_version 962444 (0.0039) [2024-06-25 20:40:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15768829952. Throughput: 0: 42465.2. Samples: 15768907220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 20:40:53,890][15401] Updated weights for policy 0, policy_version 962454 (0.0047) [2024-06-25 20:40:57,087][15349] Signal inference workers to stop experience collection... (233300 times) [2024-06-25 20:40:57,133][15401] InferenceWorker_p0-w0: stopping experience collection (233300 times) [2024-06-25 20:40:57,146][15349] Signal inference workers to resume experience collection... (233300 times) [2024-06-25 20:40:57,158][15401] InferenceWorker_p0-w0: resuming experience collection (233300 times) [2024-06-25 20:40:57,166][15401] Updated weights for policy 0, policy_version 962464 (0.0045) [2024-06-25 20:40:58,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.7, 300 sec: 42598.4). Total num frames: 15769042944. Throughput: 0: 42430.7. Samples: 15769161600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:40:58,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 20:41:01,431][15401] Updated weights for policy 0, policy_version 962474 (0.0045) [2024-06-25 20:41:03,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15769255936. Throughput: 0: 42527.9. Samples: 15769420380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 20:41:04,982][15401] Updated weights for policy 0, policy_version 962484 (0.0037) [2024-06-25 20:41:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15769452544. Throughput: 0: 42342.7. Samples: 15769544380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:08,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 20:41:09,069][15401] Updated weights for policy 0, policy_version 962494 (0.0032) [2024-06-25 20:41:12,841][15401] Updated weights for policy 0, policy_version 962504 (0.0027) [2024-06-25 20:41:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42873.3, 300 sec: 42598.4). Total num frames: 15769681920. Throughput: 0: 42523.8. Samples: 15769799400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:13,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 20:41:16,757][15401] Updated weights for policy 0, policy_version 962514 (0.0040) [2024-06-25 20:41:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15769894912. Throughput: 0: 42290.2. Samples: 15770051120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:18,390][15132] Avg episode reward: [(0, '0.463')] [2024-06-25 20:41:20,484][15401] Updated weights for policy 0, policy_version 962524 (0.0042) [2024-06-25 20:41:23,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42050.6, 300 sec: 42598.1). Total num frames: 15770091520. Throughput: 0: 42236.6. Samples: 15770175100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:23,393][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 20:41:24,635][15401] Updated weights for policy 0, policy_version 962534 (0.0032) [2024-06-25 20:41:28,170][15401] Updated weights for policy 0, policy_version 962544 (0.0039) [2024-06-25 20:41:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15770337280. Throughput: 0: 42375.7. Samples: 15770432880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:28,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 20:41:32,187][15401] Updated weights for policy 0, policy_version 962554 (0.0034) [2024-06-25 20:41:33,390][15132] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15770517504. Throughput: 0: 42391.0. Samples: 15770691600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:33,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:41:35,833][15401] Updated weights for policy 0, policy_version 962564 (0.0033) [2024-06-25 20:41:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15770746880. Throughput: 0: 42283.7. Samples: 15770809980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:38,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 20:41:39,732][15401] Updated weights for policy 0, policy_version 962574 (0.0043) [2024-06-25 20:41:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15770959872. Throughput: 0: 42418.1. Samples: 15771070420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:43,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 20:41:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962583_15770959872.pth... [2024-06-25 20:41:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000961960_15760752640.pth [2024-06-25 20:41:43,630][15401] Updated weights for policy 0, policy_version 962584 (0.0023) [2024-06-25 20:41:47,282][15401] Updated weights for policy 0, policy_version 962594 (0.0030) [2024-06-25 20:41:48,392][15132] Fps is (10 sec: 40950.4, 60 sec: 42050.7, 300 sec: 42598.4). Total num frames: 15771156480. Throughput: 0: 42349.9. Samples: 15771326220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:48,392][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 20:41:51,523][15401] Updated weights for policy 0, policy_version 962604 (0.0044) [2024-06-25 20:41:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 15771369472. Throughput: 0: 42315.1. Samples: 15771448560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:53,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 20:41:55,130][15401] Updated weights for policy 0, policy_version 962614 (0.0036) [2024-06-25 20:41:58,390][15132] Fps is (10 sec: 44246.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15771598848. Throughput: 0: 42467.8. Samples: 15771710460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:41:58,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 20:41:59,102][15401] Updated weights for policy 0, policy_version 962624 (0.0034) [2024-06-25 20:42:03,224][15401] Updated weights for policy 0, policy_version 962634 (0.0035) [2024-06-25 20:42:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15771795456. Throughput: 0: 42433.4. Samples: 15771960620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:42:03,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 20:42:07,062][15401] Updated weights for policy 0, policy_version 962644 (0.0031) [2024-06-25 20:42:08,389][15132] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 15771992064. Throughput: 0: 42483.2. Samples: 15772086740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:42:08,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 20:42:10,946][15401] Updated weights for policy 0, policy_version 962654 (0.0035) [2024-06-25 20:42:11,971][15349] Signal inference workers to stop experience collection... (233350 times) [2024-06-25 20:42:12,011][15401] InferenceWorker_p0-w0: stopping experience collection (233350 times) [2024-06-25 20:42:12,020][15349] Signal inference workers to resume experience collection... (233350 times) [2024-06-25 20:42:12,025][15401] InferenceWorker_p0-w0: resuming experience collection (233350 times) [2024-06-25 20:42:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15772221440. Throughput: 0: 42473.3. Samples: 15772344180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-25 20:42:13,390][15132] Avg episode reward: [(0, '0.830')] [2024-06-25 20:42:14,843][15401] Updated weights for policy 0, policy_version 962664 (0.0029) [2024-06-25 20:42:18,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15772434432. Throughput: 0: 42328.9. Samples: 15772596400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:18,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 20:42:18,623][15401] Updated weights for policy 0, policy_version 962674 (0.0029) [2024-06-25 20:42:22,369][15401] Updated weights for policy 0, policy_version 962684 (0.0030) [2024-06-25 20:42:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 15772647424. Throughput: 0: 42578.1. Samples: 15772726000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:23,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 20:42:26,435][15401] Updated weights for policy 0, policy_version 962694 (0.0037) [2024-06-25 20:42:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15772844032. Throughput: 0: 42491.7. Samples: 15772982540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:28,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 20:42:29,841][15401] Updated weights for policy 0, policy_version 962704 (0.0032) [2024-06-25 20:42:33,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15773073408. Throughput: 0: 42595.5. Samples: 15773242920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:33,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 20:42:33,912][15401] Updated weights for policy 0, policy_version 962714 (0.0032) [2024-06-25 20:42:37,449][15401] Updated weights for policy 0, policy_version 962724 (0.0052) [2024-06-25 20:42:38,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 15773286400. Throughput: 0: 42798.1. Samples: 15773374480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:38,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 20:42:41,584][15401] Updated weights for policy 0, policy_version 962734 (0.0042) [2024-06-25 20:42:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42598.6). Total num frames: 15773483008. Throughput: 0: 42454.7. Samples: 15773620920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:43,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 20:42:45,097][15401] Updated weights for policy 0, policy_version 962744 (0.0035) [2024-06-25 20:42:48,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 15773728768. Throughput: 0: 42624.8. Samples: 15773878740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 20:42:49,434][15401] Updated weights for policy 0, policy_version 962754 (0.0028) [2024-06-25 20:42:53,074][15401] Updated weights for policy 0, policy_version 962764 (0.0028) [2024-06-25 20:42:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15773941760. Throughput: 0: 42702.1. Samples: 15774008340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:53,390][15132] Avg episode reward: [(0, '0.747')] [2024-06-25 20:42:56,915][15401] Updated weights for policy 0, policy_version 962774 (0.0037) [2024-06-25 20:42:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 15774138368. Throughput: 0: 42601.7. Samples: 15774261260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:42:58,390][15132] Avg episode reward: [(0, '0.604')] [2024-06-25 20:43:00,763][15401] Updated weights for policy 0, policy_version 962784 (0.0043) [2024-06-25 20:43:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15774351360. Throughput: 0: 42764.1. Samples: 15774520780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 20:43:04,482][15401] Updated weights for policy 0, policy_version 962794 (0.0042) [2024-06-25 20:43:08,249][15401] Updated weights for policy 0, policy_version 962804 (0.0028) [2024-06-25 20:43:08,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42598.8). Total num frames: 15774580736. Throughput: 0: 42795.6. Samples: 15774651800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 20:43:11,955][15401] Updated weights for policy 0, policy_version 962814 (0.0040) [2024-06-25 20:43:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15774777344. Throughput: 0: 42579.6. Samples: 15774898620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 20:43:16,049][15401] Updated weights for policy 0, policy_version 962824 (0.0030) [2024-06-25 20:43:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 15775006720. Throughput: 0: 42697.3. Samples: 15775164300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:18,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 20:43:19,690][15401] Updated weights for policy 0, policy_version 962834 (0.0033) [2024-06-25 20:43:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15775186944. Throughput: 0: 42571.8. Samples: 15775290200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:23,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 20:43:24,149][15401] Updated weights for policy 0, policy_version 962844 (0.0027) [2024-06-25 20:43:27,631][15401] Updated weights for policy 0, policy_version 962854 (0.0035) [2024-06-25 20:43:28,393][15132] Fps is (10 sec: 42581.7, 60 sec: 43141.7, 300 sec: 42598.2). Total num frames: 15775432704. Throughput: 0: 42739.5. Samples: 15775544360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:28,394][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 20:43:31,447][15401] Updated weights for policy 0, policy_version 962864 (0.0026) [2024-06-25 20:43:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15775629312. Throughput: 0: 42802.2. Samples: 15775804840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:33,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 20:43:35,105][15401] Updated weights for policy 0, policy_version 962874 (0.0041) [2024-06-25 20:43:38,389][15132] Fps is (10 sec: 40976.5, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 15775842304. Throughput: 0: 42776.9. Samples: 15775933300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:38,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 20:43:38,920][15401] Updated weights for policy 0, policy_version 962884 (0.0023) [2024-06-25 20:43:42,251][15349] Signal inference workers to stop experience collection... (233400 times) [2024-06-25 20:43:42,251][15349] Signal inference workers to resume experience collection... (233400 times) [2024-06-25 20:43:42,299][15401] InferenceWorker_p0-w0: stopping experience collection (233400 times) [2024-06-25 20:43:42,299][15401] InferenceWorker_p0-w0: resuming experience collection (233400 times) [2024-06-25 20:43:42,642][15401] Updated weights for policy 0, policy_version 962894 (0.0026) [2024-06-25 20:43:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 15776071680. Throughput: 0: 42904.0. Samples: 15776191940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:43,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 20:43:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962895_15776071680.pth... [2024-06-25 20:43:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962271_15765848064.pth [2024-06-25 20:43:46,807][15401] Updated weights for policy 0, policy_version 962904 (0.0027) [2024-06-25 20:43:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15776284672. Throughput: 0: 42866.6. Samples: 15776449780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 20:43:50,131][15401] Updated weights for policy 0, policy_version 962914 (0.0042) [2024-06-25 20:43:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15776497664. Throughput: 0: 42861.8. Samples: 15776580580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:53,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 20:43:54,177][15401] Updated weights for policy 0, policy_version 962924 (0.0043) [2024-06-25 20:43:57,722][15401] Updated weights for policy 0, policy_version 962934 (0.0033) [2024-06-25 20:43:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15776710656. Throughput: 0: 43106.8. Samples: 15776838420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:43:58,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 20:44:01,774][15401] Updated weights for policy 0, policy_version 962944 (0.0028) [2024-06-25 20:44:03,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15776907264. Throughput: 0: 42973.4. Samples: 15777098100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:44:03,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 20:44:05,515][15401] Updated weights for policy 0, policy_version 962954 (0.0042) [2024-06-25 20:44:08,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15777136640. Throughput: 0: 42898.9. Samples: 15777220660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:44:08,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 20:44:09,935][15401] Updated weights for policy 0, policy_version 962964 (0.0037) [2024-06-25 20:44:13,167][15401] Updated weights for policy 0, policy_version 962974 (0.0023) [2024-06-25 20:44:13,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 15777382400. Throughput: 0: 43015.8. Samples: 15777479900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-25 20:44:13,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 20:44:17,671][15401] Updated weights for policy 0, policy_version 962984 (0.0027) [2024-06-25 20:44:18,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15777562624. Throughput: 0: 42929.9. Samples: 15777736680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:18,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 20:44:21,080][15401] Updated weights for policy 0, policy_version 962994 (0.0048) [2024-06-25 20:44:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 15777792000. Throughput: 0: 42810.1. Samples: 15777859760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:23,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 20:44:25,244][15401] Updated weights for policy 0, policy_version 963004 (0.0036) [2024-06-25 20:44:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42874.2, 300 sec: 42653.9). Total num frames: 15778004992. Throughput: 0: 42770.6. Samples: 15778116620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:28,390][15132] Avg episode reward: [(0, '0.797')] [2024-06-25 20:44:28,590][15401] Updated weights for policy 0, policy_version 963014 (0.0036) [2024-06-25 20:44:32,884][15401] Updated weights for policy 0, policy_version 963024 (0.0035) [2024-06-25 20:44:33,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 15778201600. Throughput: 0: 42832.5. Samples: 15778377240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:33,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 20:44:36,230][15401] Updated weights for policy 0, policy_version 963034 (0.0028) [2024-06-25 20:44:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15778430976. Throughput: 0: 42678.7. Samples: 15778501120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:38,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:44:40,615][15401] Updated weights for policy 0, policy_version 963044 (0.0031) [2024-06-25 20:44:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 15778627584. Throughput: 0: 42657.6. Samples: 15778758120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:43,392][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 20:44:43,763][15401] Updated weights for policy 0, policy_version 963054 (0.0043) [2024-06-25 20:44:48,100][15401] Updated weights for policy 0, policy_version 963064 (0.0029) [2024-06-25 20:44:48,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15778840576. Throughput: 0: 42547.1. Samples: 15779012720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:48,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 20:44:51,895][15401] Updated weights for policy 0, policy_version 963074 (0.0030) [2024-06-25 20:44:53,390][15132] Fps is (10 sec: 44247.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15779069952. Throughput: 0: 42678.3. Samples: 15779141180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:53,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 20:44:56,165][15401] Updated weights for policy 0, policy_version 963084 (0.0050) [2024-06-25 20:44:58,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15779266560. Throughput: 0: 42610.1. Samples: 15779397360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:44:58,390][15132] Avg episode reward: [(0, '0.194')] [2024-06-25 20:44:59,535][15401] Updated weights for policy 0, policy_version 963094 (0.0032) [2024-06-25 20:45:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15779479552. Throughput: 0: 42606.6. Samples: 15779653980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:03,390][15132] Avg episode reward: [(0, '0.000')] [2024-06-25 20:45:03,552][15401] Updated weights for policy 0, policy_version 963104 (0.0046) [2024-06-25 20:45:07,206][15401] Updated weights for policy 0, policy_version 963114 (0.0022) [2024-06-25 20:45:08,391][15132] Fps is (10 sec: 45867.9, 60 sec: 43143.4, 300 sec: 42765.1). Total num frames: 15779725312. Throughput: 0: 42763.3. Samples: 15779784180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:08,392][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:45:11,038][15401] Updated weights for policy 0, policy_version 963124 (0.0032) [2024-06-25 20:45:13,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15779905536. Throughput: 0: 42720.1. Samples: 15780039020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:13,390][15132] Avg episode reward: [(0, '0.343')] [2024-06-25 20:45:15,276][15401] Updated weights for policy 0, policy_version 963134 (0.0021) [2024-06-25 20:45:16,680][15349] Signal inference workers to stop experience collection... (233450 times) [2024-06-25 20:45:16,724][15401] InferenceWorker_p0-w0: stopping experience collection (233450 times) [2024-06-25 20:45:16,732][15349] Signal inference workers to resume experience collection... (233450 times) [2024-06-25 20:45:16,740][15401] InferenceWorker_p0-w0: resuming experience collection (233450 times) [2024-06-25 20:45:18,390][15132] Fps is (10 sec: 39328.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15780118528. Throughput: 0: 42502.6. Samples: 15780289860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:18,390][15132] Avg episode reward: [(0, '0.380')] [2024-06-25 20:45:19,407][15401] Updated weights for policy 0, policy_version 963144 (0.0023) [2024-06-25 20:45:22,919][15401] Updated weights for policy 0, policy_version 963154 (0.0031) [2024-06-25 20:45:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15780331520. Throughput: 0: 42604.8. Samples: 15780418340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 20:45:26,816][15401] Updated weights for policy 0, policy_version 963164 (0.0037) [2024-06-25 20:45:28,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15780560896. Throughput: 0: 42683.2. Samples: 15780678760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:28,390][15132] Avg episode reward: [(0, '0.803')] [2024-06-25 20:45:30,493][15401] Updated weights for policy 0, policy_version 963174 (0.0036) [2024-06-25 20:45:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15780773888. Throughput: 0: 42720.9. Samples: 15780935160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 20:45:34,182][15401] Updated weights for policy 0, policy_version 963184 (0.0030) [2024-06-25 20:45:38,008][15401] Updated weights for policy 0, policy_version 963194 (0.0039) [2024-06-25 20:45:38,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 15780986880. Throughput: 0: 42757.7. Samples: 15781065380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:38,392][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 20:45:41,609][15401] Updated weights for policy 0, policy_version 963204 (0.0031) [2024-06-25 20:45:43,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42873.2, 300 sec: 42598.4). Total num frames: 15781199872. Throughput: 0: 42744.1. Samples: 15781320840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:43,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 20:45:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963208_15781199872.pth... [2024-06-25 20:45:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962583_15770959872.pth [2024-06-25 20:45:45,554][15401] Updated weights for policy 0, policy_version 963214 (0.0036) [2024-06-25 20:45:48,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15781412864. Throughput: 0: 42792.5. Samples: 15781579640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:48,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:45:49,094][15401] Updated weights for policy 0, policy_version 963224 (0.0031) [2024-06-25 20:45:52,952][15401] Updated weights for policy 0, policy_version 963234 (0.0029) [2024-06-25 20:45:53,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15781625856. Throughput: 0: 42858.6. Samples: 15781712740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:53,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 20:45:57,010][15401] Updated weights for policy 0, policy_version 963244 (0.0026) [2024-06-25 20:45:58,390][15132] Fps is (10 sec: 42594.4, 60 sec: 42870.9, 300 sec: 42653.8). Total num frames: 15781838848. Throughput: 0: 42821.3. Samples: 15781966020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:45:58,391][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 20:46:01,231][15401] Updated weights for policy 0, policy_version 963254 (0.0029) [2024-06-25 20:46:03,389][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15782068224. Throughput: 0: 42871.1. Samples: 15782219060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:46:03,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 20:46:04,398][15401] Updated weights for policy 0, policy_version 963264 (0.0034) [2024-06-25 20:46:08,390][15132] Fps is (10 sec: 40963.4, 60 sec: 42053.4, 300 sec: 42598.4). Total num frames: 15782248448. Throughput: 0: 43021.3. Samples: 15782354300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:46:08,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 20:46:08,800][15401] Updated weights for policy 0, policy_version 963274 (0.0038) [2024-06-25 20:46:11,774][15401] Updated weights for policy 0, policy_version 963284 (0.0029) [2024-06-25 20:46:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15782477824. Throughput: 0: 42803.5. Samples: 15782604920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-25 20:46:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 20:46:16,243][15401] Updated weights for policy 0, policy_version 963294 (0.0023) [2024-06-25 20:46:18,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 15782707200. Throughput: 0: 42763.4. Samples: 15782859520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:46:19,428][15401] Updated weights for policy 0, policy_version 963304 (0.0044) [2024-06-25 20:46:23,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15782887424. Throughput: 0: 42904.6. Samples: 15782995980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:23,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 20:46:23,990][15401] Updated weights for policy 0, policy_version 963314 (0.0036) [2024-06-25 20:46:27,069][15401] Updated weights for policy 0, policy_version 963324 (0.0035) [2024-06-25 20:46:28,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15783133184. Throughput: 0: 42699.6. Samples: 15783242320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 20:46:31,450][15349] Signal inference workers to stop experience collection... (233500 times) [2024-06-25 20:46:31,451][15349] Signal inference workers to resume experience collection... (233500 times) [2024-06-25 20:46:31,492][15401] InferenceWorker_p0-w0: stopping experience collection (233500 times) [2024-06-25 20:46:31,492][15401] InferenceWorker_p0-w0: resuming experience collection (233500 times) [2024-06-25 20:46:31,594][15401] Updated weights for policy 0, policy_version 963334 (0.0027) [2024-06-25 20:46:33,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15783346176. Throughput: 0: 42668.1. Samples: 15783499700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:33,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 20:46:34,726][15401] Updated weights for policy 0, policy_version 963344 (0.0034) [2024-06-25 20:46:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 15783542784. Throughput: 0: 42588.4. Samples: 15783629220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:38,390][15132] Avg episode reward: [(0, '0.333')] [2024-06-25 20:46:39,345][15401] Updated weights for policy 0, policy_version 963354 (0.0042) [2024-06-25 20:46:42,867][15401] Updated weights for policy 0, policy_version 963364 (0.0029) [2024-06-25 20:46:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15783772160. Throughput: 0: 42476.0. Samples: 15783877400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 20:46:46,954][15401] Updated weights for policy 0, policy_version 963374 (0.0044) [2024-06-25 20:46:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15783968768. Throughput: 0: 42672.0. Samples: 15784139300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:48,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 20:46:50,363][15401] Updated weights for policy 0, policy_version 963384 (0.0029) [2024-06-25 20:46:53,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 15784181760. Throughput: 0: 42482.2. Samples: 15784266000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:53,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 20:46:54,606][15401] Updated weights for policy 0, policy_version 963394 (0.0042) [2024-06-25 20:46:58,240][15401] Updated weights for policy 0, policy_version 963404 (0.0037) [2024-06-25 20:46:58,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43145.2, 300 sec: 42820.6). Total num frames: 15784427520. Throughput: 0: 42572.1. Samples: 15784520660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:46:58,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:47:02,178][15401] Updated weights for policy 0, policy_version 963414 (0.0034) [2024-06-25 20:47:03,390][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15784607744. Throughput: 0: 42673.4. Samples: 15784779820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:03,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 20:47:05,821][15401] Updated weights for policy 0, policy_version 963424 (0.0036) [2024-06-25 20:47:08,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15784820736. Throughput: 0: 42420.8. Samples: 15784904920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:08,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 20:47:09,771][15401] Updated weights for policy 0, policy_version 963434 (0.0031) [2024-06-25 20:47:13,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15785050112. Throughput: 0: 42710.8. Samples: 15785164300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 20:47:13,483][15401] Updated weights for policy 0, policy_version 963444 (0.0030) [2024-06-25 20:47:17,429][15401] Updated weights for policy 0, policy_version 963454 (0.0036) [2024-06-25 20:47:18,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15785263104. Throughput: 0: 42652.4. Samples: 15785419060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:18,390][15132] Avg episode reward: [(0, '0.668')] [2024-06-25 20:47:21,161][15401] Updated weights for policy 0, policy_version 963464 (0.0034) [2024-06-25 20:47:23,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15785459712. Throughput: 0: 42601.4. Samples: 15785546280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:23,390][15132] Avg episode reward: [(0, '0.793')] [2024-06-25 20:47:25,299][15401] Updated weights for policy 0, policy_version 963474 (0.0035) [2024-06-25 20:47:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15785689088. Throughput: 0: 42984.3. Samples: 15785811700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:28,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 20:47:29,111][15401] Updated weights for policy 0, policy_version 963484 (0.0048) [2024-06-25 20:47:32,899][15401] Updated weights for policy 0, policy_version 963494 (0.0027) [2024-06-25 20:47:33,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15785918464. Throughput: 0: 42747.0. Samples: 15786062920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:33,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 20:47:36,631][15401] Updated weights for policy 0, policy_version 963504 (0.0044) [2024-06-25 20:47:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 15786115072. Throughput: 0: 42758.1. Samples: 15786190120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:38,396][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 20:47:39,935][15349] Signal inference workers to stop experience collection... (233550 times) [2024-06-25 20:47:39,984][15401] InferenceWorker_p0-w0: stopping experience collection (233550 times) [2024-06-25 20:47:39,986][15349] Signal inference workers to resume experience collection... (233550 times) [2024-06-25 20:47:40,003][15401] InferenceWorker_p0-w0: resuming experience collection (233550 times) [2024-06-25 20:47:40,349][15401] Updated weights for policy 0, policy_version 963514 (0.0034) [2024-06-25 20:47:43,391][15132] Fps is (10 sec: 42592.9, 60 sec: 42870.5, 300 sec: 42764.8). Total num frames: 15786344448. Throughput: 0: 42885.8. Samples: 15786450580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:43,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 20:47:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963522_15786344448.pth... [2024-06-25 20:47:43,497][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000962895_15776071680.pth [2024-06-25 20:47:44,308][15401] Updated weights for policy 0, policy_version 963524 (0.0029) [2024-06-25 20:47:48,218][15401] Updated weights for policy 0, policy_version 963534 (0.0038) [2024-06-25 20:47:48,390][15132] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15786557440. Throughput: 0: 42756.9. Samples: 15786703880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:48,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 20:47:51,887][15401] Updated weights for policy 0, policy_version 963544 (0.0044) [2024-06-25 20:47:53,389][15132] Fps is (10 sec: 39327.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15786737664. Throughput: 0: 42793.8. Samples: 15786830640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:53,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 20:47:55,860][15401] Updated weights for policy 0, policy_version 963554 (0.0036) [2024-06-25 20:47:58,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15786967040. Throughput: 0: 42871.6. Samples: 15787093520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:47:58,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 20:47:59,486][15401] Updated weights for policy 0, policy_version 963564 (0.0046) [2024-06-25 20:48:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15787180032. Throughput: 0: 42826.2. Samples: 15787346240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:48:03,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 20:48:03,691][15401] Updated weights for policy 0, policy_version 963574 (0.0019) [2024-06-25 20:48:07,171][15401] Updated weights for policy 0, policy_version 963584 (0.0033) [2024-06-25 20:48:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15787393024. Throughput: 0: 42762.2. Samples: 15787470580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 20:48:08,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 20:48:11,233][15401] Updated weights for policy 0, policy_version 963594 (0.0043) [2024-06-25 20:48:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15787606016. Throughput: 0: 42730.2. Samples: 15787734560. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:13,390][15132] Avg episode reward: [(0, '0.324')] [2024-06-25 20:48:14,686][15401] Updated weights for policy 0, policy_version 963604 (0.0023) [2024-06-25 20:48:18,391][15132] Fps is (10 sec: 44228.4, 60 sec: 42870.1, 300 sec: 42875.8). Total num frames: 15787835392. Throughput: 0: 42807.1. Samples: 15787989320. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:18,392][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:48:18,628][15401] Updated weights for policy 0, policy_version 963614 (0.0043) [2024-06-25 20:48:22,567][15401] Updated weights for policy 0, policy_version 963624 (0.0037) [2024-06-25 20:48:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.6). Total num frames: 15788048384. Throughput: 0: 42843.3. Samples: 15788118060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:23,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 20:48:26,697][15401] Updated weights for policy 0, policy_version 963634 (0.0039) [2024-06-25 20:48:28,389][15132] Fps is (10 sec: 40967.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15788244992. Throughput: 0: 42763.5. Samples: 15788374880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:28,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 20:48:30,133][15401] Updated weights for policy 0, policy_version 963644 (0.0035) [2024-06-25 20:48:33,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15788457984. Throughput: 0: 42826.2. Samples: 15788631060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:33,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 20:48:34,635][15401] Updated weights for policy 0, policy_version 963654 (0.0028) [2024-06-25 20:48:37,771][15401] Updated weights for policy 0, policy_version 963664 (0.0041) [2024-06-25 20:48:38,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.8, 300 sec: 42820.6). Total num frames: 15788703744. Throughput: 0: 42787.2. Samples: 15788756060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:38,390][15132] Avg episode reward: [(0, '0.795')] [2024-06-25 20:48:42,310][15401] Updated weights for policy 0, policy_version 963674 (0.0043) [2024-06-25 20:48:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42599.3, 300 sec: 42765.0). Total num frames: 15788900352. Throughput: 0: 42778.9. Samples: 15789018580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:43,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 20:48:45,418][15401] Updated weights for policy 0, policy_version 963684 (0.0043) [2024-06-25 20:48:48,396][15132] Fps is (10 sec: 39296.2, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 15789096960. Throughput: 0: 42698.9. Samples: 15789267960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:48,397][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 20:48:50,030][15401] Updated weights for policy 0, policy_version 963694 (0.0036) [2024-06-25 20:48:52,944][15401] Updated weights for policy 0, policy_version 963704 (0.0026) [2024-06-25 20:48:53,392][15132] Fps is (10 sec: 44226.7, 60 sec: 43415.9, 300 sec: 42820.2). Total num frames: 15789342720. Throughput: 0: 42742.6. Samples: 15789394100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:53,392][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 20:48:57,827][15401] Updated weights for policy 0, policy_version 963714 (0.0023) [2024-06-25 20:48:58,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 15789539328. Throughput: 0: 42697.4. Samples: 15789655940. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:48:58,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 20:49:00,997][15401] Updated weights for policy 0, policy_version 963724 (0.0028) [2024-06-25 20:49:03,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15789752320. Throughput: 0: 42680.4. Samples: 15789909860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:03,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 20:49:05,327][15401] Updated weights for policy 0, policy_version 963734 (0.0035) [2024-06-25 20:49:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15789965312. Throughput: 0: 42664.0. Samples: 15790037940. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 20:49:08,570][15401] Updated weights for policy 0, policy_version 963744 (0.0030) [2024-06-25 20:49:12,887][15349] Signal inference workers to stop experience collection... (233600 times) [2024-06-25 20:49:12,892][15349] Signal inference workers to resume experience collection... (233600 times) [2024-06-25 20:49:12,908][15401] Updated weights for policy 0, policy_version 963754 (0.0041) [2024-06-25 20:49:12,936][15401] InferenceWorker_p0-w0: stopping experience collection (233600 times) [2024-06-25 20:49:12,937][15401] InferenceWorker_p0-w0: resuming experience collection (233600 times) [2024-06-25 20:49:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15790161920. Throughput: 0: 42664.8. Samples: 15790294800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:13,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 20:49:16,700][15401] Updated weights for policy 0, policy_version 963764 (0.0031) [2024-06-25 20:49:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 15790374912. Throughput: 0: 42465.4. Samples: 15790542000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:18,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 20:49:20,711][15401] Updated weights for policy 0, policy_version 963774 (0.0031) [2024-06-25 20:49:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15790587904. Throughput: 0: 42556.9. Samples: 15790671120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:23,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 20:49:24,325][15401] Updated weights for policy 0, policy_version 963784 (0.0055) [2024-06-25 20:49:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15790784512. Throughput: 0: 42496.1. Samples: 15790930900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:28,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 20:49:28,407][15401] Updated weights for policy 0, policy_version 963794 (0.0030) [2024-06-25 20:49:31,857][15401] Updated weights for policy 0, policy_version 963804 (0.0037) [2024-06-25 20:49:33,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15791030272. Throughput: 0: 42529.5. Samples: 15791181520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:33,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 20:49:35,887][15401] Updated weights for policy 0, policy_version 963814 (0.0022) [2024-06-25 20:49:38,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 15791226880. Throughput: 0: 42780.5. Samples: 15791319120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:38,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 20:49:39,347][15401] Updated weights for policy 0, policy_version 963824 (0.0029) [2024-06-25 20:49:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15791439872. Throughput: 0: 42613.4. Samples: 15791573540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 20:49:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963834_15791456256.pth... [2024-06-25 20:49:43,491][15401] Updated weights for policy 0, policy_version 963834 (0.0025) [2024-06-25 20:49:43,536][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963208_15781199872.pth [2024-06-25 20:49:46,984][15401] Updated weights for policy 0, policy_version 963844 (0.0035) [2024-06-25 20:49:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 15791669248. Throughput: 0: 42617.9. Samples: 15791827660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:48,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 20:49:51,467][15401] Updated weights for policy 0, policy_version 963854 (0.0032) [2024-06-25 20:49:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 15791882240. Throughput: 0: 42665.2. Samples: 15791957880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:53,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 20:49:54,519][15401] Updated weights for policy 0, policy_version 963864 (0.0035) [2024-06-25 20:49:58,390][15132] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 15792062464. Throughput: 0: 42487.1. Samples: 15792206720. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:49:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 20:49:58,956][15401] Updated weights for policy 0, policy_version 963874 (0.0035) [2024-06-25 20:50:02,222][15401] Updated weights for policy 0, policy_version 963884 (0.0031) [2024-06-25 20:50:03,390][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 15792308224. Throughput: 0: 42864.0. Samples: 15792470880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:50:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 20:50:06,605][15401] Updated weights for policy 0, policy_version 963894 (0.0039) [2024-06-25 20:50:08,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15792504832. Throughput: 0: 42935.1. Samples: 15792603200. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-25 20:50:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 20:50:10,261][15401] Updated weights for policy 0, policy_version 963904 (0.0036) [2024-06-25 20:50:13,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15792717824. Throughput: 0: 42610.2. Samples: 15792848360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:13,399][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 20:50:14,316][15401] Updated weights for policy 0, policy_version 963914 (0.0032) [2024-06-25 20:50:18,225][15401] Updated weights for policy 0, policy_version 963924 (0.0039) [2024-06-25 20:50:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15792947200. Throughput: 0: 42788.0. Samples: 15793106980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:18,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 20:50:21,985][15401] Updated weights for policy 0, policy_version 963934 (0.0035) [2024-06-25 20:50:23,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15793143808. Throughput: 0: 42551.9. Samples: 15793233960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:23,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 20:50:25,849][15401] Updated weights for policy 0, policy_version 963944 (0.0036) [2024-06-25 20:50:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15793356800. Throughput: 0: 42496.3. Samples: 15793485880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 20:50:29,901][15401] Updated weights for policy 0, policy_version 963954 (0.0039) [2024-06-25 20:50:33,386][15401] Updated weights for policy 0, policy_version 963964 (0.0033) [2024-06-25 20:50:33,389][15132] Fps is (10 sec: 44238.0, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 15793586176. Throughput: 0: 42441.4. Samples: 15793737520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:33,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 20:50:37,417][15349] Signal inference workers to stop experience collection... (233650 times) [2024-06-25 20:50:37,472][15401] InferenceWorker_p0-w0: stopping experience collection (233650 times) [2024-06-25 20:50:37,480][15349] Signal inference workers to resume experience collection... (233650 times) [2024-06-25 20:50:37,485][15401] InferenceWorker_p0-w0: resuming experience collection (233650 times) [2024-06-25 20:50:37,492][15401] Updated weights for policy 0, policy_version 963974 (0.0050) [2024-06-25 20:50:38,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15793782784. Throughput: 0: 42484.2. Samples: 15793869660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 20:50:40,834][15401] Updated weights for policy 0, policy_version 963984 (0.0046) [2024-06-25 20:50:43,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15794012160. Throughput: 0: 42641.8. Samples: 15794125600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:43,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 20:50:45,078][15401] Updated weights for policy 0, policy_version 963994 (0.0031) [2024-06-25 20:50:48,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 15794225152. Throughput: 0: 42393.7. Samples: 15794378600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:48,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 20:50:48,547][15401] Updated weights for policy 0, policy_version 964004 (0.0034) [2024-06-25 20:50:52,730][15401] Updated weights for policy 0, policy_version 964014 (0.0030) [2024-06-25 20:50:53,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.1). Total num frames: 15794421760. Throughput: 0: 42335.6. Samples: 15794508300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:53,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 20:50:56,328][15401] Updated weights for policy 0, policy_version 964024 (0.0032) [2024-06-25 20:50:58,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 15794634752. Throughput: 0: 42600.5. Samples: 15794765380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:50:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 20:51:00,326][15401] Updated weights for policy 0, policy_version 964034 (0.0044) [2024-06-25 20:51:03,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15794864128. Throughput: 0: 42425.4. Samples: 15795016120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:03,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 20:51:04,469][15401] Updated weights for policy 0, policy_version 964044 (0.0043) [2024-06-25 20:51:07,996][15401] Updated weights for policy 0, policy_version 964054 (0.0039) [2024-06-25 20:51:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15795077120. Throughput: 0: 42538.8. Samples: 15795148200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:08,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 20:51:12,088][15401] Updated weights for policy 0, policy_version 964064 (0.0033) [2024-06-25 20:51:13,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15795257344. Throughput: 0: 42537.0. Samples: 15795400040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:13,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 20:51:16,051][15401] Updated weights for policy 0, policy_version 964074 (0.0032) [2024-06-25 20:51:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15795486720. Throughput: 0: 42510.1. Samples: 15795650480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:18,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 20:51:19,666][15401] Updated weights for policy 0, policy_version 964084 (0.0032) [2024-06-25 20:51:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 15795699712. Throughput: 0: 42538.7. Samples: 15795783900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:23,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 20:51:23,731][15401] Updated weights for policy 0, policy_version 964094 (0.0035) [2024-06-25 20:51:27,230][15401] Updated weights for policy 0, policy_version 964104 (0.0027) [2024-06-25 20:51:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15795912704. Throughput: 0: 42388.0. Samples: 15796033060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:28,390][15132] Avg episode reward: [(0, '0.850')] [2024-06-25 20:51:31,378][15401] Updated weights for policy 0, policy_version 964114 (0.0036) [2024-06-25 20:51:33,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15796125696. Throughput: 0: 42399.7. Samples: 15796286580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:33,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 20:51:34,927][15401] Updated weights for policy 0, policy_version 964124 (0.0031) [2024-06-25 20:51:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15796338688. Throughput: 0: 42346.7. Samples: 15796413900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 20:51:38,963][15401] Updated weights for policy 0, policy_version 964134 (0.0047) [2024-06-25 20:51:42,606][15401] Updated weights for policy 0, policy_version 964144 (0.0029) [2024-06-25 20:51:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15796551680. Throughput: 0: 42401.3. Samples: 15796673440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:43,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 20:51:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964146_15796568064.pth... [2024-06-25 20:51:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963522_15786344448.pth [2024-06-25 20:51:46,765][15401] Updated weights for policy 0, policy_version 964154 (0.0028) [2024-06-25 20:51:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15796764672. Throughput: 0: 42471.9. Samples: 15796927360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:48,390][15132] Avg episode reward: [(0, '0.315')] [2024-06-25 20:51:50,346][15401] Updated weights for policy 0, policy_version 964164 (0.0028) [2024-06-25 20:51:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15796977664. Throughput: 0: 42256.3. Samples: 15797049740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:53,390][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 20:51:54,562][15401] Updated weights for policy 0, policy_version 964174 (0.0038) [2024-06-25 20:51:57,362][15349] Signal inference workers to stop experience collection... (233700 times) [2024-06-25 20:51:57,363][15349] Signal inference workers to resume experience collection... (233700 times) [2024-06-25 20:51:57,415][15401] InferenceWorker_p0-w0: stopping experience collection (233700 times) [2024-06-25 20:51:57,415][15401] InferenceWorker_p0-w0: resuming experience collection (233700 times) [2024-06-25 20:51:58,341][15401] Updated weights for policy 0, policy_version 964184 (0.0034) [2024-06-25 20:51:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15797190656. Throughput: 0: 42391.1. Samples: 15797307640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:51:58,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 20:52:02,359][15401] Updated weights for policy 0, policy_version 964194 (0.0036) [2024-06-25 20:52:03,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15797387264. Throughput: 0: 42571.7. Samples: 15797566200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-25 20:52:03,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 20:52:05,885][15401] Updated weights for policy 0, policy_version 964204 (0.0050) [2024-06-25 20:52:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15797616640. Throughput: 0: 42239.1. Samples: 15797684660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:08,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 20:52:10,147][15401] Updated weights for policy 0, policy_version 964214 (0.0031) [2024-06-25 20:52:13,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15797829632. Throughput: 0: 42492.8. Samples: 15797945240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:13,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 20:52:13,562][15401] Updated weights for policy 0, policy_version 964224 (0.0041) [2024-06-25 20:52:17,775][15401] Updated weights for policy 0, policy_version 964234 (0.0036) [2024-06-25 20:52:18,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15798042624. Throughput: 0: 42607.9. Samples: 15798203940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 20:52:21,149][15401] Updated weights for policy 0, policy_version 964244 (0.0033) [2024-06-25 20:52:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15798255616. Throughput: 0: 42516.4. Samples: 15798327140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:23,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 20:52:25,612][15401] Updated weights for policy 0, policy_version 964254 (0.0037) [2024-06-25 20:52:28,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15798468608. Throughput: 0: 42408.5. Samples: 15798581820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:28,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 20:52:28,903][15401] Updated weights for policy 0, policy_version 964264 (0.0040) [2024-06-25 20:52:33,146][15401] Updated weights for policy 0, policy_version 964274 (0.0032) [2024-06-25 20:52:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15798665216. Throughput: 0: 42640.0. Samples: 15798846160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:33,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 20:52:36,862][15401] Updated weights for policy 0, policy_version 964284 (0.0034) [2024-06-25 20:52:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42543.1). Total num frames: 15798894592. Throughput: 0: 42647.3. Samples: 15798968860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:38,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 20:52:41,142][15401] Updated weights for policy 0, policy_version 964294 (0.0043) [2024-06-25 20:52:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15799091200. Throughput: 0: 42491.0. Samples: 15799219740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:43,391][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 20:52:44,614][15401] Updated weights for policy 0, policy_version 964304 (0.0029) [2024-06-25 20:52:48,392][15132] Fps is (10 sec: 40949.9, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 15799304192. Throughput: 0: 42635.4. Samples: 15799484900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:48,393][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 20:52:48,652][15401] Updated weights for policy 0, policy_version 964314 (0.0030) [2024-06-25 20:52:52,182][15401] Updated weights for policy 0, policy_version 964324 (0.0040) [2024-06-25 20:52:53,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15799549952. Throughput: 0: 42897.7. Samples: 15799615060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:53,390][15132] Avg episode reward: [(0, '0.681')] [2024-06-25 20:52:56,119][15401] Updated weights for policy 0, policy_version 964334 (0.0028) [2024-06-25 20:52:58,390][15132] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15799746560. Throughput: 0: 42790.8. Samples: 15799870820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:52:58,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 20:52:59,852][15401] Updated weights for policy 0, policy_version 964344 (0.0032) [2024-06-25 20:53:03,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15799959552. Throughput: 0: 42703.6. Samples: 15800125600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:03,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 20:53:03,670][15401] Updated weights for policy 0, policy_version 964354 (0.0032) [2024-06-25 20:53:07,538][15401] Updated weights for policy 0, policy_version 964364 (0.0031) [2024-06-25 20:53:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15800172544. Throughput: 0: 42844.4. Samples: 15800255140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:08,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 20:53:11,263][15401] Updated weights for policy 0, policy_version 964374 (0.0041) [2024-06-25 20:53:13,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42543.1). Total num frames: 15800385536. Throughput: 0: 42785.3. Samples: 15800507160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:13,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 20:53:15,387][15401] Updated weights for policy 0, policy_version 964384 (0.0035) [2024-06-25 20:53:18,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15800598528. Throughput: 0: 42613.8. Samples: 15800763780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:18,390][15132] Avg episode reward: [(0, '0.346')] [2024-06-25 20:53:18,813][15401] Updated weights for policy 0, policy_version 964394 (0.0035) [2024-06-25 20:53:22,923][15401] Updated weights for policy 0, policy_version 964404 (0.0027) [2024-06-25 20:53:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15800795136. Throughput: 0: 42745.2. Samples: 15800892400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:23,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 20:53:26,575][15401] Updated weights for policy 0, policy_version 964414 (0.0034) [2024-06-25 20:53:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15801024512. Throughput: 0: 42766.6. Samples: 15801144240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:28,390][15132] Avg episode reward: [(0, '0.530')] [2024-06-25 20:53:30,895][15401] Updated weights for policy 0, policy_version 964424 (0.0041) [2024-06-25 20:53:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 15801237504. Throughput: 0: 42537.3. Samples: 15801399080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:33,393][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 20:53:33,978][15349] Signal inference workers to stop experience collection... (233750 times) [2024-06-25 20:53:34,001][15401] InferenceWorker_p0-w0: stopping experience collection (233750 times) [2024-06-25 20:53:34,042][15349] Signal inference workers to resume experience collection... (233750 times) [2024-06-25 20:53:34,042][15401] InferenceWorker_p0-w0: resuming experience collection (233750 times) [2024-06-25 20:53:34,377][15401] Updated weights for policy 0, policy_version 964434 (0.0033) [2024-06-25 20:53:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 15801434112. Throughput: 0: 42504.7. Samples: 15801527780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:38,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 20:53:38,687][15401] Updated weights for policy 0, policy_version 964444 (0.0031) [2024-06-25 20:53:42,018][15401] Updated weights for policy 0, policy_version 964454 (0.0036) [2024-06-25 20:53:43,392][15132] Fps is (10 sec: 42598.6, 60 sec: 42869.8, 300 sec: 42599.0). Total num frames: 15801663488. Throughput: 0: 42341.4. Samples: 15801776280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:43,392][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 20:53:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964457_15801663488.pth... [2024-06-25 20:53:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000963834_15791456256.pth [2024-06-25 20:53:46,305][15401] Updated weights for policy 0, policy_version 964464 (0.0029) [2024-06-25 20:53:48,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 15801860096. Throughput: 0: 42461.3. Samples: 15802036460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:48,392][15132] Avg episode reward: [(0, '0.403')] [2024-06-25 20:53:49,724][15401] Updated weights for policy 0, policy_version 964474 (0.0028) [2024-06-25 20:53:53,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15802073088. Throughput: 0: 42377.3. Samples: 15802162120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:53,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 20:53:53,889][15401] Updated weights for policy 0, policy_version 964484 (0.0038) [2024-06-25 20:53:57,430][15401] Updated weights for policy 0, policy_version 964494 (0.0031) [2024-06-25 20:53:58,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15802302464. Throughput: 0: 42472.0. Samples: 15802418400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:53:58,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 20:54:01,256][15401] Updated weights for policy 0, policy_version 964504 (0.0022) [2024-06-25 20:54:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15802499072. Throughput: 0: 42706.3. Samples: 15802685560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 20:54:03,390][15132] Avg episode reward: [(0, '0.828')] [2024-06-25 20:54:04,890][15401] Updated weights for policy 0, policy_version 964514 (0.0033) [2024-06-25 20:54:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15802728448. Throughput: 0: 42614.1. Samples: 15802810040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:08,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 20:54:08,781][15401] Updated weights for policy 0, policy_version 964524 (0.0044) [2024-06-25 20:54:12,421][15401] Updated weights for policy 0, policy_version 964534 (0.0036) [2024-06-25 20:54:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15802957824. Throughput: 0: 42721.8. Samples: 15803066720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 20:54:16,304][15401] Updated weights for policy 0, policy_version 964544 (0.0036) [2024-06-25 20:54:18,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15803154432. Throughput: 0: 42923.2. Samples: 15803330520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:18,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 20:54:19,962][15401] Updated weights for policy 0, policy_version 964554 (0.0035) [2024-06-25 20:54:23,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15803367424. Throughput: 0: 42714.8. Samples: 15803449940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:23,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 20:54:24,073][15401] Updated weights for policy 0, policy_version 964564 (0.0032) [2024-06-25 20:54:27,892][15401] Updated weights for policy 0, policy_version 964574 (0.0037) [2024-06-25 20:54:28,390][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15803613184. Throughput: 0: 43011.6. Samples: 15803711700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 20:54:32,164][15401] Updated weights for policy 0, policy_version 964584 (0.0034) [2024-06-25 20:54:33,396][15132] Fps is (10 sec: 42571.3, 60 sec: 42595.6, 300 sec: 42597.5). Total num frames: 15803793408. Throughput: 0: 42964.2. Samples: 15803970020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:33,396][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 20:54:35,564][15401] Updated weights for policy 0, policy_version 964594 (0.0043) [2024-06-25 20:54:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15804022784. Throughput: 0: 42780.0. Samples: 15804087220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:38,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 20:54:39,708][15401] Updated weights for policy 0, policy_version 964604 (0.0040) [2024-06-25 20:54:43,225][15401] Updated weights for policy 0, policy_version 964614 (0.0036) [2024-06-25 20:54:43,390][15132] Fps is (10 sec: 45904.0, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 15804252160. Throughput: 0: 42890.1. Samples: 15804348460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:43,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 20:54:47,518][15401] Updated weights for policy 0, policy_version 964624 (0.0031) [2024-06-25 20:54:48,396][15132] Fps is (10 sec: 40933.8, 60 sec: 42868.6, 300 sec: 42542.0). Total num frames: 15804432384. Throughput: 0: 42707.7. Samples: 15804607680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:48,396][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 20:54:49,338][15349] Signal inference workers to stop experience collection... (233800 times) [2024-06-25 20:54:49,376][15401] InferenceWorker_p0-w0: stopping experience collection (233800 times) [2024-06-25 20:54:49,386][15349] Signal inference workers to resume experience collection... (233800 times) [2024-06-25 20:54:49,392][15401] InferenceWorker_p0-w0: resuming experience collection (233800 times) [2024-06-25 20:54:50,906][15401] Updated weights for policy 0, policy_version 964634 (0.0042) [2024-06-25 20:54:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15804661760. Throughput: 0: 42672.0. Samples: 15804730280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:53,390][15132] Avg episode reward: [(0, '0.503')] [2024-06-25 20:54:54,963][15401] Updated weights for policy 0, policy_version 964644 (0.0031) [2024-06-25 20:54:58,389][15132] Fps is (10 sec: 44265.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15804874752. Throughput: 0: 42864.1. Samples: 15804995600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:54:58,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:54:58,448][15401] Updated weights for policy 0, policy_version 964654 (0.0036) [2024-06-25 20:55:02,456][15401] Updated weights for policy 0, policy_version 964664 (0.0034) [2024-06-25 20:55:03,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15805071360. Throughput: 0: 42775.1. Samples: 15805255400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:03,403][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 20:55:06,088][15401] Updated weights for policy 0, policy_version 964674 (0.0046) [2024-06-25 20:55:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 15805317120. Throughput: 0: 42899.6. Samples: 15805380520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:08,392][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 20:55:10,180][15401] Updated weights for policy 0, policy_version 964684 (0.0037) [2024-06-25 20:55:13,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15805530112. Throughput: 0: 42796.9. Samples: 15805637560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:13,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 20:55:13,758][15401] Updated weights for policy 0, policy_version 964694 (0.0035) [2024-06-25 20:55:18,139][15401] Updated weights for policy 0, policy_version 964704 (0.0037) [2024-06-25 20:55:18,390][15132] Fps is (10 sec: 40969.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15805726720. Throughput: 0: 42774.0. Samples: 15805894580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:18,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 20:55:21,250][15401] Updated weights for policy 0, policy_version 964714 (0.0031) [2024-06-25 20:55:23,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15805956096. Throughput: 0: 42920.0. Samples: 15806018620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:23,395][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 20:55:25,497][15401] Updated weights for policy 0, policy_version 964724 (0.0033) [2024-06-25 20:55:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15806169088. Throughput: 0: 43034.4. Samples: 15806285000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:28,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 20:55:28,676][15401] Updated weights for policy 0, policy_version 964734 (0.0030) [2024-06-25 20:55:33,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42603.0, 300 sec: 42598.4). Total num frames: 15806349312. Throughput: 0: 43068.9. Samples: 15806545500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:33,390][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 20:55:33,547][15401] Updated weights for policy 0, policy_version 964744 (0.0036) [2024-06-25 20:55:36,486][15401] Updated weights for policy 0, policy_version 964754 (0.0050) [2024-06-25 20:55:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15806595072. Throughput: 0: 43061.9. Samples: 15806668060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:38,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 20:55:40,938][15401] Updated weights for policy 0, policy_version 964764 (0.0036) [2024-06-25 20:55:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15806824448. Throughput: 0: 42817.3. Samples: 15806922380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:43,390][15132] Avg episode reward: [(0, '0.632')] [2024-06-25 20:55:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964772_15806824448.pth... [2024-06-25 20:55:43,484][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964146_15796568064.pth [2024-06-25 20:55:44,528][15401] Updated weights for policy 0, policy_version 964774 (0.0029) [2024-06-25 20:55:48,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42876.0, 300 sec: 42653.9). Total num frames: 15807004672. Throughput: 0: 42904.4. Samples: 15807186100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:48,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 20:55:48,678][15401] Updated weights for policy 0, policy_version 964784 (0.0033) [2024-06-25 20:55:51,924][15401] Updated weights for policy 0, policy_version 964794 (0.0039) [2024-06-25 20:55:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15807250432. Throughput: 0: 42999.5. Samples: 15807315400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 20:55:56,112][15401] Updated weights for policy 0, policy_version 964804 (0.0025) [2024-06-25 20:55:58,389][15132] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15807463424. Throughput: 0: 43120.0. Samples: 15807577960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:55:58,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 20:55:59,576][15401] Updated weights for policy 0, policy_version 964814 (0.0035) [2024-06-25 20:56:03,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15807643648. Throughput: 0: 43280.9. Samples: 15807842220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-25 20:56:03,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 20:56:03,903][15401] Updated weights for policy 0, policy_version 964824 (0.0022) [2024-06-25 20:56:07,149][15401] Updated weights for policy 0, policy_version 964834 (0.0036) [2024-06-25 20:56:08,043][15349] Signal inference workers to stop experience collection... (233850 times) [2024-06-25 20:56:08,076][15401] InferenceWorker_p0-w0: stopping experience collection (233850 times) [2024-06-25 20:56:08,108][15349] Signal inference workers to resume experience collection... (233850 times) [2024-06-25 20:56:08,108][15401] InferenceWorker_p0-w0: resuming experience collection (233850 times) [2024-06-25 20:56:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 15807905792. Throughput: 0: 43220.4. Samples: 15807963540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:08,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 20:56:11,366][15401] Updated weights for policy 0, policy_version 964844 (0.0029) [2024-06-25 20:56:13,396][15132] Fps is (10 sec: 45846.4, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 15808102400. Throughput: 0: 43037.4. Samples: 15808221960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:13,396][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 20:56:14,753][15401] Updated weights for policy 0, policy_version 964854 (0.0054) [2024-06-25 20:56:18,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15808282624. Throughput: 0: 43148.9. Samples: 15808487200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:18,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 20:56:18,885][15401] Updated weights for policy 0, policy_version 964864 (0.0021) [2024-06-25 20:56:22,392][15401] Updated weights for policy 0, policy_version 964874 (0.0030) [2024-06-25 20:56:23,390][15132] Fps is (10 sec: 45904.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15808561152. Throughput: 0: 43233.2. Samples: 15808613560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:23,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 20:56:26,394][15401] Updated weights for policy 0, policy_version 964884 (0.0029) [2024-06-25 20:56:28,389][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15808741376. Throughput: 0: 43370.3. Samples: 15808874040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:28,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 20:56:29,901][15401] Updated weights for policy 0, policy_version 964894 (0.0027) [2024-06-25 20:56:33,389][15132] Fps is (10 sec: 39322.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15808954368. Throughput: 0: 43228.2. Samples: 15809131360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:33,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 20:56:33,797][15401] Updated weights for policy 0, policy_version 964904 (0.0036) [2024-06-25 20:56:37,715][15401] Updated weights for policy 0, policy_version 964914 (0.0037) [2024-06-25 20:56:38,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15809183744. Throughput: 0: 43206.4. Samples: 15809259680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:38,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 20:56:41,425][15401] Updated weights for policy 0, policy_version 964924 (0.0033) [2024-06-25 20:56:43,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15809380352. Throughput: 0: 43003.6. Samples: 15809513120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:43,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 20:56:45,383][15401] Updated weights for policy 0, policy_version 964934 (0.0027) [2024-06-25 20:56:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43417.8, 300 sec: 42820.6). Total num frames: 15809609728. Throughput: 0: 42910.9. Samples: 15809773200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:48,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 20:56:49,028][15401] Updated weights for policy 0, policy_version 964944 (0.0029) [2024-06-25 20:56:52,956][15401] Updated weights for policy 0, policy_version 964954 (0.0036) [2024-06-25 20:56:53,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15809839104. Throughput: 0: 43057.8. Samples: 15809901140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 20:56:56,629][15401] Updated weights for policy 0, policy_version 964964 (0.0041) [2024-06-25 20:56:58,389][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15810019328. Throughput: 0: 42952.8. Samples: 15810154560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:56:58,390][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 20:57:00,299][15401] Updated weights for policy 0, policy_version 964974 (0.0040) [2024-06-25 20:57:03,390][15132] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15810248704. Throughput: 0: 42907.5. Samples: 15810418040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 20:57:04,231][15401] Updated weights for policy 0, policy_version 964984 (0.0027) [2024-06-25 20:57:07,973][15401] Updated weights for policy 0, policy_version 964994 (0.0040) [2024-06-25 20:57:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15810478080. Throughput: 0: 43072.9. Samples: 15810551840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 20:57:11,927][15401] Updated weights for policy 0, policy_version 965004 (0.0056) [2024-06-25 20:57:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 15810658304. Throughput: 0: 42833.7. Samples: 15810801560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:13,390][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 20:57:15,546][15401] Updated weights for policy 0, policy_version 965014 (0.0038) [2024-06-25 20:57:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 15810904064. Throughput: 0: 42859.9. Samples: 15811060060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:18,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 20:57:19,531][15401] Updated weights for policy 0, policy_version 965024 (0.0030) [2024-06-25 20:57:23,269][15349] Signal inference workers to stop experience collection... (233900 times) [2024-06-25 20:57:23,271][15349] Signal inference workers to resume experience collection... (233900 times) [2024-06-25 20:57:23,281][15401] InferenceWorker_p0-w0: stopping experience collection (233900 times) [2024-06-25 20:57:23,284][15401] Updated weights for policy 0, policy_version 965034 (0.0035) [2024-06-25 20:57:23,307][15401] InferenceWorker_p0-w0: resuming experience collection (233900 times) [2024-06-25 20:57:23,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15811117056. Throughput: 0: 43115.5. Samples: 15811199880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:23,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 20:57:27,209][15401] Updated weights for policy 0, policy_version 965044 (0.0041) [2024-06-25 20:57:28,396][15132] Fps is (10 sec: 40933.6, 60 sec: 42866.9, 300 sec: 42875.2). Total num frames: 15811313664. Throughput: 0: 43146.2. Samples: 15811454980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:28,396][15132] Avg episode reward: [(0, '0.390')] [2024-06-25 20:57:31,011][15401] Updated weights for policy 0, policy_version 965054 (0.0035) [2024-06-25 20:57:33,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43415.8, 300 sec: 42931.3). Total num frames: 15811559424. Throughput: 0: 42921.6. Samples: 15811704780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:33,392][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 20:57:34,948][15401] Updated weights for policy 0, policy_version 965064 (0.0036) [2024-06-25 20:57:38,389][15132] Fps is (10 sec: 44265.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15811756032. Throughput: 0: 43206.8. Samples: 15811845440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:38,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 20:57:38,597][15401] Updated weights for policy 0, policy_version 965074 (0.0026) [2024-06-25 20:57:42,448][15401] Updated weights for policy 0, policy_version 965084 (0.0028) [2024-06-25 20:57:43,389][15132] Fps is (10 sec: 39330.8, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 15811952640. Throughput: 0: 43172.4. Samples: 15812097320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:43,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 20:57:43,413][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965086_15811969024.pth... [2024-06-25 20:57:43,469][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964457_15801663488.pth [2024-06-25 20:57:46,296][15401] Updated weights for policy 0, policy_version 965094 (0.0041) [2024-06-25 20:57:48,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15812198400. Throughput: 0: 42957.5. Samples: 15812351120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 20:57:49,995][15401] Updated weights for policy 0, policy_version 965104 (0.0035) [2024-06-25 20:57:53,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15812378624. Throughput: 0: 42900.9. Samples: 15812482380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:53,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 20:57:53,891][15401] Updated weights for policy 0, policy_version 965114 (0.0034) [2024-06-25 20:57:57,471][15401] Updated weights for policy 0, policy_version 965124 (0.0028) [2024-06-25 20:57:58,389][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15812608000. Throughput: 0: 42972.0. Samples: 15812735300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 20:57:58,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 20:58:01,531][15401] Updated weights for policy 0, policy_version 965134 (0.0027) [2024-06-25 20:58:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15812837376. Throughput: 0: 42838.1. Samples: 15812987780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:03,390][15132] Avg episode reward: [(0, '0.798')] [2024-06-25 20:58:05,363][15401] Updated weights for policy 0, policy_version 965144 (0.0029) [2024-06-25 20:58:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 15813033984. Throughput: 0: 42604.9. Samples: 15813117100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:08,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 20:58:09,071][15401] Updated weights for policy 0, policy_version 965154 (0.0038) [2024-06-25 20:58:13,215][15401] Updated weights for policy 0, policy_version 965164 (0.0043) [2024-06-25 20:58:13,390][15132] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15813263360. Throughput: 0: 42645.5. Samples: 15813373760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:13,391][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 20:58:16,858][15401] Updated weights for policy 0, policy_version 965174 (0.0027) [2024-06-25 20:58:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15813476352. Throughput: 0: 42850.8. Samples: 15813632960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 20:58:20,722][15401] Updated weights for policy 0, policy_version 965184 (0.0044) [2024-06-25 20:58:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15813672960. Throughput: 0: 42636.0. Samples: 15813764060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:23,390][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 20:58:24,329][15401] Updated weights for policy 0, policy_version 965194 (0.0030) [2024-06-25 20:58:28,229][15401] Updated weights for policy 0, policy_version 965204 (0.0032) [2024-06-25 20:58:28,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43149.1, 300 sec: 42932.0). Total num frames: 15813902336. Throughput: 0: 42684.4. Samples: 15814018120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:28,390][15132] Avg episode reward: [(0, '0.172')] [2024-06-25 20:58:32,018][15401] Updated weights for policy 0, policy_version 965214 (0.0047) [2024-06-25 20:58:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42600.1, 300 sec: 42987.2). Total num frames: 15814115328. Throughput: 0: 42810.1. Samples: 15814277580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:33,390][15132] Avg episode reward: [(0, '0.117')] [2024-06-25 20:58:35,803][15401] Updated weights for policy 0, policy_version 965224 (0.0038) [2024-06-25 20:58:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42932.0). Total num frames: 15814328320. Throughput: 0: 42784.5. Samples: 15814407680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:38,390][15132] Avg episode reward: [(0, '0.170')] [2024-06-25 20:58:39,682][15401] Updated weights for policy 0, policy_version 965234 (0.0026) [2024-06-25 20:58:43,371][15401] Updated weights for policy 0, policy_version 965244 (0.0032) [2024-06-25 20:58:43,389][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 43043.1). Total num frames: 15814557696. Throughput: 0: 42877.8. Samples: 15814664800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:43,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 20:58:47,471][15401] Updated weights for policy 0, policy_version 965254 (0.0036) [2024-06-25 20:58:48,390][15132] Fps is (10 sec: 42597.3, 60 sec: 42598.2, 300 sec: 42987.1). Total num frames: 15814754304. Throughput: 0: 42937.6. Samples: 15814919980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:48,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 20:58:51,081][15401] Updated weights for policy 0, policy_version 965264 (0.0040) [2024-06-25 20:58:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 15814983680. Throughput: 0: 42930.1. Samples: 15815048960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:53,390][15132] Avg episode reward: [(0, '0.755')] [2024-06-25 20:58:54,943][15401] Updated weights for policy 0, policy_version 965274 (0.0028) [2024-06-25 20:58:56,433][15349] Signal inference workers to stop experience collection... (233950 times) [2024-06-25 20:58:56,489][15401] InferenceWorker_p0-w0: stopping experience collection (233950 times) [2024-06-25 20:58:56,490][15349] Signal inference workers to resume experience collection... (233950 times) [2024-06-25 20:58:56,502][15401] InferenceWorker_p0-w0: resuming experience collection (233950 times) [2024-06-25 20:58:58,390][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15815180288. Throughput: 0: 42942.7. Samples: 15815306180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:58:58,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 20:58:59,169][15401] Updated weights for policy 0, policy_version 965284 (0.0035) [2024-06-25 20:59:02,499][15401] Updated weights for policy 0, policy_version 965294 (0.0038) [2024-06-25 20:59:03,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42986.8). Total num frames: 15815409664. Throughput: 0: 42931.0. Samples: 15815564960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:03,392][15132] Avg episode reward: [(0, '0.302')] [2024-06-25 20:59:06,790][15401] Updated weights for policy 0, policy_version 965304 (0.0041) [2024-06-25 20:59:08,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 15815622656. Throughput: 0: 43018.7. Samples: 15815699900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:08,390][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 20:59:09,991][15401] Updated weights for policy 0, policy_version 965314 (0.0034) [2024-06-25 20:59:13,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15815819264. Throughput: 0: 42942.7. Samples: 15815950540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:13,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 20:59:14,298][15401] Updated weights for policy 0, policy_version 965324 (0.0028) [2024-06-25 20:59:17,606][15401] Updated weights for policy 0, policy_version 965334 (0.0025) [2024-06-25 20:59:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15816048640. Throughput: 0: 42900.9. Samples: 15816208120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:18,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 20:59:22,193][15401] Updated weights for policy 0, policy_version 965344 (0.0032) [2024-06-25 20:59:23,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15816261632. Throughput: 0: 42963.3. Samples: 15816341040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 20:59:25,131][15401] Updated weights for policy 0, policy_version 965354 (0.0029) [2024-06-25 20:59:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42932.6). Total num frames: 15816458240. Throughput: 0: 42912.4. Samples: 15816595860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:28,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 20:59:29,803][15401] Updated weights for policy 0, policy_version 965364 (0.0034) [2024-06-25 20:59:33,250][15401] Updated weights for policy 0, policy_version 965374 (0.0040) [2024-06-25 20:59:33,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15816687616. Throughput: 0: 42832.3. Samples: 15816847420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:33,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 20:59:37,531][15401] Updated weights for policy 0, policy_version 965384 (0.0044) [2024-06-25 20:59:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15816900608. Throughput: 0: 42860.1. Samples: 15816977660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:38,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 20:59:40,782][15401] Updated weights for policy 0, policy_version 965394 (0.0040) [2024-06-25 20:59:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42988.1). Total num frames: 15817113600. Throughput: 0: 42803.5. Samples: 15817232340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:43,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 20:59:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965400_15817113600.pth... [2024-06-25 20:59:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000964772_15806824448.pth [2024-06-25 20:59:45,590][15401] Updated weights for policy 0, policy_version 965404 (0.0036) [2024-06-25 20:59:48,364][15401] Updated weights for policy 0, policy_version 965414 (0.0037) [2024-06-25 20:59:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 15817342976. Throughput: 0: 42680.9. Samples: 15817485500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:48,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 20:59:53,061][15401] Updated weights for policy 0, policy_version 965424 (0.0033) [2024-06-25 20:59:53,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15817523200. Throughput: 0: 42630.6. Samples: 15817618280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:53,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 20:59:56,520][15401] Updated weights for policy 0, policy_version 965434 (0.0044) [2024-06-25 20:59:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15817752576. Throughput: 0: 42547.9. Samples: 15817865200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 20:59:58,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 21:00:01,131][15401] Updated weights for policy 0, policy_version 965444 (0.0032) [2024-06-25 21:00:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42600.0, 300 sec: 42876.4). Total num frames: 15817965568. Throughput: 0: 42495.5. Samples: 15818120420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:03,390][15132] Avg episode reward: [(0, '0.374')] [2024-06-25 21:00:04,109][15401] Updated weights for policy 0, policy_version 965454 (0.0038) [2024-06-25 21:00:08,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15818145792. Throughput: 0: 42481.9. Samples: 15818252720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:08,390][15132] Avg episode reward: [(0, '0.434')] [2024-06-25 21:00:08,625][15401] Updated weights for policy 0, policy_version 965464 (0.0035) [2024-06-25 21:00:11,741][15401] Updated weights for policy 0, policy_version 965474 (0.0045) [2024-06-25 21:00:13,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15818375168. Throughput: 0: 42397.8. Samples: 15818503760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:13,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 21:00:14,299][15349] Signal inference workers to stop experience collection... (234000 times) [2024-06-25 21:00:14,308][15349] Signal inference workers to resume experience collection... (234000 times) [2024-06-25 21:00:14,325][15401] InferenceWorker_p0-w0: stopping experience collection (234000 times) [2024-06-25 21:00:14,325][15401] InferenceWorker_p0-w0: resuming experience collection (234000 times) [2024-06-25 21:00:16,206][15401] Updated weights for policy 0, policy_version 965484 (0.0027) [2024-06-25 21:00:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15818604544. Throughput: 0: 42426.2. Samples: 15818756600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:18,390][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 21:00:19,543][15401] Updated weights for policy 0, policy_version 965494 (0.0024) [2024-06-25 21:00:23,396][15132] Fps is (10 sec: 42571.1, 60 sec: 42320.9, 300 sec: 42819.6). Total num frames: 15818801152. Throughput: 0: 42541.9. Samples: 15818892320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:23,397][15132] Avg episode reward: [(0, '0.741')] [2024-06-25 21:00:23,733][15401] Updated weights for policy 0, policy_version 965504 (0.0035) [2024-06-25 21:00:27,349][15401] Updated weights for policy 0, policy_version 965514 (0.0027) [2024-06-25 21:00:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15819014144. Throughput: 0: 42502.6. Samples: 15819144960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:28,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 21:00:31,795][15401] Updated weights for policy 0, policy_version 965524 (0.0028) [2024-06-25 21:00:33,389][15132] Fps is (10 sec: 45904.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15819259904. Throughput: 0: 42375.2. Samples: 15819392380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:33,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 21:00:34,959][15401] Updated weights for policy 0, policy_version 965534 (0.0025) [2024-06-25 21:00:38,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15819440128. Throughput: 0: 42451.1. Samples: 15819528580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:38,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 21:00:39,374][15401] Updated weights for policy 0, policy_version 965544 (0.0038) [2024-06-25 21:00:42,664][15401] Updated weights for policy 0, policy_version 965554 (0.0041) [2024-06-25 21:00:43,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15819669504. Throughput: 0: 42571.2. Samples: 15819780900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 21:00:47,026][15401] Updated weights for policy 0, policy_version 965564 (0.0032) [2024-06-25 21:00:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 15819882496. Throughput: 0: 42520.1. Samples: 15820033820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:48,390][15132] Avg episode reward: [(0, '0.332')] [2024-06-25 21:00:50,222][15401] Updated weights for policy 0, policy_version 965574 (0.0045) [2024-06-25 21:00:53,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15820079104. Throughput: 0: 42452.9. Samples: 15820163100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:53,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 21:00:54,623][15401] Updated weights for policy 0, policy_version 965584 (0.0028) [2024-06-25 21:00:58,177][15401] Updated weights for policy 0, policy_version 965594 (0.0031) [2024-06-25 21:00:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15820308480. Throughput: 0: 42454.2. Samples: 15820414200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:00:58,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 21:01:02,142][15401] Updated weights for policy 0, policy_version 965604 (0.0036) [2024-06-25 21:01:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15820521472. Throughput: 0: 42648.5. Samples: 15820675780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:03,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 21:01:05,795][15401] Updated weights for policy 0, policy_version 965614 (0.0036) [2024-06-25 21:01:08,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 15820718080. Throughput: 0: 42455.0. Samples: 15820802520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:08,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:01:09,677][15401] Updated weights for policy 0, policy_version 965624 (0.0039) [2024-06-25 21:01:13,298][15401] Updated weights for policy 0, policy_version 965634 (0.0040) [2024-06-25 21:01:13,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15820947456. Throughput: 0: 42604.9. Samples: 15821062180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:13,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 21:01:17,409][15401] Updated weights for policy 0, policy_version 965644 (0.0045) [2024-06-25 21:01:18,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15821160448. Throughput: 0: 42767.9. Samples: 15821316940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:18,396][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 21:01:20,990][15401] Updated weights for policy 0, policy_version 965654 (0.0029) [2024-06-25 21:01:23,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 15821357056. Throughput: 0: 42614.2. Samples: 15821446220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:23,390][15132] Avg episode reward: [(0, '0.862')] [2024-06-25 21:01:25,011][15401] Updated weights for policy 0, policy_version 965664 (0.0025) [2024-06-25 21:01:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15821586432. Throughput: 0: 42706.4. Samples: 15821702680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 21:01:28,431][15401] Updated weights for policy 0, policy_version 965674 (0.0031) [2024-06-25 21:01:30,095][15349] Signal inference workers to stop experience collection... (234050 times) [2024-06-25 21:01:30,148][15401] InferenceWorker_p0-w0: stopping experience collection (234050 times) [2024-06-25 21:01:30,149][15349] Signal inference workers to resume experience collection... (234050 times) [2024-06-25 21:01:30,157][15401] InferenceWorker_p0-w0: resuming experience collection (234050 times) [2024-06-25 21:01:32,498][15401] Updated weights for policy 0, policy_version 965684 (0.0045) [2024-06-25 21:01:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15821815808. Throughput: 0: 42889.2. Samples: 15821963840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:33,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 21:01:35,878][15401] Updated weights for policy 0, policy_version 965694 (0.0030) [2024-06-25 21:01:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15822012416. Throughput: 0: 42806.2. Samples: 15822089380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:38,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 21:01:40,358][15401] Updated weights for policy 0, policy_version 965704 (0.0028) [2024-06-25 21:01:43,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15822241792. Throughput: 0: 42950.7. Samples: 15822346980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:43,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 21:01:43,451][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965714_15822258176.pth... [2024-06-25 21:01:43,454][15401] Updated weights for policy 0, policy_version 965714 (0.0034) [2024-06-25 21:01:43,516][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965086_15811969024.pth [2024-06-25 21:01:47,890][15401] Updated weights for policy 0, policy_version 965724 (0.0038) [2024-06-25 21:01:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15822454784. Throughput: 0: 42760.3. Samples: 15822600000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:48,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 21:01:51,614][15401] Updated weights for policy 0, policy_version 965734 (0.0040) [2024-06-25 21:01:53,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15822635008. Throughput: 0: 42864.0. Samples: 15822731400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 21:01:53,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 21:01:55,621][15401] Updated weights for policy 0, policy_version 965744 (0.0038) [2024-06-25 21:01:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 15822880768. Throughput: 0: 42715.3. Samples: 15822984360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:01:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:01:59,210][15401] Updated weights for policy 0, policy_version 965754 (0.0029) [2024-06-25 21:02:03,295][15401] Updated weights for policy 0, policy_version 965764 (0.0031) [2024-06-25 21:02:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15823077376. Throughput: 0: 42864.0. Samples: 15823245820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:03,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 21:02:07,115][15401] Updated weights for policy 0, policy_version 965774 (0.0038) [2024-06-25 21:02:08,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15823290368. Throughput: 0: 42833.4. Samples: 15823373720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:08,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 21:02:10,732][15401] Updated weights for policy 0, policy_version 965784 (0.0048) [2024-06-25 21:02:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15823503360. Throughput: 0: 42806.6. Samples: 15823628980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:13,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 21:02:14,731][15401] Updated weights for policy 0, policy_version 965794 (0.0029) [2024-06-25 21:02:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15823716352. Throughput: 0: 42817.3. Samples: 15823890620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:18,390][15132] Avg episode reward: [(0, '0.539')] [2024-06-25 21:02:18,520][15401] Updated weights for policy 0, policy_version 965804 (0.0023) [2024-06-25 21:02:22,433][15401] Updated weights for policy 0, policy_version 965814 (0.0040) [2024-06-25 21:02:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42766.0). Total num frames: 15823929344. Throughput: 0: 42717.0. Samples: 15824011640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:23,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 21:02:26,291][15401] Updated weights for policy 0, policy_version 965824 (0.0032) [2024-06-25 21:02:28,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 15824158720. Throughput: 0: 42691.0. Samples: 15824268080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:28,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 21:02:29,910][15401] Updated weights for policy 0, policy_version 965834 (0.0041) [2024-06-25 21:02:33,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 15824338944. Throughput: 0: 42833.0. Samples: 15824527480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:33,390][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 21:02:34,071][15401] Updated weights for policy 0, policy_version 965844 (0.0028) [2024-06-25 21:02:37,478][15401] Updated weights for policy 0, policy_version 965854 (0.0046) [2024-06-25 21:02:38,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15824601088. Throughput: 0: 42743.1. Samples: 15824654840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:38,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 21:02:41,559][15401] Updated weights for policy 0, policy_version 965864 (0.0040) [2024-06-25 21:02:43,392][15132] Fps is (10 sec: 45863.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 15824797696. Throughput: 0: 42873.1. Samples: 15824913760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:43,392][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 21:02:45,166][15401] Updated weights for policy 0, policy_version 965874 (0.0036) [2024-06-25 21:02:48,394][15132] Fps is (10 sec: 39304.0, 60 sec: 42322.3, 300 sec: 42764.4). Total num frames: 15824994304. Throughput: 0: 42799.0. Samples: 15825171960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:48,394][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 21:02:49,282][15401] Updated weights for policy 0, policy_version 965884 (0.0025) [2024-06-25 21:02:52,901][15401] Updated weights for policy 0, policy_version 965894 (0.0039) [2024-06-25 21:02:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15825240064. Throughput: 0: 42643.7. Samples: 15825292680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:53,390][15132] Avg episode reward: [(0, '0.297')] [2024-06-25 21:02:56,842][15401] Updated weights for policy 0, policy_version 965904 (0.0033) [2024-06-25 21:02:58,390][15132] Fps is (10 sec: 44256.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15825436672. Throughput: 0: 42691.5. Samples: 15825550100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:02:58,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:03:00,452][15401] Updated weights for policy 0, policy_version 965914 (0.0022) [2024-06-25 21:03:03,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15825633280. Throughput: 0: 42748.9. Samples: 15825814320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:03,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:03:04,244][15349] Signal inference workers to stop experience collection... (234100 times) [2024-06-25 21:03:04,261][15401] InferenceWorker_p0-w0: stopping experience collection (234100 times) [2024-06-25 21:03:04,355][15349] Signal inference workers to resume experience collection... (234100 times) [2024-06-25 21:03:04,355][15401] InferenceWorker_p0-w0: resuming experience collection (234100 times) [2024-06-25 21:03:04,357][15401] Updated weights for policy 0, policy_version 965924 (0.0031) [2024-06-25 21:03:08,278][15401] Updated weights for policy 0, policy_version 965934 (0.0033) [2024-06-25 21:03:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15825862656. Throughput: 0: 42872.8. Samples: 15825940920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:08,392][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 21:03:11,906][15401] Updated weights for policy 0, policy_version 965944 (0.0036) [2024-06-25 21:03:13,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15826092032. Throughput: 0: 42827.6. Samples: 15826195320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:13,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 21:03:15,831][15401] Updated weights for policy 0, policy_version 965954 (0.0031) [2024-06-25 21:03:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15826288640. Throughput: 0: 42848.8. Samples: 15826455680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:18,390][15132] Avg episode reward: [(0, '0.472')] [2024-06-25 21:03:19,284][15401] Updated weights for policy 0, policy_version 965964 (0.0031) [2024-06-25 21:03:23,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 15826501632. Throughput: 0: 42861.2. Samples: 15826583700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:23,392][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 21:03:23,737][15401] Updated weights for policy 0, policy_version 965974 (0.0035) [2024-06-25 21:03:26,913][15401] Updated weights for policy 0, policy_version 965984 (0.0031) [2024-06-25 21:03:28,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15826731008. Throughput: 0: 42720.1. Samples: 15826836060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:28,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 21:03:31,385][15401] Updated weights for policy 0, policy_version 965994 (0.0033) [2024-06-25 21:03:33,390][15132] Fps is (10 sec: 44247.2, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15826944000. Throughput: 0: 42694.3. Samples: 15827093020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:33,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 21:03:34,822][15401] Updated weights for policy 0, policy_version 966004 (0.0036) [2024-06-25 21:03:38,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15827124224. Throughput: 0: 42904.5. Samples: 15827223380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:38,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:03:39,023][15401] Updated weights for policy 0, policy_version 966014 (0.0030) [2024-06-25 21:03:42,347][15401] Updated weights for policy 0, policy_version 966024 (0.0032) [2024-06-25 21:03:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15827369984. Throughput: 0: 42896.5. Samples: 15827480440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 21:03:43,419][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966027_15827386368.pth... [2024-06-25 21:03:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965400_15817113600.pth [2024-06-25 21:03:46,969][15401] Updated weights for policy 0, policy_version 966034 (0.0030) [2024-06-25 21:03:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43147.6, 300 sec: 42709.5). Total num frames: 15827582976. Throughput: 0: 42739.0. Samples: 15827737580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:48,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 21:03:50,492][15401] Updated weights for policy 0, policy_version 966044 (0.0023) [2024-06-25 21:03:53,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 15827763200. Throughput: 0: 42644.4. Samples: 15827859920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-25 21:03:53,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 21:03:54,553][15401] Updated weights for policy 0, policy_version 966054 (0.0039) [2024-06-25 21:03:58,005][15401] Updated weights for policy 0, policy_version 966064 (0.0043) [2024-06-25 21:03:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 15827992576. Throughput: 0: 42628.5. Samples: 15828113600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:03:58,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 21:04:02,324][15401] Updated weights for policy 0, policy_version 966074 (0.0025) [2024-06-25 21:04:03,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15828205568. Throughput: 0: 42561.8. Samples: 15828370960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:03,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 21:04:05,515][15401] Updated weights for policy 0, policy_version 966084 (0.0033) [2024-06-25 21:04:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15828402176. Throughput: 0: 42551.6. Samples: 15828498420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:08,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 21:04:10,059][15401] Updated weights for policy 0, policy_version 966094 (0.0030) [2024-06-25 21:04:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15828631552. Throughput: 0: 42599.6. Samples: 15828753040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 21:04:13,505][15401] Updated weights for policy 0, policy_version 966104 (0.0035) [2024-06-25 21:04:17,789][15401] Updated weights for policy 0, policy_version 966114 (0.0033) [2024-06-25 21:04:18,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15828828160. Throughput: 0: 42603.6. Samples: 15829010180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:18,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 21:04:21,051][15401] Updated weights for policy 0, policy_version 966124 (0.0039) [2024-06-25 21:04:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15829057536. Throughput: 0: 42368.4. Samples: 15829129960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:23,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 21:04:25,145][15349] Signal inference workers to stop experience collection... (234150 times) [2024-06-25 21:04:25,152][15349] Signal inference workers to resume experience collection... (234150 times) [2024-06-25 21:04:25,188][15401] InferenceWorker_p0-w0: stopping experience collection (234150 times) [2024-06-25 21:04:25,188][15401] InferenceWorker_p0-w0: resuming experience collection (234150 times) [2024-06-25 21:04:25,291][15401] Updated weights for policy 0, policy_version 966134 (0.0029) [2024-06-25 21:04:28,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15829286912. Throughput: 0: 42589.7. Samples: 15829396980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:28,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 21:04:28,480][15401] Updated weights for policy 0, policy_version 966144 (0.0032) [2024-06-25 21:04:32,973][15401] Updated weights for policy 0, policy_version 966154 (0.0027) [2024-06-25 21:04:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15829483520. Throughput: 0: 42748.2. Samples: 15829661240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:33,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 21:04:35,974][15401] Updated weights for policy 0, policy_version 966164 (0.0028) [2024-06-25 21:04:38,390][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15829712896. Throughput: 0: 42712.5. Samples: 15829781980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:38,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 21:04:40,747][15401] Updated weights for policy 0, policy_version 966174 (0.0037) [2024-06-25 21:04:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15829925888. Throughput: 0: 42845.3. Samples: 15830041640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:43,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 21:04:43,622][15401] Updated weights for policy 0, policy_version 966184 (0.0032) [2024-06-25 21:04:48,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15830106112. Throughput: 0: 43008.4. Samples: 15830306340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:48,392][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 21:04:48,491][15401] Updated weights for policy 0, policy_version 966194 (0.0039) [2024-06-25 21:04:51,638][15401] Updated weights for policy 0, policy_version 966204 (0.0035) [2024-06-25 21:04:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15830351872. Throughput: 0: 42682.6. Samples: 15830419140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:53,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 21:04:56,082][15401] Updated weights for policy 0, policy_version 966214 (0.0029) [2024-06-25 21:04:58,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15830564864. Throughput: 0: 42784.9. Samples: 15830678360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:04:58,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 21:04:59,210][15401] Updated weights for policy 0, policy_version 966224 (0.0034) [2024-06-25 21:05:03,390][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15830745088. Throughput: 0: 43010.6. Samples: 15830945660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:03,391][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 21:05:03,660][15401] Updated weights for policy 0, policy_version 966234 (0.0031) [2024-06-25 21:05:06,703][15401] Updated weights for policy 0, policy_version 966244 (0.0039) [2024-06-25 21:05:08,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15831007232. Throughput: 0: 42933.7. Samples: 15831061980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:08,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 21:05:11,529][15401] Updated weights for policy 0, policy_version 966254 (0.0042) [2024-06-25 21:05:13,390][15132] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15831220224. Throughput: 0: 42860.5. Samples: 15831325700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:13,392][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 21:05:14,204][15401] Updated weights for policy 0, policy_version 966264 (0.0032) [2024-06-25 21:05:18,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 15831384064. Throughput: 0: 42634.2. Samples: 15831579780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:18,390][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 21:05:19,383][15401] Updated weights for policy 0, policy_version 966274 (0.0040) [2024-06-25 21:05:22,529][15401] Updated weights for policy 0, policy_version 966284 (0.0033) [2024-06-25 21:05:23,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15831629824. Throughput: 0: 42500.4. Samples: 15831694500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 21:05:27,203][15401] Updated weights for policy 0, policy_version 966294 (0.0046) [2024-06-25 21:05:28,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 15831810048. Throughput: 0: 42508.5. Samples: 15831954520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:28,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 21:05:30,159][15401] Updated weights for policy 0, policy_version 966304 (0.0033) [2024-06-25 21:05:33,390][15132] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15832023040. Throughput: 0: 42201.4. Samples: 15832205400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:33,390][15132] Avg episode reward: [(0, '0.429')] [2024-06-25 21:05:34,773][15401] Updated weights for policy 0, policy_version 966314 (0.0043) [2024-06-25 21:05:37,823][15401] Updated weights for policy 0, policy_version 966324 (0.0034) [2024-06-25 21:05:38,389][15132] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15832285184. Throughput: 0: 42417.0. Samples: 15832327900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:38,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 21:05:42,519][15401] Updated weights for policy 0, policy_version 966334 (0.0030) [2024-06-25 21:05:43,174][15349] Signal inference workers to stop experience collection... (234200 times) [2024-06-25 21:05:43,175][15349] Signal inference workers to resume experience collection... (234200 times) [2024-06-25 21:05:43,225][15401] InferenceWorker_p0-w0: stopping experience collection (234200 times) [2024-06-25 21:05:43,226][15401] InferenceWorker_p0-w0: resuming experience collection (234200 times) [2024-06-25 21:05:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 15832449024. Throughput: 0: 42493.2. Samples: 15832590560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:43,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 21:05:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966336_15832449024.pth... [2024-06-25 21:05:43,479][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000965714_15822258176.pth [2024-06-25 21:05:45,274][15401] Updated weights for policy 0, policy_version 966344 (0.0030) [2024-06-25 21:05:48,389][15132] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15832662016. Throughput: 0: 42157.9. Samples: 15832842760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 21:05:50,664][15401] Updated weights for policy 0, policy_version 966354 (0.0050) [2024-06-25 21:05:52,866][15401] Updated weights for policy 0, policy_version 966364 (0.0023) [2024-06-25 21:05:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15832907776. Throughput: 0: 42411.5. Samples: 15832970500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-25 21:05:53,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 21:05:58,195][15401] Updated weights for policy 0, policy_version 966374 (0.0036) [2024-06-25 21:05:58,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15833088000. Throughput: 0: 42234.7. Samples: 15833226260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:05:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 21:06:00,863][15401] Updated weights for policy 0, policy_version 966384 (0.0031) [2024-06-25 21:06:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15833317376. Throughput: 0: 42251.1. Samples: 15833481080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:03,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 21:06:05,747][15401] Updated weights for policy 0, policy_version 966394 (0.0036) [2024-06-25 21:06:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15833546752. Throughput: 0: 42587.1. Samples: 15833610920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:08,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 21:06:08,514][15401] Updated weights for policy 0, policy_version 966404 (0.0039) [2024-06-25 21:06:13,289][15401] Updated weights for policy 0, policy_version 966414 (0.0037) [2024-06-25 21:06:13,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15833726976. Throughput: 0: 42516.9. Samples: 15833867780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:13,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 21:06:16,701][15401] Updated weights for policy 0, policy_version 966424 (0.0044) [2024-06-25 21:06:18,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15833956352. Throughput: 0: 42513.7. Samples: 15834118520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:18,390][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 21:06:21,035][15401] Updated weights for policy 0, policy_version 966434 (0.0033) [2024-06-25 21:06:23,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 15834185728. Throughput: 0: 42884.8. Samples: 15834257720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:23,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 21:06:24,176][15401] Updated weights for policy 0, policy_version 966444 (0.0031) [2024-06-25 21:06:28,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15834365952. Throughput: 0: 42676.4. Samples: 15834511000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:28,392][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 21:06:28,604][15401] Updated weights for policy 0, policy_version 966454 (0.0032) [2024-06-25 21:06:31,603][15401] Updated weights for policy 0, policy_version 966464 (0.0023) [2024-06-25 21:06:33,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15834595328. Throughput: 0: 42735.0. Samples: 15834765840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:33,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 21:06:36,241][15401] Updated weights for policy 0, policy_version 966474 (0.0034) [2024-06-25 21:06:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15834841088. Throughput: 0: 42850.8. Samples: 15834898780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:38,390][15132] Avg episode reward: [(0, '0.389')] [2024-06-25 21:06:39,273][15401] Updated weights for policy 0, policy_version 966484 (0.0035) [2024-06-25 21:06:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15835004928. Throughput: 0: 42778.2. Samples: 15835151280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:43,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-25 21:06:43,855][15401] Updated weights for policy 0, policy_version 966494 (0.0039) [2024-06-25 21:06:46,007][15349] Signal inference workers to stop experience collection... (234250 times) [2024-06-25 21:06:46,039][15401] InferenceWorker_p0-w0: stopping experience collection (234250 times) [2024-06-25 21:06:46,074][15349] Signal inference workers to resume experience collection... (234250 times) [2024-06-25 21:06:46,074][15401] InferenceWorker_p0-w0: resuming experience collection (234250 times) [2024-06-25 21:06:46,863][15401] Updated weights for policy 0, policy_version 966504 (0.0027) [2024-06-25 21:06:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15835234304. Throughput: 0: 42779.2. Samples: 15835406140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:48,390][15132] Avg episode reward: [(0, '0.857')] [2024-06-25 21:06:51,336][15401] Updated weights for policy 0, policy_version 966514 (0.0028) [2024-06-25 21:06:53,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15835463680. Throughput: 0: 42784.0. Samples: 15835536200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:53,390][15132] Avg episode reward: [(0, '0.438')] [2024-06-25 21:06:54,554][15401] Updated weights for policy 0, policy_version 966524 (0.0038) [2024-06-25 21:06:58,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15835660288. Throughput: 0: 42714.7. Samples: 15835789940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:06:58,390][15132] Avg episode reward: [(0, '0.393')] [2024-06-25 21:06:59,450][15401] Updated weights for policy 0, policy_version 966534 (0.0035) [2024-06-25 21:07:02,241][15401] Updated weights for policy 0, policy_version 966544 (0.0040) [2024-06-25 21:07:03,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15835889664. Throughput: 0: 42669.1. Samples: 15836038620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:03,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 21:07:07,031][15401] Updated weights for policy 0, policy_version 966554 (0.0030) [2024-06-25 21:07:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15836086272. Throughput: 0: 42531.6. Samples: 15836171640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 21:07:09,945][15401] Updated weights for policy 0, policy_version 966564 (0.0041) [2024-06-25 21:07:13,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15836299264. Throughput: 0: 42550.2. Samples: 15836425760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:13,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 21:07:14,526][15401] Updated weights for policy 0, policy_version 966574 (0.0034) [2024-06-25 21:07:17,558][15401] Updated weights for policy 0, policy_version 966584 (0.0038) [2024-06-25 21:07:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15836545024. Throughput: 0: 42459.1. Samples: 15836676500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:18,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 21:07:22,160][15401] Updated weights for policy 0, policy_version 966594 (0.0041) [2024-06-25 21:07:23,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15836708864. Throughput: 0: 42409.3. Samples: 15836807200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:23,390][15132] Avg episode reward: [(0, '0.442')] [2024-06-25 21:07:25,837][15401] Updated weights for policy 0, policy_version 966604 (0.0040) [2024-06-25 21:07:28,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15836938240. Throughput: 0: 42443.2. Samples: 15837061220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:28,390][15132] Avg episode reward: [(0, '0.488')] [2024-06-25 21:07:29,709][15401] Updated weights for policy 0, policy_version 966614 (0.0029) [2024-06-25 21:07:33,378][15401] Updated weights for policy 0, policy_version 966624 (0.0032) [2024-06-25 21:07:33,391][15132] Fps is (10 sec: 45866.8, 60 sec: 42870.2, 300 sec: 42598.1). Total num frames: 15837167616. Throughput: 0: 42502.2. Samples: 15837318820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:33,392][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 21:07:37,453][15401] Updated weights for policy 0, policy_version 966634 (0.0032) [2024-06-25 21:07:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 15837364224. Throughput: 0: 42492.5. Samples: 15837448360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:38,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 21:07:40,957][15401] Updated weights for policy 0, policy_version 966644 (0.0032) [2024-06-25 21:07:43,390][15132] Fps is (10 sec: 39328.8, 60 sec: 42598.4, 300 sec: 42599.0). Total num frames: 15837560832. Throughput: 0: 42460.5. Samples: 15837700660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:43,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 21:07:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966648_15837560832.pth... [2024-06-25 21:07:43,495][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966027_15827386368.pth [2024-06-25 21:07:45,134][15401] Updated weights for policy 0, policy_version 966654 (0.0041) [2024-06-25 21:07:48,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15837790208. Throughput: 0: 42541.2. Samples: 15837952980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-25 21:07:48,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 21:07:48,669][15401] Updated weights for policy 0, policy_version 966664 (0.0040) [2024-06-25 21:07:53,163][15401] Updated weights for policy 0, policy_version 966674 (0.0029) [2024-06-25 21:07:53,392][15132] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 15838003200. Throughput: 0: 42460.5. Samples: 15838082460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:07:53,392][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 21:07:56,508][15401] Updated weights for policy 0, policy_version 966684 (0.0033) [2024-06-25 21:07:58,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15838199808. Throughput: 0: 42343.6. Samples: 15838331220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:07:58,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 21:08:00,658][15401] Updated weights for policy 0, policy_version 966694 (0.0042) [2024-06-25 21:08:00,955][15349] Signal inference workers to stop experience collection... (234300 times) [2024-06-25 21:08:00,960][15349] Signal inference workers to resume experience collection... (234300 times) [2024-06-25 21:08:00,971][15401] InferenceWorker_p0-w0: stopping experience collection (234300 times) [2024-06-25 21:08:00,971][15401] InferenceWorker_p0-w0: resuming experience collection (234300 times) [2024-06-25 21:08:03,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15838429184. Throughput: 0: 42670.3. Samples: 15838596660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 21:08:04,020][15401] Updated weights for policy 0, policy_version 966704 (0.0033) [2024-06-25 21:08:08,223][15401] Updated weights for policy 0, policy_version 966714 (0.0046) [2024-06-25 21:08:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15838642176. Throughput: 0: 42774.1. Samples: 15838732040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 21:08:11,446][15401] Updated weights for policy 0, policy_version 966724 (0.0040) [2024-06-25 21:08:13,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 15838838784. Throughput: 0: 42499.6. Samples: 15838973700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 21:08:16,034][15401] Updated weights for policy 0, policy_version 966734 (0.0032) [2024-06-25 21:08:18,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 15839084544. Throughput: 0: 42646.3. Samples: 15839237820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:18,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 21:08:18,979][15401] Updated weights for policy 0, policy_version 966744 (0.0036) [2024-06-25 21:08:23,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15839264768. Throughput: 0: 42856.9. Samples: 15839376920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:08:23,708][15401] Updated weights for policy 0, policy_version 966754 (0.0025) [2024-06-25 21:08:26,595][15401] Updated weights for policy 0, policy_version 966764 (0.0038) [2024-06-25 21:08:28,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15839494144. Throughput: 0: 42560.9. Samples: 15839615900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 21:08:31,232][15401] Updated weights for policy 0, policy_version 966774 (0.0032) [2024-06-25 21:08:33,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 15839739904. Throughput: 0: 42832.5. Samples: 15839880440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:33,390][15132] Avg episode reward: [(0, '0.856')] [2024-06-25 21:08:34,205][15401] Updated weights for policy 0, policy_version 966784 (0.0042) [2024-06-25 21:08:38,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15839903744. Throughput: 0: 42937.7. Samples: 15840014560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:38,390][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 21:08:39,194][15401] Updated weights for policy 0, policy_version 966794 (0.0025) [2024-06-25 21:08:41,982][15401] Updated weights for policy 0, policy_version 966804 (0.0035) [2024-06-25 21:08:43,392][15132] Fps is (10 sec: 39311.8, 60 sec: 42869.7, 300 sec: 42542.5). Total num frames: 15840133120. Throughput: 0: 42917.3. Samples: 15840262600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:43,393][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 21:08:46,720][15401] Updated weights for policy 0, policy_version 966814 (0.0026) [2024-06-25 21:08:48,389][15132] Fps is (10 sec: 47514.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15840378880. Throughput: 0: 42766.8. Samples: 15840521160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:48,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 21:08:49,636][15401] Updated weights for policy 0, policy_version 966824 (0.0032) [2024-06-25 21:08:53,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 15840559104. Throughput: 0: 42732.5. Samples: 15840655000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:53,390][15132] Avg episode reward: [(0, '0.120')] [2024-06-25 21:08:54,439][15401] Updated weights for policy 0, policy_version 966834 (0.0039) [2024-06-25 21:08:57,288][15401] Updated weights for policy 0, policy_version 966844 (0.0039) [2024-06-25 21:08:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 15840788480. Throughput: 0: 43085.8. Samples: 15840912560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:08:58,390][15132] Avg episode reward: [(0, '0.271')] [2024-06-25 21:09:01,886][15401] Updated weights for policy 0, policy_version 966854 (0.0034) [2024-06-25 21:09:02,800][15349] Signal inference workers to stop experience collection... (234350 times) [2024-06-25 21:09:02,801][15349] Signal inference workers to resume experience collection... (234350 times) [2024-06-25 21:09:02,831][15401] InferenceWorker_p0-w0: stopping experience collection (234350 times) [2024-06-25 21:09:02,832][15401] InferenceWorker_p0-w0: resuming experience collection (234350 times) [2024-06-25 21:09:03,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 15841034240. Throughput: 0: 42964.4. Samples: 15841171220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:03,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 21:09:04,819][15401] Updated weights for policy 0, policy_version 966864 (0.0038) [2024-06-25 21:09:08,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15841198080. Throughput: 0: 42808.0. Samples: 15841303280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:08,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:09:09,401][15401] Updated weights for policy 0, policy_version 966874 (0.0026) [2024-06-25 21:09:12,596][15401] Updated weights for policy 0, policy_version 966884 (0.0033) [2024-06-25 21:09:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43688.9, 300 sec: 42820.2). Total num frames: 15841460224. Throughput: 0: 43188.8. Samples: 15841559500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:13,392][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 21:09:17,127][15401] Updated weights for policy 0, policy_version 966894 (0.0032) [2024-06-25 21:09:18,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 15841656832. Throughput: 0: 42970.9. Samples: 15841814140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:18,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 21:09:20,370][15401] Updated weights for policy 0, policy_version 966904 (0.0033) [2024-06-25 21:09:23,390][15132] Fps is (10 sec: 39330.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15841853440. Throughput: 0: 42881.3. Samples: 15841944220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:23,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 21:09:24,991][15401] Updated weights for policy 0, policy_version 966914 (0.0024) [2024-06-25 21:09:27,918][15401] Updated weights for policy 0, policy_version 966924 (0.0035) [2024-06-25 21:09:28,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15842099200. Throughput: 0: 43117.0. Samples: 15842202760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:28,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 21:09:32,459][15401] Updated weights for policy 0, policy_version 966934 (0.0032) [2024-06-25 21:09:33,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15842279424. Throughput: 0: 43216.9. Samples: 15842465920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 21:09:35,536][15401] Updated weights for policy 0, policy_version 966944 (0.0023) [2024-06-25 21:09:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 15842508800. Throughput: 0: 42912.1. Samples: 15842586040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:38,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 21:09:40,335][15401] Updated weights for policy 0, policy_version 966954 (0.0037) [2024-06-25 21:09:43,279][15401] Updated weights for policy 0, policy_version 966964 (0.0036) [2024-06-25 21:09:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43419.3, 300 sec: 42820.6). Total num frames: 15842738176. Throughput: 0: 42933.2. Samples: 15842844560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:43,390][15132] Avg episode reward: [(0, '0.718')] [2024-06-25 21:09:43,430][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966965_15842754560.pth... [2024-06-25 21:09:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966336_15832449024.pth [2024-06-25 21:09:47,858][15401] Updated weights for policy 0, policy_version 966974 (0.0033) [2024-06-25 21:09:48,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15842902016. Throughput: 0: 42883.1. Samples: 15843100960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-25 21:09:48,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 21:09:51,138][15401] Updated weights for policy 0, policy_version 966984 (0.0028) [2024-06-25 21:09:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15843147776. Throughput: 0: 42539.5. Samples: 15843217560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:09:53,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 21:09:55,879][15401] Updated weights for policy 0, policy_version 966994 (0.0041) [2024-06-25 21:09:58,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15843344384. Throughput: 0: 42568.8. Samples: 15843475000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:09:58,390][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 21:09:58,974][15401] Updated weights for policy 0, policy_version 967004 (0.0045) [2024-06-25 21:10:03,341][15401] Updated weights for policy 0, policy_version 967014 (0.0032) [2024-06-25 21:10:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15843557376. Throughput: 0: 42728.1. Samples: 15843736900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:03,390][15132] Avg episode reward: [(0, '0.577')] [2024-06-25 21:10:06,513][15401] Updated weights for policy 0, policy_version 967024 (0.0035) [2024-06-25 21:10:08,390][15132] Fps is (10 sec: 47513.8, 60 sec: 43690.6, 300 sec: 42709.5). Total num frames: 15843819520. Throughput: 0: 42695.6. Samples: 15843865520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:08,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 21:10:10,834][15401] Updated weights for policy 0, policy_version 967034 (0.0031) [2024-06-25 21:10:13,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42327.1, 300 sec: 42765.0). Total num frames: 15843999744. Throughput: 0: 42593.4. Samples: 15844119460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:13,390][15132] Avg episode reward: [(0, '0.680')] [2024-06-25 21:10:14,215][15401] Updated weights for policy 0, policy_version 967044 (0.0028) [2024-06-25 21:10:14,951][15349] Signal inference workers to stop experience collection... (234400 times) [2024-06-25 21:10:14,956][15349] Signal inference workers to resume experience collection... (234400 times) [2024-06-25 21:10:15,000][15401] InferenceWorker_p0-w0: stopping experience collection (234400 times) [2024-06-25 21:10:15,000][15401] InferenceWorker_p0-w0: resuming experience collection (234400 times) [2024-06-25 21:10:18,331][15401] Updated weights for policy 0, policy_version 967054 (0.0025) [2024-06-25 21:10:18,390][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15844212736. Throughput: 0: 42459.9. Samples: 15844376620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:18,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 21:10:22,196][15401] Updated weights for policy 0, policy_version 967064 (0.0049) [2024-06-25 21:10:23,393][15132] Fps is (10 sec: 44219.7, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 15844442112. Throughput: 0: 42525.7. Samples: 15844499860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:23,394][15132] Avg episode reward: [(0, '0.293')] [2024-06-25 21:10:26,294][15401] Updated weights for policy 0, policy_version 967074 (0.0041) [2024-06-25 21:10:28,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15844638720. Throughput: 0: 42525.8. Samples: 15844758220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:28,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 21:10:29,882][15401] Updated weights for policy 0, policy_version 967084 (0.0029) [2024-06-25 21:10:33,390][15132] Fps is (10 sec: 39336.2, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 15844835328. Throughput: 0: 42539.9. Samples: 15845015260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:33,390][15132] Avg episode reward: [(0, '0.837')] [2024-06-25 21:10:34,060][15401] Updated weights for policy 0, policy_version 967094 (0.0032) [2024-06-25 21:10:37,311][15401] Updated weights for policy 0, policy_version 967104 (0.0036) [2024-06-25 21:10:38,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15845081088. Throughput: 0: 42734.2. Samples: 15845140600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:38,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 21:10:42,038][15401] Updated weights for policy 0, policy_version 967114 (0.0022) [2024-06-25 21:10:43,390][15132] Fps is (10 sec: 42596.3, 60 sec: 42051.9, 300 sec: 42709.4). Total num frames: 15845261312. Throughput: 0: 42761.4. Samples: 15845399280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:43,391][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 21:10:45,317][15401] Updated weights for policy 0, policy_version 967124 (0.0032) [2024-06-25 21:10:48,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15845474304. Throughput: 0: 42516.6. Samples: 15845650140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 21:10:49,572][15401] Updated weights for policy 0, policy_version 967134 (0.0034) [2024-06-25 21:10:53,004][15401] Updated weights for policy 0, policy_version 967144 (0.0030) [2024-06-25 21:10:53,389][15132] Fps is (10 sec: 44239.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15845703680. Throughput: 0: 42502.4. Samples: 15845778120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:53,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 21:10:57,267][15401] Updated weights for policy 0, policy_version 967154 (0.0038) [2024-06-25 21:10:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15845883904. Throughput: 0: 42547.1. Samples: 15846034080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:10:58,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 21:11:00,638][15401] Updated weights for policy 0, policy_version 967164 (0.0036) [2024-06-25 21:11:03,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15846129664. Throughput: 0: 42388.1. Samples: 15846284080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:03,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 21:11:05,077][15401] Updated weights for policy 0, policy_version 967174 (0.0040) [2024-06-25 21:11:08,236][15401] Updated weights for policy 0, policy_version 967184 (0.0030) [2024-06-25 21:11:08,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 15846342656. Throughput: 0: 42635.2. Samples: 15846418280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:08,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 21:11:12,500][15401] Updated weights for policy 0, policy_version 967194 (0.0027) [2024-06-25 21:11:13,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 15846539264. Throughput: 0: 42534.7. Samples: 15846672380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:13,393][15132] Avg episode reward: [(0, '0.489')] [2024-06-25 21:11:15,955][15401] Updated weights for policy 0, policy_version 967204 (0.0045) [2024-06-25 21:11:18,396][15132] Fps is (10 sec: 44208.1, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 15846785024. Throughput: 0: 42249.6. Samples: 15846916760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:18,397][15132] Avg episode reward: [(0, '0.771')] [2024-06-25 21:11:20,335][15401] Updated weights for policy 0, policy_version 967214 (0.0036) [2024-06-25 21:11:23,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42328.0, 300 sec: 42765.0). Total num frames: 15846981632. Throughput: 0: 42497.3. Samples: 15847052980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:23,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 21:11:24,469][15401] Updated weights for policy 0, policy_version 967224 (0.0031) [2024-06-25 21:11:27,860][15401] Updated weights for policy 0, policy_version 967234 (0.0034) [2024-06-25 21:11:28,390][15132] Fps is (10 sec: 39346.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15847178240. Throughput: 0: 42376.0. Samples: 15847306180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:28,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 21:11:32,136][15401] Updated weights for policy 0, policy_version 967244 (0.0032) [2024-06-25 21:11:33,389][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 15847424000. Throughput: 0: 42412.0. Samples: 15847558680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:33,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 21:11:35,674][15401] Updated weights for policy 0, policy_version 967254 (0.0042) [2024-06-25 21:11:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15847604224. Throughput: 0: 42474.6. Samples: 15847689480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:38,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 21:11:39,539][15401] Updated weights for policy 0, policy_version 967264 (0.0033) [2024-06-25 21:11:43,389][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.8, 300 sec: 42598.4). Total num frames: 15847800832. Throughput: 0: 42475.1. Samples: 15847945460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:43,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 21:11:43,507][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967274_15847817216.pth... [2024-06-25 21:11:43,521][15401] Updated weights for policy 0, policy_version 967274 (0.0037) [2024-06-25 21:11:43,554][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966648_15837560832.pth [2024-06-25 21:11:47,102][15401] Updated weights for policy 0, policy_version 967284 (0.0040) [2024-06-25 21:11:48,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15848046592. Throughput: 0: 42574.7. Samples: 15848199940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-25 21:11:48,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 21:11:48,399][15349] Signal inference workers to stop experience collection... (234450 times) [2024-06-25 21:11:48,453][15349] Signal inference workers to resume experience collection... (234450 times) [2024-06-25 21:11:48,457][15401] InferenceWorker_p0-w0: stopping experience collection (234450 times) [2024-06-25 21:11:48,468][15401] InferenceWorker_p0-w0: resuming experience collection (234450 times) [2024-06-25 21:11:51,001][15401] Updated weights for policy 0, policy_version 967294 (0.0039) [2024-06-25 21:11:53,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15848259584. Throughput: 0: 42453.7. Samples: 15848328700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:11:53,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 21:11:54,851][15401] Updated weights for policy 0, policy_version 967304 (0.0043) [2024-06-25 21:11:58,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15848423424. Throughput: 0: 42323.6. Samples: 15848576840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:11:58,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 21:11:58,803][15401] Updated weights for policy 0, policy_version 967314 (0.0027) [2024-06-25 21:12:02,540][15401] Updated weights for policy 0, policy_version 967324 (0.0031) [2024-06-25 21:12:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15848669184. Throughput: 0: 42671.4. Samples: 15848836700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:03,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 21:12:06,428][15401] Updated weights for policy 0, policy_version 967334 (0.0027) [2024-06-25 21:12:08,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15848882176. Throughput: 0: 42640.9. Samples: 15848971820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:08,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 21:12:10,026][15401] Updated weights for policy 0, policy_version 967344 (0.0038) [2024-06-25 21:12:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 15849078784. Throughput: 0: 42486.7. Samples: 15849218080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:13,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 21:12:14,278][15401] Updated weights for policy 0, policy_version 967354 (0.0037) [2024-06-25 21:12:17,792][15401] Updated weights for policy 0, policy_version 967364 (0.0043) [2024-06-25 21:12:18,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42056.8, 300 sec: 42709.5). Total num frames: 15849308160. Throughput: 0: 42616.5. Samples: 15849476420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:18,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 21:12:21,784][15401] Updated weights for policy 0, policy_version 967374 (0.0040) [2024-06-25 21:12:23,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 15849521152. Throughput: 0: 42675.4. Samples: 15849609880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:23,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 21:12:25,563][15401] Updated weights for policy 0, policy_version 967384 (0.0036) [2024-06-25 21:12:28,395][15132] Fps is (10 sec: 42576.2, 60 sec: 42594.8, 300 sec: 42597.9). Total num frames: 15849734144. Throughput: 0: 42604.9. Samples: 15849862900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:28,395][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 21:12:29,560][15401] Updated weights for policy 0, policy_version 967394 (0.0031) [2024-06-25 21:12:33,249][15401] Updated weights for policy 0, policy_version 967404 (0.0035) [2024-06-25 21:12:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15849963520. Throughput: 0: 42634.9. Samples: 15850118520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:33,390][15132] Avg episode reward: [(0, '0.817')] [2024-06-25 21:12:37,228][15401] Updated weights for policy 0, policy_version 967414 (0.0035) [2024-06-25 21:12:38,392][15132] Fps is (10 sec: 44248.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15850176512. Throughput: 0: 42596.4. Samples: 15850245640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:38,393][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 21:12:40,701][15401] Updated weights for policy 0, policy_version 967424 (0.0026) [2024-06-25 21:12:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15850373120. Throughput: 0: 42703.4. Samples: 15850498500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 21:12:45,016][15401] Updated weights for policy 0, policy_version 967434 (0.0036) [2024-06-25 21:12:48,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 15850586112. Throughput: 0: 42496.5. Samples: 15850749040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:48,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 21:12:48,512][15401] Updated weights for policy 0, policy_version 967444 (0.0025) [2024-06-25 21:12:50,118][15349] Signal inference workers to stop experience collection... (234500 times) [2024-06-25 21:12:50,163][15401] InferenceWorker_p0-w0: stopping experience collection (234500 times) [2024-06-25 21:12:50,235][15349] Signal inference workers to resume experience collection... (234500 times) [2024-06-25 21:12:50,235][15401] InferenceWorker_p0-w0: resuming experience collection (234500 times) [2024-06-25 21:12:52,891][15401] Updated weights for policy 0, policy_version 967454 (0.0024) [2024-06-25 21:12:53,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15850799104. Throughput: 0: 42317.6. Samples: 15850876120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:53,390][15132] Avg episode reward: [(0, '0.872')] [2024-06-25 21:12:56,102][15401] Updated weights for policy 0, policy_version 967464 (0.0048) [2024-06-25 21:12:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15851012096. Throughput: 0: 42484.0. Samples: 15851129860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:12:58,392][15132] Avg episode reward: [(0, '0.596')] [2024-06-25 21:13:00,645][15401] Updated weights for policy 0, policy_version 967474 (0.0042) [2024-06-25 21:13:03,389][15132] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15851225088. Throughput: 0: 42359.6. Samples: 15851382600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:03,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 21:13:03,882][15401] Updated weights for policy 0, policy_version 967484 (0.0031) [2024-06-25 21:13:08,116][15401] Updated weights for policy 0, policy_version 967494 (0.0038) [2024-06-25 21:13:08,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15851438080. Throughput: 0: 42271.6. Samples: 15851512100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:08,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 21:13:11,769][15401] Updated weights for policy 0, policy_version 967504 (0.0035) [2024-06-25 21:13:13,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 15851618304. Throughput: 0: 42080.8. Samples: 15851756320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 21:13:15,710][15401] Updated weights for policy 0, policy_version 967514 (0.0044) [2024-06-25 21:13:18,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15851847680. Throughput: 0: 42193.5. Samples: 15852017220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:18,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 21:13:19,371][15401] Updated weights for policy 0, policy_version 967524 (0.0036) [2024-06-25 21:13:23,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15852060672. Throughput: 0: 42151.5. Samples: 15852142360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:23,392][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 21:13:23,650][15401] Updated weights for policy 0, policy_version 967534 (0.0026) [2024-06-25 21:13:27,088][15401] Updated weights for policy 0, policy_version 967544 (0.0031) [2024-06-25 21:13:28,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42328.9, 300 sec: 42487.3). Total num frames: 15852273664. Throughput: 0: 42064.5. Samples: 15852391400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:28,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 21:13:31,397][15401] Updated weights for policy 0, policy_version 967554 (0.0030) [2024-06-25 21:13:33,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15852486656. Throughput: 0: 42175.0. Samples: 15852646920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 21:13:35,196][15401] Updated weights for policy 0, policy_version 967564 (0.0041) [2024-06-25 21:13:38,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42054.0, 300 sec: 42598.8). Total num frames: 15852699648. Throughput: 0: 42174.4. Samples: 15852773960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:38,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 21:13:39,056][15401] Updated weights for policy 0, policy_version 967574 (0.0040) [2024-06-25 21:13:43,374][15401] Updated weights for policy 0, policy_version 967584 (0.0042) [2024-06-25 21:13:43,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 15852896256. Throughput: 0: 42231.2. Samples: 15853030260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-06-25 21:13:43,390][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 21:13:43,524][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967585_15852912640.pth... [2024-06-25 21:13:43,567][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000966965_15842754560.pth [2024-06-25 21:13:46,926][15401] Updated weights for policy 0, policy_version 967594 (0.0028) [2024-06-25 21:13:48,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15853125632. Throughput: 0: 42164.4. Samples: 15853280000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:13:48,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 21:13:51,186][15401] Updated weights for policy 0, policy_version 967604 (0.0031) [2024-06-25 21:13:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 15853322240. Throughput: 0: 42284.4. Samples: 15853414900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:13:53,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 21:13:54,399][15401] Updated weights for policy 0, policy_version 967614 (0.0033) [2024-06-25 21:13:58,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 15853535232. Throughput: 0: 42492.4. Samples: 15853668480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:13:58,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 21:13:58,735][15401] Updated weights for policy 0, policy_version 967624 (0.0027) [2024-06-25 21:14:02,147][15401] Updated weights for policy 0, policy_version 967634 (0.0032) [2024-06-25 21:14:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 15853748224. Throughput: 0: 42386.3. Samples: 15853924600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 21:14:06,169][15401] Updated weights for policy 0, policy_version 967644 (0.0026) [2024-06-25 21:14:06,736][15349] Signal inference workers to stop experience collection... (234550 times) [2024-06-25 21:14:06,736][15349] Signal inference workers to resume experience collection... (234550 times) [2024-06-25 21:14:06,779][15401] InferenceWorker_p0-w0: stopping experience collection (234550 times) [2024-06-25 21:14:06,779][15401] InferenceWorker_p0-w0: resuming experience collection (234550 times) [2024-06-25 21:14:08,392][15132] Fps is (10 sec: 44226.3, 60 sec: 42323.7, 300 sec: 42431.8). Total num frames: 15853977600. Throughput: 0: 42483.6. Samples: 15854054220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:08,392][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 21:14:09,949][15401] Updated weights for policy 0, policy_version 967654 (0.0043) [2024-06-25 21:14:13,392][15132] Fps is (10 sec: 44225.9, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 15854190592. Throughput: 0: 42510.7. Samples: 15854304480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:13,392][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 21:14:13,659][15401] Updated weights for policy 0, policy_version 967664 (0.0038) [2024-06-25 21:14:17,520][15401] Updated weights for policy 0, policy_version 967674 (0.0036) [2024-06-25 21:14:18,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15854403584. Throughput: 0: 42662.2. Samples: 15854566720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:18,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 21:14:21,278][15401] Updated weights for policy 0, policy_version 967684 (0.0031) [2024-06-25 21:14:23,389][15132] Fps is (10 sec: 42608.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 15854616576. Throughput: 0: 42683.1. Samples: 15854694700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:23,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 21:14:25,376][15401] Updated weights for policy 0, policy_version 967694 (0.0036) [2024-06-25 21:14:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15854829568. Throughput: 0: 42552.1. Samples: 15854945100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:28,390][15132] Avg episode reward: [(0, '0.353')] [2024-06-25 21:14:29,255][15401] Updated weights for policy 0, policy_version 967704 (0.0031) [2024-06-25 21:14:33,098][15401] Updated weights for policy 0, policy_version 967714 (0.0035) [2024-06-25 21:14:33,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15855042560. Throughput: 0: 42855.8. Samples: 15855208520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:33,390][15132] Avg episode reward: [(0, '0.406')] [2024-06-25 21:14:36,949][15401] Updated weights for policy 0, policy_version 967724 (0.0032) [2024-06-25 21:14:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.7, 300 sec: 42431.4). Total num frames: 15855255552. Throughput: 0: 42644.5. Samples: 15855334000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:38,392][15132] Avg episode reward: [(0, '0.445')] [2024-06-25 21:14:40,803][15401] Updated weights for policy 0, policy_version 967734 (0.0027) [2024-06-25 21:14:43,390][15132] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15855484928. Throughput: 0: 42691.9. Samples: 15855589620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:43,390][15132] Avg episode reward: [(0, '0.354')] [2024-06-25 21:14:44,394][15401] Updated weights for policy 0, policy_version 967744 (0.0033) [2024-06-25 21:14:48,137][15401] Updated weights for policy 0, policy_version 967754 (0.0027) [2024-06-25 21:14:48,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 15855697920. Throughput: 0: 42673.2. Samples: 15855844900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 21:14:52,137][15401] Updated weights for policy 0, policy_version 967764 (0.0034) [2024-06-25 21:14:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 15855878144. Throughput: 0: 42717.0. Samples: 15855976380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:53,390][15132] Avg episode reward: [(0, '0.851')] [2024-06-25 21:14:55,730][15401] Updated weights for policy 0, policy_version 967774 (0.0035) [2024-06-25 21:14:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15856123904. Throughput: 0: 42833.4. Samples: 15856231880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:14:58,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 21:14:59,615][15401] Updated weights for policy 0, policy_version 967784 (0.0031) [2024-06-25 21:15:03,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42869.7, 300 sec: 42375.9). Total num frames: 15856320512. Throughput: 0: 42740.4. Samples: 15856490140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:03,393][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 21:15:03,538][15401] Updated weights for policy 0, policy_version 967794 (0.0028) [2024-06-25 21:15:07,478][15401] Updated weights for policy 0, policy_version 967804 (0.0033) [2024-06-25 21:15:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 15856533504. Throughput: 0: 42782.1. Samples: 15856619900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:08,390][15132] Avg episode reward: [(0, '0.229')] [2024-06-25 21:15:11,017][15401] Updated weights for policy 0, policy_version 967814 (0.0053) [2024-06-25 21:15:13,390][15132] Fps is (10 sec: 45886.1, 60 sec: 43146.2, 300 sec: 42598.4). Total num frames: 15856779264. Throughput: 0: 42876.8. Samples: 15856874560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:13,390][15132] Avg episode reward: [(0, '0.383')] [2024-06-25 21:15:15,361][15401] Updated weights for policy 0, policy_version 967824 (0.0037) [2024-06-25 21:15:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42487.9). Total num frames: 15856975872. Throughput: 0: 42844.1. Samples: 15857136500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:18,390][15132] Avg episode reward: [(0, '0.255')] [2024-06-25 21:15:18,598][15401] Updated weights for policy 0, policy_version 967834 (0.0030) [2024-06-25 21:15:23,155][15401] Updated weights for policy 0, policy_version 967844 (0.0030) [2024-06-25 21:15:23,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15857172480. Throughput: 0: 42810.3. Samples: 15857260360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:23,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 21:15:26,432][15401] Updated weights for policy 0, policy_version 967854 (0.0023) [2024-06-25 21:15:28,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 15857418240. Throughput: 0: 42846.7. Samples: 15857517720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:28,390][15132] Avg episode reward: [(0, '0.339')] [2024-06-25 21:15:30,605][15401] Updated weights for policy 0, policy_version 967864 (0.0033) [2024-06-25 21:15:32,629][15349] Signal inference workers to stop experience collection... (234600 times) [2024-06-25 21:15:32,632][15349] Signal inference workers to resume experience collection... (234600 times) [2024-06-25 21:15:32,680][15401] InferenceWorker_p0-w0: stopping experience collection (234600 times) [2024-06-25 21:15:32,684][15401] InferenceWorker_p0-w0: resuming experience collection (234600 times) [2024-06-25 21:15:33,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 15857598464. Throughput: 0: 42860.1. Samples: 15857773600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:33,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 21:15:34,287][15401] Updated weights for policy 0, policy_version 967874 (0.0025) [2024-06-25 21:15:37,959][15401] Updated weights for policy 0, policy_version 967884 (0.0033) [2024-06-25 21:15:38,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42598.5). Total num frames: 15857827840. Throughput: 0: 42710.6. Samples: 15857898360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:38,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 21:15:41,790][15401] Updated weights for policy 0, policy_version 967894 (0.0027) [2024-06-25 21:15:43,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15858073600. Throughput: 0: 42870.6. Samples: 15858161060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-25 21:15:43,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 21:15:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967900_15858073600.pth... [2024-06-25 21:15:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967274_15847817216.pth [2024-06-25 21:15:45,495][15401] Updated weights for policy 0, policy_version 967904 (0.0026) [2024-06-25 21:15:48,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 15858221056. Throughput: 0: 42893.9. Samples: 15858420260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:15:48,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 21:15:49,535][15401] Updated weights for policy 0, policy_version 967914 (0.0026) [2024-06-25 21:15:53,225][15401] Updated weights for policy 0, policy_version 967924 (0.0035) [2024-06-25 21:15:53,389][15132] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15858466816. Throughput: 0: 42594.8. Samples: 15858536660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:15:53,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 21:15:57,116][15401] Updated weights for policy 0, policy_version 967934 (0.0037) [2024-06-25 21:15:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15858679808. Throughput: 0: 42740.0. Samples: 15858797860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:15:58,390][15132] Avg episode reward: [(0, '0.330')] [2024-06-25 21:16:00,770][15401] Updated weights for policy 0, policy_version 967944 (0.0046) [2024-06-25 21:16:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 15858860032. Throughput: 0: 42680.9. Samples: 15859057140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:03,390][15132] Avg episode reward: [(0, '0.641')] [2024-06-25 21:16:05,065][15401] Updated weights for policy 0, policy_version 967954 (0.0041) [2024-06-25 21:16:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 15859105792. Throughput: 0: 42672.4. Samples: 15859180620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:08,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 21:16:08,745][15401] Updated weights for policy 0, policy_version 967964 (0.0037) [2024-06-25 21:16:12,678][15401] Updated weights for policy 0, policy_version 967974 (0.0039) [2024-06-25 21:16:13,392][15132] Fps is (10 sec: 45863.9, 60 sec: 42323.6, 300 sec: 42487.9). Total num frames: 15859318784. Throughput: 0: 42816.4. Samples: 15859444560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:13,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 21:16:16,394][15401] Updated weights for policy 0, policy_version 967984 (0.0037) [2024-06-25 21:16:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15859531776. Throughput: 0: 42832.8. Samples: 15859701080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:18,390][15132] Avg episode reward: [(0, '0.819')] [2024-06-25 21:16:20,215][15401] Updated weights for policy 0, policy_version 967994 (0.0039) [2024-06-25 21:16:23,390][15132] Fps is (10 sec: 44247.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15859761152. Throughput: 0: 42828.4. Samples: 15859825640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:23,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 21:16:23,908][15401] Updated weights for policy 0, policy_version 968004 (0.0032) [2024-06-25 21:16:28,227][15401] Updated weights for policy 0, policy_version 968014 (0.0023) [2024-06-25 21:16:28,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 15859941376. Throughput: 0: 42819.2. Samples: 15860087920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:28,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 21:16:31,785][15401] Updated weights for policy 0, policy_version 968024 (0.0034) [2024-06-25 21:16:33,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 15860170752. Throughput: 0: 42725.1. Samples: 15860342900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:33,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 21:16:35,872][15401] Updated weights for policy 0, policy_version 968034 (0.0038) [2024-06-25 21:16:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15860400128. Throughput: 0: 42897.7. Samples: 15860467060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:38,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 21:16:39,525][15401] Updated weights for policy 0, policy_version 968044 (0.0036) [2024-06-25 21:16:43,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 15860580352. Throughput: 0: 42811.2. Samples: 15860724360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:43,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 21:16:43,410][15401] Updated weights for policy 0, policy_version 968054 (0.0035) [2024-06-25 21:16:47,181][15401] Updated weights for policy 0, policy_version 968064 (0.0042) [2024-06-25 21:16:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 15860826112. Throughput: 0: 42759.2. Samples: 15860981300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:48,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 21:16:50,976][15401] Updated weights for policy 0, policy_version 968074 (0.0041) [2024-06-25 21:16:51,643][15349] Signal inference workers to stop experience collection... (234650 times) [2024-06-25 21:16:51,644][15349] Signal inference workers to resume experience collection... (234650 times) [2024-06-25 21:16:51,659][15401] InferenceWorker_p0-w0: stopping experience collection (234650 times) [2024-06-25 21:16:51,659][15401] InferenceWorker_p0-w0: resuming experience collection (234650 times) [2024-06-25 21:16:53,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15861039104. Throughput: 0: 42785.4. Samples: 15861105960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:53,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 21:16:54,793][15401] Updated weights for policy 0, policy_version 968084 (0.0033) [2024-06-25 21:16:58,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15861235712. Throughput: 0: 42666.8. Samples: 15861364460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:16:58,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 21:16:58,521][15401] Updated weights for policy 0, policy_version 968094 (0.0036) [2024-06-25 21:17:02,332][15401] Updated weights for policy 0, policy_version 968104 (0.0021) [2024-06-25 21:17:03,390][15132] Fps is (10 sec: 40959.4, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 15861448704. Throughput: 0: 42576.8. Samples: 15861617040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:03,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 21:17:06,074][15401] Updated weights for policy 0, policy_version 968114 (0.0045) [2024-06-25 21:17:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15861678080. Throughput: 0: 42672.1. Samples: 15861745880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:08,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 21:17:09,777][15401] Updated weights for policy 0, policy_version 968124 (0.0035) [2024-06-25 21:17:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 15861874688. Throughput: 0: 42578.6. Samples: 15862003960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 21:17:13,836][15401] Updated weights for policy 0, policy_version 968134 (0.0024) [2024-06-25 21:17:17,295][15401] Updated weights for policy 0, policy_version 968144 (0.0030) [2024-06-25 21:17:18,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 15862104064. Throughput: 0: 42516.5. Samples: 15862256240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:18,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 21:17:21,857][15401] Updated weights for policy 0, policy_version 968154 (0.0051) [2024-06-25 21:17:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42543.6). Total num frames: 15862284288. Throughput: 0: 42619.2. Samples: 15862384920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:23,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 21:17:24,923][15401] Updated weights for policy 0, policy_version 968164 (0.0043) [2024-06-25 21:17:28,389][15132] Fps is (10 sec: 40970.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15862513664. Throughput: 0: 42603.6. Samples: 15862641520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:28,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 21:17:29,646][15401] Updated weights for policy 0, policy_version 968174 (0.0037) [2024-06-25 21:17:32,538][15401] Updated weights for policy 0, policy_version 968184 (0.0039) [2024-06-25 21:17:33,391][15132] Fps is (10 sec: 45866.5, 60 sec: 42870.2, 300 sec: 42598.5). Total num frames: 15862743040. Throughput: 0: 42671.1. Samples: 15862901580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:33,392][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:17:36,971][15401] Updated weights for policy 0, policy_version 968194 (0.0036) [2024-06-25 21:17:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15862939648. Throughput: 0: 42871.1. Samples: 15863035160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:38,390][15132] Avg episode reward: [(0, '0.636')] [2024-06-25 21:17:40,078][15401] Updated weights for policy 0, policy_version 968204 (0.0033) [2024-06-25 21:17:43,390][15132] Fps is (10 sec: 40967.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15863152640. Throughput: 0: 42860.4. Samples: 15863293180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 21:17:43,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 21:17:43,424][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968211_15863169024.pth... [2024-06-25 21:17:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967585_15852912640.pth [2024-06-25 21:17:44,420][15401] Updated weights for policy 0, policy_version 968214 (0.0029) [2024-06-25 21:17:47,855][15401] Updated weights for policy 0, policy_version 968224 (0.0023) [2024-06-25 21:17:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15863398400. Throughput: 0: 42918.3. Samples: 15863548360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:17:48,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 21:17:52,058][15401] Updated weights for policy 0, policy_version 968234 (0.0029) [2024-06-25 21:17:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15863595008. Throughput: 0: 43038.7. Samples: 15863682620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:17:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:17:55,780][15401] Updated weights for policy 0, policy_version 968244 (0.0024) [2024-06-25 21:17:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15863808000. Throughput: 0: 42848.5. Samples: 15863932140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:17:58,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:17:59,977][15401] Updated weights for policy 0, policy_version 968254 (0.0040) [2024-06-25 21:18:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15864020992. Throughput: 0: 43117.8. Samples: 15864196440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:03,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 21:18:03,656][15401] Updated weights for policy 0, policy_version 968264 (0.0034) [2024-06-25 21:18:07,363][15401] Updated weights for policy 0, policy_version 968274 (0.0031) [2024-06-25 21:18:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15864250368. Throughput: 0: 43136.4. Samples: 15864326060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:08,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 21:18:11,246][15401] Updated weights for policy 0, policy_version 968284 (0.0030) [2024-06-25 21:18:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15864463360. Throughput: 0: 43095.9. Samples: 15864580840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:13,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 21:18:15,317][15401] Updated weights for policy 0, policy_version 968294 (0.0039) [2024-06-25 21:18:16,540][15349] Signal inference workers to stop experience collection... (234700 times) [2024-06-25 21:18:16,540][15349] Signal inference workers to resume experience collection... (234700 times) [2024-06-25 21:18:16,570][15401] InferenceWorker_p0-w0: stopping experience collection (234700 times) [2024-06-25 21:18:16,571][15401] InferenceWorker_p0-w0: resuming experience collection (234700 times) [2024-06-25 21:18:18,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15864659968. Throughput: 0: 43239.0. Samples: 15864847260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:18,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 21:18:18,845][15401] Updated weights for policy 0, policy_version 968304 (0.0028) [2024-06-25 21:18:22,786][15401] Updated weights for policy 0, policy_version 968314 (0.0023) [2024-06-25 21:18:23,393][15132] Fps is (10 sec: 42584.3, 60 sec: 43415.2, 300 sec: 42764.5). Total num frames: 15864889344. Throughput: 0: 42952.4. Samples: 15864968160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:23,393][15132] Avg episode reward: [(0, '0.474')] [2024-06-25 21:18:26,462][15401] Updated weights for policy 0, policy_version 968324 (0.0023) [2024-06-25 21:18:28,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 15865118720. Throughput: 0: 42949.4. Samples: 15865225900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:28,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 21:18:30,508][15401] Updated weights for policy 0, policy_version 968334 (0.0025) [2024-06-25 21:18:33,389][15132] Fps is (10 sec: 42612.6, 60 sec: 42872.8, 300 sec: 42765.0). Total num frames: 15865315328. Throughput: 0: 43125.9. Samples: 15865489020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:33,390][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 21:18:34,005][15401] Updated weights for policy 0, policy_version 968344 (0.0035) [2024-06-25 21:18:38,018][15401] Updated weights for policy 0, policy_version 968354 (0.0031) [2024-06-25 21:18:38,390][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15865528320. Throughput: 0: 42776.8. Samples: 15865607580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:38,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 21:18:41,645][15401] Updated weights for policy 0, policy_version 968364 (0.0027) [2024-06-25 21:18:43,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 15865774080. Throughput: 0: 42999.1. Samples: 15865867100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:43,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 21:18:45,667][15401] Updated weights for policy 0, policy_version 968374 (0.0037) [2024-06-25 21:18:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15865937920. Throughput: 0: 42963.7. Samples: 15866129800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:48,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 21:18:49,318][15401] Updated weights for policy 0, policy_version 968384 (0.0038) [2024-06-25 21:18:53,321][15401] Updated weights for policy 0, policy_version 968394 (0.0039) [2024-06-25 21:18:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15866167296. Throughput: 0: 42708.5. Samples: 15866247940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:53,390][15132] Avg episode reward: [(0, '0.233')] [2024-06-25 21:18:56,925][15401] Updated weights for policy 0, policy_version 968404 (0.0033) [2024-06-25 21:18:58,389][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15866396672. Throughput: 0: 42715.5. Samples: 15866503040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:18:58,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 21:19:00,857][15401] Updated weights for policy 0, policy_version 968414 (0.0032) [2024-06-25 21:19:03,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 15866576896. Throughput: 0: 42624.6. Samples: 15866765360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:03,390][15132] Avg episode reward: [(0, '0.377')] [2024-06-25 21:19:04,463][15401] Updated weights for policy 0, policy_version 968424 (0.0036) [2024-06-25 21:19:08,392][15132] Fps is (10 sec: 40950.3, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 15866806272. Throughput: 0: 42688.9. Samples: 15866889120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:08,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 21:19:08,615][15401] Updated weights for policy 0, policy_version 968434 (0.0031) [2024-06-25 21:19:12,419][15401] Updated weights for policy 0, policy_version 968444 (0.0040) [2024-06-25 21:19:13,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15867035648. Throughput: 0: 42728.9. Samples: 15867148700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:13,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 21:19:16,790][15401] Updated weights for policy 0, policy_version 968454 (0.0034) [2024-06-25 21:19:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15867232256. Throughput: 0: 42472.0. Samples: 15867400260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:18,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 21:19:20,076][15401] Updated weights for policy 0, policy_version 968464 (0.0032) [2024-06-25 21:19:23,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.8, 300 sec: 42765.0). Total num frames: 15867445248. Throughput: 0: 42600.9. Samples: 15867524620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:23,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 21:19:24,234][15401] Updated weights for policy 0, policy_version 968474 (0.0038) [2024-06-25 21:19:27,651][15401] Updated weights for policy 0, policy_version 968484 (0.0036) [2024-06-25 21:19:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15867674624. Throughput: 0: 42754.2. Samples: 15867791040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:28,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 21:19:31,788][15401] Updated weights for policy 0, policy_version 968494 (0.0028) [2024-06-25 21:19:33,392][15132] Fps is (10 sec: 42588.1, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 15867871232. Throughput: 0: 42600.4. Samples: 15868046920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:33,392][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 21:19:34,836][15349] Signal inference workers to stop experience collection... (234750 times) [2024-06-25 21:19:34,836][15349] Signal inference workers to resume experience collection... (234750 times) [2024-06-25 21:19:34,851][15401] InferenceWorker_p0-w0: stopping experience collection (234750 times) [2024-06-25 21:19:34,851][15401] InferenceWorker_p0-w0: resuming experience collection (234750 times) [2024-06-25 21:19:35,253][15401] Updated weights for policy 0, policy_version 968504 (0.0043) [2024-06-25 21:19:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15868084224. Throughput: 0: 42773.8. Samples: 15868172760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-25 21:19:38,390][15132] Avg episode reward: [(0, '0.670')] [2024-06-25 21:19:39,289][15401] Updated weights for policy 0, policy_version 968514 (0.0028) [2024-06-25 21:19:42,910][15401] Updated weights for policy 0, policy_version 968524 (0.0037) [2024-06-25 21:19:43,390][15132] Fps is (10 sec: 44247.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15868313600. Throughput: 0: 42819.5. Samples: 15868429920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:19:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 21:19:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968525_15868313600.pth... [2024-06-25 21:19:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000967900_15858073600.pth [2024-06-25 21:19:46,889][15401] Updated weights for policy 0, policy_version 968534 (0.0026) [2024-06-25 21:19:48,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15868493824. Throughput: 0: 42694.1. Samples: 15868686600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:19:48,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 21:19:50,604][15401] Updated weights for policy 0, policy_version 968544 (0.0031) [2024-06-25 21:19:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15868739584. Throughput: 0: 42649.8. Samples: 15868808260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:19:53,390][15132] Avg episode reward: [(0, '0.532')] [2024-06-25 21:19:54,608][15401] Updated weights for policy 0, policy_version 968554 (0.0025) [2024-06-25 21:19:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42765.4). Total num frames: 15868936192. Throughput: 0: 42688.5. Samples: 15869069680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:19:58,390][15132] Avg episode reward: [(0, '0.659')] [2024-06-25 21:19:58,638][15401] Updated weights for policy 0, policy_version 968564 (0.0031) [2024-06-25 21:20:02,503][15401] Updated weights for policy 0, policy_version 968574 (0.0038) [2024-06-25 21:20:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15869132800. Throughput: 0: 42809.4. Samples: 15869326680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:03,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 21:20:06,280][15401] Updated weights for policy 0, policy_version 968584 (0.0027) [2024-06-25 21:20:08,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 15869394944. Throughput: 0: 42841.7. Samples: 15869452500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:08,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 21:20:10,371][15401] Updated weights for policy 0, policy_version 968594 (0.0027) [2024-06-25 21:20:13,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15869591552. Throughput: 0: 42533.2. Samples: 15869705040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:13,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 21:20:13,849][15401] Updated weights for policy 0, policy_version 968604 (0.0041) [2024-06-25 21:20:17,983][15401] Updated weights for policy 0, policy_version 968614 (0.0042) [2024-06-25 21:20:18,395][15132] Fps is (10 sec: 39301.9, 60 sec: 42594.8, 300 sec: 42764.3). Total num frames: 15869788160. Throughput: 0: 42426.8. Samples: 15869956240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:18,395][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 21:20:21,768][15401] Updated weights for policy 0, policy_version 968624 (0.0031) [2024-06-25 21:20:23,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15870033920. Throughput: 0: 42471.0. Samples: 15870083960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:23,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:20:25,726][15401] Updated weights for policy 0, policy_version 968634 (0.0028) [2024-06-25 21:20:28,392][15132] Fps is (10 sec: 42609.8, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 15870214144. Throughput: 0: 42562.7. Samples: 15870345340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:28,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:20:29,274][15401] Updated weights for policy 0, policy_version 968644 (0.0031) [2024-06-25 21:20:33,393][15132] Fps is (10 sec: 37671.2, 60 sec: 42324.7, 300 sec: 42653.5). Total num frames: 15870410752. Throughput: 0: 42656.1. Samples: 15870606260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:33,393][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 21:20:33,867][15401] Updated weights for policy 0, policy_version 968654 (0.0041) [2024-06-25 21:20:36,810][15401] Updated weights for policy 0, policy_version 968664 (0.0041) [2024-06-25 21:20:38,390][15132] Fps is (10 sec: 47524.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15870689280. Throughput: 0: 42764.9. Samples: 15870732680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:38,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 21:20:41,249][15401] Updated weights for policy 0, policy_version 968674 (0.0047) [2024-06-25 21:20:42,168][15349] Signal inference workers to stop experience collection... (234800 times) [2024-06-25 21:20:42,211][15401] InferenceWorker_p0-w0: stopping experience collection (234800 times) [2024-06-25 21:20:42,223][15349] Signal inference workers to resume experience collection... (234800 times) [2024-06-25 21:20:42,232][15401] InferenceWorker_p0-w0: resuming experience collection (234800 times) [2024-06-25 21:20:43,390][15132] Fps is (10 sec: 44251.0, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15870853120. Throughput: 0: 42743.1. Samples: 15870993120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:43,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 21:20:44,344][15401] Updated weights for policy 0, policy_version 968684 (0.0024) [2024-06-25 21:20:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15871066112. Throughput: 0: 42677.3. Samples: 15871247160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:48,390][15132] Avg episode reward: [(0, '0.794')] [2024-06-25 21:20:48,887][15401] Updated weights for policy 0, policy_version 968694 (0.0023) [2024-06-25 21:20:52,164][15401] Updated weights for policy 0, policy_version 968704 (0.0040) [2024-06-25 21:20:53,391][15132] Fps is (10 sec: 45868.3, 60 sec: 42870.4, 300 sec: 42820.3). Total num frames: 15871311872. Throughput: 0: 42769.7. Samples: 15871377200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:53,391][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 21:20:56,361][15401] Updated weights for policy 0, policy_version 968714 (0.0034) [2024-06-25 21:20:58,396][15132] Fps is (10 sec: 45846.0, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 15871524864. Throughput: 0: 42984.2. Samples: 15871639600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:20:58,396][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 21:20:59,640][15401] Updated weights for policy 0, policy_version 968724 (0.0038) [2024-06-25 21:21:03,390][15132] Fps is (10 sec: 40966.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15871721472. Throughput: 0: 43116.8. Samples: 15871896280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:03,392][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:21:03,783][15401] Updated weights for policy 0, policy_version 968734 (0.0040) [2024-06-25 21:21:07,096][15401] Updated weights for policy 0, policy_version 968744 (0.0029) [2024-06-25 21:21:08,389][15132] Fps is (10 sec: 42625.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 15871950848. Throughput: 0: 43138.3. Samples: 15872025180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 21:21:11,253][15401] Updated weights for policy 0, policy_version 968754 (0.0036) [2024-06-25 21:21:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15872163840. Throughput: 0: 43115.1. Samples: 15872285420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:13,395][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 21:21:14,745][15401] Updated weights for policy 0, policy_version 968764 (0.0035) [2024-06-25 21:21:18,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42875.0, 300 sec: 42709.5). Total num frames: 15872360448. Throughput: 0: 42975.0. Samples: 15872540000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:18,390][15132] Avg episode reward: [(0, '0.294')] [2024-06-25 21:21:18,730][15401] Updated weights for policy 0, policy_version 968774 (0.0026) [2024-06-25 21:21:22,345][15401] Updated weights for policy 0, policy_version 968784 (0.0038) [2024-06-25 21:21:23,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 15872606208. Throughput: 0: 43009.4. Samples: 15872668100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:23,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 21:21:27,094][15401] Updated weights for policy 0, policy_version 968794 (0.0030) [2024-06-25 21:21:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15872786432. Throughput: 0: 43000.0. Samples: 15872928120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:28,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 21:21:30,007][15401] Updated weights for policy 0, policy_version 968804 (0.0043) [2024-06-25 21:21:33,389][15132] Fps is (10 sec: 40959.7, 60 sec: 43419.9, 300 sec: 42765.0). Total num frames: 15873015808. Throughput: 0: 43020.0. Samples: 15873183060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:33,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 21:21:34,458][15401] Updated weights for policy 0, policy_version 968814 (0.0038) [2024-06-25 21:21:37,582][15401] Updated weights for policy 0, policy_version 968824 (0.0040) [2024-06-25 21:21:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 15873228800. Throughput: 0: 43035.1. Samples: 15873313720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-25 21:21:38,390][15132] Avg episode reward: [(0, '0.212')] [2024-06-25 21:21:41,851][15401] Updated weights for policy 0, policy_version 968834 (0.0036) [2024-06-25 21:21:43,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 15873425408. Throughput: 0: 42870.4. Samples: 15873568500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:21:43,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 21:21:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968837_15873425408.pth... [2024-06-25 21:21:43,465][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968211_15863169024.pth [2024-06-25 21:21:45,389][15401] Updated weights for policy 0, policy_version 968844 (0.0032) [2024-06-25 21:21:48,392][15132] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 15873654784. Throughput: 0: 42707.9. Samples: 15873818240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:21:48,393][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 21:21:49,583][15401] Updated weights for policy 0, policy_version 968854 (0.0037) [2024-06-25 21:21:52,995][15401] Updated weights for policy 0, policy_version 968864 (0.0028) [2024-06-25 21:21:53,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42872.6, 300 sec: 42876.1). Total num frames: 15873884160. Throughput: 0: 42860.0. Samples: 15873953880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:21:53,390][15132] Avg episode reward: [(0, '0.785')] [2024-06-25 21:21:57,003][15401] Updated weights for policy 0, policy_version 968874 (0.0033) [2024-06-25 21:21:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42603.0, 300 sec: 42820.6). Total num frames: 15874080768. Throughput: 0: 42717.4. Samples: 15874207700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:21:58,390][15132] Avg episode reward: [(0, '0.733')] [2024-06-25 21:22:00,582][15401] Updated weights for policy 0, policy_version 968884 (0.0035) [2024-06-25 21:22:03,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15874310144. Throughput: 0: 42672.2. Samples: 15874460240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:03,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 21:22:04,829][15401] Updated weights for policy 0, policy_version 968894 (0.0034) [2024-06-25 21:22:08,058][15401] Updated weights for policy 0, policy_version 968904 (0.0031) [2024-06-25 21:22:08,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 15874539520. Throughput: 0: 42843.5. Samples: 15874596060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:08,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 21:22:09,350][15349] Signal inference workers to stop experience collection... (234850 times) [2024-06-25 21:22:09,351][15349] Signal inference workers to resume experience collection... (234850 times) [2024-06-25 21:22:09,395][15401] InferenceWorker_p0-w0: stopping experience collection (234850 times) [2024-06-25 21:22:09,395][15401] InferenceWorker_p0-w0: resuming experience collection (234850 times) [2024-06-25 21:22:12,459][15401] Updated weights for policy 0, policy_version 968914 (0.0039) [2024-06-25 21:22:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 15874736128. Throughput: 0: 42692.9. Samples: 15874849300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:13,390][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 21:22:15,986][15401] Updated weights for policy 0, policy_version 968924 (0.0037) [2024-06-25 21:22:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15874949120. Throughput: 0: 42617.4. Samples: 15875100840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:18,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 21:22:20,054][15401] Updated weights for policy 0, policy_version 968934 (0.0039) [2024-06-25 21:22:23,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15875129344. Throughput: 0: 42583.8. Samples: 15875229980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:23,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 21:22:23,894][15401] Updated weights for policy 0, policy_version 968944 (0.0043) [2024-06-25 21:22:27,508][15401] Updated weights for policy 0, policy_version 968954 (0.0028) [2024-06-25 21:22:28,389][15132] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42876.4). Total num frames: 15875391488. Throughput: 0: 42592.6. Samples: 15875485160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:28,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 21:22:31,457][15401] Updated weights for policy 0, policy_version 968964 (0.0035) [2024-06-25 21:22:33,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15875571712. Throughput: 0: 42675.3. Samples: 15875738520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:33,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 21:22:35,257][15401] Updated weights for policy 0, policy_version 968974 (0.0034) [2024-06-25 21:22:38,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15875784704. Throughput: 0: 42525.3. Samples: 15875867520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:38,393][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 21:22:39,261][15401] Updated weights for policy 0, policy_version 968984 (0.0023) [2024-06-25 21:22:42,816][15401] Updated weights for policy 0, policy_version 968994 (0.0037) [2024-06-25 21:22:43,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 15876030464. Throughput: 0: 42710.6. Samples: 15876129680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:43,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 21:22:46,950][15401] Updated weights for policy 0, policy_version 969004 (0.0045) [2024-06-25 21:22:48,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15876210688. Throughput: 0: 42683.9. Samples: 15876381020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:48,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 21:22:50,470][15401] Updated weights for policy 0, policy_version 969014 (0.0042) [2024-06-25 21:22:53,390][15132] Fps is (10 sec: 39318.4, 60 sec: 42324.7, 300 sec: 42764.9). Total num frames: 15876423680. Throughput: 0: 42374.2. Samples: 15876502940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:53,391][15132] Avg episode reward: [(0, '0.468')] [2024-06-25 21:22:54,669][15401] Updated weights for policy 0, policy_version 969024 (0.0030) [2024-06-25 21:22:58,166][15401] Updated weights for policy 0, policy_version 969034 (0.0033) [2024-06-25 21:22:58,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 15876669440. Throughput: 0: 42707.0. Samples: 15876771120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:22:58,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 21:23:02,081][15401] Updated weights for policy 0, policy_version 969044 (0.0041) [2024-06-25 21:23:03,390][15132] Fps is (10 sec: 42602.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15876849664. Throughput: 0: 42677.3. Samples: 15877021320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:03,390][15132] Avg episode reward: [(0, '0.540')] [2024-06-25 21:23:05,888][15401] Updated weights for policy 0, policy_version 969054 (0.0037) [2024-06-25 21:23:08,390][15132] Fps is (10 sec: 39322.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 15877062656. Throughput: 0: 42602.6. Samples: 15877147100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:08,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 21:23:09,943][15401] Updated weights for policy 0, policy_version 969064 (0.0028) [2024-06-25 21:23:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15877292032. Throughput: 0: 42795.7. Samples: 15877410960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:13,390][15132] Avg episode reward: [(0, '0.375')] [2024-06-25 21:23:13,411][15401] Updated weights for policy 0, policy_version 969074 (0.0040) [2024-06-25 21:23:17,616][15349] Signal inference workers to stop experience collection... (234900 times) [2024-06-25 21:23:17,647][15401] InferenceWorker_p0-w0: stopping experience collection (234900 times) [2024-06-25 21:23:17,673][15349] Signal inference workers to resume experience collection... (234900 times) [2024-06-25 21:23:17,673][15401] InferenceWorker_p0-w0: resuming experience collection (234900 times) [2024-06-25 21:23:17,676][15401] Updated weights for policy 0, policy_version 969084 (0.0035) [2024-06-25 21:23:18,396][15132] Fps is (10 sec: 44208.5, 60 sec: 42593.8, 300 sec: 42764.6). Total num frames: 15877505024. Throughput: 0: 42614.8. Samples: 15877656460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:18,396][15132] Avg episode reward: [(0, '0.184')] [2024-06-25 21:23:21,133][15401] Updated weights for policy 0, policy_version 969094 (0.0050) [2024-06-25 21:23:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15877701632. Throughput: 0: 42562.7. Samples: 15877782840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:23,392][15132] Avg episode reward: [(0, '0.281')] [2024-06-25 21:23:25,507][15401] Updated weights for policy 0, policy_version 969104 (0.0048) [2024-06-25 21:23:28,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15877931008. Throughput: 0: 42570.3. Samples: 15878045340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:28,390][15132] Avg episode reward: [(0, '0.853')] [2024-06-25 21:23:29,092][15401] Updated weights for policy 0, policy_version 969114 (0.0041) [2024-06-25 21:23:33,266][15401] Updated weights for policy 0, policy_version 969124 (0.0039) [2024-06-25 21:23:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15878127616. Throughput: 0: 42721.9. Samples: 15878303500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:33,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 21:23:36,522][15401] Updated weights for policy 0, policy_version 969134 (0.0042) [2024-06-25 21:23:38,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15878356992. Throughput: 0: 42780.4. Samples: 15878428020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-25 21:23:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:23:40,770][15401] Updated weights for policy 0, policy_version 969144 (0.0031) [2024-06-25 21:23:43,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15878553600. Throughput: 0: 42540.6. Samples: 15878685440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:23:43,390][15132] Avg episode reward: [(0, '0.359')] [2024-06-25 21:23:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969150_15878553600.pth... [2024-06-25 21:23:43,463][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968525_15868313600.pth [2024-06-25 21:23:44,339][15401] Updated weights for policy 0, policy_version 969154 (0.0035) [2024-06-25 21:23:48,382][15401] Updated weights for policy 0, policy_version 969164 (0.0028) [2024-06-25 21:23:48,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15878782976. Throughput: 0: 42595.7. Samples: 15878938120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:23:48,390][15132] Avg episode reward: [(0, '0.436')] [2024-06-25 21:23:52,168][15401] Updated weights for policy 0, policy_version 969174 (0.0032) [2024-06-25 21:23:53,390][15132] Fps is (10 sec: 45875.4, 60 sec: 43145.2, 300 sec: 42765.0). Total num frames: 15879012352. Throughput: 0: 42639.1. Samples: 15879065860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:23:53,392][15132] Avg episode reward: [(0, '0.765')] [2024-06-25 21:23:56,036][15401] Updated weights for policy 0, policy_version 969184 (0.0035) [2024-06-25 21:23:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 15879192576. Throughput: 0: 42609.2. Samples: 15879328380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:23:58,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 21:23:59,624][15401] Updated weights for policy 0, policy_version 969194 (0.0034) [2024-06-25 21:24:03,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15879421952. Throughput: 0: 42777.2. Samples: 15879581160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:03,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 21:24:03,506][15401] Updated weights for policy 0, policy_version 969204 (0.0042) [2024-06-25 21:24:07,345][15401] Updated weights for policy 0, policy_version 969214 (0.0041) [2024-06-25 21:24:08,390][15132] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 15879667712. Throughput: 0: 42994.2. Samples: 15879717580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:08,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 21:24:11,201][15401] Updated weights for policy 0, policy_version 969224 (0.0036) [2024-06-25 21:24:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 15879831552. Throughput: 0: 42822.2. Samples: 15879972340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:13,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 21:24:14,970][15401] Updated weights for policy 0, policy_version 969234 (0.0027) [2024-06-25 21:24:18,392][15132] Fps is (10 sec: 39313.6, 60 sec: 42601.5, 300 sec: 42764.7). Total num frames: 15880060928. Throughput: 0: 42706.4. Samples: 15880225380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:18,392][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 21:24:19,281][15401] Updated weights for policy 0, policy_version 969244 (0.0059) [2024-06-25 21:24:22,548][15401] Updated weights for policy 0, policy_version 969254 (0.0043) [2024-06-25 21:24:23,390][15132] Fps is (10 sec: 47513.6, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 15880306688. Throughput: 0: 42972.9. Samples: 15880361800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:23,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 21:24:26,809][15401] Updated weights for policy 0, policy_version 969264 (0.0044) [2024-06-25 21:24:28,390][15132] Fps is (10 sec: 40968.4, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 15880470528. Throughput: 0: 42870.7. Samples: 15880614620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:28,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 21:24:30,235][15401] Updated weights for policy 0, policy_version 969274 (0.0035) [2024-06-25 21:24:30,543][15349] Signal inference workers to stop experience collection... (234950 times) [2024-06-25 21:24:30,547][15349] Signal inference workers to resume experience collection... (234950 times) [2024-06-25 21:24:30,572][15401] InferenceWorker_p0-w0: stopping experience collection (234950 times) [2024-06-25 21:24:30,572][15401] InferenceWorker_p0-w0: resuming experience collection (234950 times) [2024-06-25 21:24:33,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15880699904. Throughput: 0: 43076.0. Samples: 15880876540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:33,390][15132] Avg episode reward: [(0, '0.731')] [2024-06-25 21:24:34,313][15401] Updated weights for policy 0, policy_version 969284 (0.0039) [2024-06-25 21:24:37,820][15401] Updated weights for policy 0, policy_version 969294 (0.0038) [2024-06-25 21:24:38,389][15132] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15880945664. Throughput: 0: 43064.5. Samples: 15881003760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:38,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 21:24:42,230][15401] Updated weights for policy 0, policy_version 969304 (0.0034) [2024-06-25 21:24:43,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15881109504. Throughput: 0: 42964.0. Samples: 15881261760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 21:24:45,327][15401] Updated weights for policy 0, policy_version 969314 (0.0045) [2024-06-25 21:24:48,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15881338880. Throughput: 0: 42895.5. Samples: 15881511460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:48,390][15132] Avg episode reward: [(0, '0.847')] [2024-06-25 21:24:49,813][15401] Updated weights for policy 0, policy_version 969324 (0.0034) [2024-06-25 21:24:52,975][15401] Updated weights for policy 0, policy_version 969334 (0.0032) [2024-06-25 21:24:53,389][15132] Fps is (10 sec: 49152.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15881601024. Throughput: 0: 42904.6. Samples: 15881648280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:53,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 21:24:57,266][15401] Updated weights for policy 0, policy_version 969344 (0.0041) [2024-06-25 21:24:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15881748480. Throughput: 0: 42928.1. Samples: 15881904100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:24:58,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 21:25:00,593][15401] Updated weights for policy 0, policy_version 969354 (0.0054) [2024-06-25 21:25:03,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15881994240. Throughput: 0: 42911.4. Samples: 15882156300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:03,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 21:25:05,278][15401] Updated weights for policy 0, policy_version 969364 (0.0032) [2024-06-25 21:25:08,357][15401] Updated weights for policy 0, policy_version 969374 (0.0038) [2024-06-25 21:25:08,389][15132] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15882223616. Throughput: 0: 43005.4. Samples: 15882297040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:08,390][15132] Avg episode reward: [(0, '0.483')] [2024-06-25 21:25:12,975][15401] Updated weights for policy 0, policy_version 969384 (0.0030) [2024-06-25 21:25:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42765.7). Total num frames: 15882403840. Throughput: 0: 42946.2. Samples: 15882547200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:13,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 21:25:16,011][15401] Updated weights for policy 0, policy_version 969394 (0.0036) [2024-06-25 21:25:18,392][15132] Fps is (10 sec: 42588.0, 60 sec: 43144.3, 300 sec: 42764.7). Total num frames: 15882649600. Throughput: 0: 42698.5. Samples: 15882798080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:18,393][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 21:25:20,509][15401] Updated weights for policy 0, policy_version 969404 (0.0036) [2024-06-25 21:25:23,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 15882829824. Throughput: 0: 42973.8. Samples: 15882937580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:23,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 21:25:23,790][15401] Updated weights for policy 0, policy_version 969414 (0.0029) [2024-06-25 21:25:28,108][15401] Updated weights for policy 0, policy_version 969424 (0.0036) [2024-06-25 21:25:28,392][15132] Fps is (10 sec: 39321.8, 60 sec: 42869.8, 300 sec: 42820.7). Total num frames: 15883042816. Throughput: 0: 42846.2. Samples: 15883189940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:28,392][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 21:25:31,295][15349] Signal inference workers to stop experience collection... (235000 times) [2024-06-25 21:25:31,295][15349] Signal inference workers to resume experience collection... (235000 times) [2024-06-25 21:25:31,313][15401] InferenceWorker_p0-w0: stopping experience collection (235000 times) [2024-06-25 21:25:31,313][15401] InferenceWorker_p0-w0: resuming experience collection (235000 times) [2024-06-25 21:25:31,459][15401] Updated weights for policy 0, policy_version 969434 (0.0034) [2024-06-25 21:25:33,390][15132] Fps is (10 sec: 47513.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15883304960. Throughput: 0: 42820.9. Samples: 15883438400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-25 21:25:33,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 21:25:35,604][15401] Updated weights for policy 0, policy_version 969444 (0.0033) [2024-06-25 21:25:38,389][15132] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15883485184. Throughput: 0: 42928.4. Samples: 15883580060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:25:38,390][15132] Avg episode reward: [(0, '0.553')] [2024-06-25 21:25:39,200][15401] Updated weights for policy 0, policy_version 969454 (0.0028) [2024-06-25 21:25:43,020][15401] Updated weights for policy 0, policy_version 969464 (0.0038) [2024-06-25 21:25:43,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15883698176. Throughput: 0: 42867.0. Samples: 15883833120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:25:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 21:25:43,416][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969464_15883698176.pth... [2024-06-25 21:25:43,474][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000968837_15873425408.pth [2024-06-25 21:25:46,767][15401] Updated weights for policy 0, policy_version 969474 (0.0039) [2024-06-25 21:25:48,390][15132] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42820.8). Total num frames: 15883943936. Throughput: 0: 42878.8. Samples: 15884085860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:25:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 21:25:50,779][15401] Updated weights for policy 0, policy_version 969484 (0.0033) [2024-06-25 21:25:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42710.4). Total num frames: 15884124160. Throughput: 0: 42871.5. Samples: 15884226260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:25:53,390][15132] Avg episode reward: [(0, '0.347')] [2024-06-25 21:25:54,338][15401] Updated weights for policy 0, policy_version 969494 (0.0029) [2024-06-25 21:25:58,389][15132] Fps is (10 sec: 39322.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15884337152. Throughput: 0: 42907.6. Samples: 15884478040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:25:58,390][15132] Avg episode reward: [(0, '0.291')] [2024-06-25 21:25:58,523][15401] Updated weights for policy 0, policy_version 969504 (0.0034) [2024-06-25 21:26:01,957][15401] Updated weights for policy 0, policy_version 969514 (0.0041) [2024-06-25 21:26:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15884582912. Throughput: 0: 42949.3. Samples: 15884730700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:03,390][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 21:26:06,305][15401] Updated weights for policy 0, policy_version 969524 (0.0033) [2024-06-25 21:26:08,392][15132] Fps is (10 sec: 42588.0, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 15884763136. Throughput: 0: 42855.0. Samples: 15884866160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:08,392][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 21:26:09,477][15401] Updated weights for policy 0, policy_version 969534 (0.0038) [2024-06-25 21:26:13,390][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15884992512. Throughput: 0: 42884.0. Samples: 15885119620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:13,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 21:26:13,670][15401] Updated weights for policy 0, policy_version 969544 (0.0048) [2024-06-25 21:26:17,285][15401] Updated weights for policy 0, policy_version 969554 (0.0031) [2024-06-25 21:26:18,389][15132] Fps is (10 sec: 47524.9, 60 sec: 43146.3, 300 sec: 42820.5). Total num frames: 15885238272. Throughput: 0: 42910.7. Samples: 15885369380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:18,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 21:26:21,187][15401] Updated weights for policy 0, policy_version 969564 (0.0032) [2024-06-25 21:26:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15885418496. Throughput: 0: 42757.8. Samples: 15885504160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:23,390][15132] Avg episode reward: [(0, '0.397')] [2024-06-25 21:26:24,871][15401] Updated weights for policy 0, policy_version 969574 (0.0042) [2024-06-25 21:26:28,390][15132] Fps is (10 sec: 39321.0, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 15885631488. Throughput: 0: 42860.8. Samples: 15885761860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:28,390][15132] Avg episode reward: [(0, '0.451')] [2024-06-25 21:26:28,811][15401] Updated weights for policy 0, policy_version 969584 (0.0036) [2024-06-25 21:26:32,392][15401] Updated weights for policy 0, policy_version 969594 (0.0043) [2024-06-25 21:26:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15885860864. Throughput: 0: 42805.5. Samples: 15886012100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:33,390][15132] Avg episode reward: [(0, '0.544')] [2024-06-25 21:26:36,877][15401] Updated weights for policy 0, policy_version 969604 (0.0034) [2024-06-25 21:26:38,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15886057472. Throughput: 0: 42664.4. Samples: 15886146160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:38,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 21:26:40,010][15401] Updated weights for policy 0, policy_version 969614 (0.0032) [2024-06-25 21:26:43,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42869.8, 300 sec: 42765.0). Total num frames: 15886270464. Throughput: 0: 42793.6. Samples: 15886403860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:43,392][15132] Avg episode reward: [(0, '0.249')] [2024-06-25 21:26:44,371][15401] Updated weights for policy 0, policy_version 969624 (0.0029) [2024-06-25 21:26:47,619][15401] Updated weights for policy 0, policy_version 969634 (0.0031) [2024-06-25 21:26:48,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15886516224. Throughput: 0: 42840.9. Samples: 15886658540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 21:26:51,916][15401] Updated weights for policy 0, policy_version 969644 (0.0030) [2024-06-25 21:26:53,389][15132] Fps is (10 sec: 44247.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15886712832. Throughput: 0: 42796.1. Samples: 15886791880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:53,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 21:26:54,627][15349] Signal inference workers to stop experience collection... (235050 times) [2024-06-25 21:26:54,628][15349] Signal inference workers to resume experience collection... (235050 times) [2024-06-25 21:26:54,677][15401] InferenceWorker_p0-w0: stopping experience collection (235050 times) [2024-06-25 21:26:54,677][15401] InferenceWorker_p0-w0: resuming experience collection (235050 times) [2024-06-25 21:26:55,619][15401] Updated weights for policy 0, policy_version 969654 (0.0036) [2024-06-25 21:26:58,392][15132] Fps is (10 sec: 40950.6, 60 sec: 43142.7, 300 sec: 42764.7). Total num frames: 15886925824. Throughput: 0: 42853.3. Samples: 15887048120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:26:58,392][15132] Avg episode reward: [(0, '0.171')] [2024-06-25 21:26:59,506][15401] Updated weights for policy 0, policy_version 969664 (0.0028) [2024-06-25 21:27:03,123][15401] Updated weights for policy 0, policy_version 969674 (0.0034) [2024-06-25 21:27:03,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15887138816. Throughput: 0: 42984.3. Samples: 15887303680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:03,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 21:27:07,592][15401] Updated weights for policy 0, policy_version 969684 (0.0033) [2024-06-25 21:27:08,390][15132] Fps is (10 sec: 40969.6, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 15887335424. Throughput: 0: 42892.8. Samples: 15887434340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:08,390][15132] Avg episode reward: [(0, '0.752')] [2024-06-25 21:27:10,573][15401] Updated weights for policy 0, policy_version 969694 (0.0041) [2024-06-25 21:27:13,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15887564800. Throughput: 0: 42809.1. Samples: 15887688260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:13,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 21:27:15,169][15401] Updated weights for policy 0, policy_version 969704 (0.0037) [2024-06-25 21:27:18,149][15401] Updated weights for policy 0, policy_version 969714 (0.0029) [2024-06-25 21:27:18,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 15887794176. Throughput: 0: 42852.8. Samples: 15887940480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:18,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 21:27:22,763][15401] Updated weights for policy 0, policy_version 969724 (0.0028) [2024-06-25 21:27:23,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15887974400. Throughput: 0: 42875.0. Samples: 15888075540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:23,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 21:27:26,253][15401] Updated weights for policy 0, policy_version 969734 (0.0030) [2024-06-25 21:27:28,389][15132] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15888187392. Throughput: 0: 42718.8. Samples: 15888326100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:28,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 21:27:30,222][15401] Updated weights for policy 0, policy_version 969744 (0.0035) [2024-06-25 21:27:33,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42596.6, 300 sec: 42820.2). Total num frames: 15888416768. Throughput: 0: 42724.9. Samples: 15888581260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-25 21:27:33,392][15132] Avg episode reward: [(0, '0.716')] [2024-06-25 21:27:33,815][15401] Updated weights for policy 0, policy_version 969754 (0.0026) [2024-06-25 21:27:37,831][15401] Updated weights for policy 0, policy_version 969764 (0.0043) [2024-06-25 21:27:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15888629760. Throughput: 0: 42740.9. Samples: 15888715220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:27:38,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 21:27:41,386][15401] Updated weights for policy 0, policy_version 969774 (0.0033) [2024-06-25 21:27:43,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15888826368. Throughput: 0: 42591.2. Samples: 15888964620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:27:43,390][15132] Avg episode reward: [(0, '0.253')] [2024-06-25 21:27:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969778_15888842752.pth... [2024-06-25 21:27:43,451][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969150_15878553600.pth [2024-06-25 21:27:45,429][15401] Updated weights for policy 0, policy_version 969784 (0.0027) [2024-06-25 21:27:48,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42820.7). Total num frames: 15889055744. Throughput: 0: 42665.9. Samples: 15889223640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:27:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 21:27:49,031][15401] Updated weights for policy 0, policy_version 969794 (0.0028) [2024-06-25 21:27:53,113][15401] Updated weights for policy 0, policy_version 969804 (0.0033) [2024-06-25 21:27:53,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15889268736. Throughput: 0: 42635.1. Samples: 15889352920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:27:53,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 21:27:56,549][15401] Updated weights for policy 0, policy_version 969814 (0.0033) [2024-06-25 21:27:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42873.2, 300 sec: 42876.1). Total num frames: 15889498112. Throughput: 0: 42497.7. Samples: 15889600660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:27:58,390][15132] Avg episode reward: [(0, '0.776')] [2024-06-25 21:28:01,178][15401] Updated weights for policy 0, policy_version 969824 (0.0037) [2024-06-25 21:28:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15889694720. Throughput: 0: 42846.8. Samples: 15889868580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:03,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 21:28:04,325][15401] Updated weights for policy 0, policy_version 969834 (0.0037) [2024-06-25 21:28:08,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15889907712. Throughput: 0: 42584.4. Samples: 15889991840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:08,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 21:28:08,648][15401] Updated weights for policy 0, policy_version 969844 (0.0027) [2024-06-25 21:28:11,824][15401] Updated weights for policy 0, policy_version 969854 (0.0031) [2024-06-25 21:28:13,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42596.6, 300 sec: 42765.6). Total num frames: 15890120704. Throughput: 0: 42579.9. Samples: 15890242300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:13,392][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 21:28:16,109][15401] Updated weights for policy 0, policy_version 969864 (0.0036) [2024-06-25 21:28:18,392][15132] Fps is (10 sec: 44226.6, 60 sec: 42596.7, 300 sec: 42875.8). Total num frames: 15890350080. Throughput: 0: 42901.8. Samples: 15890511840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:18,392][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 21:28:19,386][15401] Updated weights for policy 0, policy_version 969874 (0.0040) [2024-06-25 21:28:23,390][15132] Fps is (10 sec: 44243.8, 60 sec: 43144.0, 300 sec: 42820.4). Total num frames: 15890563072. Throughput: 0: 42772.5. Samples: 15890640020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:23,391][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 21:28:23,614][15401] Updated weights for policy 0, policy_version 969884 (0.0031) [2024-06-25 21:28:25,668][15349] Signal inference workers to stop experience collection... (235100 times) [2024-06-25 21:28:25,668][15349] Signal inference workers to resume experience collection... (235100 times) [2024-06-25 21:28:25,716][15401] InferenceWorker_p0-w0: stopping experience collection (235100 times) [2024-06-25 21:28:25,716][15401] InferenceWorker_p0-w0: resuming experience collection (235100 times) [2024-06-25 21:28:27,021][15401] Updated weights for policy 0, policy_version 969894 (0.0030) [2024-06-25 21:28:28,392][15132] Fps is (10 sec: 42598.4, 60 sec: 43142.7, 300 sec: 42875.7). Total num frames: 15890776064. Throughput: 0: 42712.8. Samples: 15890886800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:28,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 21:28:31,259][15401] Updated weights for policy 0, policy_version 969904 (0.0032) [2024-06-25 21:28:33,389][15132] Fps is (10 sec: 42601.9, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 15890989056. Throughput: 0: 42882.7. Samples: 15891153360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:33,390][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 21:28:35,177][15401] Updated weights for policy 0, policy_version 969914 (0.0030) [2024-06-25 21:28:38,390][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 15891185664. Throughput: 0: 42736.0. Samples: 15891276040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:38,392][15132] Avg episode reward: [(0, '0.863')] [2024-06-25 21:28:39,167][15401] Updated weights for policy 0, policy_version 969924 (0.0027) [2024-06-25 21:28:42,742][15401] Updated weights for policy 0, policy_version 969934 (0.0031) [2024-06-25 21:28:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15891415040. Throughput: 0: 42904.9. Samples: 15891531380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:43,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 21:28:46,827][15401] Updated weights for policy 0, policy_version 969944 (0.0029) [2024-06-25 21:28:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15891628032. Throughput: 0: 42753.7. Samples: 15891792500. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 21:28:50,803][15401] Updated weights for policy 0, policy_version 969954 (0.0028) [2024-06-25 21:28:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15891841024. Throughput: 0: 42592.1. Samples: 15891908480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:53,390][15132] Avg episode reward: [(0, '0.635')] [2024-06-25 21:28:54,479][15401] Updated weights for policy 0, policy_version 969964 (0.0026) [2024-06-25 21:28:58,281][15401] Updated weights for policy 0, policy_version 969974 (0.0045) [2024-06-25 21:28:58,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15892054016. Throughput: 0: 42856.8. Samples: 15892170760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:28:58,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 21:29:02,075][15401] Updated weights for policy 0, policy_version 969984 (0.0031) [2024-06-25 21:29:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15892283392. Throughput: 0: 42484.0. Samples: 15892423520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:03,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:29:05,868][15401] Updated weights for policy 0, policy_version 969994 (0.0032) [2024-06-25 21:29:08,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15892463616. Throughput: 0: 42547.4. Samples: 15892554620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:08,390][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 21:29:09,975][15401] Updated weights for policy 0, policy_version 970004 (0.0036) [2024-06-25 21:29:13,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42873.2, 300 sec: 42820.9). Total num frames: 15892692992. Throughput: 0: 42611.7. Samples: 15892804220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:13,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 21:29:13,557][15401] Updated weights for policy 0, policy_version 970014 (0.0037) [2024-06-25 21:29:17,555][15401] Updated weights for policy 0, policy_version 970024 (0.0026) [2024-06-25 21:29:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15892905984. Throughput: 0: 42487.0. Samples: 15893065280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:18,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 21:29:21,731][15401] Updated weights for policy 0, policy_version 970034 (0.0029) [2024-06-25 21:29:23,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42325.8, 300 sec: 42820.5). Total num frames: 15893102592. Throughput: 0: 42705.7. Samples: 15893197800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:23,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 21:29:25,131][15401] Updated weights for policy 0, policy_version 970044 (0.0028) [2024-06-25 21:29:28,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42820.6). Total num frames: 15893331968. Throughput: 0: 42561.8. Samples: 15893446660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:28,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 21:29:29,531][15401] Updated weights for policy 0, policy_version 970054 (0.0035) [2024-06-25 21:29:32,859][15401] Updated weights for policy 0, policy_version 970064 (0.0043) [2024-06-25 21:29:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15893544960. Throughput: 0: 42399.1. Samples: 15893700460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-25 21:29:33,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 21:29:37,153][15401] Updated weights for policy 0, policy_version 970074 (0.0033) [2024-06-25 21:29:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15893725184. Throughput: 0: 42713.9. Samples: 15893830600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:29:38,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 21:29:40,454][15401] Updated weights for policy 0, policy_version 970084 (0.0027) [2024-06-25 21:29:43,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 15893970944. Throughput: 0: 42415.5. Samples: 15894079460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:29:43,390][15132] Avg episode reward: [(0, '0.736')] [2024-06-25 21:29:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970091_15893970944.pth... [2024-06-25 21:29:43,455][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969464_15883698176.pth [2024-06-25 21:29:45,063][15401] Updated weights for policy 0, policy_version 970094 (0.0045) [2024-06-25 21:29:48,046][15401] Updated weights for policy 0, policy_version 970104 (0.0040) [2024-06-25 21:29:48,389][15132] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15894200320. Throughput: 0: 42387.7. Samples: 15894330960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:29:48,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 21:29:50,560][15349] Signal inference workers to stop experience collection... (235150 times) [2024-06-25 21:29:50,561][15349] Signal inference workers to resume experience collection... (235150 times) [2024-06-25 21:29:50,593][15401] InferenceWorker_p0-w0: stopping experience collection (235150 times) [2024-06-25 21:29:50,593][15401] InferenceWorker_p0-w0: resuming experience collection (235150 times) [2024-06-25 21:29:52,784][15401] Updated weights for policy 0, policy_version 970114 (0.0026) [2024-06-25 21:29:53,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15894380544. Throughput: 0: 42365.8. Samples: 15894461080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:29:53,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 21:29:56,173][15401] Updated weights for policy 0, policy_version 970124 (0.0028) [2024-06-25 21:29:58,390][15132] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15894609920. Throughput: 0: 42605.6. Samples: 15894721480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:29:58,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 21:30:00,472][15401] Updated weights for policy 0, policy_version 970134 (0.0028) [2024-06-25 21:30:03,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15894822912. Throughput: 0: 42439.6. Samples: 15894975060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:03,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 21:30:03,678][15401] Updated weights for policy 0, policy_version 970144 (0.0040) [2024-06-25 21:30:08,038][15401] Updated weights for policy 0, policy_version 970154 (0.0048) [2024-06-25 21:30:08,389][15132] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15895019520. Throughput: 0: 42248.2. Samples: 15895098960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:08,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 21:30:11,286][15401] Updated weights for policy 0, policy_version 970164 (0.0028) [2024-06-25 21:30:13,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42654.3). Total num frames: 15895232512. Throughput: 0: 42443.0. Samples: 15895356600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:13,390][15132] Avg episode reward: [(0, '0.821')] [2024-06-25 21:30:15,749][15401] Updated weights for policy 0, policy_version 970174 (0.0043) [2024-06-25 21:30:18,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15895445504. Throughput: 0: 42479.6. Samples: 15895612040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:18,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 21:30:18,996][15401] Updated weights for policy 0, policy_version 970184 (0.0027) [2024-06-25 21:30:23,369][15401] Updated weights for policy 0, policy_version 970194 (0.0039) [2024-06-25 21:30:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 15895658496. Throughput: 0: 42492.3. Samples: 15895742760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:23,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 21:30:26,639][15401] Updated weights for policy 0, policy_version 970204 (0.0040) [2024-06-25 21:30:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15895871488. Throughput: 0: 42582.0. Samples: 15895995640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:28,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 21:30:30,920][15401] Updated weights for policy 0, policy_version 970214 (0.0027) [2024-06-25 21:30:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15896084480. Throughput: 0: 42572.0. Samples: 15896246700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:33,390][15132] Avg episode reward: [(0, '0.310')] [2024-06-25 21:30:34,912][15401] Updated weights for policy 0, policy_version 970224 (0.0026) [2024-06-25 21:30:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15896297472. Throughput: 0: 42505.8. Samples: 15896373840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:38,390][15132] Avg episode reward: [(0, '0.391')] [2024-06-25 21:30:38,686][15401] Updated weights for policy 0, policy_version 970234 (0.0034) [2024-06-25 21:30:42,806][15401] Updated weights for policy 0, policy_version 970244 (0.0039) [2024-06-25 21:30:43,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 15896510464. Throughput: 0: 42410.0. Samples: 15896629920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:43,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 21:30:46,205][15401] Updated weights for policy 0, policy_version 970254 (0.0025) [2024-06-25 21:30:48,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 15896723456. Throughput: 0: 42614.3. Samples: 15896892700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:48,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 21:30:50,149][15401] Updated weights for policy 0, policy_version 970264 (0.0041) [2024-06-25 21:30:53,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15896936448. Throughput: 0: 42669.8. Samples: 15897019100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:53,390][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 21:30:54,183][15401] Updated weights for policy 0, policy_version 970274 (0.0032) [2024-06-25 21:30:57,675][15401] Updated weights for policy 0, policy_version 970284 (0.0038) [2024-06-25 21:30:58,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 15897149440. Throughput: 0: 42515.6. Samples: 15897269800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:30:58,396][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 21:31:01,668][15401] Updated weights for policy 0, policy_version 970294 (0.0028) [2024-06-25 21:31:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 15897362432. Throughput: 0: 42602.2. Samples: 15897529140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 21:31:05,221][15401] Updated weights for policy 0, policy_version 970304 (0.0029) [2024-06-25 21:31:08,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15897575424. Throughput: 0: 42637.9. Samples: 15897661460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:08,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 21:31:09,143][15401] Updated weights for policy 0, policy_version 970314 (0.0028) [2024-06-25 21:31:13,043][15401] Updated weights for policy 0, policy_version 970324 (0.0035) [2024-06-25 21:31:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15897788416. Throughput: 0: 42627.5. Samples: 15897913880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 21:31:17,041][15401] Updated weights for policy 0, policy_version 970334 (0.0039) [2024-06-25 21:31:18,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 15898001408. Throughput: 0: 42819.6. Samples: 15898173580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 21:31:20,866][15401] Updated weights for policy 0, policy_version 970344 (0.0054) [2024-06-25 21:31:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15898214400. Throughput: 0: 42810.7. Samples: 15898300320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:23,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 21:31:24,853][15401] Updated weights for policy 0, policy_version 970354 (0.0042) [2024-06-25 21:31:25,991][15349] Signal inference workers to stop experience collection... (235200 times) [2024-06-25 21:31:25,991][15349] Signal inference workers to resume experience collection... (235200 times) [2024-06-25 21:31:26,030][15401] InferenceWorker_p0-w0: stopping experience collection (235200 times) [2024-06-25 21:31:26,031][15401] InferenceWorker_p0-w0: resuming experience collection (235200 times) [2024-06-25 21:31:28,390][15132] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15898427392. Throughput: 0: 42737.5. Samples: 15898553120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-25 21:31:28,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 21:31:28,549][15401] Updated weights for policy 0, policy_version 970364 (0.0041) [2024-06-25 21:31:32,483][15401] Updated weights for policy 0, policy_version 970374 (0.0039) [2024-06-25 21:31:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15898640384. Throughput: 0: 42587.5. Samples: 15898809140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:33,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 21:31:36,535][15401] Updated weights for policy 0, policy_version 970384 (0.0032) [2024-06-25 21:31:38,389][15132] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 15898853376. Throughput: 0: 42748.5. Samples: 15898942780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:38,390][15132] Avg episode reward: [(0, '0.639')] [2024-06-25 21:31:39,986][15401] Updated weights for policy 0, policy_version 970394 (0.0037) [2024-06-25 21:31:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15899082752. Throughput: 0: 42750.2. Samples: 15899193560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:43,390][15132] Avg episode reward: [(0, '0.428')] [2024-06-25 21:31:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970403_15899082752.pth... [2024-06-25 21:31:43,477][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000969778_15888842752.pth [2024-06-25 21:31:44,543][15401] Updated weights for policy 0, policy_version 970404 (0.0032) [2024-06-25 21:31:47,746][15401] Updated weights for policy 0, policy_version 970414 (0.0037) [2024-06-25 21:31:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15899279360. Throughput: 0: 42650.3. Samples: 15899448400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:48,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 21:31:52,305][15401] Updated weights for policy 0, policy_version 970424 (0.0039) [2024-06-25 21:31:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 15899492352. Throughput: 0: 42530.1. Samples: 15899575320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:53,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 21:31:55,365][15401] Updated weights for policy 0, policy_version 970434 (0.0031) [2024-06-25 21:31:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15899705344. Throughput: 0: 42524.5. Samples: 15899827480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:31:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 21:31:59,821][15401] Updated weights for policy 0, policy_version 970444 (0.0040) [2024-06-25 21:32:02,837][15401] Updated weights for policy 0, policy_version 970454 (0.0036) [2024-06-25 21:32:03,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15899934720. Throughput: 0: 42533.2. Samples: 15900087580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:03,390][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 21:32:07,574][15401] Updated weights for policy 0, policy_version 970464 (0.0040) [2024-06-25 21:32:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15900131328. Throughput: 0: 42515.5. Samples: 15900213520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:08,395][15132] Avg episode reward: [(0, '0.769')] [2024-06-25 21:32:10,822][15401] Updated weights for policy 0, policy_version 970474 (0.0041) [2024-06-25 21:32:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15900360704. Throughput: 0: 42671.8. Samples: 15900473340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:13,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 21:32:15,017][15401] Updated weights for policy 0, policy_version 970484 (0.0041) [2024-06-25 21:32:18,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15900557312. Throughput: 0: 42696.6. Samples: 15900730480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:18,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 21:32:18,586][15401] Updated weights for policy 0, policy_version 970494 (0.0035) [2024-06-25 21:32:22,590][15401] Updated weights for policy 0, policy_version 970504 (0.0034) [2024-06-25 21:32:23,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15900753920. Throughput: 0: 42527.9. Samples: 15900856540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:23,390][15132] Avg episode reward: [(0, '0.766')] [2024-06-25 21:32:26,196][15401] Updated weights for policy 0, policy_version 970514 (0.0036) [2024-06-25 21:32:28,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 15901016064. Throughput: 0: 42736.9. Samples: 15901116720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:28,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 21:32:30,269][15401] Updated weights for policy 0, policy_version 970524 (0.0029) [2024-06-25 21:32:33,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 15901212672. Throughput: 0: 42839.5. Samples: 15901376180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:33,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 21:32:34,048][15401] Updated weights for policy 0, policy_version 970534 (0.0040) [2024-06-25 21:32:35,080][15349] Signal inference workers to stop experience collection... (235250 times) [2024-06-25 21:32:35,128][15401] InferenceWorker_p0-w0: stopping experience collection (235250 times) [2024-06-25 21:32:35,135][15349] Signal inference workers to resume experience collection... (235250 times) [2024-06-25 21:32:35,151][15401] InferenceWorker_p0-w0: resuming experience collection (235250 times) [2024-06-25 21:32:37,875][15401] Updated weights for policy 0, policy_version 970544 (0.0034) [2024-06-25 21:32:38,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15901409280. Throughput: 0: 42789.9. Samples: 15901500860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:38,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 21:32:41,515][15401] Updated weights for policy 0, policy_version 970554 (0.0029) [2024-06-25 21:32:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15901671424. Throughput: 0: 42964.5. Samples: 15901760880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:43,390][15132] Avg episode reward: [(0, '0.284')] [2024-06-25 21:32:45,661][15401] Updated weights for policy 0, policy_version 970564 (0.0043) [2024-06-25 21:32:48,390][15132] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15901868032. Throughput: 0: 42886.2. Samples: 15902017460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:48,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 21:32:49,088][15401] Updated weights for policy 0, policy_version 970574 (0.0026) [2024-06-25 21:32:53,138][15401] Updated weights for policy 0, policy_version 970584 (0.0037) [2024-06-25 21:32:53,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15902064640. Throughput: 0: 42775.1. Samples: 15902138400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:53,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 21:32:56,376][15401] Updated weights for policy 0, policy_version 970594 (0.0040) [2024-06-25 21:32:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15902310400. Throughput: 0: 43095.1. Samples: 15902412620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:32:58,390][15132] Avg episode reward: [(0, '0.415')] [2024-06-25 21:33:00,542][15401] Updated weights for policy 0, policy_version 970604 (0.0033) [2024-06-25 21:33:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15902507008. Throughput: 0: 42987.6. Samples: 15902664920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:03,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 21:33:03,864][15401] Updated weights for policy 0, policy_version 970614 (0.0033) [2024-06-25 21:33:08,059][15401] Updated weights for policy 0, policy_version 970624 (0.0035) [2024-06-25 21:33:08,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 15902720000. Throughput: 0: 42908.9. Samples: 15902787440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:08,390][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 21:33:11,412][15401] Updated weights for policy 0, policy_version 970634 (0.0036) [2024-06-25 21:33:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 15902932992. Throughput: 0: 42878.2. Samples: 15903046240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:13,390][15132] Avg episode reward: [(0, '0.789')] [2024-06-25 21:33:15,596][15401] Updated weights for policy 0, policy_version 970644 (0.0030) [2024-06-25 21:33:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42543.0). Total num frames: 15903113216. Throughput: 0: 42872.1. Samples: 15903305420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:18,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 21:33:19,818][15401] Updated weights for policy 0, policy_version 970654 (0.0044) [2024-06-25 21:33:23,248][15401] Updated weights for policy 0, policy_version 970664 (0.0034) [2024-06-25 21:33:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42654.3). Total num frames: 15903358976. Throughput: 0: 42718.6. Samples: 15903423200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:23,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 21:33:27,432][15401] Updated weights for policy 0, policy_version 970674 (0.0032) [2024-06-25 21:33:28,390][15132] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15903588352. Throughput: 0: 42895.1. Samples: 15903691160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-25 21:33:28,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 21:33:31,199][15401] Updated weights for policy 0, policy_version 970684 (0.0046) [2024-06-25 21:33:33,390][15132] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15903752192. Throughput: 0: 42954.1. Samples: 15903950400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:33,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 21:33:34,931][15401] Updated weights for policy 0, policy_version 970694 (0.0027) [2024-06-25 21:33:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 15903997952. Throughput: 0: 42919.6. Samples: 15904069780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:38,390][15132] Avg episode reward: [(0, '0.688')] [2024-06-25 21:33:38,760][15401] Updated weights for policy 0, policy_version 970704 (0.0027) [2024-06-25 21:33:40,229][15349] Signal inference workers to stop experience collection... (235300 times) [2024-06-25 21:33:40,230][15349] Signal inference workers to resume experience collection... (235300 times) [2024-06-25 21:33:40,284][15401] InferenceWorker_p0-w0: stopping experience collection (235300 times) [2024-06-25 21:33:40,284][15401] InferenceWorker_p0-w0: resuming experience collection (235300 times) [2024-06-25 21:33:42,462][15401] Updated weights for policy 0, policy_version 970714 (0.0033) [2024-06-25 21:33:43,390][15132] Fps is (10 sec: 47514.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15904227328. Throughput: 0: 42753.2. Samples: 15904336520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:43,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 21:33:43,506][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970718_15904243712.pth... [2024-06-25 21:33:43,558][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970091_15893970944.pth [2024-06-25 21:33:46,453][15401] Updated weights for policy 0, policy_version 970724 (0.0049) [2024-06-25 21:33:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15904407552. Throughput: 0: 42687.5. Samples: 15904585860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:48,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 21:33:50,166][15401] Updated weights for policy 0, policy_version 970734 (0.0034) [2024-06-25 21:33:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15904620544. Throughput: 0: 42699.9. Samples: 15904708940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:53,390][15132] Avg episode reward: [(0, '0.746')] [2024-06-25 21:33:53,997][15401] Updated weights for policy 0, policy_version 970744 (0.0034) [2024-06-25 21:33:57,910][15401] Updated weights for policy 0, policy_version 970754 (0.0037) [2024-06-25 21:33:58,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15904849920. Throughput: 0: 42859.1. Samples: 15904974900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:33:58,390][15132] Avg episode reward: [(0, '0.764')] [2024-06-25 21:34:01,537][15401] Updated weights for policy 0, policy_version 970764 (0.0031) [2024-06-25 21:34:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15905062912. Throughput: 0: 42699.0. Samples: 15905226880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:03,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 21:34:05,712][15401] Updated weights for policy 0, policy_version 970774 (0.0035) [2024-06-25 21:34:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15905275904. Throughput: 0: 42817.3. Samples: 15905349980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:08,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 21:34:09,526][15401] Updated weights for policy 0, policy_version 970784 (0.0042) [2024-06-25 21:34:13,302][15401] Updated weights for policy 0, policy_version 970794 (0.0035) [2024-06-25 21:34:13,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15905488896. Throughput: 0: 42571.0. Samples: 15905606860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:13,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:34:17,560][15401] Updated weights for policy 0, policy_version 970804 (0.0036) [2024-06-25 21:34:18,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15905685504. Throughput: 0: 42498.0. Samples: 15905862800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:18,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 21:34:20,999][15401] Updated weights for policy 0, policy_version 970814 (0.0033) [2024-06-25 21:34:23,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15905914880. Throughput: 0: 42562.7. Samples: 15905985100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:23,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 21:34:25,153][15401] Updated weights for policy 0, policy_version 970824 (0.0029) [2024-06-25 21:34:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15906127872. Throughput: 0: 42329.8. Samples: 15906241360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:28,393][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 21:34:29,067][15401] Updated weights for policy 0, policy_version 970834 (0.0029) [2024-06-25 21:34:32,793][15401] Updated weights for policy 0, policy_version 970844 (0.0035) [2024-06-25 21:34:33,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15906324480. Throughput: 0: 42550.2. Samples: 15906500620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:33,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 21:34:36,631][15401] Updated weights for policy 0, policy_version 970854 (0.0031) [2024-06-25 21:34:38,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15906537472. Throughput: 0: 42605.9. Samples: 15906626200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:34:40,196][15401] Updated weights for policy 0, policy_version 970864 (0.0027) [2024-06-25 21:34:42,466][15349] Signal inference workers to stop experience collection... (235350 times) [2024-06-25 21:34:42,466][15349] Signal inference workers to resume experience collection... (235350 times) [2024-06-25 21:34:42,488][15401] InferenceWorker_p0-w0: stopping experience collection (235350 times) [2024-06-25 21:34:42,488][15401] InferenceWorker_p0-w0: resuming experience collection (235350 times) [2024-06-25 21:34:43,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15906766848. Throughput: 0: 42568.8. Samples: 15906890500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:43,396][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 21:34:44,128][15401] Updated weights for policy 0, policy_version 970874 (0.0035) [2024-06-25 21:34:47,645][15401] Updated weights for policy 0, policy_version 970884 (0.0036) [2024-06-25 21:34:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15906979840. Throughput: 0: 42560.9. Samples: 15907142120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:48,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 21:34:51,798][15401] Updated weights for policy 0, policy_version 970894 (0.0031) [2024-06-25 21:34:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 15907192832. Throughput: 0: 42748.6. Samples: 15907273660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:53,390][15132] Avg episode reward: [(0, '0.737')] [2024-06-25 21:34:55,006][15401] Updated weights for policy 0, policy_version 970904 (0.0036) [2024-06-25 21:34:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15907405824. Throughput: 0: 42897.5. Samples: 15907537240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:34:58,390][15132] Avg episode reward: [(0, '0.352')] [2024-06-25 21:34:59,323][15401] Updated weights for policy 0, policy_version 970914 (0.0039) [2024-06-25 21:35:03,046][15401] Updated weights for policy 0, policy_version 970924 (0.0033) [2024-06-25 21:35:03,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15907635200. Throughput: 0: 42723.9. Samples: 15907785380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:35:03,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 21:35:06,883][15401] Updated weights for policy 0, policy_version 970934 (0.0024) [2024-06-25 21:35:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15907815424. Throughput: 0: 42867.5. Samples: 15907914140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:35:08,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 21:35:10,713][15401] Updated weights for policy 0, policy_version 970944 (0.0034) [2024-06-25 21:35:13,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15908044800. Throughput: 0: 43070.3. Samples: 15908179520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:35:13,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:35:14,433][15401] Updated weights for policy 0, policy_version 970954 (0.0048) [2024-06-25 21:35:18,281][15401] Updated weights for policy 0, policy_version 970964 (0.0037) [2024-06-25 21:35:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15908274176. Throughput: 0: 42788.9. Samples: 15908426120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:35:18,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 21:35:22,274][15401] Updated weights for policy 0, policy_version 970974 (0.0024) [2024-06-25 21:35:23,392][15132] Fps is (10 sec: 42587.5, 60 sec: 42596.6, 300 sec: 42709.1). Total num frames: 15908470784. Throughput: 0: 42944.7. Samples: 15908558820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-25 21:35:23,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 21:35:25,897][15401] Updated weights for policy 0, policy_version 970984 (0.0028) [2024-06-25 21:35:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15908700160. Throughput: 0: 42733.9. Samples: 15908813520. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:28,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 21:35:29,985][15401] Updated weights for policy 0, policy_version 970994 (0.0033) [2024-06-25 21:35:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15908913152. Throughput: 0: 42699.1. Samples: 15909063580. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:33,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 21:35:33,441][15401] Updated weights for policy 0, policy_version 971004 (0.0040) [2024-06-25 21:35:37,545][15401] Updated weights for policy 0, policy_version 971014 (0.0031) [2024-06-25 21:35:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15909109760. Throughput: 0: 42754.7. Samples: 15909197620. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:38,390][15132] Avg episode reward: [(0, '0.500')] [2024-06-25 21:35:41,195][15401] Updated weights for policy 0, policy_version 971024 (0.0034) [2024-06-25 21:35:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15909322752. Throughput: 0: 42556.3. Samples: 15909452280. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:43,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 21:35:43,414][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971028_15909322752.pth... [2024-06-25 21:35:43,503][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970403_15899082752.pth [2024-06-25 21:35:45,095][15401] Updated weights for policy 0, policy_version 971034 (0.0023) [2024-06-25 21:35:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15909535744. Throughput: 0: 42799.2. Samples: 15909711340. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 21:35:49,200][15401] Updated weights for policy 0, policy_version 971044 (0.0042) [2024-06-25 21:35:52,705][15401] Updated weights for policy 0, policy_version 971054 (0.0037) [2024-06-25 21:35:53,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15909765120. Throughput: 0: 42764.1. Samples: 15909838520. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:53,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 21:35:56,987][15401] Updated weights for policy 0, policy_version 971064 (0.0024) [2024-06-25 21:35:58,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 15909978112. Throughput: 0: 42569.6. Samples: 15910095160. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:35:58,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 21:36:00,310][15401] Updated weights for policy 0, policy_version 971074 (0.0033) [2024-06-25 21:36:03,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15910191104. Throughput: 0: 42815.6. Samples: 15910352820. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:03,390][15132] Avg episode reward: [(0, '0.665')] [2024-06-25 21:36:04,394][15401] Updated weights for policy 0, policy_version 971084 (0.0031) [2024-06-25 21:36:06,426][15349] Signal inference workers to stop experience collection... (235400 times) [2024-06-25 21:36:06,427][15349] Signal inference workers to resume experience collection... (235400 times) [2024-06-25 21:36:06,440][15401] InferenceWorker_p0-w0: stopping experience collection (235400 times) [2024-06-25 21:36:06,441][15401] InferenceWorker_p0-w0: resuming experience collection (235400 times) [2024-06-25 21:36:08,204][15401] Updated weights for policy 0, policy_version 971094 (0.0027) [2024-06-25 21:36:08,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15910404096. Throughput: 0: 42791.8. Samples: 15910484340. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:08,390][15132] Avg episode reward: [(0, '0.323')] [2024-06-25 21:36:11,885][15401] Updated weights for policy 0, policy_version 971104 (0.0032) [2024-06-25 21:36:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 15910617088. Throughput: 0: 42729.1. Samples: 15910736340. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:13,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 21:36:15,905][15401] Updated weights for policy 0, policy_version 971114 (0.0033) [2024-06-25 21:36:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15910830080. Throughput: 0: 42840.0. Samples: 15910991380. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:18,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 21:36:19,997][15401] Updated weights for policy 0, policy_version 971124 (0.0048) [2024-06-25 21:36:23,395][15132] Fps is (10 sec: 40938.9, 60 sec: 42596.4, 300 sec: 42708.7). Total num frames: 15911026688. Throughput: 0: 42638.5. Samples: 15911116580. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:23,395][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 21:36:23,631][15401] Updated weights for policy 0, policy_version 971134 (0.0048) [2024-06-25 21:36:28,060][15401] Updated weights for policy 0, policy_version 971144 (0.0046) [2024-06-25 21:36:28,396][15132] Fps is (10 sec: 42571.6, 60 sec: 42593.8, 300 sec: 42764.1). Total num frames: 15911256064. Throughput: 0: 42554.9. Samples: 15911367520. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:28,396][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 21:36:31,518][15401] Updated weights for policy 0, policy_version 971154 (0.0036) [2024-06-25 21:36:33,389][15132] Fps is (10 sec: 42621.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15911452672. Throughput: 0: 42594.7. Samples: 15911628100. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:33,390][15132] Avg episode reward: [(0, '0.753')] [2024-06-25 21:36:35,525][15401] Updated weights for policy 0, policy_version 971164 (0.0037) [2024-06-25 21:36:38,390][15132] Fps is (10 sec: 39346.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 15911649280. Throughput: 0: 42578.9. Samples: 15911754580. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:38,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 21:36:39,258][15401] Updated weights for policy 0, policy_version 971174 (0.0032) [2024-06-25 21:36:42,947][15401] Updated weights for policy 0, policy_version 971184 (0.0043) [2024-06-25 21:36:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 15911878656. Throughput: 0: 42484.5. Samples: 15912007060. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:43,392][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 21:36:46,790][15401] Updated weights for policy 0, policy_version 971194 (0.0028) [2024-06-25 21:36:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15912091648. Throughput: 0: 42446.5. Samples: 15912262920. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:48,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 21:36:50,490][15401] Updated weights for policy 0, policy_version 971204 (0.0040) [2024-06-25 21:36:53,392][15132] Fps is (10 sec: 42598.4, 60 sec: 42323.6, 300 sec: 42709.1). Total num frames: 15912304640. Throughput: 0: 42323.0. Samples: 15912388980. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:53,392][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 21:36:54,747][15401] Updated weights for policy 0, policy_version 971214 (0.0036) [2024-06-25 21:36:58,239][15401] Updated weights for policy 0, policy_version 971224 (0.0027) [2024-06-25 21:36:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15912534016. Throughput: 0: 42441.0. Samples: 15912646180. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:36:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:37:02,527][15401] Updated weights for policy 0, policy_version 971234 (0.0029) [2024-06-25 21:37:03,389][15132] Fps is (10 sec: 42608.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15912730624. Throughput: 0: 42455.6. Samples: 15912901880. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:37:03,390][15132] Avg episode reward: [(0, '0.457')] [2024-06-25 21:37:06,033][15401] Updated weights for policy 0, policy_version 971244 (0.0035) [2024-06-25 21:37:08,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42323.6, 300 sec: 42653.6). Total num frames: 15912943616. Throughput: 0: 42498.3. Samples: 15913028880. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:37:08,393][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 21:37:10,264][15401] Updated weights for policy 0, policy_version 971254 (0.0036) [2024-06-25 21:37:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15913172992. Throughput: 0: 42553.5. Samples: 15913282160. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:37:13,390][15132] Avg episode reward: [(0, '0.399')] [2024-06-25 21:37:13,614][15401] Updated weights for policy 0, policy_version 971264 (0.0036) [2024-06-25 21:37:18,129][15401] Updated weights for policy 0, policy_version 971274 (0.0032) [2024-06-25 21:37:18,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15913369600. Throughput: 0: 42511.0. Samples: 15913541100. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:37:18,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 21:37:21,127][15401] Updated weights for policy 0, policy_version 971284 (0.0028) [2024-06-25 21:37:23,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42602.2, 300 sec: 42598.4). Total num frames: 15913582592. Throughput: 0: 42361.6. Samples: 15913660840. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-06-25 21:37:23,390][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 21:37:25,850][15401] Updated weights for policy 0, policy_version 971294 (0.0033) [2024-06-25 21:37:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 15913811968. Throughput: 0: 42550.4. Samples: 15913921720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:28,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 21:37:28,729][15401] Updated weights for policy 0, policy_version 971304 (0.0049) [2024-06-25 21:37:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15913992192. Throughput: 0: 42663.3. Samples: 15914182760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:33,390][15132] Avg episode reward: [(0, '0.904')] [2024-06-25 21:37:33,497][15401] Updated weights for policy 0, policy_version 971314 (0.0037) [2024-06-25 21:37:34,728][15349] Signal inference workers to stop experience collection... (235450 times) [2024-06-25 21:37:34,780][15401] InferenceWorker_p0-w0: stopping experience collection (235450 times) [2024-06-25 21:37:34,783][15349] Signal inference workers to resume experience collection... (235450 times) [2024-06-25 21:37:34,794][15401] InferenceWorker_p0-w0: resuming experience collection (235450 times) [2024-06-25 21:37:36,259][15401] Updated weights for policy 0, policy_version 971324 (0.0026) [2024-06-25 21:37:38,392][15132] Fps is (10 sec: 42587.9, 60 sec: 43142.9, 300 sec: 42598.0). Total num frames: 15914237952. Throughput: 0: 42631.6. Samples: 15914307400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:38,392][15132] Avg episode reward: [(0, '0.897')] [2024-06-25 21:37:40,937][15401] Updated weights for policy 0, policy_version 971334 (0.0031) [2024-06-25 21:37:43,389][15132] Fps is (10 sec: 47513.2, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 15914467328. Throughput: 0: 42759.5. Samples: 15914570360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:43,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 21:37:43,481][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971343_15914483712.pth... [2024-06-25 21:37:43,541][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000970718_15904243712.pth [2024-06-25 21:37:44,343][15401] Updated weights for policy 0, policy_version 971344 (0.0023) [2024-06-25 21:37:48,396][15132] Fps is (10 sec: 40943.3, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 15914647552. Throughput: 0: 42712.5. Samples: 15914824220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:48,397][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 21:37:48,701][15401] Updated weights for policy 0, policy_version 971354 (0.0030) [2024-06-25 21:37:52,047][15401] Updated weights for policy 0, policy_version 971364 (0.0040) [2024-06-25 21:37:53,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42600.0, 300 sec: 42542.8). Total num frames: 15914860544. Throughput: 0: 42547.1. Samples: 15914943400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:53,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 21:37:56,298][15401] Updated weights for policy 0, policy_version 971374 (0.0037) [2024-06-25 21:37:58,389][15132] Fps is (10 sec: 42626.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15915073536. Throughput: 0: 42570.3. Samples: 15915197820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:37:58,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 21:38:00,427][15401] Updated weights for policy 0, policy_version 971384 (0.0032) [2024-06-25 21:38:03,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15915270144. Throughput: 0: 42422.7. Samples: 15915450120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 21:38:04,133][15401] Updated weights for policy 0, policy_version 971394 (0.0042) [2024-06-25 21:38:08,050][15401] Updated weights for policy 0, policy_version 971404 (0.0033) [2024-06-25 21:38:08,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 15915499520. Throughput: 0: 42481.7. Samples: 15915572520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:08,390][15132] Avg episode reward: [(0, '0.606')] [2024-06-25 21:38:12,093][15401] Updated weights for policy 0, policy_version 971414 (0.0043) [2024-06-25 21:38:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15915712512. Throughput: 0: 42389.2. Samples: 15915829240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:13,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 21:38:15,755][15401] Updated weights for policy 0, policy_version 971424 (0.0039) [2024-06-25 21:38:18,396][15132] Fps is (10 sec: 40933.9, 60 sec: 42320.9, 300 sec: 42541.9). Total num frames: 15915909120. Throughput: 0: 42129.5. Samples: 15916078860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:18,396][15132] Avg episode reward: [(0, '0.471')] [2024-06-25 21:38:19,845][15401] Updated weights for policy 0, policy_version 971434 (0.0025) [2024-06-25 21:38:23,379][15401] Updated weights for policy 0, policy_version 971444 (0.0037) [2024-06-25 21:38:23,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 15916138496. Throughput: 0: 42275.6. Samples: 15916209700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:23,390][15132] Avg episode reward: [(0, '0.328')] [2024-06-25 21:38:27,458][15401] Updated weights for policy 0, policy_version 971454 (0.0038) [2024-06-25 21:38:28,390][15132] Fps is (10 sec: 44264.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15916351488. Throughput: 0: 42181.3. Samples: 15916468520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:28,390][15132] Avg episode reward: [(0, '0.392')] [2024-06-25 21:38:31,384][15401] Updated weights for policy 0, policy_version 971464 (0.0043) [2024-06-25 21:38:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15916548096. Throughput: 0: 42095.5. Samples: 15916718240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 21:38:35,559][15401] Updated weights for policy 0, policy_version 971474 (0.0030) [2024-06-25 21:38:38,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 15916761088. Throughput: 0: 42225.9. Samples: 15916843560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:38,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 21:38:39,105][15401] Updated weights for policy 0, policy_version 971484 (0.0035) [2024-06-25 21:38:43,180][15401] Updated weights for policy 0, policy_version 971494 (0.0044) [2024-06-25 21:38:43,390][15132] Fps is (10 sec: 42597.4, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 15916974080. Throughput: 0: 42219.9. Samples: 15917097720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:43,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 21:38:46,892][15401] Updated weights for policy 0, policy_version 971504 (0.0044) [2024-06-25 21:38:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42056.8, 300 sec: 42542.9). Total num frames: 15917170688. Throughput: 0: 42321.4. Samples: 15917354580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:48,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 21:38:50,779][15401] Updated weights for policy 0, policy_version 971514 (0.0029) [2024-06-25 21:38:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15917416448. Throughput: 0: 42308.0. Samples: 15917476380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:53,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 21:38:54,685][15401] Updated weights for policy 0, policy_version 971524 (0.0036) [2024-06-25 21:38:58,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 15917596672. Throughput: 0: 42241.3. Samples: 15917730100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:38:58,390][15132] Avg episode reward: [(0, '0.808')] [2024-06-25 21:38:58,445][15401] Updated weights for policy 0, policy_version 971534 (0.0032) [2024-06-25 21:39:00,257][15349] Signal inference workers to stop experience collection... (235500 times) [2024-06-25 21:39:00,257][15349] Signal inference workers to resume experience collection... (235500 times) [2024-06-25 21:39:00,291][15401] InferenceWorker_p0-w0: stopping experience collection (235500 times) [2024-06-25 21:39:00,291][15401] InferenceWorker_p0-w0: resuming experience collection (235500 times) [2024-06-25 21:39:02,275][15401] Updated weights for policy 0, policy_version 971544 (0.0039) [2024-06-25 21:39:03,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 15917809664. Throughput: 0: 42347.8. Samples: 15917984240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 21:39:06,099][15401] Updated weights for policy 0, policy_version 971554 (0.0034) [2024-06-25 21:39:08,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15918055424. Throughput: 0: 42200.9. Samples: 15918108740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:08,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 21:39:10,248][15401] Updated weights for policy 0, policy_version 971564 (0.0032) [2024-06-25 21:39:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15918235648. Throughput: 0: 42223.2. Samples: 15918368560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:13,390][15132] Avg episode reward: [(0, '0.674')] [2024-06-25 21:39:13,850][15401] Updated weights for policy 0, policy_version 971574 (0.0032) [2024-06-25 21:39:18,320][15401] Updated weights for policy 0, policy_version 971584 (0.0041) [2024-06-25 21:39:18,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42056.7, 300 sec: 42431.8). Total num frames: 15918432256. Throughput: 0: 42372.3. Samples: 15918625000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:18,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 21:39:21,337][15401] Updated weights for policy 0, policy_version 971594 (0.0032) [2024-06-25 21:39:23,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15918694400. Throughput: 0: 42295.5. Samples: 15918746860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:23,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 21:39:25,977][15401] Updated weights for policy 0, policy_version 971604 (0.0043) [2024-06-25 21:39:28,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 15918874624. Throughput: 0: 42430.8. Samples: 15919007100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:28,390][15132] Avg episode reward: [(0, '0.658')] [2024-06-25 21:39:29,163][15401] Updated weights for policy 0, policy_version 971614 (0.0028) [2024-06-25 21:39:33,392][15132] Fps is (10 sec: 37674.4, 60 sec: 42050.5, 300 sec: 42487.0). Total num frames: 15919071232. Throughput: 0: 42391.0. Samples: 15919262280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:33,392][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 21:39:33,518][15401] Updated weights for policy 0, policy_version 971624 (0.0033) [2024-06-25 21:39:36,697][15401] Updated weights for policy 0, policy_version 971634 (0.0035) [2024-06-25 21:39:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15919316992. Throughput: 0: 42410.6. Samples: 15919384860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:38,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 21:39:40,999][15401] Updated weights for policy 0, policy_version 971644 (0.0034) [2024-06-25 21:39:43,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 15919497216. Throughput: 0: 42593.4. Samples: 15919646800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:43,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:39:43,406][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971650_15919513600.pth... [2024-06-25 21:39:43,476][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971028_15909322752.pth [2024-06-25 21:39:44,443][15401] Updated weights for policy 0, policy_version 971654 (0.0040) [2024-06-25 21:39:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15919726592. Throughput: 0: 42486.7. Samples: 15919896140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:48,390][15132] Avg episode reward: [(0, '0.142')] [2024-06-25 21:39:48,910][15401] Updated weights for policy 0, policy_version 971664 (0.0033) [2024-06-25 21:39:51,941][15401] Updated weights for policy 0, policy_version 971674 (0.0033) [2024-06-25 21:39:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15919955968. Throughput: 0: 42683.5. Samples: 15920029500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:53,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 21:39:56,640][15401] Updated weights for policy 0, policy_version 971684 (0.0039) [2024-06-25 21:39:58,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 15920136192. Throughput: 0: 42568.9. Samples: 15920284160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:39:58,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 21:39:59,543][15401] Updated weights for policy 0, policy_version 971694 (0.0024) [2024-06-25 21:40:03,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15920381952. Throughput: 0: 42434.2. Samples: 15920534540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:03,393][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 21:40:04,115][15401] Updated weights for policy 0, policy_version 971704 (0.0030) [2024-06-25 21:40:07,168][15401] Updated weights for policy 0, policy_version 971714 (0.0035) [2024-06-25 21:40:08,390][15132] Fps is (10 sec: 45874.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 15920594944. Throughput: 0: 42652.3. Samples: 15920666220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:08,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 21:40:11,703][15401] Updated weights for policy 0, policy_version 971724 (0.0034) [2024-06-25 21:40:13,360][15349] Signal inference workers to stop experience collection... (235550 times) [2024-06-25 21:40:13,360][15349] Signal inference workers to resume experience collection... (235550 times) [2024-06-25 21:40:13,378][15401] InferenceWorker_p0-w0: stopping experience collection (235550 times) [2024-06-25 21:40:13,378][15401] InferenceWorker_p0-w0: resuming experience collection (235550 times) [2024-06-25 21:40:13,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 15920775168. Throughput: 0: 42539.2. Samples: 15920921360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:13,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 21:40:14,911][15401] Updated weights for policy 0, policy_version 971734 (0.0038) [2024-06-25 21:40:18,390][15132] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 15921004544. Throughput: 0: 42442.7. Samples: 15921172100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:18,390][15132] Avg episode reward: [(0, '0.334')] [2024-06-25 21:40:19,230][15401] Updated weights for policy 0, policy_version 971744 (0.0030) [2024-06-25 21:40:22,767][15401] Updated weights for policy 0, policy_version 971754 (0.0039) [2024-06-25 21:40:23,390][15132] Fps is (10 sec: 47512.4, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 15921250304. Throughput: 0: 42675.5. Samples: 15921305260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:23,390][15132] Avg episode reward: [(0, '0.774')] [2024-06-25 21:40:26,971][15401] Updated weights for policy 0, policy_version 971764 (0.0031) [2024-06-25 21:40:28,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 15921414144. Throughput: 0: 42337.3. Samples: 15921551980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:28,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 21:40:30,547][15401] Updated weights for policy 0, policy_version 971774 (0.0038) [2024-06-25 21:40:33,391][15132] Fps is (10 sec: 40954.3, 60 sec: 43145.2, 300 sec: 42542.6). Total num frames: 15921659904. Throughput: 0: 42471.9. Samples: 15921807440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:33,391][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 21:40:34,447][15401] Updated weights for policy 0, policy_version 971784 (0.0034) [2024-06-25 21:40:38,335][15401] Updated weights for policy 0, policy_version 971794 (0.0032) [2024-06-25 21:40:38,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 15921872896. Throughput: 0: 42517.8. Samples: 15921942800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:38,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 21:40:41,989][15401] Updated weights for policy 0, policy_version 971804 (0.0036) [2024-06-25 21:40:43,390][15132] Fps is (10 sec: 40965.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 15922069504. Throughput: 0: 42416.7. Samples: 15922192920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:43,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 21:40:46,142][15401] Updated weights for policy 0, policy_version 971814 (0.0045) [2024-06-25 21:40:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 15922298880. Throughput: 0: 42630.3. Samples: 15922452900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:48,390][15132] Avg episode reward: [(0, '0.838')] [2024-06-25 21:40:49,466][15401] Updated weights for policy 0, policy_version 971824 (0.0031) [2024-06-25 21:40:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 15922511872. Throughput: 0: 42823.7. Samples: 15922593280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:53,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 21:40:53,736][15401] Updated weights for policy 0, policy_version 971834 (0.0037) [2024-06-25 21:40:57,062][15401] Updated weights for policy 0, policy_version 971844 (0.0034) [2024-06-25 21:40:58,392][15132] Fps is (10 sec: 40951.3, 60 sec: 42870.0, 300 sec: 42431.5). Total num frames: 15922708480. Throughput: 0: 42587.7. Samples: 15922837900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:40:58,392][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 21:41:01,382][15401] Updated weights for policy 0, policy_version 971854 (0.0034) [2024-06-25 21:41:03,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 15922954240. Throughput: 0: 42829.3. Samples: 15923099420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:41:03,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 21:41:04,831][15401] Updated weights for policy 0, policy_version 971864 (0.0039) [2024-06-25 21:41:08,389][15132] Fps is (10 sec: 44246.2, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 15923150848. Throughput: 0: 42729.5. Samples: 15923228080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:41:08,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 21:41:08,962][15401] Updated weights for policy 0, policy_version 971874 (0.0051) [2024-06-25 21:41:12,527][15401] Updated weights for policy 0, policy_version 971884 (0.0042) [2024-06-25 21:41:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 15923363840. Throughput: 0: 42855.8. Samples: 15923480500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:41:13,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 21:41:16,637][15401] Updated weights for policy 0, policy_version 971894 (0.0024) [2024-06-25 21:41:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42543.6). Total num frames: 15923576832. Throughput: 0: 42992.5. Samples: 15923742040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:41:18,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 21:41:20,115][15401] Updated weights for policy 0, policy_version 971904 (0.0027) [2024-06-25 21:41:22,980][15349] Signal inference workers to stop experience collection... (235600 times) [2024-06-25 21:41:22,983][15349] Signal inference workers to resume experience collection... (235600 times) [2024-06-25 21:41:23,005][15401] InferenceWorker_p0-w0: stopping experience collection (235600 times) [2024-06-25 21:41:23,005][15401] InferenceWorker_p0-w0: resuming experience collection (235600 times) [2024-06-25 21:41:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42488.2). Total num frames: 15923789824. Throughput: 0: 42915.5. Samples: 15923874000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-25 21:41:23,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 21:41:24,100][15401] Updated weights for policy 0, policy_version 971914 (0.0031) [2024-06-25 21:41:28,175][15401] Updated weights for policy 0, policy_version 971924 (0.0033) [2024-06-25 21:41:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 15924002816. Throughput: 0: 42899.2. Samples: 15924123380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:28,390][15132] Avg episode reward: [(0, '0.341')] [2024-06-25 21:41:32,505][15401] Updated weights for policy 0, policy_version 971934 (0.0039) [2024-06-25 21:41:33,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42599.4, 300 sec: 42598.4). Total num frames: 15924215808. Throughput: 0: 42966.5. Samples: 15924386400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:33,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 21:41:35,685][15401] Updated weights for policy 0, policy_version 971944 (0.0032) [2024-06-25 21:41:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 15924428800. Throughput: 0: 42703.7. Samples: 15924514940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:38,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 21:41:39,823][15401] Updated weights for policy 0, policy_version 971954 (0.0028) [2024-06-25 21:41:43,315][15401] Updated weights for policy 0, policy_version 971964 (0.0036) [2024-06-25 21:41:43,390][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 15924658176. Throughput: 0: 42894.4. Samples: 15924768060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:43,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 21:41:43,405][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971964_15924658176.pth... [2024-06-25 21:41:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971343_15914483712.pth [2024-06-25 21:41:47,445][15401] Updated weights for policy 0, policy_version 971974 (0.0055) [2024-06-25 21:41:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 15924838400. Throughput: 0: 42771.7. Samples: 15925024140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:48,390][15132] Avg episode reward: [(0, '0.864')] [2024-06-25 21:41:51,080][15401] Updated weights for policy 0, policy_version 971984 (0.0043) [2024-06-25 21:41:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 15925084160. Throughput: 0: 42653.2. Samples: 15925147480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:53,390][15132] Avg episode reward: [(0, '0.624')] [2024-06-25 21:41:55,519][15401] Updated weights for policy 0, policy_version 971994 (0.0029) [2024-06-25 21:41:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 15925280768. Throughput: 0: 42818.8. Samples: 15925407340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:41:58,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 21:41:58,590][15401] Updated weights for policy 0, policy_version 972004 (0.0030) [2024-06-25 21:42:03,109][15401] Updated weights for policy 0, policy_version 972014 (0.0033) [2024-06-25 21:42:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 15925493760. Throughput: 0: 42783.5. Samples: 15925667300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:03,390][15132] Avg episode reward: [(0, '0.265')] [2024-06-25 21:42:06,410][15401] Updated weights for policy 0, policy_version 972024 (0.0036) [2024-06-25 21:42:08,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 15925706752. Throughput: 0: 42576.0. Samples: 15925789920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:08,390][15132] Avg episode reward: [(0, '0.285')] [2024-06-25 21:42:10,701][15401] Updated weights for policy 0, policy_version 972034 (0.0037) [2024-06-25 21:42:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15925936128. Throughput: 0: 42775.0. Samples: 15926048260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:13,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 21:42:14,713][15401] Updated weights for policy 0, policy_version 972044 (0.0048) [2024-06-25 21:42:18,171][15401] Updated weights for policy 0, policy_version 972054 (0.0033) [2024-06-25 21:42:18,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15926149120. Throughput: 0: 42613.5. Samples: 15926304000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:18,390][15132] Avg episode reward: [(0, '0.433')] [2024-06-25 21:42:22,285][15401] Updated weights for policy 0, policy_version 972064 (0.0032) [2024-06-25 21:42:23,390][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 15926362112. Throughput: 0: 42603.5. Samples: 15926432100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:23,390][15132] Avg episode reward: [(0, '0.350')] [2024-06-25 21:42:25,834][15401] Updated weights for policy 0, policy_version 972074 (0.0038) [2024-06-25 21:42:28,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15926575104. Throughput: 0: 42697.0. Samples: 15926689420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:28,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 21:42:29,934][15401] Updated weights for policy 0, policy_version 972084 (0.0025) [2024-06-25 21:42:33,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42487.7). Total num frames: 15926771712. Throughput: 0: 42687.2. Samples: 15926945060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:33,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 21:42:33,422][15401] Updated weights for policy 0, policy_version 972094 (0.0034) [2024-06-25 21:42:37,509][15401] Updated weights for policy 0, policy_version 972104 (0.0037) [2024-06-25 21:42:38,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 15926984704. Throughput: 0: 42787.2. Samples: 15927072900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:38,390][15132] Avg episode reward: [(0, '0.349')] [2024-06-25 21:42:40,942][15401] Updated weights for policy 0, policy_version 972114 (0.0031) [2024-06-25 21:42:43,390][15132] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 15927230464. Throughput: 0: 42809.2. Samples: 15927333760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:43,390][15132] Avg episode reward: [(0, '0.657')] [2024-06-25 21:42:44,978][15401] Updated weights for policy 0, policy_version 972124 (0.0036) [2024-06-25 21:42:48,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 15927427072. Throughput: 0: 42716.5. Samples: 15927589540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:48,390][15132] Avg episode reward: [(0, '0.603')] [2024-06-25 21:42:48,955][15401] Updated weights for policy 0, policy_version 972134 (0.0032) [2024-06-25 21:42:52,955][15401] Updated weights for policy 0, policy_version 972144 (0.0035) [2024-06-25 21:42:53,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 15927623680. Throughput: 0: 42717.7. Samples: 15927712220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:53,390][15132] Avg episode reward: [(0, '0.430')] [2024-06-25 21:42:56,533][15401] Updated weights for policy 0, policy_version 972154 (0.0028) [2024-06-25 21:42:57,544][15349] Signal inference workers to stop experience collection... (235650 times) [2024-06-25 21:42:57,594][15401] InferenceWorker_p0-w0: stopping experience collection (235650 times) [2024-06-25 21:42:57,662][15349] Signal inference workers to resume experience collection... (235650 times) [2024-06-25 21:42:57,663][15401] InferenceWorker_p0-w0: resuming experience collection (235650 times) [2024-06-25 21:42:58,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15927869440. Throughput: 0: 42823.2. Samples: 15927975300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:42:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 21:43:00,413][15401] Updated weights for policy 0, policy_version 972164 (0.0029) [2024-06-25 21:43:03,392][15132] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 15928066048. Throughput: 0: 42814.1. Samples: 15928230740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:43:03,392][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 21:43:04,097][15401] Updated weights for policy 0, policy_version 972174 (0.0031) [2024-06-25 21:43:07,829][15401] Updated weights for policy 0, policy_version 972184 (0.0046) [2024-06-25 21:43:08,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 15928279040. Throughput: 0: 42810.6. Samples: 15928358580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:43:08,390][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 21:43:11,600][15401] Updated weights for policy 0, policy_version 972194 (0.0033) [2024-06-25 21:43:13,390][15132] Fps is (10 sec: 42608.4, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 15928492032. Throughput: 0: 42901.2. Samples: 15928619980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:43:13,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 21:43:15,236][15401] Updated weights for policy 0, policy_version 972204 (0.0026) [2024-06-25 21:43:18,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15928705024. Throughput: 0: 42844.0. Samples: 15928873040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-25 21:43:18,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 21:43:19,211][15401] Updated weights for policy 0, policy_version 972214 (0.0033) [2024-06-25 21:43:22,786][15401] Updated weights for policy 0, policy_version 972224 (0.0029) [2024-06-25 21:43:23,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15928934400. Throughput: 0: 42836.8. Samples: 15929000560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:23,390][15132] Avg episode reward: [(0, '0.446')] [2024-06-25 21:43:26,976][15401] Updated weights for policy 0, policy_version 972234 (0.0035) [2024-06-25 21:43:28,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15929131008. Throughput: 0: 42775.7. Samples: 15929258660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:28,390][15132] Avg episode reward: [(0, '0.547')] [2024-06-25 21:43:30,274][15401] Updated weights for policy 0, policy_version 972244 (0.0043) [2024-06-25 21:43:33,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15929327616. Throughput: 0: 42713.8. Samples: 15929511660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:33,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 21:43:35,030][15401] Updated weights for policy 0, policy_version 972254 (0.0037) [2024-06-25 21:43:38,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15929556992. Throughput: 0: 42702.2. Samples: 15929633820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:38,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 21:43:38,547][15401] Updated weights for policy 0, policy_version 972264 (0.0035) [2024-06-25 21:43:42,706][15401] Updated weights for policy 0, policy_version 972274 (0.0050) [2024-06-25 21:43:43,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15929769984. Throughput: 0: 42519.7. Samples: 15929888680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:43,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 21:43:43,460][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972277_15929786368.pth... [2024-06-25 21:43:43,513][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971650_15919513600.pth [2024-06-25 21:43:46,087][15401] Updated weights for policy 0, policy_version 972284 (0.0033) [2024-06-25 21:43:48,389][15132] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 15929966592. Throughput: 0: 42465.4. Samples: 15930141580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:48,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 21:43:50,195][15401] Updated weights for policy 0, policy_version 972294 (0.0024) [2024-06-25 21:43:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15930195968. Throughput: 0: 42394.4. Samples: 15930266320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 21:43:53,934][15401] Updated weights for policy 0, policy_version 972304 (0.0041) [2024-06-25 21:43:57,683][15401] Updated weights for policy 0, policy_version 972314 (0.0038) [2024-06-25 21:43:58,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15930408960. Throughput: 0: 42403.1. Samples: 15930528120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:43:58,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 21:44:01,580][15401] Updated weights for policy 0, policy_version 972324 (0.0036) [2024-06-25 21:44:03,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 15930605568. Throughput: 0: 42392.8. Samples: 15930780720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:03,390][15132] Avg episode reward: [(0, '0.405')] [2024-06-25 21:44:05,257][15401] Updated weights for policy 0, policy_version 972334 (0.0047) [2024-06-25 21:44:08,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15930818560. Throughput: 0: 42387.6. Samples: 15930908000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:08,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 21:44:09,343][15401] Updated weights for policy 0, policy_version 972344 (0.0037) [2024-06-25 21:44:13,167][15401] Updated weights for policy 0, policy_version 972354 (0.0038) [2024-06-25 21:44:13,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15931047936. Throughput: 0: 42440.0. Samples: 15931168460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:13,390][15132] Avg episode reward: [(0, '0.470')] [2024-06-25 21:44:16,938][15401] Updated weights for policy 0, policy_version 972364 (0.0034) [2024-06-25 21:44:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 15931244544. Throughput: 0: 42379.6. Samples: 15931418740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:18,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 21:44:21,046][15401] Updated weights for policy 0, policy_version 972374 (0.0038) [2024-06-25 21:44:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 15931457536. Throughput: 0: 42391.2. Samples: 15931541420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:23,390][15132] Avg episode reward: [(0, '0.299')] [2024-06-25 21:44:23,654][15349] Signal inference workers to stop experience collection... (235700 times) [2024-06-25 21:44:23,655][15349] Signal inference workers to resume experience collection... (235700 times) [2024-06-25 21:44:23,697][15401] InferenceWorker_p0-w0: stopping experience collection (235700 times) [2024-06-25 21:44:23,697][15401] InferenceWorker_p0-w0: resuming experience collection (235700 times) [2024-06-25 21:44:24,483][15401] Updated weights for policy 0, policy_version 972384 (0.0028) [2024-06-25 21:44:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 15931670528. Throughput: 0: 42425.3. Samples: 15931797820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:28,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 21:44:28,847][15401] Updated weights for policy 0, policy_version 972394 (0.0031) [2024-06-25 21:44:32,056][15401] Updated weights for policy 0, policy_version 972404 (0.0031) [2024-06-25 21:44:33,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15931883520. Throughput: 0: 42434.2. Samples: 15932051120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:33,390][15132] Avg episode reward: [(0, '0.811')] [2024-06-25 21:44:36,615][15401] Updated weights for policy 0, policy_version 972414 (0.0039) [2024-06-25 21:44:38,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15932096512. Throughput: 0: 42565.2. Samples: 15932181760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:38,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 21:44:39,930][15401] Updated weights for policy 0, policy_version 972424 (0.0025) [2024-06-25 21:44:43,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15932309504. Throughput: 0: 42523.3. Samples: 15932441660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:43,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 21:44:44,025][15401] Updated weights for policy 0, policy_version 972434 (0.0028) [2024-06-25 21:44:47,585][15401] Updated weights for policy 0, policy_version 972444 (0.0028) [2024-06-25 21:44:48,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15932538880. Throughput: 0: 42529.3. Samples: 15932694540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:48,390][15132] Avg episode reward: [(0, '0.601')] [2024-06-25 21:44:51,950][15401] Updated weights for policy 0, policy_version 972454 (0.0030) [2024-06-25 21:44:53,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15932735488. Throughput: 0: 42613.4. Samples: 15932825600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:53,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 21:44:55,352][15401] Updated weights for policy 0, policy_version 972464 (0.0031) [2024-06-25 21:44:58,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15932964864. Throughput: 0: 42482.7. Samples: 15933080180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:44:58,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-25 21:44:59,431][15401] Updated weights for policy 0, policy_version 972474 (0.0036) [2024-06-25 21:45:03,234][15401] Updated weights for policy 0, policy_version 972484 (0.0026) [2024-06-25 21:45:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15933194240. Throughput: 0: 42640.8. Samples: 15933337580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:45:03,390][15132] Avg episode reward: [(0, '0.260')] [2024-06-25 21:45:06,874][15401] Updated weights for policy 0, policy_version 972494 (0.0041) [2024-06-25 21:45:08,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15933374464. Throughput: 0: 42745.4. Samples: 15933464960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:45:08,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 21:45:10,688][15401] Updated weights for policy 0, policy_version 972504 (0.0038) [2024-06-25 21:45:13,396][15132] Fps is (10 sec: 40934.3, 60 sec: 42593.8, 300 sec: 42708.6). Total num frames: 15933603840. Throughput: 0: 42809.8. Samples: 15933724540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:45:13,396][15132] Avg episode reward: [(0, '0.759')] [2024-06-25 21:45:14,525][15401] Updated weights for policy 0, policy_version 972514 (0.0036) [2024-06-25 21:45:18,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 15933816832. Throughput: 0: 42774.2. Samples: 15933975960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-25 21:45:18,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 21:45:18,444][15401] Updated weights for policy 0, policy_version 972524 (0.0040) [2024-06-25 21:45:22,386][15401] Updated weights for policy 0, policy_version 972534 (0.0041) [2024-06-25 21:45:23,392][15132] Fps is (10 sec: 40976.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 15934013440. Throughput: 0: 42723.5. Samples: 15934104420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:23,392][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 21:45:26,437][15401] Updated weights for policy 0, policy_version 972544 (0.0034) [2024-06-25 21:45:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 15934226432. Throughput: 0: 42542.7. Samples: 15934356080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:28,390][15132] Avg episode reward: [(0, '0.523')] [2024-06-25 21:45:30,212][15401] Updated weights for policy 0, policy_version 972554 (0.0023) [2024-06-25 21:45:33,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15934439424. Throughput: 0: 42784.5. Samples: 15934619840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:33,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 21:45:33,897][15401] Updated weights for policy 0, policy_version 972564 (0.0032) [2024-06-25 21:45:37,157][15349] Signal inference workers to stop experience collection... (235750 times) [2024-06-25 21:45:37,157][15349] Signal inference workers to resume experience collection... (235750 times) [2024-06-25 21:45:37,196][15401] InferenceWorker_p0-w0: stopping experience collection (235750 times) [2024-06-25 21:45:37,196][15401] InferenceWorker_p0-w0: resuming experience collection (235750 times) [2024-06-25 21:45:37,826][15401] Updated weights for policy 0, policy_version 972574 (0.0033) [2024-06-25 21:45:38,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15934668800. Throughput: 0: 42805.6. Samples: 15934751860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:38,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 21:45:41,401][15401] Updated weights for policy 0, policy_version 972584 (0.0038) [2024-06-25 21:45:43,389][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15934881792. Throughput: 0: 42829.7. Samples: 15935007520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:43,390][15132] Avg episode reward: [(0, '0.506')] [2024-06-25 21:45:43,403][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972588_15934881792.pth... [2024-06-25 21:45:43,448][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000971964_15924658176.pth [2024-06-25 21:45:45,325][15401] Updated weights for policy 0, policy_version 972594 (0.0035) [2024-06-25 21:45:48,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15935094784. Throughput: 0: 42863.2. Samples: 15935266420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:48,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 21:45:48,922][15401] Updated weights for policy 0, policy_version 972604 (0.0026) [2024-06-25 21:45:53,038][15401] Updated weights for policy 0, policy_version 972614 (0.0037) [2024-06-25 21:45:53,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 15935324160. Throughput: 0: 42909.4. Samples: 15935395880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:53,390][15132] Avg episode reward: [(0, '0.568')] [2024-06-25 21:45:56,522][15401] Updated weights for policy 0, policy_version 972624 (0.0034) [2024-06-25 21:45:58,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 15935537152. Throughput: 0: 42678.5. Samples: 15935644800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:45:58,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 21:46:00,673][15401] Updated weights for policy 0, policy_version 972634 (0.0029) [2024-06-25 21:46:03,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15935733760. Throughput: 0: 42912.0. Samples: 15935907000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:03,390][15132] Avg episode reward: [(0, '0.534')] [2024-06-25 21:46:04,399][15401] Updated weights for policy 0, policy_version 972644 (0.0043) [2024-06-25 21:46:08,207][15401] Updated weights for policy 0, policy_version 972654 (0.0026) [2024-06-25 21:46:08,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 15935963136. Throughput: 0: 42829.8. Samples: 15936031660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:08,390][15132] Avg episode reward: [(0, '0.448')] [2024-06-25 21:46:11,958][15401] Updated weights for policy 0, policy_version 972664 (0.0035) [2024-06-25 21:46:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42876.1, 300 sec: 42709.5). Total num frames: 15936176128. Throughput: 0: 42929.8. Samples: 15936287920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:13,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 21:46:15,717][15401] Updated weights for policy 0, policy_version 972674 (0.0031) [2024-06-25 21:46:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15936372736. Throughput: 0: 42975.0. Samples: 15936553720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:18,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 21:46:19,367][15401] Updated weights for policy 0, policy_version 972684 (0.0030) [2024-06-25 21:46:23,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43146.3, 300 sec: 42709.5). Total num frames: 15936602112. Throughput: 0: 42841.5. Samples: 15936679720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:23,390][15132] Avg episode reward: [(0, '0.418')] [2024-06-25 21:46:23,443][15401] Updated weights for policy 0, policy_version 972694 (0.0024) [2024-06-25 21:46:27,077][15401] Updated weights for policy 0, policy_version 972704 (0.0034) [2024-06-25 21:46:28,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 15936831488. Throughput: 0: 42741.3. Samples: 15936930880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:28,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 21:46:31,566][15401] Updated weights for policy 0, policy_version 972714 (0.0035) [2024-06-25 21:46:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15937011712. Throughput: 0: 42856.4. Samples: 15937194960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:33,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 21:46:34,674][15401] Updated weights for policy 0, policy_version 972724 (0.0044) [2024-06-25 21:46:38,389][15132] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 15937257472. Throughput: 0: 42656.9. Samples: 15937315440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:38,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 21:46:39,403][15401] Updated weights for policy 0, policy_version 972734 (0.0028) [2024-06-25 21:46:42,362][15401] Updated weights for policy 0, policy_version 972744 (0.0031) [2024-06-25 21:46:43,390][15132] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 15937486848. Throughput: 0: 42868.8. Samples: 15937573900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:43,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:46:44,788][15349] Signal inference workers to stop experience collection... (235800 times) [2024-06-25 21:46:44,794][15349] Signal inference workers to resume experience collection... (235800 times) [2024-06-25 21:46:44,816][15401] InferenceWorker_p0-w0: stopping experience collection (235800 times) [2024-06-25 21:46:44,816][15401] InferenceWorker_p0-w0: resuming experience collection (235800 times) [2024-06-25 21:46:47,404][15401] Updated weights for policy 0, policy_version 972754 (0.0028) [2024-06-25 21:46:48,389][15132] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 15937650688. Throughput: 0: 42885.8. Samples: 15937836860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:48,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 21:46:50,302][15401] Updated weights for policy 0, policy_version 972764 (0.0034) [2024-06-25 21:46:53,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15937896448. Throughput: 0: 42634.8. Samples: 15937950220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:53,390][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 21:46:54,934][15401] Updated weights for policy 0, policy_version 972774 (0.0030) [2024-06-25 21:46:58,088][15401] Updated weights for policy 0, policy_version 972784 (0.0034) [2024-06-25 21:46:58,389][15132] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15938125824. Throughput: 0: 42858.2. Samples: 15938216540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:46:58,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 21:47:02,456][15401] Updated weights for policy 0, policy_version 972794 (0.0031) [2024-06-25 21:47:03,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15938289664. Throughput: 0: 42679.6. Samples: 15938474300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:47:03,390][15132] Avg episode reward: [(0, '0.465')] [2024-06-25 21:47:05,643][15401] Updated weights for policy 0, policy_version 972804 (0.0025) [2024-06-25 21:47:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15938535424. Throughput: 0: 42427.0. Samples: 15938588940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:47:08,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 21:47:10,425][15401] Updated weights for policy 0, policy_version 972814 (0.0030) [2024-06-25 21:47:13,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15938732032. Throughput: 0: 42728.0. Samples: 15938853640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:47:13,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 21:47:13,471][15401] Updated weights for policy 0, policy_version 972824 (0.0033) [2024-06-25 21:47:17,934][15401] Updated weights for policy 0, policy_version 972834 (0.0042) [2024-06-25 21:47:18,390][15132] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 15938928640. Throughput: 0: 42403.2. Samples: 15939103100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-25 21:47:18,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 21:47:21,303][15401] Updated weights for policy 0, policy_version 972844 (0.0034) [2024-06-25 21:47:23,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15939174400. Throughput: 0: 42555.9. Samples: 15939230460. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:23,390][15132] Avg episode reward: [(0, '0.621')] [2024-06-25 21:47:26,142][15401] Updated weights for policy 0, policy_version 972854 (0.0025) [2024-06-25 21:47:28,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15939371008. Throughput: 0: 42501.4. Samples: 15939486460. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:28,390][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 21:47:29,141][15401] Updated weights for policy 0, policy_version 972864 (0.0036) [2024-06-25 21:47:33,389][15132] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15939551232. Throughput: 0: 42432.9. Samples: 15939746340. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:33,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 21:47:33,630][15401] Updated weights for policy 0, policy_version 972874 (0.0028) [2024-06-25 21:47:36,833][15401] Updated weights for policy 0, policy_version 972884 (0.0038) [2024-06-25 21:47:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15939796992. Throughput: 0: 42474.6. Samples: 15939861580. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:38,391][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 21:47:41,269][15401] Updated weights for policy 0, policy_version 972894 (0.0048) [2024-06-25 21:47:43,390][15132] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 15939993600. Throughput: 0: 42440.4. Samples: 15940126360. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:43,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 21:47:43,556][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972901_15940009984.pth... [2024-06-25 21:47:43,638][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972277_15929786368.pth [2024-06-25 21:47:44,550][15401] Updated weights for policy 0, policy_version 972904 (0.0036) [2024-06-25 21:47:48,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 15940173824. Throughput: 0: 42349.7. Samples: 15940380040. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:48,390][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 21:47:49,320][15401] Updated weights for policy 0, policy_version 972914 (0.0040) [2024-06-25 21:47:52,022][15401] Updated weights for policy 0, policy_version 972924 (0.0025) [2024-06-25 21:47:53,389][15132] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15940452352. Throughput: 0: 42543.7. Samples: 15940503400. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:53,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 21:47:56,976][15401] Updated weights for policy 0, policy_version 972934 (0.0026) [2024-06-25 21:47:57,651][15349] Signal inference workers to stop experience collection... (235850 times) [2024-06-25 21:47:57,651][15349] Signal inference workers to resume experience collection... (235850 times) [2024-06-25 21:47:57,702][15401] InferenceWorker_p0-w0: stopping experience collection (235850 times) [2024-06-25 21:47:57,703][15401] InferenceWorker_p0-w0: resuming experience collection (235850 times) [2024-06-25 21:47:58,389][15132] Fps is (10 sec: 45875.8, 60 sec: 41779.2, 300 sec: 42598.8). Total num frames: 15940632576. Throughput: 0: 42505.9. Samples: 15940766400. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:47:58,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 21:47:59,573][15401] Updated weights for policy 0, policy_version 972944 (0.0042) [2024-06-25 21:48:03,390][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 15940829184. Throughput: 0: 42600.5. Samples: 15941020120. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:03,390][15132] Avg episode reward: [(0, '0.713')] [2024-06-25 21:48:04,604][15401] Updated weights for policy 0, policy_version 972954 (0.0031) [2024-06-25 21:48:07,108][15401] Updated weights for policy 0, policy_version 972964 (0.0028) [2024-06-25 21:48:08,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15941074944. Throughput: 0: 42622.6. Samples: 15941148480. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:08,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 21:48:12,159][15401] Updated weights for policy 0, policy_version 972974 (0.0042) [2024-06-25 21:48:13,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 15941271552. Throughput: 0: 42719.9. Samples: 15941408860. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:13,390][15132] Avg episode reward: [(0, '0.786')] [2024-06-25 21:48:15,075][15401] Updated weights for policy 0, policy_version 972984 (0.0031) [2024-06-25 21:48:18,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 15941484544. Throughput: 0: 42418.7. Samples: 15941655180. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:18,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 21:48:19,658][15401] Updated weights for policy 0, policy_version 972994 (0.0036) [2024-06-25 21:48:22,900][15401] Updated weights for policy 0, policy_version 973004 (0.0037) [2024-06-25 21:48:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15941713920. Throughput: 0: 42720.9. Samples: 15941784020. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 21:48:27,641][15401] Updated weights for policy 0, policy_version 973014 (0.0032) [2024-06-25 21:48:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15941894144. Throughput: 0: 42497.9. Samples: 15942038760. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 21:48:30,638][15401] Updated weights for policy 0, policy_version 973024 (0.0026) [2024-06-25 21:48:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 15942139904. Throughput: 0: 42512.5. Samples: 15942293100. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:33,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 21:48:35,043][15401] Updated weights for policy 0, policy_version 973034 (0.0042) [2024-06-25 21:48:38,317][15401] Updated weights for policy 0, policy_version 973044 (0.0028) [2024-06-25 21:48:38,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15942352896. Throughput: 0: 42685.3. Samples: 15942424240. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:38,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:48:42,518][15401] Updated weights for policy 0, policy_version 973054 (0.0037) [2024-06-25 21:48:43,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15942549504. Throughput: 0: 42653.7. Samples: 15942685820. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:43,390][15132] Avg episode reward: [(0, '0.240')] [2024-06-25 21:48:46,095][15401] Updated weights for policy 0, policy_version 973064 (0.0027) [2024-06-25 21:48:48,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 15942778880. Throughput: 0: 42634.3. Samples: 15942938660. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:48,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 21:48:50,335][15401] Updated weights for policy 0, policy_version 973074 (0.0030) [2024-06-25 21:48:53,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 15942991872. Throughput: 0: 42708.6. Samples: 15943070360. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 21:48:53,643][15401] Updated weights for policy 0, policy_version 973084 (0.0039) [2024-06-25 21:48:57,861][15401] Updated weights for policy 0, policy_version 973094 (0.0033) [2024-06-25 21:48:58,392][15132] Fps is (10 sec: 42587.7, 60 sec: 42869.7, 300 sec: 42709.1). Total num frames: 15943204864. Throughput: 0: 42737.8. Samples: 15943332160. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:48:58,393][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 21:49:01,139][15401] Updated weights for policy 0, policy_version 973104 (0.0026) [2024-06-25 21:49:03,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15943434240. Throughput: 0: 42866.7. Samples: 15943584180. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:49:03,390][15132] Avg episode reward: [(0, '0.600')] [2024-06-25 21:49:05,412][15401] Updated weights for policy 0, policy_version 973114 (0.0034) [2024-06-25 21:49:08,389][15132] Fps is (10 sec: 44247.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15943647232. Throughput: 0: 42876.9. Samples: 15943713480. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:49:08,390][15132] Avg episode reward: [(0, '0.889')] [2024-06-25 21:49:08,721][15401] Updated weights for policy 0, policy_version 973124 (0.0045) [2024-06-25 21:49:13,091][15401] Updated weights for policy 0, policy_version 973134 (0.0029) [2024-06-25 21:49:13,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15943843840. Throughput: 0: 42935.5. Samples: 15943970860. Policy #0 lag: (min: 2.0, avg: 8.4, max: 21.0) [2024-06-25 21:49:13,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 21:49:16,313][15401] Updated weights for policy 0, policy_version 973144 (0.0029) [2024-06-25 21:49:18,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15944056832. Throughput: 0: 42826.8. Samples: 15944220300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:18,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 21:49:20,515][15401] Updated weights for policy 0, policy_version 973154 (0.0033) [2024-06-25 21:49:23,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15944286208. Throughput: 0: 42865.9. Samples: 15944353200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:23,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 21:49:24,302][15401] Updated weights for policy 0, policy_version 973164 (0.0038) [2024-06-25 21:49:28,271][15401] Updated weights for policy 0, policy_version 973174 (0.0029) [2024-06-25 21:49:28,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 15944482816. Throughput: 0: 42954.3. Samples: 15944618760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:28,391][15132] Avg episode reward: [(0, '0.364')] [2024-06-25 21:49:31,588][15401] Updated weights for policy 0, policy_version 973184 (0.0042) [2024-06-25 21:49:33,392][15132] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 15944712192. Throughput: 0: 42923.4. Samples: 15944870320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:33,393][15132] Avg episode reward: [(0, '0.582')] [2024-06-25 21:49:36,030][15401] Updated weights for policy 0, policy_version 973194 (0.0036) [2024-06-25 21:49:36,946][15349] Signal inference workers to stop experience collection... (235900 times) [2024-06-25 21:49:36,946][15349] Signal inference workers to resume experience collection... (235900 times) [2024-06-25 21:49:36,959][15401] InferenceWorker_p0-w0: stopping experience collection (235900 times) [2024-06-25 21:49:36,959][15401] InferenceWorker_p0-w0: resuming experience collection (235900 times) [2024-06-25 21:49:38,392][15132] Fps is (10 sec: 45864.3, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 15944941568. Throughput: 0: 43042.5. Samples: 15945007380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:38,393][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 21:49:39,376][15401] Updated weights for policy 0, policy_version 973204 (0.0036) [2024-06-25 21:49:43,390][15132] Fps is (10 sec: 40968.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 15945121792. Throughput: 0: 43057.5. Samples: 15945269660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 21:49:43,415][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973213_15945121792.pth... [2024-06-25 21:49:43,492][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972588_15934881792.pth [2024-06-25 21:49:43,678][15401] Updated weights for policy 0, policy_version 973214 (0.0049) [2024-06-25 21:49:46,789][15401] Updated weights for policy 0, policy_version 973224 (0.0033) [2024-06-25 21:49:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15945367552. Throughput: 0: 43038.6. Samples: 15945520920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:48,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 21:49:51,132][15401] Updated weights for policy 0, policy_version 973234 (0.0024) [2024-06-25 21:49:53,393][15132] Fps is (10 sec: 45861.6, 60 sec: 43142.1, 300 sec: 42764.5). Total num frames: 15945580544. Throughput: 0: 43209.7. Samples: 15945658060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:53,393][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 21:49:54,441][15401] Updated weights for policy 0, policy_version 973244 (0.0029) [2024-06-25 21:49:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42873.2, 300 sec: 42654.0). Total num frames: 15945777152. Throughput: 0: 43086.3. Samples: 15945909740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:49:58,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 21:49:58,597][15401] Updated weights for policy 0, policy_version 973254 (0.0033) [2024-06-25 21:50:02,200][15401] Updated weights for policy 0, policy_version 973264 (0.0032) [2024-06-25 21:50:03,390][15132] Fps is (10 sec: 44251.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15946022912. Throughput: 0: 43211.4. Samples: 15946164820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:03,390][15132] Avg episode reward: [(0, '0.616')] [2024-06-25 21:50:06,159][15401] Updated weights for policy 0, policy_version 973274 (0.0042) [2024-06-25 21:50:08,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 15946219520. Throughput: 0: 43184.8. Samples: 15946296520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:08,390][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 21:50:09,829][15401] Updated weights for policy 0, policy_version 973284 (0.0043) [2024-06-25 21:50:13,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15946416128. Throughput: 0: 43026.8. Samples: 15946554960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:13,390][15132] Avg episode reward: [(0, '0.402')] [2024-06-25 21:50:13,845][15401] Updated weights for policy 0, policy_version 973294 (0.0045) [2024-06-25 21:50:17,458][15401] Updated weights for policy 0, policy_version 973304 (0.0033) [2024-06-25 21:50:18,389][15132] Fps is (10 sec: 44236.7, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 15946661888. Throughput: 0: 42995.6. Samples: 15946805020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:18,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 21:50:21,571][15401] Updated weights for policy 0, policy_version 973314 (0.0031) [2024-06-25 21:50:23,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15946858496. Throughput: 0: 42952.5. Samples: 15946940140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:23,390][15132] Avg episode reward: [(0, '0.307')] [2024-06-25 21:50:25,135][15401] Updated weights for policy 0, policy_version 973324 (0.0033) [2024-06-25 21:50:28,392][15132] Fps is (10 sec: 39312.0, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 15947055104. Throughput: 0: 42707.4. Samples: 15947191580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:28,393][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 21:50:29,126][15401] Updated weights for policy 0, policy_version 973334 (0.0040) [2024-06-25 21:50:32,844][15401] Updated weights for policy 0, policy_version 973344 (0.0025) [2024-06-25 21:50:33,389][15132] Fps is (10 sec: 44237.3, 60 sec: 43146.3, 300 sec: 42820.6). Total num frames: 15947300864. Throughput: 0: 42624.1. Samples: 15947439000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:33,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 21:50:36,835][15401] Updated weights for policy 0, policy_version 973354 (0.0033) [2024-06-25 21:50:38,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 15947497472. Throughput: 0: 42570.2. Samples: 15947573580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:38,399][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 21:50:40,604][15401] Updated weights for policy 0, policy_version 973364 (0.0035) [2024-06-25 21:50:43,390][15132] Fps is (10 sec: 39320.8, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 15947694080. Throughput: 0: 42583.9. Samples: 15947826020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:43,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 21:50:44,463][15401] Updated weights for policy 0, policy_version 973374 (0.0039) [2024-06-25 21:50:48,147][15401] Updated weights for policy 0, policy_version 973384 (0.0026) [2024-06-25 21:50:48,152][15349] Signal inference workers to stop experience collection... (235950 times) [2024-06-25 21:50:48,153][15349] Signal inference workers to resume experience collection... (235950 times) [2024-06-25 21:50:48,190][15401] InferenceWorker_p0-w0: stopping experience collection (235950 times) [2024-06-25 21:50:48,190][15401] InferenceWorker_p0-w0: resuming experience collection (235950 times) [2024-06-25 21:50:48,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15947939840. Throughput: 0: 42635.1. Samples: 15948083400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:48,390][15132] Avg episode reward: [(0, '0.720')] [2024-06-25 21:50:52,226][15401] Updated weights for policy 0, policy_version 973394 (0.0028) [2024-06-25 21:50:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42600.8, 300 sec: 42709.5). Total num frames: 15948136448. Throughput: 0: 42612.4. Samples: 15948214080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:53,390][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 21:50:55,650][15401] Updated weights for policy 0, policy_version 973404 (0.0033) [2024-06-25 21:50:58,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15948333056. Throughput: 0: 42569.6. Samples: 15948470600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:50:58,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 21:51:00,079][15401] Updated weights for policy 0, policy_version 973414 (0.0045) [2024-06-25 21:51:03,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 15948562432. Throughput: 0: 42605.8. Samples: 15948722280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:51:03,390][15132] Avg episode reward: [(0, '0.487')] [2024-06-25 21:51:03,622][15401] Updated weights for policy 0, policy_version 973424 (0.0034) [2024-06-25 21:51:07,650][15401] Updated weights for policy 0, policy_version 973434 (0.0038) [2024-06-25 21:51:08,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15948791808. Throughput: 0: 42552.8. Samples: 15948855020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:51:08,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 21:51:11,234][15401] Updated weights for policy 0, policy_version 973444 (0.0028) [2024-06-25 21:51:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15948972032. Throughput: 0: 42717.9. Samples: 15949113780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-25 21:51:13,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 21:51:15,126][15401] Updated weights for policy 0, policy_version 973454 (0.0032) [2024-06-25 21:51:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15949217792. Throughput: 0: 42814.5. Samples: 15949365660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:18,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 21:51:19,052][15401] Updated weights for policy 0, policy_version 973464 (0.0036) [2024-06-25 21:51:22,799][15401] Updated weights for policy 0, policy_version 973474 (0.0025) [2024-06-25 21:51:23,390][15132] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15949414400. Throughput: 0: 42749.2. Samples: 15949497300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:23,390][15132] Avg episode reward: [(0, '0.666')] [2024-06-25 21:51:26,700][15401] Updated weights for policy 0, policy_version 973484 (0.0029) [2024-06-25 21:51:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 15949627392. Throughput: 0: 42831.6. Samples: 15949753440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:28,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 21:51:30,330][15401] Updated weights for policy 0, policy_version 973494 (0.0038) [2024-06-25 21:51:33,389][15132] Fps is (10 sec: 45876.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15949873152. Throughput: 0: 42865.4. Samples: 15950012340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 21:51:34,205][15401] Updated weights for policy 0, policy_version 973504 (0.0024) [2024-06-25 21:51:37,752][15401] Updated weights for policy 0, policy_version 973514 (0.0030) [2024-06-25 21:51:38,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 15950069760. Throughput: 0: 42901.3. Samples: 15950144640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:38,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 21:51:41,674][15401] Updated weights for policy 0, policy_version 973524 (0.0028) [2024-06-25 21:51:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 15950282752. Throughput: 0: 42855.1. Samples: 15950399080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:43,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 21:51:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973528_15950282752.pth... [2024-06-25 21:51:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000972901_15940009984.pth [2024-06-25 21:51:45,649][15401] Updated weights for policy 0, policy_version 973534 (0.0052) [2024-06-25 21:51:48,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15950512128. Throughput: 0: 42955.1. Samples: 15950655260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:48,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 21:51:49,412][15401] Updated weights for policy 0, policy_version 973544 (0.0029) [2024-06-25 21:51:53,215][15401] Updated weights for policy 0, policy_version 973554 (0.0033) [2024-06-25 21:51:53,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 15950708736. Throughput: 0: 42973.8. Samples: 15950788840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 21:51:57,187][15401] Updated weights for policy 0, policy_version 973564 (0.0037) [2024-06-25 21:51:58,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15950905344. Throughput: 0: 42923.1. Samples: 15951045320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:51:58,390][15132] Avg episode reward: [(0, '0.757')] [2024-06-25 21:52:00,667][15401] Updated weights for policy 0, policy_version 973574 (0.0027) [2024-06-25 21:52:03,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 15951151104. Throughput: 0: 42855.0. Samples: 15951294140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:03,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 21:52:04,865][15401] Updated weights for policy 0, policy_version 973584 (0.0031) [2024-06-25 21:52:08,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15951347712. Throughput: 0: 42913.9. Samples: 15951428420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 21:52:08,812][15401] Updated weights for policy 0, policy_version 973594 (0.0044) [2024-06-25 21:52:12,477][15401] Updated weights for policy 0, policy_version 973604 (0.0028) [2024-06-25 21:52:13,389][15132] Fps is (10 sec: 39322.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15951544320. Throughput: 0: 42762.4. Samples: 15951677740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:13,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 21:52:16,659][15401] Updated weights for policy 0, policy_version 973614 (0.0037) [2024-06-25 21:52:18,389][15132] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 15951806464. Throughput: 0: 42584.5. Samples: 15951928640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:18,390][15132] Avg episode reward: [(0, '0.439')] [2024-06-25 21:52:20,082][15401] Updated weights for policy 0, policy_version 973624 (0.0034) [2024-06-25 21:52:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15951970304. Throughput: 0: 42624.4. Samples: 15952062740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:23,390][15132] Avg episode reward: [(0, '0.338')] [2024-06-25 21:52:24,060][15401] Updated weights for policy 0, policy_version 973634 (0.0028) [2024-06-25 21:52:27,703][15401] Updated weights for policy 0, policy_version 973644 (0.0034) [2024-06-25 21:52:28,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15952199680. Throughput: 0: 42657.8. Samples: 15952318680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:28,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 21:52:29,159][15349] Signal inference workers to stop experience collection... (236000 times) [2024-06-25 21:52:29,159][15349] Signal inference workers to resume experience collection... (236000 times) [2024-06-25 21:52:29,200][15401] InferenceWorker_p0-w0: stopping experience collection (236000 times) [2024-06-25 21:52:29,200][15401] InferenceWorker_p0-w0: resuming experience collection (236000 times) [2024-06-25 21:52:31,594][15401] Updated weights for policy 0, policy_version 973654 (0.0022) [2024-06-25 21:52:33,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15952429056. Throughput: 0: 42746.6. Samples: 15952578860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:33,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 21:52:35,355][15401] Updated weights for policy 0, policy_version 973664 (0.0037) [2024-06-25 21:52:38,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42597.4, 300 sec: 42820.4). Total num frames: 15952625664. Throughput: 0: 42677.8. Samples: 15952709400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:38,391][15132] Avg episode reward: [(0, '0.651')] [2024-06-25 21:52:39,066][15401] Updated weights for policy 0, policy_version 973674 (0.0030) [2024-06-25 21:52:43,055][15401] Updated weights for policy 0, policy_version 973684 (0.0035) [2024-06-25 21:52:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15952838656. Throughput: 0: 42640.3. Samples: 15952964140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:43,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 21:52:46,679][15401] Updated weights for policy 0, policy_version 973694 (0.0027) [2024-06-25 21:52:48,390][15132] Fps is (10 sec: 42604.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15953051648. Throughput: 0: 42825.8. Samples: 15953221300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:48,390][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 21:52:50,588][15401] Updated weights for policy 0, policy_version 973704 (0.0027) [2024-06-25 21:52:53,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15953264640. Throughput: 0: 42743.9. Samples: 15953351900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:53,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 21:52:54,471][15401] Updated weights for policy 0, policy_version 973714 (0.0047) [2024-06-25 21:52:58,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15953477632. Throughput: 0: 42784.4. Samples: 15953603040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:52:58,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 21:52:58,611][15401] Updated weights for policy 0, policy_version 973724 (0.0032) [2024-06-25 21:53:02,200][15401] Updated weights for policy 0, policy_version 973734 (0.0028) [2024-06-25 21:53:03,392][15132] Fps is (10 sec: 42588.7, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 15953690624. Throughput: 0: 42860.8. Samples: 15953857480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:53:03,392][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 21:53:06,484][15401] Updated weights for policy 0, policy_version 973744 (0.0028) [2024-06-25 21:53:08,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15953887232. Throughput: 0: 42802.3. Samples: 15953988840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:53:08,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 21:53:10,182][15401] Updated weights for policy 0, policy_version 973754 (0.0038) [2024-06-25 21:53:13,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15954116608. Throughput: 0: 42595.2. Samples: 15954235460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-25 21:53:13,390][15132] Avg episode reward: [(0, '0.459')] [2024-06-25 21:53:14,252][15401] Updated weights for policy 0, policy_version 973764 (0.0030) [2024-06-25 21:53:17,776][15401] Updated weights for policy 0, policy_version 973774 (0.0037) [2024-06-25 21:53:18,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 15954329600. Throughput: 0: 42491.1. Samples: 15954490960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:18,390][15132] Avg episode reward: [(0, '0.627')] [2024-06-25 21:53:22,090][15401] Updated weights for policy 0, policy_version 973784 (0.0038) [2024-06-25 21:53:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15954542592. Throughput: 0: 42502.6. Samples: 15954621960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:23,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 21:53:25,265][15401] Updated weights for policy 0, policy_version 973794 (0.0048) [2024-06-25 21:53:28,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15954755584. Throughput: 0: 42549.4. Samples: 15954878960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:28,392][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 21:53:29,725][15401] Updated weights for policy 0, policy_version 973804 (0.0032) [2024-06-25 21:53:32,738][15401] Updated weights for policy 0, policy_version 973814 (0.0037) [2024-06-25 21:53:33,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 15954984960. Throughput: 0: 42574.6. Samples: 15955137160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:33,390][15132] Avg episode reward: [(0, '0.712')] [2024-06-25 21:53:37,142][15401] Updated weights for policy 0, policy_version 973824 (0.0028) [2024-06-25 21:53:38,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42326.4, 300 sec: 42765.0). Total num frames: 15955165184. Throughput: 0: 42548.2. Samples: 15955266560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:38,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 21:53:40,486][15401] Updated weights for policy 0, policy_version 973834 (0.0033) [2024-06-25 21:53:43,392][15132] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 15955410944. Throughput: 0: 42498.6. Samples: 15955515580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:43,392][15132] Avg episode reward: [(0, '0.615')] [2024-06-25 21:53:43,407][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973841_15955410944.pth... [2024-06-25 21:53:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973213_15945121792.pth [2024-06-25 21:53:44,603][15401] Updated weights for policy 0, policy_version 973844 (0.0032) [2024-06-25 21:53:48,209][15401] Updated weights for policy 0, policy_version 973854 (0.0033) [2024-06-25 21:53:48,392][15132] Fps is (10 sec: 47502.0, 60 sec: 43142.9, 300 sec: 42875.7). Total num frames: 15955640320. Throughput: 0: 42618.7. Samples: 15955775320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:48,392][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 21:53:52,296][15401] Updated weights for policy 0, policy_version 973864 (0.0033) [2024-06-25 21:53:53,389][15132] Fps is (10 sec: 39331.2, 60 sec: 42325.5, 300 sec: 42709.8). Total num frames: 15955804160. Throughput: 0: 42622.7. Samples: 15955906860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:53,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 21:53:55,867][15401] Updated weights for policy 0, policy_version 973874 (0.0038) [2024-06-25 21:53:58,390][15132] Fps is (10 sec: 40969.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15956049920. Throughput: 0: 42733.2. Samples: 15956158460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:53:58,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 21:54:00,065][15401] Updated weights for policy 0, policy_version 973884 (0.0043) [2024-06-25 21:54:03,390][15132] Fps is (10 sec: 45874.7, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 15956262912. Throughput: 0: 42923.6. Samples: 15956422520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 21:54:03,411][15401] Updated weights for policy 0, policy_version 973894 (0.0034) [2024-06-25 21:54:07,708][15401] Updated weights for policy 0, policy_version 973904 (0.0034) [2024-06-25 21:54:08,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15956459520. Throughput: 0: 42899.6. Samples: 15956552440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:08,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 21:54:09,884][15349] Signal inference workers to stop experience collection... (236050 times) [2024-06-25 21:54:09,884][15349] Signal inference workers to resume experience collection... (236050 times) [2024-06-25 21:54:09,937][15401] InferenceWorker_p0-w0: stopping experience collection (236050 times) [2024-06-25 21:54:09,937][15401] InferenceWorker_p0-w0: resuming experience collection (236050 times) [2024-06-25 21:54:11,025][15401] Updated weights for policy 0, policy_version 973914 (0.0034) [2024-06-25 21:54:13,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15956688896. Throughput: 0: 42712.5. Samples: 15956800920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:13,390][15132] Avg episode reward: [(0, '0.424')] [2024-06-25 21:54:15,313][15401] Updated weights for policy 0, policy_version 973924 (0.0052) [2024-06-25 21:54:18,392][15132] Fps is (10 sec: 45864.2, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 15956918272. Throughput: 0: 42932.0. Samples: 15957069200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:18,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 21:54:18,533][15401] Updated weights for policy 0, policy_version 973934 (0.0033) [2024-06-25 21:54:23,011][15401] Updated weights for policy 0, policy_version 973944 (0.0035) [2024-06-25 21:54:23,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15957114880. Throughput: 0: 42944.7. Samples: 15957199080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:23,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 21:54:26,258][15401] Updated weights for policy 0, policy_version 973954 (0.0037) [2024-06-25 21:54:28,389][15132] Fps is (10 sec: 42608.8, 60 sec: 43146.3, 300 sec: 42820.9). Total num frames: 15957344256. Throughput: 0: 43000.5. Samples: 15957450500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:28,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 21:54:30,394][15401] Updated weights for policy 0, policy_version 973964 (0.0035) [2024-06-25 21:54:33,390][15132] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42820.9). Total num frames: 15957573632. Throughput: 0: 43230.3. Samples: 15957720580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:33,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 21:54:33,648][15401] Updated weights for policy 0, policy_version 973974 (0.0046) [2024-06-25 21:54:38,119][15401] Updated weights for policy 0, policy_version 973984 (0.0038) [2024-06-25 21:54:38,389][15132] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15957753856. Throughput: 0: 43136.8. Samples: 15957848020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:38,390][15132] Avg episode reward: [(0, '0.394')] [2024-06-25 21:54:41,402][15401] Updated weights for policy 0, policy_version 973994 (0.0030) [2024-06-25 21:54:43,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 15957999616. Throughput: 0: 43211.6. Samples: 15958102980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:43,392][15132] Avg episode reward: [(0, '0.464')] [2024-06-25 21:54:46,082][15401] Updated weights for policy 0, policy_version 974004 (0.0033) [2024-06-25 21:54:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42327.0, 300 sec: 42710.0). Total num frames: 15958179840. Throughput: 0: 43234.3. Samples: 15958368060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:48,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 21:54:49,034][15401] Updated weights for policy 0, policy_version 974014 (0.0039) [2024-06-25 21:54:53,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15958392832. Throughput: 0: 43012.9. Samples: 15958488020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:53,390][15132] Avg episode reward: [(0, '0.683')] [2024-06-25 21:54:53,666][15401] Updated weights for policy 0, policy_version 974024 (0.0033) [2024-06-25 21:54:56,715][15401] Updated weights for policy 0, policy_version 974034 (0.0035) [2024-06-25 21:54:58,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15958638592. Throughput: 0: 43090.2. Samples: 15958739980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:54:58,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 21:55:01,084][15401] Updated weights for policy 0, policy_version 974044 (0.0029) [2024-06-25 21:55:03,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15958818816. Throughput: 0: 43050.8. Samples: 15959006380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:55:03,390][15132] Avg episode reward: [(0, '0.640')] [2024-06-25 21:55:04,618][15401] Updated weights for policy 0, policy_version 974054 (0.0034) [2024-06-25 21:55:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15959048192. Throughput: 0: 42781.3. Samples: 15959124240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:55:08,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 21:55:08,654][15401] Updated weights for policy 0, policy_version 974064 (0.0046) [2024-06-25 21:55:12,241][15401] Updated weights for policy 0, policy_version 974074 (0.0037) [2024-06-25 21:55:13,395][15132] Fps is (10 sec: 45850.3, 60 sec: 43140.7, 300 sec: 42764.2). Total num frames: 15959277568. Throughput: 0: 42834.0. Samples: 15959378260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-25 21:55:13,395][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 21:55:16,140][15401] Updated weights for policy 0, policy_version 974084 (0.0025) [2024-06-25 21:55:18,389][15132] Fps is (10 sec: 37683.8, 60 sec: 41780.9, 300 sec: 42598.4). Total num frames: 15959425024. Throughput: 0: 42725.8. Samples: 15959643240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:18,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 21:55:19,861][15401] Updated weights for policy 0, policy_version 974094 (0.0041) [2024-06-25 21:55:21,604][15349] Signal inference workers to stop experience collection... (236100 times) [2024-06-25 21:55:21,604][15349] Signal inference workers to resume experience collection... (236100 times) [2024-06-25 21:55:21,621][15401] InferenceWorker_p0-w0: stopping experience collection (236100 times) [2024-06-25 21:55:21,622][15401] InferenceWorker_p0-w0: resuming experience collection (236100 times) [2024-06-25 21:55:23,389][15132] Fps is (10 sec: 40982.1, 60 sec: 42871.6, 300 sec: 42820.9). Total num frames: 15959687168. Throughput: 0: 42522.2. Samples: 15959761520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:23,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 21:55:23,803][15401] Updated weights for policy 0, policy_version 974104 (0.0032) [2024-06-25 21:55:27,508][15401] Updated weights for policy 0, policy_version 974114 (0.0035) [2024-06-25 21:55:28,390][15132] Fps is (10 sec: 49151.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15959916544. Throughput: 0: 42677.8. Samples: 15960023480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:28,390][15132] Avg episode reward: [(0, '0.708')] [2024-06-25 21:55:31,629][15401] Updated weights for policy 0, policy_version 974124 (0.0050) [2024-06-25 21:55:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 15960080384. Throughput: 0: 42763.1. Samples: 15960292400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:33,392][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 21:55:35,277][15401] Updated weights for policy 0, policy_version 974134 (0.0037) [2024-06-25 21:55:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15960342528. Throughput: 0: 42485.3. Samples: 15960399860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:38,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 21:55:39,210][15401] Updated weights for policy 0, policy_version 974144 (0.0036) [2024-06-25 21:55:43,111][15401] Updated weights for policy 0, policy_version 974154 (0.0038) [2024-06-25 21:55:43,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15960555520. Throughput: 0: 42725.8. Samples: 15960662640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:43,390][15132] Avg episode reward: [(0, '0.541')] [2024-06-25 21:55:43,458][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974156_15960571904.pth... [2024-06-25 21:55:43,522][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973528_15950282752.pth [2024-06-25 21:55:46,774][15401] Updated weights for policy 0, policy_version 974164 (0.0038) [2024-06-25 21:55:48,390][15132] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15960719360. Throughput: 0: 42501.6. Samples: 15960918960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:48,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 21:55:50,885][15401] Updated weights for policy 0, policy_version 974174 (0.0039) [2024-06-25 21:55:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 43142.8, 300 sec: 42875.8). Total num frames: 15960981504. Throughput: 0: 42505.9. Samples: 15961037100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:53,392][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 21:55:54,574][15401] Updated weights for policy 0, policy_version 974184 (0.0032) [2024-06-25 21:55:58,392][15132] Fps is (10 sec: 45865.1, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 15961178112. Throughput: 0: 42805.6. Samples: 15961304380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:55:58,392][15132] Avg episode reward: [(0, '0.703')] [2024-06-25 21:55:58,529][15401] Updated weights for policy 0, policy_version 974194 (0.0032) [2024-06-25 21:56:02,298][15401] Updated weights for policy 0, policy_version 974204 (0.0039) [2024-06-25 21:56:03,389][15132] Fps is (10 sec: 39331.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 15961374720. Throughput: 0: 42488.0. Samples: 15961555200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:03,390][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 21:56:06,447][15401] Updated weights for policy 0, policy_version 974214 (0.0040) [2024-06-25 21:56:08,392][15132] Fps is (10 sec: 44236.4, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 15961620480. Throughput: 0: 42698.6. Samples: 15961683060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:08,392][15132] Avg episode reward: [(0, '0.467')] [2024-06-25 21:56:09,680][15401] Updated weights for policy 0, policy_version 974224 (0.0025) [2024-06-25 21:56:13,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42056.1, 300 sec: 42654.0). Total num frames: 15961800704. Throughput: 0: 42821.0. Samples: 15961950420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:13,390][15132] Avg episode reward: [(0, '0.676')] [2024-06-25 21:56:14,145][15401] Updated weights for policy 0, policy_version 974234 (0.0037) [2024-06-25 21:56:17,143][15401] Updated weights for policy 0, policy_version 974244 (0.0026) [2024-06-25 21:56:18,390][15132] Fps is (10 sec: 40969.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15962030080. Throughput: 0: 42262.2. Samples: 15962194200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:18,395][15132] Avg episode reward: [(0, '0.507')] [2024-06-25 21:56:21,818][15401] Updated weights for policy 0, policy_version 974254 (0.0036) [2024-06-25 21:56:23,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15962259456. Throughput: 0: 42758.7. Samples: 15962324000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:23,390][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 21:56:24,635][15401] Updated weights for policy 0, policy_version 974264 (0.0033) [2024-06-25 21:56:27,717][15349] Signal inference workers to stop experience collection... (236150 times) [2024-06-25 21:56:27,745][15401] InferenceWorker_p0-w0: stopping experience collection (236150 times) [2024-06-25 21:56:27,771][15349] Signal inference workers to resume experience collection... (236150 times) [2024-06-25 21:56:27,772][15401] InferenceWorker_p0-w0: resuming experience collection (236150 times) [2024-06-25 21:56:28,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 15962439680. Throughput: 0: 42885.8. Samples: 15962592500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:28,390][15132] Avg episode reward: [(0, '0.770')] [2024-06-25 21:56:29,306][15401] Updated weights for policy 0, policy_version 974274 (0.0035) [2024-06-25 21:56:32,415][15401] Updated weights for policy 0, policy_version 974284 (0.0038) [2024-06-25 21:56:33,389][15132] Fps is (10 sec: 42598.9, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 15962685440. Throughput: 0: 42474.8. Samples: 15962830320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:33,390][15132] Avg episode reward: [(0, '0.647')] [2024-06-25 21:56:37,346][15401] Updated weights for policy 0, policy_version 974294 (0.0027) [2024-06-25 21:56:38,390][15132] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15962914816. Throughput: 0: 42895.1. Samples: 15962967280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:38,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 21:56:40,165][15401] Updated weights for policy 0, policy_version 974304 (0.0036) [2024-06-25 21:56:43,390][15132] Fps is (10 sec: 37682.4, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 15963062272. Throughput: 0: 42714.9. Samples: 15963226460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:43,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 21:56:44,883][15401] Updated weights for policy 0, policy_version 974314 (0.0024) [2024-06-25 21:56:47,835][15401] Updated weights for policy 0, policy_version 974324 (0.0037) [2024-06-25 21:56:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 15963340800. Throughput: 0: 42490.1. Samples: 15963467260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:48,390][15132] Avg episode reward: [(0, '0.648')] [2024-06-25 21:56:52,439][15401] Updated weights for policy 0, policy_version 974334 (0.0036) [2024-06-25 21:56:53,390][15132] Fps is (10 sec: 49152.5, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 15963553792. Throughput: 0: 42957.8. Samples: 15963616060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:53,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 21:56:55,215][15401] Updated weights for policy 0, policy_version 974344 (0.0022) [2024-06-25 21:56:58,390][15132] Fps is (10 sec: 37683.4, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 15963717632. Throughput: 0: 42800.3. Samples: 15963876440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:56:58,390][15132] Avg episode reward: [(0, '0.231')] [2024-06-25 21:56:59,857][15401] Updated weights for policy 0, policy_version 974354 (0.0048) [2024-06-25 21:57:03,169][15401] Updated weights for policy 0, policy_version 974364 (0.0045) [2024-06-25 21:57:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 15963996160. Throughput: 0: 42968.0. Samples: 15964127760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:57:03,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 21:57:07,515][15401] Updated weights for policy 0, policy_version 974374 (0.0030) [2024-06-25 21:57:08,390][15132] Fps is (10 sec: 47513.5, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 15964192768. Throughput: 0: 43230.6. Samples: 15964269380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-25 21:57:08,390][15132] Avg episode reward: [(0, '0.678')] [2024-06-25 21:57:10,525][15401] Updated weights for policy 0, policy_version 974384 (0.0035) [2024-06-25 21:57:13,392][15132] Fps is (10 sec: 36036.1, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 15964356608. Throughput: 0: 42944.4. Samples: 15964525100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:13,392][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 21:57:15,074][15401] Updated weights for policy 0, policy_version 974394 (0.0043) [2024-06-25 21:57:18,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15964618752. Throughput: 0: 43179.0. Samples: 15964773380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:18,390][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 21:57:18,651][15401] Updated weights for policy 0, policy_version 974404 (0.0029) [2024-06-25 21:57:22,935][15401] Updated weights for policy 0, policy_version 974414 (0.0024) [2024-06-25 21:57:23,390][15132] Fps is (10 sec: 45885.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15964815360. Throughput: 0: 43248.4. Samples: 15964913460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:23,390][15132] Avg episode reward: [(0, '0.529')] [2024-06-25 21:57:23,512][15349] Signal inference workers to stop experience collection... (236200 times) [2024-06-25 21:57:23,548][15401] InferenceWorker_p0-w0: stopping experience collection (236200 times) [2024-06-25 21:57:23,567][15349] Signal inference workers to resume experience collection... (236200 times) [2024-06-25 21:57:23,568][15401] InferenceWorker_p0-w0: resuming experience collection (236200 times) [2024-06-25 21:57:26,027][15401] Updated weights for policy 0, policy_version 974424 (0.0030) [2024-06-25 21:57:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 15965011968. Throughput: 0: 43149.1. Samples: 15965168160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:28,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 21:57:30,539][15401] Updated weights for policy 0, policy_version 974434 (0.0024) [2024-06-25 21:57:33,390][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42876.3). Total num frames: 15965274112. Throughput: 0: 43406.2. Samples: 15965420540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:33,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 21:57:33,425][15401] Updated weights for policy 0, policy_version 974444 (0.0028) [2024-06-25 21:57:38,335][15401] Updated weights for policy 0, policy_version 974454 (0.0027) [2024-06-25 21:57:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15965454336. Throughput: 0: 43142.8. Samples: 15965557480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:38,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 21:57:41,345][15401] Updated weights for policy 0, policy_version 974464 (0.0037) [2024-06-25 21:57:43,390][15132] Fps is (10 sec: 39321.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 15965667328. Throughput: 0: 43003.6. Samples: 15965811600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:43,390][15132] Avg episode reward: [(0, '0.435')] [2024-06-25 21:57:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974467_15965667328.pth... [2024-06-25 21:57:43,472][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000973841_15955410944.pth [2024-06-25 21:57:45,751][15401] Updated weights for policy 0, policy_version 974474 (0.0039) [2024-06-25 21:57:48,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 15965929472. Throughput: 0: 42996.8. Samples: 15966062620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:48,391][15132] Avg episode reward: [(0, '0.422')] [2024-06-25 21:57:48,790][15401] Updated weights for policy 0, policy_version 974484 (0.0036) [2024-06-25 21:57:53,301][15401] Updated weights for policy 0, policy_version 974494 (0.0034) [2024-06-25 21:57:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15966109696. Throughput: 0: 43056.5. Samples: 15966206920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:53,390][15132] Avg episode reward: [(0, '0.254')] [2024-06-25 21:57:56,261][15401] Updated weights for policy 0, policy_version 974504 (0.0034) [2024-06-25 21:57:58,390][15132] Fps is (10 sec: 39321.7, 60 sec: 43417.7, 300 sec: 42820.9). Total num frames: 15966322688. Throughput: 0: 42938.3. Samples: 15966457220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:57:58,390][15132] Avg episode reward: [(0, '0.282')] [2024-06-25 21:58:01,085][15401] Updated weights for policy 0, policy_version 974514 (0.0027) [2024-06-25 21:58:03,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15966552064. Throughput: 0: 42972.2. Samples: 15966707120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:03,390][15132] Avg episode reward: [(0, '0.447')] [2024-06-25 21:58:03,945][15401] Updated weights for policy 0, policy_version 974524 (0.0044) [2024-06-25 21:58:08,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 15966748672. Throughput: 0: 42908.6. Samples: 15966844340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:08,390][15132] Avg episode reward: [(0, '0.750')] [2024-06-25 21:58:08,686][15401] Updated weights for policy 0, policy_version 974534 (0.0037) [2024-06-25 21:58:11,880][15401] Updated weights for policy 0, policy_version 974544 (0.0026) [2024-06-25 21:58:13,389][15132] Fps is (10 sec: 42598.4, 60 sec: 43692.5, 300 sec: 42876.1). Total num frames: 15966978048. Throughput: 0: 42760.4. Samples: 15967092380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:13,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 21:58:16,158][15401] Updated weights for policy 0, policy_version 974554 (0.0039) [2024-06-25 21:58:18,395][15132] Fps is (10 sec: 45850.5, 60 sec: 43140.7, 300 sec: 42930.9). Total num frames: 15967207424. Throughput: 0: 42901.6. Samples: 15967351340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:18,395][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 21:58:19,568][15401] Updated weights for policy 0, policy_version 974564 (0.0037) [2024-06-25 21:58:23,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 15967371264. Throughput: 0: 42769.3. Samples: 15967482100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:23,390][15132] Avg episode reward: [(0, '0.695')] [2024-06-25 21:58:23,922][15401] Updated weights for policy 0, policy_version 974574 (0.0041) [2024-06-25 21:58:27,129][15401] Updated weights for policy 0, policy_version 974584 (0.0045) [2024-06-25 21:58:28,389][15132] Fps is (10 sec: 42621.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 15967633408. Throughput: 0: 42813.9. Samples: 15967738220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:28,390][15132] Avg episode reward: [(0, '0.484')] [2024-06-25 21:58:31,525][15401] Updated weights for policy 0, policy_version 974594 (0.0026) [2024-06-25 21:58:33,392][15132] Fps is (10 sec: 49139.8, 60 sec: 43142.9, 300 sec: 43042.4). Total num frames: 15967862784. Throughput: 0: 43027.5. Samples: 15967998960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:33,392][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 21:58:34,655][15401] Updated weights for policy 0, policy_version 974604 (0.0033) [2024-06-25 21:58:38,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15968026624. Throughput: 0: 42802.3. Samples: 15968133020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:38,390][15132] Avg episode reward: [(0, '0.524')] [2024-06-25 21:58:39,126][15401] Updated weights for policy 0, policy_version 974614 (0.0032) [2024-06-25 21:58:39,134][15349] Signal inference workers to stop experience collection... (236250 times) [2024-06-25 21:58:39,134][15349] Signal inference workers to resume experience collection... (236250 times) [2024-06-25 21:58:39,177][15401] InferenceWorker_p0-w0: stopping experience collection (236250 times) [2024-06-25 21:58:39,177][15401] InferenceWorker_p0-w0: resuming experience collection (236250 times) [2024-06-25 21:58:42,307][15401] Updated weights for policy 0, policy_version 974624 (0.0038) [2024-06-25 21:58:43,390][15132] Fps is (10 sec: 42608.5, 60 sec: 43690.7, 300 sec: 42876.4). Total num frames: 15968288768. Throughput: 0: 42950.2. Samples: 15968389980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:43,390][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 21:58:46,725][15401] Updated weights for policy 0, policy_version 974634 (0.0042) [2024-06-25 21:58:48,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15968501760. Throughput: 0: 43045.8. Samples: 15968644180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:48,390][15132] Avg episode reward: [(0, '0.185')] [2024-06-25 21:58:50,020][15401] Updated weights for policy 0, policy_version 974644 (0.0042) [2024-06-25 21:58:53,389][15132] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 15968665600. Throughput: 0: 42926.7. Samples: 15968776040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:53,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 21:58:54,324][15401] Updated weights for policy 0, policy_version 974654 (0.0035) [2024-06-25 21:58:57,611][15401] Updated weights for policy 0, policy_version 974664 (0.0031) [2024-06-25 21:58:58,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 15968927744. Throughput: 0: 43076.8. Samples: 15969030840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:58:58,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 21:59:02,111][15401] Updated weights for policy 0, policy_version 974674 (0.0038) [2024-06-25 21:59:03,389][15132] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 15969140736. Throughput: 0: 43009.7. Samples: 15969286540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:59:03,390][15132] Avg episode reward: [(0, '0.873')] [2024-06-25 21:59:05,449][15401] Updated weights for policy 0, policy_version 974684 (0.0033) [2024-06-25 21:59:08,390][15132] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15969337344. Throughput: 0: 42979.0. Samples: 15969416160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 21:59:08,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 21:59:09,592][15401] Updated weights for policy 0, policy_version 974694 (0.0032) [2024-06-25 21:59:13,082][15401] Updated weights for policy 0, policy_version 974704 (0.0045) [2024-06-25 21:59:13,396][15132] Fps is (10 sec: 42570.5, 60 sec: 43139.8, 300 sec: 42875.5). Total num frames: 15969566720. Throughput: 0: 42949.8. Samples: 15969671240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:13,397][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 21:59:17,321][15401] Updated weights for policy 0, policy_version 974714 (0.0026) [2024-06-25 21:59:18,389][15132] Fps is (10 sec: 45875.9, 60 sec: 43148.5, 300 sec: 42987.2). Total num frames: 15969796096. Throughput: 0: 42908.6. Samples: 15969929740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:18,390][15132] Avg episode reward: [(0, '0.273')] [2024-06-25 21:59:20,587][15401] Updated weights for policy 0, policy_version 974724 (0.0022) [2024-06-25 21:59:23,390][15132] Fps is (10 sec: 40986.1, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 15969976320. Throughput: 0: 42802.5. Samples: 15970059140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:23,390][15132] Avg episode reward: [(0, '0.760')] [2024-06-25 21:59:24,913][15401] Updated weights for policy 0, policy_version 974734 (0.0029) [2024-06-25 21:59:28,080][15401] Updated weights for policy 0, policy_version 974744 (0.0039) [2024-06-25 21:59:28,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 15970205696. Throughput: 0: 42799.7. Samples: 15970315960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:28,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 21:59:32,567][15401] Updated weights for policy 0, policy_version 974754 (0.0030) [2024-06-25 21:59:33,389][15132] Fps is (10 sec: 45876.2, 60 sec: 42873.3, 300 sec: 42987.2). Total num frames: 15970435072. Throughput: 0: 42875.6. Samples: 15970573580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:33,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 21:59:35,895][15401] Updated weights for policy 0, policy_version 974764 (0.0027) [2024-06-25 21:59:38,396][15132] Fps is (10 sec: 39296.1, 60 sec: 42866.8, 300 sec: 42708.6). Total num frames: 15970598912. Throughput: 0: 42847.1. Samples: 15970704440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:38,396][15132] Avg episode reward: [(0, '0.486')] [2024-06-25 21:59:40,175][15401] Updated weights for policy 0, policy_version 974774 (0.0034) [2024-06-25 21:59:43,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15970844672. Throughput: 0: 42815.6. Samples: 15970957540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:43,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 21:59:43,426][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974784_15970861056.pth... [2024-06-25 21:59:43,434][15401] Updated weights for policy 0, policy_version 974784 (0.0034) [2024-06-25 21:59:43,485][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974156_15960571904.pth [2024-06-25 21:59:47,815][15401] Updated weights for policy 0, policy_version 974794 (0.0042) [2024-06-25 21:59:48,390][15132] Fps is (10 sec: 47544.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15971074048. Throughput: 0: 42907.9. Samples: 15971217400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:48,390][15132] Avg episode reward: [(0, '0.454')] [2024-06-25 21:59:51,374][15401] Updated weights for policy 0, policy_version 974804 (0.0023) [2024-06-25 21:59:53,389][15132] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15971237888. Throughput: 0: 42885.0. Samples: 15971345980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:53,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 21:59:55,411][15401] Updated weights for policy 0, policy_version 974814 (0.0045) [2024-06-25 21:59:58,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 15971500032. Throughput: 0: 42764.4. Samples: 15971595360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 21:59:58,390][15132] Avg episode reward: [(0, '0.575')] [2024-06-25 21:59:59,010][15401] Updated weights for policy 0, policy_version 974824 (0.0034) [2024-06-25 22:00:02,978][15349] Signal inference workers to stop experience collection... (236300 times) [2024-06-25 22:00:03,019][15401] InferenceWorker_p0-w0: stopping experience collection (236300 times) [2024-06-25 22:00:03,030][15349] Signal inference workers to resume experience collection... (236300 times) [2024-06-25 22:00:03,036][15401] InferenceWorker_p0-w0: resuming experience collection (236300 times) [2024-06-25 22:00:03,039][15401] Updated weights for policy 0, policy_version 974834 (0.0033) [2024-06-25 22:00:03,390][15132] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15971713024. Throughput: 0: 42816.3. Samples: 15971856480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:03,390][15132] Avg episode reward: [(0, '0.633')] [2024-06-25 22:00:06,599][15401] Updated weights for policy 0, policy_version 974844 (0.0031) [2024-06-25 22:00:08,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42765.8). Total num frames: 15971893248. Throughput: 0: 42694.8. Samples: 15971980400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:08,390][15132] Avg episode reward: [(0, '0.791')] [2024-06-25 22:00:10,694][15401] Updated weights for policy 0, policy_version 974854 (0.0043) [2024-06-25 22:00:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42602.9, 300 sec: 43042.7). Total num frames: 15972122624. Throughput: 0: 42583.8. Samples: 15972232240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:13,390][15132] Avg episode reward: [(0, '0.763')] [2024-06-25 22:00:14,189][15401] Updated weights for policy 0, policy_version 974864 (0.0031) [2024-06-25 22:00:18,205][15401] Updated weights for policy 0, policy_version 974874 (0.0025) [2024-06-25 22:00:18,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 15972335616. Throughput: 0: 42858.1. Samples: 15972502200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:18,390][15132] Avg episode reward: [(0, '0.792')] [2024-06-25 22:00:22,085][15401] Updated weights for policy 0, policy_version 974884 (0.0038) [2024-06-25 22:00:23,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15972532224. Throughput: 0: 42816.8. Samples: 15972630920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:23,390][15132] Avg episode reward: [(0, '0.869')] [2024-06-25 22:00:25,851][15401] Updated weights for policy 0, policy_version 974894 (0.0028) [2024-06-25 22:00:28,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 15972761600. Throughput: 0: 42726.6. Samples: 15972880240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:28,390][15132] Avg episode reward: [(0, '0.735')] [2024-06-25 22:00:29,687][15401] Updated weights for policy 0, policy_version 974904 (0.0047) [2024-06-25 22:00:33,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 15972974592. Throughput: 0: 42807.6. Samples: 15973143740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:33,390][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 22:00:33,449][15401] Updated weights for policy 0, policy_version 974914 (0.0036) [2024-06-25 22:00:37,216][15401] Updated weights for policy 0, policy_version 974924 (0.0047) [2024-06-25 22:00:38,390][15132] Fps is (10 sec: 42598.1, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 15973187584. Throughput: 0: 42803.4. Samples: 15973272140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:38,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 22:00:41,184][15401] Updated weights for policy 0, policy_version 974934 (0.0037) [2024-06-25 22:00:43,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 15973416960. Throughput: 0: 42907.5. Samples: 15973526200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:43,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 22:00:44,879][15401] Updated weights for policy 0, policy_version 974944 (0.0029) [2024-06-25 22:00:48,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42876.4). Total num frames: 15973629952. Throughput: 0: 42902.6. Samples: 15973787100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:48,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 22:00:48,920][15401] Updated weights for policy 0, policy_version 974954 (0.0031) [2024-06-25 22:00:52,538][15401] Updated weights for policy 0, policy_version 974964 (0.0030) [2024-06-25 22:00:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42876.4). Total num frames: 15973826560. Throughput: 0: 43043.9. Samples: 15973917380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:53,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 22:00:56,394][15401] Updated weights for policy 0, policy_version 974974 (0.0037) [2024-06-25 22:00:58,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 15974072320. Throughput: 0: 43090.7. Samples: 15974171320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:00:58,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 22:01:00,148][15401] Updated weights for policy 0, policy_version 974984 (0.0032) [2024-06-25 22:01:03,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 15974268928. Throughput: 0: 42998.1. Samples: 15974437120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:01:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 22:01:03,854][15401] Updated weights for policy 0, policy_version 974994 (0.0036) [2024-06-25 22:01:07,629][15401] Updated weights for policy 0, policy_version 975004 (0.0025) [2024-06-25 22:01:08,389][15132] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 15974481920. Throughput: 0: 42922.8. Samples: 15974562440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-25 22:01:08,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 22:01:11,392][15401] Updated weights for policy 0, policy_version 975014 (0.0032) [2024-06-25 22:01:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 15974711296. Throughput: 0: 43032.9. Samples: 15974816720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:13,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 22:01:14,678][15349] Signal inference workers to stop experience collection... (236350 times) [2024-06-25 22:01:14,680][15349] Signal inference workers to resume experience collection... (236350 times) [2024-06-25 22:01:14,703][15401] InferenceWorker_p0-w0: stopping experience collection (236350 times) [2024-06-25 22:01:14,731][15401] InferenceWorker_p0-w0: resuming experience collection (236350 times) [2024-06-25 22:01:15,079][15401] Updated weights for policy 0, policy_version 975024 (0.0043) [2024-06-25 22:01:18,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15974907904. Throughput: 0: 42982.1. Samples: 15975077940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:18,390][15132] Avg episode reward: [(0, '0.466')] [2024-06-25 22:01:19,125][15401] Updated weights for policy 0, policy_version 975034 (0.0037) [2024-06-25 22:01:22,746][15401] Updated weights for policy 0, policy_version 975044 (0.0027) [2024-06-25 22:01:23,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43417.5, 300 sec: 43042.7). Total num frames: 15975137280. Throughput: 0: 42944.8. Samples: 15975204660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:23,390][15132] Avg episode reward: [(0, '0.784')] [2024-06-25 22:01:26,508][15401] Updated weights for policy 0, policy_version 975054 (0.0036) [2024-06-25 22:01:28,389][15132] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15975333888. Throughput: 0: 42897.0. Samples: 15975456560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:28,390][15132] Avg episode reward: [(0, '0.417')] [2024-06-25 22:01:30,918][15401] Updated weights for policy 0, policy_version 975064 (0.0032) [2024-06-25 22:01:33,389][15132] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15975546880. Throughput: 0: 42766.4. Samples: 15975711580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:33,390][15132] Avg episode reward: [(0, '0.411')] [2024-06-25 22:01:34,522][15401] Updated weights for policy 0, policy_version 975074 (0.0033) [2024-06-25 22:01:38,389][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 15975759872. Throughput: 0: 42738.7. Samples: 15975840620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:38,390][15132] Avg episode reward: [(0, '0.628')] [2024-06-25 22:01:38,476][15401] Updated weights for policy 0, policy_version 975084 (0.0027) [2024-06-25 22:01:42,163][15401] Updated weights for policy 0, policy_version 975094 (0.0029) [2024-06-25 22:01:43,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15975972864. Throughput: 0: 42813.0. Samples: 15976097900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:43,396][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 22:01:43,411][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975096_15975972864.pth... [2024-06-25 22:01:43,468][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974467_15965667328.pth [2024-06-25 22:01:46,281][15401] Updated weights for policy 0, policy_version 975104 (0.0027) [2024-06-25 22:01:48,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15976185856. Throughput: 0: 42421.8. Samples: 15976346100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:48,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 22:01:49,864][15401] Updated weights for policy 0, policy_version 975114 (0.0033) [2024-06-25 22:01:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 15976398848. Throughput: 0: 42582.1. Samples: 15976478640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:53,390][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 22:01:54,093][15401] Updated weights for policy 0, policy_version 975124 (0.0037) [2024-06-25 22:01:57,600][15401] Updated weights for policy 0, policy_version 975134 (0.0029) [2024-06-25 22:01:58,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15976628224. Throughput: 0: 42659.9. Samples: 15976736420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:01:58,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 22:02:01,619][15401] Updated weights for policy 0, policy_version 975144 (0.0042) [2024-06-25 22:02:03,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42875.8). Total num frames: 15976841216. Throughput: 0: 42591.5. Samples: 15976994660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:03,393][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 22:02:05,345][15401] Updated weights for policy 0, policy_version 975154 (0.0042) [2024-06-25 22:02:08,396][15132] Fps is (10 sec: 40934.2, 60 sec: 42593.8, 300 sec: 42986.6). Total num frames: 15977037824. Throughput: 0: 42549.6. Samples: 15977119660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:08,396][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 22:02:09,593][15401] Updated weights for policy 0, policy_version 975164 (0.0029) [2024-06-25 22:02:13,155][15401] Updated weights for policy 0, policy_version 975174 (0.0051) [2024-06-25 22:02:13,389][15132] Fps is (10 sec: 42608.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15977267200. Throughput: 0: 42632.3. Samples: 15977375020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:13,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 22:02:17,359][15401] Updated weights for policy 0, policy_version 975184 (0.0034) [2024-06-25 22:02:18,390][15132] Fps is (10 sec: 44265.0, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15977480192. Throughput: 0: 42697.2. Samples: 15977632960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:18,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 22:02:20,697][15401] Updated weights for policy 0, policy_version 975194 (0.0033) [2024-06-25 22:02:20,723][15349] Signal inference workers to stop experience collection... (236400 times) [2024-06-25 22:02:20,724][15349] Signal inference workers to resume experience collection... (236400 times) [2024-06-25 22:02:20,763][15401] InferenceWorker_p0-w0: stopping experience collection (236400 times) [2024-06-25 22:02:20,763][15401] InferenceWorker_p0-w0: resuming experience collection (236400 times) [2024-06-25 22:02:23,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42931.6). Total num frames: 15977676800. Throughput: 0: 42679.9. Samples: 15977761220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:23,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 22:02:24,779][15401] Updated weights for policy 0, policy_version 975204 (0.0028) [2024-06-25 22:02:28,290][15401] Updated weights for policy 0, policy_version 975214 (0.0041) [2024-06-25 22:02:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 15977906176. Throughput: 0: 42688.9. Samples: 15978018900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:28,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 22:02:32,210][15401] Updated weights for policy 0, policy_version 975224 (0.0033) [2024-06-25 22:02:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15978119168. Throughput: 0: 42946.8. Samples: 15978278700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:33,392][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 22:02:35,829][15401] Updated weights for policy 0, policy_version 975234 (0.0033) [2024-06-25 22:02:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 15978332160. Throughput: 0: 42868.4. Samples: 15978407720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:38,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 22:02:39,745][15401] Updated weights for policy 0, policy_version 975244 (0.0032) [2024-06-25 22:02:43,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15978545152. Throughput: 0: 42970.8. Samples: 15978670100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:43,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 22:02:43,666][15401] Updated weights for policy 0, policy_version 975254 (0.0031) [2024-06-25 22:02:47,721][15401] Updated weights for policy 0, policy_version 975264 (0.0028) [2024-06-25 22:02:48,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 15978758144. Throughput: 0: 42918.3. Samples: 15978925980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:48,393][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 22:02:51,131][15401] Updated weights for policy 0, policy_version 975274 (0.0037) [2024-06-25 22:02:53,396][15132] Fps is (10 sec: 44208.1, 60 sec: 43139.9, 300 sec: 42930.7). Total num frames: 15978987520. Throughput: 0: 42976.8. Samples: 15979053620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:53,397][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 22:02:55,255][15401] Updated weights for policy 0, policy_version 975284 (0.0027) [2024-06-25 22:02:58,390][15132] Fps is (10 sec: 42608.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15979184128. Throughput: 0: 42968.3. Samples: 15979308600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:02:58,390][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 22:02:59,174][15401] Updated weights for policy 0, policy_version 975294 (0.0036) [2024-06-25 22:03:02,727][15401] Updated weights for policy 0, policy_version 975304 (0.0031) [2024-06-25 22:03:03,389][15132] Fps is (10 sec: 42626.3, 60 sec: 42873.3, 300 sec: 42931.6). Total num frames: 15979413504. Throughput: 0: 43029.9. Samples: 15979569300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:03:03,390][15132] Avg episode reward: [(0, '0.336')] [2024-06-25 22:03:06,610][15401] Updated weights for policy 0, policy_version 975314 (0.0029) [2024-06-25 22:03:08,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 15979626496. Throughput: 0: 43057.4. Samples: 15979698800. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-25 22:03:08,390][15132] Avg episode reward: [(0, '0.443')] [2024-06-25 22:03:10,338][15401] Updated weights for policy 0, policy_version 975324 (0.0039) [2024-06-25 22:03:13,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 15979823104. Throughput: 0: 42901.3. Samples: 15979949460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:13,390][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 22:03:14,343][15401] Updated weights for policy 0, policy_version 975334 (0.0030) [2024-06-25 22:03:18,043][15401] Updated weights for policy 0, policy_version 975344 (0.0033) [2024-06-25 22:03:18,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 15980036096. Throughput: 0: 42882.8. Samples: 15980208420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:18,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 22:03:21,788][15401] Updated weights for policy 0, policy_version 975354 (0.0049) [2024-06-25 22:03:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15980265472. Throughput: 0: 43022.6. Samples: 15980343740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:23,390][15132] Avg episode reward: [(0, '0.612')] [2024-06-25 22:03:25,850][15401] Updated weights for policy 0, policy_version 975364 (0.0046) [2024-06-25 22:03:28,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 15980478464. Throughput: 0: 42799.2. Samples: 15980596060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:28,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 22:03:29,654][15401] Updated weights for policy 0, policy_version 975374 (0.0033) [2024-06-25 22:03:33,298][15401] Updated weights for policy 0, policy_version 975384 (0.0045) [2024-06-25 22:03:33,389][15132] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 15980691456. Throughput: 0: 42944.2. Samples: 15980858360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:33,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 22:03:37,348][15401] Updated weights for policy 0, policy_version 975394 (0.0040) [2024-06-25 22:03:38,390][15132] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 15980920832. Throughput: 0: 42951.4. Samples: 15980986160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 22:03:40,882][15401] Updated weights for policy 0, policy_version 975404 (0.0037) [2024-06-25 22:03:43,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15981117440. Throughput: 0: 42951.2. Samples: 15981241400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:43,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 22:03:43,535][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975411_15981133824.pth... [2024-06-25 22:03:43,602][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000974784_15970861056.pth [2024-06-25 22:03:44,925][15401] Updated weights for policy 0, policy_version 975414 (0.0036) [2024-06-25 22:03:48,386][15401] Updated weights for policy 0, policy_version 975424 (0.0038) [2024-06-25 22:03:48,390][15132] Fps is (10 sec: 42599.0, 60 sec: 43146.3, 300 sec: 42987.2). Total num frames: 15981346816. Throughput: 0: 42906.2. Samples: 15981500080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:48,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 22:03:52,567][15401] Updated weights for policy 0, policy_version 975434 (0.0034) [2024-06-25 22:03:53,389][15132] Fps is (10 sec: 45875.5, 60 sec: 43149.2, 300 sec: 42876.1). Total num frames: 15981576192. Throughput: 0: 42895.5. Samples: 15981629100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:53,390][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 22:03:55,576][15349] Signal inference workers to stop experience collection... (236450 times) [2024-06-25 22:03:55,576][15349] Signal inference workers to resume experience collection... (236450 times) [2024-06-25 22:03:55,597][15401] InferenceWorker_p0-w0: stopping experience collection (236450 times) [2024-06-25 22:03:55,597][15401] InferenceWorker_p0-w0: resuming experience collection (236450 times) [2024-06-25 22:03:55,907][15401] Updated weights for policy 0, policy_version 975444 (0.0030) [2024-06-25 22:03:58,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15981756416. Throughput: 0: 42954.7. Samples: 15981882420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:03:58,390][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 22:04:00,131][15401] Updated weights for policy 0, policy_version 975454 (0.0034) [2024-06-25 22:04:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 15981985792. Throughput: 0: 42870.6. Samples: 15982137600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 22:04:03,717][15401] Updated weights for policy 0, policy_version 975464 (0.0042) [2024-06-25 22:04:07,595][15401] Updated weights for policy 0, policy_version 975474 (0.0040) [2024-06-25 22:04:08,390][15132] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42821.5). Total num frames: 15982198784. Throughput: 0: 42740.9. Samples: 15982267080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:08,390][15132] Avg episode reward: [(0, '0.672')] [2024-06-25 22:04:11,274][15401] Updated weights for policy 0, policy_version 975484 (0.0031) [2024-06-25 22:04:13,389][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 15982411776. Throughput: 0: 42887.5. Samples: 15982526000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:13,390][15132] Avg episode reward: [(0, '0.331')] [2024-06-25 22:04:15,199][15401] Updated weights for policy 0, policy_version 975494 (0.0040) [2024-06-25 22:04:18,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 15982608384. Throughput: 0: 42843.8. Samples: 15982786340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:18,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 22:04:19,204][15401] Updated weights for policy 0, policy_version 975504 (0.0039) [2024-06-25 22:04:22,648][15401] Updated weights for policy 0, policy_version 975514 (0.0032) [2024-06-25 22:04:23,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15982854144. Throughput: 0: 42879.7. Samples: 15982915740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:23,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 22:04:26,605][15401] Updated weights for policy 0, policy_version 975524 (0.0029) [2024-06-25 22:04:28,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 15983034368. Throughput: 0: 42883.0. Samples: 15983171140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:28,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 22:04:30,200][15401] Updated weights for policy 0, policy_version 975534 (0.0034) [2024-06-25 22:04:33,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.2, 300 sec: 42877.0). Total num frames: 15983247360. Throughput: 0: 42870.9. Samples: 15983429280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:33,390][15132] Avg episode reward: [(0, '0.236')] [2024-06-25 22:04:34,334][15401] Updated weights for policy 0, policy_version 975544 (0.0032) [2024-06-25 22:04:38,138][15401] Updated weights for policy 0, policy_version 975554 (0.0039) [2024-06-25 22:04:38,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 15983493120. Throughput: 0: 42874.6. Samples: 15983558460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:38,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 22:04:42,278][15401] Updated weights for policy 0, policy_version 975564 (0.0039) [2024-06-25 22:04:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15983673344. Throughput: 0: 42851.4. Samples: 15983810740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:43,390][15132] Avg episode reward: [(0, '0.751')] [2024-06-25 22:04:45,800][15401] Updated weights for policy 0, policy_version 975574 (0.0041) [2024-06-25 22:04:48,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 15983902720. Throughput: 0: 42855.0. Samples: 15984066080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:48,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 22:04:50,035][15401] Updated weights for policy 0, policy_version 975584 (0.0027) [2024-06-25 22:04:53,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15984115712. Throughput: 0: 42881.5. Samples: 15984196740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:53,390][15132] Avg episode reward: [(0, '0.408')] [2024-06-25 22:04:53,526][15401] Updated weights for policy 0, policy_version 975594 (0.0033) [2024-06-25 22:04:57,596][15401] Updated weights for policy 0, policy_version 975604 (0.0024) [2024-06-25 22:04:58,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15984312320. Throughput: 0: 42775.5. Samples: 15984450900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:04:58,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 22:05:01,126][15401] Updated weights for policy 0, policy_version 975614 (0.0033) [2024-06-25 22:05:03,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 15984541696. Throughput: 0: 42825.5. Samples: 15984713480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-25 22:05:03,390][15132] Avg episode reward: [(0, '0.481')] [2024-06-25 22:05:05,156][15401] Updated weights for policy 0, policy_version 975624 (0.0044) [2024-06-25 22:05:08,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15984754688. Throughput: 0: 42773.2. Samples: 15984840540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:08,390][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 22:05:08,793][15401] Updated weights for policy 0, policy_version 975634 (0.0032) [2024-06-25 22:05:13,205][15401] Updated weights for policy 0, policy_version 975644 (0.0039) [2024-06-25 22:05:13,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15984967680. Throughput: 0: 42734.3. Samples: 15985094180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:13,390][15132] Avg episode reward: [(0, '0.767')] [2024-06-25 22:05:16,757][15401] Updated weights for policy 0, policy_version 975654 (0.0034) [2024-06-25 22:05:18,389][15132] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 15985197056. Throughput: 0: 42593.5. Samples: 15985345980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:18,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 22:05:20,764][15401] Updated weights for policy 0, policy_version 975664 (0.0031) [2024-06-25 22:05:23,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 15985393664. Throughput: 0: 42698.2. Samples: 15985479880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:23,390][15132] Avg episode reward: [(0, '0.745')] [2024-06-25 22:05:24,328][15401] Updated weights for policy 0, policy_version 975674 (0.0039) [2024-06-25 22:05:28,313][15401] Updated weights for policy 0, policy_version 975684 (0.0032) [2024-06-25 22:05:28,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 15985606656. Throughput: 0: 42806.3. Samples: 15985737020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:28,390][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 22:05:31,834][15349] Signal inference workers to stop experience collection... (236500 times) [2024-06-25 22:05:31,868][15401] InferenceWorker_p0-w0: stopping experience collection (236500 times) [2024-06-25 22:05:31,892][15349] Signal inference workers to resume experience collection... (236500 times) [2024-06-25 22:05:31,892][15401] InferenceWorker_p0-w0: resuming experience collection (236500 times) [2024-06-25 22:05:31,895][15401] Updated weights for policy 0, policy_version 975694 (0.0041) [2024-06-25 22:05:33,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 15985836032. Throughput: 0: 42841.7. Samples: 15985993960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:33,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 22:05:35,793][15401] Updated weights for policy 0, policy_version 975704 (0.0035) [2024-06-25 22:05:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 15986032640. Throughput: 0: 42768.4. Samples: 15986121320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:38,390][15132] Avg episode reward: [(0, '0.461')] [2024-06-25 22:05:39,620][15401] Updated weights for policy 0, policy_version 975714 (0.0044) [2024-06-25 22:05:43,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 15986245632. Throughput: 0: 42802.3. Samples: 15986377000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:43,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 22:05:43,436][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975724_15986262016.pth... [2024-06-25 22:05:43,441][15401] Updated weights for policy 0, policy_version 975724 (0.0026) [2024-06-25 22:05:43,490][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975096_15975972864.pth [2024-06-25 22:05:47,214][15401] Updated weights for policy 0, policy_version 975734 (0.0027) [2024-06-25 22:05:48,389][15132] Fps is (10 sec: 45876.0, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 15986491392. Throughput: 0: 42607.2. Samples: 15986630800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:48,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 22:05:51,000][15401] Updated weights for policy 0, policy_version 975744 (0.0032) [2024-06-25 22:05:53,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 15986688000. Throughput: 0: 42731.7. Samples: 15986763460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:53,390][15132] Avg episode reward: [(0, '0.497')] [2024-06-25 22:05:54,852][15401] Updated weights for policy 0, policy_version 975754 (0.0031) [2024-06-25 22:05:58,390][15132] Fps is (10 sec: 40958.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 15986900992. Throughput: 0: 42735.4. Samples: 15987017280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:05:58,390][15132] Avg episode reward: [(0, '0.527')] [2024-06-25 22:05:58,663][15401] Updated weights for policy 0, policy_version 975764 (0.0031) [2024-06-25 22:06:02,652][15401] Updated weights for policy 0, policy_version 975774 (0.0030) [2024-06-25 22:06:03,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 15987113984. Throughput: 0: 42873.8. Samples: 15987275300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:03,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 22:06:06,309][15401] Updated weights for policy 0, policy_version 975784 (0.0029) [2024-06-25 22:06:08,389][15132] Fps is (10 sec: 40961.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15987310592. Throughput: 0: 42683.2. Samples: 15987400620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:08,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 22:06:10,414][15401] Updated weights for policy 0, policy_version 975794 (0.0036) [2024-06-25 22:06:13,390][15132] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 15987556352. Throughput: 0: 42551.5. Samples: 15987651840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:13,390][15132] Avg episode reward: [(0, '0.743')] [2024-06-25 22:06:13,861][15401] Updated weights for policy 0, policy_version 975804 (0.0036) [2024-06-25 22:06:18,060][15401] Updated weights for policy 0, policy_version 975814 (0.0052) [2024-06-25 22:06:18,389][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15987752960. Throughput: 0: 42768.1. Samples: 15987918520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 22:06:21,415][15401] Updated weights for policy 0, policy_version 975824 (0.0033) [2024-06-25 22:06:23,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15987949568. Throughput: 0: 42730.6. Samples: 15988044200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:23,394][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 22:06:25,517][15401] Updated weights for policy 0, policy_version 975834 (0.0039) [2024-06-25 22:06:28,391][15132] Fps is (10 sec: 42594.0, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 15988178944. Throughput: 0: 42780.3. Samples: 15988302160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:28,391][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 22:06:29,184][15401] Updated weights for policy 0, policy_version 975844 (0.0027) [2024-06-25 22:06:32,997][15401] Updated weights for policy 0, policy_version 975854 (0.0028) [2024-06-25 22:06:33,390][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 15988408320. Throughput: 0: 42819.8. Samples: 15988557700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:33,390][15132] Avg episode reward: [(0, '0.501')] [2024-06-25 22:06:36,991][15401] Updated weights for policy 0, policy_version 975864 (0.0026) [2024-06-25 22:06:38,390][15132] Fps is (10 sec: 40963.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15988588544. Throughput: 0: 42843.5. Samples: 15988691420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:38,390][15132] Avg episode reward: [(0, '0.499')] [2024-06-25 22:06:40,865][15401] Updated weights for policy 0, policy_version 975874 (0.0027) [2024-06-25 22:06:43,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 15988817920. Throughput: 0: 42666.3. Samples: 15988937260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:43,390][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 22:06:44,738][15401] Updated weights for policy 0, policy_version 975884 (0.0039) [2024-06-25 22:06:48,332][15401] Updated weights for policy 0, policy_version 975894 (0.0033) [2024-06-25 22:06:48,390][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 15989047296. Throughput: 0: 42764.5. Samples: 15989199700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:48,395][15132] Avg episode reward: [(0, '0.564')] [2024-06-25 22:06:52,478][15349] Signal inference workers to stop experience collection... (236550 times) [2024-06-25 22:06:52,482][15349] Signal inference workers to resume experience collection... (236550 times) [2024-06-25 22:06:52,486][15401] Updated weights for policy 0, policy_version 975904 (0.0034) [2024-06-25 22:06:52,511][15401] InferenceWorker_p0-w0: stopping experience collection (236550 times) [2024-06-25 22:06:52,511][15401] InferenceWorker_p0-w0: resuming experience collection (236550 times) [2024-06-25 22:06:53,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 15989243904. Throughput: 0: 42845.2. Samples: 15989328760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:53,392][15132] Avg episode reward: [(0, '0.762')] [2024-06-25 22:06:55,894][15401] Updated weights for policy 0, policy_version 975914 (0.0034) [2024-06-25 22:06:58,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 15989456896. Throughput: 0: 42791.2. Samples: 15989577440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:06:58,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 22:07:00,139][15401] Updated weights for policy 0, policy_version 975924 (0.0034) [2024-06-25 22:07:03,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 15989669888. Throughput: 0: 42634.2. Samples: 15989837060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-25 22:07:03,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 22:07:03,785][15401] Updated weights for policy 0, policy_version 975934 (0.0036) [2024-06-25 22:07:07,749][15401] Updated weights for policy 0, policy_version 975944 (0.0044) [2024-06-25 22:07:08,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15989882880. Throughput: 0: 42681.0. Samples: 15989964840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:08,390][15132] Avg episode reward: [(0, '0.479')] [2024-06-25 22:07:11,632][15401] Updated weights for policy 0, policy_version 975954 (0.0037) [2024-06-25 22:07:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 15990112256. Throughput: 0: 42472.1. Samples: 15990213360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:13,390][15132] Avg episode reward: [(0, '0.584')] [2024-06-25 22:07:15,935][15401] Updated weights for policy 0, policy_version 975964 (0.0029) [2024-06-25 22:07:18,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15990308864. Throughput: 0: 42615.6. Samples: 15990475400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:18,390][15132] Avg episode reward: [(0, '0.689')] [2024-06-25 22:07:19,279][15401] Updated weights for policy 0, policy_version 975974 (0.0027) [2024-06-25 22:07:23,389][15132] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 15990505472. Throughput: 0: 42368.5. Samples: 15990598000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:23,390][15132] Avg episode reward: [(0, '0.836')] [2024-06-25 22:07:23,489][15401] Updated weights for policy 0, policy_version 975984 (0.0044) [2024-06-25 22:07:26,970][15401] Updated weights for policy 0, policy_version 975994 (0.0030) [2024-06-25 22:07:28,392][15132] Fps is (10 sec: 44226.4, 60 sec: 42870.5, 300 sec: 42820.2). Total num frames: 15990751232. Throughput: 0: 42562.2. Samples: 15990852660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:28,392][15132] Avg episode reward: [(0, '0.322')] [2024-06-25 22:07:31,373][15401] Updated weights for policy 0, policy_version 976004 (0.0040) [2024-06-25 22:07:33,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 15990947840. Throughput: 0: 42441.8. Samples: 15991109580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:33,390][15132] Avg episode reward: [(0, '0.269')] [2024-06-25 22:07:34,582][15401] Updated weights for policy 0, policy_version 976014 (0.0026) [2024-06-25 22:07:38,389][15132] Fps is (10 sec: 37692.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15991128064. Throughput: 0: 42379.6. Samples: 15991235740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:38,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 22:07:38,988][15401] Updated weights for policy 0, policy_version 976024 (0.0041) [2024-06-25 22:07:42,387][15401] Updated weights for policy 0, policy_version 976034 (0.0035) [2024-06-25 22:07:43,390][15132] Fps is (10 sec: 44235.4, 60 sec: 42871.3, 300 sec: 42820.9). Total num frames: 15991390208. Throughput: 0: 42464.5. Samples: 15991488360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:43,390][15132] Avg episode reward: [(0, '0.591')] [2024-06-25 22:07:43,402][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976037_15991390208.pth... [2024-06-25 22:07:43,456][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975411_15981133824.pth [2024-06-25 22:07:46,889][15401] Updated weights for policy 0, policy_version 976044 (0.0029) [2024-06-25 22:07:48,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 15991586816. Throughput: 0: 42431.2. Samples: 15991746460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:48,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 22:07:49,937][15401] Updated weights for policy 0, policy_version 976054 (0.0037) [2024-06-25 22:07:53,389][15132] Fps is (10 sec: 39323.2, 60 sec: 42327.1, 300 sec: 42709.5). Total num frames: 15991783424. Throughput: 0: 42453.8. Samples: 15991875260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:53,390][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 22:07:54,716][15401] Updated weights for policy 0, policy_version 976064 (0.0042) [2024-06-25 22:07:57,545][15401] Updated weights for policy 0, policy_version 976074 (0.0030) [2024-06-25 22:07:58,392][15132] Fps is (10 sec: 44225.7, 60 sec: 42869.7, 300 sec: 42764.7). Total num frames: 15992029184. Throughput: 0: 42600.8. Samples: 15992130500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:07:58,393][15132] Avg episode reward: [(0, '0.512')] [2024-06-25 22:08:02,458][15401] Updated weights for policy 0, policy_version 976084 (0.0031) [2024-06-25 22:08:03,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 15992225792. Throughput: 0: 42486.2. Samples: 15992387280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:03,390][15132] Avg episode reward: [(0, '0.662')] [2024-06-25 22:08:05,188][15401] Updated weights for policy 0, policy_version 976094 (0.0039) [2024-06-25 22:08:08,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 15992422400. Throughput: 0: 42494.3. Samples: 15992510240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:08,390][15132] Avg episode reward: [(0, '0.409')] [2024-06-25 22:08:10,000][15349] Signal inference workers to stop experience collection... (236600 times) [2024-06-25 22:08:10,002][15349] Signal inference workers to resume experience collection... (236600 times) [2024-06-25 22:08:10,025][15401] Updated weights for policy 0, policy_version 976104 (0.0041) [2024-06-25 22:08:10,053][15401] InferenceWorker_p0-w0: stopping experience collection (236600 times) [2024-06-25 22:08:10,053][15401] InferenceWorker_p0-w0: resuming experience collection (236600 times) [2024-06-25 22:08:13,015][15401] Updated weights for policy 0, policy_version 976114 (0.0028) [2024-06-25 22:08:13,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 15992668160. Throughput: 0: 42645.0. Samples: 15992771580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:13,390][15132] Avg episode reward: [(0, '0.209')] [2024-06-25 22:08:17,633][15401] Updated weights for policy 0, policy_version 976124 (0.0042) [2024-06-25 22:08:18,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 15992848384. Throughput: 0: 42598.7. Samples: 15993026520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:18,390][15132] Avg episode reward: [(0, '0.287')] [2024-06-25 22:08:20,707][15401] Updated weights for policy 0, policy_version 976134 (0.0026) [2024-06-25 22:08:23,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 15993094144. Throughput: 0: 42632.4. Samples: 15993154200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:23,390][15132] Avg episode reward: [(0, '0.570')] [2024-06-25 22:08:25,221][15401] Updated weights for policy 0, policy_version 976144 (0.0028) [2024-06-25 22:08:28,223][15401] Updated weights for policy 0, policy_version 976154 (0.0027) [2024-06-25 22:08:28,390][15132] Fps is (10 sec: 47513.2, 60 sec: 42873.2, 300 sec: 42820.5). Total num frames: 15993323520. Throughput: 0: 42770.5. Samples: 15993413020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:28,390][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 22:08:32,707][15401] Updated weights for policy 0, policy_version 976164 (0.0027) [2024-06-25 22:08:33,390][15132] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 15993503744. Throughput: 0: 42830.0. Samples: 15993673820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:33,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 22:08:35,821][15401] Updated weights for policy 0, policy_version 976174 (0.0025) [2024-06-25 22:08:38,396][15132] Fps is (10 sec: 40933.8, 60 sec: 43412.9, 300 sec: 42764.1). Total num frames: 15993733120. Throughput: 0: 42676.9. Samples: 15993796000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:38,396][15132] Avg episode reward: [(0, '0.692')] [2024-06-25 22:08:40,391][15401] Updated weights for policy 0, policy_version 976184 (0.0026) [2024-06-25 22:08:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 15993946112. Throughput: 0: 42898.4. Samples: 15994060820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:43,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 22:08:43,618][15401] Updated weights for policy 0, policy_version 976194 (0.0033) [2024-06-25 22:08:48,029][15401] Updated weights for policy 0, policy_version 976204 (0.0034) [2024-06-25 22:08:48,390][15132] Fps is (10 sec: 40985.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 15994142720. Throughput: 0: 42967.5. Samples: 15994320820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:48,390][15132] Avg episode reward: [(0, '0.586')] [2024-06-25 22:08:51,199][15401] Updated weights for policy 0, policy_version 976214 (0.0036) [2024-06-25 22:08:53,390][15132] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 15994388480. Throughput: 0: 43009.7. Samples: 15994445680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:53,393][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 22:08:55,765][15401] Updated weights for policy 0, policy_version 976224 (0.0029) [2024-06-25 22:08:58,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42327.1, 300 sec: 42653.9). Total num frames: 15994568704. Throughput: 0: 42945.8. Samples: 15994704140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:08:58,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 22:08:58,902][15401] Updated weights for policy 0, policy_version 976234 (0.0031) [2024-06-25 22:09:03,374][15401] Updated weights for policy 0, policy_version 976244 (0.0033) [2024-06-25 22:09:03,389][15132] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 15994781696. Throughput: 0: 43054.7. Samples: 15994963980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-25 22:09:03,390][15132] Avg episode reward: [(0, '0.684')] [2024-06-25 22:09:06,649][15401] Updated weights for policy 0, policy_version 976254 (0.0041) [2024-06-25 22:09:08,389][15132] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 15995027456. Throughput: 0: 43004.9. Samples: 15995089420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:08,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 22:09:10,763][15401] Updated weights for policy 0, policy_version 976264 (0.0029) [2024-06-25 22:09:13,390][15132] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 15995224064. Throughput: 0: 42980.4. Samples: 15995347140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:13,390][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 22:09:14,245][15401] Updated weights for policy 0, policy_version 976274 (0.0039) [2024-06-25 22:09:18,248][15401] Updated weights for policy 0, policy_version 976284 (0.0030) [2024-06-25 22:09:18,392][15132] Fps is (10 sec: 40950.1, 60 sec: 43142.8, 300 sec: 42653.6). Total num frames: 15995437056. Throughput: 0: 42919.2. Samples: 15995605280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:18,393][15132] Avg episode reward: [(0, '0.288')] [2024-06-25 22:09:20,264][15349] Signal inference workers to stop experience collection... (236650 times) [2024-06-25 22:09:20,284][15401] InferenceWorker_p0-w0: stopping experience collection (236650 times) [2024-06-25 22:09:20,326][15349] Signal inference workers to resume experience collection... (236650 times) [2024-06-25 22:09:20,326][15401] InferenceWorker_p0-w0: resuming experience collection (236650 times) [2024-06-25 22:09:21,675][15401] Updated weights for policy 0, policy_version 976294 (0.0032) [2024-06-25 22:09:23,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 15995650048. Throughput: 0: 43053.3. Samples: 15995733120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:23,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 22:09:25,910][15401] Updated weights for policy 0, policy_version 976304 (0.0024) [2024-06-25 22:09:28,389][15132] Fps is (10 sec: 44247.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 15995879424. Throughput: 0: 42852.4. Samples: 15995989180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:28,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 22:09:29,441][15401] Updated weights for policy 0, policy_version 976314 (0.0043) [2024-06-25 22:09:33,371][15401] Updated weights for policy 0, policy_version 976324 (0.0048) [2024-06-25 22:09:33,396][15132] Fps is (10 sec: 44208.5, 60 sec: 43140.1, 300 sec: 42708.6). Total num frames: 15996092416. Throughput: 0: 42780.3. Samples: 15996246200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:33,396][15132] Avg episode reward: [(0, '0.251')] [2024-06-25 22:09:37,243][15401] Updated weights for policy 0, policy_version 976334 (0.0035) [2024-06-25 22:09:38,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42329.9, 300 sec: 42709.5). Total num frames: 15996272640. Throughput: 0: 42851.2. Samples: 15996373980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:38,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 22:09:40,980][15401] Updated weights for policy 0, policy_version 976344 (0.0028) [2024-06-25 22:09:43,390][15132] Fps is (10 sec: 44264.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 15996534784. Throughput: 0: 42907.9. Samples: 15996635000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:43,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 22:09:43,399][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976351_15996534784.pth... [2024-06-25 22:09:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000975724_15986262016.pth [2024-06-25 22:09:44,856][15401] Updated weights for policy 0, policy_version 976354 (0.0023) [2024-06-25 22:09:48,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 15996715008. Throughput: 0: 42846.6. Samples: 15996892080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:48,390][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 22:09:48,666][15401] Updated weights for policy 0, policy_version 976364 (0.0034) [2024-06-25 22:09:52,459][15401] Updated weights for policy 0, policy_version 976374 (0.0032) [2024-06-25 22:09:53,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 15996928000. Throughput: 0: 42768.8. Samples: 15997014120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:53,393][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 22:09:56,211][15401] Updated weights for policy 0, policy_version 976384 (0.0036) [2024-06-25 22:09:58,396][15132] Fps is (10 sec: 44208.0, 60 sec: 43139.8, 300 sec: 42764.1). Total num frames: 15997157376. Throughput: 0: 42684.6. Samples: 15997268220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:09:58,397][15132] Avg episode reward: [(0, '0.799')] [2024-06-25 22:10:00,582][15401] Updated weights for policy 0, policy_version 976394 (0.0031) [2024-06-25 22:10:03,390][15132] Fps is (10 sec: 42608.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 15997353984. Throughput: 0: 42656.9. Samples: 15997524740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:03,390][15132] Avg episode reward: [(0, '0.163')] [2024-06-25 22:10:04,074][15401] Updated weights for policy 0, policy_version 976404 (0.0027) [2024-06-25 22:10:08,071][15401] Updated weights for policy 0, policy_version 976414 (0.0027) [2024-06-25 22:10:08,390][15132] Fps is (10 sec: 40985.4, 60 sec: 42325.1, 300 sec: 42709.4). Total num frames: 15997566976. Throughput: 0: 42619.6. Samples: 15997651020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:08,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 22:10:11,907][15401] Updated weights for policy 0, policy_version 976424 (0.0043) [2024-06-25 22:10:13,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 15997796352. Throughput: 0: 42579.6. Samples: 15997905260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:13,390][15132] Avg episode reward: [(0, '0.637')] [2024-06-25 22:10:16,249][15401] Updated weights for policy 0, policy_version 976434 (0.0037) [2024-06-25 22:10:18,389][15132] Fps is (10 sec: 42599.8, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 15997992960. Throughput: 0: 42630.9. Samples: 15998164320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:18,390][15132] Avg episode reward: [(0, '0.823')] [2024-06-25 22:10:19,571][15401] Updated weights for policy 0, policy_version 976444 (0.0035) [2024-06-25 22:10:23,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 15998205952. Throughput: 0: 42506.6. Samples: 15998286780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:23,390][15132] Avg episode reward: [(0, '0.560')] [2024-06-25 22:10:23,867][15401] Updated weights for policy 0, policy_version 976454 (0.0038) [2024-06-25 22:10:27,155][15401] Updated weights for policy 0, policy_version 976464 (0.0040) [2024-06-25 22:10:28,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 15998451712. Throughput: 0: 42443.6. Samples: 15998544960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:28,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 22:10:31,490][15401] Updated weights for policy 0, policy_version 976474 (0.0032) [2024-06-25 22:10:33,390][15132] Fps is (10 sec: 40960.2, 60 sec: 42056.7, 300 sec: 42653.9). Total num frames: 15998615552. Throughput: 0: 42479.1. Samples: 15998803640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:33,390][15132] Avg episode reward: [(0, '0.340')] [2024-06-25 22:10:34,882][15401] Updated weights for policy 0, policy_version 976484 (0.0028) [2024-06-25 22:10:38,389][15132] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 15998828544. Throughput: 0: 42352.1. Samples: 15998919860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:38,390][15132] Avg episode reward: [(0, '0.421')] [2024-06-25 22:10:39,117][15401] Updated weights for policy 0, policy_version 976494 (0.0032) [2024-06-25 22:10:42,695][15401] Updated weights for policy 0, policy_version 976504 (0.0036) [2024-06-25 22:10:43,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 15999074304. Throughput: 0: 42459.5. Samples: 15999178620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:43,390][15132] Avg episode reward: [(0, '0.590')] [2024-06-25 22:10:43,628][15349] Signal inference workers to stop experience collection... (236700 times) [2024-06-25 22:10:43,679][15401] InferenceWorker_p0-w0: stopping experience collection (236700 times) [2024-06-25 22:10:43,686][15349] Signal inference workers to resume experience collection... (236700 times) [2024-06-25 22:10:43,693][15401] InferenceWorker_p0-w0: resuming experience collection (236700 times) [2024-06-25 22:10:46,813][15401] Updated weights for policy 0, policy_version 976514 (0.0033) [2024-06-25 22:10:48,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 15999254528. Throughput: 0: 42704.6. Samples: 15999446440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:48,390][15132] Avg episode reward: [(0, '0.730')] [2024-06-25 22:10:50,266][15401] Updated weights for policy 0, policy_version 976524 (0.0037) [2024-06-25 22:10:53,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 15999483904. Throughput: 0: 42534.1. Samples: 15999565040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:53,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 22:10:54,623][15401] Updated weights for policy 0, policy_version 976534 (0.0032) [2024-06-25 22:10:57,827][15401] Updated weights for policy 0, policy_version 976544 (0.0042) [2024-06-25 22:10:58,389][15132] Fps is (10 sec: 45874.7, 60 sec: 42603.0, 300 sec: 42709.5). Total num frames: 15999713280. Throughput: 0: 42687.0. Samples: 15999826180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:10:58,390][15132] Avg episode reward: [(0, '0.697')] [2024-06-25 22:11:02,291][15401] Updated weights for policy 0, policy_version 976554 (0.0026) [2024-06-25 22:11:03,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 15999893504. Throughput: 0: 42663.1. Samples: 16000084160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-25 22:11:03,392][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 22:11:05,408][15401] Updated weights for policy 0, policy_version 976564 (0.0033) [2024-06-25 22:11:08,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 16000122880. Throughput: 0: 42561.8. Samples: 16000202060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:08,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 22:11:09,908][15401] Updated weights for policy 0, policy_version 976574 (0.0032) [2024-06-25 22:11:13,049][15401] Updated weights for policy 0, policy_version 976584 (0.0028) [2024-06-25 22:11:13,390][15132] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 16000352256. Throughput: 0: 42718.6. Samples: 16000467300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:13,392][15132] Avg episode reward: [(0, '0.496')] [2024-06-25 22:11:17,843][15401] Updated weights for policy 0, policy_version 976594 (0.0039) [2024-06-25 22:11:18,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16000548864. Throughput: 0: 42542.6. Samples: 16000718060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:18,390][15132] Avg episode reward: [(0, '0.431')] [2024-06-25 22:11:20,775][15401] Updated weights for policy 0, policy_version 976604 (0.0029) [2024-06-25 22:11:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42709.6). Total num frames: 16000778240. Throughput: 0: 42671.1. Samples: 16000840060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:23,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 22:11:25,429][15401] Updated weights for policy 0, policy_version 976614 (0.0043) [2024-06-25 22:11:28,396][15132] Fps is (10 sec: 44208.3, 60 sec: 42320.8, 300 sec: 42653.0). Total num frames: 16000991232. Throughput: 0: 42872.9. Samples: 16001108180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:28,397][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 22:11:28,469][15401] Updated weights for policy 0, policy_version 976624 (0.0033) [2024-06-25 22:11:32,914][15401] Updated weights for policy 0, policy_version 976634 (0.0026) [2024-06-25 22:11:33,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16001187840. Throughput: 0: 42717.3. Samples: 16001368720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:33,390][15132] Avg episode reward: [(0, '0.675')] [2024-06-25 22:11:35,948][15401] Updated weights for policy 0, policy_version 976644 (0.0047) [2024-06-25 22:11:38,390][15132] Fps is (10 sec: 44264.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 16001433600. Throughput: 0: 42735.0. Samples: 16001488120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:38,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 22:11:40,386][15401] Updated weights for policy 0, policy_version 976654 (0.0024) [2024-06-25 22:11:43,390][15132] Fps is (10 sec: 45871.8, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 16001646592. Throughput: 0: 42800.2. Samples: 16001752220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:43,391][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 22:11:43,397][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976663_16001646592.pth... [2024-06-25 22:11:43,466][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976037_15991390208.pth [2024-06-25 22:11:43,630][15401] Updated weights for policy 0, policy_version 976664 (0.0033) [2024-06-25 22:11:48,185][15401] Updated weights for policy 0, policy_version 976674 (0.0027) [2024-06-25 22:11:48,389][15132] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 16001843200. Throughput: 0: 42782.3. Samples: 16002009360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:48,390][15132] Avg episode reward: [(0, '0.698')] [2024-06-25 22:11:51,649][15401] Updated weights for policy 0, policy_version 976684 (0.0045) [2024-06-25 22:11:53,391][15132] Fps is (10 sec: 40957.0, 60 sec: 42870.4, 300 sec: 42709.3). Total num frames: 16002056192. Throughput: 0: 42829.7. Samples: 16002129460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:53,391][15132] Avg episode reward: [(0, '0.578')] [2024-06-25 22:11:55,482][15401] Updated weights for policy 0, policy_version 976694 (0.0036) [2024-06-25 22:11:58,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16002285568. Throughput: 0: 42960.5. Samples: 16002400520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:11:58,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 22:11:59,343][15401] Updated weights for policy 0, policy_version 976704 (0.0041) [2024-06-25 22:12:03,200][15401] Updated weights for policy 0, policy_version 976714 (0.0032) [2024-06-25 22:12:03,390][15132] Fps is (10 sec: 42604.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 16002482176. Throughput: 0: 43054.6. Samples: 16002655520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:03,390][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 22:12:04,543][15349] Signal inference workers to stop experience collection... (236750 times) [2024-06-25 22:12:04,593][15349] Signal inference workers to resume experience collection... (236750 times) [2024-06-25 22:12:04,594][15401] InferenceWorker_p0-w0: stopping experience collection (236750 times) [2024-06-25 22:12:04,607][15401] InferenceWorker_p0-w0: resuming experience collection (236750 times) [2024-06-25 22:12:06,926][15401] Updated weights for policy 0, policy_version 976724 (0.0037) [2024-06-25 22:12:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 16002711552. Throughput: 0: 43015.0. Samples: 16002775740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:08,390][15132] Avg episode reward: [(0, '0.538')] [2024-06-25 22:12:10,802][15401] Updated weights for policy 0, policy_version 976734 (0.0023) [2024-06-25 22:12:13,390][15132] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16002924544. Throughput: 0: 42939.9. Samples: 16003040200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:13,390][15132] Avg episode reward: [(0, '0.609')] [2024-06-25 22:12:14,552][15401] Updated weights for policy 0, policy_version 976744 (0.0028) [2024-06-25 22:12:18,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16003121152. Throughput: 0: 43062.6. Samples: 16003306540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:18,390][15132] Avg episode reward: [(0, '0.450')] [2024-06-25 22:12:18,458][15401] Updated weights for policy 0, policy_version 976754 (0.0052) [2024-06-25 22:12:22,003][15401] Updated weights for policy 0, policy_version 976764 (0.0037) [2024-06-25 22:12:23,389][15132] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42765.4). Total num frames: 16003366912. Throughput: 0: 43151.3. Samples: 16003429920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:23,390][15132] Avg episode reward: [(0, '0.275')] [2024-06-25 22:12:26,252][15401] Updated weights for policy 0, policy_version 976774 (0.0037) [2024-06-25 22:12:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42876.1, 300 sec: 42765.0). Total num frames: 16003563520. Throughput: 0: 43021.6. Samples: 16003688160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:28,390][15132] Avg episode reward: [(0, '0.700')] [2024-06-25 22:12:29,518][15401] Updated weights for policy 0, policy_version 976784 (0.0030) [2024-06-25 22:12:33,389][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 16003776512. Throughput: 0: 42948.0. Samples: 16003942020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:33,390][15132] Avg episode reward: [(0, '0.699')] [2024-06-25 22:12:33,748][15401] Updated weights for policy 0, policy_version 976794 (0.0045) [2024-06-25 22:12:37,046][15401] Updated weights for policy 0, policy_version 976804 (0.0026) [2024-06-25 22:12:38,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 16003989504. Throughput: 0: 43078.2. Samples: 16004067920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:38,390][15132] Avg episode reward: [(0, '0.566')] [2024-06-25 22:12:41,727][15401] Updated weights for policy 0, policy_version 976814 (0.0039) [2024-06-25 22:12:43,392][15132] Fps is (10 sec: 42587.8, 60 sec: 42597.2, 300 sec: 42764.7). Total num frames: 16004202496. Throughput: 0: 42883.9. Samples: 16004330400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:43,393][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 22:12:44,499][15401] Updated weights for policy 0, policy_version 976824 (0.0030) [2024-06-25 22:12:48,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16004399104. Throughput: 0: 43035.3. Samples: 16004592100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:48,390][15132] Avg episode reward: [(0, '0.548')] [2024-06-25 22:12:49,401][15401] Updated weights for policy 0, policy_version 976834 (0.0046) [2024-06-25 22:12:52,155][15401] Updated weights for policy 0, policy_version 976844 (0.0034) [2024-06-25 22:12:53,389][15132] Fps is (10 sec: 44248.0, 60 sec: 43145.6, 300 sec: 42765.4). Total num frames: 16004644864. Throughput: 0: 43131.2. Samples: 16004716640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:53,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 22:12:57,060][15401] Updated weights for policy 0, policy_version 976854 (0.0032) [2024-06-25 22:12:58,389][15132] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16004857856. Throughput: 0: 43080.9. Samples: 16004978840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-25 22:12:58,390][15132] Avg episode reward: [(0, '0.452')] [2024-06-25 22:13:00,010][15401] Updated weights for policy 0, policy_version 976864 (0.0046) [2024-06-25 22:13:03,390][15132] Fps is (10 sec: 40959.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 16005054464. Throughput: 0: 42683.5. Samples: 16005227300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 22:13:04,636][15401] Updated weights for policy 0, policy_version 976874 (0.0038) [2024-06-25 22:13:08,062][15401] Updated weights for policy 0, policy_version 976884 (0.0034) [2024-06-25 22:13:08,392][15132] Fps is (10 sec: 44226.0, 60 sec: 43142.8, 300 sec: 42820.2). Total num frames: 16005300224. Throughput: 0: 42656.7. Samples: 16005349580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:08,393][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 22:13:12,061][15401] Updated weights for policy 0, policy_version 976894 (0.0040) [2024-06-25 22:13:13,390][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 16005496832. Throughput: 0: 42663.0. Samples: 16005608000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:13,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 22:13:15,617][15401] Updated weights for policy 0, policy_version 976904 (0.0047) [2024-06-25 22:13:17,903][15349] Signal inference workers to stop experience collection... (236800 times) [2024-06-25 22:13:17,903][15349] Signal inference workers to resume experience collection... (236800 times) [2024-06-25 22:13:17,928][15401] InferenceWorker_p0-w0: stopping experience collection (236800 times) [2024-06-25 22:13:17,928][15401] InferenceWorker_p0-w0: resuming experience collection (236800 times) [2024-06-25 22:13:18,390][15132] Fps is (10 sec: 39331.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16005693440. Throughput: 0: 42776.8. Samples: 16005866980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:18,390][15132] Avg episode reward: [(0, '0.593')] [2024-06-25 22:13:19,750][15401] Updated weights for policy 0, policy_version 976914 (0.0022) [2024-06-25 22:13:23,212][15401] Updated weights for policy 0, policy_version 976924 (0.0047) [2024-06-25 22:13:23,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16005922816. Throughput: 0: 42847.7. Samples: 16005996060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:23,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 22:13:27,459][15401] Updated weights for policy 0, policy_version 976934 (0.0031) [2024-06-25 22:13:28,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16006135808. Throughput: 0: 42785.5. Samples: 16006255640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:28,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 22:13:30,871][15401] Updated weights for policy 0, policy_version 976944 (0.0032) [2024-06-25 22:13:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 16006348800. Throughput: 0: 42584.3. Samples: 16006508400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:33,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 22:13:35,160][15401] Updated weights for policy 0, policy_version 976954 (0.0029) [2024-06-25 22:13:38,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16006561792. Throughput: 0: 42724.0. Samples: 16006639220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:38,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 22:13:38,454][15401] Updated weights for policy 0, policy_version 976964 (0.0029) [2024-06-25 22:13:42,577][15401] Updated weights for policy 0, policy_version 976974 (0.0038) [2024-06-25 22:13:43,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42873.2, 300 sec: 42820.6). Total num frames: 16006774784. Throughput: 0: 42647.5. Samples: 16006897980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:43,390][15132] Avg episode reward: [(0, '0.498')] [2024-06-25 22:13:43,454][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976977_16006791168.pth... [2024-06-25 22:13:43,507][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976351_15996534784.pth [2024-06-25 22:13:46,856][15401] Updated weights for policy 0, policy_version 976984 (0.0043) [2024-06-25 22:13:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 16006987776. Throughput: 0: 42686.3. Samples: 16007148180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 22:13:50,036][15401] Updated weights for policy 0, policy_version 976994 (0.0031) [2024-06-25 22:13:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16007184384. Throughput: 0: 42857.0. Samples: 16007278040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:53,398][15132] Avg episode reward: [(0, '0.705')] [2024-06-25 22:13:54,238][15401] Updated weights for policy 0, policy_version 977004 (0.0039) [2024-06-25 22:13:57,667][15401] Updated weights for policy 0, policy_version 977014 (0.0033) [2024-06-25 22:13:58,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 16007413760. Throughput: 0: 42894.3. Samples: 16007538240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:13:58,390][15132] Avg episode reward: [(0, '0.729')] [2024-06-25 22:14:01,786][15401] Updated weights for policy 0, policy_version 977024 (0.0037) [2024-06-25 22:14:03,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 16007643136. Throughput: 0: 42666.2. Samples: 16007786960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:03,390][15132] Avg episode reward: [(0, '0.425')] [2024-06-25 22:14:05,519][15401] Updated weights for policy 0, policy_version 977034 (0.0036) [2024-06-25 22:14:08,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 16007839744. Throughput: 0: 42679.9. Samples: 16007916660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:08,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 22:14:09,450][15401] Updated weights for policy 0, policy_version 977044 (0.0030) [2024-06-25 22:14:13,230][15401] Updated weights for policy 0, policy_version 977054 (0.0047) [2024-06-25 22:14:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 16008052736. Throughput: 0: 42739.0. Samples: 16008178900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:13,390][15132] Avg episode reward: [(0, '0.669')] [2024-06-25 22:14:17,163][15401] Updated weights for policy 0, policy_version 977064 (0.0028) [2024-06-25 22:14:18,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16008265728. Throughput: 0: 42598.2. Samples: 16008425320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:18,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 22:14:20,896][15401] Updated weights for policy 0, policy_version 977074 (0.0043) [2024-06-25 22:14:23,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16008478720. Throughput: 0: 42524.9. Samples: 16008552840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:23,390][15132] Avg episode reward: [(0, '0.444')] [2024-06-25 22:14:24,703][15401] Updated weights for policy 0, policy_version 977084 (0.0029) [2024-06-25 22:14:28,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 16008691712. Throughput: 0: 42541.8. Samples: 16008812360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:28,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 22:14:28,522][15401] Updated weights for policy 0, policy_version 977094 (0.0027) [2024-06-25 22:14:32,764][15401] Updated weights for policy 0, policy_version 977104 (0.0038) [2024-06-25 22:14:33,390][15132] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 16008904704. Throughput: 0: 42623.1. Samples: 16009066220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:33,390][15132] Avg episode reward: [(0, '0.522')] [2024-06-25 22:14:36,342][15401] Updated weights for policy 0, policy_version 977114 (0.0031) [2024-06-25 22:14:38,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 16009101312. Throughput: 0: 42595.1. Samples: 16009194820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:38,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 22:14:40,534][15401] Updated weights for policy 0, policy_version 977124 (0.0045) [2024-06-25 22:14:43,392][15132] Fps is (10 sec: 42588.5, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 16009330688. Throughput: 0: 42464.3. Samples: 16009449240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:43,393][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 22:14:43,738][15401] Updated weights for policy 0, policy_version 977134 (0.0033) [2024-06-25 22:14:48,144][15401] Updated weights for policy 0, policy_version 977144 (0.0034) [2024-06-25 22:14:48,152][15349] Signal inference workers to stop experience collection... (236850 times) [2024-06-25 22:14:48,152][15349] Signal inference workers to resume experience collection... (236850 times) [2024-06-25 22:14:48,175][15401] InferenceWorker_p0-w0: stopping experience collection (236850 times) [2024-06-25 22:14:48,175][15401] InferenceWorker_p0-w0: resuming experience collection (236850 times) [2024-06-25 22:14:48,392][15132] Fps is (10 sec: 44226.0, 60 sec: 42596.7, 300 sec: 42765.0). Total num frames: 16009543680. Throughput: 0: 42799.0. Samples: 16009713020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:48,393][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 22:14:51,319][15401] Updated weights for policy 0, policy_version 977154 (0.0024) [2024-06-25 22:14:53,389][15132] Fps is (10 sec: 42608.7, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 16009756672. Throughput: 0: 42635.6. Samples: 16009835260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:53,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 22:14:56,014][15401] Updated weights for policy 0, policy_version 977164 (0.0028) [2024-06-25 22:14:58,389][15132] Fps is (10 sec: 42609.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16009969664. Throughput: 0: 42570.7. Samples: 16010094580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-25 22:14:58,390][15132] Avg episode reward: [(0, '0.567')] [2024-06-25 22:14:59,215][15401] Updated weights for policy 0, policy_version 977174 (0.0031) [2024-06-25 22:15:03,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 16010166272. Throughput: 0: 42863.9. Samples: 16010354200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:03,390][15132] Avg episode reward: [(0, '0.485')] [2024-06-25 22:15:03,620][15401] Updated weights for policy 0, policy_version 977184 (0.0037) [2024-06-25 22:15:06,640][15401] Updated weights for policy 0, policy_version 977194 (0.0041) [2024-06-25 22:15:08,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16010412032. Throughput: 0: 42763.8. Samples: 16010477220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:08,390][15132] Avg episode reward: [(0, '0.660')] [2024-06-25 22:15:11,169][15401] Updated weights for policy 0, policy_version 977204 (0.0046) [2024-06-25 22:15:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 16010608640. Throughput: 0: 42808.5. Samples: 16010738740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:13,390][15132] Avg episode reward: [(0, '0.460')] [2024-06-25 22:15:14,153][15401] Updated weights for policy 0, policy_version 977214 (0.0039) [2024-06-25 22:15:18,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 16010805248. Throughput: 0: 42782.3. Samples: 16010991420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:18,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 22:15:18,693][15401] Updated weights for policy 0, policy_version 977224 (0.0033) [2024-06-25 22:15:21,739][15401] Updated weights for policy 0, policy_version 977234 (0.0038) [2024-06-25 22:15:23,390][15132] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 16011067392. Throughput: 0: 42619.5. Samples: 16011112700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:23,390][15132] Avg episode reward: [(0, '0.456')] [2024-06-25 22:15:26,960][15401] Updated weights for policy 0, policy_version 977244 (0.0042) [2024-06-25 22:15:28,389][15132] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 16011264000. Throughput: 0: 42898.4. Samples: 16011379560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:28,390][15132] Avg episode reward: [(0, '0.276')] [2024-06-25 22:15:29,279][15401] Updated weights for policy 0, policy_version 977254 (0.0031) [2024-06-25 22:15:33,390][15132] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16011444224. Throughput: 0: 42656.0. Samples: 16011632440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:33,390][15132] Avg episode reward: [(0, '0.557')] [2024-06-25 22:15:34,618][15401] Updated weights for policy 0, policy_version 977264 (0.0026) [2024-06-25 22:15:36,922][15401] Updated weights for policy 0, policy_version 977274 (0.0031) [2024-06-25 22:15:38,390][15132] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 16011689984. Throughput: 0: 42632.4. Samples: 16011753720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 22:15:42,304][15401] Updated weights for policy 0, policy_version 977284 (0.0041) [2024-06-25 22:15:43,389][15132] Fps is (10 sec: 44237.5, 60 sec: 42600.1, 300 sec: 42820.5). Total num frames: 16011886592. Throughput: 0: 42848.0. Samples: 16012022740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:43,390][15132] Avg episode reward: [(0, '0.654')] [2024-06-25 22:15:43,410][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977288_16011886592.pth... [2024-06-25 22:15:43,457][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976663_16001646592.pth [2024-06-25 22:15:44,598][15401] Updated weights for policy 0, policy_version 977294 (0.0032) [2024-06-25 22:15:48,390][15132] Fps is (10 sec: 40960.3, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 16012099584. Throughput: 0: 42682.3. Samples: 16012274900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:48,390][15132] Avg episode reward: [(0, '0.643')] [2024-06-25 22:15:49,825][15401] Updated weights for policy 0, policy_version 977304 (0.0049) [2024-06-25 22:15:52,057][15349] Signal inference workers to stop experience collection... (236900 times) [2024-06-25 22:15:52,058][15349] Signal inference workers to resume experience collection... (236900 times) [2024-06-25 22:15:52,109][15401] InferenceWorker_p0-w0: stopping experience collection (236900 times) [2024-06-25 22:15:52,109][15401] InferenceWorker_p0-w0: resuming experience collection (236900 times) [2024-06-25 22:15:52,451][15401] Updated weights for policy 0, policy_version 977314 (0.0038) [2024-06-25 22:15:53,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 16012345344. Throughput: 0: 42715.2. Samples: 16012399400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:53,390][15132] Avg episode reward: [(0, '0.420')] [2024-06-25 22:15:57,603][15401] Updated weights for policy 0, policy_version 977324 (0.0034) [2024-06-25 22:15:58,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16012509184. Throughput: 0: 42634.1. Samples: 16012657280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:15:58,392][15132] Avg episode reward: [(0, '0.410')] [2024-06-25 22:16:00,465][15401] Updated weights for policy 0, policy_version 977334 (0.0041) [2024-06-25 22:16:03,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16012738560. Throughput: 0: 42568.8. Samples: 16012907020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:03,390][15132] Avg episode reward: [(0, '0.614')] [2024-06-25 22:16:05,114][15401] Updated weights for policy 0, policy_version 977344 (0.0035) [2024-06-25 22:16:07,979][15401] Updated weights for policy 0, policy_version 977354 (0.0030) [2024-06-25 22:16:08,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 16012984320. Throughput: 0: 42813.4. Samples: 16013039300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:08,390][15132] Avg episode reward: [(0, '0.787')] [2024-06-25 22:16:12,714][15401] Updated weights for policy 0, policy_version 977364 (0.0038) [2024-06-25 22:16:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 16013148160. Throughput: 0: 42654.1. Samples: 16013299000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:13,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 22:16:15,389][15401] Updated weights for policy 0, policy_version 977374 (0.0037) [2024-06-25 22:16:18,390][15132] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 16013377536. Throughput: 0: 42599.2. Samples: 16013549400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:18,390][15132] Avg episode reward: [(0, '0.348')] [2024-06-25 22:16:20,334][15401] Updated weights for policy 0, policy_version 977384 (0.0032) [2024-06-25 22:16:23,390][15132] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42765.9). Total num frames: 16013606912. Throughput: 0: 42812.0. Samples: 16013680260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:23,390][15132] Avg episode reward: [(0, '0.725')] [2024-06-25 22:16:23,554][15401] Updated weights for policy 0, policy_version 977394 (0.0026) [2024-06-25 22:16:27,953][15401] Updated weights for policy 0, policy_version 977404 (0.0029) [2024-06-25 22:16:28,389][15132] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16013803520. Throughput: 0: 42551.6. Samples: 16013937560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:28,390][15132] Avg episode reward: [(0, '0.629')] [2024-06-25 22:16:31,282][15401] Updated weights for policy 0, policy_version 977414 (0.0036) [2024-06-25 22:16:33,392][15132] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42709.2). Total num frames: 16014032896. Throughput: 0: 42503.1. Samples: 16014187640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:33,392][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 22:16:35,616][15401] Updated weights for policy 0, policy_version 977424 (0.0036) [2024-06-25 22:16:38,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.6). Total num frames: 16014245888. Throughput: 0: 42606.7. Samples: 16014316700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:38,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 22:16:39,282][15401] Updated weights for policy 0, policy_version 977434 (0.0027) [2024-06-25 22:16:43,277][15401] Updated weights for policy 0, policy_version 977444 (0.0041) [2024-06-25 22:16:43,389][15132] Fps is (10 sec: 40969.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16014442496. Throughput: 0: 42482.2. Samples: 16014568980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:43,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 22:16:47,222][15401] Updated weights for policy 0, policy_version 977454 (0.0041) [2024-06-25 22:16:48,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 16014655488. Throughput: 0: 42574.8. Samples: 16014822880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:48,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 22:16:51,166][15401] Updated weights for policy 0, policy_version 977464 (0.0035) [2024-06-25 22:16:53,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 16014884864. Throughput: 0: 42624.7. Samples: 16014957420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:53,390][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 22:16:54,627][15401] Updated weights for policy 0, policy_version 977474 (0.0033) [2024-06-25 22:16:58,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16015081472. Throughput: 0: 42448.5. Samples: 16015209180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-25 22:16:58,390][15132] Avg episode reward: [(0, '0.571')] [2024-06-25 22:16:58,622][15401] Updated weights for policy 0, policy_version 977484 (0.0042) [2024-06-25 22:17:02,339][15401] Updated weights for policy 0, policy_version 977494 (0.0034) [2024-06-25 22:17:03,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 16015294464. Throughput: 0: 42571.5. Samples: 16015465120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:03,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 22:17:06,166][15401] Updated weights for policy 0, policy_version 977504 (0.0041) [2024-06-25 22:17:08,396][15132] Fps is (10 sec: 44208.4, 60 sec: 42320.8, 300 sec: 42708.6). Total num frames: 16015523840. Throughput: 0: 42560.3. Samples: 16015595740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:08,396][15132] Avg episode reward: [(0, '0.569')] [2024-06-25 22:17:09,981][15401] Updated weights for policy 0, policy_version 977514 (0.0026) [2024-06-25 22:17:13,389][15132] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 16015736832. Throughput: 0: 42585.8. Samples: 16015853920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:13,390][15132] Avg episode reward: [(0, '0.493')] [2024-06-25 22:17:13,670][15401] Updated weights for policy 0, policy_version 977524 (0.0032) [2024-06-25 22:17:16,230][15349] Signal inference workers to stop experience collection... (236950 times) [2024-06-25 22:17:16,248][15401] InferenceWorker_p0-w0: stopping experience collection (236950 times) [2024-06-25 22:17:16,288][15349] Signal inference workers to resume experience collection... (236950 times) [2024-06-25 22:17:16,288][15401] InferenceWorker_p0-w0: resuming experience collection (236950 times) [2024-06-25 22:17:18,062][15401] Updated weights for policy 0, policy_version 977534 (0.0028) [2024-06-25 22:17:18,390][15132] Fps is (10 sec: 40986.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 16015933440. Throughput: 0: 42616.9. Samples: 16016105300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:18,390][15132] Avg episode reward: [(0, '0.387')] [2024-06-25 22:17:21,859][15401] Updated weights for policy 0, policy_version 977544 (0.0033) [2024-06-25 22:17:23,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16016162816. Throughput: 0: 42471.9. Samples: 16016227940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:23,390][15132] Avg episode reward: [(0, '0.768')] [2024-06-25 22:17:25,606][15401] Updated weights for policy 0, policy_version 977554 (0.0038) [2024-06-25 22:17:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 16016375808. Throughput: 0: 42729.3. Samples: 16016491800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:28,392][15132] Avg episode reward: [(0, '0.775')] [2024-06-25 22:17:29,311][15401] Updated weights for policy 0, policy_version 977564 (0.0036) [2024-06-25 22:17:33,096][15401] Updated weights for policy 0, policy_version 977574 (0.0034) [2024-06-25 22:17:33,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42600.2, 300 sec: 42709.5). Total num frames: 16016588800. Throughput: 0: 42818.2. Samples: 16016749700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:33,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 22:17:36,942][15401] Updated weights for policy 0, policy_version 977584 (0.0040) [2024-06-25 22:17:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 16016801792. Throughput: 0: 42571.6. Samples: 16016873140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:38,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 22:17:40,471][15401] Updated weights for policy 0, policy_version 977594 (0.0043) [2024-06-25 22:17:43,391][15132] Fps is (10 sec: 42592.5, 60 sec: 42870.5, 300 sec: 42764.8). Total num frames: 16017014784. Throughput: 0: 42858.3. Samples: 16017137860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:43,391][15132] Avg episode reward: [(0, '0.505')] [2024-06-25 22:17:43,412][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977601_16017014784.pth... [2024-06-25 22:17:43,458][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000976977_16006791168.pth [2024-06-25 22:17:44,447][15401] Updated weights for policy 0, policy_version 977604 (0.0033) [2024-06-25 22:17:48,047][15401] Updated weights for policy 0, policy_version 977614 (0.0032) [2024-06-25 22:17:48,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 16017227776. Throughput: 0: 42707.8. Samples: 16017386960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:48,390][15132] Avg episode reward: [(0, '0.550')] [2024-06-25 22:17:52,305][15401] Updated weights for policy 0, policy_version 977624 (0.0028) [2024-06-25 22:17:53,390][15132] Fps is (10 sec: 44242.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 16017457152. Throughput: 0: 42846.1. Samples: 16017523540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:53,390][15132] Avg episode reward: [(0, '0.475')] [2024-06-25 22:17:55,854][15401] Updated weights for policy 0, policy_version 977634 (0.0037) [2024-06-25 22:17:58,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 16017637376. Throughput: 0: 42779.1. Samples: 16017778980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:17:58,390][15132] Avg episode reward: [(0, '0.382')] [2024-06-25 22:17:59,910][15401] Updated weights for policy 0, policy_version 977644 (0.0037) [2024-06-25 22:18:03,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 16017866752. Throughput: 0: 42760.4. Samples: 16018029520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:03,390][15132] Avg episode reward: [(0, '0.508')] [2024-06-25 22:18:03,535][15401] Updated weights for policy 0, policy_version 977654 (0.0032) [2024-06-25 22:18:07,498][15401] Updated weights for policy 0, policy_version 977664 (0.0037) [2024-06-25 22:18:08,389][15132] Fps is (10 sec: 44236.7, 60 sec: 42603.0, 300 sec: 42653.9). Total num frames: 16018079744. Throughput: 0: 43039.2. Samples: 16018164700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:08,390][15132] Avg episode reward: [(0, '0.010')] [2024-06-25 22:18:11,095][15401] Updated weights for policy 0, policy_version 977674 (0.0029) [2024-06-25 22:18:13,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 16018276352. Throughput: 0: 42710.3. Samples: 16018413760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:13,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 22:18:15,158][15401] Updated weights for policy 0, policy_version 977684 (0.0038) [2024-06-25 22:18:18,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 16018522112. Throughput: 0: 42623.9. Samples: 16018667780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:18,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 22:18:18,875][15401] Updated weights for policy 0, policy_version 977694 (0.0031) [2024-06-25 22:18:22,824][15401] Updated weights for policy 0, policy_version 977704 (0.0027) [2024-06-25 22:18:23,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 16018718720. Throughput: 0: 42852.4. Samples: 16018801500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:23,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 22:18:26,470][15401] Updated weights for policy 0, policy_version 977714 (0.0022) [2024-06-25 22:18:28,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 16018931712. Throughput: 0: 42504.8. Samples: 16019050520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:28,399][15132] Avg episode reward: [(0, '0.613')] [2024-06-25 22:18:30,677][15401] Updated weights for policy 0, policy_version 977724 (0.0035) [2024-06-25 22:18:33,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 16019161088. Throughput: 0: 42732.7. Samples: 16019309940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:33,390][15132] Avg episode reward: [(0, '0.790')] [2024-06-25 22:18:34,183][15401] Updated weights for policy 0, policy_version 977734 (0.0027) [2024-06-25 22:18:38,372][15401] Updated weights for policy 0, policy_version 977744 (0.0031) [2024-06-25 22:18:38,390][15132] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 16019357696. Throughput: 0: 42593.8. Samples: 16019440260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:38,390][15132] Avg episode reward: [(0, '0.723')] [2024-06-25 22:18:42,232][15401] Updated weights for policy 0, policy_version 977754 (0.0030) [2024-06-25 22:18:43,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42872.3, 300 sec: 42709.5). Total num frames: 16019587072. Throughput: 0: 42791.9. Samples: 16019704620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:43,390][15132] Avg episode reward: [(0, '0.542')] [2024-06-25 22:18:45,960][15401] Updated weights for policy 0, policy_version 977764 (0.0040) [2024-06-25 22:18:48,390][15132] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 16019816448. Throughput: 0: 42736.0. Samples: 16019952640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:48,390][15132] Avg episode reward: [(0, '0.525')] [2024-06-25 22:18:49,710][15401] Updated weights for policy 0, policy_version 977774 (0.0034) [2024-06-25 22:18:53,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 16019996672. Throughput: 0: 42757.8. Samples: 16020088800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:53,390][15132] Avg episode reward: [(0, '0.563')] [2024-06-25 22:18:53,633][15401] Updated weights for policy 0, policy_version 977784 (0.0051) [2024-06-25 22:18:57,250][15401] Updated weights for policy 0, policy_version 977794 (0.0028) [2024-06-25 22:18:58,389][15132] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 16020209664. Throughput: 0: 42897.4. Samples: 16020344140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-25 22:18:58,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 22:18:58,693][15349] Signal inference workers to stop experience collection... (237000 times) [2024-06-25 22:18:58,694][15349] Signal inference workers to resume experience collection... (237000 times) [2024-06-25 22:18:58,720][15401] InferenceWorker_p0-w0: stopping experience collection (237000 times) [2024-06-25 22:18:58,748][15401] InferenceWorker_p0-w0: resuming experience collection (237000 times) [2024-06-25 22:19:01,270][15401] Updated weights for policy 0, policy_version 977804 (0.0031) [2024-06-25 22:19:03,391][15132] Fps is (10 sec: 45867.1, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 16020455424. Throughput: 0: 42798.8. Samples: 16020593800. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:03,392][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 22:19:04,841][15401] Updated weights for policy 0, policy_version 977814 (0.0033) [2024-06-25 22:19:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 16020619264. Throughput: 0: 42836.9. Samples: 16020729160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:08,390][15132] Avg episode reward: [(0, '0.562')] [2024-06-25 22:19:08,923][15401] Updated weights for policy 0, policy_version 977824 (0.0023) [2024-06-25 22:19:12,430][15401] Updated weights for policy 0, policy_version 977834 (0.0031) [2024-06-25 22:19:13,392][15132] Fps is (10 sec: 40957.4, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 16020865024. Throughput: 0: 42892.0. Samples: 16020980760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:13,392][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 22:19:16,702][15401] Updated weights for policy 0, policy_version 977844 (0.0027) [2024-06-25 22:19:18,392][15132] Fps is (10 sec: 47502.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 16021094400. Throughput: 0: 42684.9. Samples: 16021230860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:18,392][15132] Avg episode reward: [(0, '0.602')] [2024-06-25 22:19:20,355][15401] Updated weights for policy 0, policy_version 977854 (0.0037) [2024-06-25 22:19:23,389][15132] Fps is (10 sec: 39331.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 16021258240. Throughput: 0: 42716.9. Samples: 16021362520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:23,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 22:19:24,303][15401] Updated weights for policy 0, policy_version 977864 (0.0050) [2024-06-25 22:19:27,884][15401] Updated weights for policy 0, policy_version 977874 (0.0036) [2024-06-25 22:19:28,389][15132] Fps is (10 sec: 40970.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16021504000. Throughput: 0: 42492.1. Samples: 16021616760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:28,390][15132] Avg episode reward: [(0, '0.727')] [2024-06-25 22:19:31,767][15401] Updated weights for policy 0, policy_version 977884 (0.0044) [2024-06-25 22:19:33,389][15132] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16021733376. Throughput: 0: 42595.2. Samples: 16021869420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:33,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 22:19:36,006][15401] Updated weights for policy 0, policy_version 977894 (0.0036) [2024-06-25 22:19:38,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 16021913600. Throughput: 0: 42532.9. Samples: 16022002780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:38,390][15132] Avg episode reward: [(0, '0.597')] [2024-06-25 22:19:39,571][15401] Updated weights for policy 0, policy_version 977904 (0.0031) [2024-06-25 22:19:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 16022126592. Throughput: 0: 42634.6. Samples: 16022262700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:43,392][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 22:19:43,486][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977914_16022142976.pth... [2024-06-25 22:19:43,494][15401] Updated weights for policy 0, policy_version 977914 (0.0033) [2024-06-25 22:19:43,556][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977288_16011886592.pth [2024-06-25 22:19:47,191][15401] Updated weights for policy 0, policy_version 977924 (0.0058) [2024-06-25 22:19:48,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 16022372352. Throughput: 0: 42644.8. Samples: 16022512740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:48,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 22:19:51,149][15401] Updated weights for policy 0, policy_version 977934 (0.0033) [2024-06-25 22:19:53,390][15132] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 16022552576. Throughput: 0: 42577.2. Samples: 16022645140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:53,390][15132] Avg episode reward: [(0, '0.807')] [2024-06-25 22:19:54,946][15401] Updated weights for policy 0, policy_version 977944 (0.0033) [2024-06-25 22:19:58,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16022781952. Throughput: 0: 42665.8. Samples: 16022900620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:19:58,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 22:19:58,688][15401] Updated weights for policy 0, policy_version 977954 (0.0044) [2024-06-25 22:20:02,574][15401] Updated weights for policy 0, policy_version 977964 (0.0027) [2024-06-25 22:20:03,390][15132] Fps is (10 sec: 45875.6, 60 sec: 42599.6, 300 sec: 42709.5). Total num frames: 16023011328. Throughput: 0: 42868.0. Samples: 16023159820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:03,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 22:20:06,343][15401] Updated weights for policy 0, policy_version 977974 (0.0023) [2024-06-25 22:20:08,390][15132] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 16023207936. Throughput: 0: 42872.3. Samples: 16023291780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:08,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 22:20:09,997][15401] Updated weights for policy 0, policy_version 977984 (0.0029) [2024-06-25 22:20:13,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 16023420928. Throughput: 0: 42954.9. Samples: 16023549740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:13,390][15132] Avg episode reward: [(0, '0.317')] [2024-06-25 22:20:14,133][15401] Updated weights for policy 0, policy_version 977994 (0.0035) [2024-06-25 22:20:17,614][15401] Updated weights for policy 0, policy_version 978004 (0.0034) [2024-06-25 22:20:18,073][15349] Signal inference workers to stop experience collection... (237050 times) [2024-06-25 22:20:18,073][15349] Signal inference workers to resume experience collection... (237050 times) [2024-06-25 22:20:18,127][15401] InferenceWorker_p0-w0: stopping experience collection (237050 times) [2024-06-25 22:20:18,127][15401] InferenceWorker_p0-w0: resuming experience collection (237050 times) [2024-06-25 22:20:18,389][15132] Fps is (10 sec: 45875.7, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 16023666688. Throughput: 0: 42904.5. Samples: 16023800120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:18,390][15132] Avg episode reward: [(0, '0.827')] [2024-06-25 22:20:21,733][15401] Updated weights for policy 0, policy_version 978014 (0.0044) [2024-06-25 22:20:23,389][15132] Fps is (10 sec: 42599.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 16023846912. Throughput: 0: 42914.3. Samples: 16023933920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:23,390][15132] Avg episode reward: [(0, '0.773')] [2024-06-25 22:20:25,412][15401] Updated weights for policy 0, policy_version 978024 (0.0033) [2024-06-25 22:20:28,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 16024076288. Throughput: 0: 42932.0. Samples: 16024194640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:28,390][15132] Avg episode reward: [(0, '0.801')] [2024-06-25 22:20:29,267][15401] Updated weights for policy 0, policy_version 978034 (0.0027) [2024-06-25 22:20:32,964][15401] Updated weights for policy 0, policy_version 978044 (0.0033) [2024-06-25 22:20:33,390][15132] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 16024289280. Throughput: 0: 43022.5. Samples: 16024448760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:33,390][15132] Avg episode reward: [(0, '0.663')] [2024-06-25 22:20:36,815][15401] Updated weights for policy 0, policy_version 978054 (0.0036) [2024-06-25 22:20:38,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 16024502272. Throughput: 0: 42958.3. Samples: 16024578260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:38,390][15132] Avg episode reward: [(0, '0.519')] [2024-06-25 22:20:40,490][15401] Updated weights for policy 0, policy_version 978064 (0.0024) [2024-06-25 22:20:43,390][15132] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16024698880. Throughput: 0: 43025.8. Samples: 16024836780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:43,390][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 22:20:44,520][15401] Updated weights for policy 0, policy_version 978074 (0.0033) [2024-06-25 22:20:48,163][15401] Updated weights for policy 0, policy_version 978084 (0.0038) [2024-06-25 22:20:48,390][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 16024944640. Throughput: 0: 42894.3. Samples: 16025090060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:48,390][15132] Avg episode reward: [(0, '0.709')] [2024-06-25 22:20:52,130][15401] Updated weights for policy 0, policy_version 978094 (0.0053) [2024-06-25 22:20:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16025124864. Throughput: 0: 42857.3. Samples: 16025220360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:53,390][15132] Avg episode reward: [(0, '0.543')] [2024-06-25 22:20:55,826][15401] Updated weights for policy 0, policy_version 978104 (0.0039) [2024-06-25 22:20:58,392][15132] Fps is (10 sec: 39312.2, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 16025337856. Throughput: 0: 42695.7. Samples: 16025471140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-25 22:20:58,392][15132] Avg episode reward: [(0, '0.441')] [2024-06-25 22:20:59,865][15401] Updated weights for policy 0, policy_version 978114 (0.0033) [2024-06-25 22:21:03,356][15401] Updated weights for policy 0, policy_version 978124 (0.0034) [2024-06-25 22:21:03,389][15132] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16025583616. Throughput: 0: 42789.7. Samples: 16025725660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:03,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 22:21:07,687][15401] Updated weights for policy 0, policy_version 978134 (0.0037) [2024-06-25 22:21:08,390][15132] Fps is (10 sec: 44247.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16025780224. Throughput: 0: 42755.9. Samples: 16025857940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:08,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 22:21:10,838][15401] Updated weights for policy 0, policy_version 978144 (0.0035) [2024-06-25 22:21:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 16025976832. Throughput: 0: 42592.4. Samples: 16026111300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:13,390][15132] Avg episode reward: [(0, '0.722')] [2024-06-25 22:21:15,509][15401] Updated weights for policy 0, policy_version 978154 (0.0043) [2024-06-25 22:21:18,378][15401] Updated weights for policy 0, policy_version 978164 (0.0037) [2024-06-25 22:21:18,395][15132] Fps is (10 sec: 45851.0, 60 sec: 42867.6, 300 sec: 42819.8). Total num frames: 16026238976. Throughput: 0: 42586.2. Samples: 16026365360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:18,395][15132] Avg episode reward: [(0, '0.777')] [2024-06-25 22:21:23,082][15401] Updated weights for policy 0, policy_version 978174 (0.0056) [2024-06-25 22:21:23,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16026419200. Throughput: 0: 42621.9. Samples: 16026496240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:23,390][15132] Avg episode reward: [(0, '0.552')] [2024-06-25 22:21:26,095][15401] Updated weights for policy 0, policy_version 978184 (0.0036) [2024-06-25 22:21:28,389][15132] Fps is (10 sec: 37703.2, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 16026615808. Throughput: 0: 42618.2. Samples: 16026754600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:28,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 22:21:30,723][15401] Updated weights for policy 0, policy_version 978194 (0.0023) [2024-06-25 22:21:33,389][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 16026861568. Throughput: 0: 42794.7. Samples: 16027015820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:33,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 22:21:33,619][15401] Updated weights for policy 0, policy_version 978204 (0.0030) [2024-06-25 22:21:38,269][15401] Updated weights for policy 0, policy_version 978214 (0.0031) [2024-06-25 22:21:38,391][15132] Fps is (10 sec: 45868.3, 60 sec: 42870.4, 300 sec: 42820.3). Total num frames: 16027074560. Throughput: 0: 42908.4. Samples: 16027151300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:38,392][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 22:21:41,262][15401] Updated weights for policy 0, policy_version 978224 (0.0043) [2024-06-25 22:21:41,276][15349] Signal inference workers to stop experience collection... (237100 times) [2024-06-25 22:21:41,276][15349] Signal inference workers to resume experience collection... (237100 times) [2024-06-25 22:21:41,327][15401] InferenceWorker_p0-w0: stopping experience collection (237100 times) [2024-06-25 22:21:41,327][15401] InferenceWorker_p0-w0: resuming experience collection (237100 times) [2024-06-25 22:21:43,390][15132] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16027254784. Throughput: 0: 42757.8. Samples: 16027395140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:43,390][15132] Avg episode reward: [(0, '0.652')] [2024-06-25 22:21:43,404][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978226_16027254784.pth... [2024-06-25 22:21:43,459][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977601_16017014784.pth [2024-06-25 22:21:45,762][15401] Updated weights for policy 0, policy_version 978234 (0.0037) [2024-06-25 22:21:48,389][15132] Fps is (10 sec: 40966.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 16027484160. Throughput: 0: 42852.4. Samples: 16027654020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:48,390][15132] Avg episode reward: [(0, '0.685')] [2024-06-25 22:21:49,291][15401] Updated weights for policy 0, policy_version 978244 (0.0041) [2024-06-25 22:21:53,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16027697152. Throughput: 0: 42857.9. Samples: 16027786540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:53,390][15132] Avg episode reward: [(0, '0.812')] [2024-06-25 22:21:53,711][15401] Updated weights for policy 0, policy_version 978254 (0.0037) [2024-06-25 22:21:56,859][15401] Updated weights for policy 0, policy_version 978264 (0.0023) [2024-06-25 22:21:58,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42873.2, 300 sec: 42765.0). Total num frames: 16027910144. Throughput: 0: 42715.6. Samples: 16028033500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:21:58,390][15132] Avg episode reward: [(0, '0.835')] [2024-06-25 22:22:01,449][15401] Updated weights for policy 0, policy_version 978274 (0.0038) [2024-06-25 22:22:03,390][15132] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 16028139520. Throughput: 0: 42910.4. Samples: 16028296100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:03,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 22:22:04,358][15401] Updated weights for policy 0, policy_version 978284 (0.0030) [2024-06-25 22:22:08,390][15132] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16028352512. Throughput: 0: 42921.3. Samples: 16028427700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:08,390][15132] Avg episode reward: [(0, '0.702')] [2024-06-25 22:22:08,921][15401] Updated weights for policy 0, policy_version 978294 (0.0037) [2024-06-25 22:22:12,480][15401] Updated weights for policy 0, policy_version 978304 (0.0022) [2024-06-25 22:22:13,390][15132] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16028549120. Throughput: 0: 42874.1. Samples: 16028683940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:13,390][15132] Avg episode reward: [(0, '0.649')] [2024-06-25 22:22:16,350][15401] Updated weights for policy 0, policy_version 978314 (0.0058) [2024-06-25 22:22:18,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42602.2, 300 sec: 42820.6). Total num frames: 16028794880. Throughput: 0: 42738.2. Samples: 16028939040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:18,390][15132] Avg episode reward: [(0, '0.555')] [2024-06-25 22:22:20,202][15401] Updated weights for policy 0, policy_version 978324 (0.0038) [2024-06-25 22:22:23,390][15132] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 16028975104. Throughput: 0: 42666.7. Samples: 16029071240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:23,390][15132] Avg episode reward: [(0, '0.595')] [2024-06-25 22:22:23,960][15401] Updated weights for policy 0, policy_version 978334 (0.0035) [2024-06-25 22:22:27,850][15401] Updated weights for policy 0, policy_version 978344 (0.0036) [2024-06-25 22:22:28,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 16029204480. Throughput: 0: 42668.2. Samples: 16029315200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:28,390][15132] Avg episode reward: [(0, '0.388')] [2024-06-25 22:22:31,741][15401] Updated weights for policy 0, policy_version 978354 (0.0037) [2024-06-25 22:22:33,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 16029417472. Throughput: 0: 42721.3. Samples: 16029576480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:33,390][15132] Avg episode reward: [(0, '0.385')] [2024-06-25 22:22:35,611][15401] Updated weights for policy 0, policy_version 978364 (0.0041) [2024-06-25 22:22:38,392][15132] Fps is (10 sec: 40949.8, 60 sec: 42324.7, 300 sec: 42709.3). Total num frames: 16029614080. Throughput: 0: 42602.6. Samples: 16029703760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:38,392][15132] Avg episode reward: [(0, '0.559')] [2024-06-25 22:22:39,717][15401] Updated weights for policy 0, policy_version 978374 (0.0043) [2024-06-25 22:22:43,117][15401] Updated weights for policy 0, policy_version 978384 (0.0032) [2024-06-25 22:22:43,389][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 16029843456. Throughput: 0: 42536.5. Samples: 16029947640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:43,390][15132] Avg episode reward: [(0, '0.694')] [2024-06-25 22:22:47,349][15401] Updated weights for policy 0, policy_version 978394 (0.0036) [2024-06-25 22:22:48,389][15132] Fps is (10 sec: 44247.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16030056448. Throughput: 0: 42537.0. Samples: 16030210260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:48,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 22:22:50,602][15401] Updated weights for policy 0, policy_version 978404 (0.0030) [2024-06-25 22:22:53,389][15132] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 16030236672. Throughput: 0: 42399.2. Samples: 16030335660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:22:53,390][15132] Avg episode reward: [(0, '0.739')] [2024-06-25 22:22:53,964][15349] Signal inference workers to stop experience collection... (237150 times) [2024-06-25 22:22:53,965][15349] Signal inference workers to resume experience collection... (237150 times) [2024-06-25 22:22:53,998][15401] InferenceWorker_p0-w0: stopping experience collection (237150 times) [2024-06-25 22:22:53,998][15401] InferenceWorker_p0-w0: resuming experience collection (237150 times) [2024-06-25 22:22:54,929][15401] Updated weights for policy 0, policy_version 978414 (0.0026) [2024-06-25 22:22:58,198][15401] Updated weights for policy 0, policy_version 978424 (0.0042) [2024-06-25 22:22:58,390][15132] Fps is (10 sec: 44233.4, 60 sec: 43144.0, 300 sec: 42820.5). Total num frames: 16030498816. Throughput: 0: 42389.2. Samples: 16030591480. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:22:58,391][15132] Avg episode reward: [(0, '0.754')] [2024-06-25 22:23:02,626][15401] Updated weights for policy 0, policy_version 978434 (0.0043) [2024-06-25 22:23:03,390][15132] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16030695424. Throughput: 0: 42348.0. Samples: 16030844700. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 22:23:05,971][15401] Updated weights for policy 0, policy_version 978444 (0.0041) [2024-06-25 22:23:08,390][15132] Fps is (10 sec: 36047.0, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 16030859264. Throughput: 0: 42100.0. Samples: 16030965740. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:08,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 22:23:10,471][15401] Updated weights for policy 0, policy_version 978454 (0.0040) [2024-06-25 22:23:13,389][15132] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 16031121408. Throughput: 0: 42454.2. Samples: 16031225640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:13,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 22:23:14,057][15401] Updated weights for policy 0, policy_version 978464 (0.0038) [2024-06-25 22:23:18,133][15401] Updated weights for policy 0, policy_version 978474 (0.0037) [2024-06-25 22:23:18,395][15132] Fps is (10 sec: 47485.9, 60 sec: 42321.2, 300 sec: 42764.2). Total num frames: 16031334400. Throughput: 0: 42183.4. Samples: 16031474980. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:18,396][15132] Avg episode reward: [(0, '0.537')] [2024-06-25 22:23:21,694][15401] Updated weights for policy 0, policy_version 978484 (0.0044) [2024-06-25 22:23:23,390][15132] Fps is (10 sec: 37682.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 16031498240. Throughput: 0: 42159.1. Samples: 16031600820. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:23,390][15132] Avg episode reward: [(0, '0.453')] [2024-06-25 22:23:25,810][15401] Updated weights for policy 0, policy_version 978494 (0.0036) [2024-06-25 22:23:28,392][15132] Fps is (10 sec: 39335.2, 60 sec: 42050.5, 300 sec: 42598.1). Total num frames: 16031727616. Throughput: 0: 42336.3. Samples: 16031852880. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:28,393][15132] Avg episode reward: [(0, '0.650')] [2024-06-25 22:23:29,519][15401] Updated weights for policy 0, policy_version 978504 (0.0034) [2024-06-25 22:23:33,366][15401] Updated weights for policy 0, policy_version 978514 (0.0031) [2024-06-25 22:23:33,392][15132] Fps is (10 sec: 47502.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 16031973376. Throughput: 0: 42211.9. Samples: 16032109900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:33,393][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 22:23:37,838][15401] Updated weights for policy 0, policy_version 978524 (0.0038) [2024-06-25 22:23:38,393][15132] Fps is (10 sec: 42593.8, 60 sec: 42324.5, 300 sec: 42597.9). Total num frames: 16032153600. Throughput: 0: 42242.4. Samples: 16032236720. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:38,394][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 22:23:41,328][15401] Updated weights for policy 0, policy_version 978534 (0.0032) [2024-06-25 22:23:43,389][15132] Fps is (10 sec: 39331.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 16032366592. Throughput: 0: 42156.7. Samples: 16032488500. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:43,390][15132] Avg episode reward: [(0, '0.514')] [2024-06-25 22:23:43,466][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978539_16032382976.pth... [2024-06-25 22:23:43,525][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000977914_16022142976.pth [2024-06-25 22:23:45,647][15401] Updated weights for policy 0, policy_version 978544 (0.0036) [2024-06-25 22:23:48,389][15132] Fps is (10 sec: 42613.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 16032579584. Throughput: 0: 42177.0. Samples: 16032742660. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:48,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 22:23:49,121][15401] Updated weights for policy 0, policy_version 978554 (0.0039) [2024-06-25 22:23:53,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 16032776192. Throughput: 0: 42218.6. Samples: 16032865580. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:53,390][15132] Avg episode reward: [(0, '0.545')] [2024-06-25 22:23:53,649][15401] Updated weights for policy 0, policy_version 978564 (0.0047) [2024-06-25 22:23:57,022][15401] Updated weights for policy 0, policy_version 978574 (0.0039) [2024-06-25 22:23:58,389][15132] Fps is (10 sec: 42598.9, 60 sec: 41779.8, 300 sec: 42543.1). Total num frames: 16033005568. Throughput: 0: 41982.3. Samples: 16033114840. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:23:58,390][15132] Avg episode reward: [(0, '0.360')] [2024-06-25 22:24:01,547][15401] Updated weights for policy 0, policy_version 978584 (0.0033) [2024-06-25 22:24:03,389][15132] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 16033185792. Throughput: 0: 42241.1. Samples: 16033375580. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:03,390][15132] Avg episode reward: [(0, '0.504')] [2024-06-25 22:24:04,762][15401] Updated weights for policy 0, policy_version 978594 (0.0034) [2024-06-25 22:24:08,389][15132] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 16033431552. Throughput: 0: 42021.9. Samples: 16033491800. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:08,390][15132] Avg episode reward: [(0, '0.734')] [2024-06-25 22:24:09,574][15401] Updated weights for policy 0, policy_version 978604 (0.0030) [2024-06-25 22:24:12,480][15401] Updated weights for policy 0, policy_version 978614 (0.0033) [2024-06-25 22:24:13,391][15132] Fps is (10 sec: 45867.4, 60 sec: 42051.0, 300 sec: 42543.0). Total num frames: 16033644544. Throughput: 0: 42297.2. Samples: 16033756220. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:13,392][15132] Avg episode reward: [(0, '0.580')] [2024-06-25 22:24:13,581][15349] Signal inference workers to stop experience collection... (237200 times) [2024-06-25 22:24:13,582][15349] Signal inference workers to resume experience collection... (237200 times) [2024-06-25 22:24:13,621][15401] InferenceWorker_p0-w0: stopping experience collection (237200 times) [2024-06-25 22:24:13,622][15401] InferenceWorker_p0-w0: resuming experience collection (237200 times) [2024-06-25 22:24:17,137][15401] Updated weights for policy 0, policy_version 978624 (0.0037) [2024-06-25 22:24:18,389][15132] Fps is (10 sec: 40959.7, 60 sec: 41783.3, 300 sec: 42653.9). Total num frames: 16033841152. Throughput: 0: 42236.1. Samples: 16034010420. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:18,390][15132] Avg episode reward: [(0, '0.587')] [2024-06-25 22:24:20,138][15401] Updated weights for policy 0, policy_version 978634 (0.0039) [2024-06-25 22:24:23,390][15132] Fps is (10 sec: 40966.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 16034054144. Throughput: 0: 42179.3. Samples: 16034134640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:23,390][15132] Avg episode reward: [(0, '0.520')] [2024-06-25 22:24:25,094][15401] Updated weights for policy 0, policy_version 978644 (0.0042) [2024-06-25 22:24:27,837][15401] Updated weights for policy 0, policy_version 978654 (0.0033) [2024-06-25 22:24:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 16034283520. Throughput: 0: 42286.6. Samples: 16034391400. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 22:24:32,522][15401] Updated weights for policy 0, policy_version 978664 (0.0032) [2024-06-25 22:24:33,392][15132] Fps is (10 sec: 40951.6, 60 sec: 41506.4, 300 sec: 42542.6). Total num frames: 16034463744. Throughput: 0: 42442.0. Samples: 16034652640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:33,392][15132] Avg episode reward: [(0, '0.502')] [2024-06-25 22:24:35,540][15401] Updated weights for policy 0, policy_version 978674 (0.0040) [2024-06-25 22:24:38,389][15132] Fps is (10 sec: 42598.5, 60 sec: 42600.9, 300 sec: 42653.9). Total num frames: 16034709504. Throughput: 0: 42391.3. Samples: 16034773180. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:38,390][15132] Avg episode reward: [(0, '0.325')] [2024-06-25 22:24:40,035][15401] Updated weights for policy 0, policy_version 978684 (0.0039) [2024-06-25 22:24:43,359][15401] Updated weights for policy 0, policy_version 978694 (0.0030) [2024-06-25 22:24:43,389][15132] Fps is (10 sec: 45884.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 16034922496. Throughput: 0: 42539.0. Samples: 16035029100. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:43,390][15132] Avg episode reward: [(0, '0.679')] [2024-06-25 22:24:47,766][15401] Updated weights for policy 0, policy_version 978704 (0.0021) [2024-06-25 22:24:48,390][15132] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 16035086336. Throughput: 0: 42562.5. Samples: 16035290900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:48,390][15132] Avg episode reward: [(0, '0.565')] [2024-06-25 22:24:50,940][15401] Updated weights for policy 0, policy_version 978714 (0.0035) [2024-06-25 22:24:53,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 16035332096. Throughput: 0: 42572.9. Samples: 16035407580. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-06-25 22:24:53,390][15132] Avg episode reward: [(0, '0.579')] [2024-06-25 22:24:55,432][15401] Updated weights for policy 0, policy_version 978724 (0.0046) [2024-06-25 22:24:58,389][15132] Fps is (10 sec: 47514.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 16035561472. Throughput: 0: 42619.9. Samples: 16035674040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:24:58,390][15132] Avg episode reward: [(0, '0.644')] [2024-06-25 22:24:58,555][15401] Updated weights for policy 0, policy_version 978734 (0.0031) [2024-06-25 22:25:02,923][15401] Updated weights for policy 0, policy_version 978744 (0.0034) [2024-06-25 22:25:03,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 16035741696. Throughput: 0: 42615.1. Samples: 16035928100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:03,390][15132] Avg episode reward: [(0, '0.617')] [2024-06-25 22:25:06,132][15401] Updated weights for policy 0, policy_version 978754 (0.0030) [2024-06-25 22:25:08,390][15132] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 16035971072. Throughput: 0: 42512.3. Samples: 16036047700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:08,390][15132] Avg episode reward: [(0, '0.351')] [2024-06-25 22:25:10,430][15401] Updated weights for policy 0, policy_version 978764 (0.0033) [2024-06-25 22:25:13,390][15132] Fps is (10 sec: 45874.9, 60 sec: 42599.5, 300 sec: 42487.3). Total num frames: 16036200448. Throughput: 0: 42796.0. Samples: 16036317220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:13,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 22:25:13,697][15401] Updated weights for policy 0, policy_version 978774 (0.0026) [2024-06-25 22:25:17,847][15401] Updated weights for policy 0, policy_version 978784 (0.0030) [2024-06-25 22:25:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 16036397056. Throughput: 0: 42689.5. Samples: 16036573580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 22:25:21,467][15401] Updated weights for policy 0, policy_version 978794 (0.0041) [2024-06-25 22:25:23,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 16036626432. Throughput: 0: 42683.9. Samples: 16036694060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:23,392][15132] Avg episode reward: [(0, '0.783')] [2024-06-25 22:25:25,860][15401] Updated weights for policy 0, policy_version 978804 (0.0036) [2024-06-25 22:25:28,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 16036839424. Throughput: 0: 42989.3. Samples: 16036963620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:28,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 22:25:28,827][15401] Updated weights for policy 0, policy_version 978814 (0.0036) [2024-06-25 22:25:33,297][15401] Updated weights for policy 0, policy_version 978824 (0.0032) [2024-06-25 22:25:33,389][15132] Fps is (10 sec: 42608.6, 60 sec: 43146.0, 300 sec: 42542.9). Total num frames: 16037052416. Throughput: 0: 42816.5. Samples: 16037217640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:33,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 22:25:34,978][15349] Signal inference workers to stop experience collection... (237250 times) [2024-06-25 22:25:35,027][15401] InferenceWorker_p0-w0: stopping experience collection (237250 times) [2024-06-25 22:25:35,032][15349] Signal inference workers to resume experience collection... (237250 times) [2024-06-25 22:25:35,040][15401] InferenceWorker_p0-w0: resuming experience collection (237250 times) [2024-06-25 22:25:37,326][15401] Updated weights for policy 0, policy_version 978834 (0.0038) [2024-06-25 22:25:38,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 16037249024. Throughput: 0: 42996.9. Samples: 16037342440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:38,390][15132] Avg episode reward: [(0, '0.634')] [2024-06-25 22:25:40,967][15401] Updated weights for policy 0, policy_version 978844 (0.0023) [2024-06-25 22:25:43,392][15132] Fps is (10 sec: 42588.4, 60 sec: 42596.7, 300 sec: 42487.0). Total num frames: 16037478400. Throughput: 0: 42864.3. Samples: 16037603040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:43,392][15132] Avg episode reward: [(0, '0.376')] [2024-06-25 22:25:43,487][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978851_16037494784.pth... [2024-06-25 22:25:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978226_16027254784.pth [2024-06-25 22:25:44,824][15401] Updated weights for policy 0, policy_version 978854 (0.0029) [2024-06-25 22:25:48,390][15132] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 16037675008. Throughput: 0: 42834.6. Samples: 16037855660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:48,390][15132] Avg episode reward: [(0, '0.270')] [2024-06-25 22:25:48,573][15401] Updated weights for policy 0, policy_version 978864 (0.0038) [2024-06-25 22:25:52,629][15401] Updated weights for policy 0, policy_version 978874 (0.0030) [2024-06-25 22:25:53,390][15132] Fps is (10 sec: 42608.2, 60 sec: 42871.4, 300 sec: 42598.7). Total num frames: 16037904384. Throughput: 0: 42968.5. Samples: 16037981280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:53,390][15132] Avg episode reward: [(0, '0.426')] [2024-06-25 22:25:56,289][15401] Updated weights for policy 0, policy_version 978884 (0.0032) [2024-06-25 22:25:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 16038117376. Throughput: 0: 42611.2. Samples: 16038234720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:25:58,390][15132] Avg episode reward: [(0, '0.419')] [2024-06-25 22:26:00,133][15401] Updated weights for policy 0, policy_version 978894 (0.0038) [2024-06-25 22:26:03,390][15132] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 16038330368. Throughput: 0: 42691.5. Samples: 16038494700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:03,390][15132] Avg episode reward: [(0, '0.516')] [2024-06-25 22:26:04,027][15401] Updated weights for policy 0, policy_version 978904 (0.0032) [2024-06-25 22:26:07,635][15401] Updated weights for policy 0, policy_version 978914 (0.0032) [2024-06-25 22:26:08,390][15132] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 16038543360. Throughput: 0: 42793.7. Samples: 16038619680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:08,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 22:26:11,846][15401] Updated weights for policy 0, policy_version 978924 (0.0042) [2024-06-25 22:26:13,389][15132] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42432.5). Total num frames: 16038756352. Throughput: 0: 42520.9. Samples: 16038877060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:13,390][15132] Avg episode reward: [(0, '0.686')] [2024-06-25 22:26:15,136][15401] Updated weights for policy 0, policy_version 978934 (0.0032) [2024-06-25 22:26:18,390][15132] Fps is (10 sec: 40958.8, 60 sec: 42598.1, 300 sec: 42487.3). Total num frames: 16038952960. Throughput: 0: 42625.4. Samples: 16039135800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:18,391][15132] Avg episode reward: [(0, '0.599')] [2024-06-25 22:26:19,555][15401] Updated weights for policy 0, policy_version 978944 (0.0033) [2024-06-25 22:26:22,958][15401] Updated weights for policy 0, policy_version 978954 (0.0028) [2024-06-25 22:26:23,390][15132] Fps is (10 sec: 44236.2, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 16039198720. Throughput: 0: 42628.2. Samples: 16039260720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:23,390][15132] Avg episode reward: [(0, '0.805')] [2024-06-25 22:26:27,095][15401] Updated weights for policy 0, policy_version 978964 (0.0034) [2024-06-25 22:26:28,392][15132] Fps is (10 sec: 45866.0, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 16039411712. Throughput: 0: 42546.2. Samples: 16039517620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:28,393][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 22:26:30,523][15401] Updated weights for policy 0, policy_version 978974 (0.0044) [2024-06-25 22:26:33,390][15132] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.5). Total num frames: 16039608320. Throughput: 0: 42722.7. Samples: 16039778180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:33,390][15132] Avg episode reward: [(0, '0.623')] [2024-06-25 22:26:34,765][15401] Updated weights for policy 0, policy_version 978984 (0.0039) [2024-06-25 22:26:38,091][15401] Updated weights for policy 0, policy_version 978994 (0.0034) [2024-06-25 22:26:38,390][15132] Fps is (10 sec: 42608.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 16039837696. Throughput: 0: 42635.5. Samples: 16039899880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:38,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 22:26:42,215][15401] Updated weights for policy 0, policy_version 979004 (0.0026) [2024-06-25 22:26:43,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 16040050688. Throughput: 0: 42735.9. Samples: 16040157840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:43,390][15132] Avg episode reward: [(0, '0.561')] [2024-06-25 22:26:45,591][15401] Updated weights for policy 0, policy_version 979014 (0.0047) [2024-06-25 22:26:48,390][15132] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 16040230912. Throughput: 0: 42760.9. Samples: 16040418940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:48,390][15132] Avg episode reward: [(0, '0.319')] [2024-06-25 22:26:48,742][15349] Signal inference workers to stop experience collection... (237300 times) [2024-06-25 22:26:48,743][15349] Signal inference workers to resume experience collection... (237300 times) [2024-06-25 22:26:48,773][15401] InferenceWorker_p0-w0: stopping experience collection (237300 times) [2024-06-25 22:26:48,774][15401] InferenceWorker_p0-w0: resuming experience collection (237300 times) [2024-06-25 22:26:49,728][15401] Updated weights for policy 0, policy_version 979024 (0.0038) [2024-06-25 22:26:53,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 16040460288. Throughput: 0: 42820.0. Samples: 16040546580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-25 22:26:53,395][15132] Avg episode reward: [(0, '0.458')] [2024-06-25 22:26:53,597][15401] Updated weights for policy 0, policy_version 979034 (0.0033) [2024-06-25 22:26:57,714][15401] Updated weights for policy 0, policy_version 979044 (0.0025) [2024-06-25 22:26:58,392][15132] Fps is (10 sec: 44226.1, 60 sec: 42596.6, 300 sec: 42487.0). Total num frames: 16040673280. Throughput: 0: 42701.7. Samples: 16040798740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:26:58,393][15132] Avg episode reward: [(0, '0.704')] [2024-06-25 22:27:01,316][15401] Updated weights for policy 0, policy_version 979054 (0.0027) [2024-06-25 22:27:03,389][15132] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 16040886272. Throughput: 0: 42801.8. Samples: 16041061860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:03,390][15132] Avg episode reward: [(0, '0.576')] [2024-06-25 22:27:05,215][15401] Updated weights for policy 0, policy_version 979064 (0.0039) [2024-06-25 22:27:08,389][15132] Fps is (10 sec: 44247.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 16041115648. Throughput: 0: 42783.3. Samples: 16041185960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:08,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 22:27:08,851][15401] Updated weights for policy 0, policy_version 979074 (0.0036) [2024-06-25 22:27:13,171][15401] Updated weights for policy 0, policy_version 979084 (0.0034) [2024-06-25 22:27:13,396][15132] Fps is (10 sec: 44207.5, 60 sec: 42866.8, 300 sec: 42486.4). Total num frames: 16041328640. Throughput: 0: 42689.0. Samples: 16041438800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:13,397][15132] Avg episode reward: [(0, '0.832')] [2024-06-25 22:27:16,439][15401] Updated weights for policy 0, policy_version 979094 (0.0034) [2024-06-25 22:27:18,392][15132] Fps is (10 sec: 39312.3, 60 sec: 42597.0, 300 sec: 42487.0). Total num frames: 16041508864. Throughput: 0: 42639.6. Samples: 16041697060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:18,393][15132] Avg episode reward: [(0, '0.607')] [2024-06-25 22:27:20,811][15401] Updated weights for policy 0, policy_version 979104 (0.0041) [2024-06-25 22:27:23,390][15132] Fps is (10 sec: 42626.2, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 16041754624. Throughput: 0: 42655.6. Samples: 16041819380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:23,390][15132] Avg episode reward: [(0, '0.312')] [2024-06-25 22:27:24,010][15401] Updated weights for policy 0, policy_version 979114 (0.0036) [2024-06-25 22:27:28,389][15401] Updated weights for policy 0, policy_version 979124 (0.0027) [2024-06-25 22:27:28,389][15132] Fps is (10 sec: 45886.0, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 16041967616. Throughput: 0: 42768.0. Samples: 16042082400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:28,390][15132] Avg episode reward: [(0, '0.320')] [2024-06-25 22:27:31,634][15401] Updated weights for policy 0, policy_version 979134 (0.0023) [2024-06-25 22:27:33,392][15132] Fps is (10 sec: 40950.1, 60 sec: 42596.7, 300 sec: 42542.9). Total num frames: 16042164224. Throughput: 0: 42733.3. Samples: 16042342040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:33,392][15132] Avg episode reward: [(0, '0.655')] [2024-06-25 22:27:36,244][15401] Updated weights for policy 0, policy_version 979144 (0.0038) [2024-06-25 22:27:38,392][15132] Fps is (10 sec: 42588.3, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 16042393600. Throughput: 0: 42729.0. Samples: 16042469480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:38,393][15132] Avg episode reward: [(0, '0.531')] [2024-06-25 22:27:39,355][15401] Updated weights for policy 0, policy_version 979154 (0.0033) [2024-06-25 22:27:43,389][15132] Fps is (10 sec: 42608.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 16042590208. Throughput: 0: 42800.1. Samples: 16042724640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:43,390][15132] Avg episode reward: [(0, '0.761')] [2024-06-25 22:27:43,559][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979163_16042606592.pth... [2024-06-25 22:27:43,623][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978539_16032382976.pth [2024-06-25 22:27:43,764][15401] Updated weights for policy 0, policy_version 979164 (0.0028) [2024-06-25 22:27:47,225][15401] Updated weights for policy 0, policy_version 979174 (0.0033) [2024-06-25 22:27:48,390][15132] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 16042819584. Throughput: 0: 42569.7. Samples: 16042977500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:48,390][15132] Avg episode reward: [(0, '0.404')] [2024-06-25 22:27:51,390][15401] Updated weights for policy 0, policy_version 979184 (0.0038) [2024-06-25 22:27:53,390][15132] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42431.9). Total num frames: 16043016192. Throughput: 0: 42781.3. Samples: 16043111120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:53,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 22:27:55,082][15401] Updated weights for policy 0, policy_version 979194 (0.0029) [2024-06-25 22:27:58,389][15132] Fps is (10 sec: 40960.2, 60 sec: 42600.2, 300 sec: 42487.3). Total num frames: 16043229184. Throughput: 0: 42768.0. Samples: 16043363080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:27:58,390][15132] Avg episode reward: [(0, '0.495')] [2024-06-25 22:27:59,062][15401] Updated weights for policy 0, policy_version 979204 (0.0036) [2024-06-25 22:28:03,079][15401] Updated weights for policy 0, policy_version 979214 (0.0040) [2024-06-25 22:28:03,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 16043458560. Throughput: 0: 42637.8. Samples: 16043615660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:03,390][15132] Avg episode reward: [(0, '0.781')] [2024-06-25 22:28:06,651][15401] Updated weights for policy 0, policy_version 979224 (0.0028) [2024-06-25 22:28:08,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 16043655168. Throughput: 0: 42807.9. Samples: 16043745740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:08,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 22:28:11,025][15401] Updated weights for policy 0, policy_version 979234 (0.0036) [2024-06-25 22:28:12,558][15349] Signal inference workers to stop experience collection... (237350 times) [2024-06-25 22:28:12,610][15401] InferenceWorker_p0-w0: stopping experience collection (237350 times) [2024-06-25 22:28:12,673][15349] Signal inference workers to resume experience collection... (237350 times) [2024-06-25 22:28:12,673][15401] InferenceWorker_p0-w0: resuming experience collection (237350 times) [2024-06-25 22:28:13,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42603.1, 300 sec: 42543.7). Total num frames: 16043884544. Throughput: 0: 42603.2. Samples: 16043999540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:13,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 22:28:14,233][15401] Updated weights for policy 0, policy_version 979244 (0.0037) [2024-06-25 22:28:18,390][15132] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 16044081152. Throughput: 0: 42550.3. Samples: 16044256700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:18,390][15132] Avg episode reward: [(0, '0.626')] [2024-06-25 22:28:18,510][15401] Updated weights for policy 0, policy_version 979254 (0.0038) [2024-06-25 22:28:21,794][15401] Updated weights for policy 0, policy_version 979264 (0.0035) [2024-06-25 22:28:23,389][15132] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.8). Total num frames: 16044294144. Throughput: 0: 42599.7. Samples: 16044386360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:23,390][15132] Avg episode reward: [(0, '0.510')] [2024-06-25 22:28:26,071][15401] Updated weights for policy 0, policy_version 979274 (0.0033) [2024-06-25 22:28:28,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 16044523520. Throughput: 0: 42734.6. Samples: 16044647700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:28,390][15132] Avg episode reward: [(0, '0.440')] [2024-06-25 22:28:29,342][15401] Updated weights for policy 0, policy_version 979284 (0.0037) [2024-06-25 22:28:33,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42600.1, 300 sec: 42598.9). Total num frames: 16044720128. Throughput: 0: 42782.7. Samples: 16044902720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:33,391][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 22:28:33,694][15401] Updated weights for policy 0, policy_version 979294 (0.0033) [2024-06-25 22:28:36,988][15401] Updated weights for policy 0, policy_version 979304 (0.0033) [2024-06-25 22:28:38,390][15132] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 16044949504. Throughput: 0: 42578.7. Samples: 16045027160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:38,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 22:28:41,262][15401] Updated weights for policy 0, policy_version 979314 (0.0036) [2024-06-25 22:28:43,390][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 16045146112. Throughput: 0: 42625.7. Samples: 16045281240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:43,390][15132] Avg episode reward: [(0, '0.355')] [2024-06-25 22:28:44,983][15401] Updated weights for policy 0, policy_version 979324 (0.0039) [2024-06-25 22:28:48,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 16045359104. Throughput: 0: 42736.0. Samples: 16045538780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 22:28:49,504][15401] Updated weights for policy 0, policy_version 979334 (0.0041) [2024-06-25 22:28:52,650][15401] Updated weights for policy 0, policy_version 979344 (0.0042) [2024-06-25 22:28:53,390][15132] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 16045604864. Throughput: 0: 42681.3. Samples: 16045666400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-25 22:28:53,390][15132] Avg episode reward: [(0, '0.476')] [2024-06-25 22:28:57,156][15401] Updated weights for policy 0, policy_version 979354 (0.0037) [2024-06-25 22:28:58,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16045801472. Throughput: 0: 42948.9. Samples: 16045932240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:28:58,390][15132] Avg episode reward: [(0, '0.588')] [2024-06-25 22:29:00,303][15401] Updated weights for policy 0, policy_version 979364 (0.0024) [2024-06-25 22:29:03,389][15132] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 16046014464. Throughput: 0: 42805.4. Samples: 16046182940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:03,390][15132] Avg episode reward: [(0, '0.814')] [2024-06-25 22:29:04,547][15401] Updated weights for policy 0, policy_version 979374 (0.0040) [2024-06-25 22:29:07,739][15401] Updated weights for policy 0, policy_version 979384 (0.0030) [2024-06-25 22:29:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.7). Total num frames: 16046243840. Throughput: 0: 42796.4. Samples: 16046312200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:08,390][15132] Avg episode reward: [(0, '0.742')] [2024-06-25 22:29:11,946][15401] Updated weights for policy 0, policy_version 979394 (0.0028) [2024-06-25 22:29:13,389][15132] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 16046424064. Throughput: 0: 42676.9. Samples: 16046568160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:13,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 22:29:15,467][15401] Updated weights for policy 0, policy_version 979404 (0.0031) [2024-06-25 22:29:18,389][15132] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16046653440. Throughput: 0: 42701.8. Samples: 16046824300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:18,390][15132] Avg episode reward: [(0, '0.646')] [2024-06-25 22:29:19,432][15401] Updated weights for policy 0, policy_version 979414 (0.0032) [2024-06-25 22:29:23,070][15401] Updated weights for policy 0, policy_version 979424 (0.0031) [2024-06-25 22:29:23,390][15132] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 16046899200. Throughput: 0: 42881.8. Samples: 16046956840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:23,399][15132] Avg episode reward: [(0, '0.707')] [2024-06-25 22:29:27,141][15401] Updated weights for policy 0, policy_version 979434 (0.0034) [2024-06-25 22:29:28,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 16047079424. Throughput: 0: 42772.5. Samples: 16047206000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:28,390][15132] Avg episode reward: [(0, '0.714')] [2024-06-25 22:29:31,020][15401] Updated weights for policy 0, policy_version 979444 (0.0044) [2024-06-25 22:29:33,390][15132] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 16047292416. Throughput: 0: 42826.1. Samples: 16047465960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:33,390][15132] Avg episode reward: [(0, '0.778')] [2024-06-25 22:29:34,577][15401] Updated weights for policy 0, policy_version 979454 (0.0048) [2024-06-25 22:29:38,389][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16047521792. Throughput: 0: 42933.9. Samples: 16047598420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:38,390][15132] Avg episode reward: [(0, '0.758')] [2024-06-25 22:29:38,799][15401] Updated weights for policy 0, policy_version 979464 (0.0040) [2024-06-25 22:29:39,495][15349] Signal inference workers to stop experience collection... (237400 times) [2024-06-25 22:29:39,495][15349] Signal inference workers to resume experience collection... (237400 times) [2024-06-25 22:29:39,505][15401] InferenceWorker_p0-w0: stopping experience collection (237400 times) [2024-06-25 22:29:39,516][15401] InferenceWorker_p0-w0: resuming experience collection (237400 times) [2024-06-25 22:29:42,372][15401] Updated weights for policy 0, policy_version 979474 (0.0034) [2024-06-25 22:29:43,389][15132] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 16047734784. Throughput: 0: 42731.6. Samples: 16047855160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:43,390][15132] Avg episode reward: [(0, '0.528')] [2024-06-25 22:29:43,421][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979476_16047734784.pth... [2024-06-25 22:29:43,478][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000978851_16037494784.pth [2024-06-25 22:29:46,394][15401] Updated weights for policy 0, policy_version 979484 (0.0033) [2024-06-25 22:29:48,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 16047964160. Throughput: 0: 42847.5. Samples: 16048111080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:48,390][15132] Avg episode reward: [(0, '0.728')] [2024-06-25 22:29:49,853][15401] Updated weights for policy 0, policy_version 979494 (0.0024) [2024-06-25 22:29:53,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 16048160768. Throughput: 0: 42882.6. Samples: 16048241920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:53,390][15132] Avg episode reward: [(0, '0.592')] [2024-06-25 22:29:53,962][15401] Updated weights for policy 0, policy_version 979504 (0.0028) [2024-06-25 22:29:57,538][15401] Updated weights for policy 0, policy_version 979514 (0.0035) [2024-06-25 22:29:58,389][15132] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16048373760. Throughput: 0: 42917.4. Samples: 16048499440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:29:58,390][15132] Avg episode reward: [(0, '0.631')] [2024-06-25 22:30:01,558][15401] Updated weights for policy 0, policy_version 979524 (0.0031) [2024-06-25 22:30:03,390][15132] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 16048619520. Throughput: 0: 42905.7. Samples: 16048755060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:03,390][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 22:30:05,027][15401] Updated weights for policy 0, policy_version 979534 (0.0032) [2024-06-25 22:30:08,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 16048783360. Throughput: 0: 42896.9. Samples: 16048887200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:08,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 22:30:09,322][15401] Updated weights for policy 0, policy_version 979544 (0.0055) [2024-06-25 22:30:13,079][15401] Updated weights for policy 0, policy_version 979554 (0.0034) [2024-06-25 22:30:13,390][15132] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 16049012736. Throughput: 0: 42982.2. Samples: 16049140200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:13,390][15132] Avg episode reward: [(0, '0.748')] [2024-06-25 22:30:17,059][15401] Updated weights for policy 0, policy_version 979564 (0.0037) [2024-06-25 22:30:18,390][15132] Fps is (10 sec: 47513.1, 60 sec: 43417.5, 300 sec: 42820.9). Total num frames: 16049258496. Throughput: 0: 42837.8. Samples: 16049393660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:18,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 22:30:20,997][15401] Updated weights for policy 0, policy_version 979574 (0.0041) [2024-06-25 22:30:23,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 16049422336. Throughput: 0: 42936.9. Samples: 16049530580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:23,390][15132] Avg episode reward: [(0, '0.656')] [2024-06-25 22:30:24,741][15401] Updated weights for policy 0, policy_version 979584 (0.0035) [2024-06-25 22:30:28,389][15132] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 16049651712. Throughput: 0: 42842.7. Samples: 16049783080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:28,390][15132] Avg episode reward: [(0, '0.423')] [2024-06-25 22:30:28,531][15401] Updated weights for policy 0, policy_version 979594 (0.0032) [2024-06-25 22:30:32,349][15401] Updated weights for policy 0, policy_version 979604 (0.0042) [2024-06-25 22:30:33,389][15132] Fps is (10 sec: 47513.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 16049897472. Throughput: 0: 42909.5. Samples: 16050042000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:33,390][15132] Avg episode reward: [(0, '0.551')] [2024-06-25 22:30:36,141][15401] Updated weights for policy 0, policy_version 979614 (0.0040) [2024-06-25 22:30:38,389][15132] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 16050061312. Throughput: 0: 42972.9. Samples: 16050175700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:38,390][15132] Avg episode reward: [(0, '0.622')] [2024-06-25 22:30:39,854][15401] Updated weights for policy 0, policy_version 979624 (0.0034) [2024-06-25 22:30:43,389][15132] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16050290688. Throughput: 0: 42726.2. Samples: 16050422120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:43,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 22:30:43,817][15401] Updated weights for policy 0, policy_version 979634 (0.0030) [2024-06-25 22:30:47,537][15401] Updated weights for policy 0, policy_version 979644 (0.0045) [2024-06-25 22:30:48,389][15132] Fps is (10 sec: 47513.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 16050536448. Throughput: 0: 42902.8. Samples: 16050685680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:48,390][15132] Avg episode reward: [(0, '0.490')] [2024-06-25 22:30:51,573][15401] Updated weights for policy 0, policy_version 979654 (0.0041) [2024-06-25 22:30:53,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 16050716672. Throughput: 0: 42910.7. Samples: 16050818180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-25 22:30:53,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 22:30:55,171][15401] Updated weights for policy 0, policy_version 979664 (0.0038) [2024-06-25 22:30:55,652][15349] Signal inference workers to stop experience collection... (237450 times) [2024-06-25 22:30:55,702][15401] InferenceWorker_p0-w0: stopping experience collection (237450 times) [2024-06-25 22:30:55,706][15349] Signal inference workers to resume experience collection... (237450 times) [2024-06-25 22:30:55,717][15401] InferenceWorker_p0-w0: resuming experience collection (237450 times) [2024-06-25 22:30:58,389][15132] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 16050962432. Throughput: 0: 42921.8. Samples: 16051071680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:30:58,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 22:30:59,185][15401] Updated weights for policy 0, policy_version 979674 (0.0032) [2024-06-25 22:31:03,053][15401] Updated weights for policy 0, policy_version 979684 (0.0049) [2024-06-25 22:31:03,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 16051159040. Throughput: 0: 43074.9. Samples: 16051332020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:03,390][15132] Avg episode reward: [(0, '0.558')] [2024-06-25 22:31:06,736][15401] Updated weights for policy 0, policy_version 979694 (0.0047) [2024-06-25 22:31:08,390][15132] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 16051372032. Throughput: 0: 42926.2. Samples: 16051462260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:08,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 22:31:10,467][15401] Updated weights for policy 0, policy_version 979704 (0.0028) [2024-06-25 22:31:13,389][15132] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42876.2). Total num frames: 16051601408. Throughput: 0: 42924.8. Samples: 16051714700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:13,390][15132] Avg episode reward: [(0, '0.740')] [2024-06-25 22:31:14,083][15401] Updated weights for policy 0, policy_version 979714 (0.0031) [2024-06-25 22:31:17,881][15401] Updated weights for policy 0, policy_version 979724 (0.0035) [2024-06-25 22:31:18,390][15132] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16051814400. Throughput: 0: 43133.6. Samples: 16051983020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:18,390][15132] Avg episode reward: [(0, '0.749')] [2024-06-25 22:31:21,764][15401] Updated weights for policy 0, policy_version 979734 (0.0034) [2024-06-25 22:31:23,390][15132] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.8). Total num frames: 16052011008. Throughput: 0: 42990.1. Samples: 16052110260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:23,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 22:31:25,814][15401] Updated weights for policy 0, policy_version 979744 (0.0032) [2024-06-25 22:31:28,390][15132] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 16052256768. Throughput: 0: 43091.0. Samples: 16052361220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:28,390][15132] Avg episode reward: [(0, '0.533')] [2024-06-25 22:31:29,382][15401] Updated weights for policy 0, policy_version 979754 (0.0032) [2024-06-25 22:31:33,277][15401] Updated weights for policy 0, policy_version 979764 (0.0025) [2024-06-25 22:31:33,389][15132] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16052453376. Throughput: 0: 43150.7. Samples: 16052627460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:33,390][15132] Avg episode reward: [(0, '0.610')] [2024-06-25 22:31:36,956][15401] Updated weights for policy 0, policy_version 979774 (0.0031) [2024-06-25 22:31:38,389][15132] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 16052666368. Throughput: 0: 42984.9. Samples: 16052752500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:38,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 22:31:40,788][15401] Updated weights for policy 0, policy_version 979784 (0.0033) [2024-06-25 22:31:43,390][15132] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 16052895744. Throughput: 0: 43107.9. Samples: 16053011540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:43,390][15132] Avg episode reward: [(0, '0.618')] [2024-06-25 22:31:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979792_16052912128.pth... [2024-06-25 22:31:43,450][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979163_16042606592.pth [2024-06-25 22:31:44,534][15401] Updated weights for policy 0, policy_version 979794 (0.0045) [2024-06-25 22:31:48,389][15132] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 16053092352. Throughput: 0: 43075.5. Samples: 16053270420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:48,390][15132] Avg episode reward: [(0, '0.494')] [2024-06-25 22:31:48,394][15401] Updated weights for policy 0, policy_version 979804 (0.0047) [2024-06-25 22:31:52,253][15401] Updated weights for policy 0, policy_version 979814 (0.0043) [2024-06-25 22:31:53,389][15132] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 16053305344. Throughput: 0: 42913.4. Samples: 16053393360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:53,390][15132] Avg episode reward: [(0, '0.691')] [2024-06-25 22:31:55,877][15401] Updated weights for policy 0, policy_version 979824 (0.0041) [2024-06-25 22:31:58,389][15132] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 16053551104. Throughput: 0: 43005.4. Samples: 16053649940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:31:58,390][15132] Avg episode reward: [(0, '0.605')] [2024-06-25 22:32:00,069][15401] Updated weights for policy 0, policy_version 979834 (0.0026) [2024-06-25 22:32:03,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 16053747712. Throughput: 0: 42989.9. Samples: 16053917560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:03,390][15132] Avg episode reward: [(0, '0.480')] [2024-06-25 22:32:03,645][15401] Updated weights for policy 0, policy_version 979844 (0.0035) [2024-06-25 22:32:07,529][15401] Updated weights for policy 0, policy_version 979854 (0.0035) [2024-06-25 22:32:08,389][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42821.5). Total num frames: 16053960704. Throughput: 0: 42894.0. Samples: 16054040480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:08,390][15132] Avg episode reward: [(0, '0.535')] [2024-06-25 22:32:11,269][15401] Updated weights for policy 0, policy_version 979864 (0.0034) [2024-06-25 22:32:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42987.5). Total num frames: 16054190080. Throughput: 0: 42935.1. Samples: 16054293300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:13,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 22:32:15,159][15401] Updated weights for policy 0, policy_version 979874 (0.0028) [2024-06-25 22:32:15,806][15349] Signal inference workers to stop experience collection... (237500 times) [2024-06-25 22:32:15,806][15349] Signal inference workers to resume experience collection... (237500 times) [2024-06-25 22:32:15,844][15401] InferenceWorker_p0-w0: stopping experience collection (237500 times) [2024-06-25 22:32:15,844][15401] InferenceWorker_p0-w0: resuming experience collection (237500 times) [2024-06-25 22:32:18,389][15132] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 16054370304. Throughput: 0: 42894.2. Samples: 16054557700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:18,390][15132] Avg episode reward: [(0, '0.521')] [2024-06-25 22:32:18,934][15401] Updated weights for policy 0, policy_version 979884 (0.0035) [2024-06-25 22:32:22,886][15401] Updated weights for policy 0, policy_version 979894 (0.0037) [2024-06-25 22:32:23,390][15132] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 16054599680. Throughput: 0: 42744.0. Samples: 16054675980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:23,390][15132] Avg episode reward: [(0, '0.371')] [2024-06-25 22:32:26,613][15401] Updated weights for policy 0, policy_version 979904 (0.0027) [2024-06-25 22:32:28,389][15132] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42876.5). Total num frames: 16054812672. Throughput: 0: 42717.1. Samples: 16054933800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:28,390][15132] Avg episode reward: [(0, '0.536')] [2024-06-25 22:32:30,534][15401] Updated weights for policy 0, policy_version 979914 (0.0036) [2024-06-25 22:32:33,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 16055009280. Throughput: 0: 42778.1. Samples: 16055195440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:33,390][15132] Avg episode reward: [(0, '0.732')] [2024-06-25 22:32:34,395][15401] Updated weights for policy 0, policy_version 979924 (0.0047) [2024-06-25 22:32:38,074][15401] Updated weights for policy 0, policy_version 979934 (0.0026) [2024-06-25 22:32:38,389][15132] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 16055238656. Throughput: 0: 42751.1. Samples: 16055317160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:38,390][15132] Avg episode reward: [(0, '0.682')] [2024-06-25 22:32:42,168][15401] Updated weights for policy 0, policy_version 979944 (0.0039) [2024-06-25 22:32:43,392][15132] Fps is (10 sec: 45864.8, 60 sec: 42869.8, 300 sec: 42875.7). Total num frames: 16055468032. Throughput: 0: 42796.8. Samples: 16055575900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:43,392][15132] Avg episode reward: [(0, '0.469')] [2024-06-25 22:32:45,572][15401] Updated weights for policy 0, policy_version 979954 (0.0028) [2024-06-25 22:32:48,390][15132] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 16055664640. Throughput: 0: 42513.3. Samples: 16055830660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-25 22:32:48,390][15132] Avg episode reward: [(0, '0.549')] [2024-06-25 22:32:49,922][15401] Updated weights for policy 0, policy_version 979964 (0.0031) [2024-06-25 22:32:53,294][15401] Updated weights for policy 0, policy_version 979974 (0.0037) [2024-06-25 22:32:53,390][15132] Fps is (10 sec: 42608.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 16055894016. Throughput: 0: 42535.4. Samples: 16055954580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:32:53,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 22:32:57,631][15401] Updated weights for policy 0, policy_version 979984 (0.0026) [2024-06-25 22:32:58,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 16056090624. Throughput: 0: 42788.6. Samples: 16056218780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:32:58,390][15132] Avg episode reward: [(0, '0.515')] [2024-06-25 22:33:01,205][15401] Updated weights for policy 0, policy_version 979994 (0.0035) [2024-06-25 22:33:03,389][15132] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 16056303616. Throughput: 0: 42423.1. Samples: 16056466740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:03,390][15132] Avg episode reward: [(0, '0.432')] [2024-06-25 22:33:05,395][15401] Updated weights for policy 0, policy_version 980004 (0.0041) [2024-06-25 22:33:08,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 16056532992. Throughput: 0: 42611.2. Samples: 16056593480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:08,390][15132] Avg episode reward: [(0, '0.630')] [2024-06-25 22:33:09,087][15401] Updated weights for policy 0, policy_version 980014 (0.0027) [2024-06-25 22:33:12,948][15401] Updated weights for policy 0, policy_version 980024 (0.0038) [2024-06-25 22:33:13,390][15132] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 16056745984. Throughput: 0: 42757.6. Samples: 16056857900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:13,390][15132] Avg episode reward: [(0, '0.462')] [2024-06-25 22:33:16,754][15401] Updated weights for policy 0, policy_version 980034 (0.0039) [2024-06-25 22:33:18,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 16056942592. Throughput: 0: 42617.4. Samples: 16057113220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:18,390][15132] Avg episode reward: [(0, '0.327')] [2024-06-25 22:33:20,603][15401] Updated weights for policy 0, policy_version 980044 (0.0039) [2024-06-25 22:33:23,392][15132] Fps is (10 sec: 40950.2, 60 sec: 42596.7, 300 sec: 42820.2). Total num frames: 16057155584. Throughput: 0: 42661.7. Samples: 16057237040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:23,393][15132] Avg episode reward: [(0, '0.416')] [2024-06-25 22:33:24,333][15401] Updated weights for policy 0, policy_version 980054 (0.0033) [2024-06-25 22:33:28,389][15132] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 16057352192. Throughput: 0: 42553.0. Samples: 16057490680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:28,390][15132] Avg episode reward: [(0, '0.482')] [2024-06-25 22:33:28,410][15401] Updated weights for policy 0, policy_version 980064 (0.0030) [2024-06-25 22:33:31,938][15401] Updated weights for policy 0, policy_version 980074 (0.0030) [2024-06-25 22:33:33,389][15132] Fps is (10 sec: 44247.7, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 16057597952. Throughput: 0: 42628.1. Samples: 16057748920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:33,390][15132] Avg episode reward: [(0, '0.368')] [2024-06-25 22:33:35,878][15401] Updated weights for policy 0, policy_version 980084 (0.0028) [2024-06-25 22:33:38,390][15132] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 16057810944. Throughput: 0: 42737.3. Samples: 16057877760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:38,390][15132] Avg episode reward: [(0, '0.381')] [2024-06-25 22:33:39,395][15401] Updated weights for policy 0, policy_version 980094 (0.0030) [2024-06-25 22:33:43,390][15132] Fps is (10 sec: 40959.7, 60 sec: 42327.0, 300 sec: 42876.1). Total num frames: 16058007552. Throughput: 0: 42686.1. Samples: 16058139660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:43,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 22:33:43,484][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000980104_16058023936.pth... [2024-06-25 22:33:43,485][15401] Updated weights for policy 0, policy_version 980104 (0.0028) [2024-06-25 22:33:43,538][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979476_16047734784.pth [2024-06-25 22:33:46,936][15401] Updated weights for policy 0, policy_version 980114 (0.0037) [2024-06-25 22:33:48,389][15132] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 16058236928. Throughput: 0: 42673.8. Samples: 16058387060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:48,390][15132] Avg episode reward: [(0, '0.619')] [2024-06-25 22:33:51,505][15401] Updated weights for policy 0, policy_version 980124 (0.0035) [2024-06-25 22:33:53,390][15132] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 16058449920. Throughput: 0: 42750.6. Samples: 16058517260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:53,390][15132] Avg episode reward: [(0, '0.642')] [2024-06-25 22:33:54,543][15401] Updated weights for policy 0, policy_version 980134 (0.0039) [2024-06-25 22:33:58,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 16058630144. Throughput: 0: 42611.1. Samples: 16058775400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:33:58,390][15132] Avg episode reward: [(0, '0.782')] [2024-06-25 22:33:59,087][15401] Updated weights for policy 0, policy_version 980144 (0.0029) [2024-06-25 22:34:00,653][15349] Signal inference workers to stop experience collection... (237550 times) [2024-06-25 22:34:00,693][15401] InferenceWorker_p0-w0: stopping experience collection (237550 times) [2024-06-25 22:34:00,716][15349] Signal inference workers to resume experience collection... (237550 times) [2024-06-25 22:34:00,717][15401] InferenceWorker_p0-w0: resuming experience collection (237550 times) [2024-06-25 22:34:02,516][15401] Updated weights for policy 0, policy_version 980154 (0.0033) [2024-06-25 22:34:03,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 16058875904. Throughput: 0: 42311.6. Samples: 16059017240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:03,390][15132] Avg episode reward: [(0, '0.554')] [2024-06-25 22:34:06,870][15401] Updated weights for policy 0, policy_version 980164 (0.0035) [2024-06-25 22:34:08,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 16059072512. Throughput: 0: 42648.9. Samples: 16059156140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:08,390][15132] Avg episode reward: [(0, '0.427')] [2024-06-25 22:34:10,339][15401] Updated weights for policy 0, policy_version 980174 (0.0034) [2024-06-25 22:34:13,389][15132] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 16059269120. Throughput: 0: 42430.3. Samples: 16059400040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:13,390][15132] Avg episode reward: [(0, '0.589')] [2024-06-25 22:34:14,934][15401] Updated weights for policy 0, policy_version 980184 (0.0040) [2024-06-25 22:34:18,145][15401] Updated weights for policy 0, policy_version 980194 (0.0026) [2024-06-25 22:34:18,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 16059514880. Throughput: 0: 42412.5. Samples: 16059657480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:18,390][15132] Avg episode reward: [(0, '0.693')] [2024-06-25 22:34:22,515][15401] Updated weights for policy 0, policy_version 980204 (0.0038) [2024-06-25 22:34:23,396][15132] Fps is (10 sec: 44208.0, 60 sec: 42595.6, 300 sec: 42819.6). Total num frames: 16059711488. Throughput: 0: 42415.8. Samples: 16059786740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:23,396][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 22:34:25,871][15401] Updated weights for policy 0, policy_version 980214 (0.0031) [2024-06-25 22:34:28,390][15132] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 16059924480. Throughput: 0: 42160.8. Samples: 16060036900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:28,390][15132] Avg episode reward: [(0, '0.517')] [2024-06-25 22:34:30,221][15401] Updated weights for policy 0, policy_version 980224 (0.0042) [2024-06-25 22:34:33,318][15401] Updated weights for policy 0, policy_version 980234 (0.0051) [2024-06-25 22:34:33,390][15132] Fps is (10 sec: 44262.6, 60 sec: 42597.9, 300 sec: 42820.5). Total num frames: 16060153856. Throughput: 0: 42478.9. Samples: 16060298640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:33,390][15132] Avg episode reward: [(0, '0.357')] [2024-06-25 22:34:37,779][15401] Updated weights for policy 0, policy_version 980244 (0.0031) [2024-06-25 22:34:38,390][15132] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.4). Total num frames: 16060334080. Throughput: 0: 42491.0. Samples: 16060429360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:38,390][15132] Avg episode reward: [(0, '0.379')] [2024-06-25 22:34:40,828][15401] Updated weights for policy 0, policy_version 980254 (0.0031) [2024-06-25 22:34:43,389][15132] Fps is (10 sec: 39324.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 16060547072. Throughput: 0: 42282.8. Samples: 16060678120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:43,390][15132] Avg episode reward: [(0, '0.653')] [2024-06-25 22:34:45,451][15401] Updated weights for policy 0, policy_version 980264 (0.0032) [2024-06-25 22:34:48,390][15132] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 16060792832. Throughput: 0: 42683.5. Samples: 16060938000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-25 22:34:48,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 22:34:48,692][15401] Updated weights for policy 0, policy_version 980274 (0.0030) [2024-06-25 22:34:53,079][15401] Updated weights for policy 0, policy_version 980284 (0.0024) [2024-06-25 22:34:53,389][15132] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 16060973056. Throughput: 0: 42580.6. Samples: 16061072260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:34:53,390][15132] Avg episode reward: [(0, '0.509')] [2024-06-25 22:34:56,146][15401] Updated weights for policy 0, policy_version 980294 (0.0037) [2024-06-25 22:34:58,392][15132] Fps is (10 sec: 40950.7, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 16061202432. Throughput: 0: 42803.0. Samples: 16061326280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:34:58,392][15132] Avg episode reward: [(0, '0.362')] [2024-06-25 22:35:01,096][15401] Updated weights for policy 0, policy_version 980304 (0.0019) [2024-06-25 22:35:03,390][15132] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 16061431808. Throughput: 0: 42899.4. Samples: 16061587960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:03,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 22:35:03,727][15401] Updated weights for policy 0, policy_version 980314 (0.0037) [2024-06-25 22:35:08,389][15132] Fps is (10 sec: 40969.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 16061612032. Throughput: 0: 43010.2. Samples: 16061721920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:08,390][15132] Avg episode reward: [(0, '0.363')] [2024-06-25 22:35:08,519][15401] Updated weights for policy 0, policy_version 980324 (0.0038) [2024-06-25 22:35:09,440][15349] Signal inference workers to stop experience collection... (237600 times) [2024-06-25 22:35:09,476][15401] InferenceWorker_p0-w0: stopping experience collection (237600 times) [2024-06-25 22:35:09,508][15349] Signal inference workers to resume experience collection... (237600 times) [2024-06-25 22:35:09,510][15401] InferenceWorker_p0-w0: resuming experience collection (237600 times) [2024-06-25 22:35:11,281][15401] Updated weights for policy 0, policy_version 980334 (0.0047) [2024-06-25 22:35:13,390][15132] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 16061857792. Throughput: 0: 43057.9. Samples: 16061974500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:13,390][15132] Avg episode reward: [(0, '0.645')] [2024-06-25 22:35:15,915][15401] Updated weights for policy 0, policy_version 980344 (0.0035) [2024-06-25 22:35:18,389][15132] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 16062070784. Throughput: 0: 42954.0. Samples: 16062231540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:18,390][15132] Avg episode reward: [(0, '0.620')] [2024-06-25 22:35:18,954][15401] Updated weights for policy 0, policy_version 980354 (0.0033) [2024-06-25 22:35:23,302][15401] Updated weights for policy 0, policy_version 980364 (0.0022) [2024-06-25 22:35:23,389][15132] Fps is (10 sec: 42598.6, 60 sec: 42876.1, 300 sec: 42820.5). Total num frames: 16062283776. Throughput: 0: 43093.9. Samples: 16062368580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:23,390][15132] Avg episode reward: [(0, '0.611')] [2024-06-25 22:35:26,685][15401] Updated weights for policy 0, policy_version 980374 (0.0031) [2024-06-25 22:35:28,390][15132] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 16062480384. Throughput: 0: 43081.7. Samples: 16062616800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:28,390][15132] Avg episode reward: [(0, '0.664')] [2024-06-25 22:35:31,048][15401] Updated weights for policy 0, policy_version 980384 (0.0030) [2024-06-25 22:35:33,389][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.8, 300 sec: 42876.1). Total num frames: 16062709760. Throughput: 0: 43073.0. Samples: 16062876280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:33,390][15132] Avg episode reward: [(0, '0.721')] [2024-06-25 22:35:34,518][15401] Updated weights for policy 0, policy_version 980394 (0.0046) [2024-06-25 22:35:38,390][15132] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 16062922752. Throughput: 0: 43068.3. Samples: 16063010340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:38,390][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 22:35:38,805][15401] Updated weights for policy 0, policy_version 980404 (0.0041) [2024-06-25 22:35:42,170][15401] Updated weights for policy 0, policy_version 980414 (0.0034) [2024-06-25 22:35:43,390][15132] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 16063152128. Throughput: 0: 43168.5. Samples: 16063268760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:43,390][15132] Avg episode reward: [(0, '0.511')] [2024-06-25 22:35:43,409][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000980417_16063152128.pth... [2024-06-25 22:35:43,491][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000979792_16052912128.pth [2024-06-25 22:35:46,202][15401] Updated weights for policy 0, policy_version 980424 (0.0032) [2024-06-25 22:35:48,390][15132] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 16063365120. Throughput: 0: 43171.1. Samples: 16063530660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:48,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 22:35:49,697][15401] Updated weights for policy 0, policy_version 980434 (0.0035) [2024-06-25 22:35:53,390][15132] Fps is (10 sec: 42597.9, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 16063578112. Throughput: 0: 43181.1. Samples: 16063665080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:53,390][15132] Avg episode reward: [(0, '0.365')] [2024-06-25 22:35:53,567][15401] Updated weights for policy 0, policy_version 980444 (0.0025) [2024-06-25 22:35:57,427][15401] Updated weights for policy 0, policy_version 980454 (0.0026) [2024-06-25 22:35:58,390][15132] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 16063791104. Throughput: 0: 43266.2. Samples: 16063921480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:35:58,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 22:36:01,233][15401] Updated weights for policy 0, policy_version 980464 (0.0035) [2024-06-25 22:36:03,390][15132] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 16064020480. Throughput: 0: 43354.9. Samples: 16064182520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:03,390][15132] Avg episode reward: [(0, '0.677')] [2024-06-25 22:36:05,006][15401] Updated weights for policy 0, policy_version 980474 (0.0042) [2024-06-25 22:36:08,390][15132] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 16064233472. Throughput: 0: 43265.7. Samples: 16064315540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:08,390][15132] Avg episode reward: [(0, '0.583')] [2024-06-25 22:36:08,673][15401] Updated weights for policy 0, policy_version 980484 (0.0031) [2024-06-25 22:36:12,662][15401] Updated weights for policy 0, policy_version 980494 (0.0038) [2024-06-25 22:36:13,390][15132] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 16064430080. Throughput: 0: 43471.9. Samples: 16064573040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:13,390][15132] Avg episode reward: [(0, '0.711')] [2024-06-25 22:36:16,309][15401] Updated weights for policy 0, policy_version 980504 (0.0037) [2024-06-25 22:36:18,389][15132] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 16064659456. Throughput: 0: 43157.4. Samples: 16064818360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:18,390][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 22:36:20,723][15401] Updated weights for policy 0, policy_version 980514 (0.0031) [2024-06-25 22:36:23,390][15132] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 16064856064. Throughput: 0: 43139.8. Samples: 16064951640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:23,391][15132] Avg episode reward: [(0, '0.661')] [2024-06-25 22:36:23,959][15401] Updated weights for policy 0, policy_version 980524 (0.0035) [2024-06-25 22:36:28,182][15401] Updated weights for policy 0, policy_version 980534 (0.0033) [2024-06-25 22:36:28,389][15132] Fps is (10 sec: 40959.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 16065069056. Throughput: 0: 43026.7. Samples: 16065204960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:28,390][15132] Avg episode reward: [(0, '0.608')] [2024-06-25 22:36:29,049][15349] Signal inference workers to stop experience collection... (237650 times) [2024-06-25 22:36:29,088][15401] InferenceWorker_p0-w0: stopping experience collection (237650 times) [2024-06-25 22:36:29,101][15349] Signal inference workers to resume experience collection... (237650 times) [2024-06-25 22:36:29,103][15401] InferenceWorker_p0-w0: resuming experience collection (237650 times) [2024-06-25 22:36:31,654][15401] Updated weights for policy 0, policy_version 980544 (0.0025) [2024-06-25 22:36:33,390][15132] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 16065298432. Throughput: 0: 42742.7. Samples: 16065454080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:33,390][15132] Avg episode reward: [(0, '0.556')] [2024-06-25 22:36:36,049][15401] Updated weights for policy 0, policy_version 980554 (0.0028) [2024-06-25 22:36:38,392][15132] Fps is (10 sec: 44226.1, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 16065511424. Throughput: 0: 42783.2. Samples: 16065590420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:38,392][15132] Avg episode reward: [(0, '0.598')] [2024-06-25 22:36:39,263][15401] Updated weights for policy 0, policy_version 980564 (0.0040) [2024-06-25 22:36:43,390][15132] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16065708032. Throughput: 0: 42764.4. Samples: 16065845880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:43,390][15132] Avg episode reward: [(0, '0.706')] [2024-06-25 22:36:43,767][15401] Updated weights for policy 0, policy_version 980574 (0.0043) [2024-06-25 22:36:46,857][15401] Updated weights for policy 0, policy_version 980584 (0.0047) [2024-06-25 22:36:48,393][15132] Fps is (10 sec: 44230.0, 60 sec: 43141.8, 300 sec: 42875.5). Total num frames: 16065953792. Throughput: 0: 42452.4. Samples: 16066093040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-25 22:36:48,394][15132] Avg episode reward: [(0, '0.772')] [2024-06-25 22:36:51,190][15401] Updated weights for policy 0, policy_version 980594 (0.0039) [2024-06-25 22:36:53,390][15132] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 16066150400. Throughput: 0: 42708.9. Samples: 16066237440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:36:53,390][15132] Avg episode reward: [(0, '0.802')] [2024-06-25 22:36:54,474][15401] Updated weights for policy 0, policy_version 980604 (0.0030) [2024-06-25 22:36:58,390][15132] Fps is (10 sec: 39336.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16066347008. Throughput: 0: 42514.7. Samples: 16066486200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:36:58,390][15132] Avg episode reward: [(0, '0.513')] [2024-06-25 22:36:58,831][15401] Updated weights for policy 0, policy_version 980614 (0.0031) [2024-06-25 22:37:01,914][15401] Updated weights for policy 0, policy_version 980624 (0.0045) [2024-06-25 22:37:03,389][15132] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 16066592768. Throughput: 0: 42728.4. Samples: 16066741140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:03,390][15132] Avg episode reward: [(0, '0.546')] [2024-06-25 22:37:06,667][15401] Updated weights for policy 0, policy_version 980634 (0.0027) [2024-06-25 22:37:08,389][15132] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 16066789376. Throughput: 0: 42897.1. Samples: 16066882000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:08,390][15132] Avg episode reward: [(0, '0.710')] [2024-06-25 22:37:09,692][15401] Updated weights for policy 0, policy_version 980644 (0.0026) [2024-06-25 22:37:13,390][15132] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16066985984. Throughput: 0: 42850.6. Samples: 16067133240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:13,390][15132] Avg episode reward: [(0, '0.437')] [2024-06-25 22:37:14,203][15401] Updated weights for policy 0, policy_version 980654 (0.0026) [2024-06-25 22:37:17,434][15401] Updated weights for policy 0, policy_version 980664 (0.0031) [2024-06-25 22:37:18,390][15132] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 16067248128. Throughput: 0: 42817.0. Samples: 16067380840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:18,390][15132] Avg episode reward: [(0, '0.810')] [2024-06-25 22:37:21,686][15401] Updated weights for policy 0, policy_version 980674 (0.0027) [2024-06-25 22:37:23,389][15132] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 16067428352. Throughput: 0: 43042.3. Samples: 16067527220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:23,390][15132] Avg episode reward: [(0, '0.572')] [2024-06-25 22:37:25,078][15401] Updated weights for policy 0, policy_version 980684 (0.0041) [2024-06-25 22:37:28,389][15132] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 16067624960. Throughput: 0: 42797.9. Samples: 16067771780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:28,390][15132] Avg episode reward: [(0, '0.581')] [2024-06-25 22:37:29,491][15401] Updated weights for policy 0, policy_version 980694 (0.0043) [2024-06-25 22:37:30,654][15349] Signal inference workers to stop experience collection... (237700 times) [2024-06-25 22:37:30,655][15349] Signal inference workers to resume experience collection... (237700 times) [2024-06-25 22:37:30,668][15401] InferenceWorker_p0-w0: stopping experience collection (237700 times) [2024-06-25 22:37:30,669][15401] InferenceWorker_p0-w0: resuming experience collection (237700 times) [2024-06-25 22:37:32,591][15401] Updated weights for policy 0, policy_version 980704 (0.0036) [2024-06-25 22:37:33,389][15132] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 16067887104. Throughput: 0: 42946.5. Samples: 16068025460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:33,390][15132] Avg episode reward: [(0, '0.687')] [2024-06-25 22:37:37,033][15401] Updated weights for policy 0, policy_version 980714 (0.0028) [2024-06-25 22:37:38,389][15132] Fps is (10 sec: 44236.8, 60 sec: 42600.2, 300 sec: 42709.8). Total num frames: 16068067328. Throughput: 0: 42820.2. Samples: 16068164340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:38,390][15132] Avg episode reward: [(0, '0.318')] [2024-06-25 22:37:40,376][15401] Updated weights for policy 0, policy_version 980724 (0.0042) [2024-06-25 22:37:43,390][15132] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 16068280320. Throughput: 0: 42698.7. Samples: 16068407640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:43,394][15132] Avg episode reward: [(0, '0.574')] [2024-06-25 22:37:43,408][15349] Saving /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000980730_16068280320.pth... [2024-06-25 22:37:43,470][15349] Removing /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000980104_16058023936.pth [2024-06-25 22:37:44,583][15401] Updated weights for policy 0, policy_version 980734 (0.0033) [2024-06-25 22:37:47,910][15401] Updated weights for policy 0, policy_version 980744 (0.0025) [2024-06-25 22:37:48,389][15132] Fps is (10 sec: 45874.8, 60 sec: 42874.3, 300 sec: 42820.6). Total num frames: 16068526080. Throughput: 0: 42827.5. Samples: 16068668380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:48,390][15132] Avg episode reward: [(0, '0.818')] [2024-06-25 22:37:52,815][15401] Updated weights for policy 0, policy_version 980754 (0.0032) [2024-06-25 22:37:53,390][15132] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 16068706304. Throughput: 0: 42709.2. Samples: 16068803920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:53,390][15132] Avg episode reward: [(0, '0.361')] [2024-06-25 22:37:55,662][15401] Updated weights for policy 0, policy_version 980764 (0.0033) [2024-06-25 22:37:58,389][15132] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 16068935680. Throughput: 0: 42533.0. Samples: 16069047220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:37:58,390][15132] Avg episode reward: [(0, '0.491')] [2024-06-25 22:38:00,326][15401] Updated weights for policy 0, policy_version 980774 (0.0033) [2024-06-25 22:38:03,370][15401] Updated weights for policy 0, policy_version 980784 (0.0040) [2024-06-25 22:38:03,389][15132] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 16069165056. Throughput: 0: 42956.5. Samples: 16069313880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:03,390][15132] Avg episode reward: [(0, '0.825')] [2024-06-25 22:38:08,051][15401] Updated weights for policy 0, policy_version 980794 (0.0033) [2024-06-25 22:38:08,390][15132] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 16069345280. Throughput: 0: 42500.8. Samples: 16069439760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:08,390][15132] Avg episode reward: [(0, '0.573')] [2024-06-25 22:38:11,150][15401] Updated weights for policy 0, policy_version 980804 (0.0025) [2024-06-25 22:38:13,390][15132] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 16069591040. Throughput: 0: 42528.3. Samples: 16069685560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:13,390][15132] Avg episode reward: [(0, '0.478')] [2024-06-25 22:38:15,826][15401] Updated weights for policy 0, policy_version 980814 (0.0042) [2024-06-25 22:38:18,390][15132] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.9). Total num frames: 16069787648. Throughput: 0: 42624.4. Samples: 16069943560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:18,390][15132] Avg episode reward: [(0, '0.715')] [2024-06-25 22:38:18,933][15401] Updated weights for policy 0, policy_version 980824 (0.0045) [2024-06-25 22:38:23,390][15132] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16069967872. Throughput: 0: 42357.6. Samples: 16070070440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:23,390][15132] Avg episode reward: [(0, '0.477')] [2024-06-25 22:38:23,730][15401] Updated weights for policy 0, policy_version 980834 (0.0035) [2024-06-25 22:38:26,582][15401] Updated weights for policy 0, policy_version 980844 (0.0025) [2024-06-25 22:38:28,390][15132] Fps is (10 sec: 44236.2, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 16070230016. Throughput: 0: 42603.5. Samples: 16070324800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:28,390][15132] Avg episode reward: [(0, '0.738')] [2024-06-25 22:38:31,237][15401] Updated weights for policy 0, policy_version 980854 (0.0035) [2024-06-25 22:38:33,390][15132] Fps is (10 sec: 47513.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 16070443008. Throughput: 0: 42539.5. Samples: 16070582660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:33,390][15132] Avg episode reward: [(0, '0.788')] [2024-06-25 22:38:34,191][15401] Updated weights for policy 0, policy_version 980864 (0.0041) [2024-06-25 22:38:38,390][15132] Fps is (10 sec: 37683.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 16070606848. Throughput: 0: 42302.3. Samples: 16070707520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:38,390][15132] Avg episode reward: [(0, '0.844')] [2024-06-25 22:38:39,223][15401] Updated weights for policy 0, policy_version 980874 (0.0033) [2024-06-25 22:38:41,936][15401] Updated weights for policy 0, policy_version 980884 (0.0025) [2024-06-25 22:38:43,389][15132] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 16070868992. Throughput: 0: 42529.8. Samples: 16070961060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:43,390][15132] Avg episode reward: [(0, '0.846')] [2024-06-25 22:38:46,776][15349] Signal inference workers to stop experience collection... (237750 times) [2024-06-25 22:38:46,808][15401] InferenceWorker_p0-w0: stopping experience collection (237750 times) [2024-06-25 22:38:46,891][15349] Signal inference workers to resume experience collection... (237750 times) [2024-06-25 22:38:46,891][15401] InferenceWorker_p0-w0: resuming experience collection (237750 times) [2024-06-25 22:38:46,894][15401] Updated weights for policy 0, policy_version 980894 (0.0039) [2024-06-25 22:38:48,390][15132] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 16071065600. Throughput: 0: 42407.9. Samples: 16071222240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-25 22:38:48,390][15132] Avg episode reward: [(0, '0.671')] [2024-06-25 22:38:49,455][15401] Updated weights for policy 0, policy_version 980904 (0.0040) [2024-06-25 22:38:53,389][15132] Fps is (10 sec: 37682.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 16071245824. Throughput: 0: 42331.7. Samples: 16071344680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:38:53,390][15132] Avg episode reward: [(0, '0.594')] [2024-06-25 22:38:54,466][15401] Updated weights for policy 0, policy_version 980914 (0.0041) [2024-06-25 22:38:57,109][15401] Updated weights for policy 0, policy_version 980924 (0.0051) [2024-06-25 22:38:58,389][15132] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 16071507968. Throughput: 0: 42569.9. Samples: 16071601200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:38:58,390][15132] Avg episode reward: [(0, '0.719')] [2024-06-25 22:39:01,855][15401] Updated weights for policy 0, policy_version 980934 (0.0025) [2024-06-25 22:39:03,390][15132] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 42709.5). Total num frames: 16071671808. Throughput: 0: 42654.6. Samples: 16071863020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:39:03,390][15132] Avg episode reward: [(0, '0.696')] [2024-06-25 22:39:05,265][15401] Updated weights for policy 0, policy_version 980944 (0.0038) [2024-06-25 22:39:08,392][15132] Fps is (10 sec: 37674.0, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 16071884800. Throughput: 0: 42397.3. Samples: 16071978420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:39:08,392][15132] Avg episode reward: [(0, '0.667')] [2024-06-25 22:39:09,727][15401] Updated weights for policy 0, policy_version 980954 (0.0023) [2024-06-25 22:39:12,859][15401] Updated weights for policy 0, policy_version 980964 (0.0029) [2024-06-25 22:39:13,389][15132] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 16072130560. Throughput: 0: 42494.8. Samples: 16072237060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:39:13,390][15132] Avg episode reward: [(0, '0.638')] [2024-06-25 22:39:17,275][15401] Updated weights for policy 0, policy_version 980974 (0.0040) [2024-06-25 22:39:18,390][15132] Fps is (10 sec: 42608.5, 60 sec: 42052.2, 300 sec: 42710.4). Total num frames: 16072310784. Throughput: 0: 42623.1. Samples: 16072500700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-25 22:39:18,390][15132] Avg episode reward: [(0, '0.473')] [2024-06-25 22:39:20,498][15401] Updated weights for policy 0, policy_version 980984 (0.0029) [2024-06-25 22:40:05,012][17646] Saving configuration to /workspace/metta/train_dir/p2.dr6/config.json... [2024-06-25 22:40:05,029][17646] Rollout worker 0 uses device cpu [2024-06-25 22:40:05,029][17646] Rollout worker 1 uses device cpu [2024-06-25 22:40:05,029][17646] Rollout worker 2 uses device cpu [2024-06-25 22:40:05,030][17646] Rollout worker 3 uses device cpu [2024-06-25 22:40:05,030][17646] Rollout worker 4 uses device cpu [2024-06-25 22:40:05,030][17646] Rollout worker 5 uses device cpu [2024-06-25 22:40:05,031][17646] Rollout worker 6 uses device cpu [2024-06-25 22:40:05,031][17646] Rollout worker 7 uses device cpu [2024-06-25 22:40:05,031][17646] Rollout worker 8 uses device cpu [2024-06-25 22:40:05,031][17646] Rollout worker 9 uses device cpu [2024-06-25 22:40:05,032][17646] Rollout worker 10 uses device cpu [2024-06-25 22:40:05,032][17646] Rollout worker 11 uses device cpu [2024-06-25 22:40:05,032][17646] Rollout worker 12 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 13 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 14 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 15 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 16 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 17 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 18 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 19 uses device cpu [2024-06-25 22:40:05,033][17646] Rollout worker 20 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 21 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 22 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 23 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 24 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 25 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 26 uses device cpu [2024-06-25 22:40:05,034][17646] Rollout worker 27 uses device cpu [2024-06-25 22:40:05,035][17646] Rollout worker 28 uses device cpu [2024-06-25 22:40:05,035][17646] Rollout worker 29 uses device cpu [2024-06-25 22:40:05,035][17646] Rollout worker 30 uses device cpu [2024-06-25 22:40:05,035][17646] Rollout worker 31 uses device cpu [2024-06-25 22:40:05,623][17646] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-25 22:40:05,623][17646] InferenceWorker_p0-w0: min num requests: 10 [2024-06-25 22:40:05,704][17646] Starting all processes... [2024-06-25 22:40:05,705][17646] Starting process learner_proc0 [2024-06-25 22:40:05,936][17646] Starting all processes... [2024-06-25 22:40:05,939][17646] Starting process inference_proc0-0 [2024-06-25 22:40:05,940][17646] Starting process rollout_proc1 [2024-06-25 22:40:05,940][17646] Starting process rollout_proc0 [2024-06-25 22:40:05,942][17646] Starting process rollout_proc2 [2024-06-25 22:40:05,944][17646] Starting process rollout_proc3 [2024-06-25 22:40:05,944][17646] Starting process rollout_proc4 [2024-06-25 22:40:05,946][17646] Starting process rollout_proc5 [2024-06-25 22:40:05,947][17646] Starting process rollout_proc6 [2024-06-25 22:40:05,947][17646] Starting process rollout_proc7 [2024-06-25 22:40:05,948][17646] Starting process rollout_proc8 [2024-06-25 22:40:06,056][17646] Starting process rollout_proc21 [2024-06-25 22:40:05,948][17646] Starting process rollout_proc10 [2024-06-25 22:40:06,060][17646] Starting process rollout_proc23 [2024-06-25 22:40:05,948][17646] Starting process rollout_proc11 [2024-06-25 22:40:06,042][17646] Starting process rollout_proc14 [2024-06-25 22:40:06,044][17646] Starting process rollout_proc13 [2024-06-25 22:40:06,044][17646] Starting process rollout_proc15 [2024-06-25 22:40:06,045][17646] Starting process rollout_proc16 [2024-06-25 22:40:06,045][17646] Starting process rollout_proc17 [2024-06-25 22:40:06,052][17646] Starting process rollout_proc19 [2024-06-25 22:40:06,052][17646] Starting process rollout_proc18 [2024-06-25 22:40:06,052][17646] Starting process rollout_proc20 [2024-06-25 22:40:05,948][17646] Starting process rollout_proc9 [2024-06-25 22:40:06,056][17646] Starting process rollout_proc22 [2024-06-25 22:40:06,102][17646] Starting process rollout_proc30 [2024-06-25 22:40:06,064][17646] Starting process rollout_proc24 [2024-06-25 22:40:06,080][17646] Starting process rollout_proc25 [2024-06-25 22:40:06,081][17646] Starting process rollout_proc26 [2024-06-25 22:40:06,092][17646] Starting process rollout_proc27 [2024-06-25 22:40:06,103][17646] Starting process rollout_proc31 [2024-06-25 22:40:06,092][17646] Starting process rollout_proc28 [2024-06-25 22:40:05,948][17646] Starting process rollout_proc12 [2024-06-25 22:40:06,092][17646] Starting process rollout_proc29 [2024-06-25 22:40:08,151][17879] Worker 1 uses CPU cores [1] [2024-06-25 22:40:08,208][17882] Worker 3 uses CPU cores [3] [2024-06-25 22:40:08,250][18018] Worker 14 uses CPU cores [14] [2024-06-25 22:40:08,255][18013] Worker 23 uses CPU cores [23] [2024-06-25 22:40:08,262][17878] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-25 22:40:08,262][17878] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-25 22:40:08,271][17878] Num visible devices: 1 [2024-06-25 22:40:08,300][17880] Worker 0 uses CPU cores [0] [2024-06-25 22:40:08,316][18090] Worker 26 uses CPU cores [26] [2024-06-25 22:40:08,341][17858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-25 22:40:08,341][17858] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-25 22:40:08,351][17858] Num visible devices: 1 [2024-06-25 22:40:08,372][17858] Setting fixed seed 0 [2024-06-25 22:40:08,373][17858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-25 22:40:08,373][17858] Initializing actor-critic model on device cuda:0 [2024-06-25 22:40:08,380][18095] Worker 29 uses CPU cores [29] [2024-06-25 22:40:08,393][18049] Worker 15 uses CPU cores [15] [2024-06-25 22:40:08,400][18014] Worker 21 uses CPU cores [21] [2024-06-25 22:40:08,412][18048] Worker 13 uses CPU cores [13] [2024-06-25 22:40:08,413][17948] Worker 7 uses CPU cores [7] [2024-06-25 22:40:08,420][17945] Worker 4 uses CPU cores [4] [2024-06-25 22:40:08,424][18085] Worker 27 uses CPU cores [27] [2024-06-25 22:40:08,432][18054] Worker 17 uses CPU cores [17] [2024-06-25 22:40:08,436][18055] Worker 22 uses CPU cores [22] [2024-06-25 22:40:08,455][18088] Worker 31 uses CPU cores [31] [2024-06-25 22:40:08,461][18052] Worker 20 uses CPU cores [20] [2024-06-25 22:40:08,464][17947] Worker 6 uses CPU cores [6] [2024-06-25 22:40:08,471][18015] Worker 11 uses CPU cores [11] [2024-06-25 22:40:08,480][17949] Worker 8 uses CPU cores [8] [2024-06-25 22:40:08,488][17881] Worker 2 uses CPU cores [2] [2024-06-25 22:40:08,504][18050] Worker 16 uses CPU cores [16] [2024-06-25 22:40:08,512][18096] Worker 30 uses CPU cores [30] [2024-06-25 22:40:08,518][18053] Worker 18 uses CPU cores [18] [2024-06-25 22:40:08,549][17946] Worker 5 uses CPU cores [5] [2024-06-25 22:40:08,562][18091] Worker 12 uses CPU cores [12] [2024-06-25 22:40:08,598][18089] Worker 9 uses CPU cores [9] [2024-06-25 22:40:08,600][17981] Worker 10 uses CPU cores [10] [2024-06-25 22:40:08,614][18051] Worker 19 uses CPU cores [19] [2024-06-25 22:40:08,658][18093] Worker 28 uses CPU cores [28] [2024-06-25 22:40:08,672][18094] Worker 24 uses CPU cores [24] [2024-06-25 22:40:08,748][18092] Worker 25 uses CPU cores [25] [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,212][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,213][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,216][17858] RunningMeanStd input shape: (1,) [2024-06-25 22:40:09,216][17858] RunningMeanStd input shape: (1,) [2024-06-25 22:40:09,217][17858] RunningMeanStd input shape: (1,) [2024-06-25 22:40:09,217][17858] RunningMeanStd input shape: (1,) [2024-06-25 22:40:09,217][17858] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:09,256][17858] RunningMeanStd input shape: (1,) [2024-06-25 22:40:09,261][17858] Created Actor Critic model with architecture: [2024-06-25 22:40:09,261][17858] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-25 22:40:09,328][17858] Using optimizer [2024-06-25 22:40:09,513][17858] Loading state from checkpoint /workspace/metta/train_dir/p2.dr6/checkpoint_p0/checkpoint_000980730_16068280320.pth... [2024-06-25 22:40:09,527][17858] Loading model from checkpoint [2024-06-25 22:40:09,529][17858] Loaded experiment state at self.train_step=980730, self.env_steps=16068280320 [2024-06-25 22:40:09,529][17858] Initialized policy 0 weights for model version 980730 [2024-06-25 22:40:09,530][17858] LearnerWorker_p0 finished initialization! [2024-06-25 22:40:09,530][17858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,284][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,285][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,288][17878] RunningMeanStd input shape: (1,) [2024-06-25 22:40:10,289][17878] RunningMeanStd input shape: (1,) [2024-06-25 22:40:10,289][17878] RunningMeanStd input shape: (1,) [2024-06-25 22:40:10,289][17878] RunningMeanStd input shape: (1,) [2024-06-25 22:40:10,289][17878] RunningMeanStd input shape: (11, 11) [2024-06-25 22:40:10,329][17878] RunningMeanStd input shape: (1,) [2024-06-25 22:40:10,352][17646] Inference worker 0-0 is ready! [2024-06-25 22:40:10,352][17646] All inference workers are ready! Signal rollout workers to start! [2024-06-25 22:40:12,759][17646] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 16068280320. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-25 22:40:13,014][18096] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,077][18055] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,096][18013] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,112][18053] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,125][17948] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,142][18052] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,161][18051] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,169][17882] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,195][18048] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,205][17949] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,208][18014] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,213][17947] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,218][18094] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,222][18054] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,223][18018] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,230][18049] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,234][18088] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,236][18089] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,242][17881] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,255][18091] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,270][18092] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,282][18085] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,286][18095] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,292][18050] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,296][18090] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,296][17981] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,301][17945] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,301][17879] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,305][17946] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,323][17880] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,324][18015] Decorrelating experience for 0 frames... [2024-06-25 22:40:13,336][18093] Decorrelating experience for 0 frames... [2024-06-25 22:40:14,273][18053] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,298][18096] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,318][18013] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,353][18055] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,377][18052] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,395][18048] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,417][18014] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,430][18051] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,440][17882] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,447][18049] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,449][18089] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,450][17948] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,475][17947] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,481][18090] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,493][17945] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,497][17981] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,514][18050] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,541][17946] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,555][17879] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,558][18018] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,584][18094] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,593][17881] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,600][18054] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,604][17949] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,627][18093] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,635][18015] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,648][18088] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,656][18091] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,660][18095] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,689][18085] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,695][18092] Decorrelating experience for 256 frames... [2024-06-25 22:40:14,699][17880] Decorrelating experience for 256 frames... [2024-06-25 22:40:17,760][17646] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 16068280320. Throughput: 0: 907.9. Samples: 4540. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-25 22:40:22,159][18048] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-25 22:40:22,178][18049] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-25 22:40:22,196][17949] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-25 22:40:22,211][17882] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-25 22:40:22,215][18089] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-25 22:40:22,245][18013] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-25 22:40:22,248][17881] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-25 22:40:22,252][17879] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-25 22:40:22,283][18018] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-25 22:40:22,316][17981] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-25 22:40:22,330][17858] Signal inference workers to stop experience collection... [2024-06-25 22:40:22,341][17878] InferenceWorker_p0-w0: stopping experience collection